ML1: Category Encoding
Installing 'Category Encoders' Python Package Using Pip And Conda
Category Encoders Analysis (in Python)
One Hot Encoding Using Pandas' get_dummies() Method on Titanic Dataset
Do we need all the one hot features?
One Hot Encoding from PySpark, Pandas, Category Encoders and skLearn
Comparing StringIndexer (PySpark), LabelEncoder (skLearn), OrdinalEncoder (skLearn), OrdinalEncoder
(category_encoders) [Tags: Machine Learning, Spark]
Effect of PySpark's StringIndexer on clustering of data [Tags: Machine Learning, Spark]
Two ways to get Frequency Based Order of Categorical Data (Python)
ML2: Data Preprocessing
Binning (of a column or 1D Numerical Data)
Feature Scaling in Machine Learning (when to use which among MinMaxScaler and StandardScaler)
Similarity and Dissimilarity for Numerical (interval-scaled) variables, Asymmetric binary variables,
Categorical variables, For text
Correlation between continuous-numeric columns, and between categorical columns
Importance of posing right question for machine learning, data analysis and data preprocessing [Tags:
Machine Learning, Spark]
Working with skLearn's MinMax scaler and defining our own
Data Preprocessing Using Python package Pandas (Use case: Creating a Nifty50 SIP Simulator)
Loading data from Pandas to PostgreSQL [Tags: Databases, Machine Learning]
Exploring skLearn's CountVectorizer
Using Snorkel to create test data and classifying using Scikit-Learn
Pandas DataFrame Filtering Using eval()
Three Types of Input Data Format For Apriori Algorithm (Association Analysis)
Bagging in overcoming variance of a classifier, clustering algorithm or regressor
ML3: Data visualization
ML3.1: Misc
Data Visualization's Basic Theory
Box Plot and Anomaly Detection in 1D [Tags: Data Visualization, Anomaly Detection]
Social Analysis (SOAN using Python 3) Report [Tags: Data Visualization, NLP]
Plotting Correlation Matrix in Three Ways Using Pandas, Matplotlib and Seaborn
Creating Heatmap from Pandas DataFrame correlation matrix
Binomial Probability Distribution (visualization using Seaborn) [Tags: Machine Learning, NumPy]
Google Analytics for Beginners (Assessments Dump, Oct 2020)
survival8 Audience Around The World (Jun 2021)
Topic modeling using Latent Dirichlet Allocation from sklearn and visualization using pyLDAvis
[Tags: Data Visualization, Natural Language Processing]
ML3.2: Using PowerBI
PowerBI's HTML Content Visualization [Tags: Data Visualization, PowerBI]
Sorting an 'HTML Content' Visual in PowerBI [Tags: Data Visualization, PowerBI]
Concatenate Two Tables using R in PowerBI [Tags: Data Visualization, PowerBI, R Language]
Timeline View Using HTML Content Visual in PowerBI [Tags: Data Visualization, PowerBI, Python]
ML3.3: Line Chart
Creating and Editing Line Chart in LibreOffice Calc [Tags: Data Visualization, FOSS]
Line plot with multiple lines for page views for survival8 (Sep 2022)
ML3.4: Pie Plot
Stratified sampling and fixed size sampling plus visualization using pie plot (Nov 2022)
Plotting changes in Nifty50's top-5 sectors after last three market crashes Using Pie Plot, Bar Chart
and Grouped Bar Chart
ML3.5: Choropleth and Cartography
Choropleth: A choropleth map is a type of statistical thematic map that uses pseudocolor, i.e., color corresponding with an aggregate summary of a geographic characteristic within spatial enumeration units, such as population density or per-capita income. Cartography: the science or practice of drawing maps.
Drawing a world heat map using 'cartopy'
Ownership of Bicycle, 2-wheeler and car in India by percentages (2022)
Global Footprint of Survival8 (Nov 2022)
ML3.6: Histogram
Differences between 'bar graph' and histogram
Histogram report and binning on Sales data
An exercise in visualization (plotting line plot and multicolored histogram with -ve and +ve values)
using Pandas and Matplotlib
ML4: Outlier Detection / Anomaly Detection
Box Plot and Anomaly Detection in 1D [Tags: Data Visualization, Anomaly Detection]
DCSO: Dynamic Combination of Detector Scores for Outlier Ensembles (Research paper, 2018)
Unsupervised Outlier Detection Using PyOD
Isolation based anomaly detection using iForest (2133360.2133363 / Research Paper)
Density-based algorithm for anomaly detection (Adeel Hashmi / Research Paper)
Isolation Forest Implementation using skLearn, PyOD, and spark-iForest
Anomaly Detection using Scikit-Learn and "eif" PyPI package (for Extended Isolation Forest)
Distributed Deep Learning Using Python Packages Elephas, Keras, Tensorflow and PySpark [Tags: Anomaly
Detection, Deep Learning, Spark]
Anomalies in 'survival8' Viewers' Stats (Mar 2022)
Estimating the Contamination Factor For Unsupervised Anomaly Detection
ML5: Classification
ML5.1: Decision Tree
Decision Tree Learning
Interpretation of Decision Tree J48 Classifier output in Weka [Tags: FOSS, Machine Learning,
Weka]
Calculations for Info Gain and Gini Coefficient for Building Decision Tree
ML5.2: Miscellaneous of Classification
Creating ML model, saving it, and creating Flask API [Tags: Flask, Machine Learning,
Classification]
kNN classification parallelized using MapReduce [Tags: Hadoop / Spark, Machine Learning]
Elbow Method for identifying k in kMeans (clustering) and kNN (classification)
Snorkel's Analysis Package Overview (v0.9.6, Sep 2020). This discusses how to interpret classification
results
Improving a Classifier (ML) Using Snorkel's Slicing Technique
Multi-label Classification using Python
Naïve Bayes Classifier for Spam Filtering
Weka classification experiment on Iris dataset [Tags: FOSS, Machine Learning, Weka]
ML6: Clustering
Elbow Method for identifying k in kMeans (clustering) and kNN (classification)
Weka clustering experiment on Iris dataset [Tags: FOSS, Machine Learning, Weka]
'Supervised classification-oriented measures' for Cluster Analysis
Similarity-oriented measures for cluster analysis
ML7: Regression
Linear Regression (Theory)
Improvements over OLS (Forward Stepwise, Ridge, Lasso and LARS forms of Regression)
Descriptive Statistics and Linear Regression Using 'statistics' module and 'statsmodels' module
Hands-on 5 Regression Algorithms Using Scikit-Learn
Demo of Linear Regression on Boston Housing Data Using Weka [Tags: FOSS, Machine Learning]
Saving Model, Loading Model and Making Predictions for Linear Regression (in Weka)
Linear Regression Using Java Code And Weka JAR [Tags: FOSS, Java, Machine Learning, Weka]
ML8: Association Mining Between Attributes
Three Types of Input Data Format For Apriori Algorithm (Association Analysis)
The Concept of Lift in Association Rules Mining
Apriori Algorithm For Association Mining Using Weka's Supermarket Dataset
Running Weka's Apriori on 9_TXN_5_ITEMS Dataset
Interpretation of output from Weka for Apriori Algorithm
ML9: Weka Tool
Demo of Linear Regression on Boston Housing Data Using Weka [Tags: FOSS, Machine Learning, Weka]
Interpretation of Decision Tree J48 Classifier output in Weka [Tags: FOSS, Machine Learning,
Weka]
Weka classification experiment on Iris dataset [Tags: FOSS, Machine Learning, Weka]
Weka clustering experiment on Iris dataset [Tags: FOSS, Machine Learning, Weka]
Saving Model, Loading Model and Making Predictions for Linear Regression (in Weka)
Linear Regression Using Java Code And Weka JAR [Tags: FOSS, Java, Machine Learning, Weka]
Apriori Algorithm For Association Mining Using Weka's Supermarket Dataset
Running Weka's Apriori on 9_TXN_5_ITEMS Dataset
Interpretation of output from Weka for Apriori Algorithm
Machine Learning and Weka Interview (5 Questions) [Tags: Machine Learning Q&A, Weka Tool]
ML10: Traffic Prediction on my Blog
Traffic Prediction on my Blog (Oct 2023)
When not to use Poisson Distribution for prediction?
Time Series Analysis and Forecasting Using Exponential Moving Average (A use case of traffic prediction
on my blog)
ML11: Questions and Answers
Machine Learning dose with ten Q&A (Set 1)
Machine Learning Q&A (Set 2)
Machine Learning Q&A (Set 3)
LinkedIn Machine Learning Assessment Dump (Aug 2021)
Machine Learning and Weka Interview (5 Questions) [Tags: Machine Learning Q&A, Weka Tool]
ARIMA forecast for timeseries is one step ahead. Why? (Solved Interview Problem)
ML12: Miscellaneous
Simple demonstration of how important data is for machine learning
Reading a JSON file from the Google Drive in the Google Colab
A case of cyclic dependencies between PyPI packages [Tags: Machine Learning, Python]
Extracting Information From Search Engines [Tags: Machine Learning, Python]
Digging deeper into your toolbox (Viewing LDiA code of sklearn)
[Tags: FOSS, Machine Learning, Natural Language Processing]
ML13: Articles
Machine Learning Resources (Dec 2019)
Machine Learning Evolution (Jan 2020)
Data Science Timeline (Aug 2020)
ML14: Our 'Machine Learning' Videos on YouTube
Session 1 - Linear Regression (OLS Method and Theory) - 20210716
Session 2 - Improvements over Linear Regression method of OLS (Forward Stepwise, Ridge, Lasso, LARS)
Linear Regression Theory (2022-02-15)
Pandas and Linear Regression in Code (Dated: 2022-Feb-16)
Naive Bayes Classifier / Application: Spam Filtering / Dated: 2022 Feb 17
Decision Trees Learning (2022 Feb 22)
Perceptron in Machine Learning (24 Apr 2022)
Tags: Machine Learning, Mathematical Foundations for Data Science, Technology, Index