Sunday, January 7, 2024

Index of Machine Learning


THEORY

What is Machine Learning?

Logistic Regression

Decision Tree

Deep Learning

Integration

Vector Calculus

Exercises from the book "Pattern Recognition and Machine Learning" by Christopher Bishop

Statistics


PRACTICALS


ML1: Category Encoding

  1. Installing 'Category Encoders' Python Package Using Pip And Conda
  2. Category Encoders Analysis (in Python)
  3. One Hot Encoding Using Pandas' get_dummies() Method on Titanic Dataset
  4. Do we need all the one hot features?
  5. One Hot Encoding from PySpark, Pandas, Category Encoders and skLearn
  6. Comparing StringIndexer (PySpark), LabelEncoder (skLearn), OrdinalEncoder (skLearn), OrdinalEncoder (category_encoders) [Tags: Machine Learning, Spark]
  7. Effect of PySpark's StringIndexer on clustering of data [Tags: Machine Learning, Spark]
  8. Two ways to get Frequency Based Order of Categorical Data (Python)

ML2: Data Preprocessing

  1. Binning (of a column or 1D Numerical Data)
  2. Feature Scaling in Machine Learning (when to use which among MinMaxScaler and StandardScaler)
  3. Similarity and Dissimilarity for Numerical (interval-scaled) variables, Asymmetric binary variables, Categorical variables, and Text
  4. Correlation between continuous-numeric columns, and between categorical columns
  5. Importance of posing right question for machine learning, data analysis and data preprocessing [Tags: Machine Learning, Spark]
  6. Working with skLearn's MinMax scaler and defining our own
  7. Data Preprocessing Using Python package Pandas (Use case: Creating a Nifty50 SIP Simulator)
  8. Loading data from Pandas to PostgreSQL [Tags: Databases, Machine Learning]
  9. Exploring skLearn's CountVectorizer
  10. Using Snorkel to create test data and classifying using Scikit-Learn
  11. Pandas DataFrame Filtering Using eval()
  12. Three Types of Input Data Format For Apriori Algorithm (Association Analysis)
  13. Bagging in overcoming variance of a classifier, clustering algorithm or regressor

ML3: Data visualization

ML3.1: Misc

  1. Data Visualization's Basic Theory
  2. Box Plot and Anomaly Detection in 1D [Tags: Data Visualization, Anomaly Detection]
  3. Social Analysis (SOAN using Python 3) Report [Tags: Data Visualization, NLP]
  4. Plotting Correlation Matrix in Three Ways Using Pandas, Matplotlib and Seaborn
  5. Creating Heatmap from Pandas DataFrame correlation matrix
  6. Binomial Probability Distribution (visualization using Seaborn) [Tags: Machine Learning, NumPy]
  7. Google Analytics for Beginners (Assessments Dump, Oct 2020)
  8. survival8 Audience Around The World (Jun 2021)
  9. Topic modeling using Latent Dirichlet Allocation from sklearn and visualization using pyLDAvis [Tags: Data Visualization, Natural Language Processing]

ML3.2: Using PowerBI

  1. PowerBI's HTML Content Visualization [Tags: Data Visualization, PowerBI]
  2. Sorting an 'HTML Content' Visual in PowerBI [Tags: Data Visualization, PowerBI]
  3. Concatenate Two Tables using R in PowerBI [Tags: Data Visualization, PowerBI, R Language]
  4. Timeline View Using HTML Content Visual in PowerBI [Tags: Data Visualization, PowerBI, Python]

ML3.3: Line Chart

  1. Creating and Editing Line Chart in LibreOffice Calc [Tags: Data Visualization, FOSS]
  2. Line plot with multiple lines for page views for survival8 (Sep 2022)

ML3.4: Pie Plot

  1. Stratified sampling and fixed size sampling plus visualization using pie plot (Nov 2022)
  2. Plotting changes in Nifty50's top-5 sectors after last three market crashes Using Pie Plot, Bar Chart and Grouped Bar Chart

ML3.5: Choropleth and Cartography

Choropleth: A choropleth map is a type of statistical thematic map that uses pseudocolor, i.e., color corresponding to an aggregate summary of a geographic characteristic (such as population density or per-capita income) within spatial enumeration units.
Cartography: The science or practice of drawing maps.
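
As a rough illustration of the choropleth idea defined above, the sketch below assigns each region a color based on an aggregate value, using equal-width classes. The function name `choropleth_colors`, the region labels, the density numbers and the three-color palette are all hypothetical and chosen only for this example. It does not draw a map, and the linked posts may use a different classification scheme or library.

```python
def choropleth_colors(values, palette):
    """Map each region's aggregate value to a palette color.

    Uses equal-width classes between the minimum and maximum value;
    this is one common choice, assumed here for illustration.
    """
    lo = min(values.values())
    hi = max(values.values())
    span = hi - lo
    n = len(palette)
    colors = {}
    for region, v in values.items():
        if span == 0:
            idx = 0  # all regions share one value, so use the first class
        else:
            # Scale to [0, n) and clamp the maximum into the last class
            idx = min(int((v - lo) / span * n), n - 1)
        colors[region] = palette[idx]
    return colors


# Hypothetical per-region densities (illustrative numbers, not real data)
density = {"A": 10.0, "B": 55.0, "C": 100.0}
# Light-to-dark sequential palette (hex colors assumed for the example)
palette = ["#fee5d9", "#fb6a4a", "#a50f15"]

region_colors = choropleth_colors(density, palette)
for region, color in region_colors.items():
    print(region, color)
```

A mapping library would then fill each region's polygon with the assigned color; darker shades stand for higher values of the summarized characteristic.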

  1. Drawing a world heat map using 'cartopy'
  2. Ownership of Bicycle, 2-wheeler and car in India by percentages (2022)
  3. Global Footprint of Survival8 (Nov 2022)

ML3.6: Histogram

  1. Differences between 'bar graph' and histogram
  2. Histogram report and binning on Sales data
  3. An exercise in visualization (plotting line plot and multicolored histogram with -ve and +ve values) using Pandas and Matplotlib

ML4: Outlier Detection / Anomaly Detection

  1. Box Plot and Anomaly Detection in 1D [Tags: Data Visualization, Anomaly Detection]
  2. DCSO: Dynamic Combination of Detector Scores for Outlier Ensembles (Research paper, 2018)
  3. Unsupervised Outlier Detection Using PyOD
  4. Isolation based anomaly detection using iForest (2133360.2133363 / Research Paper)
  5. Density-based algorithm for anomaly detection (Adeel Hashmi / Research Paper)
  6. Isolation Forest Implementation using skLearn, PyOD, and spark-iForest
  7. Anomaly Detection using Scikit-Learn and "eif" PyPI package (for Extended Isolation Forest)
  8. Distributed Deep Learning Using Python Packages Elephas, Keras, Tensorflow and PySpark [Tags: Anomaly Detection, Deep Learning, Spark]
  9. Anomalies in 'survival8' Viewers' Stats (Mar 2022)
  10. Estimating the Contamination Factor For Unsupervised Anomaly Detection

ML5: Classification

ML5.1: Decision Tree

  1. Decision Tree Learning
  2. Interpretation of Decision Tree J48 Classifier output in Weka [Tags: FOSS, Machine Learning, Weka]
  3. Calculations for Info Gain and Gini Coefficient for Building Decision Tree

ML5.2: Miscellaneous of Classification

  1. Creating ML model, saving it, and creating Flask API [Tags: Flask, Machine Learning, Classification]
  2. kNN classification parallelized using MapReduce [Tags: Hadoop / Spark, Machine Learning]
  3. Elbow Method for identifying k in kMeans (clustering) and kNN (classification)
  4. Snorkel's Analysis Package Overview (v0.9.6, Sep 2020). This discusses how to interpret classification results
  5. Improving a Classifier (ML) Using Snorkel's Slicing Technique
  6. Multi-label Classification using Python
  7. Naïve Bayes Classifier for Spam Filtering
  8. Weka classification experiment on Iris dataset [Tags: FOSS, Machine Learning, Weka]

ML6: Clustering

  1. Elbow Method for identifying k in kMeans (clustering) and kNN (classification)
  2. Weka clustering experiment on Iris dataset [Tags: FOSS, Machine Learning, Weka]
  3. 'Supervised classification-oriented measures' for Cluster Analysis
  4. Similarity-oriented measures for cluster analysis

ML7: Regression

  1. Linear Regression (Theory)
  2. Improvements over OLS (Forward Stepwise, Ridge, Lasso and LARS forms of Regression)
  3. Descriptive Statistics and Linear Regression Using 'statistics' module and 'statsmodels' module
  4. Hands-on 5 Regression Algorithms Using Scikit-Learn
  5. Demo of Linear Regression on Boston Housing Data Using Weka [Tags: FOSS, Machine Learning]
  6. Saving Model, Loading Model and Making Predictions for Linear Regression (in Weka)
  7. Linear Regression Using Java Code And Weka JAR [Tags: FOSS, Java, Machine Learning, Weka]

ML8: Association Mining Between Attributes

  1. Three Types of Input Data Format For Apriori Algorithm (Association Analysis)
  2. The Concept of Lift in Association Rules Mining
  3. Apriori Algorithm For Association Mining Using Weka's Supermarket Dataset
  4. Running Weka's Apriori on 9_TXN_5_ITEMS Dataset
  5. Interpretation of output from Weka for Apriori Algorithm

ML9: Weka Tool

  1. Demo of Linear Regression on Boston Housing Data Using Weka [Tags: FOSS, Machine Learning, Weka]
  2. Interpretation of Decision Tree J48 Classifier output in Weka [Tags: FOSS, Machine Learning, Weka]
  3. Weka classification experiment on Iris dataset [Tags: FOSS, Machine Learning, Weka]
  4. Weka clustering experiment on Iris dataset [Tags: FOSS, Machine Learning, Weka]
  5. Saving Model, Loading Model and Making Predictions for Linear Regression (in Weka)
  6. Linear Regression Using Java Code And Weka JAR [Tags: FOSS, Java, Machine Learning, Weka]
  7. Apriori Algorithm For Association Mining Using Weka's Supermarket Dataset
  8. Running Weka's Apriori on 9_TXN_5_ITEMS Dataset
  9. Interpretation of output from Weka for Apriori Algorithm
  10. Machine Learning and Weka Interview (5 Questions) [Tags: Machine Learning Q&A, Weka Tool]

ML10: Traffic Prediction on my Blog

  1. Traffic Prediction on my Blog (Oct 2023)
  2. When not to use Poisson Distribution for prediction?
  3. Time Series Analysis and Forecasting Using Exponential Moving Average (A use case of traffic prediction on my blog)

ML11: Questions and Answers

  1. Machine Learning dose with ten Q&A (Set 1)
  2. Machine Learning Q&A (Set 2)
  3. Machine Learning Q&A (Set 3)
  4. LinkedIn Machine Learning Assessment Dump (Aug 2021)
  5. Machine Learning and Weka Interview (5 Questions) [Tags: Machine Learning Q&A, Weka Tool]
  6. ARIMA forecast for timeseries is one step ahead. Why? (Solved Interview Problem)

ML12: Miscellaneous

  1. Simple demonstration of how important data is for machine learning
  2. Reading a JSON file from the Google Drive in the Google Colab
  3. A case of cyclic dependencies between PyPI packages [Tags: Machine Learning, Python]
  4. Extracting Information From Search Engines [Tags: Machine Learning, Python]
  5. Digging deeper into your toolbox (Viewing LDiA code of sklearn) [Tags: FOSS, Machine Learning, Natural Language Processing]

ML13: Articles

  1. Machine Learning Resources (Dec 2019)
  2. Machine Learning Evolution (Jan 2020)
  3. Data Science Timeline (Aug 2020)

ML14: Our 'Machine Learning' Videos on YouTube

  1. Session 1 - Linear Regression (OLS Method and Theory) - 20210716
  2. Session 2 - Improvements over Linear Regression method of OLS (Forward Stepwise, Ridge, Lasso, LARS)
  3. Linear Regression Theory (2022-02-15)
  4. Pandas and Linear Regression in Code (Dated: 2022-Feb-16)
  5. Naive Bayes Classifier / Application: Spam Filtering / Dated: 2022 Feb 17
  6. Decision Trees Learning (2022 Feb 22)
  7. Perceptron in Machine Learning (24 Apr 2022)
Tags: Machine Learning, Mathematical Foundations for Data Science, Technology, Index
