
Thursday, December 21, 2023

Choice of activation function between Sigmoid and Hyperbolic Tangent, and its relation to the vanishing gradient problem

In terms of their use as activation functions, when is sigmoid used and when is hyperbolic tangent used?

Sigmoid and hyperbolic tangent (tanh) are both activation functions commonly used in neural networks, but they have different characteristics that make them suitable for different situations.

Sigmoid Function (Logistic Function):
Range: It squashes input values to the range (0, 1).
Output: The output is always positive, which makes it a natural fit for binary classification, where it can be interpreted as the probability of the positive class.
Issues: Sigmoid tends to suffer from the "vanishing gradient" problem, which can make training deep networks challenging.
Use Cases: The output layer of binary classification models, and situations where you need a binary decision, such as in logistic regression.

Hyperbolic Tangent Function (tanh):
Range: It squashes input values to the range (-1, 1).
Output: Its output range includes negative values, making it zero-centered, which helps mitigate the vanishing gradient problem to some extent.
Advantages: The zero-centered output helps the optimization process during training, compared to the sigmoid.
Use Cases: Hidden layers of neural networks, especially where zero-centered activation functions are preferred, and situations where the output needs to span both positive and negative values.

In practice, the choice between sigmoid and tanh depends on the specific requirements of your neural network and the nature of the problem you're trying to solve. In modern deep learning architectures, the rectified linear unit (ReLU) and its variants are also popular choices for hidden layers due to their training efficiency and their ability to mitigate the vanishing gradient problem.
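To see the difference concretely, here is a small NumPy sketch (my own illustration, not part of the original answer) that evaluates both functions on the same inputs; note how tanh is zero-centered while sigmoid is not:

import numpy as np

def sigmoid(z):
    # Logistic sigmoid: squashes z into (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

z = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
print("sigmoid:", np.round(sigmoid(z), 4))   # [0.0067 0.2689 0.5    0.7311 0.9933]
print("tanh:   ", np.round(np.tanh(z), 4))   # [-0.9999 -0.7616 0.  0.7616 0.9999]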

Why and how does the sigmoid activation function result in the vanishing gradient problem?

The vanishing gradient problem is a challenge that can occur during the training of deep neural networks, particularly when using activation functions with certain characteristics, such as the sigmoid function. The problem arises in the context of gradient-based optimization algorithms, like backpropagation, which are commonly used to update the weights of neural networks during training. Here's why the sigmoid activation function can lead to the vanishing gradient problem:

Output Range: The sigmoid function squashes input values to the range (0, 1). For large positive or negative inputs, the sigmoid saturates and its output becomes very close to 1 or 0. Once the output is near these extremes, the gradient of the sigmoid becomes extremely small.

Derivative Characteristics: The derivative of the sigmoid function reaches its maximum value of 0.25 at the center (input 0) and approaches zero as you move away from the center. This means that the gradients for large positive or negative inputs are close to zero. In backpropagation, the gradient is used to update the weights of the network, and if the gradients are very small, the updates to the weights become negligible.

Multiplicative Nature of Gradients: In a deep neural network, gradients are propagated backward through the layers during backpropagation. Since the gradients are multiplied at each layer, small gradients quickly diminish as they are propagated backward through the network. This can result in the gradients becoming effectively zero for the weights in the early layers, preventing effective learning in those layers.

In summary, the vanishing gradient problem occurs because the sigmoid function, when used in deep networks, can produce very small gradients for the weights in the early layers, hindering the training process. This can lead to slower convergence and, in extreme cases, the early layers may not learn meaningful representations. To mitigate the vanishing gradient problem, other activation functions such as hyperbolic tangent (tanh) or rectified linear units (ReLU) are often used, as they exhibit better training characteristics in deep networks.
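To make the multiplicative argument concrete, the following toy sketch (an illustration added here, assuming the best case in which every pre-activation is 0, so each sigmoid derivative takes its maximum value of 0.25) shows how quickly the backpropagated gradient factor shrinks as it passes through successive layers:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1.0 - s)   # peaks at 0.25 when z = 0

# Each layer contributes one sigmoid-derivative factor to the backpropagated gradient.
grad = 1.0
for layer in range(1, 11):
    grad *= sigmoid_derivative(0.0)   # best case: 0.25 per layer
    print(f"after layer {layer}: gradient factor = {grad:.2e}")

# After 10 layers the factor is 0.25**10, roughly 9.5e-07 -- effectively vanished,
# and in practice the per-layer factors are usually much smaller than 0.25.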

On a side note...

Derivative of the Sigmoid function: σ'(z) = σ(z)(1 - σ(z))

Derivative of tanh(z): tanh'(z) = 1 - tanh(z)^2

An alternative to the logistic sigmoid is the hyperbolic tangent, or tanh function: tanh(z) = (e^z - e^(-z)) / (e^z + e^(-z)).
Like the logistic sigmoid, the tanh function is also sigmoidal ("s"-shaped), but instead outputs values in the range (-1, 1). Thus strongly negative inputs to the tanh map to strongly negative outputs, and only zero-valued inputs are mapped to near-zero outputs. These properties make the network less likely to get "stuck" during training. Calculating the gradient of the tanh function also uses the quotient rule.
Similar to the derivative of the logistic sigmoid, the derivative of g_tanh(z) is a function of the feed-forward activation evaluated at z, namely (1 - g_tanh(z)^2). Thus the same caching trick can be used for layers that implement tanh activation functions.
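As a small illustration of this caching trick (a sketch added here, not part of the quoted derivation), both derivatives can be computed directly from the cached forward-pass activations, without re-evaluating the activation functions:

import numpy as np

z = np.linspace(-3, 3, 7)

# Forward pass: cache the activations
a_sig = 1.0 / (1.0 + np.exp(-z))    # sigmoid(z)
a_tanh = np.tanh(z)                 # tanh(z)

# Backward pass: derivatives computed from the cached activations only
d_sig = a_sig * (1.0 - a_sig)       # sigma'(z) = sigma(z) * (1 - sigma(z))
d_tanh = 1.0 - a_tanh ** 2          # tanh'(z) = 1 - tanh(z)^2

print(np.round(d_sig, 4))
print(np.round(d_tanh, 4))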

Common Activation Functions and Plots of Their Derivatives

Side notes

(1) How are tanh and sigmoid related? Can you write tanh as a function of sigmoid?

The hyperbolic tangent function (tanh) and the sigmoid function (often the logistic sigmoid) are related through a simple mathematical transformation. The tanh function can be expressed in terms of the sigmoid function as follows:

tanh(x) = 2σ(2x) - 1

Here, σ(x) represents the sigmoid function, defined as σ(x) = 1 / (1 + e^(-x)).

So, to obtain the tanh function, you take twice the input (2x), apply the sigmoid function to it (σ(2x)), multiply the result by 2 (2σ(2x)), and then subtract 1 (2σ(2x) - 1). This transformation ensures that the output of the tanh function lies in the range (-1, 1), just as the sigmoid function squashes values into the range (0, 1).
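A quick numerical check of this identity (an added snippet, not part of the original explanation):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

x = np.linspace(-4, 4, 9)
print(np.allclose(np.tanh(x), 2 * sigmoid(2 * x) - 1))   # True: tanh(x) == 2*sigmoid(2x) - 1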

(2) If both sigmoid- and tanh-based models face the vanishing gradient problem, what is the alternative? Ans: ReLU (Rectified Linear Unit).

Tuesday, October 24, 2023

Deep Learning Books (Oct 2023)

Download Books
1.
Deep Learning
Yoshua Bengio, 2015

2.
Deep Learning with Python
François Chollet, 2017

3.
Deep Learning: A Practitioner's Approach
Josh Patterson, 2017

4.
Deep Learning from Scratch: Building with Python from First Principles
Seth Weidman, 2019

5.
Grokking Deep Learning
Andrew W. Trask, 2019

6.
Deep Learning for Coders with Fastai and PyTorch
Jeremy Howard, 2020

7.
Neural Networks and Deep Learning
2017

8.
Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems
Geron Aurelien, 2017

9.
Neural Networks and Deep Learning: A Textbook
Charu C. Aggarwal, 2018

10.
Learning Deep Learning: Theory and Practice of Neural Networks, Computer Vision, Natural Language Processing, and Transformers Using TensorFlow
Magnus Ekman, 2021

11.
Fundamentals of Deep Learning: Designing Next-Generation Machine Intelligence Algorithms
Nikhil Buduma, 2017

12.
Deep Learning with R
François Chollet, 2018

13.
Grokking Deep Reinforcement Learning
Miguel Morales, 2020

14.
The Hundred-page Machine Learning Book
Andriy Burkov, 2019

15.
Artificial Intelligence By Example: Acquire Advanced AI, Machine Learning, and Deep Learning Design Skills, 2nd Edition
Denis Rothman, 2020

16.
Hands-On Deep Learning Algorithms with Python: Master Deep Learning Algorithms with Extensive Math by Implementing Them Using TensorFlow
Sudharsan Ravichandiran, 2019

17.
Fundamentals of Deep Learning
Nikhil Buduma, 2022

18.
Deep Learning Illustrated: A Visual, Interactive Guide to Artificial Intelligence
Jon Krohn, 2019

19.
Deep Learning Cookbook: Practical Recipes to Get Started Quickly
Douwe Osinga, 2018

20.
Deep Learning
John D. Kelleher, 2019

21.
Deep Learning for Vision Systems
Mohamed Elgendy, 2020

22.
Generative Deep Learning: Teaching Machines to Paint, Write, Compose, and Play
David Foster, 2019

23.
Deep Learning: A Visual Approach
Andrew Glassner, 2021

24.
Deep Learning with Python, Second Edition
François Chollet, 2021

25.
Deep Learning with R, Second Edition
Joseph J. Allaire, 2022

26.
Deep Learning in Computer Vision: Principles and Applications
2020

27.
TinyML: Machine Learning with TensorFlow Lite on Arduino and Ultra-Low-Power Microcontrollers
Daniel Situnayake, 2019

28.
GANs in Action: Deep Learning with Generative Adversarial Networks
Vladimir Bok, 2019

29.
Grokking Artificial Intelligence Algorithms
Rishal Hurbans, 2020

30.
Dive Into Deep Learning
Zachary Lipton, 2023

31.
Inside Deep Learning: Math, Algorithms, Models
Edward Raff, 2022

32.
Math for Deep Learning: What You Need to Know to Understand Neural Networks
Ronald T Kneusel, 2021

33.
Evolutionary Deep Learning: Genetic Algorithms and Neural Networks
Micheal Lanham, 2023

34.
Understanding Machine Learning: From Theory to Algorithms
Shai Shalev-Shwartz, 2014

35.
Deep Learning with PyTorch
Thomas Viehmann, 2020

36.
Python Machine Learning: Machine Learning and Deep Learning with Python, Scikit-learn, and TensorFlow 2
Sebastian Raschka, 2019

37.
TensorFlow Machine Learning Cookbook
Nick McClure, 2017

38.
The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World
Pedro Domingos, 2015

39.
Deep Learning with TensorFlow and Keras: Build and Deploy Supervised, Unsupervised, Deep, and Reinforcement Learning Models
Sujit Pal, 2022

40.
Machine Learning - A Journey To Deep Learning: With Exercises And Answers
Andreas Miroslaus Wichert, 2021

41.
Hands-on Machine Learning with JavaScript: Solve Complex Computational Web Problems Using Machine Learning
Burak Kanber, 2018

42.
Introduction to Machine Learning with Python: A Guide for Data Scientists
Sarah Guido, 2016

43.
Python Machine Learning
Sebastian Raschka, 2015

44.
Artificial Intelligence for Humans
Jeff Heaton, 2013

45.
Machine Learning for Absolute Beginners: A Plain English Introduction
Oliver Theobald, 2017

46.
Practical Deep Learning for Cloud, Mobile, and Edge: Real-World AI & Computer-Vision Projects Using Python, Keras & TensorFlow
Anirudh Koul, 2019

47.
Introduction to Deep Learning: From Logical Calculus to Artificial Intelligence
Sandro Skansi, 2018

48.
Deep Learning: Methods and Applications
Dong Yu, 2014

49.
Machine Learning with Python Cookbook: Practical Solutions from Preprocessing to Deep Learning
Chris Albon, 2018

50.
Advanced Deep Learning with TensorFlow 2 and Keras: Apply DL, GANs, VAEs, Deep RL, Unsupervised Learning, Object Detection and Segmentation, and More, 2nd Edition
Rowel Atienza, 2020

51.
An Introduction to Statistical Learning: With Applications in R
Trevor Hastie, 2013

52.
Deep Learning. Foundations and Concepts
Christopher M. Bishop, Hugh Bishop
Springer (2023)
Tags: List of Books, Technology, Deep Learning

Monday, September 4, 2023

Deep Learning Roadmap: A Step-by-Step Guide to Learning Deep Learning

Introduction

Deep Learning, a subfield of Artificial Intelligence, has made astounding strides in recent years, powering everything from image recognition to language translation. If you're eager to embark on your journey into the world of Deep Learning, it's essential to have a roadmap. In this article, we'll provide you with a concise guide on the key milestones and steps to navigate as you master the art of Deep Learning.



Deep Learning Roadmap



Step 1: The Foundation - Understand Machine Learning Basics

Before diving deep, ensure you have a solid grasp of Machine Learning concepts. Familiarize yourself with supervised and unsupervised learning, regression, classification, and model evaluation. Books like "Machine Learning for Dummies" can be a great starting point.

Step 2: Python Proficiency

Python is the lingua franca of Deep Learning. Learn Python and its libraries, particularly NumPy, Pandas, and Matplotlib. Understanding Python is crucial as it's the primary language for developing Deep Learning models.

Step 3: Linear Algebra and Calculus

Deep Learning involves complex mathematics. Brush up on your linear algebra (vectors, matrices, eigenvalues) and calculus (derivatives, gradients) as they form the foundation of neural network operations.
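For instance, these core operations can be tried out directly in NumPy (a small illustrative sketch):

import numpy as np

# Vectors, matrices, and eigenvalues
A = np.array([[2.0, 0.0],
              [0.0, 3.0]])
v = np.array([1.0, 2.0])
print(A @ v)                  # matrix-vector product: [2. 6.]
print(np.linalg.eigvals(A))   # eigenvalues: [2. 3.]

# Derivatives: numerical gradient of f(x) = x**2 at x = 3 (analytic answer: 2x = 6)
f = lambda x: x ** 2
h = 1e-6
print((f(3.0 + h) - f(3.0 - h)) / (2 * h))   # approximately 6.0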

Step 4: Dive into Neural Networks

Start with understanding the basics of neural networks. Learn about artificial neurons, activation functions, and feedforward neural networks. The book "Deep Learning" by Ian Goodfellow is an excellent resource.
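As a first exercise, the forward pass of a tiny feedforward network can be written in a few lines of NumPy (an illustrative sketch; the weights below are arbitrary values, not trained parameters):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([0.5, -1.2])              # input features
W1 = np.array([[0.1, -0.3],            # weights of a 3-neuron hidden layer
               [0.8,  0.2],
               [-0.5, 0.7]])
b1 = np.zeros(3)
W2 = np.array([[0.4, -0.6, 0.9]])      # weights of a single output neuron
b2 = np.zeros(1)

h = sigmoid(W1 @ x + b1)               # hidden activations
y = sigmoid(W2 @ h + b2)               # output in (0, 1)
print(y)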

Step 5: Convolutional Neural Networks (CNNs)

For image-related tasks, CNNs are essential. Explore how they work, learn about convolution, pooling, and their applications in image recognition. Online courses like Stanford's CS231n provide excellent materials.
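For reference, a minimal Keras CNN might look like the sketch below (the layer sizes and the 28x28 grayscale input shape are illustrative choices, not a recommended architecture):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28, 1)),                     # e.g. MNIST-style images
    tf.keras.layers.Conv2D(16, 3, activation="relu"),      # convolution extracts local features
    tf.keras.layers.MaxPooling2D(2),                       # pooling halves the spatial size
    tf.keras.layers.Conv2D(32, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10, activation="softmax"),       # 10-class prediction
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()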

Step 6: Recurrent Neural Networks (RNNs)

RNNs are crucial for sequential data, such as natural language processing and time series analysis. Study RNN architectures, the vanishing gradient problem in recurrent networks, and LSTM/GRU networks.
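A minimal Keras LSTM classifier could look like this sketch (the sequence length, feature count, and layer size are placeholder values):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(100, 8)),                    # sequences of 100 steps, 8 features each
    tf.keras.layers.LSTM(32),                          # gated memory mitigates vanishing gradients
    tf.keras.layers.Dense(1, activation="sigmoid"),    # one binary prediction per sequence
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()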

Step 7: Deep Dive into Deep Learning Frameworks

Become proficient in popular Deep Learning frameworks like TensorFlow and PyTorch. These libraries simplify building and training complex neural networks.
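As a taste of what these frameworks provide, here is a minimal PyTorch training loop on random toy data (an illustrative sketch, not a complete recipe):

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(4, 16),
    nn.ReLU(),
    nn.Linear(16, 1),
)
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

X = torch.randn(64, 4)                    # 64 toy samples with 4 features each
y = torch.randint(0, 2, (64, 1)).float()  # random binary labels

for epoch in range(5):
    optimizer.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()                       # autograd computes the gradients
    optimizer.step()
    print(f"epoch {epoch}: loss = {loss.item():.4f}")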

Step 8: Projects and Hands-On Practice

Apply what you've learned through projects. Start with simple tasks like digit recognition and progressively tackle more complex challenges. Kaggle offers a platform for real-world practice.

Step 9: Natural Language Processing (NLP)

For text-related tasks, delve into NLP. Learn about word embeddings, recurrent models for text, and pre-trained language models like BERT.

Step 10: Advanced Topics

Explore advanced Deep Learning topics like Generative Adversarial Networks (GANs), Reinforcement Learning, and transfer learning. Stay updated with the latest research through journals, conferences, and online courses.

Step 11: Model Optimization and Deployment

Understand model optimization techniques to make your models efficient. Learn how to deploy models in real-world applications using cloud services or on-device deployment.

Step 12: Continuous Learning

Deep Learning is a rapidly evolving field. Stay up-to-date with the latest research papers, attend conferences like NeurIPS and CVPR, and join online forums and communities to learn from others.

Conclusion

The Deep Learning roadmap is your guide to mastering this exciting field. Remember that the journey may be challenging, but it's immensely rewarding. By building a strong foundation, exploring key neural network architectures, and constantly seeking to expand your knowledge, you'll be well on your way to becoming a proficient Deep Learning practitioner. Happy learning!




References:

Full Stack Data Science with Python Course on Github


Monday, August 7, 2023

Enhancing AI Risk Management in Financial Services with Machine Learning

Introduction:

The realm of financial services is rapidly embracing the power of artificial intelligence (AI) and machine learning (ML) to enhance risk management strategies. By leveraging advanced ML models, financial institutions can gain deeper insights into potential risks, make informed decisions, and ensure the stability of their operations. In this article, we'll explore how AI-driven risk management can be achieved using the best ML models in Python, complete with code examples.



AI Risk Management in Financial Services


Step 1: Data Collection and Preprocessing

To begin, gather historical financial data relevant to your risk management objectives. This could include market prices, economic indicators, credit scores, and more. Clean and preprocess the data by handling missing values, normalizing features, and encoding categorical variables.
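For example, a preprocessing sketch with pandas might look like the following (the file name matches the dataset loaded in Step 2; the column names "credit_score" and "sector" are hypothetical placeholders for your own features):

import pandas as pd

data = pd.read_csv("financial_data.csv")

# Handle missing values: fill numeric gaps with the column median (hypothetical column)
data["credit_score"] = data["credit_score"].fillna(data["credit_score"].median())

# Encode a categorical variable as one-hot columns (hypothetical column)
data = pd.get_dummies(data, columns=["sector"], drop_first=True)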


Step 2: Import Libraries and Data

In your Python script, start by importing the necessary libraries:

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, classification_report
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier

Load and preprocess your dataset:

data = pd.read_csv("financial_data.csv")
X = data.drop("risk_label", axis=1)
y = data["risk_label"]

Step 3: Train-Test Split and Data Scaling

Split the data into training and testing sets:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

Scale the features for better model performance:

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

Step 4: Implement ML Models

In this example, we'll use two powerful ML models: Random Forest and XGBoost.

  1. Random Forest Classifier:
rf_model = RandomForestClassifier(n_estimators=100, random_state=42)
rf_model.fit(X_train_scaled, y_train)
rf_predictions = rf_model.predict(X_test_scaled)
rf_accuracy = accuracy_score(y_test, rf_predictions)
print("Random Forest Accuracy:", rf_accuracy)
print(classification_report(y_test, rf_predictions))
  2. XGBoost Classifier:
xgb_model = XGBClassifier(n_estimators=100, random_state=42)
xgb_model.fit(X_train_scaled, y_train)
xgb_predictions = xgb_model.predict(X_test_scaled)
xgb_accuracy = accuracy_score(y_test, xgb_predictions)
print("XGBoost Accuracy:", xgb_accuracy)
print(classification_report(y_test, xgb_predictions))

Step 5: Evaluate and Compare

Evaluate the models' performance using accuracy and classification reports. Compare their results to determine which model is better suited for your risk management goals.


Conclusion:

AI-driven risk management is revolutionizing the financial services industry. By harnessing the capabilities of machine learning, financial institutions can accurately assess risks, make informed decisions, and ultimately ensure their stability and growth. In this article, we've demonstrated how to implement risk management using the best ML models in Python. Experiment with different models, fine-tune hyperparameters, and explore more advanced techniques to tailor the solution to your specific financial service needs. The future of risk management lies at the intersection of AI and finance, and now is the time to embrace its potential.


AI and Financial Risk Management – Critical Insights for Banking Leaders

I hope this article was helpful. If you have any questions, please feel free to leave a comment below.