Saturday, April 20, 2024

Streamlining NLP Tasks: A Deep Dive into Hugging Face Transformers Pipelines

The Hugging Face Transformers library has become a powerhouse for Natural Language Processing (NLP) tasks. While its core functionality revolves around pre-trained models and tokenization, the library offers a powerful abstraction layer called pipelines. Pipelines simplify the process of applying these models to real-world NLP applications. This blog post delves into the world of Transformers pipelines, exploring their capabilities, functionalities, and code examples.

What are Transformers Pipelines?

Imagine you have a toolbox filled with specialized tools for various construction tasks. Using each tool effectively requires knowledge of its operation and purpose. Transformers pipelines act similarly. They encapsulate the complexities involved in using pre-trained models for NLP tasks, providing a user-friendly interface for inference.

Here's a breakdown of what pipelines offer:

  • Simplified Model Usage: Pipelines hide the underlying complexities of loading models, tokenization, and model execution. You don't need to write intricate code for each step; the pipeline handles it all.
  • Task-Specific Functionality: Pipelines are designed for specific NLP tasks like sentiment analysis, question answering, or named entity recognition. This makes them ideal for developers who want to quickly integrate these functionalities into their applications.
  • Batch Processing: Pipelines can efficiently process multiple text inputs at once, improving performance for large datasets (see the short sketch right after this list).
  • Flexibility: While pipelines offer pre-built functionalities, they also allow customization through various parameters. You can fine-tune the processing steps based on your specific needs.
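
The batch-processing point above is easy to see in code. Here is a minimal sketch (the default model is downloaded on first use, and the exact scores your run produces will differ):

Python
from transformers import pipeline

# One pipeline instance can score a whole list of inputs in a single call
sentiment_analysis = pipeline("sentiment-analysis")

reviews = [
    "This movie was absolutely fantastic!",
    "The plot was predictable and the acting was flat.",
    "An average film, watchable but forgettable.",
]

# Passing a list returns one result dictionary per input
results = sentiment_analysis(reviews)
for review, result in zip(reviews, results):
    print(result["label"], round(result["score"], 3), "-", review)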

Unveiling the Power of Pipelines with Code Examples

Let's explore the capabilities of Transformers pipelines with some code examples:

1. Sentiment Analysis:

Sentiment analysis gauges the emotional tone of a piece of text (positive, negative, or neutral). Here's how to use a pipeline for sentiment analysis:

Python
from transformers import pipeline

# Initialize pipeline for sentiment analysis
sentiment_analysis = pipeline("sentiment-analysis")

# Analyze the sentiment of a sentence
sentence = "This movie was absolutely fantastic!"
sentiment = sentiment_analysis(sentence)

print(sentiment)

# Output: [{'label': 'POSITIVE', 'score': 0.9983537774009705}]

This code snippet imports the pipeline function and creates a sentiment-analysis pipeline instance. It then feeds it the sentence "This movie was absolutely fantastic!" and retrieves the sentiment information: a list with one dictionary (label and score) per input.

2. Question Answering:

Question answering pipelines allow you to extract answers to questions from a given context. Here's an example:

Python
from transformers import pipeline

# Initialize pipeline for question answering
question_answering = pipeline("question-answering")

# Context passage and question
passage = "Hugging Face Transformers is a powerful NLP library."
question = "What is Transformers?"

# Find the answer within the context
answer = question_answering({"context": passage, "question": question})

print(f"Answer: {answer['answer']}")

# Example output (model-dependent): Answer: a powerful NLP library

This code demonstrates question answering. It creates a question-answering pipeline and provides both the context passage and the question. The pipeline extracts a span from the context as the answer (here, "a powerful NLP library").

3. Customizing Pipelines:

Pipelines offer various parameters for customization. Here's how to modify the sentiment analysis example to include a specific model:

Python
from transformers import pipeline

# Specify the pre-trained model for sentiment analysis
sentiment_analysis = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")

# Analyze sentiment with the specified model
sentence = "Today is a gloomy day."
sentiment = sentiment_analysis(sentence)

print(sentiment)

In this example, the model parameter specifies the pre-trained model (distilbert-base-uncased-finetuned-sst-2-english) to be used for sentiment analysis. This allows you to leverage different models based on your task and performance requirements.
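
Beyond the model name, a pipeline accepts several other useful arguments. Here is a hedged sketch (the values are illustrative; device=0 assumes a GPU is available, and top_k is supported in recent versions of transformers, while older releases used return_all_scores=True instead):

Python
from transformers import pipeline

sentiment_analysis = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    device=0,      # run on the first GPU; omit or use -1 to stay on CPU
    batch_size=8,  # group inputs into batches of 8 during inference
)

# top_k=None asks the classification pipeline to return scores for every label
print(sentiment_analysis("Today is a gloomy day.", top_k=None))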

These are just a few examples showcasing the versatility of Transformers pipelines. The library offers pipelines for various tasks like summarization, feature extraction, text generation, and more. You can explore the comprehensive list of available pipelines in the Hugging Face documentation https://huggingface.co/docs/transformers/en/main_classes/pipelines.

Beyond the Code: Advantages and Considerations

While pipelines offer a convenient way to leverage NLP models, it's essential to consider some factors:

  • Black Box Nature: Pipelines abstract the underlying complexities, which can be beneficial for quick implementation. However, for advanced users who need more control over the processing steps, custom code might be necessary.
  • Limited Customization: While pipelines allow parameter adjustments, they may not expose every knob you need; for full control over preprocessing, decoding, or the model itself, working with the tokenizer and model classes directly is the better option.

Expanding Our NLP Toolkit: A Look at Transformers Pipelines with More Examples

In the previous section, we explored the fundamentals of Transformers pipelines and their functionalities with code examples for sentiment analysis and question answering. However, the Transformers library offers a much richer set of pipelines catering to diverse NLP tasks. Let's delve deeper and discover the potential of these pipelines with more examples, drawing inspiration from the resource: https://huggingface.co/learn/nlp-course/chapter1/3.

Unveiling a Broader Spectrum of Pipelines

The Hugging Face Transformers library boasts a comprehensive collection of pipelines, each tailored to a specific NLP requirement. Here's a glimpse into some of the pipelines you'll encounter:

  • Feature Extraction: Get the vector representation of a text, which can then be used as input features for downstream models.
  • Sentiment Analysis: As discussed earlier, this pipeline gauges the emotional tone of text (positive, negative, or neutral).
  • Zero-Shot Classification: This pipeline goes beyond pre-defined categories. It allows you to classify text data based on new classes you provide at runtime. Imagine classifying emails into "urgent," "informational," or "promotional" categories without explicitly training a model for these specific labels.
  • Text Generation: Unleash your creativity with this pipeline! Given a starting prompt, it generates new text: stories, poems, scripts, emails, code, and more. Generation parameters (such as maximum length, sampling strategy, and number of returned sequences) let you shape the output.
  • Fill-Mask: This pipeline is like a word-completion game on steroids. It takes a sentence with a masked token and predicts the most likely words to fill the blank, which is useful for probing what a masked language model has learned and for domain-specific auto-completion.
  • Named Entity Recognition (NER): Identify and classify named entities in text, such as people, organizations, locations, monetary values, percentages, dates, times, etc. This is crucial for information extraction tasks.
  • Question Answering: As seen previously, this pipeline finds answers to your questions within a given context.
  • Summarization: This pipeline condenses lengthy text passages into a shorter, informative summary, perfect for generating quick overviews of documents or articles.
  • Translation: Break down language barriers! This pipeline translates text from one language to another.

This is just a selection of the many Transformers pipelines available. The Hugging Face website provides a detailed list with information on their functionalities and usage https://huggingface.co/docs/transformers/en/main_classes/pipelines.
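
To make a couple of the entries above concrete, here is a brief sketch of the NER and summarization pipelines (default models are downloaded on first use, and the exact outputs depend on the models chosen):

Python
from transformers import pipeline

# Named Entity Recognition: group sub-word tokens back into whole entities
ner = pipeline("ner", grouped_entities=True)
print(ner("Hugging Face was founded in New York City."))
# Expect entities such as an organization ("Hugging Face") and a location ("New York City")

# Summarization: condense a longer passage into a short summary
summarizer = pipeline("summarization")
article = (
    "The Hugging Face Transformers library provides thousands of pre-trained "
    "models for tasks such as classification, translation, and summarization. "
    "Pipelines wrap these models behind a simple interface so developers can "
    "apply them with only a few lines of code."
)
print(summarizer(article, max_length=40, min_length=10))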

Code Examples in Action

Let's explore how we can leverage some of these pipelines with code examples:

1. Zero-Shot Classification:

Python
from transformers import pipeline

# Initialize pipeline for zero-shot classification
zero_shot_classifier = pipeline("zero-shot-classification")

# Define custom classes
custom_classes = ["urgent", "informational", "promotional"]

# Classify an email based on the custom classes
email_text = "This email contains important information about your upcoming flight."
classification = zero_shot_classifier(email_text, candidate_labels=custom_classes)

print(classification)

# Output (scores will vary): all candidate labels ranked by score, with 'informational' on top,
# e.g. {'sequence': '...', 'labels': ['informational', 'urgent', 'promotional'], 'scores': [...]}

This code snippet demonstrates zero-shot classification. We define custom classes ("urgent", "informational", "promotional") and pass them to the pipeline as candidate_labels. The pipeline scores the email against every candidate label and returns them ranked by confidence, with the most likely class ("informational") first.

2. Text Generation:

Python
from transformers import pipeline

# Initialize pipeline for text generation
text_generator = pipeline("text-generation")

# Generate different creative text formats
print(text_generator("Once upon a time,", max_length=50, num_return_sequences=2))

# Example output (results will vary between runs): [
#           "Once upon a time, there was a little girl who lived in a small village. She was a kind and curious girl, and she always loved to explore the forest behind her house.",
#           "Once upon a time, in a land far, far away, there lived a brave knight named Sir Lancelot. He was a loyal and courageous warrior, and he was always ready to defend his kingdom."
#         ]

This example showcases text generation. We provide a starting prompt ("Once upon a time,") and generate two different creative text continuations using the pipeline.

By incorporating these pipelines into your projects, you can unlock a vast array of NLP capabilities.

Conclusion

Transformers pipelines offer a powerful and user-friendly approach to applying pre-trained models for various NLP tasks. This blog post has provided a foundation for understanding pipelines, explored their functionalities with a wider range of examples, and highlighted practical considerations to keep in mind when choosing between pipelines and lower-level APIs.

Tags: Technology,Natural Language Processing,Large Language Models,

Friday, April 19, 2024

How your personality influences your habits And the Big-5 Personality Traits (From the book Atomic Habits by James Clear)

Your genes are operating beneath the surface of every habit. Indeed, beneath the surface of every behavior. Genes have been shown to influence everything from the number of hours you spend watching television to your likelihood to marry or divorce to your tendency to get addicted to drugs, alcohol, or nicotine. There's a strong genetic component to how obedient or rebellious you are when facing authority, how vulnerable or resistant you are to stressful events, how proactive or reactive you tend to be, and even how captivated or bored you feel during sensory experiences like attending a concert. As Robert Plomin, a behavioral geneticist at King's College in London, told me,
“It is now at the point where we have stopped testing to see if traits have a genetic component because we literally can't find a single one that isn't influenced by our genes.”

Bundled together, your unique cluster of genetic traits predispose you to a particular personality. Your personality is the set of characteristics that is consistent from situation to situation. The most proven scientific analysis of personality traits is known as the “Big Five,” which breaks them down into five spectrums of behavior.

1. Openness to experience: from curious and inventive on one end to cautious and consistent on the other.

2. Conscientiousness: organized and efficient to easygoing and spontaneous.

3. Extroversion: outgoing and energetic to solitary and reserved (you likely know them as extroverts vs. introverts).

4. Agreeableness: friendly and compassionate to challenging and detached.

5. Neuroticism: anxious and sensitive to confident, calm, and stable.

All five characteristics have biological underpinnings. Extroversion, for instance, can be tracked from birth. If scientists play a loud noise in the nursing ward, some babies turn toward it while others turn away.
When the researchers tracked these children through life, they found that the babies who turned toward the noise were more likely to grow up to be extroverts. Those who turned away were more likely to become introverts.
People who are high in agreeableness are kind, considerate, and warm. They also tend to have higher natural oxytocin levels, a hormone that plays an important role in social bonding, increases feelings of trust, and can act as a natural antidepressant. You can easily imagine how someone with more oxytocin might be inclined to build habits like writing thank-you notes or organizing social events.
As a third example, consider neuroticism, which is a personality trait all people possess to various degrees. People who are high in neuroticism tend to be anxious and worry more than others. This trait has been linked to hypersensitivity of the amygdala, the portion of the brain responsible for noticing threats. In other words, people who are more sensitive to negative cues in their environment are more likely to score high in neuroticism.
Our habits are not solely determined by our personalities, but there is no doubt that our genes nudge us in a certain direction. Our deeply rooted preferences make certain behaviors easier for some people than for others. You don't have to apologize for these differences or feel guilty about them, but you do have to work with them. A person who scores lower on conscientiousness, for example, will be less likely to be orderly by nature and may need to rely more heavily on environment design to stick with good habits. (As a reminder for the less conscientious readers among us, environment design is a strategy we discussed in Chapters 6 and 12.)
The takeaway is that you should build habits that work for your personality.* People can get ripped working out like a bodybuilder, but if you prefer rock climbing or cycling or rowing, then shape your exercise habit around your interests. If your friend follows a low-carb diet but you find that low-fat works for you, then more power to you. If you want to read more, don't be embarrassed if you prefer steamy romance novels over nonfiction. Read whatever fascinates you. You don't have to build the habits everyone tells you to build. Choose the habit that best suits you, not the one that is most popular.
There is a version of every habit that can bring you joy and satisfaction. Find it. Habits need to be enjoyable if they are going to stick. This is the core idea behind the 4th Law (Make it satisfying).

How are 'Openness to experience' and 'Conscientiousness' related?

Conscientiousness is doing what you need to do.

Openness to experience is how critical you are of the experience of doing things.

The idea is that being both conscientious and open to experience means you do what you need to do without thinking too much about how it might feel, and without trying to control every aspect of the experience of doing it.

On a side note, you can remember the Big Five personality traits using the acronym OCEAN:

1. Openness to experience
2. Conscientiousness
3. Extroversion
4. Agreeableness
5. Neuroticism

Tags: Behavioral Science,Book Summary,Psychology,

Thursday, April 18, 2024

Demystifying GLUE: A Benchmark for Natural Language Processing

First, let's try to understand what GLUE would mean to a layman before diving into the details...

Imagine you're trying to teach your friend how well their pet parrot understands them. You wouldn't just ask the parrot to mimic one phrase, right? You'd give it a variety of tasks to see its overall ability to understand language.

GLUE is kind of like that, but for machines that deal with text and language, called NLP models.

Here's the breakdown:

  • GLUE stands for General Language Understanding Evaluation. It's a big test with many parts, like a mini-Olympics for NLP models.
  • The test has nine parts, each focusing on a different language skill. There are tasks to see if the model can tell if sentences mean the same thing, understand jokes (or weird grammar!), and even answer questions based on reading passages.
  • By doing well on all these tasks, the model shows it has a good general understanding of language. It's like your parrot being able to mimic different sounds, answer questions, and maybe even sing a little tune!

GLUE is important because it helps researchers:

  • See how good NLP models are getting: As models do better on GLUE tests, it shows progress in the field.
  • Find areas for improvement: If a model struggles on a specific part, researchers can focus on making it better in that area.
  • Compare different models: Just like comparing athletes, GLUE helps see which models perform best on different language tasks.

So, the next time you hear about a new language translator or chatbot, remember GLUE – it might have played a part in making it work better!

Now The Details...

GLUE, which stands for General Language Understanding Evaluation, is a crucial benchmark in the field of Natural Language Processing (NLP). This blog post dives deep into GLUE, exploring its purpose, the tasks it encompasses, and its significance for NLP advancements.

What is GLUE?

Introduced by researchers from New York University, the University of Washington, and DeepMind, GLUE is a collection of challenging NLP tasks that assess a model's ability to understand and reason with language. It provides a standardized platform for evaluating the performance of NLP models across various tasks, allowing researchers to compare different approaches and track progress in the field.

The Tasks of GLUE

GLUE consists of nine individual NLP tasks, each focusing on a specific aspect of language understanding. These tasks can be broadly categorized into three areas:

  1. Single-Sentence Tasks:

    • CoLA (Corpus of Linguistic Acceptability): Evaluates whether a sentence is grammatically acceptable.
    • SST-2 (Stanford Sentiment Treebank): Assigns sentiment polarity (positive or negative) to sentences.
  2. Similarity and Paraphrase Tasks:

    • MRPC (Microsoft Research Paraphrase Corpus): Identifies whether one sentence is a paraphrase of another.
    • QQP (Quora Question Pairs): Identifies whether two questions are semantically equivalent (i.e., paraphrases of each other).
    • STS-B (Semantic Textual Similarity Benchmark): Measures the degree of semantic similarity between two sentences.
  3. Inference Tasks:

    • MNLI (Multi-Genre Natural Language Inference): Determines the relationship between a premise and a hypothesis (entailment, neutral, contradiction).
    • QNLI (Question-answering NLI): Determines whether a sentence contains the answer to a given question.
    • RTE (Recognizing Textual Entailment): Similar to MNLI, determines entailment between a text and a hypothesis.
    • WNLI (Winograd NLI): Leverages commonsense knowledge to resolve pronoun coreference.

By encompassing a diverse range of tasks, GLUE provides a comprehensive evaluation of a model's overall NLP capabilities.
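
If you want to poke at these tasks yourself, the benchmark data is available through the Hugging Face datasets library, and the matching metrics through evaluate. A minimal sketch (assumes both libraries are installed; "sst2" is just one of the nine task names):

Python
from datasets import load_dataset
import evaluate

# Load one GLUE task (here SST-2, the sentiment task) and inspect an example
sst2 = load_dataset("glue", "sst2")
print(sst2["train"][0])  # {'sentence': ..., 'label': 0 or 1, 'idx': ...}

# Load the official metric for the same task and score some dummy predictions
metric = evaluate.load("glue", "sst2")
print(metric.compute(predictions=[1, 0, 1], references=[1, 0, 0]))  # {'accuracy': ...}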

Why is GLUE Important?

GLUE has played a significant role in advancing NLP research in several ways:

  • Standardization: It offers a common ground for evaluating NLP models, facilitating comparisons between different approaches.
  • Progress Tracking: GLUE allows researchers to track the progress of the field by monitoring how models perform on the benchmark over time.
  • Identifying Weaknesses: By analyzing model performance on specific tasks, researchers can pinpoint areas where NLP models struggle and work towards improvements.
  • Benchmarking New Models: New NLP models can be readily evaluated using GLUE to assess their capabilities.

GLUE's impact extends beyond research. It also helps companies develop and deploy NLP-powered applications with a clearer understanding of model strengths and limitations.

Conclusion

GLUE serves as a cornerstone for evaluating and advancing the field of NLP. By providing a comprehensive benchmark, it fosters innovation and facilitates the development of more robust and versatile NLP models that can understand and interact with human language more effectively.

Reference: Research Paper

Tags: Technology,Large Language Models,Natural Language Processing,

Tuesday, April 16, 2024

Time Series Analysis and Forecasting Using Exponential Moving Average (A use case of traffic prediction on my blog)

In the previous two articles in this series, we explored the Poisson Distribution for modeling the traffic data on this blog (Link to Poisson Distribution Based Modeling).
And we encountered challenges there with outliers and local trends, as shown in this post: Data Issues

This prompted us to explore further, and we came up with the idea of using an Exponential Moving Average (EMA) based model for traffic prediction.

All of the code and data is available to download from this link:
Download Code and Data

First, a visual story:

1: Plotting EMA
2: EMA numbers we got
3: Predictions using ExponentialSmoothing (or EMA)

How we used Gemini to help us?

Can you share exponential moving average based python code for 1D time series prediction?

Absolutely, here's the Python code for 1D time series prediction using Exponential Moving Average (EMA):

Python
import pandas as pd
import matplotlib.pyplot as plt

def predict_ema(data, alpha):
  """
  This function calculates the Exponential Moving Average (EMA) for a given time series data.

  Args:
      data (pandas.Series): The time series data.
      alpha (float): The smoothing factor (0 < alpha <= 1).

  Returns:
      pandas.Series: The EMA of the time series data.
  """
  ema = [data.iloc[0]]  # Initialize with the first data point
  for i in range(1, len(data)):
    ema.append((alpha * data.iloc[i]) + ((1 - alpha) * ema[i - 1]))
  return pd.Series(ema, index=data.index)  # keep the original index so plots align

# Example usage
# Assuming your time series data is loaded in a pandas Series named 'values'
predictions = predict_ema(values, alpha=0.5)  # Adjust alpha as needed

# Plot the original data and the EMA predictions
values.plot(label="Original Data")
predictions.plot(label="EMA Prediction")
plt.legend()
plt.show()

This code defines a function predict_ema that takes the time series data and the smoothing factor (alpha) as arguments. The function iterates through the data points and calculates the EMA using the following formula at each step:

EMA[i] = (alpha * data[i]) + ((1 - alpha) * EMA[i-1])
    
  • alpha: Smoothing factor (0 < alpha <= 1). A higher alpha gives more weight to recent data points, resulting in a more responsive but potentially noisier EMA.
  • data[i]: The current data point.
  • EMA[i-1]: The EMA of the previous data point.

The example usage demonstrates how to call the function with your data and plot the original data along with the EMA predictions.
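
As a cross-check on the hand-rolled loop above, pandas ships a built-in exponentially weighted helper that computes the same recursion (a minimal sketch, assuming the same pandas Series named 'values'):

Python
# adjust=False gives the recursive form EMA[i] = alpha * x[i] + (1 - alpha) * EMA[i-1]
ema = values.ewm(alpha=0.5, adjust=False).mean()
print(ema.tail())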

Note:

  • This is a basic implementation. For more advanced functionalities, consider libraries like statsmodels.tsa.holtwinters which offer functionalities for trend and seasonality handling.
  • Exponential Moving Averages are best suited for capturing recent trends. They might not be ideal for data with strong seasonality or long-term trends.

Using statsmodels?

Certainly, here's the code for using statsmodels.tsa.holtwinters for exponential moving average (EMA) prediction in Python:

Python
from statsmodels.tsa.holtwinters import ExponentialSmoothing
import matplotlib.pyplot as plt

def predict_ema_statsmodels(data, alpha=0.5):
  """
  This function uses statsmodels to fit an Exponential Smoothing model
  and generate forecasts for a 1D time series data.

  Args:
      data (pandas.Series): The time series data.
      alpha (float): The smoothing factor (0 < alpha <= 1).

  Returns:
      pandas.Series: The forecast values.
  """
  model = ExponentialSmoothing(data, trend="add", seasonal=None)  # additive trend, no seasonality
  model_fit = model.fit(smoothing_level=alpha)  # use alpha as the smoothing level
  predictions = model_fit.forecast(len(data))
  return predictions

# Example usage
# Assuming your time series data is loaded in a pandas Series named 'values'
predictions = predict_ema_statsmodels(values, alpha=0.5)  # Adjust alpha as needed

# Plot the original data and the EMA predictions
values.plot(label="Original Data")
predictions.plot(label="EMA Prediction")
plt.legend()
plt.show()

This code utilizes the ExponentialSmoothing class from statsmodels.tsa.holtwinters. Here's a breakdown of the key points:

  1. Model Definition:

    • We create an ExponentialSmoothing model instance with:
      • data: The time series data (values in this example).
      • trend="add": This adds an additive (linear) trend component, i.e., Holt's linear method; for a plain EMA (simple exponential smoothing), you would set trend=None. Setting seasonal=None excludes seasonality.
  2. Model Fitting:

    • model_fit = model.fit(smoothing_level=alpha): This fits the model, using alpha as the smoothing level (omit the argument to let statsmodels estimate it from the data).
  3. Prediction:

    • predictions = model_fit.forecast(len(data)): This generates predictions for the next len(data) periods using the fitted model.

Advantages of using statsmodels:

  • Flexibility: It allows for handling trend and seasonality by adjusting the model arguments.
  • Additional functionalities: The model provides methods for calculating fit statistics, confidence intervals, and more.

Remember to replace values with your actual time series data and adjust the alpha value for smoothing as needed.

How do we validate the results?

I can provide some insights on interpreting the Exponential Moving Average (EMA) results and offer pointers for further validation:

General Observations from the Image:

  • The y-axis likely represents the predicted values (possibly scaled).
  • The x-axis seems to represent time steps.
  • There's a visible upward trend in the prediction, indicating a positive bias.

Code Validation Techniques:

  1. Visual Inspection:

    • Plot the original time series data alongside the predicted EMA values. This helps identify how well the EMA captures the underlying trend and seasonality (if applicable).
  2. Error Metrics:

    • Calculate error metrics like Mean Squared Error (MSE) or Mean Absolute Error (MAE) to quantify the difference between the predictions and actual future values (assuming you have those). Lower errors suggest a better fit.
  3. Domain Knowledge:

    • Consider your understanding of the time series data's domain. Does the predicted trend align with your expectations? Are there any outliers or unexpected patterns in the predictions?

Additional Considerations:

  • The value of the smoothing factor (alpha) can significantly impact the EMA's responsiveness to trends. Experiment with different alpha values to see how they affect the predictions.
  • If your data exhibits seasonality, you might want to explore seasonal extensions of exponential smoothing models provided by statsmodels.tsa.holtwinters.

By following these suggestions and incorporating your domain knowledge, you can effectively validate your EMA model and assess its suitability for your time series forecasting task.
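
As a small illustration of the error-metric idea from the list above, one simple approach is a holdout split: fit on the earlier part of the series and measure MAE/MSE on the most recent points (a sketch, again assuming a pandas Series named 'values'):

Python
import numpy as np

# Hold out the last 10 observations for validation
train, test = values[:-10], values[-10:]

# Naive EMA forecast: carry the last smoothed value of the training data forward
ema_train = train.ewm(alpha=0.5, adjust=False).mean()
forecast = np.repeat(ema_train.iloc[-1], len(test))

mae = np.mean(np.abs(test.values - forecast))
mse = np.mean((test.values - forecast) ** 2)
print(f"MAE: {mae:.2f}, MSE: {mse:.2f}")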

Conclusion

Based on the forecasts we are getting from ExponentialSmoothing:

90    10811.852982
91    10840.919992
92    10869.987003

we can say that we expect around 10K monthly views in the coming 2-3 months. On a side note: views for April 2024 had already touched 8,500 at the time of writing.

Monday, April 15, 2024

Maximum Slice Problem (using Kadane's Algo and Dynamic Programming)

Maximum slice problem

Let’s define a problem relating to maximum slices. You are given a sequence of n integers a[0], a[1], ..., a[n−1] and the task is to find the slice with the largest sum. More precisely, we are looking for two indices p, q such that the total a[p] + a[p+1] + ... + a[q] is maximal. We assume that the slice can be empty and its sum equals 0.
In the picture, the slice with the largest sum is highlighted in gray. The sum of this slice equals 10 and there is no slice with a larger sum. Notice that the slice we are looking for may contain negative integers, as shown above.

In other words...

A non-empty array A consisting of N integers is given. A pair of integers (P, Q), such that 0 ≤ P ≤ Q < N, is called a slice of array A. The sum of a slice (P, Q) is the total of A[P] + A[P+1] + ... + A[Q].

Write a function:

    def solution(A)

that, given an array A consisting of N integers, returns the maximum sum of any slice of A.

For example, given array A such that:
A[0] = 3  A[1] = 2  A[2] = -6
A[3] = 4  A[4] = 0

the function should return 5 because:

        (3, 4) is a slice of A that has sum 4,
        (2, 2) is a slice of A that has sum −6,
        (0, 1) is a slice of A that has sum 5,
        no other slice of A has sum greater than (0, 1).

Write an efficient algorithm for the following assumptions:

        N is an integer within the range [1..1,000,000];
        each element of array A is an integer within the range [−1,000,000..1,000,000];
        the result will be an integer within the range [−2,147,483,648..2,147,483,647].

Solution

Please solve this coding puzzle

The image you sent describes the Maximum Slice Problem. It is a coding challenge where you’re given an array of integers and you need to find the subarray (slice) with the largest sum. The slice can be empty, in which case the sum is zero.

Here’s how to solve this problem:

  1. Kadane’s Algorithm: There is an efficient algorithm to solve the maximum slice problem called Kadane’s Algorithm. It uses dynamic programming to find the maximum sum subarray in linear time (O(n)).

The idea is to keep track of two variables:

  • current_sum: This variable stores the sum of the current subarray.
  • max_sum: This variable stores the maximum sum of any subarray seen so far.

We iterate through the array and, at each element, do the following:

  • If the current_sum is less than zero, we restart the current subarray by setting it to the current element.
  • Otherwise, we add the current element to the current_sum.
  • We then compare the current_sum with the max_sum and update the max_sum if the current_sum is greater.

  2. Implementation: Here’s how you can implement Kadane’s Algorithm in Python:
Python
def max_slice_sum(arr):
  """
  Finds the maximum sum of a contiguous subarray in a given array.

  Args:
    arr: A list of integers.

  Returns:
    The maximum sum of a contiguous subarray in the array.
  """
  current_sum = max_sum = arr[0]
  for num in arr[1:]:
    current_sum = max(num, current_sum + num)
    max_sum = max(max_sum, current_sum)
  return max_sum

This function iterates over the array ‘arr’ and calculates the maximum sum of a slice.

You can call the function like this:

arr = [5, -7, 8, -4, 1]
print(max_slice_sum(arr))  # Output: 8
    

In this example, the maximum sum of a slice is 8 (the single-element slice containing just the 8).

I hope this helps!

Can you explain this code to a layman?

Imagine you're walking a trail and collecting rocks. Each rock has a weight, positive if it's nice and heavy, negative if it's a drag. You want to find the biggest haul you can carry, but you can't go back and drop rocks you've already picked up.

This code is like figuring out the heaviest load you can carry on that trail.

Here's how it works:

  1. Two piles:

    • Current pile: This is the pile of rocks you're carrying right now. It starts with just the first rock on the trail (the first number in the list).
    • Best pile ever: This keeps track of the heaviest pile you've found so far on the trail.
  2. Walking the trail:

    • You go through each rock on the trail (except the first one, since that's already in your current pile).
    • At each rock, you check your current pile:
      • If the current pile is dragging you down (its total weight is negative), you ditch it and start fresh with just this new rock.
      • If the current pile is okay (positive or zero weight), you add this new rock to it.
    • No matter what you did with the current pile, you then compare its weight to the "best pile ever." If the current pile is heavier, then the current pile becomes the new "best pile ever."
  3. The answer:

    • Once you've finished going through all the rocks, the "best pile ever" holds the heaviest load you could have carried.

The code uses fancy names for these piles ("current_sum" and "max_sum"), but that's the basic idea. It keeps track of the best possible haul you can make as you walk the trail (process the list).

Please listen carefully. 
What is the difference between following two functions:

def golden_max_slice(A):
    max_ending = max_slice = 0
    for a in A:
        max_ending = max(0, max_ending + a)
        max_slice = max(max_slice, max_ending)
    return max_slice

def max_slice_sum(arr):
    """
    Finds the maximum sum of a contiguous subarray in a given array.

    Args:
    arr: A list of integers.

    Returns:
    The maximum sum of a contiguous subarray in the array.
    """
    current_sum = max_sum = arr[0]
    for num in arr[1:]:
        current_sum = max(num, current_sum + num)
        max_sum = max(max_sum, current_sum)
    return max_sum

The two functions you provided, golden_max_slice and max_slice_sum, both solve the same problem: finding the maximum sum of a contiguous subarray in a given list of integers. However, they achieve this in slightly different ways:

1. Variable Naming:

  • max_slice_sum: This function uses more descriptive names.
    • current_sum: Clearly indicates the current sum of the subarray being considered.
    • max_sum: Represents the maximum sum found so far.
  • golden_max_slice: Uses shorter names.
    • max_ending: Less descriptive but conveys the idea of the maximum sum ending at the current element.
    • max_slice: Same as in max_slice_sum.

2. Handling Negative Sums (and the Empty Slice):

  • max_slice_sum: The line current_sum = max(num, current_sum + num) restarts the running sum at the current element whenever the previous running sum is negative (adding it would only drag the total down). Because it is seeded with arr[0], it always returns the sum of a non-empty slice, so for an all-negative array it returns the largest (least negative) element.
  • golden_max_slice: The line max_ending = max(0, max_ending + a) clamps the running sum at 0 whenever max_ending + a is negative. Starting max_slice at 0 effectively allows an empty slice, so for an all-negative array this version returns 0.

In essence:

  • Both functions implement Kadane's Algorithm.
  • max_slice_sum might be easier to understand for beginners due to its clear variable names and explicit handling of negative sums.
  • golden_max_slice is a more concise approach that achieves the same result.

Efficiency:

Both functions have a time complexity of O(n), which is optimal for this problem as they iterate through the list once.
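
A quick way to see the behavioral difference described above is to run both functions on an all-negative array. A minimal sketch (both definitions are repeated here so the snippet runs on its own):

Python
def golden_max_slice(A):
    # Allows the empty slice: the running sum is clamped at 0
    max_ending = max_slice = 0
    for a in A:
        max_ending = max(0, max_ending + a)
        max_slice = max(max_slice, max_ending)
    return max_slice

def max_slice_sum(arr):
    # Requires a non-empty slice: seeded with the first element
    current_sum = max_sum = arr[0]
    for num in arr[1:]:
        current_sum = max(num, current_sum + num)
        max_sum = max(max_sum, current_sum)
    return max_sum

print(golden_max_slice([-3, -1, -2]))  # 0  (empty slice allowed)
print(max_slice_sum([-3, -1, -2]))     # -1 (best non-empty slice)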

Tags: Technology,Algorithms,Python,

Saturday, April 13, 2024

Make Mistakes to Learn Better And The Role That Neurotransmitter Acetylcholine Plays in Learning

From 15 min 10 seconds...
Next, we have mistakes, and I know making mistakes is terrifying.
It is really scary, but there's a biological reason behind that.
So that feeling of anxiety and stress you get when you make a mistake, it serves a really important purpose.
So when you make a mistake, what happens is you're releasing neuromodulators like Acetylcholine.
And you're getting increased activity in your focused attention networks.
And that increase in attention and that feeling of anxiety serves a really important purpose.
It's basically saying to us "Hey, you made a mistake. You need to change and do better and become more efficient."
And it's opening up this window for neuroplasticity.
So, whatever happens next, your brain is ready to take in.
Now, if you make a mistake and you feel a bit anxious and you walk away, well: 
A: you're not going to learn that thing, 
B: well you're actually learning to be less able to cope with failure.

Here's what you should do.

You should set yourself up for a little bit of failure, right?
Quiz yourself on that topic as you go. Don't wait until you're ready.
If you're learning something - for example, soccer - don't just kick it straight at the goal.
Change the angle. Make it more difficult, so you make mistakes.
Don't wait for everything to be perfect before you have a go, because at the end of the day if you make a mistake, you'll be releasing neuromodulators that improve your attention. And if you get it right, you'll be releasing things like dopamine in your reward circuits, which makes you feel good, which makes you feel more motivated and consolidates the learning of the thing that you just did correctly, right?

So that's why turning our learning into a bit of a game can work so well. It's a bit of a win-win situation for our brain either way, right?
So when you make a mistake, you know, don't view that anxiety as a bad thing.
Lean into that feeling and keep going because it's really your brain's way of helping you be your best.
It's helping you be better than the person that you were yesterday.    
Tags: Psychology,Behavioral Science,

How to Build Better Habits in 4 Simple Steps (Ch 3 from the book Atomic Habits)

THE SCIENCE OF HOW HABITS WORK

The process of building a habit can be divided into four simple steps: cue, craving, response, and reward.* Breaking it down into these fundamental parts can help us understand what a habit is, how it works, and how to improve it.
FIGURE 5: All habits proceed through four stages in the same order: cue, craving, response, and reward.

This four-step pattern is the backbone of every habit, and your brain runs through these steps in the same order each time. First, there is the cue. The cue triggers your brain to initiate a behavior. It is a bit of information that predicts a reward. Our prehistoric ancestors were paying attention to cues that signaled the location of primary rewards like food, water, and sex. Today, we spend most of our time learning cues that predict secondary rewards like money and fame, power and status, praise and approval, love and friendship, or a sense of personal satisfaction. (Of course, these pursuits also indirectly improve our odds of survival and reproduction, which is the deeper motive behind everything we do.) Your mind is continuously analyzing your internal and external environment for hints of where rewards are located. Because the cue is the first indication that we're close to a reward, it naturally leads to a craving.

Cravings are the second step, and they are the motivational force behind every habit. Without some level of motivation or desire—without craving a change—we have no reason to act. What you crave is not the habit itself but the change in state it delivers. You do not crave smoking a cigarette, you crave the feeling of relief it provides. You are not motivated by brushing your teeth but rather by the feeling of a clean mouth. You do not want to turn on the television, you want to be entertained. Every craving is linked to a desire to change your internal state. This is an important point that we will discuss in detail later. Cravings differ from person to person. In theory, any piece of information could trigger a craving, but in practice, people are not motivated by the same cues. For a gambler, the sound of slot machines can be a potent trigger that sparks an intense wave of desire. For someone who rarely gambles, the jingles and chimes of the casino are just background noise. Cues are meaningless until they are interpreted. The thoughts, feelings, and emotions of the observer are what transform a cue into a craving.

The third step is the response. The response is the actual habit you perform, which can take the form of a thought or an action. Whether a response occurs depends on how motivated you are and how much friction is associated with the behavior. If a particular action requires more physical or mental effort than you are willing to expend, then you won't do it. Your response also depends on your ability. It sounds simple, but a habit can occur only if you are capable of doing it. If you want to dunk a basketball but can't jump high enough to reach the hoop, well, you're out of luck.

Finally, the response delivers a reward. Rewards are the end goal of every habit. The cue is about noticing the reward. The craving is about wanting the reward. The response is about obtaining the reward. We chase rewards because they serve two purposes: (1) they satisfy us and (2) they teach us. The first purpose of rewards is to satisfy your craving. Yes, rewards provide benefits on their own. Food and water deliver the energy you need to survive. Getting a promotion brings more money and respect. Getting in shape improves your health and your dating prospects. But the more immediate benefit is that rewards satisfy your craving to eat or to gain status or to win approval. At least for a moment, rewards deliver contentment and relief from craving.
Second, rewards teach us which actions are worth remembering in the future. Your brain is a reward detector. As you go about your life, your sensory nervous system is continuously monitoring which actions satisfy your desires and deliver pleasure. Feelings of pleasure and disappointment are part of the feedback mechanism that helps your brain distinguish useful actions from useless ones. Rewards close the feedback loop and complete the habit cycle. If a behavior is insufficient in any of the four stages, it will not become a habit. Eliminate the cue and your habit will never start. Reduce the craving and you won't experience enough motivation to act. Make the behavior difficult and you won't be able to do it. And if the reward fails to satisfy your desire, then you'll have no reason to do it again in the future. Without the first three steps, a behavior will not occur. Without all four, a behavior will not be repeated.

THE HABIT LOOP

FIGURE 6: The four stages of habit are best described as a feedback loop. They form an endless cycle that is running every moment you are alive. This “habit loop” is continually scanning the environment, predicting what will happen next, trying out different responses, and learning from the results.*

In summary, the cue triggers a craving, which motivates a response, which provides a reward, which satisfies the craving and, ultimately, becomes associated with the cue. Together, these four steps form a neurological feedback loop—cue, craving, response, reward; cue, craving, response, reward—that ultimately allows you to create automatic habits. This cycle is known as the habit loop. This four-step process is not something that happens occasionally, but rather it is an endless feedback loop that is running and active during every moment you are alive—even now. The brain is continually scanning the environment, predicting what will happen next, trying out different responses, and learning from the results. The entire process is completed in a split second, and we use it again and again without realizing everything that has been packed into the previous moment.

We can split these four steps into two phases: the problem phase and the solution phase. The problem phase includes the cue and the craving, and it is when you realize that something needs to change. The solution phase includes the response and the reward, and it is when you take action and achieve the change you desire. Problem phase: 1. Cue, 2. Craving. Solution phase: 3. Response, 4. Reward.

All behavior is driven by the desire to solve a problem. Sometimes the problem is that you notice something good and you want to obtain it. Sometimes the problem is that you are experiencing pain and you want to relieve it. Either way, the purpose of every habit is to solve the problems you face. In the table on the following page, you can see a few examples of what this looks like in real life. Imagine walking into a dark room and flipping on the light switch. You have performed this simple habit so many times that it occurs without thinking. You proceed through all four stages in the fraction of a second. The urge to act strikes you without thinking.

Problem phase: 1. Cue: Your phone buzzes with a new text message. 2. Craving: You want to learn the contents of the message. Solution phase: 3. Response: You grab your phone and read the text. 4. Reward: You satisfy your craving to read the message. Grabbing your phone becomes associated with your phone buzzing.

~~~

Problem phase: 1. Cue: You are answering emails. 2. Craving: You begin to feel stressed and overwhelmed by work. You want to feel in control. Solution phase: 3. Response: You bite your nails. 4. Reward: You satisfy your craving to reduce stress. Biting your nails becomes associated with answering email.

~~~

Problem phase: 1. Cue: You wake up. 2. Craving: You want to feel alert. Solution phase: 3. Response: You drink a cup of coffee. 4. Reward: You satisfy your craving to feel alert. Drinking coffee becomes associated with waking up.

~~~

Problem phase: 1. Cue: You smell a doughnut shop as you walk down the street near your office. 2. Craving: You begin to crave a doughnut. Solution phase: 3. Response: You buy a doughnut and eat it. 4. Reward: You satisfy your craving to eat a doughnut. Buying a doughnut becomes associated with walking down the street near your office.

~~~

Problem phase: 1. Cue: You hit a stumbling block on a project at work. 2. Craving: You feel stuck and want to relieve your frustration. Solution phase: 3. Response: You pull out your phone and check social media. 4. Reward: You satisfy your craving to feel relieved. Checking social media becomes associated with feeling stalled at work.

~~~

Problem phase: 1. Cue: You walk into a dark room. 2. Craving: You want to be able to see. Solution phase: 3. Response: You flip the light switch. 4. Reward: You satisfy your craving to see. Turning on the light switch becomes associated with being in a dark room.

By the time we become adults, we rarely notice the habits that are running our lives. Most of us never give a second thought to the fact that we tie the same shoe first each morning, or unplug the toaster after each use, or always change into comfortable clothes after getting home from work. After decades of mental programming, we automatically slip into these patterns of thinking and acting.

THE FOUR LAWS OF BEHAVIOR CHANGE

In the following chapters, we will see time and again how the four stages of cue, craving, response, and reward influence nearly everything we do each day. But before we do that, we need to transform these four steps into a practical framework that we can use to design good habits and eliminate bad ones. I refer to this framework as the Four Laws of Behavior Change, and it provides a simple set of rules for creating good habits and breaking bad ones. You can think of each law as a lever that influences human behavior. When the levers are in the right positions, creating good habits is effortless. When they are in the wrong positions, it is nearly impossible.

How to Create a Good Habit

The 1st law (Cue): Make it obvious.
The 2nd law (Craving): Make it attractive.
The 3rd law (Response): Make it easy.
The 4th law (Reward): Make it satisfying.

We can invert these laws to learn how to break a bad habit.

How to Break a Bad Habit

Inversion of the 1st law (Cue): Make it invisible.
Inversion of the 2nd law (Craving): Make it unattractive.
Inversion of the 3rd law (Response): Make it difficult.
Inversion of the 4th law (Reward): Make it unsatisfying.

It would be irresponsible for me to claim that these four laws are an exhaustive framework for changing any human behavior, but I think they're close. As you will soon see, the Four Laws of Behavior Change apply to nearly every field, from sports to politics, art to medicine, comedy to management. These laws can be used no matter what challenge you are facing. There is no need for completely different strategies for each habit. Whenever you want to change your behavior, you can simply ask yourself: 1. How can I make it obvious? 2. How can I make it attractive? 3. How can I make it easy? 4. How can I make it satisfying? If you have ever wondered, “Why don't I do what I say I'm going to do? Why don't I lose the weight or stop smoking or save for retirement or start that side business? Why do I say something is important but never seem to make time for it?” The answers to those questions can be found somewhere in these four laws. The key to creating good habits and breaking bad ones is to understand these fundamental laws and how to alter them to your specifications. Every goal is doomed to fail if it goes against the grain of human nature. Your habits are shaped by the systems in your life. In the chapters that follow, we will discuss these laws one by one and show how you can use them to create a system in which good habits emerge naturally and bad habits wither away.

Key Points

  • A habit is a behavior that has been repeated enough times to become automatic.
  • The ultimate purpose of habits is to solve the problems of life with as little energy and effort as possible.
  • Any habit can be broken down into a feedback loop that involves four steps: cue, craving, response, and reward.
  • The Four Laws of Behavior Change are a simple set of rules we can use to build better habits. They are: (1) make it obvious, (2) make it attractive, (3) make it easy, and (4) make it satisfying.

Friday, April 12, 2024

When not to use Poisson Distribution for prediction?

Previously we had seen how we were trying to fit Poisson Distribution to the traffic data on our blog. 

Link to the Post

Today, we study why our analysis could be wrong.

By looking at this plot, we suspect three things:

  1. The data has outliers: a bunch of points around May 2021.
  2. The data has a local trend: upward till Jan 2022 and then downward till Mar 2023.
  3. Because of the above two points, we feel there is no constant rate (average) for the events occurring within a specific time interval.

Can we use poisson distribution when we suspect there are outliers in the data?

The Poisson distribution has its limitations when dealing with outliers. Here's why:

  • Underlying Assumption: The Poisson distribution assumes a constant rate of events across the observed interval. Outliers, by definition, deviate significantly from this expected pattern.

  • Impact on Results: If outliers are present, the Poisson model might not accurately reflect the actual probability of observing certain counts. This can lead to misleading interpretations of the data.

However, there are still ways to approach this situation:

  • Outlier Detection: You can use statistical methods to identify potential outliers in your data. Techniques like Grubbs' test or examining standardized residuals can help [3].

  • Alternative Models: Depending on the nature of the outliers, alternative models like the negative binomial distribution might be a better fit. This model accounts for overdispersion, which is a higher variance than expected by the Poisson distribution, often caused by outliers [1].

  • Data Transformation: In some cases, transforming the data (like using the Anscombe transform) before applying the Poisson model can help reduce the influence of outliers [3].

Ultimately, the decision of using the Poisson distribution with outliers depends on the severity of the outliers and the purpose of your analysis.
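
If you want to check for overdispersion and try the negative binomial alternative mentioned above, a hedged sketch with statsmodels looks like this (the counts here are toy data with one outlier, not our actual traffic numbers):

Python
import numpy as np
import statsmodels.api as sm

y = np.array([310, 295, 330, 900, 315, 325, 305, 340])  # toy counts with an outlier
X = np.ones((len(y), 1))                                 # intercept-only design matrix

poisson_fit = sm.GLM(y, X, family=sm.families.Poisson()).fit()
negbin_fit = sm.GLM(y, X, family=sm.families.NegativeBinomial()).fit()

# A Pearson chi2 / df ratio well above 1 suggests overdispersion under the Poisson model
print("Poisson dispersion:", poisson_fit.pearson_chi2 / poisson_fit.df_resid)
print("Poisson AIC:", poisson_fit.aic, " NegBin AIC:", negbin_fit.aic)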


And can we use poisson distribution when we suspect there is a trend in the data?

No, the standard Poisson distribution is not ideal for data with a suspected trend. Here's why:

  • Constant Rate Assumption: The Poisson distribution assumes a constant rate (average) for the events occurring within a specific time interval. A trend implies a non-constant rate, where the frequency of events increases or decreases over time.

  • Misleading Results: Using the Poisson model with a trend can lead to inaccurate estimates of event probabilities and unreliable hypothesis tests.

However, there are ways to analyze count data with trends:

  • Poisson Regression: This statistical technique incorporates a trend term into the Poisson model. You can use a linear trend or even more complex functions to capture the non-constant rate [2].

  • Segmented Poisson Regression: This approach divides the data into segments with potentially different rates. It's useful when the trend has breaks or shifts at specific points [2].

  • Change-Point Detection: Methods exist to statistically identify the time points where the event rate changes, allowing for further analysis of segments with constant rates [1].

In conclusion, the standard Poisson distribution is not suitable for data with trends. Consider using Poisson regression or other techniques that account for the non-constant event rate.
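
To show what "Poisson regression with a trend term" means in practice, here is a minimal sketch (again with toy counts; a real analysis would use the blog's monthly view counts):

Python
import numpy as np
import statsmodels.api as sm

y = np.array([120, 135, 150, 170, 185, 210, 230, 260])  # toy monthly counts with an upward trend
t = np.arange(len(y))                                    # time index used as the trend term
X = sm.add_constant(t)                                   # intercept + linear trend

model = sm.GLM(y, X, family=sm.families.Poisson()).fit()
print(model.params)  # intercept and trend coefficient (on the log scale)

# Forecast the next 3 periods by extending the time index
future_t = sm.add_constant(np.arange(len(y), len(y) + 3))
print(model.predict(future_t))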


Tags: Mathematical Foundations for Data Science,Technology,

Sunday, April 7, 2024

What is zero shot? Single shot? Few Shots in terms of LLMs?

In the context of Language Model (LM) training and fine-tuning, terms like "zero-shot," "single-shot," and "few-shot" refer to different approaches for adapting a pre-trained LM to a specific task without extensive task-specific training data. These terms are commonly associated with transfer learning, where a model pre-trained on a large corpus of text is fine-tuned on a smaller dataset for a specific downstream task.

  1. Zero-shot Learning: In zero-shot learning, the model is directly applied to a task without any task-specific training examples. Instead, the model leverages its pre-trained knowledge to make predictions on the task. For example, if a pre-trained LM has been trained on a diverse range of text data, you could directly use it for tasks like text classification or text generation without fine-tuning on task-specific data.

  2. Single-shot Learning: Single-shot learning involves providing only a single example or a very small amount of labeled data for the task. The model then learns from this limited supervision to make predictions on similar examples. This approach is useful when labeled data is scarce or expensive to obtain. The model may adapt its parameters slightly to accommodate the provided example during training.

  3. Few-shot Learning: Few-shot learning is similar to single-shot learning but involves providing a few examples (more than one) for the task. These examples are typically used to fine-tune the pre-trained model on the specific task. Few-shot learning enables the model to generalize better compared to single-shot learning as it has more training instances to learn from. Techniques like meta-learning or transfer learning from related tasks can enhance few-shot learning performance.

In summary, zero-shot learning does not involve any task-specific training examples, while single-shot learning involves a single example or very limited labeled data, and few-shot learning involves a small number of examples for the task. These approaches allow pre-trained LMs to adapt to various downstream tasks efficiently, leveraging their learned representations and capabilities.
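
In practice, these terms also describe how you prompt a completion or instruction-following model: a zero-shot prompt contains only the task, while a few-shot prompt prepends a handful of worked examples. A hedged sketch (the prompts are illustrative, and any text-generation model could stand in for the small one shown here):

Python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Zero-shot: state the task with no examples
zero_shot_prompt = (
    "Classify the sentiment of this review as Positive or Negative.\n"
    "Review: The battery dies within an hour.\nSentiment:"
)

# Few-shot: show a couple of labeled examples first, then the new case
few_shot_prompt = (
    "Review: I love this phone, the camera is amazing.\nSentiment: Positive\n"
    "Review: The screen cracked after one day.\nSentiment: Negative\n"
    "Review: The battery dies within an hour.\nSentiment:"
)

for prompt in (zero_shot_prompt, few_shot_prompt):
    print(generator(prompt, max_new_tokens=3, do_sample=False)[0]["generated_text"])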