
Saturday, April 20, 2024

Streamlining NLP Tasks: A Deep Dive into Hugging Face Transformers Pipelines

The Hugging Face Transformers library has become a powerhouse for Natural Language Processing (NLP) tasks. While its core functionality revolves around pre-trained models and tokenization, the library offers a powerful abstraction layer called pipelines. Pipelines simplify the process of applying these models to real-world NLP applications. This blog post delves into the world of Transformers pipelines, exploring their capabilities, functionalities, and code examples.

What are Transformers Pipelines?

Imagine you have a toolbox filled with specialized tools for various construction tasks. Using each tool effectively requires knowledge of its operation and purpose. Transformers pipelines act similarly. They encapsulate the complexities involved in using pre-trained models for NLP tasks, providing a user-friendly interface for inference.

Here's a breakdown of what pipelines offer:

  • Simplified Model Usage: Pipelines hide the underlying complexities of loading models, tokenization, and model execution. You don't need to write intricate code for each step; the pipeline handles it all.
  • Task-Specific Functionality: Pipelines are designed for specific NLP tasks like sentiment analysis, question answering, or named entity recognition. This makes them ideal for developers who want to quickly integrate these functionalities into their applications.
  • Batch Processing: Pipelines can efficiently process multiple text inputs at once, improving performance for large datasets.
  • Flexibility: While pipelines offer pre-built functionalities, they also allow customization through various parameters. You can fine-tune the processing steps based on your specific needs.

Unveiling the Power of Pipelines with Code Examples

Let's explore the capabilities of Transformers pipelines with some code examples:

1. Sentiment Analysis:

Sentiment analysis gauges the emotional tone of a piece of text (positive, negative, or neutral). Here's how to use a pipeline for sentiment analysis:

Python
from transformers import pipeline

# Initialize pipeline for sentiment analysis
sentiment_analysis = pipeline("sentiment-analysis")

# Analyze the sentiment of a sentence
sentence = "This movie was absolutely fantastic!"
sentiment = sentiment_analysis(sentence)

print(sentiment)

# Output: [{'label': 'POSITIVE', 'score': 0.9983537774009705}]

This code snippet imports the pipeline function and creates a sentiment-analysis pipeline instance. It then feeds in the sentence "This movie was absolutely fantastic!" and retrieves the sentiment information: a list with one result per input, each holding a label and a score.
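
Pipelines also accept a list of inputs and return one result per input, which is how the batch processing mentioned earlier works in practice. Here is a small sketch reusing the sentiment_analysis pipeline created above (the example sentences are invented):

Python
# Analyze several sentences in one call; the pipeline returns one dict per input
sentences = ["The plot was painfully predictable.", "The soundtrack was wonderful!"]
results = sentiment_analysis(sentences)

for sentence, result in zip(sentences, results):
    print(sentence, "->", result["label"], round(result["score"], 4))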

2. Question Answering:

Question answering pipelines allow you to extract answers to questions from a given context. Here's an example:

Python
from transformers import pipeline

# Initialize pipeline for question answering
question_answering = pipeline("question-answering")

# Context passage and question
passage = "Hugging Face Transformers is a powerful NLP library."
question = "What is Transformers?"

# Find the answer within the context
answer = question_answering({"context": passage, "question": question})

print(f"Answer: {answer['answer']}")

# Example output: Answer: Transformers (the exact span depends on the underlying model)

This code demonstrates question answering. It creates a question-answering pipeline and provides both the context passage and the question. The pipeline extracts the most likely answer span from the context.
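
Note that the returned value holds more than the answer string: the QA pipeline also reports a confidence score and the character offsets of the answer span within the context. A small sketch reusing the objects defined above (keyword arguments work too):

Python
# The result is a dict with the span, its confidence score, and character offsets
answer = question_answering(question=question, context=passage)
print(answer)  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': ...}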

3. Customizing Pipelines:

Pipelines offer various parameters for customization. Here's how to modify the sentiment analysis example to include a specific model:

Python
from transformers import pipeline

# Specify the pre-trained model for sentiment analysis
sentiment_analysis = pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english")

# Analyze sentiment with the specified model
sentence = "Today is a gloomy day."
sentiment = sentiment_analysis(sentence)

print(sentiment)

In this example, the model parameter specifies the pre-trained model (distilbert-base-uncased-finetuned-sst-2-english) to be used for sentiment analysis. This allows you to leverage different models based on your task and performance requirements.

These are just a few examples showcasing the versatility of Transformers pipelines. The library offers pipelines for various tasks like summarization, feature extraction, text generation, and more. You can explore the comprehensive list of available pipelines in the Hugging Face documentation https://huggingface.co/docs/transformers/en/main_classes/pipelines.
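
As one more illustration, summarization follows the same pattern as the examples above. A minimal sketch (the input text is invented, and with no model argument the pipeline falls back to a default summarization checkpoint):

Python
from transformers import pipeline

# Initialize pipeline for summarization; a default model is downloaded when none is specified
summarizer = pipeline("summarization")

long_text = (
    "Hugging Face Transformers provides thousands of pre-trained models for tasks such as "
    "classification, question answering, summarization, and translation. Pipelines wrap these "
    "models behind a simple, task-oriented interface so applications can use them in a few lines."
)

# min_length and max_length bound the length of the generated summary (in tokens)
summary = summarizer(long_text, max_length=40, min_length=10)
print(summary[0]["summary_text"])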

Beyond the Code: Advantages and Considerations

While pipelines offer a convenient way to leverage NLP models, it's essential to consider some factors:

  • Black Box Nature: Pipelines abstract the underlying complexities, which can be beneficial for quick implementation. However, for advanced users who need more control over the processing steps, custom code might be necessary.
  • Limited Customization: While pipelines allow parameter adjustments, they may not expose every processing detail; highly specialized workflows may still require working with the model and tokenizer classes directly.

Expanding Our NLP Toolkit: A Look at Transformers Pipelines with More Examples

In the previous section, we explored the fundamentals of Transformers pipelines and their functionalities with code examples for sentiment analysis and question answering. However, the Transformers library offers a much richer set of pipelines catering to diverse NLP tasks. Let's delve deeper and discover the potential of these pipelines with more examples, drawing inspiration from the resource: https://huggingface.co/learn/nlp-course/chapter1/3.

Unveiling a Broader Spectrum of Pipelines

The Hugging Face Transformers library boasts a comprehensive collection of pipelines, each tailored to a specific NLP requirement. Here's a glimpse into some of the pipelines you'll encounter:

  • Feature Extraction: This pipeline returns the vector representation (embedding) of a text, which can serve as input features for downstream models.
  • Sentiment Analysis: As discussed earlier, this pipeline gauges the emotional tone of text (positive, negative, or neutral).
  • Zero-Shot Classification: This pipeline goes beyond pre-defined categories. It allows you to classify text data based on new classes you provide at runtime. Imagine classifying emails into "urgent," "informational," or "promotional" categories without explicitly training a model for these specific labels.
  • Text Generation: Unleash your creativity with this pipeline! It generates text continuations in many formats, such as stories, poems, scripts, or emails. You provide a starting prompt, and generation parameters control how the continuation is produced.
  • Fill-Mask: This pipeline is like a word completion game on steroids. It takes a sentence with a masked token and predicts the most likely words to fill the blank. This is useful for probing what a masked language model has learned and for data augmentation.
  • Named Entity Recognition (NER): Identify and classify named entities in text, such as people, organizations, locations, monetary values, percentages, dates, times, etc. This is crucial for information extraction tasks.
  • Question Answering: As seen previously, this pipeline finds answers to your questions within a given context.
  • Summarization: This pipeline condenses lengthy text passages into a shorter, informative summary, perfect for generating quick overviews of documents or articles.
  • Translation: Break down language barriers! This pipeline translates text from one language to another.

This is just a selection of the many Transformers pipelines available. The Hugging Face website provides a detailed list with information on their functionalities and usage https://huggingface.co/docs/transformers/en/main_classes/pipelines.

Code Examples in Action

Let's explore how we can leverage some of these pipelines with code examples:

1. Zero-Shot Classification:

Python
from transformers import pipeline

# Initialize pipeline for zero-shot classification
zero_shot_classifier = pipeline("zero-shot-classification")

# Define custom classes
custom_classes = ["urgent", "informational", "promotional"]

# Classify an email based on custom classes
email_text = "This email contains important information about your upcoming flight."
classification = zero_shot_classifier(email_text, candidate_labels=custom_classes)

print(classification)

# Output (abridged; all candidate labels are returned, sorted by score):
# {'labels': ['informational', ...], 'scores': [0.9998778791427612, ...]}

This code snippet demonstrates zero-shot classification. We define custom classes ("urgent", "informational", "promotional"), pass them as candidate_labels, and classify an email. The pipeline ranks the classes, placing the most likely one ("informational") first, along with confidence scores.

2. Text Generation:

Python
from transformers import pipeline

# Initialize pipeline for text generation
text_generator = pipeline("text-generation")

# Generate two different continuations (do_sample=True is required for multiple return sequences)
print(text_generator("Once upon a time,", max_length=50, num_return_sequences=2, do_sample=True))

# Example output (will vary between runs): [
#           "Once upon a time, there was a little girl who lived in a small village. She was a kind and curious girl, and she always loved to explore the forest behind her house.",
#           "Once upon a time, in a land far, far away, there lived a brave knight named Sir Lancelot. He was a loyal and courageous warrior, and he was always ready to defend his kingdom."
#         ]

This example showcases text generation. We provide a starting prompt ("Once upon a time,") and generate two different creative text continuations using the pipeline.
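
3. Fill-Mask and Named Entity Recognition:

Two pipelines from the list above that we haven't demonstrated yet. Here is a minimal sketch of both, relying on the default checkpoints the pipeline falls back to when no model is specified (the example sentences are invented; the mask string differs between models, so it is safest to read it from the pipeline's tokenizer):

Python
from transformers import pipeline

# Fill-mask: predict the most likely tokens for the masked position
fill_mask = pipeline("fill-mask")
mask_token = fill_mask.tokenizer.mask_token  # e.g. "<mask>" or "[MASK]", depending on the model
for prediction in fill_mask(f"Paris is the {mask_token} of France.")[:2]:
    print(prediction["token_str"], round(prediction["score"], 4))

# Named entity recognition: tag people, organizations, locations, etc.
ner = pipeline("ner", aggregation_strategy="simple")
print(ner("Hugging Face is based in New York City."))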

By incorporating these pipelines into your projects, you can unlock a vast array of NLP capabilities.

Conclusion

Transformers pipelines offer a powerful and user-friendly approach to applying pre-trained models to various NLP tasks. This blog post has provided a foundation for understanding pipelines, explored their functionalities with a wider range of examples, and highlighted the trade-offs to weigh when choosing between pipelines and custom code.

Tags: Technology,Natural Language Processing,Large Language Models,

Thursday, April 18, 2024

Demystifying GLUE: A Benchmark for Natural Language Processing

First, let's try to understand what GLUE would mean to a layman before diving into the details...

Imagine you're trying to teach your friend how well their pet parrot understands them. You wouldn't just ask the parrot to mimic one phrase, right? You'd give it a variety of tasks to see its overall ability to understand language.

GLUE is kind of like that, but for machines that deal with text and language, called NLP models.

Here's the breakdown:

  • GLUE stands for General Language Understanding Evaluation. It's a big test with many parts, like a mini-Olympics for NLP models.
  • The test has nine parts, each focusing on a different language skill. There are tasks to see if the model can tell if sentences mean the same thing, understand jokes (or weird grammar!), and even answer questions based on reading passages.
  • By doing well on all these tasks, the model shows it has a good general understanding of language. It's like your parrot being able to mimic different sounds, answer questions, and maybe even sing a little tune!

GLUE is important because it helps researchers:

  • See how good NLP models are getting: As models do better on GLUE tests, it shows progress in the field.
  • Find areas for improvement: If a model struggles on a specific part, researchers can focus on making it better in that area.
  • Compare different models: Just like comparing athletes, GLUE helps see which models perform best on different language tasks.

So, the next time you hear about a new language translator or chatbot, remember GLUE – it might have played a part in making it work better!

Now The Details...

GLUE, which stands for General Language Understanding Evaluation, is a crucial benchmark in the field of Natural Language Processing (NLP). This blog post dives deep into GLUE, exploring its purpose, the tasks it encompasses, and its significance for NLP advancements.

What is GLUE?

Developed by researchers from New York University, the University of Washington, and DeepMind, GLUE is a collection of challenging NLP tasks that assess a model's ability to understand and reason with language. It provides a standardized platform for evaluating the performance of NLP models across various tasks, allowing researchers to compare different approaches and track progress in the field.

The Tasks of GLUE

GLUE consists of nine individual NLP tasks, each focusing on a specific aspect of language understanding. Following the original paper, these tasks fall into three areas:

  1. Single-Sentence Tasks:

    • CoLA (Corpus of Linguistic Acceptability): Determines whether a sentence is grammatically acceptable.
    • SST-2 (Stanford Sentiment Treebank): Assigns sentiment polarity (positive or negative) to sentences.
  2. Similarity and Paraphrase Tasks:

    • MRPC (Microsoft Research Paraphrase Corpus): Identifies whether one sentence is a paraphrase of another.
    • QQP (Quora Question Pairs): Determines whether two questions are semantically equivalent.
    • STS-B (Semantic Textual Similarity Benchmark): Measures the semantic similarity between two sentences on a continuous scale.
  3. Inference Tasks:

    • MNLI (Multi-Genre Natural Language Inference): Determines the relationship between a premise and a hypothesis (entailment, neutral, or contradiction).
    • QNLI (Question Natural Language Inference): Determines whether a sentence contains the answer to a given question.
    • RTE (Recognizing Textual Entailment): Similar to MNLI, determines entailment between a text and a hypothesis.
    • WNLI (Winograd Natural Language Inference): Leverages commonsense knowledge to resolve ambiguous pronoun references.
By encompassing a diverse range of tasks, GLUE provides a comprehensive evaluation of a model's overall NLP capabilities.
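
If you want to explore the benchmark data yourself, the GLUE tasks are hosted on the Hugging Face Hub. A minimal sketch, assuming the datasets package is installed (SST-2 is used here just as an example task):

Python
from datasets import load_dataset

# Load the SST-2 task of the GLUE benchmark from the Hugging Face Hub
sst2 = load_dataset("glue", "sst2")

# Each training example is a sentence with a binary sentiment label (0 = negative, 1 = positive)
print(sst2["train"][0])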

Why is GLUE Important?

GLUE has played a significant role in advancing NLP research in several ways:

  • Standardization: It offers a common ground for evaluating NLP models, facilitating comparisons between different approaches.
  • Progress Tracking: GLUE allows researchers to track the progress of the field by monitoring how models perform on the benchmark over time.
  • Identifying Weaknesses: By analyzing model performance on specific tasks, researchers can pinpoint areas where NLP models struggle and work towards improvements.
  • Benchmarking New Models: New NLP models can be readily evaluated using GLUE to assess their capabilities.

GLUE's impact extends beyond research. It also helps companies develop and deploy NLP-powered applications with a clearer understanding of model strengths and limitations.

Conclusion

GLUE serves as a cornerstone for evaluating and advancing the field of NLP. By providing a comprehensive benchmark, it fosters innovation and facilitates the development of more robust and versatile NLP models that can understand and interact with human language more effectively.

Reference: GLUE research paper (Wang et al., 2018)

Tags: Technology,Large Language Models,Natural Language Processing,

Sunday, April 7, 2024

What is zero shot? Single shot? Few Shots in terms of LLMs?

In the context of Language Model (LM) training and fine-tuning, terms like "zero-shot," "single-shot," and "few-shot" refer to different approaches for adapting a pre-trained LM to a specific task without extensive task-specific training data. These terms are commonly associated with transfer learning, where a model pre-trained on a large corpus of text is fine-tuned on a smaller dataset for a specific downstream task.

  1. Zero-shot Learning: In zero-shot learning, the model is directly applied to a task without any task-specific training examples. Instead, the model leverages its pre-trained knowledge to make predictions on the task. For example, if a pre-trained LM has been trained on a diverse range of text data, you could directly use it for tasks like text classification or text generation without fine-tuning on task-specific data.

  2. Single-shot Learning: Single-shot learning (more commonly called one-shot learning) involves providing only a single example or a very small amount of labeled data for the task. The model then learns from this limited supervision to make predictions on similar examples. This approach is useful when labeled data is scarce or expensive to obtain. The model may adapt its parameters slightly to accommodate the provided example during training.

  3. Few-shot Learning: Few-shot learning is similar to single-shot learning but involves providing a few examples (more than one) for the task. These examples are typically used to fine-tune the pre-trained model on the specific task. Few-shot learning enables the model to generalize better compared to single-shot learning as it has more training instances to learn from. Techniques like meta-learning or transfer learning from related tasks can enhance few-shot learning performance.

In summary, zero-shot learning does not involve any task-specific training examples, while single-shot learning involves a single example or very limited labeled data, and few-shot learning involves a small number of examples for the task. These approaches allow pre-trained LMs to adapt to various downstream tasks efficiently, leveraging their learned representations and capabilities.
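
In day-to-day LLM usage, these terms most often describe how many worked examples are placed directly in the prompt (in-context learning) rather than used for weight updates. Here is a minimal sketch of that reading; the prompts and labels are purely illustrative, and a small model like gpt2 will not follow them reliably, so treat this as the shape of the technique rather than a working classifier:

Python
from transformers import pipeline

# Any text-generation model can be used here; gpt2 is just small and easy to run
generator = pipeline("text-generation", model="gpt2")

# Zero-shot: only the task description, no worked examples
zero_shot = "Classify the sentiment of this review as positive or negative.\nReview: I loved it.\nSentiment:"

# Single-shot (one-shot): one worked example before the query
single_shot = (
    "Review: The food was cold.\nSentiment: negative\n"
    "Review: I loved it.\nSentiment:"
)

# Few-shot: several worked examples before the query
few_shot = (
    "Review: The food was cold.\nSentiment: negative\n"
    "Review: Great service and atmosphere.\nSentiment: positive\n"
    "Review: I loved it.\nSentiment:"
)

for prompt in (zero_shot, single_shot, few_shot):
    print(generator(prompt, max_new_tokens=2)[0]["generated_text"])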

Friday, April 5, 2024

Building Zero Shot Classifiers For Text Using Large Language Models

The software tools we need for the activity covered in this post are:

  1. Google Colab
  2. GitHub
  3. And last: ChatGPT

Why did I need these three tools?

I needed Google Colab to write code. Google Colab let me avoid creating a local environment and setting it up with the required packages, such as 'transformers' from Hugging Face.

I needed GitHub to host my code in a place that I could access from anywhere and also share with you (my readers).

I needed ChatGPT to get boilerplate code for our particular task. I learned about the prompts needed for this activity from the book by Sinan Ozdemir titled:

Quick Start Guide to Large Language Models: Strategies and Best Practices for Using ChatGPT and Other LLMs (Addison-Wesley Professional, 2023)


What you would need (I think) is a Google Colab session connected to my public repository hosted on GitHub.

How to connect Google Colab with GitHub?

1. When you open Google Colab, you get a screen as shown below.

Note: I am logged in to my Google account for this task.


2. Next, click on "File" at the top left.

Then click on "Open Notebook".

And select "GitHub" as shown below:


3. Fill in the username. It is: ashishjain1547

Once you fill in the username, the "Repository" dropdown auto-populates with the public repositories available for that user.

The repository to select is: "generative_ai_workspace_2024_04_05"

4. Once the repo is selected, its notebooks start appearing below:

Code for a zero-shot Spam vs. Not Spam classifier using Facebook's BART


from transformers import pipeline

def classify_text(email):
    """
    Use Facebook's BART model to classify an email into "spam" or "not spam"
    Args:
        email (str): The email to classify
    Returns:
        str: The classification of the email
    """
    # Build a zero-shot classifier backed by an NLI-finetuned BART checkpoint
    classifier = pipeline('zero-shot-classification', model='facebook/bart-large-mnli')
    labels = ['spam', 'not spam']
    # The hypothesis template turns each label into an NLI hypothesis, e.g. "This email is spam."
    hypothesis_template = 'This email is {}.'
    results = classifier(email, labels, hypothesis_template=hypothesis_template)
    # Labels come back sorted by score; the first is the best match
    return results['labels'][0]

 

Usage:
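
(The original post showed usage via a screenshot.) As a textual sketch, the function above could be called like this; the example emails are invented, and the expected labels are what the model should plausibly return, not captured output:

Python
# Hypothetical example inputs
print(classify_text("Congratulations! You won a $1000 gift card. Click here to claim your prize."))
# Expected: spam

print(classify_text("Hi team, please find attached the minutes from today's meeting."))
# Expected: not spam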


 

How did we use ChatGPT?

Prompt: Provide code to do zero-shot sentiment analysis using Google BERT.
 
Answer:

To perform zero-shot sentiment analysis using Google's BERT model, you can utilize the Hugging Face Transformers library along with the pre-trained BERT model. Here's how you can do it in Python:
 
from transformers import pipeline

# Load the zero-shot classification pipeline with BERT
classifier = pipeline("zero-shot-classification", model="bert-base-uncased")

# Define a list of texts for sentiment analysis
texts = [
    "I love this product, it's amazing!",
    "This movie was terrible, I hated it.",
    "The weather today is fantastic.",
    "I feel neutral about this situation."
]

# Define the possible labels/categories for sentiment
candidate_labels = ["positive", "negative", "neutral"]

# Perform zero-shot sentiment analysis
results = classifier(texts, candidate_labels)

# Print the results
for i, text in enumerate(texts):
    print(f"Text: {text}")
    print("Sentiment probabilities:")
    for label, score in zip(results[i]['labels'], results[i]['scores']):
        print(f"{label}: {score:.4f}")
    print()
 
 
Output

Text: I love this product, it's amazing!
Sentiment probabilities:
positive: 0.4718
negative: 0.2679
neutral: 0.2603

Text: This movie was terrible, I hated it.
Sentiment probabilities:
negative: 0.3644
positive: 0.3179
neutral: 0.3177

Text: The weather today is fantastic.
Sentiment probabilities:
positive: 0.4026
negative: 0.3039
neutral: 0.2935

Text: I feel neutral about this situation.
Sentiment probabilities:
neutral: 0.3587
negative: 0.3248
positive: 0.3166
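
A caveat worth adding to ChatGPT's answer: bert-base-uncased is a plain masked language model with no NLI (entailment) head, so the zero-shot pipeline attaches a newly initialized classification head (the library warns about untrained weights), which is why the scores above hover near uniform. Swapping in an NLI-finetuned checkpoint, such as the facebook/bart-large-mnli model used earlier in this post, gives much sharper separation:

Python
from transformers import pipeline

# An NLI-finetuned checkpoint makes zero-shot classification meaningful
classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
print(classifier("I love this product, it's amazing!", ["positive", "negative", "neutral"]))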


Additional Note

Question:

How does BERT know that the problem is sentiment analysis and not spam vs. not-spam?

Answer: 

In the code provided, the BERT model doesn't inherently "know" that the problem is specifically about sentiment analysis. Instead, it is designed for zero-shot classification, which means it can classify texts into predefined categories without being explicitly trained on those categories.

The key lies in the input data and the labels provided. In the example, we've defined a list of texts that contain statements or reviews, and we've also provided a list of candidate labels (positive, negative, neutral) that represent sentiment categories. When you pass these texts and labels to the BERT model through the zero-shot classification pipeline, it infers the sentiment of each text based on its understanding of the language and the provided labels.

If you were to perform a different classification task, such as spam vs. not spam, you would simply replace the list of candidate labels with the appropriate labels for that task (e.g., spam, not_spam). The model doesn't need to be told the nature of the classification problem explicitly, because the zero-shot pipeline frames classification as an entailment check between the input text and a hypothesis built from each candidate label at inference time.