Sunday, April 7, 2024

What is zero-shot? Single-shot? Few-shot in terms of LLMs?

In the context of large language models (LLMs), terms like "zero-shot," "one-shot" (sometimes called "single-shot"), and "few-shot" describe how many task-specific examples the model is given when it is asked to perform a new task. With modern LLMs these examples are usually supplied as demonstrations directly in the prompt (in-context learning), although the same vocabulary is also used when a pre-trained model is fine-tuned on a small labeled dataset for a downstream task.

  1. Zero-shot Learning: In zero-shot learning, the model is applied to a task without any task-specific examples at all. It receives only an instruction (and the input) and must rely entirely on knowledge acquired during pre-training. For example, a pre-trained LLM can be asked to classify the sentiment of a review or translate a sentence purely from the instruction, with no demonstrations and no fine-tuning on task-specific data.

  2. Single-shot (One-shot) Learning: In one-shot learning, the model is given exactly one worked example of the task. With LLMs this usually means including a single input-output demonstration in the prompt ahead of the actual query; the model's parameters are not updated, it simply conditions on the example. This approach is useful when labeled data is scarce or expensive to obtain, since one well-chosen example is often enough to show the model the expected format and style of answer.

  3. Few-shot Learning: Few-shot learning extends this to a handful of examples, typically anywhere from two to a few dozen, limited by the context window. The demonstrations are placed in the prompt before the query, which generally lets the model generalize better than one-shot prompting because it sees more of the task's input-output pattern. When a small labeled dataset is available, the same examples can instead be used to fine-tune the pre-trained model, and techniques such as meta-learning or transfer from related tasks can further improve few-shot performance. A minimal prompt sketch illustrating all three styles appears after this list.
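
To make the distinction concrete, here is a minimal Python sketch of the three prompt styles for a sentiment-classification task. The complete() helper is a hypothetical placeholder, not a real library call; in practice you would replace its body with a request to whichever LLM API or local model you use. The only thing that differs across the three prompts is how many worked examples they contain.

```python
# Minimal sketch of zero-, one-, and few-shot prompting for a sentiment task.
# `complete(prompt)` is a hypothetical placeholder for whatever LLM client you
# use (hosted API, local model, etc.); here it just prints the prompt.

def complete(prompt: str) -> str:
    """Placeholder: swap in a real LLM call here."""
    print(prompt)
    print("---")
    return ""

# Zero-shot: instruction only, no examples.
zero_shot = (
    "Classify the sentiment of the following review as Positive or Negative.\n"
    "Review: The battery died after two days.\n"
    "Sentiment:"
)

# One-shot (single-shot): one worked example before the query.
one_shot = (
    "Classify the sentiment of each review as Positive or Negative.\n"
    "Review: I love this phone, the screen is gorgeous.\n"
    "Sentiment: Positive\n"
    "Review: The battery died after two days.\n"
    "Sentiment:"
)

# Few-shot: several examples covering both labels before the query.
examples = [
    ("I love this phone, the screen is gorgeous.", "Positive"),
    ("Shipping took a month and the box was crushed.", "Negative"),
    ("Works exactly as advertised, would buy again.", "Positive"),
]
few_shot = "Classify the sentiment of each review as Positive or Negative.\n"
for text, label in examples:
    few_shot += f"Review: {text}\nSentiment: {label}\n"
few_shot += "Review: The battery died after two days.\nSentiment:"

for prompt in (zero_shot, one_shot, few_shot):
    complete(prompt)
```

Note that nothing here updates the model's weights; the demonstrations live entirely in the prompt text.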

In summary, zero-shot prompting provides no task-specific examples, one-shot (single-shot) provides exactly one, and few-shot provides a small number of them, usually as demonstrations inside the prompt. In all three cases the pre-trained LLM's weights stay fixed; what changes is how much of the task it is shown before answering, which lets it adapt to many downstream tasks efficiently using the representations and capabilities it learned during pre-training.
