survival8: Model Exploration on HuggingFace Platform for Automatic Speech Recognition

Wednesday, March 12, 2025

This model is too heavy for my setup.
# "distil-whisper/distil-small.en" Size: +310MB
I need a model that's smaller in size.

If you need a smaller speech-to-text model, you can try the following:

1. Use `openai/whisper-tiny` (Smallest Whisper Model)

Size: ~155MB
Pros: Smallest Whisper model, decent accuracy
Cons: Limited performance on noisy speech
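As a rough sanity check on the size figure above, a checkpoint's footprint can be estimated as parameter count times bytes per parameter. The sketch below assumes about 39 million parameters for `whisper-tiny` (the approximate figure quoted on the Whisper model card, not stated in this post) stored as 32-bit floats; `estimate_size_mb` is a hypothetical helper name, and the result ignores tokenizer and config files.

```python
def estimate_size_mb(num_params, bytes_per_param=4):
    """Approximate weight size in MB (10^6 bytes); 4 bytes per parameter means float32."""
    return num_params * bytes_per_param / 1_000_000


# Assumption: whisper-tiny has roughly 39 million parameters.
print(round(estimate_size_mb(39_000_000)))     # 156 (float32)
print(round(estimate_size_mb(39_000_000, 2)))  # 78 (float16)
```

The float32 estimate lands close to the ~155MB quoted above, and halving the precision roughly halves the footprint.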

Code to Load Locally

```python
from transformers import pipeline

# Define the local model directory (it must already contain the saved model;
# see "Download the Model Locally" below)
local_model_path = "./models/whisper-tiny"

# Load the smallest Whisper ASR model from that directory
asr = pipeline("automatic-speech-recognition", model=local_model_path)
```
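Once the pipeline loads, transcription is a single call. The following is a minimal sketch, assuming a mono 16-bit PCM WAV file as input; `load_wav_mono16` and `transcribe` are hypothetical helper names. It passes the audio as a dict with `raw` and `sampling_rate` keys, one of the input forms the ASR pipeline accepts, so the file is decoded with the standard library rather than ffmpeg. Whisper expects 16 kHz audio, so recording at that rate avoids any resampling step.

```python
import struct
import wave


def load_wav_mono16(path):
    """Read a mono 16-bit PCM WAV file; return (float samples in [-1, 1), sample rate)."""
    with wave.open(path, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("expected mono 16-bit PCM audio")
        rate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    # WAV stores little-endian signed 16-bit integers
    ints = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return [s / 32768.0 for s in ints], rate


def transcribe(path, model_path="./models/whisper-tiny"):
    """Transcribe a WAV file with the locally saved model (heavy imports are deferred)."""
    import numpy as np
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model=model_path)
    samples, rate = load_wav_mono16(path)
    audio = {"raw": np.asarray(samples, dtype=np.float32), "sampling_rate": rate}
    return asr(audio)["text"]
```

Calling `transcribe("sample.wav")` would then return the recognized text, where `sample.wav` stands in for any recording of your own.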

Download the Model Locally (If Not Already)

```python
from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor

model_id = "openai/whisper-tiny"
save_path = "./models/whisper-tiny"

# Download the model weights and save them to the local directory
model = AutoModelForSpeechSeq2Seq.from_pretrained(model_id)
model.save_pretrained(save_path)

# Download the processor (tokenizer + feature extractor) and save it alongside
processor = AutoProcessor.from_pretrained(model_id)
processor.save_pretrained(save_path)
```
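The download snippet above fetches the model every time it runs. To match the "if not already" intent, one option is to skip the download when the target directory already holds a saved model. This is a sketch under an assumption: it treats the presence of `config.json` (which `save_pretrained` writes) as evidence of a complete save, and `is_model_saved` and `ensure_local_model` are hypothetical helper names.

```python
from pathlib import Path


def is_model_saved(save_path):
    """Heuristic: a directory containing config.json is treated as a saved model."""
    return (Path(save_path) / "config.json").is_file()


def ensure_local_model(model_id="openai/whisper-tiny",
                       save_path="./models/whisper-tiny"):
    """Download and save the model and processor only if save_path lacks them."""
    if is_model_saved(save_path):
        return save_path  # already present, nothing to download

    # Deferred import so the check above works without loading transformers
    from transformers import AutoModelForSpeechSeq2Seq, AutoProcessor

    AutoModelForSpeechSeq2Seq.from_pretrained(model_id).save_pretrained(save_path)
    AutoProcessor.from_pretrained(model_id).save_pretrained(save_path)
    return save_path
```

The check does not detect a partially written directory, so deleting the folder and rerunning is the simplest recovery if a download was interrupted.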

2. Alternative: `facebook/wav2vec2-base-960h` (+350MB)

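Size: +350MB
Pros: Lighter than the larger Whisper checkpoints, works well on short speech
Cons: English only (no multilingual speech); at 350MB or more it is larger than both `whisper-tiny` and `distil-small.en`, so it does not meet the "smaller" requirement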

Code for `wav2vec2-base`

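```python
from transformers import pipeline

# Downloads the model from the Hub on first use, then reuses the local cache
asr = pipeline("automatic-speech-recognition", model="facebook/wav2vec2-base-960h")
```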
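To make the size comparison explicit, the three models mentioned in this post can be filtered against a budget. The sketch below uses only the approximate sizes quoted above (the "+" figures are treated as lower bounds), and `models_under` is a hypothetical helper name.

```python
# Approximate sizes in MB, as quoted in this post
MODEL_SIZES_MB = {
    "distil-whisper/distil-small.en": 310,
    "openai/whisper-tiny": 155,
    "facebook/wav2vec2-base-960h": 350,
}


def models_under(budget_mb, sizes=MODEL_SIZES_MB):
    """Return (name, size) pairs strictly below the budget, smallest first."""
    fits = [(name, mb) for name, mb in sizes.items() if mb < budget_mb]
    return sorted(fits, key=lambda item: item[1])


# Anything smaller than the model that was "too heavy"
print(models_under(310))  # [('openai/whisper-tiny', 155)]
```

With the original model's size as the budget, only `openai/whisper-tiny` qualifies.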

Would you like an even lighter model or real-time transcription support? 🚀
