Note: We are using this model:
https://huggingface.co/openai/whisper-tiny/tree/main
$ pwd
/home/ashish/Desktop/Using OpenAI-Whisper-Tiny via HuggingFace for Automatic Speech Recognition app (Research)/models--openai--whisper-tiny
$ ls -lR
...
...
...
./snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af:
total 4
lrwxrwxrwx 1 ashish ashish 52 Mar 8 22:46 added_tokens.json -> ../../blobs/e3d256c988462aa153dcabe2aa38b8e9b436c06f
lrwxrwxrwx 1 ashish ashish 52 Mar 8 22:46 config.json -> ../../blobs/417aa9de49a132dd3eb6a56d3be2718b15f08917
lrwxrwxrwx 1 ashish ashish 52 Mar 8 22:46 generation_config.json -> ../../blobs/4b26dd66b8f7bca37d851d259fdc118315cacc62
lrwxrwxrwx 1 ashish ashish 52 Mar 8 22:46 merges.txt -> ../../blobs/6038932a2a1f09a66991b1c2adae0d14066fa29e
lrwxrwxrwx 1 ashish ashish 76 Mar 8 22:46 model.safetensors -> ../../blobs/7ebd0e69e78190ffe1438491fa05cc1f5c1aa3a4c4db3bc1723adbb551ea2395
lrwxrwxrwx 1 ashish ashish 52 Mar 8 22:46 normalizer.json -> ../../blobs/dd6ae819ad738ac1a546e9f9282ef325c33b9ea0
lrwxrwxrwx 1 ashish ashish 52 Mar 8 22:46 preprocessor_config.json -> ../../blobs/c2048dfa9fd94a052e62e908d2c4dfb18534b4d2
lrwxrwxrwx 1 ashish ashish 52 Mar 8 22:46 special_tokens_map.json -> ../../blobs/bf69932dca4b3719b59fdd8f6cc1978109509f6c
lrwxrwxrwx 1 ashish ashish 52 Mar 8 22:46 tokenizer_config.json -> ../../blobs/d13b786c04765fb1a06492b53587752cd67665ea
lrwxrwxrwx 1 ashish ashish 52 Mar 8 22:46 tokenizer.json -> ../../blobs/1e95340ff836fad1b5932e800fb7b8c5e6d78a74
lrwxrwxrwx 1 ashish ashish 52 Mar 8 22:46 vocab.json -> ../../blobs/d7016e21da8776c8a9d577d0f559600f09a240eb
I am getting this error with my local setup:
OSError: ./models--openai--whisper-tiny does not appear to have a file named preprocessor_config.json. Checkout 'https://huggingface.co/./models--openai-
But this file is present in my model_dir, as I can verify using "ls -lR":
lrwxrwxrwx 1 ashish ashish 52 Mar 8 22:46 preprocessor_config.json -> ../../blobs/c2048dfa9fd94a052e62e908d2c4dfb18534b4d2
This error usually happens when the model directory contains symbolic links (symlinks) that point to files (like preprocessor_config.json) outside the current folder, and the Transformers library can’t resolve them. Even though your ls output shows the symlink, if the target file (in this case, the blob file) isn’t actually available in the expected relative location, you’ll get an error.
Here are some steps to resolve it:
Check the Symlink Target:
Run:
```bash
readlink preprocessor_config.json
```
Verify that the target file (e.g. ../../blobs/c2048dfa9fd94a052e62e908d2c4dfb18534b4d2) exists relative to your model directory. If it doesn’t, the symlink is broken.
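Rather than checking one symlink at a time, all broken symlinks under the model directory can be listed in a single pass (assuming GNU find, where `-xtype l` matches symlinks whose target does not exist):

```shell
#!/bin/sh
# List every broken symlink under the model directory.
# -type l matches all symlinks; -xtype l matches only those whose
# target cannot be resolved (i.e. broken links).
find ./models--openai--whisper-tiny -xtype l
```

An empty output means every symlink resolves; any path printed is a link that needs fixing.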
Properly Download the Repository:
The recommended way to obtain the model files is to clone the repository with Git LFS, e.g. `git lfs install` followed by `git clone https://huggingface.co/openai/whisper-tiny`. This ensures that all the blob files are downloaded and that the symlinks point to real files.
Manually Fix the Symlink:
If you downloaded the files manually and have the blob file somewhere else, copy it into the expected relative path so the symlink resolves. Alternatively, replace each symlink with the actual file it points to (see "Manually Fixing the Symlink" below).
Use the Hugging Face Hub API:
Instead of manually handling files, you can let the Transformers library handle the download and caching by specifying the model repository name. For example:
```python
from transformers import WhisperProcessor, WhisperForConditionalGeneration

model_dir = "openai/whisper-tiny"  # This will download the files and cache them properly
processor = WhisperProcessor.from_pretrained(model_dir)
model = WhisperForConditionalGeneration.from_pretrained(model_dir)
```
This avoids the need to manage symlinks manually.
By ensuring that the symlinks are valid (either by using Git LFS to clone the repository or by copying the actual files in place), the Transformers library should be able to load the preprocessor configuration without error.
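If you do keep the manually downloaded cache layout, note that the path handed to from_pretrained must be the snapshots/&lt;revision&gt; directory, not the models--openai--whisper-tiny root. A stdlib-only sketch to resolve that path (the helper name find_snapshot_dir is my own):

```python
import glob
import os

def find_snapshot_dir(cache_root: str) -> str:
    """Return the first snapshots/<revision> directory under a
    models--org--name cache folder."""
    candidates = sorted(glob.glob(os.path.join(cache_root, "snapshots", "*")))
    dirs = [d for d in candidates if os.path.isdir(d)]
    if not dirs:
        raise FileNotFoundError(f"no snapshot directory under {cache_root!r}")
    return dirs[0]

# model_dir = find_snapshot_dir("./models--openai--whisper-tiny")
# processor = WhisperProcessor.from_pretrained(model_dir)
```

With a valid snapshot path (and intact symlinks), preprocessor_config.json sits exactly where Transformers looks for it.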
Let me know if you need more help!
If you don't fix the symlinks manually, you will get the below error:
04:04 ~ $ python app_pt.py
2025-03-09 04:06:00.012059: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib/python3.10/site-packages/cv2/../../lib64:
2025-03-09 04:06:00.012096: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
File "/home/keshavpawar137/app_pt.py", line 13, in <module>
processor = WhisperProcessor.from_pretrained(model_dir)
File "/home/keshavpawar137/.local/lib/python3.10/site-packages/transformers/processing_utils.py", line 1070, in from_pretrained
args = cls._get_arguments_from_pretrained(pretrained_model_name_or_path, **kwargs)
File "/home/keshavpawar137/.local/lib/python3.10/site-packages/transformers/processing_utils.py", line 1116, in _get_arguments_from_pretrained
args.append(attribute_class.from_pretrained(pretrained_model_name_or_path, **kwargs))
File "/home/keshavpawar137/.local/lib/python3.10/site-packages/transformers/feature_extraction_utils.py", line 385, in from_pretrained
feature_extractor_dict, kwargs = cls.get_feature_extractor_dict(pretrained_model_name_or_path, **kwargs)
File "/home/keshavpawar137/.local/lib/python3.10/site-packages/transformers/feature_extraction_utils.py", line 511, in get_feature_extractor_dict
resolved_feature_extractor_file = cached_file(
File "/home/keshavpawar137/.local/lib/python3.10/site-packages/transformers/utils/hub.py", line 313, in cached_file
raise EnvironmentError(
OSError: ./models--openai--whisper-tiny does not appear to have a file named preprocessor_config.json. Checkout 'https://huggingface.co/./models--openai--whisper-tiny/tree/main' for available files.
Manually Fixing the Symlink
$ pwd
/home/ashish/Desktop/Using OpenAI-Whisper-Tiny via HuggingFace for Automatic Speech Recognition app (Research)/models--openai--whisper-tiny
cp ./blobs/e3d256c988462aa153dcabe2aa38b8e9b436c06f added_tokens.json
cp ./blobs/417aa9de49a132dd3eb6a56d3be2718b15f08917 config.json
cp ./blobs/4b26dd66b8f7bca37d851d259fdc118315cacc62 generation_config.json
cp ./blobs/6038932a2a1f09a66991b1c2adae0d14066fa29e merges.txt
cp ./blobs/7ebd0e69e78190ffe1438491fa05cc1f5c1aa3a4c4db3bc1723adbb551ea2395 model.safetensors
cp ./blobs/dd6ae819ad738ac1a546e9f9282ef325c33b9ea0 normalizer.json
cp ./blobs/c2048dfa9fd94a052e62e908d2c4dfb18534b4d2 preprocessor_config.json
cp ./blobs/bf69932dca4b3719b59fdd8f6cc1978109509f6c special_tokens_map.json
cp ./blobs/d13b786c04765fb1a06492b53587752cd67665ea tokenizer_config.json
cp ./blobs/1e95340ff836fad1b5932e800fb7b8c5e6d78a74 tokenizer.json
cp ./blobs/d7016e21da8776c8a9d577d0f559600f09a240eb vocab.json
~~~
05:34 ~/mysite $ ls
__pycache__ flask_app.py models--openai--whisper-tiny
https://www.pythonanywhere.com/user/keshavpawar137/files/var/log/keshavpawar137.pythonanywhere.com.error.log
2025-03-09 05:31:23,560: OSError: Incorrect path_or_model_id: './models--openai--whisper-tiny'. Please provide either the path to a local folder or the repo_id of a model on the Hub.
2025-03-09 05:31:23,560: File "/var/www/keshavpawar137_pythonanywhere_com_wsgi.py", line 16, in <module>
2025-03-09 05:31:23,560: from flask_app import app as application # noqa
2025-03-09 05:31:23,560:
2025-03-09 05:31:23,560: File "/home/keshavpawar137/mysite/flask_app.py", line 23, in <module>
2025-03-09 05:31:23,561: processor = WhisperProcessor.from_pretrained(model_dir)
Where are we?
04:40 ~ $ pwd
/home/keshavpawar137
04:41 ~ $ ls
README.txt app_pt.py models--openai--whisper-tiny.zip
04:43 ~ $ ls -a
. .bashrc .gitconfig .keras .my.cnf .pythonstartup.py .virtualenvs app_pt.py
.. .cache .ipython .local .profile .vimrc README.txt models--openai--whisper-tiny.zip
04:43 ~ $ ls -lh
total 86M
-rwxr-xr-x 1 keshavpawar137 registered_users 232 Mar 5 09:53 README.txt
-rw-rw-r-- 1 keshavpawar137 registered_users 1.6K Mar 9 04:04 app_pt.py
-rw-rw-r-- 1 keshavpawar137 registered_users 86M Mar 9 04:42 models--openai--whisper-tiny.zip
04:43 ~ $ du -sh .
298M
04:44 ~ $ rm -rf ~/.cache/pip
04:44 ~ $ find . -type d -name "__pycache__" -exec rm -rf {} +
04:45 ~ $ du -sh .
159M
Can I delete ~/.cache/pip?
Yes, you can delete the ~/.cache/pip directory. This directory is used by pip to cache packages and metadata to speed up subsequent installations. Deleting it will not harm your system, but it will mean that pip will need to re-download any packages you install in the future, which might take a bit longer.
To delete the directory, you can use the following command in your terminal:
```sh
rm -rf ~/.cache/pip
```
Make sure to double-check the command to avoid accidentally deleting the wrong files or directories.
... ... ... ... ... ...
Delete __pycache__ from all subdirectories
To delete all __pycache__ directories (including all their contents) from the current directory and its subdirectories, use the following find command:
```bash
find . -type d -name "__pycache__" -exec rm -rf {} +
```
Explanation:
.: Start searching from the current directory.
-type d: Search for directories only.
-name "__pycache__": Match directories named __pycache__.
-exec rm -rf {} +: Executes the rm -rf command on each found directory ({} represents the directory name). The + batches the deletions into fewer rm calls, making it more efficient.
Important: This command permanently deletes the __pycache__ directories, so double-check you’re in the correct directory before running it.
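To preview what would be deleted before running the destructive version, swap `-exec rm -rf {} +` for `-print`:

```shell
#!/bin/sh
# Dry run: only list the __pycache__ directories, delete nothing.
find . -type d -name "__pycache__" -print
```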
Issue of: "Address already in use"
* Serving Flask app '__main__' (lazy loading)
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: on
Address already in use
Port 5000 is in use by another program. Either identify and stop that program, or start the server with a different port.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/werkzeug/serving.py", line 908, in prepare_socket
s.bind(server_address)
OSError: [Errno 98] Address already in use
FIX:

```python
if __name__ == '__main__':
    app.run(debug=True, port=5002, use_reloader=False)
```
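Hardcoding another port can hit the same error again later. One stdlib-only sketch (the helper name free_port is mine) is to ask the OS for an unused port by binding to port 0 first:

```python
import socket

def free_port() -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

# app.run(debug=True, port=free_port(), use_reloader=False)
```

There is a small race window between closing the probe socket and Flask binding the port, but in practice this avoids hardcoded-port collisions during development.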
Project Setup For ASR
04:47 ~ $ python app_pt.py
2025-03-09 04:49:18.556050: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib/python3.10/site-packages/cv2/../../lib64:
2025-03-09 04:49:18.556080: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
* Serving Flask app 'app_pt' (lazy loading)
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: on
Address already in use
Port 5000 is in use by another program. Either identify and stop that program, or start the server with a different port.
05:08 ~ $ tail -3 app_pt.py
if __name__ == "__main__":
    # Run the server on port 5000 and listen on all interfaces.
    app.run(host="0.0.0.0", port=5001, debug=False)
04:59 ~ $ python app_pt.py
2025-03-09 05:01:13.156268: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib/python3.10/site-packages/cv2/../../lib64:
2025-03-09 05:01:13.156294: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
* Serving Flask app 'app_pt' (lazy loading)
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: off
Address already in use
Port 5001 is in use by another program. Either identify and stop that program, or start the server with a different port.
05:03 ~ $
05:09 ~ $ tail -3 app_pt.py
if __name__ == "__main__":
    # Run the server on port 5000 and listen on all interfaces.
    app.run(host="0.0.0.0", port=5001, debug=False, use_reloader=False)
05:09 ~ $ python app_pt.py
2025-03-09 05:11:43.255847: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib/python3.10/site-packages/cv2/../../lib64:
2025-03-09 05:11:43.255871: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
* Serving Flask app 'app_pt' (lazy loading)
* Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
* Debug mode: off
Address already in use
Port 5001 is in use by another program. Either identify and stop that program, or start the server with a different port.
05:13 ~
The way to resolve this on PythonAnywhere is not to run the Flask app as a plain Python script. Instead, use PythonAnywhere's 'Web' app configuration, as shown below:
The screen below is not the actual one; it is for demonstration purposes only...
Step 2: I want to check if this package is already installed: ffmpeg
You can check whether ffmpeg is installed on your system by running a command like ffmpeg -version in your terminal or by using a small Python script to do the same.
Method 1: Using the Terminal
Open your terminal (or Command Prompt on Windows) and run:
```sh
ffmpeg -version
```
If installed: You’ll see version information and details about the ffmpeg build.
If not installed: You might see an error message like command not found (Linux/Mac) or a similar message on Windows.
Method 2: Using Python
You can use Python’s subprocess module to check if ffmpeg is available:
```python
import subprocess

def check_ffmpeg_installed():
    try:
        result = subprocess.run(
            ["ffmpeg", "-version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            text=True
        )
        if result.returncode == 0:
            print("ffmpeg is installed!")
            print(result.stdout)
        else:
            print("ffmpeg appears to be installed but returned an error.")
    except FileNotFoundError:
        print("ffmpeg is not installed or not found in the system PATH.")

check_ffmpeg_installed()
```
Explanation:
The script tries to run ffmpeg -version and captures its output.
If ffmpeg is not found, a FileNotFoundError is raised.
If the command runs successfully (return code 0), it prints the version info.
Choose the method that best fits your needs. Would you like more details on using or installing ffmpeg?
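A third, shorter option (not covered above) is `shutil.which` from the standard library, which searches PATH without spawning a process; the helper name tool_available is my own:

```python
import shutil

def tool_available(name: str) -> bool:
    """True if `name` is an executable found on the system PATH."""
    return shutil.which(name) is not None

print("ffmpeg is installed!" if tool_available("ffmpeg")
      else "ffmpeg is not installed or not found in the system PATH.")
```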
12:58 ~ $ pip show flask
Name: Flask
Version: 2.1.2
Summary: A simple framework for building complex web applications.
Home-page: https://palletsprojects.com/p/flask
Author: Armin Ronacher
Author-email: armin.ronacher@active-4.com
License: BSD-3-Clause
Location: /usr/local/lib/python3.10/site-packages
Requires: click, itsdangerous, Jinja2, Werkzeug
Required-by: dash, Flask-Admin, Flask-Babel, Flask-Bcrypt, Flask-Bootstrap, Flask-Compress, Flask-HTTPAuth, Flask-Login, Flask-RESTful, Flask-SQLAlchemy, Flask-SSLify, Flask-WTF
Step 4: What is the difference between "whisper" and "openai-whisper", and what are their uses?
Both names generally refer to the same underlying technology—a state‐of‐the‐art automatic speech recognition (ASR) model developed by OpenAI—but they’re used in slightly different contexts:
OpenAI Whisper
Official Implementation:
OpenAI Whisper is the original model released by OpenAI. It’s available on GitHub (in the openai/whisper repository) and comes in several sizes (tiny, base, small, medium, large) to balance accuracy and speed.
Primary Use Cases:
High-Quality Transcription: Converts spoken language into text.
Multilingual Recognition: Supports multiple languages and can also identify the language spoken.
Subtitling and Captioning: Can be used to generate subtitles or closed captions.
Voice-Activated Applications: Powers various voice assistant and dictation systems.
How to Use:
You typically install it via pip as the package openai-whisper:
```bash
pip install openai-whisper
```
Then you can load and run the model directly in Python using its provided API.
"whisper" or "openai-whisper" (as a Package)
Naming Convention:
When you see references to “whisper” or “openai-whisper” (especially on PyPI), they are referring to the official package that implements OpenAI’s Whisper model.
The PyPI package is named openai-whisper, which is simply the packaged form of the original OpenAI Whisper implementation.
Usage Context:
If you’re building an application (for example, a Flask endpoint or an offline speech-to-text app), you would install the package using the PyPI name and then use the provided API to load one of the available model sizes and perform transcription.
The package supports both CPU and GPU inference, making it adaptable for different deployment scenarios—from high-end servers to on-device applications after appropriate optimizations (such as quantization).
In Summary
They’re Essentially the Same:
“OpenAI Whisper” is the name of the model and the technology, while openai-whisper is the name you’ll use to install and work with the official implementation in Python.
Their Uses Are Aligned:
Both are designed for accurate and multilingual speech recognition, suitable for applications like transcription, voice assistants, subtitles, and more.
If you plan to build applications that leverage this technology (whether via a Flask endpoint, desktop app, or mobile integration), you’ll be using the official OpenAI Whisper model via the openai-whisper package.
Step 5: Setup "whisper"
13:17 ~ $ cat test_whisper.py
import whisper
model = whisper.load_model("small")
print(model)
13:17 ~ $ python test_whisper.py
Traceback (most recent call last):
File "/home/keshavpawar137/test_whisper.py", line 1, in <module>
import whisper
ModuleNotFoundError: No module named 'whisper'
13:17 ~ $ pip install whisper
Defaulting to user installation because normal site-packages is not writeable
Looking in links: /usr/share/pip-wheels
Collecting whisper
Downloading whisper-1.1.10.tar.gz (42 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 42.8/42.8 kB 618.8 kB/s eta 0:00:00
Preparing metadata (setup.py) ... done
Requirement already satisfied: six in /usr/local/lib/python3.10/site-packages (from whisper) (1.16.0)
Building wheels for collected packages: whisper
Building wheel for whisper (setup.py) ... done
Created wheel for whisper: filename=whisper-1.1.10-py3-none-any.whl size=41138 sha256=f7c359cb3ced55eba4775c1d331167df4286838de0288cf0bce194d38bc48da1
Stored in directory: /home/keshavpawar137/.cache/pip/wheels/aa/7c/1d/015619716e2facae6631312503baf3c3220e6a9a3508cb14b6
Successfully built whisper
Installing collected packages: whisper
Successfully installed whisper-1.1.10
Get the current disk usage of my account on PythonAnywhere
13:21 ~ $ pwd
/home/keshavpawar137
13:19 ~ $ du -sh .
668K
Step 6: Uninstall "whisper" (if you installed it) and install "openai-whisper"
With "whisper":
13:22 ~ $ python test_whisper.py
Traceback (most recent call last):
File "/home/keshavpawar137/test_whisper.py", line 2, in <module>
model = whisper.load_model("small")
AttributeError: module 'whisper' has no attribute 'load_model'
This error usually indicates that Python isn’t loading the expected Whisper module. Here are a few common reasons and solutions:
1. Naming Conflicts
Local File Name Conflict:
If you have a file named whisper.py in your working directory, Python will import that instead of the official package. Solution: Rename your file (and remove any associated whisper.pyc) to avoid conflicts.
Module Shadowing:
Make sure there’s no directory or module named “whisper” in your project that could be interfering with the import.
2. Incorrect Package Installation
Verify the Correct Package:
The official package is called openai-whisper. Solution:
```bash
pip uninstall whisper
pip install openai-whisper
```
Then, ensure you’re importing it as:
```python
import whisper

model = whisper.load_model("small")
```
3. Check Module Contents
You can verify which attributes the module provides with:
```python
import whisper

print(dir(whisper))
```
If you don’t see load_model in the output, it means the wrong module is being loaded.
By ensuring you’re using the official openai-whisper package and avoiding naming conflicts, the load_model function should be available.
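A quick way to confirm which file Python actually imported (and so catch shadowing) is to inspect the module's `__file__` attribute; for the official package it should point into site-packages/whisper/, not at a local whisper.py. A small sketch (the helper name module_path is mine):

```python
import importlib

def module_path(name: str) -> str:
    """Return the filesystem path of the module Python imports for `name`."""
    return importlib.import_module(name).__file__

# print(module_path("whisper"))  # a site-packages/whisper/__init__.py path is the good sign
```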
13:22 ~ $ pip uninstall whisper
Found existing installation: whisper 1.1.10
Uninstalling whisper-1.1.10:
Would remove:
/home/keshavpawar137/.local/bin/find-corrupt-whisper-files.py
/home/keshavpawar137/.local/bin/rrd2whisper.py
/home/keshavpawar137/.local/bin/update-storage-times.py
/home/keshavpawar137/.local/bin/whisper-auto-resize.py
/home/keshavpawar137/.local/bin/whisper-auto-update.py
/home/keshavpawar137/.local/bin/whisper-create.py
/home/keshavpawar137/.local/bin/whisper-diff.py
/home/keshavpawar137/.local/bin/whisper-dump.py
/home/keshavpawar137/.local/bin/whisper-fetch.py
/home/keshavpawar137/.local/bin/whisper-fill.py
/home/keshavpawar137/.local/bin/whisper-info.py
/home/keshavpawar137/.local/bin/whisper-merge.py
/home/keshavpawar137/.local/bin/whisper-resize.py
/home/keshavpawar137/.local/bin/whisper-set-aggregation-method.py
/home/keshavpawar137/.local/bin/whisper-set-xfilesfactor.py
/home/keshavpawar137/.local/bin/whisper-update.py
/home/keshavpawar137/.local/lib/python3.10/site-packages/whisper-1.1.10.dist-info/*
/home/keshavpawar137/.local/lib/python3.10/site-packages/whisper.py
Proceed (Y/n)? y
Successfully uninstalled whisper-1.1.10
13:28 ~ $ pip install openai-whisper
Defaulting to user installation because normal site-packages is not writeable
Looking in links: /usr/share/pip-wheels
Collecting openai-whisper
Downloading openai-whisper-20240930.tar.gz (800 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 800.5/800.5 kB 14.1 MB/s eta 0:00:00
Installing build dependencies ... done
Getting requirements to build wheel ... done
Preparing metadata (pyproject.toml) ... done
Requirement already satisfied: numpy in /usr/local/lib/python3.10/site-packages (from openai-whisper) (1.21.6)
Collecting triton>=2.0.0
Downloading triton-3.2.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (253.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 253.1/253.1 MB 5.0 MB/s eta 0:00:00
ERROR: Could not install packages due to an OSError: [Errno 122] Disk quota exceeded
13:29 ~ $ du -sh .
83M .
13:30 ~ $
Requirements:

```bash
pip install flask openai-whisper soundfile
```

Related conda packages:
- conda-forge / flask 3.1.0: A simple framework for building complex web applications.
- piiq / openai-whisper 20230308: Robust Speech Recognition via Large-Scale Weak Supervision (conda, osx-arm64)
- Sheepless / openai-whisper 20231117: Robust Speech Recognition via Large-Scale Weak Supervision (conda, noarch)
Moving away from "openai-whisper" and using Whisper-Tiny via HuggingFace:
Using OpenAI-Whisper-Tiny via HuggingFace for Automatic Speech Recognition app (Research)
16:34 ~ $ pip install transformers
Defaulting to user installation because normal site-packages is not writeable
Looking in links: /usr/share/pip-wheels
Collecting transformers
Downloading transformers-4.49.0-py3-none-any.whl (10.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.0/10.0 MB 49.6 MB/s eta 0:00:00
Requirement already satisfied: regex!=2019.12.17 in /usr/local/lib/python3.10/site-packages (from transformers) (2021.11.10)
Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/site-packages (from transformers) (1.21.6)
Collecting tokenizers<0.22,>=0.21
Downloading tokenizers-0.21.0-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.0/3.0 MB 42.7 MB/s eta 0:00:00
Collecting safetensors>=0.4.1
Downloading safetensors-0.5.3-cp38-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (471 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 471.6/471.6 kB 8.0 MB/s eta 0:00:00
Requirement already satisfied: requests in /usr/local/lib/python3.10/site-packages (from transformers) (2.28.1)
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/site-packages (from transformers) (6.0)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.10/site-packages (from transformers) (21.3)
Requirement already satisfied: tqdm>=4.27 in /usr/local/lib/python3.10/site-packages (from transformers) (4.62.3)
Collecting huggingface-hub<1.0,>=0.26.0
Downloading huggingface_hub-0.29.2-py3-none-any.whl (468 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 468.1/468.1 kB 10.7 MB/s eta 0:00:00
Requirement already satisfied: filelock in /usr/local/lib/python3.10/site-packages (from transformers) (3.4.2)
Collecting fsspec>=2023.5.0
Downloading fsspec-2025.3.0-py3-none-any.whl (193 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 193.6/193.6 kB 4.3 MB/s eta 0:00:00
Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.10/site-packages (from huggingface-hub<1.0,>=0.26.0->transformers) (3.10.0.2)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/local/lib/python3.10/site-packages (from packaging>=20.0->transformers) (2.4.7)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/site-packages (from requests->transformers) (3.3)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.10/site-packages (from requests->transformers) (1.26.9)
Requirement already satisfied: charset-normalizer<3,>=2 in /usr/local/lib/python3.10/site-packages (from requests->transformers) (2.1.0)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/site-packages (from requests->transformers) (2022.6.15)
Installing collected packages: safetensors, fsspec, huggingface-hub, tokenizers, transformers
Successfully installed fsspec-2025.3.0 huggingface-hub-0.29.2 safetensors-0.5.3 tokenizers-0.21.0 transformers-4.49.0
We are going to use TensorFlow as the backend for the HuggingFace implementation, since it comes preinstalled on the PythonAnywhere cloud.
16:39 ~ $ pip show tensorflow
Name: tensorflow
Version: 2.9.0
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: packages@tensorflow.org
License: Apache 2.0
Location: /usr/local/lib/python3.10/site-packages
Requires: absl-py, astunparse, flatbuffers, gast, google-pasta, grpcio, h5py, keras, keras-preprocessing, libclang, numpy, opt-einsum, packaging, protobuf, setuptools, six, tensorboard, tensorflow-estimator, tensorflow-io-gcs-filesystem, termcolor, typing-extensions, wrapt
Required-by:
16:43 ~ $ pip install librosa
Defaulting to user installation because normal site-packages is not writeable
Looking in links: /usr/share/pip-wheels
Collecting librosa
Downloading librosa-0.10.2.post1-py3-none-any.whl (260 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 260.1/260.1 kB 6.5 MB/s eta 0:00:00
Requirement already satisfied: scikit-learn>=0.20.0 in /usr/local/lib/python3.10/site-packages (from librosa) (1.0.2)
Collecting audioread>=2.1.9
Downloading audioread-3.0.1-py3-none-any.whl (23 kB)
Collecting pooch>=1.1
Downloading pooch-1.8.2-py3-none-any.whl (64 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 64.6/64.6 kB 1.3 MB/s eta 0:00:00
Collecting typing-extensions>=4.1.1
Downloading typing_extensions-4.12.2-py3-none-any.whl (37 kB)
Requirement already satisfied: numpy!=1.22.0,!=1.22.1,!=1.22.2,>=1.20.3 in /usr/local/lib/python3.10/site-packages (from librosa) (1.21.6)
Collecting msgpack>=1.0
Downloading msgpack-1.1.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (378 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 378.0/378.0 kB 9.6 MB/s eta 0:00:00
Requirement already satisfied: numba>=0.51.0 in /usr/local/lib/python3.10/site-packages (from librosa) (0.55.1)
Requirement already satisfied: decorator>=4.3.0 in /usr/local/lib/python3.10/site-packages (from librosa) (5.1.1)
Collecting lazy-loader>=0.1
Downloading lazy_loader-0.4-py3-none-any.whl (12 kB)
Collecting soundfile>=0.12.1
Downloading soundfile-0.13.1-py2.py3-none-manylinux_2_28_x86_64.whl (1.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 25.1 MB/s eta 0:00:00
Collecting soxr>=0.3.2
Downloading soxr-0.5.0.post1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (252 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 252.8/252.8 kB 4.7 MB/s eta 0:00:00
Requirement already satisfied: joblib>=0.14 in /usr/local/lib/python3.10/site-packages (from librosa) (1.1.0)
Requirement already satisfied: scipy>=1.2.0 in /usr/local/lib/python3.10/site-packages (from librosa) (1.7.3)
Requirement already satisfied: packaging in /usr/local/lib/python3.10/site-packages (from lazy-loader>=0.1->librosa) (21.3)
Requirement already satisfied: llvmlite<0.39,>=0.38.0rc1 in /usr/local/lib/python3.10/site-packages (from numba>=0.51.0->librosa) (0.38.0)
Requirement already satisfied: setuptools in /usr/local/lib/python3.10/site-packages (from numba>=0.51.0->librosa) (60.2.0)
Requirement already satisfied: platformdirs>=2.5.0 in /usr/local/lib/python3.10/site-packages (from pooch>=1.1->librosa) (2.5.2)
Requirement already satisfied: requests>=2.19.0 in /usr/local/lib/python3.10/site-packages (from pooch>=1.1->librosa) (2.28.1)
Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.10/site-packages (from scikit-learn>=0.20.0->librosa) (3.0.0)
Requirement already satisfied: cffi>=1.0 in /usr/local/lib/python3.10/site-packages (from soundfile>=0.12.1->librosa) (1.15.1)
Requirement already satisfied: pycparser in /usr/local/lib/python3.10/site-packages (from cffi>=1.0->soundfile>=0.12.1->librosa) (2.21)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/local/lib/python3.10/site-packages (from packaging->lazy-loader>=0.1->librosa) (2.4.7)
Requirement already satisfied: charset-normalizer<3,>=2 in /usr/local/lib/python3.10/site-packages (from requests>=2.19.0->pooch>=1.1->librosa) (2.1.0)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.10/site-packages (from requests>=2.19.0->pooch>=1.1->librosa) (1.26.9)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/site-packages (from requests>=2.19.0->pooch>=1.1->librosa) (2022.6.15)
Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/site-packages (from requests>=2.19.0->pooch>=1.1->librosa) (3.3)
Installing collected packages: typing-extensions, soxr, msgpack, audioread, soundfile, pooch, lazy-loader, librosa
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
arviz 0.11.4 requires typing-extensions<4,>=3.7.4.3, but you have typing-extensions 4.12.2 which is incompatible.
Successfully installed audioread-3.0.1 lazy-loader-0.4 librosa-0.10.2.post1 msgpack-1.1.0 pooch-1.8.2 soundfile-0.13.1 soxr-0.5.0.post1 typing-extensions-4.12.2
16:44 ~ $ du -sh .
209M .
16:44 ~ $
(hf_202412) ashish@ashish-ThinkPad-T440s:~/Desktop/Using OpenAI-Whisper-Tiny via HuggingFace for Automatic Speech Recognition app (Research)$ conda list librosa
# packages in environment at /home/ashish/anaconda3/envs/hf_202412:
#
# Name Version Build Channel
librosa 0.10.2.post1 pyhd8ed1ab_1 conda-forge
--- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ---
16:56 ~ $ pip install torch
Defaulting to user installation because normal site-packages is not writeable
Looking in links: /usr/share/pip-wheels
Requirement already satisfied: torch in /usr/local/lib/python3.10/site-packages (1.11.0+cpu)
Requirement already satisfied: typing-extensions in ./.local/lib/python3.10/site-packages (from torch) (4.12.2)
16:59 ~ $
16:59 ~ $ pip show torch
Name: torch
Version: 1.11.0+cpu
Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
Home-page: https://pytorch.org/
Author: PyTorch Team
Author-email: packages@pytorch.org
License: BSD-3
Location: /usr/local/lib/python3.10/site-packages
Requires: typing-extensions
Required-by: torchaudio, torchvision
17:00 ~ $
17:22 ~ $ ls
README.txt app_pt.py models--openai--whisper-tiny.zip
17:22 ~ $ unzip models--openai--whisper-tiny.zip
Archive: models--openai--whisper-tiny.zip
creating: models--openai--whisper-tiny/
creating: models--openai--whisper-tiny/blobs/
inflating: models--openai--whisper-tiny/blobs/c2048dfa9fd94a052e62e908d2c4dfb18534b4d2
inflating: models--openai--whisper-tiny/blobs/d13b786c04765fb1a06492b53587752cd67665ea
inflating: models--openai--whisper-tiny/blobs/d7016e21da8776c8a9d577d0f559600f09a240eb
inflating: models--openai--whisper-tiny/blobs/1e95340ff836fad1b5932e800fb7b8c5e6d78a74
inflating: models--openai--whisper-tiny/blobs/6038932a2a1f09a66991b1c2adae0d14066fa29e
inflating: models--openai--whisper-tiny/blobs/dd6ae819ad738ac1a546e9f9282ef325c33b9ea0
inflating: models--openai--whisper-tiny/blobs/e3d256c988462aa153dcabe2aa38b8e9b436c06f
inflating: models--openai--whisper-tiny/blobs/bf69932dca4b3719b59fdd8f6cc1978109509f6c
inflating: models--openai--whisper-tiny/blobs/417aa9de49a132dd3eb6a56d3be2718b15f08917
inflating: models--openai--whisper-tiny/blobs/7ebd0e69e78190ffe1438491fa05cc1f5c1aa3a4c4db3bc1723adbb551ea2395
inflating: models--openai--whisper-tiny/blobs/4b26dd66b8f7bca37d851d259fdc118315cacc62
creating: models--openai--whisper-tiny/snapshots/
creating: models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/
linking: models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/preprocessor_config.json -> ../../blobs/c2048dfa9fd94a052e62e908d2c4dfb18534b4d2
linking: models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/tokenizer_config.json -> ../../blobs/d13b786c04765fb1a06492b53587752cd67665ea
linking: models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/vocab.json -> ../../blobs/d7016e21da8776c8a9d577d0f559600f09a240eb
linking: models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/tokenizer.json -> ../../blobs/1e95340ff836fad1b5932e800fb7b8c5e6d78a74
linking: models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/merges.txt -> ../../blobs/6038932a2a1f09a66991b1c2adae0d14066fa29e
linking: models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/normalizer.json -> ../../blobs/dd6ae819ad738ac1a546e9f9282ef325c33b9ea0
linking: models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/added_tokens.json -> ../../blobs/e3d256c988462aa153dcabe2aa38b8e9b436c06f
linking: models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/special_tokens_map.json -> ../../blobs/bf69932dca4b3719b59fdd8f6cc1978109509f6c
linking: models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/config.json -> ../../blobs/417aa9de49a132dd3eb6a56d3be2718b15f08917
linking: models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/model.safetensors -> ../../blobs/7ebd0e69e78190ffe1438491fa05cc1f5c1aa3a4c4db3bc1723adbb551ea2395
linking: models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/generation_config.json -> ../../blobs/4b26dd66b8f7bca37d851d259fdc118315cacc62
creating: models--openai--whisper-tiny/refs/
inflating: models--openai--whisper-tiny/refs/main
creating: models--openai--whisper-tiny/.no_exist/
creating: models--openai--whisper-tiny/.no_exist/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/
inflating: models--openai--whisper-tiny/.no_exist/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/chat_template.jinja
inflating: models--openai--whisper-tiny/.no_exist/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/processor_config.json
inflating: models--openai--whisper-tiny/.no_exist/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/chat_template.json
finishing deferred symbolic links:
models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/preprocessor_config.json -> ../../blobs/c2048dfa9fd94a052e62e908d2c4dfb18534b4d2
models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/tokenizer_config.json -> ../../blobs/d13b786c04765fb1a06492b53587752cd67665ea
models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/vocab.json -> ../../blobs/d7016e21da8776c8a9d577d0f559600f09a240eb
models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/tokenizer.json -> ../../blobs/1e95340ff836fad1b5932e800fb7b8c5e6d78a74
models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/merges.txt -> ../../blobs/6038932a2a1f09a66991b1c2adae0d14066fa29e
models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/normalizer.json -> ../../blobs/dd6ae819ad738ac1a546e9f9282ef325c33b9ea0
models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/added_tokens.json -> ../../blobs/e3d256c988462aa153dcabe2aa38b8e9b436c06f
models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/special_tokens_map.json -> ../../blobs/bf69932dca4b3719b59fdd8f6cc1978109509f6c
models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/config.json -> ../../blobs/417aa9de49a132dd3eb6a56d3be2718b15f08917
models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/model.safetensors -> ../../blobs/7ebd0e69e78190ffe1438491fa05cc1f5c1aa3a4c4db3bc1723adbb551ea2395
models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af/generation_config.json -> ../../blobs/4b26dd66b8f7bca37d851d259fdc118315cacc62
17:22 ~ $
17:22 ~ $
17:22 ~ $ du -sh .
446M .
17:22 ~ $