Monday, March 10, 2025

Import-Export CSV Data in SQLite Database

To See All Articles About Technology: Index of Lessons in Technology

Export the data of a table as CSV

To export data from an SQLite table to a CSV file, you can use the following methods:


📌 Method 1: Using SQLite Command Line

  1. Open your SQLite database in the terminal:
    sh
    sqlite3 my_database.db
  2. Set output mode to CSV:
    sql
    .mode csv
  3. Set the output file name:
    sql
    .output my_table_data.csv
  4. Run the query to export:
    sql
    SELECT * FROM my_table;
  5. Exit SQLite:
    sql
    .quit

Now, my_table_data.csv will contain your table data in CSV format.
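
The same export can also be done non-interactively as a one-liner; a minimal sketch, assuming the same database and table names:

sh
sqlite3 -header -csv my_database.db "SELECT * FROM my_table;" > my_table_data.csv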


📌 Method 2: Using SQL Query

If you're inside SQLite, run:

sql
.mode csv
.headers on
.output my_table_data.csv
SELECT * FROM my_table;

This exports data with column headers.


📌 Method 3: Using Python

If you prefer Python, use the csv module:

python
import sqlite3
import csv

# Connect to SQLite database
conn = sqlite3.connect("my_database.db")
cursor = conn.cursor()

# Execute query
cursor.execute("SELECT * FROM my_table")

# Fetch data
rows = cursor.fetchall()
column_names = [desc[0] for desc in cursor.description]

# Write to CSV
with open("my_table_data.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(column_names)  # Write headers
    writer.writerows(rows)         # Write data

# Close connection
conn.close()

print("Data exported successfully!")

✅ This will create my_table_data.csv with column headers.


📌 Check Your Exported CSV

After exporting, verify the data using:

sh
cat my_table_data.csv

or

sh
head my_table_data.csv



Import the CSV data into an SQLite table

To import CSV data into an SQLite table, use one of the following methods:


📌 Method 1: Using SQLite Command Line

  1. Open your SQLite database in the terminal:
    sh
    sqlite3 my_database.db
  2. Enable CSV mode:
    sql
    .mode csv
  3. Import the CSV file into the table:
    sql
    .import my_data.csv my_table
  4. Verify the data:
    sql
    SELECT * FROM my_table LIMIT 5;

Your CSV data is now imported into my_table.
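
Note: if my_table already exists, .import treats the CSV's first (header) row as ordinary data. On SQLite 3.32 or newer you can skip it explicitly; a sketch, assuming the same file and table names:

sql
.import --csv --skip 1 my_data.csv my_table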


📌 Method 2: Using Python

If you prefer Python, use the csv module:

python
import sqlite3
import csv

# Connect to SQLite database
conn = sqlite3.connect("my_database.db")
cursor = conn.cursor()

# Create table (if not exists)
cursor.execute("""
CREATE TABLE IF NOT EXISTS my_table (
    id INTEGER PRIMARY KEY,
    name TEXT,
    age INTEGER,
    city TEXT
)
""")

# Open CSV file and insert data
with open("my_data.csv", "r") as f:
    reader = csv.reader(f)
    next(reader)  # Skip header row
    for row in reader:
        cursor.execute("INSERT INTO my_table (id, name, age, city) VALUES (?, ?, ?, ?)", row)

# Commit changes and close connection
conn.commit()
conn.close()

print("CSV data imported successfully!")

This script reads my_data.csv and inserts it into my_table.
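
As a small optimization, the per-row loop can be replaced by a single executemany call, which batches the inserts; a sketch, assuming the same table and columns as above:

python
import sqlite3
import csv

conn = sqlite3.connect("my_database.db")
cursor = conn.cursor()

with open("my_data.csv", "r", newline="") as f:
    reader = csv.reader(f)
    next(reader)  # Skip header row
    # executemany accepts any iterable of row sequences, including the reader itself
    cursor.executemany(
        "INSERT INTO my_table (id, name, age, city) VALUES (?, ?, ?, ?)",
        reader,
    )

conn.commit()
conn.close()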


📌 Method 3: Using Pandas (If You Have Large Data)

If you're dealing with large CSV files, use pandas:

python
import sqlite3
import pandas as pd

# Load CSV into Pandas DataFrame
df = pd.read_csv("my_data.csv")

# Connect to SQLite
conn = sqlite3.connect("my_database.db")

# Import DataFrame into SQLite table
df.to_sql("my_table", conn, if_exists="replace", index=False)

# Close connection
conn.close()

print("CSV data imported successfully!")

This method is well suited to large datasets. Note that if_exists="replace" drops and recreates the table; use if_exists="append" to keep existing rows.
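
If the CSV is too large to fit in memory at once, pandas can also stream it in chunks; a sketch, with an arbitrary chunk size:

python
import sqlite3
import pandas as pd

conn = sqlite3.connect("my_database.db")

# Read and insert 10,000 rows at a time so the whole file
# never has to be loaded into memory at once.
for chunk in pd.read_csv("my_data.csv", chunksize=10_000):
    chunk.to_sql("my_table", conn, if_exists="append", index=False)

conn.close()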


📌 Verify Import

Run this in SQLite to check your data:

sql
SELECT * FROM my_table LIMIT 5;


Saturday, March 8, 2025

Hour 2 - First steps with development for Automatic Speech Recognition (ASR)

To See All Articles About Technology: Index of Lessons in Technology
Note: We are using this model: 
https://huggingface.co/openai/whisper-tiny/tree/main

$ pwd
/home/ashish/Desktop/Using OpenAI-Whisper-Tiny via HuggingFace for Automatic Speech Recognition app (Research)/models--openai--whisper-tiny

$ ls -lR
...
...
...

./snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af:
total 4
lrwxrwxrwx 1 ashish ashish 52 Mar  8 22:46 added_tokens.json -> ../../blobs/e3d256c988462aa153dcabe2aa38b8e9b436c06f
lrwxrwxrwx 1 ashish ashish 52 Mar  8 22:46 config.json -> ../../blobs/417aa9de49a132dd3eb6a56d3be2718b15f08917
lrwxrwxrwx 1 ashish ashish 52 Mar  8 22:46 generation_config.json -> ../../blobs/4b26dd66b8f7bca37d851d259fdc118315cacc62
lrwxrwxrwx 1 ashish ashish 52 Mar  8 22:46 merges.txt -> ../../blobs/6038932a2a1f09a66991b1c2adae0d14066fa29e
lrwxrwxrwx 1 ashish ashish 76 Mar  8 22:46 model.safetensors -> ../../blobs/7ebd0e69e78190ffe1438491fa05cc1f5c1aa3a4c4db3bc1723adbb551ea2395
lrwxrwxrwx 1 ashish ashish 52 Mar  8 22:46 normalizer.json -> ../../blobs/dd6ae819ad738ac1a546e9f9282ef325c33b9ea0
lrwxrwxrwx 1 ashish ashish 52 Mar  8 22:46 preprocessor_config.json -> ../../blobs/c2048dfa9fd94a052e62e908d2c4dfb18534b4d2
lrwxrwxrwx 1 ashish ashish 52 Mar  8 22:46 special_tokens_map.json -> ../../blobs/bf69932dca4b3719b59fdd8f6cc1978109509f6c
lrwxrwxrwx 1 ashish ashish 52 Mar  8 22:46 tokenizer_config.json -> ../../blobs/d13b786c04765fb1a06492b53587752cd67665ea
lrwxrwxrwx 1 ashish ashish 52 Mar  8 22:46 tokenizer.json -> ../../blobs/1e95340ff836fad1b5932e800fb7b8c5e6d78a74
lrwxrwxrwx 1 ashish ashish 52 Mar  8 22:46 vocab.json -> ../../blobs/d7016e21da8776c8a9d577d0f559600f09a240eb


I am getting this error with my local setup:
OSError: ./models--openai--whisper-tiny does not appear to have a file named preprocessor_config.json. Checkout 'https://huggingface.co/./models--openai--whisper-tiny/tree/main' for available files.

But this file is present in my model_dir, as I can verify using "ls -lR":
lrwxrwxrwx 1 ashish ashish 52 Mar 8 22:46 preprocessor_config.json -> ../../blobs/c2048dfa9fd94a052e62e908d2c4dfb18534b4d2

This error usually happens when the model directory contains symbolic links (symlinks) that point to files (like preprocessor_config.json) outside the current folder, and the Transformers library can’t resolve them. Even though your ls output shows the symlink, if the target file (in this case, the blob file) isn’t actually available in the expected relative location, you’ll get an error.

Here are some steps to resolve it:

  1. Check the Symlink Target:
    Run:

    bash
    readlink preprocessor_config.json

    Verify that the target file (e.g. ../../blobs/c2048dfa9fd94a052e62e908d2c4dfb18534b4d2) exists relative to your model directory. If it doesn’t, the symlink is broken.

  2. Properly Download the Repository:
    The recommended way to obtain the model files is to clone the repository using Git LFS. For example:

    bash
    git lfs install
    git clone https://huggingface.co/openai/whisper-tiny

    This ensures that all the blob files are downloaded and the symlinks point to real files.

  3. Manually Fix the Symlink:
    If you already downloaded the files manually and have the blob file somewhere else, copy that blob file into the expected relative path so the symlink resolves. Alternatively, you can replace the symlink with the actual file:

    bash
    cp ../../blobs/c2048dfa9fd94a052e62e908d2c4dfb18534b4d2 preprocessor_config.json

    (Make sure the source path is correct.)

  4. Use the Hugging Face Hub API:
    Instead of manually handling files, you can let the Transformers library handle the download and caching by specifying the model repository name. For example:

    python
    from transformers import WhisperProcessor, WhisperForConditionalGeneration

    model_dir = "openai/whisper-tiny"

    # This will download the files and cache them properly
    processor = WhisperProcessor.from_pretrained(model_dir)
    model = WhisperForConditionalGeneration.from_pretrained(model_dir)

    This avoids the need to manage symlinks manually.

By ensuring that the symlinks are valid (either by using Git LFS to clone the repository or by copying the actual files in place), the Transformers library should be able to load the preprocessor configuration without error.
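
A quick way to confirm whether any symlink in the snapshot directory is broken is a short script like the one below; a sketch, using the snapshot path from the ls -lR output above:

python
import os

# Snapshot directory from the ls -lR listing above; adjust the hash to yours.
snapshot_dir = "models--openai--whisper-tiny/snapshots/169d4a4341b33bc18d8881c4b69c2e104e1cc0af"

for name in os.listdir(snapshot_dir):
    path = os.path.join(snapshot_dir, name)
    # os.path.exists() follows symlinks, so it returns False for a broken link
    if os.path.islink(path) and not os.path.exists(path):
        print("Broken symlink:", name, "->", os.readlink(path))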


If you don't fix the symlinks manually, you will get the below error:

04:04 ~ $ python app_pt.py 
2025-03-09 04:06:00.012059: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib/python3.10/site-packages/cv2/../../lib64:
2025-03-09 04:06:00.012096: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
  File "/home/keshavpawar137/app_pt.py", line 13, in [module]
    processor = WhisperProcessor.from_pretrained(model_dir)
  File "/home/keshavpawar137/.local/lib/python3.10/site-packages/transformers/processing_utils.py", line 1070, in from_pretrained
    args = cls._get_arguments_from_pretrained(pretrained_model_name_or_path, **kwargs)
  File "/home/keshavpawar137/.local/lib/python3.10/site-packages/transformers/processing_utils.py", line 1116, in _get_arguments_from_pretrained
    args.append(attribute_class.from_pretrained(pretrained_model_name_or_path, **kwargs))
  File "/home/keshavpawar137/.local/lib/python3.10/site-packages/transformers/feature_extraction_utils.py", line 385, in from_pretrained
    feature_extractor_dict, kwargs = cls.get_feature_extractor_dict(pretrained_model_name_or_path, **kwargs)
  File "/home/keshavpawar137/.local/lib/python3.10/site-packages/transformers/feature_extraction_utils.py", line 511, in get_feature_extractor_dict
    resolved_feature_extractor_file = cached_file(
  File "/home/keshavpawar137/.local/lib/python3.10/site-packages/transformers/utils/hub.py", line 313, in cached_file
    raise EnvironmentError(
OSError: ./models--openai--whisper-tiny does not appear to have a file named preprocessor_config.json. Checkout 'https://huggingface.co/./models--openai--whisper-tiny/tree/main' for available files.

Manually Fixing the Symlink

$ pwd
/home/ashish/Desktop/Using OpenAI-Whisper-Tiny via HuggingFace for Automatic Speech Recognition app (Research)/models--openai--whisper-tiny

cp ./blobs/e3d256c988462aa153dcabe2aa38b8e9b436c06f added_tokens.json
cp ./blobs/417aa9de49a132dd3eb6a56d3be2718b15f08917 config.json
cp ./blobs/4b26dd66b8f7bca37d851d259fdc118315cacc62 generation_config.json
cp ./blobs/6038932a2a1f09a66991b1c2adae0d14066fa29e merges.txt
cp ./blobs/7ebd0e69e78190ffe1438491fa05cc1f5c1aa3a4c4db3bc1723adbb551ea2395 model.safetensors
cp ./blobs/dd6ae819ad738ac1a546e9f9282ef325c33b9ea0 normalizer.json
cp ./blobs/c2048dfa9fd94a052e62e908d2c4dfb18534b4d2 preprocessor_config.json
cp ./blobs/bf69932dca4b3719b59fdd8f6cc1978109509f6c special_tokens_map.json
cp ./blobs/d13b786c04765fb1a06492b53587752cd67665ea tokenizer_config.json
cp ./blobs/1e95340ff836fad1b5932e800fb7b8c5e6d78a74 tokenizer.json
cp ./blobs/d7016e21da8776c8a9d577d0f559600f09a240eb vocab.json

~~~

05:34 ~/mysite $ ls
__pycache__  flask_app.py  models--openai--whisper-tiny

https://www.pythonanywhere.com/user/keshavpawar137/files/var/log/keshavpawar137.pythonanywhere.com.error.log

2025-03-09 05:31:23,560: OSError: Incorrect path_or_model_id: './models--openai--whisper-tiny'. Please provide either the path to a local folder or the repo_id of a model on the Hub.
2025-03-09 05:31:23,560:   File "/var/www/keshavpawar137_pythonanywhere_com_wsgi.py", line 16, in <module>
2025-03-09 05:31:23,560:     from flask_app import app as application  # noqa
2025-03-09 05:31:23,560:
2025-03-09 05:31:23,560:   File "/home/keshavpawar137/mysite/flask_app.py", line 23, in <module>
2025-03-09 05:31:23,561:     processor = WhisperProcessor.from_pretrained(model_dir)
Tags: Large Language Models,Technology,

Managing memory by removing cache files from PythonAnywhere Cloud

To See All Articles About Technology: Index of Lessons in Technology
Where are we?
04:40 ~ $ pwd
/home/keshavpawar137

04:41 ~ $ ls
README.txt  app_pt.py  models--openai--whisper-tiny.zip

04:43 ~ $ ls -a
.   .bashrc  .gitconfig  .keras  .my.cnf   .pythonstartup.py  .virtualenvs  app_pt.py
..  .cache   .ipython    .local  .profile  .vimrc             README.txt    models--openai--whisper-tiny.zip

04:43 ~ $ ls -lh
total 86M
-rwxr-xr-x 1 keshavpawar137 registered_users  232 Mar  5 09:53 README.txt
-rw-rw-r-- 1 keshavpawar137 registered_users 1.6K Mar  9 04:04 app_pt.py
-rw-rw-r-- 1 keshavpawar137 registered_users  86M Mar  9 04:42 models--openai--whisper-tiny.zip

04:43 ~ $ du -sh .
298M  

04:44 ~ $ rm -rf ~/.cache/pip
04:44 ~ $ find . -type d -name "__pycache__" -exec rm -rf {} +
04:45 ~ $ du -sh .
159M 

Can I delete ~/.cache/pip?

Yes, you can delete the ~/.cache/pip directory. This directory is used by pip to cache packages and metadata to speed up subsequent installations. Deleting it will not harm your system, but it will mean that pip will need to re-download any packages you install in the future, which might take a bit longer.

To delete the directory, you can use the following command in your terminal:

sh
rm -rf ~/.cache/pip

Make sure to double-check the command to avoid accidentally deleting the wrong files or directories.
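
Recent versions of pip (20.1 and later) also provide a built-in subcommand that does the same thing:

sh
pip cache purge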




Delete __pycache__ from all subdirectories

To delete all __pycache__ directories (including all their contents) from the current directory and its subdirectories, use the following find command:

bash
find . -type d -name "__pycache__" -exec rm -rf {} +

Explanation:

  • .: Start searching from the current directory.
  • -type d: Search for directories only.
  • -name "__pycache__": Match directories named __pycache__.
  • -exec rm -rf {} +: Executes the rm -rf command on each found directory ({} represents the directory name). The + batches the deletions into fewer rm calls, making it more efficient.

Important: This command permanently deletes the __pycache__ directories, so double-check you’re in the correct directory before running it.
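
To preview what would be deleted, you can first run the same find without the -exec part:

bash
find . -type d -name "__pycache__"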

Tags: Cloud,Linux,

Flask Issue of "Address already in use" in PythonAnywhere Cloud

To See All Articles About Technology: Index of Lessons in Technology

Project Setup X:

Issue of: "Address already in use" * Serving Flask app '__main__' (lazy loading) * Environment: production WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead. * Debug mode: on Address already in use Port 5000 is in use by another program. Either identify and stop that program, or start the server with a different port. Traceback (most recent call last): File "/usr/local/lib/python3.10/site-packages/werkzeug/serving.py", line 908, in prepare_socket s.bind(server_address) OSError: [Errno 98] Address already in use FIX: if __name__ == '__main__': app.run(debug=True, port=5002, use_reloader=False)

Project Setup For ASR

04:47 ~ $ python app_pt.py
2025-03-09 04:49:18.556050: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib/python3.10/site-packages/cv2/../../lib64:
2025-03-09 04:49:18.556080: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
 * Serving Flask app 'app_pt' (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: on
Address already in use
Port 5000 is in use by another program. Either identify and stop that program, or start the server with a different port.

05:08 ~ $ tail -3 app_pt.py
if __name__ == "__main__":
    # Run the server on port 5000 and listen on all interfaces.
    app.run(host="0.0.0.0", port=5001, debug=False)

04:59 ~ $ python app_pt.py
2025-03-09 05:01:13.156268: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib/python3.10/site-packages/cv2/../../lib64:
2025-03-09 05:01:13.156294: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
 * Serving Flask app 'app_pt' (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
Address already in use
Port 5001 is in use by another program. Either identify and stop that program, or start the server with a different port.
05:03 ~ $

05:09 ~ $ tail -3 app_pt.py
if __name__ == "__main__":
    # Run the server on port 5000 and listen on all interfaces.
    app.run(host="0.0.0.0", port=5001, debug=False, use_reloader=False)

05:09 ~ $ python app_pt.py
2025-03-09 05:11:43.255847: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/local/lib/python3.10/site-packages/cv2/../../lib64:
2025-03-09 05:11:43.255871: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
 * Serving Flask app 'app_pt' (lazy loading)
 * Environment: production
   WARNING: This is a development server. Do not use it in a production deployment.
   Use a production WSGI server instead.
 * Debug mode: off
Address already in use
Port 5001 is in use by another program. Either identify and stop that program, or start the server with a different port.
05:13 ~

The way to resolve this is not to run the Flask app as a plain Python script. Instead, use the 'Web' app configuration of PythonAnywhere, as shown below:
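
On PythonAnywhere, the 'Web' tab generates a WSGI file that imports your Flask app instead of running it as a script. A minimal sketch of such a file, with paths taken from the error log above (adjust the username and directory to yours):

python
# /var/www/<username>_pythonanywhere_com_wsgi.py
import sys

# Make the project directory importable
project_home = '/home/keshavpawar137/mysite'
if project_home not in sys.path:
    sys.path.insert(0, project_home)

# PythonAnywhere serves the object named 'application'
from flask_app import app as application  # noqa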

[Screenshot of the PythonAnywhere 'Web' app configuration; the screen shown is for demonstration purposes only, not the actual one.]
Tags: Python,Cloud,Technology,Web Development,