Loading data from Pandas to PostgreSQL


Installation of Python package required for accessing PostgreSQL:
Run it in Anaconda prompt: pip install psycopg2

from sqlalchemy import create_engine
import pandas as pd

#engine = create_engine('postgresql+psycopg2://user:password@hostname/database_name')
engine = create_engine('postgresql://postgres:postgres@localhost/postgres')

#import psycopg2
#conn_string = "host='localhost' dbname='my_database' user='postgres' password='secret'"
#conn = psycopg2.connect(conn_string)


# Reading data already present in the database
df = pd.read_sql("""SELECT * FROM states_and_capitals""", engine) 

# The data we are going to load is:
state_name,capital_name
Delhi,New Delhi
Bihar,Patna
Uttar Pradesh,Lucknow

df_2 = pd.read_csv("files_1/pandas_to_postgre.csv")

Note 1:
If a record violates primary key constraint, following logs are generated:
IntegrityError: (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "states_and_capitals_pkey"
DETAIL:  Key (state_name, capital_name)=(harayana, chandigarh) already exists.
 [SQL: 'INSERT INTO states_and_capitals (state_name, capital_name) VALUES (%(state_name)s, %(capital_name)s)'] [parameters: ({'state_name': 'harayana', 'capital_name': 'chandigarh'}, {'state_name': 'punjab', 'capital_name': 'chandigarh'})] (Background on this error at: http://sqlalche.me/e/gkpj)

Note 2:
Column names in the table and CSV file should be same.

# Loading the data into the database
df_2.to_sql(name='states_and_capitals', con=engine, if_exists='append', index=False)

# Viewing the saved data:

engine.execute("SELECT * FROM states_and_capitals").fetchall()

OUT:
[('harayana', 'chandigarh'),
 ('punjab', 'chandigarh'),
 ('Delhi', 'New Delhi'),
 ('Bihar', 'Patna'),
 ('Uttar Pradesh', 'Lucknow')]
 


--  --  --  --  --

PostgreSQL Notes:

Launching the pgAdmin4 file opens the web-app interface for the DB (Note: use Chrome browser).

http://127.0.0.1:54385/browser/

1. Query to get all table names.

select * from information_schema.tables

Ref: https://stackoverflow.com/questions/2276644/list-all-tables-in-postgresql-information-schema

2. Creating a table.

CREATE TABLE states_and_capitals(
   state_name TEXT NOT NULL,
   capital_name TEXT NOT NULL,
   PRIMARY KEY( state_name, capital_name )
);

3. Dropping table.

Drop table states_and_capitals;

4. Insert queries.

INSERT INTO states_and_capitals (state_name, capital_name) VALUES ('harayana', 'chandigarh');

INSERT INTO states_and_capitals (state_name, capital_name) VALUES ('punjab', 'chandigarh');

5. Select query.

select * from states_and_capitals;

Ref: https://www.tutorialspoint.com/postgresql/postgresql_insert_query.htm

No comments:

Post a Comment