Database Conversion (MS access to Sqlite3)
Used mdbtools (linux) to convert MS access (mdb) to Sqlite3
Shell script
Code Block |
---|
language | bash |
---|
title | mdb2sqlite.sh |
---|
|
#!/bin/bash
# Inspired by
# https://www.codeenigma.com/community/blog/using-mdbtools-nix-convert-microsoft-access-mysql
# USAGE
# Rename your MDB file to migration-export.mdb
# run ./mdb2sqlite.sh migration-export.mdb
# wait and wait a bit longer...
now=$(date +%s)
sqlite=sqlite3
fname=$1
sql=${fname/mdb/sqlite}
schema=${fname/mdb/schema}
dir=${fname/.mdb/}-$now
mkdir $dir
mdb-schema $fname sqlite > $dir/$schema
for i in $( mdb-tables $fname ); do
echo $i
mdb-export -D "%Y-%m-%d %H:%M:%S" -H -I sqlite $fname $i > $dir/$i.sql
done
< $dir/$schema $sqlite $sql
for f in $dir/*.sql ; do
echo $f
(echo 'BEGIN;'; cat $f; echo 'COMMIT;') | $sqlite $sql
done
echo "Using $dir" |
Overview website: http://ilrdss.isws.illinois.edu/fox/The data is at Box share,
To acquire the CSV file from MS access we first converted the .mdb database into SQLite. Once we had access to the .sqlite3 file we created a python script to access it, execute SQL statements, and generate a CSV file from the resulting table.
- Data location: gltg-source-data/Fox River Watershed Database
- CSV for latest sqlite3 file: selected_final_foxdb_20171116.csv
- Lastest MS access db: FoxDB_20171116.mdb
- Converted sqlite3 file: foxdb_20171116.sqlite3 (please use this one)
- Database
...
...
Presentation at Nutrient Monitoring Council that explains what kinds of query Jong Lee did to extract the data from the database.
View file |
---|
name | NMC-Fox-River-Data.pptx |
---|
height | 250 |
---|
|
...
Code Block |
---|
|
SELECT TBLSample.Station_ID, TBLSample.Sample_Code, TBLSample.IDLoc, TBLResults.IDLoc, TBLParameter_Codes.Full_Name, TBLParameter_Codes.Short_Name, TBLSample.Start_Date, TBLResults.Result_Value, TBLParameter_Group.Parameter_Group
FROM TBLParameter_Group INNER JOIN (TBLQAPP_Group_Codes INNER JOIN (TBLQAPPGroups INNER JOIN (TBLParameter_Codes INNER JOIN (TBLSample INNER JOIN TBLResults ON TBLSample.Sample_Code = TBLResults.Sample_Code) ON TBLParameter_Codes.Parameter_Code = TBLResults.Parameter_Code) ON TBLQAPPGroups.Parameter_Code = TBLParameter_Codes.Parameter_Code) ON TBLQAPP_Group_Codes.QAPPCode = TBLQAPPGroups.QAPPCode) ON TBLParameter_Group.Parameter_Group = TBLQAPP_Group_Codes.Parameter_Group
WHERE (((TBLParameter_Group.Parameter_Group)=9 Or (TBLParameter_Group.Parameter_Group)=10));
tblsample.station_id,
tblstation_information.latitude,
tblstation_information.longitude,
tblstation_information.place_name_description,
tblsample.sample_code,
tblsample.idloc,
tblparameter_codes.full_name,
tblparameter_codes.short_name,
tblsample.start_date,
tblresults.result_value,
tblparameter_group.parameter_group
FROM (((( (tblresults
INNER JOIN tblsample
ON ( tblresults.sample_code = tblsample.sample_code )
AND ( tblresults.idloc = tblsample.idloc ))
INNER JOIN tblparameter_codes
ON tblresults.parameter_code =
tblparameter_codes.parameter_code)
INNER JOIN tblqappgroups
ON tblparameter_codes.parameter_code =
tblqappgroups.parameter_code)
INNER JOIN tblqapp_group_codes
ON tblqappgroups.qappcode = tblqapp_group_codes.qappcode)
INNER JOIN tblparameter_group
ON tblqapp_group_codes.parameter_group =
tblparameter_group.parameter_group)
LEFT JOIN tblstation_information
ON tblstation_information.station_id = tblsample.station_id
WHERE (( ( tblparameter_group.parameter_group ) = 9
OR ( tblparameter_group.parameter_group ) = 10 )) |
Code Block |
---|
|
""" Access a sqlite3 file,
executes a SQL statement, and
generates a CSV file for parsing
Variables
----------
data_loc : str
The location of the database (for example, ./foxdb/foxdb_20171116.sqlite3)
query : str
The SQL statement
"""
import sqlite3
import pandas as pd
data_loc = ""
query = ""
conn = sqlite3.connect(data_loc)
df = pd.read_sql_query(query, conn)
df['Start_Date'] = pd.to_datetime(df['Start_Date'], errors='coerce') #use this to format date and time to only date
df.to_csv("output.csv", encoding='utf-8', index=False, date_format='%Y-%m-%d')
conn.close()
|
Resources:
- Presentation at Nutrient Monitoring Council that explains what kinds of query Jong Lee did to extract the data from the database.
View file |
---|
name | NMC-Fox-River-Data.pptx |
---|
height | 250 |
---|
|
- Python sqlite3 module
- Python Data Analysis Library