Converting CSV to SQL

Plenty of data is available in CSV format, and many people's immediate way of dealing with this is to import it into a spreadsheet (possibly because it's the only tool they know of when they're dealing with anything other than paragraphs of text).

Quite often, though, the data itself is far better handled in a database, so getting the CSV file into SQL is a good starting point.

MySQL does have the load data statement, which can easily read CSV formatted data, however you need to create the table schema for that data first, which can be fiddly.

If you have a CSV file with a header line, telling you the names of the fields of data, a simple Bash script can convert this header line into a create table statement and then import the data itself into the table, which now automatically matches the data fields.

csv2sql.sh
#!/bin/bash

# Take a CSV file with a header row and create a SQL table containing named fields
# Then import the data into the table.

if [ -z "$2" ]
then
  echo "Please supply a CSV filename with a header row, and a table name to be created from it"
  exit 1
fi

csvfile="$1"
table="$2"
DB=MyDatabaseName
length=64

( echo "drop table if exists $table;"

  echo "create table $table (UniqueIndex int auto_increment unique"
  head -n1 "$csvfile" | fromdos | sed 's/^"//; s/","/\n/g; s/"$//' | while read field
  do
    sqlfield=`echo "$field" | tr -cd '[:alpha:]'`
    echo ",$sqlfield varchar($length)"
  done
  echo ");"
) | mysql $DB

tail -n+2 "$csvfile" | fromdos | sed "s/\(.*\)/insert into $table values (0,\1);/" | mysql $DB

An example of the sort of CSV file this script can handle:

"Account","Service","Call Class","Charge Code","Destination","Peak Per Unit","Off Peak Per Unit","Weekend Per Unit","Billing Increment","Unit","As At"
"Customer","IP Direct","SIP","AFG","Afghanistan","0.2397","0.2397","0.2397","1","60","01/06/2022"
"Customer","IP Direct","SIP","AFGMOB","Afghanistan Mobile","0.2543","0.2543","0.2543","1","60","01/06/2022"
"Customer","IP Direct","SIP","ALB","Albania","0.1835","0.1835","0.1835","1","60","01/06/2022"
"Customer","IP Direct","SIP","ALBMOB","Albania Mobile","0.4563","0.4563","0.4563","1","60","01/06/2022"
"Customer","IP Direct","SIP","ALG","Algeria","0.0748","0.0748","0.0748","1","60","10/06/2022"
"Customer","IP Direct","SIP","ALGMOB","Algeria Mobile","0.4238","0.4238","0.4238","1","60","01/06/2022"
"Customer","IP Direct","SIP","ALGMBO","Algeria Mobile Other","0.5522","0.5522","0.5522","1","60","01/06/2022"

Result:

mysql> select * from wiki limit 1\G
*************************** 1. row ***************************
     UniqueIndex: 1
         Account: Customer
         Service: IP Direct
       CallClass: SIP
      ChargeCode: AFG
     Destination: Afghanistan
     PeakPerUnit: 0.2397
  OffPeakPerUnit: 0.2397
  WeekendPerUnit: 0.2397
BillingIncrement: 1
            Unit: 60
            AsAt: 01/06/2022
1 row in set (0.00 sec)

Notes:


Go up
Return to main index.