Creating a SQLite database from a CSV file
I work with CSV files a lot using Python, and, especially for larger data sets, the data’s often easier to work with if I put it in a database rather than keep it as a set of lists created by Python’s csv.reader.
I wanted an easier way to do this, so I created a Python module that will take a CSV file and, using the field names for column names, create an in-memory SQLite database to hold the data. (The database is created in memory by default, but this can be overridden.)
The module is available on Github.
A CSVTable is instantiated by passing it a file handler and, optionally, the delimiter used in the CSV file being imported (a comma is the default):
>>> from csvtosqlite import CSVTable
>>> table = CSVTable(open(r'testcsv.csv'))
The SQLite database hasn't been created yet. To do that, the create() method is called. create() requires no arguments but can be passed a filepath where the SQLite database should be saved (“:memory:” by default) and/or a name for the table that will be created (“csvtable” by default).
>>> cursor, tablename = table.create()
The create() method returns a tuple of the SQLite cursor object and the name of the table that was created. Now we can begin performing queries. These examples use the CSV file linked to from the module for testing.
>>> cursor.execute(“SELECTCOUNT(*) FROM %s” % tablename)
>>> print cursor.fetchone()[0]
2836
>>> cursor.execute(“SELECTSUM(latitude) FROM %s” % tablename)
>>> print cursor.fetchone()[0]
Any comments or suggestions are appreciated.
Rasmus on Dec. 15, 2009, at 6:42 a.m. EST