Ever wondered how hard it would be to load a CSV file into a SQLite database? I know how I would do it in code, no rocket science needed there! However, in this case I wanted to know how fast it could be done natively, and I really didn’t want to code anything!
So, looking at what SQLite3.exe has to offer, it pretty much supports this out of the box. Very nice.
Requirements:
- Loading speed
- Making the data available to consuming applications ASAP
While I love C#, and frankly it’s hard to go back to C or C++, sometimes performance trumps the creature comforts we have become accustomed to.
Note: I did this without circling back to a C# implementation, as I know the data and performance requirements are tight, and in this case I wanted max performance with no code! The biggest factor in a successful implementation is using the tools best suited to the job, not just the ones you favor in that specific year.
So first things first – create a table to take the input:
DROP TABLE IF EXISTS BookSales;
CREATE TABLE IF NOT EXISTS BookSales
(
Store int
,Date varchar
,OrderReference varchar
,Line int
,BookISBN varchar(14)
,Quantity int
,Price int
, Primary Key (OrderReference,Line)
);
Next is the magic. We need to load the CSV into the table:
.separator "|"
.import BookSales.txt BookSales
Wow, that was easy. You can see we set the separator to a pipe rather than a comma in this case, then ran the import. The general form of the command is:
.import [FileName] [Table]
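For comparison, here is a rough sketch of what the import amounts to if you did go the code route. It is not how sqlite3.exe works internally, just the same end result using Python's standard sqlite3 and csv modules. The sample rows, the in-memory database and the load_pipe_rows helper are all made up for illustration; the real layout of BookSales.txt is assumed to match the table columns.

```python
import csv
import sqlite3

# Hypothetical sample rows in the pipe-separated layout assumed for BookSales.txt.
SAMPLE_LINES = [
    "1|2014-01-05|ORD-1001|1|9780000000001|2|1998",
    "1|2014-01-05|ORD-1001|2|9780000000002|1|1250",
    "2|2014-01-05|ORD-1002|1|9780000000001|3|2997",
]

CREATE_SQL = """
CREATE TABLE IF NOT EXISTS BookSales
(
 Store int
,Date varchar
,OrderReference varchar
,Line int
,BookISBN varchar(14)
,Quantity int
,Price int
,PRIMARY KEY (OrderReference, Line)
)
"""


def load_pipe_rows(conn, lines):
    """Insert pipe-separated lines into BookSales and return the table's row count."""
    conn.execute(CREATE_SQL)
    rows = csv.reader(lines, delimiter="|")  # split each line on the pipe
    with conn:  # commit the whole batch as one transaction
        conn.executemany("INSERT INTO BookSales VALUES (?,?,?,?,?,?,?)", rows)
    return conn.execute("SELECT COUNT(*) FROM BookSales").fetchone()[0]


# In-memory database used purely as a stand-in for the real file.
conn = sqlite3.connect(":memory:")
loaded = load_pipe_rows(conn, SAMPLE_LINES)
print(loaded)
```

Every field arrives as text, and SQLite's column affinity turns the numeric-looking values into integers for the int columns. That is a fair amount of typing compared with two dot-commands.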
Now the database is ready to be queried! But if we want to take it just one stage further:
.output SummaryBookSales.csv
SELECT Store, Date, BookISBN, SUM(Quantity), SUM(Price)
FROM BookSales
GROUP BY Store, Date, BookISBN;
This writes the results of our simple aggregation to a pipe-separated output file (the separator set earlier still applies, despite the .csv extension).
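Since the summary file is pipe-separated, anything consuming it needs to split on the pipe rather than the comma. A minimal sketch in Python, assuming the five columns come out in the order of the SELECT above; the sample lines and the read_summary helper are hypothetical:

```python
import csv

# Hypothetical lines in the shape assumed for SummaryBookSales.csv:
# Store|Date|BookISBN|SUM(Quantity)|SUM(Price)
SAMPLE_SUMMARY = [
    "1|2014-01-05|9780000000001|2|1998",
    "2|2014-01-05|9780000000001|3|2997",
]


def read_summary(lines):
    """Parse pipe-separated summary lines into typed tuples."""
    rows = []
    for store, date, isbn, qty, price in csv.reader(lines, delimiter="|"):
        rows.append((int(store), date, isbn, int(qty), int(price)))
    return rows


summary = read_summary(SAMPLE_SUMMARY)
print(summary)
```

With a real file you would pass an open file handle instead of the sample list.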
Tying this all together in a single configuration file, which we will call “BookAnalysisLoader.sql”, gives us:
DROP TABLE IF EXISTS BookSales;
CREATE TABLE IF NOT EXISTS BookSales
(
Store int
,Date varchar
,OrderReference varchar
,Line int
,BookISBN varchar(14)
,Quantity int
,Price int
, Primary Key (OrderReference,Line)
);
.separator "|"
.import BookSales.txt BookSales
.output SummaryBookSales.csv
SELECT Store, Date, BookISBN, SUM(Quantity), SUM(Price)
FROM BookSales
GROUP BY Store, Date, BookISBN;
.exit
The last piece of the puzzle is the final execution:
sqlite3.exe BookSalesAnalysis.db3 < BookAnalysisLoader.sql
Now we have a newly created database with our analysis data in it, and we have a summary CSV file generated from the output. So we can load the CSV into Excel or another DB, or directly interrogate the DB for more analytical information – and all without coding!
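As an idea of what directly interrogating the DB could look like from a consuming application, here is a small Python sketch. The sales_by_store function and the sample rows are made up for illustration, and an in-memory database stands in for BookSalesAnalysis.db3 so the snippet runs on its own.

```python
import sqlite3


def sales_by_store(conn):
    """Return (Store, total quantity, total price) per store, ordered by store."""
    return conn.execute(
        "SELECT Store, SUM(Quantity), SUM(Price) "
        "FROM BookSales GROUP BY Store ORDER BY Store"
    ).fetchall()


# Stand-in for sqlite3.connect("BookSalesAnalysis.db3"), filled with hypothetical rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BookSales (Store int, Date varchar, OrderReference varchar, "
    "Line int, BookISBN varchar(14), Quantity int, Price int, "
    "PRIMARY KEY (OrderReference, Line))"
)
conn.executemany(
    "INSERT INTO BookSales VALUES (?,?,?,?,?,?,?)",
    [
        (1, "2014-01-05", "ORD-1001", 1, "9780000000001", 2, 1998),
        (1, "2014-01-05", "ORD-1001", 2, "9780000000002", 1, 1250),
        (2, "2014-01-05", "ORD-1002", 1, "9780000000001", 3, 2997),
    ],
)

totals = sales_by_store(conn)
print(totals)
```

Against the real database, only the connect call changes; the query stays the same.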