Excel at data

For the large majority of us working on data filled websites, be it ecommerce or membership sites, there is inevitably the question asked: how can I upload data en mass to the database?  For the average person, they have a nice little spreadsheet that they’ve created with new users or new products that they just want to upload.  Never mind the cool data entry screens you’ve built, most people still like working in a desktop application.  I still like working in a desktop application, but I’m not yet convinced on the software as a service business model, though I see that my old school ways will have to adapt soonish.

Lets face it though, online forms aren’t really fun to work with for large updates, it sometimes is easier to build a little spreadsheet and upload.  For many years, I struggled with data formatting.  ColdFusion does not happily read in any sort of data file that’s supposed to be actual data, or at least it didn’t up until XML started gaining momentum.  But your average user who is working with Excel files probably has no clue what XML is (Extensible Markup Language), or how to export their data as XML.  It’s a struggle enough just getting them to export a comma or tab delimited file.  So after what seemed like years of struggle getting our users to export correctly (and not doing a great job of it), and getting the parsing correct, I ran across an article on the JDBC (Java database connectivity) that changed my programming life.  The first thing that I learned is that I do not have to use the native support of data sources in ColdFusion, I could use the JDBC.  This opened up a world of fun where I could actually use a database like a database.  I didn’t have to create special log ins with old passwords, I could use features that I couldn’t before, it was brilliant.  But as I dove deeper into research, I found a miracle.  I could create a connection to an Excel document on the fly through this same database connector.  I didn’t have to install anything new, I didn’t have to set up a prearranged ODBC (open database connectivity) connection in the ODBC manager and a DSN (Data Source Name) in ColdFusion manager, I was free!

So how is this miracle done?  It’s not a simple one code line piece, but it’s pretty easy.  One, start out creating a java class object (yes, we are using java, lovely java classes). 

myClass=CreateObject( “java”,”java.lang.Class”)

 Then use that class to load up the JDBC driver.

myClass.forName(”sun.jdbc.odbc.JdbcOdbcDriver”)

Once we have the driver set up, we need to use the driver manager object in order to get the database connection.

myDM= CreateObject( “java”,”java.sql.DriverManager”)

The database connection can then load up the jdbc driver specific to what we need, in this case the Excel driver.

myConnection=myDM.getConnection(“jdbc:odbc:Driver={Microsoft Excel Driver (*.xls)};DBQ=” & FULL FILE AND PATH);

Then it’s all about the sql statement we need to add in, so first we have to create the connection statement in java:

myStatement= myConnection.createStatement();

After the holder for the statement, we need the holder for the query, the recordset according to the sql statement, here’s where the Excel specific stuff comes in.  Instead of a traditional SQL query where we’re selecting from a table, here we’re selecting from a worksheet.  So instead of using SELECT from table, we use select from sheet.  In the example below, sheet1 must be the name of the worksheet you’re trying to pull information from.  And don’t forget the MS $.

myRecordSet = myStatement.ExecuteQuery(“SELECT * FROM [sheet1$]“)

So finally we need to actually get the query rolling, so we initialize the querytable object with our recordset requirements, and we store that in a standard ColdFusion query object.

myQuery = CreateObject(‘java’,'coldfusion.sql.QueryTable’).init(myRecordSet);

Tada.  That is it.  Just use the myQuery object like any other query object in your ColdFusion programming.  Oh, but don’t forget to close your recordset and connection, very important to keep your memory leakage down.  myRecordset.close() and myConnection.close() will be sufficient. There are already several functions and CFC’s out there for anyone who doesn’t want to actually write the code to do this, here’s a great pre-made function (give credit where credit is due) Kyle Hayes.  I also ran across a neato utility, not quite the same but might be of use to anyone who wants to actually manipulate the Excel files using POI, see Ben Nadel’s post here.