I’m working on a project where I have to build a database for data that is currently kept in Excel files, and I need to make some changes to the data before writing it to the database.
I plan to read the files in Java and write the data to the database from Java. Should I use Hibernate for that job? If not, how else can I connect to the Excel sheets, read the data in Java, and write it to the database?
3
It’s possible to query Excel sheets using SQL with ODBC. The sheets are treated like tables. You can even join the sheets like you would database tables! I’d recommend you try this option first. Google java + ODBC + excel for details.
There is Apache’s POI library: http://poi.apache.org/
I’m not sure whether it works on the old “binary” format or only on the XML format of newer versions of Excel. But I remember that someone I worked with used some sort of Apache library to work with binary-format Excel sheets a few years back.
In .NET you can use the Excel interop library to get the data out of the Excel sheet (as well as do anything else that you could possibly do in Excel).
Once you’ve extracted the data from Excel, you can load it into the database the way you normally would.
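For the "load it the way you normally would" step, plain JDBC is usually enough and Hibernate is not required. Here is a minimal sketch, assuming the rows have already been read out of the sheet as lists of strings. The class name `SheetLoader`, its method names, and the table and column names in the usage note are all made up for illustration; the JDBC calls themselves are from the standard `java.sql` package.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.sql.Types;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: batches already-extracted spreadsheet rows into one table.
final class SheetLoader {

    // Builds a parameterized statement such as "INSERT INTO t (a, b) VALUES (?, ?)".
    // Table and column names are concatenated, so they must come from your own
    // code, never from user input.
    public static String buildInsertSql(String table, List<String> columns) {
        String placeholders = String.join(", ", Collections.nCopies(columns.size(), "?"));
        return "INSERT INTO " + table + " (" + String.join(", ", columns)
                + ") VALUES (" + placeholders + ")";
    }

    // Example of a "change before writing": trim the cell and map blanks to null.
    public static String normalize(String raw) {
        if (raw == null) {
            return null;
        }
        String trimmed = raw.trim();
        return trimmed.isEmpty() ? null : trimmed;
    }

    // Inserts all rows in one batch inside one transaction; returns the row count sent.
    public static int load(Connection conn, String table, List<String> columns,
                           List<List<String>> rows) throws SQLException {
        boolean oldAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false);
        try (PreparedStatement ps = conn.prepareStatement(buildInsertSql(table, columns))) {
            for (List<String> row : rows) {
                for (int i = 0; i < columns.size(); i++) {
                    // Short rows are padded with NULLs.
                    String value = i < row.size() ? normalize(row.get(i)) : null;
                    if (value == null) {
                        ps.setNull(i + 1, Types.VARCHAR);
                    } else {
                        ps.setString(i + 1, value);
                    }
                }
                ps.addBatch();
            }
            int[] counts = ps.executeBatch();
            conn.commit();
            return counts.length;
        } catch (SQLException e) {
            conn.rollback(); // undo the partial batch
            throw e;
        } finally {
            conn.setAutoCommit(oldAutoCommit);
        }
    }
}
```

You would call it with something like `SheetLoader.load(conn, "customers", Arrays.asList("name", "email"), rows)`, where `conn` comes from `DriverManager.getConnection(...)` with whatever JDBC driver your database needs. Everything is bound as a string here to keep the sketch short; real columns will probably need `setInt`, `setDate`, and so on.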
1
LibreOffice and OpenOffice scale very well, and both run headless on Linux, Mac, and Windows.
You can get the contents of a spreadsheet with about 200 lines of code.
Use the OpenOffice.org (or LibreOffice) APIs to open the spreadsheet files, access the data, perform the transformations, and then write the proper data into the new database.
As long as you are not trying to do fancy things like inserting and updating formulas, it’s very simple and cross-platform, and because it’s open source you can run as many instances as you need. It opens xls files by converting them to its own format (ODS). On Linux you can run it in headless mode, and if you have multiple processors you can start several OpenOffice daemons to process several documents at once. It’s not marketed much as the “pro” solution, and many people frown on it right away. But I spent many years working with it, starting in 2003, building a platform that printed millions of impressions a year, and 90% of that code is still running.
We started the implementation with 1.0, and as of last year they are using a 3.x version. We didn’t have to change any code to adjust for the upgrade. You don’t need any other version-dependent XML API, and you won’t need to worry about compatibility: OpenOffice 3.x will open 97-2007 xls files. I recommend this because I was a platform R&D analyst and Java developer for 8 years, and I spent the better part of three years researching this solution against others.
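To give an idea of the headless route without going through the full OpenOffice API, here is a rough sketch that swaps in a simpler technique: it shells out to the `soffice --headless --convert-to csv` command line and then parses the resulting CSV lines in plain Java. It assumes `soffice` is on the PATH; the class `HeadlessConverter` and its methods are names I made up for the example. As far as I know the CSV export only covers the first sheet, so a multi-sheet workbook would need the real API or one conversion per sheet.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper around the LibreOffice/OpenOffice command line.
final class HeadlessConverter {

    // Command line that converts one spreadsheet to CSV in outDir, without a GUI.
    public static List<String> convertCommand(Path input, Path outDir) {
        return Arrays.asList("soffice", "--headless", "--convert-to", "csv",
                "--outdir", outDir.toString(), input.toString());
    }

    // Runs the conversion and returns the exit code of the soffice process.
    public static int convert(Path input, Path outDir)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(convertCommand(input, outDir))
                .inheritIO()
                .start();
        return process.waitFor();
    }

    // Splits one CSV line into fields. Handles quoted fields and doubled quotes,
    // but not quoted fields that span several lines.
    public static List<String> parseCsvLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // "" inside quotes is a literal quote
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                inQuotes = true;
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());
        return fields;
    }
}
```

After `HeadlessConverter.convert(Paths.get("book.xls"), Paths.get("out"))` you would read `out/book.csv` line by line, run each line through `parseCsvLine`, apply your transformations, and write the result to the database.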
3
If I were you, I wouldn’t use dedicated code for that.
Prefer an ETL tool (Talend or Kettle, for example).
It will be easier to parse your files, and you will find that many problems (compatibility, maintainability, etc.) are already solved.