I have an Excel sheet with a large amount of data, and I am using PHP to insert the data into a MySQL server.
I have two problems:
1) I have to update a row if the ID already exists, and insert the data otherwise.
2) BIG PROBLEM: I have more than 40,000 rows, and the timeout on the SQL server, which is set by the admin, is 60 seconds. When I run the update/insert query it takes more than 60 seconds, so the server times out and the whole process fails.
Is there a way I can do this?
Currently I check whether the student ID exists: if it does, I update the row, otherwise I insert it. I feel this is taking a lot of time and causing the server to time out.
I also have a field in MySQL storing the last time the data was updated (last_update). I was thinking of using this date: if a row changed after a particular date (i.e. the last time I ran the program), only those rows should be updated.
Will this help in any way?
And what query can I run to check this date in the MySQL database, so that only the rows changed after a particular date are updated and nothing else?
(Please help me with an example query for the above!)
Use MySQL’s INSERT… ON DUPLICATE KEY UPDATE syntax to automatically handle the insert/update logic. 40,000 is not that many rows – I’d be surprised if that command took more than a few seconds.
Note that you can insert many rows at once:
INSERT INTO table (id, name) VALUES (id1, name1), (id2, name2), ..., (idN, nameN) ON DUPLICATE KEY UPDATE name=VALUES(name)
If you’re worried about hitting a memory limit (possible) then try loading the rows in batches of a few thousand at a time and looping through.
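A minimal sketch of that batching idea in PHP, assuming a mysqli connection in $db, the parsed spreadsheet rows in $rows, and a hypothetical students table with id and name columns:

$batchSize = 2000; // rows per statement; tune this for your server
foreach (array_chunk($rows, $batchSize) as $batch) {
    $values = array();
    foreach ($batch as $row) {
        // escape values so quotes in the data can't break the query
        $values[] = sprintf("(%d, '%s')",
            (int) $row['id'],
            $db->real_escape_string($row['name']));
    }
    $sql = "INSERT INTO students (id, name) VALUES "
         . implode(", ", $values)
         . " ON DUPLICATE KEY UPDATE name = VALUES(name)";
    $db->query($sql); // one round trip per batch instead of per row
}

At 2,000 rows per statement, 40,000 rows become about 20 queries, which should fit comfortably inside a 60-second limit.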
Break the large query into smaller queries and loop through each batch. Aim for batch sizes that finish well under 60 seconds, so that a spike in server load can't push any single batch over the timeout.
Checking the student ID sounds more reliable than using a time comparison. Are you sure you have an index on student ID? And is student ID using the most suitable data type? E.g. use an integer rather than a string if possible.
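If you're unsure, something along these lines will show what indexes exist and add one (students and id are placeholder names; note that INSERT ... ON DUPLICATE KEY UPDATE also relies on id being a PRIMARY or UNIQUE key):

SHOW INDEX FROM students;                   -- list existing indexes on the table
ALTER TABLE students ADD PRIMARY KEY (id);  -- or ADD UNIQUE KEY (id) if a PK already exists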
Lastly, I don’t suppose you can request a longer query timeout from the administrator… it would greatly reduce code complexity.
Thanks for all your answers.
I have finished my project. I found a way to overcome these memory issues: as I already mentioned, I get my files in the form of an Excel sheet, and Excel can easily be converted to CSV.
I used the LOAD DATA INFILE feature.
LOAD DATA INFILE takes the CSV and loads the entire file in one shot, with no need to run the query again and again.
Also, thanks guys for suggesting INSERT ... ON DUPLICATE KEY UPDATE.
This answer will give you a clear idea of what I have done.
https://stackoverflow.com/questions/15271202/mysql-load-data-infile-with-on-duplicate-key-update
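In short, the pattern from that answer looks like this: load the CSV into a staging table, then upsert from it into the real table. The file path, table, and column names here are placeholders, and the delimiters should be adjusted to match your CSV.

CREATE TEMPORARY TABLE students_stage LIKE students;

LOAD DATA INFILE '/path/to/students.csv'
INTO TABLE students_stage
FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 LINES              -- skip the CSV header row
(id, name);

INSERT INTO students (id, name)
SELECT id, name FROM students_stage
ON DUPLICATE KEY UPDATE name = VALUES(name);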
As suggested in that answer, you can use the INSERT ... ON DUPLICATE KEY UPDATE method. But if you have a large number of columns with heavy data, you should build the query as a batched prepared statement and execute it once it is prepared (see the sketch below).
OR
You can create a single INSERT query per row and execute it. But this method is very expensive in terms of overall script execution time and memory usage.
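For completeness, a minimal sketch of the batched prepared-statement approach mentioned above, assuming a PDO connection in $pdo, parsed rows in $rows, and a hypothetical students table with id and name columns:

$batchSize = 1000; // keep the placeholder count manageable
foreach (array_chunk($rows, $batchSize) as $batch) {
    // build one "(?, ?)" placeholder group per row in this batch
    $groups = implode(", ", array_fill(0, count($batch), "(?, ?)"));
    $stmt = $pdo->prepare(
        "INSERT INTO students (id, name) VALUES $groups" .
        " ON DUPLICATE KEY UPDATE name = VALUES(name)");
    $params = array();
    foreach ($batch as $row) {
        $params[] = $row['id'];
        $params[] = $row['name'];
    }
    $stmt->execute($params); // one execution per batch, not per row
}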