Here’s the repository with all the details.
Essentially, we have a script that needs to fetch and transform data from some API and then insert it into the database. That data ends up being around 10-20 million rows. MySQL should have no problem dealing with that, so at first, since the script was written in Rust with sqlx, we assumed the problem was with sqlx. To rule that out, after writing the insert test script above (in the repo) in Rust, I rewrote it in Java 11 (close to what we use in production already), and the same problem occurred.
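For reference, the Java version of the test script boils down to roughly this (a simplified sketch, not the exact code from the repo; the table, columns, and connection string here are placeholders):

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BulkInsertTest {
    // Placeholder URL/credentials/schema; the real ones are in the repo.
    // rewriteBatchedStatements makes Connector/J send multi-row INSERTs.
    private static final String URL =
        "jdbc:mysql://localhost:3306/test?rewriteBatchedStatements=true";
    private static final int BATCH_SIZE = 10_000;

    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(URL, "user", "password");
             PreparedStatement ps = conn.prepareStatement(
                 "INSERT INTO entries (id, payload) VALUES (?, ?)")) {
            conn.setAutoCommit(false);
            for (long i = 0; i < 20_000_000L; i++) {
                ps.setLong(1, i);
                ps.setString(2, "row-" + i);
                ps.addBatch();
                if ((i + 1) % BATCH_SIZE == 0) {
                    ps.executeBatch(); // flush one bulk of rows
                    conn.commit();     // commit per batch to keep transactions small
                }
            }
            ps.executeBatch();
            conn.commit();
        }
    }
}
```

Watching mysqld's memory while this runs is how we see the linear growth.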
We’ve searched around and tried a multitude of things: inspecting MySQL status, tweaking various MySQL/InnoDB settings, reducing the bulk size, even dropping the prepared statement in Java entirely, but no matter what, the memory growth is still there and still linear.
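The "no prepared statement" variant just concatenates a multi-row INSERT and sends it through a plain Statement, roughly like this (again simplified, placeholder schema):

```java
import java.sql.Connection;
import java.sql.Statement;

final class PlainStatementInsert {
    // One chunk of the "no prepared statement" variant: build a multi-row
    // INSERT as a plain string and push it through a regular Statement.
    static void insertChunk(Connection conn, long startId, int rows) throws Exception {
        StringBuilder sql = new StringBuilder("INSERT INTO entries (id, payload) VALUES ");
        for (int i = 0; i < rows; i++) {
            long id = startId + i;
            if (i > 0) sql.append(',');
            sql.append('(').append(id).append(",'row-").append(id).append("')");
        }
        try (Statement st = conn.createStatement()) {
            st.executeUpdate(sql.toString());
        }
    }
}
```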
Having MySQL progressively take more disk space is expected; after all, we’re inserting data in bulk here. But taking more and more RAM is definitely not, and it will end up crashing our servers if we let mysqld keep going in this condition.
While seeking help elsewhere, someone mentioned that InnoDB is probably using memory-mapped files, which would be a plausible explanation, but then the problem becomes the measurements themselves, and I have no clue how to make them precise enough to tell whether memory is actually leaking. 🙁
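On the measurement side, one thing we can do besides watching RSS from the outside is ask MySQL for its own accounting through performance_schema/sys (this assumes the memory instrumentation is enabled for the relevant instruments). If MySQL's internal totals stay flat while RSS keeps climbing, that would point at mmap/allocator behavior rather than an actual leak. A quick probe, again just a sketch:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class MemProbe {
    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://localhost:3306/test"; // placeholder URL
        try (Connection conn = DriverManager.getConnection(url, "user", "password");
             Statement st = conn.createStatement()) {
            // Total memory MySQL itself thinks it has allocated.
            try (ResultSet rs = st.executeQuery(
                    "SELECT total_allocated FROM sys.memory_global_total")) {
                if (rs.next()) System.out.println("server total: " + rs.getString(1));
            }
            // Top allocation sites, to see which component (if any) is growing.
            try (ResultSet rs = st.executeQuery(
                    "SELECT event_name, current_alloc "
                    + "FROM sys.memory_global_by_current_bytes LIMIT 10")) {
                while (rs.next()) {
                    System.out.println(rs.getString(1) + " -> " + rs.getString(2));
                }
            }
        }
    }
}
```

Running something like that periodically during the bulk insert should at least show whether the growth is attributed to a specific component (buffer pool, etc.) or is invisible to MySQL itself.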
Does anyone have any clue what could possibly be happening here?