Depending on the size of your zabbix environment, you might want to migrate from mysql to timescaledb. This is especially true if you're generating a lot of values per second in your database.
Migrating data from mysql to timescaledb is a complex process that might require fundamental unterstanding of relational databases. If you don't fully understand what's happening here your data might be at risk - so make sure to read the full guide before applying any steps on your environment.
Why we're migrating data
You might have collected data on zabbix for a long time and the value of the information stored in this dataset is highly important for your company to make decisions on infrastructure configurations in the future.
So, there's one question that's very important to answer at the beginning:
We're quite conservative regarding storing new values per second - this has been a requirement in the past as we're using zabbix since version 1.8. Around 10 years ago you still tought if you need the data you're storing or not - not just collecting because it's possible to do so. :-) This approach is still ongoing and we're using features like storing values only once every hour (if it's unchanged) heavily to keep data growth in a practical range. Even though we're generating more than 3k VPS on our database - which accumulates to more than 259million records per day.
This amount of data ingestion is no longer the primary purpose of a relational database like mysql, even if you use partitioning and optimize the I/O configuration to streamline data when it's written to disk.
Long story short: Timescaledb is optimized for such data and keeps insertion rates constant over a growing set of data. Query performance is great and the amount of required disk space is even less than using a relational database when compression is enabled. So let's check out how we can move our data to timescaledb.
How we're migrating data
We'll migrate the data in several steps which will be covered in dedicated articles that will be linked below as they are going to be available.
The overall process will make use of the following techniques:
- Docker containers
- pgloader (https://github.com/dimitri/pgloader)
That's all the tools we need.
This series contains the following parts:
Magic - none yet. You might read about benefits of timescaledb until these are ready or make yourself familar how pgloader works. Until then - stay tuned.