Linux - I/O monitoring and test tools
This post shows a few tools that are useful for checking I/O issues on Linux systems, either by watching current I/O activity or by testing the subsystem explicitly. This time I'm talking about some helpers that can point you to I/O related bottlenecks when working with Linux.
iotop
Obvious, but useful - use iotop to watch the current I/O utilization and the processes associated with it.
Total DISK READ :       0.00 B/s | Total DISK WRITE :       6.82 M/s
Actual DISK READ:       0.00 B/s | Actual DISK WRITE:       7.99 M/s
  TID  PRIO  USER      DISK READ   DISK WRITE>  SWAPIN      IO    COMMAND
  425  be/4  mysql      0.00 B/s   1718.11 K/s  0.00 %  1.56 %  mysqld
  424  be/4  mysql      0.00 B/s   1710.37 K/s  0.00 %  1.89 %  mysqld
  422  be/4  mysql      0.00 B/s   1687.15 K/s  0.00 %  0.25 %  mysqld
  423  be/4  mysql      0.00 B/s   1687.15 K/s  0.00 %  1.28 %  mysqld
  430  be/4  mysql      0.00 B/s    178.00 K/s  0.00 %  0.00 %  mysqld
This is quite handy when there's a bottleneck and you want a quick overview of the potential sources.
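If you only want to see processes that are actually doing I/O, iotop's filter flags help. A minimal sketch (run as root; flags as in current iotop builds):

# only show processes that currently do i/o, aggregate threads per process,
# accumulate totals since start
iotop -o -P -a

# non-interactive batch mode for logging: 5 samples, 2 seconds apart
iotop -b -o -P -n 5 -d 2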
iostat
If you want to dig deeper, fire up iostat (Red Hat/CentOS ship the tool in the sysstat package).
# iostat
Linux 3.10.0-1127.19.1.el7.x86_64 (linux-host)  11/24/2020  _x86_64_  (8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          10.46    0.00    1.15    1.01    0.00   87.39

Device:            tps    kB_read/s    kB_wrtn/s      kB_read      kB_wrtn
sda             936.82        25.17      7487.69     63104136  18772338510
dm-0            839.47        25.15      7487.65     63064456  18772236000
dm-1              0.00         0.00         0.00         3264          628
dm-2              0.00         0.00         0.00         4424         2068
Here you can see that /dev/sda is writing much more data than it is reading. Keep an eye especially on tps (transfers per second) and on kB_read/s and kB_wrtn/s (kilobytes read/written per second). When hitting a bottleneck, you might spot the issue here if the values differ from what you'd expect of the underlying system.
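Note that without arguments iostat reports averages since boot. To watch current rates, pass an interval and optionally a count - a small example:

# device statistics only, refreshed every 5 seconds, 10 reports
iostat -d 5 10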
If you need to go deeper and see which partition gets the most transfers, use the -p flag.
This provides insight on a per-partition level. On the system where this snapshot was taken, data is mostly appended to a database, so most I/O is writes to the last partition - which matches our expectations.
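A minimal sketch of such a call (the device name is an assumption - replace sda with your disk):

# per-partition statistics for sda, refreshed every 5 seconds
iostat -p sda 5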
If you need to dig even deeper, use the extended statistics flag (-x).
The most useful counters here are the await values (r_await and w_await) - they show the average time in milliseconds a request takes to be served, i.e. the latency.
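For example (the interval is just illustrative):

# extended statistics including r_await/w_await, refreshed every 5 seconds
iostat -x 5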
ioping
A very handy tool to check the current latency is ioping. Checking read or write latency is really easy with it. Here are some examples:
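(The targets below are placeholders - point ioping at the directory, file, or device you want to test.)

# read latency of the filesystem under the current directory,
# 10 requests using direct i/o (bypasses the page cache)
ioping -c 10 -D .

# write latency - ioping creates (and cleans up) a temporary file
# in the target directory
ioping -c 10 -D -W /tmp

# seek rate test on a raw device (read-only)
ioping -R /dev/sda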
It's quite useful to work with a reasonable working set size (-S) here and to enable direct I/O (-D flag), which bypasses the kernel caches.
If you want to check latency, use small request sizes (-s, 4k to 32k); if you want to check throughput, use larger values (-s, 4M to 8M).
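Putting that together, a latency check and a throughput check might look like this (the 1G working set is an assumption, chosen to defeat locality):

# latency: small random requests with direct i/o over a larger working set
ioping -c 20 -s 4k -S 1G -D .

# throughput: large requests
ioping -c 10 -s 8M -D .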
Obviously there are plenty of other tools out there that may help you find useful information about I/O issues. For a common, quick test of whether storage may be the problem, I mostly use the tools shown above - hopefully they'll help you too sometimes.