Linux - i/o monitor and test tools

This time i'm talking about some helpers that may help you to check i/lo related bottlenecks when working with linux.

iotop

Obviously, but useful - use iotop to watch current io utilization and associated processes.

Total DISK READ :       0.00 B/s | Total DISK WRITE :       6.82 M/s
Actual DISK READ:       0.00 B/s | Actual DISK WRITE:       7.99 M/s
   TID  PRIO  USER     DISK READ DISK WRITE>  SWAPIN      IO    COMMAND
   425 be/4 mysql       0.00 B/s 1718.11 K/s  0.00 %  1.56 % mysqld
   424 be/4 mysql       0.00 B/s 1710.37 K/s  0.00 %  1.89 % mysqld
   422 be/4 mysql       0.00 B/s 1687.15 K/s  0.00 %  0.25 % mysqld
   423 be/4 mysql       0.00 B/s 1687.15 K/s  0.00 %  1.28 % mysqld
   430 be/4 mysql       0.00 B/s  178.00 K/s  0.00 %  0.00 % mysqld

This is quite handy if you need to check when there's a bottleneck and you want to get a quick overview about potential sources.

iostat

If you want to get deeper, fire up iostat (redhat/centos provide the tool in the sysstat package).

# iostat
Linux 3.10.0-1127.19.1.el7.x86_64 (linux-host)        11/24/2020      _x86_64_        (8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          10.46    0.00    1.15    1.01    0.00   87.39

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda             936.82        25.17      7487.69   63104136 18772338510
dm-0            839.47        25.15      7487.65   63064456 18772236000
dm-1              0.00         0.00         0.00       3264        628
dm-2              0.00         0.00         0.00       4424       2068

Here you can see that /dev/sda is writing much more data than actually reading. You should especially keep an eye on tps (transfers per second) and kB_read/s + kB_wrtn/s (KB read/written per second). When having bottlenecks, you might spot an issue here, if values differ from expectation of the underlying system.

If you need to go deeper and see what partition has most transfers, use the p flag.

iostat -p sda
Linux 3.10.0-1127.19.1.el7.x86_64 (linux-host)        11/24/2020      _x86_64_        (8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          10.46    0.00    1.15    1.01    0.00   87.39

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda             936.82        25.17      7487.70   63104136 18772373641
sda1              0.00         0.00         0.00       3768         59
sda2              0.00         0.01         0.04      19430      99753
sda3             29.79         1.33       994.47    3326653 2493222157
sda4             60.61         0.40       493.07     997624 1236175147
sda5             12.07         0.14        99.55     355444  249585935
sda6            719.95        23.29      5900.57   58399475 14793290587

iostat with partition flag

This provides insights on a per partition level. On the system taking the snapshot data is written most of the time into a database (append) - so most i/o is write on the currently last partition. This matches our expectations.

If you need to dig even deeper, use the extended flag (x).

iostat -x -p
Linux 3.10.0-1127.19.1.el7.x86_64 (linux-host)        11/24/2020      _x86_64_        (8 CPU)

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
          10.46    0.00    1.15    1.01    0.00   87.39

Device:         rrqm/s   wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz   await r_await w_await  svctm  %util
sda               0.00     0.10    3.04  933.79    25.17  7487.76    16.04     1.50    1.59    1.32    1.59   0.12  10.90
sda1              0.00     0.00    0.00    0.00     0.00     0.00    18.95     0.00    5.64    6.10    1.71   1.39   0.00
sda2              0.00     0.00    0.00    0.00     0.01     0.04   147.69     0.00    4.21    0.75   21.30   0.90   0.00
sda3              0.00     0.06    0.07   29.71     1.33   994.47    66.87     0.30   10.22    3.66   10.23   2.40   7.16
sda4              0.00     0.00    0.05   60.56     0.40   493.03    16.28     0.73   11.98    1.08   11.99   1.09   6.61
sda5              0.00     0.00    0.02   12.05     0.14    99.54    16.52     0.16   12.98   12.58   12.98   1.91   2.30
sda6              0.00     0.04    2.90  717.06    23.29  5900.68    16.46     0.22    0.31    1.20    0.31   0.13   9.14
dm-0              0.00     0.00    3.04  836.44    25.15  7487.72    17.90     1.51    1.79    1.46    1.79   0.13  10.90
dm-1              0.00     0.00    0.00    0.00     0.00     0.00    26.75     0.00    6.53    6.59    6.47   2.20   0.00
dm-2              0.00     0.00    0.00    0.00     0.00     0.00    25.71     0.00    3.89    3.60   17.00   3.72   0.00

iostat with extended flag

Most useful counters here are the await values - these indicate the latency.

ioping

A very handy tool to check the current latency is ioping. Checking read or write latency is really easy here. Here are some examples:

# ioping -c 10 -s 4k -S 64M -D /tmp   
4 KiB <<< /tmp (xfs /dev/dm-0): request=1 time=414.5 us (warmup)
4 KiB <<< /tmp (xfs /dev/dm-0): request=2 time=424.7 us
4 KiB <<< /tmp (xfs /dev/dm-0): request=3 time=272.1 us
4 KiB <<< /tmp (xfs /dev/dm-0): request=4 time=324.7 us
4 KiB <<< /tmp (xfs /dev/dm-0): request=5 time=830.8 us
4 KiB <<< /tmp (xfs /dev/dm-0): request=6 time=422.8 us
4 KiB <<< /tmp (xfs /dev/dm-0): request=7 time=312.3 us
4 KiB <<< /tmp (xfs /dev/dm-0): request=8 time=376.5 us
4 KiB <<< /tmp (xfs /dev/dm-0): request=9 time=416.8 us
4 KiB <<< /tmp (xfs /dev/dm-0): request=10 time=370.6 us

--- /tmp (xfs /dev/dm-0) ioping statistics ---
9 requests completed in 3.75 ms, 36 KiB read, 2.40 k iops, 9.37 MiB/s
generated 10 requests in 9.00 s, 40 KiB, 1 iops, 4.44 KiB/s
min/avg/max/mdev = 272.1 us / 416.8 us / 830.8 us / 154.8 us

read latency

# ioping -c 10 -W -s 4k -S 64M -D /tmp
4 KiB >>> /tmp (xfs /dev/dm-0): request=1 time=509.7 us (warmup)
4 KiB >>> /tmp (xfs /dev/dm-0): request=2 time=1.08 ms
4 KiB >>> /tmp (xfs /dev/dm-0): request=3 time=805.0 us
4 KiB >>> /tmp (xfs /dev/dm-0): request=4 time=717.0 us
4 KiB >>> /tmp (xfs /dev/dm-0): request=5 time=962.5 us
4 KiB >>> /tmp (xfs /dev/dm-0): request=6 time=900.3 us
4 KiB >>> /tmp (xfs /dev/dm-0): request=7 time=620.4 us (fast)
4 KiB >>> /tmp (xfs /dev/dm-0): request=8 time=1.27 ms (slow)
4 KiB >>> /tmp (xfs /dev/dm-0): request=9 time=918.5 us
4 KiB >>> /tmp (xfs /dev/dm-0): request=10 time=1.01 ms

--- /tmp (xfs /dev/dm-0) ioping statistics ---
9 requests completed in 8.29 ms, 36 KiB written, 1.08 k iops, 4.24 MiB/s
generated 10 requests in 9.00 s, 40 KiB, 1 iops, 4.44 KiB/s
min/avg/max/mdev = 620.4 us / 921.4 us / 1.27 ms / 184.8 us

write latency

It's quite useful to use a reasonable working set here and use DIRECT_IO (-D flag) - this bypasses kernel caches, especially.

If you want to check latency, use small sizes (-s, 4k - 32k) - if you want to check throughput use larger values (-s, 4M-8M).

Obviously there are plenty other tools around there, that may help to find useful information about i/o issues. For a common and quick test to check whether storage may be an issue I mostly use the tools shown above - so hopefully these may help you too sometimes.