一则利用bcc工具包的filetop和opensnoop工具跟踪IO的案例
一 背景说明
两台机器,在server1上运行一个Docker容器,跑的是Python程序的Flask应用,提供一个访问单词热度的接口;server2(172.16.17.36)向server1发起请求调用。
这个案例有意思的是,从server1上明明看到有消耗高CPU资源和磁盘IO的进程,但是,不太直观的看到该进程具体读写的是哪个文件。于是,这里给出不同的分析思路和解决方案。
server1: 172.16.17.7,4C 8G;
server2: 172.16.11.36,24C 64G;
二 分析步骤和思路
1 server1启动container
docker run --name=app -p 10000:80 -itd feisky/word-pop [root@guoxin7 ~]# docker run --name=app -p 10000:80 -itd feisky/word-pop 3ea1b081cf6d9565b487a56481d0e1182b96d58ae6a33b0f14b1ccc57400ee5f [root@guoxin7 ~]#
2 server2验证服务正常
[root@centos-master ~]# curl http://172.16.17.7:10000 hello world [root@centos-master ~]#
3 server2向server1发起请求
curl http://172.16.17.7:10000/popularity/word
发现响应特别慢。
4 server1 查看top
Tasks: 244 total, 1 running, 243 sleeping, 0 stopped, 0 zombie %Cpu(s): 11.3 us, 13.8 sy, 0.0 ni, 74.8 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st KiB Mem : 41037092 total, 35284336 free, 951280 used, 4801476 buff/cache KiB Swap: 8257532 total, 8257532 free, 0 used. 38038384 avail Mem PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 26611 root 20 0 97048 20532 4056 S 98.2 0.1 1:10.09 /usr/local/bin/python /app.py 6 root 20 0 0 0 0 S 0.4 0.0 0:00.13 [ksoftirqd/0] 9 root 20 0 0 0 0 S 0.4 0.0 0:09.86 [rcu_sched] ...
发现server1 的进程26611耗费了比较高的CPU资源,top界面按下c字符键,调出进程执行的命令,看到该进程执行了/usr/local/bin/python /app.py 。
5 server1通过iostat查看系统IO资源情况
通过iostat -dx 1; -d 列出磁盘设备利用率;-x 列出extended statistics ;数字1,每秒输出1个报告;
[root@guoxin7 ~]# iostat -dx 1 Linux 3.10.0-1160.59.1.el7.x86_64 (guoxin7) 2022年03月21日 _x86_64_ (4 CPU) Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sda 0.02 0.24 4.36 18.55 433.67 6473.81 603.08 10.89 474.14 16.02 581.70 6.01 13.77 scd0 0.00 0.00 0.00 0.00 0.15 0.00 114.22 0.00 6.89 6.89 0.00 6.67 0.00 dm-0 0.00 0.00 4.28 18.80 430.82 6481.31 598.76 11.02 476.23 16.71 580.93 5.96 13.75 dm-1 0.00 0.00 0.01 0.00 0.33 0.00 50.09 0.00 1.77 1.77 0.00 1.57 0.00 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sda 0.00 0.00 0.00 156.00 0.00 73668.00 944.46 144.42 1061.29 0.00 1061.29 6.42 100.10 scd0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 dm-0 0.00 0.00 0.00 160.00 0.00 75732.00 946.65 145.41 1039.44 0.00 1039.44 6.26 100.10 dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sda 0.00 0.00 0.00 257.00 0.00 123472.00 960.87 143.15 929.06 0.00 929.06 3.89 100.00 scd0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 dm-0 0.00 0.00 0.00 256.00 0.00 122032.00 953.38 145.58 938.41 0.00 938.41 3.95 101.00 dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sda 0.00 0.00 1.00 183.00 8.00 85364.00 927.96 144.19 474.96 119.00 476.90 5.43 100.00 scd0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 dm-0 0.00 0.00 1.00 194.00 8.00 89984.00 922.99 145.18 451.15 119.00 452.86 5.13 100.00 dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sda 0.00 0.00 0.00 184.00 0.00 85640.00 930.87 146.85 1091.28 0.00 1091.28 5.43 100.00 scd0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 dm-0 0.00 0.00 0.00 161.00 0.00 75776.00 941.32 147.84 1253.72 0.00 1253.72 6.21 100.00 dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sda 0.00 0.00 0.00 318.00 0.00 148888.00 936.40 144.58 520.76 0.00 520.76 3.14 100.00 scd0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 dm-0 0.00 0.00 0.00 323.00 0.00 150236.00 930.25 145.55 517.09 0.00 517.09 3.10 100.00 dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await r_await w_await svctm %util sda 0.00 0.00 0.00 157.00 0.00 71612.00 912.25 147.31 453.65 0.00 453.65 6.37 100.00 scd0 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 dm-0 0.00 0.00 0.00 163.00 0.00 74660.00 916.07 148.31 438.91 0.00 438.91 6.13 100.00 dm-1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
持续观察,可以看到,系统的磁盘使用率有抖动,%util有时会达到100%,如上。iops(r/s+w/s)以及throughout(rkB/s+wkB/s)也会明显的变化。这意味着,系统的磁盘IO使用率会随着客户端发起请求,持续升高。那么,究竟是哪个进程在消耗较高的磁盘IO呢?
6 pidstat跟踪server1上进程的IO使用情况
[root@guoxin7 ~]# pidstat -p 26611 1 Linux 3.10.0-1160.59.1.el7.x86_64 (guoxin7) 2022年03月21日 _x86_64_ (4 CPU) 13时49分22秒 UID PID %usr %system %guest %CPU CPU Command 13时49分23秒 0 26611 63.00 38.00 0.00 100.00 0 python 13时49分24秒 0 26611 62.00 38.00 0.00 100.00 0 python 13时49分25秒 0 26611 44.00 29.00 0.00 73.00 0 python 13时49分26秒 0 26611 0.00 7.00 0.00 7.00 2 python 13时49分27秒 0 26611 2.00 48.00 0.00 50.00 2 python 13时49分28秒 0 26611 30.00 19.00 0.00 49.00 2 python 13时49分29秒 0 26611 62.00 39.00 0.00 100.00 0 python 13时49分30秒 0 26611 64.00 36.00 0.00 100.00 2 python 13时49分31秒 0 26611 62.00 38.00 0.00 100.00 3 python 13时49分32秒 0 26611 64.00 37.00 0.00 100.00 1 python 13时49分33秒 0 26611 63.00 38.00 0.00 100.00 1 python 13时49分34秒 0 26611 62.00 37.00 0.00 99.00 1 python ^C 平均时间: 0 26611 48.17 33.67 0.00 81.83 - python [root@guoxin7 ~]#
pidstat -d,列出进程磁盘使用率信息:
[root@guoxin7 ~]# pidstat -d -p 26611 1 Linux 3.10.0-1160.59.1.el7.x86_64 (guoxin7) 2022年03月21日 _x86_64_ (4 CPU) 13时50分12秒 UID PID kB_rd/s kB_wr/s kB_ccwr/s Command 13时50分13秒 0 26611 0.00 169382.18 0.00 python 13时50分14秒 0 26611 0.00 109853.47 0.00 python 13时50分15秒 0 26611 0.00 134661.39 0.00 python 13时50分16秒 0 26611 0.00 122948.00 0.00 python 13时50分17秒 0 26611 0.00 259640.00 0.00 python 13时50分18秒 0 26611 0.00 284580.00 0.00 python 13时50分19秒 0 26611 0.00 293028.00 0.00 python 13时50分20秒 0 26611 0.00 299768.00 0.00 python 13时50分21秒 0 26611 0.00 279688.00 0.00 python 13时50分22秒 0 26611 0.00 286160.00 0.00 python 13时50分23秒 0 26611 0.00 288036.00 0.00 python 13时50分24秒 0 26611 0.00 291524.00 0.00 python 13时50分25秒 0 26611 0.00 22760.00 0.00 python 13时50分26秒 0 26611 0.00 0.00 0.00 python 13时50分27秒 0 26611 0.00 0.00 0.00 python 13时50分28秒 0 26611 0.00 0.00 0.00 python 13时50分29秒 0 26611 0.00 0.00 0.00 python 13时50分30秒 0 26611 0.00 0.00 0.00 python 13时50分31秒 0 26611 0.00 0.00 778244.00 python 13时50分32秒 0 26611 0.00 0.00 2025068.00 python 13时50分33秒 0 26611 0.00 162403.96 0.00 python 13时50分34秒 0 26611 0.00 282608.00 0.00 python 13时50分35秒 0 26611 0.00 282020.00 0.00 python 13时50分36秒 0 26611 0.00 278336.00 0.00 python ^C 平均时间: 0 26611 0.00 160281.20 116610.32 python [root@guoxin7 ~]#
的确,看到26611进程的磁盘IO比较高。
7 strace跟踪进程26611的读写文件
[root@guoxin7 ~]# strace -p 26611 strace: Process 26611 attached select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=62570}) = 0 (Timeout) futex(0x7ffcab9dd974, FUTEX_WAIT_PRIVATE, 2, {tv_sec=0, tv_nsec=4998819}) = -1 EAGAIN (资源暂时不可用) futex(0x7ffcab9dd974, FUTEX_WAIT_PRIVATE, 2, {tv_sec=0, tv_nsec=4999436}) = 0 stat("/usr/local/lib/python3.7/importlib/_bootstrap.py", {st_mode=S_IFREG|0644, st_size=39278, ...}) = 0 stat("/usr/local/lib/python3.7/importlib/_bootstrap.py", {st_mode=S_IFREG|0644, st_size=39278, ...}) = 0 stat("/usr/local/lib/python3.7/importlib/_bootstrap.py", {st_mode=S_IFREG|0644, st_size=39278, ...}) = 0 stat("/usr/local/lib/python3.7/importlib/_bootstrap_external.py", {st_mode=S_IFREG|0644, st_size=59012, ...}) = 0 stat("/usr/local/lib/python3.7/importlib/_bootstrap_external.py", {st_mode=S_IFREG|0644, st_size=59012, ...}) = 0 stat("/usr/local/lib/python3.7/importlib/_bootstrap_external.py", {st_mode=S_IFREG|0644, st_size=59012, ...}) = 0 stat("/usr/local/lib/python3.7/encodings/__init__.py", {st_mode=S_IFREG|0644, st_size=5668, ...}) = 0 stat("/usr/local/lib/python3.7/encodings/__init__.py", {st_mode=S_IFREG|0644, st_size=5668, ...}) = 0 stat("/usr/local/lib/python3.7/encodings/__init__.py", {st_mode=S_IFREG|0644, st_size=5668, ...}) = 0 futex(0x7ffcab9dd8b4, FUTEX_WAIT_PRIVATE, 2, {tv_sec=0, tv_nsec=4999442}) = -1 ETIMEDOUT (连接超时) futex(0x7ffcab9dd8b4, FUTEX_WAIT_PRIVATE, 2, {tv_sec=0, tv_nsec=4998549}) = 0 futex(0x7f557934c2a4, FUTEX_WAKE_PRIVATE, 1) = 1 futex(0x7f557eb2b324, FUTEX_WAKE_PRIVATE, 1) = 1 futex(0x7f557eb2b2cc, FUTEX_WAKE_PRIVATE, 1) = 1 futex(0x7f557934c234, FUTEX_WAKE_PRIVATE, 1) = 1 futex(0x7f557eb2b2cc, FUTEX_WAKE_PRIVATE, 1) = 1 stat("/usr/local/lib/python3.7/codecs.py", {st_mode=S_IFREG|0644, st_size=36287, ...}) = 0 stat("/usr/local/lib/python3.7/codecs.py", {st_mode=S_IFREG|0644, st_size=36287, ...}) = 0 futex(0x7ffcab9dd564, FUTEX_WAIT_PRIVATE, 2, {tv_sec=0, tv_nsec=4999526}) = -1 EAGAIN (资源暂时不可用) futex(0x7ffcab9dd564, FUTEX_WAIT_PRIVATE, 2, {tv_sec=0, tv_nsec=4999624}) = 0 stat("/usr/local/lib/python3.7/codecs.py", {st_mode=S_IFREG|0644, st_size=36287, .. ... ...
并没有看到进程26611对应的读写文件。怎么办呢?有没有其它思路?
8 通过BCC的filetop来跟踪
由于在操作系统层面并没有跟踪到进程26611实际读写的文件是哪个?我们通过BCC的filetop来跟踪内核里读写文件的信息。
安装BCC(BPF Compiler Collection)工具包,调用其filetop查看File reads and writes by filename and process。
安装之后,位于/usr/share/bcc/tools路径下:-C表示当前屏幕不清屏的方式持续输出,第1列,表示线程号,COMM线程执行的命令,READS+WRITES分别表示读写次数,R_Kb、W_Kb表示的是读写的大小,T表示方式,R是Read,FILE列,表示的读写的文件名。
[root@guoxin7 tools]# ./filetop -C Tracing... Output every 1 secs. Hit Ctrl-C to end 14:01:07 loadavg: 3.76 2.56 2.18 4/282 7336 TID COMM READS WRITES R_Kb W_Kb T FILE 7336 python 0 1 0 5175 R 84.txt 7336 python 0 1 0 4785 R 113.txt 7336 python 0 1 0 4589 R 134.txt 7336 python 0 1 0 4492 R 150.txt 7336 python 0 1 0 4443 R 123.txt 7336 python 0 1 0 4394 R 85.txt 7336 python 0 1 0 4345 R 56.txt 7336 python 0 1 0 4296 R 106.txt 7336 python 0 1 0 4248 R 73.txt 7336 python 0 1 0 4052 R 59.txt 7336 python 0 1 0 4052 R 107.txt 7336 python 0 1 0 3662 R 147.txt 7336 python 0 1 0 3466 R 118.txt 7336 python 0 1 0 3369 R 89.txt 7336 python 0 1 0 3369 R 69.txt 7336 python 0 1 0 3369 R 128.txt 7336 python 0 1 0 3320 R 67.txt 7336 python 0 1 0 3320 R 119.txt 7336 python 0 1 0 3271 R 114.txt 7336 python 0 1 0 3222 R 61.txt 14:01:08 loadavg: 3.76 2.56 2.18 3/282 7336 TID COMM READS WRITES R_Kb W_Kb T FILE 7336 python 0 1 0 4394 R 397.txt 7336 python 0 1 0 4296 R 388.txt 7336 python 0 1 0 4199 R 365.txt 7336 python 0 1 0 3955 R 363.txt 7336 python 0 1 0 3857 R 386.txt 7336 python 0 1 0 3857 R 398.txt 7336 python 0 1 0 3808 R 360.txt 7336 python 0 1 0 3710 R 367.txt 7336 python 0 1 0 3710 R 392.txt 7336 python 0 1 0 3417 R 362.txt 7336 python 0 1 0 3173 R 402.txt 7336 python 0 1 0 3027 R 359.txt 7336 python 0 1 0 3027 R 385.txt 7336 python 0 1 0 2978 R 368.txt 7336 python 0 1 0 2929 R 371.txt 7336 python 0 1 0 2929 R 378.txt 7336 python 0 1 0 2880 R 361.txt 7336 python 0 1 0 2880 R 381.txt 7336 python 0 1 0 2832 R 387.txt 7336 python 0 1 0 2783 R 380.txt Detaching... [root@guoxin7 tools]#
9 BCC的opensnoop查看内核open执行的系统调用
同样使用BCC工具包下提供的另一个工具,opensnoop。
[root@guoxin7 tools]# ./opensnoop -p 26611 PID COMM FD ERR PATH 26611 python 6 0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/0.txt 26611 python 6 0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/1.txt 26611 python 6 0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/2.txt 26611 python 6 0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/3.txt 26611 python 6 0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/4.txt 26611 python 6 0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/5.txt 26611 python 6 0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/6.txt 26611 python 6 0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/7.txt 26611 python 6 0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/8.txt 26611 python 6 0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/9.txt 26611 python 6 0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/10.txt 26611 python 6 0 /tmp/83425c82-a8dd-11ec-b3db-0242ac110002/11.txt ...
这次,可以看到26611进程读取的文件的路径,持续跟踪,发现文件的路径会发生变化。大概率猜测一下,可能是前面的Python进程,在创建临时文件,然后操作处理之后,删除了文件。
10 查看Docker 应用代码
进入container内部,查看app.py源代码:
[root@guoxin7 tools]# docker ps CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 3ea1b081cf6d feisky/word-pop "python /app.py" About an hour ago Up About an hour 0.0.0.0:10000->80/tcp app [root@guoxin7 tools]# docker exec -it app /bin/sh / # ls app.py bin dev etc home lib media mnt proc root run sbin srv sys tmp usr var / # cat app.py #!/usr/bin/env python # -*- coding: UTF-8 -*- import os import uuid import random import shutil from concurrent.futures import ThreadPoolExecutor from flask import Flask, jsonify ... @app.route("/popularity/<word>") def word_popularity(word): dir_path = '/tmp/{}'.format(uuid.uuid1()) count = 0 sample_size = 1000 def save_to_file(file_name, content): with open(file_name, 'w') as f: f.write(content) try: # initial directory firstly os.mkdir(dir_path) # save article to files for i in range(sample_size): file_name = '{}/{}.txt'.format(dir_path, i) article = generate_article() save_to_file(file_name, article) # count word popularity for root, dirs, files in os.walk(dir_path): for file_name in files: with open('{}/{}'.format(dir_path, file_name)) as f: if validate(word, f.read()): count += 1 finally: # clean files shutil.rmtree(dir_path, ignore_errors=True) return jsonify({'popularity': count / sample_size * 100, 'word': word}) ...
的确,发现了创建临时目录和文件,然后处理之后,再删除的操作。
三 小结
1 分析思路
- 先top看到高CPU的进程;
- 然后,iostat -dx 1;查看系统IO
- 接下来,pidstat -d -p pid 1 ;查看指定进程的磁盘IO开销;
- 想要通过strace -p pid;找到进程读写的文件信息未果;
- 到此,转而使用BCC工具包的filetop查看内核中的文件读写信息,找到基本的文件名等读写相关信息;
- 最后,继续使用BCC工具包的opensnoop工具查看内核中open调用的栈信息;
2 另外一种思路
使用strace -f -p pid;可以跟踪进程及其线程的读写信息:
[root@guoxin7 tools]# strace -f -p 26611|grep Write strace: Process 26611 attached with 3 threads [pid 12362] close(6) = 0 [pid 26612] restart_syscall(<... resuming interrupted poll ...> <unfinished ...> [pid 26611] select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=989194} <unfinished ...> [pid 12362] mmap(NULL, 3002368, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5578b8e000 [pid 12362] madvise(0x5644e1264000, 438272, MADV_DONTNEED) = 0 [pid 12362] munmap(0x7f5578e6b000, 1953792) = 0 [pid 12362] open("/tmp/7d5596d8-a8e0-11ec-b3db-0242ac110002/737.txt", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 6 [pid 12362] fcntl(6, F_SETFD, FD_CLOEXEC) = 0 [pid 12362] fstat(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 [pid 12362] ioctl(6, TIOCGWINSZ, 0x7f557934bff0) = -1 ENOTTY (对设备不适当的 ioctl 操作) [pid 12362] lseek(6, 0, SEEK_CUR) = 0 [pid 12362] ioctl(6, TIOCGWINSZ, 0x7f557934bf10) = -1 ENOTTY (对设备不适当的 ioctl 操作) [pid 12362] lseek(6, 0, SEEK_CUR) = 0 [pid 12362] mmap(NULL, 3002368, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5578f2f000 [pid 12362] write(6, "The king explodes a cat with rab"..., 2999999) = 2999999 [pid 12362] munmap(0x7f5578f2f000, 3002368) = 0 [pid 12362] close(6) = 0 [pid 26612] <... restart_syscall resumed>) = 0 [pid 26612] futex(0x7f5579450874, FUTEX_WAIT_PRIVATE, 2, {tv_sec=0, tv_nsec=4999199} <unfinished ...> [pid 12362] mmap(NULL, 2752512, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5578f6c000 [pid 12362] madvise(0x5644e1264000, 438272, MADV_DONTNEED) = 0 [pid 12362] munmap(0x7f5578b8e000, 3002368) = 0 [pid 12362] futex(0x7f5579450874, FUTEX_WAKE_PRIVATE, 1) = 1 [pid 12362] open("/tmp/7d5596d8-a8e0-11ec-b3db-0242ac110002/738.txt", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666 <unfinished ...> [pid 26612] <... futex resumed>) = 0 [pid 26612] poll([{fd=4, events=POLLIN}], 1, 500 <unfinished ...> [pid 12362] <... open resumed>) = 6 [pid 12362] fcntl(6, F_SETFD, FD_CLOEXEC) = 0 [pid 12362] fstat(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 [pid 12362] ioctl(6, TIOCGWINSZ, 0x7f557934bff0) = -1 ENOTTY (对设备不适当的 ioctl 操作) [pid 12362] lseek(6, 0, SEEK_CUR) = 0 [pid 12362] ioctl(6, TIOCGWINSZ, 0x7f557934bf10) = -1 ENOTTY (对设备不适当的 ioctl 操作) [pid 12362] lseek(6, 0, SEEK_CUR) = 0 [pid 12362] mmap(NULL, 2752512, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5578ccc000 [pid 12362] write(6, "Some guy kicks some guy to know "..., 2749999) = 2749999 [pid 12362] munmap(0x7f5578ccc000, 2752512) = 0 [pid 12362] close(6) = 0 [pid 12362] mmap(NULL, 2002944, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5578d83000 [pid 12362] madvise(0x5644e1264000, 438272, MADV_DONTNEED) = 0 [pid 12362] munmap(0x7f5578f6c000, 2752512) = 0 [pid 12362] open("/tmp/7d5596d8-a8e0-11ec-b3db-0242ac110002/739.txt", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 6 [pid 12362] fcntl(6, F_SETFD, FD_CLOEXEC) = 0 [pid 12362] fstat(6, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0 [pid 12362] ioctl(6, TIOCGWINSZ, 0x7f557934bff0) = -1 ENOTTY (对设备不适当的 ioctl 操作) [pid 12362] lseek(6, 0, SEEK_CUR) = 0 [pid 12362] ioctl(6, TIOCGWINSZ, 0x7f557934bf10) = -1 ENOTTY (对设备不适当的 ioctl 操作) [pid 12362] lseek(6, 0, SEEK_CUR) = 0 [pid 12362] mmap(NULL, 2002944, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f5579023000 [pid 12362] write(6, "The king treats some guy to make"..., 1999999) = 1999999 [pid 12362] munmap(0x7f5579023000, 2002944) = 0 [pid 12362] close(6) = 0 [pid 12362] mmap(NULL, 4100096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f557899a000 [pid 12362] madvise(0x5644e1264000, 438272, MADV_DONTNEED) = 0 [pid 12362] munmap(0x7f5578d83000, 2002944) = 0 [pid 12362] open("/tmp/7d5596d8-a8e0-11ec-b3db-0242ac110002/740.txt", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0666) = 6
3 个人收获
学习到strace -f -p pid;可以跟踪进程以及其线程的文件读写信息;
学会安装使用BCC工具包中的filetop跟踪内核中文件读写信息;opensnoop;跟踪内核中使用open系统调用的栈信息。
4 关于filetop和opensnoop的使用帮助
可以分别加上-h命令行查看,或者到GitHub站点查看相关示例。
四 参考和链接
1 倪朋飞老师的Linux性能优化实战,27 | 案例篇:为什么我的磁盘I/O延迟很高?;
一条评论
Pingback: