An Index of Common Linux Performance Tuning Tools



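Linux servers regularly run into system-level and application-level problems, and diagnosing them takes good tools. This article indexes commonly used tools and tracing tools, and closes with the distributed-system tracing tools that the Hadoop community has been developing recently.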

Overview:
http://www.brendangregg.com/index.html
http://www.slideshare.net/brendangregg/linux-performance-analysis-and-tools
https://github.com/brendangregg/perf-tools/
http://www.brendangregg.com/linuxperf.html
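The figure from linux-performance-analysis-and-tools shows which layer of the system each of these tools applies to.
[Figure: tool map from linux-performance-analysis-and-tools]
Most of the tools it mentions are in my everyday toolbox or have come up in real cases I worked on, and all of them are very valuable. They are indexed here for convenience: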

nicstat: see here

oprofile: see here

perf: see here

systemtap: see here

iotop: see here

blktrace: see here

dstat: see here

strace: see here

pidstat: see here

vmstat: see here

slabtop: see here

tcpdump: see here

free: see here

mpstat: see here

netstat: see here

tcprstat: see here

OS commands
System information (RHEL/Fedora)

uname -a or cat /proc/version #print system information

Linux hadoopst2.cm6 2.6.18-164.el5 #1 SMP Tue Aug 18 15:51:48 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

uptime

15:42:46 up 674 days, 6 min, 35 users, load average: 1.30, 5.97, 11.53

cat /etc/redhat-release

Red Hat Enterprise Linux Server release 5.4 (Tikanga)

lsb_release

LSB Version: :core-3.1-amd64:core-3.1-ia32:core-3.1-noarch:graphics-3.1-amd64:graphics-3.1-ia32:graphics-3.1-noarch

cat /proc/cpuinfo

cat /proc/meminfo

lspci - list all PCI devices

lsusb - list USB devices

last, lastb - show listing of last logged in users

lsmod - show the status of modules in the Linux Kernel

modprobe - add and remove modules from the Linux Kernel
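The system information commands above can be reduced to a few numbers for a quick snapshot. The following is a minimal sketch, assuming a Linux /proc filesystem; the function names count_cpus, mem_total_kb and load_averages are illustrative and not part of any standard tool.

```shell
#!/usr/bin/env bash
# Illustrative helpers; the names are assumptions, not standard commands.

# count_cpus [FILE]: count logical CPUs listed in a cpuinfo-format file.
count_cpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# mem_total_kb [FILE]: print the MemTotal value (in kB) from a meminfo-format file.
mem_total_kb() {
    awk '/^MemTotal:/ {print $2}' "${1:-/proc/meminfo}"
}

# load_averages: read one line of `uptime` output on stdin and print only
# the 1, 5 and 15 minute load averages.
load_averages() {
    sed -n 's/.*load average: *//p'
}
```

For example, `uptime | load_averages` prints just the three load figures, and `count_cpus` gives the number to compare them against: a 1-minute load well above the CPU count suggests runnable or blocked tasks are queuing.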

Common commands/tools

To print a process tree: ps -ejH / ps axjf

To get info about threads: ps -eLf / ps axms

ulimit -a

lsof - list open files; on UNIX, everything is a file

lsof -p PID

rpm/yum

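rpm -qf FILE #the rpm package that owns FILE

rpm -ql RPM #the files contained in the rpm package

/var/log/yum.log #log of packages updated by yum

/etc/XXX #system-wide program configuration directories, e.g.

/etc/yum.repos.d/ yum repository configuration

/var/log/cron #cron log; shows whether scheduled jobs actually ran

ntpd - Network Time Protocol (NTP) daemon; keeps the clocks of the machines in a cluster synchronized

squid - proxy caching server; used as a proxy for the cluster's web UIs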
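A frequent use of ulimit -a and lsof -p PID together is checking how close a process is to its open-file limit. The sketch below reads the same information from /proc instead of calling lsof; it assumes Linux, and the function names are illustrative.

```shell
#!/usr/bin/env bash
# Illustrative helpers; the names are assumptions, not standard commands.

# open_fd_count PID: number of file descriptors the process currently holds,
# counted from the entries under /proc/PID/fd (Linux only).
open_fd_count() {
    ls "/proc/$1/fd" 2>/dev/null | wc -l | tr -d ' '
}

# soft_nofile_limit FILE: soft "Max open files" limit read from a file in
# /proc/PID/limits format.
soft_nofile_limit() {
    awk '/^Max open files/ {print $4}' "$1"
}

# fd_report PID: print current usage next to the soft limit.
fd_report() {
    echo "fds=$(open_fd_count "$1") limit=$(soft_nofile_limit "/proc/$1/limits")"
}
```

Running `fd_report PID` for a long-lived daemon on a regular basis makes a descriptor leak visible well before the process starts failing with "Too many open files". Reading another user's /proc/PID/fd needs the same privileges as lsof -p.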

System monitoring

mpstat - Report processors related statistics. Watch the %sys and %iowait values

vmstat - Report virtual memory statistics

iostat - Report Central Processing Unit (CPU) statistics and input/output statistics for devices and partitions.

netstat - Print network connections, routing tables, interface statistics, masquerade connections, and multicast memberships

netstat -atpn | grep PID

ganglia - a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids.

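sar/tsar - Collect, report, or save system activity information; tsar is Taobao's own improved version

Samples on a schedule (every minute) and keeps history that can be queried (5-minute granularity by default); it complements ganglia by showing more detailed information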

iftop - the "top" bandwidth consumers shown. iftop wiki

iotop

vmtouch, Portable file system cache diagnostics and control
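When mpstat is not installed, the %sys and %iowait figures mentioned above can be approximated from two samples of the aggregate "cpu" line in /proc/stat. This is a rough sketch, not mpstat's exact algorithm; the function name cpu_sys_iowait is illustrative, and it assumes the usual field order of user, nice, system, idle, iowait, irq, softirq, steal.

```shell
#!/usr/bin/env bash
# cpu_sys_iowait: read two or more aggregate "cpu" lines (in /proc/stat
# format) on stdin and print the share of time spent in system mode and in
# iowait between each pair of consecutive samples.
cpu_sys_iowait() {
    awk '$1 == "cpu" {
        total = 0
        # Fields 2..9 are user, nice, system, idle, iowait, irq, softirq, steal.
        for (i = 2; i <= 9 && i <= NF; i++) total += $i
        if (seen) {
            dt = total - prev_total
            if (dt > 0)
                printf "sys=%.1f%% iowait=%.1f%%\n", 100 * ($4 - prev_sys) / dt, 100 * ($6 - prev_iow) / dt
        }
        prev_total = total; prev_sys = $4; prev_iow = $6; seen = 1
    }'
}

# Usage: take two samples one second apart.
#   { grep '^cpu ' /proc/stat; sleep 1; grep '^cpu ' /proc/stat; } | cpu_sys_iowait
```

The counters in /proc/stat are cumulative since boot, which is why the function works on the difference between two samples and not on a single reading.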

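Networking

telnet/nc IP PORT - check whether the target port is reachable; a successful ping does not mean the port is reachable, because a firewall or similar may block it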

ifconfig/ifup/ifdown - configure a network interface

traceroute - print the route packets trace to network host

nslookup - query Internet name servers interactively

tcpdump - dump traffic on a network; similar open-source tools include wireshark and netsniff-ng; see the comparison of further tools

lynx - a general purpose distributed information browser for the World Wide Web

tcpcp - allows cooperating applications to pass ownership of TCP connection endpoints from one Linux host to another one.
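The port check done by hand with telnet or nc can also be scripted, which helps when many host and port pairs in a cluster need testing. The sketch below assumes bash with its /dev/tcp pseudo-device enabled and the coreutils timeout command; the function name port_open is illustrative.

```shell
#!/usr/bin/env bash
# port_open HOST PORT [TIMEOUT_SECONDS]
# Returns 0 if a TCP connection to HOST:PORT succeeds within the timeout
# (default 3 seconds), non-zero if it is refused, filtered, or times out.
port_open() {
    local host=$1 port=$2 secs=${3:-3}
    # The child bash opens the connection on fd 3 and exits immediately,
    # which closes it again; only the exit status matters.
    timeout "$secs" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Usage (host and port are placeholders):
#   port_open HOST PORT && echo reachable || echo "refused or filtered"
```

A timeout usually points at a firewall silently dropping packets, while an immediate failure usually means nothing is listening on the port.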

Programs/processes
Static information

ldconfig - configure dynamic linker run time bindings

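ldconfig -p | grep SO - check whether a shared object is in the linker cache

ldd - print shared library dependencies; shows which shared objects an executable or .so depends on

nm - list symbols from object files; grep the output to check whether a symbol exists and whether it is undefined

readelf - Displays information about ELF files. Shows ELF details such as 32/64-bit class, target OS, and processor

Dynamic information

cat /proc/$PID/[cmdline|environ|limits|status|...] - per-process information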

pstack - print a stack trace of a running process

pmap - report memory map of a process
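Individual values can be pulled out of the /proc/$PID files listed above without reading them by eye. A minimal sketch, assuming the "Key:<whitespace>value" layout used by /proc/PID/status; the function name status_field is illustrative.

```shell
#!/usr/bin/env bash
# status_field FIELD FILE: print the first value of FIELD from a file in
# /proc/PID/status format, e.g. the resident set size or the thread count.
status_field() {
    awk -v key="$1:" '$1 == key {print $2; exit}' "$2"
}

# Usage (PID is a placeholder):
#   status_field VmRSS   /proc/PID/status   # resident memory in kB
#   status_field Threads /proc/PID/status   # number of threads
```

Sampling VmRSS periodically gives a cheap memory-growth curve for a process, which can then be examined in detail with pmap.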

Java

JDK Tools and Utilities

Java Troubleshooting Tools

jinfo - print java process information, e.g. classpath and java.library.path (the directory of JNI shared objects)

jstack - print a stack trace of a running java process; can be used to spot deadlocks

jmap - report memory map of a java process

jmap -histo:live - triggers a full GC

jmap -dump:live,file=$FILE - dumps the heap so that tools such as jhat can analyze how much heap space objects occupy

jhat - Heap Dump Browser - Starts a web server on a heap dump file (eg, produced by jmap -dump), allowing the heap to be browsed.

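Starts an HTTP server; open it in a browser to view the results

-J-mxXXXm - increase the heap size when analyzing large dump files

If some object's data is unusually large or it occupies too much memory, a memory leak is very likely

Memory Analyzer (MAT) - eclipse plugin, Java heap analyzer

A visual tool, but limited by the machine's memory, so it cannot analyze very large heap dump files

jdb - can run as a server that tools such as eclipse connect to for remote debugging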

jstat - Java Virtual Machine Statistics Monitoring Tool

jstatd - Virtual Machine jstat Daemon; works together with jvisualvm

jvisualvm - Java Virtual Machine Monitoring, Troubleshooting, and Profiling Tool; can connect remotely to jstatd/JMX; a visualization tool: demo

jvmtop - In a top-like manner, displays JVM internal metrics (e.g. memory information) of running java processes.

JVM performance optimization - a series of optimization articles written by a JVM developer

Overview

Compilers

Garbage collection

Concurrently compacting GC

Scalability

HPROF - Heap Profiler: java -agentlib:hprof
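Before taking a full heap dump, the class histogram printed by jmap -histo is often enough to see which classes dominate the heap. The sketch below filters that output down to the first few rows; it assumes the usual "num: #instances #bytes class name" row layout, and the function name histo_top is illustrative.

```shell
#!/usr/bin/env bash
# histo_top [N]: read `jmap -histo` output on stdin and print the first N
# class rows (default 10) as: bytes <TAB> instances <TAB> class name.
# jmap already sorts rows by bytes, so the first rows are the largest.
histo_top() {
    awk -v n="${1:-10}" '$1 ~ /^[0-9]+:$/ && count < n {
        printf "%s\t%s\t%s\n", $3, $2, $4
        count++
    }'
}

# Usage (PID is a placeholder; -histo:live triggers a full GC first):
#   jmap -histo:live PID | histo_top 5
```

Comparing two such listings taken some minutes apart shows which classes keep growing, which is usually a better leak signal than a single large number.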

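Trace/Debug/Profiling tools
General-purpose tools

Write logs; when the system is live in production or the source code is unavailable, use the tools below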

strace - trace system calls and signals

Example: practical uses of strace/ltrace

Example: tracing the time spent in system calls, e.g. when a machine shows high CPU %sys

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 67.90 3966.320849         496   7992161   3050250 futex
 25.80 1507.326693      127093     11860           epoll_wait
....................
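A summary like the one above (the kind produced by strace -c) can be condensed further so that the share of time and the error ratio per system call stand out. This is a minimal sketch; the function name strace_summary is illustrative, and it assumes the column layout shown above, where the errors column may be empty.

```shell
#!/usr/bin/env bash
# strace_summary: read a `strace -c` style table on stdin and print, for
# each system call row: name, share of time, and errors as a share of calls.
strace_summary() {
    awk '$1 ~ /^[0-9.]+$/ && NF >= 5 && $NF != "total" {
        # Rows have 6 fields when the errors column is filled, 5 otherwise.
        errors = (NF >= 6) ? $5 : 0
        printf "%s %s%% errors=%.1f%%\n", $NF, $1, 100 * errors / $4
    }'
}

# Usage (PID is a placeholder; stop strace with Ctrl-C to get the table):
#   strace -c -f -p PID 2> summary.txt; strace_summary < summary.txt
```

Applied to the two rows above, this reports futex at 67.90% of the time with roughly 38% of its calls returning an error, which is the kind of figure worth investigating when %sys is high.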

blktrace, generate traces of the i/o traffic on block devices

ltrace - A library call tracer

xtrace

gprof - a performance analysis tool, sampling and call-graph profiling

valgrind - an instrumentation framework for building dynamic analysis tools. automatically detect many memory management and threading bugs, and profile your programs in detail

systemtap - a simple command line interface and scripting language for writing instrumentation for a live running kernel plus user-space applications for complex tasks that may require live analysis, programmable on-line response, and whole-system symbolic access.

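The Linux counterpart of DTrace (which SUN developed for Solaris)

Powerful: covers the kernel and user-space applications, works across languages (Java, Perl, Python, Ruby), and supports built-in markers (PostgreSQL, MySQL)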

can write and reuse simple scripts to deeply examine the activities of a live system

Data can be extracted, filtered, and summarized quickly and safely, to enable diagnoses of complex performance or functional problems

A rich "tapset" script library

Java tracing tools

btrace - dynamic tracing tool for the Java platform. UserGuide

Uses dynamic bytecode modification (Hotswap) to trace and replace code in a running Java program; how it works

A summary of using BTrace

Detailed introduction

byteman - simplifies tracing and testing of Java programs. Can modify a running application without needing to stop and restart it.

define rules specifying the side effects you want to inject, whereas BTrace scripts use a Java-like syntax

Distributed Tracing Tools

Dapper, a Large-Scale Distributed Systems Tracing Infrastructure

x-trace, a network diagnostic tool designed to provide users and network operators with better visibility into increasingly complex Internet applications.

HTrace, a tracing framework intended for use with distributed systems written in Java

Add Tracing to HDFS

Update HTrace for HBase

Linux observability tools
Linux benchmarking tools
Linux tuning tools
Linux observability sar

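Brendan Gregg is currently a senior performance architect at Netflix, where he does large-scale computer performance design, analysis, and tuning. He is the author of Systems Performance and other technical books, and received the 2013 USENIX LISA award for his achievements in system administration. He was previously a performance lead and kernel engineer at SUN, where he developed the ZFS L2ARC and studied storage and network performance. He has also invented and developed a large number of performance analysis tools, many of which are now included in operating systems. His recent work includes research on performance analysis methodology and visualization, with the Linux kernel among its targets.

That is Gregg's bio. As it says, his personal site shares many Linux performance resources, all of which he developed himself.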
