A Concise AWK Tutorial

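A few readers who saw my post from two days ago, "Linux Tips You Should Know", asked me to teach them awk and sed, hence this article. I suspect these young friends born in the 1980s may be a little unfamiliar with ancient relics like awk and sed, so it falls to an old-timer like me to reheat some leftovers. Besides, AWK is a text-processing tool that came out of Bell Labs in 1977. This year is the Year of the Snake, which makes it AWK's zodiac year, and AWK is about my age too, so it well deserves an article.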

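The name AWK comes from the first letters of the family names of its three creators: Alfred Aho, Peter Weinberger, and Brian Kernighan. If you want to learn AWK, one book has to be mentioned, the classic The AWK Programming Language, which is rated 9.4 on Douban! On Amazon it sells for an astonishing 1022.30 yuan.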

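This tutorial does not try to be comprehensive. Like my earlier introduction to Go, it is all examples and almost no filler.

I have just two goals:

1) You can finish reading it on the bus or subway during your commute, or while sitting on the toilet (one sitting is enough, guaranteed).

2) I want this post to work like a sultry striptease dancer that gets you interested; after that, you have to put in the work yourself.

Enough talk, let's start stripping (note: this only goes as far as topless).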

Stepping on stage

I extracted the following from the output of the netstat command to use as sample data:

$ cat netstat.txt
Proto Recv-Q Send-Q Local-Address          Foreign-Address             State
tcp        0      0 0.0.0.0:3306           0.0.0.0:*                   LISTEN
tcp        0      0 0.0.0.0:80             0.0.0.0:*                   LISTEN
tcp        0      0 127.0.0.1:9000         0.0.0.0:*                   LISTEN
tcp        0      0 sou-ip.com:80        124.205.5.146:18245         TIME_WAIT
tcp        0      0 sou-ip.com:80        61.140.101.185:37538        FIN_WAIT2
tcp        0      0 sou-ip.com:80        110.194.134.189:1032        ESTABLISHED
tcp        0      0 sou-ip.com:80        123.169.124.111:49809       ESTABLISHED
tcp        0      0 sou-ip.com:80        116.234.127.77:11502        FIN_WAIT2
tcp        0      0 sou-ip.com:80        123.169.124.111:49829       ESTABLISHED
tcp        0      0 sou-ip.com:80        183.60.215.36:36970         TIME_WAIT
tcp        0   4166 sou-ip.com:80        61.148.242.38:30901         ESTABLISHED
tcp        0      1 sou-ip.com:80        124.152.181.209:26825       FIN_WAIT1
tcp        0      0 sou-ip.com:80        110.194.134.189:4796        ESTABLISHED
tcp        0      0 sou-ip.com:80        183.60.212.163:51082        TIME_WAIT
tcp        0      1 sou-ip.com:80        208.115.113.92:50601        LAST_ACK
tcp        0      0 sou-ip.com:80        123.169.124.111:49840       ESTABLISHED
tcp        0      0 sou-ip.com:80        117.136.20.85:50025         FIN_WAIT2
tcp        0      0 :::22                  :::*                        LISTEN

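Here is the simplest and most common awk example. It prints columns 1 and 4:

  • The part inside the single quotes, enclosed in braces, is the awk statement. Note that it should be wrapped in single quotes; inside double quotes the shell would expand $1 and $4 itself before awk ever sees them.
  • $1..$n refer to the nth column. Note: $0 is the whole line.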
$ awk '{print $1, $4}' netstat.txt
Proto Local-Address
tcp 0.0.0.0:3306
tcp 0.0.0.0:80
tcp 127.0.0.1:9000
tcp sou-ip.com:80
tcp sou-ip.com:80
tcp sou-ip.com:80
tcp sou-ip.com:80
tcp sou-ip.com:80
tcp sou-ip.com:80
tcp sou-ip.com:80
tcp sou-ip.com:80
tcp sou-ip.com:80
tcp sou-ip.com:80
tcp sou-ip.com:80
tcp sou-ip.com:80
tcp sou-ip.com:80
tcp sou-ip.com:80
tcp :::22
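
To see the difference between a single column and $0 without needing netstat.txt, here is a minimal sketch. The input line and the shell variable names (line, cols, whole) are made up for illustration.

```shell
# One hypothetical input line, fed to awk on standard input.
line='tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN'

# $1 and $4 pick out single columns.
cols=$(printf '%s\n' "$line" | awk '{print $1, $4}')

# $0 is the whole line, unchanged.
whole=$(printf '%s\n' "$line" | awk '{print $0}')

echo "$cols"    # tcp 0.0.0.0:3306
echo "$whole"   # tcp 0 0 0.0.0.0:3306 0.0.0.0:* LISTEN
```

The comma in print puts the output field separator (a space by default) between the two columns.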

Now let's look at awk's formatted output, which works just like printf in C:

$ awk '{printf "%-8s %-8s %-8s %-18s %-22s %-15s\n",$1,$2,$3,$4,$5,$6}' netstat.txt
Proto    Recv-Q   Send-Q   Local-Address      Foreign-Address        State
tcp      0        0        0.0.0.0:3306       0.0.0.0:*              LISTEN
tcp      0        0        0.0.0.0:80         0.0.0.0:*              LISTEN
tcp      0        0        127.0.0.1:9000     0.0.0.0:*              LISTEN
tcp      0        0        sou-ip.com:80      124.205.5.146:18245    TIME_WAIT
tcp      0        0        sou-ip.com:80      61.140.101.185:37538   FIN_WAIT2
tcp      0        0        sou-ip.com:80      110.194.134.189:1032   ESTABLISHED
tcp      0        0        sou-ip.com:80      123.169.124.111:49809  ESTABLISHED
tcp      0        0        sou-ip.com:80      116.234.127.77:11502   FIN_WAIT2
tcp      0        0        sou-ip.com:80      123.169.124.111:49829  ESTABLISHED
tcp      0        0        sou-ip.com:80      183.60.215.36:36970    TIME_WAIT
tcp      0        4166     sou-ip.com:80      61.148.242.38:30901    ESTABLISHED
tcp      0        1        sou-ip.com:80      124.152.181.209:26825  FIN_WAIT1
tcp      0        0        sou-ip.com:80      110.194.134.189:4796   ESTABLISHED
tcp      0        0        sou-ip.com:80      183.60.212.163:51082   TIME_WAIT
tcp      0        1        sou-ip.com:80      208.115.113.92:50601   LAST_ACK
tcp      0        0        sou-ip.com:80      123.169.124.111:49840  ESTABLISHED
tcp      0        0        sou-ip.com:80      117.136.20.85:50025    FIN_WAIT2
tcp      0        0        :::22              :::*                   LISTEN
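
The width and alignment flags are easier to see with visible delimiters. This is a small sketch on a made-up three-field line; the brackets are only there to show the padding.

```shell
# %-6s left-aligns a string in 6 columns; %4d right-aligns an integer in 4.
out=$(printf 'tcp 0 LISTEN\n' | awk '{printf "[%-6s][%4d][%s]\n", $1, $2, $3}')

echo "$out"   # [tcp   ][   0][LISTEN]
```

Unlike print, printf does not add a newline by itself, which is why the format string ends in \n.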

Taking off the coat

Filtering records

Now let's see how to filter records (the condition below is: column 3 equals 0 && column 6 equals LISTEN)

$ awk '$3==0 && $6=="LISTEN" ' netstat.txt
tcp        0      0 0.0.0.0:3306               0.0.0.0:*              LISTEN
tcp        0      0 0.0.0.0:80                 0.0.0.0:*              LISTEN
tcp        0      0 127.0.0.1:9000             0.0.0.0:*              LISTEN
tcp        0      0 :::22                      :::*                   LISTEN

Here "==" is a comparison operator. The other comparison operators are: !=, >, <, >=, <=
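
A quick sketch of two of the other operators, run on made-up "name value" pairs rather than on netstat.txt (the variable names data, ge and ne are only for illustration):

```shell
# Hypothetical sample data: a name and a number per line.
data='a 5
b 0
c 12'

# >= keeps the lines whose second column is at least 5.
ge=$(printf '%s\n' "$data" | awk '$2 >= 5 {print $1}')

# != keeps the lines whose second column is not 5.
ne=$(printf '%s\n' "$data" | awk '$2 != 5 {print $1}')

echo "$ge"   # a and c, one per line
echo "$ne"   # b and c, one per line
```

Fields that look like numbers are compared as numbers, so 12 >= 5 is true here even though "12" sorts before "5" as a string.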

Let's look at various ways of filtering records:

$ awk ' $3>0 {print $0}' netstat.txt
Proto Recv-Q Send-Q Local-Address          Foreign-Address             State
tcp        0   4166 sou-ip.com:80        61.148.242.38:30901         ESTABLISHED
tcp        0      1 sou-ip.com:80        124.152.181.209:26825       FIN_WAIT1
tcp        0      1 sou-ip.com:80        208.115.113.92:50601        LAST_ACK

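Note that the header line only showed up in the $3>0 example because Send-Q is not a number, so it is compared with 0 as a string. If we want the header deliberately, we can bring in the built-in variable NR: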

$ awk '$3==0 && $6=="LISTEN" || NR==1 ' netstat.txt
Proto Recv-Q Send-Q Local-Address          Foreign-Address             State
tcp        0      0 0.0.0.0:3306           0.0.0.0:*                   LISTEN
tcp        0      0 0.0.0.0:80             0.0.0.0:*                   LISTEN
tcp        0      0 127.0.0.1:9000         0.0.0.0:*                   LISTEN
tcp        0      0 :::22                  :::*                        LISTEN

Now with formatted output added:

$ awk '$3==0 && $6=="LISTEN" || NR==1 {printf "%-20s %-20s %s\n",$4,$5,$6}' netstat.txt
Local-Address        Foreign-Address      State
0.0.0.0:3306         0.0.0.0:*            LISTEN
0.0.0.0:80           0.0.0.0:*            LISTEN
127.0.0.1:9000       0.0.0.0:*            LISTEN
:::22                :::*                 LISTEN
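
Built-in variables

Speaking of built-in variables, here are some of the ones awk provides:

$0 The current record (this variable holds the entire line)
$1~$n The nth field of the current record; fields are separated by FS
FS Input field separator; by default fields are split on runs of spaces or tabs
NF Number of fields in the current record, i.e. how many columns it has
NR Number of records read so far, i.e. the line number, starting at 1; with multiple input files it keeps counting up across them
FNR Record number within the current file; unlike NR, it restarts for each file
RS Input record separator; the default is a newline
OFS Output field separator; the default is also a space
ORS Output record separator; the default is a newline
FILENAME Name of the current input file

How do we use them? For example, to print line numbers: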

$ awk '$3==0 && $6=="ESTABLISHED" || NR==1 {printf "%02d %s %-20s %-20s %s\n",NR, FNR, $4,$5,$6}' netstat.txt
01 1 Local-Address        Foreign-Address      State
07 7 sou-ip.com:80        110.194.134.189:1032 ESTABLISHED
08 8 sou-ip.com:80        123.169.124.111:49809 ESTABLISHED
10 10 sou-ip.com:80        123.169.124.111:49829 ESTABLISHED
14 14 sou-ip.com:80        110.194.134.189:4796 ESTABLISHED
17 17 sou-ip.com:80        123.169.124.111:49840 ESTABLISHED
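
With a single input file NR and FNR are always equal, so the difference only shows up with two or more files. Here is a small sketch using two throwaway files; the names one.txt and two.txt and their contents are invented for this example.

```shell
# Create two small hypothetical files in a temporary directory.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one.txt"
printf 'c\n'    > "$dir/two.txt"

# NR keeps counting across files; FNR restarts at 1 for each file.
out=$(cd "$dir" && awk '{print FILENAME, NR, FNR, $0}' one.txt two.txt)

echo "$out"
# one.txt 1 1 a
# one.txt 2 2 b
# two.txt 3 1 c

rm -r "$dir"
```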

Specifying the separator
$  awk  'BEGIN{FS=":"} {print $1,$3,$6}' /etc/passwd
root 0 /root
bin 1 /bin
daemon 2 /sbin
adm 3 /var/adm
lp 4 /var/spool/lpd
sync 5 /sbin
shutdown 6 /sbin
halt 7 /sbin

The command above is equivalent to the following (-F specifies the separator):

$ awk -F: '{print $1,$3,$6}' /etc/passwd

Note: to specify several separators, you can do this:

awk -F '[;:]'
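
A minimal sketch of the bracket form on a made-up line that mixes both separators:

```shell
# The bracket expression makes either ';' or ':' end a field.
out=$(printf 'a;b:c\n' | awk -F '[;:]' '{print NF, $1, $2, $3}')

echo "$out"   # 3 a b c
```

The value given to -F is treated as a regular expression when it is longer than one character, which is why a bracket expression works here.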

Here is another example, this time using \t as the output separator (it uses /etc/passwd, which is colon-separated):

$ awk  -F: '{print $1,$3,$6}' OFS="\t" /etc/passwd
root    0       /root
bin     1       /bin
daemon  2       /sbin
adm     3       /var/adm
lp      4       /var/spool/lpd
sync    5       /sbin
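
One thing worth knowing about OFS: it is only used when awk joins fields itself, either for print with commas or when $0 is rebuilt after a field assignment. The sketch below shows both cases on a made-up passwd-style line, setting OFS in a BEGIN block instead of on the command line.

```shell
# A hypothetical /etc/passwd-style line.
line='root:x:0:0:root:/root:/bin/bash'

# The commas in print are replaced by OFS.
picked=$(printf '%s\n' "$line" | awk -F: 'BEGIN{OFS="-"} {print $1, $3}')

# Assigning to any field (even $1=$1) makes awk rebuild $0 with OFS.
rebuilt=$(printf '%s\n' "$line" | awk -F: 'BEGIN{OFS="-"} {$1=$1; print}')

echo "$picked"    # root-0
echo "$rebuilt"   # root-x-0-0-root-/root-/bin/bash
```

Without the $1=$1 assignment, a plain print would output the line with its original colons.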

Taking off the shirt

String matching

Let's look at a few string-matching examples:

$ awk '$6 ~ /FIN/ || NR==1 {print NR,$4,$5,$6}' OFS="\t" netstat.txt
1       Local-Address   Foreign-Address State
6       sou-ip.com:80 61.140.101.185:37538    FIN_WAIT2
9       sou-ip.com:80 116.234.127.77:11502    FIN_WAIT2
13      sou-ip.com:80 124.152.181.209:26825   FIN_WAIT1
18      sou-ip.com:80 117.136.20.85:50025     FIN_WAIT2

$ awk '$6 ~ /WAIT/ || NR==1 {print NR,$4,$5,$6}' OFS="\t" netstat.txt
1       Local-Address   Foreign-Address State
5       sou-ip.com:80 124.205.5.146:18245     TIME_WAIT
6       sou-ip.com:80 61.140.101.185:37538    FIN_WAIT2
9       sou-ip.com:80 116.234.127.77:11502    FIN_WAIT2
11      sou-ip.com:80 183.60.215.36:36970     TIME_WAIT
13      sou-ip.com:80 124.152.181.209:26825   FIN_WAIT1
15      sou-ip.com:80 183.60.212.163:51082    TIME_WAIT
18      sou-ip.com:80 117.136.20.85:50025     FIN_WAIT2

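The first example above matches the FIN states, and the second matches the states containing WAIT. Here ~ is the match operator and whatever sits between / / is the pattern, so this is a regular-expression match.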
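
Because the pattern is a regular expression, anchors work too. Here is a small sketch on a made-up list of states (one per line, so the state is $1 here rather than $6):

```shell
# Hypothetical list of connection states, one per line.
states='TIME_WAIT
FIN_WAIT2
LISTEN'

# ^ anchors the match to the start of the field.
starts_fin=$(printf '%s\n' "$states" | awk '$1 ~ /^FIN/')

# $ anchors the match to the end of the field.
ends_wait=$(printf '%s\n' "$states" | awk '$1 ~ /WAIT$/')

echo "$starts_fin"   # FIN_WAIT2
echo "$ends_wait"    # TIME_WAIT
```

FIN_WAIT2 contains WAIT but does not end with it, so the second pattern skips it.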

In fact awk can match against the whole line, just like grep, like this:

$ awk '/LISTEN/' netstat.txt
tcp        0      0 0.0.0.0:3306            0.0.0.0:*               LISTEN
tcp        0      0 0.0.0.0:80              0.0.0.0:*               LISTEN
tcp        0      0 127.0.0.1:9000          0.0.0.0:*               LISTEN
tcp        0      0 :::22                   :::*                    LISTEN

We can use "/FIN|TIME/" to match FIN or TIME:

$ awk '$6 ~ /FIN|TIME/ || NR==1 {print NR,$4,$5,$6}' OFS="\t" netstat.txt
1       Local-Address   Foreign-Address State
5       sou-ip.com:80 124.205.5.146:18245     TIME_WAIT
6       sou-ip.com:80 61.140.101.185:37538    FIN_WAIT2
9       sou-ip.com:80 116.234.127.77:11502    FIN_WAIT2
11      sou-ip.com:80 183.60.215.36:36970     TIME_WAIT
13      sou-ip.com:80 124.152.181.209:26825   FIN_WAIT1
15      sou-ip.com:80 183.60.212.163:51082    TIME_WAIT
18      sou-ip.com:80 117.136.20.85:50025     FIN_WAIT2

Now an example of negating the pattern:

$ awk '$6 !~ /WAIT/ || NR==1 {print NR,$4,$5,$6}' OFS="\t" netstat.txt
1       Local-Address   Foreign-Address State
2       0.0.0.0:3306    0.0.0.0:*       LISTEN
3       0.0.0.0:80      0.0.0.0:*       LISTEN
4       127.0.0.1:9000  0.0.0.0:*       LISTEN
7       sou-ip.com:80 110.194.134.189:1032    ESTABLISHED
8       sou-ip.com:80 123.169.124.111:49809   ESTABLISHED
10      sou-ip.com:80 123.169.124.111:49829   ESTABLISHED
12      sou-ip.com:80 61.148.242.38:30901     ESTABLISHED
14      sou-ip.com:80 110.194.134.189:4796    ESTABLISHED
16      sou-ip.com:80 208.115.113.92:50601    LAST_ACK
17      sou-ip.com:80 123.169.124.111:49840   ESTABLISHED
19      :::22   :::*    LISTEN

Or, to negate a match against the whole line:

awk '!/WAIT/' netstat.txt
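
The two negations are not quite the same thing: !/WAIT/ tests the whole line, while $n !~ /WAIT/ tests one field only. A sketch on made-up data where WAIT also appears in the first column:

```shell
# Hypothetical data: WAIT shows up in column 1 of the first line
# and in column 2 of the second line.
data='WAITER up
server WAIT
client up'

# Whole-line negation drops every line that contains WAIT anywhere.
whole=$(printf '%s\n' "$data" | awk '!/WAIT/')

# Field negation only looks at column 2, so the first line survives.
field=$(printf '%s\n' "$data" | awk '$2 !~ /WAIT/')

echo "$whole"   # client up
echo "$field"   # WAITER up, then client up
```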

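Splitting files

Splitting a file with awk is easy: just use redirection. The example below splits the file by column 6, which is really simple (NR!=1 means the header is skipped).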

$ awk 'NR!=1{print > $6}' netstat.txt

$ ls
ESTABLISHED  FIN_WAIT1  FIN_WAIT2  LAST_ACK  LISTEN  netstat.txt  TIME_WAIT

$ cat ESTABLISHED
tcp        0      0 sou-ip.com:80        110.194.134.189:1032        ESTABLISHED
tcp        0      0 sou-ip.com:80        123.169.124.111:49809       ESTABLISHED
tcp        0      0 sou-ip.com:80        123.169.124.111:49829       ESTABLISHED
tcp        0   4166 sou-ip.com:80        61.148.242.38:30901         ESTABLISHED
tcp        0      0 sou-ip.com:80        110.194.134.189:4796        ESTABLISHED
tcp        0      0 sou-ip.com:80        123.169.124.111:49840       ESTABLISHED

$ cat FIN_WAIT1
tcp        0      1 sou-ip.com:80        124.152.181.209:26825       FIN_WAIT1

$ cat FIN_WAIT2
tcp        0      0 sou-ip.com:80        61.140.101.185:37538        FIN_WAIT2
tcp        0      0 sou-ip.com:80        116.234.127.77:11502        FIN_WAIT2
tcp        0      0 sou-ip.com:80        117.136.20.85:50025         FIN_WAIT2

$ cat LAST_ACK
tcp        0      1 sou-ip.com:80        208.115.113.92:50601        LAST_ACK

$ cat LISTEN
tcp        0      0 0.0.0.0:3306           0.0.0.0:*                   LISTEN
tcp        0      0 0.0.0.0:80             0.0.0.0:*                   LISTEN
tcp        0      0 127.0.0.1:9000         0.0.0.0:*                   LISTEN
tcp        0      0 :::22                  :::*                        LISTEN

$ cat TIME_WAIT
tcp        0      0 sou-ip.com:80        124.205.5.146:18245         TIME_WAIT
tcp        0      0 sou-ip.com:80        183.60.215.36:36970         TIME_WAIT
tcp        0      0 sou-ip.com:80        183.60.212.163:51082        TIME_WAIT

You can also write only selected columns to the files:

awk 'NR!=1{print $4,$5 > $6}' netstat.txt
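
If you want to try the redirection without littering your current directory, here is a sketch that does the split inside a temporary directory. The sample data, the .txt suffix and the variable names are assumptions made for this example.

```shell
# Work in a throwaway directory so no real files are touched.
dir=$(mktemp -d)

# Hypothetical sample: a header line, then "name group" pairs.
printf 'name group\nx A\ny B\nz A\n' > "$dir/in.txt"

# Skip the header, then write column 1 to a file named after column 2.
# The parentheses keep the concatenated file name unambiguous.
awk -v dir="$dir" 'NR != 1 { print $1 > (dir "/" $2 ".txt") }' "$dir/in.txt"

group_a=$(cat "$dir/A.txt")
group_b=$(cat "$dir/B.txt")
echo "$group_a"   # x and z, one per line
echo "$group_b"   # y

rm -r "$dir"
```

Inside awk, > truncates a file only the first time it is opened during the run; later prints to the same name append to it, which is why A.txt ends up with both x and z.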

A bit more complex (note the if-else-if statement, which shows that awk is really a script interpreter):

$ awk 'NR!=1{if($6 ~ /TIME|ESTABLISHED/) print > "1.txt";
else if($6 ~ /LISTEN/) print > "2.txt";
else print > "3.txt" }' netstat.txt

$ ls ?.txt
1.txt  2.txt  3.txt

$ cat 1.txt
tcp        0      0 sou-ip.com:80        124.205.5.146:18245         TIME_WAIT
tcp        0      0 sou-ip.com:80        110.194.134.189:1032        ESTABLISHED
tcp        0      0 sou-ip.com:80        123.169.124.111:49809       ESTABLISHED
tcp        0      0 sou-ip.com:80        123.169.124.111:49829       ESTABLISHED
tcp        0      0 sou-ip.com:80        183.60.215.36:36970         TIME_WAIT
tcp        0   4166 sou-ip.com:80        61.148.242.38:30901         ESTABLISHED
tcp        0      0 sou-ip.com:80        110.194.134.189:4796        ESTABLISHED
tcp        0      0 sou-ip.com:80        183.60.212.163:51082        TIME_WAIT
tcp        0      0 sou-ip.com:80        123.169.124.111:49840       ESTABLISHED

$ cat 2.txt
tcp        0      0 0.0.0.0:3306           0.0.0.0:*                   LISTEN
tcp        0      0 0.0.0.0:80             0.0.0.0:*                   LISTEN
tcp        0      0 127.0.0.1:9000         0.0.0.0:*                   LISTEN
tcp        0      0 :::22                  :::*                        LISTEN

$ cat 3.txt
tcp        0      0 sou-ip.com:80        61.140.101.185:37538        FIN_WAIT2
tcp        0      0 sou-ip.com:80        116.234.127.77:11502        FIN_WAIT2
tcp        0      1 sou-ip.com:80        124.152.181.209:26825       FIN_WAIT1
tcp        0      1 sou-ip.com:80        208.115.113.92:50601        LAST_ACK
tcp        0      0 sou-ip.com:80        117.136.20.85:50025         FIN_WAIT2

Statistics

The command below computes the total size of all the C, CPP, and H files.

$ ls -l  *.cpp *.c *.h | awk '{sum+=$5} END {print sum}'
2511401

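Here is another one that counts connections by state (you can start to see the shadow of real programming here; we are all programmers, so I won't explain it. Note how the array is used):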

$ awk 'NR!=1{a[$6]++;} END {for (i in a) print i ", " a[i];}' netstat.txt
TIME_WAIT, 3
FIN_WAIT1, 1
ESTABLISHED, 6
FIN_WAIT2, 3
LAST_ACK, 1
LISTEN, 4
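
The order in which for (i in a) walks the array is not specified, so the lines can come out in any order. If you need a stable order, one option is to pipe the result through sort. A sketch on made-up states (the array name count is only for illustration):

```shell
# Count how often each value appears, then sort for a stable order.
out=$(printf 'LISTEN\nWAIT\nLISTEN\nLISTEN\n' |
      awk '{count[$1]++} END {for (s in count) print s, count[s]}' |
      LC_ALL=C sort)

echo "$out"
# LISTEN 3
# WAIT 1
```

The array is indexed by strings, and an entry springs into existence with value 0 the first time it is incremented, so no initialisation is needed.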

Next, let's total up how much memory each user's processes are using (note: this sums the RSS column):

$ ps aux | awk 'NR!=1{a[$1]+=$6;} END { for(i in a) print i ", " a[i]"KB";}'
dbus, 540KB
mysql, 99928KB
www, 3264924KB
root, 63644KB
hchen, 6020KB

Off with the Underwear

awk Scripts

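Above we saw the END keyword. END marks the point at which all lines have been processed. Since we've mentioned END, BEGIN deserves an introduction too: the two keywords mean "before processing starts" and "after processing finishes". The syntax is: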

  • BEGIN { statements to run before any input is processed }
  • END { statements to run after all lines have been processed }
  • { statements to run for each line }
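
Before the full example, here is the smallest sketch I can think of that shows all three blocks firing, fed with three throwaway input lines (a, b, c) of my own choosing:

```shell
# BEGIN runs once before any input, the bare block runs once per line,
# and END runs once after the last line has been processed.
result=$(printf 'a\nb\nc\n' | awk '
  BEGIN { print "start" }          # before the first line is read
        { print NR ": " $0 }       # for each input line
  END   { print "lines: " NR }     # after the last line
')
echo "$result"
```

In END, NR still holds the number of the last line read, which is why it can double as a line count.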

To make this concrete, consider the following example.

Suppose we have this file (a table of student scores):

$ cat score.txt
Marry   2143 78 84 77
Jack    2321 66 78 45
Tom     2122 48 77 71
Mike    2537 87 97 95
Bob     2415 40 57 62

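Our awk script is below. (I didn't write it on the command line because that would be hard to read, and because I also want to show another way of running awk.)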

$ cat cal.awk
#!/bin/awk -f
# before processing
BEGIN {
    math = 0
    english = 0
    computer = 0

    printf "NAME    NO.   MATH  ENGLISH  COMPUTER   TOTAL\n"
    printf "---------------------------------------------\n"
}
# during processing (once per line)
{
    math+=$3
    english+=$4
    computer+=$5
    printf "%-6s %-6s %4d %8d %8d %8d\n", $1, $2, $3,$4,$5, $3+$4+$5
}
# after processing
END {
    printf "---------------------------------------------\n"
    printf "  TOTAL:%10d %8d %8d \n", math, english, computer
    printf "AVERAGE:%10.2f %8.2f %8.2f\n", math/NR, english/NR, computer/NR
}

Here is the result of running it. (It can also be run as ./cal.awk score.txt, provided cal.awk has been made executable.)

$ awk -f cal.awk score.txt
NAME    NO.   MATH  ENGLISH  COMPUTER   TOTAL
---------------------------------------------
Marry  2143     78       84       77      239
Jack   2321     66       78       45      189
Tom    2122     48       77       71      196
Mike   2537     87       97       95      279
Bob    2415     40       57       62      159
---------------------------------------------
  TOTAL:       319      393      350
AVERAGE:     63.80    78.60    70.00
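
One detail of cal.awk worth noticing: the averages divide by NR, so running it on an empty file would divide by zero. Below is a minimal sketch of a guarded average, restricted to the math column. The shell function name avg and the "no data" message are my own illustration, not part of the original script.

```shell
# Average of column 3, guarded against empty input (NR is 0 in END then).
avg() {
  awk '{ math += $3 }
       END {
         if (NR > 0) printf "%.2f\n", math / NR
         else        print "no data"
       }'
}

scores='Marry 2143 78 84 77
Jack 2321 66 78 45
Tom 2122 48 77 71
Mike 2537 87 97 95
Bob 2415 40 57 62'

with_data=$(printf '%s\n' "$scores" | avg)   # 319 / 5
without_data=$(avg < /dev/null)              # empty input
echo "$with_data"
echo "$without_data"
```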

Environment Variables

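Since we're on the subject of scripts, let's see how to interact with environment variables. (Use the -v option and ENVIRON; a variable read through ENVIRON must be exported.)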

$ x=5

$ y=10
$ export y

$ echo $x $y
5 10

$ awk -v val=$x '{print $1, $2, $3, $4+val, $5+ENVIRON["y"]}' OFS="\t" score.txt
Marry   2143    78      89      87
Jack    2321    66      83      55
Tom     2122    48      82      81
Mike    2537    87      102     105
Bob     2415    40      62      72
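
To see why the export matters, here is a small sketch that tries both routes on a single line of input and also peeks at ENVIRON["x"], which should be empty because x is never exported. The `unset` at the top and the quoting of "$x" are my additions, there to keep the demo independent of whatever the calling shell already has set.

```shell
unset x y      # start clean, in case the caller already exported these
x=5            # plain shell variable: not visible to child processes
y=10
export y       # exported: awk can read it through ENVIRON

result=$(echo '1 2' | awk -v val="$x" '{
  # val was handed over explicitly with -v; y comes from the environment;
  # x was never exported, so ENVIRON["x"] is an empty string.
  print $1 + val, $2 + ENVIRON["y"], "[" ENVIRON["x"] "]"
}')
echo "$result"
```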

A Few Tricks

Finally, let's look at a few small examples:

# Find the lines in file that are longer than 80 characters
awk 'length>80' file

# List client IPs by number of connections
netstat -ntu | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -nr

# Print the 9x9 multiplication table
seq 9 | sed 'H;g' | awk -v RS='' '{for(i=1;i<=NF;i++)printf("%dx%d=%d%s", i, NR, i*NR, i==NR?"\n":"\t")}'
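
The multiplication-table one-liner is dense: sed 'H;g' turns the output of seq into growing, blank-line-separated paragraphs (1, then 1 2, then 1 2 3, and so on), and RS='' makes awk read each paragraph as one record, so NF equals NR on every record. If that feels like too much sleight of hand, here is a sketch of the same table in plain awk, using two loops in a BEGIN block. The variable names row and col are mine, not from the original.

```shell
# Same table without seq or sed: two nested loops inside a BEGIN block.
# A program that has only BEGIN never reads any input.
table=$(awk 'BEGIN {
  for (row = 1; row <= 9; row++)
    for (col = 1; col <= row; col++)
      printf "%dx%d=%d%s", col, row, col * row, (col == row ? "\n" : "\t")
}')
echo "$table"
```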

Now Do It Yourself

For more on some of the points covered here, see the gawk manual.

(End of article)

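(When reposting articles from this site, please credit the author and the source, 宝酷 – sou-ip, and do not use them for any commercial purpose.)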


Comments on "AWK 简明教程"

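  1. The description of "FNR" under "built-in variables" seems incomplete; it was presumably meant to say something like "this value differs from file to file".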

  2. One addition:
    awk 'BEGIN{service="/inet/tcp/2000/0/0"; service |& getline; print $0; close(service)}'
    This listens on TCP port 2000.

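  3. "Post-80s... young friends..." That brought tears to my eyes. Next to the post-90s and post-00s crowd we're all old men, and now we're suddenly young friends... Seriously, though, among the post-80s programmers around me, there really aren't many who don't use awk...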

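  4. "其中的$1..$n表示第几例。注:$0表示整个行。" (that is, "$1..$n denote the n-th 例; note: $0 denotes the whole line")
    Surely 例 (example) here should be 列 (column)?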

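  5. A post-90s awk fan here, feeling no pressure at all.
    awk is great for writing "embedded" programs, that is, small programs of one to a few lines inlined in a shell script or some other script.
    I once even showed off a little web crawler written in a single line :P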

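  6. CU has a thread called "shell十三篇" that covers both awk and sed.
    Too bad the author is a woman, so she can't tell you to go "do it yourself", haha.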

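  7. I think there's a mistake: awk 'FS=":" {print $1,$3,$6}' is clearly not equivalent to awk -F: '{print $1,$3,$6}'. The former sets FS only after the first record has been read and split into fields, and FS=":" ends up acting as the pattern. Writing it this way is usually wrong.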

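  8. I'd heard of awk before and had looked up some material on it, but could never make sense of it. After reading this article, I finally have a rough idea of how to use it!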

  9. In the output of the separator example (awk 'FS=":" {print $1,$3,$6}' /etc/passwd), why is the entire first line (root:x:0:0:root:/root:/bin/bash) printed?

  10. Tiemo: In the output of the separator example (awk 'FS=":" {print $1,$3,$6}' /etc/passwd), why is the entire first line (root:x:0:0:root:/root:/bin/bash) printed?

    awk 'BEGIN{FS=":"}; {print $1,$3,$6}' /etc/passwd

  11. Tiemo:
    In the output of the separator example (awk 'FS=":" {print $1,$3,$6}' /etc/passwd), why is the entire first line (root:x:0:0:root:/root:/bin/bash) printed?

    Thanks to everyone who pointed out that the BEGIN keyword is needed. I found http://ubuntuforums.org/showthread.php?t=834068, which has this explanation:
    The reason it's not working like you expect is that the way you have things written the assignment to FS isn't executed until awk has read/parsed the first line of input. awk's normal behavior is to execute your code blocks on each line of input, but the parsing of the lines happens before the block is triggered. So in this case awk is parsing the first line with its default delimiter (whitespace) and then executing the block, which sets FS to the "pipe" ("|") for all subsequent lines.

  12. Great content as always! But I have to gripe: who takes this long over one dump?! (゚Д゚)ノ

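  13. I rarely use sed or perl; awk handles basically all of my simple text processing. (I admit I haven't used sed in so long that I've mostly forgotten it; I only remember the p and s commands.)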

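  14. Learned a lot~
    When I read about awk before, I got the impression it was only for processing fields. My current job doesn't need that, so I never looked closely. Only after reading this post did I discover how flexibly awk can be used.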

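  15. By the way, I have a question. When an awk command is used inside a shell script, $0 and $1 don't refer to the field contents. How should this be handled?
    Below are my actual commands. I want to replace grep with awk for the search, since I think that would let me search for several keywords at once.
    grep
    find ${LINE} -printf %m\ %Cy\-%Cm\-%Cd\ %CH\:%CM\ \ %s\K\ \ %p\\n\\n -exec grep -n $1 {} \;

    awk
    find ${LINE} -printf %m\ %Cy\-%Cm\-%Cd\ %CH\:%CM\ \ %s\K\ \ %p\\n\\n;
    awk "/`echo $1`/" ${LINE}
    Changing it to this does replace the search, but it can't output line numbers.
    awk "/`echo $1`/" ${LINE}: how do I add print NR, $0 to this?
    I tried several ways of writing it without success. If you have time, please help me figure it out. Thanks.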
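
Two of the points raised in the comments above lend themselves to a quick check. The sketch below first contrasts FS=":" used as a pattern with FS set in BEGIN (comments 7 and 9 to 11), as POSIX-conforming awks behave. It then shows one possible answer to the question in comment 15: keep the awk program in single quotes and pass the shell value in with -v, so that $0 and NR remain awk's. The sample lines and the function name search are made up for illustration.

```shell
# Two passwd-style lines (made up for the example).
passwd='root:x:0:0:root:/root:/bin/bash
bin:x:1:1:bin:/bin:/sbin/nologin'

# FS=":" as a pattern: line 1 is split on whitespace BEFORE the assignment
# runs, so $1 is the whole line; only later lines are split on ":".
as_pattern=$(printf '%s\n' "$passwd" | awk 'FS=":" { print $1 }')

# Setting FS in BEGIN (or using -F:) takes effect before line 1 is read.
in_begin=$(printf '%s\n' "$passwd" | awk 'BEGIN { FS = ":" } { print $1 }')

echo "$as_pattern"
echo "$in_begin"

# Comment 15: single quotes stop the shell from expanding $0 and $1 inside
# the awk program; the shell value travels in as the awk variable pat.
search() {
  # "$1" on the next line is the first argument of this shell function.
  awk -v pat="$1" '$0 ~ pat { print NR ": " $0 }'
}
hits=$(printf 'alpha\nbeta\ngamma\n' | search '^(b|g)')
echo "$hits"
```

Because pat is an ordinary regular expression, several keywords can be searched at once with alternation, as in the '^(b|g)' call above.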
