File name:
Awk - A Tutorial and Introduction - by Bruce Barnett.pdf
Upload date: 2019-08-31
Description: Awk is an extremely versatile programming language for working on files. We'll teach you just enough to understand the
examples in this page, plus a smidgen.
The examples given below have the extension of the executing script as part of the filename. Once you download one
and make it executable, you [...] programming. There are three variations of AWK:
AWK - the original from AT&T
NAWK - a newer, improved version from AT&T
GAWK - the Free Software Foundation's version
Originally, I didn't plan to discuss NAWK, but several UNIX vendors have replaced AWK with NAWK, and there
are several incompatibilities between the two. It would be cruel of me not to warn you about the differences, so I will
highlight them when I come to them. It is important to know that all of AWK's features are in NAWK and GAWK.
Most, if not all, of NAWK's features are in GAWK. NAWK ships as part of Solaris. GAWK does not. However,
many sites on the Internet have the sources freely available. If you use Linux, you have GAWK. But in general, assume
that I am talking about the classic AWK unless otherwise noted.
Why is AWK so important? It is an excellent filter and report writer. Many UNIX utilities generate rows and columns
of information. AWK is an excellent tool for processing these rows and columns, and it is easier to use than most
conventional programming languages. It can be considered a pseudo-C interpreter, as it understands the same
arithmetic operators as C. AWK also has string manipulation functions, so it can search for particular strings and modify
the output. AWK also has associative arrays, which are incredibly useful, and are a feature most computing languages
lack. Associative arrays can make a complex problem a trivial exercise.
I won't exhaustively cover AWK. That is, I will cover the essential parts, and avoid the many variants of AWK. It might
be too confusing to discuss three different versions of AWK. I won't cover the GNU version of AWK called "gawk."
Similarly, I will not discuss the new AT&T AWK called "nawk." The new AWK comes on the Sun system, and you
may find it superior to the old AWK in many ways. In particular, it has better diagnostics, and won't print out the
infamous "bailing out near line ..." message the original AWK is prone to do. Instead, "nawk" prints out the line it didn't
understand, and highlights the bad parts with arrows. GAWK does this as well, and this really helps a lot. If you find
yourself needing a feature that is very difficult or impossible to do in AWK, I suggest you either use NAWK or
GAWK, or convert your AWK script into PERL using the "a2p" conversion program which comes with PERL. PERL
is a marvelous language, and I use it all the time, but I do not plan to cover PERL in these tutorials. Having made my
intention clear, I can continue with a clear conscience.
Many UNIX utilities have strange names. AWK is one of those utilities. It is not an abbreviation for awkward. In fact, it
is an elegant and simple language. The word "AWK" is derived from the initials of the language's three developers: A.
Aho, B. W. Kernighan and P. Weinberger.
Basic structure
The essential organization of an AWK program follows the form:
    pattern { action }
The pattern specifies when the action is performed. Like most UNIX utilities, AWK is line oriented. That is, the pattern
specifies a test that is performed with each line read as input. If the condition is true, then the action is taken. The default
pattern is something that matches every line. This is the blank or null pattern. Two other important patterns are specified
by the keywords "BEGIN" and "END." As you might expect, these two words specify actions to be taken before any
lines are read, and after the last line is read. The AWK program below:
    BEGIN { print "START" }
          { print         }
    END   { print "STOP"  }
adds one line before and one line after the input file. This isn't very useful, but with a simple change, we can make this
into a typical AWK program:
    BEGIN { print "File\tOwner" }
          { print $8, "\t", $3 }
    END   { print " - DONE -" }
I'll improve the script in the next sections, but we'll call it "FileOwner." But let's not put it into a script or file yet. I will
cover that part in a bit. Hang on and follow with me so you get the flavor of AWK.
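To get a feel for the pattern/action structure before going further, here is a minimal sketch of the START/STOP program run from a Bourne-style shell. The two input lines are made up for illustration, and the program is passed inline rather than stored in a file.

```shell
# Two made-up input lines stand in for any text file.
out=$(printf 'first line\nsecond line\n' | awk '
BEGIN { print "START" }   # runs once, before any input is read
      { print }           # empty pattern: runs for every input line
END   { print "STOP" }    # runs once, after the last line is read
')
printf '%s\n' "$out"
```

The middle rule has no pattern at all, which is why it fires for both input lines.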
The characters "\t" indicate a tab character, so the output lines up on even boundaries. The "$8" and "$3" have a
meaning similar to a shell script. Instead of the eighth and third argument, they mean the eighth and third field of the input
line. You can think of a field as a column, and the action you specify operates on each line or row read in.
There are two differences between AWK and a shell processing the characters within double quotes. AWK
understands special characters that follow the "\" character, like "t". The Bourne and C UNIX shells do not. Also, unlike the
shell (and PERL), AWK does not evaluate variables within strings. To explain, the second line could not be written like
this:
    { print "$8\t$3" }
That example would print the literal text "$8", a tab, and "$3". Inside the quotes, the dollar sign is not a special character. Outside, it corresponds
to a field. What do I mean by the third and eighth field? Consider the Solaris "/usr/bin/ls -l" command, which has eight
columns of information. The System V version (similar to the Linux version), "/usr/5bin/ls -l", has 9 columns. The third
column is the owner, and the eighth (or ninth) column is the name of the file. This AWK program can be used to
process the output of the "ls -l" command, printing out the filename, then the owner, for each file. I'll show you how.
Update: On a Linux system, change "$8" to "$9".
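The difference between a field reference and a dollar sign inside a string can be checked with a small sketch. The sample line is invented and follows the 9-column (Linux-style) layout, so the filename is field 9 rather than field 8.

```shell
# A made-up line in the 9-column (Linux-style) "ls -l" layout.
line='-rw-r--r-- 1 barnett staff 1024 Aug 31 2019 a.file'

# Outside quotes, $9 and $3 are fields of the current input line.
fields=$(printf '%s\n' "$line" | awk '{ print $9, $3 }')

# Inside an AWK string, the dollar sign is just an ordinary character.
literal=$(printf '%s\n' "$line" | awk '{ print "$9 $3" }')

printf '%s\n' "$fields" "$literal"
```

The first command prints the filename and owner; the second prints the characters between the quotes unchanged.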
One more point about the use of a dollar sign. In scripting languages like Perl and the various shells, a dollar sign means
the word following is the name of the variable. AWK is different. The dollar sign means that we are referring to a field or
column in the current line. When switching between Perl and AWK you must remember that "$" has a different meaning.
So the following piece of code prints two "fields" to standard out. The first field printed is the number "5", the second is
the fifth field (or column) on the input line.
    BEGIN { x = 5 }
          { print x, $x }
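As a quick check of that behavior, the sketch below feeds one invented six-field line to the same two rules.

```shell
# x holds the number 5, so $x is the fifth field of each input line.
out=$(printf 'a b c d e f\n' | awk 'BEGIN { x = 5 } { print x, $x }')
printf '%s\n' "$out"
```

The output shows the value of x followed by the fifth field, not the text "x".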
Executing an AWK script
So let's start writing our first AWK script. There are a couple of ways to do this.
Assuming the first script is called "FileOwner," the invocation would be
    ls -l | FileOwner
This might generate the following if there were only two files in the current directory:
    File Owner
    a.file barnett
    another.file barnett
     - DONE -
There are two problems with this script. Both problems are easy to fix, but I'll hold off on this until I cover the basics.
The script itself can be written in many ways. The C shell version would look like this:
    #!/bin/csh -f
    # Linux users have to change $8 to $9
    awk '\
    BEGIN { print "File\tOwner" } \
          { print $8, "\t", $3 } \
    END   { print " - DONE -" } \
    '
Click here to get file: awk_example1.csh
As you can see in the above script, each line of the AWK script must have a backslash if it is not the last line of the
script. This is necessary as the C shell doesn't, by default, allow quoted strings to span more than one line. I have a long list of complaints about using the C
shell. See "Top Ten reasons not to use the C shell."
The Bourne shell (as do most shells) allows quoted strings to span several lines:
    #!/bin/sh
    # Linux users have to change $8 to $9
    awk '
    BEGIN { print "File\tOwner" }
          { print $8, "\t", $3 }
    END   { print " - DONE -" }
    '
Click here to get file: awk_example1.sh
The third form is to store the commands in a file, and execute
    awk -f filename
Since AWK is also an interpreter, you can save yourself a step and make the file executable by adding one line at the
beginning of the file:
    #!/bin/awk -f
    BEGIN { print "File\tOwner" }
          { print $8, "\t", $3 }
    END   { print " - DONE -" }
Click here to get file: awk_example1.awk
Change the permission with the chmod command (i.e. "chmod +x awk_example1.awk"), and the script becomes a
new command. Notice the "-f" option following "#!/bin/awk" above, which is also used in the third format where you use
AWK to execute the file directly, i.e. "awk -f filename". The "-f" option specifies the AWK file containing the
instructions. As you can see, AWK considers lines that start with a "#" to be a comment, just like the shell. To be
precise, anything from the "#" to the end of the line is a comment (unless it's inside an AWK string). However, I always
comment my AWK scripts with the "#" at the start of the line, for reasons I'll discuss later.
Which format should you use? I prefer the last format when possible. It's shorter and simpler. It's also easier to debug
problems. If you need to use a shell, and want to avoid using too many files, you can combine them as we did in the first
and second example.
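The "awk -f filename" form can be tried without touching the current directory. The following is only a sketch: the program file is a temporary file created with mktemp, the input line is invented in the 9-column Linux layout (hence $9), and "awk -f" is used instead of the "#!" line because the location of the awk binary differs between systems.

```shell
# Store the AWK program in a temporary file (the path comes from mktemp).
prog=$(mktemp)
cat > "$prog" <<'EOF'
# Lines starting with "#" are comments in AWK, just as in the shell.
BEGIN { print "File\tOwner" }
      { print $9, "\t", $3 }
END   { print " - DONE -" }
EOF

# A made-up line in the 9-column "ls -l" layout replaces real ls output.
out=$(printf '%s\n' '-rw-r--r-- 1 barnett staff 1024 Aug 31 2019 a.file' |
      awk -f "$prog")
rm -f "$prog"
printf '%s\n' "$out"
```

Because the program lives in its own file, no shell quoting is involved at all, which is one reason this format is the easiest to debug.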
Which shell to use with awk?
The format of AWK is not free-form. You cannot put new line breaks just anywhere. They must go in particular
locations. To be precise, in the original AWK you can insert a new line character after the curly braces, and at the end
of a command, but not elsewhere. If you wanted to break a long line into two lines at any other place, you had to use a
backslash:
    #!/bin/awk -f
    BEGIN { print "File\tOwner" }
          { print $8, "\t", \
            $3 }
    END   { print " - DONE -" }
Click here to get file: awk_example2.awk
The Bourne shell version would be
    #!/bin/sh
    awk '
    BEGIN { print "File\tOwner" }
          { print $8, "\t", \
            $3 }
    END   { print " - DONE -" }
    '
Click here to get file: awk_example2.sh
while the C shell would be
    #!/bin/csh -f
    awk '\
    BEGIN { print "File\tOwner" } \
          { print $8, "\t", \\
            $3 } \
    END   { print " - DONE -" } \
    '
Click here to get file: awk_example2.csh
As you can see, this demonstrates how awkward the C shell is when enclosing an AWK script. Not only are back
slashes needed for every line, some lines need two. (Note: this is true when using old AWK (e.g. on Solaris), because the
print statement had to be on one line. Newer AWKs are more flexible about where new lines can be added.) Many people
will warn you about the C shell. Some of the problems are subtle, and you may never see them. Try to include an AWK
or sed script within a C shell script, and the back slashes will drive you crazy. This is what convinced me to learn the
Bourne shell years ago, when I was starting out. I strongly recommend you use the Bourne shell for any AWK or sed
script. If you don't use the Bourne shell, then you should learn it. As a minimum, learn how to set variables, which by
some strange coincidence is the subject of the next section.
Dynamic variables
Since you can make a script an AWK executable by mentioning "#!/bin/awk -f" on the first line, including an AWK
script inside a shell script isn't needed unless you want to either eliminate the need for an extra file, or pass
a variable to the insides of an AWK script. Since this is a common problem, now is as good a time as any to explain the
technique. I'll do this by showing a simple AWK program that will only print one column. NOTE: there will be a bug in
the first version. The number of the column will be specified by the first argument. The first version of the program,
which we will call "Column," looks like this:
    #!/bin/sh
    # NOTE - this script does not work!
    column=$1
    awk '{print $column}'
Click here to get file (but be aware that it doesn't work): Column1.sh
A suggested use is:
    ls -l | Column 3
This would print the third column from the ls command, which would be the owner of the file. You can change this into a
utility that counts how many files are owned by each user by adding
    ls -l | Column 3 | sort | uniq -c | sort -nr
(The first sort is there because uniq only merges adjacent duplicate lines.)
Only one problem: the script doesn't work. The value of the "column" variable is not seen by AWK. Change "awk"
to "echo" to check. You need to turn off the quoting when the variable is seen. This can be done by ending the quoting,
and restarting it after the variable:
    #!/bin/sh
    column=$1
    awk '{print $'$column'}'
Click here to get file: Column2.sh
This is a very important concept, and throws experienced programmers a curve ball. In many computer languages, a
string has a start quote, an end quote, and the contents in between. If you want to include a special character inside the
quote, you must prevent the character from having the typical meaning. In the C language, this is done by putting a
backslash before the character. In other languages, there is a special combination of characters to do this. In the C and
Bourne shell, the quote is just a switch. It turns the interpretation mode on or off. There is really no such concept as
"start of string" and "end of string." The quotes toggle a switch inside the interpreter. The quote character is not passed
on to the application. This is why there are two pairs of quotes above. Notice there are two dollar signs. The first one is
quoted, and is seen by AWK. The second one is not quoted, so the shell evaluates the variable, and replaces "$column"
by the value. If you don't understand, either change "awk" to "echo," or change the first line to read "#!/bin/sh -x."
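The "quote is a switch" idea is easy to see by looking at the program text the shell actually builds. In this sketch the variable column is set directly instead of coming from a script argument, and the input line is invented.

```shell
column=3   # stands in for the script argument $1

# Quoting is switched off around $column, so the shell substitutes its
# value before AWK ever sees the program text.
program='{print $'$column'}'
printf '%s\n' "$program"

# The same construction, handed straight to AWK.
out=$(printf 'a b c d\n' | awk '{print $'$column'}')
printf '%s\n' "$out"
```

The first printf shows that AWK receives a program with a plain number after the dollar sign; the quote characters themselves never reach it.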
Some improvements are needed, however. The Bourne shell has a mechanism to provide a value for a variable if the
value isn't set, or is set and the value is an empty string. This is done by using the format:
    ${variable:-defaultvalue}
This is shown below where the default column will be one
    #!/bin/sh
    column=${1:-1}
    awk '{print $'$column'}'
Click here to get file: Column3.sh
We can save a line by combining these two steps
    #!/bin/sh
    awk '{print $'${1:-1}'}'
Click here to get file: Column4.sh
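To watch the default value take effect without creating a script file, the same construction can be wrapped in a shell function. The function name print_column is hypothetical; inside a function, $1 is the function's first argument, which plays the role of the script argument here.

```shell
# print_column is a hypothetical helper; inside it, $1 is the function
# argument, playing the role of the script argument in Column4.sh.
print_column() {
    awk '{print $'${1:-1}'}'
}

with_arg=$(printf 'a b c\n' | print_column 2)   # column 2 requested
no_arg=$(printf 'a b c\n' | print_column)       # falls back to column 1
printf '%s\n' "$with_arg" "$no_arg"
```

With no argument, the shell substitutes the default 1 and AWK prints the first field.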
It is hard to read, but it is compact. There is one other method that can be used. If you execute an AWK command and
include on the command line
    variable=value
this variable will be set when the AWK script starts. An example of this use would be
    #!/bin/sh
    awk '{print $c}' c=${1:-1}
Click here to get file: Column5.sh
This last variation does not have the problems with quoting the previous example had. You should master the earlier
example, however, because you can use it with any script or command. The second method is special to AWK.
Modern AWKs have other options as well. See the comp.unix.shell FAQ.
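Here is a short sketch of the command-line assignment, with the values written out instead of taken from a script argument. The second command uses the POSIX "-v" option; that this is one of the "other options" meant above is my assumption.

```shell
# Assignment operand after the program text, as in Column5.sh.
by_operand=$(printf 'a b c\n' | awk '{print $c}' c=3)

# The POSIX -v option assigns the variable before the program starts
# (assumed to be one of the "other options" of modern AWKs).
by_option=$(printf 'a b c\n' | awk -v c=2 '{print $c}')

printf '%s\n' "$by_operand" "$by_option"
```

In both cases the program text stays inside one pair of single quotes, so the shell never has to splice a value into it.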
The Essential Syntax of AWK
Earlier I discussed ways to start an AWK script. This section will discuss the various grammatical elements of AWK.
Arithmetic Expressions
There are several arithmetic operators, similar to C. These are the binary operators which operate on two variables
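AWK Table 1
Binary Operators
+----------+------------+----------------+
| Operator | Type       | Meaning        |
+----------+------------+----------------+
| +        | Arithmetic | Addition       |
| -        | Arithmetic | Subtraction    |
| *        | Arithmetic | Multiplication |
| /        | Arithmetic | Division       |
| %        | Arithmetic | Modulo         |
| (space)  | String     | Concatenation  |
+----------+------------+----------------+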
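Using variables with the value of "7" and "3," AWK returns the following results for each operator when using the print
command:
+------------+---------+
| Expression | Result  |
+------------+---------+
| 7+3        | 10      |
| 7-3        | 4       |
| 7*3        | 21      |
| 7/3        | 2.33333 |
| 7%3        | 1       |
| 7 3        | 73      |
+------------+---------+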
There are a few points to make. The modulus operator finds the remainder after an integer divide. The print command
outputs a floating point number on the divide, but an integer for the rest. The string concatenation operator is confusing,
since it isn't even visible. Place a space between two variables and the strings are concatenated together. This also
shows that numbers are converted automatically into strings when needed. Unlike C, AWK doesn't have "types" of
variables. There is one type only, and it can be a string or number. The conversion rules are simple. A number can easily
be converted into a string. When a string needs to be converted into a number, AWK will do so. The string "123" will be
converted into the number 123. However, the string "123X" will be converted into the number 0. (NAWK will behave
differently, and converts the string into the integer 123, which is found at the beginning of the string.)
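The binary operators can be tried in a BEGIN block, which needs no input file. This is a minimal sketch using the same values 7 and 3; the variable names a and b are mine.

```shell
out=$(awk 'BEGIN {
    a = 7; b = 3
    # The last item, "a b", is concatenation: no visible operator at all.
    print a + b, a - b, a * b, a / b, a % b, a b
    # A string that looks like a number is converted when used in arithmetic.
    print "123" + 1
}')
printf '%s\n' "$out"
```

Only the division produces a fractional result, and print shows it with the default six significant digits.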
Unary arithmetic operators
The "+" and "-" operators can be used before variables and numbers. If x equals 4, then the statement
    print -x;
will print "-4."
The Autoincrement and Autodecrement Operators
AWK also supports the "++" and "--" operators of C. Both increment or decrement the variables by one. The operator
can only be used with a single variable, and can be before or after the variable. The prefix form modifies the value, and
then uses the result, while the postfix form gets the value of the variable, and afterwards modifies the variable. As an
example, if x has the value of 3, then the AWK statement
    print x++, " ", ++x;
would print the numbers 3 and 5. These operators are also assignment operators, and can be used by themselves on a
line.
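A small sketch of the unary minus and the two increment forms follows. The increments are split into separate statements, with helper variables first and second that are my own, so the result does not rely on the order in which print evaluates its arguments.

```shell
out=$(awk 'BEGIN {
    x = 4
    print -x          # unary minus
    x = 3
    first  = x++      # postfix: first gets 3, then x becomes 4
    second = ++x      # prefix: x becomes 5, then second gets 5
    print first, second, x
}')
printf '%s\n' "$out"
```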
Assignment Operators
Variables can be assigned new values with the assignment operators. You know about "++" and "--." The other
assignment statement is simply
    variable = arithmetic_expression
Certain operators have precedence over others; parentheses can be used to control grouping. The statement
    x=1+2*3 4;
is the same as
    x = (1 + (2 * 3)) "4";
Both give x the value "74."
Notice spaces can be added for readability. AWK, like C, has special assignment operators, which combine a
calculation with an assignment. Instead of saying
    x=x+2;
you can more concisely say
    x+=2;
The complete list follows
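AWK Table 2
Assignment Operators
+----------+-------------------------------+
| Operator | Meaning                       |
+----------+-------------------------------+
| +=       | Add result to variable        |
| -=       | Subtract result from variable |
| *=       | Multiply variable by result   |
| /=       | Divide variable by result     |
| %=       | Apply modulo to variable      |
+----------+-------------------------------+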
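The precedence example and each of the assignment operators in Table 2 can be traced in one BEGIN block. This is just a sketch; the particular numbers were chosen so that every step gives a whole number.

```shell
out=$(awk 'BEGIN {
    x = 1 + 2 * 3 4    # arithmetic first (7), then concatenation with 4
    print x
    x += 2; print x    # 74 + 2
    x -= 6; print x    # 76 - 6
    x *= 2; print x    # 70 * 2
    x /= 4; print x    # 140 / 4
    x %= 4; print x    # remainder of 35 / 4
}')
printf '%s\n' "$out"
```

Note that the concatenated string "74" is converted back into a number as soon as it is used in arithmetic.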
Conditional expressions
The second type of expression in AWK is the conditional expression. This is used for certain tests, like the if or while.
Boolean conditions evaluate to true or false. In AWK, there is a definite difference between a boolean condition and an
arithmetic expression. You cannot convert a boolean condition to an integer or string. You can, however, use an
arithmetic expression as a conditional expression. A value of 0 is false, while anything else is true. Undefined variables
have the value of 0. Unlike AWK, NAWK lets you use booleans as integers.
Arithmetic values can also be converted into boolean conditions by using relational operators:
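AWK Table 3
Relational Operators
+----------+-----------------------------+
| Operator | Meaning                     |
+----------+-----------------------------+
| ==       | Is equal to                 |
| !=       | Is not equal to             |
| >        | Is greater than             |
| >=       | Is greater than or equal to |
| <        | Is less than                |
| <=       | Is less than or equal to    |
+----------+-----------------------------+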
These operators are the same as the C operators. They can be used to compare numbers or strings. With respect to
strings, lower case letters are greater than upper case letters.
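These rules can be checked with a few if statements. In this sketch, never_set is simply a variable that is never assigned, and LC_ALL=C is set on the assumption that the string comparison should use plain character codes; other locales may order letters differently.

```shell
out=$(LC_ALL=C awk 'BEGIN {
    if (0)         { print "0 is true" }         else { print "0 is false" }
    if (5)         { print "5 is true" }         else { print "5 is false" }
    if (never_set) { print "never_set is true" } else { print "never_set is false" }
    if (7 >= 3)    { print "7 >= 3" }
    if ("a" > "A") { print "a > A" }
}')
printf '%s\n' "$out"
```

The comparison is kept inside the parentheses of the if. Written directly after print, a ">" would be taken as output redirection to a file.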
Regular Expressions
Two operators are used to compare strings to regular expressions
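AWK Table 4
Regular Expression Operators
+----------+---------------+
| Operator | Meaning       |
+----------+---------------+
| ~        | Matches       |
| !~       | Doesn't match |
+----------+---------------+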
The order in this case is particular. The regular expression must be enclosed by slashes, and comes after the operator.
AWK supports extended regular expressions, so the following are examples of valid tests.
    word !~ /START/
    lawrence_welk ~ /(one|two|three)/
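Both operators can also be used as patterns. The sketch below applies tests of the same shape to three invented input lines, using the first field and the whole line ($0) in place of the variables above.

```shell
out=$(printf 'START of file\none fish\nfour fish\n' | awk '
$1 !~ /START/            { print "not START:", $0 }
$0 ~  /(one|two|three)/  { print "has one, two or three:", $0 }
')
printf '%s\n' "$out"
```

The first input line triggers neither rule, the second triggers both, and the third triggers only the "doesn't match" rule.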