Awk Tutorial Part 2

Today I met Bryan of Guru Labs. During our conversation he posed the following question: “Find the 3rd field in a file consisting of space-separated fields, the first being an IP address in the range 192.168.1-2.1-255. There may be lines in the file containing invalid IP addresses.”

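I used grep to find the lines and then used awk to find the field. The pattern ends with a space so that the last octet must be complete; without it, an invalid address such as 192.168.1.999 would match on its first digit. For example:

$ grep -E '^192\.168\.[12]\.([1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]) ' access_log | \
awk '{print $(NF-1)}'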
200
304
304
...

He pointed out that, while this works, there is no reason to invoke grep. He is certainly correct: awk can select lines by pattern on its own. The general form of an awk program is:

awk 'pattern { command }'

In its most common and simplest usage, awk prints a field from space-delimited input:

awk '{print $3}'

Here no pattern is specified, so the action runs on every line. To solve the problem posed by Bryan, simply specify the pattern and eliminate grep from the pipeline. Here is the equivalent awk command (the space at the end of the pattern terminates the last octet, so an out-of-range address such as 192.168.1.999 is rejected):

$ awk '/^192\.168\.[12]\.([1-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]) / {print $(NF-1)}' \
access_log
200
304
304
...
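
As a cross-check on the long regular expression, the same selection can be sketched by splitting the first field on dots and comparing the octets as numbers. This is an alternative approach, not the one Bryan suggested. The sample lines and the function names `sample_log` and `status_for_valid_ips` are made up for illustration.

```shell
# Hypothetical sample data: two valid addresses, one with an out-of-range
# last octet, and one outside the 192.168.1-2 networks.
sample_log() {
cat <<'EOF'
192.168.1.2 - - [01/Jan/2008:00:00:31 -0600] "GET /feed/ HTTP/1.1" 304 -
192.168.2.255 - - [01/Jan/2008:00:01:09 -0600] "GET /man/cmd/info HTTP/1.1" 200 -
192.168.1.999 - - [01/Jan/2008:00:02:00 -0600] "GET /feed HTTP/1.1" 304 -
10.0.0.1 - - [01/Jan/2008:00:03:00 -0600] "GET /robots.txt HTTP/1.1" 200 -
EOF
}

# Print the response code for lines whose first field is a well-formed
# address in 192.168.1-2.1-255. split() returns the number of pieces.
status_for_valid_ips() {
  awk '{
    n = split($1, o, ".")
    if (n == 4 && o[1] == 192 && o[2] == 168 &&
        (o[3] == 1 || o[3] == 2) &&
        o[4] ~ /^[0-9]+$/ && o[4] >= 1 && o[4] <= 255)
      print $(NF-1)
  }' "$@"
}

sample_log | status_for_valid_ips
```

With the sample data above, only the first two lines pass the check. The numeric form is longer than the regular expression, but each condition can be read and changed on its own.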

Awk has some extremely powerful selection operators. Here I am using the ~ operator to match the fourth field from the right, $(NF-3) (the resource), against ^/man, and printing the matched field:

$ awk '$(NF-3) ~ /^\/man/ {print $(NF-3)}' access_log
/man/cmd/info
/man/cmd/Mail
/man/s/Z
/man/cmd/mv
...

This invocation uses the !~ operator to select lines where the resource does not match the pattern ^/man:

$ awk '$(NF-3) !~ /^\/man/ {print $(NF-3)}' access_log
/feed/
/feed
/robots.txt
/10-linux-commands-youve-never-used.html
...
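
To make the difference between a field match and a whole-line match concrete, here is a small sketch on made-up data. A bare /regex/ pattern is shorthand for $0 ~ /regex/, so the ^ anchor then refers to the start of the line rather than the start of the field. The `sample_log` function and its three lines are assumptions for illustration only.

```shell
# Hypothetical sample lines in the same layout as access_log.
sample_log() {
cat <<'EOF'
192.168.1.2 - - [01/Jan/2008:00:00:31 -0600] "GET /man/cmd/info HTTP/1.1" 200 -
192.168.1.3 - - [01/Jan/2008:00:01:09 -0600] "GET /feed HTTP/1.1" 304 -
192.168.1.4 - - [01/Jan/2008:00:02:10 -0600] "GET /robots.txt HTTP/1.1" 200 -
EOF
}

# Field match: ^ anchors to the start of $(NF-3), so /man/cmd/info is selected.
field_match=$(sample_log | awk '$(NF-3) ~ /^\/man/ {print $(NF-3)}')

# Bare pattern: tested against the whole line, which starts with the IP
# address, so nothing is selected.
line_match=$(sample_log | awk '/^\/man/ {print $(NF-3)}')

# Negated field match: every resource that is not under /man.
negated=$(sample_log | awk '$(NF-3) !~ /^\/man/ {print $(NF-3)}')

printf 'field: %s\nline: %s\nnegated:\n%s\n' "$field_match" "$line_match" "$negated"
```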

Here I am selecting lines where the response code $(NF-1) is greater than or equal to 200, but less than 400 and printing the resource and response code. I use awk’s boolean “and” operator && to perform this operation:

$ awk '$(NF-1) >= 200 && $(NF-1) <= 399 {print $(NF-3), $(NF-1)}' access_log
/man/cmd/info 200
/feed/ 304
/feed 304
...
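
The range test can be tried on a few made-up lines that cover both sides of the boundaries. The `sample_log` data and the `ok_responses` function name are assumptions for illustration. Because the response code field looks like a number, awk compares it numerically.

```shell
# Hypothetical sample lines with response codes inside and outside 200-399.
sample_log() {
cat <<'EOF'
192.168.1.2 - - [01/Jan/2008:00:00:31 -0600] "GET /man/cmd/info HTTP/1.1" 200 -
192.168.1.3 - - [01/Jan/2008:00:01:09 -0600] "GET /feed/ HTTP/1.1" 304 -
192.168.1.4 - - [01/Jan/2008:00:02:10 -0600] "GET /missing HTTP/1.1" 404 -
192.168.1.5 - - [01/Jan/2008:00:03:12 -0600] "GET /broken HTTP/1.1" 500 -
EOF
}

# Print resource and response code when both comparisons are true.
ok_responses() {
  awk '$(NF-1) >= 200 && $(NF-1) <= 399 {print $(NF-3), $(NF-1)}' "$@"
}

sample_log | ok_responses
```

The 404 and 500 lines fail the second comparison, so only the 200 and 304 lines are printed.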

The following example uses the boolean “or” operator || to print lines where the resource matches ^/feed or ^/sitemap:

$ awk '$(NF-3) ~ /^\/feed/ || $(NF-3) ~ /^\/sitemap/ {print $0}' access_log
192.168.1.2 - - [01/Jan/2008:00:00:31 -0600] "GET /feed/ HTTP/1.1" 304 -
192.168.1.3 - - [01/Jan/2008:00:01:09 -0600] "GET /feed HTTP/1.1" 304 -
...
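
When both sides of || test the same field, the two patterns can also be folded into one regular expression with alternation. The sketch below compares the two forms on made-up data; `sample_log` and the variable names are assumptions for illustration.

```shell
# Hypothetical sample lines in the same layout as access_log.
sample_log() {
cat <<'EOF'
192.168.1.2 - - [01/Jan/2008:00:00:31 -0600] "GET /feed/ HTTP/1.1" 304 -
192.168.1.3 - - [01/Jan/2008:00:01:09 -0600] "GET /sitemap.xml HTTP/1.1" 200 -
192.168.1.4 - - [01/Jan/2008:00:02:10 -0600] "GET /robots.txt HTTP/1.1" 200 -
192.168.1.5 - - [01/Jan/2008:00:03:12 -0600] "GET /man/cmd/mv HTTP/1.1" 200 -
EOF
}

# Two separate matches joined with the boolean "or" operator.
with_or=$(sample_log |
  awk '$(NF-3) ~ /^\/feed/ || $(NF-3) ~ /^\/sitemap/ {print $(NF-3)}')

# One match using alternation inside the regular expression.
with_alternation=$(sample_log |
  awk '$(NF-3) ~ /^\/(feed|sitemap)/ {print $(NF-3)}')

printf '%s\n' "$with_or"
```

Both forms select the same two resources here. The || form is still the one to use when the conditions test different fields or mix pattern matches with numeric comparisons.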





 