In this tutorial, we will explain the Linux/UNIX AWK command with examples.
AWK is a versatile and lightweight text-processing language that comes pre-installed on most Unix-like operating systems, including Linux. It excels at manipulating structured text data, making it an invaluable tool for tasks such as data extraction, transformation, and reporting.
The name “AWK” itself is an acronym derived from the initials of its creators: Alfred Aho, Peter Weinberger, and Brian Kernighan, who developed it in the 1970s. Since then, AWK has evolved and is now available in various implementations, with the most common one being GNU AWK (gawk).
Basic Syntax
Before diving into AWK’s capabilities, let’s understand its basic syntax. AWK scripts consist of patterns and actions:
- Pattern: Describes a condition or criteria that a line of text must meet.
- Action: Specifies what to do when the pattern is matched.
$ awk '/pattern/ { action }' file.txt
In this example, AWK scans file.txt, and for each line that matches the pattern, it performs the specified action. Patterns and actions can be combined in various ways to create complex text processing logic.
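To make the pattern/action pairing concrete, here is a minimal sketch. The file fruits.txt and its contents are made up purely for illustration; they are not part of the tutorial's data.

```shell
# Hypothetical sample file, created only for this illustration.
cat > fruits.txt <<'EOF'
apple 3
banana 12
cherry 7
EOF

# Pattern only: the default action prints the whole matching line.
awk '/an/' fruits.txt

# Action only: with no pattern, the action runs for every line.
awk '{ print $1 }' fruits.txt

# Pattern and action together: print the name when the count exceeds 5.
big=$(awk '$2 > 5 { print $1 }' fruits.txt)
echo "$big"
```

Either part may be omitted: a missing action means "print the line", and a missing pattern means "match every line".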
Without any further delay, let’s jump into awk command examples.
Let’s take an input file with the following data:
$ cat awk_file
Name,Marks,Max Marks
Ram,200,1000
Shyam,500,1000
Ghyansham,1000
Abharam,800,1000
Hari,600,1000
Ram,400,1000
1) Print All Lines From a File
When no pattern is given, awk applies the action to every line, and print with no arguments outputs the whole line. To print every line of the file created above, use the command below:
$ awk '{print;}' awk_file
Name,Marks,Max Marks
Ram,200,1000
Shyam,500,1000
Ghyansham,1000
Abharam,800,1000
Hari,600,1000
Ram,400,1000
Note: In awk, '{print;}' prints the entire current line. It is equivalent to '{print $0;}'.
2) Display Only Specific Fields
In awk, the $ (dollar) symbol followed by a field number refers to that field's value. In the example below, we print field 2 (i.e. Marks) and field 3 (i.e. Max Marks):
$ awk -F "," '{print $2, $3;}' awk_file
Marks Max Marks
200 1000
500 1000
1000
800 1000
600 1000
400 1000
In the above command, the -F "," option specifies that comma (,) is the field separator in the file. The Ghyansham line has only two fields, so its $3 is empty.
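When a file may contain rows with missing fields, the built-in variables NF (number of fields in the current line) and NR (current line number) can help spot them. The sketch below recreates the sample file so that it can be run on its own:

```shell
cat > awk_file <<'EOF'
Name,Marks,Max Marks
Ram,200,1000
Shyam,500,1000
Ghyansham,1000
Abharam,800,1000
Hari,600,1000
Ram,400,1000
EOF

# Report every line that has fewer than 3 comma-separated fields.
short=$(awk -F, 'NF < 3 { print NR ": " $0 }' awk_file)
echo "$short"
```

With the sample data this should flag only line 4, the Ghyansham row.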
3) Print Lines Which Matches the Pattern
Suppose we want to print the lines that contain either “Hari” or “Ram”. Run:
$ awk '/Hari|Ram/' awk_file
Ram,200,1000
Hari,600,1000
Ram,400,1000
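A bare /regex/ pattern is tested against the whole line. To restrict the match to a single field, the ~ operator can be used instead. The following is a small sketch on the same sample data, not the only way to do it:

```shell
cat > awk_file <<'EOF'
Name,Marks,Max Marks
Ram,200,1000
Shyam,500,1000
Ghyansham,1000
Abharam,800,1000
Hari,600,1000
Ram,400,1000
EOF

# Match only when the first field is exactly "Ram" (anchored with ^ and $).
exact=$(awk -F, '$1 ~ /^Ram$/' awk_file)
echo "$exact"

# Case-insensitive match on the first field via tolower();
# this also picks up "Abharam".
loose=$(awk -F, 'tolower($1) ~ /ram/' awk_file)
echo "$loose"
```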
4) Show Unique Values from First Column
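To show the unique values from the first column of a file, run the awk command below. The order in which for (i in a) visits the keys is unspecified, so the output order may differ on your system.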
$ awk -F, '{a[$1];}END{for (i in a)print i;}' awk_file
Abharam
Hari
Name
Ghyansham
Ram
Shyam
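If the unique values should appear in the order they first occur in the file, a common idiom is to print a value only the first time it is seen. The array name seen is just a name chosen for this sketch:

```shell
cat > awk_file <<'EOF'
Name,Marks,Max Marks
Ram,200,1000
Shyam,500,1000
Ghyansham,1000
Abharam,800,1000
Hari,600,1000
Ram,400,1000
EOF

# seen[$1]++ is 0 (false) on the first occurrence, so !seen[$1]++ is true
# exactly once per distinct value.
uniq_names=$(awk -F, '!seen[$1]++ { print $1 }' awk_file)
echo "$uniq_names"
```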
5) Find the Sum of Data Entry in a Particular Column
awk can also perform arithmetic on the lines that match a condition. The syntax is shown below:
$ awk -F, '$1=="Item1"{x+=$2;}END{print x}' awk_file
In the following example, we select the lines whose first field is Ram and add up the values of their 2nd field.
$ awk -F, '$1=="Ram"{x+=$2;}END{print x}' awk_file
600
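Rather than hard-coding the search value inside the script, it can be passed in with the -v option. In this sketch the variable name (name) is an arbitrary choice, and x + 0 is used so that 0 is printed instead of an empty line when nothing matches:

```shell
cat > awk_file <<'EOF'
Name,Marks,Max Marks
Ram,200,1000
Shyam,500,1000
Ghyansham,1000
Abharam,800,1000
Hari,600,1000
Ram,400,1000
EOF

# -v assigns an awk variable before the script starts running.
total=$(awk -F, -v name="Ram" '$1 == name { x += $2 } END { print x + 0 }' awk_file)
echo "$total"

# A value that does not occur in the file gives 0 rather than a blank line.
none=$(awk -F, -v name="Nobody" '$1 == name { x += $2 } END { print x + 0 }' awk_file)
echo "$none"
```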
6) Summing Numbers
awk can also calculate the sum of all numbers in a column of a file. In the example below, we calculate the sums of the 2nd and 3rd columns. The header text counts as 0 in arithmetic, so it does not affect the totals.
$ awk -F"," '{x+=$2}END{print x}' awk_file
3500
$ awk -F"," '{x+=$3}END{print x}' awk_file
5000
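For anything beyond a plain sum, such as a count or an average, it is usually safer to skip the header row explicitly. One way to sketch this is with the condition NR > 1 (the variable names sum and n are illustrative):

```shell
cat > awk_file <<'EOF'
Name,Marks,Max Marks
Ram,200,1000
Shyam,500,1000
Ghyansham,1000
Abharam,800,1000
Hari,600,1000
Ram,400,1000
EOF

# NR > 1 skips the header, so n counts only the data rows.
summary=$(awk -F, 'NR > 1 { sum += $2; n++ } END { print sum, n }' awk_file)
echo "$summary"

# Average of column 2, guarded against an empty file.
avg=$(awk -F, 'NR > 1 { sum += $2; n++ } END { if (n > 0) printf "%.2f\n", sum / n }' awk_file)
echo "$avg"
```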
7) Find the Sum of Individual Group Records
For example, we can group the records by the first column and sum the second column for each item:
$ awk -F, '{a[$1]+=$2;}END{for(i in a)print i", "a[i];}' awk_file
Abharam, 800
Hari, 600
Name, 0
Ghyansham, 1000
Ram, 600
Shyam, 500
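The "Name, 0" row comes from the header line, and the order of the groups depends on the awk implementation. A possible variation, sketched below, skips the header and pipes the result through sort to get a stable order:

```shell
cat > awk_file <<'EOF'
Name,Marks,Max Marks
Ram,200,1000
Shyam,500,1000
Ghyansham,1000
Abharam,800,1000
Hari,600,1000
Ram,400,1000
EOF

# Skip the header (NR > 1), sum column 2 per name, then sort the lines.
# LC_ALL=C makes the sort order independent of the local language settings.
grouped=$(awk -F, 'NR > 1 { a[$1] += $2 } END { for (i in a) print i ", " a[i] }' awk_file | LC_ALL=C sort)
echo "$grouped"
```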
8) Sum of All Entries of Specific Columns
As we have already seen, awk can sum all the numbers in a column. To print the file followed by a line with the sums of column 2 and column 3, run:
$ awk -F"," '{x+=$2;y+=$3;print}END{print "Total,"x,y}' awk_file
Name,Marks,Max Marks
Ram,200,1000
Shyam,500,1000
Ghyansham,1000
Abharam,800,1000
Hari,600,1000
Ram,400,1000
Total,3500 5000
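In the total line, the two sums are separated by a space because the comma in print joins values with the output field separator (OFS), which defaults to a space. If the total line should be comma-separated like the rest of the file, one option is to set OFS as well. This is a sketch of that variation:

```shell
cat > awk_file <<'EOF'
Name,Marks,Max Marks
Ram,200,1000
Shyam,500,1000
Ghyansham,1000
Abharam,800,1000
Hari,600,1000
Ram,400,1000
EOF

# -v OFS=, makes print join "Total", x and y with commas.
awk -F, -v OFS=, '{ x += $2; y += $3; print } END { print "Total", x, y }' awk_file

# Keep only the last line to inspect the total row.
last=$(awk -F, -v OFS=, '{ x += $2; y += $3; print } END { print "Total", x, y }' awk_file | tail -n 1)
echo "$last"
```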
9) Conditional Processing
You can apply conditions to your commands, such as printing lines that meet specific criteria:
$ awk '$3 > 50 { print $1, $3 }' data.txt
This command prints the first and third fields of lines where the third field is greater than 50.
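The tutorial does not show data.txt, so the sketch below invents a small whitespace-separated file to try the command on. It also shows that conditions can be combined with && (and similarly ||):

```shell
# Hypothetical data.txt: name, subject, score (made up for this example).
cat > data.txt <<'EOF'
alice math 72
bob math 45
carol art 90
EOF

# Lines whose third field is greater than 50.
high=$(awk '$3 > 50 { print $1, $3 }' data.txt)
echo "$high"

# Two conditions combined: subject is "math" AND score is above 50.
math_high=$(awk '$2 == "math" && $3 > 50 { print $1, $3 }' data.txt)
echo "$math_high"
```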
10) Extracting Data from CSV Files
CSV files are common in data processing. AWK can help you extract specific columns effortlessly. Suppose you have a CSV file with columns ‘Name’, ‘Age’, and ‘Country’. You can extract the ‘Name’ and ‘Age’ columns like this:
$ awk -F, '{print $1, $2}' data.csv
The ‘-F,’ flag specifies the field separator (comma in this case). AWK then prints the first and second fields from each line.
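One caveat worth knowing: splitting on a plain comma does not understand CSV quoting, so a quoted field that itself contains a comma is cut in two. The sketch below uses a made-up data.csv to show both the normal case and the pitfall:

```shell
# Hypothetical data.csv, invented for this illustration.
cat > data.csv <<'EOF'
Name,Age,Country
Asha,31,India
"Smith, John",40,UK
EOF

# NR > 1 skips the header row; print the first field of each data row.
names=$(awk -F, 'NR > 1 { print $1 }' data.csv)
echo "$names"
# The second row yields only "Smith (with the opening quote), because the
# comma inside the quoted name is treated as a field separator too.
```

For files with quoted fields, a dedicated CSV parser is generally a safer choice than -F, alone.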
AWK Begin Block
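The code in a BEGIN block runs once, before awk reads any input, which makes it a good place for initialization and for printing headers. The syntax for a BEGIN block is: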
$ awk 'BEGIN{awk initializing code}{actual AWK code}' File-Name
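As a quick illustration of when each block runs, the sketch below feeds three lines to awk through a pipe. BEGIN runs before the first line, the main block runs once per line, and END runs after the last line:

```shell
# BEGIN prints a banner, the main block counts lines, END reports the count.
report=$(printf 'a\nb\nc\n' | awk 'BEGIN { print "start" } { n++ } END { print "lines:", n }')
echo "$report"

# A script that has only a BEGIN block does not read any input at all.
calc=$(awk 'BEGIN { print 2 + 3 }')
echo "$calc"
```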
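The next two examples use a file named datafile. Each line holds five space-separated fields: a name followed by four numbers (total, PPT, Doc and xls).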
11) Print Column Names along with their Corresponding Data
$ awk 'BEGIN{print "Names\ttotal\tPPT\tDoc\txls"}{printf "%-s\t%d\t%d\t%d\t%d\n", $1,$2,$3,$4,$5}' datafile
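Tabs line up only while the values stay short. A possible alternative is to give printf explicit column widths. The datafile rows below are hypothetical placeholders, not the tutorial's original data, and the widths (8 and 6) are arbitrary choices for this sketch:

```shell
# Hypothetical datafile: name, total, PPT, Doc, xls.
cat > datafile <<'EOF'
Anil 10 4 3 3
Bina 7 2 5 0
EOF

# %-8s left-aligns the name in 8 columns; %6s and %6d right-align in 6 columns.
awk 'BEGIN { printf "%-8s%6s%6s%6s%6s\n", "Names", "total", "PPT", "Doc", "xls" }
     { printf "%-8s%6d%6d%6d%6d\n", $1, $2, $3, $4, $5 }' datafile

# Format just the first two fields of the first row to check the widths.
first=$(awk 'NR == 1 { printf "%-8s%6d\n", $1, $2 }' datafile)
echo "$first"
```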
12) Change the Field Separator
Space is the field separator in the datafile. In the example below, we change the output field separator (OFS) from a space to “|”:
$ awk 'BEGIN{OFS="|"}{print $1,$2,$3,$4,$5}' datafile
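OFS is applied only when awk builds the output from separate fields. A bare print outputs the original line untouched, unless a field has been assigned, which makes awk rebuild the line using OFS. The sketch below uses hypothetical datafile rows to show the difference:

```shell
# Hypothetical datafile rows, invented for this illustration.
cat > datafile <<'EOF'
Anil 10 4 3 3
Bina 7 2 5 0
EOF

# A bare print leaves the line as it was read, so OFS has no visible effect.
plain=$(awk 'BEGIN { OFS = "|" } { print }' datafile)
echo "$plain"

# Assigning any field (even to itself) rebuilds $0 with OFS between fields.
rebuilt=$(awk 'BEGIN { OFS = "|" } { $1 = $1; print }' datafile)
echo "$rebuilt"
```

The $1 = $1 form is handy when the number of fields is not known in advance.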
That’s all from this tutorial. I hope you found it informative. Please share your feedback and queries in the comments section below.
this is really nice tutorial,
i was searching for something.
which i got it here.
thanks
Well, this tutorial looks more like a collection of awk one-liners than a tutorial.
Thanx for such a nice and simple tutorial for beginners.
Example 7,8,9 are not working on my putty..can you please help?
The reason some of the awk scripts are not working is the following.
Original statement:
awk -F, ‘{a[$1];}END{for (i in a)print i;}’ awkfile.txt
Correction Statement:
awk -F, '{a[$1];}END{for (i in a)print i;}' awkfile.txt
The difference is.
Original:
‘{a[$1];}END{for (i in a)print i;}’
Correction:
'{a[$1];}END{for (i in a)print i;}'
Note the quotes: the shell needs straight single quotes (').