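Regular expressions are patterns, built from ordinary and special characters, that describe the text you want to find. They let you search data for complex patterns rather than only fixed strings.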
GREP (Global Regular Expression Print): It searches a file for a particular pattern of characters and displays all lines that contain that pattern.
# Search for a word (root) in a file
ubuntu@ip-172-31-1-88:~$ grep root /etc/passwd
# Search for a word (root) case-insensitively in a file
ubuntu@ip-172-31-1-88:~$ grep -i Root /etc/passwd
# Search for a word (root) in multiple files
ubuntu@ip-172-31-1-88:~$ grep root /etc/passwd /etc/group
# Invert the match, i.e. output all lines except those that contain the string 'root'
ubuntu@ip-172-31-1-88:~$ grep -v root /etc/passwd
# Display the number of lines that match a string (root) in a file
ubuntu@ip-172-31-1-88:~$ grep -c root /etc/passwd
# Display the names of the files that contain the string (root)
ubuntu@ip-172-31-1-88:~$ grep -l root /etc/passwd /etc/shadow
# Display the names of the files that do not contain the string (root)
ubuntu@ip-172-31-1-88:~$ grep -L root /etc/passwd /etc/shadow
# Display matching lines together with their line numbers
ubuntu@ip-172-31-1-88:~$ grep -n root /etc/passwd
# Display the lines that start with a string (root)
ubuntu@ip-172-31-1-88:~$ grep '^root' /etc/passwd
# Display the lines that end with a string (/bin/bash)
ubuntu@ip-172-31-1-88:~$ grep '/bin/bash$' /etc/passwd
# Search for a string (root) and write the output to a new file (find.txt)
ubuntu@ip-172-31-1-88:~$ grep root /etc/passwd > devops/find.txt
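Most of the grep commands in this section search for a fixed word. As a minimal sketch of pattern matching proper, the example below writes a small scratch file and counts the lines that match a few regular expressions. The file contents and variable names are made up for illustration, and the sketch assumes a grep that supports -E (extended regular expressions) and a mktemp command that works without arguments.

```shell
# Create a scratch file with hypothetical account entries (illustrative data only).
sample=$(mktemp)
cat > "$sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
steve:x:1001:1001::/home/steve:/bin/bash
backup:x:34:34:backup:/var/backups:/usr/sbin/nologin
EOF

# ^ anchors the pattern to the start of the line.
starts_with_root=$(grep -c '^root' "$sample")

# $ anchors the pattern to the end of the line.
bash_users=$(grep -c '/bin/bash$' "$sample")

# -E enables extended regular expressions; | means "either alternative".
root_or_steve=$(grep -E -c '^(root|steve):' "$sample")

# [0-9]{4} matches exactly four digits; the colons require a whole four-digit field.
four_digit_field=$(grep -E -c ':[0-9]{4}:' "$sample")

echo "start with root:  $starts_with_root"
echo "end in /bin/bash: $bash_users"
echo "root or steve:    $root_or_steve"
echo "four-digit field: $four_digit_field"

rm -f "$sample"
```

Quoting each pattern in single quotes keeps the shell from interpreting characters such as $, | and the parentheses before grep sees them.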
Find: It searches for and lists files and directories that match the conditions you specify. Find supports many kinds of conditions: you can find files by permissions, user, group, file type, date, size, and so on. It is one of the most frequently used commands on Linux systems.
# Find files under /home whose names start with "new" (quote the pattern so the shell does not expand it)
ubuntu@ip-172-31-87-84:~$ find /home -name "new*"
# Find files with the SUID bit set (-perm 4755 would match only a mode of exactly 4755)
ubuntu@ip-172-31-87-84:~$ find /var -perm -4000
# Find files with the SGID bit set
ubuntu@ip-172-31-87-84:~$ find /var -perm -2000
# Find files or directories with the sticky bit set
ubuntu@ip-172-31-87-84:~$ find /var -perm -1000
# Search for files owned by a user (steve)
ubuntu@ip-172-31-87-84:~$ find /var -user steve
# Search for files owned by a group (steve)
ubuntu@ip-172-31-87-84:~$ find /var -group steve
# Search for files smaller than 10MB in a folder (/tmp)
ubuntu@ip-172-31-87-84:~$ find /tmp -size -10M
# Search for files larger than 10MB in a folder (/tmp)
ubuntu@ip-172-31-87-84:~$ find /tmp -size +10M
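Conditions given to find can also be combined in one command. The sketch below builds a throwaway directory tree and searches it by type, name, and size. All file and directory names are hypothetical, and it assumes a find that accepts the k size suffix (as GNU and BSD find do) and a mktemp that supports -d.

```shell
# Build a scratch directory tree (all names here are hypothetical).
dir=$(mktemp -d)
mkdir "$dir/logs"
printf 'hello\n' > "$dir/new_notes.txt"
printf 'hello\n' > "$dir/old_notes.txt"
# Create a 5 KiB file filled with zero bytes.
dd if=/dev/zero of="$dir/logs/new_big.log" bs=1024 count=5 2>/dev/null

# Quote the pattern so the shell passes it to find unexpanded.
# tr strips the padding that some wc implementations add.
new_count=$(find "$dir" -type f -name 'new*' | wc -l | tr -d ' ')

# Conditions are ANDed by default: regular files named new* AND larger than 2 KiB.
big_new=$(find "$dir" -type f -name 'new*' -size +2k)

# -type d restricts the match to directories.
logs_dir=$(find "$dir" -type d -name 'logs')

echo "files named new*: $new_count"
echo "large new* file:  $big_new"
echo "directory found:  $logs_dir"

rm -rf "$dir"
```

Here two files match the name test, but only new_big.log also passes the size test.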
WC (Word count): It counts the lines, words, and bytes in a file.
# Count the number of lines in a file
ubuntu@ip-172-31-87-84:~$ wc -l /etc/passwd
# Count the number of words in a file
ubuntu@ip-172-31-87-84:~$ wc -w /etc/passwd
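wc is often used at the end of a pipeline or with redirected input, where it prints only the number and no file name. A small sketch, using a scratch file whose contents are invented for the example:

```shell
# Scratch file with three lines and six words (illustrative content).
f=$(mktemp)
printf 'one two\nthree four five\nsix\n' > "$f"

# Reading from standard input makes wc print the count without a file name.
# tr strips the padding that some wc implementations add.
lines=$(wc -l < "$f" | tr -d ' ')
words=$(wc -w < "$f" | tr -d ' ')
bytes=$(wc -c < "$f" | tr -d ' ')

# Count the lines another command produces: here, lines containing the letter o.
with_o=$(grep 'o' "$f" | wc -l | tr -d ' ')

echo "lines=$lines words=$words bytes=$bytes lines_with_o=$with_o"

rm -f "$f"
```

This prints lines=3 words=6 bytes=28 lines_with_o=2.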
Head: It displays the first lines of a file (10 by default).
# Display the first 10 lines of the file
ubuntu@ip-172-31-87-84:~$ head /etc/passwd
# Display a specific number of lines from the top of the file
ubuntu@ip-172-31-87-84:~$ head -n 5 /etc/passwd
Tail: It displays the last lines of a file (10 by default).
# Display the last 10 lines of the file
ubuntu@ip-172-31-87-84:~$ tail /etc/passwd
# Display a specific number of lines from the bottom of the file
ubuntu@ip-172-31-87-84:~$ tail -n 8 /etc/passwd
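head and tail can be chained to pull a range of lines out of the middle of a file. The following sketch uses a six-line scratch file (contents invented for the example) and assumes a tail that supports the standard -n +N form.

```shell
# Scratch file with six numbered lines (illustrative content).
f=$(mktemp)
printf '%s\n' line1 line2 line3 line4 line5 line6 > "$f"

# Lines 3 to 5: keep the first five lines, then the last three of those.
middle=$(head -n 5 "$f" | tail -n 3)

# tail -n +N starts the output at line N instead of counting from the end.
from_fourth=$(tail -n +4 "$f")

# First and last line of the file.
first=$(head -n 1 "$f")
last=$(tail -n 1 "$f")

echo "$middle"
echo "---"
echo "$from_fourth"

rm -f "$f"
```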
Sed (Stream Editor): It is used to parse and transform text. It can perform many operations on a file, such as searching, find and replace, insertion, and deletion, though its most common use is substitution (find and replace). Because sed edits files without opening them in an editor, it is often a much quicker way to find and replace something than opening the file in the vi editor and changing it by hand.
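# Create a sample file (press Ctrl+D to end the input)
ubuntu@ip-172-31-87-84:~$ cat > sample.txt
Hello to the world of unix.
# Replace unix with linux; the result is printed and sample.txt itself is not modified
ubuntu@ip-172-31-87-84:~$ sed 's/unix/linux/' sample.txt
Hello to the world of linux.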
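Beyond a single substitution, a few other sed operations come up often: replacing every match on a line, deleting lines, and changing the file itself. The sketch below shows each on a scratch file whose contents are invented for the example. It assumes a sed that accepts -i with an attached backup suffix (the -i.bak form, which both GNU and BSD sed accept).

```shell
# Scratch file (illustrative content).
f=$(mktemp)
printf 'unix is great. unix is free.\n# a comment line\nlearn unix\n' > "$f"

# Without the g flag only the first match on each line is replaced.
first_only=$(sed 's/unix/linux/' "$f" | head -n 1)

# With the g flag every match on the line is replaced.
all_matches=$(sed 's/unix/linux/g' "$f" | head -n 1)

# /pattern/d deletes matching lines from the output (here: lines starting with #).
kept_lines=$(sed '/^#/d' "$f" | wc -l | tr -d ' ')

# -i edits the file in place; the .bak suffix keeps a backup of the original.
sed -i.bak 's/unix/linux/g' "$f"
# grep -c exits non-zero when nothing matches, so "|| true" keeps the script going.
remaining=$(grep -c 'unix' "$f" || true)

echo "$first_only"
echo "$all_matches"
echo "lines kept after delete: $kept_lines"
echo "lines still containing unix: $remaining"

rm -f "$f" "$f.bak"
```

Keeping a backup suffix is a cautious default, since an in-place edit with a wrong pattern cannot otherwise be undone.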
Awk: The name Awk comes from the surnames of its developers: Aho, Weinberger, and Kernighan. It is a utility and scripting language for text-processing tasks, from simple to complex. The most common awk action is 'print'.
1. AWK Operations:
a. Scans a file line by line
b. Splits each input line into fields
c. Compares input line/fields to pattern
d. Performs actions on matched lines
2. Useful For:
a. Transform data files
b. Produce formatted reports
3. Programming Constructs:
a. Format output lines
b. Arithmetic and string operations
c. Conditionals and loops
Example:
Consider the following text file as the input file for all cases below:
ubuntu@ip-172-31-87-84:~$ cat > employee.txt
Joe manager account 45000
Cavin clerk account 25000
Brian manager sales 50000
Noel manager account 47000
Tarun peon sales 15000
Danny clerk sales 23000
Steve peon sales 13000
Mark director purchase 80000
# Default behaviour of awk: print every line in the file
ubuntu@ip-172-31-87-84:~$ awk '{print}' employee.txt
Joe manager account 45000
Cavin clerk account 25000
Brian manager sales 50000
Noel manager account 47000
Tarun peon sales 15000
Danny clerk sales 23000
Steve peon sales 13000
Mark director purchase 80000
# Print the lines that match the given pattern (/manager/)
ubuntu@ip-172-31-87-84:~$ awk '/manager/ {print}' employee.txt
Joe manager account 45000
Brian manager sales 50000
Noel manager account 47000
# Print columns $1 and $4
ubuntu@ip-172-31-87-84:~$ awk '{print $1,$4}' employee.txt
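Patterns in awk are not limited to regular expressions: a pattern can compare a field against a value, and actions can do arithmetic and formatted output. The following sketch recreates the employee data in a scratch file (the variable names are chosen for the example) and shows one case of each.

```shell
# Recreate the employee data in a scratch file.
emp=$(mktemp)
cat > "$emp" <<'EOF'
Joe manager account 45000
Cavin clerk account 25000
Brian manager sales 50000
Noel manager account 47000
Tarun peon sales 15000
Danny clerk sales 23000
Steve peon sales 13000
Mark director purchase 80000
EOF

# Comparison pattern: names of employees whose salary (field 4) exceeds 40000.
high_paid=$(awk '$4 > 40000 {print $1}' "$emp")

# Arithmetic: add up the salaries of the sales department, print the sum at the end.
sales_total=$(awk '$3 == "sales" {sum += $4} END {print sum}' "$emp")

# Formatted output: printf pads the name and right-aligns the salary for each clerk.
clerk_report=$(awk '$2 == "clerk" {printf "%-6s %6d\n", $1, $4}' "$emp")

echo "$high_paid"
echo "sales total: $sales_total"
echo "$clerk_report"

rm -f "$emp"
```

With this data the first command prints Joe, Brian, Noel, and Mark, and the sales total is 101000.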
Built-In Variables In Awk
Awk’s built-in variables include the field variables $1, $2, $3, and so on, which hold the individual words or pieces (called fields) of the current line; $0 holds the entire line.
NR: The NR variable keeps a running count of the number of input records read so far. Remember that records are usually lines. Awk runs its pattern/action statements once for each record in a file.
NF: The NF variable holds the number of fields in the current input record.
Examples:
Use of NR built-in variables (Display Line Number)
$ awk '{print NR,$0}' employee.txt
1 Joe manager account 45000
2 Cavin clerk account 25000
3 Brian manager sales 50000
4 Noel manager account 47000
5 Tarun peon sales 15000
6 Danny clerk sales 23000
7 Steve peon sales 13000
8 Mark director purchase 80000
Use of NF built-in variables (Display Last Field)
$ awk '{print $1,$NF}' employee.txt
Joe 45000
Cavin 25000
Brian 50000
Noel 47000
Tarun 15000
Danny 23000
Steve 13000
Mark 80000
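NR and NF can also appear in patterns, in expressions, and as loop bounds. The sketch below is one way to show this, again on a scratch copy of the employee data, with variable names chosen for the example.

```shell
# Recreate the employee data in a scratch file.
emp=$(mktemp)
cat > "$emp" <<'EOF'
Joe manager account 45000
Cavin clerk account 25000
Brian manager sales 50000
Noel manager account 47000
Tarun peon sales 15000
Danny clerk sales 23000
Steve peon sales 13000
Mark director purchase 80000
EOF

# In the END block NR holds the total number of records read.
total=$(awk 'END {print NR}' "$emp")

# NR in a pattern selects records by position: here lines 2 and 3.
second_third=$(awk 'NR >= 2 && NR <= 3 {print $1}' "$emp")

# $(NF-1) is the second-to-last field (the department in this file).
first_dept=$(awk 'NR == 1 {print $(NF-1)}' "$emp")

# A for loop from NF down to 1 prints the fields of line 1 in reverse order.
reversed=$(awk 'NR == 1 {for (i = NF; i >= 1; i--) printf "%s%s", $i, (i > 1 ? " " : "\n")}' "$emp")

echo "records: $total"
echo "$second_third"
echo "department of first record: $first_dept"
echo "$reversed"

rm -f "$emp"
```

The last command prints 45000 account manager Joe.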