CS155 Commands2
Vocabulary | script, wc, grep, sort, cut, uniq |
---|---|
Punctuation | ; |
Grammar | command [option]... [argument]... [redirection] |
man: show manual for command
info: show info page for command
man and info can answer many of your questions.
Consider them travel guides.
script records a terminal session to a file named typescript.
To record to a different file, run script filename.
To stop recording, type exit.
wc is used to get information about the contents of a file.
    % cd ~/pub
    % cat dwarfs.txt
    Grumpy is in love
    Sneezy has a red nose
    Happy — plump
    Doc — glasses
    Bashful doesn’t end in “y”
    Sleepy — somnambulent
    Dopey — unbearded
    % wc dwarfs.txt
    7 26 149 dwarfs.txt
That’s 7 lines, 26 words, and 149 bytes. A byte is the same as a character, more or less.
Note that a byte means a letter, a number, a space, a dot, a newline character—everything that takes up room. Spaces and newlines count!
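The "more or less" can be seen with a character that needs more than one byte. The sketch below is only an illustration: the sample text is made up, and it assumes UTF-8 encoding, written with octal escapes so the bytes are explicit. The character count from wc -m depends on the locale in effect, so no particular value is claimed for it.

```shell
# Hypothetical sample: "caf", then the two-byte UTF-8 sequence for e-acute,
# then a newline. Octal escapes keep the bytes independent of the editor.
sample=$(mktemp)
printf 'caf\303\251\n' > "$sample"

# wc -c counts bytes: c, a, f, two bytes for the accented letter, one newline.
bytes=$(wc -c < "$sample" | tr -d ' ')
echo "bytes: $bytes"

# wc -m counts characters; the result depends on the current locale.
chars=$(wc -m < "$sample" | tr -d ' ')
echo "characters: $chars"

rm -f "$sample"
```

For plain ASCII text such as dwarfs.txt without special punctuation, the two counts are the same.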
    % cd ~/pub
    % wc -c dwarfs.txt
    149 dwarfs.txt
    % wc -l dwarfs.txt
    7 dwarfs.txt
    % wc -cw dwarfs.txt
    26 149 dwarfs.txt
Individual counts can be extracted with options:
-c: print the byte count
-l: print the line count
-m: print the character count
-w: print the word count
If multiple files are listed on the command line,
then multiple counts are computed, and a total is also given.
wc is useful when combined with other commands using | (the pipe symbol).
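A minimal sketch of both points, using two small made-up files in a scratch directory (the file names and contents are assumptions, not files from ~/pub):

```shell
# Scratch directory with two small, made-up files.
dir=$(mktemp -d)
printf 'Grumpy\nSneezy\nHappy\n' > "$dir/a.txt"
printf 'Doc\nBashful\n' > "$dir/b.txt"

# Several files on the command line: one count per file, then a total line.
wc -l "$dir/a.txt" "$dir/b.txt"

# Reading from a pipe: count the lines that end in "y".
y_names=$(cat "$dir/a.txt" "$dir/b.txt" | grep 'y$' | wc -l | tr -d ' ')
echo "names ending in y: $y_names"

rm -rf "$dir"
```

When wc reads from a pipe it has no file name to print, so only the count appears.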
grep is an extremely useful command (it’s also extremely complicated).
The simplest use is to search for an exact piece of text in a list of files.
    % grep 'el' ~/pub/greek
    Delta
    % grep "rty-t" ~/pub/numbers
    thirty-two
    thirty-three
    forty-two
    forty-three
If you searched multiple files, then the output is filename:line of text. That way, you know which output came from which file.
    % cd ~/pub
    % grep "el" greek numbers
    greek:Delta
    numbers:eleven
    numbers:twelve
The -n option causes the line number to be printed:
    % cd ~/pub
    % grep -n 'el' greek numbers
    greek:4:Delta
    numbers:11:eleven
    numbers:12:twelve
UPPER/lower-case matters:
    % cd ~/pub
    % grep "et" greek
    Beta
    Zeta
    Theta
    % grep "Et" greek
    Eta
    % grep -i "et" greek
    Beta
    Zeta
    Eta
    Theta
The -i option makes the search case-insensitive.
Some symbols can be used for searching for inexact patterns.
Alas, * and ? have different meanings than their use in wildcards.
Pattern | Meaning in grep | Pattern | Meaning |
---|---|---|---|
. | Any single character | ^ | start of line |
[aeiou] | a single vowel | $ | end of line |
[aeiou]* | zero or more vowels | \? | zero or one of what came before |
[a-z] | a lowercase letter | * | zero or more of what came before |
As a wildcard (filename) pattern, ab*c means “ab, followed by anything of any length, and then a c, ending the filename”.
It matches abc or ab-my-name-is-mac.
It does not match ac, xabc, or abcy.
As a grep pattern (or regular expression), that means “starting anywhere in the line, an a, followed by any number of b’s (including none), followed by a c”.
It matches ac, 12ac34, or 5abbbbbbc6789.
It does not match axc or abxc.
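The difference between the two readings of ab*c can be tried out directly. This is a sketch under assumptions: the empty files are created in a scratch directory just for the demonstration, and the input lines fed to grep are the sample strings from the text.

```shell
# Scratch directory holding empty files with the names discussed above.
dir=$(mktemp -d)
touch "$dir/abc" "$dir/ab-my-name-is-mac" "$dir/ac" "$dir/xabc" "$dir/abcy"

# As a wildcard, the shell expands ab*c to the matching file names.
glob_matches=$(cd "$dir" && echo ab*c)
echo "wildcard: $glob_matches"

# As a grep pattern, ab*c is searched for inside each line of input.
regex_matches=$(printf '%s\n' ac 12ac34 5abbbbbbc6789 axc abxc | grep 'ab*c')
echo "grep:"
echo "$regex_matches"

rm -rf "$dir"
```

The single quotes around the grep pattern matter: they stop the shell from treating the * as a wildcard before grep ever sees it.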
Command | Matches |
---|---|
grep 'e[as]t' my_file | “eat” or “west” but not “east” |
grep 'b[oi]y' *html | “boy” but not “BOY” |
grep 'windows* ' *html | “window ” and “windows ” |
grep 'window.' *html | “window-” and “window,” and “windows” |
grep '^Jack' foo | “Jack” only at the start of the line |
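A few rows of the table can be checked against a handful of input lines. The input below is made up for illustration and is fed to grep through a pipe rather than from my_file, the html files, or foo.

```shell
# Made-up input, one entry per line.
words=$(printf '%s\n' eat west east boy BOY 'Jack Sprat' 'Hit the road, Jack')

# e[as]t needs an e, then an a or an s, then a t.
class_hits=$(echo "$words" | grep 'e[as]t')
echo "$class_hits"

# Matching is case-sensitive unless -i is given.
boy_hits=$(echo "$words" | grep 'b[oi]y')
echo "$boy_hits"

# ^ anchors the match to the start of the line.
jack_hits=$(echo "$words" | grep '^Jack')
echo "$jack_hits"
```

The word east is skipped by the first pattern because the class [as] stands for exactly one character, not two.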
.
    % cat ~/pub/greek
    Alpha
    Beta
    Gamma
    Delta
    Epsilon
    Zeta
    Eta
    Theta
    Iota
    Kappa
    Lambda
    Mu
    Nu
    Xi
    Omicron
    Pi
    Rho
    Sigma
    Tau
    Upsilon
    Phi
    Chi
    Psi
    Omega
A dot (.) matches any single character:
    % grep "e.a" ~/pub/greek
    Beta
    Zeta
    Theta
    Omega
    % grep "e..a" ~/pub/greek
    Delta
    % grep "......." ~/pub/greek
    Epsilon
    Omicron
    Upsilon
[…]
A character class ([aeiou]) matches any single character within it.
    % grep "[JACK]" ~/pub/greek
    Alpha
    Kappa
    Chi
    % grep "[BCD]" ~/pub/greek
    Beta
    Delta
    Chi
    % grep "[B-D]" ~/pub/greek
    Beta
    Delta
    Chi
^
^ matches at the beginning of the line.
    % grep -i "o" ~/pub/greek
    Epsilon
    Iota
    Omicron
    Rho
    Upsilon
    Omega
    % grep -i "^o" ~/pub/greek
    Omicron
    Omega
$
$ matches at the end of the line.
    % grep -i "u" ~/pub/greek
    Mu
    Nu
    Tau
    Upsilon
    % grep -i "u$" ~/pub/greek
    Mu
    Nu
    Tau
\?
\? matches zero or one of what came before.
    % grep 'et' ~/pub/greek
    Beta
    Zeta
    Theta
    % grep 'elt' ~/pub/greek
    Delta
    % grep 'el\?t' ~/pub/greek
    Beta
    Delta
    Zeta
    Theta
*
* matches zero or more of what came before.
    % grep 'ma' ~/pub/greek
    Gamma
    Sigma
    % grep 'm.*a' ~/pub/greek
    Gamma
    Lambda
    Sigma
    Omega
    % grep 'm[a-d]*a' ~/pub/greek
    Gamma
    Lambda
    Sigma
sort reorders lines of text lexicographically.
This means lines are compared character by character, from the first character to the last, as in a dictionary.
Usage: sort [OPTION]... [FILE]...
-n: makes sort do a numeric sort
-r: reverses the result of the comparisons
-u: removes duplicates while sorting
-t'delimiter' -k pos1,pos2: can be used to sort by column
Uniqueness while sorting by column is only checked on the field(s) specified by -k.
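The column options can be combined as in the sketch below. The colon-delimited records (name:group:score) are invented for the example and do not come from ~/pub.

```shell
# Made-up colon-delimited records: name:group:score
data=$(mktemp)
printf '%s\n' 'doc:red:30' 'happy:blue:4' 'dopey:red:100' 'sleepy:blue:25' > "$data"

# Numeric, reversed sort on the third field only.
by_score=$(sort -t':' -k3,3 -n -r "$data")
echo "$by_score"

# With -u, lines whose key (field 2) compares equal count as duplicates,
# so one line per group survives. Only the group name is kept here.
groups=$(sort -t':' -k2,2 -u "$data" | cut -d':' -f2)
echo "$groups"

rm -f "$data"
```

Writing the key as -k3,3 rather than -k3 limits the comparison to field 3 alone; -k3 would compare from field 3 to the end of the line.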
    % cat ~/pub/dwarfs.txt
    Grumpy is in love
    Sneezy has a red nose
    Happy — plump
    Doc — glasses
    Bashful doesn’t end in “y”
    Sleepy — somnambulent
    Dopey — unbearded
    % sort ~/pub/dwarfs.txt
    Bashful doesn’t end in “y”
    Doc — glasses
    Dopey — unbearded
    Grumpy is in love
    Happy — plump
    Sleepy — somnambulent
    Sneezy has a red nose
    % sort -r ~/pub/dwarfs.txt
    Sneezy has a red nose
    Sleepy — somnambulent
    Happy — plump
    Grumpy is in love
    Dopey — unbearded
    Doc — glasses
    Bashful doesn’t end in “y”
    % cat my_file2
    1234
    10
    10000
    5679
    % sort my_file2
    10
    10000
    1234
    5679
    % sort -n my_file2
    10
    1234
    5679
    10000
    % sort -n <my_file2
    (same as above)
    % cat my_file2 | sort -n
By default, sort sorts alphabetically.
The -n option says to sort numerically.
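sort does not change its input file.
Instead, it writes a sorted version to the output stream.
This could be your screen, or you could use > to save it somewhere.
Do not redirect the output to the input file itself, as in sort foo >foo: the shell empties foo before sort gets to read it.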
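Redirecting with > to the same file that sort is reading leaves that file empty, because the shell truncates it before sort starts. Two common ways around this are sketched below; the file names and contents are made up, and the -o option (write the result to the named file) is standard sort behavior rather than something shown on these slides.

```shell
# Scratch directory with a small unsorted file.
dir=$(mktemp -d)
printf '%s\n' pear apple fig > "$dir/foo"

# Option 1: write to a second file, then replace the original.
sort "$dir/foo" > "$dir/foo.sorted" && mv "$dir/foo.sorted" "$dir/foo"

# Option 2: -o lets sort write its result back to one of its input files.
printf '%s\n' pear apple fig > "$dir/bar"
sort -o "$dir/bar" "$dir/bar"

foo_sorted=$(cat "$dir/foo")
bar_sorted=$(cat "$dir/bar")
echo "$foo_sorted"

rm -rf "$dir"
```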
cut allows us to select columns from a file.
Options:
-d "delimiter"
-f field-list
Examples:
    cut -d";" -f3 filename
    cut -f2,3,5,7 -d"/" filename
    grep "x" filename | cut -d"," -f1,3,5-7,9-
    % cat data
    Alpha,Beta,Gamma,Delta,Epsilon,Zeta
    Eta,Theta,Iota,Kappa,Lambda,Mu,Nu,Xi
    Omicron,Pi,Rho,Sigma,Tau,Upsilon,Phi
    Chi,Psi,Omega
    % cut -d"," -f 3 data
    Gamma
    Iota
    Rho
    Omega
    % cut -d"," -f 5 data
    Epsilon
    Lambda
    Tau

    % cut -d"," -f 2,4 data
    Beta,Delta
    Theta,Kappa
    Pi,Sigma
    Psi
The last line of output from -f 5 is empty, because the line Chi,Psi,Omega has no fifth field.
    % cut -d"," -f 2-4 data
    Beta,Gamma,Delta
    Theta,Iota,Kappa
    Pi,Rho,Sigma
    Psi,Omega
    % cut -d"," -f 1-3,5-7 data
    Alpha,Beta,Gamma,Epsilon,Zeta
    Eta,Theta,Iota,Lambda,Mu,Nu
    Omicron,Pi,Rho,Tau,Upsilon,Phi
    Chi,Psi,Omega
cut can also use -c character-list to obtain ranges of characters (not fields). No delimiter is needed.
    % cut -c5-12 data
    a,Beta,G
    Theta,Io
    ron,Pi,R
    Psi,Omeg
    % cut -c1-3,5,7-10 data
    AlpaBeta
    EtaTeta,
    Omirn,Pi
    ChiPi,Om
    % cut -c10- data
    a,Gamma,Delta,Epsilon,Zeta
    ,Iota,Kappa,Lambda,Mu,Nu,Xi
    i,Rho,Sigma,Tau,Upsilon,Phi
    mega
uniq removes repeated lines from a sorted file.
It only compares adjacent lines, which is why the input should be sorted first.
uniq can also print only the lines that occur exactly once (with -u) or only those that are repeated (with -d).
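A short sketch of why sorting comes first, and of the -u and -d options. The list of colors is made up for illustration.

```shell
# Made-up input with repeated values that are never on adjacent lines.
list=$(printf '%s\n' red blue red green blue red)

# Without sorting, uniq finds no adjacent repeats and removes nothing.
unsorted=$(echo "$list" | uniq)

# After sorting, repeats are adjacent and collapse to one line each.
deduped=$(echo "$list" | sort | uniq)

# -u keeps only lines that occur once; -d keeps one copy of repeated lines.
only_once=$(echo "$list" | sort | uniq -u)
repeated=$(echo "$list" | sort | uniq -d)

echo "deduped: $(echo $deduped)"
echo "once:    $only_once"
echo "repeats: $(echo $repeated)"
```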
The combination of sort, cut, and uniq is a powerful tool for text manipulation in Unix.
Consider the example file.
    user1 4125142 passwd
    user3 1415511 f#afk@
    user2 9999999 p_2ad(
    user4 1415511 m#@!ad
    user5 0011292 lkdfaa
We want to find out how many unique ID numbers there are and also get a list of names and passwords sorted by user name.
    % cat my_file3
    user1 4125142 passwd
    user3 1415511 f#afk@
    user2 9999999 p_2ad(
    user4 1415511 m#@!ad
    user5 0011292 lkdfaa
    % cut -f2 -d" " my_file3
    4125142
    1415511
    9999999
    1415511
    0011292
    % cut -f2 -d" " my_file3 | sort -n
    0011292
    1415511
    1415511
    4125142
    9999999
    % cut -f2 -d" " my_file3 | sort -n | uniq
    0011292
    1415511
    4125142
    9999999
    % sort my_file3
    user1 4125142 passwd
    user2 9999999 p_2ad(
    user3 1415511 f#afk@
    user4 1415511 m#@!ad
    user5 0011292 lkdfaa
    % sort my_file3 | cut -f1,3 -d" "
    user1 passwd
    user2 p_2ad(
    user3 f#afk@
    user4 m#@!ad
    user5 lkdfaa
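To get the count of unique ID numbers as a single number instead of a list, the pipeline can end in wc -l. The sketch below recreates the five records in a temporary file (the temporary file is an assumption standing in for my_file3) and also shows uniq -d as one possible way to list the IDs that more than one user shares.

```shell
# The same five records as my_file3, written to a scratch file.
f=$(mktemp)
cat > "$f" <<'EOF'
user1 4125142 passwd
user3 1415511 f#afk@
user2 9999999 p_2ad(
user4 1415511 m#@!ad
user5 0011292 lkdfaa
EOF

# Extract the ID column, sort it, drop repeats, and count what is left.
unique_ids=$(cut -f2 -d" " "$f" | sort -n | uniq | wc -l | tr -d ' ')
echo "unique IDs: $unique_ids"

# IDs that appear on more than one line.
shared_ids=$(cut -f2 -d" " "$f" | sort -n | uniq -d)
echo "shared: $shared_ids"

rm -f "$f"
```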