11. Real Life Examples
UNIX' core idea is that there are many simple commands that can linked
together via piping and redirection to accomplish even really complex tasks.
Have a look at the following examples. I'll only explain the most complex
ones; for the others, please study the above sections and the man pages.
Problem: ls is too quick and the file names fly away.
Solution:
Problem: I have a file containing a list of words. I want to sort it
in reverse order and print it.
Solution:
$ cat myfile.txt | sort -r | lpr |
Problem: my data file has some repeated lines! How do I get rid of them?
Solution:
$ sort datafile.dat | uniq > newfile.dat |
Problem: I have a file called 'mypaper.txt' or 'mypaper.tex' or some
such somewhere, but I don't remember where I put it. How do I find it?
Solution:
$ find ~ -name "mypaper*" |
Explanation: find is a very useful command that lists all the files
in a directory tree (starting from ˜ in this case). Its output
can be filtered to meet several criteria, such as -name.
Problem: I have a text file containing the word 'entropy' in this
directory, is there anything like SEARCH?
Solution: yes, try
Problem: somewhere I have text files containing the word 'entropy', I'd
like to know which and where they are. Under VMS I'd use search entropy
[...]*.*;*, but grep can't recurse subdirectories. Now what?
Solution:
$ find . -exec grep -l "entropy" {} \; 2> /dev/null |
Explanation: find . outputs all the file names starting from the
current directory, -exec grep -l "entropy" is an action to be
performed on each file (represented by {}), \
terminates the command. If you think this syntax is awful, you're right.
In alternative, write the following script:
#!/bin/sh
# rgrep: recursive grep
if [ $# != 3 ]
then
echo "Usage: rgrep --switches 'pattern' 'directory'"
exit 1
fi
find $3 -name "*" -exec grep $1 $2 {} \; 2> /dev/null |
Explanation: grep works like search, and combining it with
find we get the best of both worlds.
Problem: I have a data file that has two header lines, then every
line has 'n' data, not necessarily equally spaced. I want the 2nd and
5th data value of each line. Shall I write a Fortran program...?
Solution: nope. This is quicker:
$ awk 'NL > 2 {print $2, "\t", $5}' datafile.dat > newfile.dat |
Explanation: the command awk is actually a programming language:
for each line starting from the third in datafile.dat, print out
the second and fifth field, separated by a tab. Learn some awk---it
saves a lot of time.
Problem: I've downloaded an FTP site's ls-lR.gz to check its
contents. For each subdirectory, it contains a line that reads "total xxxx",
where xxxx is size in kbytes of the dir contents. I'd like to get the grand
total of all these xxxx values.
Solution:
$ zcat ls-lR.gz | awk ' $1 == "total" { i += $2 } END {print i}' |
Explanation: zcat outputs the contents of the .gz file and pipes
to awk, whose man page you're kindly requested to read ;-)
Problem: I've written a Fortran program, myprog, to calculate a
parameter from a data file. I'd like to run it on hundreds of data files
and have a list of the results, but it's a nuisance to ask each time for
the file name. Under VMS I'd write a lengthy command file, and under Linux?
Solution: a very short script. Make your program look for the data
file 'mydata.dat' and print the result on the screen (stdout), then
write the following script:
#!/bin/sh
# myprog.sh: run the same command on many different files
# usage: myprog.sh *.dat
for file in $* # for all parameters (e.g. *.dat)
do
# append the file name to result.dat
echo -n "${file}: " >> results.dat
# copy current argument to mydata.dat, run myprog
# and append the output to results.dat
cp ${file} mydata.dat ; myprog >> results.dat
done |
Problem: I want to replace `geology' with `geophysics' in all my
text files. Shall I edit them all manually?
Solution: nope. Write this shell script:
#!/bin/sh
# replace $1 with $2 in $*
# usage: replace "old-pattern" "new-pattern" file [file...]
OLD=$1 # first parameter of the script
NEW=$2 # second parameter
shift ; shift # discard the first 2 parameters: the next are the file names
for file in $* # for all files given as parameters
do
# replace every occurrence of OLD with NEW, save on a temporary file
sed "s/$OLD/$NEW/g" ${file} > ${file}.new
# rename the temporary file as the original file
/bin/mv ${file}.new ${file}
done |
Problem: I have some data files, I don't know their length and have to
remove their last but one and last but two lines. Er... manually?
Solution: no, of course. Write this script:
#!/bin/sh
# prune.sh: removes n-1th and n-2th lines from files
# usage: prune.sh file [file...]
for file in $* # for every parameter
do
LINES=`wc -l $file | awk '{print $1}'` # number of lines in file
LINES=`expr $LINES - 3` # LINES = LINES - 3
head -n $LINES $file > $file.new # output first LINES lines
tail -n 1 $file >> $file.new # append last line
done |
I hope these examples whetted your appetite...