Wednesday, December 18, 2013

unix: an example of | (pipe) abuse

Something that really makes me mad: people using pipes where none is needed. The most common example is:

cat foo.txt | grep bar

to list the lines of the file named "foo.txt" that contain the string "bar".

The correct command should read:

grep bar foo.txt
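
A minimal sketch to check that the two forms print the same lines. The temporary directory and the sample contents of foo.txt are made up for illustration:

```shell
# Work in a throwaway directory so no real files are touched.
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT

# Hypothetical sample data: two of the three lines contain "bar".
printf 'foo\nbar\nfoobar\n' > "$dir/foo.txt"

with_cat=$(cat "$dir/foo.txt" | grep bar)   # two processes and a pipe
direct=$(grep bar "$dir/foo.txt")           # one process, same lines

if [ "$with_cat" = "$direct" ]; then
    echo "same output"
fi
```

Both variables should hold the same two matching lines, so the pipe buys nothing here.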

If you use cat filename | command, the drawbacks are:

  • You create two processes where one is enough.
  • Communication between the cat process and the grep process is slow: the data is moved twice, once from the disk into the cat process's address space, then again through the pipe into the grep process's address space.
  • You defeat any clever OS trick for reading files faster, such as memory-mapping them.

Since almost all file-related commands accept a list of filenames as their last arguments, you should use grep frob foo.txt bar.txt quux.txt rather than cat foo.txt bar.txt quux.txt | grep frob. In addition, when given several files, grep prefixes each match with the name of the file where it was found:

$ >a
$ date>b
$ >c
$ >d
$ grep 1 a b c d
b:Wed, Dec 18, 2013  9:28:21 PM
$ cat a b c d |grep 1
Wed, Dec 18, 2013  9:28:21 PM
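
A sketch of two related cases, under assumptions that are not part of the example above: tr stands in for a command that reads only standard input, and -h is the option that GNU and BSD grep provide to suppress the filename prefix. The file names and contents are made up:

```shell
# Work in a throwaway directory so no real files are touched.
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT
cd "$dir" || exit 1

# Hypothetical sample files.
printf 'alpha\nfrob one\n' > foo.txt
printf 'frob two\nbeta\n'  > bar.txt

# Several files: each match is prefixed with the file it came from.
grep frob foo.txt bar.txt

# -h (GNU and BSD grep) drops the prefix, giving what the cat pipeline prints.
grep -h frob foo.txt bar.txt

# tr takes no filename argument, so let the shell redirect the file
# to its standard input instead of starting a cat process.
tr a-z A-Z < foo.txt
```

Redirection with < keeps the single-process benefit for the few commands that cannot open files themselves.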


Timing example:
$ time grep f00bar largefile

real    0m2.136s
user    0m0.078s
sys     0m0.078s

$ time cat largefile | grep f00bar

real    0m13.928s
user    0m0.077s
sys     0m0.139s
