Wednesday, February 6, 2013

Grep

Grep is a command-line utility for searching plain-text data sets for lines matching a regular expression. Grep was originally developed for the Unix operating system, but is available today for all Unix-like systems. Its name comes from the ed command g/re/p (global / regular expression / print). - Wikipedia

Exercise 0: Grepping Walt Whitman

1. Copy Whitman's work from Project Gutenberg and save it in a file called whitman.txt

2. Run a word frequency counter analysis.

3. Find the 21 most frequent words:


10113 the
5334 and
4265 of
2906 i
2244 to
1875 in
1534 you
1293 a
1250 with
1109 is
1074 all
1014 my
1009 me
1003 or
993  for
877  not
854  that
816  as
792  it
703  from
671  on
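
The post does not say which tool produced these counts. As a minimal sketch, a similar list can be built with standard Unix tools. The function name word_freq is hypothetical, and the choices to fold everything to lowercase and to treat every non-letter as a word separator are assumptions, so the exact numbers may differ from those above.

```shell
# word_freq FILE [N]: print the N most frequent words in FILE (default 20).
# Hypothetical helper, not the counter used in the post.
word_freq() {
  tr -cs '[:alpha:]' '\n' < "$1" |   # turn every run of non-letters into one newline
    tr '[:upper:]' '[:lower:]' |     # fold case so "The" and "the" count together
    grep -v '^$' |                   # drop the empty line a leading non-letter leaves
    sort |                           # group identical words next to each other
    uniq -c |                        # prefix each distinct word with its count
    sort -rn |                       # highest count first
    head -n "${2:-20}"               # keep only the top N
}

# Usage: word_freq whitman.txt 21
```

Because apostrophes count as separators here, a word such as "orb'd" is split into "orb" and "d".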

4. Run the grep command using these words

grep the | grep and | grep of | grep to | grep in | grep you | grep with | grep is | grep all | grep my | grep me | grep or | grep for | grep not | grep that | grep as | grep it | grep from | grep on <whitman.txt | sort > whitmanhd5.txt
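
In a shell pipeline, a redirect such as <whitman.txt applies only to the command it is attached to. Written as above, only the final grep reads the file, and the earlier greps wait on the terminal. Here is a hedged sketch of the alternative, assuming the goal is to keep only the lines that contain every listed word. The helper name lines_with_all is hypothetical.

```shell
# lines_with_all FILE WORD...: print, sorted, the lines of FILE that
# contain every WORD. Hypothetical helper; each grep filters the
# output of the previous one, like a pipeline fed from the file.
lines_with_all() {
  file=$1
  shift
  tmp=$(mktemp)
  cp "$file" "$tmp"
  for word in "$@"; do
    # Keep only lines matching this word; "|| true" tolerates no matches.
    grep -- "$word" "$tmp" > "$tmp.next" || true
    mv "$tmp.next" "$tmp"
  done
  sort "$tmp"
  rm -f "$tmp"
}

# Usage: lines_with_all whitman.txt the and of > whitmanhd5.txt
```

Note that grep matches substrings, so the pattern on also matches "eidolon" and "moon". GNU and BSD grep accept -w to match whole words only.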

5. Display the first 26 lines (without any additional alterations):


Convict no more, nor shame, nor dole!
O fearful thought--a convict soul.
It is some dream that on the deck,
Where on the deck my Captain lies
With the life-long love of comrades.
A round full-orb'd eidolon.
A soul confined by bars and bands,
An image, an eidolon.
Beyond thy lectures learn'd professor,
But really build eidolons.
Dear prison'd soul bear up a space,
Do the feasters gluttonous feast?
Eidolons everlasting.
Eidolons! eidolons!
Eidolons, eidolons, eidolons.
Eidolons, eidolons.
Fill'd with eidolons only.
For soon or late the certain grace;
For the son is brought with the father,
God and eidolons.
I see a sad procession,
Immense and silent moon.
In its eidolon.
Issuing eidolons.
Joining eidolons.
Lo, the moon ascending.
