Hi there
People whose daily bread and butter is words will find a close acquaintance with regular expressions most useful. (Wow, that was a literary sentence!)
The name "grep" comes from the ed editor command g/re/p, that is, "globally search for a regular expression and print". In general, the first parameter we pass to this command is a regular expression. All the examples in this thread so far have been literal strings. The first slogan for today is:
- "literal strings are the simplest regular expressions, and they match themselves".
That is, the word "line" will match an "l" followed immediately by an "i", etc.
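As a quick illustration of that slogan, here is a minimal sketch. The sample text and the variable names are made up for this post; only the grep call itself matters.

```shell
# Hypothetical sample text; any file or stream would do.
sample='first line
second row
the last line'

# Keep only the lines that contain the literal string "line".
literal_matches=$(printf '%s\n' "$sample" | grep 'line')
printf '%s\n' "$literal_matches"
```

Only the first and last lines of the sample are printed, because "second row" does not contain an "l" followed by "i", "n" and "e".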
Normal letters, digits and spaces behave very well. But there are a number of characters with special meaning. I'll introduce the two most used, and will leave the rest for other posts.
- The dot (full stop) character ".": matches any single character (a letter, a digit, a space, anything)
- The asterisk (star) character "*": matches zero or more occurrences of the regular expression immediately preceding it.
Some examples:
- The expression "l.ne" matches lane, lene, line, lone, lune, and also lbne, lcne, l4ne, etc.
- The expression "line*" matches lin, line, linee, lineee, and so on: "lin" followed by zero or more letters "e".
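The two examples above can be tried directly from the shell. This is only a sketch with made-up word lists (the variable names are mine), and note that grep looks for the pattern anywhere in each input line.

```shell
# The dot stands for exactly one character, so "lne" (nothing between
# "l" and "ne") is the only word that is filtered out.
dot_matches=$(printf '%s\n' lane line l4ne lne | grep 'l.ne')
printf '%s\n' "$dot_matches"

# The star applies to the "e" just before it: "lin" plus zero or more "e".
# "lane" does not contain "lin", so it is filtered out.
star_matches=$(printf '%s\n' lin line lineee lane | grep 'line*')
printf '%s\n' "$star_matches"
```

The first command prints lane, line and l4ne; the second prints lin, line and lineee.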
These are toy examples, of course. But there are a few useful things that can be done with these simple rules. For example, grep -o '<.*>' index.html will display the HTML tags (as long as there is only one per line, more on this in future posts). ".*" can be read as "zero or more instances of any character".
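To try this without a real index.html, here is a small sketch that feeds grep some made-up HTML (the sample text and variable names are assumptions for illustration). It relies on the -o option, which GNU and BSD grep provide.

```shell
# Hypothetical HTML fragment; the last line carries two tags on purpose.
page='<title>
plain text, no tags here
  <p>
<b>bold</b>'

# -o prints only the part of each line that matched, not the whole line.
tags=$(printf '%s\n' "$page" | grep -o '<.*>')
printf '%s\n' "$tags"
```

The first two matches come out as the bare tags <title> and <p>. The last line comes out whole, as <b>bold</b>, because ".*" grabs as much as it can between the first "<" and the last ">". That is the "only one per line" caveat in action.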
Enjoy!
Cheers.
P.