Steve's Handy GREP docs
Find and replace a block of text when you know the start text and the end
text, but not the text in the middle
requires: BBEdit (or UNIX command line)
Let's say you need to find the piece of text that's the name of a file, and change it into a link to that file.
Example:
<span class="header">FILE: </span><span id="fileName">ArcadeXmas.jpg</span>
We want to find "ArcadeXmas.jpg" and replace it with a link to that file.
grep magic: <span class="header">FILE: </span><span id="fileName">(.+[gf])
what it means:
- <span class="header">FILE: </span><span id="fileName"> - find this block of text (we know it's the same in all the files we need to change)
- (.+[gf]) - the () means "remember what gets matched in this sub-expression"
- .+ - match one or more characters, of any kind except a carriage return
- [gf] - match either a "g" or an "f". (Don't put a | between them: inside square brackets a | is just a literal pipe character, so [g|f] would also match "|".)
- So, this expression says, "find that big line of text. Look for any amount of text after that (up to a CR) that ends with a 'g' or 'f' and remember it for later." That last bit allows the expression to match a .jpg or .gif filename. Because .+ grabs as much as it can, the match runs to the last "g" or "f" on the line; here that's the end of "ArcadeXmas.jpg", since nothing in the </span> after it contains either letter.
OK - now that we've found the text we want, what next?:
Take the following, paste it into the "replace" box in BBEdit & turn on "grep":
<span class="header">FILE: </span><span id="fileName"><a href="../pathname/\1">\1</a>
what it means:
- <span class="header">FILE: </span><span id="fileName"> - this is the text that's staying the same
- <a href="../pathname/ - here we're plugging in the HTML to start the link
- \1 - in grep-talk this means, "use that bit of text we matched in the ()". So it sticks in ArcadeXmas.jpg
- ">\1</a> - this finishes up the link we're building. The second \1 prints the filename out again so the user has something to click on, eh.
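If you'd rather try this outside BBEdit, here's a rough sketch of the same find and replace using Python's re module. Python is my choice here, not something these notes require, and the variable names are made up for the example. The character class is written as [gf], since a | inside square brackets is a literal pipe rather than "or".

```python
import re

# The "find" expression: fixed text, then a remembered group ending in g or f.
FIND = r'<span class="header">FILE: </span><span id="fileName">(.+[gf])'

# The "replace" expression: same fixed text, then a link built from group \1.
# "../pathname/" is the placeholder path used in the notes above.
REPLACE = (
    r'<span class="header">FILE: </span><span id="fileName">'
    r'<a href="../pathname/\1">\1</a>'
)

line = '<span class="header">FILE: </span><span id="fileName">ArcadeXmas.jpg</span>'

# re.sub swaps every match of FIND for REPLACE, filling in \1 as it goes.
result = re.sub(FIND, REPLACE, line)
print(result)
```

The closing </span> isn't part of the match, so it stays where it was and ends up after the new </a>.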
grep magic: [\s\S]+
what it means:
- \s - match any whitespace character (spaces, tabs, carriage returns, etc)
- \S - match any non-whitespace character (everything else)
- + - match one or more of whatever comes right before it ([\s\S]+ means one or more of any character at all, line breaks included)
Example:
<body>
lots of lines of code that is different in every file you have to work with
<table>
In BBEdit, place this: <body>[\s\S]+<table> in the find command. Replace with anything you like. It can also be used to go through all the files in directories using the multi-find stuff in the find command.
The Catch: this bit of grep magic is "greedy" and can match nearly all the stuff in your documents! If a file has more than one <table>, the match runs all the way to the last one, not the first. Search the BBEdit docs for "non-greedy quantifiers" to learn how to get around this (the short version: write +? instead of +).
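To see the greedy problem in action, here's a small sketch in Python's re module (again my choice of tool, and the sample page is invented for the example). It runs the same expression twice, once with + and once with the non-greedy +?, on a page that has two tables.

```python
import re

# Made-up sample page with two <table> blocks.
page = """<body>
<p>intro that differs per file</p>
<table>
<tr><td>first</td></tr>
</table>
<p>more text</p>
<table>
<tr><td>second</td></tr>
</table>
</body>"""

# Greedy: [\s\S]+ runs to the LAST <table>, swallowing the first table too.
greedy = re.sub(r"<body>[\s\S]+<table>", "<body>\n<table>", page)

# Non-greedy: [\s\S]+? stops at the FIRST <table> it can reach.
lazy = re.sub(r"<body>[\s\S]+?<table>", "<body>\n<table>", page)

print("greedy keeps first table:", "first" in greedy)
print("lazy keeps first table:  ", "first" in lazy)
```

The greedy version throws away the whole first table along with the intro; the non-greedy version only removes the intro.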
Example:
^\s+[0-9]\.
This searches your document for indented numbered lines like this:
    1. blah blah blah
what it means:
- ^ - match the start of a line
- \s - match any whitespace character (like spaces, tabs...)
- + - match one or more of those "\s"'s (so the line has to start with at least one space or tab)
- [0-9] - match a number (but just one)
- \. - match a "." (note the . is a special character so it's escaped with the magic backslash to indicate we're looking for a real . not the special meaning of .)
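Here's a quick way to check which lines this expression picks up, sketched with Python's re module (the sample lines are invented). Each line is tested on its own, so the \s+ can't wander across a line break.

```python
import re

# The expression from the notes: leading whitespace, one digit, a real period.
pattern = re.compile(r"^\s+[0-9]\.")

# Made-up sample lines covering the interesting cases.
lines = [
    "  1. milk",          # indented, single digit: matches
    "  2. eggs",          # indented, single digit: matches
    "3. no indent",       # no leading whitespace: no match
    "  10. two digits",   # [0-9] is just one digit, then "0" is not ".": no match
    "\t5. tabbed",        # a tab counts as whitespace: matches
]

matched = [line for line in lines if pattern.match(line)]
for line in matched:
    print(repr(line))
```

Only the indented single-digit lines come back. To catch 10 and up as well, [0-9]+ would do it.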