SED Regex
sed is a stream editor. I usually use grep for filtering content, sed for editing text, and awk for processing tabular data.
1. Preparation
Raw Data
Run the following command to save the data to song.txt.
cat <<SONG > song.txt
On the twelfth day of Christmas my true love gave to me
twelve drummers drumming
eleven pipers piping
ten lords a leaping
nine ladies dancing
eight maids a milking
seven swans a swimming
six geese a laying
five golden rings
four calling birds
three French hens
two turtle doves
and a partridge in a pear tree
SONG
2. Regex Examples
With -E, sed uses extended regex; without it, sed uses basic regex. The syntax is a bit different.
To use +, (), {}, ?, and | as operators, you must escape them with a backslash in basic regex and must not escape them in extended regex. The examples below assume GNU sed, since \w, \b, and the escaped \+ and \? are GNU extensions.
Basic Operations
Replace all the words starting with s with xxx.
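The examples that follow only exercise + and ?, so here is a minimal sketch of the same rule for | and {}. It assumes GNU sed; the sample lines and the variable names are only for illustration.

```shell
# Assumes GNU sed: \| in basic regex is a GNU extension.
line='ten lords a leaping'

# Alternation: \| in basic regex, | in extended regex.
bre_alt=$(printf '%s\n' "$line" | sed 's/lords\|leaping/xxx/g')
ere_alt=$(printf '%s\n' "$line" | sed -E 's/lords|leaping/xxx/g')

# Interval: \{2\} in basic regex, {2} in extended regex.
bre_rep=$(printf 'geese\n' | sed 's/e\{2\}/EE/')
ere_rep=$(printf 'geese\n' | sed -E 's/e{2}/EE/')

printf '%s\n' "$bre_alt" "$ere_alt" "$bre_rep" "$ere_rep"
```

Both alternation commands should print ten xxx a xxx, and both interval commands should print gEEse.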
Run
Basic Regex
sed 's/s\w\+/xxx/g' song.txt
Extended Regex
sed -E 's/s\w+/xxx/g' song.txt
Output
On the twelfth day of Chrixxx my true love gave to me
twelve drummers drumming
eleven pipers piping
ten lords a leaping
nine ladies dancing
eight maids a milking
xxx xxx a xxx
xxx geexxx a laying
five golden rings
four calling birds
three French hens
two turtle doves
and a partridge in a pear tree
The result is not as expected: Christmas is replaced with Chrixxx and geese with geexxx, because the pattern also matches an s in the middle of a word.
\b matches a word boundary, which solves our problem here.
Run
Basic Regex
sed 's/\bs\w\+/xxx/g' song.txt
Extended Regex
sed -E 's/\bs\w+/xxx/g' song.txt
Output
On the twelfth day of Christmas my true love gave to me
twelve drummers drumming
eleven pipers piping
ten lords a leaping
nine ladies dancing
eight maids a milking
xxx xxx a xxx
xxx geese a laying
five golden rings
four calling birds
three French hens
two turtle doves
and a partridge in a pear tree
\b also works at the end of a word. Replace all the words ending with ing with xxx.
Run
Basic Regex
sed 's/\w\+ing\b/xxx/g' song.txt
Extended Regex
sed -E 's/\w+ing\b/xxx/g' song.txt
Output
On the twelfth day of Christmas my true love gave to me
twelve drummers xxx
eleven pipers xxx
ten lords a xxx
nine ladies xxx
eight maids a xxx
seven swans a xxx
six geese a xxx
five golden rings
four xxx birds
three French hens
two turtle doves
and a partridge in a pear tree
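Putting \b on both sides of a pattern restricts it to whole words. A small sketch, assuming GNU sed; the variable names are illustrative only.

```shell
# Assumes GNU sed, where \b marks a word boundary.
line='and a partridge in a pear tree'

# Without boundaries, every letter "a" is replaced.
loose=$(printf '%s\n' "$line" | sed 's/a/xxx/g')

# With \b on both sides, only the standalone word "a" is replaced.
exact=$(printf '%s\n' "$line" | sed 's/\ba\b/xxx/g')

printf '%s\n' "$loose" "$exact"
```

The first line printed should be xxxnd xxx pxxxrtridge in xxx pexxxr tree, and the second and xxx partridge in xxx pear tree.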
Some More
Replace all the ve and ven with xxx. The ? makes the preceding n optional.
Run
Basic Regex
sed 's/ven\?/xxx/g' song.txt
Extended Regex
sed -E 's/ven?/xxx/g' song.txt
Output
On the twelfth day of Christmas my true loxxx gaxxx to me
twelxxx drummers drumming
elexxx pipers piping
ten lords a leaping
nine ladies dancing
eight maids a milking
sexxx swans a swimming
six geese a laying
fixxx golden rings
four calling birds
three French hens
two turtle doxxxs
and a partridge in a pear tree
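To see which form the optional n actually matched, one option is to wrap each match in brackets instead of replacing it. This sketch assumes GNU sed and uses &, which stands for the whole matched text in the replacement; the sample line is only for illustration.

```shell
# Assumes GNU sed. & in the replacement is the entire matched string.
marked=$(printf 'seven turtle doves\n' | sed -E 's/ven?/[&]/g')

printf '%s\n' "$marked"
```

This should print se[ven] turtle do[ve]s: the n is taken when it is present and skipped when it is not.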
Back Reference
Sometimes it is useful to reuse the matched string in the replacement. Wrap part of the pattern in () to capture it as a group, and use \num to refer to that group. \1 refers to the first group.
Replace all the ve[x] with we[x], where [x] is one or more word characters, e.g. ven to wen, ves to wes.
Run
Basic Regex
sed 's/v\(e\w\+\)/w\1/g' song.txt
Extended Regex
sed -E 's/v(e\w+)/w\1/g' song.txt
Output
On the twelfth day of Christmas my true love gave to me
twelve drummers drumming
elewen pipers piping
ten lords a leaping
nine ladies dancing
eight maids a milking
sewen swans a swimming
six geese a laying
five golden rings
four calling birds
three French hens
two turtle dowes
and a partridge in a pear tree
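Let's reverse the order of the words on line 2. The leading 2 restricts the substitution to line 2, and -n together with the p flag prints only that line.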
Run
Basic Regex
sed -n '2s/\(\w\+\) \(\w\+\) \(\w\+\)/\3 \2 \1/p' song.txt
Extended Regex
sed -nE '2s/(\w+) (\w+) (\w+)/\3 \2 \1/p' song.txt
Output
drumming drummers twelve
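A back reference can also appear inside the pattern itself, where it matches the same text the group captured. The sketch below, assuming GNU sed, marks doubled letters; the sample line and variable names are illustrative.

```shell
# Assumes GNU sed for \w. (\w)\1 matches a word character followed by itself.
line='six geese a swimming'

# Basic regex: the group is written with escaped parentheses.
bre_doubled=$(printf '%s\n' "$line" | sed 's/\(\w\)\1/[&]/g')

# Extended regex: the group is written with plain parentheses.
ere_doubled=$(printf '%s\n' "$line" | sed -E 's/(\w)\1/[&]/g')

printf '%s\n' "$bre_doubled" "$ere_doubled"
```

Both commands should print six g[ee]se a swi[mm]ing.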