SED Regex
sed is a stream editor. I usually use grep for filtering content, sed for editing text, and awk for processing tabular data.
1. Preparation
Raw Data
Run the following command to save the data to song.txt.
cat <<SONG > song.txt
On the twelfth day of Christmas my true love gave to me
twelve drummers drumming
eleven pipers piping
ten lords a leaping
nine ladies dancing
eight maids a milking
seven swans a swimming
six geese a laying
five golden rings
four calling birds
three French hens
two turtle doves
and a partridge in a pear tree
SONG
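As a small aside on the here-document used above, the sketch below writes two sample lines to a scratch file and reads them back. The scratch file, the sample line containing `$5`, and the variable names are all made up for illustration. Quoting the delimiter (`'SONG'`) is optional for the song itself, but it keeps the shell from expanding `$` or backticks if the data ever contains them.

```shell
# Hypothetical scratch file so the real song.txt is left alone.
tmp=$(mktemp)

# Quoting the delimiter ('SONG') keeps $ and backticks in the body literal.
cat <<'SONG' > "$tmp"
five golden rings
cost: $5
SONG

# Count the lines and read the second one back.
count=$(grep -c '' "$tmp")
second=$(sed -n '2p' "$tmp")
echo "$count lines, line 2 is: $second"

rm -f "$tmp"
```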
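2. Regex Examples
With -E, sed uses extended regular expressions; without it, sed uses basic regular expressions. The syntax is a bit different.
In basic regex, +, (), {}, ?, and | act as operators only when escaped with a backslash; in extended regex they must not be escaped. The escaped forms \+, \?, and \| in basic regex, as well as \w and \b, are GNU extensions, so the examples below assume GNU sed.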
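To make the escaping rule concrete, here is a small sketch that runs the same substitution in both modes on an inline sample string rather than song.txt. It assumes GNU sed, and the variable names are only for illustration.

```shell
line='six geese a laying'

# Basic regex: + and | act as operators only when escaped.
bre=$(printf '%s\n' "$line" | sed 's/e\+\|six/X/g')

# Extended regex: the same operators are written without backslashes.
ere=$(printf '%s\n' "$line" | sed -E 's/e+|six/X/g')

# An unescaped + in basic regex is a literal plus sign.
literal=$(printf 'a+b aab\n' | sed 's/a+/X/')

echo "$bre"      # X gXsX a laying
echo "$ere"      # X gXsX a laying
echo "$literal"  # Xb aab
```

Both modes give the same result. The last command shows that forgetting the backslash in basic regex does not raise an error: the pattern silently matches a literal a+ instead.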
Basic Operations
Replace all the words starting with s with xxx.
Run
Basic Regex
sed 's/s\w\+/xxx/g' song.txt
Extended Regex
sed -E 's/s\w+/xxx/g' song.txt
Output
On the twelfth day of Chrixxx my true love gave to me
twelve drummers drumming
eleven pipers piping
ten lords a leaping
nine ladies dancing
eight maids a milking
xxx xxx a xxx
xxx geexxx a laying
five golden rings
four calling birds
three French hens
two turtle doves
and a partridge in a pear tree
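The result is not as expected: Christmas is replaced with Chrixxx and geese with geexxx, because the pattern matches an s anywhere inside a word, not only at the start.
\b matches a word boundary, which solves the problem here.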
Run
Basic Regex
sed 's/\bs\w\+/xxx/g' song.txt
Extended Regex
sed -E 's/\bs\w+/xxx/g' song.txt
Output
On the twelfth day of Christmas my true love gave to me
twelve drummers drumming
eleven pipers piping
ten lords a leaping
nine ladies dancing
eight maids a milking
xxx xxx a xxx
xxx geese a laying
five golden rings
four calling birds
three French hens
two turtle doves
and a partridge in a pear tree
\b also works at the end of a pattern. Replace all the words ending with ing with xxx.
Run
Basic Regex
sed 's/\w\+ing\b/xxx/g' song.txt
Extended Regex
sed -E 's/\w+ing\b/xxx/g' song.txt
Output
On the twelfth day of Christmas my true love gave to me
twelve drummers xxx
eleven pipers xxx
ten lords a xxx
nine ladies xxx
eight maids a xxx
seven swans a xxx
six geese a xxx
five golden rings
four xxx birds
three French hens
two turtle doves
and a partridge in a pear tree
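The two boundary examples above can be combined. As a minimal sketch, assuming GNU sed and an inline copy of one song line, the pattern below uses \b on both sides to replace only words that start and end with s.

```shell
line='seven swans a swimming'

# \b at both ends: the word must begin with s and end with s.
both=$(printf '%s\n' "$line" | sed -E 's/\bs\w*s\b/xxx/g')

echo "$both"  # seven xxx a swimming
```

Only swans qualifies; seven and swimming start with s but do not end with it.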
Some More
Replace every ve and ven with xxx. The ? makes the preceding n optional.
Run
Basic Regex
sed 's/ven\?/xxx/g' song.txt
Extended Regex
sed -E 's/ven?/xxx/g' song.txt
Output
On the twelfth day of Christmas my true loxxx gaxxx to me
twelxxx drummers drumming
elexxx pipers piping
ten lords a leaping
nine ladies dancing
eight maids a milking
sexxx swans a swimming
six geese a laying
fixxx golden rings
four calling birds
three French hens
two turtle doxxxs
and a partridge in a pear tree
Back Reference
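Sometimes it is useful to reuse the matched string in the replacement. Use () to mark a capture group and \num to refer to what that group matched; \1 refers to the first group.
Replace every ve[x] with we[x], e.g. ven to wen, ves to wes. A ve at the end of a word (as in love) is left alone, because \w+ requires at least one more word character after the e.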
Run
Basic Regex
sed 's/v\(e\w\+\)/w\1/g' song.txt
Extended Regex
sed -E 's/v(e\w+)/w\1/g' song.txt
Output
On the twelfth day of Christmas my true love gave to me
twelve drummers drumming
elewen pipers piping
ten lords a leaping
nine ladies dancing
eight maids a milking
sewen swans a swimming
six geese a laying
five golden rings
four calling birds
three French hens
two turtle dowes
and a partridge in a pear tree
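The output above leaves love and gave untouched. The sketch below, which assumes GNU sed and uses an inline sample line, contrasts \w+ (one or more characters after the e) with \w* (zero or more) to show where that difference comes from.

```shell
line='my true love gave to me'

# \w+ needs at least one word character after "e", so "ve" at a word end is skipped.
plus=$(printf '%s\n' "$line" | sed -E 's/v(e\w+)/w\1/g')

# \w* also accepts nothing after "e", so "love" and "gave" are rewritten.
star=$(printf '%s\n' "$line" | sed -E 's/v(e\w*)/w\1/g')

echo "$plus"  # my true love gave to me
echo "$star"  # my true lowe gawe to me
```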
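Let's reverse the order of the words on line 2. The 2 before s restricts the substitution to line 2, and -n together with the p flag prints only the line that was changed.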
Run
Basic Regex
sed -n '2s/\(\w\+\) \(\w\+\) \(\w\+\)/\3 \2 \1/p' song.txt
Extended Regex
sed -nE '2s/(\w+) (\w+) (\w+)/\3 \2 \1/p' song.txt
Output
drumming drummers twelve
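The three-group pattern above only fits lines with exactly three words. As a hedged variation, assuming GNU sed and two inline sample lines, the sketch below swaps just the first and last word of a line of any length by capturing everything in between as a second group.

```shell
# \1 = first word, \2 = everything in between (with its leading space), \3 = last word.
swapped=$(printf 'five golden rings\nand a partridge in a pear tree\n' |
  sed -E 's/^(\w+)(.*) (\w+)$/\3\2 \1/')

echo "$swapped"
# rings golden five
# tree a partridge in a pear and
```

Because .* is greedy, it takes as much of the middle as it can while still leaving one space and a final word for \3 to match.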