GREP

grep stands for global regular expression print.

To my understanding, it is simply filter-then-print: keep the lines that match a pattern, then print them.
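To make the filter-then-print idea concrete, here is a rough shell sketch of what grep does for a plain fixed string. The function name filter_print and the two sample lines are made up for illustration, and the real grep also understands regular expressions, which this loop does not.

```shell
# filter_print is a hypothetical helper, not part of grep.
# It reads lines from standard input and prints only those that
# contain the fixed string passed as the first argument.
filter_print() {
  pattern=$1
  while IFS= read -r line; do
    case $line in
      *"$pattern"*) printf '%s\n' "$line" ;;  # filter, then print
    esac
  done
}

# Made-up sample input: only the first line contains PUT.
result=$(printf '%s\n' 'PUT /list' 'GET /list' | filter_print PUT)
printf '%s\n' "$result"
```

Only the PUT line is printed, which is the behaviour the rest of this tutorial gets from grep directly.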

1. Preparation

First, let's get some data and play with it.

Raw Data

Run the following command to create some test data.

cat <<'LOGS' > tut-access.log
161.138.187.117 - - [05/Jan/2021:23:05:01 -0500] "PUT /app/main/posts HTTP/1.0" 200 4973 "http://smith.com/" "Mozilla/5.0 (Windows NT 6.1; yi-US; rv:1.9.2.20) Gecko/2011-09-10 13:36:12 Firefox/3.6.13"
83.191.216.184 - - [05/Jan/2021:23:05:39 -0500] "POST /apps/cart.jsp?appID=3885 HTTP/1.0" 200 5031 "http://www.bryant.com/terms/" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_12_4) AppleWebKit/534.0 (KHTML, like Gecko) Chrome/60.0.831.0 Safari/534.0"
164.42.246.104 - - [05/Jan/2021:23:09:56 -0500] "PUT /wp-content HTTP/1.0" 200 4976 "https://flynn-cruz.com/home/" "Mozilla/5.0 (Android 2.1; Mobile; rv:45.0) Gecko/45.0 Firefox/45.0"
28.219.159.236 - - [05/Jan/2021:23:11:43 -0500] "PUT /list HTTP/1.0" 200 5010 "http://www.wright.com/home/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/536.1 (KHTML, like Gecko) Chrome/42.0.863.0 Safari/536.1"
189.143.182.79 - - [05/Jan/2021:23:14:06 -0500] "GET /apps/cart.jsp?appID=9015 HTTP/1.0" 200 5037 "http://www.davis-moreno.com/app/tag/home.asp" "Mozilla/5.0 (Windows CE; ti-ET; rv:1.9.1.20) Gecko/2013-12-25 17:20:54 Firefox/3.8"
28.105.221.183 - - [05/Jan/2021:23:17:22 -0500] "GET /wp-admin HTTP/1.0" 200 4947 "http://www.smith-phelps.biz/homepage/" "Mozilla/5.0 (Windows; U; Windows 95) AppleWebKit/535.24.6 (KHTML, like Gecko) Version/5.0.5 Safari/535.24.6"
86.74.0.138 - - [05/Jan/2021:23:19:53 -0500] "GET /app/main/posts HTTP/1.0" 200 5036 "http://turner-brown.com/blog/search/" "Mozilla/5.0 (Windows NT 4.0; hr-HR; rv:1.9.2.20) Gecko/2014-09-22 21:20:35 Firefox/3.6.4"
154.106.166.221 - - [05/Jan/2021:23:20:24 -0500] "GET /app/main/posts HTTP/1.0" 200 4942 "https://www.white.com/wp-content/faq/" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/531.2 (KHTML, like Gecko) Chrome/30.0.872.0 Safari/531.2"
160.75.38.89 - - [05/Jan/2021:23:25:07 -0500] "GET /list HTTP/1.0" 200 5043 "http://gordon.biz/search/main/tags/privacy.asp" "Mozilla/5.0 (Windows NT 5.1; fil-PH; rv:1.9.0.20) Gecko/2019-03-26 20:49:08 Firefox/3.8"
90.1.228.91 - - [05/Jan/2021:23:25:57 -0500] "GET /wp-admin HTTP/1.0" 200 4982 "http://brady.com/tag/home/" "Mozilla/5.0 (X11; Linux i686; rv:1.9.6.20) Gecko/2012-05-30 12:45:08 Firefox/3.8"
LOGS

2. Basics

Let's start with a very simple task to get a feel for what grep can do.

All the PUT Requests

grep PUT tut-access.log
161.138.187.117 - - [05/Jan/2021:23:05:01 -0500] "PUT /app/main/posts HTTP/1.0" 200 4973 "http://smith.com/" "Mozilla/5.0 (Windows NT 6.1; yi-US; rv:1.9.2.20) Gecko/2011-09-10 13:36:12 Firefox/3.6.13"
164.42.246.104 - - [05/Jan/2021:23:09:56 -0500] "PUT /wp-content HTTP/1.0" 200 4976 "https://flynn-cruz.com/home/" "Mozilla/5.0 (Android 2.1; Mobile; rv:45.0) Gecko/45.0 Firefox/45.0"
28.219.159.236 - - [05/Jan/2021:23:11:43 -0500] "PUT /list HTTP/1.0" 200 5010 "http://www.wright.com/home/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/536.1 (KHTML, like Gecko) Chrome/42.0.863.0 Safari/536.1"

Here, grep prints every line that contains the string PUT.
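A small sketch of the same idea on made-up input (these four lines are not part of tut-access.log). It assumes nothing beyond a standard grep and pipes the lines in rather than reading a file. It shows two details of the default behaviour: the match is case-sensitive, and the pattern may appear anywhere in the line.

```shell
# Made-up sample lines, piped into grep instead of read from a file.
matches=$(printf '%s\n' 'PUT /list' 'GET /list' 'put /list' 'INPUT /form' | grep PUT)
printf '%s\n' "$matches"
```

The lowercase put line is skipped, while the INPUT line is printed because PUT is a substring of it.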

3. A Bit More

grep has a lot of parameters, and some of them are convenient and frequently used. Below, we will try them out one by one.

Ignore Case

Sometimes, you want to match a string regardless of whether its letters are upper case or lower case.

grep -i li tut-access.log

like, list and Linux are all valid matches.

However, case-insensitive matching depends on the current locale, so it may not behave as expected for text in languages other than English.

83.191.216.184 - - [05/Jan/2021:23:05:39 -0500] "POST /apps/cart.jsp?appID=3885 HTTP/1.0" 200 5031 "http://www.bryant.com/terms/" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_12_4) AppleWebKit/534.0 (KHTML, like Gecko) Chrome/60.0.831.0 Safari/534.0"
28.219.159.236 - - [05/Jan/2021:23:11:43 -0500] "PUT /list HTTP/1.0" 200 5010 "http://www.wright.com/home/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/536.1 (KHTML, like Gecko) Chrome/42.0.863.0 Safari/536.1"
28.105.221.183 - - [05/Jan/2021:23:17:22 -0500] "GET /wp-admin HTTP/1.0" 200 4947 "http://www.smith-phelps.biz/homepage/" "Mozilla/5.0 (Windows; U; Windows 95) AppleWebKit/535.24.6 (KHTML, like Gecko) Version/5.0.5 Safari/535.24.6"
154.106.166.221 - - [05/Jan/2021:23:20:24 -0500] "GET /app/main/posts HTTP/1.0" 200 4942 "https://www.white.com/wp-content/faq/" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/531.2 (KHTML, like Gecko) Chrome/30.0.872.0 Safari/531.2"
160.75.38.89 - - [05/Jan/2021:23:25:07 -0500] "GET /list HTTP/1.0" 200 5043 "http://gordon.biz/search/main/tags/privacy.asp" "Mozilla/5.0 (Windows NT 5.1; fil-PH; rv:1.9.0.20) Gecko/2019-03-26 20:49:08 Firefox/3.8"
90.1.228.91 - - [05/Jan/2021:23:25:57 -0500] "GET /wp-admin HTTP/1.0" 200 4982 "http://brady.com/tag/home/" "Mozilla/5.0 (X11; Linux i686; rv:1.9.6.20) Gecko/2012-05-30 12:45:08 Firefox/3.8"
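To see what -i changes, the sketch below runs the same pattern with and without it on four made-up fragments (not lines from tut-access.log). It also uses -c, an option not covered in this tutorial, which prints the number of matching lines instead of the lines themselves.

```shell
# Made-up sample: "like" and "list" contain lowercase li,
# "Linux" only matches when case is ignored, "Mozilla" never matches.
sample=$(printf '%s\n' 'like Gecko' 'PUT /list' 'X11; Linux i686' 'Mozilla/5.0')

sensitive=$(printf '%s\n' "$sample" | grep -c li)        # default: case-sensitive
insensitive=$(printf '%s\n' "$sample" | grep -c -i li)   # -i: ignore case

echo "case-sensitive: $sensitive, ignoring case: $insensitive"
```

This prints "case-sensitive: 2, ignoring case: 3"; the extra line is the one containing Linux.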

Match Only Whole Word

Quite often you want to match a string only as a whole word, not as part of another word.

grep -w Linux tut-access.log
28.219.159.236 - - [05/Jan/2021:23:11:43 -0500] "PUT /list HTTP/1.0" 200 5010 "http://www.wright.com/home/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/536.1 (KHTML, like Gecko) Chrome/42.0.863.0 Safari/536.1"
90.1.228.91 - - [05/Jan/2021:23:25:57 -0500] "GET /wp-admin HTTP/1.0" 200 4982 "http://brady.com/tag/home/" "Mozilla/5.0 (X11; Linux i686; rv:1.9.6.20) Gecko/2012-05-30 12:45:08 Firefox/3.8"

grep -w Lin tut-access.log

No line is matched, because Lin only appears as part of the longer word Linux.
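What counts as a "whole word" is worth a quick sketch. With -w, the match must not be directly preceded or followed by a letter, a digit or an underscore; other characters such as spaces, slashes and semicolons act as word boundaries. The four sample lines below are made up for illustration.

```shell
# Made-up sample lines:
#   'X11; Linux x86_64'  spaces around Linux         -> match
#   'GNU/Linux'          slash before, end of line    -> match
#   'Linux_i686'         underscore is a word char    -> no match
#   'Linuxes'            followed by letters          -> no match
matches=$(printf '%s\n' 'X11; Linux x86_64' 'GNU/Linux' 'Linux_i686' 'Linuxes' | grep -w Linux)
printf '%s\n' "$matches"
```

Only the first two lines are printed.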


Invert Match

Invert match excludes the lines that contain a match and prints all the others.

grep -v Windows tut-access.log
83.191.216.184 - - [05/Jan/2021:23:05:39 -0500] "POST /apps/cart.jsp?appID=3885 HTTP/1.0" 200 5031 "http://www.bryant.com/terms/" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_12_4) AppleWebKit/534.0 (KHTML, like Gecko) Chrome/60.0.831.0 Safari/534.0"
164.42.246.104 - - [05/Jan/2021:23:09:56 -0500] "PUT /wp-content HTTP/1.0" 200 4976 "https://flynn-cruz.com/home/" "Mozilla/5.0 (Android 2.1; Mobile; rv:45.0) Gecko/45.0 Firefox/45.0"
28.219.159.236 - - [05/Jan/2021:23:11:43 -0500] "PUT /list HTTP/1.0" 200 5010 "http://www.wright.com/home/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/536.1 (KHTML, like Gecko) Chrome/42.0.863.0 Safari/536.1"
90.1.228.91 - - [05/Jan/2021:23:25:57 -0500] "GET /wp-admin HTTP/1.0" 200 4982 "http://brady.com/tag/home/" "Mozilla/5.0 (X11; Linux i686; rv:1.9.6.20) Gecko/2012-05-30 12:45:08 Firefox/3.8"
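As a minimal sketch on made-up input (four user-agent fragments, not lines from tut-access.log), -v keeps exactly the lines that a plain grep with the same pattern would drop:

```shell
# Made-up sample: two lines mention Windows, two do not.
sample=$(printf '%s\n' 'Windows NT 6.1' 'X11; Linux i686' 'Windows CE' 'Android 2.1')

matched=$(printf '%s\n' "$sample" | grep Windows)     # lines containing Windows
kept=$(printf '%s\n' "$sample" | grep -v Windows)     # everything else

printf '%s\n' "$kept"
```

The two result sets never overlap, and together they cover every input line.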

Combinations

We can combine the parameters above to achieve more.

grep -v -w http tut-access.log

I know this looks silly, since grep https gives the same result on this data, but at least it shows how the parameters combine: -w matches http only as a whole word, so https does not count, and -v then keeps only the lines without such a match.

164.42.246.104 - - [05/Jan/2021:23:09:56 -0500] "PUT /wp-content HTTP/1.0" 200 4976 "https://flynn-cruz.com/home/" "Mozilla/5.0 (Android 2.1; Mobile; rv:45.0) Gecko/45.0 Firefox/45.0"
154.106.166.221 - - [05/Jan/2021:23:20:24 -0500] "GET /app/main/posts HTTP/1.0" 200 4942 "https://www.white.com/wp-content/faq/" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/531.2 (KHTML, like Gecko) Chrome/30.0.872.0 Safari/531.2"
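The following sketch, on three made-up referrer fragments, shows two things about combining options. Single-letter options can be written separately or bundled into one argument (-v -w and -vw are equivalent), and -v -w http is not the same as grep https in general, because it also keeps lines that contain no URL at all.

```shell
# Made-up sample: one http referrer, one https referrer, one empty referrer.
sample=$(printf '%s\n' 'ref "http://smith.com/"' 'ref "https://white.com/"' 'ref "-"')

separate=$(printf '%s\n' "$sample" | grep -v -w http)   # options given one by one
bundled=$(printf '%s\n' "$sample" | grep -vw http)      # same options, bundled

printf '%s\n' "$separate"
```

Both forms print the https line and the line with the empty referrer; only the line where http stands as a whole word is dropped.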