The magic of grep

Introduction

grep is a command-line utility for searching plain-text data sets for lines that match a regular expression. Its name comes from the ed command g/re/p (globally search for a regular expression and print), which has the same effect: it searches globally with the regular expression and prints all matching lines. grep was originally developed for the Unix operating system and later became available on Linux and other Unix-like systems as well.
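As a minimal sketch of that "search and print" behavior, the snippet below feeds a few made-up lines to grep and prints the ones that match a regular expression. The sample text and the pattern are assumptions chosen purely for illustration.

```shell
# Sample input; the contents are illustrative only.
text='apple pie
banana split
cherry tart
apple crumble'

# Print every line that matches the regular expression ^apple,
# i.e. every line that starts with "apple".
matches=$(printf '%s\n' "$text" | grep '^apple')
printf '%s\n' "$matches"

# -c prints the number of matching lines instead of the lines themselves.
count=$(printf '%s\n' "$text" | grep -c '^apple')
printf '%s\n' "$count"
```

Here grep reads from standard input; passing a file name as the last argument works the same way.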

Why an article about a single command?

Regular expressions are beautiful and can help us automate tedious tasks, especially when used together with programming languages such as Java, Perl, Python, or PHP.

The Linux / Unix grep program.

Although it is just a command-line tool, grep is probably one of the most powerful commands available.

Below is a short list of use cases for grep and regular expressions:

  • Find postal codes in text.
  • Validate email addresses, phone numbers and postal codes in forms.
  • Convert HTML files to plain text.
  • Trim white space from input.
  • Match certain tags in HTML or XML.
  • Match or validate credit card numbers.
  • Extract meaningful text from binary files.

And much more.
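To make the first two use cases concrete, here is a small sketch that finds and then validates postal codes. It assumes that a "postal code" means a US ZIP code (five digits, optionally followed by a hyphen and four more digits); the sample text is invented, and other countries would need a different pattern.

```shell
# Assumption: a "postal code" here is a five-digit US ZIP code.
text='Ship to Springfield 62704.
No code on this line.
Return address: Portland 97201.'

# -E enables extended regular expressions; -o prints only the matched
# part of each line (supported by GNU and BSD grep, not required by POSIX).
found=$(printf '%s\n' "$text" | grep -E -o '[0-9]{5}')
printf '%s\n' "$found"

# -x requires the whole line to match, which is closer to validation.
candidates='62704
6270
97201-1234
abcde'
valid=$(printf '%s\n' "$candidates" | grep -E -x '[0-9]{5}(-[0-9]{4})?')
printf '%s\n' "$valid"
```

Note that the first pattern would also pick five digits out of a longer number, so real input usually needs a stricter expression.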

Regular expressions

Regular expressions are the core of the grep command, but they are also used extensively in programming languages, where you can do lots of useful things with them.

Regular expressions are also very useful in software configuration. Apache Nutch and Hadoop, for example, make heavy use of regular expressions in their configuration, for instance to include or exclude hosts from being crawled or to include or exclude plugins (software extensions).
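The include/exclude idea can be sketched with grep itself. This is not Nutch or Hadoop configuration syntax; the host names and patterns below are hypothetical and only illustrate how one regular expression can keep hosts and another can drop them.

```shell
# Hypothetical host list; none of these names come from a real configuration.
hosts='www.example.com
blog.example.com
ads.example.net
www.example.org'

# Include: keep only example.com and its subdomains.
included=$(printf '%s\n' "$hosts" | grep -E '(^|\.)example\.com$')
printf '%s\n' "$included"

# Exclude: -v inverts the match, dropping hosts that start with "ads.".
without_ads=$(printf '%s\n' "$hosts" | grep -v -E '^ads\.')
printf '%s\n' "$without_ads"
```

Escaping the dots matters here: an unescaped "." matches any character, so "example.com" would also match a host such as "examplexcom".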

Tools

There are several tools available to help you create and test regular expressions. We distinguish between online and offline tools.

Online

  • regexr.com is an excellent tool for creating and testing your regular expressions online. It’s free and a good way to learn regex, with color coding so you can see directly what’s happening.
  • RegEX generator is similar to regexr and also has a built-in library with examples and a video tutorial.
  • https://regex101.com/ is an online creator and tester with support for different programming languages.

Offline

RegexBuddy: If you are willing to pay some money for an excellent tool, look no further! RegexBuddy is by far the best you can get! It has an extensive built-in library and supports many popular programming languages. Once your expression is finished, you just select your programming language and copy the generated code, ready for use. The producer also has a very good text editor (EditPad Pro), which integrates seamlessly with RegexBuddy. It’s a pleasure to work with and definitely worth its price!

Unfortunately, it’s currently only available for Windows, but it runs fine on Mac and Linux too when using Wine or CrossOver. It also comes with the best and most comprehensive manual, which is also available online. See https://regular-expressions.mobi/tutorial.html?wlr=1
