Regular Expressions for Network Engineers


How many times have you worked on a task that involved either updating every instance of a piece of configuration or creating a new piece of configuration at multiple points on a network device? You have translated the requirements into functional syntax, a blueprint, for the specific hardware platform, and now it’s time to implement it tens of times on the device. How do you do it?

For small, non-routine, one-off tasks, the quickest way may be to jump on the device and repeat the manual labor N times at different places and with slight variations, where N is hopefully small enough to justify this manual approach. This may also be the route taken by a junior network engineer who does not yet know more efficient methods.

Generating configuration with regular expressions is a type of automation, as we aim to reduce, if not eliminate, manual processes that are well defined and repeatable. Automation can go much further: multiple devices or device groups, automated login to these, implementation of config and finally verification of status and rollback if needed, all launched in order by a single orchestrating script, say an Ansible Playbook. Let’s keep that for some other day and talk about simple config generation for a single device that we can apply manually.

Ok, enough of the need for regular expressions (regex), let’s get started.

 

Regular Expressions

 

A regular expression defines a search pattern. This by itself is useful. However, combined with a “replace” pattern, it becomes even more useful. Let’s see some rules for the search pattern first.

 

METACHARACTER   DESCRIPTION and EXAMPLE

.       Match any single character.
        “a.c” matches aac, abc, acc, a1c, a2c and so on.

*       Match the preceding character zero or more times.
        “ab*c” matches ac, abc, abbc, abbbc and so on.

+       Match the preceding character one or more times.
        “ab+c” matches abc, abbc, abbbc and so on.

[ ]     Match any one of the characters between the brackets.
        Special ranges:
          [0-9] matches digits 0 to 9.
          [A-Z] matches uppercase A to Z.
          [a-z] matches lowercase a to z.
          \d is the same as [0-9].
          \w is the same as [a-zA-Z0-9_] (note the underscore).
        “a[bcde]f” matches abf, acf, adf, aef only.
        “int gig[1-3]/0” matches “int gig1/0”, “int gig2/0”, “int gig3/0”.
        “int [a-z]+1/0” matches “int eth1/0”, “int gig1/0”, “int f1/0”.
        “int \w+/0” matches “int eth1/0”, “int gig1/0”, “int gig2/0”.

\       Don’t interpret the following single character as a regex metacharacter.
        “ab\*c” matches ab*c only. The “*” is not interpreted as a metacharacter.

^       Match the start of line.
        “^ntp” matches “ntp server …”, “ntp peer …”, but not “!ntp …”.

$       Match the end of line.
        “eth3/10$” matches “int eth3/10”, “ip radius source-interface eth3/10”, but not “interface ethernet3/10”.

|       Logical OR of two alternatives.
        “eth|gig” matches “int eth1/0”, “int gig1/0”, but not “int f1/0”.

( )     Define a subexpression. It can be recalled later using \1, \2 .. or $1, $2 .. (more on this later).
        “int (eth|gig|fe)1/0” matches “int eth1/0”, “int gig1/0”, “int fe1/0”.
        “int (eth|gig)([1-2])+/0” matches “int eth1/0”, “int eth2/0”, “int gig1/0”, “int gig2/0”.

?       Match the preceding character or subexpression zero or one time. The point to remember is that “?” is greedy, i.e. if the optional part is present in the input, it is always included in the match; the “zero” case applies only when the optional part is absent.
        “abc(de)?f” matches both abcdef and abcf. If “abcdef” is the input string, the match is the whole of “abcdef”, with “de” captured.

 

While there are other regex metacharacters, the above should be enough for a start.
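The rules above can be tried outside a text editor as well. The following is a minimal sketch using Python's standard `re` module, which is an assumption of this example rather than the tool used in this post. Each case pairs a pattern from the table with one string it should match and one it should not; the non-matching strings such as "int eth3/100" are made up for illustration.

```python
import re

# Each entry: (pattern, text that should match, text that should not match).
CASES = [
    (r"a.c", "a1c", "ac"),
    (r"ab*c", "ac", "adc"),
    (r"ab+c", "abbc", "ac"),
    (r"int gig[1-3]/0", "int gig2/0", "int gig4/0"),
    (r"ab\*c", "ab*c", "abbc"),
    (r"^ntp", "ntp server 10.1.1.1", "!ntp server 10.1.1.1"),
    (r"eth3/10$", "int eth3/10", "int eth3/100"),
    (r"eth|gig", "int gig1/0", "int f1/0"),
    (r"abc(de)?f", "abcf", "abcdf"),
]


def check_cases(cases=CASES):
    """Return True if every pattern finds its match and rejects its non-match."""
    return all(
        re.search(pattern, hit) is not None and re.search(pattern, miss) is None
        for pattern, hit, miss in cases
    )


print(check_cases())
```

`re.search` looks for the pattern anywhere in the string, which is closest to how a Find dialog behaves.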

Before moving on to use the regex we just learned in actual config generation, let’s talk briefly about a “replace” pattern. Fixed replace strings are something you will have used countless times in typical Find/Replace operations in text editors. But what if you need the replace pattern to contain a part of the text matched by the search pattern? For that, you need to “reference” the earlier matched search pattern.

To recall part of a search pattern later in the replace pattern, we use the “( )” metacharacter in the search pattern as described earlier. The first “(” bracket in the search pattern is mapped to \1 in the replace pattern, the second “(” to \2, the third “(” to \3 and so on up to \9, which refers to the 9th occurrence of “(” in the search pattern. Some text editors, e.g. Atom, use $1, $2 instead of \1, \2.

 

Let’s see some examples.

  • Example Text:   interface gig1
  • Search Pattern: interface ([a-z]+)(\d+)

\1 in the replace pattern will hold the contents of ([a-z]+) = gig

\2 in the replace pattern will hold the contents of (\d+) = 1

 

  • Example Text:   interface gig1
  • Search Pattern: interface ([a-z]+(\d+))

\1 in the replace pattern will hold the contents of ([a-z]+(\d+)) = gig1

\2 in the replace pattern will hold the contents of (\d+) = 1
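The two examples above can be checked with a short sketch in Python's `re` module (an assumption of this example; the post itself uses a text editor). It shows that groups are numbered by the position of their opening bracket, including when they are nested. The replace string `name=\1 number=\2` is made up for illustration.

```python
import re

text = "interface gig1"

# Two groups side by side: group 1 is the name, group 2 is the number.
flat = re.search(r"interface ([a-z]+)(\d+)", text)
print(flat.group(1), flat.group(2))

# Nested groups are numbered by the position of their opening bracket,
# so the outer group is 1 and the inner group is 2.
nested = re.search(r"interface ([a-z]+(\d+))", text)
print(nested.group(1), nested.group(2))

# Recalling the groups in a replace pattern (Python accepts \1, \2 here).
summary = re.sub(r"interface ([a-z]+)(\d+)", r"name=\1 number=\2", text)
print(summary)
```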

 

Having seen some basic regex patterns, now it’s time to see how they can be useful in generating device configurations.

For the purpose of this post, I am using Notepad++ to generate the configuration. It is a simple, straight-to-the-point, all-in-one text editor package. On Apple Mac, I am using Atom, a modular and highly customizable editor that can be daunting at first and takes some time to set up, install the right packages and get used to. With respect to regex usage in Atom, use the “$1” convention instead of “\1” in the replace pattern and “\n” instead of “\r\n”.

 

Generate Configuration from Tabular Data

 

Suppose the configuration data exists in a tabular form in an Excel CSV file or in a design document containing network parameters and config snippets. The objective is to generate platform specific configuration from this file to build the physical device.

!!!! CSV File Containing VLAN SVI data

vlan,name,desc,ip,mask,dhcp1,dhcp2
10,data,Data VLAN,10.10.10.0,255.255.255.0,10.1.1.1,10.1.1.2
20,voice,Voice VLAN,10.10.20.0,255.255.255.0,10.1.1.1,10.1.1.2

!!!! SVI Configuration Snippet
vlan <VLAN>
 name <NAME>

interface vlan<VLAN>
 description <vlan-desc> 
 ip address <ip> <mask>
 ip helper-address <helper1> 
 ip helper-address <helper2>
 no shutdown

 

The regex search pattern (first line) and replace pattern (second line) to generate the target config from the CSV rows above are given below.

^(.+),(.+),(.+),(.+),(.+),(.+),(.+)$
vlan \1\r\n name \2\r\n\r\ninterface vlan\1\r\n description \3\r\n ip address \4 \5\r\n ip helper-address \6\r\n ip helper-address \7\r\n no shutdown\r\n

 

The resulting configuration looks like this. Imagine there were 50 or 100 VLANs; how long would that have taken by hand?

vlan 10
 name data

interface vlan10
 description Data VLAN
 ip address 10.10.10.0 255.255.255.0
 ip helper-address 10.1.1.1
 ip helper-address 10.1.1.2
 no shutdown

vlan 20
 name voice

interface vlan20
 description Voice VLAN
 ip address 10.10.20.0 255.255.255.0
 ip helper-address 10.1.1.1 
 ip helper-address 10.1.1.2 
 no shutdown
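For readers who want to reproduce this step outside an editor, here is a minimal sketch using Python's `re.sub`, which is an assumption of this example and not the tool used in the post. It uses the same search and replace patterns with “\n” in place of “\r\n”. The CSV header row is left out of the sample input on purpose, because the search pattern would otherwise turn it into a stanza as well.

```python
import re

# Data rows only; the header row is omitted (see the note above).
CSV_ROWS = (
    "10,data,Data VLAN,10.10.10.0,255.255.255.0,10.1.1.1,10.1.1.2\n"
    "20,voice,Voice VLAN,10.10.20.0,255.255.255.0,10.1.1.1,10.1.1.2\n"
)

SEARCH = r"^(.+),(.+),(.+),(.+),(.+),(.+),(.+)$"
REPLACE = (
    r"vlan \1\n name \2\n\ninterface vlan\1\n description \3\n"
    r" ip address \4 \5\n ip helper-address \6\n"
    r" ip helper-address \7\n no shutdown\n"
)


def csv_to_config(rows):
    """Turn each CSV row into a VLAN and SVI stanza."""
    # re.MULTILINE makes ^ and $ anchor at every line, as in the editor.
    return re.sub(SEARCH, REPLACE, rows, flags=re.MULTILINE)


config = csv_to_config(CSV_ROWS)
print(config)
```

Because each “(.+)” is greedy, the pattern relies on every row having exactly seven fields and no commas inside a field.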

 

Generate Tabular Data from Configuration

 

This can be a basic documentation method or a way to migrate from one hardware/vendor platform to another. Using the device config from the last section, regex search and replace patterns to generate CSV formatted tabular data are given below:

vlan (\d+)\r\n name (\w+)\r\n\r\ninterface vlan\d+\r\n description (.+)\r\n ip address ([0-9.]+) ([0-9.]+)\r\n ip helper-address ([0-9.]+)\r\n ip helper-address ([0-9.]+)\r\n no shutdown\r\n
\1,\2,\3,\4,\5,\6,\7

 

And the resulting output is the same CSV data we started with:

10,data,Data VLAN,10.10.10.0,255.255.255.0,10.1.1.1,10.1.1.2
20,voice,Voice VLAN,10.10.20.0,255.255.255.0,10.1.1.1,10.1.1.2
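The reverse direction can be sketched the same way with Python's `re.sub` (again an assumption of this example, with “\n” instead of “\r\n”). The sample input is the generated config from the previous section, typed without trailing spaces, since any stray whitespace at the end of a line would stop the pattern from matching.

```python
import re

# Sample input: the two stanzas generated earlier, with \n line endings.
CONFIG = "\n".join([
    "vlan 10",
    " name data",
    "",
    "interface vlan10",
    " description Data VLAN",
    " ip address 10.10.10.0 255.255.255.0",
    " ip helper-address 10.1.1.1",
    " ip helper-address 10.1.1.2",
    " no shutdown",
    "",
    "vlan 20",
    " name voice",
    "",
    "interface vlan20",
    " description Voice VLAN",
    " ip address 10.10.20.0 255.255.255.0",
    " ip helper-address 10.1.1.1",
    " ip helper-address 10.1.1.2",
    " no shutdown",
    "",
])

SEARCH = (
    r"vlan (\d+)\n name (\w+)\n\ninterface vlan\d+\n description (.+)\n"
    r" ip address ([0-9.]+) ([0-9.]+)\n ip helper-address ([0-9.]+)\n"
    r" ip helper-address ([0-9.]+)\n no shutdown\n"
)
REPLACE = r"\1,\2,\3,\4,\5,\6,\7"


def config_to_csv(config):
    """Collapse each VLAN and SVI stanza into one CSV record."""
    return re.sub(SEARCH, REPLACE, config)


csv_text = config_to_csv(CONFIG)
print(csv_text)
```

The blank line between stanzas is not part of the search pattern, so it survives the replacement and keeps the two records on separate lines.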

 

Accounting for missing configuration

 

Sometimes translating configuration into tabular data may not readily work due to inconsistent configuration across different stanzas. As an example, suppose VLAN 20 is missing the “description” bit under the SVI config.

vlan 10
 name data

interface vlan10
 description Data VLAN
 ip address 10.10.10.0 255.255.255.0
 ip helper-address 10.1.1.1
 ip helper-address 10.1.1.2
 no shutdown

vlan 20
 name voice

interface vlan20
 ip address 10.10.20.0 255.255.255.0
 ip helper-address 10.1.1.1 
 ip helper-address 10.1.1.2 
 no shutdown

 

Our search pattern needs to account for both the presence and absence of the description field. That is a task for our greedy zero-or-one friend, the “?”. Wrapping the whole description line in an optional group adds one more “(”, so the groups after it shift up by one. Here is how it will look:

vlan (\d+)\r\n name (\w+)\r\n\r\ninterface vlan\d+\r\n( description (.+)\r\n)? ip address ([0-9.]+) ([0-9.]+)\r\n ip helper-address ([0-9.]+)\r\n ip helper-address ([0-9.]+)\r\n no shutdown\r\n
\1,\2,\4,\5,\6,\7,\8

 

This converts the config to the following CSV:

10,data,Data VLAN,10.10.10.0,255.255.255.0,10.1.1.1,10.1.1.2
20,voice,,10.10.20.0,255.255.255.0,10.1.1.1,10.1.1.2

 

Notice the 3rd field, description, is blank in the second record. The replace pattern uses “\4” for this field; for VLAN 20 the optional group matched nothing, so “\4” is empty and the field is left blank.
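The optional group can be tried in Python too. This is a sketch under two assumptions: Python's `re` module stands in for the editor, and the interpreter is Python 3.5 or later, where a reference to a group that did not take part in the match is replaced with an empty string.

```python
import re

# VLAN 20 has no description line under its SVI.
CONFIG = "\n".join([
    "vlan 10",
    " name data",
    "",
    "interface vlan10",
    " description Data VLAN",
    " ip address 10.10.10.0 255.255.255.0",
    " ip helper-address 10.1.1.1",
    " ip helper-address 10.1.1.2",
    " no shutdown",
    "",
    "vlan 20",
    " name voice",
    "",
    "interface vlan20",
    " ip address 10.10.20.0 255.255.255.0",
    " ip helper-address 10.1.1.1",
    " ip helper-address 10.1.1.2",
    " no shutdown",
    "",
])

# Group 3 is the whole optional line, group 4 is the description text.
SEARCH = (
    r"vlan (\d+)\n name (\w+)\n\ninterface vlan\d+\n( description (.+)\n)?"
    r" ip address ([0-9.]+) ([0-9.]+)\n ip helper-address ([0-9.]+)\n"
    r" ip helper-address ([0-9.]+)\n no shutdown\n"
)
REPLACE = r"\1,\2,\4,\5,\6,\7,\8"

csv_text = re.sub(SEARCH, REPLACE, CONFIG)
print(csv_text)
```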

 

Conclusion

Regex can be super useful when you know how to use it, and together with the Notepad++ editor, found on pretty much any IT desktop machine, it is a powerful data analysis combination for a quick fix. The usage remains almost the same outside the realm of text editors. Many Cisco IOS, IOS XE, NX-OS and other vendor platforms support regex search patterns in show commands on the CLI. The caveat is that they often offer only basic regex support. If you have used the Linux “grep” or “egrep” command line tools, search patterns work there too.

If your config automation requires more than a few search/replace tasks or some non-trivial if-else checks, and it is a frequent activity, scripting the logic in a programming language such as Python, or better still an automation framework like Ansible, may save you heaps of time in future. Even in such cases, the on-the-fly combination of Notepad++ and regex can act as a powerful proof of concept and verification tool.
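As a rough illustration of what scripting the logic might look like, here is a minimal Python sketch that reads the same CSV columns with the standard `csv` module and emits the description line only when the field is not empty. The function names `render_svi` and `render_all` are hypothetical and chosen for this example only.

```python
import csv
import io

# Same columns as the CSV in this post; VLAN 20 has an empty description.
CSV_TEXT = (
    "vlan,name,desc,ip,mask,dhcp1,dhcp2\n"
    "10,data,Data VLAN,10.10.10.0,255.255.255.0,10.1.1.1,10.1.1.2\n"
    "20,voice,,10.10.20.0,255.255.255.0,10.1.1.1,10.1.1.2\n"
)


def render_svi(row):
    """Build one VLAN and SVI stanza from a CSV row (a dict keyed by header)."""
    lines = [
        f"vlan {row['vlan']}",
        f" name {row['name']}",
        "",
        f"interface vlan{row['vlan']}",
    ]
    # The kind of if-else check that is awkward in pure search/replace.
    if row["desc"]:
        lines.append(f" description {row['desc']}")
    lines += [
        f" ip address {row['ip']} {row['mask']}",
        f" ip helper-address {row['dhcp1']}",
        f" ip helper-address {row['dhcp2']}",
        " no shutdown",
    ]
    return "\n".join(lines)


def render_all(csv_text):
    """Render every data row; DictReader consumes the header row itself."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return "\n\n".join(render_svi(row) for row in reader)


output = render_all(CSV_TEXT)
print(output)
```

Unlike the regex approach, the header row is handled by the CSV reader, so it never ends up in the generated config.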

 


About: Rashid