Regular Expression Extractor in JMeter

The Regular Expression Extractor allows the user to extract values from a server response using a Perl-type regular expression. As a post-processor, this element executes after each Sample request in its scope: it applies the regular expression, extracts the requested values, generates the template string, and stores the result in the given variable name.

While working with JMeter, you sometimes need to reuse specific values returned in the Response Data of previous requests/samplers. For example, a getAccountInfo API needs the token returned by the login API, or you need to extract an item_id to open the detail of that item in the next step. In these cases, the Regular Expression Extractor is a useful element.

In this post, I will show you how to use the Regular Expression Extractor and explain each parameter of this element.

An example of a Test Plan structure:

Dummy Sampler – useful for debugging: you can put any data into the “Response Data” field and make the Regular Expression Extractor a child of the Dummy Sampler.
NOTE: You must install this plugin before running my example test plan.
- I created response data like below:
  title="JMeter VN" name="file" value="readme.txt"
Debug Sampler – outputs JMeter variable values (it can also print the JMeter and System properties)
View Results Tree – visualizes the output of the Dummy and Debug samplers

1. Using Regular Expression Extractor:

Suppose you want to extract “readme.txt” from the Response Data. Just follow the steps below and see what you get.

STEP 1: Add a Regular Expression Extractor by right-clicking the Dummy Sampler element > Add > Post Processors > Regular Expression Extractor

STEP 2: In the Regular Expression Extractor, enter the following:

Reference Name: VALUE
Regular Expression: value="(.+?)"
Template: $1$
Match No. (0 for Random): 1
Default Value: NOT_FOUND

Let me explain a bit:
– Reference Name: the name of the variable in which the extracted value will be stored.
– Regular Expression (regex): the regular expression used to parse the response data.
– Template: in most cases we’ll enter $1$
– Match No. (0 for Random): in most cases we’ll enter 1
– Default Value: you can leave this field blank.

NOTE: The meaning of all these properties can be found here, or please continue reading the sections below.

STEP 3: Run the test and see what happens

As you can see, we extracted “readme.txt” as we wanted. You can download the demo file here and run it yourself.
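If you want to check the same extraction outside JMeter, here is a minimal Python sketch. Python's re module is only a stand-in for JMeter's regex engine (an assumption that holds for simple patterns like this one), and the helper name extract_first is hypothetical, not part of JMeter.

```python
import re

# Response Data of the Dummy Sampler from this example
response_data = 'title="JMeter VN" name="file" value="readme.txt"'

def extract_first(pattern, text, default="NOT_FOUND"):
    """Return group 1 of the first match (Template $1$, Match No. 1),
    or the default value when nothing matches."""
    match = re.search(pattern, text)
    return match.group(1) if match else default

VALUE = extract_first(r'value="(.+?)"', response_data)
print(VALUE)  # readme.txt
```

A pattern that does not match, such as id="(.+?)", would return NOT_FOUND here, which mirrors the Default Value field.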

2. Parameters in Regex Extractor

2.1 Apply to: which samples this extractor will be applied to. For now, please leave it at the default option, Main sample only. I will cover the details in another post.

2.2 Field to check: Just leave it at the default, Body, which means the data will be extracted from the response body, aka Response Data. I will also cover the details in another post.

2.3 Reference Name: The name of the JMeter variable in which to store the result. It becomes more powerful in some cases, which are described below.

2.4 Regular Expression: The regular expression used to parse the response data. This must contain at least one set of parentheses ( ) to capture a portion of the string.

Here are some useful links to help you understand regular expressions:

Regular Expressions from JMeter User Manual Page
http://regexr.com/ or https://regex101.com/ to learn and practice Regex

TIP: Here I will show you an easy way to define a Regular Expression. It works in most cases. Yes, please remember: in most cases this tip works like a charm.
– Go back to the example in section 1, where we want to extract “readme.txt”
– Just copy a little bit of the data before and after the expected value, so we have
value="readme.txt"
– Then replace the expected value with (.+?)
– Finally we have the Regular Expression value="(.+?)"

What do you think? Is it easy? Let’s practice a bit with the other values.

→ If you want to extract the text “file” of the attribute name, follow the guideline above and we have the Regular Expression: name="(.+?)"

→ If you want to extract the text “JMeter VN” of the attribute title, follow the guideline above and we have the Regular Expression: title="(.+?)"
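The tip above (copy the text on both sides, then put (.+?) in the middle) can be sketched as a small helper. This is only an illustration in Python: boundary_regex is a hypothetical function name, and re.escape is used on the assumption that the copied boundaries may contain regex metacharacters such as ? or ( that need protecting.

```python
import re

def boundary_regex(left, right):
    """Build a regex that captures the text between two literal boundaries."""
    # re.escape keeps characters like '.', '?' or '(' in the boundaries literal
    return re.escape(left) + "(.+?)" + re.escape(right)

response_data = 'title="JMeter VN" name="file" value="readme.txt"'

for attribute in ("title", "name", "value"):
    pattern = boundary_regex(attribute + '="', '"')
    print(attribute, "->", re.search(pattern, response_data).group(1))
```

For this response data the loop prints JMeter VN, file and readme.txt, one per attribute.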

The special characters (.+?) above are:

( and ): these enclose the portion of the match string to be returned
.: match any character
+: one or more times
?: don’t be greedy, i.e. stop when the first match succeeds

Note 1: without the ?, the .+ would continue past the first " until it found the last possible " – which is probably not what was intended.

Example 1:
– name="(.+?)" will match the value “file”.
– name="(.+)" will match the value file" value="readme.txt because it’s greedy 😀

Note 2: Although the above expression works, it’s more efficient to use the following expression: value="([^"]+)" where
[^"] – means match anything except ". In this case, the matching engine can stop looking as soon as it sees the first "
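To see the difference between the lazy, greedy and negated-class forms side by side, here is a short Python sketch. Again, Python's re module is used as a stand-in for JMeter's engine, which is an assumption, but these three constructs behave the same way in Perl-style engines.

```python
import re

response_data = 'title="JMeter VN" name="file" value="readme.txt"'

lazy = re.search(r'name="(.+?)"', response_data).group(1)      # stops at the first closing quote
greedy = re.search(r'name="(.+)"', response_data).group(1)     # runs on to the last quote
negated = re.search(r'name="([^"]+)"', response_data).group(1) # cannot cross a quote at all

print(lazy)     # file
print(greedy)   # file" value="readme.txt
print(negated)  # file
```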

Extract multiple values: Suppose you want to extract both “file” and “readme.txt”.

Then the regular expression would be:

name="(.+?)" value="(.+?)" or name="([^"]+)" value="([^"]+)"

You should know that when a regular expression contains more than one pair of parentheses, each pair forms a group. The first pair of parentheses is group 1, the next pair is group 2, and so on. The Regex Extractor also saves the values of the groups in additional variables.

Example 2 (download here): create Regular Expression Extractor with

Reference Name: MULTIPLE_EXTRACT
Regular Expression: name="(.+?)" value="(.+?)"
Template: $1$ $2$

The following variables would be set:

MULTIPLE_EXTRACT=file readme.txt
MULTIPLE_EXTRACT_g=2
MULTIPLE_EXTRACT_g0=name="file" value="readme.txt"
MULTIPLE_EXTRACT_g1=file
MULTIPLE_EXTRACT_g2=readme.txt

Where the suffixes are:
– g : the total number of groups in the regex
– g0 : whatever the entire expression matches
– gn : the value of group n = 1, 2, 3, …
– The original variable name holds a value that depends on the template. See below.

These variables can be referred to later on in the JMeter test plan, as ${MULTIPLE_EXTRACT}, ${MULTIPLE_EXTRACT_g1} etc.
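As a rough illustration of how these variables relate to the regex groups, the Python sketch below rebuilds the same names from a single match. It is not JMeter code: the dictionary simply imitates the variable list above, and the template $1$ $2$ is hard-coded as an assumption.

```python
import re

response_data = 'title="JMeter VN" name="file" value="readme.txt"'
match = re.search(r'name="(.+?)" value="(.+?)"', response_data)

ref = "MULTIPLE_EXTRACT"
group_count = match.re.groups  # number of ( ) groups in the pattern

variables = {
    ref: match.group(1) + " " + match.group(2),  # template $1$ $2$
    ref + "_g": str(group_count),
    ref + "_g0": match.group(0),                 # the whole matched text
}
for n in range(1, group_count + 1):
    variables[f"{ref}_g{n}"] = match.group(n)

for name, value in variables.items():
    print(f"{name}={value}")
```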

2.5 Template: The template used to create a string of the matches found. This is an arbitrary string with special elements to refer to groups within the regular expression. The syntax to refer to a group is: $1$ to refer to group 1, $2$ to refer to group 2, etc.

Go back to Example 2:

– If we define Template: $2$ $1$, the variable will store the value readme.txt file

– If Template: $2$_$1$, it returns readme.txt_file
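The template substitution can be imitated in a few lines of Python. This is a sketch under assumptions: render_template is a hypothetical helper, and it only handles the $n$ placeholders described above, not any other template features JMeter may support.

```python
import re

def render_template(template, match):
    """Replace every $n$ placeholder with the text of group n."""
    return re.sub(r"\$(\d+)\$",
                  lambda placeholder: match.group(int(placeholder.group(1))),
                  template)

response_data = 'title="JMeter VN" name="file" value="readme.txt"'
match = re.search(r'name="(.+?)" value="(.+?)"', response_data)

print(render_template("$1$ $2$", match))  # file readme.txt
print(render_template("$2$ $1$", match))  # readme.txt file
print(render_template("$2$_$1$", match))  # readme.txt_file
```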

2.6 Match No. (0 for Random): Indicates which match to use. The regular expression may match multiple times. If you are sure there is only 1 match, then enter 1 in this field as a default value.

There are 3 kinds of Match No.:

Use a value of zero to indicate JMeter should choose a match at random.
A positive number N means to select the N-th match.
Negative numbers are used to return all matches.

Now I change the test plan a little: I added more lines to the Response Data of the Dummy Sampler. It now looks like this:

title="JMeter VN" name="file" value="readme.txt"
title="Google" name="web" value="search"
title="Facebook" name="application" value="social-network"

Example 3: Consider the sample in section 1, using the regular expression value="(.+?)". In this case it matches more than one value; in fact, there are 3 matches.

– Match No: 0 –> Returns a random value: readme.txt OR search OR social-network
– Match No: 1 –> Returns the first match = readme.txt
– Match No: 2 –> Returns the second match = search
– Match No: 3 –> Returns the third match = social-network
– Match No: 4 –> Returns no match because 4 is out of the range of matches, so the variable gets the Default Value if one is set
– Match No: -1 –> Returns all matches; see below how the variables store all the values
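The selection rules for zero and positive Match No. values can be sketched in Python as follows. The function select_match is hypothetical and only imitates the behaviour listed above; negative values are left out here because they produce a whole set of variables rather than a single value.

```python
import random
import re

response_data = '''title="JMeter VN" name="file" value="readme.txt"
title="Google" name="web" value="search"
title="Facebook" name="application" value="social-network"'''

def select_match(pattern, text, match_no, default="NOT_FOUND"):
    """Pick one match the way Match No. 0 or N is described above."""
    values = re.findall(pattern, text)  # one group, so a list of strings
    if match_no == 0:
        return random.choice(values) if values else default
    if 1 <= match_no <= len(values):
        return values[match_no - 1]     # Match No. is 1-based
    return default                      # out of range: fall back to the default

pattern = r'value="(.+?)"'
print(select_match(pattern, response_data, 1))  # readme.txt
print(select_match(pattern, response_data, 3))  # social-network
print(select_match(pattern, response_data, 4))  # NOT_FOUND
```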

Assume the Reference Name is VALUE. When Match No = -1, the following variables are set:
VALUE=NOT_FOUND
VALUE_1=readme.txt
VALUE_1_g=1
VALUE_1_g0=value="readme.txt"
VALUE_1_g1=readme.txt
VALUE_2=search
VALUE_2_g=1
VALUE_2_g0=value="search"
VALUE_2_g1=search
VALUE_3=social-network
VALUE_3_g=1
VALUE_3_g0=value="social-network"
VALUE_3_g1=social-network
VALUE_matchNr=3

These variables can be referred to later on in the JMeter test plan, as ${VALUE_1}, ${VALUE_1_g1}, ${VALUE_3_g1} etc.

Where:

refName_matchNr – the number of matches found; could be 0
refName_n, where n = 1, 2, 3 etc. – the strings as generated by the template
refName_n_gm, where m=0, 1, 2 – the groups for match n
refName – always set to the default value
refName_gn – not set
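The naming scheme for Match No = -1 can be reproduced with a short Python sketch. The function all_match_variables is hypothetical, the template is assumed to be $1$, and the dictionary only imitates the JMeter variables listed above.

```python
import re

response_data = '''title="JMeter VN" name="file" value="readme.txt"
title="Google" name="web" value="search"
title="Facebook" name="application" value="social-network"'''

def all_match_variables(ref_name, pattern, text, default="NOT_FOUND"):
    """Build refName_n, refName_n_gm and refName_matchNr for every match."""
    variables = {ref_name: default}          # refName itself keeps the default
    count = 0
    for count, match in enumerate(re.finditer(pattern, text), start=1):
        prefix = f"{ref_name}_{count}"
        variables[prefix] = match.group(1)   # assumed template: $1$
        variables[f"{prefix}_g"] = str(match.re.groups)
        for m in range(match.re.groups + 1): # g0 is the whole match
            variables[f"{prefix}_g{m}"] = match.group(m)
    variables[f"{ref_name}_matchNr"] = str(count)
    return variables

result = all_match_variables("VALUE", r'value="(.+?)"', response_data)
for name in sorted(result):
    print(f"{name}={result[name]}")
```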

Example 4: Consider Example 2, using the regex name="(.+?)" value="(.+?)" with Match No = -1. Let’s see what we have:

You can download this example here

MATCH_NUMBER=NOT_FOUND
MATCH_NUMBER_1=file readme.txt
MATCH_NUMBER_1_g=2
MATCH_NUMBER_1_g0=name="file" value="readme.txt"
MATCH_NUMBER_1_g1=file
MATCH_NUMBER_1_g2=readme.txt
MATCH_NUMBER_2=web search
MATCH_NUMBER_2_g=2
MATCH_NUMBER_2_g0=name="web" value="search"
MATCH_NUMBER_2_g1=web
MATCH_NUMBER_2_g2=search
MATCH_NUMBER_3=application social-network
MATCH_NUMBER_3_g=2
MATCH_NUMBER_3_g0=name="application" value="social-network"
MATCH_NUMBER_3_g1=application
MATCH_NUMBER_3_g2=social-network
MATCH_NUMBER_matchNr=3

These variables can be referred to later on in the JMeter test plan, as ${MATCH_NUMBER_matchNr}, ${MATCH_NUMBER_3_g2}, ${MATCH_NUMBER_1_g1} etc.

2.7 Default Value: If the regular expression does not match, then the reference variable will be set to the default value. This is particularly useful for debugging tests. If no default is provided, then it is difficult to tell whether the regular expression did not match, or the RegEx element was not processed or maybe the wrong variable is being used.

However, if you have several test elements that set the same variable, you may wish to leave the variable unchanged if the expression does not match. In this case, remove the default value once debugging is complete.

I usually use the string NOT_FOUND, or include the variable name, as in VARIABLE_NAME_NOT_FOUND. From version 3.0, we also have the option Use empty default value. It means your variable will be set to an empty string when there is no match.
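The three no-match situations described in this section (a default is set, no default is set, or an empty default is requested) can be sketched like this in Python. It is only an illustration of the behaviour as described here: apply_extractor and its parameters are hypothetical names, and token is a made-up attribute that deliberately does not exist in the response data.

```python
import re

def apply_extractor(variables, ref_name, pattern, text,
                    default="", use_empty_default=False):
    """Update the variables dict the way the section above describes."""
    match = re.search(pattern, text)
    if match:
        variables[ref_name] = match.group(1)
    elif default or use_empty_default:
        # No match: fall back to the default (possibly an empty string)
        variables[ref_name] = default
    # No match and no default: the variable is left unchanged
    return variables

data = 'title="JMeter VN" name="file" value="readme.txt"'
no_match = r'token="(.+?)"'

print(apply_extractor({}, "TOKEN", no_match, data, default="NOT_FOUND"))
print(apply_extractor({"TOKEN": "old"}, "TOKEN", no_match, data))
print(apply_extractor({}, "TOKEN", no_match, data, use_empty_default=True))
```

The first call stores NOT_FOUND, the second keeps the old value, and the third stores an empty string.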

P/S: The Regular Expression Extractor is a powerful element that can extract almost anything. But if your data is in JSON format, then I strongly recommend you use the JSON Path PostProcessor instead. Or if you want to extract data from HTML or XML, please consider the XPath Extractor first; it will be much easier.