I am trying to test some APIs, but first I need to get session keys for each customer I use in the test. I have a CSV file with customer login information, which each thread reads.
I have the following structure in my JMeter test plan:
CSV Data Set Config - User Login Info
Set username, password - one for each iteration
setUp Thread Group
BeanShell Sampler to delete the sessionKeys.csv
File file = new File("C:/user/sessionKeys.csv");
if (file.exists() && file.isFile()) {
    file.delete();
}
Login Thread - ThreadCount = threadCount, Loop = 1
Login Request
BeanShell PostProcessor to create the file and append session keys to sessionKeys.csv
if("${sessionKey}" != "not_found")
{
File file = new File("C:/user/sessionKeys.csv");
FileWriter fWriter = new FileWriter(file, true);
BufferedWriter buff = new BufferedWriter(fWriter);
buff.write("${sessionKey}\n");
buff.close();
fWriter.close();
}
CSV Data Set Config - Session Keys
API Call Thread - ThreadCount = threadCount, Loop = loop
GetData Request
And I noticed that even though the file is actually deleted, recreated, and filled with new session keys, the first few requests use the old session keys that were in the file before it was deleted.
I have tried adding a Constant Timer and changing the structure of the test plan, but nothing worked.
Take a look at the JMeter Test Elements Execution Order:
Configuration elements
Pre-Processors
Timers
Sampler
Post-Processors (unless SampleResult is null)
Assertions (unless SampleResult is null)
Listeners (unless SampleResult is null)
CSV Data Set Config is a Configuration Element, hence it's executed long before the BeanShell Sampler, and this perfectly explains the behaviour you're facing.
So if you need to do some pre-processing of the CSV file, you will need to do it in a setUp Thread Group.
Also be aware that starting from JMeter 3.1 you're supposed to be using JSR223 Test Elements and the Groovy language for scripting, so it makes sense to consider migrating.
It looks like the CSV Data Set Config element in your test plan exists outside of the thread group, so it will be read first, before the file is deleted and recreated.
In your case it might be simpler to just store the session key as a JMeter property so that it can be accessed in all thread groups. You can store it using Groovy like props.put("sessionKey", vars.get("sessionKey")) or via JMeter functions like ${__setProperty(sessionKey,${sessionKey})}.
The property can then be accessed again using the property function like ${__P(sessionKey,)}.
I have a Test Plan containing one Thread Group with one HTTP Request sampler, a JSR223 PreProcessor, and one CSV Data Set Config. I need to read from the CSV, at run time, the current value of column 2 and use it in my JSR223 PreProcessor. In order to do this, I defined a variable on the Test Plan:
name ${__CSVRead(C:/Users/marial/Desktop/csvs/csv_hotelCodeReq.txt,2)}
In the JSR223 PreProcessor I read it like this:
String name= new String(vars.get("name"));
I would expect this value to change with each line read, but it didn't; it always takes the first value encountered. Does anyone know why?
To be more specific, if I have the CSV file:
1,2,firstName1:lastName1
3,2,firstName2:lastName2
and loop count = 2, users = 1, then the values of name are:
loop1: firstName1:lastName1
loop2: firstName1:lastName1
The other values are correctly handled, so it goes to the next line.
According to the User Defined Variables documentation:
Note that all the UDV elements in a test plan - no matter where they are - are processed at the start.
So your __CSVRead() function is evaluated only once, during test startup.
The solution would be moving the function into the "Parameters" section of the JSR223 PreProcessor; you will then be able to access the function output via Parameters in your Groovy script, like:
String name = Parameters
This way the __CSVRead() function will be executed each time the JSR223 PreProcessor is called. Check out the Apache Groovy - Why and How You Should Use It article to learn more about Groovy scripting in JMeter.
I'm using Apache JMeter 3.2 r1790748 on Mac.
I have a setUp Thread Group making an authentication call. The call works and outputs the tokens correctly. Now I need to pass that token to the HTTP Header Manager for all the calls I'm making.
First of all, here's my token json output:
{
"access_token": "aaaaaa555555555",
"token_type": "Access",
"user_id": "5555"
}
Here's what my HTTP Header Manager looks like:
1 value: Authorization : Bearer ${access_token}
My network call:
GET https://my_server.com/some_path
GET data:
[no cookies]
Request Headers:
Connection: close
Authorization: Bearer ${access_token}
Host: my_server.com
User-Agent: Apache-HttpClient/4.5.3 (Java/1.8.0_91)
As you can see, the variable access_token is not being replaced with the value from the setup call.
What I've tried:
BeanShell PostProcessor:
I created this script, and it actually parses and outputs the access_token properly:
import org.apache.jmeter.protocol.http.control.Header;
import net.minidev.json.JSONObject;
import net.minidev.json.parser.JSONParser;
String jsonString = prev.getResponseDataAsString();
log.info("jsonString = " + jsonString);
JSONParser parser = new JSONParser(JSONParser.MODE_JSON_SIMPLE);
JSONObject json = (JSONObject) parser.parse(jsonString);
String access_token = json.getAsString("access_token");
log.info("access_token = " + access_token);
vars.put("access_token", access_token);
JSON Extractor:
Apply to: Main sample and sub-samples
Variable names: access_token
JSON Path expressions: access_token
Match No. (0 for Random): 1
Compute concatenation var (suffix _ALL): unchecked
Default Values: none
Any ideas as to why the header manager is not applying the value of the access_token result?
Thanks!
Since you set a variable in a setUp Thread Group, you cannot use it in other thread groups, since thread groups don't share variables, only properties.
So in order to pass authentication, you need to save it as a property:
${__setProperty(access_token, ${access_token})}
In this example I am using the value of the variable named access_token (already set, but only available in the setUp Thread Group) to set a property with the same name, which will be available across thread groups. Alternatively, change the BeanShell PostProcessor and add:
props.put("access_token", access_token);
And then in the other thread group, you retrieve it using the __P or __property function:
${__P(access_token)}
Also keep in mind that the HTTP Header Manager initializes before any thread starts, so you can't use variables there for that reason as well. Check this question for instance.
If you still see an empty value, I recommend adding a Debug Sampler (with both JMeter Properties and JMeter Variables enabled) in both thread groups and checking where the breakage is (on saving or on retrieving).
As per the Functions and Variables chapter of the JMeter User Manual:
Variables are local to a thread; properties are common to all threads, and need to be referenced using the __P or __property function
So the variable you define in the setUp Thread Group cannot be accessed by:
other threads in the same Thread Group
other threads outside the Thread Group
So my recommendations are:
Switch to JMeter Properties instead of JMeter Variables; JMeter Properties are global to all threads and in fact to the whole JVM instance.
Switch to a JSR223 PostProcessor with the Groovy language instead of the BeanShell PostProcessor; JSR223 elements perform much better, and moreover Groovy has built-in JSON support.
So:
The relevant Groovy code for getting the access_token attribute value and storing it into the relevant property would be:
props.put('access_token', new groovy.json.JsonSlurper().parse(prev.getResponseData()).access_token)
You can refer to the value in the HTTP Header Manager (or wherever you require it) as:
${__P(access_token,)}
I have a JMeter project:
- HTTP Request Defaults
- HTTP Header Manager
- CSV Data Set Config
(filename = my.tsv, variable names = myVar,..., delimiter = \t, others default)
- Thread group
--- Loop Controller
----- HTTP Request (uses ${myVar})
----- Timer
----- View Results Tree
The issue is that only the first line of my.tsv is used by JMeter to generate requests. How can I fix it?
If you define multiple threads, it should pick the values from the TSV file: each thread picks a different value until all the rows are consumed, and the records then repeat for subsequent threads/iterations.
If you are looping with a Loop Controller (multiple loops and a single thread), then JMeter uses the same value picked in the first iteration for the remaining iterations.
I am new to CouchDB. I need to get 60 or more JSON files per minute from a server.
I have to upload these JSON files to CouchDB individually as soon as I receive them.
I installed CouchDB on my Linux machine.
I hope someone can help me with my requirement - if possible, with pseudo-code.
My idea:
Write a Python script to upload all the JSON files to CouchDB. Every JSON file must become its own document, and the data present in the JSON must be inserted into CouchDB unchanged (the same format and values as in the file).
Note:
These JSON files are transactional; one file is generated every second. So I need to read each file, upload it to CouchDB in the same format, and on successful upload archive the file to a different folder on the local system.
Python program to parse the JSON files and insert them into CouchDB:

import errno
import glob
import json
from pprint import pprint

import couchdb

# Connect to CouchDB; set credentials if the server requires them.
couch = couchdb.Server('http://localhost:5984/')
#couch.resource.credentials = (USERNAME, PASSWORD)
db = couch['mydb']

path = 'C:/Users/Desktop/CouchDB_Python/Json_files/*.json'

for name in glob.glob(path):
    try:
        with open(name) as f:  # No need to specify 'r': this is the default.
            data = json.load(f)
        db.save(data)  # Each file becomes its own document.
        pprint(data)
    except IOError as exc:
        if exc.errno != errno.EISDIR:  # Do not fail if a directory is found, just ignore it.
            raise  # Propagate other kinds of IOError.
I would use the CouchDB bulk API, even though you have specified that you need to send the documents to the db one by one. For example, implementing a simple queue that gets flushed every, say, 5-10 seconds via a bulk doc call will greatly increase the performance of your application (see the sketch after the link below).
There is obviously a quirk in that: you need to know the IDs of the docs that you want to get from the DB. But for the PUTs it is perfect. (It is not entirely true, either: you can get ranges of docs using a bulk operation if the IDs you are using for your docs can be sorted nicely.)
From my experience working with CouchDB, I have a hunch that you are dealing with transactional documents in order to compile them into some sort of aggregate result and act on that data accordingly (maybe creating the next transactional doc in the series). For that you can rely on CouchDB by using 'reduce' functions on the views you create. It takes a little practice to get a reduce function working properly, and it is highly dependent on what it is you actually want to achieve and what data you are prepared to emit from the view, so I can't really provide you with more detail on that.
So in the end the app logic would go something like that:
get _design/someDesign/_view/yourReducedView
calculate new transaction
add transaction to queue
onTimeout
send all in transaction queue
If I got the first part wrong (why you are using transactional docs), all that would really change is the part of my app logic where you get those transactional docs.
Also, before writing your own 'reduce' function, have a look at the built-in ones (they are a lot faster than anything outside of the db engine can do).
http://wiki.apache.org/couchdb/HTTP_Bulk_Document_API
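For illustration, here is a minimal sketch of the queued bulk upload idea in Python, using the requests library against the standard /_bulk_docs endpoint; the database URL and the flush interval are assumptions, not part of the original setup:

import time
import requests

DB_URL = "http://localhost:5984/mydb"  # assumed database location
FLUSH_INTERVAL = 5                     # seconds between bulk flushes (tune to taste)

queue = []
last_flush = time.time()

def enqueue(doc):
    """Queue a doc and flush the queue once the interval has elapsed."""
    global last_flush
    queue.append(doc)
    if time.time() - last_flush >= FLUSH_INTERVAL:
        flush()
        last_flush = time.time()

def flush():
    """Send every queued doc to CouchDB in a single _bulk_docs call."""
    if not queue:
        return
    resp = requests.post(DB_URL + "/_bulk_docs", json={"docs": queue})
    resp.raise_for_status()  # CouchDB answers 201 when the batch is accepted
    queue.clear()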
EDIT:
Since you are just starting out, I strongly recommend having a look at the CouchDB Definitive Guide.
NOTE FOR LATER:
Here is one hidden stone (well, maybe not so much a hidden stone as something that is not obvious to a newcomer). When you write a reduce function, make sure that it does not produce too much output for an unbounded query. That will slow the entire view down dramatically, even when you provide reduce=false when getting stuff from it.
So you need to get JSON documents from a server and send them to CouchDB as you receive them. A Python script would work fine. Here is some pseudo-code:
loop (until no more docs)
get new JSON doc from server
send JSON doc to CouchDB
end loop
In Python, you could use requests to send the documents to CouchDB and probably to get the documents from the server as well (if it is using an HTTP API).
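For instance, a minimal sketch along those lines; the source endpoint, the polling convention, and the database name are hypothetical placeholders:

import requests

SOURCE_URL = "http://example.com/next-doc"  # hypothetical source endpoint
DB_URL = "http://localhost:5984/mydb"       # assumed CouchDB database

while True:
    resp = requests.get(SOURCE_URL)
    if resp.status_code == 404:  # assumed convention for "no more docs"
        break
    doc = resp.json()
    # POST to the database root creates a document with a server-generated _id.
    requests.post(DB_URL, json=doc).raise_for_status()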
You might want to check out the pycouchdb module for Python 3. I've used it myself to upload lots of JSON objects into a CouchDB instance. My project does pretty much the same as you describe, so you can take a look at my project Pyro on GitHub for details.
My class looks like this:
import sys
import pycouchdb

class MyCouch:
    """ COMMUNICATES WITH COUCHDB SERVER """

    def __init__(self, server, port, user, password, database):
        # ESTABLISHING CONNECTION (basic-auth credentials go before an '@')
        self.server = pycouchdb.Server("http://" + user + ":" + password + "@" + server + ":" + port + "/")
        self.db = self.server.database(database)

    def check_doc_rev(self, doc_id):
        # CHECKS REVISION OF SUPPLIED DOCUMENT
        try:
            rev = self.db.get(doc_id)
            return rev["_rev"]
        except Exception:
            return -1

    def update(self, all_computers):
        # UPDATES DATABASE WITH A LIST OF DOCUMENTS IN ONE BULK CALL
        try:
            result = self.db.save_bulk(all_computers, transaction=False)
            sys.stdout.write(" Updating database")
            sys.stdout.flush()
            return result
        except Exception as ex:
            sys.stdout.write("Updating database")
            sys.stdout.write("Exception: ")
            print(ex)
            sys.stdout.flush()
            return None
Let me know in case of any questions - I will be more than glad to help if you find some of my code usable.
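A hypothetical usage sketch (host, credentials, and database name are placeholders):

# Placeholders for illustration only
couch = MyCouch("localhost", "5984", "admin", "secret", "mydb")
docs = [{"_id": "pc-001", "status": "ok"}, {"_id": "pc-002", "status": "ok"}]
couch.update(docs)  # bulk-saves both documents in a single request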
I'm trying to create an SSIS package to process files from a directory that contains many years worth of files. The files are all named numerically, so to save processing everything, I want to pass SSIS a minimum number, and only enumerate files whose name (converted to a number) is higher than my minimum.
I've tried letting the ForEach File loop enumerate everything and then exclude files in a Script Task, but when dealing with hundreds of thousands of files, this is way too slow to be suitable.
The FileSpec property lets you specify a file mask to dictate which files you want in the collection, but I can't quite see how to specify an expression to make that work, as it's essentially a string match.
If there's an expression within the component somewhere which basically says Should I Enumerate? - Yes / No, that would be perfect. I've been experimenting with the below expression, but can't find a property to which to apply it.
(DT_I4)REPLACE( SUBSTRING(#[User::ActiveFilePath],FINDSTRING( #[User::ActiveFilePath], "\", 7 ) + 1 ,100),".txt","") > #[User::MinIndexId] ? "True" : "False"
Here is one way you can achieve this: use an Expression Task combined with a Foreach Loop Container to match the numerical values of the file names. The example below illustrates how; it uses SSIS 2012.
This may not be very efficient but it is one way of doing this.
Let's assume there is a folder with a bunch of files named in the format YYYYMMDD. The folder contains files for the first day of every month since 1921, like 19210101, 19210201, 19210301, ... all the way up to the current month, 20121101. That adds up to 1,103 files.
Let's say the requirement is only to loop through the files that were created since June 1948. That would mean the SSIS package has to loop through only the files greater than 19480601.
On the SSIS package, create the following three parameters. It is better to configure parameters for these because the values are configurable across environments.
ExtensionToMatch - This parameter of String data type will contain the extension that the package has to loop through. It supplies the value to the FileSpec variable that will be used on the Foreach Loop container.
FolderToEnumerate - This parameter of String data type will store the folder path that contains the files to loop through.
MinIndexId - This parameter of Int32 data type will contain the minimum numerical value above which the files should match the pattern.
Create the following four variables that will help us loop through the files.
ActiveFilePath - This variable of String data type will hold the file name as the Foreach Loop container loops through each file in the folder. This variable is used in the expression of another variable. To avoid error, set it to a non-empty value, say 1.
FileCount - This dummy variable of Int32 data type will be used in this sample to illustrate the number of files that the Foreach Loop container loops through.
FileSpec - This variable of String data type will hold the file pattern to loop through. Set the expression of this variable to the value below. The expression uses the extension specified in the parameters; if no extension is given, it falls back to *.* to loop through all files.
"*" + (#[$Package::ExtensionToMatch] == "" ? ".*" : #[$Package::ExtensionToMatch])
ProcessThisFile - This variable of Boolean data type will evaluate whether a particular file matches the criteria or not.
Configure the package as shown below. Foreach loop container will loop through all the files matching the pattern specified on the FileSpec variable. An expression specified on the Expression Task will evaluate during runtime and will populate the variable ProcessThisFile. The variable will then be used on the Precedence constraint to determine whether to process the file or not.
The script task within the Foreach loop container will increment the counter of variable FileCount by 1 for each file that successfully matches the expression.
The script task outside the Foreach loop will simply display how many files were looped through by the Foreach loop container.
Configure the Foreach loop container to loop through the folder using the parameter and the files using the variable.
Store the file name in variable ActiveFilePath as the loop passes through each file.
On the Expression Task, set the expression to the following value. The expression converts the file name without the extension to a number and then checks whether it is greater than the number given in the parameter MinIndexId.
#[User::ProcessThisFile] = (DT_BOOL)((DT_I4)(REPLACE(#[User::ActiveFilePath], #[User::FileSpec] ,"")) > #[$Package::MinIndexId] ? 1: 0)
Right-click on the Precedence constraint and configure it to use the variable ProcessThisFile on the expression. This tells the package to process the file only if it matches the condition set on the expression task.
#[User::ProcessThisFile]
On the first Script Task, I have the variable User::FileCount set in the ReadWriteVariables and the following C# code within the Script Task. This increments the counter for each file that successfully matches the condition.
public void Main()
{
    Dts.Variables["User::FileCount"].Value = Convert.ToInt32(Dts.Variables["User::FileCount"].Value) + 1;
    Dts.TaskResult = (int)ScriptResults.Success;
}
On the second Script Task, I have the variable User::FileCount set in the ReadOnlyVariables and the following C# code within the Script Task. This simply outputs the total number of files that were processed.
public void Main()
{
    MessageBox.Show(String.Format("Total files looped through: {0}", Dts.Variables["User::FileCount"].Value));
    Dts.TaskResult = (int)ScriptResults.Success;
}
When the package is executed with MinIndexId set to 19480601 (exclusive), it outputs the value 773.
When the package is executed with MinIndexId set to 20111201 (exclusive), it outputs the value 11.
Hope that helps.
From investigating how the ForEach loop works in SSIS (with a view to creating my own to solve the issue), it seems that it enumerates the file collection first, before any mask is applied. It's hard to tell exactly what's going on without seeing the underlying code for the ForEach loop, but it appears to work this way, resulting in slow performance when dealing with over 100k files.
While @Siva's solution is fantastically detailed and definitely an improvement over my initial approach, it is essentially the same process, except using an Expression Task to test the filename rather than a Script Task (this does seem to offer some improvement).
So, I decided to take a totally different approach and rather than use a file-based ForEach loop, enumerate the collection myself in a Script Task, apply my filtering logic, and then iterate over the remaining results. This is what I did:
In my Script Task, I use the DirectoryInfo.EnumerateFiles method, which is the recommended approach for large file collections: it streams results lazily, so filtering logic can be applied as files are discovered rather than after the entire collection has been built.
Here's the code:
public void Main()
{
    string sourceDir = Dts.Variables["SourceDirectory"].Value.ToString();
    int minJobId = (int)Dts.Variables["MinIndexId"].Value;

    // Enumerate the file collection (using EnumerateFiles to allow us to start processing immediately)
    List<string> activeFiles = new List<string>();
    System.Threading.Tasks.Task listTask = System.Threading.Tasks.Task.Factory.StartNew(() =>
    {
        DirectoryInfo dir = new DirectoryInfo(sourceDir);
        foreach (FileInfo f in dir.EnumerateFiles("*.txt"))
        {
            FileInfo file = f;
            string filePath = file.FullName;
            string fileName = filePath.Substring(filePath.LastIndexOf("\\") + 1);
            int jobId = Convert.ToInt32(fileName.Substring(0, fileName.IndexOf(".txt")));
            if (jobId > minJobId)
                activeFiles.Add(filePath);
        }
    });

    // Wait here for completion
    System.Threading.Tasks.Task.WaitAll(new System.Threading.Tasks.Task[] { listTask });

    Dts.Variables["ActiveFilenames"].Value = activeFiles;
    Dts.TaskResult = (int)ScriptResults.Success;
}
So, I enumerate the collection, applying my logic as files are discovered and immediately adding the file path to my list for output. Once complete, I then assign this to an SSIS Object variable named ActiveFilenames which I'll use as the collection for my ForEach loop.
I configured the ForEach loop as a ForEach From Variable Enumerator, which now iterates over a much smaller collection (a post-filtered List<string>, compared to what I can only assume was an unfiltered List<FileInfo> or something similar in SSIS's built-in ForEach File Enumerator).
So the tasks inside my loop can be dedicated purely to processing the data, since it has already been filtered before hitting the loop. Although this doesn't seem much different from either my initial package or Siva's example, in production (for this particular case, anyway) filtering the collection and enumerating asynchronously provides a massive boost over using the built-in ForEach File Enumerator.
I'm going to continue investigating the ForEach loop container and see if I can replicate this logic in a custom component. If I get this working I'll post a link in the comments.
The best you can do is use FileSpec to specify a mask, as you said. You could include at least some specs in it, like files starting with "201" (e.g. a mask of 201*.txt) for 2010, 2011 and 2012. Then, in some other task, you could filter out those you don't want to process (for instance, 2010).