Stress test API using multiple JSON files

I am trying to fire 40000 requests towards an API using 40000 different JSON files.
Normally I could do something like this:
for file in /dir/*.json
do
#ab -p $file -T application/json -c1 -n1 <url>
curl -X POST -d @"$file" <url> -H "Content-Type: application/json"
done;
My problem is that I want to run simultaneous requests, e.g. 100 at a time, and I want the total time it took to send all requests recorded. I can't use -c 100 -n 40000 in ab since it's the same URL with different files.
The files/requests all look something like
{"source":"000000000000","type":"A"}
{"source":"000000000001","type":"A"}
{"source":"000000000003","type":"A"}
I was not able to find any tool that supports this out of the box (e.g. Apache Benchmark - ab).
I came across this example here on SO (modded for this question).
Not sure I understand why that example would "cat /tmp" when mkfifo tmp is a file and not a dir though. Might work?
mkfifo tmp
counter=0
for file in /dir/*.json
do
if [ $counter -lt 100 ]; then
curl -X POST -H "Content-Type: application/json" -d @"$file" <url> &
let $[counter++];
else
read x < tmp
curl -X POST -H "Content-Type: application/json" -d @"$file" <url> &
fi
done;
cat /tmp > /dev/null
rm tmp
How should I go about achieving this in perl, ksh, bash or similar, or does anyone know of any tools that support this out of the box?
Thanks!

If your goal is just to measure the total time taken to send these 40000 curl requests, each with a different JSON file, you can make good use of GNU parallel. The tool runs jobs concurrently and can make use of multiple cores on your machine.
The installation procedure is quite simple. Follow How to install GNU parallel (noarc.rpm) on CentOS 7 for a quick and easy list of steps. The tool has many more flags that cover other use cases. For your requirement, though, just go to the folder containing these JSON files and run
parallel --dry-run -j10 curl -X POST -H "Content-Type: application/json" -d @{} <url> ::: *.json
The above command does a dry run: it shows how parallel sets up the flags, processes its arguments and would start your command, without actually running it. Here {} stands for each JSON file. We've specified 10 jobs at a time; increase the number depending on how fast it runs on your machine and how many cores the machine has. There are also flags to limit the overall CPU that parallel is allowed to use, so that it doesn't totally choke your system.
Remove --dry-run to run your actual command. To clock the time taken for the whole run, prefix the command with time, as in time parallel ...
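If GNU parallel is not available, a similar bounded-concurrency run can be sketched with xargs -P (a swapped-in technique, not the parallel command from this answer; -P is supported by GNU and BSD xargs but is not POSIX). The function name post_all and the DRY_RUN switch are hypothetical helpers introduced for illustration.

```shell
# post_all DIR URL [JOBS]: POST every *.json file under DIR to URL,
# running at most JOBS curl processes at a time (default 10).
# DRY_RUN=1 (hypothetical switch) prints each curl command instead of running it.
post_all() {
  dir=$1
  url=$2
  jobs=${3:-10}
  if [ "${DRY_RUN:-0}" = 1 ]; then runner=echo; else runner=; fi
  # -print0 / -0 keep file names with spaces intact; -I{} runs one command per file.
  find "$dir" -name '*.json' -print0 |
    xargs -0 -P "$jobs" -I{} $runner curl -s -o /dev/null -X POST \
      -H 'Content-Type: application/json' -d @{} "$url"
}
```

In bash, the total wall-clock time could then be measured with something like time post_all /dir <url> 100, after checking the generated commands once with DRY_RUN=1.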

Related

Jenkins Build Time Trend API does not yield output using curl API

I got this link to get the Build Time Trend along with other Data in jenkins
https://jenkins:8080/view/<view-name>/job/<job-name>/<buildnumber>/api/json
This works well in a web browser but does not seem to work with curl: I get no result when I run it with the curl command.
This is what I tried
curl -u user:api_token -s -k "https://jenkins:8080/view/<view-name>/job/<job-name>/<buildnumber>/api/json"
This syntax worked with other APIs.
Not sure what is wrong here.
curl -u userid:api_token -s -k "https://jenkins:8080/view/<view-name>/job/<job-name>/<buildnumber>/api/json" | jq.'causes[]|{result}'
jq.causes[]|{result}: command not found
You need a space between jq and its filter, not a period. In jq, the period belongs inside the quotes, at the start of the path.
... | jq '.causes[]|{result}'
        ^
        space here

Docker API can’t apply json filters

According to https://docs.docker.com/engine/reference/api/docker_remote_api_v1.24/#/list-tasks, filters can be used to get only the running containers with a particular service name. For some reason, I am getting a full list of all tasks regardless of their names or desired states. I can't find any proper examples of using curl with JSON requests against the Docker API.
I'm using the following command:
A)
curl -X GET -H "Content-Type: application/json" -d '{"filters":[{ "service":"demo", "desired-state":"running" }]}' https://HOSTNAME:2376/tasks --cert ~/.docker/cert.pem --key ~/.docker/key.pem --cacert ~/.docker/ca.pem
Returns everything
B)
trying to get something working from Docker Remote API Filter Exited
curl https://HOSTNAME:2376/containers/json?all=1&filters={%22status%22:[%22exited%22]} --cert ~/.docker/cert.pem --key ~/.docker/key.pem --cacert ~/.docker/ca.pem
This one returns "curl: (60) Peer's Certificate issuer is not recognized.", so I guess that curl request is malformed.
I have asked on the Docker forums and they helped a little. I'm amazed that there is no proper documentation anywhere on the internet on how to use the Docker API with curl. Or is it so obvious that I am just missing something?
I should prefix this with the fact that I have never seen curl erroneously report a certificate error when in fact there was some sort of other issue in play, but I will trust your assertion that this is in fact not a certificate problem.
I thought at first that your argument to filters was incorrect, because according to the API reference, the filters parameter is...
a JSON encoded value of the filters (a map[string][]string) to process on the containers list.
I wasn't exactly sure how to interpret map[string][]string, so I set up a logging proxy between my Docker client and server and ran docker ps -f status=exited, which produced the following request:
GET /v1.24/containers/json?filters=%7B%22status%22%3A%7B%22exited%22%3Atrue%7D%7D HTTP/1.1\r
If we decode the argument to filters, we see that it is:
{"status":{"exited":true}}
Whereas you are passing:
{"status":["exited"]}
So that's different, obviously, and I was assuming that was the source of the problem...but when trying to verify that, I ran into a curious problem. I can't even run your curl command line as written, because curl tries to perform some globbing behavior due to the braces:
$ curl http://localhost:2376/containers/json'?filters={%22status%22:[%22exited%22]}'
curl: (3) [globbing] nested brace in column 67
If I correctly quote your arguments to filter:
$ python3 -c 'import urllib.parse; print(urllib.parse.quote("""{"status":["exited"]}"""))'
%7B%22status%22%3A%5B%22exited%22%5D%7D
It seems to work just fine:
$ curl http://localhost:2376/containers/json'?filters=%7B%22status%22%3A%5B%22exited%22%5D%7D'
[{"Id":...
I can get the same behavior if I use your original expression and pass -g (aka --globoff) to disable the brace expansion:
$ curl -g http://localhost:2376/containers/json'?filters={%22status%22:[%22exited%22]}'
[{"Id":...
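As a minimal sketch of the same encoding step in plain shell, the filter can be percent-encoded before it is appended to the URL. The function name urlencode is hypothetical, and the sketch assumes ASCII-only input.

```shell
# urlencode STRING: percent-encode everything except RFC 3986 unreserved characters.
# Assumption: STRING is ASCII; multi-byte characters are not handled correctly.
urlencode() {
  s=$1
  out=
  while [ -n "$s" ]; do
    c=${s%"${s#?}"}   # first character of the remaining string
    s=${s#?}          # drop that character
    case $c in
      [A-Za-z0-9._~-]) out=$out$c ;;
      *) out=$out$(printf '%%%02X' "'$c") ;;   # "'c" makes printf emit the character code
    esac
  done
  printf '%s\n' "$out"
}
```

It could be used as curl "http://localhost:2376/containers/json?filters=$(urlencode '{"status":["exited"]}')". curl can also do the encoding itself with -G together with --data-urlencode 'filters={"status":["exited"]}', which may be simpler when no separate encoding step is needed.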
One thing I would like to emphasize is the utility of sticking a proxy between the docker client and server. If you ever find yourself asking, "how do I use this API?", an excellent answer is to see exactly what the Docker client is doing in the same situation.
You can create a logging proxy using socat. Here is an example.
docker run -v /var/run/docker.sock:/var/run/docker.sock -p 127.0.0.1:1234:1234 bobrik/socat -v TCP-LISTEN:1234,fork UNIX-CONNECT:/var/run/docker.sock
Then run a command like so in another window.
docker -H localhost:1234 run --rm -p 2222:2222 hello-world
This example uses docker on ubuntu.
A docker REST proxy can be simple like this:
https://github.com/laoshanxi/app-mesh/blob/main/src/sdk/docker/docker-rest.go
Then you can curl like this:
curl -g http://127.0.0.1:6058/containers/json'?filters={%22name%22:[%22jenkins%22]}'

Error while trying to run a MapReduce job on FIWARE-Cosmos using Tidoop REST API

I am following this guide on GitHub and I am not able to run the example MapReduce job mentioned in Step 5.
I am aware that this file no longer exists:
/usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar
And I am aware that the same file can now be found here:
/usr/lib/hadoop-0.20/hadoop-examples-0.20.2-cdh3u6.jar
So I form my call as below:
curl -v -X POST "http://computing.cosmos.lab.fiware.org:12000/tidoop/v1/user/$user/jobs" -d '{"jar":"/usr/lib/hadoop-0.20/hadoop-examples-0.20.2-cdh3u6.jar","class_name":"WordCount","lib_jars":"/usr/lib/hadoop-0.20/hadoop-examples-0.20.2-cdh3u6.jar","input":"testdir","output":"testoutput"}' -H "Content-Type: application/json" -H "X-Auth-Token: $TOKEN"
The input directory exists in my HDFS user space and there is a file called testdata.txt inside it. The testoutput folder does not exist in my HDFS user space, since I know an existing output folder creates problems.
When I execute this curl command, the error I get is {"success":"false","error":1} which is not very descriptive. Is there something I am missing here?
This has just been tested with my user frb and a valid token for that user:
$ curl -X POST "http://computing.cosmos.lab.fiware.org:12000/tidoop/v1/user/frb/jobs" -d '{"jar":"/usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar","class_name":"wordcount","lib_jars":"/usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar","input":"testdir","output":"outputtest"}' -H "Content-Type: application/json" -H "X-Auth-Token: xxxxxxxxxxxxxxxxxxx"
{"success":"true","job_id": "job_1460639183882_0011"}
Please note that the fat jar with the MapReduce examples in the "new" cluster (computing.cosmos.lab.fiware.org) is at /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar, as detailed in the documentation. /usr/lib/hadoop-0.20/hadoop-examples-0.20.2-cdh3u6.jar was the fat jar in the "old" cluster (cosmos.lab.fiware.org).
EDIT 1
In the end, it turned out the user had no account in the "new" pair of clusters of Cosmos in FIWARE LAB (storage.cosmos.lab.fiware.org and computing.cosmos.lab.fiware.org), where Tidoop runs, but only in the "old" cluster (cosmos.lab.fiware.org). Thus, the issue was fixed by simply provisioning an account in the "new" ones.

Converting a bash command output into JSON and serving it over http on the fly

I want to convert the output of ifstat command into JSON and serve it over http on the fly to be used for a javascript graph app. Are there any lightweight -- sed or awk -- command-line solutions which I can use? I do not want to store JSON output on the disk and it would be good if the web-server was a small lightweight command line tool into which I can pipe JSON output.
EDIT 1:
This is the live streaming chart library which will use the data. I'm not keen on a specific web server; any webserver that does the job would be fine.
This is what I have tried.
Terminal #1
ifstat -n | awk 'NR>2{print systime(),$0; fflush()}' | tee ifstat.log
Terminal #2
while :
do
{
echo -e "HTTP/1.1 200 OK"
echo -e "Content-Type: application/json\n"
tail -n1 ifstat.log | awk '{ printf("{\"time\":%s, \"in\":%s, \"out\":%s}\n", $1, $2, $3) }'
} | nc -l 8000
done
firefox
open: http://localhost:8000
{"time":1332052321, "in":1.24, "out":2.62}
I know little about JSON, so check that the output is valid and rewrite the awk command if it is not.
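The two steps above (formatting a sample as JSON, then wrapping it in an HTTP response) can be sketched as a pair of small shell functions. The names json_line and http_response are hypothetical. The sketch assumes each input line has the form "epoch in out" as written to ifstat.log, and that the values are numeric (a non-numeric field such as n/a would produce invalid JSON).

```shell
# json_line: read "epoch in out" lines on stdin, print one JSON object per line.
json_line() {
  awk '{ printf("{\"time\":%s, \"in\":%s, \"out\":%s}\n", $1, $2, $3) }'
}

# http_response BODY: print a minimal HTTP/1.1 response carrying BODY.
# Header lines end in CRLF and an empty line separates headers from the body.
http_response() {
  body=$1
  printf 'HTTP/1.1 200 OK\r\n'
  printf 'Content-Type: application/json\r\n'
  printf 'Content-Length: %s\r\n' "$(printf '%s\n' "$body" | wc -c | tr -d ' ')"
  printf 'Connection: close\r\n\r\n'
  printf '%s\n' "$body"
}
```

Inside the while loop, the braces block could then become http_response "$(tail -n1 ifstat.log | json_line)" | nc -l 8000, keeping the nc invocation from the attempt above.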

How To Capture network packets to MySQL

I'm going to design a network analyzer for WiFi (802.11).
Currently I use tshark to capture and parse the WiFi frames and then pipe the output to a Perl script that stores the parsed information in a MySQL database.
I just found out that I miss a lot of frames in this process. I checked, and the frames seem to be lost in the pipe (when the output is delivered to Perl to get stored in MySQL).
Here is how it goes
(Tshark) -------frames are lost----> (Perl) --------> (MySQL)
This is how I pipe the output of tshark to the script:
sudo tshark -i mon0 -t ad -T fields -e frame.time -e frame.len -e frame.cap_len -e radiotap.length | perl tshark-sql-capture.pl
This is a simple template of the Perl script I use (tshark-sql-capture.pl):
# preparing the MySQL
my $dns = "DBI:mysql:capture;localhost";
my $dbh = DBI->connect($dns,user,pass);
my $db = "captured";
while (<STDIN>) {
chomp($data = <STDIN>);
($time, $frame_len, $cap_len, $radiotap_len) = split " ", $data;
my $sth = $dbh-> prepare("INSERT INTO $db VALUES (str_to_date('$time','%M %d, %Y %H:%i:%s.%f'), '$frame_len', '$cap_len', '$radiotap_len'\n)" );
$sth->execute;
}
#Terminate MySQL
$dbh->disconnect;
Any idea that helps improve the performance is appreciated. Or maybe there is an alternative mechanism that does better.
Right now my performance is 50%, meaning I can store in MySQL around half of the packets I've captured.
Data written to a pipe doesn't get lost. What's probably going on is that tshark tries to write to the pipe, but perl+mysql is too slow to process the input, so the pipe is full. The write would block, so tshark just drops the packets.
Bottleneck could be either MySQL or Perl itself but probably the DB. Check CPU usage, measure insert rate. Then pick a faster DB or write to multiple DBs. You can also try batch inserts and increasing the size of the pipe buffer.
Update
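while (<STDIN>)
This reads a line into $_, which you then ignore; the chomp($data = <STDIN>) inside the loop reads the next line instead. Every other line is therefore discarded, which matches the 50% loss you are seeing.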
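The same read-twice pattern can be reproduced in shell, which makes the effect easy to see. This is only an analogy to the Perl loop, not the script from the question; the function names read_twice and read_once are hypothetical.

```shell
# Buggy pattern: the loop condition consumes one line, then the body reads another,
# so only every second input line is ever used.
read_twice() {
  while read -r _ignored; do
    read -r data || break
    printf '%s\n' "$data"
  done
}

# Fixed pattern: use the line that the loop condition already read.
read_once() {
  while read -r data; do
    printf '%s\n' "$data"
  done
}
```

Feeding six lines to read_twice yields three lines of output, while read_once yields all six. In the Perl script the corresponding fix would be to work on $_ (or assign the line in the while condition) instead of reading from STDIN again inside the loop.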
For pipe problems, you can improve packet capture with GULP http://staff.washington.edu/corey/gulp/
From the Man pages:
1) reduce packet loss of a tcpdump packet capture:
(gulp -c works in any pipeline as it does no data interpretation)
tcpdump -i eth1 -w - ... | gulp -c > pcapfile
or if you have more than 2 CPUs, run tcpdump and gulp on different CPUs
taskset -c 2 tcpdump -i eth1 -w - ... | gulp -c > pcapfile
(gulp uses CPUs #0,1 so use #2 for tcpdump to reduce interference)
You can also use a FIFO file, then read the packets from it and insert them into MySQL using INSERT DELAYED.
sudo tshark -i mon0 -t ad -T fields -e frame.time -e frame.len -e frame.cap_len -e radiotap.length > MYFIFO
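The FIFO has to exist before tshark can write to it. As a minimal sketch of the mechanics, with printf standing in for tshark and a read loop standing in for the database loader (the function name fifo_demo and the sample lines are hypothetical):

```shell
# fifo_demo: create a named pipe, write three sample lines into it from a
# background writer, and consume them line by line in the foreground reader.
fifo_demo() {
  dir=$(mktemp -d) || return 1
  fifo=$dir/MYFIFO
  mkfifo "$fifo"
  # Writer: stands in for "tshark ... > MYFIFO".
  printf '%s\n' 'frame 1' 'frame 2' 'frame 3' > "$fifo" &
  # Reader: stands in for the script that inserts rows into MySQL.
  while read -r line; do
    printf 'got: %s\n' "$line"
  done < "$fifo"
  wait
  rm -rf "$dir"
}
```

Note that a FIFO is still a pipe, so a reader that is slower than the writer will fill it just as it fills an anonymous pipe; the FIFO mainly decouples how the two processes are started.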