Generating graphs of JMeter results across several trials through Hudson - hudson

I'm in the process of integrating test scripts into a Continuous Integration system like Hudson. My goal is to benchmark each load test over time and display it in readable charts.
While there are plugins to generate graphs for a single script run, I'd like to know how each session's data, such as the figures found in the summary report, could be recorded over time.
One way would be to store the summary reports in a JTL file and graph the data from that.
I've checked out the Performance Plugin for Hudson, but I'm stuck on how to modify the plugin to display more charts with more information.

The reports from both JMeter and the Hudson plugin are snapshots (not charts over long periods of time), and that's part of the issue. I went through this same exercise a few months back and decided to go with a solution that was better suited to this problem.
I set up Logstash to pull the JMeter test results from the files it generates during every test. It outputs those results into an Elasticsearch index, which I can chart with Kibana.
I know this adds several new pieces of software to your setup, but it only took a day to get things working and the results are much better than what the Performance Plugin was able to provide.
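Logstash handles the shipping with a CSV filter and an Elasticsearch output; if you want to see the moving parts, here is a rough Ruby equivalent of what that pipeline does, assuming JMeter writes its results as a CSV JTL with the default column names and Elasticsearch listens on localhost:9200 (the index and field names are just examples):

require "csv"
require "json"
require "net/http"
require "time"
require "uri"

# Rough equivalent of the Logstash pipeline: read a JMeter JTL written as CSV
# and index each sample into Elasticsearch over HTTP.
ES_URL = URI("http://localhost:9200/jmeter-results/_doc")   # assumed index name

CSV.foreach("results.jtl", headers: true) do |row|
  doc = {
    "@timestamp"   => Time.at(row["timeStamp"].to_i / 1000).utc.iso8601,
    "label"        => row["label"],
    "elapsed_ms"   => row["elapsed"].to_i,
    "responseCode" => row["responseCode"],
    "success"      => row["success"] == "true"
  }
  Net::HTTP.post(ES_URL, doc.to_json, "Content-Type" => "application/json")
end

Once the samples are indexed, Kibana can chart elapsed_ms over @timestamp per label, across every test run rather than a single snapshot.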

Related

How to test SQL queries/reports?

We are developing a Rails app that has quite a few pages with data reports. A typical reporting page is based on a relatively big SQL query, usually involving 5–8 table joins.
The cornerstone question we've stumbled upon is how to write integration tests for report pages. A common integration test of ours looks like this:
create a bunch of records in the DB via factory_girl in the test setup,
fire up a Capybara scenario, where a user logs in, advances to the page with the report, and sees the right data in it.
As the app grows and we create more of these report pages, we've started to run into the following problem: the setup for each individual test ends up being too big, too complex, and generally hard to read and maintain.
Creating such a test significantly raises the bar for a developer delivering a reporting-related feature, as it is very time-consuming and not optimized for happiness. However, we still need to make sure our reports are correct.
Therefore, my questions are:
should we or should we not test pages with reports?
if we should test the reports, then what would be the least painful way to do that?
where are we going wrong?
1. Should we or should we not test report pages?
You should definitely test your reports page.
2. If we should test the reports, then what would be the least painful way to do that?
Given the size of the reports, you're probably going to have the following problems:
your tests will become super slow;
your tests will become really hard to read and maintain due to the huge setup;
your tests will stop being updated when reports change.
With this, you'll probably stop properly maintaining your specs.
So, first, you should differentiate between:
testing that UI shows proper results (acceptance) vs
testing that reports are generated correctly (unit and integrated).
The tests for the first scenario, the UI, which use Capybara, should test the UI and not the reports themselves. They'd cover that the report data shows up as it was generated by its respective classes, which means you don't need to test the millions of report lines, but rather that the table has the correct columns and headers, that pagination is working, and so on. You'd test that the first, second and maybe last report line show up properly.
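As a minimal sketch of such an acceptance spec (the factory name, path helper, login helper and column header are all made up for illustration):

# spec/features/sales_report_spec.rb -- hypothetical report page
require "rails_helper"

RSpec.feature "Sales report page" do
  before do
    create_list(:sale, 3)       # factory_girl factories, assumed to exist
    sign_in create(:admin_user) # assumed login helper
  end

  scenario "shows the report table with the right structure" do
    visit sales_report_path

    expect(page).to have_css("table#sales-report thead th", text: "Total")
    expect(page).to have_css("table#sales-report tbody tr", count: 3)
    expect(page).to have_link("Next")   # pagination is wired up
  end
end

The point is that the spec checks structure (headers, row count, pagination), not the correctness of every figure.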
On the other hand, the tests for the second scenario, report generation, should test that reports are generated correctly. That has nothing to do with the UI, as you could be serving those reports as JSON, HTML, Cap'n Proto or any other visualization means. As an imagination exercise, picture testing reports via JSON responses, then all over again via HTML, then all over again via some other method. It'd become evident that report generation would be repeated all over.
This means that report generation is the core and should be tested on its own, which means you should cover it mainly with unit tests. Tons of them, if you need them. Huge arrays.
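For example, a unit spec for a hypothetical report object (the class and factory names are invented) needs no browser and no HTTP at all:

# spec/reports/sales_report_spec.rb -- hypothetical report class
require "spec_helper"

RSpec.describe SalesReport do
  it "aggregates totals per month" do
    sales = [
      build_stubbed(:sale, amount: 10, created_at: Date.new(2014, 1, 5)),
      build_stubbed(:sale, amount: 15, created_at: Date.new(2014, 1, 20)),
      build_stubbed(:sale, amount: 7,  created_at: Date.new(2014, 2, 1))
    ]

    expect(SalesReport.new(sales).rows).to eq([
      { month: "2014-01", total: 25 },
      { month: "2014-02", total: 7 }
    ])
  end
end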
With this setup, you'd have blazingly fast unit tests covering your reports and their edge cases, a few integrated tests making sure report generation pieces are connected properly and a few acceptance tests covering your UI (Capybara).
Remember the Test Pyramid?
3. Where are we going wrong?
I don't have all the details about your setup, but it seems the main misconception is thinking that reports are the pages themselves. Remember that you could generate reports as CSV or XML and they'd still be the same report internally. In software, a report will probably end up being an array with values.
So, next time, think about separating concepts. You have reports generation and you have the UI. Test them separately and then add some tests in between to make sure they're both integrated well.
In the future, say you move to a Single Page JS App™ and an iOS app, you wouldn't have to get rid of your report generation tests, but the UI tests would move into the clients. That's proof that the UI is different from report generation.
I will post ideas we've had so far.
Do not do an integration test at all
Instead of writing an integration test, write a lower-level functional test by treating the DB interaction as an interaction with a third-party API.
This is how it would be:
stub the object that sends the query to the DB with a mock of the DB result we're expecting,
make any test that needs the data rely on that result mock,
execute the SQL query expecting an empty dataset (see the sketch below), while verifying that:
the SQL raised no syntax error,
the result object returned the correct columns,
the column types are what we expect them to be.
The advantages of this approach are:
the test setup is no longer a cognitive barrier,
the tests are significantly faster than they were back in the factory_girl epoch,
if something changes in the DB result (the set of column names or the column types), we still catch that by performing a "real" DB request.
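A rough sketch of both halves, with invented class, column and query names (the consumer side mocks the query result; the query side runs the real SQL against an empty dataset and only checks its shape):

# Consumer side: whatever renders the report relies on the mocked result.
RSpec.describe ReportPresenter do
  it "builds view rows from the (mocked) DB result" do
    fake_result = [{ "region" => "EU", "total" => 42 }]
    allow(SalesReportQuery).to receive(:run).and_return(fake_result)

    presenter = ReportPresenter.new(SalesReportQuery.run)
    expect(presenter.rows.first[:region]).to eq("EU")
  end
end

# Query side: real SQL, empty dataset, shape only.
RSpec.describe SalesReportQuery do
  it "runs without syntax errors and returns the expected columns" do
    result = ActiveRecord::Base.connection.exec_query(SalesReportQuery.sql)

    expect(result.rows).to be_empty                 # nothing was created on purpose
    expect(result.columns).to eq(%w[region total])  # column names are stable
    # column types can be inspected too (e.g. result.column_types), adapter permitting
  end
end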
Pre-load data fixtures
Instead of setting up a complex tree of records for each and every report test, pre-populate a whole "field" of DB records prior to the entire test suite (see the sketch below).
This is how it would be:
before the suite (or the part of it that contains tests for reports), populate the DB with a lot of test data,
test each and every report without any further setup needed.
The advantages of this approach are:
the test setup is no longer a cognitive barrier: the data is just always there for you,
the tests are much faster than they were back in the factory_girl epoch.
The disadvantage of this approach is:
occasionally, as new reports come up, someone will have to go and add more data to the "fixtures" setup, which is very likely to break existing tests. Fixing the existing tests may lead to strange / unreadable pull request changesets for the new features.
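A rough sketch of the pre-population step, assuming RSpec and factory_girl (the factory names and record volumes are placeholders):

# spec/support/report_fixtures.rb -- loaded once before the whole suite
RSpec.configure do |config|
  config.before(:suite) do
    # One shared "field" of records that every report spec can rely on.
    (2012..2014).each do |year|
      FactoryGirl.create_list(:sale, 50, created_at: Date.new(year, 6, 1))
    end
  end
end

Records created in before(:suite) are committed outside each example's transaction, so they persist for the whole run (as long as nothing truncates the tables between examples).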

add summary information to a job results page in Jenkins/Hudson

I have some jobs that deploy and run automated integration tests as part of our CI system.
These jobs are shell scripts that use ssh to deploy and then run commands on the systems to be tested. Then they gather the results in a tarball and archive it. One of the files in this tarball contains a nicely formatted summary that I would like to make visible without having to read through the console output or open a tarball.
Is there a plugin for adding text to the job results page?
Is there a plugin that will produce reports from archived job results?
Is there an entirely more elegant way of doing this?
I would look at the Summary Display Plugin.
If you can output an XML file from your build task, it will display it on the build page, using tables and other formatting.
If you can get your results file into HTML format, the HTML Publisher plugin will do the job for you.

How to periodically extract data from a CSV file?

I'm currently working on some QA projects. I am running tests (which can vary from a couple of minutes to 2-3 days) against an application that generates some CSV files and updates them periodically, adding a new row with each update (once every couple of seconds or so).
Each CSV file is structured like this:
Header1,Header2,Header3,.................,HeaderN
numerical_value11,numerical_value12,numerical_value13,......,numerical_value1N,
numerical_value21,numerical_value22,numerical_value23,......,numerical_value2N,
etc
The number of columns may vary from CSV file to CSV file.
I am running in a Windows environment. I also have Cygwin (http://www.cygwin.com/) installed.
Is there a way I can write a script that runs periodically (once per hour or so), extracts data (a single value or multiple values from a row, or the average of the values from specific rows added to the CSV between checks) and sends some email alerts if, for example, the data in one column is out of range?
Thx
This can be done in several ways. Basically, you need to:
1) Write a script, in maybe Perl or Python, that does one iteration of what you want it to do (a sketch follows below).
2) Use the Windows Task Scheduler to run this script at the frequency you want. The Task Scheduler is very easy to set up from the Control Panel.
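As a sketch of step 1, here is one iteration written in Ruby rather than Perl or Python (all three are available under Cygwin); the path, column name, threshold and mail settings are placeholders:

require "csv"
require "net/smtp"

CSV_PATH  = "C:/results/run1.csv"   # placeholder path
COLUMN    = "Header2"               # placeholder column to watch
MAX_VALUE = 100.0                   # placeholder threshold

# Average the watched column over the whole file; to look only at rows added
# since the last check, remember the previous row count between runs.
values  = CSV.read(CSV_PATH, headers: true).map { |row| row[COLUMN].to_f }
average = values.sum / values.size

if average > MAX_VALUE
  message = <<~MAIL
    From: qa-monitor@example.com
    To: you@example.com
    Subject: #{COLUMN} out of range

    Average #{COLUMN} is #{average.round(2)} (limit #{MAX_VALUE}).
  MAIL
  Net::SMTP.start("localhost", 25) do |smtp|
    smtp.send_message(message, "qa-monitor@example.com", "you@example.com")
  end
end

Schedule it with the Windows Task Scheduler (step 2) or a Cygwin cron job.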
Using Windows' scheduling, you can very easily get the interval part down; for the parsing and alerting part of the program, however, you have a few options. I myself would use C# to make the program. If you want an actual script, however, VBA is a viable choice and could very easily parse a basic CSV file and contact the web to send an email. If you have Office already installed, this should give you some more detail. Hope that helps.

Hudson as passive server

Is it possible to use Hudson only as a passive server, i.e. not using it for building, but instead sending it build results generated by some other tool (maybe in XML format) and using Hudson only to display the results?
It's very doable.
If it's running on the same machine, such as a cron job, check out http://wiki.hudson-ci.org/display/HUDSON/Monitoring+external+jobs.
If you need to pull data from somewhere else, use a shell script as a build target, and do whatever you need to stage the data locally (scp, etc.).
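If you go the external-job route, the wiki page above describes posting a small XML document with the result to Hudson; a rough Ruby sketch of that idea follows (the host and job name are examples, and you should double-check the exact endpoint and element names on the wiki):

require "net/http"
require "uri"

# The external tool reports its own result to a pre-created "external job"
# in Hudson: result 0 means success, duration is in milliseconds.
log = File.read("build.log")
xml = <<~XML
  <run>
    <log encoding="hexBinary">#{log.unpack1("H*")}</log>
    <result>0</result>
    <duration>1000</duration>
  </run>
XML

uri = URI("http://hudson.example.com/job/my-external-job/postBuildResult")
Net::HTTP.post(uri, xml, "Content-Type" => "application/xml")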
It may very well be possible using periodic builds and the URL SCM plug-in to pull in the XML files, and the Plot Plug-in for display, but more information is required before a more detailed answer can be provided.
What build tool are you currently using to generate build results?
A couple of my Hudson jobs are just summaries and display information. The 'jobs' need to run for data to be collected and saved. The runs could be based on dependent jobs or just scheduled nightly. Some examples:
One of our jobs just merges together the .SER files from Cobertura and generates the Cobertura reports for overall code coverage from all of our unit, integration and different types of system tests (a hint for others doing the same: Cobertura has little logic for unsynchronized SER files, and using them will yield some odd results; there are some tweaks that can be done to the merge code that reduce the problem).
Some of our builds write data to a database. We have a once a week task that pulls the data from the database and creates an HTML file with trend charts. The results are kept as part of the job.
It sounds to me like what you're describing is a plugin for Hudson. For example, the CCCC plugin:
http://wiki.hudson-ci.org/display/HUDSON/CCCC+Plugin
It takes the output, in XML form, from the CCCC analyzer app and displays it in pretty ways in the Hudson interface.
Taking the same concept, you could write a plugin that works with the XML output from whatever build tool you have in mind and display it in Hudson.

Data sync solution?

Due to some security issues, I'm in an environment where third-party apps can't access my DB. For this reason I need some service/tool/script (dunno what yet... I'm open to the best option, still reading to see what I'm going to do...)
which enables me to generate, on a regular basis (daily, weekly, monthly), a CSV file with all new/modified records for a certain application.
I should be able to automate this process and also export a new file at any time.
So it should keep track, for each application, of which records it still needs.
Each application will need the data in some format (CSV/XLS/SQL), and some fields will be needed by some applications and not by others... It should be fairly flexible...
What is the best option for me? Creating some custom tables for each application, and extracting the modified data based on those?
I think your best bet here, assuming you have access to the server to let you set this up, is to make a small command-line program that can do the relatively simple task you need. Languages like Perl are good for this sort of thing, I believe.
Once you have that 'tool' made, you can schedule it through the server's OS to run every set amount of time: either a scheduled task on a Windows server or a cron job on a Linux server.
You can also (without having to set up the scheduled task, if you don't want to or can't) enable this small command-line application to be called via CGI; this is a way of letting applications on the server be executed on demand by a web user. If you do enable this, though, I suggest you add some sort of locking system so that it can only be run every so often and can't be run five times at once.
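For instance, a rough sketch of such a command-line export in Ruby (the table, columns, DB driver and paths are placeholders; it remembers a timestamp watermark so each run only exports new/modified rows):

require "csv"
require "time"
require "sqlite3"   # placeholder; swap in your actual DB driver

DB_PATH    = "app.db"
EXPORT_DIR = "exports"
WATERMARK  = File.join(EXPORT_DIR, ".last_export")

last_run = File.exist?(WATERMARK) ? Time.parse(File.read(WATERMARK)) : Time.at(0)
now      = Time.now

db   = SQLite3::Database.new(DB_PATH)
rows = db.execute(
  "SELECT id, name, amount, updated_at FROM orders WHERE updated_at > ?",
  [last_run.iso8601]
)

CSV.open(File.join(EXPORT_DIR, "orders_#{now.strftime('%Y%m%d%H%M')}.csv"), "w") do |csv|
  csv << %w[id name amount updated_at]
  rows.each { |row| csv << row }
end

File.write(WATERMARK, now.iso8601)   # the next run picks up from here

Scheduling it via cron or the Task Scheduler covers the "regular basis" part, and running it by hand (or via the CGI wrapper above) covers the on-demand export.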
EDIT
You might also want to just look into database replication or adding read-only users. This saves a whole lot of arsing around. Try to find a solution that does not split or duplicate data. You can set up users to only be able to access certain parts of the database system in certain ways, such as SELECTing data.