What would be the approach to test the below scenarios in Spring Batch jobs:
1) An independent tasklet running in a step of a Spring Batch job.
2) A job consisting of several steps, where some steps have an ItemReader, ItemProcessor, and ItemWriter while other steps have tasklets.
3) An ItemReader whose input query is a very complex SQL query joining many tables - this scenario does not allow emptying all of the tables used and populating them with dummy data for testing. In other words, there are real-world scenarios in which it is not possible to empty the tables for testing. What should the approach be in that case?
If I can get a little help on this forum about the approach to take, I would like to prepare an example suite containing examples of testing Spring Batch jobs with different job scenarios.
It's very hard to 'unit' test a Spring Batch job, purely because there is so much interaction with external resources such as databases, files, etc. The approach I take has two levels:
First, the individual methods in the tasklets are unit tested as normal: I just call the methods directly from a JUnit test, with the usual mocks or stubs.
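As a minimal sketch of what such a test can look like (the tasklet, DAO, and method names here are made up for illustration), assuming JUnit 4 and Mockito:

import static org.junit.Assert.assertEquals;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.when;

import org.junit.Test;

public class ArchiveTaskletTest {

    @Test
    public void movesProcessedRecordsToArchive() {
        // hypothetical collaborator, mocked as usual
        RecordDao dao = mock(RecordDao.class);
        when(dao.archiveProcessedRecords()).thenReturn(42);

        // hypothetical tasklet; its method is called directly, no Spring context needed
        ArchiveTasklet tasklet = new ArchiveTasklet(dao);
        assertEquals(42, tasklet.archive());
    }
}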
Secondly, I run the batch from a JUnit test by calling the main method directly, with all of the necessary infrastructure in place. For my batch jobs, I need a database and some input files. My JUnit test copies all of the files needed for that test into a temp directory (I use Maven, so usually target/it), along with a database (I keep copies of my MySQL database saved in HSQLDB format). I then call Main.main() directly with the correct parameters, let the batch run, and then check the results: that the correct files have been generated, that the database has been changed correctly, and so on.
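As a rough sketch of that kind of test (the Main entry point, its arguments, and the directory layout are assumptions specific to my setup; commons-io is used for the file copying):

import static org.junit.Assert.assertTrue;

import java.io.File;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.commons.io.FileUtils;
import org.junit.Before;
import org.junit.Test;

public class NightlyExtractBatchIT {

    private final File workDir = new File("target/it/nightly-extract");

    @Before
    public void setUp() throws Exception {
        // copy the input files and the HSQLDB copy of the database into a temp directory
        FileUtils.deleteDirectory(workDir);
        FileUtils.copyDirectory(new File("src/test/resources/nightly-extract"), workDir);
    }

    @Test
    public void producesExpectedOutputFile() throws Exception {
        // run the batch exactly as production would, then check its side effects
        Main.main(new String[] { "--input", workDir + "/input.csv",
                "--jdbcUrl", "jdbc:hsqldb:file:" + workDir + "/db/batchdb" });

        assertTrue(Files.exists(Paths.get(workDir.getPath(), "output", "extract.csv")));
    }
}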
This has a couple of advantages as a way of working.
I can run these tests from Eclipse, which shortens the debugging cycle because I don't need to build the full package each time to test it.
They are easy to incorporate into the build; I just include them in the Maven Failsafe plugin.
With a couple of warnings:
You'll need to run the main method within a SecurityManager, because if your main calls System.exit() you don't want the JVM to stop. For an example of such a SecurityManager, see org.junit.tests.running.core.MainRunner. It is called like this:
Integer exitValue = new MainRunner().runWithCheckForSystemExit(new Runnable() {
    public void run() {
        Main.main(new String[] {});
    }
});
Using the above, you can assert that the batch calls System.exit() with the correct value on failure, which allows testing of failure conditions as well.
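For example, to check that the batch fails with the status you expect when an input file is missing, you can assert on the returned value. A small sketch, reusing MainRunner from above; the argument and the exit code 1 are just assumptions about what your batch defines:

Integer exitValue = new MainRunner().runWithCheckForSystemExit(new Runnable() {
    public void run() {
        // hypothetical bad input that should make the batch fail
        Main.main(new String[] { "missing-file.csv" });
    }
});
assertEquals(Integer.valueOf(1), exitValue); // 1 = assumed failure exit code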
Secondly, you can do a lot of setup from a JUnit test, but you can't do it all. So if, for instance, you require an FTP server, I usually start it from Maven rather than from JUnit, before the Failsafe plugin runs. This can be a little fiddly sometimes.
Related
In order to test my Node.js microservice architecture, I am trying to build the entire architecture with Docker. Now I want to run tests with Newman (Postman). In the before-each hook, so before every HTTP test request, the database(s) should have a predefined dataset.
So now to the core question: is there a simple way to reset the entire database, so that the architecture stays up (it does anyway) but the data in the database gets reset to a predefined state? (Maybe via an SQL statement?)
I read about ROLLBACK, but I think this is not going to work because the ROLLBACK would have to happen from another service within my architecture. Also, there is not just one MySQL request happening, but multiple MySQL requests during one HTTP test request.
Regards
I am working on a project that uses AWS, and we are putting our data in a MySQL database. The project is long-running and mostly deals with the database. Our database structure may change now and then, but the data in the database changes rather frequently (about once every 15 days). When the data changes, our system integration tests fail, because we run them against the real database. Hence we came up with two ideas. One was to create mock data, put it in our database, and test with that (most of the time it will work), but it is a tedious task because the database is rather complicated. The other was to create a new database and copy the current data into it (it will work because right now all test cases pass). This would save us the time of creating mock data. We would run our test cases against this environment in dev and acceptance and do a CF bind in production, so it amounts to one additional line for binding and unbinding. We are using blue-green deployment, so downtime is not a problem.
Can someone suggest which is a better approach?
One separate database for testing (I have never seen this)?
Or creating mocked data?
Assuming your database schema is not overly complex, I suggest you use an in-memory database like H2 or HSQLDB. You can load all necessary data sets as part of the setup, then run the tests for all conditions.
At the end of the tests, the H2 instance is wiped out, and your actual database is left unaffected.
One caveat: with an in-memory database you will want to rationalize the amount of test data you load during setup. You can have multiple test suites, each with its own set of test data (a subset of the overall).
You can control such behaviors using active profiles.
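As a sketch of what that can look like with Spring (the class, script, and profile names are purely illustrative):

import javax.sql.DataSource;

import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.context.annotation.Profile;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseBuilder;
import org.springframework.jdbc.datasource.embedded.EmbeddedDatabaseType;

@Configuration
@Profile("test")
public class TestDataSourceConfig {

    @Bean
    public DataSource dataSource() {
        // embedded H2 database, rebuilt for every test run and discarded afterwards
        return new EmbeddedDatabaseBuilder()
                .setType(EmbeddedDatabaseType.H2)
                .addScript("classpath:schema.sql")          // schema for this suite
                .addScript("classpath:testdata/orders.sql") // subset of the test data
                .build();
    }
}

A test class would then activate it with @ActiveProfiles("test"), while the real MySQL configuration stays behind a different profile.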
We have an application running on Symfony 2.8 with a package named "liip/functional-test-bundle". We plan on using PHPUnit to run functional tests on our application, which uses MySQL for its database.
The 'functional test bundle' package allows us to use the entities as a schema builder for an in-memory SQLite database, which is very handy because:
It requires zero configuration to run
It's extremely fast to run tests
Our tests can be run independently of each other and of the development data
Unfortunately, some of our entities use 'enums', which are not supported by SQLite, and our technical lead has opted to keep the existing enums while refraining from using them going forward.
Ideally we need this in the project sooner rather than later, so the team can start writing new tests in the future to help maintain the stability of the application.
I have 3 options at this point, but I need help choosing the correct one and performing it correctly:
Convince the technical lead that enums are a bad idea and that lookup tables could be used instead (which may cost time when the workload is already high)
Switch to using MySQL for the testing database. (This will require additional configuration for our tests to run, and may be slower)
Have Doctrine detect when enums are used on a SQLite driver and switch them out for strings. (I have no idea how to do this, but it is, in my opinion, the ideal solution)
Which action is the best, and how should I carry it out?
We have a suite of applications developed in C# and C++ and using SQL Server as the back end. Integration tests are developed with NUnit, and they take more than two minutes to run. To speed up integration tests, we are using the following:
Tests run on the same workstation, so no network delays
Test databases are created on DataRam RAM Disk, which is fast
Test fixtures run in parallel, currently up to four at a time
Most test data is bulk loaded using table-valued parameters.
What else can be done to speed up automated integration tests?
I know this question is very, very old but I'll post my answer anyway.
It may sound stupid, but: write fewer integration tests and more unit tests. Integration-test only at your application's boundaries (as in "when you pass control to code you do not own").
My opinion on this is inspired by J.B. Rainsberger. If you want, you can listen to a talk he gave on this topic; he is far better at explaining it than I am. Here is a link to the video:
http://vimeo.com/80533536
I do not like this answer; "write fewer integration tests" is wrong in our case. Our application is data heavy, and most of our code is logic around the data, so without integration tests we would have just trivial unit tests (which I think should still be written).
Our integration tests run for an hour. Thousands of tests. They have brought us tremendous value.
I think you should analyse the slow tests and understand why they are slow. Check whether multiple tests can reuse data without dropping and recreating it from scratch.
Divide tests into areas so you do not always need to run every test.
Use an existing database snapshot instead of recreating the database.
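To illustrate the snapshot idea on SQL Server, here is a minimal sketch in Java/JDBC (the connection URL, database name, snapshot name, logical file name, and file path are all assumptions; the same T-SQL can be run from any client):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class DatabaseSnapshotReset {

    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection(
                "jdbc:sqlserver://localhost;databaseName=master;user=sa;password=secret");
             Statement stmt = conn.createStatement()) {

            // one-off: create a snapshot of the freshly loaded test database
            stmt.execute("CREATE DATABASE TestDb_Snapshot ON "
                    + "(NAME = TestDb, FILENAME = 'C:\\snapshots\\TestDb.ss') "
                    + "AS SNAPSHOT OF TestDb");

            // between test runs: revert to the snapshot instead of rebuilding from scratch
            // (other connections to TestDb must be closed before the restore)
            stmt.execute("RESTORE DATABASE TestDb FROM DATABASE_SNAPSHOT = 'TestDb_Snapshot'");
        }
    }
}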
I use Hudson to automate the testing of a very large, important product. I want my testing hosts to be able to run as many concurrent builds as they can theoretically support, with the exception of Excel tests, of which only one may run per machine at any time. Any number of non-Excel tests can run concurrently, but at most one Excel test at a time may run on each machine.
Background:
Most of my tests are normal unit tests - the sort of thing that I can easily run in parallel. Unfortunately, a substantial and time-consuming part of my unit-testing plan consists of tests that have been implemented in Excel.
You might think it crazy to implement a test in Excel - actually there's an important reason: most of our users access our system via Excel. Excel has its own quirky ways of handling data, so the only way to guarantee that our stuff works for Excel users is to literally regression-test our application through Excel.
I've written a test-runner tool which allows me to easily fire off a group of Excel tests: each test is a single .xls file, and each group is a folder full of Excel files. I've got about 30 groups which need to be run for an end-to-end test. My tool converts the result of each of the tests into JUnit-style XML which Hudson is able to understand. The tests use the pywin32 (win32com) library to automate Excel. When run on their own they are reliable.
I've got a group of computers which are dedicated to running tests. Each machine is quad-core and can theoretically run quite a lot of stuff at once. Unfortunately, I've found that COM cannot be used to safely control more than one Excel instance per machine at a time.
That is to say, if a second build starts that tries to talk to Excel via COM, it might interfere with the one which is already running and cause both tests to fail.
I can run as many other non-Excel processes as the machine will allow, but I need to find a way to ensure that Hudson does not attempt to launch more than one Excel-requiring process on any one machine concurrently.
Sounds like the Locks and Latches plugin might help you.
http://hudson.gotdns.com/wiki/display/HUDSON/Locks+and+Latches+plugin
Isn't Hudson Java?
Since you've tagged this post python, I'll point out that Buildbot has slave locks to limit individual steps on individual slaves (or you can use them as more coarse-grained locks if you'd like).