H2 database with unitils - junit

I'm using the Unitils extension to initialize my database with one of my data sets.
With MySQL, HSQLDB and PostgreSQL I didn't have any problems, but with H2 I ran into one.
Has anybody already tested H2 with Unitils? If so, could you please share the contents of your unitils.properties?
Below is my configuration:
database.driverClassName=org.h2.Driver
database.url=jdbc:h2:tcp://localhost/~/test
database.dialect=h2
database.userName=sa
database.password=
database.schemaNames=public
DatabaseModule.Transactional.value.default=disabled
updateDataBaseSchema.enabled=true
dbMaintainer.disableConstraints.enabled=false
dbMaintainer.fromScratch.enabled=true
org.unitils.dbmaintainer.script.ScriptSource.implClassName= org.unitils.dbmaintainer.script.impl.DefaultScriptSource
dbMaintainer.script.locations=./target/test-classes/database/scripts
dataSetStructureGenerator.xsd.dirName=src/test/resources/database/xsd
dbMaintainer.autoCreateExecutedScriptsTable=true
dbMaintainer.script.fileExtensions=sql,ddl
org.unitils.dbmaintainer.clean.DBCleaner.implClassName= org.unitils.dbmaintainer.clean.impl.DefaultDBCleaner
And this is the exception I get:
org.unitils.dbmaintainer.DBMaintainer - Error while initializing DbMaintainer
org.unitils.core.UnitilsException: Missing configuration for org.unitils.core.dbsupport.DbSupport.implClassName
I'm using unitils-dbunit 3.3.

I found the solution and I'm coming back to share it.
The latest version of Unitils doesn't support the H2 database, so the necessary support has to be implemented yourself.
I found the solution in this JIRA issue:
https://unitils.atlassian.net/browse/UNI-79
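For reference, once an H2 DbSupport implementation is on the classpath (the JIRA issue above carries one you can adapt), the configuration that the exception complains about looks roughly like this; the package and class name below are placeholders for wherever you put that implementation:

database.dialect=h2
org.unitils.core.dbsupport.DbSupport.implClassName.h2=com.example.unitils.H2DbSupport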

Related

Unable to Create Extract - Tableau and Spark SQL

I am trying to create an extract from Spark SQL in Tableau. The following error message shows up while creating the extract:
[Simba][Hardy] (35) Error from server: error code: '0' error message: 'org.apache.spark.SparkException: Job aborted due to stage failure: Total size of serialized results of 906 tasks (4.0 GB) is bigger than spark.driver.maxResultSize (4.0 GB)'.
A quick fix is to change the setting in your execution context:
spark.sql("set spark.driver.maxResultSize = 8G")
I'm not entirely convinced about the Spark SQL Thrift Server, and it's a little awkward to distill all the facts. Tableau uses the results collected to the driver; how else could it get them with Spark?
However:
Setting spark.driver.maxResultSize to 0 in the relevant spark-thrift-sparkconf.conf file means no limit (except the physical limits of the driver node).
Setting spark.driver.maxResultSize to 8G or higher in the relevant spark-thrift-sparkconf.conf file also works. Note that not all memory on the driver can be used.
Or, use the Impala connector for Tableau, assuming a Hive/Impala source; then you will see fewer such issues.
Also, the number of concurrent users can be a problem, hence the last point.
Interesting to say the least.
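Since spark.driver.maxResultSize is read when the driver starts, setting it while building the session sticks more reliably than issuing a SET at runtime. A minimal Java sketch, assuming you control session creation yourself (with the Thrift Server you don't, which is why the conf file route above matters); the app name is a placeholder:

import org.apache.spark.sql.SparkSession;

// raise the collect-to-driver limit before the driver starts
SparkSession spark = SparkSession.builder()
        .appName("tableau-extract")
        .config("spark.driver.maxResultSize", "8g")
        .getOrCreate();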
spark.driver.maxResultSize 0
This is the setting you can put in your advanced cluster settings. This will solve your 4 GB issue.

Working on migration of SPL 3.0 to 4.2 (TEDA)

I am working on migrating 3.0 code to the new 4.2 framework. I am facing a few difficulties:
How to do CDR-level deduplication in the new 4.2 framework? (Note: table deduplication is already done.)
Where to implement PostDedupProcessor - context or chainsink custom? In either case, do I need to remove duplicate hashcodes from the list, or just reject the tuples? Here I am also updating columns for a few tuples.
My file is not moving into the archive. The temporary output file is generated, but it is empty and outside the load directory. What could be the possible reasons? I have thoroughly checked the config parameters, and after adding logs it seems the correct output is being sent from the transformer custom, so I don't know where it is stuck. I printed the TableRowGenerator stream to the logs (at the end of DataProcessor).
1. and 2.:
You need to select the type of deduplication; there is not a big difference between choosing table-level and CDR-level deduplication.
The ite.businessLogic.transformation.outputType setting controls this. There is only one dedup - you cannot have both.
Select recordStream for CDR-level deduplication and do the transformation to table row format (e.g. if you want to use the TableFileWriter) in xxx.chainsink.custom::PostContextDataProcessor.
In xxx.chainsink.custom::PostContextDataProcessor you need to add custom code for duplicate handling: reject (discard) tuples, set special column values, or write them to different target tables.
3.:
Possible reasons could be:
missing forwarding of window punctuations or of the statistic tuple;
an error in the BloomFilter configuration - you would see this easily, because the PE goes down and the error log gives hints about wrong sha2 functions being used.
To troubleshoot your ITE application, I recommend enabling the following debug sinks if checking the StreamsStudio live graph is not sufficient:
ite.businessLogic.transformation.debug=on
ite.businessLogic.group.debug=on
ite.businessLogic.sink.debug=on
Run a test with a single input file only and check the flow of your record and statistic tuples. The debug sinks also write punctuation markers to the debug files.

Use Mysql in dev/prod and H2 in test

Using Play Framework 2.1, I'm trying to find the best way to have two different database configurations:
One to run my application, based on MySQL
One to test my application, based on H2
While it is very easy to do one or the other, I run into the following problems when I try to do both:
I cannot have the same database evolutions, because some MySQL-specific commands do not work with H2, even in MySQL mode: this means two sets of evolutions and two separate database names.
I'm not certain how to override the main application.conf file with another one reserved for test mode. What I tried (passing the file name or overriding keys from the command line) seems to be reserved for prod mode.
My question: can anyone recommend a good way to do both (MySQL all the time and H2 only in test) without overly complicating running the application? Google did not help me.
Thanks for your help.
There are a couple of tricks you might find useful.
Firstly, MySQL's /*! */ notation allows you to add code which MySQL will obey, but other DBs will ignore, for example:
create table Users (
    id bigint not null auto_increment,
    name varchar(40)
) /*! engine=InnoDB */
It's not a silver bullet, but it'll let you paper over some of the differences between MySQL and H2's syntax. It's a MySQL-ism, so it won't help with other databases, but since most other databases aren't as quirky as MySQL, you probably wouldn't need it - we migrated our database from MySQL to PostgreSQL, which doesn't support the /*! */ notation, but PostgreSQL is similar enough to H2 that we didn't need it.
If you want to use a different config for dev and prod, you're probably best off having extra config for prod. The reason for this is that you'll probably start your dev server with play run, and start your prod server with play stage; target/start. target/start can take a -Dconfig.resource parameter. For example, create an extra config file prod.conf for prod that looks like:
include "application.conf"
# Extra config for prod - this will override the dev values in application.conf
db.default.driver=...
db.default.url=...
...
and create a start_prod script that looks like:
#!/bin/sh
# Optional - you might want to do this as part of the build/deploy process instead
#play stage
target/start -Dconfig.resource=prod.conf
In theory, you could do it the other way round, and have application.conf contain the prod conf, and create a dev.conf file, but you'll probably want a script to start prod anyway (you'll probably end up needing extra JVM/memory/GC parameters, or to add it to rc.d, or whatever).
Using different database engines is probably the worst possible scenario - as you wrote yourself, differences in functions, reserved keywords, etc. mean you sometimes need to write custom statements that are very specific to the selected DB engine. Better to use two separate databases with the same engine.
Unfortunately I don't know the issues with config overriding, so if the default ways of overriding configs fail... override it in application.conf - that way you'll be able to comment the whole block out quickly.
Here's how to use in memory database for tests:
public class ApplicationTest extends WithApplication {
    @Before
    public void setup() {
        start(fakeApplication(inMemoryDatabase("default-test"), fakeGlobal()));
    }
    // skipped ...
}
inMemoryDatabase() will use the H2 driver by default.
You can find more details in the source code.
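As an aside, if you want the H2 test database to behave more like MySQL (the asker's first pain point), H2 has a MySQL compatibility mode that can be passed as an extra option. A sketch, assuming the inMemoryDatabase(name, options) overload of play.test.Helpers (java.util.Collections supplies the one-entry map):

// start the fake application with H2 in MySQL compatibility mode
start(fakeApplication(
        inMemoryDatabase("default-test", Collections.singletonMap("MODE", "MySQL")),
        fakeGlobal()));

It won't make every MySQL-specific evolution run, but it narrows the syntax gap.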

Perl DBI without accessing the database

I'm creating a set of SQL INSERT statements for a database that doesn't exist yet, and I'm saving them to file.
How can I use Perl's powerful DBI module to create those INSERT statements without accessing a specific database? In particular, it looks like using the $dbh->quote() function requires that I instantiate $dbh with a connection to a database.
Unfortunately, the actual quote() behaviour isn't always portable, so each driver will do it differently. Unless you connect to a driver, you don't know which quoting format to use in practice. (There is one module that might do this without a connection, DBIx::Abstract, but it is not especially current.)
The quote() method is actually implemented by the corresponding driver class, in the DBD::* namespace. You might attempt to load the driver you need and call the function directly (see http://search.cpan.org/~timb/DBI-1.616/lib/DBI/DBD.pm#Writing_DBD::Driver::db::quote) but this feels grubby.
I'd still make a DBI connection, if only so that you get the right format of quoting. You don't need to actually send it any statements, but then you do know that the quoting format will be correct for the database you will use.
From DBI::quote:
For most database types, at least those that conform to SQL standards, quote would return 'Don''t' (including the outer quotation marks). For others it may return something like 'Don\'t'
That is, the behavior of DBI::quote varies from database to database, and it doesn't make sense to call it in a database-independent way.
Make a trivial connection to a database of the same type you are writing for, or learn your database's quoting conventions and implement a quote method yourself. See the DBI source for a reference implementation.
Usually you would use DBI by specifying a database like so:
my $dbh = DBI->connect("DBI:mysql:database=$db_name;host=127.0.0.1");
However, your database does not yet exist so you cannot connect to it. You can use DBI without specifying a database like so:
my $dbh = DBI->connect("DBI:mysql:;host=127.0.0.1");
You could use DBD::CSV or DBD::AnyData as a dummy database. SQLite is good for this purpose, too.
A hidden advantage of using SQLite here is that it's a semi-real database, and will tend to make you write code in a way that's decoupled from any specific database.
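A minimal sketch of the SQLite route, assuming DBD::SQLite is installed (the :memory: database never touches disk):

use DBI;

# a throwaway in-memory handle, used only for its quoting rules
my $dbh = DBI->connect("dbi:SQLite:dbname=:memory:", "", "", { RaiseError => 1 });
print $dbh->quote("Don't"), "\n";   # prints 'Don''t' (SQL-standard doubling)
$dbh->disconnect;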
According to perl -MDBI -E 'say join(q{,}, DBI->available_drivers);'
in a clean Debian chroot with only DBI (package "libdbi-perl") installed, the following drivers are available right away:
DBM,ExampleP,File,Gofer,Proxy,Sponge
The minimal DBI connect statement that works for me is
my $dbh = DBI->connect("DBI:DRIVER:"); # DRIVER is one of DBM, File, ExampleP, Sponge, mysql, SQLite
That is enough to use $dbh->quote() with no database whatsoever.
DBM and File escape q{'} as q{\'} ("mysql" style);
ExampleP and Sponge escape q{'} as q{''} ("SQLite" style).
You can also use
DBD::_::db->quote()
to access the quote function without setting up a database handle. I don't believe it is specific to MySQL, though.
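A quick sketch of that trick; DBD::_::db is DBI's internal base class, so only DBI itself needs to be loaded, and its default quote() escapes by doubling single quotes:

use DBI;

# no connect() anywhere - call quote() on the driver base class directly
print DBD::_::db->quote("Don't"), "\n";   # prints 'Don''t'

Because no real driver is involved, you get DBI's generic quoting rather than your target database's - exactly the portability caveat raised above.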

How to retrieve useful system information in java?

Which system information is useful - especially when tracing an exception or other problems - in a Java application?
I am thinking about details of exceptions, Java/OS information, memory/object consumption, IO information, environment/encodings, etc.
Besides the obvious - the exception stack trace - the more info you can get, the better. So you should get all the system properties as well as the environment variables. Also, if your application has settings of its own, get all their values. Of course, you should put all this info into your log file; I used System.out here for simplicity:
System.out.println("----Java System Properties----");
System.getProperties().list(System.out);
System.out.println("----System Environment Variables----");
Map<String, String> env = System.getenv();
Set<String> keys = env.keySet();
for (String key : keys) {
System.out.println(key + "=" + env.get(key));
}
In most cases this will be "too much" information, and the stack trace alone will be enough. But once you hit a tough issue, you will be happy to have all that "extra" information.
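The question also asks about memory consumption; the standard Runtime API covers the basics. A minimal sketch (values are in bytes, and since the GC heap is a moving target they are only snapshots):

Runtime rt = Runtime.getRuntime();
System.out.println("max heap:      " + rt.maxMemory());    // the -Xmx ceiling
System.out.println("total heap:    " + rt.totalMemory());  // currently allocated
System.out.println("free heap:     " + rt.freeMemory());   // unused part of the allocation
System.out.println("used (approx): " + (rt.totalMemory() - rt.freeMemory()));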
Check out the Javadoc for System.getProperties() which documents the properties that are guaranteed to exist in every JVM.
For pure java applications:
System.getProperty("org.xml.sax.driver")
System.getProperty("java.version")
System.getProperty("java.vm.version")
System.getProperty("os.name")
System.getProperty("os.version")
System.getProperty("os.arch")
In addition, for java servlet applications:
response.getCharacterEncoding()
request.getSession().getId()
request.getRemoteHost()
request.getHeader("User-Agent")
pageContext.getServletConfig().getServletContext().getServerInfo()
One thing that really helps me is to see where my classes are being loaded from:
obj.getClass().getProtectionDomain().getCodeSource().getLocation();
Note: the protection domain can be null, as can the code source, so do the needed null checks.
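A defensive sketch of those null checks (obj stands for whatever instance you are inspecting; ProtectionDomain and CodeSource come from java.security):

ProtectionDomain pd = obj.getClass().getProtectionDomain();
if (pd != null) {
    CodeSource cs = pd.getCodeSource();  // null for e.g. bootstrap classes
    if (cs != null && cs.getLocation() != null) {
        System.out.println("Loaded from: " + cs.getLocation());
    }
}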