Use Mysql in dev/prod and H2 in test - mysql

Using the play framework 2.1, I'm trying to find the best way to have two different databases configurations:
One to run my application based on mysql
One to test my application based on H2
While it is very easy to do one or the other, I run into the following problems when I try to do both:
I cannot have the same database evolutions because there are some MySQL-specific commands that do not work with H2, even in MySQL mode: this means two sets of evolutions and two separate database names.
I'm not certain how to override the main application.conf file with another one reserved for test mode. What I tried (passing the file name or overriding keys from the command line) seems to work only in prod mode.
My question: Can anyone recommend a good way to do both (mysql all the time and only H2 in test) without overly complicating running the application? Google did not help me.
Thanks for your help.

There are a couple of tricks you might find useful.
Firstly, MySQL's /*! */ notation allows you to add code which MySQL will obey, but other DBs will ignore, for example:
create table Users (
    id bigint not null auto_increment,
    name varchar(40)
) /*! engine=InnoDB */
It's not a silver bullet, but it'll let you paper over some of the differences between MySQL and H2's syntax. It's a MySQL-ism, so it won't help with other databases, but since most other databases aren't as quirky as MySQL, you probably wouldn't need it - we migrated our database from MySQL to PostgreSQL, which doesn't support the /*! */ notation, but PostgreSQL is similar enough to H2 that we didn't need it.
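To see the "other DBs will ignore" half of this without installing anything, here is a small sketch that runs a statement carrying a /*! */ clause against SQLite from Python's standard library. SQLite is only a stand-in here (an assumption on my part, the question is about H2), and the column definitions are simplified so that SQLite accepts them. The helper name create_users_table is made up for the example.

```python
import sqlite3

# MySQL executes the contents of /*! ... */; SQLite sees an ordinary comment.
DDL = """
create table users (
    id integer primary key,
    name varchar(40)
) /*! engine=InnoDB */
"""


def create_users_table(conn):
    """Run the shared DDL and return the names of the tables that now exist."""
    conn.execute(DDL)
    rows = conn.execute("select name from sqlite_master where type = 'table'")
    return [name for (name,) in rows]


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    print(create_users_table(connection))
    connection.close()
```

The same DDL string could be sent to MySQL unchanged, where the engine clause would take effect instead of being skipped.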
If you want to use a different config for dev and prod, you're probably best off having extra config for prod. The reason for this is that you'll probably start your dev server with play run, and start your prod server with play stage; target/start. target/start can take a -Dconfig.resource parameter. For example, create an extra config file prod.conf for prod that looks like:
include "application.conf"
# Extra config for prod - this will override the dev values in application.conf
db.default.driver=...
db.default.url=...
...
and create a start_prod script that looks like:
#!/bin/sh
# Optional - you might want to do this as part of the build/deploy process instead
#play stage
target/start -Dconfig.resource=prod.conf
In theory, you could do it the other way round, and have application.conf contain the prod conf, and create a dev.conf file, but you'll probably want a script to start prod anyway (you'll probably end up needing extra JVM/memory/GC parameters, or to add it to rc.d, or whatever).

Using different database engines is probably the worst possible scenario. As you wrote yourself, differences in some functions, reserved keywords, etc. mean that you sometimes need to write custom statements specific to the selected DB engine. Better to use two separate databases on the same engine.
Unfortunately I'm not familiar with the config-overriding issues, so if the default ways of overriding configs fail, override it in application.conf itself, so that you can comment out the whole block quickly.

Here's how to use in memory database for tests:
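import org.junit.Before;
import play.test.WithApplication;
import static play.test.Helpers.*;

public class ApplicationTest extends WithApplication {
    @Before
    public void setup() {
        // Start a fake application backed by an in-memory database named "default-test"
        start(fakeApplication(inMemoryDatabase("default-test"), fakeGlobal()));
    }
    /// skipped ....
}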
inMemoryDatabase() will use H2 driver by default.
You can find more details in source code

Related

SQLModel in Fastapi using __table_args__ is unable to create tables for unit testing [duplicate]

I have a Pylons project and a SQLAlchemy model that implements schema qualified tables:
class Hockey(Base):
    __tablename__ = "hockey"
    __table_args__ = {'schema':'winter'}
    hockey_id = sa.Column(sa.types.Integer, sa.Sequence('score_id_seq', optional=True), primary_key=True)
    baseball_id = sa.Column(sa.types.Integer, sa.ForeignKey('summer.baseball.baseball_id'))
This code works great with Postgresql but fails when using SQLite on table and foreign key names (due to SQLite's lack of schema support)
sqlalchemy.exc.OperationalError: (OperationalError) unknown database "winter" 'PRAGMA "winter".table_info("hockey")' ()
I'd like to continue using SQLite for dev and testing.
Is there a way of have this fail gracefully on SQLite?
"I'd like to continue using SQLite for dev and testing. Is there a way of have this fail gracefully on SQLite?"
It's hard to know where to start with that kind of question. So . . .
Stop it. Just stop it.
There are some developers who don't have the luxury of developing on their target platform. Their life is a hard one--moving code (and sometimes compilers) from one environment to the other, debugging twice (sometimes having to debug remotely on the target platform), gradually coming to an awareness that the gnawing in their gut is actually the start of an ulcer.
Install PostgreSQL.
When you can use the same database environment for development, testing, and deployment, you should.
Not to mention the QA team. Why on earth are they testing stuff they're not going to ship? If you're deploying on PostgreSQL, assure the quality of your work on PostgreSQL.
Seriously.
I'm not sure if this works with foreign keys, but someone could try SQLAlchemy's Multi-Tenancy Schema Translation for Table objects. It worked for me, although I used custom primaryjoin and secondaryjoin expressions in combination with composite primary keys.
The schema translation map can be passed directly to the engine creator:
...
if dialect == "sqlite":
    url = "sqlite:///:memory:"
    # Map both schema names to None so SQLite sees unqualified table names
    execution_options = {"schema_translate_map": {"winter": None, "summer": None}}
else:
    # "pass" is a reserved word in Python, so the variable is named password here
    url = f"postgresql://{user}:{password}@{host}:{port}/{name}"
    execution_options = None
engine = create_engine(url, execution_options=execution_options)
...
Here is the doc for create_engine. There is another question on SO which might be related in that regard.
But one might get colliding table names if all schema names are mapped to None.
I'm just a beginner myself, and I haven't used Pylons, but...
I notice that you are combining the table and the associated class together. How about if you separate them?
import sqlalchemy as sa
meta = sa.MetaData('sqlite:///tutorial.sqlite')
schema = None
hockey_table = sa.Table('hockey', meta,
    sa.Column('score_id', sa.types.Integer, sa.Sequence('score_id_seq', optional=True), primary_key=True),
    sa.Column('baseball_id', sa.types.Integer, sa.ForeignKey('summer.baseball.baseball_id')),
    schema = schema,
)
meta.create_all()
Then you could create a separate
class Hockey(object):
    ...
and
mapper(Hockey, hockey_table)
Then just set schema = None above whenever you are using SQLite, and the value(s) you want otherwise.
You don't have a working example, so the example above isn't a working one either. However, as other people have pointed out, trying to maintain portability across databases is in the end a losing game. I'd add a +1 to the people suggesting you just use PostgreSQL everywhere.
HTH, Regards.
I know this is a 10+ year old question, but I ran into the same problem recently: Postgres in production and sqlite in development.
The solution was to register an event listener for when the engine calls the "connect" method.
@sqlalchemy.event.listens_for(engine, "connect")
def connect(dbapi_connection, connection_record):
    dbapi_connection.execute('ATTACH "your_data_base_name.db" AS "schema_name"')
Running the ATTACH statement only once will not work, because it affects only a single connection. This is why we need the event listener: it runs the ATTACH statement on every new connection.
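The per-connection behaviour of ATTACH can be checked with nothing but the standard library sqlite3 module. The sketch below is only an illustration under my own assumptions: the file names, the "winter" schema name and the helper open_with_schema are made up, and no SQLAlchemy is involved.

```python
import os
import sqlite3
import tempfile


def open_with_schema(main_path, schema_path, schema_name):
    """Open a connection and attach schema_path under the name schema_name."""
    conn = sqlite3.connect(main_path)
    # The file name is bound as a parameter; the schema name is an identifier.
    conn.execute(f'ATTACH DATABASE ? AS "{schema_name}"', (schema_path,))
    return conn


def demo():
    with tempfile.TemporaryDirectory() as workdir:
        main_path = os.path.join(workdir, "main.db")
        winter_path = os.path.join(workdir, "winter.db")

        first = open_with_schema(main_path, winter_path, "winter")
        first.execute("CREATE TABLE winter.hockey (hockey_id INTEGER PRIMARY KEY)")
        first.execute("INSERT INTO winter.hockey (hockey_id) VALUES (1)")
        first.commit()

        # A second connection that skips ATTACH cannot resolve the schema.
        second = sqlite3.connect(main_path)
        try:
            second.execute("SELECT count(*) FROM winter.hockey")
            visible_without_attach = True
        except sqlite3.OperationalError:
            visible_without_attach = False

        # Attaching again on a fresh connection makes the table reachable.
        third = open_with_schema(main_path, winter_path, "winter")
        (count,) = third.execute("SELECT count(*) FROM winter.hockey").fetchone()

        for conn in (first, second, third):
            conn.close()
    return visible_without_attach, count


if __name__ == "__main__":
    print(demo())
```

With a connection pool, every pooled connection is in the position of the second one unless something attaches the schema for it, which is the job the "connect" listener does.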

Working on migration of SPL 3.0 to 4.2 (TEDA)

I am working on migration of 3.0 code into new 4.2 framework. I am facing a few difficulties:
How to do CDR level deduplication in new 4.2 framework? (Note: Table deduplication is already done).
Where to implement PostDedupProcessor - context or chainsink custom? In either case, do I need to remove duplicate hashcodes from the list or just reject the tuples? Here I am also doing column updating for a few tuples.
My file is not moving into archive. The temporary output file is getting generated and that too empty and outside load directory. What could be the possible reasons? - I have thoroughly checked config parameters and after putting logs, it seems correct output is being sent from transformer custom, so I don't know where it is stuck. I had printed TableRowGenerator stream for logs(end of DataProcessor).
1. and 2.:
You need to select the type of deduplication. It is not a big difference if you choose "table-" or "cdr-level-deduplication".
The ite.businessLogic.transformation.outputType does affect this. There is one Dedup only. You can not have both.
Select recordStream for "cdr-level-deduplication", do the transformation to table row format (e.g. if you like to use the TableFileWriter) in xxx.chainsink.custom::PostContextDataProcessor.
In xxx.chainsink.custom::PostContextDataProcessor you need to add custom code for duplicate-handling: reject (discard) tuples or set special column values or write them to different target tables.
3.:
Possible reasons could be:
Missing forwarding of window punctuations or statistic tuples
An error in the BloomFilter configuration; you would see it easily because the PE is down and the error log gives hints about wrong sha2 functions being used
To troubleshoot your ITE application, I recommend to enable the following debug sinks if checking the StreamsStudio live graph is not sufficient:
ite.businessLogic.transformation.debug=on
ite.businessLogic.group.debug=on
ite.businessLogic.sink.debug=on
Run a test with a single input file only and check the flow of your record and statistic tuples. "Debug sinks" also write punctuation markers to the debug files.

Using attributes in Chef

Just getting started with using chef recently. I gather that attributes are stored in one large monolithic hash named node that's available for use in your recipes and templates.
There seem to be multiple ways of defining attributes
Directly in the recipe itself
Under an attributes file - e.g. attributes/default.rb
In a JSON object that's passed to the chef-solo call. e.g. chef-solo -j web.json
Given the above 3, I'm curious
Are those all the ways attributes can be defined?
What's the order of precedence here? I'm assuming one of these methods supersedes the others
Is #3 (the JSON method) only valid for chef-solo ?
I see both node and default hashes defined. What's the difference? My best guess is that the default hash defined in attributes/default.rb gets merged into the node hash?
Thanks!
Your last question is probably the easiest to answer. In an attributes file you don't have to type 'node' so that this in attributes/default.rb:
default['foo']['bar']['baz'] = 'qux'
Is exactly the same as this in recipes/whatever.rb:
node.default['foo']['bar']['baz'] = 'qux'
In retrospect having different syntaxes for recipes and attributes is confusing, but this design choice dates back to extremely old versions of Chef.
The -j option is available to both chef-client and chef-solo, and it sets attributes in either case. Note that these will be 'normal' attributes, which persist in the node object and are generally not recommended. However, the 'run_list', 'chef_environment' and 'tags' on servers are implemented this way. Avoid other 'normal' attributes, and avoid node.normal['foo'] = 'bar' or node.set['foo'] = 'bar' in recipe (or attribute) files. The difference is that if you delete the node.normal line from the recipe, the old setting on the node will persist, while if you delete a node.default setting from a recipe, that setting will be gone the next time you run chef-client on the node.
What happens in a chef-client run to make this happen is that at the start of the run the client issues a GET to get its old node document from the server. It then wipes the default, override and automatic (ohai) attributes while keeping the 'normal' attributes. The behavior of the default, override and automatic attributes makes the most sense -- you start over at the start of the run and then construct all the state, and if it's not in the recipe then you don't see a value there. However, normally the run_list is set on the node and nodes do not (often) manage their own run_list. In order to make the run_list persist it is a normal attribute.
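The wipe-and-rebuild behaviour is easier to see in a toy model. The Python sketch below is not Chef code and not how Chef is implemented; ToyNode and its methods are invented for illustration, it handles only flat keys, and it models just the four levels named above (default, normal, override, automatic, lowest to highest).

```python
# Lowest to highest precedence, limited to the four levels discussed above.
PRECEDENCE = ["default", "normal", "override", "automatic"]


class ToyNode:
    """Toy stand-in for a node object; not Chef's real data structure."""

    def __init__(self):
        self.levels = {level: {} for level in PRECEDENCE}

    def start_run(self):
        """Clear every level except 'normal', as at the start of a client run."""
        for level in PRECEDENCE:
            if level != "normal":
                self.levels[level].clear()

    def set(self, level, key, value):
        self.levels[level][key] = value

    def get(self, key):
        """Return the value from the highest-precedence level that has the key."""
        for level in reversed(PRECEDENCE):
            if key in self.levels[level]:
                return self.levels[level][key]
        return None


def demo():
    node = ToyNode()
    # First run: a recipe sets one default and one normal attribute.
    node.start_run()
    node.set("default", "foo", "from default")
    node.set("normal", "bar", "from normal")
    # Second run: both lines have since been deleted from the recipe.
    node.start_run()
    return node.get("foo"), node.get("bar")


if __name__ == "__main__":
    print(demo())
```

In this model the default value disappears on the second run while the normal value lingers, which is the surprise the answer is warning about.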
The choice of the word 'normal' is unfortunate, as is the choice of 'node.set' setting 'normal' attributes. While those look like obvious choices to use to set attributes users should avoid using those. Again the problem is that they came first and were and are necessary and required for the run_list. Generally stick with default and override attributes only. And typically you can get most of your work done with default attributes, those should be preferred.
There's a big precedence level picture here:
https://docs.chef.io/attributes.html#attribute-precedence
That's the ultimate source of truth for attribute precedence.
That graph describes all the different ways that attributes can be defined.
The problem with Chef Attributes is that they've grown organically and sprouted many options to try to help out users who painted themselves into a corner. In general you should never need to touch automatic, normal, force_default or force_override levels of attributes. You should also avoid setting attributes in recipe code. You should move setting attributes in recipes to attribute files. What this leaves is these places to set attributes:
in the initial -j argument (sets normal attributes; you should limit this to setting the run_list, and overusing it is generally a smell)
in the role file as default or override precedence levels (careful with this one though because roles are not versioned and if you touch these attributes a lot you will cause production issues)
in the cookbook attributes file as default or override precedence levels (this is where you should set most of your attributes)
in environment files as default or override precedence levels (can be useful for settings like DNS servers in a datacenter, although you can use roles and/or cookbooks for this as well)
You can also set attributes in recipes, but when you do that you invariably wind up getting your next lesson in the two-phase compile-converge parser that runs through the Chef recipes. If you have recipes that need to communicate with each other, it's better to use node.run_state, which is just a hash that doesn't get written as node attributes. You can drop node.run_state[:foo] = 'bar' in one recipe and read it in another. You will probably see recipes that set attributes though, so you should be aware of that.
Hope That Helps.
When writing a cookbook, I visualize three levels of attributes:
Default values to converge successfully -- attributes/default.rb
Local testing override values -- JSON or .kitchen.yml (have you tried chef_zero using ChefDK and Kitchen?)
Environment/role override values -- link listed in lamont's answer: https://docs.chef.io/attributes.html#attribute-precedence

How to store web sessions in MySQL for CherryPy 3.2.2?

I found many examples for older versions of CherryPy but they each referenced importing modules not found in cherrypy 3.2.2. Looking in the documentation, I found a reference to the fact that there is built in functionality with storage_type (one of ‘ram’, ‘file’, ‘postgresql’).
For a start you could take a look at
https://github.com/3kwa/cherrys
to see how the author writes his own session class and overrides some methods. He does it for Redis, not MySQL, so you would write the methods for MySQL. A very similar class already exists in CherryPy in "cherrypy/lib/sessions.py":
class PostgresqlSession(Session)
which is very close to what you want. I'd say take the implementation approach from "3kwa", but instead of his RedisSession class copy the PostgresqlSession class from "cherrypy/lib/sessions.py" and alter it to use proper MySQL syntax.
A possible path could be:
Download "cherrys.py" from the link above and rename it to "mysqlsession.py". Replace the "RedisSession(Session)" class with the "PostgresqlSession(Session)" class from "cherrypy/lib/sessions.py" and rename it to "MySQLSession(Session)". Be sure to add
    locks = {}

    def acquire_lock(self):
        """Acquire an exclusive lock on the currently-loaded session data."""
        self.locked = True
        self.locks.setdefault(self.id, threading.RLock()).acquire()

    def release_lock(self):
        """Release the lock on the currently-loaded session data."""
        self.locks[self.id].release()
        self.locked = False
to your new "MySQLSession" class (as is done in RedisSession(Session)); this code needs import threading at the top of the module. Alter the PostgreSQL syntax to match MySQL syntax (that shouldn't be difficult). Put "mysqlsession.py" somewhere below your project directory and import it in the application with
import mysqlsession
and use
cherrypy.lib.sessions.MySQLSession = mysqlsession.MySQLSession
in the initialization of your app. In the config, set
tools.sessions.storage_type : 'mysql'
and the parameters (like host, port, etc.) like you would with class "PostgreSQL".
I can be wrong all along. But this is how I would try to solve this.

Mutexes in perl & MySQL

I am trying to ensure that only one instance of a perl script can run at one time. The script performs some kind of db_operation depending on the parameters passed in. The script does not necessarily live in one place or on one machine, and possibly runs on multiple OSs, though the file system is automounted across the various machines.
My first approach was to just create a .lock file, and do the following:
use warnings;
use strict;
use Fcntl qw(:DEFAULT :flock);
...
open(FILE,">>",$lockFilePath) or die("Could not open lock file: $!");
flock(FILE,LOCK_EX) or die("Could not lock ");
do_something();
flock(FILE,LOCK_UN) or die("Could not unlock ");
close(FILE);
but I keep getting the following errors:
Bareword "LOCK_EX" not allowed while "strict subs" in use
Bareword "LOCK_UN" not allowed while "strict subs" in use
So I am looking for another way to approach the problem. Locking the DB itself is also not practical since the db could be used by other scripts(which is acceptable), I am just trying to prevent this script from running. And locking a table for write is not practical, since my script is not aware of what table the operation is taking place, it just launches another perl script supplied as a parameter.
I am thinking of adding a table to the db, with just one value, and using that as a mutex, but I don't know how practical/reliable that is (a lot of red flags go up in my head). I have a DBI connection to a db that this script uses.
Thanks
The Bareword error you are getting sounds like you've done something in that "..." to confuse Perl with regard to the imported Fcntl constants. There's nothing wrong with using those constants like that. You might try something like LOCK_UN() to see what error that gets you.
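If you are using MySQL, you can use the GET_LOCK() and RELEASE_LOCK() mechanism. It works reasonably well for cases like this. GET_LOCK() takes the lock name and a timeout in seconds, and returns 1 if the lock was obtained and 0 if the attempt timed out:
SELECT GET_LOCK("script_lock", 10);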
and then when you are finished:
SELECT RELEASE_LOCK("script_lock");
See http://dev.mysql.com/doc/refman/4.1/en/miscellaneous-functions.html for details.
You may want to avoid the file locking; from what I remember it's notoriously unreliable on non-local filesystems. Your better bet is to just use the existence of the file itself as the indicator that the script is already running (similar to a UNIX PID file). Granted, this won't be 100% reliable but should work reasonably reliably with very low overhead, provided the script isn't getting invoked incessantly.
If you need better reliability than that, using the database for the mutex is a good solution.
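For what it's worth, here is a rough sketch of the "existence of the file" idea, written in Python rather than Perl just to keep it short. The names single_instance and AlreadyRunning are made up for the example. It assumes that exclusive file creation is atomic on the filesystem in question, which may not hold on every network mount, and a crashed process will leave a stale lock file behind.

```python
import os
import tempfile
from contextlib import contextmanager


class AlreadyRunning(Exception):
    """Raised when the lock file already exists."""


@contextmanager
def single_instance(lock_path):
    """Hold lock_path for the duration of the with-block, PID-file style."""
    try:
        # O_CREAT | O_EXCL makes the call fail if the file is already there.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise AlreadyRunning(lock_path) from None
    try:
        # Record our PID so a human can check who holds the lock.
        with os.fdopen(fd, "w") as handle:
            handle.write(str(os.getpid()))
        yield
    finally:
        os.remove(lock_path)


def demo():
    with tempfile.TemporaryDirectory() as workdir:
        lock_path = os.path.join(workdir, "script.lock")
        with single_instance(lock_path):
            try:
                with single_instance(lock_path):
                    second_ran = True
            except AlreadyRunning:
                second_ran = False
        released = not os.path.exists(lock_path)
    return second_ran, released


if __name__ == "__main__":
    print(demo())
```

The demo tries to take the lock twice; the second attempt is refused while the first holder is active, and the file is removed once the first holder finishes.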