Doctrine, SQLite and Enums

We have an application running on Symfony 2.8 with a package named "liip/functional-test-bundle". We plan on using PHPUnit to run functional tests on our application, which uses MySQL for its database.
The 'functional test bundle' package allows us to use the entities as a schema builder for an in-memory SQLite database, which is very handy because:
It requires zero configuration to run
It's extremely fast to run tests
Our tests can run independently of one another and of the development data
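That isolation is easy to see with plain SQLite, outside of Doctrine: every `:memory:` connection is its own throwaway database. A minimal sketch using Python's built-in `sqlite3` (just an illustration of the engine's behaviour, not the bundle's API):

```python
import sqlite3

# Each ":memory:" connection is a brand-new, private database: no server,
# no configuration, and no state shared between tests that open their own.
a = sqlite3.connect(":memory:")
b = sqlite3.connect(":memory:")

a.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

a_tables = [row[0] for row in a.execute("SELECT name FROM sqlite_master")]
b_tables = [row[0] for row in b.execute("SELECT name FROM sqlite_master")]
print(a_tables)  # ['users']
print(b_tables)  # []  (b never sees a's schema)
```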
Unfortunately, some of our entities use 'enums', which are not supported by SQLite, and our technical lead has opted to keep the existing enums while refraining from adding new ones.
Ideally we need this in the project sooner rather than later, so the team can start writing new tests in the future to help maintain the stability of the application.
I have 3 options at this point, but I need help choosing the correct one and performing it correctly:
Convince the technical lead that enums are a bad idea and that lookup tables could be used instead (which may cost time when the workload is already high)
Switch to using MySQL for the testing database. (This will require additional configuration for our tests to run, and may be slower)
Have Doctrine detect when enums are used on a SQLite driver, and swap them out for strings. (I have no idea how to do this, but it is, in my opinion, the ideal solution)
Which action is the best, and how should I carry it out?
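For what it's worth, option 3 is close to something Doctrine's DBAL already supports in configuration: MySQL's enum column type can be mapped onto Doctrine's string type, which is the approach the Doctrine cookbook suggests for MySQL enums. A hedged sketch for a Symfony 2.x config.yml (the surrounding keys reflect a default single-connection setup, which is an assumption about this project; entities that declare enums via a raw columnDefinition would instead need a custom DBAL type that checks the platform):

```yaml
# app/config/config.yml -- sketch, adjust to your own connection layout
doctrine:
    dbal:
        # Treat MySQL "enum" columns as plain strings during schema
        # introspection, so Doctrine never has to emit an enum type
        # that SQLite cannot understand.
        mapping_types:
            enum: string
```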

Related

Is it bad practice to use h2db for integration test in a spring boot context?

In our team, the question was recently raised whether using H2 for integration tests is bad practice and should be avoided when the production environment relies on a different database engine, in our case MySQL 8.
I'm not sure I agree with that, considering we are using Spring Boot/Hibernate for our backends.
I did some reading and came across this article https://phauer.com/2017/dont-use-in-memory-databases-tests-h2/ stating basically the following (and more):
TL;DR
Using in-memory databases for tests reduces the reliability and scope of your tests. Your application's SQL may fail in production against the real database, although the H2-based tests are green. They don't provide the same features as the real database. Possible consequences are:
You change the application's SQL code just to make it run in both the real and the in-memory database. This may result in less effective, elegant, accurate or maintainable implementations. Or you can't do certain things at all.
You skip the tests for some features completely.
As far as I can tell, for a simple CRUD application with some business logic, none of those points concerns me (there are some more in the article), because Hibernate abstracts away all the SQL and there is no native SQL in the code.
Are there any points that I am overlooking or have not considered that speak against an H2 DB? Is there a "best practice" regarding the use of in-memory databases for integration tests with Spring Boot/Hibernate?
I'd avoid using H2 if possible. H2 is a good choice when you can't run your own instance, for example if your company uses something like Oracle and won't let you run your own DB wherever you want (local machine, your own dev server...).
The problems with H2 are the following:
Migration scripts may differ between H2 and your DB. You'll probably need separate tweaks for the H2 scripts and the MySQL scripts.
H2 usually doesn't provide the same features as a real RDBMS; you degrade the DB to plain SQL and won't be able to test stored procedures, triggers and all the fancy stuff that may come in handy.
H2 and other RDBMSs are simply different. Your tests won't be testing the same thing, and you may get errors in production that never appear in your tests.
Speaking of your simple CRUD application - it may not stay like that forever.
But go ahead with any approach you like; it is best to gain the experience yourself. I got burned by H2 too often to like it.
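The "tests won't be testing the same thing" point is easy to demonstrate. As a hedged illustration using Python's built-in `sqlite3` and SQLite rather than H2 (the argument is identical for any in-memory stand-in): SQLite's type affinity happily stores text in an INTEGER column, where MySQL 8 in strict mode would reject the statement, so a green test proves nothing about production.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")

# SQLite's type affinity accepts this; MySQL 8 in strict mode raises
# "Incorrect integer value" instead. Same SQL, different engine, different result.
conn.execute("INSERT INTO t (n) VALUES ('not a number')")

stored = conn.execute("SELECT n FROM t").fetchone()[0]
print(repr(stored))  # 'not a number'
```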
I would say it depends on the scope of your tests and what you can afford for your integration tests. I would prefer testing against an environment as close as possible to my production environment. But that's the ideal case; in reality that might not be possible for various reasons. Expecting Hibernate to abstract away low-level details perfectly is also an ideal case; in reality the abstraction may be giving you a false sense of security.
If the scope of your tests is just to test CRUD operations, in-memory tests should be fine. They will perform quite adequately in that scope, and might even be beneficial by reducing the running time of your tests as well as some degree of complexity. They won't detect any platform/version/vendor-specific issues, but that wasn't the scope of the tests anyway. You can instead test those things in a staging environment before going to production.
In my opinion, it's now easier than ever to create a test environment as close as possible to your production environment using things like Docker; CI/CD tools and platforms also support spinning up services for that purpose. If this isn't available, or is too complicated for your use case, then the fallback is acceptable.
From experience, I have faced failures related to platform/version/vendor-specific issues when deploying to production even though all my tests against the in-memory database were green. It's always better to detect these issues early; it saves a lot of recurring development time and, most importantly, your good night's sleep.

SQLite3 database per customer

Scenario:
Building a commercial app consisting of a RESTful backend in Symfony2 and a frontend in AngularJS
This app will never be used by many customers (if I get to sell 100, that would be fantastic; hopefully many more, but in any case it will never be massive)
I want a multi-tenant structure for the database, with one schema per customer (they store sensitive information about their customers)
I'm aware of the problems when updating schemas, but I will have to live with them.
Today I have a MySQL demo database that I will clone each time a new customer purchases the app.
There is no relationship between my customers, so I don't need to communicate with multiple shards for any query
For one customer, the app may be in use from several devices at a time, but there won't be massive write operations in the DB
My question
While setting up some functional tests for the backend API, I read about having a dedicated SQLite database for loading test data, which seems to be a good idea.
However, I wonder whether it's also a good idea to switch from MySQL to SQLite3 as the main database for the application, and whether it's common practice to have one dedicated SQLite3 database PER CLIENT. I've never used SQLite, and I have no idea whether updating a schema and replicating the change across all the databases works the same way as in other RDBMSs.
Is this a correct scenario for SQLite?
Any suggestion (aka tutorial) in how to achieve this?
[I wonder] if it's a common practice to have one dedicated SQLite3 database PER CLIENT
Only if the database is deployed along with the application, like on a phone. Otherwise I've never heard of such a thing.
I've never used SQLite and I have no idea if the process of updating a schema and replicate the changes in all the databases is done in the same way as for other RDBMS
SQLite is a SQL database and responds to ALTER TABLE and the like. As for updating all the schemas, you'll have to re-run the update for all schemas.
Schema syncing is usually handled by an outside utility; your ORM will usually have something. Some are server-agnostic, some only support specific servers. There are also dedicated database change management tools such as Sqitch.
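A minimal sketch of the "re-run the update for all schemas" chore, assuming one SQLite file per tenant in a single directory (the file layout and function name are hypothetical):

```python
import glob
import sqlite3

def migrate_all(pattern, statements):
    """Apply the same DDL statements to every per-tenant SQLite file.

    `pattern` is a glob such as 'tenants/*.db'. Each file is migrated in
    its own transaction, so one tenant failing does not half-migrate others.
    """
    for path in sorted(glob.glob(pattern)):
        conn = sqlite3.connect(path)
        try:
            with conn:  # commits on success, rolls back on error
                for stmt in statements:
                    conn.execute(stmt)
        finally:
            conn.close()
```

Real schema changes also need data migrations and some kind of version table; tools such as Sqitch or your ORM's migration system exist to handle exactly that bookkeeping.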
However I wonder if it's also a good idea to switch from MySQL to SQLite3 database as my main database support for the application, and
SQLite's main advantage is not requiring you to install and run a server. That makes sense for quick projects, or where you have to ship the database with the application, as in a phone app. For a server-based application there's no problem running a database server, and SQLite's very restricted set of SQL features becomes a disadvantage. It will also likely run slower than a server database for anything but the simplest queries.
Trying to set some functional tests for the backend API I read about having a dedicated sqlite database for loading testing data, which seems to be good idea.
Under no circumstances should you test with a different database than the production database. Databases do not all implement SQL the same, MySQL is particularly bad about this, and your tests will not reflect reality. Running a MySQL instance for testing is not much work.
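One concrete instance of that divergence, sketched with Python's `sqlite3`: standard SQL string concatenation works on SQLite, while on MySQL the same query silently does something else, because `||` is the logical OR operator there unless the `PIPES_AS_CONCAT` SQL mode is enabled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Standard SQL string concatenation: SQLite returns 'ab'. Default MySQL
# treats || as logical OR and returns 0 for the same query, so a test
# that is green on SQLite can still break against MySQL in production.
result = conn.execute("SELECT 'a' || 'b'").fetchone()[0]
print(result)  # 'ab'
```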
This separate schema thing claims three advantages...
Extensibility (you can add fields whenever you like)
Security (a query cannot accidentally show data for the wrong tenant)
Parallel Scaling (you can potentially split each schema onto a different server)
What they're proposing is equivalent to having a separate, customized copy of the code for every tenant. You wouldn't do that; it's obviously a maintenance nightmare. Code at least has the advantage of version control systems with branching and merging. I know of only one database change management tool that supports branching: Sqitch.
Let's imagine you've made a custom change to tenant 5's schema. Now you have a general schema change you'd like to apply to all of them. What if the change to 5 conflicts with this? What if the change to 5 requires special data migration different from everybody else? Now let's imagine you've made custom changes to ten schemas. A hundred. A thousand? Nightmare.
Different schemas will require different queries. The application will have to know which schema each tenant is using, there will have to be some sort of schema version map you'll need to maintain. And every different possible query for every different possible schema will have to be maintained in the application code. Nightmare.
Yes, putting each tenant in a separate schema is more secure, but that only protects against badly written queries or against including a query builder (which is a bad idea anyway). There are better ways to mitigate the problem, such as the view filter suggested in the docs. And there are many other ways an attacker can access tenant data that this doesn't address: gaining a database connection, gaining access to the filesystem, sniffing network traffic. I don't see the small security gain being worth the maintenance nightmare.
As for scaling, the article is ten years out of date. There are far, far better ways to achieve parallel scaling than to coarsely put schemas on different servers; there are entire databases dedicated to this idea. Fortunately, you don't need any of this! Scaling won't be a problem for you until you have tens of thousands to millions of tenants. Front-loading your design with a schema maintenance nightmare for a hypothetical big parallel scaling problem is putting the cart so far before the horse, it's already at the pub having a pint.
If you want to use a relational database, I would recommend PostgreSQL. It has a very rich SQL implementation, it's fast and scales well, and it has something that renders this whole idea of separate schemas moot: a built-in JSON type. This can be used to implement the "extensibility" mentioned in the article. Each table can have a meta column using the JSON type into which you can throw any extra data you like. The application does not need special queries; the meta column is always there. PostgreSQL's JSON operators make working with the metadata very easy and efficient.
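The meta-column pattern looks roughly like this. A hedged sketch using Python's stdlib and SQLite for portability (in PostgreSQL the column would be a real `jsonb` type queried with operators such as `->>` instead of being decoded in the application; the table and key names here are invented):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, meta TEXT)"
)

# Fixed columns stay relational; per-tenant extras go into one JSON blob,
# so every tenant shares the same schema and the same queries.
conn.execute(
    "INSERT INTO product (name, meta) VALUES (?, ?)",
    ("Widget", json.dumps({"color": "red", "warranty_months": 24})),
)

name, meta_raw = conn.execute("SELECT name, meta FROM product").fetchone()
meta = json.loads(meta_raw)
print(name, meta["color"])  # Widget red
```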
You could also look into a NoSQL database. There are plenty to choose from and many support custom schemas and parallel scaling. However, it's likely you will have to change your choice of framework to use one that supports NoSQL.

Arel in SQLite and other databases

I am new to databases and Ruby on Rails applications.
I have a question about generating queries from an ORM.
When my database is SQLite and I am using code that builds queries for this database, am I still able to use the same code if I change my database?
In addition, when I am using Arel (because it provides more ready-made methods for complex queries), I call the method .to_sql before generating a query.
If I want to use the same code with another database, am I still able to execute the query? Using something else instead of to_sql?
In general, the Ruby on Rails code is portable between databases without doing anything more than adjusting your config/database.yml file (for connection details) and updating your Gemfile (to use the correct database adapter gem).
Database portability is most likely when you do not rely on specific, hardcoded SQL to invoke queries. Instead, use Rails' associations, relations, and query tools wherever possible. Specific SQL often creeps into .where() clauses, so be thoughtful there and minimize/simplify those as much as practical (for instance, multiple simple scopes that can be chained may give you better results than one larger, more complex scope). Also use Arel's matches when depending on LIKE-type clauses instead of hardcoding LIKE details in .where() calls, because different databases (such as PostgreSQL) handle case sensitivity differently.
Your best defense against surprises upon switching databases is a robust set of automated unit tests (e.g., RSpec, minitest) and integration tests (e.g., Capybara). These are especially important where you are unable to avoid specific SQL (say, for optimization or odd/complex queries).
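The case-sensitivity point is a classic portability trap. A small hedged illustration using Python's built-in `sqlite3` rather than Ruby (the Arel layer would generate comparable SQL): SQLite's LIKE is case-insensitive for ASCII by default, while PostgreSQL's LIKE is case-sensitive and needs ILIKE for this behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Returns 1 (true) on SQLite, because its LIKE ignores ASCII case by default.
# PostgreSQL's LIKE would return false here; you would need ILIKE instead,
# which is exactly the difference an abstraction like Arel's matches papers over.
matched = conn.execute("SELECT 'Alice' LIKE 'alice'").fetchone()[0]
print(matched)  # 1
```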
Since SQLite is simpler than most other robust engines like MySQL or Postgres, you're likely to be safer anyway in any operations you depend on. You're most vulnerable when using some advanced or specific feature of the database, but you're also generally more aware if you're doing that, so can write protective tests to help warn you upon switching database engines.

MySQL: How do I test my database architecture (foreign key consistency, stored procedures, etc)

I'm just about to design a larger database architecture. It will contain a set of tables, several views and quite a few stored procedures. Since it's a database on the larger side and in the very early stages of development (actually it's still only at the design stage), I feel the need for a test suite to verify integrity during refactoring.
I'm quite familiar with testing concepts as far as application logic is concerned, both on server side (mainly PHPUnit) and client side (Selenium and the Android test infrastructure).
But how do I test my database architecture?
Is there some kind of similar testing strategies and tools for databases in general and MySQL in particular?
How do I verify that my views, stored procedures, triggers and God-knows-what are still valid after I change an underlying table?
Do I have to wrap the database with, say, a PHP layer to enable testing of database logic (stored procedures, triggers, etc)?
There are two sides of database testing.
The first is oriented to testing the database from the business-logic point of view and should not concern persisted data. At that level there is a well-known technique: an ORM. The approach in this case is simple: describe a model and create a set of test cases or criteria to check that all cascade actions perform as they should (I mean: if we create a Product and link it to a Category, then after saving the session we should get all entities written to the DB with all the required relations between them). More to the point, some ORMs already provide a unit-testing module (for example, NHibernate), and some even offer tooling that makes creating database schemas, models and test cases very easy and fast: for example, Fluent NHibernate.
The second is oriented to testing the database schema itself. For that purpose you can look at a good library, DbUnit. A quote from the official site:
DbUnit is a JUnit extension (also usable with Ant) targeted at database-driven projects that, among other things, puts your database into a known state between test runs. DbUnit has the ability to export and import your database data to and from XML datasets. Since version 2.0, DbUnit can also work with very large datasets when used in streaming mode. DbUnit can also help you to verify that your database data match an expected set of values.
Finally, I highly recommend reading the article "Evolutionary Database Design" on Martin Fowler's site. It's a bit dated (2003), but still well worth reading.
To test a database, some of the things you will need are:
A test database containing all your data test cases, initial data, and so on. This will enable you to test from a known start position each time.
A set of transactions (INSERT, DELETE, UPDATE) that move your database through the states you want to test. These can themselves be stored in the test database.
Your set of tests - expressed as queries on the database, that do the actual checking of the results of your actions. These results will be tested by your test suite.
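Put together, the three ingredients above fit in a few lines. A hedged sketch with Python's `sqlite3` standing in for the test database (the schema and amounts are invented for illustration):

```python
import sqlite3

def run_transfer_test():
    # 1. Known start position: schema plus seed data.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])

    # 2. The transactions that move the database into the state under test.
    with conn:
        conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
        conn.execute("UPDATE account SET balance = balance + 30 WHERE id = 2")

    # 3. The checks, expressed as queries against the resulting state.
    balances = dict(conn.execute("SELECT id, balance FROM account"))
    assert balances == {1: 70, 2: 80}
    assert sum(balances.values()) == 150  # money is conserved
    return balances
```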
Exceptions can be thrown by a database, but if you are getting exceptions you likely have much more serious problems with your database and data. You can test the database's behaviour here in a similar fashion, but except for corner cases this should be less necessary, as modern database engines are pretty robust at their task of serving data.
You should not need to wrap your database with a PHP layer - if you follow the above structure it should be possible to have your complete test suite in the DML and DDL of your actual database combined with your normal test suite.

Testing framework for data access tier

Is there any testing framework for Data access tier? I'm using mysql DB.
If you are using an ORM (such as Hibernate), then testing the DAL is easy. All you have to do is specify a test configuration using an in-memory SQLite database and then execute all your DAL tests against SQLite. Of course, you need to do proper data population and schema definition in the first place.
DbUnit will help you here.
Why do you need a database test tool?
Use your services (or DAOs) to populate the database. Otherwise you're going to duplicate your fixture state in your tests and your domain logic in your fixtures. This will result in worse maintainability (most notably readability).
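A minimal sketch of populating through a DAO instead of raw fixtures (the class and method names are hypothetical, and Python's `sqlite3` stands in for the MySQL data access tier):

```python
import sqlite3

class ProductDao:
    """Tiny DAO: the same code paths the application uses also seed the tests."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS product (id INTEGER PRIMARY KEY, name TEXT)"
        )

    def add(self, name):
        with self.conn:  # commit per call, as the application would
            cur = self.conn.execute(
                "INSERT INTO product (name) VALUES (?)", (name,)
            )
        return cur.lastrowid

    def count(self):
        return self.conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]

# A test seeds its data through the DAO, so fixtures never duplicate domain logic.
dao = ProductDao(sqlite3.connect(":memory:"))
widget_id = dao.add("Widget")
print(widget_id, dao.count())  # 1 1
```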
If you get weary of inventing test data, think about tools like QuickCheck (there are ports for all major languages).