Vertical deployment pros and cons in JBoss/MySQL

I'm working on a project which has a single WAR file for each application; it is like an app store.
So 10 apps have 10 different WAR files deployed. Usually there are DAO and BL layers as separate JARs inside the WAR file, which exposes web services.
However, there are a few cases where we refer to a library (usually the DAO/BL) from another WAR file.
I'm not sure if this is the right approach. When deploying, we seem to face difficulties figuring out which versions of the deployed JARs are in use, etc. Another approach would be to not talk to another app's JAR (DAO) at all, but to talk to the deployed web service from the client if need be.
The DAOs have a mysql-ds.xml for a MySQL database.
We could have one single data source for all the apps, but I'm not sure whether that helps.
As you can tell from the previous paragraph, I'm a bit confused, and also concerned that if we have 100 different apps, maintaining all 100 of them with their dependencies would be really hard. Also, how can connection pooling be used effectively from JBoss? In terms of maintenance, would it be better to have a single database for all apps or multiple databases? Our stack is:
JBoss
Apache CXF
Dozer
DAO (Hibernate)
Entity (POJO)
Hibernate
MySQL
And Maven as the build tool. I know my questions are a bit general; please let me know if you need more info.

Complex infrastructures like this are always difficult to manage.
There are three main approaches you can take, and each has pros and cons:
Web services that encapsulate all business layer/data access behind an API. This minimizes the proliferation of JAR versions across apps, but forces you to be more rigorous about API changes.
Creation of libraries that can be shared amongst multiple projects. I'm not entirely clear on what you mean by referring to a library from another WAR file; perhaps you mean that you're including the relevant JARs in your newly deployed WAR. This does lead to the version compatibility concerns you mention, but it can make modifying existing APIs more flexible, in that you don't have to immediately modify all existing apps.
Encapsulate all data logic in the database. In my experience, this is the most problematic, as it separates the dev from knowledge of how the business logic is working, and can be the most fragile - one stored procedure change can be harder to detect when it starts breaking other apps than the other approaches.
In my experience, it comes down to having more established processes and agreements among the team about how changes will be made. You really have to look at your business layer/data access layer as APIs and be very conservative about making changes. If you aren't already using a continuous build system, I'd highly recommend it, as it can help you catch changes that break existing applications early on and allow you to keep things in sync.

It's perfectly fine to have all your applications use the same database.
However, you run the risk that different apps use the database in different ways.
For this reason I would recommend that you put as much logic as possible in MySQL.
I cannot tell you how to do this, because I don't know what your apps do or need, but I can give you some general ideas and pointers.
General ideas and pointers
You can use stored procedures/functions to do stuff
If your apps use a stored procedure to make stuff happen in the single database, all apps will work in the same manner (e.g. use a stored procedure to book a transaction).
Use stored functions to do calculations on fields, e.g.:
price_per_sales_line = price * quantity * (1 + tax%) * (1 - discount%)
If you put this logic in a MySQL function, then you don't need to worry about debugging it in app A or B, because all apps will work the same way.
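A minimal sketch of such a MySQL function (the signature and the tax/discount percentages are made up for illustration):

DELIMITER $$
CREATE FUNCTION price_per_sales_line(
    price DECIMAL(10,2),
    quantity INT,
    tax_pct DECIMAL(5,2),
    discount_pct DECIMAL(5,2)
) RETURNS DECIMAL(12,2)
DETERMINISTIC
BEGIN
  /* the same calculation as above: price * quantity * (1 + tax%) * (1 - discount%) */
  RETURN price * quantity * (1 + tax_pct / 100) * (1 - discount_pct / 100);
END $$
DELIMITER ;

Every app then computes a line total with SELECT price_per_sales_line(price, quantity, 19.0, 5.0) instead of repeating the formula in its own code.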
And my personal favorite
Use triggers to make sure stuff happens properly.
E.g. if you have a transaction where you need to add a new item for sale, you can put this in a stored proc, but you can also do something like:
Pseudo code
CREATE TABLE blackhole_new_sales_item (
  name varchar(45) not null,
  price decimal(10,2) not null,
  category_id integer not null )
ENGINE = BLACKHOLE;

DELIMITER $$
CREATE TRIGGER ai_bh_new_sales_item_each
AFTER INSERT ON blackhole_new_sales_item FOR EACH ROW
BEGIN
  /* all statements inside a trigger run in a single transaction */
  DECLARE new_item_id INTEGER;
  INSERT IGNORE INTO items (name) VALUES (NEW.name);
  SELECT id INTO new_item_id FROM items WHERE name = NEW.name;
  INSERT IGNORE INTO item_categories (item_id, cat_id) VALUES (new_item_id, NEW.category_id);
  INSERT INTO price (item_id, price, valid_from, valid_until)
    VALUES (new_item_id, NEW.price, NOW(), '2038-12-31');
END $$
DELIMITER ;
In your apps you can just do a single:
INSERT INTO blackhole_new_sales_item VALUES ('test', 0.99, 2);
The trigger will take care of everything, and if you change the structure of your database, you only need to change the inside of the trigger; all your apps will keep working without modification.
If you add extra fields to the blackhole table, you need to only change the single call in each app.
You can even create an extra blackhole table and create a separate trigger for that, and fill your old-blackhole-table-trigger with fall-back code to support the older apps.
So this approach gives you a single point to put all your DB logic into so all apps will behave in the same way, whilst still being flexible enough to support upgrades.
Hope this helps.

Related

What can I do to trace what a program does, without having the source code or support from the program supplier?

I now have to deal with a program called FDT. The company I work for no longer has a support contract for it, but we still use the program. I need to insert new orders into the program from our site; I can get them from Magento in XML, CSV, or some other format, and I am trying to automate this process. All work in the office is done through this software (FDT): checking out-of-stock items, printing bills, and so on.
I am now thinking of using Profiler to trace events. I would like to know what processing the program does when we place an order in it. I am not an experienced user of Profiler, so I would like some suggestions on whether it is possible to find out which tables it affects and which columns it updates or writes to.
The program generates a new order number, which is a unique integer ID, and I am not able to work out the pattern. I do have a test server where I can make changes, so trial and error is no problem.
Some suggestions on how I should proceed, or at least where to start, would be appreciated.
I think the most important thing would be to trace the T-SQL, but again, which events and which filters should I use?
I am sorry if it is a stupid question; I am trying to learn. Source code and vendor support are not options.
This question has too many parts: how to do a trace, how to deal with an application after the support contract is gone, how to reverse engineer an app, and whether that is even a good idea (sometimes it's the only idea available). I'd re-ask this as a series of narrow technical questions, or ask it on Programmers (after reading their FAQ; they only like certain questions).
Yup, been there, done that. In large organizations these tasks normally fall to techies who don't wield the awesome power of the budget and can't personally go negotiate a new contract with the original vendor. I assume you have food bills to pay and can't tell your supervisor, "well, I ain't doing nothing until we get a support contract".
Step 0: Diagram the tables. Work out the entity relationships and assemble a data dictionary (one that explains the motivation of each table and column, not just the name and data type).
Step 1: Attach Profiler to an active instance of SQL Server 2008. If you have a specific question about SQL Profiler, open a new question. One hint: if you are attached to a multi-user instance, filter down to just your own user (the one in the connection string).
http://blog.sqlauthority.com/2009/08/03/sql-server-introduction-to-sql-server-2008-profiler-2/
Step 2
Do an action in the application and watch what SQL is emitted. If it is plain SQL, you can copy and paste it into Management Studio so you can diagram the query and run your own test executions. If it is a stored proc, go read the source code of the stored procedure. If the stored procedure is encrypted, it may or may not be possible to decrypt it. A scenario where decrypting the code is fairly defensible is when you aren't redistributing it and the supporting company is no longer around.
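On the test server, one handy pattern is to wrap a captured statement in a transaction and roll it back, so repeated test runs don't leave data behind (the table and column names below are hypothetical):

BEGIN TRANSACTION;

-- paste the statement captured by Profiler here, e.g.:
UPDATE Orders SET Status = 'Shipped' WHERE OrderId = 12345;

-- inspect the effect while the transaction is still open
SELECT * FROM Orders WHERE OrderId = 12345;

ROLLBACK TRANSACTION;  -- undo everything so the test data stays untouched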
Step 3
Once you understand the app, you can write reports, or, more likely, record either new or old transactions differently.
If the app is written in .NET or Java, you can decompile it and read the code. Creating a custom build from that source isn't going to be fun. A more likely outcome is that you will create an application that targets the same tables, or possibly export all the data out of the original app and into a new bespoke one.

Put a trigger in a MySQL database to update an Oracle database?

I want to create an insert trigger in MySQL which will automatically insert the record into an Oracle database. I would like to know if there are people who have experience to share on this topic.
Cheers
Invoke a script from the trigger (as is done in this example) that calls the Oracle code.
Note: you lose support for transactions (there will be no built-in rollback for the Oracle database) when you perform this type of cascading, and you will also likely take a very large performance hit in doing so. The script could turn around and simply call Java code or some other executable that invokes some generic code of yours to insert into Oracle, or it could be a raw query that gets passed arguments from the script.
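As a rough sketch of the idea, assuming the lib_mysqludf_sys UDF is installed and that a push_to_oracle.sh script (hypothetical) does the actual Oracle insert:

DELIMITER $$
CREATE TRIGGER ai_orders_push_to_oracle
AFTER INSERT ON orders FOR EACH ROW
BEGIN
  /* sys_exec() comes from lib_mysqludf_sys; the orders table and the script path are made up */
  DECLARE result INT;
  SET result = sys_exec(CONCAT('/opt/scripts/push_to_oracle.sh ', NEW.id));
END $$
DELIMITER ;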
This is almost certainly a bad idea because of the odd side-effect behavior, but it's one that can be implemented. I think you would be much better off having application code do this against two different DataSources (in Java/.NET speak) rather than having a hidden script in a MySQL trigger, which screams "unmaintainable" and sets up hidden failures for future developers.

Database build process management

What options exist to manage database scripts and to do new development on a database?
For example, the database is used by a number of applications and there are a number of developers working with it. What are the best options to keep the database up to date with the latest changes, and what should the process be for deploying changes to production?
I see two options:
Microsoft Visual Studio has a database project, so all database scripts can be added to the project and the database can be rebuilt from Visual Studio
Restore the database from a backup and apply only the new scripts to it
What other options exist? How can I manage database development, and what are the best practices? What are the advantages and disadvantages of the options above? How should new SQL scripts be maintained?
I understand that a source control system should be used, but with DB scripts it's not as easy as with application code.
I believe there is no universal solution, but I am at least interested in DB developers' opinions on how this is implemented in your companies.
Liquibase is IMHO the best tool. It's brutally simple in its approach, which is one of the reasons it works so well.
You can read up on how it works on its site, but basically it creates and manages a simple table that stores a hash of each script to determine whether it has already been run. There's pre- and post-SQL too, and you can bypass changes based on conditions... it does pretty much everything you'd want or need. It also has Maven integration, so it can seamlessly become part of your build.
I used it very successfully on a large (8 developers) project and now I wouldn't use anything else.
And it's free!
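For a flavor of what this looks like, a changelog can even be written as plain "formatted SQL" (a minimal sketch; the author name, changeset ids, and table are made up):

--liquibase formatted sql

--changeset alice:1
CREATE TABLE customer (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(100) NOT NULL
);
--rollback DROP TABLE customer;

--changeset alice:2
ALTER TABLE customer ADD COLUMN email VARCHAR(255);
--rollback ALTER TABLE customer DROP COLUMN email;

Liquibase records the checksum of each changeset in its tracking table and only runs the ones it hasn't seen before.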
Currently we use SVN and have an "UpgradeScripts" folder where all developers commit their scripts to.
Each script has a generated prefix in the format upg_yyyymmddhhmmss_ScriptName.sql, so when the scripts are deployed they run in a predefined order, keeping the database consistent.
The prefix is generated through the SQL below and enforced through a pre-commit hook:
select 'upg_' + convert(varchar, SYSUTCDATETIME(), 112)
+ replace(convert(varchar, SYSUTCDATETIME(), 8), ':', '')
+ '_'
+ 'MeaningfulScriptName'
+ '.sql'
Another handy technique we use is making sure the difference between static and non-static data is clear: in our database there is the standard "dbo" schema, which holds non-static data that may change between environments, and a "static" schema. All tables in the "static" schema have fixed IDs, so developers know they can use them in enums and reference the IDs in scripts.
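A minimal sketch of what such a table might look like (the schema name matches the convention above; the table and its values are hypothetical):

CREATE SCHEMA static;
GO

CREATE TABLE static.OrderStatus (
    Id   int          NOT NULL PRIMARY KEY,  -- no IDENTITY: IDs are assigned by hand and never change
    Name nvarchar(50) NOT NULL
);

INSERT INTO static.OrderStatus (Id, Name) VALUES
    (1, 'Pending'),
    (2, 'Shipped'),
    (3, 'Cancelled');

Because the IDs are fixed, application code can safely map them to an enum (e.g. OrderStatus.Shipped = 2) without looking them up per environment.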
If you are looking for something more formal, Red Gate have a utility called SQL Source Control.
Or you could look into using the Data Tier Application framework.
We use DBGhost to version control the database. The scripts to create the current database are stored in TFS (along with the source code) and then DBGhost is used to generate a delta script to upgrade an environment to the current version. DBGhost can also create delta scripts for any static/reference/code data.
It requires a mind shift from the traditional method but is a fantastic solution which I cannot recommend enough. Whilst it is a 3rd party product it fits seamlessly into our automated build and deployment process.

Testing without affecting certain tables

At the moment I'm stuck with the need to debug several functions in our system to determine if they're working or not.
The situation is basically that I'm left with someone else's CakePHP structure, which means I don't know the code inside out. This is due to lack of time and lack of documentation.
I need to run tests on this system; however, they will cause incorrect data on our reports page when I create new orders, etc. That is not allowed, and basically there are a lot of models which save data to the reports simply by creating other rows.
The easiest solution here would be to make sure no report rows get created if I'm logged in as a certain user. Then I'd simply add a condition to determine whether I should insert the report row into the database or not (if ($bool_tester) return FALSE; else /* insert data */).
This would, however, require fetching the Session data within the Model, which I've read is bad practice. I can't simply pass an extra parameter to the function, since the function is called in so many places in so many files.
So my question is basically: should I access Session data within the Model regardless, or is there another nifty solution that lets me avoid inserting these rows when I'm testing?
Defining a session value through the controllers isn't a smooth solution here either.
Do the testing in your development environment, not on the live site.
Do you use unit testing for the tests? CakePHP does support that. If you do, you could stub or mock the data within your test setup; Cake supports that as well.

Django code or MySQL triggers

I'm making a web service with Django that uses a MySQL database. Clients interface with our database through URLs, handled by Django. Right now I'm trying to create a behavior that automatically does some checking/logging whenever a certain table is modified, which naturally suggests MySQL triggers. However, I can also do this in Django, in the request handler that does the table modification. I don't think Django has trigger support yet, so I'm not sure which is better: doing it through Django code or through a MySQL trigger.
Anybody with knowledge on the performance of these options care to shed some light? Thanks in advance!
There are a lot of ways to solve the problem you've described:
Application Logic
View-specific logic -- If the behavior is specific to a single view, then put the changes in the view.
Model-specific logic -- If the behavior is specific to a single model, then override the save() method for the model.
Middleware Logic -- If the behavior relates to multiple models OR needs to be wrapped around an existing application, you can use Django's pre-save/post-save signals to add additional behaviors without changing the application itself.
Database Stored Procedures -- Normally a possibility, but Django's ORM doesn't use them. Not portable across databases.
Database Triggers -- Not portable from one database to another (or even one version of a database to the next), but they allow you to control shared behavior across multiple (possibly non-Django) applications; see the sketch below.
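If you do go the trigger route, a minimal sketch of a logging trigger in MySQL might look like this (the orders table, its columns, and the orders_audit table are hypothetical):

DELIMITER $$
CREATE TRIGGER au_orders_log
AFTER UPDATE ON orders FOR EACH ROW
BEGIN
  /* write one audit row per change; adjust the columns to whatever you need to capture */
  INSERT INTO orders_audit (order_id, old_status, new_status, changed_at)
  VALUES (NEW.id, OLD.status, NEW.status, NOW());
END $$
DELIMITER ;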
Personally, I prefer either overriding the save() method or using a Django signal. Using view-specific logic can catch you out on large applications with multiple views of the same model(s).
What you're describing sounds like "change data capture" to me.
I think the trade-offs might go like this:
Django pros: Middle tier code can be shared by multiple apps; portable if database changes
Django cons: Logically not part of the business transaction
MySQL pros: Natural to do it in a database
MySQL cons: Triggers are very database-specific; if you change vendors you have to rewrite
This might be helpful.