We have customer data stored in MySQL (the web app) and other data stored in SQL Server (billing data), and now we need to report on this data inside our customer-facing application.
Does anyone have experience merging these two data sources? Is there an effective way to do this?
Are there existing solutions, preferably OSS, that can aggregate the data sources and allow them to be queried as though they were one (this would be ideal)?
Otherwise, without asking for the "best" solution, what works well in this situation? Should we merge the separate sources into one database nightly? This is the only thing I can think of off the bat, and I am wondering (hoping) whether other, more elegant or robust solutions exist.
Ideally we'd be able to query the data in real-time, rather than working off of a daily upload or whatever.
If you want to write queries across the two databases, you could link MySQL to SQL Server as a linked server - something like this:
http://coresystems.ch/en/about-us/newsroom/category/blog/how-add-linked-server-connection-mysql-mssql/
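For example, once the linked server is set up, a query on the SQL Server side could join its own tables against the MySQL data through OPENQUERY - roughly like the sketch below, where the linked server name MYSQL_LINK and all table and column names are just placeholders:

-- Hypothetical example: join local billing data to customers living in MySQL
-- (MYSQL_LINK, webapp.customers, and all column names are placeholders)
SELECT b.InvoiceId, b.Amount, c.name
FROM dbo.Billing AS b
JOIN OPENQUERY(MYSQL_LINK, 'SELECT id, name FROM webapp.customers') AS c
    ON c.id = b.CustomerId;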
If you don't mind using a third-party reporting engine, you can give DBxtra a spin: it lets you combine different databases in a single query to produce a report, and it even lets you do so graphically, so you don't have to write the query yourself.
This site has been great for a Symfony newbie such as myself, and hopefully this will be the same experience. I have searched a lot for this question, so maybe I am not using the right terminology. I have read about using services, but none of the examples cover what I need: using multiple databases with different tables. So here goes: first off, I am at the mercy of the current database design; I can't merge the databases or recreate them, I have to use them as is. Here is the MySQL query I want to use:
select name, title, rank
from db1.tbl1
join db2.tbl2 on db1.tbl1.person_id = db2.tbl2.person_id
join db2.tbl1 on db2.tbl1.id = db2.tbl2.id;
I have created connections to the databases in parameters.yml and config.yml. I was thinking about creating a repository for one of the entities and then having it inner join the other tables from the same database, but I couldn't find any examples. I want to do this using best practices. I am all ears.
I should also mention that all the databases are managed by the same server.
You can't use multiple databases in a single query, because for multiple databases to work in Doctrine you need a separate entity manager for each.
I can't think of a solution using arrays or objects that isn't resource-intensive, because you would need to load at least one entire table.
I am trying to build a database in SQL Server that replicates the exact data present in tables in an Oracle production database. The SQL Server database will be used for reporting and analysis. I want every new or updated row in the Oracle tables to be present in the SQL Server tables within about one hour. Does SQL Server Integration Services help with this? Is there any tool that makes sure the data in the Oracle tables and the SQL Server tables is always the same (allowing for the one-hour lag)?
There are two things you could look into: replication and SSIS. SQL Server replication allows you to replicate data from Oracle to MSSQL, so that would be one way to handle the data copy. On the other hand, if you plan on doing data transformations, mappings, etc., then you might want to use SSIS, because it is a full ETL tool.
One important question is how you can identify new data in Oracle, because that may determine at least the first part of your solution. And you then have to decide what transformations are necessary once you've copied the data into SQL Server; perhaps you will need to run some stored procedures to clean the data and put it into reporting tables. Since your reporting system is a different platform from the source, you will need to handle data type transformations at some point, whatever solution you choose.
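For instance, if the Oracle tables have (or can be given) a last-modified timestamp column, the extract for each run can be as simple as this sketch; the table and column names are assumptions, and :last_extract_time would come from wherever you record the previous run:

-- Pull only the rows changed since the last extract (names are placeholders)
SELECT *
FROM orders
WHERE last_updated > :last_extract_time;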
Your question is quite general, and it isn't really possible to say what you should do without a lot more detail about your environment, your requirements, your resources and so on. I suggest that you try to break down your task into smaller ones, and then you should be able to ask more specific questions.
We currently have an OLTP SQL Server 2005 database for our project. We are planning to build a separate, de-normalized reporting database so that we can take the load off our OLTP DB. I'm not quite sure which approach is best for syncing these databases. We are not looking for a real-time system, though. Is SSIS a good option? I'm completely new to SSIS, so I'm not sure about its feasibility. Kindly provide your inputs.
Everyone has their own opinion of SSIS, but I have used it for years for data marts, and my current environment is a full BI installation. I personally love its capabilities for moving data, and it still holds the world record for moving 1.13 terabytes in under 30 minutes.
As for setup, we use log shipping from our transactional DB to populate a second box, then use SSIS to de-normalize and warehouse the data. The community around SSIS is also very large, and there are tons of free training materials and helpful resources online.
We build our data warehouse using SSIS, and we run reports from it. It's a big learning curve and the errors it throws aren't particularly useful, and it helps to be good at SQL rather than treating it as a "row by row transfer" - what I mean is that you should be writing set-based queries in SQL command tasks rather than using lots of SSIS components and data flow tasks.
Understand that every warehouse is different, and you need to decide how best to do yours. This link may give you some good ideas.
How we implement ours (we have a Postgres back end and use the PGNP provider; making use of linked servers could make your life easier):
First of all, you need a time-stamp column in each table so you can tell when it was last changed.
Then write a query that selects the data that has changed since you last ran the package (using an audit table would help) and get that data into a staging table. We run this as a data flow task because (using Postgres) we don't have any other choice, although you may be able to use a normal reference to another database (dbname.schemaname.tablename or something like that) or a linked server query. Either way the idea is the same: you end up with the data that has changed since your last run.
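As a rough sketch, the extract step looks something like this; all of the object names here are placeholders, and yours will differ:

-- Load only the rows changed since the last successful run into staging,
-- using a last-run timestamp recorded in an audit table (placeholder names)
INSERT INTO staging.Jobs (JobId, CustomerId, JobDate, Status, LastModified)
SELECT j.JobId, j.CustomerId, j.JobDate, j.Status, j.LastModified
FROM source.Jobs AS j
WHERE j.LastModified > (SELECT MAX(LastRunTime)
                        FROM etl.AuditLog
                        WHERE PackageName = 'LoadJobs');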
We then update (based on id) the data that already exists, then insert the new data (left joining the staging table to the warehouse table to find out what doesn't already exist there).
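In SQL terms the update-then-insert step is roughly the following (again, table and column names are just placeholders):

-- Update rows that already exist in the warehouse
UPDATE w
SET w.Status = s.Status,
    w.JobDate = s.JobDate
FROM warehouse.Jobs AS w
JOIN staging.Jobs AS s ON s.JobId = w.JobId;

-- Insert rows that don't exist yet; the left join finds the gaps
INSERT INTO warehouse.Jobs (JobId, CustomerId, JobDate, Status)
SELECT s.JobId, s.CustomerId, s.JobDate, s.Status
FROM staging.Jobs AS s
LEFT JOIN warehouse.Jobs AS w ON w.JobId = s.JobId
WHERE w.JobId IS NULL;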
So now we have one denormalised table that shows, in this case, jobs per day. From this we calculate other tables based on aggregated values from this one.
Hope that helps, here are some good links that I found useful:
Choosing .Net or SSIS
SSIS Talk
Package Configurations
Improving the Performance of the Data Flow
Transformations
Custom Logging / Good Blog
Being an SSIS newbie, I am trying to figure out the best possible way to transfer multiple tables from one database to another. I could write multiple parallel data flows, one for each table; however, I want to be smart about it.
For each of the tables, if I were to generalize:
I need to transfer rows from one table to a table in another database
I need to count the number of rows transferred
I need to record the start and finish time of the data transfer for each table
I need to record any errors
I am trying not to use stored procedures, since I don't want people to have to dig deep into the DB to find the rules for this transformation. I would ideally like this done at the SSIS level, using components that can be seen and understood visually.
Are there any best practices that people have used before?
I would ideally want to do something like
foreach (table in list of tables to transfer)
transfer (table name)
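To make the bookkeeping concrete, for each table I essentially want the equivalent of the following T-SQL, just expressed with SSIS components so it stays visible in the package (the log table, database names, and the Customers table here are only placeholders):

-- Placeholder log table for the audit requirements above
CREATE TABLE dbo.TransferLog (
    TableName       sysname,
    StartTime       datetime2,
    EndTime         datetime2,
    RowsTransferred int,
    ErrorMessage    nvarchar(4000) NULL
);

-- What one table transfer should amount to (placeholder names throughout)
DECLARE @start datetime2 = SYSUTCDATETIME(), @rows int;
BEGIN TRY
    INSERT INTO TargetDb.dbo.Customers (Id, Name)
    SELECT Id, Name FROM SourceDb.dbo.Customers;
    SET @rows = @@ROWCOUNT;
    INSERT INTO dbo.TransferLog VALUES ('dbo.Customers', @start, SYSUTCDATETIME(), @rows, NULL);
END TRY
BEGIN CATCH
    INSERT INTO dbo.TransferLog VALUES ('dbo.Customers', @start, SYSUTCDATETIME(), 0, ERROR_MESSAGE());
END CATCH;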
To make a generic table handler you would have to construct the data flow programmatically; AFAIK SSIS has no auto-introspection facility. A Script Task will let you do this, and you can get the table metadata from the source, but constructing the data flow programmatically means fiddling with the API.
I have worked on a product where this was done, although I didn't develop that component, so I can't offer words of wisdom off the top of my head as to how to do it. However, you can find resources on the web that explain how to do it.
You can find the table structure and the types of the columns by querying the system data dictionary. See this posting for some links to resources describing how to do this, including a link to a code sample.
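For instance, on the SQL Server side a minimal metadata query looks like this (the schema and table names are placeholders):

-- List the columns and their types for one table
SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH, IS_NULLABLE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_SCHEMA = 'dbo'
  AND TABLE_NAME = 'Customers'
ORDER BY ORDINAL_POSITION;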
What is your destination database doing with this info? Is it simply reading it?
Perhaps you would be best served by replicating the tables.
You could create a config table that lists the tables you want to move and then use a Foreach loop to process them one by one... but that still leaves the question of what to do for each table.
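A minimal sketch of such a config table might be the following (the names and sample rows are just placeholders):

-- Hypothetical config table that a Foreach Loop container could read
CREATE TABLE dbo.TablesToTransfer (
    SourceTable sysname NOT NULL,
    TargetTable sysname NOT NULL,
    Enabled     bit NOT NULL DEFAULT 1
);

INSERT INTO dbo.TablesToTransfer (SourceTable, TargetTable)
VALUES ('dbo.Customers', 'dbo.Customers'),
       ('dbo.Orders',    'dbo.Orders');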
http://blogs.conchango.com/jamiethomson/archive/2005/02/28/SSIS_3A00_-Dynamic-modification-of-SSIS-packages.aspx
Below the bullet points, he states that SSIS packages cannot be modified to change metadata at run time. And as for making it easy to maintain... you're going in the wrong direction.
I'd keep it simple: use the wizard and then customize it with logging, notifications, etc.
Maybe you can call a stored procedure from inside your SSIS package. Here is an example of how you might be able to use an SP:
http://blog.sqlauthority.com/2012/10/31/sql-server-copy-data-from-one-table-to-another-table-sql-in-sixty-seconds-031-video/
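The pattern there is presumably an INSERT INTO ... SELECT, possibly wrapped in a procedure; a sketch along those lines might look like this (all database, table, and column names are placeholders):

CREATE PROCEDURE dbo.CopyCustomers
AS
BEGIN
    -- Copy rows from the source table into the destination table
    INSERT INTO TargetDb.dbo.Customers (Id, Name)
    SELECT Id, Name
    FROM SourceDb.dbo.Customers;
END;

You could then call it from an Execute SQL Task in the package.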
I'm trying to split a database into two pieces -- a backend that updates automatically, and a front-end that allows searching and adding/editing comments. The data in the source database is pulled together from multiple tables into a pair of queries, and I want to use these queries as the source of the current database.
Access 2007 supports splitting a database into multiple pieces, but not in the way I'm looking for. It keeps the tables in the source database and puts all the forms, queries, reports, and macros into the new database. The tables and queries are already in the back-end, and this new database should just provide a good GUI to the end-user.
Access 2007 also supports linked tables, but these can only use a table as a source, not a query object.
I was thinking that the best way to do this would be to do a SQL query along the lines of
SELECT * FROM SourceQuery IN "C:\Path\To\ExternalDB.accdb";
Is what I'm working towards even possible, and would this be the best way to do it?
Since it's still relatively early in the project, rearchitecting the database isn't out of the question, but it is something I'd prefer to avoid.
You described the usual Access BE-FE division correctly: only tables go in the back-end. I'm aware that not all DB programs do it that way, but this is Access, and my approach would be to honor the usual division. (And you hardly have a choice, in that you can't "link to a query" in Access.)
Reviewing your comment ('There is a specific reason ...'), I think this would possibly mean:
adding a few more tables to the back-end, essentially buckets (import-data in ready form; export 1; export 2) that allow all users to get to consistent processed data;
making a small admin FE that sits next to the BE and stores your modules, queries for export, and export routines; and
having some redundant queries on the user FE. This is vexing in my own work. I just try to design sturdy stable "building block" queries in those roles, and keep their number to a minimum.
Hope I'm understanding you correctly, but the most sensible solution would be to link the tables from the back-end DB and copy the queries into the UI database. Those queries would still be able to access the underlying tables (via the linked tables) without issues and would be accessible through normal means to your forms and VBA code.
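Once the back-end tables are linked, a query saved in the UI database treats them just like local tables, for example (the table and column names here are hypothetical):

SELECT c.CustomerName, b.Amount, b.BillingDate
FROM Customers AS c
INNER JOIN Billing AS b ON b.CustomerID = c.CustomerID;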
Is there a particular reason you don't want the queries in the UI database?