I have an application that runs on a MySQL database; the application is somewhat resource-intensive on the DB.
My client wants to connect QlikView to this DB for reporting. I was wondering if someone could point me to a white paper or URL on the best way to do this without causing locks etc. on my DB.
I have searched Google to no avail.
QlikView is an in-memory tool with preloaded data, so your client only has to pull data during periodic reloads, not all the time.
The best approach is for your client to schedule the reload once per night and make it incremental. If your tables only ever receive new records, load each night only the records with a primary key greater than the last one loaded.
If your tables have modified records, you need to add a last_modified_time field in MySQL, and ideally also an index on that field:
last_modified_time TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
If rows get deleted, it is best to mark them with deleted=1 in MySQL instead; otherwise your client will need to reload everything from those tables just to find out which rows were deleted.
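Putting those pieces together, a minimal sketch of the schema change, assuming an example table named orders (adjust names to your schema):

-- adds a modification timestamp, a soft-delete flag and an index for incremental loads
ALTER TABLE orders
  ADD COLUMN last_modified_time TIMESTAMP
      DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  ADD COLUMN deleted TINYINT(1) NOT NULL DEFAULT 0,
  ADD INDEX idx_last_modified_time (last_modified_time);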
Additionally, to save resources, your client should load the data table by table in a really simple style, without JOINs:
SELECT [fields] FROM TABLE WHERE `id` > $(vLastId);
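For tables with modified rows, the same idea works against the timestamp column. Here vLastReloadTime is a hypothetical QlikView variable holding the previous reload time; it is not part of the original setup:

SELECT [fields], deleted FROM orders WHERE `last_modified_time` >= '$(vLastReloadTime)';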
QlikView is really good and fast at data modelling/joins, so your client can build the whole data model inside QlikView.
Reporting can indeed cause problems on a busy transactional database.
One approach you might want to examine is to have a replica (slave) of your database. MySQL supports this very well, and the replica's data can be as up to date as you require. You could then attach any reporting system to the replica to run heavy reports that won't affect your main database. This also gives you a second copy of your data, and the replica can further be used to create offline backups, again without affecting your main database.
There's lots of information on the setup of MySQL replicas so that's not too hard.
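For reference, once a snapshot of the main database has been restored onto the replica, the replica-side setup boils down to something like this. The host, user, password and binlog coordinates are placeholders, and newer MySQL versions use CHANGE REPLICATION SOURCE TO / START REPLICA instead:

CHANGE MASTER TO
  MASTER_HOST     = 'main-db.example.com',   -- placeholder host
  MASTER_USER     = 'repl',                  -- replication user created on the main server
  MASTER_PASSWORD = 'replica-password',
  MASTER_LOG_FILE = 'binlog.000042',         -- coordinates noted when the snapshot was taken
  MASTER_LOG_POS  = 4;
START SLAVE;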
I hope that helps.
I have a MySQL database on my server, and a Windows WPF application from which my clients will be inserting and deleting rows corresponding to their data. There may be hundreds of users working on the application at the same time, and they will be inserting or deleting rows in the db.
My question is whether all these database operations can run concurrently without problems, or whether I should look at some other alternative.
PS: There won't be any clashes on rows during insertion/deletion, as each user can only add or remove his or her own data.
My question is whether all these database operations can run concurrently without problems ...
Yes, like most other relational database systems, MySQL supports concurrent inserts, updates and deletes so this shouldn't be an issue provided that the operations don't conflict with each other.
If they do, you need to find a way to manage concurrency.
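A minimal sketch of one common way to manage it, doing the read-modify-write inside a transaction with a row lock; the table and column names are invented for illustration:

START TRANSACTION;
-- lock the row so a concurrent session waits instead of clashing
SELECT quantity FROM user_items WHERE user_id = 42 AND item_id = 7 FOR UPDATE;
UPDATE user_items SET quantity = quantity - 1 WHERE user_id = 42 AND item_id = 7;
COMMIT;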
Oftentimes I use bash scripts to add massive amounts of data to my localhost site databases. Once I see that the new data works properly on my local website, I export the database from phpMyAdmin and edit the SQL file; granted, with vim it is relatively easy to change all the INSERTs to INSERT IGNORE and so on to prepare the file to be accepted by phpMyAdmin in cPanel and finally add the data to my website. This becomes cumbersome as the database gets bigger and bigger.
I am new to this and don't know how to do this in a professional/optimal way. Is my entire process wrong? How do you do it?
Thank you for your answers.
Ah, I think I understand better. I can't answer for any kind of specific enterprise environment, but I'm sure there are many different systems cobbled together with all sorts of creative baler twine and you could get a wide variety of answers to this question.
I'm actually working on a project right now where we're trying to keep data updated between two systems. The incoming data gets imported to a MySQL database and every now and then, new data is exported to a .sql file. Each row has an auto incrementing primary key "id", so we very simply keep track of the last exported ID and start the export from there (using mysqldump and the --where argument). This is quite simple and doesn't feel like an "enterprise-ready" solution, but it's fine for our needs. That avoids the problem of duplicated inserts.
Another solution would be to export the entire database from your development system, then through some series of actions import it to the production server while deleting the old database entirely. This could depend greatly on the size of your data and how much downtime you're willing to accept. An efficient and robust implementation of this would import to a staging database (and verify there were no import problems) before moving the tables to the "correct" database.
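The staging-then-swap step can be done atomically in MySQL with RENAME TABLE, which swaps several tables in one operation. A sketch, with illustrative table names:

-- after the *_staging tables have been loaded and verified
RENAME TABLE customer TO customer_old,
             customer_staging TO customer;
-- drop customer_old once the swap has been confirmed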
If you are simply referring to schema changes or very small amounts of data, then probably version control is your best bet. This is what I do for some of my database schemas; basically you start out with the base schema, then any change gets written as a script that can be run incrementally. So for instance, in an inventory system I might have originally started with a customer table, with fields for ID and name. Later I added a marketing department, and they want me to get email addresses. 2-email.sql would be this line: ALTER TABLE `customer` ADD `email` VARCHAR(255) NOT NULL AFTER `name`;. Still later, if I decide to handle shipping, I'll need to add mailing addresses, so 3-address.sql adds that to the database. Then on the other end, I just run those through a script (bonus points are awarded for using MySQL logic such as "IF NOT EXISTS" so the script can run as many times as needed without error).
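On the "IF NOT EXISTS" bonus points: stock MySQL (unlike MariaDB) has no ADD COLUMN IF NOT EXISTS, so one common workaround is to check information_schema first and only run the ALTER when the column is missing. A hedged sketch of what 3-address.sql could look like, reusing the customer table from the example above:

SET @col_exists := (SELECT COUNT(*)
                    FROM information_schema.COLUMNS
                    WHERE TABLE_SCHEMA = DATABASE()
                      AND TABLE_NAME   = 'customer'
                      AND COLUMN_NAME  = 'address');
SET @ddl := IF(@col_exists = 0,
               'ALTER TABLE `customer` ADD `address` VARCHAR(255) NOT NULL AFTER `email`',
               'SELECT 1');   -- no-op when the column is already there
PREPARE stmt FROM @ddl;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;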
Finally, you might benefit from setting up a replication system. Your staging database would automatically send all changes to the production database. Depending on your development process, this can be quite useful or might just get in the way.
I am looking into migrating my MySQL DB to Azure Database for MySQL https://azure.microsoft.com/en-us/services/mysql/. It currently resides on a server hosted by another company. The DB is about 100 GB. (It worries me that Azure uses the term "relatively large" for 1GB.)
Is there a way to migrate the DB without any or little (a few hours, max) downtime? I obviously can't do a dump and load as the downtime could be days. Their documentation seems to be for syncing with a MySQL server that is already on a MS server.
Is there a way to export the data out of MS Azure if I later want to use something else, again without significant downtime?
Another approach: use Azure Data Factory to copy the data from your MySQL source to your Azure DB. Set up a sync procedure that updates your Azure database with new rows. Sync, take the MySQL DB offline, sync once more and switch to the Azure DB.
See Microsoft online help
Don't underestimate the complexity of this migration.
With 100GB, it's a good guess that most rows in your tables don't get UPDATEd or DELETEd.
For my suggestion here to work, you will need a way to
SELECT * FROM table WHERE (the rows are new or updated since a certain date)
Some INSERT-only tables will have autoincrementing ID values. In this case you can figure out the ID cutoff value between old and new. Other tables may be UPDATEd. Unless those tables have timestamps saying when they were updated, you'll have a challenge figuring it out; you need to understand your data to do that. It's OK if your WHERE (new or updated) operation takes some extra rows that are older. It's NOT OK if it misses INSERTed or UPDATEd rows.
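As a hypothetical illustration (table names, columns and cutoff values are invented; you record the cutoffs just before the bulk export):

-- insert-only table: cutoff on the auto-increment id
SELECT * FROM orders WHERE id > 1848213;
-- updated table: cutoff on a modification timestamp, erring on the side of extra rows
SELECT * FROM customers WHERE updated_at >= '2024-01-01 00:00:00';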
Once you know how to do this for each large table, you can start migrating.
Mass Migration: Keeping your old system online and active, you can use mysqldump to migrate your data to the new server. You can take as long as you require to do it. Read this for some suggestions: "getting Lost connection to mysql when using mysqldump even with max_allowed_packet parameter".
Then, you'll have a stale copy of the data on the new server. Make sure the indexes are correctly built. You may want to use OPTIMIZE TABLE on the newly loaded tables.
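For example (the table name is illustrative):

-- rebuilds the table and refreshes index statistics after the bulk load
OPTIMIZE TABLE orders;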
Update Migration: You can then use your WHERE (the rows are new or updated) queries to migrate the rows that have changed since you migrated the whole table. Again, you can take as long as you want to do this, keeping your old system online. It should take much less time than your first migration, because it will handle far fewer rows.
Final Migration, offline: Finally, you can take your system offline and migrate the remaining rows, the ones that changed since your last migration. And migrate your small tables in their entirety, again. Then start your new system.
Yeah but, you say, how will I know I did it right?
For best results, you should script your migration steps, and use the scripts. That way your final migration step will go quickly.
You could rehearse this process on a local server on your premises. While 100GiB is big for a database, it's not an outrageous amount of disk space on a desktop or server-room machine.
Save the very large extracted files from your mass migration step so you can re-use them when you flub your first attempts to load them. That way you'll save the repeated extraction load on your old system.
You should stand up a staging copy of your migrated database (at your new cloud provider) and test it with a staging copy of your application. You may be able to do this with a small subset of your rows. But do test your final migration step with this copy to make sure it works.
Be prepared for a fast rollback to the old system if the new one goes wrong.
AND, maybe this is an opportunity to purge out some old data before you migrate. This kind of migration is difficult enough that you could make a business case for extracting and then deleting old rows from your old server, before you start migrating.
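If you do purge, a rough sketch is to archive the old rows and then delete them in small batches so the old server stays responsive. The table name, date column and cutoff below are placeholders:

CREATE TABLE audit_log_archive LIKE audit_log;
INSERT INTO audit_log_archive SELECT * FROM audit_log WHERE created_at < '2015-01-01';
-- delete in batches; repeat until the statement affects 0 rows
DELETE FROM audit_log WHERE created_at < '2015-01-01' LIMIT 10000;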
I want to export updated data from MySQL/PostgreSQL to MongoDB every time a specified table changes or, if that's impossible, dump the whole table to NoSQL every X seconds/minutes. What can I do to achieve this? I've googled and found only paid, enterprise-level solutions, and those are out of reach for my amateur project.
SymmetricDS provides an open-source database replication option that has support for replicating an RDBMS database (MySQL, Postgres) into MongoDB.
Here is the specific documentation to setup the Mongo target node in SymmetricDS.
http://www.symmetricds.org/doc/3.11/html/user-guide.html#_mongodb
There is also a blog about setting up Mongo in a bit more detail.
https://www.jumpmind.com/blog/mongodb-synchronization
To get online replication into a target database you can:
Feed the incoming data stream into both databases at the same time
Use an enterprise solution that reads the transaction log and pushes the data to the next database
Check periodically for change dates > X
Export the table periodically
Write changed records to a changelog table with a trigger and poll that table for the changes (see the sketch after this list)
Push the changed data with triggers into a data-streaming service that feeds the next database
Many other approaches exist
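A minimal sketch of the trigger-and-changelog option; the orders table and its columns are invented for illustration:

CREATE TABLE order_changelog (
  change_id  BIGINT AUTO_INCREMENT PRIMARY KEY,
  order_id   INT NOT NULL,
  changed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

CREATE TRIGGER orders_after_update
AFTER UPDATE ON orders
FOR EACH ROW
  INSERT INTO order_changelog (order_id) VALUES (NEW.id);

-- the polling job then reads rows with change_id greater than the last id it processed
-- and writes the corresponding documents to MongoDB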
Which solution fits your demand depends on how much time you want to invest and how much lag the data is allowed to have. If the amount of data or the number of transactions grows, some solutions that are fine for an amateur project will no longer hold up.
My problem is that I have a website that customers place orders on. That information goes into orders, ordersProducts, etc. tables. I have a reporting database on a DIFFERENT server that my staff will be processing the orders from. The tables on this server will need the order information AND additional columns so staff can add extra information and update current information.
What is the best way to get information from the one server (order website) to the other (reporting website) efficiently without the risk of data loss? Also I do not want the reporting database to be connecting to the website to get information. I would like to implement a solution on the order website to PUSH data.
THOUGHTS
MySQL replication - Problem: replicated tables are strictly for reporting, not manipulation. For example, what if a customer address changes, or products need to be added to an order? This would mess up the replicated table.
Double inserts - Insert into the local tables and then insert into the reporting database. Problem: if for whatever reason the reporting database goes down, there is a chance I lose data because the MySQL connection won't be able to push it. Implement some sort of query log?
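A rough sketch of what such a query log (outbox) could look like on the order server; the table and column names are invented:

CREATE TABLE sync_queue (
  id         BIGINT AUTO_INCREMENT PRIMARY KEY,
  order_id   INT NOT NULL,
  payload    TEXT NOT NULL,        -- the row data or statement to replay on the reporting DB
  created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- a background PHP job reads from sync_queue, inserts into the reporting database,
-- and deletes each row only after the remote insert succeeds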
Both servers use MySQL and PHP.
MySQL replication sounds exactly like what you are looking for; I'm not too sure I understand what you've listed as the disadvantage there.
The solution to me sounds like a master with a read-only slave, where the slave is the reporting database. If your concern is that changes to the master would put the slave out of sync, this shouldn't be much of an issue; all changes will be synced over. If connectivity is lost, the slave tracks how many seconds it is behind the master and applies the changes until the two are back in sync.
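On the extra columns your staff need: one option that keeps the replicated tables strictly read-only is to hold the staff-entered data in a separate table on the reporting server, keyed by the replicated order id. A sketch with invented names:

CREATE TABLE order_processing (
  order_id     INT PRIMARY KEY,    -- matches orders.id arriving via replication
  processed_by VARCHAR(64),
  staff_notes  TEXT,
  processed_at TIMESTAMP NULL
);
-- reports can then JOIN the read-only replicated orders table to order_processing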