I have the following scenario:
On a Raspberry Pi (Raspbian Jessie) I run a Python script that receives a huge number of individual data sets from a measuring device via UART. These data sets are "cached" in a MariaDB database (engine: MEMORY) and from there sent either via the internet to a database on a remote server or to a database on a USB drive on the RaspPi itself (in cases of temporary loss of connection or no connection at all). If at all possible, the MEMORY-database should have no downtime whatsoever, even if the USB drive has to be changed because it is full. There will be hardly any reads from the USB-database and no deletes or complicated restructurings. This setup is intended to run for years.
According to my research, I have three options:
1) Mounting the USB drive into the datadir of the MEMORY-database (as has been suggested here)
2) Creating and running two instances of MariaDB, one (MEMORY-database) running forever, one (USB-database) being stopped occasionally to allow for a change of the USB drive (something along the lines of this)
3) Running two instances of MariaDB in a sandbox (e.g. following this description)
My question is: what is the best way to achieve the functionality described above? I worry that in scenario 1 the whole MariaDB instance might crash if I mess (unmounting, formatting...) with parts of its datadir. Scenarios 2 and 3 seem preferable to that, but I don't know which one to choose or if I am mistaken altogether.
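As a rough illustration of the data flow described above (not the actual script), the "remote first, USB database as fallback" routing could be sketched as follows. Everything named here is an assumption made for the example: SQLite's in-memory database stands in for the USB database, and `flush_cache`, `offline_sender` and the `measurements` table are hypothetical. It does not address which of the three options is preferable.

```python
import sqlite3


def flush_cache(rows, send_remote, local_db):
    """Try the remote server first; on a connection error store rows locally.

    Returns "remote" or "local" so the caller knows where the batch went.
    """
    try:
        send_remote(rows)
        return "remote"
    except ConnectionError:
        # Remote unreachable: append to the fallback database instead.
        with local_db:  # commits on success, rolls back on error
            local_db.executemany(
                "INSERT INTO measurements (ts, value) VALUES (?, ?)", rows
            )
        return "local"


def offline_sender(rows):
    # Hypothetical stand-in that simulates a lost internet connection.
    raise ConnectionError("remote server unreachable")


# In-memory SQLite stands in for the database on the USB drive (assumption).
usb_db = sqlite3.connect(":memory:")
usb_db.execute("CREATE TABLE measurements (ts REAL, value REAL)")

batch = [(1.0, 20.5), (2.0, 20.7)]
destination = flush_cache(batch, offline_sender, usb_db)
stored = usb_db.execute("SELECT COUNT(*) FROM measurements").fetchone()[0]
```

Passing the sender in as a function keeps the routing independent of how the remote connection is actually made.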
I just used MySQL Workbench to connect to my ClearDB account, which is connected to an Azure web app. The problem is that even though I ran a query that drops and recreates the tables in the newly made schema, so that it exactly mirrors the tables and data on my previous live server, when I go to mysite.azurewebsites.com/wp-admin I get an error in establishing the database connection: Site could not be found. Check if your database contains the following pages: wp_blogs, ..........
What could be the problem? Does this process just need a bit of time to propagate all the data?
EDIT: something to note, which might be a factor, when I ran the last query, it also included dropping/adding the table "wp_users" so all previous data was wiped and replaced with the info from a previous live server.
Normally you will see any changes made immediately. But because your database is hosted on a geoseparated cluster in circular replication there are some rare circumstances where this might not be true.
Specifically, this can happen if your delete/write went to one master and your read query went to another. Data propagation is normally immediate, but if one of the nodes is offline or the system is unusually busy there can be a delay.
I have read a few/lots of things on this but they don't seem to help much.
I have an app (it's called "TieUp", but that is irrelevant) that I run manually every day to collate data from several locations.
It is using as sources:
A) Data from a remote SOAP source, loaded into an in-memory TClientDataset via an XMLtransform setup.
B) CSV files downloaded daily and loaded into an in-memory TClientDataset
C) A MySQL database on the same computer as the program (it's a restored backup of the live source)
D) A remote MS-SQL (SQLServer 2008) database
E) A MySQL database on a remote server
Data is only read from sources A, B, C and D.
Data source E is updated with the consolidated data.
There are between 800 and 2000 records daily, so the datasets are not vast, although the target (E) has grown to around 150,000 records and is increasing daily.
I can normally run this all happily and everything works as expected (if a little slowly because of all the individual remote lookups to the MS-SQL system), but some days it really screws up and the error is always "Catastrophic Failure!".
The failure does not occur during any particular phase or operation that I can see. The steps are:
1) Get the SOAP(A) data first.
2) Tie in with CSV/In Memory data(B).
3) Lookup References data on Sources C and D to collate
4) Write the consolidated data to source E
After reading the data into the in-memory datasets, everything is in TClientDatasets accessed via DatasetProviders linked to TSQLQueries (they are all on the same servers currently, but I did it that way to keep some flexibility in case it goes true three-tier in future). All queries are contained within the SQLQuery components as they are actually quite simple - it's just a matter of tying things together.
I am using completely standard components from Delphi 2009 Enterprise. All updates and database update packs have been applied. Each data source has its own DataModule; these are auto-created at startup.
There is obviously quite a lot of data access going on here, but when it crashes (with catastrophic failure) it gets stuck, completely stuck. Windows can't end the task from the normal "TieUp has stopped working" dialog; I have to go to the process and kill it.
There is so much going on and as this only happens once a week or so I really don't know where to start looking.
The reason for asking the question is twofold: 1) I am trying to eliminate any manual stuff and fully automate it, but I can't rely on it if it bombs every week or so. 2) If it happens in the update phase to E, I have to manually delete the new records for the day and start again, as I do not have (or haven't written yet) a mechanism to restart from a random point and I would still have to query the DB manually to establish that point for certain.
My next step is to install Delphi on another computer and always run it under the debugger until I can catch it, if it does not freeze first. But that introduces yet another different network connection (instead of the local host one).
So: "Is there a definite answer?" or what is the most likely offending component/connection? Where is the favoured place to start looking?
Thanks in advance...
I've been asked for a quick turnaround on this. The group I'm assisting has a .MDB database used by offsite workers who don't have internet access all the time. For that reason, way back, the team implemented an Access DB which allows for synchronization.
As their team grew bigger they started running into the following issues:
Remote synching – when a user tries to synch from a worksite, more often than not the database will crash, either due to loss of wireless signal, the program timing out, or the Inspector manually shutting down because it takes too long (i.e., 30 or more minutes)
Multiple synchers – we are unable to synch multiple users at one time (there are currently 34 users in 3 different territories). If someone is synching and another person tries to synch at the same time, the second user will end up with an error message. They will have to shut down their DB and try to synch at a later time.
Incomplete synchs – sometimes when a worker synchs his/her DB, not all the line items will copy over to the Master file, which can cause confusion during review.
Are there any workarounds or items I can look into to resolve these?
I have few resources and little time, so anything involving a new server might not work.
Thanks
It sounds as though you are mainly adding new data from different field operatives, rather than everyone updating existing data. If this is the case then that's good and you could try the following:
Ensure all the tables have "Replication IDs" for the Primary Keys, as this will ensure no two operatives create conflicting records.
The synchronisation process should then be amended to take a snapshot of said table/tables to a .txt file on the operative's machine, and this file is then transferred back to the source machine.
Then at the end of the day, or more often if required, the master copy should be set up to import the new data from all the text files it has received. As there will be no conflicting Primary Keys you should be OK; just remember to insert only those rows where the Primary Key is not already in the table.
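The import step could look roughly like the sketch below. It is an illustration under stated assumptions, not Access code: SQLite stands in for the master database, `uuid.uuid4()` stands in for an Access Replication ID, the snapshot is assumed to be a comma-separated text file with a header row, and the `inspections` table and `import_snapshot` function are hypothetical names.

```python
import csv
import io
import sqlite3
import uuid


def import_snapshot(master, snapshot_file):
    """Insert rows from one operative's snapshot, skipping known primary keys."""
    inserted = 0
    with master:  # one transaction per snapshot file
        for row in csv.DictReader(snapshot_file):
            exists = master.execute(
                "SELECT 1 FROM inspections WHERE id = ?", (row["id"],)
            ).fetchone()
            if exists is None:
                master.execute(
                    "INSERT INTO inspections (id, note) VALUES (?, ?)",
                    (row["id"], row["note"]),
                )
                inserted += 1
    return inserted


# SQLite stands in for the master .MDB file (assumption).
master = sqlite3.connect(":memory:")
master.execute("CREATE TABLE inspections (id TEXT PRIMARY KEY, note TEXT)")

# GUID keys generated on the operative's machine cannot collide in practice.
id_a, id_b = str(uuid.uuid4()), str(uuid.uuid4())
snapshot = f"id,note\n{id_a},roof checked\n{id_b},wiring checked\n"

first_import = import_snapshot(master, io.StringIO(snapshot))
second_import = import_snapshot(master, io.StringIO(snapshot))  # same file again
```

Because rows already present are skipped, importing the same file twice (for example after an interrupted transfer) adds nothing the second time.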
Hope all that makes sense : )
I'm working with PHP & mySQL. I've finally got my head around source control and am quite happy with the whole development (testing) v production v repository thing for the PHP part.
My new quandary is what to do with the database. Do I create one for the test environment and one for the production environment? I currently have just the one which both environments use, leaving my test data sitting there. I kind of feel that I should have two, but I'm nervous in terms of making sure that my production database looks and feels exactly the same as my test one.
Any thoughts on which way to go? And, if you think the latter, what the best way is to keep the two databases the same (apart from the data, of course...)?
Each environment should have a separate database. Script all of the database objects (tables, views, procedures, etc) and store the scripts in source control. The scripts are applied first to the development database, then promoted to test (QA, UAT, etc), then production. By applying the same scripts to each database, they should all be the same in the end.
If you have data that needs to be loaded (code tables, lookup values, etc), script that data load as part of the database creation process.
By scripting everything and keeping it in source control, a database structure can be recreated at any time for any given build level.
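As a small illustration of "same scripts, same result", the sketch below applies one schema script to two databases and compares the resulting structure. The assumptions are mine: SQLite in-memory databases stand in for the dev and production MySQL databases, and `SCHEMA_SCRIPT`, `build_database` and `schema_of` are hypothetical names.

```python
import sqlite3

# The script that would live in source control (contents are illustrative).
SCHEMA_SCRIPT = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(id)
);
CREATE VIEW order_counts AS
    SELECT customer_id, COUNT(*) AS n FROM orders GROUP BY customer_id;
"""


def build_database(script):
    """Create a fresh database from the versioned schema script."""
    db = sqlite3.connect(":memory:")
    db.executescript(script)
    return db


def schema_of(db):
    """Return the database's object definitions in a comparable form."""
    rows = db.execute(
        "SELECT type, name, sql FROM sqlite_master ORDER BY type, name"
    )
    return rows.fetchall()


dev = build_database(SCHEMA_SCRIPT)
prod = build_database(SCHEMA_SCRIPT)
schemas_match = schema_of(dev) == schema_of(prod)

# A change made by hand in one environment only shows up as drift.
dev.execute("ALTER TABLE customers ADD COLUMN email TEXT")
drift_detected = schema_of(dev) != schema_of(prod)
```

On MySQL a similar comparison could be made against `information_schema` instead of `sqlite_master`.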
You should definitely have two. As far as keeping them in sync, you should always create DDL for creating your database objects. Treat these scripts as you do your PHP code - keep them in version control. Anytime you have to modify the test database, make a script to do so, and check it in. Then you can propagate those changes to the production system once you are ready.
As a minimum, one database for each development workstation and one for production. Besides that, you should have one for the test environment, unless you are the only developer and have a setup similar to the production environment.
See also
How do you version your database schema?
It's a common question and has been asked and answered many times.
Thomas Owens: Replication is not usable for versioning schemas - it is for duplicating data. You never want to replicate from dev to production or vice versa.
Once I've deployed my database, any changes made to my development database(s), are done in an SQL script (not a tool), and the script is saved, and numbered.
deploy.001.description.sql
deploy.002.description.sql
deploy.003.description.sql
... etc..
Then I run each of those scripts in order when I deploy.
Then I archive them into a directory called something like
\deploy.YYMMDD\
And start all over.
If I make a mistake, I never go back to the previous deploy script, I'll create a new script and put my fix in there.
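A minimal sketch of running such numbered scripts in order is shown below. It is not the tooling I actually use: SQLite stands in for the real database, and `run_deploy_scripts` plus the three script names and their contents are made up for the example.

```python
import sqlite3
import tempfile
from pathlib import Path


def run_deploy_scripts(db, script_dir):
    """Run deploy.NNN.description.sql files in filename order."""
    applied = []
    # Zero-padded numbers make plain filename sorting match numeric order.
    for path in sorted(Path(script_dir).glob("deploy.*.sql")):
        db.executescript(path.read_text())
        applied.append(path.name)
    return applied


with tempfile.TemporaryDirectory() as tmp:
    # Written out of order on purpose; the runner sorts them.
    scripts = {
        "deploy.002.add_email.sql": "ALTER TABLE users ADD COLUMN email TEXT;",
        "deploy.001.create_users.sql": "CREATE TABLE users (id INTEGER PRIMARY KEY);",
        "deploy.003.fix_email_index.sql": "CREATE INDEX idx_users_email ON users (email);",
    }
    for name, sql in scripts.items():
        (Path(tmp) / name).write_text(sql)

    db = sqlite3.connect(":memory:")  # stand-in for the target database
    applied = run_deploy_scripts(db, tmp)

columns = [row[1] for row in db.execute("PRAGMA table_info(users)")]
```

Script 003 plays the role of the "fix goes in a new script" rule: 002 is left untouched and the correction is appended.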
Good luck
One thing I've been working with is creating a VM with the database installed. You can save the VM as a playfile, including its data. What you can do then is take a snapshot of the playfile, and start up as many different VMs as you want. They can all be identical, or you can modify one or another. Here's the good thing: assuming you have a dev version of the database that you want to go out, you can simply start that VM on your production server instead of the current server.
It's another problem altogether if you have production data that is not on your dev machines. In that case though, one thing you can do is set up a tracking VM. Run replication from your main DB to the tracking VM. When you get to a point where you need to run some alters on the production database, first stop the slave and save a snapshot.
Start an instance of that snapshot, take it out of slave mode entirely, apply your changes, and point your QA box at that database. If it works as intended, you can run the patches against your main production database. If not, bring up the snapshot, and get it replicating off the master again until you are ready to repeat the update test.
I was having the same dilemma. I got stuck thinking that there was a clear dichotomy between production db and development db, i.e. they were two sides of a coin and never the twain shall meet.
A lot of problems disappeared when I stopped making my application 'think' in terms of "Either production db OR development db". Instead my application uses a local db.
When it's running on my virtual (dev) machine, that local db happens to be a dev db. My application doesn't really 'know' that though.
So, for the main part, the problem disappears.
But sometimes I want to run tests using live data, or move data from the code into the live production db and see the results quickly.
This is when I added the concept of a live-read-only db connection. The application treats this differently. It's a bit like how your application might treat a web service like Google Apps. It's 'some external resource that your app uses'.
By default my app uses the local db, and in some very special conditions (in the test suite) it also uses the live-read-only db. (Because it's a read-only connection I don't fear making a mess of the live data during tests.)
So rather than asking the question "dev db OR production db?", my app asks "local db OR live-read-only db".
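To give a feel for the two kinds of connection, here is a minimal sketch. It is not my actual PHP/MySQL code: SQLite is used as a stand-in, with its `mode=ro` URI option playing the role of the read-only connection (on MySQL, a user granted only `SELECT` would be one way to get a similar effect), and `connect_local`, `connect_live_readonly` and the `pages` table are hypothetical names.

```python
import sqlite3
import tempfile
from pathlib import Path


def connect_local(path=":memory:"):
    """The default connection: whatever db is local to this machine."""
    return sqlite3.connect(path)


def connect_live_readonly(path):
    """An external resource: the live data, opened so that writes fail."""
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


with tempfile.TemporaryDirectory() as tmp:
    # Seed a file that plays the part of the live database.
    live_path = Path(tmp) / "live.db"
    seed = sqlite3.connect(live_path)
    seed.execute("CREATE TABLE pages (title TEXT)")
    seed.execute("INSERT INTO pages VALUES ('home')")
    seed.commit()
    seed.close()

    live = connect_live_readonly(live_path)
    titles = [row[0] for row in live.execute("SELECT title FROM pages")]
    try:
        live.execute("INSERT INTO pages VALUES ('oops')")
        write_blocked = False
    except sqlite3.OperationalError:
        write_blocked = True  # the read-only connection refused the write
    live.close()

# The local db accepts writes as usual.
local = connect_local()
local.execute("CREATE TABLE pages (title TEXT)")
local.execute("INSERT INTO pages VALUES ('draft')")
local_rows = local.execute("SELECT COUNT(*) FROM pages").fetchone()[0]
```

The application code only ever asks for one of these two connections; which physical database sits behind "local" depends on the machine it runs on.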
Obviously my situation could be different to yours, but I found this 'breakthrough in understanding' to be most helpful for me.