What's the best way of backing up a Rails app's data? (MySQL)

I need to make a backup system for my Rails app, but this one has to be a little special: it doesn't have to back up all the database info and files in a single file or folder; instead it has to back up the database info and attachment files per user. In other words, each of these backups should be able to regenerate all the information and files for one single user.
My questions are:
Is this possible? What's the best way to do it? And if it's impossible or a bad idea, why is that?
Note: The database is a MySQL one.
Note 2: I use Paperclip for the users' uploads.

I'm guessing you have an app that backs up data when a user clicks on something, right? I'm thinking: get all the info connected to the user (this depends on how you built your user model, so maybe you should have a get_all_info method), then write it out in SQL format to a file which you save as .sql (using either File.new or Logger.new).

I would dump the entire user object and related objects into a single XML dump. As you build the XML, grab out all the referenced files and write the XML plus all the files into one directory, then compress it.
I think there are definitely use cases to have a feature like this, but be sure to have it run in a background process and only when needed in order to not bog down the web server. Take a look at http://github.com/tobi/delayed_job or http://github.com/defunkt/resque.
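Neither answer shows code, so here is a rough, framework-agnostic sketch of the per-user export idea in Python: dump the user's rows, copy their uploaded files alongside the dump, and compress everything into one archive. The table names, foreign keys, connection settings, and paths are placeholders for illustration; in a Rails app you would do the equivalent with ActiveRecord and Paperclip's file paths.

```python
import json
import shutil
import tempfile
from pathlib import Path

import mysql.connector  # assumes mysql-connector-python is installed


def export_user(user_id, attachments_root, out_dir):
    """Dump one user's rows plus their uploaded files into a single zip archive."""
    conn = mysql.connector.connect(
        host="localhost", user="app", password="secret", database="app_production"
    )
    cur = conn.cursor(dictionary=True)

    staging = Path(tempfile.mkdtemp())

    # Hypothetical tables and foreign keys; adjust to whatever your models cover.
    for table, fk in (("users", "id"), ("posts", "user_id"), ("attachments", "user_id")):
        cur.execute(f"SELECT * FROM {table} WHERE {fk} = %s", (user_id,))
        rows = cur.fetchall()
        (staging / f"{table}.json").write_text(json.dumps(rows, default=str, indent=2))

    # Copy the user's Paperclip-style uploads next to the data dump.
    user_files = Path(attachments_root) / str(user_id)
    if user_files.exists():
        shutil.copytree(user_files, staging / "files")

    archive = shutil.make_archive(
        str(Path(out_dir) / f"user_{user_id}_backup"), "zip", staging
    )
    shutil.rmtree(staging)
    conn.close()
    return archive
```

In production this would run from a background worker (delayed_job or Resque, as suggested above) rather than inside the request cycle.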

Related

How to save new Django database entries to JSON?

The git repo for my Django app includes several .tsv files which contain the initial entries to populate my app's database. During app setup, these items are imported into the app's SQLite database. The SQLite database is not stored in the app's git repo.
During normal app usage, I plan to add more items to the database by using the admin panel. However I also want to get these entries saved as fixtures in the app repo. I was thinking that a JSON file might be ideal for this purpose, since it is text-based and so will work with the git version control. These files would then become more fixtures for the app, which would be imported upon initial configuration.
How can I configure my app so that any time I add new entries to the Admin panel, a copy of that entry is saved in a JSON file as well?
I know that you can use the manage.py dumpdata command to dump the entire database to JSON, but I do not want the entire database, I just want JSON for new entries of specific database tables/models.
I was thinking that I could try to hack the save method on the model to try and write a JSON representation of the item to file, but I am not sure if this is ideal.
Is there a better way to do this?
Overriding the save method for something that can go wrong, or that can take longer than it should, is not recommended. You usually override save only when the changes are simple and essential.
You could use signals, but in your case that's extra work. You could instead write a function that does this for you, though not necessarily right after the data is saved; you can do it immediately, but that adds overhead unless it's really important for the file to stay up to date.
I recommend using something like Celery to run a function in the background, separate from all of your Django request handling. You can call it on every data update, or on a schedule (each hour, for example), and have it update your backup file. You can even create a table to monitor the update process.
Which solution is best depends largely on you and on how important the data is. Keep in mind that editing a file can be a heavy process too, so creating a backup once a day, say, might be a better idea anyway.
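To make the signal idea concrete, here is a minimal sketch (the app name, model name, and fixture path are placeholders) using Django's post_save signal and django.core.serializers to append a fixture-style JSON record whenever a new entry is created, e.g. through the admin:

```python
import json
from pathlib import Path

from django.core import serializers
from django.db.models.signals import post_save
from django.dispatch import receiver

from myapp.models import Entry  # hypothetical model you want mirrored as fixtures

FIXTURE_PATH = Path("myapp/fixtures/entries.json")  # tracked in git


@receiver(post_save, sender=Entry)
def append_fixture(sender, instance, created, **kwargs):
    """Keep a JSON fixture file in sync with rows added via the admin."""
    if not created:
        return  # only mirror new entries; edits could be handled similarly

    # serializers.serialize returns fixture-format JSON for a list of objects.
    record = json.loads(serializers.serialize("json", [instance]))[0]

    existing = json.loads(FIXTURE_PATH.read_text()) if FIXTURE_PATH.exists() else []
    existing.append(record)
    FIXTURE_PATH.write_text(json.dumps(existing, indent=2))

# Make sure this module is imported at startup, e.g. from AppConfig.ready().
```

If writing the file inline ever becomes too slow, the same body can be moved into a Celery task as described above; and `manage.py dumpdata myapp.Entry` remains the simpler choice when occasionally regenerating the whole fixture is good enough.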

Right way to manage application generated files

tl;dr
In my Node.js application I create PDF documents. What is the best/right way to save them? Right now I use a Node.js file server and shell.js to do it.
I am working on a Node.js web application to manage apartments and tenants for learning purposes, and at some point I create PDF documents that I want to save under a path like
/documents/building_name/apartment_name/tenant_name/year/example.pdf
Now if the user wants to change the building, apartment, or tenant name via an HTTP PUT request, I change the database but then also want to change the path.
Both work, but I can't write good tests for these functions.
Now a friend told me that it's bad practice to save documents on a file server and that I should use BLOBs instead.
On the other hand, Google doesn't really agree on using BLOBs.
So what is the right way to save documents?
Thanks
Amit
You should first define a source of truth. Unless you're legally obliged to keep copies of those files, and assuming they are not accessed very often, I wouldn't bother storing them at all and would just generate them on request.
If you do store them, keep the DB clean; BLOBs will make it huge. Put the files into cold storage (again assuming they are not accessed too frequently) without those paths. If the paths rely on information that changes often, that won't be performant for either the file server or your system.
Instead, store a revision number in your DB under which the file can be found, and limit the path structure to information that rarely changes.
Like {building}/{apartment}/{tenant}_{revision}.pdf
That, depending on your backup structure, will allow you to time-travel if necessary and doesn't force a re-index all the time.
Note: I don't know too much about your use case.
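As a tiny illustration of that revision-based layout (shown in Python purely for consistency with the other sketches in this thread, not Node.js; all names are made up), the path is derived from slowly-changing fields plus the revision stored in the DB, so renames never force files to move:

```python
from pathlib import Path


def document_path(root, building, apartment, tenant, revision):
    """Build the suggested {building}/{apartment}/{tenant}_{revision}.pdf path.

    building/apartment/tenant are slugs of fields that rarely change; the
    revision comes from the database row for the document.
    """
    return Path(root) / building / apartment / f"{tenant}_{revision}.pdf"


# Example: revision 3 for tenant "miller" in building "main-street-5", apartment "2a".
print(document_path("/var/documents", "main-street-5", "2a", "miller", 3))
```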

What is the best way to routinely import a CSV or XML file into a MS access database?

I have an Access database that keeps track of many different aspects of my company's performance, and I would like to add functionality to keep track of the hours the employees are working.
The hours are all tracked on a website called timetracker. It has a few reporting options, including XML and CSV files, and a favorite-report feature so I can get the same data in the format I want every week.
What I would like to do is find the best process for getting the data from this website, into a table in my database that I can reference.
I will not be the one executing whatever process I come up with and I would really like it to be as easy as possible for whoever it is that does have to do it.
Right now I have a linked table that points to an XML file in our SharePoint folder. I was thinking that maybe we could just run the report and download the file every week, then save it over the old file with the correct sheet names, and the table should update.
What I am wondering is whether anyone can come up with an easier process for this, one that takes the least amount of time and is easiest to write instructions for, so that anyone could execute it.
(Would it maybe be possible to create some sort of macro to actually download the report automatically?)

insert csv file into MySQL with user id

I'm working on a membership site where users are able to upload a CSV file containing sales data. The file will then be read and parsed, and the data will be charted, which will let me create charts dynamically.
My question is how to handle this CSV upload: should it be uploaded to a folder and stored for later, or should it be inserted directly into a MySQL table?
Depends on how much processing needs to be done, I'd say. If it's "short" data and processing is quick, then your upload-handling script should be able to take care of it.
If it's a large file and you'd rather not tie up the user's browser/session while the data's parsed, then do the upload-now-and-deal-with-it-later option.
It depends on how you think the users will use this site.
What do you estimate the size of the files for these users to be?
How often would they (if ever) upload a file twice, can they download the charts?
If the files are small and intended more for one-off use, you could upload and process them on the fly; if they require repeated access and analysis, you will save the users time by importing the data into the database.
The LOAD DATA INFILE command in MySQL handles uploads like that really nicely. If you create the table you want to load into and then use that command, it works great and is super quick; I've loaded several thousand rows of data in under 5 seconds with it.
http://dev.mysql.com/doc/refman/5.5/en/load-data.html
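A minimal sketch of driving that from Python (assuming mysql-connector-python and a server with local_infile enabled; the database, table, columns, and file path are placeholders):

```python
import mysql.connector  # pip install mysql-connector-python

# local_infile must be allowed on both the client and the server.
conn = mysql.connector.connect(
    host="localhost",
    user="app",
    password="secret",
    database="sales",
    allow_local_infile=True,
)
cur = conn.cursor()

# Load the user's uploaded CSV straight into a staging table.
cur.execute(
    """
    LOAD DATA LOCAL INFILE '/tmp/upload_1234.csv'
    INTO TABLE sales_staging
    FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"'
    LINES TERMINATED BY '\\n'
    IGNORE 1 LINES
    (sale_date, product, quantity, amount)
    """
)
conn.commit()
conn.close()
```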

How does one properly cache/update data-driven iPhone apps that use remote databases?

My app is highly data driven and needs to be updated frequently. Currently the MySQL database is dumped to an XML file via PHP, and when the app loads it downloads this file. It then loads all the values into NSMutableArrays inside a data manager class which can be accessed anywhere in the app.
Here is the issue: the XML file produced is about 400 KB, and this apparently takes several minutes to download over the EDGE network, and even for some people on 3G. So basically I'm looking for options on how to correctly cache or optimize my app's download process.
My current thought is something along the lines of caching the entire XML file on the iPhone's disk, then serving that data up as the user navigates the app while loading the new XML file in the background. The problem with this is that the user is now always going to see the data from the previous run; it also seems wasteful to download the entire XML file every time if only one field has changed.
TLDR: My iPhone app's download of data is slow, how would one properly minimize this effect?
I've had to deal with something like this in an app I developed over the summer.
What I did to solve it was an initial download of all the data from the server, which I placed in a database on the client along with a revision number.
Then each time the user connects again, the app sends its revision number to the server; if that revision number is lower than the server's revision number, the server sends across the new data (and only the new data); if it's the same, it does nothing.
It's fairly simple and it seems to work pretty well for me.
This method does have the drawback that your server has to do a little more processing than normal but it's practically nothing and is much better than wasted bandwidth.
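The original server side here was PHP; purely to illustrate how little extra processing the server does, here is a hedged sketch of the revision check in Python with Flask (the table, column, and endpoint names are invented, and SQLite stands in for whatever database the server actually uses):

```python
import sqlite3

from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route("/sync")
def sync():
    """Return only the rows changed since the client's last known revision."""
    client_rev = int(request.args.get("revision", 0))
    db = sqlite3.connect("app.db")
    db.row_factory = sqlite3.Row

    server_rev = db.execute("SELECT MAX(revision) AS rev FROM items").fetchone()["rev"] or 0
    if client_rev >= server_rev:
        return jsonify(revision=server_rev, items=[])  # nothing new: empty delta

    rows = db.execute(
        "SELECT * FROM items WHERE revision > ?", (client_rev,)
    ).fetchall()
    return jsonify(revision=server_rev, items=[dict(r) for r in rows])
```

The client stores the returned revision number and sends it back on the next sync, exactly as described in the answer above.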
My suggestion would be to cache the data to a SQLite database on the iPhone. When the application starts, you sync the SQLite database with your remote database...while letting the user know that you are loading incremental data in the background.
By doing that, you get the following:
Users can use the app immediately with stale data.
You're letting the user know new data is coming.
You're storing the data in a more appropriate format.
And once the most recent data is loaded...the user gets to see it.