How to export a subset of data from MongoDB as JSON

I currently have a large database, and I need a means of backing up subsets of the data so that they can then be imported into another MongoDB instance.
For example, I would need to find all documents that contain a key, so essentially find({key: 'somekey'}), and then export that data set. I thought about simply running the query in NodeJS and saving the results as JSON, but I don't think this is optimal: from my understanding, importing the JSON data again (if needed in the future) won't be a straightforward task, since the data types will be lost.
So my question is: how would I go about exporting a subset of the dataset so that it can be re-imported into another MongoDB instance on another server?

Thanks to @Veeram's comment, the way to do this is to dump as BSON so that it retains all the data structure:
sudo mongodump -d DB_Name -c Collection -q '{"key_name": "value"}' --out /home/collection
Then to import it back:
sudo mongorestore -d DB_Name -c Collection /home/collection/DB_Name/Collection.bson
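The question actually asks for documents that contain a key, not just one specific value; mongodump accepts the same kind of query with $exists, so a hedged variation of the command above (reusing the same placeholder names) would be:
sudo mongodump -d DB_Name -c Collection -q '{"key_name": {"$exists": true}}' --out /home/collection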

Related

Getting a RECORDS Array in MongoDB

I imported JSON data from MySQL using this command:
mongoimport --db your --collection categories categories.json --type json
But when I started searching the data I found an issue: the MongoDB collection has a RECORDS array, and the documents were not imported as objects with their ids like in the original.
Does anyone know how to import data from MySQL into MongoDB so that it comes in as objects, without the extra RECORDS array?
I think RECORDS is coming from your MySQL export; please check your JSON file by opening it in an editor like Sublime.
Here is how to export from MySQL to MongoDB with clean JSON objects:
Install the gem:
gem install mysql2xxxx
Then run:
mysql2json --user=root --database=yourdb --execute "select * from categories" > cat.json
After running the above command you should get clean records in JSON format. I don't know how you are importing, but I don't think RECORDS should appear.
After doing this you can use:
mongoimport --db your --collection categories cat.json --type json
Hopefully this will work.
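One thing to double-check (an assumption about mysql2json's output, not something verified here): if cat.json ends up as a single JSON array rather than one document per line, mongoimport needs the --jsonArray flag:
mongoimport --db your --collection categories --type json --jsonArray cat.json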

Mongoimport without mongoexport

Suppose I have ssh access to a server with MongoDB on it. However, suppose the server does not have mongoexport installed, and I cannot install it. I can use mongo, either interactively or by feeding it a script. I wish to export a subset of the data and import it on my local computer. Ideally, I'd like to run a script or command that saves the data in the same format as mongoexport, so I can import it with mongoimport locally. https://stackoverflow.com/a/12830385/513038 doesn't work (as it has extra line breaks in the results), nor does using printjsononeline instead, because some values get printed differently and I end up with "Bad characters" and "expecting number" errors when I run mongoimport.
Any ideas? Again, I'd like to use mongoimport if possible, but other sufficiently workable ways are acceptable, as well.
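One workable alternative, sketched under the assumption that the remote mongod listens on localhost:27017 and that you have the mongo tools (including mongoexport) installed locally: forward the port over ssh and run mongoexport on your own machine against the tunnel, so nothing extra needs to be installed on the server.
# in one terminal: forward local port 27018 to the server's mongod (user/host are placeholders)
ssh -N -L 27018:localhost:27017 user@server
# in another terminal: run mongoexport locally through the tunnel (names and query are placeholders)
mongoexport --host localhost --port 27018 --db DB_Name --collection Collection --query '{"key": "value"}' --out subset.json
The resulting file is ordinary mongoexport output, so mongoimport will accept it locally without any reformatting.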

How can I easily convert a Django app from mySQL to PostgreSQL?

Has anyone done this? Is it an easy process? We're thinking of switching over for transactions and because mysql seems to be "crapping out" lately.
Converting MySQL database to Postgres database with Django
First, back up the data from the old MySQL database as JSON fixtures:
$ python manage.py dumpdata contenttypes --indent=4 --natural-foreign > contenttype.json
$ python manage.py dumpdata --exclude contenttypes --indent=4 --natural-foreign > everything_else.json
Then switch your settings.DATABASES to postgres settings.
Create the tables in Postgresql:
$ python manage.py migrate
Now delete all the content that migrate creates automatically (Django content types, user groups, etc.):
$ python manage.py sqlflush | python manage.py dbshell
And now you can safely import everything, and keep your pk's the same!
$ python manage.py loaddata contenttype.json
$ python manage.py loaddata everything_else.json
Tested with Django==1.8
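One hedged addition to the steps above: if the PostgreSQL sequences end up out of sync with the primary keys carried in the fixtures, Django's sqlsequencereset command will generate the SQL to realign them (the app names below are placeholders):
$ python manage.py sqlsequencereset app1 app2 | python manage.py dbshell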
I just used this tool to migrate an internal app and it worked wonderfully. https://github.com/maxlapshin/mysql2postgres
You can do that using Django serializers to output the data from MySQL's format into JSON and then back into Postgres. There are some good articles on the internet about that:
Migrating Django from MySQL to PostgreSQL the Easy Way
Move a Django site to PostgreSQL: check
I've never done it personally, but it seems like a combination of the dumpdata and loaddata options of manage.py would be able to solve your problem quite easily. That is, unless you have a lot of database-specific things living outside the ORM such as stored procedures.
I've not done it either.
I'd first follow this migration guide; there is a MySQL section which should take care of all your data. Then in Django just switch MySQL to Postgres in the settings. I think that should be OK.
I found another question on Stack Overflow which should help with converting MySQL to Postgres here.
python manage.py dumpdata > data.json
Create the database and user in PostgreSQL.
Set the newly created PostgreSQL database as the default database in the Django settings, or pass --database=your_postgresql_database in the next steps.
Run syncdb to create the tables:
python manage.py syncdb [--database=your_postgresql_database] --noinput
Create a dump of the schema without data, drop all tables, and load that dump; or truncate all tables (the django_content_type table gets data that may not match your old data, which leads to many errors). At this step we need empty tables in the PostgreSQL database.
When you have empty tables in the PostgreSQL database, just load your data:
python manage.py loaddata data.json
And have fun!
I wrote a Django management command that copies one database to another:
https://gist.github.com/mturilin/1ed9763ab4aa98516a7d
You need to add both databases in the settings and use this command:
./manage.py copy_db from_database to_database app1 app2 app3 --delete --ignore-errors
What's cool about this command is that it recursively copies dependent objects. For example, if a model has two foreign keys and two many-to-many relationships, it will copy the referenced objects first to ensure you won't get a foreign key violation error.

Import Multiple .sql dump files into mysql database from shell

I have a directory with a bunch of .sql files that are mysql dumps of each database on my server.
e.g.
database1-2011-01-15.sql
database2-2011-01-15.sql
...
There are quite a lot of them actually.
I need to create a shell script, or probably a one-liner, that will import each database.
I'm running on a Linux Debian machine.
I'm thinking there is some way to pipe the results of an ls into some find command or something...
Any help and education is much appreciated.
EDIT
So ultimately I want to automatically import one file at a time into the database.
E.g. if I did it manually on one it would be:
mysql -u root -ppassword < database1-2011-01-15.sql
cat *.sql | mysql? Do you need them in any specific order?
If you have too many to handle this way, then try something like:
find . -name '*.sql' | awk '{ print "source",$0 }' | mysql --batch
This also gets around some problems with passing script input through a pipeline, though you shouldn't have any problems with pipeline processing under Linux. The nice thing about this approach is that the mysql utility reads in each file instead of having it read from stdin.
A one-liner that reads in all the .sql files and imports them:
for SQL in *.sql; do DB=${SQL/\.sql/}; echo importing $DB; mysql $DB < $SQL; done
The only trick is the bash substring replacement to strip out the .sql to get the database name.
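If the databases don't exist yet and the files follow the dated naming from the question, a slightly expanded sketch (the date-stripping pattern and the root credentials are assumptions to adapt):
for SQL in *.sql; do
  DB=${SQL%-????-??-??.sql}   # strip the trailing "-YYYY-MM-DD.sql" to recover the database name
  echo "importing $DB"
  mysql -u root -ppassword -e "CREATE DATABASE IF NOT EXISTS \`$DB\`"
  mysql -u root -ppassword "$DB" < "$SQL"
done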
There is a superb little script at http://kedar.nitty-witty.com/blog/mydumpsplitter-extract-tables-from-mysql-dump-shell-script which will take a huge mysqldump file and split it into a single file for each table. Then you can run this very simple script to load the database from those files:
for i in *.sql
do
echo "file=$i"
mysql -u admin_privileged_user --password=whatever your_database_here < $i
done
mydumpsplitter even works on .gz files, but it is much, much slower than gunzipping first, then running it on the uncompressed file.
I say huge, but I guess everything is relative. It took about 6-8 minutes to split a 2000-table, 200MB dump file for me.
I don't remember the exact mysql syntax, but it will be something like this:
find . -name '*.sql'|xargs mysql ...
I created a script some time ago to do precisely this, which I called (completely uncreatively) "myload". It loads SQL files into MySQL.
Here it is on GitHub
It's simple and straightforward: it allows you to specify mysql connection parameters, and will decompress gzipped sql files on the fly. It assumes you have a file per database, and the base of the filename is the desired database name.
So:
myload foo.sql bar.sql.gz
This will create the databases "foo" and "bar" (if they don't exist), and import each sql file into the corresponding one.
For the other side of the process, I wrote this script (mydumpall) which creates the corresponding sql (or sql.gz) files for each database (or some subset specified either by name or regex).

MySQL Memory engine + init-file

I'm trying to set up a MySQL database so that the tables run on the MEMORY engine. I don't really care about losing some of the data that gets populated, but I would like to dump it daily (via mysqldump in a cron job) and point the init-file at this dump. However, I can't seem to figure out how to make the mysqldump output compatible with how the init-file wants the SQL statements to be formatted.
Am I just missing something completely obvious trying to set up a database this way?
MySQL dumps are exactly that -- dumps of the MySQL database contents as SQL. So, there isn't any way to read this directly as a database file.
What you can do, is modify your init script for MySQL to automatically load the last dump (via the command line) every time MySQL starts.
An even better solution would be to use a ramdisk to hold the entire contents of your database in memory, and then periodically copy this to a safe location as your backup.
Although, if you want to maintain the contents of your databases at all, you're better off just using one of the disk-based storage engines (InnoDB or MyISAM), and just giving your server a lot of RAM to use as a cache.
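For reference, a minimal sketch of the ramdisk idea mentioned above (the mount point, size, and backup destination are placeholders, and MySQL's datadir would still have to be pointed at the mount):
# create a 512MB tmpfs ramdisk
sudo mount -t tmpfs -o size=512M tmpfs /mnt/mysql-ram
# periodically copy its contents somewhere persistent (e.g. from cron)
rsync -a /mnt/mysql-ram/ /var/backups/mysql-ram/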
The solution of splitting the dump on semicolons (described below) is almost great, but it causes problems when string values in the table data contain semicolons: all of them are replaced with a newline character.
Here is how I implemented this:
mysqldump --comments=false --opt dbname Table1 Table2 > /var/lib/mysql/mem_tables_init.tmp1
#Format dump file - each statement into single line; semicolons in table data are preserved
grep -v -- ^-- /var/lib/mysql/mem_tables_init.tmp1 | sed ':a;N;$!ba;s/\n/THISISUNIQUESTRING/g' | sed -e 's/;THISISUNIQUESTRING/;\n/g' | sed -e 's/THISISUNIQUESTRING//g' > /var/lib/mysql/mem_tables_init.tmp2
#Add "USE database_name" instruction
cat /var/lib/mysql/mem_tables_init.tmp2 |sed -e 's/DROP\ TABLE/USE\ `dbname`;\nDROP\ TABLE/' > /var/lib/mysql/mem_tables_init.sql
#Cleanup
rm -f /var/lib/mysql/mem_tables_init.tmp1 /var/lib/mysql/mem_tables_init.tmp2
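For completeness, the generated file still has to be referenced from the server configuration; a hedged sketch, assuming /etc/mysql/my.cnf is the active config file (in option files the setting may also be written as init_file):
# append an init-file setting pointing at the generated script
printf '[mysqld]\ninit-file=/var/lib/mysql/mem_tables_init.sql\n' | sudo tee -a /etc/mysql/my.cnf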
My understanding is that the --init-file is expecting each SQL statement on a single line and that there are no comments in the file.
You should be able to clear up the comments with:
mysqldump --comments=false
As for getting each SQL statement onto one line, I'm not familiar with a mysqldump option to do that, but you can use a line of Perl to remove all of the newlines:
perl -pi -w -e 's/\n//g;' theDumpFilename
I don't know if --init-file will like it or not, but it's worth a shot.
The other thing you could do is launch mysql from a script that also loads in a regular mysqldump file. Not the solution you were looking for, but it might accomplish the effect you're after.
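A minimal sketch of that wrapper approach (the service name, credentials, and dump path are placeholders, and the sleep is only a crude stand-in for properly waiting on the server):
service mysql start
sleep 5
mysql -u root -ppassword dbname < /var/backups/latest_dump.sql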
I stumbled onto this, so I'll tell you what I do. First, I have an ip->country db in a memory table. There is no reason to try to "save" it; it's easily and regularly dropped and recreated, but it may be unpredictable how the PHP will act when it's missing, and it's only scheduled to be updated weekly. Second, I have a bunch of other memory tables. There is no reason to save these either, as they are even more volatile, with lifespans in minutes. They will be refreshed very quickly, but stale data is better than none at all. Also, if you are using any separate key caches, they may (in some cases) need to be loaded first or you will be unable to load them. And finally, be sure to put a "use" statement in there if you're not dumping complete databases, as there is no other interface (like the mysql client) to open the database at startup. So:
cat << EOF > /var/lib/mysql/initial_load.tmp
use fieldsave_db;
cache index fieldsave_db.search in search_cache;
EOF
mysqldump --comments=false -udrinkin -pbeer# fieldsave_db ip2c \
>> /var/lib/mysql/initial_load.tmp
mysqldump --comments=false -ufields -pavenue -B memtables \
>> /var/lib/mysql/initial_load.tmp
grep -v -- ^-- /var/lib/mysql/initial_load.tmp |tr -d '\012' \
|sed -e 's/;/;\n/g' > /var/lib/mysql/initial_load.sql
As always, YMMV, but it works for me.
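And since the question mentions regenerating the dump daily from cron, the block above can live in a script run from a crontab entry along these lines (the script path and schedule are placeholders):
0 3 * * * /usr/local/bin/rebuild_initial_load.sh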