In our network, we have very strict access control for the MySQL database. After writing a Sqoop command we discovered that Sqoop tries to connect to MySQL from one of the servers in the Hadoop cluster. Servers in the Hadoop cluster are not able to connect to the MySQL database.
Is there any way to tell Sqoop to connect to our MySQL server from the local machine where we execute the Sqoop command?
Sqoop normally runs as a MapReduce job. Therefore the processes that copy the data from MySQL can run on any host in the cluster, and often there will be several processes reading data from MySQL concurrently.
However, I think you can run Sqoop in "local" mode, so it does not run on MapReduce, by passing -jt local to the command, e.g.:
sqoop [tool-name] -jt local [tool-arguments]
However, since this stops the transfer running in parallel, it may be much slower.
Note, I have not tested this myself.
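For reference, this is roughly what a full import command might look like. It is only a sketch; the hostname, credentials, table and target directory are placeholders:
# Untested sketch: run the import in-process on the machine where sqoop
# is invoked instead of as a MapReduce job on the cluster. All connection
# details below are placeholders.
sqoop import -jt local \
    --connect jdbc:mysql://mysql.example.com:3306/sales \
    --username sqoop_user -P \
    --table orders \
    --target-dir /user/etl/orders \
    -m 1
The generic -jt option has to come before the tool-specific arguments, and -m 1 keeps the transfer to a single process.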
I have installed Hadoop and Hive with a MySQL metastore. I started the Hadoop daemons and then started the Hive shell. The problem I am facing is that when I quit the Hive shell with the "quit" command, my Hadoop daemons also get stopped. After that, when I restart the Hadoop daemons using start-dfs.sh and start-yarn.sh, the NameNode, DataNode and ResourceManager do not start. What is the problem with my configuration? Can anyone help me out?
Oh! I got it!
The thing is, first I started the Hadoop daemons, then I checked them using jps, and I got:
and after that, I issued a query in Hive to check the available tables:
After the Hive query, I again checked the Hadoop daemons using jps, but I got nothing on the terminal:
So whenever I run Hive-related commands, the daemons disappear from the terminal; I am not able to see them using jps.
Even though the daemons are not showing in the terminal, they are in fact still running in the background. I confirmed this when I issued a command to create a directory in HDFS and it was created:
I also checked the NameNode and cluster web UIs, and they showed all the information.
Okay! But my concern is: how do I stop those Hadoop daemons running in the background without restarting my machine?
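For what it's worth, one way to stop them without a reboot, assuming they were started with the standard scripts (untested here, and the grep pattern is only illustrative):
# First try the standard stop scripts, which use the recorded pid files:
stop-yarn.sh
stop-dfs.sh
# If the scripts complain there is "no namenode to stop" because the pid
# files are stale, find the daemon JVMs by hand and terminate them:
ps -ef | grep -iE 'namenode|datanode|resourcemanager|nodemanager' | grep -v grep
kill <pid>   # replace <pid> with the process ids printed above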
This is something I need to figure out: my company runs a number of production RDS instances on AWS. Some of the MySQL RDS instances run 5.7, and I need to downgrade MySQL to 5.6 or 5.5. Is this functionality provided by AWS?
Scenario: a MySQL server is already up and running with MySQL version 5.7; downgrade it to 5.6.
-> If this is possible, what are the possible ways?
-> How do I do this?
This is not something that AWS provides out of the box; however, it can be solved with the two approaches below, depending on your database size and the amount of downtime you can accept.
It might be worth considering fixing application compatibility instead of downgrading the DB, which is the riskier operation.
1. Dump, restore and switch method
Dump your currently running database with the mysqldump utility. Start a new RDS instance with the downgraded engine and load your dumped data into it. Then switch your application to the RDS instance with the downgraded engine.
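A rough sketch of the dump-and-load step; the endpoints, database name and credentials below are placeholders:
# Dump the source (5.7) instance; --single-transaction gives a consistent
# snapshot for InnoDB tables without locking them.
mysqldump -h old-57-instance.xxxxxxxx.us-east-1.rds.amazonaws.com \
    -u admin -p --single-transaction --routines --triggers \
    mydb > mydb.sql
# Load the dump into the new, downgraded (5.6) instance.
mysql -h new-56-instance.xxxxxxxx.us-east-1.rds.amazonaws.com \
    -u admin -p mydb < mydb.sql
Keep in mind that anything in the dump that uses 5.7-only features (JSON columns, generated columns, newer sql_mode defaults) may fail to load into 5.6 and will need manual editing.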
2. Dump, restore, replicate & switch method
Dump your currently running database with the mysqldump utility. Start a new RDS instance with the downgraded MySQL engine and load your dumped data into it.
Set the new, downgraded DB instance up as a read replica of your old DB instance using mysql.rds_set_external_master, then start replication using mysql.rds_start_replication. Stop writes to your original DB and, once the read replica has caught up (you must monitor replication lag), run mysql.rds_reset_external_master, which will promote your downgraded instance and turn off replication. Finally, point your application at the downgraded RDS DB instance.
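A sketch of that sequence, run against the new (downgraded) instance; the hostnames, credentials and binlog coordinates are placeholders (take the log file and position from the source, e.g. a mysqldump made with --master-data). Also note that replicating from a newer major version to an older one is not officially supported, so test this path carefully.
# Point the downgraded instance at the old instance as an external master
# (arguments: host, port, repl user, repl password, binlog file, position,
# ssl flag) and start replication.
mysql -h new-56-instance.xxxxxxxx.us-east-1.rds.amazonaws.com -u admin -p -e "
  CALL mysql.rds_set_external_master(
    'old-57-instance.xxxxxxxx.us-east-1.rds.amazonaws.com', 3306,
    'repl_user', 'repl_password',
    'mysql-bin-changelog.000012', 120, 0);
  CALL mysql.rds_start_replication;"
# Watch the replication lag until it reaches 0, then stop writes upstream.
mysql -h new-56-instance.xxxxxxxx.us-east-1.rds.amazonaws.com -u admin -p \
    -e "SHOW SLAVE STATUS\G" | grep Seconds_Behind_Master
# Detach the downgraded instance so it becomes a standalone master.
mysql -h new-56-instance.xxxxxxxx.us-east-1.rds.amazonaws.com -u admin -p \
    -e "CALL mysql.rds_reset_external_master;"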
Method 2 will shorten your downtime to a minimum, but it is a bit more complex to execute. Here is a command reference worth getting familiar with: MySQL on Amazon RDS SQL Reference.
You will also find plenty of examples in the RDS documentation - Importing and Exporting Data From a MySQL DB Instance.
Quick help here...
I have these 2 MySQL instances... We are not going to pay for this service anymore, so they will be gone... How can I obtain a backup file that I can keep for the future?
I do not have much experience with MySQL, and all the threads talk about mysqldump, which I don't know whether it is valid for this case. I also see the option to take a snapshot, but I want a file I can save (like a .bak).
See screenshot:
Thanks in advance!
You have several choices:
You can replicate your MySQL instances to MySQL servers running outside AWS. This is a bit of a pain, but will result in a running instance. http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/MySQL.Procedural.Exporting.NonRDSRepl.html
You can use the command-line mysqldump --all-databases to generate a (probably very large) .sql file from each database server instance. Export and Import all MySQL databases at one time
You can use the command-line mysqldump to export one database at a time. This is what I would do (see the sketch at the end of this answer).
You can use a GUI MySQL client -- like HeidiSQL -- in place of the command line to export your databases one at a time. This might be easier.
You don't need to, and should not, export the mysql, information_schema, or performance_schema databases; these contain system information and will already exist on another server.
In order to connect from outside AWS, you'll have to set the AWS access controls (security group and public accessibility settings) appropriately. And you'll have to know the internet address, username, and password (and maybe port) of the MySQL server at AWS.
Then you can download HeidiSQL (if you're on windows) or some appropriate client software, connect to your database instance, and export your databases at your leisure.
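As a sketch of the per-database export, where the endpoint, credentials and database names are placeholders (use the values shown in your RDS console):
# Export each application database to its own .sql file for archival.
# The mysql, information_schema and performance_schema databases are
# deliberately left out.
HOST=mydb.xxxxxxxx.us-east-1.rds.amazonaws.com
for DB in appdb reporting; do
    mysqldump -h "$HOST" -P 3306 -u admin -p \
        --single-transaction --routines --triggers \
        "$DB" > "${DB}.sql"
done
Repeat this for the second instance and keep the resulting .sql files; they can be replayed later with mysql < file.sql on any compatible MySQL server.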
We're running Linux VMs with MySQL on Azure and want to start using Azure SQL, but first we need to get the data from one into the other.
Is there a way to dump a MySQL database and then import it into an Azure SQL database?
I'm on a Mac (or can be on Linux), so the .NET tools won't help.
I've tried having Azure use the MySQL dump. It reads it, but nope.
I've tried selecting the MySQL tables from an open connection and dropping them onto the Azure DB, also in an open connection, via Navicat. Nope.
I also tried looking for something in SQLPro for MSSQL. Nope.
Also, I'm willing to edit the MySQL dump if there are minor global changes that would let Azure SQL read it.
You can:
1. Install a MySQL instance on a Windows-based server.
2. Dump all your databases into it using mysqldump.
3. Use the full spectrum of Microsoft tools for your goal.
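A possible sketch of step 2, with placeholder host, user and database names; --compatible=mssql (available in the 5.x mysqldump) only smooths out some quoting and syntax, so data types and DDL will most likely still need converting, for example with a migration tool such as SQL Server Migration Assistant for MySQL:
# Dump one database in a form that is slightly easier to adapt for SQL
# Server; expect to edit or convert the result further before loading it
# into Azure SQL.
mysqldump -h mysql-vm.internal -u dumper -p \
    --compatible=mssql --skip-add-locks --no-create-db \
    appdb > appdb_mssql.sql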
I am trying to back up a MySQL database on a Cloud Foundry app. The database is tunneled via caldecott and I can connect using mysql.
My database is only 40k so far, and when I use mysqldump it takes ages; after 10 minutes, with 30%-60% of the database dumped (depending on the run), it stops with the error Lost connection to MySQL server during query (2013).
Any hints?
Tricky one. I guess if your route to Cloud Foundry is a little slow, you could dump each table in turn. You could do this by opting for no client when creating the tunnel and then issuing the mysqldump commands yourself for the individual tables.
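Something along these lines, where the local port, credentials and database name come from the tunnel output and the table names are only examples:
# Dump one table at a time through the caldecott tunnel. --quick streams
# rows instead of buffering whole tables, which can help on slow links.
for TABLE in users orders order_items; do
    mysqldump -h 127.0.0.1 -P 10000 -u tunnel_user -p \
        --quick --single-transaction dbname "$TABLE" > "${TABLE}.sql"
done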
Failing that, you might try using VMC from a different node on the internet to see if the connection is any quicker.