I'm trying to import data from a MySQL table into Hive using Sqoop. From what I understand, there are two ways of doing that:
1. Import the data into HDFS, then create an external table in Hive and load the data into that table.
2. Use --create-hive-table in the Sqoop command to create a new table in Hive and load the data into it directly. I'm trying the second approach but can't get it to work for some reason.
This is my code:
sqoop import \
--connect jdbc:mysql://localhost/EMPLOYEE \
--username root \
--password root \
--table emp \
--m 1 \
--hive-database sqoopimport \
--hive-table sqoopimport.employee \
--create-hive-table \
--fields-terminated-by ',';
I tried adding --hive-import as well, but got an error.
When I ran the above command, the job succeeded, but no table was created in Hive; the data ended up in /user/HDFS/emp/, where HDFS/emp was created during the job.
PS: I also couldn't find any reason for using --m 1 with Sqoop; it's just there in all the examples.
I got the import working with the following command. There is no need to pass --create-hive-table; you can just give a new table name with --hive-table and that table will be created. Also, if you run into a lock issue, go to the Hive metastore location, run rm *.lck, and then try the import again.
sqoop import \
--connect jdbc:mysql://localhost/EMPLOYEE \
--username root \
--password root \
--table emp4 \
--hive-import \
--hive-table sqoopimport.emp4 \
--fields-terminated-by "," ;
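For completeness, the first approach from the question (import into HDFS, then point an external Hive table at the data) can be sketched roughly as below. The target directory and the column schema (id, name) are assumptions for illustration, not from the original post:

```shell
# Step 1: import the MySQL table into a plain HDFS directory,
# with no Hive options at all (directory name is hypothetical).
sqoop import \
  --connect jdbc:mysql://localhost/EMPLOYEE \
  --username root \
  --password root \
  --table emp \
  --target-dir /user/hdfs/emp_ext \
  --fields-terminated-by ',' \
  -m 1

# Step 2: create an external Hive table on top of that directory.
# The column list here is a made-up example; match your real schema.
hive -e "
  CREATE EXTERNAL TABLE sqoopimport.emp_ext (
    id INT,
    name STRING
  )
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
  LOCATION '/user/hdfs/emp_ext';
"
```

Since the table is external, dropping it in Hive leaves the imported files in HDFS untouched.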
I want to Sqoop a big table from MySQL.
The table has about 300,000,000 rows and is serving live traffic.
I want the import to finish as quickly as possible without putting too much load on the system.
My Sqoop command is:
sqoop import \
--connect jdbc:mysql://.... \
--username ... \
--password ... \
--query "SELECT * FROM table_name where \$CONDITIONS" \
--hive-import \
--hive-overwrite \
--hive-table ........ \
--create-hive-table \
--mapreduce-job-name .... \
--num-mappers $mapper_count \
--fetch-size $patch_count \
--split-by "account_id"
I want to move the whole table, 500,000 rows at a time, using 100 mappers.
I tried setting --num-mappers=100 and --fetch-size=500000, but only --num-mappers seemed to take effect, so the production database is stressed by each mapper moving about 3,000,000 rows.
Please advise.
Thanks.
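As I understand it (worth verifying against the Sqoop user guide), --fetch-size only controls how many rows the JDBC driver buffers per round trip; the per-mapper workload comes from --split-by dividing the key range across --num-mappers. A back-of-the-envelope check with the figures from the post shows why each mapper ends up with 3,000,000 rows regardless of the fetch size:

```shell
# Rough arithmetic using the numbers quoted in the question.
total_rows=300000000   # approximate rows in the MySQL table
mappers=100            # --num-mappers
rows_per_mapper=$((total_rows / mappers))
echo "rows per mapper: $rows_per_mapper"   # matches the observed 3,000,000
```

So limiting the batch to 500,000 rows per unit of work would need more mappers (or a narrower --query/--where per run), not a different --fetch-size.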
I need to import data from MySQL to HDFS, and I'm doing that with Apache Sqoop. But I also need to export data from HDFS back to MySQL, and I need to update one column of that data (while it is in HDFS) before moving it to MySQL. How can I do this?
You can update the column with a Hive query and store the Hive output to HDFS using INSERT OVERWRITE DIRECTORY '<path>', then use the Sqoop command below:
sqoop export \
--connect jdbc:mysql://master/poc \
--username root \
--table employee \
--export-dir /user/hdfs/mysql/export.txt \
--update-key id \
--update-mode allowinsert \
--fields-terminated-by '\t' \
-m 1
Hope this helps..
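The Hive step mentioned above might look roughly like the sketch below. The table name, columns, output path, and the transformation itself (upper-casing a name column) are all placeholder assumptions for illustration:

```shell
# Hypothetical Hive step: rewrite the data with the updated column
# into an HDFS directory that sqoop export can then read.
hive -e "
  INSERT OVERWRITE DIRECTORY '/user/hdfs/mysql/export_dir'
  ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
  SELECT id, upper(name), salary
  FROM employee_staging;
"
```

The delimiter in the Hive output must match the --fields-terminated-by value passed to sqoop export, and --export-dir should point at the directory the query wrote to.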
How does Sqoop map an imported CSV file to the columns of my SQL table? I just ran the import and export Sqoop commands below and they worked properly, but I'm not sure how Sqoop mapped the imported result onto the table's columns. I also have a manually created CSV file that I want to export to MySQL, so I need a way to specify the CSV file and its column mapping.
sqoop import \
--connect jdbc:mysql://mysqlserver:3306/mydb \
--username myuser \
--password mypassword \
--query 'SELECT MARS_ID , MARKET_ID , USERROLE_ID , LEADER_MARS_ID , CREATED_TIME , CREATED_USER , LST_UPDTD_TIME , LST_UPDTD_USER FROM USERS_TEST u WHERE $CONDITIONS' \
-m 1 \
--target-dir /idn/home/data/user
I deleted the records from the MySQL database and ran the export command below, which inserted the data back into the table.
sqoop export \
--connect jdbc:mysql://mysqlserver:3306/mydb \
--table USERS_TEST \
--export-dir /idn/home/data/user \
--username myuser \
--password mypassword
You can use the --input-fields-terminated-by and --columns parameters to control the structure of the data being exported back to the RDBMS through Sqoop.
I'd recommend referring to the Sqoop user guide for more information.
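A sketch of what that could look like for the USERS_TEST export above. The column list mirrors the SELECT in the import command, but treat the exact names and the comma delimiter as assumptions about your CSV layout:

```shell
# Map CSV fields positionally onto specific table columns:
# --columns lists the target columns in the same order as the fields
# in each CSV line; --input-fields-terminated-by names the delimiter.
sqoop export \
  --connect jdbc:mysql://mysqlserver:3306/mydb \
  --username myuser \
  --password mypassword \
  --table USERS_TEST \
  --export-dir /idn/home/data/user \
  --input-fields-terminated-by ',' \
  --columns "MARS_ID,MARKET_ID,USERROLE_ID,LEADER_MARS_ID,CREATED_TIME,CREATED_USER,LST_UPDTD_TIME,LST_UPDTD_USER"
```

Without --columns, Sqoop assumes the fields appear in the table's natural column order, which is why the original round-trip export worked unchanged.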
While executing Sqoop export jobs to MySQL, I'm facing the following exceptions:
No columns to generate for ClassWriter
Cannot load connection class because of underlying exception: 'java.lang.NumberFormatException: For input string: "3306;"'.
Please help to resolve this issue.
The NumberFormatException points to a stray semicolon after the port number in your JDBC URL (the driver is trying to parse "3306;" as a number), so check your --connect string first. In general, to export data from HDFS to a MySQL table you can use the following syntax:
sqoop export \
--connect jdbc:mysql://<HOST_NAME (or) IP_ADDRESS>:<MySQL_PORT_NO>/<DATABASE_NAME> \
--username <DB_USER_NAME> \
--password <DB_PASSWORD> \
--table <YOUR_TABLE_NAME> \
--export-dir <HDFS_PATH_FROM_WHERE_YOU_WANT_TO_EXPORT_DATA>
For example, if I want to export the data in /mysql/user/ on HDFS to the MySQL user table, the command is as follows:
sqoop export \
--connect jdbc:mysql://localhost:3306/test \
--username hadoopuser \
--password **** \
--table user \
--export-dir /mysql/user/
I have my data stored in a Hive table.
I want to transfer selected data from the Hive table to a MySQL table using Sqoop.
Please guide me on how to do this.
Check out the Sqoop guide here.
You need to use sqoop export; here is an example:
sqoop export --connect "jdbc:mysql://quickstart.cloudera:3306/retail_rpt_db" \
--username retail_dba \
--password cloudera \
--table departments \
--export-dir /user/hive/warehouse/retail_ods.db/departments \
--input-fields-terminated-by '|' \
--input-lines-terminated-by '\n' \
--num-mappers 2
sqoop export exports data from Hadoop to MySQL. The options used above:
--connect                      JDBC URL
--username                     MySQL username
--password                     password for the MySQL user
--table                        MySQL table name
--export-dir                   valid Hadoop directory
--input-fields-terminated-by   column delimiter in Hadoop
--input-lines-terminated-by    row delimiter in Hadoop
--num-mappers                  number of mappers to process the data