Can we export special characters using sqoop? - mysql

I'm trying to export one of my tables from Hive to MySQL using sqoop export. The Hive table data contains special characters.
My Hive "special_char" table data:
1 じゃあまた
2 どうぞ
My Sqoop Command:
sqoop export --verbose --connect jdbc:mysql://xx.xx.xx.xxx/Sampledb --username abc --password xyz --table special_char --direct --driver com.mysql.jdbc.Driver --export-dir /apps/hive/warehouse/sampledb.db/special_char --fields-terminated-by ' '
After running the above sqoop export command, the data is stored as question marks (???) instead of the actual messages with special characters.
MySQL "special_char" table:
id message
1 ?????
2 ???
Can anyone please help me store the special characters instead of question marks (???)?

Specify proper encoding and charset in the JDBC URL as below:
jdbc:mysql://xx.xx.xx.xxx/Sampledb?useUnicode=true&characterEncoding=UTF-8
sqoop export --verbose --connect "jdbc:mysql://xx.xx.xx.xxx/Sampledb?useUnicode=true&characterEncoding=UTF-8" --username abc --password xyz --table special_char --direct --driver com.mysql.jdbc.Driver --export-dir /apps/hive/warehouse/sampledb.db/special_char --fields-terminated-by ' '
Please verify the charset encoding for Japanese characters and use the proper one.
Reference: https://community.hortonworks.com/content/supportkb/198290/native-sqoop-export-from-hdfs-fails-for-unicode-ch.html
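If question marks still appear, it is also worth confirming that the MySQL table itself uses a character set that can hold the Japanese text. A minimal check, assuming the database and table names from the question and access to the mysql client:
# Inspect the table definition and its character set (names taken from the question)
mysql -u abc -p -e "SHOW CREATE TABLE Sampledb.special_char\G"
# If the column charset is latin1, UTF-8 data cannot be stored even with a UTF-8 JDBC URL.
# Converting the table to utf8mb4 is one possible fix (test on a copy first):
mysql -u abc -p -e "ALTER TABLE Sampledb.special_char CONVERT TO CHARACTER SET utf8mb4"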

Related

Export Data from HiveQL to MySQL using Oozie with Sqoop

I have a regularly updated table in Hive that I want to have in one of my tools, which uses a MySQL database. I can't just connect my application to the Hive database, so I want to export that data directly into the MySQL database.
I've searched a bit and found out that it was possible with Sqoop, and I've been told to use Oozie since I want to regularly update the table and export it.
I've looked around for a while and tried some stuff but so far I can't succeed, and I just don't understand what I'm doing.
So far, the only code I understand, but which doesn't work, looks like this:
export --connect jdbc:mysql://myserver
--username username
--password password
--table theMySqlTable
--hive-table cluster.hiveTable
I've seen people use a temporary table and dump it to a txt file before exporting it, but I'm not sure I can do that.
Should Oozie have specific parameters too? I'm not the administrator, so I'm not sure I'm able to do that...
Thank you!
Try this.
sqoop export \
--connect "jdbc:sqlserver://servername:1433;databaseName=EMP;" \
--connection-manager org.apache.sqoop.manager.SQLServerManager \
--username userid \
-P \
--table theMySqlTable \
--input-fields-terminated-by '|' \
--export-dir /hdfs path location of file/part-m-00000 \
--num-mappers 1
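Since the target in the original question is MySQL rather than SQL Server, a minimal sketch adapted to that setup (the host, port, database name, and the warehouse path for cluster.hiveTable are assumptions; \001 is Hive's default field delimiter):
sqoop export \
--connect "jdbc:mysql://myserver:3306/mydb" \
--username username \
-P \
--table theMySqlTable \
--export-dir /user/hive/warehouse/cluster.db/hiveTable \
--input-fields-terminated-by '\001' \
--num-mappers 1
Once this runs from the shell, the same arguments can be moved into an Oozie sqoop action for the regular schedule.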

Sqoop - Error while exporting from hive to mysql

I have a problem using sqoop to export hive bigint data to mysql.
The type of the column in mysql and hive is bigint.
I get the following error:
Caused by: java.lang.NumberFormatException: For input string: "3465195470"
...
at java.lang.Integer.parseInt(Integer.java:583)
It seems that the error occurs when converting a string stored in HDFS to a numeric type.
Both the Hive and MySQL columns are bigint, so how do I solve the problem?
Here is my sqoop command:
sqoop export --connect "jdbc:mysql://{url}/{db}?{option}" \
--username {username} \
--password {password} \
--table {table} \
--columns "column1,column2,column3" \
--export-dir /apps/hive/warehouse/tmp.db/{table} \
--update-mode allowinsert \
--update-key column1 \
--input-fields-terminated-by "\001" \
--input-null-string "\\N" \
--input-null-non-string "\\N" \
--null-string "\\N" \
--null-non-string "\\N"
It could be an issue due to a missing column or a wrong column position.
Also, there is no need for --null-string and --null-non-string; these are used in sqoop import commands.
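If the column positions do line up, the NumberFormatException (Integer.parseInt on a value larger than an int) suggests the generated code treats that column as an Integer. A sketch that may be worth trying, reusing the placeholders from the question and assuming column2 is the offending bigint column, is to force the Java type with --map-column-java (it is a codegen argument, so verify it takes effect for export in your Sqoop version):
sqoop export --connect "jdbc:mysql://{url}/{db}?{option}" \
--username {username} \
--password {password} \
--table {table} \
--columns "column1,column2,column3" \
--map-column-java column2=Long \
--export-dir /apps/hive/warehouse/tmp.db/{table} \
--input-fields-terminated-by "\001"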

Sqoop Import replace special characters of mysql

I have 1000 tables in MySQL with more than 100000 records in each table. The tables have 300-500 columns.
Some of the tables have columns with special characters like . (dot) and space in the column names.
Now I want to do a sqoop import and create a Hive table in HDFS in a single shot, with a command like the one below:
sqoop import --connect ${domain}:${port}/${database} --username ${username} --password ${password} \
--table ${table} -m 1 --hive-import --hive-database ${hivedatabase} --hive-table ${table} --create-hive-table \
--target-dir /user/hive/warehouse/${hivedatabase}.db/${table}
After this the Hive table is created, but when I query the table it shows an error like the following (this is a sample output):
Error while compiling statement: FAILED: RuntimeException java.lang.RuntimeException: cannot find field emp from [0:emp.id, 1:emp.name, 2:emp.salary, 3:emp.dno]
How can we replace the . (dot) with _ (underscore) during the sqoop import itself? I would like to do this dynamically.
Use sqoop import with the --query option rather than --table, and apply the replace function in the query, i.e.
sqoop import --connect ${domain}:${port}/${database} --username ${username} --password ${password} \
--query "SELECT col1, replace(col2, '.', '_') AS col FROM table WHERE \$CONDITIONS"
Or (not recommended) write a shell script that finds and replaces "." with "_" (using grep/sed) at /user/hive/warehouse/${hivedatabase}.db/${table}
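Putting it together, a fuller sketch of the single-shot import (the placeholders follow the question; the column names, the alias, and -m 1 are assumptions). Note that --query requires WHERE $CONDITIONS and an explicit --target-dir, and that it is the AS alias that sets the Hive column name, while replace() operates on the values as the answer suggests:
sqoop import \
--connect ${domain}:${port}/${database} \
--username ${username} --password ${password} \
--query "SELECT col1, replace(col2, '.', '_') AS col2_renamed FROM ${table} WHERE \$CONDITIONS" \
-m 1 \
--hive-import --hive-database ${hivedatabase} --hive-table ${table} --create-hive-table \
--target-dir /user/hive/warehouse/${hivedatabase}.db/${table}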

How does Sqoop map CSV file columns to columns of a MySQL table?

How does Sqoop map an imported CSV file to the columns of my MySQL table? I just ran the import and export sqoop commands below and they worked properly, but I'm not sure how Sqoop mapped the imported result into the MySQL table's columns. I have a CSV file created manually which I want to export to MySQL, so I need a way to specify the CSV file and column mapping.
sqoop import \
--connect jdbc:mysql://mysqlserver:3306/mydb \
--username myuser \
--password mypassword \
--query 'SELECT MARS_ID , MARKET_ID , USERROLE_ID , LEADER_MARS_ID , CREATED_TIME , CREATED_USER , LST_UPDTD_TIME , LST_UPDTD_USER FROM USERS_TEST u WHERE $CONDITIONS' \
-m 1 \
--target-dir /idn/home/data/user
I deleted the records from the MySQL database and ran the export command below, which inserted the data back into the table.
sqoop export \
--connect jdbc:mysql://mysqlserver:3306/mydb \
--table USERS_TEST \
--export-dir /idn/home/data/user \
--username myuser \
--password mypassword
You can use the --input-fields-terminated-by and --columns parameters to control the structure of the data to be exported back to the RDBMS through Sqoop.
I would recommend referring to the Sqoop user guide for more information.
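For example, if the hand-made CSV is comma-separated, a sketch of the export could look like the following (the table, path, and column names are taken from the question; the comma delimiter is an assumption). Without --columns, Sqoop maps the fields positionally onto the table's column order, which is presumably why the earlier export worked:
sqoop export \
--connect jdbc:mysql://mysqlserver:3306/mydb \
--username myuser \
--password mypassword \
--table USERS_TEST \
--export-dir /idn/home/data/user \
--input-fields-terminated-by ',' \
--columns "MARS_ID,MARKET_ID,USERROLE_ID,LEADER_MARS_ID,CREATED_TIME,CREATED_USER,LST_UPDTD_TIME,LST_UPDTD_USER"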

Save data into MySQL from Hive (Hadoop) through Sqoop?

I have my data stored in a Hive table.
I want to transfer selected data from the Hive table to a MySQL table using Sqoop.
Please guide me on how to do this.
Check out the Sqoop guide here.
You need to use sqoop export; here is an example:
sqoop export --connect "jdbc:mysql://quickstart.cloudera:3306/retail_rpt_db" \
--username retail_dba \
--password cloudera \
--table departments \
--export-dir /user/hive/warehouse/retail_ods.db/departments \
--input-fields-terminated-by '|' \
--input-lines-terminated-by '\n' \
--num-mappers 2
sqoop export exports data to MySQL from Hadoop. The options used above:
--connect: the JDBC URL
--username: the MySQL username
--password: the password for the MySQL user
--table: the MySQL table name
--export-dir: a valid Hadoop directory
--input-fields-terminated-by: the column delimiter in Hadoop
--input-lines-terminated-by: the row delimiter in Hadoop
--num-mappers: the number of mappers used to process the data
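Since the question asks about transferring selected data from Hive, one common pattern (a sketch; the staging path, the columns, and the WHERE clause are assumptions) is to first write the selection to a staging directory from Hive and then export that directory:
# Stage only the selected rows/columns; Hive writes the files with its default ^A (\001) delimiter.
hive -e "INSERT OVERWRITE DIRECTORY '/tmp/departments_staging'
SELECT department_id, department_name FROM retail_ods.departments WHERE department_id > 10"
# Export the staged directory to MySQL (connection details reuse the example above).
sqoop export \
--connect "jdbc:mysql://quickstart.cloudera:3306/retail_rpt_db" \
--username retail_dba \
--password cloudera \
--table departments \
--export-dir /tmp/departments_staging \
--input-fields-terminated-by '\001'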