I was importing data from MySQL to Hive using Sqoop:
sqoop import --connect jdbc:mysql://localhost:3306/DATASET -username root -P --table MATCHES --hive-import
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. com.fasterxml.jackson.databind.ObjectMapper.readerFor(Ljava/lang/Class;)Lcom/fasterxml/jackson/databind/ObjectReader;
18/11/25 11:42:58 ERROR ql.Driver: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. com.fasterxml.jackson.databind.ObjectMapper.readerFor(Ljava/lang/Class;)Lcom/fasterxml/jackson/databind/ObjectReader;
Do you have the jackson-databind jar in your Hive lib directory? Check it once; the missing ObjectMapper.readerFor method usually means an old jackson-databind version is being picked up on the classpath. A sketch of the check is below.
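A minimal sketch of that check, assuming a plain tarball install with $HIVE_HOME and $SQOOP_HOME set (the paths, and the direction of the copy, are assumptions; back up anything you replace):
# compare the jackson-databind versions on the two classpaths
ls $HIVE_HOME/lib/jackson-databind-*.jar
ls $SQOOP_HOME/lib/jackson-databind-*.jar
# commonly reported fix: move Sqoop's older jackson jars aside and use Hive's newer ones
mkdir -p $SQOOP_HOME/lib/jackson-backup
mv $SQOOP_HOME/lib/jackson-*.jar $SQOOP_HOME/lib/jackson-backup/
cp $HIVE_HOME/lib/jackson-*.jar $SQOOP_HOME/lib/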
I'm using Anaconda with Python 3.7 and I'm trying to connect to a remote database using the following code:
import MySQLdb
myDB = MySQLdb.connect(host="xxxxx", port=3306, user="xxx",password="xxxx",db="xxxx")
but I get the following error:
File "C:\Users\zanto\anaconda3\lib\site-packages\MySQLdb\connections.py", line 208, in __init__
super(Connection, self).__init__(*args, **kwargs2)
OperationalError: (2006, 'SSL connection error: unknown error number')
I tried two MySQL users, one defined with host '%' and one with 'localhost', but I still get the same error.
It was a problem with the MySQLdb library.
I installed the pymysql library through Anaconda and now it's working!
My new code is:
import pymysql.cursors

# DictCursor makes each fetched row a dictionary keyed by column name
connection = pymysql.connect(host='xxxx', port=3306, user="xxxx", password="xxxx",
                             db="xxxx", cursorclass=pymysql.cursors.DictCursor)
try:
    with connection.cursor() as cursor:
        sql = "SELECT * FROM Table WHERE Field = 'value'"
        cursor.execute(sql)
        results = cursor.fetchall()
        # print(results)  # uncomment to see the query result as a list of dictionaries
        for record in results:
            record_line = " ".join('{0}{1}'.format(field, value) for field, value in record.items())
            print(record_line)
finally:
    connection.close()
More info: a cursor allows Python code to execute MySQL commands within a database session. A cursor is created by the connection.cursor() method; it is bound to the connection for its entire lifetime, and all of its commands run in the context of the database session wrapped by the connection.
The cursor.execute() method runs a query against the MySQL database, and cursor.fetchall() returns the results as a list containing the records as dictionaries (because of DictCursor).
I am trying to import MySQL data into HDFS, but I am getting an exception.
I have a table (products) in MySQL and I am using the following command to import the data into HDFS:
bin/sqoop-import --connect jdbc:mysql://localhost:3306/test --username root --password root --table products --target-dir /user/nitin/products
I am getting the following exception:
Error: java.io.IOException: SQLException in nextKeyValue
at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:277)
at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:553)
at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.sql.SQLException: Unknown type '246 in column 2 of 3 in binary-encoded result set.
at com.mysql.jdbc.MysqlIO.extractNativeEncodedColumn(MysqlIO.java:3710)
at com.mysql.jdbc.MysqlIO.unpackBinaryResultSetRow(MysqlIO.java:3620)
at com.mysql.jdbc.MysqlIO.nextRow(MysqlIO.java:1282)
at com.mysql.jdbc.RowDataDynamic.nextRecord(RowDataDynamic.java:335)
at com.mysql.jdbc.RowDataDynamic.<init>(RowDataDynamic.java:68)
at com.mysql.jdbc.MysqlIO.getResultSet(MysqlIO.java:416)
at com.mysql.jdbc.MysqlIO.readResultsForQueryOrUpdate(MysqlIO.java:1899)
at com.mysql.jdbc.MysqlIO.readAllResults(MysqlIO.java:1347)
at com.mysql.jdbc.ServerPreparedStatement.serverExecute(ServerPreparedStatement.java:1393)
at com.mysql.jdbc.ServerPreparedStatement.executeInternal(ServerPreparedStatement.java:958)
at com.mysql.jdbc.PreparedStatement.executeQuery(PreparedStatement.java:1705)
at org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:111)
at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)
... 12 more
I have also used this command to import data into HDFS:
bin/sqoop-import --connect jdbc:mysql://localhost:3306/test?zeroDateTimeBehavior=convertToNull --username root --password root --table products --target-dir /user/nitin/product
MapReduce job failed.
It's because of a data type conversion issue.
Try using the --map-column-java option to define the column data type mapping explicitly, as in the sketch below.
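Type 246 in the MySQL protocol is NEWDECIMAL, i.e. a DECIMAL column, so the failing column in products is most likely a decimal. A minimal sketch of the explicit mapping, assuming a hypothetical DECIMAL column called price (substitute the real column name):
# "price" below is a hypothetical column name; replace it with the actual DECIMAL column
bin/sqoop-import \
  --connect jdbc:mysql://localhost:3306/test \
  --username root --password root \
  --table products \
  --target-dir /user/nitin/products \
  --map-column-java price=String
If the mapping alone does not help, upgrading the mysql-connector-java jar on Sqoop's classpath is another commonly reported fix, since very old driver versions do not recognize type 246 at all.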
While doing a Sqoop export, it can't parse the input data. I'm seeing the exception below.
java.lang.RuntimeException: Can't parse input data: '"DeptId":888'
Caused by: java.lang.NumberFormatException
In Oracle, DeptId is a NUMBER data type.
sqoop export \
--connect "jdbc:oracle:thin:#(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=wcx2-scan..com)(PORT=))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=***)))" \
--table API.CUSTOMER \
--columns 'Id','DeptId','Strategy','RCM_PROD_MSG_TXT' \
--export-dir /tmp/test \
--map-column-java RCM_PROD_MSG_TXT=String \
--username ********* \
--password ********** \
--input-fields-terminated-by ',' \
--input-null-string '\N' \
--input-null-non-string '\N'
Sample JSON data:
{"Id":"27952436","DeptId":888,"Strategy":"syn-cat-recs","recs":[629848,1029280]}
Make sure the data in the export directory is in a form that can actually be loaded into the Oracle table. With --input-fields-terminated-by ',', Sqoop splits each line on commas and expects plain column values in the --columns order, so a JSON fragment such as "DeptId":888 cannot be parsed into the NUMBER column; convert the records to plain delimited text before exporting.
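For illustration only, a record in /tmp/test would then need to be a plain comma-delimited line matching the --columns order. The field-to-column mapping below is an assumption rather than something stated in the original post, and \N is the null placeholder configured by --input-null-non-string:
# hypothetical record layout for /tmp/test: Id,DeptId,Strategy,RCM_PROD_MSG_TXT
27952436,888,syn-cat-recs,\N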
I have a CSV file in HDFS with lines like:
"2015-12-01","Augusta","46728.0","1"
I am trying to export this file to a MySQL table:
CREATE TABLE test.events_top10(
dt VARCHAR(255),
name VARCHAR(255),
summary VARCHAR(255),
row_number VARCHAR(255)
);
With the command:
sqoop export --table events_top10 --export-dir /user/hive/warehouse/result --escaped-by \" --connect ...
This command fails with error:
Error: java.io.IOException: Can't export data, please check failed map task logs
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.RuntimeException: Can't parse input data: '2015-12-02,Ashburn,43040.0,9'
at events_top10.__loadFromFields(events_top10.java:335)
at events_top10.parse(events_top10.java:268)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:83)
... 10 more
Caused by: java.util.NoSuchElementException
at java.util.ArrayList$Itr.next(ArrayList.java:834)
at events_top10.__loadFromFields(events_top10.java:320)
... 12 more
If I do not use the --escaped-by \" parameter, then the MySQL table contains rows like this:
"2015-12-01" | "Augusta" | "46728.0" | "1"
Could you please explain how to export the CSV file to the MySQL table without the double quotes?
I have to use both --escaped-by '\\' and --enclosed-by '\"'.
So the correct command is:
sqoop export --table events_top10 --export-dir /user/hive/warehouse/result --escaped-by '\\' --enclosed-by '\"' --connect ...
For more information, please see the official documentation.
I am trying to Sqoop data out of a MySQL database where I have a table with both a primary key and a last_updated field. Essentially, I am trying to get all records that were recently updated and overwrite the current records in the Hive warehouse.
I have tried the following command:
sqoop job --create trainingDataUpdate -- import \
--connect jdbc:mysql://localhost:3306/analytics \
--username user \
--password-file /sqooproot.pwd \
--incremental lastmodified \
--check-column last_updated \
--last-value '2015-02-13 11:08:18' \
--table trainingDataFinal \
--merge-key id \
--direct --hive-import \
--hive-table analytics.trainingDataFinal \
--null-string '\\N' \
--null-non-string '\\N' \
--map-column-hive last_updated=TIMESTAMP
and I get the following error
15/02/13 14:07:41 INFO hive.HiveImport: FAILED: SemanticException Line 2:17 Invalid path ''hdfs://dev.cluster.com:8020/user/hdfs/_sqoop/13140640000000520_32226_hwhjobdev_cluster.com_trainingDataFinal'': No files matching path hdfs://dev.cluster.com:8020/user/hdfs/_sqoop/13140640000000520_32226_dev.cluster.com_trainingDataFinal
15/02/13 14:07:42 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Hive exited with status 64
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:385)
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:335)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:239)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:514)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.tool.JobTool.execJob(JobTool.java:228)
at org.apache.sqoop.tool.JobTool.run(JobTool.java:283)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
I thought that by including --merge-key, Sqoop would be able to overwrite the old records with the new records. Does anyone know if this is possible in Sqoop?
I don't think Sqoop can do it.
--merge-key is only used by sqoop merge, not by import, so this job won't overwrite the old records for you; a sketch of running the merge as a separate step is below.
Also see http://sqoop.apache.org/docs/1.4.0-incubating/SqoopUserGuide.html#id1764421
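A minimal sketch of that two-step approach (import the delta into a staging directory, then merge it onto the existing dataset). The directory names are assumptions, and the jar and class used by the merge would come from sqoop codegen against the same table; --hive-import is dropped because the merge works on HDFS directories:
sqoop import \
  --connect jdbc:mysql://localhost:3306/analytics \
  --username user \
  --password-file /sqooproot.pwd \
  --table trainingDataFinal \
  --incremental lastmodified \
  --check-column last_updated \
  --last-value '2015-02-13 11:08:18' \
  --target-dir /user/hdfs/trainingDataFinal_delta

# merge the delta onto the existing HDFS copy, keeping the newest row per id
sqoop merge \
  --new-data /user/hdfs/trainingDataFinal_delta \
  --onto /user/hdfs/trainingDataFinal \
  --target-dir /user/hdfs/trainingDataFinal_merged \
  --jar-file trainingDataFinal.jar \
  --class-name trainingDataFinal \
  --merge-key id
The merged directory can then be loaded into the Hive table separately (for example with LOAD DATA INPATH) instead of relying on --hive-import.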