Sqoop export to MySQL from HDFS

I have a CSV file in HDFS. The contents are below:
1,sam
2,ram
3,Tim,Ny
4,Jim,CA
Now I want to export this file into a MySQL table.
The MySQL table has the following columns: id, name, city.
I am getting a Sqoop export failed error.
This is the Sqoop export statement I am using:
sqoop export --connect jdbc:mysql://xxxx/test --username xxxxx --password xxxxx --table test --export-dir /user/xxxxx/testing -m 1 --input-fields-terminated-by ',' --input-null-string '\n' --input-null-non-string '\n'
Why am I getting this error, and what is the correct way to get the export done without errors? What if the file is in Parquet format?
Error log:
2017-03-20 15:32:37,388 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper:
2017-03-20 15:32:37,388 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: Exception raised during data export
2017-03-20 15:32:37,388 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper:
2017-03-20 15:32:37,388 ERROR [main] org.apache.sqoop.mapreduce.TextExportMapper: Exception:
java.lang.RuntimeException: Can't parse input data: 'sam'
at test1.__loadFromFields(test1.java:335)
at test1.parse(test1.java:268)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:89)
at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.util.NoSuchElementException
at java.util.ArrayList$Itr.next(ArrayList.java:834)
at test1.__loadFromFields(test1.java:330)

The error is caused by the different schema of the first two rows versus the last two rows:
1,sam
2,ram
3,Tim,Ny
4,Jim,CA
Sqoop expects a comma after sam and ram. The last column may be empty, but the delimiter must still be there.

Because the third column is missing entirely in the first two rows, Sqoop cannot map those lines to the three-column MySQL table, and the export fails with the parse error above.
So create the CSV like this:
1,sam,
2,ram,
3,Tim,Ny
4,Jim,CA
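If you would rather not rebuild the file by hand, the missing trailing delimiter can also be added with a quick streaming pass over the HDFS data. This is only a sketch; the output path /user/xxxxx/testing_fixed and the file name data.csv are made-up examples, so point --export-dir at whatever directory you actually write to.
# Sketch: pad rows that have only two fields so every line has three columns.
hdfs dfs -mkdir -p /user/xxxxx/testing_fixed
hdfs dfs -cat /user/xxxxx/testing/* \
  | awk -F',' 'NF==2 {print $0","} NF>=3 {print}' \
  | hdfs dfs -put - /user/xxxxx/testing_fixed/data.csv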
Then run the following Sqoop export command:
sqoop export --connect jdbc:mysql://localhost:3306/test --username xxxx --password xxxx --table test --export-dir stack/stack.csv -m 1
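As for the Parquet part of the question: a plain --export-dir export goes through TextExportMapper and expects delimited text, so it will not read Parquet files. One commonly suggested route is Sqoop's HCatalog integration, sketched below on the assumption that the Parquet data is already registered as a Hive/HCatalog table (the names default and test_parquet are only placeholders, not from your setup):
# Sketch: export a Parquet-backed HCatalog table instead of a raw directory.
sqoop export \
  --connect jdbc:mysql://xxxx/test \
  --username xxxxx --password xxxxx \
  --table test \
  --hcatalog-database default \
  --hcatalog-table test_parquet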
thanks

Related

Sqoop error in direct mode

I am trying to import data from MemSQL to HDFS using Sqoop in direct mode.
My Sqoop command is as follows:
sqoop import -D mapred.task.timeout=0 --connect jdbc:mysql://XXXXXXX:3306/dbname --username XXXX --password XXXX --table catalog_returns --target-dir XXXXXX --direct
I am able to migrate the data without direct mode. However, using direct mode produces the following error:
Error: java.io.IOException: mysqldump terminated with status 2
at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:486)
at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:175)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
It would be a great help if someone could provide a solution.
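Since the stack trace shows MySQLDumpMapper shelling out to mysqldump, one way to narrow the failure down (a sketch only, reusing the placeholders from the command above, not a confirmed fix) is to run mysqldump by hand from one of the worker nodes:
# Sketch: --direct relies on the mysqldump binary being present and working on every task node.
which mysqldump
mysqldump --version
mysqldump -h XXXXXXX -P 3306 -u XXXX -p --no-data dbname catalog_returns > /dev/null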

Sqoop - Failed to empty staging table before export

I have faced an interesting situation while trying to export data from HDFS to a MySQL database. I'm getting "Error during export: Failed to empty staging table before export run".
First I created two tables in MySQL and executed the Sqoop export statement for the first time (when both tables were empty).
As a result, records from the staging table migrated to the export table successfully:
INFO manager.SqlManager: Migrated 81424802 records from `staging_weather_data` to `export_weather_data`
However, when I executed a SELECT query on the export_weather_data table (SELECT count(*) FROM export_weather_data;), it returned 0 records. How could the table remain empty if the records migrated successfully?
As for the staging_weather_data table, it contains the exported records.
So I decided to retry the Sqoop export with the same statement:
sqoop export \
--connect jdbc:mysql://localhost/weather_data \
--connection-manager org.apache.sqoop.manager.MySQLManager \
--table export_weather_data \
--staging-table staging_weather_data \
--clear-staging-table \
--export-dir /user/maria_dev/input-sqoop \
--options-file ~/.access_credentials \
--num-mappers 4 \
--batch \
--input-fields-terminated-by ','
After that I keep getting the following error:
INFO mapreduce.ExportJobBase: Beginning export of export_weather_data
18/03/22 13:28:57 ERROR manager.SqlManager: Unable to execute delete query: DELETE FROM `staging_weather_data`
java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
ERROR tool.ExportTool: Error during export: Failed to empty staging table before export run
The only thing I could suspect is the absence of the --clear-staging-table option, but I have had it in the statement from the very beginning. So for now I have no idea how to solve it and would appreciate any suggestions.
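One way to investigate from the MySQL side (a sketch, assuming shell access to the server and the credentials kept in --options-file) is to look at the open InnoDB transactions and client sessions while the Sqoop DELETE on the staging table is waiting:
# Sketch: find the transaction/session that still holds locks on staging_weather_data.
mysql -u xxxx -p weather_data -e "
  SELECT trx_id, trx_state, trx_started, trx_mysql_thread_id, trx_query
  FROM information_schema.INNODB_TRX;
  SHOW FULL PROCESSLIST;"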

Create / import hive table using Sqoop

When I use the import command below, it lets me create the table and import data from MySQL into Hive, and I can see the table "widgets" in Hive.
sqoop import --connect jdbc:mysql://localhost:3306/hadoopguide --table widgets --username <username> --password <password> --split-by id -m 1 --hive-import;
But whenever I use the "create-hive-table" command below, I get an error.
Command:
sqoop create-hive-table --connect jdbc:mysql://localhost:3306/hadoopguide --table widgets --username <username> --password <password> --fields-terminated-by ',';
Error:
17/03/14 21:30:21 INFO hive.HiveImport: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
17/03/14 21:30:21 ERROR tool.CreateHiveTableTool: Encountered IOException running create table job: java.io.IOException: Hive exited with status 1
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:385)
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:335)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:239)
at org.apache.sqoop.tool.CreateHiveTableTool.run(CreateHiveTableTool.java:58)
at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
Can anyone please tell me why I am getting this error?
Any input from your side would be great. :-)
Are your Hadoop services running?
$ start-all.sh
hive> show databases;
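Expanding on that check a little (a sketch, nothing Sqoop-specific): the "Unable to instantiate HiveMetaStoreClient" message usually points at Hive not reaching its metastore, so it is worth confirming the daemons from the command line before re-running the Sqoop job.
# Sketch: verify the Hadoop daemons and that Hive can talk to its metastore.
jps                        # should list NameNode, DataNode, ResourceManager, NodeManager
hive -e 'SHOW DATABASES;'  # will also fail if the metastore is unreachable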

sqoop export with procedure option is failing

I am trying to export data from EMR to PostgreSQL using Sqoop. When I run the command below, it throws the following exception:
sqoop export --connect jdbc:postgresql://test.xxxxx.us-east-1.rds.amazonaws.com/RulesPOC --username user --password password --call merge_AccountRepo -m 1 --input-fields-terminated-by '|' --export-dir s3://<bucket_name>/output/ --driver org.postgresql.Driver --columns "num,max"
Error:
16/07/14 21:25:07 INFO tool.CodeGenTool: Beginning code generation
16/07/14 21:25:07 ERROR tool.ExportTool: Encountered IOException running export job: java.io.IOException: No columns to generate for ClassWriter
Can someone please suggest what is wrong?
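"No columns to generate for ClassWriter" generally means Sqoop's column/parameter metadata lookup for the call target came back empty. A first sanity check (a sketch, using the host and names from the command above) is to confirm the procedure and its signature on the PostgreSQL side:
# Sketch: list the function and its argument types as seen by PostgreSQL.
psql -h test.xxxxx.us-east-1.rds.amazonaws.com -U user -d RulesPOC -c '\df merge_AccountRepo'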

sqoop export fails to load data into mysql from hive warehouse folder

Sqoop export fails with an error.
My export command is:
sqoop export --connect jdbc:mysql://<ip>:3306/<database> --username user --password pass --verbose --table <table name> --export-dir <dir in hdfs>
Fields are terminated by the Hive default delimiter.
Error message:
15/01/12 08:50:23 INFO mapred.JobClient: Task Id : attempt_201412261920_1440_m_000000_1, Status : FAILED
java.util.NoSuchElementException
at java.util.ArrayList$Itr.next(ArrayList.java:757)
at <tablename>.__loadFromFields(<tablename>.java:418)
at <tablename>.parse(<tablename>.java:332)
at com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:81)
at com.cloudera.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:40)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at com.cloudera.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:189)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:416)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1177)
at org.apache.hadoop.mapred.Child.main(Child.java:264)
I found out the mistake: it was because of a column mismatch between the Hive and MySQL tables. Now it is working fine.
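For anyone hitting the same NoSuchElementException when exporting straight from the Hive warehouse directory: besides matching the column lists, the delimiter has to match too. Hive's default field separator is \001 (Ctrl-A) while Sqoop's export default is a comma, so a command along these lines is commonly used (a sketch built on the command above, keeping the original placeholders):
# Sketch: tell Sqoop the input files use Hive's default ^A field delimiter.
sqoop export \
  --connect jdbc:mysql://<ip>:3306/<database> \
  --username user --password pass \
  --table <table name> \
  --export-dir <dir in hdfs> \
  --input-fields-terminated-by '\001' \
  --input-lines-terminated-by '\n'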