Sqoop import-all-tables into Hive gets stuck with the statement below - mysql

By default the tables end up in HDFS rather than in the Hive warehouse directory (/user/hive/warehouse).
sqoop import-all-tables \
--num-mappers 1 \
--connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" \
--username=retail_dba \
--password=cloudera \
--hive-import \
--hive-overwrite \
--create-hive-table \
--compress \
--compression-codec org.apache.hadoop.io.compress.SnappyCodec \
--outdir java_files
I tried --hive-home to override $HIVE_HOME, but that made no difference.
Can anyone suggest the reason?
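This is only a guess, not a confirmed diagnosis: a Hive import that gets stuck after the HDFS copy is often a sign that Sqoop is not picking up the same hive-site.xml as the Hive CLI, so the load step targets a different metastore/warehouse than /user/hive/warehouse. The configuration path and the --hive-database value in the sketch below are assumptions for the quickstart VM, not values from the original command.
# Sketch: make sure Sqoop sees the cluster's Hive configuration before importing.
# /etc/hive/conf is an assumed location; adjust to wherever hive-site.xml lives.
export HIVE_CONF_DIR=/etc/hive/conf

sqoop import-all-tables \
  --num-mappers 1 \
  --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" \
  --username retail_dba \
  --password cloudera \
  --hive-import \
  --hive-overwrite \
  --create-hive-table \
  --hive-database default \
  --compress \
  --compression-codec org.apache.hadoop.io.compress.SnappyCodec \
  --outdir java_files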

Related

How to add two NICs in kvm-qemu parameters

I can work with one NIC using the script below in QEMU:
qemu-system-x86_64 \
-name Android11 \
-enable-kvm \
-cpu host,-hypervisor \
-smp 8,sockets=1,cores=8,threads=1 \
-m 4096 \
-net nic,macaddr=52:54:aa:12:35:02,model=virtio \
-net tap,ifname=tap1,script=no,downscript=no,vhost=on \
--display gtk,gl=on \
-full-screen \
-usb \
-device usb-tablet \
-drive file=bliss.qcow2,format=qcow2,if=virtio \
...
Now I want to add another NIC to the VM, but when I use the script below, it does not work:
qemu-system-x86_64 \
-name Android11 \
-enable-kvm \
-cpu host,-hypervisor \
-smp 8,sockets=1,cores=8,threads=1 \
-m 4096 \
-net nic,macaddr=52:54:aa:12:35:02,model=virtio \
-net tap,ifname=tap1,script=no,downscript=no,vhost=on \
-net nic,macaddr=52:54:aa:12:35:03,model=e1000 \
-net tap,ifname=tap2,script=no,downscript=no \
--display gtk,gl=on \
-full-screen \
-usb \
-device usb-tablet \
-drive file=bliss.qcow2,format=qcow2,if=virtio \
...
How can I get both NICs working? Is there something wrong with the QEMU parameters? Thanks.
PS: I created tap1 and tap2 on a Linux bridge before running the command above.
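One thing worth noting (a sketch, not a guaranteed fix): the legacy -net option attaches every NIC and every tap backend to the same emulated hub, which tends to behave badly once a second NIC is added. The newer -netdev/-device pairing binds each NIC to its own backend explicitly; the MAC addresses and tap names below are taken from the question, everything else mirrors the original command.
qemu-system-x86_64 \
  -name Android11 \
  -enable-kvm \
  -cpu host,-hypervisor \
  -smp 8,sockets=1,cores=8,threads=1 \
  -m 4096 \
  -netdev tap,id=net0,ifname=tap1,script=no,downscript=no,vhost=on \
  -device virtio-net-pci,netdev=net0,mac=52:54:aa:12:35:02 \
  -netdev tap,id=net1,ifname=tap2,script=no,downscript=no \
  -device e1000,netdev=net1,mac=52:54:aa:12:35:03 \
  -display gtk,gl=on \
  -full-screen \
  -usb \
  -device usb-tablet \
  -drive file=bliss.qcow2,format=qcow2,if=virtio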

Unable to export JSON data from HDFS to Oracle using sqoop export

While doing the sqoop export, it can't parse the input data and throws the exception below:
java.lang.RuntimeException: Can't parse input data: '"DeptId":888'
Caused by: java.lang.NumberFormatException
In Oracle, DeptId is a NUMBER datatype.
sqoop export \
--connect "jdbc:oracle:thin:#(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=wcx2-scan..com)(PORT=))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=***)))" \
--table API.CUSTOMER \
--columns 'Id','DeptId','Strategy','RCM_PROD_MSG_TXT' \
--export-dir /tmp/test \
--map-column-java RCM_PROD_MSG_TXT=String \
--username ********* \
--password ********** \
--input-fields-terminated-by ',' \
--input-null-string '\N' \
--input-null-non-string '\N'
Sample JSON data:
{"Id":"27952436","DeptId":888,"Strategy":"syn-cat-recs","recs":
[629848,1029280]}
The goal is to make sure this data gets loaded into the Oracle table.
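For context: sqoop export reads plain delimited text, not JSON, so a field like "DeptId":888 reaches the NUMBER column with the key and quotes still attached, which is exactly what the NumberFormatException complains about. One possible approach, sketched below under assumptions (jq available on the edge node, the field list taken from the sample record, and /tmp/test_csv as a made-up staging directory), is to flatten the JSON lines into CSV before exporting:
# Sketch: flatten JSON lines into CSV matching the --columns order, then
# stage the CSV in HDFS and point --export-dir at it.
hdfs dfs -cat /tmp/test/* \
  | jq -r '[.Id, (.DeptId|tostring), .Strategy] | join(",")' \
  > flattened.csv
hdfs dfs -mkdir -p /tmp/test_csv
hdfs dfs -put -f flattened.csv /tmp/test_csv/
# Re-run the sqoop export with --export-dir /tmp/test_csv and a matching
# --columns list.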

SQOOP import from MYSQL to HBASE

I am trying to import data from MySQL to HBase with the code below:
sqoop import \
--connect jdbc:mysql://localhost/sampleOne \
--username root \
--password root \
--table SAMPLEDATA \
--columns "ID,NAME,DESIGNATION" \
--hbase-table customers \
--column-family ‘ID’ \
--hbase-row-key ‘ID,NAME’ \
-m 1
But the above command fails saying --hbase-row-key: command not found, yet if I run it without --hbase-row-key it works like a charm. What could be the issue?
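Two things stand out in the command as pasted, though both are guesses from the text rather than confirmed causes: the quotes around ‘ID’ and ‘ID,NAME’ are curly "smart" quotes, which the shell passes through literally, and a "command not found" on --hbase-row-key usually means the line continuation before it broke (for example a stray space after the trailing backslash), so the shell ran that line as a separate command. Re-typing those lines with plain ASCII quotes and clean backslashes would look like this:
sqoop import \
  --connect jdbc:mysql://localhost/sampleOne \
  --username root \
  --password root \
  --table SAMPLEDATA \
  --columns "ID,NAME,DESIGNATION" \
  --hbase-table customers \
  --column-family 'ID' \
  --hbase-row-key 'ID,NAME' \
  -m 1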

How to specify timestamp in sqoop import from MySQL to HBase?

I want to import structured data from my MySQL database sqoopDB using Sqoop, but I ran into problems when I tried to specify the timestamp mapping. For example:
$ sqoop import \
--connect jdbc:mysql://localhost/sqoopDB \
--username root -P \
--table sensor \
--columns "name, type, value" \
--hbase-table sqoopDB \
--column-family sensor \
--hbase-row-key id -m 1
Enter password:
...
Can I add another parameter, --timestamp, to specify the timestamp mapping? Something like:
--timestamp=insert_date_long
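As far as I know, Sqoop 1 has no --timestamp option, so there is no command-line switch that maps a column onto the HBase cell timestamp. A workaround sketch, assuming the MySQL column is called insert_date and that storing the value as an ordinary cell (rather than as the HBase version timestamp) is acceptable, is to pull it in through a free-form query:
sqoop import \
  --connect jdbc:mysql://localhost/sqoopDB \
  --username root -P \
  --query 'SELECT id, name, type, value, UNIX_TIMESTAMP(insert_date)*1000 AS insert_date_long FROM sensor WHERE $CONDITIONS' \
  --hbase-table sqoopDB \
  --column-family sensor \
  --hbase-row-key id \
  -m 1
This writes the epoch value into a sensor:insert_date_long cell; setting the actual cell version timestamp would need a custom load (for example a small HBase bulk-load job) rather than a Sqoop flag.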

Using sqoop to update a Hive table

I am trying to sqoop data out of a MySQL database where I have a table with both a primary key and a last_updated field. Essentially, I want to get all records that were recently updated and overwrite the current records in the Hive warehouse.
I have tried the following command:
sqoop job --create trainingDataUpdate -- import \
--connect jdbc:mysql://localhost:3306/analytics \
--username user \
--password-file /sqooproot.pwd \
--incremental lastmodified \
--check-column last_updated \
--last-value '2015-02-13 11:08:18' \
--table trainingDataFinal \
--merge-key id \
--direct --hive-import \
--hive-table analytics.trainingDataFinal \
--null-string '\\N' \
--null-non-string '\\N' \
--map-column-hive last_updated=TIMESTAMP
and I get the following error
15/02/13 14:07:41 INFO hive.HiveImport: FAILED: SemanticException Line 2:17 Invalid path ''hdfs://dev.cluster.com:8020/user/hdfs/_sqoop/13140640000000520_32226_hwhjobdev_cluster.com_trainingDataFinal'': No files matching path hdfs://dev.cluster.com:8020/user/hdfs/_sqoop/13140640000000520_32226_dev.cluster.com_trainingDataFinal
15/02/13 14:07:42 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Hive exited with status 64
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:385)
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:335)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:239)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:514)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.tool.JobTool.execJob(JobTool.java:228)
at org.apache.sqoop.tool.JobTool.run(JobTool.java:283)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
I thought that by including --merge-key it would be able to overwrite the old records with the new ones. Does anyone know if this is possible in Sqoop?
I don't think Sqoop can do it in one step.
--merge-key is only used by the sqoop merge tool, not by import.
Also see http://sqoop.apache.org/docs/1.4.0-incubating/SqoopUserGuide.html#id1764421
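If that is right, one hedged way to approximate the desired behaviour is a two-step job: run the incremental import into a plain HDFS directory (without --hive-import), merge the delta onto the previous snapshot with sqoop merge, and then reload or repoint the Hive table. The directories, jar file, and class name below are placeholders (the jar and class normally come from sqoop codegen), not values from the question:
# Step 1: incremental import to HDFS only, no --hive-import
sqoop import \
  --connect jdbc:mysql://localhost:3306/analytics \
  --username user \
  --password-file /sqooproot.pwd \
  --table trainingDataFinal \
  --incremental lastmodified \
  --check-column last_updated \
  --last-value '2015-02-13 11:08:18' \
  --target-dir /data/trainingDataFinal_delta

# Step 2: merge the delta onto the previous snapshot by primary key
sqoop merge \
  --new-data /data/trainingDataFinal_delta \
  --onto /data/trainingDataFinal_current \
  --target-dir /data/trainingDataFinal_merged \
  --jar-file trainingDataFinal.jar \
  --class-name trainingDataFinal \
  --merge-key id

# Step 3: point the Hive table at the merged directory (external table) or
# LOAD DATA INPATH '/data/trainingDataFinal_merged' OVERWRITE INTO TABLE ...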