I am trying to import data from mysql to hbase with the below code:
sqoop import \
--connect jdbc:mysql://localhost/sampleOne \
--username root \
--password root \
--table SAMPLEDATA \
--columns "ID,NAME,DESIGNATION" \
--hbase-table customers \
--column-family ‘ID’ \
--hbase-row-key ‘ID,NAME’ \
-m 1
But the above command fails, saying --hbase-row-key: command not found, yet if I execute it without the --hbase-row-key argument it works like a charm. So what could be the issue?
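One detail worth checking (an observation from how the command is pasted above, not a confirmed diagnosis): the quotes around ‘ID’ and ‘ID,NAME’ are curly typographic quotes rather than plain ASCII single quotes, so the shell passes them through as literal characters; likewise, a space after a trailing backslash is enough to break the line continuation, which would make the shell try to run --hbase-row-key as its own command. A re-typed sketch of the same command with plain quotes and no trailing whitespace:
sqoop import \
--connect jdbc:mysql://localhost/sampleOne \
--username root \
--password root \
--table SAMPLEDATA \
--columns "ID,NAME,DESIGNATION" \
--hbase-table customers \
--column-family 'ID' \
--hbase-row-key 'ID,NAME' \
-m 1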
Related
While doing a sqoop export, it can't parse the input data; I'm seeing the below exception.
java.lang.RuntimeException: Can't parse input data: '"DeptId":888'
Caused by: java.lang.NumberFormatException
In Oracle, DeptId is a NUMBER datatype.
sqoop export \
--connect "jdbc:oracle:thin:#(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=wcx2-scan..com)(PORT=))(CONNECT_DATA=(SERVER=DEDICATED)(SERVICE_NAME=***)))" \
--table API.CUSTOMER \
--columns 'Id,DeptId,Strategy,RCM_PROD_MSG_TXT' \
--export-dir /tmp/test \
--map-column-java RCM_PROD_MSG_TXT=String \
--username ********* \
--password ********** \
--input-fields-terminated-by ',' \
--input-null-string '\N' \
--input-null-non-string '\N'
Sample JSON data
{"Id":"27952436","DeptId":888,"Strategy":"syn-cat-recs","recs":[629848,1029280]}
I need to make sure the data gets loaded into the Oracle table.
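For context on the error itself: sqoop export reads each input line as a flat delimited record, splitting on the --input-fields-terminated-by character, so the second comma-separated token of the JSON line above is "DeptId":888, which cannot be converted to the NUMBER column and produces the NumberFormatException. A sketch of the kind of comma-delimited line the command above would expect instead (values made up for illustration):
27952436,888,syn-cat-recs,some product message text
In other words, the export directory needs to contain plain delimited text matching the --columns order; raw JSON would have to be flattened first (for example with a Hive or Spark step) before the export.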
By default the tables are going to HDFS, not to the Hive warehouse directory (/user/hive/warehouse).
sqoop import-all-tables \
--num-mappers 1 \
--connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" \
--username=retail_dba \
--password=cloudera \
--hive-import \
--hive-overwrite \
--create-hive-table \
--compress \
--compression-codec org.apache.hadoop.io.compress.SnappyCodec \
--outdir java_files
I tried --hive-home to override $HIVE_HOME, but it made no difference.
Can anyone suggest the reason?
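A quick way to check where the data actually ended up (a diagnostic sketch; /user/cloudera is an assumption based on the quickstart VM user, and is where sqoop stages each table before the Hive load):
hdfs dfs -ls /user/cloudera          # staging directories created by the import
hdfs dfs -ls /user/hive/warehouse    # where --hive-import should load the tables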
I wanted to import structured data from my MySQL database sqoopDB using sqoop, but I ran into some problems when I tried to specify the timestamp mapping. For example,
$ sqoop import \
--connect jdbc:mysql://localhost/sqoopDB \
--username root -P \
--table sensor \
--columns "name, type, value" \
--hbase-table sqoopDB \
--column-family sensor \
--hbase-row-key id -m 1
Enter password:
...
Can I add another parameter, --timestamp, to specify the timestamp mapping? Something like:
--timestamp=insert_date_long
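One way to see which cell timestamps Sqoop actually wrote is to inspect a row in the hbase shell (a sketch; the row key value 1 is made up for illustration, and each returned cell includes a timestamp= value):
hbase shell
get 'sqoopDB', '1', {COLUMN => 'sensor'}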
I am trying to sqoop data out of a MySQL database where I have a table with both a primary key and a last_updated field. Essentially, I want to get all records that were recently updated and overwrite the current records in the Hive warehouse.
I have tried the following command
sqoop job --create trainingDataUpdate -- import \
--connect jdbc:mysql://localhost:3306/analytics \
--username user \
--password-file /sqooproot.pwd \
--incremental lastmodified \
--check-column last_updated \
--last-value '2015-02-13 11:08:18' \
--table trainingDataFinal \
--merge-key id \
--direct --hive-import \
--hive-table analytics.trainingDataFinal \
--null-string '\\N' \
--null-non-string '\\N' \
--map-column-hive last_updated=TIMESTAMP
and I get the following error
15/02/13 14:07:41 INFO hive.HiveImport: FAILED: SemanticException Line 2:17 Invalid path ''hdfs://dev.cluster.com:8020/user/hdfs/_sqoop/13140640000000520_32226_hwhjobdev_cluster.com_trainingDataFinal'': No files matching path hdfs://dev.cluster.com:8020/user/hdfs/_sqoop/13140640000000520_32226_dev.cluster.com_trainingDataFinal
15/02/13 14:07:42 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Hive exited with status 64
at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:385)
at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:335)
at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:239)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:514)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.tool.JobTool.execJob(JobTool.java:228)
at org.apache.sqoop.tool.JobTool.run(JobTool.java:283)
at org.apache.sqoop.Sqoop.run(Sqoop.java:143)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:179)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:218)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:227)
at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
I thought that by including --merge-key it would be able to overwrite the old records with the new ones. Does anyone know if this is possible in sqoop?
I don't think sqoop can do it.
--merge-key is only used by sqoop-merge, not by import.
See also http://sqoop.apache.org/docs/1.4.0-incubating/SqoopUserGuide.html#id1764421
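For reference, a sketch of the two-step pattern that answer points at: run the incremental import into a plain HDFS directory, then reconcile it onto the previous snapshot with sqoop-merge. The directory names are placeholders, and the --jar-file/--class-name values refer to the record class sqoop generates (via codegen) for the table:
sqoop import \
--connect jdbc:mysql://localhost:3306/analytics \
--username user \
--password-file /sqooproot.pwd \
--table trainingDataFinal \
--incremental lastmodified \
--check-column last_updated \
--last-value '2015-02-13 11:08:18' \
--target-dir /user/hdfs/trainingDataFinal_delta

sqoop merge \
--new-data /user/hdfs/trainingDataFinal_delta \
--onto /user/hdfs/trainingDataFinal \
--target-dir /user/hdfs/trainingDataFinal_merged \
--merge-key id \
--jar-file trainingDataFinal.jar \
--class-name trainingDataFinal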
For the Parse REST API, we can have
curl -X GET \
-H "X-Parse-Application-Id: Y6i5v9PQOAAGlnKnULJJu5odT72ffSCpOnqqPhx9" \
-H "X-Parse-REST-API-Key: T6STkwY6XqVMySTbqeSZfmli3naZZK9KoxnAcEhR" \
-G \
--data-urlencode 'where={"username":"someUser"}' \
https://api.parse.com/1/users
Now I'm trying to send the request without --data-urlencode and instead append the query directly to the URL https://api.parse.com/1/users. What should I do?
I tried
curl -X GET \
-H "X-Parse-Application-Id: Y6i5v9PQOAAGlnKnULJJu5odT72ffSCpOnqqPhx9" \
-H "X-Parse-REST-API-Key: T6STkwY6XqVMySTbqeSZfmli3naZZK9KoxnAcEhR" \
-G \
https://api.parse.com/1/users?where={"username":"someUser"}
but it doesn't work.
Thank you.
First URL-encode the value {"username":"someUser"} to %7B%22username%22%3A%22someUser%22%7D (the where= part stays as a literal where=), then
curl -X GET \
-H "X-Parse-Application-Id: Y6i5v9PQOAAGlnKnULJJu5odT72ffSCpOnqqPhx9" \
-H "X-Parse-REST-API-Key: T6STkwY6XqVMySTbqeSZfmli3naZZK9KoxnAcEhR" \
-G \
'https://api.parse.com/1/users?where=%7B%22username%22%3A%22someUser%22%7D'
works
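If you need to generate the encoded value by hand, one option (assuming python3 is available) is:
python3 -c 'import urllib.parse; print(urllib.parse.quote("{\"username\":\"someUser\"}"))'
# prints %7B%22username%22%3A%22someUser%22%7D
This is the same value the -G / --data-urlencode form builds for you, since --data-urlencode with a name=content argument encodes only the content part and leaves the name= prefix intact.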