Unable to import data using sqoop - mysql

I want to import data from MySQL into a remote Hive instance using Sqoop. I have installed Sqoop on a middleware machine. When I run this command:
sqoop import --driver com.mysql.jdbc.Driver --connect jdbc:mysql://192.168.2.146:3306/fir --username root -P -m 1 --table beard_size_list --connect jdbc:hive2://192.168.2.141:10000/efir --username oracle -P -m 1 --hive-table lnd_beard_size_list --hive-import;
Is this command correct? Can I import data from a remote MySQL database into a remote Hive instance this way? When I run it, it keeps trying to connect to the ResourceManager:
17/11/01 10:54:05 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.6.1.0-129
Enter password:
17/11/01 10:54:10 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
17/11/01 10:54:10 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
17/11/01 10:54:10 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
17/11/01 10:54:10 INFO manager.SqlManager: Using default fetchSize of 1000
17/11/01 10:54:10 INFO tool.CodeGenTool: Beginning code generation
17/11/01 10:54:11 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM beard_size_list AS t WHERE 1=0
17/11/01 10:54:11 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM beard_size_list AS t WHERE 1=0
17/11/01 10:54:11 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.6.1.0-129/hadoop-mapreduce
Note: /tmp/sqoop-oracle/compile/d93080265a09913fbfe9e06e92d314a3/beard_size_list.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/11/01 10:54:15 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-oracle/compile/d93080265a09913fbfe9e06e92d314a3/beard_size_list.jar
17/11/01 10:54:15 INFO mapreduce.ImportJobBase: Beginning import of beard_size_list
17/11/01 10:54:15 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
17/11/01 10:54:15 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM beard_size_list AS t WHERE 1=0
17/11/01 10:54:17 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
17/11/01 10:54:17 INFO client.RMProxy: Connecting to ResourceManager at hortonworksn2.com/192.168.2.191:8050
17/11/01 10:54:17 INFO client.AHSProxy: Connecting to Application History server at hortonworksn2.com/192.168.2.191:10200
17/11/01 10:54:19 INFO ipc.Client: Retrying connect to server: hortonworksn2.com/192.168.2.191:8050. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
17/11/01 10:54:20 INFO ipc.Client: Retrying connect to server: hortonworksn2.com/192.168.2.191:8050. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
17/11/01 10:54:21 INFO ipc.Client: Retrying connect to server: hortonworksn2.com/192.168.2.191:8050. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
17/11/01 10:54:22 INFO ipc.Client: Retrying connect to server: hortonworksn2.com/192.168.2.191:8050. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
17/11/01 10:54:23 INFO ipc.Client: Retrying connect to server: hortonworksn2.com/192.168.2.191:8050. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
The port it is trying to connect to is 8050, but the actual ResourceManager port is 8033. How can I fix this?

Try the command below:
sqoop import --driver com.mysql.jdbc.Driver --connect jdbc:mysql://192.168.2.146:3306/fir --username root -P -m 1 --table beard_size_list

Please check that the property below is set correctly in yarn-site.xml:
<property>
  <name>yarn.resourcemanager.address</name>
  <value>192.168.2.191:8033</value>
</property>
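Not part of the original answer, but one quick way to confirm which port the ResourceManager is actually listening on is a plain TCP probe from the client machine. This is a sketch; the host and the two candidate ports are taken from the question above:

```python
import socket

def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection performs the full TCP handshake, so success
        # means something is actually listening on that port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, unreachable, or timed out.
        return False

# e.g. compare port_open("192.168.2.191", 8050) vs port_open("192.168.2.191", 8033)
```

Whichever port returns True is the one yarn.resourcemanager.address should point at.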

Why have you added the --connect option twice in your command? Try the command below:
sqoop import \
  --driver com.mysql.jdbc.Driver \
  --connect jdbc:mysql://192.168.2.146:3306/fir \
  --username root -P -m 1 \
  --split-by beard_size_list_table_primary_key \
  --table beard_size_list \
  --target-dir /user/data/raw/beard_size_list \
  --fields-terminated-by "," \
  --hive-import \
  --create-hive-table \
  --hive-table dbschema.beard_size_list
Note:
--create-hive-table makes the job fail if the Hive table already exists. It will work in this case; otherwise, create a Hive external table and set its location to the --target-dir path.
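A sketch of the external-table alternative mentioned above; the column names and types here are hypothetical and must be replaced with the real schema of beard_size_list:

```sql
-- Hypothetical columns; match these to the actual beard_size_list schema
CREATE EXTERNAL TABLE dbschema.lnd_beard_size_list (
  id INT,
  beard_size STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/data/raw/beard_size_list';  -- the Sqoop --target-dir path
```

With an external table, re-running the Sqoop import only has to refresh the files under the LOCATION directory.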

Related

Cloudera: link failure error while running the sqoop list-databases command

I am trying to run the command below in Cloudera and getting a communications link failure error. I have tried restarting the mysqld service too, with no luck. Can someone please help?
Code and error:
[cloudera@quickstart ~]$ sqoop list-databases --connect "jdbc:mysql://quickstart.cloudera:3306" --username=retail_dba --password=cloudera
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
17/09/22 09:45:59 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.10.0
17/09/22 09:45:59 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/09/22 09:45:59 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
17/09/22 09:46:16 ERROR manager.CatalogQueryManager: Failed to list databases
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
Download mysql-connector-java-5.1.21.jar and copy it into the Sqoop lib folder, then try running the sqoop list-databases command as follows:
sqoop list-databases \
--connect "jdbc:mysql://localhost:3306" \
--username=retail_dba \
--password=cloudera

Error when importing a MySQL table to HDFS using Sqoop

I started learning Hadoop and am practicing on my own. Here is an issue I hit when I try to use Sqoop to import a MySQL table into HDFS:
sqoop import --connect jdbc:mysql://localhost/employees --username=root -P --table=dept_emp --warehouse-dir=dept_emp -where dept_no='d001' --m 1;
The dept_emp table has roughly 20k records.
The output is the following:
2016-09-26 16:42:26,467 INFO [main] ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-09-26 16:42:27,470 INFO [main] ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
The "Already tried x time(s)" counter increases from 0 to 9, then loops from 0 to 9 again, and the job hangs there.
Can someone shed some light on this?
Thank you very much.
Please correct the syntax: --where takes a double dash and a quoted condition, and the mapper-count flag is -m:
--where "dept_no='d001'"
-m 1
The full corrected command would be:
sqoop import --connect jdbc:mysql://localhost/employees --username=root -P --table=dept_emp --warehouse-dir=dept_emp --where "dept_no='d001'" -m 1

Import all tables from MySQL to Hive. What is wrong with my command?

Is there anything wrong with the command below? It's not working for me.
sqoop import-all-tables
--connect jdbc:mysql://localhost/retail_db --username=retail_dba
-- compression-codec=snappy
--as-parquetfile --hive-import -m 1
16/08/17 08:34:07 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
16/08/17 08:34:07 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
16/08/17 08:34:07 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
16/08/17 08:34:08 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/08/17 08:34:08 INFO tool.CodeGenTool: Beginning code generation
16/08/17 08:34:08 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.NullPointerException
java.lang.NullPointerException
Can you post the complete error you are getting?
Does your error contain a ClassNotFoundException? If yes, go to $HADOOP_HOME/etc/hadoop, open mapred-site.xml, and change or add the property mapreduce.framework.name with the value yarn. Save the file and restart Hadoop.
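For reference, the property in mapred-site.xml would look like this (a sketch; the surrounding <configuration> element is assumed to already exist in the file):

```xml
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
```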

Importing data into HDFS using Sqoop hangs

I am following this tutorial: http://hadooped.blogspot.fr/2013/05/apache-sqoop-for-data-integration.html. I have installed the Hadoop services (HDFS, Hive, Sqoop, Hue, ...) using Cloudera Manager.
I am using Ubuntu 12.04 LTS.
When I try to import data from MySQL into HDFS, the MapReduce job runs indefinitely without returning any error, even though the imported table has only 4 columns and 10 rows.
This is what I do:
sqoop import --connect jdbc:mysql://localhost/employees --username hadoop --password password --table departments -m 1 --target-dir /user/sqoop2/sqoop-mysql/department
Warning: /opt/cloudera/parcels/CDH-5.5.2-1.cdh5.5.2.p0.4/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
16/02/23 17:49:09 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.5.2
16/02/23 17:49:09 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
16/02/23 17:49:10 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/02/23 17:49:10 INFO tool.CodeGenTool: Beginning code generation
16/02/23 17:49:11 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `departments` AS t LIMIT 1
16/02/23 17:49:11 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `departments` AS t LIMIT 1
16/02/23 17:49:11 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce
Note: /tmp/sqoop-root/compile/6bdeb198a0c249392703e3fc0070cb64/departments.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
16/02/23 17:49:19 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/6bdeb198a0c249392703e3fc0070cb64/departments.jar
16/02/23 17:49:19 WARN manager.MySQLManager: It looks like you are importing from mysql.
16/02/23 17:49:19 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
16/02/23 17:49:19 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
16/02/23 17:49:19 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
16/02/23 17:49:19 INFO mapreduce.ImportJobBase: Beginning import of departments
16/02/23 17:49:20 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
16/02/23 17:49:24 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
16/02/23 17:49:24 INFO client.RMProxy: Connecting to ResourceManager at hadoopUser/10.0.2.15:8032
16/02/23 17:49:31 INFO db.DBInputFormat: Using read commited transaction isolation
16/02/23 17:49:31 INFO mapreduce.JobSubmitter: number of splits:1
16/02/23 17:49:33 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1456236806433_0004
16/02/23 17:49:34 INFO impl.YarnClientImpl: Submitted application application_1456236806433_0004
16/02/23 17:49:34 INFO mapreduce.Job: The url to track the job: http://hadoopUser:8088/proxy/application_1456236806433_0004/
16/02/23 17:49:34 INFO mapreduce.Job: Running job: job_1456236806433_0004
Regards,
The MapReduce job is not getting launched. Run a test wordcount job on the cluster to verify that YARN can execute jobs at all.

Datastax hadoop InvalidRequestException(why:You have not logged in)

I have installed DataStax Enterprise (dse-4.0.1), but when I try the demo from http://www.datastax.com/docs/datastax_enterprise2.0/sqoop/sqoop_demo, I get the error below. Can anybody please help me with this issue? The log file is attached for reference.
[root@chbslx0624 bin]# ./dse sqoop import --connect jdbc:mysql://127.0.0.1/npa_nxx_demo --username root --password poc123 --table npa_nxx --target-dir /npa_nxx
14/04/14 10:44:14 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
14/04/14 10:44:14 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
14/04/14 10:44:14 INFO tool.CodeGenTool: Beginning code generation
14/04/14 10:44:14 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `npa_nxx` AS t LIMIT 1
14/04/14 10:44:14 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `npa_nxx` AS t LIMIT 1
14/04/14 10:44:14 INFO orm.CompilationManager: HADOOP_HOME is /opt/cassandra/dse-4.0.1/resources/hadoop/bin/..
Note: /tmp/sqoop-root/compile/b0fc8093d30c07f252da42678679e461/npa_nxx.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
14/04/14 10:44:15 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/b0fc8093d30c07f252da42678679e461/npa_nxx.jar
14/04/14 10:44:15 WARN manager.MySQLManager: It looks like you are importing from mysql.
14/04/14 10:44:15 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
14/04/14 10:44:15 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
14/04/14 10:44:15 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
14/04/14 10:44:15 INFO mapreduce.ImportJobBase: Beginning import of npa_nxx
14/04/14 10:44:17 INFO snitch.Workload: Setting my workload to Cassandra
14/04/14 10:44:18 ERROR security.UserGroupInformation: PriviledgedActionException as:root cause:java.io.IOException: InvalidRequestException(why:You have not logged in)
14/04/14 10:44:18 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: InvalidRequestException(why:You have not logged in)
at com.datastax.bdp.util.CassandraProxyClient.initialize(CassandraProxyClient.java:453)
at com.datastax.bdp.util.CassandraProxyClient.<init>(CassandraProxyClient.java:376)
at com.datastax.bdp.util.CassandraProxyClient.newProxyConnection(CassandraProxyClient.java:259)
at com.datastax.bdp.util.CassandraProxyClient.newProxyConnection(CassandraProxyClient.java:306)
at com.datastax.bdp.hadoop.cfs.CassandraFileSystemThriftStore.initialize(CassandraFileSystemThriftStore.java:230)
at com.datastax.bdp.hadoop.cfs.CassandraFileSystem.initialize(CassandraFileSystem.java:73)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:187)
at org.apache.hadoop.mapreduce.JobSubmissionFiles.getStagingDir(JobSubmissionFiles.java:97)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:856)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:141)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:202)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:475)
at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:108)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:403)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:476)
at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
at com.cloudera.sqoop.Sqoop.main(Sqoop.java:57)
Caused by: InvalidRequestException(why:You have not logged in)
at org.apache.cassandra.thrift.Cassandra$describe_keyspaces_result$describe_keyspaces_resultStandardScheme.read(Cassandra.java:31961)
at org.apache.cassandra.thrift.Cassandra$describe_keyspaces_result$describe_keyspaces_resultStandardScheme.read(Cassandra.java:31928)
at org.apache.cassandra.thrift.Cassandra$describe_keyspaces_result.read(Cassandra.java:31870)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.cassandra.thrift.Cassandra$Client.recv_describe_keyspaces(Cassandra.java:1181)
at org.apache.cassandra.thrift.Cassandra$Client.describe_keyspaces(Cassandra.java:1169)
at com.datastax.bdp.util.CassandraProxyClient.initialize(CassandraProxyClient.java:425)
... 32 more
It looks like you have another local Cassandra node running with username/password authentication enabled. Follow the latest documentation, start DSE in analytics mode (dse cassandra -t), and then run the import script.