I started learning Hadoop and am working through some practice on my own. Here is an issue I hit when I try to use Sqoop to import a MySQL table to HDFS:
sqoop import --connect jdbc:mysql://localhost/employees --username=root -P --table=dept_emp --warehouse-dir=dept_emp -where dept_no='d001' --m 1;
The dept_emp table has roughly 20k records.
The output is as follows:
2016-09-26 16:42:26,467 INFO [main] ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-09-26 16:42:27,470 INFO [main] ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
The "Already tried x time(s)" increased from 0 to 9, and then looped again from 0 to 9, hanging there now.
Can someone shed me some light?
Thank you very much.
Please correct the syntax:
--where "dept_no='d001'"
-m 1;
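Putting those fixes together, the full command would look something like this (a sketch using the same connection details as the question, not tested against your cluster):
sqoop import \
  --connect jdbc:mysql://localhost/employees \
  --username root -P \
  --table dept_emp \
  --warehouse-dir dept_emp \
  --where "dept_no='d001'" \
  -m 1
Also note that retries against 0.0.0.0:8032 usually mean the client has no usable ResourceManager address; if the corrected command still hangs, check that YARN is running and that yarn.resourcemanager.address is set in yarn-site.xml.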
I want to import data from MySQL into a remote Hive instance using Sqoop. I have installed Sqoop on a middleware machine. When I run this command:
sqoop import --driver com.mysql.jdbc.Driver --connect jdbc:mysql://192.168.2.146:3306/fir --username root -P -m 1 --table beard_size_list --connect jdbc:hive2://192.168.2.141:10000/efir --username oracle -P -m 1 --hive-table lnd_beard_size_list --hive-import;
Is this command correct? Can I import data from remote MySQL to remote Hive this way?
When I run this command, it keeps trying to connect to the ResourceManager:
17/11/01 10:54:05 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.6.1.0-129
Enter password:
17/11/01 10:54:10 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
17/11/01 10:54:10 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
17/11/01 10:54:10 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
17/11/01 10:54:10 INFO manager.SqlManager: Using default fetchSize of 1000
17/11/01 10:54:10 INFO tool.CodeGenTool: Beginning code generation
17/11/01 10:54:11 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM beard_size_list AS t WHERE 1=0
17/11/01 10:54:11 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM beard_size_list AS t WHERE 1=0
17/11/01 10:54:11 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.6.1.0-129/hadoop-mapreduce
Note: /tmp/sqoop-oracle/compile/d93080265a09913fbfe9e06e92d314a3/beard_size_list.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/11/01 10:54:15 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-oracle/compile/d93080265a09913fbfe9e06e92d314a3/beard_size_list.jar
17/11/01 10:54:15 INFO mapreduce.ImportJobBase: Beginning import of beard_size_list
17/11/01 10:54:15 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
17/11/01 10:54:15 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM beard_size_list AS t WHERE 1=0
17/11/01 10:54:17 INFO Configuration.deprecation: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
17/11/01 10:54:17 INFO client.RMProxy: Connecting to ResourceManager at hortonworksn2.com/192.168.2.191:8050
17/11/01 10:54:17 INFO client.AHSProxy: Connecting to Application History server at hortonworksn2.com/192.168.2.191:10200
17/11/01 10:54:19 INFO ipc.Client: Retrying connect to server: hortonworksn2.com/192.168.2.191:8050. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
17/11/01 10:54:20 INFO ipc.Client: Retrying connect to server: hortonworksn2.com/192.168.2.191:8050. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
17/11/01 10:54:21 INFO ipc.Client: Retrying connect to server: hortonworksn2.com/192.168.2.191:8050. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
17/11/01 10:54:22 INFO ipc.Client: Retrying connect to server: hortonworksn2.com/192.168.2.191:8050. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
17/11/01 10:54:23 INFO ipc.Client: Retrying connect to server: hortonworksn2.com/192.168.2.191:8050. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
The port it is trying to connect to is 8050, but the actual ResourceManager port is 8033. How can I fix this?
Try the command below:
sqoop import --driver com.mysql.jdbc.Driver --connect jdbc:mysql://192.168.2.146:3306/fir --username root -P -m 1 --table beard_size_list
Also check that the property below is set correctly in yarn-site.xml:
<property>
  <name>yarn.resourcemanager.address</name>
  <value>192.168.2.191:8033</value>
</property>
Why have you added the --connect option twice in your command? Try the code below:
sqoop import \
--driver com.mysql.jdbc.Driver \
--connect jdbc:mysql://192.168.2.146:3306/fir \
--username root -P -m 1 \
--split-by beard_size_list_table_primary_key \
--table beard_size_list \
--target-dir /user/data/raw/beard_size_list \
--fields-terminated-by "," \
--hive-import \
--create-hive-table \
--hive-table dbschema.beard_size_list
Note:
--create-hive-table makes the job fail if the Hive table already exists. That works in this case; otherwise you would have to create a Hive external table and point it at the target-dir path.
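For that alternative, the external table would be created in Hive roughly like this (the column list is hypothetical, since the question does not show the schema of beard_size_list):
CREATE EXTERNAL TABLE dbschema.beard_size_list (
  id INT,             -- hypothetical columns; replace with the real schema
  beard_size STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/data/raw/beard_size_list';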
I am trying to run the command below in Cloudera and am getting a communications link failure error. I have tried restarting the mysqld service too, to no avail. Could someone please help?
Code and error:
[cloudera@quickstart ~]$ sqoop list-databases --connect "jdbc:mysql://quickstart.cloudera:3306" --username=retail_dba --password=cloudera
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
17/09/22 09:45:59 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.10.0
17/09/22 09:45:59 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/09/22 09:45:59 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
17/09/22 09:46:16 ERROR manager.CatalogQueryManager: Failed to list databases
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure
The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
Download mysql-connector-java-5.1.21.jar and copy it into the Sqoop lib folder, then run the list-databases command again as follows:
sqoop list-databases \
--connect "jdbc:mysql://localhost:3306" \
--username=retail_dba \
--password=cloudera
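On the Cloudera quickstart VM the Sqoop lib folder is usually /usr/lib/sqoop/lib (the Accumulo warning in your output references /usr/lib/sqoop), so the copy step would look roughly like:
sudo cp mysql-connector-java-5.1.21.jar /usr/lib/sqoop/lib/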
Is there anything wrong with the command below? It's not working for me.
sqoop import-all-tables
--connect jdbc:mysql://localhost/retail_db --username=retail_dba
-- compression-codec=snappy
--as-parquetfile --hive-import -m 1
16/08/17 08:34:07 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
16/08/17 08:34:07 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
16/08/17 08:34:07 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
16/08/17 08:34:08 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
16/08/17 08:34:08 INFO tool.CodeGenTool: Beginning code generation
16/08/17 08:34:08 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.NullPointerException
java.lang.NullPointerException
Can you post the complete error you are getting?
Does your error contain a "Class not found" exception? If yes, go to $HADOOP_HOME/etc/hadoop, find mapred-site.xml, and change/add the property mapreduce.framework.name with the value yarn. Save the file and restart Hadoop.
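In mapred-site.xml the entry looks like this (a minimal sketch; merge it into your existing <configuration> block):
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>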
I am using macOS for the Hadoop stack, with MySQL as its database. I am trying to execute a Sqoop import command:
sqoop import --connect jdbc:mysql://127.0.0.1/emp --table employee --username root --password reality --m 1 --target-dir /sqoop_import
But I am facing the issue below while executing it. Even in /etc/hosts, localhost is mapped to 127.0.0.1 (see the hosts file screenshot). I have tried pinging localhost and it works, but the "host is down" error still prevails. Please help.
2016-02-06 17:42:38,267 INFO [main] mapreduce.Job: Job job_1454152643692_0010 failed with state FAILED due to: Application application_1454152643692_0010 failed 2 times due to Error launching appattempt_1454152643692_0010_000002. Got exception: java.io.IOException: Failed on local exception: java.net.SocketException: Host is down; Host Details : local host is: "Mohits-MacBook-Pro.local/192.168.0.103"; destination host is: "192.168.0.105":38183;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:1472)
at org.apache.hadoop.ipc.Client.call(Client.java:1399)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
at com.sun.proxy.$Proxy32.startContainers(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:96)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.launch(AMLauncher.java:119)
at org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher.run(AMLauncher.java:254)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.net.SocketException: Host is down
at sun.nio.ch.Net.connect0(Native Method)
at sun.nio.ch.Net.connect(Net.java:454)
at sun.nio.ch.Net.connect(Net.java:446)
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:648)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:192)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
at org.apache.hadoop.ipc.Client.call(Client.java:1438)
... 9 more
. Failing the application.
2016-02-06 17:42:38,304 INFO [main] mapreduce.Job: Counters: 0
2016-02-06 17:42:38,321 WARN [main] mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
2016-02-06 17:42:38,326 INFO [main] mapreduce.ImportJobBase: Transferred 0 bytes in 125.7138 seconds (0 bytes/sec)
2016-02-06 17:42:38,349 WARN [main] mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2016-02-06 17:42:38,350 INFO [main] mapreduce.ImportJobBase: Retrieved 0 records.
2016-02-06 17:42:38,351 ERROR [main] tool.ImportTool: Error during import: Import job failed!
I see that you are having network issues. The log above shows that the local host resolves to Mohits-MacBook-Pro.local/192.168.0.103 on your Unix system, while the destination it is trying to connect to is "192.168.0.105":38183. Please go to your Unix system, open the /etc/hosts file, and make sure localhost is mapped to 127.0.0.1.
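For a single-node setup, the relevant /etc/hosts entries would look something like this (hostname taken from the log above; adjust it to your machine — pinning the machine's own name to the loopback address keeps YARN from chasing stale LAN addresses):
127.0.0.1    localhost
127.0.0.1    Mohits-MacBook-Pro.local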
I am using hadoop-1.0.4 on Amazon EC2 with 3 Ubuntu 12.10 instances, 1 master and 2 slaves, installed directly under the ~ directory.
start-all.sh and stop-all.sh run OK, but when I run jps on the master or slaves, it prints nothing. Then I tested the Hadoop examples:
~/hadoop$ bin/hadoop jar hadoop-examples-1.0.4.jar pi 10 10000
It shows
Exception in thread "main" java.io.IOException: Permission denied
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:1879)
at org.apache.hadoop.util.RunJar.main(RunJar.java:115)
However, I have already run chmod -R 777 on the tmp folders.
~/hadoop$ sudo bin/hadoop jar hadoop-examples-1.0.4.jar pi 10 10000
With sudo, it produces
13/05/12 03:58:11 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively
Number of Maps = 10
Samples per Map = 10000
13/05/12 03:58:12 WARN fs.FileSystem: "54.235.101.85:50001" is a deprecated filesystem name. Use "hdfs://54.235.101.85:50001/" instead.
13/05/12 03:58:13 INFO ipc.Client: Retrying connect to server: hdmaster/54.235.101.85:50001. Already tried 0 time(s).
13/05/12 03:58:14 INFO ipc.Client: Retrying connect to server: hdmaster/54.235.101.85:50001. Already tried 1 time(s).
13/05/12 03:58:15 INFO ipc.Client: Retrying connect to server: hdmaster/54.235.101.85:50001. Already tried 2 time(s).
Then it failed to connect. So what is the problem? Should I use sudo to run the examples? Thanks a lot.
I think the problem is that 54.235.101.85 is a public IP address; on EC2 the instance's network interface only carries the private address, so daemons cannot bind to the public one. Run ifconfig on all the nodes to get a list of IP addresses and check for ones beginning with 10.x.x.x, 172.x.x.x, or 192.x.x.x. If you find any, modify your configuration files on all the nodes accordingly.
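For example, with hadoop-1.0.4, if ifconfig on the master showed a private address such as 10.0.0.1 (a hypothetical value), core-site.xml on every node would point the filesystem at that address instead:
<property>
  <name>fs.default.name</name>
  <value>hdfs://10.0.0.1:50001/</value>
</property>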