Sqoop import into secure HBase fails - MySQL

I am using Hadoop 2.6.0 with Kerberos security. I have installed HBase with Kerberos security and am able to create a table and scan it.
I can also run a Sqoop job to import data from MySQL into HDFS, but the Sqoop job fails when trying to import from MySQL into HBase.
Sqoop Command
sqoop import --hbase-create-table --hbase-table newtable --column-family ck --hbase-row-key id --connect jdbc:mysql://localhost/sample --username root --password root --table newtable -m 1
Exception
15/01/21 16:30:24 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=localhost:2181 sessionTimeout=90000 watcher=hconnection-0x734c0647, quorum=localhost:2181, baseZNode=/hbase
15/01/21 16:30:24 INFO zookeeper.ClientCnxn: Opening socket connection to server 127.0.0.1/127.0.0.1:2181. Will not attempt to authenticate using SASL (unknownerror)
15/01/21 16:30:24 INFO zookeeper.ClientCnxn: Socket connection established to 127.0.0.1/127.0.0.1:2181, initiating session
15/01/21 16:30:24 INFO zookeeper.ClientCnxn: Session establishment complete on server 127.0.0.1/127.0.0.1:2181, sessionid = 0x14b0ac124600016, negotiated timeout = 40000
15/01/21 16:30:25 ERROR tool.ImportTool: Error during import: Can't get authentication token

Could you please try the following:
In the connection string, add the port number:
jdbc:mysql://localhost:3306/sample
Remove --table newtable. Create the required table on HBase first with the column family.
Mention --split-by id.
Finally, mention a specific --fetch-size, as the Sqoop client for MySQL has an internal bug that attempts to set the default MIN fetch size, which runs into an exception.
Could you attempt the import again and let us know?
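For reference, here is a sketch of a revised command under one reading of those suggestions (it keeps --table for the MySQL source table and drops --hbase-create-table in favour of creating the HBase table by hand; the port, split column, and fetch size are assumptions to adapt to your setup):
# Assumed prerequisite in the hbase shell: create 'newtable', 'ck'
sqoop import \
  --connect jdbc:mysql://localhost:3306/sample \
  --username root --password root \
  --table newtable \
  --hbase-table newtable --column-family ck --hbase-row-key id \
  --split-by id \
  --fetch-size 1000 \
  -m 1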

Related

ShardingSphere-Proxy DistSQL, Invalid data source: Communications link failure

I am trying to connect to the existing tables in the MySQL databases db0 and db1 by registering them as storage units using DistSQL on a container running ShardingSphere-Proxy, with the following command:
REGISTER STORAGE UNIT ds_0 (
HOST="127.0.0.1",
PORT=3306,
DB="db0",
USER="root",
PASSWORD="blah"
),ds_1 (
HOST="127.0.0.1",
PORT=3306,
DB="db1",
USER="root",
PASSWORD="blah"
);
When I connect to the MySQL instance above directly from a terminal it works, but through the ShardingSphere-Proxy running in Docker it fails with the error shown below.
ERROR 19000 (44000): Can not process invalid storage units, error message is: [Invalid data source ds_0, error message is: Communications link failure, Invalid data source ds_1, error message is: Communications link failure]
Steps to reproduce
On my local DB:
mysql -h 127.0.0.1 --user=root --password=blah
mysql>create database db0;
mysql>create database db1;
Create and connect to ShardingSphere-Proxy:
docker run -d -v /Users/pavankrn/Documents/tech/sspheredock/pgsphere/apache-shardingsphere-5.3.1-shardingsphere-proxy-bin/conf:/opt/shardingsphere-proxy/conf -v /Users/pavankrn/Documents/tech/sspheredock/pgsphere/apache-shardingsphere-5.3.1-shardingsphere-proxy-bin/ext-lib:/opt/shardingsphere-proxy/ext-lib -e PORT=3308 -p13308:3308 apache/shardingsphere-proxy:latest
mysql --host=127.0.0.1 --user=root -p --port=13308 sharding_db
On ShardingSphere-Proxy's MySQL terminal:
use sharding_db;
REGISTER STORAGE UNIT ds_0 (
HOST="127.0.0.1",
PORT=3306,
DB="db0",
USER="root",
PASSWORD="blah"
),ds_1 (
HOST="127.0.0.1",
PORT=3306,
DB="db1",
USER="root",
PASSWORD="blah"
);
Since I've been using a Mac, I replaced HOST="127.0.0.1" with HOST="host.docker.internal" in the REGISTER STORAGE UNIT DistSQL command, as shown below.
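For clarity, this is what the registration looks like after that change, with the same databases and credentials as in the question (only the HOST value differs):
-- host.docker.internal lets the proxy container reach MySQL running on the Docker host
REGISTER STORAGE UNIT ds_0 (
HOST="host.docker.internal",
PORT=3306,
DB="db0",
USER="root",
PASSWORD="blah"
),ds_1 (
HOST="host.docker.internal",
PORT=3306,
DB="db1",
USER="root",
PASSWORD="blah"
);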
For other workarounds refer to the thread here.

Import data from mysql into HDFS using Sqoop

I am using Hadoop-1.2.1 and Sqoop-1.4.6. I am using sqoop to import the table test from the database meshtree into HDFS using this command:
`sqoop import --connect jdbc:mysql://localhost/meshtree --username user --password password --table test`
But, it shows this error:
17/06/17 18:15:21 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
17/06/17 18:15:21 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
17/06/17 18:15:21 INFO tool.CodeGenTool: Beginning code generation
17/06/17 18:15:22 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `test` AS t LIMIT 1
17/06/17 18:15:22 INFO orm.CompilationManager: HADOOP_HOME is /home/student/Installations/hadoop-1.2.1/libexec/..
Note: /tmp/sqoop-student/compile/6bab6efaa3dc60e67a50885b26c1d14b/test.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
17/06/17 18:15:24 ERROR orm.CompilationManager: Could not rename /tmp/sqoop-student/compile/6bab6efaa3dc60e67a50885b26c1d14b/test.java to /home/student/Installations/hadoop-1.2.1/./test.java
org.apache.commons.io.FileExistsException: Destination '/home/student/Installations/hadoop-1.2.1/./test.java' already exists
at org.apache.commons.io.FileUtils.moveFile(FileUtils.java:2378)
at org.apache.sqoop.orm.CompilationManager.compile(CompilationManager.java:227)
at org.apache.sqoop.tool.CodeGenTool.generateORM(CodeGenTool.java:83)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:367)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:453)
at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
at com.cloudera.sqoop.Sqoop.main(Sqoop.java:57)
17/06/17 18:15:24 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-student/compile/6bab6efaa3dc60e67a50885b26c1d14b/test.jar
17/06/17 18:15:24 WARN manager.MySQLManager: It looks like you are importing from mysql.
17/06/17 18:15:24 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
17/06/17 18:15:24 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
17/06/17 18:15:24 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
17/06/17 18:15:24 INFO mapreduce.ImportJobBase: Beginning import of test
17/06/17 18:15:27 INFO mapred.JobClient: Cleaning up the staging area hdfs://localhost:9000/home/student/Installations/hadoop-1.2.1/data/mapred/staging/student/.staging/job_201706171814_0001
17/06/17 18:15:27 ERROR security.UserGroupInformation: PriviledgedActionException as:student cause:org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory test already exists
17/06/17 18:15:27 ERROR tool.ImportTool: Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory test already exists
at org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.checkOutputSpecs(FileOutputFormat.java:137)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:973)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:936)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
at org.apache.hadoop.mapreduce.Job.submit(Job.java:550)
at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:580)
at org.apache.sqoop.mapreduce.ImportJobBase.runJob(ImportJobBase.java:141)
at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:201)
at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:413)
at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:97)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:380)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:453)
at org.apache.sqoop.Sqoop.run(Sqoop.java:145)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:181)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:220)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:229)
at org.apache.sqoop.Sqoop.main(Sqoop.java:238)
at com.cloudera.sqoop.Sqoop.main(Sqoop.java:57)
Is there any way to fix this problem?
It’s important that you do not use the URL localhost if you intend to use Sqoop with a distributed Hadoop cluster. The connect string you supply will be used on TaskTracker nodes throughout your MapReduce cluster; if you specify the literal name localhost, each node will connect to a different database (or more likely, no database at all). Instead, you should use the full hostname or IP address of the database host that can be seen by all your remote nodes.
Please see the "Connecting to a Database Server" section of the Sqoop documentation for more information.
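As an illustrative sketch (the hostname shown is a placeholder, not from the question), the connect string would name the database host that every node can resolve instead of localhost:
# Use the database server's resolvable hostname (e.g. the output of `hostname -f` on that machine)
sqoop import --connect jdbc:mysql://db-host.example.com:3306/meshtree \
  --username user -P --table test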
You don't have permissions, so contact your MySQL DBA to grant them.
Or you can do it yourself if you have admin access to MySQL:
grant all privileges on databasename.* to 'username'@'%' identified by 'password';
* - for all tables
% - allow from any host
The above syntax grants permissions to a user on the MySQL server. In your case it will be:
grant all privileges on meshtree.test to 'root'@'localhost' identified by 'yourpassword';
You are importing without providing a target directory in HDFS. When no target directory is provided, Sqoop runs the import and creates a directory in HDFS named after your MySQL table.
So your query
sqoop import --connect jdbc:mysql://localhost/meshtree --username user --password password --table test
creates a directory named test in HDFS.
Just add the --target-dir option:
sqoop import --connect jdbc:mysql://localhost/meshtree --username user --password password --table test --target-dir test1
Hopefully this works fine; refer to the Sqoop import documentation and the related Sqoop guides for more.
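Alternatively (this is an assumption drawn from the error messages above, not part of the original answer), you can clean up the leftovers of the failed run before importing again:
# Remove the HDFS output directory created by the previous import (Hadoop 1.x syntax)
hadoop fs -rmr test
# Remove the stale generated class file that caused the rename error (path taken from the log above)
rm /home/student/Installations/hadoop-1.2.1/test.java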

Using Sqoop to import data from MySQL to Hadoop fails

I tried to import data through Sqoop using the following command.
sqoop import -connect jdbc:mysql://localhost/test_sqoop --username root --table test
but I got a connection refused error.
And I found out I can't connect to mysql and got this error:
Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock'
I also found out that if I don't execute start-dfs.sh, mysql.sock exists at /var/lib/mysql/mysql.sock.
After I execute start-dfs.sh, mysql.sock is gone and I can't connect to MySQL.
Below is the /etc/my.cnf configuration.
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
The JDBC string should be jdbc:mysql://localhost:3306/test_sqoop. Best practice is to use the server name instead of localhost or 127.0.0.1; you can get the server name from the command hostname -f. So the JDBC string should be jdbc:mysql://servername:3306/test_sqoop, replacing the server name with the output of the hostname -f command.
You also need -P, --password, or --connection-param-file to pass the password to the sqoop command; Sqoop doesn't read from the .my.cnf file. See the usage here.
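Putting both points together, a sketch of the corrected command could look like this (assuming a bash shell, and -P so Sqoop prompts for the password):
# $(hostname -f) expands to this machine's fully qualified hostname
sqoop import --connect jdbc:mysql://$(hostname -f):3306/test_sqoop \
  --username root -P --table test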

Sqoop MySQL Data import error

I am new to the Hadoop world and have just started learning about Hadoop.
I am getting the error below while importing data from MySQL to HDFS using Sqoop:
sqoop:000> sqoop import --connect jdbc:mysql://localhost/books --username root --password thanks --table authors --m 1;
Exception has occurred during processing command
Exception: org.codehaus.groovy.control.MultipleCompilationErrorsException Message: startup failed:
groovysh_parse: 1: expecting EOF, found 'import' # line 1, column 7.
sqoop import --connect jdbc:mysql://localhost/books --username root --password thanks --table authors --m 1;
^
1 error
Could you help me fix this error?
It seems that you are using Sqoop 2.
You need to follow these steps:
Step 1
Check if you have installed Sqoop correctly:
sqoop:000> show version --all
You should get a response something like this:
Server version: Sqoop 2.0.0-SNAPSHOT revision Unknown Compiled by jarcec on Wed Nov 21 16:15:51 PST 2012
Client version: Sqoop 2.0.0-SNAPSHOT revision Unknown Compiled by jarcec on Wed Nov 21 16:15:51 PST 2012
Protocol version: [1]
Step 2
Check what connectors are available on your Sqoop server:
sqoop:000> show connector --all
1 connector(s) to show:
Connector with id 1:
  Name: generic-jdbc-connector
  Class: org.apache.sqoop.connector.jdbc.GenericJdbcConnector
  Supported job types: [EXPORT, IMPORT]
Step 3
sqoop:000> create connection --cid 1
Creating connection for connector with id 1
Please fill following values to create new connection object
Name: First connection
Configuration configuration
JDBC Driver Class: com.mysql.jdbc.Driver
JDBC Connection String: jdbc:mysql://mysql.server/database
Username: sqoop
Password: *****
JDBC Connection Properties: There are currently 0 values in the map:
entry#
Security related configuration options
Max connections: 0
New connection was successfully created with validation status FINE and persistent id 1
Step 4
Now create a job for importing data.
At the end it will also ask for extractors and loaders; use 1 as the value for both.
sqoop:000> create job --xid 1 --type import
Creating job for connection with id 1
Please fill following values to create new job object
Name: First job
Database configuration
Table name: users
Table SQL statement:
Table column names:
Partition column name:
Boundary query:
Output configuration
Storage type:
  0 : HDFS
Choose: 0
Output directory: /user/jarcec/users
New job was successfully created with validation status FINE and persistent id 1
Step 5
Now start the job:
sqoop:000> start job --jid 1
and import your data
You need to pass the --target-dir argument with the HDFS path where Sqoop should copy the MySQL records.
try:
sqoop import --connect jdbc:mysql://localhost/books --username root --password thanks --table authors --target-dir /mysqlCopy --m 1;

MySQL to HBase using Sqoop: Driver issue

I am new to Sqoop. I am trying to import data from MySQL to HBase, which is why I have to use the database connector for MySQL. The path to my connector file on the server is /usr/lib/sqoop2/lib/mysql-connector-java-5.1.6.jar. The database name is testhadoop and the table I am using is employee. The command I enter is:
root#server:~# sqoop import --connect jdbc:mysql//localhost/testhadoop --driver com.mysql.jdbc.Driver --username root --table mytable
After hitting the Enter key, I have to enter the root password, and then a long error message comes up:
13/09/12 17:39:16 WARN sqoop.ConnFactory: Parameter --driver is set to an
explicit driver however appropriate connection manager is not being set
(via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
13/09/12 17:39:16 INFO manager.SqlManager: Using default fetchSize of 1000
13/09/12 17:39:16 INFO tool.CodeGenTool: Beginning code generation
13/09/12 17:39:16 ERROR manager.SqlManager:
Error executing statement: java.sql.SQLException:
No suitable driver found for jdbc:mysql//localhost/testhadoop
Please tell me how to get rid of this problem.
Based on the command line, it seems that you are using Sqoop 1.x, whereas the JDBC driver is on the path for Sqoop 2. I would recommend copying the jar file mysql-connector-java-5.1.6.jar to /usr/lib/sqoop/lib instead, so that it's available to Sqoop 1.
I would also strongly suggest dropping the --driver parameter, as it forces Sqoop to fall back to the Generic JDBC Connector instead of the specialized MySQL connector.
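A minimal sketch of those two changes (the jar path comes from the question; -P is an assumption so the password is prompted for, and note the connect string needs the colon after jdbc:mysql that the original command was missing):
# Make the MySQL JDBC driver visible to Sqoop 1
cp /usr/lib/sqoop2/lib/mysql-connector-java-5.1.6.jar /usr/lib/sqoop/lib/
# Re-run without --driver so Sqoop picks its MySQL-specific connection manager
sqoop import --connect jdbc:mysql://localhost/testhadoop --username root -P --table mytable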