When I open the Hadoop UI in a browser, I get this error:
Path does not exist on HDFS or WebHDFS is disabled. Please check your path or enable WebHDFS
Can you tell me what I am missing and how I can fix this error?
My config:
hdfs-site.xml
<configuration>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>file:///opt/volume/datanode</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>file:///opt/volume/namenode</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
By default, it should automatically select path = '/', but it does not.
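A quick way to check whether WebHDFS itself responds is to query its REST endpoint directly. This is only a sketch: it assumes the NameNode runs on localhost and uses the default HTTP port (50070 on Hadoop 2.x, 9870 on 3.x).
curl -i "http://localhost:50070/webhdfs/v1/?op=LISTSTATUS"
If this returns a JSON FileStatuses listing for '/', WebHDFS is enabled and reachable, and the problem is more likely on the UI side.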
I'm using Apache Spark 2.1.1 and I have put the following hive-site.xml in the $SPARK_HOME/conf folder:
<?xml version="1.0"?>
<configuration>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://mysql_server:3306/hive_metastore?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
<description>username to use against metastore database</description>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>password</value>
<description>password to use against metastore database</description>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
<description>Disable metastore schema version verification</description>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>${test.tmp.dir}/hadoop-tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>hive.metastore.warehouse.dir</name>
<value>hdfs://hadoop_namenode:9000/value_iq/hive_warehouse/</value>
<description>Warehouse Location</description>
</property>
</configuration>
When I start the thrift server, the metastore schema is created in my MySQL DB but is not used; Derby is used instead.
I could not find any error in the thrift server log file. The only thing that catches my attention is that it attempts to use MySQL at first (INFO MetaStoreDirectSql: Using direct SQL, underlying DB is MYSQL) but then, without any error, uses Derby instead (INFO MetaStoreDirectSql: Using direct SQL, underlying DB is DERBY). This is the thrift server log: https://www.dropbox.com/s/rxfwgjm9bdccaju/spark-root-org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-s-master.value-iq.com.out?dl=0
I don't have Hive installed on my system; I just intend to use the built-in Hive that ships with Apache Spark.
I'm using mysql-connector-java-5.1.23-bin.jar, which is located in the $SPARK_HOME/jars folder.
As it appears from your hive-site.xml, you have not set a metastore service to connect to, so Spark will use the default: a local metastore service with a Derby DB backend.
In order to use a metastore service that has MySQL as its backend, you have to:
Start the metastore service. You can have a look at the Hive metastore admin manual for how to start the service. Start the metastore service with the MySQL DB backend, using your same hive-site.xml, and add the following lines so that the metastore service runs on METASTORESERVER on port XXXX:
<property>
<name>hive.metastore.uris</name>
<value>thrift://METASTORESERVER:XXXX</value>
</property>
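A minimal sketch of starting the service on METASTORESERVER, assuming a Hive distribution is available there (the metastore listens on port 9083 by default):
hive --service metastore &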
Let Spark know where the metastore service has started. That can be done using the same hive-site.xml you used when starting the metastore service (with the lines above added to it): copy this file into the configuration path of Spark, then restart your Spark thrift server.
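A sketch of that last step, assuming the updated hive-site.xml is in the current directory:
cp hive-site.xml $SPARK_HOME/conf/
$SPARK_HOME/sbin/stop-thriftserver.sh
$SPARK_HOME/sbin/start-thriftserver.sh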
all.sh, I get this error:
Incorrect configuration: namenode address
dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not
configured. Stopping namenodes on []
I looked at all the .xml files and I can't seem to find any problem. Please find the XML configurations below.
hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_tmp/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_tmp/hdfs/datanode</value>
</property>
</configuration>
core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value> hdfs://localhost:9000</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.Shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
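In case it helps, a quick way to see which NameNode address the daemons actually resolve from these files (assuming HADOOP_CONF_DIR points at this directory):
hdfs getconf -confKey fs.default.name
hdfs getconf -namenodes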
Thank you.
I got an exception when I execute the command sudo -u hdfs hdfs balancer -threshold 5.
Here is the exception:
RuntimeException: java.lang.IllegalArgumentException: java.net.UnknownHostException: nameservice1
Here is my core-site.xml.
<property>
<name>fs.defaultFS</name>
<value>hdfs://nameservice1</value>
</property>
Here is my hdfs-site.xml.
<property>
<name>dfs.nameservices</name>
<value>nameservice1</value>
</property>
<property>
<name>dfs.ha.namenodes.nameservice1</name>
<value>nn1,nn2</value>
</property>
Can someone help me?
I ran into this problem when setting up HA. The problem was that I had set dfs.client.failover.proxy.provider.mycluster exactly as written in the reference documentation. When I replaced mycluster with my own nameservice name, everything worked!
Reference: https://issues.apache.org/jira/browse/HDFS-12109
You can try adding the port number in core-site.xml:
<property>
<name>fs.defaultFS</name>
<value>hdfs://nameservice1:9000</value>
</property>
And make sure your machine's /etc/hosts file has an entry for nameservice1.
For example (say your machine's IP is 192.168.30.102):
127.0.0.1 localhost
192.168.30.102 nameservice1
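You can confirm the name resolves with a quick lookup:
getent hosts nameservice1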
Also make sure the failover proxy provider is defined for nameservice1 in hdfs-site.xml:
<property>
<name>dfs.client.failover.proxy.provider.nameservice1</name>
<value>
org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
</value>
</property>
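After these changes, a quick sanity check that the client sees both namenodes before re-running the balancer:
hdfs getconf -namenodes
sudo -u hdfs hdfs balancer -threshold 5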
We are installing Cloudera CDH4 on Ubuntu 12.04 LTS, and during installation we are stuck at starting the Hive metastore. We have configured the metastore with MySQL as recommended in the download documentation.
It is giving us the following error:
/usr/lib/hive/conf$ sudo service hive-metastore status
* Hive Metastore is dead and pid file exists
In the log file it shows the following error:
ERROR metastore.HiveMetaStore (HiveMetaStore.java:main(4153)) - Metastore Thrift Server threw an exception...
org.apache.thrift.transport.TTransportException: No keytab specified
Following is our hive-site.xml file:
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://my-local-system-ip:3306/metastore?createDatabaseIfNotExist=true</value>
<description>JDBC connect string for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>org.apache.derby.jdbc.EmbeddedDriver</value>
<value>com.mysql.jdbc.Driver</value>
<description>Driver class name for a JDBC metastore</description>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>my-password</value>
</property>
<property>
<name>datanucleus.autoCreateSchema</name>
<value>false</value>
</property>
<property>
<name>datanucleus.fixedDatastore</name>
<value>true</value>
</property>
<property>
<name>datanucleus.autoStartMechanism</name>
<value>SchemaTable</value>
</property>
<property>
<name>hive.aux.jars.path</name>
<value>file:///usr/share/java/mysql-connector-java.jar</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://<FQDN>:9083</value>
</property>
<property>
<name>hive.support.concurrency</name>
<description>Enable Hive's Table Lock Manager Service</description>
<value>true</value>
</property>
<property>
<name>hive.metastore.local</name>
<description>Enable Hive's Table Lock Manager Service</description>
<value>false</value>
</property>
<property>
<name>hive.server2.authentication</name>
<value>KERBEROS</value>
</property>
<property>
<name>hive.server2.authentication.kerberos.principal</name>
<value>hive/_HOST#<my-domain-name></value>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10001</value>
<description>TCP port number to listen on, default 10000</description>
</property>
<property>
<name>hive.server2.authentication.kerberos.keytab</name>
<value>/etc/hive/conf/hive.keytab</value>
</property>
<property>
<name>hive.zookeeper.quorum</name>
<description>Zookeeper quorum used by Hive's Table Lock Manager</description>
<value>FQDN</value>
</property>
<property>
<name>hive.metastore.sasl.enabled</name>
<value>true</value>
</property>
<property>
<name>hive.zookeeper.client.port</name>
<value>2181</value>
<description>
The port at which the clients will connect.
</description>
</property>
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
<property>
<name>hive.server2.thrift.sasl.qop</name>
<value>auth</value>
<description>Sasl QOP value; one of 'auth', 'auth-int' and 'auth-conf'</description>
</property>
<property>
<name>hive.metastore.client.socket.timeout</name>
<value>3600</value>
<description>MetaStore Client socket timeout in seconds</description>
</property>
Our main focus is to install Impala. If we use the default Derby, the Hive metastore works perfectly, but when we start impala-shell it shows Not Connected. What can we do to rectify this?
Can anybody help us out with this error?
I think the issue is that you're missing the following parameter:
<property>
<name>hive.metastore.kerberos.keytab.file</name>
<value>/etc/hive/conf/hive.keytab</value>
<description>The path to the Kerberos Keytab file containing the metastore thrift server's service principal.</description>
</property>
I see you do have hive.server2.authentication.kerberos.keytab, but it appears this is not enough.
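If it is not already set elsewhere, the metastore principal is usually configured alongside that keytab; a sketch, with YOUR-REALM.COM standing in for your actual Kerberos realm:
<property>
<name>hive.metastore.kerberos.principal</name>
<value>hive/_HOST@YOUR-REALM.COM</value>
</property>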
replace "my-domain-name" in hive.server2.authentication.kerberos.principal property with your domain name. That is the third part which is missing in hive principal.
At first I could not get the JobTracker and TaskTrackers to run, so I replaced all IPs like 10.112.57.243 with hdmaster in the XML files and changed mapred.job.tracker to an hdfs:// address. Later I formatted the namenode while Hadoop was running, and then it turned into a disaster. I found the error message from the title in the logs; I even tried removing everything in /tmp and the HDFS tmp directory and restarting, but it is still the same. So how can I get rid of this error and get the namenode running again? Thanks a lot.
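Roughly, the cleanup I tried looks like this (a sketch; the tmp path is the one from my core-site.xml below, and the scripts are the stock Hadoop 1.x ones):
stop-all.sh
# hadoop.tmp.dir from my config, plus the default /tmp location (assumed)
rm -rf /home/ubuntu/hadoop/tmp/* /tmp/hadoop-*
start-all.sh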
core-site.xml
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://hdmaster:50000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/ubuntu/hadoop/tmp</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
<description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified in create time.</description>
</property>
</configuration>
hadoop-site.xml
<configuration>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/ubuntu/hadoop/tmp</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://hdmaster:50000</value>
</property>
<property>
<name>mapred.job.tracker</name>
<value>hdfs://hdmaster:50001</value>
</property>
</configuration>
mapred-site.xml
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>hdfs://hdmaster:50001</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>/home/ubuntu/hadoop/system</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>/home/ubuntu/hadoop/var</value>
</property>
</configuration>