I am using hadoop-1.0.4 on Amazon EC2 with 3 Ubuntu 12.10 instances, 1 master and 2 slaves, installed just under the ~ directory.
Now start-all.sh and stop-all.sh run fine, but when I run jps on the master or slaves, it prints nothing. Then I tested the Hadoop examples:
~/hadoop$ bin/hadoop jar hadoop-examples-1.0.4.jar pi 10 10000
It shows
Exception in thread "main" java.io.IOException: Permission denied
at java.io.UnixFileSystem.createFileExclusively(Native Method)
at java.io.File.createTempFile(File.java:1879)
at org.apache.hadoop.util.RunJar.main(RunJar.java:115)
However, I have already run chmod -R 777 on the tmp folders.
~/hadoop$ sudo bin/hadoop jar hadoop-examples-1.0.4.jar pi 10 10000
With sudo, it produces
13/05/12 03:58:11 WARN conf.Configuration: DEPRECATED: hadoop-site.xml
found in the classpath. Usage of hadoop-site.xml is deprecated.
Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to
override properties of core-default.xml, mapred-default.xml
and hdfs-default.xml respectively
Number of Maps = 10
Samples per Map = 10000
13/05/12 03:58:12 WARN fs.FileSystem: "54.235.101.85:50001" is a deprecated
filesystem name. Use "hdfs://54.235.101.85:50001/" instead.
13/05/12 03:58:13 INFO ipc.Client: Retrying connect to server:
hdmaster/54.235.101.85:50001. Already tried 0 time(s).
13/05/12 03:58:14 INFO ipc.Client: Retrying connect to server:
hdmaster/54.235.101.85:50001. Already tried 1 time(s).
13/05/12 03:58:15 INFO ipc.Client: Retrying connect to server:
hdmaster/54.235.101.85:50001. Already tried 2 time(s).
Then it failed to connect. So what is the problem? Should I use sudo to run the examples? Thanks a lot.
I think the problem is that 54.235.101.85 is a public IP address. Use ifconfig on all the nodes to get a list of IP addresses and check for an address beginning with 10.x.x.x, 172.x.x.x, or 192.x.x.x (the private ranges). If you find one, modify your configuration files on all the nodes accordingly.
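For example, a rough sketch of that change (the 10.x.x.x address below is a placeholder; use whatever ifconfig reports on your master):
~/hadoop$ ifconfig | grep 'inet addr'
# then, in conf/core-site.xml on every node, point fs.default.name at the private address:
#   <name>fs.default.name</name>
#   <value>hdfs://10.x.x.x:50001</value>
~/hadoop$ bin/stop-all.sh && bin/start-all.sh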
Previously my MySQL pod was stuck in Terminating status, so I tried to force delete it using a command like this:
kubectl delete pods <pod> --grace-period=0 --force
Later, when I tried to helm upgrade again, my pod got stuck in ContainerCreating status, with this event from the pod:
17s Warning FailedMount pod/db-mysql-primary-0 MountVolume.SetUp failed for volume "pvc-f32a6f84-d897-4e35-9595-680302771c54" : kubernetes.io/csi: mounter.SetUpAt failed to check for STAGE_UNSTAGE_VOLUME capability: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix /var/lib/kubelet/plugins/dobs.csi.digitalocean.com/csi.sock: connect: no such file or directory"
17s Warning FailedMount pod/db-mysql-secondary-0 MountVolume.SetUp failed for volume "pvc-61fc6eda-97fa-455f-ac2c-df8ebcb90f1c" : kubernetes.io/csi: mounter.SetUpAt failed to check for STAGE_UNSTAGE_VOLUME capability: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial unix /var/lib/kubelet/plugins/dobs.csi.digitalocean.com/csi.sock: connect: no such file or directory"
Can anyone please help me resolve this issue? Thanks a lot.
When you run the command
kubectl delete pods <pod> --grace-period=0 --force
you ask Kubernetes to forget the Pod, not to delete it. You have to be careful when using this command. You have to make sure that the Pod's containers are no longer running on the host, especially when they are mounted to a PVC. Probably the containers are still running and attached to the PVC.
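One way to check (a sketch; the attachment name will differ in your cluster) is to look for a stale VolumeAttachment for the affected PVC, and delete it only once you are sure the old pod's containers are gone:
kubectl get volumeattachments | grep pvc-f32a6f84-d897-4e35-9595-680302771c54
kubectl delete volumeattachment <attachment-name>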
pool-product-8jd40 0
spec:
  drivers: null
and on some of my pools the CSI driver is not ready (null); it's supposed to be equal to 1 (ready).
*Sorry, I can't attach the image yet.
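For anyone checking the same thing, the per-node driver status shown above can be inspected roughly like this (a sketch; pool-product-8jd40 is the node from the output above, and dobs.csi.digitalocean.com is the DigitalOcean CSI driver name):
kubectl get csinodes
kubectl get csinode pool-product-8jd40 -o yaml
# spec.drivers should list dobs.csi.digitalocean.com; null means the driver never registered on that node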
I have backed up data from a Couchbase 5.5 cluster, using the following command:
$ cbbackup http://couchbase:8091 ~/cbbackup -u *** -p ***
Then, I copied the ~/cbbackup files out of the cluster and onto my local machine.
I have a new Couchbase 6.5 cluster that I want to migrate the data to.
So then I copied ~/cbbackup into the new cluster.
However, when I try to restore it in the Couchbase 6.5 cluster this happens:
$ cbrestore ~/cbbackup http://couchbase:8091 -u *** -p ***
2020-05-23 14:11:47,209: s0 error: async operation: error: conn.sendall() exception: [Errno 104] Connection reset by peer on sink: http://couchbase:8091(b'default'@b'couchbase-0001.couchbase.couchbase.svc:8091')
2020-05-23 14:11:47,221: s2 error: async operation: error: conn.sendall() exception: [Errno 104] Connection reset by peer on sink: http://couchbase:8091(b'default'@b'couchbase-0003.couchbase.couchbase.svc:8091')
2020-05-23 14:11:47,226: s1 error: async operation: error: conn.sendall() exception: [Errno 104] Connection reset by peer on sink: http://couchbase:8091(b'default'@b'couchbase-0002.couchbase.couchbase.svc:8091')
error: conn.sendall() exception: [Errno 104] Connection reset by peer
How can I restore the backup from Couchbase 5.5 to my Couchbase 6.5 cluster?
Luckily, I know exactly what you need!
According to this chart of version compatibility, it should be possible for Couchbase 6.5 to restore backups from all the way back to Couchbase 5.0.
Why it's failing, though, I'm not sure. According to this thread on the Couchbase forums, it could be because of some problem handling xattrs [MB-31224] created by Sync Gateway; but again, I'm not sure.
However, after much trial and error, what worked for me once was to use cbbackup from 6.5 to make a backup of the 5.5 cluster. Then it's not cbbackup from 5.5 and cbrestore from 6.5, but rather both from 6.5. And it worked!
My setup was running in Kubernetes, so I did something like this:
$ kubectl run -i -t couchbase-migrate --image=couchbase/server:6.5.1 --restart=Never --rm=true --command -- /bin/bash
root@couchbase-migrate:/# cbbackup http://couchbase:8091 ~/cbbackup -u *** -p ***
...
Then I copied the backup out of the couchbase-migrate pod and onto my local machine.
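The copy itself was something along these lines (a sketch; run it from a second terminal while the pod is still alive, since --rm=true removes the pod on exit):
$ kubectl cp couchbase-migrate:/root/cbbackup ./cbbackup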
After that, I did the restore similarly:
$ kubectl run -i -t couchbase-migrate --image=couchbase/server:6.5.1 --restart=Never --rm=true --command -- /bin/bash
...
root@couchbase-migrate:/# cbrestore ~/cbbackup http://couchbase:8091 -u *** -p ***
...
There are compatibility issues with the backup and restore process among the various Couchbase versions.
From the problem statement, it seems like a one-time data migration.
If nothing else works: I had a similar use case, where I simply exported all the records from the older version of Couchbase to an external file by running a simple SELECT query, and then ingested them into the later version of Couchbase using a simple Java-based application.
The difference between this approach and the standard backup/restore process is that backup/restore also takes care of rebuilding the indexes, and is faster too.
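If you go the query route, the export step can be as simple as this sketch (it assumes the query service on its default port 8093 and a bucket named default; adjust both to your setup):
$ cbq -e http://couchbase:8093 -u *** -p *** --script='SELECT META(d).id, d.* FROM default AS d;' > export.json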
I have installed Hue on my local Ubuntu system and set up a Hadoop multi-node cluster across two systems.
Hadoop Version : 2.7.3
Hue Version : 3.12.0
Oozie Version : 4.3.0
I am facing an issue when I run a Sqoop job to import data from MySQL into HDFS. I am getting the following error.
Caused by: java.net.NoRouteToHostException: No Route to Host from Developer4/127.0.0.1 to cm:10020 failed on socket timeout exception: java.net.NoRouteToHostException: No route to host; For more details see: http://wiki.apache.org/hadoop/NoRouteToHost
HDFS URL: hdfs://master:9000
My /etc/hosts file looks like:
192.168.1.149 master
127.0.0.1 developer4
192.168.1.161 slave
Please suggest where I am going wrong. Even the Oozie start and stop commands work properly on the command line.
Since you have set up Hadoop on your localhost system, you need to add or modify the following property in the mapred-site.xml file:
<property>
  <name>mapreduce.jobhistory.address</name>
  <value>0.0.0.0:10020</value>
  <description>Host and port for the Job History Server (default 0.0.0.0:10020)</description>
</property>
After that you need to start the Job History Server with the command below.
sbin/mr-jobhistory-daemon.sh --config /home/developer4/hadoop-2.7.3/etc start historyserver
After this, the port will be listening on your localhost. Hope this helps.
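A quick way to verify it worked (assuming the default port 10020 from above):
jps | grep JobHistoryServer
netstat -ntlp | grep 10020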
When I enter the following command:
/etc/init.d/contextBroker start
I get the following output:
Starting contextBroker... cat: /var/run/contextBroker/contextBroker.pid: No such file or directory
pidfile not found [FAILED]
I have two machines where I am practising with the Context Broker, and I hadn't touched the second one in days after I successfully installed it and managed to receive a POST message from a remote weather station.
I see that the directory /var/run/contextBroker/ is actually empty.
What should I do to fix this now? Reinstall the Context Broker, or something else?
So is this somehow my fault, and how do I prevent it in the future? I don't want this happening when my app goes live.
EDIT1: The Orion version is 0.20.0.
EDIT2: I just reinstalled the contextBroker package and I get the same problem. What exactly are the contents of that directory? Could I maybe just create the files inside myself?
EDIT3: Since running contextBroker as a system service still yields an unsuccessful start, I also attempted to run it simply by typing contextBroker on the command line, after which I get the following response:
INFO@14:03:03 contextBroker.cpp[1346]: Orion Context Broker is running
[root@localhost DevF12]# INFO@14:03:03 MongoGlobal.cpp[181]: Successful connection to database
INFO@14:03:03 contextBroker.cpp[1157]: Connected to mongo at localhost:orion
INFO@14:03:03 MongoGlobal.cpp[499]: Database Operation Successful ({ conditions.type: "ONTIMEINTERVAL" })
FATAL@14:03:03 rest.cpp[1013]: Fatal Error (error starting REST interface)
EDIT4: Ok so I tried ps aux | grep contextBroker and the result is:
494 2196 0.0 7.0 688696 135116 ? Ssl Apr21 0:02 /usr/bin/contextBroker -port 1026 -logDir /var/log/contextBroker -pidpath /var/run/contextBroker/contextBroker.pid -dbhost localhost -db orion
root 7299 0.0 6.9 621052 134440 ? Ssl 04:21 0:00 contextBroker -port 1028
root 8870 0.0 0.0 103256 848 pts/0 S+ 08:51 0:00 grep contextBroker
but there simply isn't anything in /var/run/contextBroker/.
Should I create contextBroker.pid myself? And if so, what should its contents be?
EDIT5: I just ran netstat -ntlpd | grep 1026 and the output is:
tcp 0 0 0.0.0.0:1026 0.0.0.0:* LISTEN 2196/contextBroker
tcp 0 0 :::1026 :::* LISTEN 2196/contextBroker
So I guess nothing else but contextBroker is listening?
For the record (it was answered in the comments).
The message FATAL@XX:XX:XX rest.cpp[1013]: Fatal Error (error starting REST interface) means that there is a networking problem, usually an interface or an already-used port.
The usual cause is that there is another instance of Orion running (as a service, for example).
The way to solve it is to kill the process entirely. Show all Orion processes with ps aux | grep contextBroker and issue kill -9 <pid>, where <pid> is the process number (the second column of the ps output).
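For example, based on the ps output above (2196 is the PID of the service instance shown there; double-check it on your own machine before killing anything):
ps aux | grep contextBroker
kill -9 2196
/etc/init.d/contextBroker start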
Recently I wanted to set up a MySQL cluster: one management node, one SQL node, and two data nodes.
It seems to have been installed successfully and the management node started, but when I try to start a data node, I hit a problem...
Here is the error message when I try to start the data node:
Does anyone know what's going wrong?
Basically I followed the step-by-step tutorial on this site and this site.
It would be much appreciated if you could give me some advice!
Thanks.
Okay, I came up with a solution to fix this issue: 2013-01-18 09:26:10 [ndbd] ERROR -- Couldn't start as daemon, error: 'Failed to open logfile
I was stuck with the same issue, and while exploring I opened $MY_CLUSTER_INSTALLATION/ndb_data/ndb_1_cluster.log.
1. I found the following message present in the log:
2013-01-18 09:24:50 [MgmtSrvr] INFO -- Got initial configuration
from 'conf/config.ini',
will try to set it when all ndb_mgmd(s) started
2013-01-18 09:24:50 [MgmtSrvr] INFO -- Node 1: Node 1 Connected
2013-01-18 09:24:54 [MgmtSrvr] ERROR -- Unable to bind management
service port: *:1186!
Please check if the port is already used,
(perhaps a ndb_mgmd is already running),
and if you are executing on the correct computer
2013-01-18 09:24:54 [MgmtSrvr] ERROR -- Failed to start management service!
2. I checked the services running on the port on my Mac machine using the following command:
lsof -i :1186
And sure enough, I found the ndb_mgmd(s):
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
ndb_mgmd 418 8u IPv4 0x33a882b4d23b342d 0t0 TCP *:mysql-cluster (LISTEN)
ndb_mgmd 418 9u IPv4 0x33a882b4d147fe85 0t0 TCP localhost:50218->localhost:mysql-cluster (ESTABLISHED)
ndb_mgmd 418 10u IPv4 0x33a882b4d26901a5 0t0 TCP localhost:mysql-cluster->localhost:50218 (ESTABLISHED)
3. To kill the processes on the specific port (for me: 1186) I ran the following command:
lsof -P | grep '1186' | awk '{print $2}' | xargs kill -9
4. I repeated the steps listed in the MySQL Cluster installation PDF again:
$PATH/mysqlc/bin/ndb_mgmd -f conf/config.ini --initial --configdir=/$PATH/my_cluster/conf/
$PATH/mysqlc/bin/ndbd -c localhost:1186
Hope this helps!
Hope this will be useful.
In my case, two data nodes were already connected.
You can check this in your management node:
[root@ab0]# ndb_mgm
-- NDB Cluster -- Management Client --
ndb_mgm> show
What I did was:
ndb_mgm> shutdown
and then executed the restart command, as in the sketch below. It works for me.
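For reference, the rough restart sequence (a sketch; the binary paths and config file are taken from the earlier answer, so adjust them to your own layout):
ndb_mgm> show
ndb_mgm> shutdown
$PATH/mysqlc/bin/ndb_mgmd -f conf/config.ini --configdir=$PATH/my_cluster/conf/
$PATH/mysqlc/bin/ndbd -c localhost:1186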
Check that the datadir exists and is writeable with "ls -ld /home/netdb/mysql_cluster/data" on datanode1.