Orion Context Broker works half of the times - fiware

I've installed Orion Context Broker 0.23.0 and it behaves rare: it only works half of the times. For instance, when trying to retrieve the version I get this error message:
$ curl "http://localhost:1026/version"
curl: (52) Empty reply from server
$ curl "http://localhost:1026/version"
<orion>
<version>0.23.0</version>
<uptime>15 d, 22 h, 13 m, 18 s</uptime>
<git_hash>f5d76a6f11736d52402e63a4aa0ba990bff7f5eb</git_hash>
<compile_time>Fri Jul 10 13:21:42 CEST 2015</compile_time>
<compiled_by>fermin</compiled_by>
<compiled_in>centollo</compiled_in>
</orion>
$ curl "http://localhost:1026/version"
curl: (52) Empty reply from server
$ curl "http://localhost:1026/version"
<orion>
<version>0.23.0</version>
<uptime>15 d, 22 h, 13 m, 53 s</uptime>
<git_hash>f5d76a6f11736d52402e63a4aa0ba990bff7f5eb</git_hash>
<compile_time>Fri Jul 10 13:21:42 CEST 2015</compile_time>
<compiled_by>fermin</compiled_by>
<compiled_in>centollo</compiled_in>
</orion>
This behaviour is deterministic, I mean, after failing it always works, and after working it always fails. This occurs with all the operations within the REST API.
I've checked the listening ports and the process running them matches the Orion's one:
$ sudo netstat -ntlp | grep 1026
tcp 0 0 0.0.0.0:1026 0.0.0.0:* LISTEN 9944/contextBroker
tcp 0 0 :::1026 :::* LISTEN 9944/contextBroker
$ ps ax | grep contextBroker | grep -v grep
9944 ? Ssl 0:13 /usr/bin/contextBroker -port 1026 -logDir /var/log/contextBroker -pidpath /var/run/contextBroker/contextBroker.pid -dbhost localhost -db orion -multiservice
Any hints? Thanks!

Orion runs by default listening to IPv4 and IPv6. We have found that in cases similar to the ones you described disabling IPv6 solves the problem (we don't know yet the exact cause, maybe is related with the operating system or it is involved is some way...).
Tu run Orion in only-IPv4 mode you have to use the -ipv4 option either in the contextBroker command line or (if you are running Orion as a service) editing the /etc/sysconfig/contextBroker file to add -ipv4 to the BROKER_EXTRA_OPS variable (have a look to the documentation for more information about configuring Orion as a service). After modifying /etc/sysconfig/contextBroker you have to restart Orion using:
sudo /etc/init.d/contextBroker restart

Related

QEMU hostfwd works only for some ports

I compiled qemu-system-x86_64 on aarch64 host, and was able to run a x86_64 guest with a command like
qemu-system-x86_64 -m 4096 -drive file=vmimage.qcow2,if=virtio \
-boot once=c,menu=on -net nic,model=virtio-net-pci \
-net user,hostfwd=tcp::8080-:80,hostfwd=tcp::22222-:22
I could ssh into the guest using
ssh -p22222 user#localhost
Meanwhile, port 80 was not forwarded successfully.
For debugging, I used nc to listen to port 80 inside the guest
nc -l 80
Then in the host, I connected to the forwarded port
nc localhost 8080
However, it was unable to connect to guest nc .
I tried the monitor interface. When the host nc command is executed, info usernet shows following:
(qemu) info usernet
Hub 0 (#net162):
Protocol[State] FD Source Address Port Dest. Address Port RecvQ SendQ
TCP[SYN_SENT] 33 127.0.0.1 8080 10.0.2.15 80 0 0
TCP[ESTABLISHED] 21 127.0.0.1 22222 10.0.2.15 22 0 0
TCP[HOST_FORWARD] 12 * 8080 10.0.2.15 80 0 0
TCP[HOST_FORWARD] 11 * 22222 10.0.2.15 22 0 0
...
I believe the SYN_SENT (FD 33) corresponded to the host nc command, and this matched the HOST_FORWARD line (FD 12). However, it never became ESTABLISHED. And a few seconds later, nc died with Connection reset by peer. , and the FD 33 line disappeared.
If I nc localhost 22222, I can see the OpenSSH banner.
So it seems only port 22 forwarded. Any idea about the cause or how to debug?
Both host and guest had no firewalliptables configured, and SELinux is permissive.
Thanks
Edit:
As a temporary workaround, I configured a second nic, and used port 22 of the new interface for forwarding my service. I also switch to the newer -nic option, but hostfwd still worked for port 22 only.
qemu-system-x86_64 -m 4096 -drive file=vmimage.qcow2,if=virtio \
-boot once=c,menu=on \
-nic user,model=virtio-net-pci,hostfwd=tcp::60022-:22 \
-nic user,model=virtio-net-pci,net=10.0.3.0/24,hostfwd=tcp::8080-10.0.3.15:22
To forward successfully, I also need to
Configure sshd to listen to port 22 the first nic only.
Configure my service to listen to port 22 of the second nic.
Configure the second nic to use a different network. Otherwise, both nics were assigned the same IP (10.0.2.15. I may better hardcode the IP for both nics.)
The problem was actually about firewall. My VM (based on Oracle Linux 8.5 on Oracle Linux VM Templates) actually had firewall rules in both iptables and nft. After disabling both iptables and nft, the port forward worked.

Openshift 4.4 - cannot 'oc logs\exec' pods running on worker nodes

Openshift 4.4.17 cluster (3 masters and 3 workers).
Getting error when trying to see logs (or exec terminal) on those pods running on worker nodes. The same applies for Openshift GUI. No issues when trying to do the same for pods running on master nodes.
Example 1: pods running on worker
$ oc whoami
kube:admin
$ oc get pod -n lamp
NAME READY STATUS RESTARTS AGE
lamp-lamp-6c7d9f467d-jsn4t 3/3 Running 0 108d
$ oc logs lamp-lamp-6c7d9f467d-jsn4t httpd -n lamp
error: You must be logged in to the server (the server has asked for the client to provide credentials ( pods/log lamp-lamp-6c7d9f467d-jsn4t))
Example 2: pods running on master nodes
$ oc get pod -n openshift-apiserver
NAME READY STATUS RESTARTS AGE
apiserver-6d64545f-5lmb8 1/1 Running 0 2d19h
apiserver-6d64545f-hktqd 1/1 Running 0 2d19h
apiserver-6d64545f-kb4qt 1/1 Running 0 2d19h
$ oc logs apiserver-6d64545f-5lmb8 -n openshift-apiserver
Copying system trust bundle
I0225 20:41:39.989689 1 requestheader_controller.go:243] (..output omitted..)
Investigating kubelet on worker nodes:
On every worker node kubelet service is running, but
journalctl -u kubelet
shows these two lines:
Unable to authenticate the request due to an error: x509: certificate signed by unknown authority
logging error output: "Unauthorized"
About kubeconfig on worker nodes:
Watching the content of /etc/kubernetes/kubeconfig file.
- kubelet connects to api-server --> https://api-int.ocs-cls1.mycompany.lab
- the server passes valid certificate signed by --> kube-apiserver-lb-signer
- certificate-authority-data carries --> kube-apiserver-lb-signer rootCA
The kubeconfig looks like correct.
UPDATE:
Also noticed these log lines reporting bad certificate:
$ oc -n openshift-apiserver logs apiserver-6d64545f-5lmb8
log.go:172] http: TLS handshake error from 10.128.0.12:47078: remote error: tls: bad certificate
...
UPDATE2:
Also checked apiserver-loopback-client certificate:
$ curl --resolve apiserver-loopback-client:6443:{IP_MASTER} -v -k https://apiserver-loopback-client:6443/healthz
server certificate verification SKIPPED
* server certificate status verification SKIPPED
* common name: apiserver-loopback-client#1614330374 (matched)
* server certificate expiration date OK
* server certificate activation date OK
* certificate public key: RSA
* certificate version: #3
* subject: CN=apiserver-loopback-client#1614330374
* start date: Fri, 26 Feb 2021 08:06:13 GMT
* expire date: Sat, 26 Feb 2022 08:06:13 GMT
* issuer: CN=apiserver-loopback-client-ca#1614330374
try this
while :;do
sleep 2
oc get csr -o name | xargs -r oc adm certificate approve
done
use the another terminal, and ssh to the any master node, run this:
crictl ps -a | awk '/Running/&&/-cert-syncer/{print $1}' | xargs -r crictl stop

I cant start context broker due to "error starting REST interface"

When I enter the following command:
/etc/init.d/contextBroker start
I get the following output:
Starting contextBroker... cat: /var/run/contextBroker/contextBroker.pid: No such file or directory
pidfile not found [FAILED]
I have two machines where I am practising with context broker and I havent touched the second one in days after I succesfully installed it and managed to receive a post message from a remote weather station.
I see that the directory /var/run/contextBroker/ is actually empty
What should I do to fix this now? reinstal context broker or?
So is this somehow my fault and how do I prevent in the future? I dont want this happening when my app goes live.
EDIT1: the orion version is 0.20.0
EDIT2: I just reinstalled contextBroker and I get the same problem. What are exectly the contents of that directory? Could I maybe just create the files inside?
EDIT3: Since running contextBroker as a system service still yields an unsuccessful start, I also attempted to run it symply by typing:
contextBroker in the command line, after which I get the following response
INFO#14:03:03 contextBroker.cpp[1346]: Orion Context Broker is running
[root#localhost DevF12]# INFO#14:03:03 MongoGlobal.cpp[181]: Successful connection to database
INFO#14:03:03 contextBroker.cpp[1157]: Connected to mongo at localhost:orion
INFO#14:03:03 MongoGlobal.cpp[499]: Database Operation Successful ({ conditions.type: "ONTIMEINTERVAL" })
FATAL#14:03:03 rest.cpp[1013]: Fatal Error (error starting REST interface)
EDIT4: Ok so I tried ps aux | grep contextBroker and the result is:
494 2196 0.0 7.0 688696 135116 ? Ssl Apr21 0:02 /usr/bin/contextBroker -port 1026 -logDir /var/log/contextBroker -pidpath /var/run/contextBroker/contextBroker.pid -dbhost localhost -db orion
root 7299 0.0 6.9 621052 134440 ? Ssl 04:21 0:00 contextBroker -port 1028
root 8870 0.0 0.0 103256 848 pts/0 S+ 08:51 0:00 grep contextBroker
but there simply isnt anything in /var/run/contextBroker/
Should I put contextBroker.pid by myself? and if so, what should the contents be?
EDIT5: I just ran netstat -ntlpd | grep 1026 and the output is:
tcp 0 0 0.0.0.0:1026 0.0.0.0:* LISTEN 2196/contextBroker
tcp 0 0 :::1026 :::* LISTEN 2196/contextBroker
So I guess nothing else but contextBroker is listening?
For the record (it was answered in the comments).
The message FATAL#XX:XX:XX rest.cpp[1013]: Fatal Error (error starting REST interface) means that there is a networking problem. Usually an interface or an already used port.
The usual cause is that there is another instance of Orion running (as a service, for example).
The way to solve it would be to kill the process entirely. Show all Orion processes with ps aux | grep contextBroker and issue a kill -9 <pid>, where <pid> is the process number (the second column of the output of the ps command).

varnish error sess_workspace

Metadata: My system is debian 7 64bit
Hi, I am new to varnish and I have encountered an error that i cant seem to solve.
I have the following in my /etc/default/varnish configuration file:
DAEMON_OPTS="-a :80 \
-T localhost:6082 \
-f /etc/varnish/default.vcl \
-u www-data -g www-data \
-S /etc/varnish/secret \
-p thread_pools=2 \
-p thread_pool_min=25 \
-p thread_pool_max=250 \
-p thread_pool_add_delay=2 \
-p timeout_linger=50 \
-p sess_workspace=262144 \
-p cli_timeout=40 \
-s malloc,512m"
When I restarted he varnish service it failed with the error Unknown parameter "sess_workspace".
I checked the /var/log/varnish/ and no logs where generated.
I also checked syslog and the only logs that varnish wrote were logs that had to do with the startup and shutdown of varnish after he install and when I ran service varnish restart. Here ara all the relevant syslog entries (cat syslog | grep varnish):
Nov 6 00:56:27 HOSTNAME varnishd[7557]: Platform: Linux,3.2.0-4-amd64,x86_64,-smalloc,-smalloc,-hcritbit
Nov 6 00:56:27 HOSTNAME varnishd[7557]: child (7618) Started
Nov 6 00:56:27 HOSTNAME varnishd[7557]: Child (7618) said Child starts
Nov 6 01:04:58 HOSTNAME varnishd[7557]: Manager got SIGINT
Nov 6 01:04:58 HOSTNAME varnishd[7557]: Stopping Child
Nov 6 01:04:59 HOSTNAME varnishd[7557]: Child (7618) ended
Nov 6 01:04:59 HOSTNAME varnishd[7557]: Child (7618) said Child dies
Nov 6 01:04:59 HOSTNAME varnishd[7557]: Child cleanup complete
I have searched the vast seas of google but with no luck (I have even compared with example code from varnish site and still no luck).
Any ideas that may help me?
Bit delayed but in case anyone else has this issue I will post what I found.
Found this here: http://www.0550go.com/proxy/varnish/varnish-v4-config.html
sess_workspace
In 3.0 it was often necessary to increase
sess_workspace if a lot of VMODs, complex header operations or ESI
were in use.
This is no longer necessary, because ESI scratch space
happens elsewhere in 4.0.
If you are using a lot of VMODs, you may
need to increase either workspace_backend and workspace_client based
on where your VMOD is doing its work.

MySQL Cluster - [ [ndbd] ERROR -- Couldn't start as daemon, error: 'Failed to open logfile ]

recently I want to set up mysql cluster, one Mgmt node, one sql node and two data node,
it seems successfully installed and Mgmt node started, but when I try to start data node, I hit a problem...
here is the error message when I try to start data node:
Does anyone know what's going wrong?
basically I follow the step by step tutorial on this site and this site
It would be very appreciated if you can give me some advice!
thanks
Okay, I came up with a solution to fix this issue : 013-01-18 09:26:10 [ndbd] ERROR -- Couldn't start as daemon, error: 'Failed to open logfile
I was stuck with the same issue and after exploring I opened the $MY_CLUSTER_INSTALLATION/ndb_data/ndb_1_cluster.log
1.I found the following message present in the log:
2013-01-18 09:24:50 [MgmtSrvr] INFO -- Got initial configuration
from 'conf/config.ini',
will try to set it when all ndb_mgmd(s) started
2013-01-18 09:24:50 [MgmtSrvr] INFO -- Node 1: Node 1 Connected
2013-01-18 09:24:54 [MgmtSrvr] ERROR -- Unable to bind management
service port: *:1186!
Please check if the port is already used,
(perhaps a ndb_mgmd is already running),
and if you are executing on the correct computer
2013-01-18 09:24:54 [MgmtSrvr] ERROR -- Failed to start mangement service!
2.I checked the services running on port on my Mac machine using following command:
lsof -i :1186
And sure enough, I found the ndb_mgmd(s):
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
ndb_mgmd 418 8u IPv4 0x33a882b4d23b342d 0t0 TCP *:mysql-cluster (LISTEN)
ndb_mgmd 418 9u IPv4 0x33a882b4d147fe85 0t0 TCP localhost:50218->localhost:mysql-cluster (ESTABLISHED)
ndb_mgmd 418 10u IPv4 0x33a882b4d26901a5 0t0 TCP localhost:mysql-cluster->localhost:50218 (ESTABLISHED)
3.To kill the processes on the specific port (for me : 1186) I ran following command:
sof -P | grep '1186' | awk '{print $2}' | xargs kill -9
4.I repeated the steps listed in mySql Cluster installation pdf again:
$PATH/mysqlc/bin/ndb_mgmd -f conf/config.ini --initial --configdir=/$PATH/my_cluster/conf/
$PATH/mysqlc/bin/ndbd -c localhost:1186
Hope this helps!
Hope this will be useful
In my case, two data node were connected already
you can check this out in your management node
[root#ab0]# ndb_mgm
-- NDB Cluster -- Management Client --
ndb_mgm> show
what i did was
ndb_mgm> shutdown
and then execute the restart command. it works for me
Check that the datadir exists and is writeable with "ls -ld /home/netdb/mysql_cluster/data" on datanode1.