How can I improve socket hang up when connecting many devices? - fiware

I am testing to connect many devices to FIWARE in the following environment.
Each component is deployed in a container on a physical server.
+-------------------------------------------------+
|Comet - Cygnus - Orion - IoTAgentJSON - Mosquitto| - device*N
+-------------------------------------------------+
Under the condition that each device transmits data at 1 msg/sec, the following error occurs at the IoTAgent when the number of devices reaches 350 (that is, 350 msg/sec).
{"log":"time=2018-12-16T14:57:24.810Z | lvl=ERROR | corr=ec11c37f-5194-4cb3-8d79-e04a2d1e745c | trans=ec11c37f-5194-4cb3-8d79-e04a2d1e745c | op=IoTAgentNGSI.NGSIService | srv=n/a | subsrv=n/a | msg=Error found executing update action in Context Broker: Error: socket hang up | comp=IoTAgent\n","stream":"stdout","time":"2018-12-16T14:57:24.81037597Z"}
{"log":"time=2018-12-16T14:57:24.810Z | lvl=ERROR | corr=ec11c37f-5194-4cb3-8d79-e04a2d1e745c | trans=ec11c37f-5194-4cb3-8d79-e04a2d1e745c | op=IoTAgentNGSI.Alarms | srv=n/a | subsrv=n/a | msg=Raising [ORION-ALARM]: {\"code\":\"ECONNRESET\"} | comp=IoTAgent\n","stream":"stdout","time":"2018-12-16T14:57:24.810440213Z"}
{"log":"time=2018-12-16T14:57:24.810Z | lvl=ERROR | corr=ec11c37f-5194-4cb3-8d79-e04a2d1e745c | trans=ec11c37f-5194-4cb3-8d79-e04a2d1e745c | op=IoTAgentJSON.MQTTBinding | srv=n/a | subsrv=n/a | msg=MEASURES-002: Couldn't send the updated values to the Context Broker due to an error: Error: socket hang up | comp=IoTAgent\n","stream":"stdout","time":"2018-12-16T14:57:24.810526916Z"}
The output of the ps ax | grep contextBroker command is as follows.
ps ax | grep contextBroker
19766 ? Ssl 29:02 /usr/bin/contextBroker -fg -multiservice -ngsiv1Autocast -dbhost mongodb-orion-demo -statCounters -statSemWait -statTiming
Question 1: Where does the cause lie? In the IoTAgent? Orion? MongoDB? Or a kernel parameter?
The IoTAgent reports "Error found executing update action in Context Broker: Error: socket hang up", but no error log is displayed in Orion.
Question 2: How can I improve the processing performance of FIWARE?
Does the IoTAgent need to be scaled out?
Do Orion's parameters need to be tuned?
Should I consider values such as reqPoolSize and maxConnections, with reference to the following URL?
https://fiware-orion.readthedocs.io/en/master/admin/perf_tuning/#http-server-tuning
Does Orion need to be scaled out?
If so, how do I scale the Orion GE?
Question 3: Is there a batch operation in the IoT Agent?
The following page suggests doing a batch operation instead of opening a new connection for each entity. Is there such a function in the IoTAgent?
ECONNRESET when opening a large number of connection in small time period

It is difficult to give a definitive answer, as performance depends on many factors, especially in complicated setups involving several interacting components. However, I'll try to provide some thoughts and insights based on the information you provide and my previous experience.
With regard to Orion, I'd recommend you have a look at the documentation on performance tuning. Following the indications on that page you can increase the performance of the component.
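For illustration only (the values below are placeholders, not tuned recommendations; derive real ones from that guide and your hardware), the broker from your ps output could be restarted with the HTTP-server tuning options added:
/usr/bin/contextBroker -fg -multiservice -ngsiv1Autocast -dbhost mongodb-orion-demo -reqPoolSize 8 -maxConnections 2048 -statCounters -statSemWait -statTiming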
However, having said that, I don't think Orion is the cause of the problem in your case, for the following reasons:
Even without performance optimization, Orion typically reaches a throughput on the order of ~1,000 tps. It should cope with updates at 350 tps without problems.
Orion is not showing error logs. The error logs you have are produced by the IoTAgent component, as far as I understand.
Thus, focusing on the IoT Agent, it may be better to use IOTA-UL instead of IOTA-JSON. The UL encoding is more efficient than the JSON encoding, so you can gain some efficiency. In addition, IOTA-UL allows you to send multimeasures (using # as a separator), which I don't know whether it fits your case, but which can be seen as a limited form of batch update (see the UL documentation for more detail).
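As a rough sketch of the multimeasure format (the MQTT topic layout depends on your IoT Agent configuration and version, and the apiKey/deviceId below are made up):
mosquitto_pub -h mosquitto -t /myApiKey/device001/attrs -m 't|23.1|h|45#t|23.4|h|44'
Here | separates attribute/value pairs and # separates measure groups inside a single payload.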
If that doesn't work, another possibility is to send data directly to Orion using its NGSIv2 API. That would have several advantages:
Simplified setup (two pieces fewer: the MQTT broker and the IoTAgent)
Under the same resource conditions, Orion's native performance is typically higher than the IoTAgent's (as mentioned before, ~1,000 tps or even higher after applying performance optimizations)
The NGSIv2 API provides a batch update operation (look for POST /v2/op/update in the NGSIv2 specification; a minimal example is sketched below)
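For reference, a minimal batch update against Orion could look like the following (host and entity names are illustrative, not taken from your setup):
curl -X POST 'http://orion:1026/v2/op/update' \
  -H 'Content-Type: application/json' \
  -d '{
    "actionType": "append",
    "entities": [
      {"id": "Device001", "type": "Device", "temperature": {"value": 23.1, "type": "Number"}},
      {"id": "Device002", "type": "Device", "temperature": {"value": 22.4, "type": "Number"}}
    ]
  }'
A single request like this can carry the measures of many devices, which avoids opening one connection per entity.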

Related

How to send massive data of sensors in Orion

Let's suppose we have 100 sensors that each send an attribute every second to Orion. How could I manage this massive amount of data?
via batch operations (but I don't know whether Orion supports them)
using an edge node (to aggregate data) and sending the result to Orion (e.g. after 1 minute)
Thank you
Let's consider 100 tps a high load for a given infrastructure (load and throughput must always be evaluated relative to the infrastructure and the end-to-end scenario).
The main problem you may encounter is not the update itself; Orion Context Broker and its fork Orion-LD can handle a lot of updates. The main problem in real/production scenarios, like the ones handled by Orion Context Broker and NGSIv2, is the NOTIFICATIONS associated with those UPDATES.
If you need a 1:1 (or even a 1:2 or 1:4) ratio between UPDATES and NOTIFICATIONS, for example because you want to keep track of the history of every measure and also send the measures to a CEP for some post-processing, then it's not only a matter of how many updates Orion can handle, but of how many update-notifications the end-to-end system can handle. If you have a slow notification endpoint, Orion will saturate its notification queues and you will lose notifications (i.e. not keep track of those updates in a historic database, the CEP, and so on).
Batch updates do not help here, since the update request server is not the bottleneck and they are internally managed as single updates.
To alleviate this problem I would recommend enabling the NGSIv2 flow control mechanism (only available in V2), so the update process can be automatically slowed down when the notification throughput requires it.
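Written from memory, so please verify the exact syntax against the performance-tuning documentation of your Orion version: flow control is switched on with a CLI option when starting the broker and then requested per update via a URI option, roughly along these lines:
contextBroker ... -notifFlowControl gauge:stepDelay:maxInterval
curl -X POST 'http://orion:1026/v2/op/update?options=flowControl' -H 'Content-Type: application/json' -d '{ ... }'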
And of course, in any IoT scenario, if you don't need all the data, the earlier you aggregate it the better. So if your end-to-end scenario doesn't need to keep track of every single measure, data loggers are more than welcome.
For 100 sensors sending one update per second (did I understand that correctly?) ... that's nothing. The broker can handle 2-3 thousand updates per second running in a single core and with ~4 GB of RAM (mongodb needs about 3 times that).
And if it's more (a lot more), then yes, the NGSI-LD API defines batch operations (for Create, Update, Upsert, and Delete of entities), and Orion-LD implements them all.
However, there is no batch operation for attribute update. You'd need to use batch entity update in update mode (not replace). Check the NGSI-LD API spec for details.
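Something along these lines, assuming the entityOperations/upsert endpoint in update mode is what is meant (host and entity ids are made up; check the endpoint and option names against the NGSI-LD version you run):
curl -X POST 'http://orion-ld:1026/ngsi-ld/v1/entityOperations/upsert?options=update' \
  -H 'Content-Type: application/json' \
  -d '[
    {"id": "urn:ngsi-ld:Sensor:001", "type": "Sensor", "temperature": {"type": "Property", "value": 23.1}},
    {"id": "urn:ngsi-ld:Sensor:002", "type": "Sensor", "temperature": {"type": "Property", "value": 22.4}}
  ]'
Without a Link header the default core @context is used.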

Network Security Group Rule Audit (Azure)

I wondered if anyone has found a way to audit network security groups in Azure, other than trawling through them all in the Azure UI. I have managed to extract the info as JSON, but it's still not terribly easy to decipher as it's nested quite deeply. I'm looking for NSGs with default any/any rules and other poorly applied rules.
We have several hundred Network Security Groups (to give context).
Anyone have any views how best to go about this?
Depending on what you would like to audit in your NSG security rules, Azure Resource Graph may be friendlier than exporting the JSON and parsing it. It can be called via the REST API, for example from a Logic App, for regular audits.
A simple query for NSGs with security rules allowing traffic to port 22 is below:
az graph query -q "where type == 'microsoft.network/networksecuritygroups' | extend rules = properties.securityRules | mv-expand rules | where rules.properties.destinationPortRanges contains '22' | summarize count() by id"
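Since the original question mentions default any/any rules, an untested variation of the same query might look like this (property names are assumed from the NSG securityRules schema; rules written with the plural destinationPortRanges/sourceAddressPrefixes fields would need those properties instead, as in the query above):
az graph query -q "where type == 'microsoft.network/networksecuritygroups' | extend rules = properties.securityRules | mv-expand rules | where rules.properties.access == 'Allow' and rules.properties.sourceAddressPrefix == '*' and rules.properties.destinationPortRange == '*' | project nsgId = id, ruleName = rules.name"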
Another approach to consider would be to use Azure Policy to audit security rules for specific exceptions.
Lastly, if you are more interested in monitoring changes made to your NSGs than in specific exceptions, the Resource Change History feature may be what you are looking for. You can target specific resources and review changes over a time window. Doing so would require some automation on your part, calling the REST API, etc. See: https://learn.microsoft.com/en-us/azure/governance/resource-graph/how-to/get-resource-changes

EMV Offline Data Authentication - CDA Mode 3

The EMV Spec 4.3 Vol 2 defines the different modes for CDA ("Combined Data Authentication") with a chart:
+------+---------------------+-------------------------------------+
| Mode | Request CDA on ARQC | Request CDA on 2nd GEN AC (TC)      |
|      |                     | after approved online authorisation |
+------+---------------------+-------------------------------------+
|  1   | Yes                 | Yes                                 |
|  2   | Yes                 | No                                  |
|  3   | No                  | No                                  |
|  4   | No                  | Yes                                 |
+------+---------------------+-------------------------------------+
My question:
If a PinPad is in CDA Mode 3, does it actually perform the data authentication step at all?
The PinPad I am using is in CDA Mode 3, and it appears to perform data authentication at some point during the ARPC validation/TC generation step, as evidenced by Byte 1, Bit 8 of the TVR being set to zero at that time. However, the chart above would lead me to believe that it does not.
Unfortunately, I don't have a UL or Collis tool to get inside the PinPad to see the PinPad/chip flow.
Short answer to your question is YES - the acceptance device will perform card authentication. When it comes to ODA, it might be also SDA (already obsolete) or DDA that will happen regardless of CDA mode.
CDA mode 3 means only that ODA will not be performed if other CAM (Card Authentication Method) is available. It will still happen for offline accepted transactions.
To clarify, the Card Authentication Methods are:
Offline CAM = PKI-based Offline Data Authentication, of which CDA is an example
Online CAM = symmetric-cryptography-based verification of cryptograms during online communication.
In the early days of EMV implementation, acceptance devices had quite limited processing power - they were mostly based on 8-bit microcontrollers, which meant it took ages to perform RSA with a larger modulus. That's why CDA mode 3 was introduced: to avoid performing the resource-heavy offline CAM when online CAM is available, i.e. in online transactions. That was perceived as an optimization at the time and was recommended by the schemes and EMVCo.
In today's terms, CDA mode 1 is widely adopted and I don't remember any recent Type Approvals with CDA mode 3. If you have a device with it, you might be dealing with an old device with an expired approval.
The ARPC verification (Issuer Authentication step) you mention is not reflected in TVR B1b8 - that bit is only an indication that ODA was not performed, which (apart from the CDA mode 3 situation) might also happen when the card and terminal do not support any common authentication method (some online-only terminals do not need to perform ODA; some non-expiring cards do not support ODA either). Issuer Authentication might be explicit (when the AIP in the card indicates it and you received an ARPC in the response), but it might also happen implicitly (when the AIP doesn't indicate it but the card requests the ARPC in CDOL2), and you might not see it indicated in the TVR.

Filter elements of a JSON by their subelements' contents in Windows shell using JQ?

This may be asked elsewhere but I have trouble finding a similar case:
I'm trying to filter the Tor network's relay consensus (https://onionoo.torproject.org/details) based on whether a relay is "Running" and whether or not it has a "Fast" flag, in order to build another JSON document made up of the selected relays. Possibly even a more concise version that only lists certain elements of each relay (nickname, fingerprint, etc.).
I'm trying to use this in a batch script, which makes jq difficult to work with, as it requires jumping through hoops to make it work with the Windows shell.
Looking through the Tutorial and manual, I'm stumped. Anyone know of a solution?
This filter will select relays with the Running and Fast flags and yield an array of objects containing only the nickname and fingerprint fields; tweak it to meet your requirements.
.relays | map(select(.flags | index("Running") and index("Fast")) | { nickname, fingerprint })
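To sidestep the cmd.exe quoting hoops you mention, one option is to keep the filter in a file and pass it with jq's -f option (file names here are arbitrary):
jq -f relays.jq details.json > filtered.json
where relays.jq contains the filter above, or pipe the download straight in:
curl -s https://onionoo.torproject.org/details | jq -f relays.jq > filtered.json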

Can't get the boot disk to 200GB [closed]

I tried creating the disk first via gcutil adddisk and then assigning it to the VM when running gcutil addinstance with the --disk flag. However, this method still results in a 10GB partition even though I set it to 200GB on adddisk.
Here is the disk itself:
INFO: Zone for db2-usc1a detected as us-central1-a.
+-----------------+--------------------------------------------------------+
| name            | db2-usc1a                                              |
| description     |                                                        |
| creation-time   | 2014-06-11T22:45:39.654-07:00                          |
| zone            | us-central1-a                                          |
| status          | READY                                                  |
| source-snapshot |                                                        |
| source-image    | projects/centos-cloud/global/images/centos-6-v20140606 |
| source-image-id | 6290630306542078308                                    |
| size-gb         | 200                                                    |
+-----------------+--------------------------------------------------------+
But, as you can see, running df -h displays it as 9.9GB:
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       9.9G  4.3G  5.1G  46% /
tmpfs           7.3G     0  7.3G   0% /dev/shm
I have also tried to follow these instructions here: https://developers.google.com/compute/docs/disks#repartitionrootpd
However, on reboot, the VM becomes inaccessible so I can't even SSH back onto the machine.
Why is Google enforcing a 10GB image on boot? Why is it not being set to the size I have requested? More importantly, is there a way I can automate this process for our build scripts?
One option is to use Persistent Disk Snapshots:
resize the disk
create a snapshot of the disk
in your build scripts, create new PDs from your snapshot instead of the default image
The new disks will be 200GB. Snapshots only save blocks which have changed, so the snapshot itself should be fairly small.
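A sketch of that flow with the gcloud CLI (gcutil has since been deprecated; names and zones below are illustrative and the flags may need adapting to your SDK version):
# one-time: snapshot the already-resized 200GB boot disk
gcloud compute disks snapshot db2-usc1a --snapshot-names centos6-200gb --zone us-central1-a
# in the build scripts: create each new boot disk from the snapshot and boot from it
gcloud compute disks create new-vm-disk --source-snapshot centos6-200gb --size 200GB --zone us-central1-a
gcloud compute instances create new-vm --disk name=new-vm-disk,boot=yes --zone us-central1-a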
As the previous comment suggests, resize the disk. For those who don't know how to do this:
sudo /sbin/resize2fs /dev/sda1