JSON: preserve original data order

I am rendering the following data
set firewall family inet filter v4-test term accept_1 from protocol lilla
set firewall family inet filter v4-test term accept_2 then accept
set firewall family inet filter v4-test term accept_3 from source-prefix-list v4-test
set firewall family inet filter v4-test term accept_3 from destination-port 1
set firewall family inet filter v4-test term accept_3 from destination-port 2
set firewall family inet filter v4-test term accept_3 then accept
set firewall family inet filter v4-test term access_4 from source-address x.x.x.x/32
via TTP templating:
from ttp import ttp
import json

parser = ttp(data=data_to_parse, template=ttp_template)
parser.parse()
results = parser.result(format='json')[0]
results_dic = json.loads(results)
The output of the rendering is the following JSON:
[
    {
        "v4-test": {
            "accept_2": {
                "action": "accept"
            },
            "accept_1": {
                "protocol": "lilla"
            },
            "accept_3": [
                {
                    "source-prefix-list": "v4-test"
                },
                {
                    "destination-port": "1"
                },
                {
                    "destination-port": "2"
                },
                {
                    "action": "accept"
                }
            ],
            "access_4": {
                "source-prefix-list": "x.x.x.x/32"
            }
        }
    }
]
Problem: I want the output data to keep the order of the original data. Any hint?
Thank you.
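Not a TTP feature, just a post-processing sketch in plain Python (reorder_terms is a made-up helper; data_to_parse and results_dic are the variables from the snippet above). Note that json.loads already keeps keys in the order they appear in the JSON text on Python 3.7+ (pass object_pairs_hook=OrderedDict on older versions), so what you see is simply the order TTP emitted. To get the terms back into the order of the raw configuration, you could re-key each filter by the term names as they first occur in the input:
import re

def reorder_terms(raw_config, parsed_filter):
    # term names in the order they first appear in the raw "set ..." lines
    term_order = []
    for match in re.finditer(r"term (\S+)", raw_config):
        if match.group(1) not in term_order:
            term_order.append(match.group(1))
    # rebuild the dict of terms following that order
    return {name: parsed_filter[name] for name in term_order if name in parsed_filter}

ordered = {flt: reorder_terms(data_to_parse, terms)
           for flt, terms in results_dic[0].items()}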

Related

TTN V3 (MQTT JSON) -> Telegraf -> Grafana / Sensor data from Dragino LSE01 does not appear

I have a problem with Telegraf. I have a Dragino LSE01-8 sensor which is registered on TTN v3. I can check the decoded payload by subscribing to the topic "v3/lse01-8#ttn/devices/+/up".
But when I want to grab the data from InfluxDB, I cannot get "temp_SOIL" and "water_SOIL", although the data appears in the JSON. "conduct_SOIL" is no problem, but I don't know why. Can somebody give me a hint?
Another sensor (Dragino LHT 65) works fine with all data I want to access.
It's possible to get this data from the InfluxDB database:
uplink_message_decoded_payload_BatV
uplink_message_decoded_payload_Mod
uplink_message_decoded_payload_conduct_SOIL
uplink_message_decoded_payload_i_flag
uplink_message_decoded_payload_s_flag
uplink_message_f_cnt
uplink_message_f_port
uplink_message_locations_user_latitude
uplink_message_locations_user_longitude
uplink_message_rx_metadata_0_channel_index
uplink_message_rx_metadata_0_channel_rssi
uplink_message_rx_metadata_0_location_altitude
uplink_message_rx_metadata_0_location_latitude
uplink_message_rx_metadata_0_location_longitude
uplink_message_rx_metadata_0_rssi
uplink_message_rx_metadata_0_snr
uplink_message_rx_metadata_0_timestamp
uplink_message_settings_data_rate_lora_bandwidth
uplink_message_settings_data_rate_lora_spreading_factor
uplink_message_settings_timestamp
## Feuchtigkeitssensor Dragino LSE01-8
[[inputs.mqtt_consumer]]
name_override = "TTN-LSE01"
servers = ["tcp://eu1.cloud.thethings.network:1883"]
qos = 0
connection_timeout = "30s"
topics = [ "v3/lse01-8#ttn/devices/+/up" ]
client_id = "telegraf"
username = "lse01-8#ttn"
password = "NNSXS.LLSNSE67AP..................P67Q.Q...........HPG............KJA..........." //
data_format = "json"
This is the JSON data I can get (I changed some data in order not to send any passwords or tokens).
{
"end_device_ids":{
"device_id":"eui-a8.40.141.bbe4",
"application_ids":{
"application_id":"lse01-8"
},
"dev_eui":"A8...40.BE...4",
"join_eui":"A8.40.010.1",
"dev_addr":"2.9F.....8"
},
"correlation_ids":[
"as:up:01G4WDNS..P3C3R...RK56VQ...KT7N076",
"gs:conn:01G4H2F.ETRG.V2QER...RQ.0K1MGZ44",
"gs:up:host:01G4H2F.ETWRZX.4PFN.A2M.6RDKD4",
"gs:uplink:01G4WDN.N7B6P.J8E.JS.503F1",
"ns:uplink:01G4WDNSFM.MCYYEZZ1.KY.4M78",
"rpc:/ttn.lorawan.v3.GsNs/HandleUplink:01G4W.NSFM29Z3.PABYW...43",
"rpc:/ttn.lorawan.v3.NsAs/HandleUplink:01G4W....VTQ4DMKBF"
],
"received_at":"2022-06-06T11:51:18.979353604Z",
"uplink_message":{
"session_key_id":"AYE...j+DM....A==",
"f_port":2,
"f_cnt":292,
"frm_payload":"DSQAAAcVB4AADBA=",
"decoded_payload":{
"BatV":3.364,
"Mod":0,
"conduct_SOIL":12,
"i_flag":0,
"s_flag":1,
"temp_DS18B20":"0.00",
"temp_SOIL":"19.20",
"water_SOIL":"18.13"
},
"rx_metadata":[
{
"gateway_ids":{
"gateway_id":"lr8",
"eui":"3.6201F0.058.....00"
},
"time":"2022-06-06T11:51:00.289713Z",
"timestamp":4283143007,
"rssi":-47,
"channel_rssi":-47,
"snr":7,
"location":{
"latitude":51.______________,
"longitude":6.__________________,
"altitude":25,
"source":"SOURCE_REGISTRY"
},
"uplink_token":"ChsKG________________________________",
"channel_index":2
}
],
"settings":{
"data_rate":{
"lora":{
"bandwidth":125000,
"spreading_factor":7
}
},
"coding_rate":"4/5",
"frequency":"868500000",
"timestamp":4283143007,
"time":"2022-06-06T11:51:00.289713Z"
},
"received_at":"2022-06-06T11:51:18.772518399Z",
"consumed_airtime":"0.061696s",
"locations":{
"user":{
"latitude":51._________________,
"longitude":6.__________________4,
"source":"SOURCE_REGISTRY"
}
},
"version_ids":{
"brand_id":"dragino",
"model_id":"lse01",
"hardware_version":"_unknown_hw_version_",
"firmware_version":"1.1.4",
"band_id":"EU_863_870"
},
"network_ids":{
"net_id":"000013",
"tenant_id":"ttn",
"cluster_id":"eu1",
"cluster_address":"eu1.cloud.thethings.network"
}
}
}
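A hedged guess at the cause: in the decoded_payload above, conduct_SOIL is a JSON number (12) while temp_SOIL, water_SOIL and temp_DS18B20 are JSON strings ("19.20", "18.13", "0.00"). Telegraf's json data format only turns numeric values into fields by default; string values are dropped unless they are listed in json_string_fields. A minimal sketch of extra lines for the [[inputs.mqtt_consumer]] section shown above, assuming the flattened field names match the ones Telegraf produces:
  ## Assumption: these flattened names match what Telegraf generates; adjust if needed.
  json_string_fields = [
    "uplink_message_decoded_payload_temp_SOIL",
    "uplink_message_decoded_payload_water_SOIL",
    "uplink_message_decoded_payload_temp_DS18B20"
  ]
Note that these will then be stored as string fields in InfluxDB; if Grafana needs numbers, a processors.converter section (or changing the device decoder to emit numbers) would be the next step.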

Convert an array of strings to a dictionary with JQ?

I am trying to convert the AWS public IP ranges into a format that can be used with the Terraform external data provider, so I can create a security group rule based on the AWS public CIDRs. The provider requires a single JSON object with this format:
{"string": "string"}
Here is a snippet of the public ranges JSON document:
{
"syncToken": "1589917992",
"createDate": "2020-05-19-19-53-12",
"prefixes": [
{
"ip_prefix": "35.180.0.0/16",
"region": "eu-west-3",
"service": "AMAZON",
"network_border_group": "eu-west-3"
},
{
"ip_prefix": "52.94.76.0/22",
"region": "us-west-2",
"service": "AMAZON",
"network_border_group": "us-west-2"
},
// ...
]
I can successfully extract the ranges I care about with this, [.prefixes[] | select(.region == "us-west-2") | .ip_prefix] | sort | unique, and it gives me this:
[
"100.20.0.0/14",
"108.166.224.0/21",
"108.166.240.0/21",
"13.248.112.0/24",
...
]
I can't figure out how to convert this to an arbitrarily-keyed object with jq. In order to properly use the array object, I need to convert it to a dictionary, something like {"arbitrary-key": "100.20.0.0/14"}, so that I can use it in Terraform like this:
data "external" "amazon-ranges" {
program = [
"cat",
"${path.cwd}/aws-ranges.json"
]
}
resource "aws_default_security_group" "allow-mysql" {
vpc_id = aws_vpc.main.id
ingress {
description = "MySQL"
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = [
values(data.external.amazon-ranges.result)
]
}
}
What is the most effective way to extract the AWS public IP ranges document into a single object with arbitrary keys?
The following script uses the .ip_prefix as the key, thus perhaps avoiding the need for the sort|unique. It yields:
{
"35.180.0.0/16": "35.180.0.0/16",
"52.94.76.0/22": "52.94.76.0/22"
}
Script
#!/bin/bash
function data {
cat <<EOF
{
"syncToken": "1589917992",
"createDate": "2020-05-19-19-53-12",
"prefixes": [
{
"ip_prefix": "35.180.0.0/16",
"region": "eu-west-3",
"service": "AMAZON",
"network_border_group": "eu-west-3"
},
{
"ip_prefix": "52.94.76.0/22",
"region": "us-west-2",
"service": "AMAZON",
"network_border_group": "us-west-2"
}
]
}
EOF
}
data | jq '
.prefixes
| map(select(.region | test("west"))
| {(.ip_prefix): .ip_prefix} )
| add '
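To answer the literal question (an array of strings to a dictionary), one way is to key each string by itself. A sketch that reuses the original filter, with aws-ranges.json as in the question's external data source:
jq '[.prefixes[] | select(.region == "us-west-2") | .ip_prefix]
    | sort | unique
    | map({(.): .})
    | add' aws-ranges.json
Here map({(.): .}) turns each CIDR into a one-entry object and add merges them into the single {"string": "string"} object the external data source expects.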
There's a better option to get at the AWS IP ranges data in Terraform, which is to use the aws_ip_ranges data source, instead of trying to mangle things with the external data source and jq.
The example in the aws_ip_ranges documentation shows something similar to, but slightly more complex than, what you're trying to do here:
data "aws_ip_ranges" "european_ec2" {
regions = ["eu-west-1", "eu-central-1"]
services = ["ec2"]
}
resource "aws_security_group" "from_europe" {
name = "from_europe"
ingress {
from_port = "443"
to_port = "443"
protocol = "tcp"
cidr_blocks = data.aws_ip_ranges.european_ec2.cidr_blocks
ipv6_cidr_blocks = data.aws_ip_ranges.european_ec2.ipv6_cidr_blocks
}
tags = {
CreateDate = data.aws_ip_ranges.european_ec2.create_date
SyncToken = data.aws_ip_ranges.european_ec2.sync_token
}
}
To do your exact thing you would do something like this:
data "aws_ip_ranges" "us_west_2_amazon" {
regions = ["us-west-2"]
services = ["amazon"]
}
resource "aws_default_security_group" "allow-mysql" {
vpc_id = aws_vpc.main.id
ingress {
description = "MySQL"
from_port = 0
to_port = 0
protocol = "-1"
cidr_blocks = data.aws_ip_ranges.us_west_2_amazon.cidr_blocks
}
}
However, there are 2 things that are bad here.
The first, and most important, is that you're allowing access to your database from every IP address that AWS has in US-West-2 across all services. That means that anyone in the world is able to spin up an EC2 instance or Lambda function in US-West-2 and then have network access to your database. This is a very bad idea.
The second is that if that returns more than 60 CIDR blocks you are going to end up with more than 60 rules in your security group. AWS security groups have a limit of 60 security group rules per IP address type (IPv4 vs IPv6) and per ingress/egress:
You can have 60 inbound and 60 outbound rules per security group (making a total of 120 rules). This quota is enforced separately for IPv4 rules and IPv6 rules; for example, a security group can have 60 inbound rules for IPv4 traffic and 60 inbound rules for IPv6 traffic. A rule that references a security group or prefix list ID counts as one rule for IPv4 and one rule for IPv6.
From https://docs.aws.amazon.com/vpc/latest/userguide/amazon-vpc-limits.html#vpc-limits-security-groups
This is technically a soft cap: you can ask AWS to raise the limit in exchange for reducing the number of security groups that can be applied to a network interface, so that the maximum number of security group rules stays at or below 1000 per network interface. It's probably not something you want to mess around with, though.

Scrapy: multiple regular expressions in LinkExtractor seem not to be working

I've got my regular expressions inside a JSON file. This file gets loaded as a configuration for my spider. The spider creates one LinkExtractor with allow and deny regular expression rules.
I'd like to:
crawl and scrape product pages (scraping / parsing is NOT working)
crawl category pages
avoid general pages (about us, privacy, etc.)
It all works well on some shops, but not on others, and I believe it's a problem with my regular expressions.
"rules": [
{
"deny": ["\\/(customer\\+service|ways\\+to\\+save|sponsorship|order|cart|company|specials|checkout|integration|blog|brand|account|sitemap|prefn1=)\\/"],
"follow": false
},
{
"allow": ["com\\/store\\/details\\/"],
"follow": true,
"use_content": true
},
{
"allow": ["com\\/store\\/browse\\/"],
"follow": true
}
],
URL patterns:
Products:
https://www.example.com/store/details/Nike+SB-Portmore-II-Solar-Canvas-Mens
https://www.example.com/store/details/Coleman+Renegade-Mens-Hiking
https://www.example.com/store/details/Mueller+ATF3-Ankle-Brace
https://www.example.com/store/details/Planet%20Fitness+18
https://www.example.com/store/details/Lifeline+Pro-Grip-Ring
https://www.example.com/store/details/Nike+Phantom-Vision
Categories:
https://www.example.com/store/browse/footwear/
https://www.example.com/store/browse/apparel/
https://www.example.com/store/browse/fitness/
Deny:
https://www.example.com/store/customer+service/Online+Customer+Service
https://www.example.com/store/checkout/
https://www.example.com/store/ways+to+save/
https://www.example.com/store/specials
https://www.example.com/store/company/Privacy+Policy
https://www.example.com/store/company/Terms+of+Service
Loading the rules from JSON inside my spider __init__
for rule in self.MY_SETTINGS["rules"]:
    allow_r = ()
    if "allow" in rule.keys():
        allow_r = [a for a in rule["allow"]]
    deny_r = ()
    if "deny" in rule.keys():
        deny_r = [d for d in rule["deny"]]
    restrict_xpaths_r = ()
    if "restrict_xpaths" in rule.keys():
        restrict_xpaths_r = [rx for rx in rule["restrict_xpaths"]]
    Sportygenspider.rules.append(Rule(
        LinkExtractor(
            allow=allow_r,
            deny=deny_r,
            restrict_xpaths=restrict_xpaths_r,
        ),
        follow=rule["follow"],
        callback='parse_item' if ("use_content" in rule.keys()) else None
    ))
If I do a pprint(vars(onerule.link_extractor)) I can see the Python regex correctly:
'deny_res': [re.compile('\\/(customer\\+service|sponsorship|order|cart|company|specials|checkout|integration|blog|account|sitemap|prefn1=)\\/')]
{'allow_domains': set(),
'allow_res': [re.compile('com\\/store\\/details\\/')],
{'allow_domains': set(),
'allow_res': [re.compile('com\\/store\\/browse\\/')],
Testing the regexes on https://regex101.com/ seems fine as well (note: I'm using \\/ in my JSON file and \/ on regex101.com).
In my spider logfile, I can see that the product pages are being crawled, but not parsed:
2019-02-01 08:25:33 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.example.com/store/details/FILA+Hometown-Mens-Lifestyle-Shoes/5345120230028/_/A-6323521;> (referer: https://www.example.com/store/browse/footwear)
2019-02-01 08:25:47 [scrapy.core.engine] DEBUG: Crawled (200) <GET https://www.example.com/store/details/FILA+D-Formation-Mens-Lifestyle-Shoes/5345120230027/_/A-6323323> (ref
Why does the spider not parse the product pages?
(same code, different JSON works on different shops)
After hours of debugging and testing, I figured that I had to change the order of the rules.
Products to scrape rule
Deny about us etc.
Categories to follow
Now it is working. The likely reason: a CrawlSpider applies its rules in order and each link is handled only by the first rule whose extractor matches it, so the broad deny rule (which has no allow pattern) was claiming the product URLs before the rule with the parse_item callback could see them.
"rules": [
{
"allow": ["com\\/store\\/details\\/"],
"follow": true,
"use_content": true
},
{
"deny": ["\\/(customer\\+service|ways\\+to\\+save|sponsorship|order|cart|company|specials|checkout|integration|blog|brand|account|sitemap|prefn1=)\\/"],
"follow": false
},
{
"allow": ["com\\/store\\/browse\\/"],
"follow": true
}
],
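For reference, a sketch of the Rule objects the __init__ loop ends up building from the reordered JSON (Sportygenspider and parse_item are the names from the question):
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import Rule

Sportygenspider.rules = [
    # 1. product pages: crawled and parsed with parse_item, links followed
    Rule(LinkExtractor(allow=[r"com\/store\/details\/"]),
         follow=True, callback='parse_item'),
    # 2. service/company/etc. pages: matched next, neither parsed nor followed
    Rule(LinkExtractor(deny=[r"\/(customer\+service|ways\+to\+save|sponsorship|order|cart|company|specials|checkout|integration|blog|brand|account|sitemap|prefn1=)\/"]),
         follow=False),
    # 3. category pages: followed so that product links keep being discovered
    Rule(LinkExtractor(allow=[r"com\/store\/browse\/"]),
         follow=True),
]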

Allow all traffic to and from the instance using boto

The following code works as expected.
import boto.ec2
conn = boto.ec2.connect_to_region("us-east-1", aws_access_key_id='xxx', aws_secret_access_key='zzz')
sg = conn.create_security_group('test_delete', 'description')
auth = conn.authorize_security_group(sg.name, None, None, ip_protocol='tcp', from_port='22', to_port='22', cidr_ip='0.0.0.0/0')
I can select the "All traffic" option from the user interface. There is no equivalent here in boto.
I am aware of the security risks involved, but for some reason I want to open all ports (to / from) for all traffic using boto.
Use 'IpProtocol': '-1' for the "All traffic" option; see the code below for details.
import boto3
from botocore.exceptions import ClientError

def create_ingress_rules(credentials=None, securitygroupid=None, region_name=None):
    print("3-Start creating ingress rule(s)...")
    create_ingress_rules_handler = \
        boto3.client('ec2',
                     aws_access_key_id=credentials['AccessKeyId'],
                     aws_secret_access_key=credentials['SecretAccessKey'],
                     aws_session_token=credentials['SessionToken'],
                     region_name=region_name)
    try:
        data = create_ingress_rules_handler.authorize_security_group_ingress(
            GroupId=securitygroupid,
            IpPermissions=[
                {'IpProtocol': '-1',
                 'FromPort': 0,
                 'ToPort': 65535,
                 'IpRanges': [{'CidrIp': '0.0.0.0/0', 'Description': 'Temporary inbound rule for Guardrail Testing'}]}
            ])
        print('Complete creating Ingress rule...')
    except ClientError as e:
        print(e)
I think you just have to specify the min and max values for a port number. Since it is a 16-bit value, the value can range from 0 to 65535. So:
auth = conn.authorize_security_group(sg.name, None, None, ip_protocol='tcp', from_port=0, to_port=65535, cidr_ip='0.0.0.0/0')
This should allow traffic on all ports for the TCP protocol.
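If you want the literal "All traffic" rule (every protocol, not only TCP) with boto 2 as in the question, here is a hedged, untested sketch. The EC2 API uses protocol -1 to mean "all protocols", which only applies to VPC security groups, and the assumption is that boto 2's authorize_security_group passes the value through unchanged (the vpc-xxxxxxxx id is a placeholder):
import boto.ec2

conn = boto.ec2.connect_to_region("us-east-1",
                                  aws_access_key_id='xxx',
                                  aws_secret_access_key='zzz')

# Assumption: the group must be a VPC security group for protocol -1 to be accepted.
sg = conn.create_security_group('test_delete', 'description', vpc_id='vpc-xxxxxxxx')

# '-1' means all protocols; port numbers are not meaningful for this rule.
auth = conn.authorize_security_group(group_id=sg.id,
                                     ip_protocol='-1',
                                     cidr_ip='0.0.0.0/0')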

How does the VALUE? function work?

I have reduced some code of mine down to a small example, which tests whether a variable called class-name has a value assigned to it:
ask-params: function [
config-file [file!]
default-class-name
default-fields
] [
probe value? 'class-name
input
either (value? 'class-name) [
probe class-name
] [
;-- omit code in this branch for now
]
]
ret-block: ask-params %simple-class.params.txt "Person" "First Name, Last Name"
The expression value? 'class-name returns false here. On the other hand, if I fill in the missing branch with an assignment:
ask-params: function [
config-file [file!]
default-class-name
default-fields
] [
probe value? 'class-name
input
either (value? 'class-name) [
probe class-name
] [
class-name: default-class-name
]
]
ret-block: ask-params %simple-class.params.txt "Person" "First Name, Last Name"
This will return true for value? 'class-name. But in this second case, class-name: default-class-name isn't even executed yet.
I would think that class-name shouldn't exist in memory, so value? 'class-name should be returning false. Why is value? returning true instead?
You are using function. This scans the body of the function and pre-creates the local variables for you, initialized to NONE. That's why value? 'class-name becomes true (because NONE is a legal value for a variable, distinct from the situation of being "unset").
If you used func instead, then both would return false.
I don't think function behaves differently than func /local. Look at these examples:
>> f: func [/x] [value? 'x]
>> f
== true
I didn't give any value to x, but it says it HAS a value. Same for /local
>> f: func [/local x] [value? 'x]
>> f
== true
Because when you make a variable local (or a refinement), it is already given a value (none), and that is what function does.
Here I show you two examples not using FUNCTION, but otherwise equivalent to your code:
ask-params: func [config-file [file!] default-class-name default-fields] [
probe value? 'class-name
input
either (value? 'class-name) [
probe class-name
][
]
]
ask-params: func [
config-file [file!] default-class-name default-fields /local class-name
] [
probe value? 'class-name
input
either (value? 'class-name) [
probe class-name
][
]
]
While the value? function in the first example yields #[false], in the second example it yields #[true]. That is because the "refinement arguments" following an "unused refinement" (a refinement that is not used in the actual call) are initialized to #[none!] together with the refinement variable. This applies to the /local variables as well, since the /local refinement does not differ from other function refinements (except for the fact, that it is a convention to use it to define local variables).
Since the function generator uses the /local method to implement local variables "under the hood", the above description applies to all functions it generates as well.
There is another way, which avoids using FUNC/LOCAL and still allows the use of FUNCTION.
That is to not use a SET-WORD! for the assignment. Instead use the SET function on a LIT-WORD!
ask-params: function [config-file [file!] default-class-name default-fields] [
probe value? 'class-name
input
either (value? 'class-name) [
probe class-name
] [
set 'class-name default-class-name
]
]
You will get #[false] for the value? function. However, the call to SET will be setting class-name in the global environment...not as a local.