Remove duplicate JSON representations from JSON files

I have ~2000 JSON files that have duplicate representations like:
{
  "vulnerabilities": [],
  "ok": true,
  "dependencyCount": 2,
  "org": "org",
  "isPrivate": true,
  "licensesPolicy": {
    "severities": {},
    "orgLicenseRules": {
      "AGPL-1.0": {
        "licenseType": "AGPL-1.0",
        "severity": "high",
        "instructions": ""
      }
    }
  }
}
{
  "vulnerabilities": [],
  "ok": true,
  "dependencyCount": 2,
  "org": "org",
  "isPrivate": true,
  "licensesPolicy": {
    "severities": {},
    "orgLicenseRules": {
      "AGPL-1.0": {
        "licenseType": "AGPL-1.0",
        "severity": "high",
        "instructions": ""
      }
    }
  }
}
Essentially, I want to remove the duplicate vulnerability-report entries from each JSON file in a directory.
How might I do that?
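
A minimal sketch of one approach in Python, assuming each file holds whole JSON objects concatenated back to back as in the example (the reports directory name is a placeholder). It splits each file into objects with raw_decode, keeps the first copy of each object by comparing a canonical serialization, and rewrites the file:

import json
from pathlib import Path

def iter_objects(text):
    # Yield each top-level JSON object from text that may hold
    # several objects concatenated back to back.
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        while pos < len(text) and text[pos].isspace():
            pos += 1  # skip whitespace between objects
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

def dedupe_file(path):
    seen, unique = set(), []
    for obj in iter_objects(path.read_text()):
        # Serializing with sorted keys makes structurally equal objects
        # produce identical strings, regardless of key order.
        canonical = json.dumps(obj, sort_keys=True)
        if canonical not in seen:
            seen.add(canonical)
            unique.append(obj)
    path.write_text("\n".join(json.dumps(obj, indent=2) for obj in unique))

for path in Path("reports").glob("*.json"):  # placeholder directory name
    dedupe_file(path)

Comparing the sorted-key serialization rather than the raw text means two objects count as duplicates even if their keys appear in a different order.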

Related

How should the JSON output look in NLog?

I am using the NLog JSON layout and took this from the example:
{
  "Logging": {
    "NLog": {
      "IncludeScopes": false,
      "ParseMessageTemplates": true,
      "CaptureMessageProperties": true
    }
  },
  "NLog": {
    "autoreload": true,
    "internalLogLevel": "Info",
    "internalLogFile": "c:/temp/console-example-internal2.log",
    "throwConfigExceptions": true,
    "targets": {
      "console": {
        "type": "Console",
        "layout": "${date}|${level:uppercase=true}|${message} ${exception:format=tostring}|${logger}|${all-event-properties}"
      },
      "file": {
        "type": "AsyncWrapper",
        "target": {
          "wrappedFile": {
            "type": "File",
            "fileName": "c:/temp/console-example2.log",
            "layout": {
              "type": "JsonLayout",
              "Attributes": [
                { "name": "timestamp", "layout": "${date:format=o}" },
                { "name": "level", "layout": "${level}" },
                { "name": "logger", "layout": "${logger}" },
                { "name": "message", "layout": "${message:raw=true}" },
                { "name": "properties", "encode": false, "layout": { "type": "JsonLayout", "includeallproperties": "true" } }
              ]
            }
          }
        }
      }
    },
    "rules": [
      {
        "logger": "*",
        "minLevel": "Trace",
        "writeTo": "File,Console"
      }
    ]
  }
}
https://github.com/NLog/NLog.Extensions.Logging/blob/master/examples/NetCore2/ConsoleExampleJsonConfig/appsettings.json
On this line I saw { "name": "message", "layout": "${message:raw=true}" } and changed raw to false.
When I log this:
var test = "Something";
logger.Info("This is what is stored in the variable: {var}", test);
I get
{
  "message": "This is what is stored in the variable: \"Something\""
}
Why is it in quotes?
When I change raw back to true I get
{
  "message": "This is what is stored in the variable: {var}"
}
How do I just get "This is what is stored in the variable: Something"?

Failed to get certain files from IPFS cluster

Because the public IPFS gateway is too slow, I set up my own IPFS cluster using Kubernetes on AWS.
However, when I tried to get files from the cluster, I succeeded for some files but consistently failed for others (the failing ones kept failing).
How do I debug this? Did I make a mistake in the configuration? Here's the configuration I used:
{
  "API": {
    "HTTPHeaders": {
      "Access-Control-Allow-Methods": [
        "PUT",
        "POST"
      ],
      "Access-Control-Allow-Origin": [
        "http://localhost:3000",
        "http://127.0.0.1:5001",
        "https://webui.ipfs.io"
      ]
    }
  },
  "Addresses": {
    "API": "/ip4/0.0.0.0/tcp/5001",
    "Announce": [],
    "AppendAnnounce": [],
    "Gateway": "/ip4/0.0.0.0/tcp/8080",
    "NoAnnounce": [],
    "Swarm": [
      "/ip4/0.0.0.0/tcp/4001",
      "/ip6/::/tcp/4001",
      "/ip4/0.0.0.0/udp/4001/quic",
      "/ip6/::/udp/4001/quic"
    ]
  },
  "AutoNAT": {},
  "Bootstrap": [
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmNnooDu7bfjPFoTZYxMNLWUQJyrVwtbZg5gBMjTezGAJN",
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmQCU2EcMqAqQPR2i9bChDtGNJchTbq5TbXJJ16u19uLTa",
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmbLHAnMoJPWSCR5Zhtx6BHJX9KiKNN6tpvbUcqanj75Nb",
    "/dnsaddr/bootstrap.libp2p.io/p2p/QmcZf59bWwK5XFi76CZX8cbJ4BhTzzA3gU1ZjYZcYW3dwt",
    "/ip4/104.131.131.82/tcp/4001/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ",
    "/ip4/104.131.131.82/udp/4001/quic/p2p/QmaCpDMGvV2BGHeYERUEnRQAwe3N8SzbUtfsmvsqQLuvuJ"
  ],
  "DNS": {
    "Resolvers": {}
  },
  "Datastore": {
    "BloomFilterSize": 0,
    "GCPeriod": "1h",
    "HashOnRead": false,
    "Spec": {
      "mounts": [
        {
          "child": {
            "path": "blocks",
            "shardFunc": "/repo/flatfs/shard/v1/next-to-last/2",
            "sync": true,
            "type": "flatfs"
          },
          "mountpoint": "/blocks",
          "prefix": "flatfs.datastore",
          "type": "measure"
        },
        {
          "child": {
            "compression": "none",
            "path": "datastore",
            "type": "levelds"
          },
          "mountpoint": "/",
          "prefix": "leveldb.datastore",
          "type": "measure"
        }
      ],
      "type": "mount"
    },
    "StorageGCWatermark": 90,
    "StorageMax": "10GB"
  },
  "Discovery": {
    "MDNS": {
      "Enabled": true,
      "Interval": 10
    }
  },
  "Experimental": {
    "AcceleratedDHTClient": false,
    "FilestoreEnabled": false,
    "GraphsyncEnabled": false,
    "Libp2pStreamMounting": false,
    "P2pHttpProxy": false,
    "StrategicProviding": false,
    "UrlstoreEnabled": false
  },
  "Gateway": {
    "APICommands": [],
    "HTTPHeaders": {
      "Access-Control-Allow-Headers": [
        "X-Requested-With",
        "Range",
        "User-Agent"
      ],
      "Access-Control-Allow-Methods": [
        "GET"
      ],
      "Access-Control-Allow-Origin": [
        "*"
      ]
    },
    "NoDNSLink": false,
    "NoFetch": false,
    "PathPrefixes": [],
    "PublicGateways": null,
    "RootRedirect": "",
    "Writable": false
  },
  "Identity": {
    "PeerID": "<intentionally hide>"
  },
  "Internal": {},
  "Ipns": {
    "RecordLifetime": "",
    "RepublishPeriod": "",
    "ResolveCacheSize": 128
  },
  "Migration": {
    "DownloadSources": [],
    "Keep": ""
  },
  "Mounts": {
    "FuseAllowOther": false,
    "IPFS": "/ipfs",
    "IPNS": "/ipns"
  },
  "Peering": {
    "Peers": null
  },
  "Pinning": {
    "RemoteServices": {}
  },
  "Plugins": {
    "Plugins": null
  },
  "Provider": {
    "Strategy": ""
  },
  "Pubsub": {
    "DisableSigning": false,
    "Router": ""
  },
  "Reprovider": {
    "Interval": "12h",
    "Strategy": "all"
  },
  "Routing": {
    "Type": "dht"
  },
  "Swarm": {
    "AddrFilters": null,
    "ConnMgr": {
      "GracePeriod": "20s",
      "HighWater": 900,
      "LowWater": 600,
      "Type": "basic"
    },
    "DisableBandwidthMetrics": false,
    "DisableNatPortMap": false,
    "RelayClient": {
      "Enabled": true
    },
    "RelayService": {
      "Enabled": true
    },
    "Transports": {
      "Multiplexers": {},
      "Network": {},
      "Security": {}
    }
  }
}
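
One way to narrow this down is to check whether the failing CIDs are unavailable everywhere or only through your cluster. A minimal Python sketch, where the cluster gateway address and the CIDs are placeholders you would substitute:

import requests

CLUSTER_GATEWAY = "http://localhost:8080"  # placeholder: your cluster's gateway
PUBLIC_GATEWAY = "https://ipfs.io"
failing_cids = ["Qm..."]                   # placeholder: the CIDs that keep failing

for cid in failing_cids:
    for base in (CLUSTER_GATEWAY, PUBLIC_GATEWAY):
        try:
            # A 200 means this gateway could find and fetch the block;
            # a timeout usually means no reachable peer is providing it.
            response = requests.get(f"{base}/ipfs/{cid}", timeout=30)
            print(cid, base, response.status_code)
        except requests.RequestException as exc:
            print(cid, base, "error:", exc)

If the public gateway succeeds where your cluster times out, the content exists but your nodes cannot reach its providers, which points at the Swarm/NAT side of the configuration rather than the Gateway section; running ipfs dht findprovs <cid> on a cluster node shows whether any provider is visible from inside the cluster.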

Get all keys and values from JSON data

I have the following JSON data:
{
  "version": "2017-07-05",
  "clientid": "s0001",
  "statement": {
    "publisher": "raspberry pi",
    "publishstatus": false,
    "subscribestatus": false,
    "subscribedetails": {
      "topic": "None",
      "assetdetails": {
        "assetname": "None",
        "assetid": "None"
      }
    }
  }
}
How can I get all the keys and their respective values in Python?
For example:
"version": "2017-07-05",
"clientid": "s0001",
"publishstatus": false,
"topic": "None",
"assetname": "None",

Azure Data Factory V2 Copy Activity with REST API giving one row for nested JSON

I am trying to flatten a nested JSON payload returned from a REST source. The pipeline code is as follows.
The problem is that this pipeline returns only the first object from the JSON dataset and skips all the rest of the rows.
Can you please guide me on how to iterate over the nested objects?
Thanks
Sameet
{
  "name": "STG_NCR2",
  "properties": {
    "activities": [
      {
        "name": "Copy data1",
        "type": "Copy",
        "dependsOn": [],
        "policy": {
          "timeout": "7.00:00:00",
          "retry": 0,
          "retryIntervalInSeconds": 30,
          "secureOutput": false,
          "secureInput": false
        },
        "userProperties": [],
        "typeProperties": {
          "source": {
            "type": "RestSource",
            "httpRequestTimeout": "00:01:40",
            "requestInterval": "00.00:00:00.010",
            "requestMethod": "GET",
            "additionalHeaders": {
              "OData-MaxVersion": "4.0",
              "OData-Version": "4.0",
              "Prefer": "odata.include-annotations=*"
            }
          },
          "sink": {
            "type": "AzureSqlSink"
          },
          "enableStaging": false,
          "translator": {
            "type": "TabularTranslator",
            "mappings": [
              {
                "source": {
                  "path": "$['value'][0]['tco_ncrid']"
                },
                "sink": {
                  "name": "NCRID"
                }
              },
              {
                "source": {
                  "path": "['tco_name']"
                },
                "sink": {
                  "name": "EquipmentSerialNumber"
                }
              }
            ],
            "collectionReference": "$['value'][0]['tco_ncr_tco_equipment']"
          }
        },
        "inputs": [
          {
            "referenceName": "Rest_PowerApps_NCR",
            "type": "DatasetReference"
          }
        ],
        "outputs": [
          {
            "referenceName": "Prestaging_PowerApps_NCREquipments",
            "type": "DatasetReference"
          }
        ]
      }
    ],
    "annotations": []
  }
}
The JSON is in the following format:
[
  {
    "value": [
      {
        "tco_ncrid": "abc-123",
        "tco_ncr_tco_equipment": [
          {
            "tco_name": "abc"
          }
        ]
      },
      {
        "tco_ncrid": "abc-456",
        "tco_ncr_tco_equipment": [
          {
            "tco_name": "xyz"
          },
          {
            "tco_name": "yzx"
          }
        ]
      }
    ]
  }
]
This can be resolved by amending the translator property as follows:
"translator": {
  "type": "TabularTranslator",
  "mappings": [
    {
      "source": {
        "path": "$.['value'][0].['tco_ncrid']"
      },
      "sink": {
        "name": "NCRID",
        "type": "String"
      }
    },
    {
      "source": {
        "path": "$.['value'][0].['tco_text_id']"
      },
      "sink": {
        "name": "EquipmentDescription",
        "type": "String"
      }
    },
    {
      "source": {
        "path": "['tco_name']"
      },
      "sink": {
        "name": "EquipmentSerialNumber",
        "type": "String"
      }
    }
  ],
  "collectionReference": "$.['value'][*].['tco_ncr_tco_equipment']"
}
This forces the pipeline to iterate over the nested array, but as you can see, NCRID is hardcoded to the first element of the value array. This is not exactly what I want, as I am looking for all Equipment Serial Numbers against every NCRID. Still researching...
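
For reference, here is a minimal Python sketch (outside ADF, assuming the response is saved to a placeholder file response.json) of the cross join the pipeline would ideally produce: one row per equipment entry, each carrying its parent's NCRID.

import json

with open("response.json") as f:  # placeholder file holding the payload above
    payload = json.load(f)

rows = []
for page in payload:                      # top-level array
    for ncr in page["value"]:             # one entry per NCR
        for equipment in ncr["tco_ncr_tco_equipment"]:
            rows.append({
                "NCRID": ncr["tco_ncrid"],
                "EquipmentSerialNumber": equipment["tco_name"],
            })

# rows == [{'NCRID': 'abc-123', 'EquipmentSerialNumber': 'abc'},
#          {'NCRID': 'abc-456', 'EquipmentSerialNumber': 'xyz'},
#          {'NCRID': 'abc-456', 'EquipmentSerialNumber': 'yzx'}]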

Firebase-database: Structure for event application with data consistency

As many have before me, I am building an event app using Firebase as the database.
After watching https://www.youtube.com/watch?v=i1n9Kw3AORw I think my current data structure is wrong, so I want to change it.
The structure below is meant to keep track of which events users have been to, their responses, and their check-ins, and to make it easy to update user information as in the video.
{
  "events": {
    "event1": {
      "title": "Event 1",
      "venue": {
        "venue1": true
      },
      "guests": {
        "user1": true,
        "user2": true
      }
    },
    "event2": {}
  },
  "locationHistory": {
    "location1": {
      "user1": true,
      "user2": true
    },
    "location2": {
      "user3": true
    }
  },
  "userCheckIns": {
    "event1": {
      "user1": true
    }
  },
  "userResponses": { // not sure about these responses
    "event1": {
      "user1": true,
      "user2": false,
      "user3": "maybe"
    },
    "event2": {}
  },
  "eventGuests": {
    "event1": {
      "user1": {
        "name": "user name 1"
      },
      "user2": {
        "name": "user name 2"
      }
    },
    "event2": {}
  },
  "userEvents": {
    "user1": {
      "event1": true
    },
    "user2": {
      "event1": true
    }
  },
  "venues": {
    "venue1": {
      "name": "Venue name",
      "location": "location1"
    }
  },
  "users": {
    "user1": {
      "name": "user name 1",
      "events": {
        "event1": true
      }
    },
    "user2": {
      "name": "user name 2",
      "events": {
        "event1": true
      }
    }
  }
}
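
With a fanned-out structure like this, each action has to write to several locations at once so the denormalized copies stay in sync. A minimal sketch using the Firebase Admin SDK for Python, where the service-account path, database URL, and IDs are placeholders:

import firebase_admin
from firebase_admin import credentials, db

# Placeholder credentials and database URL.
cred = credentials.Certificate("service-account.json")
firebase_admin.initialize_app(cred, {"databaseURL": "https://example-app.firebaseio.com"})

def respond_to_event(event_id, user_id, response):
    # A single multi-location update commits atomically, so the three
    # indexes below can never disagree with each other.
    db.reference().update({
        f"userResponses/{event_id}/{user_id}": response,
        f"userEvents/{user_id}/{event_id}": True,
        f"events/{event_id}/guests/{user_id}": True,
    })

respond_to_event("event1", "user3", "maybe")

The same pattern applies to check-ins (userCheckIns plus locationHistory); updating every index in one update call is what gives you the consistency discussed in the video.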