In a json embedded YAML file - replace only json values using Python - json

I have a YAML file as follows:
api: v1
hostname: abc
metadata:
name: test
annotations: {
"ip" : "1.1.1.1",
"login" : "fad-login",
"vip" : "1.1.1.1",
"interface" : "port1",
"port" : "443"
}
I am trying to read this data from a file, only replace the values of ip and vip and write it back to the file.
What I tried is:
open ("test.yaml", w) as f:
yaml.dump(object, f) #this does not help me since it converts the entire file to YAML
also json.dump() does not work too as it converts entire file to JSON. It needs to be the same format but the values need to be updated. How can I do so?

What you have is not YAML with embedded JSON, it is YAML with some the value for annotations being
in YAML flow style (which is a superset of JSON and thus closely resembles it).
This would be
YAML with embedded JSON:
api: v1
hostname: abc
metadata:
name: test
annotations: |
{
"ip" : "1.1.1.1",
"login" : "fad-login",
"vip" : "1.1.1.1",
"interface" : "port1",
"port" : "443"
}
Here the value for annotations is a string that you can hand to a JSON parser.
You can just load the file, modify it and dump. This will change the layout
of the flow-style part, but that will not influence any following parsers:
import sys
import ruamel.yaml
file_in = Path('input.yaml')
yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
yaml.width = 1024
data = yaml.load(file_in)
annotations = data['metadata']['annotations']
annotations['ip'] = type(annotations['ip'])('4.3.2.1')
annotations['vip'] = type(annotations['vip'])('1.2.3.4')
yaml.dump(data, sys.stdout)
which gives:
api: v1
hostname: abc
metadata:
name: test
annotations: {"ip": "4.3.2.1", "login": "fad-login", "vip": "1.2.3.4", "interface": "port1", "port": "443"}
The type(annotations['vip'])() establishes that the replacement string in the output has the same
quotes as the original.
ruamel.yaml currently doesn't preserve newlines in a flow style mapping/sequence.
If this has to go back into some repository with minimal chances, you can do:
import sys
import ruamel.yaml
file_in = Path('input.yaml')
def rewrite_closing_curly_brace(s):
res = []
for line in s.splitlines():
if line and line[-1] == '}':
res.append(line[:-1])
idx = 0
while line[idx] == ' ':
idx += 1
res.append(' ' * (idx - 2) + '}')
continue
res.append(line)
return '\n'.join(res) + '\n'
yaml = ruamel.yaml.YAML()
yaml.preserve_quotes = True
yaml.width = 15
data = yaml.load(file_in)
annotations = data['metadata']['annotations']
annotations['ip'] = type(annotations['ip'])('4.3.2.1')
annotations['vip'] = type(annotations['vip'])('1.2.3.4')
yaml.dump(data, sys.stdout, transform=rewrite_closing_curly_brace)
which gives:
api: v1
hostname: abc
metadata:
name: test
annotations: {
"ip": "4.3.2.1",
"login": "fad-login",
"vip": "1.2.3.4",
"interface": "port1",
"port": "443"
}
Here the 15 for width is of course highly dependent on your file and might influence other lines if they
were longer. In that case you could leave that out, and make the wrapping
that rewrite_closing_curly_brace() does split and indent the whole flow style part.
Please note that your original, and the transformed output are, invalid YAML,
that is accepted by ruamel.yaml for backward compatibility. According to the YAML
specification the closing curly brace should be indented more than the start of annotation

Related

Promtail: how to trim not JSON part from log

I have multiline log that consists correct json part (one or more lines), and after it - stack trace.
Is it possile to parse first part of the log as json, and for stack-trace make new label ("stackTrace" for example) and put there all the lines after first part?
Unfortunately, logs can contain a different number of fields in json format, and therefore it is unlikely to parse them using regex.
{ "timestamp" : "2022-03-28 14:33:00,000", "logger" : "appLog", "level" : "ERROR", "thread" : "ktor-8080", "url" : "/path","method" : "POST","httpStatusCode" : 400,"callId" : "f7a22bfb1466","errorMessage" : "Unexpected JSON token at offset 184: Encountered an unknown key 'a'. Use 'ignoreUnknownKeys = true' in 'Json {}' builder to ignore unknown keys. JSON input: { \"entityId\" : \"TGT-8c8d950036bf\", \"processCode\" : \"test\", \"tokenType\" : \"SSO_CCOM\", \"ttlMills\" : 600000, \"a\" : \"a\" }" }
com.example.info.core.WebApplicationException: Unexpected JSON token at offset 184: Encountered an unknown key 'a'.
Use 'ignoreUnknownKeys = true' in 'Json {}' builder to ignore unknown keys.
JSON input: {
"entityId" : "TGT-8c8d950036bf",
"processCode" : "test",
"tokenType" : "SSO_CCOM",
"ttlMills" : 600000,
"a" : "a"
}
at com.example.info.signtoken.SignTokenApi$signTokenModule$2$1$1.invokeSuspend(SignTokenApi.kt:94)
at com.example.info.signtoken.SignTokenApi$signTokenModule$2$1$1.invoke(SignTokenApi.kt)
at com.example.info.signtoken.SignTokenApi$signTokenModule$2$1$1.invoke(SignTokenApi.kt)
at io.ktor.util.pipeline.SuspendFunctionGun.loop(SuspendFunctionGun.kt:248)
at io.ktor.util.pipeline.SuspendFunctionGun.proceed(SuspendFunctionGun.kt:116)
at io.ktor.util.pipeline.SuspendFunctionGun.execute(SuspendFunctionGun.kt:136)
at io.ktor.util.pipeline.Pipeline.execute(Pipeline.kt:78)
at io.ktor.routing.Routing.executeResult(Routing.kt:155)
at io.ktor.routing.Routing.interceptor(Routing.kt:39)
at io.ktor.routing.Routing$Feature$install$1.invokeSuspend(Routing.kt:107)
at io.ktor.routing.Routing$Feature$install$1.invoke(Routing.kt)
at io.ktor.routing.Routing$Feature$install$1.invoke(Routing.kt)
UPD.
I've made promtail pipeline like so
scrape_configs:
- job_name: Test_AppLog
static_configs:
- targets:
- ${HOSTNAME}
labels:
job: INFO-Test_AppLog
host: ${HOSTNAME}
__path__: /home/adm_web/app.log
pipeline_stages:
- multiline:
firstline: ^\{\s?\"timestamp\"
max_lines: 128
max_wait_time: 1s
- match:
selector: '{job="INFO-Test_AppLog"}'
stages:
- regex:
expression: '(?P<log>^\{ ?\"timestamp\".*\}[\s])(?s)(?P<stacktrace>.*)'
- labels:
log:
stacktrace:
- json:
expressions:
logger: logger
url: url
method: method
statusCode: httpStatusCode
sla: sla
source: log
But in fact, json config block does not work, the result in Grafana is only two fields - log and stacktrace.
Any help would be appreciated
if the style is constantly like this maybe the easiest way is to analyze whole log string find index of last symbol "}" - then split the string using its index+1 and result should be in the first part of output array

Seeding rails project with Json file

I'm at a lost and my searches have gotten me nowhere.
In my seeds.rb file I have the following code
require 'json'
jsonfile = File.open 'db/search_result2.json'
jsondata = JSON.load jsonfile
#jsondata = JSON.parse(jsonfile)
jsondata[].each do |data|
Jobpost.create!(post: data['title'],
link: data['link'],
image: data['pagemap']['cse_image']['src'] )
end
Snippet of the json file looks like this:
{
"kind": "customsearch#result",
"title": "Careers Open Positions - Databricks",
"link": "https://databricks.com/company/careers/open-positions",
"pagemap": {
"cse_image": [
{
"src": "https://databricks.com/wp-content/uploads/2020/08/careeers-new-og-image-sept20.jpg"
}
]
}
},
Fixed jsondata[].each to jasondata.each. Now I'm getting the following error:
TypeError: no implicit conversion of String into Integer
jsondata[] says to call the [] method with no arguments on the object in the jsondata variable. Normally [] would take an index like jsondata[0] to get the first element or a start and length like jsondata[0, 5] to get the first five elements.
You want to call the each method on jsondata, so jsondata.each.
So this is very specific to what you have posted:
require 'json'
file = File.open('path_to_file.json').read
json_data = JSON.parse file
p json_data['kind'] #=> "customsearch#result"
# etc for all the other keys
now maybe the json you posted is just the first element in an array:
[
{}, // where each {} is the json you posted
{},
{},
// etc
]
in which case you will indeed have to iterate:
require 'json'
file = File.open('path_to_file.json').read
json_data = JSON.parse file
json_data.each do |data|
p data['kind'] #=> "customsearch#result"
end

Emit Python embedded object as native JSON in YAML document

I'm importing webservice tests from Excel and serialising them as YAML.
But taking advantage of YAML being a superset of JSON I'd like the request part of the test to be valid JSON, i.e. to have delimeters, quotes and commas.
This will allow us to cut and paste requests between the automated test suite and manual test tools (e.g. Postman.)
So here's how I'd like a test to look (simplified):
- properties:
METHOD: GET
TYPE: ADDRESS
Request URL: /addresses
testCaseId: TC2
request:
{
"unitTypeCode": "",
"unitNumber": "15",
"levelTypeCode": "L",
"roadNumber1": "810",
"roadName": "HAY",
"roadTypeCode": "ST",
"localityName": "PERTH",
"postcode": "6000",
"stateTerritoryCode": "WA"
}
In Python, my request object has a dict attribute called fields which is the part of the object to be serialised as JSON. This is what I tried:
import yaml
def request_presenter(dumper, request):
json_string = json.dumps(request.fields, indent=8)
return dumper.represent_str(json_string)
yaml.add_representer(Request, request_presenter)
test = Test(...including embedded request object)
serialised_test = yaml.dump(test)
I'm getting:
- properties:
METHOD: GET
TYPE: ADDRESS
Request URL: /addresses
testCaseId: TC2
request: "{
\"unitTypeCode\": \"\",\n
\"unitNumber\": \"15\",\n
\"levelTypeCode": \"L\",\n
\"roadNumber1\": \"810\",\n
\"roadName\": \"HAY\",\n
\"roadTypeCode\": \"ST\",\n
\"localityName\": \"PERTH\",\n
\"postcode\": \"6000\",\n
\"stateTerritoryCode\": \"WA\"\n
}"
...only worse because it's all on one line and has white space all over the place.
I tried using the | style for literal multi-line strings which helps with the line breaks and escaped quotes (it's more involved but this answer was helpful.) However, escaped or multiline, the result is still a string that will need to be parsed separately.
How can I stop PyYaml analysing the JSON block as a string and make it just accept a block of text as part of the emitted YAML? I'm guessing it's something to do with overriding the emitter but I could use some help. If possible I'd like to avoid post-processing the serialised test to achieve this.
Ok, so this was the solution I came up with. Generate the YAML with a placemarker ahead of time. The placemarker marks the place where the JSON should be inserted, and also defines the root-level indentation of the JSON block.
import os
import itertools
import json
def insert_json_in_yaml(pre_insert_yaml, key, obj_to_serialise):
marker = '%s: null' % key
marker_line = line_of_first_occurrence(pre_insert_yaml, marker)
marker_indent = string_indent(marker_line)
serialised = json.dumps(obj_to_serialise, indent=marker_indent + 4)
key_with_json = '%s: %s' % (key, serialised)
serialised_with_json = pre_insert_yaml.replace(marker, key_with_json)
return serialised_with_json
def line_of_first_occurrence(basestring, substring):
"""
return line number of first occurrence of substring
"""
lineno = lineno_of_first_occurrence(basestring, substring)
return basestring.split(os.linesep)[lineno]
def string_indent(s):
"""
return indentation of a string (no of spaces before a nonspace)
"""
spaces = ''.join(itertools.takewhile(lambda c: c == ' ', s))
return len(spaces)
def lineno_of_first_occurrence(basestring, substring):
"""
return line number of first occurrence of substring
"""
return basestring[:basestring.index(substring)].count(os.linesep)
embedded_object = {
"unitTypeCode": "",
"unitNumber": "15",
"levelTypeCode": "L",
"roadNumber1": "810",
"roadName": "HAY",
"roadTypeCode": "ST",
"localityName": "PERTH",
"postcode": "6000",
"stateTerritoryCode": "WA"
}
yaml_string = """
---
- properties:
METHOD: GET
TYPE: ADDRESS
Request URL: /addresses
testCaseId: TC2
request: null
after_request: another value
"""
>>> print(insert_json_in_yaml(yaml_string, 'request', embedded_object))
- properties:
METHOD: GET
TYPE: ADDRESS
Request URL: /addresses
testCaseId: TC2
request: {
"unitTypeCode": "",
"unitNumber": "15",
"levelTypeCode": "L",
"roadNumber1": "810",
"roadName": "HAY",
"roadTypeCode": "ST",
"localityName": "PERTH",
"postcode": "6000",
"stateTerritoryCode": "WA"
}
after_request: another value

Postgrex how to define json library

I'm just trying to use Postgrex without any kind of ecto setup, so just the example from the documentation readme.
Here is what my module looks like:
defmodule Receive do
def start(_type, _args) do
{:ok, pid} = Postgrex.start_link(
hostname: "localhost",
username: "john",
# password: "",
database: "property_actions",
extensions: [{Postgrex.Extensions.JSON}]
)
Postgrex.query!(
pid,
"INSERT INTO actions (search_terms) VALUES ($1)",
[
%{foo: 'bar'}
]
)
end
end
when I run the code I get
** (RuntimeError) type `json` can not be handled by the types module Postgrex.DefaultTypes, it must define a `:json` library in its options to support JSON types
Is there something I'm not setting up correctly? From what I've gathered in the documentation, I shouldn't even need to have that extensions line because json is handled by default.
On Postgrex <= 0.13, you need to define your own types:
Postgrex.Types.define(MyApp.PostgrexTypes, [], json: Poison)
and then when starting Postgrex:
Postgrex.start_link(types: MyApp.PostgrexTypes)
On Postgrex >= 0.14 (currently master), it was made easier:
config :postgrex, :json_library, Poison

MSON to JSON-Schema "One of" issue

I'm try to describe in ApiBlueprint MSON notation an object with variable part.
Here the simple code in ApiBlueprint :
FORMAT: 1A
# Test API
## Services [/Service/{id}]
### GET Service info [GET]
+ Request (application/json)
+ Headers
Authorization: JWT <token>
+ Response 200 (application/json)
+ Attributes (array[ServiceResource], fixed)
# Data Structures
## Resource (object)
### Properties
+ id: `a6vhAo3FG` (string, fixed)
+ created_at: `2016-07-01T15:11:09.553Z` (string, required)
+ updated_at: `2017-11-22T08:07:55.002Z` (string, required)
## Service (object)
### Properties
+ type: tcp_service (string, required)
- One Of
- config (TcpService, required)
- config (IcmpService, required)
## ServiceResource (Resource)
### Properties
- Include Service
## TcpService (object)
### Properties
+ port: `80` (number, required)
+ request_str: `HEAD` (string, required)
+ expect_response_str: `200 OK` (string, required)
## IcmpService (object)
### Properties
+ timeout_ms: `1000` (number, required)
+ packet_size_bytes: `1000` (number, required)
+ ttl: `128` (number, required)
It renders perfectly in apiary.io but validation of generated Json Schema in https://json-schema-validator.herokuapp.com reports an error:
[ {
"level" : "error",
"schema" : {
"loadingURI" : "#",
"pointer" : "/items"
},
"instance" : {
"pointer" : "/0"
},
"domain" : "validation",
"keyword" : "additionalProperties",
"message" : "object instance has properties which are not allowed by the schema: [\"config\"]",
"unwanted" : [ "config" ]
} ]
Maybe I'm doing something wrong?
Is there any way to discribe in MSON the array of objects where the object has a variant part which gives correct JSON Schema?
I believe you are looking for the fixed-type type attribute. fixed would mean that the values are fixed and cannot be anything other than the provided example value. When you used fixed-type you are indicating that the type is fixed, but the values are not.
+ Attributes (array[ServiceResource], fixed-type)