Pass a default value of 0 to a missing field in a JSON log search in Sumo Logic

I am trying to parse AWS ECR scan JSON logs to build a vulnerabilities table report using the query below in Sumo Logic. The issue is that AWS ECR sends the CRITICAL or HIGH fields only when such findings exist; otherwise it omits them. How can I set the CRITICAL field to 0 if CRITICAL is not found in the JSON logs?
I tried using isNull, isEmpty, and isBlank, but it seems I am missing something. Please share your advice. Thanks in advance.
_source="aws_ecr_events_test"
| json field=message "detail.repository-name" as repository_name
| json field=message "detail.image-tags" as tags
| json field=message "time" as last_scan
| json field=message "detail.finding-severity-counts.CRITICAL" as CRITICAL
| if(isNull("detail.finding-severity-counts.CRITICAL"), 0, CRITICAL) as CRITICAL
| json field=message "detail.finding-severity-counts.HIGH" as HIGH
| json field=message "detail.finding-severity-counts.MEDIUM" as MEDIUM
| json field=message "detail.finding-severity-counts.INFORMATIONAL" as INFORMATIONAL
| json field=message "detail.finding-severity-counts.LOW" as LOW
| json field=message "detail.finding-severity-counts.UNDEFINED" as UNDEFINED
| json field=message "detail.image-digest" as image_digest
| json field=message "detail.scan-status" as scan_status
| count by repository_name, tags, image_digest, scan_status, last_scan, CRITICAL, HIGH, MEDIUM, LOW, INFORMATIONAL, UNDEFINED
Example log:
"detail": {"finding-severity-counts": {"LOW": 1, "HIGH": 1}}

I think you're on the right track, but you need a "nodrop" at the end of the parse line, otherwise Sumo Logic will just drop the records that don't match the json parse statement. The isNull check should also test the parsed CRITICAL field rather than a string literal (a literal is never null):
...
| json field=message "detail.finding-severity-counts.CRITICAL" as CRITICAL nodrop
| if(isNull("detail.finding-severity-counts.CRITICAL"), 0, CRITICAL) as CRITICAL

Related

Dependency Parsing in Spacy

I want to extract verb-noun pairs from my text using dependency parsing.
I did this:
import spacy
from spacy import displacy

nlp = spacy.load('en_core_web_sm')  # an English model
document = nlp('appoint department heads or managers and assign or delegate responsibilities to them')

print("{:<15} | {:<8} | {:<15} | {:<20}".format('Token', 'Relation', 'Head', 'Children'))
print("-" * 70)
for token in document:
    print("{:<15} | {:<8} | {:<15} | {:<20}".format(
        str(token.text), str(token.dep_), str(token.head.text), str([child for child in token.children])))

displacy.render(document, style='dep', jupyter=True)
Can you help me do this more cleanly?
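For the pairs themselves, a minimal sketch of one way to collect them from the same parse (dependency labels such as dobj depend on the model, so treat this as a starting point, not the definitive approach):

# Pair each direct object with the verb that heads it
pairs = [(token.head.text, token.text)
         for token in document
         if token.dep_ in ('dobj', 'obj') and token.head.pos_ == 'VERB']
print(pairs)  # e.g. [('appoint', 'heads'), ('delegate', 'responsibilities')]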

JSON SerDe in Hive/Athena: turning one JSON object into multiple rows?

I am looking into using AWS Athena to do queries against a mass of JSON files.
My JSON files have this format (prettyprinted for convenience):
{
  "data": [
    {<ROW1>},
    {<ROW2>},
    ...
  ],
  "foo": [...],
  "bar": [...]
}
The ROWs contained in the "data" array are what should be queried. The rest of the JSON file is unimportant.
Can this be done without modifying the JSON files? If yes, how? From what I've been able to find, it looks like the SerDes (or is it Hive itself?) assume one row of output per line of input, which would mean I'm stuck with modifying all my JSON files (and turning them into JSONL?) before uploading them to S3.
(Athena uses the Hive JSON SerDe and the OpenX JSON SerDe; AFAICT, there is no option to write my own SerDe or file format...)
You can't make the serde do it automatically, but you can achieve what you're after in a query. You can then create a view to simulate a table with the data elements unwrapped.
The way you do this is to use the UNNEST keyword. This produces one new row per element in an array:
SELECT
  foo,
  bar,
  element
FROM my_table, UNNEST(data) AS t(element)
If your JSON looked like this:
{"foo": "f1", "bar": "b1", "data": [1, 2, 3]}
{"foo": "f2", "bar": "b2", "data": [4, 5]}
The result of the query would look like this:
foo | bar | element
----+-----+--------
f1 | b1 | 1
f1 | b1 | 2
f1 | b1 | 3
f2 | b2 | 4
f2 | b2 | 5
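For the view mentioned above, something along these lines should work (my_view and my_table are placeholder names; CROSS JOIN UNNEST is equivalent to the comma form used in the query):

CREATE VIEW my_view AS
SELECT foo, bar, element
FROM my_table CROSS JOIN UNNEST(data) AS t(element)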

Karate does not display a response after a POST request with status 201 [duplicate]

This question already has an answer here:
Support passing from Scenario Outline to JSON file
(1 answer)
Closed 2 years ago.
I am struggling with the following test, which is usually pretty easy...
Feature: Testing Env Create Feature

  Scenario Outline: Create works as intended
    Given url "http://localhost:10000/api/envs"
    And request {"name": <Name>, "gcpProjectName": <GcpProjectName>, "url": <Url>}
    When method POST
    Then status 201
    And match response contains {"id": #string, "name": <Name>, "gcpProjectName": <GcpProjectName>, "url": <Url>}

    Examples:
      | Name     | GcpProjectName                | Url              |
      | tests    | D-COO-ContinuousCollaboration | https://fake.com |
      | approval | Q-COO-ContinuousCollaboration | https://fake.com |
      | demo     | P-COO-ContinuousCollaboration | https://fake.com |
      | prod     | P-COO-ContinuousCollaboration | https://fake.com |
I am supposed to get a response summarizing my POST request. I get one successfully using curl, Postman, or even Swagger, but it does not appear with Karate:
[failed features:
src.test.features.envtest.env-create: [1.1:13] env-create.feature:9 - path: $, actual: '', expected: '{"id":"#string","name":"tests","gcpProjectName":"D-COO-ContinuousCollaboration","url":"https://fake.com"}', reason: not a sub-string
Does anyone know what is happening?
Thanks for your help.
Just add quotes around string substitutions:
And request {"name": "<Name>", "gcpProjectName": "<GcpProjectName>", "url": "<Url>" }

How to capture the values from Get Response Body - Robot framework

Output from Response Body
{"data":[{"id”:122,"name”:”Test 1“,”description”:”TEST 1 Test 2 …..}]},{"id”:123,"name”:”DYNAMO”……}]},{"id”:126,”name”:”T DYNAMO”……
*** Keywords ***
Capture The Data Ids
    @{ids}=    Create List    122    123    126    167    190
    ${header}=    Create Dictionary    Authorization...
    ${resp}=    Get Response    httpsbin    /data
    ${t_ids}=    Get Json Value    ${resp.content}    /data/0/id
Problem
I have created a list of the above ids in the test case, and I need to compare it against the ids returned in the response body.
t_ids returns 122, and when the 0 is replaced by 1, it returns 123.
Rather than capturing each id individually, is it possible to get them in a for loop? I tried this and it failed:
:FOR    ${i}    IN    @{ids}
\    ${the_id}=    Get Json Value    ${resp.content}    /data/${i}/id
What is the possible solution to compare the ids from the response data against the created list?
Thank you.
It is possible to do what you want, but it is always good to know what kind of data structure your variable contains. In the example below, loading a json file stands in for the received answer in ${resp.content}. To my knowledge this is a string, which is also what Get File returns.
The example is split into the json file and the robot file.
so_json.json
{
  "data": [
    {
      "id": 122,
      "name": "Test 1",
      "description": "TEST 1 Test 2"
    },
    {
      "id": 123,
      "name": "DYNAMO"
    },
    {
      "id": 126,
      "name": "T DYNAMO"
    }
  ]
}
so_robot.robot
*** Settings ***
Library    HttpLibrary.HTTP
Library    OperatingSystem
Library    Collections

*** Test Cases ***
TC
    ${json_string}=    Get File    so_json.json
    ${json_object}=    Parse Json    ${json_string}
    :FOR    ${item}    IN    @{json_object['data']}
    \    Log To Console    ${item['id']}
Which in turn gives the following result:
==============================================================================
Robot - Example
==============================================================================
Robot - Example.SO JSON
==============================================================================
TC 122
123
126
| PASS |
------------------------------------------------------------------------------
Robot - Example.SO JSON | PASS |
1 critical test, 1 passed, 0 failed
1 test total, 1 passed, 0 failed
==============================================================================
Robot - Example | PASS |
1 critical test, 1 passed, 0 failed
1 test total, 1 passed, 0 failed
==============================================================================
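To compare against the created list instead of just logging, the same loop can collect the ids first; a sketch using Collections keywords (note that Parse Json yields integer ids, so the expected list should hold integers, e.g. ${122}):

    ${expected}=    Create List    ${122}    ${123}    ${126}
    ${actual}=    Create List
    :FOR    ${item}    IN    @{json_object['data']}
    \    Append To List    ${actual}    ${item['id']}
    Lists Should Be Equal    ${actual}    ${expected}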

Trying to understand the number of ParseErrors in html5lib-tests

I was looking at the following test case in html5lib-tests:
{"description":"<!DOCTYPE\\u0008",
"input":"<!DOCTYPE\u0008",
"output":["ParseError", "ParseError", "ParseError",
["DOCTYPE", "\u0008", null, null, false]]},
source
State |Input char | Actions
--------------------------------------------------------------------------------------------
Data State | "<" | -> TagOpenState
TagOpenState | "!" | -> MarkupDeclarationOpenState
MarkupDeclarationOpenState | "DOCTYPE" | -> DOCTYPE state
DOCTYPE state | "\u0008" | Parse error; -> before DOCTYPE name state (reconsume)
before DOCTYPE name state | "\u0008" | DOCTYPE(name = "\u0008"); -> DOCTYPE name state
DOCTYPE name state | EOF | Parse error. Set force quirks on. Emit DOCTYPE -> Data state.
Data state | EOF | Emit EOF.
I'm wondering where those three errors come from. I can only track down two, so I assume I'm making an error in my logic somewhere.
The one you're missing is the one from the "Preprocessing the input stream" section:
Any occurrences of any characters in the ranges U+0001 to U+0008, U+000E to U+001F, U+007F to U+009F, U+FDD0 to U+FDEF, and characters U+000B, U+FFFE, U+FFFF, U+1FFFE, U+1FFFF, U+2FFFE, U+2FFFF, U+3FFFE, U+3FFFF, U+4FFFE, U+4FFFF, U+5FFFE, U+5FFFF, U+6FFFE, U+6FFFF, U+7FFFE, U+7FFFF, U+8FFFE, U+8FFFF, U+9FFFE, U+9FFFF, U+AFFFE, U+AFFFF, U+BFFFE, U+BFFFF, U+CFFFE, U+CFFFF, U+DFFFE, U+DFFFF, U+EFFFE, U+EFFFF, U+FFFFE, U+FFFFF, U+10FFFE, and U+10FFFF are parse errors. These are all control characters or permanently undefined Unicode characters (noncharacters).
This causes a parse error before the U+0008 character ever reaches the tokenizer. Given that the tokenizer is defined as reading from the input stream, the tokenizer tests assume the input stream has had its normal preprocessing applied to it.
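With that preprocessing step included, the trace accounts for all three ParseErrors:
Input stream preprocessing | "\u0008" | Parse error (control character)
DOCTYPE state              | "\u0008" | Parse error; -> before DOCTYPE name state (reconsume)
DOCTYPE name state         | EOF      | Parse error. Set force quirks on. Emit DOCTYPE.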