AWS Athena Nested JSON error - json

I have a nested JSON file that looks like this:
\n \"total\" : 510,\n \"start\" : 0,\n \"count\" : 500,\n \"data\" : [ {\n \"id\" : 294,\n \"candidate\" : {\n \"id\" : 5275,\n \"firstName\" : \"bob\",\n \"lastName\" : \"bob\"\n },\n \"sendingUser\" : {\n \"id\" : 5,\n \"firstName\" : \"tom\",\n \"lastName\" : \"tom\"\n },\n \"dateAdded\" : 1487865908960,\n \"jobOrder\" : {\n \"id\" : 71,\n \"title\" : \"Job\"\n },\n \"status\" : \"1st Interview\",\n \"_score\" : 1.0\n }
I have this stored in S3 and am trying to create a table in AWS Athena. The statement I entered in the query editor is below:
CREATE EXTERNAL TABLE IF NOT EXISTS cvtest (
data struct < candidate struct <id string, firstName string, lastName string>,
sendingUser struct <id string, firstName string, lastName string>,
dateAdded string,
jobOrder string,
score string
>
)
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'
LOCATION 's3://es-cvsent';
But the query fails with the error below:
FAILED: ParseException line 2:26 missing : at 'struct' near '<EOF>' line 2:37 missing : at 'string' near '<EOF>' line 2:55 missing : at 'string' near '<EOF>' line 2:72 missing : at 'string' near '<EOF>' line 3:28 missing : at 'struct' near '<EOF>' line 3:39 missing : at 'string' near '<EOF>' line 3:57 missing : at 'string' near '<EOF>' line 3:74 missing : at 'string' near '<EOF>' line 4:26 missing : at 'string' near '<EOF>' line 5:25 missing : at 'string' near '<EOF>' line 6:22 missing : at 'string' near '<EOF>'
This query ran against the "test" database, unless qualified by the query. Please post the error message on our forum or contact customer support with Query Id: 84e876e8-b947-490e-b2b6-7bf9c376266e.
Can anyone see what I am doing wrong?

The data itself doesn't look like valid JSON, but regardless of that you should be able to create the table, because the underlying data is not validated at creation time (only querying the table afterwards would fail).
The problem is that your syntax is wrong; see the documentation. A colon (":") is required between the field name and its data type within a struct definition.
This should work:
data struct<candidate:struct<id:string, firstName:string, lastName:string>,
sendingUser:struct<id:string, firstName:string, lastName:string>,
dateAdded:string,
jobOrder:string,
score:string
>
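To make the colon rule concrete, here is a minimal sketch in Python that renders a nested schema as a Hive-style struct type string. The helper name hive_type and the dict layout are hypothetical and only for illustration; the field names and types are taken from the answer above.

```python
def hive_type(schema):
    """Render a nested dict as a Hive type string; leaf values are type names."""
    if isinstance(schema, dict):
        # Inside struct<...>, each field is written as name:type.
        fields = ",".join(f"{name}:{hive_type(t)}" for name, t in schema.items())
        return f"struct<{fields}>"
    return schema


person = {"id": "string", "firstName": "string", "lastName": "string"}
data_type = hive_type({
    "candidate": person,
    "sendingUser": person,
    "dateAdded": "string",
    "jobOrder": "string",
    "score": "string",
})

# At the top level of CREATE TABLE, column name and type are separated by a space.
column_definition = f"data {data_type}"
print(column_definition)
```

The colon applies only inside struct<...>; the column itself (data) is still followed by a space and then its type.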

How to auto format wrong json with code in flutter

I am trying to parse a JSON response. However, there is a trailing comma at the end of the response, which throws an error while decoding it. How do I automatically fix it in code so that the comma is removed and the JSON becomes valid?
[{"id" : "9991","last_message" : "How about tomorrow then?","members" : ["John", "Daniel", "Rachel"],"topic" : "pizza night", "modified_at" : 1599814026153}, {"id" : "9992","last_message" : "I will send them to you asap","members" : ["Raphael"],"topic" : "slides", "modified_at" : 1599000026153}, {"id" : "9993","last_message" : "Can you please?","members" : ["Mum", "Dad", "Bro"],"topic" : "pictures", "modified_at" : 1512814026153},]
Error
D/EGL_emulation( 7121): app_time_stats: avg=32.94ms min=4.95ms max=83.26ms count=30
E/flutter ( 7121): [ERROR:flutter/shell/common/shell.cc(93)] Dart Unhandled Exception: FormatException: Unexpected character (at character 435)
E/flutter ( 7121): ..."Mum", "Dad", "Bro"],"topic" : "pictures", "modified_at" : 1512814026153},]
You should include the relevant source code, but assuming you're using an http.Response object, you can do something like this:
// Replace a trailing ",]" with "]" only when the body ends with it.
String bodyRes = response.body;
bodyRes = bodyRes.endsWith(',]') ? bodyRes.replaceFirst(',]', ']', bodyRes.length - 2) : bodyRes;
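The same idea, sketched in Python rather than Dart purely for illustration: drop a comma that sits directly before the final closing bracket, then decode. The function name loads_lenient and the shortened sample body are hypothetical. Like the snippet above, it only handles a comma at the very end of the payload, not trailing commas nested deeper in the document.

```python
import json
import re


def loads_lenient(text: str):
    """Parse JSON after dropping a comma that directly precedes the final ] or }."""
    # Anchored at the end of the text, so commas elsewhere are left untouched.
    cleaned = re.sub(r",\s*([\]}])\s*$", r"\1", text)
    return json.loads(cleaned)


body = '[{"id": "9991", "topic": "pizza night"}, {"id": "9992", "topic": "slides"},]'
chats = loads_lenient(body)
print(len(chats))
```

Input that is already valid passes through unchanged, so the helper can be applied unconditionally.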

Promtail: how to trim not JSON part from log

I have a multiline log entry that consists of a valid JSON part (one or more lines) followed by a stack trace.
Is it possible to parse the first part of the log as JSON, and create a new label ("stackTrace", for example) that holds all the lines after the first part?
Unfortunately, the JSON part can contain a varying number of fields, so parsing the fields with a regex is unlikely to work.
{ "timestamp" : "2022-03-28 14:33:00,000", "logger" : "appLog", "level" : "ERROR", "thread" : "ktor-8080", "url" : "/path","method" : "POST","httpStatusCode" : 400,"callId" : "f7a22bfb1466","errorMessage" : "Unexpected JSON token at offset 184: Encountered an unknown key 'a'. Use 'ignoreUnknownKeys = true' in 'Json {}' builder to ignore unknown keys. JSON input: { \"entityId\" : \"TGT-8c8d950036bf\", \"processCode\" : \"test\", \"tokenType\" : \"SSO_CCOM\", \"ttlMills\" : 600000, \"a\" : \"a\" }" }
com.example.info.core.WebApplicationException: Unexpected JSON token at offset 184: Encountered an unknown key 'a'.
Use 'ignoreUnknownKeys = true' in 'Json {}' builder to ignore unknown keys.
JSON input: {
"entityId" : "TGT-8c8d950036bf",
"processCode" : "test",
"tokenType" : "SSO_CCOM",
"ttlMills" : 600000,
"a" : "a"
}
at com.example.info.signtoken.SignTokenApi$signTokenModule$2$1$1.invokeSuspend(SignTokenApi.kt:94)
at com.example.info.signtoken.SignTokenApi$signTokenModule$2$1$1.invoke(SignTokenApi.kt)
at com.example.info.signtoken.SignTokenApi$signTokenModule$2$1$1.invoke(SignTokenApi.kt)
at io.ktor.util.pipeline.SuspendFunctionGun.loop(SuspendFunctionGun.kt:248)
at io.ktor.util.pipeline.SuspendFunctionGun.proceed(SuspendFunctionGun.kt:116)
at io.ktor.util.pipeline.SuspendFunctionGun.execute(SuspendFunctionGun.kt:136)
at io.ktor.util.pipeline.Pipeline.execute(Pipeline.kt:78)
at io.ktor.routing.Routing.executeResult(Routing.kt:155)
at io.ktor.routing.Routing.interceptor(Routing.kt:39)
at io.ktor.routing.Routing$Feature$install$1.invokeSuspend(Routing.kt:107)
at io.ktor.routing.Routing$Feature$install$1.invoke(Routing.kt)
at io.ktor.routing.Routing$Feature$install$1.invoke(Routing.kt)
Update:
I've set up a Promtail pipeline like this:
scrape_configs:
- job_name: Test_AppLog
  static_configs:
  - targets:
    - ${HOSTNAME}
    labels:
      job: INFO-Test_AppLog
      host: ${HOSTNAME}
      __path__: /home/adm_web/app.log
  pipeline_stages:
  - multiline:
      firstline: ^\{\s?\"timestamp\"
      max_lines: 128
      max_wait_time: 1s
  - match:
      selector: '{job="INFO-Test_AppLog"}'
      stages:
      - regex:
          expression: '(?P<log>^\{ ?\"timestamp\".*\}[\s])(?s)(?P<stacktrace>.*)'
      - labels:
          log:
          stacktrace:
      - json:
          expressions:
            logger: logger
            url: url
            method: method
            statusCode: httpStatusCode
            sla: sla
          source: log
But the json stage does not seem to work: the result in Grafana contains only two fields, log and stacktrace.
Any help would be appreciated
If the format is always like this, the easiest way may be to scan the whole log string, find the index of the last "}", and split the string at index + 1; the JSON should then be the first element of the resulting array.
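One caveat with searching for the last "}": in the sample above the stack trace itself contains braces (the "JSON input: { ... }" lines), so the last brace of the whole entry is not the end of the JSON part. A minimal sketch of an alternative, assuming the entry can be pre-processed outside Promtail, is to let a JSON decoder report where the first object ends. The function name and the shortened sample entry are hypothetical.

```python
import json


def split_json_prefix(entry: str):
    """Split a log entry into (parsed leading JSON object, remaining text)."""
    text = entry.lstrip()
    # raw_decode parses one JSON value from the start of the string and
    # returns it with the index where it ended, ignoring what follows.
    obj, end = json.JSONDecoder().raw_decode(text)
    return obj, text[end:].lstrip("\n")


entry = (
    '{"level": "ERROR", "errorMessage": "bad input: { \\"a\\" : \\"a\\" }"}\n'
    'com.example.WebApplicationException: bad input\n'
    'JSON input: {\n"a" : "a"\n}\n'
    '\tat com.example.Api.invoke(Api.kt:94)'
)
fields, stack_trace = split_json_prefix(entry)
print(fields["level"])
print(stack_trace.splitlines()[0])
```

Because the decoder tracks string quoting and nesting, braces inside the errorMessage value or in the stack trace do not affect where the split happens.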

Retrieving of a file with mongofiles leads to a JSON error

I am trying to retrieve an XML file from my MongoDB database with mongofiles, but I get a JSON parsing error. Here is an excerpt from my terminal:
$ mongofiles -d anhalytics get_id 'ObjectId("5e7f56d30800611b17fc66b1")'
2020-09-15T16:55:33.205+0200 connected to: mongodb://localhost/
2020-09-15T16:55:33.205+0200 Failed: error parsing id as Extended JSON: invalid JSON number. Position: 18
I am using MongoDB server version 4.2.9.
Here is the record of the target file:
{
"_id" : ObjectId("5e7f56d30800611b17fc66b1"),
"filename" : "5e7f56d30800611b17fc66b0.tei.xml",
"aliases" : null,
"chunkSize" : NumberLong(261120),
"uploadDate" : ISODate("2020-03-28T13:53:23.708Z"),
"length" : NumberLong(35405),
"contentType" : null,
"md5" : "eeafae907c44b207071ccb6036148808"
}
Any idea why I am getting this error? Thanks!
The message "error parsing id as Extended JSON" indicates that the mongofiles tool had trouble parsing the id string that was provided on the command line.
That parsing is done in the parseOrCreateId function here: https://github.com/mongodb/mongo-tools/blob/master/mongofiles/mongofiles.go#L330
That function wraps the value from the command line in another string like {"_id":"%s"}, so the value actually passed to the bson.UnmarshalExtJSON function would have been
"{\"_id\":\"ObjectId(\"5e7f56d30800611b17fc66b1\")\"}"
Position 18 of that string, as called out in the error message, is the quotation mark immediately preceding the hex string.
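The position can be checked with a short Python sketch. It rebuilds the wrapped string exactly as described above (the wrapping format is taken from this answer, not verified against the tool's source here) and uses Python's json module only as a stand-in for the tool's parser.

```python
import json

object_id = 'ObjectId("5e7f56d30800611b17fc66b1")'
wrapped = '{"_id":"%s"}' % object_id

# Position 18, counting from 1, is index 17 in Python.
char_at_18 = wrapped[17]

# The inner quotation mark ends the string value early, so the text is not JSON.
try:
    json.loads(wrapped)
    parses = True
except json.JSONDecodeError:
    parses = False

# Extended JSON writes an ObjectId as {"$oid": "<hex>"}, which is plain JSON.
oid_form = json.dumps({"$oid": "5e7f56d30800611b17fc66b1"})
print(char_at_18, parses, oid_form)
```

Since the error message says the id is parsed as Extended JSON, passing the id in the {"$oid": ...} form instead of the shell's ObjectId(...) syntax may be worth trying; that is an assumption, not something the answer above confirms.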

Elasticsearch bulk data insertion

In my Node app I am using Elasticsearch as my backend. I am trying to insert data from a JSON file, but I get an error.
My json:
{"index":{"_index":"mfissample", "_type":"place_mfi", "_id": "1"}}
{"PAR" : 42.31,"Center":"xx","District":"yy","Country" : "vv","GLP" : 13073826.63,"State" : "zz","SSScore" :null, "location":"80.102134,12.897401"}
{"index":{"_index":"mfissample", "_type":"place_mfi", "_id": "2"}}
{"PAR" : 42.31,"Center" : "xx","District" : "yy","Country" : "zz","GLP" : 13073826.63,"State" : "vv","SSScore" :null,
"location":"80.102134,12.897401"}
My command:
curl -XPOST 'http://localhost:9200/_bulk' --data-binary @jsonbulk.json
The error:
{"error":"JsonParseException[Unexpected character (':' (code 58)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')\n at [Source: [B@792c4b55; line: 1, column: 12]]","status":500}
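Remove the line break after "SSScore" :null, and before "location":"80.102134,12.897401". The bulk API is newline-delimited, so each action line and each document must sit on a single line.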
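A quick way to spot such a split before sending the file is to check that every non-empty line parses as JSON on its own. The sketch below does that in Python; the function name find_bad_bulk_lines is hypothetical and the payload is shortened from the question.

```python
import json


def find_bad_bulk_lines(payload: str):
    """Return 1-based numbers of non-empty lines that are not complete JSON."""
    bad = []
    for number, line in enumerate(payload.splitlines(), start=1):
        if not line.strip():
            continue
        try:
            json.loads(line)
        except json.JSONDecodeError:
            bad.append(number)
    return bad


payload = (
    '{"index":{"_index":"mfissample","_type":"place_mfi","_id":"2"}}\n'
    '{"PAR":42.31,"SSScore":null,\n'
    '"location":"80.102134,12.897401"}\n'
)
print(find_bad_bulk_lines(payload))
```

Both halves of the split document are reported, since neither is a complete JSON value by itself.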

Aptana gives error with JSON format

Format:
{
"lastUpdate" : "20/9/2012-12:12",
"data":[{
"user" : "_name_",
"username" : "_fullname_",
"photoURL" : "_url_"
}, {
"user" : "_name_",
"username" : "_fullname_",
"photoURL" : "_url_"
}, {
"user" : "_name_",
"username" : "_fullname_",
"photoURL" : "_url_"
}]
}
Aptana reports errors at the ":" characters.
Why is that? It seems I'm not having any problems receiving and processing the data.
[EDIT 1] Error given: Syntax Error: unexpected token ":"
In Aptana, content is parsed as JSON only when you create or open a file with the .json extension.
When a JSON object sits inside a .js file, only the JavaScript parser runs, which is why you see the error: there, ":" is not a valid token for JS.