I'm using PostgreSQL jsonb and have the following in my database record:
{"tags": "[\"apple\",\" orange\",\" pineapple\",\" fruits\"]",
"filename": "testname.jpg", "title_en": "d1", "title_ja": "1",
"description_en": "d1", "description_ja": "1"}
and both SELECT statements below retrieved no results:
SELECT "photo"."id", "photo"."datadoc", "photo"."created_timestamp","photo"."modified_timestamp"
FROM "photo"
WHERE datadoc @> '{"tags": ["apple"]}';
SELECT "photo"."id", "photo"."datadoc", "photo"."created_timestamp", "photo"."modified_timestamp"
FROM "photo"
WHERE datadoc -> 'tags' ? 'apple';
I wonder whether it is because of the extra backslashes added to the JSON array string, or whether the SELECT statement is incorrect.
I'm running "PostgreSQL 10.1, compiled by Visual C++ build 1800, 64-bit" on Windows 10.
As far as any JSON parser is concerned, the value of your tags key is a string, not an array.
"tags": "[\"apple\",\" orange\",\" pineapple\",\" fruits\"]"
The string itself happens to be another JSON document, like the common case in XML where the contents of a string happen to be an XML or HTML document.
["apple"," orange"," pineapple"," fruits"]
What you need to do is extract that string, then parse it as a new JSON object, and then query that new object.
I can't test it right now, but I think that would look something like this:
(datadoc ->> 'tags') ::jsonb ? 'apple'
That is: extract the tags value as text, cast that text value as jsonb, then query that new jsonb value.
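Applied to the photo table from the question, the full query might look something like this (an untested sketch; note also that the array elements in the question keep their leading spaces, e.g. " orange", so ? 'orange' would not match while ? 'apple' does):

-- Sketch: unwrap the "tags" string as text, re-parse it as jsonb,
-- then use the ? operator to test for an array element.
SELECT "photo"."id", "photo"."datadoc", "photo"."created_timestamp", "photo"."modified_timestamp"
FROM "photo"
WHERE (datadoc ->> 'tags')::jsonb ? 'apple';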
Hey there, I know this is a very late answer, but here is a good approach, with the data I have.
initial data in the DB:
"{\"data\":{\"title\":\"test\",\"message\":\"string\",\"image\":\"string\"},\"registration_ids\":[\"s
tring\"],\"isAllUsersNotification\":false}"
to convert it to jsonb:
select (notificationData #>> '{}')::jsonb from sent_notification
result:
{"data": {"image": "string", "title": "string", "message": "string"}, "registration_ids": ["string"], "isAllUsersNotification": false}
getting a data object from json
select (notificationData #>> '{}' )::jsonb -> 'data' from sent_notification;
result:
{"image": "string", "title": "string", "message": "string"}
getting a field from above result:
select (notificationData #>> '{}' )::jsonb -> 'data' ->>'title' from sent_notification;
result:
string
performing WHERE operations,
Q: get records where title = 'string'
A:
select * from sent_notification where (notificationData #>> '{}' )::jsonb -> 'data' ->>'title' ='string'
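The same trick also works for the original question's nested tags string (a sketch, assuming the photo table from the first question):

-- datadoc -> 'tags' is a jsonb string; #>> '{}' unwraps it to plain text,
-- and the ::jsonb cast re-parses that text as a JSON array for the ? operator.
SELECT *
FROM photo
WHERE (datadoc -> 'tags' #>> '{}')::jsonb ? 'apple';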
I have a table like this (with a jsonb column):
https://dbfiddle.uk/lGuHdHEJ
If I load this JSON with Python into a DataFrame:
import pandas as pd
import json
data={
"id": [1, 2],
"myyear": [2016, 2017],
"value": [5, 9]
}
data=json.dumps(data)
df=pd.read_json(data)
print (df)
I get this result:
id myyear value
0 1 2016 5
1 2 2017 9
How can I get this result directly from the json column via SQL in a Postgres view?
Note: This assumes that your id, year, and value arrays are consistent and have the same length.
This answer uses PostgreSQL's jsonb_array_elements_text function to explode the array elements into rows.
select jsonb_array_elements_text(payload -> 'id') as "id",
jsonb_array_elements_text(payload -> 'bv_year') as "myyear",
jsonb_array_elements_text(payload -> 'value') as "value"
from main
And this gives the output below:
id myyear value
1 2016 5
2 2017 9
That said, this is not the best design for storing these properties in a jsonb object and could lead to data inconsistencies later. If it's in your control, I would suggest storing the data so that each property's mapping is clear. Some suggestions:
You can instead have separate columns for each property.
If you want to store it as jsonb only, then consider [{"id": "", "year": "", "value": ""}]
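If you do keep the current jsonb layout, the query above can be wrapped in a view to get the result directly, as the question asks. A sketch, using the same keys as the query above; the view name main_unnested is made up here, and the ::int casts are an assumption since jsonb_array_elements_text() returns text:

-- Wrap the unnesting query in a view so it can be queried like a table.
CREATE VIEW main_unnested AS
SELECT jsonb_array_elements_text(payload -> 'id')::int      AS id,
       jsonb_array_elements_text(payload -> 'bv_year')::int AS myyear,
       jsonb_array_elements_text(payload -> 'value')::int   AS value
FROM main;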
I have a table called api_details where I dump the below JSON value into the JSON column raw_data.
Now I need to make a report from this JSON string, and the expected output is something like the below:
action_name               sent_timestamp                          Sent   Delivered
campaign_2475             1600416865.928737 - 1601788183.440805  7504   7483
campaign_d_1084_SUN15_ex  1604220248.153903 - 1604222469.087918  63095  62961
Below is the sample JSON output:
{
"header": [
"#0 action_name",
"#1 sent_timestamp",
"#0 Sent",
"#1 Delivered"
],
"name": "campaign - lifetime",
"rows": [
[
"campaign_2475",
"1600416865.928737 - 1601788183.440805",
7504,
7483
],
[
"campaign_d_1084_SUN15_ex",
"1604220248.153903 - 1604222469.087918",
63095,
62961
],
[
"campaign_SUN15",
"1604222469.148829 - 1604411016.029794",
63303,
63211
]
],
"success": true
}
I tried the below, but it is not getting the results. I can do it using Python by looping through all the elements in the rows list.
But is there an easy solution in PostgreSQL (version 11)?
SELECT raw_data->'rows'->0
FROM api_details
You can use the JSONB_ARRAY_ELEMENTS() function such as:
SELECT (j.value)->>0 AS action_name,
(j.value)->>1 AS sent_timestamp,
(j.value)->>2 AS Sent,
(j.value)->>3 AS Delivered
FROM api_details
CROSS JOIN JSONB_ARRAY_ELEMENTS(raw_data->'rows') AS j
P.S. In this case the data type of raw_data is assumed to be JSONB; otherwise the argument within the function, raw_data->'rows', should be replaced with raw_data::JSONB->'rows' in order to perform explicit type casting.
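Combining both notes, a variant that also returns the counts as numbers rather than text might look like this (a sketch; the ::int casts are an assumption based on the sample values, and raw_data is assumed to be a json column as described in the question, hence the ::jsonb cast):

-- Each j.value is one inner array from "rows"; ->> N extracts element N as text.
SELECT (j.value)->>0        AS action_name,
       (j.value)->>1        AS sent_timestamp,
       ((j.value)->>2)::int AS sent,
       ((j.value)->>3)::int AS delivered
FROM api_details
CROSS JOIN JSONB_ARRAY_ELEMENTS(raw_data::jsonb -> 'rows') AS j;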
I want to generate a schema from a newline-delimited JSON file, where each row in the JSON file has variable key/value pairs. File size can vary from 5 MB to 25 MB.
Sample Data:
{"col1":1,"col2":"c2","col3":100.75}
{"col1":2,"col3":200.50}
{"col1":3,"col3":300.15,"col4":"2020-09-08"}
Expected schema:
[
{"name": "col1", "type": "INTEGER"},
{"name": "col2", "type": "STRING"},
{"name": "col3", "type": "FLOAT"},
{"name": "col4", "type": "DATE"}
]
Notes:
There is no scope to use any external tool, as files are loaded into an inbound location dynamically. The code will be used to trigger an event as soon as a file arrives and perform the schema comparison.
Your first problem is that JSON does not have a date type, so you will get str there.
What I would do, if I was you is this:
import json

# Wherever your input comes from
inp = """{"col1":1,"col2":"c2","col3":100.75}
{"col1":2,"col3":200.50}
{"col1":3,"col3":300.15,"col4":"2020-09-08"}"""

schema = {}
# Split it at newlines
for line in inp.split('\n'):
    # each line contains a "dict"
    tmp = json.loads(line)
    for key in tmp:
        # if we have not seen the key before, add it
        if key not in schema:
            schema[key] = type(tmp[key])
        # otherwise check the type
        else:
            if schema[key] != type(tmp[key]):
                raise Exception("Schema mismatch")

# format however you like
out = []
for item in schema:
    out.append({"name": item, "type": schema[item].__name__})
print(json.dumps(out, indent=2))
I'm using python types for simplicity, but you can write your own function to get the type, e.g. if you want to check if a string is actually a date.
I have a very large table (over 4M records; I work with MySQL), and it has a lot of records containing this string: \\"
I'm trying to export this table to MongoDB, but when I import the JSON file MongoDB throws this error:
Failed: error processing document #18: invalid character 't' after object key:value pair
This is my query:
MySQL
SELECT json_object(
"id", id,
"execution_id", execution_id,
"type", type,
"info", info,
"position", position,
"created_at", json_object("$date", DATE_FORMAT(created_at,'%Y-%m-%dT%TZ')),
"updated_at", json_object("$date", DATE_FORMAT(updated_at,'%Y-%m-%dT%TZ'))
)as 'json'
FROM myTable
INTO OUTFILE 'myPath';
I know the problem is this string. My question is: how can I change this particular string to \"? Changing it manually is not an option, and my knowledge of queries is limited. Please help, and thank you for reading.
The column that has this character is "info"; here is an example:
{
"id": 30,
"execution_id": 2,
"type": "PHASE",
"info": "{ \\r\\n \\"title\\": \\"Phase\\",
\\r\\n \\"order\\": \\"1\\",
\\r\\n \\"description\\": \\"Example Phase 1\\",
\\r\\n \\"step\\": \\"end\\",
\\r\\n \\"status\\": \\"True\\"\\r\\n}",
"position": 24,
"created_at": {"$date": "2018-01-11T15:01:46Z"},
"updated_at": {"$date": "2018-01-11T15:01:46Z"}
}
You should be able to do this using the MySQL REPLACE() function.
The backslash is a bit of a special case in MySQL string literals, so you will need to use \\ to represent each literal \. Thus, to replace \\ with \ you need to run something like this:
REPLACE(info,'\\\\','\\')
Your full query would look something like this:
SELECT json_object(
"id", id,
"execution_id", execution_id,
"type", type,
"info", REPLACE(info,'\\\\','\\'),
"position", position,
"created_at", json_object("$date", DATE_FORMAT(created_at,'%Y-%m-%dT%TZ')),
"updated_at", json_object("$date", DATE_FORMAT(updated_at,'%Y-%m-%dT%TZ'))
)as 'json'
FROM myTable
INTO OUTFILE 'myPath';
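As a quick sanity check of the escaping (a hypothetical literal, not data from your table), you can run something like this:

-- 'a\\\\b' is the 4-character string a\\b; the pattern '\\\\' means the
-- two-character sequence \\ and the replacement '\\' means a single \,
-- so this returns a\b (assuming the default sql_mode, where \ escapes in string literals).
SELECT REPLACE('a\\\\b', '\\\\', '\\');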
I have a NiFi flow that takes JSON files and evaluates a JSON Path argument against them. It works perfectly except when dealing with records that contain Korean text. The Jayway JSONPath evaluator does not seem to recognize the escape "\" in the headline field before the double quote character. Here is an example (newlines added to help with formatting):
{"data": {"body": "[이데일리 김관용 기자] 우리 군이 2018 남북정상회담을 앞두고 남
북간 군사적 긴장\r\n완화와 평화로운 회담 분위기 조성을 위해 23일 0시를 기해 군사분계선
(MDL)\r\n일대에서의 대북확성기 방송을 중단했다.\r\n\r\n국방부는 이날 남북정상회담 계기
대북확성기 방송 중단 관련 내용을 발표하며\r\n“이번 조치가 남북간 상호 비방과 선전활동을
중단하고 ‘평화, 새로운 시작’을\r\n만들어 나가는 성과로 이어지기를 기대한다”고 밝혔
다.\r\n\r\n전방부대 우리 군 장병이 대북확성기 방송을 위한 장비를 점검하고 있다.\r\n[사
진=국방부공동취재단]\r\n\r\n\r\n\r\n▶ 당신의 생활 속 언제 어디서나 이데일리 \r\n▶
스마트 경제종합방송 ‘이데일리 TV’ | 모바일 투자정보 ‘투자플러스’\r\n▶ 실시간 뉴스와
속보 ‘모바일 뉴스 앱’ | 모바일 주식 매매 ‘MP트래블러Ⅱ’\r\n▶ 전문가를 위한 국내 최상의
금융정보단말기 ‘이데일리 마켓포인트 3.0’ | ‘이데일리 본드웹 2.0’\r\n▶ 증권전문가방송
‘이데일리 ON’ 1666-2200 | ‘ON스탁론’ 1599-2203\n<ⓒ종합 경제정보 미디어 이데일리 -
무단전재 & 재배포 금지> \r\n",
"mimeType": "text/plain",
"language": "ko",
"headline": "국방부 \"軍 대북확성기 방송, 23일 0시부터 중단\"",
"id": "EDYM00251_1804232beO/5WAUgdlYbHS853hYOGrIL+Tj7oUjwSYwT"}}
If this object is in my file, the JSON path evaluates to blanks for all the path arguments. Is there a way to force the Jayway engine to recognize the "\"? It appears to function correctly in other languages.
I was able to resolve this after understanding the difference between definite and indefinite paths. The Jayway GitHub README points out that the following will make a path indefinite and return a list:
When evaluating a path you need to understand the concept of when a path is definite. A path is indefinite if it contains:
.. - a deep scan operator
?(<expression>) - an expression
[<number>, <number> (, <number>)] - multiple array indexes
Indefinite paths always returns a list (as represented by current JsonProvider).
My JSON looked like the following:
{
"Version":"14",
"Items":[
{"data": {"body": "[이데일리 ... \r\n",
"mimeType": "text/plain",
"language": "ko",
"headline": "국방부 \"軍 ... 중단\"",
"id": "1"}
},
{"data": {"body": "[이데일리 ... \r\n",
"mimeType": "text/plain",
"language": "ko",
"headline": "국방부 \"軍 ... 중단\"",
"id": "2"}
...
}
]
}
The JSON path selector I was using ($.data.headline) did not grab the values as I expected; it instead returned null values.
Changing it to $.Items[*].data.headline or $..data.headline returns a list of each headline.