Count occurrences along with result using DISTINCT ON in PostgreSQL (json)

I have data like this:
[
{"name": "pratha", "email": "p#g.com", "sub": { "id": 1 } },
{"name": "john", "email": "x#x.com", "sub": { "id": 5 } },
{"name": "pratha", "email": "c#d.com", "sub": { "id": 2 } }
]
This is my query to get unique and latest emails:
SELECT DISTINCT ON (jae.e->>'name')
jae.e->>'name' as name,
jae.e->>'email' as email
FROM survey_results sr
CROSS JOIN LATERAL jsonb_array_elements(sr.data_field) jae (e)
ORDER BY jae.e->>'name', jae.e->'sub'->>'id' desc
The problem is that when I add count(*) to the SELECT list, all counts come out equal.
I want the unique rows that DISTINCT ON gives me, and also a count of each name's occurrences. So in this case, pratha should be 2 and john should be 1,
with their data (not just the counts).
How can I achieve this with PostgreSQL?
See here: https://dbfiddle.uk/?rdbms=postgres_11&fiddle=f5c640958c3e4d594287632d0f4a835f

Is this what you need?
SELECT DISTINCT ON (jj->>'name')
       jj->>'name',
       jj->>'email',
       count(*) OVER (PARTITION BY jj->>'name')
FROM survey_results
JOIN LATERAL jsonb_array_elements(data_field) j(jj) ON true
ORDER BY jj->>'name', jj->'sub'->>'id' DESC
https://dbfiddle.uk/?rdbms=postgres_11&fiddle=5f07b7bcb0001ebe32aa2f1338d9d0f0
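Unlike a plain count(*) with GROUP BY, the window count is computed per name before DISTINCT ON discards the duplicate rows, so each surviving row keeps its group's total. With the sample data above the result should look something like this (pratha's latest email by sub.id is c#d.com):
name   | email   | count
-------+---------+------
john   | x#x.com |     1
pratha | c#d.com |     2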

How to deal with non-existing values using JSON_EXTRACT?

I have a list of objects. Each object contains several properties. Now I want to make a SELECT statement that gives me a list of a single property's values. The simplified list looks like this:
[
[
{
"day": "2021-10-01",
"entries": [
{
"name": "Start of competition",
"startTimeDelta": "08:30:00"
}
]
},
{
"day": "2021-10-02",
"entries": [
{
"name": "Start of competition",
"startTimeDelta": "03:30:00"
}
]
},
{
"day": "2021-10-03",
"entries": [
{
"name": "Start of competition"
}
]
}
]
]
The working SELECT is currently:
SELECT
JSON_EXTRACT(column, '$.days[*].entries[0].startTimeDelta') AS list
FROM table
The returned result is
[
"08:30:00",
"03:30:00"
]
But what I want to get (and would have expected) is
[
"08:30:00",
"03:30:00",
null
]
What can I do or how can I change the SELECT statement so that I also get NULL values in the list?
SELECT startTimeDelta
FROM test
CROSS JOIN JSON_TABLE(val,
'$[*][*].entries[*]' COLUMNS (startTimeDelta TIME PATH '$.startTimeDelta')) jsontable
https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=491f0f978d200a8a8522e3200509460e
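Assuming JSON_TABLE's default NULL ON EMPTY handling, the missing startTimeDelta on the third day comes back as a NULL row, which matches the expected list:
startTimeDelta
--------------
08:30:00
03:30:00
NULL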
Do you also have a working idea for MySQL < 8? – Lars
What is the maximum number of objects in the array on the 2nd level? – Akina
Well, it's usually less than 10. – Lars
SELECT JSON_EXTRACT(val, CONCAT('$[0][', num, '].entries[0].startTimeDelta')) startTimeDelta
FROM test
-- up to 4 - increase if needed
CROSS JOIN (SELECT 0 num UNION SELECT 1 UNION SELECT 2 UNION SELECT 3) nums
WHERE JSON_EXTRACT(val, CONCAT('$[0][', num, '].entries[0]')) IS NOT NULL;
https://www.db-fiddle.com/f/xnCCSTGQXevcpfPH1GAbUo/0
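The derived nums table enumerates candidate array positions, and the WHERE clause drops positions past the end of the array while keeping day 3, where entries[0] exists but has no startTimeDelta, so JSON_EXTRACT returns NULL for it. Given the comment that the array usually holds fewer than 10 objects, a sketch of the same query extended to ten positions:
SELECT JSON_EXTRACT(val, CONCAT('$[0][', num, '].entries[0].startTimeDelta')) startTimeDelta
FROM test
-- indices 0-9, per the "usually less than 10" comment above
CROSS JOIN (SELECT 0 num UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4
            UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) nums
WHERE JSON_EXTRACT(val, CONCAT('$[0][', num, '].entries[0]')) IS NOT NULL;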

SQL: query JSON array items by value

I have searched and can't seem to find an example of exactly what I am trying to do.
I have JSON similar to the following in multiple rows of my database:
{
"date": "0001-01-01T00:00:00",
"details": {
"detail": [
{
"item": "11",
"value": "xt"
},
{
"item": "12",
"value": "xy"
},
{
"item": "13",
"value": "xz"
},
{
"item": "14",
"value": "zz"
}
]
}
}
I want to write SQL that does this (pseudocode):
select ID
jsonColumn.value where item=11 as X
jsonColumn.value where item=12 as Y
from tbl
So I have results like this
----------------------
|ID |X |Y |
----------------------
|1 |xt |xy |
----------------------
I have tried using JSON_VALUE, but I seem to need to address the array by item number, like this:
'$.details.detail[3].value'
which doesn't really work.
I have also tried this:
SELECT F.id, x.item, x.[value]
FROM tbl F
CROSS APPLY OPENJSON(F.Json, '$.details.detail') j
CROSS APPLY OPENJSON(j.[value])
     WITH (item NVARCHAR(25) '$.item',
           [value] NVARCHAR(MAX) '$.value') AS x
WHERE F.ID = 55
This prints out all the items and values, but then I'd have to query each one separately again.
Is there a way of combining the two into one big query that won't be completely inefficient?
It seems what you want is a pivot. I personally prefer conditional aggregation over the far more restrictive PIVOT operator. The JSON you supplied was invalid, so I took some liberties correcting it in my sandbox environment:
SELECT --ID,
       MAX(CASE d.item WHEN 11 THEN d.[value] END) AS X,
       MAX(CASE d.item WHEN 12 THEN d.[value] END) AS Y
FROM (VALUES(@JSON)) V(J) -- your table
CROSS APPLY OPENJSON(V.J, '$.details')
     WITH (detail nvarchar(MAX) AS JSON) OJ
CROSS APPLY OPENJSON(OJ.detail)
     WITH (item int,
           [value] nvarchar(2)) d;
If you are using this against a table, and not limiting the data to a single row, you'll need to also add a GROUP BY clause on the relevant columns (ID?).
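A minimal sketch of that table-level variant, assuming the table, column, and key names from the question (tbl, Json, ID):
SELECT F.ID,
       MAX(CASE d.item WHEN 11 THEN d.[value] END) AS X,
       MAX(CASE d.item WHEN 12 THEN d.[value] END) AS Y
FROM tbl F
CROSS APPLY OPENJSON(F.Json, '$.details')
     WITH (detail nvarchar(MAX) AS JSON) OJ
CROSS APPLY OPENJSON(OJ.detail)
     WITH (item int,
           [value] nvarchar(2)) d
GROUP BY F.ID;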

Aggregate JSON in PostgreSQL

I have a json column with entries that look like this:
{
"pages": "64",
"stats": {
"1": { "200": "55", "400": "4" },
"2": { "200": "1" },
"3": { "200": "1", "404": "13" },
}
}
The 'stats' are collections (of various sizes) mapping HTTP status codes to counts.
I would like to aggregate the stats into two calculated columns - one for the total number of 200 responses and the other for the total number of responses (including 200s).
You can use two lateral joins to unnest the inner objects, then do conditional aggregation:
select
sum(z.cnt::int) no_responses,
sum(z.cnt::int) filter(where z.code::int = 200) no_200_responses
from mytable t
cross join lateral jsonb_each(t.data -> 'stats') as x(kx, obj)
cross join lateral jsonb_each_text(x.obj) as z(code, cnt)
Demo on DB Fiddle:
no_responses | no_200_responses
-----------: | ---------------:
          74 |               57
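These figures can be checked by hand against the sample document: 55 + 4 + 1 + 1 + 13 = 74 responses in total, of which 55 + 1 + 1 = 57 have status 200.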

How to convert a single-value object list to an array with N1QL

I have json documents in Couchbase bucket that looks like this
{
"id": "10",
"threadId": "thread1",
"createdDate": 1553285245575
}
{
"id": "11",
"threadId": "thread1",
"createdDate": 1553285245776
}
{
"id": "12",
"threadId": "thread2",
"createdDate": 1553285245575
}
I'm trying to create a query that groups documents by threadId and fetches the most recent document of each group by createdDate.
I wrote an N1QL query like this, but it only returns the createdDate and document id:
SELECT max([mes.createdDate,meta(mes).id])
from `messages` as mes
group by mes.threadId
result:
[
{
"$1": [
1553285245776,
"11"
]
},
{
"$1": [
1553285245575,
"12"
]
}
]
But I want the result to look like this (the full latest document per thread):
[{
"id": "11",
"threadId": "thread1",
"createdDate": 1553285245776
},
{
"id": "12",
"threadId": "thread2",
"createdDate": 1553285245575
}]
Any help would be appreciated.
SELECT m.*
FROM `messages` AS mes
WHERE mes.threadId IS NOT NULL
GROUP BY mes.threadId
LETTING m = MAX([mes.createdDate, mes])[1];
You can use the following index and query, which is covered by the index and avoids the fetch:
CREATE INDEX ix1 ON `messages`(threadId, createdDate DESC, id);
SELECT m.*
FROM `messages` AS mes
WHERE mes.threadId IS NOT NULL
GROUP BY mes.threadId
LETTING m = MAX([mes.createdDate,{mes.threadId,mes.createdDate, mes.id}])[1];
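N1QL's MAX compares arrays element-wise, so pairing createdDate with the document (or a projection of its fields) and taking element [1] extracts whichever document has the greatest createdDate in each group. With the three sample documents above, either query should return the latest document per thread, along the lines of:
[
  { "id": "11", "threadId": "thread1", "createdDate": 1553285245776 },
  { "id": "12", "threadId": "thread2", "createdDate": 1553285245575 }
]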

Unnest and Average operation in Couchbase

I have the following structure saved in a bucket:
[
{
"Grouplens_1M": {
"genres": [
"Thriller",
"Drama"
],
"movieId": 3952,
"ratings": [
{
"rating": 4,
"userId": 23
},
{
"rating": 5,
"userId": 36
},
{
"rating": 4,
"userId": 52
}
],
"title": "Contender, The (2000)"
}
}
]
Now I need to get all titles that are rated above 3 on average. I found out that I need to unnest the ratings and then use AVG to get the average, but it was not working. After trying to figure out how to solve this problem, I came up with this:
SELECT g.title, AVG(r_item.rating) AS avg_r
FROM Grouplens_1M AS g
UNNEST ratings r_item
WHERE r_item > 4.0
GROUP BY g.title
The query executes and shows a result, but the WHERE clause is not correct: it seems to be ignored, since it shows me all movies with their average rating.
Since you want to filter results based on the value of the derived average, use HAVING. WHERE is used to limit the documents considered by the query.
Try
SELECT g.title, avg_r
FROM Grouplens_1M AS g
UNNEST ratings r_item
GROUP BY g.title
LETTING avg_r = AVG(r_item.rating)
HAVING avg_r > 4.0;
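As a quick check against the sample document: "Contender, The (2000)" has ratings 4, 5, and 4, so avg_r = (4 + 5 + 4) / 3 ≈ 4.33, which clears the 4.0 threshold, and the title is returned.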