How to aggregate array values in JSONB?

I have the following PostgreSQL table:
CREATE TABLE orders
(
id uuid NOT NULL,
order_date timestamp without time zone,
data jsonb
);
Where data contains json documents like this:
{
"screws": [
{
"qty": 1000,
"value": "Wood screw"
},
{
"qty": 500,
"value": "Drywall screw"
},
{
"qty": 500,
"value": Concrete screw"
}
],
"nails": [
{
"qty": 1000,
"value": "Round Nails"
}
]
}
How can I get an overall quantity for all types of screws across all orders? Something like this :)
select value, sum(qty) from orders where section = 'screws' group by value;

I am not quite sure why you are trying to sum up the qty values, because GROUP BY value only makes sense if the same value occurs several times and can be summed, e.g. if you had the value Wood screw twice.
Nevertheless, this would be the query:
step-by-step demo: db<>fiddle
SELECT
elems ->> 'value' AS value,
SUM((elems ->> 'qty')::int) AS qty
FROM
orders,
jsonb_array_elements(data -> 'screws') elems
GROUP BY 1
Expand the screws array into one row per array element with jsonb_array_elements()
Get the qty value with the ->> operator (which returns text) and cast it to type int
If really necessary, aggregate these key/value pairs.
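The unnest-then-aggregate logic of the query above can be sketched in plain Python with the json module (hypothetical order rows, mirroring what jsonb_array_elements and SUM ... GROUP BY do):

```python
import json
from collections import defaultdict

# Two hypothetical order rows, each holding a JSON document shaped like the question's
rows = [
    '{"screws": [{"qty": 1000, "value": "Wood screw"}, {"qty": 500, "value": "Drywall screw"}]}',
    '{"screws": [{"qty": 250, "value": "Wood screw"}]}',
]

totals = defaultdict(int)
for raw in rows:
    data = json.loads(raw)
    # mirrors jsonb_array_elements(data -> 'screws'): one element per array entry
    for elem in data.get("screws", []):
        # mirrors SUM((elems ->> 'qty')::int) ... GROUP BY value
        totals[elem["value"]] += int(elem["qty"])

print(dict(totals))  # {'Wood screw': 1250, 'Drywall screw': 500}
```

Note how the two Wood screw entries from different orders collapse into one total, which is exactly why the GROUP BY is there.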

Related

psql equivalent of pandas .to_dict('index')

I want to return a psql table, but I want to return it in json format.
Let's say the table looks like this...
id | name | value
---|------|------
1  | joe  | 6
2  | bob  | 3
3  | joey | 2
But I want to return it as an object like this...
{
"1": {
"name": "joe",
"value": 6
},
"2": {
"name": "bob",
"value": 3
},
"3": {
"name": "joey",
"value": 2
}
}
So if I were doing this with pandas and the table existed as a dataframe, I could transform it like this...
df.set_index('id').to_dict('index')
But I want to be able to do this inside the psql code.
The closest I've gotten is by doing something like this
select
json_build_object (
id,
json_build_object (
'name', name,
'value', value
)
)
from my_table
But instead of aggregating this all into one object, the result is a bunch of separate objects separated by rows at the key level... that being said, it's kinda the same idea...
Any ideas?
You want jsonb_object_agg() to get this:
select jsonb_object_agg(id, jsonb_build_object('name', name, 'value', value))
from my_table
But this is not going to work well for real-world sized tables: PostgreSQL has a limit of roughly 1 GB for a single value, so this might fail with an out-of-memory error on larger tables (or with large values inside the columns).
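The shape jsonb_object_agg produces here can be sketched in Python (hypothetical rows standing in for my_table):

```python
import json

# Hypothetical rows from my_table: (id, name, value)
rows = [(1, "joe", 6), (2, "bob", 3), (3, "joey", 2)]

# mirrors jsonb_object_agg(id, jsonb_build_object('name', name, 'value', value)):
# one object keyed by id, built from all rows
result = {str(i): {"name": n, "value": v} for i, n, v in rows}

print(json.dumps(result, indent=2))
```

This is also where the memory caveat comes from: the whole aggregate is materialised as a single value.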

Query to count all records without certain key in the json column of the snowflake table

I am trying to count the number of records in a Snowflake table that do not have certain keys in the json column of that record.
Here's what the Snowflake table looks like:
EMP_ID|DEPARTMENT_NAME|EID|DETAILS
EMP10001 | Finance |10008918 |{
"name": "Alec George",
"Year_Joined": "2013",
"Ready_to_transfer": "no",
"Ready_to_permanently_WFH": "yes",
}
Now I want to count the records that don't have any keys starting with Ready_ in the details column of the Snowflake table, and group the counts by Department_Name.
Note : There can be multiple keys that start with Ready_ in the details.
Currently my count query also returns records that do have keys starting with Ready_.
You can flatten to get all the keys, then for each record you can count the number of keys that start with your desired string:
with data as (
select $1 emp_id, $2 dep, $3 eid, parse_json($4) details
from (values
('EMP10001','Finance', 10008918, '{
"name": "Alec George",
"Year_Joined": "2013", "Ready_to_transfer": "no", "Ready_to_permanently_WFH": "yes", }')
,('EMP10002','Finance', 10008918, '{
"name": "Alex George",
"Year_Joined": "2013", }')
)
)
select seq, count_if(detail.key like 'Ready_%') how_many_ready
from data, table(flatten(details)) detail
group by 1
;
Then you only need to count the records where how_many_ready is 0 (those are the ones without any Ready_ key), grouped by department.
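The flatten-and-count step, plus that final filter, can be sketched in Python (hypothetical records shaped like the ones in the question):

```python
import json
from collections import Counter

# Hypothetical (emp_id, department, details-json) records
records = [
    ("EMP10001", "Finance", '{"name": "Alec George", "Ready_to_transfer": "no"}'),
    ("EMP10002", "Finance", '{"name": "Alex George", "Year_Joined": "2013"}'),
    ("EMP10003", "HR", '{"name": "Pat Doe"}'),
]

per_dept = Counter()
for emp_id, dept, raw in records:
    details = json.loads(raw)
    # mirrors count_if(detail.key like 'Ready_%') over the flattened keys
    how_many_ready = sum(1 for k in details if k.startswith("Ready_"))
    if how_many_ready == 0:  # keep only records without any Ready_ key
        per_dept[dept] += 1

print(dict(per_dept))  # {'Finance': 1, 'HR': 1}
```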

MariaDB JSON Query

I have a table with the following structure:
id | json
---|-----
The JSON structure is as follows:
{
"roomType": "Deluxe",
"package": "Full-Board +",
"comb": {
"adult": "1",
"infant": "0",
"child": "0",
"teen": "0"
},
"rates": [
{
"rateFrom": "2021-02-11",
"rateTo": "2021-02-20",
"ratePrice": "6000"
}, {
"rateFrom": "2021-02-21",
"rateTo": "2021-02-26",
"ratePrice": "6500"
}]
}
There can be many entries in attribute rates.
Now, I need to return the rows where any of the attribute rateTo from rates is greater than today's date.
That is, if today's date is less than at least one rateTo of the entries of rates, then return that row.
This is the first time I am querying JSON and I am not sure if the structure of my JSON is correct for the type of querying I want to do.
This would be much easier if you abandoned JSON as a datatype and used properly normalised tables of rooms, packages, comb (which might be part of packages?) and rates. If you are stuck with JSON though, one way to get the data you want is: extract all the rateTo values from each JSON document into a comma-separated (and comma-ended) list of dates (for your sample data this would be 2021-02-20,2021-02-26,); then split that into the individual dates 2021-02-20 and 2021-02-26; then SELECT rows from the original table if one of the associated dates is after today. You can do this with a couple of recursive CTEs:
WITH RECURSIVE toDate AS (
SELECT id, CONCAT(REGEXP_REPLACE(JSON_EXTRACT(`json`, '$.rates[*].rateTo'), '[ "\\[\\]]', ''), ',') AS toDates
FROM rooms
),
Dates AS (
SELECT id, SUBSTRING_INDEX(toDates, ',', 1) AS toDate, REGEXP_REPLACE(toDates, '^[^,]+,', '') AS balance
FROM toDate
UNION ALL
SELECT id, SUBSTRING_INDEX(balance, ',', 1), REGEXP_REPLACE(balance, '^[^,]+,', '')
FROM Dates
WHERE INSTR(balance, ',') > 0
)
SELECT *
FROM rooms r
WHERE EXISTS (SELECT *
FROM Dates d
WHERE d.id = r.id AND d.toDate > CURDATE())
Demo on dbfiddle
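The filtering rule itself (keep a row if any rateTo in its rates array is after today) can be sketched in Python, using hypothetical rows shaped like the question's JSON:

```python
import json
from datetime import date

# Hypothetical (id, json) rows; the 2099 date guarantees row 1 qualifies
rows = [
    (1, '{"roomType": "Deluxe", "rates": [{"rateTo": "2021-02-20"}, {"rateTo": "2099-12-31"}]}'),
    (2, '{"roomType": "Standard", "rates": [{"rateTo": "2020-01-01"}]}'),
]

today = date.today().isoformat()  # ISO-8601 dates compare correctly as strings

# keep a row if at least one rateTo is after today (mirrors the EXISTS check)
keep = [row_id for row_id, raw in rows
        if any(r["rateTo"] > today for r in json.loads(raw)["rates"])]

print(keep)  # [1]
```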

Creating a good mysql schema for web front end usage

I have recently started working with a company who sends me data via JSON, the JSON looks like this:
[{
"name": "company1",
"dataset": null,
"data": [{
"x": "2015-01-01T00:00",
"y": 182
},
{
"x": "2015-01-02T00:00",
"y": 141
}
]
},
{
"name": "company2",
"dataset": null,
"data": [{
"x": "2015-01-01T00:00",
"y": 182
},
{
"x": "2015-01-02T00:00",
"y": 141
}
]
},
{
"name": "company3",
"dataset": null,
"data": [{
"x": "2015-01-01T00:00",
"y": 182
},
{
"x": "2015-01-02T00:00",
"y": 141
}
]
}
]
I get 57 of these daily (almost identical, with the only difference being that the y value changes according to which metric it is), one for each metric tracked by the company. As you can see, the way they've written the JSON (x & y key/value pairs) makes it rather hard to store nicely.
I've made 57 tables in MySQL, one for each JSON feed, that hold the values for that specific metric; however, querying to get all activity for a day takes a LONG time due to the number of joins.
I'm hoping one of you might be able to tell me the best way to insert this into a MySQL table so that I end up with either one table containing all 57 values, or the best way to query across 57 tables without waiting hours for MySQL to load it.
This is a personal project for my own business, so funds are tight and I am doing what I can at the moment - sorry if this sounds ridiculous!
If I were to be required to store this data, I would personally be inclined to use a table for all the results, with a company table holding the 'master' information about each company.
The company table would be structured like this:
company_id INT NOT NULL AUTO_INCREMENT,
name VARCHAR(50) -- Arbitrary size - change as needed
The company_update table would be structured like this:
company_update_id INT NOT NULL AUTO_INCREMENT,
company_id INT NOT NULL,
update_timestamp DATETIME,
update_value INT -- This may need to be another type
There would be a foreign key between company_update.company_id to company.company_id.
When receiving the JSON update:
Check that the name exists in the company table. If not, create it.
Get the company's unique ID (for use in the next step) from the company table.
For each item in the data array, add a record to company_update using the appropriate company ID.
Now in order to get results for all companies, I would just use a query like:
SELECT c.name,
cu.update_timestamp,
cu.update_value
FROM company c
INNER JOIN company_update cu ON cu.company_id = c.company_id
ORDER BY c.name, cu.update_timestamp DESC
Note that I have made the following assumptions which you may need to address:
The company name size is at most 50 characters
The y value in the data array is an integer
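The three ingestion steps above can be sketched in Python; the table and column names come from the answer, and the in-memory dictionary and list stand in for the real MySQL tables:

```python
import json

# One hypothetical daily payload, shaped like the question's JSON
payload = json.loads("""[
  {"name": "company1", "dataset": null,
   "data": [{"x": "2015-01-01T00:00", "y": 182}, {"x": "2015-01-02T00:00", "y": 141}]}
]""")

company = {}         # name -> company_id (stands in for the company table)
company_update = []  # (company_id, update_timestamp, update_value) rows

for entry in payload:
    name = entry["name"]
    # steps 1-2: create the company if it doesn't exist, then fetch its id
    if name not in company:
        company[name] = len(company) + 1
    cid = company[name]
    # step 3: one company_update row per item in the data array
    for point in entry["data"]:
        company_update.append((cid, point["x"], point["y"]))

print(company, company_update)
```

In real MySQL the first two steps would be an INSERT ... ON DUPLICATE KEY or a SELECT-then-INSERT against company, and step 3 a batch INSERT into company_update.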

MySQL JSON: How to select MIN() value

I have a MySQL table:
The combo column is a JSON datatype
id | combo
1 | {"qty": "2", "variations": [{"name": "Cover", "value": "Paperback"}], "price": "14.00"}
2 | {"qty": "1", "variations": [{"name": "Cover", "value": "Hardback"}], "price": "7.00"}
3 | {"qty": "1", "variations": [{"name": "Cover", "value": "Paperback"}], "price": "15.00"}
I'm trying to get the MIN() price of 7.00 but as they're strings, it returns 14.00.
Can this be done? Here's what I tried:
SELECT
JSON_UNQUOTE(MIN(combo->'$.price')) AS min_price
FROM itemListings
GROUP BY id
I also tried removing the quotes around the stored prices but it gave the same results.
Your code is giving you the lexicographical minimum; when sorting strings, a "1" comes before a "7", so "14.00" sorts before "7.00", just like "apple" comes before "bat" despite "apple" being the longer string.
You want the numerical minimum, so cast the value to a decimal number:
SELECT
id, -- you probably want to select the grouped-by value too
MIN(CAST(combo->'$.price' AS DECIMAL(10,2))) AS min_price
FROM itemListings
GROUP BY id
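The lexicographic-vs-numeric pitfall is easy to reproduce in Python with the same price strings:

```python
prices = ["14.00", "7.00", "15.00"]

# comparing strings: "1" sorts before "7", so "14.00" wins
print(min(prices))                    # '14.00'

# comparing numbers after a cast, like CAST(... AS DECIMAL(10,2)) in the query
print(min(float(p) for p in prices))  # 7.0
```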