Postgres - How to search and aggregate from a JSON column - json

I have an asset_quantities table as below
id | asset_type | quantity | site_id | asset_ids_json
1 | 'Container' | 3 | 1 | [{"id":1,"make":"am1","model":"amo1"},{"id":2,"make":"am1","model":"amo2"},{"id":3,"make":"am3","model":"amo3"}]
2 | 'Cage' | 3 | 1 | [{"id":4,"make":"bm1","model":"bmo1"},{"id":5,"make":"bm2","model":"bmo2"},{"id":6,"make":"bm2","model":"cmo3"}]
3 | 'Crate' | 3 | 1 | [{"id":7,"make":"cm1","model":"cmo1"},{"id":8,"make":"cm1","model":"cmo1"},{"id":9,"make":"cm1","model":"cmo2"}]
I want to write a SQL query in Postgres that will give me the quantity count of each asset type for a given make or model.
E.g. If I wanted to fetch the quantity for each asset type where make='am1', the result set would look like
site_id | Container_qty | Cage_qty | Crate_qty
1 | 2 | 0 | 0
E.g. If I wanted to fetch the quantity for each asset type where make='cm1', the result set would look like
site_id | Container_qty | Cage_qty | Crate_qty
1 | 0 | 0 | 3
I have written the query below to pivot the values from the 'asset_type' rows into columns but can't figure out how to filter and aggregate the counts based on the attributes inside the field 'asset_ids_json'. It is safe to assume that the length of the json array inside asset_ids_json will always be the same as the value in the 'quantity' column.
select
aq.site_id,
sum(case when aq.asset_type = 'Container' then aq.quantity end) container_qty,
sum(case when aq.asset_type = 'Cage' then aq.quantity end) cage_qty ,
sum(case when aq.asset_type = 'Crate' then aq.quantity end) crate_qty
from asset_quantities aq
group by aq.site_id;
The crux of my question is how can I filter & aggregate results based on the attributes inside the json column 'asset_ids_json'. I'm using Postgres 9.4.

step-by-step demo:db<>fiddle
SELECT
site_id,
SUM(case when asset_type = 'Container' then quantity end) container_qty,
SUM(case when asset_type = 'Cage' then quantity end) cage_qty ,
SUM(case when asset_type = 'Crate' then quantity end) crate_qty
FROM (
SELECT DISTINCT ON (id)
site_id,
asset_type,
quantity
FROM asset_quantities aq,
json_array_elements(asset_ids_json)
WHERE value ->> 'make' = 'cm1'
) s
GROUP BY site_id
To filter on the contents of a JSON array you first have to expand the array. json_array_elements() creates one row for each element, exposed as a column named value, which makes it possible to test a specific key with value ->> 'make'.
This expansion multiplies the original rows (three times here, because each array has three elements). Since you are only interested in the original site_id, asset_type and quantity, which are simply copied into each new row, you can remove the duplicates with DISTINCT. DISTINCT ON (id) keeps one row per original id, so two different rows whose JSON arrays contain the same key/value pair are both kept.
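Note that the query above sums the row's quantity, so for make='am1' it reports 3 containers rather than the 2 shown in the question. Below is a minimal sketch of counting the matching elements instead. It uses Python's sqlite3 as a stand-in for Postgres, purely so the example is self-contained. The JSON is expanded in Python into a hypothetical asset_elements table (one row per array element, which is what json_array_elements() produces), and the pivot then adds 1 per matching row instead of summing quantity.

```python
import json
import sqlite3

# Sample rows from the question: (id, asset_type, quantity, site_id, asset_ids_json)
ROWS = [
    (1, "Container", 3, 1, '[{"id":1,"make":"am1","model":"amo1"},{"id":2,"make":"am1","model":"amo2"},{"id":3,"make":"am3","model":"amo3"}]'),
    (2, "Cage", 3, 1, '[{"id":4,"make":"bm1","model":"bmo1"},{"id":5,"make":"bm2","model":"bmo2"},{"id":6,"make":"bm2","model":"cmo3"}]'),
    (3, "Crate", 3, 1, '[{"id":7,"make":"cm1","model":"cmo1"},{"id":8,"make":"cm1","model":"cmo1"},{"id":9,"make":"cm1","model":"cmo2"}]'),
]


def qty_by_make(make):
    """Count matching JSON elements per asset type for each site."""
    conn = sqlite3.connect(":memory:")
    # asset_elements is a hypothetical table: one row per JSON array element.
    conn.execute(
        "CREATE TABLE asset_elements "
        "(site_id INTEGER, asset_type TEXT, make TEXT, model TEXT)"
    )
    for _id, asset_type, _quantity, site_id, assets_json in ROWS:
        for asset in json.loads(assets_json):
            conn.execute(
                "INSERT INTO asset_elements VALUES (?, ?, ?, ?)",
                (site_id, asset_type, asset["make"], asset["model"]),
            )
    result = conn.execute(
        """
        SELECT site_id,
               SUM(CASE WHEN asset_type = 'Container' THEN 1 ELSE 0 END) AS container_qty,
               SUM(CASE WHEN asset_type = 'Cage' THEN 1 ELSE 0 END) AS cage_qty,
               SUM(CASE WHEN asset_type = 'Crate' THEN 1 ELSE 0 END) AS crate_qty
        FROM asset_elements
        WHERE make = ?
        GROUP BY site_id
        """,
        (make,),
    ).fetchall()
    conn.close()
    return result


print(qty_by_make("am1"))
print(qty_by_make("cm1"))
```

In Postgres itself, the equivalent is presumably to drop the DISTINCT ON from the subquery and sum 1 per expanded row rather than quantity. A site with no matching element at all produces no row in this sketch.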

Related

Restructure MySQL table to create one column per id

I have one MySQL table with the following structure:
+-------------+-------+-----------+
| timestamp | value | sensor_id |
+-------------+-------+-----------+
Let's say as an example that there are three possible sensor_ids: id1, id2 and id3.
I would like to perform a query to output the data in the following format:
+-------------+-----------+-----------+-----------+
| timestamp | value_id1 | value_id2 | value_id3 |
+-------------+-----------+-----------+-----------+
It is straightforward to extract data from one ID only but I'm struggling to combine the 3 of them.
SELECT timestamp,
value AS 'value_id1'
FROM table
WHERE sensor_id='id1'
EDIT for clarifications:
(timestamp,sensor_id) is unique over the table
Use conditional aggregation:
select
timestamp,
max(case when sensor_id = 'id1' then value end) value_id1,
max(case when sensor_id = 'id2' then value end) value_id2,
max(case when sensor_id = 'id3' then value end) value_id3
from mytable
group by timestamp
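A small runnable sketch of this conditional aggregation, using Python's sqlite3 instead of MySQL so it needs no server. The sample readings are invented for illustration. Because (timestamp, sensor_id) is unique, each CASE yields at most one non-NULL value per group, so MAX simply picks that value, and a sensor with no reading for a timestamp comes back as NULL (None in Python).

```python
import sqlite3


def pivot_sensors(rows):
    """Pivot (timestamp, value, sensor_id) rows into one column per sensor."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (timestamp TEXT, value REAL, sensor_id TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    out = conn.execute(
        """
        SELECT timestamp,
               MAX(CASE WHEN sensor_id = 'id1' THEN value END) AS value_id1,
               MAX(CASE WHEN sensor_id = 'id2' THEN value END) AS value_id2,
               MAX(CASE WHEN sensor_id = 'id3' THEN value END) AS value_id3
        FROM mytable
        GROUP BY timestamp
        ORDER BY timestamp
        """
    ).fetchall()
    conn.close()
    return out


# Invented sample data: sensor id2 has no reading at t2.
sample = [
    ("t1", 1.0, "id1"), ("t1", 2.0, "id2"), ("t1", 3.0, "id3"),
    ("t2", 4.0, "id1"), ("t2", 5.0, "id3"),
]
pivoted = pivot_sensors(sample)
print(pivoted)
```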

SQL - Only return when no rows hold a value

I am looking for a way to return an ID only if NO rows hold a certain value
For example:
*ID* | *Date*
1 | 01/01/2001
1 | 02/02/2002
1 | 03/03/2003
If I want SQL to return the ID only if no dates are equal to 02/02/2002, how would I script that? I have tried and failed with the below:
select *ID*
from (example)
where date != 02/02/2002
The problem is that this still returns the ID - 1, as the first and last row do not equal 02/02/2002. What I am aiming for is no returned results because at least one row held the matching date.
I would need the script to completely skip the ID when there is a matching date in any row.
For clarity the below should return the ID when using the same 'select' as above because no dates are matching:
*ID* | *Date*
2 | 03/03/2003
2 | 04/04/2004
2 | 05/05/2005
You need a GROUP BY with a HAVING clause:
select ID
From yourtable
group by ID
Having count(case when date != '02/02/2002' then 1 end) = count(*)
As mentioned by mathguy, this also works
select ID
From yourtable
group by ID
Having count(case when date = '02/02/2002' then 1 end) = 0
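To see the HAVING filter in action, here is a minimal sketch with Python's sqlite3, assuming the dates are stored as plain text in the format used in the question. It loads both example IDs and keeps only those for which no row carries the excluded date.

```python
import sqlite3


def ids_without_date(rows, excluded):
    """Return the IDs for which no row has the excluded date."""
    conn = sqlite3.connect(":memory:")
    # Dates are kept as text here; that storage choice is an assumption.
    conn.execute("CREATE TABLE yourtable (id INTEGER, date TEXT)")
    conn.executemany("INSERT INTO yourtable VALUES (?, ?)", rows)
    out = [
        row[0]
        for row in conn.execute(
            """
            SELECT id
            FROM yourtable
            GROUP BY id
            HAVING COUNT(CASE WHEN date = ? THEN 1 END) = 0
            ORDER BY id
            """,
            (excluded,),
        )
    ]
    conn.close()
    return out


rows = [
    (1, "01/01/2001"), (1, "02/02/2002"), (1, "03/03/2003"),
    (2, "03/03/2003"), (2, "04/04/2004"), (2, "05/05/2005"),
]
print(ids_without_date(rows, "02/02/2002"))
```

COUNT ignores the NULLs that the CASE produces for non-matching rows, so the count is 0 exactly when no row in the group has the excluded date.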

Find items with maximum matching attributes

Here is my table structure - table name "propAssign"
(indexed) (composite index for attributeName and attributeValue)
productId attributeName attributeValue
1 Height 3
1 Weight 1
1 Class X1
1 Category C1
2 Height 2
2 Weight 2
2 Class X2
2 Category C1
3 Height 3
3 Weight 1
3 Class X1
3 Category C1
4 Height 4
4 Weight 5
4 Class X2
4 Category C3
What I want to do is get a list of productIds, sorted by the number of matching attribute/value pairs. In the real table I use numeric IDs for attribute names and values; I've used text here for easier representation.
So if I want to find products matching productId=1, I want it to look for the products with the most matches (like Height=3, Weight=1, Class=X1 and Category=C1). There may not be any with a 100% match (all 4 matching), but if there are, they should come first, followed by any productId with 3 matching attributes, then 2, etc.
I could add more indexes if required, but I'd rather not since there are millions of rows. It's MariaDB v10 to be exact.
Desired result: if I try to find matching products for productId=1, it should return the following, in this order.
productId
-----------
3
2
Reason - 3 has all attributes matching with 1, 2 has some matches and 4 has no match.
You can use conditional aggregation to retrieve the productId's with the highest number of matches first.
select productId,
count(case when attributeName = 'Height' and attributeValue='3' then 1 end)
+ count(case when attributeName = 'Weight' and attributeValue='1' then 1 end)
+ count(case when attributeName = 'Category' and attributeValue='C1' then 1 end) as rank
from mytable
group by productId
order by rank desc
The query above returns all rows even with 0 matches. If you only want to return rows with 1 or more matches, then use the query below, which should be able to take advantage of your composite index:
select productId, count(*) as rank
from mytable
where (attributeName = 'Height' and attributeValue = '3')
or (attributeName = 'Weight' and attributeValue = '1')
or (attributeName = 'Category' and attributeValue = 'C1')
group by productId
order by rank desc
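A runnable sketch of the second query against the sample data from the question, using Python's sqlite3 rather than MariaDB. It differs from the query above in three ways that are my own additions: the Class = 'X1' pair from the question is included, productId 1 itself is excluded, and the alias is match_count with productId as a tie-breaker so the order is deterministic.

```python
import sqlite3

# Sample data from the question: (productId, attributeName, attributeValue)
PROP_ASSIGN = [
    (1, "Height", "3"), (1, "Weight", "1"), (1, "Class", "X1"), (1, "Category", "C1"),
    (2, "Height", "2"), (2, "Weight", "2"), (2, "Class", "X2"), (2, "Category", "C1"),
    (3, "Height", "3"), (3, "Weight", "1"), (3, "Class", "X1"), (3, "Category", "C1"),
    (4, "Height", "4"), (4, "Weight", "5"), (4, "Class", "X2"), (4, "Category", "C3"),
]


def best_matches():
    """Rank other products by how many of product 1's pairs they share."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE propAssign "
        "(productId INTEGER, attributeName TEXT, attributeValue TEXT)"
    )
    conn.executemany("INSERT INTO propAssign VALUES (?, ?, ?)", PROP_ASSIGN)
    rows = conn.execute(
        """
        SELECT productId, COUNT(*) AS match_count
        FROM propAssign
        WHERE productId <> 1
          AND ((attributeName = 'Height' AND attributeValue = '3')
            OR (attributeName = 'Weight' AND attributeValue = '1')
            OR (attributeName = 'Class' AND attributeValue = 'X1')
            OR (attributeName = 'Category' AND attributeValue = 'C1'))
        GROUP BY productId
        ORDER BY match_count DESC, productId
        """
    ).fetchall()
    conn.close()
    return rows


print(best_matches())
```

Product 3 shares all four pairs, product 2 shares only Category=C1, and product 4 shares none, so it is filtered out by the WHERE clause.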

Totalizing and grouping data from an SQL query

I have a query that will display a list of products in order of their import/export property.
select i.`itemid`, p.`prodid`,
(
(case when p.`desc` like "Import/Export"
then 100 else 0 end) +
(case when p.`desc` like "Export"
then 70 else 0 end) +
(case when p.`desc` like "Import"
then 50 else 0 end)
) as priority
from item i , product p
where (
p.`name` LIKE "cleaning agent"
and p.`prodid` = i.`itemid`
)
The query does fine in adding a "priority" value to each product, but what I would like to ask is: how do I group them by ID and total the priority per ID? I can group similar prodid rows with the Group by keyword, but then it just gives me a single value for the priority field.
What I want to achieve is to group all similar product id's and get a total of their priority value. I've used sum() in select statements before, but I'm at a loss at trying to figure out how to get the total of all priority fields because it is a query-generated column.
+--------+----------+
| prodid | priority |
+--------+----------+
| 225 | 50 |
| 225 | 20 |
+--------+----------+
should be
+--------+----------+
| prodid | priority |
+--------+----------+
| 225 | 70 |
+--------+----------+
Here is a sqlfiddle: http://sqlfiddle.com/#!2/cec136/5
You can do this by turning your query into an aggregation using group by:
select p.`prodid`,
sum(case when p.`desc` like 'Import/Export' then 100
when p.`desc` like 'Export' then 70
when p.`desc` like 'Import' then 50
else 20
end) as priority
from item i join
product p
on p.`prodid` = i.`prodid`
where p.`type` LIKE 'cleaning agent'
group by p.prodid;
Along the way, I fixed a few things:
The join is now explicit in the from clause, rather than implicit in the where clause.
Because i.prodid = p.prodid, there is no need to include both in the select.
I changed the case statement to cascade. Only one of the conditions can match, so there is no reason to add things together.
I changed all the string constants to use single quotes rather than double quotes.
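A minimal sketch of this aggregation using Python's sqlite3. The schema and rows are hypothetical, since the original fiddle's data is not reproduced here: product is assumed to hold (prodid, desc, type) and item (itemid, prodid), with product 225 appearing under two descriptions so that its priorities add up to 70 as in the question.

```python
import sqlite3


def total_priority():
    """Sum a CASE-derived priority per prodid for cleaning agents."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema; "desc" is quoted because DESC is an SQL keyword.
    conn.execute('CREATE TABLE product (prodid INTEGER, "desc" TEXT, type TEXT)')
    conn.execute("CREATE TABLE item (itemid INTEGER, prodid INTEGER)")
    conn.executemany(
        "INSERT INTO product VALUES (?, ?, ?)",
        [
            (225, "Import", "cleaning agent"),
            (225, "Domestic", "cleaning agent"),
            (300, "Import/Export", "cleaning agent"),
            (400, "Export", "solvent"),
        ],
    )
    conn.executemany("INSERT INTO item VALUES (?, ?)", [(1, 225), (2, 300), (3, 400)])
    rows = conn.execute(
        """
        SELECT p.prodid,
               SUM(CASE WHEN p."desc" LIKE 'Import/Export' THEN 100
                        WHEN p."desc" LIKE 'Export' THEN 70
                        WHEN p."desc" LIKE 'Import' THEN 50
                        ELSE 20
                   END) AS priority
        FROM item i
        JOIN product p ON p.prodid = i.prodid
        WHERE p.type LIKE 'cleaning agent'
        GROUP BY p.prodid
        ORDER BY p.prodid
        """
    ).fetchall()
    conn.close()
    return rows


print(total_priority())
```

With this made-up data, product 225 gets 50 + 20 = 70 and product 300 gets 100. Product 400 is dropped by the type filter.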

something like "group by" for columns?

I have table like this:
+----+---------+---------+--------+
| id | value_x | created | amount |
+----+---------+---------+--------+
value_x is one of a set of six strings, let's say "one", "two", "three", etc.
I need to create report like this:
+--------------+-------------------------+-------------------+----------------------+
| day_of_month | "one" | "two" | [etc.] |
+--------------+-------------------------+-------------------+----------------------+
| 01-01-2011 | "sum(amount) where value_x = colum name" for this specific day |
+--------------+-------------------------+-------------------+----------------------+
The most obvious solution is:
SELECT SUM(amount), DATE(created) FROM `table_name` WHERE value_x=$some_variable GROUP BY DATE(created)
And to loop this query six times with a different value for $some_variable in every iteration, but I'm curious whether it is possible to do this in a single query?
What you're asking for is called a "pivot table" and is typically achieved as below. The idea is that, for each potential value of value_x, every row contributes either its amount or 0, and summing those per day gives the total for each value.
SELECT
DATE(created),
SUM(CASE WHEN value_x = 'one' THEN amount ELSE 0 END) AS 'one',
SUM(CASE WHEN value_x = 'two' THEN amount ELSE 0 END) AS 'two',
SUM(CASE WHEN value_x = 'three' THEN amount ELSE 0 END) AS 'three',
etc...
FROM table_name
GROUP BY DATE(created)
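A minimal runnable sketch of such a pivot, using Python's sqlite3 in place of MySQL and invented sample rows. Only the three values named in the question are pivoted, because the remaining three are not given. Each SUM adds the row's amount when value_x matches and 0 otherwise, grouped by the calendar day.

```python
import sqlite3


def daily_pivot(rows):
    """Pivot (value_x, created, amount) rows into per-day totals per value."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE table_name "
        "(id INTEGER PRIMARY KEY, value_x TEXT, created TEXT, amount INTEGER)"
    )
    conn.executemany(
        "INSERT INTO table_name (value_x, created, amount) VALUES (?, ?, ?)", rows
    )
    out = conn.execute(
        """
        SELECT DATE(created) AS day_of_month,
               SUM(CASE WHEN value_x = 'one' THEN amount ELSE 0 END) AS one,
               SUM(CASE WHEN value_x = 'two' THEN amount ELSE 0 END) AS two,
               SUM(CASE WHEN value_x = 'three' THEN amount ELSE 0 END) AS three
        FROM table_name
        GROUP BY DATE(created)
        ORDER BY day_of_month
        """
    ).fetchall()
    conn.close()
    return out


# Invented sample data, timestamps in ISO format so DATE() can parse them.
sample_rows = [
    ("one", "2011-01-01 09:00:00", 5),
    ("one", "2011-01-01 17:30:00", 7),
    ("two", "2011-01-01 10:00:00", 3),
    ("three", "2011-01-02 08:00:00", 4),
]
report = daily_pivot(sample_rows)
print(report)
```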
This will come close:
SELECT
s.day_of_month
,GROUP_CONCAT(CONCAT(s.value_x,':',s.amount) ORDER BY s.value_x ASC) as output
FROM (
SELECT DATE(created) as day_of_month
,value_x
,SUM(amount) as amount
FROM table1
GROUP BY day_of_month, value_x
) s
GROUP BY s.day_of_month
You will need to parse the output, reading the value_x before each : to place the items in the proper column.
The benefit of this approach over @Michael's approach is that you do not need to know the possible values of value_x beforehand.
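The parsing step on the client side could look roughly like the sketch below. The function names are hypothetical, and it assumes that value_x never contains a comma and that the whole string fits within MySQL's group_concat_max_len limit. Each "value_x:amount" pair is split at its last colon, and any column missing for a given day is filled with 0.

```python
from decimal import Decimal


def parse_output(output):
    """Turn 'one:12,two:3' into {'one': Decimal('12'), 'two': Decimal('3')}."""
    amounts = {}
    if not output:
        return amounts
    for pair in output.split(","):
        # Split at the last colon so a colon inside value_x would survive.
        value_x, _, amount = pair.rpartition(":")
        amounts[value_x] = Decimal(amount)
    return amounts


def to_row(day, output, columns):
    """Place the parsed amounts under the given column order."""
    amounts = parse_output(output)
    return [day] + [amounts.get(name, Decimal(0)) for name in columns]


columns = ["one", "two", "three"]
row = to_row("2011-01-01", "one:12,two:3", columns)
print(row)
```

The column list can be built from the keys seen across all rows, which keeps the advantage of not hard-coding the possible values in the query.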