MySQL Top items based on count within multiple grouping - mysql

I am trying to write a SQL query that will return the top items for each company and for each location. I have an example MySQL table (table_x) that looks like this:
Date | Company | Location | Item | Price | Quantity | Total_Amount
----------|---------|----------|--------------|-------|----------|-------------
1/10/2000 | ABC | 1 | Food | 2 | 6 | 12
1/11/2000 | ABC | 1 | Food | 1 | 2 | 2
1/12/2000 | ABC | 2 | Food | 10 | 5 | 50
1/13/2000 | ABC | 2 | Electronics | 100 | 2 | 200
1/10/2000 | ABC | 1 | Consumables | 10 | 5 | 50
1/15/2000 | ABC | 2 | Electronics | 100 | 3 | 300
1/10/2000 | DEF | 1 | Electronics | 50 | 5 | 250
1/16/2000 | DEF | 1 | Electronics | 50 | 4 | 200
1/19/2000 | DEF | 2 | Food | 10 | 5 | 50
1/14/2000 | DEF | 2 | Food | 2 | 10 | 20
1/11/2000 | DEF | 2 | Food | 5 | 8 | 40
1/11/2000 | DEF | 2 | Electronics | 500 | 2 | 1000
What I want to return is the top item by count for each company and each location, something like this:
Company | Location | Item | AVG(Price) | SUM(Total_Amount) | COUNT(*)
--------|----------|-------------|------------|-------------------|---------
ABC | 1 | Food | 1.5 | 14 | 2
ABC | 2 | Electronics | 100 | 500 | 2
DEF | 1 | Electronics | 50 | 450 | 2
DEF | 2 | Food | 5.67 | 110 | 3
I know how to do this across all companies and locations, but I have trouble getting the top items by count within each specific grouping. Ideally, I'd like to be able to extend this to the top N items if I have more item types.
This is the SQL query I ran to generate top items based on the occurrences.
SELECT Company, Location, Item, AVG(Price), SUM(Total_Amount), COUNT(*) FROM table_x
GROUP BY Company, Location, Item
ORDER BY Company, Location, COUNT(*) desc

Use MySQL's non-standard grouping feature:
select * from (
SELECT Company, Location, Item, AVG(Price), SUM(Total_Amount), COUNT(*) FROM table_x
GROUP BY Company, Location, Item
ORDER BY Company, Location, COUNT(*) desc
) x
group by 1,2
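With MySQL (only), when you omit non-aggregate columns from the group by list, one row is returned for each combination. In practice this is usually the first row encountered, but MySQL does not guarantee which row is chosen, and the optimizer may ignore the ORDER BY inside the derived table, so treat the result as non-deterministic. Note also that MySQL requires an alias for the derived table (x above).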
Note that since version 5.7.5, you must disable ONLY_FULL_GROUP_BY, which is enabled by default.
@Jeffrey has kindly provided an SQLFiddle.
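If MySQL 8.0 or later is available, a window function avoids relying on which row the non-standard GROUP BY happens to pick, and it extends to the top N items. The following is a minimal sketch under that assumption; the aliases avg_price, total_amount, item_count, rn and ranked are illustrative names that do not appear in the question, and ties on the count are broken by item name, which is an arbitrary choice.

```sql
-- Sketch, assuming MySQL 8.0+ (window functions are not available in 5.x).
SELECT Company, Location, Item, avg_price, total_amount, item_count
FROM (
    SELECT Company,
           Location,
           Item,
           AVG(Price)        AS avg_price,
           SUM(Total_Amount) AS total_amount,
           COUNT(*)          AS item_count,
           -- Rank items inside each (Company, Location) by how often they occur.
           ROW_NUMBER() OVER (
               PARTITION BY Company, Location
               ORDER BY COUNT(*) DESC, Item
           ) AS rn
    FROM table_x
    GROUP BY Company, Location, Item
) ranked
WHERE rn <= 1          -- change 1 to N for the top N items per group
ORDER BY Company, Location, rn;
```

Using RANK() instead of ROW_NUMBER() would keep all items that tie for a position rather than picking one of them.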

Related

Precalculate numbers of records for each possible combination

I have a MySQL database table containing cellphone information like this:
ID Brand Model Price Type Size
==== ===== ===== ===== ====== ====
1 Apple A71 3128 A 40
2 Samsung B7C 3128 B 20
3 Apple ZX5 3128 A 30
4 Huawei Q32 2574 B 40
5 Apple A21 2574 A 25
6 Apple A71 3369 A 30
7 Samsung A71 7413 C 40
Now I want to create another table, that would contain counts for every possible combination of the parameters.
Params Count
============================================== =======
ALL 1000000
Brand(Apple) 20000
Brand(Apple,Samsung) 40000
Brand(Apple),Model(A71) 7100
Brand(Apple),Type(A) 6000
Brand(Apple),Model(A71,B7C),Type(A,B) 7
Model(A71) 12514
Model(A71,B7C) 26584
Model(A71),Type(A) 6521
Model(A71),Type(A,B) 8958
Model(A71),Type(A,B),Size(40) 85
And so on for every possible combination. I was thinking about creating a stored procedure (that I would execute periodically) that would run a query for every existing condition like that, but I am a little stuck on what exactly it should look like. Or is there a better way to do this?
Edit: the reason why I want to store information like this is to be able to show number of results in filter in client application, like in the picture.
I would like to create an index on the Params column so that I can get the Count for a given hash instantly, improving performance.
I also tried querying and caching the values dynamically, but I want to try this approach as well, so I can compare which one is more effective.
This is how I am calculating the counts now:
SELECT COUNT(*) FROM products;
SELECT COUNT(*) FROM products WHERE Brand IN ('Apple');
SELECT COUNT(*) FROM products WHERE Brand IN ('Apple', 'Samsung');
SELECT COUNT(*) FROM products WHERE Brand IN ('Apple') AND Model IN ('A71');
etc.
You can use a ROLLUP for this.
SELECT
model, type, size, COUNT(*)
FROM mytab
GROUP BY 1, 2, 3
WITH ROLLUP
With your sample data, we get the following:
| model | type | size | COUNT(*) |
| ----- | ---- | ---- | -------- |
| A21 | A | 25 | 1 |
| A21 | A | | 1 |
| A21 | | | 1 |
| A71 | A | 30 | 1 |
| A71 | A | 40 | 1 |
| A71 | A | | 2 |
| A71 | C | 40 | 1 |
| A71 | C | | 1 |
| A71 | | | 3 |
| B7C | B | 20 | 1 |
| B7C | B | | 1 |
| B7C | | | 1 |
| Q32 | B | 40 | 1 |
| Q32 | B | | 1 |
| Q32 | | | 1 |
| ZX5 | A | 30 | 1 |
| ZX5 | A | | 1 |
| ZX5 | | | 1 |
| | | | 7 |
The subtotals are present in the rows with null values in different columns, and the total is the last row where all group by columns are null.
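One caveat: WITH ROLLUP only produces the hierarchical subtotals (model; model and type; model, type and size), not every combination such as type on its own, so further queries would be needed for the other combinations. To tell a subtotal NULL apart from a NULL stored in the data, MySQL 8.0.1 and later provide GROUPING(). A minimal sketch under that version assumption, where the 'ALL' label and the cnt alias are illustrative choices:

```sql
-- Sketch, assuming MySQL 8.0.1+ (GROUPING() is not available earlier).
SELECT
    IF(GROUPING(model), 'ALL', model) AS model,   -- 'ALL' marks a rolled-up column
    IF(GROUPING(type),  'ALL', type)  AS type,
    IF(GROUPING(size),  'ALL', size)  AS size,
    COUNT(*) AS cnt
FROM mytab
GROUP BY model, type, size WITH ROLLUP;
```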

MySQL get the row with smallest column value for each product group [duplicate]

I have two tables, products and product_prices, like this:
products:
+-------------+----------+
| products_id | title |
+-------------+----------+
| 1 | phone |
| 2 | computer |
| 3 | keyboard |
+-------------+----------+
product_prices:
+-------------------+-----------+-------+-------------+
| product_prices_id | productid | price | minquantity |
+-------------------+-----------+-------+-------------+
| 1 | 1 | 500 | 1 |
| 2 | 1 | 450 | 2 |
| 3 | 2 | 800 | 1 |
| 4 | 2 | 700 | 2 |
| 5 | 3 | 15 | 1 |
| 6 | 3 | 10 | 3 |
| 7 | 3 | 7 | 10 |
+-------------------+-----------+-------+-------------+
So there's multiple prices depending on quantity.
My SQL query is like this:
SELECT
*
FROM
products product
INNER JOIN
product_prices price
ON price.productid = product.products_id
GROUP BY
product.products_id
ORDER BY
price.price;
I'm getting this error:
Expression #3 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'price.product_prices_id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
The result without GROUP BY is:
+-------------+----------+-------------------+-----------+-------+-------------+
| products_id | title | product_prices_id | productid | price | minquantity |
+-------------+----------+-------------------+-----------+-------+-------------+
| 3 | keyboard | 7 | 3 | 7 | 10 |
| 3 | keyboard | 6 | 3 | 10 | 3 |
| 3 | keyboard | 5 | 3 | 15 | 1 |
| 1 | phone | 2 | 1 | 450 | 2 |
| 1 | phone | 1 | 1 | 500 | 1 |
| 2 | computer | 4 | 2 | 700 | 2 |
| 2 | computer | 3 | 2 | 800 | 1 |
+-------------+----------+-------------------+-----------+-------+-------------+
What I want to do is get the row with the cheapest price for each products_id:
+-------------+----------+-------------------+-----------+-------+-------------+
| products_id | title | product_prices_id | productid | price | minquantity |
+-------------+----------+-------------------+-----------+-------+-------------+
| 3 | keyboard | 7 | 3 | 7 | 10 |
| 1 | phone | 2 | 1 | 450 | 2 |
| 2 | computer | 4 | 2 | 700 | 2 |
+-------------+----------+-------------------+-----------+-------+-------------+
I think I need to use MIN() but I have tried several things, which did not work. The closest I could do was ordering it by price, limiting to 1, but it was returning 1 product only.
Any ideas?
If it helps, here's the dump for example database I used: https://transfer.sh/dTvY4/test.sql
You first need to find the minimum price for each product. For that you use the MIN aggregate function. Because you are selecting an ordinary column alongside an aggregate function, you need to list that column in the GROUP BY clause.
Once you know the minimum prices for each product, you just select those rows from the join of the two tables:
select
p.products_id,
p.title,
pr.product_prices_id,
pr.productid,
pr.price,
pr.minquantity
from product_prices pr
join products p on p.products_id=pr.productid
join (
select productid, min(price) as minprice
from product_prices
group by productid
) mpr on mpr.productid=pr.productid and mpr.minprice=pr.price
See SQLFiddle.
In your query you use a GROUP BY clause while selecting columns that are neither aggregated nor listed in it, hence the error. Also, you are missing the MIN logic.
Instead of linking a file in the question, you'd better create a SQLFiddle / db-fiddle for it. That makes the question far easier to answer.
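Note that the join on the minimum price returns every row that ties for the cheapest price of a product. If exactly one row per product is wanted and MySQL 8.0 or later is available, a ROW_NUMBER() variant is one option. This is a sketch under that assumption; the rn and ranked names are illustrative, and breaking ties by product_prices_id is an arbitrary choice.

```sql
-- Sketch, assuming MySQL 8.0+; returns exactly one cheapest row per product.
SELECT products_id, title, product_prices_id, productid, price, minquantity
FROM (
    SELECT p.products_id,
           p.title,
           pr.product_prices_id,
           pr.productid,
           pr.price,
           pr.minquantity,
           -- Number each product's prices from cheapest to most expensive.
           ROW_NUMBER() OVER (
               PARTITION BY pr.productid
               ORDER BY pr.price, pr.product_prices_id
           ) AS rn
    FROM product_prices pr
    JOIN products p ON p.products_id = pr.productid
) ranked
WHERE rn = 1
ORDER BY price;
```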

Mysql and Triggers usage

I wonder what the best practice is when you have a table of orders like the one below. I need to calculate the total price of each order. Should I use triggers to calculate the price, or should I hardcode the calculation before inserting into the database?
ORDERS:
id | Article | Price | Quantity | Total price
---------------------------------------------
1 | TV | 5 | 1 | 5
2 | CD | 3 | 2 | 6
3 | Book | 2 | 3 | 6
4 | XBOX | 1 | 1 | 1
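For reference, the trigger option could look roughly like the sketch below. The column names and types (in particular total_price for the "Total price" column) and the trigger names are assumptions made for illustration, not taken from the question.

```sql
-- Sketch only: hypothetical column names and types for the ORDERS table.
CREATE TABLE orders (
    id          INT AUTO_INCREMENT PRIMARY KEY,
    article     VARCHAR(50)   NOT NULL,
    price       DECIMAL(10,2) NOT NULL,
    quantity    INT           NOT NULL,
    total_price DECIMAL(12,2) NOT NULL DEFAULT 0
);

-- Keep total_price in sync whenever a row is inserted or updated.
CREATE TRIGGER orders_before_insert
BEFORE INSERT ON orders
FOR EACH ROW
SET NEW.total_price = NEW.price * NEW.quantity;

CREATE TRIGGER orders_before_update
BEFORE UPDATE ON orders
FOR EACH ROW
SET NEW.total_price = NEW.price * NEW.quantity;

-- The application then inserts only the base values.
INSERT INTO orders (article, price, quantity) VALUES ('TV', 5, 1), ('CD', 3, 2);
```

With this in place the application never writes total_price itself, so the stored value cannot drift from price * quantity.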

Getting individual values using GROUP BY - MySQL

I have a table called lottery_winners with the following useful columns:
+----+------+------+--------+---------+
| id | plid | zbid | amount | numbers |
+----+------+------+--------+---------+
id is the unique primary key of the table. plid refers to the past_lotteries table (i.e. I can get all the winners of a specific past lottery this way). zbid is the id of the member/user (the winner). amount is the sum of money they won in the lottery, and finally numbers is a VARCHAR CSV field with their lottery numbers.
Here's an example of what rows could be in the table:
+----+------+------+--------+---------+
| id | plid | zbid | amount | numbers |
+----+------+------+--------+---------+
| 1 | 1 | 2 | 1 | 1,2,3 |
+----+------+------+--------+---------+
| 2 | 1 | 4 | 5 | 4,5,6 |
+----+------+------+--------+---------+
| 3 | 1 | 3 | 7 | 3,4,5 |
+----+------+------+--------+---------+
| 4 | 1 | 2 | 3 | 7,8,9 |
+----+------+------+--------+---------+
| 5 | 2 | 2 | 8 | 8,9,10 |
+----+------+------+--------+---------+
Now, I want to run a SELECT statement which will bring back all the rows, but in a very specific order. The rows should be grouped by zbid, like this (in this case I have added a WHERE plid=1 clause):
+----+------+------+--------+---------+
| id | plid | zbid | amount | numbers |
+----+------+------+--------+---------+
| 1 | 1 | 2 | 1 | 1,2,3 |
+----+------+------+--------+---------+
| 4 | 1 | 2 | 3 | 7,8,9 |
+----+------+------+--------+---------+
| 2 | 1 | 4 | 5 | 4,5,6 |
+----+------+------+--------+---------+
| 3 | 1 | 3 | 7 | 3,4,5 |
+----+------+------+--------+---------+
Next criteria is that not only should they be grouped by zbid, but within this grouping they should be ordered by amount DESC. This is what it would now look like:
+----+------+------+--------+---------+
| id | plid | zbid | amount | numbers |
+----+------+------+--------+---------+
| 4 | 1 | 2 | 3 | 7,8,9 |
+----+------+------+--------+---------+
| 1 | 1 | 2 | 1 | 1,2,3 |
+----+------+------+--------+---------+
| 2 | 1 | 4 | 5 | 4,5,6 |
+----+------+------+--------+---------+
| 3 | 1 | 3 | 7 | 3,4,5 |
+----+------+------+--------+---------+
The top two rows have swapped around.
One more criterion. As you can see, although the rows are grouped by zbid, the groups themselves are in no specific order. I want to group by zbid, but the order of the groups should be based on sum(amount) for each group.
The following table shows the totals for each zbid in no specific order (taking into account that plid=1):
+------+-------------+
| zbid | sum(amount) |
+------+-------------+
| 2 | 4 |
+------+-------------+
| 3 | 7 |
+------+-------------+
| 4 | 5 |
+------+-------------+
So using this information the final result using the SELECT statement should be the following (with an added sum(amount) column):
+----+------+------+--------+---------+-------------+
| id | plid | zbid | amount | numbers | sum(amount) |
+----+------+------+--------+---------+-------------+
| 3 | 1 | 3 | 7 | 3,4,5 | 7 |
+----+------+------+--------+---------+-------------+
| 2 | 1 | 4 | 5 | 4,5,6 | 5 |
+----+------+------+--------+---------+-------------+
| 4 | 1 | 2 | 3 | 7,8,9 | 4 |
+----+------+------+--------+---------+-------------+
| 1 | 1 | 2 | 1 | 1,2,3 | 4 |
+----+------+------+--------+---------+-------------+
That's it! Now I've tried a couple of things myself, but I'm not exactly sure how to get the full final result. I have tried:
SELECT id,plid,zbid,amount,numbers,sum(amount) FROM lottery_winners GROUP BY zbid ORDER BY sum(amount) DESC
Now that seemed to meet the final criterion, but it didn't give me individual results for the table.
Please also note that as these results will be paginated, I will need to be adding LIMIT $start,$perpage to the end of the query.
Give this a try:
SELECT a.id,
a.plid,
a.zbid,
a.amount,
a.numbers,
c.totalAmount
FROM lottery_winners a
INNER JOIN (
SELECT b.zbid,
SUM(b.amount) totalAmount
FROM lottery_winners b
WHERE b.plid = 1
GROUP BY b.zbid
) c
ON a.zbid = c.zbid
WHERE a.plid = 1
ORDER BY c.totalAmount DESC, a.zbid, a.amount DESC
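On MySQL 8.0 or later the per-member total can also be computed with a window function instead of a self-join. The sketch below assumes that version; the LIMIT values 0 and 20 stand in for the $start and $perpage placeholders from the question. Sorting by zbid and then amount after the total keeps each member's rows together and makes the order stable across pages.

```sql
-- Sketch, assuming MySQL 8.0+; 0 and 20 are placeholder pagination values.
SELECT id,
       plid,
       zbid,
       amount,
       numbers,
       -- Total won by this member in the selected lottery, repeated on each row.
       SUM(amount) OVER (PARTITION BY zbid) AS totalAmount
FROM lottery_winners
WHERE plid = 1
ORDER BY totalAmount DESC, zbid, amount DESC
LIMIT 0, 20;
```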

How can I count the number of entities that don't have certain attributes grouped by a common attribute in an EAV(ish) schema?

I'm trying to get a count by region of the number of files that DON'T contain "important" attributes given the following dataset:
files
------------------------------------
id | file | region
------------------------------------
1 | data.xml | eastern
2 | 2011-01-01-report.xml | eastern
3 | regional report.xml | western
4 | data.xml | central
5 | 2010 summary.xml | eastern
file_attributes
--------------------------------------------
file_id | attribute | value | importance
--------------------------------------------
1 | Patients | 18 | 0
1 | Deaths | 17 | 1
2 | Clients | 5 | 0
3 | Refunds | 12 | 1
5 | Deaths | 4 | 1
I can get a count of the number of files that have important attributes like this:
SELECT
region
, COUNT(DISTINCT f.id) AS file_count
, COUNT(DISTINCT IF(fa.importance = 1, f.id, NULL)) AS files_w_important_attr
, COUNT(DISTINCT IF(fa.importance = 0, f.id, NULL)) AS files_w_unimportant_attr
FROM files AS f
LEFT JOIN file_attributes AS fa
ON f.id = fa.file_id
GROUP BY f.region
This yields the following results:
region | file_count | files_w_important_attr | files_w_unimportant_attr
------------------------------------------------------------------------
central | 1 | 0 | 0
eastern | 3 | 2 | 2
western | 1 | 1 | 0
I'm having trouble figuring out how to get a count of the files without important attributes. Note that I'm not trying to get a count of the files that have unimportant attributes, which is what the last column in the above query yields. What I want is the following results:
region | file_count | files_w_important_attr | files_w_NO_important_attr
------------------------------------------------------------------------
central | 1 | 0 | 1
eastern | 3 | 2 | 1
western | 1 | 1 | 0
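How about this? It counts the rows for which the LEFT JOIN found no match, so for it to count files without important attributes the join has to be restricted to important ones (add AND fa.importance = 1 to the ON clause):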
SUM( fa.importance IS NULL ) AS nulledFiles
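An alternative that leaves the join unrestricted is to subtract the files that have an important attribute from the total per region. This is a sketch that assumes the files and file_attributes tables shown in the question (joined on file_id); the files_w_no_important_attr alias is an illustrative name.

```sql
-- Sketch: files without important attributes = all files - files with one.
SELECT f.region,
       COUNT(DISTINCT f.id) AS file_count,
       COUNT(DISTINCT IF(fa.importance = 1, f.id, NULL)) AS files_w_important_attr,
       COUNT(DISTINCT f.id)
         - COUNT(DISTINCT IF(fa.importance = 1, f.id, NULL)) AS files_w_no_important_attr
FROM files AS f
LEFT JOIN file_attributes AS fa
       ON f.id = fa.file_id
GROUP BY f.region;
```

COUNT(DISTINCT ...) matters here because a file with several attributes appears on several joined rows and would otherwise be counted more than once.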