Precalculate numbers of records for each possible combination - mysql

I have a MySQL database table containing cellphone information like this:
ID Brand Model Price Type Size
==== ===== ===== ===== ====== ====
1 Apple A71 3128 A 40
2 Samsung B7C 3128 B 20
3 Apple ZX5 3128 A 30
4 Huawei Q32 2574 B 40
5 Apple A21 2574 A 25
6 Apple A71 3369 A 30
7 Samsung A71 7413 C 40
Now I want to create another table that contains counts for every possible combination of the parameters:
Params Count
============================================== =======
ALL 1000000
Brand(Apple) 20000
Brand(Apple,Samsung) 40000
Brand(Apple),Model(A71) 7100
Brand(Apple),Type(A) 6000
Brand(Apple),Model(A71,B7C),Type(A,B) 7
Model(A71) 12514
Model(A71,B7C) 26584
Model(A71),Type(A) 6521
Model(A71),Type(A,B) 8958
Model(A71),Type(A,B),Size(40) 85
And so on for every possible combination. I was thinking about creating a stored procedure (that I would execute periodically) to run a query for every existing condition like that, but I am a little stuck on what exactly it should look like. Or is there a better way to do this?
Edit: the reason I want to store information like this is to be able to show the number of results next to each filter option in the client application.
I would like to create an index on the Params column so that I can get the Count for a given hash instantly, improving performance.
I also tried querying and caching the values dynamically, but I want to try this approach as well, so I can compare which one is more effective.
This is how I am calculating the counts now:
SELECT COUNT(*) FROM products;
SELECT COUNT(*) FROM products WHERE Brand IN ('Apple');
SELECT COUNT(*) FROM products WHERE Brand IN ('Apple', 'Samsung');
SELECT COUNT(*) FROM products WHERE Brand IN ('Apple') AND Model IN ('A71');
etc.

You can use a ROLLUP for this.
SELECT
model, type, size, COUNT(*)
FROM mytab
GROUP BY 1, 2, 3
WITH ROLLUP
With your sample data, we get the following:
| model | type | size | COUNT(*) |
| ----- | ---- | ---- | -------- |
| A21 | A | 25 | 1 |
| A21 | A | | 1 |
| A21 | | | 1 |
| A71 | A | 30 | 1 |
| A71 | A | 40 | 1 |
| A71 | A | | 2 |
| A71 | C | 40 | 1 |
| A71 | C | | 1 |
| A71 | | | 3 |
| B7C | B | 20 | 1 |
| B7C | B | | 1 |
| B7C | | | 1 |
| Q32 | B | 40 | 1 |
| Q32 | B | | 1 |
| Q32 | | | 1 |
| ZX5 | A | 30 | 1 |
| ZX5 | A | | 1 |
| ZX5 | | | 1 |
| | | | 7 |
The subtotals are present in the rows with null values in different columns, and the total is the last row where all group by columns are null.
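Note that WITH ROLLUP only produces subtotals for prefixes of the GROUP BY list (model, then model + type, and so on), so a combination such as type + size without model is not in the result. A minimal sketch of one way to cover every combination of columns is to run one GROUP BY per subset of the filter columns. It is written in Python with the built-in sqlite3 module as a stand-in for MySQL, the products table name comes from the question, and the key format only imitates the Params column; treat all of it as an assumption rather than a finished design.

```python
import sqlite3
from itertools import combinations

# Sample rows copied from the question.
rows = [
    (1, "Apple", "A71", 3128, "A", 40),
    (2, "Samsung", "B7C", 3128, "B", 20),
    (3, "Apple", "ZX5", 3128, "A", 30),
    (4, "Huawei", "Q32", 2574, "B", 40),
    (5, "Apple", "A21", 2574, "A", 25),
    (6, "Apple", "A71", 3369, "A", 30),
    (7, "Samsung", "A71", 7413, "C", 40),
]

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute(
    "CREATE TABLE products "
    "(ID INTEGER, Brand TEXT, Model TEXT, Price INTEGER, Type TEXT, Size INTEGER)"
)
conn.executemany("INSERT INTO products VALUES (?, ?, ?, ?, ?, ?)", rows)

# Fixed whitelist of filterable columns, so interpolating them into SQL is safe.
FILTER_COLUMNS = ["Brand", "Model", "Type", "Size"]

counts = {}
for r in range(len(FILTER_COLUMNS) + 1):
    for cols in combinations(FILTER_COLUMNS, r):
        if cols:
            col_list = ", ".join(cols)
            sql = f"SELECT {col_list}, COUNT(*) FROM products GROUP BY {col_list}"
        else:
            sql = "SELECT COUNT(*) FROM products"  # the grand total ("ALL")
        for *values, n in conn.execute(sql):
            # Hypothetical key format modelled on the Params column.
            key = ",".join(f"{c}({v})" for c, v in zip(cols, values)) or "ALL"
            counts[key] = n

print(counts["ALL"], counts["Brand(Apple)"], counts["Model(A71),Type(A)"])
```

Because every row has exactly one value per column, a multi-value filter such as Brand(Apple,Samsung) can be answered by adding up the matching single-value rows (here counts["Brand(Apple)"] + counts["Brand(Samsung)"]) instead of storing it separately.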

Related

SSRS - Matrix with multiple groups

I have a matrix which starts out fine:
| TType | Sept21 | Oct21 |
| ----- | ------ | ----- |
| DT | 50 | 29 |
| VT | 20 | 30 |
| AT | 10 | 11 |
| Total | 80 | 70 |
DT/VT/AT is a row group, the month columns are a column group, and the values are SUM(Volume). The total is an ungrouped row that just sums it all. This all works OK.
However, what I want to add is extra rows to the same matrix with a percentage of what each TType is of the total. This would give me something that looks like this
| TType | Sept21 | Oct21 |
| ----- | ------ | ----- |
| DT | 50 | 29 |
| VT | 20 | 30 |
| AT | 10 | 11 |
| Total | 80 | 70 |
| DT | 62.5% | 41.4% |
| VT | 25% | 42.8% |
| AT | 12.5% | 15.7% |
I have tried the expression sum(volume) / sum(volume,"groupname"), where groupname is my dataset, but this does not take the column group into account and split the result out by month, nor does it respect the filter that excludes a TType I don't need. I have also tried adding a row group, which splits by TType but still ignores the month split.
Does anyone know how I can get this to work as expected?

Filter every column in MySQL

I have a database with three tables right now: equipements, equipements_statistics (which contains the statistics of each equipement), and stats (which contains every type of statistic).
To retrieve equipements matching a filter, I run this query:
SELECT
*
FROM
`equipement`
INNER JOIN `equipement_stats` ON `equipement_stats`.`id_equipement` = `equipement`.`id_equipement`
INNER JOIN `stats` ON `stats`.`id_stats` = `equipement_stats`.`id_stats`
WHERE
`stats`.`id_stats` IN(1068, 1069)
GROUP BY
`equipement`.`id_equipement`
HAVING
COUNT(DISTINCT stats.id_stats) = 1
LIMIT 10
The tables look like this:
equipement
+---------------+-----------------+
| id_equipement | name_equipement |
+---------------+-----------------+
| 1 | one |
| 2 | two |
| 3 | three |
+---------------+-----------------+
equipement_stats
+---------------+-----------+---------------+
| id_equipement | id_stats | random_number |
+---------------+-----------+---------------+
| 1 | 2 | 0 |
| 1 | 4 | 0 |
| 1 | 1069 | 1 |
| 1 | 8 | 0 |
| _____________ | _________ | _____________ |
| 2 | 1070 | 2 |
| 2 | 1069 | 3 |
| 2 | 20 | 0 |
| 2 | 40 | 0 |
+---------------+-----------+---------------+
If the stats are 1068 or 1069, I must also filter them on the random_number column, but the required random_number value can be different for each id_stats (for example, one value for 1070 and another for 1069). How can I match only a precise id_stats with a precise random_number?
In my case, for example, I would like to find the equipements that have stats 1070 with random_number 2 and stats 1069 with random_number 3, like the second equipement in the table above.
Thank you for helping!
The easiest way to filter tuples is this:
WHERE (equipement_stats.id_stats, equipement_stats.random_number) IN ( (1068,2) , (1069,3) )
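On its own, the tuple filter keeps every row that matches either pair. If an equipement has to match all of the pairs, one option is to combine it with GROUP BY and HAVING so that the number of distinct matched id_stats equals the number of pairs. The sketch below is an assumption-laden illustration, not a drop-in query: it runs on Python's built-in sqlite3 as a stand-in for MySQL, uses the pairs from the question (1070 with 2, 1069 with 3), and writes the list as IN (VALUES ...) because SQLite requires a subquery on the right-hand side of a row-value IN; in MySQL the plain IN ((1070,2),(1069,3)) form from the answer can be kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.execute(
    "CREATE TABLE equipement_stats "
    "(id_equipement INTEGER, id_stats INTEGER, random_number INTEGER)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO equipement_stats VALUES (?, ?, ?)",
    [
        (1, 2, 0), (1, 4, 0), (1, 1069, 1), (1, 8, 0),
        (2, 1070, 2), (2, 1069, 3), (2, 20, 0), (2, 40, 0),
    ],
)

# Keep only rows matching one of the wanted (id_stats, random_number) pairs,
# then require that both pairs were found for the same equipement.
matching_ids = [
    row[0]
    for row in conn.execute(
        """
        SELECT id_equipement
        FROM equipement_stats
        WHERE (id_stats, random_number) IN (VALUES (1070, 2), (1069, 3))
        GROUP BY id_equipement
        HAVING COUNT(DISTINCT id_stats) = 2
        """
    )
]
print(matching_ids)
```

With the sample data only equipement 2 qualifies, since equipement 1 has stats 1069 with random_number 1 rather than 3.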

Update records based on date on MySQL using joins

I have the following tables.
car_model_tbl
-----------------------------
id | car_model_name|status |
-----------------------------
1 | seria_1 | 1 |
-----------------------------
2 | golf_4 | 1 |
-----------------------------
3 | C_Class | 1 |
-----------------------------
4 | golf_5 | 1 |
-----------------------------
5 | seria_2 | 0 |
-----------------------------
car_manufacturer_tbl
-------------------------
id |car_manufactu_name |
-------------------------
1 | bmw |
-------------------------
2 | volkswagen |
-------------------------
3 | mercedes |
-------------------------
car_service_tbl
---------------------------------
id | model_id| service_date |
---------------------------------
1 | 1 | 2018-03-10 |
---------------------------------
2 | 2 | 2018-02-10 |
---------------------------------
3 | 1 | 2018-01-10 |
---------------------------------
4 | 1 | 2017-12-10 |
---------------------------------
5 | 2 | 2017-12-10 |
---------------------------------
6 | 3 | 2018-02-10 |
---------------------------------
7 | 2 | 2018-01-10 |
---------------------------------
9 | 4 | 2018-03-10 |
---------------------------------
10 | 4 | 2018-02-10 |
---------------------------------
11 | 5 | 2018-02-10 |
---------------------------------
car_model_manufacturer_relation
-------------------------------------------------
id | model_id | manufactu_id| service_status |
-------------------------------------------------
1 | 1 | 1 | 1 |
-------------------------------------------------
2 | 5 | 1 | 1 |
-------------------------------------------------
3 | 2 | 2 | 1 |
-------------------------------------------------
4 | 4 | 1 | 1 |
-------------------------------------------------
5 | 2 | 2 | 1 |
-------------------------------------------------
6 | 3 | 3 | 1 |
-------------------------------------------------
I need to set car_model_manufacturer_relation.service_status = '0' where car_service_tbl.service_date < "2018-03-01".
In this case, car_model_manufacturer_relation.service_status for models 2, 3 and 5 should be set to '0', because every car_service_tbl.service_date for these models is earlier than "2018-03-01".
However, for models 1 and 4, car_model_manufacturer_relation.service_status should stay '1': even though they have records earlier than "2018-03-01", they also have later dates, e.g. "2018-03-10".
I have been trying to write a query for this, but so far without success.
You'll need to nest a grouped query, to get the MAX date per model, and update from that.
update car_model_manufacturer_relation as cmmr,
(select model_id, max(service_date) as check_date
from car_service_tbl
group by model_id) as cst
set cmmr.service_status = '0'
where cmmr.model_id = cst.model_id
and cst.check_date < "2018-03-01"
Where you're using more than one table and the table names include underscores, I try and alias the tables to make the code a little shorter and easier on the eye, hence the use of cmmr and cst as table aliases.
The MAX date has also been renamed for clarity as check_date. You can of course name this anything you wish.
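To check the idea against the sample data, here is a small sketch using Python's built-in sqlite3 as a stand-in for MySQL. SQLite does not accept the multi-table UPDATE syntax above, so the same "latest service date per model" condition is expressed with IN plus a grouped subquery, a form MySQL should also accept since the subquery reads a different table. Storing the dates as ISO text is an assumption made for SQLite; in MySQL the column would normally be a DATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript(
    """
    CREATE TABLE car_service_tbl (id INTEGER, model_id INTEGER, service_date TEXT);
    CREATE TABLE car_model_manufacturer_relation (
        id INTEGER, model_id INTEGER, manufactu_id INTEGER, service_status INTEGER);
    INSERT INTO car_service_tbl VALUES
        (1, 1, '2018-03-10'), (2, 2, '2018-02-10'), (3, 1, '2018-01-10'),
        (4, 1, '2017-12-10'), (5, 2, '2017-12-10'), (6, 3, '2018-02-10'),
        (7, 2, '2018-01-10'), (9, 4, '2018-03-10'), (10, 4, '2018-02-10'),
        (11, 5, '2018-02-10');
    INSERT INTO car_model_manufacturer_relation VALUES
        (1, 1, 1, 1), (2, 5, 1, 1), (3, 2, 2, 1),
        (4, 4, 1, 1), (5, 2, 2, 1), (6, 3, 3, 1);
    """
)

# A model is reset only when its most recent service is before the cutoff.
conn.execute(
    """
    UPDATE car_model_manufacturer_relation
    SET service_status = 0
    WHERE model_id IN (
        SELECT model_id
        FROM car_service_tbl
        GROUP BY model_id
        HAVING MAX(service_date) < '2018-03-01'
    )
    """
)

status_by_model = dict(
    conn.execute(
        "SELECT DISTINCT model_id, service_status "
        "FROM car_model_manufacturer_relation"
    )
)
print(status_by_model)
```

Models 2, 3 and 5 end up with service_status 0, while models 1 and 4 keep 1 because each has a service on 2018-03-10.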
With a subquery:
UPDATE car_model_manufacturer_relation c
LEFT JOIN (SELECT model_id, MAX(service_date) AS service_date FROM car_service_tbl GROUP BY model_id) AS s ON s.model_id = c.model_id
SET c.service_status = 0
WHERE s.service_date < "2018-03-01"
@tyro - be careful with your solution, as a LEFT JOIN would update the service status to 0 when there is no service date for the model in car_service_tbl. I feel you would need a full join, rather than the LEFT JOIN you suggested, to update the records correctly.

MySQL Top items based on count within multiple grouping

I am trying to write a SQL query that will return the top items for each company and for each location. I have an example MySQL table (table_x) that looks like this:
Date | Company | Location | Item | Price | Quantity | Total_Amount
----------|---------|----------|--------------|-------|----------|-------------
1/10/2000 | ABC | 1 | Food | 2 | 6 | 12
1/11/2000 | ABC | 1 | Food | 1 | 2 | 2
1/12/2000 | ABC | 2 | Food | 10 | 5 | 50
1/13/2000 | ABC | 2 | Electronics | 100 | 2 | 200
1/10/2000 | ABC | 1 | Consumables | 10 | 5 | 50
1/15/2000 | ABC | 2 | Electronics | 100 | 3 | 300
1/10/2000 | DEF | 1 | Electronics | 50 | 5 | 250
1/16/2000 | DEF | 1 | Electronics | 50 | 4 | 200
1/19/2000 | DEF | 2 | Food | 10 | 5 | 50
1/14/2000 | DEF | 2 | Food | 2 | 10 | 20
1/11/2000 | DEF | 2 | Food | 5 | 8 | 40
1/11/2000 | DEF | 2 | Electronics | 500 | 2 | 1000
What I want to return is the top item by count for each company and each location, so something like this:
Company | Location | Item | AVG(Price) | SUM(Total_Amount) | COUNT(*)
--------|----------|-------------|------------|-------------------|---------
ABC | 1 | Food | 4 | 14 | 2
ABC | 2 | Electronics | 100 | 500 | 2
DEF | 1 | Electronics | 50 | 450 | 2
DEF | 2 | Food | 5.67 | 110 | 3
I know how to do this across all company and locations, but have trouble getting the top items by count to be within each specific grouping. Ideally, I'd want to be able to extend this to top N items if I have more item types if possible.
This is the SQL query I ran to generate top items based on the occurrences.
SELECT Company, Location, Item, AVG(Price), SUM(Total_Amount), COUNT(*) FROM table_x
GROUP BY Company, Location, Item
ORDER BY Company, Location, COUNT(*) desc
Use MySQL's non-standard grouping feature:
select * from (
SELECT Company, Location, Item, AVG(Price), SUM(Total_Amount), COUNT(*) FROM table_x
GROUP BY Company, Location, Item
ORDER BY Company, Location, COUNT(*) desc
) as t
group by 1,2
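With MySQL (only), when you omit non-aggregate columns from the GROUP BY list, one row is returned for each combination. In practice this is usually the first row encountered, although the MySQL documentation describes the chosen values as nondeterministic, so this is not guaranteed.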
Note that since version 5.7.5, you must disable ONLY_FULL_GROUP_BY, which is enabled by default.
@Jeffrey has kindly provided an SQLFiddle.
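If relying on that non-standard behaviour is a concern, or if you need the top N items rather than just the top one, a window function is another option. The sketch below is only an illustration under some assumptions: it runs on Python's built-in sqlite3 (3.25 or newer) as a stand-in for MySQL 8.0+, where ROW_NUMBER() is also available, and the helper name top_items, the column aliases, and the alphabetical tie-break on Item are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL 8.0+ here
conn.execute(
    "CREATE TABLE table_x (Date TEXT, Company TEXT, Location INTEGER, "
    "Item TEXT, Price REAL, Quantity INTEGER, Total_Amount REAL)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO table_x VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("1/10/2000", "ABC", 1, "Food", 2, 6, 12),
        ("1/11/2000", "ABC", 1, "Food", 1, 2, 2),
        ("1/12/2000", "ABC", 2, "Food", 10, 5, 50),
        ("1/13/2000", "ABC", 2, "Electronics", 100, 2, 200),
        ("1/10/2000", "ABC", 1, "Consumables", 10, 5, 50),
        ("1/15/2000", "ABC", 2, "Electronics", 100, 3, 300),
        ("1/10/2000", "DEF", 1, "Electronics", 50, 5, 250),
        ("1/16/2000", "DEF", 1, "Electronics", 50, 4, 200),
        ("1/19/2000", "DEF", 2, "Food", 10, 5, 50),
        ("1/14/2000", "DEF", 2, "Food", 2, 10, 20),
        ("1/11/2000", "DEF", 2, "Food", 5, 8, 40),
        ("1/11/2000", "DEF", 2, "Electronics", 500, 2, 1000),
    ],
)

TOP_N_SQL = """
WITH grouped AS (
    SELECT Company, Location, Item,
           AVG(Price) AS avg_price,
           SUM(Total_Amount) AS total_amount,
           COUNT(*) AS item_count
    FROM table_x
    GROUP BY Company, Location, Item
),
ranked AS (
    SELECT grouped.*,
           ROW_NUMBER() OVER (
               PARTITION BY Company, Location
               ORDER BY item_count DESC, Item  -- Item breaks ties (assumption)
           ) AS rn
    FROM grouped
)
SELECT Company, Location, Item, avg_price, total_amount, item_count
FROM ranked
WHERE rn <= ?
ORDER BY Company, Location, rn
"""


def top_items(connection, n):
    """Return the n most frequent items per (Company, Location)."""
    return connection.execute(TOP_N_SQL, (n,)).fetchall()


for row in top_items(conn, 1):
    print(row)
```

Calling top_items(conn, 2) instead returns up to two items per company and location, which covers the top N case mentioned in the question.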

What's the best way to store different prices for one entity in SQL?

I am building a webpage that shows a bunch of video games stored in a SQL database, and one suggestion I had was to display the price for each region based on a drop-down. My question is: what's the best way to store this in the database? Would it be like:
GAME
CountryID
price1
CountryID
price2
CountryID
price3 ...
Or is there a better way to do this?
Just a heads up I've only been developing web applications for a year or so and I'm still pretty new to SQL.
Thanks for your input!
I would use multiple tables, one for games and one for region pricing.
Games
+--------+----------+
| GameID | GameName |
+--------+----------+
| 1 | Game1 |
| 2 | Game2 |
| 3 | Game3 |
| 4 | Game4 |
+--------+----------+
RegionPricing
+----------+--------+-------+
| RegionID | GameID | Price |
+----------+--------+-------+
| 1 | 1 | 60 |
| 1 | 2 | 55 |
| 1 | 3 | 45 |
| 1 | 4 | 80 |
| 2 | 1 | 50 |
| 3 | 2 | 30 |
| 3 | 3 | 25 |
| 3 | 4 | 45 |
| 4 | 1 | 60 |
| 4 | 2 | 55 |
| 4 | 3 | 45 |
| 4 | 4 | 80 |
+----------+--------+-------+
By using separate tables you minimize duplicate data and allow for easy granular changes. You may also consider adding a column to RegionPricing for currency. This would also need a Region table, with RegionID and RegionName.
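As a rough sketch of how that could look in practice, the example below builds the three tables and looks up the prices for one region. It runs on Python's built-in sqlite3 purely so it is self-contained; the Currency column, the region names, the currency codes, and the prices_for_region helper are illustrative assumptions, not part of the schema above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite used only to keep the sketch runnable
conn.executescript(
    """
    CREATE TABLE Games (
        GameID INTEGER PRIMARY KEY,
        GameName TEXT NOT NULL
    );
    CREATE TABLE Region (
        RegionID INTEGER PRIMARY KEY,
        RegionName TEXT NOT NULL
    );
    -- One row per (region, game); the composite key prevents duplicate prices.
    CREATE TABLE RegionPricing (
        RegionID INTEGER NOT NULL REFERENCES Region(RegionID),
        GameID INTEGER NOT NULL REFERENCES Games(GameID),
        Price NUMERIC NOT NULL,
        Currency TEXT NOT NULL,
        PRIMARY KEY (RegionID, GameID)
    );
    INSERT INTO Games VALUES (1, 'Game1'), (2, 'Game2');
    -- Region names and currency codes below are made up for the example.
    INSERT INTO Region VALUES (1, 'Region1'), (2, 'Region2');
    INSERT INTO RegionPricing VALUES
        (1, 1, 60, 'USD'), (1, 2, 55, 'USD'), (2, 1, 50, 'EUR');
    """
)


def prices_for_region(connection, region_id):
    """Return (GameName, Price, Currency) for every game priced in a region."""
    return connection.execute(
        """
        SELECT g.GameName, p.Price, p.Currency
        FROM RegionPricing AS p
        JOIN Games AS g ON g.GameID = p.GameID
        WHERE p.RegionID = ?
        ORDER BY g.GameName
        """,
        (region_id,),
    ).fetchall()


print(prices_for_region(conn, 1))
print(prices_for_region(conn, 2))
```

The drop-down on the page would then only need to pass the selected RegionID to a query like this one, and adding a new region means inserting rows rather than adding columns.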