SQL Select count of categories across multiple columns - mysql

I have a data structure in the table of these columns
ID | Title | Category_level_1 | Category_level_2 | Category_level_3
1 | offer 1 | Browns | Greens | White
2 | offer 1 | Browns | White |
3 | offer 2 | Greens | Yellow |
4 | offer 3 | Browns | Greens |
5 | offer 4 | Browns | Yellow | White
Without the ability to change the table structure, I need to "count the number of Offers per Category across the 3 columns".
There are also columns for the date range of the offer, to limit the results to the current ones, but I want to work out the query first.
I need to get a list of all the Categories and then put offers against them.
Offer can be in the table more than once.
As far as I have got is to build a temp table first with a UNION.
CREATE TEMPORARY TABLE IF NOT EXISTS Cats AS
( SELECT DISTINCT(opt) FROM (
SELECT Category_level_1 AS opt FROM a_table
UNION
SELECT Category_level_2 AS opt FROM a_table
UNION
SELECT Category_level_3 AS opt FROM a_table
) AS Temp
) ;
SELECT
Cats.opt AS "Joint Cat",
(
SELECT count(*)
FROM a_table
WHERE a_table.`Category_level_1` = Cats.opt
OR a_table.`Category_level_2` = Cats.opt
OR a_table.`Category_level_3` = Cats.opt
GROUP BY a_table.Title
) As Total
FROM Cats
WHERE Category_level_1 != ''
ORDER BY Category_level_1 ASC;
ISSUE:
a) so the union works well and I get my values. DONE
b) the Total subselect though is not grouping correctly.
I just want a count of all the rows returned but it is grouping with a count of the row titles not all rows.
So I am trying to work out how this should work; the SQL could be totally different. The expected answer is:
Joint Category | Total Count of offers
Browns | 3
White | 3
Greens | 2
Yellow | 2

plan
take a union of all distinct categories, alias to Joint Category
aggregate count over Joint Category ( where not null or blank - not clear from your rendering if those fields are null or blank.. )
grouping by Joint Category
query
select `Joint Category`, count(*) as `Total Count of offers`
from
(
select Title, Category_level_1 as `Joint Category`
from a_table
union
select Title, Category_level_2
from a_table
union
select Title, Category_level_3
from a_table
) allcats
where `Joint Category` is not null
and `Joint Category` <> ''
group by `Joint Category`
;
output
+----------------+-----------------------+
| Joint Category | Total Count of offers |
+----------------+-----------------------+
| Browns | 3 |
| Greens | 3 |
| White | 2 |
| Yellow | 2 |
+----------------+-----------------------+
sqlfiddle

Your results are a bit confusing . . . I cannot tell why browns and whites both have a count of 3. I think you are counting the combination of level and category.
I would be inclined to approach this using union all and then use count() or count(distinct), depending on what the counting logic really is. For the combination of level and category:
SELECT cat, COUNT(DISTINCT level, title) as numtitles
FROM ((SELECT title, 1 as level, Category_level_1 as cat FROM a_table) union all
(SELECT title, 2 as level, Category_level_2 as cat FROM a_table) union all
(SELECT title, 3 as level, Category_level_3 as cat FROM a_table)
) tc
WHERE cat is not null
GROUP BY cat;
You can include the date column in each of the subqueries and then include a condition in the WHERE clause.
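As a rough check of the UNION ALL approach, the sketch below loads the sample rows into an in-memory SQLite database (an assumption made only to keep the example self-contained; the question itself is about MySQL) and computes two candidate counts per category: the number of rows and the number of distinct titles. Blank cells are stored as '' or NULL here, since the question does not say which one the real table uses.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL table (assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE a_table (
        ID INTEGER PRIMARY KEY,
        Title TEXT,
        Category_level_1 TEXT,
        Category_level_2 TEXT,
        Category_level_3 TEXT
    )
""")
conn.executemany(
    "INSERT INTO a_table VALUES (?, ?, ?, ?, ?)",
    [
        (1, "offer 1", "Browns", "Greens", "White"),
        (2, "offer 1", "Browns", "White", ""),
        (3, "offer 2", "Greens", "Yellow", ""),
        (4, "offer 3", "Browns", "Greens", None),
        (5, "offer 4", "Browns", "Yellow", "White"),
    ],
)

# Unpivot the three category columns with UNION ALL (keeps duplicates),
# drop NULL/blank categories, then aggregate per category.
query = """
    SELECT cat,
           COUNT(*)              AS row_count,
           COUNT(DISTINCT Title) AS title_count
    FROM (
        SELECT Title, Category_level_1 AS cat FROM a_table
        UNION ALL
        SELECT Title, Category_level_2 FROM a_table
        UNION ALL
        SELECT Title, Category_level_3 FROM a_table
    ) AS allcats
    WHERE cat IS NOT NULL AND cat <> ''
    GROUP BY cat
    ORDER BY cat
"""
category_counts = conn.execute(query).fetchall()
for cat, row_count, title_count in category_counts:
    print(cat, row_count, title_count)
```

Neither column reproduces the totals listed in the question exactly, which suggests the intended counting rule should be pinned down before choosing between COUNT(*) and COUNT(DISTINCT ...).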

Related

How to find the count of a particular number that can exist in 2 columns

I have a table that consist of a table that describes calls. Hence there is a to column and a from column. The problem is that I want the total messages sent by each number, which can be from or to. Refer to the table above for visuals.
I want the final table to be something that shows A: 3, B: 2, C: 1 and D: 1.
How do you count the numbers in 2 columns and sum them up?
One solution would be to first UNION ALL two aggregate queries to gather the count of occurrences of each value in the two different columns, and then sum the results in an outer query, like:
SELECT val, SUM(cnt) cnt
FROM (
SELECT `from` val, COUNT(*) cnt FROM mytable GROUP BY `from`
UNION ALL
SELECT `to`, COUNT(*) FROM mytable GROUP BY `to`
) x
GROUP BY val
This demo on DB Fiddle with your sample data returns:
| val | cnt |
| --- | --- |
| A | 3 |
| B | 2 |
| C | 1 |
| D | 1 |
| E | 1 |
You can unpivot the data and aggregate:
select person, count(*) as num_calls
from ((select from as person from t) union all
(select to as person from t)
) c
group by person;
Note that from and to are really, really bad names for columns because they are SQL keywords. I haven't escaped them in the query, because that just clutters the query and I assume the real columns have better names.
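The unpivot-and-aggregate idea can be tried out with a small sketch. The original table was not included in the text, so the four call rows below are hypothetical, chosen only so that the totals match the output shown earlier; SQLite is used as a stand-in for MySQL, and the column names are backtick-escaped because `from` and `to` are keywords.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `from` and `to` are SQL keywords, so they must be escaped to be usable
# as column names.
conn.execute("CREATE TABLE mytable (`from` TEXT, `to` TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    # Hypothetical sample rows (the question's table was not available).
    [("A", "B"), ("A", "C"), ("B", "D"), ("E", "A")],
)

# Stack both columns into a single one with UNION ALL, then count per value.
query = """
    SELECT person, COUNT(*) AS num_calls
    FROM (
        SELECT `from` AS person FROM mytable
        UNION ALL
        SELECT `to` FROM mytable
    ) AS c
    GROUP BY person
    ORDER BY person
"""
call_counts = conn.execute(query).fetchall()
for person, num_calls in call_counts:
    print(person, num_calls)
```

UNION ALL matters here: plain UNION would remove duplicate values before counting and every person would end up with a count of 1.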

Select fields based on aggregates results

I have a typical products table with the fields name and price. I have to select the sum of all the prices, the name of the cheapest product and the name of the most expensive product and then return it all in the same result set. I've tried some combinations but the best I could come up with was an ugly query with multiple nested subselects. Can anyone help me with a good query example, please? Thanks in advance.
To illustrate the problem, here is a minimalistic products table:
+----+-------+------------------+
| id | price | name |
+----+-------+------------------+
| 1 | 2.20 | Shack Beer |
| 2 | 3.40 | Freeze IPA |
| 3 | 1.10 | Poor Man's Ale |
| 4 | 3.40 | Alabama Sour |
| 5 | 7.20 | Irish Stout |
+----+-------+------------------+
Given the table above, the query must return the following result set:
total_pricing = 17.30
cheapest_product = Poor Man's Ale
most_expensive_product = Irish Stout
Here are a couple of options for you (SQL Fiddle):
SELECT SUM(p.price) AS total_pricing,
(
SELECT name
FROM products
ORDER BY price
LIMIT 1
)AS cheapest_product,
(
SELECT name
FROM products
ORDER BY price DESC
LIMIT 1
)AS most_expensive_product
FROM products p;
Or in separate rows:
SELECT 'total_pricing' as Category, SUM(p.price) AS total_pricing
FROM products p
UNION
(
SELECT 'cheapest_product', name
FROM products
ORDER BY price
LIMIT 1
)
UNION
(
SELECT 'most_expensive_product', name
FROM products
ORDER BY price DESC
LIMIT 1
);
Check SQL Fiddle
SELECT DISTINCT price,
name
FROM products AS p1,
(SELECT MAX(price) AS `most_expensive_product`,
MIN(price) AS `cheapest_product`
FROM products) AS p2
WHERE p2.most_expensive_product = p1.price
OR p2.cheapest_product = p1.price
UNION
SELECT sum(price) AS price,
"Total_pricing" AS name
FROM products;
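For a quick sanity check, the single-row variant (scalar subqueries with ORDER BY ... LIMIT 1) can be run against the sample table. The sketch below uses an in-memory SQLite database as an assumed stand-in for MySQL and rounds the sum, because the prices are stored as floating-point values here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE products (id INTEGER PRIMARY KEY, price REAL, name TEXT)"
)
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        (1, 2.20, "Shack Beer"),
        (2, 3.40, "Freeze IPA"),
        (3, 1.10, "Poor Man's Ale"),
        (4, 3.40, "Alabama Sour"),
        (5, 7.20, "Irish Stout"),
    ],
)

# One aggregate plus two scalar subqueries, all returned in a single row.
query = """
    SELECT ROUND(SUM(price), 2) AS total_pricing,
           (SELECT name FROM products ORDER BY price LIMIT 1)
               AS cheapest_product,
           (SELECT name FROM products ORDER BY price DESC LIMIT 1)
               AS most_expensive_product
    FROM products
"""
summary = conn.execute(query).fetchone()
print(summary)
```

If several products share the lowest or highest price, LIMIT 1 picks one of them arbitrarily; adding a second sort key such as id would make the choice deterministic.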

What is SQL to select a property and the max number of occurrences of a related property?

I have a table like this:
Table: p
+----+------+
| id | w_id |
+----+------+
|  5 |    8 |
|  5 |   10 |
|  5 |    8 |
|  5 |   10 |
|  5 |    8 |
|  6 |    5 |
|  6 |    8 |
|  6 |   10 |
|  6 |   10 |
|  7 |    8 |
|  7 |   10 |
+----+------+
What is the best SQL to get the following result? :
+----+----------------+
| id | most_used_w_id |
+----+----------------+
|  5 |              8 |
|  6 |             10 |
|  7 |              8 |
+----+----------------+
In other words, to get, per id, the most frequent related w_id.
Note that on the example above, id 7 is related to 8 once and to 10 once.
So, either (7, 8) or (7, 10) will do as result. If it is not possible to
pick up one, then both (7, 8) and (7, 10) on result set will be ok.
I have come up with something like:
select counters2.p_id as id, counters2.w_id as most_used_w_id
from (
select p.id as p_id,
w_id,
count(w_id) as count_of_w_ids
from p
group by id, w_id
) as counters2
join (
select p_id, max(count_of_w_ids) as max_counter_for_w_ids
from (
select p.id as p_id,
w_id,
count(w_id) as count_of_w_ids
from p
group by id, w_id
) as counters
group by p_id
) as p_max
on p_max.p_id = counters2.p_id
and p_max.max_counter_for_w_ids = counters2.count_of_w_ids
;
but I am not sure at all whether this is the best way to do it. And I had to repeat the same sub-query two times.
Any better solution?
Try using user-defined variables:
select id,w_id
FROM
( select T.*,
if(@id<>id,1,0) as `row`,
@id:=id FROM
(
select id,W_id, Count(*) as cnt FROM p Group by ID,W_id
) as T,(SELECT @id:=0) as T1
ORDER BY id,cnt DESC
) as T2
WHERE `row`=1
SQLFiddle demo
Formal SQL
In fact, your solution is correct in terms of standard SQL. Why? Because you have to join values from the original data to the grouped data. Thus, your query cannot be simplified. MySQL allows mixing non-grouped columns and aggregate functions, but that's totally unreliable, so I do not recommend relying on that effect.
MySQL
Since you're using MySQL, you can use variables. I'm not a big fan of them, but for your case they may be used to simplify things:
SELECT
c.*,
IF(@id!=id, @i:=1, @i:=@i+1) AS num,
@id:=id AS gid
FROM
(SELECT id, w_id, COUNT(w_id) AS w_count
FROM t
GROUP BY id, w_id
ORDER BY id DESC, w_count DESC) AS c
CROSS JOIN (SELECT @i:=-1, @id:=-1) AS init
HAVING
num=1;
So for your data result will look like:
+------+------+---------+------+------+
| id | w_id | w_count | num | gid |
+------+------+---------+------+------+
| 7 | 8 | 1 | 1 | 7 |
| 6 | 10 | 2 | 1 | 6 |
| 5 | 8 | 3 | 1 | 5 |
+------+------+---------+------+------+
Thus, you've found your id and the corresponding w_id. The idea is to count the rows and enumerate them, relying on the fact that they are ordered in the subquery. We then need only the first row per id (because it represents the data with the highest count).
This may be replaced with a single GROUP BY id, but, again, the server is free to choose any row in that case (it will work because it takes the first row, but the documentation says nothing about that for the general case).
One nice thing about this approach is that you can also select, for example, the 2nd or 3rd most frequent value; it's very flexible.
Performance
To increase performance, you can create an index on (id, w_id); it will be used for ordering and grouping records. Variables and HAVING, however, will produce a row-by-row scan of the set derived by the inner GROUP BY. That isn't as bad as a full scan of the original data, but it is still a drawback of doing this with variables. On the other hand, doing it with JOIN and a subquery, as in your query, won't be much different, because a temporary table is created for the subquery result set too.
But to be certain, you'll have to test. And keep in mind - you already have valid solution, which, by the way, isn't bound to DBMS-specific stuff and is good in terms of common SQL.
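The "count, enumerate, keep the first row per id" idea can also be written with a window function instead of variables. This is a different technique from the one in this answer, and it assumes an engine that supports ROW_NUMBER() (MySQL 8.0 or later; the sketch runs it on SQLite 3.25 or later as a stand-in). Ties are broken by the smaller w_id, which is an arbitrary choice made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE p (id INTEGER, w_id INTEGER)")
conn.executemany(
    "INSERT INTO p VALUES (?, ?)",
    [
        (5, 8), (5, 10), (5, 8), (5, 10), (5, 8),
        (6, 5), (6, 8), (6, 10), (6, 10),
        (7, 8), (7, 10),
    ],
)

# Inner query: count each (id, w_id) pair.
# Middle query: number the pairs within each id, most frequent first.
# Outer query: keep only the top-ranked pair per id.
query = """
    SELECT id, w_id AS most_used_w_id
    FROM (
        SELECT id, w_id,
               ROW_NUMBER() OVER (
                   PARTITION BY id
                   ORDER BY cnt DESC, w_id
               ) AS rn
        FROM (
            SELECT id, w_id, COUNT(*) AS cnt
            FROM p
            GROUP BY id, w_id
        ) AS counts
    ) AS ranked
    WHERE rn = 1
    ORDER BY id
"""
most_used = conn.execute(query).fetchall()
for row in most_used:
    print(row)
```

Replacing ROW_NUMBER() with RANK() would return both (7, 8) and (7, 10) for the tied id, and changing the filter to rn = 2 would select the second most frequent value.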
Try this query
select p_id, ccc , w_id from
(
select p.id as p_id,
w_id, count(w_id) ccc
from p
group by id,w_id order by id,ccc desc) xxx
group by p_id having max(ccc)
Here is the SQL Fiddle link.
You can also use this code if you do not want to rely on the first record of non-grouping columns
select p_id, ccc , w_id from
(
select p.id as p_id,
w_id, count(w_id) ccc
from p
group by id,w_id order by id,ccc desc) xxx
group by p_id having ccc=max(ccc);

MySQL conditionally populate column 3 based on DISTINCT involving 2 other columns in one table

I had a good read through similar topics, but I can't quite a) find one that matches my scenario, or b) understand the others well enough to fit / tailor / tweak them to my situation.
I have a table, the important fields being;
+------+------+--------+--------+
| ID | Name | Price |Status |
+------+------+--------+--------+
| 1 | Fred | 4.50 | |
| 2 | Fred | 4.50 | |
| 3 | Fred | 5.00 | |
| 4 | John | 7.20 | |
| 5 | John | 7.20 | |
| 6 | John | 7.20 | |
| 7 | Max | 2.38 | |
| 8 | Max | 2.38 | |
| 9 | Sam | 21.00 | |
+------+------+--------+--------+
ID is an auto-incrementing value as records get added throughout the day.
NAME is a Primary Key field, which can repeat 1 to 3 times in the whole table.
Each NAME will have a PRICE value, which may or may not be the same per NAME.
There is also a STATUS field that needs to be populated based on the following, which is actually the part I am stuck on.
Status = 'Y' if each DISTINCT name has only one price attached to it.
Status = 'N' if each DISTINCT name has multiple prices attached to it.
Using the table above, ID's 1, 2 and 3 should be 'N', whilst 4, 5, 6, 7, 8 and 9 should be 'Y'.
I think this may well involve some form of combination of JOINs, GROUPs, and DISTINCTs but I am at a loss on how to put that into the right order for SQL.
In order to get the count of distinct Price values per name, we must use a GROUP BY on the Name field, but since you also want to display all names ungrouped but with an additional Status field, we must first create a subselect in the FROM clause which groups by the name and determines whether the name has multiple price values or not.
When we GROUP BY Name in the subselect, COUNT(DISTINCT price) will count the number of distinct price values for each particular name. Without the DISTINCT keyword, it would simply count the number of rows where price is not null.
In conjunction with that, we use a CASE expression to insert N into the Status column if there is more than one distinct Price value for the particular name, otherwise, it will insert Y.
The subselect only returns one row per Name, so to get all names ungrouped, we join that subselect to the main table on the condition that the subselect's Name = the main table's Name:
SELECT
b.ID,
b.Name,
b.Price,
a.Status
FROM
(
SELECT Name, CASE WHEN COUNT(DISTINCT Price) > 1 THEN 'N' ELSE 'Y' END AS Status
FROM tbl
GROUP BY Name
) a
INNER JOIN
tbl b ON a.Name = b.Name
Edit: In order to facilitate an update, you can incorporate this query using JOINs in the UPDATE like so:
UPDATE
tbl a
INNER JOIN
(
SELECT Name, CASE WHEN COUNT(DISTINCT Price) > 1 THEN 'N' ELSE 'Y' END AS Status
FROM tbl
GROUP BY Name
) b ON a.Name = b.Name
SET
a.Status = b.Status
Assuming you have an unfilled Status column in your table.
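To see the derived-table join in action, the sketch below runs the SELECT form of the query against the sample rows. SQLite is used here as an assumed stand-in for MySQL, and the table name tbl follows the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (ID INTEGER PRIMARY KEY, Name TEXT, Price REAL, Status TEXT)"
)
conn.executemany(
    "INSERT INTO tbl (ID, Name, Price) VALUES (?, ?, ?)",
    [
        (1, "Fred", 4.50), (2, "Fred", 4.50), (3, "Fred", 5.00),
        (4, "John", 7.20), (5, "John", 7.20), (6, "John", 7.20),
        (7, "Max", 2.38), (8, "Max", 2.38),
        (9, "Sam", 21.00),
    ],
)

# The subselect yields one row per Name with its Y/N flag; joining it back
# to tbl attaches that flag to every original row.
query = """
    SELECT b.ID, b.Name, b.Price, a.Status
    FROM (
        SELECT Name,
               CASE WHEN COUNT(DISTINCT Price) > 1 THEN 'N' ELSE 'Y' END AS Status
        FROM tbl
        GROUP BY Name
    ) AS a
    INNER JOIN tbl AS b ON a.Name = b.Name
    ORDER BY b.ID
"""
status_by_id = {row[0]: row[3] for row in conn.execute(query)}
print(status_by_id)
```

Only the three Fred rows come back as 'N', because Fred is the only name with more than one distinct price.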
If you want to update the status column, you could do:
UPDATE mytable s
SET status = (
SELECT IF(COUNT(DISTINCT price)=1, 'Y', 'N') c
FROM (
SELECT *
FROM mytable
) s1
WHERE s1.name = s.name
GROUP BY name
);
Technically, it should not be necessary to have this:
FROM (
SELECT *
FROM mytable
) s1
but there is a MySQL limitation that prevents you from selecting from the table you're updating. By wrapping the table in a derived table (the parenthesized subquery), we force MySQL to create a temporary table, and then it becomes possible.

ORDER BY and GROUP BY of MySQL query

I have two mySQL tables as follows:
[product] table
P_id | Name | Quantity
1 | B | 10
2 | C | 15
3 | A | 8
[attribute] table
P_id | Name | Quantity
1 | Black | 5
1 | Red | 5
2 | Blue | 6
2 | Black | 9
How can I write an SQL query so that it can show the result from the above two tables as follows:
Report:
P_id | Name | Quantity
3 | A | 8
1 | B | 10
1 | Black | 5
1 | Red | 5
2 | C | 15
2 | Black | 9
2 | Blue | 6
The rows should be sorted on the [Name] column, but grouped on the P_id column as above. By "grouping" on P_id, I mean "keeping all records with the same P_id next to each other". Is it possible to retrieve the above arrangement using a single SQL query?
SELECT P_id, Name, Quantity FROM (
SELECT P_id, Name, Quantity, Name as parent, 1 as level
FROM product
UNION
SELECT a.P_id, a.Name, a.Quantity, p.Name as parent, 2 as level
FROM attribute a JOIN product p ON a.P_id = p.P_id
) combined ORDER BY parent, level, Name
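The trick in this query is to give every row a sort key made of the parent product's name and a level (1 for the product itself, 2 for its attributes). A small sketch, using SQLite as an assumed stand-in for MySQL and UNION ALL instead of UNION since the two branches cannot produce duplicate rows here, shows the resulting order:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (P_id INTEGER, Name TEXT, Quantity INTEGER)")
conn.execute("CREATE TABLE attribute (P_id INTEGER, Name TEXT, Quantity INTEGER)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [(1, "B", 10), (2, "C", 15), (3, "A", 8)],
)
conn.executemany(
    "INSERT INTO attribute VALUES (?, ?, ?)",
    [(1, "Black", 5), (1, "Red", 5), (2, "Blue", 6), (2, "Black", 9)],
)

# Each row carries its parent product's name and a level, so sorting by
# (parent, level, Name) keeps a product directly above its own attributes.
query = """
    SELECT P_id, Name, Quantity
    FROM (
        SELECT P_id, Name, Quantity, Name AS parent, 1 AS level
        FROM product
        UNION ALL
        SELECT a.P_id, a.Name, a.Quantity, p.Name AS parent, 2 AS level
        FROM attribute a
        JOIN product p ON a.P_id = p.P_id
    ) AS combined
    ORDER BY parent, level, Name
"""
report = conn.execute(query).fetchall()
for row in report:
    print(row)
```

A plain ORDER BY Name over the union would instead interleave attributes of different products (both "Black" rows would sit next to each other), which is why the extra sort columns are needed.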
It should be a UNION operation, I believe, like this:
select * from product union select * from attribute order by Name;
Not sure why you wrote you need `group by' since your output is not grouped by any column.
Do you mean that you just need an union of both tables?
You can try this:
Select P_id, Name, Quantity
From product
Union
Select P_id, Name, Quantity
From attribute
Order by 2
SELECT * FROM
(
SELECT P_id, Name , Quantity FROM product
UNION ALL
SELECT P_id, Name , Quantity FROM attribute
) AS combined ORDER BY Name
You will have to use grouping, my friend. Then you can order on both of your required columns like this:
SELECT * FROM product UNION SELECT * FROM attribute ORDER BY Name, P_id;
If you want them ordered by P_id and then Name, this should work.
(SELECT P_id, Name, Quantity FROM product)
UNION
(SELECT P_id, Name, Quantity FROM attribute)
ORDER BY P_id, Name;