I have a table with three columns: A, B and C. Each column can be true or false.
I want to get the count of every possible combination.
Sample data:
CREATE TABLE `myTable` (
`id` mediumint(8) unsigned NOT NULL auto_increment,
`A` mediumint default NULL,
`B` mediumint default NULL,
`C` mediumint default NULL,
PRIMARY KEY (`id`)
) AUTO_INCREMENT=1;
INSERT INTO `myTable` (`A`,`B`,`C`) VALUES (0,0,1),(1,1,0),(0,0,0),(1,1,0),(1,0,0),(1,0,1),(0,0,1),(1,1,1),(0,1,0),(1,1,1);
INSERT INTO `myTable` (`A`,`B`,`C`) VALUES (1,0,1),(0,1,0),(1,1,1),(0,0,1),(1,0,0),(0,0,0),(0,0,1),(1,1,0),(0,0,0),(1,1,0);
INSERT INTO `myTable` (`A`,`B`,`C`) VALUES (1,1,0),(0,1,0),(1,1,1),(0,0,0),(1,1,0),(1,0,1),(1,1,1),(1,0,1),(1,1,1),(1,1,1);
INSERT INTO `myTable` (`A`,`B`,`C`) VALUES (0,1,0),(1,0,0),(0,1,0),(0,0,0),(0,0,0),(1,0,0),(1,0,1),(1,1,1),(0,0,1),(0,0,0);
INSERT INTO `myTable` (`A`,`B`,`C`) VALUES (1,1,1),(0,0,1),(1,1,0),(1,1,0),(1,0,0),(0,0,1),(0,1,1),(1,0,1),(1,0,0),(1,1,0);
INSERT INTO `myTable` (`A`,`B`,`C`) VALUES (1,1,1),(0,0,0),(1,0,1),(1,0,0),(1,0,0),(1,0,0),(0,0,1),(1,1,1),(0,1,1),(1,1,0);
INSERT INTO `myTable` (`A`,`B`,`C`) VALUES (0,1,1),(0,1,1),(0,1,0),(0,0,0),(0,1,0),(0,1,1),(0,1,1),(0,1,1),(0,1,0),(0,1,0);
INSERT INTO `myTable` (`A`,`B`,`C`) VALUES (0,1,1),(0,0,1),(0,1,0),(1,1,0),(0,0,0),(1,1,1),(1,1,0),(0,1,1),(1,0,1),(1,0,0);
INSERT INTO `myTable` (`A`,`B`,`C`) VALUES (0,1,0),(1,1,1),(0,1,0),(1,1,0),(1,0,1),(1,1,0),(0,1,0),(0,1,0),(0,1,0),(0,1,0);
INSERT INTO `myTable` (`A`,`B`,`C`) VALUES (1,1,0),(0,1,0),(1,1,1),(0,0,0),(1,0,0),(1,1,0),(1,0,1),(0,0,1),(1,0,1),(1,0,0);
Example result (from sample data):
combination: count
none: 11
A: 12
B: 17
C: 10
AB: 16
BC: 9
AC: 11
ABC: 14
Is this possible in one query? (MySQL)
This appears to be a simple count and Group by.
SELECT A, B, C, count(*)
FROM MyTable
GROUP BY A, B, C;
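If the labels are easier to build in application code, the plain GROUP BY result can be mapped to combination names on the client. This is only a minimal sketch: it assumes Python's built-in sqlite3 module as a stand-in for a MySQL connection and uses a six-row subset of the sample data. The `counts` dictionary and the label logic are illustrative and not part of the original answer.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (id INTEGER PRIMARY KEY, A INTEGER, B INTEGER, C INTEGER)"
)
rows = [(0, 0, 1), (1, 1, 0), (0, 0, 0), (1, 1, 0), (1, 0, 1), (1, 1, 1)]
conn.executemany("INSERT INTO myTable (A, B, C) VALUES (?, ?, ?)", rows)

counts = {}
query = "SELECT A, B, C, COUNT(*) FROM myTable GROUP BY A, B, C"
for a, b, c, n in conn.execute(query):
    # Build the label from the flags; an empty label means no flag is set.
    label = ("A" if a else "") + ("B" if b else "") + ("C" if c else "") or "none"
    counts[label] = n

print(counts)
```

Combinations with no rows (A, B and BC in this subset) are absent from the result, since GROUP BY only returns groups that exist.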
If you want to show the combined values as a single string, use concat and case:
SELECT concat(case when A = 1 then 'A' else '' end,
case when B = 1 then 'B' else '' end,
case when C = 1 then 'C' else '' end) as Combination
, count(*)
FROM MyTable
GROUP BY A, B, C
ORDER BY Combination;
or as Paul Spiegel shows in comments:
SELECT concat(left('A', A), left('B', B), left('C', C)) as Combination
, count(*)
FROM MyTable
GROUP BY A, B, C
ORDER BY Combination;
Giving us:
+----+-------------+----------+
| | Combination | count(*) |
+----+-------------+----------+
| 1 | | 11 |
| 2 | A | 12 |
| 3 | AB | 16 |
| 4 | ABC | 14 |
| 5 | AC | 11 |
| 6 | B | 17 |
| 7 | BC | 9 |
| 8 | C | 10 |
+----+-------------+----------+
Assuming your combinations are where those columns' values are TRUE, you're just looking at a group by over those 3 columns. The case logic is just there to present the combo in a single column; you could easily replace "case...end" with "A, B, C" to get the same result with those columns showing their values separately.
select
case
when A = 1 and B = 1 and C = 1 then 'ABC'
when A = 1 and B = 1 and C = 0 then 'AB'
when A = 1 and B = 0 and C = 1 then 'AC'
when A = 0 and B = 1 and C = 1 then 'BC'
when A = 1 and B = 0 and C = 0 then 'A'
when A = 0 and B = 1 and C = 0 then 'B'
when A = 0 and B = 0 and C = 1 then 'C'
else 'oops, this should not happen'
end as `Combo`
-- , sum(sumThing) as `sum` -- amended to count per question edit
, count(*) as `count`
from myTable
-- note: this filter skips rows where A, B and C are all 0 (the "none" combination)
where A = true
or B = true
or C = true
group by A, B, C
Use conditional count, pinning all three columns in each condition so that every row is counted in exactly one combination:
SELECT
COUNT(CASE WHEN A=1 AND B=0 AND C=0 THEN 1 END) AS A,
COUNT(CASE WHEN A=0 AND B=1 AND C=0 THEN 1 END) AS B,
COUNT(CASE WHEN A=0 AND B=0 AND C=1 THEN 1 END) AS C,
COUNT(CASE WHEN A=1 AND B=1 AND C=0 THEN 1 END) AS AB,
COUNT(CASE WHEN A=1 AND B=0 AND C=1 THEN 1 END) AS AC,
COUNT(CASE WHEN A=0 AND B=1 AND C=1 THEN 1 END) AS BC,
COUNT(CASE WHEN A=1 AND B=1 AND C=1 THEN 1 END) AS ABC,
COUNT(CASE WHEN A=0 AND B=0 AND C=0 THEN 1 END) AS None
FROM myTable;
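To see why each condition has to pin all three columns, compare a condition on A alone with the exclusive version. The sketch below is an illustration only: it assumes Python's sqlite3 module in place of MySQL, five made-up rows, and hypothetical aliases (`a_any`, `a_only`, `ab_only`, `none_set`).

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE myTable (A INTEGER, B INTEGER, C INTEGER)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?)",
    [(1, 0, 0), (1, 1, 0), (1, 1, 1), (0, 0, 0), (0, 1, 0)],
)

row = conn.execute("""
    SELECT
        -- counts every row with A set, whatever B and C are
        COUNT(CASE WHEN A = 1 THEN 1 END) AS a_any,
        -- counts only rows where A is the single flag set
        COUNT(CASE WHEN A = 1 AND B = 0 AND C = 0 THEN 1 END) AS a_only,
        COUNT(CASE WHEN A = 1 AND B = 1 AND C = 0 THEN 1 END) AS ab_only,
        COUNT(CASE WHEN A = 0 AND B = 0 AND C = 0 THEN 1 END) AS none_set
    FROM myTable
""").fetchone()

result = dict(row)
print(result)
```

Here `a_any` is 3 while `a_only` is 1, so only the exclusive conditions reproduce the per-combination counts asked for in the question.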
You can use a simple subquery for each combination and union them:
SELECT "none", COUNT(*) FROM mytable WHERE A = 0 AND B = 0 AND C = 0
UNION
SELECT "A", COUNT(*) FROM mytable WHERE A = 1 AND B = 0 AND C = 0
UNION
SELECT "B", COUNT(*) FROM mytable WHERE A = 0 AND B = 1 AND C = 0
UNION
SELECT "C", COUNT(*) FROM mytable WHERE A = 0 AND B = 0 AND C = 1
UNION
SELECT "AB", COUNT(*) FROM mytable WHERE A = 1 AND B = 1 AND C = 0
UNION
SELECT "BC", COUNT(*) FROM mytable WHERE A = 0 AND B = 1 AND C = 1
UNION
SELECT "AC", COUNT(*) FROM mytable WHERE A = 1 AND B = 0 AND C = 1
UNION
SELECT "ABC", COUNT(*) FROM mytable WHERE A = 1 AND B = 1 AND C = 1
Here's an example Table layout:
TABLE_A: TABLE_B: TABLE_A_B:
id | a | b | c id | name a_id | b_id
--------------------- --------- -----------
1 | true | X | A 1 | A 1 | 1
2 | true | Z | null 2 | B 1 | 2
3 | false | X | null 3 | C 2 | 2
4 | true | Y | Q 4 | 1
5 | false | null | null 4 | 2
5 | 1
Possible Values:
TABLE_A.a: true, false
TABLE_A.b: X, Y, Z
TABLE_A.c: A, B, C, ... basically arbitrary
TABLE_B.name: A, B, C, ... basically arbitrary
What I want to achieve:
SELECT all rows from TABLE_A
SUM(where a = true),
SUM(where a = false),
SUM(where b = 'X'),
SUM(where b = 'Y'),
SUM(where b = 'Z'),
SUM(where b IS NULL),
and also get the SUMs for all distinct TABLE_A.c values.
and also get the SUMs for all those TABLE_A_B relations.
The result for the example Table above should look like:
aTrue | aFalse | bX | bY | bZ | bNull | cA | cQ | cNull | nameA | nameB | nameC
-------------------------------------------------------------------------------
3 | 2 | 2 | 1 | 1 | 1 | 1 | 1 | 3 | 3 | 3 | 0
What I've done so far:
SELECT
SUM(CASE WHEN a = true THEN 1 ELSE 0 END) AS aTrue,
SUM(CASE WHEN a = false THEN 1 ELSE 0 END) AS aFalse,
SUM(CASE WHEN b = 'X' THEN 1 ELSE 0 END) AS bX,
...
FROM TABLE_A
What's my problem?
Selecting column TABLE_A.a and TABLE_A.b is easy, because there's a fixed number of possible values.
But I can't figure out how to count the distinct values of TABLE_A.c. And basically the same problem for the JOINed TABLE_B, because the number of values within TABLE_B is unknown and can change over time.
Thanks for your help! :)
EDIT1: New (preferred) SQL result structure:
column | value | sum
----------------------------
TABLE_A.a | true | 3
TABLE_A.a | false | 2
TABLE_A.b | X | 2
TABLE_A.b | Y | 1
TABLE_A.b | Z | 1
TABLE_A.b | null | 1
TABLE_A.c | A | 1
TABLE_A.c | Q | 1
TABLE_A.c | null | 3
TABLE_B.name | A | 3
TABLE_B.name | B | 3
TABLE_B.name | C | 0
This follows your original request for a single row that simulates a pivot. In MySQL a logical condition evaluates to 1 when true and 0 when false, so SUM( logical condition ) counts the rows where the condition holds. Since column "a" is true or false, SUM( a ) counts the true rows and SUM( NOT a ) counts the false ones (NOT FALSE = TRUE). The same applies to column "b": b = 'X' counts as 1 when it matches, otherwise 0.
In other SQL engines, you might see it written as SUM( case/when ).
Since the TABLE_A counts and the TABLE_B counts don't rely on each other, each set can be computed in its own subquery (aliased pqA and pqB, for pre-query A and pre-query B). With no GROUP BY, each subquery returns a single row. Listing them without a join condition produces a Cartesian product, but since each side has exactly one row, the result is a single record with all the columns you want.
SELECT
pqA.*, pqB.*
from
( SELECT
SUM( ta.a ) aTrue,
SUM( NOT ta.a ) aFalse,
SUM( ta.b = 'X' ) bX,
SUM( ta.b = 'Y' ) bY,
SUM( ta.b = 'Z' ) bZ,
SUM( ta.b is null ) bNULL,
SUM( ta.c = 'A' ) cA,
SUM( ta.c = 'Q' ) cQ,
SUM( ta.c is null ) cNULL,
COUNT( distinct ta.c ) DistC
from
table_a ta ) pqA,
( SELECT
SUM( b.Name = 'A' ) nameA,
SUM( b.Name = 'B' ) nameB,
SUM( b.Name = 'C' ) nameC
from
table_a_b t_ab
join table_b b
ON t_ab.b_id = b.id ) pqB
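The SUM( condition ) idiom can be checked against the TABLE_A example from the question. This is a small sketch, assuming Python's sqlite3 module as a stand-in for MySQL (SQLite also evaluates comparisons to 1 or 0) and with the boolean column stored as 1/0. The variable name `summary` is only illustrative.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_a (id INTEGER PRIMARY KEY, a INTEGER, b TEXT, c TEXT)"
)
conn.executemany(
    "INSERT INTO table_a (a, b, c) VALUES (?, ?, ?)",
    [(1, "X", "A"), (1, "Z", None), (0, "X", None), (1, "Y", "Q"), (0, None, None)],
)

# Each comparison yields 1, 0 or NULL; SUM adds the 1s and skips NULLs.
summary = conn.execute("""
    SELECT SUM(a), SUM(NOT a),
           SUM(b = 'X'), SUM(b = 'Y'), SUM(b = 'Z'), SUM(b IS NULL)
    FROM table_a
""").fetchone()

print(summary)
```

The values correspond to aTrue, aFalse, bX, bY, bZ and bNull. Note that b = 'X' is NULL rather than 0 when b is NULL, which is why the NULL rows need their own IS NULL column.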
This option gives your second (preferred) output
SELECT
MAX( 'TABLE_A.a ' ) as Basis,
CASE when a then 'true' else 'false' end Value,
COUNT(*) finalCnt
from
TABLE_A
group by
a
UNION ALL
SELECT
MAX( 'TABLE_A.b ' ) as Basis,
b Value,
COUNT(*) finalCnt
from
TABLE_A
group by
b
UNION ALL
SELECT
MAX( 'TABLE_A.c ' ) as Basis,
c Value,
COUNT(*) finalCnt
from
TABLE_A
group by
c
UNION ALL
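SELECT
MAX( 'TABLE_B.name ' ) as Basis,
b.Name Value,
COUNT( t_ab.b_id ) finalCnt
from
table_b b
-- left join so that names without relations (C in the example) still appear with a count of 0
left join table_a_b t_ab
ON t_ab.b_id = b.id
group by
b.Name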
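Names with no relations (C in the example) only show up when the query starts from table_b and outer-joins the relation table, counting the joined column instead of the rows. Here is a minimal sketch of that part alone, assuming Python's sqlite3 module in place of MySQL. The variable `name_counts` is only illustrative.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table_a_b (a_id INTEGER, b_id INTEGER);
    INSERT INTO table_b VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO table_a_b VALUES (1, 1), (1, 2), (2, 2), (4, 1), (4, 2), (5, 1);
""")

# COUNT(t_ab.b_id) ignores the NULL produced for unmatched names,
# whereas COUNT(*) would report 1 for them.
name_counts = conn.execute("""
    SELECT b.name, COUNT(t_ab.b_id)
    FROM table_b b
    LEFT JOIN table_a_b t_ab ON t_ab.b_id = b.id
    GROUP BY b.name
    ORDER BY b.name
""").fetchall()

print(name_counts)
```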
I think you will need to build a dynamic query, since you don't know the possible values for column c in TABLE_A. You can write a stored procedure that reads the list of distinct values of column c into a variable and then uses a WHILE loop to construct the dynamic query.
Please let me know if you need more detail.
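The same idea can be sketched in client code instead of a stored procedure: read the distinct values first, then generate one conditional SUM per value. This is an illustration under assumptions, not the stored-procedure approach itself. It uses Python's sqlite3 module in place of MySQL, the generated aliases `c_0`, `c_1`, ... are hypothetical, and the values are bound as parameters rather than pasted into the SQL text.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_a (id INTEGER PRIMARY KEY, a INTEGER, b TEXT, c TEXT)"
)
conn.executemany(
    "INSERT INTO table_a (a, b, c) VALUES (?, ?, ?)",
    [(1, "X", "A"), (1, "Z", None), (0, "X", None), (1, "Y", "Q"), (0, None, None)],
)

# Step 1: collect the distinct non-NULL values of column c.
values = [r[0] for r in conn.execute(
    "SELECT DISTINCT c FROM table_a WHERE c IS NOT NULL ORDER BY c"
)]

# Step 2: generate one SUM(CASE ...) column per value, plus one for NULL.
columns = [
    f"SUM(CASE WHEN c = ? THEN 1 ELSE 0 END) AS c_{i}"
    for i in range(len(values))
]
columns.append("SUM(CASE WHEN c IS NULL THEN 1 ELSE 0 END) AS c_null")
sql = "SELECT " + ", ".join(columns) + " FROM table_a"

# Step 3: run the generated query and pair each value with its count.
counts_row = conn.execute(sql, values).fetchone()
pivot = dict(zip(values + [None], counts_row))
print(pivot)
```

With a MySQL driver the placeholder style would typically differ (for example `%s` instead of `?`), so treat the parameter syntax as an assumption too.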
Dynamic SQL
Below is the schema of my brand_of_items table. For simplicity, it is shown here with two columns: id (primary key, auto-increment) and symbol (varchar 50, unique).
Table - brand_of_items
id symbol
0 a
1 b
2 c
.. ..
10 j
Below is the schema of my items_of_brand table.
Table - mainIndexQuantity
id brand_of_items_id vol item_type salefinalizeddate
0 1 5 0 2005-5-11
1 1 6 0 2004-5-11
2 1 7 0 2011-5-11
3 1 8 0 2011-5-12
4 1 9 0 2011-5-12
5 1 10 0 2011-5-11
6 1 5 1 2012-5-11
7 1 6 1 2012-5-11
8 1 7 1 2011-5-11
9 1 8 1 2010-5-12
10 1 9 1 2012-5-12
11 1 10 1 2005-5-12
The brand_of_items_id column of the mainIndexQuantity table is a foreign key that points to brand_of_items (id).
The item_type column of the mainIndexQuantity table is not a foreign key, although it should be.
The two item types are: 0 = retail and 1 = wholesale
I want to calculate the ratio of the item types (retail vs wholesale) for each brand_of_items entry. The goal is to see whether a brand's items sell more in retail or in wholesale.
Adding Complexity:
I want to add a date column to mainIndexQuantity table and want to find out the difference in sum of RetailVolume and WholesaleVolume and group the results by salefinalizeddate field.
This is to help determine what items in what seasons sold more and the (delta) difference in sum of RetailVolume & WholeSaleVolume will help to select items to pay most attention to.
Try this:
SELECT
b.id,
b.symbol,
IFNULL(SUM(m.item_type = 1), 0) / (COUNT(*) * 1.0) AS wholesaleRatio,
IFNULL(SUM(m.item_type = 0), 0) / (COUNT(*) * 1.0) AS RetailRatio
FROM brand_of_items b
LEFT JOIN mainIndexQuantity m ON b.id = m.brand_of_items_id
GROUP BY b.id,
b.symbol;
SQL Fiddle Demo.
This will give you:
| ID | SYMBOL | WHOLESALERATIO | RETAILRATIO |
----------------------------------------------
| 0 | a | 0 | 0 |
| 1 | b | 0.5 | 0.5 |
| 2 | c | 0 | 0 |
| 10 | j | 0 | 0 |
Assuming that:
wholesaleRatio is the ratio of the count of wholesale items to the count of all items.
RetailRatio is the ratio of the count of retail items to the count of all items.
If the ratio should instead be the sum of the vol column per type to the total vol, you can do this:
SELECT
b.id,
b.symbol,
SUM(CASE WHEN m.item_type = 1 THEN m.vol ELSE 0 END) / SUM(m.vol) AS wholesaleRatio,
SUM(CASE WHEN m.item_type = 0 THEN m.vol ELSE 0 END) / SUM(m.vol) AS RetailRatio
FROM brand_of_items b
LEFT JOIN mainIndexQuantity m ON b.id = m.brand_of_items_id
GROUP BY b.id,
b.symbol;
Note that:
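I used LEFT JOIN so that the unmatched rows appear in the result set, i.e. those brands that have no entries in the mainIndexQuantity table. If you don't want to include them, use INNER JOIN instead.
The multiplication by 1.0 forces a decimal result, as noted by #JW. MySQL's / operator already returns a decimal value (DIV is the integer division operator), so it matters mainly on engines that divide integers as integers.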
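The two ratio definitions give different numbers as soon as volumes differ, and they also treat brands without sales differently. Here is a small sketch that makes this visible, assuming Python's sqlite3 module as a stand-in for MySQL and two made-up sales rows. SQLite divides integers as integers, so the `* 1.0` is required there. The aliases `wholesale_by_count` and `wholesale_by_volume` are illustrative.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE brand_of_items (id INTEGER PRIMARY KEY, symbol TEXT);
    CREATE TABLE mainIndexQuantity (
        id INTEGER PRIMARY KEY,
        brand_of_items_id INTEGER,
        vol INTEGER,
        item_type INTEGER
    );
    INSERT INTO brand_of_items VALUES (0, 'a'), (1, 'b');
    INSERT INTO mainIndexQuantity (brand_of_items_id, vol, item_type)
    VALUES (1, 10, 0), (1, 30, 1);
""")

ratios = conn.execute("""
    SELECT b.symbol,
           -- share of wholesale rows among all rows of the brand
           IFNULL(SUM(m.item_type = 1), 0) * 1.0 / COUNT(*) AS wholesale_by_count,
           -- share of wholesale volume in the brand's total volume
           SUM(CASE WHEN m.item_type = 1 THEN m.vol ELSE 0 END) * 1.0
               / SUM(m.vol) AS wholesale_by_volume
    FROM brand_of_items b
    LEFT JOIN mainIndexQuantity m ON b.id = m.brand_of_items_id
    GROUP BY b.id, b.symbol
    ORDER BY b.id
""").fetchall()

print(ratios)
```

For the brand without sales, the count-based ratio is 0 (the LEFT JOIN still yields one row for COUNT(*)), while the volume-based ratio is NULL because SUM(m.vol) is NULL.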
Update 1
To include the Total Volume, Retail Volume Sum and Wholesale Volume sum try this:
SELECT
b.id,
b.symbol,
IFNULL(SUM(m.item_type = 1), 0) * 1.0 / COUNT(*) AS wholesaleRatio,
IFNULL(SUM(m.item_type = 0), 0) * 1.0 / COUNT(*) AS RetailRatio,
IFNULL(SUM(m.vol), 0) AS 'Total Volume',
SUM(CASE WHEN m.item_type = 0 THEN m.vol ELSE 0 END) AS 'Retail Volume sum',
SUM(CASE WHEN m.item_type = 1 THEN m.vol ELSE 0 END) AS 'Wholesale Volume sum'
FROM brand_of_items b
LEFT JOIN mainIndexQuantity m ON b.id = m.brand_of_items_id
GROUP BY b.id,
b.symbol;
Updated SQL Fiddle Demo.
This will give you:
| ID | SYMBOL | WHOLESALERATIO | RETAILRATIO | TOTAL VOLUME | RETAIL VOLUME SUM | WHOLESALE VOLUME SUM |
--------------------------------------------------------------------------------------------------------
| 0 | a | 0 | 0 | 0 | 0 | 0 |
| 1 | b | 0.5 | 0.5 | 90 | 45 | 45 |
| 2 | c | 0 | 0 | 0 | 0 | 0 |
| 10 | j | 0 | 0 | 0 | 0 | 0 |
If you want to sort the result set by these total and sums, put this query in a subquery, then you can do this:
SELECT *
FROM
(
SELECT
b.id,
b.symbol,
IFNULL(SUM(m.item_type = 1), 0) * 1.0 / COUNT(*) AS wholesaleRatio,
IFNULL(SUM(m.item_type = 0), 0) * 1.0 / COUNT(*) AS RetailRatio,
IFNULL(SUM(m.vol), 0) AS TotalVolume,
SUM(CASE WHEN m.item_type = 0 THEN m.vol ELSE 0 END) AS RetailVolumeSum,
SUM(CASE WHEN m.item_type = 1 THEN m.vol ELSE 0 END) AS WholesaleVolumeSum
FROM brand_of_items b
LEFT JOIN mainIndexQuantity m ON b.id = m.brand_of_items_id
GROUP BY b.id,
b.symbol
) AS sub
ORDER BY RetailVolumeSum DESC,
WholesaleVolumeSum DESC;
But your last requirement is not clear: are you looking for all brands sorted by their retail/wholesale ratios and volumes, or only for the brand with the highest values?
For the latter:
SELECT *
FROM
(
SELECT
b.id,
b.symbol,
IFNULL(SUM(m.item_type = 1), 0) * 1.0 / COUNT(*) AS wholesaleRatio,
IFNULL(SUM(m.item_type = 0), 0) * 1.0 / COUNT(*) AS RetailRatio,
IFNULL(SUM(m.vol), 0) AS TotalVolume,
SUM(CASE WHEN m.item_type = 0 THEN m.vol ELSE 0 END) AS RetailVolumeSum,
SUM(CASE WHEN m.item_type = 1 THEN m.vol ELSE 0 END) AS WholesaleVolumeSum
FROM brand_of_items b
LEFT JOIN mainIndexQuantity m ON b.id = m.brand_of_items_id
GROUP BY b.id,
b.symbol
) AS sub
ORDER BY RetailVolumeSum DESC,
WholesaleVolumeSum DESC,
TotalVolume DESC
LIMIT 1;
Update 2
To get the brands that have the highest total volume, you can do this:
SELECT
b.id,
b.symbol,
IFNULL(SUM(m.item_type = 1), 0) * 1.0 / COUNT(*) AS wholesaleRatio,
IFNULL(SUM(m.item_type = 0), 0) * 1.0 / COUNT(*) AS RetailRatio,
IFNULL(SUM(m.vol), 0) AS TotalVolume,
SUM(CASE WHEN m.item_type = 0 THEN m.vol ELSE 0 END) AS RetailVolumeSum,
SUM(CASE WHEN m.item_type = 1 THEN m.vol ELSE 0 END) AS WholesaleVolumeSum
FROM brand_of_items b
LEFT JOIN mainIndexQuantity m ON b.id = m.brand_of_items_id
GROUP BY b.id,
b.symbol
HAVING SUM(m.vol) = (SELECT MAX(TotalVolume)
FROM
(
SELECT brand_of_items_id, SUM(vol) AS TotalVolume
FROM mainIndexQuantity
GROUP BY brand_of_items_id
) t);
Note that:
This gives you the brands that have the highest total volume. If you are looking for those that have the highest ratio, you have to change the HAVING clause to compare against the max of the ratio rather than the max of the total volume.
Because the filter matches on the highest total volume, you may get more than one brand when several brands tie for it, as in the updated fiddle demo. In that case, use LIMIT to return only one.
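For the date-grouped part of the question, the same conditional SUM can be grouped by salefinalizeddate to get the retail volume, the wholesale volume and their difference per brand and date. This is only a sketch under assumptions: Python's sqlite3 module stands in for MySQL, the dates are written in ISO format, the rows are a subset of the sample data, and `Delta` is a hypothetical alias.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE mainIndexQuantity (
        id INTEGER PRIMARY KEY,
        brand_of_items_id INTEGER,
        vol INTEGER,
        item_type INTEGER,          -- 0 = retail, 1 = wholesale
        salefinalizeddate TEXT
    )
""")
conn.executemany(
    "INSERT INTO mainIndexQuantity "
    "(brand_of_items_id, vol, item_type, salefinalizeddate) VALUES (?, ?, ?, ?)",
    [
        (1, 7, 0, "2011-05-11"), (1, 10, 0, "2011-05-11"), (1, 7, 1, "2011-05-11"),
        (1, 8, 0, "2011-05-12"),
        (1, 5, 1, "2012-05-11"), (1, 6, 1, "2012-05-11"),
    ],
)

deltas = conn.execute("""
    SELECT brand_of_items_id,
           salefinalizeddate,
           SUM(CASE WHEN item_type = 0 THEN vol ELSE 0 END) AS RetailVolume,
           SUM(CASE WHEN item_type = 1 THEN vol ELSE 0 END) AS WholesaleVolume,
           SUM(CASE WHEN item_type = 0 THEN vol ELSE 0 END)
             - SUM(CASE WHEN item_type = 1 THEN vol ELSE 0 END) AS Delta
    FROM mainIndexQuantity
    GROUP BY brand_of_items_id, salefinalizeddate
    ORDER BY brand_of_items_id, salefinalizeddate
""").fetchall()

for row in deltas:
    print(row)
```

A positive Delta means retail outsold wholesale on that date, a negative one the opposite. To compare seasons rather than single days, the grouping expression could be replaced by a month or quarter derived from the date.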