COUNT(DISTINCT) in multiple columns in SQL Server 2008 - sql-server-2008

In Oracle, it's possible to get a count of distinct values in multiple columns by using the || operator (according to this forum post, anyway):
SELECT COUNT(DISTINCT ColumnA || ColumnB) FROM MyTable
Is there a way to do this in SQL Server 2008? I'm trying to perform a single query to return some group statistics, but I can't seem to do it.
For example, here is a table of values I'm trying to query:
AssetId MyId TheirId InStock
328 10 10 1
328 20 20 0
328 30 30 0
328 40 10 0
328 10 10 0
328 10 10 0
328 10 10 0
328 10 10 0
For AssetId #328, I want to compute the total number of unique IDs in the MyId and TheirId columns (4 = 10, 20, 30, 40), as well as the total number of non-zero rows in the InStock column (1):
AssetId TotalIds AvailableIds
328 4 1
Is there a way to work this magic somehow?

You can use a cross apply and values.
select T1.AssetId,
count(distinct T2.ID) TotalIds,
sum(case T2.InStock when 0 then 0 else 1 end) AvailableIds
from YourTable as T1
cross apply(values(T1.MyId, T1.InStock),
(T1.TheirId, 0)
) as T2(ID, InStock)
group by T1.AssetId
Or you can do a union all in a sub query.
select T.AssetId,
count(distinct T.ID) TotalIds,
sum(case T.InStock when 0 then 0 else 1 end) AvailableIds
from (
select AssetId, MyId as ID, InStock
from YourTable
union all
select AssetID, TheirId, 0
from YourTable
) as T
group by T.AssetId
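The UNION ALL variant above can be checked against the sample rows from the question. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for SQL Server 2008, which is an assumption made here so the example runs anywhere, and the table name YourTable is taken from the answer.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (AssetId INT, MyId INT, TheirId INT, InStock INT)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?, ?)",
    [(328, 10, 10, 1), (328, 20, 20, 0), (328, 30, 30, 0), (328, 40, 10, 0),
     (328, 10, 10, 0), (328, 10, 10, 0), (328, 10, 10, 0), (328, 10, 10, 0)],
)

# Stack MyId and TheirId into one ID column; the TheirId rows carry InStock = 0
# so they never add to the availability count.
query = """
select T.AssetId,
       count(distinct T.ID) as TotalIds,
       sum(case T.InStock when 0 then 0 else 1 end) as AvailableIds
from (
    select AssetId, MyId as ID, InStock from YourTable
    union all
    select AssetId, TheirId, 0 from YourTable
) as T
group by T.AssetId
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(328, 4, 1)]
```

The result matches the output the question asks for: 4 distinct IDs across both columns and 1 in-stock row.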

I think this is a good solution for you:
SELECT COUNT(*)
FROM (SELECT DISTINCT Column1, Column2
FROM MyTable) A

You can get the result like this:
DECLARE @t TABLE (AssetId INT, MyId INT, TheirId INT, InStock INT)
INSERT @t
VALUES
(328,10, 10, 1)
,(328,20, 20, 0)
,(328,30, 30, 0)
,(328,40, 10, 0)
,(328,10, 10, 0)
,(328,10, 10, 0)
,(328,10, 10, 0)
,(328,10, 10, 0)
;WITH a AS(
SELECT AssetId,
COUNT(col) cnt
FROM
(
SELECT MyId col, AssetId
FROM @t
UNION
SELECT TheirId col, AssetId
FROM @t
) b
GROUP BY AssetId
)
SELECT a.AssetId,
a.cnt TotalIds,
SUM(CASE WHEN InStock <> 0 THEN 1 ELSE 0 END) AvailableIds
FROM @t c
JOIN a ON a.AssetId = c.AssetId
GROUP BY a.AssetId, a.cnt
In the common table expression (the WITH block), uniqueness is guaranteed by the UNION operator, which discards duplicate rows, so a plain COUNT(col) is enough and COUNT(DISTINCT col) is not needed.

You can follow the Oracle example and concatenate the values together (that is what the Oracle query is doing). You just have to convert the values to characters first:
select AssetId,
count(distinct cast(MyId as varchar(8000)) + ',' + cast(TheirId as varchar(8000))
) as TotalIds,
count(distinct case when InStock > 0
then cast(MyId as varchar(8000)) + ',' + cast(TheirId as varchar(8000))
end) as AvailableIds
from t
group by AssetId
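The ',' separator in the concatenation matters: without it, different pairs can collapse into the same string. A small sketch of that effect follows, using Python's sqlite3 module (an assumption for runnability; SQLite concatenates with || where SQL Server uses +) and a hypothetical table named pairs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pairs (a TEXT, b TEXT)")  # hypothetical demo table
# Two different pairs that produce the same string when glued together directly.
conn.executemany("INSERT INTO pairs VALUES (?, ?)", [("1", "23"), ("12", "3")])

no_sep, with_sep = conn.execute(
    "SELECT COUNT(DISTINCT a || b), COUNT(DISTINCT a || ',' || b) FROM pairs"
).fetchone()
print(no_sep, with_sep)  # 1 2
```

Both rows concatenate to '123' without a separator, so the distinct count is 1; with the comma they stay distinct. The separator should be a character that cannot occur in the values themselves.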
You can also do it as a subquery:
select AssetId, count(*) as TotalIds,
sum(case when inStock > 0 then 1 else 0 end) as AvailableIds
from (select AssetId, myId, theirId, max(inStock) as inStock
from t
group by AssetId, myId, theirId
) a
group by AssetId
"Theoretically", I like the second approach better, since it is more set-based. However, if you find yourself trying to count distinct combinations of columns for several different sets of columns, the string concatenation approach is more practical.

If you don't like using CTEs, you can try the following solution. The gist of it is to
select the TotalIDs for each AssetID in a separate subquery,
select the AvailableIDs for each AssetID in a separate subquery, and
JOIN the results of both subqueries to produce the final results.
The statement as written works on the entire table. You can get the results for a single AssetID by adding an appropriate WHERE clause.
SQL Statement
SELECT a.AssetId, t.TotalIDs, a.AvailableIDs
FROM (
SELECT AssetID, TotalIDs = COUNT(*)
FROM (
SELECT AssetID
FROM MyTable
GROUP BY
MyId, TheirID, AssetID
) t
GROUP BY
AssetID
) AS t
INNER JOIN (
SELECT AssetID, AvailableIDs = SUM(InStock)
FROM MyTable
GROUP BY
AssetID
) AS a ON a.AssetId = t.AssetId
Test script
;WITH MyTable (AssetId, MyId, TheirId, InStock) AS (
SELECT * FROM (VALUES
(328, 10, 10, 1)
, (328, 20, 20, 0)
, (328, 30, 30, 0)
, (328, 40, 10, 0)
, (328, 10, 10, 0)
, (328, 10, 10, 0)
, (328, 10, 10, 0)
, (328, 10, 10, 0)
, (329, 10, 10, 0)
, (329, 10, 20, 1)
) AS a (b, c, d, e)
)
SELECT a.AssetId, t.TotalIDs, a.AvailableIDs
FROM (
SELECT AssetID, TotalIDs = COUNT(*)
FROM (
SELECT AssetID
FROM MyTable
GROUP BY
MyId, TheirID, AssetID
) t
GROUP BY
AssetID
) AS t
INNER JOIN (
SELECT AssetID, AvailableIDs = SUM(InStock)
FROM MyTable
GROUP BY
AssetID
) AS a ON a.AssetId = t.AssetId

Option #1
select
AssetID, count(distinct MyId) As MyId, SUM(InStock) InStock
From T
Group By
AssetID
Option #2 Without CTE
Select AssetID, count(MyId), sum(InStock) InStock From
(
select
AssetID, MyId, SUM(InStock) InStock
From MyTable
Group By
AssetID, MyId
)K
Group by AssetID
Option #3 With CTE
;With Sub(AssetID, MyId, InStock)
As
(
select
AssetID, MyId, SUM(InStock) InStock
From MyTable
Group By
AssetID, MyId
)
Select AssetID, count(MyId), sum(InStock) From
(
Select * from Sub
)K
Group by AssetID

Related

Trying to concatenate three different rows in mysql

So, I have an attributes table and I'm trying to build a query to concatenate into one column
attributes
------------
id, c_id, key, value
1, 1, 'day', 11
2, 1, 'month', 09
3, 1, 'year', 1999
4, 2, 'day', 14
5, 2, 'month', 11
6, 2, 'year', 2004
And this is the query I wrote,
SELECT
consumer_id,
CONCAT(
(SELECT `value` FROM consumer_attributes WHERE `key` = 'select_day'),
'_',
CONCAT(
(SELECT `value` FROM consumer_attributes WHERE `key` = 'select_month'),
'_',
CONCAT(
(SELECT `value` FROM consumer_attributes WHERE `key` = 'select_year'),
'',
''
)
)
) AS dob
FROM consumer_attributes
It throws out
ERROR CODE: 1242 Subquery returns more than 1 row
Can someone help me out?
output I'm trying to achieve
consumer_id, concat
1, 11_09_1999
2, 14_11_2004
SELECT c_id, CONCAT_WS('_',
(SELECT value FROM consumer_attributes a WHERE `key`='day' AND a.c_id = c.c_id),
(SELECT value FROM consumer_attributes a WHERE `key`='month' AND a.c_id = c.c_id),
(SELECT value FROM consumer_attributes a WHERE `key`='year' AND a.c_id = c.c_id)) AS dob
FROM (SELECT DISTINCT c_id FROM consumer_attributes) c;
https://www.db-fiddle.com/f/pB6b5xrgPKCivFWcpQHsyE/14
Try this,
SELECT `c_id`, CONCAT(GROUP_CONCAT(IF(`key` = 'day', `value`, NULL)),'_',GROUP_CONCAT(IF(`key` = 'month', `value`, NULL)),'_',GROUP_CONCAT(IF(`key` = 'year', `value`, NULL))) as dob
FROM `consumer_attributes`
GROUP BY `c_id`
Note: is the key select_day or day? Change it in the query above if it's different.
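The same per-consumer pivot can also be written with MAX(CASE ...) instead of GROUP_CONCAT(IF(...)). The sketch below is not MySQL: it runs on Python's sqlite3 module (an assumption for runnability), stores the values as text so that '09' keeps its leading zero, and uses the key names 'day', 'month' and 'year' from the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE consumer_attributes (id INT, c_id INT, "key" TEXT, "value" TEXT)')
conn.executemany(
    "INSERT INTO consumer_attributes VALUES (?, ?, ?, ?)",
    [(1, 1, "day", "11"), (2, 1, "month", "09"), (3, 1, "year", "1999"),
     (4, 2, "day", "14"), (5, 2, "month", "11"), (6, 2, "year", "2004")],
)

# One output row per c_id; each MAX(CASE ...) picks out the single value
# stored under that key and ignores the other rows of the group.
dob_rows = conn.execute("""
    SELECT c_id,
           MAX(CASE WHEN "key" = 'day'   THEN "value" END) || '_' ||
           MAX(CASE WHEN "key" = 'month' THEN "value" END) || '_' ||
           MAX(CASE WHEN "key" = 'year'  THEN "value" END) AS dob
    FROM consumer_attributes
    GROUP BY c_id
    ORDER BY c_id
""").fetchall()
print(dob_rows)  # [(1, '11_09_1999'), (2, '14_11_2004')]
```

Because the grouping is per c_id, no subquery can return more than one row, which is what caused error 1242 in the original query.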
You didn't correlate the subqueries with the main query on the id columns, so each subquery returns more than one row per record. It should be fine if the rest is correct:
SELECT
dmy.consumer_id,
concat (
max(ifnull( (SELECT `value` FROM consumer_attributes dd WHERE `key` = 'day' and dd.id = dmy.id) , -1)) ,'_' ,
max(ifnull( (SELECT `value` FROM consumer_attributes mm WHERE `key` = 'month' and mm.id = dmy.id) , -1) ), '_' ,
max(ifnull( (SELECT `value` FROM consumer_attributes yy WHERE `key` = 'year' and yy.id = dmy.id) , -1)) )
FROM consumer_attributes dmy
group by dmy.consumer_id
Another approach (if you have a lot of data you might want to time different answers);
SELECT c_id, CONCAT(d.d, '_', m.m, '_', y.y) dob
FROM (select c_id, value d FROM consumer_attributes WHERE `key`='day') d
NATURAL JOIN (select c_id, value m FROM consumer_attributes WHERE `key`='month') m
NATURAL JOIN (select c_id, value y FROM consumer_attributes WHERE `key`='year') y;

How to select a primary key which has exact foreign keys matches a given list of values?

For example:
pk_ref fk
====== ===
1 a
1 b
1 c
2 a
2 b
2 d
How do I do a query like the "pseudo" query:
select distinct pk_ref
where fk in all('a', 'c');
The return query result must match all given values for the foreign key in the list.
The result should be:
1
While the following select must not return any records.
select distinct pk_ref
where fk in all('a', 'c', 'd');
How do I do that?
Try this
select pk_ref
from yourtable
group by pk_ref
having count(case when fk = 'a' then 1 end) >= 1
and count(case when fk = 'c' then 1 end) >= 1
To do it dynamically (assuming you are using SQL Server), create a split-string function and pass the input as comma-separated values:
Declare @input varchar(8000) = 'a,c', @cnt int
set @cnt = len(@input) - len(replace(@input, ',', '')) + 1
select pk_ref
from yourtable
Where fk in (select split_values from udf_splitstring(@input, ','))
group by pk_ref
having count(Distinct fk) >= @cnt
You can create a split string function from the below link
https://sqlperformance.com/2012/07/t-sql-queries/split-strings
:list is the input list (bind variable). The difference of length() return values is the number of commas in the bind variable. This query, or something very close to it, should work in pretty much any DB product. Tested in Oracle.
select pk_ref
from tbl -- enter your table name here
where ',' || :list || ',' like '%,' || fk || ',%'
group by pk_ref
having count(distinct fk) = 1 + length(:list) - length(replace(:list, ',', ''))
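The filter-then-count idea behind these queries (keep only rows whose fk is in the list, then require as many distinct fk values as the list has entries) can be sketched as follows. This uses Python's sqlite3 module rather than Oracle or SQL Server, passes the list as bound parameters instead of a comma-separated string, and the helper name keys_matching_all is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (pk_ref INT, fk TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "c"), (2, "a"), (2, "b"), (2, "d")],
)

def keys_matching_all(required):
    """Return pk_ref values that have every fk in `required` (hypothetical helper)."""
    wanted = sorted(set(required))  # de-duplicate so the target count is right
    placeholders = ", ".join("?" for _ in wanted)
    sql = (
        "SELECT pk_ref FROM tbl "
        f"WHERE fk IN ({placeholders}) "
        "GROUP BY pk_ref "
        "HAVING COUNT(DISTINCT fk) = ? "
        "ORDER BY pk_ref"
    )
    return [row[0] for row in conn.execute(sql, [*wanted, len(wanted)])]

print(keys_matching_all(["a", "c"]))       # [1]
print(keys_matching_all(["a", "c", "d"]))  # []
```

COUNT(DISTINCT fk) rather than COUNT(*) keeps the comparison correct when the table contains duplicate (pk_ref, fk) rows.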
If you can pass the IN operator values as a set, you can do it as follows.
Schema:
SELECT * INTO #TAB FROM (
SELECT 1 ID, 'a' FK
UNION ALL
SELECT 1, 'b'
UNION ALL
SELECT 1, 'c'
UNION ALL
SELECT 2, 'a'
UNION ALL
SELECT 2, 'b'
UNION ALL
SELECT 2, 'd'
UNION ALL
SELECT 1, 'a'
)AS A
A CTE is used to supply 'a','c' as a set:
;WITH CTE AS (
SELECT 'a' FK --Here 'a','c' passed as a Set through CTE
UNION
SELECT 'c'
)
,FINAL AS(
SELECT DENSE_RANK() OVER (PARTITION BY ID ORDER BY (FK))AS COUNT_ID, ID, FK
FROM #TAB where FK IN (select FK FROM CTE)
)
SELECT ID FROM FINAL WHERE COUNT_ID>=(SELECT COUNT( FK) FROM CTE)
select pk_ref from yourtable where fk = 'a' and pk_ref in (select pk_ref from yourtable where fk = 'c');
or
select pk_ref from yourtable where fk = 'a' intersect select pk_ref from yourtable where fk = 'c';
DECLARE @inputVariable VARCHAR(200) = 'a,b,c,d'
DECLARE @inputValue INT
DECLARE @tblInput TABLE
(
FK VARCHAR(100)
)
-- Split the comma-separated input into one row per (single-character) value
INSERT INTO @tblInput
SELECT SUBSTRING( @inputVariable+',',RN,1)
FROM (SELECT TOP 100 ROW_NUMBER() OVER(ORDER BY s.object_id) RN
FROM sys.objects s) s
where LEN(@inputVariable) >= RN
AND SUBSTRING(','+ @inputVariable,RN,1) = ','
SELECT @inputValue = COUNT(1) FROM @tblInput
--@inputVariable
DECLARE @tbl TABLE
(
ID INT,
FK VARCHAR(100)
)
INSERT INTO @tbl
SELECT 1 ID, 'a' FK
UNION ALL
SELECT 1, 'b'
UNION ALL
SELECT 1, 'c'
UNION ALL
SELECT 2, 'a'
UNION ALL
SELECT 2, 'b'
UNION ALL
SELECT 2, 'd'
UNION ALL
SELECT 1, 'a'
SELECT t.ID ,COUNT(DISTINCT t.FK)
FROM @tbl t
INNER JOIN @tblInput ti
ON t.FK = ti.FK
GROUP BY ID
HAVING COUNT(DISTINCT t.FK) = @inputValue

MySQL SUM DISTINCT with Conditional

I need to gather sums using conditional statements as well as DISTINCT values
with a multiple GROUP BY. The example below is a simplified version of a much much more complex query.
Because the real query is very large, I need to avoid having to drastically re-write the query.
DATA
Contracts
id advertiser_id status
1 1 1
2 2 1
3 3 2
4 1 1
A Query that's close
SELECT
COUNT( DISTINCT advertiser_id ) AS advertiser_qty,
COUNT( DISTINCT id ) AS contract_qty,
SUM( IF( status = 1, 1, 0 ) ) AS current_qty,
SUM( IF( status = 2, 1, 0 ) ) AS expired_qty,
SUM( IF( status = 3, 1, 0 ) ) AS other_qty
FROM (
SELECT * FROM `contracts`
GROUP BY advertiser_id, id
) AS temp
Currently Returns
advertiser_qty contract_qty current_qty expired_qty other_qty
3 4 3 1 0
Needs to Return
advertiser_qty contract_qty current_qty expired_qty other_qty
3 4 2 1 0
Where current_qty is 2 which is the sum of records with status = 1 for only DISTINCT advertiser_ids and each sum function will need the same fix.
I hope someone has a simple solution that can plug into the SUM functions.
-Thanks!!
try this
SELECT
COUNT( DISTINCT advertiser_id ) AS advertiser_qty,
COUNT( DISTINCT id ) AS contract_qty,
(select count(distinct advertiser_id) from contracts where status =1
) AS current_qty,
SUM( IF( status = 2, 1, 0 ) ) AS expired_qty,
SUM( IF( status = 3, 1, 0 ) ) AS other_qty
FROM (
SELECT * FROM `contracts`
GROUP BY advertiser_id, id
) AS temp
EDIT:
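You may be looking for a version without a subselect. Note that COUNT(DISTINCT advertiser_id, status = 1) would count distinct (advertiser_id, flag) pairs and return 3 here, so the condition goes inside the counted expression instead:
SELECT COUNT(DISTINCT advertiser_id) AS advertiser_qty,
COUNT(DISTINCT id) AS contract_qty,
COUNT(DISTINCT IF(status = 1, advertiser_id, NULL)) AS current_qty,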
SUM(IF(status = 2, 1, 0)) AS expired_qty,
SUM(IF(status = 3, 1, 0)) AS other_qty
FROM (SELECT *
FROM `contracts`
GROUP BY advertiser_id, id) AS temp
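A conditional distinct count can be plugged into each column by counting an expression that is NULL whenever the condition fails, since COUNT(DISTINCT ...) ignores NULLs. The sketch below checks this against the sample contracts data using Python's sqlite3 module (an assumption for runnability; CASE WHEN is used in place of MySQL's IF()).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contracts (id INT, advertiser_id INT, status INT)")
conn.executemany("INSERT INTO contracts VALUES (?, ?, ?)",
                 [(1, 1, 1), (2, 2, 1), (3, 3, 2), (4, 1, 1)])

# Each CASE yields advertiser_id only for rows in the wanted status and NULL
# otherwise, so every advertiser is counted at most once per status.
counts = conn.execute("""
    SELECT COUNT(DISTINCT advertiser_id)                                AS advertiser_qty,
           COUNT(DISTINCT id)                                           AS contract_qty,
           COUNT(DISTINCT CASE WHEN status = 1 THEN advertiser_id END)  AS current_qty,
           COUNT(DISTINCT CASE WHEN status = 2 THEN advertiser_id END)  AS expired_qty,
           COUNT(DISTINCT CASE WHEN status = 3 THEN advertiser_id END)  AS other_qty
    FROM contracts
""").fetchone()
print(counts)  # (3, 4, 2, 1, 0)
```

This matches the "Needs to Return" row in the question, with current_qty equal to 2.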

How do I fill gaps in weekly data?

I have a table with 4 fields (id, Year, Week, Totals).
I need a query, I guess using join, to fill zero values based on the year and week fields.
In my example I need to fill zero values for the weeks 3 and 4 / Year 2013
Rec Id, Year, Week, Totals
1, '2012', '52', '23'
2, '2013', '1' , '9'
3, '2013', '2' , '4'
Missing record from DB -> null, '2013', '3' , '0'
Missing record from DB -> null, '2013', '4' , '0'
4, '2013', '5' , '5'
5, '2013', '6' , '6'
6, '2013', '7' , '5'
That was a fun one! OK, here we go. First off, I'll give you the simple version, which relies on a couple of assumptions:
1. You have at least one entry in your table already for each year.
2. Every week number appears in your table in at least one year. That is, this query returns all numbers from 1 to 52:
SELECT DISTINCT week FROM your_table
Given those constraints, this query should do what you want:
INSERT INTO your_table (id, year, week, totals)
SELECT null, y, w, 0 FROM (
SELECT DISTINCT week w FROM your_table
) weeks
CROSS JOIN
(
SELECT DISTINCT year y FROM your_table
) years
WHERE
(y > (select min(year) from your_table) OR w > (select min(week) from your_table where `year`=y))
AND
(y < (select max(year) from your_table) OR w < (select max(week) from your_table where `year`=y))
AND
NOT EXISTS (select year, week from your_table where `year`=y AND `week`=w)
If condition 2 might not be satisfied (that is, if some weeks are missing in every year), you can replace this line
SELECT DISTINCT week w FROM your_table
with
SELECT
(TWO_1.SeqValue + TWO_2.SeqValue + TWO_4.SeqValue + TWO_8.SeqValue + TWO_16.SeqValue + TWO_32.SeqValue) w
FROM
(SELECT 0 SeqValue UNION ALL SELECT 1 SeqValue) TWO_1
CROSS JOIN (SELECT 0 SeqValue UNION ALL SELECT 2 SeqValue) TWO_2
CROSS JOIN (SELECT 0 SeqValue UNION ALL SELECT 4 SeqValue) TWO_4
CROSS JOIN (SELECT 0 SeqValue UNION ALL SELECT 8 SeqValue) TWO_8
CROSS JOIN (SELECT 0 SeqValue UNION ALL SELECT 16 SeqValue) TWO_16
CROSS JOIN (SELECT 0 SeqValue UNION ALL SELECT 32 SeqValue) TWO_32
HAVING w >= 1 AND w <= 52
Giving this more general case:
INSERT INTO your_table (id, year, week, totals)
SELECT null, y, w, 0 FROM (
SELECT
(TWO_1.SeqValue + TWO_2.SeqValue + TWO_4.SeqValue + TWO_8.SeqValue + TWO_16.SeqValue + TWO_32.SeqValue) w
FROM
(SELECT 0 SeqValue UNION ALL SELECT 1 SeqValue) TWO_1
CROSS JOIN (SELECT 0 SeqValue UNION ALL SELECT 2 SeqValue) TWO_2
CROSS JOIN (SELECT 0 SeqValue UNION ALL SELECT 4 SeqValue) TWO_4
CROSS JOIN (SELECT 0 SeqValue UNION ALL SELECT 8 SeqValue) TWO_8
CROSS JOIN (SELECT 0 SeqValue UNION ALL SELECT 16 SeqValue) TWO_16
CROSS JOIN (SELECT 0 SeqValue UNION ALL SELECT 32 SeqValue) TWO_32
HAVING w >= 1 AND w <= 52
) weeks
CROSS JOIN
(
SELECT DISTINCT year y FROM your_table
) years
WHERE
(y > (select min(year) from your_table) OR w > (select min(week) from your_table where `year`=y))
AND
(y < (select max(year) from your_table) OR w < (select max(week) from your_table where `year`=y))
AND
NOT EXISTS (select year, week from your_table where `year`=y AND `week`=w)
(You can use a similar technique to generate the list of years if condition 1 isn't satisfied, but I'm guessing you don't have entire year-long holes.)
Finally, this could be simplified a bit if you have a unique index on year and week. If you do not yet have such an index, you could create it like so:
ALTER TABLE `your_table` ADD CONSTRAINT date UNIQUE (
`year`,
`week`
)
and if you want, you could remove it when you're done, like so:
ALTER TABLE `your_table` DROP INDEX date;
In that case, if you also change INSERT INTO to INSERT IGNORE INTO, the final part of the where clause can be removed:
AND
NOT EXISTS (select year, week from your_table where `year`=y AND `week`=w)
because INSERT IGNORE will skip any rows for which that unique year/week combination already exists.
Kudos to this answer for the range-generating code: https://stackoverflow.com/a/8349837/160565
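The range-generating subquery works because every integer from 0 to 63 is the sum of exactly one choice of 0-or-power-of-two from each of the six derived tables. A quick way to convince yourself, sketched in Python rather than SQL (the variable names are illustrative only):

```python
from itertools import product

# Each "table" contributes either 0 or one power of two, mirroring
# TWO_1 ... TWO_32 in the query; the CROSS JOINs form the Cartesian product.
bit_tables = [(0, 1), (0, 2), (0, 4), (0, 8), (0, 16), (0, 32)]
all_sums = sorted(sum(combo) for combo in product(*bit_tables))

# The HAVING clause keeps only valid week numbers.
weeks = [w for w in all_sums if 1 <= w <= 52]
print(len(all_sums), weeks[0], weeks[-1], len(weeks))  # 64 1 52 52
```

The 64 combinations produce 0 through 63 with no repeats, so after filtering, each week from 1 to 52 appears exactly once.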

Get column name which has the max value in a row sql

I have a table in my database where I store categories for news articles, and each time a user reads an article it increments the value in the associated column. Like this:
Now I want to execute a query where I can get the column names with the 4 highest values for each record. For example for user 9, it would return this:
I've tried several things, searched a lot but don't know how to do it. Can anyone help me?
This should do it:
select
userid,
max(case when rank=1 then name end) as `highest value`,
max(case when rank=2 then name end) as `2nd highest value`,
max(case when rank=3 then name end) as `3rd highest value`,
max(case when rank=4 then name end) as `4th highest value`
from
(
select userID, @rownum := @rownum + 1 AS rank, name, amt from (
select userID, Buitenland as amt, 'Buitenland' as name from newsarticles where userID = 9 union
select userID, Economie, 'Economie' from newsarticles where userID = 9 union
select userID, Sport, 'Sport' from newsarticles where userID = 9 union
select userID, Cultuur, 'Cultuur' from newsarticles where userID = 9 union
select userID, Wetenschap, 'Wetenschap' from newsarticles where userID = 9 union
select userID, Media, 'Media' from newsarticles where userID = 9
) amounts, (SELECT @rownum := 0) r
order by amt desc
limit 4
) top4
group by userid
Demo: http://www.sqlfiddle.com/#!2/ff624/11
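The core of this approach is to unpivot the category columns into (name, amount) rows and then sort. A minimal sketch of just that step follows, using Python's sqlite3 module instead of MySQL. The counts inserted for user 9 are made up for illustration, since the question's sample data was posted as an image and is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE newsarticles (userID INT, Buitenland INT, Economie INT,
                Sport INT, Cultuur INT, Wetenschap INT, Media INT)""")
# Hypothetical read counts for user 9 (not taken from the question).
conn.execute("INSERT INTO newsarticles VALUES (9, 5, 12, 3, 8, 1, 7)")

categories = ["Buitenland", "Economie", "Sport", "Cultuur", "Wetenschap", "Media"]
# One SELECT per column turns the wide row into six (name, amt) rows.
unpivot = " UNION ALL ".join(
    f"SELECT '{c}' AS name, {c} AS amt FROM newsarticles WHERE userID = :uid"
    for c in categories
)
top4 = [row[0] for row in conn.execute(
    f"SELECT name FROM ({unpivot}) ORDER BY amt DESC, name LIMIT 4", {"uid": 9}
)]
print(top4)  # ['Economie', 'Cultuur', 'Media', 'Buitenland']
```

The secondary sort on name is an addition here so that ties between equal counts come out in a deterministic order.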
A very simple way of doing this is shown below
select userId,
substring_index(four_highest,',',1) as 'highest value',
substring_index(substring_index(four_highest,',',2),',',-1) as '2nd highest value',
substring_index(substring_index(four_highest,',',3),',',-1) as '3rd highest value',
substring_index(four_highest,',',-1) as '4th highest value'
from
(
-- concatenate the column names (not the counts), highest count first
select userid, convert(group_concat(col order by val desc) using utf8) as four_highest from
(
select userId,Buitenland as val,'Buitenland' as col from test where userid=9 union
select userId,Economie as val,'Economie' as col from test where userid=9 union
select userId,Sport as val ,'Sport' as col from test where userid=9 union
select userId,Cultuur as val,'Cultuur' as col from test where userid=9 union
select userId,Wetenschap as val,'Wetenschap' as col from test where userid=9 union
select userId,Media as val,'Media' as col from test where userid=9 order by val desc limit 4
) inner_query
)outer_query;
PL/SQL, maybe? Set user_id, query your table, store the returned row in an nx2 array of column names and values (where n is the number of columns) and sort the array based on the values.
Of course, the correct thing to do is redesign your database in the manner that @octern suggests.
This will get you started with the concept of grabbing the highest value from multiple columns on a single row (modify for your specific tables - I created a fake one).
create table fake
(
id int Primary Key,
col1 int,
col2 int,
col3 int,
col4 int
)
insert into fake values (1, 5, 9, 27, 10)
insert into fake values (2, 3, 5, 1, 20)
insert into fake values (3, 89, 9, 27, 6)
insert into fake values (4, 17, 40, 1, 20)
SELECT *,(SELECT Max(v)
FROM (VALUES (col1), (col2), (col3), (col4) ) AS value(v))
FROM fake