school_name  class  medium   total
srk          1      english  13
srk          2      english  14
srk          3      english  15
srk          1      french   16
srk          2      french   16
srk          3      french   18
vrk          1      english  17
vrk          1      french   18
I want the output in this form:
school_name  class1eng  class1french  class2eng  class2french  class3english  class3french
You're looking for conditional aggregation: one SUM(CASE ...) expression per output column.
This should work for you:
Select
school_name,
Sum(Case when (class=1 and medium='english') then total else 0 end) as class1english,
Sum(Case when (class=1 and medium='french') then total else 0 end) as class1french,
Sum(Case when (class=2 and medium='english') then total else 0 end) as class2english,
Sum(Case when (class=2 and medium='french') then total else 0 end) as class2french,
Sum(Case when (class=3 and medium='english') then total else 0 end) as class3english,
Sum(Case when (class=3 and medium='french') then total else 0 end) as class3french
From
table_name
Group by
school_name
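To try the conditional-aggregation query without a database server, here is a minimal sketch that runs it against the sample rows in an in-memory SQLite database. The table name `school_totals`, the helper `pivot_school_totals`, and the choice of SQLite are assumptions made for illustration; the question does not say which engine is in use.

```python
import sqlite3

# Sample rows copied from the question: (school_name, class, medium, total).
ROWS = [
    ("srk", 1, "english", 13), ("srk", 2, "english", 14), ("srk", 3, "english", 15),
    ("srk", 1, "french", 16), ("srk", 2, "french", 16), ("srk", 3, "french", 18),
    ("vrk", 1, "english", 17), ("vrk", 1, "french", 18),
]

# One SUM(CASE ...) per output column; combinations with no row fall back to 0.
PIVOT_SQL = """
SELECT school_name,
       SUM(CASE WHEN class = 1 AND medium = 'english' THEN total ELSE 0 END) AS class1english,
       SUM(CASE WHEN class = 1 AND medium = 'french'  THEN total ELSE 0 END) AS class1french,
       SUM(CASE WHEN class = 2 AND medium = 'english' THEN total ELSE 0 END) AS class2english,
       SUM(CASE WHEN class = 2 AND medium = 'french'  THEN total ELSE 0 END) AS class2french,
       SUM(CASE WHEN class = 3 AND medium = 'english' THEN total ELSE 0 END) AS class3english,
       SUM(CASE WHEN class = 3 AND medium = 'french'  THEN total ELSE 0 END) AS class3french
FROM school_totals
GROUP BY school_name
ORDER BY school_name
"""


def pivot_school_totals(rows):
    """Load rows into a throwaway SQLite table (hypothetical name) and pivot them."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE school_totals "
        "(school_name TEXT, class INTEGER, medium TEXT, total INTEGER)"
    )
    conn.executemany("INSERT INTO school_totals VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(PIVOT_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in pivot_school_totals(ROWS):
        print(row)
```

With the sample data, vrk has no rows for classes 2 and 3, so those four columns come back as 0 because of the ELSE 0 branch.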
This seems to be a simple ask; I assumed you also want to order your results. Please check whether the query below helps:
SELECT school_name, class, medium, SUM(total) AS Total
FROM <Table Name>
GROUP BY school_name, class, medium
ORDER BY school_name, class, medium
This solution is general-purpose: complex, but functional.
I wrote it for myself as an exercise and a challenge.
/* --------------- TABLE --------------- */
CREATE TABLE schools_tab
(school VARCHAR(9), class INT, subj VARCHAR(9), total INT);
INSERT INTO schools_tab VALUES
('srk', 1, 'english', 13),
('srk', 2, 'english', 14),
('srk', 3, 'english', 15),
('srk', 1, 'french', 16),
('srk', 2, 'french', 16),
('srk', 3, 'french', 18),
('vrk', 1, 'english', 17),
('vrk', 1, 'french', 18);
/* -------------- DYNAMIC QUERY --------------- */
SET @sql = NULL;
WITH cte AS (
SELECT school, class, subj, ROW_NUMBER() OVER (PARTITION BY school) AS idx, DENSE_RANK() OVER (ORDER BY school) AS ids
FROM (SELECT DISTINCT school FROM schools_tab) A LEFT JOIN (SELECT DISTINCT class, subj FROM schools_tab) B ON (1=1)
), cte2 AS (
SELECT A.ids, A.idx, A.school, A.class, A.subj, COALESCE(B.total, 0) AS total
FROM cte A LEFT JOIN schools_tab B ON (A.school=B.school AND A.class=B.class AND A.subj=B.subj)
), cte3 AS (
SELECT DISTINCT class, subj
FROM schools_tab
ORDER BY class, subj
)
SELECT CONCAT('WITH RECURSIVE cte AS (
SELECT school, class, subj, ROW_NUMBER() OVER (PARTITION BY school) AS idx, DENSE_RANK() OVER (ORDER BY school) AS ids
FROM (SELECT DISTINCT school FROM schools_tab) A LEFT JOIN (SELECT DISTINCT class, subj FROM schools_tab) B ON (1=1)
), cte2 AS (
SELECT A.ids, A.idx, A.school, A.class, A.subj, COALESCE(B.total, 0) AS total
FROM cte A LEFT JOIN schools_tab B ON (A.school=B.school AND A.class=B.class AND A.subj=B.subj)
), ctx AS ('
'SELECT (SELECT MAX(ids) FROM cte2) AS n,',
GROUP_CONCAT(DISTINCT CONCAT( '(SELECT total FROM cte2 WHERE idx=',idx,' AND ids=n) AS class',class,subj ) ORDER BY class, subj),
' UNION ALL SELECT n-1 AS n,',
GROUP_CONCAT(DISTINCT CONCAT( '(SELECT total FROM cte2 WHERE idx=',idx,' AND ids=n) AS class',class,subj ) ORDER BY class, subj),
' FROM ctx WHERE n>0',
') SELECT DISTINCT SUBSTRING_INDEX(SUBSTRING_INDEX(''srk,vrk'', '','', n+1), '','', -1) AS school,',
GROUP_CONCAT(DISTINCT CONCAT('class',class,subj)),
' FROM ctx ORDER BY school'
) INTO @sql
FROM cte2;
PREPARE stmt1 FROM @sql;
EXECUTE stmt1;
Lately, I have been trying to pivot a table in Snowflake, replicating a transformation that is currently done in pandas as follows.
I have a dataframe like the below:
I have been able to convert this into the following format:
Using code below:
dd = pd.pivot(df[['customerid', 'filter_', 'sum', 'count', 'max']], index='customerid', columns='filter_')
dd = dd.set_axis(dd.columns.map('_'.join), axis=1).reset_index()
I have been trying to do this in Snowflake but am unable to get the same format. Here's what I have tried:
with temp as (
SELECT $1 as customerid, $2 as perfiosid, $3 as filter_, $4 as sum_, $5 as count_, $6 as max_
FROM
VALUES ('a', 'b', 'c', 10, 100, 1000),
('a', 'b', 'c1', 9, 900, 9000),
('a', 'b', 'c2', 80, 800, 8000),
('x', 'b', 'c', 10, 100, 1000),
('x', 'b', 'c1', 9, 900, 9000),
('x', 'b', 'c2', 80, 800, 8000))
,
cte as (
select *, 'SUM_' as idx
from temp pivot ( max(sum_) for filter_ in ('c', 'c1', 'c2'))
union all
select *, 'COUNT_' as idx
from temp pivot ( max(count_) for filter_ in ('c', 'c1', 'c2'))
union all
select *, 'MAX_' as idx
from temp pivot ( max(max_) for filter_ in ('c', 'c1', 'c2'))
order by customerid, perfiosid
)
-- select * from cte;
select customerid, perfiosid, idx, max("'c'") as c, max("'c1'") as c1, max("'c2'") as c2
from cte
group by 1, 2, 3
order by 1, 2, 3
The output I get from this is:
Note: I have 3k fixed filters per customerid and 18 columns like sum, count, max, min, stddev, etc., so the final output must have 54k columns (3k × 18) for each customerid. How can I achieve this while staying within Snowflake's 1 MB statement size limit?
Using conditional aggregation:
with temp as (
SELECT $1 as customerid, $2 as perfiosid, $3 as filter_, $4 as sum_, $5 as count_, $6 as max_
FROM
VALUES ('a', 'b', 'c', 10, 100, 1000),
('a', 'b', 'c1', 9, 900, 9000),
('a', 'b', 'c2', 80, 800, 8000),
('x', 'b', 'c', 10, 100, 1000),
('x', 'b', 'c1', 9, 900, 9000),
('x', 'b', 'c2', 80, 800, 8000)
)
SELECT customerid,
SUM(CASE WHEN FILTER_ = 'c' THEN SUM_ END) AS SUM_C,
SUM(CASE WHEN FILTER_ = 'c1' THEN SUM_ END) AS SUM_C1,
SUM(CASE WHEN FILTER_ = 'c2' THEN SUM_ END) AS SUM_C2,
SUM(CASE WHEN FILTER_ = 'c' THEN COUNT_ END) AS COUNT_C,
SUM(CASE WHEN FILTER_ = 'c1' THEN COUNT_ END) AS COUNT_C1,
SUM(CASE WHEN FILTER_ = 'c2' THEN COUNT_ END) AS COUNT_C2,
MAX(CASE WHEN FILTER_ = 'c' THEN MAX_ END) AS MAX_C,
MAX(CASE WHEN FILTER_ = 'c1' THEN MAX_ END) AS MAX_C1,
MAX(CASE WHEN FILTER_ = 'c2' THEN MAX_ END) AS MAX_C2
FROM temp
GROUP BY customerid;
Output:
To stay within the 1 MB query limit, the output could be split and materialized in temporary tables first, like:
CREATE TEMPORARY TABLE t_SUM
AS
SELECT customer_id,
SUM(...)
FROM tab
GROUP BY customer_id;
CREATE TEMPORARY TABLE t_COUNT
AS
SELECT customer_id,
SUM(...)
FROM tab
GROUP BY customer_id;
CREATE TEMPORARY TABLE t_MAX
AS
SELECT customer_id,
MAX(...)
FROM tab
GROUP BY customer_id;
Combined query:
SELECT *
FROM t_SUM AS s
JOIN t_COUNT AS c
ON s.customer_id = c.customer_id
JOIN t_MAX AS m
ON m.customer_id = c.customer_id
-- ...
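With thousands of filters, the SUM(CASE ...) columns would have to be generated rather than typed by hand. The following is a minimal sketch of such a generator, checked here against SQLite rather than Snowflake. The helper names (`quote_literal`, `build_pivot_sql`, `run_demo`) and the table name `temp_data` are hypothetical, and the sketch assumes every filter value is also safe to use as part of a column alias.

```python
import sqlite3

# Sample rows from the question:
# (customerid, perfiosid, filter_, sum_, count_, max_)
SAMPLE = [
    ("a", "b", "c", 10, 100, 1000), ("a", "b", "c1", 9, 900, 9000),
    ("a", "b", "c2", 80, 800, 8000), ("x", "b", "c", 10, 100, 1000),
    ("x", "b", "c1", 9, 900, 9000), ("x", "b", "c2", 80, 800, 8000),
]

# (source column, aggregate) pairs, mirroring the hand-written answer.
MEASURES = [("sum_", "SUM"), ("count_", "SUM"), ("max_", "MAX")]


def quote_literal(value):
    """Quote a string as an SQL literal by doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_pivot_sql(filters, measures, table="temp_data"):
    """Build one conditional-aggregation column per (measure, filter) pair."""
    cols = []
    for column, agg in measures:
        for f in filters:
            alias = f"{column}{f}".upper()  # e.g. sum_ + c1 -> SUM_C1
            cols.append(
                f"{agg}(CASE WHEN filter_ = {quote_literal(f)} "
                f"THEN {column} END) AS {alias}"
            )
    select_list = ",\n       ".join(cols)
    return (
        f"SELECT customerid,\n       {select_list}\n"
        f"FROM {table}\nGROUP BY customerid\nORDER BY customerid"
    )


def run_demo():
    """Run the generated statement against the sample rows in SQLite."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE temp_data (customerid TEXT, perfiosid TEXT, filter_ TEXT, "
        "sum_ INTEGER, count_ INTEGER, max_ INTEGER)"
    )
    conn.executemany("INSERT INTO temp_data VALUES (?, ?, ?, ?, ?, ?)", SAMPLE)
    rows = conn.execute(build_pivot_sql(["c", "c1", "c2"], MEASURES)).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(build_pivot_sql(["c", "c1", "c2"], MEASURES))
    print(run_demo())
```

Calling `build_pivot_sql` with one measure at a time gives the SELECT for each temporary table separately, which keeps every individual statement smaller.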
You cannot ask for 54K sets of three columns in one query, because the 50,000th set (if precomputed into tables as Lukasz suggests) looks like
s.s_50000 as sum_50000,
c.c_50000 as count_50000,
m.m_50000 as max_50000,
which is about 75 bytes, and 54K * 75 = 4,050,000 bytes. Even if you really have 54K columns in total (18K sets of 3 columns), that is still about 1.35 MB, so too large.
That means that once you have built your temp tables, as suggested by Lukasz, you would have to use:
select s.customer_id, s.*, c.*, m.*
from sums as s
join counts as c on s.customer_id = c.customer_id
join maxs as m on m.customer_id = c.customer_id
But building each of those temp tables needs 18K columns like
SUM(IFF(FILTER_='c18000',SUM_,null)) AS SUM_18000
which is about 50 bytes each, so 18K of those lines take about 900 KB. That is just under the 1 MB limit, so it might work.
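A quick way to check this kind of estimate is to generate the column list and measure it. The sketch below does that for the column pattern quoted above; the function name `statement_size` is made up, and the count ignores the surrounding CREATE TABLE ... SELECT ... FROM ... GROUP BY text, which adds only a few dozen bytes.

```python
def statement_size(n_filters,
                   template="SUM(IFF(FILTER_='c{i}',SUM_,null)) AS SUM_{i}"):
    """Return the UTF-8 size in bytes of n_filters generated column expressions,
    joined the way they would appear in a SELECT list (comma plus newline)."""
    lines = [template.format(i=i) for i in range(1, n_filters + 1)]
    return len(",\n".join(lines).encode("utf-8"))


if __name__ == "__main__":
    size = statement_size(18_000)
    print(f"18,000 columns: {size:,} bytes")
    print("fits in 1 MB:", size < 1_000_000)
```

For 18,000 filters this comes out at roughly 0.9 MB, so a single-measure statement fits, but with little headroom for longer filter names or aliases.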
But you may then run into problems like this person, who started having problems with 8K columns:
https://community.snowflake.com/s/question/0D50Z00007CZcqmSAD/what-is-limit-on-number-of-columns-how-to-do-a-sparse-table
All of which is to say that what you are doing seems to be of very low value. What system is going to make sense of 50K+ columns of data, yet cannot handle processing many rows? It feels like a case of "with Tool A we know how to do Z and not Y, so Tool B must produce answers in Z format" instead of working with the natural concepts of Y.
I have the following table
(cl1 , cl2)
---- ----
(a , 1)
(a , 2)
(b , 2)
(c , 1)
(c , 2)
Each of a, b and c can take the value 1, 2, or both.
My question is:
How do I insert a new row (with 0 in cl2) for every cl1 that has only 1 or only 2, but not both? In the example, I would like to insert the following row:
----
(b , 0)
----
I'm sure there are better ways, but here is one way to do it using group by and a having clause to enforce your rules (I'm assuming Oracle syntax):
insert into tbl (cl1, cl2)
(select cl1, 0
from tbl
group by cl1
having count(case when cl2 in (1, 2) then 'X' end) != 0 -- contains 1 or 2
and (count(case when cl2 = 1 then 'X' end) = 0 -- but not both
or count(case when cl2 = 2 then 'X' end) = 0)
)
EDIT
A much simpler way:
insert into tbl (cl1, cl2)
(select cl1, 0
from tbl
where cl2 in (1, 2)
group by cl1
having count(distinct cl2) = 1
)
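As a sanity check of the simpler version, here is a small sketch that runs the same INSERT ... SELECT against the question's rows in an in-memory SQLite database. SQLite is an assumption chosen for convenience (the answer targets Oracle), and `add_zero_rows` is a hypothetical helper name.

```python
import sqlite3

# Rows from the question: only 'b' has a single distinct value in cl2.
SAMPLE = [("a", 1), ("a", 2), ("b", 2), ("c", 1), ("c", 2)]


def add_zero_rows(rows):
    """Insert (cl1, 0) for every cl1 with exactly one of the values 1 and 2."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (cl1 TEXT, cl2 INTEGER)")
    conn.executemany("INSERT INTO tbl VALUES (?, ?)", rows)
    conn.execute(
        """
        INSERT INTO tbl (cl1, cl2)
        SELECT cl1, 0
        FROM tbl
        WHERE cl2 IN (1, 2)              -- ignore any other values, e.g. existing 0 rows
        GROUP BY cl1
        HAVING COUNT(DISTINCT cl2) = 1   -- has 1 or 2, but not both
        """
    )
    result = conn.execute("SELECT cl1, cl2 FROM tbl ORDER BY cl1, cl2").fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in add_zero_rows(SAMPLE):
        print(row)
```

Because of the WHERE cl2 IN (1, 2) filter, a (b, 0) row that already exists does not change the distinct count for b, although running the statement twice would still insert a second (b, 0) row.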
I am assuming that the DB is Oracle. Hope the below snippet helps.
SELECT B.CL1,
0
FROM
(SELECT A.CL1,
CASE
WHEN WMSYS.WM_CONCAT(A.CL2) LIKE '%1%'
AND WMSYS.WM_CONCAT(A.CL2) LIKE '%2%'
THEN 'both'
ELSE 'one'
END rnk
FROM
(SELECT 'a' cl1,1 cl2 FROM dual
UNION ALL
SELECT 'a' cl1,2 cl2 FROM dual
UNION ALL
SELECT 'b' cl1,2 cl2 FROM dual
UNION ALL
SELECT 'c' cl1,1 cl2 FROM dual
UNION ALL
SELECT 'c' cl1,2 cl2 FROM dual
)A
GROUP BY A.CL1
)B
WHERE B.rnk = 'one';
CREATE TABLE TestTable (cl1 VARCHAR(2), cl2 INT);
INSERT INTO TestTable (cl1, cl2) VALUES ('a', 1), ('a', 2), ('b', 1), ('c', 1), ('c', 2);
INSERT INTO TestTable (cl1, cl2)
SELECT cl1, 0
FROM TestTable
WHERE cl1 NOT IN (
SELECT cl1
FROM TestTable
WHERE cl2 IN (1, 2)
GROUP BY cl1
HAVING COUNT(DISTINCT cl2) = 2
);
MySQL Demo: http://rextester.com/XWHGF50183
The block below returns the cl1 values that have both cl2 = 1 and cl2 = 2. Applying NOT IN to that result gives the rows you need.
SELECT cl1
FROM TestTable
WHERE cl2 IN (1, 2)
GROUP BY cl1
HAVING COUNT(DISTINCT cl2) = 2
Help from this answer
Here you go:
insert into [YOUR TABLE NAME]
select cl1,0 from [YOUR TABLE NAME]
group by cl1 having count(distinct cl2)<> 2
;
I have this table
**Original Table**
year month duration amount per month
2012 5 3 2000
and I want to get this
**Result table**
year month duration amount per month
2012 5 1 2000
2012 6 1 2000
2012 7 1 2000
Note how the duration of a project (this is a project) is 3 and the "amount per month" is 2000, so I added two more rows to show that the next months (6 and 7) will have an "amount per month" as well. How do I do that with sql/tsql?
Try this for SQL Server; I included my test table variable:
declare @temp as table
(
[year] int
, [month] int
, [duration] int
, [amount] int
)
insert into @temp
(
[year]
, [month]
, [duration]
, [amount]
)
VALUES(
2012
,5
,3
,2000
)
SELECT
[year]
,[month] + n.number
,1
,[amount]
, '1' + SUBSTRING(CAST([duration] AS varchar(10)), 2, 1000) AS Items
FROM @temp
JOIN master..spt_values n
ON n.type = 'P'
AND n.number < CONVERT(int, [duration])
Please see the script below, which may work for your requirement. I have also accounted for the month rolling over into the next calendar year. Please test and let me know.
DECLARE @temp AS TABLE([Year] INT,[Month] INT,Duration INT,Amount INT)
INSERT INTO @temp([year], [month], Duration, Amount)
VALUES (2011, 5, 3, 2000),(2012, 11, 3, 3000),(2013, 9, 12, 1000);
;WITH cte_datefix
AS (
SELECT [Year],
[Month],
Duration,
Amount,
CAST(CAST([Year] AS VARCHAR(4)) + RIGHT('00' + CAST([Month] AS VARCHAR(2)), 2) + '01' AS DATE) AS [Date]
FROM @temp
),
cte_Reslut
AS (SELECT [Year],
[Month],
Duration,
Amount,
[Date],
1 AS Months
FROM cte_datefix
UNION ALL
SELECT t.[Year],
t.[Month],
t.Duration,
t.Amount,
DATEADD(M, Months, t.[Date]) AS [Date],
cr.Months + 1 AS Months
FROM cte_Reslut AS cr
INNER JOIN cte_datefix AS t
ON t.[Year] = cr.[Year]
WHERE cr.Months < cr.Duration
)
SELECT YEAR([Date]) AS [Year],
MONTH([Date]) AS [Month],
1 AS Duration,
Amount
FROM cte_Reslut
ORDER BY [Date]
For those wondering how to increment the year if needed, here is an example building on Suing's response (really easy, just include two case statements):
select
2012 as [year]
,11 as [month]
,5 as [duration]
,2000 as [amount]
into #temp
select * from #temp
SELECT
case
when [month] + n.number > 12
then [year] + 1
else [year]
end as [year]
,case
when [month] + n.number > 12
then [month] + n.number - 12
else [month] + n.number
end as newMonth
,1 as newDuration
,[amount]
, '1' + SUBSTRING(CAST([duration] AS varchar(10)), 2, 1000) AS Items
FROM #temp
JOIN master..spt_values n
ON n.type = 'P'
AND n.number < CONVERT(int, [duration])
drop table #temp
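The two CASE expressions handle a rollover into the next year; the same idea can be written with integer division and modulo so that durations longer than a year also work. Below is a minimal sketch using a recursive CTE in SQLite (an assumption made for a self-contained demo, not the T-SQL from the answers). The table name `projects`, the helper `expand_projects`, and the column `offset_` are hypothetical.

```python
import sqlite3

EXPAND_SQL = """
WITH RECURSIVE expanded(year, month, duration, amount, offset_) AS (
    -- anchor: one row per project, at month offset 0
    SELECT year, month, duration, amount, 0
    FROM projects
    WHERE duration > 0
    UNION ALL
    -- step: add one month until the duration is used up
    SELECT year, month, duration, amount, offset_ + 1
    FROM expanded
    WHERE offset_ + 1 < duration
)
SELECT year + (month - 1 + offset_) / 12 AS year,    -- integer division
       (month - 1 + offset_) % 12 + 1    AS month,   -- wraps 13 back to 1
       1                                 AS duration,
       amount
FROM expanded
ORDER BY 1, 2
"""


def expand_projects(rows):
    """Turn each (year, month, duration, amount) row into `duration` monthly rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE projects (year INTEGER, month INTEGER, duration INTEGER, amount INTEGER)"
    )
    conn.executemany("INSERT INTO projects VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(EXPAND_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in expand_projects([(2012, 11, 3, 2000)]):
        print(row)
```

Shifting the month to a zero-based value before dividing is what keeps December (month 12) in the current year instead of pushing it into the next one.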
I have a table like this:
ID Type
----------
1 sent
1 sent
1 open
1 bounce
1 click
2 sent
2 sent
2 open
2 open
2 click
I want a query to return results like this:
ID sent open bounce click
1 2 1 1 1
2 2 2 0 1
Just can't work out how to do it. Thanks.
try PIVOT
SELECT ID,[sent],[open],[bounce],[click]
FROM your_table
PIVOT (COUNT([Type])
FOR [Type] in ([sent],[open],[bounce],[click]))p
SQL Fiddle Demo
Select Id,
count(case when type='sent' then 1 end) as [sent],
count(case when type='open' then 1 end) as [open]
From your_table
Group by Id
If that won't give you the exact answer then try count (distinct case....) :)
You can get this result using PIVOT or GROUP BY, and you can even handle values in the Type column that are not known in advance:
Test data:
CREATE TABLE #t(ID INT, Type VARCHAR(100))
INSERT #t
VALUES
(1, 'sent'),
(1, 'sent'),
(1, 'open'),
(1, 'bounce'),
(1, 'click'),
(2, 'sent'),
(2, 'sent'),
(2, 'open'),
(2, 'open'),
(2, 'click')
PIVOT approach:
SELECT pvt.*
FROM #t
PIVOT
(
COUNT(Type) FOR Type IN ([sent], [open], [bounce], [click])
) pvt
If there are other possible values for Type and you don't know them in advance use dynamic PIVOT:
DECLARE @cols NVARCHAR(1000) = STUFF(
(
SELECT DISTINCT ',[' + Type + ']'
FROM #t
FOR XML PATH('')
), 1, 1, '')
DECLARE @query NVARCHAR(2000) =
'
SELECT pvt.*
FROM #t
PIVOT
(
COUNT(Type) FOR Type IN ('+@cols+')
) pvt
'
EXEC(@query)
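The same two-step idea (read the distinct values first, then build the statement from them) can be sketched outside T-SQL as well. The example below uses Python with an in-memory SQLite database and conditional aggregation instead of PIVOT, since SQLite has no PIVOT operator; the table name `events` and the helper `dynamic_pivot` are hypothetical.

```python
import sqlite3

# Rows from the question: (ID, Type).
EVENTS = [
    (1, "sent"), (1, "sent"), (1, "open"), (1, "bounce"), (1, "click"),
    (2, "sent"), (2, "sent"), (2, "open"), (2, "open"), (2, "click"),
]


def dynamic_pivot(rows):
    """Count rows per id and type, with one output column per distinct type."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER, type TEXT)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)

    # Step 1: discover the column values at run time.
    types = [r[0] for r in conn.execute("SELECT DISTINCT type FROM events ORDER BY type")]

    # Step 2: build one COUNT(CASE ...) per value. The value is bound as a
    # parameter; the alias is a quoted identifier with embedded quotes doubled.
    select_list = ["id"]
    for t in types:
        alias = t.replace('"', '""')
        select_list.append(f'COUNT(CASE WHEN type = ? THEN 1 END) AS "{alias}"')
    sql = f"SELECT {', '.join(select_list)} FROM events GROUP BY id ORDER BY id"

    cursor = conn.execute(sql, types)
    header = [d[0] for d in cursor.description]
    data = cursor.fetchall()
    conn.close()
    return header, data


if __name__ == "__main__":
    header, data = dynamic_pivot(EVENTS)
    print(header)
    for row in data:
        print(row)
```

The columns come out in alphabetical order of Type here because of the ORDER BY in the first step; a different ORDER BY would change the column order.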
If you have known fixed values for Type, you can also use:
SELECT ID,
COUNT(CASE WHEN Type = 'sent' THEN 1 END) [sent],
COUNT(CASE WHEN Type = 'open' THEN 1 END) [open],
COUNT(CASE WHEN Type = 'bounce' THEN 1 END) bounce,
COUNT(CASE WHEN Type = 'click' THEN 1 END) click
FROM #t
GROUP BY ID