Referencing the outer query from the case of a subquery in SQL Server 2014 - sql-server-2014

I'm trying to build a stored procedure that transforms a lot of data from a very condensed table, keyed to look up any given piece of information, into a wide table with all of the information in columns per group. The goal is to preprocess the information and reduce the work at the point of use. I've structured my query like this:
select distinct a.group_,
    (select value from mytable b where a.group_ = b.group_) as [Information 1]
from mytable a
However, when I wrap the subquery in a case expression, the reference to the outer query breaks:
select distinct a.group_,
    (case
        when (select value from mytable b where a.group_ = b.group_) is not null -- this breaks
        then (select value from mytable b where a.group_ = b.group_)
        else (select anothervalue from mytable b where a.group_ = b.group_)
    end) as [Information 1]
from mytable a
I thought about a workaround with a simple case to find whether the value is null and execute an else branch, but I found that my 'is not null' didn't work in the when clause, and I needed to reference the outer query for the else anyway. So ultimately, I need some way to conditionally select values for one column and keep them tied to the group that I'm trying to transform. Any help would be appreciated, thanks.
Edit:
Below is an example. To clarify, I need to be able to conditionally select information from potentially multiple sources into the same column for any given group. It will also run over a large amount of data, so it needs to be as computationally cheap as possible. From this table, I will combine all of these groups in different combinations to get one line per collection of groups, with at least one column per group in the final table. It's a little more complicated than that, but that's the general idea.
if OBJECT_ID(N'tempdb..#tt1') is not null
    drop table #tt1

declare @counter INT
set @counter = 1

create table #tt1 (group_ int, id1 int, id2 int, info varchar(10))

while (@counter <= 5)
begin
    insert into #tt1 (group_, id1, id2, info)
    select @counter, @counter, @counter, CONCAT('info ', cast(@counter as varchar(10)))
    set @counter = @counter + 1
end

set @counter = 1
while (@counter <= 5)
begin
    insert into #tt1 (group_, id1, id2, info)
    select @counter, @counter, @counter + 1, CONCAT('info ', cast(@counter + 5 as varchar(10)))
    set @counter = @counter + 1
end
select distinct a.group_,
(select info from #tt1 b where a.group_ = b.group_ and id1 = 1 and id2=2) as Group1Info6
from #tt1 a
--The above works fine
select distinct a.group_,
    (case
        when (select info from #tt1 b where a.group_ = b.group_ and id1 = 1 and id2 = 2) is < 6
        then (select info from #tt1 b where a.group_ = b.group_ and id1 = 1 and id2 = 2)
        else (select info from #tt1 b where a.group_ = b.group_ and id1 = 1 and id2 = 1)
    end) as Group1Info
from #tt1 a
--The above does not.
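The comparison `is < 6` is not valid T-SQL, and repeating the same correlated subquery three times is fragile. One common workaround is to evaluate the lookup once via a LEFT JOIN (or OUTER APPLY) and fall back with COALESCE. Below is a minimal sketch of the LEFT JOIN variant, using SQLite from Python to stand in for SQL Server (the join and COALESCE logic carry over; the data mirrors the #tt1 setup above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE tt1 (group_ INT, id1 INT, id2 INT, info TEXT)")
# Same rows as the two WHILE loops in the question's repro script.
for c in range(1, 6):
    cur.execute("INSERT INTO tt1 VALUES (?, ?, ?, ?)", (c, c, c, f"info {c}"))
for c in range(1, 6):
    cur.execute("INSERT INTO tt1 VALUES (?, ?, ?, ?)", (c, c, c + 1, f"info {c + 5}"))

# Evaluate each correlated lookup once in a LEFT JOIN instead of repeating
# the subquery inside CASE; COALESCE falls back to the (id1=1, id2=1) row.
rows = cur.execute("""
    SELECT DISTINCT a.group_,
           COALESCE(b.info, c.info) AS Group1Info
    FROM tt1 a
    LEFT JOIN tt1 b ON a.group_ = b.group_ AND b.id1 = 1 AND b.id2 = 2
    LEFT JOIN tt1 c ON a.group_ = c.group_ AND c.id1 = 1 AND c.id2 = 1
    ORDER BY a.group_
""").fetchall()
print(rows)  # group 1 resolves to 'info 6'; the other groups have no match
```

On SQL Server the same shape works with LEFT JOIN or OUTER APPLY, and the CASE/COALESCE can branch on whichever condition is needed instead of repeating the subquery.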
Edit2:
My desired results would look something like this. In my actual data there are many group 1's, with many group 1 info columns.
group_ | Group1Info | Group2Info | Group3Info | Group4Info | Group5Info
1      | info       |            |            |            |
2      |            | info       |            |            |
3      |            |            | info       |            |
4      |            |            |            | info       |
5      |            |            |            |            | info
Then I'll present the information more cleanly to the end user like this.
group_ | Group1Info | Group2Info | Group3Info | Group4Info | Group5Info
1-5    | info       | info       | info       | info       | info

Related

BigQuery and NOT IN

Sorry for my bad English; hopefully you understand what I want.
I want something like a pivot table (hopefully that's the right word).
For example I have a table with two columns: userid and domain
UserID Domain
1 | A
1 | B
1 | C
2 | A
2 | B
3 | A
2 | C
What I want is a table like the following, which captures the differences row-wise:
A B C
A 0 1 1
B 0 0 0
C 0 0 0
How to read the output?
Take the first row (0, 1, 1) as an example.
Consider all users who visited domain A (in our case users 1, 2 and 3). All of domain A's visitors were on domain A (I guess that's clear), so the first entry is 0. Did they all also visit domain B? No, one user (user 3) was not on domain B, so we write a 1. Now we check whether all domain A visitors were on domain C. Here, too, one user was missing: users 1 and 2 were on domain C, but user 3 was not, so we write a 1 again.
Second row - check which users were on domain B.
Users 1 and 2 were on domain B. Were they also on domain A? Yes, both, so we write down a 0. Were they on domain B? Yes, so 0. And on domain C? Yes, both, so we write a zero again.
Third row - check domain C.
Domain C has visitors 1 and 2. Both also visited domain A, so we have a zero. Both visited domain B? Yes, also zero, and the last entry is clearly zero since they all came from domain C.
To keep a long story short: I want to extract, for each domain, all exclusive visitors compared to the other domains.
I have been struggling for 2 days with left joins and case when and so on. Nothing works out.
Is there anybody out there with suggestions? It would be really helpful. And yes, I have more than 3 domains - I have around 200!
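To pin down the definition before turning to SQL, here is a small sketch in plain Python over the sample data above. It computes, for each ordered pair of domains (a, b), how many of a's visitors did not also visit b - exactly the matrix described:

```python
# (userid, domain) pairs from the question's example table.
visits = [(1, "A"), (1, "B"), (1, "C"), (2, "A"), (2, "B"), (3, "A"), (2, "C")]

# Collect the distinct visitor set of each domain.
users_by_domain = {}
for user, domain in visits:
    users_by_domain.setdefault(domain, set()).add(user)

# matrix[(a, b)] = number of a's visitors who never visited b
# (set difference does the "NOT IN" counting directly).
domains = sorted(users_by_domain)
matrix = {
    (a, b): len(users_by_domain[a] - users_by_domain[b])
    for a in domains for b in domains
}
print(matrix)
```

Row A comes out as (0, 1, 1) and rows B and C as all zeroes, matching the walkthrough above.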
A very, very big query :), but it works.
DROP PROCEDURE IF EXISTS dowhile;
CREATE PROCEDURE dowhile()
BEGIN
  SELECT @domain_arr := CONCAT(GROUP_CONCAT(domain SEPARATOR ','),',') AS domain_arr
  FROM ( SELECT t1.domain FROM user_domain t1 WHERE 1 GROUP BY t1.domain ) AS tt;

  DROP TABLE IF EXISTS temp_table;
  CREATE TEMPORARY TABLE temp_table (
    domain VARCHAR(100) NOT NULL
  );

  SET @domain_arr_table = @domain_arr;
  WHILE LOCATE(',', @domain_arr_table) > 0 DO
    SET @domain = SUBSTRING(@domain_arr_table, 1, LOCATE(',', @domain_arr_table) - 1);
    SET @domain_arr_table = SUBSTRING(@domain_arr_table, LOCATE(',', @domain_arr_table) + 1);
    SET @s = CONCAT('ALTER TABLE temp_table ADD COLUMN ', @domain, ' TINYINT DEFAULT 0');
    PREPARE stmt3 FROM @s;
    EXECUTE stmt3;
  END WHILE;

  WHILE LOCATE(',', @domain_arr) > 0 DO
    SET @domain = SUBSTRING(@domain_arr, 1, LOCATE(',', @domain_arr) - 1);
    SET @domain_arr = SUBSTRING(@domain_arr, LOCATE(',', @domain_arr) + 1);
    SELECT @user_count := COUNT(*) FROM user_domain WHERE domain = @domain;
    INSERT INTO temp_table (domain) VALUES (@domain);
    SELECT @domains_should_be_1 := CONCAT(GROUP_CONCAT(domain SEPARATOR ','),',')
    FROM (SELECT domain FROM user_domain
          WHERE user_id IN (SELECT user_id FROM user_domain WHERE domain = @domain)
          GROUP BY domain
          HAVING COUNT(*) < @user_count) AS tt2;
    WHILE LOCATE(',', @domains_should_be_1) > 0 DO
      SET @domain_sb_1 = SUBSTRING(@domains_should_be_1, 1, LOCATE(',', @domains_should_be_1) - 1);
      SET @domains_should_be_1 = SUBSTRING(@domains_should_be_1, LOCATE(',', @domains_should_be_1) + 1);
      SET @s = CONCAT("UPDATE temp_table SET ", @domain_sb_1, "='1' WHERE domain='", @domain, "'");
      SELECT @s;
      PREPARE stmt3 FROM @s;
      EXECUTE stmt3;
    END WHILE;
  END WHILE;
END;

call dowhile();
SELECT * FROM temp_table;
There are really two questions here
I want to extract all exclusive visitors of each domain compared to the other domains...
I want something like a Pivot table
Let me answer your questions one by one
So,
How to extract all exclusive visitors of each domain compared to the other domains...
Below is for BigQuery Standard SQL and produces a flattened version of your matrix:
#standardSQL
WITH `project.dataset.your_table` AS (
SELECT 1 userid, 'A' domain UNION ALL
SELECT 1, 'B' UNION ALL
SELECT 1, 'C' UNION ALL
SELECT 2, 'A' UNION ALL
SELECT 2, 'B' UNION ALL
SELECT 3, 'A' UNION ALL
SELECT 2, 'C'
), temp AS (
SELECT DISTINCT userid, domain
FROM `project.dataset.your_table`
)
SELECT
a.domain domain_a,
b.domain domain_b,
COUNT(DISTINCT a.userid) - COUNTIF(a.userid = b.userid) count_of_not_in
FROM temp a
CROSS JOIN temp b
GROUP BY a.domain, b.domain
-- HAVING count_of_not_in > 0
This will result in:
Row domain_a domain_b count_of_not_in
1 A A 0
2 A B 1
3 A C 1
4 B A 0
5 B B 0
6 B C 0
7 C A 0
8 C B 0
9 C C 0
I think in real life you will not have many zeroes in this data, so if you want to compress that flattened version, just uncomment the line with HAVING ... and you will get the "compact" version:
Row domain_a domain_b count_of_not_in
1 A B 1
2 A C 1
For the sake of exercise and fun, check out another approach below that produces exactly the same result but in a totally different way:
#standardSQL
WITH `project.dataset.your_table` AS (
SELECT 1 userid, 'A' domain UNION ALL
SELECT 1, 'B' UNION ALL
SELECT 1, 'C' UNION ALL
SELECT 2, 'A' UNION ALL
SELECT 2, 'B' UNION ALL
SELECT 3, 'A' UNION ALL
SELECT 2, 'C'
), domains AS (
SELECT domain, ARRAY_AGG(DISTINCT userid) users
FROM `project.dataset.your_table`
GROUP BY domain
)
SELECT
a.domain domain_a, b.domain domain_b,
ARRAY_LENGTH(a.users) -
(SELECT COUNT(1)
FROM UNNEST(a.users) user_a
JOIN UNNEST(b.users) user_b
ON user_a = user_b
) count_of_not_in
FROM domains a
CROSS JOIN domains b
-- ORDER BY a.domain, b.domain
Now,
How to pivot the above result to produce the actual matrix?
Ideally, pivoting should be done outside of BigQuery, in whatever visualization tool you usually use. But if for whatever reason you want it done within BigQuery, it is doable, and there is an enormous number of questions on SO related to this. One of the most recent ones that I have posted an answer for is https://stackoverflow.com/a/50300387/5221944 .
It shows how to generate/produce the pivot query that achieves the desired matrix.
It is relatively easy and can be done either manually as a two-step process (step 1 - generate the pivot query; step 2 - run the generated query) or implemented using any client of your choice.
You cannot (easily) express this as a matrix. But you can express it as a table with three columns: domain1, domain2, count.
with t as ( -- may not be necessary if the rows are already unique
select distinct userid, domain
from tab
)
select t1.domain as domain1, t2.domain as domain2, count(*)
from t t1 join
t t2
on t1.userid = t2.userid
group by t1.domain, t2.domain;
You cannot easily pivot the results in BigQuery into columns, unless you explicitly know the domains that you care about. You can aggregate them into columns, if you like.
For a given set of domains as columns, you can use conditional aggregation:
with t as ( -- may not be necessary if the rows are already unique
select distinct userid, domain
from tab
)
select t1.domain as domain1,
sum(case when t2.domain = 'amazon.com' then 1 else 0 end) as amazon,
sum(case when t2.domain = 'ebay.com' then 1 else 0 end) as ebay,
sum(case when t2.domain = 'yahoo.com' then 1 else 0 end) as yahoo
from t t1 join
     t t2
     on t1.userid = t2.userid
group by t1.domain;

Comparing n with (n-1) and (n-2) records in SQL

Write a SQL statement that generates the list of customers whose minutesStreamed is consistently less than their previous minutesStreamed - that is, minutesStreamed in the nth order is less than minutesStreamed in the (n-1)th order, and the order before that is also less. Put another way: list the customers who watch fewer and fewer minutes each time they watch a movie.
The table, query:
sqlfiddle link:
I have come up with the following query:
select distinct c1.customer_Id
from Customer c1
join Customer c2
where c1.customer_Id = c2.customer_Id
and c1.purchaseDate > c2.purchaseDate
and c1.minutesStreamed < c2.minutesStreamed;
This query doesn't handle the (n-1)st and (n-2)nd comparison, i.e. the condition that "minutesStreamed in the nth order is less than minutesStreamed in the (n-1)th order, and the order before that is also less."
I have attached a link for sqlfiddle, where I have created the table.
Hello Continuous Learner,
the following statement works for the n-1 and n-2 relation.
select distinct c1.customer_Id
from Customer c1
join Customer c2
on c1.customer_Id = c2.customer_Id
join Customer c3
on c1.customer_Id = c3.customer_Id
where c1.purchaseDate < c2.purchaseDate
and c1.minutesStreamed > c2.minutesStreamed
and c2.purchaseDate < c3.purchaseDate
and c2.minutesStreamed > c3.minutesStreamed
However, I currently don't have an automatic solution for this problem.
Cheers
I would use a ROW_NUMBER() function partitioned by customer id,
and then do a self join on customer id and rank = rank - 1 to bring the new and old rows onto the same level.
Like:
create table temp_rank_table as
(
  select
    customer_Id,
    purchaseDate,
    minutesStreamed,
    ROW_NUMBER() OVER (PARTITION BY customer_Id ORDER BY purchaseDate, minutesStreamed) as cust_row
  from Customer
)

-- self join
select A.customer_Id
from
( select
    newval.customer_Id,
    sum(case when newval.minutesStreamed < oldval.minutesStreamed then 1 else 0 end) as LessThanPrevCount,
    max(newval.cust_row) as totalStreamCount
  from temp_rank_table newval
  left join temp_rank_table oldval
    on newval.customer_id = oldval.customer_id
   and newval.cust_row - 1 = oldval.cust_row -- cust_row 2 matches cust_row 1
  group by newval.customer_id
) A
where A.LessThanPrevCount = (A.totalStreamCount - 1)
-- get customers who always stream less than the previous time
-- you can use a having clause instead of a subquery too
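The ranked-self-join idea above can be sketched end to end. Below is a minimal illustration using SQLite's window functions from Python (the sample rows are made up for the demo): rank each customer's streams by date, join row n to row n-1, and keep only customers where every consecutive pair decreased:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Customer (customer_Id INT, purchaseDate TEXT, minutesStreamed INT)")
con.executemany("INSERT INTO Customer VALUES (?, ?, ?)", [
    (1, "2020-01-01", 90), (1, "2020-01-05", 60), (1, "2020-01-09", 30),  # strictly decreasing
    (2, "2020-01-01", 50), (2, "2020-01-06", 70),                         # increases once
])

# A customer qualifies when the count of decreasing consecutive pairs
# equals (number of streams - 1), i.e. every pair decreased.
rows = con.execute("""
    WITH ranked AS (
        SELECT customer_Id, minutesStreamed,
               ROW_NUMBER() OVER (PARTITION BY customer_Id ORDER BY purchaseDate) AS cust_row
        FROM Customer
    )
    SELECT n.customer_Id
    FROM ranked n
    LEFT JOIN ranked o
      ON n.customer_Id = o.customer_Id AND n.cust_row - 1 = o.cust_row
    GROUP BY n.customer_Id
    HAVING SUM(CASE WHEN n.minutesStreamed < o.minutesStreamed THEN 1 ELSE 0 END)
           = MAX(n.cust_row) - 1
""").fetchall()
print(rows)  # only customer 1 streamed less every single time
```

This generalizes beyond n-1/n-2 to any number of streams per customer, which the triple-self-join version does not.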
DECLARE @TBL AS TABLE ([NO] INT, [CODE] VARCHAR(50), [AREA] VARCHAR(50))

/* EXAMPLE 1 */
INSERT INTO @TBL ([NO],[CODE],[AREA]) VALUES (1,'001','A00')
INSERT INTO @TBL ([NO],[CODE],[AREA]) VALUES (2,'001','A00')
INSERT INTO @TBL ([NO],[CODE],[AREA]) VALUES (3,'001','B00')
INSERT INTO @TBL ([NO],[CODE],[AREA]) VALUES (4,'001','C00')
INSERT INTO @TBL ([NO],[CODE],[AREA]) VALUES (5,'001','C00')
INSERT INTO @TBL ([NO],[CODE],[AREA]) VALUES (6,'001','A00')
INSERT INTO @TBL ([NO],[CODE],[AREA]) VALUES (7,'001','A00')

/* EXAMPLE 2 */
/* ***** USE THIS CODE TO ENTER DATA FROM A DIRECT TABLE *****
SELECT ROW_NUMBER() OVER (ORDER BY [FIELD_DATE]) AS [NO],
       [FIELD_CODE] AS [CODE],
       [FIELD_AREA] AS [AREA]
FROM TABLE_A
WHERE CAST([FIELD_DATE] AS DATE) >= CAST('20200307' AS DATE)
ORDER BY [FIELD_DATE],[FIELD_CODE]
*/

SELECT A.NO AS ANO, A.CODE AS ACODE, A.AREA AS AAREA,
       B.NO AS BNO, B.CODE AS BCODE, B.AREA AS BAREA,
       CASE WHEN A.AREA = B.AREA THEN 'EQUAL' ELSE 'NOT EQUAL' END AS [COMPARE AREA]
FROM @TBL A
LEFT JOIN @TBL B ON A.NO = B.NO + 1

MySQL one query grouped by n rows, max and min inside group, possible?

I don't think there is a question like this yet.
I need to group rows in sets of n records and get some values for each group.
I think it's better to explain with a graphic example:
Is it possible to do this in a query? If not, my solution will be a script that creates another table with this, but I don't like duplicating data at all.
Thanks!!!
set @counter = -1;

select xgroup, max(x) as mx, max(y) as my, avg(value3) as v3
from
(
  select (@counter := @counter + 1) as counter,
         @counter div 5 as xgroup,
         currency, datetime, value1, value2,
         case mod(@counter,5) when 0 then value1 else 00 end as x,
         case mod(@counter,5) when 4 then value2 else 00 end as y,
         mod(@counter,5) as xxx
  FROM findata
) name1
group by xgroup;
@jms has the right approach, but you have to be very careful when using variables:
You should not assign a variable in one expression and then reference it in another in the same select.
To work in the most recent versions of MySQL, I would suggest ordering the data in a subquery.
In addition, there are some other values that you need:
select min(col1), min(col2),
       max(case when mod(rn, 5) = 0 then col3 end),
       max(col4), min(col5),
       max(case when mod(rn, 5) = 4 or rn = @rn then col6 end),
       max(case when mod(rn, 5) = 4 or rn = @rn then col7 end)
from (select (@rn := @rn + 1) as rn, t.*
      from (select t.*
            from t
            order by col1, col2
           ) t cross join
           (select @rn := -1) params
     ) t
group by (rn div 5);
Note the logic is a bit arcane for the last values -- this is to take into account the final group that might not have exactly 5 rows.
You need a column that looks like this (assuming you want to group every 5 rows):
dummy_table
1
1
1
1
1
2
2
2
2
2
...
You can create this with generate_series() if you are using PostgreSQL:
select t1 from (select generate_series(1,x)) t1, (select generate_series(1,5)) t2;
where you replace x with (total rows / 5), i.e. for 100 rows, x = 20. If you are using any other SQL platform, you can work out how to create this dummy table accordingly.
Once you get this dummy_table, join it with your table on row_number of your table with t1 column of dummy_table(not row_number of dummy_table). Syntax for accessing row number should be straightforward.
After the join, group by this t1 column and do the required aggregation. To do this in a single query, you can do the above in an inner query and do aggregation outside it. Hope this makes sense.
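The same idea - bucket each row by a derived row number and aggregate per bucket - can be sketched in one query. This demo uses SQLite from Python, with made-up `findata` columns (`id`, `high`, `low`) just for illustration; integer division by 5 plays the role of the dummy-table join:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE findata (id INTEGER PRIMARY KEY, high INT, low INT)")
con.executemany("INSERT INTO findata (high, low) VALUES (?, ?)",
                [(i + 10, i) for i in range(1, 13)])  # 12 rows -> buckets of 5, 5, 2

# (row_number - 1) / 5 assigns rows 1-5 to bucket 0, 6-10 to bucket 1, etc.;
# the final partial bucket (rows 11-12) is aggregated like any other.
rows = con.execute("""
    SELECT xgroup, MAX(high) AS mx, MIN(low) AS mn
    FROM (SELECT (ROW_NUMBER() OVER (ORDER BY id) - 1) / 5 AS xgroup,
                 high, low
          FROM findata)
    GROUP BY xgroup
    ORDER BY xgroup
""").fetchall()
print(rows)
```

The inner query is the "dummy column"; the outer query is the group-by aggregation described above.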
Ok, thank you all for your answers; thanks to them I found a simple solution.
I simply add an auto-increment column, and then I can group results by integer division by 5.
And with this query:
SELECT id,
symbol,
datetime,
open,
MAX(high),
MIN(low),
SUBSTRING_INDEX( GROUP_CONCAT(CAST(close AS CHAR) ORDER BY datetime DESC), ',', 1 ) AS close
FROM `table`
GROUP BY (id-1) DIV 5
And the resulting is:
Thanks!
A solution is to introduce a field for grouping rows for aggregate operations.
This can be achieved by introducing a user variable and assigning values that allow grouping rows as required. For example, it can be a row counter divided by the grouping chunk size and rounded up to the nearest integer:
SET @counter=0;
SELECT CEIL((@counter:=@counter+1)/5) AS chunk, MAX(high), MIN(low) FROM `table` GROUP BY chunk;

Looking for missed IDs in SQL Server 2008

I have a table that contains two columns
ID | Name
----------------
1 | John
2 | Sam
3 | Peter
6 | Mike
It has missing IDs - in this case, 4 and 5.
How do I find them and insert them, together with random names, into this table?
Update: cursors and temp tables are not allowed. The random name should be 'Name_' + some random number, though it could just as well be a fixed value like 'Abby'; it doesn't matter.
Using a recursive CTE, you can determine the missing IDs as follows:
DECLARE @Table TABLE(
    ID INT,
    Name VARCHAR(10)
)

INSERT INTO @Table VALUES (1, 'John'),(2, 'Sam'),(3, 'Peter'),(6, 'Mike')

DECLARE @StartID INT,
        @EndID INT

SELECT @StartID = MIN(ID),
       @EndID = MAX(ID)
FROM @Table

;WITH IDS AS (
    SELECT @StartID IDEntry
    UNION ALL
    SELECT IDEntry + 1
    FROM IDS
    WHERE IDEntry + 1 <= @EndID
)
SELECT IDS.IDEntry [ID]
FROM IDS LEFT JOIN
     @Table t ON IDS.IDEntry = t.ID
WHERE t.ID IS NULL
OPTION (MAXRECURSION 0)
The option MAXRECURSION 0 allows the code to avoid SQL Server's recursion limit.
From Query Hints and WITH common_table_expression (Transact-SQL)
MAXRECURSION number Specifies the maximum number of recursions
allowed for this query. number is a nonnegative integer between 0 and
32767. When 0 is specified, no limit is applied. If this option is not specified, the default limit for the server is 100.
When the specified or default number for MAXRECURSION limit is reached
during query execution, the query is ended and an error is returned.
Because of this error, all effects of the statement are rolled back.
If the statement is a SELECT statement, partial results or no results
may be returned. Any partial results returned may not include all rows
on recursion levels beyond the specified maximum recursion level.
Generating the random names will largely be affected by the requirements for such a name and its column type. What exactly does this random name entail?
You can do this using a recursive Common Table Expression (CTE). Here's an example of how:
DECLARE @MaxId INT

SELECT @MaxId = MAX(ID) FROM MyTable

;WITH Numbers(Number) AS
(
    SELECT 1
    UNION ALL
    SELECT Number + 1 FROM Numbers WHERE Number < @MaxId
)
SELECT n.Number, 'Random Name'
FROM Numbers n
LEFT OUTER JOIN MyTable t ON n.Number = t.ID
WHERE t.ID IS NULL
Here are a couple of articles about CTEs that will be helpful: Using Common Table Expressions and Recursive Queries Using Common Table Expressions.
Start by selecting the highest number in the table (select top 1 ID ... order by ID desc, or select max(ID)), then run a while loop to iterate from 1 to max.
See this article about looping.
For each iteration, see if the row exists, and if not, insert into table, with that ID.
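The iterate-and-insert loop just described can be sketched client-side. This is a minimal illustration using SQLite from Python (the `People` table name is an assumption for the demo); the same loop could live in T-SQL as a WHILE block:

```python
import random
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE People (ID INTEGER PRIMARY KEY, Name TEXT)")
con.executemany("INSERT INTO People VALUES (?, ?)",
                [(1, "John"), (2, "Sam"), (3, "Peter"), (6, "Mike")])

# Loop from 1 to MAX(ID); insert a 'Name_<random number>' row wherever
# the ID is absent, exactly as the answer describes.
(max_id,) = con.execute("SELECT MAX(ID) FROM People").fetchone()
for i in range(1, max_id + 1):
    exists = con.execute("SELECT 1 FROM People WHERE ID = ?", (i,)).fetchone()
    if exists is None:
        con.execute("INSERT INTO People VALUES (?, ?)",
                    (i, f"Name_{random.randint(0, 9999)}"))

ids = con.execute("SELECT ID FROM People ORDER BY ID").fetchall()
print(ids)  # 1 through 6 with no gaps
```

Note that the question forbids cursors and temp tables in the database itself; a client-side loop like this sidesteps that restriction, though the set-based CTE answers remain faster.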
I think a recursive CTE is a better solution because it's going to be faster, but here is what worked for me:
IF EXISTS (SELECT * FROM sys.objects WHERE object_id = OBJECT_ID(N'[dbo].[TestTable]') AND type in (N'U'))
DROP TABLE [dbo].[TestTable]
GO
CREATE TABLE [dbo].[TestTable](
[Id] [int] NOT NULL,
[Name] [varchar](50) NOT NULL,
CONSTRAINT [PK_TestTable] PRIMARY KEY CLUSTERED
(
[Id] ASC
))
GO
INSERT INTO [dbo].[TestTable]([Id],[Name]) VALUES (1, 'John')
INSERT INTO [dbo].[TestTable]([Id],[Name]) VALUES (2, 'Sam')
INSERT INTO [dbo].[TestTable]([Id],[Name]) VALUES (3, 'Peter')
INSERT INTO [dbo].[TestTable]([Id],[Name]) VALUES (6, 'Mike')
GO
declare #mod int
select #mod = MAX(number)+1 from master..spt_values where [type] = 'P'
INSERT INTO [dbo].[TestTable]
SELECT y.Id,'Name_' + cast(newid() as varchar(45)) Name from
(
SELECT TOP (select MAX(Id) from [dbo].[TestTable]) x.Id from
(
SELECT
t1.number*#mod + t2.number Id
FROM master..spt_values t1
CROSS JOIN master..spt_values t2
WHERE t1.[type] = 'P' and t2.[type] = 'P'
) x
WHERE x.Id > 0
ORDER BY x.Id
) y
LEFT JOIN [dbo].[TestTable] on [TestTable].Id = y.Id
where [TestTable].Id IS NULL
GO
select * from [dbo].[TestTable]
order by Id
GO
http://www.sqlfiddle.com/#!3/46c7b/18
It's actually very simple:
Create a table called #All_numbers containing all the natural numbers in the range that you are looking for.
#list is a table containing your data:
select a.num as missing_number ,
'Random_Name' + convert(varchar, a.num)
from #All_numbers a left outer join #list l on a.num = l.Id
where l.id is null

How to fetch rows in column manner in sql?

I have records in my table like this:
memberId PersonId Year
4057 1787 2
4502 1787 3
I want a result from a query like this
memberId1 MemberId2 PersonId
4057 4502 1787
How do I write such a query?
Don't do this in a query, do it in the application layer.
Don't do this in SQL. At best you can try:
SELECT table1.memberId memberId1, table2.memberId MemberId2, PersonId
FROM table table1 JOIN table table2 USING (PersonId)
But it won't do what you expect if you have more than 2 Members for a person. (It will return every possible combination.)
Below is an example of how to do it directly in SQL. Mind that there is plenty of room for optimisation, but this version should be rather fast, especially if you have an index on PersonId and Year.
SELECT DISTINCT PersonID,
memberId1 = Convert(int, NULL),
memberId2 = Convert(int, NULL)
INTO #result
FROM myTable
WHERE Year IN (2 , 3)
CREATE UNIQUE CLUSTERED INDEX uq0_result ON #result (PersonID)
UPDATE upd
SET memberId1 = t.memberId
FROM #result upd
JOIN myTable t
  ON t.PersonId = upd.PersonID
 AND t.Year = 2

UPDATE upd
SET memberId2 = t.memberId
FROM #result upd
JOIN myTable t
  ON t.PersonId = upd.PersonID
 AND t.Year = 3
SELECT * FROM #result
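The seed-then-update approach above can be demonstrated in a few lines. This sketch uses SQLite from Python, with correlated-subquery UPDATEs standing in for T-SQL's UPDATE ... FROM (which SQLite only supports in recent versions); the data is the question's two-row example:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE myTable (memberId INT, PersonId INT, Year INT)")
con.executemany("INSERT INTO myTable VALUES (?, ?, ?)",
                [(4057, 1787, 2), (4502, 1787, 3)])

# Step 1: seed one result row per person with empty member columns.
con.execute("""
    CREATE TABLE result AS
    SELECT DISTINCT PersonId, NULL AS memberId1, NULL AS memberId2
    FROM myTable WHERE Year IN (2, 3)
""")
# Step 2: fill each member column with a targeted update per Year value.
con.execute("""
    UPDATE result SET memberId1 =
        (SELECT memberId FROM myTable t
         WHERE t.PersonId = result.PersonId AND t.Year = 2)
""")
con.execute("""
    UPDATE result SET memberId2 =
        (SELECT memberId FROM myTable t
         WHERE t.PersonId = result.PersonId AND t.Year = 3)
""")
rows = con.execute("SELECT PersonId, memberId1, memberId2 FROM result").fetchall()
print(rows)  # one row per person, members side by side
```

The output matches the desired shape in the question: one row (1787, 4057, 4502).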
If you want all member ids for each PersonId, you could use the [for xml path] statement (great functionality)
to concatenate all memberIds into a string:
select distinct PersonId
, (select ' '+cast(t0.MemberId as varchar)
from table t0
where t0.PersonId=t1.PersonId
for xml path('')
) [Member Ids]
from table t1
resulting in:
PersonId Members Ids
1787 ' 4057 4502'
If you really need separate columns, with an unlimited number of memberIds, consider using
a PIVOT table, but that is far more complex to use.