SQL Server 2008: How to select only one row based on dates

Select MIN([appdate]), CustID, CustAtt, Surname, Firstname, itemType
From table
Where appDate > (getdate())
Group by CustID, CustAtt
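| CustAtt | CustID | Surname | Firstname | appDate | itemType |
|---------|--------|---------|-----------|---------|----------|
| 53247 | 20675 | A | AX | | |
| 49535 | 12/FX08 | B | BX | 14/08/2017 | solid |
| 70433 | 400039 | C | CX | | |
| 67119 | 413555 | D | DX | | |
| 51406 | 27/EY07 | E | EX | 14/07/2017 | Liquid |
| 51406 | 27/EY08 | E | EX | 13/09/2017 | Gas |
| 51406 | 27/EY09 | E | EX | 11/12/2017 | Solid A |
| 51406 | 27/EY10 | E | EX | 06/06/2018 | Liquid A |
| 82820 | 410053 | F | FX | | |
| 52395 | 29/FA72 | G | GX | 25/09/2017 | Gas A |
| 89488 | 414282 | H | HX | | |
| 55855 | 412799 | I | IX | 30/08/2017 | Solid |
| 55855 | 412799 | I | IX | 21/08/2017 | Liquid |
| 53248 | 16/EK15 | J | JX | 06/07/2017 | Gas |
| 53248 | 16/EK15 | J | JX | 17/07/2017 | Solid B |
| 89835 | 911528 | K | KX | 08/05/2018 | Solid B |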
The above is a snippet of the output from my script, but it isn't the output I want. I want to select only one row for each CustID & CustAtt from my dataset, based on the appDate column.
The desired output should contain the next earliest date after the current date; if several rows share the same date, any one of them is acceptable.
The output table should yield:
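| CustAtt | CustID | Surname | Firstname | appDate | itemType |
|---------|--------|---------|-----------|---------|----------|
| 53247 | 20675 | A | AX | | |
| 49535 | 12/FX08 | B | BX | 14/08/2017 | solid |
| 70433 | 400039 | C | CX | | |
| 67119 | 413555 | D | DX | | |
| 51406 | 27/EY07 | E | EX | 14/07/2017 | Liquid |
| 82820 | 410053 | F | FX | | |
| 52395 | 29/FA72 | G | GX | 25/09/2017 | Gas A |
| 89488 | 414282 | H | HX | | |
| 55855 | 412799 | I | IX | 21/08/2017 | Liquid |
| 53248 | 16/EK15 | J | JX | 06/07/2017 | Gas |
| 89835 | 911528 | K | KX | 08/05/2018 | Solid B |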

Example: http://sqlfiddle.com/#!6/c35e2/1
create table t(
id int,
x int,
y int,
z int,
a int,
b int,
c datetime
);
-- Build t table
INSERT INTO t values
(1, 1, 2, 3, 4, 5, getdate()),
(2, 1, 2, 3, 4, 5, dateadd(hh,+3,getdate())),
(3, 1, 2, 3, 4, 5, dateadd(hh,+1,getdate())),
(4, 1, 2, 3, 4, 5, dateadd(hh,+2,getdate())),
(5, 1, 2, 3, 4, 5, dateadd(hh,-2,getdate())),
(6, 1, 2, 3, 4, 5, dateadd(hh,-1,getdate())),
(7, 1, 2, 3, 4, 5, dateadd(hh,+1,getdate())),
(8, 1, 2, 3, 4, 5, dateadd(hh,+1,getdate())),
(9, 1, 2, 3, 4, 5, dateadd(hh,+1,getdate()));
select top 1 t.id, s.*, t.c
from t
join (
select x,y,z,a,b
from t
group by x,y,z,a,b
having count(*) > 1
) s
on s.x = t.x
and s.y = t.y
and s.z = t.z
and s.a = t.a
and s.b = t.b
and (getdate() < t.c)
order by t.c

You have to use the HAVING clause. It filters the groups after GROUP BY has been applied.
SELECT
MIN([appdate]),
id,
type
FROM
[table]
WHERE
appdate > GETDATE()
GROUP BY
id, type
HAVING
COUNT(*) > 1 -- select only records, that have duplicates, based on our "group by"
ORDER BY
id
Added 12.07.17 (for the revised question):
Number the rows (partition by CustID & CustAtt, order by appDate) and then select the first row of each partition.
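;WITH cte AS (
SELECT
*,
ROW_NUMBER() OVER(
PARTITION BY CustID, CustAtt ORDER BY appDate
) AS rn -- 1 = earliest appDate within each CustID/CustAtt group
FROM
[table] -- TABLE is a reserved word, so the placeholder name needs brackets
)
SELECT
*
FROM
cte
WHERE
rn = 1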
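
The ROW_NUMBER approach can be tried out without a SQL Server instance. The following is only a sketch, assuming Python's bundled sqlite3 module (SQLite 3.25 or later for window functions) as a stand-in; the table name appts, the ISO-formatted date strings and the helper next_appointments are made up for illustration and are not part of the original question. Unlike the query above, it also filters on the date, keeping rows without an appointment.

```python
import sqlite3

# Hypothetical stand-in table; dates are ISO strings so that plain string
# comparison orders them chronologically.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE appts (CustAtt INTEGER, CustID TEXT, appDate TEXT, itemType TEXT)"
)
conn.executemany(
    "INSERT INTO appts VALUES (?, ?, ?, ?)",
    [
        (55855, "412799", "2017-08-30", "Solid"),
        (55855, "412799", "2017-08-21", "Liquid"),
        (53248, "16/EK15", "2017-07-06", "Gas"),
        (53248, "16/EK15", "2017-07-17", "Solid B"),
        (53247, "20675", None, None),  # customer without any appointment
    ],
)


def next_appointments(today):
    """Return one row per (CustID, CustAtt): the earliest appDate after `today`."""
    sql = """
        WITH ranked AS (
            SELECT CustAtt, CustID, appDate, itemType,
                   ROW_NUMBER() OVER (
                       PARTITION BY CustID, CustAtt ORDER BY appDate
                   ) AS rn
            FROM appts
            WHERE appDate > :today OR appDate IS NULL
        )
        SELECT CustAtt, CustID, appDate, itemType
        FROM ranked
        WHERE rn = 1
        ORDER BY CustAtt
    """
    return conn.execute(sql, {"today": today}).fetchall()


for row in next_appointments("2017-07-01"):
    print(row)
```

Filtering inside the CTE, before the rows are numbered, is what makes rn = 1 mean "next upcoming" rather than "earliest ever".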

Related

How to count occurrences when the gap between the values is greater than x

Considering a simple MySQL table with an id column and an int column, I need to count how many times I have a gap equal to or greater than a certain value.
Let's say that value will be 10.
Given the following sample records:
{1, 2, 3} = 1 time
{1, 2, 3, 4, 5, 6, 7, 8, 9} = 1 time;
{1, 2, 3, 14, 17} = 2 times (1, 2, 3 and 14, 17);
{1, 2, 3, 14, 20, 40, 42} = 3 times (1, 2, 3 and 14, 20 and 40, 42);
Is it possible to solve that with MySQL?
Yes. For a table t with columns id and num it looks like this:
SET @n = 10;
SELECT 1 + SUM(COALESCE(t3.f, 0))
FROM (
SELECT DISTINCT t1.num, (
-- flag a gap to the next larger value that is equal to or greater than @n
SELECT CASE WHEN t2.num - t1.num >= @n THEN 1 ELSE 0 END
FROM t t2
WHERE t2.num > t1.num
ORDER BY num LIMIT 1
) AS f
FROM t t1
) t3
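
To check the counting rule against the sample sets from the question, here is a small sketch of the same idea outside the database: sort the distinct values and start a new group whenever the jump to the next value is at least the threshold. The function name count_groups is an assumption made for this illustration.

```python
def count_groups(values, gap=10):
    """Count groups of values separated by jumps of at least `gap`."""
    ordered = sorted(set(values))
    if not ordered:
        return 0
    # Each jump of at least `gap` between neighbouring values starts a new group.
    breaks = sum(1 for prev, cur in zip(ordered, ordered[1:]) if cur - prev >= gap)
    return 1 + breaks


samples = [
    [1, 2, 3],
    [1, 2, 3, 4, 5, 6, 7, 8, 9],
    [1, 2, 3, 14, 17],
    [1, 2, 3, 14, 20, 40, 42],
]
for sample in samples:
    print(sample, count_groups(sample))
```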

How to get combined column values from multiple tables with different columns

I am trying to combine multiple columns from three tables. I can do it with UNION ALL, but I suspect the query I use is not the most efficient one.
For example:
create table tbl1
(id int, act varchar(50), stk varchar(50), price int, vol int, amt float);
insert into tbl1 values
(1, 'a1', 's1', 10, 5, 50),
(2, 'a1', 's2', 5, 5, 25),
(3, 'a2', 's1', 15, 3, 45),
(4, 'a2', 's2', 20, 2, 40),
(5, 'a2', 's2', 20, 2, 40);
create table tbl2 (id int, tid int, price int, vol int, amt float);
insert into tbl2 values
(1, 1, 5, 3, 15),(2, 1, 5, 1, 5),(3, 1, 15, 1, 15),
(4, 2, 5, 3, 15),(5, 2, 6, 2, 12);
create table tbl3 (id int, act varchar(10), type int, amt float);
insert into tbl3 values
(1, 'a1', 0, 10),(2, 'a1', 1, 15),
(3, 'a2',1, 5),(4, 'a3',0, 5);
The query I used
SELECT act, stk, amt FROM tbl1
UNION ALL
SELECT
(select act from tbl1 where tbl2.tid = tbl1.id) amt,
(select stk from tbl1 where tbl2.tid = tbl1.id) stk,
amt
from tbl2
Is there a way to get the same result without using the inner select queries twice? Could someone please suggest a more efficient query?
Expected output (amt from all three tables where act='a1')
ACT STK AMT
a1 s1 50
a1 s2 25
a1 s1 15
a1 s1 5
a1 s1 15
a1 s1 10
a1 s1 15
Just use an explicit join:
SELECT act, stk, amt
FROM tbl1
UNION ALL
SELECT t1.act, t1.stk, t2.amt
FROM tbl2 t2 JOIN
tbl1 t1
ON t2.tid = t1.id;
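
As a quick check of the join-based answer, the sketch below loads the question's tbl1 and tbl2 data into an in-memory SQLite database through Python's sqlite3 module and runs the UNION ALL with an explicit join. SQLite is only a stand-in for the asker's database, the act = 'a1' filter is taken from the expected-output description, and tbl3 is left out because the answer does not cover it.

```python
import sqlite3

union_db = sqlite3.connect(":memory:")
union_db.executescript("""
    CREATE TABLE tbl1 (id INT, act VARCHAR(50), stk VARCHAR(50),
                       price INT, vol INT, amt FLOAT);
    INSERT INTO tbl1 VALUES
        (1, 'a1', 's1', 10, 5, 50), (2, 'a1', 's2', 5, 5, 25),
        (3, 'a2', 's1', 15, 3, 45), (4, 'a2', 's2', 20, 2, 40),
        (5, 'a2', 's2', 20, 2, 40);
    CREATE TABLE tbl2 (id INT, tid INT, price INT, vol INT, amt FLOAT);
    INSERT INTO tbl2 VALUES
        (1, 1, 5, 3, 15), (2, 1, 5, 1, 5), (3, 1, 15, 1, 15),
        (4, 2, 5, 3, 15), (5, 2, 6, 2, 12);
""")

# One join replaces the two correlated subqueries from the question.
union_sql = """
    SELECT act, stk, amt FROM tbl1 WHERE act = 'a1'
    UNION ALL
    SELECT t1.act, t1.stk, t2.amt
    FROM tbl2 t2
    JOIN tbl1 t1 ON t2.tid = t1.id
    WHERE t1.act = 'a1'
"""
union_rows = sorted(union_db.execute(union_sql).fetchall())
for row in union_rows:
    print(row)
```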

How to select only non-zero values from multiple table conditional sum query

I am having a tough time eliminating the rows where a particular expression evaluates to zero; any help here is highly appreciated.
Here are my two simple tables
create table tbl1
(id int, account varchar(50), stock varchar(50), price int, vol int);
insert into tbl1 values
(1, 'a1', 's1', 10, 5),
(2, 'a1', 's2', 5, 5),
(3, 'a2', 's1', 15, 3),
(4, 'a2', 's2', 20, 2),
(5, 'a2', 's2', 20, 2);
create table tbl2
(id int, tid int, price int, vol int);
insert into tbl2 values
(1, 1, 5, 3),
(2, 1, 5, 1),
(3, 1, 15, 1),
(4, 2, 5, 3),
(5, 2, 6, 2);
My select is as follows, it gives me what I need but it also gives me the rows where (t1.vol - ifnull(Sum(t2.vol), 0)) returns zero
select
t1.id,account,stock,
(t1.vol - ifnull(Sum(t2.vol), 0)) vol
from tbl1 t1
left join tbl2 t2 on t1.id=t2.tid
group by t1.id
Could somebody help me in getting rid of these zero values?
I tried having (t1.vol - ifnull(Sum(t2.vol), 0)) <> 0 ==> it says vol is invalid column
I tried where (t1.vol - ifnull(Sum(t2.vol), 0)) <> 0 ==> it says Invalid use of group function
here is the output I get now with the above query
ID ACCOUNT STOCK VOL
1 a1 s1 0
2 a1 s2 0
3 a2 s1 3
4 a2 s2 2
5 a2 s2 2
SOLUTION:
select
t1.id,account,stock,
(t1.vol - ifnull(Sum(t2.vol), 0)) vol
from tbl1 t1
left join tbl2 t2 on t1.id=t2.tid
group by t1.id
having vol <> 0
Alternatively, you can aggregate tbl2 in a derived table first and filter in the WHERE clause:
select t1.id,
t1.account,
t1.stock,
(t1.vol - coalesce(tab.vol_total,0)) as vol
from tbl1 t1
left join
(
select tid,Sum(vol) as vol_total
from tbl2
group by tid
) tab
on t1.id=tab.tid
where (t1.vol - coalesce(tab.vol_total,0)) <> 0
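
The derived-table variant can be checked against the question's sample data. This is a sketch only, using Python's sqlite3 module with an in-memory database as a stand-in for MySQL; the alias names sold and remaining are chosen for this illustration.

```python
import sqlite3

vol_db = sqlite3.connect(":memory:")
vol_db.executescript("""
    CREATE TABLE tbl1 (id INT, account VARCHAR(50), stock VARCHAR(50),
                       price INT, vol INT);
    INSERT INTO tbl1 VALUES
        (1, 'a1', 's1', 10, 5), (2, 'a1', 's2', 5, 5),
        (3, 'a2', 's1', 15, 3), (4, 'a2', 's2', 20, 2),
        (5, 'a2', 's2', 20, 2);
    CREATE TABLE tbl2 (id INT, tid INT, price INT, vol INT);
    INSERT INTO tbl2 VALUES
        (1, 1, 5, 3), (2, 1, 5, 1), (3, 1, 15, 1),
        (4, 2, 5, 3), (5, 2, 6, 2);
""")

# tbl2 is summed per tid first, so the outer query has no aggregate and the
# difference can be filtered in an ordinary WHERE clause.
vol_sql = """
    SELECT t1.id, t1.account, t1.stock,
           t1.vol - COALESCE(sold.vol_total, 0) AS remaining
    FROM tbl1 t1
    LEFT JOIN (SELECT tid, SUM(vol) AS vol_total FROM tbl2 GROUP BY tid) sold
           ON sold.tid = t1.id
    WHERE t1.vol - COALESCE(sold.vol_total, 0) <> 0
    ORDER BY t1.id
"""
vol_rows = vol_db.execute(vol_sql).fetchall()
for row in vol_rows:
    print(row)
```

Rows 1 and 2 drop out because their tbl2 volumes add up to exactly the tbl1 volume.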

Count occurrences that differ within a column

I want to be able to count the number of times the data in columns Somedata_A and Somedata_B has changed from the previous row within its column. I've tried using DISTINCT and it works to some degree: {1,2,3,2,1,1} will show 3 when I want it to show 4, because there are 5 different values in sequence.
Example:
A,B,C,D,E,F
{1,2,3,2,1,1}
A compared to B gives a difference, B compared to C gives a difference . . . E compared to F gives no difference. All in all there are 4 differences within a set of 6 values.
I have gotten DISTINCT to work, but it does not really do the trick for me. To add to the question, I'm not really interested in the whole range, let's say just the last 2 days/entries per Title.
Second, I'm concerned about performance. I tried the query below on a real set of data and it got interrupted, probably due to a timeout.
SQL Fiddle
MySQL 5.5.32 Schema Setup:
CREATE TABLE testdata(
Title varchar(10),
Date varchar(10),
Somedata_A int(5),
Somedata_B int(5));
INSERT INTO testdata (Title, Date, Somedata_A, Somedata_B) VALUES
("Alpha", '123', 1, 2),
("Alpha", '234', 2, 2),
("Alpha", '345', 1, 2),
("Alpha", '349', 1, 2),
("Alpha", '456', 1, 2),
("Omega", '123', 1, 1),
("Omega", '234', 2, 2),
("Omega", '345', 3, 3),
("Omega", '349', 4, 3),
("Omega", '456', 5, 4),
("Delta", '123', 1, 1),
("Delta", '234', 2, 2),
("Delta", '345', 1, 3),
("Delta", '349', 2, 3),
("Delta", '456', 1, 4);
Query 1:
SELECT t.Title, (SELECT COUNT(DISTINCT Somedata_A) FROM testdata AS tt WHERE t.Title = tt.Title) AS A,
(SELECT COUNT(DISTINCT Somedata_B) FROM testdata AS tt WHERE t.Title = tt.Title) AS B
FROM testdata AS t
GROUP BY t.Title
Results:
| TITLE | A | B |
|-------|---|---|
| Alpha | 2 | 1 |
| Delta | 2 | 4 |
| Omega | 5 | 4 |
Something like this may work: it uses a variable for row number, joins on an offset of 1 and then counts differences for A and B.
http://sqlfiddle.com/#!2/3bbc8/9/2
set @i = 0;
set @j = 0;
Select
A.Title aTitle,
sum(Case when A.SomeData_A <> B.SomeData_A then 1 else 0 end) AVar,
sum(Case when A.SomeData_B <> B.SomeData_B then 1 else 0 end) BVar
from
(SELECT Title, @i:=@i+1 as ROWID, SomeData_A, SomeData_B
FROM testdata
ORDER BY Title, date desc) as A
INNER JOIN
(SELECT Title, @j:=@j+1 as ROWID, SomeData_A, SomeData_B
FROM testdata
ORDER BY Title, date desc) as B
ON A.RowID= B.RowID + 1
AND A.Title=B.Title
Group by A.Title
This works. (FYI: Your results in the question do not match your data - for instance, for Alpha, ColumnA: it never changes from 1. The answer should be 0)
Hopefully you can adapt this statement to your actual data model:
SELECT t1.title, SUM(t1.Somedata_A<>t2.Somedata_a) as SomeData_A
,SUM(t1.Somedata_b<>t2.Somedata_b) as SomeData_B
FROM testdata AS t1
JOIN testdata AS t2
ON t1.title = t2.title
AND t2.date = DATE_ADD(t1.date, INTERVAL 1 DAY)
GROUP BY t1.title
ORDER BY t1.title;
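
For comparison, here is a sketch of the "changed from the previous row" count computed directly on the sample rows from the question's schema setup, ordered by the Date column within each Title. It is plain Python rather than SQL, and the function name count_changes is an assumption made for this illustration.

```python
from collections import defaultdict

testdata = [
    ("Alpha", "123", 1, 2), ("Alpha", "234", 2, 2), ("Alpha", "345", 1, 2),
    ("Alpha", "349", 1, 2), ("Alpha", "456", 1, 2),
    ("Omega", "123", 1, 1), ("Omega", "234", 2, 2), ("Omega", "345", 3, 3),
    ("Omega", "349", 4, 3), ("Omega", "456", 5, 4),
    ("Delta", "123", 1, 1), ("Delta", "234", 2, 2), ("Delta", "345", 1, 3),
    ("Delta", "349", 2, 3), ("Delta", "456", 1, 4),
]


def count_changes(rows):
    """Per title, count how often A and B differ from the previous row by date."""
    by_title = defaultdict(list)
    for title, _date, a, b in sorted(rows, key=lambda r: (r[0], r[1])):
        by_title[title].append((a, b))
    result = {}
    for title, series in by_title.items():
        pairs = list(zip(series, series[1:]))  # (previous row, current row)
        result[title] = (
            sum(prev[0] != cur[0] for prev, cur in pairs),
            sum(prev[1] != cur[1] for prev, cur in pairs),
        )
    return result


changes = count_changes(testdata)
for title in sorted(changes):
    print(title, changes[title])
```

Each row is compared with exactly one predecessor, so the work grows with the number of rows after sorting, which is the property the self-join queries try to reproduce in SQL.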

How to select a range of rows from a multiple column primary key?

I'm trying to chunk through rows in MySQL 5.5 and to do this I want to select a range between two primary keys (which I can get easily). This is trivial when the primary key is only one column. However, some of the tables I need to chunk through have multiple columns in the primary key, and I haven't figured out how to make this work in a single prepared statement.
Here's an example table with some data:
CREATE TABLE test (
a INT UNSIGNED NOT NULL,
b INT UNSIGNED NOT NULL,
c INT UNSIGNED NOT NULL,
d VARCHAR(255) DEFAULT '', -- various data columns
PRIMARY KEY (a, b, c)
) ENGINE=InnoDB;
INSERT INTO test (a, b, c) VALUES
(1, 1, 1),
(1, 1, 2),
(1, 1, 3),
(1, 2, 1),
(1, 2, 2),
(1, 2, 3),
(1, 3, 1),
(1, 3, 3),
(2, 1, 1),
(2, 1, 2),
(2, 2, 2),
(2, 3, 1),
(2, 3, 3),
(3, 1, 2),
(3, 1, 3),
(3, 2, 1),
(3, 2, 2),
(3, 2, 3),
(3, 3, 1),
(3, 3, 3);
If I had two primary keys like (1, 1, 3) and (3, 2, 1), the following statement would work. a1, b1, and c1 are the values from the first primary key, and a2, b2, and c2 are the values from the second primary key:
SELECT * FROM test WHERE a = a1 AND b = b1 AND c >= c1
UNION
SELECT * FROM test WHERE a = a1 AND b > b1
UNION
SELECT * FROM test WHERE a > a1 AND a < a2
UNION
SELECT * FROM test WHERE a = a2 AND b < b2
UNION
SELECT * FROM test WHERE a = a2 AND b = b2 AND c <= c2
Or
SELECT * FROM test WHERE a = 1 AND b = 1 AND c >= 3
UNION
SELECT * FROM test WHERE a = 1 AND b > 1
UNION
SELECT * FROM test WHERE a > 1 AND a < 3
UNION
SELECT * FROM test WHERE a = 3 AND b < 2
UNION
SELECT * FROM test WHERE a = 3 AND b = 2 AND c <= 1
Which gives
(1, 1, 3),
(1, 2, 1),
(1, 2, 2),
(1, 2, 3),
(1, 3, 1),
(1, 3, 3),
(2, 1, 1),
(2, 1, 2),
(2, 2, 2),
(2, 3, 1),
(2, 3, 3),
(3, 1, 2),
(3, 1, 3),
(3, 2, 1)
But the above fails when the first column is the same, e.g. (1, 2, 2) and (1, 3, 1). In this case, the 2nd and 4th SELECT select too much.
SELECT * FROM test WHERE a = 1 AND b = 2 AND c >= 2
UNION
SELECT * FROM test WHERE a = 1 AND b > 2
UNION
SELECT * FROM test WHERE a > 1 AND a < 1
UNION
SELECT * FROM test WHERE a = 1 AND b < 3
UNION
SELECT * FROM test WHERE a = 1 AND b = 3 AND c <= 1
Which gives
(1, 1, 1), -- erroneously selected from: SELECT * FROM test WHERE a = 1 AND b < 3
(1, 1, 2), -- erroneously selected from: SELECT * FROM test WHERE a = 1 AND b < 3
(1, 1, 3), -- erroneously selected from: SELECT * FROM test WHERE a = 1 AND b < 3
(1, 2, 1), -- erroneously selected from: SELECT * FROM test WHERE a = 1 AND b < 3
(1, 2, 2),
(1, 2, 3),
(1, 3, 1),
(1, 3, 3) -- erroneously selected from: SELECT * FROM test WHERE a = 1 AND b > 2
The desired output is
(1, 2, 2),
(1, 2, 3),
(1, 3, 1)
I would like a single statement that works with all primary key ranges, including identical values for the first and second columns. I also have tables with 4 columns in the primary key, and I'll extend the pattern in that case.
I would like a single statement per table instead of creating queries on the fly because the query will be executed up to a million times as I chunk through the tables. Some of the tables have over 100M rows.
I would rather avoid constructing multiple statements as I have hundreds to write following this pattern, and writing more would be significantly more work. I will do this if it's the only option.
I currently use parametrized queries, and generate the values programmatically from the two primary keys, taking care of required duplicate values (the a1 x3, b1 x2, a2 x3, b2 x2 in the above example) in the application layer. So passing duplicate values for parameters is simple for me to do.
My best guess at this point is to duplicate the SELECTs with an additional part in the WHERE clause that compares the values of the primary key columns.
I would use this query to select a range:
SELECT *
FROM test
WHERE (a,b,c) >= (1, 1, 3)
and (a,b,c) <= (3, 2, 1)
Demo: http://www.sqlfiddle.com/#!2/d6cf7b/4
Unfortunately, MySQL is not able to perform a range optimization for the above query; see this link: http://dev.mysql.com/doc/refman/5.7/en/range-optimization.html#range-access-single-part
(chapter: 8.2.1.3.4. Range Optimization of Row Constructor Expressions)
It says that, starting from version 5.7, MySQL can optimize only queries of the form:
WHERE ( col_1, col_2 ) IN (( 'a', 'b' ), ( 'c', 'd' ));
For these particular bounds, where the first column differs between the two keys, the above query is equivalent to this one:
SELECT *
FROM test
WHERE
a = 1 and b = 1 and c >= 3 -- lowest end
or
a = 3 and b = 2 and c <= 1 -- highest end
or
a = 1 and b > 1
or
a = 3 and b < 2
or
a > 1 and a < 3
;
MySQL might use the range access method optimization for this form of the query; see the link below
(chapter: 8.2.1.3.2. The Range Access Method for Multiple-Part Indexes):
http://dev.mysql.com/doc/refman/5.7/en/range-optimization.html
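
The row comparison (a,b,c) >= (a1,b1,c1) can also be written as a nested lexicographic predicate that stays correct when the leading columns of both keys are equal, which is the case the question struggles with. The sketch below is an illustration only: it uses Python's sqlite3 module with an in-memory table instead of MySQL, the names RANGE_SQL and select_range are made up, and it says nothing about which index access method MySQL would choose for this form.

```python
import sqlite3

keys = [
    (1, 1, 1), (1, 1, 2), (1, 1, 3), (1, 2, 1), (1, 2, 2), (1, 2, 3),
    (1, 3, 1), (1, 3, 3), (2, 1, 1), (2, 1, 2), (2, 2, 2), (2, 3, 1),
    (2, 3, 3), (3, 1, 2), (3, 1, 3), (3, 2, 1), (3, 2, 2), (3, 2, 3),
    (3, 3, 1), (3, 3, 3),
]

range_db = sqlite3.connect(":memory:")
range_db.execute(
    "CREATE TABLE test (a INT, b INT, c INT, d TEXT DEFAULT '', PRIMARY KEY (a, b, c))"
)
range_db.executemany("INSERT INTO test (a, b, c) VALUES (?, ?, ?)", keys)

# Lower bound: key >= (a1, b1, c1); upper bound: key <= (a2, b2, c2).
RANGE_SQL = """
    SELECT a, b, c FROM test
    WHERE (a > :a1 OR (a = :a1 AND (b > :b1 OR (b = :b1 AND c >= :c1))))
      AND (a < :a2 OR (a = :a2 AND (b < :b2 OR (b = :b2 AND c <= :c2))))
    ORDER BY a, b, c
"""


def select_range(lo, hi):
    """Return all keys between `lo` and `hi`, both ends inclusive."""
    params = dict(zip(("a1", "b1", "c1"), lo))
    params.update(zip(("a2", "b2", "c2"), hi))
    return range_db.execute(RANGE_SQL, params).fetchall()


print(select_range((1, 2, 2), (1, 3, 1)))
print(len(select_range((1, 1, 3), (3, 2, 1))))
```

The named parameters are an sqlite3 convenience. With positional placeholders the same statement needs a1 and a2 twice each and b1 and b2 twice each, which matches the duplicate-parameter approach the question already uses.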