How to assign the previous value in a column for each record - sql-server-2008

I have a table in which the data looks like this:
Request Id Field Id Current Key
1213 11 1001
1213 12 1002
1213 12 103
1214 13 799
1214 13 899
1214 13 7
When the loop starts for the first Request Id, it should check all the Field Ids for that Request Id. The data should then look like this:
Request Id Field Id Previous Key Current Key
1213 11 null 1001
1213 12 null 1002
1213 12 1002 103
1214 13 null 799
1214 13 799 899
1214 13 899 7
When the very first record for a Field Id within a particular Request Id arrives, Previous Key should be null and Current Key stays the same.
When the second record arrives for the same Field Id, Previous Key should take the Current Key of the first record; when the third record arrives, it should take the Current Key of the second record, and so on.
When a new Field Id appears, the same process should start over.
Please let me know if you need any more info. Your help is much needed.

You can check this.
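declare @t table (Request_Id int, Field_Id int, Current_Key int)
insert into @t values (1213, 11, 1001),(1213, 12, 1002), (1213, 12, 103) , (1214, 13, 799), (1214, 13, 899), (1214, 13, 7)
;with cte
as (
-- dummy row 0 so that the first real row also has a "previous" row to join to
select 0 rowno,0 Request_Id, 0 Field_Id, 0 Current_Key
union
-- note: ordering by request_id alone leaves the order within a request undefined
select ROW_NUMBER() over(order by request_id) rowno, * from @t
)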
)
select
t1.Request_Id , t1.Field_Id ,
case when t1.Request_Id = t2.Request_Id and t1.Field_Id = t2.Field_Id
then t2.Current_Key
else null
end previous_key
, t1.Current_Key
from cte t1, cte t2
where t1.rowno = t2.rowno + 1
Refer link when you want to compare row value

"When the second record will come for same field ID..."
Tables don't work this way: rows have no inherent order, so there is no way to tell that 1213,12,1002 is the "previous" record of 1213,12,103, as you assume in your example.
Do you have any column you can use to sort your records properly? Request id isn't enough because, even if you guarantee that it increases monotonically with each operation, one operation can include multiple values for the same field id, and those need to be ordered relative to each other.

In SQL Server 2008
You do not have the benefit of the LAG and LEAD functions. Instead, you must compute the new column with a correlated subquery. Number the rows with ROW_NUMBER(), using the same ordering for the outer query and the subquery. Then, for each row, select the row with the greatest row_num that is less than the current row_num and has the same request_id and field_id.
with numbered as (
-- replace this ORDER BY with a column that really defines the record order
select t.*, row_number() over (order by request_id, field_id) as row_num
from your_table t
)
select a.request_id,
a.field_id,
(select top 1 x.current_key
from numbered x
where x.request_id = a.request_id
and x.field_id = a.field_id
and x.row_num < a.row_num
order by x.row_num desc
) as previous_key,
a.current_key
from numbered a
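The correlated-subquery idea can be tried outside SQL Server. The following is only a sketch using Python's built-in sqlite3 module as a stand-in engine; the table name requests and the seq column (an arrival-order column the original table does not have) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# seq is a hypothetical column recording arrival order; the original table lacks one.
conn.execute(
    "CREATE TABLE requests ("
    " seq INTEGER PRIMARY KEY, request_id INT, field_id INT, current_key INT)"
)
conn.executemany(
    "INSERT INTO requests (request_id, field_id, current_key) VALUES (?, ?, ?)",
    [(1213, 11, 1001), (1213, 12, 1002), (1213, 12, 103),
     (1214, 13, 799), (1214, 13, 899), (1214, 13, 7)],
)

# For each row, pick the latest earlier row with the same request_id and field_id.
previous_by_subquery = conn.execute("""
    SELECT a.request_id, a.field_id,
           (SELECT x.current_key
              FROM requests x
             WHERE x.request_id = a.request_id
               AND x.field_id = a.field_id
               AND x.seq < a.seq
             ORDER BY x.seq DESC
             LIMIT 1) AS previous_key,
           a.current_key
      FROM requests a
     ORDER BY a.seq
""").fetchall()

for row in previous_by_subquery:
    print(row)
```

SQLite uses LIMIT 1 where SQL Server uses TOP 1; the rest of the query has the same shape. Without a column like seq the result depends on whatever order the engine happens to produce.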
In SQL Server 2012+
You can use the LAG or LEAD functions with the OVER clause to get the value from the previous or next nth row:
select
Request_Id,
Field_Id,
lag(Current_Key,1) over (partition by Request_ID, Field_ID order by timestampColumn) as Previous_Key
,Current_Key
from your_table
Pay attention to how the rows are ordered. LAG requires an ORDER BY inside the OVER clause, because a table has no inherent row order and "previous" is only defined relative to that ordering. The example assumes you have a column, such as a datetime (timestampColumn here), that records the order in which the rows arrived:
lag(Current_Key,1) over (partition by Request_ID, Field_ID order by timestampColumn)
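To see LAG produce the table from the question, here is a small sketch that runs the same kind of query through Python's sqlite3 module (SQLite 3.25 or later is assumed for window functions). The seq column stands in for the ordering column and is an assumption; it does not exist in the original table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# seq is a hypothetical ordering column (insert order), playing the role of timestampColumn.
conn.execute(
    "CREATE TABLE requests ("
    " seq INTEGER PRIMARY KEY, request_id INT, field_id INT, current_key INT)"
)
conn.executemany(
    "INSERT INTO requests (request_id, field_id, current_key) VALUES (?, ?, ?)",
    [(1213, 11, 1001), (1213, 12, 1002), (1213, 12, 103),
     (1214, 13, 799), (1214, 13, 899), (1214, 13, 7)],
)

# LAG restarts at NULL for every (request_id, field_id) partition.
previous_by_lag = conn.execute("""
    SELECT request_id, field_id,
           LAG(current_key) OVER (
               PARTITION BY request_id, field_id
               ORDER BY seq) AS previous_key,
           current_key
      FROM requests
     ORDER BY seq
""").fetchall()

for row in previous_by_lag:
    print(row)
```

The first row of each partition gets NULL (None in Python), which matches the expected output in the question.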

Try this:
declare @tb table (RequestId int,FieldId int, CurrentKey int)
insert into @tb (RequestId,FieldId,CurrentKey) values
(1213,11,1001),
(1213,12,1002),
(1213,12,103),
(1214,13,799),
(1214,13,899),
(1214,13, 7)
select t.RequestId,t.FieldId,
case when t.RequestId=t1.RequestId and t.FieldId=t1.FieldId then t1.CurrentKey end as PreviousKey,t.CurrentKey from
(select *, ROW_NUMBER() over (order by RequestId,FieldId) as rno
from @tb) t left join
(select RequestId,FieldId,CurrentKey,
ROW_NUMBER() over (order by RequestId,FieldId) as rno from @tb) t1 on t.rno=t1.rno+1

Related

Make a select with max and min passing condition to each of the two

When a post is accessed, I need to return, in addition to that post's information, the previous post and the next post if they exist.
I would like to know if there is a way to select MAX(id) and MIN(id) in a single query/select, passing a condition for each one of them. Example of what I'm trying to do in Laravel and I'll write it in SQL to make it easier too
Laravel:
$query = Post::query();
$query = $query->from('posts')->select(DB::raw('MAX(id), MIN(id)'))->whereRaw("id < {$id} and id > {$id}")->first();
SQL:
select MAX(id), MIN(id) from `posts` where id < 5 and id > 5 limit 1
The id variable is the post id. In this example it has the value 5. The query is meant to get the MAX and MIN relative to this id, but I also need the info of the post that the user accessed.
The DB has posts number 4 and number 6. That is, I need to get the information from posts 4, 5 and 6 in this example.
The where condition will never be true, but I cannot use or. The first condition is for MAX and the second for MIN. If I use or, I get the largest id in the DB.
I need the min and max relative to a given value. If the id is 5, I need the largest existing id below that value and the smallest existing id above it. With the data in my DB, that would be ids 4, 5 and 6.
Is it possible in a single query, or do I really have to run more than one?
Yes, you can do it with case-when
select MAX(
CASE
WHEN id < 5 THEN id
ELSE NULL
END
), MIN(
CASE
WHEN id > 5 THEN id
ELSE NULL
END
)
from `posts`
where id <> 5
EDIT
Laravel equivalent, as shared by Gabriel Edu in the comment-section:
$query = Post::query();
$query = $query->from('posts')->
select(DB::raw("MAX(CASE WHEN id < {$id} THEN id ELSE null END), MIN(CASE WHEN id > {$id} THEN id ELSE null END)"))->first();
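As a quick check of the conditional-aggregation idea, here is a sketch run against an in-memory SQLite database from Python. The sample ids and the helper name neighbours are made up for illustration. It binds the id as a parameter instead of interpolating it into the SQL string, which is usually the safer habit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
# Hypothetical sample data with gaps in the id sequence.
conn.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [(i, f"post {i}") for i in (1, 2, 4, 5, 6, 9)],
)


def neighbours(conn, post_id):
    """Return (previous_id, next_id) around post_id; either may be None."""
    return conn.execute(
        "SELECT MAX(CASE WHEN id < :id THEN id END),"
        "       MIN(CASE WHEN id > :id THEN id END)"
        "  FROM posts",
        {"id": post_id},
    ).fetchone()


print(neighbours(conn, 5))  # (4, 6)
print(neighbours(conn, 1))  # (None, 2): there is no earlier post
```

Each CASE expression yields NULL for rows on the wrong side of the id, and MAX/MIN ignore NULLs, so one scan produces both neighbours.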
The LAG() and LEAD() functions in MySQL (8.0 and later) return the value from the preceding and the succeeding row, respectively, within the partition.
Try this:
SELECT ID,
LAG (id) OVER (ORDER BY id) ONE_SHIFT_FORWARD,
LEAD (id) OVER (ORDER BY id) ONE_SHIFT_BACKWARD
FROM posts
ORDER BY ID ASC;
SELECT *
FROM ( SELECT ID,
LAG (id) OVER (ORDER BY id) ONE_SHIFT_FORWARD,
LEAD (id) OVER (ORDER BY id) ONE_SHIFT_BACKWARD
FROM posts) t
WHERE id = 5;
You may use lead and lag to access the values before and after the current row.
You may then use those to select the post with a given id and the values before and after in a single select.
The following query
select *
from (
select
p.*,
lead(id) over(order by id) _lead,
lag(id) over(order by id) _lag
from post p
) x
where 23 in (id, _lead, _lag);
results in
id | text | _lead | _lag
-------------------------------
15 | fifteen | 23 | 10
23 | twentythree | 24 | 15
24 | twentyfour | 50 | 23
With the following setup:
Schema (MySQL v8.0)
create table post (
id integer,
text varchar(50)
);
insert into post(id, text)
values
( 10, 'ten'),
( 15, 'fifteen'),
( 23, 'twentythree'),
( 24, 'twentyfour'),
( 50, 'fifty');

Write a query to get rid of duplicate records in oracle database with below mentioned criteria:

Criteria :
1) unique combination of 2 columns(column1,column2)
2) keep oldest one out of that combination
3) records might be same i.e. same column1, column2 and creation date in that case need the one which has lesser id.
e.g. data is as below:
ID column1 column2 creation_date(dd-mm-yyyy)
1 11 aa 10/5/2016
2 11 aa 11/6/2016
3 12 bb 10/5/2017
4 12 bb 20-05-2017
5 12 cc 10/5/2016
6 12 cc 11/5/2017
7 13 dd 10/1/2018
8 13 dd 10/1/2018
I need to keep records with id: 1,3,5,7
Approach I am thinking of is:
a) first write select query to get required records (in this example 1,3,5,7)
b) write update query to change status to deleted using update query(soft delete)
Also please suggest if any other better approach to fulfill the criteria.
Additional information:
*total number of records: 11k
*I don't want to read the records directly from the table; instead, I have a query that fetches only the required data, and the deduplication needs to run on those records
*Final aim is to modify status of duplicate records to deleted and append deleted word to those records
This is really straightforward if you use analytic functions. The query has three parts:
A) Assign a rank to each record like this:
Group records by column1 and column2. Within each group, sort the records first by creation_date and then by ID. Assign 1 to the first record, 2 to the second and so on.
B) Keep only the duplicates, i.e. the records with newer creation_date and/or ID. The record with rnk = 1 would be your requested record. Records with rnk > 1 are the duplicates.
C) Using ROWID, delete the duplicates
delete
from your_table
where rowid in(-- (C)
select duplicate_rowid
from (select rowid as duplicate_rowid
,row_number() over( -- (A)
partition by column1, column2 -- Your criterion 1
order by creation_date asc -- Your criterion 2
,id asc -- Your criterion 3
) as rnk
from your_table
)
where rnk > 1 -- (B)
);
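The three parts (A), (B) and (C) can be tried on the sample data from the question. This is only a sketch: it uses Python's sqlite3 module (SQLite 3.25+ for window functions), a made-up table name records, the id column in place of Oracle's ROWID, and dates rewritten as ISO strings so that they sort correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records ("
    " id INTEGER PRIMARY KEY, column1 INT, column2 TEXT, creation_date TEXT)"
)
# Sample rows from the question, with dd-mm-yyyy dates converted to yyyy-mm-dd.
conn.executemany("INSERT INTO records VALUES (?, ?, ?, ?)", [
    (1, 11, "aa", "2016-05-10"), (2, 11, "aa", "2016-06-11"),
    (3, 12, "bb", "2017-05-10"), (4, 12, "bb", "2017-05-20"),
    (5, 12, "cc", "2016-05-10"), (6, 12, "cc", "2017-05-11"),
    (7, 13, "dd", "2018-01-10"), (8, 13, "dd", "2018-01-10"),
])

conn.execute("""
    DELETE FROM records
     WHERE id IN (                                  -- (C)
           SELECT id
             FROM (SELECT id,
                          ROW_NUMBER() OVER (       -- (A)
                              PARTITION BY column1, column2
                              ORDER BY creation_date, id) AS rnk
                     FROM records)
            WHERE rnk > 1)                          -- (B)
""")

kept_ids = [r[0] for r in conn.execute("SELECT id FROM records ORDER BY id")]
print(kept_ids)  # [1, 3, 5, 7]
```

Ids 7 and 8 share the same creation date, so the id tiebreaker in the ORDER BY decides that 7 is kept.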
So final queries which worked for my question are as below:
1) to get count of records/ to get required columns:
SELECT --count (*) -use this to get count of records
ID, COLUMN1, COLUMN2,CREATION_DATE --required columns
FROM
MY_TABLE
WHERE
ROWID IN(
select duplicate_rowid
from (select rowid as duplicate_rowid
,row_number() over(
partition by COLUMN1, COLUMN2 -- criterion 1
ORDER BY CREATION_DATE ASC -- criterion 2
,ID ASC -- criterion 3
) AS RNK
from MY_TABLE
)
WHERE (RNK > 1 and COLUMN1 IS NOT NULL and COLUMN2 IS NOT NULL)
);
2) to update records with status=deleted and append _deleted word to column1 values:
UPDATE MY_TABLE
SET STATUS='deleted' , COLUMN1=CONCAT(COLUMN1,'_deleted')
WHERE
ROWID IN(
select duplicate_rowid
from (select rowid as duplicate_rowid
,row_number() over(
partition by COLUMN1, COLUMN2 -- criterion 1
ORDER BY CREATION_DATE ASC -- criterion 2
,ID ASC -- criterion 3
) AS RNK
from MY_TABLE
)
WHERE (RNK > 1 and COLUMN1 IS NOT NULL and COLUMN2 IS NOT NULL)
);

Delete duplicates from db

I have table like following
id | a_id | b_id | success
--------------------------
1 34 43 1
2 34 84 1
3 34 43 0
4 65 43 1
5 65 84 1
6 93 23 0
7 93 23 0
I want delete duplicates with same a_id and b_id, but I want keep one record. If possible kept record should be with success=1. So in example table third and sixth/seventh record should be deleted. How to do this?
I'm using MySQL 5.1
The task is simple:
Find the minimum number of records that should not be deleted.
Delete the other records.
The Oracle way:
delete from sample_table where id not in(
select id from
(
Select id, success,row_number()
over (partition by a_id,b_id order by success desc) rown
from sample_table
)
where rown = 1)
The solution in MySQL:
This gives you the ids that should not be deleted, one per (a_id, b_id) pair:
Select id from (SELECT * FROM report ORDER BY success desc) t
group by t.a_id, t.b_id
o/p:
ID
1
2
4
5
6
You can delete the other rows.
delete from report where id not in (the above query)
The consolidated DML:
delete from report
where id not in (Select id
from (SELECT * FROM report
ORDER BY success desc) t
group by t.a_id, t.b_id)
Now doing a Select on report:
ID A_ID B_ID SUCCESS
1 34 43 1
2 34 84 1
4 65 43 1
5 65 84 1
6 93 23 0
You can check the documentation of how the group by clause works when no aggregation function is provided:
When using this feature, all rows in each group should have the same
values for the columns that are omitted from the GROUP BY part. The
server is free to return any value from the group, so the results are
indeterminate unless all values are the same.
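So just performing an order by success before the group by lets us get a duplicate row with success = 1 in practice, although, as the quoted documentation says, the server does not guarantee which row of each group is returned.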
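If relying on the row that GROUP BY happens to return feels too fragile, one deterministic alternative is to delete every row for which a "better" row exists in the same (a_id, b_id) group: one with a higher success value, or the same success value and a lower id. The sketch below is not the method from this answer; it runs the idea in SQLite through Python. MySQL normally rejects a subquery on the table being deleted from, so there the same condition would have to be written as a self-join DELETE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE report (id INTEGER PRIMARY KEY, a_id INT, b_id INT, success INT)"
)
# Sample rows from the question.
conn.executemany("INSERT INTO report VALUES (?, ?, ?, ?)", [
    (1, 34, 43, 1), (2, 34, 84, 1), (3, 34, 43, 0), (4, 65, 43, 1),
    (5, 65, 84, 1), (6, 93, 23, 0), (7, 93, 23, 0),
])

# Delete a row when the same (a_id, b_id) pair has a better row:
# higher success, or equal success and a smaller id.
conn.execute("""
    DELETE FROM report
     WHERE EXISTS (
           SELECT 1
             FROM report AS better
            WHERE better.a_id = report.a_id
              AND better.b_id = report.b_id
              AND (better.success > report.success
                   OR (better.success = report.success
                       AND better.id < report.id)))
""")

remaining_ids = [r[0] for r in conn.execute("SELECT id FROM report ORDER BY id")]
print(remaining_ids)  # [1, 2, 4, 5, 6]
```

The best row of each group never has a better row, so it always survives, whatever order the rows are visited in.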
How about this:
CREATE TABLE new_table
AS (SELECT * FROM old_table WHERE 1 AND success = 1 GROUP BY a_id,b_id);
DROP TABLE old_table;
RENAME TABLE new_table TO old_table;
This method will create a new table with a temporary name, and copy all the deduped rows which have success = 1 from the old table. The old table is then dropped and the new table is renamed to the name of the old table.
If I understand your question correctly, this is probably the simplest solution. (though I don't know if it's really efficient or not)
This should work:
If procedural programming is available to you, e.g. PL/SQL, it is fairly simple. If, on the other hand, you are looking for a pure SQL solution, it might be possible but not very "nice". Below is an example in PL/SQL:
begin
for x in ( select a_id, b_id
from table
having count(*) > 1
group by a_id, b_id )
loop
for y in ( select *
from table
where a_id = x.a_id
and b_id = x.b_id
order by success desc )
loop
delete from table
where a_id = y.a_id
and b_id = y.b_id
and id != y.id;
exit; -- Only do the first row
end loop;
end loop;
end;
This is the idea: For each duplicated combination of a_id and b_id select all the instances ordered so that any with success=1 is up first. Delete all of that combination except the first - being the successful one if any.
or perhaps:
declare
l_a_id integer := -1;
l_b_id integer := -1;
begin
for x in ( select *
from table
order by a_id, b_id, success desc )
loop
if x.a_id = l_a_id and x.b_id = l_b_id
then
delete from table where id = x.id;
end if;
l_a_id := x.a_id;
l_b_id := x.b_id;
end loop;
end;
In MySQL, if you don't care which record is kept, a single ALTER TABLE will work. (ALTER IGNORE TABLE was removed in MySQL 5.7.4, so this applies to older versions such as 5.1.)
ALTER IGNORE TABLE tbl_name
ADD UNIQUE INDEX(a_id, b_id)
It drops the duplicate records and keeps only the unique ones.
A useful link:
MySQL: ALTER IGNORE TABLE ADD UNIQUE, what will be truncated?

Is there a simpler way to find MODE(S) of some values in MySQL

MODE is the value that occurs the MOST times in the data, there can be ONE MODE or MANY MODES
here's some values in two tables (sqlFiddle)
create table t100(id int auto_increment primary key, value int);
create table t200(id int auto_increment primary key, value int);
insert into t100(value) values (1),
(2),(2),(2),
(3),(3),
(4);
insert into t200(value) values (1),
(2),(2),(2),
(3),(3),
(4),(4),(4);
right now, to get the MODE(S) returned as comma separated list, I run the below query for table t100
SELECT GROUP_CONCAT(value) as modes,occurs
FROM
(SELECT value,occurs FROM
(SELECT value,count(*) as occurs
FROM
T100
GROUP BY value)T1,
(SELECT max(occurs) as maxoccurs FROM
(SELECT value,count(*) as occurs
FROM
T100
GROUP BY value)T2
)T3
WHERE T1.occurs = T3.maxoccurs)T4
GROUP BY occurs;
and the below query for table t200 (same query just with table name changed) I have 2 tables in this example because to show that it works for cases where there's 1 MODE and where there are multiple MODES.
SELECT GROUP_CONCAT(value) as modes,occurs
FROM
(SELECT value,occurs FROM
(SELECT value,count(*) as occurs
FROM
T200
GROUP BY value)T1,
(SELECT max(occurs) as maxoccurs FROM
(SELECT value,count(*) as occurs
FROM
T200
GROUP BY value)T2
)T3
WHERE T1.occurs = T3.maxoccurs)T4
GROUP BY occurs;
My question is "Is there a simpler way?"
I was thinking like using HAVING count(*) = max(count(*)) or something similar to get rid of the extra join but couldn't get HAVING to return the result i wanted.
UPDATED:
As suggested by @zneak, I can simplify T3 like below:
SELECT GROUP_CONCAT(value) as modes,occurs
FROM
(SELECT value,occurs FROM
(SELECT value,count(*) as occurs
FROM
T200
GROUP BY value)T1,
(SELECT count(*) as maxoccurs
FROM
T200
GROUP BY value
ORDER BY count(*) DESC
LIMIT 1
)T3
WHERE T1.occurs = T3.maxoccurs)T4
GROUP BY occurs;
Now, is there a way to get rid of T3 altogether?
I tried this, but it returns no rows for some reason:
SELECT value,occurs FROM
(SELECT value,count(*) as occurs
FROM t200
GROUP BY `value`)T1
HAVING occurs=max(occurs)
basically I am wondering if there's a way to do it such that I only need to specify t100 or t200 once.
UPDATED: I found a way to specify t100 or t200 only once by adding a variable to track my own maxoccurs, like below:
SELECT GROUP_CONCAT(CASE WHEN occurs=@maxoccurs THEN value ELSE NULL END) as modes
FROM
(SELECT value,occurs,@maxoccurs:=GREATEST(@maxoccurs,occurs) as maxoccurs
FROM (SELECT value,count(*) as occurs
FROM t200
GROUP BY `value`)T1,(SELECT @maxoccurs:=0)mo
)T2
You are very close with the last query. The following finds one mode:
SELECT value, occurs
FROM (SELECT value,count(*) as occurs
FROM t200
GROUP BY `value`
ORDER BY occurs DESC
LIMIT 1
) T1
I think your question was about multiple modes, though:
SELECT value, occurs
FROM (SELECT value, count(*) as occurs
FROM t200
GROUP BY `value`
) T1
WHERE occurs = (select max(occurs)
from (select `value`, count(*) as occurs
from t200
group by `value`
) t
);
EDIT:
This is much easier in almost any other database. MySQL (before version 8.0) supports neither WITH nor window/analytic functions.
Your query (shown below) does not do what you think it is doing:
SELECT value, occurs
FROM (SELECT value, count(*) as occurs
FROM t200
GROUP BY `value`
) T1
HAVING occurs = max(occurs) ;
The final having clause refers to the column occurs but also uses max(occurs). Because of max(occurs), this is an aggregation query that returns one row, summarizing all rows from the subquery.
The column occurs is not used for grouping. So, what value does MySQL use? It uses an arbitrary value from one of the rows in the subquery. This arbitrary value might match the maximum, or it might not. But the value comes from only one row; there is no iteration over the rows.
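The multiple-mode query can be wrapped so that the table name is given only once, which is what the question was after. The sketch below does this in Python against SQLite; the helper name modes is made up, and the table name is spliced into the SQL text, so it must be a trusted identifier rather than user input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Same sample values as t100 and t200 in the question.
samples = {
    "t100": [1, 2, 2, 2, 3, 3, 4],
    "t200": [1, 2, 2, 2, 3, 3, 4, 4, 4],
}
for table, values in samples.items():
    conn.execute(f"CREATE TABLE {table} (id INTEGER PRIMARY KEY, value INT)")
    conn.executemany(
        f"INSERT INTO {table} (value) VALUES (?)", [(v,) for v in values]
    )


def modes(conn, table):
    """Return (list of modes, occurrence count); table must be a trusted name."""
    rows = conn.execute(f"""
        SELECT value, COUNT(*) AS occurs
          FROM {table}
         GROUP BY value
        HAVING COUNT(*) = (SELECT MAX(c)
                             FROM (SELECT COUNT(*) AS c
                                     FROM {table}
                                    GROUP BY value))
         ORDER BY value
    """).fetchall()
    return [value for value, _ in rows], (rows[0][1] if rows else 0)


print(modes(conn, "t100"))  # ([2], 3)
print(modes(conn, "t200"))  # ([2, 4], 3)
```

The HAVING clause compares each group's count with a scalar subquery, so the maximum is computed over all groups instead of being taken from one arbitrary row.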
I realize this is a very old question but in looking for the best way to find the MODE in a MySQL table, I came up with this:
SELECT [column name], count(*) as [ccount] FROM [table] WHERE [field] = [item] GROUP BY [column name] ORDER BY [ccount] DESC LIMIT 1 ;
In my actual situation, I had a log with recorded events in it. I wanted to know during which period (1, 2 or 3 as recorded in my log) a specific event occurred most often (i.e., the MODE of the "period" column of the table for that specific event).
My table looked like this (abridged):
EVENT_TYPE | PERIOD
-------------------------
1 | 3
1 | 3
1 | 3
1 | 2
2 | 1
2 | 1
2 | 1
2 | 3
Using the query:
SELECT event_type, period, count(*) as pcount FROM proto_log WHERE event_type = 1 GROUP BY period ORDER BY pcount DESC LIMIT 1 ;
I get the result:
EVENT_TYPE | PERIOD | PCOUNT
--------------------------------------
1 | 3 | 3
Using this result, the period column ($result['period'] for example) should contain the MODE for that query and of course pcount contains the actual count.
If you wanted to get multiple modes, I suppose you could keep adding other criteria to your WHERE clause using ORs:
SELECT event_type, period, count(*) as pcount FROM proto_log WHERE event_type = 1 OR event_type = 2 GROUP BY event_type, period ORDER BY pcount DESC LIMIT 2 ;
The multiple ORs should give you the additional results and the LIMIT increase will add the additional MODES to the results. (Otherwise it will still only show the top 1 result)
Results:
EVENT_TYPE | PERIOD | PCOUNT
--------------------------------------
1 | 3 | 3
2 | 1 | 3
I am not 100% sure this is doing exactly what I think it is doing, or if it will work in all situations, so please let me know if I am on or off track here.

select min value of range [0,44) not in a column

I have a table with an int valued column, which has values between 0 and 43 (both included).
I would like a query that returns the min value of the range [0,44) which is not in the table.
For example:
if the table contains: 3,5, 14. The query should return 0
if the table contains: 0,1, 14. The query should return 2
if the table contains: 0,3, 14. The query should return 1
If the table contains all values, the query should return empty.
How can I achieve that?
Since the value you want is either 0 or 1 greater than a value that exists in the table, you can just do;
SELECT MIN(value)
FROM (SELECT 0 value UNION SELECT value+1 FROM MyTable) a
WHERE value < 44 AND value NOT IN (SELECT value FROM MyTable)
An SQLfiddle to test with.
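The query can be checked against the examples from the question. The sketch below runs it in SQLite through Python; the helper name min_missing and the table name my_table are placeholders, and the column is assumed to contain no NULLs (NOT IN behaves differently when NULLs are present).

```python
import sqlite3

QUERY = """
    SELECT MIN(value)
      FROM (SELECT 0 AS value UNION SELECT value + 1 FROM my_table)
     WHERE value < 44
       AND value NOT IN (SELECT value FROM my_table)
"""


def min_missing(values):
    """Smallest integer in [0, 44) absent from values, or None if none is missing."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE my_table (value INTEGER NOT NULL)")
    conn.executemany("INSERT INTO my_table VALUES (?)", [(v,) for v in values])
    (result,) = conn.execute(QUERY).fetchone()
    conn.close()
    return result


print(min_missing([3, 5, 14]))       # 0
print(min_missing([0, 1, 14]))       # 2
print(min_missing([0, 3, 14]))       # 1
print(min_missing(list(range(44))))  # None
```

The candidate set is 0 plus every stored value incremented by one, because the smallest gap must start either at 0 or directly after a value that is present.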
One way would be to create another table that contains the integers in [0,43] and then left join that and look for NULLs, the NULLs will tell you what values are missing.
Suppose you have:
create table numbers (n int not null);
and this table contains the integers from 0 to 43 (inclusive). If your table is t and has a column n which holds the numbers of interest, then:
select n.n
from numbers n left join t on n.n = t.n
where t.n is null
order by n.n
limit 1
should give you the result you're after.
This is a fairly common SQL technique when you're working with a sequence. The most common use is probably calendar tables.
One approach is to generate a set of 44 rows with the integer values 0 through 43, perform an anti-join against the distinct set of values from the table, and then grab the minimum value.
SELECT MIN(r.val) AS min_val
FROM ( SELECT 0 AS val UNION ALL
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5 UNION ALL
-- ...
SELECT 43
) r
LEFT
JOIN ( SELECT t.int_valued_col
FROM mytable t
WHERE t.int_valued_col >= 0
AND t.int_valued_col <= 43
GROUP BY t.int_valued_col
) v
ON v.int_valued_col = r.val
WHERE v.int_valued_col IS NULL
A little bit hacky and MySQL-specific:
SELECT NULLIF(MAX(IF(val=@min, @min:=(val+1), @min)), @max) as min_empty
FROM (
SELECT DISTINCT val
FROM table1
-- WHERE val BETWEEN 0 AND 43
ORDER BY val) as vals, (SELECT @min:=0, @max:=44) as init;