MySQL - Select row only if previous row field was 0? - mysql

I have a table as below:
CREATE TABLE IF NOT EXISTS `status`
(`code` int(11) NOT NULL AUTO_INCREMENT PRIMARY KEY
,`IMEI` varchar(15) NOT NULL
,`ACC` tinyint(1) NOT NULL
,`datetime` datetime NOT NULL
);
INSERT INTO status VALUES
(1, 123456789012345, 0, '2014-07-09 10:00:00'),
(2, 453253453334445, 0, '2014-07-09 10:05:00'),
(3, 912841851252151, 0, '2014-07-09 10:08:00'),
(4, 123456789012345, 1, '2014-07-09 10:10:00'),
(5, 123456789012345, 1, '2014-07-09 10:15:00');
I need to get all rows for a given IMEI (e.g. 123456789012345) where ACC=1 AND the previous row for the same IMEI has ACC=0. The rows may be adjacent or far apart.
Given the example above, I'd want to get the 4th row (code 4) but not the 5th (code 5).
Any ideas? Thanks.

Assuming that you mean the previous row by datetime:
SELECT *
FROM status s
WHERE s.imei='123456789012345'
AND s.acc=1
AND (
SELECT acc
FROM status
WHERE imei=s.imei
AND datetime<s.datetime
ORDER BY datetime DESC
LIMIT 1
) = 0
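On MySQL 8.0 or later (an assumption, since the question does not state a version), the same "previous row by datetime" test can also be sketched with the LAG() window function. The alias prev_acc and the tie-breaker on code are illustrative choices, not part of the query above.

```sql
-- Sketch for MySQL 8.0+: LAG() exposes the previous row's ACC per IMEI.
SELECT t.code, t.IMEI, t.ACC, t.`datetime`
FROM (
    SELECT s.*,
           LAG(s.ACC) OVER (PARTITION BY s.IMEI
                            ORDER BY s.`datetime`, s.code) AS prev_acc  -- hypothetical alias
    FROM `status` s
    WHERE s.IMEI = '123456789012345'
) t
WHERE t.ACC = 1
  AND t.prev_acc = 0;   -- first row per IMEI has prev_acc NULL and is filtered out
```

With the sample data from the question this should return only the row with code 4, because code 5 follows a row whose ACC is already 1.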

The way I would approach this problem is much different from the approaches given in other answers.
The approach I would use would be to
1) order the rows, first by imei, and then by datetime within each imei. (I'm assuming that datetime is how you are going to determine if a row is "previous" to another row.)
2) sequentially process the rows, first comparing imei from the current row to the imei from the previous row, and then checking if the ACC from the current row is 1 and the ACC from the previous row is 0. Then I would know that the current row was a row to be returned.
3) for each processed row, in the resultset, include a column that indicates whether the row should be returned or not
4) return only the rows that have the indicator column set
A query something like this:
SELECT t.code
, t.imei
, t.acc
, t.datetime
FROM ( SELECT IF(s.imei=@prev_imei AND s.acc=1 AND @prev_acc=0,1,0) AS ret
, s.code AS code
, @prev_imei := s.imei AS imei
, @prev_acc := s.acc AS acc
, s.datetime AS datetime
FROM (SELECT @prev_imei := NULL, @prev_acc := NULL) i
CROSS
JOIN `status` s
WHERE s.imei = '123456789012345'
ORDER BY s.imei, s.datetime, s.code
) t
WHERE t.ret = 1
(I can unpack that a bit, to explain how it works.)
But the big drawback of this approach is that it requires MySQL to materialize the inline view as a derived table (temporary MyISAM table). If there was no predicate (WHERE clause) on the status table, the inline view would essentially be a copy of the entire status table. And with MySQL 5.5 and earlier, that derived table won't be indexed. So, this could present a performance issue for large sets.
Including predicates (e.g. WHERE s.imei = '123456789') to limit the rows from the status table in the inline view query may sufficiently limit the size of the temporary MyISAM table.
The other gotcha with this approach is that the behavior of user-defined variables in the statement is not guaranteed. But we do observe a consistent behavior, which we can make use of; it does work, but the MySQL documentation warns that the behavior is not guaranteed.
Here's a rough overview of how MySQL processes this query.
First, MySQL runs the query for the inline view aliased as i. We don't really care what this query returns, except that we need it to return exactly one row, because of the JOIN operation. What we care about is the initialization of the two MySQL user-defined variables, @prev_imei and @prev_acc. Later, we are going to use these user-defined variables to "preserve" the values from the previously processed row, so we can compare those values to the current row.
The rows from the status table are processed in sequence, according to the ORDER BY clause. (This may change in some future release, but we can observe that it works like this in MySQL 5.1 and 5.5.)
For each row, we compare the values of imei and acc from the current row to the values preserved from the previous row. If the boolean in the IF expression evaluates to TRUE, we return a 1, to indicate that this row should be returned. Otherwise, we return a 0, to indicate that we don't want to return this row. (For the first row processed, we previously initialized the user-defined variables to NULL, so the IF expression will evaluate to 0.)
The @prev_imei := s.imei and @prev_acc := s.acc expressions assign the values from the current row to the user-defined variables, so they will be available for the next row processed.
Note that it's important that the tests of the user-defined variables (the first expression in the SELECT list) come before we overwrite the previous values with the values from the current row.
We can run just the query from the inline view t, to observe the behavior.
The outer query returns rows from the inline view that have the derived ret column set to a 1, rows that we wanted to return.
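To observe the behavior described above, the inline view t can be run by itself, roughly as sketched here. It uses the table and sample data from the question and relies on the same evaluation order of user-defined variables, which the MySQL documentation does not guarantee. Assigning user variables inside expressions is also deprecated in recent MySQL 8.0 releases, so treat this as an illustration for older versions.

```sql
-- The inline view t on its own, so the ret flag can be inspected per row.
-- ret is tested BEFORE the variables are overwritten with the current row.
SELECT IF(s.imei = @prev_imei AND s.acc = 1 AND @prev_acc = 0, 1, 0) AS ret
     , s.code                AS code
     , @prev_imei := s.imei  AS imei
     , @prev_acc  := s.acc   AS acc
     , s.datetime            AS datetime
  FROM (SELECT @prev_imei := NULL, @prev_acc := NULL) i   -- one-row initializer
 CROSS
  JOIN `status` s
 WHERE s.imei = '123456789012345'
 ORDER BY s.imei, s.datetime, s.code;
```

Rows where ret is 1 are the ones the outer query keeps.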

select * from status s1
WHERE
ACC = 1
AND code = (SELECT MIN(CODE) FROM status WHERE acc = 1 and IMEI = s1.IMEI)
AND EXISTS (SELECT * FROM status WHERE IMEI = s1.IMEI AND ACC = 0)
AND IMEI = 123456789012345

SELECT b.code,b.imei,b.acc,b.datetime
FROM
( SELECT x.*
, COUNT(*) rank
FROM status x
JOIN status y
ON y.imei = x.imei
AND y.datetime <= x.datetime
GROUP
BY x.code
) a
JOIN
( SELECT x.*
, COUNT(*) rank
FROM status x
JOIN status y
ON y.imei = x.imei
AND y.datetime <= x.datetime
GROUP
BY x.code
) b
ON b.imei = a.imei
AND b.rank = a.rank + 1
WHERE b.acc = 1
AND a.acc = 0;

You can use a regular IN() and then group any duplicates (you could also use a LIMIT, but that would only work for one IMEI).
SETUP:
INSERT INTO `status`
VALUES
(1, 123456789012345, 0, '2014-07-09 10:00:00'),
(2, 453253453334445, 0, '2014-07-09 10:05:00'),
(3, 912841851252151, 0, '2014-07-09 10:08:00'),
(4, 123456789012345, 1, '2014-07-09 10:10:00'),
(5, 123456789012345, 1, '2014-07-09 10:15:00'),
(6, 123456789012345, 1, '2014-07-09 10:15:00'),
(7, 453253453334445, 1, '2014-07-09 10:15:00');
QUERY:
SELECT * FROM status
WHERE ACC = 1 AND IMEI IN(
SELECT DISTINCT IMEI FROM status
WHERE ACC = 0)
GROUP BY imei;
RESULTS:
Works with multiple IMEIs that have a 0 followed by a 1.
EDIT:
If you would like to go by the date entered as well, you can order by date first and then group.
SELECT * FROM(
SELECT * FROM status
WHERE ACC = 1 AND IMEI IN(
SELECT DISTINCT IMEI FROM status
WHERE ACC = 0)
ORDER BY datetime
) AS t
GROUP BY imei;

Related

Get most recent result from a LEFT JOIN column

I'm creating a custom forum from scratch and I'm attempting to use some LEFT JOIN queries to get information such as total posts, total threads and most recent thread. I've managed to get the data but the recent thread keeps returning a random value rather than the most recent thread.
CREATE TABLE forum_categories
(`name` varchar(18), `label` varchar(52), `id` int)
;
INSERT INTO forum_categories
(`name`, `label`, `id`)
VALUES
('General Discussion', 'Talk about anything and everything Digimon!', 1),
('Deck Discussion', 'Talk about Digimon TCG Decks and Strategies!', 2),
('Card Discussion', 'Talk about Digimon TCG Cards!', 3),
('Website Feedback', 'A place to discuss and offer feedback on the website', 4)
;
CREATE TABLE forum_topics
(`name` varchar(18), `id` int, `parent_id` int, `author_id` int, date date)
;
INSERT INTO forum_topics
(`name`, `id`, `parent_id`, `author_id`, `date`)
VALUES
('My First Topic', 1, 1, 16, '2021-03-29'),
('My Second Topic', 2, 1, 16, '2021-03-30')
;
CREATE TABLE forum_topics_content
(`id` int, `topic_id` int, `author_id` int, date datetime, `content` varchar(300))
;
INSERT INTO forum_topics_content
(`id`, `topic_id`, `author_id`, `date`, `content`)
VALUES
(1, 1, 16, '2021-03-29 15:46:55', 'Hey guys! This is my first post!'),
(2, 1, 16, '2021-03-30 08:05:13', 'This is my first topic reply!')
;
My Query:
SELECT forum_categories.name, label, forum_categories.id, COUNT(DISTINCT(forum_topics.id)) as 'topics', COUNT(DISTINCT(forum_topics_content.id)) as 'posts', SUBSTRING(forum_topics.name,1, 32) as 'thread'
FROM forum_categories
LEFT JOIN forum_topics ON forum_categories.id = forum_topics.parent_id
LEFT JOIN forum_topics_content ON forum_topics.id = forum_topics_content.topic_id
GROUP BY forum_categories.id
ORDER BY forum_categories.id, forum_topics.date DESC
I figured having an ORDER BY of forum_topics.date DESC would work for me and output the most recent thread which is "My Second Topic" but it doesn't.
I'm a bit stumped and have tried different variations of ORDER BY to no avail.
thread keeps returning a random result from the two possible results.
Full example with data is available on this fiddle: https://www.db-fiddle.com/f/auDzUABaEpYzLKDkRqE7ok/0
The desired result would be 'thread' always being the latest thread, which in this example is "My Second Topic". However, it always seems to pick randomly between "My First Topic" and "My Second Topic".
The output for the first row should always be:
'General Discussion', 'Talk about anything and everything Digimon!', 1, 2, 2, 'My Second Topic'
thread keeps returning a random result from the two possible results.
The provided query is simply nondeterministic and equivalent to:
SELECT forum_categories.name,
forum_categories.label,
forum_categories.id,
COUNT(DISTINCT(forum_topics.id)) as 'topics',
COUNT(DISTINCT(forum_topics_content.id)) as 'posts',
SUBSTRING(ANY_VALUE(forum_topics.name),1, 32) as 'thread'
FROM forum_categories
LEFT JOIN forum_topics ON forum_categories.id = forum_topics.parent_id
LEFT JOIN forum_topics_content ON forum_topics.id = forum_topics_content.topic_id
GROUP BY forum_categories.id,forum_categories.name,forum_categories.label
ORDER BY forum_categories.id, ANY_VALUE(forum_topics.date) DESC;
Assuming that forum_categories.id is the PRIMARY KEY, name and label are functionally dependent on it, but each of the remaining columns is simply ANY_VALUE.
If a column in the SELECT list is neither functionally dependent on the GROUP BY columns nor wrapped in an aggregate function, the query is incorrect. On MySQL 8.0, or whenever ONLY_FULL_GROUP_BY is enabled, the result is an error.
Related: Group by clause in mySQL and postgreSQL, why the error in postgreSQL?
There are different ways to achieve the desired result (correlated subqueries, window functions, LIMIT, and so on).
Here using GROUP_CONCAT:
SELECT forum_categories.name,
forum_categories.label,
forum_categories.id,
COUNT(DISTINCT(forum_topics.id)) as `topics`,
COUNT(DISTINCT(forum_topics_content.id)) as `posts`,
SUBSTRING_INDEX(GROUP_CONCAT(SUBSTRING(forum_topics.name,1,32)
ORDER BY forum_topics.`date` DESC
SEPARATOR '~'),
'~',1) AS `thread`
FROM forum_categories
LEFT JOIN forum_topics ON forum_categories.id = forum_topics.parent_id
LEFT JOIN forum_topics_content ON forum_topics.id = forum_topics_content.topic_id
GROUP BY forum_categories.id,forum_categories.name,forum_categories.label
ORDER BY forum_categories.id;
How it works:
GROUP_CONCAT is an aggregate function that concatenates strings while preserving the requested order:
My Second Topic~My First Topic~My First Topic
Then SUBSTRING_INDEX returns the part of the string up to the first occurrence of the delimiter ~.
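As a small standalone check of that last step, SUBSTRING_INDEX can be applied to the concatenated string shown above. The literal is simply copied from the example and the alias thread is reused for illustration.

```sql
-- SUBSTRING_INDEX with a positive count keeps everything before the
-- count-th occurrence of the delimiter, here the first '~'.
SELECT SUBSTRING_INDEX('My Second Topic~My First Topic~My First Topic', '~', 1) AS thread;
-- thread: 'My Second Topic'
```

GROUP_CONCAT output is capped by the group_concat_max_len system variable (1024 by default). That is harmless here, since only the first element of at most 32 characters is kept.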
db<>fiddle demo
In your fiddle you have:
SET SESSION sql_mode = '';
You should change that to:
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';
You will get an error like this:
Query Error: Error: ER_WRONG_FIELD_WITH_GROUP: Expression #1 of
SELECT list is not in GROUP BY clause and contains nonaggregated
column 'test.forum_categories.name' which is not functionally
dependent on columns in GROUP BY clause; this is incompatible with
sql_mode=only_full_group_by
It is stated in the docs that:
ONLY_FULL_GROUP_BY
Reject queries for which the select list, HAVING condition, or ORDER
BY list refer to nonaggregated columns that are neither named in the
GROUP BY clause nor are functionally dependent on (uniquely determined
by) GROUP BY columns.
As of MySQL 5.7.5, the default SQL mode includes ONLY_FULL_GROUP_BY.
(Before 5.7.5, MySQL does not detect functional dependency and
ONLY_FULL_GROUP_BY is not enabled by default. For a description of
pre-5.7.5 behavior, see the MySQL 5.6 Reference Manual.)
And there is a very good reason why they did that.
see: why should not disable only full group by

Sql select where array in column

In my query I use join table category_attributes. Let's assume we have such rows:
category_id|attribute_id
1|1
1|2
1|3
I want a query which suits the following two needs. I have a variable (PHP) holding an array of allowed attribute_id values. If the array is a subset of the category's attribute_id values, then the category_id should be selected; if not, there should be no results.
First case:
select * from category_attributes where (1,2,3,4) in category_attributes.attribute_id
should give no results.
Second case
select * from category_attributes where (1,2,3) in category_attributes.attribute_id
should give all three rows (see dummy rows at the beginning).
So I would like the reverse of what the standard SQL IN does.
Solution
Step 1: Group the data by the field you want to check.
Step 2: Left join the list of required values with the records obtained in the previous step.
Step 3: Now we have a list of required values and the corresponding values from the table. The second column will equal the required value if it exists in the table and NULL otherwise.
Count the NULL values in the right column. If the count is 0, the table contains all the required values; in that case return all records from the table. Otherwise at least one required value is missing from the table, so return no records.
Sample
Table "Data":
Required values:
10, 20, 50
Query:
SELECT *
FROM Data
WHERE (SELECT Count(*)
FROM (SELECT D.value
FROM (SELECT 10 AS value
UNION
SELECT 20 AS value
UNION
SELECT 50 AS value) T
LEFT JOIN (SELECT value
FROM Data
GROUP BY value) D
ON ( T.value = D.value )) J
WHERE value IS NULL) = 0;
You can use group by and having:
select ca.category_id
from category_attributes ca
where ca.attribute_id in (1, 2, 3, 4)
group by ca.category_id
having count(*) = 4; -- "4" is the size of the list
This assumes that the table has no duplicates (which is typical for attribute mapping tables). If that is a possibility, use:
having count(distinct ca.attribute_id) = 4
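Applied to the rows from the question, the GROUP BY / HAVING idea can be sketched as below. The derived-table alias ok and the join back to the base table (to return full rows rather than just category_id) are illustrative additions, and the column types are assumed.

```sql
-- Hypothetical setup mirroring the rows from the question.
CREATE TABLE category_attributes (category_id INT, attribute_id INT);
INSERT INTO category_attributes VALUES (1, 1), (1, 2), (1, 3);

-- Return every row of a category that has ALL listed attributes.
SELECT ca.*
FROM category_attributes ca
JOIN (
    SELECT category_id
    FROM category_attributes
    WHERE attribute_id IN (1, 2, 3)
    GROUP BY category_id
    HAVING COUNT(DISTINCT attribute_id) = 3   -- 3 = size of the list
) ok ON ok.category_id = ca.category_id;
```

With the list (1, 2, 3) this returns all three rows. With (1, 2, 3, 4) and a required count of 4, only three distinct attributes match, so no rows come back, which is the behavior the question asks for.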
You can aggregate attribute_id into a comma-separated string and compare it with the array from PHP.
SELECT category_id FROM
(select category_id, group_concat(attribute_id order by attribute_id) as attributes from category_attributes
group by category_id) t WHERE t.attributes = '1,2,3';
But you need to make sure that the array from PHP is always sorted the same way, or find another way to compare the arrays.

How get TIMEDIFF of each row's date field with the closest previous date row without a subquery?

I need to calculate the TIMEDIFF between a row and the row whose field dateCompleted is the last one just before this one and then get the value as timeSinceLast.
I can do this easily as a subquery but it's very slow. (About 12-15 times slower than a straight query on the table for just the rows).
#Very slow
Select a.*, TIMEDIFF(a.dateCompleted, (SELECT a2.dateCompleted FROM action a2 WHERE a2.dateCompleted < a.dateCompleted ORDER BY a2.dateCompleted DESC LIMIT 1)) as timeSinceLast
FROM action a;
I tried doing it as a join with itself but couldn't figure out how to get that work as I don't know how to do a LIMIT 1 on the join table and not the query as a whole.
#How limit the join table only?
SELECT a.*, TIMEDIFF(a.dateCompleted, a2.dateCompleted)
FROM action a
LEFT JOIN action a2 on a2.dateCompleted < a.dateCompleted
LIMIT 1;
Is this possible in MySQL?
EDIT: Schema and data
http://sqlfiddle.com/#!9/03b5c/3
create table Actions
( 
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
dateCompleted datetime not null
);
#Notice, they can come out of order.
# The third one would affect the first one in my query as
# it's the first completed date right after the first
insert into Actions (dateCompleted)
values ("2016-05-06 12:11:01");
insert into Actions (dateCompleted)
values ("2016-05-06 12:11:03");
insert into Actions (dateCompleted)
values ("2016-05-06 12:11:02");
insert into Actions (dateCompleted)
values ("2016-05-06 12:11:05");
insert into Actions (dateCompleted)
values ("2016-05-06 12:11:04");
Result (order by dateCompleted):
id dateCompleted timeSinceLast
1, "2016-05-06 12:11:01", null
3, "2016-05-06 12:11:02", 1
2, "2016-05-06 12:11:03", 1
5, "2016-05-06 12:11:04", 1
4, "2016-05-06 12:11:05", 1
(In this simple example, they all had a one second time since the next one)
SELECT x.*
, MIN(TIMEDIFF(x.datecompleted,y.datecompleted))
FROM actions x
LEFT
JOIN actions y
ON y.datecompleted < x.datecompleted
GROUP
BY x.id
ORDER
BY x.datecompleted;
...or faster...
SELECT x.*
, TIMEDIFF(datecompleted,@prev)
, @prev:=datecompleted
FROM actions x
, (SELECT @prev:=null) vars
ORDER
BY datecompleted;
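If MySQL 8.0 or later is available (an assumption, since the fiddle in the question runs an older version), the previous dateCompleted can also be read with the LAG() window function, avoiding both the self-join and user-defined variables. This is a sketch against the Actions table from the question.

```sql
-- Sketch for MySQL 8.0+: LAG() returns dateCompleted of the preceding row
-- in dateCompleted order; it is NULL for the earliest row.
SELECT a.id,
       a.dateCompleted,
       TIMEDIFF(a.dateCompleted,
                LAG(a.dateCompleted) OVER (ORDER BY a.dateCompleted)) AS timeSinceLast
FROM Actions a
ORDER BY a.dateCompleted;
```

TIMEDIFF returns a TIME value, so the one-second gaps in the sample data appear as 00:00:01 rather than 1.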

MySql IN clauses, trying to match IN list of tuples

I am trying to select duplicate records based on a match of three columns. The list of triples could be very long (1000), so I would like to make it concise.
When I have a list of size 10 (known duplicates) it only matches 2 (seemingly random ones) and misses the other 8. I expected 10 records to return, but only saw 2.
I've narrowed it down to this problem:
This returns one record. Expecting 2:
select *
from ali
where (accountOid, dt, x) in
(
(64, '2014-03-01', 10000.0),
(64, '2014-04-23', -122.91)
)
Returns two records, as expected:
select *
from ali
where (accountOid, dt, x) in ( (64, '2014-03-01', 10000.0) )
or (accountOid, dt, x) in ( (64, '2014-04-23', -122.91) )
Any ideas why the first query only returns one record?
I'd suggest you don't use IN() for this, instead use a where exists query, e.g.:
CREATE TABLE inlist
(`id` int, `accountOid` int, `dt` datetime, `x` decimal(18,4))
;
INSERT INTO inlist
(`id`, `accountOid`, `dt`, `x`)
VALUES
(1, 64, '2014-03-01 00:00:00', 10000.0),
(2, 64, '2014-04-23 00:00:00', -122.91)
;
select *
from ali
where exists ( select null
from inlist
where ali.accountOid = inlist.accountOid
and ali.dt = inlist.dt
and ali.x = inlist.x
)
;
I was able to reproduce a problem (compare http://sqlfiddle.com/#!2/7d2658/6 to http://sqlfiddle.com/#!2/fe851/1 both MySQL 5.5.3) where if the x column was numeric and the value negative it was NOT matched using IN() but was matched when either numeric or decimal using a table and where exists.
Perhaps not a conclusive test but personally I wouldn't have used IN() for this anyway.
Why are you not determining the duplicates this way?
select
accountOid
, dt
, x
from ali
group by
accountOid
, dt
, x
having
count(*) > 1
Then use that as a derived table within the where exists condition:
select *
from ali
where exists (
select null
from (
select
accountOid
, dt
, x
from ali
group by
accountOid
, dt
, x
having
count(*) > 1
) as inlist
where ali.accountOid = inlist.accountOid
and ali.dt = inlist.dt
and ali.x = inlist.x
)
see http://sqlfiddle.com/#!2/ede292/1 for the query immediately above

How to add parameter to aggregate function in report dataset

Thank you for coming to look at my question.
I have an SQL group by function which I'd like to add parameters to. (If that's possible)
I've tried to splice the parameters, two columns from the table into the function but I don't seem to get it right.
This function creates a table that counts records, I would like to be able to filter with parameters by 'Team' and 'Location'.
How would I go about adding this information to the dataset to allow me to filter?
I would normally add them using:
select
i.Team
,i.Location
From
incident i
Where i.Team in (@Team)
and i.Location in (@Location)
The table is called incident and all the information is from the same table.
I would very much appreciate an idea to do this. Thank you.
Oh, and I'm using Report Builder 3, with SQL 2008 R2
declare @st_date datetime;
declare @en_date datetime;
declare @days int;
declare @offset int;
set @en_date = (@en_datein);
set @offset = (@BrowserTimezoneOffset);
set @days = -6;
set @st_date = DATEADD(dd, @days, @en_date);
with daterange(dt) as
(select
@st_date dt
union all
select
DATEADD(dd, 1, dt) dt
from daterange
where dt <= DATEADD(dd, -1, @en_date)
)
select
left(DATENAME(dw, dt), 3) as weekday
,ISNULL(sum(inc.createdc), 0) as createdcount
,ISNULL(sum(inr.resolvedclosedc), 0) as resolvedclosedcount
from daterange left outer join
(select
left(DATENAME(dw,DATEADD(mi,@offset,CreatedDateTime)), 3) as createddatetime
,count(recid) as createdc
from Incident
where DATEADD(mi,@offset,CreatedDateTime) >= @st_date
and DATEADD(mi,@offset,CreatedDateTime) <= @en_date
group by left(DATENAME(dw, DATEADD(mi,@offset,CreatedDateTime)), 3)
) as inc
on inc.CreatedDateTime = left(DATENAME(dw, dt), 3)
left outer join
(select
left(DATENAME(dw, DATEADD(mi,@offset,ResolvedDateTime)), 3) as ResolvedDateTime
,count(case when status in ('Resolved', 'Closed') then 1 end) as resolvedclosedc
from Incident
where DATEADD(mi,@offset,ResolvedDateTime) between @st_date and @en_date
group by left(DATENAME(dw, DATEADD(mi,@offset,ResolvedDateTime)), 3)
) as inr
on inr.ResolvedDateTime = left(DATENAME(dw, dt), 3)
group by dt
order by dt
When a parameter takes one or many values, you can tie it to a dataset as well.
Say I have orders and people, and I want to find the orders of only certain people. I would follow a few steps:
1. Create a dataset just for the parameter and call it 'People'. For this example, let's use a self-contained table variable and place it in the 'Query' box of the dataset.
declare @People Table ( personID int identity, person varchar(8));
insert into @People values ('Brett'),('Sean'),('Chad'),('Michael')
,('Ray'),('Erik'),('Queyn');
select * From @People
2. Set up the dependency next: a parameter @Person, which I define as an Integer with 'Allow multiple values' checked. I then choose 'Available Values' in the left pane of the parameter, choose 'Get values from a query', pick my 'People' dataset from step 1, and choose personID as the Value field and person as the Label field.
3. Now my parameter is bound and I can move on to my orders set. Again create a dataset, but call this one 'OrdersMain', and use a self-contained table variable, this time adding a predicate that references the parameter from above.
declare @Orders table ( OrderID int identity, PersonID int, Desciption varchar(32), Amount int);
insert into @Orders values (1, 'Shirt', 20),(1, 'Shoes', 50),(2, 'Shirt', 22),
(2, 'Shoes', 52),(3, 'Shirt', 20),(3, 'Shoes', 50),(3, 'Hat', 20),
(4, 'Shirt', 20),(5, 'Shirt', 20),(5, 'Pants', 30), (6, 'Shirt', 20),
(6, 'RunningShoes', 70),(7, 'Shirt', 22),(7, 'Shoes', 40),(7, 'Coat', 80)
Select * from @Orders where PersonID in (@Person)
4. Now if I populate my report with a tablix and put the values from 'OrdersMain' in it, the user is prompted with the labels Brett, Sean, etc., but the ID is used to limit the scope of the orders dataset.
Optional
You can repeat step 1 for a SUBSET of people in another dataset and call it 'Defaults'. Then, extending step 2, leave everything as is but add this new dataset under 'Default Values' with 'Get values from a query'. This way I could create a temp table holding the people I use most often and set them as the defaults. This would make the report execute automatically when called.
Filtering can mean other things in SSRS as well. On any dataset you can see a 'filter' option in the left pane and apply one. Keep in mind that this evaluates the whole dataset query first and then filters the result, so IMHO it is best used with shared datasets that are rather small and fast. You can also use filters on tablix elements, which is often good when you want three objects built from the same set but with different predicates: they are evaluated after the query runs, which limits scope while reusing one dataset for many objects.