Get a query to list the records that are on and in between the start and the end values of a particular column for the same Id - mysql

There is a table with the columns :
USE 'table';
insert into person values
('11','xxx','1976-05-10','p1'),
('11','xxx ','1976-06-11','p1'),
('11','xxx ','1976-07-21','p2'),
('11','xxx ','1976-08-31','p2'),
Can anyone suggest a query that returns the start and end date of each stay, i.e. each chronological run of rows in which the person is at the same place?
The query I wrote:
SELECT PId,Name,min(Start_Date) as sdt, max(Start_Date) as edt, place
from **
group by Place;
only gives me the first two rows of the result I expect. Can anyone suggest a query?

This isn't pretty, and performance might be horrible, but at least it works:
select min(sdt), edt, place
from (
select A.Start_Date sdt, max(B.Start_Date) edt, A.place
from person A
inner join person B on A.place = B.place
and A.Start_Date <= B.Start_Date
left join person C on A.place != C.place
and A.Start_Date < C.Start_Date
and C.Start_Date < B.Start_Date
where C.place is null
group by A.Start_Date, A.place
) X
group by edt, place
The idea is that A and B represent all pairs of rows. C will be any row in between these two which has a different place. So after the C.place is null restriction, we know that A and B belong to the same range, i.e. a group of rows for one place with no other place in between them in chronological order.
From all these pairs, we want to identify those with maximal range, i.e. those which encompass all others. We do so using two nested group by queries. The inner one chooses the maximal end date for every possible start date, whereas the outer one chooses the minimal start date for every possible end date.
The results are maximal ranges of chronologically consecutive rows describing the same place.
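On MySQL 8.0 or later, the same maximal ranges can probably be found with window functions instead of a triple self-join. The sketch below is not the query from this answer: it swaps in the row-number-difference technique and runs it against an in-memory SQLite database from Python so that it is self-contained. The function name place_ranges and the last two sample rows (a return to p1) are assumptions added for illustration.

```python
import sqlite3

# The first four rows come from the question; the last two are hypothetical,
# added so that p1 appears in two separate chronological runs.
ROWS = [
    ("11", "xxx", "1976-05-10", "p1"),
    ("11", "xxx", "1976-06-11", "p1"),
    ("11", "xxx", "1976-07-21", "p2"),
    ("11", "xxx", "1976-08-31", "p2"),
    ("11", "xxx", "1976-09-15", "p1"),
    ("11", "xxx", "1976-10-20", "p1"),
]

# Rows in the same run share the same difference between their overall
# row number and their row number within the place.
QUERY = """
SELECT MIN(Start_Date) AS sdt, MAX(Start_Date) AS edt, Place
FROM (
    SELECT Start_Date, Place,
           ROW_NUMBER() OVER (ORDER BY Start_Date)
         - ROW_NUMBER() OVER (PARTITION BY Place ORDER BY Start_Date) AS grp
    FROM person
) t
GROUP BY Place, grp
ORDER BY sdt
"""


def place_ranges(rows):
    """Return (start, end, place) for each chronological run of one place."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE person (PId TEXT, Name TEXT, Start_Date TEXT, Place TEXT)"
    )
    conn.executemany("INSERT INTO person VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in place_ranges(ROWS):
        print(row)
```

With the sample rows this yields one range for the first p1 stay, one for p2 and one for the second p1 stay, which a plain GROUP BY Place would merge.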

This can be achieved by:
SELECT Id, PId,
MIN(Start_Date) AS sdt,
MAX(Start_Date) as edt,
IF(`place` <> @var_place_prev, (@var_rank:= @var_rank + 1), @var_rank) AS rank,
(@var_place_prev := `place`) AS `place`
FROM person, (SELECT @var_rank := 0, @var_place_prev := "") dummy
GROUP BY rank, Place;
Example: SQLFiddle
If you want the records to be ordered by ID, then:
SELECT Id, PId,
MIN(Start_Date) AS sdt,
MAX(Start_Date) as edt,
`place`
FROM(
SELECT Id, PId,
Start_Date,
IF(`place` <> @var_place_prev,(@var_rank:= @var_rank + 1),@var_rank) AS rank,
(@var_place_prev := `place`) AS `place`
FROM person, (SELECT @var_rank := 0, @var_place_prev := "") dummy
ORDER BY ID ASC
) a
GROUP BY rank, Place;

Related

Grouping rows via two different columns in MYSQL

I want to ask whether it is possible to group rows by a value that is shared across two different columns.
In my scenario we should sum up the total minutes when records form "continuous" transactions, i.e. when the STARTDATETIME of a row equals the ENDDATETIME of the previous row. See image link below for reference.
Thanks guys.
I modified Gordon Linoff's solution (see my comment under the question):
SELECT
c.employee_id
,MIN(c.start_date) AS start_date
,MAX(c.end_date) AS end_date
,COUNT(*) AS numcontracts,
TIMESTAMPDIFF(minute,MIN(c.start_date),MAX(c.end_date)) AS timediff
FROM
(
SELECT
c0.*
,(@rn := @rn + COALESCE(startflag, 0)) AS cumestarts
FROM
(SELECT c1.*,
(NOT EXISTS (SELECT 1
FROM contracts c2
WHERE c1.employee_id = c2.employee_id AND
c1.start_date = c2.end_date
)
) AS startflag
FROM contracts c1
ORDER BY employee_id, start_date
) c0 CROSS JOIN (SELECT @rn := 0) params
) c
GROUP BY c.employee_id, c.cumestarts
http://rextester.com/VOGMU19779
timediff contains the minutes passed in the combined interval.
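For readers on MySQL 8.0 or later, the start-flag idea can be sketched without user variables. The example below is an assumption-laden illustration, not the query above: it uses LAG to compare each row with the previous row of the same employee (rather than NOT EXISTS against any row), a running SUM of the flags as the group key, and SQLite via Python so it runs standalone. The sample rows, the function name merge_contracts and the output aliases first_start and last_end are hypothetical.

```python
import sqlite3

# Hypothetical sample data: (employee_id, start_date, end_date).
CONTRACTS = [
    (1, "2018-01-01 08:00:00", "2018-01-01 09:00:00"),
    (1, "2018-01-01 09:00:00", "2018-01-01 09:30:00"),  # continues the row above
    (1, "2018-01-01 10:00:00", "2018-01-01 10:15:00"),  # gap, so a new group
    (2, "2018-01-01 08:00:00", "2018-01-01 08:45:00"),
]

QUERY = """
SELECT employee_id,
       MIN(start_date) AS first_start,
       MAX(end_date)   AS last_end,
       COUNT(*)        AS numcontracts,
       (CAST(strftime('%s', MAX(end_date)) AS INTEGER)
        - CAST(strftime('%s', MIN(start_date)) AS INTEGER)) / 60 AS timediff
FROM (
    SELECT f.*,
           SUM(startflag) OVER (PARTITION BY employee_id
                                ORDER BY start_date) AS cumestarts
    FROM (
        SELECT c.*,
               CASE WHEN start_date = LAG(end_date) OVER (
                             PARTITION BY employee_id ORDER BY start_date)
                    THEN 0 ELSE 1 END AS startflag
        FROM contracts c
    ) f
) g
GROUP BY employee_id, cumestarts
ORDER BY employee_id, first_start
"""


def merge_contracts(rows):
    """Merge back-to-back contracts and report minutes per merged interval."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE contracts (employee_id INTEGER, start_date TEXT, end_date TEXT)"
    )
    conn.executemany("INSERT INTO contracts VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in merge_contracts(CONTRACTS):
        print(row)
```

In MySQL the minutes column would use TIMESTAMPDIFF as in the answer; the strftime arithmetic here is only the SQLite stand-in.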

Mysql - Accumulatively count the total on a row by row basis

I'm trying to count, in MySQL, the number of users created each day and then get a cumulative figure on a row-by-row basis. I have followed other suggestions on here, but I cannot get the accumulation to come out right.
The problem is that it keeps counting from the base number of 200 and does not take the previous rows into account.
Whereas I would expect it to return
My SQL is as follows:
SELECT day(created_at), count(*), (@something := @something+count(*)) as value
FROM myTable
CROSS JOIN (SELECT @something := 200) r
GROUP BY day(created_at);
To create the table and populate it you can use;
CREATE TABLE myTable (
id INT AUTO_INCREMENT,
created_at DATETIME,
PRIMARY KEY (id)
);
INSERT INTO myTable (created_at)
VALUES ('2018-04-01'),
('2018-04-01'),
('2018-04-01'),
('2018-04-01'),
('2018-04-02'),
('2018-04-02'),
('2018-04-02'),
('2018-04-03'),
('2018-04-03');
You can view this on SqlFiddle.
Use a subquery:
SELECT day, cnt, (@s := @s + cnt)
FROM (SELECT day(created_at) as day, count(*) as cnt
FROM myTable
GROUP BY day(created_at)
) d CROSS JOIN
(SELECT @s := 0) r;
GROUP BY and variables have not worked together for a long time. In more recent versions, ORDER BY also needs a subquery.
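Where window functions are available (MySQL 8.0 or later), the running total can be written without variables at all. The following is a minimal sketch of that alternative, run on SQLite from Python so it is self-contained. It reuses the table and dates from the question; the function name cumulative_counts and the base parameter are assumptions for illustration, and date(created_at) stands in for MySQL's day(created_at).

```python
import sqlite3

# The nine creation dates from the question's INSERT statement.
DATES = (
    ["2018-04-01"] * 4
    + ["2018-04-02"] * 3
    + ["2018-04-03"] * 2
)

# Aggregate per day first, then accumulate over the per-day counts.
QUERY = """
SELECT day, cnt, ? + SUM(cnt) OVER (ORDER BY day) AS running
FROM (
    SELECT date(created_at) AS day, COUNT(*) AS cnt
    FROM myTable
    GROUP BY date(created_at)
) d
ORDER BY day
"""


def cumulative_counts(dates, base=0):
    """Return (day, count, running total starting from base) per day."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE myTable (id INTEGER PRIMARY KEY AUTOINCREMENT, created_at TEXT)"
    )
    conn.executemany(
        "INSERT INTO myTable (created_at) VALUES (?)", [(d,) for d in dates]
    )
    result = conn.execute(QUERY, (base,)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in cumulative_counts(DATES, base=200):
        print(row)
```

As in the subquery answer, the grouping happens in an inner query and the accumulation in the outer one, so the two steps never interfere.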

Mysql query is really slow. How do I increase the speed of the query?

I want to get the latest results for my patients. The following SQL returns 69,000 rows after 87 seconds in MySQL Workbench. I have indexed both the 'date' and 'patientid' columns.
select Max(date) as MaxDate, PatientID
from assessment
group by PatientID
I think my table has approximately 440,000 rows in total. Is it slow because my table is 'large'?
Is there a way to speed up this query? I will have to embed it inside other queries, for example:
select aa.patientID, assessment.Date, assessment.result
from assessment
inner join
(select Max(date) as MaxDate, PatientID
from assessment
group by PatientID) as aa
on aa.patientID = assessment.patientID and aa.MaxDate = assessment.Date
The above gives me the latest assessment result for each patient. I will then embed this piece of code in further queries, so I really need to speed things up. Can anyone help?
I wonder if this version would have better performance with the right indexes:
select a.patientID, a.Date, a.result
from assessment a
where a.date = (select aa.date
from assessment aa
where aa.patientID = a.patientID
order by aa.date desc
limit 1
);
Then you want an index on assessment(patientID, date).
EDIT:
Another approach uses an index on assessment(patient_id, date, result):
select a.*
from (select a.patient_id, a.date, a.result,
(@rn := if(@p = a.patient_id, @rn + 1,
if(@p := a.patient_id, 1, 1)
)
) as rn
from assessment a cross join
(select @p := -1, @rn := 0) params
order by patient_id desc, date desc
) a
where rn = 1;
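On MySQL 8.0 or later, the row-numbering that the variables emulate is available directly as ROW_NUMBER(). The sketch below shows that variant under stated assumptions: it runs on SQLite from Python, uses the column names from the question (patientID, date, result), and the sample rows and the function name latest_assessments are hypothetical. It says nothing about speed on the real table, which still depends on an index such as (patientID, date).

```python
import sqlite3

# Hypothetical sample data: (patientID, date, result).
ASSESSMENTS = [
    (1, "2020-01-01", "A"),
    (1, "2020-03-01", "B"),
    (2, "2020-02-01", "C"),
]

# Number each patient's rows from newest to oldest and keep the newest.
QUERY = """
SELECT patientID, date, result
FROM (
    SELECT a.*,
           ROW_NUMBER() OVER (PARTITION BY patientID ORDER BY date DESC) AS rn
    FROM assessment a
) ranked
WHERE rn = 1
ORDER BY patientID
"""


def latest_assessments(rows):
    """Return the most recent assessment row for each patient."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE assessment (patientID INTEGER, date TEXT, result TEXT)")
    conn.executemany("INSERT INTO assessment VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_assessments(ASSESSMENTS):
        print(row)
```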

Complicated Query

I'm not sure if the following can be done using a mere select statement, but I have two tables (truncated with the data necessary to the problem).
Inventory Item
id int (PRIMARY)
quantity int
Stock - Contains changes in the stock of the inventory item (stock history)
id int (PRIMARY)
inventory_item_id int (FOREIGN KEY)
quantity int
created datetime
The quantity in stock is the change in stock, while the quantity in inventory item is the current quantity of that item
EVERYTHING IN THE running COLUMN WILL RETURN 0
SELECT
inventory_item.id,
(inventory_item.quantity - SUM(stock.quantity)) AS running
FROM
stock
JOIN
inventory_item ON stock.inventory_item_id = inventory_item.id
GROUP BY inventory_item.id
THE QUESTION
Now, what I would like to know is: Is it possible to select all of the dates in the stock table where the running quantity of the inventory_item ever becomes zero using a SELECT?
I know this can be done programmatically by simply selecting all of the stock data in one item, and subtracting the stock quantity individually from the current inventory item quantity, which will get the quantity before the change in stock happened. Can I do this with a SELECT?
(Updated) Assuming there will never be more than one record for a given combination of inventory_item_id and created, try:
SELECT i.id,
s.created,
i.quantity - COALESCE(SUM(s2.quantity),0) AS running
FROM inventory_item i
JOIN stock s ON s.inventory_item_id = i.id
LEFT JOIN stock s2 ON s2.inventory_item_id = i.id and s.created < s2.created
GROUP BY i.id, s.created
HAVING running=0
My take on it:
select
inventory_item_id `item`,
created `when`
from
(select
@total := CASE WHEN @curr <> inventory_item_id
THEN quantity
ELSE @total+quantity END as running_total,
inventory_item_id,
created,
@curr := inventory_item_id
from
(select @total := 0) a,
(select @curr := -1) b,
(select inventory_item_id, created, quantity from stock order by inventory_item_id, created asc) c
) running_total
where running_total.running_total = 0;
This one has the relative advantage of making only one pass over the stock table. Depending on the size of the table and the indexes on it, that may or may not be a good thing.
The most logical way to do this is with a cumulative sum. But MySQL versions before 8.0 don't support that directly.
The clearest approach, in my opinion, is to use a correlated subquery to get the running quantity. Then it is a simple matter of a where clause to select where it is 0:
select s.*
from (select s.*,
(select SUM(s2.quantity)
from stock s2
where s2.inventory_item_id = s.inventory_item_id and
s2.created <= s.created
) as RunningQuantity
from stock s
) s
where RunningQuantity = 0;
I gave a similar answer, based on flagging rows by a running total, found here...
You can do this with MySQL @variables, but the data needs to be pre-queried and ordered by the date of activity... then set a flag on each row that makes the balance negative and keep only those. Something like
select
PreQuery.*
from
( select
s.id,
s.created,
@runBal := if( s.id = @lastID, @runBal - s.quantity, ii.quantity ) as CurBal,
@lastID := s.id as IDToCompareNextEntry
from
stock s
join inventory_item ii
on s.inventory_item_id = ii.id,
(select @lastID := -1,
@runBal := 0 ) sqlvars
order by
s.id,
s.created DESC ) PreQuery
where
PreQuery.CurBal < 0
This way, for each inventory item, it works backwards by created date (order by the created descending per ID). So, when the inventory ID changes, look to the inventory table "Quantity" field to START the tally of used stock down. If same ID as the last record processed, just use the running balance and subtract out the quantity of that stock entry.
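The backwards tally described above can also be sketched with a window function, under the question's definition that the current quantity minus every later stock change gives the quantity right after a given change. This is an illustrative alternative for MySQL 8.0 or later, run here on SQLite from Python; the sample rows and the function name zero_points are assumptions, and the filter is running = 0 as the question asks.

```python
import sqlite3

# Hypothetical data: item 1 currently holds 5 units after changes of
# +10, -10 and +5, so the running quantity was 0 after the second change.
ITEMS = [(1, 5)]  # (id, current quantity)
STOCK = [         # (inventory_item_id, quantity change, created)
    (1, 10, "2012-01-01 00:00:00"),
    (1, -10, "2012-01-02 00:00:00"),
    (1, 5, "2012-01-03 00:00:00"),
]

# For each change, subtract the sum of all later changes from the current
# quantity; the frame stops one row before the current one.
QUERY = """
SELECT id, created, running
FROM (
    SELECT i.id AS id, s.created AS created,
           i.quantity - COALESCE(SUM(s.quantity) OVER (
               PARTITION BY s.inventory_item_id
               ORDER BY s.created DESC
               ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS running
    FROM inventory_item i
    JOIN stock s ON s.inventory_item_id = i.id
) t
WHERE running = 0
ORDER BY id, created
"""


def zero_points(items, stock):
    """Return (item id, created, 0) for changes that left the item at zero."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE inventory_item (id INTEGER PRIMARY KEY, quantity INTEGER)")
    conn.execute(
        "CREATE TABLE stock (id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "inventory_item_id INTEGER, quantity INTEGER, created TEXT)"
    )
    conn.executemany("INSERT INTO inventory_item VALUES (?, ?)", items)
    conn.executemany(
        "INSERT INTO stock (inventory_item_id, quantity, created) VALUES (?, ?, ?)",
        stock,
    )
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(zero_points(ITEMS, STOCK))
```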
I believe this is a simple approach to this.
SELECT inventory_item.id, stock.created
FROM inventory_item
JOIN stock ON stock.inventory_item_id = inventory_item.id
WHERE (SELECT SUM(s2.quantity) FROM stock s2 WHERE s2.inventory_item_id = stock.inventory_item_id AND s2.created <= stock.created) = 0

Checking for maximum length of consecutive days which satisfy specific condition

I have a MySQL table with the structure:
beverages_log(id, users_id, beverages_id, timestamp)
I'm trying to compute the maximum streak of consecutive days during which a user (with id 1) logs a beverage (with id 1) at least 5 times each day. I'm pretty sure that this can be done using views as follows:
CREATE or REPLACE VIEW daycounts AS
SELECT count(*) AS n, DATE(timestamp) AS d FROM beverages_log
WHERE users_id = '1' AND beverages_id = 1 GROUP BY d;
CREATE or REPLACE VIEW t AS SELECT * FROM daycounts WHERE n >= 5;
SELECT MAX(streak) AS current FROM ( SELECT DATEDIFF(MIN(c.d), a.d)+1 AS streak
FROM t AS a LEFT JOIN t AS b ON a.d = ADDDATE(b.d,1)
LEFT JOIN t AS c ON a.d <= c.d
LEFT JOIN t AS d ON c.d = ADDDATE(d.d,-1)
WHERE b.d IS NULL AND c.d IS NOT NULL AND d.d IS NULL GROUP BY a.d) allstreaks;
However, repeatedly creating views for different users every time I run this check seems pretty inefficient. Is there a way in MySQL to perform this computation in a single query, without creating views or repeatedly calling the same subqueries a bunch of times?
This solution seems to perform quite well as long as there is a composite index on users_id and beverages_id -
SELECT *
FROM (
SELECT t.*, IF(@prev + INTERVAL 1 DAY = t.d, @c := @c + 1, @c := 1) AS streak, @prev := t.d
FROM (
SELECT DATE(timestamp) AS d, COUNT(*) AS n
FROM beverages_log
WHERE users_id = 1
AND beverages_id = 1
GROUP BY DATE(timestamp)
HAVING COUNT(*) >= 5
) AS t
INNER JOIN (SELECT @prev := NULL, @c := 1) AS vars
) AS t
ORDER BY streak DESC LIMIT 1;
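The streak counter can also be expressed without variables by using the observation that, within a run of consecutive days, the date minus its row number is constant. The sketch below illustrates that technique, which is not the one used in the query above, on SQLite from Python. The sample log, the helper sample_log and the function name longest_streak are hypothetical; julianday stands in for MySQL date arithmetic.

```python
import sqlite3


def sample_log():
    """Hypothetical log for user 1, beverage 1: entries per day."""
    counts = {
        "2018-04-01": 5, "2018-04-02": 5,
        "2018-04-03": 4,  # below the threshold, so it breaks the streak
        "2018-04-04": 5, "2018-04-05": 6, "2018-04-06": 5,
    }
    return [
        (1, 1, f"{day} {8 + i:02d}:00:00")
        for day, n in counts.items()
        for i in range(n)
    ]


# Qualifying days that are consecutive share the same (day number - rank).
QUERY = """
WITH daily AS (
    SELECT date(timestamp) AS d
    FROM beverages_log
    WHERE users_id = ? AND beverages_id = ?
    GROUP BY date(timestamp)
    HAVING COUNT(*) >= ?
),
marked AS (
    SELECT d, julianday(d) - ROW_NUMBER() OVER (ORDER BY d) AS grp
    FROM daily
)
SELECT COALESCE(MAX(n), 0)
FROM (SELECT COUNT(*) AS n FROM marked GROUP BY grp)
"""


def longest_streak(rows, users_id=1, beverages_id=1, min_per_day=5):
    """Return the longest run of consecutive qualifying days."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE beverages_log (id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "users_id INTEGER, beverages_id INTEGER, timestamp TEXT)"
    )
    conn.executemany(
        "INSERT INTO beverages_log (users_id, beverages_id, timestamp) VALUES (?, ?, ?)",
        rows,
    )
    (streak,) = conn.execute(QUERY, (users_id, beverages_id, min_per_day)).fetchone()
    conn.close()
    return streak


if __name__ == "__main__":
    print(longest_streak(sample_log()))
```

Because the user and beverage are bound as parameters, the same statement serves every user without creating views.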
Why not include user_id in the daycounts view and group by user_id and date?
Also include user_id in view t.
Then, when you are querying against t, add the user_id to the where clause.
That way you don't have to recreate your views for every single user; you just need to remember to include the user_id in your where clause.
That's a little tricky. I'd start with a view to summarize events by day:
CREATE VIEW BView AS
SELECT UserID, BevID, CAST(EventDateTime AS DATE) AS EventDate, COUNT(*) AS NumEvents
FROM beverages_log
GROUP BY UserID, BevID, CAST(EventDateTime AS DATE)
I'd then use a Dates table (just a table with one row per day; very handy to have) to examine all possible date ranges and throw out any with a gap. This will probably be slow as hell, but it's a start:
SELECT
UserID, BevID, MAX(StreakLength) AS StreakLength
FROM
(
SELECT
B1.UserID, B1.BevID, B1.EventDate AS StreakStart, DATEDIFF(DD, StartDate.Date, EndDate.Date) AS StreakLength
FROM
BView AS B1
INNER JOIN Dates AS StartDate ON B1.EventDate = StartDate.Date
INNER JOIN Dates AS EndDate ON EndDate.Date > StartDate.Date
WHERE
B1.NumEvents >= 5
-- Exclude this potential streak if there's a day with no activity
AND NOT EXISTS (SELECT * FROM Dates AS MissedDay WHERE MissedDay.Date > StartDate.Date AND MissedDay.Date <= EndDate.Date AND NOT EXISTS (SELECT * FROM BView AS B2 WHERE B1.UserID = B2.UserID AND B1.BevID = B2.BevID AND MissedDay.Date = B2.EventDate))
-- Exclude this potential streak if there's a day with less than five events
AND NOT EXISTS (SELECT * FROM BView AS B2 WHERE B1.UserID = B2.UserID AND B1.BevID = B2.BevID AND B2.EventDate > StartDate.Date AND B2.EventDate <= EndDate.Date AND B2.NumEvents < 5)
) AS X
GROUP BY
UserID, BevID