Calculation involving repeated items in MySQL table - mysql

I have a table with a composite primary key on EID (event ID) and start_time. I have another column called attending.
Users make their events more popular by reusing the event ID and changing the date; when that happens, I create a new row in the database.
I would like to create a 4th column, actual_attending, which is equal to the attending value minus the previous event's attending value. If there is no previous event with the same ID, the column can be null. How can I calculate this via an UPDATE?
Here is a sqlfiddle as an example: http://sqlfiddle.com/#!2/43f2c5

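update event e1
join ( select a.eid, a.start_time, max(b.start_time) prev_start
from event a
join event b
on b.eid = a.eid
and b.start_time < a.start_time
group by a.eid, a.start_time
) p on p.eid = e1.eid and p.start_time = e1.start_time
join event e2 on e2.eid = p.eid and e2.start_time = p.prev_start
set e1.actual_attending = e1.attending - e2.attending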

SELECT a.*
, a.attending-b.attending new_actual_attending
FROM
( SELECT x.*
, COUNT(*) rank
FROM event x
JOIN event y
ON y.eid = x.eid
AND y.start_time <= x.start_time
GROUP
BY eid, start_time
) a
LEFT
JOIN
( SELECT x.*
, COUNT(*) rank
FROM event x
JOIN event y
ON y.eid = x.eid
AND y.start_time <= x.start_time
GROUP
BY eid, start_time
) b
ON b.eid = a.eid
AND b.rank = a.rank - 1;
+-----+------------+-----------+------------------+------+----------------------+
| eid | start_time | attending | actual_attending | rank | new_actual_attending |
+-----+------------+-----------+------------------+------+----------------------+
| 1 | 2013-06-08 | 29 | NULL | 1 | NULL |
| 2 | 2013-06-09 | 72 | NULL | 1 | NULL |
| 2 | 2013-06-16 | 104 | NULL | 2 | 32 |
| 3 | 2013-06-07 | 224 | NULL | 1 | NULL |
| 3 | 2013-06-14 | 222 | NULL | 2 | -2 |
+-----+------------+-----------+------------------+------+----------------------+
http://sqlfiddle.com/#!2/43f2c5/2
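For anyone who wants to try the UPDATE itself rather than the SELECT, here is a minimal sketch of the same idea. It is an assumption-laden stand-in, not the fiddle: it runs against an in-memory SQLite database through Python's sqlite3 module, the table layout is copied from the output above, and the correlated subquery picks the latest earlier row with the same eid. MySQL may reject a subquery that reads the table being updated, so treat this as an illustration of the logic only.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE event (
        eid              INTEGER NOT NULL,
        start_time       TEXT    NOT NULL,
        attending        INTEGER NOT NULL,
        actual_attending INTEGER,
        PRIMARY KEY (eid, start_time)
    )
    """
)
conn.executemany(
    "INSERT INTO event (eid, start_time, attending) VALUES (?, ?, ?)",
    [
        (1, "2013-06-08", 29),
        (2, "2013-06-09", 72),
        (2, "2013-06-16", 104),
        (3, "2013-06-07", 224),
        (3, "2013-06-14", 222),
    ],
)

# For each row, subtract the attending value of the latest earlier row with
# the same eid. With no earlier row the subquery yields NULL, so the
# difference is NULL as well.
conn.execute(
    """
    UPDATE event
    SET actual_attending = attending - (
        SELECT p.attending
        FROM event AS p
        WHERE p.eid = event.eid
          AND p.start_time < event.start_time
        ORDER BY p.start_time DESC
        LIMIT 1
    )
    """
)

result = conn.execute(
    "SELECT eid, start_time, actual_attending FROM event ORDER BY eid, start_time"
).fetchall()
for row in result:
    print(row)
```

ISO-formatted date strings compare correctly as text, which is why start_time is stored as TEXT in this sketch.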

Related

Is there a way to UPDATE column values based on another column's value?

Part 1 of my SQL task involves restructuring data. The gist of my task is as follows: based on the event_type, if it is "begin", I am trying to use that "time" to find its stopping time (in another row) and add it to a column (event_end) on the same row as the start time, so that all the data for an event sits nicely in one row.
pID customerID locationID event_type time event_end (new column)
1 1 a begin 12.45
2 2 a begin 11.10
3 1 a stop 1.30
4 2 b begin 9.45
5 3 b stop 8.78
I would like to add another column (event_end), and have event_end = the minimum value of time IF event_type = 'stop', IF locationID = locationID, and IF customerID = customerID. The final step would be to delete all event_type 'stop' rows.
I have tried UPDATE SET WHERE sequences, and a little bit of CASE, but my issue is that I cannot wrap my head around how to perform this without a loop like VBA. The following is my best stab at it:
UPDATE table
SET event_end = MIN(time)
WHERE event_type = 'stop'
WHERE customerid = customerid
WHERE locationid = locationid
WHERE time > time
SELECT *
FROM table
I'm hoping to have a table with all event data in one row, not spread out over multiple rows. If this is a handful, I apologize but am thankful in advance.
Thanks
Problem Statement:
Add event_end as an extra attribute to the existing row, data will be populated based on customer_id, location_id.
We will populate data in event_end to all events which have event type as begin
Data would be picked from rows which have the same customer_id, location_id but event type as stop.
Finally, we will remove all events with type stop.
Solution: Assume your table is named customer_events; we will use a self join on it.
First, identify which records need to be updated. We can use a SELECT query to identify such records.
c1 table will represent rows with begin event type.
c2 table will represent rows with stop event type.
SELECT *
FROM customer_events c1
LEFT JOIN customer_events c2 ON c1.customerID = c2.customerID AND c1.locationID = c2.locationID AND c1.event_type = 'begin' AND c2.event_type = 'stop'
WHERE c1.event_type = 'begin'; -- As we want to populate data in events with value as `begin`
Write a query to update the records.
UPDATE customer_events c1
LEFT JOIN customer_events c2 ON c1.customerID = c2.customerID AND c1.locationID = c2.locationID AND c1.event_type = 'begin' AND c2.event_type = 'stop'
SET c1.event_end = c2.`time`
WHERE c1.event_type = 'begin';
Now every record with event type as begin has either value in event_end column or it would be null if no records match as stop event.
Rows with event type stop are either mapped to some row with event type begin or not mapped at all. In both cases, we don't want to keep them. To remove all records with event type stop:
DELETE FROM customer_events
WHERE event_type = 'stop';
Note: Don't run DELETE statement unless you are sure that this solution will work for you.
Updated: We can have multiple records of begin & stop events for single customer & location.
Sample Input:
| pID | customerID | locationID | event_type | time | event_end |
| 1 | 1 | a | begin | 02:45:00 | |
| 2 | 2 | a | begin | 03:10:00 | |
| 3 | 1 | b | begin | 04:30:00 | |
| 4 | 2 | b | begin | 05:45:00 | |
| 5 | 2 | a | stop | 06:49:59 | |
| 6 | 1 | a | begin | 07:38:00 | |
| 7 | 3 | b | begin | 08:57:19 | |
| 8 | 2 | b | stop | 09:57:43 | |
| 9 | 3 | b | stop | 10:58:03 | |
| 10 | 4 | a | begin | 11:58:34 | |
| 11 | 1 | a | stop | 12:09:36 | |
| 12 | 1 | b | stop | 13:09:50 | |
| 13 | 1 | a | stop | 14:10:02 | |
Query:
SELECT *
FROM (
SELECT
ce.*,
IF(@c_id <> ce.customerId OR @l_id <> ce.locationID, @rank:= 1, @rank:= @rank + 1 ) as rank,
@c_id:= ce.customerId,
@l_id:= ce.locationID
FROM customer_events ce,
(SELECT @c_id:= 0 c, @l_id:= '' l, @rank:= 0 r) AS t
WHERE event_type = 'begin'
ORDER BY customerId, locationID, `time`) AS c1
LEFT JOIN (
SELECT
ce.*,
IF(@c_id <> ce.customerId OR @l_id <> ce.locationID, @rank:= 1, @rank:= @rank + 1 ) as rank,
@c_id:= ce.customerId,
@l_id:= ce.locationID
FROM customer_events ce,
(SELECT @c_id:= 0 c, @l_id:= '' l, @rank:= 0 r) AS t
WHERE event_type = 'stop'
ORDER BY customerId, locationID, `time`
) AS c2 ON c1.customerID = c2.customerID AND c1.locationID = c2.locationID AND c1.rank = c2.rank;
Output:
| pId | customerID| locationId| event_type| Start_Time|End_Id| End_Time |
| 1 | 1 | a | begin | 02:45:00 | 11 | 12:09:36 |
| 6 | 1 | a | begin | 07:38:00 | 13 | 14:10:02 |
| 3 | 1 | b | begin | 04:30:00 | 12 | 13:09:50 |
| 2 | 2 | a | begin | 03:10:00 | 5 | 06:49:59 |
| 4 | 2 | b | begin | 05:45:00 | 8 | 09:57:43 |
| 7 | 3 | b | begin | 08:57:19 | 9 | 10:58:03 |
| 10 | 4 | a | begin | 11:58:34 | | |
Update Statement: Create two columns end_pID and event_end for migration.
UPDATE customer_events
INNER JOIN (
SELECT c1.pId, c2.pID End_Id, c2.time AS End_Time
FROM (
SELECT
ce.*,
IF(@c_id <> ce.customerId OR @l_id <> ce.locationID, @rank:= 1, @rank:= @rank + 1 ) as rank,
@c_id:= ce.customerId,
@l_id:= ce.locationID
FROM customer_events ce,
(SELECT @c_id:= 0 c, @l_id:= '' l, @rank:= 0 r) AS t
WHERE event_type = 'begin'
ORDER BY customerId, locationID, `time`) AS c1
LEFT JOIN (
SELECT
ce.*,
IF(@c_id <> ce.customerId OR @l_id <> ce.locationID, @rank:= 1, @rank:= @rank + 1 ) as rank,
@c_id:= ce.customerId,
@l_id:= ce.locationID
FROM customer_events ce,
(SELECT @c_id:= 0 c, @l_id:= '' l, @rank:= 0 r) AS t
WHERE event_type = 'stop'
ORDER BY customerId, locationID, `time`
) AS c2 ON c1.customerID = c2.customerID AND c1.locationID = c2.locationID AND c1.rank = c2.rank) AS tt ON customer_events.pID = tt.pId
SET customer_events.end_pID = tt.End_Id, customer_events.event_end = tt.End_Time;
Finally, remove all events with event_type = 'stop'
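If the user-variable ranking is hard to follow, the pairing logic can be sketched without variables. This is only an illustration under assumptions: it uses Python's sqlite3 module with an in-memory database instead of MySQL, loads the sample input above, and computes each row's rank with a correlated COUNT(*) (the column name rnk is made up for this sketch). The n-th begin for a customer and location is then matched with the n-th stop.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer_events (
        pID        INTEGER PRIMARY KEY,
        customerID INTEGER NOT NULL,
        locationID TEXT    NOT NULL,
        event_type TEXT    NOT NULL,
        time       TEXT    NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO customer_events VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "a", "begin", "02:45:00"),
        (2, 2, "a", "begin", "03:10:00"),
        (3, 1, "b", "begin", "04:30:00"),
        (4, 2, "b", "begin", "05:45:00"),
        (5, 2, "a", "stop", "06:49:59"),
        (6, 1, "a", "begin", "07:38:00"),
        (7, 3, "b", "begin", "08:57:19"),
        (8, 2, "b", "stop", "09:57:43"),
        (9, 3, "b", "stop", "10:58:03"),
        (10, 4, "a", "begin", "11:58:34"),
        (11, 1, "a", "stop", "12:09:36"),
        (12, 1, "b", "stop", "13:09:50"),
        (13, 1, "a", "stop", "14:10:02"),
    ],
)

# rnk numbers the rows of one event_type within each (customerID, locationID)
# group by ascending time; begin number n is then joined to stop number n.
pairs = conn.execute(
    """
    WITH ranked AS (
        SELECT x.*,
               (SELECT COUNT(*)
                FROM customer_events AS y
                WHERE y.customerID = x.customerID
                  AND y.locationID = x.locationID
                  AND y.event_type = x.event_type
                  AND y.time <= x.time) AS rnk
        FROM customer_events AS x
    )
    SELECT b.pID, s.pID AS end_pID, s.time AS event_end
    FROM ranked AS b
    LEFT JOIN ranked AS s
           ON s.customerID = b.customerID
          AND s.locationID = b.locationID
          AND s.rnk = b.rnk
          AND s.event_type = 'stop'
    WHERE b.event_type = 'begin'
    ORDER BY b.customerID, b.locationID, b.time
    """
).fetchall()
for row in pairs:
    print(row)
```

The correlated count is quadratic within each group, so on a large table a window function such as ROW_NUMBER() (available in MySQL 8.0 and later) would probably be the better choice.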

SQL: list rows that do not have any transactions

I have a table for definition - def
Id Device Location
1 GGHY199 USA
12 DFGHY71 India
145 APPHY75 USA
Its transactions are recorded in a different table, event:
eventid deviceid event date
123 12 Login 12-01-2019
32 12 Unreachable 18-02-2019
223 145 Unreachable 19-02-2019
334 1 DOWN 01-03-2019
I want an output as
For every day, all three devices should show; if a device has no transaction, it should show as null, with (I assume) the first date of the month in the date column.
like,
eventid deviceid event date
null 1 null 01-01-2019
123 12 Login 12-01-2019
null 145 null 01-01-2019
null 1 null 01-02-2019
32 12 Unreachable 18-02-2019
223 145 Unreachable 19-02-2019
334 1 DOWN 01-03-2019
null 12 null 01-03-2019
null 145 null 01-03-2019
currently im doing:
select * from def
left join
event on def.id=event.deviceid
and im obviously not getting what i want.
Thanks!
Assuming you have a valid calendar table covering all the dates you need (mimicked here by the select ... union all derived table t), you could left join the calendar to the other tables:
select t.date, e.eventid, e.deviceid, e.event, d.device
from (
select '2019-01-01' date
union all
select '2019-01-02'
union all
select '2019-01-03'
.....
union all
select '2019-03-31'
) t
left join event e on t.date = e.date
left join def d on e.deviceid = d.id
You seem to be after something like this...
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(event_id SERIAL PRIMARY KEY
,device_id INT NOT NULL
,event VARCHAR(12) NOT NULL
,date DATE NOT NULL
);
INSERT INTO my_table VALUES
(123, 12,'Login','2019-01-12'),
( 32, 12,'Unreachable','2019-02-18'),
(223,145,'Unreachable','2019-02-19'),
(334, 1,'DOWN','2019-03-01');
SELECT DISTINCT z.event_id
, x.device_id
, z.event
, y.date
FROM my_table x
JOIN my_table y
LEFT
JOIN my_table z
ON z.device_id = x.device_id
AND z.date = y.date
ORDER
BY date
, device_id;
+----------+-----------+-------------+------------+
| event_id | device_id | event | date |
+----------+-----------+-------------+------------+
| NULL | 1 | NULL | 2019-01-12 |
| 123 | 12 | Login | 2019-01-12 |
| NULL | 145 | NULL | 2019-01-12 |
| NULL | 1 | NULL | 2019-02-18 |
| 32 | 12 | Unreachable | 2019-02-18 |
| NULL | 145 | NULL | 2019-02-18 |
| NULL | 1 | NULL | 2019-02-19 |
| NULL | 12 | NULL | 2019-02-19 |
| 223 | 145 | Unreachable | 2019-02-19 |
| 334 | 1 | DOWN | 2019-03-01 |
| NULL | 12 | NULL | 2019-03-01 |
| NULL | 145 | NULL | 2019-03-01 |
+----------+-----------+-------------+------------+
Using a CTE, we can generate the dates and then join them with the event table. Note that dateadd below is SQL Server syntax; in MySQL the equivalent function is DATE_ADD.
with t0 (i) as (select 0 union all select 0 union all select 0),
t1 (i) as (select a.i from t0 a ,t0 b ),
t2 (i) as (select a.i from t1 a ,t1 b ),
t3 (srno) as (select row_number()over(order by a.i) from t2 a ,t2 b ),
tbldt(dt) as (select dateadd(day,t3.srno-1,'01/01/2019') from t3)
select eventid, deviceid, event, tbldt.dt
from tbldt
left join event e on e.date = tbldt.dt
left join def d on e.deviceid = d.id
where tbldt.dt <= (select max(date) from event)
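Reading the desired output in the question as one row per device per month (that reading is an assumption, since the question says "every day"), the result can be sketched by cross joining the devices with a small calendar of month starts and then left joining the events. The months table and its month_start column are hypothetical names, and the sketch runs on SQLite through Python's sqlite3 module rather than on MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE def (id INTEGER PRIMARY KEY, device TEXT, location TEXT);
    CREATE TABLE event (eventid INTEGER PRIMARY KEY, deviceid INTEGER,
                        event TEXT, date TEXT);
    -- Hypothetical calendar table: one row per month of interest.
    CREATE TABLE months (month_start TEXT PRIMARY KEY);

    INSERT INTO def VALUES (1, 'GGHY199', 'USA'), (12, 'DFGHY71', 'India'),
                           (145, 'APPHY75', 'USA');
    INSERT INTO event VALUES (123, 12, 'Login', '2019-01-12'),
                             (32, 12, 'Unreachable', '2019-02-18'),
                             (223, 145, 'Unreachable', '2019-02-19'),
                             (334, 1, 'DOWN', '2019-03-01');
    INSERT INTO months VALUES ('2019-01-01'), ('2019-02-01'), ('2019-03-01');
    """
)

# Every (month, device) pair appears once; events are attached when the
# device has one in that month, otherwise the month start is shown.
rows = conn.execute(
    """
    SELECT e.eventid,
           d.id AS deviceid,
           e.event,
           COALESCE(e.date, m.month_start) AS shown_date
    FROM months AS m
    CROSS JOIN def AS d
    LEFT JOIN event AS e
           ON e.deviceid = d.id
          AND substr(e.date, 1, 7) = substr(m.month_start, 1, 7)
    ORDER BY m.month_start, d.id
    """
).fetchall()
for row in rows:
    print(row)
```

The key point is that the device list and the calendar are combined first, so a device with no events still produces a row for each period.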

Ranking for unique users

If I have three columns:
id, username, time
My data is:
+-------+------------------+-------------+
| id | username | time |
+-------+------------------+-------------+
| 1 | A | 1 min |
| 2 | A | 2 min |
| 3 | B | 3 min |
| 4 | B | 4 min |
+-------+------------------+-------------+
This query is working to get the ranking:
SELECT time,
FIND_IN_SET(MIN(time), (SELECT GROUP_CONCAT(time ORDER BY time ASC)
FROM table t1)) AS rank
FROM table t2
WHERE t2.username = 'B';
There is only one problem: it returns rank 3 for user B instead of 2.
So I tried to use GROUP BY t2.username and also DISTINCT t2.username, but neither worked.
How can I get the rank of user B? It should be 2 (not 3) because we have only 2 users.
E.g.:
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(id INT NOT NULL
,username CHAR(1) NOT NULL
,time TIME NOT NULL
);
INSERT INTO my_table VALUES
(1,'A','00:01:00'),
(2,'A','00:02:00'),
(3,'B','00:03:00'),
(4,'B','00:04:00');
SELECT * FROM my_table;
+----+----------+----------+
| id | username | time |
+----+----------+----------+
| 1 | A | 00:01:00 |
| 2 | A | 00:02:00 |
| 3 | B | 00:03:00 |
| 4 | B | 00:04:00 |
+----+----------+----------+
SELECT *
FROM
( SELECT username
, time
, @i:=@i+1 rank
FROM
( SELECT username
, MIN(time) time
FROM my_table
GROUP
BY username
) x
, (SELECT @i:=0) vars
ORDER
BY time
) n
WHERE username = 'B';
+----------+----------+------+
| username | time | rank |
+----------+----------+------+
| B | 00:03:00 | 2 |
+----------+----------+------+
I think this would work too, but it's slightly hacky, so I'm not sure...
SELECT x.*
, FIND_IN_SET(time,(SELECT GROUP_CONCAT(DISTINCT time ORDER BY time) FROM (SELECT MIN(time) time FROM my_table GROUP BY username) j )) rank
FROM my_table x HAVING rank <> 0 AND username = 'B';
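A variable-free way to express the same ranking, sketched here on SQLite via Python's sqlite3 module (an assumption, since the question is about MySQL): reduce the table to each user's best time, then count how many users have a strictly better one. The names best, best_time and rnk are made up for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id INTEGER NOT NULL, username TEXT NOT NULL, time TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        (1, "A", "00:01:00"),
        (2, "A", "00:02:00"),
        (3, "B", "00:03:00"),
        (4, "B", "00:04:00"),
    ],
)

# One row per user with that user's best (minimum) time; the rank is one more
# than the number of users whose best time is strictly smaller.
ranked = conn.execute(
    """
    WITH best AS (
        SELECT username, MIN(time) AS best_time
        FROM my_table
        GROUP BY username
    )
    SELECT b.username,
           b.best_time,
           (SELECT COUNT(*) FROM best AS o WHERE o.best_time < b.best_time) + 1 AS rnk
    FROM best AS b
    WHERE b.username = ?
    """,
    ("B",),
).fetchall()
print(ranked)
```

Users with identical best times share a rank under this definition, which may or may not be what you want.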

SELECT and GROUP BY multiple columns based on Max()

Current Table Events:
| eventId | personId | type | title | score |
+-----------+------------+--------+---------+---------+
| 1 | 1 | movie | Mission | 12 |
| 2 | 1 | movie | UNCLE | 32 |
| 3 | 1 | show | U2 | 17 |
| 4 | 1 | show | Leroy | 13 |
| 5 | 2 | movie | Holmes | 19 |
| 6 | 2 | movie | Compton | 14 |
| 7 | 2 | show | Imagine | 22 |
| 8 | 2 | show | Kane | 22 |
MySQL Example:
SELECT @personId:=personId as personId,
(
SELECT title FROM Events
WHERE rate = (SELECT MAX(score) FROM Events)
AND type = ‘movie’ AND personId=@personId
) as movie,
(
SELECT title FROM Events
WHERE rate = (SELECT MAX(score) FROM Events)
AND type = ‘movie’ AND personId=@personId
) as show,
FROM Events
GROUP BY personId
ORDER BY personId;
Desired Output:
| personId | movie | show |
+------------+----------+---------+
| 1 | UNCLE | U2 |
| 2 | Holmes | Imagine |
The desired result is to show the MAX() score for movie and show per personId in the Events table. My actual output contains NULLS and takes a very long time to load. My actual Events table has around 20,000 entries.
UPDATE Solution (derived from first two answers to increase performance)
SELECT e.personId,
(
SELECT o.title
FROM Events o
LEFT JOIN Events b
ON o.personId = b.personId AND o.score < b.score
AND o.type = b.type
WHERE o.type = 'movie' AND o.personId=e.personId AND b.eventId IS NULL LIMIT 1
) as best_movie ,
(
SELECT o.title
FROM Events o
LEFT JOIN Events b
ON o.personId = b.personId AND o.score < b.score
AND o.type = b.type
WHERE o.type = 'show' AND o.personId=e.personId AND b.eventId IS NULL LIMIT 1
) as best_show
FROM Events e
GROUP BY e.personId
ORDER BY e.personId
I made a few modifications to your query.
Have a look:
SELECT @personId:=personId as personId,
(
SELECT title FROM Events
WHERE score = (SELECT MAX(score) FROM Events WHERE type = 'movie'
AND personId=@personId)
AND type = 'movie' AND personId=@personId LIMIT 1
) as movie ,
(
SELECT title FROM Events
WHERE score = (SELECT MAX(score) FROM Events WHERE type = 'show'
AND personId=@personId)
AND type = 'show' AND personId=@personId LIMIT 1
) as show1
FROM Events
GROUP BY personId
ORDER BY personId
I am not sure if it is exactly what you need.
But just my guess, probably you don't need multiple columns?
http://sqlfiddle.com/#!9/04b27/4
SELECT e.*
FROM `events` e
left join `events` e1
on e.type = e1.type
and e.score<e1.score
WHERE e1.eventId IS NULL
GROUP BY personId, type, score
Update: if you need the max score per person and type, you can use:
http://sqlfiddle.com/#!9/04b27/13
SELECT e.*
FROM `events` e
left join `events` e1
on e.personId = e1.personId
and e.type = e1.type
and e.score<e1.score
WHERE e1.eventId IS NULL
GROUP BY personId, score
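To get back to the two-column layout the question asked for, one possible sketch is to keep only the top-scoring rows per person and type and then pivot them with conditional aggregation. It runs on SQLite through Python's sqlite3 module as a stand-in for MySQL. The aliases best_movie and best_show are illustrative, and breaking ties by taking the alphabetically first title with MIN is an assumption made so that the tie between Imagine and Kane resolves to a single value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Events (
        eventId  INTEGER PRIMARY KEY,
        personId INTEGER NOT NULL,
        type     TEXT    NOT NULL,
        title    TEXT    NOT NULL,
        score    INTEGER NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO Events VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "movie", "Mission", 12),
        (2, 1, "movie", "UNCLE", 32),
        (3, 1, "show", "U2", 17),
        (4, 1, "show", "Leroy", 13),
        (5, 2, "movie", "Holmes", 19),
        (6, 2, "movie", "Compton", 14),
        (7, 2, "show", "Imagine", 22),
        (8, 2, "show", "Kane", 22),
    ],
)

# The WHERE clause keeps rows whose score is the maximum for their person and
# type; the CASE expressions then spread those rows into one column per type.
# MIN picks the alphabetically first title when several rows tie on score.
best = conn.execute(
    """
    SELECT e.personId,
           MIN(CASE WHEN e.type = 'movie' THEN e.title END) AS best_movie,
           MIN(CASE WHEN e.type = 'show'  THEN e.title END) AS best_show
    FROM Events AS e
    WHERE e.score = (SELECT MAX(m.score)
                     FROM Events AS m
                     WHERE m.personId = e.personId
                       AND m.type = e.type)
    GROUP BY e.personId
    ORDER BY e.personId
    """
).fetchall()
for row in best:
    print(row)
```

An index on (personId, type, score) would likely help the correlated subquery on a table of around 20,000 rows.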

Select only newest grouped entries

I have a table with data like this:
+-----------+-------+------+----------+
| timestamp | event | data | moreData |
+-----------+-------+------+----------+
| 100000000 | 1 | 10 | 20 |
| 100000001 | 1 | 15 | 10 |
| 100000002 | 1 | 30 | 30 |
| 100000003 | 1 | 5 | 50 |
| 100000004 | 2 | 110 | 120 |
| 100000005 | 2 | 115 | 110 |
| 100000006 | 2 | 130 | 130 |
| 100000007 | 2 | 15 | 150 |
+-----------+-------+------+----------+
Now I want to select only the newest rows for each event. So in the end I want to have this result set:
+-----------+-------+------+----------+
| timestamp | event | data | moreData |
+-----------+-------+------+----------+
| 100000003 | 1 | 5 | 50 |
| 100000007 | 2 | 15 | 150 |
+-----------+-------+------+----------+
So far I was not able to do this. In MySQL I can use "GROUP BY event" but then I get some random row from the database, not the newest. ORDER BY doesn't help because the grouping is done before ordering. Using an aggregation like MAX(timestamp) while grouping by event also doesn't help because then the timestamp is the newest but "data" and "moreData" is still from some other random row.
I guess I have to do a sub select so I have to first get the latest timestamp like this:
SELECT MAX(timestamp), event FROM mytable GROUP BY event
and then use the result set to filter a second SELECT, but how? And maybe there is a clever way to do it without a sub select?
AFAIK, sub select is your best option, as follows:
SELECT *
FROM mytable mt
JOIN ( SELECT MAX(timestamp) as max, event
FROM mytable
GROUP BY event) m_mt
ON (mt.timestamp = m_mt.max AND mt.event = m_mt.event);
You could use an inner join as a filter:
select *
from events e1
join (
select event
, max(timestamp) as maxtimestamp
from events
group by
event
) e2
on e1.event = e2.event
and e1.timestamp = e2.maxtimestamp
SELECT * FROM
(SELECT * FROM mytable ORDER BY timestamp DESC) AS T1
GROUP BY event;
SELECT e.*
FROM events e
WHERE e.timestamp = (SELECT MAX(e2.timestamp) FROM events e2 WHERE e2.event = e.event)
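As for a way without a sub select: an anti-join is one option. The sketch below is an illustration only, run on SQLite through Python's sqlite3 module rather than on MySQL, with the sample rows from the question. Each row is left joined to any newer row of the same event, and only rows that find no newer partner are kept. The alias newer is made up for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (timestamp INTEGER, event INTEGER, data INTEGER, moreData INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        (100000000, 1, 10, 20),
        (100000001, 1, 15, 10),
        (100000002, 1, 30, 30),
        (100000003, 1, 5, 50),
        (100000004, 2, 110, 120),
        (100000005, 2, 115, 110),
        (100000006, 2, 130, 130),
        (100000007, 2, 15, 150),
    ],
)

# A row is the newest of its event exactly when no row with the same event
# and a larger timestamp exists, i.e. when the left join finds nothing.
newest = conn.execute(
    """
    SELECT t.timestamp, t.event, t.data, t.moreData
    FROM mytable AS t
    LEFT JOIN mytable AS newer
           ON newer.event = t.event
          AND newer.timestamp > t.timestamp
    WHERE newer.timestamp IS NULL
    ORDER BY t.event
    """
).fetchall()
for row in newest:
    print(row)
```

If two rows of one event share the maximum timestamp, both are returned, the same as with the MAX-based joins above.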