MySQL Calculating sum over pairwise time differences of log file - mysql

I have a table in MySQL that logs user actions. Each row in the table corresponds to one user action, such as login or logout.
The table looks like:
CREATE TABLE IF NOT EXISTS `user_activity_log` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`user_id` int(11) NOT NULL,
`action_type` smallint NOT NULL,
`action_created` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
id user_id action_type action_created
...
22 6 1 2013-07-21 14:31:14
23 6 2 2013-07-21 14:31:16
24 8 2 2013-07-21 14:31:18
25 8 1 2013-07-21 14:45:18
26 8 0 2013-07-21 14:45:25
27 8 1 2013-07-21 14:54:54
28 8 2 2013-07-21 15:09:11
29 6 1 2013-07-21 15:09:17
30 6 2 2013-07-21 15:09:29
...
Imagine that action 1 is login and action 2 is logout, and that I want to find the total time (in hours:minutes:seconds) the user with id 6 was logged in within a specific range of dates.
My first idea was to fetch all rows with either action 1 or 2 and calculate the date differences in PHP myself. This seems rather complicated, and I am sure it can be done in one query with MySQL, too.
What I tried was this:
SELECT TIMEDIFF(ual1.action_created, ual2.action_created) FROM user_activity_log
ual1,user_activity_log ual2 WHERE ual1.user_id = 6 AND ual2.user_id = 6 AND
ual1.action_type = 1 AND ual2.action_type = 2 AND
DATE(ual1.action_created) >= '2013-07-21' AND
DATE(ual1.action_created) <= '2013-07-21'
ORDER BY ual1.action_created
The idea was to select all login events from ual1 and all logout events from ual2 for the same user and then calculate the pairwise time difference for the day 2013-07-21. This does not really work, and I don't know why.
How can I calculate the total login time (the sum over all time differences, date of action 2 minus date of action 1)?
The correct result should be 2 seconds from log id pair 22,23 + 12 seconds from log id pair 29,30 = 14 seconds.
Thank you very much in advance for your help. Best regards

I think the easiest way to structure this type of query is using correlated subqueries (and, to be honest, I generally don't like correlated subqueries, but this is an exception). Your query would probably work with the right group by clause.
Here is an alternative method:
select TIMEDIFF(LogoutTS, action_created)
from (select ual.*,
             -- the first logout of the same user after this login
             (select ual2.action_created
              from user_activity_log ual2
              where ual2.user_id = ual.user_id and
                    ual2.action_type = 2 and
                    ual2.action_created > ual.action_created
              order by ual2.action_created asc
              limit 1
             ) as LogoutTS
      from user_activity_log ual
      where ual.user_id = 6 and
            ual.action_type = 1
     ) ual
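To get the total, you then need to do something like sum(TIMEDIFF(LogoutTS, action_created)). However, this can depend on the format of the time column, and summing TIME values directly can give misleading numbers because MySQL converts them to numbers of the form HHMMSS. It is safer to sum the differences in seconds. UNIX_TIMESTAMP() already returns seconds, so no division is needed. With the same from clause as above, the select list might look something like this:
select SUM(UNIX_TIMESTAMP(LogoutTS) - UNIX_TIMESTAMP(action_created))
Or, formatted as hours:minutes:seconds:
select sec_to_time(SUM(UNIX_TIMESTAMP(LogoutTS) - UNIX_TIMESTAMP(action_created)))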
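The pairing idea (match each login with the next logout of the same user, then sum the differences in seconds) can be checked against the sample rows from the question. The following is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, so strftime('%s', ...) takes the place of UNIX_TIMESTAMP(), and the helper names build_log, total_login_seconds and format_hms are made up for this illustration.

```python
import sqlite3

# Sample rows copied from the question: (id, user_id, action_type, action_created).
ROWS = [
    (22, 6, 1, "2013-07-21 14:31:14"),
    (23, 6, 2, "2013-07-21 14:31:16"),
    (24, 8, 2, "2013-07-21 14:31:18"),
    (25, 8, 1, "2013-07-21 14:45:18"),
    (26, 8, 0, "2013-07-21 14:45:25"),
    (27, 8, 1, "2013-07-21 14:54:54"),
    (28, 8, 2, "2013-07-21 15:09:11"),
    (29, 6, 1, "2013-07-21 15:09:17"),
    (30, 6, 2, "2013-07-21 15:09:29"),
]

# For every login (action_type 1) the correlated subquery picks the earliest
# later logout (action_type 2) of the same user; the outer query sums the gaps.
# strftime('%s', ...) is the SQLite counterpart of MySQL's UNIX_TIMESTAMP().
TOTAL_SQL = """
SELECT SUM(CAST(strftime('%s', logout_ts) AS INTEGER)
         - CAST(strftime('%s', action_created) AS INTEGER))
FROM (SELECT ual.action_created,
             (SELECT ual2.action_created
              FROM user_activity_log ual2
              WHERE ual2.user_id = ual.user_id
                AND ual2.action_type = 2
                AND ual2.action_created > ual.action_created
              ORDER BY ual2.action_created ASC
              LIMIT 1) AS logout_ts
      FROM user_activity_log ual
      WHERE ual.user_id = ?
        AND ual.action_type = 1
        AND date(ual.action_created) BETWEEN ? AND ?) AS sessions
"""


def build_log(rows=ROWS):
    """Create an in-memory copy of user_activity_log (hypothetical helper)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE user_activity_log ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, "
        "action_type INTEGER, action_created TEXT)"
    )
    conn.executemany("INSERT INTO user_activity_log VALUES (?, ?, ?, ?)", rows)
    return conn


def total_login_seconds(conn, user_id, first_day, last_day):
    """Sum the login-to-next-logout gaps for logins inside the date range."""
    (total,) = conn.execute(TOTAL_SQL, (user_id, first_day, last_day)).fetchone()
    return total or 0  # SUM() yields NULL when there is no matching login


def format_hms(seconds):
    """Render a number of seconds as hours:minutes:seconds."""
    minutes, secs = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    conn = build_log()
    seconds = total_login_seconds(conn, 6, "2013-07-21", "2013-07-21")
    print(seconds, format_hms(seconds))
```

For user 6 on 2013-07-21 this gives the 14 seconds the question expects. A login that has no later logout yields a NULL logout timestamp and is simply skipped by SUM().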

Related

SQL join each row in a table with a one row from another table

The Problem
I have a table window with start and end timestamps. I have another table activity that has a timestamp. I would like to create a query that:
For each row in activity it joins with a single row from window, where the timestamp occurs between start and end, choosing the older window.
Window Table
Start End ISBN
0 10 "ABC"
5 15 "ABC"
20 30 "ABC"
25 35 "ABC"
Activity Table
Timestamp ISBN
7.5 "ABC"
27.5 "ABC"
Desired Result
Start End ISBN Timestamp
0 10 "ABC" 7.5
20 30 "ABC" 27.5
The Attempt
My attempt at solving this so far has ended with the following query:
SELECT
*
FROM
test.activity AS a
JOIN test.`window` AS w ON w.isbn = (
SELECT
w1.isbn
FROM
test.window as w1
WHERE a.`timestamp` BETWEEN w1.`start` AND w1.`end`
ORDER BY w1.`start`
LIMIT 1
)
The output of this query is 8 rows.
When there is guaranteed to be a single oldest window (i.e. no two Start times are the same for any ISBN), you can number the matching windows for each activity row by Start and keep only the first one:
with activity_window as (
select
a.`Timestamp`,
a.`ISBN`,
w.`Start`,
w.`End`,
row_number() over (partition by a.`ISBN`, a.`Timestamp` order by w.`Start`) rn
from
`Activity` a
inner join `Window` w on a.`ISBN` = w.`ISBN` and a.`Timestamp` between w.`Start` and w.`End`
)
select `Start`, `End`, `ISBN`, `Timestamp` from activity_window where rn = 1;
Result:
Start End ISBN Timestamp
0 10 ABC 7.5
20 30 ABC 27.5
(see complete example at DB<>Fiddle)
CTEs and window functions such as row_number() are available from MySQL 8.0. Use subqueries if you are still on MySQL 5. Try to avoid table and column names that are reserved words in SQL (Window, Start, End and Timestamp are examples of poor name choices).
Keeping an index over (ISBN, Start, End) on Window (or clustering the entire table that way by defining those three columns as the primary key) helps this query.
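For servers without CTEs or window functions, one possible subquery form joins each activity row to the window whose start equals the smallest start among the windows containing that timestamp. This is a sketch under assumptions, not the answerer's own query: it runs on Python's built-in sqlite3 module rather than MySQL, and the names time_window, start_ts, end_ts, ts and build_windows are invented here to steer clear of reserved words.

```python
import sqlite3

# Correlated subquery: for each activity row, find the earliest start among
# the windows of the same ISBN that contain its timestamp, then join to it.
OLDEST_WINDOW_SQL = """
SELECT w.start_ts, w.end_ts, w.isbn, a.ts
FROM activity a
JOIN time_window w
  ON w.isbn = a.isbn
 AND w.start_ts = (SELECT MIN(w1.start_ts)
                   FROM time_window w1
                   WHERE w1.isbn = a.isbn
                     AND a.ts BETWEEN w1.start_ts AND w1.end_ts)
ORDER BY a.ts
"""


def build_windows():
    """Load the sample data from the question into an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE time_window (start_ts REAL, end_ts REAL, isbn TEXT)")
    conn.execute("CREATE TABLE activity (ts REAL, isbn TEXT)")
    conn.executemany(
        "INSERT INTO time_window VALUES (?, ?, ?)",
        [(0, 10, "ABC"), (5, 15, "ABC"), (20, 30, "ABC"), (25, 35, "ABC")],
    )
    conn.executemany(
        "INSERT INTO activity VALUES (?, ?)",
        [(7.5, "ABC"), (27.5, "ABC")],
    )
    return conn


def oldest_windows(conn):
    """Return one (start, end, isbn, timestamp) row per activity row."""
    return conn.execute(OLDEST_WINDOW_SQL).fetchall()


if __name__ == "__main__":
    for row in oldest_windows(build_windows()):
        print(row)
```

As with the row_number() version, this relies on start times being unique per ISBN. If two containing windows shared the same earliest start, both would be returned.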

MySQL - Join row with the next N smaller rows

I have a table:
id timestamp
1 1
23 2
12 4
45 6
3 7
4 8
I need this result:
major minor
1 2
1 4
1 6
2 4
2 6
2 7
I need to join each number with the next 3 smallest numbers. Since these numbers are inserted out of order, I can't use the ids.
Because the numbers are also not at regular intervals, I cannot set a fixed limit to find the largest number to join with.
Solutions I have:
I could create a temp table and use an auto increment id to do this.
I can do this for a single number, and write a script to iterate through the table. This is the query for it (Going with this for now, till something better comes up):
SELECT * FROM
(SELECT id major_id, timestamp major_timestamp FROM timestamps WHERE interval_id=7 ORDER BY timestamp DESC limit 1) timestamps_major
LEFT JOIN
(SELECT id minor_id, timestamp minor_timestamp FROM timestamps WHERE timestamp < (SELECT timestamp FROM timestamps WHERE interval_id=7 ORDER BY timestamp DESC limit 1) ORDER BY timestamp DESC LIMIT 2) timestamps_minor
ON major_timestamp>minor_timestamp
This just needs to be done for all numbers once, and then once per day to calculate and store a moving average. So speed is not an issue.
Wondering what is the best way to approach this. Thanks.
EDIT:
This is the actual table with timestamps and ids. The example I posted is just simplified for the sake of the question.
CREATE TABLE `timestamps` (
`id` mediumint(8) unsigned NOT NULL AUTO_INCREMENT,
`interval_id` tinyint(3) unsigned NOT NULL,
`timestamp` int(10) unsigned NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `interval_timestamp` (`interval_id`,`timestamp`),
KEY `interval_id` (`interval_id`),
KEY `timestamp` (`timestamp`),
CONSTRAINT `timestamps_ibfk_1` FOREIGN KEY (`interval_id`) REFERENCES `intervals` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=75157 DEFAULT CHARSET=latin1
Here's a possible solution (see this sqlfiddle to play around with it)
SELECT *
FROM mytable major inner join mytable minor
ON minor.timestamp > major.timestamp
WHERE (SELECT COUNT(*) FROM mytable m WHERE m.timestamp < minor.timestamp and m.timestamp > major.timestamp) < 3
ORDER BY major.timestamp, minor.timestamp
I'm definitely not confident this is the cleanest solution (and I didn't do anything to handle "ties" for equal timestamps), but it does do what you want so it might be something to build off of at a minimum.
All I am doing is joining the tables then counting the number of rows "between" the major and minor so that I don't get too many.
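The counting idea can be tried on the simplified table from the question. The sketch below uses Python's built-in sqlite3 module in place of MySQL; the table name mytable follows the answer, while the column name ts and the helper names are assumptions made for this example.

```python
import sqlite3

# Pair every row with each later row, but keep a pair only when fewer than
# three other timestamps lie strictly between the two.
NEXT_THREE_SQL = """
SELECT major.ts, minor.ts
FROM mytable major
JOIN mytable minor ON minor.ts > major.ts
WHERE (SELECT COUNT(*)
       FROM mytable m
       WHERE m.ts < minor.ts
         AND m.ts > major.ts) < 3
ORDER BY major.ts, minor.ts
"""


def build_timestamps():
    """Load the (id, timestamp) sample rows from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, ts INTEGER)")
    conn.executemany(
        "INSERT INTO mytable VALUES (?, ?)",
        [(1, 1), (23, 2), (12, 4), (45, 6), (3, 7), (4, 8)],
    )
    return conn


def next_three(conn):
    """Return (major, minor) timestamp pairs, at most three per major."""
    return conn.execute(NEXT_THREE_SQL).fetchall()


if __name__ == "__main__":
    for major, minor in next_three(build_timestamps()):
        print(major, minor)
```

The first six pairs match the result listed in the question. Rows near the end of the table get fewer than three partners because there are not enough later timestamps left.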

create columns and rows mysql

I have table data such as this
index user date rank
11 a 1Mar 23
12 b 1Mar 16
13 a 2Mar 24
14 b 2Mar 18
What I would like to achieve via a query is this:
1Mar 2Mar
a 23 24
b 16 18
I don't know if this can be done via a single statement at the command line or if this will have to be done via a form and some scripting. Doing through scripting I can do, but can't see how to do in a single statement.
You can pivot as shown below if you know all possible values of date in advance; otherwise you need to use dynamic SQL.
-- rank is a reserved word as of MySQL 8.0, so it is quoted with backticks
SELECT user,
       MAX(CASE WHEN date = '1Mar' THEN `rank` ELSE NULL END) AS '1Mar',
       MAX(CASE WHEN date = '2Mar' THEN `rank` ELSE NULL END) AS '2Mar'
FROM Table1
GROUP BY user
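When the date values are not known up front, the "dynamic SQL" route means generating one MAX(CASE ...) column per distinct date and then running the generated statement. Here is a rough sketch of that idea using Python's built-in sqlite3 module as a stand-in for MySQL. The column name idx (instead of index) and the helpers build_ranks and pivot_ranks are assumptions for this example; inside MySQL the same statement text could be assembled with GROUP_CONCAT and run through PREPARE and EXECUTE.

```python
import sqlite3


def build_ranks():
    """Load the sample rows from the question into an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (idx INTEGER, user TEXT, date TEXT, rank INTEGER)")
    conn.executemany(
        "INSERT INTO table1 VALUES (?, ?, ?, ?)",
        [(11, "a", "1Mar", 23), (12, "b", "1Mar", 16),
         (13, "a", "2Mar", 24), (14, "b", "2Mar", 18)],
    )
    return conn


def pivot_ranks(conn):
    """Build and run a pivot query with one column per distinct date."""
    # Collect the distinct dates in order of first appearance.
    dates = [row[0] for row in conn.execute(
        "SELECT date FROM table1 GROUP BY date ORDER BY MIN(idx)")]
    # Date values are passed as parameters; only the column alias is
    # interpolated, with embedded double quotes doubled.
    columns = ", ".join(
        'MAX(CASE WHEN date = ? THEN rank END) AS "{}"'.format(d.replace('"', '""'))
        for d in dates
    )
    sql = f"SELECT user, {columns} FROM table1 GROUP BY user ORDER BY user"
    cursor = conn.execute(sql, dates)
    header = [col[0] for col in cursor.description]
    return header, cursor.fetchall()


if __name__ == "__main__":
    header, rows = pivot_ranks(build_ranks())
    print(header)
    for row in rows:
        print(row)
```

The returned header and rows correspond to the layout asked for in the question: one row per user and one column per date.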

MySQL query with multiple subqueries is too slow. How do I speed it up?

I have a MySQL query which does exactly what I want, but it takes anywhere between 110 and 130 seconds to process. The problem is that it works in tandem with software that times out 20 seconds after making the query.
Is there anything I can do to speed up the query? I'm considering moving the db over to another server, but are there any more elegant options before I go that route?
-- 1 Give me a list of IDs & eBayItemIDs
-- 2 where it is flagged as bottom tier
-- 3 Where it has been checked less than 168 times
-- 4 Where it has not been checked in the last hour
-- 5 Or where it was never checked but appears on the master list.
-- 1 Give me a list of IDs & eBayItemIDs
SELECT `id`, eBayItemID
FROM `eBayDD_Main`
-- 2 where it is flagged as bottom tier
WHERE `isBottomTier`='0'
-- 3 Where it has been checked less than 168 times
AND (`id` IN
(SELECT `mainid`
FROM `eBayDD_History`
GROUP BY `mainid`
HAVING COUNT(`mainID`) < 168)
-- 4 Where it has not been checked in the last hour
AND id IN
(SELECT `mainID`
FROM `eBayDD_History`
GROUP BY `mainID`
HAVING ((TIME_TO_SEC(TIMEDIFF(NOW(), MAX(`dateCollected`)))/60)/60) > 1))
-- 5 Or where it was never checked but appears on the master list.
OR (`id` IN
(SELECT `id`
FROM `eBayDD_Main`)
AND `id` NOT IN
(SELECT `mainID`
FROM `eBayDD_History`))
If I understand the logic correctly, you should be able to replace the query with this:
select m.`id`, m.eBayItemID
from `eBayDD_Main` m left outer join
     -- one row per mainID: number of checks and hours since the last check
     (select `mainID`, count(`mainID`) as cnt,
             ((TIME_TO_SEC(TIMEDIFF(NOW(), MAX(`dateCollected`)))/60)/60) as dc
      from `eBayDD_History`
      group by `mainID`
     ) hm
     on m.`id` = hm.`mainID`
where (m.`isBottomTier` = '0' and hm.cnt < 168 and hm.dc > 1) or
      hm.`mainID` is null;

MySQL: How to construct a given query

I am not a MySQL guru at all, and I would really appreciate if someone takes some time to help me. I have three tables as shown below:
TEAM(teamID, teamName, userID)
YOUTH_TEAM(youthTeamID, youthTeamName, teamID)
YOUTH_PLAYER(youthPlayerID, youthPlayerFirstName, youthPlayerLastName, youthPlayerAge, youthPlayerDays, youthPlayerRating, youthPlayerPosition, youthTeamID)
And this is the query that I have now:
SELECT team.teamName, youth_team.youthTeamName, youth_player.*
FROM youth_player
INNER JOIN youth_team ON youth_player.youthTeamID = youth_team.youthTeamID
INNER JOIN team ON youth_team.teamID = team.teamID
WHERE youth_player.youthPlayerAge < 18
AND youth_player.youthPlayerDays < 21
AND youth_player.youthPlayerRating >= 5.5
What I would like to add to this query is more thorough checks like the following:
if a player is 16 years old and his position is scorer, then he should have a rating of at least 7 in order to be returned
if a player is 15 years old and his position is playmaker, then he should have a rating of at least 5.5 in order to be returned
etc., etc.
How can I implement these requirements in my query (if possible), and would such a query be a bad solution? Would it be better to do the selection in PHP code (supposing I use PHP) instead of doing it in the query?
Here is a possible solution with an additional "criteria/filter" table:
-- SAMPLE TEAMS: Yankees, Knicks:
INSERT INTO `team` VALUES (1,'Yankees',2),(2,'Knicks',1);
-- SAMPLE YOUTH TEAMS: Yankees Juniors, Knicks Juniors
INSERT INTO `youth_team` VALUES (1,'Knicks Juniors',1),(2,'Yankees Juniors',2);
-- SAMPLE PLAYERS
INSERT INTO `youth_player` VALUES
(1,'Carmelo','Anthony',16,20,7.5,'scorer',1),
(2,'Amar\'e','Stoudemire',17,45,5.5,'playmaker',1),
(3,'Iman','Shumpert',15,15,6.1,'playmaker',1),
(4,'Alex','Rodriguez',18,60,3.5,'playmaker',2),
(5,'Hiroki','Kuroda',16,17,8.7,'scorer',2),
(6,'Ichiro','Suzuki',19,73,8.3,'playmaker',2);
-- CRITERIA TABLE
CREATE TABLE `criterias` (
`id` int(11) NOT NULL,
`age` int(11) DEFAULT NULL,
`position` varchar(45) DEFAULT NULL,
`min_rating` double DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
-- SAMPLE CRITERIAS
-- AGE=16, POSITION=SCORER, MIN_RATING=7
-- AGE=15, POSITION=PLAYMAKER, MIN_RATING=5.5
INSERT INTO `criterias` VALUES (1,16,'scorer',7), (2,15,'playmaker',5.5);
Now your query could look like:
SELECT team.teamName, youth_team.youthTeamName, youth_player.*
FROM youth_player
CROSS JOIN criterias
INNER JOIN youth_team ON youth_player.youthTeamID = youth_team.youthTeamID
INNER JOIN team ON youth_team.teamID = team.teamID
WHERE
(
youth_player.youthPlayerAge < 18
AND youth_player.youthPlayerDays < 21
AND youth_player.youthPlayerRating >= 5.5
)
AND
(
youth_player.youthPlayerAge = criterias.age
AND youth_player.youthPlayerPosition = criterias.position
AND youth_player.youthPlayerRating >= criterias.min_rating
)
This yields (shortened results):
teamName youthTeamName youthPlayerName Age Days Rating Position
=============================================================================
Yankees "Knicks Juniors" Carmelo Anthony 16 20 7.5 scorer
Yankees "Knicks Juniors" Iman Shumpert 15 15 6.1 playmaker
Knicks "Yankees Juniors" Hiroki Kuroda 16 17 8.7 scorer
Doing it in the query is quite fine, as long as it doesn't get too messy. You can do a lot in a query, but it may become hard to maintain. So if it gets too long and you want somebody else to take a look at it, you should split it up or find a solution in your PHP script.
As for your requirements, add this to your WHERE clause:
AND
(
(YOUTH_PLAYER.youthPlayerAge >= 16 AND YOUTH_PLAYER.youthPlayerPosition = 'scorer' AND YOUTH_PLAYER.youthPlayerRating >= 7)
OR (YOUTH_PLAYER.youthPlayerAge >= 15 AND YOUTH_PLAYER.youthPlayerPosition = 'playmaker' AND YOUTH_PLAYER.youthPlayerRating >= 5.5)
)