Select first and last timestamp where userID is unique - mysql

I'm trying to do a query to get first and last timestamp of each unique user.
Database looks like this:
| ID | EventID | Timestamp | Person | Number |
--------------------------------------------------------
| 1 | 2 | 2015-01-08 17:31:40 | 7 | 5 |
| 2 | 2 | 2015-01-08 17:35:40 | 7 | 4 |
| 3 | 2 | 2015-01-08 17:38:40 | 7 | 7 |
--------------------------------------------------------
I'm trying to put together a MySQL query that will do the following:
SUM of number field for each unique user.
Time difference (in hours) between first and last row for each unique user.
I would imagine that if I could get the first and last timestamp for each user, I should be able to use timediff to get the time difference in hours.
What I've got so far:
SELECT
person,
SUM(number) AS 'numbers_all_sum'
FROM database
WHERE eventid = 2
GROUP BY person
ORDER BY numbers_all_sum DESC
Any help would be greatly appreciated.

Something like this:
SELECT
Person,
MIN(Timestamp),
MAX(Timestamp),
SUM(number) AS 'numbers_all_sum'
FROM database
WHERE eventid = 2
GROUP BY person
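A minimal, runnable sketch of the same idea, using Python's built-in sqlite3 module as a stand-in for MySQL. The table name events and the column name ts are assumptions made for this illustration. The hour difference is computed client-side here; in MySQL itself, TIMESTAMPDIFF(HOUR, MIN(Timestamp), MAX(Timestamp)) should give the whole-hour difference inside the same query.

```python
import sqlite3
from datetime import datetime

# In-memory stand-in for the asker's table ("events" and "ts" are assumed names).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER, eventid INTEGER, ts TEXT, person INTEGER, number INTEGER)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?, ?)",
    [
        (1, 2, "2015-01-08 17:31:40", 7, 5),
        (2, 2, "2015-01-08 17:35:40", 7, 4),
        (3, 2, "2015-01-08 17:38:40", 7, 7),
    ],
)


def summarize(conn, event_id):
    """Return (person, first_ts, last_ts, numbers_all_sum, hours) per person."""
    rows = conn.execute(
        """
        SELECT person, MIN(ts), MAX(ts), SUM(number) AS numbers_all_sum
        FROM events
        WHERE eventid = ?
        GROUP BY person
        ORDER BY numbers_all_sum DESC
        """,
        (event_id,),
    ).fetchall()
    fmt = "%Y-%m-%d %H:%M:%S"
    result = []
    for person, first, last, total in rows:
        # Hours between the first and last row of this person.
        span = datetime.strptime(last, fmt) - datetime.strptime(first, fmt)
        result.append((person, first, last, total, span.total_seconds() / 3600))
    return result


summary = summarize(conn, 2)
```

For the three sample rows, summary holds a single entry for person 7 with a sum of 16 and a span of 7 minutes (about 0.117 hours).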

Related

How to get the average time between multiple dates

What I'm trying to do is bucket my customers based on their transaction frequency. I have the date recorded for every time they transact, but I can't work out how to get the average delta between consecutive dates. What I effectively want is a table showing me:
| User | Average Frequency
| 1 | 15
| 2 | 15
| 3 | 35
...
The data I currently have is formatted like this:
| User | Transaction Date
| 1 | 2018-01-01
| 1 | 2018-01-15
| 1 | 2018-02-01
| 2 | 2018-06-01
| 2 | 2018-06-18
| 2 | 2018-07-01
| 3 | 2019-01-01
| 3 | 2019-02-05
...
So basically, each customer will have multiple transactions and I want to understand how to get the delta between each date and then average of the deltas.
I know the DATEDIFF function and how it works, but I can't work out how to split the transactions up. I also know that an offset function is available in tools like Looker, but I don't know the syntax behind it.
Thanks
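In MySQL 8+ you can use LAG to get the previous Transaction Date for each row and then use DATEDIFF to get the difference between two consecutive dates. The first row for each user has no previous date, so its delta is NULL, which AVG ignores. You can then take the average of those values: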
SELECT User, AVG(delta) AS `Average Frequency`
FROM (SELECT User,
DATEDIFF(`Transaction Date`, LAG(`Transaction Date`) OVER (PARTITION BY User ORDER BY `Transaction Date`)) AS delta
FROM transactions) t
GROUP BY User
Output:
User Average Frequency
1 15.5
2 15
3 35
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(user INT NOT NULL
,transaction_date DATE
,PRIMARY KEY(user,transaction_date)
);
INSERT INTO my_table VALUES
(1,'2018-01-01'),
(1,'2018-01-15'),
(1,'2018-02-01'),
(2,'2018-06-01'),
(2,'2018-06-18'),
(2,'2018-07-01'),
(3,'2019-01-01'),
(3,'2019-02-05');
SELECT user
, AVG(delta) avg_delta
FROM
( SELECT x.*
, DATEDIFF(x.transaction_date,MAX(y.transaction_date)) delta
FROM my_table x
JOIN my_table y
ON y.user = x.user
AND y.transaction_date < x.transaction_date
GROUP
BY x.user
, x.transaction_date
) a
GROUP
BY user;
+------+-----------+
| user | avg_delta |
+------+-----------+
| 1 | 15.5000 |
| 2 | 15.0000 |
| 3 | 35.0000 |
+------+-----------+
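As a cross-check of the figures above, the same arithmetic can be sketched in plain Python: sort each user's dates, take the gaps in days between consecutive dates, and average them. The function name average_gaps is hypothetical and exists only for this illustration.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (user, transaction date).
transactions = [
    (1, date(2018, 1, 1)), (1, date(2018, 1, 15)), (1, date(2018, 2, 1)),
    (2, date(2018, 6, 1)), (2, date(2018, 6, 18)), (2, date(2018, 7, 1)),
    (3, date(2019, 1, 1)), (3, date(2019, 2, 5)),
]


def average_gaps(rows):
    """Average number of days between consecutive transactions, per user."""
    by_user = defaultdict(list)
    for user, day in rows:
        by_user[user].append(day)
    result = {}
    for user, days in by_user.items():
        days.sort()
        # Pair each date with the one before it, like LAG or the self-join does.
        gaps = [(later - earlier).days for earlier, later in zip(days, days[1:])]
        if gaps:  # users with a single transaction have no gap to average
            result[user] = sum(gaps) / len(gaps)
    return result


averages = average_gaps(transactions)
```

User 1 has gaps of 14 and 17 days, which is where the 15.5 comes from.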
I don't know what to say other than use a GROUP BY.
SELECT User, AVG(DATEDIFF(...))
FROM ...
GROUP BY User

How to Exclude Mysql Group

I would like to know how to exclude a MySQL group. I researched this here for some time but just can't get it right.
I tried
SELECT * FROM booking GROUP BY BookingId HAVING Status!="Cancellation"
which obviously doesn't work.
Example Database looks like this:
+----+-----------+--------------+
| id | BookingId | Status |
+----+-----------+--------------+
| 1 | 1 | Booked |
| 2 | 1 | Cancellation |
| 3 | 2 | Booked |
+----+-----------+--------------+
I would like to group them by BookingId, and if any entry in a group has the Status Cancellation, that group shouldn't show up. From the example above, only id 3 would be shown.
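You need an aggregate function such as SUM in the HAVING clause. In MySQL the comparison Status = 'Cancellation' evaluates to 1 or 0, so summing it counts the cancelled rows in each group: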
SELECT BookingId
FROM booking
GROUP BY BookingId
HAVING sum(Status = 'Cancellation') = 0
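The query above returns only the BookingId values. If the full rows are wanted (the question expects id 3), one option is to join the filtered groups back to the table. Below is a small sketch of that, run against Python's built-in sqlite3 module as a stand-in for MySQL; the alias kept is an assumption made for this example.

```python
import sqlite3

# In-memory copy of the example table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE booking (id INTEGER, BookingId INTEGER, Status TEXT)")
conn.executemany(
    "INSERT INTO booking VALUES (?, ?, ?)",
    [(1, 1, "Booked"), (2, 1, "Cancellation"), (3, 2, "Booked")],
)

# The inner query keeps only groups with zero cancelled rows;
# the join then returns every row belonging to those groups.
rows = conn.execute(
    """
    SELECT b.id, b.BookingId, b.Status
    FROM booking AS b
    JOIN (
        SELECT BookingId
        FROM booking
        GROUP BY BookingId
        HAVING SUM(Status = 'Cancellation') = 0
    ) AS kept ON kept.BookingId = b.BookingId
    ORDER BY b.id
    """
).fetchall()
```

With the sample data, rows contains only the row with id 3, because BookingId 1 has a cancelled entry.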

how to group mysql based on items and today date

I have a database that stores transaction logs. I would like to count all the logs for the current day and group them by prod_id.
MySQL table structure:
Table name = products
+------+---------+------------+--------+
| ID | PROD_ID | DATE | PERSON |
+------+---------+------------+--------+
| 1 | 2 | 1400137633 | 1 |
| 2 | 2 | 1400137666 | 1 |
| 3 | 3 | 1400137125 | 2 |
| 4 | 4 | 1400137563 | 1 |
| 5 | 2 | 1400137425 | 2 |
| 6 | 3 | 1400137336 | 1 |
+------+---------+------------+--------+
MYSQL CODE:
$q = 'SELECT count(ID) as count
FROM PRODUCTS
WHERE PERSON ='.$db->qstr($person).'
AND DATE(FROM_UNIXTIME(DATE)) = DATE(NOW())';
So what I get is the number of items for the given date, since the date is the same for all entries. However, I would like to group the items by PROD_ID. I tried GROUP BY PROD_ID, but that did not give me what I want. If a PROD_ID appears several times on the same date, it should be counted as one entry, while the other products are still counted.
So for $person = 1, the repeated PROD_ID 2 rows should count as 1, plus one each for PROD_ID 3 and 4, giving a total of 3.
Any suggestions?
Use DISTINCT with COUNT on PROD_ID.
Example:
SELECT count( distinct PROD_ID ) as count
FROM PRODUCTS
WHERE PERSON = 1 -- <---- change this with relevant variable
AND DATE( FROM_UNIXTIME (DATE ) ) = curdate();
I also suggest using a prepared statement to bind the values.
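To illustrate value binding, here is a rough sketch using Python's built-in sqlite3 module as a stand-in for MySQL; the ? placeholders play the role of prepared-statement parameters. The function name and the choice to pass the day boundaries as bound epoch values (computed in UTC) are assumptions made for this example; the original query relies on the server's time zone via FROM_UNIXTIME instead.

```python
import sqlite3
from datetime import date, datetime, timezone

# In-memory copy of the example table (DATE holds Unix timestamps).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (ID INTEGER, PROD_ID INTEGER, DATE INTEGER, PERSON INTEGER)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?, ?)",
    [
        (1, 2, 1400137633, 1),
        (2, 2, 1400137666, 1),
        (3, 3, 1400137125, 2),
        (4, 4, 1400137563, 1),
        (5, 2, 1400137425, 2),
        (6, 3, 1400137336, 1),
    ],
)


def distinct_products(conn, person, day):
    """Count distinct PROD_ID values logged by one person on one UTC day."""
    start = int(datetime(day.year, day.month, day.day, tzinfo=timezone.utc).timestamp())
    end = start + 86400  # first second of the following day
    # All three values are bound, never concatenated into the SQL text.
    row = conn.execute(
        "SELECT COUNT(DISTINCT PROD_ID) FROM products "
        "WHERE PERSON = ? AND DATE >= ? AND DATE < ?",
        (person, start, end),
    ).fetchone()
    return row[0]


count_person_1 = distinct_products(conn, 1, date(2014, 5, 15))
```

All sample timestamps fall on 2014-05-15 (UTC), so person 1 yields 3 distinct products (2, 3 and 4).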

Retrieving the most recent entry per user

If I have a table with the following structure and data:
id | user_id | created_at
-------------------------
1 | 7 | 0091942
2 | 3 | 0000014
3 | 6 | 0000890
4 | 6 | 0029249
5 | 7 | 0000049
6 | 3 | 0005440
7 | 9 | 0010108
What query would I use to get the following results (explanation to follow):
id | user_id | created_at
-------------------------
1 | 7 | 0091942
6 | 3 | 0005440
4 | 6 | 0029249
7 | 9 | 0010108
As you can see:
Only one row per user_id is returned.
The row with the highest created_at is the one returned.
Is there a way to accomplish this without using subqueries? Is there a name in relational algebra parlance that this procedure goes by?
The query is known as a groupwise maximum, which (in MySQL, at least) can be implemented with a subquery. For example:
SELECT my_table.* FROM my_table NATURAL JOIN (
SELECT user_id, MAX(created_at) created_at
FROM my_table
GROUP BY user_id
) t
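The groupwise-maximum pattern can be tried out with Python's built-in sqlite3 module standing in for MySQL. This sketch spells the join condition out with ON instead of NATURAL JOIN, which should be equivalent here because user_id and created_at are the only shared column names; the alias latest is an assumption made for this example.

```python
import sqlite3

# In-memory copy of the example table; created_at is stored as an integer.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, user_id INTEGER, created_at INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        (1, 7, 91942), (2, 3, 14), (3, 6, 890), (4, 6, 29249),
        (5, 7, 49), (6, 3, 5440), (7, 9, 10108),
    ],
)

# The subquery finds the newest created_at per user; the join picks the
# complete row that carries that value, so id always matches created_at.
rows = conn.execute(
    """
    SELECT m.id, m.user_id, m.created_at
    FROM my_table AS m
    JOIN (
        SELECT user_id, MAX(created_at) AS created_at
        FROM my_table
        GROUP BY user_id
    ) AS latest
      ON latest.user_id = m.user_id
     AND latest.created_at = m.created_at
    ORDER BY m.id
    """
).fetchall()
```

The result contains ids 1, 4, 6 and 7, the same rows the question asks for. If two rows of one user shared the same maximum created_at, both would be returned.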
You can just get the max and group by the user_id:
select id,user_id,max(created_at)
from supportContacts
group by user_id
order by id;
Here is what it outputs:
ID USER_ID MAX(CREATED_AT)
1 7 91942
2 3 5440
3 6 29249
7 9 10108
See the working demo here
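Note that the example on the fiddle stores created_at as an int; using your own format should make no difference.
EDIT: I will leave this answer as a reference, but note that this query produces undesired results, as Gordon stated: id is taken from an arbitrary row in each group rather than from the row holding the maximum created_at (the output above pairs id 2 with 5440, which belongs to id 6). Please do not use this in production.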

Counting messages per day (after 17:00 counts for next day)

I have two MySQL tables: stats (left) and messages (right)
+------------+----------+   +--------+------------+----------+---------+
| _date      | msgcount |   | msg_id | _date      | time     | message |
+------------+----------+   +--------+------------+----------+---------+
| 2011-01-22 | 2        |   | 1      | 2011-01-22 | 06:23:11 | foo bar |
| 2011-01-23 | 4        |   | 2      | 2011-01-22 | 15:17:03 | baz     |
| 2011-01-24 | 0        |   | 3      | 2011-01-22 | 17:05:45 | foobar  |
| 2011-01-25 | 1        |   | 4      | 2011-01-22 | 23:58:13 | barbaz  |
+------------+----------+   | 5      | 2011-01-23 | 00:06:32 | foo foo |
                            | 6      | 2011-01-23 | 13:45:00 | bar foo |
                            | 7      | 2011-01-25 | 02:22:34 | baz baz |
                            +--------+------------+----------+---------+
I filled in stats.msgcount for illustration, but in reality it is still empty. I'm looking for a query to:
count the number of messages for every stats._date (notice the zero msgcount on 2011-01-24)
messages.time is in 24-hour format. All messages AFTER 5 o'clock (17:00:00) should be counted for the next day (notice msg_id 3 and 4 count for 2011-01-23)
update stats.msgcount to hold all counts
I'm especially concerned about the "later than 17:00:00 count for next day" part. Is this possible in (My)SQL?
You could use:
UPDATE stats LEFT JOIN
( SELECT date(addtime(_date,time) + interval 7 hour) as corrected_date,
count(*) as message_count
FROM messages
GROUP BY corrected_date ) mc
ON stats._date = mc.corrected_date
SET stats.msgcount = COALESCE( mc.message_count, 0 )
However, this query requires the dates you are interested in to already be in the stats table. If they are not, make _date a primary or unique key (if it isn't one already) and use:
INSERT IGNORE INTO stats(_date,msgcount)
SELECT date(addtime(_date,time) + interval 7 hour) as corrected_date,
count(*) as message_count
FROM messages
GROUP BY corrected_date
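The 7-hour shift can be checked against the sample data with a short Python sketch: adding 7 hours moves everything from 17:00 onward past midnight, so taking the date of the shifted timestamp gives the day the message counts for. The function name is hypothetical and used only for this illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

# Sample rows from the question: (msg_id, _date, time).
messages = [
    (1, "2011-01-22", "06:23:11"),
    (2, "2011-01-22", "15:17:03"),
    (3, "2011-01-22", "17:05:45"),
    (4, "2011-01-22", "23:58:13"),
    (5, "2011-01-23", "00:06:32"),
    (6, "2011-01-23", "13:45:00"),
    (7, "2011-01-25", "02:22:34"),
]
stats_dates = ["2011-01-22", "2011-01-23", "2011-01-24", "2011-01-25"]


def counts_per_stats_date(messages, stats_dates):
    """Count messages per day, treating 17:00 and later as the next day."""
    shifted = Counter()
    for _msg_id, day, time in messages:
        sent = datetime.strptime(f"{day} {time}", "%Y-%m-%d %H:%M:%S")
        corrected = (sent + timedelta(hours=7)).date().isoformat()
        shifted[corrected] += 1
    # Dates without any message get 0, like COALESCE(mc.message_count, 0).
    return {d: shifted.get(d, 0) for d in stats_dates}


msgcounts = counts_per_stats_date(messages, stats_dates)
```

The resulting counts (2, 4, 0, 1) match the msgcount column shown in the question. Note that a message sent at exactly 17:00:00 is also moved to the next day by this shift.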
Really, all you're doing is shifting the times by 7 hours. Something like this should work:
UPDATE stats s
SET s.msgcount = (SELECT COUNT(m.msg_id) FROM messages m
WHERE DATE_ADD(m._date, INTERVAL TIME_TO_SEC(m.time) SECOND) >= DATE_SUB(s._date, INTERVAL 7 HOUR)
AND DATE_ADD(m._date, INTERVAL TIME_TO_SEC(m.time) SECOND) < DATE_ADD(s._date, INTERVAL 17 HOUR));
The basic idea is that it takes each date in your stats table, adjusts it by 7 hours, and looks for messages sent in that range. If you used a DATETIME column instead of separate DATE and TIME columns, you wouldn't need the extra DATE_ADD(..., TIME_TO_SEC) stuff.
There may be a better way to add a date and a time; I didn't see one on a quick look through the MySQL reference documentation.
So all you'd need to do is insert a new row in the stats table with a 0 for the msgcount, and run the update command. If you only wanted to update a few days (since the message count probably isn't changing 6 days later) you just need a simple where clause on the update:
UPDATE stats s
SET ...
WHERE s._date BETWEEN '2012-04-03' AND '2012-04-08'