Write a query to identify frequent posters - mysql

I'm trying to write a query that will find the user_id's of all users
that have created a minimum of two posts in a maximum of 1 hour.
Here's a light example of the data:
CREATE TABLE tbl_posts
(`id` int, `user_id` int, `created_date` datetime);
INSERT INTO tbl_posts
(`id`, `user_id`, `created_date`)
VALUES
(1, 1, '2021-07-01 09:00'),
(2, 2, '2021-07-01 10:15'), -- *
(3, 2, '2021-07-01 11:00'), -- * user posted twice within an hour.
(4, 3, '2021-07-01 13:00'),
(5, 3, '2021-07-01 15:00'),
(6, 3, '2021-07-01 18:00'),
(7, 4, '2021-07-01 11:00'),
(8, 4, '2021-07-02 11:30'),
(9, 4, '2021-07-03 12:30'), -- *
(10, 4, '2021-07-03 12:45'); -- * user posted twice within an hour.
http://sqlfiddle.com/#!9/0e7cba
The expected output of the query is
2, 4
This output is expected because users 2 and 4 have each posted at least twice in under an hour.
I don't know where to begin with this in MySQL. I can export the data and get a result procedurally in something like C or Python, but I'm sure this is accomplishable in MySQL and am curious to know how. Maybe I need a Window function?

Use EXISTS:
SELECT DISTINCT t1.user_id
FROM tbl_posts t1
WHERE EXISTS (
SELECT 1
FROM tbl_posts t2
WHERE t2.user_id = t1.user_id
AND t1.created_date < t2.created_date
AND TIMESTAMPDIFF(SECOND, t1.created_date, t2.created_date) <= 60 * 60
)
Or, if your version of MySQL is 8.0+, use the LEAD() window function:
SELECT user_id
FROM (
SELECT *, TIMESTAMPDIFF(
SECOND,
created_date,
LEAD(created_date) OVER (PARTITION BY user_id ORDER BY created_date)
) diff
FROM tbl_posts
) t
GROUP BY user_id
HAVING MIN(diff) <= 60 * 60
See the demo.
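For comparison, the same idea as the LEAD() query (compare each post with the same user's next post) can be sketched procedurally. This is a minimal Python illustration over the sample rows from the question, not a replacement for the SQL; the function name `frequent_posters` and the `window` parameter are made up for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Sample rows from the question: (id, user_id, created_date).
POSTS = [
    (1, 1, "2021-07-01 09:00"),
    (2, 2, "2021-07-01 10:15"),
    (3, 2, "2021-07-01 11:00"),
    (4, 3, "2021-07-01 13:00"),
    (5, 3, "2021-07-01 15:00"),
    (6, 3, "2021-07-01 18:00"),
    (7, 4, "2021-07-01 11:00"),
    (8, 4, "2021-07-02 11:30"),
    (9, 4, "2021-07-03 12:30"),
    (10, 4, "2021-07-03 12:45"),
]


def frequent_posters(posts, window=timedelta(hours=1)):
    """Return sorted user_ids that have two consecutive posts at most `window` apart."""
    by_user = defaultdict(list)
    for _post_id, user_id, created in posts:
        by_user[user_id].append(datetime.strptime(created, "%Y-%m-%d %H:%M"))

    result = []
    for user_id, times in by_user.items():
        times.sort()
        # Checking adjacent posts is enough: once the timestamps are sorted,
        # the closest pair is always a pair of neighbours.
        # Like the LEAD() query, two posts with identical timestamps count.
        if any(later - earlier <= window for earlier, later in zip(times, times[1:])):
            result.append(user_id)
    return sorted(result)


print(frequent_posters(POSTS))  # [2, 4]
```

Sorting per user and looking only at neighbours keeps the work at O(n log n), whereas the self-join compares every pair of posts by the same user.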

select distinct p.user_id from tbl_posts p
inner join tbl_posts p2 on p.user_id = p2.user_id
and p.created_date < p2.created_date
and DATE_ADD(p.created_date,interval 1 hour) >= p2.created_date

Related

MySQL get earliest record of each day

The query below gives me one record per day for each user. How can I modify it so that it gives me the earliest record per day for each user?
I tried using MIN() on the date field in the GROUP BY part, but that obviously doesn't work. There's a date_trunc function mentioned in this answer which seems to do what I want, but it is not available in MySQL. What's the best way to go about this?
For the sample data below, the query should return records with ids 1, 3, 5, and 7.
SELECT user_id, coords, date
FROM table
WHERE draft = 0
GROUP BY user_id, DAY('date')
CREATE TABLE `table` (
`id` bigint(20) UNSIGNED NOT NULL,
`user_id` int(11) NOT NULL,
`coords` point NOT NULL,
`date` datetime NOT NULL,
`draft` tinyint(4) NOT NULL DEFAULT 0
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `table` (`id`, `user_id`, `coords`, `date`, `draft`) VALUES
(1, 1, xxx, '2020-11-08 18:01:47', 0),
(2, 1, xxx, '2020-11-08 18:05:47', 0),
(3, 1, xxx, '2020-11-09 18:06:47', 0),
(4, 1, xxx, '2020-11-09 18:07:47', 0),
(5, 2, xxx, '2020-11-08 17:01:47', 0),
(6, 2, xxx, '2020-11-08 17:05:47', 0),
(7, 2, xxx, '2020-11-09 14:00:47', 0),
(8, 2, xxx, '2020-11-09 14:05:47', 0);
A typical approach is to filter with a correlated subquery:
select t.*
from mytable t
where t.draft = 0 and t.date = (
select min(t1.date)
from mytable t1
where t1.draft = t.draft and t1.user_id = t.user_id and date(t1.date) = date(t.date)
)
You can optimize the subquery a little by using a half-open interval for filtering:
select t.*
from mytable t
where t.draft = 0 and t.date = (
select min(t1.date)
from mytable t1
where
t1.user_id = t.user_id
and t1.draft = t.draft
and t1.date >= date(t.date)
and t1.date < date(t.date) + interval 1 day
)
The second query should be able to take advantage of an index on (draft, user_id, date).
Alternatively, if you are running MySQL 8.0, you can also use window functions:
select *
from (
select t.*, row_number() over(partition by user_id, date(date) order by date) rn
from mytable t
where draft = 0
) t
where rn = 1
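To check the half-open interval filter against the sample rows, here is a small sketch that runs the correlated subquery in Python's built-in sqlite3 module. SQLite stands in for MySQL purely as an assumption to keep the example self-contained, so `date(x, '+1 day')` replaces `date(x) + interval 1 day`. The `coords` column is left out because its sample values are elided in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER, user_id INTEGER, date TEXT, draft INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        (1, 1, "2020-11-08 18:01:47", 0),
        (2, 1, "2020-11-08 18:05:47", 0),
        (3, 1, "2020-11-09 18:06:47", 0),
        (4, 1, "2020-11-09 18:07:47", 0),
        (5, 2, "2020-11-08 17:01:47", 0),
        (6, 2, "2020-11-08 17:05:47", 0),
        (7, 2, "2020-11-09 14:00:47", 0),
        (8, 2, "2020-11-09 14:05:47", 0),
    ],
)

# Keep a row only if its timestamp is the minimum for the same user
# within [start of that day, start of the next day).
QUERY = """
SELECT t.id
FROM mytable t
WHERE t.draft = 0
  AND t.date = (
      SELECT MIN(t1.date)
      FROM mytable t1
      WHERE t1.user_id = t.user_id
        AND t1.draft = t.draft
        AND t1.date >= date(t.date)
        AND t1.date <  date(t.date, '+1 day')
  )
ORDER BY t.id
"""

earliest_ids = [row[0] for row in conn.execute(QUERY)]
print(earliest_ids)  # [1, 3, 5, 7]
```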
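Use:
-- DAY('date') evaluates the string literal 'date', not the column, so group on DATE(`date`).
-- MIN(`date`) gives the earliest timestamp per user and day; coords is left out because a
-- non-aggregated column is not guaranteed to come from the earliest row.
SELECT user_id, MIN(`date`) AS `date`
FROM `table`
WHERE draft = 0
GROUP BY user_id, DATE(`date`)
ORDER BY user_id, `date`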

Get n oldest rows, but no more than x that have the same value in a column

I have a simple table
CREATE TABLE `example` (
`id` int(12) NOT NULL,
`food` varchar(250) NOT NULL
);
With the following data
INSERT INTO `example` (`id`, `food`) VALUES
(1, 'apple'),
(2, 'apple'),
(3, 'apple'),
(4, 'apple'),
(5, 'apple'),
(6, 'apple'),
(7, 'apple'),
(8, 'banana'),
(9, 'banana'),
(10, 'potato'),
(11, 'potato'),
(12, 'potato'),
(13, 'banana'),
(14, 'banana'),
(15, 'banana');
I want to get the oldest 10 rows
SELECT *
FROM example
ORDER BY id ASC
LIMIT 10
But I don't want to get more than 5 rows where food has the same value.
My current query receives 7 apple (more than I want), 2 banana, and 1 potato. In the data provided, I'd want to receive 5 apple, 2 banana, and 3 potato.
How can I accomplish this?
Update:
SQL Group BY, Top N Items for each Group is not a duplicate because it involves a different database. In particular, GROUP BY works differently in SQL Server than it does in MySQL.
You can add a running count for each food, using variables or a correlated subquery. This uses the latter, counting from the oldest row so that the oldest five per food are kept:
select t.*
from (select t.*,
(select count(*) from example t2 where t2.food = t.food and t2.id <= t.id) as seqnum
from example t
) t
where seqnum <= 5
order by id
limit 10;
I haven't created the table and tested this, but it should give you what you want. It is just a different approach from the one above, and since it uses a window function it needs MySQL 8.0+.
Select *
From (Select ID, Food
, Count(Food) Over(Partition By Food Order by ID) as Appearances
From Your_Table) as a
Where a.Appearances <= 5
Order By ID Asc
You can add LIMIT 10 at the end if you want to cap the number of rows.
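As a quick check of the requirement in the question (the ten oldest rows, with no more than five per food), here is a minimal sketch using Python's built-in sqlite3 module. SQLite is used in place of MySQL only as an assumption to make the example self-contained. It ranks each row within its food by id using a correlated count, keeps ranks 1 to 5, and then takes the first ten rows by id.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example (id INTEGER NOT NULL, food TEXT NOT NULL)")
foods = ["apple"] * 7 + ["banana"] * 2 + ["potato"] * 3 + ["banana"] * 3
conn.executemany(
    "INSERT INTO example VALUES (?, ?)",
    list(enumerate(foods, start=1)),  # ids 1..15, same order as the question
)

# seqnum = how many rows of the same food have an id <= this row's id,
# i.e. the row's position within its food when ordered from oldest to newest.
QUERY = """
SELECT t.id, t.food
FROM (
    SELECT e.*,
           (SELECT COUNT(*)
            FROM example e2
            WHERE e2.food = e.food AND e2.id <= e.id) AS seqnum
    FROM example e
) t
WHERE t.seqnum <= 5
ORDER BY t.id
LIMIT 10
"""

rows = conn.execute(QUERY).fetchall()
per_food = Counter(food for _id, food in rows)
print([row_id for row_id, _food in rows])  # [1, 2, 3, 4, 5, 8, 9, 10, 11, 12]
print(dict(per_food))  # {'apple': 5, 'banana': 2, 'potato': 3}
```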

get rows from a table where value of field x is maximum

I have two tables myTable and myTable2 in a mysql database:
CREATE TABLE myTable (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
number INT,
version INT,
date DATE
) ENGINE MyISAM;
INSERT INTO myTable
(`id`, `number`, `version`, `date`)
VALUES
(1, '123', '1', '2016-01-12'),
(2, '123', '2', '2016-01-13'),
(3, '124', '1', '2016-01-14'),
(4, '124', '2', '2016-01-15'),
(5, '124', '3', '2016-01-16'),
(6, '125', '1', '2016-01-17')
;
CREATE TABLE myTable2 (
id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
myTable_id INT
) ENGINE MyISAM;
INSERT INTO myTable2
(`id`, `myTable_id`)
VALUES
(1, 1),
(2, 1),
(3, 2),
(4, 2),
(5, 3),
(6, 3),
(7, 4),
(8, 4),
(9, 4),
(10, 5),
(11, 6)
;
The field myTable2.myTable_id is a foreign key of myTable.Id.
I would like to get all the rows from myTable where myTable2.myTable_id = myTable.Id and the value of the field version in myTable is the maximum for every corresponding value for the field number in myTable.
I tried something like this:
SELECT
*
FROM
myTable,
myTable2
WHERE
myTable.version = (SELECT MAX(myTable.version) FROM myTable)
But the above query does not return the correct data. The correct query should output this:
Id number version date
2 123 2 2016-01-13
5 124 3 2016-01-16
6 125 1 2016-01-17
Please help!
One way to do this is to get the max version for each number in myTable in a derived table and join with that:
SELECT DISTINCT
m.*
FROM
myTable m
JOIN
myTable2 m2 ON m.id = m2.myTable_id
JOIN
(
SELECT number, MAX(version) AS max_version
FROM myTable
GROUP BY number
) AS derived_table
ON m.number = derived_table.number
AND m.version = derived_table.max_version
With your sample data this produces a result like this:
id number version date
6 125 1 2016-01-17
5 124 3 2016-01-16
2 123 2 2016-01-13
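The derived-table join can be checked against the sample data with a short sketch. It runs the same query shape in Python's built-in sqlite3 module, which stands in for MySQL here only as an assumption so the example is self-contained. An ORDER BY is added to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (id INTEGER PRIMARY KEY, number INTEGER, "
    "version INTEGER, date TEXT)"
)
conn.execute("CREATE TABLE myTable2 (id INTEGER PRIMARY KEY, myTable_id INTEGER)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?, ?)",
    [
        (1, 123, 1, "2016-01-12"),
        (2, 123, 2, "2016-01-13"),
        (3, 124, 1, "2016-01-14"),
        (4, 124, 2, "2016-01-15"),
        (5, 124, 3, "2016-01-16"),
        (6, 125, 1, "2016-01-17"),
    ],
)
conn.executemany(
    "INSERT INTO myTable2 VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 2), (4, 2), (5, 3), (6, 3),
     (7, 4), (8, 4), (9, 4), (10, 5), (11, 6)],
)

# The derived table holds the highest version per number; joining on both
# columns keeps only the rows that carry that highest version.
# DISTINCT removes the duplicates introduced by the join to myTable2.
QUERY = """
SELECT DISTINCT m.id, m.number, m.version, m.date
FROM myTable m
JOIN myTable2 m2 ON m.id = m2.myTable_id
JOIN (
    SELECT number, MAX(version) AS max_version
    FROM myTable
    GROUP BY number
) AS d ON m.number = d.number AND m.version = d.max_version
ORDER BY m.id
"""

latest_rows = conn.execute(QUERY).fetchall()
for row in latest_rows:
    print(row)
```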
Your query is logically wrong. Here is the correct one:
SELECT
*
FROM
myTable,
myTable2
WHERE
(myTable.version,myTable.number) in
(SELECT MAX(myTable.version),number FROM myTable group by number)
and myTable.id=myTable2.id
Here is the sqlfiddle http://sqlfiddle.com/#!9/74a67/4/0
This is the query posted for the previously edited question:
SELECT * FROM myTable
inner join myTable2 on myTable.id = myTable2.mytable_id
WHERE (version, number) in
(SELECT MAX(version), number FROM myTable group by number)
Try this solution, which simply uses a subquery:
# Selecting desired result..
SELECT t1.id, t1.number, t1.version, t1.date
FROM myTable As t1 JOIN
# subquery to select max version and its corresponding
# number from myTable
(SELECT number, max(version) As max_ver FROM myTable
GROUP BY number
) As t2 ON t1.number = t2.number and t1.version = t2.max_ver
# Now checking for foreign key..
WHERE t1.id IN (SELECT mytable_id FROM myTable2);
Was it helpful..

In MySQL how to query 2 columns from 1 row?

In MySQL, the table cardToCard gets one row each time a credit card balance is transferred from one card to another card.
create table cardToCard (
id int,
dt date,
card_from int,
card_to int,
amount decimal(6,2),
primary key (id)
);
insert into cardToCard values (1, '2014-01-01', 100, 101, 200.00);
insert into cardToCard values (2, '2014-01-01', 101, 102, 200.00);
insert into cardToCard values (3, '2014-01-01', 102, 103, 200.00);
insert into cardToCard values (4, '2014-01-01', 103, 104, 200.00);
insert into cardToCard values (5, '2014-01-01', 104, 100, 200.00);
insert into cardToCard values (6, '2014-01-01', 99, 104, 200.00);
Query which card has been used 3 or more times.
select card, count(*) 'count'
from
(
select card_from 'card', dt
from cardtocard
union all
select card_to 'card', dt
from cardtocard
) d
group by card
having count >= 3
The results are correct. The question is would it be more efficient to write this as a self join?
http://sqlfiddle.com/#!2/420e72/1
Possibly the most efficient way to write this query would be to start with a list of cards and then do:
select c.card,
((select count(*) from cardTocard ctc where ctc.card_from = c.card) +
(select count(*) from cardTocard ctc where ctc.card_to = c.card)
) as cnt
from cards c
having cnt >= 3;
Then, you need two indexes: cardTocard(card_from) and cardTocard(card_to).
This should use the index for the aggregation, which is typically faster than a file sort.
EDIT:
Using the structure that you are using, it can be faster to do aggregation in the subqueries as well as the outer query:
select card, sum(cnt) as cnt
from ((select card_from as card, count(*) as cnt
from cardtocard
group by card_from
) union all
(select card_to as card, count(*) as cnt
from cardtocard
group by card_to
)
) d
group by card
having sum(cnt) >= 3;
This can be faster because the volume of data for the subqueries is smaller than just union'ing them together.
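The pre-aggregation idea (count per `card_from` and per `card_to` first, then add the two partial counts) can be tried on the sample rows with a short sketch. It uses Python's built-in sqlite3 module instead of MySQL, an assumption made only so the example runs on its own; it says nothing about MySQL's performance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cardToCard (id INTEGER PRIMARY KEY, dt TEXT, "
    "card_from INTEGER, card_to INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO cardToCard VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2014-01-01", 100, 101, 200.00),
        (2, "2014-01-01", 101, 102, 200.00),
        (3, "2014-01-01", 102, 103, 200.00),
        (4, "2014-01-01", 103, 104, 200.00),
        (5, "2014-01-01", 104, 100, 200.00),
        (6, "2014-01-01", 99, 104, 200.00),
    ],
)

# Each branch of the UNION ALL already returns one row per card, so the
# outer GROUP BY only has to add at most two partial counts per card.
QUERY = """
SELECT card, SUM(cnt) AS total
FROM (
    SELECT card_from AS card, COUNT(*) AS cnt FROM cardToCard GROUP BY card_from
    UNION ALL
    SELECT card_to AS card, COUNT(*) AS cnt FROM cardToCard GROUP BY card_to
) d
GROUP BY card
HAVING SUM(cnt) >= 3
ORDER BY card
"""

busy_cards = conn.execute(QUERY).fetchall()
print(busy_cards)  # [(104, 3)]
```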

MySQL: find IDs with constantly increasing values

I have the following table:
create table my_table
(
SubjectID int,
Date Date,
Test_Value int
);
insert into my_table(SubjectID, Date, Test_Value)
values
(1, '2014-01-01', 55),
(1, '2014-01-05', 170),
(1, '2014-01-30', 160),
(2, '2014-01-02', 175),
(2, '2014-01-20', 166),
(2, '2014-01-21', 160),
(3, '2014-01-05', 70),
(3, '2014-01-07', 75),
(3, '2014-01-11', 180);
I want to find IDs with constantly increasing Test_Value over time. In this example, only SubjectID 3 satisfies that condition. Could you write the code to find this out? Thanks for your help as always.
SELECT *
FROM my_table o
WHERE NOT EXISTS (
SELECT null
FROM my_table t1
INNER JOIN my_table t2 ON t2.Date > t1.Date AND t2.Test_Value < t1.Test_Value AND t1.SubjectID = t2.SubjectID
WHERE t1.SubjectID = o.SubjectID
)
The inner query selects the rows that violate the requirement: pairs for the same subject where a later date has a smaller value. The outer query then keeps only the subjects for which no such pair exists.
SQLFiddle: http://www.sqlfiddle.com/#!2/1a7ba/12
PS: if you only need the IDs, use SELECT DISTINCT SubjectID.
If the values are not monotonically increasing, then there is at least one case where adjacent values decrease. Hence, you can reduce this problem to just looking at the previous value:
select t.SubjectId
from (select t.*,
(select t2.Test_Value
from my_table t2
where t2.SubjectId = t.SubjectId and
t2.Date < t.Date
order by t2.Date desc
limit 1
) as prev_Test_value
from my_table t
) t
group by t.SubjectId
having coalesce(sum(Test_Value < prev_Test_value), 0) = 0;
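The reduction to adjacent values can be illustrated procedurally. The sketch below is a plain Python version over the sample rows, with a made-up function name `never_decreasing_subjects`. Like the SQL conditions above, it only rejects a strict decrease, so two equal consecutive values would still pass.

```python
from collections import defaultdict

# Sample rows from the question: (SubjectID, Date, Test_Value).
ROWS = [
    (1, "2014-01-01", 55),
    (1, "2014-01-05", 170),
    (1, "2014-01-30", 160),
    (2, "2014-01-02", 175),
    (2, "2014-01-20", 166),
    (2, "2014-01-21", 160),
    (3, "2014-01-05", 70),
    (3, "2014-01-07", 75),
    (3, "2014-01-11", 180),
]


def never_decreasing_subjects(rows):
    """Return sorted SubjectIDs whose values never drop from one date to the next."""
    by_subject = defaultdict(list)
    for subject_id, date, value in rows:
        by_subject[subject_id].append((date, value))

    result = []
    for subject_id, readings in by_subject.items():
        readings.sort()  # ISO-formatted dates sort correctly as strings
        values = [value for _date, value in readings]
        # If any later value were smaller than an earlier one, some pair of
        # neighbours in date order would have to decrease as well.
        if all(prev <= cur for prev, cur in zip(values, values[1:])):
            result.append(subject_id)
    return sorted(result)


print(never_decreasing_subjects(ROWS))  # [3]
```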