I'm having a mental block with this query. I'm trying to return the row with the maximum date and the maximum time for each identity, ordered by identity. I would greatly appreciate a second pair of eyes on this type of query. So:
Data Set
Identity, Date, Time, Website
10, 5/10/15, 1, google.com
10, 5/10/15, 3, google.com
10, 5/10/15, 10, google.com
25, 5/11/15, 1, yahoo.com
25, 5/11/15, 15, yahoo.com
Expected Result
10, 5/10/15, 10, google.com
25, 5/11/15, 15, yahoo.com
Current Query
SELECT DISTINCT *, MAX(datetime) as maxdate, MAX(time), identity
FROM identity_track
GROUP BY identity
ORDER BY maxdate DESC
Something like this?
select identity, max(date), max(time), website
from identity_track
group by identity, website;
Demo here: http://sqlfiddle.com/#!9/5cadf/1
You can order by any of the fields you want.
Also, the expected output you posted doesn't line up with what it seems like you're attempting to do.
edit
Updated query based on additional information.
select t.identity, t.date, max(t.time), t.website
from t
inner join
(select identity, website, max(date) d
from t
group by identity, website) q
on t.identity = q.identity
and t.website = q.website
and q.d = t.date
group by t.identity, t.website, t.date
This one should give you the user's identity, the pages they visited, the last date they visited each page, and the longest time they spent on that page on that last date.
Don't assume that all records for an identity fall on the same day. For example, if an identity has records at 1/1/15 5pm and 1/2/15 2pm, taking the max date and the max time separately gives 1/2/15 5pm, which is wrong.
I'd always merge the date and time into one column, but if you can't, try this:
select t.identity, t.website, MAX(t.time)
FROM t
INNER JOIN
(
select identity, max(date) as max_date
from t
group by identity
) x
ON t.identity = x.identity
AND t.date = x.max_date
group by t.identity, t.website
First we get the maximum date for each identity. Then, for that day, we get the maximum time.
Hope this helps.
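To check the "latest date first, then the max time on that date" idea against the sample data, here is a minimal sketch. It makes several assumptions that are not in the original posts: it runs on SQLite through Python's sqlite3 module rather than MySQL, the table is called t, and the dates are stored as ISO strings (2015-05-10) so that MAX() orders them correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (identity INTEGER, date TEXT, time INTEGER, website TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (10, "2015-05-10", 1, "google.com"),
        (10, "2015-05-10", 3, "google.com"),
        (10, "2015-05-10", 10, "google.com"),
        (25, "2015-05-11", 1, "yahoo.com"),
        (25, "2015-05-11", 15, "yahoo.com"),
    ],
)

# Inner query: latest date per identity.
# Outer query: maximum time among the rows on that latest date only.
query = """
SELECT t.identity, t.date, MAX(t.time) AS max_time, t.website
FROM t
INNER JOIN (
    SELECT identity, MAX(date) AS max_date
    FROM t
    GROUP BY identity
) x ON t.identity = x.identity AND t.date = x.max_date
GROUP BY t.identity, t.date, t.website
ORDER BY t.identity
"""
latest_rows = conn.execute(query).fetchall()
print(latest_rows)
```

On this data it should print one row per identity, matching the expected result in the question.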
Related
I have a user table like this,
name week_no year_no
fb 5 2021
twitter 1 2022
twitter 2 2022
twitter 3 2022
twitter 7 2022
youtube 21 2022
I want to find the names of users who logged in for 3 or more consecutive weeks in the same year. The week numbers are unique within each year. For example, in the table above, user twitter logged in during week_no 1, 2 and 3 of year 2022, which satisfies the condition I am looking for.
The output I am looking for,
name year_no
twitter 2022
You can create the sample table using,
CREATE TABLE test (
name varchar(20),
week_no int,
year_no int
);
INSERT INTO test (name, week_no, year_no)
VALUES ('fb', 5, 2021),
('twitter', 1, 2022),
('twitter', 2, 2022),
('twitter', 3, 2022),
('twitter', 7, 2022),
('youtube', 21, 2022);
I am new to SQL and I read that GROUP BY can achieve this. Can someone tell me which function or query to use?
select * from test group by year_no, name;
Thank you for any help in advance.
A simple solution that works on every MySQL version, without using window functions: join the same table 3 times. DISTINCT avoids duplicate rows when a user has a run longer than 3 weeks.
SELECT DISTINCT t1.name,t1.year_no
FROM test t1
INNER JOIN test t2 ON t1.name=t2.name AND t1.year_no=t2.year_no
INNER JOIN test t3 ON t1.name=t3.name AND t1.year_no=t3.year_no
WHERE t2.week_no = t1.week_no + 1
AND t3.week_no = t1.week_no + 2
https://dbfiddle.uk/XjeXKUFE
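A quick way to try the three-way self-join is the sketch below. It is only an illustration under assumptions: it uses SQLite via Python's sqlite3 instead of MySQL, and it adds one hypothetical row (twitter, week 4) that is not in the question, to show that a run longer than 3 weeks matches more than once unless DISTINCT is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (name TEXT, week_no INTEGER, year_no INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [
        ("fb", 5, 2021),
        ("twitter", 1, 2022),
        ("twitter", 2, 2022),
        ("twitter", 3, 2022),
        ("twitter", 4, 2022),  # hypothetical extra row: makes a 4-week run
        ("twitter", 7, 2022),
        ("youtube", 21, 2022),
    ],
)

# t1 is the first week of a run, t2 the week after it, t3 the week after that.
join_sql = """
SELECT {distinct} t1.name, t1.year_no
FROM test t1
INNER JOIN test t2 ON t1.name = t2.name AND t1.year_no = t2.year_no
INNER JOIN test t3 ON t1.name = t3.name AND t1.year_no = t3.year_no
WHERE t2.week_no = t1.week_no + 1
  AND t3.week_no = t1.week_no + 2
"""
plain_rows = conn.execute(join_sql.format(distinct="")).fetchall()
distinct_rows = conn.execute(join_sql.format(distinct="DISTINCT")).fetchall()
print(plain_rows)     # weeks 1-2-3 and 2-3-4 both match
print(distinct_rows)  # one row per user and year
```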
You can define a unique group for each run of consecutive weeks in the same year and aggregate over those groups, as follows:
SELECT name, year_no
FROM
(
SELECT *,
week_no -
ROW_NUMBER() OVER (PARTITION by name, year_no ORDER BY week_no) grp
FROM test
) T
GROUP BY name, year_no, grp
HAVING COUNT(*) >= 3
ORDER BY name, year_no
See a demo.
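Why does week_no minus ROW_NUMBER() identify a run? Within one user and year, both values go up by 1 on consecutive weeks, so their difference stays constant until a gap appears. The sketch below replays that arithmetic in plain Python on the sample rows; the function name users_with_run is made up for illustration and this is not how the database executes the query.

```python
from collections import Counter

rows = [
    ("fb", 5, 2021),
    ("twitter", 1, 2022),
    ("twitter", 2, 2022),
    ("twitter", 3, 2022),
    ("twitter", 7, 2022),
    ("youtube", 21, 2022),
]


def users_with_run(rows, min_len=3):
    """Return (name, year_no) pairs having at least min_len consecutive weeks."""
    weeks_by_key = {}
    for name, week_no, year_no in rows:
        weeks_by_key.setdefault((name, year_no), []).append(week_no)

    result = []
    for key, weeks in sorted(weeks_by_key.items()):
        # Same value as week_no - ROW_NUMBER() OVER (... ORDER BY week_no):
        # constant inside a run of consecutive weeks, different across gaps.
        group_sizes = Counter(
            week - row_number
            for row_number, week in enumerate(sorted(weeks), start=1)
        )
        if max(group_sizes.values()) >= min_len:
            result.append(key)
    return result


# twitter's weeks 1, 2, 3, 7 give differences 0, 0, 0, 3.
qualifying = users_with_run(rows)
print(qualifying)
```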
Window function version.
demo
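WITH cte AS (
SELECT
name,
week_no,
year_no,
lag(week_no) OVER (PARTITION BY name,
year_no ORDER BY week_no) AS prev_week,
lead(week_no) OVER (PARTITION BY name,
year_no ORDER BY week_no) AS next_week
FROM
test
)
SELECT DISTINCT
name,
year_no
FROM
cte
WHERE
-- the middle week of a run: both neighbours are exactly one week away
-- (checking lead + lag = 2 * week_no would also match weeks 1, 3, 5)
prev_week = week_no - 1
AND next_week = week_no + 1;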
I'm really struggling to match other people's examples to this one, so I wonder if someone would be good enough to point me in the right direction.
What I have are 2 tables in MySQL.
Tags
tagid, status, lot, lat, long, createuser, timestamp
Users
userid, first, surname
My process just adds rows to the Tags table for each tagid scanned, so there can be many rows with the same tagid. Each row holds different info depending on the user, along with the timestamp of when it happened.
I would like to list the latest record for each tagid, but exclude anything with a Tags.status of 'store', and replace Tags.createuser with the name of the matching Users.userid.
I just can't figure out how to get the last timestamp and also apply the NOT condition, given there could be a situation like the one below.
tagid, status, lot, lat, long, createuser, timestamp
1000001, live, 1, xxxx, yyyy, 1, 2020-10-20 12:00
1000001, store, 1, xxxx, yyyy, 1, 2020-10-20 12:10
1000002, live, 1, xxxx, yyyy, 2, 2020-10-20 11:00
User 2 = Joe Bloggs
So the only thing I want returned is below because the last record for 1000001 was 'store'
1000002, live, 1, xxxx, yyyy, Joe Bloggs, 2020-10-20 11:00
You want the latest record per tag, along with the associated user name - if and only if the status of that tag is "live".
You can use row_number() and filtering:
select t.*, u.surname
from users u
inner join (
select tags.*, row_number() over(partition by tagid order by timestamp desc) rn
from tags
) t on t.createuser = u.userid
where t.rn = 1 and t.status = 'live'
This requires MySQL 8.0. In earlier versions, one option uses a correlated subquery for filtering:
select t.*, u.surname
from users u
inner join tags t on t.createuser = u.userid
where t.status = 'live' and t.timestamp = (
select max(t1.timestamp) from tags t1 where t1.tagid = t.tagid
)
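To try the correlated-subquery variant on the sample rows, here is a minimal sketch. Assumptions that are not in the original post: it runs on SQLite via Python's sqlite3 rather than MySQL, the lat/long columns are left out for brevity, user 1 is given the made-up name Ann Example, and the name is built with SQLite's || operator (MySQL would use CONCAT).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (userid INTEGER PRIMARY KEY, "first" TEXT, surname TEXT);
CREATE TABLE tags (tagid INTEGER, status TEXT, lot INTEGER,
                   createuser INTEGER, timestamp TEXT);
INSERT INTO users VALUES (1, 'Ann', 'Example'), (2, 'Joe', 'Bloggs');
INSERT INTO tags VALUES
    (1000001, 'live',  1, 1, '2020-10-20 12:00'),
    (1000001, 'store', 1, 1, '2020-10-20 12:10'),
    (1000002, 'live',  1, 2, '2020-10-20 11:00');
""")

# Keep a row only if it is the newest row for its tagid AND is not 'store'.
# Tag 1000001 drops out because its newest row has status 'store'.
query = """
SELECT t.tagid, t.status, t.lot,
       u."first" || ' ' || u.surname AS username,
       t.timestamp
FROM tags t
INNER JOIN users u ON u.userid = t.createuser
WHERE t.status <> 'store'
  AND t.timestamp = (
      SELECT MAX(t1.timestamp) FROM tags t1 WHERE t1.tagid = t.tagid
  )
"""
latest_tags = conn.execute(query).fetchall()
print(latest_tags)
```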
Given the schema
The following query
SELECT a.user_id,
a.id,
a.date_created,
avg(ai.level) level
FROM assessment a
JOIN assessment_item ai ON a.id = ai.assessment_id
GROUP BY a.user_id, a.id;
Returns these results
user_id, a.id, a.date_created, level
1, 99, "2015-07-13 18:26:00", 4.0000
1, 98, "2015-07-13 19:04:58", 6.0000
13, 9, "2015-07-13 18:26:00", 2.0000
13, 11, "2015-07-13 19:04:58", 3.0000
I would like to change the query so that only the earliest result is returned for each user. In other words, the following should be returned instead
user_id, a.id, a.date_created, level
1, 99, "2015-07-13 18:26:00", 4.0000
13, 9, "2015-07-13 18:26:00", 2.0000
I think I need to add a HAVING clause, but I'm struggling to figure out the exact syntax.
I have done something like this, with one small difference: I wanted the first 5 per group. The use case was reporting, so the time needed to run the query and create a temporary table was not a constraint.
The solution I had:
1. Create a new table with a single column id (a reference to the original table); id can be unique/primary.
2. INSERT IGNORE INTO tbl1 (id) select min(id) from original_tbl where id not in (select id from tbl1) group by user_id
3. Repeat step 2 as many times as required (5 times in my case). The new table will then hold only the ids you want to show.
4. Join tbl1 to the original table to get the required result.
Note: This might not be the best solution, but it worked for me when I had to share the report within 2-3 hours on a weekend. The data size I had was around 1M records.
Disclaimer: I am in a bit of a hurry, and have not tested this fully
-- Create a CTE that holds the first and last date for each user_id.
with first_and_last as (
-- Get the first date (min) for each user_id
select a.[user_id], min(a.date_created) as date_created
from assessment as a
group by a.[user_id]
-- Combine the first and last, so each user_id should have two entries, even if they are the same one.
union all
-- Get the last date (max) for each user_id
select a.[user_id], max(a.date_created)
from assessment as a
group by a.[user_id]
)
select a.[user_id],
a.id,
a.date_created,
avg(ai.[level]) as [level]
from assessment as a
inner join assessment_item as ai on a.id = ai.assessment_id
-- Join with the CTE to only keep records that have either the min or max date_created for each user_id.
inner join first_and_last as fnl on a.[user_id] = fnl.[user_id] and a.date_created = fnl.date_created
group by a.[user_id], a.id, a.date_created;
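If only the earliest assessment per user is needed, the same join idea works with just the MIN half. The sketch below is an illustration under assumptions: it runs on SQLite via Python's sqlite3, the table definitions are guessed from the column names in the question, and the assessment_item rows are invented so that the averages come out as in the question's result set.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assessment (id INTEGER PRIMARY KEY, user_id INTEGER, date_created TEXT);
CREATE TABLE assessment_item (assessment_id INTEGER, level INTEGER);
INSERT INTO assessment VALUES
    (99, 1,  '2015-07-13 18:26:00'),
    (98, 1,  '2015-07-13 19:04:58'),
    (9,  13, '2015-07-13 18:26:00'),
    (11, 13, '2015-07-13 19:04:58');
-- made-up item levels; only their averages matter here
INSERT INTO assessment_item VALUES (99, 3), (99, 5), (98, 6), (9, 2), (11, 3);
""")

# The derived table f holds each user's earliest date_created;
# joining on it filters out every later assessment before averaging.
query = """
SELECT a.user_id, a.id, a.date_created, AVG(ai.level) AS level
FROM assessment a
JOIN assessment_item ai ON a.id = ai.assessment_id
JOIN (
    SELECT user_id, MIN(date_created) AS first_created
    FROM assessment
    GROUP BY user_id
) f ON f.user_id = a.user_id AND f.first_created = a.date_created
GROUP BY a.user_id, a.id, a.date_created
ORDER BY a.user_id
"""
earliest = conn.execute(query).fetchall()
print(earliest)
```

If two assessments of one user share the same earliest date_created, both are returned, so a tie-breaker would be needed in that case.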
Sorry for the confusing title, but it's the best way to explain it. This is not a usual "most recent from group" problem and I haven't been able to find anything similar on the web.
I have a status table that tracks what people are doing at various work sites. It contains records that link people, status and location.
ID, start_date, person_ID, location_ID, status
1, 2014-10-12, 1, 1, job a
2, 2014-10-13, 2, 2, job b
3, 2014-10-15, 1, 3, job c
4, 2014-10-21, 1, 3, job d
5, 2014-10-22, 2, 4, job a
6, 2014-10-26, 2, 2, job d
I need to be able to determine how long each person has been at the current site - I'm hoping to get results like this:
person_ID, location_ID, since
1, 3, 2014-10-15
2, 2, 2014-10-26
Getting when they started the current job is relatively easy by joining the max(start_date), but I need the min(start_date) from the jobs done at the most recent location.
I have been trying to join the min(start_date) within the records that match the current location (from the most recent record), and that works great until I have a person (like person 2) who has multiple visits to the current location... you can see in my desired results that I want the 10-26 date, not the 10-13 which is the first time they were at the site.
I need some method for matching the job records for a given person, and then iterating back until the location doesn't match. I'm figuring there has to be some way to do this with some sub-queries and some clever joins, but I haven't been able to find it yet, so I would appreciate some help.
If I understand what you're asking correctly, you could use EXISTS to eliminate all but the most recent locations per person, and get the min date from the resulting rows.
SELECT person_id, location_id, MIN(start_date) since
FROM status s
WHERE NOT EXISTS (
SELECT 1 FROM status
WHERE s.person_id = person_id
AND s.location_id <> location_id
AND s.start_date < start_date)
GROUP BY person_id, location_id
An SQLfiddle to test with.
Basically, it eliminates all locations and times where the same person has visited another location more recently. For example;
1, 2014-10-12, 1, 1, job a
...is eliminated since person 1 has visited location 3 more recently, while;
3, 2014-10-15, 1, 3, job c
...is kept since the same person has only visited the same location more recently.
It then just picks the least recent time per person. Since only the rows from the last location are kept, it will be the least recent time from the most recent location.
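The elimination described above can be replayed on the sample rows with the sketch below. It assumes SQLite via Python's sqlite3 instead of MySQL, and it gives the inner table the alias later (my naming) so that each comparison is explicit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE status (id INTEGER PRIMARY KEY, start_date TEXT, "
    "person_id INTEGER, location_id INTEGER, status TEXT)"
)
conn.executemany(
    "INSERT INTO status VALUES (?, ?, ?, ?, ?)",
    [
        (1, "2014-10-12", 1, 1, "job a"),
        (2, "2014-10-13", 2, 2, "job b"),
        (3, "2014-10-15", 1, 3, "job c"),
        (4, "2014-10-21", 1, 3, "job d"),
        (5, "2014-10-22", 2, 4, "job a"),
        (6, "2014-10-26", 2, 2, "job d"),
    ],
)

# A row survives only if the same person has no later row at a different
# location, i.e. it belongs to the current, uninterrupted stay.
query = """
SELECT s.person_id, s.location_id, MIN(s.start_date) AS since
FROM status s
WHERE NOT EXISTS (
    SELECT 1
    FROM status later
    WHERE later.person_id = s.person_id
      AND later.location_id <> s.location_id
      AND later.start_date > s.start_date
)
GROUP BY s.person_id, s.location_id
ORDER BY s.person_id
"""
current_stays = conn.execute(query).fetchall()
print(current_stays)
```

Person 2's first visit to location 2 (2014-10-13) is dropped because the later row at location 4 interrupts it, which is exactly the case the question was worried about.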
I think the easiest way is with variables to keep track of the information you need:
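select person_id, location_id, min(start_date) as since
from (select s.*,
(@rn := if(@p <> person_id,
-- new person: remember the person and their latest location, restart at 1
if(@p := person_id, if(@l := location_id, 1, 1), 1),
if(@l = location_id, @rn,
if(@l := location_id, @rn + 1, @rn + 1)
)
)
) as location_counter
from status s cross join
(select @p := 0, @l := 0, @rn := 0) vars
order by person_id, start_date desc
) s
where location_counter = 1
group by person_id, location_id;
The weird logic with the variables is (trying to) enumerate the locations for each person. It should increment @rn only when the location changes, and reset it to 1 (while remembering the new location) for a new person. Note that this relies on MySQL evaluating the assignments in row order, which is not guaranteed, and that assigning user variables inside expressions is deprecated in MySQL 8.0.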
Quite simple actually.
SELECT g.person_ID,
(SELECT l.location_ID
FROM status l
WHERE l.person_ID = g.person_ID
AND l.start_date = MAX(g.start_date)) AS location,
MAX(g.start_date) AS since
FROM status g
GROUP BY g.person_ID
This groups by person_ID and uses a correlated subquery for the location column.
The only open question is whether you meant MIN instead of MAX, since your example shows the most recent date, not the oldest.
I have a table called log_payment that has a series of payment records like:
log_user_id, log_date, log_payment_id
13, 2013-01-01 01:13:00, TRIAL
13, 2013-01-02 01:18:00, 1
13, 2013-01-03 01:05:00, 2
What I want to get is the payment id and date of the user's last record. Here, the user's last transaction was on 01/03 and has a payment id of 2. So I wrote this query:
select max(log_date) as max_date,log_user_id,log_payment_id from log_payment group by log_user_id
but it returns 13, 2013-01-03 01:05:00, TRIAL
So based on some data I found somewhere else, I tried this:
select log_user_id, max_date, log_payment_id from (select log_user_id,max(log_date) as max_date from log_payment group by _log_user_id) payment_table inner join log_payment on payment_table.log_user_id = log_payment.log_user_id and payment_table.max_date = log_payment.log_date
But this goes on for several minutes until I finally just cancel it. What am I missing?
Your query, which I have reformatted, looks good, except for the _log_user_id in the group by. It should be log_user_id (I have also qualified log_user_id in the select list, since both tables have that column):
select payment_table.log_user_id,
max_date,
log_payment_id from
(select log_user_id,max(log_date) as max_date from log_payment group by log_user_id)
payment_table
inner join
log_payment
on payment_table.log_user_id = log_payment.log_user_id and
payment_table.max_date = log_payment.log_date
Depending on the size of your tables the query might be slow. Try adding a LIMIT 10 at the end of the query to see if that gets you the desired result for the first 10 tuples.
--dmg
A good way to control the order within each group is to let a subquery do the ORDER BY for you (this assumes the table has a unique id column):
SELECT t1.*
FROM `log_payment` t1
WHERE `id` = (
SELECT `id`
FROM `log_payment` `t2`
WHERE `t2`.`log_user_id` = `t1`.`log_user_id`
ORDER BY `t2`.`log_date` DESC
LIMIT 1
)
It should also be really fast. Of course, that always depends on how your indexes are set up.
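Here is a minimal sketch of that subquery approach together with an index that supports it. Assumptions that are not in the question: it runs on SQLite via Python's sqlite3 rather than MySQL, log_payment gets a surrogate id primary key (the question's table does not list one), and the index name idx_user_date is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE log_payment (
    id INTEGER PRIMARY KEY,      -- assumed surrogate key, not in the question
    log_user_id INTEGER,
    log_date TEXT,
    log_payment_id TEXT
);
-- lets the subquery find a user's newest row without scanning the table
CREATE INDEX idx_user_date ON log_payment (log_user_id, log_date);
INSERT INTO log_payment (log_user_id, log_date, log_payment_id) VALUES
    (13, '2013-01-01 01:13:00', 'TRIAL'),
    (13, '2013-01-02 01:18:00', '1'),
    (13, '2013-01-03 01:05:00', '2');
""")

# For each row, the subquery returns the id of that user's newest row;
# only the row whose own id matches is kept.
query = """
SELECT t1.log_user_id, t1.log_date, t1.log_payment_id
FROM log_payment t1
WHERE t1.id = (
    SELECT t2.id
    FROM log_payment t2
    WHERE t2.log_user_id = t1.log_user_id
    ORDER BY t2.log_date DESC
    LIMIT 1
)
"""
last_payments = conn.execute(query).fetchall()
print(last_payments)
```

Without an index on (log_user_id, log_date), both this query and the join on max(log_date) may have to scan the table repeatedly, which could explain a query that runs for minutes.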