query to add incremental field based on GROUP BY - mysql

Have a table photos
photos.id
photos.user_id
photos.order
A) Is it possible via a single query to group all photos by user and then update the order 1,2,3..N ?
B) Added twist: what if some of the photos already have an order value? The new photos.order values must never repeat an existing one, and should fill in any orders lower or higher than the existing ones (as best as possible).
My only other thought is to run a script that loops through the table and re-orders everything.
photos.id int(10)
photos.created_at datetime
photos.order int(10)
photos.user_id int(10)
Right now data may look like this
user_id = 1
photo_id = 1
order = NULL
user_id = 2
photo_id = 2
order = NULL
user_id = 1
photo_id = 3
order = NULL
the desired result would be
user_id = 1
photo_id = 1
order = 1
user_id = 2
photo_id = 2
order = 1
user_id = 1
photo_id = 3
order = 2

A)
You can use a variable that increments with each row and resets with each user_ID to get the row count.
SELECT  ID,
        User_ID,
        `Order`
FROM    (   SELECT  @r:= IF(@u = User_ID, @r + 1, 1) AS `Order`,
                    ID,
                    User_ID,
                    @u:= User_ID
            FROM    Photos,
                    (SELECT @r:= 1) AS r,
                    (SELECT @u:= 0) AS u
            ORDER BY User_ID, ID
        ) AS Photos
Example on SQL Fiddle
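On MySQL 8.0 or later (an assumption, since the question does not state a version), the same numbering can be sketched with the ROW_NUMBER() window function, which avoids relying on the evaluation order of user variables. The second statement is a possible single-query UPDATE for part A; the aliases p, n and rn are only illustrative.

```sql
-- Assumes MySQL 8.0+ (earlier versions have no window functions).
-- Number each user's photos 1..N in ID order.
SELECT ID,
       User_ID,
       ROW_NUMBER() OVER (PARTITION BY User_ID ORDER BY ID) AS `Order`
FROM Photos;

-- Write the numbers back by joining the table to its numbered rows.
UPDATE Photos AS p
JOIN ( SELECT ID,
              ROW_NUMBER() OVER (PARTITION BY User_ID ORDER BY ID) AS rn
       FROM Photos
     ) AS n ON n.ID = p.ID
SET p.`Order` = n.rn;
```

Note that this UPDATE renumbers every photo, including those that already have an order value, so it only fits part A.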
B)
My first solution was to add Order to the sort that drives the row number, so that any photo with an Order is sorted by that order first. This only works if your ordering system has no gaps and starts at 1:
SELECT  ID,
        User_ID,
        RowNumber AS `Order`
FROM    (   SELECT  @r:= IF(@u = User_ID, @r + 1, 1) AS `RowNumber`,
                    ID,
                    User_ID,
                    @u:= User_ID
            FROM    Photos,
                    (SELECT @r:= 1) AS r,
                    (SELECT @u:= 0) AS u
            -- `Order` IS NULL puts rows that already have an Order first (MySQL sorts NULLs first by default)
            ORDER BY User_ID, `Order` IS NULL, `Order`, ID
        ) AS Photos
ORDER BY `User_ID`, `Order`
Example using Order Field
ORDERING WITH GAPS
I have eventually found a way of maintaining the sort order even when there are gaps in the sequence.
SELECT  ID, User_ID, `Order`
FROM    Photos
WHERE   `Order` IS NOT NULL
UNION ALL
SELECT  Photos.ID,
        Photos.User_ID,
        Numbers.RowNum
FROM    (   SELECT  ID,
                    User_ID,
                    @r1:= IF(@u1 = User_ID, @r1 + 1, 1) AS RowNum,
                    @u1:= User_ID
            FROM    Photos,
                    (SELECT @r1:= 0) AS r,
                    (SELECT @u1:= 0) AS u
            WHERE   `Order` IS NULL
            ORDER BY User_ID, ID
        ) AS Photos
        INNER JOIN
        (   SELECT  User_ID,
                    RowNum,
                    @r2:= IF(@u2 = User_ID, @r2 + 1, 1) AS RowNum2,
                    @u2:= User_ID
            FROM    (   SELECT  DISTINCT p.User_ID, o.RowNum
                        FROM    Photos AS p,
                                (   SELECT  @i:= @i + 1 AS RowNum
                                    FROM    INFORMATION_SCHEMA.COLLATION_CHARACTER_SET_APPLICABILITY,
                                            (SELECT @i:= 0) AS i
                                ) AS o
                        WHERE   RowNum <= (SELECT COUNT(*) FROM Photos p1 WHERE p.User_ID = p1.User_ID)
                        AND     NOT EXISTS
                                (   SELECT  1
                                    FROM    Photos p2
                                    WHERE   p.User_ID = p2.User_ID
                                    AND     o.RowNum = p2.`Order`
                                )
                        AND     p.`Order` IS NULL
                        ORDER BY User_ID, RowNum
                    ) AS p,
                    (SELECT @r2:= 0) AS r,
                    (SELECT @u2:= 0) AS u
            ORDER BY User_ID, RowNum
        ) AS Numbers
            ON Photos.User_ID = Numbers.User_ID
            AND Photos.RowNum = Numbers.RowNum2
ORDER BY User_ID, `Order`
However, as you can see, this is pretty complicated. It works by treating photos with an order value separately from those without. The first subquery ranks each user's photos that have no order value by ID. The second subquery uses a cross join to generate a sequential list from 1 to n for each user ID, where n is the number of photos that user has. So with a data set like this:
ID  User_ID  Order
1   1        NULL
2   2        NULL
3   1        NULL
4   1        1
5   1        3
6   2        2
7   2        3
It would generate
UserID  RowNum
1       1
1       2
1       3
1       4
2       1
2       2
2       3
It then uses NOT EXISTS to eliminate all combinations already used by photos with a non-null order, and ranks what remains in order of RowNum, partitioned by User_ID, giving
UserID  RowNum  RowNum2
1       2       1
1       4       2
2       1       1
The RowNum2 value can then be matched with the RowNum value from the first subquery, giving the correct order value. Long-winded, but it works.
Example on SQL Fiddle
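For comparison, the same gap-filling idea can be sketched on MySQL 8.0 or later (an assumption about the server version) with common table expressions and ROW_NUMBER(). The names free_slots, unordered, pos and slot_no are made up for this illustration. Numbering all of a user's photos stands in for the 1..n number list, so no helper table is needed.

```sql
-- Assumes MySQL 8.0+ (CTEs and window functions).
WITH free_slots AS (
    -- Candidate positions 1..n per user, minus the positions already taken,
    -- renumbered 1, 2, 3... in slot_no.
    SELECT c.User_ID,
           c.pos,
           ROW_NUMBER() OVER (PARTITION BY c.User_ID ORDER BY c.pos) AS slot_no
    FROM ( SELECT User_ID,
                  ROW_NUMBER() OVER (PARTITION BY User_ID ORDER BY ID) AS pos
           FROM Photos
         ) AS c
    WHERE NOT EXISTS ( SELECT 1
                       FROM Photos AS t
                       WHERE t.User_ID = c.User_ID
                         AND t.`Order` = c.pos )
),
unordered AS (
    -- Photos without an order value, ranked by ID within each user.
    SELECT ID,
           User_ID,
           ROW_NUMBER() OVER (PARTITION BY User_ID ORDER BY ID) AS slot_no
    FROM Photos
    WHERE `Order` IS NULL
)
SELECT ID, User_ID, `Order`
FROM Photos
WHERE `Order` IS NOT NULL
UNION ALL
SELECT u.ID, u.User_ID, f.pos
FROM unordered AS u
JOIN free_slots AS f
  ON f.User_ID = u.User_ID
 AND f.slot_no = u.slot_no
ORDER BY User_ID, `Order`;
```

With the sample data above, this should pair photo 1 with order 2, photo 3 with order 4, and photo 2 with order 1, which matches the RowNum/RowNum2 table.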

Worked for me. I needed to increment version grouping by 4 fields (host, folder, fileName, status) and sort by 1 (downloadedAtTicks).
This is my SELECT:
SET @status := NULL;
SET @version := NULL;
SELECT
    id,
    host,
    folder,
    fileName,
    status,
    downloadedAtTicks,
    version,
    IF(IF(status IS NULL, 0, status) = @status, @version := @version + 1, @version := 0) AS varVersion,
    @status := IF(status IS NULL, 0, status) AS varStatus
FROM csvsource
ORDER BY host, folder, fileName, status, downloadedAtTicks;
And this is my UPDATE
SET @status := NULL;
SET @version := NULL;
UPDATE
    csvsource csv,
    (SELECT
        id,
        IF(IF(status IS NULL, 0, status) = @status, @version := @version + 1, @version := 0) AS varVersion,
        @status := IF(status IS NULL, 0, status) AS varStatus
    FROM csvsource
    ORDER BY host, folder, fileName, status, downloadedAtTicks) AS sub
SET
    csv.version = sub.varVersion
WHERE csv.id = sub.id;
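A possible MySQL 8.0+ variant of this UPDATE (the version is an assumption) uses ROW_NUMBER() partitioned by all four fields. The variable-based query above only resets the counter when status changes, so partitioning by host, folder and fileName as well is my reading of the stated intent rather than an exact equivalent. The alias newVersion is illustrative.

```sql
-- Assumes MySQL 8.0+.
-- Versions start at 0 within each (host, folder, fileName, status) group,
-- ordered by downloadedAtTicks. COALESCE mirrors the NULL-to-0 mapping above.
UPDATE csvsource AS csv
JOIN ( SELECT id,
              ROW_NUMBER() OVER (
                  PARTITION BY host, folder, fileName, COALESCE(status, 0)
                  ORDER BY downloadedAtTicks
              ) - 1 AS newVersion
       FROM csvsource
     ) AS sub ON sub.id = csv.id
SET csv.version = sub.newVersion;
```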

Related

MYSQL: get latest msg from each user

Hello, I have a table users, which has the primary key 'user_id', and a table messages, which has the columns 'msg_id' (primary), 'msg_from', 'msg_to', 'msg_date', 'msg_text'. The columns msg_from and msg_to are filled with user_ids.
What I want, for a specific user (say user_id=1), is a table of all users who exchanged msgs with him and, for each of those users, the newest msg.
So the result should have the columns
'id'(of the partner),
'msg_from',
'msg_to',
'msg_date',
'msg_text'
and each user occurs at most once. In addition, I want the table to be sorted by the column 'msg_date'.
I know I can get the first column with
SELECT msg_from as id FROM messages WHERE msg_to=1
UNION
SELECT msg_to as id FROM messages WHERE msg_from=1
How can I add the latest msg?
SELECT *
FROM (
    SELECT T.*,
           @rn := IF(@partner_id = partner_id,
                     @rn + 1,
                     IF(@partner_id := partner_id, 1, 1)
                    ) AS rn
    FROM (
        SELECT CASE WHEN `msg_from` = @userID THEN msg_to
                    WHEN `msg_to` = @userID THEN msg_from
               END AS `partner_id`,
               `msg_from`,
               `msg_to`,
               `msg_date`,
               `msg_text`
        FROM messages
        WHERE @userID IN (`msg_from`, `msg_to`)
    ) T
    CROSS JOIN ( SELECT @partner_id := 0, @rn := 0, @userID := 1) AS var
    ORDER BY `partner_id`, `msg_date` DESC
) T
WHERE T.rn = 1
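If the server is MySQL 8.0 or later (an assumption), the same "newest message per partner" result can be sketched with ROW_NUMBER(). Setting @userID in a separate statement is a choice made here so that the variable is defined before the query reads it. The alias ranked is illustrative.

```sql
-- Assumes MySQL 8.0+.
SET @userID := 1;  -- the user whose conversations are listed

SELECT partner_id AS id, msg_from, msg_to, msg_date, msg_text
FROM ( SELECT IF(msg_from = @userID, msg_to, msg_from) AS partner_id,
              msg_from, msg_to, msg_date, msg_text,
              -- rn = 1 marks the newest message exchanged with each partner
              ROW_NUMBER() OVER (
                  PARTITION BY IF(msg_from = @userID, msg_to, msg_from)
                  ORDER BY msg_date DESC
              ) AS rn
       FROM messages
       WHERE @userID IN (msg_from, msg_to)
     ) AS ranked
WHERE rn = 1
ORDER BY msg_date DESC;
```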

Check if a user was "active" in multiple rows - MySQL

How would I go about creating group_ids in the following example based on the area(s) the users are active in?
group_id  rep_id  area  datebegin  dateend
1         1000    A     1/1/15     1/1/16
1         1000    B     1/1/15     1/1/16
2         1000    C     1/2/16     12/31/99
In the table you can see that rep 1000 was active in both A and B between 1/15 and 1/16. How would I go about coding the group_id field to group by datebegin & dateend?
Thanks for any help.
You can use variables in order to enumerate groups of records having identical rep_id, datebegin, dateend values:
SELECT rep_id, datebegin, dateend,
       @rn := IF(@rep_id <> rep_id,
                 IF(@rep_id := rep_id, 1, 1),
                 @rn + 1) AS rn
FROM (
    SELECT rep_id, datebegin, dateend
    FROM mytable
    GROUP BY rep_id, datebegin, dateend) AS t
CROSS JOIN (SELECT @rep_id := 0, @rn := 0) AS v
ORDER BY rep_id, datebegin
Output:
rep_id, datebegin, dateend, rn
-----------------------------------
1000, 2015-01-01, 2016-01-01, 1
1000, 2016-02-01, 2099-12-03, 2
You can use the above query as a derived table and join back to the original table. rn field is the group_id field you are looking for.
You can use variables to assign groups. As you said, only if the date_begin and date_end exactly match for 2 rows, they would be in the same group. Else a new group starts.
select rep_id,area,date_begin,date_end
,case when @repid <> rep_id then @rn:=1 -- reset the group to 1 when rep_id changes
when @repid=rep_id and @begin=date_begin and @end=date_end then @rn:=@rn -- if rep_id, date_begin and date_end match, reuse the @rn previously assigned
else @rn:=@rn+1 -- else increment @rn by 1
end as group_id
,@begin:=date_begin
,@end:=date_end
,@repid:=rep_id
from t
cross join (select @rn:=0,@begin:='',@end:='',@repid:=-1) r
order by rep_id,date_begin,date_end
The above query includes variables in the output. To only get the group_id use
select rep_id,area,date_begin,date_end,group_id
from (
select rep_id,area,date_begin,date_end
,case when @repid <> rep_id then @rn:=1
when @repid=rep_id and @begin=date_begin and @end=date_end then @rn:=@rn
else @rn:=@rn+1
end as group_id
,@begin:=date_begin
,@end:=date_end
,@repid:=rep_id
from t
cross join (select @rn:=0,@begin:='',@end:='',@repid:=-1) r
order by rep_id,date_begin,date_end
) x
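On MySQL 8.0 or later (an assumption), DENSE_RANK() gives the same kind of per-rep group number without variables: rows that share rep_id, datebegin and dateend tie and receive the same rank. This sketch uses the table and column names of the first answer (mytable, datebegin, dateend).

```sql
-- Assumes MySQL 8.0+.
-- Rows with identical (datebegin, dateend) for a rep tie and share a group_id;
-- numbering restarts at 1 for each rep_id.
SELECT rep_id, area, datebegin, dateend,
       DENSE_RANK() OVER (PARTITION BY rep_id
                          ORDER BY datebegin, dateend) AS group_id
FROM mytable;
```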

Select first N messages each user receives

I have a table that stores messages sent to users, the layout is as follows
id (auto-incrementing) | message_id | user_id | datetime_sent
I'm trying to find the first N message_id's that each user has received, but am completely stuck. I can do it easily on a per-user basis (when defining the user ID in the query), but not for all users.
Things to note:
Many users can get the same message_id
Message ID's aren't sent sequentially (i.e. we can send message 400 before message 200)
This is a read-only MySQL database
EDIT: On second thought I removed this bit but have added it back in since someone was kind enough to work on it
The end goal is to see what % of users opened one of the first N messages they received.
That table of opens looks like this:
user_id | message_id | datetime_opened
This is an untested answer to the original question (with 2 tables and condition on first 5):
SELECT DISTINCT user_id
FROM (
SELECT om.user_id,
om.message_id,
count(DISTINCT sm2.message_id) messages_before
FROM opened_messages om
INNER JOIN sent_messages sm
ON om.user_id = sm.user_id
AND om.message_id = sm.message_id
LEFT JOIN sent_messages sm2
ON om.user_id = sm2.user_id
AND sm2.datetime_sent < sm.datetime_sent
GROUP BY om.user_id,
om.message_id
HAVING messages_before < 5
) AS base
The subquery joins in sm2 to count the number of preceding messages that were sent to the same user, and the HAVING clause then makes sure that fewer than 5 earlier messages were sent. Since a single user may have several messages (up to 5) that meet that condition, the outer query lists only the distinct users that satisfy it.
To get the first N (here 2) messages, try
SELECT
  user_id
, message_id
FROM (
  SELECT
    user_id
  , message_id
  , id
    -- `rank` is backticked because RANK is a reserved word in MySQL 8.0
  , (CASE WHEN @user_id != user_id THEN @rank := 1 ELSE @rank := @rank + 1 END) AS `rank`
  , (CASE WHEN @user_id != user_id THEN @user_id := user_id ELSE @user_id END) AS _
  FROM (SELECT * FROM MessageSent ORDER BY user_id, id) T
  JOIN (SELECT @rank := 0) c
  JOIN (SELECT @user_id := 0) u
) R
WHERE `rank` < 3
ORDER BY user_id, id
;
which uses a RANK substitute, derived from @Seaux's response to "Does mysql have the equivalent of Oracle's 'analytic functions'?"
To extend this to your original question, just add the appropriate calculation:
SELECT
  COUNT(DISTINCT MO.user_id) * 100 /
  (SELECT COUNT(DISTINCT user_id)
   FROM (
     SELECT
       user_id
     , message_id
     , id
     , (CASE WHEN @user_id != user_id THEN @rank := 1 ELSE @rank := @rank + 1 END) AS `rank`
     , (CASE WHEN @user_id != user_id THEN @user_id := user_id ELSE @user_id END) AS _
     FROM (SELECT * FROM MessageSent ORDER BY user_id, id) T
     JOIN (SELECT @rank := 0) c
     JOIN (SELECT @user_id := 0) u
   ) R2
   WHERE `rank` < 3
  ) AS percentage_who_read_one_of_the_first_messages
FROM MessageOpened MO
JOIN
  (SELECT
     user_id
   , message_id
   FROM (
     SELECT
       user_id
     , message_id
     , id
     , (CASE WHEN @user_id != user_id THEN @rank := 1 ELSE @rank := @rank + 1 END) AS `rank`
     , (CASE WHEN @user_id != user_id THEN @user_id := user_id ELSE @user_id END) AS _
     FROM (SELECT * FROM MessageSent ORDER BY user_id, id) T
     JOIN (SELECT @rank := 0) c
     JOIN (SELECT @user_id := 0) u
   ) R
   WHERE `rank` < 3) MR
  ON MO.user_id = MR.user_id
 AND MO.message_id = MR.message_id
;
With no CTEs in MySQL (before version 8.0), and being in a read-only database, I see no way around having the above query twice in the statement.
See it in action: SQL Fiddle.
Please comment if and as this requires adjustment / further detail.
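On MySQL 8.0 or later (an assumption; CTEs and window functions arrived in that version), the repeated subquery can be written once. This is a sketch using the MessageSent and MessageOpened table names from the answer, with first_sent, ranked and rn as illustrative names. A CTE is only part of a SELECT, so it should also work against a read-only database.

```sql
-- Assumes MySQL 8.0+.
WITH first_sent AS (
    -- The first N messages each user received, in id order (N = 2 as in the answer).
    SELECT user_id, message_id
    FROM ( SELECT user_id,
                  message_id,
                  ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY id) AS rn
           FROM MessageSent
         ) AS ranked
    WHERE rn <= 2
)
SELECT COUNT(DISTINCT mo.user_id) * 100
       / (SELECT COUNT(DISTINCT user_id) FROM first_sent)
         AS percentage_who_read_one_of_the_first_messages
FROM MessageOpened AS mo
JOIN first_sent AS fs
  ON fs.user_id = mo.user_id
 AND fs.message_id = mo.message_id;
```

To rank by send time instead of insertion order, ORDER BY datetime_sent could replace ORDER BY id inside the window.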

mysql table with duplicate records

I have a table
email(email varchar(30),id integer(10),duplicated varchar(10))
with records
sai@gmail.com 101 null
kiran@gmail.com 102 null
sai123@gmail.com 103 null
sai@gmail.com 101 null
kiran@gmail.com 102 null
Now my question: I need to get "yes" in the duplicated column for the two records that appear for the second time. So the output table should be
sai@gmail.com 101 null
kiran@gmail.com 102 null
sai123@gmail.com 103 null
sai@gmail.com 101 yes
kiran@gmail.com 102 yes
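Try this:
update email e
join (select email from email group by email having count(*) > 1) d
on d.email = e.email
set e.duplicated = 'yes'
Edited: this updates the table. It is written as a join because MySQL does not allow a subquery to select directly from the table being updated. Note that it flags every copy of a duplicated email, not only the repeated rows.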
You can try this query for viewing:
select numerated.email, numerated.id, (case when cnt=1 OR numerated.rnum=grouped.min_rnum then null else "yes" end) as duplicated
from
(select @i := @i + 1 as rnum, email.* from email, (select @i:=0) as c order by id) as numerated
left join
(select email, id, min(rnum) as min_rnum, count(rnum) as cnt
from (select @i := @i + 1 as rnum, email.* from email, (select @i:=0) as c order by id) as numerated
group by email, id
) as grouped
on numerated.email=grouped.email and numerated.id=grouped.id
order by id;
Could you explain your situation in detail? It looks like it needs another solution, not just a SELECT query.
And try this one for updating:
update email u, (select @i:=0) urnum
set
    id = id + (@i:=@i + 1) - @i,
    duplicated = (
        select duplicated from (
            select
                numerated.email,
                numerated.id,
                (case when cnt=1 OR numerated.rnum=grouped.min_rnum then null else "yes" end) as duplicated,
                rnum
            from
                (select @i := @i + 1 as rnum, email.* from email, (select @i:=0) as c ) as numerated
            left join
                (select email, id, min(rnum) as min_rnum, count(rnum) as cnt
                 from (select @i := @i + 1 as rnum, email.* from email, (select @i:=0) as c ) as numerated
                 group by email, id
                ) as grouped
            on numerated.email=grouped.email and numerated.id=grouped.id
            order by rnum
        ) found_duplicates
        where u.email=found_duplicates.email and u.id=found_duplicates.id and @i=found_duplicates.rnum
        limit 1
    )
;
It looks like it works, but you shouldn't rely on it.
If possible, you should do one of the following:
1. Change the table structure: add a unique field.
2. Change the table filling logic: check uniqueness before inserting a new row, and insert it with the proper 'duplicated' field value.
3. Repopulate via a temporary table like this:
CREATE TEMPORARY TABLE tmp_email AS <... 'SELECT' version of my query ...>;
TRUNCATE TABLE email;
INSERT INTO email SELECT * FROM tmp_email;
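If MySQL 8.0 or later is available (an assumption), the SELECT that feeds the temporary table could be sketched with ROW_NUMBER() instead of user variables. Because the duplicate rows are identical and the table has no column that records insertion order, it does not matter which copy is flagged, but the rows may come back in a different order than they were inserted.

```sql
-- Assumes MySQL 8.0+.
-- Within each group of identical (email, id) rows, the first copy keeps NULL
-- and every further copy is flagged 'yes'.
SELECT email,
       id,
       CASE WHEN ROW_NUMBER() OVER (PARTITION BY email, id) > 1
            THEN 'yes'
       END AS duplicated
FROM email;
```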

MYSQL retrieving user activity where the same user_id can appear a maximum of 3 times

I'm retrieving rows from an user activity table like so
SELECT user_id, type, source_id FROM activity ORDER BY date DESC LIMIT 5
But I don't want the activity feed to be able to be clogged up by the same user, so I want to be able to retrieve a maximum of 3 rows out of 5 that contain the same user_id.
Any ideas how I could do this? Thanks :)
Here is a "traditional" way, where you first enumerate the rows for each user id and then use this information as a filter:
SELECT user_id, type, source_id
FROM (select a.*,
             @rn := if(@user_id = user_id, @rn + 1, 1) as rn,
             @user_id := user_id
      from activity a cross join
           (select @rn := 0, @user_id := -1) const
      order by user_id, date desc  -- newest first within each user, so rn <= 3 keeps the latest three
     ) a
WHERE rn <= 3
ORDER BY date DESC
LIMIT 5;
You can try this:
SELECT user_id, type, source_id
FROM activity
WHERE 3 > (
    SELECT count(*)
    FROM activity AS activity1
    WHERE activity.user_id = activity1.user_id
    AND activity.date < activity1.date)  -- count this user's rows that are newer than the current row
ORDER BY activity.date DESC
LIMIT 5
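On MySQL 8.0 or later (an assumption), the same cap can be sketched with ROW_NUMBER(): rank each user's rows from newest to oldest, keep at most three per user, then take the five newest overall. The aliases ranked and rn are illustrative.

```sql
-- Assumes MySQL 8.0+.
SELECT user_id, type, source_id
FROM ( SELECT a.*,
              -- rn = 1 is the user's newest activity, rn = 2 the next, and so on
              ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY `date` DESC) AS rn
       FROM activity AS a
     ) AS ranked
WHERE rn <= 3
ORDER BY `date` DESC
LIMIT 5;
```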