Path from a group - mysql

I have the following data:
╔════╦═══════╦═══════╗
║ id ║ group ║ place ║
╠════╬═══════╬═══════╣
║ 1 ║ 1 ║ a ║
║ 2 ║ 1 ║ b ║
║ 3 ║ 1 ║ b ║
║ 4 ║ 1 ║ a ║
║ 5 ║ 1 ║ c ║
║ 6 ║ 2 ║ a ║
║ 7 ║ 2 ║ b ║
║ 8 ║ 2 ║ c ║
╚════╩═══════╩═══════╝
How can I get the path of each group in MySQL?
The expected result is:
╔═══════╦════════════╗
║ group ║ path ║
╠═══════╬════════════╣
║ 1 ║ a-b-a-c ║
║ 2 ║ a-b-c ║
╚═══════╩════════════╝

Assuming that the end goal is to sort by group and id, and then simplify each group's sequence so that consecutive repeated places are only shown once:
Start by determining, for each row, whether the place or the group has changed since the previous row. There's a good solution to this problem in this answer.
Then use GROUP_CONCAT to merge the places together into a path.
Be aware that GROUP_CONCAT truncates its result at a user-configurable maximum length (the group_concat_max_len system variable), which by default is 1,024 bytes.
SELECT
    `group`,
    GROUP_CONCAT(place ORDER BY id SEPARATOR '-') path
FROM
    (SELECT
        COALESCE(@place != place OR @group != `group`, 1) changed,
        id,
        @group:=`group` `group`,
        @place:=place place
    FROM
        place_table, (SELECT @place:=NULL, @group:=NULL) s
    ORDER BY `group`, id) t
WHERE
    changed = 1
GROUP BY `group`;
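As a cross-check on the expected output, the same "collapse consecutive repeats" logic can be sketched outside the database. This is only an illustrative Python sketch, not the MySQL query itself; `rows` mirrors the sample table and `group_paths` is a hypothetical helper name.

```python
from itertools import groupby

# Sample rows from the question: (id, group, place)
rows = [
    (1, 1, "a"), (2, 1, "b"), (3, 1, "b"), (4, 1, "a"), (5, 1, "c"),
    (6, 2, "a"), (7, 2, "b"), (8, 2, "c"),
]

def group_paths(rows):
    """Return {group: path}, keeping one place per run of equal consecutive places."""
    ordered = sorted(rows, key=lambda r: (r[1], r[0]))  # sort by group, then id
    paths = {}
    for grp, members in groupby(ordered, key=lambda r: r[1]):
        # A second groupby on the place collapses consecutive duplicates.
        places = [place for place, _ in groupby(m[2] for m in members)]
        paths[grp] = "-".join(places)
    return paths

print(group_paths(rows))
```

Only adjacent repeats are merged, so group 1 keeps its second "a" because a "b" sits in between.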

Related

SQL: Select ID/Value combinations that do not possess a value in another table

So here's basically the issue (I'm turning this into more of a universal question in case people need something like this in the future).
I have one table ("People") that is basically this
╔══════════╦═══════╗
║ PersonID ║ Letter║
╠══════════╬═══════╣
║ 1 ║ A ║
║ 1 ║ B ║
║ 1 ║ C ║
║ 1 ║ D ║
║ 2 ║ A ║
║ 2 ║ B ║
║ 2 ║ C ║
║ 3 ║ B ║
║ 3 ║ C ║
║ 4 ║ A ║
║ 4 ║ C ║
║ 4 ║ D ║
║ 5 ║ E ║
╚══════════╩═══════╝
And let's say I have another table ("Letters") which lists all possible "Letters" a person can have.
╔══════════╦══════╗
║ LetterID ║ Text ║
╠══════════╬══════╣
║ 1 ║ A ║
║ 2 ║ B ║
║ 3 ║ C ║
║ 4 ║ D ║
║ 5 ║ E ║
╚══════════╩══════╝
I need to make a new table that lists every person together with the letters that they DON'T have. So for this example, the result would be this:
╔══════════╦══════════════╗
║ PersonID ║ LetterNotHad ║
╠══════════╬══════════════╣
║ 1 ║ E ║
║ 2 ║ D ║
║ 2 ║ E ║
║ 3 ║ A ║
║ 3 ║ D ║
║ 3 ║ E ║
║ 4 ║ B ║
║ 4 ║ E ║
║ 5 ║ A ║
║ 5 ║ B ║
║ 5 ║ C ║
║ 5 ║ D ║
╚══════════╩══════════════╝
Any and all help or guidance is greatly appreciated.
Edit: Here's basically what I was trying, something like this
select p.PersonId, l.value
from letters l
left join people p
on l.Text = p.Letter
where p.personid is null
Here is the idea
WITH cte
AS (SELECT *
    FROM (SELECT DISTINCT personid
          FROM people) B
    CROSS JOIN (SELECT DISTINCT Text as letter
                FROM letters) A)
SELECT *
FROM cte c
WHERE NOT EXISTS (SELECT 1
                  FROM people f
                  WHERE c.personid = f.personid
                  AND c.letter = f.letter)
Note: You should store LetterID in the People table instead of Letter and define a foreign key, which keeps the two tables consistent.
So what you want to do in order to find the missing values is to generate the set that represents all possible values. This is the cartesian product between the two sets (people and letters) and in SQL you use the cross join operator (or unqualified join) to do this.
From this set you want to remove the combinations you already have, and the remainder will be the missing ones.
There are many ways to do this; using a left join it could look like this:
select sub.*
from (
select distinct personid, text
from people
cross join letters
) sub
left join people p on p.letter = sub.text and p.personid = sub.personid
where p.personid is null
Or using the EXCEPT set operator (available in MSSQL, and as MINUS in Oracle; MySQL only added EXCEPT in version 8.0.31):
select personid, text from people cross join letters
except
select personid, letter from people
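To see the "all combinations minus existing ones" idea run end to end, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is only a stand-in here so the example is self-contained; the table and column names follow the question, and the `missing` variable is my own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (personid INTEGER, letter TEXT);
    CREATE TABLE letters (letterid INTEGER, text TEXT);
    INSERT INTO people VALUES
        (1,'A'),(1,'B'),(1,'C'),(1,'D'),
        (2,'A'),(2,'B'),(2,'C'),
        (3,'B'),(3,'C'),
        (4,'A'),(4,'C'),(4,'D'),
        (5,'E');
    INSERT INTO letters VALUES (1,'A'),(2,'B'),(3,'C'),(4,'D'),(5,'E');
""")

# Every (person, letter) combination, minus the combinations that already exist.
missing = conn.execute("""
    SELECT p.personid, l.text
    FROM (SELECT DISTINCT personid FROM people) p
    CROSS JOIN letters l
    EXCEPT
    SELECT personid, letter FROM people
    ORDER BY 1, 2
""").fetchall()

for personid, letter in missing:
    print(personid, letter)
```

On a MySQL version without EXCEPT, the left join form above gives the same rows.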

De-duplicating many-to-many relationships in MySQL lookup table

I've inherited a database that includes a lookup table to find other patents that are related to a given patent.
So it looks like
╔════╦═══════════╦════════════╗
║ id ║ patent_id ║ related_id ║
╠════╬═══════════╬════════════╣
║ 1 ║ 1 ║ 2 ║
║ 2 ║ 1 ║ 3 ║
║ 3 ║ 2 ║ 1 ║
║ 4 ║ 2 ║ 3 ║
║ 5 ║ 3 ║ 2 ║
╚════╩═══════════╩════════════╝
And I want to filter out the reciprocal relationships. 1->2 and 2->1 are the same for my purposes so I only want 1->2.
I don't need to make the edit in the table, I just need a query that returns a list of the unique relationships, and while I'm sure it's simple I've been banging my head against the keyboard for far too long.
Here is a clever query which you can try using. The general strategy is to identify the unwanted duplicate records and then subtract them away from the entire set.
SELECT t.id, t.patent_id, t.related_id
FROM t LEFT JOIN
(
SELECT t1.patent_id AS t1_patent_id, t1.related_id AS t1_related_id
FROM t t1 LEFT JOIN t t2
ON t1.related_id = t2.patent_id
WHERE t1.patent_id = t2.related_id AND t1.patent_id > t1.related_id
) t3
ON t.patent_id = t3.t1_patent_id AND t.related_id = t3.t1_related_id
WHERE t3.t1_patent_id IS NULL
Here is the inner temporary table generated by this query. You can convince yourself that by applying the logic in the WHERE clause you will select the correct records. Non-duplicate records are characterized by t1.patent_id != t2.related_id, and all these records are retained. In the case of duplicates (t1.patent_id = t2.related_id), the record chosen from each pair of duplicates is the one where patent_id < related_id, as you requested in your question.
╔════╦══════════════╦═══════════════╦══════════════╦═══════════════╗
║ id ║ t1.patent_id ║ t1.related_id ║ t2.patent_id ║ t2.related_id ║
╠════╬══════════════╬═══════════════╬══════════════╬═══════════════╣
║ 1 ║ 1 ║ 2 ║ 2 ║ 1 ║ * duplicate
║ 1 ║ 1 ║ 2 ║ 2 ║ 3 ║
║ 2 ║ 1 ║ 3 ║ 3 ║ 2 ║
║ 3 ║ 2 ║ 1 ║ 1 ║ 2 ║ * duplicate
║ 3 ║ 2 ║ 1 ║ 1 ║ 3 ║
║ 4 ║ 2 ║ 3 ║ 3 ║ 2 ║ * duplicate
║ 5 ║ 3 ║ 2 ║ 2 ║ 1 ║
║ 5 ║ 3 ║ 2 ║ 2 ║ 3 ║ * duplicate
╚════╩══════════════╩═══════════════╩══════════════╩═══════════════╝
Click the link below for a running example of this query.
SQLFiddle
Try something like
select distinct * from
(select patent_id, related_id from TABLENAME
union
select related_id, patent_id from TABLENAME
) u;
Okay, you're right, the above won't work. Try
select p1.patent_id, p1.related_id from TABLENAME p1
where p1.patent_id < p1.related_id
or not exists
(select 1 from TABLENAME p2
where p2.patent_id = p1.related_id and p2.related_id = p1.patent_id)
This keeps a row when it is already in ascending order, or when its reverse is not in the table.
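Another way to look at it, if you don't care which direction is reported, is to put every pair into a canonical (lower id, higher id) order and then take the distinct pairs. A minimal sketch, run against an in-memory SQLite database purely so it is self-contained; the table name `related` and the aliases `low_id` / `high_id` are assumptions of mine, not names from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE related (id INTEGER, patent_id INTEGER, related_id INTEGER);
    INSERT INTO related VALUES (1,1,2),(2,1,3),(3,2,1),(4,2,3),(5,3,2);
""")

# Normalise each pair to (smaller, larger) so 1->2 and 2->1 become identical,
# then let DISTINCT remove the repeats.
unique_pairs = conn.execute("""
    SELECT DISTINCT
        CASE WHEN patent_id < related_id THEN patent_id ELSE related_id END AS low_id,
        CASE WHEN patent_id < related_id THEN related_id ELSE patent_id END AS high_id
    FROM related
    ORDER BY low_id, high_id
""").fetchall()

print(unique_pairs)
```

Note that a relationship stored only as 3->1 would come back as (1, 3) with this approach. In MySQL the two CASE expressions could presumably be replaced with LEAST() and GREATEST().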

How to merge two different field values into one row?

I need to clean some data by merging two similar but slightly different dimension field values into one new row that adds together the two metric values, keeping the uid and date intact.
Current setup looks like this:
╔═════╦═════════════╦══════╦═══════════╦═══════════╗
║ id ║ date ║ uid ║ source ║ pageviews ║
╠═════╬═════════════╬══════╬═══════════╬═══════════╣
║ 1 ║ 2013-12-11 ║ 111 ║ source1 ║ 14 ║
║ 3 ║ 2013-12-11 ║ 111 ║ source1a ║ 1 ║
║ 11 ║ 2013-12-11 ║ 222 ║ source1 ║ 3 ║
║ 19 ║ 2013-12-11 ║ 222 ║ source1a ║ 11 ║
╚═════╩═════════════╩══════╩═══════════╩═══════════╝
I'd like to consider source1 and source1a to be equal and merge the two, to get this:
╔═════╦═════════════╦══════╦══════════╦═══════════╗
║ id ║ date ║ uid ║ source ║ pageviews ║
╠═════╬═════════════╬══════╬══════════╬═══════════╣
║ 1 ║ 2013-12-11 ║ 111 ║ source1 ║ 15 ║
║ 2 ║ 2013-12-11 ║ 222 ║ source1 ║ 14 ║
╚═════╩═════════════╩══════╩══════════╩═══════════╝
The id is not important; I had planned to re-increment the id in the resulting new table.
This is what I tried, but it didn't merge the two records – I am getting matching values but still separate rows:
SELECT date, uid, (SELECT CASE
WHEN source = 'source1a' THEN 'source1'
ELSE source
END) AS 'source', pageviews
FROM trafficSourceMedium
GROUP BY date, source, userid
An aggregation query should do what you want:
select `date`, uid,
(case when source = 'source1a' then 'source1' else source end) as source,
sum(pageviews) as pageviews
from trafficSourceMedium
group by `date`, uid,
(case when source = 'source1a' then 'source1' else source end);
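The attempt in the question has no SUM(), and as far as I can tell its GROUP BY source refers to the raw column rather than the rewritten value, which is why the rows stay separate. Here is a small runnable sketch of the aggregation against the sample data, using Python's sqlite3 module as a stand-in for MySQL; the alias `merged_source` is my own, chosen to avoid clashing with the column name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE trafficSourceMedium (
        id INTEGER, date TEXT, uid INTEGER, source TEXT, pageviews INTEGER
    );
    INSERT INTO trafficSourceMedium VALUES
        (1,  '2013-12-11', 111, 'source1',  14),
        (3,  '2013-12-11', 111, 'source1a', 1),
        (11, '2013-12-11', 222, 'source1',  3),
        (19, '2013-12-11', 222, 'source1a', 11);
""")

# Group by the rewritten source value, not the raw column, and sum the metric.
merged = conn.execute("""
    SELECT date, uid,
           CASE WHEN source = 'source1a' THEN 'source1' ELSE source END AS merged_source,
           SUM(pageviews) AS pageviews
    FROM trafficSourceMedium
    GROUP BY date, uid,
             CASE WHEN source = 'source1a' THEN 'source1' ELSE source END
    ORDER BY uid
""").fetchall()

for row in merged:
    print(row)
```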

MySql Query for identifying sequence in a table

I need help with a MySQL query to update a new column of the same table based on a series of entry and exit dates.
Below is the table:
╔════╦═══════════╦═════════════╦═════════════╦════════════════════╗
║ ID ║ PLACE ║ ENTRYDATE ║ EXITDATE ║ LAST_PLACE_VISITED ║
╠════╬═══════════╬═════════════╬═════════════╬════════════════════╣
║ 1 ║ Delhi ║ 1-Jan-2012 ║ 5-Jan-2012 ║ ║
║ 1 ║ Agra ║ 10-Jan-2012 ║ 11-Jan-2012 ║ ║
║ 1 ║ Bangalore ║ 21-Jan-2012 ║ 24-Jan-2012 ║ ║
║ 1 ║ Mumbai ║ 12-Jan-2012 ║ 19-Jan-2012 ║ ║
║ 2 ║ LA ║ 1-Mar-2012 ║ 3-Mar-2012 ║ ║
║ 2 ║ SFO ║ 10-Mar-2012 ║ 14-Mar-2012 ║ ║
║ 2 ║ NY ║ 4-Mar-2012 ║ 9-Mar-2012 ║ ║
║ 3 ║ Delhi ║ 12-Apr-2012 ║ 13-Apr-2012 ║ ║
╚════╩═══════════╩═════════════╩═════════════╩════════════════════╝
The data type of ENTRYDATE and EXITDATE is DATE.
From the above table I need to write a query to update the "Last_Place_Visited" column based on the entry and exit dates of each ID.
Any help with this query would be much appreciated.
Thanks.
Bhargav
Here's a very messy one, since MySQL before version 8.0 doesn't support window functions:
UPDATE TravelTbl a
INNER JOIN
(
    SELECT a.ID,
           a.Place,
           a.EntryDate,
           a.ExitDate,
           b.Place Last_Place_Visited
    FROM
    (
        -- number each ID's rows in visit order: 1, 2, 3, ...
        SELECT ID,
               Place,
               EntryDate,
               ExitDate,
               Last_Place_Visited,
               @grp := if(@ID = ID, @grp, 0) + 1 GRP_RecNo,
               @ID := ID
        FROM TravelTbl,
             (SELECT @ID := '', @grp := 0) vars
        ORDER BY ID, EntryDate
    ) a
    LEFT JOIN
    (
        -- the same numbering again, joined one row behind to find the previous visit
        SELECT ID,
               Place,
               EntryDate,
               ExitDate,
               Last_Place_Visited,
               @grp2 := if(@ID2 = ID, @grp2, 0) + 1 GRP_RecNo,
               @ID2 := ID
        FROM TravelTbl,
             (SELECT @ID2 := '', @grp2 := 0) vars
        ORDER BY ID, EntryDate
    ) b ON a.ID = b.ID AND
           a.GRP_RecNo = b.GRP_RecNo + 1
) b ON a.ID = b.ID AND
       a.Place = b.Place AND
       a.EntryDate = b.EntryDate AND
       a.ExitDate = b.ExitDate AND
       b.Last_Place_Visited IS NOT NULL
SET a.Last_Place_Visited = b.Last_Place_Visited
SQLFiddle Demo
OUTPUT
╔════╦═══════════╦═════════════╦═════════════╦════════════════════╗
║ ID ║ PLACE ║ ENTRYDATE ║ EXITDATE ║ LAST_PLACE_VISITED ║
╠════╬═══════════╬═════════════╬═════════════╬════════════════════╣
║ 1 ║ Delhi ║ 1-Jan-2012 ║ 5-Jan-2012 ║ (null) ║
║ 1 ║ Agra ║ 10-Jan-2012 ║ 11-Jan-2012 ║ Delhi ║
║ 1 ║ Bangalore ║ 21-Jan-2012 ║ 24-Jan-2012 ║ Mumbai ║
║ 1 ║ Mumbai ║ 12-Jan-2012 ║ 19-Jan-2012 ║ Agra ║
║ 2 ║ LA ║ 1-Mar-2012 ║ 3-Mar-2012 ║ (null) ║
║ 2 ║ SFO ║ 10-Mar-2012 ║ 14-Mar-2012 ║ NY ║
║ 2 ║ NY ║ 4-Mar-2012 ║ 9-Mar-2012 ║ LA ║
║ 3 ║ Delhi ║ 12-Apr-2012 ║ 13-Apr-2012 ║ (null) ║
╚════╩═══════════╩═════════════╩═════════════╩════════════════════╝
I tried to modify the table itself:
UPDATE T SET LAST_PLACE_VISITED = (
SELECT t2.PLACE
FROM T t2
WHERE t2.EXITDATE = (
SELECT MAX(t1.EXITDATE)
FROM T t1
WHERE t1.ID = ID
AND t1.EXITDATE < EXITDATE
));
MySQL won't permit this:
You can't specify target table 'T' for update in FROM clause:
But you could work with a view or a temporary table and use this:
UPDATE: Inserted LIMIT 1 to deal with the case of multiple occurrences of the maximum value of EXITDATE for distinct IDs. Disadvantage: we cannot predict which row with the maximum value will be taken.
UPDATE 2: Added condition AND t2.ID = t0.ID
SELECT t0.ID, t0.PLACE, t0.ENTRYDATE, t0.EXITDATE, (
SELECT t2.PLACE
FROM T t2
WHERE t2.EXITDATE = (
SELECT MAX(t1.EXITDATE)
FROM T t1
WHERE t1.ID = t0.ID
AND t1.EXITDATE < t0.EXITDATE
)
AND t2.ID = t0.ID
LIMIT 1
) AS LAST_PLACE_VISITED
FROM T t0;
See my SQLFiddle demo

MySQL query: select all rows with the same value within a limit, so that no value is left outside the defined limit

╔════════╦═══════════╦═══════╗
║ MSG_ID ║ RANDOM_ID ║ MSG ║
╠════════╬═══════════╬═══════╣
║ 1 ║ 22 ║ apple ║
║ 2 ║ 22 ║ bag ║
║ 3 ║ 0 ║ cat ║
║ 4 ║ 0 ║ dog ║
║ 5 ║ 0 ║ egg ║
║ 6 ║ 21 ║ fish ║
║ 7 ║ 21 ║ hen ║
║ 8 ║ 20 ║ glass ║
╚════════╩═══════════╩═══════╝
I want to fetch 3 records at a time in such a way that all the rows for a particular random_id are picked up.
Result Required:
╔════════╦═══════════╦═══════╗
║ MSG_ID ║ RANDOM_ID ║ MSG ║
╠════════╬═══════════╬═══════╣
║ 1 ║ 22 ║ apple ║
║ 2 ║ 22 ║ bag ║
║ 3 ║ 0 ║ cat ║
╚════════╩═══════════╩═══════╝
Current Result:
╔════════╦═══════════╦═══════╗
║ MSG_ID ║ RANDOM_ID ║ MSG ║
╠════════╬═══════════╬═══════╣
║ 1 ║ 22 ║ apple ║
║ 3 ║ 0 ║ cat ║
║ 4 ║ 0 ║ dog ║
╚════════╩═══════════╩═══════╝
Query Used:
SELECT ID,Random_ID, GROUP_CONCAT(message SEPARATOR ' ' ),FLAG,mobile,sender_number,SMStype
FROM messagemaster
WHERE Random_ID > 0
GROUP BY Random_ID
UNION
SELECT ID,Random_ID, message,FLAG,mobile,sender_number,SMStype
FROM messagemaster
WHERE Random_ID = 0
order by random_id LIMIT 100;
I don't want to pick up records using GROUP BY. I want to fetch all the records for each random_id. For instance, if a random_id has 3 records and the query has LIMIT 3, then I want all the rows for that random_id to be picked up.
In other words, if I fetch rows with LIMIT 100, I don't want any random_id that appears in the result set to have some of its rows left out.
For example, if I am picking records with LIMIT 3, then for random_id = 22, all records with random_id = 22 should be picked.
Consider the following...
SELECT b.*
FROM
( SELECT x.*, SUM(y.cnt)
FROM
( SELECT random_id,COUNT(*) cnt FROM messagemaster GROUP BY random_id) x
JOIN
( SELECT random_id,COUNT(*) cnt FROM messagemaster GROUP BY random_id) y
ON y.random_id >= x.random_id
GROUP
BY x.random_id
HAVING SUM(y.cnt) < 4
) a
JOIN messagemaster b
ON b.random_id = a.random_id;
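The query above builds a running total of rows per random_id and keeps only the groups whose running total still fits in the batch. Here is a minimal sketch of that idea against the sample data, run through Python's sqlite3 module as a stand-in for MySQL. The column names follow the table shown in the question, the `fetch_batch` helper and the `running_total` alias are hypothetical, and `HAVING ... <= batch_size` plays the role of `< 4` for a batch of 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messagemaster (msg_id INTEGER, random_id INTEGER, msg TEXT);
    INSERT INTO messagemaster VALUES
        (1,22,'apple'),(2,22,'bag'),(3,0,'cat'),(4,0,'dog'),
        (5,0,'egg'),(6,21,'fish'),(7,21,'hen'),(8,20,'glass');
""")

def fetch_batch(conn, batch_size):
    """Return whole random_id groups, highest random_id first, within batch_size rows."""
    return conn.execute("""
        SELECT b.msg_id, b.random_id, b.msg
        FROM (
            -- running total of rows over this group and all higher random_ids
            SELECT x.random_id, SUM(y.cnt) AS running_total
            FROM (SELECT random_id, COUNT(*) AS cnt
                  FROM messagemaster GROUP BY random_id) x
            JOIN (SELECT random_id, COUNT(*) AS cnt
                  FROM messagemaster GROUP BY random_id) y
              ON y.random_id >= x.random_id
            GROUP BY x.random_id
            HAVING SUM(y.cnt) <= :batch_size
        ) a
        JOIN messagemaster b ON b.random_id = a.random_id
        ORDER BY b.msg_id
    """, {"batch_size": batch_size}).fetchall()

print(fetch_batch(conn, 3))
print(fetch_batch(conn, 4))
```

With a batch size of 3 only the two rows for random_id 22 come back, because adding the two rows for random_id 21 would push the total to 4. A group is never split, so a batch can be smaller than the limit.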