Find differences in name-value pairs across version IDs - mysql

Say I have a table with 3 columns: version_id, name, value.
Conceptually, this table has a bunch of name-value pairs for each version_id.
How can I write a query that shows only the name-value pairs of the top two version_ids where the name-value pair is not the same across the two version_ids?
Additionally, I am wondering if there is a way to put the differing name-value pairs from the two version_ids side by side, or have the rows be right next to each other in the results.
Basically, I want something like a diff of the two versions.
Example:
version_id name value
23459 jsLibrary2 JQuery_1_4_3
23459 jsLibrary1 CrossDomainAjax_1_0
23456 jsLibrary2 JQuery_1_4_2
23456 jsLibrary1 CrossDomainAjax_1_0
23456 groovyInclude2 GroovyUtilities
23454 jsLibrary2 JQuery_1_4_2
23454 jsLibrary1 CrossDomainAjax_1_0
23454 groovyInclude2 GroovyUtilities
Ideal query result:
23456 jsLibrary2 JQuery_1_4_2
23459 jsLibrary2 JQuery_1_4_3
23456 groovyInclude2 GroovyUtilities
23459 NULL NULL
Note that ideally it would note new name-value pairs (where the name doesn't exist in the smaller version_id) and deleted name-value pairs (where the name doesn't exist in the larger version_id)

I'm sure this can be simplified — or at least, I really hope it can — but:
SELECT name,
version_id_before,
( SELECT value
FROM property_history
WHERE name = t.name
AND version_id = version_id_before
) AS value_before,
( SELECT MIN(version_id)
FROM property_history
WHERE version_id > version_id_before
) AS version_id_after,
( SELECT value
FROM property_history
WHERE name = t.name
AND version_id =
( SELECT MIN(version_id)
FROM property_history
WHERE version_id > version_id_before
)
) AS value_after
FROM ( SELECT name,
CASE WHEN EXISTS
( SELECT 1
FROM property_history
WHERE name = ph1.name
AND version_id =
( SELECT MAX(version_id)
FROM property_history
)
)
THEN ( SELECT MAX(version_id)
FROM property_history ph2
WHERE NOT EXISTS
( SELECT 1
FROM property_history
WHERE name = ph1.name
AND version_id = ph2.version_id
AND value =
( SELECT value
FROM property_history
WHERE name = ph1.name
AND version_id =
( SELECT MAX(version_id)
FROM property_history
)
)
)
)
ELSE ( SELECT MAX(version_id)
FROM property_history
WHERE name = ph1.name
)
END AS version_id_before
FROM property_history ph1
GROUP
BY name
) AS t
WHERE version_id_before IS NOT NULL
;
(Disclaimer: tested only using your example data-set, for which it gives the result:
+----------------+-------------------+-----------------+------------------+--------------+
| name | version_id_before | value_before | version_id_after | value_after |
+----------------+-------------------+-----------------+------------------+--------------+
| groovyInclude2 | 23456 | GroovyUtilities | 23459 | NULL |
| jsLibrary2 | 23456 | JQuery_1_4_2 | 23459 | JQuery_1_4_3 |
+----------------+-------------------+-----------------+------------------+--------------+
I haven't made any effort to construct other data-sets to test it on.)

I think you'll need to use a couple of subqueries to get the desired results since you are looking for the first and second values. I'm assuming that the name is the 'key' that you have to group on, in which case something along these lines should work:
Select
    firstVersion.firstVersionId,
    firstVersionDetails.name as firstVersionName,
    firstVersionDetails.value as firstVersionValue,
    -- second version values will be null if there is no second version
    secondVersion.secondVersionId,
    secondVersionDetails.name as secondVersionName, -- always the same as firstVersionName because name is a key field
    secondVersionDetails.value as secondVersionValue
From
    (
        Select
            name,
            Max(version_id) as firstVersionId
        From versions
        Group by name
    ) as firstVersion
    join versions as firstVersionDetails -- inner join because every name has a first version
        on firstVersionDetails.name = firstVersion.name
        and firstVersionDetails.version_id = firstVersion.firstVersionId
    left outer join -- outer join so we always get the first version and get the second version whenever there is one (in other words, does *not* limit data to names with at least 2 versions)
    (
        select
            v.name,
            Max(v.version_id) as secondVersionId
        from versions as v
        join
        (
            select name, Max(version_id) as latestVersionId
            from versions
            group by name
        ) as latest
            on latest.name = v.name
            and v.version_id < latest.latestVersionId -- exclude the first version when calculating the 'max'. This is the part of the join that allows us to identify the second version
        group by v.name
    ) as secondVersion
        on firstVersion.name = secondVersion.name
    left outer join versions as secondVersionDetails -- using outer join again so we don't limit our data to names with 2 versions
        on secondVersionDetails.name = secondVersion.name
        and secondVersionDetails.version_id = secondVersion.secondVersionId
Happy querying! :-)

How about this approach -
SELECT MAX(version_id) INTO @cur FROM tbl;
SELECT MAX(version_id) INTO @prev FROM tbl WHERE version_id < @cur;
SELECT name, @prev, MAX(IF(version_id = @prev, value, '')) AS prev_val, @cur, MAX(IF(version_id = @cur, value, '')) AS cur_val
FROM tbl
WHERE version_id IN (@prev, @cur)
GROUP BY name
HAVING cur_val <> prev_val;
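The pivot-and-compare idea above can be tried out with a minimal sketch. It runs against SQLite through Python's sqlite3 module as a stand-in for MySQL (an assumption made only so the example is self-contained), uses the question's sample data, and compares with SQLite's null-safe `IS NOT` so that a name missing from one version shows up as NULL rather than an empty string. In MySQL the equivalent null-safe test would be `NOT (a <=> b)`. The helper name `diff_latest_two` is hypothetical.

```python
import sqlite3

# Sample rows from the question: (version_id, name, value)
ROWS = [
    (23459, "jsLibrary2", "JQuery_1_4_3"),
    (23459, "jsLibrary1", "CrossDomainAjax_1_0"),
    (23456, "jsLibrary2", "JQuery_1_4_2"),
    (23456, "jsLibrary1", "CrossDomainAjax_1_0"),
    (23456, "groovyInclude2", "GroovyUtilities"),
    (23454, "jsLibrary2", "JQuery_1_4_2"),
    (23454, "jsLibrary1", "CrossDomainAjax_1_0"),
    (23454, "groovyInclude2", "GroovyUtilities"),
]

# Pivot the two versions into one row per name, then keep rows that differ.
# "IS NOT" is SQLite's null-safe inequality (assumption: SQLite stand-in).
DIFF_SQL = """
SELECT name,
       MAX(CASE WHEN version_id = :prev THEN value END) AS prev_val,
       MAX(CASE WHEN version_id = :cur  THEN value END) AS cur_val
FROM tbl
WHERE version_id IN (:prev, :cur)
GROUP BY name
HAVING MAX(CASE WHEN version_id = :prev THEN value END)
       IS NOT MAX(CASE WHEN version_id = :cur THEN value END)
ORDER BY name
"""


def diff_latest_two(conn):
    """Return (name, prev_val, cur_val) rows that differ between the two newest versions."""
    versions = [row[0] for row in conn.execute(
        "SELECT DISTINCT version_id FROM tbl ORDER BY version_id DESC LIMIT 2")]
    if len(versions) < 2:
        return []  # nothing to compare against
    cur, prev = versions
    return conn.execute(DIFF_SQL, {"prev": prev, "cur": cur}).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (version_id INTEGER, name TEXT, value TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", ROWS)

result = diff_latest_two(conn)
for row in result:
    print(row)
```

A NULL in `prev_val` marks a name added in the newer version, and a NULL in `cur_val` marks a name deleted from it.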

Related

Identifying groups in Group By

I am running a complicated group by statement and I get all my results in their respective groups. But I want to create a custom column with their "group id". Essentially all the items that are grouped together would share an ID.
This is what I get:
partID | Description
-------+------------
11000 | "Oven"
12000 | "Oven"
13000 | "Stove"
13020 | "Stove"
12012 | "Grill"
This is what I want:
partID | Description | GroupID
-------+-------------+----------
11000 | "Oven" | 1
12000 | "Oven" | 1
13000 | "Stove" | 2
13020 | "Stove" | 2
12012 | "Grill" | 3
"GroupID" does not exist as data in any of the tables, it would be a custom generated column (alias) that would be associated to that group's key,id,index, whatever it would be called.
How would I go about doing this?
I think this is the query that returns the five rows:
select partId, Description
from part p;
Here is one way (using standard SQL) to get the groups:
select partId, Description,
(select count(distinct Description)
from part p2
where p2.Description <= p.Description
) as GroupId
from part p;
This uses a correlated subquery. The subquery finds all the description values less than or equal to the current one and counts the distinct values. Note that this gives a different set of values from the ones in the OP: they are assigned alphabetically rather than by first encounter in the data. If that is important, the OP should add that to the question. Based on the question, the particular ordering did not seem important.
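To see the alphabetical numbering concretely, here is a small sketch that runs the same correlated subquery on the question's five rows. It uses SQLite via Python's sqlite3 module as a stand-in for MySQL (an assumption for the sake of a runnable example); the table and column names follow the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (partID INTEGER, Description TEXT)")
conn.executemany(
    "INSERT INTO part VALUES (?, ?)",
    [(11000, "Oven"), (12000, "Oven"), (13000, "Stove"),
     (13020, "Stove"), (12012, "Grill")],
)

# For each row, count the distinct descriptions that sort at or before its own.
# That count is the same for every row sharing a description, so it works as a group id.
GROUP_ID_SQL = """
SELECT p.partID, p.Description,
       (SELECT COUNT(DISTINCT p2.Description)
        FROM part p2
        WHERE p2.Description <= p.Description) AS GroupId
FROM part p
ORDER BY p.partID
"""

group_rows = conn.execute(GROUP_ID_SQL).fetchall()
for row in group_rows:
    print(row)
```

With this data "Grill" sorts first and gets group 1, "Oven" gets 2 and "Stove" gets 3, which differs from the first-encounter numbering shown in the question.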
Here's one way to get it:
SELECT p.partID, p.Description, b.groupID
FROM (
    SELECT Description, @rn := @rn + 1 AS groupID
    FROM (
        SELECT DISTINCT description
        FROM part, (SELECT @rn := 0) c
    ) a
) b
INNER JOIN part p ON p.description = b.description;
This assigns a different groupID to each description, and then joins the original table on that description.
Based on your comments in response to Gordon's answer, I think what you need is a derived table to generate your groupids, like so:
select
    t1.description,
    @cntr := @cntr + 1 as GroupID
FROM
    (select distinct table1.description from table1) t1
cross join
    (select @cntr := 0) t2
which will give you:
DESCRIPTION GROUPID
Oven 1
Stove 2
Grill 3
Then you can use that in your original query, joining on description:
select
    t1.partid,
    t1.description,
    t2.GroupID
from
    table1 t1
inner join
(
    select
        t1.description,
        @cntr := @cntr + 1 as GroupID
    FROM
        (select distinct table1.description from table1) t1
    cross join
        (select @cntr := 0) t2
) t2
    on t1.description = t2.description
SELECT partID, Description, @s := @s + 1 GroupID
FROM part, (SELECT @s := 0) AS s
GROUP BY Description

MySQL GROUP BY order

Please consider the following table structure and data:
+--------------------+-------------+
| venue_name | listed_by |
+--------------------+-------------+
| My Venue Name | 1 |
| Another Venue | 2 |
| My Venue Name | 5 |
+--------------------+-------------+
I am currently using MySQL's GROUP BY clause to select only unique venue names. However, this only returns the first occurrence of My Venue Name, but I would like to return it based on a condition (in this case, where the listed_by field has a value > 2).
Essentially here's some pseudo-code of what I'd like to achieve:
Select all records
Group by name
if grouped, return the occurrence with the higher value in listed_by
Is there an SQL statement that will allow this functionality?
Edit: I should have mentioned that there are other fields involved in the query, and the listed_by field needs to be used elsewhere in the query, too. Here is the original query that we're using:
SELECT l1.field_value AS venue_name,
base.ID AS listing_id,
base.user_ID AS user_id,
IF(base.user_ID > 1, 'b', 'a') AS flag,
COUNT(img.ID) AS img_num
FROM ( listingsDBElements l1, listingsDB base )
LEFT JOIN listingsImages img ON (base.ID = img.listing_id AND base.user_ID = img.user_id and img.active = 'yes')
WHERE l1.field_name = 'venue_name'
AND l1.field_value LIKE '%name%'
AND base.ID = l1.listing_id
AND base.user_ID = l1.user_id
AND base.ID = l1.listing_id
AND base.user_ID = l1.user_id
AND base.active = 'yes'
GROUP BY base.Title ORDER BY flag desc,img_num desc
As long as you didn't mention other fields - here is the simplest solution:
SELECT venue_name,
MAX(listed_by)
FROM tblname
WHERE listed_by > 2
GROUP BY venue_name
With other fields it could look like this (assuming there are no duplicates in venue_name + listed_by pairs):
SELECT *
FROM tblname t1
INNER JOIN (SELECT venue_name,
MAX(listed_by) max_listed_by
FROM tblname
WHERE listed_by > 2
GROUP BY venue_name) t2 ON t1.venue_name = t2.venue_name
AND t1.listed_by = t2.max_listed_by
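The join-back pattern above can be checked with a short sketch on the question's three rows. It runs on SQLite through Python's sqlite3 module as a stand-in for MySQL; the `id` column and the helper `top_listing_per_venue` are hypothetical additions that stand in for the "other fields" and make the threshold adjustable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "id" is a hypothetical extra column standing in for the other fields.
conn.execute("CREATE TABLE tblname (id INTEGER, venue_name TEXT, listed_by INTEGER)")
conn.executemany(
    "INSERT INTO tblname VALUES (?, ?, ?)",
    [(1, "My Venue Name", 1), (2, "Another Venue", 2), (3, "My Venue Name", 5)],
)

# Step 1 (derived table t2): the highest listed_by per venue above the threshold.
# Step 2 (join): pull the full row that owns that highest value.
TOP_ROW_SQL = """
SELECT t1.id, t1.venue_name, t1.listed_by
FROM tblname t1
INNER JOIN (SELECT venue_name, MAX(listed_by) AS max_listed_by
            FROM tblname
            WHERE listed_by > ?
            GROUP BY venue_name) t2
        ON t1.venue_name = t2.venue_name
       AND t1.listed_by = t2.max_listed_by
ORDER BY t1.venue_name
"""


def top_listing_per_venue(conn, min_listed_by):
    """Return one full row per venue: the one with the highest listed_by above the threshold."""
    return conn.execute(TOP_ROW_SQL, (min_listed_by,)).fetchall()


print(top_listing_per_venue(conn, 2))
print(top_listing_per_venue(conn, 0))
```

Note that the `listed_by > 2` filter drops venues that have no row above the threshold at all; lowering the threshold brings "Another Venue" back.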

MySQL getting the lowest ID for a certain user -or- the ID of the entry with the highest urgency for each row

I have the following database
id | user | urgency | problem | solved
The information in there has different users, but these users all have multiple entries
1 | marco | 0 | MySQL problem | n
2 | marco | 0 | Email problem | n
3 | eddy | 0 | Email problem | n
4 | eddy | 1 | MTV doesn't work | n
5 | frank | 0 | out of coffee | y
What I want to do is this: Normally I would check everybody's oldest problem first. I use this query to get the IDs of the oldest problems.
select min(id) from db group by user
This gives me a list of the oldest problem IDs. But I want people to be able to make a certain problem more urgent. So for each user I want either the lowest ID or, if that user has raised the urgency of a problem, the ID of the problem with the highest urgency.
Getting the max(urgency) won't give me the ID of the problem; it will only give me the max urgency.
To be clear: I want to get this as a result
row | id
0 | 1
1 | 4
The last entry should not be in the results since it's solved
Select ...
From SomeTable As T
Join (
Select T1.User, Min( T1.Id ) As Id
From SomeTable As T1
Join (
Select T2.User, Max( T2.Urgency ) As Urgency
From SomeTable As T2
Where T2.Solved = 'n'
Group By T2.User
) As MaxUrgency
On MaxUrgency.User = T1.User
And MaxUrgency.Urgency = T1.Urgency
Where T1.Solved = 'n'
Group By T1.User
) As Z
On Z.User = T.User
And Z.Id = T.Id
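As a rough check of the inner part of the query above (highest urgency per user among unsolved problems, then the lowest id at that urgency), here is a sketch on the question's sample rows. It uses SQLite via Python's sqlite3 module as a stand-in for MySQL and the table name `db` from the question; both choices are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE db (id INTEGER, user TEXT, urgency INTEGER, problem TEXT, solved TEXT)"
)
conn.executemany(
    "INSERT INTO db VALUES (?, ?, ?, ?, ?)",
    [
        (1, "marco", 0, "MySQL problem", "n"),
        (2, "marco", 0, "Email problem", "n"),
        (3, "eddy", 0, "Email problem", "n"),
        (4, "eddy", 1, "MTV doesn't work", "n"),
        (5, "frank", 0, "out of coffee", "y"),
    ],
)

# Inner derived table: the highest urgency each user has among unsolved problems.
# Outer query: the oldest (lowest id) unsolved problem at that urgency.
NEXT_PROBLEM_SQL = """
SELECT t.user, MIN(t.id) AS id
FROM db AS t
JOIN (SELECT user, MAX(urgency) AS urgency
      FROM db
      WHERE solved = 'n'
      GROUP BY user) AS m
  ON m.user = t.user
 AND m.urgency = t.urgency
WHERE t.solved = 'n'
GROUP BY t.user
ORDER BY MIN(t.id)
"""

next_problems = conn.execute(NEXT_PROBLEM_SQL).fetchall()
print(next_problems)
```

Frank has no unsolved problems, so he drops out of the result entirely.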
There are lots of esoteric ways to do this, but here's one of the clearer ones.
First build a query to get your min id and max urgency:
SELECT
user,
MIN(id) AS min_id,
MAX(urgency) AS max_urgency
FROM
db
GROUP BY
user
Then incorporate that as a logical table into a larger query for your answers:
SELECT
user,
min_id,
max_urgency,
( SELECT MIN(id) FROM db
WHERE user = a.user
AND urgency = a.max_urgency
) AS max_urgency_min_id
FROM
(
SELECT
user,
MIN(id) AS min_id,
MAX(urgency) AS max_urgency
FROM
db
GROUP BY
user
) AS a
Given the obvious indexes, this should be pretty efficient.
The following will get you exactly one row back -- the most urgent, probably oldest problem in your table.
select id from my_table where id = (
select min(id) from my_table where urgency = (
select max(urgency) from my_table
)
)
I was about to suggest adding a create_date column to your table so that you could get the oldest problem first for those problems of the same urgency level. But I'm now assuming you're using the lowest ID for that purpose.
But now I see you wanted a list of them. For that, you'd sort the results by ID:
select id from my_table where urgency = (
select max(urgency) from my_table
) order by id;
[Edit: Left out the order by!]
I forget, honestly, how to get the row number. Someone on the interwebs suggests something like this, but no idea if it works:
select @rownum:=@rownum+1 'row', id from my_table where ...

Can select next record alphabetically, but what happens when the record name is identical?

With this SQL I can grab the next name in alphabetical order, given the ID (X) of the current record:
SELECT id
FROM `names`
WHERE `name` > (SELECT `name` FROM `names` WHERE `id` = X)
ORDER BY `name` ASC, `id` ASC
However, let's assume I have these records:
id | name
---------
12 | Alex
8 | Bert
13 | Bert
17 | Bert
4 | Chris
Say I have id 12 as reference I get the results
id | name
---------
8 | Bert
13 | Bert
17 | Bert
4 | Chris
But if I use 8 as reference I get
id | name
---------
4 | Chris
Bert 13 and 17 would get skipped.
This may seem obvious but if you wanted to order by 2 fields such as first and last names then you will need to concatenate (CONCAT / CONCAT_WS) the first and last name fields in order to find the previous or next result. If you have names that are the same then you may find yourself looping from one name to the other and back again, to prevent this, concatenate the ID of the row to the end of the concatenated first and last name. This will work better than testing if the id is greater than the current id (OR n.names = q.name AND n.id > q.id) as if the names have not been inserted alphabetically then you will miss sections of results (not what is wanted when browsing to the next / previous result). Hope this helps someone.
Try a condition like this:
WHERE `name` > (SELECT `name` FROM `names` WHERE `id` = X)
OR `name` = (SELECT `name` FROM `names` WHERE `id` = X) AND `id` > X
That's because you're comparing names with the greater-than operator, which excludes any names equal to the reference name. If you want to keep respecting the id:
SELECT n.id
FROM names n
CROSS JOIN (SELECT name, id FROM names WHERE id = X) q
WHERE n.name > q.name
   OR (n.name = q.name AND n.id > q.id)
ORDER BY n.name ASC, n.id ASC
Here we use the inner query to return not only names, but also corresponding id's. We can then use the id as a tie-breaker in the case of equal names.
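The tie-breaker can be exercised with a small sketch that walks through the question's records one step at a time. It uses SQLite via Python's sqlite3 module as a stand-in for MySQL; the helper `next_id` and the `LIMIT 1` are hypothetical additions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE names (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO names VALUES (?, ?)",
    [(12, "Alex"), (8, "Bert"), (13, "Bert"), (17, "Bert"), (4, "Chris")],
)

# q is the reference row; a candidate comes "after" it when its name sorts later,
# or when the name is equal and its id is higher (the tie-breaker).
NEXT_SQL = """
SELECT n.id
FROM names n
JOIN names q ON q.id = :current
WHERE n.name > q.name
   OR (n.name = q.name AND n.id > q.id)
ORDER BY n.name ASC, n.id ASC
LIMIT 1
"""


def next_id(conn, current_id):
    """Return the id that follows current_id in (name, id) order, or None at the end."""
    row = conn.execute(NEXT_SQL, {"current": current_id}).fetchone()
    return row[0] if row else None


# Walk the whole list starting from Alex (id 12).
walk = []
current = 12
while (current := next_id(conn, current)) is not None:
    walk.append(current)
print(walk)
```

Starting from id 8, the next record is now Bert 13 rather than Chris 4, so no Bert is skipped.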

Split a MYSQL string from GROUP_CONCAT into an ( array, like, expression, list) that IN () can understand

This question follows on from MYSQL join results set wiped results during IN () in where clause?
So, short version of the question. How do you turn the string returned by GROUP_CONCAT into a comma-separated expression list that IN() will treat as a list of multiple items to loop over?
N.B. The MySQL docs appear to refer to the "( comma, separated, lists )" used by IN () as 'expression lists', and interestingly the pages on IN() seem to be more or less the only pages in the MySQL docs to ever refer to expression lists. So I'm not sure if functions intended for making arrays or temp tables would be any use here.
Long example-based version of the question: From a 2-table DB like this:
SELECT id, name, GROUP_CONCAT(tag_id) FROM person INNER JOIN tag ON person.id = tag.person_id GROUP BY person.id;
+----+------+----------------------+
| id | name | GROUP_CONCAT(tag_id) |
+----+------+----------------------+
| 1 | Bob | 1,2 |
| 2 | Jill | 2,3 |
+----+------+----------------------+
How can I turn this, which since it uses a string is treated as logical equivalent of ( 1 = X ) AND ( 2 = X )...
SELECT name, GROUP_CONCAT(tag.tag_id) FROM person LEFT JOIN tag ON person.id = tag.person_id
GROUP BY person.id HAVING ( ( 1 IN (GROUP_CONCAT(tag.tag_id) ) ) AND ( 2 IN (GROUP_CONCAT(tag.tag_id) ) ) );
Empty set (0.01 sec)
...into something where the GROUP_CONCAT result is treated as a list, so that for Bob, it would be equivalent to:
SELECT name, GROUP_CONCAT(tag.tag_id) FROM person INNER JOIN tag ON person.id = tag.person_id AND person.id = 1
GROUP BY person.id HAVING ( ( 1 IN (1,2) ) AND ( 2 IN (1,2) ) );
+------+--------------------------+
| name | GROUP_CONCAT(tag.tag_id) |
+------+--------------------------+
| Bob | 1,2 |
+------+--------------------------+
1 row in set (0.00 sec)
...and for Jill, it would be equivalent to:
SELECT name, GROUP_CONCAT(tag.tag_id) FROM person INNER JOIN tag ON person.id = tag.person_id AND person.id = 2
GROUP BY person.id HAVING ( ( 1 IN (2,3) ) AND ( 2 IN (2,3) ) );
Empty set (0.00 sec)
...so the overall result would be an exclusive search clause requiring all listed tags that doesn't use HAVING COUNT(DISTINCT ... ) ?
(note: This logic works without the AND, applying to the first character of the string. e.g.
SELECT name, GROUP_CONCAT(tag.tag_id) FROM person LEFT JOIN tag ON person.id = tag.person_id
GROUP BY person.id HAVING ( ( 2 IN (GROUP_CONCAT(tag.tag_id) ) ) );
+------+--------------------------+
| name | GROUP_CONCAT(tag.tag_id) |
+------+--------------------------+
| Jill | 2,3 |
+------+--------------------------+
1 row in set (0.00 sec)
Instead of using IN(), would using FIND_IN_SET() be an option too?
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_find-in-set
mysql> SELECT FIND_IN_SET('b','a,b,c,d');
-> 2
Here's a full example based on the example problem in the question, confirmed as tested by the asker in an earlier edit to the question:
SELECT name FROM person LEFT JOIN tag ON person.id = tag.person_id GROUP BY person.id
HAVING ( FIND_IN_SET(1, GROUP_CONCAT(tag.tag_id)) ) AND ( FIND_IN_SET(2, GROUP_CONCAT(tag.tag_id)) );
+------+
| name |
+------+
| Bob |
+------+
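The same HAVING clause can be reproduced in a runnable sketch. SQLite, used here through Python's sqlite3 module as a stand-in for MySQL, has GROUP_CONCAT but no FIND_IN_SET, so the sketch registers a hypothetical Python emulation of it (1-based position, 0 when absent or when the list is empty, NULL when an argument is NULL). The helper `people_with_all_tags` is also hypothetical.

```python
import sqlite3


def find_in_set(needle, haystack):
    """Rough emulation of MySQL's FIND_IN_SET for a comma-separated string."""
    if needle is None or haystack is None:
        return None
    if haystack == "":
        return 0
    items = str(haystack).split(",")
    needle = str(needle)
    return items.index(needle) + 1 if needle in items else 0


conn = sqlite3.connect(":memory:")
# Register the emulation so the SQL below can call FIND_IN_SET (assumption: SQLite stand-in).
conn.create_function("FIND_IN_SET", 2, find_in_set)
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE tag (person_id INTEGER, tag_id INTEGER)")
conn.executemany("INSERT INTO person VALUES (?, ?)", [(1, "Bob"), (2, "Jill")])
conn.executemany("INSERT INTO tag VALUES (?, ?)", [(1, 1), (1, 2), (2, 2), (2, 3)])


def people_with_all_tags(conn, first_tag, second_tag):
    """Names of people whose concatenated tag list contains both tags."""
    sql = """
    SELECT name
    FROM person
    LEFT JOIN tag ON person.id = tag.person_id
    GROUP BY person.id
    HAVING FIND_IN_SET(?, GROUP_CONCAT(tag.tag_id))
       AND FIND_IN_SET(?, GROUP_CONCAT(tag.tag_id))
    ORDER BY name
    """
    return [row[0] for row in conn.execute(sql, (first_tag, second_tag))]


print(people_with_all_tags(conn, 1, 2))
print(people_with_all_tags(conn, 2, 2))
```

Because the lookup splits the string on commas, it matches whole list items, which is what `IN (GROUP_CONCAT(...))` could not do with a single string operand.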
You can pass a string as an array, using a split separator, and explode it in a function that works with the results.
For a trivial example, if you have a string array like this: 'one|two|three|four|five', and want to know if two is in the array, you can do it this way:
delimiter |
create function str_in_array( split_index varchar(10), arr_str varchar(200), compares varchar(20) )
returns boolean
begin
declare resp boolean default 0;
declare arr_data varchar(20);
-- While the string is not empty
while( length( arr_str ) > 0 ) do
-- if the split index is in the string
if( locate( split_index, arr_str ) ) then
-- get the last data in the string
set arr_data = ( select substring_index(arr_str, split_index, -1) );
-- remove the last data in the string
-- (trim by length rather than replace(), which would also strip matching text elsewhere in the string)
set arr_str = left(arr_str,
char_length(arr_str) - char_length(arr_data) - char_length(split_index)
);
-- if the split index is not in the string
else
-- get the unique data in the string
set arr_data = arr_str;
-- empties the string
set arr_str = '';
end if;
-- in this trivial example, it returns if a string is in the array
if arr_data = compares then
set resp = 1;
end if;
end while;
return resp;
end
|
delimiter ;
I want to create a set of useful MySQL functions to work with this method. Anyone interested, please contact me.
For more examples, visit http://blog.idealmind.com.br/mysql/how-to-use-string-as-array-in-mysql-and-work-with/