Weird result for GROUP_CONCAT on subquery - mysql
I am seeing weird behavior when using GROUP_CONCAT on a subquery.
Here is my query:
SELECT
name,
GROUP_CONCAT(DISTINCT (id) SEPARATOR "-") AS id
FROM (
(SELECT
"APN" AS name,
GROUP_CONCAT(DISTINCT (site.id) SEPARATOR "-") AS id
FROM site
WHERE id IN
(138, 147, 8918, 8916, 9033, 9240, 97, 9038, 8886, 9036, 9067, 146, 37, 9127, 52, 9031, 23, 8635, 8665,
46, 39, 18, 33, 9035, 137, 9051, 8766, 25, 20, 9160, 133, 8636, 9021, 8655, 21, 42, 8757, 22, 9017, 77,
9037, 44, 49, 9323, 55, 74, 150, 8, 67, 1, 8928, 58, 9025, 9221, 9019, 9069, 9214, 9176, 95, 40, 9335,
168, 9260, 8641, 9227, 9258, 24, 50, 29, 9073, 12, 36, 8882, 9, 43, 76, 9032, 51, 9060, 96, 8922, 9212,
14, 9095, 28, 9213, 31, 41, 68, 9027, 8884, 9023, 9059, 9034, 9016, 11, 61, 9229, 8761, 9225, 8937, 9018,
9121, 9119, 8659, 8926, 9096, 57, 9083, 8662, 9232, 149, 8643, 88, 19, 8660, 10, 8936, 9210, 9241, 17, 8872))
UNION ALL
(SELECT
"smart" AS name,
GROUP_CONCAT(DISTINCT (site.id) SEPARATOR "-") AS id
FROM site
WHERE id IN
(9129, 8981, 9136, 9169, 9170, 9171, 9172, 9297, 9147, 9155, 9139, 9138, 9142, 9296, 8987, 9216, 9252,
9320, 8951, 8945, 8952, 8965, 8963, 9012, 9192, 8938, 8941, 8968, 8977, 9117, 9135, 9140, 9143, 9295,
9298, 9137, 8988, 8989, 8992, 9164, 9156, 9165, 9168, 9173, 8953, 8999, 8939, 8940, 8942, 8943, 8954,
8956, 8957, 8959, 8960, 8964, 8971, 8972, 8973, 8974, 8982, 9000, 9001, 9003, 8950, 8978, 8979, 8983,
9002, 9005, 8984, 8955, 8986, 8980, 8993, 9008, 9010, 8949, 8998, 9150, 9122, 8944, 8946, 8948, 9006,
9009, 9013, 9128, 9215, 9321, 9011, 9154, 8970, 8975, 8994, 9070, 8966, 8958, 9007, 9014))
) t
GROUP BY name;
(This is a "test" query to show the issue easily; the real query is not that "dumb".) It groups the results of two subqueries. All the IDs exist and return a row.
So when I run the first subquery alone, I get the result "APN" for name, and " 1-8-9-10-11-12-14-17-18-19-20-21-22-23-24-25-28-29-31-33-36-37-39-40-41-42-43-44-46-49-50-51-52-55-57-58-61-67-68-74-76-77-88-95-96-97-133-137-138-146-147-149-150-168-8635-8636-8641-8643-8655-8659-8660-8662-8665-8757-8761-8766-8872-8882-8884-8886-8916-8918-8922-8926-8928-8936-8937-9016-9017-9018-9019-9021-9023-9025-9027-9031-9032-9033-9034-9035-9036-9037-9038-9051-9059-9060-9067-9069-9073-9083-9095-9096-9119-9121-9127-9160-9176-9210-9212-9213-9214-9221-9225-9227-9229-9232-9240-9241-9258-9260-9323-9335" for ID (the full list of IDs)
It is the same for the second subquery, except that the name is "smart" and the IDs are different. So this is the expected behavior.
The issue is that when I run the complete query, I get the following list of IDs for the name APN:
1-8-9-10-11-12-14-17-18-19-20-21-22-23-24-25-28-29-31-33-36-37-39-40-41-42-43-44-46-49-50-51-52-55-57-58-61-67-68-74-76-77-88-95-96-97-133-137-138-146-147-149-150-168-8635-8636-8641-8643-8655-8659-8660-8662-8665-8757-8761-8766-8872-8882-8884-8886-8916-8918-8922-8926-8928-8936-8937-9016-9017-9018-9019-9021-9023-9025-9027-9031-9032-9033-9034
So this list is much shorter than the first one. The same happens for the name "smart".
I tried replacing my two subqueries with (SELECT "APN" as name, "1-8-9-10-11-12-14-17-etc..." as id FROM site LIMIT 1) using the complete list of IDs (and the same for the name "smart"), and with that the result of the full query is as expected (the full list of IDs for each name).
The group_concat_max_len is 1024 on my server (and my full ID lists are much shorter than 1024 characters).
So, do you have any idea why the result is not as expected?
Your query is a bit weird.
select name, GROUP_CONCAT(DISTINCT(id) SEPARATOR "-") AS id FROM (
(select "APN" AS name, GROUP_CONCAT(DISTINCT(site.id) SEPARATOR "-") AS id from site WHERE id IN (138,147,8918,8916,9033,9240,97,9038,8886,9036,9067,146,37,9127,52,9031,23,8635,8665,46,39,18,33,9035,137,9051,8766,25,20,9160,133,8636,9021,8655,21,42,8757,22,9017,77,9037,44,49,9323,55,74,150,8,67,1,8928,58,9025,9221,9019,9069,9214,9176,95,40,9335,168,9260,8641,9227,9258,24,50,29,9073,12,36,8882,9,43,76,9032,51,9060,96,8922,9212,14,9095,28,9213,31,41,68,9027,8884,9023,9059,9034,9016,11,61,9229,8761,9225,8937,9018,9121,9119,8659,8926,9096,57,9083,8662,9232,149,8643,88,19,8660,10,8936,9210,9241,17,8872))
UNION ALL
(select "smart" AS name, GROUP_CONCAT(DISTINCT(site.id) SEPARATOR "-") AS id from site WHERE id IN (9129,8981,9136,9169,9170,9171,9172,9297,9147,9155,9139,9138,9142,9296,8987,9216,9252,9320,8951,8945,8952,8965,8963,9012,9192,8938,8941,8968,8977,9117,9135,9140,9143,9295,9298,9137,8988,8989,8992,9164,9156,9165,9168,9173,8953,8999,8939,8940,8942,8943,8954,8956,8957,8959,8960,8964,8971,8972,8973,8974,8982,9000,9001,9003,8950,8978,8979,8983,9002,9005,8984,8955,8986,8980,8993,9008,9010,8949,8998,9150,9122,8944,8946,8948,9006,9009,9013,9128,9215,9321,9011,9154,8970,8975,8994,9070,8966,8958,9007,9014))
) t GROUP BY name;
is equivalent to:
(select "APN" AS name, GROUP_CONCAT(DISTINCT(site.id) SEPARATOR "-") AS id from site WHERE id IN (138,147,8918,8916,9033,9240,97,9038,8886,9036,9067,146,37,9127,52,9031,23,8635,8665,46,39,18,33,9035,137,9051,8766,25,20,9160,133,8636,9021,8655,21,42,8757,22,9017,77,9037,44,49,9323,55,74,150,8,67,1,8928,58,9025,9221,9019,9069,9214,9176,95,40,9335,168,9260,8641,9227,9258,24,50,29,9073,12,36,8882,9,43,76,9032,51,9060,96,8922,9212,14,9095,28,9213,31,41,68,9027,8884,9023,9059,9034,9016,11,61,9229,8761,9225,8937,9018,9121,9119,8659,8926,9096,57,9083,8662,9232,149,8643,88,19,8660,10,8936,9210,9241,17,8872))
UNION ALL
(select "smart" AS name, GROUP_CONCAT(DISTINCT(site.id) SEPARATOR "-") AS id from site WHERE id IN (9129,8981,9136,9169,9170,9171,9172,9297,9147,9155,9139,9138,9142,9296,8987,9216,9252,9320,8951,8945,8952,8965,8963,9012,9192,8938,8941,8968,8977,9117,9135,9140,9143,9295,9298,9137,8988,8989,8992,9164,9156,9165,9168,9173,8953,8999,8939,8940,8942,8943,8954,8956,8957,8959,8960,8964,8971,8972,8973,8974,8982,9000,9001,9003,8950,8978,8979,8983,9002,9005,8984,8955,8986,8980,8993,9008,9010,8949,8998,9150,9122,8944,8946,8948,9006,9009,9013,9128,9215,9321,9011,9154,8970,8975,8994,9070,8966,8958,9007,9014))
There is no need for the outer grouping by name (and the second GROUP_CONCAT on id) unless your original query produces several APN rows with the same group of IDs.
Back to your question: you are correct that GROUP_CONCAT has a maximum length of 1024, but a sort/UNION operation truncates the result further, to one third of that (1024 / 3 = 341). The truncated APN list in your question is exactly 341 characters long. (This behaviour is known, but no official documentation is available to back it up.)
In your case, just increase the group concat max length value:
SET group_concat_max_len = 5000;
and that should give your desired output without truncating.
Alternatively, you can create temporary tables and union them, or store the GROUP_CONCAT result in a variable. Either way, GROUP_CONCAT will truncate only at its original default value.
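The 1024 / 3 figure can be checked against the output quoted in the question. The sketch below uses Python purely for the arithmetic; the variable names are illustrative and not part of the original thread. It measures the truncated APN list and compares it with one third of the group_concat_max_len value reported by the asker. It only confirms the numbers line up; it does not explain why MySQL shrinks the limit.

```python
# Truncated "APN" result, copied from the question.
truncated = (
    "1-8-9-10-11-12-14-17-18-19-20-21-22-23-24-25-28-29-31-33-36-37-39-40-"
    "41-42-43-44-46-49-50-51-52-55-57-58-61-67-68-74-76-77-88-95-96-97-"
    "133-137-138-146-147-149-150-168-"
    "8635-8636-8641-8643-8655-8659-8660-8662-8665-8757-8761-8766-"
    "8872-8882-8884-8886-8916-8918-8922-8926-8928-8936-8937-"
    "9016-9017-9018-9019-9021-9023-9025-9027-9031-9032-9033-9034"
)

GROUP_CONCAT_MAX_LEN = 1024  # server setting reported in the question

observed_len = len(truncated)           # length of the truncated output
one_third = GROUP_CONCAT_MAX_LEN // 3   # integer third of the limit

print(observed_len, one_third)
```

Since the full APN list needs more than 341 characters, raising group_concat_max_len so that a third of it still covers the longest expected list is consistent with the fix suggested above.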
Related
MySQL - group_concat pulling in additional incorrect data
I'm having trouble with a JOIN and a GROUP_CONCAT. The query is concatenating additional data that should not be associated with the join.
Here's my table structure:
linkages
ID  table_name     tag_id
1   subcategories  6
2   categories     9
music
ID  artwork
1   5
2   4
artwork
ID  url_path
1   /some/file/path
2   /some/file/path
And here's my query:
SELECT
music.*,
artwork.url_path AS artwork_url_path,
GROUP_CONCAT( linkages.tag_id ) AS tag_ids,
GROUP_CONCAT( linkages.table_name ) AS table_name
FROM music
LEFT JOIN artwork ON artwork.id = music.artwork
LEFT JOIN linkages ON music.id = linkages.track_id
WHERE music.id IN( '1356', '1357', '719', '169', '170', '171', '805' )
ORDER BY FIELD( music.id, 1356, 1357, 719, 169, 170, 171, 805 )
This is the result of the GROUP_CONCAT:
[tag_ids] => 3, 6, 9, 17, 19, 20, 26, 49, 63, 64, 53, 57, 63, 65, 67, 73, 79, 80, 85, 96, 98, 11, 53, 67, 3, 6, 15, 17, 26, 38, 50, 63, 74, 53, 56, 57, 62, 63, 65, 66, 67, 72, 85, 88, 98, 24, 69, 71, 3, 6, 15, 17, 26, 38, 50
The first portion of the result is correct:
[tag_ids] => 3, 6, 9, 17, 19, 20, 26, 49, 63, 64, 53, 57, 63, 65, 67, 73, 79, 80, 85, 96, 98, 11, 53, 67
Everything after the correct values seems random, and most of those values don't exist for this row in the database, but the query is still pulling them in. It seems to repeat a portion of the correct result (3, 6, 15, 17: the 3, 6 and 17 are correct, but 15 shouldn't be there, and the same goes for a bunch of other numbers such as 71).
I can't use DISTINCT because I need to match up the tag_ids and table_name results as a multidimensional array from the results. Any thoughts as to why?
UPDATE: I ended up solving it with the initial push from Gordon. It needed a GROUP BY clause; otherwise it was putting every result's tag IDs in each result.
The final query ended up becoming this:
SET SESSION group_concat_max_len = 1000000;
SELECT
music.*,
artwork.url_path as artwork_url_path,
GROUP_CONCAT(linkages.tag_id, ':', linkages.table_name) as tags
FROM music
LEFT JOIN artwork ON artwork.id = music.artwork
LEFT JOIN linkages ON music.id = linkages.track_id
WHERE music.id IN('1356', '1357', '719', '169', '170', '171', '805')
GROUP BY music.id
ORDER BY FIELD(music.id,1356,1357,719,169,170,171,805);
Your join is generating duplicate rows. I would suggest that you fix the root cause of the problem. But a quick-and-dirty solution is to use group_concat(distinct):
GROUP_CONCAT(DISTINCT linkages.tag_id) as tag_ids,
GROUP_CONCAT(DISTINCT linkages.table_name) as table_name
You can put the columns in a single field using GROUP_CONCAT():
GROUP_CONCAT(DISTINCT linkages.tag_id, ':', linkages.table_name) as tags
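The effect of a missing GROUP BY can be reproduced on a tiny data set. This is a minimal sketch using Python's built-in sqlite3 module rather than MySQL, so two things are assumptions: the sample rows are made up, and SQLite's group_concat takes a single expression, which is why the columns are joined with || instead of being passed as separate arguments. The track_id column is taken from the query in the question, not from the table listing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE music (id INTEGER PRIMARY KEY, artwork INTEGER);
    CREATE TABLE linkages (id INTEGER PRIMARY KEY, table_name TEXT,
                           tag_id INTEGER, track_id INTEGER);
    INSERT INTO music VALUES (1, 5), (2, 4);
    INSERT INTO linkages (table_name, tag_id, track_id) VALUES
        ('subcategories', 6, 1),
        ('categories', 9, 1),
        ('categories', 3, 2);
""")

# Without GROUP BY the aggregate collapses every joined row into one result row,
# so tags belonging to different tracks end up in the same string.
no_group = conn.execute("""
    SELECT GROUP_CONCAT(l.tag_id || ':' || l.table_name)
    FROM music m
    LEFT JOIN linkages l ON m.id = l.track_id
""").fetchall()

# With GROUP BY each track keeps only its own tags.
grouped = conn.execute("""
    SELECT m.id, GROUP_CONCAT(l.tag_id || ':' || l.table_name)
    FROM music m
    LEFT JOIN linkages l ON m.id = l.track_id
    GROUP BY m.id
    ORDER BY m.id
""").fetchall()

# Sort the pieces because the concatenation order is not guaranteed.
all_tags = sorted(no_group[0][0].split(","))
tags_per_track = {track_id: sorted(tags.split(",")) for track_id, tags in grouped}
print(len(no_group), all_tags)
print(tags_per_track)
```

Pairing tag_id and table_name inside one concatenated value, as in the final query above, keeps the two fields aligned without relying on two separate GROUP_CONCAT lists having the same order.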
SQL: value higher than percentage of population of values
I want to calculate, per group, the value that is higher than a given percentage of the population of values. Suppose I have:
CREATE TABLE project (
id int,
event int,
val int
);
INSERT INTO project(id,event,val)
VALUES
(1, 11, 43),
(1, 12, 19),
(1, 13, 19),
(1, 14, 53),
(1, 15, 45),
(1, 16, 35),
(2, 21, 22),
(2, 22, 30),
(2, 23, 25),
(2, 24, 28);
I now want to calculate, for each id, the val that is higher than, for example, 5% or 30% of the val values for that id.
For example, for id=1 we have the following values: 43, 19, 19, 53, 45, 35. So the contingency table would look like this:
val    19  35  43  45  53
count  2   1   1   1   1
and val=20 (higher than 19) would be chosen as being higher than 5% (actually 2 out of 6) of the rows.
The contingency table for id 2 is:
val    22  25  28  30
count  1   1   1   1
My expected output is:
id  val_5p_coverage  val_50p_coverage
1   20               36
2   23               26
val_5p_coverage is the val needed to be above at least 5% of the val values for the id.
val_50p_coverage is the val needed to be above at least 50% of the val values for the id.
How can I calculate this with SQL?
I managed to do it in HiveQL (for Hadoop) as follows:
create table prep as
select *,
CUME_DIST() OVER(PARTITION BY id ORDER BY val ASC) as proportion_val_equal_or_lower
from project;
SELECT id,
MIN(IF(proportion_val_equal_or_lower>=0.05, val, NULL)) AS val_5p_coverage,
MIN(IF(proportion_val_equal_or_lower>=0.50, val, NULL)) AS val_50p_coverage
FROM prep
GROUP BY id;
Although this is neither MySQL nor standard SQL, it might help to do it in MySQL or SQL.
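The same CUME_DIST idea can be tried outside Hive. The following is a sketch under stated assumptions: it runs the query through Python's sqlite3 module (which needs a SQLite build with window-function support, 3.25 or later), replaces Hive's IF with a standard CASE expression, and adds 1 to the selected val. The "+ 1" is an assumption based on the question's example, where val is an integer and 20 is chosen as the value just above 19; it is not part of the HiveQL query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE project (id INT, event INT, val INT);
    INSERT INTO project (id, event, val) VALUES
        (1, 11, 43), (1, 12, 19), (1, 13, 19), (1, 14, 53), (1, 15, 45), (1, 16, 35),
        (2, 21, 22), (2, 22, 30), (2, 23, 25), (2, 24, 28);
""")

rows = conn.execute("""
    WITH prep AS (
        -- share of rows in the same id whose val is <= the current val
        SELECT id, val,
               CUME_DIST() OVER (PARTITION BY id ORDER BY val) AS proportion
        FROM project
    )
    SELECT id,
           -- smallest val covering the threshold, plus 1 to be strictly above it
           MIN(CASE WHEN proportion >= 0.05 THEN val END) + 1 AS val_5p_coverage,
           MIN(CASE WHEN proportion >= 0.50 THEN val END) + 1 AS val_50p_coverage
    FROM prep
    GROUP BY id
    ORDER BY id
""").fetchall()

print(rows)
```

With the sample data this reproduces the expected output from the question (20 and 36 for id 1, 23 and 26 for id 2). Without the "+ 1" the query returns the covering values themselves (19, 35, 22, 25).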
Conversion failed when converting the varchar value '10, 11, 12, 60, 61, 111, 143, 144' to data type int
Conversion failed when converting the varchar value '10, 11, 12, 60, 61, 111, 143, 144' to data type int.
Master Details is a varchar column containing the value 'list user, add user,list master,add master', and ID is an int. The User Access column is a varchar containing the value '10, 11, 12, 60, 61, 111, 143, 144'.
Select Master Details from Master information where ID IN (SELECT User Access FROM User Access Details where User ID = 22)
Make sure that the columns have the same type in both tables. Otherwise, use CAST or CONVERT in your query to make the types match. Also, your inner query should select a single column, and any column or table name that contains spaces must be wrapped in []:
SELECT [User Access] FROM [User Access Details] WHERE [User ID] = 22
See the MSDN documentation for CAST and CONVERT.
Blank parameter with multivalue parameter returns nothing
I have three parameters (#person_id, #Person_name, #Supervisor_name); all have the Allow Multiple Values and Allow Blank Value properties enabled. The columns of the report are Person_id, Person_name, Supervisor_name, Claims_done, average_claims_perday, created from a dataset table with the same columns.
The dataset that returns the data has this filter in its query:
where #person_id in (#person_id) or [PersonName] in (#Person_name) or Supervisor_name in (#supervisor_name)
The requirement is that if any of the three parameters is blank, the query should return results based on the parameters that do have values selected.
For example, the dataset produces the following result:
11, abc, john, 12, 3
22, def, john, 345, 9
33, ghi, bryan, 89, 7
44, jkl, bryan, 45, 6
55, mno, bryan, 60, 7
If I select #Person_name = 'mno' and #Supervisor_name = 'John' and keep #person_id blank, it should give the result:
11, abc, john, 12, 3
22, def, john, 345, 9
55, mno, bryan, 60, 7
If I select #person_id = 11, 44 and #Supervisorname = 'John', and leave #Person_name blank, it should give the result:
11, abc, john, 12, 3
22, def, john, 345, 9
44, jkl, bryan, 45, 6
When I leave any of the parameters blank, the report doesn't show anything. If I select at least one value for every parameter, it gives the correct result. Any help is appreciated.
If I understand correctly, your requirements for handling parameters can be rephrased as: if a parameter is set, then filter on it; otherwise don't filter on it. If that is correct, change the where clause to something like this:
WHERE (Person_id in (#person_id) OR #person_id = '')
AND (PersonName in (#Person_name) OR #Person_name = '')
AND (Supervisor_name in (#supervisor_name) OR #supervisor_name = '')
This means each parameter either has to be satisfied or has to be blank.
Add up a points column based on an id from within the same mysql table
OK, the database is laid out as follows (only the columns being used are listed):
Table Name: race_stats
Columns: race_id, user_id, points, tournament_id
Table Name: user
Columns: user_id, driver
Table Name: race
Columns: race_id, race_name
Table Name: tournament
Columns: tournament_id, tournament_name
This is my current query:
$query = "
SELECT user.user_id, user.driver, race_stats.points, race_stats.user_id, SUM(race_stats.points) AS total_points "."
FROM user, race_stats, tournament, race "."
WHERE race.race_id=race_stats.race_id AND user.user_id=race_stats.user_id AND tournament.tournament_id=race_stats.tournament_id
GROUP BY driver
ORDER BY total_points DESC
LIMIT 0, 15
";
OK, the query works, but it adds up the points from all the available races in the race_stats.race_id column as the total points. I have racked my brain beyond recognition to fix this, but I just can't seem to find the solution I need. I'm sure it has to be an easy fix, but I just can't get it. Any help is greatly appreciated.
///////////////////EDITED WITH RAW VALUES//////////////////////
INSERT INTO `race_stats` (`id_race`, `race_id`, `user_id`, `f`, `s`, `race_interval`, `race_laps`, `led`, `points`, `total_points`, `race_status`, `tournament_id`, `driver`, `tournament_name`) VALUES
(1, 1, 4, 1, 4, '135.878', 60, '2', 180, 0, 'Running', 1, 'new_driver_5', ''),
(2, 1, 2, 2, 2, '-0.08', 60, '22', 175, 0, 'Running', 1, 'new_driver_38', ''),
(3, 1, 5, 3, 5, '-11.82', 60, '2', 170, 0, 'Running', 1, 'new_driver_94', ''),
(4, 2, 2, 1, 15, '138.691', 29, '6', 180, 0, 'Running', 2, 'new_driver_38', ''),
(5, 2, 15, 2, 9, '-16.12', 29, '8*', 180, 0, 'Running', 2, 'new_driver_44', ''),
(6, 2, 8, 3, 11, '-2:03.48', 29, '0', 165, 0, 'Running', 2, 'new_driver_83', ''),
Let me know if this is what you meant by raw values; if not, I can get some more data for you.
Just posting the solution here for completeness:
SELECT user.driver, race_stats.race_id, SUM(race_stats.points) AS total_points "."
FROM user, race_stats "."
WHERE user.user_id=race_stats.user_id
GROUP BY user.driver, race_stats.race_id
Here's the query you want (formatted for readability):
SELECT u.driver, SUM(rs.points) AS total_points
FROM user u
LEFT JOIN race_stats rs on rs.user_id = u.user_id
GROUP BY 1;
The advantage of using an outer join (i.e. LEFT JOIN) is that drivers who have no stats still get a row, but with null as total_points.
P.S. I don't know what the "." in your query is for, so I removed it.
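To see how the two answers differ, here is a small sketch that loads the raw values from the question into an in-memory SQLite database through Python's sqlite3 module. Using SQLite instead of MySQL is an assumption made so the example is self-contained, only the columns used by the queries are created, and the driver with no stats (user_id 99) is hypothetical, added to show the LEFT JOIN behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, driver TEXT);
    CREATE TABLE race_stats (race_id INT, user_id INT, points INT, tournament_id INT);
    INSERT INTO user VALUES
        (2, 'new_driver_38'), (4, 'new_driver_5'), (5, 'new_driver_94'),
        (8, 'new_driver_83'), (15, 'new_driver_44'),
        (99, 'driver_without_stats');  -- hypothetical row, not in the question
    INSERT INTO race_stats VALUES
        (1, 4, 180, 1), (1, 2, 175, 1), (1, 5, 170, 1),
        (2, 2, 180, 2), (2, 15, 180, 2), (2, 8, 165, 2);
""")

# One row per driver: points summed over all races. The LEFT JOIN keeps
# drivers that have no race_stats rows, with NULL (None) as the total.
totals = dict(conn.execute("""
    SELECT u.driver, SUM(rs.points) AS total_points
    FROM user u
    LEFT JOIN race_stats rs ON rs.user_id = u.user_id
    GROUP BY u.driver
""").fetchall())

# One row per driver and race: adding race_id to GROUP BY stops the sum
# from spanning every race.
per_race = conn.execute("""
    SELECT u.driver, rs.race_id, SUM(rs.points) AS total_points
    FROM user u
    JOIN race_stats rs ON rs.user_id = u.user_id
    GROUP BY u.driver, rs.race_id
    ORDER BY u.driver, rs.race_id
""").fetchall()

print(totals)
print(per_race)
```

Which grouping is right depends on whether the leaderboard should show season totals or per-race results; the thread does not say, so both are shown.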