group_concat on two different tables gives duplicate result on second table - mysql

Consider three entities: student, course and subject.
The associations are:
student has_many courses,
student has_many subjects.
I want to fetch student records together with their subject names and course names, using MySQL group_concat with a left join on courses, a left join on subjects, and a group by on student_id.
The problem is that group_concat('subjects.name') as subject_names gives me duplicate subject entries, but group_concat('students.name') as student_names gives unique names.
Why?

The two left joins multiply rows: for each student you get the Cartesian product of that student's course rows and subject rows.
Example:
Student 1 has 3 courses and 2 subjects.
The joins generate 3 x 2 = 6 rows for Student 1.
Every course is paired with every subject, so each course appears twice.
Every subject is paired with every course, so each subject appears three times.
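To see the row multiplication concretely, here is a minimal sketch using Python's sqlite3 module as a stand-in for MySQL. The table and column names (students, courses, subjects, student_id, name) are assumptions, since the question does not show its schema.

```python
import sqlite3

# Hypothetical schema: the question does not show its tables, so these names are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE courses  (id INTEGER PRIMARY KEY, student_id INTEGER, name TEXT);
    CREATE TABLE subjects (id INTEGER PRIMARY KEY, student_id INTEGER, name TEXT);
    INSERT INTO students VALUES (1, 'Alice');
    INSERT INTO courses  (student_id, name) VALUES (1, 'C1'), (1, 'C2'), (1, 'C3');
    INSERT INTO subjects (student_id, name) VALUES (1, 'S1'), (1, 'S2');
""")

# The two LEFT JOINs pair every course row with every subject row.
joined_rows = conn.execute("""
    SELECT s.id, c.name, sub.name
    FROM students s
    LEFT JOIN courses  c   ON c.student_id   = s.id
    LEFT JOIN subjects sub ON sub.student_id = s.id
""").fetchall()
print(len(joined_rows))  # 3 courses x 2 subjects = 6 rows

# Aggregating over those 6 rows repeats every name.
course_concat, subject_concat = conn.execute("""
    SELECT GROUP_CONCAT(c.name), GROUP_CONCAT(sub.name)
    FROM students s
    LEFT JOIN courses  c   ON c.student_id   = s.id
    LEFT JOIN subjects sub ON sub.student_id = s.id
    GROUP BY s.id
""").fetchone()
print(sorted(course_concat.split(",")))   # each course twice
print(sorted(subject_concat.split(",")))  # each subject three times
```

The concatenation order inside GROUP_CONCAT is not guaranteed, which is why the lists are sorted before printing.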
To fix:
Option 1: Use GROUP_CONCAT(DISTINCT ...) as per MySQL docs
In MySQL, you can get the concatenated values of expression combinations. To eliminate duplicate values, use the DISTINCT clause.
Option 2: Use a UNION ALL + derived table
SELECT
Student, MAX(CourseConcat), MAX(SubjectConcat)
FROM
(
-- 2 separate SELECTs here
.. student LEFT JOIN course ...
UNION ALL
.. student LEFT JOIN subjects...
) T
GROUP BY
Student
The second option may be better, albeit more complex, because it produces fewer intermediate rows than the double join, whose row product DISTINCT then has to process.
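Both options can be sketched as follows, again with Python's sqlite3 standing in for MySQL. The schema and the aliases (student_id, course_concat, subject_concat) are assumptions made for illustration, not the asker's actual tables.

```python
import sqlite3

# Hypothetical schema and data, assumed for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE courses  (id INTEGER PRIMARY KEY, student_id INTEGER, name TEXT);
    CREATE TABLE subjects (id INTEGER PRIMARY KEY, student_id INTEGER, name TEXT);
    INSERT INTO students VALUES (1, 'Alice');
    INSERT INTO courses  (student_id, name) VALUES (1, 'C1'), (1, 'C2'), (1, 'C3');
    INSERT INTO subjects (student_id, name) VALUES (1, 'S1'), (1, 'S2');
""")

# Option 1: keep the double join, let DISTINCT collapse the repeats.
opt1 = conn.execute("""
    SELECT s.id, GROUP_CONCAT(DISTINCT c.name), GROUP_CONCAT(DISTINCT sub.name)
    FROM students s
    LEFT JOIN courses  c   ON c.student_id   = s.id
    LEFT JOIN subjects sub ON sub.student_id = s.id
    GROUP BY s.id
""").fetchone()

# Option 2: aggregate each child table separately, stack the two results
# with UNION ALL, then fold them into one row per student with MAX
# (MAX ignores the NULL placeholder columns).
opt2 = conn.execute("""
    SELECT T.student_id, MAX(T.course_concat), MAX(T.subject_concat)
    FROM (
        SELECT s.id AS student_id,
               GROUP_CONCAT(c.name) AS course_concat,
               NULL AS subject_concat
        FROM students s LEFT JOIN courses c ON c.student_id = s.id
        GROUP BY s.id
        UNION ALL
        SELECT s.id, NULL, GROUP_CONCAT(sub.name)
        FROM students s LEFT JOIN subjects sub ON sub.student_id = s.id
        GROUP BY s.id
    ) T
    GROUP BY T.student_id
""").fetchone()

print(sorted(opt1[1].split(",")), sorted(opt1[2].split(",")))
print(sorted(opt2[1].split(",")), sorted(opt2[2].split(",")))
```

Note that DISTINCT would also collapse two courses that genuinely share a name, whereas the derived-table version keeps them, because it never multiplies the rows in the first place.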

Following your logic, group_concat('subjects.name') as subject_names gives you duplicate entries because there is possibly more than one subject for each student, so you get a repeated row for every matching record in the subject table, while group_concat('students.name') as student_names (I presume) has one record per student.

I know I'm probably taking this a bit off topic, but Google searches have led me here several times already, so I'd like to share my solution to a similar but slightly more complicated problem.
The GROUP_CONCAT(DISTINCT ...) solution that gbn pointed out is great until you actually have multiple equal values, or almost-equal values such as á and a, that you need to keep.
I left the DISTINCT keyword out of the query and solved the problem in PHP. If you only need to distinguish á from a, a simple array_unique will do the trick.
Unfortunately I was not so lucky: I also had exactly equal values that I needed to keep. Consider these sample values, taken from the query's group_concat field and exploded into an array:
$values = array( 'Value1','Value1','Value2','Value2','Value2','Value2' );
Next, work out how many times each value has been duplicated. I did the following:
$x = 0;
$first = reset($values);
// Count the leading run of values equal to the first one
while (isset($values[$x]) && $values[$x] === $first) $x++;
The above works only if your real first and second values are never the same, which was true in my case. If that's not the case for you, find some other way to determine the duplication factor.
Finally, unset all the extra values with the help of modulo:
foreach($values as $k => $v){
if($k%$x !== 0) unset($values[$k]);
}
That's it. Printing $values now gives you:
Array
(
[0] => Value1
[2] => Value2
[4] => Value2
)
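For readers who post-process outside PHP, the same idea can be sketched in Python. This is a port written for illustration, not the answerer's code, and it carries the same assumption: the real first and second values are never equal, so the length of the leading run is the duplication factor.

```python
values = ["Value1", "Value1", "Value2", "Value2", "Value2", "Value2"]

# Length of the leading run of equal values = duplication factor.
factor = 0
while factor < len(values) and values[factor] == values[0]:
    factor += 1

# Keep every factor-th element; guard against an empty input list.
deduplicated = values[::factor] if factor else []
print(deduplicated)  # ['Value1', 'Value2', 'Value2']
```

Unlike the PHP version, slicing renumbers the result from 0 instead of leaving gaps in the keys.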

Related

Match data from two tables and return results where something matchs

Update 4/25/13 6:25AM: I am using MyISAM
I have searched a lot and am not sure of the best way to do this. I have two tables that have matching values in different columns, and I need to return all rows that satisfy a where clause.
Table 1 name: agent
Relevant column names: agent_name and team
Table 2 name: poll_data
Relevant column names: agent and duid
I want to count how many poll results I get from each team, but I need to somehow bring the team from the agent table into poll_data by matching agent.agent_name to poll_data.agent, so I can return only the data for that team. How can I match the records and then search them in a single query?
Try this:
$query1="SELECT agent.team, COUNT(*) AS poll_count FROM poll_data JOIN agent ON (poll_data.agent = agent.agent_name) GROUP BY agent.team";
You should also normalize the database by using a foreign key.
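A quick way to check the join-and-count idea is to run it against a few sample rows. The sketch below uses Python's sqlite3 as a stand-in for MySQL; the table and column names come from the question, while the sample data is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE agent     (agent_name TEXT, team TEXT);
    CREATE TABLE poll_data (agent TEXT, duid INTEGER);
    -- Made-up sample rows
    INSERT INTO agent VALUES ('ann', 'Red'), ('bob', 'Red'), ('cat', 'Blue');
    INSERT INTO poll_data VALUES ('ann', 1), ('ann', 2), ('bob', 3), ('cat', 4);
""")

# Attach each poll row to its agent's team, then count rows per team.
rows = conn.execute("""
    SELECT agent.team, COUNT(*) AS poll_count
    FROM poll_data
    JOIN agent ON poll_data.agent = agent.agent_name
    GROUP BY agent.team
""").fetchall()
polls_per_team = dict(rows)
print(polls_per_team)

# Restricting to a single team only needs a WHERE on the joined column.
red_count = conn.execute("""
    SELECT COUNT(*)
    FROM poll_data
    JOIN agent ON poll_data.agent = agent.agent_name
    WHERE agent.team = ?
""", ("Red",)).fetchone()[0]
print(red_count)
```

Selecting agent.team alongside the count is what lets you tell the per-team totals apart in the result.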

Find first, second, third, and so forth record per person

I have a 1 to many relationship between people and notes about them. There can be 0 or more notes per person.
I need to bring all the notes together into a single field. Since there are not going to be many people with notes, and I plan to bring in only the first 3 notes per person, I thought I could do this with at most 3 queries.
My problem is in getting the MySQL query together to fetch the first, second, etc. note per person.
I have a query that tells me how many notes each person has, and I store that count in my person table. I tried something like
SELECT
f_note, f_person_id
FROM
t_person_table,
t_note_table
WHERE
t_person_table.f_number_of_notes > 0
AND t_person_table.f_person_id = t_note_table.f_person_id
GROUP BY
t_person_table.f_person_id
LIMIT 1 OFFSET 0
I had hoped to run this up to 3 times changing the OFFSET to 1 and then 2 but all I get is just one note coming back, not one note per person.
I hope this is clear, if not read on for an example:
I have 3 people in the table. One person (A) has 0 notes, one (B) with 1 and one (C) with 2.
First I would get the first note for person B and C and insert those into my person table note field.
Then I would get the second note for person C and add that to the note field in the person table.
In the end I would have notes for persons B and C, where the note field for person C would be a concatenation of their 2 notes.
Welcome to SO. The thing you're trying to do, selecting the three most recent items from a table for each person mentioned, is not easy in MySQL. But it is possible. See this question, and my answer to it:
Select number of rows for each group where two column values makes one group
Once you have a query giving you the three rows, you can use GROUP_CONCAT() ... GROUP BY to aggregate the note fields.
You can get one note per person using a nested query like this:
SELECT
f_person_id,
(SELECT f_note
FROM t_note_table
WHERE t_person_table.f_person_id = t_note_table.f_person_id
LIMIT 1) AS note
FROM
t_person_table
WHERE
t_person_table.f_number_of_notes > 0
Note that tables in SQL have no defined inherent order, so you should use some form of ORDER BY in the subquery. Otherwise your results might be arbitrary, and repeated runs asking for different notes might unexpectedly return the same data.
If you only aim for a concatenation of notes in any case, then you can use the GROUP_CONCAT function to combine all notes into a single column.
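Here is a rough sketch of the nested query with an explicit ORDER BY and a varying OFFSET, run through Python's sqlite3 as a stand-in for MySQL. The f_note_id column used for ordering is an assumption (the question does not say what defines "first"), and the helper name nth_note_per_person is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t_person_table (f_person_id INTEGER PRIMARY KEY,
                                 f_number_of_notes INTEGER);
    -- f_note_id is an assumed ordering column, not shown in the question
    CREATE TABLE t_note_table (f_note_id INTEGER PRIMARY KEY,
                               f_person_id INTEGER, f_note TEXT);
    -- Persons A (1), B (2), C (3) with 0, 1 and 2 notes
    INSERT INTO t_person_table VALUES (1, 0), (2, 1), (3, 2);
    INSERT INTO t_note_table VALUES (1, 2, 'B first'),
                                    (2, 3, 'C first'),
                                    (3, 3, 'C second');
""")

def nth_note_per_person(conn, n):
    """Return (person_id, note) for the n-th note (0-based) of each person who has one."""
    return conn.execute("""
        SELECT p.f_person_id,
               (SELECT t.f_note
                FROM t_note_table t
                WHERE t.f_person_id = p.f_person_id
                ORDER BY t.f_note_id
                LIMIT 1 OFFSET ?) AS note
        FROM t_person_table p
        WHERE p.f_number_of_notes > ?
        ORDER BY p.f_person_id
    """, (n, n)).fetchall()

# Run the query for the first, second and third note and collect per person.
notes_by_person = {}
for n in range(3):
    for person_id, note in nth_note_per_person(conn, n):
        notes_by_person.setdefault(person_id, []).append(note)

print(notes_by_person)  # {2: ['B first'], 3: ['C first', 'C second']}
```

The ORDER BY inside the subquery is what makes OFFSET 0, 1 and 2 return different notes on each pass rather than an arbitrary one.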

Query MySQL for rows that share a value, and returning them as columns?

This is for a homework assignment. I haven't copy-pasted the question below; I made a simpler version of it that focuses on the specific area where I'm stuck.
Let's say I have a table of two values: a person's name, and the place he had lunch yesterday. Assume everyone has lunch in pairs. How can I query the database to return all the pairs of people that had lunch together yesterday? Each pair must be only listed once.
I'm actually not even sure what the professor means by return them as pairs. I've sent him an email, but no reply yet. It seems like he wants me to write a query that returns a table with column 1 as person 1 and column 2 as person 2.
Any suggestions on how to go about this? Does it seem right to assume he wants them as separate columns?
So far, I basically have:
SELECT name, restaurant FROM lunches GROUP BY restaurant, name
which essentially just reorganizes the table so that the people who had lunch together are one after the other.
We have to assume there can be only one pair eating lunch in a given restaurant.
You can get a list of pairs either using self-join:
SELECT l1.name, l2.name FROM lunches l1
JOIN lunches l2
ON l1.restaurant = l2.restaurant AND l1.name < l2.name
or using GROUP BY:
SELECT GROUP_CONCAT(name) FROM lunches
GROUP BY restaurant
The first query returns each pair in two separate columns, while the second returns it in a single column, using a comma as the separator (the default for GROUP_CONCAT; you can change it to whatever you wish).
Also note that in the first query the names in each pair come in alphabetical order, because we use < instead of <> to avoid listing each pair twice.
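Both queries can be tried on a few sample rows. The sketch below uses Python's sqlite3 as a stand-in for MySQL; the lunches table follows the question, and the names and restaurants are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lunches (name TEXT, restaurant TEXT);
    -- Made-up sample rows: one pair per restaurant
    INSERT INTO lunches VALUES ('Bob', 'Diner'), ('Ann', 'Diner'),
                               ('Dan', 'Cafe'),  ('Cat', 'Cafe');
""")

# Self-join: l1.name < l2.name keeps each pair once, smaller name first.
pairs = conn.execute("""
    SELECT l1.name, l2.name
    FROM lunches l1
    JOIN lunches l2
      ON l1.restaurant = l2.restaurant AND l1.name < l2.name
    ORDER BY l1.name
""").fetchall()
print(pairs)  # [('Ann', 'Bob'), ('Cat', 'Dan')]

# GROUP BY: one comma-separated string per restaurant.
grouped = conn.execute("""
    SELECT restaurant, GROUP_CONCAT(name)
    FROM lunches
    GROUP BY restaurant
""").fetchall()
# Order inside GROUP_CONCAT is not guaranteed, so sort the names after splitting.
pairs_by_restaurant = {r: sorted(names.split(",")) for r, names in grouped}
print(pairs_by_restaurant)
```

With <> instead of < the self-join would return every pair twice, once in each order.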

Selecting multiple rows based on specific categories (mysql)

I don't think this is a duplicate posting, because I've looked around and this seems a bit more specific than what's already been asked (but I could be wrong).
I have 4 tables and one of them is just a lookup table
SELECT exercises.id as exid, name, sets, reps, type, movement, categories.id
FROM exercises
INNER JOIN exercisecategory ON exercises.id = exerciseid
INNER JOIN categories ON categoryid = categories.id
INNER JOIN workoutcategory ON workoutid = workoutcategory.id
WHERE (workoutcategory.id = '$workouttypeid')
AND rand_id > UNIX_TIMESTAMP()
ORDER BY rand_id ASC LIMIT 6;
exercises table contains a list of exercise names, sets, reps, and an id
categories table contains an id, musclegroup, and type of movement
workoutcategory table contains an id, and a more specific motion (ie: upper body push, or upper body pull)
exercisecategory table is the lookup table that contains (and matches the id's) for exerciseid, categoryid, and workoutid
I've also added a column to the exercises table that generates a random number upon entering the row in the database. This number is then updated only for the specified category when it is called, and then sorted and displays the ascending order of the top 6 listings. This generates a nice random entry for me. (Found that solution elsewhere here on SO).
This works fine for generating 6 random exercises from a specific top level category. But I'd like to drill down further. Here's an example...
select all rows inside categoryid 4
then still within the category 4 results, find all that have movementid 2, and then find one entry with a typeid 1, then another for typeid 2, etc
TL;DR: Basically there are a few levels of categories, and I'm looking to select a few from here and a few from there, all within this top level. I'm thinking this could be done with more than one query, but I'm not sure how. In the end I'm looking to end up with one array of the randomized entries.
Sorry for the long read, it's the best explanation I've got.
Just realized I never came back to this posting...
I ended up using several mysql queries within a switch based on what is needed during the request. Worked out perfectly.

Find an invalid record ID in a list of potential ID's

Given a list of potential IDs, is there a quick way, using a single MySQL query, to work out which, if any, of the IDs do not have an associated record in the database?
E.g. if the list is 1,3,4 and records exist for 1 and 4 but not 3, is there a query that will return 3 from the list?
This needs to be applied to a database containing 15000 records checking against a list of 1 to 100 IDs which may contain zero or more invalid IDs. The list is sourced externally and not in another table.
-- Assumes the candidate IDs have first been loaded into idtable
SELECT idtable.id FROM idtable
LEFT JOIN records ON idtable.id = records.id
WHERE records.id IS NULL
Sorry about my previous answer - my bad, I read the question too quickly.
There's no clean way to do this in pure SQL (clean meaning not involving ugly subqueries, unions, or temp tables). If it were me, I'd probably do it something like this (assuming PHP):
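$all_ids = array(1, 3, 4);
// Cast to int so the externally sourced list cannot inject SQL
$id_list = implode(',', array_map('intval', $all_ids));
$query = "SELECT id FROM table1 WHERE id IN ($id_list)";
// getArrayFromQuery() stands for your own helper that returns the id column as an array
$found_ids = getArrayFromQuery($query);
$invalid_ids = array_diff($all_ids, $found_ids);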
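For comparison, here is a sketch of both approaches in Python with sqlite3 as a stand-in for MySQL: the fetch-and-diff idea above, and a single query that builds the candidate list as a UNION ALL derived table and left-joins it to the data. The records table name and the sample IDs are assumptions, and the candidate list is assumed to be non-empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE records (id INTEGER PRIMARY KEY);
    INSERT INTO records VALUES (1), (4);
""")

candidate_ids = [1, 3, 4]  # externally sourced list, assumed non-empty

# Approach 1: fetch the IDs that exist, diff in application code.
placeholders = ",".join("?" * len(candidate_ids))
found = {row[0] for row in conn.execute(
    f"SELECT id FROM records WHERE id IN ({placeholders})", candidate_ids)}
missing_via_diff = [i for i in candidate_ids if i not in found]

# Approach 2: one query. Build the list as a derived table, LEFT JOIN it,
# and keep the candidates that found no matching record.
derived = " UNION ALL ".join(["SELECT ? AS id"] * len(candidate_ids))
missing_via_join = [row[0] for row in conn.execute(f"""
    SELECT c.id
    FROM ({derived}) c
    LEFT JOIN records r ON r.id = c.id
    WHERE r.id IS NULL
""", candidate_ids)]

print(missing_via_diff, missing_via_join)  # [3] [3]
```

Only the placeholder markers are interpolated into the SQL text; the IDs themselves are passed as bound parameters, so the external list never becomes part of the statement.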