Hi, I have a table called Engineer and a table called Post_Code.
When I use the following SQL, I get a list of engineers and the post codes associated with them via GROUP_CONCAT. What I cannot figure out is how to add another field, Secondary_Post_Codes_Assigned (using a second GROUP_CONCAT, if indeed I need one), that lists the post codes linked to the same engineer via the Secondary_Engineer_id field.
SELECT
Engineer.Engineer,GROUP_CONCAT(Post_Code SEPARATOR ', ') as Post_Codes_Assigned,
Engineer.Region,
Engineer.active,
Engineer.Engineer_id
FROM Engineer INNER JOIN Post_Code ON Engineer.Engineer_id = Post_Code.Engineer_id
GROUP BY Engineer_id
What I need is output similar to this.
Engineer_id | Post_Codes_Assigned | Secondary_Post_Codes_Assigned
------------|---------------------|------------------------------
1           | AW, AW3             | B12
2           | B12                 | AW, CV12
I hope this is clear, as I am pretty new to MySQL.
Regards
Alan
You are already joining the primary post codes and listing them; now do the same with the secondary ones.
SELECT
e.Engineer,
GROUP_CONCAT(DISTINCT pc1.Post_Code) AS Primary_Post_Codes_Assigned,
GROUP_CONCAT(DISTINCT pc2.Post_Code) AS Secondary_Post_Codes_Assigned,
e.Region,
e.active,
e.Engineer_id
FROM Engineer e
JOIN Post_Code pc1 ON e.Engineer_id = pc1.Engineer_id
JOIN Post_Code pc2 ON e.Engineer_id = pc2.Secondary_Engineer_id
GROUP BY e.Engineer_id;
As you can see, you need DISTINCT: joining all primary and all secondary post codes produces one row for every combination of them in the intermediate result, so each post code shows up several times and the duplicates must be removed. For this reason it is better to aggregate before joining. (I generally consider that a good idea, so you may want to make it a habit when working with aggregates.)
SELECT
e.Engineer,
pc1.Post_Codes AS Primary_Post_Codes_Assigned,
pc2.Post_Codes AS Secondary_Post_Codes_Assigned,
e.Region,
e.active,
e.Engineer_id
FROM Engineer e
JOIN
(
SELECT Engineer_id, GROUP_CONCAT(Post_Code) AS Post_Codes
FROM Post_Code
GROUP BY Engineer_id
) pc1 ON e.Engineer_id = pc1.Engineer_id
JOIN
(
SELECT Secondary_Engineer_id, GROUP_CONCAT(Post_Code) AS Post_Codes
FROM Post_Code
GROUP BY Secondary_Engineer_id
) pc2 ON e.Engineer_id = pc2.Secondary_Engineer_id;
A third option would be subqueries in the SELECT clause. I usually prefer them to be in the FROM clause as shown, because then it is easy to add more columns to the subqueries, which is not possible in the SELECT clause.
SELECT
e.Engineer,
(
SELECT GROUP_CONCAT(pc1.Post_Code)
FROM Post_Code pc1
WHERE pc1.Engineer_id = e.Engineer_id
) AS Primary_Post_Codes_Assigned,
(
SELECT GROUP_CONCAT(pc2.Post_Code)
FROM Post_Code pc2
WHERE pc2.Secondary_Engineer_id = e.Engineer_id
) AS Secondary_Post_Codes_Assigned,
e.Region,
e.active,
e.Engineer_id
FROM Engineer e;
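One difference between the variants is worth spelling out: the two joined queries use inner joins, so an engineer without any primary or without any secondary post codes disappears from the result, while the SELECT-clause subqueries return every engineer. If the joined form is preferred but all engineers should be listed, outer joins are one possibility. The following is only a sketch; it assumes the table and column names from the question and reuses the ', ' separator from the original query.

```sql
-- Sketch: pre-aggregate, then LEFT JOIN so that engineers without
-- primary or secondary post codes are still returned.
-- Assumes Engineer(Engineer_id, Engineer, Region, active) and
-- Post_Code(Post_Code, Engineer_id, Secondary_Engineer_id) as in the question.
SELECT
  e.Engineer_id,
  e.Engineer,
  pc1.Post_Codes AS Post_Codes_Assigned,            -- NULL when there are none
  pc2.Post_Codes AS Secondary_Post_Codes_Assigned   -- NULL when there are none
FROM Engineer e
LEFT JOIN
(
  SELECT Engineer_id,
         GROUP_CONCAT(Post_Code ORDER BY Post_Code SEPARATOR ', ') AS Post_Codes
  FROM Post_Code
  GROUP BY Engineer_id
) pc1 ON pc1.Engineer_id = e.Engineer_id
LEFT JOIN
(
  SELECT Secondary_Engineer_id,
         GROUP_CONCAT(Post_Code ORDER BY Post_Code SEPARATOR ', ') AS Post_Codes
  FROM Post_Code
  GROUP BY Secondary_Engineer_id
) pc2 ON pc2.Secondary_Engineer_id = e.Engineer_id;
```

The ORDER BY inside GROUP_CONCAT is optional; it only makes the order of the listed post codes predictable.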
I have a Profile table like this
|--------|-----------|
| People | Favorite |
|--------|-----------|
| A | Movie |
| B | Movie |
| B | Jogging |
|--------|-----------|
Q: How to retrieve the people whose favorite is movie but not jogging?
In this table, the result is only People A.
I came up with this:
select People from Profile
where
People
in
(select People from Profile
where favorite='Movie')
and
People
not in
(select People from Profile
where favorite='Jogging')
But it seems like it could be better. Any suggestions or answers (without using a JOIN or UNION clause)?
https://www.db-fiddle.com/f/rboiDpxxbABCpjtduEz7uY/1
SELECT People
FROM `profile`
GROUP BY people
HAVING SUM('Movie' = favorite) > 0
AND SUM('Jogging' = favorite) = 0
There are lots of ways. While you can use a UNION, it's rather messy and inefficient. MySQL doesn't have a MINUS clause, which would give a fairly easy-to-understand query.
You could aggregate the data:
SELECT people
, MAX(IF(favorite='jogging', 1, 0)) as jogging
, MAX(IF(favorite='movie', 1, 0)) as movie
FROM profile
GROUP BY people
HAVING movie=1 AND jogging=0
Or use an outer join:
SELECT m.people
FROM profile m
LEFT JOIN
( SELECT j.people
FROM profile j
WHERE j.favorite='jogging' ) joggers
ON m.people=joggers.people
WHERE joggers.people IS NULL
AND m.favorite='movie'
Using NOT IN or NOT EXISTS gives clearer syntax but again would be very inefficient.
There are several query patterns that will return a result that satisfies the specification.
We can use NOT EXISTS with a correlated subquery:
SELECT p.people
FROM profile p
WHERE p.favorite = 'Movie'
AND NOT EXISTS ( SELECT 1
FROM profile q
WHERE q.favorite = 'Jogging'
AND q.people = p.people /* related to row in outer query */
)
ORDER
BY p.people
An equivalent result can also be obtained with an anti-join pattern:
SELECT p.people
FROM profile p
LEFT
JOIN profile q
ON q.people = p.people
AND q.favorite = 'Jogging'
WHERE q.people IS NULL
AND p.favorite = 'Movie'
ORDER BY p.people
Another option is conditional aggregation. Without a guarantee about uniqueness, and some MySQL shorthand:
SELECT p.people
FROM profile p
GROUP
BY p.people
HAVING 1 = MAX(p.favorite='Movie')
AND 0 = MAX(p.favorite='Jogging')
A more portable, more ANSI-standard-compliant syntax for the conditional aggregation:
SELECT p.people
FROM profile p
GROUP
BY p.people
HAVING 1 = MAX(CASE p.favorite WHEN 'Movie' THEN 1 ELSE 0 END)
AND 0 = MAX(CASE p.favorite WHEN 'Jogging' THEN 1 ELSE 0 END)
This is a common problem when you want to have multiple conditions with the same column. I have answered this here and there are other methods like intersect and subqueries.
SELECT people, GROUP_CONCAT(favorite) as fav
FROM profile
GROUP BY people
HAVING fav REGEXP 'Movie'
AND NOT fav REGEXP 'Jogging';
With group by people and checking the minimum and maximum values of favorite to be 'Movie':
select people from tablename
where favorite in ('Movie', 'Jogging')
group by people
having min(favorite) = 'Movie' and max(favorite) = 'Movie'
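To compare the approaches above, it can help to run them against the sample data from the question. The following setup is a minimal sketch: the column types are assumptions (the question does not give them), and the table is created in lowercase as profile, as most of the answers spell it.

```sql
-- Hypothetical setup reproducing the sample data from the question.
-- Column types are assumed; adjust them to the real schema.
CREATE TABLE profile (
  People   VARCHAR(10) NOT NULL,
  Favorite VARCHAR(20) NOT NULL
);

INSERT INTO profile (People, Favorite) VALUES
  ('A', 'Movie'),
  ('B', 'Movie'),
  ('B', 'Jogging');

-- Conditional aggregation: at least one 'Movie' row and no 'Jogging' row.
SELECT People
FROM profile
GROUP BY People
HAVING SUM(Favorite = 'Movie') > 0
   AND SUM(Favorite = 'Jogging') = 0;
-- returns the single row: A
```

Person B is excluded because the group for B contains a 'Jogging' row, so the second SUM is 1 rather than 0.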
The MySQL query is:
SELECT orders.scientific_name AS 'Order',
families.scientific_name AS 'Family',
COUNT(*) AS 'Number of Birds'
FROM birds, bird_families AS families, bird_orders AS orders
WHERE birds.family_id = families.family_id
AND families.order_id = orders.order_id
AND orders.scientific_name = 'Pelecaniformes';
The Output is:
+----------------+-------------+-----------------+
| Order | Family | Number of Birds |
+----------------+-------------+-----------------+
| Pelecaniformes | Pelecanidae | 224 |
+----------------+-------------+-----------------+
But I have 5 Families in the DB. Why did it return only one?
You are using COUNT(*) which turns this into an aggregation query. Without a GROUP BY, this returns exactly one row.
I would recommend getting started by:
Removing the COUNT(*).
Replacing the commas with explicit JOIN syntax.
Using table aliases.
Not using single quotes for column aliases.
Then work toward the query you really want to write. So, to get started:
SELECT o.scientific_name AS `Order`,
bf.scientific_name AS Family
FROM birds b JOIN
bird_families bf
ON b.family_id = bf.family_id JOIN
bird_orders o
ON bf.order_id = o.order_id
WHERE o.scientific_name = 'Pelecaniformes';
At this point, you can probably add the COUNT(*) and GROUP BY o.scientific_name, bf.scientific_name.
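Putting those steps together, the finished query might look like the sketch below. It assumes the tables and columns shown in the question; the aliases b, bf and o are just illustrative choices.

```sql
-- Sketch: one row per family within the chosen order, with its bird count.
-- Assumes birds(family_id), bird_families(family_id, order_id, scientific_name)
-- and bird_orders(order_id, scientific_name) as used in the question.
SELECT o.scientific_name  AS `Order`,
       bf.scientific_name AS Family,
       COUNT(*)           AS `Number of Birds`
FROM birds b
JOIN bird_families bf ON b.family_id = bf.family_id
JOIN bird_orders o    ON bf.order_id = o.order_id
WHERE o.scientific_name = 'Pelecaniformes'
GROUP BY o.scientific_name, bf.scientific_name;
```

Because these are inner joins, a family that has no rows in birds does not appear in the result at all.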
If you have 5 families in the order where orders.scientific_name = 'Pelecaniformes',
then you should use GROUP BY:
SELECT orders.scientific_name AS 'Order',
families.scientific_name AS 'Family',
COUNT(*) AS 'Number of Birds'
FROM birds, bird_families AS families, bird_orders AS orders
WHERE birds.family_id = families.family_id
AND families.order_id = orders.order_id
AND orders.scientific_name = 'Pelecaniformes'
GROUP BY orders.scientific_name, families.scientific_name;
I have the following tables:
matters(matterid, mattername, refno)
mattersjuncstaff(junked, matterid, staffid, lead)
staff(staffid, staffname)
A matter may have a number of staff associated with it and a number of those staff will be marked as ‘leads’ i.e. they will have a ‘Y’ in the ‘lead’ field.
I wish to show a table that has a list of matters, the matter name and ref no and those staff marked as leads, ideally in a single row. So it would look something like:
reference | mattername | Lead Staff |
ABC1 | matter abc & Co | Fred Smith, Jane Doe, Naomi Watts |
etc
I am using the code below but this only displays one person with the lead field marked Y.
SELECT refno, mattername, matters.matterid, staffname
FROM matters
INNER JOIN matterjuncstaff
USING (matterid)
Inner join staff
using (staffid)
Inner join matterjuncactions
On matterjuncactions.matterid = matters.matterid
WHERE lead = 'Y'
GROUP BY matters.matterid, nickname
Can anyone tell me how I can get round this?
You want to concatenate values from a join and represent them as a single field in the result set. The GROUP_CONCAT function is well suited to such queries:
SELECT m.matterid, m.refno, m.mattername, GROUP_CONCAT(s.staffname) AS LeadStaff
FROM matters m
LEFT JOIN matterjuncstaff mjs ON mjs.matterid = m.matterid AND mjs.lead = 'Y'
LEFT JOIN staff s ON s.staffid = mjs.staffid
GROUP BY m.matterid, m.refno, m.mattername
The join is changed to a LEFT JOIN and the lead = 'Y' condition is moved into the ON clause; otherwise you would lose matters that have no lead staff.
Use INNER JOIN if you only want matters that have at least one lead staff member.
I have removed matterjuncactions because you did not give any information about it.
Use the GROUP_CONCAT() function in MySQL to concatenate values from a query into a single string.
For example you could select a row for each matter and append a column with all the concatenated lead staff names as follows:
SELECT m.refno,
m.mattername,
(Select GROUP_CONCAT(distinct staffname SEPARATOR ', ')
from mattersjuncstaff js
join staff s
on s.staffid = js.staffid
where js.lead = 'Y'
and js.matterid = m.matterid) as LeadStaffMembers
FROM matters m
Update
Here is the same example, but with an added column showing staff members that are not the lead.
SELECT m.refno,
m.mattername,
(Select GROUP_CONCAT(distinct staffname SEPARATOR ', ')
from mattersjuncstaff js
join staff s
on s.staffid = js.staffid
where js.lead = 'Y'
and js.matterid = m.matterid) as LeadStaffMembers,
(Select GROUP_CONCAT(distinct staffname SEPARATOR ', ')
from mattersjuncstaff js
join staff s
on s.staffid = js.staffid
where js.lead <> 'Y'
and js.matterid = m.matterid) as NonLeadStaffMembers
FROM matters m
I am having some trouble putting together a SQL statement properly because I don't have much experience with SQL, especially aggregate functions. Safe to say I don't really know what I'm doing outside of the basic SQL structure. I can do regular joins, but not complex ones.
I have some tables: 'Survey', 'Questions', 'Session', 'ParentSurvey', and 'ParentSurveyQuestion'. Structurally, a survey can have questions, it can have users that started the survey (a session), and it can have a parent survey whose questions get imported into the current survey.
What I want to do is get information for each survey in the Survey table: the total number of questions it has, how many sessions have been started (conditionally, ones that have not finished), and the number of questions in the parent survey. The three joined tables can but do not have to contain any values, and if they don't then 0 should be returned by COUNT. The common field in three of the tables is a variation of 'survey_id'.
Here is my SQL so far, I put the table structure below it.
SELECT
`kp_survey_id`,
COALESCE( q.cnt, 0 ) AS questionsAmount,
COALESCE( s.cnt, 0 ) AS sessionsAmount
COALESCE( p.cnt, 0 ) AS parentQAmount,
FROM `Survey`
LEFT JOIN <-- I'd like the count of questions for this survey
( SELECT COUNT(*) AS cnt
FROM Questions
GROUP BY kf_survey_id ) q
ON Survey.kp_survey_id = Questions.kf_survey_id
LEFT JOIN
( SELECT COUNT(*) AS cnt <-- I'd like the count of started sessions for this survey
FROM Session
WHERE session_status = 'started' <-- should this be Session.session_status?
GROUP BY kf_survey_id ) s
ON Survey.kp_survey_id = Session.kf_survey_id
LEFT JOIN
( SELECT COUNT(*) AS cnt <-- I'd like the count of questions in the parent survey with this survey id
FROM ParentSurvey
GROUP BY kp_parent_survey_id ) p
ON Survey.kf_parent_survey_id = ParentSurveyQuestion.kf_parent_survey_id
'kp' prefix means primary key, while 'kf' prefix means foreign key
Structure:
Survey: 'kp_survey_id' | 'kf_parent_survey_id'
Question: 'kp_question_id' | 'kf_survey_id'
Session: 'kp_session_id' | 'kf_survey_id' | 'session_status'
ParentSurvey: 'kp_parent_survey_id' | 'survey_name'
ParentSurveyQuestion: 'kp_parent_question_id' | 'kf_parent_survey_id'
There are also other columns in each table like 'name' or 'account_id', but i don't think they matter in this case
I'd like to know if I'm doing this correctly or if I'm missing something. I'm repurposing some code I found here on stackoverflow and modifying it to meet my needs, as I haven't seen conditional aggregation for more than three tables on this site.
My expected output is something like:
kp_survey_id | questionsAmount | sessionsAmount | parentQAmount
1 | 3 | 0 | 3
2 | 0 | 5 | 3
I think you were pretty close -- just need to fix your joins and include the survey id in the subqueries to use in those joins:
SELECT
`kp_survey_id`,
COALESCE( q.cnt, 0 ) AS questionsAmount,
COALESCE( s.cnt, 0 ) AS sessionsAmount,
COALESCE( p.cnt, 0 ) AS parentQAmount
FROM `Survey`
LEFT JOIN
( SELECT COUNT(*) AS cnt, kf_survey_id
FROM Questions
GROUP BY kf_survey_id ) q
ON Survey.kp_survey_id = q.kf_survey_id
LEFT JOIN
( SELECT COUNT(*) AS cnt, kf_survey_id
FROM Session
WHERE session_status = 'started'
GROUP BY kf_survey_id ) s
ON Survey.kp_survey_id = s.kf_survey_id
LEFT JOIN
-- the parent survey's questions are in ParentSurveyQuestion
( SELECT COUNT(*) AS cnt, kf_parent_survey_id
FROM ParentSurveyQuestion
GROUP BY kf_parent_survey_id ) p
ON Survey.kf_parent_survey_id = p.kf_parent_survey_id
One thing you need to do is correct your joins. When you are joining to a subquery, you need to use the alias of the subquery. In your case you are using the alias of the table being used in the subquery.
Another thing you need to change is to include the field you wish to use in your JOIN in the subquery.
Make these changes and try running. Do you get an error or the desired results?
SELECT
`kp_survey_id`,
COALESCE( q.cnt, 0 ) AS questionsAmount,
COALESCE( s.cnt, 0 ) AS sessionsAmount,
COALESCE( p.cnt, 0 ) AS parentQAmount
FROM `Survey`
LEFT JOIN -- count of questions for this survey
( SELECT kf_survey_id, COUNT(*) AS cnt
FROM Questions
GROUP BY kf_survey_id ) q
ON Survey.kp_survey_id = q.kf_survey_id
LEFT JOIN -- count of started sessions for this survey
( SELECT kf_survey_id, COUNT(*) AS cnt
FROM Session
WHERE session_status = 'started' -- no table qualifier needed inside the subquery
GROUP BY kf_survey_id ) s
ON Survey.kp_survey_id = s.kf_survey_id
LEFT JOIN -- count of questions in the parent survey
( SELECT kf_parent_survey_id, COUNT(*) AS cnt
FROM ParentSurveyQuestion
GROUP BY kf_parent_survey_id ) p
ON Survey.kf_parent_survey_id = p.kf_parent_survey_id
I'm having trouble getting my query to work.
SELECT itpitems.identifier, itpitems.name, itpitems.subtitle, itpitems.description, itpitems.itemimg, itpitems.mainprice, itpitems.upc, itpitems.isbn, itpitems.weight, itpitems.pages, itpitems.publisher, itpitems.medium_abbr, itpitems.medium_desc, itpitems.series_abbr, itpitems.series_desc, itpitems.voicing_desc, itpitems.pianolevel_desc, itpitems.bandgrade_desc, itpitems.category_code, itprank.overall_ranking, itpitnam.name AS artist, itpitnam.type_code FROM itpitems
INNER JOIN itprank ON (itprank.item_number = itpitems.identifier)
INNER JOIN (SELECT DISTINCT type_code FROM itpitnam) itpitnam ON (itprank.item_number = itpitnam.item_number)
WHERE mainprice > 1
LIMIT 3
I keep getting Unknown column 'itpitnam.name' in 'field list'.
However, if I change DISTINCT type_code to *, I do not get that error, but I do not get the results I want either.
This is a big result table so I am making a dummy example...
With *, I get something like:
+------------+------+-----------+
| identifier | name | type_code |
+------------+------+-----------+
| 2          | Joe  | A         |
| 2          | Amy  | R         |
| 7          | Mike | B         |
+------------+------+-----------+
The problem here is that I have two instances of identifier = 2 because the type_code is different. I have tried GROUP BY at the outside end of the query, but it is sifting through so many records it creates too much strain on the server, so I'm trying to find an alternative way of getting the results I need.
What I want to achieve (using the same dummy output) would look something like this:
+------------+------+-----------+
| identifier | name | type_code |
+------------+------+-----------+
| 2          | Joe  | A         |
| 7          | Mike | B         |
| 8          | Sam  | R         |
+------------+------+-----------+
It should skip over the duplicate identifier regardless of whether the type_code is different.
Can someone help me modify this query to get the results as simulated in the above chart?
One approach is to use an inline view, like the query you already have. But instead of using DISTINCT, you would use a GROUP BY to eliminate duplicates. The simplest inline view to satisfy your requirements would be:
( SELECT n.item_number, n.name, n.type_code
FROM itpitnam n
GROUP BY n.item_number
) itpitnam
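Note, however, that it is not deterministic which row from itpitnam supplies the values for name and type_code, and MySQL rejects the query when the ONLY_FULL_GROUP_BY SQL mode is enabled (the default since MySQL 5.7.5). A more elaborate inline view can make this more specific.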
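As an illustration of what a more specific inline view could look like, the sketch below picks the lowest type_code for each item_number and, among the rows with that type_code, the alphabetically first name. Which row should win is an assumption on my part (the question does not say), so adjust the MIN() choices as needed. Only the columns already used in the question are referenced.

```sql
-- Sketch: exactly one deterministic row per item_number.
-- Rule assumed here: lowest type_code, then alphabetically first name.
SELECT n.item_number
     , MIN(n.name) AS name
     , n.type_code
  FROM itpitnam n
  JOIN ( SELECT item_number
              , MIN(type_code) AS type_code
           FROM itpitnam
          GROUP BY item_number
       ) m
    ON m.item_number = n.item_number
   AND m.type_code   = n.type_code
 GROUP BY n.item_number, n.type_code;
```

Wrapped in parentheses and given the alias itpitnam, this statement can take the place of the simple inline view. It also satisfies ONLY_FULL_GROUP_BY, since every selected column is either grouped or aggregated.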
Another common approach to this type of problem is to use a correlated subquery in the SELECT list. For returning a small set of rows, this can perform reasonably well. But for returning large sets, there are more efficient approaches.
SELECT i.identifier
, i.name
, i.subtitle
, i.description
, i.itemimg
, i.mainprice
, i.upc
, i.isbn
, i.weight
, i.pages
, i.publisher
, i.medium_abbr
, i.medium_desc
, i.series_abbr
, i.series_desc
, i.voicing_desc
, i.pianolevel_desc
, i.bandgrade_desc
, i.category_code
, r.overall_ranking
, ( SELECT n1.name
FROM itpitnam n1
WHERE n1.item_number = r.item_number
ORDER BY n1.type_code, n1.name
LIMIT 1
) AS artist
, ( SELECT n2.type_code
FROM itpitnam n2
WHERE n2.item_number = r.item_number
ORDER BY n2.type_code, n2.name
LIMIT 1
) AS type_code
FROM itpitems i
JOIN itprank r
ON r.item_number = i.identifier
WHERE mainprice > 1
LIMIT 3
That query will return the specified resultset, with one significant difference. The original query shows an INNER JOIN to the itpitnam table, which means that a row is returned ONLY if there is a matching row in the itpitnam table. The query above, however, emulates an OUTER JOIN: it returns a row even when no matching row is found in itpitnam.
UPDATE
For best performance of those correlated subqueries, you'll want an appropriate index available,
... ON itpitnam (item_number, type_code, name)
That index is most appropriate because it's a "covering index": the query can be satisfied entirely from the index without referencing data pages in the underlying table. There's also an equality predicate on the leading column and an ORDER BY on the next two columns, which avoids a "sort" operation.
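For illustration, creating such an index and checking that a correlated subquery can use it might look like the sketch below. The index name itpitnam_ix1 and the literal 42 are made up for the example; the real item_number values (and their data type) are not shown in the question.

```sql
-- itpitnam_ix1 is a hypothetical index name; any unused name works.
CREATE INDEX itpitnam_ix1
    ON itpitnam (item_number, type_code, name);

-- Inspect the plan for one lookup; 42 is a made-up sample item_number.
EXPLAIN
SELECT n1.name
  FROM itpitnam n1
 WHERE n1.item_number = 42
 ORDER BY n1.type_code, n1.name
 LIMIT 1;
```

If the index is used as described, the Extra column of the EXPLAIN output would be expected to show "Using index" and no "Using filesort".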
--
If you have a guarantee that either the type_code or name column in the itpitnam table is NOT NULL, you can add a predicate to eliminate the rows that are "missing" a matching row, e.g.
HAVING artist IS NOT NULL
(Adding that will likely have an impact on performance.) Absent that kind of guarantee, you'd need to add an INNER JOIN or a predicate that tests for the existence of a matching row, to get an INNER JOIN behavior.
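A predicate that tests for the existence of a matching row could look like the following sketch. It is a shortened version of the query above (most columns are omitted for brevity) and assumes the same table and column names.

```sql
-- Sketch: the EXISTS predicate restores INNER JOIN behavior by filtering out
-- items that have no row at all in itpitnam.
SELECT i.identifier
     , i.name
     , r.overall_ranking
     , ( SELECT n1.name
           FROM itpitnam n1
          WHERE n1.item_number = r.item_number
          ORDER BY n1.type_code, n1.name
          LIMIT 1
       ) AS artist
  FROM itpitems i
  JOIN itprank r
    ON r.item_number = i.identifier
 WHERE i.mainprice > 1
   AND EXISTS ( SELECT 1
                  FROM itpitnam n
                 WHERE n.item_number = r.item_number
              )
 LIMIT 3
```

Unlike the HAVING variant, this does not depend on name or type_code being NOT NULL, because it checks for the row itself rather than for a non-NULL value.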
SELECT a.*,
b.overall_ranking,
c.name AS artist,
c.type_code
FROM itpitems a
INNER JOIN itprank b
ON b.item_number = a.identifier
INNER JOIN itpitnam c
ON b.item_number = c.item_number
INNER JOIN
(
SELECT item_number, MAX(type_code) code
FROM itpitnam
GROUP BY item_number
) d ON c.item_number = d.item_number AND
c.type_code = d.code
WHERE mainprice > 1
LIMIT 3
Follow-up question: can you please post the table schema and explain how the tables are related to each other? Then I will know which columns to link.