Best way to get max value when joining mysql tables - mysql

I've got THREE MYSQL TABLES (innoDB) :
NAMES
id nid version fname lname birth
RELATIONS
id rid version idname idperson roleid
ROLES
id role
I want to select the last version of each RELATIONS joined to the last version of their related NAMES for a particular idperson (and the name of the ROLE)
Of course, idperson will have 0, 1 or more relations and there will be one or more versions of RELATIONS and NAMES
I wrote something like :
SELECT A.id,A.nid,MAX(A.version),A.idname,A.idperson,A.roleid,B.id,B.role
FROM RELATIONS A
INNER JOIN
ROLES
ON A.roleid = B.id
INNER JOIN
(SELECT id,nid,MAX(version),fname,lname,birth FROM NAMES) C
ON A.idname = C.id
WHERE A.idperson = xx
It doesn't work maybe because MAX() seems to return only one line...
How to get the maximum value for more than one line in this joining context?
PS: how do you generate this kind of nice data set?
i.e. :
id home datetime player resource
---|-----|------------|--------|---------
1 | 10 | 04/03/2009 | john | 399
2 | 11 | 04/03/2009 | juliet | 244
5 | 12 | 04/03/2009 | borat | 555
8 | 13 | 01/01/2009 | borat | 700

Adding a GROUP BY statement, both in the subquery you have, as well as in the outer query should allow the MAX function to generate the result that you're looking for.
This is untested code, but should give you the result that you're looking for:
SELECT A.id,A.nid,MAX(A.version),A.idname,A.idperson,A.roleid,B.id,B.role
FROM RELATIONS A
INNER JOIN
ROLES
ON A.roleid = B.id
INNER JOIN
(SELECT id,nid,MAX(version),fname,lname,birth FROM NAMES GROUP BY fname,lname) C
ON A.idname = C.id
WHERE A.idperson = xx
GROUP BY fname,lname
Alternatively, if it works better for your database architecture, you can use any unique identifier for the employees you'd like (possibly nid?).
As to the question that you've posed in your PS, I'm unsure as to what you're asking. I don't seem a home, datetime, player, or resource field in the examples of your tables that you've provided. If you could clarify, I'd be happy to try and help you with that as well.

Related

Mysql query to search between two tables with no direct relationship

OK i found myself in a dead end, and i know must be a way but my brain is just about to explode. This is the case: I have two tables with tons of rows, one for works done (lets call it works table), something like this:
ID | Home_Work_ID | task_id | Person_id
1 | 23 | 1 | 30
2 | 23 | 2 | 31
3 | 23 | 3 | 30
4 | 876 | 1 | 31
5 | 123 | 3 | 32
and another table to report the fixes to do on the works mentioned before, lets call it fixes table
ID | Home_Work_ID | Person_reporting_id | Task_id | Details
1 | 23 | 93 | 1 | Fix this
2 | 23 | 85 | 3 | Fix that
3 | 123 | 86 | 3 | Fix something
As we can see, in the fixes table there are home works with fixes reported, and those home works was done by the person_id 30 and 32 (from the work table). The results that im trying to achieve would be:
Person | Fixes
John (lets say this is person id 30) | 2
Kate (lets say this is person id 32) | 1
The problem is, i need to create a report that show me who was the person responsable of the work done in the works table that have reported fixes in the second table and how many fixes was reported for that person. The only link is the home work id and probably the task id, so i was testing something like this
SELECT P.person_name AS Person, COUNT(F.id) AS fixes
FROM fixes F
INNER JOIN homeworks H ON F.home_work_id = H.id
INNER JOIN works as W
INNER JOIN people AS P ON W.person_id = P.id
INNER JOIN tasks AS T ON T.task_id = F.task_id
WHERE F.task_id = W.task_id
AND F.home_work_id = W.home_work_id
AND W.home_work_id IN (SELECT home_work_id FROM fixes GROUP BY home_work_id)
GROUP BY P.person_name ORDER BY fixes DESC
Ok there are three more inner/left joinable tables with the id and name of the person, home work and task. This show me the person responsable but the number of fixes for that person and that home_work/task dont match the ones in the fixes table, i guess im doing wrong inner joining the work table that way but i dont see the light yet. Any help will be really appreciated.
Regards
I think this query should give you the expected result:
SELECT P.Person_name AS Person, COUNT(F.id) AS fixes
FROM works W
INNER JOIN fixes F ON W.home_work_id = F.home_work_id and W.task_id=F.Task_id
INNER JOIN people P ON W.person_id = P.id
group by P.Person_name
however because I didn't read the question deeply, so I'm not sure
100% this will give exact result.
EDIT: question owner wrote the exact query of answer below.
Note from owner of the answer: Check the query, its based on it but not the "exact" query.
Ok after do some tests and thanks to the comments from Farhang Amary and Rick James from dba.stackexchange, the following query seems to do the job, so im posting it here for anyone that find itself on the same situation or for anything tha can be fixed on it:
SELECT P.Person_name AS Person, COUNT(F.id) AS fixes
FROM (SELECT person_id, home_work_id, task_id FROM work) W
LEFT JOIN fixes_table F ON W.home_work_id = F.home_work_id AND W.task_id = F.taks_id
LEFT JOIN People_table P ON W.person_id = P.id
GROUP BY P.person_name ORDER BY fixes DESC;

MySQL subquery from same table

I have a database with table xxx_facileforms_forms, xxx_facileforms_records and xxx_facileforms_subrecords.
Column headers for xxx_facileforms_subrecords:
id | record | element | title | neame | type | value
As far as filtering records with element = '101' ..query returns proper records, but when i add subquery to filete aditional element = '4871' from same table - 0 records returned.
SELECT
F.id AS form_id,
R.id AS record_id,
PV.value AS prim_val,
COUNT(PV.value) AS count
FROM
xxx_facileforms_forms AS F
INNER JOIN xxx_facileforms_records AS R ON F.id = R.form
INNER JOIN xxx_facileforms_subrecords AS PV ON R.id = PV.record AND PV.element = '101'
WHERE R.id IN (SELECT record FROM xxx_facileforms_records WHERE record = R.id AND element = '4871')
GROUP BY PV.value
Does this looks right?
Thank You!
EDIT
Thank you for support and ideas! Yes, I left lot of un guessing. Sorry. Some input/output table data might help make it more clear.
_facileforms_form:
id | formname
---+---------
1 | myform
_facileforms_records:
id | form | submitted
----+------+--------------------
163 | 1 | 2014-06-12 14:18:00
164 | 1 | 2014-06-12 14:19:00
165 | 1 | 2014-06-12 14:20:00
_facileforms_subrecords:
id | record | element | title | name|type | value
-----+--------+---------+--------+-------------+--------
5821 | 163 | 101 | ticket | radio group | flight
5822 | 163 | 4871 | status | select list | canceled
5823 | 164 | 101 | ticket | radio group | flight
5824 | 165 | 101 | ticket | radio group | flight
5825 | 165 | 4871 | status | select list | canceled
Successful query result:
form_id | record_id | prim_val | count
1 | 163 | flight | 2
So i have to return value data (& sum those records) from those records where _subrecord element - 4871 is present (in this case 163 and 165).
And again Thank You!
Thank You for support and ideas! Yes i left lot of un guessing.. sorry . So may be some input/output table data might help.
_facileforms_form:
headers -> id | formname
1 | myform
_facileforms_records:
headers -> id | form | submitted
163 | 1 | 2014-06-12 14:18:00
164 | 1 | 2014-06-12 14:19:00
165 | 1 | 2014-06-12 14:20:00
_facileforms_subrecords
headers -> id | record | element | title | name | type | value
5821 | 163 | 101 | ticket | radio group| flight
5822 | 163 | 4871 | status | select list | canceled
5823 | 164 | 101 | ticket | radio group | flight
5824 | 165 | 101 | ticket | radio group | flight
5825 | 165 | 4871 | status | select list | canceled
Succesful Query result:
headers -> form_id | record_id | prim_val | count
1 | 163 | flight | 2
So i have to return value data (& sum those records) from those records where _subrecord element - 4871 is present (in this case 163 and 165).
And again Thank You!
No, it doesn't look quite right. There's a predicate "R.id IN (subquery)" but that subquery itself has a reference to R.id; it's a correlated subquery. Looks like something is doubled up there. (We're assuming here that id is a UNIQUE or PRIMARY key in each table.)
The subquery references an identifier element... the only other reference we see to that identifier is from the _subrecords table (we don't see any reference to that column in _records table... if there's no element column in _records, then that's a reference to the element column in PV, and that predicate in the subquery will never be true at the same time the PV.element='101' predicate is true.
Kudos for qualifying the column references with a table alias, that makes the query (and the EXPLAIN output) much easier to read; the reader doesn't need to go digging around in the table definitions to figure out which table does and doesn't contain which columns. But please take that pattern to the next step, and qualify all column references in the query, including column references in the subqueries.
Since the reference to element isn't qualified, we're left to guess whether the _records table contains a column named element.
If the goal is to return only the rows from R with element='4871', we could just do...
WHERE R.element='4871'
But, given that you've gone to the bother of using a subquery, I suspect that's not really what you want.
It's possible you're trying to return all rows from R for a _form, but only for the _form where there's at least one associated _record with element='4871'. We could get that result returned with either an IN (subquery) or an EXISTS (correlated_ subquery) predicate, or an anti-join pattern. I'd give examples of those query patterns; I could take some guesses at the specification, but I would only be guessing at what you actually want to return.
But I'm guessing that's not really what you want. I suspect that _records doesn't actually contain a column named element.
The query is already restricting the rows returned from PV with those that have element='101'.)
This is a case where some example data and the example output would help explain the actual specification; and that would be a basis for developing the required SQL.
FOLLOWUP
I'm just guessing... maybe what you want is something pretty simple. Maybe you want to return rows that have element value of either '101' or '4913'.
The IN comparison operator is a convenient of way of expressing the OR condition, that a column be equal to a value in a list:
SELECT F.id AS form_id
, R.id AS record_id
, PV.value AS prim_val
, COUNT(PV.value) AS count
FROM xxx_facileforms_forms F
JOIN xxx_facileforms_records R
ON R.form = F.id
JOIN xxx_facileforms_subrecords PV
ON PV.record = R.id
AND PV.element IN ('101','4193')
GROUP BY PV.value
NOTE: This query (like the OP query) is using a non-standard MySQL extension to GROUP BY, which allows non-aggregate expressions (e.g. bare columns) to be returned in the SELECT list.
The values returned for the non-aggregate expressions (in this case, F.id and R.id) will be a values from a row included in the "group". But because there can be multiple rows, and different values on those rows, it's not deterministic which of values will be returned. (Other databases would reject this statement, unless we wrapped those columns in an aggregate function, such as MIN() or MAX().)
FOLLOWUP
I noticed that you added information about the question into an answer... this information would better be added to the question as an EDIT, since it's not an answer to the question. I took the liberty of copying that, and reformatting.
The example makes it much more clear what you are trying to accomplish.
I think the easiest to understand is to use EXISTS predicate, to check whether a row meeting some criteria "exists" or not, and exclude rows where such a row does not exist. This will use a correlated subquery of the _subrecords table, to which check for the existence of a matching row:
SELECT f.id AS form_id
, r.id AS record_id
, pv.value AS prim_val
, COUNT(pv.value) AS count
FROM xxx_facileforms_forms f
JOIN xxx_facileforms_records r
ON r.form = f.id
JOIN xxx_facileforms_subrecords pv
ON pv.record = r.id
AND pv.element = '101'
-- only include rows where there's also a related 4193 subrecord
WHERE EXISTS ( SELECT 1
FROM xxx_facileforms_subrecords sx
WHERE sx.element = '4193'
AND sx.record = r.id
)
--
GROUP BY pv.value
(I'm thinking this is where OP was headed with the idea that a subquery was required.)
Given that there's a GROUP BY in the query, we could actually accomplish an equivalent result with a regular join operation, to a second reference to the _subrecords table.
A join operation is often more efficient than using an EXISTS predicate.
(Note that the existing GROUP BY clause will eliminate any "duplicates" that might otherwise be introduced by a JOIN operation, so this will return an equivalent result.)
SELECT f.id AS form_id
, r.id AS record_id
, pv.value AS prim_val
, COUNT(pv.value) AS count
FROM xxx_facileforms_forms f
JOIN xxx_facileforms_records r
ON r.form = f.id
JOIN xxx_facileforms_subrecords pv
ON pv.record = r.id
AND pv.element = '101'
-- only include rows where there's also a related 4193 subrecord
JOIN xxx_facileforms_subrecords sx
ON sx.record = r.id
AND sx.element = '4193'
--
GROUP BY pv.value

SQL "chained" queries?

I have two tables.
I am a total newbie to SQL. Using mysql at the moment.
I have the following setup for a school-related db:
Table A contains students records.
Student's id, password,name, lastname and so on.
Table B contains class attendancy records.
Each record goes like this: date, student id, grade
I need to gather all the student info of students that attended classes in a certain date range.
Now, the stupid way would be
first I SELECT all classes from Table B with DATE IN BETWEEN the range
then for each class, I get the student id and SELECT * FROM students WHERE id = student id
What I can't wrap my mind around is the smart way.
How to do this in one query only.
I am failing at understanding the concepts of JOIN, UNION and so on...
my best guess so far is
SELECT students.id, students.name, students.lastname
FROM students, classes
WHERE classes.date BETWEEN 20140101 AND 20150101
AND
classes.studentid = students.id
but is this the appropriate way for this case?
Dont add the join statement in the where clause. Do it like this:
SELECT s.id, s.name, s.lastname,c.date,c.grade
FROM classes c
inner join students s
on c.studentid=s.id
WHERE c.date BETWEEN '01/01/2014' AND '01/01/2015'
This sounds like an assignment so I will attempt to describe the problem and give a hint to the solution.
An example of a union would be;
SELECT students.name, students.lastname
FROM students
WHERE students.lastname IS NOT NULL
UNION
SELECT students.name, 'N/A'
FROM students
WHERE students.lastname IS NULL;
+--------------+--------------+
| name | lastname |
+--------------+--------------+
| John | Doe | <- First two rows came from first query
| Jill | Smith |
| Bill | N/A | <- This came from the second query
+--------------+--------------+
The usual use case for a union is to display the same columns, but munge the data in a different way - otherwise you can usually achieve similar results through a WHERE clause.
An example of a join would be;
SELECT authors.id, authors.name, books.title
FROM authors LEFT JOIN books ON authors.id = books.authors_id
+--------------+--------------+------------------+
| id | name | title |
+--------------+--------------+------------------+
| 1 | Mark Twain | Huckleberry Fin. |
| 2 | Terry Prat.. | Good Omens |
+--------------+--------------+------------------+
^ First two columns from ^ Third column appended
from authors table from books table linked
by "author id"
Think of a join as appending columns to your results, a union is appending rows with the same columns.
In your situation we can rule out a union as you don't want to append more student rows, you want class and student information side by side.

How can I get connected values of two times the same foreign key?

I currently have a situation that could easily be solved with 3 SQL queries, but I wonder if it can be done in one query.
I have the following tables:
symbol similarity
------- ------------
id | name | latex id | base_symbol_id | similar_symbol_id
I want to SELECT so that my result looks like this:
query_result
------------
similarity_id | base_formula_id | base_formula_name | base_formula_latex | similar_formula_id | similar_formula_name | similar_formula_latex
Failed tries
I usual solve similar tasks with JOIN. But this time, the SELECT depends on another attribute I select ... I don't know how to do this. Here is my try (which of course failed):
SELECT `base_symbol_id`, `similar_symbol_id`, `latex`
FROM `similarity`
JOIN `symbol` ON ((`symbol`.`id` = `base_symbol_id`) OR (`symbol`.`id` = `similar_symbol_id`))
gives
base_symbol_id | simlar_symbol_id | latex
10 | 11 | \alpha
10 | 11 | a
select sim.id
,base.id
,base.name
,base.latex
,similar.id
,similar.name
,similar.latex
from similarity as sim
join symbol as base on base.id=sim.base_symbol_id
join symbol as similar on similar.id=sim.similar_symbol_id
Using the given table structure, and making up some random sample inputs in an SQL Fiddle session, the following query would work as you desired:
SELECT T.id as similarity_id,
S1.id as base_formula_id, S1.name as base_formula_name, S1.latex as base_formula_latex,
S2.id as similar_formula_id, S2.name as similar_formula_name, S2.latex as similar_formula_latex
FROM similarity T
LEFT OUTER JOIN symbol S1 ON (T.base_symbol_id = S1.id)
LEFT OUTER JOIN symbol S2 ON (T.similar_symbol_id = S2.id)

How to filter duplicates within row using Distinct/group by with JOINS

For simplicity, I will give a quick example of what i am trying to achieve:
Table 1 - Members
ID | Name
--------------------
1 | John
2 | Mike
3 | Sam
Table 1 - Member_Selections
ID | planID
--------------------
1 | 1
1 | 2
1 | 1
2 | 2
2 | 3
3 | 2
3 | 1
Table 3 - Selection_Details
planID | Cost
--------------------
1 | 5
2 | 10
3 | 12
When i run my query, I want to return the sum of the all member selections grouped by member. The issue I face however (e.g. table 2 data) is that some members may have duplicate information within the system by mistake. While we do our best to filter this data up front, sometimes it slips through the cracks so when I make the necessary calls to the system to pull information, I also want to filter this data.
the results SHOULD show:
Results Table
ID | Name | Total_Cost
-----------------------------
1 | John | 15
2 | Mike | 22
3 | Sam | 15
but instead have John as $20 because he has plan ID #1 inserted twice by mistake.
My query is currently:
SELECT
sq.ID, sq.name, SUM(sq.premium) AS total_cost
FROM
(
SELECT
m.id, m.name, g.premium
FROM members m
INNER JOIN member_selections s USING(ID)
INNER JOIN selection_details g USING(planid)
) sq group by sq.agent
Adding DISTINCT s.planID filters the results incorrectly as it will only show a single PlanID 1 sold (even though members 1 and 3 bought it).
Any help is appreciated.
EDIT
There is also another table I forgot to mention which is the agent table (the agent who sold the plans to members).
the final group by statement groups ALL items sold by the agent ID (which turns the final results into a single row).
Perhaps the simplest solution is to put a unique composite key on the member_selections table:
alter table member_selections add unique key ms_key (ID, planID);
which would prevent any records from being added where the unique combo of ID/planID already exist elsewhere in the table. That'd allow only a single (1,1)
comment followup:
just saw your comment about the 'alter ignore...'. That's work fine, but you'd still be left with the bad duplicates in the table. I'd suggest doing the unique key, then manually cleaning up the table. The query I put in the comments should find all the duplicates for you, which you can then weed out by hand. once the table's clean, there'll be no need for the duplicate-handling version of the query.
Use UNIQUE keys to prevent accidental duplicate entries. This will eliminate the problem at the source, instead of when it starts to show symptoms. It also makes later queries easier, because you can count on having a consistent database.
What about:
SELECT
sq.ID, sq.name, SUM(sq.premium) AS total_cost
FROM
(
SELECT
m.id, m.name, g.premium
FROM members m
INNER JOIN
(select distinct ID, PlanID from member_selections) s
USING(ID)
INNER JOIN selection_details g USING(planid)
) sq group by sq.agent
By the way, is there a reason you don't have a primary key on member_selections that will prevent these duplicates from happening in the first place?
You can add a group by clause into the inner query, which groups by all three columns, basically returning only unique rows. (I also changed 'premium' to 'cost' to match your example tables, and dropped the agent part)
SELECT
sq.ID,
sq.name,
SUM(sq.Cost) AS total_cost
FROM
(
SELECT
m.id,
m.name,
g.Cost
FROM
members m
INNER JOIN member_selections s USING(ID)
INNER JOIN selection_details g USING(planid)
GROUP BY
m.ID,
m.NAME,
g.Cost
) sq
group by
sq.ID,
sq.NAME