I got the following database layout. The table on the left - the "exhibit" is somehow the parent. The one in the middle - the "flyer" table describes some data about the exhibit and is therefore linked to it by ex_ID. Let's say it's like versioning. Only the latest "flyer" (the highest ID) is important. On the right side you got "text" parts that reference a row in the "flyer" table directly.
So here's the question. How can I retrieve in ONE statements all flyers a) with the highest ID to a non-archived exhibit and that are b) referenced by non-archived text-parts. For example, as shown in the picture, ID 1 and 4 of the flyer-table should not be returned.
For a SQL expert, this might be an easy question. But for all others, that's learning. So, please, no down-voting.
This statement returns only flyers with the highest id by exhibition, that are referenced by a non archived exhibition and have either not archived parts or no referenced parts at all:
SELECT
f.id fid,
f.ex_id,
e.id eid,
e.archived earchived,
p.id pid,
p.archived parchived
FROM
Flyer f
INNER JOIN (
SELECT
MAX(id) max_id
FROM
Flyer
GROUP BY
ex_id
) t
ON
f.id = t.max_id
INNER JOIN
Exhibition e
ON
f.ex_id = e.id
AND
e.archived = false
LEFT JOIN
Part p
ON
f.id = p.fl_id
AND
p.archived = false
;
Explanation
The INNER JOIN to the subselect will give us the highest id of the
Flyer table by Exhibition id.
We use then an INNER JOIN to Exhibiton to get the details of the exhibiton, only if these are not archived
and a LEFT JOIN to the part table to get only those parts that are not archived or non existent.
DEMO
Related
I have two tables, one is having deposit's (table: deposit) another table is having comments for deposits (table: comment).
Both table are linked by 1 to many relationship ie, one deposit may have many comments.
Below query gives all comments but I want only the last comment (comments table has auto generated id)
SELECT
D.missing_deposit_amount,
D.missing_deposit_date,
C.comment
FROM deposit AS D
LEFT JOIN comments AS C on
C.md_id=D.md_id
WHERE 1
How can i extend this query to give only the last comment corresponding to deposit?
You can search for the last one using a correlated subquery:
SELECT D.missing_deposit_amount, D.missing_deposit_date,
C.comment
FROM deposit D LEFT JOIN
comments C on
ON C.md_id = D.md_id AND
c.id = (SELECT MAX(c2.id) FROM comments c2 WHERE c2.md_id = c.md_id);
I'm trying to answer to the following query:
Select the first name and last name of the clients which rent films (that have DVD's) from all the categories, ordering by first name and last name.
Database consists in:
(better view - open in a new tab)
Inventory -> DVD's
Rental -> Rents customers did
Category table:
| category_id | int(10) unsigned | NO | PRI | NULL | auto_increment |
| name | varchar(25) | YES | | NULL |
My doubt is in how to assign that a field from a query must contain all ids from another query (categories).
I mean I understand the fact we can natural join inventory with rental and film, and then find an id that fails on a single category, then we know he doesn't contain all... But I can't complete this.
I have this solution (But I can't understand it very well):
SELECT first_name, last_name
FROM customer AS C WHERE NOT EXISTS
(SELECT * FROM category AS K WHERE NOT EXISTS
(SELECT * FROM (film NATURAL JOIN inventory) NATURAL JOIN rental
WHERE C.customer_id = customer_id AND K.category_id = category_id));
Are there any other solutions?
On our projects, we NEVER use NATURAL JOIN. That doesn't work for us, because the PRIMARY KEY is always a surrogate column named id, and the foreign key columns are always tablename_id.
A natural join would match id in one table to id in the other table, and that's not what we want. We also frequently have "housekeeping" columns in the tables that are named the same, such as version column used for optimistic locking pattern.
And even if our naming conventions were different, and the join columns were named the same, there would be a potential for a join in an existing query to change if we added a column to a table that was named the same as a column in another table.
And, reading SQL statement that includes a NATURAL JOIN, we can't see what columns are actually being matched, without running through the table definitions, looking for columns that are named the same. That seems to put an unnecessary burden on the reader of the statement. (A SQL statement is going to be "read" many more times than it's written... the author of the statement saving keystrokes isn't a beneficial tradeoff for ambiguity leading to extra work by future readers.
(I know others have different opinions on this topic. I'm sure that successful software can be written using the NATURAL JOIN pattern. I'm just not smart enough or good enough to work with that. I'll give significant weight to the opinions of DBAs that have years of experience with database modeling, implementing schemas, writing and tuning SQL, supporting operational systems, and dealing with evolving requirements and ongoing maintenance.)
Where was I... oh yes... back to regularly scheduled programming...
The image of the schema is way too small for me to decipher, and I can't seem to copy any text from it. Output from a SHOW CREATE TABLE is much easier to work with.
Did you have a SQL Fiddle setup?
I don't thin the query in the question will actually work. I thought there was a limitation on how far "up" a correlated subquery could reference an outer query.
To me, it looks like this predicate
WHERE C.customer_id = customer_id
^^^^^^^^^^^^^
is too deep. The subquery that's in isn't allowed to reference columns from C, that table is too high up. (Maybe I'm totally wrong about that; maybe it's Oracle or SQL Server or Teradata that has that restriction. Or maybe MySQL used to have that restriction, but a later version has lifted it.)
OTHER APPROACHES
As another approach, we could get each customer and a distinct list of every category that he's rented from.
Then, we could compare that list of "customer rented category" with a complete list of (distinct) category. One fairly easy way to do that would be to collapse each list into a "count" of distinct category, and then compare the counts. If a count for a customer is less than the total count, then we know he's not rented from every category. (There's a few caveats, We need to ensure that the customer "rented from category" list contains only categories in the total category list.)
Another approach would be to take a list of (distinct) customer, and perform a cross join (cartesian product) with every possible category. (WARNING: this could be fairly large set.)
With that set of "customer cross product category", we could then eliminate rows where the customer has rented from that category (probably using an anti-join pattern.)
That would leave us with a set of customers and the categories they haven't rented from.
OP hasn't setup a SQL Fiddle with tables and exemplar data; so, I'm not going to bother doing it either.
I would offer some example SQL statements, but the table definitions from the image are unusable; to demonstrate those statements actually working, I'd need some exemplar data in the tables.
(Again, I don't believe the statement in the question actually works. There's no demonstration that it does work.)
I'd be more inclined to test it myself, if it weren't for the NATURAL JOIN syntax. I'm not smart enough to figure that out, without usable table definitions.
If I worked on that, the first think I would do would be to re-write it to remove the NATURAL keyword, and add actual predicates in an actual ON clause, and qualify all of the column references.
And the query would end up looking something like this:
SELECT c.first_name
, c.last_name
FROM customer c
WHERE NOT EXISTS
( SELECT 1
FROM category k
WHERE NOT EXISTS
( SELECT 1
FROM film f
JOIN inventory i
ON i.film_id = f.film_id
JOIN rental r
ON r.inventory_id = i.inventory_id
WHERE f.category_id = k.category_id
AND r.customer_id = c.customer_id
)
)
(I think that reference to c.customer_id is too deep to be valid.)
EDIT
I stand corrected on my conjecture that the reference to C.customer_id was too many levels "deep". That query doesn't throw an error for me.
But it also doesn't seem to return the resultset that we're expecting, I may have screwed it up somehow. Oh well.
Here's an example of getting the "count of distinct rental category" for each customer (GROUP BY c.customer_id, just in case we have two customers with the same first and last names) and comparing to the count of category.
SELECT c.last_name
, c.first_name
FROM customer c
JOIN rental r
ON r.customer_id = c.customer_id
JOIN inventory i
ON i.inventory_id = r.inventory_id
JOIN film f
ON f.film_id = i.film_id
GROUP
BY c.last_name
, c.first_name
, c.customer_id
HAVING COUNT(DISTINCT f.category_id)
= (SELECT COUNT(DISTINCT a.category_id) FROM category a)
ORDER
BY c.last_name
, c.first_name
, c.customer_id
EDIT
And here's a demonstration of the other approach, generating a cartesian product of all customers and all categories (WARNING: do NOT do this on LARGE sets!), and find out if any of those rows don't have a match.
-- customers who have rented from EVERY category
-- h = cartesian (cross) product of all customers with all categories
-- g = all categories rented by each customer
-- perform outer join, return all rows from h and matching rows from g
-- if a row from h does not have a "matching" row found in g
-- columns from g will be null, test if any rows have null values from g
SELECT h.last_name
, h.first_name
FROM ( SELECT hi.customer_id
, hi.last_name
, hi.first_name
, hj.category_id
FROM customer hi
CROSS
JOIN category hj
) h
LEFT
JOIN ( SELECT c.customer_id
, f.category_id
FROM customer c
JOIN rental r
ON r.customer_id = c.customer_id
JOIN inventory i
ON i.inventory_id = r.inventory_id
JOIN film f
ON f.film_id = i.film_id
GROUP
BY c.customer_id
, f.category_id
) g
ON g.customer_id = h.customer_id
AND g.category_id = h.category_id
GROUP
BY h.last_name
, h.first_name
, h.customer_id
HAVING MIN(g.category_id IS NOT NULL)
ORDER
BY h.last_name
, h.first_name
, h.customer_id
I will take a stab at this, only because I am curious why the answer proposed seems so complex. First, a couple of questions.
So your question is: "Select the first name and last name of the clients which rent films (that have DVD's) from all the categories, ordering by first name and last name."
So, just go through the rental database, joining customer. I am not sure what the category part has anything to do with this, as you are not selecting or displaying any category, so that does not need to be part of the search, it is implied as when they rent a DVD, that DVD has a category.
SELECT C.first_name, C.last_name
FROM customer as C JOIN rental as R
ON (C.customer_id = R.customer_id)
WHERE R.return_date IS NOT NULL;
So, you are looking for movies that are currently rented, and displaying the first and last names of customers with active rentals.
You can also do some UNIQUE to reduce the number of duplicate customers that show up in the list.
Does this help?!
So, alright, I have a few tables. My current query runs against a "historical" table. I want to do a join of some kind to get the most recent status from my Current table. These tables share a like column, called "ID"
Here's the structure
ddCurrent
-ID
-Location
-Status
-Time
ddHistorical
-CID (AI field to keep multiple records per site)
-ID
-Location
-Status
-Time
My goal now is to do a simple join to get all the variables from ddHistorical and the current Status from ddCurrent.
I know that they can be joined on ID since both of them have the same items in their ID tables, I just can't figure out which kind of join is appropriate or why?
I'm sure someone may provide a specific link that goes into great detail explaining, but I'll try to summarize it this way. When writing a query, I try to list the tables from the position of what table do I want to get data from and have that as my first table in the "FROM" clause. Then, do "JOIN" criteria to other tables based on relationships (such as IDs). In your example
FROM
ddHistorical ddH
INNER JOIN ddCurrent ddC
on ddH.ID = ddC.ID
In this case, INNER JOIN (same as JOIN) the ddHistorical table is the left table(listed first for my styling consistency and indentation) and ddCurrent is the right table. Notice my ON criteria that joins them together is also left alias.column = right alias table.column -- again, this is just for mental correlation purposes.
an Inner Join (or JOIN) means a record MUST have a match on each side, otherwise it is discarded.
A LEFT JOIN means give me all records in the LEFT table (ddHistorical in this case), regardless of a matching in the right-side table (ddCurrent). Not practical in this example.
A RIGHT JOIN is the reverse... give me all records from the RIGHT-side table REGARDLESS of a matching record in the left side table. Most of the time you will see LEFT-JOINs more frequently than RIGHT-JOINs.
Now, a sample to mentally get the left-join. You work at a car dealership and have a master table of 10 cars that are sold. For a given month, you want to know what IS NOT selling. So, start with the master table of all cars and look at the sales table for what DID sell. If there is NO such sales activity the right-side table will have NULL value
select
M.CarID,
M.CarModel
from
MasterCarsList M
LEFT JOIN CarSales CS
on M.CarID = CS.CarID
AND month( CS.DateSold ) = 4
where
CS.CarID IS NULL
So, my LEFT join is based on a matching car ID -- AND -- the month of sales activity is 4 (April) as I may not care about sales for Jan-Mar -- but would also qualify year too, but this is a simple sample.
If there is no record in the Car Sales table it will have a NULL value for all columns. I just happen to care about the car ID column since that was the join basis. That is why I am including that in the WHERE clause. For all other types of cars that DO have a sale it will have a value.
This is a common approach you will see in querying where someone looking for all regardless of other... Some use a where NOT EXIST ( subselect ), but those perform slower because they test on every record. Having joins is much faster.
Other examples may be you want a list of all employees of a company, and if they had some certification / training to show it... You still want all employees, but LEFT-JOINING to some certification/training table would expose those extra field as needed.
select
Emp.FullName,
Cert.DateCertified
FROM
Employees Emp
Left Join Certifications Cert
on Emp.EmpID = Cert.EmpID
Hopefully these samples help you understand better the relationship for queries, and now to actually provide answer for your needs.
If what you want is a list of all "Current" items and want to look at their historical past, I would use current FIRST. This might be if your current table of things is 50, but historically your table had 420 items. You don't care about the other 360 items, just those that are current and the history of those.
select
ddC.WhateverColumns,
ddH.WhateverHistoricalColumns
from
ddCurrent ddC
JOIN ddHistorical ddH
on ddC.ID = ddH.ID
If there is always a current field then a simple INNER JOIN will do it
SELECT a.CID, a.ID, a.Location, a.Status, a.Time, b.Status
FROM ddHistorical a
INNER JOIN ddCurrent b
ON a.ID = b.ID
An INNER JOIN will omit any ddHistorical rows that don't have a corresponding ID in ddCurrent.
A LEFT JOIN will include all ddHistorical rows, even if they don't have a corresponding ID in ddCurrent, but the ddCurrent values will be null (because they're unknown).
Also note that a LEFT JOIN is just a specific type of outer join. Don't bother with the others yet - 90% or more of what you'll ever do will be INNER or LEFT.
To include only those ddHistorical rows where the ID is in ddCurrent:
SELECT h.CID, h.ID, h.Location, h.Status, c.Status, h.Time
FROM ddHistorical h
INNER JOIN ddCurrent c ON h.ID = c.ID
If you want to include ddHistorical rows even if the ID isn't in ddCurrent:
SELECT h.CID, h.ID, h.Location, h.Status, c.Status, h.Time
FROM ddHistorical h
LEFT JOIN ddCurrent c ON h.ID = c.ID
If all ddHistorical rows happen to match an ID in ddCurrent, note that both queries will return the same result.
I have these tables
AssigenmentList --linksto ---School,AgeGroup
Users will have birthday attached to it
AgeGroup in turn linked to many ages like AgeGroup 3-4 is linked to one to many with 3,4 in numerical format
Now i want all the Assignment list which are linked to particular SCHOOL belong to same age as the age of Child
As a general rule:
select a.*, b.*, c.* from
A a inner join B b on a.idB = b.id
inner join C c on b.idC = c.id
You use inner join if a.idB must have match to add the row to the resultset. Left outer join if the mere presence of a.idB (left side) is enough to project the row.
The trick is to navigate from the starting table to the last joining the columns that ties them.
To filter a table output of selected entries from a single table i would need something like a multiple JOIN request through several tables.
I want to filter a table of people by a special column in the table. Lets say this column is "tasks." Now tasks is also another table with the column "people" and the values between those two tables are connected with an existant "join" table in the database, which is matching several IDs of one table to each ID of the other table.
Now if this would be simple as that i could just filter with an INNER JOIN and a special condition. The problem is, that the entries of the table "tasks" are connected to another table over a "join" table in the database. To simplify things lets say it is "settings". So each "task" consists of several "settings" which are connected via a join table in their IDs.
So what is the input?
I got an array of IDs, which are representing the settings-ids i do not want to be shown.
What should be the output?
As already said i want a filtered output of "people" while the filter is "settings."
I want the sql request to return each entry of the table "people" with only joined tasks that are not joining any of the "setting-ids" from the array.
I hope you can help me with that.
Thanks in advance!
Example
Settings-Table:
1. Is in progress
2. Is important
3. Has unsolved issues
Tasks-Table: (settings.tasks is the join table between many tasks to many settings)
1. Task from 01.01.2012 - JOINS Settings in 1 and 3 (In progress + unsolved issues)
2. Task from 02.01.2012 - JOINS Settings in 2 (Is important)
3. Task from 03.01.2012 - JOINS Settings in 1 and 2 (...)
People-Table: (people.tasks is the join table between many people to many tasks)
1. Guy - JOINS Tasks in 1, 2, 3 (Has been assigned to all 3 tasks)
2. Dude - JOINS Tasks in 1 (Has been assigned to the Task from 01.01.2012)
3. Girl - JOINS Tasks in 2, 3 (...)
Now there is an array passed to a sql query
[2,3] should return noone because every person is assigned in a task that was either important or had unsolved issues!
[3] would return me only the person "Girl" because it is the only one that is assigned to tasks (2 and 3) that had no unsolved issues.
I hope it is clear now. :)
SELECT DISTINCT PEOPLE.*
FROM PEOPLE INNER JOIN PEOPLE_TASKS ON PEOPLE.PERSON_ID = PEOPLE_TASKS.PERSON_ID
WHERE TASK_ID NOT IN (SELECT DISTINCT TASK_ID
FROM TASK_SETTINGS
WHERE SETTING_ID = <Id you don't want>)
EDIT (for supplying multiple setting ids you don't want)
SELECT DISTINCT PEOPLE.*
FROM PEOPLE INNER JOIN PEOPLE_TASKS ON PEOPLE.PERSON_ID = PEOPLE_TASKS.PERSON_ID
WHERE TASK_ID NOT IN (SELECT DISTINCT TASK_ID
FROM TASK_SETTINGS
WHERE SETTING_ID IN (<Id you don't want>))
First you have to join table people and table tasks with the join table, let's call it people_tasks.
select distinct p.* from people p
inner join people_tasks pt on p.people_id = pt.people_id
inner join tasks on t.tasks_id = pt.tasks_id
Then you have to join table tasks and table settings with the join table, let's call it tasks_settings. You have to join them in the current select.
select distinct p.* from people p
inner join people_tasks pt on p.people_id = pt.people_id
inner join tasks on t.tasks_id = pt.tasks_id
inner join tasks_settings ts on t.tasks_id = ts.tasks_id
inner join settings s on s.settings_id = ts.settings_id
and now you have all people connected with its tasks and its settings. Finally you need the restriction. With the people with the settings selected, you choose the others like this:
select distinct p.people_id from people p
inner join people_tasks pt on p.people_id = pt.people_id
where p.people_id not in (
select distinct p2.people_id from people p2
inner join people_tasks pt2 on p2.people_id = pt2.people_id
inner join tasks t2 on t2.tasks_id = pt2.tasks_id
inner join tasks_settings ts2 on t2.tasks_id = ts2.tasks_id
inner join settings s2 on s2.settings_id = ts2.settings_id
where s2.settings_id in (list of ids)
)