I have a table schema as
create table Location(
id int primary key,
city varchar(255),
state varchar(100),
country varchar(255)
);
create table Person(
id int primary key,
name varchar(100)
);
create table Photographer(
id int primary key references Person(id) on update cascade on delete cascade,
livesIn int not null references Location(id) on update cascade on delete no action
);
create table Specialty(
photographer int references Photographer(id) on update cascade on delete cascade,
type enum('portrait','landscape','sport'),
primary key(photographer, type)
);
create table Photo(
id int primary key,
takenAt timestamp not null,
takenBy int references Photographer(id) on update cascade on delete no action,
photographedAt int references Location(id) on update cascade on delete no action
);
create table Appearance(
shows int references Person(id) on update cascade on delete cascade,
isShownIn int references Photo(id) on update cascade on delete cascade,
primary key(shows, isShownIn)
);
I am stuck at two queries :
1) The photos such that the photo only shows photographers that live in the same location. List each photo once. That is, photos must have persons that are photographers, and they all need to live in the same place.
2) The locations that have the property that every photo in the location was taken by a photographer who is not shown in any photo in Massachusetts? For each location show only the city, and show each location only once.
My tries :
1)
SELECT ph.id, ph.takenAt, ph.takenBy, ph.photographedAt FROM
(SELECT * FROM Photo p, Appearance ap WHERE p.id = ap.isShownIn
HAVING ap.shows IN (SELECT person.id FROM Person,Photographer WHERE person.id =
photographer.id)) ph
WHERE ph.photographedAt = (SELECT location.id FROM location WHERE location.id =
(SELECT livesIn FROM Photographer WHERE id = ph.takenBy))
2)
select distinct city from location where location.id in (
select photographedAt from photo, (select * from appearance where appearance.shows in
(select photographer.id from photographer)) ph
where photo.id = ph.isShownIn )
and location.state <> 'Massachusetts'
Can anyone help in creating these queries ??
Your queries are both of the "list individual items that have properties X and Y, where X and Y are in different tables" variety.
These types of questions are commonly solved using correlated sub-queries with EXISTS and NOT EXISTS.
Using EXISTS takes care of the "show each item only once" part. Otherwise you would need to use grouping in conjunction with complex joins, and this can get messy very quickly.
Question 1 requires:
[...] photos must have persons that are photographers, and they all need to live in the same place.
Note that this definition doesn't say "do not show photos if they contain other people, too". If that's what you really meant, it's up to you to draw conclusions from the SQL below and to write better definitions next time. ;)
SELECT
*
FROM
Photo p
WHERE
EXISTS (
-- ...that has at least one appearance of a photographer
SELECT
1
FROM
Appearance a
INNER JOIN Photographer r ON r.id = a.shows
INNER JOIN Location l ON l.id = r.livesIn
WHERE
a.isShownIn = p.id
-- AND l.id = <optional location filter would go here>
AND NOT EXISTS (
-- ...that does not have an appearance of a photographer from
-- some place else
SELECT
1
FROM
Appearance a1
INNER JOIN Photographer r1 ON r1.id = a1.shows
INNER JOIN Location l1 ON l1.id = r1.livesIn
WHERE
a1.isShownIn = p.Id
AND l1.id <> l.id
)
)
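To see the EXISTS / NOT EXISTS pattern in action, here is a minimal sketch that runs the same query shape against an in-memory SQLite database using Python's sqlite3 module. The sample rows are made up for illustration, the Person table and the join to Location are left out (comparing livesIn directly is enough here), and SQLite stands in for whatever database you actually use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Photographer(id INTEGER PRIMARY KEY, livesIn INTEGER NOT NULL);
CREATE TABLE Photo(id INTEGER PRIMARY KEY, takenBy INTEGER, photographedAt INTEGER);
CREATE TABLE Appearance(shows INTEGER, isShownIn INTEGER, PRIMARY KEY (shows, isShownIn));

-- Hypothetical data: photographers 10 and 11 live in location 1, 12 lives in location 2.
INSERT INTO Photographer VALUES (10, 1), (11, 1), (12, 2);
INSERT INTO Photo VALUES (100, 10, 1), (101, 10, 2), (102, 12, 2);
-- Photo 100 shows 10 and 11 (same place), photo 101 shows 10 and 12
-- (different places), photo 102 shows nobody.
INSERT INTO Appearance VALUES (10, 100), (11, 100), (10, 101), (12, 101);
""")

query = """
SELECT p.id
FROM Photo p
WHERE EXISTS (
    -- at least one photographer appears in the photo ...
    SELECT 1
    FROM Appearance a
    INNER JOIN Photographer r ON r.id = a.shows
    WHERE a.isShownIn = p.id
      AND NOT EXISTS (
          -- ... and no photographer from some other place appears in it
          SELECT 1
          FROM Appearance a1
          INNER JOIN Photographer r1 ON r1.id = a1.shows
          WHERE a1.isShownIn = p.id
            AND r1.livesIn <> r.livesIn
      )
)
ORDER BY p.id
"""
photo_ids = [row[0] for row in conn.execute(query)]
print(photo_ids)
```

With this data only photo 100 qualifies: photo 101 mixes two home locations and photo 102 shows no photographer at all.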
The second question reads
[...] locations that have the property that every photo in the location was taken by a photographer who is not shown in any photo in Massachusetts? For each location show only the city, and show each location only once.
The corresponding SQL would look like:
SELECT
city
FROM
Location l
WHERE
NOT EXISTS (
-- ...a photo at this location taken by a photographer who makes
-- an appearance on another photo which was taken in Massachusetts
SELECT
1
FROM
Photo p
INNER JOIN Photographer r ON r.id = p.takenBy
INNER JOIN Appearance a ON a.shows = r.id
INNER JOIN Photo p1 ON p1.id = a.isShownIn
WHERE
p.photographedAt = l.Id
AND p1.photographedAt = <the location id of Massachusetts>
)
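A small runnable sketch of this second query, again with Python's sqlite3 and invented sample rows. One assumption that differs from the SQL above: instead of a single "location id of Massachusetts" placeholder, the sketch joins Location a second time and filters on state = 'Massachusetts', since the schema stores Massachusetts as a state that can cover several location rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Location(id INTEGER PRIMARY KEY, city TEXT, state TEXT);
CREATE TABLE Photo(id INTEGER PRIMARY KEY, takenBy INTEGER, photographedAt INTEGER);
CREATE TABLE Appearance(shows INTEGER, isShownIn INTEGER, PRIMARY KEY (shows, isShownIn));

-- Hypothetical data.
INSERT INTO Location VALUES
    (1, 'Boston', 'Massachusetts'), (2, 'Austin', 'Texas'),
    (3, 'Denver', 'Colorado'), (4, 'Reno', 'Nevada');
-- Photographer 10 took photos in Boston and Austin, photographer 11 in Denver.
INSERT INTO Photo VALUES (100, 10, 1), (101, 10, 2), (102, 11, 3);
-- Photographer 10 is shown in photo 100, which was taken in Massachusetts.
INSERT INTO Appearance VALUES (10, 100);
""")

query = """
SELECT l.city
FROM Location l
WHERE NOT EXISTS (
    -- a photo at this location whose photographer appears
    -- in some photo taken in Massachusetts
    SELECT 1
    FROM Photo p
    INNER JOIN Appearance a ON a.shows = p.takenBy
    INNER JOIN Photo p1 ON p1.id = a.isShownIn
    INNER JOIN Location l1 ON l1.id = p1.photographedAt
    WHERE p.photographedAt = l.id
      AND l1.state = 'Massachusetts'
)
ORDER BY l.city
"""
cities = [row[0] for row in conn.execute(query)]
print(cities)
```

Denver is returned because its only photo was taken by photographer 11, who is shown nowhere. Reno is returned too: a location without any photos satisfies "every photo ..." vacuously, which may or may not be what you want.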
My attempt for Query1. Photos that show photographers that live in the same city.
select ph.id, ph.takenAt, ph.takenBy, ph.photographedAt from Photo as ph
join Appearance as a on ph.id = a.isShownIn
join Photographer as p on a.shows = p.id where p.livesIn in
(select p1.livesIn from Photographer as p1, Photographer as p2
where p1.id != p2.id and p1.livesIn = p2.livesIn);
My attempt for Query 2. Collect the people shown in photos taken in Massachusetts, then list all the photos not taken by any of those people.
select * from Photo where takenBy not in
(select a.shows from Photo as ph
join Location as l on ph.photographedAt = l.id
join Appearance as a on a.isShownIn = ph.id
where l.state = 'Massachusetts');
Hope that helps.
Related
I have a query which gets all the jobs from a database. Some jobs don't have a languagepair, and I need to get those too while still getting the languagepair information for jobs which do have one. I understand this is done with a full join, but full joins do not exist in MySQL; from what I have read, I need to do some sort of UNION.
If I get NULLs as source & target for jobs that do not have a languagepair, that is fine.
This is the query I have at the moment:
SELECT jobName, source.name AS source, target.name AS target FROM (
(SELECT jobs.name AS jobName, lp.sourceId, lp.targetId FROM jobs **JOIN languagePairs** lp
ON lp.id = jobs.languagePairId)
UNION
(SELECT jobs.name AS jobName, lp.sourceId, lp.targetId FROM collectiveJobs JOIN jobs ON jobs.id = collectiveJobs.jobId
**JOIN languagePairs lp** on jobs.languagePairId = lp.id
WHERE collectiveJobs.freelancerId = 1)
) AS jobs **JOIN languages** source ON source.id = sourceId **JOIN languages** target ON target.id = targetId;
I think, but I am not sure, that the full joins need to happen at the bold joins. There also needs to be some sort of checking for null (I think) in the query.
Of course I could do this programmatically, but it would be nice to have one query for it.
DB schema:
create table languages
(
id int auto_increment primary key,
name varchar(255) not null
)
create table languagePairs
(
id int auto_increment
primary key,
sourceId int not null,
targetId int not null,
constraint languagePair_sourceId_targetId_uindex
unique (sourceId, targetId),
constraint languagePair_language_id_fk_source
foreign key (sourceId) references languages (id),
constraint languagePair_language_id_fk_target
foreign key (targetId) references languages (id)
)
create table jobs
(
id int auto_increment
primary key,
name varchar(255) null,
freelancerId int null,
languagePairId int null,
constraint jobs_freelancers_id_fk
foreign key (freelancerId) references freelancers (id),
constraint jobs_languagePairs_id_fk
foreign key (languagePairId) references languagePairs (id)
)
create table collectiveJobs
(
id int auto_increment
primary key,
jobId int not null,
freelancerId int not null,
constraint collectiveJobs_freelancerId_jobId_uindex
unique (freelancerId, jobId),
constraint collectiveJobs_freelancers_id_fk
foreign key (freelancerId) references freelancers (id),
constraint collectiveJobs_jobs_id_fk
foreign key (jobId) references jobs (id)
)
create table freelancers
(
id int auto_increment primary key
)
Sample data:
INSERT INTO datamundi.jobs (id, name, freelancerId, languagePairId) VALUES (1, 'Job 1', 1, 1);
INSERT INTO datamundi.jobs (id, name, freelancerId, languagePairId) VALUES (2, 'Job 2', 1, null);
If I execute the query only Job 1 gets shown.
MySQL version on development machine: mysql Ver 8.0.19 for Linux on x86_64 (MySQL Community Server - GPL)
MySQL version on production server: mysql Ver 8.0.17 for Linux on x86_64 (MySQL Community Server - GPL)
All help is truly appreciated.
I can't really delve into your specific example, but the good news is you are using MySQL 8.x. The workaround for a FULL OUTER JOIN between two tables (a and b) in MySQL is:
select * from a left join b on <predicate>
union
select * from a right join b on <predicate>
Now if you need to join complex selects instead of simple tables, then CTEs come to your rescue. For example, if the left side were a complex SELECT you would do:
with s as ( <complex-select-here> )
select * from s left join b on <predicate>
union
select * from s right join b on <predicate>
If both are complex SELECTs then:
with s as ( <complex-select-here> ),
t as ( <complex-select-here> )
select * from s left join t on <predicate>
union
select * from s right join t on <predicate>
No sweat.
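Here is a minimal sketch of the UNION workaround, run against an in-memory SQLite database from Python so it can be tried without a MySQL server. Tables a and b and their rows are made up. One deliberate difference from the SQL above: the second branch is written as "b LEFT JOIN a" rather than "a RIGHT JOIN b", because older SQLite versions have no RIGHT JOIN; the two forms return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a(id INTEGER, val TEXT);
CREATE TABLE b(id INTEGER, val TEXT);
-- Hypothetical data: id 1 exists only in a, id 3 only in b, id 2 in both.
INSERT INTO a VALUES (1, 'a1'), (2, 'a2');
INSERT INTO b VALUES (2, 'b2'), (3, 'b3');
""")

query = """
SELECT a.id AS a_id, a.val AS a_val, b.id AS b_id, b.val AS b_val
FROM a LEFT JOIN b ON a.id = b.id      -- every row of a, matched or not
UNION                                  -- UNION also removes the duplicated matches
SELECT a.id, a.val, b.id, b.val
FROM b LEFT JOIN a ON a.id = b.id      -- every row of b, matched or not
"""
full_join_rows = set(conn.execute(query).fetchall())
for row in sorted(full_join_rows, key=repr):
    print(row)
```

Keep in mind that UNION (as opposed to UNION ALL) also collapses rows that were genuine duplicates in the inputs, so this sketch is only equivalent to a FULL OUTER JOIN when the joined rows are unique.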
This works with all LEFT joins. I am sorry, I should have tried that first.
SELECT jobName, source.name AS source, target.name AS target FROM (
(SELECT jobs.name AS jobName, lp.sourceId, lp.targetId FROM jobs LEFT JOIN languagePairs lp
ON jobs.languagePairId = lp.id)
UNION
(SELECT jobs.name AS jobName, lp.sourceId, lp.targetId FROM collectiveJobs JOIN jobs ON jobs.id = collectiveJobs.jobId
LEFT JOIN languagePairs lp on jobs.languagePairId = lp.id
WHERE collectiveJobs.freelancerId = 1)
) AS jobs LEFT JOIN languages source ON source.id = sourceId LEFT JOIN languages target ON target.id = targetId;
Not sure why I thought I needed a FULL JOIN...
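To illustrate why LEFT JOIN is enough here, this sketch compares the inner-join and left-join versions on the two sample jobs, using Python's sqlite3 with an in-memory database. The language names and the single language pair are invented, and the collectiveJobs branch of the UNION is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE languages(id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE languagePairs(id INTEGER PRIMARY KEY, sourceId INTEGER NOT NULL, targetId INTEGER NOT NULL);
CREATE TABLE jobs(id INTEGER PRIMARY KEY, name TEXT, languagePairId INTEGER);

-- Hypothetical languages and pair; the jobs mirror the sample data above.
INSERT INTO languages VALUES (1, 'English'), (2, 'French');
INSERT INTO languagePairs VALUES (1, 1, 2);
INSERT INTO jobs VALUES (1, 'Job 1', 1), (2, 'Job 2', NULL);
""")

def job_languages(join_type):
    # join_type is one of the two fixed literals "JOIN" / "LEFT JOIN" used
    # below; it is never built from user input.
    sql = f"""
    SELECT jobs.name, source.name, target.name
    FROM jobs
    {join_type} languagePairs lp ON lp.id = jobs.languagePairId
    {join_type} languages source ON source.id = lp.sourceId
    {join_type} languages target ON target.id = lp.targetId
    ORDER BY jobs.id
    """
    return conn.execute(sql).fetchall()

inner_rows = job_languages("JOIN")       # drops Job 2, which has no language pair
left_rows = job_languages("LEFT JOIN")   # keeps Job 2 with NULL source and target
print(inner_rows)
print(left_rows)
```

The left side (jobs) is the only table whose rows must all survive, which is exactly what LEFT JOIN guarantees; a FULL JOIN would additionally return language pairs that no job uses.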
Is there any way we can reuse the reference of table we joined in a sub query?
I have three tables:
task_categories, information about task categories
task_priorities, priorities associated with task categories
task_links, url links for each individual tasks.
Please check this SQL Fiddle.
CREATE TABLE task_categories (
task_category_id int,
code varchar(255),
name varchar(255)
);
CREATE TABLE task_priorities (
priority_id int,
task_category_id int,
priority int
);
CREATE TABLE task_links (
task_links_id int,
task_category_id int,
title varchar(255),
link varchar(255),
position int
);
We'd need to join all these tables if we need the links of tasks that have a high priority. Something like this:
select * from task_links t_links
inner join task t on t_links.task_id = t.task_id
inner join task_priorities t_priorities on t.task_id = t_priorities.task_id
where t.code in ('TASK_P2', 'TASK_P3') and
t_priorities.priority = (select min(priority) from task_priorities tp
inner join task t on tp.task_id = t.task_id
where t.code in('TASK_P2', 'TASK_P3'))
order by t_links.position;
Is there any way to optimize this query? It joins the same tables twice, so I think there should be a better way to write it.
The logic for your subquery is incorrect. It is not selecting the minimum priority for each task.
I am guessing that you really want:
where t.code in ('TASK_P2', 'TASK_P3') and
tp.priority = (select min(tp2.priority)
from task_priorities tp2
where tp2.task_id = t.task_id
)
This doesn't need much more optimization than an index on task_priorities(task_id, priority).
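A runnable sketch of the correlated subquery, using Python's sqlite3 with an in-memory database. It follows the table and column names used in the question's query (task, task_id) rather than the CREATE TABLE statements, trims task_links to the columns needed, and uses invented rows; treat all of that as assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE task(task_id INTEGER PRIMARY KEY, code TEXT);
CREATE TABLE task_priorities(priority_id INTEGER PRIMARY KEY, task_id INTEGER, priority INTEGER);
CREATE TABLE task_links(task_links_id INTEGER PRIMARY KEY, task_id INTEGER, title TEXT, position INTEGER);
-- The index suggested above: it lets MIN(priority) per task be read from the index.
CREATE INDEX idx_task_priorities ON task_priorities(task_id, priority);

-- Hypothetical data: task 1 has priorities 5 and 2, task 2 has priority 7.
INSERT INTO task VALUES (1, 'TASK_P2'), (2, 'TASK_P3'), (3, 'TASK_P9');
INSERT INTO task_priorities VALUES (1, 1, 5), (2, 1, 2), (3, 2, 7), (4, 3, 1);
INSERT INTO task_links VALUES (1, 1, 'P2 link', 2), (2, 2, 'P3 link', 1), (3, 3, 'P9 link', 3);
""")

query = """
SELECT tl.title, tp.priority
FROM task_links tl
INNER JOIN task t ON tl.task_id = t.task_id
INNER JOIN task_priorities tp ON tp.task_id = t.task_id
WHERE t.code IN ('TASK_P2', 'TASK_P3')
  AND tp.priority = (SELECT MIN(tp2.priority)      -- minimum for THIS task only
                     FROM task_priorities tp2
                     WHERE tp2.task_id = t.task_id)
ORDER BY tl.position
"""
links = conn.execute(query).fetchall()
print(links)
```

Each task keeps only its own minimum-priority row, so task 1 contributes priority 2 and task 2 contributes priority 7, and the third task is filtered out by its code.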
I have the following table in my database. Its purpose is to hold colour sets. I.e. [red + black], [blue + green + yellow], etc.
CREATE TABLE `df_productcolours`
(
`id` int(11) NOT NULL AUTO_INCREMENT,
`id_colourSet` int(11) NOT NULL,
`id_colour` int(11) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `UNIQUE` (`id_colourSet`,`id_colour`),
KEY `idx_colourSet` (`id_colourSet`),
KEY `idx_colour_id` (`id_colour`),
CONSTRAINT `fk_colourid` FOREIGN KEY (`id_colour`) REFERENCES `df_lu_color` (`id`)
ON DELETE NO ACTION ON UPDATE NO ACTION
)
I made a stored proc that takes an array of id_colour integers as input, and returns a colour set id. What it's meant to do is return the set that contains those colours, and ONLY those colours that are provided as input. What it's actually doing is returning sets that contain the colours requested plus some others.
This is the code that I have so far:
SET @count = (SELECT COUNT(*) FROM tempTable_inputColours);
SELECT A.id_colourSet
FROM df_productcolours AS A
INNER JOIN tempTable_inputColours AS B
ON A.id_colour = B.id_colour
GROUP BY A.id_colourSet
HAVING COUNT(A.id_colour) = @count
AND COUNT(B.id_colour) = @count;
I have a feeling the issue may be with the way I'm joining, but I just can't seem to get it. Any help would be appreciated. Thanks.
You can try this:
SELECT A.id_colourSet
FROM df_productcolours AS A
INNER JOIN tempTable_inputColours AS B
ON A.id_colour = B.id_colour
WHERE A.id_colourSet IN (SELECT id_colour FROM tempTable_inputColours)
AND A.id_colour IN (SELECT id_colour FROM tempTable_inputColours)
EDIT
SELECT A.id_colourSet
FROM df_productcolours AS A
INNER JOIN tempTable_inputColours AS B
ON A.id_colour = B.id_colour
WHERE A.id_colourSet =(SELECT SUM(id_colour) FROM tempTable_inputColours)
I think I solved it myself after a few days of punishment. Here's the code:
SET clrCount = (SELECT COUNT(*) FROM _tmp_ColourSet);
-- The first half of the query does an inner join,
-- it will return all sets that have ANY of our requested colours.
-- But the HAVING condition will make it return sets that have AT LEAST all of the colours we are requesting.
-- So at this point we have all the super-sets, if you will.
-- Then, the second half of the query will restrict that further,
-- to only sets that have the same number of colours as we are requesting.
-- And voila :)
-- FIND ALL COLOUR SETS THAT HAVE ALL REQUESTED COLOURS
SET colourSetId = (SELECT A.id_colourSet
FROM df_productcolours AS A
INNER JOIN _tmp_ColourSet AS B
ON A.id_colour = B.id_colour
GROUP BY A.id_colourSet
HAVING COUNT(A.id_colour) = clrCount
-- FIND ALL COLOUR SETS THAT HAVE EXACTLY N COLOURS
AND A.id_colourSet IN (SELECT A.id_colourSet
FROM df_productcolours AS A
GROUP BY A.id_colourSet
HAVING COUNT(A.id_colour) = clrCount));
Hope it saves someone pulling their hair out.
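For anyone who wants to try the "superset check plus size check" idea outside a stored procedure, here is a minimal sketch with Python's sqlite3 and an in-memory database. The colour sets are made up, the table is trimmed to the two relevant columns, and it assumes the input table holds no duplicate colours.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE df_productcolours(
    id_colourSet INTEGER NOT NULL,
    id_colour INTEGER NOT NULL,
    UNIQUE (id_colourSet, id_colour)
);
CREATE TABLE _tmp_ColourSet(id_colour INTEGER NOT NULL);

-- Hypothetical data: set 1 = {1, 2}, set 2 = {1, 2, 3}, set 3 = {1}.
INSERT INTO df_productcolours VALUES (1, 1), (1, 2), (2, 1), (2, 2), (2, 3), (3, 1);
-- Requested colours: {1, 2}.
INSERT INTO _tmp_ColourSet VALUES (1), (2);
""")

clr_count = conn.execute("SELECT COUNT(*) FROM _tmp_ColourSet").fetchone()[0]

query = """
SELECT A.id_colourSet
FROM df_productcolours AS A
INNER JOIN _tmp_ColourSet AS B ON A.id_colour = B.id_colour
GROUP BY A.id_colourSet
-- the set contains every requested colour (it is a superset) ...
HAVING COUNT(A.id_colour) = :n
-- ... and it has exactly as many colours as were requested
   AND A.id_colourSet IN (SELECT C.id_colourSet
                          FROM df_productcolours AS C
                          GROUP BY C.id_colourSet
                          HAVING COUNT(C.id_colour) = :n)
"""
matching_sets = [row[0] for row in conn.execute(query, {"n": clr_count})]
print(matching_sets)
```

Set 2 passes the superset check but fails the size check, and set 3 fails the superset check, so only set 1 is returned.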
I have 4 tables:
1. The first table (d_cities) holds cities and is related to the next table by CountryID:
CityId |CountryID |RegionID |City |Latitude|Longitude
2. The second table (d_country) holds countries:
CountryId|Country
3. The third table (ip2location_db11) holds IP ranges:
ip_from|ip_to |country_code|country_name| city_name
4. The fourth table (ip_relation) would look like this:
CountryID |CityId|ip_from |ip_to
I created the fourth table to collect custom data from the other three tables in one place.
This should be done as follows:
join d_country and d_cities by id,
then compare the resulting names with the IP table;
where they match, fetch the ids for those names together with the matching IP ranges and put them in the fourth table.
So I wrote my code like this, and I need help to fix it:
INSERT ip_relations (CountryID, CityId,ip_from,ip_to)
SELECT *
FROM d_cities
INNER JOIN d_country ON d_cities.CountryID = d_country.CountryId
INNER JOIN ip2location_db11 ON ip2location_db11.country_name = d_country.Country
AND ip2location_db11.city_name = d_cities.City
This SQL statement does not work.
First, I am not sure why the tables are designed like this. Maybe you could change them as below:
d_cities: city_id | country_id | region_id |city | latitude| longitude
d_country: country_id | country
ip2location_db11: ip_from | ip_to | country_code | city_id
PS: I am not sure what country_code means, so I kept it. With the table structure above, country to city is one-to-many, city_id must be unique, and the IP ranges relate only to city_id.
I think this would be a better design.
Then, if you have to solve the problem based on your current tables:
you would add a unique key "UNIQUE KEY uniq_from_to (ip_from,ip_to)", and here is the SQL:
INSERT IGNORE ip_relation
(SELECT cA.country_id,cA.city_id,ip.ip_from,ip.ip_to FROM ip2location_db11 ip
LEFT JOIN (SELECT cy.country,ct.city,ct.country_id,ct.city_id FROM d_country cy,d_cities ct WHERE cy.country_id = ct.country_id) as cA ON ip.city_name = cA.city AND ip.country_name = cA.country);
This means: 1. find all city-country pairs; 2. based on those pairs, insert into your fourth table. When an (ip_from, ip_to) pair already exists, INSERT IGNORE skips the new row and keeps the existing one.
Hope this gives you some help.
UPDATE ip_relations, ip2location_db11, d_cities, d_country
set ip_relations.countryid =d_country.CountryID
, ip_relations.cityid= d_cities.CityId
, ip_relations.ip_from=ip2location_db11.ip_from
, ip_relations.ip_to=ip2location_db11.ip_to
WHERE ip_relations.countryID=d_country.countryID and ip_relations.cityID=d_cities.CityID and ip_relations.ip_from=ip2location_db11.ip_from
and ip_relations.ip_to=ip2location_db11.ip_to;
Assuming CountryID, CityId, the country names etc. are the same throughout all three tables:
SELECT
dco.countryid AS countryid
,dci.cityid AS cityid
,ip_from
,ip_to
FROM d_country dco
INNER JOIN d_cities dci
ON dco.countryid=dci.countryid
INNER JOIN ip2location_db11 ip
ON TRIM(dci.city)=TRIM(ip.city_name)
AND TRIM(dco.country)=TRIM(ip.country_name)
WHERE dco.countryid=ip.country_code
UPDATE
ip_relation ir
INNER JOIN (
SELECT
dco.countryid AS countryid
,dci.cityid AS cityid
,ip_from
,ip_to
FROM d_country dco
INNER JOIN d_cities dci
ON dco.countryid=dci.countryid
INNER JOIN ip2location_db11 ip
ON TRIM(dci.city)=TRIM(ip.city_name)
AND TRIM(dco.country)=TRIM(ip.country_name)
WHERE dco.countryid=ip.country_code
) DT
ON ir.CountryID=DT.countryid
AND ir.CityId=DT.cityid
SET
ir.ip_from=DT.ip_from
,ir.ip_to=DT.ip_to
The correct way to write the join is:
UPDATE
ip_relations ir
INNER JOIN ip2location_db11 idl ON ir.ip_to = idl.ip_to
INNER JOIN ip2location_db11 idr ON ir.ip_from = idr.ip_from
INNER JOIN d_cities dc ON ir.CityId = dc.CityId
INNER JOIN d_country dct ON ir.CountryID = dct.CountryId
SET
ir.CountryID = dct.CountryId,
ir.CityId = dc.CityId,
ir.ip_from = idr.ip_from,
ir.ip_to = idl.ip_to
-- put a WHERE condition here if required
But when you join on some keys and then update those same keys, they will all have the same values after the update, so the statement has no effect. It only becomes useful if you update other columns.
To understand this, consider an example:
two tables are joined on a key whose value is 3. The 3 comes from the second table, and you update the column of the first table with 3. Why would you need this? You need to update other columns, not the ones you are joining on.
INSERT INTO ip_relations (CityId,CountryID,ip_from,ip_to)
SELECT
d_cities.CityId,
d_cities.CountryID,
ip2location_db11.ip_from,
ip2location_db11.ip_to
FROM d_cities
INNER JOIN d_country ON d_cities.CountryID = d_country.CountryId
INNER JOIN ip2location_db11 ON ip2location_db11.country_name = d_country.Country
AND ip2location_db11.city_name = d_cities.City
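A quick way to check this INSERT ... SELECT is to run it on a few made-up rows. The sketch below does that with Python's sqlite3 and an in-memory database; the table definitions are reduced to the columns involved and the sample countries, cities and IP ranges are invented. The point it illustrates is that the SELECT list must name exactly the columns of the INSERT column list, in the same order, which is what "SELECT *" in the question fails to do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE d_country(CountryId INTEGER, Country TEXT);
CREATE TABLE d_cities(CityId INTEGER, CountryID INTEGER, City TEXT);
CREATE TABLE ip2location_db11(ip_from INTEGER, ip_to INTEGER, country_code TEXT,
                              country_name TEXT, city_name TEXT);
CREATE TABLE ip_relations(CountryID INTEGER, CityId INTEGER, ip_from INTEGER, ip_to INTEGER);

-- Hypothetical data: only the Cairo range has a matching city row.
INSERT INTO d_country VALUES (7, 'Egypt');
INSERT INTO d_cities VALUES (1, 7, 'Cairo'), (2, 7, 'Giza');
INSERT INTO ip2location_db11 VALUES
    (100, 200, 'EG', 'Egypt', 'Cairo'),
    (201, 300, 'EG', 'Egypt', 'Luxor');
""")

conn.execute("""
INSERT INTO ip_relations (CityId, CountryID, ip_from, ip_to)
SELECT
    d_cities.CityId,
    d_cities.CountryID,
    ip2location_db11.ip_from,
    ip2location_db11.ip_to
FROM d_cities
INNER JOIN d_country ON d_cities.CountryID = d_country.CountryId
INNER JOIN ip2location_db11 ON ip2location_db11.country_name = d_country.Country
                           AND ip2location_db11.city_name = d_cities.City
""")

relations = conn.execute(
    "SELECT CountryID, CityId, ip_from, ip_to FROM ip_relations"
).fetchall()
print(relations)
```

Only the Cairo range is copied; the Luxor range has no city row to match and Giza has no IP range, so the inner joins drop both.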
I have a database full of Pokemon Cards, and their attacks. I want to do a query to find the Pokemon that has the strongest attack by each type. I want the view to show just the name, type, and damage of the attack.
SELECT p2.MaxD, p2.Type, p1.name
FROM Pokemon p1
INNER JOIN ( SELECT type, MAX(damage) MaxD, pokemon_name FROM Attack GROUP BY Type )
p2 ON p1.type = p2.type AND p2.pokemon_name = p1.name
I have this code. It returns the highest damage but not the correct Pokemon. The Pokemon table doesn't have a damage field. I'm trying to get a grasp of joins.
Here is the structure:
Attack table has 4 fields: pokemon_name (the pokemon this attack belongs to), damage, name (name of the attack), and type (the type of pokemon this attack belongs to).
The Pokemon table has 3: HP, type (of the pokemon), and name (of the pokemon).
First of all, you have to build a select that gets the maximal damage for each type (you already have that):
SELECT type, MAX(damage) MaxD FROM Attack GROUP BY Type
Now, this won't perform well unless:
type is INT (or ENUM or another numeric type)
there's an index on type or on (type, damage)
You cannot select pokemon_name because MySQL doesn't guarantee that you'll get pokemon_name matching MaxD (here's a nice answer on stackoverflow which already covers this issue).
Now you can select the pokemon whose attack matches that maximal damage:
SELECT p1.pokemon_name, p1.type, p1.damage
FROM Attack p1
INNER JOIN (
SELECT type, MAX(damage) MaxD FROM Attack GROUP BY Type
) p2 ON p1.type = p2.type
AND p1.damage = p2.MaxD
GROUP BY p1.type, p1.damage
The last GROUP BY clause makes sure that several pokemons sharing the same maximal damage won't produce multiple records for one (type, damage) pair. Note that with ONLY_FULL_GROUP_BY enabled (the default in MySQL 5.7 and later) the non-aggregated pokemon_name has to be wrapped in an aggregate such as MIN().
Again, you will get better performance by replacing pokemon_name with pokemon_id. Maybe you should google database normalization for a while [wikipedia],[first tutorial]. You may also want to check out this Q&A; it provides a nice overview of what "relation table" means.
Now you have the correct pokemon_name (for your program's sake, I hope you'll replace this with pokemon_id) and you can put it all together:
SELECT p1.pokemon_name, p1.type, p1.damage, p.*
FROM Attack p1
INNER JOIN (
SELECT type, MAX(damage) MaxD FROM Attack GROUP BY Type
) p2 ON p1.type = p2.type
AND p1.damage = p2.MaxD
INNER JOIN Pokemon p
ON p.name = p1.pokemon_name
GROUP BY p1.type, p1.damage
Ideal example
In a perfect world your database would look like this:
-- Table with pokemons
CREATE TABLE `pokemons` (
`id` INT NOT NULL AUTO_INCREMENT,
`name` VARCHAR(255),
-- More fields
PRIMARY KEY (`id`)
)
-- This contains pairs such as (1, 'Water'), (2, 'Flame'), ...
CREATE TABLE `AttackTypes` (
`id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
`name` VARCHAR(255)
)
-- Create records like (1, 2, 3, 152)
-- 1 = automatically generated key
-- 2 = id of pokemon (let's say it's Pikachu :P)
-- 3 = type of attack (let's say it's Electric)
-- 152 = damage
-- This way each pokemon may have multiple attack types (Charizard flame + wind)
CREATE TABLE `Attacks` (
`id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
`pokemonID` INT NOT NULL, -- Represents pokemons.id
`typeID` INT NOT NULL, -- Represents AttackTypes.id
`damage` INT
)
ID fields are ALWAYS PRIMARY KEY, NOT NULL and AUTO_INCREMENT in this example.
And to select from it, again get the types first:
SELECT MAX(a.damage) AS mDmg, a.typeID
FROM Attacks AS a
GROUP BY a.typeID
Then get the pokemon ID:
SELECT a.pokemonID, a.damage, a.typeID
FROM Attacks AS a
INNER JOIN (
SELECT MAX(a.damage) AS mDmg, a.typeID
FROM Attacks AS a
GROUP BY a.typeID
) AS maxA
ON a.typeID = maxA.typeID
AND a.damage = maxA.mDmg
GROUP BY a.typeID
And once you've covered all that, you can actually select the pokemon data:
SELECT aMax.pokemonID as id,
aMax.damage,
p.name AS pokemonName,
aMax.typeID AS attackTypeID,
t.name AS attackType
FROM (
SELECT a.pokemonID, a.damage, a.typeID
FROM Attacks AS a
INNER JOIN (
SELECT MAX(a.damage) AS mDmg, a.typeID
FROM Attacks AS a
GROUP BY a.typeID
) AS maxA
ON a.typeID = maxA.typeID
AND a.damage = maxA.mDmg
GROUP BY a.typeID
) AS aMax
INNER JOIN pokemons AS p
ON p.id = aMax.pokemonID
INNER JOIN AttackTypes AS t
ON t.id = aMax.typeID
Performance hints:
you may add a MaxDamage field to AttackTypes (calculated by a stored procedure), which saves you one level of nested query
all ID fields should be PRIMARY KEYs
an index on Attacks.typeID allows you to quickly get all pokemons capable of that type of attack
an index on Attacks.damage allows you to quickly find the strongest attack
an index on (Attacks.typeID, Attacks.damage) (two fields) will be helpful when finding the max value for each attack type
an index on Attacks.pokemonID will make the lookup pokemon -> attack -> attack type name faster
I'm not really sure about your schema, but I assumed that pokemon_name in your Attack table is really the name of the pokemon.
SELECT a.*, c.*
FROM Attack a
INNER JOIN
(
SELECT type, MAX(damage) MaxD
FROM Attack
GROUP BY Type
) b ON a.Type = b.Type AND
a.damage = b.MaxD
INNER JOIN Pokemon c
ON c.Name = a.pokemon_name AND
c.Type = a.Type
The above query displays all fields from the Attack table and the Pokemon table, but if you are only interested in the name, damage and type, then you only need to query the Attack table:
SELECT a.*
FROM Attack a
INNER JOIN
(
SELECT type, MAX(damage) MaxD
FROM Attack
GROUP BY Type
) b ON a.Type = b.Type AND
a.damage = b.MaxD
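If you want to see the "join back to the per-type maximum" pattern run, here is a minimal sketch using Python's sqlite3 with an in-memory database. The attack rows are invented, and only the three columns asked for (name, type, damage) are selected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Attack(pokemon_name TEXT, damage INTEGER, name TEXT, type TEXT);

-- Hypothetical data: two attacks per type.
INSERT INTO Attack VALUES
    ('Pikachu',    90,  'Thunder',    'Electric'),
    ('Raichu',     60,  'Spark',      'Electric'),
    ('Charizard',  120, 'Fire Blast', 'Fire'),
    ('Charmander', 30,  'Ember',      'Fire');
""")

query = """
SELECT a.pokemon_name, a.type, a.damage
FROM Attack a
INNER JOIN (
    -- one row per type with that type's highest damage
    SELECT type, MAX(damage) AS MaxD
    FROM Attack
    GROUP BY type
) b ON a.type = b.type
   AND a.damage = b.MaxD      -- keep only the attacks that reach the maximum
ORDER BY a.type
"""
strongest = conn.execute(query).fetchall()
print(strongest)
```

Because the outer query has no GROUP BY, two pokemon that tie for the highest damage in a type would both be listed, which is usually the behaviour you want for this kind of report.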