Is this SQL exercise query correct? - mysql

I have this DB
aeroporto(1) <----> (n)volo(n) <----> (1)aereo
AEROPORTO: (pK)id_ap, città, naz, num_pist
VOLO: (pk)id_volo, data, (fk)id_part, oraPart, (fk)id_arr, oraArr, (fk)tipoAereo
AEREO: (pk)id_aereo, qta_merci, num_pass, cod_aereo
/* French cities from which more than twenty direct flights to Italy leave */
select a.citta
from volo as v, aereoporto as a, aereoporto as b
where a.id_ap = v.id_part and b.id_ap = v.id_arr and
a.nazione != b.nazione and a.nazione = 'francia' and count(b.citta = 'italia') > 20 ;
Is it correct?
Sorry for my poor English.

Return the city name of all cities leaving 'francia' and arriving in 'italia' having a count of flights > 20.
SELECT a.citta
FROM volo as v
INNER JOIN aereoporto as a
on a.id_ap = v.id_part
INNER JOIN aereoporto as b
on b.id_ap = v.id_arr
WHERE a.nazione = 'francia'
and b.nazione = 'italia'
GROUP BY a.citta
HAVING count(v.id_volo) > 20 ;
GROUP BY lets us group the rows for the same city in 'francia', and the HAVING clause lets us count the flights per group and keep only cities with more than 20 flights.
However, this assumes that a citta is not duplicated within a nazione. If the same city name existed in different regions of francia, we would have a problem: if there were a Paris in Brittany and a Paris in Île-de-France, and both had airports with flights to Italy, the count for a.citta = 'Paris' would combine both cities, which may not be the desired result. Without a unique identifier for a.citta to group on, this problem persists. So we may need to group by both a.citta and a.id_ap and display both in the select, so the user knows which airport we're talking about. Alternatively, I believe each airport is assigned a code that identifies it uniquely; if that were tracked as part of the airport information, we could group on it instead of a.id_ap. City alone isn't enough to make a record unique.
Example:
SELECT a.citta, a.id_ap
FROM volo as v
INNER JOIN aereoporto as a
on a.id_ap = v.id_part
INNER JOIN aereoporto as b
on b.id_ap = v.id_arr
WHERE a.nazione = 'francia'
and b.nazione = 'italia'
GROUP BY a.citta, a.id_ap
HAVING count(v.id_volo) > 20 ;
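To see the duplicate-city problem concretely, here is a minimal sketch using Python's sqlite3 module instead of MySQL. The two 'paris' airports and the lowered threshold of 2 (rather than 20) are made-up assumptions to keep the data small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE aereoporto (id_ap INTEGER PRIMARY KEY, citta TEXT, nazione TEXT);
CREATE TABLE volo (id_volo INTEGER PRIMARY KEY, id_part INTEGER, id_arr INTEGER);
-- hypothetical data: two different French airports share the city name 'paris'
INSERT INTO aereoporto VALUES
  (1, 'rome', 'italia'), (2, 'paris', 'francia'), (3, 'paris', 'francia');
-- airport 2 has two flights to rome, airport 3 has one
INSERT INTO volo (id_part, id_arr) VALUES (2, 1), (2, 1), (3, 1);
""")

BASE = """
SELECT {cols}
FROM volo AS v
INNER JOIN aereoporto AS a ON a.id_ap = v.id_part
INNER JOIN aereoporto AS b ON b.id_ap = v.id_arr
WHERE a.nazione = 'francia' AND b.nazione = 'italia'
GROUP BY {cols}
HAVING COUNT(v.id_volo) > ?
"""

threshold = 2  # stands in for the 20 of the exercise

# Grouping by city name alone adds the flights of both airports together.
by_city = conn.execute(BASE.format(cols="a.citta"), (threshold,)).fetchall()

# Grouping by city and airport id keeps the two airports apart.
by_airport = conn.execute(BASE.format(cols="a.citta, a.id_ap"), (threshold,)).fetchall()

print(by_city)     # the merged 'paris' group passes the threshold
print(by_airport)  # neither airport passes on its own
```

With this data the city-only grouping reports 'paris' (3 flights in total), while the per-airport grouping reports nothing, because neither airport has more than 2 flights by itself.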
I don't like using cross joins in the FROM clause, which is what the comma notation does. This style of join dates from the 1980s and should be given up in favor of the explicit INNER, LEFT, RIGHT, FULL OUTER and CROSS JOIN syntax.
My reasoning is that, in most cases, the FROM clause should define the tables being used and how they relate, while the WHERE clause should limit the data being returned. Mixing the two costs clarity and makes long-term maintenance harder. The only exception to this rule is an outer join, which may need to limit the data as part of the join condition to preserve the integrity of the outer join.
To address comment:
Something like this may work: if the total count of flights leaving a city's airport equals both the count of those flights departing from italia and the count arriving in italia, then all of its flights are internal/domestic; otherwise the airport is not shown. So the HAVING clause, like the WHERE, acts as a filter that excludes airports outside italia and airports with flights departing to other countries. This approach is less obvious about what it's doing, but from a performance standpoint it should do better than IN, NOT EXISTS, or subqueries would, given proper indexes on the fields.
SELECT a.citta, a.id_ap
FROM volo as v
INNER JOIN aereoporto as a
on a.id_ap = v.id_part
INNER JOIN aereoporto as b
on b.id_ap = v.id_arr
GROUP BY a.citta, a.id_ap
HAVING count(v.id_volo) = sum(case when a.nazione = 'italia' then 1 else 0 end) -- flights departing from italia
and count(v.id_volo) = sum(case when b.nazione = 'italia' then 1 else 0 end); -- flights arriving in italia
Given the ambiguity of the above, maintenance may be harder for the next person. Using an explicit NOT EXISTS, once for arrivals and once for departures, achieves the same results, but I do not believe it would perform as well. I prefer NOT EXISTS over NOT IN when dealing with a data set that can grow beyond about 50 values. Since this query works on data that looks like it will keep growing, NOT EXISTS seems like the second-best choice, ahead of NOT IN, with the HAVING approach, I believe, performing better on larger datasets.
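The HAVING comparison of COUNT against a conditional SUM can be tried on a few rows. This is only a sketch run through Python's sqlite3 rather than MySQL, and the airports and flights are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE aereoporto (id_ap INTEGER PRIMARY KEY, citta TEXT, nazione TEXT);
CREATE TABLE volo (id_volo INTEGER PRIMARY KEY, id_part INTEGER, id_arr INTEGER);
INSERT INTO aereoporto VALUES
  (1, 'rome', 'italia'), (2, 'milan', 'italia'), (3, 'paris', 'francia');
-- rome flies only to milan; milan flies to rome and to paris; paris flies to rome
INSERT INTO volo (id_part, id_arr) VALUES (1, 2), (1, 2), (2, 1), (2, 3), (3, 1);
""")

domestic_only = conn.execute("""
SELECT a.citta, a.id_ap
FROM volo AS v
INNER JOIN aereoporto AS a ON a.id_ap = v.id_part
INNER JOIN aereoporto AS b ON b.id_ap = v.id_arr
GROUP BY a.citta, a.id_ap
-- every flight of the group must both depart from and arrive in italia
HAVING COUNT(v.id_volo) = SUM(CASE WHEN a.nazione = 'italia' THEN 1 ELSE 0 END)
   AND COUNT(v.id_volo) = SUM(CASE WHEN b.nazione = 'italia' THEN 1 ELSE 0 END)
ORDER BY a.id_ap
""").fetchall()

print(domestic_only)
```

Only rome survives here: milan is dropped because one of its two flights goes to paris, and paris is dropped because it is not in italia.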

I tried hard to work out your example and the exact output you are trying to retrieve, and I think the query below should help you.
SELECT a.citta
FROM aereoporto as a, volo as v
WHERE a.id_ap = v.id_part AND a.nazione = 'francia'
GROUP BY a.citta having count(*) > 20 ;
Mockup Data
aereoporto
id_ap citta nazione
1 rome italia
2 milan italia
3 paris francia
4 bordeaux francia
volo
id_volo data id_part oraPart id_arr oraArr tipoAereo
1 NULL 3 NULL 1 NULL NULL
2 NULL 3 NULL 1 NULL NULL
3 NULL 3 NULL 2 NULL NULL
4 NULL 4 NULL 2 NULL NULL
5 NULL 4 NULL 1 NULL NULL

Related

Is there a better/faster way to compute an average delta between dates?

This is work on medical records. The goal is to compute the average number of days between two medical consultations, per patient, per care unit, per year. I'm stuck on large record sets: for small units with fewer than 50 patients / 200 consultations, the HQL query below (for one care unit and one year) works and is relatively quick, but for greater medical activity there is a combinatorial explosion with a heavy load on the database. And my wish is to analyze 10 years for some 80 care units in one run.
If you have any advice I would be very grateful!
SELECT
HB3 patient.pati_nip AS NIPP,
UPPER(cufm.cufm_libelle) AS CAT_UFM,
grp.unfo_libelle AS SECTEUR_DISP,
uf_ex.codeLibelle AS UNITE,
COUNT(DISTINCT raa.id) AS RAA,
COUNT(DISTINCT patient.id) AS PATIENTS,
ROUND(AVG(raa2.traa_date-raa.traa_date),1) AS DELAIMOY_J_INTER_RAA
FROM
Ide_patient AS patient
JOIN patient.pms_edgars AS redg
JOIN redg.bas_uf AS uf_ex
JOIN redg.pms_edgar_actes AS acte
JOIN acte.bas_catalogue_gen_by_Edgr_id_cage_nature AS type
JOIN acte.pms_raas as raa
JOIN patient.pms_edgars AS redg2
JOIN redg2.bas_uf AS uf_ex2
JOIN redg2.pms_edgar_actes AS acte2
JOIN acte2.bas_catalogue_gen_by_Edgr_id_cage_nature AS type2
JOIN acte2.pms_raas as raa2
JOIN uf_ex.bas_etablissement AS etab
JOIN uf_ex.bas_uf_by_Unfo_id_unfo_grp as grp
JOIN uf_ex.bas_categorie_ufm AS cufm
WHERE
etab.id = <ETAB>
AND raa.traa_date BETWEEN INVITE(D: Actes exportés effectués entre le ) AND INVITE(D: et le )
AND type.cage_code NOT LIKE 'R%'
AND uf_ex.id = INVITE(B:UF_MED_FILT_VAL: File active+nouveaux patients pour cette UF exécutante)
AND raa.traa_dat_export IS NOT NULL
AND raa2.traa_date = (SELECT MIN(raa3.traa_date)
FROM patient.pms_edgars AS redg3
JOIN redg3.bas_uf AS uf_ex3
JOIN redg3.pms_edgar_actes AS acte3
JOIN acte3.bas_catalogue_gen_by_Edgr_id_cage_nature AS type3
JOIN acte3.pms_raas as raa3
WHERE raa3.traa_dat_export IS NOT NULL
AND raa3.traa_date > raa.traa_date
AND uf_ex3.id = uf_ex
AND type3.cage_code NOT LIKE 'R%')
ORDER BY
patient.pati_nip, UPPER(cufm.cufm_libelle), grp.unfo_libelle, uf_ex.codeLibelle
https://stackoverflow.com/users/1766831/rick-james, here is the minimal query, with no delta computation and no aggregate functions:
SELECT
HB3 patient.id AS PATI_ID,
uf_ex.codeLibelle AS UNITE,
raa.traa_date AS DATE_CONSULT_DATE
FROM
Ide_patient AS patient
JOIN patient.pms_edgars AS redg
JOIN redg.bas_uf AS uf_ex
JOIN redg.pms_edgar_actes AS acte
JOIN acte.bas_catalogue_gen_by_Edgr_id_cage_nature AS type
JOIN acte.pms_raas as raa
JOIN uf_ex.bas_etablissement AS etab
WHERE
etab.id = <ETAB>
AND raa.traa_date BETWEEN INVITE(D: consultations between ) AND INVITE(D: and )
AND type.cage_code NOT LIKE 'R%'
AND uf_ex.id = INVITE(B:UF_MED_FILT_VAL: consultations done in this care-unit)
AND raa.traa_dat_export IS NOT NULL
GROUP BY uf_ex.codeLibelle, patient.id, raa.traa_date
=> The first letter of type.cage_code gives the type of consultation, IN ('E','D','G','A','R'); 'R' is excluded because the patient is not present (meeting of the medical team)
=> The goal is to compute, for all consultations (except R) of the same patient, the delta between two consecutive consultations in a time interval. The date format of raa.traa_date includes hours, minutes and seconds.
=> uf_ex.id is the ID of the medical care unit for the actual consultation
Step 1. CREATE TABLE tbl with pati_id, unite, and consult_date. Also, have a 4th column that is AUTO_INCREMENT PRIMARY KEY; let's call it id. (If you are using 8.0 or 10.2, use a "CTE" and WITH.)
Step 2. Use the above 'minimal' query to populate the 3 columns, letting id populate itself. Be sure to include ORDER BY pati_id, consult_date. (and maybe unite?)
Step 3. ALTER TABLE tbl ADD INDEX(pati_id, id)
Step 4. Do a self-join of that table with itself, but offset the id:
SELECT t1.pati_id,
DATEDIFF(t2.consult_date, t1.consult_date) AS gap
FROM tbl AS t1
JOIN tbl AS t2 ON t2.pati_id = t1.pati_id
AND t2.id = t1.id + 1
(I leave it to you to decide how UNITE fits in.)
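Steps 1 to 4 can be sketched end to end as follows. This uses Python's sqlite3 rather than MySQL, so AUTOINCREMENT stands in for AUTO_INCREMENT and julianday() differences stand in for DATEDIFF(). The consult table is a hypothetical stand-in for the output of the minimal query, and the sketch assumes the ids are handed out in the order of the ORDER BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- hypothetical stand-in for the rows returned by the minimal query (unordered)
CREATE TABLE consult (pati_id INTEGER, unite TEXT, consult_date TEXT);
INSERT INTO consult VALUES
  (2, 'U1', '2020-03-05'), (1, 'U1', '2020-01-31'),
  (1, 'U1', '2020-01-01'), (2, 'U1', '2020-03-01'), (1, 'U1', '2020-01-11');

-- Step 1: table with an auto-generated id
CREATE TABLE tbl (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  pati_id INTEGER, unite TEXT, consult_date TEXT);

-- Step 2: populate in patient/date order so consecutive ids are consecutive visits
INSERT INTO tbl (pati_id, unite, consult_date)
SELECT pati_id, unite, consult_date FROM consult
ORDER BY pati_id, consult_date;

-- Step 3: index supporting the self-join
CREATE INDEX idx_tbl_pati_id ON tbl (pati_id, id);
""")

# Step 4: join each row to the next id of the same patient and average the gaps
avg_gaps = conn.execute("""
SELECT t1.pati_id,
       ROUND(AVG(julianday(t2.consult_date) - julianday(t1.consult_date)), 1)
FROM tbl AS t1
JOIN tbl AS t2 ON t2.pati_id = t1.pati_id
              AND t2.id = t1.id + 1
GROUP BY t1.pati_id
ORDER BY t1.pati_id
""").fetchall()

print(avg_gaps)
```

Patient 1 has gaps of 10 and 20 days (average 15) and patient 2 a single gap of 4 days. Each row is joined to at most one other row, which is what avoids the combinatorial explosion of the original query.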

Subtract value of the same column with different where clauses and various rows

I need help. I saw similar questions here, but none of them helped me solve this query.
I want to subtract values of the same column selected with different where clauses, across several rows.
Given the two results below, the output needs to look like the final table:
Table with where clause 1
Product | Qty_totally| Name
PRODUCT A 10 HORGE
PRODUCT B 20 OMINION
PRODUCT C 30 LIKT
Table with where clause 2
Product | Qty_totally| Name
PRODUCT A 25 HORGE
PRODUCT B 50 OMINION
PRODUCT C 70 LIKT
Table with Final query
Product | Qty_totally| Name
PRODUCT A -15 HORGE
PRODUCT B -30 OMINION
PRODUCT C -40 LIKT
Help me please!!!!
I've tried this:
select descrição as 'Produto', sum(Quantidade_Total) as 'Quantidade_Entrada',Controle_armazem.Fornecedor as 'Fornecedor Controle Armazem' from Controle_armazem join produtos on controle_armazem.Modelo = produtos.idProdutos where Controle_armazem.Ativo = 1 and nota_fiscal is not null and nota_fiscal <> '' and defeito = 'Beneficiamento' AND situação = 'Beneficiado - Disponível para uso' GROUP BY descrição,Controle_armazem.fornecedor
Select descrição AS 'Produto', sum(Quantidade_Total) 'Quantidade Enviada', beneficiamento.Fornecedor as 'Fornecedor' From Beneficiamento Join controle_armazem On idPalete = palete join produtos on controle_armazem.Modelo = produtos.idProdutos WHERE Controle_armazem.Ativo = 1 And Beneficiamento.Ativo = 'A' GROUP BY descrição,Beneficiamento.fornecedor
I don't know how to subtract the value of "Quantidade_Total" column.
You'd simply join both queries. As there can be products in query #1 that are not in the results of query #2 and vice versa, you'd want a full outer join that MySQL does not provide. The best approach should be then to select from the products table and outer join both queries to it.
select
p.descrição as product,
coalesce(q1.qty_totally, 0) - coalesce(q2.qty_totally, 0) as qty_totally,
q1.name as name1,
q2.name as name2
from produtos p
left join (query #1 here) q1 on q1.product = p.descrição
left join (query #2 here) q2 on q2.product = p.descrição
where q1.product is not null or q2.product is not null;
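As a rough illustration of outer joining both aggregated queries to the products table, here is a sketch in Python's sqlite3. The tables entrada and saida and their columns are hypothetical simplifications of the two queries in the question, and the product names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE produtos (descricao TEXT PRIMARY KEY);
CREATE TABLE entrada (product TEXT, qty INTEGER);  -- stands in for query #1
CREATE TABLE saida (product TEXT, qty INTEGER);    -- stands in for query #2
INSERT INTO produtos VALUES ('A'), ('B'), ('C'), ('D'), ('E');
INSERT INTO entrada VALUES ('A', 4), ('A', 6), ('B', 20), ('C', 30);
INSERT INTO saida VALUES ('A', 25), ('B', 50), ('D', 5);
""")

diff_rows = conn.execute("""
SELECT p.descricao,
       COALESCE(q1.qty, 0) - COALESCE(q2.qty, 0) AS qty_totally
FROM produtos AS p
LEFT JOIN (SELECT product, SUM(qty) AS qty FROM entrada GROUP BY product) AS q1
       ON q1.product = p.descricao
LEFT JOIN (SELECT product, SUM(qty) AS qty FROM saida GROUP BY product) AS q2
       ON q2.product = p.descricao
-- drop products that appear in neither result
WHERE q1.product IS NOT NULL OR q2.product IS NOT NULL
ORDER BY p.descricao
""").fetchall()

print(diff_rows)
```

Product C appears only in the first result and D only in the second, yet both are reported, which is the effect a full outer join would have had. Product E is in neither result and is filtered out.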

SQL: 4 Tables to 1 table with counts, groups and deductions?!

I have a project with lost, found and matched luggage on airports. I made it in Java(FX) and mySQL.
This is what I have:
I have 4 tables:
1 table Airports with 2 columns:
Airport_id & Airport_name
1 table Found with 3 columns:
Found_id & Found_AirportID & Matched
1 table Lost with 3 columns:
Lost_id & Lost_AirportID & Matched
1 table Match with 4 columns:
Match_id & Match_LostID & Match_FoundID & Match_AirportID
Whenever a match is made, the Match table gets a new row with the Match_LostID (from the Lost_id), the Match_FoundID (from the Found_id) and the Match_AirportID (the Found_AirportID).
The Matched columns (in both Found & Lost) are then set to 1 instead of NULL.
All the AirportIDs are linked to the Airports table.
What I want:
For each and every airport I want the count of lost items, the count of found items and the count of matched items. BUT when an item is 'matched', it must not appear in the counts of lost and found.
So I want a table with 4 columns:
Airportname, Count of Found, Count of Lost, Count of Matched.
I've made the following Query:
SELECT vv.Airport_name,
COUNT(DISTINCT gb.Found_id) countFound,
COUNT(DISTINCT vb.Lost_id) countLost,
COUNT(DISTINCT kt.Match_id) countMatch
FROM Airports vv
LEFT JOIN Found gb ON vv.Airport_id = gb.Found_AirportID
LEFT JOIN Lost vb ON vv.Airport_id = vb.Lost_AirportID
LEFT JOIN Match kt ON vv.Airport_id = kt.Match_AirportID
WHERE vb.Matched IS NULL OR gb.Matched IS NULL
GROUP BY vv.Airport_name
I manage to get all the count items for Found, Lost and Match.
e.g. New York has 2 found, 2 lost and 1 match.
This is displayed correctly in the table.
But as I said, if there is a match it should be removed from found and lost. It should be:
New York has 1 found, 1 lost and 1 match.
I tried a lot of things. One time I managed to do it, but then an airport went missing, or the match was deducted from Found but not from Lost...
I do not know what the solution is; can someone explain it or give it to me?
Thanks in advance,
LTKort
Put the Matched IS NULL checks in the ON conditions of the LEFT JOIN, not WHERE.
SELECT vv.Airport_name,
COUNT(DISTINCT gb.Found_id) countFound,
COUNT(DISTINCT vb.Lost_id) countLost,
COUNT(DISTINCT kt.Match_id) countMatch
FROM Airports vv
LEFT JOIN Found gb ON vv.Airport_id = gb.Found_AirportID AND gb.Matched IS NULL
LEFT JOIN Lost vb ON vv.Airport_id = vb.Lost_AirportID AND vb.Matched IS NULL
LEFT JOIN Match kt ON vv.Airport_id = kt.Match_AirportID
GROUP BY vv.Airport_name
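The problem with doing it in WHERE is that the condition is applied to the combined rows after the joins: a row is kept as long as either its Lost or its Found item is unmatched, so a matched Found item is still counted whenever it is paired with an unmatched Lost item, and vice versa.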
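The difference between the two placements can be checked on the New York example from the question. This is a sketch in Python's sqlite3, not MySQL; the table name is double-quoted because MATCH is a keyword, and the second airport (Boston, with no items) is an added assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Airports (Airport_id INTEGER PRIMARY KEY, Airport_name TEXT);
CREATE TABLE Found (Found_id INTEGER PRIMARY KEY, Found_AirportID INTEGER, Matched INTEGER);
CREATE TABLE Lost (Lost_id INTEGER PRIMARY KEY, Lost_AirportID INTEGER, Matched INTEGER);
CREATE TABLE "Match" (Match_id INTEGER PRIMARY KEY, Match_LostID INTEGER,
                      Match_FoundID INTEGER, Match_AirportID INTEGER);
INSERT INTO Airports VALUES (1, 'New York'), (2, 'Boston');
-- New York: two found and two lost items, one of each already matched
INSERT INTO Found VALUES (1, 1, 1), (2, 1, NULL);
INSERT INTO Lost VALUES (1, 1, 1), (2, 1, NULL);
INSERT INTO "Match" VALUES (1, 1, 1, 1);
""")

QUERY = """
SELECT vv.Airport_name,
       COUNT(DISTINCT gb.Found_id),
       COUNT(DISTINCT vb.Lost_id),
       COUNT(DISTINCT kt.Match_id)
FROM Airports vv
LEFT JOIN Found gb ON vv.Airport_id = gb.Found_AirportID {found_on}
LEFT JOIN Lost vb ON vv.Airport_id = vb.Lost_AirportID {lost_on}
LEFT JOIN "Match" kt ON vv.Airport_id = kt.Match_AirportID
{where}
GROUP BY vv.Airport_name
ORDER BY vv.Airport_name
"""

# Filter in WHERE, as in the question
where_rows = conn.execute(QUERY.format(
    found_on="", lost_on="",
    where="WHERE vb.Matched IS NULL OR gb.Matched IS NULL")).fetchall()

# Filter in the ON conditions, as suggested in the answer
on_rows = conn.execute(QUERY.format(
    found_on="AND gb.Matched IS NULL", lost_on="AND vb.Matched IS NULL",
    where="")).fetchall()

print(where_rows)
print(on_rows)
```

The WHERE variant still reports 2 found and 2 lost for New York, while the ON variant reports 1 and 1, and the empty airport is kept in both.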
Alternatively, consider joining derived tables of aggregates to avoid many-to-many joins during the COUNT() evaluations:
SELECT a.Airport_name, ftbl.countFound, ltbl.countLost, mtbl.countMatched
FROM Airports a
LEFT JOIN
(SELECT f.Found_AirportID, COUNT(f.Found_id) AS countFound
FROM Found f
WHERE f.Matched IS NULL
GROUP BY f.Found_AirportID) As ftbl
ON a.Airport_id = ftbl.Found_AirportID
LEFT JOIN
(SELECT l.Lost_AirportID, COUNT(l.Lost_id) AS countLost
FROM Lost l
WHERE l.Matched IS NULL
GROUP BY l.Lost_AirportID) As ltbl
ON a.Airport_id = ltbl.Lost_AirportID
LEFT JOIN
(SELECT m.Match_AirportID, COUNT(m.Match_id) AS countMatched
FROM `Match` m
GROUP BY m.Match_AirportID) As mtbl
ON a.Airport_id = mtbl.Match_AirportID
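One detail worth checking with the derived-table version is what an airport without any rows gets: the LEFT JOIN produces NULL rather than 0, so wrapping the counts in COALESCE may be wanted. A sketch in Python's sqlite3, with invented data and the table name double-quoted because MATCH is a keyword:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Airports (Airport_id INTEGER PRIMARY KEY, Airport_name TEXT);
CREATE TABLE Found (Found_id INTEGER PRIMARY KEY, Found_AirportID INTEGER, Matched INTEGER);
CREATE TABLE Lost (Lost_id INTEGER PRIMARY KEY, Lost_AirportID INTEGER, Matched INTEGER);
CREATE TABLE "Match" (Match_id INTEGER PRIMARY KEY, Match_LostID INTEGER,
                      Match_FoundID INTEGER, Match_AirportID INTEGER);
INSERT INTO Airports VALUES (1, 'New York'), (2, 'Boston');
INSERT INTO Found VALUES (1, 1, 1), (2, 1, NULL);
INSERT INTO Lost VALUES (1, 1, 1), (2, 1, NULL), (3, 1, NULL);
INSERT INTO "Match" VALUES (1, 1, 1, 1);
""")

counts = conn.execute("""
SELECT a.Airport_name,
       COALESCE(ftbl.countFound, 0),    -- NULL means no unmatched found items
       COALESCE(ltbl.countLost, 0),
       COALESCE(mtbl.countMatched, 0)
FROM Airports a
LEFT JOIN (SELECT Found_AirportID, COUNT(*) AS countFound
           FROM Found WHERE Matched IS NULL
           GROUP BY Found_AirportID) AS ftbl
       ON a.Airport_id = ftbl.Found_AirportID
LEFT JOIN (SELECT Lost_AirportID, COUNT(*) AS countLost
           FROM Lost WHERE Matched IS NULL
           GROUP BY Lost_AirportID) AS ltbl
       ON a.Airport_id = ltbl.Lost_AirportID
LEFT JOIN (SELECT Match_AirportID, COUNT(*) AS countMatched
           FROM "Match"
           GROUP BY Match_AirportID) AS mtbl
       ON a.Airport_id = mtbl.Match_AirportID
ORDER BY a.Airport_name
""").fetchall()

print(counts)
```

Because each derived table is aggregated before it is joined, every airport matches at most one row per table, so plain COUNT is enough and DISTINCT is not needed.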

SQL Select outer Join and Group By and IFNULL ( COUNT ) HELP =x

I have this database and I want to write a larger select, but I think it is too hard for me. I tried many ways and got really close, but I can't get any further.
Database
Table -> Candidato | IdCandidato(int) | idNome(varchar)
Table -> Voto | idVoto(int) | Candidato_idCandidato(int) | DiaVotacao(date)
I am creating a web voting system and need a select to complete my charts, showing the total votes per day for each candidate.
Candidato = candidate | voto = vote | diaVotacao = voting day (English translation)
I need a response like this:
| VotingDay  | Candidate1       | Candidate2 | Candidate3 |
| 2014-05-14 | 13 (total votes) | 4          | 10         |
| 2014-05-15 | 18 (total votes) | 0          | 8          |
and so far I got this:
| VotingDay  | TOTAL Votes      | Name       |
| 2014-05-14 | 13 (total votes) | Candidate1 |
| 2014-05-14 | 18 (total votes) | Candidate2 |
| 2014-05-15 | 10 (total votes) | Candidate1 |
| 2014-05-15 | 8 (total votes)  | Candidate2 |
I used the following code:
SELECT voto.DiaVotacao, IFNULL(COUNT(voto.Candidato_idCandidato),0) as Votos, candidato.Nome
FROM candidato LEFT OUTER JOIN voto ON voto.Candidato_idCandidato=candidato.idCandidato
GROUP BY voto.DiaVotacao, voto.Candidato_idCandidato
Note that I want the count of votes for each candidate for every day, and if there are no votes, the number 0 should appear to indicate that.
Does that make sense?
You need to generate the full list of candidates and days and then do the left outer join. You can get the list of days from the voto table.
Note that count(<column>) returns 0 when all the values are NULL, so there is no need for IFNULL() or COALESCE():
SELECT d.DiaVotacao, COUNT(v.Candidato_idCandidato) as Votos, c.Nome
FROM candidato c cross join
(SELECT DISTINCT v.DiaVotacao FROM voto v
) d LEFT OUTER JOIN
voto v
ON v.Candidato_idCandidato = c.idCandidato and
v.DiaVotacao = d.DiaVotacao
GROUP BY d.DiaVotacao, c.idCandidato, c.Nome;
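A small sketch of the cross join approach, run through Python's sqlite3 rather than MySQL. The candidate names and votes are invented, and the grouping uses the candidate's own columns (c.idCandidato, c.Nome) so that candidates without votes on a day keep separate rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE candidato (idCandidato INTEGER PRIMARY KEY, Nome TEXT);
CREATE TABLE voto (idVoto INTEGER PRIMARY KEY,
                   Candidato_idCandidato INTEGER, DiaVotacao TEXT);
INSERT INTO candidato VALUES (1, 'Ana'), (2, 'Bruno'), (3, 'Carla');
-- Carla never receives a vote; Bruno has none on the second day
INSERT INTO voto (Candidato_idCandidato, DiaVotacao) VALUES
  (1, '2014-05-14'), (1, '2014-05-14'), (2, '2014-05-14'), (1, '2014-05-15');
""")

votes = conn.execute("""
SELECT d.DiaVotacao, c.Nome, COUNT(v.idVoto) AS Votos
FROM candidato c
CROSS JOIN (SELECT DISTINCT DiaVotacao FROM voto) d   -- every candidate x every day
LEFT OUTER JOIN voto v
       ON v.Candidato_idCandidato = c.idCandidato
      AND v.DiaVotacao = d.DiaVotacao
GROUP BY d.DiaVotacao, c.idCandidato, c.Nome
ORDER BY d.DiaVotacao, c.idCandidato
""").fetchall()

for row in votes:
    print(row)
```

Every day/candidate pair appears exactly once, and COUNT over the nullable vote column yields 0 for pairs without votes, with no IFNULL involved.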

Using DISTINCT inside JOIN is creating trouble [duplicate]

Possible Duplicate:
How can I modify this query with two Inner Joins so that it stops giving duplicate results?
I'm having trouble getting my query to work.
SELECT itpitems.identifier, itpitems.name, itpitems.subtitle, itpitems.description, itpitems.itemimg, itpitems.mainprice, itpitems.upc, itpitems.isbn, itpitems.weight, itpitems.pages, itpitems.publisher, itpitems.medium_abbr, itpitems.medium_desc, itpitems.series_abbr, itpitems.series_desc, itpitems.voicing_desc, itpitems.pianolevel_desc, itpitems.bandgrade_desc, itpitems.category_code, itprank.overall_ranking, itpitnam.name AS artist, itpitnam.type_code FROM itpitems
INNER JOIN itprank ON (itprank.item_number = itpitems.identifier)
INNER JOIN (SELECT DISTINCT type_code FROM itpitnam) itpitnam ON (itprank.item_number = itpitnam.item_number)
WHERE mainprice > 1
LIMIT 3
I keep getting Unknown column 'itpitnam.name' in 'field list'.
However, if I change DISTINCT type_code to *, I do not get that error, but I do not get the results I want either.
This is a big result table so I am making a dummy example...
With *, I get something like:
+------------+------+-----------+
| identifier | name | type_code |
+------------+------+-----------+
| 2          | Joe  | A         |
| 2          | Amy  | R         |
| 7          | Mike | B         |
+------------+------+-----------+
The problem here is that I have two instances of identifier = 2 because the type_code is different. I have tried GROUP BY at the outside end of the query, but it is sifting through so many records it creates too much strain on the server, so I'm trying to find an alternative way of getting the results I need.
What I want to achieve (using the same dummy output) would look something like this:
+------------+------+-----------+
| identifier | name | type_code |
+------------+------+-----------+
| 2          | Joe  | A         |
| 7          | Mike | B         |
| 8          | Sam  | R         |
+------------+------+-----------+
It should skip over the duplicate identifier regardless if type_code is different.
Can someone help me modify this query to get the results as simulated in the above chart?
One approach is to use an inline view, like the query you already have. But instead of using DISTINCT, you would use a GROUP BY to eliminate duplicates. The simplest inline view to satisfy your requirements would be:
( SELECT n.item_number, n.name, n.type_code
FROM itpitnam n
GROUP BY n.item_number
) itpitnam
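Note that it is not deterministic which row from itpitnam the values for name and type_code are taken from, and MySQL rejects this query when the ONLY_FULL_GROUP_BY SQL mode is enabled (the default since MySQL 5.7.5). A more elaborate inline view can make this more specific.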
Another common approach to this type of problem is to use a correlated subquery in the SELECT list. For returning a small set of rows, this can perform reasonably well. But for returning large sets, there are more efficient approaches.
SELECT i.identifier
, i.name
, i.subtitle
, i.description
, i.itemimg
, i.mainprice
, i.upc
, i.isbn
, i.weight
, i.pages
, i.publisher
, i.medium_abbr
, i.medium_desc
, i.series_abbr
, i.series_desc
, i.voicing_desc
, i.pianolevel_desc
, i.bandgrade_desc
, i.category_code
, r.overall_ranking
, ( SELECT n1.name
FROM itpitnam n1
WHERE n1.item_number = r.item_number
ORDER BY n1.type_code, n1.name
LIMIT 1
) AS artist
, ( SELECT n2.type_code
FROM itpitnam n2
WHERE n2.item_number = r.item_number
ORDER BY n2.type_code, n2.name
LIMIT 1
) AS type_code
FROM itpitems i
JOIN itprank r
ON r.item_number = i.identifier
WHERE mainprice > 1
LIMIT 3
That query will return the specified resultset, with one significant difference. The original query shows an INNER JOIN to the itpitnam table, which means a row is returned ONLY if there is a matching row in itpitnam. The query above, however, emulates an OUTER JOIN: it will return a row even when no matching row is found in itpitnam.
UPDATE
For best performance of those correlated subqueries, you'll want an appropriate index available,
... ON itpitnam (item_number, type_code, name)
That index is most appropriate because it's a "covering index": the query can be satisfied entirely from the index without referencing data pages in the underlying table. There's also an equality predicate on the leading column and an ORDER BY on the next two columns, so that will avoid a "sort" operation.
--
If you have a guarantee that either the type_code or name column in the itpitnam table is NOT NULL, you can add a predicate to eliminate the rows that are "missing" a matching row, e.g.
HAVING artist IS NOT NULL
(Adding that will likely have an impact on performance.) Absent that kind of guarantee, you'd need to add an INNER JOIN or a predicate that tests for the existence of a matching row, to get an INNER JOIN behavior.
SELECT a.*,
b.overall_ranking,
c.name AS artist,
c.type_code
FROM itpitems a
INNER JOIN itprank b
ON b.item_number = a.identifier
INNER JOIN itpitnam c
ON b.item_number = c.item_number
INNER JOIN
(
SELECT item_number, MAX(type_code) code
FROM itpitnam
GROUP BY item_number
) d ON c.item_number = d.item_number AND
c.type_code = d.code
WHERE mainprice > 1
LIMIT 3
Follow-up question: can you please post the table schema and how are the tables related with each other? So I will know what are the columns to be linked.
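The join against a per-item MAX(type_code) from the last query can be tried on the dummy rows from the question. This is a sketch in Python's sqlite3 that uses only a cut-down itpitnam table; the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE itpitnam (item_number INTEGER, name TEXT, type_code TEXT);
-- dummy rows from the question: item 2 has two names with different type codes
INSERT INTO itpitnam VALUES (2, 'Joe', 'A'), (2, 'Amy', 'R'), (7, 'Mike', 'B');
""")

one_per_item = conn.execute("""
SELECT c.item_number, c.name, c.type_code
FROM itpitnam c
INNER JOIN (
    SELECT item_number, MAX(type_code) AS code   -- one type_code per item
    FROM itpitnam
    GROUP BY item_number
) d ON c.item_number = d.item_number
   AND c.type_code = d.code
ORDER BY c.item_number
""").fetchall()

print(one_per_item)
```

With MAX, item 2 resolves to Amy (type R); replacing MAX with MIN would pick Joe (type A), as in the desired output shown in the question. If two rows of the same item shared the same type_code, both would still be returned.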