Simple inner join not working - mysql

I'm trying to query the sum of the populations of all cities where the CONTINENT is 'Asia'.
The two tables CITY and COUNTRY are as follows,
city - id, countrycode, name, population
country - code, name, continent, population
Here's my query
SELECT SUM(POPULATION) FROM COUNTRY CITY
JOIN ON COUNTRY.CODE = CITY.COUNTRYCODE
WHERE CONTINENT = "Asia";
This doesn't work. What am I doing wrong? I'm new to SQL.

It isn't working because, the way you've written it, CITY is being interpreted as a table alias for COUNTRY, which leaves the JOIN with no table to join. Additionally, it looks like you've got a POPULATION column in each table, so you need to disambiguate it. Let me rewrite the query for you:
SELECT SUM(CITY.POPULATION)
FROM COUNTRY
JOIN CITY
ON COUNTRY.CODE = CITY.COUNTRYCODE
WHERE COUNTRY.CONTINENT = "Asia";
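To see both points in action, here is a minimal sketch using Python's built-in sqlite3 module rather than MySQL. The table contents are made up for illustration, and SQLite's error message differs from MySQL's, but the ambiguity problem and the fix are the same idea.

```python
import sqlite3

# Hypothetical sample data; only the column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE country (code TEXT PRIMARY KEY, name TEXT, continent TEXT, population INTEGER);
    CREATE TABLE city (id INTEGER PRIMARY KEY, countrycode TEXT, name TEXT, population INTEGER);
    INSERT INTO country VALUES ('JPN', 'Japan', 'Asia', 1000), ('FRA', 'France', 'Europe', 900);
    INSERT INTO city VALUES (1, 'JPN', 'Tokyo', 300), (2, 'JPN', 'Osaka', 200), (3, 'FRA', 'Paris', 400);
""")

# An unqualified POPULATION is ambiguous because both tables have that column.
try:
    conn.execute(
        "SELECT SUM(population) FROM country JOIN city ON country.code = city.countrycode"
    )
    ambiguous_error = None
except sqlite3.OperationalError as exc:
    ambiguous_error = str(exc)

# Qualifying the column and naming both tables around JOIN gives the city total.
asia_city_population = conn.execute("""
    SELECT SUM(city.population)
    FROM country
    JOIN city ON country.code = city.countrycode
    WHERE country.continent = 'Asia'
""").fetchone()[0]

print(ambiguous_error)
print(asia_city_population)
```

With this sample data only the two Japanese cities count towards the total.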

I know the question has already been answered, but I would like to offer an optimised solution. The query below should reduce the execution time and use fewer resources.
select sum(a.population) from city a
inner join(select * from country where continent = 'Asia') b
on a.countrycode=b.code;
To explain a bit further: as you can see, I'm applying the filter condition before performing the join. During the reshuffling phase there is therefore much less data to move, so the query takes less time to execute. You will not see a drastic difference on small data sets, but on large data sets the performance improvement becomes noticeable.
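A minimal sketch, assuming made-up sample rows and using Python's sqlite3 instead of MySQL, that only checks the two forms return the same total. It says nothing about timing. Whether the pre-filtered form is actually faster depends on the engine and the data, so comparing the plans with EXPLAIN on your real tables would be the way to confirm it.

```python
import sqlite3

# Hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE country (code TEXT PRIMARY KEY, name TEXT, continent TEXT, population INTEGER);
    CREATE TABLE city (id INTEGER PRIMARY KEY, countrycode TEXT, name TEXT, population INTEGER);
    INSERT INTO country VALUES ('JPN', 'Japan', 'Asia', 1000), ('IND', 'India', 'Asia', 2000),
                               ('FRA', 'France', 'Europe', 900);
    INSERT INTO city VALUES (1, 'JPN', 'Tokyo', 300), (2, 'IND', 'Mumbai', 250), (3, 'FRA', 'Paris', 400);
""")

# Filter applied after the join, in the WHERE clause.
filter_after_join = conn.execute("""
    SELECT SUM(a.population)
    FROM city a
    INNER JOIN country b ON a.countrycode = b.code
    WHERE b.continent = 'Asia'
""").fetchone()[0]

# Filter applied inside a derived table, before the join.
filter_before_join = conn.execute("""
    SELECT SUM(a.population)
    FROM city a
    INNER JOIN (SELECT * FROM country WHERE continent = 'Asia') b
        ON a.countrycode = b.code
""").fetchone()[0]

print(filter_after_join, filter_before_join)
```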

The JOIN needs to go between the two table names:
SELECT SUM(CITY.POPULATION) FROM COUNTRY INNER JOIN CITY
ON COUNTRY.CODE = CITY.COUNTRYCODE
WHERE CONTINENT = "Asia";

MySQL JOIN syntax manual
SELECT SUM(CITY.POPULATION)
FROM COUNTRY
JOIN CITY
ON COUNTRY.CODE = CITY.COUNTRYCODE
WHERE CONTINENT = "Asia";

SELECT SUM(CITY.POPULATION)
FROM CITY
INNER JOIN COUNTRY ON CITY.COUNTRYCODE = COUNTRY.Code
where COUNTRY.CONTINENT = 'Asia';
Line 3 uses INNER JOIN because the two tables share a common column (the country code) that the rows can be matched on.

SELECT sum(city.population) FROM city LEFT JOIN country ON city.countrycode=country.code
WHERE country.continent='Asia'

You can run the following code using Oracle.
SELECT SUM(c.POPULATION)
FROM CITY c
INNER JOIN COUNTRY co ON c.CountryCode = co.Code
WHERE CONTINENT ='Asia' ;

select SUM(cty.POPULATION) from COUNTRY cntry, CITY cty where cty.COUNTRYCODE=cntry.CODE AND cntry.CONTINENT='Asia';

select sum(S.Population)
from City S
where S.CountryCode in (select Code
from Country C
where CONTINENT = 'Asia');

Related

SQL max is non-deterministic?

I have two tables: cities and states. States has columns for state codes and full name. Cities contains columns for population, state code, and the city name. My goal is to create a table of the city in each state with the highest population.
This is my solution which seems to work in a test, but I've been told that using max() is non-deterministic and I should use a window function instead.
SELECT
s.name,
c.name,
max(c.population)
FROM cities AS c
LEFT JOIN states AS s
ON c.state_code = s.code
GROUP BY s.name
ORDER BY s.name;
What is wrong with using max here, when would it give incorrect results?
In most databases your query would not even run, because you are selecting the non-aggregated column c.name without also using it in the GROUP BY clause.
For MySQL, the code would run if ONLY_FULL_GROUP_BY mode is disabled, but it would still return wrong results because the query would pick an arbitrary city name out of all the cities of each state.
For SQLite, your query is correct!
SQLite's bare columns feature makes sure that the city name you get in the results is the one that has the max population.
This is non-standard, but it is documented.
The only problem here is that if there are 2 or more cities with the same max population you will get only one of them in the results.
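A minimal sketch of the SQLite behaviour described above, run through Python's sqlite3 module. The rows are hypothetical, and the big city is deliberately inserted in the middle so the result cannot be explained by insertion order.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE states (code TEXT, name TEXT);
    CREATE TABLE cities (name TEXT, population INTEGER, state_code TEXT);
    INSERT INTO states VALUES ('s01', 'state1'), ('s02', 'state2');
    INSERT INTO cities VALUES ('small', 10, 's01'), ('big', 100, 's01'),
                              ('mid', 50, 's01'), ('only', 7, 's02');
""")

# The query from the question: c.name is a bare (non-aggregated) column.
# In SQLite it is taken from the row that supplied MAX(c.population).
bare_column_rows = conn.execute("""
    SELECT s.name, c.name, MAX(c.population)
    FROM cities AS c
    LEFT JOIN states AS s ON c.state_code = s.code
    GROUP BY s.name
    ORDER BY s.name
""").fetchall()

print(bare_column_rows)
```

This only demonstrates SQLite. The same statement is rejected or gives an arbitrary city name in other engines, as noted above.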
You can find the city in each state with the max population and use it in a sub-query and join it with the tables.
Query
select s.name as state, c.name as city, c.population
from states s
join cities c
on c.state_code = s.code
join (
select state_code, max(population) as max_pop
from cities
group by state_code
) as p
on p.state_code = c.state_code
and p.max_pop = c.population;
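Here is a small sketch of that query, again using Python's sqlite3 with made-up rows. It includes a tie on purpose to show that this approach returns every city that shares the maximum. The ORDER BY is an addition of mine so the output order is predictable.

```python
import sqlite3

# Hypothetical sample data with two cities tied for the maximum in state1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE states (code TEXT, name TEXT);
    CREATE TABLE cities (code TEXT, name TEXT, population INTEGER, state_code TEXT);
    INSERT INTO states VALUES ('s01', 'state1'), ('s02', 'state2');
    INSERT INTO cities VALUES ('c01', 'city1', 100, 's01'), ('c02', 'city2', 100, 's01'),
                              ('c03', 'city3', 10, 's01'), ('c04', 'city4', 5, 's02');
""")

# Join each city to the per-state maximum and keep only the matching rows.
largest_cities = conn.execute("""
    SELECT s.name AS state, c.name AS city, c.population
    FROM states s
    JOIN cities c ON c.state_code = s.code
    JOIN (
        SELECT state_code, MAX(population) AS max_pop
        FROM cities
        GROUP BY state_code
    ) AS p ON p.state_code = c.state_code AND p.max_pop = c.population
    ORDER BY s.name, c.name
""").fetchall()

print(largest_cities)
```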
create table states(code varchar(50),name varchar(50));
create table cities(code varchar(50),name varchar(50),population int, state_code varchar(50));
insert into states values('s01','state1');
insert into cities values('c01','city1',100,'s01');
insert into cities values('c02','city2',10,'s01');
Query:
with cte as
(
SELECT
s.name state_name,
c.name city_name,
c.population,
row_number()over(partition by s.name order by c.population desc)rn
FROM cities AS c
LEFT JOIN states AS s
ON c.state_code = s.code
)
select state_name, city_name, population from cte where rn=1
Output:
state_name | city_name | population
state1 | city1 | 100

Problems with subqueries using Sakila

Using the Sakila DB, I am trying to get the country name, the number of cities a country has, and the number of addresses in that country.
Using the following query I get the country and the number of cities:
SELECT CO.country,COUNT(CI.city_id)
FROM city CI
INNER JOIN country CO ON CO.country_id = CI.country_id
GROUP BY CO.country;
Using this other one I get the number of addresses:
SELECT CO.country,COUNT(A.address_id)
FROM city CI
INNER JOIN address A ON A.city_id=CI.city_id
INNER JOIN country CO ON CI.country_id=CO.country_id
GROUP BY CO.country;
I was given a hint to use subqueries to get the desired results, but I can't work out how to get all of that in one table. Any suggestions?
This is actually a tricky problem. Your join approach can be made to work with some slight modifications. The total row count in each country group gives the number of addresses in that country. To get the city count for a country, we count the distinct city IDs in each country group. DISTINCT is needed because the join to the address table causes each city to be replicated once for every address in that city. Taking the distinct city count gets around this problem.
SELECT
co.country_id,
COUNT(DISTINCT ci.city_id) AS city_cnt,
COUNT(a.city_id) AS address_cnt
FROM country co
INNER JOIN city ci
ON co.country_id = ci.country_id
INNER JOIN address a
ON ci.city_id = a.city_id
GROUP BY
co.country_id;
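To make the fan-out visible, here is a minimal sketch with a cut-down, hypothetical version of the three Sakila tables, run through Python's sqlite3. The extra naive_city_cnt column is my addition, there only to show what the count looks like without DISTINCT.

```python
import sqlite3

# Cut-down stand-ins for the Sakila tables with made-up rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE country (country_id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE city (city_id INTEGER PRIMARY KEY, city TEXT, country_id INTEGER);
    CREATE TABLE address (address_id INTEGER PRIMARY KEY, city_id INTEGER);
    INSERT INTO country VALUES (1, 'A'), (2, 'B');
    INSERT INTO city VALUES (10, 'a1', 1), (11, 'a2', 1), (20, 'b1', 2);
    INSERT INTO address VALUES (100, 10), (101, 10), (102, 11), (200, 20);
""")

# Each city row is repeated once per address, so a plain COUNT over-counts cities.
counts_per_country = conn.execute("""
    SELECT
        co.country_id,
        COUNT(DISTINCT ci.city_id) AS city_cnt,
        COUNT(ci.city_id)          AS naive_city_cnt,
        COUNT(a.city_id)           AS address_cnt
    FROM country co
    INNER JOIN city ci ON co.country_id = ci.country_id
    INNER JOIN address a ON ci.city_id = a.city_id
    GROUP BY co.country_id
    ORDER BY co.country_id
""").fetchall()

print(counts_per_country)
```

Country 1 has two cities and three addresses, and the naive count reports three cities because city 10 appears twice after the join.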
You can achieve the result using the subquery approach below. This is mainly to show how it can be written. A join (see the answer from Tim Biegeleisen) is recommended over subqueries, as it generally performs better.
select
Co.Country,
(Select COUNT(1) from City Ci where Ci.country_id=Co.country_id) CityCount,
(Select COUNT(1) from Address A Join City C on A.city_id=C.city_id where C.country_id=Co.country_id) AddressCount
From Country Co

"Invert" the output to list everything that is not printed/listed at this time

Playing around with SQL. I am trying to list the name of every country where all cities have an individual population count of less than 100 000 people.
The code below gives me every country that has a city with more than 100 000 people, so in other words I am trying to "invert" the output to list everything that is not listed at this time.
Suggestions?
Select distinct country.Name from country,city
where city.CountryCode = country.Code and city.population > 100000;
A typical way to handle this uses aggregation and having:
select co.Name
from country co left join
city ci
on ci.CountryCode = co.Code
group by co.Name
having coalesce(max(ci.population), 0) <= 100000;
The coalesce() and left join take into account countries that have no cities.
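A minimal sketch of why the left join matters, using Python's sqlite3 and three made-up countries, one of which has no cities at all. The inner-join variant is my own addition for comparison.

```python
import sqlite3

# Hypothetical sample data: Emptyland has no rows in city.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE country (Code TEXT PRIMARY KEY, Name TEXT);
    CREATE TABLE city (Name TEXT, CountryCode TEXT, population INTEGER);
    INSERT INTO country VALUES ('AAA', 'Smallland'), ('BBB', 'Bigland'), ('CCC', 'Emptyland');
    INSERT INTO city VALUES ('s1', 'AAA', 50000), ('s2', 'AAA', 90000),
                            ('b1', 'BBB', 50000), ('b2', 'BBB', 200000);
""")

query_template = """
    SELECT co.Name
    FROM country co {join_type} JOIN city ci ON ci.CountryCode = co.Code
    GROUP BY co.Name
    HAVING COALESCE(MAX(ci.population), 0) <= 100000
    ORDER BY co.Name
"""

# LEFT JOIN keeps Emptyland; MAX over its single all-NULL row is NULL, hence COALESCE.
with_left_join = [row[0] for row in conn.execute(query_template.format(join_type="LEFT"))]
# INNER JOIN drops countries without cities before grouping even starts.
with_inner_join = [row[0] for row in conn.execute(query_template.format(join_type="INNER"))]

print(with_left_join)
print(with_inner_join)
```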
For reference, the equivalent query for the version in your question:
select co.Name
from country co left join
city ci
on ci.CountryCode = co.Code
group by co.Name
having max(ci.population) > 100000;
I think this could be it.
SELECT name FROM country WHERE Population IN
(
SELECT population FROM city WHERE Population < 100000
)
You are looking for all countries which do not have a city with population > 100000 - something like this should do it:
SELECT country.Name FROM country
WHERE country.Code
NOT IN (SELECT DISTINCT city.CountryCode FROM city WHERE city.population > 100000)
As most other comments already said, flip the > to <=. I also rewrote your query to use an explicit join; you might want to look into those too.
SELECT DISTINCT country.Name
FROM country INNER JOIN city ON (city.CountryCode = country.Code)
WHERE city.population <= 100000;

MySQL error 1242: Subquery returns more than 1 row

I'm working on some SQL homework, and I've come to a dead-end on this one question and I'm hoping someone can point out what exactly I'm doing wrong here.
SELECT Name,
(SELECT Name
FROM City
WHERE City.CountryCode = Country.Code) AS 'city',
(SELECT Population
FROM City
WHERE City.CountryCode = Country.Code) AS 'city_population'
FROM Country
WHERE Region IN ('Western Europe')
HAVING city_population > (SUM(Population) / COUNT(city))
ORDER BY Name, city;
What I'm trying to do here is retrieve from a database of global statistics a list of cities (from the City table) matched with their Country from that table, in which the country is in the region of Western Europe and the population of the city is greater than the average population of cities for its country, ordered by country and city name. The CountryCode and Code are the keys for the tables.
Can anyone tell me where I'm going wrong? I'm guessing MySQL is unhappy because my subqueries are returning more rows than the selector for country names does, but that's exactly what I want to do. I want multiple rows for a country value, one row for each city that meets the search criteria of having greater than average populations. The assignment also specifically forbids me from using joins to solve this problem.
A join should do it. You can join city on country code, and filter out cities that have a lower than average population
select
co.Name as CountryName,
ci.Name as CityName,
ci.Population as CityPopulation
from
Country co
inner join City ci
on ci.CountryCode = co.Code
where
co.Region in ('Western Europe')
and ci.Population >
(select sum(ca.Population) / count(*) from City ca
where ca.CountryCode = co.Code)
Additions:
Since you are not allowed to use joins, you could solve it in a couple of ways.
1) You can alter your query a little bit, but it won't return a row for each city. Instead it will return the list of cities as a single field. This is only a slight modification of your query. Note the GROUP_CONCAT function, which works like SUM except that it concatenates the values instead of summing them. Also note the ORDER BY on the city name in both subselects, so you can make sure the nth population matches the nth city name.
SELECT Name,
(SELECT GROUP_CONCAT(Name ORDER BY Name)
FROM City
WHERE City.CountryCode = Country.Code) AS 'city',
(SELECT GROUP_CONCAT(Population ORDER BY Name)
FROM City
WHERE City.CountryCode = Country.Code) AS 'city_population'
FROM Country
WHERE Region IN ('Western Europe')
HAVING city_population > (SUM(Population) / COUNT(city))
ORDER BY Name, city;
2) You can alter my query a little bit. Remove the join on Country, and instead use subselects in the filter and in the select list. The latter is only needed if you need the country name at all. If the country code is enough, you can select that from City.
select
(select Country.Name
from Country
where Country.Code = ci.CountryCode) as CountryName,
ci.CountryCode,
ci.Name as CityName,
ci.Population
from
City ci
where
-- Select only cities in these countries.
ci.CountryCode in
( select co.Code
from Country co
where co.Region in ('Western Europe'))
-- Select only cities of above average population.
-- This is the same subselect that existed in the join before,
-- except it matches on CountryCode of the other 'instance'
-- of the City table. Note, you will _need_ to use aliases (ca/ci)
-- here to make it work.
and ci.Population >
( select sum(ca.Population) / count(*)
from City ca
where ca.CountryCode = ci.CountryCode)
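Here is a runnable sketch of this join-free approach, using Python's sqlite3 and hypothetical rows. Two things differ from the query above and are my own assumptions: Country is keyed on Code as described in the question, and AVG() stands in for sum/count(*) because SQLite would do integer division on that expression. The ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Hypothetical sample data; column names follow the question (Country.Code, City.CountryCode).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Country (Code TEXT PRIMARY KEY, Name TEXT, Region TEXT);
    CREATE TABLE City (Name TEXT, CountryCode TEXT, Population INTEGER);
    INSERT INTO Country VALUES ('NLD', 'Netherlands', 'Western Europe'),
                               ('BEL', 'Belgium', 'Western Europe'),
                               ('JPN', 'Japan', 'Eastern Asia');
    INSERT INTO City VALUES ('Amsterdam', 'NLD', 700), ('Rotterdam', 'NLD', 600), ('Haarlem', 'NLD', 200),
                            ('Brussels', 'BEL', 150), ('Gent', 'BEL', 50),
                            ('Tokyo', 'JPN', 8000), ('Osaka', 'JPN', 2000);
""")

# No JOIN keyword anywhere: the country name, the region filter and the
# per-country average all come from subselects.
above_average_cities = conn.execute("""
    SELECT
        (SELECT co.Name FROM Country co WHERE co.Code = ci.CountryCode) AS CountryName,
        ci.Name AS CityName,
        ci.Population
    FROM City ci
    WHERE ci.CountryCode IN
          (SELECT co.Code FROM Country co WHERE co.Region IN ('Western Europe'))
      AND ci.Population >
          (SELECT AVG(ca.Population) FROM City ca WHERE ca.CountryCode = ci.CountryCode)
    ORDER BY CountryName, CityName
""").fetchall()

print(above_average_cities)
```

Each matching city comes back as its own row, which is what the question asked for and what a subquery in the select list alone cannot do.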
A subquery in the select part of the statement is expected to return only one value. Remember that the commas separating the values in your select statement represent columns, and each column expects one value. In order to get a list of values returned from a subquery (as if it were another table) and use it in the outer query, you would have to put the subqueries in the from part of your query. Note: this may not be the proper code for your results. I was just addressing the issue of MySQL error 1242.
SELECT Name
FROM Country, (SELECT Name
FROM City
WHERE City.CountryCode = Country.Code) AS 'city',
(SELECT Population
FROM City
WHERE City.CountryCode = Country.Code) AS 'city_population'
WHERE Region IN ('Western Europe')
HAVING city_population > (SUM(Population) / COUNT(city))
ORDER BY Name, city;

Turning mysql subquery into Join

I'm new to MySQL and have just started learning it. Last night I was trying to rewrite the following subquery on the country table of the world database as a join.
SELECT continent, NAME, population FROM country c WHERE
population = (SELECT MAX(population) FROM country c2
WHERE c.continent=c2.continent AND population > 0)
I tried the following query and several others with inner joins etc., but failed. With the following query the max population is as expected, but the continent and country name are different.
SELECT c.continent, c2.name, MAX(c2.population) AS pop FROM country c, country c2
WHERE c.continent = c2.continent GROUP BY continent
Please help, how can I get same result as the sub-query above.
Thanks in advance
You should get the MAX(population) with GROUP BY continent inside a subquery, then JOIN it with the table itself, like this:
SELECT c1.continent, c1.NAME, c1.population
FROM country c1
INNER JOIN
(
SELECT continent, MAX(population) AS Maxp
FROM country
WHERE population > 0
GROUP BY continent
) AS c2 ON c1.population = c2.maxp
AND c1.continent = c2.continent;