Some selections I don't know how to write [closed] - mysql

I'm new to SQL, so please forgive me if I ask stupid questions.
I have three tables: one with countries, one with cities (linked to countries, with populations, etc.) and one with languages (also linked to countries).
I'd like to ask MySQL for the following information:
the names of the countries in which every city has more than 100,000 inhabitants,
the names of the countries that have at least one city in the cities table,
the names of the countries where English is spoken but not Spanish,
and so on. I'm starting to understand joins and a little bit of grouping, but that's about all.

First query
SELECT name
FROM country
WHERE id IN (
    SELECT big_city.country_id
    FROM (SELECT country_id, COUNT(*) AS n FROM city WHERE population > 100000 GROUP BY country_id) AS big_city,
         (SELECT country_id, COUNT(*) AS n FROM city GROUP BY country_id) AS all_city
    WHERE big_city.country_id = all_city.country_id
      AND big_city.n = all_city.n
);
The subqueries count, per country, the cities with more than 100,000 inhabitants and all cities. A country qualifies when the two counts match, i.e. when every registered city has a population greater than 100,000.
Second query
SELECT country.name FROM country WHERE country.id IN (SELECT DISTINCT country_id FROM city);
The subquery returns every country ID that appears in the city table, so you can use it as a filter condition.
Third query
SELECT country.name FROM country WHERE country.id IN
(SELECT DISTINCT country_id FROM language WHERE language = "en")
AND NOT country.id IN (SELECT DISTINCT country_id FROM language WHERE language = "es")
Same as before, you fetch all the countries in which English or Spanish is spoken and you filter accordingly
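For anyone who wants to try the three queries without a MySQL server, here is a minimal sketch that runs them against an in-memory SQLite database from Python. The table layout, column names and sample rows are my own assumptions (the question does not show its schema), and the first query uses HAVING MIN(...) as an alternative to the count-matching approach above, not a copy of it.

```python
import sqlite3

def country_queries():
    # Hypothetical schema and data, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE country  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE city     (id INTEGER PRIMARY KEY, country_id INTEGER NOT NULL,
                               name TEXT NOT NULL, population INTEGER NOT NULL);
        CREATE TABLE language (country_id INTEGER NOT NULL, language TEXT NOT NULL);

        INSERT INTO country VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
        INSERT INTO city VALUES
            (1, 1, 'A1', 1500000), (2, 1, 'A2', 200000),
            (3, 2, 'B1', 2000000), (4, 2, 'B2', 100);
        INSERT INTO language VALUES (1, 'en'), (2, 'en'), (2, 'es'), (3, 'es');
    """)

    def names(sql):
        return [row[0] for row in conn.execute(sql)]

    result = {
        # Every city is big when the smallest city is big.
        # Countries without any city drop out of the inner join.
        "all_cities_big": names("""
            SELECT c.name
            FROM country AS c
            JOIN city AS ci ON ci.country_id = c.id
            GROUP BY c.id, c.name
            HAVING MIN(ci.population) > 100000
            ORDER BY c.name"""),
        # At least one row in city for this country.
        "has_city": names("""
            SELECT c.name
            FROM country AS c
            WHERE EXISTS (SELECT 1 FROM city AS ci WHERE ci.country_id = c.id)
            ORDER BY c.name"""),
        # English spoken, Spanish not spoken.
        "en_not_es": names("""
            SELECT c.name
            FROM country AS c
            WHERE EXISTS (SELECT 1 FROM language AS l
                          WHERE l.country_id = c.id AND l.language = 'en')
              AND NOT EXISTS (SELECT 1 FROM language AS l
                              WHERE l.country_id = c.id AND l.language = 'es')
            ORDER BY c.name"""),
    }
    conn.close()
    return result

if __name__ == "__main__":
    print(country_queries())
```

The SQL inside the strings is plain enough that it should also run on MySQL unchanged, but I have only reasoned about it against SQLite.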

Please provide more details, such as the table definitions and the database version.
Consider the following script for SQL Server 2017:
create table countries
(
[id] int not null identity,
[name] varchar(100) not null,
constraint pk_countries primary key ([id])
)
insert into countries ([name]) values ('Country 1'), ('Country 2'), ('Country 3')
create table cities
(
[id] int not null identity,
[idCountry] int not null,
[name] varchar(100) not null,
[population] int not null,
constraint pk_cities primary key ([id]),
constraint fk_cities_countries foreign key ([idCountry]) references countries ([id])
)
insert into cities ([idCountry], [name], [population]) values
(1, 'City 11', 1500000), (1, 'City 22', 2000000),
(2, 'City 21', 2000000), (2, 'City 22', 100)
create table languages
(
[id] int not null identity,
[idCity] int not null,
[name] varchar(100) not null,
constraint pk_languages primary key ([id]),
constraint fk_languages_cities foreign key ([idCity]) references cities ([id])
)
insert into languages ([idCity], [name]) values (1, 'Lang 1'), (1, 'Lang 2'), (1, 'Lang 3')
-- the names of the countries for which all the cities have more than 100000 citizen
select
distinct (a.name)
from
countries a
where
not exists (select * from cities where idCountry = a.id and population <= 100000) and
exists (select * from cities where idCountry = a.id)
go
-- the names of the countries for which at least one city is in the cities table,
select
distinct (a.name)
from
countries a
where
exists (select * from cities where idCountry = a.id)
Results (http://www.sqlfiddle.com/#!18/326e0/1):
First query:
Country 1
Second query:
Country 1
Country 2

Find the reviewer with the most reviews that are below the average rating for places

We have 2 tables:
places
reviews
example of table details
CREATE TABLE places (
id INT,
name varchar(255),
address varchar(255),
type varchar(255),
average_rating INT,
price_point varchar(255),
total reviews INT,
PRIMARY KEY id);
INSERT INTO places VALUES
('1', 'Hairs to You', '45-45', 'Queens Boulevard', 'Beauty', '4.9', '$$$', '36'),
('2', 'Doggonit!', '100', 'Atlantic Ave', 'Pet Store', '3.1', '$$', '52'),
('3', 'Abra Kebabra', '193', 'Sauthoff Way', 'Restaurant', '3.3', '$', '315');
CREATE TABLE reviews (
id INT,
user_name varchar(255),
place_id varchar(255),
review_date DATE,
rating INT,
note varchar(255),
PRIMARY KEY id,
FOREIGN KEY (place_id) REFERENCES Places(id)
);
INSERT INTO reviews VALUES
('149', '#pinkdeb', '8', '2019-07-25', '4', 'Nice little place to grab a drink'),
('117', '#ahohl', '16', '2019-07-29', '3', 'The produce is always bad but otherwise okay'),
('119', '#sammyantha', '8', '2019-07-30', '4', 'LOVE how kitschy this place is! Bring your visiting friends');
I would like to find the reviewer with the most reviews that are below the average rating for places.
Do you think that it is the right code?
SELECT username, name, COUNT(*)
FROM reviews
CROSS JOIN places
ON places.id = reviews.place_id
WHERE rating < average_rating
GROUP BY username
ORDER BY COUNT(*) DESC
LIMIT 1;
I would like to get help in case it is wrong.
Consider:
SELECT user_name, Count(ID) FROM Reviews
WHERE Rating < (SELECT Avg(Rating) AR FROM Reviews)
GROUP BY user_name
ORDER BY Count(ID) DESC LIMIT 1
Your CREATE TABLE SQL errors on the PRIMARY KEY assignment so I just removed that part for testing. Could probably use some more sample data but when I try to add data rows, Fiddle errors.
MySQL Fiddle
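Since the fiddle would not accept more rows, here is a rough sketch that runs both readings of "below the average rating" against an in-memory SQLite database from Python. All rows are made up, average_rating is stored as REAL (an assumption, because the sample values such as 4.9 are not integers), and the asker's username is written as user_name to match the table definition.

```python
import sqlite3

def top_low_raters():
    # Hypothetical sample data, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE places (
            id INTEGER PRIMARY KEY,
            name TEXT,
            average_rating REAL
        );
        CREATE TABLE reviews (
            id INTEGER PRIMARY KEY,
            user_name TEXT,
            place_id INTEGER REFERENCES places(id),
            rating INTEGER
        );
        INSERT INTO places VALUES (1, 'P1', 4.5), (2, 'P2', 2.0);
        INSERT INTO reviews VALUES
            (1, 'ann', 1, 4), (2, 'ann', 1, 4), (3, 'ann', 1, 4), (4, 'ann', 1, 4),
            (5, 'bob', 2, 1), (6, 'bob', 2, 1), (7, 'bob', 2, 1);
    """)

    # Reading 1: the rating is below the average of all review ratings.
    overall = conn.execute("""
        SELECT user_name, COUNT(id) AS n
        FROM reviews
        WHERE rating < (SELECT AVG(rating) FROM reviews)
        GROUP BY user_name
        ORDER BY n DESC
        LIMIT 1""").fetchone()

    # Reading 2: the rating is below the average_rating stored for that place.
    per_place = conn.execute("""
        SELECT r.user_name, COUNT(*) AS n
        FROM reviews AS r
        INNER JOIN places AS p ON p.id = r.place_id
        WHERE r.rating < p.average_rating
        GROUP BY r.user_name
        ORDER BY n DESC
        LIMIT 1""").fetchone()

    conn.close()
    return overall, per_place

if __name__ == "__main__":
    print(top_low_raters())
```

With this data the two readings disagree: bob wins under the overall average, ann wins under the per-place average, so it is worth deciding which one the assignment means.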

How To Remove a Second Id From JOIN In SQL

I'm new to programming in SQL, and I need help on removing the second 1 through 5 id's on the right from my output (the numbers are highlighted in bold):
| ID | Name | Gender | ID | Country |
| 1 | Abdur-Rahman | M | 1 | America |
| 2 | Don Madden | M | 2 | England |
| 3 | Dustin Tompkins | M | 3 | America |
| 4 | Nicki Harris | F | 4 | Germany |
| 5 | Samantha Harris | F | 5 | France |
CREATE TABLE test ( id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO test VALUES (1,"Albert Franco");
INSERT INTO test VALUES (2,"Don Madden");
INSERT INTO test VALUES (3,"Dustin Tompkins");
INSERT INTO test VALUES (4,"Nicki Harris");
INSERT INTO test VALUES (5,"Samantha Harris");
ALTER TABLE test
ADD COLUMN gender TEXT;
UPDATE test
SET gender = 'M'
WHERE id = 1;
UPDATE test
SET gender = 'M'
WHERE id = 2;
UPDATE test
SET gender = 'M'
WHERE id = 3;
UPDATE test
SET gender = 'F'
WHERE id = 4;
UPDATE test
SET gender = 'F'
WHERE id = 5;
CREATE TABLE country (
id INTEGER,
nation TEXT
);
INSERT INTO country VALUES (1,"America");
INSERT INTO country VALUES (2,"England");
INSERT INTO country VALUES (3,"America");
INSERT INTO country VALUES (4,"Germany");
INSERT INTO country VALUES (5,"France");
SELECT * FROM test
JOIN country
ON test.id = country.id;
To actually answer your question, you should explicitly state the columns you want, i.e.
SELECT t.id, t.name, t.gender, c.nation
FROM test AS t
JOIN country AS c
ON c.id = t.id;
It is worth noting, however, that your schema doesn't really make sense: you have to duplicate countries, so it is not normalised. You've simply created a 1:1 relationship, and you might as well just add a nation column to test.
A better solution though would be to normalise the data, so your country table would become:
CREATE TABLE Country
(
Id INT AUTO_INCREMENT PRIMARY KEY,
Nation VARCHAR(50)
);
INSERT INTO Country(Nation)
VALUES ('America'), ('England'), ('France'), ('Germany');
Then in your Test Table, add CountryId as foreign key to your country table:
CREATE TABLE Test
(
Id INT AUTO_INCREMENT PRIMARY KEY,
Name VARCHAR(500) NOT NULL,
Gender CHAR(1) NOT NULL,
CountryId INT NOT NULL,
CONSTRAINT FK_Test__CountryId FOREIGN KEY (CountryID) REFERENCES Country (Id)
);
INSERT INTO Test (Name, Gender, CountryId)
SELECT data.Name, data.Gender, c.Id
FROM (
SELECT 'Albert Franco' AS Name, 'M' AS Gender, 'America' AS Nation
UNION ALL
SELECT 'Don Madden' AS Name, 'M' AS Gender, 'England' AS Nation
UNION ALL
SELECT 'Dustin Tompkins' AS Name, 'M' AS Gender, 'America' AS Nation
UNION ALL
SELECT 'Nicki Harris' AS Name, 'F' AS Gender, 'Germany' AS Nation
UNION ALL
SELECT 'Samantha Harris' AS Name, 'F' AS Gender, 'France' AS Nation
) AS data
INNER JOIN Country AS c
ON c.Nation = data.Nation;
Your final query is largely similar:
SELECT t.Id, t.Name, t.Gender, c.Nation
FROM Test AS t
INNER JOIN Country AS c
ON c.Id = t.CountryId;
But you have now normalised your countries, so America appears only once. In a simple example like this it saves just one row, but with a lot of names it avoids a lot of duplication and potential for error. With free-text entry (e.g. a nation column in test) you inevitably end up with multiple variations of everything, e.g. "America", "USA", "U.S.A", "US", "U.S", "United States", and that is before you even consider typos. All of this leads to headaches down the road.
Full Example on SQL Fiddle
*N.B. There's a pretty good argument that the country table should not have a surrogate primary key (AUTO_INCREMENT) and should instead use the ISO country code. The natural vs surrogate keys debate has been going on for years and years, and is well beyond the scope of this answer.
If I understood your question correctly, you want the id column to appear only once in the output.
In that case, write your query like this:
Select test.*, country.nation from test join country on test.id=country.id;
Now you will receive the id only from your test table.

Group by whether or not the cell exists in another table as well

My SQL dialect is MariaDB (MySQL).
I have a table of organisation spokespersons, a table of VIP organisations, and a table of presentations. How do I group or sort by whether the spokesperson's organisation is a VIP, so that the spokespersons of VIP organisations show up on top when retrieving all presentations?
Table presentations: int presentation_id, int person_id, varchar title, date date
Table persons: int person_id, varchar name, varchar function, varchar organisation
Table VIP_orgs: int org_id, varchar org_name
Query that doesn't work:
CREATE TABLE persons (
person_id INT AUTO_INCREMENT,
name VARCHAR(64),
organisation VARCHAR(64),
PRIMARY KEY (person_id)
);
INSERT INTO `persons` (name, organisation) VALUES
("Guy Fieri", "VIP-org"),
("Fiona", "VIP inc."),
("Mr. Robot", "Evil Corp"),
("Marcus Antonius", "Rome"),
("Cicero", "Rome"),
("Shrek", "VIP inc.");
CREATE TABLE presentations (
presentation_id INT AUTO_INCREMENT,
person_id INT,
PRIMARY KEY (presentation_id)
);
INSERT INTO `presentations` (person_id) VALUES
(1),(1),(1),(1), -- guy fieri has 4
(2),
(3),(3),(3),(3),(3),
(4),(4),(4),(4),
(5),(5),(5),
(6),(6),(6),(6);
CREATE TABLE VIP_orgs (
org_id INT AUTO_INCREMENT,
org_name VARCHAR(64),
PRIMARY KEY (org_id)
);
INSERT INTO `VIP_orgs` (org_name) VALUES
("VIP-org"),
("VIP inc.");
SELECT organisation, COUNT(*) AS count
FROM `presentations`
JOIN `persons` ON `presentations`.person_id = `persons`.person_id
GROUP BY (SELECT org_name FROM `VIP_orgs` WHERE `VIP_orgs`.org_name = organisation), organisation
ORDER BY count DESC;
What I expect it to do:
return a table of org_name and the total combined number of presentations by all spokespersons of that organisation,
sorted by count of presentations, grouped by organisation, with VIP organisations grouped on top.
The VIP and non-VIP parts should be sorted by count independently. The returned table should thus look something like this:
name count
VIP inc. 5
VIP-org 4
Rome 7
Evil Corp 5
The query works 50%: it counts all presentations and sorts it, but it doesn't seem to group by VIP organizations. In actuality the returned table looks like this:
name count
Rome 7
VIP inc. 5
Evil Corp 5
VIP-org 4
The schema doesn't look right. I would suggest creating an organisations table with a vip BOOLEAN column and adding a foreign key to it in the persons table. Make the following changes to the schema:
CREATE TABLE `organisations` (
organisation_id INT AUTO_INCREMENT,
name VARCHAR(64),
vip BOOLEAN,
PRIMARY KEY (organisation_id)
);
INSERT INTO `organisations` (name, vip) VALUES
("VIP-org", True),
("VIP inc.", True),
("Evil Corp", False),
("Rome", False);
CREATE TABLE persons (
person_id INT AUTO_INCREMENT,
name VARCHAR(64),
organisation_id INT,
PRIMARY KEY (person_id),
FOREIGN KEY (organisation_id) REFERENCES `organisations`(organisation_id)
);
INSERT INTO `persons` (name, organisation_id) VALUES
("Guy Fieri", 1),
("Fiona", 2),
("Mr. Robot", 3),
("Marcus Antonius", 4),
("Cicero", 4),
("Shrek", 2);
Now the query would look something like this:
SELECT `organisations`.name as organisation, COUNT(*) AS count
FROM `presentations`
JOIN `persons` ON `presentations`.person_id = `persons`.person_id
JOIN `organisations` ON `organisations`.organisation_id = `persons`.organisation_id
GROUP BY `organisations`.organisation_id
ORDER BY `organisations`.vip DESC, count DESC;
Output:
+--------------+-------+
| organisation | count |
+--------------+-------+
| VIP inc.     |     5 |
| VIP-org      |     4 |
| Rome         |     7 |
| Evil Corp    |     5 |
+--------------+-------+
You can see the result here: db <> fiddle
Instead of grouping by, I needed to sort. DOh!
Edit: this doesn't quite work. It does not sort by count. If I put the ORDER BY count clause first, it puts all vip orgs on the bottom.
Edit 2: using EXISTS, it seems to work
SELECT organisation, COUNT(*) AS count
FROM `presentations`
JOIN `persons` ON `presentations`.person_id = `persons`.person_id
GROUP BY organisation
ORDER BY EXISTS (SELECT org_name FROM `VIP_orgs` WHERE `VIP_orgs`.org_name = organisation) DESC, count DESC;
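To double-check the EXISTS version against the sample data from the question, here is a small sketch that loads the same rows into an in-memory SQLite database from Python. SQLite standing in for MariaDB is an assumption on my part; the query itself is the one above, with table aliases added.

```python
import sqlite3

def vip_first_counts():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE persons (
            person_id INTEGER PRIMARY KEY AUTOINCREMENT,
            name TEXT,
            organisation TEXT
        );
        CREATE TABLE presentations (
            presentation_id INTEGER PRIMARY KEY AUTOINCREMENT,
            person_id INTEGER
        );
        CREATE TABLE VIP_orgs (
            org_id INTEGER PRIMARY KEY AUTOINCREMENT,
            org_name TEXT
        );
        INSERT INTO persons (name, organisation) VALUES
            ('Guy Fieri', 'VIP-org'), ('Fiona', 'VIP inc.'),
            ('Mr. Robot', 'Evil Corp'), ('Marcus Antonius', 'Rome'),
            ('Cicero', 'Rome'), ('Shrek', 'VIP inc.');
        INSERT INTO presentations (person_id) VALUES
            (1),(1),(1),(1),
            (2),
            (3),(3),(3),(3),(3),
            (4),(4),(4),(4),
            (5),(5),(5),
            (6),(6),(6),(6);
        INSERT INTO VIP_orgs (org_name) VALUES ('VIP-org'), ('VIP inc.');
    """)
    # EXISTS evaluates to 1 for VIP organisations and 0 otherwise, so sorting
    # on it descending puts VIPs first; cnt then orders within each part.
    rows = conn.execute("""
        SELECT p.organisation, COUNT(*) AS cnt
        FROM presentations AS pr
        JOIN persons AS p ON p.person_id = pr.person_id
        GROUP BY p.organisation
        ORDER BY EXISTS (SELECT 1 FROM VIP_orgs AS v
                         WHERE v.org_name = p.organisation) DESC,
                 cnt DESC""").fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for organisation, cnt in vip_first_counts():
        print(organisation, cnt)
```

The order of the two sort keys is what matters here: the VIP flag has to come first, otherwise the count decides the overall order and the VIP flag only breaks ties.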

mysql max per group by date

I have two mysql greatest-n-per-group, greatest-by-date problems:
Considering one students table and one grades table, I want to have all students displayed with their most recent grade.
The schema script:
CREATE TABLE student (
id int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO student VALUES(1, 'jim');
INSERT INTO student VALUES(2, 'mark ');
INSERT INTO student VALUES(3, 'john');
CREATE TABLE grades (
id int(11) NOT NULL AUTO_INCREMENT,
student_id int(11) NOT NULL,
grade int(11) NOT NULL,
`date` date DEFAULT NULL,
PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO grades VALUES(1, 1, 6, NULL);
INSERT INTO grades VALUES(2, 1, 8, NULL);
INSERT INTO grades VALUES(3, 1, 10, NULL);
INSERT INTO grades VALUES(4, 2, 9, '2016-05-10');
INSERT INTO grades VALUES(5, 2, 8, NULL);
INSERT INTO grades VALUES(6, 3, 6, '2016-05-26');
INSERT INTO grades VALUES(7, 3, 7, '2016-05-27');
A) I want to find out if this is a valid solution for getting the most recent record by a date field (date) from a secondary table (grades) grouped for each row in a main table (student).
My query is:
SELECT s.id, s.name, g.grade, g.date
FROM student AS s
LEFT JOIN (
SELECT student_id, grade, DATE
FROM grades AS gr
WHERE DATE = (
SELECT MAX(DATE)
FROM grades
WHERE student_id = gr.student_id
)
GROUP BY student_id
) AS g ON s.id = g.student_id
Sql Fiddle: http://sqlfiddle.com/#!9/a84171/2
This query displays the desired (almost) results. But I have doubts that this is the best approach because it looks ugly, so I am very curious about the alternatives.
B) The second problem is the reason for the "(almost)" above.
For the first row (name = jim) the query finds no grade, although Jim does have grades.
So the query above seems to be valid only when the date field is NOT NULL.
The question is:
How do I get the most recent grade for every student who has grades, including Jim, even though his grades have no date specified (NULL)? In that case the most recent grade should be the latest row inserted (MAX(id)), or just any one of them.
Replacing date = (SELECT ... with date IN (SELECT ... doesn't work either.
Any help would be much appreciated,
Thanks!
[UPDATE #1]:
For B) I found adding this to the sub-query, OR date IS NULL, produces the desired result:
SELECT s.id, s.name, g.grade, g.date
FROM student AS s
LEFT JOIN (
SELECT id, student_id, grade, DATE
FROM grades AS gr
WHERE DATE = (
SELECT MAX(DATE)
FROM grades
WHERE student_id = gr.student_id
) OR date IS NULL
GROUP BY student_id
) AS g ON s.id = g.student_id
[UPDATE #2]
It seems the previous update works only if a student's first grade has a date; it doesn't if the first grade's date is NULL. I would have linked a fiddle, but sqlfiddle seems to be down right now.
So this is what I have come up with so far, and it seems to solve problem B):
SELECT s.id, s.name, g.grade, g.date
FROM student AS s
LEFT JOIN (
SELECT id, student_id, grade, DATE
FROM grades AS gr
WHERE (
`date` = (
SELECT MAX(DATE)
FROM grades
WHERE student_id = gr.student_id
)
) OR (
(
SELECT MAX(DATE)
FROM grades
WHERE student_id = gr.student_id
) IS NULL AND
date IS NULL
)
) AS g ON s.id = g.student_id
GROUP BY student_id
I still would like to know if you guys know better alternatives to this ugly thing.
Thanks!
[UPDATE #3]
@Strawberry
The desired results would be:
| id | name | grade | date |
| 1 | jim | 10 | NULL |
| 2 | mark | 9 | 2016-05-10 |
| 3 | john | 7 | 2016-05-27 |
each student with one corresponding grade
if a date exists for a grade, then get the most recent one.
The complexity of this problem stems from the logical impossibility of a grade without an associated date, so obviously the solution is to fix that.
But here's a workaround...
E.g.:
SELECT a.*
FROM grades a
JOIN
( SELECT student_id
, MAX(COALESCE(UNIX_TIMESTAMP(date),id)) date
FROM grades
GROUP
BY student_id
) b
ON b.student_id = a.student_id
AND b.date = COALESCE(UNIX_TIMESTAMP(a.date),id);
http://sqlfiddle.com/#!9/ecec43/4
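In case the fiddle is unavailable, the same COALESCE idea can be sketched against an in-memory SQLite database from Python. This is an adaptation, not the MySQL query itself: SQLite has no UNIX_TIMESTAMP, so strftime('%s', ...) stands in for it, the column alias sort_key is my own name, and the student join is added so that the output matches the desired result table.

```python
import sqlite3

def latest_grades():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE grades (
            id INTEGER PRIMARY KEY,
            student_id INTEGER NOT NULL,
            grade INTEGER NOT NULL,
            date TEXT
        );
        INSERT INTO student VALUES (1, 'jim'), (2, 'mark'), (3, 'john');
        INSERT INTO grades VALUES
            (1, 1, 6, NULL), (2, 1, 8, NULL), (3, 1, 10, NULL),
            (4, 2, 9, '2016-05-10'), (5, 2, 8, NULL),
            (6, 3, 6, '2016-05-26'), (7, 3, 7, '2016-05-27');
    """)
    # The sort key is the date as epoch seconds, or the row id when the date
    # is NULL. Epoch seconds are far larger than these ids, so any dated
    # grade outranks an undated one for the same student.
    rows = conn.execute("""
        SELECT s.id, s.name, g.grade, g.date
        FROM student AS s
        LEFT JOIN (
            SELECT a.student_id, a.grade, a.date
            FROM grades AS a
            JOIN (
                SELECT student_id,
                       MAX(COALESCE(CAST(strftime('%s', date) AS INTEGER), id)) AS sort_key
                FROM grades
                GROUP BY student_id
            ) AS b
              ON b.student_id = a.student_id
             AND b.sort_key = COALESCE(CAST(strftime('%s', a.date) AS INTEGER), a.id)
        ) AS g ON g.student_id = s.id
        ORDER BY s.id""").fetchall()
    conn.close()
    return rows

if __name__ == "__main__":
    for row in latest_grades():
        print(row)
```

Two grades with the same latest date for one student would both come back, so this is still a workaround rather than a guarantee of one row per student.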
SELECT s.id, s.name, g.grade, g.date
FROM student AS s
LEFT JOIN (
SELECT gr.student_id, gr.grade, gr.DATE
FROM grades AS gr
LEFT JOIN grades grm
ON grm.student_id = gr.student_id
AND grm.date>gr.date
WHERE grm.student_id IS NULL
AND gr.date IS NOT NULL
GROUP BY gr.student_id
) AS g
ON s.id = g.student_id;

Find unique sets in sql data

I have three tables:
Modules
| ID | name |
Subscription
| module_id | user_id | ...
User
| ID | user_name |
I need a list of unique subscription sets, i.e. x users subscribed to modules (1), y users subscribed to (1,2), etc. Can I do this in SQL?
Let's set up some tables.
create table modules (
module_id integer not null,
module_name varchar(15) not null,
primary key (module_id),
unique (module_name)
);
insert into modules values (1, 'First module');
insert into modules values (2, 'Second module');
create table users (
user_id integer not null,
user_name varchar(15) not null,
primary key (user_id),
unique (user_name)
);
insert into users values (100, 'First user');
insert into users values (101, 'Second user');
create table subscriptions (
module_id integer not null,
user_id integer not null,
primary key (module_id, user_id),
foreign key (module_id)
references modules (module_id),
foreign key (user_id)
references users (user_id)
);
insert into subscriptions values (1, 100);
insert into subscriptions values (1, 101);
insert into subscriptions values (2, 100);
To get the right counts, all you need is a query on the table "subscriptions".
select module_id, count(*) as num_users
from subscriptions
group by module_id
order by module_id;
module_id num_users
--
1 2
2 1
You can use that statement in a join to get the module name instead of the module id number.
select t1.module_name, t2.num_users
from modules t1
inner join (select module_id, count(*) as num_users
from subscriptions
group by module_id
) t2
on t1.module_id = t2.module_id
order by t1.module_name;
module_name num_users
--
First module 2
Second module 1
You'll need to use the InnoDB engine in order to enforce foreign key constraints.
To get the users who are subscribed to both module ids 1 and 2, use a WHERE clause to pick the module id numbers, use a GROUP BY clause to get the count, and use a HAVING clause to restrict the output to only those user id numbers who have a count of 2. (That means they've subscribed to both those modules in the WHERE clause.)
select user_id, count(*) num_modules
from subscriptions
where module_id in (1, 2)
group by user_id
having count(*) = 2;
This kind of requirement can blow up in your face quite quickly if you need to report on all the possible combinations of modules. For only 10 modules, there are over 1000 possible combinations. Usually, you'd want to write a program to either
generate dynamic SQL,
generate a static SQL statement for each of the possible combinations,
write a new SQL statement each time you're asked to report on a combination (usually, most combinations aren't interesting), or
renegotiate the requirements.
My colleague came up with an interesting solution
select module_combinations
from (select user_id, group_concat(module_id separator ', ') as module_combinations from subscriptions group by user_id) a
group by a.module_combinations
But this seems closer to answering the original question.
select module_combinations, count(*) as num_users
from (select group_concat(module_id order by module_id) as module_combinations
from subscriptions
group by user_id) a
group by a.module_combinations;
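If GROUP_CONCAT is not available, or you would rather do the set-building in the "write a program" spirit mentioned earlier, the same counts can be produced client-side. Below is a minimal sketch in Python with an in-memory SQLite table; the third subscriber (user 102) is an extra row I added so that one combination has more than one user, and the function name is made up.

```python
import sqlite3
from collections import Counter, defaultdict

def subscription_set_counts(rows):
    """Count users per distinct set of subscribed modules.

    rows is an iterable of (module_id, user_id) pairs.
    """
    modules_by_user = defaultdict(set)
    for module_id, user_id in rows:
        modules_by_user[user_id].add(module_id)
    # Sort each user's modules so that (1, 2) and (2, 1) are the same key.
    counts = Counter(tuple(sorted(mods)) for mods in modules_by_user.values())
    return sorted((",".join(map(str, combo)), n) for combo, n in counts.items())

def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE subscriptions (
            module_id INTEGER NOT NULL,
            user_id INTEGER NOT NULL,
            PRIMARY KEY (module_id, user_id)
        );
        INSERT INTO subscriptions VALUES (1, 100), (1, 101), (2, 100), (1, 102);
    """)
    rows = conn.execute("SELECT module_id, user_id FROM subscriptions").fetchall()
    conn.close()
    return subscription_set_counts(rows)

if __name__ == "__main__":
    for combo, num_users in demo():
        print(combo, num_users)
```

This only reports combinations that actually occur, which sidesteps the explosion of possible combinations described above.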