GROUP_CONCAT on LEFT JOIN generates groups against NULL - mysql

I'm trying to use a LEFT JOIN in conjunction with a GROUP_CONCAT but not getting the expected results.
Two simple tables:
weather_alerts:
id | user_id | resort_id
1 | 1 | 1
2 | 1 | 2
3 | 1 | 3
4 | 1 | 5
weather_users
id | email
1 | me#me.com
The query:
SELECT GROUP_CONCAT(wa.resort_id) AS resort_ids, wu.email FROM weather_alerts wa LEFT JOIN weather_users wu ON wa.id = wu.user_id GROUP BY wu.email
Instead of generating:
email resort_ids
me#me.com 1,2,3,5
I get:
email resort_ids
NULL 2,3,5
me#me.com 1
I suspect this is an issue with the JOIN rather than the CONCAT.

The problem is the join condition in your LEFT JOIN, not GROUP_CONCAT.
create table weather_alerts (id int, user_id int, resort_id int);
insert into weather_alerts values (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 1, 5);
create table weather_users (id int, email varchar(100));
insert into weather_users values (1, 'me#me.com');
Query
SELECT GROUP_CONCAT(wa.resort_id ORDER BY wa.resort_id) AS resort_ids, wu.email
FROM weather_alerts wa
LEFT JOIN weather_users wu ON wa.user_id = wu.id
GROUP BY wu.email
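Notice that you are joining on wa.id = wu.user_id, which matches the alert's own id against the user rather than the alert's user_id. Only one alert finds a user that way, and the unmatched alerts end up grouped under a NULL email. The join should be on wa.user_id = wu.id.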
Result
| resort_ids | email |
|------------|-----------|
| 1,2,3,5 | me#me.com |
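The effect of the two join conditions can be reproduced with a small script. This is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL; the helper name grouped is made up for illustration, and the mismatched condition is written as wa.id = wu.id on the assumption that this is what the original query effectively compared (the sample weather_users table has no user_id column).

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE weather_alerts (id INT, user_id INT, resort_id INT);
    INSERT INTO weather_alerts VALUES (1, 1, 1), (2, 1, 2), (3, 1, 3), (4, 1, 5);
    CREATE TABLE weather_users (id INT, email VARCHAR(100));
    INSERT INTO weather_users VALUES (1, 'me#me.com');
""")

def grouped(join_condition):
    """Hypothetical helper: return {email: sorted resort ids} for an ON condition."""
    rows = conn.execute(f"""
        SELECT wu.email, GROUP_CONCAT(wa.resort_id)
        FROM weather_alerts wa
        LEFT JOIN weather_users wu ON {join_condition}
        GROUP BY wu.email
    """).fetchall()
    return {email: sorted(int(x) for x in ids.split(",")) for email, ids in rows}

# Alert id compared with user id: only alert 1 matches, the rest group under NULL.
mismatched = grouped("wa.id = wu.id")
# Alert's user_id compared with user id: every alert finds its user.
corrected = grouped("wa.user_id = wu.id")

print(mismatched)
print(corrected)
```

With the mismatched condition the unmatched alerts collect under the None (NULL) key, which mirrors the two-row result in the question.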

Related

Count records with values that belong to a set or group

How do I write a query that counts (totals) the number of values in a group for data spread across three tables? For each reporter and report date, I’d like a count of the number of sightings where the species code is 10 or 20 (using IN because the species group has lots of codes).
REPORTER TABLE
reporter_id | reporter_num
-----------------------------
1 | 1111
2 | 2222
REPORT TABLE
report_id | reporter_id | report_date
-------------------------------------
1 | 1 | 2022-09-05
2 | 1 | 2022-09-05
3 | 1 | 2022-09-05
4 | 1 | 2022-09-16
5 | 2 | 2022-09-22
6 | 2 | 2022-09-22
SIGHTING TABLE
sighting_id | species_code
------------------------
1 | 10
2 | 55
3 | 20
4 | 35
5 | 55
6 | 20
This is essentially what I’m working with when the three tables are joined:
reporter_num | report_date | species_code
----------------------------------------
1111 | 2022-09-05 | 10
1111 | 2022-09-05 | 55
1111 | 2022-09-05 | 20
1111 | 2022-09-16 | 35
2222 | 2022-09-22 | 55
2222 | 2022-09-22 | 20
Query: for each reporter_num and report_date (one row per reporter_num and report_date), count the number of sightings where species_code is 10 or 20. Expected results:
reporter_num | report_date | my_count
----------------------------------------
1111 | 2022-09-05 | 2
1111 | 2022-09-16 | 0
2222 | 2022-09-22 | 1
A count in my query gives the total number of records for each reporter_num and report_date which isn’t what I want:
select
reporter.reporter_num, report.report_date,
count(sighting.species_code in (10, 20)) as my_count
from report
inner join reporter on report.reporter_id = reporter.reporter_id
inner join location on report.report_id = location.report_id
inner join method on location.location_id = method.location_id
left join sighting on method.method_id = sighting.method_id
group by
reporter.reporter_num, report.report_date;
Query Results – my_count is total number of records, which is incorrect:
reporter_num | report_date | my_count
----------------------------------------
1111 | 2022-09-05 | 3
1111 | 2022-09-16 | 1
2222 | 2022-09-22 | 2
Tried a subquery and the counts in both the outer query and subquery are incorrect:
select
reporter.reporter_num, report.report_date,
count(my_table.my_count)
from report
inner join reporter on report.reporter_id = reporter.reporter_id
inner join (
select
reporter.reporter_id, reporter.reporter_num,
report.report_date, sighting.species_code as my_count
from report
[... see joins in above query ...]
where sighting.species_code in (10, 20)
) as my_table on reporter.reporter_id = my_table.reporter_id
group by
reporter.reporter_num, report.report_date;
I feel like I’m close but missing something (or a couple somethings). Any suggestions? Many thanks.
SELECT reporter_num, report_date,
LENGTH(gs) + 2 - LENGTH(REGEXP_REPLACE(CONCAT(',', gs, ','), ',(10|20),', 'len')) cnt
FROM (
SELECT reporter.reporter_num, report.report_date, GROUP_CONCAT(sighting.species_code) gs
FROM report
JOIN reporter ON report.reporter_id = reporter.reporter_id
[... joins on location and method ...]
JOIN sighting ON link.sighting_id = sighting.sighting_id
GROUP BY reporter.reporter_num, report.report_date
) tbl;
Outputs:
| reporter_num | report_date | cnt |
|--------------|-------------|-----|
| 1111 | 2022-09-05 | 2 |
| 1111 | 2022-09-16 | 0 |
| 2222 | 2022-09-22 | 1 |
First, the subquery builds an ad hoc derived table using GROUP_CONCAT and GROUP BY:
SELECT reporter.reporter_num, report.report_date, GROUP_CONCAT(sighting.species_code) gs
FROM report
JOIN reporter ON report.reporter_id = reporter.reporter_id
[... joins on location and method ...]
JOIN sighting ON link.sighting_id = sighting.sighting_id
GROUP BY reporter.reporter_num, report.report_date
Outputs:
| reporter_num | report_date | gs |
|--------------|-------------|----------|
| 1111 | 2022-09-05 | 10,55,20 |
| 1111 | 2022-09-16 | 35 |
| 2222 | 2022-09-22 | 55,20 |
Then counting occurrences of either 10 or 20 in the GROUP_CONCAT'ed gs column:
LENGTH(gs) + 2 - LENGTH(REGEXP_REPLACE(CONCAT(',', gs, ','), ',(10|20),', 'len'))
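Wrapping gs in commas lets the pattern ',(10|20),' match whole codes only. Each match is four characters long and is replaced by the three-character string 'len', so the string shrinks by one character per match and the length difference equals the number of matches. This uses REGEXP_REPLACE instead of the REPLACE used in the credited link.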
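The length arithmetic can be checked outside the database. The sketch below uses Python's re.sub in place of MySQL's REGEXP_REPLACE; the function name count_codes is hypothetical, and the assumption is that both engines replace non-overlapping matches from left to right.

```python
import re

def count_codes(gs, codes=("10", "20")):
    """Hypothetical helper mirroring the LENGTH/REGEXP_REPLACE expression."""
    padded = "," + gs + ","                    # CONCAT(',', gs, ',')
    pattern = ",(" + "|".join(codes) + "),"    # ',(10|20),'
    replaced = re.sub(pattern, "len", padded)  # each 4-char match becomes 3 chars
    return len(gs) + 2 - len(replaced)         # LENGTH(gs) + 2 - LENGTH(...)

# The three gs values from the sample output.
sample_counts = [count_codes(gs) for gs in ("10,55,20", "35", "55,20")]
print(sample_counts)

# Two matching codes side by side share a comma, so only the first is matched.
adjacent_count = count_codes("10,20")
print(adjacent_count)
```

Under that assumption the expression undercounts when two matching codes are adjacent in gs, because the comma between them is consumed by the first match. The sample data never puts 10 and 20 next to each other, so it does not show up there.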
-- create
CREATE TABLE reporter (
reporter_id INTEGER PRIMARY KEY,
reporter_num INTEGER NOT NULL
);
CREATE TABLE report (
report_id INTEGER PRIMARY KEY,
reporter_id INTEGER NOT NULL,
report_date TEXT NOT NULL
);
CREATE TABLE report_sighting (
report_id INTEGER NOT NULL,
sighting_id INTEGER NOT NULL
);
CREATE TABLE link (
report_id INTEGER NOT NULL,
sighting_id INTEGER NOT NULL
);
CREATE TABLE sighting (
sighting_id INTEGER PRIMARY KEY,
species_code INTEGER NOT NULL
);
-- insert
INSERT INTO reporter VALUES (1, 1111), (2, 2222);
INSERT INTO report VALUES (1, 1, '2022-09-05'), (2, 1, '2022-09-05'), (3, 1, '2022-09-05'), (4, 1, '2022-09-16'),
(5, 2, '2022-09-22'), (6, 2, '2022-09-22');
INSERT INTO link VALUES (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6);
INSERT INTO sighting VALUES (1, 10), (2, 55), (3, 20), (4, 35), (5, 55), (6, 20);
-- fetch
SELECT reporter_num, report_date,
LENGTH(gs) + 2 - LENGTH(REGEXP_REPLACE(CONCAT(',', gs, ','), ',(10|20),', 'len')) cnt
FROM (
SELECT reporter.reporter_num, report.report_date, GROUP_CONCAT(sighting.species_code) gs
FROM report
JOIN reporter ON report.reporter_id = reporter.reporter_id
JOIN link ON report.report_id = link.report_id
JOIN sighting ON link.sighting_id = sighting.sighting_id
GROUP BY reporter.reporter_num, report.report_date
) tbl;
Try it here: https://onecompiler.com/mysql/3ygv4cry2
Credit: https://www.tutorialspoint.com/finding-number-of-occurrences-of-a-specific-string-in-mysql
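I wasn't using COUNT or COUNT(IF) correctly. COUNT(expr) counts every non-NULL value, and sighting.species_code in (10, 20) evaluates to 1 or 0, both of which are non-NULL, so every row was counted. The IF has to return 1 if true and NULL if false; otherwise, it will add 1 even if the species code isn't in the species group. The correct code: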
select
reporter.reporter_num, report.report_date,
count(if(sighting.species_code in (10, 20), 1, NULL)) as my_count
from report
[... see joins in above query ...]
group by
reporter.reporter_num, report.report_date;
The above will also return report dates that don't have any species in the species group; my_count will be zero in that case, which is what I want.
To exclude report dates with zero species in the species group, use count(*) in SELECT and filter for the species codes in WHERE.
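Both variants can be tried on the sample data. This is a rough sketch run against SQLite through Python's sqlite3 module rather than MySQL, so the conditional count is written with CASE WHEN (which also works in MySQL) instead of IF. Because the location and method tables are not shown in the question, it borrows the link table from the other answer as an assumed stand-in for those joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE reporter (reporter_id INTEGER PRIMARY KEY, reporter_num INTEGER NOT NULL);
    CREATE TABLE report (report_id INTEGER PRIMARY KEY, reporter_id INTEGER NOT NULL,
                         report_date TEXT NOT NULL);
    CREATE TABLE link (report_id INTEGER NOT NULL, sighting_id INTEGER NOT NULL);
    CREATE TABLE sighting (sighting_id INTEGER PRIMARY KEY, species_code INTEGER NOT NULL);
    INSERT INTO reporter VALUES (1, 1111), (2, 2222);
    INSERT INTO report VALUES (1, 1, '2022-09-05'), (2, 1, '2022-09-05'), (3, 1, '2022-09-05'),
                              (4, 1, '2022-09-16'), (5, 2, '2022-09-22'), (6, 2, '2022-09-22');
    INSERT INTO link VALUES (1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6);
    INSERT INTO sighting VALUES (1, 10), (2, 55), (3, 20), (4, 35), (5, 55), (6, 20);
""")

joins = """
    FROM report
    JOIN reporter ON report.reporter_id = reporter.reporter_id
    JOIN link ON report.report_id = link.report_id
    LEFT JOIN sighting ON link.sighting_id = sighting.sighting_id
"""

# Conditional count: CASE yields NULL for other codes, and COUNT skips NULLs,
# so groups without a matching species still appear with a count of 0.
with_zeros = conn.execute(f"""
    SELECT reporter.reporter_num, report.report_date,
           COUNT(CASE WHEN sighting.species_code IN (10, 20) THEN 1 END) AS my_count
    {joins}
    GROUP BY reporter.reporter_num, report.report_date
    ORDER BY reporter.reporter_num, report.report_date
""").fetchall()

# Filtering in WHERE removes non-matching rows before grouping,
# so groups with no matching species disappear entirely.
without_zeros = conn.execute(f"""
    SELECT reporter.reporter_num, report.report_date, COUNT(*) AS my_count
    {joins}
    WHERE sighting.species_code IN (10, 20)
    GROUP BY reporter.reporter_num, report.report_date
    ORDER BY reporter.reporter_num, report.report_date
""").fetchall()

print(with_zeros)
print(without_zeros)
```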

How to join 2 tables with one of them containing multiple values in a single column

I have two tables, users and interests, which I'm trying to join. The users table has the columns id, name, interest, etc. The interest column contains multiple values in a single string, such as "1,2,3". My second table, interests, has two columns, id and name:
id | name
-------------
1 | business
2 | farming
3 | fishing
What I want to do is join the interests table with the users table so I get the following output:
users table:
id | name | interest | interest_name
----------------------------------------------
1 | username | "1,2" | "business, farming"
2 | username | "2,3" | " farming, fishing"
I wrote the following query to achieve this:
select users.*, interests.name as interest_name
from users
left join interests on users.interest = interests.id;
Results I got:
id | name | interest | interest_name
----------------------------------------
1 | username | "1,2" | "business"
2 | username | "2,3" | " farming"
Problem:
I'm only getting the name of the first value in the interest column, whereas I want the names for all of the values. I have already tried using group_concat and find_in_set but got the same results.
In the case you cannot create an additional database table in order to normalize the data...
Here's a solution that creates an ad hoc, temporary user_interests table within the query.
SELECT users.id user_id, username, interests, interests.interest
FROM users
LEFT JOIN (
SELECT
users.id user_id,
(SUBSTRING_INDEX(SUBSTRING_INDEX(users.interests, ',', ui.ui_id), ',', -1) + 0) ui_id
FROM users
LEFT JOIN (SELECT id AS ui_id FROM interests) ui
ON CHAR_LENGTH(users.interests) - CHAR_LENGTH(REPLACE(users.interests, ',', '')) >= (ui.ui_id - 1)
) user_interests ON users.id = user_interests.user_id
LEFT JOIN interests ON user_interests.ui_id = interests.id
ORDER BY user_id, ui_id;
Outputs:
user_id | username | interests    | interest
--------+----------+--------------+---------
1 | fred | 3,4,8,6,10 | fishing
1 | fred | 3,4,8,6,10 | sports
1 | fred | 3,4,8,6,10 | religion
1 | fred | 3,4,8,6,10 | science
1 | fred | 3,4,8,6,10 | philanthropy
2 | joe | 7,11,8,9 | art
2 | joe | 7,11,8,9 | science
2 | joe | 7,11,8,9 | politics
2 | joe | 7,11,8,9 | cooking
As you can see...
SELECT
users.id user_id,
(SUBSTRING_INDEX(SUBSTRING_INDEX(users.interests, ',', ui.ui_id), ',', -1) + 0) ui_id
FROM users
LEFT JOIN (SELECT id AS ui_id FROM interests) ui
ON CHAR_LENGTH(users.interests) - CHAR_LENGTH(REPLACE(users.interests, ',', '')) >= (ui.ui_id - 1)
...builds and populates the temporary table user_interests with the users.interests field data normalized:
user_id | ui_id
--------+------
1 | 3
1 | 4
1 | 6
1 | 8
1 | 10
2 | 7
2 | 8
2 | 9
2 | 11
...which is then LEFT JOIN'ed between the users and interests tables.
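To see what the nested SUBSTRING_INDEX calls do, here is a small Python sketch of the same logic. The functions substring_index and split_ids are hypothetical helpers written for this illustration; substring_index imitates the MySQL function for a single-character delimiter, and the loop over n plays the role of the join against interests.id, which assumes those ids run 1, 2, 3, ... without gaps.

```python
def substring_index(s, delim, count):
    """Imitation of MySQL SUBSTRING_INDEX (hypothetical helper for illustration)."""
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])   # everything before the count-th delimiter
    if count < 0:
        return delim.join(parts[count:])   # everything after the count-th delimiter from the right
    return ""

def split_ids(csv_ids):
    """Return the ids in a comma-separated string, one per position n."""
    # CHAR_LENGTH(s) - CHAR_LENGTH(REPLACE(s, ',', '')) is the number of commas.
    comma_count = len(csv_ids) - len(csv_ids.replace(",", ""))
    ids = []
    for n in range(1, comma_count + 2):                 # positions allowed by the ON condition
        first_n = substring_index(csv_ids, ",", n)      # keep the first n items
        nth = substring_index(first_n, ",", -1)         # keep the last of those
        ids.append(int(nth))                            # the "+ 0" cast to a number
    return ids

fred_ids = split_ids("3,4,8,6,10")
joe_ids = split_ids("7,11,8,9")
print(fred_ids, joe_ids)
```

The ids come out in their stored order here; the SQL version sorts them with ORDER BY user_id, ui_id.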
Try it here: https://onecompiler.com/mysql/3yfhmgq3y
-- create
CREATE TABLE users (
id INT PRIMARY KEY,
username VARCHAR(20),
interests VARCHAR(20)
);
CREATE TABLE interests (
id INT PRIMARY KEY,
interest VARCHAR(20)
);
-- insert
INSERT INTO users VALUES (1, 'fred', '3,4,8,6,10'), (2, 'joe', '7,11,8,9');
INSERT INTO interests VALUES (1, 'business'), (2, 'farming'), (3, 'fishing'), (4, 'sports'), (5, 'technology'), (6, 'religion'), (7, 'art'), (8, 'science'), (9, 'politics'), (10, 'philanthropy'), (11, 'cooking');
-- select
SELECT users.id user_id, username, interests, interests.interest
FROM users
LEFT JOIN (
SELECT
users.id user_id,
(SUBSTRING_INDEX(SUBSTRING_INDEX(users.interests, ',', ui.ui_id), ',', -1) + 0) ui_id
FROM users
LEFT JOIN (SELECT id AS ui_id FROM interests) ui
ON CHAR_LENGTH(users.interests) - CHAR_LENGTH(REPLACE(users.interests, ',', '')) >= (ui.ui_id - 1)
) user_interests ON users.id = user_interests.user_id
LEFT JOIN interests ON user_interests.ui_id = interests.id
ORDER BY user_id, ui_id;
Inspired by Leon Straathof's and fthiella's answers to this SO question.
Pull the interest column out of the users table and create a user_interests table that contains the user ids and interest ids:
user_id | interest_id
--------+------------
1 | 1
1 | 2
2 | 2
2 | 3
Then join the users table to the user_interests table, and the user_interests table to the interests table:
SELECT users.username, interests.interest
FROM users
LEFT JOIN user_interests ON users.id = user_interests.user_id
LEFT JOIN interests ON user_interests.interest_id = interests.id
WHERE interest_id IS NOT NULL;
Outputs:
username | interest
---------+---------
Clark | business
Clark | farming
Dave | farming
Dave | fishing
Then use your server programming language to compile the query results.
Try it here: https://onecompiler.com/mysql/3yfe5pp7x
-- create
CREATE TABLE users (
id INTEGER PRIMARY KEY,
username TEXT NOT NULL
);
CREATE TABLE user_interests (
user_id INTEGER,
interest_id INTEGER,
UNIQUE KEY user_interests_constraint (user_id,interest_id)
);
CREATE TABLE interests (
id INTEGER PRIMARY KEY,
interest TEXT NOT NULL
);
-- insert
INSERT INTO users VALUES (1, 'Clark'), (2, 'Dave'), (3, 'Ava');
INSERT INTO interests VALUES (1, 'business'), (2, 'farming'), (3, 'fishing');
INSERT INTO user_interests VALUES (1, 1), (1, 2), (2, 2), (2, 3);
-- fetch
SELECT users.username, interests.interest
FROM users
LEFT JOIN user_interests ON users.id = user_interests.user_id
LEFT JOIN interests ON user_interests.interest_id = interests.id
WHERE interest_id IS NOT NULL;
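As one possible way to do the final "compile the query results" step, the sketch below runs the normalized query and groups the rows per user in Python. It uses the sqlite3 module as a stand-in for a MySQL connection, replaces the MySQL-specific UNIQUE KEY clause with a plain UNIQUE constraint, and adds an ORDER BY so the grouping order is deterministic; these are assumptions of the sketch, not part of the answer above.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL);
    CREATE TABLE user_interests (user_id INTEGER, interest_id INTEGER,
                                 UNIQUE (user_id, interest_id));
    CREATE TABLE interests (id INTEGER PRIMARY KEY, interest TEXT NOT NULL);
    INSERT INTO users VALUES (1, 'Clark'), (2, 'Dave'), (3, 'Ava');
    INSERT INTO interests VALUES (1, 'business'), (2, 'farming'), (3, 'fishing');
    INSERT INTO user_interests VALUES (1, 1), (1, 2), (2, 2), (2, 3);
""")

rows = conn.execute("""
    SELECT users.username, interests.interest
    FROM users
    LEFT JOIN user_interests ON users.id = user_interests.user_id
    LEFT JOIN interests ON user_interests.interest_id = interests.id
    WHERE user_interests.interest_id IS NOT NULL
    ORDER BY users.id, interests.id
""").fetchall()

# Collect one list of interest names per user, then join them for display.
by_user = defaultdict(list)
for username, interest in rows:
    by_user[username].append(interest)
summary = {username: ", ".join(names) for username, names in by_user.items()}

print(summary)
```

Ava has no rows in user_interests, so the IS NOT NULL filter leaves her out of the summary.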

Select all rows with values that appear twice by the same customer?

I have two tables:
CREATE TABLE Orders (
ID INT,
Customer INT,
PRIMARY KEY(ID)
);
CREATE TABLE Items (
ID INT,
Barcode INT,
PRIMARY KEY(ID, Barcode)
);
INSERT INTO Orders VALUES
(1, 1), (2, 1), (3, 2), (4, 3), (5, 3);
INSERT INTO Items VALUES
(1, 1), (1, 2), (1, 3), (1, 7),
(2, 1), (2, 3), (3, 2), (3, 8),
(4, 2), (4, 3), (4, 8), (5, 4);
I'm trying to find all customers who have ordered the same item twice and specify the item, but not from the same order. I just need a list of Orders.Customer and Items.Barcode showing this.
Here's a query that helps illustrate:
SELECT i.ID, i.Barcode, o.Customer
FROM Items i, Orders o
WHERE i.ID = o.ID
Which produces the below:
+----+---------+----------+
| ID | Barcode | Customer |
+----+---------+----------+
| 1 | 1 | 1 | # A
| 1 | 2 | 1 |
| 1 | 3 | 1 | # B
| 1 | 7 | 1 |
| 2 | 1 | 1 | # A
| 2 | 3 | 1 | # B
| 3 | 2 | 2 |
| 3 | 8 | 2 |
| 4 | 2 | 3 |
| 4 | 3 | 3 |
| 4 | 8 | 3 |
| 5 | 4 | 3 |
+----+---------+----------+
Note where I tagged A, Barcode 1 appears in both ID 1 and ID 2. Both those orders have the same customer, same barcode, but different order IDs. B is another example.
How can I pull out these rows, so I have something like the below:
+---------+----------+
| Barcode | Customer |
+---------+----------+
| 1 | 1 |
| 3 | 1 |
+---------+----------+
More declaratively, I want to know what customers have ordered the same item twice, and list the items and customers. In other words, "Customer 1 has ordered Items 1 and 3 twice".
I'm trying to find all customers who have ordered the same item twice and specify the item, but not from the same order.
This is pretty simple with a HAVING clause:
SELECT o.Customer, i.Barcode
FROM Orders o JOIN
Items i
ON i.ID = o.ID
GROUP BY o.Customer, i.Barcode
HAVING MIN(o.id) <> MAX(o.id);
Note the use of proper, explicit, standard, readable JOIN syntax. Never use commas in the FROM clause.
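The HAVING approach can be checked against the sample data with a short script. This is a sketch using Python's sqlite3 module in place of MySQL; the ORDER BY and the COUNT(DISTINCT o.ID) > 1 variant are additions for comparison, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE Orders (ID INT, Customer INT, PRIMARY KEY (ID));
    CREATE TABLE Items (ID INT, Barcode INT, PRIMARY KEY (ID, Barcode));
    INSERT INTO Orders VALUES (1, 1), (2, 1), (3, 2), (4, 3), (5, 3);
    INSERT INTO Items VALUES
        (1, 1), (1, 2), (1, 3), (1, 7),
        (2, 1), (2, 3), (3, 2), (3, 8),
        (4, 2), (4, 3), (4, 8), (5, 4);
""")

# A (Customer, Barcode) group whose smallest and largest order id differ
# must span at least two different orders.
min_max_rows = conn.execute("""
    SELECT o.Customer, i.Barcode
    FROM Orders o
    JOIN Items i ON i.ID = o.ID
    GROUP BY o.Customer, i.Barcode
    HAVING MIN(o.ID) <> MAX(o.ID)
    ORDER BY o.Customer, i.Barcode
""").fetchall()

# Same idea expressed by counting the distinct orders in each group.
distinct_rows = conn.execute("""
    SELECT o.Customer, i.Barcode
    FROM Orders o
    JOIN Items i ON i.ID = o.ID
    GROUP BY o.Customer, i.Barcode
    HAVING COUNT(DISTINCT o.ID) > 1
    ORDER BY o.Customer, i.Barcode
""").fetchall()

print(min_max_rows)
print(distinct_rows)
```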
To get the expected result, group by barcode and customer and filter the groups with a HAVING clause, as below.
SELECT i.Barcode, o.Customer
FROM Items i, Orders o
WHERE i.ID = o.ID
GROUP BY i.Barcode, o.Customer HAVING COUNT(*) >1
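I am assuming that you need all records that are repeated more than once. Because (ID, Barcode) is the primary key of Items, each order contributes at most one row per barcode, so a count above 1 means the barcode appears in at least two different orders.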
You may try this -
With cte1 as (SELECT i.ID as orderId, i.Barcode, o.Customer
FROM Items i, Orders o
WHERE i.ID = o.ID),
cte2 as (SELECT i.ID as orderId, i.Barcode, o.Customer
FROM Items i, Orders o
WHERE i.ID = o.ID)
Select distinct cte1.Barcode, cte1.Customer
from cte1, cte2
where
cte1.Barcode = cte2.Barcode
and cte1.Customer = cte2.Customer
and cte1.orderId <> cte2.orderId;
For a more elegant way, please refer to Amit Verma's answer.

How to get user's detail from related table with MySql?

I have 3 tables that look like this:
Users
id | name | password
------------------------
2 | John | ******
3 | Ben | ******
4 | Dan | ******
UserHobbies
id | user_id | hobbie_id
-------------------------
1 | 2 | 1
2 | 2 | 3
3 | 3 | 1
4 | 4 | 2
Hobbies
id | HobbieName
------------------------
1 | Surfing
2 | Walking
3 | Soccer
I want to find the user's related hobbies so the result will look like this:
username | HobbieName | hobbie_id
------------------------
John | Surfing | 1
Ben | Surfing | 1
As you can see - users John and Ben have the same hobby - 'Surfing', so the result will display ONLY them.
Here is what I've done so far -
SELECT users.name, hobbies.hobbie_name, COUNT(user_hobbies.hobby_id) FROM
user_hobbies
INNER JOIN users on user_hobbies.user_id = users.id
INNER JOIN hobbies ON hobbies.id = user_hobbies.hobby_id
GROUP BY user_hobbies.hobby_id
And the result :
name | hobbie_name | count
---------------------------
dan | Surfing | 2
As you can see - I get the count of each hobby rather than a row with the user and the hobby.
To get only the hobbies that have multiple users, join with a subquery that counts the number of users per hobby.
SELECT users.name, hobbies.hobbie_name, user_hobbies.hobby_id
FROM user_hobbies
INNER JOIN users on user_hobbies.user_id = users.id
INNER JOIN hobbies ON hobbies.id = user_hobbies.hobby_id
INNER JOIN (
SELECT hobby_id
FROM user_hobbies
GROUP BY hobby_id
HAVING COUNT(*) > 1
) AS multiple ON multiple.hobby_id = user_hobbies.hobby_id
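A runnable version of this idea, as a sketch: it uses Python's sqlite3 module instead of MySQL and the table definitions from the Data section further down, where the hobby name column is called name rather than hobbie_name. The ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE users (id INT, name VARCHAR(20), password VARCHAR(20));
    CREATE TABLE user_hobbies (id INT, user_id INT, hobby_id INT);
    CREATE TABLE hobbies (id INT, name VARCHAR(20));
    INSERT INTO users VALUES (2, 'John', '**********'), (3, 'Ben', '**********'),
                             (4, 'Dan', '**********');
    INSERT INTO user_hobbies VALUES (1, 2, 1), (2, 2, 3), (3, 3, 1), (4, 4, 2);
    INSERT INTO hobbies VALUES (1, 'Surfing'), (2, 'Walking'), (3, 'Soccer');
""")

shared = conn.execute("""
    SELECT users.name, hobbies.name AS hobby, user_hobbies.hobby_id
    FROM user_hobbies
    INNER JOIN users ON user_hobbies.user_id = users.id
    INNER JOIN hobbies ON hobbies.id = user_hobbies.hobby_id
    INNER JOIN (
        -- hobbies that more than one user has
        SELECT hobby_id
        FROM user_hobbies
        GROUP BY hobby_id
        HAVING COUNT(*) > 1
    ) AS multiple ON multiple.hobby_id = user_hobbies.hobby_id
    ORDER BY users.id
""").fetchall()

print(shared)
```

Unlike a query that filters on a fixed hobby id, the subquery picks up any hobby that more than one user has.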
Could use WHERE too: http://sqlfiddle.com/#!9/241cfd/4/0
Data
create table users (id INT, name VARCHAR(20), password VARCHAR(20));
create table user_hobbies (id INT, user_id INT, hobby_id INT);
create table hobbies (id INT, name VARCHAR(20));
INSERT INTO users VALUES (2, 'John', '**********');
INSERT INTO users VALUES (3, 'Ben', '**********');
INSERT INTO users VALUES (4, 'Dan', '**********');
INSERT INTO user_hobbies VALUES (1, 2, 1);
INSERT INTO user_hobbies VALUES (2, 2, 3);
INSERT INTO user_hobbies VALUES (3, 3, 1);
INSERT INTO user_hobbies VALUES (4, 4, 2);
INSERT INTO hobbies VALUES (1, 'Surfing');
INSERT INTO hobbies VALUES (2, 'Walking');
INSERT INTO hobbies VALUES (3, 'Soccer');
SQL
SELECT u.name, h.name AS hobby, uh.hobby_id
FROM user_hobbies AS uh
INNER JOIN users AS u ON uh.user_id = u.id
INNER JOIN hobbies AS h ON h.id = uh.hobby_id
WHERE uh.hobby_id = 1;
Result
| name | hobby | hobby_id |
|------|---------|----------|
| John | Surfing | 1 |
| Ben | Surfing | 1 |
Hope that helps.

can not get correct results with group by in mysql

I have 2 SQL tables
CREATE TABLE A(
id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
name CHAR(1) NOT NULL,
PRIMARY KEY (id)
);
CREATE TABLE B(
id INT(10) UNSIGNED NOT NULL AUTO_INCREMENT,
A_id INT(10) UNSIGNED NOT NULL,
PRIMARY KEY (id)
);
INSERT INTO A VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'A');
INSERT INTO B VALUES (1, 1), (2, 2), (3, 4), (4, 4);
The tables look this way:
select * from A;
+----+------+
| id | name |
+----+------+
| 1 | A |
| 2 | B |
| 3 | C |
| 4 | A |
+----+------+
select * from B;
+----+------+
| id | A_id |
+----+------+
| 1 | 1 |
| 2 | 2 |
| 3 | 4 |
| 4 | 4 |
+----+------+
Now I want to find out how many times each name from table A is referenced in table B. In other words, I want to see:
A = 3
B = 1
C = 0
I tried to do this with: SELECT name, count(*) FROM A, B WHERE A.id = A_id GROUP BY A.id;, but it returns something completely weird. Can someone help me?
Query
SELECT a.name,COUNT(b.A_id) as `count`
FROM A a
LEFT JOIN B b
ON a.id=b.A_id
GROUP BY a.name;
You just need a left outer join to handle the condition where there are no A's in B:
SELECT A.Name, COUNT(b.id)
FROM A
LEFT OUTER JOIN B on A.id = B.a_id
GROUP BY A.Name;
You should use a LEFT JOIN, and not GROUP BY A.id, but instead by name:
SELECT A.name, COUNT(B.A_id)
FROM A
LEFT JOIN B ON A.id = B.A_id
GROUP BY A.name;
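The difference between the original query and the LEFT JOIN version can be seen side by side. This sketch runs against SQLite through Python's sqlite3 module rather than MySQL, so the column definitions are simplified (no UNSIGNED or AUTO_INCREMENT), the comma join is written as an explicit JOIN, and an ORDER BY is added for a stable output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, name CHAR(1) NOT NULL);
    CREATE TABLE B (id INTEGER PRIMARY KEY, A_id INTEGER NOT NULL);
    INSERT INTO A VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'A');
    INSERT INTO B VALUES (1, 1), (2, 2), (3, 4), (4, 4);
""")

# Inner join grouped by A.id: the two 'A' rows stay separate and 'C' is dropped.
by_id = conn.execute("""
    SELECT A.name, COUNT(*)
    FROM A
    JOIN B ON A.id = B.A_id
    GROUP BY A.id
    ORDER BY A.id
""").fetchall()

# Left join grouped by name: rows of A without a match in B are kept,
# and COUNT(B.A_id) ignores the NULLs they produce, giving 0.
by_name = conn.execute("""
    SELECT A.name, COUNT(B.A_id)
    FROM A
    LEFT JOIN B ON A.id = B.A_id
    GROUP BY A.name
    ORDER BY A.name
""").fetchall()

print(by_id)
print(by_name)
```

Note that COUNT(*) in the LEFT JOIN version would report 1 for C, because the unmatched row itself is still counted.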