MySQL self join question - mysql

Take a look at the following MySQL query:
SELECT fname,lname FROM users WHERE users.id IN (SELECT sub FROM friends WHERE friends.dom = 1 )
The above query first builds the set of all friends.sub values where dom = 1 via the inner query, and then the outer query selects the users whose ids are contained in that set (i.e. the intersection of the two sets).
And this works fine. But what if the inner set needs to contain not only the subs where dom = 1, but also the doms where sub = 1?
The outer query stays the same as above; the inner part, in pure pseudocode:
(SELECT sub FROM friends WHERE friends.dom = 1 )
***AND***
(SELECT dom FROM friends WHERE friends.sub = 1 )
Is it possible to get this sort of functionality with the inner query?
Any help or assistance appreciated guys;-D
Thanks a lot guys, my headache is gone now!

Try this:
SELECT u.fname, u.lname
FROM users u
INNER JOIN friends f
ON (u.id = f.sub AND f.dom = 1)
OR (u.id = f.dom AND f.sub = 1)

I'm not sure if I correctly understand what sub and dom represent, but it looks like you can use a UNION in there:
SELECT fname, lname
FROM users
WHERE users.id IN
(
SELECT sub FROM friends WHERE friends.dom = 1
UNION
SELECT dom FROM friends WHERE friends.sub = 1
);
Test case:
CREATE TABLE users (id int, fname varchar(10), lname varchar(10));
CREATE TABLE friends (dom int, sub int);
INSERT INTO users VALUES (1, 'Bob', 'Smith');
INSERT INTO users VALUES (2, 'Peter', 'Brown');
INSERT INTO users VALUES (3, 'Jack', 'Green');
INSERT INTO users VALUES (4, 'Kevin', 'Jackson');
INSERT INTO users VALUES (5, 'Steven', 'Black');
INSERT INTO friends VALUES (1, 2);
INSERT INTO friends VALUES (1, 3);
INSERT INTO friends VALUES (4, 1);
INSERT INTO friends VALUES (3, 4);
INSERT INTO friends VALUES (5, 2);
Result:
+-------+---------+
| fname | lname |
+-------+---------+
| Peter | Brown |
| Jack | Green |
| Kevin | Jackson |
+-------+---------+
3 rows in set (0.00 sec)
That said, @Alec's solution is probably more efficient.
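To check that the two answers above agree, the test case can be replayed in SQLite through Python's sqlite3 module. SQLite is only a stand-in here so the sketch is self-contained; the answers themselves target MySQL. The extra (2, 1) row inserted at the end is hypothetical and not part of the original test case. It shows that the JOIN form returns a user once per matching friends row, while the IN (... UNION ...) form returns each user once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.executescript("""
CREATE TABLE users (id int, fname varchar(10), lname varchar(10));
CREATE TABLE friends (dom int, sub int);
INSERT INTO users VALUES (1, 'Bob', 'Smith'), (2, 'Peter', 'Brown'),
  (3, 'Jack', 'Green'), (4, 'Kevin', 'Jackson'), (5, 'Steven', 'Black');
INSERT INTO friends VALUES (1, 2), (1, 3), (4, 1), (3, 4), (5, 2);
""")

UNION_QUERY = """
SELECT fname, lname FROM users
WHERE users.id IN (
  SELECT sub FROM friends WHERE friends.dom = 1
  UNION
  SELECT dom FROM friends WHERE friends.sub = 1
)"""

JOIN_QUERY = """
SELECT u.fname, u.lname
FROM users u
INNER JOIN friends f
  ON (u.id = f.sub AND f.dom = 1)
  OR (u.id = f.dom AND f.sub = 1)"""

def run(sql):
    """Run a query and return its rows sorted, so results can be compared."""
    return sorted(conn.execute(sql).fetchall())

union_rows = run(UNION_QUERY)
join_rows = run(JOIN_QUERY)
print(union_rows == join_rows)  # True on the original test case

# Hypothetical extra row: user 2 is now linked to user 1 in both directions.
conn.execute("INSERT INTO friends VALUES (2, 1)")
union_rows_after = run(UNION_QUERY)
join_rows_after = run(JOIN_QUERY)
print(len(union_rows_after), len(join_rows_after))  # 3 4
```

If the JOIN form is preferred, adding DISTINCT to its SELECT removes the repeated user.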

Related

Mysql8 join and count unique real appearances

I have the following tables:
CREATE TABLE topics (
id INT,
text VARCHAR(100),
parent VARCHAR(1)
);
CREATE TABLE sentiment (
id INT,
grade INT,
parent VARCHAR(1)
);
And the following data:
INSERT INTO topics (id, text, parent) VALUES (1, 'Cryptocurrency', 'A');
INSERT INTO topics (id, text, parent) VALUES (2, 'Cryptocurrency', 'B');
INSERT INTO topics (id, text, parent) VALUES (2, 'ETH', 'B');
INSERT INTO sentiment (id, grade, parent) VALUES (2, 0 , 'A');
INSERT INTO sentiment (id, grade, parent) VALUES (2, 1 , 'A');
INSERT INTO sentiment (id, grade, parent) VALUES (2, 1 , 'A');
INSERT INTO sentiment (id, grade, parent) VALUES (2, 1 , 'A');
INSERT INTO sentiment (id, grade, parent) VALUES (2, 0 , 'B');
INSERT INTO sentiment (id, grade, parent) VALUES (2, 1 , 'B');
I want to select, for each topics.text, its count and the sum of sentiment.grade over the parents it shares.
So I came up with the following query:
SELECT
count(topics.text),
topics.text,
sum(sentiment.grade)
FROM topics
inner join sentiment on (sentiment.parent = topics.parent)
group by text
The result:
| count(topics.text) | sum(sentiment.grade) | text |
| ------------------ | -------------------- | -------------- |
| 6 | 4 | Cryptocurrency |
| 2 | 1 | ETH |
I only have a problem with the first column: the real count of Cryptocurrency is 2 and the real count of ETH is 1.
Can you fix this query?
(I'm using MySQL 8, but would be glad to have a 5.7-compliant solution if possible)
View on DB Fiddle
SELECT
count(distinct t.id),
t.text,
sum(s.grade)
FROM topics t
JOIN sentiment s on s.parent = t.parent
GROUP BY t.text
As you have two rows with text = 'Cryptocurrency' in topics, one with parent = A and the other with parent = B, you should expect the join to produce 6 rows for Cryptocurrency (the first row of topics matches the first four rows of sentiment, and the second row of topics matches the last two). You can see that if you change your original query to this one:
SELECT
*
FROM topics
inner join sentiment on (sentiment.parent = topics.parent)
I guess you want to see the number of topics with the same text and the total grades their parents have (for cryptocurrency, the sum of A and B). This could help you:
SELECT
topics_count.n_text,
topics.text,
SUM(sentiment.grade)
FROM topics
INNER JOIN (SELECT text, COUNT(*) AS n_text FROM topics GROUP BY text) topics_count ON topics.text = topics_count.text
INNER JOIN sentiment ON (sentiment.parent = topics.parent)
GROUP BY topics.text, topics_count.n_text
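The row multiplication described above can be made visible by putting the plain count and the distinct count side by side. This is a minimal sketch that loads the question's data into SQLite through Python's sqlite3 (a stand-in for MySQL, chosen only to keep the example self-contained); the column aliases are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.executescript("""
CREATE TABLE topics (id INT, text VARCHAR(100), parent VARCHAR(1));
CREATE TABLE sentiment (id INT, grade INT, parent VARCHAR(1));
INSERT INTO topics VALUES
  (1, 'Cryptocurrency', 'A'), (2, 'Cryptocurrency', 'B'), (2, 'ETH', 'B');
INSERT INTO sentiment VALUES
  (2, 0, 'A'), (2, 1, 'A'), (2, 1, 'A'), (2, 1, 'A'), (2, 0, 'B'), (2, 1, 'B');
""")

rows = conn.execute("""
SELECT t.text,
       COUNT(t.text)        AS joined_rows,  -- counts rows after the join
       COUNT(DISTINCT t.id) AS topic_rows,   -- counts distinct topic ids
       SUM(s.grade)         AS grade_sum
FROM topics t
JOIN sentiment s ON s.parent = t.parent
GROUP BY t.text
ORDER BY t.text
""").fetchall()

for row in rows:
    print(row)
# ('Cryptocurrency', 6, 2, 4)
# ('ETH', 2, 1, 1)
```

Note that COUNT(DISTINCT t.id) gives 2 for Cryptocurrency only because the two rows carry different ids; if topic rows with the same text could share an id, counting in a separate subquery on topics would be the safer option.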

Using the count function on third table in two table select statement in MariaDB

I just spent a few hours reading through the MariaDB docs and various questions here trying to figure out a SQL statement that did what I want. I'm definitely not an expert... eventually I did get the result I expected, but I have no idea why it works. I want to be sure I am actually getting the result I want, and it isn't just working for the few test cases I have thrown at it.
I have three tables guestbook, users, and user_likes. I am trying to write a SQL statement that will return the user name and first name from users, post content, post date, post id from guestbook, and a third column likes which is the total number of times that post id from guestbook appears in the user_likes table. It should only return posts which are of type standard and should order the rows by ascending post date.
Sample data:
CREATE TABLE users
(`user_id` int, `user_first` varchar(6), `user_last` varchar(7),
`user_email` varchar(26), `user_uname` varchar(6))
;
INSERT INTO users
(`user_id`, `user_first`, `user_last`, `user_email`, `user_uname`)
VALUES
(0, 'Bob', 'Abc', 'email@example.com', 'user1'),
(13, 'Larry', 'Abc', 'email@example.com', 'user2'),
(15, 'Noel', 'Abc', 'email@example.com', 'user3'),
(16, 'Kate', 'Abc', 'email@example.com', 'user4'),
(17, 'Walter', 'Sobchak', 'walter.sobchak@shabbus.com', 'Walter'),
(18, 'Jae', 'Abc', 'email@example.com', 'user5')
;
CREATE TABLE user_likes
(`user_id` int, `post_id` int, `like_id` int)
;
INSERT INTO user_likes
(`user_id`, `post_id`, `like_id`)
VALUES
(0, 23, 1),
(0, 41, 2),
(13, 23, 7)
;
CREATE TABLE guestbook
(`post_id` int, `user_id` int, `post_date` datetime,
`post_content` varchar(27), `post_type` varchar(8),
`post_level` int, `post_parent` varchar(4))
;
INSERT INTO guestbook
(`post_id`, `user_id`, `post_date`, `post_content`,
`post_type`, `post_level`, `post_parent`)
VALUES
(2, 0, '2018-12-15 20:32:40', 'test1', 'testing', 0, NULL),
(8, 0, '2018-12-16 14:06:40', 'test2', 'testing', 0, NULL),
(9, 13, '2018-12-16 15:47:55', 'test4', 'testing', 0, NULL),
(23, 0, '2018-12-25 17:59:46', 'Merry Christmas!', 'standard', 0, NULL),
(39, 16, '2018-12-26 00:28:04', 'Hello!', 'standard', 0, NULL),
(40, 15, '2019-01-27 00:46:12', 'Hello 2', 'standard', 0, NULL),
(41, 18, '2019-02-25 00:44:35', 'What are you doing?', 'standard', 0, NULL)
;
I tried a whole bunch of convoluted statements involving count and couldn't get what I wanted. Through what seems like dumb luck I stumbled into creating this statement which appears to be giving me what I want.
SELECT
u.user_uname, u.user_first, g.post_id, g.post_date,
g.post_content, count(user_likes.post_id) AS likes
FROM
users AS u, guestbook AS g
LEFT JOIN
user_likes on g.post_id=user_likes.post_id
WHERE
u.user_id=g.user_id AND g.post_type='standard'
GROUP BY
g.post_id
ORDER BY
g.post_date ASC;
Question:
Why does this count function appear to work?
The count function that I was able to get working is this, but it only works for hard coded post_id values.
SELECT COUNT(CASE post_id WHEN 23 THEN 1 ELSE null END) FROM user_likes;
When I try to match the post_id from guestbook table by changing to this I get an incorrect value which appears to be the whole table of user_likes.
SELECT COUNT(case when guestbook.post_id=user_likes.post_id then 1 else null end) FROM guestbook, user_likes;
Adding a GROUP BY guestbook.post_id to the end gets me closer, but now I need to figure out how to combine that with my original select statement.
+----------------------------------------------------------------------------+
| COUNT(case when guestbook.post_id=user_likes.post_id then 1 else null end) |
+----------------------------------------------------------------------------+
| 0 |
| 0 |
| 0 |
| 2 |
| 0 |
| 0 |
| 1 |
+----------------------------------------------------------------------------+
This is the output I want, which I am getting. I just don't trust that my statement is reliable or correct.
+------------+------------+---------+---------------------+---------------------+-------+
| user_uname | user_first | post_id | post_date | post_content | likes |
+------------+------------+---------+---------------------+---------------------+-------+
| user1 | Bob | 23 | 2018-12-25 17:59:46 | Merry Christmas! | 2 |
| user4 | Kate | 39 | 2018-12-26 00:28:04 | Hello! | 0 |
| user3 | Noel | 40 | 2019-01-27 00:46:12 | Hello 2 | 0 |
| user5 | Jae | 41 | 2019-02-25 00:44:35 | What are you doing? | 1 |
+------------+------------+---------+---------------------+---------------------+-------+
Fiddle of statement working: http://sqlfiddle.com/#!9/968656/1/0
JOIN + COUNT -- A query first combines the tables as directed by the JOIN and ON clauses. The result is put (at least logically) into a temporary table. Often this temp table has many more rows than any of the tables being JOINed.
Then the COUNT(..) is performed. It is counting the number of rows in that temp table. Maybe that count is exactly what you want, maybe it is a hugely inflated number.
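count(user_likes.post_id) has the additional hiccup of not counting any rows where user_likes.post_id IS NULL. That is usually irrelevant, in which case you should simply say COUNT(*). With a LEFT JOIN, though, it does matter: the NULLs produced for unmatched rows are what make a post with no likes count as 0 rather than 1.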
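The difference between COUNT(*) and COUNT(column) after a LEFT JOIN can be seen on a cut-down version of the question's tables. This is a sketch run in SQLite through Python's sqlite3 (a stand-in for MariaDB, used only so the example is self-contained); the guestbook table is reduced to two columns and three of the original posts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MariaDB (assumption)
conn.executescript("""
CREATE TABLE guestbook (post_id int, post_type varchar(8));
CREATE TABLE user_likes (user_id int, post_id int, like_id int);
INSERT INTO guestbook VALUES (23, 'standard'), (39, 'standard'), (41, 'standard');
INSERT INTO user_likes VALUES (0, 23, 1), (0, 41, 2), (13, 23, 7);
""")

rows = conn.execute("""
SELECT g.post_id,
       COUNT(*)          AS count_star,  -- counts the NULL-padded row too
       COUNT(ul.post_id) AS count_col    -- skips rows where ul.post_id IS NULL
FROM guestbook AS g
LEFT JOIN user_likes AS ul ON ul.post_id = g.post_id
GROUP BY g.post_id
ORDER BY g.post_id
""").fetchall()

for row in rows:
    print(row)
# (23, 2, 2)
# (39, 1, 0)
# (41, 1, 1)
```

Post 39 has no likes, so the LEFT JOIN emits one row for it with NULLs on the user_likes side: COUNT(*) reports 1, COUNT(ul.post_id) reports 0.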
Please don't use the commalist form for joining. Always use FROM a JOIN b ON ... where the ON clause says how tables a and b are related. If there is also some filtering, put that into the WHERE clause.
If the COUNT is too big, put aside the query you have developed and start over to develop a query that does exactly one thing: compute the count. This query will probably use fewer tables.
Then build on that to get any other data you need. It may look something like
SELECT ...
FROM ( SELECT foo, COUNT(*) AS ct FROM t1 GROUP BY foo ) AS sub1
JOIN t2 ON t2.foo = sub1.foo
JOIN t3 ON ...
WHERE ...
Get that initial query that gets the right COUNT. Then, if needed, come back for more help.
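Applied to the tables in the question, the "count first, then join" pattern sketched above might look like the following. It is an illustration only, run in SQLite through Python's sqlite3 as a stand-in for MariaDB, with the tables trimmed to the columns the query needs. The alias names l and ct and the use of COALESCE with a LEFT JOIN (so that posts without likes still appear with 0) are choices made for this sketch, not something taken from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MariaDB (assumption)
conn.executescript("""
CREATE TABLE users (user_id int, user_uname varchar(6));
CREATE TABLE user_likes (user_id int, post_id int, like_id int);
CREATE TABLE guestbook (post_id int, user_id int, post_date datetime,
                        post_type varchar(8));
INSERT INTO users VALUES
  (0, 'user1'), (13, 'user2'), (15, 'user3'), (16, 'user4'), (18, 'user5');
INSERT INTO user_likes VALUES (0, 23, 1), (0, 41, 2), (13, 23, 7);
INSERT INTO guestbook VALUES
  (9, 13, '2018-12-16 15:47:55', 'testing'),
  (23, 0, '2018-12-25 17:59:46', 'standard'),
  (39, 16, '2018-12-26 00:28:04', 'standard'),
  (40, 15, '2019-01-27 00:46:12', 'standard'),
  (41, 18, '2019-02-25 00:44:35', 'standard');
""")

rows = conn.execute("""
SELECT u.user_uname, g.post_id, COALESCE(l.ct, 0) AS likes
FROM guestbook AS g
JOIN users AS u ON u.user_id = g.user_id
LEFT JOIN (SELECT post_id, COUNT(*) AS ct   -- step 1: the count, on its own
           FROM user_likes
           GROUP BY post_id) AS l ON l.post_id = g.post_id
WHERE g.post_type = 'standard'
ORDER BY g.post_date
""").fetchall()

for row in rows:
    print(row)
# ('user1', 23, 2)
# ('user4', 39, 0)
# ('user3', 40, 0)
# ('user5', 41, 1)
```

The like counts match the output table the question asks for, and the count can no longer be inflated by the other joins because it is computed before they happen.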
As tried by Bryan
OK, I made a few changes.
SELECT u.user_uname, u.user_first,
g2.post_id, g2.post_content, g2.post_date,
sub.likes
FROM
(
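SELECT g.post_id,
SUM(g.post_id = ul.post_id) AS likes
FROM guestbook AS g
JOIN user_likes AS ul -- no ON clause: every post is paired with every like
WHERE g.post_type = 'standard'
GROUP BY g.post_id -- one row per post; without it the derived table is a single row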
) AS sub
JOIN guestbook AS g2 ON sub.post_id = g2.post_id
JOIN users AS u ON u.user_id = g2.user_id;
Indexes:
guestbook: (post_type, post_id) -- for derived table
guestbook: (post_id) -- for outer SELECT
users: (user_id)
user_likes: (post_id)
Notes:
ORDER BY removed since it was useless in context.
COUNT..CASE changed to shorter SUM.
JOIN ON used
Since there is only one value coming from the derived table, this might work equally well:
SELECT u.user_uname, u.user_first,
g.post_id, g.post_content, g.post_date,
( SELECT COUNT(*)
FROM user_likes AS ul
WHERE g.post_id = ul.post_id
) AS likes
FROM guestbook AS g
JOIN users AS u USING(user_id)
WHERE g.post_type = 'standard';
This involved lots of changes; see if it looks 'right'. It is now a lot simpler.
Indexes are same as above.

Subquery on JOIN to pulling all ID/Names

* user
user_id
name
* client
client_id
name
* user_client
user_client_id
user_id
client_id
* message
message_id
client_id
description
Sample Table Rows
user
user_id
1
2
3
client
client_id name
10 John
11 James
12 David
13 Richard
14 Bob
user_client
user_id client_id
1 11
1 13
3 14
3 10
message
message_id client_id message
1 11 Hello World
2 12 MySQL is awesome
3 14 I like StackOverflow
4 13 This is very cool
What is not working is when I use that query as a subquery on a LEFT JOIN to pull the messages only for those clients pertinent to the user.
Any ideas?
Thanks!
The DDL to set up the example (in MySQL) and the query I believe you are looking for are shown below.
/*
-- DDL TO SET UP EXAMPLE
create schema example;
use example;
create table user (
user_id int,
name varchar(64)
);
create table client(
client_id int,
name varchar(64)
);
create table user_client (
user_client_id int,
user_id int,
client_id int
);
create table message(
message_id int,
client_id int,
message varchar(64)
);
insert into user values (1, 'Peter');
insert into user values (2, 'Paul');
insert into user values (3, 'Mary');
insert into client values (10, 'John');
insert into client values (11, 'James');
insert into client values (12, 'David');
insert into client values (13, 'Richard');
insert into client values (14, 'Bob');
insert into user_client values (1, 1, 11);
insert into user_client values (2, 1, 11);
insert into user_client values (3, 3, 14);
insert into user_client values (4, 3, 10);
insert into message values (1, 11, 'Hello World');
insert into message values (2, 12, 'MySQL is awesome');
insert into message values (3, 14, 'I like StackOverflow');
insert into message values (4, 13, 'This is very cool');
*/
-- query to get all messages for all clients of a given user
select
*
from
user_client uc
join user u on uc.user_id = u.user_id
join client c on uc.client_id = c.client_id
join message m on m.client_id = c.client_id
where
u.user_id = 1;
-- query to get all messages for a given client
select
*
from
user_client uc
join user u on uc.user_id = u.user_id
join client c on uc.client_id = c.client_id
join message m on m.client_id = c.client_id
where
c.client_id = 11;
This should really be done as three separate queries as three different questions are being asked based on the comments:
So you want a query that will return messages from related clients if
there is a record in client, and all messages if there is no record in
client for a given user? – John
...
Hi @John that is correct. And that's why I was using the subquery,
because it pulls exactly that, but for some reason the client_id and
name are coming as NULL for all but one. – Kitara
The first question (query) is find all of the messages for all of the clients of a given user.
The second question (query) is: Does a user have "authorization" to view all messages. If the user has no clients then that user is "authorized" to view all messages.
The third question (query) is: If the user is authorized to view all messages get all of the messages.
These are very simple, straightforward queries to write, execute, and understand. Trying to conflate all of this into a single query will add complexity and represents poor separation of concerns. If executing three very simple queries represents a performance issue, the architecture of the application needs to be reconsidered.
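The three questions above can be kept as three small queries and combined in application code. The following is a sketch under assumptions: it uses SQLite through Python's sqlite3 in place of MySQL, loads only the user_client and message tables (with the corrected sample rows), and the function name visible_messages is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.executescript("""
CREATE TABLE user_client (user_client_id int, user_id int, client_id int);
CREATE TABLE message (message_id int, client_id int, message varchar(64));
INSERT INTO user_client VALUES
  (1, 1, 11), (2, 1, 13), (3, 3, 14), (4, 3, 10), (5, 1, 15);
INSERT INTO message VALUES
  (1, 11, 'Hello World'), (2, 12, 'MySQL is awesome'),
  (4, 13, 'This is very cool'), (3, 14, 'I like StackOverflow');
""")

def visible_messages(conn, user_id):
    """Hypothetical helper: return the messages a user may see."""
    # Second question: does the user have any clients at all?
    (n_clients,) = conn.execute(
        "SELECT COUNT(*) FROM user_client WHERE user_id = ?", (user_id,)
    ).fetchone()
    if n_clients == 0:
        # Third question: a user with no clients sees every message.
        cursor = conn.execute("SELECT message FROM message ORDER BY message_id")
    else:
        # First question: messages for the user's own clients only.
        cursor = conn.execute(
            """SELECT m.message
               FROM user_client uc
               JOIN message m ON m.client_id = uc.client_id
               WHERE uc.user_id = ?
               ORDER BY m.message_id""",
            (user_id,),
        )
    return [row[0] for row in cursor]

print(visible_messages(conn, 1))  # ['Hello World', 'This is very cool']
print(visible_messages(conn, 2))  # all four messages, user 2 has no clients
print(visible_messages(conn, 3))  # ['I like StackOverflow']
```

Each query stays trivial to read and test on its own, and the "no clients means see everything" rule lives in one visible place rather than inside a join condition.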
There was a mistake in my original DDL in one of the inserts (fixed below). In the SQL below I've also added a client with no messages. I believe the query at the end of what is posted below is what you are looking for.
-- DDL TO SET UP EXAMPLE
drop schema if exists example;
create schema example;
use example;
create table user (
user_id int,
name varchar(64)
);
create table client(
client_id int,
name varchar(64)
);
create table user_client (
user_client_id int,
user_id int,
client_id int
);
create table message(
message_id int,
client_id int,
message varchar(64)
);
insert into user values (1, 'Peter');
insert into user values (2, 'Paul');
insert into user values (3, 'Mary');
insert into client values (10, 'John');
insert into client values (11, 'James');
insert into client values (12, 'David');
insert into client values (13, 'Richard');
insert into client values (14, 'Bob');
insert into client values (15, 'Quiet Client');
insert into user_client values (1, 1, 11);
insert into user_client values (2, 1, 13);
insert into user_client values (3, 3, 14);
insert into user_client values (4, 3, 10);
insert into user_client values (5, 1, 15);
insert into message values (1, 11, 'Hello World');
insert into message values (2, 12, 'MySQL is awesome');
insert into message values (4, 13, 'This is very cool');
insert into message values (3, 14, 'I like StackOverflow');
-- query to get all messages for all clients of a given user
select
u.user_id,
u.name user_name,
c.client_id,
c.name client_name,
m.message
from
user_client uc
join user u on uc.user_id = u.user_id
join client c on uc.client_id = c.client_id
left outer join message m on m.client_id = c.client_id
where
u.user_id = 1;
Output:
+---------+-----------+-----------+--------------+-------------------+
| user_id | user_name | client_id | client_name  | message           |
+---------+-----------+-----------+--------------+-------------------+
| 1       | Peter     | 11        | James        | Hello World       |
| 1       | Peter     | 13        | Richard      | This is very cool |
| 1       | Peter     | 15        | Quiet Client |                   |
+---------+-----------+-----------+--------------+-------------------+
3 rows

SQL limit for LEFT JOINed table

I have the following tables.
Industry(id, name)
Movie(id, name, industry_id) [Industry has many movies]
Trailer(id, name, movie_id) [Movie has many trailers]
I need to find the 6 latest trailers for each industry. A movie does not need to have a trailer, and it can have several (0 to n).
CREATE TABLE industry(id int, name char(10), PRIMARY KEY (id));
CREATE TABLE movie(id int, name char(10), industry_id int, PRIMARY KEY (id),
FOREIGN KEY (industry_id) REFERENCES industry(id));
CREATE TABLE trailer(id int, name char(10), movie_id int, PRIMARY KEY (id),
FOREIGN KEY (movie_id) REFERENCES movie(id));
INSERT INTO industry VALUES (1, "sandalwood");
INSERT INTO industry VALUES (2, "kollywood");
INSERT INTO movie VALUES (1, "lakshmi", 1);
INSERT INTO movie VALUES (2, "saarathi", 2);
INSERT INTO trailer VALUES (1, "lakshmi1", 1);
INSERT INTO trailer VALUES (2, "lakshmi2", 1);
INSERT INTO trailer VALUES (3, "lakshmi3", 1);
INSERT INTO trailer VALUES (4, "lakshmi4", 1);
INSERT INTO trailer VALUES (5, "lakshmi5", 1);
INSERT INTO trailer VALUES (6, "lakshmi6", 1);
INSERT INTO trailer VALUES (7, "saarathi4", 2);
INSERT INTO trailer VALUES (8, "saarathi5", 2);
INSERT INTO trailer VALUES (9, "saarathi6", 2);
SELECT c.*
FROM industry a
LEFT JOIN movie b
ON a.id = b.industry_id
LEFT JOIN trailer c
ON b.id = c.movie_id
LIMIT 0, 6
| ID | NAME | MOVIE_ID |
----------------------------
| 1 | lakshmi1 | 1 |
| 2 | lakshmi2 | 1 |
| 3 | lakshmi3 | 1 |
| 4 | lakshmi4 | 1 |
| 5 | lakshmi5 | 1 |
| 6 | lakshmi6 | 1 |
I need to fetch only one recent trailer from each movie, but I am getting all trailers for each movie. Please suggest an SQL statement that does this.
I'm not sure if this works in MySQL or not because I can't remember if you can have subqueries inside of an IN clause, but you might try:
select * from trailer
where id in (select max(id) from trailer group by movie_id)
Whether it works or not, it looks like you're not using the industry table in your query, so there's not much point in joining to it (unless you are actually trying to exclude movies that don't have any industry assigned to them, but based on your sample it doesn't look like that was your intention).
If the above query doesn't work in MySQL, then try this one:
select t.*
from trailer t join
(select max(id) id from trailer group by movie_id) t2 on t.id = t2.id
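Both forms above, the IN subquery and the join against a derived table of MAX(id) values, can be checked against the question's trailer rows. This sketch runs them in SQLite through Python's sqlite3 (a stand-in for MySQL, so it does not settle whether a particular MySQL version accepts the IN form), and like the answer it treats the highest id as the most recent trailer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.executescript("""
CREATE TABLE trailer (id int, name char(10), movie_id int);
INSERT INTO trailer VALUES
  (1, 'lakshmi1', 1), (2, 'lakshmi2', 1), (3, 'lakshmi3', 1),
  (4, 'lakshmi4', 1), (5, 'lakshmi5', 1), (6, 'lakshmi6', 1),
  (7, 'saarathi4', 2), (8, 'saarathi5', 2), (9, 'saarathi6', 2);
""")

# Form 1: filter with IN over the per-movie maximum ids.
in_rows = conn.execute("""
SELECT * FROM trailer
WHERE id IN (SELECT MAX(id) FROM trailer GROUP BY movie_id)
ORDER BY id
""").fetchall()

# Form 2: join against a derived table holding the same maximum ids.
join_rows = conn.execute("""
SELECT t.*
FROM trailer t
JOIN (SELECT MAX(id) AS id FROM trailer GROUP BY movie_id) t2 ON t.id = t2.id
ORDER BY t.id
""").fetchall()

print(in_rows)               # [(6, 'lakshmi6', 1), (9, 'saarathi6', 2)]
print(in_rows == join_rows)  # True
```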
To get the most recent trailer, you should include a date column in the table so that there is something to order by.
If you must do this all in SQL (and not in whatever backend or code you are using, which I would actually recommend) then you are probably going to have to rely on some variable magic.
Essentially, you need to "rank" each trailer by the date and then "partition" it by the movie that the trailer belongs to. These words have actual meaning in some other flavors of SQL (such as Oracle's), but MySQL had no native equivalent before version 8.0, which added window functions.
You're going to want to do something similar to what is mentioned in this SO post. Once you get the "ranks" in there partitioned by movie_id, you just select WHERE rank <= 6 (with ranks starting at 1). The query could get pretty messy and there is some risk in using variables in that way, but from what I can tell this is the best way to do it strictly with a MySQL query on those older versions.
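On MySQL 8.0 and later the rank-and-partition idea can be written directly with ROW_NUMBER() OVER (PARTITION BY ...), without user variables. The sketch below is an illustration under assumptions: it runs in SQLite through Python's sqlite3 (window functions need SQLite 3.25 or newer), it partitions by industry because the question asks for the latest trailers per industry, it uses the trailer id as a proxy for recency because the sample schema has no date column, and it keeps the top 2 instead of 6 so the output stays short. The names ranked, rn and top_n are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL 8.0+ (assumption)
conn.executescript("""
CREATE TABLE industry (id int, name char(10));
CREATE TABLE movie (id int, name char(10), industry_id int);
CREATE TABLE trailer (id int, name char(10), movie_id int);
INSERT INTO industry VALUES (1, 'sandalwood'), (2, 'kollywood');
INSERT INTO movie VALUES (1, 'lakshmi', 1), (2, 'saarathi', 2);
INSERT INTO trailer VALUES
  (1, 'lakshmi1', 1), (2, 'lakshmi2', 1), (3, 'lakshmi3', 1),
  (4, 'lakshmi4', 1), (5, 'lakshmi5', 1), (6, 'lakshmi6', 1),
  (7, 'saarathi4', 2), (8, 'saarathi5', 2), (9, 'saarathi6', 2);
""")

top_n = 2  # the question asks for 6; 2 keeps the example output short

rows = conn.execute("""
SELECT industry, trailer
FROM (
  SELECT i.name AS industry,
         t.name AS trailer,
         -- number the trailers inside each industry, newest id first
         ROW_NUMBER() OVER (PARTITION BY i.id ORDER BY t.id DESC) AS rn
  FROM industry i
  JOIN movie m ON m.industry_id = i.id
  JOIN trailer t ON t.movie_id = m.id
) AS ranked
WHERE rn <= ?
ORDER BY industry, rn
""", (top_n,)).fetchall()

for row in rows:
    print(row)
# ('kollywood', 'saarathi6')
# ('kollywood', 'saarathi5')
# ('sandalwood', 'lakshmi6')
# ('sandalwood', 'lakshmi5')
```

Changing PARTITION BY i.id to PARTITION BY m.id and top_n to 1 gives one latest trailer per movie instead.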
Try this query
SELECT * FROM industry
LEFT JOIN movie on movie.industry_id = industry.id
LEFT JOIN (
SELECT
id as T_ID,
name as T_Name,
movie_id
FROM trailer
INNER JOIN ( SELECT
MAX(id) as TID
FROM trailer
GROUP BY movie_id
) as t on t.TID = trailer.id
) as c on c.movie_id = movie.id;
Here is the Demo
SELECT i.name, m.name, MAX(t.id) AS 'Latest Trailer ID', MAX(t.name) AS 'Latest Trailer'
FROM industry i
INNER JOIN movie m ON(i.id = m.industry_id)
INNER JOIN trailer t ON(m.id = movie_id)
GROUP BY m.id
If you want latest trailer by id of trailer table then use below query:
SELECT * FROM trailer t
INNER JOIN (SELECT movie_id, MAX(id) id
FROM trailer GROUP BY movie_id) AS A ON t.id = A.id
Or, if you want the latest by date (assuming the table has a date column, here called latestbydate), use this query:
SELECT * FROM trailer t
INNER JOIN (SELECT movie_id, MAX(latestbydate) latestbydate
FROM trailer GROUP BY movie_id
) AS A ON t.movie_id = A.movie_id AND t.latestbydate = A.latestbydate

MySQL: Concatenate different values as single value

In a MySQL database I have a projects table with a one-to-many relation to a domain table, like this:
projects
----------------
proId | domainId
----------------
1 1
1 2
2 1
3 3
domain
---------------------
domainId | domainName
---------------------
1 Web
2 Mobile
3 iPhone
I query
SELECT p.*, d.* FROM projects p LEFT JOIN domain d ON p.domainId = d.domainId
which results
result
-----------------------------
proId | domainId | domainName
-----------------------------
1 1 Web
1 2 Mobile
2 1 Web
3 3 iPhone
But is it possible to show all domains of a project as a single concatenated value, something like
-----------------------------
proId | domainId | domainName
-----------------------------
1 1, 2 Web, Mobile
2 1 Web
3 3 iPhone
You are probably looking for the GROUP_CONCAT function.
SELECT a.proID,
GROUP_CONCAT(b.domainID) domainId,
GROUP_CONCAT(b.domainName) domainName
FROM projects a
LEFT JOIN domain b
ON a.domainID = b.domainID
GROUP BY a.proID
SQLFiddle Demo
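The effect of GROUP_CONCAT on the question's data can be reproduced with a small sketch in SQLite through Python's sqlite3, used here as a stand-in for MySQL. One difference is an assumption to keep in mind: SQLite takes the separator as a second argument, group_concat(x, ', '), whereas MySQL writes GROUP_CONCAT(x SEPARATOR ', '). SQLite also does not promise the order of the concatenated items, so the sketch compares them as sorted lists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL (assumption)
conn.executescript("""
CREATE TABLE projects (proId int, domainId int);
CREATE TABLE domain (domainId int, domainName varchar(6));
INSERT INTO projects VALUES (1, 1), (1, 2), (2, 1), (3, 3);
INSERT INTO domain VALUES (1, 'Web'), (2, 'Mobile'), (3, 'iPhone');
""")

rows = conn.execute("""
SELECT a.proId,
       group_concat(b.domainId, ', ')   AS domainId,
       group_concat(b.domainName, ', ') AS domainName
FROM projects a
LEFT JOIN domain b ON a.domainId = b.domainId
GROUP BY a.proId
ORDER BY a.proId
""").fetchall()

# Split the concatenated strings again so the check ignores item order.
result = {
    pro_id: (sorted(ids.split(", ")), sorted(names.split(", ")))
    for pro_id, ids, names in rows
}
for pro_id, (ids, names) in result.items():
    print(pro_id, ids, names)
```

Project 1 collapses to a single row carrying both of its domains, while projects 2 and 3 keep their single domain, which is the shape the question asks for.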
To keep the question quick I originally posted only the major points. Now that it is solved with the help of friends, I am posting the complete DB schema and query.
Schema
CREATE TABLE projects
(`projectId` int, `projectName` varchar(20))
;
INSERT INTO projects
(`projectId`, `projectName`)
VALUES
(1, 'P1'),
(2, 'P2'),
(3, 'P3')
;
CREATE TABLE domain
(`domainId` int, `domainName` varchar(6))
;
INSERT INTO domain
(`domainId`, `domainName`)
VALUES
(1, 'Web'),
(2, 'Mobile'),
(3, 'iPhone')
;
CREATE TABLE prodomain
(`domainId` int, `projectId` int)
;
INSERT INTO prodomain
(`domainId`, `projectId`)
VALUES
(1, 1),
(1, 1),
(3, 2),
(2, 3)
;
Query
SELECT p.projectId as proId, projectName,
GROUP_CONCAT(d.domainId separator ', ') as all_domains_id,
GROUP_CONCAT(d.domainName separator ', ') as all_domains_name
FROM projects p
LEFT JOIN prodomain pd ON p.projectId = pd.projectId
LEFT JOIN domain d ON d.domainId = pd.domainId
GROUP BY p.projectId
SELECT
p.proId,
group_concat(d.domainId) as domainId,
group_concat(d.domainName) as domainName
FROM
projects p
inner JOIN
domain d ON p.domainId = d.domainId
group by p.proId
order by p.proId