unable to join two tables, one with multiple rows - mysql

I have two tables that I am attempting to join in MySQL:
reviews:
| review_id | comment | reviewer_id | user_id |
-----------------------------------------------------------
| 1 | some text. | 501 | 100 |
| 2 | lorem ipsum | 606 | 100 |
| 3 | blah blah. | 798 | 120 |
| 4 | foo bar! | 798 | 133 |
-----------------------------------------------------------
review_status:
| review_id | status | timestamp |
----------------------------------------
| 1 | 10 | 1364507521 |
| 1 | 101 | 1364508057 |
| 2 | 100 | 1364509033 |
| 1 | 150 | 1364509149 |
| 2 | 120 | 1364509283 |
| 2 | 122 | 1364855948 |
| 3 | 120 | 1364509283 |
| 3 | 122 | 1364855948 |
| 1 | 110 | 1364855945 |
| 4 | 100 | 1364509283 |
| 4 | 115 | 1364855948 |
| 4 | 210 | 1364855945 |
----------------------------------------
What I WANT is a result that looks something like this:
result
| review_id | comment | reviewer_id | user_id | status | timestamp |
--------------------------------------------------------------------------
| 1 | some text. | 501 | 100 | 110 | 1364855945 |
| 2 | lorem ipsum | 606 | 100 | 122 | 1364855948 |
--------------------------------------------------------------------------
I'm after: 1) the newest entry from the review_status table, 2) within a certain range of status codes (100-199 in this case), 3) for multiple user_ids from the reviews table.
This is my current query, which I can't get to work for the life of me:
SELECT r.review_id, r.comment, r.reviewer_id, r.user_id
FROM reviews AS r
INNER JOIN
(SELECT s.status, max(s.timestamp)
FROM review_status AS s
WHERE s.status < 200
AND s.status > 99;
GROUP BY s.review_id) AS r_s
ON r.review_id = r_s.review_id
WHERE r.user_id IN (100,120);
Any help is greatly appreciated! Thanks.

Your current query has a few issues:
The subquery does not return review_id, so you cannot reference it in the join condition.
There is an extra semicolon inside the subquery.
I would suggest rewriting the query as follows:
SELECT r.review_id, r.comment, r.reviewer_id, r.user_id,
rs.status, rs.timestamp
FROM reviews AS r
INNER JOIN review_status rs
ON r.review_id = rs.review_id
INNER JOIN
(
SELECT s.review_id, max(s.timestamp) MaxDate
FROM review_status AS s
WHERE s.status < 200
AND s.status > 99
GROUP BY s.review_id
) AS r_s
ON rs.review_id = r_s.review_id
AND rs.timestamp = r_s.MaxDate
WHERE r.user_id IN (100,120)
AND rs.status < 200
AND rs.status > 99
See SQL Fiddle with Demo.
The main reason to write the query this way is that your current query groups by review_id but returns status. MySQL has an extension to GROUP BY that allows the select list to contain columns that are neither named in the GROUP BY clause nor wrapped in an aggregate function, but this can produce unexpected results (see MySQL Extensions to GROUP BY). With the ONLY_FULL_GROUP_BY SQL mode enabled (the default since MySQL 5.7.5), such a query is rejected instead.
From the MySQL Docs:
MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. ... You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Sorting of the result set occurs after values have been chosen, and ORDER BY does not affect which values the server chooses.
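To check the join-to-a-grouped-subquery approach against the sample data from the question, here is a minimal sketch. It runs the query on an in-memory SQLite database from Python, which is an assumption made for portability (the question uses MySQL); the status range is written with BETWEEN 100 AND 199, which is equivalent to the two comparisons above.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (assumption, for a self-contained demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reviews (review_id INTEGER, comment TEXT, reviewer_id INTEGER, user_id INTEGER);
CREATE TABLE review_status (review_id INTEGER, status INTEGER, timestamp INTEGER);
INSERT INTO reviews VALUES
  (1, 'some text.', 501, 100), (2, 'lorem ipsum', 606, 100),
  (3, 'blah blah.', 798, 120), (4, 'foo bar!', 798, 133);
INSERT INTO review_status VALUES
  (1, 10, 1364507521), (1, 101, 1364508057), (2, 100, 1364509033),
  (1, 150, 1364509149), (2, 120, 1364509283), (2, 122, 1364855948),
  (3, 120, 1364509283), (3, 122, 1364855948), (1, 110, 1364855945),
  (4, 100, 1364509283), (4, 115, 1364855948), (4, 210, 1364855945);
""")

# The subquery finds the newest in-range timestamp per review;
# joining back to review_status recovers the status stored at that timestamp.
latest_rows = conn.execute("""
SELECT r.review_id, r.user_id, rs.status, rs.timestamp
FROM reviews AS r
INNER JOIN review_status AS rs ON r.review_id = rs.review_id
INNER JOIN (
    SELECT s.review_id, MAX(s.timestamp) AS MaxDate
    FROM review_status AS s
    WHERE s.status BETWEEN 100 AND 199
    GROUP BY s.review_id
) AS r_s ON rs.review_id = r_s.review_id AND rs.timestamp = r_s.MaxDate
WHERE r.user_id IN (100, 120)
  AND rs.status BETWEEN 100 AND 199
ORDER BY r.review_id
""").fetchall()

for row in latest_rows:
    print(row)
```

With the sample data, reviews 1, 2 and 3 belong to users 100 and 120, so three rows come back, each carrying the newest status in the 100-199 range.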

Try this:
SELECT r.*, r_s.*
FROM review_status r_s LEFT JOIN reviews r
ON r.review_id = r_s.review_id
WHERE r.user_id IN (100, 120)
ORDER BY r_s.timestamp DESC;

SELECT r.review_id, r.comment, r.reviewer_id, r.user_id, tt.status, tt.timestamp
FROM (
SELECT rs2.review_id, MAX(rs2.status) AS status, rs2.timestamp
FROM (
SELECT rs.review_id, MAX(rs.timestamp) AS mts
FROM reviews rr
JOIN review_status AS rs ON rs.review_id = rr.review_id
WHERE rs.status < 200 AND rs.status > 99
AND rr.user_id IN (100,120)
GROUP BY rs.review_id
) AS t
JOIN review_status rs2 ON rs2.review_id = t.review_id AND rs2.timestamp = t.mts
WHERE rs2.status < 200 AND rs2.status > 99
GROUP BY rs2.review_id, rs2.timestamp -- collapse duplicate statuses that share the same timestamp
) AS tt
JOIN reviews AS r ON r.review_id = tt.review_id
The user_id and status filters belong in the innermost query, so that the entire review_status table is not selected and joined every time.

Here's my attempt with one JOIN and one correlated sub-query:
SELECT r.*, rs.*
FROM reviews AS r
INNER JOIN review_status AS rs ON r.review_id = rs.review_id
WHERE rs.status BETWEEN 100 AND 199 AND
r.user_id IN (100,120) AND
rs.timestamp = (SELECT MAX(s.timestamp) FROM review_status AS s
WHERE s.review_id = r.review_id
AND s.status BETWEEN 100 AND 199)
ORDER BY r.review_id;
Its SQL Fiddle: http://sqlfiddle.com/#!2/02f18/6

Related

Joining 2 SQL SELECT result into one query

I want to know if there is a way to join two or more result sets into one.
I have the following two queries.
First query:
SELECT
CONCAT(day(db.prod_id.created_on),"-",month(db.prod_id.created_on),"-",year(db.prod_id.created_on)) as day_month_year,
db.country.country ,
count(concat(day(db.prod_id.created_on),"-",month(db.prod_id.created_on),"-",year(db.prod_id.created_on))) as count ,
COUNT(DISTINCT db.prod_id.email) AS MAIL
from db.prod_id
left join db.country on db.prod_id.branch_id = db.country.id
where db.prod_id.created_on > '2020-11-17' and (db.country.type = 1 or db.country.type = 2)
group by
concat(day(db.prod_id.created_on),"-",month(db.prod_id.created_on),"-",year(db.prod_id.created_on)),
db.country.country
order by db.prod_id.created_on
The second query:
select
CONCAT(day(db.prod_id.created_on),"-",month(db.prod_id.created_on),"-",year(db.prod_id.created_on)) as day_month_year,
db.country.country,
count(CONCAT(day(db.prod_id.created_on),"-",month(db.prod_id.created_on),"-",year(db.prod_id.created_on))) as count_BUY
from db.prod_id
left join db.prod_evaluations on db.prod_id.id = db.prod_evaluations.id
left join db.country on db.prod_id.branch_id = db.country.id
left join (Select prod_properties.prod_id, prod_properties.value From prod_properties Where prod_properties.property_id = 5) as db3 on db3.prod_id = db.prod_id.id
where db.prod_id.created_on > '2020-11-17'
and db3.value = 'online-buy' and db.prod_id.status_id <> 25
group by
concat(day(db.prod_id.created_on),"-",month(db.prod_id.created_on),"-",year(db.prod_id.created_on)),
db.country.country
order by db.prod_id.created_on
The first query gives the following result:
+------------+---------+-------+------+
| day | Country | Count | Mail |
+------------+---------+-------+------+
| 17-11-2020 | IT | 200 | 100 |
| 17-11-2020 | US | 250 | 100 |
| 18-11-2020 | IT | 350 | 300 |
| 18-11-2020 | US | 200 | 100 |
+------------+---------+-------+------+
The second query gives:
+------------+---------+-----------+
| day | Country | Count_BUY |
+------------+---------+-----------+
| 17-11-2020 | IT | 50 |
| 17-11-2020 | US | 70 |
| 18-11-2020 | IT | 200 |
| 18-11-2020 | US | 50 |
+------------+---------+-----------+
Now I want to merge these two results into one:
+------------+---------+-------+------+-----------+
| day | Country | Count | Mail | Count_BUY |
+------------+---------+-------+------+-----------+
| 17-11-2020 | IT | 200 | 100 | 50 |
| 17-11-2020 | US | 250 | 100 | 70 |
| 18-11-2020 | IT | 350 | 300 | 200 |
| 18-11-2020 | US | 200 | 100 | 50 |
+------------+---------+-------+------+-----------+
How can I write this query?
I'm using MySQL.
Thanks
The simple way: You can join queries.
select *
from ( <your first query here> ) first_query
join ( <your second query here> ) second_query using (day_month_year, country)
order by day_month_year, country;
This is an inner join. You can also outer join of course. MySQL doesn't support full outer joins, though. If you want that, you'll have to look up how to emulate a full outer join in MySQL.
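As a rough illustration of joining two aggregated queries on their shared grouping columns, here is a sketch on an in-memory SQLite database. The `prod` table, its `bought` flag and the sample rows are hypothetical stand-ins, not the schema from the question. It uses a LEFT JOIN so that a day/country group without any purchase still appears, with NULL in count_buy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified stand-in for the tables in the question.
CREATE TABLE prod (day TEXT, country TEXT, email TEXT, bought INTEGER);
INSERT INTO prod VALUES
  ('2020-11-17', 'IT', 'a@x.test', 1),
  ('2020-11-17', 'IT', 'a@x.test', 0),
  ('2020-11-17', 'US', 'b@x.test', 1),
  ('2020-11-18', 'IT', 'c@x.test', 0);
""")

# Each derived table is one of the "original" queries; USING joins them
# on the columns they both group by.
merged = conn.execute("""
SELECT day, country, first_query.cnt, first_query.mail, second_query.count_buy
FROM (
    SELECT day, country, COUNT(*) AS cnt, COUNT(DISTINCT email) AS mail
    FROM prod
    GROUP BY day, country
) AS first_query
LEFT JOIN (
    SELECT day, country, COUNT(*) AS count_buy
    FROM prod
    WHERE bought = 1
    GROUP BY day, country
) AS second_query USING (day, country)
ORDER BY day, country
""").fetchall()

for row in merged:
    print(row)
```

Wrapping count_buy in COALESCE(..., 0) would turn the NULL for groups without purchases into 0.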
The hard way ;-) Merge the queries.
If I am not mistaken, your two queries can be reduced to
select
date(created_on),
branch_id as country,
count(*) as count_products,
count(distinct p.email) as count_emails
from db.prod_id p
where created_on >= date '2020-11-17'
and branch_id in (select id from db.country where type in (1, 2))
group by date(created_on), branch_id
order by date(created_on), branch_id;
and
select
date(created_on),
branch_id as country,
count(*) as count_buy
from db.prod_id p
where created_on >= date '2020-11-17'
and status_id <> 25
and p.id in (select prod_id from prod_properties where property_id = 5 and value = 'online-buy')
group by date(created_on), branch_id
order by date(created_on), branch_id;
The two combined should be
select
date(created_on),
branch_id as country,
sum(branch_id in (select id from db.country where type in (1, 2))) as count_products,
count(distinct case when branch_id in (select id from db.country where type in (1, 2)) then p.email end) as count_emails,
sum(status_id <> 25 and p.id in (select prod_id from prod_properties where property_id = 5 and value = 'online-buy')) as count_buy
from db.prod_id p
where created_on >= date '2020-11-17'
group by date(created_on), branch_id
order by date(created_on), branch_id;
You see, the conditions the queries have in common remain in the where clause and the other conditions go inside the aggregation functions.
sum(boolean) is short for sum(case when boolean then 1 else 0 end), i.e. this counts the rows where the condition is met in MySQL.
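The conditional-aggregation idea can be tried on a small scale as follows. This is only a sketch: the `prod` table and its `bought` flag are hypothetical, and it runs on SQLite from Python, where a comparison evaluates to 0 or 1 just as it does in MySQL. It computes the same count twice, once as sum(boolean) and once with the explicit CASE form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical sample data; 'bought' plays the role of the extra condition.
CREATE TABLE prod (day TEXT, country TEXT, email TEXT, bought INTEGER);
INSERT INTO prod VALUES
  ('2020-11-17', 'IT', 'a@x.test', 1),
  ('2020-11-17', 'IT', 'a@x.test', 0),
  ('2020-11-17', 'US', 'b@x.test', 1),
  ('2020-11-18', 'IT', 'c@x.test', 0);
""")

# One pass over the table: the shared grouping stays in GROUP BY,
# the query-specific condition moves inside the aggregate.
combined = conn.execute("""
SELECT day, country,
       COUNT(*) AS cnt,
       COUNT(DISTINCT email) AS mail,
       SUM(bought = 1) AS count_buy,
       SUM(CASE WHEN bought = 1 THEN 1 ELSE 0 END) AS count_buy_case
FROM prod
GROUP BY day, country
ORDER BY day, country
""").fetchall()

for row in combined:
    print(row)
```

Unlike joining two aggregated queries, a group with no matching rows gets 0 here rather than NULL.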

When I use "WHERE user_id in ( sub query )" generate syntax error

I have a users table, shown below.
It has referal_code and refered_by columns, with the following data:
+----+--------------+------------+
| id | referal_code | refered_by |
+----+--------------+------------+
| 1 | abc | null |
| 2 | xxx | abc |
+----+--------------+------------+
I also have a reviews table, in which I store reviews of users written by other users.
It has user_id and evaluation columns.
+----+---------+------------+
| id | user_id | evaluation |
+----+---------+------------+
| 28 | 2 | 4 |
| 32 | 2 | 6 |
+----+---------+------------+
For each user, I'm trying to count how many of the users they referred have an average evaluation of 3 or more.
SELECT users.*, COUNT(
SELECT reviews.user_id FROM reviews
WHERE reviews.user_id IN(
SELECT A2.id FROM users as A2 WHERE A2.refered_by = users.referal_code
)
HAVING AVG(evaluation) >= 3) as total_3_estrelas
FROM users
WHERE 1
I have a syntax error #1064 on: WHERE user_id IN
The result I expect:
+----+--------------+------------+------------------+
| id | referal_code | refered_by | total_3_estrelas |
+----+--------------+------------+------------------+
| 1 | abc | null | 1 |
| 2 | xxx | abc | 0 |
+----+--------------+------------+------------------+
Look at this if it helps:
SELECT A.ID, A.REFERAL_CODE, A.REFERED_BY, COALESCE(TOTAL_3_ESTRELAS,0) AS TOTAL_3_ESTRELAS
FROM USERS A
LEFT JOIN
(SELECT REFERED_BY, COUNT(*) AS TOTAL_3_ESTRELAS
FROM USERS U
INNER JOIN (SELECT USER_ID, AVG(EVALUATION)
FROM REVIEWS
GROUP BY USER_ID
HAVING AVG(EVALUATION)>=3) R
ON U.ID=R.USER_ID
GROUP BY REFERED_BY) T
ON A.REFERAL_CODE=T.REFERED_BY;
Working from the innermost subquery outwards: first I calculate the average evaluation for each USER_ID in REVIEWS and discard the users whose average is below 3; then I inner join the result with USERS and group by REFERED_BY to get the desired count. Finally, a left join back to USERS produces the output in the form you expect, with COALESCE turning a missing count into 0.
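The same three steps can be checked against the sample rows from the question. This is a sketch on an in-memory SQLite database (an assumption; the question targets MySQL), with the query written in lower case and the unused AVG column dropped from the inner select.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER, referal_code TEXT, refered_by TEXT);
CREATE TABLE reviews (id INTEGER, user_id INTEGER, evaluation INTEGER);
INSERT INTO users VALUES (1, 'abc', NULL), (2, 'xxx', 'abc');
INSERT INTO reviews VALUES (28, 2, 4), (32, 2, 6);
""")

referral_counts = conn.execute("""
SELECT a.id, a.referal_code, a.refered_by,
       COALESCE(t.total_3_estrelas, 0) AS total_3_estrelas
FROM users a
LEFT JOIN (
    -- Step 2: count well-rated users per referrer code.
    SELECT u.refered_by, COUNT(*) AS total_3_estrelas
    FROM users u
    INNER JOIN (
        -- Step 1: users whose average evaluation is at least 3.
        SELECT user_id
        FROM reviews
        GROUP BY user_id
        HAVING AVG(evaluation) >= 3
    ) r ON u.id = r.user_id
    GROUP BY u.refered_by
) t ON a.referal_code = t.refered_by   -- Step 3: attach counts to every user.
ORDER BY a.id
""").fetchall()

for row in referral_counts:
    print(row)
```

User 2 averages 5 and was referred with code abc, so user 1 gets a count of 1 and user 2 gets 0.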

MySQL - Return Latest Date and Total Sum from two rows in a column for multiple entries

For every ID_Number, there is a bill_date and then two types of bills that happen. I want to return the latest date (max date) for each ID number and then add together the two types of bill amounts. So, based on the table below, it should return:
| 1 | 201604 | 10.00 |
| 2 | 201701 | 28.00 |
tbl_charges
+-----------+-----------+-----------+--------+
| ID_Number | Bill_Date | Bill_Type | Amount |
+-----------+-----------+-----------+--------+
| 1 | 201601 | A | 5.00 |
| 1 | 201601 | B | 7.00 |
| 1 | 201604 | A | 4.00 |
| 1 | 201604 | B | 6.00 |
| 2 | 201701 | A | 15.00 |
| 2 | 201701 | B | 13.00 |
+-----------+-----------+-----------+--------+
Then, if possible, I want to be able to do this in a join in another query, using ID_Number as the column for the join. Would that change the query here?
Note: I am initially only wanting to run the query for about 200 distinct ID_Numbers out of about 10 million. I will be adding an 'IN' clause for those IDs. When I do the join for the final product, I will need to know how to get those latest dates out of all the other join possibilities. (ie, how do I get ID_Number 1 to join with 201604 and not 201601?)
I would use NOT EXISTS and GROUP BY
select t1.id_number, max(t1.bill_date), sum(t1.amount)
from tbl_charges t1
where not exists (
select 1
from tbl_charges t2
where t1.id_number = t2.id_number and
t1.bill_date < t2.bill_date
)
group by t1.id_number
NOT EXISTS filters out every row that has a later bill_date for the same id_number, and GROUP BY then sums the amounts of the rows that remain.
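To see the NOT EXISTS filter at work on the rows from the question, here is a small sketch using an in-memory SQLite database (an assumption; the same SQL is expected to behave the same way in MySQL). Bill_Date is stored as an integer such as 201604, as in the sample table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_charges (id_number INTEGER, bill_date INTEGER, bill_type TEXT, amount REAL);
INSERT INTO tbl_charges VALUES
  (1, 201601, 'A', 5.00), (1, 201601, 'B', 7.00),
  (1, 201604, 'A', 4.00), (1, 201604, 'B', 6.00),
  (2, 201701, 'A', 15.00), (2, 201701, 'B', 13.00);
""")

# Keep only rows for which no later bill exists for the same id_number,
# then add up what is left per id_number.
latest_totals = conn.execute("""
SELECT t1.id_number, MAX(t1.bill_date), SUM(t1.amount)
FROM tbl_charges t1
WHERE NOT EXISTS (
    SELECT 1
    FROM tbl_charges t2
    WHERE t1.id_number = t2.id_number
      AND t1.bill_date < t2.bill_date
)
GROUP BY t1.id_number
ORDER BY t1.id_number
""").fetchall()

for row in latest_totals:
    print(row)
```

To restrict the run to a few IDs, a condition such as `t1.id_number IN (...)` can be added to the outer WHERE; the whole query can also be used as a derived table and joined to another query on id_number.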
I would be inclined to filter in the where:
select c.id_number, max(c.bill_date), sum(c.amount)
from tbl_charges c
where c.bill_date = (select max(c2.bill_date)
from tbl_charges c2
where c2.id_number = c.id_number and c2.bill_type = c.bill_type
)
group by c.id_number;
Or, another fun way is to use in with tuples:
select c.id_number, max(c.bill_date), sum(c.amount)
from tbl_charges c
where (c.id_number, c.bill_type, c.bill_date) in
(select c2.id_number, c2.bill_type, max(c2.bill_date)
from tbl_charges c2
group by c2.id_number, c2.bill_type
)
group by c.id_number;

Mysql count records grouped by ID in multiple tables

I'm developing an application integrated with Facebook. The application can be embedded in an FB page as a tab app.
Using the FB SDK, the page's feeds are stored in the Feeds table.
Page fans may have liked and commented on feeds posted by the page.
Users' likes are stored in the Likes table and users' comments are stored in the Comments table.
I want to get the total count (likes count + comments count) for each user.
SQL Fiddle : http://sqlfiddle.com/#!2/ecb37/10/0
Table : Feeds
| ID | POST_ID |
|----|---------------------------------|
| 56 | 150348635024244_795407097185058 |
| 55 | 150348635024244_795410940518007 |
| 54 | 150348635024244_795414953850939 |
| 53 | 150348635024244_797424133650021 |
| 52 | 150348635024244_797455793646855 |
| 51 | 150348635024244_798997120159389 |
| 50 | 150348635024244_798997946825973 |
Table : Likes
SELECT user_id, COUNT(*) FROM likes GROUP by user_id
| USER_ID | LIKECOUNT |
|------------------|-----------|
| 913403225356462 | 4 |
| 150348635024244 | 3 |
| 356139014550882 | 2 |
| 753274941400012 | 2 |
| 1559751687580867 | 1 |
Table : Comments
SELECT user_id, COUNT(*) FROM comments GROUP by user_id
| USER_ID | COMMENTSCOUNT |
|-----------------|---------------|
| 150348635024244 | 2 |
| 356139014550882 | 2 |
| 913403225356462 | 2 |
Result should be like this
| POINTS | LIKESCOUNT | COMMENTSCOUNT | USER_ID |
|--------|------------|---------------|-----------------|
| 6 | 4 | 2 | 913403225356462 |
| 5 | 3 | 2 | 150348635024244 |
| 4 | 2 | 2 | 356139014550882 |
| 2 | 2 | 0 | 753274941400012 |
| 1 | 1 | 0 |1559751687580867 |
I tried this query, but the count for each user is wrong:
SELECT COUNT(likes.user_id)+COUNT(comments.user_id) as points, likes.user_id FROM `likes`
LEFT JOIN comments ON likes.user_id = comments.user_id
LEFT JOIN feeds ON likes.post_id = feeds.post_id
WHERE likes.post_id LIKE '153548635024244%'
GROUP BY likes.user_id
ORDER BY points DESC
The two queries are unrelated and a join is useless. Use a UNION ALL:
SELECT user_id, sum(n) from (
SELECT user_id, COUNT(*) n FROM likes GROUP by user_id
UNION ALL
SELECT user_id, COUNT(*) FROM comments GROUP by user_id
) x
GROUP BY user_id
UNION ALL is needed instead of just UNION, because UNION removes duplicates and would cause incorrect results for the edge case of the two subqueries yielding the same counts.
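The edge case is easy to reproduce. The sketch below loads rows that match the per-user counts shown in the question into an in-memory SQLite database (an assumption; the original runs on MySQL) and runs the query once with UNION ALL and once with UNION. The helper name `points` is made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE likes (user_id INTEGER);
CREATE TABLE comments (user_id INTEGER);
""")

# Per-user row counts taken from the tables in the question.
like_counts = {913403225356462: 4, 150348635024244: 3, 356139014550882: 2,
               753274941400012: 2, 1559751687580867: 1}
comment_counts = {150348635024244: 2, 356139014550882: 2, 913403225356462: 2}
for table, counts in (("likes", like_counts), ("comments", comment_counts)):
    for user_id, n in counts.items():
        conn.executemany(f"INSERT INTO {table} (user_id) VALUES (?)", [(user_id,)] * n)

def points(set_operator):
    """Total points per user, combining both counts with the given set operator."""
    query = f"""
    SELECT user_id, SUM(n) AS points FROM (
        SELECT user_id, COUNT(*) AS n FROM likes GROUP BY user_id
        {set_operator}
        SELECT user_id, COUNT(*) FROM comments GROUP BY user_id
    ) AS x
    GROUP BY user_id
    """
    return dict(conn.execute(query).fetchall())

with_union_all = points("UNION ALL")
with_union = points("UNION")
print(with_union_all[356139014550882], with_union[356139014550882])
```

User 356139014550882 has 2 likes and 2 comments, so both subqueries produce the identical row (356139014550882, 2). UNION collapses the two rows into one and reports 2 points, while UNION ALL keeps both and reports 4.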
The simple way to get what you want is to use count(distinct). But that will likely have lousy performance. Instead, use a correlated subquery:
SELECT (COUNT(*) +
(select COUNT(c.user_id) from comments c where c.user_id = l.user_id)
) as points, l.user_id
FROM likes l
WHERE l.post_id LIKE '153548635024244%'
GROUP BY l.user_id
ORDER BY points DESC;
I'm not sure what the feeds table is for. However, your version of the query creates a Cartesian product between each user's likes and comments, which is why the counts are inflated. If a given user has a lot of activity, that would also be very bad for performance.

Mysql join on 3 tables output

I am learning joins and have the following tables.
Student
| ID | NAME |
-------------
| 1 | A |
| 2 | B |
| 3 | C |
| 4 | D |
Pass
| ID | MARKS |
--------------
| 2 | 80 |
| 3 | 75 |
Fail
| ID | MARKS |
--------------
| 1 | 25 |
| 4 | 20 |
The output I want is this:
| NAME | MARKS |
----------------
| B | 80 |
| C | 75 |
| A | 25 |
| D | 20 |
I wrote a query like this:
select s.id,s.name,p.marks from student s
left join pass p on s.id=p.id
left join (select f.marks,f.id from fail f ) as nn on s.id=nn.id
order by marks desc;
The output I got is this:
| id | name | Marks|
--------------------
| 1 | B | 80 |
| 2 | C | 75 |
| 3 | A | Null |
| 4 | D | NUll |
I can't figure out why NULL is showing up. Any pointers?
You can use a CASE expression for that:
SELECT Name,
CASE WHEN P.Marks IS NULL THEN f.Marks ELSE P.Marks END AS Marks
FROM Student s
LEFT JOIN Pass p ON s.ID = p.ID
LEFT JOIN Fail f ON s.ID = f.ID
ORDER BY Marks DESC;
Or you can use the IF() function:
SELECT Name,
IF(P.Marks IS NULL, F.Marks, P.Marks) AS Marks
FROM Student s
LEFT JOIN Pass p ON s.ID = p.ID
LEFT JOIN Fail f ON s.ID = f.ID
ORDER BY Marks DESC;
Output
| NAME | MARKS |
----------------
| B | 80 |
| C | 75 |
| A | 25 |
| D | 20 |
See this SQLFiddle
To learn more about JOINs see: A Visual Explanation of SQL Joins
You select only the marks from the pass table, which is why NULLs appear next to the students who failed.
If you want to include the failed marks as well, you can use an IF condition:
select s.id,s.name,IF(p.marks IS NULL, nn.marks, p.marks) as markss
from student s
left join pass p on s.id=p.id
left join fail nn on s.id=nn.id
order by markss desc;
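One detail matters when testing for a missing mark: a comparison written as `= NULL` never evaluates to true, so the test has to be `IS NULL`. The sketch below shows the difference on the sample data, using an in-memory SQLite database from Python (an assumption; MySQL treats NULL comparisons the same way). CASE is used in place of MySQL's IF() so the SQL stays portable, and the tables are named `passed` and `failed` here, renamed from the question's Pass and Fail to sidestep any keyword clash in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER, name TEXT);
CREATE TABLE passed (id INTEGER, marks INTEGER);   -- 'Pass' in the question
CREATE TABLE failed (id INTEGER, marks INTEGER);   -- 'Fail' in the question
INSERT INTO student VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
INSERT INTO passed VALUES (2, 80), (3, 75);
INSERT INTO failed VALUES (1, 25), (4, 20);
""")

template = """
SELECT s.name, CASE WHEN {test} THEN f.marks ELSE p.marks END AS marks
FROM student s
LEFT JOIN passed p ON s.id = p.id
LEFT JOIN failed f ON s.id = f.id
ORDER BY s.id
"""

# '= NULL' yields NULL (never true), so the ELSE branch is always taken.
with_equals = conn.execute(template.format(test="p.marks = NULL")).fetchall()
# 'IS NULL' is true for students without a row in the passed table.
with_is_null = conn.execute(template.format(test="p.marks IS NULL")).fetchall()

print(with_equals)
print(with_is_null)
```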
Or you can use a union of the passed and failed results:
select s.id,s.name, n.marks
from student s
left join ( (SELECT * FROM pass) UNION (SELECT * FROM fail) ) as n ON n.id = s.id
order by marks desc;
You need to understand how the different joins work to understand why you receive NULL for the marks column.
Take a look here: A Visual Explanation of SQL Joins
The relevant example for you is:
LEFT OUTER JOIN:
SELECT * FROM TableA
LEFT OUTER JOIN TableB
ON TableA.name = TableB.name
id name id name
-- ---- -- ----
1 Pirate 2 Pirate
2 Monkey null null
3 Ninja 4 Ninja
4 Spaghetti null null
The NULL values you received in the marks column come from rows that have no match in the left-joined table
(the left part of the Venn diagram). The rows that do have a value are the intersection of the two groups in the Venn diagram.
Specifics for your example:
select s.id,s.name,p.marks
from student s
left join pass p on s.id=p.id
left join (select f.marks,f.id from fail f ) as nn on s.id=nn.id
order by marks desc;
The output i got is this:
id | name | Marks
-------------------
1 | B | 80
2 | C | 75
3 | A | Null
4 | D | NUll
This returns all student rows.
Students who have a passing grade display that grade, and those who don't display NULL, because only p.marks is selected.
Try the query below, which uses COALESCE:
select s.id,s.name,COALESCE(p.marks , nn.marks) as marks
from student s
left join pass p on s.id=p.id
left join fail nn on s.id=nn.id
order by marks desc;
SQL Fiddle
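To make the behaviour of the two left joins visible, the following sketch selects the raw pass and fail marks next to the COALESCE result. It runs on an in-memory SQLite database from Python (an assumption; the thread is about MySQL), and the tables are named `passed` and `failed` here, renamed from Pass and Fail for the sake of the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (id INTEGER, name TEXT);
CREATE TABLE passed (id INTEGER, marks INTEGER);   -- 'Pass' in the question
CREATE TABLE failed (id INTEGER, marks INTEGER);   -- 'Fail' in the question
INSERT INTO student VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
INSERT INTO passed VALUES (2, 80), (3, 75);
INSERT INTO failed VALUES (1, 25), (4, 20);
""")

# Every student row survives both LEFT JOINs; the side without a match is NULL.
# COALESCE returns its first non-NULL argument, so it picks whichever mark exists.
marks_rows = conn.execute("""
SELECT s.name,
       p.marks AS pass_marks,
       f.marks AS fail_marks,
       COALESCE(p.marks, f.marks) AS marks
FROM student s
LEFT JOIN passed p ON s.id = p.id
LEFT JOIN failed f ON s.id = f.id
ORDER BY marks DESC
""").fetchall()

for row in marks_rows:
    print(row)
```

Each student has a value in exactly one of pass_marks and fail_marks, which is why selecting only p.marks left NULLs for A and D.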