Sub Query counting character strings in MySQL - mysql

LEFT JOIN
(
SELECT user_id, review, COUNT(user_id) totalCount
FROM reviews
GROUP BY user_id
) b ON b.user_id= b.user_id
I am trying to fit WHERE LENGTH(review) > 100 in here somewhere, but everywhere I put it, it gives me problems.
The sub-query above counts all reviews per user_id. I simply want to add one more condition: only count reviews longer than 100 characters.
On a side note, I've seen the function CHAR_LENGTH, but I'm not sure whether that is what I need either.
EDIT:
Here is complete query working perfectly as expected for my needs:
static public $top_users = "
SELECT u.username, u.score,
(COALESCE(a.totalCount, 0) * 4) +
(COALESCE(b.totalCount, 0) * 5) +
(COALESCE(c.totalCount, 0) * 1) +
(COALESCE(d.totalCount, 0) * 2) +
(COALESCE(u.friend_points, 0)) AS totalScore
FROM users u
LEFT JOIN
(
SELECT user_id, COUNT(user_id) totalCount
FROM items
GROUP BY user_id
) a ON a.user_id= u.user_id
LEFT JOIN
(
SELECT user_id, COUNT(user_id) totalCount
FROM reviews
GROUP BY user_id
) b ON b.user_id= u.user_id
LEFT JOIN
(
SELECT user_id, COUNT(user_id) totalCount
FROM ratings
GROUP BY user_id
) c ON c.user_id = u.user_id
LEFT JOIN
(
SELECT user_id, COUNT(user_id) totalCount
FROM comments
GROUP BY user_id
) d ON d.user_id = u.user_id
ORDER BY totalScore DESC LIMIT 25;";

LENGTH() returns the length of the string measured in bytes. You probably want CHAR_LENGTH(), which counts characters; the two differ whenever the text contains multibyte characters.
SELECT user_id, review, COUNT(user_id) totalCount
FROM reviews
WHERE CHAR_LENGTH(review) > 100
GROUP BY user_id, review
You're also not using GROUP BY correctly.
See the documentation
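To make the byte versus character distinction concrete, here is a minimal Python sketch. It does not call MySQL; it only mimics what LENGTH() and CHAR_LENGTH() would measure for a UTF-8 string, and the sample review text is a made-up assumption.

```python
# Hypothetical review text: 60 accented characters, each 2 bytes in UTF-8.
review = "\u00e9" * 60

char_count = len(review)                  # analogous to CHAR_LENGTH(review)
byte_count = len(review.encode("utf-8"))  # analogous to LENGTH(review) on UTF-8 data

print(char_count, byte_count)
print("passes a > 100 byte filter:", byte_count > 100)
print("passes a > 100 character filter:", char_count > 100)
```

With text like this, a filter on bytes would count the review as "long" even though it has only 60 characters.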

The query that you want is:
LEFT JOIN
(
SELECT user_id, COUNT(user_id) totalCount,
sum(case when length(review) > 100 then 1 else 0 end
) as NumLongReviews
FROM reviews
GROUP BY user_id
) b ON b.user_id = u.user_id
This counts both all reviews and the "long" reviews. The second count uses a CASE expression nested inside SUM().
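The conditional count can be tried out with a small sketch like the one below. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table contents are invented for illustration. Note one assumption: SQLite's LENGTH() counts characters for text values, so here it behaves like MySQL's CHAR_LENGTH().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users   (user_id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE reviews (user_id INTEGER, review TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cat');
""")
# Made-up data: ann has two long reviews and one short, bob has one short, cat has none.
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?)",
    [(1, "x" * 150), (1, "short"), (1, "y" * 101), (2, "tiny")],
)

rows = conn.execute("""
    SELECT u.username,
           COALESCE(b.totalCount, 0)     AS totalCount,
           COALESCE(b.NumLongReviews, 0) AS NumLongReviews
    FROM users u
    LEFT JOIN (
        SELECT user_id,
               COUNT(user_id) AS totalCount,
               -- adds 1 for each review longer than 100 characters
               SUM(CASE WHEN LENGTH(review) > 100 THEN 1 ELSE 0 END) AS NumLongReviews
        FROM reviews
        GROUP BY user_id
    ) b ON b.user_id = u.user_id
    ORDER BY u.user_id
""").fetchall()

for row in rows:
    print(row)
```

Because the filter lives inside the aggregate rather than in a WHERE clause, users with only short reviews still get a row from the subquery.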

Related

I need to get specific ids from db if these are in current and last quarter using SQL

[DB Table]
SELECT b.first_name, b.last_name, a.pod_name, a.category, c.user_id,
SUM(IF(QUARTER(CURDATE())-1 OR (QUARTER(CURDATE())-2) AND a.user_id, 1, 0)) AS flag FROM kudos a
INNER JOIN users b ON a.user_id = b.id INNER JOIN users_groups c ON a.user_id = c.user_id
INNER JOIN groups d ON c.group_id = d.id WHERE a.group_name = 'G2' AND d.id IN (7,8,9,11,12,13,14,15,16,17,21,22,23,24,25,26,27,28)
AND QUARTER(CURDATE())-1 = a.quarter ORDER BY a.final_score+0 DESC
I need to get the user_ids of those users which are both in quarter 1 and 2 from table.
Tried above query but failed to get expected results.
Can someone please guide me on this?
If you only need user_id, you can do this:
select user_id
from tablename
where quarter in (1,2)
group by user_id
having count(distinct quarter) = 2
Another way is to use a window function, assuming each user has at most one row per quarter:
select * from (
select * , count(*) over (partition by user_id) cn
from tablename
where quarter in (1,2)
) t where cn = 2
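A quick way to check the GROUP BY ... HAVING COUNT(DISTINCT quarter) = 2 approach is a sketch like this, run against SQLite through Python's sqlite3 module. The kudos rows are invented; user 2 deliberately has two rows in the same quarter to show why DISTINCT matters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kudos (user_id INTEGER, quarter INTEGER)")
# Made-up rows: users 1 and 4 appear in both quarters 1 and 2,
# user 2 appears twice in quarter 1 only, user 3 appears in quarter 2 only.
conn.executemany(
    "INSERT INTO kudos VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 1), (2, 1), (3, 2), (4, 1), (4, 2), (4, 3)],
)

both_quarters = [
    row[0]
    for row in conn.execute("""
        SELECT user_id
        FROM kudos
        WHERE quarter IN (1, 2)
        GROUP BY user_id
        HAVING COUNT(DISTINCT quarter) = 2
        ORDER BY user_id
    """)
]
print(both_quarters)
```

User 2 is excluded even though it has two matching rows, because both rows fall in the same quarter.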

maximum rows per group subset

I have a query like this:
SELECT * FROM user AS u
JOIN article AS a
ON u.id = a.userid
GROUP BY u.id
How can I extract maximum 10 articles for each particular user?
MySQL versions before 8.0 don't have window functions for this kind of result. A workaround is to use user-defined variables to get the first n rows per group:
SELECT * FROM (
SELECT a.*,
@r:= CASE WHEN @g = userid THEN @r + 1 ELSE 1 END row_num,
@g:= userid
FROM (SELECT *
FROM `user` AS u
JOIN article AS a
ON u.id = a.userid
ORDER BY u.id,a.id DESC
) a
CROSS JOIN (SELECT @g:=NULL,@r:=0) b
) t
WHERE row_num <=10
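On MySQL 8.0 and later, which do support window functions, the same "first n rows per group" result can be sketched with ROW_NUMBER(). The example below runs against SQLite (3.25 or newer) through Python's sqlite3 module as a stand-in; the article rows and the per-user limit of 3 are assumptions chosen to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, userid INTEGER)")
# Made-up data: user 1 has articles 1-5, user 2 has articles 6-7.
conn.executemany(
    "INSERT INTO article (id, userid) VALUES (?, ?)",
    [(i, 1) for i in range(1, 6)] + [(6, 2), (7, 2)],
)

LIMIT_PER_USER = 3  # the question asks for 10; 3 keeps the demo small
rows = conn.execute("""
    SELECT userid, id
    FROM (
        SELECT a.*,
               -- numbers each user's articles from newest (highest id) to oldest
               ROW_NUMBER() OVER (PARTITION BY userid ORDER BY id DESC) AS row_num
        FROM article AS a
    ) AS t
    WHERE row_num <= ?
    ORDER BY userid, id DESC
""", (LIMIT_PER_USER,)).fetchall()
print(rows)
```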

issue with joins

I have the following query, in which I used JOINs. It says:
unknown column m.bv ..
Could you please take a look and tell me what I'm doing wrong?
$query4 = 'SELECT u.*, SUM(c.ts) AS total_sum1, SUM(m.bv) AS total_sum
FROM users u
LEFT JOIN
(SELECT user_id ,SUM(points) AS ts FROM coupon GROUP BY user_id) c
ON u.user_id=c.user_id
LEFT JOIN
(SELECT user_id ,SUM(points) AS bv FROM matching GROUP BY user_id) r
ON u.user_id=m.user_id
where u.user_id="'.$_SESSION['user_name'].'"
GROUP BY u.user_id';
You are selecting SUM(points) AS bv from the derived table with the alias r; there is no table with the alias m. So it has to be r.bv instead, like so:
SELECT
u.*,
SUM(c.ts) AS total_sum1,
SUM(r.bv) AS total_sum
FROM users u
LEFT JOIN
(
SELECT
user_id,
SUM(points) AS ts
FROM coupon
GROUP BY user_id
) c ON u.user_id=c.user_id
LEFT JOIN
(
SELECT
user_id,
SUM(points) AS bv
FROM matching
GROUP BY user_id
) r ON u.user_id = r.user_id
where u.user_id="'.$_SESSION['user_name'].'"
GROUP BY u.user_id
Replace m. with r. in the SELECT list and in the second join.
You have aliased the derived table with r and you reference that table (twice) with m. Correct one or the other.
Since you group by user_id in the two subqueries and user_id is (I assume) the primary key of the users table, you don't really need the final GROUP BY.
I would write it like this, if it was meant for all (many) users:
SELECT u.*, COALESCE(c.ts, 0) AS total_sum1, COALESCE(m.bv, 0) AS total_sum
FROM users u
LEFT JOIN
(SELECT user_id, SUM(points) AS ts FROM coupon GROUP BY user_id) c
ON u.user_id = c.user_id
LEFT JOIN
(SELECT user_id, SUM(points) AS bv FROM matching GROUP BY user_id) m
ON u.user_id = m.user_id
and like this in your (one user) case:
SELECT u.*, COALESCE(c.ts, 0) AS total_sum1, COALESCE(m.bv, 0) AS total_sum
FROM users u
LEFT JOIN
(SELECT SUM(points) AS ts FROM coupon
WHERE user_id = "'.$_SESSION['user_name'].'") c
ON TRUE
LEFT JOIN
(SELECT SUM(points) AS bv FROM matching
WHERE user_id = "'.$_SESSION['user_name'].'") m
ON TRUE
WHERE u.user_id = "'.$_SESSION['user_name'].'"
The last query can also be simplified to:
SELECT u.*,
COALESCE( (SELECT SUM(points) FROM coupon
WHERE user_id = u.user_id)
, 0) AS total_sum1,
COALESCE( (SELECT SUM(points) FROM matching
WHERE user_id = u.user_id)
, 0) AS total_sum
FROM users u
WHERE u.user_id = "'.$_SESSION['user_name'].'"
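To see why the sums are computed in derived tables before joining, the following sketch compares a naive double join with the pre-aggregated form. It runs on SQLite via Python's sqlite3 module as a stand-in for MySQL, and the table contents are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users    (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE coupon   (user_id INTEGER, points INTEGER);
    CREATE TABLE matching (user_id INTEGER, points INTEGER);
    INSERT INTO users    VALUES (1, 'ann'), (2, 'bob');
    INSERT INTO coupon   VALUES (1, 10), (1, 20);
    INSERT INTO matching VALUES (1, 5), (1, 5), (1, 5);
""")

# Joining both child tables directly pairs every coupon row with every
# matching row, so each sum is inflated by the other table's row count.
naive = conn.execute("""
    SELECT u.user_id, SUM(c.points), SUM(m.points)
    FROM users u
    LEFT JOIN coupon   c ON c.user_id = u.user_id
    LEFT JOIN matching m ON m.user_id = u.user_id
    GROUP BY u.user_id
    ORDER BY u.user_id
""").fetchall()

# Aggregating first yields at most one row per user from each derived table.
pre_aggregated = conn.execute("""
    SELECT u.user_id, COALESCE(c.ts, 0), COALESCE(m.bv, 0)
    FROM users u
    LEFT JOIN (SELECT user_id, SUM(points) AS ts FROM coupon GROUP BY user_id) c
           ON u.user_id = c.user_id
    LEFT JOIN (SELECT user_id, SUM(points) AS bv FROM matching GROUP BY user_id) m
           ON u.user_id = m.user_id
    ORDER BY u.user_id
""").fetchall()

print(naive)
print(pre_aggregated)
```

The COALESCE calls turn the NULLs produced by the LEFT JOIN for users without any rows into zeros.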

MySQL INNER JOIN select only one row from second table

I have a users table and a payments table. Users who have payments may have multiple associated rows in the payments table. I would like to select all users who have payments, but only select their latest payment. I'm trying this SQL, but I've never tried nested SQL statements before, so I want to know what I'm doing wrong. Appreciate the help.
SELECT u.*
FROM users AS u
INNER JOIN (
SELECT p.*
FROM payments AS p
ORDER BY date DESC
LIMIT 1
)
ON p.user_id = u.id
WHERE u.package = 1
You need to have a subquery to get their latest date per user ID.
SELECT u.*, p.*
FROM users u
INNER JOIN payments p
ON u.id = p.user_ID
INNER JOIN
(
SELECT user_ID, MAX(date) maxDate
FROM payments
GROUP BY user_ID
) b ON p.user_ID = b.user_ID AND
p.date = b.maxDate
WHERE u.package = 1
SELECT u.*, p.*
FROM users AS u
INNER JOIN payments AS p ON p.id = (
SELECT id
FROM payments AS p2
WHERE p2.user_id = u.id
ORDER BY date DESC
LIMIT 1
)
Or
SELECT u.*, p.*
FROM users AS u
INNER JOIN payments AS p ON p.user_id = u.id
WHERE NOT EXISTS (
SELECT 1
FROM payments AS p2
WHERE
p2.user_id = p.user_id AND
(p2.date > p.date OR (p2.date = p.date AND p2.id > p.id))
)
These solutions are better than the accepted answer because they work correctly when there are multiple payments with same user and date. You can try on SQL Fiddle.
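The tie-handling claim can be checked with a small sketch. It compares the MAX(date) join with the NOT EXISTS variant on SQLite through Python's sqlite3 module (a stand-in for MySQL); the sample rows are invented so that user 1 has two payments on the same latest date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users    (id INTEGER PRIMARY KEY, package INTEGER);
    CREATE TABLE payments (id INTEGER PRIMARY KEY, user_id INTEGER, date TEXT);
    INSERT INTO users VALUES (1, 1), (2, 1);
    INSERT INTO payments VALUES
        (1, 1, '2020-01-01'),
        (2, 1, '2020-03-01'),
        (3, 1, '2020-03-01'),  -- same latest date as payment 2
        (4, 2, '2020-02-01');
""")

# Joining on MAX(date) keeps every payment that shares the latest date.
max_date_join = conn.execute("""
    SELECT u.id, p.id
    FROM users u
    INNER JOIN payments p ON u.id = p.user_id
    INNER JOIN (
        SELECT user_id, MAX(date) AS maxDate
        FROM payments
        GROUP BY user_id
    ) b ON p.user_id = b.user_id AND p.date = b.maxDate
    WHERE u.package = 1
    ORDER BY u.id, p.id
""").fetchall()

# NOT EXISTS with an id tie-breaker keeps exactly one payment per user.
not_exists = conn.execute("""
    SELECT u.id, p.id
    FROM users u
    INNER JOIN payments p ON p.user_id = u.id
    WHERE u.package = 1
      AND NOT EXISTS (
        SELECT 1
        FROM payments p2
        WHERE p2.user_id = p.user_id
          AND (p2.date > p.date OR (p2.date = p.date AND p2.id > p.id))
      )
    ORDER BY u.id
""").fetchall()

print(max_date_join)
print(not_exists)
```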
SELECT u.*, p.*, max(p.date)
FROM payments p
JOIN users u ON u.id=p.user_id AND u.package = 1
GROUP BY u.id
ORDER BY p.date DESC
Check out this sqlfiddle
SELECT u.*
FROM users AS u
INNER JOIN (
SELECT p.*,
@num := if(@id = user_id, @num + 1, 1) as row_number,
@id := user_id as tmp
FROM payments AS p,
(SELECT @num := 0) x,
(SELECT @id := 0) y
ORDER BY p.user_id ASC, date DESC) AS p
ON (p.user_id = u.id) and (p.row_number=1)
WHERE u.package = 1
You can try this:
SELECT u.*, p.*
FROM users AS u LEFT JOIN (
SELECT *, ROW_NUMBER() OVER(PARTITION BY userid ORDER BY [Date] DESC) AS RowNo
FROM payments
) AS p ON u.userid = p.userid AND p.RowNo=1
There are two problems with your query:
Every table and subquery needs a name, so you have to name the subquery INNER JOIN (SELECT ...) AS p ON ....
The subquery as you have it only returns one row period, but you actually want one row for each user. For that you need one query to get the max date and then self-join back to get the whole row.
Assuming there are no ties for payments.date, try:
SELECT u.*, p.*
FROM (
SELECT MAX(p.date) AS date, p.user_id
FROM payments AS p
GROUP BY p.user_id
) AS latestP
INNER JOIN users AS u ON latestP.user_id = u.id
INNER JOIN payments AS p ON p.user_id = u.id AND p.date = latestP.date
WHERE u.package = 1
@John Woo's answer helped me solve a similar problem. I've improved upon his answer by setting the correct ordering as well. This has worked for me:
SELECT a.*, c.*
FROM users a
INNER JOIN payments c
ON a.id = c.user_ID
INNER JOIN (
SELECT user_ID, MAX(date) as maxDate FROM
(
SELECT user_ID, date
FROM payments
ORDER BY date DESC
) d
GROUP BY user_ID
) b ON c.user_ID = b.user_ID AND
c.date = b.maxDate
WHERE a.package = 1
I'm not sure how efficient this is, though.
SELECT U.*, V.* FROM users AS U
INNER JOIN (SELECT *
FROM payments
WHERE id IN (
SELECT MAX(id)
FROM payments
GROUP BY user_id
)) AS V ON U.id = V.user_id
This will get it working
Matei Mihai gave a simple and efficient solution, but it will not work until you put MAX(date) in the SELECT part, so the query becomes:
SELECT u.*, p.*, max(date)
FROM payments p
JOIN users u ON u.id=p.user_id AND u.package = 1
GROUP BY u.id
ORDER BY makes no difference to the grouping, but it can order the final result produced by GROUP BY. I tried it and it worked for me.
My answer is directly inspired by @valex's and is very useful if you need several columns in the ORDER BY clause.
SELECT u.*
FROM users AS u
INNER JOIN (
SELECT p.*,
@num := if(@id = user_id, @num + 1, 1) as row_number,
@id := user_id as tmp
FROM (SELECT * FROM payments ORDER BY user_id ASC, date DESC) AS p,
(SELECT @num := 0) x,
(SELECT @id := 0) y
) AS p
ON (p.user_id = u.id) and (p.row_number=1)
WHERE u.package = 1
This is quite simple: do the inner join, then group by user_id and apply the MAX aggregate function to the payment id. Assuming your tables are named user and payment, the query can be:
SELECT user.id, max(payment.id)
FROM user INNER JOIN payment ON (user.id = payment.user_id)
GROUP BY user.id
If you do not have to return the payment from the query you can do this with distinct, like:
SELECT DISTINCT u.*
FROM users AS u
INNER JOIN payments AS p ON p.user_id = u.id
This returns only users who have at least one associated record in the payments table (because of the inner join), and a user with multiple payments is returned only once (because of DISTINCT). The payment itself is not returned; if you need it, you can use, for example, a subquery as others proposed.

Optimize sub-query selecting last record of each group

I have this query, which uses dependent subqueries and takes a long time to execute:
SELECT
u.id,
u.user_name,
ifnull((select longitude from map where user_id = u.id order by map_id desc limit 1 ),0) as Longitude,
ifnull((select latitude from map where user_id = u.id order by map_id desc limit 1 ),0) as Latitude,
(select created from map where user_id = 1 order by created desc limit 1) as LatestTime
FROM users as u
WHERE id IN(SELECT
user1_id FROM relation
WHERE users.id = 1)
ORDER BY id;
I tried rewriting it as this query:
SELECT
u.id,
u.user_name,
m.map_id,
m.longitude,
m.latitude,
m.Date as created
FROM users as u
left join (select
map_id,
longitude,
latitude,
user_id,
max(created) as `Date`
from map
group by user_id) as m
on m.user_id = u.id
WHERE id IN(SELECT
user1_id FROM relation
WHERE users.id = 1)
ORDER BY id;
The first query uses dependent subqueries; it works fine but takes a long time to execute. The second query does not fetch the latest created time.
Now I want to optimise this query. The idea is that in the subquery I first group the rows and then try to get the last record of each group. Here is the table structure:
users : id , user_name
map : map_id , user_id ,longitude , latitude, created
relations : id , user1_id , user2_id , relation
Where performance is needed, subqueries in the SELECT clause are indeed a pain and have to be banished :)
You can rewrite this part:
SELECT
u.id,
u.user_name,
ifnull((select longitude from map where user_id = u.id order by map_id desc limit 1 ),0) as Longitude,
ifnull((select latitude from map where user_id = u.id order by map_id desc limit 1 ),0) as Latitude,
(select created from map where user_id = 1 order by created desc limit 1) as LatestTime
FROM users as u
In:
SELECT
u.id,
u.user_name,
COALESCE(m1.longitude, 0) as longitude,
COALESCE(m1.latitude, 0) as latitude
FROM users u
LEFT JOIN map m1 ON m1.user_id = u.id
LEFT JOIN map m2 ON m2.user_id = m1.user_id AND m2.map_id > m1.map_id
WHERE m2.map_id IS NULL
I wrote a short explanation of the query structure in this answer. It's a really nice trick to learn, as it is more readable, subquery-free, and better for performance.
I haven't looked at the IN part yet but will if the above didn't help.
Edit1: You can extract the created date and use a MAX() instead.
SELECT
u.id,
u.user_name,
COALESCE(m1.longitude, 0) as longitude,
COALESCE(m1.latitude, 0) as latitude,
created.LatestTime
FROM (SELECT MAX(created) AS LatestTime FROM map WHERE user_id = 1) created
INNER JOIN users u ON TRUE
LEFT JOIN map m1 ON m1.user_id = u.id
LEFT JOIN map m2 ON m2.user_id = m1.user_id AND m2.map_id > m1.map_id
WHERE m2.map_id IS NULL
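The self-join trick used above (join map to itself on a higher map_id and keep only the rows where no such higher row exists) can be sketched as follows. This runs on SQLite through Python's sqlite3 module as a stand-in for MySQL, with invented coordinates; it leaves out the created column and the IN filter to stay focused on the join itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, user_name TEXT);
    CREATE TABLE map (map_id INTEGER PRIMARY KEY, user_id INTEGER,
                      longitude REAL, latitude REAL);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');
    -- ann has three map rows; bob has none
    INSERT INTO map VALUES (1, 1, 10.0, 50.0), (2, 1, 11.0, 51.0), (3, 1, 12.0, 52.0);
""")

rows = conn.execute("""
    SELECT u.id, u.user_name,
           COALESCE(m1.longitude, 0) AS longitude,
           COALESCE(m1.latitude, 0)  AS latitude
    FROM users u
    LEFT JOIN map m1 ON m1.user_id = u.id
    -- m2 matches any row of the same user with a higher map_id than m1
    LEFT JOIN map m2 ON m2.user_id = m1.user_id AND m2.map_id > m1.map_id
    -- keep only m1 rows for which no higher row was found, i.e. the latest one
    WHERE m2.map_id IS NULL
    ORDER BY u.id
""").fetchall()
print(rows)
```

Users without any map rows survive both LEFT JOINs with NULLs, which COALESCE then replaces with 0.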