I have a table for comments:
+----------+---------------------+----------+
| match_id | timestampe | comment |
+----------+---------------------+----------+
| 100 | 2014-01-01 01:00:00 | Hi |
| 200 | 2014-01-01 01:10:00 | Hi1 |
| 300 | 2014-01-01 01:20:00 | Hi2 |
| 100 | 2014-01-01 01:01:00 | Hello |
| 100 | 2014-01-01 01:02:00 | Hello1 |
| 200 | 2014-01-01 01:11:00 | hey |
+----------+---------------------+----------+
I want to get the following information from the table:
SELECT match_id, max(timestampe) as maxtimestamp, count(match_id) as comments_no
FROM comments
GROUP BY match_id
order by maxtimestamp DESC
The previous query works great, but the problem comes when I also want the comment that belongs to the maxtimestamp.
How can I get the latest comment of each match (the comment of the maxtimestamp) using the most optimized query?
You can do it this way.
This is pretty optimal too.
SELECT c.comment, m.*
FROM
comments c
JOIN
(
SELECT t.match_id, max(t.timestampe) as maxtimestamp, count(t.match_id) as comments_no
FROM comments t
GROUP BY t.match_id
) m on c.match_id = m.match_id and c.timestampe = m.maxtimestamp
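To try the join against the sample data from the question, here is a minimal sketch. It assumes Python's built-in sqlite3 module with an in-memory SQLite database as a stand-in for MySQL; the variable names are mine, not from the original post.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (match_id INTEGER, timestampe TEXT, comment TEXT)")
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?)",
    [
        (100, "2014-01-01 01:00:00", "Hi"),
        (200, "2014-01-01 01:10:00", "Hi1"),
        (300, "2014-01-01 01:20:00", "Hi2"),
        (100, "2014-01-01 01:01:00", "Hello"),
        (100, "2014-01-01 01:02:00", "Hello1"),
        (200, "2014-01-01 01:11:00", "hey"),
    ],
)

# Join every comment to its match's aggregate row; only the comment whose
# timestamp equals the per-match maximum survives the join condition.
latest_comments = conn.execute(
    """
    SELECT m.match_id, m.maxtimestamp, m.comments_no, c.comment
    FROM comments c
    JOIN (
        SELECT match_id, MAX(timestampe) AS maxtimestamp, COUNT(*) AS comments_no
        FROM comments
        GROUP BY match_id
    ) m ON c.match_id = m.match_id AND c.timestampe = m.maxtimestamp
    ORDER BY m.maxtimestamp DESC
    """
).fetchall()

for row in latest_comments:
    print(row)
```

Note that if two comments of the same match share the exact maximum timestamp, this join returns both of them.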
I'm not sure about MySQL, but Oracle supports window functions, so I can write something like:
select distinct match_id,
first_value(comment) over (partition by match_id order by timestampe desc) as latest_comment
from comments
Here's the easy way to do it with MySQL:
SELECT * from (
SELECT match_id, timestampe as maxtimestamp, comment
FROM comments
order by maxtimestamp DESC) x
GROUP BY match_id
This exploits MySQL's non-standard handling of GROUP BY, which allows selecting non-aggregated columns. Be aware that MySQL does not guarantee which row's values are returned for those columns.
To avoid a subquery, you can use the query below:
SELECT match_id,timestampe,comment,
IF(@prevMatchId IS NULL OR @prevMatchId != match_id,@row:=1,@row:=@row+1) as row,
@prevMatchId := match_id
FROM comments
HAVING row = 1
ORDER BY match_id,timestampe DESC
Try using EXPLAIN to see which queries are more optimal.
Here's EXPLAIN on two queries: http://sqlfiddle.com/#!2/70efa/9/1 I am not all that familiar with EXPLAIN, so maybe some experts can interpret it.
Here's EXPLAIN on the same two queries after I added indexes on match_id and timestampe: http://sqlfiddle.com/#!2/30266/1/1
Related
I am trying to do a very complex query (at least extremely complex for me not for YOU :) )
I have users and comments tables.
SQL Fiddle: http://sqlfiddle.com/#!9/b1f845/2
select user_id, status_id from comments where user_id in (2,3);
+---------+-----------+
| user_id | status_id |
+---------+-----------+
| 2 | 10 |
| 2 | 10 |
| 2 | 10 |
| 2 | 7 |
| 2 | 7 |
| 2 | 10 |
| 3 | 9 |
| 2 | 9 |
| 2 | 6 |
+---------+-----------+
If I use
select user_id, status_id from comments where user_id in (2,3)
It returns a lot of duplicate values.
Here is what I want to get, if possible.
As you can see, status_id = 10 has user_id = 2, 3 and 4, with 2 appearing multiple times.
From these I want to get the latest (maximum) unique user_ids, so for example
user_id = 4 and 2. Now the main complex part: I want to get the user information for user_id = 4 and 2 as separate columns, so that in the end I get something like this:
status_id | userOneUserName | userTwoUserName
---------------------------------------------
10        | sadek4          | iamsadek2
---------------------------------------------
7         | iamsadek2       | null
---------------------------------------------
9         | iamsadek2       | sadek2
---------------------------------------------
6         | iamsadek2       | null
How can I achieve something this complex?
Currently I have to do it using application logic.
Thank you for your time.
I think this might be what you literally want here:
SELECT DISTINCT
status_id,
(SELECT MAX(user_id) FROM comments c2 WHERE c1.status_id = c2.status_id) user_1,
(SELECT DISTINCT user_id FROM comments c2 WHERE c1.status_id = c2.status_id
ORDER BY user_id DESC LIMIT 1 OFFSET 1) user_2
FROM comments c1
WHERE user_id IN (2,3);
Demo (your update Fiddle)
We can use correlated subqueries to find the max user_id and second-to-max user_id for each status_id, and then spin each of those out as two separate columns. Using a GROUP_CONCAT approach might be preferable here, since it would also allow you to easily accommodate any numbers of users as a CSV list.
Also, if you were using MySQL 8 or later, we could take advantage of the rank analytic functions, which would also be easier.
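As a rough illustration of the analytic-function route, here is a sketch that ranks the distinct users per status_id with ROW_NUMBER() and pivots the top two into columns. It assumes Python's sqlite3 module with an in-memory SQLite database (3.25 or later, for window functions) in place of MySQL 8, and it works on user_id only; joining to the users table for names is left out.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL 8 (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (user_id INTEGER, status_id INTEGER)")
conn.executemany(
    "INSERT INTO comments VALUES (?, ?)",
    [(2, 10), (2, 10), (2, 10), (2, 7), (2, 7), (2, 10), (3, 9), (2, 9), (2, 6)],
)

# Rank the distinct users inside each status (highest user_id first),
# then use conditional aggregation to turn ranks 1 and 2 into columns.
top_two = conn.execute(
    """
    WITH ranked AS (
        SELECT status_id, user_id,
               ROW_NUMBER() OVER (PARTITION BY status_id ORDER BY user_id DESC) AS rn
        FROM (SELECT DISTINCT status_id, user_id
              FROM comments
              WHERE user_id IN (2, 3)) AS d
    )
    SELECT status_id,
           MAX(CASE WHEN rn = 1 THEN user_id END) AS user_1,
           MAX(CASE WHEN rn = 2 THEN user_id END) AS user_2
    FROM ranked
    GROUP BY status_id
    ORDER BY status_id DESC
    """
).fetchall()

for row in top_two:
    print(row)
```

A status with only one distinct user gets NULL (None in Python) in the second column.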
select status_id, GROUP_CONCAT(distinct(user_id) SEPARATOR ',')
from comments
group by status_id
I would suggest using GROUP BY and GROUP_CONCAT, e.g. like so:
SELECT status_id, GROUP_CONCAT(userName) AS users, GROUP_CONCAT(DISTINCT c.user_id) AS user_ids
FROM (
SELECT DISTINCT status_id, user_id FROM comments WHERE user_id in (2,3)
) c
JOIN users u ON (c.user_id = u.id)
GROUP BY status_id
ORDER BY status_id DESC
feedback table
-------------------------------
|rating|feedback|feedback_date|
-------------------------------
| 5 | good | 1452638788 |
| 1 | bad | 1452638900 |
| 0 | ugly | 1452750303 |
| 3 | ok | 1453903030 |
-------------------------------
desired result
average_rating | rating | feedback | feedback_date
2.25 | 3 | ok | 1453903030
Is it possible (in a single query) to select the average of one column and also one specific row from the table?
For example, i'd like to retrieve the average of the column rating and the most recent row as a whole.
I tried the following, and also with the ORDER BY direction as DESC, but both just gave me the average_rating and the first row in the table.
SELECT AVG(f.rating) AS average_rating, f.* FROM feedback f ORDER BY feedback_date ASC
SELECT * FROM feedback NATURAL JOIN (
SELECT AVG(rating), MAX(feedback_date) feedback_date FROM feedback
) t
See it on sqlfiddle.
You can do it with a subquery like this:
SELECT AVG(f.rating) AS average_rating, t1.*
FROM feedback f
INNER JOIN (SELECT * FROM feedback ORDER BY feedback_date DESC LIMIT 1) t1 ON true
You can put a subquery in the SELECT clause, and calculate the average in the subquery.
SELECT (SELECT AVG(rating) FROM feedback) AS avg_rating, feedback.*
FROM feedback
ORDER BY feedback_date DESC
LIMIT 1
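The scalar-subquery approach can be checked against the sample feedback table with a short sketch. It assumes Python's sqlite3 module and an in-memory SQLite database in place of MySQL; the variable names are illustrative.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE feedback (rating INTEGER, feedback TEXT, feedback_date INTEGER)")
conn.executemany(
    "INSERT INTO feedback VALUES (?, ?, ?)",
    [
        (5, "good", 1452638788),
        (1, "bad", 1452638900),
        (0, "ugly", 1452750303),
        (3, "ok", 1453903030),
    ],
)

# The scalar subquery computes the average over the whole table,
# while ORDER BY ... LIMIT 1 picks the most recent row.
latest_with_avg = conn.execute(
    """
    SELECT (SELECT AVG(rating) FROM feedback) AS average_rating, feedback.*
    FROM feedback
    ORDER BY feedback_date DESC
    LIMIT 1
    """
).fetchone()

print(latest_with_avg)
```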
I have a table like this:
Table: p
+----------------+
| id | w_id |
+---------+------+
| 5 | 8 |
| 5 | 10 |
| 5 | 8 |
| 5 | 10 |
| 5 | 8 |
| 6 | 5 |
| 6 | 8 |
| 6 | 10 |
| 6 | 10 |
| 7 | 8 |
| 7 | 10 |
+----------------+
What is the best SQL to get the following result?
+-----------------------------+
| id | most_used_w_id |
+---------+-------------------+
| 5 | 8 |
| 6 | 10 |
| 7 | 8 |
+-----------------------------+
In other words, to get, per id, the most frequent related w_id.
Note that on the example above, id 7 is related to 8 once and to 10 once.
So, either (7, 8) or (7, 10) will do as result. If it is not possible to
pick up one, then both (7, 8) and (7, 10) on result set will be ok.
I have come up with something like:
select counters2.p_id as id, counters2.w_id as most_used_w_id
from (
select p.id as p_id,
w_id,
count(w_id) as count_of_w_ids
from p
group by id, w_id
) as counters2
join (
select p_id, max(count_of_w_ids) as max_counter_for_w_ids
from (
select p.id as p_id,
w_id,
count(w_id) as count_of_w_ids
from p
group by id, w_id
) as counters
group by p_id
) as p_max
on p_max.p_id = counters2.p_id
and p_max.max_counter_for_w_ids = counters2.count_of_w_ids
;
but I am not sure at all whether this is the best way to do it. And I had to repeat the same sub-query two times.
Any better solution?
Try using user-defined variables:
select id,w_id
FROM
( select T.*,
if(@id<>id,1,0) as row,
@id:=id FROM
(
select id,W_id, Count(*) as cnt FROM p Group by ID,W_id
) as T,(SELECT @id:=0) as T1
ORDER BY id,cnt DESC
) as T2
WHERE Row=1
Formal SQL
In fact, your solution is correct in terms of standard SQL. Why? Because you have to join values from the original data to the grouped data, so your query cannot be simplified. MySQL allows mixing non-grouped columns with aggregate functions, but that is totally unreliable, so I would not recommend relying on that effect.
MySQL
Since you're using MySQL, you can use variables. I'm not a big fan of them, but for your case they may be used to simplify things:
SELECT
c.*,
IF(@id!=id, @i:=1, @i:=@i+1) AS num,
@id:=id AS gid
FROM
(SELECT id, w_id, COUNT(w_id) AS w_count
FROM p
GROUP BY id, w_id
ORDER BY id DESC, w_count DESC) AS c
CROSS JOIN (SELECT @i:=-1, @id:=-1) AS init
HAVING
num=1;
So for your data result will look like:
+------+------+---------+------+------+
| id | w_id | w_count | num | gid |
+------+------+---------+------+------+
| 7 | 8 | 1 | 1 | 7 |
| 6 | 10 | 2 | 1 | 6 |
| 5 | 8 | 3 | 1 | 5 |
+------+------+---------+------+------+
Thus, you've found your id and the corresponding w_id. The idea is to count the rows and enumerate them, relying on the fact that they are ordered in the subquery. So we need only the first row per id (because it represents the data with the highest count).
This may be replaced with a single GROUP BY id, but, again, the server is free to choose any row in that case (it will work because it takes the first row, but the documentation says nothing about that for the general case).
One nice thing about this approach is that you can also select, for example, the 2nd or 3rd by frequency; it's very flexible.
Performance
To increase performance, you can create an index on (id, w_id); obviously, it will be used for ordering and grouping records. Variables and HAVING, however, will produce a line-by-line scan of the set derived by the inner GROUP BY. That isn't as bad as a full scan of the original data, but it is still a drawback of doing this with variables. On the other hand, doing it with a JOIN and subquery as in your query won't be much different, because a temporary table is created for the subquery result set too.
But to be certain, you'll have to test. And keep in mind that you already have a valid solution which, by the way, isn't bound to DBMS-specific features and is good in terms of standard SQL.
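For engines that support common table expressions and window functions (MySQL 8 or later, for instance), the counting subquery only needs to be written once. The following is a sketch of that alternative, not one of the approaches above; it assumes Python's sqlite3 module with an in-memory SQLite database (3.25 or later) as a stand-in, and breaks ties by the smaller w_id, which is my own choice.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE p (id INTEGER, w_id INTEGER)")
conn.executemany(
    "INSERT INTO p VALUES (?, ?)",
    [(5, 8), (5, 10), (5, 8), (5, 10), (5, 8),
     (6, 5), (6, 8), (6, 10), (6, 10),
     (7, 8), (7, 10)],
)

# counters: one row per (id, w_id) with its frequency, written only once.
# ranked: number the rows of each id by descending frequency; ties are
# broken by the smaller w_id so the result is deterministic.
most_used = conn.execute(
    """
    WITH counters AS (
        SELECT id, w_id, COUNT(*) AS cnt
        FROM p
        GROUP BY id, w_id
    ),
    ranked AS (
        SELECT id, w_id,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY cnt DESC, w_id) AS rn
        FROM counters
    )
    SELECT id, w_id AS most_used_w_id
    FROM ranked
    WHERE rn = 1
    ORDER BY id
    """
).fetchall()

print(most_used)
```

Changing `rn = 1` to `rn = 2` would pick the second most frequent w_id per id, and swapping ROW_NUMBER() for RANK() would return both rows when there is a tie.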
Try this query
select p_id, ccc , w_id from
(
select p.id as p_id,
w_id, count(w_id) ccc
from p
group by id,w_id order by id,ccc desc) xxx
group by p_id having max(ccc)
Here is the SQL Fiddle link.
You can also use this code if you do not want to rely on the first record of non-grouping columns
select p_id, ccc , w_id from
(
select p.id as p_id,
w_id, count(w_id) ccc
from p
group by id,w_id order by id,ccc desc) xxx
group by p_id having ccc=max(ccc);
I have this table:
forum:
_____________________________________________________________
|match_static_id| comment | timestamp | user_id |
|_______________|___________|______________________|__________|
| 1 | Hi | 2013-07-10 12:15:03 | 2 |
| 1 | Hello | 2013-07-09 12:14:44 | 1 |
|_______________|___________|______________________|__________|
The working query is:
select forum.match_static_id,
count(forum.match_static_id) 'comment_no'
from forum
Group By forum.match_static_id
But what if I want to have:
select forum.match_static_id,
count(forum.match_static_id) 'comment_no',
forum.timestamp
from forum
Group By forum.match_static_id
It gives the same result as the previous query, but with a timestamp value for each record.
I want this value to be the most recent timestamp. Can that be done?
Just use the max() function.
select forum.match_static_id,
count(forum.match_static_id) 'comment_no',
max(forum.timestamp)
from forum
Group By forum.match_static_id
Here's a list and explanation of the available aggregate functions.
How about this:
select forum.match_static_id,
count(forum.match_static_id) 'comment_no',
max(forum.timestamp)
from forum
Group By forum.match_static_id
I have a table from which I am trying to retrieve the latest position for each security:
The Table:
My query to create the table: SELECT id, security, buy_date FROM positions WHERE client_id = 4
+-------+----------+------------+
| id | security | buy_date |
+-------+----------+------------+
| 26 | PCS | 2012-02-08 |
| 27 | PCS | 2013-01-19 |
| 28 | RDN | 2012-04-17 |
| 29 | RDN | 2012-05-19 |
| 30 | RDN | 2012-08-18 |
| 31 | RDN | 2012-09-19 |
| 32 | HK | 2012-09-25 |
| 33 | HK | 2012-11-13 |
| 34 | HK | 2013-01-19 |
| 35 | SGI | 2013-01-17 |
| 36 | SGI | 2013-02-16 |
| 18084 | KERX | 2013-02-20 |
| 18249 | KERX | 0000-00-00 |
+-------+----------+------------+
I have been messing with versions of queries based on this page, but I cannot seem to get the result I'm looking for.
Here is what I've been trying:
SELECT t1.id, t1.security, t1.buy_date
FROM positions t1
WHERE buy_date = (SELECT MAX(t2.buy_date)
FROM positions t2
WHERE t1.security = t2.security)
But this returns only:
+-------+----------+------------+
| id | security | buy_date |
+-------+----------+------------+
| 27 | PCS | 2013-01-19 |
+-------+----------+------------+
I'm trying to get the maximum/latest buy date for each security, so the results would have one row for each security with the most recent buy date. Any help is greatly appreciated.
EDIT: The position's id must be returned with the max buy date.
You can use this query. In my test on a larger data set it returned results in about 75% less time; subqueries take more time.
SELECT p1.id,
p1.security,
p1.buy_date
FROM positions p1
left join
positions p2
on p1.security = p2.security
and p1.buy_date < p2.buy_date
where
p2.id is null;
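Here is a small sketch of the self-join against the sample rows from the question, to show which ids it keeps. It assumes Python's sqlite3 module with an in-memory SQLite database in place of MySQL and stores the dates as text, so '0000-00-00' simply sorts before every real date.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE positions (id INTEGER, security TEXT, buy_date TEXT)")
conn.executemany(
    "INSERT INTO positions VALUES (?, ?, ?)",
    [
        (26, "PCS", "2012-02-08"), (27, "PCS", "2013-01-19"),
        (28, "RDN", "2012-04-17"), (29, "RDN", "2012-05-19"),
        (30, "RDN", "2012-08-18"), (31, "RDN", "2012-09-19"),
        (32, "HK", "2012-09-25"), (33, "HK", "2012-11-13"),
        (34, "HK", "2013-01-19"), (35, "SGI", "2013-01-17"),
        (36, "SGI", "2013-02-16"), (18084, "KERX", "2013-02-20"),
        (18249, "KERX", "0000-00-00"),
    ],
)

# Anti-join: a row is kept only when no other row of the same security
# has a later buy_date, i.e. the LEFT JOIN finds no partner (p2.id IS NULL).
latest_positions = conn.execute(
    """
    SELECT p1.id, p1.security, p1.buy_date
    FROM positions p1
    LEFT JOIN positions p2
           ON p1.security = p2.security
          AND p1.buy_date < p2.buy_date
    WHERE p2.id IS NULL
    ORDER BY p1.id
    """
).fetchall()

for row in latest_positions:
    print(row)
```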
You can use a subquery to get the result:
SELECT p1.id,
p1.security,
p1.buy_date
FROM positions p1
inner join
(
SELECT MAX(buy_date) MaxDate, security
FROM positions
group by security
) p2
on p1.buy_date = p2.MaxDate
and p1.security = p2.security
See SQL Fiddle with Demo
Or you can use the following with a WHERE clause:
SELECT t1.id, t1.security, t1.buy_date
FROM positions t1
WHERE buy_date = (SELECT MAX(t2.buy_date)
FROM positions t2
WHERE t1.security = t2.security
group by t2.security)
See SQL Fiddle with Demo
This is done with a simple group by. You want to group by the securities and get the max of buy_date. The SQL:
SELECT security, max(buy_date)
from positions
group by security
Note, this is faster than bluefeet's answer but does not display the ID.
The answer by @bluefeet has two more ways to get the results you want, and the first will probably be more efficient than your query.
What I don't understand is why you say that your query doesn't work. It seems pretty fine and returns the expected result. Tested at SQL-Fiddle
SELECT t1.id, t1.security, t1.buy_date
FROM positions t1
WHERE buy_date = ( SELECT MAX(t2.buy_date)
FROM positions t2
WHERE t1.security = t2.security ) ;
If the problem appears when you add the client_id = 4 condition, then it's because you added it in only one WHERE clause, while you have to add it in both:
SELECT t1.id, t1.security, t1.buy_date
FROM positions t1
WHERE client_id = 4
AND buy_date = ( SELECT MAX(t2.buy_date)
FROM positions t2
WHERE client_id = 4
AND t1.security = t2.security ) ;
select security, max(buy_date) from positions group by security;
is all you need to get max buy date for each security (when you say out loud what you want from a query and you include the phrase "for each x", you probably want a group by on x)
When you use a group by, all columns in your select must either be columns that have been grouped by or aggregates, so if, for example, you wanted to include id, you'd probably have to use a subquery similar to what you had before, since there doesn't seem to be any aggregate you can reasonably use on the ids, and another group by would give you too many rows.