Select where max MySQL

Please help me write an SQL SELECT query. I have the following data.
My table is:
id news_id season seria date_update
---|------|---------|-----|--------------------
1 | 4 | 1 | 7 | 2017-04-14 16:38:10
2 | 4 | 1 | 7 | 2017-04-14 17:38:10
5 | 4 | 1 | 7 | 2017-04-14 16:38:10
3 | 4 | 1 | 7 | 2017-04-14 16:38:10
4 | 4 | 1 | 7 | 2017-04-14 16:38:10
6 | 4 | 1 | 7 | 2017-04-14 16:38:10
7 | 4 | 1 | 7 | 2017-04-14 16:38:10
8 | 1 | 1 | 25 | 2017-04-23 18:42:00
I need to get the rows with the max season, seria and date per group, sorted by date_update DESC.
As a result I need the following rows:
id news_id season seria date_update
---|------|---------|-----|--------------------
8 | 1 | 1 | 25 | 2017-04-23 18:42:00
2 | 4 | 1 | 7 | 2017-04-14 17:38:10
Because these rows have the highest season, seria and date_update per news_id. I.e. I need to select the rows which have the highest season, seria and date_update for each news_id, sorted by date_update DESC.
I tried the query below, but the data is not always correct, and for some reason it does not always cover all the rows that fit the condition.
SELECT serial.*
FROM serial as serial
INNER JOIN (SELECT id, MAX(season) AS maxseason, MAX(seria) AS maxseria FROM serial GROUP BY news_id) as one_serial
ON serial.id = one_serial.id
WHERE serial.season = one_serial.maxseason AND serial.seria = one_serial.maxseria
ORDER BY serial.date_update
Please help. Thanks.

The specification is unclear.
But we do know that the GROUP BY news_id clause is going to collapse all of the rows with a common value of news_id into a single row. (Other databases would throw an error with this syntax; we can get MySQL to throw a similar error if we include ONLY_FULL_GROUP_BY in the sql_mode.)
My suggestion would be to remove the GROUP BY news_id clause from the end of the query.
But that's just a guess. It's not at all clear what you are trying to achieve.
EDIT
SELECT t.*
FROM (
SELECT r.news_id
, r.season
, r.seria
, MAX(r.date_update) AS max_date_update
FROM (
SELECT p.news_id
, p.season
, MAX(p.seria) AS max_seria
FROM (
SELECT n.news_id
, MAX(n.season) AS max_season
FROM serial n
GROUP BY n.news_id
) o
JOIN serial p
ON p.news_id = o.news_id
AND p.season = o.max_season
) q
JOIN serial r
ON r.news_id = q.news_id
AND r.season = q.season
AND r.seria = q.max_seria
) s
JOIN serial t
ON t.news_id = s.news_id
AND t.season = s.season
AND t.seria = s.seria
AND t.date_update = s.max_date_update
GROUP BY t.news_id
ORDER BY t.news_id
Or, an alternate approach making use of MySQL user-defined variables...
SELECT s.id
, s.season
, s.seria
, s.date_update
FROM (
SELECT IF(q.news_id = @p_news_id,0,1) AS is_max
, q.id
, @p_news_id := q.news_id AS news_id
, q.season
, q.seria
, q.date_update
FROM ( SELECT @p_news_id := NULL ) r
CROSS
JOIN serial q
ORDER
BY q.news_id DESC
, q.season DESC
, q.seria DESC
, q.date_update DESC
) s
WHERE s.is_max
ORDER BY s.news_id

The subquery selects the maximum season and the maximum seria per news_id. We don't know how many records for a news_id match both the maximum season and the maximum seria: it can be one, two, a thousand, or zero.
So with the join you get an unknown number of records per news_id. Then you group by news_id. This gets you one result row per news_id. How then can you select serial.*? * means all columns from a row, but which row, when there can be many for a news_id? MySQL usually picks values arbitrarily in this case (usually all from the same row, but even that is not guaranteed). So you end up with random rows which you order by date_update.
This doesn't make much sense. So the question is: what do you really want to achieve? Maybe my explanation suffices and you are now able to fix your query yourself.
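If the goal is what the question's expected output suggests (one row per news_id, taking the highest season, then the highest seria, then the latest date_update), one possible way to avoid the arbitrary-row problem is a NOT EXISTS filter instead of GROUP BY. The following is only a sketch under that assumption: it uses Python's sqlite3 with an in-memory database as a stand-in for MySQL so that it can be run as-is, and the SQL text itself uses nothing SQLite-specific. Rows that tie on all three columns would all be returned.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption made for runnability).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE serial (id INTEGER, news_id INTEGER, season INTEGER,"
    " seria INTEGER, date_update TEXT)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO serial VALUES (?, ?, ?, ?, ?)",
    [
        (1, 4, 1, 7, "2017-04-14 16:38:10"),
        (2, 4, 1, 7, "2017-04-14 17:38:10"),
        (5, 4, 1, 7, "2017-04-14 16:38:10"),
        (3, 4, 1, 7, "2017-04-14 16:38:10"),
        (4, 4, 1, 7, "2017-04-14 16:38:10"),
        (6, 4, 1, 7, "2017-04-14 16:38:10"),
        (7, 4, 1, 7, "2017-04-14 16:38:10"),
        (8, 1, 1, 25, "2017-04-23 18:42:00"),
    ],
)

# Keep a row only if no other row of the same news_id ranks higher
# on (season, seria, date_update), compared in that order.
LATEST_PER_NEWS = """
SELECT s.id, s.news_id, s.season, s.seria, s.date_update
FROM serial s
WHERE NOT EXISTS (
    SELECT 1
    FROM serial o
    WHERE o.news_id = s.news_id
      AND (o.season > s.season
           OR (o.season = s.season AND o.seria > s.seria)
           OR (o.season = s.season AND o.seria = s.seria
               AND o.date_update > s.date_update))
)
ORDER BY s.date_update DESC
"""

rows = conn.execute(LATEST_PER_NEWS).fetchall()
for row in rows:
    print(row)
```

On the sample data this keeps ids 8 and 2, in that order, which matches the rows the question asks for.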

Related

MySQL - Retrieve the max value of an associated column within a LEFT JOIN with a different perimeter than the WHERE clause of the main query

I'm using MySQL 5.6 and have a select query with a LEFT JOIN, but I need to retrieve the max of an associated column (email_nb) with a different "perimeter" of constraints.
Let's take an example. Note that it is a mere example with only 5 rows, but it should also work when I have thousands (I'm stating this since there is a LIMIT clause in my query).
Table 'query_results'
+-----------------------------+------------+--------------+
| query_result_id | query_id | author |
+-----------------------------+------------+--------------+
| 2 | 1 | john |
| 3 | 1 | eric |
| 7 | 3 | martha |
| 9 | 4 | john |
| 10 | 1 | john |
+-----------------------------+------------+--------------+
Table 'customers_emails'
+-------------------+-----------------+--------------+-----------+-------------+------------------------
| customer_email_id | query_result_id | customer_id | author | email_nb | days_since_sending
+-------------------+-----------------+--------------+-----------+-------------+------------------------
| 5 | 2 | 12 | john | 2 | 150
| 12 | 3 | 7 | eric | 4 | 90
| 27 | 3 | 12 | eric | 2 | 86
| 40 | 9 | 15 | john | 9 | 87
| 42 | 2 | 12 | john | 7 | 23
| 51 | 10 | 12 | john | 3 | 89
+-------------------+-----------------+--------------+-----------+-------------+-----------------------
Notes:
you can have a query_result where the author appears in NO row at all in any of the customers_emails, hence the LEFT JOIN I'm using.
You can see author is by design kind of duplicated as it's both on the first table and the second table each time associated with a query_result_id. It's important to note.
email_nb is an integer between 0 and 10
there is a LIMIT clause as I need to retrieve a set number of records
Today my query aims at retrieving query_results under a certain number of conditions. The specificity is that I make sure to retrieve query_results with an author who does not appear in any customer_email_id where the days_since_sending would be less than 60 days: it means I check days_since_sending not only within the records for this query, but across all customers_emails, thanks to the NOT IN subquery (see below).
This is my current query for customer_id = 12 and query_id = 1
SELECT
qr.query_result_id,
qr.author
FROM
query_results qr
LEFT JOIN
customers_emails ce
ON
qr.author = ce.author
WHERE
qr.query_id = 1 AND
qr.author IS NOT NULL
AND qr.author NOT IN (
SELECT recipient
FROM customers_emails
WHERE
customer_id = 12 AND
days_since_sending >= 60
)
# we don't take by coincidence/bad luck 2 query results with the same author
GROUP BY
qr.author
ORDER BY
qr.query_result_id ASC
LIMIT
20
This is the expected output:
+-----------------------------+------------+--------------+
| query_result_id | author | email_nb |
+-----------------------------+------------+--------------+
| 10 | john | 7 |
| 3 | eric | 2 |
+-----------------------------+------------+--------------+
My challenge/difficulty today:
Notice that on the second line eric is tied to email_nb 2 and not to the maximum across all of eric's emails, which would be 4 if we took the max of email_nb across ALL messages to author = eric. We stay within the limit of customer_id = 12, so only one row is left, with email_nb = 2.
Also notice that on the first line the email_nb associated with query_result_id = 10 is 7, and not 3, which is the value on the last line of the customers_emails table.
For emails to 'john' I had the choice between email_nb 2, 7 and 3, and I take the highest, so it's 7, even if that email is from more than 60 days ago. This is very important and is the part I don't know how to do, because the perimeters are different: today I retrieve all the query_results where the author has NOT been sent an email in the past 60 days (see the NOT IN subquery), BUT I need to have in the column the max email_nb sent to john by customer_id=12 and query_id=1 EVEN if it was sent more than 60 days ago.
In other words, I don't want the max(email_nb) computed under the same WHERE conditions (such as days_since_sending >= 60) or the same LIMIT and GROUP BY as my current query. What I need is the maximum value of email_nb for customer_id = 12 AND query_id = 1 sent to john across ALL records of the customers_emails table.
If there is no associated row in customers_emails at all (meaning no email has ever been sent by this customer for this query), then email_nb should be NULL.
This means I do NOT want this output:
+-----------------------------+------------+--------------+
| query_result_id | author | email_nb |
+-----------------------------+------------+--------------+
| 10 | john | 3 |
| 3 | eric | 2 |
+-----------------------------+------------+--------------+
How to achieve this in MySQL 5.6 ?
Since your question is a bit confusing, I came up with this.
select
max(q.query_result_id) as query_result_id,q.author,max(email_nb) as email_nb
from query_results q
left join customers_emails c on q.author=c.author
where customer_id=12 and query_id=1
group by q.author;
I think the best thing to do in a situation like this is break it down into smaller queries and then combine them together.
The first thing you want to do is this:
The specificity is that I make sure to retrieve query_results with an author who does not appear in any customer_email_id where the days_since_sending would be less than 60 days
This might look something like this:
-- Query A
SELECT DISTINCT q.author FROM query_results q
WHERE q.author NOT IN (
SELECT c.author FROM customers_emails c
WHERE c.days_since_sending < 60
)
AND q.query_id = 1
This will get you the list of authors (with duplicates removed) that haven't had an email in the last 60 days that appear for the given query ID. Your next requirement is the following:
I need to have in the column the max email_nb sent to john by customer_id=12 and query_id=1 EVEN if it was sent more than 60 days ago
This query could look like this:
-- Query B
SELECT c.query_result_id, c.author, MAX(c.email_nb) as max_email_nb
FROM customers_emails c
LEFT JOIN query_results q ON c.author = q.author
WHERE c.customer_id = 12
AND q.query_id = 1
GROUP BY c.query_result_id, c.author
That gets you the maximum email_nb for each author/query_result combination, not taking into consideration the date at all.
The only thing left to do is reduce the set of results from the second query down to only the authors that appear in the first query. There are a few different methods for doing that. For example, you could INNER JOIN the two queries by author:
SELECT b.* FROM (
-- Query B
SELECT c.query_result_id, c.author, MAX(c.email_nb) as max_email_nb
FROM customers_emails c
LEFT JOIN query_results q ON c.author = q.author
WHERE c.customer_id = 12
AND q.query_id = 1
GROUP BY c.query_result_id, c.author
) b INNER JOIN (
-- Query A
SELECT DISTINCT q.author FROM query_results q
WHERE q.author NOT IN (
SELECT c.author FROM customers_emails c
WHERE c.days_since_sending < 60
)
AND q.query_id = 1
) a ON a.author = b.author
You could also filter with an IN clause, keeping only the authors that Query A returns:
SELECT b.* FROM (
-- Query B
SELECT c.query_result_id, c.author, MAX(c.email_nb) as max_email_nb
FROM customers_emails c
LEFT JOIN query_results q ON c.author = q.author
WHERE c.customer_id = 12
AND q.query_id = 1
GROUP BY c.query_result_id, c.author
) b
WHERE b.author IN (
-- Query A
SELECT DISTINCT q.author FROM query_results q
WHERE q.author NOT IN (
SELECT c.author FROM customers_emails c
WHERE c.days_since_sending < 60
)
AND q.query_id = 1
)
There are most likely ways to improve the speed or reduce the number of lines of this query, but if you need to do that, you now at least have a working query to compare the results against.
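Another way to give email_nb its own "perimeter" is a correlated scalar subquery in the SELECT list: it is evaluated per author and is not affected by the WHERE, GROUP BY or LIMIT of the outer query. The sketch below is an illustration only, run through Python's sqlite3 (an in-memory stand-in for MySQL 5.6). It makes two assumptions that are not in the question: MAX(query_result_id) is used to pick one query result per author, and the 60-day exclusion filter is left out on purpose, because with the sample rows it would remove john from the expected output.

```python
import sqlite3

# In-memory SQLite stands in for MySQL 5.6 (an assumption made for runnability).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE query_results (query_result_id INTEGER, query_id INTEGER, author TEXT)"
)
conn.execute(
    "CREATE TABLE customers_emails (customer_email_id INTEGER, query_result_id INTEGER,"
    " customer_id INTEGER, author TEXT, email_nb INTEGER, days_since_sending INTEGER)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO query_results VALUES (?, ?, ?)",
    [(2, 1, "john"), (3, 1, "eric"), (7, 3, "martha"), (9, 4, "john"), (10, 1, "john")],
)
conn.executemany(
    "INSERT INTO customers_emails VALUES (?, ?, ?, ?, ?, ?)",
    [
        (5, 2, 12, "john", 2, 150),
        (12, 3, 7, "eric", 4, 90),
        (27, 3, 12, "eric", 2, 86),
        (40, 9, 15, "john", 9, 87),
        (42, 2, 12, "john", 7, 23),
        (51, 10, 12, "john", 3, 89),
    ],
)

# The scalar subquery looks at ALL emails of this customer to the author,
# independently of the outer WHERE; it yields NULL when no email exists.
MAX_EMAIL_PER_AUTHOR = """
SELECT MAX(qr.query_result_id) AS query_result_id,
       qr.author,
       (SELECT MAX(ce.email_nb)
        FROM customers_emails ce
        WHERE ce.author = qr.author
          AND ce.customer_id = :customer_id) AS email_nb
FROM query_results qr
WHERE qr.query_id = :query_id
  AND qr.author IS NOT NULL
GROUP BY qr.author
ORDER BY qr.author
LIMIT 20
"""

rows = conn.execute(
    MAX_EMAIL_PER_AUTHOR, {"customer_id": 12, "query_id": 1}
).fetchall()
for row in rows:
    print(row)
```

With the sample tables this gives email_nb 2 for eric and 7 for john. The recency condition from Query A could be added back to the outer WHERE without changing how email_nb is computed.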

MySQL select total latest updates of a type in the last N days

In MySQL, I have a table things which holds things owned by a user_id. The table thing_updates holds updates to things, and has a status and a date_submitted which is a unix timestamp of when the update was made. things do not necessarily have a corresponding row in thing_updates, such as when an update has not yet been made. Sample data:
Table: things
id | user_id
1 | 1
2 | 1
3 | NULL
Table: thing_updates
id | thing_id | status | date_submitted
1 | 1 | x | 123456789
2 | 1 | y | 234567890
3 | 3 | x | 123456789
I have managed to get the latest status of each thing before the date 999999999 assigned to user_id = 1 with the query below.
select t.id, tu.status, t.user_id
from things as t
left join thing_updates as tu
on tu.thing_id = t.id
where (
date_submitted in (
select max(tu2.date_submitted)
from thing_updates as tu2
where date_submitted < 999999999
group by thing_id
)
or date_submitted is null
)
and t.user_id = 1
This will give me something akin to:
id | status | user_id
1 | y | 1
2 | NULL | 1
As you can see, the status y is shown because it is more recent than x and before 999999999. There are 2 results in total and this query seems to work fine.
Now I would like to get total results which have a certain status for today, yesterday, the day before, etc until 10 days ago. To do this I have created another table called chart_range which holds the numbers 0 to 9. For instance:
Table: chart_range
offset
0
1
2
...
9
I hoped to use the offset value as follows:
select cr.offset, count(x.id) as total_x
from chart_range as cr
left join (
select t.id, tu.status, t.user_id
from things as t
left join thing_updates as tu
on tu.thing_id = t.id
where (
date_submitted in (
select max(tu2.date_submitted)
from thing_updates as tu2
where date_submitted < unix_timestamp(date_add(now(), interval - cr.offset + 1 day))
group by thing_id
)
or date_submitted is null
)
and t.user_id = 1
) as x on tu.status = 'x'
group by cr.offset
order by cr.offset asc
The end goal is to get a result like this:
offset | total_x
0 | 2 <-- such as in the 999999999 example above
1 | 5
2 | 7
3 | 4
...
9 | 0
However my query does not work as cr.offset cannot be referenced in an uncorrelated subquery. How can I modify this query to work?

Select promoted items grouped by another attribute

From table like below:
id | node_id | promoted | group_type | created_at |status
------------------------------------------------------------------
8 | 4321 | 1 | 3 | 2018-01-08 13:29:55| 1
4 | 4321 | 0 | 3 | 2018-01-06 11:22:53| 1
3 | 4321 | 0 | 1 | 2018-01-05 23:19:02| 1
2 | 4321 | 1 | 1 | 2018-01-05 21:20:15| 1
1 | 4321 | 1 | 3 | 2018-01-05 11:09:51| 1
I have to get one id and group_type values per each group_type.
If there is a promoted item in the group, the query should return its id and group_type.
If there is more than one promoted item in the group, the most recent promoted record should be returned.
If there is no promoted item in the group, query should return most recent record.
Using query below I managed to get almost what I need
SELECT a.id, a.group_type, a.promoted, a.created_at
FROM (
SELECT group_type, MAX(promoted) AS max_promoted
FROM nodes
WHERE node_id=4321 AND status=1
GROUP BY group_type
) AS g
INNER JOIN nodes AS a
ON a.group_type = g.group_type AND a.promoted = g.max_promoted
WHERE node_id= 4321 AND status=1 ORDER BY created_at
Unfortunately when there is more than one promoted item in the group I get both.
Any idea how to get only one promoted item per group?
EDIT:
If there is more than one group, query should return multiple rows but one per every group.
You can limit the result of the query by adding LIMIT 0,1 at the end of the query.
Since you have ordered your result, it will work.
For more information about LIMIT see: https://dev.mysql.com/doc/refman/5.7/en/limit-optimization.html
Edited: You should order the items in descending order to get the latest one on top, and limit the items as required, i.e. 1 or 2 and so on. A UNION also helps to get the latest record, promoted or, failing that, not promoted. The final LIMIT returns only the single (required) row. Here's your query:
(SELECT a.id, a.group_type, a.promoted, a.created_at
FROM (
SELECT group_type, MAX(promoted) AS max_promoted
FROM nodes
WHERE node_id=4321 and status=1
GROUP BY group_type
) AS g
INNER JOIN nodes AS a
ON a.group_type = g.group_type AND a.promoted = g.max_promoted
WHERE node_id= 4321 and status=1 ORDER BY created_at desc
limit 1)
union
(select a.id, a.group_type, a.promoted, a.created_at from nodes a order by created_at desc limit 1)
limit 1
Hope it helps!
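Note that a trailing LIMIT 1 returns a single row overall, while the question's EDIT asks for one row per group. One possible way to get that is a correlated subquery that, for each group_type, picks the id of the best row ordered by promoted DESC, created_at DESC. This is a sketch only, run through Python's sqlite3 as an in-memory stand-in for MySQL; the table and column names are the ones from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption made for runnability).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nodes (id INTEGER, node_id INTEGER, promoted INTEGER,"
    " group_type INTEGER, created_at TEXT, status INTEGER)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?, ?, ?, ?)",
    [
        (8, 4321, 1, 3, "2018-01-08 13:29:55", 1),
        (4, 4321, 0, 3, "2018-01-06 11:22:53", 1),
        (3, 4321, 0, 1, "2018-01-05 23:19:02", 1),
        (2, 4321, 1, 1, "2018-01-05 21:20:15", 1),
        (1, 4321, 1, 3, "2018-01-05 11:09:51", 1),
    ],
)

# For each row, the subquery finds the preferred id within the same group:
# promoted rows first, then the most recent one. Only that row is kept.
ONE_PER_GROUP = """
SELECT a.id, a.group_type
FROM nodes a
WHERE a.node_id = 4321
  AND a.status = 1
  AND a.id = (
      SELECT b.id
      FROM nodes b
      WHERE b.node_id = a.node_id
        AND b.status = 1
        AND b.group_type = a.group_type
      ORDER BY b.promoted DESC, b.created_at DESC
      LIMIT 1
  )
ORDER BY a.group_type
"""

rows = conn.execute(ONE_PER_GROUP).fetchall()
for row in rows:
    print(row)
```

On the sample data this returns id 2 for group_type 1 and id 8 for group_type 3. If a group has no promoted row, the same ordering falls back to its most recent record.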

retrieve value of maximum occurrence in a table

I have a rather complicated problem. Let me first explain what I am doing right now:
I have a table named feedback in which I am storing grades against a course id. The table looks like this:
+-------+-------+-------+-------+-----------+--------------
| id | cid | grade |g_point| workload | easiness
+-------+-------+-------+-------+-----------+--------------
| 1 | 10 | A+ | 1 | 5 | 4
| 2 | 10 | A+ | 1 | 2 | 4
| 3 | 10 | B | 3 | 3 | 3
| 4 | 11 | B+ | 2 | 2 | 3
| 5 | 11 | A+ | 1 | 5 | 4
| 6 | 12 | B | 3 | 3 | 3
| 7 | 11 | B+ | 2 | 7 | 8
| 8 | 11 | A+ | 1 | 1 | 2
g_point has just specific values for the grades, thus I can use these values to show the user courses sorted by grades.
My first task is to print the grade of each course. The grade is the one that occurs most often for that course. For example, the result for cid = 10 is A+, because it appears twice. This part is simple, and I have already implemented the query, which I include at the end.
The main problem is the course cid = 11, which has two different grades with the same number of occurrences. In that situation the client asks me to take the average of workload and easiness for each of these grades, and whichever grade has the greater average should be shown. The average would be computed like this:
(average of all workload values of the grade for the course
+ average of all easiness values of the grade for the course)
/ 2
In this example cid = 11 has four entries, with an equal number of occurrences of each grade:
B+ grade average
avg workload: (2 + 7)/2 = 4.5 = x
avg easiness: (3 + 8)/2 = 5.5 = y
answer: (x + y)/2 = 5
A+ grade average
avg workload: (5 + 1)/2 = 3 = x
avg easiness: (4 + 2)/2 = 3 = y
answer: (x + y)/2 = 3
so the grade should be B+.
This is the query which I am running to get the max occurrence grade
SELECT
f3.coursecodeID cid,
f3.grade_point p,
f3.grade g
FROM (
SELECT
coursecodeID,
MAX(mode_qty) mode_qty
FROM (
SELECT
coursecodeID,
COUNT(grade_point) mode_qty
FROM feedback
GROUP BY
coursecodeID, grade_point
) f1
GROUP BY coursecodeID
) f2
INNER JOIN (
SELECT
coursecodeID,
grade_point,
grade,
COUNT(grade_point) mode_qty
FROM feedback
GROUP BY
coursecodeID, grade_point
) f3
ON
f2.coursecodeID = f3.coursecodeID AND
f2.mode_qty = f3.mode_qty
GROUP BY f3.coursecodeID
ORDER BY f3.grade_point
Here is SQL Fiddle.
I added a table Courses with the list of all course IDs, to make the main idea of the query easier to see. Most likely you have it in the real database. If not, you can generate it on the fly from feedback by grouping by cid.
For each cid we need to find the grade. Group feedback by cid, grade to get a list of all grades for the cid. We need to pick only one grade for a cid, so we use LIMIT 1. To determine which grade to pick we order them. First, by occurrence (a simple COUNT). Second, by the average score. Finally, if there are several grades that have the same occurrence and the same average score, then pick the grade with the smallest g_point. You can adjust the rules by tweaking the ORDER BY clause.
SELECT
courses.cid
,(
SELECT feedback.grade
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGrade
FROM courses
ORDER BY courses.cid
result set
cid CourseGrade
10 A+
11 B+
12 B
UPDATE
MySQL doesn't have lateral joins (they were only added in 8.0.14), so one possible way to get the second column g_point is to repeat the correlated sub-query. SQL Fiddle
SELECT
courses.cid
,(
SELECT feedback.grade
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGrade
,(
SELECT feedback.g_point
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGPoint
FROM courses
ORDER BY CourseGPoint
result set
cid CourseGrade CourseGPoint
10 A+ 1
11 B+ 2
12 B 3
Update 2 Added average score into ORDER BY SQL Fiddle
SELECT
courses.cid
,(
SELECT feedback.grade
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGrade
,(
SELECT feedback.g_point
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS CourseGPoint
,(
SELECT (AVG(workload) + AVG(easiness))/2
FROM feedback
WHERE feedback.cid = courses.cid
GROUP BY
cid
,grade
ORDER BY
COUNT(*) DESC
,(AVG(workload) + AVG(easiness))/2 DESC
,g_point
LIMIT 1
) AS AvgScore
FROM courses
ORDER BY CourseGPoint, AvgScore DESC
result
cid CourseGrade CourseGPoint AvgScore
10 A+ 1 3.75
11 B+ 2 5
12 B 3 3
If I understood well you need an inner select to find the average, and a second outer select to find the maximum values of the average
select cid, grade, max(average)/2 from (
select cid, grade, avg(workload + easiness) as average
from feedback
group by cid, grade
) x group by cid, grade
This solution has been tested on your data using SQL Fiddle at this link
If you change the previous query to
select cid, max(average)/2 from (
select cid, grade, avg(workload + easiness) as average
from feedback
group by cid, grade
) x group by cid
You will find the max average for each cid.
As mentioned in the comments, you have to choose which strategy to use if more than one grade meets the max average. For example, if you have
+-------+-------+-------+-------+-----------+--------------
| id | cid | grade |g_point| workload | easiness
+-------+-------+-------+-------+-----------+--------------
| 1 | 10 | A+ | 1 | 5 | 4
| 2 | 10 | A+ | 1 | 2 | 4
| 3 | 10 | B | 3 | 3 | 3
| 4 | 11 | B+ | 2 | 2 | 3
| 5 | 11 | A+ | 1 | 5 | 4
| 9 | 11 | C | 1 | 3 | 6
you will have grades A+ and C both satisfying the maximum average of 4.5.
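To see which grades are tied for a cid, the per-grade averages can be joined back to the per-cid maximum. The following sketch does that on the modified data above; it runs through Python's sqlite3 as an in-memory stand-in for MySQL, and the column alias score and the variable names are mine.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption made for runnability).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE feedback (id INTEGER, cid INTEGER, grade TEXT,"
    " g_point INTEGER, workload INTEGER, easiness INTEGER)"
)
# The six rows of the tie example above.
conn.executemany(
    "INSERT INTO feedback VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 10, "A+", 1, 5, 4),
        (2, 10, "A+", 1, 2, 4),
        (3, 10, "B", 3, 3, 3),
        (4, 11, "B+", 2, 2, 3),
        (5, 11, "A+", 1, 5, 4),
        (9, 11, "C", 1, 3, 6),
    ],
)

# "a" holds one score per (cid, grade); "m" holds the best score per cid.
# Joining them keeps every grade that reaches the maximum, ties included.
TIED_GRADES = """
SELECT a.cid, a.grade, a.score
FROM (
    SELECT cid, grade, AVG(workload + easiness) / 2 AS score
    FROM feedback
    GROUP BY cid, grade
) a
JOIN (
    SELECT cid, MAX(score) AS max_score
    FROM (
        SELECT cid, grade, AVG(workload + easiness) / 2 AS score
        FROM feedback
        GROUP BY cid, grade
    ) x
    GROUP BY cid
) m ON m.cid = a.cid AND m.max_score = a.score
ORDER BY a.cid, a.grade
"""

rows = conn.execute(TIED_GRADES).fetchall()
for row in rows:
    print(row)
```

For cid 11 both A+ and C come back, so a further tie-break rule (for example the smallest g_point) is still needed to end up with a single grade.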

Mysql join with counting results in another table

I have two tables, one with ranges of numbers, the second with numbers. I need to select all ranges which have at least one number with status not in (2, 0). I have tried a number of different joins; some of them took forever to execute, and the one I ended up with is fast, but it selects only a really small number of ranges.
SELECT SQL_CALC_FOUND_ROWS md_number_ranges.*
FROM md_number_list
JOIN md_number_ranges
ON md_number_list.range_id = md_number_ranges.id
WHERE md_number_list.phone_num_status NOT IN (2, 0)
AND md_number_ranges.reseller_id=1
GROUP BY range_id
LIMIT 10
OFFSET 0
What I need is something like "select all ranges, join numbers where number.range_id = range.id and where there is at least one number with phone_num_status not in (2, 0)".
Any help would be really appreciated.
Example data structure:
md_number_ranges:
id | range_start | range_end | reseller_id
1 | 000001 | 000999 | 1
2 | 100001 | 100999 | 2
md_number_list:
id | range_id | number | phone_num_status
1 | 1 | 0000001 | 1
2 | 1 | 0000002 | 2
3 | 2 | 1000012 | 0
4 | 2 | 1000015 | 2
I want to be able to select range 1, because it has one number with status 1, but not range 2, because both of its numbers have statuses which I do not want to select.
It's a bit hard to tell what you want, but perhaps this will do:
SELECT *
from md_number_ranges m
join (
SELECT md_number_ranges.id
, count(*) as FOUND_ROWS
FROM md_number_list
JOIN md_number_ranges
ON md_number_list.range_id = md_number_ranges.id
WHERE md_number_list.phone_num_status NOT IN (2, 0)
AND md_number_ranges.reseller_id=1
GROUP BY range_id
) x
on x.id=m.id
LIMIT 10
OFFSET 0
Is this what you're looking for?
SELECT DISTINCT r.*
FROM md_number_ranges r
JOIN md_number_list l ON r.id = l.range_id
WHERE l.phone_num_status NOT IN (0,2)
SQL Fiddle Demo
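Since the requirement is "at least one number with a status outside (2, 0)", an EXISTS semi-join is another possible formulation: it returns each range once, with no GROUP BY or DISTINCT. The sketch below uses the sample data from the question and runs through Python's sqlite3 as an in-memory stand-in for MySQL; the reseller_id filter is taken from the original query.

```python
import sqlite3

# In-memory SQLite stands in for MySQL (an assumption made for runnability).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE md_number_ranges (id INTEGER, range_start TEXT,"
    " range_end TEXT, reseller_id INTEGER)"
)
conn.execute(
    "CREATE TABLE md_number_list (id INTEGER, range_id INTEGER,"
    " number TEXT, phone_num_status INTEGER)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO md_number_ranges VALUES (?, ?, ?, ?)",
    [(1, "000001", "000999", 1), (2, "100001", "100999", 2)],
)
conn.executemany(
    "INSERT INTO md_number_list VALUES (?, ?, ?, ?)",
    [
        (1, 1, "0000001", 1),
        (2, 1, "0000002", 2),
        (3, 2, "1000012", 0),
        (4, 2, "1000015", 2),
    ],
)

# A range qualifies as soon as one of its numbers has a status other than 0 or 2.
RANGES_WITH_USABLE_NUMBER = """
SELECT r.id, r.range_start, r.range_end, r.reseller_id
FROM md_number_ranges r
WHERE r.reseller_id = 1
  AND EXISTS (
      SELECT 1
      FROM md_number_list l
      WHERE l.range_id = r.id
        AND l.phone_num_status NOT IN (0, 2)
  )
ORDER BY r.id
LIMIT 10 OFFSET 0
"""

rows = conn.execute(RANGES_WITH_USABLE_NUMBER).fetchall()
for row in rows:
    print(row)
```

Only range 1 is returned for the sample data. On large tables an index on md_number_list (range_id, phone_num_status) would likely help this kind of lookup, though that has not been measured here.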