Folks
when i m running the below query , i m getting the error for invalid use of group by function
SELECT `Margin`.`destination`,
ROUND(sum(duration),2) as total_duration,
sum(calls) as total_calls
FROM `ilax`.`margins` AS `Margin`
WHERE `date1` = '2013-08-30' and `destination` like "af%"
AND ROUND(sum(duration),2) like "3%"
group by `destination`
ORDER BY duration Asc LIMIT 0, 20;
let me know the work around
The WHERE clause is evaluated before grouping takes place, so SUM() cannot be used therein; use the HAVING clause instead, which is evaluated after grouping:
SELECT destination,
ROUND(SUM(duration), 2) AS total_duration,
SUM(calls) AS total_calls
FROM ilax.margins
WHERE date1 = '2013-08-30'
AND destination LIKE 'af%'
GROUP BY destination
HAVING total_duration LIKE '3%'
ORDER BY total_duration ASC
LIMIT 0, 20
Note also that one really ought to use numeric comparison operations for numeric values, rather than string pattern matching. For example:
HAVING total_duration >= 3000 AND total_duration < 4000
Related
I am trying to run a query to get data one time from a client database to our database but a query is taking a lot of time to execute, when I change the order by from primary key user_appoint.id to user_appoint.u_id below is my query
SELECT
CONCAT('D',user_appoint.`id`) AS ApptId,
user_appoint.`u_id`,
tbl_questions.CandAns,
tbl_questions.ExamAns,
tbl_questions.QueNote,
CONCAT("[",GROUP_CONCAT(CONCAT('"',`tbl_investigations`.`test_id`,'":"',tbl_investigations.`result`,'"')),"]") AS CandInv,
CONCAT("[",GROUP_CONCAT(CONCAT('"',`tbl_investigations`.`test_id`,'":"',tbl_investigations.`comments`,'"')),"]") AS IntComm,
IF(tbl_questions.LastUpdatedDateTime>MAX(tbl_investigations.`ModifiedAt`),tbl_questions.LastUpdatedDateTime,MAX(tbl_investigations.`ModifiedAt`)) AS LastUpdatedDateTime,
CONCAT('D',user_appoint.`id`) AS UniqueId
FROM user_appoint
LEFT JOIN tbl_investigations ON tbl_investigations.`appt_id`=user_appoint.`id` AND tbl_investigations.`ModifiedAt`>'2011-01-01 00:00:00'
LEFT JOIN tbl_questions ON tbl_questions.`appt_id` =user_appoint.`id` AND tbl_questions.`LastUpdatedDateTime`>'2011-01-01 00:00:00'
GROUP BY user_appoint.`id`
HAVING LastUpdatedDateTime>'2011-01-01 00:00:00'
ORDER BY user_appoint.`u_id`
LIMIT 0, 2000;
user_appoint.u_id is properly indexed.
Please check the explain plan of your query. And its better to always share explain plan with your original question.
explain format=json
SELECT CONCAT('D',user_appoint.id) AS ApptId, user_appoint.u_id,
tbl_questions.CandAns, tbl_questions.ExamAns, tbl_questions.QueNote,
CONCAT("[",GROUP_CONCAT(CONCAT('"',tbl_investigations.test_id,'":"',tbl_investigations.result,'"')),"]")
AS CandInv,
CONCAT("[",GROUP_CONCAT(CONCAT('"',tbl_investigations.test_id,'":"',tbl_investigations.comments,'"')),"]")
AS IntComm,
IF(tbl_questions.LastUpdatedDateTime>MAX(tbl_investigations.ModifiedAt),tbl_questions.LastUpdatedDateTime,MAX(tbl_investigations.ModifiedAt))
AS LastUpdatedDateTime, CONCAT('D',user_appoint.id) AS UniqueId FROM
user_appoint LEFT JOIN tbl_investigations ON
tbl_investigations.appt_id=user_appoint.id AND
tbl_investigations.ModifiedAt>'2011-01-01 00:00:00' LEFT JOIN
tbl_questions ON tbl_questions.appt_id =user_appoint.id AND
tbl_questions.LastUpdatedDateTime>'2011-01-01 00:00:00' GROUP BY
user_appoint.id HAVING LastUpdatedDateTime>'2011-01-01 00:00:00'
ORDER BY user_appoint.u_id LIMIT 0, 2000;
On looking at your query,I could see lot of concat,aggregate function and join is being performed in single query.
These operations will be performed for all 2000 records as you have set limit on query execution.
This might have caused query to slow down its execution.
You have 2 identical columns with different aliases
CONCAT('D',user_appoint.`id`) AS ApptId,
CONCAT('D',user_appoint.`id`) AS UniqueId
(changed) Assuming NULLs may occur in these date columns then comparing the max() values will overcome any adverse impacts by NULL:
if(max(tbl_questions.lastupdateddatetime) > max(tbl_investigations.`modifiedat`) , max(tbl_questions.lastupdateddatetime), max(tbl_investigations.`modifiedat`)) AS LastUpdatedDateTime
Try this:
SELECT *
FROM (
SELECT
Concat('D', user_appoint.`id`) AS ApptId
, user_appoint.`u_id`
, tbl_questions.candans
, tbl_questions.examans
, tbl_questions.quenote
, Concat("[", Group_concat(Concat('"', `tbl_investigations`.`test_id`, '":"', tbl_investigations.`result`, '"')), "]") AS CandInv
, Concat("[", Group_concat(Concat('"', `tbl_investigations`.`test_id`, '":"', tbl_investigations.`comments`, '"')), "]") AS IntComm
, if(max(tbl_questions.lastupdateddatetime) > max(tbl_investigations.`modifiedat`) , max(tbl_questions.lastupdateddatetime), max(tbl_investigations.`modifiedat`) ) AS LastUpdatedDateTime
, Concat('D', user_appoint.`id`) AS UniqueId
FROM user_appoint
LEFT JOIN tbl_investigations
ON tbl_investigations.`appt_id` = user_appoint.`id`
AND tbl_investigations.`modifiedat` > '2011-01-01 00:00:00'
LEFT JOIN tbl_questions
ON tbl_questions.`appt_id` = user_appoint.`id`
AND tbl_questions.`lastupdateddatetime` > '2011-01-01 00:00:00'
GROUP BY user_appoint.`id`
HAVING lastupdateddatetime > '2011-01-01 00:00:00'
) d
ORDER BY `u_id`
LIMIT 0, 2000
;
HOWEVER
You are using a non-current and non-standard form of GROUP BY clause. MySQL started life allowing this bizarre situation where you could select many columns but only group by one of those. This is completely non-standard for SQL.
In recent versions of MySQL the default settings have changed and using just one column in the GROUP BY clause will cause an error.
So, you may have to change the way you perform the grouping to
GROUP BY
user_appoint.`id`
, user_appoint.`u_id`
, tbl_questions.candans
, tbl_questions.examans
, tbl_questions.quenote
If none of these improve performance please provide the execution plan (as text).
Does anyone know what's wrong with this query?
This works perfectly on its own:
SELECT * FROM
(SELECT * FROM data WHERE site = '".$id."'
AND disabled = '0'
AND carvotes NOT LIKE '0'
AND (time > ( now( ) - INTERVAL 14 DAY ))
GROUP BY car ORDER BY carvotes DESC LIMIT 0 , 10)
X order by time DESC
So does this:
SELECT * FROM data WHERE site = '".$id."' AND disabled = '0' GROUP BY car DESC ORDER BY time desc LIMIT 0 , 30
But combining them like this:
SELECT * FROM data WHERE site = '".$id."' AND disabled = '0' AND car NOT IN (SELECT * FROM
(SELECT * FROM data WHERE site = '".$id."'
AND disabled = '0'
AND carvotes NOT LIKE '0'
AND (time > ( now( ) - INTERVAL 14 DAY ))
GROUP BY car ORDER BY carvotes DESC LIMIT 0 , 10)
X order by time DESC) GROUP BY car DESC ORDER BY time desc LIMIT 0 , 30
Gives errors. Any ideas?
Please try the following...
$result = mysqli_query( $con,
"SELECT *
FROM data
WHERE site = '" . $id .
"' AND disabled = '0'
AND car NOT IN ( SELECT car
FROM ( SELECT car,
carvotes
FROM data
WHERE site = '" . $id .
"' AND disabled = '0'
AND carvotes NOT LIKE '0'
AND ( time > ( NOW( ) - INTERVAL 14 DAY ) )
GROUP BY car
ORDER BY carvotes DESC
LIMIT 10 ) X
)
GROUP BY car
ORDER BY time DESC
LIMIT 30" );
The main cause of your problem is that with car NOT IN ( SELECT * FROM ( SELECT *... you are trying to compare each record's value of car with each row returned by your subquery. IN requires you to have the same number of fields on both sides of the comparison. By using SELECT * at both levels of the subquery you were ensuring that the right side of the comparison had however many fields are in data versus your single field on the left, which confused MySQL.
Since you are aiming to compare to a single field, namely car, our subquery has to select just the car field from its dataset. Since the sort order of the subquery's results has no effect upon the IN comparison, and since our innermost query will be returning just car, I have removed the outer level of the subquery.
Beyond changing the first part of the subquery to SELECT car, the only other change that I have made to the subquery is to change LIMIT 0, 10 to LIMIT 10. The former means limit to the the 10 records that are offset by 0 from the first record. This is useful if you want records 6 to 15, but redundant for 1 to 10 as LIMIT 10 has the same affect and is slightly simpler. Ditto for LIMIT 0, 30 at the end of your overall statement.
As for the main body of the statement, I have not made any attempt to specify what fields (or aggregate functions of those fields) should be returned since you have made no statement indicating what your requirements / preferences are. If you are satisfied that GROUP BY has left you with a still valid set of values, then all the good, but if not then I recommend that you rewrite your Question to be specific about that detail.
By default, MySQL sorts the data subjected to a GROUP BY into ascending order, but if an ORDER BY clause is also present then it overrides the GROUP BY's sort pattern. As such, there is no benefit to specifying DESC after either of your GROUP BY car clauses, so I have removed it where it occurs.
Interesting Sidenote : You can override a GROUP BY's sort by specifying ORDER BY NULL.
If you have any questions or comments, then please feel free to post a Comment accordingly.
Further Reading
https://dev.mysql.com/doc/refman/5.7/en/order-by-optimization.html - on optimising your ORDER BY sorting
https://dev.mysql.com/doc/refman/5.7/en/select.html - on the SELECT statement's syntax - specifically the parts to do with LIMIT.
https://www.w3schools.com/php/php_mysql_select_limit.asp - a simpler explanation of LIMIT
This is your query:
SELECT *
FROM data
WHERE site = '".$id."' AND disabled = '0' AND
car NOT IN (SELECT *
FROM (SELECT *
FROM data
WHERE site = '".$id."' AND
disabled = '0' AND
carvotes NOT LIKE '0' AND
(time > ( now( ) - INTERVAL 14 DAY ))
GROUP BY car
ORDER BY carvotes DESC
LIMIT 0 , 10
) x
ORDER BY time DESC
)
GROUP BY car DESC
ORDER BY time desc
LIMIT 0 , 30 ;
Several comments:
Do not wrap integer constants in single quotes. This can mislead people. This can mislead optimizers.
Do not use string functions on integers (such as like). Same reason.
NOT IN with subqueries is dangerous. The construct does not handle NULL values the way you expect. Use NOT EXISTS or LEFT JOIN instead.
When using subqueries, ORDER BY is almost never appropriate.
Never use SELECT * with GROUP BY. It is just wrong. Happily, MySQL 5.7 has changed its defaults to reject this anti-pattern
So, a better way to write this query is something like this:
SELECT d.car, MAX(time) as time
FROM data d LEFT JOIN
(SELECT d2.*
FROM data d2
WHERE d2.site = '".$id."' AND
d2.disabled = 0 AND
d2.carvotes NOT LIKE 0 AND
(d2.time > ( now( ) - INTERVAL 14 DAY ))
GROUP BY d2.car
ORDER BY carvotes DESC
LIMIT 0 , 10
) car10
ON d.car = car10.car
WHERE d.site = '".$id."' AND d.disabled = 0' AND
car10.car IS NOT NULL
GROUP BY car DESC
ORDER BY MAX(time) desc
LIMIT 0 , 30 ;
Alternatively, use SELECT * and remove the GROUP BY in the outer query.
For readability, I would like to modify the below statement. Is there a way to extract the CASE statement, so I can use it multiple times without having to write it out every time?
select
mturk_worker.notes,
worker_id,
count(worker_id) answers,
count(episode_has_accepted_imdb_url) scored,
sum( case when isnull(imdb_url) and isnull(accepted_imdb_url) then 1
when imdb_url = accepted_imdb_url then 1
else 0 end ) correct,
100 * ( sum( case when isnull(imdb_url) and isnull(accepted_imdb_url) then 1
when imdb_url = accepted_imdb_url then 1
else 0 end)
/ count(episode_has_accepted_imdb_url) ) percentage
from
mturk_completion
inner join mturk_worker using (worker_id)
where
timestamp > '2015-02-01'
group by
worker_id
order by
percentage desc,
correct desc
You can actually eliminate the case statements. MySQL will interpret boolean expressions as integers in a numeric context (with 1 being true and 0 being false):
select mturk_worker.notes, worker_id, count(worker_id) answers,
count(episode_has_accepted_imdb_url) scored,
sum(imdb_url = accepted_imdb_url or imdb_url is null and accepted_idb_url is null) as correct,
(100 * sum(imdb_url = accepted_imdb_url or imdb_url is null and accepted_idb_url is null) / count(episode_has_accepted_imdb_url)
) as percentage
from mturk_completion inner join
mturk_worker
using (worker_id)
where timestamp > '2015-02-01'
group by worker_id
order by percentage desc, correct desc;
If you like, you can simplify it further by using the null-safe equals operator:
select mturk_worker.notes, worker_id, count(worker_id) answers,
count(episode_has_accepted_imdb_url) scored,
sum(imdb_url <=> accepted_imdb_url) as correct,
(100 * sum(imdb_url <=> accepted_imdb_url) / count(episode_has_accepted_imdb_url)
) as percentage
from mturk_completion inner join
mturk_worker
using (worker_id)
where timestamp > '2015-02-01'
group by worker_id
order by percentage desc, correct desc;
This isn't standard SQL, but it is perfectly fine in MySQL.
Otherwise, you would need to use a subquery, and there is additional overhead in MySQL associated with subqueries.
Help me with next SQL:
SELECT
date_format(from_unixtime(`ticket_logs`.`created`),'%Y-%m-%d') AS `datac`,
`ticket_logs`.`ticket_id` AS `ticket_id`,
ticket_logs.value_old,
ticket_logs.value_new,
max(`ticket_logs`.`action`) AS `ultima_act`
FROM
`ticket_logs`
WHERE
(
(`ticket_logs`.`action` = 6)
OR (`ticket_logs`.`action` = 16)
)
GROUP BY
date_format(
from_unixtime(`ticket_logs`.`created`),
'%Y-%m-%d'
),
`ticket_logs`.`ticket_id`
ORDER BY
`datac` DESC,
`ticket_logs`.`ticket_id` DESC
The problem is that "value_old" and "value_new", always take the first value per date and not the value corresponding with the max value of "action"
I don't see how this is a problem. That is how SQL works -- the order by takes place after the group by. In addition, MySQL is just confusing you, because you are using an extension to group by that you don't fully understand -- having extra columns in the select that are not in the group by. (See this.)
Fortunately, MySQL supports a hack to get what you want, without writing a much more complicated SQL statement. The expressions you want are:
substring_index(group_concat(ticket_logs.value_old order by `ticket_logs`.`action` desc), ',', 1)
substring_index(group_concat(ticket_logs.value_new order by `ticket_logs`.`action` desc), ',', 1)
Found another approach. I used the "created" column to obtain a max and joined:
SELECT
date_format(
from_unixtime(`ticket_logs`.`created`),
'%Y-%m-%d'
) AS `datax`,
ticket_logs.ticket_id,
ticket_logs.action,
ticket_logs.value_old,
ticket_logs.value_new
FROM
ticket_logs
INNER JOIN (
SELECT
date_format(
from_unixtime(`ticket_logs`.`created`),
'%Y-%m-%d'
) AS `datac`,
max(ticket_logs.created) AS maxts,
ticket_id
FROM
ticket_logs
WHERE
ticket_logs.action = 6
OR ticket_logs.action = 16
GROUP BY
date_format(
from_unixtime(`ticket_logs`.`created`),
'%Y-%m-%d'
),
ticket_id
) maxtbl ON ticket_logs.ticket_id = maxtbl.ticket_id
AND ticket_logs.created = maxtbl.maxts
ORDER BY
datax DESC,
ticket_id DESC
My current sql statement is like follows..
SELECT * FROM `Main`
WHERE username = '$username'
AND fromsite = '$website'
ORDER BY `votes`.`timestamp` DESC
LIMIT 0 , 1
however the timestamp column shows a timestamp like " "2014-03-19 12:00:43" ...Following another question I had an answer along the lines of using ...
SELECT UNIX_TIMESTAMP() - UNIX_TIMESTAMP(timestamp) AS seconds_ago
However not great with mysql and still don't see how I can still use the original select statement and work in a function like above, so that converts the timestamp column and shows as seconds ago instead.
Just ditch the * and list the columns and expressions you want returned.
SELECT m.username
, m.fromsite
, m.timestamp
, UNIX_TIMESTAMP()-UNIX_TIMESTAMP(m.timestamp) AS seconds_ago
, m.votes
FROM `Main` m
WHERE m.username = '$username'
AND m.fromsite = '$website'
ORDER BY m.votes, m.timestamp DESC
LIMIT 0,1