I have a select query like this:
select count(distinct id)*100/totalcount as freq, count(distinct id) from
<few joins, conditions, group by here> .....
Will this result in two calculations of the count under MySQL 5.0? I can calculate the frequency in my application instead if this is a problem. I am aware of the solutions presented in "Adding percentages to multiple counts in one SQL SELECT Query", but I want to avoid nested queries.
Yes, it will result in several evaluations.
The set of DISTINCT id values will be built separately for each function call.
Note that without DISTINCT, MySQL would read each record only once (though it would pass it to several function calls).
Since COUNT itself is very cheap, the extra function calls would then add almost nothing to the overall query time.
You can benefit from rewriting your query like this:
SELECT COUNT(id) * 100 / totalcount AS freq,
COUNT(id)
FROM (
SELECT DISTINCT id
FROM original_query
) q
BTW, why do you need both GROUP BY and DISTINCT in one query? Could you please post your original query as it is?
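A minimal sketch of the rewrite above, run against an in-memory SQLite database as a stand-in for MySQL (an assumption: SQLite's planner and integer division differ from MySQL's, so this only checks that both forms return the same numbers, not how fast they are). The table contents and the `totalcount` value are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for the asker's joined/filtered row source.
conn.execute("CREATE TABLE original_query (id INTEGER)")
conn.executemany(
    "INSERT INTO original_query VALUES (?)",
    [(1,), (1,), (2,), (3,), (3,)],
)

totalcount = 5  # hypothetical total, passed in as a parameter

# Original form: COUNT(DISTINCT id) appears twice.
direct = conn.execute(
    "SELECT COUNT(DISTINCT id) * 100 / ? AS freq, COUNT(DISTINCT id) "
    "FROM original_query",
    (totalcount,),
).fetchone()

# Rewritten form: deduplicate once in a derived table, then use plain COUNT.
derived = conn.execute(
    "SELECT COUNT(id) * 100 / ? AS freq, COUNT(id) "
    "FROM (SELECT DISTINCT id FROM original_query) q",
    (totalcount,),
).fetchone()

print(direct, derived)
```

Both queries report 3 distinct ids here. In MySQL the `/` operator returns a decimal rather than an integer, so the `freq` column would be formatted differently there.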
I am aware that the execution order of MySQL is not fixed. But, I heard it usually goes like this:
FROM, including JOINs
WHERE
GROUP BY
HAVING
SELECT
DISTINCT
ORDER BY
LIMIT and OFFSET
However, if I use a function like COUNT() (as in the code below), when is it executed? And how does MySQL decide what the function operates on (e.g. what COUNT() actually counts)? I am confused about the execution order and the inputs of functions like AVG(), SUM(), MAX(), etc. in MySQL.
SELECT productvendor, count(*)
FROM products
GROUP BY productvendor
HAVING count(*) >= 9;
Your sequence is not correct:
SELECT comes before GROUP BY
FROM, including JOINs
WHERE
SELECT the rows obtained by FROM and WHERE into a temporary area for the other
operations (and build the column aliases)
DISTINCT
GROUP BY
HAVING
ORDER BY
LIMIT and OFFSET
return the final result
COUNT and the other aggregate functions are computed on a temporary result that holds the selected columns; this step produces the result that HAVING then filters.
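To see what the query from the question actually returns, here is a small sketch using an in-memory SQLite database as a stand-in for MySQL (an assumption, as are the vendor names and row counts). It only shows the observable result: COUNT(*) is evaluated once per productvendor group, and HAVING then keeps or drops whole groups.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (productvendor TEXT)")
# Hypothetical data: 9 products from one vendor, 3 from another.
rows = [("Acme",)] * 9 + [("Bolt",)] * 3
conn.executemany("INSERT INTO products VALUES (?)", rows)

# Without HAVING: one output row per group, with that group's row count.
per_group = conn.execute(
    "SELECT productvendor, COUNT(*) FROM products "
    "GROUP BY productvendor ORDER BY productvendor"
).fetchall()

# With HAVING: the same per-group counts, but small groups are filtered out.
filtered = conn.execute(
    "SELECT productvendor, COUNT(*) FROM products "
    "GROUP BY productvendor HAVING COUNT(*) >= 9"
).fetchall()

print(per_group)
print(filtered)
```

So the "subject" of COUNT(*) is the set of rows in each group that survived FROM and WHERE, and HAVING sees the finished count for each group.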
I have a MySQL table with over 9 million records. They are call records; each record represents one individual call. The columns are as follows:
CUSTOMER
RAW_SECS
TERM_TRUNK
CALL_DATE
There are others but these are the ones I will be using.
I need to count the total number of calls for a certain week on a certain term trunk, sum up the number of seconds for those calls, and then count how many of those calls were shorter than 7 seconds. I always do this in two queries and combine the results, but I was wondering whether there is a way to do it in one. I'm new to MySQL, so I'm sure my syntax is horrific, but here is what I do...
Query 1:
SELECT CUSTOMER, SUM(RAW_SECS), COUNT(*)
FROM Mytable
WHERE TERM_TRUNK IN ('Mytrunk1', 'Mytrunk2')
GROUP BY CUSTOMER;
Query 2:
SELECT CUSTOMER, COUNT(*)
FROM Mytable
WHERE TERM_TRUNK IN ('Mytrunk1', 'Mytrunk2') AND RAW_SECS < 7
GROUP BY CUSTOMER;
Is there any way to combine these two queries into one? Or maybe just a better way of doing it? I appreciate all the help!
There are 2 ways of achieving the expected outcome in a single query:
Conditional counting: use a CASE expression or the IF() function within COUNT() (or SUM()) to count only specific records.
Self join: left join the table to itself on the table's id field and, in the join condition, restrict the right-hand alias to calls shorter than 7 seconds.
The advantage of the 2nd approach is that you may be able to use indexes to speed it up, while the condition inside a conditional count cannot use an index. The self join looks like this:
SELECT m1.CUSTOMER, SUM(m1.RAW_SECS), COUNT(m1.CUSTOMER), COUNT(m2.CUSTOMER)
FROM Mytable m1
LEFT JOIN Mytable m2 ON m1.id = m2.id AND m2.RAW_SECS < 7
WHERE m1.TERM_TRUNK IN ('Mytrunk1', 'Mytrunk2')
GROUP BY m1.CUSTOMER;
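The first approach (conditional counting) is not spelled out above, so here is a minimal sketch of it next to the self join. It uses an in-memory SQLite database as a stand-in for MySQL, and it assumes, as the answer does, that the table has a unique `id` column; the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Mytable ("
    "id INTEGER PRIMARY KEY, CUSTOMER TEXT, RAW_SECS INTEGER, TERM_TRUNK TEXT)"
)
conn.executemany(
    "INSERT INTO Mytable VALUES (?, ?, ?, ?)",
    [
        (1, "alice", 5, "Mytrunk1"),
        (2, "alice", 30, "Mytrunk2"),
        (3, "alice", 3, "Other"),      # excluded by the TERM_TRUNK filter
        (4, "bob", 6, "Mytrunk1"),
        (5, "bob", 2, "Mytrunk2"),
        (6, "bob", 100, "Mytrunk1"),
    ],
)

# Approach 1: conditional counting. The CASE yields 1 only for short calls,
# so SUM(...) counts them while COUNT(*) still counts every call.
conditional = conn.execute(
    "SELECT CUSTOMER, SUM(RAW_SECS), COUNT(*), "
    "       SUM(CASE WHEN RAW_SECS < 7 THEN 1 ELSE 0 END) "
    "FROM Mytable "
    "WHERE TERM_TRUNK IN ('Mytrunk1', 'Mytrunk2') "
    "GROUP BY CUSTOMER ORDER BY CUSTOMER"
).fetchall()

# Approach 2: self join. m2 is NULL for calls of 7 seconds or more,
# and COUNT(m2.CUSTOMER) skips NULLs.
self_join = conn.execute(
    "SELECT m1.CUSTOMER, SUM(m1.RAW_SECS), COUNT(m1.CUSTOMER), COUNT(m2.CUSTOMER) "
    "FROM Mytable m1 "
    "LEFT JOIN Mytable m2 ON m1.id = m2.id AND m2.RAW_SECS < 7 "
    "WHERE m1.TERM_TRUNK IN ('Mytrunk1', 'Mytrunk2') "
    "GROUP BY m1.CUSTOMER ORDER BY m1.CUSTOMER"
).fetchall()

print(conditional)
print(self_join)
```

Both queries return the same rows on this data. In MySQL the conditional count can also be written as `SUM(RAW_SECS < 7)` or `COUNT(IF(RAW_SECS < 7, 1, NULL))`.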
I have a SQL query (MySQL) that I would like to make faster. The general problem is to count distinct keys that satisfy an aggregated condition. That is, I want to sum the values of a column over the rows with the same key value and then decide whether that key should be included in the count. The only solution I have come up with is a sub-query that does the summing, with the outer query counting distinct keys and applying the condition in HAVING. Like:
SELECT COUNT(DISTINCT key), sum1, sum2, categoryid
FROM
(
SELECT SUM(cnt1) AS sum1,
SUM(cnt2) AS sum2,
key,categoryid
FROM table
GROUP BY key,categoryid
) as SUBQUERY
GROUP BY categoryid
HAVING (8*sum1)/sum2 > 0;
The problem (as I see it) is that the query uses a sub-query that will produce a temp table. As the data set is large (10M rows, 500K distinct keys), it takes a lot of time. It looks like it should be possible to do better: a straight distinct count without the condition takes just a tenth of the time of this query, and summing without grouping takes only a fraction of that.
Does anyone have ideas on how to improve performance?
Thanks in advance!
Lasse
I was actually able to cut the response time myself by moving the count distinct to the inner query. I don't know why I didn't see that earlier; it obviously makes the temp table smaller. However, it is still a factor of 4-5 slower than the distinct count without a condition.
The new select looks like:
SELECT dist_cnt, sum1, sum2, categoryid
FROM
(
SELECT COUNT(DISTINCT key) AS dist_cnt,
SUM(cnt1) AS sum1,
SUM(cnt2) AS sum2,
key,categoryid
FROM table
GROUP BY key,categoryid
) as SUBQUERY
WHERE (8*sum1)/sum2 > 0
GROUP BY categoryid
Anyway, I think it should be possible to make it at least a factor of 2 faster.
Lasse
I am trying to optimize queries to my database. I have the following query:
select date, (
select count(user_id)
from myTable
where logdate = date
) as value
from myTable;
As far as I can see, the second value is computed efficiently. However, is there any common practice to optimize this kind of query in MySQL?
I believe you can avoid the subquery and perform the same query using aggregation, which may run faster:
SELECT date, COUNT(user_id) AS numRecords
FROM myTable
GROUP BY date;
Here is a reference on aggregate functions.
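As a rough check that the GROUP BY form carries the same information as the correlated subquery, here is a sketch on an in-memory SQLite database (a stand-in for MySQL; the sample rows are invented). It assumes the subquery is meant to compare the same date column of the inner and outer rows, so table aliases `i` and `t` are added to make that explicit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myTable (date TEXT, user_id INTEGER)")
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?)",
    [("2020-01-01", 1), ("2020-01-01", 2), ("2020-01-02", 3)],
)

# Correlated subquery: the inner count is evaluated for every outer row,
# and the result has one row per table row (so dates repeat).
correlated = conn.execute(
    "SELECT t.date, "
    "       (SELECT COUNT(i.user_id) FROM myTable i WHERE i.date = t.date) AS value "
    "FROM myTable t ORDER BY t.date"
).fetchall()

# Aggregation: a single pass, one row per date.
grouped = conn.execute(
    "SELECT date, COUNT(user_id) AS numRecords "
    "FROM myTable GROUP BY date ORDER BY date"
).fetchall()

print(correlated)
print(grouped)
```

The grouped result equals the correlated result with duplicate rows removed, which is the only difference between the two outputs.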
You do not have to put group functions in a separate select. Just do
select date, count(user_id) from myTable group by date;
There is no hard and fast rule. In this query, it was a matter of one select being more efficient than two. But here are some tips for beginners on optimizing queries.
select id, first_name, count(*) from users;
The users table contains 10 entries, but the above select query shows only a single row. Is it wrong to mix count(*) as part of the above query?
COUNT is a function that aggregates. You can't mix it into your normal query.
If you want to receive the ten entries just do a normal select:
SELECT id, first_name FROM users;
and to get the number of entries:
SELECT COUNT(id) FROM users;
It's because you are using an aggregate function in the select part of the query.
To return the 10 records you just need id and first_name in the query.
E.g.:
SELECT id, first_name
FROM users
If you wanted to get a count of the records in the table, then you could use
SELECT COUNT(id)
FROM users
It's not "wrong", but it is meaningless without a "group by" clause - most databases will reject that query, as aggregate functions should include a group by if you're including other columns.
Not sure exactly what you're trying to achieve with this?
select id, first_name,(select count(*) from users) AS usercount from users;
will give each individual user and the total count but again, not sure why you would want it.
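A small sketch of the difference, using an in-memory SQLite database as a stand-in (an assumption: SQLite, like MySQL with ONLY_FULL_GROUP_BY disabled, accepts the mixed query instead of rejecting it). The ten user names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, first_name TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [(i, f"user{i}") for i in range(1, 11)],  # 10 hypothetical users
)

# Mixing plain columns with COUNT(*) and no GROUP BY: the whole table is one
# group, so only one row comes back; id and first_name come from some
# arbitrary row of that group.
mixed = conn.execute("SELECT id, first_name, COUNT(*) FROM users").fetchall()

# Scalar subquery: every user row is kept and the total is repeated on each.
with_total = conn.execute(
    "SELECT id, first_name, (SELECT COUNT(*) FROM users) AS usercount "
    "FROM users ORDER BY id"
).fetchall()

print(len(mixed), mixed[0][2])
print(len(with_total), with_total[0])
```

The mixed query collapses to a single row whose count is 10, while the subquery version returns all 10 users.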
select id, first_name, t.total from users, (select count(*) as total from users) as t;
COUNT is an aggregate function, and it will always give you the count of all records in the table unless it is used in combination with GROUP BY.
If you combine it with ordinary columns in a query, the aggregate determines the shape of the final output, which is why your query returns a single row.
If you want to return all 10 records, you should just write:
select id,first_name from users
If you need the number of rows in a table, you can use MySQL's SQL_CALC_FOUND_ROWS modifier together with FOUND_ROWS(). Check the MySQL docs to see how it's used; note that both are deprecated as of MySQL 8.0.17.