Optimizing a MySQL query with functions

I'm running this query (via PHP) and wondering if there is a faster way to get the same result:
SELECT
date(date_time) as `date`,
unix_timestamp(date(date_time)) as `timestamp`,
month(date(date_time)) as `month`,
dayname(date(date_time)) as `dayname`,
dayofmonth(date(date_time)) as `daynum`,
hour(date_time) as `hour`, minute(date_time) as `increment`
FROM loc_data
WHERE loc_id = 2
As you can see, I'm performing the date(date_time) function 5 times, but I would like to store the result of the first call and reuse it from then on. Would this increase the performance of the query? The query is called many thousands of times in a script. (When I perform the functions in PHP instead of MySQL, I see no big difference in speed from the query above.)

Have you tried SQL user variables?
select @a from (select @a := 1) a
That works. With your query:
SELECT
@n as `date`,
unix_timestamp(@n) as `timestamp`
FROM loc_data l,
(select @n := date(l.date_time)) a
WHERE l.loc_id = 2
But I'm not sure if this will work for you.
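The caching the question asks about, sketched client-side for comparison: compute date(date_time) once per row and derive every other column from that single value. A minimal Python sketch (field names mirror the aliases in the query; the datetime value is a made-up example):

```python
from datetime import datetime
import calendar

def derive_fields(date_time: datetime) -> dict:
    # Compute date(date_time) once, then reuse it for every derived column.
    d = date_time.date()
    midnight = datetime(d.year, d.month, d.day)
    return {
        "date": d.isoformat(),                      # date(date_time)
        "timestamp": int(midnight.timestamp()),     # unix_timestamp(date(...)), local time
        "month": d.month,                           # month(date(...))
        "dayname": calendar.day_name[d.weekday()],  # dayname(date(...))
        "daynum": d.day,                            # dayofmonth(date(...))
        "hour": date_time.hour,
        "increment": date_time.minute,
    }

row = derive_fields(datetime(2012, 3, 14, 15, 9))
```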

Related

How can I speed up this query with an aliased column?

So I found this code snippet here on SO. It essentially fakes a "row_number()" function for MySQL. It executes quite fast, which I like and need, but I am unable to tack on a where clause at the end.
select
@i := @i + 1 as iterator, t.*
from
big_table as t, (select @i := 0) as foo
Adding in where iterator = 875 yields an error.
The snippet above executes in about 0.0004 seconds. I know I can wrap it inside another query as a subquery, but then it becomes painfully slow.
select * from (
select
@i := @i + 1 as iterator, t.*
from
big_table as t, (select @i := 0) as foo) t
where iterator = 875
The snippet above takes over 10 seconds to execute.
Any way to speed this up?
In this case you could use LIMIT in place of the WHERE:
select
@i := @i + 1 as iterator, t.*
from
big_table as t, (select @i := 0) as foo
LIMIT 874, 1
Since you only want record 875, this is fast: the offset skips the first 874 rows and returns the next one, whose iterator value is 875.
Could you please try this? Incrementing the variable in the WHERE clause and comparing it against 875 does the trick.
SELECT
t.*
FROM
big_table AS t,
(SELECT @i := 0) AS foo
WHERE
(@i := @i + 1) = 875
LIMIT 1
Caution:
Unless you specify an ORDER BY clause, you are not guaranteed to get the same row every time for a given row number. MySQL doesn't ensure this, since the data in a table is an unordered set.
So if you order by some field, you don't need a user-defined variable to get that particular record:
SELECT
big_table.*
FROM big_table
ORDER BY big_table.<some_field>
LIMIT 874, 1;
You can significantly improve performance if some_field is indexed.
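Stripped of SQL, both LIMIT answers come down to numbering the rows in scan order and keeping a single offset. A hypothetical client-side equivalent in Python, where islice plays the role of LIMIT offset, 1:

```python
from itertools import islice

def nth_row(rows, n):
    """Return the n-th row (1-based) with its row number,
    like ORDER BY ... LIMIT n-1, 1 plus the @i counter."""
    numbered = ((i + 1, row) for i, row in enumerate(rows))  # @i := @i + 1
    return next(islice(numbered, n - 1, n), None)            # skip n-1 rows, take 1

print(nth_row(["a", "b", "c", "d", "e"], 3))  # → (3, 'c')
```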

MySQL same query result in multiple subqueries

I am trying to optimize the following query.
SELECT t3.*,
(SELECT SUM(t4.col_sum)
FROM (...) t4
WHERE t4.timestamp BETWEEN CONCAT(SUBSTR(t3.timestamp, 1, 11), "00:00:00") AND t3.timestamp)
AS cum_sum
FROM (...) t3
where (...) stands for a long query. It returns two columns: timestamp and col_sum. I want to add a third column: the cumulative sum of col_sum.
The problem is that I am repeating the same big query in both (...) places.
Is there a way to compute the result once and reuse it in those two (or more) places?
One method is to use a temporary table.
Probably a more efficient method is to use variables to calculate the cumulative sum. It would be something like:
select t.*,
(@c := if(@t = left(t.timestamp, 11), @c + t.col_sum,
if(@t := left(t.timestamp, 11), 0, 0)
)
) as cum_sum
from (. . .) t cross join
(select @t := '', @c := 0) vars
order by t.timestamp;
The above query orders the rows by timestamp. The variable @t keeps track of the first 11 characters of the timestamp; as I read your logic, you want the cumulative sum only within a group where this value is constant.
The variable @c keeps track of the cumulative sum, resetting to zero when a new "first 11 characters" value is encountered. The logic looks a bit complicated, but it is best to put all variable assignments in a single expression, because MySQL does not guarantee the order in which expressions are evaluated.
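The variable juggling is easier to follow outside SQL. A rough Python sketch of the intent (rows assumed already ordered by timestamp; the current row's col_sum is included in the sum, matching the inclusive BETWEEN in the question):

```python
def cumulative_by_day(rows):
    """rows: ordered (timestamp_str, col_sum) pairs.
    Appends a cumulative sum that resets whenever the first
    11 characters of the timestamp (the day) change."""
    out, prev_day, running = [], None, 0
    for ts, col_sum in rows:
        day = ts[:11]            # left(timestamp, 11) -> 'YYYY-MM-DD '
        if day != prev_day:      # new day: reset, like reassigning @t
            running, prev_day = 0, day
        running += col_sum       # @c := @c + col_sum
        out.append((ts, col_sum, running))
    return out

totals = cumulative_by_day([
    ("2015-01-01 01:00:00", 5),
    ("2015-01-01 02:00:00", 3),
    ("2015-01-02 01:00:00", 4),
])
# running sums: 5, 8, then a reset to 4 on the new day
```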

How to do a lag operation in MySQL

I want to use the analytic function LAG in MySQL. Oracle supports it, but I can't find a way to do it in MySQL. Can anybody tell me how to perform a LAG operation in MySQL?
For example
UID Operation
1 Logged-in
2 View content
3 Review
I want to use a lag function so that my output would be as follows:
UID Operation Lagoperation
1 Logged-in
2 View content Logged-in
3 Review View content
Does MySQL support a LAG function?
You can emulate it with user variables:
select uid, operation, previous_operation from (
select
y.*
, @prev AS previous_Operation
, @prev := Operation
from
your_table y
, (select @prev := NULL) vars
order by uid
) subquery_alias
Here you initialize your variable(s). It's the same as writing SET @prev := NULL; before your query.
, (select @prev := NULL) vars
Then the order of these expressions in the SELECT clause is important:
, @prev AS previous_Operation
, @prev := Operation
The first just displays the variable's value; the second assigns the current row's value to the variable.
It's also important to have an ORDER BY clause, as the output is otherwise not deterministic.
All this is put into a subquery purely for aesthetic reasons: to filter out the
, @prev := Operation
column.
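What the variable trick computes, spelled out in plain Python: scan the ordered rows while carrying the previous row's value along. Column names mirror the example above:

```python
def with_lag(rows):
    """rows: (uid, operation) pairs, ordered by uid.
    Appends the previous row's operation, i.e. LAG(operation)."""
    out, prev = [], None                    # prev starts as @prev := NULL
    for uid, operation in rows:
        out.append((uid, operation, prev))  # read @prev before reassigning it
        prev = operation                    # @prev := Operation
    return out

print(with_lag([(1, "Logged-in"), (2, "View content"), (3, "Review")]))
# → [(1, 'Logged-in', None), (2, 'View content', 'Logged-in'), (3, 'Review', 'View content')]
```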

group_concat performance issue in MySQL

I added a group_concat to a query and it killed the performance. The explain plans are identical before and after adding it, so I'm confused about how to optimize this.
Here is a simplified version of the query:
SELECT @curRow := @curRow + 1 AS row_number,
docID,
docTypeID,
CASE WHEN COUNT(1) > 1
THEN group_concat( makeID )
-- THEN 'multiple makes found'
ELSE MIN(makeID)
END AS makeID,
MIN(`desc`) AS `desc`
FROM simplified_mysql_table,
(SELECT @curRow := 0) r
GROUP BY docID, docTypeID,
CASE WHEN docTypeID = 1
THEN 0
ELSE row_number
END;
Note the CASE statement in the SELECT. The group_concat kills the performance; if I comment out that line and just output 'multiple makes found' instead, the query executes very quickly. Any idea what is causing this?
In the original non-simplified version of this query we had a DISTINCT, which was completely unnecessary and causing the performance issue with group_concat. I'm not sure why it caused such a problem, but removing it fixed the performance issue.
In MySQL, group_concat alone should not kill query performance. It is additional work involving strings, so some slowdown is expected, but more like 10% than 10x. Can you quantify the difference in the query times?
Question: is makeID a character string or an integer? I wonder if a conversion from integer to string might affect the performance.
Second, what would the performance be with concat(min(makeID), '-', max(makeID)) instead of the group_concat?
Third, does the real group_concat use DISTINCT or ORDER BY? These can slow things down, especially in a memory-limited environment.
An idea is to split the query into two parts: one for docTypeID = 1 and one for the rest.
Try adding an index on (docTypeID, docID, makeID, desc) first:
SELECT
docID,
docTypeID,
CAST(makeID AS CHAR) AS makeID,
`desc`
FROM
simplified_mysql_table
WHERE
NOT docTypeID = 1
UNION ALL
SELECT
docID,
1,
GROUP_CONCAT( makeID ),
`desc` -- this is not standard SQL
FROM
simplified_mysql_table
WHERE
docTypeID = 1
GROUP BY
docTypeID,
docID ;
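For reference, the aggregation GROUP_CONCAT performs is just a per-group string join. A minimal Python sketch over hypothetical (docID, docTypeID, makeID) tuples:

```python
from collections import defaultdict

def group_concat_make_ids(rows):
    """Group (docID, docTypeID, makeID) rows by (docID, docTypeID)
    and comma-join the makeIDs, like GROUP_CONCAT(makeID)."""
    groups = defaultdict(list)
    for doc_id, doc_type_id, make_id in rows:
        groups[(doc_id, doc_type_id)].append(str(make_id))  # explicit int-to-string step
    return {key: ",".join(vals) for key, vals in groups.items()}

print(group_concat_make_ids([(1, 1, 10), (1, 1, 11), (2, 1, 12)]))
# → {(1, 1): '10,11', (2, 1): '12'}
```

The str(make_id) call is where the integer-to-string conversion asked about above happens explicitly.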

MySQL Running Total with COUNT

I'm aware of the SET @running_sum = 0; @running_sum := @running_sum + ... method; however, it does not seem to be working in my case.
My query:
SELECT DISTINCT(date), COUNT(*) AS count
FROM table1
WHERE date > '2011-09-29' AND applicationid = '123'
GROUP BY date ORDER BY date
The result gives me unique dates, with the count of occurrences of application 123.
I want to keep a running total of the count, to see the accumulated growth.
Right now I'm doing this in PHP, but I want to switch it all to MySQL.
Using the method from the first line of this post simply duplicates the count, instead of accumulating it.
What am I missing?
P.S. The set is very small, only about 100 entries.
Edit: you're right, ypercube. Here's the version with running_sum:
SET @running_sum = 0;
SELECT date, @running_sum := @running_sum + COUNT(*) AS total FROM table1
WHERE date > '2011-09-29' AND applicationid = '123'
GROUP BY date ORDER BY date
The total column ends up being the same as if I had just printed COUNT(*).
Updated Answer
The OP asked for a single-query approach, so as not to have to SET a user variable separately from using the variable to compute the running total:
SELECT d.date,
@running_sum := @running_sum + d.count AS running
FROM ( SELECT date, COUNT(*) AS `count`
FROM table1
WHERE date > '2011-09-29' AND applicationid = '123'
GROUP BY date
ORDER BY date ) d
JOIN (SELECT @running_sum := 0 AS dummy) dummy;
"Inline initialization" of user variables is useful for simulating other analytic functions, too. Indeed I learned this technique from answers like this one.
Original Answer
You need to introduce an enclosing query to tabulate the @running_sum over your COUNT(*)ed records:
SET @running_sum = 0;
SELECT d.date,
@running_sum := @running_sum + d.count AS running
FROM ( SELECT date, COUNT(*) AS `count`
FROM table1
WHERE date > '2011-09-29' AND applicationid = '123'
GROUP BY date
ORDER BY date ) d;
See also this answer.
SQL is notoriously poor at running totals. As your result set is in order, you are much better advised to append a calculated running total column on the client side. Nothing in SQL will be as performant as that.
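The client-side approach recommended here is a single pass over the per-date counts; in Python it is essentially itertools.accumulate (the dates and counts below are made up to match the question's shape):

```python
from itertools import accumulate

dates = ["2011-09-30", "2011-10-01", "2011-10-02"]
counts = [4, 2, 7]                   # COUNT(*) per date, as returned by the query

running = list(accumulate(counts))   # running total: 4, 6, 13
result = list(zip(dates, counts, running))
print(result)
# → [('2011-09-30', 4, 4), ('2011-10-01', 2, 6), ('2011-10-02', 7, 13)]
```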
The running total can also be calculated easily using the lib_mysqludf_ta UDF library:
https://github.com/mysqludf/lib_mysqludf_ta#readme