how to do lag operation in mysql


I want to use the analytic function LAG in MySQL. Oracle supports it, but I can't get it to work in MySQL. Can anybody show me how to perform a lag operation in MySQL?
For example, given this table:
UID  Operation
1    Logged-in
2    View content
3    Review
I want to use a lag function so that my output looks like this:
UID  Operation     Lagoperation
1    Logged-in
2    View content  Logged-in
3    Review        View content
Does MySQL support a lag function?

MySQL 8.0 and later support LAG() as a window function. On earlier versions you can emulate it with user variables:
select uid, operation, previous_operation from (
    select
        y.*
        , @prev AS previous_operation
        , @prev := Operation
    from
        your_table y
        , (select @prev := NULL) vars
    order by uid
) subquery_alias
See it working live in an SQL Fiddle.
This line initializes the variable. It has the same effect as running SET @prev := NULL; before the query:
, (select @prev := NULL) vars
The order of these two expressions in the select clause matters:
, @prev AS previous_Operation
, @prev := Operation
The first displays the variable's current value, which still holds the previous row's Operation; the second then assigns the current row's Operation to the variable.
The ORDER BY clause is also important, because without it the row order, and therefore the output, is not deterministic.
The whole thing is wrapped in a subquery purely for aesthetic reasons: the outer query filters out the helper column
, @prev := Operation
from the result.
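The read-then-assign order described above can be modelled outside the database. The following is only a sketch in Python, not MySQL code: the function name emulate_lag and the sample rows are made up for illustration, and prev plays the role of the user variable (written @prev in MySQL).

```python
def emulate_lag(rows, column):
    """Model of the user-variable trick: read the variable, then overwrite it."""
    prev = None                          # (select @prev := NULL) vars
    result = []
    for row in rows:                     # rows are assumed to be sorted already (ORDER BY uid)
        lagged = dict(row, previous_operation=prev)  # @prev AS previous_operation
        prev = row[column]               # @prev := Operation
        result.append(lagged)
    return result


# Hypothetical sample data mirroring the table in the question.
rows = [
    {"uid": 1, "operation": "Logged-in"},
    {"uid": 2, "operation": "View content"},
    {"uid": 3, "operation": "Review"},
]
lagged_rows = emulate_lag(rows, "operation")
for r in lagged_rows:
    print(r["uid"], r["operation"], r["previous_operation"])
```

If the two marked lines inside the loop were swapped, every row would see its own value instead of the previous one, which is why the order of the two select expressions matters.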

Related

How to: Sort users by a value and their new position in the table

I'm trying to sort all the users in the database by a value and give each of them a new ID that will act as their position.
I've tried ordering them by money in descending order (money is the value I want to sort by), but I wasn't able to update each user's ID in that order.
Please note that the database has over 5000 entries, so I need a way that won't bog down the database.

Unfortunately, MySQL before version 8.0 doesn't have a nice easy RANK() function. You can use a workaround like the one described here, though: Rank function in MySQL
SELECT t1.*, @curRank := @curRank + 1 AS `rank`
FROM table_name t1, (
    SELECT @curRank := 0
) t2
ORDER BY money DESC
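What the counter computes can be sketched in Python. This is an illustration only: assign_positions and the sample users are hypothetical, and cur_rank stands in for the user variable (written @curRank in MySQL). The sketch sorts by money in descending order, as the question asks.

```python
def assign_positions(users):
    """Order by money descending and number the rows 1, 2, 3, ..."""
    cur_rank = 0                                   # SELECT @curRank := 0
    ranked = []
    # sorted() is stable, so users with equal money keep their input order.
    for user in sorted(users, key=lambda u: u["money"], reverse=True):
        cur_rank += 1                              # @curRank := @curRank + 1
        ranked.append({**user, "position": cur_rank})
    return ranked


# Hypothetical sample data.
users = [
    {"name": "ann", "money": 50},
    {"name": "bob", "money": 900},
    {"name": "cid", "money": 50},
    {"name": "dee", "money": 300},
]
ranked = assign_positions(users)
for u in ranked:
    print(u["position"], u["name"], u["money"])
```

Note that a plain counter gives tied users different positions (ann and cid both have 50 here), so it behaves like a row number rather than a rank with ties.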

MySQL same query result in multiple subqueries

I am trying to optimize the following query.
SELECT t3.*,
    (SELECT SUM(t4.col_sum)
     FROM (...) t4
     WHERE t4.timestamp BETWEEN CONCAT(SUBSTR(t3.timestamp, 1, 11), "00:00:00") AND t3.timestamp)
    AS cum_sum
FROM (...) t3
Here (...) stands for a long query that returns two columns: timestamp and col_sum. I want to add a third column, which is the cumulative sum of col_sum.
The problem is that I am putting the same big query in two places (...).
Is there a way to obtain the result once and reuse it in those two (or more) places?

One method is to use a temporary table.
A probably more efficient method is to use variables to calculate the cumulative sum. It would be something like:
select t.*,
       (@c := if(@t = left(t.timestamp, 11), @c + t.col_sum,
                 if(@t := left(t.timestamp, 11), t.col_sum, t.col_sum)
                )
       ) as cumesum
from (. . .) t cross join
     (select @t := '', @c := 0) vars
order by t.timestamp;
The query processes the rows in timestamp order. The variable @t keeps track of the first 11 characters of the timestamp; as I read your logic, you want the cumulative sum only within a group where this prefix is constant.
The variable @c keeps track of the cumulative sum and restarts from the current row's col_sum when a new "first 11 characters" value is encountered, so each row's total includes its own col_sum, like the inclusive BETWEEN in your query. The logic looks a bit complicated, but it is best to put all variable assignments in a single expression, because MySQL does not guarantee the order of evaluation of expressions.
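The grouping logic is easier to follow as a loop. The sketch below is Python, not MySQL; cumulative_by_prefix and the sample rows are hypothetical. It assumes each group's running total should start with the row's own col_sum, which matches the inclusive BETWEEN in the question.

```python
def cumulative_by_prefix(rows, prefix_len=11):
    """Running total of col_sum that restarts when the timestamp prefix changes."""
    current_prefix = ""                   # plays the role of @t
    running = 0                           # plays the role of @c
    out = []
    for ts, col_sum in sorted(rows):      # ORDER BY t.timestamp
        prefix = ts[:prefix_len]          # left(t.timestamp, 11)
        if prefix == current_prefix:
            running += col_sum            # same day: keep accumulating
        else:
            current_prefix = prefix       # new day: remember it and restart
            running = col_sum
        out.append((ts, col_sum, running))
    return out


# Hypothetical sample rows: (timestamp, col_sum)
rows = [
    ("2024-01-01 09:00:00", 5),
    ("2024-01-01 10:30:00", 7),
    ("2024-01-02 08:15:00", 2),
    ("2024-01-02 23:59:00", 4),
]
totals = cumulative_by_prefix(rows)
for ts, col_sum, running in totals:
    print(ts, col_sum, running)
```

The comparison and the update of the prefix happen in one branch of the loop, which is the Python counterpart of keeping all variable assignments inside a single SQL expression.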

Numbering rows in groups with MySQL: how does it work?

There are several good posts on how to number rows within groups with MySQL, but how does the code actually work? I'm unclear on what MySQL evaluates first in the code below.
For instance, placing @yearqt := yearqt as bloc before the IF() call produces different results, and I'm unclear on the role of the s1 subquery in initializing the @ variables: when are they updated as MySQL runs through the data rows? Is the ORDER BY clause run before the SELECT?
The code below selects three random records per yearqt group. There may be other ways to do this, but the question pertains to how the code works, not how I could do this differently or whether I can do this more efficiently. Thank you.
select * from (
    select customer_id , yearqt , a ,
        IF(@yearqt = yearqt , @rownum := @rownum + 1 , @rownum := 1) as rownum ,
        @yearqt := yearqt as bloc
    from
        ( select customer_id , yearqt , rand(123) as a from tbl
          order by rand(123)
        ) a join ( select @rownum := 0 , @yearqt := '' ) s1
    order by yearqt
) s2
where rownum <= 3
order by bloc

This question is related to how the engine produces the result of a SQL SELECT query. The order is roughly the following:
1. Calculate the execution plan
2. Build the row sets and join them following the plan's directives (FROM / JOIN phase)
3. Apply the WHERE clause
4. Apply the GROUP BY / HAVING clauses
5. Apply the ORDER BY clause
6. Projection phase: every row returned is ordered and can now be 'displayed'.
So, with respect to the variables, you can now see why there is a subquery to initialize them. This subquery is evaluated only once, at the beginning of the process.
After that, the projection phase seems to evaluate the selected expressions in the order you wrote them, which is why moving @yearqt := yearqt as bloc up one position changes the outcome of the IF() expression. Since each row is projected once, any work you do on the variables is done as many times as there are rows in the final result set.

The purpose of this
join ( select @rownum := 0 , @yearqt := '' ) s1
is to initialize the user-defined variables at the beginning of statement execution. Because this is a row source for the outer query (MySQL calls it a derived table), it is executed BEFORE the outer query runs. We aren't really interested in what this query returns, except that it returns exactly one row, because of the JOIN operation.
So this inline view s1 could be omitted from the query and replaced by a couple of SET statements that are executed immediately before the query:
SET @rownum := 0;
SET @yearqt := '';
But then we'd have three separate statements to run, and we'd get different output from the query if the SET statements weren't run and the variables held some other value. By including the initialization in the query itself, it's a single statement, and we remove the dependency on separate SET statements.
This is the query that's really doing the work, whittled down to just the two expressions that matter in this case:
SELECT IF(@yearqt = t.yearqt , @rownum := @rownum + 1 , @rownum := 1) as rownum
     , @yearqt := t.yearqt as bloc
FROM ( ... ) t
ORDER BY t.yearqt
Some key points that make this "work":
MySQL processes the expressions in the SELECT list in the order in which they appear.
MySQL processes the rows in the order specified in the ORDER BY.
The references to user-defined variables are evaluated for each row, not once at the beginning of the statement.
Note that the MySQL Reference Manual points out that this behavior is not guaranteed. (So, it may change in a future release.)
So, the processing for each row can be described as follows.
For the first expression:
    compare the value of the yearqt column from the current row with the current value of the @yearqt user-defined variable
    set the value of the @rownum user-defined variable
    return the result of the IF() expression in the resultset
For the second expression:
    set the value of the @yearqt user-defined variable to the value of the yearqt column from the current row
    return the value of the yearqt column in the resultset
The net effect is that for each row processed, we're comparing the value in the yearqt column to the value from the "previously" processed row, and we're saving the current value to compare to the next row.
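The per-row processing described above, and the effect of moving the assignment in front of the IF(), can be modelled with a short loop. This is a Python sketch of the evaluation order only, not of MySQL itself; number_rows, its assign_first flag, and the sample values are made up for illustration.

```python
def number_rows(yearqts, assign_first=False):
    """Number rows within each yearqt group the way the variable trick does."""
    rownum, yearqt_var = 0, ""            # join ( select @rownum := 0 , @yearqt := '' ) s1
    numbers = []
    for yearqt in sorted(yearqts):        # ORDER BY yearqt
        if assign_first:
            # Models placing "@yearqt := yearqt as bloc" before the IF() call.
            yearqt_var = yearqt
        if yearqt_var == yearqt:          # IF(@yearqt = yearqt, ...
            rownum += 1                   #    @rownum := @rownum + 1,
        else:
            rownum = 1                    #    @rownum := 1)
        yearqt_var = yearqt               # @yearqt := yearqt as bloc
        numbers.append(rownum)
    return numbers


# Hypothetical sample values for the yearqt column.
data = ["2019Q1", "2019Q1", "2019Q2", "2019Q2", "2019Q2"]
print(number_rows(data))                     # [1, 2, 1, 2, 3]
print(number_rows(data, assign_first=True))  # [1, 2, 3, 4, 5]
```

With the assignment first, the comparison always sees the current row's own value, so the counter never resets. That is the different result the question mentions.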

Getting the top N results of every category in a table

I'd like to extract the top 10 results of a certain category within a table, ordered by date. My table looks like
CREATE TABLE IF NOT EXISTS Table
( name VARCHAR(50)
, category VARCHAR(50)
, date TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
So far I've come up with SELECT category FROM Table GROUP BY category;, this will give me every category in store.
Next I need to run SELECT * FROM Table WHERE category=$categ ORDER BY date DESC LIMIT 10; in some kind of foreach loop for every $categ fed to me by the first instruction.
I'd like to do all of this in MySQL, if possible; I've come across several answers online but they all seem to involve two or more tables, or provide difficult examples that seem hard to understand... It would seem silly to me that something that can be dealt with so simply in server code (doesn't even create that much overhead, apart from the needless storage of the category names) is so difficult to translate into SQL code, but if nothing works that's what I'll end up doing, I guess.

You can use an inline view and user-defined variables to set a "row number" column, and then the outer query can filter based on the "row number" column. (Doing this, we can emulate a ROW_NUMBER analytic function.)
For large sets, this may not be the most efficient approach, but it works reasonably for small sets.
The outer query would look something like this:
SELECT q.*
  FROM (
         <view_query>
       ) q
 WHERE q.row_num <= 10
 ORDER BY q.category, q.date DESC, q.name
The view query would be something like this:
SELECT IF(@cat = t.category, @i := @i + 1, @i := 1) AS row_num
     , @cat := t.category AS category
     , t.date
     , t.name
  FROM mytable t
 CROSS JOIN ( SELECT @i := 0, @cat := NULL ) i
 ORDER BY t.category, t.date DESC
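The combination of the view query and the outer filter can be sketched as a loop. This is Python, given only to show the logic; top_n_per_category and the sample rows are hypothetical, and dates are assumed to be ISO-formatted strings so that they sort chronologically.

```python
def top_n_per_category(rows, n):
    """Keep the n most recent rows of each category, numbering them 1..n."""
    # ORDER BY category, date DESC: sort by date descending first,
    # then rely on the stable sort to group by category.
    ordered = sorted(rows, key=lambda r: r["date"], reverse=True)
    ordered.sort(key=lambda r: r["category"])

    i, cat = 0, None                      # CROSS JOIN ( SELECT @i := 0, @cat := NULL ) i
    kept = []
    for r in ordered:
        i = i + 1 if r["category"] == cat else 1   # IF(@cat = t.category, @i + 1, 1)
        cat = r["category"]                        # @cat := t.category
        if i <= n:                                 # WHERE q.row_num <= n
            kept.append({**r, "row_num": i})
    return kept


# Hypothetical sample data.
rows = [
    {"name": "a", "category": "fruit", "date": "2024-01-01"},
    {"name": "b", "category": "fruit", "date": "2024-01-03"},
    {"name": "c", "category": "fruit", "date": "2024-01-02"},
    {"name": "d", "category": "tools", "date": "2024-01-05"},
]
top_two = top_n_per_category(rows, 2)
for r in top_two:
    print(r["category"], r["row_num"], r["name"], r["date"])
```

The row counter restarts at 1 whenever the category changes, so filtering on row_num keeps at most n rows per category in a single pass.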

Optimizing a mySQL query with functions

I'm running this query (via PHP) and wondering if there is a faster way to get the same result:
SELECT
date(date_time) as `date`,
unix_timestamp(date(date_time)) as `timestamp`,
month(date(date_time)) as `month`,
dayname(date(date_time)) as `dayname`,
dayofmonth(date(date_time)) as `daynum`,
hour(date_time) as `hour`, minute(date_time) as `increment`
FROM loc_data
WHERE loc_id = 2
As you can see I'm performing the date(date_time) function 5 times but would like to store the result of the first and use that result from then on. Would this increase performance of the query? The query is called many thousands of times in a script. (When I perform the functions in PHP instead of mySQL I get no big difference in speed from the current query above.)

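Have you tried SQL user variables?
select @a from (select @a := 1) a
That works. Applied to your query, the idea would be to compute date(date_time) once per row and reuse it:
SELECT
    @n := date(date_time) as `date`,
    unix_timestamp(@n) as `timestamp`
FROM loc_data
WHERE loc_id = 2
But I'm not sure if this will work for you: it relies on MySQL evaluating the select list from left to right, which is not guaranteed.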
