I want to carry the value of a variable down from row 1 to row 2, use it in a calculation in row 2, and then carry the result to row 3 in my query. The process repeats for thousands of rows. RETAIN does this in SAS; how do I do it in MySQL?
In SQL, you would use subqueries for calculations that span observations, correlating values between the nested query and the main query. Since typical uses of the RETAIN statement in SAS involve running totals, counts of value occurrences, and indicators set within a BY-group, correlated subqueries can replicate such functionality.
The example below demonstrates running aggregates across grouped observations.
Example table
id  group            name    amount
--  ---------------  ------  ------
1   Object-oriented  Java    100
2   Object-oriented  C#      50
3   Object-oriented  Python  75
4   Object-oriented  PHP     65
5   Special Purpose  SQL     80
6   Special Purpose  XSLT    60
7   Statistical      R       85
8   Statistical      SAS     100
Query with two subqueries for running counts and running sums:
SELECT t1.id, t1.group, t1.name, t1.amount,
(SELECT Count(*) FROM maintable As t2
WHERE t1.group = t2.group AND t1.id >= t2.id) As RunningCount,
(SELECT Sum(t3.amount) FROM maintable As t3
WHERE t1.group = t3.group AND t1.id >= t3.id) As RunningAmount
FROM maintable As t1
Output
id  group            name    amount  RunningCount  RunningAmount
--  ---------------  ------  ------  ------------  -------------
1   Object-oriented  Java    100     1             100
2   Object-oriented  C#      50      2             150
3   Object-oriented  Python  75      3             225
4   Object-oriented  PHP     65      4             290
5   Special Purpose  SQL     80      1             80
6   Special Purpose  XSLT    60      2             140
7   Statistical      R       85      1             85
8   Statistical      SAS     100     2             185
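As a quick check of the query above, here is a minimal sketch that loads the example table into an in-memory SQLite database from Python and runs the same correlated subqueries. SQLite only stands in for MySQL for convenience. The table name maintable comes from the answer; the column name grp is an assumption, chosen to sidestep the reserved word group.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE maintable (id INTEGER, grp TEXT, name TEXT, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO maintable VALUES (?, ?, ?, ?)",
    [
        (1, "Object-oriented", "Java", 100),
        (2, "Object-oriented", "C#", 50),
        (3, "Object-oriented", "Python", 75),
        (4, "Object-oriented", "PHP", 65),
        (5, "Special Purpose", "SQL", 80),
        (6, "Special Purpose", "XSLT", 60),
        (7, "Statistical", "R", 85),
        (8, "Statistical", "SAS", 100),
    ],
)

# Each subquery looks back over rows of the same group with an id up to the current one.
running = conn.execute(
    """
    SELECT t1.id, t1.grp, t1.name, t1.amount,
           (SELECT COUNT(*) FROM maintable AS t2
             WHERE t1.grp = t2.grp AND t1.id >= t2.id) AS RunningCount,
           (SELECT SUM(t3.amount) FROM maintable AS t3
             WHERE t1.grp = t3.grp AND t1.id >= t3.id) AS RunningAmount
      FROM maintable AS t1
     ORDER BY t1.id
    """
).fetchall()

for row in running:
    print(row)
```

The running count and running sum restart whenever the group changes, which is the BY-group behaviour that RETAIN is typically used for.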
In MySQL, you would do this using variables. Here is an example to calculate the row number:
-- @rn is a MySQL user variable; the params subquery initialises it to 0
select t.*, (@rn := @rn + 1) as rn
from table_name t cross join
(select @rn := 0) params
order by col;
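On MySQL 8.0 or later, the ROW_NUMBER() window function gives the same numbering without user variables. Below is a minimal sketch run against SQLite from Python, assuming a SQLite build with window function support (3.25 or newer). The table some_table and its sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# some_table and its contents are made up for illustration.
conn.execute("CREATE TABLE some_table (col TEXT)")
conn.executemany("INSERT INTO some_table VALUES (?)", [("b",), ("c",), ("a",)])

# ROW_NUMBER() numbers the rows in the order given inside OVER (...).
numbered = conn.execute(
    """
    SELECT col, ROW_NUMBER() OVER (ORDER BY col) AS rn
      FROM some_table
     ORDER BY col
    """
).fetchall()

print(numbered)
```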
I have a SQL table (temp2) like this:
I want to calculate the balance*rate/sum(balance) for each cat
So, my desired output would be something like this:
To get this output, I used following code:
DROP TABLE IF EXISTS temp3;
create table temp3 as select cat, balance * rate /sum(balance) as prod from temp2
group by cat
select temp2.emp_id, temp2.cat,temp2.balance, temp2.rate , temp3.prod from temp2
left outer join temp3 on temp2.cat=temp3.cat
So here I have created a new table to get the answer.
Will there be an easier way to get the same results?
There's no need for the new table unless you need to refer to it in multiple queries. You can just join with a subquery.
SELECT t2.emp_id, t2.cat, t2.balance, t2.rate, t3.prod
FROM temp2 AS t2
JOIN (
SELECT cat, balance * rate /sum(balance) AS prod
FROM temp2
GROUP BY cat
) AS t3 ON t2.cat = t3.cat
There's no need to use LEFT JOIN. Since the subquery gets cat from the same table, there can't be any non-matching rows.
Sometimes it's useful to create the new table so you can add an index for performance reasons.
You actually don't need a join or subquery at all thanks to window functions:
SELECT emp_id, cat, balance, rate,
balance * rate / sum(balance) OVER (PARTITION BY cat) AS prod
FROM temp2
ORDER BY emp_id;
gives
emp_id cat balance rate prod
------ --- ------- ---- ------------------
1 1 1000.0 0.25 0.0625
2 3 1250.0 0.25 0.0568181818181818
3 2 1500.0 0.25 0.0681818181818182
4 1 1000.0 0.25 0.0625
5 2 1250.0 0.25 0.0568181818181818
6 3 1500.0 0.25 0.0681818181818182
100 1 1000.0 0.25 0.0625
101 3 1250.0 0.25 0.0568181818181818
102 2 1500.0 0.25 0.0681818181818182
103 1 1000.0 0.25 0.0625
104 2 1250.0 0.25 0.0568181818181818
105 3 1500.0 0.25 0.0681818181818182
(Create an index on temp2.cat for best performance).
This is also more accurate. Both your query and Barmar's use balance and rate in the grouped query without including them in the GROUP BY clause. That's an error in most databases, but SQLite will pick an arbitrary row from the group to supply those values, which throws off the final calculation whenever rows in the group differ. To do it properly with grouping (if, for example, you're using an old database version that doesn't support window functions), you need something like
SELECT t2.emp_id, t2.cat, t2.balance, t2.balance * t2.rate / t3.sumbalance AS prod
FROM temp2 AS t2
JOIN (SELECT cat, sum(balance) AS sumbalance
FROM temp2
GROUP BY cat) AS t3
ON t2.cat = t3.cat
ORDER BY t2.emp_id;
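To see that the window-function form and the join against a grouped subquery agree, here is a small sketch that runs both against an in-memory SQLite database from Python. The rows are the ones shown in the output table above; using SQLite (3.25 or newer for the window function) is an assumption made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE temp2 (emp_id INTEGER, cat INTEGER, balance REAL, rate REAL)"
)
conn.executemany(
    "INSERT INTO temp2 VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1000.0, 0.25), (2, 3, 1250.0, 0.25), (3, 2, 1500.0, 0.25),
        (4, 1, 1000.0, 0.25), (5, 2, 1250.0, 0.25), (6, 3, 1500.0, 0.25),
        (100, 1, 1000.0, 0.25), (101, 3, 1250.0, 0.25), (102, 2, 1500.0, 0.25),
        (103, 1, 1000.0, 0.25), (104, 2, 1250.0, 0.25), (105, 3, 1500.0, 0.25),
    ],
)

# Window version: the per-category sum is attached to every row.
window_rows = conn.execute(
    """
    SELECT emp_id, balance * rate / SUM(balance) OVER (PARTITION BY cat) AS prod
      FROM temp2
     ORDER BY emp_id
    """
).fetchall()

# Join version: only the aggregate is grouped; balance and rate come from t2.
join_rows = conn.execute(
    """
    SELECT t2.emp_id, t2.balance * t2.rate / t3.sumbalance AS prod
      FROM temp2 AS t2
      JOIN (SELECT cat, SUM(balance) AS sumbalance
              FROM temp2
             GROUP BY cat) AS t3
        ON t2.cat = t3.cat
     ORDER BY t2.emp_id
    """
).fetchall()

for (emp_id, w), (_, j) in zip(window_rows, join_rows):
    print(emp_id, w, j)
```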
Assume a table with two columns t (a string with TimeStamps) and v (decimal). For each t I want to query the MAXIMUM of the value v in a certain range defined by the current t.
How can I translate the statement below into proper SQL?
select t, max(v for t between t-2MIN and t+2min) from table_name;
Example:
Assume below table.
t  v
-  -
1  3
2  2
3  5
4  4
5  8
6  1
I need an SQL-statement which gives me (for e.g. a width 2: max(v for t between t-2 and t+2)) the following result
t  v
-  -
1  5
2  5
3  8
4  8
5  8
6  8
Join the table with itself using the range as the joining condition.
SELECT t1.t, MAX(t2.v) AS max_v
FROM table_name AS t1
JOIN table_name AS t2 ON t2.t BETWEEN t1.t - 2 AND t1.t + 2
GROUP BY t1.t
If you use MySQL 8.x I think you should be able to do it using window functions, but I don't know the proper syntax for this.
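In MySQL 8 you can use MAX() OVER with a ROWS BETWEEN frame. Note that a ROWS frame counts rows rather than values of t, so it matches the t-2 to t+2 range only when t has no gaps or duplicates; otherwise a RANGE frame is the closer fit.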
select t
, max(v) over (order by t rows
between 2 preceding and 2 following) v
from table_name
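The self-join and the ROWS frame give the same answer on the example table because t runs 1 to 6 without gaps. The sketch below uses hypothetical data with a gap in t to show where they diverge; it runs on an in-memory SQLite database from Python (SQLite 3.25 or newer assumed for the window function).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (t INTEGER, v INTEGER)")
# Hypothetical data: note the jump from t = 3 to t = 10.
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [(1, 3), (2, 2), (3, 5), (10, 9)],
)

# Self-join: the window is defined by the value of t (t - 2 .. t + 2).
by_value = conn.execute(
    """
    SELECT t1.t, MAX(t2.v) AS max_v
      FROM table_name AS t1
      JOIN table_name AS t2 ON t2.t BETWEEN t1.t - 2 AND t1.t + 2
     GROUP BY t1.t
     ORDER BY t1.t
    """
).fetchall()

# ROWS frame: the window is two physical rows either side, whatever their t.
by_rows = conn.execute(
    """
    SELECT t, MAX(v) OVER (ORDER BY t
                           ROWS BETWEEN 2 PRECEDING AND 2 FOLLOWING) AS max_v
      FROM table_name
     ORDER BY t
    """
).fetchall()

print(by_value)
print(by_rows)
```

With the gap, the row at t = 10 leaks into the ROWS windows of t = 2 and t = 3, while the self-join keeps it out.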
ADD:
SQL query:
SELECT s.name
, d.feeling
, COUNT(1) AS times
FROM data_table d
, staff s
WHERE d.nid = s.id
AND d.project_id = 1
AND d.crawl_time BETWEEN '2018-10-17' AND '2018-10-24'
AND LENGTH(TRIM(d.feeling)) > 0
GROUP
BY d.nid
, d.feeling
ADD (END)
I wrote a SQL query that counts each person's feelings, grouped by name and feeling. Here is the result.
-- a certain sql returns, not an existing table --
`name` `feeling` `times` (expect)
Jack happy 10 0.45
Jack sad 7 0.31
Jack common 5 0.22
Lily happy 3 0.27
Lily sad 6 0.54
Lily common 2 0.18
Sam happy 6 0.42
Sam sad 7 0.5
Sam common 1 0.07
Now the aim is to calculate the ratio of everyone's feelings. For example, the happy feeling of Jack takes 10/(10+7+5) of his feelings, and for sad feeling is 7/(10+7+5).
When I use SUM(result.count) with GROUP BY name, the sad and common rows no longer show up. I then tried a subquery, but it fails because the table does not exist. Is there any way to do this without creating a view?
One solution which should work on any version of MySQL uses a subquery to find the sum of times for each name, and then joins to it:
SELECT
t1.name,
t1.feeling,
t1.times,
t1.times / t2.times_sum AS feeling_ratio
FROM yourTable t1
INNER JOIN
(
SELECT name, SUM(times) AS times_sum
FROM yourTable
GROUP BY name
) t2
ON t1.name = t2.name;
If you are using MySQL 8 or later, and have access to analytic functions, then there is a simpler way of writing this:
SELECT
t1.name,
t1.feeling,
t1.times,
t1.times / SUM(t1.times) OVER (PARTITION BY t1.name) AS feeling_ratio
FROM yourTable t1;
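Here is a minimal sketch of the join-to-subquery version, run from Python against an in-memory SQLite database loaded with the counts from the question. Two assumptions to flag: yourTable is materialised as a real table purely for the demo, and the `* 1.0` is there because SQLite divides integers as integers, whereas MySQL's `/` already returns a decimal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# yourTable stands in for the result of the asker's counting query.
conn.execute("CREATE TABLE yourTable (name TEXT, feeling TEXT, times INTEGER)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [
        ("Jack", "happy", 10), ("Jack", "sad", 7), ("Jack", "common", 5),
        ("Lily", "happy", 3), ("Lily", "sad", 6), ("Lily", "common", 2),
        ("Sam", "happy", 6), ("Sam", "sad", 7), ("Sam", "common", 1),
    ],
)

rows = conn.execute(
    """
    SELECT t1.name, t1.feeling, t1.times,
           t1.times * 1.0 / t2.times_sum AS feeling_ratio  -- 1.0 forces real division in SQLite
      FROM yourTable t1
     INNER JOIN (SELECT name, SUM(times) AS times_sum
                   FROM yourTable
                  GROUP BY name) t2
        ON t1.name = t2.name
     ORDER BY t1.name, t1.feeling
    """
).fetchall()

# Index the ratios by (name, feeling) for easy lookup.
ratios = {(name, feeling): ratio for name, feeling, _, ratio in rows}
for key, ratio in ratios.items():
    print(key, round(ratio, 2))
```

In the real query, the counting SELECT from the question would take the place of yourTable as a derived table in both positions, so no view is needed.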
Assume I have a table like this:
id pay
-- ---
1 10
2 20
3 30
4 40
5 50
6 60
I want to create a view from table above with this result:
id pay paid_before
-- --- -------------
1 10 0
2 20 10
3 30 30
4 40 60
5 50 100
6 60 150
where paid_before is the sum of pay over all rows with a smaller id.
How could I do this job?
This accomplishes what you want.
-- COALESCE turns the NULL sum for the first row (no smaller id) into 0
SELECT p1.id, p1.pay, COALESCE(SUM(p2.pay), 0) AS paid_before
FROM payments p1
LEFT JOIN payments p2 ON p1.id > p2.id
GROUP BY p1.id, p1.pay
See this sql fiddle
In MySQL, this is most efficiently done with variables:
select p.id, p.pay, (@p := @p + p.pay) - p.pay as PaidBefore
from payments p cross join
(select @p := 0) vars
order by id;
Although this is not standard SQL (which I usually prefer), that is okay. The standard SQL solution is to use cumulative sum:
select p.id, p.pay, sum(p.pay) over (order by p.id) - p.pay as PaidBefore
from payments p;
Many databases support this syntax; MySQL does so only from version 8.0 onward.
The SQL Fiddle (courtesy of Atilla) is here.
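The self-join and cumulative-sum approaches can be compared with a short sketch on an in-memory SQLite database from Python (SQLite 3.25 or newer assumed for the window function). In the join version the sum is wrapped in COALESCE so that the first row, which has no smaller id, gets 0 rather than NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id INTEGER, pay INTEGER)")
conn.executemany(
    "INSERT INTO payments VALUES (?, ?)",
    [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50), (6, 60)],
)

# Self-join: sum every row with a strictly smaller id.
join_result = conn.execute(
    """
    SELECT p1.id, p1.pay, COALESCE(SUM(p2.pay), 0) AS paid_before
      FROM payments p1
      LEFT JOIN payments p2 ON p1.id > p2.id
     GROUP BY p1.id, p1.pay
     ORDER BY p1.id
    """
).fetchall()

# Window version: running total up to and including this row, minus this row.
window_result = conn.execute(
    """
    SELECT id, pay, SUM(pay) OVER (ORDER BY id) - pay AS paid_before
      FROM payments
     ORDER BY id
    """
).fetchall()

print(join_result)
print(window_result)
```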
I get a list of options with price like the following:
(it's the result from a select query sort by price asc)
o_id  price  color  quality
----  -----  -----  -------
1     2      R      medium
3     3      G      bad
4     4      G      good
5     6      B      good
2     8      R      medium
Now I need to pair those options according to requirements:
e.g. if I need 2 R(red) and 4 G(green)
I'd like to return a list of possible combinations (sort by price asc), like:
R(2) G(4)
c_id o_id o_id total price
1 1 3 16
2 1 4 20
3 2 3 28
4 2 4 32
My current solution for this is to make multiple queries to the DB:
(I'm using Java at the application layer / back end.)
select distinct colors, and store it in a List
In a For loop, select options of each color into a different temp table
join the List of Tables, and calculate the total, sort by total.
But is there a way to condense the above operations into a stored procedure or something more elegant?
You just need a simple self-join:
SELECT R.o_id AS R_id, G.o_id AS G_id, 2*R.price + 4*G.price AS total
FROM mytable R JOIN mytable G ON R.color = 'R' AND G.color = 'G'
ORDER BY total
See it on sqlfiddle.
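Since the question mentions a Java back end issuing several queries, here is a rough sketch of the same self-join as a single parameterised query, shown in Python with SQLite for brevity. The table name mytable comes from the answer; passing the quantities and colours as bound parameters is an assumption about how the requirements would be supplied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (o_id INTEGER, price INTEGER, color TEXT, quality TEXT)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        (1, 2, "R", "medium"),
        (3, 3, "G", "bad"),
        (4, 4, "G", "good"),
        (5, 6, "B", "good"),
        (2, 8, "R", "medium"),
    ],
)

# Requirement from the question: 2 red and 4 green.
red_qty, green_qty = 2, 4

# Every red option is paired with every green option; total weights each price by its quantity.
combos = conn.execute(
    """
    SELECT R.o_id AS R_id, G.o_id AS G_id,
           ? * R.price + ? * G.price AS total
      FROM mytable R
      JOIN mytable G ON R.color = ? AND G.color = ?
     ORDER BY total
    """,
    (red_qty, green_qty, "R", "G"),
).fetchall()

for c_id, combo in enumerate(combos, start=1):
    print(c_id, *combo)
```

A third colour would need one more alias of mytable in the join and one more term in the total.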