Let's say I have 2 simple tables
Table t1 Table t2
+------+ +------+
| i | | j |
+------+ +------+
| 42 | | a |
| 1 | | b |
| 5 | | c |
+------+ +------+
How can I have an output of the 2 tables, joined without any condition except the row number?
I would like to avoid the creation of another index if possible.
I am using MySQL 5.7
With this example, the output would be :
Table output
+------+------+
| i | j |
+------+------+
| 42 | a |
| 1 | b |
| 5 | c |
+------+------+
What you ask can be done, assuming the answer to the question in your comment is yes:
"Even if table i and j are subqueries (containing order by)?"
Schema (MySQL v5.7)
CREATE TABLE table_1 ( i INT );
CREATE TABLE table_2 ( j VARCHAR(4) );
INSERT INTO table_1
VALUES (3),(5),(1);
INSERT INTO table_2
VALUES ('c'), ('b'),('a');
Query
SELECT t1.i, t2.j
FROM (SELECT t1.i
           , @rownum1 := @rownum1 + 1 AS rownum
      FROM (SELECT table_1.i
            FROM table_1
            ORDER BY ?) t1
      CROSS JOIN (SELECT @rownum1 := 0) v) t1
JOIN (SELECT t2.j
           , @rownum2 := @rownum2 + 1 AS rownum
      FROM (SELECT table_2.j
            FROM table_2
            ORDER BY ?) t2
      CROSS JOIN (SELECT @rownum2 := 0) v) t2 ON t2.rownum = t1.rownum;
However, this approach is a) not efficient, and b) indicative of questionable design. You probably want to look for something that actually relates your two tables or, if nothing exists, create something. If there is really nothing that relates the two tables, then you'll have trouble with the ORDER BY clauses anyway.
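For readers who are not tied to 5.7: MySQL 8.0 added window functions, so the same pairing can be written with ROW_NUMBER() instead of user variables. The sketch below is only an illustration. It runs the SQL through Python's built-in sqlite3 module (SQLite 3.25+ also has ROW_NUMBER()) so that it is self-contained, and it assumes each table is ordered by its own column, which is my stand-in for the `ORDER BY ?` placeholders above. I have not run it against MySQL itself.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL 8.0+
# (assumption: window functions are available).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_1 (i INT);
    CREATE TABLE table_2 (j VARCHAR(4));
    INSERT INTO table_1 VALUES (3), (5), (1);
    INSERT INTO table_2 VALUES ('c'), ('b'), ('a');
""")

# Number the rows of each table in an explicit order, then join on that number.
# Ordering each table by its own column is an assumption made for this sketch.
PAIR_BY_ROW_NUMBER = """
    SELECT a.i, b.j
    FROM (SELECT i, ROW_NUMBER() OVER (ORDER BY i) AS rn FROM table_1) AS a
    JOIN (SELECT j, ROW_NUMBER() OVER (ORDER BY j) AS rn FROM table_2) AS b
      ON b.rn = a.rn
    ORDER BY a.rn
"""

paired_rows = conn.execute(PAIR_BY_ROW_NUMBER).fetchall()
print(paired_rows)
```

The point about design still applies: without a meaningful ORDER BY in each OVER clause, the pairing is arbitrary.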
If the tables do not necessarily have the same number of rows, then use union all and group by -- along with variables:
select max(t.i) as i, max(t.j) as j
from ((select (@rn1 := @rn1 + 1) as seqnum, t1.i, null as j
       from t1 cross join
            (select @rn1 := 0) params
      ) union all
      (select (@rn2 := @rn2 + 1) as seqnum, null, t2.j
       from t2 cross join
            (select @rn2 := 0) params
      )
     ) t
group by seqnum;
Note: The results in each column are in an arbitrary and indeterminate order. The order might vary on different runs on the query.
You don't provide enough information to ensure the ordering.
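To make the unequal-row-count case concrete, here is a small sketch of the union all / group by idea using ROW_NUMBER() rather than variables. It assumes MySQL 8.0+ syntax but is executed through Python's sqlite3 module so it can be run anywhere. The sample rows and the choice to order each table by its own column are my assumptions, not something from the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL 8.0+ (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (i INT);
    CREATE TABLE t2 (j VARCHAR(4));
    INSERT INTO t1 VALUES (42), (1), (5);
    INSERT INTO t2 VALUES ('a'), ('b');   -- deliberately one row fewer than t1
""")

# Each branch pads the other table's column with NULL so the UNION ALL lines up;
# MAX() then collapses the two rows that share a sequence number into one.
ZIP_UNEQUAL = """
    SELECT MAX(u.i) AS i, MAX(u.j) AS j
    FROM (
        SELECT ROW_NUMBER() OVER (ORDER BY i) AS seqnum, i, NULL AS j FROM t1
        UNION ALL
        SELECT ROW_NUMBER() OVER (ORDER BY j) AS seqnum, NULL AS i, j FROM t2
    ) AS u
    GROUP BY u.seqnum
    ORDER BY u.seqnum
"""

zipped_rows = conn.execute(ZIP_UNEQUAL).fetchall()
print(zipped_rows)
```

The row of t1 that has no partner in t2 comes back with NULL (None in Python) in the j column instead of being dropped, which is the difference from the inner-join versions.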
You can try this code:
select t1.i, t2.j
from
  (SELECT i, @row_num:=@row_num+1 as row_num FROM t1, (SELECT @row_num:= 0) AS sl) t1
join
  (SELECT j, @row_num:=@row_num+1 as row_num FROM t2, (SELECT @row_num:= 0) AS sl) t2
on t1.row_num=t2.row_num
Question
Please consider the following table:
+--------------+--------+--------+
| transactionID | Sgroup | Rgroup |
+--------------+--------+--------+
| 1 | A | I |
| 1 | A | J |
| 2 | B | B |
| 2 | B | K |
+--------------+--------+--------+
For each transactionID (2 rows are associated with ID 1, two rows with ID 2) I want to select the row for which Sgroup = Rgroup, if any row within a transactionID satisfies the condition. Otherwise, I want to select a row at random. For each transactionID at most one row satisfies Sgroup = Rgroup. How can I do this?
Attempted Solution
I know how to select rows for which the condition Sgroup = Rgroup is fulfilled as follows:
SELECT *
FROM Transaction
WHERE Sgroup = Rgroup;
+---------------+--------+--------+
| transactionID | Sgroup | Rgroup |
+---------------+--------+--------+
| 2 | B | B |
+---------------+--------+--------+
I also know how to choose a row randomly (thanks to this question) if the condition is not fulfilled as follows:
SELECT * FROM
(SELECT *
FROM Transaction
WHERE NOT transactionID IN
(SELECT transactionID
FROM Transaction
WHERE Sgroup = Rgroup)
ORDER BY RAND()) AS temp
GROUP BY temp.transactionID;
+---------------+--------+--------+
| transactionID | Sgroup | Rgroup |
+---------------+--------+--------+
| 1 | A | I |
+---------------+--------+--------+
How can I combine these two expressions into one? I tried working with a CASE expression, but I didn't get far. Can somebody kindly suggest a solution?
Example Code
Here is the code to generate the table:
CREATE DATABASE MinimalExample;
USE MinimalExample;
CREATE TABLE Transaction (
transactionID int,
Sgroup nvarchar(1),
Rgroup nvarchar(1)
);
INSERT INTO Transaction VALUES
(1,'A','I'),
(1,'A','J'),
(2,'B','B'),
(2,'B','K');
I think variables might be the simplest solution if you really mean "random":
select t.*
from (select t.*,
             (@rn := if(@i = transactionID, @rn + 1,
                        if(@i := transactionID, 1, 1)
                       )
             ) as rn
      from (select t.*
            from t
            order by transactionID, (sgroup = rgroup) desc, rand()
           ) t cross join
           (select @i := -1, @rn := 0) params
     ) t
where rn = 1;
If by "random" you mean "arbitrary", you can use this quick-and-dirty trick:
(select t.*
from t
where sgroup = rgroup
)
union all
(select t.*
from t
where not exists (select 1 from t t2 where t2.transactionID = t.transactionID and t2.sgroup = t2.rgroup)
group by transactionID
);
This uses the dreaded select * with group by, something which I strongly discourage using under almost all circumstances. However, in this case, you are specifically trying to reduce each group to an indeterminate row, so it doesn't seem quite so bad. I will note that MySQL does not guarantee that the columns in the result set all come from the same row, although in practice they do.
Finally, if you had a unique primary key on each row, you could use probably the simplest solution:
select t.*
from t
where t.id = (select t2.id
from t t2
where t2.transactionID = t.transactionID
order by (rgroup = sgroup) desc, rand()
limit 1
);
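On MySQL 8.0 or later, the "matching row first, otherwise a random one" rule can also be sketched with ROW_NUMBER() partitioned by transactionID. The example below is a hedged illustration rather than a tested MySQL query: it runs in SQLite via Python's sqlite3 module, uses the table name t as in the queries above, and uses SQLite's RANDOM() where MySQL would use RAND().

```python
import sqlite3

# In-memory SQLite database standing in for MySQL 8.0+ (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (transactionID INT, Sgroup VARCHAR(1), Rgroup VARCHAR(1));
    INSERT INTO t VALUES (1,'A','I'), (1,'A','J'), (2,'B','B'), (2,'B','K');
""")

# Within each transactionID, rows with Sgroup = Rgroup sort first (1 before 0);
# ties are broken randomly. RANDOM() is SQLite's spelling; MySQL uses RAND().
PICK_ONE = """
    SELECT transactionID, Sgroup, Rgroup
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (
                   PARTITION BY transactionID
                   ORDER BY (Sgroup = Rgroup) DESC, RANDOM()
               ) AS rn
        FROM t
    ) AS ranked
    WHERE rn = 1
    ORDER BY transactionID
"""

picked = conn.execute(PICK_ONE).fetchall()
print(picked)
```

Transaction 2 always yields the (B, B) row, while transaction 1 yields either of its two rows, depending on the random draw.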
I've tried some other topics for this but couldn't find answers that actually worked for me.
I have an activities table with some values (in MySQL):
| id| user_id | elevation | distance |
|---|------------|--------------------|----------|
| 1 | 1 | 220 | 5000 |
| 2 | 1 | 300 | 7000 |
| 3 | 2 | 520 | 2000 |
| 4 | 2 | 120 | 3500 |
I need to sum distance and elevation until the distance sum reaches a certain value, per user_id.
Example, sum until 5000 is reached:
User 1 - distance 5000 - elevation 220
User 2 - distance 5500 - elevation 640
I found many solutions, but none with GROUP BY. How do I do this in MySQL?
Update: I used that query, but now I have another problem. The join always uses the insert order, and not the datetime field I want.
SELECT
t.*
FROM
(
SELECT
t.*,
(
@d := @d + DISTANCE
) AS running_distance
FROM
(
SELECT
t.*,
c.meta
FROM
inscricao i
INNER JOIN categorias c ON
i.categoria_id = c.id
LEFT JOIN(
select
t.data_inicio,t.usuario_id,t.aplicativo,t.data_fim,t.distance,t.tempo_decorrido,t.ritmo_cardiaco,t.velocidade_media,t.type,t.ganho_de_altimetria
from
corridas t
order by
data_inicio asc
) t ON
t.usuario_id = i.usuario_id
AND t.data_inicio >= i.inicio
AND t.data_fim <= i.fim
WHERE
i.desafio_id = 29
AND(
i.usuario_id = 5354
)
ORDER BY
data_inicio asc
-- usuario_id
) t
join (
SELECT
@u := -1,
@d := 0
) params
ORDER BY
data_inicio asc
) t
WHERE
(
running_distance >= meta * 1000
AND running_distance - DISTANCE < meta * 1000
)
OR(
running_distance <= meta * 1000
)
order by
data_inicio desc
So if an older activity is inserted later, the sum comes out wrong. Does anyone know how to handle this?
You can use variables to get the cumulative sum . . . then some simple filtering logic:
select t.*
from (select t.*,
             (@d := if(@u = user_id, @d + distance,
                       if(@u := user_id, distance, distance)
                      )
             ) as running_distance  -- pun intended
      from (select t.*
            from t
            order by user_id, id
           ) t cross join
           (select @u := -1, @d := 0) params
     ) t
where running_distance >= 5000 and
      running_distance - distance < 5000;
Notes:
The more recent versions of MySQL are finicky about variable assignment and order by. The innermost subquery is not needed in earlier versions of MySQL.
MySQL does not guarantee the order of evaluation of expressions in a select. Hence, all variable assignments are in a single expression.
If distance can be negative, then a user may have more than one row in the result set.
This is not an aggregation query.
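As a cross-check of the filtering logic, here is a sketch of the same running total written with SUM() OVER, which MySQL 8.0+ supports. It is run through Python's sqlite3 module so it is self-contained. The table name activities and the running elevation column are my additions based on the question, and the threshold of 5000 is hard-coded as in the query above.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL 8.0+ (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activities (id INT, user_id INT, elevation INT, distance INT);
    INSERT INTO activities VALUES
        (1, 1, 220, 5000),
        (2, 1, 300, 7000),
        (3, 2, 520, 2000),
        (4, 2, 120, 3500);
""")

# Running sums per user in id order. The WHERE clause keeps only the row on
# which the running distance first reaches 5000: the total is at or above the
# threshold, but was still below it before this row's distance was added.
RUNNING_TOTALS = """
    SELECT user_id, running_distance, running_elevation
    FROM (
        SELECT a.*,
               SUM(distance)  OVER (PARTITION BY user_id ORDER BY id) AS running_distance,
               SUM(elevation) OVER (PARTITION BY user_id ORDER BY id) AS running_elevation
        FROM activities AS a
    ) AS r
    WHERE running_distance >= 5000
      AND running_distance - distance < 5000
    ORDER BY user_id
"""

totals = conn.execute(RUNNING_TOTALS).fetchall()
print(totals)
```

Because the order lives in the OVER clause, replacing `ORDER BY id` with a start-time column should make the sum follow that column rather than the insert order, assuming such a column exists in the real table.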
I am trying to get the avg of an item so I am using a subquery.
Update: I should have been clearer initially, but I want the avg to be for the last 5 items only.
First I started with
SELECT
y.id
FROM (
SELECT *
FROM (
SELECT *
FROM products
WHERE itemid=1
) x
ORDER BY id DESC
LIMIT 15
) y;
Which runs but is fairly useless as it just shows me the ids.
I then added in the below
SELECT
y.id,
(SELECT AVG(deposit) FROM (SELECT deposit FROM products WHERE id < y.id ORDER BY id DESC LIMIT 5)z) AVGDEPOSIT
FROM (
SELECT *
FROM (
SELECT *
FROM products
WHERE itemid=1
) x
ORDER BY id DESC
LIMIT 15
) y;
When I do this I get the error Unknown column 'y.id' in 'where clause', upon further reading here I believe this is because when the queries go down to the next level they need to be joined?
So I tried the below (with the unneeded subquery removed):
SELECT
y.id,
(SELECT AVG(deposit) FROM (
SELECT deposit
FROM products
INNER JOIN y as yy ON products.id = yy.id
WHERE id < yy.id
ORDER BY id DESC
LIMIT 5)z
) AVGDEPOSIT
FROM (
SELECT *
FROM products
WHERE itemid=1
ORDER BY id DESC
LIMIT 15
) y;
But I get Table 'test.y' doesn't exist. Am I on the right track here? What do I need to change to get what I am after here?
The example can be found here in sqlfiddle.
CREATE TABLE products
(`id` int, `itemid` int, `deposit` int);
INSERT INTO products
(`id`, `itemid`, `deposit`)
VALUES
(1, 1, 50),
(2, 1, 75),
(3, 1, 90),
(4, 1, 80),
(5, 1, 100),
(6, 1, 75),
(7, 1, 75),
(8, 1, 90),
(9, 1, 90),
(10, 1, 100);
Given my data in this example, my expected result is below, where there is a column next to each ID that has the avg of the previous 5 deposits.
id | AVGDEPOSIT
10 | 86 (deposit value of (id9+id8+id7+id6+id5)/5) to get the AVG
9 | 84
8 | 84
7 | 84
6 | 79
5 | 73.75
I'm not a MySQL expert (in MS SQL it could be done more easily), and your question is a bit unclear to me, but it looks like you're trying to get the average of the previous 5 items.
If your ids have no gaps, it's easy:
select
p.id,
(
select avg(t.deposit)
from products as t
where t.itemid = 1 and t.id >= p.id - 5 and t.id < p.id
) as avgdeposit
from products as p
where p.itemid = 1
order by p.id desc
limit 15
If not, then I tried to write the query like this:
select
p.id,
(
select avg(t.deposit)
from (
select tt.deposit
from products as tt
where tt.itemid = 1 and tt.id < p.id
order by tt.id desc
limit 5
) as t
) as avgdeposit
from products as p
where p.itemid = 1
order by p.id desc
limit 15
But I got the exception Unknown column 'p.id' in 'where clause'. It looks like MySQL cannot resolve a column of the outer query from inside a derived table nested two levels down.
But you can get 5 previous items with offset, like this:
select
p.id,
(
select avg(t.deposit)
from products as t
where t.itemid = 1 and t.id > coalesce(p.prev_id, -1) and t.id < p.id
) as avgdeposit
from
(
select
p.id,
(
select tt.id
from products as tt
where tt.itemid = 1 and tt.id <= p.id
order by tt.id desc
limit 1 offset 6
) as prev_id
from products as p
where p.itemid = 1
order by p.id desc
limit 15
) as p
sql fiddle demo
This is my solution. It is easy to understand how it works, but at the same time it can't be optimized much since I'm using some string functions, and it's far from standard SQL. If you only need to return a few records, it could be still fine.
This query will return, for every ID, a comma-separated list of the previous IDs, in descending order:
SELECT p1.id, p1.itemid, GROUP_CONCAT(p2.id ORDER BY p2.id DESC) previous_ids
FROM
products p1 LEFT JOIN products p2
ON p1.itemid=p2.itemid AND p1.id>p2.id
GROUP BY
p1.id, p1.itemid
ORDER BY
p1.itemid ASC, p1.id DESC
and it will return something like this:
| ID | ITEMID | PREVIOUS_IDS |
|----|--------|-------------------|
| 10 | 1 | 9,8,7,6,5,4,3,2,1 |
| 9 | 1 | 8,7,6,5,4,3,2,1 |
| 8 | 1 | 7,6,5,4,3,2,1 |
| 7 | 1 | 6,5,4,3,2,1 |
| 6 | 1 | 5,4,3,2,1 |
| 5 | 1 | 4,3,2,1 |
| 4 | 1 | 3,2,1 |
| 3 | 1 | 2,1 |
| 2 | 1 | 1 |
| 1 | 1 | (null) |
Then we can join the result of this query with the products table itself, and in the join condition we can use FIND_IN_SET(src, csvalues), which returns the position of the src string inside the comma-separated values:
ON FIND_IN_SET(id, previous_ids) BETWEEN 1 AND 5
and the final query looks like this:
SELECT
list_previous.id,
AVG(products.deposit)
FROM (
SELECT p1.id, p1.itemid, GROUP_CONCAT(p2.id ORDER BY p2.id DESC) previous_ids
FROM
products p1 INNER JOIN products p2
ON p1.itemid=p2.itemid AND p1.id>p2.id
GROUP BY
p1.id, p1.itemid
) list_previous LEFT JOIN products
ON list_previous.itemid=products.itemid
AND FIND_IN_SET(products.id, previous_ids) BETWEEN 1 AND 5
GROUP BY
list_previous.id
ORDER BY
id DESC
Please see fiddle here. I won't recommend using this trick for big tables, but for small sets of data it is fine.
This is maybe not the simplest solution, but it does do the job and is an interesting variation and in my opinion transparent. I simulate the analytical functions that I know from Oracle.
As we do not assume the id to be consecutive, the row count is simulated by incrementing @rn for each row. Next, the products table including the row number is joined with itself, and only the 5 rows preceding each row (row-number difference between 1 and 5) are used to build the average.
select p2id, avg(deposit), group_concat(p1id order by p1id desc), group_concat(deposit order by p1id desc)
from ( select p2.id p2id, p1.rn p1rn, p1.deposit, p2.rn p2rn, p1.id p1id
       from (select p.*,@rn1:=@rn1+1 as rn from products p,(select @rn1 := 0) r) p1
          , (select p.*,@rn2:=@rn2+1 as rn from products p,(select @rn2 := 0) r) p2 ) r
where p2rn-p1rn between 1 and 5
group by p2id
order by p2id desc
;
Result:
+------+--------------+---------------------------------------+------------------------------------------+
| p2id | avg(deposit) | group_concat(p1id order by p1id desc) | group_concat(deposit order by p1id desc) |
+------+--------------+---------------------------------------+------------------------------------------+
| 10 | 86.0000 | 9,8,7,6,5 | 90,90,75,75,100 |
| 9 | 84.0000 | 8,7,6,5,4 | 90,75,75,100,80 |
| 8 | 84.0000 | 7,6,5,4,3 | 75,75,100,80,90 |
| 7 | 84.0000 | 6,5,4,3,2 | 75,100,80,90,75 |
| 6 | 79.0000 | 5,4,3,2,1 | 100,80,90,75,50 |
| 5 | 73.7500 | 4,3,2,1 | 80,90,75,50 |
| 4 | 71.6667 | 3,2,1 | 90,75,50 |
| 3 | 62.5000 | 2,1 | 75,50 |
| 2 | 50.0000 | 1 | 50 |
+------+--------------+---------------------------------------+------------------------------------------+
SQL Fiddle Demo: http://sqlfiddle.com/#!2/c13bc/129
I want to thank this answer on how to simulate analytical functions in mysql: MySQL get row position in ORDER BY
It looks like you just want:
SELECT
id,
(SELECT AVG(deposit)
FROM (
SELECT deposit
FROM products
ORDER BY id DESC
LIMIT 5) last5
) avgdeposit
FROM products
The inner query gets the last 5 rows added to product, the query that wraps that gets the average for their deposits.
I'm going to simplify your query a bit so I can explain it.
SELECT
y.id,
(
SELECT AVG(deposit) FROM
(
SELECT deposit
FROM products
LIMIT 5
) z
) AVGDEPOSIT
FROM
(
SELECT *
FROM
(
SELECT *
FROM products
) x
LIMIT 15
) y;
My guess would be that you just need to insert some AS keywords in there. I'm sure someone else will come up with something more elegant, but for now you can try it out.
SELECT
y.id,
(
SELECT AVG(deposit) FROM
(
SELECT deposit
FROM products
LIMIT 5
) z
) AS AVGDEPOSIT
FROM
(
SELECT *
FROM
(
SELECT *
FROM products
) AS x
LIMIT 15
) y;
Here's one way to do it in MySQL:
SELECT p.id
     , ( SELECT AVG(deposit)
         FROM ( SELECT @rownum:=@rownum+1 rn, deposit, id
                FROM ( SELECT @rownum:=0 ) r
                   , products
                ORDER BY id ) t
         WHERE rn BETWEEN p.rn-5 AND p.rn-1 ) avgdeposit
FROM ( SELECT @rownum1:=@rownum1+1 rn, id
       FROM ( SELECT @rownum1:=0 ) r
          , products
       ORDER BY id ) p
WHERE p.rn >= 5
ORDER BY p.rn DESC;
It's a shame MySQL doesn't support the WITH clause or windowing functions (both were only added in MySQL 8.0). Having both would greatly simplify the query to the following:
WITH tbl AS (
SELECT id, deposit, ROW_NUMBER() OVER(ORDER BY id) rn
FROM products
)
SELECT id
, ( SELECT AVG(deposit)
FROM tbl
WHERE rn BETWEEN t.rn-5 AND t.rn-1 )
FROM tbl t
WHERE rn >= 5
ORDER BY rn DESC;
The latter query runs fine in Postgres.
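With window functions available, the "previous 5 rows" average can arguably be written even more directly with an explicit frame, without numbering the rows first. The sketch below uses the sample data from the question and runs through Python's sqlite3 module (SQLite 3.25+). I believe the same statement is valid on MySQL 8.0+ and Postgres, but it is only checked against SQLite here.

```python
import sqlite3

# In-memory SQLite database standing in for an engine with window functions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INT, itemid INT, deposit INT);
    INSERT INTO products VALUES
        (1, 1, 50), (2, 1, 75), (3, 1, 90), (4, 1, 80), (5, 1, 100),
        (6, 1, 75), (7, 1, 75), (8, 1, 90), (9, 1, 90), (10, 1, 100);
""")

# The frame covers the 5 rows before the current one (current row excluded),
# counted by position, so gaps in id do not matter.
PREV5_AVG = """
    SELECT id,
           AVG(deposit) OVER (
               PARTITION BY itemid
               ORDER BY id
               ROWS BETWEEN 5 PRECEDING AND 1 PRECEDING
           ) AS avgdeposit
    FROM products
    WHERE itemid = 1
    ORDER BY id DESC
    LIMIT 15
"""

averages = conn.execute(PREV5_AVG).fetchall()
for row in averages:
    print(row)
```

Rows with fewer than 5 predecessors are averaged over whatever exists (id 5 gives 73.75, as in the expected output), and the very first row gets NULL because its frame is empty.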
2 possible solutions here
The first uses user variables to add a sequence number. Do this twice (ordering by id descending), and join the second set to the first where the second set's sequence number is between 1 and 5 greater than the first's, which picks the five preceding ids. Then just use AVG. No correlated subqueries.
SELECT Sub3.id, Sub3.itemid, Sub3.deposit, AVG(Sub4.deposit)
FROM
(
SELECT Sub1.id, Sub1.itemid, Sub1.deposit, @Seq:=@Seq+1 AS Sequence
FROM
(
SELECT id, itemid, deposit
FROM products
ORDER BY id DESC
) Sub1
CROSS JOIN
(
SELECT @Seq:=0
) Sub2
) Sub3
LEFT OUTER JOIN
(
SELECT Sub1.id, Sub1.itemid, Sub1.deposit, @Seq1:=@Seq1+1 AS Sequence
FROM
(
SELECT id, itemid, deposit
FROM products
ORDER BY id DESC
) Sub1
CROSS JOIN
(
SELECT @Seq1:=0
) Sub2
) Sub4
ON Sub4.Sequence BETWEEN Sub3.Sequence + 1 AND Sub3.Sequence + 5
GROUP BY Sub3.id, Sub3.itemid, Sub3.deposit
ORDER BY Sub3.id DESC
The second one is cruder, and uses a correlated subquery (which is likely to perform poorly as the amount of data increases). It does a normal select, but the last column is a subquery that refers to the id in the main select. Note that it relies on the ids having no gaps.
SELECT id, itemid, deposit,
       (SELECT AVG(P2.deposit)
        FROM products P2
        WHERE P2.id BETWEEN P1.id - 5 AND P1.id - 1)
FROM products P1
ORDER BY id DESC
Is this what you are after?
SELECT m.id
, AVG(d.deposit)
FROM products m
, products d
WHERE d.id < m.id
AND d.id >= m.id - 5
GROUP BY m.id
ORDER BY m.id DESC
;
But it can't be that simple. Firstly, the table probably doesn't contain just one itemid (hence your WHERE clause). Secondly, the ids are probably not sequential and gap-free within an itemid. Thirdly, you probably want something that runs across all itemids rather than one itemid at a time. So here it is.
SELECT itemid
, m_id as id
, AVG(d.deposit) as deposit
FROM (
SELECT itemid
, m_id
, d_id
, d.deposit
, @seq := (CASE WHEN m_id = d_id THEN 0 ELSE @seq + 1 END) seq
FROM (
SELECT m.itemid
, m.id m_id
, d.id d_id
, d.deposit
FROM products m
, products d
WHERE m.itemid = d.itemid
AND d.id <= m.id
ORDER BY m.id DESC
, d.id DESC) d
, (SELECT @seq := 0) s
) d
WHERE seq BETWEEN 1 AND 5
GROUP BY itemid
, m_id
ORDER BY itemid
, m_id DESC
;