I have a table that sort of looks like this
id | name | c1 | c2 | c3 | c4 | time
-------------------------------------------------
1 | miley | 23 | 11 | 21 | 18 | 2013-01-13 20:26:25
2 | john | 31 | 29 | 23 | 27 | 2013-01-14 20:26:25
3 | steve | 44 | 31 | 33 | 35 | 2013-01-14 20:26:25
4 | miley | 34 | 44 | 47 | 48 | 2013-01-15 08:26:25
5 | john | 27 | 53 | 49 | 52 | 2013-01-15 08:26:25
6 | steve | 27 | 62 | 50 | 64 | 2013-01-16 08:26:25
7 | miley | 44 | 54 | 57 | 87 | 2013-01-16 20:26:25
8 | john | 37 | 93 | 59 | 62 | 2013-01-17 20:26:25
9 | steve | 85 | 71 | 87 | 74 | 2013-01-17 20:26:25
...etc
*note: this is a random table I made up to just give you an idea of what my table looks like
I need to grab the name of whoever had the greatest change in a specific column over the course of a specific date range. I've tried a bunch of different queries but can't get one to work. I think my closest solution is something like...
SELECT table1.name, MAX(table1.c1-h.c1) as maxDiff
FROM table_a as table1
LEFT JOIN table_a as table2
ON table2.name=table1.name AND table1.c1>table2.c1
WHERE table2.c1 IS NOT NULL
What am I doing wrong? To be clear, I want to be able to select a range of dates then determine who has the biggest difference for that date range in a determined column. Also note that the data only increments over time, so the first capture of any day will always be <= the last capture of the day for that person.
It sounds like you will be needing a nested query. First, a query of each person on their own measurements within the date range, then order it by the biggest and take the top 1... something like this may work for you...
select
PreGroupByName.`Name`,
PreGroupByName.MaxC1 - PreGroupByName.MinC1 as MaxSpread
from
( select
t1.`Name`,
min( t1.c1 ) as MinC1,
max( t1.c1 ) as MaxC1
from
table_a t1
where
t1.`time` between '2013-01-01' and '2013-01-17' -- or whatever date/time range; a bare date means midnight, so this stops at 2013-01-17 00:00:00
group by
t1.`Name` ) as PreGroupByName
order by
MaxSpread DESC
limit 1
SELECT
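`name` -- `id` is left out: it is not in the GROUP BY, so its value would be arbitrary (or rejected under ONLY_FULL_GROUP_BY)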
,MAX(`c1`)-MIN(`c1`) AS `diff_c1`
-- ,MAX(`c2`)-MIN(`c2`) AS `diff_c2`
-- ,MAX(`c3`)-MIN(`c3`) AS `diff_c3`
-- ,MAX(`c4`)-MIN(`c4`) AS `diff_c4`
FROM `the_table`
WHERE `time` BETWEEN '2013-01-13 20:26:25' AND '2013-01-17 20:26:25'
GROUP BY `name`
ORDER BY `diff_c1` DESC -- whichever you want to evaluate
LIMIT 1
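Both answers boil down to "MAX minus MIN per name inside the date range, then keep the top row". A minimal sketch of that idea follows. It uses Python's built-in sqlite3 module as a stand-in for MySQL so it can be run anywhere; the helper name biggest_spread and the half-open range (start inclusive, end exclusive) are assumptions of this sketch, not part of the original answers.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here; the sample rows come from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (id INTEGER, name TEXT, c1 INTEGER, time TEXT)")
rows = [
    (1, "miley", 23, "2013-01-13 20:26:25"),
    (2, "john", 31, "2013-01-14 20:26:25"),
    (3, "steve", 44, "2013-01-14 20:26:25"),
    (4, "miley", 34, "2013-01-15 08:26:25"),
    (5, "john", 27, "2013-01-15 08:26:25"),
    (6, "steve", 27, "2013-01-16 08:26:25"),
    (7, "miley", 44, "2013-01-16 20:26:25"),
    (8, "john", 37, "2013-01-17 20:26:25"),
    (9, "steve", 85, "2013-01-17 20:26:25"),
]
conn.executemany("INSERT INTO table_a VALUES (?, ?, ?, ?)", rows)


def biggest_spread(start, end):
    """Return (name, spread) for the largest MAX(c1) - MIN(c1) with start <= time < end."""
    return conn.execute(
        """
        SELECT name, MAX(c1) - MIN(c1) AS spread
        FROM table_a
        WHERE time >= ? AND time < ?
        GROUP BY name
        ORDER BY spread DESC
        LIMIT 1
        """,
        (start, end),
    ).fetchone()


whole_range = biggest_spread("2013-01-13", "2013-01-18")
first_days = biggest_spread("2013-01-13", "2013-01-16")
print(whole_range, first_days)
```

The exclusive upper bound avoids the usual BETWEEN pitfall where a bare end date silently drops everything after midnight on the last day.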
I came across a task where I have to return the total COUNT and SUM of issued policies for each day of the month and compare it to the previous year.
Table PolicyOrder has fields:
PolicyOrderId - primary key
CreatedAt (DATETIME)
CalculatedPremium - cost of policy or "premium"
PolicyOrderStatusId - irrelevant to the question but still - status of the policy.
To solve this I came up with a query that self-joins the table and sums/counts by grouping on the DAY of the creation date.
SELECT
DATE(po1.CreatedAt) AS dayDate_2017,
SUM(po1.CalculatedPremium) AS premiumSum_2017,
COUNT(po1.PolicyOrderId) AS policyCount_2017,
po2.*
FROM
PolicyOrder po1
INNER JOIN (
SELECT
DATE(CreatedAt) AS dayDate_2018,
SUM(CalculatedPremium) AS premiumSum_2018,
COUNT(PolicyOrderId) AS policyCount_2018
FROM
PolicyOrder po2
WHERE
YEAR(CreatedAt) = 2018 AND
MONTH(CreatedAt) = 10 AND
PolicyOrderStatusId = 6
GROUP BY
DAY(CreatedAt)
) po2 ON (
DAY(po2.dayDate_2018) = DAY(po1.CreatedAt)
)
WHERE
YEAR(po1.CreatedAt) = 2017 AND
MONTH(po1.CreatedAt) = 10 AND
PolicyOrderStatusId = 6
GROUP BY
DAY(po1.CreatedAt)
The above query returns these results:
dayDate_2017 | premiumSum_2017 | policyCount_2017 | dayDate_2018 | premiumSum_2018 | policyCount_2018
2017-10-01 | 4699.36 | 98 | 2018-10-01 | 8524.21 | 144
2017-10-02 | 9114.55 | 168 | 2018-10-02 | 7942.25 | 140
2017-10-03 | 9512.43 | 178 | 2018-10-03 | 9399.61 | 161
2017-10-04 | 9291.77 | 155 | 2018-10-04 | 6922.83 | 137
2017-10-05 | 8063.27 | 155 | 2018-10-05 | 9278.58 | 178
2017-10-06 | 9743.40 | 184 | 2018-10-06 | 6139.38 | 136
...
2017-10-31 | ...
The problem is that now I have to add two more columns in which policies have to be counted and amounts summed from the start of the year UP UNTIL each returned row.
Desired results:
dayDate_2017 | premiumSum_2017 | policyCount_2017 | sumFromYearBegining | countFromYearBegining
2017-10-01 | 4699.36 | 98 | 150000.34 | 5332
2017-10-02 | 9114.55 | 168 | 156230.55 | 5443
2017-10-03 | 9512.43 | 178 | 160232.44 | 5663
...
2017-10-31 | ...
WHERE:
sumFromYearBegining (150000.34) - SUM of premiumSum from 2017-01-01 until 2017-10-01 (excluding)
countFromYearBegining (5332) - COUNT of policies from 2017-01-01 until 2017-10-01 (excluding)
sumFromYearBegining (156230.55) - SUM of premiumSum from 2017-01-01 until 2017-10-02 (excluding)
countFromYearBegining (5443) - COUNT of policies from 2017-01-01 until 2017-10-02 (excluding)
sumFromYearBegining (160232.44) - SUM of premiumSum from 2017-01-01 until 2017-10-03 (excluding)
countFromYearBegining (5663) - COUNT of policies from 2017-01-01 until 2017-10-03 (excluding)
I have tried inner joining the same table COUNTed and SUMed, which failed because I cannot specify the range up to which I need to count and sum. I have also tried LEFT joining and then counting, which fails because the results are counted not up to each row but up to the last result, etc.
DB Fiddle: https://www.db-fiddle.com/f/ckM8HyTD6NjLbK41Mq1gct/5
Any help from you SQL ninjas highly appreciated.
We can use user-defined variables to calculate a rolling sum / count when window functions are not available.
We first need to determine the sum and count for every day in the year 2017, even though you need rows for a particular month only: to calculate the rolling sum for the days in March, we also need the sum/count values from January and February. One possible optimization is to restrict the calculation to the months from the first month up to the required month.
Note that ORDER BY daydate_2017 is necessary to calculate the rolling sum correctly. Without an explicit order, rows come back in no guaranteed order, so we cannot guarantee that the sum will be correct.
Also, we need two levels of sub-select queries. The first level calculates the rolling sum values. The second level restricts the result to February only. Since WHERE is executed before SELECT, we cannot restrict the result to February in the first level itself.
If you need a similar rolling sum for the year 2018 as well, the same query logic can be implemented in another set of sub-select queries.
SELECT dt2_2017.*, dt_2018.*
FROM
(
SELECT dt_2017.*,
@totsum := @totsum + dt_2017.premiumsum_2017 AS sumFromYearBegining_2017,
@totcount := @totcount + dt_2017.policycount_2017 AS countFromYearBeginning_2017
FROM (SELECT Date(po1.createdat) AS dayDate_2017,
Sum(po1.calculatedpremium) AS premiumSum_2017,
Count(po1.policyorderid) AS policyCount_2017
FROM PolicyOrder AS po1
WHERE po1.policyorderstatusid = 6 AND
YEAR(po1.createdat) = 2017 AND
MONTH(po1.createdat) <= 2 -- calculate up to February for 2017
GROUP BY daydate_2017
ORDER BY daydate_2017) AS dt_2017
CROSS JOIN (SELECT @totsum := 0, @totcount := 0) AS user_init_vars
) AS dt2_2017
INNER JOIN (
SELECT
DATE(po2.CreatedAt) AS dayDate_2018,
SUM(po2.CalculatedPremium) AS premiumSum_2018,
COUNT(po2.PolicyOrderId) AS policyCount_2018
FROM
PolicyOrder po2
WHERE
YEAR(po2.CreatedAt) = 2018 AND
MONTH(po2.CreatedAt) = 2 AND
po2.PolicyOrderStatusId = 6
GROUP BY
dayDate_2018
) dt_2018 ON DAY(dt_2018.dayDate_2018) = DAY(dt2_2017.dayDate_2017)
WHERE YEAR(dt2_2017.daydate_2017) = 2017 AND
MONTH(dt2_2017.daydate_2017) = 2;
RESULT:
| dayDate_2017 | premiumSum_2017 | policyCount_2017 | sumFromYearBegining_2017 | countFromYearBeginning_2017 | dayDate_2018 | premiumSum_2018 | policyCount_2018 |
| ------------ | --------------- | ---------------- | ------------------------ | --------------------------- | ------------ | --------------- | ---------------- |
| 2017-02-01 | 4131.16 | 131 | 118346.77 | 3627 | 2018-02-01 | 8323.91 | 149 |
| 2017-02-02 | 2712.74 | 85 | 121059.51000000001 | 3712 | 2018-02-02 | 9469.33 | 153 |
| 2017-02-03 | 3888.59 | 111 | 124948.1 | 3823 | 2018-02-03 | 6409.21 | 97 |
| 2017-02-04 | 2447.99 | 74 | 127396.09000000001 | 3897 | 2018-02-04 | 5693.69 | 120 |
| 2017-02-05 | 1437.5 | 45 | 128833.59000000001 | 3942 | 2018-02-05 | 8574.97 | 129 |
| 2017-02-06 | 4254.48 | 127 | 133088.07 | 4069 | 2018-02-06 | 8277.51 | 133 |
| 2017-02-07 | 4746.49 | 136 | 137834.56 | 4205 | 2018-02-07 | 9853.75 | 173 |
| 2017-02-08 | 3898.05 | 125 | 141732.61 | 4330 | 2018-02-08 | 9116.33 | 144 |
| 2017-02-09 | 8306.86 | 286 | 150039.46999999997 | 4616 | 2018-02-09 | 8818.32 | 166 |
| 2017-02-10 | 6740.99 | 204 | 156780.45999999996 | 4820 | 2018-02-10 | 7880.17 | 134 |
| 2017-02-11 | 4290.38 | 133 | 161070.83999999997 | 4953 | 2018-02-11 | 8394.15 | 180 |
| 2017-02-12 | 3687.58 | 122 | 164758.41999999995 | 5075 | 2018-02-12 | 10378.29 | 171 |
| 2017-02-13 | 4939.31 | 159 | 169697.72999999995 | 5234 | 2018-02-13 | 9383.15 | 160 |
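If you want a way that avoids using @variables in the select list, and also avoids analytics (MySQL only supports them from version 8.0), you can do it with a semi-cartesian product. Note that the WITH syntax used here also needs MySQL 8.0; on older versions the two CTEs can be written out as derived tables instead.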
WITH prevYr AS(
SELECT
YEAR(CreatedAt) AS year_prev,
MONTH(CreatedAt) AS month_prev,
DAY(CreatedAt) AS day_prev,
SUM(CalculatedPremium) AS premiumSum_prev,
COUNT(PolicyOrderId) AS policyCount_prev
FROM
PolicyOrder
WHERE
CreatedAt BETWEEN '2017-02-01' AND '2017-02-28 23:59:59' AND -- upper bound covers all of the 28th
PolicyOrderStatusId = 6
GROUP BY
YEAR(CreatedAt), MONTH(CreatedAt), DAY(CreatedAt)
),
currYr AS (
SELECT
YEAR(CreatedAt) AS year_curr,
MONTH(CreatedAt) AS month_curr,
DAY(CreatedAt) AS day_curr,
SUM(CalculatedPremium) AS premiumSum_curr,
COUNT(PolicyOrderId) AS policyCount_curr
FROM
PolicyOrder
WHERE
CreatedAt BETWEEN '2018-02-01' AND '2018-02-28 23:59:59' AND -- upper bound covers all of the 28th
PolicyOrderStatusId = 6
GROUP BY
YEAR(CreatedAt), MONTH(CreatedAt), DAY(CreatedAt)
)
SELECT
*
FROM
prevYr
INNER JOIN
currYr
ON
currYr.day_curr = prevYr.day_prev
INNER JOIN
(
SELECT
main.day_prev AS dayRolling_prev,
SUM(pre.premiumSum_prev) AS premiumSumRolling_prev,
SUM(pre.policyCount_prev) AS policyCountRolling_prev
FROM
prevYr main LEFT OUTER JOIN prevYr pre ON pre.day_prev < main.day_prev
GROUP BY
main.day_prev
) rollingPrev
ON
currYr.day_curr = rollingPrev.dayRolling_prev
ORDER BY 1,2,3
We summarise the 2017 and 2018 data into two CTEs because it makes things a lot cleaner and neater later, particularly for this rolling count. You can probably follow the logic of the CTEs easily because it's lifted more or less straight from your query. I only dropped the DATE column in favour of a year/month/day triplet because it made other things (the joins) cleaner, and it can be recombined into a date if needed. I also swapped the WHERE clauses to use date BETWEEN x AND y because this can leverage an index on the column, whereas YEAR(date) = x AND MONTH(date) = y might not.
The rolling count works via something I referred to as a semi-cartesian. It's actually a cartesian product; any database join that results in rows from one or both tables multiplying and being represented repeatedly in the output is a cartesian product. Rather than being a full product (every row crossed with every other row), in this case it uses a less-than, so every row is only crossed with a subset of rows. As the date increases, more rows match the predicate, because a date of the 30th has 29 rows that are less than it.
This thus causes the following pattern of data:
maindate predate maincount precount
2017-02-01 NULL 10 NULL
2017-02-02 2017-02-01 20 10
2017-02-03 2017-02-01 30 10
2017-02-03 2017-02-02 30 20
2017-02-04 2017-02-01 40 10
2017-02-04 2017-02-02 40 20
2017-02-04 2017-02-03 40 30
You can see that any given main date repeats N - 1 times, because there are N - 1 dates lower than it that satisfy the join condition predate < maindate.
If we group by the maindate and sum the counts associated with each predate, we get the rolling sum of all the pre-counts on that main date. So, on the 4th day of the month, it's SUM(pre count for dates 1st - 3rd), i.e. 10+20+30 = 60. On the 5th day, we sum the counts for days 1 to 4. On the 6th day, we sum days 1 to 5, etc.
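To make the less-than join concrete, here is a minimal sketch that reproduces the 10/20/30/40 pattern from the table above. It runs on Python's built-in sqlite3 rather than MySQL, and the table name daily and its columns are hypothetical. The COALESCE is an addition of this sketch so the first day shows 0 instead of NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical per-day totals, matching the 10/20/30/40 pattern described above.
conn.execute("CREATE TABLE daily (day INTEGER, cnt INTEGER)")
conn.executemany("INSERT INTO daily VALUES (?, ?)", [(1, 10), (2, 20), (3, 30), (4, 40)])

rolling = conn.execute(
    """
    SELECT main.day,
           main.cnt,
           COALESCE(SUM(pre.cnt), 0) AS cnt_before   -- sum over strictly earlier days
    FROM daily AS main
    LEFT JOIN daily AS pre ON pre.day < main.day     -- the "less than" join
    GROUP BY main.day, main.cnt
    ORDER BY main.day
    """
).fetchall()

for row in rolling:
    print(row)
```

The join produces roughly N*(N-1)/2 intermediate rows, which is fine for one month of daily totals but grows quickly on larger inputs.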
I have a query between two tables.
First table is a list of users
+----+-------+-----------+
| id | name | expire_on |
+----+-------+-----------+
| 22 | JOHN | (null) |
| 44 | SMITH | (null) |
| 55 | DOE | 5 |
+----+-------+-----------+
Where "expire_on" can be NULL; if it is set, it is the lifetime of his subscription, in days.
And I have a list of transactions:
+----+----------------+-----------------+--------------+-------------+----------------------+
| id | id_member_card | amount_original | amount_final | description | utc_date_t |
+----+----------------+-----------------+--------------+-------------+----------------------+
| 1 | 22 | 12 | 12 | (null) | 2017-05-01T10:11:12Z |
| 2 | 22 | 50 | 50 | (null) | 2018-02-01T10:20:30Z |
| 3 | 44 | 7 | 7 | (null) | 2018-02-02T07:50:40Z |
| 4 | 22 | 9 | 9 | (null) | 2018-03-01T10:00:14Z |
| 5 | 44 | 5 | 5 | (null) | 2018-03-03T08:09:10Z |
| 6 | 22 | 0 | 0 | RENEW | 2018-05-02T11:22:33Z |
| 7 | 55 | 12 | 12 | (null) | 2018-05-03T10:20:30Z |
+----+----------------+-----------------+--------------+-------------+----------------------+
These are the rules:
1) The user "expires" 365 days after his very first transaction. Id 44 will expire on 02-02-2019... > BUT >
2) If the user has "expire_on" set, he expires after those X days instead of 365. In my example, id 55 expired on 07-05-2018.
3) If the transaction list contains a RENEW, the user expires 365 days after that renew transaction rather than after the first one. Id 22 will expire only on 02-05-2019 (practically, we could treat a RENEW transaction as his first transaction, if this helps to write a smarter query) > BUT
If the user has expire_on set, he expires X days after this renew (if id 22 had expire_on set to, say, 10 days, he would have expired on 12-05-2018 instead of 02-05-2019).
I hope that's clear.
Now the MySQL query, which I cannot complete so that it takes the RENEW into account.
First of all, this is the link to the fiddle: http://sqlfiddle.com/#!9/16a3a/1
And this is the query:
SELECT member_card.id AS id,
member_card.name,
member_card.expire_on,
ts1.* FROM member_card
INNER JOIN (
SELECT member_card.id,
MIN(transaction.utc_date_t) AS first_transaction,
MAX(transaction.utc_date_t) AS last_transaction,
IFNULL (
DATE(DATE_ADD(MAX(transaction.utc_date_t), INTERVAL expire_on DAY)) ,
DATE(DATE_ADD(MAX(transaction.utc_date_t), INTERVAL 365 DAY))
)
AS final_expire ,
SUM(transaction.amount_final) AS balance
FROM transaction
INNER JOIN member_card ON transaction.id_member_card = member_card.id
GROUP BY member_card.id ) AS ts1 ON member_card.id = ts1.id
WHERE ( final_expire BETWEEN '2019-02-01' AND '2019-02-28' )
GROUP BY member_card.id
With my query, I would expect to find id 44, because his first transaction was made on 2018-02-02, so he will expire in February 2019. But my query considers only the LAST transaction (see the MAX aggregate).
So, I need to check:
Does a RENEW exist?
If yes, take its date and add 365 days (OR the custom expire_on value)
If no, take the MIN transaction date.
Thank you very much for your support.
Trying to solve
I could also get the last renew transactions, with another query:
SELECT id_member_card , MAX(utc_date_t) AS last_transaction_renew
FROM transaction
WHERE description IS NOT NULL
GROUP BY id_member_card
and use this last_transaction_renew inside that IFNULL for the matching id_member_card values, but how?
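One possible way to express the rules from the question is to pick a single "base date" per member (the latest RENEW if there is one, otherwise the first transaction) and then add either expire_on or 365 days. The sketch below is only an illustration under stated assumptions: it runs on Python's built-in sqlite3, the table is called txn because `transaction` is awkward as an unquoted name, the timestamps are stored without the trailing Z, and only the columns needed for the rule are kept. In MySQL the date arithmetic would be written with DATE_ADD(base_date, INTERVAL COALESCE(expire_on, 365) DAY) instead of SQLite's date() modifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "txn" is a hypothetical name for the question's `transaction` table.
conn.executescript("""
CREATE TABLE member_card (id INTEGER, name TEXT, expire_on INTEGER);
CREATE TABLE txn (id INTEGER, id_member_card INTEGER, description TEXT, utc_date_t TEXT);
INSERT INTO member_card VALUES (22, 'JOHN', NULL), (44, 'SMITH', NULL), (55, 'DOE', 5);
INSERT INTO txn VALUES
  (1, 22, NULL,    '2017-05-01 10:11:12'),
  (2, 22, NULL,    '2018-02-01 10:20:30'),
  (3, 44, NULL,    '2018-02-02 07:50:40'),
  (4, 22, NULL,    '2018-03-01 10:00:14'),
  (5, 44, NULL,    '2018-03-03 08:09:10'),
  (6, 22, 'RENEW', '2018-05-02 11:22:33'),
  (7, 55, NULL,    '2018-05-03 10:20:30');
""")

EXPIRY_SQL = """
SELECT m.id,
       m.name,
       -- base date plus expire_on days, or 365 days when expire_on is NULL
       date(b.base_date, '+' || COALESCE(m.expire_on, 365) || ' days') AS final_expire
FROM member_card AS m
JOIN (
    SELECT id_member_card,
           -- latest RENEW if any, otherwise the very first transaction
           COALESCE(MAX(CASE WHEN description = 'RENEW' THEN utc_date_t END),
                    MIN(utc_date_t)) AS base_date
    FROM txn
    GROUP BY id_member_card
) AS b ON b.id_member_card = m.id
ORDER BY m.id
"""

expiries = conn.execute(EXPIRY_SQL).fetchall()
# Members expiring in February 2019 (ISO date strings compare chronologically).
feb_2019 = [row[0] for row in expiries if "2019-02-01" <= row[2] <= "2019-02-28"]
print(expiries, feb_2019)
```

With this way of counting, adding 5 days to 2018-05-03 gives 2018-05-08 for id 55; adjust the interval by one if your day-counting convention differs.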
Query 1:
SELECT num_requerimiento, asunto
FROM masivos_texto INNER JOIN envios_masivos
ON id_masivos=id_envio;
Result 1:
+-------------------+-------------+
| num_requerimiento | asunto      |
+-------------------+-------------+
| 1800              | inscripcion |
| 1801              | seguimiento |
+-------------------+-------------+
Query 2:
SELECT id_envio, estatus, count(estatus)
FROM acuses_recibo
WHERE id_envio IN (SELECT id_masivos FROM cati_atencion.masivos_texto WHERE fecha >= '2014-01-01' AND fecha <= '2015-06-16')
GROUP BY id_envio, estatus;
Result 2:
+----------+---------+----------------+
| id_envio | estatus | count(estatus) |
+----------+---------+----------------+
| 84       | 0       | 4031           |
| 84       | 1       | 632            |
| 85       | 0       | 35635          |
| 85       | 1       | 3711           |
+----------+---------+----------------+
Desired Result:
+-------------------+-------------+----------+---------+----------------+
| num_requerimiento | asunto      | id_envio | estatus | count(estatus) |
+-------------------+-------------+----------+---------+----------------+
| 1800              | inscripcion | 84       | 0       | 4031           |
| 1800              | inscripcion | 84       | 1       | 632            |
| 1801              | seguimiento | 85       | 0       | 635            |
| 1801              | seguimiento | 85       | 1       | 711            |
+-------------------+-------------+----------+---------+----------------+
in the Desired Result the id_envio/id_masivos corresponding to num_requerimiento 1800 is 84,
and the id_envio/id_masivos corresponding to num_requerimiento 1801 is 85,
and estatus in the 2nd table can take up to three values. Thanks in advance for your assistance.
UNION doesn't work: it gives me the 1st table followed by the 2nd, and only if the selects have the same number of columns.
To do this with SQL, you will need a table relating your masivos_texto and acuses_recibo tables. I suggest you create a table. You could call it req_id or anything suitable. This is often called a JOIN table. It will have this content
num_requerimiento id_envio
1800 84
1801 85
Then you'll be able to join your first and second queries together appropriately.
It's not possible to write your query for you without knowing the rows of your tables.
Solved! I needed to give each SELECT level an alias, like this:
SELECT result1.num_requerimiento, result1.asunto, result1.id_masivos, result2.estatus, result2.conteo
FROM
(SELECT C.num_requerimiento, B.asunto, B.id_masivos
FROM masivos_texto B INNER JOIN envios_masivos C
ON B.id_masivos=C.id_envio) as result1
INNER JOIN
(SELECT A.id_envio, A.estatus, count(estatus) as conteo
from acuses_recibo A
WHERE A.id_envio IN (SELECT B.id_masivos FROM masivos_texto B where B.fecha >= '2014-01-01' AND B.fecha <= '2015-06-16')
GROUP BY A.id_envio, A.estatus) as result2
ON result1.id_masivos=result2.id_envio;
and that generates the 3rd table needed. Hope it helps someone in the future.
I have an external third-party program that exports its database to MySQL in real time, and I want to show the data for reporting. I can't change the structure, because it's being synced in real time.
The table structure is something like this
ID | Date | Transaction
-----------------------------
12 | 2012-11-01 | 200
12 | 2012-11-02 | 250
12 | 2012-11-03 | 150
12 | 2012-11-04 | 1000
12 | 2012-11-05 | 225
....
13 | 2012-11-01 | 175
13 | 2012-11-02 | 20
13 | 2012-11-03 | 50
13 | 2012-11-04 | 100
13 | 2012-11-05 | 180
13 | 2012-11-06 | 195
The data are very large and keep getting bigger each day.
What I want to do is to build a report (view table) based on something like this:
ID | Date | Transaction | Prev Day Transaction
----------------------------------------------------
12 | 2012-11-01 | 200 | 0
12 | 2012-11-02 | 250 | 200
12 | 2012-11-03 | 150 | 250
12 | 2012-11-04 | 1000 | 150
12 | 2012-11-05 | 225 | 1000
....
13 | 2012-11-01 | 175 | 0
13 | 2012-11-02 | 20 | 175
13 | 2012-11-03 | 50 | 20
13 | 2012-11-04 | 100 | 50
13 | 2012-11-05 | 180 | 100
13 | 2012-11-06 | 195 | 180
I just can't come up with a fast select statement. The original data is currently already 283,120 rows, and it will grow by about 500 rows daily.
I've tried something like:
SELECT *, (SELECT transaction FROM table as t2 WHERE t1.id=t2.id
AND t1.date>t2.date ORDER BY t2.date DESC LIMIT 0,1)
FROM table AS t1
It works, but the select statement is very slow. Most of the time, it gets cut off in the middle of the operation.
What I need help with is a very fast SQL statement, which I could later use to build the view table.
See this link: http://sqlfiddle.com/#!2/54a5e/12
select t.id,t.cDate,t.cTrans
,(case when @pID=t.id then @pTran else 0 end) as preT
,(@pID :=t.id) as `tID`,(@pTran := t.cTrans) as `tTrans`
from tb_test_1 as t,(select @pID := 0, @pTran := 0) as t2
order by id,cDate;
The tID and tTrans columns must stay in the select list (they carry the variable assignments), but they do not need to be displayed on the page.
Please forgive me as I only know a little English!
Try this query -
SELECT t1.*, COALESCE(t2.transaction, 0) Prev_Day_Transaction FROM trans t1
LEFT JOIN (SELECT * FROM trans ORDER BY id, date DESC) t2
ON t1.id = t2.id AND t1.date > t2.date
GROUP BY t1.id, t1.date;
+------+------------+-------------+----------------------+
| id | date | transaction | Prev_Day_Transaction |
+------+------------+-------------+----------------------+
| 12 | 2012-11-01 | 200 | 0 |
| 12 | 2012-11-02 | 250 | 200 |
| 12 | 2012-11-03 | 150 | 250 |
| 12 | 2012-11-04 | 1000 | 150 |
| 12 | 2012-11-05 | 225 | 1000 |
| 13 | 2012-11-01 | 175 | 0 |
| 13 | 2012-11-02 | 20 | 175 |
| 13 | 2012-11-03 | 50 | 20 |
| 13 | 2012-11-04 | 100 | 50 |
| 13 | 2012-11-05 | 180 | 100 |
| 13 | 2012-11-06 | 195 | 180 |
+------+------------+-------------+----------------------+
Add composite index (id, date) to the table.
===========================
ALTER TABLE mt4_daily
ADD INDEX IX_mt4_daily_DATE (DATE);
ALTER TABLE mt4_daily
ADD INDEX IX_mt4_daily (ID, DATE);
Divide the table into a few parts through select statements and join them using the UNION set operator. As all set operators are parallel operations, you will get the data very quickly. You can divide the data by using a unique numeric column in your table, e.g.
select * from tbl_x where col1%3=0 union
select * from tbl_x where col1%3=1 union
select * from tbl_x where col1%3=2 ...
The above SQL query divides the data and fetches it in parallel.
I would try to write the query like this:
SELECT
tbl.ID,
tbl.Date,
tbl.Transaction,
COALESCE(tbl1.Transaction,0) as PrevDay
FROM
tbl left join tbl tbl1
on tbl.ID = tbl1.ID
and tbl.Date = tbl1.Date + INTERVAL 1 DAY
(This will work only if the table contains all days. If a day is missing, the next day will always show PrevDay as 0; I am not sure if this is what you need.)
EDIT: I would try this solution, which works even if some days are missing:
SELECT
tbl.id,
tbl.date,
tbl.Transaction,
COALESCE(tbl1.Transaction,0) as PrevDay
FROM
(SELECT tbl.id, tbl.date as d1, max(tbl1.date) as d2
FROM tbl LEFT JOIN tbl tbl1
ON tbl.id = tbl1.id and tbl.date>tbl1.date
GROUP BY tbl.id, tbl.date) t
LEFT JOIN tbl on tbl.id = t.id and DATE(tbl.date) = DATE(t.d1)
LEFT JOIN tbl tbl1 ON tbl1.id = t.id and DATE(tbl1.date) = DATE(t.d2)
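The gap-tolerant variant can be checked on a small data set with a deliberately missing day. This is a sketch under assumptions: it runs on Python's built-in sqlite3 instead of MySQL, the columns are renamed to day and txn to stay clear of keywords, and the rows are a shortened copy of the question's sample with 2012-11-03 removed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, day TEXT, txn INTEGER)")
# Composite index on (id, day), as recommended elsewhere in this thread.
conn.execute("CREATE INDEX ix_tbl_id_day ON tbl (id, day)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        (12, "2012-11-01", 200),
        (12, "2012-11-02", 250),
        (12, "2012-11-04", 1000),  # 2012-11-03 is deliberately missing
        (13, "2012-11-01", 175),
        (13, "2012-11-02", 20),
    ],
)

prev_rows = conn.execute(
    """
    SELECT cur.id, cur.day, cur.txn, COALESCE(prev.txn, 0) AS prev_txn
    FROM (
        -- for every row, find the latest earlier day of the same id (NULL if none)
        SELECT a.id, a.day AS d1, MAX(b.day) AS d2
        FROM tbl AS a
        LEFT JOIN tbl AS b ON b.id = a.id AND b.day < a.day
        GROUP BY a.id, a.day
    ) AS t
    JOIN tbl AS cur ON cur.id = t.id AND cur.day = t.d1
    LEFT JOIN tbl AS prev ON prev.id = t.id AND prev.day = t.d2
    ORDER BY cur.id, cur.day
    """
).fetchall()

for row in prev_rows:
    print(row)
```

The row for 2012-11-04 picks up 250 from 2012-11-02, which the "+ INTERVAL 1 DAY" version would report as 0.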
I got an sql issue. I have two tables which look like this:
first TABLE X second TABLE Y
TabX_ID| DATE | Value Z TabY_ID|TabX_ID | DATE | Value X | Value Y
4711 | 15.01 | 12 1 | 4711 | 15.01| 123 | 876
4711 | 20.01 | 5 2 | 4711 | 16.01| 12 | 54
4711 | 25.01 | 67 3 | 4711 | 17.01| 23 | 38
4 | 4711 | 20.01| 56 | 13
5 | 4711 | 23.01| 1 | 5
I need to assign all the data from TABLE Y to the row in TABLE X whose DATE timeframe it falls into.
I can't use a simple min - max because the boundaries change.
1. DATE min 15.01 DATE-max:19.01
2. DATE-min:20.01 DATE-max:24.01
3. DATE-min:25.01 DATE-max:...
So it looks like this
1 | 15.01 | 123 | 876
4711 | 15.01 | 12 -> 2 | 16.01 | 12 | 54
3 | 17.01 | 23 | 38
4711 | 20.01 | 5 -> 4 | 20.01 | 56 | 13
5 | 23.01 | 1 | 5
First I need to perform calculations with the TABLE Y values X and Y, and after that I need the VALUE Z from TABLE X. So it looks like this:
ID | DATE | Calculated_Val
4711| 15.01 | 345
4711| 20.01 | 892
Is there a way to do this?
thx in advance
Not sure about MySQL but if you are doing this with Oracle, I would use the LEAD analytic function to get the next date value in the future in tableX and then join that to tableY.
An example of this would be:
select
x.tabX_id,
x.date_val as min_date,
x.next_date_val as max_date,
valueZ,
valueX,
valueY,
y.date_val as tabY_date
from (
select
tabX_id,
date_val,
lead(date_val) over (partition by tabx_id order by date_val)
as next_date_val,
valueZ
from tabX
) x
join tabY y on (x.tabX_id = y.tabX_id and
y.date_val >= x.date_val and
(x.next_date_val is null or y.date_val < x.next_date_val))
Note that I haven't modified the next value of the date, so I am using a less-than condition. This is probably appropriate if you have a time component in any of the date fields, but might not be exactly what you want if they are just date values.
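The same "start date up to the next start date" join can be sketched without LEAD, which may help on engines that lack window functions. In the sketch below the next date is found with a correlated MIN() subquery instead of LEAD; that swap is a substitution made here, not the answer's own method. It runs on Python's built-in sqlite3, and the year 2024 in the dates is an assumption, since the question only gives day and month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tabX (tabX_id INTEGER, date_val TEXT, valueZ INTEGER);
CREATE TABLE tabY (tabY_id INTEGER, tabX_id INTEGER, date_val TEXT, valueX INTEGER, valueY INTEGER);
-- Year 2024 is assumed; the question shows only day.month.
INSERT INTO tabX VALUES (4711, '2024-01-15', 12), (4711, '2024-01-20', 5), (4711, '2024-01-25', 67);
INSERT INTO tabY VALUES
  (1, 4711, '2024-01-15', 123, 876),
  (2, 4711, '2024-01-16', 12, 54),
  (3, 4711, '2024-01-17', 23, 38),
  (4, 4711, '2024-01-20', 56, 13),
  (5, 4711, '2024-01-23', 1, 5);
""")

assigned = conn.execute(
    """
    SELECT x.date_val AS period_start, y.tabY_id
    FROM (
        SELECT x1.tabX_id, x1.date_val, x1.valueZ,
               -- next start date for the same id; plays the role of LEAD(date_val)
               (SELECT MIN(x2.date_val)
                  FROM tabX AS x2
                 WHERE x2.tabX_id = x1.tabX_id
                   AND x2.date_val > x1.date_val) AS next_date_val
        FROM tabX AS x1
    ) AS x
    JOIN tabY AS y
      ON y.tabX_id = x.tabX_id
     AND y.date_val >= x.date_val
     AND (x.next_date_val IS NULL OR y.date_val < x.next_date_val)
    ORDER BY y.tabY_id
    """
).fetchall()

for row in assigned:
    print(row)
```

Each TABLE Y row lands in exactly one TABLE X period, so a GROUP BY on period_start can then feed whatever calculation is needed together with valueZ.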
This is a simple join and group by:
select x.TabX_ID, y.DATE, min(ValueX), min(ValueY)
from TableX x
join TableY y
on x.TabX_ID = y.TabX_ID
and x.DATE = y.DATE
group by x.TabX_ID, y.DATE