Joining together consecutive dates - mysql

I have several versions of a value that need to be combined into one row. Records with other identifiers may also appear in the input table.
How can I do this in MySQL?
Input
ID Prev Value StartDate FinishDate
1140004 0 0 2019-11-01 00:00:00.000 2019-11-09 23:59:00.000
1140004 0 1 2019-11-10 00:00:00.000 2019-11-14 23:59:00.000
1140004 1 1 2019-11-15 00:00:00.000 2019-11-30 23:59:00.000
Expected
ID Prev Value StartDate FinishDate
1140004 0 1 2019-11-10 00:00:00.000 2019-11-30 23:59:00.000

Please add more details about your expected result: to get the single row you expect, you need to group by ID and apply some aggregation or calculation to every column other than ID.
Below, for example, we get the sum of Value, the minimum StartDate and the maximum FinishDate for every Id. Which aggregation you apply to each column depends on your use case, so this is just an example.
You can play around with this example here.
select Id,
sum(Value) as "Sum Value",
min(StartDate) as "Min StartDate",
max(FinishDate) as "Max FinishDate"
from data
group by Id
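To try the aggregation above without a MySQL server, here is a minimal sketch that runs the same query against the sample rows from the question. It assumes Python's built-in sqlite3 module as a stand-in for MySQL; the table name data comes from the answer, and the column types are guesses.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE data (Id INTEGER, Prev INTEGER, Value INTEGER, "
    "StartDate TEXT, FinishDate TEXT)"
)
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?, ?)",
    [
        (1140004, 0, 0, "2019-11-01 00:00:00.000", "2019-11-09 23:59:00.000"),
        (1140004, 0, 1, "2019-11-10 00:00:00.000", "2019-11-14 23:59:00.000"),
        (1140004, 1, 1, "2019-11-15 00:00:00.000", "2019-11-30 23:59:00.000"),
    ],
)

# One row per Id: sum of Value, earliest StartDate, latest FinishDate.
rows = conn.execute(
    """
    SELECT Id,
           SUM(Value)      AS sum_value,
           MIN(StartDate)  AS min_start,
           MAX(FinishDate) AS max_finish
    FROM data
    GROUP BY Id
    """
).fetchall()
print(rows)
```

The date strings compare correctly as text here only because they are in a fixed-width year-month-day format.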

Related

Delete the duplicate values in the SUM with MySQL or SQL

Hi, I am summing a column of a table, but the table has duplicate rows, so I wonder how I can do the sum without counting the duplicated rows:
The main table is this one:
folio cashier_id amount date
0001 1 2500 2022-06-01 00:00:00
0002 2 10000 2022-06-01 00:00:00
0001 1 2500 2022-06-01 00:00:00
0003 1 1000 2022-06-01 00:00:00
As you can see, the first and third rows are duplicates, so the sum comes out wrong. The result is:
cashier_id cash_amount
1 6000
2 10000
but it should be:
cashier_id cash_amount
1 3500
2 10000
The query that I use to make the sum is this one:
SELECT `jysparki_jis`.`api_transactions`.`cashier_id` AS `cashier_id`,
SUM(`jysparki_jis`.`api_transactions`.`cash_amount`) AS `cash_amount`,
COUNT(0) AS `ticket_number`,
DATE(`jysparki_jis`.`api_transactions`.`created_at`) AS `date`
FROM `jysparki_jis`.`api_transactions`
WHERE DATE(`jysparki_jis`.`api_transactions`.`created_at`) >= '2022-01-01'
AND (`jysparki_jis`.`api_transactions`.`dte_type_id` = 39
OR `jysparki_jis`.`api_transactions`.`dte_type_id` = 61)
AND `jysparki_jis`.`api_transactions`.`cashier_id` <> 0
GROUP BY `jysparki_jis`.`api_transactions`.`cashier_id`,
DATE(`jysparki_jis`.`api_transactions`.`created_at`)
As you can see, the sum is this:
SUM(`jysparki_jis`.`api_transactions`.`cash_amount`).
How can I compute the sum without counting the same folio twice for the same cashier_id?
I know that filtering on cashier_id and folio would avoid the duplicate rows, but I do not know how to do that. Can you help me?
Thanks
Given your provided input tables, you can use the DISTINCT clause inside the SUM aggregation function to solve your problem:
SELECT cashier_id, SUM(DISTINCT amount)
FROM tab
GROUP BY cashier_id,
folio,
date
Check the demo here.
Then you can add your conditions to the WHERE clause of this query, along with your aggregation on the "created_at" field (which, I guess, corresponds to the "date" field of your sample table). This solution should give you the general idea.
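Note that grouping by folio returns one row per folio rather than one total per cashier. One alternative, sketched below, removes the duplicate rows in a derived table first and then sums per cashier. It assumes that duplicates are rows identical in every column, and it uses Python's built-in sqlite3 module as a stand-in for MySQL, with the table name tab taken from the answer.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tab (folio TEXT, cashier_id INTEGER, amount INTEGER, date TEXT)"
)
# Sample rows from the question; the first and third rows are duplicates.
conn.executemany(
    "INSERT INTO tab VALUES (?, ?, ?, ?)",
    [
        ("0001", 1, 2500, "2022-06-01 00:00:00"),
        ("0002", 2, 10000, "2022-06-01 00:00:00"),
        ("0001", 1, 2500, "2022-06-01 00:00:00"),
        ("0003", 1, 1000, "2022-06-01 00:00:00"),
    ],
)

# Step 1 (inner query): keep each distinct row once.
# Step 2 (outer query): sum the de-duplicated amounts per cashier.
totals = conn.execute(
    """
    SELECT cashier_id, SUM(amount) AS cash_amount
    FROM (SELECT DISTINCT folio, cashier_id, amount, date FROM tab) AS d
    GROUP BY cashier_id
    ORDER BY cashier_id
    """
).fetchall()
print(totals)
```

On the sample data this yields 3500 for cashier 1 and 10000 for cashier 2, the totals the question asks for. Unlike SUM(DISTINCT amount), it still counts two different folios that happen to have the same amount.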

MySQL - Find start and end of blocks of consecutive rows with the same value

I need to extract and migrate values from one table to another. The source table contains summarized values for a specific effective date. Whenever one of the component values changes, a new row is written with the data valid from that effective date.
source_id entity_id effective_date component_1 component_2 component_3
int(ai) int date int int int
1 159 2020-01-01 100 0 90
2 159 2020-05-01 140 50 90
3 159 2020-08-01 0 30 90
5 159 2020-12-01 0 30 50
I now need to migrate this data to a new table like the one below. The goal is that selecting data for a given month returns the data valid for that month.
id source_id entity_id startdate enddate component_type value
int(ai) int int date date int int
Each row represents a component value that is valid for a period of months.
I currently run the insert/update for each effective month by setting the month as a parameter.
I insert value changes as new rows into the table and prevent duplicates with a unique key (entity_id, effective_date, component_type).
SET @effective_date = '2020-01-01';
INSERT INTO component_final
select NULL,
source_id,
entity_id,
effective_date,
NULL,
1,
component_1
FROM component_source
WHERE effective_date = @effective_date
AND component_1>0;
After migrating the first row, the result should be:
id source_id entity_id startdate enddate component_type value
1 1 159 2020-01-01 NULL 1 100
2 1 159 2020-01-01 NULL 3 90
SET @effective_date = '2020-05-01';
INSERT INTO component_final
select NULL,
source_id,
entity_id,
effective_date,
NULL,
1,
component_1
FROM component_source
WHERE effective_date = @effective_date
AND component_1>0;
After migrating the second row, the result should be:
id source_id entity_id startdate enddate component_type value
1 1 159 2020-01-01 2020-04-30 1 100
2 1 159 2020-01-01 NULL 3 90
3 2 159 2020-05-01 NULL 1 140
4 2 159 2020-05-01 NULL 2 50
So if a value changes in the future, an end date has to be set.
I am not able to do the second step: updating the existing rows when the component changes later.
Maybe it is possible to do this with a trigger that fires after inserting a new row with the same entity and component, but I was not able to make it work.
Any ideas? I want to handle this entirely inside MySQL.
You do not need the column enddate in the table component_final, because its value depends on other values in the same table:
SELECT
id,
source_id,
entity_id,
startdate,
( SELECT DATE_ADD(MIN(cf2.startdate),INTERVAL -1 DAY)
FROM component_final cf2
WHERE cf2.startdate > cf1.startdate
AND cf2.component_type = cf1.component_type
AND cf2.entity_id = cf1.entity_id
) as enddate,
component_type,
value
FROM component_final cf1;
As I understand it, the core issue is how to find the source_ids where a component changes (0 means a removal, so we don't want these entries in the result) and how to assign the respective end dates at the same time. For the sake of illustration I simplify your example a bit:
There is only one component_type (I take into account that there might then be consecutive entries with an unchanged value).
There is only one entity_id, so we can ignore it.
It should be easy to extend this simpler version to your real-world problem.
This is an example input:
source_id effective_date value
1 2020-01-01 100
2 2020-01-03 100
3 2020-01-05 80
4 2020-01-10 0
5 2020-01-12 30
I would expect the following output to be generated:
source_id start_date end_date value
1 2020-01-01 2020-01-04 100
3 2020-01-05 2020-01-09 80
5 2020-01-12 NULL 30
You can achieve this with one query by joining each row with the previous one to check whether the value has changed (this finds the start dates of periods) and with the first later row that has a different value (this finds the start of the next period). If there is no previous row, the row is considered a start as well. If there is no later change of the value, there is no end_date.
SELECT
main.source_id,
main.effective_date as start_date,
DATE_SUB(next_start.effective_date, INTERVAL 1 DAY) as end_date,
main.value
FROM source main
LEFT JOIN source prev ON prev.effective_date = (
SELECT MAX(effective_date)
FROM source
WHERE effective_date < main.effective_date
)
LEFT JOIN source next_start ON next_start.effective_date = (
SELECT MIN(effective_date)
FROM source
WHERE effective_date > main.effective_date AND value <> main.value
)
WHERE
(ISNULL(prev.source_id) OR prev.value <> main.value)
AND main.value <> 0
ORDER BY main.source_id;
As I said: This will have to be adapted to your problem, e.g. by adding proper join conditions for the entity_id.
@Luuk pointed out that you don't need the end date because it can be derived from the data. That would be the case if you also had entries for the start of "0 periods", i.e. periods in which no value is set. If you don't have entries for these, you can't derive the end date from the start of the next period, since there might be a gap in between.
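The simplified example can be checked end to end with the sketch below. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, so two functions are swapped for their SQLite counterparts: DATE_SUB(x, INTERVAL 1 DAY) becomes date(x, '-1 day') and ISNULL(x) becomes x IS NULL. The aliases m and nxt are shorthand chosen for this sketch.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE source (source_id INTEGER, effective_date TEXT, value INTEGER)"
)
# Example input from the answer above.
conn.executemany(
    "INSERT INTO source VALUES (?, ?, ?)",
    [
        (1, "2020-01-01", 100),
        (2, "2020-01-03", 100),
        (3, "2020-01-05", 80),
        (4, "2020-01-10", 0),
        (5, "2020-01-12", 30),
    ],
)

periods = conn.execute(
    """
    SELECT m.source_id,
           m.effective_date                    AS start_date,
           date(nxt.effective_date, '-1 day')  AS end_date,
           m.value
    FROM source m
    -- prev: the row immediately before m (NULL for the first row)
    LEFT JOIN source prev ON prev.effective_date = (
        SELECT MAX(effective_date) FROM source
        WHERE effective_date < m.effective_date
    )
    -- nxt: the first later row whose value differs from m's
    LEFT JOIN source nxt ON nxt.effective_date = (
        SELECT MIN(effective_date) FROM source
        WHERE effective_date > m.effective_date AND value <> m.value
    )
    -- keep rows that start a new period, and skip removals (value 0)
    WHERE (prev.source_id IS NULL OR prev.value <> m.value)
      AND m.value <> 0
    ORDER BY m.source_id
    """
).fetchall()
for row in periods:
    print(row)
```

On the example input this returns the three periods listed in the expected output, with a NULL (None in Python) end date for the still-open period.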

Converting the result of a MySQL table as per requirement

The MySQL table we work on has data in the following format:
entityId status updated_date
-------------------------------
1 1 29/05/2017 12:00
1 2 29/05/2017 03:00
1 3 29/05/2017 07:00
1 4 29/05/2017 14:00
1 5 30/05/2017 02:00
1 6 30/05/2017 08:00
2 1 31/05/2017 03:00
2 2 31/05/2017 05:00
.
.
So every entity id has 6 statuses, and every status has an update datetime. Each status has an activity attached to it.
For example:
1 - Started journey
2 - Reached first destination
3 - Left Point A, moving towards B, etc.
I need the output in the format below for specific statuses, e.g. 3 and 4, with the update times of status 3 and status 4 in separate columns.
entity_id time_started_journey time_reached_first_destination
(update time of status 3) (update time of status 4)
--------------------------------------------------------------
1 29/05/2017 7:00 29/05/2017 14:00
2 30/05/2017 7:00 30/05/2017 16:00
Later I need to calculate the total time which would be the difference of the two.
How can I achieve the desired result using MySQL?
I tried using the UNION operator but could not get the values into separate columns.
I also tried a CASE WHEN expression with the query below, but it failed.
select distinct entityid,
(case status when 3 then freight_update_time else 0 end)
as starttime,
(case status when 4 then freight_update_time else 0 end) as endtime
from table ;
Can anyone throw light on this?
Conditional aggregation is one way to return a resultset that looks like that.
SELECT t.entityid
, MAX(IF(t.status=3,t.updated_date,NULL)) AS time_started_journey
, MAX(IF(t.status=4,t.updated_date,NULL)) AS time_reached_first_destination
FROM mytable t
WHERE t.status IN (3,4)
GROUP BY t.entityid
ORDER BY t.entityid
This is just one suggestion; the specification is unclear about what the query should do with duplicated status values for a given entityid.
There are other query patterns that will return similar results.
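As a rough illustration of conditional aggregation, the sketch below pivots statuses 3 and 4 into two columns and then computes the elapsed time between them. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, so IF(cond, x, NULL) is written as a CASE expression. The table name mytable comes from the query above; the sample rows use the times from the expected output in the question, stored as ISO-formatted text, which is an assumption.

```python
import sqlite3
from datetime import datetime

# In-memory SQLite database standing in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (entityid INTEGER, status INTEGER, updated_date TEXT)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, 2, "2017-05-29 03:00"),  # ignored: only statuses 3 and 4 are selected
        (1, 3, "2017-05-29 07:00"),
        (1, 4, "2017-05-29 14:00"),
        (2, 3, "2017-05-30 07:00"),
        (2, 4, "2017-05-30 16:00"),
    ],
)

# Each CASE yields the date only for the matching status and NULL otherwise;
# MAX() ignores NULLs, so every entity collapses to one row with two columns.
rows = conn.execute(
    """
    SELECT entityid,
           MAX(CASE WHEN status = 3 THEN updated_date END) AS time_started_journey,
           MAX(CASE WHEN status = 4 THEN updated_date END) AS time_reached_first_destination
    FROM mytable
    WHERE status IN (3, 4)
    GROUP BY entityid
    ORDER BY entityid
    """
).fetchall()

# Elapsed hours between the two statuses, computed on the client side.
fmt = "%Y-%m-%d %H:%M"
hours = [
    (datetime.strptime(reached, fmt) - datetime.strptime(started, fmt)).total_seconds() / 3600
    for _, started, reached in rows
]
print(rows)
print(hours)
```

In MySQL itself the difference could also be computed in the query, for example with TIMESTAMPDIFF, provided updated_date is stored as a DATETIME rather than as text.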
My query in MySQL
SELECT
e3.updated_date AS sta3,
e4.updated_date AS sta4
FROM
`prueba` AS e3
LEFT JOIN prueba AS e4
ON
e3.entityId = e4.entityId AND e4.status = 4
WHERE
e3.status = 3

Previous row value in MySQL

I'm trying to find a way of retrieving a value from the previous row. What I want to do is first sort the rows by Date 1 (earliest first). Then, if Date 2 is later than all previous dates in that column, I want to pull out that row (plus the first initial row). My server does not support the LAG function. I have tried suggestions using CTE, but my server does not seem to recognise that either.
What I want to do is check whether, after sorting by Date 1, if Date_2 for row 2 > Date_2 for row 1, and if so return that row.
Here's an example table. As you can see, the ID is not in the same order as Date 1.
ID Date 1 Date 2
1 2000-01-01 2010-01-01
2 2001-08-01 2013-06-01
3 2000-06-01 2011-01-01
4 1999-07-01 2010-12-01
5 2002-02-01 2012-12-01
So in my example, I want these 3 records to be returned:
ID Date_1 Date_2 Previous_max
4 1999-07-01 2010-12-01 NULL
3 2000-06-01 2011-01-01 2010-12-01
2 2001-08-01 2013-06-01 2011-01-01
ID 1 and 5 are not returned because Date 1 is later and Date 2 is earlier than another row (4 and 2 respectively).
You should be able to do this with a correlated subquery:
select t.*,
(select max(date_2) from table t2 where t2.date_1 < t.date_1) as prev_max
from table t
having prev_max is null or prev_max < date_2;
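The correlated subquery can be checked against the example table with the sketch below. It assumes Python's built-in sqlite3 module as a stand-in for MySQL. The table name dates is hypothetical, and the filter is moved from HAVING into a WHERE around a derived table, because filtering on a select alias with HAVING is a MySQL extension.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
# "dates" is a hypothetical table name; the question does not give one.
conn.execute("CREATE TABLE dates (id INTEGER, date_1 TEXT, date_2 TEXT)")
conn.executemany(
    "INSERT INTO dates VALUES (?, ?, ?)",
    [
        (1, "2000-01-01", "2010-01-01"),
        (2, "2001-08-01", "2013-06-01"),
        (3, "2000-06-01", "2011-01-01"),
        (4, "1999-07-01", "2010-12-01"),
        (5, "2002-02-01", "2012-12-01"),
    ],
)

result = conn.execute(
    """
    SELECT id, date_1, date_2, prev_max
    FROM (
        SELECT t.*,
               -- largest date_2 among all rows with an earlier date_1
               (SELECT MAX(t2.date_2) FROM dates t2
                WHERE t2.date_1 < t.date_1) AS prev_max
        FROM dates t
    ) AS s
    -- keep the first row, and rows whose date_2 beats every earlier one
    WHERE prev_max IS NULL OR prev_max < date_2
    ORDER BY date_1
    """
).fetchall()
for row in result:
    print(row)
```

On the example data this returns IDs 4, 3 and 2 in that order, with the Previous_max values shown in the question.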

How to show every max value in mysql?

I've multiple values with different timestamps like the following:
10 01:01:00
20 01:35:00
30 02:10:00
05 02:45:00
12 03:05:00
21 03:30:00
10 04:06:00
40 05:15:00
I don't have a column with which I can group by and find max. I want to get the records with max values like 30,21, and 40. The data is always in this format, like value increasing and then starts from zero again. What query will help me to find these records?
To clarify, it's sorted by the timestamp, and I want to get the timestamps for the local maxima, the rows where the next row has a lesser value:
value tmstmp
----- --------
10 01:01:00
20 01:35:00
30 02:10:00 <-- this one since next value is 5 (< 30).
05 02:45:00
12 03:05:00
21 03:30:00 <-- this one since next value is 10 (< 21).
10 04:06:00
40 05:15:00 <-- this one since it is the last row (there is no next value).
Somehow your question is not clear to me.
Assume that first column name is "value" and second column name is "timestamp".
Select Max(value) from group by timestamp.
This answer might be a bit late; however, I think I have found the solution:
SELECT * FROM temp t1 WHERE value >
IFNULL(
(SELECT value FROM temp t2
WHERE t2.tmstmp > t1.tmstmp ORDER BY t2.tmstmp ASC limit 1),
-1
)
ORDER BY tmstmp ASC
To clarify:
I find the rows whose value is greater than the value in the next row.
To also get the final row, I have wrapped the subquery in IFNULL so that a missing next row is treated as -1.
The only problem I see is when the time rolls over to the next day, which is why I hope you can have a date appended to it as well.
Hopefully this will still help others.
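The local-maximum query can be run against the sample values with the sketch below. It assumes Python's built-in sqlite3 module as a stand-in for MySQL; the table name readings is hypothetical (the answer calls it temp), and the column names value and tmstmp are the ones used in the question.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption of this sketch).
conn = sqlite3.connect(":memory:")
# "readings" is a hypothetical table name chosen for this sketch.
conn.execute("CREATE TABLE readings (value INTEGER, tmstmp TEXT)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [
        (10, "01:01:00"),
        (20, "01:35:00"),
        (30, "02:10:00"),
        (5, "02:45:00"),
        (12, "03:05:00"),
        (21, "03:30:00"),
        (10, "04:06:00"),
        (40, "05:15:00"),
    ],
)

peaks = conn.execute(
    """
    SELECT t1.value, t1.tmstmp
    FROM readings t1
    WHERE t1.value > IFNULL(
        -- value of the next row by timestamp; NULL when t1 is the last row
        (SELECT t2.value FROM readings t2
         WHERE t2.tmstmp > t1.tmstmp
         ORDER BY t2.tmstmp ASC LIMIT 1),
        -1  -- the last row has no successor, so compare it against -1
    )
    ORDER BY t1.tmstmp ASC
    """
).fetchall()
print(peaks)
```

The -1 default only works if the values are never negative. The time-only strings sort correctly here because they are zero-padded and fall within one day.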