MySQL cumulative sum gives wrong result - mysql

Here is a sample of my table with some sample data:
Something strange happens when I compute the cumulative sum of the difference between columns gorivo.PovratKM and gorivo.PolazakKM, and likewise for gorivo.UkupnoGorivo.
The cumulative sum of the difference between gorivo.PovratKM and gorivo.PolazakKM is in column SumUkKM, and the cumulative sum of gorivo.UkupnoGorivo is in column SumGorivo.
The output should be something like:
+-------------+------------+-------------+------------+
| Polazak KM | Povratal KM| Prijedeno KM| SumUkKM |
+-------------+------------+-------------+------------+
| 814990 | 816220 | 1230 | 1230 |
+-------------+------------+-------------+------------+
| 816220 | 817096 | 876 | 2106 |
+-------------+------------+-------------+------------+
| 817096 | 817124 | 28 | 2134 |
+-------------+------------+-------------+------------+
| 817124 | 818426 | 1302 | 3436 |
+-------------+------------+-------------+------------+
What am I doing wrong in my query?

MySQL lets you initialize user variables inside the statement itself: (select @SumUkGorivo := 0, @SumUkKM := 0) x. The CROSS JOIN with that one-row derived table lets the variables be recalculated for each row of the other table.
Using variables you can, for example, set reset points or partitions in the same way that SUM() OVER (PARTITION BY ...) is used in other DBMSs such as SQL Server or Postgres.
SELECT
y.`PolazakKM`, y.`PovratakKM`,
@SumUkGorivo := @SumUkGorivo + `UkupnoGorivo` as SumUkGorivo,
@SumUkKM := @SumUkKM + (y.`PovratakKM` - y.`PolazakKM`) as SumUkKM
FROM
(select @SumUkGorivo := 0, @SumUkKM := 0) x,
(select gorivo.`PolazakKM`, gorivo.`PovratakKM`, gorivo.`UkupnoGorivo`
from gorivo WHERE gorivo.`IDVozilo` = 131
order by `DatumT`) y
;
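On MySQL 8.0 or later, the same running totals can be sketched with window functions instead of user variables. This assumes the table and column names from the query above and that DatumT defines the row order; treat it as an illustration rather than a drop-in replacement.

```sql
-- Running totals with window functions (requires MySQL 8.0+).
-- Assumption: DatumT gives the order in which trips should be accumulated.
SELECT
    g.PolazakKM,
    g.PovratakKM,
    g.PovratakKM - g.PolazakKM              AS PrijedenoKM,
    SUM(g.UkupnoGorivo) OVER w              AS SumUkGorivo,
    SUM(g.PovratakKM - g.PolazakKM) OVER w  AS SumUkKM
FROM gorivo AS g
WHERE g.IDVozilo = 131
-- ROWS frame: accumulate row by row even when two rows share the same DatumT.
WINDOW w AS (ORDER BY g.DatumT ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
ORDER BY g.DatumT;
```

Unlike the variable-based version, the ordering here is part of the window definition, so the result does not depend on the order in which the server happens to read the rows.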

MySQL: Add Default Value to Joined Table when Row not Found

System info:
$ uname -srvm
Linux 5.15.0-56-generic #62-Ubuntu SMP Tue Nov 22 19:54:14 UTC 2022 x86_64
$ mysql --version
mysql Ver 8.0.31-0ubuntu0.22.04.1 for Linux on x86_64 ((Ubuntu))
I am very inexperienced with MySQL & have been looking for an answer to this for about half a week. I am working with two tables named character_stats & halloffame that I want to join in a query. They look like this:
mysql> SELECT name, level FROM character_stats;
+-----------+-------+
| name | level |
+-----------+-------+
| foo | 0 |
| bar | 0 |
| baz | 3 |
| tester | 4 |
| testertoo | 2 |
+-----------+-------+
mysql> SELECT * from halloffame;
+----+-----------+----------+--------+
| id | charname | fametype | points |
+----+-----------+----------+--------+
| 1 | bar | T | 0 |
| 2 | foo | T | 0 |
| 3 | baz | T | 0 |
| 4 | tester | T | 0 |
| 5 | testertoo | T | 0 |
| 6 | tester | D | 40 |
| 7 | tester | M | 92 |
| 8 | bar | M | 63 |
+----+-----------+----------+--------+
In my query, I want to display all the rows from character_stats & I want to join the points column from halloffame for fametype='M'. If there is no row for fametype='M', I want to set points to 0 for that character name, instead of omitting the entire row as is done in the following:
mysql> SELECT name, level, points FROM character_stats JOIN
-> (SELECT charname, points FROM halloffame WHERE fametype='M')
-> AS hof ON (hof.charname=name);
+--------+-------+--------+
| name | level | points |
+--------+-------+--------+
| tester | 4 | 92 |
| bar | 0 | 63 |
+--------+-------+--------+
So I want it to output this:
+-----------+-------+--------+
| name | level | points |
+-----------+-------+--------+
| foo | 0 | 0 |
| bar | 0 | 63 |
| baz | 3 | 0 |
| tester | 4 | 92 |
| testertoo | 2 | 0 |
+-----------+-------+--------+
I have tried to learn how to use IFNULL, IF-THEN-ELSE, CASE, COALESCE, & COUNT statements from what I have found in documentation & answers on stackoverflow.com. But as I said, I am very inexperienced & don't know how to implement them.
The following works on its own:
SELECT IFNULL((SELECT points FROM halloffame WHERE fametype='M'
AND charname='foo' LIMIT 1), 0) as points;
But I don't know how to join it to the character_stats table. The following would work if I knew how to get the value of character_stats.name before COALESCE is called:
SELECT name, level, 'M' AS fametype, points FROM character_stats
JOIN (SELECT COALESCE((SELECT points FROM halloffame WHERE
fametype='M' AND charname=name LIMIT 1), 0) AS points) AS hof;
According to Adding Default Values on Joining Tables I should be able to use CROSS JOIN, but I am doing something wrong as it still results in Unknown column 'cc.name' in 'where clause':
SELECT name, level, points FROM character_stats
CROSS JOIN (SELECT DISTINCT name FROM character_stats) AS cc
JOIN (SELECT COALESCE((SELECT points FROM halloffame WHERE
fametype='M' AND charname=cc.name LIMIT 1), 0) AS points) AS hof;
Some references I have looked at:
Returning a value even if no result
Usage of MySQL's "IF EXISTS"
Return Default value if no row found
MySQL.. Return '1' if a COUNT returns anything greater than 0
How do write IF ELSE statement in a MySQL query
Simple check for SELECT query empty result
Is there a function equivalent to the Oracle's NVL in MySQL?
MySQL: COALESCE within JOIN
Unknown Column In Where Clause With Join
Adding Default Values on Joining Tables
https://www.tutorialspoint.com/returning-a-value-even-if-there-is-no-result-in-a-mysql-query
I found that I can do the following:
SELECT name, level, COALESCE((SELECT points FROM
halloffame WHERE fametype='M' AND charname=name
LIMIT 1), 0) AS points FROM character_stats;
Though I would still like to know how to do it within a JOIN statement.
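One way to do this within a JOIN is a LEFT JOIN with the fametype filter placed in the ON clause, so that characters without an 'M' row are kept and their NULL points are replaced by COALESCE. This is a sketch using the table and column names shown above; the aliases cs and hof are only for illustration.

```sql
-- Keep every character; take points from the 'M' row when one exists.
SELECT cs.name,
       cs.level,
       COALESCE(hof.points, 0) AS points   -- NULL (no match) becomes 0
FROM character_stats AS cs
LEFT JOIN halloffame AS hof
       ON hof.charname = cs.name
      AND hof.fametype = 'M';              -- filter in ON, not WHERE, so unmatched rows survive
```

Moving the fametype condition into a WHERE clause would discard the NULL-extended rows again and turn the query back into an inner join. If a character could have more than one 'M' row, this query returns one output row per match.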

Group rows by the same value in the field, while matching on partial value only

I have a table that has many rows (from a few thousand to a few million).
I need my query to do the following:
group results by the same part of the value in the field;
order by the biggest group first.
The table mostly contains values that are only partly similar (for example, the suffix differs). Since the number of similar values is huge, I cannot predict all of them.
Here is an example of my table:
+--------+-----------+------+
| Id | Uri | Run |
+--------+-----------+------+
| 15145 | select_123| Y |
| 15146 | select_345| Y |
| 15148 | delete_123| N |
| 15150 | select_234| Y |
| 15314 | delete_334| N |
| 15315 | copy_all | N |
| 15316 | merge_all | Y |
| 15317 | select_565| Y |
| 15318 | copy_all | Y |
| 15319 | delete_345| Y |
+--------+-----------+------+
What I would like to see is something like this (the Count part is desirable but not required):
+-----------+------+
| Uri | Count|
+-----------+------+
| select | 4 |
| delete | 3 |
| copy_all | 2 |
| merge_all| 1 |
+-----------+------+
If you're using MySQL 5.x, you can strip the trailing _ and digits from the Uri value using this expression:
LEFT(Uri, LENGTH(Uri) - LOCATE('_', REVERSE(Uri)))
Using a REGEXP test to see if the Uri ends in _ and some digits, we can then process the Uri according to that and then GROUP BY that value to get the counts:
SELECT CASE WHEN Uri REGEXP '_[0-9]+$' THEN LEFT(Uri, LENGTH(Uri) - LOCATE('_', REVERSE(Uri)))
ELSE Uri
END AS Uri2,
COUNT(*) AS Count
FROM data
GROUP BY Uri2
Output:
Uri2 Count
copy_all 2
delete 3
merge_all 1
select 4
Demo on SQLFiddle
The format of the string makes it awkward to parse with string functions.
If you are running MySQL 8.0, you can truncate the string with regexp_replace(), then group by and order by:
select regexp_replace(uri, '_\\d+$', '') new_uri, count(*) cnt
from mytable
group by new_uri
order by cnt desc
If you're using MySQL 8.x, you can use REGEXP_REPLACE() to remove the numeric suffixes from select_XXX and delete_XXX, then group by the result.
SELECT REGEXP_REPLACE(uri, '_[0-9]+$', '') AS new_uri, COUNT(*) as count
FROM yourTable
GROUP BY new_uri
You can do it as below: define a common table expression (used here like a view) that uses a CASE expression with substr to detect the values starting with 'select' and 'delete'.
You can then query that expression with COUNT and GROUP BY.
WITH view_1 AS (
SELECT
CASE
WHEN substr(uri, 1, 6) = 'select' THEN
substr(uri, 1, 6)
WHEN substr(uri, 1, 6) = 'delete' THEN
substr(uri, 1, 6)
ELSE uri
END AS uri
FROM
your_table
)
SELECT
uri,
COUNT(uri) as "Count"
FROM
view_1
GROUP BY
uri
ORDER BY count(uri) DESC;
With the sample table from the question, the output will be
select 4
delete 3
copy_all 2
merge_all 1

How mysql SELECT and WHERE works generally and how it works when there's an added column?

I'm new to MySQL and I don't know the terminology, so I'll explain as best I can. Here is my problem:
I have a works table with the columns id, date, and done.
+----+---------------------+------+
| id | date | done |
+----+---------------------+------+
| 1 | 2020-05-01 14:22:34 | 10 |
| 2 | 2020-05-02 14:22:45 | 50 |
| 3 | 2020-05-03 14:23:00 | 30 |
| 4 | 2020-05-04 14:23:13 | 100 |
| 5 | 2020-05-05 14:23:24 | 25 |
+----+---------------------+------+
I want to select all of them plus one new column named cummulative_done. I use this SQL command to get the result I want.
SET @cdone := 0;
SELECT *, (@cdone := @cdone + done) as cummulative_done
FROM works;
Result:
+----+---------------------+------+------------------+
| id | date | done | cummulative_done |
+----+---------------------+------+------------------+
| 1 | 2020-05-01 14:22:34 | 10 | 10 |
| 2 | 2020-05-02 14:22:45 | 50 | 60 |
| 3 | 2020-05-03 14:23:00 | 30 | 90 |
| 4 | 2020-05-04 14:23:13 | 100 | 190 |
| 5 | 2020-05-05 14:23:24 | 25 | 215 |
+----+---------------------+------+------------------+
Then I want that result to be filtered by cummulative_done using a WHERE clause. On my first try, I used this SQL command.
SET @cdone := 0;
SELECT *, (@cdone := @cdone + done) as cummulative_done
FROM works
WHERE cummulative_done <= 100;
It gives me the error: Error in query (1054): Unknown column 'cummulative_done' in 'where clause'.
After that, I searched on Google and found a solution using this SQL command.
SET @cdone := 0;
SELECT *, (@cdone := @cdone + done) as cummulative_done
FROM works
WHERE @cdone + done <= 100;
Result:
+----+---------------------+------+------------------+
| id | date | done | cummulative_done |
+----+---------------------+------+------------------+
| 1 | 2020-05-01 14:22:34 | 10 | 10 |
| 2 | 2020-05-02 14:22:45 | 50 | 60 |
| 3 | 2020-05-03 14:23:00 | 30 | 90 |
+----+---------------------+------+------------------+
It gives me the result I want, but I don't understand why it works and why mine doesn't.
From that, I realized my understanding of SQL SELECT and WHERE is wrong. Can anyone explain how SELECT and WHERE work in general? After that, please explain why my first try doesn't work and why the solution does.
If possible, I'd prefer a step-by-step or in-depth explanation that is still easy to understand. Thanks.
Check the order in which MySQL evaluates the clauses of a statement. WHERE is evaluated before the SELECT list, so you can't filter on a column alias that is only defined there.
More information:
MySQL query / clause execution order
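In the working query, rows are processed one at a time. For each row, WHERE is evaluated first, using the value that @cdone has accumulated from the previously selected rows; only if the row passes does the SELECT list add done to @cdone. The WHERE/SELECT order is the same for every row, which is why the condition effectively applies to the running total.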
The WHERE clause, if given, indicates the condition or conditions that
rows must satisfy to be selected. where_condition is an expression
that evaluates to true for each row to be selected. The statement
selects all rows if there is no WHERE clause.
You can modify your code.
SET @cdone := 0;
SELECT *, @cdone as cummulative_done
FROM works
WHERE (@cdone := @cdone + done) <= 100;
https://dev.mysql.com/doc/refman/8.0/en/select.html
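As a sketch of the same idea without user variables: on MySQL 8.0 or later the running total can be computed in a derived table, and the outer query can then filter on the alias, because the inner SELECT has already been evaluated by the time the outer WHERE runs. This assumes the works table from the question and that id defines the accumulation order; the aliases w and t are only for illustration.

```sql
-- The alias is defined in the inner query, so the outer WHERE can see it.
SELECT t.*
FROM (
    SELECT w.id,
           w.date,
           w.done,
           SUM(w.done) OVER (ORDER BY w.id) AS cummulative_done
    FROM works AS w
) AS t
WHERE t.cummulative_done <= 100
ORDER BY t.id;
```

This also avoids relying on the evaluation order of user-variable assignments inside a SELECT, which the MySQL manual describes as undefined and which is deprecated in MySQL 8.0.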

Random results spaced in specific time (according to column data)

I have these two tables in MySQL:
- "tag_name", which contains a unique tag_name_id for each created tag, and the respective tag
- "tags", which also contains a unique tags_id, a timecode (time instant) and a group_id (the same tags can have different group_id values)
What I'm trying to do is get a random timecode somewhere in the first 10 seconds (timecode <= 10). Then, starting from that timecode, I want to select all the results that are spaced 3 or more seconds apart (any result less than 3 seconds after the previous one must be discarded).
Example:
If I have these results in my database:
2,3,4,4,6,13,14,17,18,18,21,25,28,28,etc (timecodes)
I want to pick one of the first 10 at random (let's say I pick the 4) and then sort the rest randomly starting from that time instant. ("Randomly" because I want to vary the order when the same timecode appears more than once, e.g. "4,4": those rows are associated with different tags, so I want them to switch places and give me a different one every time.)
So the query result would be something like this: 4,13,17,21,25,28,etc
I already have this query that returns the random number, and this morning I have been trying to put a SELECT inside the SELECT because I think the answer is there, but I can't retrieve the results I want, and I also can't find a way to retrieve results spaced by 3...
SELECT tag_name.tag, ROUND(avg(timecode)) as timecode, group_id
FROM tags
INNER JOIN tag_name
ON tag_name.tag_name_id = tags.tag_name_id
WHERE tags.filename = 'filename.mp4' AND timecode <= 10
GROUP BY group_id, tag_name.tag
ORDER BY RAND()
LIMIT 1
Here is the SQLFiddle
So we have a table of timecodes which, for the sake of argument, might look like this...
CREATE TABLE timecodes(id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,timecode INT NOT NULL);
INSERT INTO timecodes (timecode) VALUES (2),(3),(4),(4),(6),(13),(14),(17),(18),(18),(21),(25),(28),(28);
SELECT * FROM timecodes;
+----+----------+
| id | timecode |
+----+----------+
| 1 | 2 |
| 2 | 3 |
| 3 | 4 |
| 4 | 4 |
| 5 | 6 |
| 6 | 13 |
| 7 | 14 |
| 8 | 17 |
| 9 | 18 |
| 10 | 18 |
| 11 | 21 |
| 12 | 25 |
| 13 | 28 |
| 14 | 28 |
+----+----------+
Now, this question is in two parts. The first part concerns obtaining a random result from within the first n results. One way of doing that (although probably not the fastest way) is like this...
SELECT @seed := x.timecode
FROM timecodes x
JOIN timecodes y
ON y.id <= x.id
GROUP
BY x.id
HAVING COUNT(*) <= 5
ORDER
BY RAND()
LIMIT 1;
+---------------------+
| @seed := x.timecode |
+---------------------+
| 4 |
+---------------------+
This query generates a seed (in this case '4'), which can be ploughed back into subsequent queries, e.g.;
SELECT @seed := MIN(y.timecode)
FROM timecodes x
JOIN timecodes y
ON y.timecode >= x.timecode + 3
WHERE x.timecode = @seed;
1st iteration
+--------------------------+
| @seed := MIN(y.timecode) |
+--------------------------+
| 13 |
+--------------------------+
2nd iteration
+--------------------------+
| @seed := MIN(y.timecode) |
+--------------------------+
| 17 |
+--------------------------+
3rd iteration
+--------------------------+
| @seed := MIN(y.timecode) |
+--------------------------+
| 21 |
+--------------------------+
4th iteration
+--------------------------+
| @seed := MIN(y.timecode) |
+--------------------------+
| 25 |
+--------------------------+
5th iteration
+--------------------------+
| @seed := MIN(y.timecode) |
+--------------------------+
| 28 |
+--------------------------+
6th iteration
+--------------------------+
| @seed := MIN(y.timecode) |
+--------------------------+
| NULL |
+--------------------------+
This can be wrapped up in a stored procedure, or in some application-level code that says 'do the first thing then, while @seed is NOT NULL, do the second thing', but that's a step beyond my pay grade.
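On MySQL 8.0 or later, the loop described above could also be sketched as a recursive common table expression instead of a stored procedure. This assumes the timecodes table from this answer and that @seed still holds the starting timecode picked by the first query (it must be run before any of the iteration queries overwrite it); the CTE name picks is hypothetical.

```sql
-- Greedy walk: start at @seed, then repeatedly jump to the smallest
-- timecode that is at least 3 seconds after the current one.
WITH RECURSIVE picks (timecode) AS (
    SELECT CAST(@seed AS SIGNED)            -- starting point from the first query
    UNION ALL
    SELECT (SELECT MIN(t.timecode)
            FROM timecodes AS t
            WHERE t.timecode >= p.timecode + 3)
    FROM picks AS p
    WHERE p.timecode IS NOT NULL            -- stop once no later timecode exists
)
SELECT timecode
FROM picks
WHERE timecode IS NOT NULL;                 -- drop the terminating NULL row
```

With the sample data and a seed of 4, this should walk through the same sequence as the iterations shown above, returned as one result set.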

Interpolate missing values in a MySQL table

I have some intraday stock data saved into a MySQL table which looks like this:
+----------+-------+
| tick | quote |
+----------+-------+
| 08:00:10 | 5778 |
| 08:00:11 | 5776 |
| 08:00:12 | 5778 |
| 08:00:13 | 5778 |
| 08:00:14 | NULL |
| 08:00:15 | NULL |
| 08:00:16 | 5779 |
| 08:00:17 | 5778 |
| 08:00:18 | 5780 |
| 08:00:19 | NULL |
| 08:00:20 | 5781 |
| 08:00:21 | 5779 |
| 08:00:22 | 5779 |
| 08:00:23 | 5779 |
| 08:00:24 | 5778 |
| 08:00:25 | 5779 |
| 08:00:26 | 5777 |
| 08:00:27 | NULL |
| 08:00:28 | NULL |
| 08:00:29 | 5776 |
+----------+-------+
As you can see, there are some points where no data is available (quote is NULL). What I would like to do is a simple step interpolation: each NULL value should be updated with the last available value. The only way I have managed to do this is with cursors, which is pretty slow due to the large amount of data. I'm basically looking for something like this:
UPDATE table AS t1
SET quote = (SELECT quote FROM table AS t2
WHERE t2.tick < t1.tick AND
t2.quote IS NOT NULL
ORDER BY t2.tick DESC
LIMIT 1)
WHERE quote IS NULL
Of course this query will not work, but this is how it should look.
I would appreciate any ideas on how this can be solved without cursors and temp tables.
This should work:
SET @prev = NULL;
UPDATE ticks
SET quote= @prev := coalesce(quote, @prev)
ORDER BY tick;
BTW the same trick works for reading:
SELECT t.tick, @prev := coalesce(t.quote, @prev)
FROM ticks t
JOIN (SELECT @prev:=NULL) as x -- initializes @prev
ORDER BY tick
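For reading, a variable-free sketch is possible on MySQL 8.0 or later using window functions. COUNT(quote) ignores NULLs, so a running count over tick gives every non-NULL row a new group number that the following NULL rows share; the single non-NULL quote in each group is then spread over the group. This assumes the ticks table used above; the aliases g, grp and filled_quote are only for illustration.

```sql
-- Step interpolation (last observation carried forward), MySQL 8.0+.
SELECT g.tick,
       MAX(g.quote) OVER (PARTITION BY g.grp) AS filled_quote
FROM (
    SELECT tick,
           quote,
           -- running count of non-NULL quotes: increases only on rows with data
           COUNT(quote) OVER (ORDER BY tick) AS grp
    FROM ticks
) AS g
ORDER BY g.tick;
```

Rows before the first non-NULL quote fall into group 0 and stay NULL, since there is no earlier value to carry forward.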
The main problem here is the reference to the main query inside the subquery (t2.tick < t1.tick). Because of this, you can't simply wrap the subquery in another subquery.
If this is a one-time query and there is not much data, you can do something like this:
UPDATE `table` AS t1
SET quote = (SELECT quote FROM (SELECT quote, tick FROM `table` AS t2 WHERE t2.quote IS NOT NULL) as t3 WHERE t3.tick < t1.tick ORDER BY t3.tick DESC LIMIT 1)
WHERE quote IS NULL
But really, really don't use that, as it will probably be too slow. For each NULL quote, this query selects all the data from the table and then picks the desired row from those results.
I would create a (temporary) table with the same layout as your table and run the following two queries:
Insert all interpolations into the temp_stock table
INSERT INTO temp_stock (tick, quote)
-- average of the nearest earlier and nearest later non-NULL quote
SELECT s2.tick
, ((SELECT s1.quote FROM stock s1
WHERE s1.tick < s2.tick AND s1.quote IS NOT NULL
ORDER BY s1.tick DESC LIMIT 1)
+ (SELECT s3.quote FROM stock s3
WHERE s3.tick > s2.tick AND s3.quote IS NOT NULL
ORDER BY s3.tick ASC LIMIT 1)) / 2 as quote
FROM stock s2
WHERE s2.quote IS NULL
Update the stock table with the temp values
UPDATE stock s
INNER JOIN temp_stock ts ON (ts.tick = s.tick) SET s.quote = ts.quote
It does use a temp table (make sure it's a memory table for speed), but it doesn't need a cursor.