Check for overlapping ranges on the database side with COALESCE - MySQL

I can't figure out how to check, on the database side, whether two ranges that can have NULL bounds (e.g. range A: null - null, range B: 3 - 10) overlap.
In this case those two ranges overlap because in my code null - null is treated as -∞ and +∞, so 3 - 10 lies inside -∞ - +∞.
The problem is that I need to build a query that returns all the records from my table stock_rule whose range overlaps the range of the stock_rule record I'm trying to create.
If the count is greater than zero then I can't save the record.
I'm trying to achieve that using the COALESCE function (MySQL 8.0) in this way:
COALESCE(rule.min_price, 0)<=COALESCE(:minPrice, rule.min_price,0) AND
COALESCE(rule.max_price, 0)<=COALESCE(:minPrice, rule.max_price, 0) AND
COALESCE(rule.min_price, 0)<=COALESCE(:maxPrice, rule.min_price,0) AND
COALESCE(rule.max_price, 0)<=COALESCE(:maxPrice, rule.max_price, 0) AND
COALESCE(:minPrice, 0)>=COALESCE(rule.min_price, :minPrice, 0) AND
COALESCE(:maxPrice,0)<=COALESCE(rule.min_price, :maxPrice, 0) AND
COALESCE(:minPrice,0)>=COALESCE(rule.max_price, :minPrice, 0) AND
COALESCE(:maxPrice, 0)<=COALESCE(rule.max_price, :maxPrice, 0)

I guess something like this would work...
DROP TABLE IF EXISTS ranges;
CREATE TABLE ranges
(id SERIAL PRIMARY KEY
,range_start INT NULL
,range_end INT NULL
);
INSERT INTO ranges VALUES
(1,NULL,NULL),
(2,3,10),
(3,12,NULL),
(4,NULL,20),
(5,10,11);
SELECT *
FROM ranges x
JOIN ranges y
  ON y.id <> x.id
 AND COALESCE(x.range_start, 0) <= y.range_end
 AND COALESCE(x.range_end, (SELECT MAX(range_end) FROM ranges)) >= y.range_start;
+----+-------------+-----------+----+-------------+-----------+
| id | range_start | range_end | id | range_start | range_end |
+----+-------------+-----------+----+-------------+-----------+
|  1 |        NULL |      NULL |  2 |           3 |        10 |
|  4 |        NULL |        20 |  2 |           3 |        10 |
|  5 |          10 |        11 |  2 |           3 |        10 |
|  1 |        NULL |      NULL |  5 |          10 |        11 |
|  2 |           3 |        10 |  5 |          10 |        11 |
|  4 |        NULL |        20 |  5 |          10 |        11 |
+----+-------------+-----------+----+-------------+-----------+
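One caveat with the COALESCE approach above: it only guards the left-hand side, so any pair whose right-hand range has a NULL bound silently drops out (notice that rows 1, 3 and 4 never appear as the second range in the output), and using MAX(range_end) as a stand-in for +∞ breaks as soon as a range starts beyond the largest recorded end. A sketch that treats NULL as a genuinely open bound, with no sentinel values at all:

SELECT *
FROM ranges x
JOIN ranges y
  ON y.id <> x.id
 -- two ranges overlap when each starts before the other ends;
 -- a NULL bound means "unbounded", so it can never fail the test
 AND (x.range_start IS NULL OR y.range_end IS NULL OR x.range_start <= y.range_end)
 AND (x.range_end IS NULL OR y.range_start IS NULL OR x.range_end >= y.range_start);

Applied to the stock_rule problem from the question, the conflict count would look something like this, with :minPrice and :maxPrice being the bound parameters of the record to be created:

SELECT COUNT(*)
FROM stock_rule rule
WHERE (rule.min_price IS NULL OR :maxPrice IS NULL OR rule.min_price <= :maxPrice)
  AND (rule.max_price IS NULL OR :minPrice IS NULL OR rule.max_price >= :minPrice);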

Related

Calculations of different columns in a MySQL query

I have a table:
+-----+--------------+--------------+----------+--------------------+---------------+-----------------+
| id  | CustomerName | VideoQuality | IsActive | BufferedTime       | ElapsedTime   | TotalBufferTime |
+-----+--------------+--------------+----------+--------------------+---------------+-----------------+
| 139 | HotStar      | 180          | Yes      | 10.367167126617211 | 30.000000000  | NULL            |
| 140 | HotStar      | 1300         | NULL     | 5.43524230876729   | 34.000000000  | NULL            |
| 141 | HotStar      | 1300         | NULL     | 5.671054515212042  | 38.000000000  | NULL            |
| 142 | HotStar      | 1300         | NULL     | 5.045639532902047  | 41.000000000  | NULL            |
| 143 | HotStar      | 1300         | NULL     | 5.455747718023355  | 44.000000000  | NULL            |
| 144 | HotStar      | 1300         | NULL     | 5.691559924468107  | 49.000000000  | NULL            |
+-----+--------------+--------------+----------+--------------------+---------------+-----------------+
I want to add the columns BufferedTime and ElapsedTime and write the result to the TotalBufferTime column, but I want to skip the first row's BufferedTime.
So the first calculation will be 5.43 + 30.000, the second 5.67 + 34.00, and so on.
I also have a column IsActive which marks the first row of BufferedTime.
I want to do something like this:
update RequestInfo SET `TotalBufferTime` = BufferedTime + ElapsedTime;
The only thing is that I want to skip the first row of the BufferedTime column.
Assuming you have a field id that determines row order in your table, you can use a correlated subquery to pair each row's ElapsedTime with the BufferedTime of the following row, like this:
SELECT t1.CustomerName, t1.VideoQuality, t1.IsActive, t1.BufferedTime,
       t1.ElapsedTime,
       (SELECT t2.BufferedTime
        FROM mytable AS t2
        WHERE t2.id > t1.id
        ORDER BY t2.id LIMIT 1) + t1.ElapsedTime AS TotalBufferTime
FROM mytable AS t1
WHERE IsActive IS NULL
Edit:
To UPDATE you can use the following query:
SET @et = 0;
SET @ElapsedTime = NULL;
UPDATE RequestInfo
SET TotalBufferTime = CASE
        WHEN (@et := @ElapsedTime) < 0 THEN NULL
        WHEN @ElapsedTime := ElapsedTime THEN BufferedTime + @et
    END
ORDER BY id;
The trick here is to use a CASE expression where the first WHEN clause is always evaluated (because it comes first) but is never true. This way the @et variable is initialized with the value of @ElapsedTime, i.e. the ElapsedTime of the previous row.
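If you're on MySQL 8.0, a window function avoids the variable trick entirely; a sketch, assuming id orders the rows as above (the derived table is needed because MySQL won't let an UPDATE read the target table directly in a subquery):

UPDATE RequestInfo r
JOIN (
    SELECT id,
           -- LAG() is NULL on the first row, so its TotalBufferTime stays NULL
           BufferedTime + LAG(ElapsedTime) OVER (ORDER BY id) AS total
    FROM RequestInfo
) prev ON prev.id = r.id
SET r.TotalBufferTime = prev.total;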

MySQL query executes fine, but returns (false) empty result set when using != NULL?

I have the following result set that I'm trying to drill down:
+----+---------+---------------+---------------------+----------------------+---------------+-----------+------------------+------------------+
| id | auth_id | trusts_number | buy_sell_actions_id | corporate_actions_id | fx_actions_id | submitted | created_at       | updated_at       |
+----+---------+---------------+---------------------+----------------------+---------------+-----------+------------------+------------------+
| 2  | 6       | N100723       | 2                   | NULL                 | NULL          | 0         | 08/05/2015 11:30 | 08/05/2015 15:32 |
| 5  | 6       | N100723       | NULL                | NULL                 | 1             | 0         | 08/05/2015 15:10 | 08/05/2015 15:10 |
| 6  | 6       | N100723       | NULL                | NULL                 | 2             | 1         | 08/05/2015 15:12 | 08/05/2015 15:41 |
+----+---------+---------------+---------------------+----------------------+---------------+-----------+------------------+------------------+
This result set is generated with the query
SELECT * FROM actions WHERE auth_id = 6 AND trusts_number = 'N100723'
I also want to exclude any row where fx_actions_id is NULL, so I change the query to
SELECT * FROM actions WHERE auth_id = 6 AND trusts_number = 'N100723' AND fx_actions_id != NULL
However, this returns an empty result set. I've never used "negative" comparisons in MySQL before, so I'm not sure whether they need a different syntax.
Any help would be much appreciated.
Normal comparison operators don't work with NULL. Both Something = NULL and Something != NULL evaluate to 'unknown', which causes the row to be omitted from the result. Use the special operators IS NULL and IS NOT NULL instead:
SELECT * FROM actions
WHERE auth_id = 6
AND trusts_number = 'N100723'
AND fx_actions_id IS NOT NULL
Wikipedia on NULL and its background
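You can see the three-valued logic for yourself in one line (<=> is MySQL's NULL-safe equality operator, another way around the problem):

SELECT 1 = NULL,       -- NULL, not 0: the test is "unknown"
       1 != NULL,      -- NULL as well, so the row is filtered out either way
       1 <=> NULL,     -- 0: the NULL-safe comparison returns a real boolean
       NULL <=> NULL,  -- 1
       1 IS NOT NULL;  -- 1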
Because null isn't a value, you should use IS NOT NULL

Count the rows of a table, while checking if the value is NULL or not and grouping the results by the values of another column

I am stuck trying to find a solution to my silly little problem.
The MySQL table looks as follows:
-- Create a table that will record all AdCamp hits
CREATE TABLE `advertising_campaign_hits` (
`adcamp_hit_id` INT NOT NULL AUTO_INCREMENT,
`adcamp_id` INT,
`customer_id` INT,
`recorded_at` DATETIME,
PRIMARY KEY (`adcamp_hit_id`)
) ENGINE=MyISAM;
Example values would look like this:
a_h_id | a_id | c_id | ...
     1 |    1 |    1 | ...
     2 |    1 |    2 | ...
     3 |    1 |    3 | ...
     4 |    1 |    0 | ...
     5 |    1 |    0 | ...
     6 |    2 |    1 | ...
     7 |    2 |    0 | ...
The goal here is to count the number of hits for each of the advertising campaigns, but divide them into two groups of KnownCustomers and UnknownCustomers and then further divide them each by adcamp_id.
So the results I would expect to get are:
adcamp_id | HitsByKnown | HitsByUnknown
        1 |           3 |             2
        2 |           1 |             1
I am currently stuck at the point where I can get SQL to give me a row for each of the adcamps, but COUNT(*) counts all of my entries.
So what I get is:
adcamp_id | HitsByKnown | HitsByUnknown
        1 |           4 |             3
        2 |           4 |             3
I can't figure out how to split it all up.
select adcamp_id,
       sum(if(customer_id > 0, 1, 0)) as HitsByKnown,
       sum(if(customer_id = 0, 1, 0)) as HitsByUnknown
from advertising_campaign_hits
group by adcamp_id
or even easier:
select adcamp_id,
       sum(customer_id != 0) as HitsByKnown,
       sum(customer_id = 0) as HitsByUnknown
from advertising_campaign_hits
group by adcamp_id
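One caveat, since customer_id is declared without NOT NULL: a NULL customer_id satisfies neither customer_id != 0 nor customer_id = 0 (both comparisons yield NULL, which sum() skips), so those rows land in neither bucket. If NULLs should count as unknown customers, which is an assumption about your data, a variant:

select adcamp_id,
       sum(customer_id > 0) as HitsByKnown,
       sum(coalesce(customer_id, 0) = 0) as HitsByUnknown
from advertising_campaign_hits
group by adcamp_id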

Populate values from one table

I am trying to populate an empty table (t) from another table (t2) based on a flag field being set. Here is my attempt below, along with the table data.
UPDATE 2014PriceSheetIssues AS t
JOIN TransSalesAvebyType2013Combined AS t2
SET t.`Tran_Type`=t2.`Tran_Type` WHERE t.`rflag`='1';
When I run the script, I receive zero (0) records affected. Why?
+-----------+----------------+-------------------+-------+-------+
| Tran_Type | RetailAvePrice | WholesaleAvePrice | Rflag | Wflag |
+-----------+----------------+-------------------+-------+-------+
| 125C      |            992 |               650 |     1 | NULL  |
| 2004R     |           1500 |              NULL |     1 | NULL  |
| 4EAT      |           1480 |              1999 |     1 |     1 |
+-----------+----------------+-------------------+-------+-------+
I think you should just do the following (note that INSERT ... VALUES (SELECT ...) is not valid syntax; INSERT ... SELECT is the form you want, and a table name that starts with a digit has to be quoted in backticks):
INSERT INTO `2014PriceSheetIssues`
( `fldX`, `fldY` )
SELECT `fldX`, `fldY`
FROM TransSalesAvebyType2013Combined
WHERE `Rflag` = '1';
The SELECT picks the flagged rows from the source table and the INSERT adds them to the (empty) other table. That also explains the zero records affected: an UPDATE can only modify rows that already exist, and your target table is empty.
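If the target table weren't empty and you really did want an UPDATE, the statement would at least need an ON clause; a sketch, assuming Tran_Type is the common key and the price columns are what you want copied (both guesses based on the sample data):

UPDATE `2014PriceSheetIssues` AS t
JOIN TransSalesAvebyType2013Combined AS t2
  ON t.`Tran_Type` = t2.`Tran_Type`   -- the original statement had no ON clause
SET t.`RetailAvePrice` = t2.`RetailAvePrice`
WHERE t2.`Rflag` = '1';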

JOIN query is far too slow. Won't use INDEX?

I have a transitional table that I temporarily fill with some values before querying it and destroying it.
CREATE TABLE SearchListA(
`pTime` int unsigned NOT NULL ,
`STD` double unsigned NOT NULL,
`STD_Pos` int unsigned NOT NULL,
`SearchEnd` int unsigned NOT NULL,
UNIQUE INDEX (`pTime`,`STD` ASC) USING BTREE
) ENGINE = MEMORY;
It looks like this:
+------------+-------------+---------+------------+
| pTime      | STD         | STD_Pos | SearchEnd  |
+------------+-------------+---------+------------+
| 1105715400 |  1.58474499 |       0 | 1105723200 |
| 1106297700 |   2.5997839 |       0 | 1106544000 |
| 1107440400 |  2.04860375 |       0 | 1107440700 |
| 1107440700 |  1.58864998 |       0 | 1107467400 |
| 1107467400 |  1.55207218 |       0 | 1107790500 |
| 1107790500 |  2.04239417 |       0 | 1108022100 |
| 1108022100 |  1.61385678 |       0 | 1108128000 |
| 1108771500 |  1.58835083 |       0 | 1108771800 |
| 1108771800 |  1.65734727 |       0 | 1108772100 |
| 1108772100 |  2.09378189 |       0 | 1109027700 |
+------------+-------------+---------+------------+
Only the columns pTime and SearchEnd are relevant to my problem.
My intention is to use this table to speed up searching through a much larger, static table.
The first column, pTime, is where the search should start; the fourth column, SearchEnd, is where the search should end.
The larger table is similar; it looks like this:
CREATE TABLE `b50d1_abs` (
`pTime` int(10) unsigned NOT NULL,
`Slope` double NOT NULL,
`STD` double NOT NULL,
`Slope_Pos` int(11) NOT NULL,
`STD_Pos` int(11) NOT NULL,
PRIMARY KEY (`pTime`),
KEY `Slope` (`Slope`) USING BTREE,
KEY `STD` (`STD`),
KEY `ID1` (`pTime`,`STD`) USING BTREE
) ENGINE=MyISAM DEFAULT CHARSET=latin1 MIN_ROWS=339331 MAX_ROWS=539331 PACK_KEYS=1 ROW_FORMAT=FIXED;
+------------+-------------+------------+-----------+---------+
| pTime      | Slope       | STD        | Slope_Pos | STD_Pos |
+------------+-------------+------------+-----------+---------+
| 1107309300 |  1.63257919 | 1.39241698 |         0 |       1 |
| 1107314400 |   6.8959276 | 0.22425643 |         1 |       1 |
| 1107323100 | 18.19909502 | 1.46854808 |         1 |       0 |
| 1107335400 |  2.50135747 |  0.4736305 |         0 |       0 |
| 1107362100 |  4.28778281 | 0.85576985 |         0 |       1 |
| 1107363300 |  6.96289593 | 1.41299044 |         0 |       0 |
| 1107363900 |  8.10316742 |  0.2859726 |         0 |       0 |
| 1107367500 | 16.62443439 | 0.61587645 |         0 |       0 |
| 1107368400 | 19.37918552 | 1.18746968 |         0 |       0 |
| 1107369300 | 21.94570136 | 0.94261744 |         0 |       0 |
| 1107371400 | 25.85701357 |  0.2741292 |         0 |       1 |
| 1107375300 | 21.98914027 | 1.59521158 |         0 |       1 |
| 1107375600 | 20.80542986 | 1.59231289 |         0 |       1 |
| 1107375900 | 19.62714932 | 1.50661679 |         0 |       1 |
| 1107381900 |  8.23167421 | 0.98048205 |         1 |       1 |
| 1107383400 | 10.68778281 | 1.41607579 |         1 |       0 |
+------------+-------------+------------+-----------+---------+
...etc (439340 rows)
Here, the columns pTime, STD, and STD_Pos are relevant to my problem.
For every element in the smaller table (SearchListA), I need to search the specified range within the larger table (b50d1_abs) and return the row with the lowest b50d1_abs.pTime that is higher than the current SearchListA.pTime and that also matches the following conditions:
SearchListA.STD < b50d1_abs.STD AND SearchListA.STD_Pos <> b50d1_abs.STD_Pos
AND
b50d1_abs.pTime < SearchListA.SearchEnd
The latter condition is simply to reduce the length of the search.
This seems to me like a pretty straightforward query that should be able to use indexes, especially since all values are unsigned numbers, but I cannot get it to execute nearly fast enough! I think it is because it rebuilds the entire table each time instead of just omitting values from it.
I would be extremely grateful if someone takes a look at my code and figures out a more efficient way to go about this:
SELECT
    m.pTime AS OpenTime,
    m.STD,
    m.STD_Pos,
    mu.pTime AS CloseTime
FROM
    SearchListA m
JOIN b50d1_abs mu ON mu.pTime = (
    SELECT
        md.pTime
    FROM
        b50d1_abs AS md
    WHERE
        md.pTime > m.pTime
        AND md.pTime <= m.SearchEnd
        AND m.STD < md.STD AND m.STD_Pos <> md.STD_Pos
    LIMIT 1
);
Here is the EXPLAIN EXTENDED output:
+----+--------------------+-------+--------+-----------------+---------+---------+------+--------+----------+--------------------------+
| id | select_type        | table | type   | possible_keys   | key     | key_len | ref  | rows   | filtered | Extra                    |
+----+--------------------+-------+--------+-----------------+---------+---------+------+--------+----------+--------------------------+
|  1 | PRIMARY            | m     | ALL    | NULL            | NULL    | NULL    | NULL |    365 |   100.00 |                          |
|  1 | PRIMARY            | mu    | eq_ref | PRIMARY,ID1     | PRIMARY | 4       | func |      1 |   100.00 | Using where; Using index |
|  2 | DEPENDENT SUBQUERY | md    | ALL    | PRIMARY,STD,ID1 | NULL    | NULL    | NULL | 439340 |   100.00 | Using where              |
+----+--------------------+-------+--------+-----------------+---------+---------+------+--------+----------+--------------------------+
It looks like the lengthiest part of the query (#2, the dependent subquery) doesn't use indexes at all!
If I try FORCE INDEX then the index is listed under possible_keys, but key still shows NULL and the query still takes an extremely long time (over 80 seconds).
I need to get this query under 10 seconds, and even 10 is too long.
Your subquery is a dependent subquery, so the best case is that it's going to be evaluated once for every row in table m. Since m contains few rows, that would be OK.
But if you put that subquery in a JOIN condition, it is going to be executed (rows in m) * (rows in mu) times, no matter what.
Note that your results may be incorrect, since you ask to "return the row with the lowest b50d1_abs.pTime" but you don't specify that anywhere in the query: there is no ORDER BY before the LIMIT 1, so you get an arbitrary matching row.
Try this query:
SELECT
    m.pTime AS OpenTime,
    m.STD,
    m.STD_Pos,
    (
        SELECT MIN( big.pTime )
        FROM b50d1_abs AS big
        WHERE big.pTime > m.pTime
          AND big.pTime <= m.SearchEnd
          AND m.STD < big.STD AND m.STD_Pos <> big.STD_Pos
    ) AS CloseTime
FROM SearchListA m
or this one:
SELECT
    m.pTime AS OpenTime,
    m.STD,
    m.STD_Pos,
    MIN( big.pTime )
FROM
    SearchListA m
JOIN b50d1_abs AS big ON (
    big.pTime > m.pTime
    AND big.pTime <= m.SearchEnd
    AND m.STD < big.STD AND m.STD_Pos <> big.STD_Pos
)
GROUP BY m.pTime
(if you also want rows where the search was unsuccessful, make that a LEFT JOIN).
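For completeness, that LEFT JOIN variant would look something like this (same query as above, just keeping the rows where the search found nothing; CloseTime comes back NULL for them):

SELECT
    m.pTime AS OpenTime,
    m.STD,
    m.STD_Pos,
    MIN( big.pTime ) AS CloseTime      -- NULL when no row matched
FROM
    SearchListA m
LEFT JOIN b50d1_abs AS big ON (
    big.pTime > m.pTime
    AND big.pTime <= m.SearchEnd
    AND m.STD < big.STD AND m.STD_Pos <> big.STD_Pos
)
GROUP BY m.pTime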
Or, keeping the LIMIT 1 approach but with an explicit ORDER BY:
SELECT
    m.pTime AS OpenTime,
    m.STD,
    m.STD_Pos,
    (
        SELECT big.pTime
        FROM b50d1_abs AS big
        WHERE big.pTime > m.pTime
          AND big.pTime <= m.SearchEnd
          AND m.STD < big.STD AND m.STD_Pos <> big.STD_Pos
        ORDER BY big.pTime LIMIT 1
    ) AS CloseTime
FROM SearchListA m
(Try an index on b50d1_abs(pTime, STD, STD_Pos).)
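For reference, that index could be added like so (ID2 is just a name here; it happens to be the one that shows up in the EXPLAIN output further down):

ALTER TABLE b50d1_abs ADD INDEX ID2 (pTime, STD, STD_Pos);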
FYI here are some tests using Postgres on a test data set that should look like yours (maybe remotely, lol)
CREATE TABLE small (
pTime INT PRIMARY KEY,
STD FLOAT NOT NULL,
STD_POS BOOL NOT NULL,
SearchEnd INT NOT NULL
);
CREATE TABLE big(
pTime INTEGER PRIMARY KEY,
Slope FLOAT NOT NULL,
STD FLOAT NOT NULL,
Slope_Pos BOOL NOT NULL,
STD_POS BOOL NOT NULL
);
INSERT INTO small SELECT
n*100000,
random(),
random()<0.1,
n*100000+random()*50000
FROM generate_series( 1, 365 ) n;
INSERT INTO big SELECT
n*100,
random(),
random(),
random() > 0.5,
random() > 0.5
FROM generate_series( 1, 500000 ) n;
Query 1: 6.90 ms (yes, milliseconds)
Query 2: 48.20 ms
Query 3: 6.46 ms
I'll start a new answer cause it starts to look like a mess ;)
With your data I get, using MySQL 5.1.41:
Query 1: takes forever, Ctrl-C
Query 2: 520 ms
Query 3: takes forever, Ctrl-C
The EXPLAIN for query 2 looks good:
+----+-------------+-------+------+---------------------+------+---------+------+--------+------------------------------------------------+
| id | select_type | table | type | possible_keys       | key  | key_len | ref  | rows   | Extra                                          |
+----+-------------+-------+------+---------------------+------+---------+------+--------+------------------------------------------------+
|  1 | SIMPLE      | m     | ALL  | PRIMARY,STD,ID1,ID2 | NULL | NULL    | NULL |    743 | Using temporary; Using filesort                |
|  1 | SIMPLE      | big   | ALL  | PRIMARY,ID1,ID2     | NULL | NULL    | NULL | 439340 | Range checked for each record (index map: 0x7) |
+----+-------------+-------+------+---------------------+------+---------+------+--------+------------------------------------------------+
So, I loaded your data into postgres...
Query 1: 14.8 ms
Query 2: 100 ms
Query 3: 14.8 ms (same plan as 1)
In fact rewriting 2 as query 1 (or 3) fixes a little optimizer shortcoming and finds the optimal query plan for this scenario.
Would you recommend using Postgres over MySql for this scenario?
Speed is extremely important to me.
Well, I don't know why mysql barfs so much on queries 1 and 3 (which are pretty simple and easy); in fact it should even beat postgres here (using an index-only scan), but apparently not, eh. You should ask a mysql specialist!
I'm more used to postgres... got fed up with mysql a long time ago! If you need complex queries postgres usually wins big time (but you'll need to re-learn how to optimize and tune your new database)...