Efficient assignment of percentile/rank in MySQL

I have a couple of very large tables (over 400,000 rows) that look like the following:
+---------+--------+---------------+
| ID | M1 | M1_Percentile |
+---------+--------+---------------+
| 3684514 | 3.2997 | NULL |
| 3684515 | 3.0476 | NULL |
| 3684516 | 2.6499 | NULL |
| 3684517 | 0.3585 | NULL |
| 3684518 | 1.6919 | NULL |
| 3684519 | 2.8515 | NULL |
| 3684520 | 4.0728 | NULL |
| 3684521 | 4.0224 | NULL |
| 3684522 | 5.8207 | NULL |
| 3684523 | 6.8291 | NULL |
+---------+--------+---------------+
...about 400,000 more
I need to assign each row's M1_Percentile column a value that represents "the percentage of rows with M1 values equal to or lower than the current row's M1 value".
In other words, I need:
I implemented this successfully, but it is FAR FAR too slow. If anyone could create a more efficient version of the following code, I would really appreciate it!
UPDATE myTable AS X JOIN (
SELECT
s1.ID, COUNT(s2.ID)/ (SELECT COUNT(*) FROM myTable) * 100 AS percentile
FROM
myTable s1 JOIN myTable s2 on (s2.M1 <= s1.M1)
GROUP BY s1.ID
ORDER BY s1.ID) AS Z
ON (X.ID = Z.ID)
SET X.M1_Percentile = Z.percentile;
This is the (correct but slow) result from the above query if the number of rows is limited to the ones you see (10 rows):
+---------+--------+---------------+
| ID | M1 | M1_Percentile |
+---------+--------+---------------+
| 3684514 | 3.2997 | 60 |
| 3684515 | 3.0476 | 50 |
| 3684516 | 2.6499 | 30 |
| 3684517 | 0.3585 | 10 |
| 3684518 | 1.6919 | 20 |
| 3684519 | 2.8515 | 40 |
| 3684520 | 4.0728 | 80 |
| 3684521 | 4.0224 | 70 |
| 3684522 | 5.8207 | 90 |
| 3684523 | 6.8291 | 100 |
+---------+--------+---------------+
Producing the same results for all 400,000 rows takes orders of magnitude longer.

I cannot test this, but you could try something like:
update myTable t
set M1_Percentile = (
select count(*)
from myTable t1
where t1.M1 < t.M1) * 100 / (
select count(*)
from myTable);
UPDATE:
update test t
set m1_pc = (
(select count(*) from test t1 where t1.M1 < t.M1) * 100 /
( select count(*) from test));
This works in Oracle (the only database I have available). I do remember getting that error in MySQL. It is very annoying.

Fair warning: MySQL isn't my native environment. However, after a little research, I think the following query should be workable:
UPDATE myTable AS X
JOIN (
SELECT X.ID, (
SELECT COUNT(*)
FROM myTable X1
WHERE (X.M1, X.ID) >= (X1.M1, X1.ID)) AS RowNum
FROM myTable AS X
) AS RowRank
ON (X.ID = RowRank.ID)
CROSS JOIN (
SELECT COUNT(*) AS TotalCount
FROM myTable
) AS TotalCount
SET X.M1_Percentile = RowRank.RowNum / TotalCount.TotalCount * 100;
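If window functions are available (MySQL 8.0 and later provide CUME_DIST()), the self-join can probably be avoided altogether, because the percentile is then computed in a single sorted pass. The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, loads the ten sample rows from the question, and writes the result back with a plain UPDATE per row. On MySQL you would more likely join the windowed SELECT as a derived table inside the UPDATE.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption made for runnability).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (ID INTEGER PRIMARY KEY, M1 REAL, M1_Percentile REAL)"
)
rows = [
    (3684514, 3.2997), (3684515, 3.0476), (3684516, 2.6499), (3684517, 0.3585),
    (3684518, 1.6919), (3684519, 2.8515), (3684520, 4.0728), (3684521, 4.0224),
    (3684522, 5.8207), (3684523, 6.8291),
]
conn.executemany("INSERT INTO myTable (ID, M1) VALUES (?, ?)", rows)

# CUME_DIST() = (rows with M1 <= current M1) / (total rows).
ranked = conn.execute(
    "SELECT ID, CUME_DIST() OVER (ORDER BY M1) * 100 AS percentile FROM myTable"
).fetchall()

# Write the computed values back, one UPDATE per row.
conn.executemany(
    "UPDATE myTable SET M1_Percentile = ? WHERE ID = ?",
    [(round(p, 4), row_id) for row_id, p in ranked],
)

percentiles = dict(conn.execute("SELECT ID, M1_Percentile FROM myTable"))
print(percentiles)
```

Unlike the tuple comparison above, CUME_DIST() gives tied M1 values the same percentile, which matches the "equal or lower" definition in the question.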

Related

Getting NonUniqueDiscoveredSqlAliasException

The table structure is as follows:
+---------------+----------+-------------------+-------------+---------+--------+-------------------------+----------------+----------------+-----------------------+----------------------+---------------------+-------------------+--------------+
| REDEMPTION_ID | CCID | MEMBERSHIP_NUMBER | POINTS_TYPE | PARTNER | SCHEME | REDEMPTION_ORDER_STATUS | MEMBER_SEGMENT | PARTNER_POINTS | MEMBERSHIP_FIRST_NAME | MEMBERSHIP_LAST_NAME | REDEMPTION_DATE | OUTBOUND_FILENAME | PRODUCT_TYPE |
+---------------+----------+-------------------+-------------+---------+--------+-------------------------+----------------+----------------+-----------------------+----------------------+---------------------+-------------------+--------------+
| 1003740 | 21212103 | 1231237 | BASE | QANTAS | Visa | ORDERED | LEGACY | 5000.000000 | e | Name | 2017-10-23 10:26:51 | NABQF05P.012 | CONSUMER |
| 1003741 | 21212103 | 1231238 | BONUS | QANTAS | Visa | ORDERED | LEGACY | 2500.000000 | e | Name | 2017-10-23 10:26:51 | NABQF05P.012 | CONSUMER |
+---------------+----------+-------------------+-------------+---------+--------+-------------------------+----------------+----------------+-----------------------+----------------------+---------------------+-------------------+--------------+
I want to combine the above rows, based on the columns MEMBERSHIP_NUMBER and POINTS_TYPE, so that the result is a single row.
I am using the following query:
select * from ((
select * from NAB_REDEMPTION_DETAILS
where PARTNER='QANTAS' and REDEMPTION_ORDER_STATUS IN ('PLACED','RESEND') and POINTS_TYPE = 'BASE' group by MEMBERSHIP_NUMBER) a
left OUTER JOIN (
select * from NAB_REDEMPTION_DETAILS
where PARTNER='QANTAS' and REDEMPTION_ORDER_STATUS IN ('PLACED','RESEND') and POINTS_TYPE = 'BONUS' group by MEMBERSHIP_NUMBER) b on a.MEMBERSHIP_NUMBER=b.MEMBERSHIP_NUMBER) union (
select * from (
select * from NAB_REDEMPTION_DETAILS
where PARTNER='QANTAS' and REDEMPTION_ORDER_STATUS IN ('PLACED','RESEND') and POINTS_TYPE = 'BASE' group by MEMBERSHIP_NUMBER) c right OUTER JOIN (
select * from NAB_REDEMPTION_DETAILS
where PARTNER='QANTAS' and REDEMPTION_ORDER_STATUS IN ('PLACED','RESEND') and POINTS_TYPE = 'BONUS' group by MEMBERSHIP_NUMBER) d on c.MEMBERSHIP_NUMBER=d.MEMBERSHIP_NUMBER)
When executing the query, I get the above-mentioned exception.
I figured it out. Because I was joining the table to itself with select *, and REDEMPTION_ID is the primary key, the result contained duplicate column names, which caused the exception. I used group_concat instead of a join, and that solved my problem.
The solution is at http://sqlfiddle.com/#!9/dc59a1/5
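The linked fiddle is not reproduced here. As a rough sketch of the idea only, grouping by MEMBERSHIP_NUMBER and aggregating yields one row per member with uniquely named columns, so no self-join (and no duplicate aliases) is needed. The rows below are hypothetical, the table is cut down to the columns the query touches, and Python's sqlite3 module stands in for MySQL (both provide GROUP_CONCAT).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE NAB_REDEMPTION_DETAILS (
    REDEMPTION_ID INTEGER PRIMARY KEY,
    MEMBERSHIP_NUMBER INTEGER,
    POINTS_TYPE TEXT,
    PARTNER TEXT,
    REDEMPTION_ORDER_STATUS TEXT,
    PARTNER_POINTS REAL)""")

# Hypothetical rows: member 111 has BASE and BONUS points, member 222 only BASE.
conn.executemany(
    "INSERT INTO NAB_REDEMPTION_DETAILS VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 111, "BASE", "QANTAS", "PLACED", 5000.0),
        (2, 111, "BONUS", "QANTAS", "PLACED", 2500.0),
        (3, 222, "BASE", "QANTAS", "RESEND", 1000.0),
    ],
)

# One output row per member; every output column has a distinct alias.
result = conn.execute("""
    SELECT MEMBERSHIP_NUMBER,
           GROUP_CONCAT(REDEMPTION_ID) AS redemption_ids,
           SUM(CASE WHEN POINTS_TYPE = 'BASE'  THEN PARTNER_POINTS ELSE 0 END) AS base_points,
           SUM(CASE WHEN POINTS_TYPE = 'BONUS' THEN PARTNER_POINTS ELSE 0 END) AS bonus_points
    FROM NAB_REDEMPTION_DETAILS
    WHERE PARTNER = 'QANTAS'
      AND REDEMPTION_ORDER_STATUS IN ('PLACED', 'RESEND')
    GROUP BY MEMBERSHIP_NUMBER
    ORDER BY MEMBERSHIP_NUMBER
""").fetchall()
print(result)
```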

Performant way to self-join and filter by revised rows

I'm trying to select all rows in this table, with the constraint that revised rows are selected instead of the original ones. So, if a row has a revision, that revision is selected instead of that row; if there are multiple revisions, the one with the highest revision number is preferred.
I think an example table, output, and query will explain this better:
Table:
+----+-------+-------------+-----------------+-------------+
| id | value | original_id | revision_number | is_revision |
+----+-------+-------------+-----------------+-------------+
| 1 | abcd | null | null | 0 |
| 2 | zxcv | null | null | 0 |
| 3 | qwert | null | null | 0 |
| 4 | abd | 1 | 1 | 1 |
| 5 | abcde | 1 | 2 | 1 |
| 6 | zxcvb | 2 | 1 | 1 |
| 7 | poiu | null | null | 0 |
+----+-------+-------------+-----------------+-------------+
Desired Output:
+----+-------+-------------+-----------------+
| id | value | original_id | revision_number |
+----+-------+-------------+-----------------+
| 3 | qwert | null | null |
| 5 | abcde | 1 | 2 |
| 6 | zxcvb | 2 | 1 |
| 7 | poiu | null | null |
+----+-------+-------------+-----------------+
View Called revisions_max:
SELECT
responses.original_id AS original_id,
MAX(responses.revision_number) AS revision
FROM
responses
WHERE
original_id IS NOT NULL
GROUP BY responses.original_id
My Current Query:
SELECT
responses.*
FROM
responses
WHERE
id NOT IN (
SELECT
original_id
FROM
revisions_max
)
AND
is_revision = 0
UNION
SELECT
responses.*
FROM
responses
INNER JOIN revisions_max ON revisions_max.original_id = responses.original_id
AND revisions_max.revision = responses.revision_number
This query works, but takes 0.06 seconds to run on a table of only 2000 rows. The table will quickly grow to tens or hundreds of thousands of rows. The query under the UNION is what takes most of the time.
What can I do to improve this query's performance?
How about using coalesce()?
SELECT COALESCE(y.id, x.id) AS id,
COALESCE(y.value, x.value) AS value,
COALESCE(y.original_id, x.original_id) AS original_id,
COALESCE(y.revision_number, x.revision_number) AS revision_number
FROM responses x
LEFT JOIN (SELECT r1.*
FROM responses r1
INNER JOIN (SELECT responses.original_id AS original_id,
MAX(responses.revision_number) AS revision
FROM responses
WHERE original_id IS NOT NULL
GROUP BY responses.original_id) rev
ON r1.original_id = rev.original_id
AND r1.revision_number = rev.revision) y
ON x.id = y.original_id
WHERE y.id IS NOT NULL
OR x.original_id IS NULL;
The approach I would take with any other DBMS is to use NOT EXISTS:
SELECT r1.*
FROM Responses AS r1
WHERE NOT EXISTS
( SELECT 1
FROM Responses AS r2
WHERE r2.original_id = COALESCE(r1.original_id, r1.id)
AND r2.revision_number > COALESCE(r1.revision_number, 0)
);
This removes any rows where a higher revision number exists for the same id (or original_id if it is populated). However, in MySQL, LEFT JOIN/IS NULL will perform better than NOT EXISTS [1]. As such I would rewrite the above as:
SELECT r1.*
FROM Responses AS r1
LEFT JOIN Responses AS r2
ON r2.original_id = COALESCE(r1.original_id, r1.id)
AND r2.revision_number > COALESCE(r1.revision_number, 0)
WHERE r2.id IS NULL;
Example on DBFiddle
I realise that you have said that you don't want to use LEFT JOIN and check for nulls, but I don't see that there is a better solution.
[1] At least this was the case historically; I don't actively use MySQL, so I don't keep up to date with developments in the optimiser.
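To check the LEFT JOIN/IS NULL form against the sample table from the question, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL. The composite index and its name are assumptions, not something the question mentions; on a large table an index on (original_id, revision_number) is likely what lets the join avoid a full scan per row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE responses (
    id INTEGER PRIMARY KEY,
    value TEXT,
    original_id INTEGER,
    revision_number INTEGER,
    is_revision INTEGER)""")
conn.executemany(
    "INSERT INTO responses VALUES (?, ?, ?, ?, ?)",
    [
        (1, "abcd", None, None, 0),
        (2, "zxcv", None, None, 0),
        (3, "qwert", None, None, 0),
        (4, "abd", 1, 1, 1),
        (5, "abcde", 1, 2, 1),
        (6, "zxcvb", 2, 1, 1),
        (7, "poiu", None, None, 0),
    ],
)
# Hypothetical index to support the join condition below.
conn.execute(
    "CREATE INDEX idx_responses_orig_rev ON responses (original_id, revision_number)"
)

# Anti-join: keep r1 only if no row r2 is a higher revision of the same original.
kept = conn.execute("""
    SELECT r1.id, r1.value, r1.original_id, r1.revision_number
    FROM responses AS r1
    LEFT JOIN responses AS r2
           ON r2.original_id = COALESCE(r1.original_id, r1.id)
          AND r2.revision_number > COALESCE(r1.revision_number, 0)
    WHERE r2.id IS NULL
    ORDER BY r1.id
""").fetchall()
kept_ids = [row[0] for row in kept]
print(kept_ids)
```

The ids returned are 3, 5, 6 and 7, which is the desired output listed in the question.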

Getting the number of products in all categories and parent categories that contain a keyword

I am trying to fetch all the categories, each with its count (the number of products in that category), for the products that match a keyword. The query I tried doesn't give me the correct result.
I also want the parent categories up to level 1, with their counts.
E.g. if I search with the keyword watch, the category "watches" should be returned with some count, along with the parent category "accessories" with the sum of its descendant categories' counts.
My table structures are:
tblProducts: A product can have up to 5 categories: fldCategoryId1, fldCategoryId2, fldCategoryId3, fldCategoryId4 and fldCategoryId5. fldProductStatus should be 'A'.
+-----------------------------+-------------------+
| Field | Type |
+-----------------------------+-------------------+
| fldUniqueId | bigint(20) |
| fldCategoryId1 | bigint(20) |
| fldCategoryId2 | bigint(20) |
| fldCategoryId3 | bigint(20) |
| fldCategoryId4 | bigint(20) |
| fldCategoryId5 | bigint(20) |
| fldProductStatus | enum('A','P','D') |
| fldForSearch | longtext |
+-----------------------------+-------------------+
tblCategory:
+------------------------------+-----------------------+
| Field | Type |
+------------------------------+-----------------------+
| fldCategoryId | bigint(20) |
| fldCategoryName | varchar(128) |
| fldCategoryParent | int(11) |
| fldCategoryLevel | enum('0','1','2','3') |
| fldCategoryActive | enum('Y','N') |
+------------------------------+-----------------------+
Search Query:
SELECT count( c.fldCategoryId ) AS cnt, c.fldCategoryLevel, c.fldCategoryParent, c.fldCategoryId, c.fldCategoryName, p.fldForSearch, c.fldCategoryParent
FROM tblCategory c, tblProducts p
WHERE (
c.fldCategoryId = p.fldCategoryId1
OR c.fldCategoryId = p.fldCategoryId2
OR c.fldCategoryId = p.fldCategoryId3
OR c.fldCategoryId = p.fldCategoryId4
OR c.fldCategoryId = p.fldCategoryId5
)
AND p.fldProductStatus = 'A'
AND (
MATCH ( p.fldForSearch )
AGAINST (
'+(watches watch)'
IN BOOLEAN MODE
)
)
GROUP BY c.fldCategoryId
Note: The table uses the InnoDB engine and has a FULLTEXT index on the 'fldForSearch' column.
EDIT: sample data can be found in sqlfiddle
I'm not sure what you mean by:
Also I want the parent categories till level 1 and their count as well.
But the following query will show you a count for each category (including those with 0 found products), and a general rollup:
SELECT
c.fldCategoryId,
c.fldCategoryLevel,
c.fldCategoryName,
COUNT( p.fldUniqueId ) AS cnt
FROM tblCategory c
LEFT JOIN tblProducts p ON
(c.fldCategoryId = p.fldCategoryId1
OR c.fldCategoryId = p.fldCategoryId2
OR c.fldCategoryId = p.fldCategoryId3
OR c.fldCategoryId = p.fldCategoryId4
OR c.fldCategoryId = p.fldCategoryId5)
AND p.fldProductStatus = 'A'
AND MATCH ( p.fldForSearch )
AGAINST (
'+(watches watch)'
IN BOOLEAN MODE
)
GROUP BY
c.fldCategoryId,
c.fldCategoryLevel,
c.fldCategoryName
WITH ROLLUP;
Notes:
You cannot select p.fldForSearch if you expect a count of all the products in the category. fldForSearch is per product, so selecting it defeats the purpose of grouping.
I left joined with products so that categories with 0 products matching your keywords are returned. If you don't want this, just remove the LEFT keyword.
I haven't checked the MATCH condition; I assume it's correct.
Start by not splaying an array (fldCategoryId...) across columns. Instead, add a new table.
Once you have done that, the queries change, such as getting rid of OR clauses.
Hopefully, any further issues will fall into place.
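As a rough illustration of that suggestion, the five fldCategoryId columns could move into a mapping table with one row per (product, category) pair. The table name tblProductCategory and the sample rows are hypothetical, the full-text condition is left out, and Python's sqlite3 module stands in for MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblCategory (fldCategoryId INTEGER PRIMARY KEY, fldCategoryName TEXT);
    CREATE TABLE tblProducts (fldUniqueId INTEGER PRIMARY KEY, fldProductStatus TEXT);
    -- Hypothetical mapping table replacing fldCategoryId1..fldCategoryId5
    CREATE TABLE tblProductCategory (
        fldUniqueId INTEGER,
        fldCategoryId INTEGER,
        PRIMARY KEY (fldUniqueId, fldCategoryId));
    INSERT INTO tblCategory VALUES (5, 'Watch'), (6, 'Clock');
    INSERT INTO tblProducts VALUES (100, 'A'), (101, 'A'), (102, 'P');
    INSERT INTO tblProductCategory VALUES (100, 5), (101, 5), (101, 6), (102, 6);
""")

# The OR chain over five columns becomes a plain equality join.
counts = conn.execute("""
    SELECT c.fldCategoryId, c.fldCategoryName, COUNT(p.fldUniqueId) AS cnt
    FROM tblCategory c
    LEFT JOIN tblProductCategory pc ON pc.fldCategoryId = c.fldCategoryId
    LEFT JOIN tblProducts p ON p.fldUniqueId = pc.fldUniqueId
                           AND p.fldProductStatus = 'A'
    GROUP BY c.fldCategoryId, c.fldCategoryName
    ORDER BY c.fldCategoryId
""").fetchall()
print(counts)
```

Product 102 has status 'P', so it is not counted for category 6.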
Since your category tree has a fixed height (4 levels), you can create a transitive closure table on the fly with
SELECT c1.fldCategoryId AS descendantId, c.fldCategoryId AS ancestorId
FROM tblcategory c1
LEFT JOIN tblcategory c2 ON c2.fldCategoryId = c1.fldCategoryParent
LEFT JOIN tblcategory c3 ON c3.fldCategoryId = c2.fldCategoryParent
JOIN tblcategory c ON c.fldCategoryId IN (
c1.fldCategoryId,
c1.fldCategoryParent,
c2.fldCategoryParent,
c3.fldCategoryParent
)
The result will look like
| descendantId | ancestorId |
|--------------|------------|
| 1 | 1 |
| 2 | 1 |
| 2 | 2 |
| ... | ... |
| 5 | 1 |
| 5 | 2 |
| 5 | 5 |
| ... | ... |
You can now use it in a subquery (derived table) to join it with products using descendantId and with categories using ancestorId. That means that a product from category X will be indirectly associated with all ancestors of X (as well as with X). For example: Category 5 is a child of 2 - and 2 is a child of 1. So all products from category 5 must be counted for categories 5, 2 and 1.
Final query:
SELECT c.*, coalesce(sub.cnt, 0) as cnt
FROM tblCategory c
LEFT JOIN (
SELECT tc.ancestorId, COUNT(DISTINCT p.fldUniqueId) AS cnt
FROM tblProducts p
JOIN (
SELECT c1.fldCategoryId AS descendantId, c.fldCategoryId AS ancestorId
FROM tblcategory c1
LEFT JOIN tblcategory c2 ON c2.fldCategoryId = c1.fldCategoryParent
LEFT JOIN tblcategory c3 ON c3.fldCategoryId = c2.fldCategoryParent
JOIN tblcategory c ON c.fldCategoryId IN (
c1.fldCategoryId,
c1.fldCategoryParent,
c2.fldCategoryParent,
c3.fldCategoryParent
)
) tc ON tc.descendantId IN (
p.fldCategoryId1,
p.fldCategoryId2,
p.fldCategoryId3,
p.fldCategoryId4,
p.fldCategoryId5
)
WHERE p.fldProductStatus = 'A'
AND MATCH ( p.fldForSearch )
AGAINST ( '+(watches watch)' IN BOOLEAN MODE )
GROUP BY tc.ancestorId
) sub ON c.fldCategoryId = sub.ancestorId
Result for your sample data (without level, since it seems to be wrong anyway):
| fldCategoryId | fldCategoryName | fldCategoryParent | fldCategoryActive | cnt |
|---------------|-----------------|-------------------|-------------------|-----|
| 1 | Men | 0 | Y | 5 |
| 2 | Accessories | 1 | Y | 5 |
| 3 | Men Watch | 1 | Y | 3 |
| 5 | Watch | 2 | Y | 5 |
| 6 | Clock | 2 | Y | 3 |
| 7 | Wrist watch | 1 | Y | 2 |
| 8 | Watch | 2 | Y | 4 |
| 9 | watch2 | 3 | Y | 2 |
| 10 | fastrack | 8 | Y | 3 |
| 11 | swish | 8 | Y | 2 |
| 12 | digital | 5 | Y | 2 |
| 13 | analog | 5 | Y | 2 |
| 14 | dual | 5 | Y | 1 |
Demos:
sqlfiddle
rextester
Note that the outer (left joined) subquery is logically not necessary, but in my experience MySQL doesn't perform well without it.
There is still room for performance optimisation. One option is to store the transitive closure table in an indexed temporary table. If categories rarely change, you can also persist it in a regular table and maintain it with triggers.
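A minimal sketch of the temporary-table variant follows, using Python's sqlite3 module as a stand-in for MySQL. The category and product rows are hypothetical, only two of the five category columns are modelled, and the full-text filter is omitted; the table name closure and the index name are likewise made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblCategory (
        fldCategoryId INTEGER PRIMARY KEY,
        fldCategoryName TEXT,
        fldCategoryParent INTEGER);
    CREATE TABLE tblProducts (
        fldUniqueId INTEGER PRIMARY KEY,
        fldCategoryId1 INTEGER,
        fldCategoryId2 INTEGER);
    INSERT INTO tblCategory VALUES
        (1, 'Men', 0), (2, 'Accessories', 1), (5, 'Watch', 2), (6, 'Clock', 2);
    INSERT INTO tblProducts VALUES (100, 5, NULL), (101, 5, 6), (102, 6, NULL);

    -- Materialise the fixed-depth closure once, then index it.
    CREATE TEMP TABLE closure AS
    SELECT c1.fldCategoryId AS descendantId, c.fldCategoryId AS ancestorId
    FROM tblCategory c1
    LEFT JOIN tblCategory c2 ON c2.fldCategoryId = c1.fldCategoryParent
    LEFT JOIN tblCategory c3 ON c3.fldCategoryId = c2.fldCategoryParent
    JOIN tblCategory c ON c.fldCategoryId IN (
        c1.fldCategoryId, c1.fldCategoryParent,
        c2.fldCategoryParent, c3.fldCategoryParent);
    CREATE INDEX temp.idx_closure ON closure (descendantId, ancestorId);
""")

closure_rows = conn.execute(
    "SELECT descendantId, ancestorId FROM closure ORDER BY 1, 2"
).fetchall()

# Each product is counted once for its own categories and for all their ancestors.
counts = conn.execute("""
    SELECT tc.ancestorId, COUNT(DISTINCT p.fldUniqueId) AS cnt
    FROM tblProducts p
    JOIN closure tc ON tc.descendantId IN (p.fldCategoryId1, p.fldCategoryId2)
    GROUP BY tc.ancestorId
    ORDER BY tc.ancestorId
""").fetchall()
print(closure_rows)
print(counts)
```

COUNT(DISTINCT ...) matters here: product 101 reaches category 2 through both 5 and 6 but should only be counted once.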

MySQL select query with condition

I have a MySQL table with the following data.
|ID | Date | BillNumber|BillMonth | Amount | Name |AccNum |
| 2 |2015-09-25| 454345 | 092015 | 135.00 |Andrew Good| 735976|
| 3 |2015-09-26| 356282 | 092015 | 142.00 |Peter Pan | 123489|
| 4 |2015-08-11| 312738 | 082015 | 162.00 |Andrew Good| 735976|
| 5 |2015-07-12| 287628 | 072015 | 220.67 |Andrew Good| 735976|
| 6 |2015-06-12| 100756 | 062015 | 556.34 |Andrew Good| 735976|
What I want to achieve is to retrieve Andrew Good's row (AccNum 735976) for the BillMonth 092015, given that the user can enter any of his BillNumbers (past or current).
If the reason that that row is of interest is because it is the latest of his rows, try:
select *
from tbl t
where name = ( select name
from tbl
where billnumber = 100756 -- can be any of his
)
and date = ( select max(date)
from tbl x
where x.name = t.name
)
(the billnumber can be any of his)
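The query above can be tried against the sample rows from the question with the sketch below, which uses Python's sqlite3 module as a stand-in for MySQL. The column types are guesses, since the question does not show the table definition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the question only shows the data.
conn.execute("""CREATE TABLE tbl (
    ID INTEGER PRIMARY KEY, Date TEXT, BillNumber INTEGER,
    BillMonth TEXT, Amount REAL, Name TEXT, AccNum INTEGER)""")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (2, "2015-09-25", 454345, "092015", 135.00, "Andrew Good", 735976),
        (3, "2015-09-26", 356282, "092015", 142.00, "Peter Pan", 123489),
        (4, "2015-08-11", 312738, "082015", 162.00, "Andrew Good", 735976),
        (5, "2015-07-12", 287628, "072015", 220.67, "Andrew Good", 735976),
        (6, "2015-06-12", 100756, "062015", 556.34, "Andrew Good", 735976),
    ],
)

# Look up the customer from any of his bill numbers, then keep his latest row.
latest = conn.execute("""
    SELECT *
    FROM tbl t
    WHERE name = (SELECT name FROM tbl WHERE billnumber = ?)
      AND date = (SELECT MAX(date) FROM tbl x WHERE x.name = t.name)
""", (100756,)).fetchall()
print(latest)
```

If two customers can share a name, matching on AccNum instead of name would probably be the safer key.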

How to resolve this MySQL query?

I have a table that looks like this:
CREATE TEMPORARY TABLE MainList (
`pTime` int(10) unsigned NOT NULL,
`STD` double NOT NULL,
PRIMARY KEY (`pTime`)
) ENGINE=MEMORY;
+------------+-------------+
| pTime | STD |
+------------+-------------+
| 1106080500 | -0.5058072 |
| 1106081100 | -0.82790455 |
| 1106081400 | -0.59226294 |
| 1106081700 | -0.99998194 |
| 1106540100 | -0.86649279 |
| 1107194700 | 1.51340543 |
| 1107305700 | 0.96225296 |
| 1107306300 | 0.53937716 |
+------------+-------------+ .. etc
pTime is my primary key.
I want to write a query that, for every row in my table, finds the first later pTime where STD has the opposite sign and is further away from 0 than the STD of the current row. (For simplicity's sake, just imagine that I am looking for 0-STD.)
Here is an example of the output I want:
+------------+-------------+------------+-------------+
| pTime | STD | pTime_Oppo | STD_Oppo |
+------------+-------------+------------+-------------+
| 1106080500 | -0.5058072 | 1106090400 | 0.57510881 |
| 1106081100 | -0.82790455 | 1106091300 | 0.85599817 |
| 1106081400 | -0.59226294 | 1106091300 | 0.85599817 |
| 1106081700 | -0.99998194 | 1106091600 | 1.0660959 |
+------------+-------------+------------+-------------+
I can't seem to get it right!
I tried the following:
SELECT DISTINCT
MainList.pTime,
MainList.STD,
b34d1.pTime,
b34d1.STD
FROM
MainList
JOIN b34d1 ON(
b34d1.pTime > MainList.pTime
AND(
(
MainList.STD > 0
AND b34d1.STD <= 0 - MainList.STD
)
OR(
MainList.STD < 0
AND b34d1.STD >= 0 - MainList.STD
)
)
);
That code just freezes my server.
P.S. Table b34d1 is just like MainList, except that it contains many more rows:
mysql> select STD, Slope from b31d1 limit 10;
+-------------+--------------+
| STD | Slope |
+-------------+--------------+
| -0.44922675 | -5.2016129 |
| -0.11892021 | -8.15249267 |
| 0.62574686 | -10.19794721 |
| 1.10469057 | -12.43768328 |
| 1.52917352 | -13.08651026 |
| 1.61803899 | -13.2441349 |
| 1.82686555 | -12.04912023 |
| 2.07480736 | -11.22067449 |
| 2.45529961 | -7.84090909 |
| 1.86468335 | -6.26466276 |
+-------------+--------------+
mysql> select count(*) from b31d1;
+----------+
| count(*) |
+----------+
| 439340 |
+----------+
1 row in set (0.00 sec)
In fact, MainList is just a filtered version of b34d1 that uses the MEMORY engine:
mysql> show create table b34d1;
+-------+-----------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------+
| Table | Create Table
|
+-------+-----------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------+
| b34d1 | CREATE TABLE `b34d1` (
`pTime` int(10) unsigned NOT NULL,
`Slope` double NOT NULL,
`STD` double NOT NULL,
PRIMARY KEY (`pTime`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 MIN_ROWS=339331 MAX_ROWS=539331 PACK_KEYS=1 ROW_FORMAT=FIXED |
+-------+-----------------------------------------------------------------------------------------------
-------------------------------------------------------------------------------------------------------+
Edit: I just did a little experiment and I am very confused by the results:
SELECT DISTINCT
b34d1.pTime,
b34d1.STD,
Anti.pTime,
Anti.STD
FROM
b34d1
LEFT JOIN b34d1 As Anti ON(
Anti.pTime > b34d1.pTime
AND(
(
b34d1.STD > 0
AND b34d1.STD <= 0 - Anti.STD
)
OR(
b34d1.STD < 0
AND b34d1.STD >= 0 - Anti.STD
)
)
) limit 10;
+------------+-------------+------------+------------+
| pTime | STD | pTime | STD |
+------------+-------------+------------+------------+
| 1104537600 | -0.70381962 | 1104539100 | 0.73473692 |
| 1104537600 | -0.70381962 | 1104714000 | 1.46733274 |
| 1104537600 | -0.70381962 | 1104714300 | 2.02097356 |
| 1104537600 | -0.70381962 | 1104714600 | 2.60642099 |
| 1104537600 | -0.70381962 | 1104714900 | 2.01006557 |
| 1104537600 | -0.70381962 | 1104715200 | 1.97724189 |
| 1104537600 | -0.70381962 | 1104715500 | 1.85683704 |
| 1104537600 | -0.70381962 | 1104715800 | 1.2754127 |
| 1104537600 | -0.70381962 | 1104716100 | 0.87900156 |
| 1104537600 | -0.70381962 | 1104716400 | 0.72957739 |
+------------+-------------+------------+------------+
Why are all the values under the first pTime the same?
Selecting other fields from a row having some aggregate statistic (such as a minimum or maximum value) is a little messy in SQL. Such queries aren't so simple. You typically need an extra join or a subquery. For example:
SELECT m.pTime, m.STD, m2.pTime AS pTime_Oppo, m2.STD AS STD_Oppo
FROM MainList AS m
JOIN
(SELECT m1.pTime, MIN(m2.pTime) AS pTime_Oppo
FROM MainList AS m1
JOIN MainList AS m2
ON m1.pTime < m2.pTime AND SIGN(m1.STD) != SIGN(m2.STD)
WHERE ABS(m1.STD) <= ABS(m2.std)
GROUP BY m1.pTime
) AS oppo ON m.pTime = oppo.pTime
JOIN MainList AS m2 ON oppo.pTime_Oppo = m2.pTime
;
Using the sample data:
INSERT INTO MainList (`pTime`, `STD`)
VALUES
(1106080500, -0.5058072),
(1106081100, -0.82790455),
(1106081400, -0.59226294),
(1106081700, -0.99998194),
(1106090400, 0.57510881),
(1106091300, 0.85599817),
(1106091600, 1.0660959),
(1106540100, -0.86649279),
(1107194700, 1.51340543),
(1107305700, 0.96225296),
(1107306300, 0.53937716)
;
The results are:
+------------+-------------+------------+-------------+
| pTime | STD | pTime_Oppo | STD_Oppo |
+------------+-------------+------------+-------------+
| 1106080500 | -0.5058072 | 1106090400 | 0.57510881 |
| 1106081100 | -0.82790455 | 1106091300 | 0.85599817 |
| 1106081400 | -0.59226294 | 1106091300 | 0.85599817 |
| 1106081700 | -0.99998194 | 1106091600 | 1.0660959 |
| 1106090400 | 0.57510881 | 1106540100 | -0.86649279 |
| 1106091300 | 0.85599817 | 1106540100 | -0.86649279 |
| 1106540100 | -0.86649279 | 1107194700 | 1.51340543 |
+------------+-------------+------------+-------------+
Any solution based on functions like ABS or SIGN, or anything similar needed to check the sign, is doomed to be inefficient on big data sets, because it makes indexing impossible.
You are creating a temporary table inside a stored procedure, so you can alter its schema without losing anything. Adding a column that stores the sign of STD, and storing STD itself unsigned, will give you a HUGE performance boost: you can simply find the first row with a bigger pTime, a bigger STD and a different sign, and all conditions can use indices in a query like this (STD_positive keeps STD's sign):
SELECT * from mainlist m
LEFT JOIN mainlist mu
ON mu.pTime = ( SELECT md.pTime FROM mainlist md
WHERE m.pTime < md.pTime
AND m.STD < md.STD
AND m.STD_positive <> md.STD_positive
ORDER BY md.pTime
LIMIT 1 )
LEFT JOIN is needed here to return rows that don't have a bigger STD. If you don't need them, use a simple JOIN. This query should run fine even on lots of records, given proper indices chosen by carefully checking the EXPLAIN output, starting with an index on STD.
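The following sketch tries the sign-column approach on the sample data from the earlier answer. It uses Python's sqlite3 module as a stand-in for MySQL; the 0/1 encoding of STD_positive and the index name are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE mainlist (
    pTime INTEGER PRIMARY KEY,
    STD REAL NOT NULL,             -- magnitude only (stored unsigned)
    STD_positive INTEGER NOT NULL  -- 1 if the original STD was positive, else 0
)""")
signed = [
    (1106080500, -0.5058072), (1106081100, -0.82790455), (1106081400, -0.59226294),
    (1106081700, -0.99998194), (1106090400, 0.57510881), (1106091300, 0.85599817),
    (1106091600, 1.0660959), (1106540100, -0.86649279), (1107194700, 1.51340543),
    (1107305700, 0.96225296), (1107306300, 0.53937716),
]
# Split each signed value into magnitude and sign flag while loading.
conn.executemany(
    "INSERT INTO mainlist VALUES (?, ?, ?)",
    [(t, abs(s), 1 if s > 0 else 0) for t, s in signed],
)
conn.execute("CREATE INDEX idx_mainlist_std ON mainlist (STD)")

# For each row: the earliest later row with opposite sign and larger magnitude.
pairs = dict(conn.execute("""
    SELECT m.pTime, mu.pTime
    FROM mainlist m
    LEFT JOIN mainlist mu
           ON mu.pTime = (SELECT md.pTime
                          FROM mainlist md
                          WHERE m.pTime < md.pTime
                            AND m.STD < md.STD
                            AND m.STD_positive <> md.STD_positive
                          ORDER BY md.pTime
                          LIMIT 1)
""").fetchall())
print(pairs)
```

Rows with no later opposite-sign row of larger magnitude map to None because of the LEFT JOIN.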
SELECT
m.pTime,
m.STD,
mo.pTime AS pTime_Oppo,
-mo.STD AS STD_Oppo
FROM MainList m
INNER JOIN (
SELECT
pTime,
-STD AS STD
FROM MainList
) mo ON m.STD > 0 AND mo.STD > m.STD
OR m.STD < 0 AND mo.STD < m.STD
LEFT JOIN (
SELECT
pTime,
-STD AS STD
FROM MainList
) mo2 ON mo.STD > 0 AND mo2.STD > m.STD AND mo.STD > mo2.STD
OR mo.STD < 0 AND mo2.STD < m.STD AND mo.STD < mo2.STD
WHERE mo2.pTime IS NULL