MySQL how to select the nearest values to my variables - mysql

I have the following query, where I'm trying to retrieve matches that fall within a certain tolerance of the values entered.
SELECT fthg, ftag, avover, avunder, whh, wha, whd
FROM full
WHERE (whh < ($home_odds + 0.05)
AND whh > ($home_odds - 0.05)
AND wha < ($away_odds + 0.05)
AND wha > ($away_odds -0.05)
AND whd < ($draw_odds + 0.05)
AND whd > ($draw_odds - 0.05))
On some occasions this returns 0 results. In that case I would like to retrieve the record that most closely matches all three values, but I'm not quite sure how to put the query together.
Basically this is a last resort: if the first query returns no results, this one should return the next best thing, no matter how far it is from the original values.
Thanks for the help

Your original query would be simpler and more readable as this:
SELECT
fthg,
ftag,
avover,
avunder,
whh,
wha,
whd
FROM full
WHERE ABS($home_odds - whh) < 0.05
and ABS($away_odds - wha) < 0.05
and ABS($draw_odds - whd) < 0.05
If that query returns nothing, you could run this one:
SELECT
fthg,
ftag,
avover,
avunder,
whh,
wha,
whd
FROM full
ORDER BY
ABS($home_odds - whh) + ABS($away_odds - wha) + ABS($draw_odds - whd)
LIMIT 1
It will return the row with the lowest deviation from the combination of those three pairs of fields.
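The two-step logic above (rows within the tolerance first, otherwise the single row with the lowest summed deviation) can be sketched outside the database as well. This is only an illustration in Python; the function names `deviation` and `find_matches` and the sample rows are hypothetical, and the 0.05 tolerance is taken from the question.

```python
def deviation(row, home_odds, away_odds, draw_odds):
    """Sum of absolute differences, as in the ORDER BY expression."""
    return (abs(home_odds - row["whh"])
            + abs(away_odds - row["wha"])
            + abs(draw_odds - row["whd"]))


def find_matches(rows, home_odds, away_odds, draw_odds, tolerance=0.05):
    """Return rows within tolerance, or the single closest row as a fallback."""
    within = [
        r for r in rows
        if abs(home_odds - r["whh"]) < tolerance
        and abs(away_odds - r["wha"]) < tolerance
        and abs(draw_odds - r["whd"]) < tolerance
    ]
    if within or not rows:
        return within
    # No row is inside the tolerance: fall back to the lowest total deviation.
    return [min(rows, key=lambda r: deviation(r, home_odds, away_odds, draw_odds))]


# Hypothetical sample rows, not data from the question.
rows = [
    {"whh": 2.10, "wha": 3.40, "whd": 3.20},
    {"whh": 1.50, "wha": 6.00, "whd": 4.00},
]
```

Calling `find_matches(rows, 2.12, 3.38, 3.21)` returns the first row because it lies within the tolerance, while `find_matches(rows, 1.70, 5.00, 3.90)` falls back to the second row as the nearest one.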

How about faking a distance calculation between the parameters you provide and the parameters you are comparing to? Something like
SELECT fthg, ftag, avover, avunder, whh, wha, whd
FROM full
ORDER BY
sqrt(abs(whh - $home_odds) * abs(whh - $home_odds)) +
sqrt(abs(wha - $away_odds) * abs(wha - $away_odds)) +
sqrt(abs(whd - $draw_odds) * abs(whd - $draw_odds))
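This way, even if no rows fall within the range you are interested in, you still get the rows ordered from closest to farthest; add LIMIT 1 to keep only the closest one. Note that sqrt(abs(x) * abs(x)) is simply abs(x), so this orders by the sum of the absolute differences.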

Related

Is there a way to calculate entropy in sql \ mysql?

I would like to calculate the entropy of a list in mysql.
Now I run this and move to python:
select group_concat(first_name), last_name
from table
group by last_name
What I am looking for would be the equivalent of
entropy(first_name)
Returning a single number for each.
Similar to the below usage for numericals:
std(age)/avg(age)
EDIT- Partially answered: Thank you to commenter #IVO GELOV for a very efficient approximation:
SELECT LOG2(COUNT(DISTINCT column)) FROM Table
Based on the solution above and an approximation of the t-test, we get a comparative weighted entropy. Hacky, but works like a charm:
CASE
WHEN count(*)-1 < 6 THEN (1 + LOG2(COUNT(distinct first_name)))*5.61*power(count(*)-1,-0.71)
WHEN count(*)-1 >= 6 and count(*)-1 < 27 THEN (1 + LOG2(COUNT(distinct first_name)))*2.2*power(count(*)-1,-0.081)
ELSE (1 + LOG2(COUNT(distinct first_name)))*1.815*power(count(*)-1,-0.02)
END as entropy
Defined for rows with count(*) > 1
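For comparison with the Python step mentioned in the question, here is a minimal sketch of the exact Shannon entropy next to the LOG2(COUNT(DISTINCT ...)) approximation. The approximation equals the exact value only when every distinct value occurs equally often; otherwise it is an upper bound. The function names are hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Exact Shannon entropy in bits of the value distribution."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def max_entropy(values):
    """Mirror of LOG2(COUNT(DISTINCT column)): the upper bound on the entropy."""
    distinct = len(set(values))
    return math.log2(distinct) if distinct else 0.0


uniform = ["ann", "bob", "cat", "dan"]   # every name occurs once
skewed = ["ann", "ann", "ann", "bob"]    # one name dominates
```

For `uniform` both functions agree, while for `skewed` the exact entropy is lower than the approximation.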

SELECT not returning anything using HAVING clause

I'm trying to calculate the total percentage off and return only the matches >= 80. However, this doesn't return any results:
SELECT * FROM products WHERE Available=1 AND Merchant='Amazon' HAVING (LowestUsedPrice - LowestNewPrice) / LowestNewPrice * 100 >= ?
Am I using HAVING correctly?
HAVING specifies a search condition for a group or an aggregate function used in SELECT statement.
HAVING is applied after the aggregation phase and must be used if you want to filter aggregate results.
Your query does no aggregation, so HAVING is not the right tool here.
Put the condition in the WHERE clause instead:
SELECT *
FROM products
WHERE Available=1
AND Merchant='Amazon'
AND (LowestUsedPrice - LowestNewPrice) / LowestNewPrice * 100 >= ?
As I understand it, you want to use HAVING to filter rows, but a plain WHERE condition is enough.
As for returning results: your query as written may produce an error. If you use the following and still don't get any results, then it is because of the condition and your data. In that case, if you provide sample data, you may get more help.
SELECT * FROM products
WHERE Available=1
AND Merchant='Amazon'
AND (LowestUsedPrice - LowestNewPrice) / LowestNewPrice * 100 >= ?
SELECT *, (LowestUsedPrice - LowestNewPrice) / LowestNewPrice * 100 AS percentage
FROM products
WHERE Available=1
AND Merchant='Amazon'
HAVING percentage >= ?
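Check how your percentage formula groups as well. In MySQL, / and * have the same precedence and are evaluated left to right, so the expression in the question is already equivalent to
((LowestUsedPrice - LowestNewPrice) / LowestNewPrice) * 100
which can also be written as
(LowestUsedPrice - LowestNewPrice) * 100 / LowestNewPrice
For example, with a difference of 12 and a divisor of 15,
(20 - 8) / 15 * 100
is evaluated as
12 / 15 * 100 = 80
and not as
12 / 1500 = 0.008
The explicit parentheses only make the intent clearer. Use whichever form you prefer together with the solutions in the other answers.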
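A quick way to check how the expression groups is to evaluate the example numbers directly. The sketch below uses Python, where / and * share one precedence level and associate left to right, as they do in MySQL. The helper `percent_change` and the sample products are hypothetical.

```python
import math


def percent_change(used_price, new_price):
    """Mirror of (LowestUsedPrice - LowestNewPrice) / LowestNewPrice * 100."""
    # / and * group left to right, so this is ((used - new) / new) * 100.
    return (used_price - new_price) / new_price * 100


# The example numbers from the answer: the division happens before the
# multiplication, so the result is about 80 rather than 0.008.
example = (20 - 8) / 15 * 100

# Hypothetical sample rows, not data from the question.
products = [
    {"name": "a", "used": 40.0, "new": 20.0},
    {"name": "b", "used": 21.0, "new": 20.0},
]

threshold = 80
matches = [p["name"] for p in products
           if percent_change(p["used"], p["new"]) >= threshold]
```

Only product "a" passes the threshold here, which is the same filtering the WHERE condition performs.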

Speed up SQL SELECT with arithmetic and geometric calculations

This is a follow-up to my previous post How to improve wind data SQL query performance.
I have expanded the SQL statement to also perform the first part in the calculation of the average wind direction using circular statistics. This means that I want to calculate the average of the cosines and sines of the wind direction. In my PHP script, I will then perform the second part and calculate the inverse tangent and add 180 or 360 degrees if necessary.
The wind direction is stored in my table as voltages read from the sensor in the field 'dirvolt' so I first need to convert it to radians.
The user can look at historical wind data by stepping backwards using a pagination function, hence the use of LIMIT, whose values are set dynamically in my PHP script.
My SQL statement currently looks like this:
SELECT ROUND(AVG(speed),1) AS speed_mean, MAX(speed) as speed_max,
MIN(speed) AS speed_min, MAX(dt) AS last_dt,
AVG(SIN(2.04*dirvolt-0.12)) as dir_sin_mean,
AVG(COS(2.04*dirvolt-0.12)) as dir_cos_mean
FROM table
GROUP BY FLOOR(UNIX_TIMESTAMP(dt) / 300)
ORDER BY FLOOR(UNIX_TIMESTAMP(dt) / 300) DESC
LIMIT 0, 72
The query takes about 3-8 seconds to run depending on what value I use to group the data (300 in the code above).
In order for me to learn, is there anything I can do to optimize or improve the SQL statement otherwise?
Please provide the output of
SHOW CREATE TABLE table;
From that I can see whether you already have INDEX(dt) (or equivalent). With that, we can modify the SELECT to be significantly faster.
But first, change the focus from 72 groups of 300 seconds' worth of readings to a datetime range, which is 6 hours (72 * 300 = 21600 seconds).
Let's look at this query:
SELECT * FROM table
WHERE dt >= '...' - INTERVAL 6 HOUR
AND dt < '...';
The '...' would be the same datetime in both places. Does that run fast enough with the index?
If yes, then let's build the final query using that as a subquery:
SELECT FORMAT(AVG(speed), 1) AS speed_mean,
MAX(speed) as speed_max,
MIN(speed) AS speed_min,
MAX(dt) AS last_dt,
AVG(SIN(2.04*dirvolt-0.12)) as dir_sin_mean,
AVG(COS(2.04*dirvolt-0.12)) as dir_cos_mean
FROM
( SELECT * FROM table
WHERE dt >= '...' - INTERVAL 6 HOUR
AND dt < '...'
) AS x
GROUP BY FLOOR(UNIX_TIMESTAMP(dt) / 300)
ORDER BY FLOOR(UNIX_TIMESTAMP(dt) / 300) DESC;
Explanation: What you had could not use an index, hence had to scan the entire table (which is getting bigger and bigger). My subquery could use an index, hence was much faster. The effort for my outer query was not "too bad" since it worked with only N rows.
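The second half of the circular-statistics calculation, which the question leaves to the application code, can be sketched as follows. It is written in Python rather than PHP for illustration. The conversion constants 2.04 and -0.12 come from the query, the function names are hypothetical, and `atan2` is used here as an assumption in place of a plain inverse tangent so that the quadrant corrections (adding 180 or 360 degrees) are handled in one step.

```python
import math


def volt_to_radians(dirvolt, scale=2.04, offset=-0.12):
    """Convert a raw sensor voltage to radians, as in the SQL expression."""
    return scale * dirvolt + offset


def circular_mean_degrees(angles_rad):
    """Mean direction in degrees within [0, 360) from angles in radians."""
    sin_mean = sum(math.sin(a) for a in angles_rad) / len(angles_rad)
    cos_mean = sum(math.cos(a) for a in angles_rad) / len(angles_rad)
    # atan2 picks the correct quadrant; the modulo maps negatives into [0, 360).
    return math.degrees(math.atan2(sin_mean, cos_mean)) % 360


# With the averages already computed by the query, only the last step is needed:
def direction_from_means(dir_sin_mean, dir_cos_mean):
    return math.degrees(math.atan2(dir_sin_mean, dir_cos_mean)) % 360
```

Averaging 350 and 10 degrees this way gives a direction at north (0 or, equivalently, just under 360), where a plain arithmetic mean would give 180.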

mysql get difference from the result of 2 fields queried

I'm not good at SQL, but I can write and understand common SQL queries. After scouring the net, I still can't find a suitable way to write this query.
I have a query which is
SELECT COUNT(`BetID`),
FORMAT(SUM(`BetAmount`),0),
FORMAT(SUM(`Payout`),0),
ROUND((SUM(`BetAmount`) / COUNT(`BetID`)),2),
ROUND((((SUM(`BetAmount`) + SUM(`Payout`)) / SUM(`Payout`)) * 100),2)
FROM `betdb`
I would like to subtract the result of
FORMAT(SUM(`BetAmount`),0)
and
FORMAT(SUM(`Payout`),0)
Any other ideas to execute subtraction in this mysql query?
If you want the numbers rounded before subtracting them (which seems to be the case when you want to subtract the formatted numbers), you'll need to round them first to the same precision as the formatting, subtract and lastly format the result;
SELECT COUNT(`BetID`),
FORMAT(SUM(`BetAmount`),0),
FORMAT(SUM(`Payout`),0),
FORMAT(ROUND(SUM(`BetAmount`),0) - ROUND(SUM(`Payout`),0),0) diff,
ROUND((SUM(`BetAmount`) / COUNT(`BetID`)),2),
ROUND((((SUM(`BetAmount`) + SUM(`Payout`)) / SUM(`Payout`)) * 100),2)
FROM `betdb`
A simple SQLfiddle to test with.
Use FORMAT((SUM(BetAmount) - SUM(Payout)),0)
Try this:
SELECT COUNT(`BetID`),
FORMAT(SUM(`BetAmount`),0),
FORMAT(SUM(`Payout`),0),
FORMAT((SUM(`BetAmount`) - SUM(`Payout`)),0),
ROUND((SUM(`BetAmount`) / COUNT(`BetID`)),2),
ROUND((((SUM(`BetAmount`) + SUM(`Payout`)) / SUM(`Payout`)) * 100),2)
FROM `betdb`
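Rounding each sum before subtracting, as in the first answer, and subtracting the raw sums before formatting, as in the query above, do not always give the same number. The sketch below illustrates this with Python's `decimal` module, assuming the amounts are exact DECIMAL values and that ties round away from zero. The totals are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_to_integer(value):
    """Round an exact decimal to 0 places, with ties going away from zero."""
    return value.quantize(Decimal("1"), rounding=ROUND_HALF_UP)


# Hypothetical totals, not data from the question.
bet_total = Decimal("100.5")
payout_total = Decimal("50.4")

# Round each sum first, then subtract.
rounded_first = round_to_integer(bet_total) - round_to_integer(payout_total)

# Subtract the raw sums, then round the difference.
subtracted_first = round_to_integer(bet_total - payout_total)
```

Here `rounded_first` is 51 and `subtracted_first` is 50, so the choice depends on whether the difference should match the displayed totals or the underlying amounts.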
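You could also use a derived table (a subquery in the FROM clause) so that each sum is calculated only once. Keep the sums unformatted inside the subquery, because FORMAT returns a string with thousands separators, which does not subtract correctly:
SELECT t.BetCount,
FORMAT(t.BetTotal, 0) AS BetTotal,
FORMAT(t.PayoutTotal, 0) AS PayoutTotal,
FORMAT(t.BetTotal - t.PayoutTotal, 0) AS Difference,
ROUND(t.BetTotal / t.BetCount, 2) AS AvgBet,
ROUND((t.BetTotal + t.PayoutTotal) / t.PayoutTotal * 100, 2) AS Pct
FROM (
SELECT COUNT(`BetID`) AS BetCount,
SUM(`BetAmount`) AS BetTotal,
SUM(`Payout`) AS PayoutTotal
FROM `betdb`
) AS t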

How do I insert a random value into mysql?

It looks like RAND is what I need but I'm having a bit of trouble understanding how it works.
I need to insert a random number between 60 and 120 into a couple thousand rows. Table name is: listing and the column name is: hits
Could you please help?
To make a random integer between 60 and 120, you need to do a bit of arithmetic with the results of RAND(), which produces only floating point values:
SELECT FLOOR(60 + RAND() * 61);
So what's going on here:
RAND() will produce a value like 0.847269199. We multiply that by 61, which gives us the value 51.683421139. We add 60, since that's your desired offset above zero (111.683421139). FLOOR() rounds the whole thing down to the nearest whole number. Finally, you have 111.
To do this over a few thousand existing rows:
UPDATE listing SET hits = FLOOR(60 + RAND() * 61) WHERE (<some condition if necessary>);
See the MySQL docs on RAND() for more examples.
Note I think I have the arithmetic right, but if you get values of 59 or 121 outside the expected range, change the +60 up or down accordingly.
Here is how to get a random number in a range. The following can be a bit ambiguous simply because the 61 is actually your max value (120) minus your min value (60), plus 1 to get inclusive results.
SELECT FLOOR(60 + (RAND() * 61));
SELECT FLOOR(MIN_Value + (RAND() * (MAX_Value - MIN_Value + 1)));
http://dev.mysql.com/doc/refman/5.0/en/mathematical-functions.html#function_rand
UPDATE X SET C = FLOOR(61 * RAND() + 60) WHERE ...;
to get a number between 60 and 120 (including 60 and 120).
RAND() creates a number in the interval [0, 1) (that is, excluding 1). So 61 * RAND() yields a number in [0, 61), and 61 * RAND() + 60 is in [60, 121). By rounding down you ensure that your number is indeed in [60, 120].
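The interval argument can be checked with a small sketch that mirrors FLOOR(min + RAND() * (max - min + 1)). It is written in Python for illustration. The function name `random_int_inclusive` is hypothetical, and the `rand` parameter exists only so that fixed values can be substituted for the random source.

```python
import math
import random


def random_int_inclusive(min_value, max_value, rand=random.random):
    """Random integer in [min_value, max_value], both ends included.

    rand must return a float in [0, 1), like MySQL's RAND().
    """
    span = max_value - min_value + 1   # 61 for the range 60..120
    return math.floor(min_value + rand() * span)


# Boundary behaviour with fixed inputs in place of the random source.
lowest = random_int_inclusive(60, 120, rand=lambda: 0.0)
highest = random_int_inclusive(60, 120, rand=lambda: 0.999999)
worked_example = random_int_inclusive(60, 120, rand=lambda: 0.847269199)
```

With the smallest possible input the result is 60, and with an input just below 1 it is 120, so neither 59 nor 121 can occur.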
When I faced this kind of issue I first tried doing it manually, but I had over 500 rows.
I came up with a trick that helped me, because if you run RAND() in the query directly,
you might get an error due to duplicates or a PRIMARY KEY conflict, especially if the column is a PRIMARY KEY with AUTO_INCREMENT.
First, I renamed the column in question, e.g. mine was ID -> IDS.
Second, I created another column and called it ID.
Third, I ran this code:
UPDATE history SET id = FLOOR( 217 + RAND( ) *2161 )
This generated the random numbers automatically. Later I deleted the renamed IDS column.
Credit to MICHAEL.
Thank you