How do I compare average runtime of two functions in MySQL? - mysql

I wanted to compare average runtime of two functions in MySQL -
Square distance: pow(x1 - x2, 2) + pow(y1 - y2, 2) + pow(z1 - z2, 2)
vs
Dot product: x1 * x2 + y1 * y2 + z1 * z2
Now, whichever function I choose is going to run around 50,000,000,000 times in a single query! So even the tiniest difference in their runtime matters.
So, I tried profiling. Here's what I got,
mysql> show profiles;
+----------+------------+-----------------------------------------------------------------------+
| Query_ID | Duration | Query |
+----------+------------+-----------------------------------------------------------------------+
| 4 | 0.00014400 | select pow(rand()-rand(),2)+pow(rand()-rand(),2)+pow(rand()-rand(),2) |
| 5 | 0.00012800 | select pow(rand()-rand(),2)+pow(rand()-rand(),2)+pow(rand()-rand(),2) |
| 6 | 0.00017000 | select pow(rand()-rand(),2)+pow(rand()-rand(),2)+pow(rand()-rand(),2) |
| 7 | 0.00024800 | select pow(rand()-rand(),2)+pow(rand()-rand(),2)+pow(rand()-rand(),2) |
| 8 | 0.00014400 | select pow(rand()-rand(),2)+pow(rand()-rand(),2)+pow(rand()-rand(),2) |
| 9 | 0.00014000 | select pow(rand()-rand(),2)+pow(rand()-rand(),2)+pow(rand()-rand(),2) |
| 10 | 0.00014900 | select pow(rand()-rand(),2)+pow(rand()-rand(),2)+pow(rand()-rand(),2) |
| 11 | 0.00015000 | select rand()*rand()+rand()*rand()+rand()*rand() |
| 12 | 0.00012000 | select rand()*rand()+rand()*rand()+rand()*rand() |
| 13 | 0.00015200 | select rand()*rand()+rand()*rand()+rand()*rand() |
| 14 | 0.00022500 | select rand()*rand()+rand()*rand()+rand()*rand() |
| 15 | 0.00012700 | select rand()*rand()+rand()*rand()+rand()*rand() |
| 16 | 0.00013200 | select rand()*rand()+rand()*rand()+rand()*rand() |
| 17 | 0.00013400 | select rand()*rand()+rand()*rand()+rand()*rand() |
| 18 | 0.00013800 | select rand()*rand()+rand()*rand()+rand()*rand() |
+----------+------------+-----------------------------------------------------------------------+
15 rows in set, 1 warning (0.00 sec)
This is not very helpful: the runtimes fluctuate so much that I have no clue which one is faster, or by how much.
I need to run each of these functions something like 10,000 times to get a consistent average runtime. How do I accomplish this in MySQL?
(Note that rand() is called 6 times in both functions, so its runtime doesn't really make a difference.)
Edit:
Sure, I can create a temp table (slightly inconvenient), fill it with random values, which again is not straightforward (see How do I populate a mysql table with many random numbers), and then proceed to compare my functions.
I wanted to know if a better way exists in MySQL.

In the best case, the pow function detects that the exponent is the integer 2 and performs the exponentiation with a single multiply. Even then, there is no reason it could beat a plain multiply.
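To measure this rather than reason about it, MySQL's built-in BENCHMARK(count, expr) function evaluates a scalar expression repeatedly inside a single statement, which avoids the per-query noise seen in the profile output. A minimal sketch, assuming the two expressions from the question; the iteration count of 10,000,000 is an arbitrary choice:

```sql
-- Each statement evaluates its expression 10,000,000 times (an arbitrary count).
-- BENCHMARK() always returns 0; the figure to compare is the elapsed time
-- that the mysql client prints after each statement.
SELECT BENCHMARK(10000000,
       POW(RAND() - RAND(), 2) + POW(RAND() - RAND(), 2) + POW(RAND() - RAND(), 2));

SELECT BENCHMARK(10000000,
       RAND() * RAND() + RAND() * RAND() + RAND() * RAND());
```

The reported time covers expression evaluation only, with no table access, so it is best read as a relative comparison and repeated a few times before drawing conclusions.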

Related

Transitive, Directed Graph in SQL

I am trying to make a graph transitive using SQL.
I do not see why this should not work:
with recursive recursive_table(from, to) as (
SELECT * FROM Graph
UNION ALL
SELECT r1.to, r2.from FROM recursive_table r1, recursive_table r2
WHERE r1.from = r2.to
UNION ALL
SELECT * FROM recursive_table
)
SELECT * FROM recursive_table;
In every recursion, I take the elements specified in the not transitive Graph (1), everything which is the result of the next recursion (3) and everything which results out of the next recursion (2).
However, SQL says:
[2021-02-12 10:36:05] [HY000][3577] In recursive query block of Recursive Common Table Expression 'recursive_table', the recursive table must be referenced only once, and not in any subquery
A sample output would be the following:
Input:
+------+------+--+
| Col1 | Col2 | |
+------+------+--+
| 1 | 2 | |
| 2 | 3 | |
| 1 | 4 | |
| 4 | 5 | |
+------+------+--+
Output:
+------+------+--+
| Col1 | Col2 | |
+------+------+--+
| 1 | 2 | |
| 2 | 3 | |
| 1 | 4 | |
| 4 | 5 | |
| 1 | 3 | |
| 1 | 5 | |
+------+------+--+
So, mathematically speaking,
If you can go from a to b in a finite number of steps > 0, add (a,b) to the graph.
For example, you can go from 1 to 2 and from 2 to 3 in the input data, therefore you can go from 1 to 3.
Another example is a cycle with n nodes.
This means, the input would be something like this...
+------+------+--+
| Col1 | Col2 | |
+------+------+--+
| 1 | 2 | |
| 2 | 3 | |
| 3 | ... | |
| ... | n | |
| n | 1 | |
+------+------+--+
The correct output would be [n] X [n]
It is a little hard to say exactly why your code doesn't work. There are multiple potential issues:
from is not a valid column name.
Recursive CTEs rarely have two union alls.
Recursive CTEs do not usually reference the recursive CTE multiple times.
In any case, correct code is simpler:
with recursive recursive_table(col1, col2) as (
SELECT col1, col2
FROM graph
UNION ALL
SELECT r1.col1, g.col2
FROM recursive_table r1 JOIN
graph g
ON r1.col2 = g.col1
)
SELECT *
FROM recursive_table;
Here is a db<>fiddle.
Note that both this code and your code assume that the graph has no cycles. That is not part of your question, but if it is an issue, ask a new question.
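For graphs that do contain cycles, such as the n-node cycle in the question, one common approach in MySQL 8.0 is to combine the two parts of the recursive CTE with UNION instead of UNION ALL, so that rows already produced are discarded and the recursion terminates. A minimal sketch, assuming a hypothetical graph table with the columns col1 and col2 and a 3-node cycle as sample data:

```sql
-- Hypothetical sample data: a cycle 1 -> 2 -> 3 -> 1.
CREATE TABLE graph (
  col1 INT NOT NULL,
  col2 INT NOT NULL,
  PRIMARY KEY (col1, col2)
);
INSERT INTO graph VALUES (1, 2), (2, 3), (3, 1);

WITH RECURSIVE closure (col1, col2) AS (
  SELECT col1, col2
  FROM graph
  UNION              -- UNION (DISTINCT) drops rows that were already produced
  SELECT c.col1, g.col2
  FROM closure AS c
  JOIN graph AS g ON c.col2 = g.col1
)
SELECT col1, col2
FROM closure
ORDER BY col1, col2;
```

For this 3-node cycle the result should contain every pair of nodes, which matches the [n] X [n] output described in the question.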

How can I select all rows which have been inserted in the last day?

I have a table like this:
// reset_password_emails
+----+----------+--------------------+-------------+
| id | id_user | token | unix_time |
+----+----------+--------------------+-------------+
| 1 | 2353 | 0c274nhdc62b9dc... | 1339412843 |
| 2 | 2353 | 0934jkf34098joi... | 1339412864 |
| 3 | 5462 | 3408ujf34o9gfvr... | 1339412894 |
| 4 | 3422 | 2309jrgv0435gff... | 1339412899 |
| 5 | 3422 | 34oihfc3lpot4gv... | 1339412906 |
| 6 | 2353 | 3498hfjp34gv4r3... | 1339412906 |
| 16 | 2353 | asdf3rf3409kv39... | 1466272801 |
| 7 | 7785 | 123dcoj34f43kie... | 1339412951 |
| 9 | 5462 | 3fcewloui493e4r... | 1339413621 |
| 13 | 8007 | 56gvb45cf3454g3... | 1339424860 |
| 14 | 7785 | vg4er5y2f4f45v4... | 1339424822 |
+----+----------+--------------------+-------------+
Each row is an email. Now I'm trying to implement a limit on sending reset-password emails: a user can receive at most 3 emails per day.
So I need a query that checks the user's history for the number of emails:
SELECT count(1) FROM reset_password_emails WHERE token = :token AND {from now until last day}
How can I implement this:
. . . {from now until last day}
Actually I can do it like this: NOW() <= (unix_time + 86400) .. But I guess there is a better approach using INTERVAL. Can anybody tell me what that is?
Your expression will work (provided the current time is also expressed in seconds, e.g. with UNIX_TIMESTAMP() rather than NOW()), but it has 3 problems:
the way you've coded it means the addition must be performed for every row (performance hit)
because you're not using the raw column value, an index on the time column (if one existed) couldn't be used
it isn't clear to read
Try this:
unix_time > unix_timestamp(subdate(now(), interval '1' day))
Here the threshold datetime is calculated once per query, so all of the problems above are addressed.
See SQLFiddle demo
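Put together, the rate-limit check might look like the sketch below. It assumes the count should be taken per user, so it filters on id_user rather than on token as in the question's draft; :id_user is a placeholder bound by the application.

```sql
-- Counts the reset-password emails sent to one user during the last 24 hours.
-- :id_user is a bound parameter (an assumption; the question's draft used :token).
SELECT COUNT(*) AS emails_last_day
FROM reset_password_emails
WHERE id_user = :id_user
  AND unix_time > UNIX_TIMESTAMP(NOW() - INTERVAL 1 DAY);
```

The application can then refuse to send another email when emails_last_day is 3 or more. Because the unix_time column is compared in its raw form, a composite index on (id_user, unix_time), if one were added, could serve this query.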
You can convert your unix_time using from_unixtime function
select r.*
from reset_password_emails r
where from_unixtime(r.unix_time) >= now() - interval '1' day
Just add the extra filters you want.
See it here: http://sqlfiddle.com/#!9/4a7a9/3
It evaluates to no rows because the sample data for the unix_time field is all from 2012, apart from one row from 2016.
Edited with a sqlfiddle that show the conversion:
http://sqlfiddle.com/#!9/4a7a9/4

Convert Mysql Query to Rails ActiveRecord Query Without using find_by_sql

I have table named questions like follows
+----+---------------------------------------------------------+----------+
| id | title | category |
+----+---------------------------------------------------------+----------+
| 89 | Tinker or work with your hands? | 2 |
| 54 | Sketch, draw, paint? | 3 |
| 53 | Express yourself clearly? | 4 |
| 77 | Keep accurate records? | 6 |
| 32 | Efficient? | 6 |
| 52 | Make original crafts, dinners, school or work projects? | 3 |
| 70 | Be elected to office or make your opinions heard? | 5 |
| 78 | Take photographs? | 3 |
| 84 | Start your own political campaign? | 5 |
| 9 | Free spirit or a rebel? | 3 |
| 38 | Lead a group? | 5 |
| 71 | Work in groups? | 4 |
| 2 | Helpful? | 4 |
| 4 | Mechanical? | 6 |
| 14 | Responsible? | 6 |
| 66 | Pitch a tent, an idea? | 1 |
| 62 | Write useful business letters? | 5 |
| 28 | Creative? | 3 |
| 68 | Perform experiments? | 2 |
| 10 | Like to figure things out? | 2 |
+----+---------------------------------------------------------+----------+
I have a SQL query to get one random record from each category. Can anyone convert this MySQL query to a Rails ActiveRecord query (without using Question.find_by_sql)? The MySQL query works absolutely fine, but I need an ActiveRecord query because later steps depend on it.
Here is mysql query
SELECT t.id, title as question, category
FROM
(
SELECT
(
SELECT id
FROM questions
WHERE category = t.category
ORDER BY RAND()
LIMIT 1
) id
FROM questions t
GROUP BY category
) q JOIN questions t
ON q.id = t.id
Thank You for your consideration!
When things get crazy, one has to reach for Arel:
It is intended to be a framework framework; that is, you can build
your own ORM with it, focusing on innovative object and collection
modeling as opposed to database compatibility and query generation.
So what we want to do is let Arel create the query for us. The approach used here is a left join of the questions table with a randomized version of itself:
q_normal = Arel::Table.new("questions")
q_random = Arel::Table.new("questions").project(Arel.sql("*")).order("RAND()").as("q2")
Time to left join
query = q_normal.join(q_random, Arel::Nodes::OuterJoin).on(q_normal[:category].eq(q_random[:category])).group(q_normal[:category]).order(q_random[:category])
Now you can select whichever columns you want using project, e.g.:
query.project(q_normal[:id])
The only way I can think of to do this requires a good bit of application code. I don't think there's a way of accessing the RAND() functionality in MySQL (or equivalent in other DB technologies) using ActiveRecord. Here's what I came up with:
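# Number of questions per category: { category => count }
counts = Question.group(:category).count(:id)
# Pick a random row offset within each category
offsets = {}
counts.each do |cat, count|
  offsets[cat] = rand(count)
end
# Fetch the question sitting at the chosen offset in each category
random_questions = []
offsets.each do |cat, offset|
  random_questions.push(Question.where(:category => cat).offset(offset).first)
end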

MySQL: optimize query for scoring calculation

I have a data table that I use to do some calculations. The resulting data set after calculations looks like:
+------------+-----------+------+----------+
| id_process | id_region | type | result |
+------------+-----------+------+----------+
| 1 | 4 | 1 | 65.2174 |
| 1 | 5 | 1 | 78.7419 |
| 1 | 6 | 1 | 95.2308 |
| 1 | 4 | 1 | 25.0000 |
| 1 | 7 | 1 | 100.0000 |
+------------+-----------+------+----------+
On the other hand, I have another table that contains a set of ranges used to classify the calculation results. The ranges table looks like:
+----------+-------+-----+--------+
| id_level | start | end | status |
+----------+-------+-----+--------+
| 1 | 0 | 75 | Danger |
| 2 | 76 | 90 | Alert |
| 3 | 91 | 100 | Good |
+----------+-------+-----+--------+
I need a query that adds the corresponding 'status' column to each value when doing the calculations. Currently, I can do that by adding the following field to the calculation query:
select
...,
...,
[math formula] as result,
(select status
from ranges r
where result between r.start and r.end) status
from ...
where ...
It works OK. But when I have a lot of rows (more than 200K), the calculation query becomes slow.
My question is: is there some way to find that 'status' value without that subquery?
Has anyone worked on something similar before?
Thanks
Yes, you are looking for a subquery and join:
select s.*, r.status
from (<your query here>
) s left outer join
ranges r
on s.result between r.start and r.end
Explicit joins often optimize better than nested select. In this case, though, the ranges table seems pretty small, so this may not be the performance issue.
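One detail worth checking with either form: the ranges use integer bounds with gaps between them (75 to 76, 90 to 91), so a fractional result such as 75.5 matches no row under BETWEEN and gets a NULL status. A minimal sketch of a join that treats each range as half-open instead; the calc table and the value 75.5 are hypothetical stand-ins for the real calculation query:

```sql
-- Hypothetical stand-ins for the real tables; ranges follows the question.
CREATE TEMPORARY TABLE ranges (
  id_level INT PRIMARY KEY,
  `start`  INT NOT NULL,
  `end`    INT NOT NULL,
  status   VARCHAR(10) NOT NULL
);
INSERT INTO ranges VALUES (1, 0, 75, 'Danger'), (2, 76, 90, 'Alert'), (3, 91, 100, 'Good');

-- calc plays the role of the calculation query; 75.5 is an added test value.
CREATE TEMPORARY TABLE calc (id_region INT, result DECIMAL(7,4));
INSERT INTO calc VALUES (4, 65.2174), (5, 78.7419), (6, 75.5000), (7, 100.0000);

-- Half-open match: start <= result < end + 1, so values between two
-- integer bounds fall into the lower range instead of into no range.
SELECT c.id_region, c.result, r.status
FROM calc AS c
LEFT JOIN ranges AS r
  ON c.result >= r.`start`
 AND c.result <  r.`end` + 1;
```

Under this condition 75.5 is classified as Danger, while 100 still falls in the Good range. Whether fractional values should round down like this is a design choice for the application.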

MySQL using GROUP BY to group by multiple columns

I'd like to use GROUP BY multiple columns, I think it's best to start with an example:
SELECT
eventsviews.eventId,
showsActive.showId,
showsActive.venueId,
COUNT(*) AS count
FROM eventsviews
INNER JOIN events ON events.eventId = eventsviews.eventId
INNER JOIN showsActive ON showsActive.eventId = eventsviews.eventId
WHERE events.status = 1
GROUP BY showsActive.venueId, showsActive.showId, showsActive.eventId
ORDER BY count DESC
LIMIT 100;
Output:
| *eventId* | *showId* | *venueId* | *count* |
+-----------+----------+-----------+---------+
[...snip...]
| 95 | 92099 | 9770 | 32 |
| 95 | 105472 | 10702 | 32 |
| 3804 | 41225 | 8165 | 17 |
| 3804 | 41226 | 8165 | 17 |
| 923 | 2866 | 5451 | 14 |
| 923 | 20184 | 5930 | 14 |
[...snip...]
What I would like instead:
| *eventId* | *showId* | *venueId* | *count* |
+-----------+----------+-----------+---------+
| 95 | 92099 | 9770 | 32 |
| 3804 | 41226 | 8165 | 17 |
| 923 | 20184 | 5930 | 14 |
So, I want my data grouped by eventId, but only once for each showId and venueId ...
I actually have a SQL query that does that, but it has 8 subqueries and is as slow as a T-Ford ... And since this is executed on every page load, speeding things up looks like a good idea!
There are a few questions like this, and I've tried many different things, but I've been at this query for an hour and I can't seem to get it to work as I want :-(
Thanks!
You probably want either a MIN or a MAX on showId, and then leave it out of the GROUP BY. I can't tell which, because your "preferred" result set has both.
If you want your data grouped by eventId, group just by eventId and you'll get exactly the result you're looking for.
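This is a MySQL extension that allows you to select non-aggregated columns that are not in the GROUP BY; the server is then free to return the values from any row in each group, and the query is rejected when the ONLY_FULL_GROUP_BY SQL mode is enabled (the default since MySQL 5.7.5). PostgreSQL offers DISTINCT ON for a similar purpose; it is not available in MySQL.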
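A deterministic variant that also works under ONLY_FULL_GROUP_BY is to count the views per event first and then join in exactly one show per event. The sketch below assumes that events.eventId is unique, that (eventId, showId) is unique in showsActive, and that picking the highest showId per event is acceptable; MAX is an arbitrary choice and MIN would work the same way.

```sql
SELECT v.eventId, sa.showId, sa.venueId, v.view_count AS `count`
FROM (
  -- Views per active event, counted before any show rows are joined in
  SELECT ev.eventId, COUNT(*) AS view_count
  FROM eventsviews AS ev
  INNER JOIN events AS e ON e.eventId = ev.eventId
  WHERE e.status = 1
  GROUP BY ev.eventId
) AS v
INNER JOIN (
  -- One representative show per event (the highest showId, by assumption)
  SELECT eventId, MAX(showId) AS showId
  FROM showsActive
  GROUP BY eventId
) AS pick ON pick.eventId = v.eventId
INNER JOIN showsActive AS sa
  ON sa.eventId = pick.eventId
 AND sa.showId = pick.showId
ORDER BY v.view_count DESC
LIMIT 100;
```

Counting inside the first derived table keeps each event's count equal to its number of views, rather than the number of views multiplied by the number of shows that the event has.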