SQL: Calculate bitrate - mysql

I have a column with bytes, and another with milliseconds. And I must calculate average bitrate in bits per second.
I'm doing this:
SELECT AVG(Bytes*8)/AVG(Milliseconds/1000)
FROM Tracks
Apparently it is wrong. I'm using an app with exercises
I have this result
254492.61
And should be
254400.25

I think you only want one average calculation
SELECT AVG((Bytes*8.0)/(Milliseconds/1000.0))
FROM Tracks
The 8.0 and 1000.0 literals are there to push the arithmetic to decimal precision; use 8 and 1000 instead if you don't want that.

I would be inclined to write this as:
SELECT SUM(Bytes*8) / SUM(Milliseconds/1000)
FROM Tracks
This is equivalent to your query, though -- assuming that the values are never NULL.
Perhaps they mean the average of averages:
SELECT AVG(Bytes * 8 / (Milliseconds / 1000))
FROM Tracks;
I would not describe this as the average bits per second, however.
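To see why the two formulas disagree, here is a minimal Python sketch with two made-up rows (the values are hypothetical, not taken from the Tracks table). It only illustrates the arithmetic; which of the two the exercise app expects is a guess.

```python
# Hypothetical sample rows: (bytes, milliseconds). Not real Tracks data.
tracks = [
    (1_000_000, 10_000),   # 8,000,000 bits in 10 s -> 800,000 bits/s
    (1_000_000, 40_000),   # 8,000,000 bits in 40 s -> 200,000 bits/s
]

def ratio_of_sums(rows):
    """Total bits over total seconds, like SUM(Bytes*8) / SUM(Milliseconds/1000)."""
    total_bits = sum(b * 8 for b, _ in rows)
    total_seconds = sum(ms / 1000 for _, ms in rows)
    return total_bits / total_seconds

def average_of_ratios(rows):
    """Mean of the per-row bitrates, like AVG(Bytes*8 / (Milliseconds/1000))."""
    return sum(b * 8 / (ms / 1000) for b, ms in rows) / len(rows)

print(ratio_of_sums(tracks))      # 320000.0
print(average_of_ratios(tracks))  # 500000.0
```

The ratio of sums weights long tracks more heavily, while the average of ratios gives every row the same weight, so the two only agree when all rows have the same duration.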

Related

Calculating percentage of time value is 0 vs 1

Apologies for the awful title, but I can't think of a better way of explaining this.
I have a field for each ID which can either be 0 or 1, and changes once per day based on the activity of stock in a warehouse. I.e. In stock / out of stock. This is purely used as an approximate percentage for reporting.
E.g. 10 days at 1, 90 days at 0 = 10%
What I'm doing right now is running a cron script once a day to store the current value (with timestamp) in another table, then easily working out the percentage that way.
This works, but there must be a more efficient method. With 100,000 unique IDs, for example, this equates to 36,500,000 rows per year (100,000 IDs times 365 days). You can see the problem there.
I can't think of a more efficient way. Maybe there isn't one.
You can query it dynamically using a simple select:
select count(if(field_name=1,1,null))/count(*)*100 as percentage_of_one from yourtable
field_name is the field which has to be 1 to be counted.
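As a rough illustration of that count-over-count idea, here is a small Python sketch that runs it against an in-memory SQLite database. The table name stock_log, the column in_stock and the data are all assumptions made up for the example, and the query uses CASE instead of MySQL's IF() because SQLite does not have IF() in all versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical log table: one row per ID per day.
conn.execute("CREATE TABLE stock_log (id INTEGER, in_stock INTEGER)")
# Made-up data: item 1 was in stock on 1 of 10 logged days.
rows = [(1, 1)] + [(1, 0)] * 9
conn.executemany("INSERT INTO stock_log VALUES (?, ?)", rows)

query = """
    SELECT id,
           100.0 * SUM(CASE WHEN in_stock = 1 THEN 1 ELSE 0 END) / COUNT(*)
               AS percentage_of_one
    FROM stock_log
    GROUP BY id
"""
result = conn.execute(query).fetchall()
print(result)  # [(1, 10.0)]
```

The GROUP BY id gives one percentage per ID, which seems to be what the question is after; drop it to get a single figure for the whole table.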

mysql hamming distance between two phash

I have a table A which has a column 'template_phash'. I store the phash generated from 400K images.
Now I take a random image and generate a phash from that image.
How do I query table A for the records whose Hamming distance from this phash is below a threshold value, say 20?
I have seen Hamming distance on binary strings in SQL, but couldn't figure it out.
I think I figured out that I need to make a function to achieve this but how?
Both of my phash are in BigInt eg: 7641692061273169067
Please help me make the function so that I could query like
SELECT product_id, HAMMING_DISTANCE(phash1, phash2) as hd
FROM A
WHERE hd < 20 ORDER BY hd ASC;
I figured out that the hamming distance is just the count of different bits between the two hashes. First xor the two hashes then get the count of binary ones:
SELECT product_id, BIT_COUNT(phash1 ^ phash2) as hd from A ORDER BY hd ASC;
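The same XOR-then-count idea can be sketched in Python, here with the threshold filter from the question applied as well. The product IDs and the second and third hashes are made up for illustration; only the first hash value comes from the question. (On the MySQL side, note that a column alias such as hd cannot be used in WHERE; HAVING hd < 20 is the usual way to filter on it.)

```python
def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two integer hashes."""
    return bin(a ^ b).count("1")

# Hypothetical stored hashes keyed by product_id.
base = 7641692061273169067
stored = {
    101: base,
    102: base ^ 0b1011,         # differs from base in 3 bits
    103: base ^ (2**40 - 1),    # differs from base in 40 bits
}
query_hash = base
threshold = 20

# Keep only hashes closer than the threshold, nearest first.
matches = sorted(
    (hamming_distance(h, query_hash), pid)
    for pid, h in stored.items()
    if hamming_distance(h, query_hash) < threshold
)
print(matches)  # [(0, 101), (3, 102)]
```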

I want to divide date of a particular interval into 10 parts and query the count for each part in mysql

I have a table.
And it has two fields id and datetime.
What I need to do is, for any two given datetimes, I need to divide the time range into 10 equal intervals and give row count of each interval.
Please let me know whether this is possible without using any external support from languages like java or php.
-- time1 and time2 stand for the two given datetimes; bucket runs from 0 to 9
SELECT LEAST(FLOOR((UNIX_TIMESTAMP(date_col) - UNIX_TIMESTAMP(time1)) * 10 / (UNIX_TIMESTAMP(time2) - UNIX_TIMESTAMP(time1))), 9) AS bucket, COUNT(id) AS row_count
FROM my_table
WHERE date_col >= time1 AND date_col <= time2
GROUP BY bucket
I haven't tested it. But something like this should work.
The easiest way to divide date intervals is if you store them as longs (i.e. the number of ticks from the "beginning of time"). As far as I know, there is no way to do it using MySQL's datetime.
How you decide to do it ultimately depends on your application. I would store them as longs and have whatever front end you are using handle the conversion to a more readable format.
What exactly do you mean by giving the row count of each interval? That part doesn't make sense to me.
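One reading of "row count of each interval" is: split the range into 10 equal slices and count how many rows fall into each slice. A minimal Python sketch of that reading follows; the function name bucket_counts and the sample datetimes are made up for illustration, and it assumes start is strictly before end.

```python
from collections import Counter
from datetime import datetime

def bucket_counts(timestamps, start, end, buckets=10):
    """Count timestamps per equal-width interval of [start, end]."""
    span = (end - start).total_seconds()
    counts = Counter()
    for ts in timestamps:
        if start <= ts <= end:
            index = int((ts - start).total_seconds() * buckets // span)
            # A timestamp exactly equal to `end` goes into the last bucket.
            counts[min(index, buckets - 1)] += 1
    return [counts[i] for i in range(buckets)]

start = datetime(2010, 6, 7, 0, 0, 0)
end = datetime(2010, 6, 7, 10, 0, 0)    # 10 hours, so each bucket is 1 hour
rows = [
    datetime(2010, 6, 7, 0, 30),
    datetime(2010, 6, 7, 0, 45),
    datetime(2010, 6, 7, 5, 15),
    datetime(2010, 6, 7, 10, 0),
    datetime(2010, 6, 7, 11, 0),        # outside the range, ignored
]
print(bucket_counts(rows, start, end))  # [2, 0, 0, 0, 0, 1, 0, 0, 0, 1]
```

Unlike a plain GROUP BY, this version also reports the empty intervals as zeros.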

Filtering MySQL query result according to an interval of timestamp

Let's say I have a very large MySQL table with a timestamp field. I am going to print the results, so I want to filter some of them out to avoid having too many rows.
Let's say the timestamps increase with the row number and are about one minute apart on average. (They are not necessarily exactly one minute apart, e.g. 2010-06-07 03:55:14, 2010-06-07 03:56:23, 2010-06-07 03:57:01, 2010-06-07 03:57:51, 2010-06-07 03:59:21 ...)
As I mentioned, I want to filter out some of the records. I don't have a specific rule for that, but I was thinking of filtering the rows by timestamp interval. After filtering, I want a result set with a certain number of minutes between timestamps on average (e.g. 2010-06-07 03:20:14, 2010-06-07 03:29:23, 2010-06-07 03:38:01, 2010-06-07 03:49:51, 2010-06-07 03:59:21 ...)
Last but not least, the operation should not take an incredible amount of time; I need it to be almost as fast as a normal select.
Do you have any suggestions?
I wasn't able to come up with a query that would do this off the top of my head, but here's what I was thinking:
If you have a lot of entries within a single minute, figure out a way to collapse the results such that there is max 1 entry for any given minute (DISTINCT, DATE_FORMAT maybe?).
Limit the number of results by using modulus on the minute value, something like this (if you'd like an entry from every 10 minutes):
WHERE MOD(MINUTE(tstamp_column), 10) = 0
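The two steps above (at most one entry per minute, then a modulus on the minute value) can be sketched in Python like this. The function name thin_by_minute and the sample timestamps are made up for illustration.

```python
from datetime import datetime

def thin_by_minute(timestamps, step=10):
    """Keep the first entry of each minute whose minute-of-hour is a multiple of `step`."""
    seen_minutes = set()
    kept = []
    for ts in sorted(timestamps):
        minute_key = ts.replace(second=0, microsecond=0)
        if minute_key in seen_minutes:
            continue                      # already have an entry for this minute
        seen_minutes.add(minute_key)
        if ts.minute % step == 0:
            kept.append(ts)
    return kept

rows = [
    datetime(2010, 6, 7, 3, 20, 14),
    datetime(2010, 6, 7, 3, 25, 23),
    datetime(2010, 6, 7, 3, 30, 1),
    datetime(2010, 6, 7, 3, 30, 51),      # same minute as the previous row
    datetime(2010, 6, 7, 3, 41, 21),
]
kept = [ts.strftime("%H:%M:%S") for ts in thin_by_minute(rows)]
print(kept)  # ['03:20:14', '03:30:01']
```

One caveat with this approach: if no row happens to land in a minute that is a multiple of 10, that whole 10-minute window produces nothing.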
If your goal is to filter records, presumably what you really want is a small percentage of the records, but not the first 10 or 100. In which case, why not just select them randomly? The MySQL RAND() function will return a floating point number n, such that 0 <= n < 1.0. Convert your desired percentage to a floating point number, and use it like this:
SELECT * FROM table
WHERE RAND() < 0.001
If you want repeatable results (for testing), you can use a seed parameter to force the function to always return the same set of numbers.
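The same keep-a-row-with-probability-p idea, including the seeded variant, looks roughly like this in Python. The function name sample_rows and the row values are made up for illustration; Python's random.Random stands in for MySQL's RAND() and will not produce the same numbers.

```python
import random

def sample_rows(rows, fraction, seed=None):
    """Keep each row independently with probability `fraction`."""
    rng = random.Random(seed)     # a fixed seed makes the selection repeatable
    return [row for row in rows if rng.random() < fraction]

rows = list(range(100_000))       # stand-in for the table's rows
first = sample_rows(rows, 0.001, seed=42)
second = sample_rows(rows, 0.001, seed=42)
print(first == second)  # True
```

The size of the result is only approximately fraction times the number of rows, so expect it to vary a little between seeds.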

Avg Time difference in mysql

Is there a function to find the average time difference in the standard time format in MySQL?
You can use timestampdiff to find the difference between two times.
I'm not sure what you mean by "average," though. Average across the table? Average across a row?
If it's the table or a subset of rows:
select
avg(timestampdiff(SECOND, startTimestamp, endTimestamp)) as avgdiff
from
table
The avg function works like any other aggregate function, and will respond to group by. For example:
select
col1,
avg(timestampdiff(SECOND, startTimestamp, endTimestamp)) as avgdiff
from
table
group by col1
That will give you the average differences for each distinct value of col1.
Hopefully this gets you pointed in the right direction!
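For comparison, here is what the grouped version computes, written out as a small Python sketch. The rows are made up for illustration, with columns in the order (col1, startTimestamp, endTimestamp).

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical rows: (col1, startTimestamp, endTimestamp)
rows = [
    ("a", datetime(2020, 1, 1, 10, 0, 0), datetime(2020, 1, 1, 10, 0, 30)),
    ("a", datetime(2020, 1, 1, 11, 0, 0), datetime(2020, 1, 1, 11, 1, 30)),
    ("b", datetime(2020, 1, 1, 12, 0, 0), datetime(2020, 1, 1, 12, 10, 0)),
]

def average_diff_by_group(rows):
    """Average (end - start) in seconds for each distinct value of col1."""
    diffs = defaultdict(list)
    for key, start, end in rows:
        diffs[key].append((end - start).total_seconds())
    return {key: sum(values) / len(values) for key, values in diffs.items()}

print(average_diff_by_group(rows))  # {'a': 60.0, 'b': 600.0}
```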
What I like to do is:
SELECT count(*), AVG(TIME_TO_SEC(TIMEDIFF(end,start)))
FROM
table
Gives the number of rows as well...
In order to get actual averages in the standard time format from mysql I had to convert to seconds, average, and then convert back:
SEC_TO_TIME(AVG(TIME_TO_SEC(TIMEDIFF(timeA, timeB))))
If you don't convert to seconds, you get an odd decimal representation of the minutes that doesn't really make any sense (to me).
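The convert, average, convert back round trip can be sketched in Python as follows. The helper seconds_to_hms and the sample durations are made up for illustration; it assumes non-negative values and simply drops fractional seconds. One difference worth knowing: MySQL's TIME type is limited to 838:59:59 and SEC_TO_TIME() is constrained to that range, whereas this sketch has no such cap.

```python
def seconds_to_hms(seconds: float) -> str:
    """Format a non-negative number of seconds as HH:MM:SS."""
    total = int(seconds)                      # drop fractional seconds
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"

durations = [3600, 5400, 90]                  # hypothetical differences in seconds
average = sum(durations) / len(durations)     # 3030.0
print(seconds_to_hms(average))  # 00:50:30
```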
I was curious if AVG() was accurate or not, the way that COUNT() actually just approximates the value ("this value is an approximation"). After all, let's review the average formula: average = sum / count. So, knowing that the count is accurate is actually really important for this formula!
After testing multiple combinations, it definitely seems like AVG() works and is a great approach. You can calculate yourself to see if it's working with...
SELECT
COUNT(id) AS count,
AVG(TIMESTAMPDIFF(SECOND, OrigDateTime, LastDateTime)) AS avg_average,
SUM(TIMESTAMPDIFF(SECOND, OrigDateTime, LastDateTime)) / (select COUNT(id) FROM yourTable) as calculated_average,
AVG(TIME_TO_SEC(TIMEDIFF(LastDateTime,OrigDateTime))) as timediff_average,
SEC_TO_TIME(AVG(TIME_TO_SEC(TIMEDIFF(LastDateTime, OrigDateTime)))) as date_display
FROM yourTable
Sample Results:
count: 441000
avg_average: 5045436.4376
calculated_average: 5045436.4376
timediff_average: 5045436.4376
date_display: 1401:30:36
Seems to be pretty accurate!
This will return:
count: The count.
avg_average: The average based on AVG(). (Thanks to Eric for their answer on this!)
calculated_average: The average based on SUM()/COUNT().
timediff_average: The average based on TIMEDIFF(). (Thanks to Andrew for their answer on this!)
date_display: A nicely-formatted display version. (Thanks to C S for their answer on this!)