MySQL: Creating buckets on the fly - mysql

I have a MySQL table that stores network utilization every five minutes, and I now want to use this data for graphing. Is there a way to specify just the start time, the end time, and the number of buckets / samples I need, and have MySQL oblige in some way?
My table
+---------------------+-----+
| Tstamp | QID |
+---------------------+-----+
| 2010-12-10 15:05:39 | 20 |
| 2010-12-10 15:06:09 | 26 |
| 2010-12-10 15:06:14 | 27 |
| 2010-12-10 15:06:18 | 28 |
| 2010-12-10 15:06:23 | 40 |
| 2010-12-10 15:10:38 | 20 |
| 2010-12-10 15:11:12 | 26 |
| 2010-12-10 15:11:17 | 27 |
| 2010-12-10 15:11:21 | 28 |
------ SNIP ------
So, can I specify that I need 20 samples from the last 24 hours?
Thanks!
Harsh

You can convert your DATETIME to a UNIX_TIMESTAMP, and play with division and modulo...
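One way to read the "division" hint is to turn each timestamp into seconds, subtract the start of the range, and integer-divide so that every row lands in one of N equal-width buckets. The sketch below is illustrative only. It is written in Python with SQLite standing in for MySQL (`strftime('%s', ...)` plays the role of `UNIX_TIMESTAMP(...)`), and the `bucket_sums` helper, the `table1` name, and the choice of SUM as the aggregate are assumptions, not part of the question.

```python
import sqlite3

# Rows copied from the question's sample table.
SAMPLE_ROWS = [
    ("2010-12-10 15:05:39", 20), ("2010-12-10 15:06:09", 26),
    ("2010-12-10 15:06:14", 27), ("2010-12-10 15:06:18", 28),
    ("2010-12-10 15:06:23", 40), ("2010-12-10 15:10:38", 20),
    ("2010-12-10 15:11:12", 26), ("2010-12-10 15:11:17", 27),
    ("2010-12-10 15:11:21", 28),
]

def bucket_sums(rows, start, end, samples):
    """Sum QID per equal-width time bucket in [start, end)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (Tstamp TEXT, QID INTEGER)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?)", rows)
    # bucket = (seconds since start) * samples / (length of range), integer division.
    # In MySQL, UNIX_TIMESTAMP(x) would replace CAST(strftime('%s', x) AS INTEGER)
    # and DIV would make the integer division explicit.
    sql = """
        SELECT (CAST(strftime('%s', Tstamp) AS INTEGER)
                - CAST(strftime('%s', :start) AS INTEGER)) * :samples
               / (CAST(strftime('%s', :end) AS INTEGER)
                  - CAST(strftime('%s', :start) AS INTEGER)) AS bucket,
               SUM(QID) AS total
        FROM table1
        WHERE Tstamp >= :start AND Tstamp < :end
        GROUP BY bucket
        ORDER BY bucket
    """
    params = {"start": start, "end": end, "samples": samples}
    result = conn.execute(sql, params).fetchall()
    conn.close()
    return result

# Two 5-minute buckets over a 10-minute window:
# bucket_sums(SAMPLE_ROWS, "2010-12-10 15:05:00", "2010-12-10 15:15:00", 2)
# returns [(0, 141), (1, 101)]
```

Buckets that contain no rows are simply absent from the result, so a graphing layer would have to fill those gaps itself.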

Here is a sample query you can use. Note that it does not work if the number of requested samples is more than half of the available records in the given time range (which would make the bucket size one).
-- Configuration
SET @samples = 4;
SET @start = '2011-05-06 19:44:00';
SET @end = '2011-05-06 20:46:50';
--
SET @bucket = (SELECT FLOOR(count(*)/@samples) as bucket_size FROM table1
WHERE Tstamp BETWEEN @start AND @end);
SELECT
SUM(t.QID), FLOOR((t.ID-1)/@bucket) as bucket
FROM (SELECT QID , @r:=@r+1 as ID
FROM table1
JOIN (SELECT @r:=0) r
WHERE Tstamp BETWEEN @start AND @end
ORDER BY Tstamp) as t
GROUP BY bucket
HAVING count(t.QID) = @bucket
ORDER BY bucket;
P.S. I believe there is a more elegant way to do this, but since no one has provided a working query I hope this helps.

Related

How to find the range where the given number in mysql

I want to check which range/level a number falls in. I have a table of buy and pay tiers. The only thing I can think of is BETWEEN, but this case is different because the table has more than a single min and max column.
pay_level
| id | type | buy1 | pay1 | buy2 | pay2 | buy3 | pay3 |
|----|------|------|------|------|------|------|------|
| 1 | p1 | 10 | 100 | 20 | 80 | 30 | 70 |
|----|------|------|------|------|------|------|------|
| 2 | p2 | 10 | 100 | 20 | 80 | 30 | 70 |
|----|------|------|------|------|------|------|------|
| 3 | p3 | 5 | 500 | 10 | 400 | 30 | 300 |
|----|------|------|------|------|------|------|------|
OK, given the table above, my goal is to work out the cost of an incoming order.
For example:
A orders 12 units of p1, so the price per unit is 100, because the quantity is between buy1 and buy2.
B orders 15 units of p1 and gets 100 per unit, the same as A.
C orders 25 units of p1 and gets 70 because it's between pay2 and pay3.
All I can think of is to compare the two columns that the order falls between. So my code is:
select * from pay_level where order between buy1 and buy2 and type='p1'
But the problem occurs when the order is more than 20 (buy2). I know my English is not good enough to explain this clearly. I hope you understand.
First normalise your schema design...
DROP TABLE IF EXISTS wilf;
CREATE TABLE wilf
(id INT AUTO_INCREMENT PRIMARY KEY
,type INT NOT NULL
,x INT NOT NULL
,buy INT NOT NULL
,pay INT NOT NULL
);
INSERT INTO wilf VALUES
(1,1,1,10,100),
(2,2,1,10,100),
(3,3,1, 5,500),
(4,1,2,20, 80),
(5,2,2,20, 80),
(6,3,2,10,400),
(7,1,3,30, 70),
(8,2,3,30, 70),
(9,3,3,30,300);
...and then your queries become trivial...
SELECT pay FROM wilf WHERE type = 1 AND buy < 12 ORDER BY id DESC LIMIT 1;
+-----+
| pay |
+-----+
| 100 |
+-----+
(And C should have got 80)
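The normalised lookup can be tried end to end with a small sketch. This one uses Python with an in-memory SQLite database standing in for MySQL, and the `make_wilf` and `price_per_unit` helpers are hypothetical names. It assumes each `buy` value is the inclusive lower bound of its tier, so it filters with `buy <= qty` and orders by `buy` instead of using the `buy < 12 ORDER BY id` form above; that boundary rule is a guess at the intent, since the question does not say which tier an order of exactly 20 belongs to.

```python
import sqlite3

def make_wilf():
    """Build the normalised table from the answer in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wilf (id INTEGER PRIMARY KEY, type INT NOT NULL,"
        " x INT NOT NULL, buy INT NOT NULL, pay INT NOT NULL)"
    )
    conn.executemany("INSERT INTO wilf VALUES (?, ?, ?, ?, ?)", [
        (1, 1, 1, 10, 100), (2, 2, 1, 10, 100), (3, 3, 1, 5, 500),
        (4, 1, 2, 20, 80), (5, 2, 2, 20, 80), (6, 3, 2, 10, 400),
        (7, 1, 3, 30, 70), (8, 2, 3, 30, 70), (9, 3, 3, 30, 300),
    ])
    return conn

def price_per_unit(conn, type_id, qty):
    """Return pay for the highest tier whose buy threshold is <= qty, or None."""
    row = conn.execute(
        "SELECT pay FROM wilf WHERE type = ? AND buy <= ?"
        " ORDER BY buy DESC LIMIT 1",
        (type_id, qty),
    ).fetchone()
    return row[0] if row else None

# conn = make_wilf()
# price_per_unit(conn, 1, 12) returns 100
# price_per_unit(conn, 1, 25) returns 80
# price_per_unit(conn, 1, 5) returns None (below the first tier)
```

Ordering by `buy` instead of `id` avoids relying on the insertion order matching the tier order.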
You'll need a CASE expression for this one, since you can't refer to a database object (table, column, etc.) dynamically in SQL.
I think something like the following would get you in the ballpark:
SELECT
CASE WHEN order BETWEEN buy1 and buy2 THEN pay1
WHEN order BETWEEN buy2 and buy3 THEN pay2
WHEN order > buy3 THEN pay3 END as cost
FROM pay_level
WHERE type = 'p1'

Detecting variations in a data set

I have a data set with this structure:
ContractNumber | MonthlyPayment | Duration | StartDate | EndDate
One contract number can occur many times as this data set is a consolidation of different reports with the same structure.
Now I want to filter / find the contract numbers in which MonthlyPayment and/or Duration and/or StartDate and/or EndDate differ.
Example (note that Contract Number is not a Primary key):
ContractNumber | MonthlyPayment | Duration | StartDate | EndDate
001 | 500 | 12 | 01.01.2015 | 31.12.2015
001 | 500 | 12 | 01.01.2015 | 31.12.2015
001 | 500 | 12 | 01.01.2015 | 31.12.2015
002 | 1500 | 24 | 01.01.2014 | 31.12.2017
002 | 1500 | 24 | 01.01.2014 | 31.12.2017
002 | 1500 | 24 | 01.01.2014 | 31.12.2018
With this sample data set, I would need to retrieve 002 with a specific query. 001 is the same throughout and does not change, but 002 changes over time.
Apart from writing a VBA script to run over an Excel sheet, I don't have any solid idea of how to solve this with SQL.
My first idea would be an SQL approach with grouping, where identical values are grouped together but differing ones are not. I am currently experimenting with this. My attempt so far is:
1.) Have the usual table
2.) Create a second table / query with this structure:
ContractNumber | AVG(MonthlyPayment) | AVG(Duration) | AVG(StartDate) | AVG(EndDate)
Which I created with Grouping.
E.G.
Table 1.)
ContractNumber | MonthlyPayment
1 | 10
1 | 10
1 | 20
2 | 300
2 | 300
2 | 300
Table 2.)
ContractNumber | AVG(MonthlyPayment)
1 | 13.3
2 | 300
3.) Now I want to find the distinct contract numbers where a value (in this example only MonthlyPayment) does not equal the average. They should be the same; otherwise we have a variation, which is what I need to find.
Do you have any idea how I could solve this? Otherwise I would start writing a VBA or Python script. I have the data set in CSV, so for now I could also do it with MySQL, Power BI or Excel.
I need to perform this analysis only once, so I don't need a complete solution; the queries can be split into separate steps.
Much appreciated! Thank you very much.
To find all contract numbers with differences, use:
select ContractNumber
from
(
select distinct ContractNumber, MonthlyPayment , Duration , StartDate , EndDate
from MyTable
) x
group by ContractNumber
having count(*) >1
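To check the query against the sample rows from the question, here is a minimal sketch in Python using an in-memory SQLite database in place of MySQL. The `contracts_with_variations` helper and the column types are assumptions made for illustration; the SQL itself is the query above with an ORDER BY added for a stable result.

```python
import sqlite3

# Sample rows copied from the question.
SAMPLE_CONTRACTS = [
    ("001", 500, 12, "01.01.2015", "31.12.2015"),
    ("001", 500, 12, "01.01.2015", "31.12.2015"),
    ("001", 500, 12, "01.01.2015", "31.12.2015"),
    ("002", 1500, 24, "01.01.2014", "31.12.2017"),
    ("002", 1500, 24, "01.01.2014", "31.12.2017"),
    ("002", 1500, 24, "01.01.2014", "31.12.2018"),
]

def contracts_with_variations(rows):
    """Return contract numbers that occur with more than one distinct attribute set."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE MyTable (ContractNumber TEXT, MonthlyPayment INTEGER,"
        " Duration INTEGER, StartDate TEXT, EndDate TEXT)"
    )
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?, ?, ?)", rows)
    # The inner DISTINCT collapses identical rows; any contract that still
    # has more than one row afterwards differs in at least one column.
    sql = """
        SELECT ContractNumber
        FROM (SELECT DISTINCT ContractNumber, MonthlyPayment, Duration,
                              StartDate, EndDate
              FROM MyTable) x
        GROUP BY ContractNumber
        HAVING COUNT(*) > 1
        ORDER BY ContractNumber
    """
    result = [r[0] for r in conn.execute(sql)]
    conn.close()
    return result

# contracts_with_variations(SAMPLE_CONTRACTS) returns ['002']
```

Unlike the AVG idea in the question, this comparison also works for the date columns, because it only tests for equality.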

How to add random interval to timestamp in MySQL?

We got the following table mytable:
+----+------------+------------+
| id | created | expired |
+----+------------+------------+
| 1 | 1496476314 | NULL |
| 6 | 1496477511 | NULL |
| 7 | 1496477518 | NULL |
| 12 | 1496477534 | NULL |
| 13 | 1496477536 | NULL |
| 15 | 1496477541 | NULL |
| 21 | 1496477548 | NULL |
| 22 | 1496477550 | NULL |
| 26 | 1496477565 | NULL |
| 28 | 1496477566 | NULL |
| 29 | 1496477583 | NULL |
+----+------------+------------+
We'd like to do the following:
set expired = created + random(15 - 30 minutes) as unix_timestamp where expired is null;
I currently have no idea how to do it.
If you could just give me some ideas, it would save my day.
I tried to convert the created timestamp to date_time and add to that date_time the wanted 15 - 30 minutes and finally convert the new_date_time back to unix_timestamp, but there should be an easier way.
If you want to add a random number of minutes between, say, 14 and 33, you can do it like this:
SET expired = DATE_ADD(created, INTERVAL 14 + RAND()*(33-14) MINUTE);
If you want to have seconds granularity, you need to add SECOND-typed intervals:
SET expired = DATE_ADD(created, INTERVAL 14*60 + RAND()*(33-14)*60 SECOND);
This would save one datetime conversion if you had a DATETIME for the expired column, which makes it slightly easier to expire records (WHERE expired < NOW()). If you have an integer holding a Unix timestamp, then Darshan's answer is definitely the way to go, and you'd do well to calculate the Unix timestamp in your app and then plug it into the query:
WHERE expired <= 123456789
Having an index on that column would make expirations go blazingly fast. I think it might be even faster than the datetime method, but it's just a sensation, I haven't actually checked.
unix_timestamp is number of seconds elapsed since 1st January 1970. Now, if you want to add 15 to 30 minutes then the equivalent seconds would be 900 to 1800. Here's what you can do:
UPDATE mytable SET expired = created + ROUND((RAND() * 900) + 900) WHERE expired IS NULL;
This is how it works:
RAND() will generate a random number between 0 and 1
By using (RAND() * (maximum - minimum)) + minimum we make sure we generate a number between 900 and 1800.
ROUND then rounds that number to nearest int.
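The arithmetic in the steps above can also be done in application code. The sketch below is a hypothetical Python helper (`random_expiry` is not part of either answer) that adds a whole number of seconds between 900 and 1800 to a Unix timestamp. One assumption is made deliberately: it truncates instead of rounding, because rounding makes the two endpoints (900 and 1800) half as likely as every other value. The same idea in MySQL would be `FLOOR(900 + RAND() * 901)`.

```python
import random

def random_expiry(created, min_seconds=900, max_seconds=1800, rng=random):
    """Return created plus a uniformly chosen whole number of seconds in [min, max]."""
    # rng.random() is in [0, 1), so the scaled value is in [0, max - min + 1)
    # and int() truncates it to one of the integers 0 .. (max - min).
    offset = min_seconds + int(rng.random() * (max_seconds - min_seconds + 1))
    return created + offset

# Example with the first row of the question's table; the result varies per call:
# expired = random_expiry(1496476314)
```

Passing a seeded `random.Random` instance as `rng` makes the result reproducible, which is convenient for testing.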

How to get the expired candidates in MySQL

I have a table with columns for the expiry date and the registration date.
The expiry date is set to 2 days after the registration date.
It looks like this:
+--------------+--------------+---------------------+
| candidate_id | date_expires | date_added |
+--------------+--------------+---------------------+
| 1 | 2016-03-26 | 2016-03-24 14:42:18 |
| 2 | 2016-03-23 | 2016-03-21 15:43:40 |
| 3 | 2016-02-15 | 2016-02-13 14:53:30 |
| 4 | 2016-02-22 | 2016-02-20 14:54:19 |
+--------------+--------------+---------------------+
I want to select the profiles that have expired as of the current date and time.
This is how I tried it, but it doesn't work.
SELECT * FROM candidates WHERE date_added = DATE_ADD(date_added, INTERVAL 2 DAY);
Hope somebody may help me out.
Thank you.
You can try either of the following queries, whichever meets your need.
SELECT
*
FROM candidates
WHERE date_expires < CURDATE();
Or if you want to get the expired accounts with respect to date_added field then follow the query given below:
SELECT
*
FROM candidates
WHERE DATE_ADD(date_added, INTERVAL 2 DAY) < CURDATE();
EDIT:
For fine-grained comparison you may use the following query:
SELECT
*
FROM candidates
WHERE TIMESTAMPADD(DAY,2,date_added) < NOW();
Note: Actually you don't need to store the expired dates in database. Rather you can store the profile life time (in this case it is 2 Days) in database if this profile life time varies across different accounts. You don't need to store this in database if it's constant in nature (i.e. Always 2 DAYS).
So if you want to make this change to your table structure, it would look like this:
+--------------+--------------+---------------------+
| candidate_id | days | date_added |
+--------------+--------------+---------------------+
| 1 | 2 | 2016-03-24 14:42:18 |
| 2 | 5 | 2016-03-21 15:43:40 |
| 3 | 3 | 2016-02-13 14:53:30 |
| 4 | 10 | 2016-02-20 14:54:19 |
+--------------+--------------+---------------------+
You need a modified query for this change.
Here it is:
SELECT
*
FROM candidates
WHERE TIMESTAMPADD(DAY,days,date_added) < NOW();
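The difference between the date-only and the fine-grained comparison is easiest to see with a fixed "now". The following sketch uses Python with in-memory SQLite as a stand-in for MySQL: SQLite's `datetime(x, '+N days')` plays the role of `TIMESTAMPADD(DAY, N, x)`, and the `expired_candidates` helper with its explicit `now` argument is an assumption added so the result is reproducible.

```python
import sqlite3

# candidate_id and date_added copied from the question's table.
SAMPLE_CANDIDATES = [
    (1, "2016-03-24 14:42:18"),
    (2, "2016-03-21 15:43:40"),
    (3, "2016-02-13 14:53:30"),
    (4, "2016-02-20 14:54:19"),
]

def expired_candidates(rows, now, lifetime_days=2):
    """Return ids whose date_added plus lifetime_days lies before now."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE candidates (candidate_id INTEGER, date_added TEXT)")
    conn.executemany("INSERT INTO candidates VALUES (?, ?)", rows)
    # Both sides are 'YYYY-MM-DD HH:MM:SS' strings, which compare chronologically.
    sql = """
        SELECT candidate_id
        FROM candidates
        WHERE datetime(date_added, :modifier) < :now
        ORDER BY candidate_id
    """
    params = {"modifier": f"+{int(lifetime_days)} days", "now": now}
    result = [r[0] for r in conn.execute(sql, params)]
    conn.close()
    return result

# expired_candidates(SAMPLE_CANDIDATES, "2016-03-25 12:00:00") returns [2, 3, 4]
# expired_candidates(SAMPLE_CANDIDATES, "2016-03-26 15:00:00") returns [1, 2, 3, 4]
```

In the second call candidate 1 counts as expired at 15:00 on 2016-03-26, whereas a date-only test such as `date_expires < CURDATE()` would not report it until the following day.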
You're looking for this
SELECT *
FROM candidates
WHERE date_expires < NOW();

mysql query returns an empty result set

I have been tasked with a query I am having problems with. Here is the query:
Given a user id and a month, produce a list containing student name, list of files they own (largest to smallest) including total number of files and number of bytes used in a month specified.
Here is what I have so far:
(Select * from htmp_cs368
Join roster_cs368 ON htmp_cs368.userId =
roster_cs368.lastName Where htmp_cs368.userId =
(SELECT lastName FROM roster_cs368 WHERE userId = 'userId' AND htmp_cs368.monthIn = 'monthIn'))
UNION
(Select * from atmp_cs368
JOIN roster_cs368 ON atmp_cs368.userId =
roster_cs368.userId Where roster_cs368.userId =
'userId' AND atmp_cs368.monthIn = 'monthIn') ORDER BY fileSize DESC;
I am getting an empty result set, although my tables are full. I am hoping someone can correct my mistakes.
I have included my schema:
mysql> select * from roster_cs368
-> ;
+--------+-----------+-----------+
| userId | firstName | lastName |
+--------+-----------+-----------+
| apn7cf | Allen | Newton |
| atggg3 | andrew | goebel |
Primary key is userId
mysql> select * from htmp_cs368;
+------------+----------+------------+----------+----------+-------+------+-------+----------------------+
| filePerms | numLinks | userId | idGroup | fileSize | monthIn | day | time | fileName |
+------------+----------+------------+----------+----------+-------+------+-------+----------------------+
| drwx------ | 2 | schulte | faculty | 289 | Nov | 7 | 2011 | Java |
| -rw-r--r-- | 1 | schulte | faculty | 136 | Apr | 29 | 2012 | LD |
| drwxr-xr-x | 3 | schulte | faculty | 177 | Mar | 20 | 2012 | Upgrade |
No primary key here
select * from atmp_cs368;
+------------+----------+--------------+----------+----------+-------+------+-------+-----------------------------+
| filePerms | numLinks | userId | idGroup | fileSize | monthIn | day | time | fileName |
+------------+----------+--------------+----------+----------+-------+------+-------+-----------------------------+
| drwxr-xr-x | 2 | remierm | 203 | 245 | Sep | 17 | 14:40 | 148360_sun_studio_12 |
| drwx---rwx | 31 | antognolij | sasl | 2315 | Oct | 24 | 12:28 | 275 |
| -rwx------ | 1 | kyzvdb | student | 36 | Sep | 19 | 13:05 | 275hh |
No primary key here either.
I have had very little experience with mysql. I also have to come up with:
If no user id is specified, all files, if no month specified, all users and if neither specified, all months and users.
I am stuck and at a loss. I appreciate any help! Thanks!
You seem to have a number of problems in the SQL.
First
Join roster_cs368 ON htmp_cs368.userId = roster_cs368.lastName
You try to join the userId field to the lastName field, which definitely won't work. It should be userId in both tables.
Then
WHERE userId = 'userId' AND htmp_cs368.monthIn = 'monthIn'
Assuming those really are literal strings, they won't match anything in the table. You need to use a parameterized query, and substitute question marks in the SQL, as in
WHERE userId = ? AND htmp_cs368.monthIn = ?
and provide the actual values to be used in the Java code.
I think you're looking for something along these lines (untested, but this will give you a starting point)
List of files
select r.lastName, r.firstName, t.fileName, t.fileSize
from htmp_cs368 t join roster_cs368 r on t.userId=r.userId
where t.userId=? and t.monthIn=?
order by fileSize desc
Summary:
select r.lastName, r.firstName, count(t.fileName), sum(t.fileSize)
from htmp_cs368 t join roster_cs368 r on t.userId=r.userId
where t.userId=? and t.monthIn=?
group by t.userId
This is a simple approach that does not take into account files appearing and disappearing during a month, but you don't seem to have data in your tables for this.
Also, it's not clear what atmp_cs368 is for, or why the time column in one table seems to have year values.
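The parameterized form of the file list can also cover the "if no user id or month is specified" requirement from the question by letting a NULL parameter switch its filter off. This is only a sketch under assumptions: it is written in Python with in-memory SQLite instead of Java and MySQL, `list_files` and `demo_connection` are hypothetical helpers, and the file rows are made-up demo data (only the two roster rows come from the question).

```python
import sqlite3

def demo_connection():
    """Create a tiny in-memory database; the htmp_cs368 rows are invented for the demo."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE roster_cs368 (userId TEXT PRIMARY KEY, firstName TEXT, lastName TEXT)"
    )
    conn.execute(
        "CREATE TABLE htmp_cs368 (userId TEXT, fileSize INTEGER, monthIn TEXT, fileName TEXT)"
    )
    conn.executemany("INSERT INTO roster_cs368 VALUES (?, ?, ?)", [
        ("apn7cf", "Allen", "Newton"), ("atggg3", "andrew", "goebel"),
    ])
    conn.executemany("INSERT INTO htmp_cs368 VALUES (?, ?, ?, ?)", [
        ("apn7cf", 289, "Nov", "Java"),
        ("apn7cf", 136, "Apr", "LD"),
        ("atggg3", 177, "Nov", "Upgrade"),
    ])
    return conn

def list_files(conn, user_id=None, month=None):
    """List files, largest first; a None argument disables that filter."""
    sql = """
        SELECT r.lastName, r.firstName, t.fileName, t.fileSize
        FROM htmp_cs368 t
        JOIN roster_cs368 r ON t.userId = r.userId
        WHERE (:uid IS NULL OR t.userId = :uid)
          AND (:month IS NULL OR t.monthIn = :month)
        ORDER BY t.fileSize DESC
    """
    return conn.execute(sql, {"uid": user_id, "month": month}).fetchall()

# conn = demo_connection()
# list_files(conn, "apn7cf", "Nov") returns [('Newton', 'Allen', 'Java', 289)]
# list_files(conn, month="Nov") returns both November files, largest first
# list_files(conn) returns all three demo rows
```

Binding the values as parameters also avoids the literal 'userId' and 'monthIn' strings that made the original query match nothing.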
As pointed out by others, you seem to have a number of problems in your SQL. I don't think it will even compile.
Try:
SELECT r.userId, files.*
FROM roster_cs368 AS r
JOIN (
Select * from htmp_cs368 WHERE userId = 'userId' AND monthIn = 'monthIn'
UNION
Select * from atmp_cs368 Where userId = 'userId' AND monthIn = 'monthIn'
) AS files ON files.userId = r.userId
ORDER BY files.fileSize DESC;
You need only one JOIN. This lists users and all their files. And take care to equate apples to apples (userId != lastName).
Now to get count of files and file sizes etc you need a GroupBy effectively. But you cannot list files and get count of files together "easily". It will have to be one way or other. Just for the count you can use Jim's solution.
This JOIN looks a tad suspicious...
JOIN roster_cs368 ON htmp_cs368.userId = roster_cs368.lastName
Even if userId in htmp_cs368 has an equivalent value in the lastName column of roster_cs368, this is very bad form. JOINS should typically be done on like-named columns.
If these two columns are unrelated (it's hard to tell when roster_cs368 also has a userId column), then that would be at least part of your problem.
Also, htmp_cs368.monthIn = 'monthIn' doesn't make sense. This won't match anything in that column either.