Apologies if this seems a 'stupid' question - I don't really know the right term to describe what I am trying to do (and thus searching for help on it was a bit fruitless).
Basically, I initially had data that was in the form:
| timestamp | category A | category B | .......| category n|
| 2011-12-02 00:05:00 | 23.63 | 27.00 | .......| 24.03 |
| 2011-12-02 00:10:00 | 23.75 | 24.42 | .......| 24.45 |
| 2011-12-02 00:15:00 | 23.31 | 23.96 | .......| 26.54 |
I put this data into a database (and normalised it) so that it exists in the database as follows:
+---------------------+--------------+-------+
| timestamp | category_id | value |
+---------------------+--------------+-------+
| 2011-12-02 00:05:00 | 2 | 27.00 |
| 2011-12-02 00:10:00 | 2 | 24.42 |
| 2011-12-02 00:15:00 | 2 | 23.96 |
| 2011-12-02 00:20:00 | 2 | 23.73 |
| 2011-12-02 00:25:00 | 2 | 23.73 |
+---------------------+--------------+-------+
What I am trying to do is select different categories by timestamp (to enable comparison), like so:
+---------------------+-------+-------+
| timestamp | cat_a | cat_b |
+---------------------+-------+-------+
| 2011-12-02 00:05:00 | 23.63 | 27.00 |
| 2011-12-02 00:10:00 | 23.75 | 24.42 |
| 2011-12-02 00:15:00 | 23.31 | 23.96 |
| 2011-12-02 00:20:00 | 23.00 | 23.73 |
| 2011-12-02 00:25:00 | 22.91 | 23.73 |
+---------------------+-------+-------+
This is basically the original data structure (but I would like to select/compare a variable number of categories, not just two).
I have been able to do this using joins (after selecting the individual categories into individual tables). This is okay for comparing across, say, two categories, but seems quite inefficient, particularly if I want to select 15 or 20 different categories to compare. It is also problematic if a particular category is missing a data point.
(The other way I have been doing this is by selecting individual tables and later "merging" the data in the python application in which it is later used, but this also seems equally inefficient)
I feel like there must be an easier or more intuitive way to do this in mysql- and I am just missing something quite basic. I don't really want to de-normalize (As there is a lot of categories, and it makes sense to have it normalized for other uses, besides this one).
Cheers,
This is basically a pivot table problem. MySQL doesn't have a built-in SQL extension to make pivot tables like some other DBMSs do so they are a bit tricky. You can find one way of making them here: http://www.artfulsoftware.com/infotree/qrytip.php?id=78
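One common way to emulate a pivot without a dedicated SQL extension is conditional aggregation: one MAX(CASE ...) column per category, grouped by timestamp. The sketch below is not taken from the linked page; it uses Python's built-in sqlite3 module so that it runs on its own, and the same SELECT should also be valid in MySQL. The table name `readings`, the helper `pivot_query`, and the assumption that category A has id 1 are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; the columns mirror the normalised layout in the question.
conn.execute("CREATE TABLE readings (timestamp TEXT, category_id INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        ("2011-12-02 00:05:00", 1, 23.63),
        ("2011-12-02 00:05:00", 2, 27.00),
        ("2011-12-02 00:10:00", 1, 23.75),
        ("2011-12-02 00:10:00", 2, 24.42),
        ("2011-12-02 00:15:00", 2, 23.96),  # category 1 has no reading here
    ],
)


def pivot_query(category_ids):
    """Build a SELECT with one MAX(CASE ...) column per requested category."""
    # int() guards against anything but numeric ids ending up in the SQL text.
    columns = ",\n  ".join(
        f"MAX(CASE WHEN category_id = {int(c)} THEN value END) AS cat_{int(c)}"
        for c in category_ids
    )
    return (
        f"SELECT timestamp,\n  {columns}\n"
        "FROM readings\nGROUP BY timestamp\nORDER BY timestamp"
    )


rows = conn.execute(pivot_query([1, 2])).fetchall()
for row in rows:
    print(row)
```

A missing data point simply shows up as NULL (None in Python) in that category's column, so no row is lost the way it would be with an inner join.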
My solution to this problem used the python data tool pandas. (This won't suit those interested in a pure MySQL solution - for this case, check out Joni's solutions above, or have a look at some of the similar stackoverflow answers e.g. mysql pivot query results with GROUP BY or MySQL pivot table query with dynamic columns).
Firstly I created a pandas dataframe with the data I wanted to select/compare (using the sql.read_frame method from pandas.io, and the appropriate sql_query):
df=sql.read_frame(sql_query,DB_connection)
This created a dataframe as such:
df.head():
timestamp category_id value
0 2011-01-01 00:00:00 4 22.05
1 2011-01-01 00:05:00 4 24.10
2 2011-01-01 00:10:00 4 23.98
3 2011-01-01 00:15:00 4 24.10
4 2011-01-01 00:20:00 4 24.10
This was then "pivoted" using the pandas.pivot_table method:
df2=df.pivot_table(rows='timestamp',cols='category_id',values='value')
Which created the exact output I was after:
df2.head():
category_id 2 4 5 6 7
timestamp
2011-01-01 00:00:00 23.43 22.05 25.07 19.47 21.32
2011-01-01 00:05:00 25.31 24.10 25.69 21.32 22.94
2011-01-01 00:10:00 25.31 23.98 24.84 21.32 22.59
2011-01-01 00:15:00 25.31 24.10 25.47 21.10 21.39
2011-01-01 00:20:00 25.31 24.10 25.69 20.01 17.9
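For anyone trying this on a recent pandas release: the rows= and cols= keywords used above were later replaced by index= and columns=, and sql.read_frame by pandas.read_sql. A minimal sketch of the same pivot with the newer keywords follows; the small DataFrame is typed in by hand from the sample values above (categories 2 and 4 only) rather than read from a database.

```python
import pandas as pd

# Hand-typed sample in the same long format as df above (assumed subset of the data).
df = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            [
                "2011-01-01 00:00:00", "2011-01-01 00:05:00", "2011-01-01 00:10:00",
                "2011-01-01 00:00:00", "2011-01-01 00:05:00", "2011-01-01 00:10:00",
            ]
        ),
        "category_id": [4, 4, 4, 2, 2, 2],
        "value": [22.05, 24.10, 23.98, 23.43, 25.31, 25.31],
    }
)

# index= / columns= are the current names for the old rows= / cols= arguments.
df2 = df.pivot_table(index="timestamp", columns="category_id", values="value")
print(df2)
```

Note that pivot_table averages duplicate (timestamp, category_id) pairs by default, which is harmless when each pair occurs only once.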
Hope someone else finds this useful!
I have a problem about the data that I want to display. Basically I have this table.
history_table:
| history_date_from | history_date_to |
+-----------------+---------------+
| 2019-10-12 | 2019-10-12 |
| 2019-10-25 | 2019-10-28 |
| 2019-11-18 | 2019-11-22 |
| 2019-11-19 | 2019-11-25 |
| 2019-11-20 | 2019-11-20 |
The problem I'm having is this: what if today is already 2019-11-19? I still want to show the third row until 2019-11-22.
Here is my current query:
SELECT history_date_from,history_date_to
FROM history_table
WHERE DATE(history_date_from)= CURDATE() BETWEEN DATE(history_date_from) AND DATE(history_date_to)
But the problem with my query is that it depends only on whether history_date_from equals CURDATE(). What I'm trying to achieve is to keep getting the third row on the following days as well, up to and including its history_date_to.
So if today 2019-11-19 the output should be:
| history_date_from | history_date_to |
| 2019-11-18 | 2019-11-22 |
| 2019-11-19 | 2019-11-25 |
because their history_date_to has not passed yet.
Any help would be really appreciated. I think I'm just making things complicated with my query.
You can use the following solution using BETWEEN:
SELECT *
FROM history_table
WHERE CURDATE() BETWEEN history_date_from AND history_date_to
demo on dbfiddle.uk
You don't need to check for the exact match on history_date_from. You want to know if the current date is in a specific period of time. So BETWEEN is a good way to go.
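To see the BETWEEN condition at work on the sample rows, here is a small self-contained sketch using Python's sqlite3 module. The date is passed in as a parameter instead of CURDATE() so that the result is reproducible; the helper name `active_on` is made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history_table (history_date_from TEXT, history_date_to TEXT)")
conn.executemany(
    "INSERT INTO history_table VALUES (?, ?)",
    [
        ("2019-10-12", "2019-10-12"),
        ("2019-10-25", "2019-10-28"),
        ("2019-11-18", "2019-11-22"),
        ("2019-11-19", "2019-11-25"),
        ("2019-11-20", "2019-11-20"),
    ],
)


def active_on(day):
    """Return the periods that contain `day` (an ISO 'YYYY-MM-DD' string)."""
    return conn.execute(
        "SELECT history_date_from, history_date_to FROM history_table "
        "WHERE ? BETWEEN history_date_from AND history_date_to "
        "ORDER BY history_date_from",
        (day,),
    ).fetchall()


print(active_on("2019-11-19"))
print(active_on("2019-11-23"))
```

For 2019-11-19 this returns the third and fourth rows, and for 2019-11-23 only the fourth, because BETWEEN includes both end points of each period.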
I have the following query:
SELECT `Time`,
`Resolution`,
HOUR(TIMEDIFF(`Resolution`,`Time`)),
TIMEDIFF(`Resolution`,`Time`),
datediff(`Resolution`,`Time`)
FROM Cases;
In order to debug, I add the TIMEDIFF without the HOUR before, just to see if the result is different. I use datediff to double check.
The result of the query is:
+---------------------+---------------------+-------------------------------------+-------------------------------+-------------------------------+
| Time | Resolution | HOUR(TIMEDIFF(`Resolution`,`Time`)) | TIMEDIFF(`Resolution`,`Time`) | datediff(`Resolution`,`Time`) |
+---------------------+---------------------+-------------------------------------+-------------------------------+-------------------------------+
| 2017-01-10 13:35:00 | 2017-01-24 10:52:00 | 333 | 333:17:00 | 14 |
| 2017-01-12 15:53:00 | 2017-02-21 16:06:00 | 838 | 838:59:59 | 40 |
| 2017-01-18 09:19:00 | 2017-01-18 13:39:00 | 4 | 04:20:00 | 0 |
| 2017-01-23 09:00:00 | 2017-01-23 15:08:00 | 6 | 06:08:00 | 0 |
| 2017-01-24 08:49:00 | 2017-02-20 14:34:00 | 653 | 653:45:00 | 27 |
Actually, it returns more rows, but the relevant one is the second result: 838 hours, which translates to 34.91 days (let's say 35), but DATEDIFF gives 40, and when you do the calculation yourself it is 40 days (12th Jan to 21st Feb).
All other 21 results are correct.
Any idea why? A bug in mysql?
All responses are highly appreciated.
Use
TIMESTAMPDIFF(HOUR,`Time`, `Resolution`)
instead.
It also negates the need to use HOUR().
https://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_timestampdiff
The result returned by TIMEDIFF() is limited to the range allowed for TIME values. https://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_timediff
TIME values may range from -838:59:59 to 838:59:59. https://dev.mysql.com/doc/refman/5.5/en/time.html
So you're getting the maximum possible value.
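The effect can be reproduced outside MySQL. The sketch below models the clamp with Python's datetime: one helper caps the difference at 838:59:59 the way a TIME value does, the other computes whole hours without a cap, which is what TIMESTAMPDIFF(HOUR, ...) reports. Both helper names are made up for this illustration, and the sketch assumes the end time is not before the start time.

```python
from datetime import datetime, timedelta

# Upper bound of MySQL's TIME type.
TIME_MAX = timedelta(hours=838, minutes=59, seconds=59)


def timediff_hours_clamped(start, end):
    """Whole hours of the difference after capping it at the TIME maximum."""
    delta = min(end - start, TIME_MAX)
    return int(delta.total_seconds() // 3600)


def timestampdiff_hours(start, end):
    """Whole hours of the difference with no cap applied."""
    return int((end - start).total_seconds() // 3600)


opened = datetime(2017, 1, 12, 15, 53)
resolved = datetime(2017, 2, 21, 16, 6)
print(timediff_hours_clamped(opened, resolved))  # 838
print(timestampdiff_hours(opened, resolved))     # 960
```

960 hours is exactly the 40 days that DATEDIFF reported, plus 13 minutes that are dropped by the whole-hour rounding.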
I need help with a small problem: a subtraction within the same table and column.
I am creating a view, but the application writes the times used into the same table and column.
My table have the following columns: id,field_id,object_id and value_date.
| ID | FIELD_ID | OBJECT_ID | VALUE_DATE |
| 55 | 4 | 33 | 2016-12-18 19:02:00 |
| 56 | 5 | 33 | 2016-12-18 19:12:00 |
| 57 | 4 | 35 | 2016-12-18 19:30:00 |
| 58 | 5 | 35 | 2016-12-18 20:00:00 |
I do not have much knowledge of SQL, but I have tried some functions like timestampdiff, period_diff and other examples from stackoverflow.com.
Could someone help me subtract, in SQL, the row with ID 55 (field_id 4) from the row with ID 56 (field_id 5) for object_id 33, and return the result in minutes? E.g. 10 or 00:10:00
An article about this problem would already help me. Thank you very much!
Let's assume that you want the result in days; then the query will be:
SELECT DATEDIFF(endDate, startDate) AS 'Day'
FROM table1;
The solution is below:
select TIMESTAMPDIFF(MINUTE,F1.value_date,F2.value_date) as minutes, F1.value_date,F2.value_date,F1.object_id,F2.object_id,F1.field_id,F2.field_id
from otrs_tst.dynamic_field_value F1
join otrs_tst.dynamic_field_value F2 on F1.object_id = F2.object_id
where F1.field_id in ('4','5')
and F2.field_id in ('4','5')
and F2.field_id <> F1.field_id
and F1.field_id < F2.field_id
group by F1.object_id,F2.field_id
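The same pairing logic can be checked outside the database. The following sketch is an illustration only: it takes the four sample rows from the question, pairs the field_id 4 and field_id 5 values of each object_id, and reports the difference in whole minutes. The function name `minutes_between_fields` is made up for this example.

```python
from datetime import datetime

# (id, field_id, object_id, value_date) rows copied from the question.
rows = [
    (55, 4, 33, "2016-12-18 19:02:00"),
    (56, 5, 33, "2016-12-18 19:12:00"),
    (57, 4, 35, "2016-12-18 19:30:00"),
    (58, 5, 35, "2016-12-18 20:00:00"),
]


def minutes_between_fields(rows, start_field=4, end_field=5):
    """Map each object_id to the minutes between its start and end field values."""
    by_object = {}
    for _row_id, field_id, object_id, value_date in rows:
        parsed = datetime.strptime(value_date, "%Y-%m-%d %H:%M:%S")
        by_object.setdefault(object_id, {})[field_id] = parsed

    result = {}
    for object_id, fields in by_object.items():
        # Skip objects that are missing either of the two fields.
        if start_field in fields and end_field in fields:
            delta = fields[end_field] - fields[start_field]
            result[object_id] = int(delta.total_seconds() // 60)
    return result


print(minutes_between_fields(rows))
```

Objects that have only one of the two fields are skipped, which corresponds to the inner join in the SQL above.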
I have a problem with my thinking and I must ask someone to help.
I am creating a simple working-time tracking system. I can manage everything else in the code, but I cannot figure out how to calculate the following data.
First, I have a couple of tables, and one of them stores four things.
index | who | timestamp | what
1 | 1 | 2016-09-21 08:00:00 | Work
2 | 2 | 2016-09-21 08:01:00 | Work
3 | 1 | 2016-09-21 10:00:00 | Bin
4 | 2 | 2016-09-21 10:00:00 | Bin
5 | 1 | 2016-09-21 10:15:00 | Bout
6 | 2 | 2016-09-21 10:17:00 | Bout
7 | 2 | 2016-09-21 13:00:00 | Bin
8 | 1 | 2016-09-21 13:00:00 | Bin
9 | 1 | 2016-09-21 13:30:00 | Bout
10 | 2 | 2016-09-21 13:30:00 | Bout
11 | 2 | 2016-09-21 15:58:00 | Home
12 | 1 | 2016-09-21 16:05:00 | Home
I can nicely calculate the time between Work and Home and get the right value for the right person.
But I'm stuck now with the break times.
I need to add up all breaks per person to get the total time spent on breaks per person.
So I need something like the following result when I ask for person 1's info:
Who | Time | Breaktime | Working time
1 | 08:05:00 | 00:45:00 | 07:20:00
Or maybe all persons could be shown on the same page when asking for a specific day...
Who | Time | Breaktime | Working time
1 | 08:05:00 | 00:45:00 | 07:20:00
2 | 07:57:00 | 00:47:00 | 07:10:00
Events always come in pairs: Work -> Home and Bin -> Bout.
And yes, there are many more persons, and there could be many more break times per person.
That might be a bad presentation, sorry about that (NooB). I hope I gave enough information for someone to help. But ask if anything is unclear.
This is the code I use to get one day's total time at the workplace.
SELECT TIMEDIFF(
(SELECT timestamp FROM `stamps` WHERE (who like '1' and DATE (timestamp) like '2016-09-21' and what like 'Home')),
(SELECT timestamp FROM `stamps` WHERE (who like '1' and DATE (timestamp) like '2016-09-21' and what like 'Work'))
)
But I cannot use it with multiple events.
That is what I found. Maybe not nice, but it works =).
$break = mysqli_query($conn,"SELECT what, timestamp FROM stamps WHERE (who like '$id' and DATE(timestamp) like '$day' and what like 'B%') ORDER BY timestamp");
$seconds = 0; // total break time in seconds
$start = null; // start of the currently open break, if any
while($row = mysqli_fetch_array($break))
{
if ( $row['what'] == "Bin" )
$start = strtotime( $row['timestamp'] );
elseif ( $start !== null ) { $stop = strtotime( $row['timestamp'] );
$seconds += ( $stop - $start );
$start = null; // the break is closed
}
}
$btime = gmdate('H:i:s', $seconds);
$btime gives the total break time in HH:MM:SS format (e.g. 00:45:00).
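For comparison, here is the same Bin/Bout pairing as a small Python sketch that handles all persons in one pass. It is only an illustration built on the sample rows from the question; the function name `break_totals` is made up, and a Bout without a preceding Bin is silently ignored.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# (who, timestamp, what) rows copied from the question.
stamps = [
    (1, "2016-09-21 08:00:00", "Work"), (2, "2016-09-21 08:01:00", "Work"),
    (1, "2016-09-21 10:00:00", "Bin"),  (2, "2016-09-21 10:00:00", "Bin"),
    (1, "2016-09-21 10:15:00", "Bout"), (2, "2016-09-21 10:17:00", "Bout"),
    (2, "2016-09-21 13:00:00", "Bin"),  (1, "2016-09-21 13:00:00", "Bin"),
    (1, "2016-09-21 13:30:00", "Bout"), (2, "2016-09-21 13:30:00", "Bout"),
    (2, "2016-09-21 15:58:00", "Home"), (1, "2016-09-21 16:05:00", "Home"),
]


def break_totals(stamps):
    """Sum the time between each Bin and the following Bout, per person."""
    totals = defaultdict(timedelta)
    open_break = {}  # who -> start time of the break currently in progress
    for who, timestamp, what in sorted(stamps, key=lambda s: s[1]):
        when = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        if what == "Bin":
            open_break[who] = when
        elif what == "Bout" and who in open_break:
            totals[who] += when - open_break.pop(who)
    return dict(totals)


for who, total in sorted(break_totals(stamps).items()):
    print(who, total)
```

The totals come out as 45 minutes for person 1 and 47 minutes for person 2, matching the Breaktime column in the desired output.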
I have an event system and for my repeat events I am using a cron like system.
Repeat Event:
+----+----------+--------------+
| id | event_id | repeat_value |
+----+----------+--------------+
| 1 | 11 | *_*_* |
| 2 | 12 | *_*_2 |
| 3 | 13 | *_*_4/2 |
| 4 | 14 | 23_*_* |
| 5 | 15 | 30_05_* |
+----+----------+--------------+
NOTE: The cron value is day_month_day of week
Event:
+----+------------------------+---------------------+---------------------+
| id | name | start_date_time | end_date_time |
+----+------------------------+---------------------+---------------------+
| 11 | Repeat daily | 2014-04-30 12:00:00 | 2014-04-30 12:15:00 |
| 12 | Repeat weekly | 2014-05-06 12:00:00 | 2014-05-06 13:00:00 |
| 13 | Repeat every two weeks | 2014-05-08 12:45:00 | 2014-05-08 13:45:00 |
| 14 | Repeat monthly | 2014-05-23 15:15:00 | 2014-05-23 16:00:00 |
| 15 | Repeat yearly | 2014-05-30 07:30:00 | 2014-05-30 10:15:00 |
+----+------------------------+---------------------+---------------------+
Anyway I have a query to select the events:
SELECT *
FROM RepeatEvent
JOIN `Event`
ON `Event`.`id` = `RepeatEvent`.`event_id`
That produces:
+----+----------+--------------+----+------------------------+---------------------+---------------------+
| id | event_id | repeat_value | id | name | start_date_time | end_date_time |
+----+----------+--------------+----+------------------------+---------------------+---------------------+
| 1 | 11 | *_*_* | 11 | Repeat daily | 2014-04-30 12:00:00 | 2014-04-30 12:15:00 |
| 2 | 12 | *_*_2 | 12 | Repeat weekly | 2014-05-06 12:00:00 | 2014-05-06 13:00:00 |
| 3 | 13 | *_*_4/2 | 13 | Repeat every two weeks | 2014-05-08 12:45:00 | 2014-05-08 13:45:00 |
| 4 | 14 | 23_*_* | 14 | Repeat monthly | 2014-05-23 15:15:00 | 2014-05-23 16:00:00 |
| 5 | 15 | 30_05_* | 15 | Repeat yearly | 2014-05-30 07:30:00 | 2014-05-30 10:15:00 |
+----+----------+--------------+----+------------------------+---------------------+---------------------+
However, I want to select events within a month. I will only have certain conditions: daily, weekly, every two weeks, monthly and yearly.
I want to put in my where clause a way to divide the string of the repeat value and, if it fits any of the following conditions, to show it as a result (repeatEvent is the row that is being interrogated, search is the date being looked for):
array(3) = string_divide(repeat_value, '_')
daily = array(0)
month = array(1)
dayOfWeek = array(2)
if(daily == '*' && month == '*' && dayOfWeek == '*') //returns all the daily events as they will happen
return repeatEvent
if(daily == '*' && month == '*' && dayOfWeek == search.dayOfWeek) //returns all the weekly events on a specific day of the week
return repeatEvent
if(daily == search.date && month == '*' && dayOfWeek == '*') //returns all the monthly events on this day of the month
return repeatEvent
if (contains(dayOfWeek, '/'))
array(2) = string_divide(dayOfWeek,'/')
specificDayOfWeek = array(0);
if(specificDayOfWeek == repeatEvent.start_date.dayNumber)
if(((timestampOf(search.timestamp)-timestampOf(repeatEvent.start_date))/604800) is 0 or even) //604800 = seconds in a week
return repeatEvent
if(daily == search.date && month == search.month && dayOfWeek == '*') //returns a single yearly event (shouldn't often crop up)
return repeatEvent
//everything else is either an unknown format of repeat_value or not an event on this day
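As a runnable counterpart to the pseudocode, here is a rough Python sketch of the matching rules. It rests on several assumptions that the question does not state outright: day-of-week numbers follow the ISO convention (Monday = 1, which fits the sample rows, where 2014-05-06 is a Tuesday and 2014-05-08 a Thursday), "4/2" means weekday 4 every 2 weeks counted from the start date, and an event never occurs before its start date. The function name `occurs_on` is made up for this illustration.

```python
from datetime import date


def occurs_on(repeat_value, start, search):
    """Return True if an event with this repeat_value recurs on `search`."""
    day, month, dow = repeat_value.split("_")
    if search < start:
        return False  # assumption: nothing happens before the first occurrence

    if "/" in dow:
        # "weekday/interval": the given weekday, every `interval` weeks.
        weekday, interval = (int(part) for part in dow.split("/"))
        if search.isoweekday() != weekday:
            return False
        weeks_since_start = (search - start).days // 7
        return weeks_since_start % interval == 0

    # '*' in a field matches anything; otherwise the field must equal the date part.
    return (
        (day == "*" or int(day) == search.day)
        and (month == "*" or int(month) == search.month)
        and (dow == "*" or int(dow) == search.isoweekday())
    )


print(occurs_on("*_*_4/2", date(2014, 5, 8), date(2014, 5, 22)))
print(occurs_on("30_05_*", date(2014, 5, 30), date(2014, 6, 30)))
```

The first call checks the fortnightly event two weeks after its start, the second checks the yearly event one month later.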
To summarise I want to run a query in which the repeat value is split in the where clause and I can interrogate the split items. I have looked at cursors but the internet seems to advise against them.
I could process the results of selecting all the repeat events in PHP, however, I imagine this being very slow.
Here is what I would like to see if looking at the month of April:
+----------+--------------+----+------------------------+---------------------+---------------------+
| event_id | repeat_value | id | name | start_date_time | end_date_time |
+----------+--------------+----+------------------------+---------------------+---------------------+
| 11 | *_*_* | 11 | Repeat daily | 2014-04-30 12:00:00 | 2014-04-30 12:15:00 |
+----------+--------------+----+------------------------+---------------------+---------------------+
Here is what I would like to see if looking at the month of May
+----------+--------------+----+------------------------+---------------------+---------------------+
| event_id | repeat_value | id | name | start_date_time | end_date_time |
+----------+--------------+----+------------------------+---------------------+---------------------+
| 11 | *_*_* | 11 | Repeat daily | 2014-04-30 12:00:00 | 2014-04-30 12:15:00 |
| 12 | *_*_2 | 12 | Repeat weekly | 2014-05-06 12:00:00 | 2014-05-06 13:00:00 |
| 13 | *_*_4/2 | 13 | Repeat every two weeks | 2014-05-08 12:45:00 | 2014-05-08 13:45:00 |
| 14 | 23_*_* | 14 | Repeat monthly | 2014-05-23 15:15:00 | 2014-05-23 16:00:00 |
| 15 | 30_05_* | 15 | Repeat yearly | 2014-05-30 07:30:00 | 2014-05-30 10:15:00 |
+----------+--------------+----+------------------------+---------------------+---------------------+
Here is what I would like to see if looking at the month of June
+----------+--------------+----+------------------------+---------------------+---------------------+
| event_id | repeat_value | id | name | start_date_time | end_date_time |
+----------+--------------+----+------------------------+---------------------+---------------------+
| 11 | *_*_* | 11 | Repeat daily | 2014-04-30 12:00:00 | 2014-04-30 12:15:00 |
| 12 | *_*_2 | 12 | Repeat weekly | 2014-05-06 12:00:00 | 2014-05-06 13:00:00 |
| 13 | *_*_4/2 | 13 | Repeat every two weeks | 2014-05-08 12:45:00 | 2014-05-08 13:45:00 |
| 14 | 23_*_* | 14 | Repeat monthly | 2014-05-23 15:15:00 | 2014-05-23 16:00:00 |
+----------+--------------+----+------------------------+---------------------+---------------------+
You could put a bandaid on this, but no one would be doing you any favors to tell you that that is the answer.
If your MySQL database can be changed I would strongly advise you to split your current column with underscores day_month_day of year to three separate columns, day, month, and day_of_year. I would also advise you to change your format to be INT rather than VARCHAR. This will make it faster and MUCH easier to search and parse, because it is designed in a way that doesn't need to be translated into computer language through complicated programs... It is most of the way there already.
Here's why:
Reason 1: Your Table is not Optimized
Your table is not optimized and will be slowed regardless of what you choose to do at this stage. SQL is not built to have multiple values in one column. The entire point of an SQL database is to split values into different columns and rows.
The advantage to normalizing this table is that it will be far quicker to search it, and you will be able to build queries in MySQL. Take a look at Normalization. It is a complicated concept, but once you get it you will avoid creating messy and complicated programs.
Reason 2: Your Table could be tweaked slightly to harness the power of computer date/time functions.
Computers follow time based on Unix Epoch Time, which counts the seconds elapsed since 1970-01-01 00:00:00 UTC. Further, each computer and computer-based program/system has built-in, quick date and time functions. MySQL is no different.
I would also recommend storing all of these as integers. repeat_doy (day of year) can easily be a smallint or at least a standard int, and instead of putting a month and day, you can put the actual 1-365 day of the year. You can use DAYOFYEAR(NOW()) to input this into MySQL. To pull it back out as a date you can use MAKEDATE(YEAR(NOW()), repeat_doy). Instead of an asterisk to signify all, you can use either 0 or NULL.
With a cron like system you probably will not need to do that sort of calculation anyway.
Instead, it will probably be easier to just measure the day of year elsewhere (every computer and language can do this. In Unix it is just date +%j).
Solution
Split your one repeat_value into three separate values and turn them all into integers based on UNIX time values. Day is 1-7 (or 0-6 for Sunday to Saturday), Month is 1-12, and day of year is 1-365 (remember, we are not including 366 because we are basing our year on an arbitrary non-leap year).
If you want to pull information in your SELECT query in your original format, it is much easier to use concat to merge the three columns than it is to try to search and split on one column. You can also easily harness built in MySQL functions to quickly turn what you pull into real, current, days, without a bunch of effort on your part.
To implement it in your SQL database:
+----+----------+--------------+--------------+------------+
| id | event_id | repeat_day | repeat_month | repeat_doy |
+----+----------+--------------+--------------+------------+
| 1 | 11 | * | * | * |
| 2 | 12 | * | * | 2 |
| 3 | 13 | * | * | 4/2 |
| 4 | 14 | 23 | * | * |
| 5 | 15 | 30 | 5 | * |
+----+----------+--------------+--------------+------------+
Now you should be able to build one query to get all of this data together regardless of how complicated your query. By normalizing your table, you will be able to fully harness the power of relational databases, without the headaches and hacks.
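To give a feel for how simple the month filter becomes once the fields are separate integer columns, here is a small sketch using Python's sqlite3 module. It assumes NULL is used in place of the asterisk, as suggested above, and it stores only the weekday number for the fortnightly event; the "/2" interval would need a column of its own, which is left out here. The helper name `events_in_month` is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE RepeatEvent (
           id INTEGER PRIMARY KEY,
           event_id INTEGER,
           repeat_day INTEGER,    -- NULL means "any"
           repeat_month INTEGER,  -- NULL means "any"
           repeat_doy INTEGER     -- NULL means "any"
       )"""
)
conn.executemany(
    "INSERT INTO RepeatEvent VALUES (?, ?, ?, ?, ?)",
    [
        (1, 11, None, None, None),
        (2, 12, None, None, 2),
        (3, 13, None, None, 4),  # the "/2" interval is not modelled in this sketch
        (4, 14, 23, None, None),
        (5, 15, 30, 5, None),
    ],
)


def events_in_month(month):
    """Return the event ids that can occur in the given month (1-12)."""
    cursor = conn.execute(
        "SELECT event_id FROM RepeatEvent "
        "WHERE repeat_month IS NULL OR repeat_month = ? "
        "ORDER BY event_id",
        (month,),
    )
    return [row[0] for row in cursor]


print(events_in_month(5))
print(events_in_month(6))
```

No string splitting or pattern matching is needed, and an index on repeat_month can be used by the database.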
Edit
Hugo Delsing made a great point in the comments below. In my initial example I provided a fix to leap years for day_of_year in which I chose to ignore Feb 29. A much better solution removes the need for a fix. Split day_of_year to month and day with a compound index. He also has a suggestion about weeks and number of weeks, but I will just recommend you read it for more details.
Try to write where condition using this:
substring_index(repeat_value,'_', 1)
instead of daily
substring_index(substring_index(repeat_value,'_', -2), '_', 1)
instead of monthly
and
substring_index(substring_index(repeat_value,'_', -1), '_', 1)
instead of dayOfWeek
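If it is unclear what the nested calls return, the sketch below mimics SUBSTRING_INDEX in Python and applies the three expressions above to one of the sample values. The Python function is a stand-in written for this illustration, not part of any library.

```python
def substring_index(text, delim, count):
    """Mimic MySQL's SUBSTRING_INDEX(text, delim, count)."""
    parts = text.split(delim)
    if count > 0:
        return delim.join(parts[:count])   # everything before the count-th delimiter
    if count < 0:
        return delim.join(parts[count:])   # everything after the count-th delimiter from the right
    return ""


value = "30_05_*"
daily = substring_index(value, "_", 1)
monthly = substring_index(substring_index(value, "_", -2), "_", 1)
day_of_week = substring_index(substring_index(value, "_", -1), "_", 1)
print(daily, monthly, day_of_week)
```

For "30_05_*" the three parts are 30, 05 and *. The outer call in the third expression is redundant, since the inner call with -1 already returns only the last field.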
I think you are overthinking the problem if you only want the events per month and not per day. Assuming that you always correctly fill the repeat_value, the query is very basic.
Basically all events occur in every month where the repeat_value is either LIKE '%_*_%' or LIKE '%_{month}_%'.
Since you mention PHP, I'm assuming you are building the query in PHP, so I used the same.
<?php
function buildQuery($searchDate) {
//you could/should do some more checking if the date is valid if the user provides the string
$searchDate = empty($searchDate) ? date("Y-m-d") : $searchDate;
$splitDate = explode('-', $searchDate);
$month = $splitDate[1];
//Select everything that started before the search date
//the \_ is because else the _ would match any char.
$query = 'SELECT *
FROM RepeatEvent
JOIN `Event`
ON `Event`.`id` = `RepeatEvent`.`event_id`
WHERE `Event`.`start_date_time` < \''.$searchDate.'\'
AND
(
`RepeatEvent`.`repeat_value` LIKE \'%\_'.$month.'\_%\'
OR `RepeatEvent`.`repeat_value` LIKE \'%\_*\_%\'
)
';
return $query;
}
//show queries for all months on current day/year
for ($month = 1; $month<=12; $month++) {
//zero-pad the month so that it matches stored values such as 05
echo buildQuery(date('Y-'.sprintf('%02d', $month).'-d')) . '<hr>';
}
?>
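The same filter can be tried out with bound parameters instead of string concatenation. The sketch below uses Python's sqlite3 module and the sample rows from the question; it is an illustration, not a drop-in replacement for the PHP above. SQLite needs an explicit ESCAPE clause for the backslash, whereas MySQL treats the backslash as the LIKE escape character by default. The helper name `events_for` is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Event (id INTEGER PRIMARY KEY, name TEXT, start_date_time TEXT);
    CREATE TABLE RepeatEvent (id INTEGER PRIMARY KEY, event_id INTEGER, repeat_value TEXT);
    """
)
conn.executemany(
    "INSERT INTO Event VALUES (?, ?, ?)",
    [
        (11, "Repeat daily", "2014-04-30 12:00:00"),
        (12, "Repeat weekly", "2014-05-06 12:00:00"),
        (13, "Repeat every two weeks", "2014-05-08 12:45:00"),
        (14, "Repeat monthly", "2014-05-23 15:15:00"),
        (15, "Repeat yearly", "2014-05-30 07:30:00"),
    ],
)
conn.executemany(
    "INSERT INTO RepeatEvent VALUES (?, ?, ?)",
    [(1, 11, "*_*_*"), (2, 12, "*_*_2"), (3, 13, "*_*_4/2"), (4, 14, "23_*_*"), (5, 15, "30_05_*")],
)


def events_for(search_date):
    """Return ids of events started before `search_date` ('YYYY-MM-DD') that repeat in its month."""
    month = search_date[5:7]  # zero-padded month, e.g. '05'
    sql = (
        "SELECT e.id FROM RepeatEvent r JOIN Event e ON e.id = r.event_id "
        "WHERE e.start_date_time < ? "
        "AND (r.repeat_value LIKE ? ESCAPE '\\' OR r.repeat_value LIKE ? ESCAPE '\\') "
        "ORDER BY e.id"
    )
    # The escaped underscores match literal '_' characters rather than any character.
    params = (search_date, "%\\_" + month + "\\_%", "%\\_*\\_%")
    return [row[0] for row in conn.execute(sql, params)]


print(events_for("2014-05-31"))
print(events_for("2014-06-30"))
```

With the end of May as the search date all five events are returned; with the end of June the yearly event (30_05_*) drops out.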
Now if the repeat_value could be wrong, you could add a simple regex check to make sure the value is always like *_*_* or *_*_*/*
You can use basic regular expressions in MySQL:
http://dev.mysql.com/doc/refman/5.0/en/pattern-matching.html
For a monthly event in May (first day) you can use a pattern like this (not tested):
[0-9\*]+\_[5\*]\_1
You can generate this pattern via PHP
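A quick way to try such patterns before putting them into a query is Python's re module. The two patterns below are only a sketch: one validates the overall day_month_dayOfWeek shape (including the optional "/n" suffix), the other is generated for a given month and accepts either that month or an asterisk in the middle field. Both names and the exact patterns are assumptions made for this example, not the untested pattern quoted above.

```python
import re

# day (or *) _ month (or *) _ day of week (or *), with an optional "/n" interval.
REPEAT_VALUE = re.compile(r"^(\*|\d{1,2})_(\*|\d{1,2})_(\*|[0-7](/\d+)?)$")


def is_valid_repeat_value(value):
    """Check that a repeat_value has the expected three-field shape."""
    return REPEAT_VALUE.match(value) is not None


def month_pattern(month):
    """Build a pattern whose month field is '*' or `month`, zero-padded or not."""
    return re.compile(rf"^[0-9*]+_(\*|0?{int(month)})_.+$")


samples = ["*_*_*", "*_*_2", "*_*_4/2", "23_*_*", "30_05_*"]
print([value for value in samples if is_valid_repeat_value(value)])
print([value for value in samples if month_pattern(6).match(value)])
```

All five sample values pass the shape check, and the June pattern keeps every sample except 30_05_*.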