I have a database column (time) whose data type is DATETIME, e.g. 2019-11-08 15:49:26.860. I want to list only the rows that belong to 2019-11-08, regardless of the time component. My database table looks like the following:
ID | task | time
1 | read | 2019-11-08 01:00:00.546
2 | sleep | 2019-11-08 03:00:00.546
3 | boxing | 2019-11-18 01:00:00.546
I tested the following query, but it doesn't work because the DATETIME value is not a string:
select * from task where time = '2019-11-08%'
Since time is of the DATETIME data type (a somewhat counter-intuitive name for the column, by the way), don't use string functions. Use date functions instead:
select * from task where date(time) = '2019-11-08';
I would actually recommend using the following, since it is more index-friendly:
select * from task where time >= '2019-11-08' and time < '2019-11-09';
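As a quick sanity check, the equivalence of the two forms can be sketched with SQLite from the Python standard library; ISO-8601 date strings compare the same way there as MySQL DATETIME values do, and SQLite also has a date() function. The table and rows mirror the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task (id INTEGER, task TEXT, time TEXT)")
conn.executemany("INSERT INTO task VALUES (?, ?, ?)", [
    (1, "read",   "2019-11-08 01:00:00.546"),
    (2, "sleep",  "2019-11-08 03:00:00.546"),
    (3, "boxing", "2019-11-18 01:00:00.546"),
])

# date() truncates the time component before comparing...
by_func = conn.execute(
    "SELECT id FROM task WHERE date(time) = '2019-11-08'").fetchall()

# ...while the half-open range leaves the column untouched,
# which is what makes it index-friendly.
by_range = conn.execute(
    "SELECT id FROM task WHERE time >= '2019-11-08' AND time < '2019-11-09'"
).fetchall()

print(by_func, by_range)  # [(1,), (2,)] [(1,), (2,)]
```

Both queries return rows 1 and 2 and skip the 2019-11-18 row.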
Related
This might be a trivial question for some of you, but I haven't found/understood a solution to the following problem:
I have a large (c. 60 GB) database structured the following way:
+----------+----------+------+-----+---------+-------+
| Field    | Type     | Null | Key | Default | Extra |
+----------+----------+------+-----+---------+-------+
| date     | datetime | YES  | MUL | NULL    |       |
| chgpct1d | double   | YES  |     | NULL    |       |
| pair     | text     | YES  |     | NULL    |       |
+----------+----------+------+-----+---------+-------+
The database stores the last 10 years of daily percentage changes for c 200k different pair-trades. Thus, neither date nor pair is a unique key (a combination of date + pair would be). There are c 2600 distinct date entries and c 200k distinct pairs which generate > 520 MM rows.
The following query takes multiple minutes to return a result.
SELECT date, chgpct1d, pair FROM db WHERE date = '2018-12-20';
What can I do to speed things up?
I've read about multiple-column indices but I'm not sure if that would help in my case given that all of the WHERE-queries will only ever point to the 'date' field.
MySQL probably does a full table scan to satisfy your query. That's like looking up a word in a dictionary that has its entries in random order: very slow.
Two things:
Create an index on these columns: (date, chgpct1d, pair).
Because the column named date has the DATETIME data type, it can potentially contain values like 2018-12-20 10:17:20. When you say WHERE date = '2018-12-20', it actually means WHERE date = '2018-12-20 00:00:00'. So use this instead:
WHERE date >= '2018-12-20'
  AND date < '2018-12-21'
That will capture all the date values at any time on your chosen date.
Why does this help? Because your multicolumn index starts with date, MySQL can do a range scan on it given the WHERE statement you have. And, because the index contains everything needed by your query the database server doesn't have to look anywhere else, but can satisfy the query directly from the index. That index is said to cover the query.
Notice that with half a gigarow in your table, creating the index will take a while. Do it overnight or something.
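Both suggestions together can be sketched with SQLite as a stand-in (the index DDL syntax differs slightly from MySQL's ALTER TABLE ... ADD INDEX, but the multicolumn index and the half-open range carry over directly; the sample rows are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE db (date TEXT, chgpct1d REAL, pair TEXT)")
conn.executemany("INSERT INTO db VALUES (?, ?, ?)", [
    ("2018-12-20 00:00:00",  0.5, "A/B"),
    ("2018-12-20 10:17:20", -0.2, "C/D"),  # missed by date = '2018-12-20'
    ("2018-12-21 00:00:00",  0.1, "A/B"),
])

# The multicolumn index from suggestion 1: it contains every column
# the query needs, so it can cover the query.
conn.execute("CREATE INDEX idx_date ON db (date, chgpct1d, pair)")

# The half-open range from suggestion 2.
rows = conn.execute(
    "SELECT date, chgpct1d, pair FROM db "
    "WHERE date >= '2018-12-20' AND date < '2018-12-21'").fetchall()
print(len(rows))  # 2 -- both rows on 2018-12-20, any time of day

# SQLite's plan output typically reports a covering-index range search
# here ("SEARCH ... USING COVERING INDEX idx_date").
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT date, chgpct1d, pair FROM db "
    "WHERE date >= '2018-12-20' AND date < '2018-12-21'").fetchall()
print(plan)
```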
I have received a MySQL table with customer data which is pretty badly structured; almost all the fields are TEXT. To avoid losing data I have created another table into which I am trying to import the columns in a correct and useful manner. Splitting full_name into name and surname columns works great, but when I try to convert the created_time field, which holds a dd/mm/yy date, to DATETIME it produces wrong data, e.g. 10/10/18 becomes 2010-10-18.
I have resorted to creating another field, created_time_old, copying the text data there and then converting it via STR_TO_DATE as another answer suggested. However, it only works when I pass string literals; when I run it on the whole column it just gives me:
1411 - Incorrect datetime value: '' for function str_to_date.
I assume that there is another way/function that will manage to do it but I am not very experienced when it comes to SQL. Moreover, if you have any other suggestions when it comes to my code, please post them :)
My code is below
TRUNCATE TABLE Facetel_bazaPGS;
INSERT INTO Facetel_bazaPGS (id,created_time_old,campaign_name,email,Imię,Nazwisko,`phone_number`,platform,Time_added)
SELECT `id`,created_time,`campaign_name`,`email`,substring_index(`full_name`, ' ',1 ),substring(`full_name` from INSTR(`full_name`, ' ') + 1),`phone_number`,`platform`,Time_added
FROM `Facetel_bazaPGS_input`;
UPDATE Facetel_bazaPGS
SET platform = Replace(REPLACE(platform, 'fb', 'Facebook') , 'ig', 'Instagram');
UPDATE Facetel_bazaPGS
SET created_time = STR_TO_DATE(`created_time_old`, '%d/%m/%y');
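The '%d/%m/%y' format used above can be sanity-checked outside MySQL; Python's strptime happens to use the same directive names for these fields:

```python
from datetime import datetime

# '%d/%m/%y' reads day first, so 10/10/18 is 10 October 2018 --
# not 2010-10-18, which is what the implicit TEXT-to-DATETIME
# cast in the question produced.
parsed = datetime.strptime("10/10/18", "%d/%m/%y")
print(parsed.date())  # 2018-10-10
```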
EDIT#1: Adding sample data (can't give real data because of GDPR)
+------------------+--------------+---------------+--------------------+--------------+---------------------+----------+---------------------+
| id | created_time | campaign_name | email | full_name | phone_number | platform | Time_added |
+------------------+--------------+---------------+--------------------+--------------+---------------------+----------+---------------------+
| 1010334092505681 | 10/10/18 | leady | samplemail#mail.eu | Name | your_typical_number | ig | 2018-10-11 08:29:45 |
| 1010457652493325 | 10/10/18 | leady | samplemail#mail.eu | Name Surname | your_typical_number | ig | 2018-10-11 08:29:45 |
| 1010470612492029 | 10/10/18 | leady | samplemail#mail.eu | Name Surname | your_typical_number | fb | 2018-10-11 08:29:45 |
+------------------+--------------+---------------+--------------------+--------------+---------------------+----------+---------------------+
This answer is speculative, but it attempts to get to the root cause of the 1411 error coming from the call to STR_TO_DATE. Try running the following query to detect malformed date strings in your Facetel_bazaPGS_input source table:
SELECT *
FROM Facetel_bazaPGS_input
WHERE created_time NOT REGEXP '^[0-9]{2}/[0-9]{2}/[0-9]{2}$';
This would return any record whose created_time is not in the format you expect. If the result set is not empty, those are the date strings you need to fix.
Note: I see now that your error message suggests some created_time values are either the empty string or NULL. The query above will flush out the empty strings; to catch NULLs as well, add OR created_time IS NULL to the WHERE clause (NOT REGEXP evaluates to NULL, not true, for NULL input).
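The same pattern can be exercised in plain Python with the stdlib re module; an empty string, which is what the 1411 error message hints at, fails the match:

```python
import re

# Same pattern as the REGEXP in the diagnostic query.
pattern = re.compile(r"^[0-9]{2}/[0-9]{2}/[0-9]{2}$")

print(bool(pattern.match("10/10/18")))    # True  -- well-formed dd/mm/yy
print(bool(pattern.match("")))            # False -- would trigger error 1411
print(bool(pattern.match("2018-10-10")))  # False -- wrong format entirely
```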
Thank you again for all your suggestions. I have given up, changed the format of the column in Excel, imported it again, and asked for any future inputs to use the YYYY-MM-DD format in the created_time column.
I need to query info in MySQL where I'm given two time strings, so I need to find everything in between.
The table looks like this:
id | date | hour | other | columns | that are not important
-----------------------------------------------------------
1 | 2016-04-11| 1 | asdsa......
2 | 2016-04-11| 2 | asdasdsadsadas...
.
.
.
n | 2016-04-12| 23 | sadasdsadsadasd
Say I have the time strings 2016-04-11 1 and 2016-04-12 23, and I need to find all rows from 1 to n. I can separate the date and hour and do a query using BETWEEN ... AND for the date, but I have no idea how to fit the hour into the formula. Using another BETWEEN definitely won't work, so I need to fit the condition somewhere else. I'm not sure how to proceed, though.
WHERE ((`date` = fromDate AND `hour` >= fromHour) OR `date` > fromDate)
  AND ((`date` = toDate AND `hour` <= toHour) OR `date` < toDate)
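The condition amounts to a lexicographic comparison on the (date, hour) pair; Python's tuple ordering makes it easy to check which rows an inclusive window should capture (the boundary values are taken from the question, the sample rows are illustrative):

```python
# (id, date, hour) rows, as in the question's table.
rows = [
    (1, "2016-04-11", 1),
    (2, "2016-04-11", 2),
    (3, "2016-04-12", 23),
    (4, "2016-04-13", 0),   # outside the requested window
]

lo = ("2016-04-11", 1)
hi = ("2016-04-12", 23)

# Tuples compare element by element, mirroring the chained
# (date, hour) conditions in the WHERE clause.
selected = [r[0] for r in rows if lo <= (r[1], r[2]) <= hi]
print(selected)  # [1, 2, 3]
```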
I know I could use PHP to do this, but wanted to find out if there was a way to calculate the difference between two times using just a query? I tried the query below, but it's returning NULL for the time difference.
The data in my table is stored as:
| created | changed |
+------------+------------+
| 1333643004 | 1333643133 |
I wanted to figure out a way to return:
| 2012-04-05 09:23:24 | 2012-04-05 09:25:33 | 00:02:09 |
I tried:
SELECT
FROM_UNIXTIME(created) AS created,
FROM_UNIXTIME(changed) AS changed,
TIMEDIFF ( changed, created ) / 60 AS timediff
FROM content
WHERE id = 45;
Which yielded:
| 2012-04-05 09:23:24 | 2012-04-05 09:25:33 | NULL |
The result returned by TIMEDIFF() is limited to the range allowed for
TIME values. Alternatively, you can use either of the functions
TIMESTAMPDIFF() and UNIX_TIMESTAMP(), both of which return integers.
Your created and changed columns already hold Unix timestamps, which are plain integers, so just subtract them to get the difference in seconds; TIMEDIFF() returned NULL because it expects temporal arguments, not integers. SEC_TO_TIME() then formats the difference as the HH:MM:SS value from your example:
SELECT
    FROM_UNIXTIME(created) AS created,
    FROM_UNIXTIME(changed) AS changed,
    SEC_TO_TIME(changed - created) AS difference
FROM content
WHERE id = 45;
http://dev.mysql.com/doc/refman/5.5/en/date-and-time-functions.html#function_unix-timestamp
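The arithmetic itself is trivial once the values are treated as seconds; a quick check with the sample epochs from the question:

```python
from datetime import datetime, timedelta, timezone

created, changed = 1333643004, 1333643133

# Epoch values are already integer seconds, so plain subtraction works.
diff_seconds = changed - created
print(diff_seconds)                     # 129
print(timedelta(seconds=diff_seconds))  # 0:02:09

# FROM_UNIXTIME() equivalent, shown in UTC here; the question's
# 09:23:24 implies a local timezone, so the wall-clock time may differ.
print(datetime.fromtimestamp(created, tz=timezone.utc))
```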
I have a database with a created_at column containing the datetime in Y-m-d H:i:s format.
The latest datetime entry is 2011-09-28 00:10:02.
I need the query to be relative to the latest datetime entry.
The first value in the query should be the latest datetime entry.
The second value in the query should be the entry closest to 7 days from the first value.
The third value should be the entry closest to 7 days from the second value.
REPEAT #3.
What I mean by "closest to 7 days from":
The following are dates; the interval I desire is a week, which is 604800 seconds.
7 days from the first value is equal to 1316578202 (1317183002-604800)
the value closest to 1316578202 (7 days) is... 1316571974
unix timestamp | Y-m-d H:i:s
1317183002 | 2011-09-28 00:10:02 -> appear in query (first value)
1317101233 | 2011-09-27 01:27:13
1317009182 | 2011-09-25 23:53:02
1316916554 | 2011-09-24 22:09:14
1316836656 | 2011-09-23 23:57:36
1316745220 | 2011-09-22 22:33:40
1316659915 | 2011-09-21 22:51:55
1316571974 | 2011-09-20 22:26:14 -> closest to 7 days from 1317183002 (first value)
1316499187 | 2011-09-20 02:13:07
1316064243 | 2011-09-15 01:24:03
1315967707 | 2011-09-13 22:35:07 -> closest to 7 days from 1316571974 (second value)
1315881414 | 2011-09-12 22:36:54
1315794048 | 2011-09-11 22:20:48
1315715786 | 2011-09-11 00:36:26
1315622142 | 2011-09-09 22:35:42
I would really appreciate any help. I have not been able to do this via MySQL, and no online resources seem to deal with relative date manipulation such as this. I would like the query to be modular enough to change the interval to weekly, monthly, or yearly. Thanks in advance!
Answer #1 Reply:
SELECT
UNIX_TIMESTAMP(created_at)
AS unix_timestamp,
(
SELECT MIN(UNIX_TIMESTAMP(created_at))
FROM my_table
WHERE created_at >=
(
SELECT max(created_at) - 7
FROM my_table
)
)
AS `random_1`,
(
SELECT MIN(UNIX_TIMESTAMP(created_at))
FROM my_table
WHERE created_at >=
(
SELECT MAX(created_at) - 14
FROM my_table
)
)
AS `random_2`
FROM my_table
WHERE created_at =
(
SELECT MAX(created_at)
FROM my_table
)
Returns:
unix_timestamp | random_1 | random_2
1317183002 | 1317183002 | 1317183002
Answer #2 Reply:
RESULT SET:
This is the result set for a yearly interval:
id | created_at | period_index | period_timestamp
267 | 2010-09-27 22:57:05 | 0 | 1317183002
1 | 2009-12-10 15:08:00 | 1 | 1285554786
I desire this result:
id | created_at | period_index | period_timestamp
626 | 2011-09-28 00:10:02 | 0 | 0
267 | 2010-09-27 22:57:05 | 1 | 1317183002
I hope this makes more sense.
It's not exactly what you asked for, but the following example is pretty close....
Example 1:
select
floor(timestampdiff(SECOND, tbl.time, most_recent.time)/604800) as period_index,
unix_timestamp(max(tbl.time)) as period_timestamp
from
tbl
, (select max(time) as time from tbl) most_recent
group by period_index
gives results:
+--------------+------------------+
| period_index | period_timestamp |
+--------------+------------------+
| 0 | 1317183002 |
| 1 | 1316571974 |
| 2 | 1315967707 |
+--------------+------------------+
This breaks the dataset into groups based on "periods", where (in this example) each period is 7-days (604800 seconds) long. The period_timestamp that is returned for each period is the 'latest' (most recent) timestamp that falls within that period.
The period boundaries are all computed based on the most recent timestamp in the database, rather than computing each period's start and end time individually based on the timestamp of the period before it. The difference is subtle - your question requests the latter (iterative approach), but I'm hoping that the former (approach I've described here) will suffice for your needs, since SQL doesn't lend itself well to implementing iterative algorithms.
If you really do need to determine each period based on the timestamp in the previous period, then your best bet is going to be an iterative approach -- either using a programming language of your choice (like php), or by building a stored procedure that uses a cursor.
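The grouping trick in Example 1 is just integer division on the age of each row relative to the newest row. Here is that arithmetic applied to the timestamps from the question, reproducing the Example 1 result:

```python
# Unix timestamps from the question's sample data.
timestamps = [
    1317183002, 1317101233, 1317009182, 1316916554, 1316836656,
    1316745220, 1316659915, 1316571974, 1316499187, 1316064243,
    1315967707,
]

WEEK = 604800  # 7 days in seconds
most_recent = max(timestamps)

# period_index = floor((most_recent - t) / WEEK), as in Example 1;
# for each period keep the latest (maximum) timestamp, which is what
# the MAX(tbl.time) aggregate does in the SQL.
periods = {}
for t in timestamps:
    idx = (most_recent - t) // WEEK
    periods[idx] = max(periods.get(idx, 0), t)

print(periods)  # {0: 1317183002, 1: 1316571974, 2: 1315967707}
```

This matches the period_timestamp column in the Example 1 result set.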
Edit #1
Here's the table structure for the above example.
CREATE TABLE `tbl` (
`id` int(10) unsigned NOT NULL auto_increment PRIMARY KEY,
`time` datetime NOT NULL
)
Edit #2
Ok, first: I've improved the original example query (see revised "Example 1" above). It still works the same way, and gives the same results, but it's cleaner, more efficient, and easier to understand.
Now... the query above is a group-by query, meaning it shows aggregate results for the "period" groups as I described above, not row-by-row results like a "normal" query. With a group-by query, you're limited to using aggregate columns only. Aggregate columns are those that are named in the GROUP BY clause, or that are computed by an aggregate function like MAX(time). It is not possible to extract meaningful values for non-aggregate columns (like id) from within the projection of a group-by query.
Unfortunately, MySQL doesn't generate an error when you try to do this (unless the ONLY_FULL_GROUP_BY SQL mode is enabled). Instead, it picks an arbitrary value from within the grouped rows and shows that value for the non-aggregate column in the grouped result. This is what caused the odd behavior the OP reported when trying to use the code from Example 1.
Fortunately, this problem is fairly easy to solve. Just wrap another query around the group query, to select the row-by-row information you're interested in...
Example 2:
SELECT
entries.id,
entries.time,
periods.idx as period_index,
unix_timestamp(periods.time) as period_timestamp
FROM
tbl entries
JOIN
(select
floor(timestampdiff( SECOND, tbl.time, most_recent.time)/31536000) as idx,
max(tbl.time) as time
from
tbl
, (select max(time) as time from tbl) most_recent
group by idx
) periods
ON entries.time = periods.time
Result:
+-----+---------------------+--------------+------------------+
| id | time | period_index | period_timestamp |
+-----+---------------------+--------------+------------------+
| 598 | 2011-09-28 04:10:02 | 0 | 1317183002 |
| 996 | 2010-09-27 22:57:05 | 1 | 1285628225 |
+-----+---------------------+--------------+------------------+
Notes:
Example 2 uses a period length of 31536000 seconds (365-days). While Example 1 (above) uses a period of 604800 seconds (7-days). Other than that, the inner query in Example 2 is the same as the primary query shown in Example 1.
If a matching period_timestamp belongs to more than one entry (i.e. two or more entries have exactly the same time, and that time matches one of the selected period_timestamp values), then the above query (Example 2) will include multiple rows for that period timestamp (one for each match). Whatever code consumes this result set should be prepared to handle such an edge case.
It's also worth noting that these queries will perform much, much better if you define an index on your datetime column. For my example schema, that would look like this:
ALTER TABLE tbl ADD INDEX idx_time ( time )
If you're willing to settle for the closest entry after the week boundary (rather than the closest overall), then this will work. You could extend it to find the true closest, but it would look so disgusting it's probably not worth it.
select unix_tstamp
     , ( select min(unix_tstamp)
           from my_table
          where sql_tstamp >= ( select max(sql_tstamp) - INTERVAL 7 DAY
                                  from my_table )
       )
     , ( select min(unix_tstamp)
           from my_table
          where sql_tstamp >= ( select max(sql_tstamp) - INTERVAL 14 DAY
                                  from my_table )
       )
  from my_table
 where sql_tstamp = ( select max(sql_tstamp)
                        from my_table )