Homogenize a field with different date formats in Mysql - mysql

I am working with Mysql workbench.
I have a huge database in a csv, that contains among other things, 3 columns with different formats of dates.
To be able of loading this csv file into my database, I have to set the 3 date columns as text, otherwise, it doesn't upload them properly.
Here an example of my data:
inDate, outDAte
19-01-10, 02-02-10
04-01-11 12:02, 2011-01-11 11:31
29-01-11 6:57, 29-03-2010
30-03-10, 01-04-2010
2012-12-03 05:39:27.040, 12-12-12 17:04
2012-12-04 13:47:01.040, 29-11-12
I want to homogenize them and to make 2 columns of each one of those one only with "date" and other only with "time".
I have tried working with "regular expressions" and with "case".
When I used "reg expressions" gave me nulls and with "case" gave me "truncated incorrect value".
I have tried to find something similar situations in the web. I have found that people got similar issues but with two date formats not with so many different formats as I do:
Convert varchar column to date in mysql at database level
Converting multiple datetime formats to single mysql datetime format
Format date in SELECT * query.
I am really new in this and I do not know how to write so many exceptions in mysql.

Load the CSV into a temporary table; massage the values in that table; finally copy to the 'real' table.
Have 2 columns in that table for each date; one for the raw value coming from the CSV; the other being a DATETIME(3) (or whatever the sanitized version will be).
Do one of these for each distinctly different format:
UPDATE tmp SET inDate = ...
WHERE raw_inDate REGEXP '...';
The WHERE may need things like AND LENGTH(inDate) = 8 and other ways to test other than REGEXP.
SUBSTRING_INDEX(inDate, '-', ...) may be a handy function for splitting up a date.
But, really, I would rather write the code in Perl or some other real programming language.

Related

How to convert date from MM/dd/yyyy to yyyy-MM-dd in MySQL? Error Code: 1411. Incorrect datetime value: '' for function str_to_date

I have a large dataset with employees' time entries. The current date format is MM/dd/yyyy. However, I need to convert all the dates into yyyy-MM-dd format.
I have tried the following:
Update human_resources.timekeeping
Set Actual_Date = str_to_date(Actual_Date,'%d-%m-%Y');
Got the errror messsage Error Code: 1411. Incorrect datetime value: '' for function str_to_date.
My SQL version is 5.7.18-log.
I tried to view SQL mode using SELECT ##sql_mode; and I got NO ENGINE SUBSTITUTION.
I have tried to retrieve the value like shown below and it was working fine.
Converting varchar mm/dd/yy to date format yyyy-mm-dd
However, updating the data would not work. I need to update the actual records, not insert new records.
Hope someone can help me regarding this. Thank you in advance!
EDIT: The data type for Actual_Date is VARCHAR.
Apologies if my explanation may be a bit confusing. But I am using this data set to display and filter time entries in a gridview. When I am filtering dates, say for example (01/15/2022-01/25/2022), data from 2021 is also being displayed. When I tried to manually change the format of some of my data in sql to yyyy-MM-dd, my code seemed to be working fine. The problem is there are a lot of data in this table, which is why manually updating the format is impossible. What is the first thing that I need to do? I'm sorry this is all still a bit confusing for me.
My apologies if you have already taken the following things into consideration but I thought them worth mentioning.
Given that you say this is a "large dataset" I assume this is a table that is currently in use. Does the existing application rely on the Actual_Date being in that string format? Does it rely on a fixed number of columns in the table? Some poorly written applications can be very brittle when it comes to changing underlying data structure.
You may want to consider creating a copy of the current table, modifying the structure of the copy, and replacing the original with a view with the same columns and formats as the original. This way you get improved data but reduce risk to existing application.
In the title and first line of your question you state that the current format is MM/dd/yyyy
Update human_resources.timekeeping Set Actual_Date = str_to_date(Actual_Date,'%m/%d/%Y');
Your separator is / not -
%d-%m-%Y >> %d/%m/%Y

How to format a date field into diferente languages in the same MySQL command?

The following will return the day in Italian:
SET lc_time_names = 'it_IT';
select date_format('2018/01/01','%W') as day_italian;
However I need to convert or format the date into multiple languages, so it would return me another column in English, Japanese, so on...
My problem is that I have to set the locale BEFORE running the select command.
Create a temporary table with three columns: language, weekday (int) and translation. Then you can join on it with over the language and DAYOFWEEK() or WEEKDAY(). Sadly there is no easier way since not a single date-function supports language parameters.

Performance in rlike expression or alternate query?

I am doing a series of updates on some tables after I import them from tab-separated values. The data comes with dates in a format I do not like. I bring them in as strings, manipulate them so that they are in the same format as MySQL dates and then convert the column. Or sometimes not, but I want them to be like MySQL dates even if they are strings.
They start out like '1/4/2013 12:00:00 AM' or '11/4/2012 2:37:45 PM'.
I turn these into '2013-01-04' (usually, since times are present even when the original schema clearly specifies dates only) and '2012-11-04 14:37:45'.
I am using rlike. And this does not use indexes? Wow. That sucks.
But already, for each column, I have to use 4 updates to handle the different cases ('1/7', '2/13', '11/2', '12/24'). If I did these using like, it might take 16 different updates for each column....
And, if I am seeing it right, I cannot even get positional parameters out of the rlike expression, yes? You know, the part of the expression wrapped in parentheses that becomes $1 or $2....
So, it seems as though it is going to be quicker to pre-process the tsv file with perl. Really? Wow. Again, this sucks.
Any other suggestions? I cannot have this taking 3 hours every time I need to pull in the data.
Recall the classic 1997 quote from Jamie Zawinski:
Some people, when confronted with a problem, think "I know, I'll use regular expressions."
Now they have two problems.
Have you tried using STR_TO_DATE()? This is exactly for parsing nonstandard date/time strings into canonical datetime values.
If you try parsing with STR_TO_DATE() and the string doesn't match the expected format, the function returns NULL.
So you could try parsing in different formats, and return the first one that gives a non-null result.
UPDATE mytable
SET datecolumn = COALESCE(
STR_TO_DATE(stringcolumn, '%m/%d'),
STR_TO_DATE(stringcolumn, '%d/%m/%Y'),
...etc.
);
I can't tell what your different cases are. It might or might not be possible to cover all cases in one pass.
Another alternative is as you say, preprocess the raw data with Perl before you load it into MySQL. But even then, don't fight with regular expressions, use Date::Parse instead.

UNIX_TIMESTAMP outputting NULL in MySQL?

I've got a table setup which has populated data. In column "date", I have dates in the following format:
yyyymmdd i.e. 20131110
I have created a new field and called it newdate with the text format.
Then, I open up the SQL window and put the following in
UPDATE wl_daily
SET
newdate = UNIX_TIMESTAMP(date)
For some reason, it is running correctly, however it only outputs NULL to all the rows. Also, the column name is blank for some reason
Any suggestions?
That's because your field in a string and you're trying to add timestamp to it which is not a string. You need to use a valid datetime field like timestamp for this to work.
Advice: don't store dates and times as strings. Store them in their native format. It makes working with dates and times much easier.
While John Cronde's answer is correct - it doesnt help your situtation
UNIX_TIMESTAMP(STR_TO_DATE(`date`, '%Y%m%d'))
will do the conversion for example
SELECT UNIX_TIMESTAMP(STR_TO_DATE('20131111', '%Y%m%d'))
returns
unix_timestamp(STR_TO_DATE('20131111', '%Y%m%d'))
---------------------------------------------------
1384128000
You should only use this to convert your columns to the date specific columns. Converting each time you need a number will add load and slow down the query if used in production

MySqlImport - Import a date field not in the proper format

I have a csv file that has a date field in a format like (among other fields):
17DEC2009
When I do a mysqlimport, the other fields are imported properly, but this field remains 0000-00-00 00:00:00
How can I import this date properly? Do I have to run a sed/awk command on the file first to put it into a proper format? If so, what would that be like? Does the fact that the month is spelled out instead of a number matter?
STR_TO_DATE() enables you to convert a string to a proper DATE within the query. It expects the date string, and a format string.
Check the examples in the manual entry to figure out the correct format.
I think it should be along the lines of %d%b%Y (However the %b is supposed to produce Strings like Dec instead of DEC so you will have to try out whether it works).
I had this issue in the past. What I had to do was to utilize LOAD DATA and set the appropriate expression here -
[SET col_name = expr,...]
http://dev.mysql.com/doc/refman/5.1/en/load-data.html
Here is the approach I took to solve similar problem. My use case was bit complex with so many columns, but making here simple to present the solution.
I have Persons table with (Id int autogen, name varchar(100),DOB date), and few million of data(name,DOB) needs to be populated from CSV file.
Created additional column in persons table with name like (varchar_DOB varchar(25)).
Imported data using mysqlimport utility into columns(name,varchar_DOB).
Executed update query that updated DOB column using str_to_date(varchar_DOB,'format') function.
Now, I have expected data populated DOB column.
The same logic could be applied in doing even other kind of data formatting like double,time_stamp etc.