How to update a VARCHAR column value with RegEx?


I need to update the values of a VARCHAR column in a MySQL database from YYMMDDSXXXXX to YYMMDDSXXXX, where YY is the year (e.g. 11 for 2011), MM is the month (e.g. 09 for September), DD is the day (e.g. 15), S is a one-digit order number (1 to 0) and XXXXX is a sequential number from 00001 to 99999.
I need to reduce the range of the sequential number ten-fold, so that it goes from 0001 to 9999.
I thought about something like:
update TABLE_NAME set FIELD_NAME = replace(FIELD_NAME, 'find this string', 'replace found string with this string');
But I'm not very good with MySQL, so I'm not sure how to do it. Can someone help?
Thanks in advance!

Regex replace can be implemented easily using this:
https://launchpad.net/mysql-udf-regexp
The regex you will need (in Perl/POSIX standard notation, e.g. with sed):
s/([0-9]{7})[0-9]([0-9]{4})/$1$2/g
That assumes you want to shorten the counter from the left (dropping the most significant digit), so that the counts you already have stay unique. Obviously, if you don't need that many places, the most significant digit is the useless one to remove.
That should get you well on your way, have fun :)
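If installing a UDF is not an option, the same transformation can be sketched with MySQL's built-in string functions, because the value has a fixed layout. This is only a sketch: TABLE_NAME and FIELD_NAME are the placeholders from the question, and it assumes every value to change is exactly 12 digits long. Run the SELECT first to preview the result.

```sql
-- TABLE_NAME and FIELD_NAME are placeholders for the real table and column.
-- Preview: keep YYMMDDS (first 7 characters) plus the last 4 digits of the counter.
SELECT FIELD_NAME AS old_value,
       CONCAT(LEFT(FIELD_NAME, 7), RIGHT(FIELD_NAME, 4)) AS new_value
FROM TABLE_NAME
WHERE FIELD_NAME REGEXP '^[0-9]{12}$';

-- Apply: drops the most significant digit of the counter,
-- e.g. '110915100042' becomes '11091510042'.
UPDATE TABLE_NAME
SET FIELD_NAME = CONCAT(LEFT(FIELD_NAME, 7), RIGHT(FIELD_NAME, 4))
WHERE FIELD_NAME REGEXP '^[0-9]{12}$';
```

The WHERE clause keeps the update from touching rows that were already shortened, so it is safe to run more than once. If the dropped digit is ever non-zero, two old values can collapse into the same new value, so check for duplicates in the preview before applying.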

Related

SUBSTRING_INDEX not working in MySQL

I am trying to find the max invoice number:
SELECT IFNULL(MAX(SUBSTRING_INDEX(invoice,'I', -1)) + 1, 1) AS invoice
FROM sales
SQL Fiddle
When I run this SQL query, it cannot count past 10.
invoice
20221026P1I1
20221026P1I2
20221026P1I3
20221026P1I4
20221026P1I5
20221026P1I6
20221026P1I7
20221026P1I8
20221026P1I9
20221026P1I10
20221026P1I11
20221026P1I12
I am trying to get the max invoice number plus one: 12 + 1 = 13

Your use of SUBSTRING_INDEX() is correct; however, you should cast the string value to a bona fide integer:
SELECT COALESCE(MAX(CAST(SUBSTRING_INDEX(invoice, 'I', -1) AS UNSIGNED)) + 1, 1) AS invoice
FROM sales;
The problem with trying to find the max of the text substrings themselves is that text numbers sort lexicographically, e.g.
1
10
11
2
23
But this isn't the behavior you want; you want the numeric maximum. Hence we should cast these substrings and then compare.
Side note: You could have avoided this problem entirely by maintaining a pure numeric invoice number column. You may want to change your table design to include such a column.
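The difference between the lexicographic and the numeric maximum described above can be checked with a few literal values. This is a minimal sketch with made-up sample values, not data from the question's table:

```sql
-- Three invoice suffixes as text, the way SUBSTRING_INDEX() returns them.
SELECT MAX(n)                   AS text_max,     -- compared as strings
       MAX(CAST(n AS UNSIGNED)) AS numeric_max   -- compared as numbers
FROM (SELECT '9' AS n UNION ALL SELECT '10' UNION ALL SELECT '12') AS sample_values;
-- text_max = '9', numeric_max = 12
```

This is why the uncast query stops at 10: the text maximum is '9', and '9' + 1 is 10.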

How do you round floats conditionally?

I am writing a query that is used by report generating software.
Part of this is querying for the hours needed to complete a project. We record this as a 2-decimal float so that we can estimate to the quarter hour.
However, if we are using it in our report and the hours we recorded are something like 8.00, I want to query the value and format it so that 8.00 is just 8. However, any hours with something past the decimal point, like 8.25, should remain as 8.25. How can I make this work?
hours                      Queried Result
======  -> My Query ->     ==============
8.00                       8
8.25                       8.25
I am using MySQL 5.6

You can use the REPLACE() function to remove .00:
REPLACE(hours, '.00', '') AS hours

You can convert it to a string and trim a trailing '.00' if there is one. Note that MySQL's CAST() accepts CHAR as a target type, not VARCHAR:
SELECT TRIM(TRAILING '.00' FROM CAST(column_name AS CHAR));

SELECT REPLACE(ROUND(8.00, 2), '.00', '');
Here are more examples to make the logic clear:
MySQL ROUND() rounds the number given as its first argument to the number of decimal places given as its second argument.
Syntax:
ROUND(N[, D]);
Here N is the number to round, and D is the number of decimal places to round it to. If D is omitted, it defaults to 0.
Example 1:
SELECT ROUND(4.43);
Output:
4
The statement above rounds 4.43. No number of decimal places is given, so the default of 0 is used.
Example 2:
SELECT ROUND(-4.53);
Output:
-5
The statement above rounds -4.53. Again no number of decimal places is given, so the default of 0 is used.
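To make the "only when the value is whole" condition from the question explicit, one option is a CASE expression. This is a sketch only: the derived table of literals stands in for the real table, and hours is assumed to be the column name from the question. Note that the result is a string, which is usually fine for report output.

```sql
SELECT hours,
       CASE
         WHEN hours = FLOOR(hours) THEN CAST(FLOOR(hours) AS CHAR)  -- whole hours: 8.00 -> '8'
         ELSE CAST(hours AS CHAR)                                   -- fractional hours stay as stored
       END AS formatted_hours
FROM (SELECT 8.00 AS hours UNION ALL SELECT 8.25) AS sample_hours;
```

Unlike a plain REPLACE() on '.00', this does not depend on how many decimal places the value is displayed with.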

Storing date periods in database

I would like to discuss the "best" way to store date periods in a database. Let's talk about SQL/MySQL, but this question may apply to any database. I have the feeling I have been doing something wrong for years...
In English, the information I have is:
-In year 2014, value is 1000
-In year 2015, value is 2000
-In year 2016, there is no value
-In year 2017 (and go on), value is 3000
Some may store it as:
BeginDate   EndDate     Value
2014-01-01  2014-12-31  1000
2015-01-01  2015-12-31  2000
2017-01-01  NULL        3000
Others may store it as:
Date        Value
2014-01-01  1000
2015-01-01  2000
2016-01-01  NULL
2017-01-01  3000
The validation rules for the first method look like mayhem to develop if holes and overlaps are to be avoided.
In the second method, the problem seems to be finding the period that contains one specific date.
What do my colleagues prefer? Any other suggestions?
EDIT: I used full years only as an example; my data usually changes with day granularity.
EDIT 2: I thought about treating the stored "Date" as "BeginDate", ordering rows by Date, then taking the "EndDate" from the next (or previous) row. Storing "BeginDate" and "Interval" would lead to the same hole/overlap problem as method one, which needs a complex validation rule to avoid.
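For the second layout, looking up the value in force on one specific date can be sketched as "the latest row that starts on or before that date". The table and column names below (period_values, begin_date, amount) are hypothetical, chosen only for this illustration:

```sql
-- Hypothetical table for the second layout: one row per change, NULL meaning "no value".
CREATE TABLE period_values (
  begin_date DATE NOT NULL PRIMARY KEY,
  amount     INT NULL
);

INSERT INTO period_values (begin_date, amount) VALUES
  ('2014-01-01', 1000),
  ('2015-01-01', 2000),
  ('2016-01-01', NULL),
  ('2017-01-01', 3000);

-- Value in force on 2015-07-20: the most recent row starting on or before that date.
SELECT amount
FROM period_values
WHERE begin_date <= '2015-07-20'
ORDER BY begin_date DESC
LIMIT 1;
-- returns 2000
```

Because each row implicitly ends where the next one begins, this layout cannot contain holes or overlaps, and the primary key on begin_date supports the lookup.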

It mostly depends on the way you will be using this information - I'm assuming you do more than just store values for a year in your database.
Lots of guesses here, but I guess you have other tables with time-bounded data, and that you need to compare the dates to find matches.
For instance, in your current schema:
select *
from other_table ot
inner join year_table yt on ot.transaction_date between yt.year_start and yt.year_end
That should be an easy query to optimize - it's a straight data comparison, and if the table is big enough, you can add indexes to speed it up.
In your second schema suggestion, it's not as easy:
select *
from other_table ot
inner join year_table yt
on ot.transaction_date >= yt.year_start
and ot.transaction_date < yt.year_start + INTERVAL 1 YEAR
Crucially - this is harder to optimize, as every comparison needs to execute a scalar function. It might not matter - but with a large table, or a more complex query, it could be a bottleneck.
You can also store the year as an integer (as some of the commenters recommend).
select *
from other_table ot
inner join year_table yt on year(ot.transaction_date) = yt.year
Again - this is likely to have a performance impact, as every comparison requires a function to execute.
The purist in me doesn't like to store this as an integer - so you could also use MySQL's YEAR datatype.
So, assuming data size isn't an issue you're optimizing for, the solution really would lie in the way your data in this table relates to the rest of your schema.
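If the year is stored as an integer (or YEAR) column, the join on year(ot.transaction_date) can also be written as a date range, which keeps transaction_date free of function calls. That is generally a precondition for an index on that column to be usable, though whether the optimizer actually uses one depends on the data and the MySQL version. A sketch, assuming the table and column names used in the queries above:

```sql
-- Assumed schema: other_table(transaction_date DATE), year_table(year INT or YEAR).
SELECT *
FROM other_table ot
INNER JOIN year_table yt
        ON ot.transaction_date >= MAKEDATE(yt.year, 1)       -- January 1 of that year
       AND ot.transaction_date <  MAKEDATE(yt.year + 1, 1);  -- January 1 of the next year
```

MAKEDATE(year, dayofyear) builds a date from a year and a day number, so MAKEDATE(2014, 1) is 2014-01-01.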

MySQL pattern matching with character exception

I'm trying to compare the results of two queries, one acquiring call IDs for calls made to my Asterisk server externally (10 digits) and the other acquiring call IDs for calls connected FROM the server (11 digits). Outbound calls have a '1' prepended to their number. Currently I'm using a statement like the following:
select data2, from_unixtime(time_id) day from queuemetrics.queue_log
where time_id > '1346475600' and (data2, time_id) in
(select dst, unix_timestamp(calldate) from asteriskcdrdb.cdr
where calldate > '2012-09-01' and lastdata like <blocked for privacy>)
order by day;
data2 is the column holding the 10 digit numbers, dst holds the 11 digit numbers. Is there a way I can pattern match the 2-11th characters of a column ONLY? To just skip over the first one? Obviously a LIKE or RLIKE would be useful, but I really need to maintain the nested query for this to work. Any help would be great. Also, pay no attention to my weird use of from_unixtime and unix_timestamp. I was experimenting with figuring if I needed my times in the same format for the search to work. Not important.

You may use RIGHT to extract the rightmost characters of a string:
RIGHT(your_field_here, 10);
If there are some characters you want to ignore at the beginning AND at the end of the string, then you may use SUBSTR:
SUBSTR(your_field_here, 2, 10);
Your query would then be:
SELECT data2, FROM_UNIXTIME(time_id) day FROM queuemetrics.queue_log
WHERE time_id > '1346475600' AND (data2, time_id) IN
(SELECT SUBSTR(dst, 2, 10), UNIX_TIMESTAMP(calldate) FROM asteriskcdrdb.cdr
WHERE calldate > '2012-09-01' AND lastdata LIKE <blocked for privacy>)
ORDER BY day;
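The two functions can be compared on a literal before changing the real query. The number below is a made-up example value, not data from the question:

```sql
-- An 11-digit number with a leading '1', as stored in dst.
SELECT RIGHT('15551234567', 10)     AS last_ten,      -- '5551234567'
       SUBSTR('15551234567', 2, 10) AS from_second;   -- '5551234567'
```

For an 11-character value both give the same result; they differ only when the string is longer than 11 characters, where RIGHT() keeps the end of the string and SUBSTR() keeps characters 2 to 11.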

Why not trim the leading digit from your dst field?

I want to divide a particular date interval into 10 parts and query the count for each part in MySQL

I have a table.
And it has two fields id and datetime.
What I need to do is, for any two given datetimes, I need to divide the time range into 10 equal intervals and give row count of each interval.
Please let me know whether this is possible without using any external support from languages like java or php.

select least(floor((UNIX_TIMESTAMP(date_col) - time1) / ((time2 - time1) / 10)), 9) as bucket, count(id) from my_table where date_col >= FROM_UNIXTIME(time1) AND date_col <= FROM_UNIXTIME(time2) GROUP BY bucket
Here time1 and time2 stand for the range bounds as Unix timestamps, and bucket runs from 0 to 9. I haven't tested it, but something like this should work.
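A GROUP BY on a computed bucket only returns intervals that contain at least one row. If all 10 intervals should appear, including empty ones, one option is to join against a small list of bucket numbers. This is a sketch under assumptions: my_table(id, date_col) is taken from the question, and the two bounds are example values.

```sql
-- Example bounds; replace with the two given datetimes.
SET @t1 = UNIX_TIMESTAMP('2024-01-01 00:00:00');  -- start of the range
SET @t2 = UNIX_TIMESTAMP('2024-01-11 00:00:00');  -- end of the range
SET @width = (@t2 - @t1) / 10;                    -- seconds per interval

SELECT n.bucket,
       FROM_UNIXTIME(@t1 + n.bucket * @width) AS interval_start,
       COUNT(t.id) AS row_count                   -- 0 for intervals with no rows
FROM (
  SELECT 0 AS bucket UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
  UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7
  UNION ALL SELECT 8 UNION ALL SELECT 9
) AS n
LEFT JOIN my_table t
       ON t.date_col >= FROM_UNIXTIME(@t1 + n.bucket * @width)
      AND t.date_col <  FROM_UNIXTIME(@t1 + (n.bucket + 1) * @width)
GROUP BY n.bucket
ORDER BY n.bucket;
```

Each interval is half-open (start included, end excluded), so a row that falls exactly on the end of the whole range is not counted; change the last comparison if that row should belong to the final interval.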

The easiest way to divide date intervals is to store them as longs (i.e. the number of ticks from the "beginning of time"). As far as I know, there is no way to do it using MySQL's datetime.
How you decide to do it ultimately depends on your application. I would store them as longs and have whatever front end you are using handle the conversion to a more readable format.
What exactly do you mean by giving the row count of each interval? That part doesn't make sense to me.
