I am trying to clean up a messy table that contains unnecessary words.
The example below shows the typical content:
row1 |
-------------
text <12> |
more [dada] |
(123) foo |
la {55w} da |
What I define as unnecessary content is every word that starts and ends with a particular pair of symbols: <...>, [...], {...} and (...). Usually I would use the REPLACE function, but since the data inside the symbols is arbitrary, that is not really possible.
Is it possible to use some kind of regex in the REPLACE function?
UPDATE
Please note that the content wrapped inside the symbols can be any letters and numbers, so it is basically unpredictable.
OK, I see now!
Use REPLACE like this. The example removes everything from the first '(' to the last ')' in each row:
mysql> CREATE TABLE tbl (
-> txt VARCHAR(255)
-> );
Query OK, 0 rows affected (0.50 sec)
mysql> INSERT INTO tbl VALUES
-> ('sometext (asdebtrw)'),
-> ('some other text ( sd sdasddebtrw)'),
-> ('somesdaftext ( (sd)( ))ebt rw)()'),
-> ('sometext1'),
-> ('sometext2'),
-> ('sometext1 (replacethistext) anothertext1'),
-> ('s'),
-> ('w(sdf) rr')
-> ;
Query OK, 8 rows affected (0.00 sec)
Records: 8 Duplicates: 0 Warnings: 0
mysql> select * from tbl;
+------------------------------------------+
| txt |
+------------------------------------------+
| sometext (asdebtrw) |
| some other text ( sd sdasddebtrw) |
| somesdaftext ( (sd)( ))ebt rw)() |
| sometext1 |
| sometext2 |
| sometext1 (replacethistext) anothertext1 |
| s |
| w(sdf) rr |
+------------------------------------------+
8 rows in set (0.00 sec)
mysql> UPDATE tbl
-> SET txt = REPLACE(txt, SUBSTRING(txt, LOCATE('(', txt), LENGTH(txt) - LOCATE(')', REVERSE(txt)) - LOCATE('(', txt) + 2), '')
-> WHERE txt LIKE '%(%)%';
Query OK, 5 rows affected (0.20 sec)
Rows matched: 5 Changed: 5 Warnings: 0
mysql> select * from tbl;
+-------------------------+
| txt |
+-------------------------+
| sometext |
| some other text |
| somesdaftext |
| sometext1 |
| sometext2 |
| sometext1 anothertext1 |
| s |
| w rr |
+-------------------------+
8 rows in set (0.22 sec)
REGEXP_REPLACE is your friend here:
SELECT REGEXP_REPLACE('ab12cd','[0-9]','') AS remove_digits;
-> abcd
Note that it began as a MariaDB enhancement (10.0.5 and later); MySQL itself only has it from version 8.0.
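To make the pattern itself concrete, here is a minimal sketch in Python rather than SQL, so it can be tried without a server. The function name strip_bracketed and the whitespace tidy-up are my own assumptions, not part of the answer above; the sample rows are the ones from the question.

```python
import re

# One alternative per bracket pair: the opener, any run of characters
# that is not the closer, then the closer itself.
BRACKETED = re.compile(r"<[^>]*>|\[[^\]]*\]|\{[^}]*\}|\([^)]*\)")

def strip_bracketed(text: str) -> str:
    """Remove <...>, [...], {...} and (...) tokens, then collapse whitespace."""
    cleaned = BRACKETED.sub("", text)
    return " ".join(cleaned.split())

rows = ["text <12>", "more [dada]", "(123) foo", "la {55w} da"]
cleaned_rows = [strip_bracketed(r) for r in rows]
print(cleaned_rows)  # ['text', 'more', 'foo', 'la da']
```

The same alternation should carry over to REGEXP_REPLACE where that function exists, though backslashes may need doubling inside an SQL string literal. Nested brackets such as '((a))' are not handled by this pattern.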
I created a column called oilcompany that holds SET data (Hunt, Pioneer, Chevron, BP).
I can enter any one of those into the oilcompany column and change from one to another, but I cannot figure out how to change from one oil company to several (e.g. Hunt and BP). Any suggestions?
The MySQL documentation has no examples of UPDATE statements for this, but I normally use two ways to update this kind of column:
Using text values
Using numeric values
Creating the test environment
mysql> CREATE TABLE tmp_table(
-> id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
-> oilcompany SET('Hunt', 'Pioneer', 'Chevron', 'BP')
-> );
Query OK, 0 rows affected (0.54 sec)
mysql> INSERT INTO tmp_table(oilcompany) VALUES ('Hunt'), ('Pioneer');
Query OK, 2 rows affected (0.11 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql> SELECT * FROM tmp_table;
+----+------------+
| id | oilcompany |
+----+------------+
| 1 | Hunt |
| 2 | Pioneer |
+----+------------+
2 rows in set (0.00 sec)
Alternative#1: Using Text Values
Because a SET value can be written as a comma-separated string of its members, we can do things like:
mysql> UPDATE tmp_table
-> SET oilcompany = 'Hunt,BP'
-> WHERE id = 1;
Query OK, 1 row affected (0.07 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> SELECT * FROM tmp_table;
+----+------------+
| id | oilcompany |
+----+------------+
| 1 | Hunt,BP |
| 2 | Pioneer |
+----+------------+
2 rows in set (0.00 sec)
Alternative#2: Using Numeric Values
A SET value is stored internally as a number of up to 64 bits, with one bit for each SET member that is present.
In our table: 'Hunt'=1, 'Pioneer'=2, 'Chevron'=4, 'BP'=8.
MySQL also allows these numbers to be used instead of the text values. To see the numeric value in a SELECT, use the SET column inside a numeric expression (e.g. add zero).
Let's see the current values:
mysql> SELECT id, oilcompany+0, oilcompany FROM tmp_table;
+----+--------------+------------+
| id | oilcompany+0 | oilcompany |
+----+--------------+------------+
| 1 | 9 | Hunt,BP |
| 2 | 2 | Pioneer |
+----+--------------+------------+
2 rows in set (0.00 sec)
Here 9 = 'Hunt' (1) + 'BP' (8) and 2 = 'Pioneer' (2).
Now, let's change row 2 from 'Pioneer' to 'Hunt' (1) + 'Chevron' (4) = 5:
mysql> UPDATE tmp_table
-> SET oilcompany = 5
-> WHERE id = 2;
Query OK, 1 row affected (0.08 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> SELECT id, oilcompany+0, oilcompany FROM tmp_table;
+----+--------------+--------------+
| id | oilcompany+0 | oilcompany |
+----+--------------+--------------+
| 1 | 9 | Hunt,BP |
| 2 | 5 | Hunt,Chevron |
+----+--------------+--------------+
2 rows in set (0.00 sec)
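The arithmetic behind the numeric form can be sketched as follows. This is a plain Python model, not MySQL code; the helper names set_to_number and number_to_set are made up for illustration, and the member list simply mirrors the order in the column definition above.

```python
# Members in declaration order; member k (0-based) owns the bit 2**k.
MEMBERS = ["Hunt", "Pioneer", "Chevron", "BP"]

def set_to_number(names):
    """Combine member names into the number a SET column would hold."""
    value = 0
    for name in names:
        value |= 1 << MEMBERS.index(name)
    return value

def number_to_set(value):
    """Expand a number back into the comma-separated form, in declaration order."""
    return ",".join(m for i, m in enumerate(MEMBERS) if value & (1 << i))

print(set_to_number(["Hunt", "BP"]))  # 9
print(number_to_set(5))               # Hunt,Chevron
```

Because each member has its own bit, adding a member to an existing value is a bitwise OR and removing one is an AND with the inverted bit.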
I want to insert data into a table in a specific order, because I need to give each entry a specific ID. What I am using is a select statement:
select (@i := @i + 1) as id, ...
order by column
The problem is that this does not seem to work. The select query on its own returns the result I want, but when I try to insert the data into the table, the ORDER BY clause is ignored. Is there any way to force the correct order in the insert statement?
What I want is this:
+----+------+-------------+
| id | name | breadcrumbs |
+----+------+-------------+
| 1 | test | 01 |
| 5 | -d | 01,05 |
| 4 | c | 04 |
| 6 | e | 06 |
| 2 | -a | 06,02 |
| 3 | --b | 06,02,03 |
+----+------+-------------+
To become this:
+----+------+-------------+
| id | name | breadcrumbs |
+----+------+-------------+
| 1 | test | 01 |
| 2 | -d | 01,05 |
| 3 | c | 04 |
| 4 | e | 06 |
| 5 | -a | 06,02 |
| 6 | --b | 06,02,03 |
+----+------+-------------+
In a separate temporary table.
I would make certain that @i is initialised; see the select in the FROM clause below.
MariaDB [sandbox]> drop table if exists t;
Query OK, 0 rows affected (0.14 sec)
MariaDB [sandbox]>
MariaDB [sandbox]> create table t(id int, name varchar(10), breadcrumbs varchar(100));
Query OK, 0 rows affected (0.18 sec)
MariaDB [sandbox]> insert into t values
-> ( 1 , 'test' , '01' ),
-> ( 5 , '-d' , '01,05' ),
-> ( 4 , 'c' , '04' ),
-> ( 6 , 'e' , '06' ),
-> ( 2 , '-a' , '06,02' ),
-> ( 3 , '--b' , '06,02,03');
Query OK, 6 rows affected (0.01 sec)
Records: 6 Duplicates: 0 Warnings: 0
MariaDB [sandbox]>
MariaDB [sandbox]> drop table if exists t1;
Query OK, 0 rows affected (0.13 sec)
MariaDB [sandbox]> create table t1 as
-> select
-> @i:=@i+1 id,
-> t.name,t.breadcrumbs
-> from (select @i:=0) i,
-> t
-> order by breadcrumbs;
Query OK, 6 rows affected (0.22 sec)
Records: 6 Duplicates: 0 Warnings: 0
MariaDB [sandbox]>
MariaDB [sandbox]> select * from t1;
+------+------+-------------+
| id | name | breadcrumbs |
+------+------+-------------+
| 1 | test | 01 |
| 2 | -d | 01,05 |
| 3 | c | 04 |
| 4 | e | 06 |
| 5 | -a | 06,02 |
| 6 | --b | 06,02,03 |
+------+------+-------------+
6 rows in set (0.00 sec)
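What the statement is meant to achieve is "sort first, then number". A small Python model of that idea, using the six rows from the question (the variable names are mine):

```python
# (old_id, name, breadcrumbs) exactly as in the question's first table.
rows = [
    (1, "test", "01"),
    (5, "-d", "01,05"),
    (4, "c", "04"),
    (6, "e", "06"),
    (2, "-a", "06,02"),
    (3, "--b", "06,02,03"),
]

# Sort by breadcrumbs, then hand out new ids 1..n in that order.
ordered = sorted(rows, key=lambda r: r[2])
renumbered = [
    (new_id, name, crumbs)
    for new_id, (_old_id, name, crumbs) in enumerate(ordered, start=1)
]

for row in renumbered:
    print(row)
```

On servers that support window functions (MySQL 8.0, MariaDB 10.2 and later), ROW_NUMBER() OVER (ORDER BY breadcrumbs) expresses the same thing without relying on the evaluation order of user variables.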
I want to insert data into a table in a specific order.
There is no internal order to the records in a MySQL database table. Tables are modeled after unordered sets. The only order which exists is the one you apply by using an ORDER BY clause when you query. So moving forward, instead of worrying about the order in which your records are inserted, you should instead make sure that your table has the necessary columns and data to order your result sets the way you want.
I have one table
test
ID text sum
-----------------------
1 1_2_3 0
2 2_3_4_5 0
I want to update this table to:
ID text sum
------------------------
1 1_2_3 6
2 2_3_4_5 14
How can I write a query or a function/procedure to do this?
You should really NORMALIZE your data, but assuming you are forced to work with it:
UPDATE tableName SET sum=SUBSTRING_INDEX(text,'_',1) +
SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(text,'_0'),'_',2),'_',-1) +
SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(text,'_0'),'_',3),'_',-1) +
SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(text,'_0'),'_',4),'_',-1) +
SUBSTRING_INDEX(SUBSTRING_INDEX(CONCAT(text,'_0'),'_',5),'_',-1);
SUBSTRING_INDEX isolates each number; CONCAT appends a trailing '_0' so that positions beyond the last value yield 0. As written, the expression handles up to five values per row.
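To see why the padding works, here is a rough Python model of the expression. The function substring_index only imitates MySQL's SUBSTRING_INDEX for a non-empty delimiter, and padded_sum with its max_values parameter is a hypothetical helper that mirrors the five terms of the UPDATE above.

```python
def substring_index(s, delim, count):
    """Rough model of MySQL SUBSTRING_INDEX for a non-empty delimiter."""
    parts = s.split(delim)
    if count > 0:
        return delim.join(parts[:count])   # everything left of the count-th delimiter
    if count < 0:
        return delim.join(parts[count:])   # everything right of it, counted from the end
    return ""

def padded_sum(text, max_values=5):
    """Mirror of the UPDATE expression: first value plus positions 2..max_values."""
    padded = text + "_0"                   # CONCAT(text, '_0')
    total = int(substring_index(text, "_", 1))
    for n in range(2, max_values + 1):
        piece = substring_index(substring_index(padded, "_", n), "_", -1)
        total += int(piece)                # missing positions fall back to the padded 0
    return total

print(padded_sum("1_2_3"))    # 6
print(padded_sum("2_3_4_5"))  # 14
```

A row with more than max_values numbers would silently lose the extra ones, which is one more argument for normalising the data.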
Try this solution:
SELECT SUM(Trim( Left(Name, InStr(Name, "_") - 1)) +
Trim( Mid(Name, InStr(Name, "_") + 1)) +
Trim(Right(Name, InStr(Name, ",") + 1))) as SUM FROM TEST;
Where the table structure is like this:
Id | Name |
1 | 1_2_3 |
2 | 5_8_10 |
The result is:
SUM
6
23
The best way to go about this is to make each text field an SQL statement.
First, here is sample data
mysql> drop table if exists prabhu;
Query OK, 0 rows affected (0.27 sec)
mysql> create table prabhu
-> (
-> id int not null auto_increment primary key,
-> text varchar(128),
-> sum int default 0
-> );
Query OK, 0 rows affected (0.56 sec)
mysql> insert into prabhu (text) values ('1_2_3'),('2_3_4_5');
Query OK, 2 rows affected (0.09 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql> select * from prabhu;
+----+---------+------+
| id | text | sum |
+----+---------+------+
| 1 | 1_2_3 | 0 |
| 2 | 2_3_4_5 | 0 |
+----+---------+------+
2 rows in set (0.00 sec)
mysql>
Here is a query that makes each row produce an SQL statement to update the sum column:
mysql> SELECT CONCAT('UPDATE prabhu SET sum=',
-> REPLACE(text,'_','+'),' WHERE id=',id,';') sqlstmt FROM prabhu;
+-------------------------------------------+
| sqlstmt |
+-------------------------------------------+
| UPDATE prabhu SET sum=1+2+3 WHERE id=1; |
| UPDATE prabhu SET sum=2+3+4+5 WHERE id=2; |
+-------------------------------------------+
2 rows in set (0.00 sec)
mysql>
Now, pipe the output of the query back into mysql and execute each line:
C:\>mysql -Dtest -ANe"SELECT CONCAT('UPDATE prabhu SET sum=',REPLACE(text,'_','+'),' WHERE id=',id,';') sqlstmt FROM prabhu" | mysql -Dtest
C:\>mysql -Dtest -Ae"SELECT * FROM prabhu"
+----+---------+------+
| id | text | sum |
+----+---------+------+
| 1 | 1_2_3 | 6 |
| 2 | 2_3_4_5 | 14 |
+----+---------+------+
C:\>
Give it a try!
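The generate-then-execute idea can also be tried end to end without a MySQL server. The sketch below is an assumption-laden stand-in: it uses Python's built-in sqlite3 module instead of the mysql client, and it adds a format check that is not in the answer above, because the text column ends up being executed as SQL.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prabhu (id INTEGER PRIMARY KEY, text TEXT, sum INTEGER DEFAULT 0)"
)
conn.executemany("INSERT INTO prabhu (text) VALUES (?)", [("1_2_3",), ("2_3_4_5",)])

# Step 1: turn every row into an UPDATE statement, e.g. "... SET sum=1+2+3 ...".
statements = []
for row_id, text in conn.execute("SELECT id, text FROM prabhu ORDER BY id").fetchall():
    # Refuse anything that is not digits joined by underscores before
    # pasting it into SQL.
    if not re.fullmatch(r"\d+(_\d+)*", text):
        raise ValueError(f"unexpected text in row {row_id}: {text!r}")
    statements.append(
        f"UPDATE prabhu SET sum={text.replace('_', '+')} WHERE id={row_id};"
    )

# Step 2: feed the generated statements back to the database.
conn.executescript("\n".join(statements))

result = conn.execute("SELECT id, text, sum FROM prabhu ORDER BY id").fetchall()
print(result)
```

The check matters in any variant of this approach: a row containing something other than numbers would otherwise be run as part of the statement.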
I'm testing the InfiniDB community edition to see if it suits our needs.
I imported about 10 million rows into a single table (loading the data was surprisingly fast), and I'm trying to run some queries on it. These are the results (with NON-cached queries, if query caching exists in InfiniDB at all):
Query 1 (very fast):
select * from mytable limit 150000,1000
1000 rows in set (0.04 sec)
Query 2 (immediate):
select count(*) from mytable;
+----------+
| count(*) |
+----------+
| 9429378 |
+----------+
1 row in set (0.00 sec)
OK, it seems to be amazingly fast, but:
Query 3:
select count(title) from mytable;
.. still going after several minutes
Query 4:
select id from mytable where id like '%ABCD%';
+------------+
| id |
+------------+
| ABCD |
+------------+
1 row in set (11 min 17.30 sec)
I must be doing something wrong; it cannot really perform this badly on such simple queries. Any ideas?
That shouldn't be the case; there does appear to be something odd going on. See the quick test below.
What is your server configuration: memory/OS/CPU and platform (dedicated, virtual, cloud)?
Could I get the schema declaration and the method you used to load the data?
Which version are you using? Version 4 community has significantly more features than prior versions, i.e. core syntax matches enterprise.
Cheers,
Jim T
mysql> insert into mytable select a, a from (select hex(rand() * 100000) a from lineitem limit 10000000) b;
Query OK, 10000000 rows affected (1 min 54.12 sec)
Records: 10000000 Duplicates: 0 Warnings: 0
mysql> desc mytable;
+-------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| id | varchar(32) | YES | | NULL | |
| title | varchar(32) | YES | | NULL | |
+-------+-------------+------+-----+---------+-------+
2 rows in set (0.01 sec)
mysql> select * from mytable limit 150000,1000;
+-------+-------+
| id | title |
+-------+-------+
| E81 | E81 |
| 746A | 746A |
. . .
| DFC8 | DFC8 |
| 2C56 | 2C56 |
+-------+-------+
1000 rows in set (0.07 sec)
mysql> select count(*) from mytable;
+----------+
| count(*) |
+----------+
| 10000000 |
+----------+
1 row in set (0.06 sec)
mysql> select count(title) from mytable;
+--------------+
| count(title) |
+--------------+
| 10000000 |
+--------------+
1 row in set (0.09 sec)
mysql> select id from mytable where id like '%ABCD%' limit 1;
+------+
| id |
+------+
| ABCD |
+------+
1 row in set (0.03 sec)
If I have to find the name "Akito" in the table foo, the normal procedure is:
select * from foo where name = 'Akito'
I tried two variations of it.
This worked fine:
select * from foo where name = 'Akito '
This did not work:
select * from foo where name = ' Akito'
Can anyone please explain why the second one did not work?
Thanks in advance
CHAR values are padded with spaces to the length of the field when stored, and that padding is stripped again on retrieval, while VARCHAR stores a length prefix that marks where the string ends and so keeps any trailing spaces. Comparisons with = ignore trailing spaces, so a search value with spaces at the end still matches. Leading spaces are relevant because they alter the string itself. See Christopher's answer.
EDIT: some further elaboration required
See some practical tests below. VARCHAR keeps trailing spaces as part of the string, whilst CHAR, even though it fills the string up to its size with spaces, strips them on retrieval and ignores them during comparisons. See specifically the second row of the LENGTH query:
mysql> create table test (a VARCHAR(10), b CHAR(10));
Query OK, 0 rows affected (0.17 sec)
mysql> insert into test values ('a', 'a'), ('a ', 'a '), (' a', ' a');
Query OK, 3 rows affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> select a, LENGTH(a), b, LENGTH(b) FROM test;
+------+-----------+------+-----------+
| a | LENGTH(a) | b | LENGTH(b) |
+------+-----------+------+-----------+
| a | 1 | a | 1 |
| a | 2 | a | 1 |
| a | 2 | a | 2 |
+------+-----------+------+-----------+
3 rows in set (0.00 sec)
Here MySQL reports that the CHAR field, inserted with the value 'a ', is only 1 character long. Furthermore, if we concatenate a little data:
mysql> select CONCAT(a, '.'), CONCAT(b, '.') FROM test;
+----------------+----------------+
| CONCAT(a, '.') | CONCAT(b, '.') |
+----------------+----------------+
| a. | a. |
| a . | a. |
| a. | a. |
+----------------+----------------+
3 rows in set (0.00 sec)
mysql> select CONCAT(a, b), CONCAT(b, a) FROM test;
+--------------+--------------+
| CONCAT(a, b) | CONCAT(b, a) |
+--------------+--------------+
| aa | aa |
| a a | aa |
| a a | a a |
+--------------+--------------+
3 rows in set (0.00 sec)
You can see that, because VARCHAR records where the string ends, the trailing space survives concatenation, which does not hold true for CHAR. Now, keeping in mind the previous LENGTH example, where row two has different lengths for its fields a and b, we test:
mysql> SELECT * FROM test WHERE a=b;
+------+------+
| a | b |
+------+------+
| a | a |
| a | a |
| a | a |
+------+------+
3 rows in set (0.00 sec)
Therefore, we can sum up by saying that CHAR trims extra space at the end of its string while VARCHAR keeps it, except that the space is ignored during comparisons:
mysql> select a from test where a = 'a ';
+------+
| a |
+------+
| a |
| a |
+------+
2 rows in set (0.00 sec)
mysql> select a from test where a = 'a';
+------+
| a |
+------+
| a |
| a |
+------+
2 rows in set (0.00 sec)
mysql> select a from test where a = ' a';
+------+
| a |
+------+
| a |
+------+
1 row in set (0.00 sec)
So, is the same true for the CHAR type?
mysql> select a from test where b = 'a ';
+------+
| a |
+------+
| a |
| a |
+------+
2 rows in set (0.00 sec)
mysql> select a from test where b = 'a';
+------+
| a |
+------+
| a |
| a |
+------+
2 rows in set (0.00 sec)
mysql> select a from test where b = ' a';
+------+
| a |
+------+
| a |
+------+
1 row in set (0.00 sec)
This shows that the CHAR and VARCHAR types have different storage methods but follow the same rules for plain string comparison: trailing spaces are ignored, while leading spaces modify the string itself.
http://dev.mysql.com/doc/refman/5.0/en/string-comparison-functions.html says the following about LIKE:
In particular, trailing spaces are significant, which is not true for
CHAR or VARCHAR comparisons performed with the = operator:
mysql> SELECT 'a' = 'a ', 'a' LIKE 'a ';
+------------+---------------+
| 'a' = 'a ' | 'a' LIKE 'a ' |
+------------+---------------+
| 1 | 0 |
+------------+---------------+
1 row in set (0.00 sec)
Trailing means not leading; leading spaces are significant.
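The two comparison rules quoted above can be modelled in a few lines of Python. This is only a sketch of the behaviour demonstrated in this thread: the function names are invented, case-insensitivity of the collation is ignored, and it assumes a collation that pads with spaces (newer MySQL 8.0 default collations are NO PAD and treat trailing spaces as significant even with =).

```python
def pad_space_equal(left: str, right: str) -> bool:
    """Model of '=' under a PAD SPACE collation: trailing spaces are ignored."""
    return left.rstrip(" ") == right.rstrip(" ")

def like_literal(value: str, pattern: str) -> bool:
    """Model of LIKE with a pattern that has no wildcards: every character counts."""
    return value == pattern

print(pad_space_equal("Akito", "Akito "))  # True: trailing space ignored
print(pad_space_equal("Akito", " Akito"))  # False: leading space changes the string
print(like_literal("a", "a "))             # False: LIKE keeps the trailing space
```

This matches the question: the search for 'Akito ' finds the row, the search for ' Akito' does not.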