How to extract unique nested variable names out of one string variable? - mysql

Case
In our MySQL database the data is stored as combined JSON strings like this:
| ID | DATA |
| 100 | {var1str: "sometxt", var2double: 0,01, var3integer: 1, var4str: "another text"} |
| 101 | {var3integer: 5, var2double: 2,05, var1str: "txt", var4str: "more text"} |
Problem
Most of the DATA fields hold over 2,500 variables. The order of the variables in the DATA string is random (as shown in the example above). Right now we only know how to extract data with the following query:
select
ID,
json_extract(DATA,'$.var1str'),
json_extract(DATA,'$.var2double')
FROM `table`
With this query, only the values of var1str and var2double are returned. The values of var3integer and var4str are ignored, and there is no overview of which variables are hiding in the DATA fields.
With almost 60,000 entries and over 3,000 possible unique variable names, I would like to create a query that loops through all 60,000 DATA fields and extracts every unique variable name found in them.
Solution?
The query I am looking for would give the following result:
var1str
var2double
var3integer
var4str
My knowledge of MySQL is very limited. Any pointer towards a solution is much appreciated.

What version of MySQL are you using?
The JSON_TABLE function is supported from MySQL 8.0.4 onward and can be useful in this case.
mysql> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 8.0.11 |
+-----------+
1 row in set (0.00 sec)
mysql> DROP TABLE IF EXISTS `table`;
Query OK, 0 rows affected (0.09 sec)
mysql> CREATE TABLE IF NOT EXISTS `table` (
-> `ID` BIGINT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
-> `DATA` JSON NOT NULL
-> ) AUTO_INCREMENT=100;
Query OK, 0 rows affected (0.00 sec)
mysql> INSERT INTO `table`
-> (`DATA`)
-> VALUES
-> ('{"var1str": "sometxt", "var2double": 0.01, "var3integer": 1, "var4str": "another text"}'),
-> ('{"var3integer": 5, "var2double": 2.05, "var1str": "txt", "var4str": "more text"}');
Query OK, 2 rows affected (0.00 sec)
Records: 2 Duplicates: 0 Warnings: 0
mysql> SELECT
-> DISTINCT `der`.`key`
-> FROM
-> `table`,
-> JSON_TABLE(
-> JSON_KEYS(`DATA`), '$[*]'
-> COLUMNS(
-> `key` VARCHAR(64) PATH "$"
-> )
-> ) `der`;
+-------------+
| key |
+-------------+
| var1str |
| var4str |
| var2double |
| var3integer |
+-------------+
4 rows in set (0.01 sec)
Be aware of the Bug #90610 ERROR 1142 (42000) when using JSON_TABLE.
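On servers older than 8.0.4, where JSON_TABLE is not available, the same list could be built on the client instead. The following is only a minimal Python sketch, assuming the DATA values have already been fetched and are valid JSON object strings; the function name unique_keys and the sample rows are made up for illustration.

```python
import json

def unique_keys(json_docs):
    """Return the sorted set of top-level keys found across all JSON object strings."""
    keys = set()
    for doc in json_docs:
        obj = json.loads(doc)
        if isinstance(obj, dict):  # skip rows whose top level is not an object
            keys.update(obj)       # updating a set from a dict adds its keys
    return sorted(keys)

# Hypothetical sample rows, copied from the INSERT statement above.
rows = [
    '{"var1str": "sometxt", "var2double": 0.01, "var3integer": 1, "var4str": "another text"}',
    '{"var3integer": 5, "var2double": 2.05, "var1str": "txt", "var4str": "more text"}',
]
print(unique_keys(rows))
```

Like JSON_KEYS, this only looks at top-level keys; nested objects would need a recursive walk.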

Related

What is meant by char(40)?

I have a MySQL table with the following structure:
create table data(
....
name char(40) NULL,
...
)
But I could insert names with more than 40 characters into the name field. Can someone explain what char(40) actually means?
You cannot insert a string of more than 40 characters in a column defined with the type CHAR(40).
If you run MySQL in strict mode, you will get an error if you try to insert a longer string.
mysql> create table mytable ( c char(40) );
Query OK, 0 rows affected (0.01 sec)
mysql> insert into mytable (c) values ('Now is the time for all good men to come to the aid of their country.');
ERROR 1406 (22001): Data too long for column 'c' at row 1
If you run MySQL in non-strict mode, the insert will succeed, but only the first 40 characters of your string are stored in the column. The characters beyond 40 are lost, and you get a warning instead of an error.
mysql> set sql_mode='';
Query OK, 0 rows affected (0.00 sec)
mysql> insert into mytable (c) values ('Now is the time for all good men to come to the aid of their country.');
Query OK, 1 row affected, 1 warning (0.01 sec)
mysql> show warnings;
+---------+------+----------------------------------------+
| Level | Code | Message |
+---------+------+----------------------------------------+
| Warning | 1265 | Data truncated for column 'c' at row 1 |
+---------+------+----------------------------------------+
1 row in set (0.00 sec)
mysql> select c from mytable;
+------------------------------------------+
| c |
+------------------------------------------+
| Now is the time for all good men to come |
+------------------------------------------+
1 row in set (0.00 sec)
I recommend operating MySQL in strict mode (strict mode is the default since MySQL 5.7). I would prefer to get an error instead of losing data.
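The two behaviours can be modelled in a few lines. This is a simplified Python illustration, not MySQL's implementation; the function store_char and its parameters are hypothetical names chosen for this sketch.

```python
def store_char(value, width=40, strict=True):
    """Simplified model of storing `value` in a CHAR(width) column."""
    if len(value) <= width:
        return value
    if strict:
        # strict mode: the statement fails and nothing is stored
        raise ValueError("Data too long for column")
    # non-strict mode: only the first `width` characters are kept
    return value[:width]

text = "Now is the time for all good men to come to the aid of their country."
print(store_char(text, strict=False))
```

With strict=False the call prints the same 40-character prefix that the SELECT above returns; with the default strict=True it raises instead of losing data.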

MySQL: Cannot update JSON column to convert value from float to integer

I have a MySQL table with a JSON column. I want to update some rows in the JSON column to change a JSON value from a float to an integer, e.g. {"a": 20.0} should become {"a": 20}. It looks like MySQL considers these two values equivalent, so it never bothers to update the row.
Here is the state of my test table:
mysql> describe test;
+-------+------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+------+------+-----+---------+-------+
| id | int | NO | PRI | NULL | |
| val | json | YES | | NULL | |
+-------+------+------+-----+---------+-------+
2 rows in set (0.00 sec)
mysql> select * from test;
+----+-------------+
| id | val |
+----+-------------+
| 1 | {"a": 20.0} |
+----+-------------+
1 row in set (0.00 sec)
My aim is to change val to {"a": 20}
I've tried the following queries:
mysql> update test set val=JSON_OBJECT("a", 20) where id=1;
Query OK, 0 rows affected (0.00 sec)
Rows matched: 1 Changed: 0 Warnings: 0
(0 rows changed)
mysql> update test
set val=JSON_SET(
val,
"$.a",
FLOOR(
JSON_EXTRACT(val, "$.a")
)
)
where id=1;
Query OK, 0 rows affected (0.00 sec)
Rows matched: 1 Changed: 0 Warnings: 0
(0 rows changed)
mysql> insert into test (id, val) values (1, JSON_OBJECT("a", 20)) ON DUPLICATE KEY UPDATE id=VALUES(id), val=VALUES(val);
Query OK, 0 rows affected, 2 warnings (0.00 sec)
(0 rows affected)
It looks like it doesn't matter how I try to write it, whether I attempt to modify the existing value, or specify a whole new JSON_OBJECT. So I'm wondering if the reason is simply that MySQL considers the before & after values to be equivalent.
Is there any way around this?
(This does not address the original Question, but addresses a problem encountered in Answering it.)
Gross... 8.0 has a naughty history of all-too-quickly removing something after recently deprecating it. Beware. Here is the issue with VALUES from the Changelog for 8.0.20:
----- 2020-04-27 8.0.20 General Availability -- -- -----
The use of VALUES() to access new row values in INSERT ... ON DUPLICATE KEY UPDATE statements is now deprecated, and is subject to removal in a future MySQL release. Instead, you should use aliases for the new row and its columns as implemented in MySQL 8.0.19 and later.
For example, the statement shown here uses VALUES() to access new row values:
INSERT INTO t1 (a,b,c) VALUES (1,2,3),(4,5,6)
ON DUPLICATE KEY UPDATE c=VALUES(a)+VALUES(b);
Henceforth, you should instead use a statement similar to the following, which uses an alias for the new row:
INSERT INTO t1 (a,b,c) VALUES (1,2,3),(4,5,6) AS new
ON DUPLICATE KEY UPDATE c = new.a+new.b;
Alternatively, you can employ aliases for both the new row and each of its columns, as shown here:
INSERT INTO t1 (a,b,c) VALUES (1,2,3),(4,5,6) AS new(m,n,p)
ON DUPLICATE KEY UPDATE c = m+n;
For more information and examples, see INSERT ... ON DUPLICATE KEY UPDATE Statement.

STR_TO_DATE giving an error instead of null

I am taking data from a CSV file and loading it all into a temporary table, so everything is in string format.
Even the date fields are strings, so I need to convert them from string to date. All dates are in this format: 28/02/2013.
I used STR_TO_DATE for this, but I am having a problem.
Here is a snippet of my code.
INSERT INTO `invoice` (`DueDate`)
SELECT
STR_TO_DATE('','%d/%m/%Y')
FROM `upload_invoice`
There are of course more fields than this, but I am concentrating on the field that doesn't work.
With this command, an invalid date should result in a NULL, but instead of a NULL being inserted it generates an error:
#1411 - Incorrect datetime value: '' for function str_to_date
I understand what the error means: the function is getting an empty field instead of a properly formatted date. But according to the documentation it should not throw an error; it should insert a NULL.
However if I use the SELECT statement without the INSERT it works.
I could do the following line which actually works to a point
IF(`DueDate`!='',STR_TO_DATE(`DueDate`,'%d/%m/%Y'),null) as `DueDate`
So STR_TO_DATE doesn't run if the field is empty. This works, but it can't check for a date that is not valid, e.g. if the value was ASDFADFAS.
So then I tried
IF(TO_DAY(STR_TO_DATE(`DueDate`,'%d/%m/%Y') IS NOT NULL),STR_TO_DATE(`DueDate`,'%d/%m/%Y'),null) as `DueDate`
But this still gives the #1411 error on the if statement.
So my question is: why isn't STR_TO_DATE returning NULL on an incorrect date? I should not be getting the #1411 error.
This is not an exact duplicate of the other question, and that question did not have a satisfactory answer. I solved this a while ago and have added my solution, which I think is better than the one given in the other post, so I think this question should stay.
An option you can try:
mysql> SELECT VERSION();
+-----------+
| VERSION() |
+-----------+
| 5.7.19 |
+-----------+
1 row in set (0.00 sec)
mysql> DROP TABLE IF EXISTS `upload_invoice`, `invoice`;
Query OK, 0 rows affected (0.00 sec)
mysql> CREATE TABLE IF NOT EXISTS `invoice` (
-> `id` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
-> `DueDate` DATE
-> );
Query OK, 0 rows affected (0.00 sec)
mysql> CREATE TABLE IF NOT EXISTS `upload_invoice` (
-> `DueDate` VARCHAR(10)
-> );
Query OK, 0 rows affected (0.00 sec)
mysql> INSERT INTO `upload_invoice`
-> (`DueDate`)
-> VALUES
-> ('ASDFADFAS'), (NULL), (''),
-> ('28/02/2001'), ('30/02/2001');
Query OK, 5 rows affected (0.01 sec)
Records: 5 Duplicates: 0 Warnings: 0
mysql> INSERT INTO `invoice`
-> SELECT
-> NULL,
-> IF(`DueDate` REGEXP '[[:digit:]]{2}/[[:digit:]]{2}/[[:digit:]]{4}' AND
-> UNIX_TIMESTAMP(
-> STR_TO_DATE(`DueDate`, '%d/%m/%Y')
-> ) > 0,
-> STR_TO_DATE(`DueDate`, '%d/%m/%Y'),
-> NULL)
-> FROM `upload_invoice`;
Query OK, 5 rows affected (0.00 sec)
Records: 5 Duplicates: 0 Warnings: 0
mysql> SELECT `id`, `DueDate`
-> FROM `invoice`;
+----+------------+
| id | DueDate |
+----+------------+
| 1 | NULL |
| 2 | NULL |
| 3 | NULL |
| 4 | 2001-02-28 |
| 5 | NULL |
+----+------------+
5 rows in set (0.00 sec)
See db-fiddle.
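If the CSV can be cleaned before it is loaded, the same validation can be done in application code. The following is a minimal Python sketch under that assumption; the helper name parse_due_date is hypothetical, and the sample values mirror the rows inserted into upload_invoice above.

```python
from datetime import datetime

def parse_due_date(raw):
    """Return a date for a valid dd/mm/YYYY string, or None for anything else."""
    if not raw:  # covers both None and the empty string
        return None
    try:
        return datetime.strptime(raw, "%d/%m/%Y").date()
    except ValueError:  # wrong format or impossible date such as 30/02/2001
        return None

samples = ["ASDFADFAS", None, "", "28/02/2001", "30/02/2001"]
print([parse_due_date(s) for s in samples])
```

Only 28/02/2001 yields a date, matching the single non-NULL row in the result above. Note that strptime is slightly more lenient than the REGEXP check, since it also accepts single-digit days and months.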
I forgot I posted this question, but I solved this problem a while ago like this
IF(`{$date}`!='',STR_TO_DATE(`{$date}`,'%d/%m/%Y'),null) as `{$date}`
So because the line is long and confusing I made a function like this
protected function strDate($date){
return "IF(`{$date}`!='',STR_TO_DATE(`{$date}`,'%d/%m/%Y'),null) as `{$date}`";
}
INSERT INTO `invoice` (`DueDate`)
SELECT
{$this->strDate('DueDate')}
FROM `upload_invoice`
I really forgot I posted this question. It seems like an eternity away, but this is how I solved the issue.

Extracting numerical values from mySQL column string value

I have a "person" column in a MySQL database that represents the age and weight of a person as a string separated by a comma.
Example:
"24,175"
I want to be able to separate and extract those values and cast them as numbers.
Example: turn "24,175" to
24 as age
175 as weight
So that I can write a query similar to the following
SELECT person
FROM TABLE
WHERE age>140 OR weight>1000
I want to be able to check for values that are not possible, i.e. age > 140 OR weight > 1000.
I cannot modify the table or the environment I'm working with.
I only have access to queries.
I'm thinking about solving it this way
find the index where the comma exists. CHARINDEX(',',person)
Split the string into substrings using LEFT , RIGHT, CAST and CHARINDEX(',',person)
Cast age substring and weight substring to numbers using CAST(age AS INT) CAST(weight AS INT)
SELECT person
FROM TABLE
WHERE CAST(LEFT(person,CHARINDEX(',',person) AS INT)>150 OR CAST(RIGHT(person,CHARINDEX(',',person) AS INT) >1000
If I did anything wrong, please correct me.
Are all of these functions (RIGHT, LEFT, CHARINDEX) supported by MySQL? Will this work?
Exception: another value for this column could be "unknown". Will this cause errors when we look for the index of the comma and it doesn't exist in the string? Is there a way to include "unknown" cases in the result and have it output a message of "error, person not recognized"?
You can also split it with SUBSTRING_INDEX like this:
MariaDB [yourschema]> SELECT * FROM spliit;
+----+--------+
| id | d |
+----+--------+
| 1 | 24,175 |
+----+--------+
1 row in set (0.03 sec)
MariaDB [yourschema]> SELECT
-> SUBSTRING_INDEX(d, ',', 1) AS age
-> , SUBSTRING_INDEX(d, ',', -1) AS weight
->
-> FROM spliit;
+------+--------+
| age | weight |
+------+--------+
| 24 | 175 |
+------+--------+
1 row in set (0.00 sec)
MariaDB [yourschema]>
sample
Yes, you can calculate with the result directly in MySQL:
MariaDB [yourschema]> SELECT
-> SUBSTRING_INDEX(d, ',', 1) + 2 AS age
-> , SUBSTRING_INDEX(d, ',', 1) * 12 AS `month`
-> , SUBSTRING_INDEX(d, ',', -1) + 3 AS weight
-> FROM spliit;
+------+-------+--------+
| age | month | weight |
+------+-------+--------+
| 26 | 288 | 178 |
+------+-------+--------+
1 row in set, 1 warning (0.03 sec)
MariaDB [yourschema]>
SELECT person
FROM TABLE
WHERE CAST(LEFT(person, LOCATE(',', person) - 1) AS SIGNED) > 150
OR CAST(SUBSTRING(person, LOCATE(',', person) + 1) AS SIGNED) > 1000
Instead of CHARINDEX, use LOCATE in MySQL.
Also note the CAST function: MySQL casts to SIGNED or UNSIGNED rather than INT.
You can also use virtual (PERSISTENT) columns that calculate the fields automatically, and you can put an INDEX on each extracted integer.
sample
MariaDB [yourschema]> CREATE TABLE `splitit` (
-> `id` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
-> `d` VARCHAR(32) DEFAULT NULL,
-> age INT(11) AS (SUBSTRING_INDEX(d, ',', 1)) PERSISTENT,
-> weight INT(5) AS (SUBSTRING_INDEX(d, ',', -1)) PERSISTENT,
-> PRIMARY KEY (`id`),
-> INDEX idx_age (age),
-> INDEX idx_weight (weight)
-> ) ENGINE=INNODB DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.79 sec)
MariaDB [yourschema]> INSERT INTO splitit (d) VALUES ('11,234'),('2,66'),('5,2');
Query OK, 3 rows affected (0.06 sec)
Records: 3 Duplicates: 0 Warnings: 0
MariaDB [yourschema]> SELECT * FROM splitit;
+----+--------+------+--------+
| id | d | age | weight |
+----+--------+------+--------+
| 1 | 11,234 | 11 | 234 |
| 2 | 2,66 | 2 | 66 |
| 3 | 5,2 | 5 | 2 |
+----+--------+------+--------+
3 rows in set (0.00 sec)
MariaDB [yourschema]>
You can do this all in the where clause:
where substring_index(person, ',', 1) + 0 > 140 or
substring_index(person, ',', -1) + 0 > 1000
Note that the + 0 does a silent conversion to a number. Also, substring_index() is much more convenient than the equivalent functions in SQL Server.
You can readily incorporate this logic into a view:
create view v_table as
select t.*,
substring_index(person, ',', 1) + 0 as age,
substring_index(person, ',', -1) + 0 as weight
from table t;
If you want to filter out bad values within the view, you can use a MySQL extension and add:
having age > 140 or weight > 1000
after the from clause.
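The question also asks what should happen for values such as "unknown". The intended logic can be modelled outside the database as follows. This is a rough Python sketch, assuming the person values are fetched as plain strings; the function names and the "implausible" label are made up for illustration, while the limits of 140 and 1000 come from the question.

```python
def parse_person(person):
    """Split an 'age,weight' string into two ints; return None if it does not fit."""
    parts = person.split(",")
    if len(parts) != 2:
        return None
    try:
        return int(parts[0]), int(parts[1])
    except ValueError:  # one of the parts is not a number
        return None

def check_person(person, max_age=140, max_weight=1000):
    """Classify a raw value as 'ok', 'implausible', or not recognized."""
    parsed = parse_person(person)
    if parsed is None:
        return "error, person not recognized"
    age, weight = parsed
    if age > max_age or weight > max_weight:
        return "implausible"
    return "ok"

for value in ["24,175", "150,175", "unknown"]:
    print(value, "->", check_person(value))
```

Keeping the parsing separate from the range check makes the "unknown" case explicit instead of relying on a silent conversion to 0.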

mysql MATCH query white-space not considered correctly

I have a MySQL MATCH query with wildcard matching using a FULLTEXT index. The matching works fine as long as I don't have white-space inside my matching pattern. For example, this matches well:
SELECT * ,
MATCH (
name
)
AGAINST (
'Lady*'
IN BOOLEAN
MODE
) AS SCORE
But this doesn't:
SELECT * ,
MATCH (
name
)
AGAINST (
'Lady G*'
IN BOOLEAN
MODE
) AS SCORE
It just matches as if I wrote
SELECT * ,
MATCH (
name
)
AGAINST (
'G*'
IN BOOLEAN
MODE
) AS SCORE
How should I fix this?
Use IN NATURAL LANGUAGE MODE instead of BOOLEAN.
mysql> CREATE TABLE articles (
-> id INT UNSIGNED AUTO_INCREMENT NOT NULL PRIMARY KEY,
-> title VARCHAR(200),
-> body TEXT,
-> FULLTEXT (title,body)
-> ) ENGINE=MyISAM;
Query OK, 0 rows affected (0.08 sec)
mysql> INSERT INTO articles (title,body) VALUES
-> ('MySQL Tutorial','DBMS stands for DataBase ...'),
-> ('1001 MySQL Tricks','1. Never run mysqld as root. 2. ...'),
-> ('MySQL Security','When configured properly, MySQL ...');
Query OK, 3 rows affected (0.00 sec)
Records: 3 Duplicates: 0 Warnings: 0
mysql> SELECT * FROM articles
-> WHERE MATCH (title,body)
-> AGAINST ('stands for' IN NATURAL LANGUAGE MODE);
+----+----------------+------------------------------+
| id | title | body |
+----+----------------+------------------------------+
| 1 | MySQL Tutorial | DBMS stands for DataBase ... |
+----+----------------+------------------------------+
1 row in set (0.00 sec)
mysql> SELECT * FROM articles WHERE MATCH (title,body) AGAINST ('stands fo*' IN NATURAL LANGUAGE MODE);
+----+----------------+------------------------------+
| id | title | body |
+----+----------------+------------------------------+
| 1 | MySQL Tutorial | DBMS stands for DataBase ... |
+----+----------------+------------------------------+
1 row in set (0.00 sec)