How to get values from json field - mysql

I have a table that looks like this:
CREATE TABLE `testtable` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`data` json DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The json field has data that looks like this:
{'a1': 'Title', 'b17': 'Message'}
I want to select id and a1 (from the JSON); I don't want b17. Is there a way to do this?

If you are using MySQL 5.7.8 or newer, you can take advantage of the JSON data type supported there:
https://dev.mysql.com/doc/refman/5.7/en/json.html
You can then use JSON Path expressions to extract values from the field.
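For example (a minimal sketch; note that MySQL's JSON type stores keys double-quoted, so the path below assumes the data is valid JSON):
-- JSON_EXTRACT pulls the value at the given path;
-- JSON_UNQUOTE strips the surrounding double quotes from the result
SELECT id, JSON_UNQUOTE(JSON_EXTRACT(data, '$.a1')) AS a1
FROM testtable;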
Otherwise, you are stuck either extracting the value with the SUBSTR() and POSITION() functions - hokey example below, assuming the formatting of the values is sufficiently predictable - or processing the result outside of SQL.
SELECT id, SUBSTR(hackedJSON, 1, POSITION("'" IN hackedJSON) - 1) AS a1
FROM (
  -- skip past "'a1': '" (the 5-character key token plus the space and opening quote)
  SELECT id, SUBSTR(data, POSITION("'a1':" IN data) + 7) AS hackedJSON
  FROM testtable
) a;

Related

Using json blob field attributes in MySQL where clause

I am using MySQL 5.7 and have a table with the following schema:
CREATE TABLE `Test` (
`id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key',
`created_by` varchar(45) COLLATE utf8_unicode_ci DEFAULT NULL,
`status` varchar(45) COLLATE utf8_unicode_ci DEFAULT NULL,
`metadata` blob COMMENT 'to capture the custom metadata',
`created_at` datetime DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=10 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
And the sample row data for the table looks like this:
1234,user1,open,"{'key1': 'value1', 'key2': 'value2', 'key3': 'value3'}",2021-05-18 16:01:25
I want to select rows from this table based on the keys in the JSON blob field metadata; for example, let's say where key1 = 'value1'. So I tried something like this:
select * from `test` where metadata->>"$.key1" = "value1";
But I got this error: "Cannot create a JSON value from a string with CHARACTER SET 'binary'". So I cast it to JSON first, like below:
select JSON_EXTRACT(CAST(metadata as JSON), "$") as meta from test;
The problem is that this returns a base64-encoded string, and when I try to decode it using FROM_BASE64 as below, I get null values in the column:
select FROM_BASE64(JSON_EXTRACT(CAST(metadata as JSON), "$")) as meta from test;
So I think I have two problems here: the first is how to decode the base64-encoded data I get after casting the blob as JSON, and the second is how to filter the rows based on keys in the metadata field.
I do feel this is a design error where the ideal data type would have been JSON, but since this is how it is now, I need some way to work around it.
Edit
I also tried the following, as suggested in one of the comments:
select cast(convert(cast(metadata as char) using utf8) as json) from test;
but I get this error
Data truncation: Invalid JSON text in argument 1 to function cast_as_json: "Missing a name for object member." at position 1
Is there any way I can work around this?
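One possible workaround (an untested sketch): since the stored text uses single quotes, which is not valid JSON, replace them with double quotes before the cast - this assumes the keys and values themselves never contain quote characters:
SELECT *
FROM Test
WHERE JSON_UNQUOTE(JSON_EXTRACT(
  -- convert the blob to text, fix the quoting, then cast to JSON
  CAST(REPLACE(CONVERT(metadata USING utf8mb4), '''', '"') AS JSON),
  '$.key1')) = 'value1';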

MySQL 5.7 JSON: how to set variable to json_contains_path like '$.name'

I use MySQL 5.7 and store JSON in MySQL; now I need to do some analysis on this JSON.
Take the table below:
CREATE TABLE `json_test1` (
`id` int unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(128) NOT NULL,
`json` json DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB;
Insert test data:
insert into json_test1 (name, json)
values ('lily', '{"age": 22, "lily": "ldfd"}'),
('sam','{"sam":"dfsf"}'),
('k','{"arm":"aaa"}'),
('cd','{"xl":"bbb"}');
Now I need to check whether each row's JSON contains the name as a key. For example, for ('sam','{"sam":"dfsf"}'): to check whether the JSON contains the key 'sam' for this single row, I can use the SQL below:
select json_contains_path(json, 'one','$.sam') jsonC
from json_test1 where name='sam';
The output is 1, so the JSON contains the key 'sam'. But how can I check all of these rows to see whether the name exists as a key in the JSON? For that I would need to pass a variable to the function, as in json_contains_path(json, 'one', '$.{name}'), but I don't know how to set a variable in this function.
Expected result: one row per name, with a flag indicating whether the JSON contains that name as a key.
Does anyone know how to do that?
You just need to add a CONCAT() function, such as:
SELECT name, JSON_CONTAINS_PATH(json, 'one', CONCAT('$."', name, '"')) AS jsonExists
FROM json_test1
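With the sample data above, this should return something like the following (my expectation, not verified against a live server):
name | jsonExists
-----+-----------
lily |          1
sam  |          1
k    |          0
cd   |          0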

How to perform a SELECT on a JSON column in MySQL/MariaDB

How do I apply a WHERE clause on a JSON column to perform a SELECT query on a table that has two columns (id Integer, attr JSON)? The JSON is nested, and only one key-value pair of the JSON is allowed in the filter condition. This key-value pair can be anywhere in the JSON.
+----+------------------------------------------------------------------------------------------------+
| id | attr                                                                                           |
+----+------------------------------------------------------------------------------------------------+
|  1 | {"id":"0001","type":"donut","name":"Cake","ppu":0.55}                                          |
|  2 | {"id":"0002","type":"donut","name":"Cake","ppu":0.55,"batters":{"batter1":100,"batter2":200}}  |
+----+------------------------------------------------------------------------------------------------+
In MariaDB 10.2, you can use the JSON functions.
For example, if you want to SELECT all donuts from your database, you do:
SELECT * FROM t WHERE JSON_CONTAINS(attr, '"donut"', '$.type');
Note: In MariaDB, JSON functions work with all text data types (VARCHAR, TEXT etc.). The JSON type is simply an alias for LONGTEXT.
Similarly to markusjm's answer, you can select a field directly from the JSON, like:
SELECT json_extract(attr, '$.type') FROM t;
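If you need the value in a WHERE clause instead, the same function works there (a sketch in the same spirit, untested):
SELECT * FROM t
WHERE JSON_UNQUOTE(JSON_EXTRACT(attr, '$.type')) = 'donut';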
If you are still using MySQL 5.6 (which has no JSON parsing support), you can use the SUBSTRING_INDEX function to parse the JSON text.
Here is a working example:
CREATE TABLE `products` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`attr` longtext COLLATE utf8_unicode_ci NOT NULL,
`created_at` datetime NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO products (attr, created_at)
VALUES
('{"id":"0001","type":"donut","name":"Cake","ppu":0.55}', now()),
('{"id":"0002","type":"donut","name":"Cake","ppu":0.55,"batters":{"batter1":100,"batter2":200}}', now()),
('{"id":"0003","type":"apple","name":"Apple","ppu":0.60}', now()),
('{"id":"0003","type":"orange","name":"Orange","ppu":0.65}', now());
select
  -- take everything after the last '"type":"', then everything before the following '",'
  substring_index(substring_index(attr, '"type":"', -1), '",', 1) AS product_type
from products
-- HAVING rather than WHERE, because the alias product_type is not visible in WHERE
having product_type = 'donut';

How to insert a vector into a column of a table in mysql?

In R, I have a vector, "myVector", of strings which I want to insert into a column, "myColumn", of a MySQL table, "myTable". I understand I can write the SQL query and run it in R using dbSendQuery, so let's figure out the SQL query first. Here is an example:
myVector = c("hi","I", "am")
Let's insert myVector into the column myColumn of myTable, row numbers 3 to 5. Here is the SQL query, which works except for the last line, which I have no idea how to write:
UPDATE myTable t JOIN
(SELECT id
FROM myTable tt
LIMIT 3, 3
) tt
ON tt.id = t.id
SET myColumn = myVector;
Thanks
Assuming that I understand your problem correctly, I have two possible solutions on my mind:
1. One column per element:
If your vectors all have an equal number of elements, you could store each element in a separate column. Proceeding from your example above, the table could look like this (the size of the columns and whether to allow null values depends on your data):
CREATE TABLE `myTable` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`element1` varchar(255) DEFAULT NULL,
`element2` varchar(255) DEFAULT NULL,
`element3` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The statement for inserting your vector from above would be:
INSERT INTO `myTable` (`id`, `element1`, `element2`, `element3`)
VALUES (1, 'hi', 'I', 'am');
Depending on how many elements your vectors have, this approach might be more or less applicable.
2. Storing the vector as a blob:
Another approach could be storing the vector as a blob. A blob (Binary Large Object) is a data type for storing a variable amount of (binary) data (see: https://dev.mysql.com/doc/refman/5.7/en/blob.html). This idea is taken from this article: http://jfaganuk.github.io/2015/01/12/storing-r-objects-in-sqlite-tables/
The table could be created using the following statement:
CREATE TABLE `myTable` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`myVector` blob,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8;
When inserting your vector, you bind the variable to your query. As I am not an R specialist, I would refer to that article for the implementation details.
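For illustration, the binding idea expressed in plain SQL as a server-side prepared statement (a sketch; with R's DBI the binding would normally happen client-side, and the literal below merely stands in for the serialized vector bytes):
PREPARE ins FROM 'INSERT INTO `myTable` (`myVector`) VALUES (?)';
SET @v = 'serialized-vector-bytes';  -- placeholder for the real binary payload
EXECUTE ins USING @v;
DEALLOCATE PREPARE ins;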
I'm not aware of MySQL supporting a vector data type, but as a workaround you could design your tables so that the vector is stored in a different table with a 1-M relation to myTable.
This helps you manage and retrieve the details easily. So, assuming myTable is your table and its existing design is:
myTable
-------
id
col1
vectorCol
So, your main table can be:
CREATE TABLE myTable (
id INT NOT NULL AUTO_INCREMENT,
col1 varchar(50),
PRIMARY KEY (id)
);
and the table which will store your vector:
CREATE TABLE vectorTab (
id INT NOT NULL AUTO_INCREMENT, -- in case ordering matters
parent_id INT NOT NULL,
value TEXT,
PRIMARY KEY (id),
FOREIGN KEY (parent_id) REFERENCES myTable (id) ON DELETE CASCADE ON UPDATE CASCADE
);
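Populating the two tables would then look something like this (a sketch; the parent_id value 1 assumes the parent row received id 1 from AUTO_INCREMENT):
INSERT INTO myTable (col1) VALUES ('some row');
-- one child row per vector element, in insertion order
INSERT INTO vectorTab (parent_id, value)
VALUES (1, 'hi'), (1, 'I'), (1, 'am');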
What you should do is export your R vector as JSON using the toJSON() function, for example:
myJSONVector = toJSON(c("hi","I", "am"))
Also create or alter myTable so that myColumn has the appropriate JSON data type.
Attempting to insert a value into a JSON column succeeds if the value
is a valid JSON value, but fails if it is not:
Example
CREATE TABLE `myTable` (`myColumn` JSON);
-- myJSONVector stands for the value interpolated from R;
-- the insert will fail if it is not valid JSON
INSERT INTO `myTable` VALUES (myJSONVector);
-- the update query would be
UPDATE `myTable` SET `myColumn` = myJSONVector
WHERE `id` IN (3, 4, 5);
In addition, you can recreate an R vector from the JSON using the fromJSON() function.

PySpark, order of column on write to MySQL with JDBC

I'm struggling a bit with understanding Spark and writing dataframes to a MySQL database. I have the following code:
forecastDict = {'uuid': u'8df34d5a-ce02-4d02-b282-e10363690122', 'created_at': datetime.datetime(2014, 12, 31, 23, 0)}
forecastFrame = sqlContext.createDataFrame([forecastDict])
forecastFrame.write.jdbc(url="jdbc:mysql://example.com/example_db?user=bla&password=blabal123", table="example_table", mode="append")
The last line in the code throws the following error:
Incorrect datetime value: '8df34d5a-ce02-4d02-b282-e10363690122' for column 'created_at' at row 1
I can post the entire stack trace if necessary, but basically what's happening here is that PySpark is mapping the uuid field to the wrong column in MySQL. Here's the MySQL definition:
mysql> show create table example_table;
...
CREATE TABLE `example_table` (
`uuid` varchar(36) NOT NULL,
`created_at` datetime NOT NULL,
PRIMARY KEY (`uuid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
...
If we change the mysql definition to the following (notice that only the order of the columns is different):
CREATE TABLE `example_table` (
`created_at` datetime NOT NULL,
`uuid` varchar(36) NOT NULL,
PRIMARY KEY (`uuid`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
The insert works fine. Is there a way to implement this without being dependent on the order of the columns, or what's the preferred way of saving data to an external relational database from Spark?
Thanks!
--chris
I would simply force the expected order on write:
url = ...
table = ...

# read the target table once to learn the column order MySQL expects;
# note that DataFrame.columns is a property, not a method
columns = (sqlContext.read.format('jdbc')
           .options(url=url, dbtable=table)
           .load()
           .columns)

forecastFrame.select(*columns).write.jdbc(url=url, table=table, mode='append')
Also be careful with using schema inference on dictionaries. This is not only deprecated but also rather unstable.