Transform column data into JSON column data using hybrid database - json

Working with hybrid databases, the point is to extract data from any column of a table and transform it into another column in JSON format. This is being done in a MySQL database.
As a practical example, we have two tables:
CREATE TABLE `avion` (
`id` bigint NOT NULL,
`fabricante` varchar(255) DEFAULT NULL,
`revisiones_json` json DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
INSERT INTO `avion` VALUES (1,'Airbus',NULL),(2,'Boeing',NULL);
CREATE TABLE `revision` (
`id` bigint NOT NULL,
`avion_id` bigint DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `FKepufjqvypljnk6si1dhtdcn3r` (`avion_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci;
INSERT INTO `revision` VALUES (1,1),(2,2),(3,2);
For each "Avion" entity, we want to obtain the ids of its related "Revision" entities and insert them into a JSON field called "revisiones_json" on the "Avion" entity.
Using a subquery, I'm trying to convert the list of related "Revision" entity ids into an array and set it via JSON_OBJECT, but it's not working. So my question is: does anyone know why this conversion is not happening? Is it some quote misspelling?
This is the query used for the update:
update avion a1
set a1.revisiones_json = JSON_OBJECT('id',
SELECT JSON_ARRAYAGG(r.id) as ids from
avion a inner join revision r on a.id = r.avion_id
where a.id = a1.id
)
where a1.id > 0;
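For reference, here is a minimal sketch of a form that at least parses, assuming MySQL 5.7.22+ or 8.0 (JSON_ARRAYAGG is unavailable earlier): the SELECT must be wrapped in its own parentheses to be a valid scalar argument to JSON_OBJECT, and the extra self-join on avion can be dropped because a1 is already in scope:
-- Sketch, not a confirmed fix: the subquery is parenthesized so it acts as a scalar expression.
UPDATE avion a1
SET a1.revisiones_json = JSON_OBJECT('id',
    (SELECT JSON_ARRAYAGG(r.id)
     FROM revision r
     WHERE r.avion_id = a1.id))
WHERE a1.id > 0;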

Related

Field 'category_id' doesn't have a default value MySQL

I'm learning SQL.
I'm trying to insert data. My MySQL database looks like this.
CREATE TABLE category (
category_id CHAR(100),
category_name VARCHAR(120) NOT NULL,
PRIMARY KEY (category_id)
)
I ran this command
INSERT INTO category (category_name) VALUES ("test");
But I got this error
ERROR 1364 (HY000): Field 'category_id' doesn't have a default value
Thank you in advance.
If you want an auto-incrementing ID, it needs to be an integer. You generally want to make IDs integers rather than chars to speed up lookups regardless.
CREATE TABLE IF NOT EXISTS category (
`category_id` INT NOT NULL AUTO_INCREMENT,
`category_name` VARCHAR(120) NOT NULL,
PRIMARY KEY (`category_id`)
) ENGINE=InnoDB DEFAULT CHARSET = utf8mb4 COLLATE utf8mb4_unicode_ci;
That will let you insert without adding your own ID, and automatically generate unique IDs for the records.
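With that schema, the failing statement then works unchanged, for example:
INSERT INTO category (category_name) VALUES ('test');
-- category_id is generated automatically (e.g. 1 for the first row inserted).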
The issue was that you set your category_id field to have no default value and also to not allow NULL, which means you -have- to set a value for it in the insert. If you wanted to use your existing table you would need to do this:
INSERT INTO category (category_id, category_name) VALUES ("someid", "test");

Insert multi-dimensional data into tables using only MySQL code

I'm exploring the possibility of using MySQL stored procedures to insert JSON objects that embed other JSON objects. Let's say I created these two MySQL tables:
CREATE TABLE `student` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
`birth_date` date NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;
CREATE TABLE `course` (
`course_id` int(11) unsigned NOT NULL,
`student_id` int(11) unsigned NOT NULL,
`enrollment_date` date NOT NULL,
CONSTRAINT `courses_ibfk_1` FOREIGN KEY (`student_id`) REFERENCES `student` (`id`) ON DELETE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
And let's say I'm receiving the following JSON object which I intend to transform and insert into the student and course table:
let payload = [
{
name:"Alice",
birth_date:"1968-01-28",
courses:[
{course_id:4325,enrollment_date:"2018-05-01"},
{course_id:3119,enrollment_date:"2018-09-01"},
{course_id:1302,enrollment_date:"2018-01-01"},
],
},
{
name:"Bob",
birth_date:"1971-10-01",
courses:[
{course_id:2000,enrollment_date:"2018-09-01"},
{course_id:3109,enrollment_date:"2018-09-01"},
{course_id:4305,enrollment_date:"2018-09-01"},
],
},
];
In the past, I would insert the JSON data using client-side code that does something like this (pseudocode):
foreach (payload as student) {
studentId = exec_prepared_statement("INSERT INTO student SET name= ?, birth_date = ?",student.name,student.birth_date)
foreach(student.courses as course){
courseId = exec_prepared_statement("INSERT INTO course SET student_id = ?, course_id = ?, enrollment_date = ?",studentId, course.course_id,course.enrollment_date)
}
}
But I'm not sure how to achieve this series of inserts using purely MySQL code, because I'm not sure how to pass multi-dimensional data into a MySQL stored procedure. It would be great if I could see an example of that somewhere.
Alternatively, can someone tell me whether I should avoid trying to insert multi-dimensional data via pure MySQL entirely, and instead have client-side code (PHP, Go, JS) do the job? I'd really like to see what this would look like with just MySQL so I can compare which approach is more maintainable.
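For comparison, here is a minimal sketch of the pure-MySQL side, assuming MySQL 5.7+ for the JSON functions; the procedure name is hypothetical, and the payload must be valid JSON (so the keys of the JavaScript literal above would need quotes):
DELIMITER //
CREATE PROCEDURE insert_student_payload(IN payload JSON)
BEGIN
  DECLARE i INT DEFAULT 0;
  DECLARE j INT;
  DECLARE sid INT UNSIGNED;
  DECLARE courses JSON;
  WHILE i < JSON_LENGTH(payload) DO
    -- Insert one student and remember its AUTO_INCREMENT id.
    INSERT INTO student (name, birth_date)
    VALUES (JSON_UNQUOTE(JSON_EXTRACT(payload, CONCAT('$[', i, '].name'))),
            JSON_UNQUOTE(JSON_EXTRACT(payload, CONCAT('$[', i, '].birth_date'))));
    SET sid = LAST_INSERT_ID();
    -- Walk this student's nested courses array.
    SET courses = JSON_EXTRACT(payload, CONCAT('$[', i, '].courses'));
    SET j = 0;
    WHILE j < JSON_LENGTH(courses) DO
      INSERT INTO course (course_id, student_id, enrollment_date)
      VALUES (JSON_EXTRACT(courses, CONCAT('$[', j, '].course_id')),
              sid,
              JSON_UNQUOTE(JSON_EXTRACT(courses, CONCAT('$[', j, '].enrollment_date'))));
      SET j = j + 1;
    END WHILE;
    SET i = i + 1;
  END WHILE;
END //
DELIMITER ;
-- Usage (hypothetical): CALL insert_student_payload('[{"name":"Alice","birth_date":"1968-01-28","courses":[{"course_id":4325,"enrollment_date":"2018-05-01"}]}]');
On MySQL 8.0 the inner loop could be replaced with JSON_TABLE, but the WHILE form also runs on 5.7.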

How to insert a vector into a column of a table in mysql?

In R, I have a vector, "myVector", of strings which I want to insert into a column, "myColumn", of a MySQL table, "myTable". I understand I can write the SQL query and run it in R using dbSendQuery. So let's figure out the SQL query first. Here is an example:
myVector = c("hi","I", "am")
Let's insert myVector into the column myColumn of myTable, rows 3 to 5. Here is the SQL query, which works except for the last line, which I have no idea how to write:
UPDATE myTable t JOIN
(SELECT id
FROM myTable tt
LIMIT 3, 3
) tt
ON tt.id = t.id
SET myColumn = myVector;
Thanks
Assuming that I understand your problem correctly, I have two possible solutions in mind:
1. One column per element:
If your vectors all have an equal number of elements, you could store each element in a separate column. Proceeding from your example above, the table could look like this (the size of the columns and whether to allow null values depends on your data):
CREATE TABLE `myTable` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`element1` varchar(255) DEFAULT NULL,
`element2` varchar(255) DEFAULT NULL,
`element3` varchar(255) DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
The statement for inserting your vector from above would be:
INSERT INTO `myTable` (`id`, `element1`, `element2`, `element3`)
VALUES (1, 'hi', 'I', 'am');
Depending on how many elements your vectors have, this approach might be more or less applicable.
2. Storing the vector as a blob:
Another approach could be storing the vector as a blob. A BLOB (Binary Large Object) is a data type for storing a variable amount of (binary) data (see: https://dev.mysql.com/doc/refman/5.7/en/blob.html). This idea is taken from this article: http://jfaganuk.github.io/2015/01/12/storing-r-objects-in-sqlite-tables/
The table could be created using the following statement:
CREATE TABLE `myTable` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`myVector` blob,
PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=2 DEFAULT CHARSET=utf8;
When inserting your vector, you bind the variable to your query. As I am not an R specialist, I would refer to the article above for the implementation details.
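The bound insert itself is a single-placeholder statement, for example:
-- The serialized R object is bound to the placeholder by the client driver.
INSERT INTO `myTable` (`myVector`) VALUES (?);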
I'm not aware of MySQL supporting a vector data type, but as a workaround you could store the vector in a separate table that has a 1-to-many relation with myTable.
This will help you manage and retrieve the details easily. So, assuming myTable is your table and its existing design is:
myTable
-------
id
col1
vectorCol
So, you main table can be
CREATE TABLE myTable (
id INT NOT NULL AUTO_INCREMENT,
col1 varchar(50),
PRIMARY KEY (id)
);
and the table which will store your vector:
CREATE TABLE vectorTab (
id INT NOT NULL AUTO_INCREMENT, -- in case ordering matter
parent_id INT NOT NULL,
value TEXT,
PRIMARY KEY (id),
FOREIGN KEY (parent_id) REFERENCES myTable (id) ON DELETE CASCADE ON UPDATE CASCADE
);
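With that layout, the example vector becomes one row per element, e.g. (the parent row with id 1 is hypothetical):
INSERT INTO vectorTab (parent_id, value) VALUES (1, 'hi'), (1, 'I'), (1, 'am');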
What you should do is export your R vector as JSON using the toJSON() function, for example:
myJSONVector = toJSON(c("hi","I", "am"))
Also, create or alter myTable so that myColumn has the appropriate JSON data type.
Attempting to insert a value into a JSON column succeeds if the value
is a valid JSON value, but fails if it is not:
Example
CREATE TABLE `myTable` (`myColumn` JSON);
INSERT INTO `myTable` VALUES(myJSONVector); -- will fail if myJSONVector is not valid JSON
-- the update query would be
UPDATE `myTable` SET `myColumn` = myJSONVector
WHERE `id` IN (3,4,5);
In addition, you can reconstruct an R vector from JSON using the fromJSON() function.
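For illustration, once the JSON literal produced by toJSON() is interpolated or bound from R, the executed statement would look something like this (the id range is taken from the question):
-- toJSON(c("hi","I","am")) renders to the literal below.
UPDATE `myTable`
SET `myColumn` = '["hi","I","am"]'
WHERE `id` IN (3,4,5);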

Improve order by json field performance on mysql

I have this table:
CREATE TABLE `mytable` (
`session_id` mediumint(8) UNSIGNED NOT NULL,
`data` json NOT NULL,
`jobname` varchar(100) COLLATE utf8_unicode_ci GENERATED ALWAYS AS
(json_unquote(json_extract(`data`,'$.jobname'))) VIRTUAL
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
PARTITION BY HASH (session_id)
PARTITIONS 10;
ALTER TABLE `mytable`
ADD KEY `session` (`session_id`),
ADD KEY `jobname` (`jobname`);
It has 2 million rows.
When I execute this query, it takes around 23 seconds to get the result.
SELECT JSON_EXTRACT(f.data, '$.jobdesc') AS jobdesc
FROM mytable f
WHERE f.session_id = 1
ORDER BY jobdesc DESC
I understand that it is slow because there is no index for the jobdesc field.
The data column has 12 fields. I want to let the user sort by any of these fields. If I add an index for each field, is that a good approach?
Is there any way to improve it?
I am using MySQL 5.7.13.
You would have to create a virtual column with an index for each of your 12 fields if you want the user to have the option of sorting by any of them.
CREATE TABLE `mytable` (
`session_id` mediumint(8) UNSIGNED NOT NULL,
`data` json NOT NULL,
`jobname` varchar(100) AS (json_unquote(json_extract(`data`,'$.jobname'))),
`jobdesc` varchar(100) AS (json_unquote(json_extract(`data`,'$.jobdesc'))),
...other extracted virtual fields...
KEY (`jobname`),
KEY (`jobdesc`),
...other indexes on virtual columns...
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
PARTITION BY HASH (session_id)
PARTITIONS 10;
This makes me wonder: why bother using JSON? Why not just declare 12 conventional, non-virtual columns with indexes?
CREATE TABLE `mytable` (
`session_id` mediumint(8) UNSIGNED NOT NULL,
...no `data` column needed...
`jobname` varchar(100),
`jobdesc` varchar(100),
...
KEY (`jobname`),
KEY (`jobdesc`),
...
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
PARTITION BY HASH (session_id)
PARTITIONS 10;
JSON is best when you treat it as a single atomic document, and don't try to use SQL operations on fields within it. If you regularly need to access fields within your JSON, make them into conventional columns.

Let field value set sort order direction in Mysql

Given the following table:
CREATE TABLE `example` (
`Identifier` int(11) NOT NULL AUTO_INCREMENT,
`FieldValue` varchar(255) DEFAULT NULL,
`FieldOrder` enum('asc','desc') DEFAULT 'asc',
PRIMARY KEY (`Identifier`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
I want to run a query that sorts the FieldValue field based on the value in the FieldOrder field. E.g.
Select * from example order by FieldValue [here the FieldOrder value should be placed]
Is it possible to make a reference to the FieldOrder field in the sort by part of the query?
One way to approach this is to treat FieldValue as positive or negative for FieldOrder values of "asc" and "desc" respectively (this assumes FieldValue holds numeric data, since the multiplication coerces the string to a number). This can be expressed by a case expression:
SELECT *
FROM example
ORDER BY CASE FieldOrder WHEN 'asc' THEN 1 ELSE -1 END * FieldValue
Not that I know of ...
Please note that every row has a FieldOrder, so what you seek to achieve seems questionable. What if one row says asc and another row says desc? How should those two rows be ordered then?
If you want a parameterized ORDER BY, you can consider the following two methods:
1. Use a stored procedure that takes an argument for, say, sortingOrder (see the sketch below).
2. Use a programming language (e.g. Java) to construct a query string, inject the sorting order dynamically, and then execute the query string against MySQL.
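A minimal sketch of the stored-procedure option, with a hypothetical procedure name; dynamic ORDER BY in MySQL requires building the statement text and running it as a prepared statement:
DELIMITER //
CREATE PROCEDURE select_example_sorted(IN sortingOrder VARCHAR(4))
BEGIN
  -- Whitelist the direction rather than concatenating raw input.
  SET @dir = IF(sortingOrder = 'desc', 'DESC', 'ASC');
  SET @sql = CONCAT('SELECT * FROM example ORDER BY FieldValue ', @dir);
  PREPARE stmt FROM @sql;
  EXECUTE stmt;
  DEALLOCATE PREPARE stmt;
END //
DELIMITER ;
-- Usage: CALL select_example_sorted('desc');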
You need to normalize your data and store it in a better schema. Do not store multiple values in a single field; store only one value per row. What you can do is put the multiple values in their own table, where each piece of data is in its own row.
Try a schema like this:
CREATE TABLE `example` (
`Identifier` int(11) NOT NULL AUTO_INCREMENT,
PRIMARY KEY (`Identifier`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
CREATE TABLE `exampleData` (
`RowID` int(11) NOT NULL AUTO_INCREMENT,
`Identifier` int(11) NOT NULL,
`FieldValue` varchar(255) DEFAULT NULL,
PRIMARY KEY (`RowID`),
INDEX `Identifier` (`Identifier`),
FOREIGN KEY (Identifier) REFERENCES example(Identifier)
ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
Now you have two tables, and in your exampleData table you have one row for each piece of data that was in FieldValue.
Now you can query like so:
SELECT Identifier, FieldValue
FROM example
JOIN exampleData USING(Identifier)
WHERE Identifier = 3
ORDER BY FieldValue ASC