How to manage JSON query performance in MySQL DB - mysql

I have a MySQL 8 database that contains JSON data. Unfortunately, the content is not always the same. To keep it simple: the hierarchy is always the same, but sometimes part of the "tree" is missing or slightly different. For instance:
$.bilan.victimes.*.preview.TAGSAU (I use a star because the key is sometimes '1', '2', etc., and sometimes there is only '$.bilan.victimes' without further subkeys).
Now, I am using queries to look up information in the JSON, like:
SELECT
COUNT(fiche_id) AS USAGE_DSA,
JSON_VALUE(content, '$.bilan.victimes.*.preview.DSA') AS DSA
FROM bilan_json
WHERE STR_TO_DATE(JSON_VALUE(content, '$.bilan.victimes.*.preview.TAGSAU'),'%e/%c/%Y %H%#%i') >= '2021-01-01'
GROUP BY DSA;
This works fine, but since there are a lot of records and the JSON can be very long, it takes an awfully long time to return the result. This example uses only one key; I am supposed to retrieve multiple values from the JSON, sometimes in a single query.
I've read about virtual columns (https://stackoverflow.com/questions/68118107/how-to-create-a-virtual-column-to-index-json-column-in-mysql#:~:text=if%20table%20is%20already%20created%20and%20you%20want,%60jval%60%3B%20Dont%20forget%20to%20index%20the%20Generated%20Columns) and about improving query performance for JSON objects (https://blogit.create.pt/goncalomelo/2018/12/20/query-performance-for-json-objects-inside-sql-server/), but I can't figure out whether I should create one virtual column per key. And how can I create a virtual column with a transform? In the case above, I would create something like:
ALTER TABLE bilan_json
ADD COLUMN tagsau DATETIME
GENERATED ALWAYS AS (STR_TO_DATE(JSON_VALUE(content, '$.bilan.victimes.*.preview.TAGSAU'),'%e/%c/%Y %H%#%i'))
AFTER content;
What would be your advice?

Simply put, if you expect to need a field in JSON for a WHERE or ORDER BY clause, that field should be in its own column.
3 approaches:
1. Redundantly store it in a column as you INSERT the rows.
2. Use a Virtual ("Generated") column (as you suggest).
3. Remove it from JSON as you put it in its own column.
Once it is in a column, it can be indexed. (It is unclear how useful an index would be for the SELECT you show.)
Did you try that ALTER? Did it work? We need SHOW CREATE TABLE in order to advise further.
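As a rough sketch of the generated-column approach, the statement below reuses the expression from the question unchanged and adds an index on the new column. The index name idx_tagsau is an assumption, and the path and format string are untested against real data. One thing worth checking: per the MySQL documentation, JSON_VALUE() treats a path that resolves to more than one value as an error and returns NULL by default, so the wildcard path may silently yield NULL on rows with several victims.

```sql
-- Sketch only: expression copied from the question, index name is hypothetical.
ALTER TABLE bilan_json
    ADD COLUMN tagsau DATETIME
        GENERATED ALWAYS AS (
            STR_TO_DATE(
                JSON_VALUE(content, '$.bilan.victimes.*.preview.TAGSAU'),
                '%e/%c/%Y %H%#%i'
            )
        ) VIRTUAL
        AFTER content,
    ADD INDEX idx_tagsau (tagsau);

-- The date filter can then compare the column directly instead of
-- re-parsing the JSON of every row.
SELECT
    COUNT(fiche_id) AS USAGE_DSA,
    JSON_VALUE(content, '$.bilan.victimes.*.preview.DSA') AS DSA
FROM bilan_json
WHERE tagsau >= '2021-01-01'
GROUP BY DSA;
```

Whether the optimizer actually uses idx_tagsau depends on how selective the date range is; EXPLAIN on the real table is the way to find out.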

Related

Indexing JSON column in MySQL 8

So I'm experimenting with a JSON column. MySQL 8.0.17 is supposed to support multi-valued JSON indexes, like this:
CREATE INDEX data__nbr_idx ON a1( (CAST(data->'$.nbr' AS UNSIGNED ARRAY)) )
I have a column categories with JSON like this: ["books", "clothes"].
I need to get all products from the "books" category. I can use "json_contains" or the new "member of".
SELECT * FROM products WHERE JSON_CONTAINS(categories, '\"books\"')
SELECT * FROM products WHERE "books" MEMBER OF(categories)
And it works. The problem is that EXPLAIN reveals that these queries perform a full table scan, which makes them slow.
So I need an index.
I adapted the index example by replacing the "unsigned" type with "char(32)", since my categories are strings and not numbers. I could not find any example of this on Google, so I assumed that char() would be fine, but it is not.
This is my index query:
CREATE INDEX categories_index ON products((CAST(categories AS CHAR(32) ARRAY)))
I also tried
CREATE INDEX categories_index ON products((CAST(categories->'$' AS CHAR(32) ARRAY)))
but the SELECTs still perform a full table scan. What am I doing wrong?
How do I correctly index a JSON column without using virtual columns?
For a multi-valued json index, the json paths have to match, so with an index
CREATE INDEX categories_index
ON products((CAST(categories->'$' AS CHAR(32) ARRAY)))
your query has to also use the path ->'$' (or the equivalent json_extract(...,'$'))
SELECT * FROM products WHERE "books" MEMBER OF(categories->'$')
Make it consistent and it should work.
It seems though that an index without an explicit path does not work as expected, so you have to specify ->'$' if you want to use the whole document. This is probably a bug, but could also be an intended behaviour of casting or autocasting to an array. If you specify the path you'll be on the safe side.
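To check whether the index is actually picked up, something like the following can be run. This is a minimal sketch: the table definition is a hypothetical stand-in (only the categories column and the index come from the question), and on a table with only a handful of rows the optimizer may still prefer a scan.

```sql
-- Hypothetical minimal table; the index uses the explicit '$' path.
CREATE TABLE products (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    categories JSON,
    INDEX categories_index ((CAST(categories->'$' AS CHAR(32) ARRAY)))
);

INSERT INTO products (categories)
VALUES ('["books", "clothes"]'), ('["toys"]');

-- Each predicate uses the same categories->'$' expression as the index.
EXPLAIN SELECT * FROM products
WHERE 'books' MEMBER OF(categories->'$');

EXPLAIN SELECT * FROM products
WHERE JSON_CONTAINS(categories->'$', CAST('["books"]' AS JSON));

EXPLAIN SELECT * FROM products
WHERE JSON_OVERLAPS(categories->'$', CAST('["books", "toys"]' AS JSON));
```

If the key column of the EXPLAIN output shows categories_index, the multi-valued index is being used for that query.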

MySql 5.7 json_extract by key

I have a table and it looks like below:
Table data
id params
1 {"company1X":{"price":"1124.55"},"company2X":{"price":"1,124.55"},"company3X":{"price":""},"company4X":{"price":""},"company5X":{"price":"1528.0"}}
I don't know the names of the "company" keys in advance, so I can't use them in my query.
How can I fetch my data ordered by price?
Thanks!
P.S. I have tried select json_extract(params, '$[*].price') from data, but it doesn't work (it returns NULL).
$[*] gets all elements of a JSON array, not an object. This is an object, so you get NULL.
$.* will get you all elements in a JSON object, so $.*.price gets you a JSON array of all prices.
mysql> select json_extract(params, '$.*.price') from foo;
+-------------------------------------------+
| json_extract(params, '$.*.price')         |
+-------------------------------------------+
| ["1124.55", "1,124.55", "", "", "1528.0"] |
+-------------------------------------------+
Now there's a problem. As far as SQL is concerned, this is a single row. It can't be sorted with a normal order by, that works on rows.
MySQL has no function for sorting JSON... so you're stuck. You can return the JSON array and let whatever is receiving the data do the sorting. You might be able to write a stored procedure to sort the array... but that's a lot of work to support a bad table design. Instead, change the table.
The real problem is this is a bad use of a JSON column.
JSON columns defeat most of the point of a SQL database (less so in PostgreSQL which has much better JSON support). SQL databases work with rows and columns, but JSON shoves what would be multiple rows and columns into a single cell.
For this reason, JSON columns should be used sparingly; typically when you're not sure what sort of data you'll be needing to store. Important information like "price" that's going to be searched and sorted should be done as normal columns.
You'll want to change your table to be a normal SQL table with columns for the company name and price. Then you can use normal SQL features like order by and performance will benefit from indexing. There isn't enough information in your question to suggest what that table might look like.
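Purely as an illustration of what such a table could look like, here is a sketch. Every name in it (company_price, data_id, company, price) is an assumption, and the sample rows are the values from the question with the thousands separator removed and empty prices stored as NULL.

```sql
-- Hypothetical normalized layout: one row per company instead of one JSON blob.
CREATE TABLE company_price (
    data_id INT NOT NULL,            -- id of the original row in data
    company VARCHAR(64) NOT NULL,    -- former JSON key, e.g. 'company1X'
    price   DECIMAL(10,2) NULL,      -- NULL where the JSON held ""
    PRIMARY KEY (data_id, company),
    INDEX idx_price (price)
);

INSERT INTO company_price (data_id, company, price) VALUES
    (1, 'company1X', 1124.55),
    (1, 'company2X', 1124.55),
    (1, 'company3X', NULL),
    (1, 'company4X', NULL),
    (1, 'company5X', 1528.00);

-- Sorting by price is now an ordinary ORDER BY on a numeric column.
SELECT company, price
FROM company_price
WHERE data_id = 1
  AND price IS NOT NULL
ORDER BY price;
```

Storing the price as DECIMAL rather than as a string also avoids values such as "1,124.55" sorting incorrectly.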

One column or separate columns for extra data - mysql

I was wondering: if I have a table with columns for meta_description (varchar 300), meta_tags (varchar 300), and meta_title (varchar 200), can I "join" all these columns into just one column "extra_information" (longtext) and save the same information there, maybe in JSON format?
Is this convenient or not, and why? :)
These fields are not very important to me; I will never run a query that searches or sorts the results by this information. The meta tags, for example, are only comma-separated text, so I don't need any kind of relation table for them.
What I want to know is whether this will save space in my database, or run a little faster, or things like that... But if you tell me that having 5 columns instead of just one is the same for MySQL, of course I will keep the 5 columns...
Thanks a lot!
The answer boils down to: does MySQL have to work with your data?
If all data is concatenated in one column, be it as JSON or comma-separated or whatnot, it is nearly off limits for any MySQL operation. You can surely SELECT it, but it is very hard to search, group or sort by anything inside that column. So, if you are absolutely sure MySQL never has to see the data itself and will only return the column with the data in it, go for it.
The benefits are that the table structure does not have to change when your data changes, and the column structure is very clean.
If you need to filter, sort, group or do any other operation on it within a SQL query, leave it in separate columns.

Each user has different 'structure' using only one table

I'm trying to do it like this:
Every user can choose their own fields (like a table structure in MySQL), and each field holds its respective value; it's like building a DB inside a DB.
But how can I do it using a single table?
(I'm not talking about user accounts etc., where I should be able to use a pointer to the user's own "structure".)
Should I do something like a varchar key column that stores something like "Name:asd", and have PHP explode on ':' to get the respective structure ('name' in this case) and the respective value ('asd')?
Or use a BLOB? Can someone shed some light on this for me? I don't know how to do something that works better than what I've just described...
I know my text is confusing, and sorry for any bad English.
EDIT:
Also, they could add multiple keys/"structures", each accepting a new value.
And they are not able to see the database or tables; they are still normal users.
My server does not support PostgreSQL.
In my opinion you should create two tables:
1. One with the user info
2. One with 3 fields (userid, key and value)
Each user has 1 record in the first table. Each user can have 0 or more records in the second table. This will ensure you can still search the data and that users can easily add more key/value pairs when needed.
Don't start building a database in a database. In this case, since users define the fields themselves, there is no relation between the fields, as I understand it? In that case it would make sense to take a look at NoSQL databases, since they seem to fit this kind of situation very well.
Another thing to check is something like:
http://www.postgresql.org/docs/8.4/static/hstore.html
Do not try to build tables like: records, fields, field types etc. That's a bad practice and should not be needed.
For a more specific answer on your wishes we need a bit more info about the data the user is storing.
While I think the rational answer to this question is the one given by PeeHaa, if you really want the data to fit into one table you could try saving a serialized PHP array in one of the fields. Check out serialize and unserialize:
Generates a storable representation of a value
This is useful for storing or passing PHP values around without losing
their type and structure.
This method is discouraged as it is not at all scalable.
Use a table with key-value pairs. So three columns:
user id
key ("name")
value ("asd")
Add an index on user id, so that you can query a user's attributes easily. If you wanted to query all users with the same properties, then you could add a second index on key and/or value.
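A minimal sketch of such a key-value table follows. The table and column names are assumptions (KEY is a reserved word in MySQL, hence attr_key), and the primary key assumes each user stores at most one value per key.

```sql
-- Hypothetical key-value table: one row per (user, key) pair.
CREATE TABLE user_attribute (
    user_id    INT UNSIGNED NOT NULL,
    attr_key   VARCHAR(64)  NOT NULL,   -- e.g. 'name'
    attr_value VARCHAR(191) NOT NULL,   -- e.g. 'asd'; kept short enough to index
    PRIMARY KEY (user_id, attr_key),    -- also serves lookups by user_id alone
    INDEX idx_key_value (attr_key, attr_value)
);

INSERT INTO user_attribute (user_id, attr_key, attr_value)
VALUES (1, 'name', 'asd');

-- All attributes of one user (served by the primary key).
SELECT attr_key, attr_value
FROM user_attribute
WHERE user_id = 1;

-- All users sharing a given property (served by idx_key_value).
SELECT user_id
FROM user_attribute
WHERE attr_key = 'name' AND attr_value = 'asd';
```

If a user may hold several values for the same key, the primary key would need an extra column, such as an auto-increment id.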
I assume you are also using a programming language to fetch and present the data.
You can have a single table with a varchar field. Then you store the serialized field structure and its values in that field. When you want to get the structure back, query the data and deserialize that varchar field.
As far as I know, every programming language supports serialization and deserialization.
Edited: this is not a scalable option.

How can I pull an ID from a varchar field and JOIN another table?

I have a field called 'click_target' that stores a string of data similar to this:
http://domain.com/deals/244?utm_source=sendgrid.com&utm_medium=email&utm_campaign=website
Using a MySQL query, is it possible to pull the ID (244) from the string and use it to join another table?
You can certainly play games with expressions to pull the ID out of this string, but I have a bigger worry: you're baking a dependency on the URL format into a query in the database. That's not really a good idea because when (I don't say IF) the URLs change, your queries will suddenly fail silently: no errors, just empty (if you're lucky) or nonsensical results.
It's an ugly hack, but I believe this line will extract your ID for you in the above url.
REVERSE(LEFT(REVERSE(LEFT(click_target, LOCATE('?', click_target)-1)), LOCATE('/', REVERSE(LEFT(click_target, LOCATE('?', click_target)-1)))-1))
Basically, I am getting the text between the first '?' and the last '/'. From there you can join on whatever table you want with that value, though I recommend aliasing it or storing it in a variable so that it is not recalculated frequently.
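A shorter way to express the same extraction is two nested SUBSTRING_INDEX() calls. The sketch below assumes hypothetical tables clicks (holding click_target) and deals (with a numeric id); neither name comes from the question.

```sql
-- Hypothetical tables: clicks(click_target) and deals(id).
SELECT c.click_target, d.*
FROM clicks AS c
JOIN deals AS d
  ON d.id = CAST(
       SUBSTRING_INDEX(                              -- text after the last '/'
         SUBSTRING_INDEX(c.click_target, '?', 1),    -- text before the first '?'
         '/', -1
       ) AS UNSIGNED
     );
```

Because the join key is computed from the string, an index on click_target does not help here, and every row of clicks has to be evaluated.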
If you need the id, fix your database to store it properly in a separate field.