In MySQL 5.7 I can compare two JSON objects in a way that ignores key order, e.g. the following two JSON strings are equal:
SELECT CAST('{"num": "27,28", "date": "2019-11-01"}' AS JSON) = CAST('{"date": "2019-11-01", "num": "27,28"}' AS JSON);
In MariaDB there is no such JSON data type, so the above comparison returns false.
Is there any way to accomplish the same in MariaDB?
Note: I've seen the following post, and its solution is not ideal: Compare JSON values in MariaDB
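One option, if upgrading is possible: MariaDB 10.7 introduced JSON_EQUALS, which compares two JSON documents structurally and ignores key order. A minimal sketch, assuming MariaDB 10.7+ (verify availability on your version):

-- Hedged sketch: JSON_EQUALS returns 1 when the documents are
-- semantically equal, regardless of key order.
SELECT JSON_EQUALS(
    '{"num": "27,28", "date": "2019-11-01"}',
    '{"date": "2019-11-01", "num": "27,28"}'
);  -- 1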
My question is whether PostgreSQL actually stores JSON data in a jsonb column with escaped quotation marks.
The content in the column is stored as:
"{\"Verdie\":\"Barbecue Ribs\",\"Maurice\":\"Pappardelle alla Bolognese\",\"Vincent\":\"Tiramisù\"}"
I can't work out if this is a feature of PostgreSQL or a consequence of how I'm seeding my Rails database with Faker data:
# seeds.rb
require "faker"

10.times do
  con = Connector.create(
    user_id: 1,
    name: Faker::Company.name,
    description: Faker::Company.buzzword
  )
  rand(6).times do
    con.connectors_data.create(
      version: Faker::Number.number(digits: 5),
      metadata: Faker::Json.shallow_json(width: 3, options: { key: "Name.first_name", value: "Food.dish" }),
      comment: Faker::Lorem.sentence
    )
  end
end
It's due to what you're using to seed the data.
This is a classic "double encoding" issue.
When dealing with JSON columns, you need to remember that the database adapter (the pg gem) will automatically serialize Ruby hashes, arrays, numbers, strings, and booleans into JSON. If you feed the database adapter something that you have already converted into JSON, it will store it as a string, hence the escaped quotes. "JSON strings" in Ruby are not a distinct type, and the adapter has no way of knowing that you intended to store the JSON object {"foo": "bar"} and not the string "{\"foo\": \"bar\"}".
This is also what commonly happens when serialize or store are used on JSON columns out of ignorance.
The result is garbage data that can't be queried without calling the Postgres to_json function on every row, which is extremely inefficient, or without updating the entire table with:
UPDATE table_name SET column_name = to_json(column_name);
Which can also be very costly.
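To see the difference at the SQL level, here is a minimal sketch in PostgreSQL of the two shapes the column can end up with:

-- A properly stored JSON object vs. a double-encoded JSON string:
SELECT '{"foo": "bar"}'::jsonb;           -- {"foo": "bar"}       (queryable object)
SELECT to_jsonb('{"foo": "bar"}'::text);  -- "{\"foo\": \"bar\"}" (just a string)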
While you could do:
rand(6).times do
  con.connectors_data.create(
    version: Faker::Number.number(digits: 5),
    metadata: JSON.parse(Faker::Json.shallow_json(width: 3, options: { key: "Name.first_name", value: "Food.dish" })),
    comment: Faker::Lorem.sentence
  )
end
It's very smelly, and the underlying method that Faker::Json uses to generate the hash is not public, so you might want to look around for a better alternative.
I have millions of files with the following (poor) JSON format:
{
    "3000105002": [
        {
            "pool_id": "97808",
            "pool_name": "WILDCAT (DO NOT USE)",
            "status": "Zone Permanently Plugged",
            "bhl": "D-12-10N-05E 902 FWL 902 FWL",
            "acreage": ""
        },
        {
            "pool_id": "96838",
            "pool_name": "DRY & ABANDONED",
            "status": "Zone Permanently Plugged",
            "bhl": "D-12-10N-05E 902 FWL 902 FWL",
            "acreage": ""
        }
    ]
}
I've tried to write an Athena DDL that would accommodate this type of structure (especially the api field) with this:
CREATE EXTERNAL TABLE wp_info (
api:array < struct < pool_id:string,
pool_name:string,
status:string,
bhl:string,
acreage:string>>)
LOCATION 's3://foo/'
After trying to generate a table with this, the following error is thrown:
Your query has the following error(s):
FAILED: ParseException line 2:12 cannot recognize input near ':' 'array' '<' in column type
What is a workable solution to this issue? Note that the api string is different in every one of the millions of files. The api key is not actually present in any of the files, so I hope there is a way for Athena to accommodate just the string-typed value for these data.
If you don't have control over the JSON format that you are receiving, and you don't have a streaming service in the middle to transform the JSON format to something simpler, you can use regex functions to retrieve the relevant data that you need.
A simple way to do it is to use a Create-Table-As-Select (CTAS) query that will convert the data from its complex JSON format to a simpler table format.
CREATE TABLE new_table
WITH (
    external_location = 's3://path/to/ctas_partitioned/',
    format = 'Parquet',
    parquet_compression = 'SNAPPY')
AS SELECT
    regexp_extract(line, '"pool_id": "(\d+)"', 1) as pool_id,
    regexp_extract(line, '"pool_name": "([^"]+)"', 1) as pool_name,
    ...
FROM json_lines_table;
Queries against the new table will also perform better, since the data is stored in Parquet format.
Note that you can also update the table when you get new data, by running the CTAS query again with external_location set to 's3://path/to/ctas_partitioned/part=01' or any other partition scheme.
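The CTAS query above assumes a source table that exposes each raw line of the files as text. A minimal sketch of such a table (the name json_lines_table and the column line are assumptions used for illustration, not part of the original answer):

-- Hedged sketch: one string column per raw line, so regexp_extract can parse it.
CREATE EXTERNAL TABLE json_lines_table (line string)
ROW FORMAT DELIMITED
LINES TERMINATED BY '\n'
LOCATION 's3://foo/';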
Imagine the existing JSON doc:
{
    "first": "data",
    "second": [1,2,3]
}
When I try to execute:
JSON_ARRAY_APPEND(doc,'$.third',4)
I expect MySQL to create the path as an empty array and add my element into that array, resulting in:
{
    "first": "data",
    "second": [1,2,3],
    "third": [4]
}
This however is not the case. I am trying to do this in an UPDATE query to add data into the db using something similar to:
UPDATE mytable
SET myjson=JSON_ARRAY_APPEND(myjson,'$.third',4)
WHERE ...
I am using MySQL 8.0.16, if that makes any difference. I am not getting any errors, just 0 row(s) affected.
Your JSON is not an array, so rather than JSON_ARRAY_APPEND(), you can consider using the JSON_MERGE_PATCH() function if the order of the keys does not matter:
UPDATE mytable
SET myjson = JSON_MERGE_PATCH(myjson, '{"third": [4]}')
Per the documented normalization principle, MySQL sorts the keys of a JSON object to make lookups more efficient. Be aware that the result of this ordering is subject to change and is not guaranteed to be consistent across releases.
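If key-ordering concerns matter, a hedged alternative sketch using JSON_SET (a standard MySQL 5.7+ function that creates the path if it is missing; this rewrite is an illustration, not the original answer):

-- Sketch: JSON_SET adds the key with a new array value, no merge semantics.
UPDATE mytable
SET myjson = JSON_SET(myjson, '$.third', JSON_ARRAY(4))
WHERE ...;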
I have the following data in a json column:
[
["model-1", 0.06232],
["model-2", 0.33587],
["model-3", 0.04962],
["model-4", 0.235],
["model-5", 0.31719]
]
My goal is to prepend a string to the first element of each inner list so that the output becomes:
[
["somestr/model-1", 0.06232],
["somestr/model-2", 0.33587],
["somestr/model-3", 0.04962],
["somestr/model-4", 0.235],
["somestr/model-5", 0.31719]
]
I have been able to "extract" the first elements using MariaDB's JSON_EXTRACT as follows:
SELECT JSON_EXTRACT(factors, '$[*][0]') FROM my_table;
But I could not get any further.
Is it possible to achieve this using mariadb / mysql JSON functions? Or am I better off doing this on the application level?
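One possible approach, sketched under the assumption that JSON_TABLE is available (MySQL 8.0, MariaDB 10.6+): explode the pairs into rows, prepend the prefix, and re-aggregate. The column names model and score are illustrative only:

-- Hedged sketch: unnest each [name, number] pair, transform, and rebuild.
-- (Scope with WHERE / GROUP BY per row if the table holds many rows.)
SELECT JSON_ARRAYAGG(JSON_ARRAY(CONCAT('somestr/', jt.model), jt.score))
FROM my_table,
     JSON_TABLE(factors, '$[*]'
       COLUMNS (
         model VARCHAR(64) PATH '$[0]',
         score DOUBLE      PATH '$[1]'
       )
     ) AS jt;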
I have an SQL table in which one of the columns contains a JSON array in the following format:
[
    {
        "id": "1",
        "translation": "something here",
        "value": "value of something here"
    },
    {
        "id": "2",
        "translation": "something else here",
        "value": "value of something else here"
    },
    ..
    ..
    ..
]
Is there any way to use an SQL query to retrieve columns with the id as the header and the "value" as the value of the column, instead of returning only one column with the JSON array?
For example, if I run:
SELECT column_with_json FROM myTable
It will return the above array, whereas I want it to return:
1,2
value of something here, value of something else here
You can't use SQL to retrieve columns from the JSON stored inside the table: to the database engine the JSON is just unstructured text saved in a text field.
Some relational databases, like PostgreSQL, have a JSON type and functions to support JSON query. If this is your case, you should be able to perform the query you want.
Check this for an example of how it works with PostgreSQL:
http://clarkdave.net/2013/06/what-can-you-do-with-postgresql-and-json/
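For illustration, a hedged sketch of what such a query could look like in PostgreSQL, casting the text column to jsonb (the table and column names are taken from the question):

-- Sketch: expand the JSON array into rows and project id/value pairs.
SELECT elem->>'id'    AS id,
       elem->>'value' AS value
FROM myTable,
     jsonb_array_elements(column_with_json::jsonb) AS elem;

Note this yields one row per array element; pivoting those rows into the id-per-column layout the question asks for would additionally require crosstab or conditional aggregation.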