A MySQL table has a JSON column containing a large amount of JSON data.
For example:
SELECT nodes FROM Table LIMIT 1;
results in:
'{"data": {"id": "Node A", "state": true, "details": [{"value": "Value","description": "Test"}, {"value": "Value2", "description": "Test2"}, {"value": "Value 7", "description": "Test 7"}, {"value": "Value 9", "description": "Test 9"}]}}'
How can I write queries that return rows in accordance with the following examples:
Rows where "Node A" has state true. Here "Node A" is the value of the "id" key, and "state" holds true or false.
Rows where "value" is "Value2" or where "description" is "Test2". Note that these key/value pairs sit inside the "details" array.
I doubt you can achieve this with a direct MySQL query. You would have to load the string data from the MySQL database, then parse that string into a JSON object on which you can run your custom query logic to get your output.
In this case I would suggest using MongoDB instead, which would be an ideal storage solution here and lets you query such documents directly.
Related
I have some data to be inserted into a MySQL column with the JSON datatype (blob_forms).
The value of the fields column is populated asynchronously, and if the document has multiple pages, then I need to append the data onto the existing row.
So the table is:
document
document_id INT
text_data JSON
blob_forms JSON
blob_data JSON
The first chunk of data is inserted correctly; here is a sample:
{"fields": [
{"key": "Date", "value": "01/01/2020"},
{"key": "Number", "value": "xxx 2416 xx"},
{"key": "Invoice Date", "value": "xx/xx/2020"},
{"key": "Reg. No.", "value": "7575855"},
{"key": "VAT", "value": "1,000.00"}
]}
I am using an AWS Lambda function (Python) to handle the database insert, with this query:
insertString = json.dumps(newObj)
sql = "INSERT INTO `document` (`document_id`, `blob_forms`) VALUES (%s, %s) ON DUPLICATE KEY UPDATE `blob_forms` = %s"
cursor.execute(sql, (self.documentId, insertString, insertString))
conn.commit()
The problem is that I also want to do an UPDATE, so that if blob_forms already has a value, the new items are appended to the fields array of the existing object.
So basically, if the original data is sent a second time with the same document_id, it should append to any existing data in blob_forms while preserving the JSON structure.
(Please note that other processes write to this table, and possibly to this row, due to the async nature: the data for the columns can be written in any order, but the document_id ties them all together.)
My failed attempt was something like this;
SET @j = '{"fields": [{"key": "Date", "value": "01/01/2020"},{"key": "Number", "value": "xxx 2416 xx"},{"key": "Invoice Date", "value": "xx/xx/2020"},{"key": "Reg. No.", "value": "7575855"},{"key": "VAT", "value": "1,000.00"}]}';
INSERT INTO `document` (`document_id`, `blob_forms`) VALUES ('DFGHJKfghj45678', @j) ON DUPLICATE KEY UPDATE blob_forms = JSON_INSERT(blob_forms, '$', @j);
I'm not sure that you can get the results you want with one clean query in MySQL. My advice would be to make the changes to the array on the client side (or wherever) and update the entire field, without delving into whether there is an existing value or not. I architect all of my APIs this way to keep the database interactions clean and fast.
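A minimal sketch of that client-side approach in Python, using the sample data from the question. `merge_blob_forms` is a hypothetical helper; reading the existing row and writing the merged result back are left to the caller:

```python
import json

def merge_blob_forms(existing_json, new_obj):
    """Append new_obj's "fields" entries onto the existing blob_forms
    document (or start a fresh one), preserving the JSON structure."""
    merged = {"fields": []} if existing_json is None else json.loads(existing_json)
    merged.setdefault("fields", []).extend(new_obj.get("fields", []))
    return merged

# First write: no existing value for blob_forms yet
first = merge_blob_forms(None, {"fields": [{"key": "Date", "value": "01/01/2020"}]})
# Second write for the same document_id: append to what is stored
second = merge_blob_forms(json.dumps(first),
                          {"fields": [{"key": "VAT", "value": "1,000.00"}]})
print(second)
```

The merged object is then written back with a plain UPDATE (or the INSERT ... ON DUPLICATE KEY UPDATE from the question), replacing the whole column value. Note that this read-modify-write is not safe against the concurrent writers mentioned in the question without a transaction or SELECT ... FOR UPDATE.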
So far this looks closest;
SET @j = '{"fields": [{"key": "Date", "value": "01/01/2020"},{"key": "Number", "value": "xxx 2416 xx"},{"key": "Invoice Date", "value": "xx/xx/2020"},{"key": "Reg. No.", "value": "7575855"},{"key": "VAT", "value": "1,000.00"}]}';
INSERT INTO `document` (`document_id`, `blob_forms`) VALUES ('DFGHJKfghj45678', @j) ON DUPLICATE KEY UPDATE blob_forms = JSON_MERGE_PRESERVE(blob_forms, @j);
I have a PySpark dataframe with n rows, and each row has one column, result.
The content of the result column is JSON:
{"crawlDate": "2019-07-03 20:03:44", "Code": "200", "c1": "XYZ", "desc": "desc", "attributes": {"abc":123, "def":456}}
{"crawlDate": "2019-07-04 20:03:44", "Code": "200", "c1": "ABC", "desc": "desc1"}
{"crawlDate": "2019-07-04 20:03:44", "Code": "200", "c1": "ABC", "desc": "desc1", "attributes": {"abc":456, "def":123}}
df.show():
Now I want to check how many records (rows) have an attributes element and how many don't.
I tried the array_contains, filter and explode functions in Spark, but I didn't get the results.
Any suggestions, please?
import org.apache.spark.sql.functions._
df.select(get_json_object($"result", "$.attributes").alias("attributes")) .filter(col("attributes").isNotNull).count()
With this logic, we can get the count of records that have attributes.
For your reference, please read this:
https://docs.databricks.com/spark/latest/dataframes-datasets/complex-nested-data.html
Another solution: if your input is a JSON file, then
val df = spark.read.json("path of json file")
df.filter(col("attributes").isNotNull).count()
A similar API is available in Python.
The simple logic below worked after a lot of struggle:
total_count = old_df.count()
new_df = old_df.filter(old_df.result.contains("attributes"))
success_count = new_df.count()
failure_count = total_count - success_count
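Note that result.contains("attributes") is a plain substring test, so it would also match a row whose JSON merely mentions the word "attributes" in some other value. If that is a concern, parsing the JSON is more robust; a plain-Python sketch of the same counting logic, using the three sample rows from the question:

```python
import json

# The three sample result values from the question
rows = [
    '{"crawlDate": "2019-07-03 20:03:44", "Code": "200", "c1": "XYZ", "desc": "desc", "attributes": {"abc": 123, "def": 456}}',
    '{"crawlDate": "2019-07-04 20:03:44", "Code": "200", "c1": "ABC", "desc": "desc1"}',
    '{"crawlDate": "2019-07-04 20:03:44", "Code": "200", "c1": "ABC", "desc": "desc1", "attributes": {"abc": 456, "def": 123}}',
]

total_count = len(rows)
# Count rows whose parsed JSON actually has a top-level "attributes" key
success_count = sum(1 for r in rows if "attributes" in json.loads(r))
failure_count = total_count - success_count
print(success_count, failure_count)  # 2 1
```

In Spark the same parse-based check could be applied per row with a UDF, at the usual UDF performance cost.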
This is a sample document. I wish to store a MongoDB query in the "condition" field.
How can I insert it?
{
"_id": 9000001,
"GeoId": 111002,
"collection": "Age",
"condition": "{ db.age.find({\"Fields.FieldValue\" : \"\"})}",
"alertMessageTemplate": "The age is #<20# & #>100# "
}
My best approach would be to keep the collection name and the condition as separate fields:
"collection": "Age",
"condition": {"Fields.FieldValue" : ""},
This way you can save the condition as a JSON object.
Then you can build the call when querying: db.getCollection(collection).find(condition);
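A small sketch of that idea in Python. The pymongo call at the end is illustrative only (it assumes a db handle, and lowercasing "Age" to match the db.age collection is an assumption from the sample document):

```python
import json

# Store the condition as a real JSON object, not a string of query code
alert = {
    "_id": 9000001,
    "GeoId": 111002,
    "collection": "Age",
    "condition": {"Fields.FieldValue": ""},
    "alertMessageTemplate": "The age is #<20# & #>100# ",
}

# The document round-trips cleanly because "condition" is plain JSON
stored = json.dumps(alert)
loaded = json.loads(stored)

# At alert-evaluation time you would do something like (pymongo, illustrative):
# db[loaded["collection"].lower()].find(loaded["condition"])
print(loaded["condition"])
```

Storing the condition as data rather than as a code string also avoids quoting problems like the ones in the sample document.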
If I have JSON like the following in a column in a MySQL database:
[
{
"name": "John",
"checked": true
},
{
"name": "Lucy",
"checked": false
}
]
how can I select in MySQL all rows where, within the same object, name = 'John' and checked = true?
The JSON objects may have more keys, and the keys may not be in any specific order.
Just use JSON_CONTAINS:
SELECT * from `users`
WHERE JSON_CONTAINS(`data`, '{"name": "John","checked": true}');
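JSON_CONTAINS does partial object matching against each array element, so extra keys and key order don't matter. For reference, the same test written client-side in plain Python (a sketch, not part of the SQL answer):

```python
import json

def contains(data_json, candidate):
    """True if any object in the JSON array has all of candidate's
    key/value pairs (extra keys and key order don't matter)."""
    return any(candidate.items() <= obj.items()
               for obj in json.loads(data_json))

data = ('[{"name": "John", "checked": true, "age": 30},'
        ' {"name": "Lucy", "checked": false}]')
print(contains(data, {"name": "John", "checked": True}))  # True
print(contains(data, {"name": "Lucy", "checked": True}))  # False
```

The subset check `candidate.items() <= obj.items()` mirrors what JSON_CONTAINS does with the candidate object against each element of the array.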
You can try to match the string if the keys are always in the same order. Assuming your column is called people:
SELECT
IF(people LIKE '%[
{
\"name\": \"John\",
\"checked\": true%', TRUE, FALSE) AS 'john_checked'
FROM table
WHERE (people LIKE '%[
{
\"name\": \"John\",
\"checked\": true%')
With the knowledge of this solution, you could write a shorter query such as the following. You may use it alone, or as a subquery within the WHERE clause of a query that returns the whole rows.
SELECT
IF(people LIKE '%\"checked\": true%', TRUE, FALSE) AS 'john_checked'
FROM table
WHERE (people LIKE '%\"name\": \"John\"%')
You can probably see from this that JSON is not ideal for storing in MySQL.
The better solution is to design your database as a relational one, i.e. have an additional table called people and a column (or columns) that links the data. Saying how to design this would require knowing much more about your data/subject, but you should learn about SQL joins and normalisation.
There are other questions that discuss JSON in MySQL, such as Storing JSON in database vs. having a new column for each key and Storing Data in MySQL as JSON.
As of MySQL 5.7 there are some JSON-related functions. See this wagon article, and the MySQL documentation.
Here is how it can be done in PostgreSQL:
create table users (data jsonb);
insert into users values ('[{"name": "John", "checked": "true"}, {"name": "Lucy", "checked": "false"}]'),('[{"name": "John", "checked": "false"}, {"name": "Lucy", "checked": "false"}]'),('[{"name": "John", "checked": "false"}, {"name": "Lucy", "checked": "true"}]');
select * from users, jsonb_array_elements(users.data) obj
where obj->>'name' = 'John' and obj->>'checked' = 'true';
data | value
-----------------------------------------------------------------------------+-------------------------------------
[{"name": "John", "checked": "true"}, {"name": "Lucy", "checked": "false"}] | {"name": "John", "checked": "true"}
(1 row)
I want to be able to access deeper elements stored in a json field in a PostgreSQL database. For example, I would like to access the elements along the path states->events->time in the JSON provided below. Here is the PostgreSQL query I'm using:
SELECT
data#>> '{userId}' as user,
data#>> '{region}' as region,
data#>>'{priorTimeSpentInApp}' as priotTimeSpentInApp,
data#>>'{userAttributes, "Total Friends"}' as totalFriends
from game_json
WHERE game_name LIKE 'myNewGame'
LIMIT 1000
and here is an example record from the json field
{
"region": "oh",
"deviceModel": "inHouseDevice",
"states": [
{
"events": [
{
"time": 1430247045.176,
"name": "Session Start",
"value": 0,
"parameters": {
"Balance": "40"
},
"info": ""
},
{
"time": 1430247293.501,
"name": "Mission1",
"value": 1,
"parameters": {
"Result": "Win ",
"Replay": "no",
"Attempt Number": "1"
},
"info": ""
}
]
}
],
"priorTimeSpentInApp": 28989.41467999999,
"country": "CA",
"city": "vancouver",
"isDeveloper": true,
"time": 1430247044.414,
"duration": 411.53,
"timezone": "America/Cleveland",
"priorSessions": 47,
"experiments": [],
"systemVersion": "3.8.1",
"appVersion": "14312",
"userId": "ef617d7ad4c6982e2cb7f6902801eb8a",
"isSession": true,
"firstRun": 1429572011.15,
"priorEvents": 69,
"userAttributes": {
"Total Friends": "0",
"Device Type": "Tablet",
"Social Connection": "None",
"Item Slots Owned": "12",
"Total Levels Played": "0",
"Retention Cohort": "Day 0",
"Player Progression": "0",
"Characters Owned": "1"
},
"deviceId": "ef617d7ad4c6982e2cb7f6902801eb8a"
}
That SQL query works, except that it doesn't return any values for totalFriends (e.g. data#>>'{userAttributes, "Total Friends"}' as totalFriends). I assume part of the problem is that events falls within square brackets (I don't know what they indicate in JSON) as opposed to curly braces, but I'm also unable to extract values from the userAttributes key.
I would appreciate it if anyone could help me.
I'm sorry if this question has been asked elsewhere. I'm so new to postgresql and even json that I'm having trouble coming up with the proper terminology to find the answers to this (and related) questions.
You should definitely familiarize yourself with the basics of json
and the json functions and operators in Postgres.
In the second source, pay attention to the operators -> and ->>.
General rule: use -> to get a json object, ->> to get a json value as text.
Using these operators, you can rewrite your query in a way that returns the correct value of 'Total Friends':
select
data->>'userId' as user,
data->>'region' as region,
data->>'priorTimeSpentInApp' as priotTimeSpentInApp,
data->'userAttributes'->>'Total Friends' as totalFriends
from game_json
where game_name like 'myNewGame';
Json objects in square brackets are elements of a json array.
Json arrays may have many elements, and the elements are accessed by an index.
Json arrays are indexed from 0 (the first element of an array has index 0).
Example:
select
data->'states'->0->'events'->1->>'name'
from game_json
where game_name like 'myNewGame';
-- returns "Mission1"
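The same path traversal written out in Python terms, against an abbreviated copy of the sample record, may make the indexing easier to see:

```python
# Abbreviated version of the sample record from the question
data = {
    "states": [                     # -> 'states'        : a json array
        {
            "events": [             # -> 0 -> 'events'   : another array
                {"time": 1430247045.176, "name": "Session Start"},
                {"time": 1430247293.501, "name": "Mission1"},
            ]
        }
    ]
}

# data->'states'->0->'events'->1->>'name' in SQL corresponds to:
name = data["states"][0]["events"][1]["name"]
print(name)  # Mission1
```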
This did help me