MySQL - How to find number of occurrences in JSON? - mysql

I have following table -
+-----+-----------+-----------------------------------+
| id  | client_id | service_values                    |
+-----+-----------+-----------------------------------+
| 100 | 1000      | {"1": "60", "2": "64", "3": "92"} |
| 101 | 1000      | {"1": "66", "2": "64", "3": "92"} |
| 102 | 1000      | {"1": "70", "2": "64", "3": "92"} |
| 103 | 1001      | {"1": "60", "2": "54", "3": "92"} |
| 104 | 1001      | {"1": "90", "2": "64", "3": "92"} |
| 105 | 1002      | {"1": "80", "2": "64", "3": "92"} |
+-----+-----------+-----------------------------------+
I need to fetch the rows where 2 or more values are LESS THAN 65. For example, from the above records the expected result would be -
+-----+-----------+-----------------------------------+
| id  | client_id | service_values                    |
+-----+-----------+-----------------------------------+
| 100 | 1000      | {"1": "60", "2": "64", "3": "92"} |
| 103 | 1001      | {"1": "60", "2": "54", "3": "92"} |
+-----+-----------+-----------------------------------+
I searched a lot but couldn't find anything on performing aggregate functions over JSON data.
Update - I tried something like this but couldn't succeed -
SELECT JSON_UNQUOTE(JSON_EXTRACT(service_values, '$.1')) AS A1,
       JSON_UNQUOTE(JSON_EXTRACT(service_values, '$.2')) AS A2,
       JSON_UNQUOTE(JSON_EXTRACT(service_values, '$.3')) AS A3
FROM service_data
HAVING A1 < 65 OR A2 < 65 OR A3 < 65

SELECT *
FROM test
WHERE (JSON_UNQUOTE(JSON_EXTRACT(service_values, "$.1")) < 65)
    + (JSON_UNQUOTE(JSON_EXTRACT(service_values, "$.2")) < 65)
    + (JSON_UNQUOTE(JSON_EXTRACT(service_values, "$.3")) < 65) >= 2;
https://dbfiddle.uk/?rdbms=mariadb_10.4&fiddle=ea41cb428cbe356c9f98a99f59ca37dc
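The trick in the query above is that each comparison evaluates to 1 or 0, so summing the comparisons counts how many values fall below the threshold. The same logic can be sketched in plain Python (sample rows taken from the question):

```python
import json

def count_below(service_values: str, threshold: int = 65) -> int:
    """Count how many values in the JSON object are below the threshold,
    mirroring the SQL trick of summing boolean comparisons."""
    return sum(int(v) < threshold for v in json.loads(service_values).values())

rows = [
    (100, '{"1": "60", "2": "64", "3": "92"}'),
    (101, '{"1": "66", "2": "64", "3": "92"}'),
    (103, '{"1": "60", "2": "54", "3": "92"}'),
]

# Keep rows where two or more values are below 65.
matches = [row_id for row_id, sv in rows if count_below(sv) >= 2]
print(matches)  # [100, 103]
```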

Related

Create a new JSON-formatted column from different columns

I'm stuck with pandas. I have two DataFrames like this:
index | seller | sales | is_active
:-----: |--------|---------|-----------
0 | smith | Yes | Yes
1 | john | No | Yes
2 | alan | Yes | No
and another one:
index | seller | product | EAN | URL | PRICE |
:-----: |--------|---------|-------------|----------------|:----------:|
0 | smith | book | ANUDH17e89 | www.ecvdgv.com | 13.45
1 | smith | dvd | NVGS5w621 | www.awfcj.com | 23.76
2 | smith | cd | NCYbh658 | www.bstx.com | 9.99
3 | john | sofa | codkv32876 | www..... | 348
4 | john | umbrella| chudbic132 | www..... | 38
5 | john | bag | coGTTf276 | www..... | 54
6 | alan | tv | BYU1890H | www..... | 239
7 | alan | cable | ndhnjh0988 | www..... | 5
8 | alan | fridge | BTFS$42561 | www..... | 158
And I would like to do a left join on the first df and gather the different pieces of information into a new column, formatted as JSON. Something like this:
index | seller | sales | is_active | New_column
:---: |--------|-------|-----------|-----------
0     | smith  | Yes   | Yes       | {product: book, EAN: ANUDH17e89, URL: www.ecvdgv.com, Price: 13.45}, {product: dvd, EAN: NVGS5w621, URL: www.awfcj.com, Price: 23.76}, etc.
and the same for each seller
Hope this is clear.
Thanks for your help!
Try:
import json

df2["New_column"] = df2.apply(lambda x: json.dumps(x.to_dict()), axis=1)
out = df1.merge(
    df2[["seller", "New_column"]]
    .groupby("seller")
    .agg(", ".join)
    .reset_index(),
    on="seller",
)
print(out)
Prints:
seller sales is_active New_column
0 smith Yes Yes {"seller": "smith", "product": "book", "EAN": "ANUDH17e89", "URL": "www.ecvdgv.com", "PRICE": 13.45}, {"seller": "smith", "product": "dvd", "EAN": "NVGS5w621", "URL": "www.awfcj.com", "PRICE": 23.76}, {"seller": "smith", "product": "cd", "EAN": "NCYbh658", "URL": "www.bstx.com", "PRICE": 9.99}
1 john No Yes {"seller": "john", "product": "sofa", "EAN": "codkv32876", "URL": "www.....", "PRICE": 348.0}, {"seller": "john", "product": "umbrella", "EAN": "chudbic132", "URL": "www.....", "PRICE": 38.0}, {"seller": "john", "product": "bag", "EAN": "coGTTf276", "URL": "www.....", "PRICE": 54.0}
2 alan Yes No {"seller": "alan", "product": "tv", "EAN": "BYU1890H", "URL": "www.....", "PRICE": 239.0}, {"seller": "alan", "product": "cable", "EAN": "ndhnjh0988", "URL": "www.....", "PRICE": 5.0}, {"seller": "alan", "product": "fridge", "EAN": "BTFS$42561", "URL": "www.....", "PRICE": 158.0}

Merging Variant rows in Snowflake

I have a table in Snowflake with a variant data type column, as shown below; you can see that a single ID has multiple variant objects.
+-----+--------------------------+
| ID | STATE_INFO |
|-----+--------------------------|
| IND | { |
| | "population": "1000k", |
| | "state": "KA" |
| | } |
| IND | { |
| | "population": "2000k", |
| | "state": "AP" |
| | } |
| IND | { |
| | "population": "3000K", |
| | "state": "TN" |
| | } |
| US | { |
| | "population": "100k", |
| | "state": "Texas" |
| | } |
| US | { |
| | "population": "200k", |
| | "state": "Florida" |
| | } |
| US | { |
| | "population": "300K", |
| | "state": "Iowa" |
| | } |
+-----+--------------------------+
I want to combine these variant objects into a single object, as below, by merging the rows into one array or dictionary object:
+-----+---------------------------+
| ID | STATE_INFO |
|-----+---------------------------|
| IND | [{ |
| | "population": "1000k", |
| | "state": "KA" |
| | }, |
| | { |
| | "population": "2000k", |
| | "state": "AP" |
| | }, |
| | { |
| | "population": "3000K", |
| | "state": "TN" |
| | }] |
| US | [{ |
| | "population": "100k", |
| | "state": "Texas" |
| | }, |
| | { |
| | "population": "200k", |
| | "state": "Florida" |
| | }, |
| | { |
| | "population": "300K", |
| | "state": "Iowa" |
| | }] |
+-----+---------------------------+
In SQL terms, it would be something like this statement:
Select id, merge(STATE_INFO) from table group by id;
As Mike said, the ARRAY_AGG function is what you need, and it works on a variant column:
select id, array_agg(STATE_INFO) within group (order by id) STATE_INFO
from table
group by 1
order by 1
Using this CTE for data:
With data(id, state_info) as (
    select column1, parse_json(column2)
    from values
        ('IND', '{ "population": "1000k", "state": "KA" }'),
        ('IND', '{ "population": "2000k", "state": "AP" }'),
        ('IND', '{ "population": "3000K", "state": "TN" }'),
        ('US',  '{ "population": "100k", "state": "Texas" }'),
        ('US',  '{ "population": "200k", "state": "Florida" }'),
        ('US',  '{ "population": "300K", "state": "Iowa" }')
)
This code is almost exactly the same as demircioglu's answer, but has no ordering of the array content.
select id, array_agg(state_info) as stateinfo
from data
group by 1;
Because of the order of the input it still appears ordered, but the order is really arbitrary; it depends on whether you need the data ordered or not:
ID  | STATEINFO
----+----------
US  | [ { "population": "100k", "state": "Texas" }, { "population": "200k", "state": "Florida" }, { "population": "300K", "state": "Iowa" } ]
IND | [ { "population": "1000k", "state": "KA" }, { "population": "2000k", "state": "AP" }, { "population": "3000K", "state": "TN" } ]
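The semantics of ARRAY_AGG ... GROUP BY id can be sketched in plain Python: collect each row's object into a list keyed by ID, preserving input order within each group (sample rows taken from the question):

```python
from collections import defaultdict

rows = [
    ("IND", {"population": "1000k", "state": "KA"}),
    ("US", {"population": "100k", "state": "Texas"}),
    ("IND", {"population": "2000k", "state": "AP"}),
]

# Emulates: select id, array_agg(state_info) from data group by id
merged = defaultdict(list)
for id_, state_info in rows:
    merged[id_].append(state_info)

print(dict(merged))
```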

How to generate n-level hierarchical JSON from Spark DataFrame

Given the following Spark dataframe:
+----+-----------+------+-------+
| id | parent_id | data | level |
+----+-----------+------+-------+
| 1 | null | x | 1 |
| 21 | 1 | y | 2 |
| 22 | 1 | w | 2 |
| 31 | 21 | z | 3 |
+----+-----------+------+-------+
Where 'level' means the level in a tree-like structure.
How can I generate a n-level hierarchical JSON?
{
  "id": "1",
  "data": "x",
  "items": [
    {
      "id": "21",
      "data": "y",
      "items": [
        {
          "id": "31",
          "data": "z",
          "items": []
        }
      ]
    },
    {
      "id": "22",
      "data": "w",
      "items": []
    }
  ]
}
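One way to assemble such a tree, sketched in plain Python after collecting the rows to the driver (e.g. via df.collect()): index the children by parent_id, then build the nested dicts recursively from the roots. Row and key names are taken from the question; the driver-side approach assumes the tree fits in memory.

```python
rows = [
    {"id": "1", "parent_id": None, "data": "x"},
    {"id": "21", "parent_id": "1", "data": "y"},
    {"id": "22", "parent_id": "1", "data": "w"},
    {"id": "31", "parent_id": "21", "data": "z"},
]

# Index children by parent_id for O(1) lookup per node.
children = {}
for r in rows:
    children.setdefault(r["parent_id"], []).append(r)

def build(node):
    """Recursively attach each node's children under its "items" key."""
    return {
        "id": node["id"],
        "data": node["data"],
        "items": [build(c) for c in children.get(node["id"], [])],
    }

# Roots are the rows whose parent_id is null.
tree = [build(root) for root in children.get(None, [])]
print(tree[0])
```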

Update mysql row based on same ID condition

I have a MySql table (locations) that looks like this:
locations
+-------+---------+--------+-------------------------------------------------------------+
| id | street | number | geoloc |
+-------+---------+--------+-------------------------------------------------------------+
| 10 | street1 | 1 | {"type": "Point", "coordinates": [-95.31231, 21.41241]} |
| 1000 | street2 | 2 | {"type": "Point", "coordinates": [9312.31231, 8231.41241]} |
| 1000 | street2 | 2 | {"type": "Point", "coordinates": [-95.45342, 21.44423]} |
| 10 | street1 | 1 | {"type": "Point", "coordinates": [312.31231, 33231.41241]} |
| 10 | street1 | 1 | {"type": "Point", "coordinates": [4312.31231, 3231.41241]} |
| 10000 | street3 | 3 | {"type": "Point", "coordinates": [-95.31271, 21.41312]} |
+-------+---------+--------+-------------------------------------------------------------+
Now the problem is that some of the locations have wrong geoloc values. The rule for separating good values from wrong ones is that some x,y coordinates are valid (e.g. -95.31231, 21.41241) and some aren't (e.g. 4312.31231, 3231.41241). Good values should match the format (-95.xxxxxx, 21.xxxxxx).
The end result after update should be exactly this:
locations
+-------+---------+--------+-------------------------------------------------------------+
| id | street | number | geoloc |
+-------+---------+--------+-------------------------------------------------------------+
| 10 | street1 | 1 | {"type": "Point", "coordinates": [-95.31231, 21.41241]} |
| 1000 | street2 | 2 | {"type": "Point", "coordinates": [-95.45342, 21.44423]} |
| 1000 | street2 | 2 | {"type": "Point", "coordinates": [-95.45342, 21.44423]} |
| 10 | street1 | 1 | {"type": "Point", "coordinates": [-95.31231, 21.41241]} |
| 10 | street1 | 1 | {"type": "Point", "coordinates": [-95.31231, 21.41241]} |
| 10000 | street3 | 3 | {"type": "Point", "coordinates": [-95.31271, 21.41312]} |
+-------+---------+--------+-------------------------------------------------------------+
What I'm trying to do is this:
UPDATE locations l1, (
    SELECT DISTINCT id, geoloc
    FROM locations
    WHERE geoloc IS NOT NULL
) l2
SET l1.geoloc = l2.geoloc
WHERE l1.id = l2.id;
And I'm not sure the WHERE actually produces my desired output.
UPDATE locations l1, (
    SELECT DISTINCT id, geoloc  <-- This will always return all rows
    FROM locations
    WHERE geoloc IS NOT NULL
) l2
SET l1.geoloc = l2.geoloc
WHERE l1.id = l2.id;
Also the update would be arbitrary.
You should try something like this (since geoloc is a JSON column, extract the x coordinate with a JSON path rather than a spatial function):
UPDATE locations l1, locations l2
SET l1.geoloc = l2.geoloc
WHERE l1.id = l2.id
  AND l1.geoloc <> l2.geoloc
  AND l2.geoloc->>'$.coordinates[0]' LIKE '-95.%';
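The "good value" test itself can be sketched in plain Python: parse the GeoJSON and check each coordinate against the question's (-95.xxxxxx, 21.xxxxxx) pattern. The regexes here are a hypothetical reading of that pattern, not taken from the answer.

```python
import json
import re

# Hypothetical validity check mirroring the pattern (-95.xxxxxx, 21.xxxxxx).
X_OK = re.compile(r"^-95\.\d+$")
Y_OK = re.compile(r"^21\.\d+$")

def is_valid(geoloc: str) -> bool:
    """Return True when both coordinates match the expected format."""
    x, y = json.loads(geoloc)["coordinates"]
    return bool(X_OK.match(str(x)) and Y_OK.match(str(y)))

good = is_valid('{"type": "Point", "coordinates": [-95.31231, 21.41241]}')
bad = is_valid('{"type": "Point", "coordinates": [4312.31231, 3231.41241]}')
print(good, bad)  # True False
```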

Extract key, value from columns field in Postgresql table

id | columns | timestamp | query_id | task_id
-------+----------------------------------------------+----------------------------+----------------------+---------------------------
1 | {"uid": "112", "name": "redis-server"} | 2018-07-18 18:45:39.045387 | 1 | 2
2 | {"uid": "0", "name": "celery"} | 2018-07-18 18:45:39.047671 | 1 | 2
3 | {"uid": "111", "name": "post"} | 2018-07-18 18:45:39.048218 | 1 | 2
4 | {"uid": "111", "name": "post"} | 2018-07-18 18:45:39.048732 | 1 | 2
Looking to extract the plain values for uid and name from the JSON through query syntax.
You can use the JSON operator ->>, i.e.:
select *, "columns"->>'uid' as uid, "columns"->>'name' as name
from myTable;
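The ->> operator returns the matched value as text; the equivalent extraction in plain Python is json.loads followed by key lookup (sample rows taken from the question):

```python
import json

rows = [
    (1, '{"uid": "112", "name": "redis-server"}'),
    (2, '{"uid": "0", "name": "celery"}'),
]

# Mirrors: select *, "columns"->>'uid' as uid, "columns"->>'name' as name
extracted = [
    {"id": row_id, "uid": json.loads(cols)["uid"], "name": json.loads(cols)["name"]}
    for row_id, cols in rows
]
print(extracted[0])  # {'id': 1, 'uid': '112', 'name': 'redis-server'}
```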