I'm stuck with pandas. I have two DataFrames like this:
index | seller | sales | is_active
:-----: |--------|---------|-----------
0 | smith | Yes | Yes
1 | john | No | Yes
2 | alan | Yes | No
and another one:
index | seller | product | EAN | URL | PRICE |
:-----: |--------|---------|-------------|----------------|:----------:|
0 | smith | book | ANUDH17e89 | www.ecvdgv.com | 13.45
1 | smith | dvd | NVGS5w621 | www.awfcj.com | 23.76
2 | smith | cd | NCYbh658 | www.bstx.com | 9.99
3 | john | sofa | codkv32876 | www..... | 348
4 | john | umbrella| chudbic132 | www..... | 38
5 | john | bag | coGTTf276 | www..... | 54
6 | alan | tv | BYU1890H | www..... | 239
7 | alan | cable | ndhnjh0988 | www..... | 5
8 | alan | fridge | BTFS$42561 | www..... | 158
And I would like to do a left join on the first df and put the different pieces of information into a new column as JSON. Something like this:
index | seller | sales | is_active | New_column
:-----: |--------|---------|-----------|-----------
0 | smith | Yes | Yes | {product: book, EAN: ANUDH17e89, URL: www.ecvdgv.com, Price: 13.45}, {product: dvd, EAN: NVGS5w621, URL: www.awfcj.com, Price: 23.76}, etc.
and the same for each seller.
Hope that's clear.
Thanks for your help!
Try:
import json

# Serialize each product row of df2 as a JSON string.
df2["New_column"] = df2.apply(lambda x: json.dumps(x.to_dict()), axis=1)

# Concatenate the JSON strings per seller, then left-join onto df1
# (the question asks for a left join, hence how="left").
out = df1.merge(
    df2[["seller", "New_column"]]
    .groupby("seller")
    .agg(", ".join)
    .reset_index(),
    on="seller",
    how="left",
)
print(out)
Prints:
seller sales is_active New_column
0 smith Yes Yes {"seller": "smith", "product": "book", "EAN": "ANUDH17e89", "URL": "www.ecvdgv.com", "PRICE": 13.45}, {"seller": "smith", "product": "dvd", "EAN": "NVGS5w621", "URL": "www.awfcj.com", "PRICE": 23.76}, {"seller": "smith", "product": "cd", "EAN": "NCYbh658", "URL": "www.bstx.com", "PRICE": 9.99}
1 john No Yes {"seller": "john", "product": "sofa", "EAN": "codkv32876", "URL": "www.....", "PRICE": 348.0}, {"seller": "john", "product": "umbrella", "EAN": "chudbic132", "URL": "www.....", "PRICE": 38.0}, {"seller": "john", "product": "bag", "EAN": "coGTTf276", "URL": "www.....", "PRICE": 54.0}
2 alan Yes No {"seller": "alan", "product": "tv", "EAN": "BYU1890H", "URL": "www.....", "PRICE": 239.0}, {"seller": "alan", "product": "cable", "EAN": "ndhnjh0988", "URL": "www.....", "PRICE": 5.0}, {"seller": "alan", "product": "fridge", "EAN": "BTFS$42561", "URL": "www.....", "PRICE": 158.0}
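If you would rather have New_column hold an actual list of dicts (one per product) instead of a comma-joined string of JSON objects, here is a minimal sketch along the same lines, assuming df1 and df2 are the frames shown above:
# Assumption: df1 and df2 are the DataFrames from the question.
# Build one list of product records per seller, then left-join it onto df1.
records = (
    df2.groupby("seller")
       .apply(lambda g: g.drop(columns="seller").to_dict("records"))
       .rename("New_column")
       .reset_index()
)
out = df1.merge(records, on="seller", how="left")
print(out)
Each cell of New_column is then a Python list of dicts; call json.dumps on it later if you need a JSON string.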
I have this kind of table and data:
Table Name : data
+------+-----------------+--------+----------+
| id | number | name | surname |
+------+-----------------+--------+----------+
| 1 | [1, 2, 3, 4, 5] | John | Doe |
| 2 | [1, 2, 4, 8] | James | Webb |
| 3 | [3, 4, 5] | Jenny | Test |
+------+-----------------+--------+----------+
For example, I want to fetch the rows where the number column contains the value 3:
+------+-----------------+--------+----------+
| id | number | name | surname |
+------+-----------------+--------+----------+
| 1 | [1, 2, 3, 4, 5] | John | Doe |
| 3 | [3, 4, 5] | Jenny | Test |
+------+-----------------+--------+----------+
I tried this with Laravel, but it didn't work:
DB::table('data')
->whereRaw('FIND_IN_SET(?, number)', [3])
->get();
How can I solve that problem? Thanks for your answers.
You can use toJson() to convert the collection to a JSON object in Laravel.
$_data = DB::table('data')
->whereRaw('FIND_IN_SET(?, number)', [3])
->get()->toJson();
dd($_data);
Have a look at the [documentation](https://laravel.com/docs/9.x/responses#json-responses).
Thanks for your answers. I found the answer to my question myself, as follows:
I stored the array values as strings, like this:
Table Name : data
+------+---------------------------+--------+----------+
| id | number | name | surname |
+------+---------------------------+--------+----------+
| 1 | ["1", "2", "3", "4", "5"] | John | Doe |
| 2 | ["1", "2", "4", "8"] | James | Webb |
| 3 | ["3", "4", "5"] | Jenny | Test |
+------+---------------------------+--------+----------+
And then I used this code:
$_data = DB::table('data')
->where('number', 'LIKE', '%"' . "3" . '"%')
->get();
You can also do it with plain SQL, like this:
SELECT *
FROM data
WHERE number LIKE '%"3"%';
It returned exactly what I wanted:
+------+---------------------------+--------+----------+
| id | number | name | surname |
+------+---------------------------+--------+----------+
| 1 | ["1", "2", "3", "4", "5"] | John | Doe |
| 3 | ["3", "4", "5"] | Jenny | Test |
+------+---------------------------+--------+----------+
I have a table in Snowflake with a VARIANT data type as shown below; you can see that a single ID has multiple variant objects.
+-----+--------------------------+
| ID | STATE_INFO |
|-----+--------------------------|
| IND | { |
| | "population": "1000k", |
| | "state": "KA" |
| | } |
| IND | { |
| | "population": "2000k", |
| | "state": "AP" |
| | } |
| IND | { |
| | "population": "3000K", |
| | "state": "TN" |
| | } |
| US | { |
| | "population": "100k", |
| | "state": "Texas" |
| | } |
| US | { |
| | "population": "200k", |
| | "state": "Florida" |
| | } |
| US | { |
| | "population": "300K", |
| | "state": "Iowa" |
| | } |
+-----+--------------------------+
I want to combine these variant objects into a single object, as below, by merging the rows into one array or dictionary object:
+-----+---------------------------+
| ID | STATE_INFO |
|-----+---------------------------|
| IND | [{ |
| | "population": "1000k", |
| | "state": "KA" |
| | }, |
| | { |
| | "population": "2000k", |
| | "state": "AP" |
| | }, |
| | { |
| | "population": "3000K", |
| | "state": "TN" |
| | }] |
| US | [{ |
| | "population": "100k", |
| | "state": "Texas" |
| | }, |
| | { |
| | "population": "200k", |
| | "state": "Florida" |
| | }, |
| | { |
| | "population": "300K", |
| | "state": "Iowa" |
| | }] |
+-----+---------------------------+
In SQL terms, it would be something like the statement below (with merge as a hypothetical aggregate):
Select id,merge(STATE_INFO) from table group by id;
As Mike said, the ARRAY_AGG function is what you need, and it works on a VARIANT column:
select id, array_agg(STATE_INFO) within group (order by id) STATE_INFO
from table
group by 1
order by 1
Using this CTE for data:
With data(id, state_info) as (
select column1, parse_json(column2)
from values
('IND', '{ "population": "1000k", "state": "KA" }'),
('IND', '{ "population": "2000k", "state": "AP" }'),
('IND', '{ "population": "3000K", "state": "TN" }'),
('US', '{ "population": "100k", "state": "Texas" }'),
('US', '{ "population": "200k", "state": "Florida" }'),
('US', '{ "population": "300K", "state": "Iowa" }')
)
This code is almost exactly the same as demircioglu's answer, but has no ordering of the array content.
select id, array_agg(state_info) as stateinfo
from data
group by 1;
Because of the order of the input it still appears ordered, but that is not guaranteed; it depends on whether you need the data ordered or not:
ID  | STATEINFO
----|----------
US  | [ { "population": "100k", "state": "Texas" }, { "population": "200k", "state": "Florida" }, { "population": "300K", "state": "Iowa" } ]
IND | [ { "population": "1000k", "state": "KA" }, { "population": "2000k", "state": "AP" }, { "population": "3000K", "state": "TN" } ]
Given the following Spark dataframe:
+----+-----------+------+-------+
| id | parent_id | data | level |
+----+-----------+------+-------+
| 1 | null | x | 1 |
| 21 | 1 | y | 2 |
| 22 | 1 | w | 2 |
| 31 | 21 | z | 3 |
+----+-----------+------+-------+
Where 'level' means the level in a tree-like structure.
How can I generate an n-level hierarchical JSON?
{
"id": "1",
"data": "x",
"items": [
{
"id": "21",
"data": "y",
"items": [
{
"id": "31",
"data": "z",
"items": []
}
]
},
{
"id": "22",
"data": "w",
"items": []
}
]
}
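A minimal sketch of one way to build this, assuming the whole tree fits in driver memory (df stands for the DataFrame shown above): collect the rows, group the children by parent_id, and assemble the nested structure recursively in Python.
import json

# Assumption: df is the Spark DataFrame shown above and is small enough to collect.
rows = df.collect()

# Group children by parent_id so every node can look up its descendants.
children = {}
for row in rows:
    children.setdefault(row["parent_id"], []).append(row)

def build(node):
    # Recursively nest each child's subtree under "items".
    return {
        "id": str(node["id"]),
        "data": node["data"],
        "items": [build(child) for child in children.get(node["id"], [])],
    }

# Roots are the rows without a parent.
trees = [build(row) for row in rows if row["parent_id"] is None]
print(json.dumps(trees[0], indent=2))
For trees too large to collect, an iterative level-by-level join inside Spark would be needed instead.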
I have the following table:
+-----+-----------+--------------------------------------+
| id | client_id | service_values |
+-----+-----------+--------------------------------------+
| 100 | 1000 | {"1": "60", "2": "64", "3": "92"} |
| 101 | 1000 | {"1": "66", "2": "64", "3": "92"} |
| 102 | 1000 | {"1": "70", "2": "64", "3": "92"} |
| 103 | 1001 | {"1": "60", "2": "54", "3": "92"} |
| 104 | 1001 | {"1": "90", "2": "64", "3": "92"} |
| 105 | 1002 | {"1": "80", "2": "64", "3": "92"} |
+-----+-----------+--------------------------------------+
I need to fetch the rows where 2 or more of the service_values are LESS THAN 65. For example, from the above records, the expected result would be:
+-----+-----------+--------------------------------------+
| id | client_id | service_values |
+-----+-----------+--------------------------------------+
| 100 | 1000 | {"1": "60", "2": "64", "3": "92"} |
| 103 | 1001 | {"1": "60", "2": "54", "3": "92"} |
+-----+-----------+--------------------------------------+
I searched a lot but couldn't find anything about performing an aggregate function on JSON data.
Update: I tried something like this but couldn't get it to work:
SELECT JSON_UNQUOTE(JSON_EXTRACT(service_values, '$.1')) AS A1,
       JSON_UNQUOTE(JSON_EXTRACT(service_values, '$.2')) AS A2,
       JSON_UNQUOTE(JSON_EXTRACT(service_values, '$.3')) AS A3
FROM service_data
HAVING A1 < 65 OR A2 < 65 OR A3 < 65
-- Each comparison evaluates to 1 (true) or 0 (false), so the sum
-- counts how many of the three values are below 65.
SELECT *
FROM test
WHERE (JSON_UNQUOTE(JSON_EXTRACT(service_values, "$.1")) < 65)
    + (JSON_UNQUOTE(JSON_EXTRACT(service_values, "$.2")) < 65)
    + (JSON_UNQUOTE(JSON_EXTRACT(service_values, "$.3")) < 65) >= 2;
https://dbfiddle.uk/?rdbms=mariadb_10.4&fiddle=ea41cb428cbe356c9f98a99f59ca37dc
id | columns | timestamp | query_id | task_id
-------+----------------------------------------------+----------------------------+----------------------+---------------------------
1 | {"uid": "112", "name": "redis-server"} | 2018-07-18 18:45:39.045387 | 1 | 2
2 | {"uid": "0", "name": "celery"} | 2018-07-18 18:45:39.047671 | 1 | 2
3 | {"uid": "111", "name": "post"} | 2018-07-18 18:45:39.048218 | 1 | 2
4 | {"uid": "111", "name": "post"} | 2018-07-18 18:45:39.048732 | 1 | 2
Looking to extract the plain values of uid and name from the JSON column through query syntax.
You can use the JSON operator ->>, which returns the value as text. For example:
select *, "columns"->>'uid' as uid, "columns"->>'name' as name
from myTable;