How can I insert the following JSON into a MongoDB collection as separate documents?

I need to insert data into MongoDB, but the JSON I receive has multiple values in every field, and I don't know how to split them so that each set of values goes into a separate document.
I want to insert the array data as separate objects in MongoDB:
{
"activity_template_id": [
1,
2,
3,
4,
5,
7
],
"done_date": [
"2019-08-10",
"2019-08-10",
"2019-08-10",
"0000-01-01",
"0000-01-01",
"0000-01-01"
],
"is_prescribed": [
"N",
"N",
"N",
"N",
"N",
"Y"
],
"material_id": [
1,
5,
21,
10,
14,
0
],
"qty": [
"1",
"1",
"1",
"0",
"0",
"0"
],
"unit_id": [
1,
1,
25,
0,
0,
0
]
}

As far as I know, MongoDB itself has no feature that processes input data like that. You would reshape the data in application code before calling MongoDB.
If there is no separate application, you can use standard JavaScript functions within the mongo shell to do the reshaping.
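The reshaping step can be sketched as follows. This is a minimal illustration in Python rather than the mongo shell JavaScript mentioned above; the helper name split_columns is hypothetical, and it assumes every field holds the same number of values.

```python
def split_columns(columns):
    """Turn a dict of equal-length lists into one dict per index position."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("every field must hold the same number of values")
    # zip(*...) walks all lists in parallel: one tuple per future document.
    return [dict(zip(columns, row)) for row in zip(*columns.values())]


payload = {
    "activity_template_id": [1, 2, 3, 4, 5, 7],
    "done_date": ["2019-08-10", "2019-08-10", "2019-08-10",
                  "0000-01-01", "0000-01-01", "0000-01-01"],
    "is_prescribed": ["N", "N", "N", "N", "N", "Y"],
    "material_id": [1, 5, 21, 10, 14, 0],
    "qty": ["1", "1", "1", "0", "0", "0"],
    "unit_id": [1, 1, 25, 0, 0, 0],
}

documents = split_columns(payload)
print(documents[0])
```

The resulting list could then be handed to the driver in one call, for example pymongo's collection.insert_many(documents), assuming pymongo is the driver in use.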

Related

pandas json_normalize columns created as dtype object

I have a json object served from an api as follows:
{
"workouts": [
{
"id": 92527291,
"starts": "2021-06-28T15:42:44.000Z",
"minutes": 30,
"name": "Indoor Cycling",
"created_at": "2021-06-28T16:12:57.000Z",
"updated_at": "2021-06-28T16:12:57.000Z",
"plan_id": null,
"workout_token": "ELEMNT BOLT A1B3:59",
"workout_type_id": 12,
"workout_summary": {
"id": 87540207,
"heart_rate_avg": "152.0",
"calories_accum": "332.0",
"created_at": "2021-06-28T16:12:58.000Z",
"updated_at": "2021-06-28T16:12:58.000Z",
"power_avg": "185.0",
"distance_accum": "17520.21",
"cadence_avg": "87.0",
"ascent_accum": "0.0",
"duration_active_accum": "1801.0",
"duration_paused_accum": "0.0",
"duration_total_accum": "1801.0",
"power_bike_np_last": "186.0",
"power_bike_tss_last": "27.6",
"speed_avg": "9.73",
"work_accum": "332109.0",
"file": {
"url": "https://cdn.wahooligan.com/wahoo-cloud/production/uploads/workout_file/file/FPoJBPZo17BvTmSomq5Y_Q/2021-06-28-154244-ELEMNT_BOLT_A1B3-59-0.fit"
}
}
}
],
"total": 55,
"page": 1,
"per_page": 1,
"order": "descending",
"sort": "starts"
}
I want to get the data into a dataframe. However, lots of the columns seem to have a dtype of object. I assume that this is because some of the numeric values in the json are double quoted. What is the best and most efficient way to avoid this (the json potentially has many workouts elements)?
Is it to fix the returned json? Or to iterate through the dataframe columns and convert the objects to floats?
Thank you
Martyn
IIUC, you can try:
import pandas as pd
df = pd.json_normalize(json_data, record_path='workouts',
                       meta=['total', 'page', 'per_page', 'order', 'sort']).convert_dtypes()
Try using pandas.to_numeric. Here are the docs.
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_numeric.html
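The other option raised in the question, fixing the returned JSON before building the dataframe, might look like the sketch below. The helper coerce_numeric_strings is a hypothetical name, and the rule "any string that looks like a plain decimal number becomes a float" is an assumption that may be too eager for fields such as zero-padded codes.

```python
import re

# Matches plain decimal numbers such as "152.0" or "-3"; nothing else.
_NUMERIC = re.compile(r"-?\d+(\.\d+)?")


def coerce_numeric_strings(value):
    """Recursively replace numeric-looking strings with floats."""
    if isinstance(value, dict):
        return {key: coerce_numeric_strings(item) for key, item in value.items()}
    if isinstance(value, list):
        return [coerce_numeric_strings(item) for item in value]
    if isinstance(value, str) and _NUMERIC.fullmatch(value):
        return float(value)
    return value  # numbers, None, dates and other text are left untouched


json_data = {
    "workouts": [
        {
            "id": 92527291,
            "starts": "2021-06-28T15:42:44.000Z",
            "plan_id": None,
            "workout_token": "ELEMNT BOLT A1B3:59",
            "workout_summary": {
                "heart_rate_avg": "152.0",
                "speed_avg": "9.73",
            },
        }
    ],
    "total": 55,
}

cleaned = coerce_numeric_strings(json_data)
print(cleaned["workouts"][0]["workout_summary"])
```

The cleaned structure can then be passed to pd.json_normalize in place of the raw response, so the affected columns start out as floats instead of object.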

Query All Elements in Nested JSON Array PostgreSQL

I am trying to create a query in SQL to retrieve DNS answer information so that I can visualize it in Grafana with the help of TimescaleDB. Right now, I am struggling to get Postgres to query more than one element at a time. The structure of the JSON that I am trying to query looks like this:
{
"Z": 0,
"AA": 0,
"ID": 56559,
"QR": 1,
"RA": 1,
"RD": 1,
"TC": 0,
"RCode": 0,
"OpCode": 0,
"answer": [
{
"ttl": 19046,
"name": "i.stack.imgur.com",
"type": 5,
"class": 1,
"rdata": "i.stack.imgur.com.cdn.cloudflare.net"
},
{
"ttl": 220,
"name": "i.stack.imgur.com.cdn.cloudflare.net",
"type": 1,
"class": 1,
"rdata": "104.16.30.34"
},
{
"ttl": 220,
"name": "i.stack.imgur.com.cdn.cloudflare.net",
"type": 1,
"class": 1,
"rdata": "104.16.31.34"
},
{
"ttl": 220,
"name": "i.stack.imgur.com.cdn.cloudflare.net",
"type": 1,
"class": 1,
"rdata": "104.16.0.35"
}
],
"ANCount": 13,
"ARCount": 0,
"QDCount": 1,
"question": [
{
"name": "i.stack.imgur.com",
"qtype": 1,
"qclass": 1
}
]
}
There can be any number of answers, including zero, so I would like to figure out a way to query all of them. For example, I am trying to retrieve the ttl field from every element of the answer array. I can query a specific index, but I have trouble querying all occurrences.
This works for querying a single index:
SELECT (data->'answer'->>0)::json->'ttl'
FROM dns;
When I looked around, I found this as a potential solution for querying all indices within the array, but it did not seem to work and told me "cannot extract elements from a scalar":
SELECT answer->>'ttl' ttl
FROM dns, jsonb_array_elements(data->'answer') answer, jsonb_array_elements(answer->'ttl') ttl
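The error occurs because answer->'ttl' is a scalar (a number), not an array, so the second jsonb_array_elements() call cannot expand it. A single jsonb_array_elements() call is enough: it gives you a row for every object in the answer array. You can then dereference that object: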
select a.obj->>'ttl' as ttl, a.obj->>'name' as name, a.obj->>'rdata' as rdata
from dns d
cross join lateral jsonb_array_elements(data->'answer') as a(obj)

How to convert raw JSON with nested array field in Postman body into form-data?

Yesterday I asked this question but got no response, maybe because it was too specific to Django REST Framework. I feel like it's simply a problem with the key-value pairs in the form-data I use to post, so I'm re-asking the question with simplified content.
What is the form-data format's equivalent for this raw JSON:
"markets": [
{
"market": 1,
"name": "White Stone",
"slabs": [
1,
2
],
"thicknesses": [
1,
2,
3
],
"finish_types": [
1
]
},
{
"market": 2,
"name": "White Marble",
"slabs": [
1
],
"thicknesses": [
1
],
"finish_types": [
1,
3,
6
]
}
]
I want to create a new Product instance with a markets field. markets is an array whose entries have their own attributes, some of which are also arrays. I can't send more than one value each for slabs, thicknesses, and finish_types within a single markets entry. slabs, thicknesses, and finish_types are foreign keys.
When I entered the key-value pairs as in the image above, only the last value entered for each field was saved.
Here's the created markets:
"markets": [
{
"id": 65,
"market": 1,
"name": "White Stone",
"slabs": [
2
],
"thicknesses": [
3
],
"finish_types": [
1
]
}
]
And when I tried another key format like this, no slabs or thicknesses were saved:
"markets": [
{
"id": 66,
"market": 1,
"name": "White Stone",
"slabs": [],
"thicknesses": [],
"finish_types": [
1
]
}
]
According to this answer, you could try this format:
Key                            Value
markets[0][market]             1
markets[0][name]               white stone
markets[0][slabs][]            2
markets[0][thicknesses][]      3
markets[0][finish_types][]     1
And maybe this Django thread might help you.
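To avoid typing the keys by hand, the bracketed key format suggested above can be generated from the raw JSON. The sketch below is an illustration only: to_form_pairs is a hypothetical helper, and whether the backend (Django REST Framework in the question) actually parses this key format is an assumption that the thread does not confirm.

```python
def to_form_pairs(value, prefix):
    """Flatten nested dicts/lists into (key, value) pairs with bracketed keys."""
    pairs = []
    if isinstance(value, dict):
        for key, item in value.items():
            pairs.extend(to_form_pairs(item, f"{prefix}[{key}]"))
    elif isinstance(value, list):
        for index, item in enumerate(value):
            if isinstance(item, (dict, list)):
                # Nested containers get an explicit index: markets[0], markets[1], ...
                pairs.extend(to_form_pairs(item, f"{prefix}[{index}]"))
            else:
                # Scalars in a list repeat the same "[]" key once per value.
                pairs.append((f"{prefix}[]", str(item)))
    else:
        pairs.append((prefix, str(value)))
    return pairs


markets = [
    {
        "market": 1,
        "name": "White Stone",
        "slabs": [1, 2],
        "thicknesses": [1, 2, 3],
        "finish_types": [1],
    }
]

pairs = to_form_pairs(markets, "markets")
for key, value in pairs:
    print(key, value)
```

Each printed line corresponds to one row in the form-data table; a list with several values, such as slabs, yields the same key several times.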

Do I have to reorganize the data to animate it on time, using Unity and C#?

My JSON file looks something like this (it's huge, so this is simplified):
{
"foo": {
"id": [
20,
1,
3,
4,
60,
1
],
"times": [
330.89,
5.33,
353.89,
33.89,
14.5,
207.5
]
},
"poo": {
"id": [
20,
1,
3,
4,
60,
1
],
"times": [
3.5,
323.89,
97.7,
154.5,
27.5,
265.60
]
}
}
I have a JSON file similar to the one above, but much more complex. What I want to do is use the "times" and "id" data to perform an action for the right "id" at the exact time. The id and times arrays are mapped to each other (matching entries share the same index). Is there a way to pick out the right id for the right time and perform the action without too many complicated loops?
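One possible approach is to pair each time with the id at the same index, sort the pairs by time once, and then advance a cursor as time passes. The sketch below shows the idea in Python rather than C#; build_schedule and pop_due are hypothetical names, and in Unity the same logic would presumably run once per frame against the elapsed time.

```python
def build_schedule(track):
    """Pair each time with the id at the same index, sorted by time."""
    return sorted(zip(track["times"], track["id"]))


def pop_due(schedule, cursor, now):
    """Return the ids whose time has been reached, plus the advanced cursor."""
    due = []
    while cursor < len(schedule) and schedule[cursor][0] <= now:
        due.append(schedule[cursor][1])
        cursor += 1
    return due, cursor


foo = {
    "id": [20, 1, 3, 4, 60, 1],
    "times": [330.89, 5.33, 353.89, 33.89, 14.5, 207.5],
}

schedule = build_schedule(foo)
cursor = 0
# Simulate a clock reading of 20.0: the events at 5.33 and 14.5 are due.
due, cursor = pop_due(schedule, cursor, now=20.0)
print(due, cursor)
```

Because the schedule is sorted once up front, each check only looks at the next pending entry instead of looping over the whole array.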

Convert to dataframe from JSON in R

I am having trouble converting JSON to a data frame. I tried the libraries jsonlite, RJSONIO, and rjson.
I keep getting errors such as 'invalid character in the string' or 'unclosed string'.
I am getting this data from a standard API, so I should be able to parse this JSON. Also, JSON editors can parse this data just fine.
My question is:
Is there a standard way to make sure that my data frame gets created despite the errors above?
My best guess was to convert this data to JSON format using toJSON function from either of the libraries but if I use
newdata <- fromJSON(toJSON(data))
it somehow never gets converted to dataframe. Why is that?
If I instead use
newdata <- fromJSON(data)
I get a valid data frame, but sometimes it fails with the errors above, which is what I am trying to understand. How do you deal with this?
I have also tried this:
freshDeskTicketsToDF <- jsonlite::fromJSON(paste(readLines(textConnection(freshDeskTickets)), collapse=""))
It seemed to solve the problem, but at some point this method gave me an 'unclosed string' error that I otherwise did not get.
Are there better ways to deal with this in R?
Also, why is it that using toJSON on data passed to fromJSON never gets converted to a dataframe?
If I decide to strip the HTML tags from the values assigned to keys in the JSON data, how would that work? Can I do that?
Edit: It looks like I get this error when I have <html tags> in my "string data", but I have them all across my JSON data and I don't get the error every time.
How do I deal with problems like this?
Note: this issue is not specific to the data that I have. I am looking for ways to deal with problems like these, not one specific solution to a single problem.
I just realized that toJSON converts R objects to JSON, not invalid JSON to valid JSON. Is there a way to do that instead?
Sample data:
[
{
"cc_emails": [
],
"fwd_emails": [
],
"reply_cc_emails": [
],
"fr_escalated": false,
"spam": false,
"email_config_id": 1000062780,
"group_id": 1000179078,
"priority": 1,
"requester_id": 1022205968,
"responder_id": 1018353725,
"source": 1,
"company_id": null,
"status": 5,
"subject": "Order number-100403891",
"to_emails": [
"contact#stalkbuylove.com"
],
"product_id": null,
"id": 174093,
"type": "Order Status query",
"due_by": "2016-09-02T08:57:30Z",
"fr_due_by": "2016-09-02T02:57:30Z",
"is_escalated": true,
"description": "<div dir=\"ltr\">Hi Team,<div><br></div>\n<div>I have ordered an item from your website, order number-100403891. I had called on August 30 2016 to postpone the delivery date. The guy i spoke from your end had confirmed that he will hold and push the delivery date to September 5 or 6 or 7 2016. And he confirmed the same.</div>\n<div>However, the guy I spoke to<b> did not do it</b>. </div>\n<div>I got to know it from ABHINAV from your customer care team who I spoke to on August 1st at 13:10. Hence I have put a request again and he said he will talk to some guys and give me the desired dates for delivery which is 5,6,7 of September 2016. </div>\n<div>Please let me know the concern on this and hope for a quick turn around.</div>\n<div><br></div>\n<div>Thank you,</div>\n<div>Hari,</div>\n<div>+91-9538199699.</div>\n</div>\n",
"description_text": "Hi Team,\r\n\r\nI have ordered an item from your website, order number-100403891. I had\r\ncalled on August 30 2016 to postpone the delivery date. The guy i spoke\r\nfrom your end had confirmed that he will hold and push the delivery date to\r\nSeptember 5 or 6 or 7 2016. And he confirmed the same.\r\nHowever, the guy I spoke to* did not do it*.\r\nI got to know it from ABHINAV from your customer care team who I spoke to\r\non August 1st at 13:10. Hence I have put a request again and he said he\r\nwill talk to some guys and give me the desired dates for delivery which is\r\n5,6,7 of September 2016.\r\nPlease let me know the concern on this and hope for a quick turn around.\r\n\r\nThank you,\r\nHari,\r\n+91-9538199699.\n",
"custom_fields": {
},
"created_at": "2016-09-01T07:51:18Z",
"updated_at": "2016-09-11T11:00:33Z"
},
{
"cc_emails": [
],
"fwd_emails": [
],
"reply_cc_emails": [
],
"fr_escalated": false,
"spam": false,
"email_config_id": 1000062780,
"group_id": 1000179078,
"priority": 1,
"requester_id": 1022148025,
"responder_id": 1021145209,
"source": 1,
"company_id": null,
"status": 5,
"subject": "Defect in d piece",
"to_emails": [
"contact#stalkbuylove.com"
],
"product_id": null,
"id": 174092,
"type": "Return",
"due_by": "2016-09-01T15:51:00Z",
"fr_due_by": "2016-09-01T09:51:00Z",
"is_escalated": false,
"description": "<div><br></div>\n<div><br></div>\n<div><br></div>\n<div><div style=\"font-size:75%;color:#575757\">Sent from Samsung Mobile</div></div>",
"description_text": "\n\n\nSent from Samsung Mobile",
"custom_fields": {
},
"created_at": "2016-09-01T07:51:00Z",
"updated_at": "2016-09-06T09:00:14Z"
},
{
"cc_emails": [
],
"fwd_emails": [
],
"reply_cc_emails": [
],
"fr_escalated": false,
"spam": false,
"email_config_id": 1000062780,
"group_id": 1000179078,
"priority": 1,
"requester_id": 1022205895,
"responder_id": 1018353725,
"source": 1,
"company_id": null,
"status": 5,
"subject": "Re: StalkBuyLove Return Request for order: 100404435",
"to_emails": [
"StalkBuyLove <contact#stalkbuylove.com>"
],
"product_id": null,
"id": 174088,
"type": "Refund query",
"due_by": "2016-09-01T15:43:56Z",
"fr_due_by": "2016-09-01T09:43:56Z",
"is_escalated": true,
"description": "<div>Hi. Can u deposit the amount if i giv u my account number. Right away i cant choose any other product frim ur site. <br><br>Sent from my iPhone</div>\n<div>\n<br>On Sep 1, 2016, at 12:38 PM, StalkBuyLove <contact#stalkbuylove.com> wrote:<br><br>\n</div>\n<blockquote><div>\n<div><img title=\"StalkBuyLove\" alt=\"Stalkbuylove\" src=\"http://www.stalkbuylove.com/launcher_icons/Newlogo_Stalkbuylove_240x50.png\"></div>\n<div>Hello <b>Anamica Aggarwal</b>,</div>\n<div>We have initiated a return request for order: <b>100404435</b> with the following products:</div>\n<table style=\"width:80%\">\r\n <tbody>\n<tr style=\"background-color:#B0C4DE\">\r\n <th>Item Name</th>\r\n <th>Sku</th>\r\n </tr>\n<tr>\r\n <td style=\"text-align:center\">Articuno Top</td>\r\n <td style=\"text-align:center\">IN1627MTOTOPPCH-198-18</td>\r\n </tr>\n</tbody>\n</table>\n<div>Lots of love,</div>\n<div>Team SBL</div>\n<img src=\"http://mandrillapp.com/track/open.php?u=30069003&id=bff0a5daee4a47fe9c6b04d2680c3c39\" height=\"1\" width=\"1\">\r\n</div></blockquote>",
"description_text": "Hi. Can u deposit the amount if i giv u my account number. Right away i cant choose any other product frim ur site. \n\nSent from my iPhone\n\n> On Sep 1, 2016, at 12:38 PM, StalkBuyLove <contact#stalkbuylove.com> wrote:\n> \n> \n> Hello Anamica Aggarwal,\n> \n> We have initiated a return request for order: 100404435 with the following products:\n> \n> Item Name\tSku\n> Articuno Top\tIN1627MTOTOPPCH-198-18\n> Lots of love,\n> \n> Team SBL\n> \n",
"custom_fields": {
},
"created_at": "2016-09-01T07:43:56Z",
"updated_at": "2016-09-11T11:00:32Z"
},
{
"cc_emails": [
],
"fwd_emails": [
],
"reply_cc_emails": [
],
"fr_escalated": false,
"spam": false,
"email_config_id": 1000062780,
"group_id": 1000179078,
"priority": 1,
"requester_id": 1022205881,
"responder_id": 1021145209,
"source": 1,
"company_id": null,
"status": 5,
"subject": "Details for order",
"to_emails": [
"contact#stalkbuylove.com"
],
"product_id": null,
"id": 174086,
"type": "Order Status query",
"due_by": "2016-09-01T15:42:50Z",
"fr_due_by": "2016-09-01T09:42:50Z",
"is_escalated": false,
"description": "<div><span></span></div>\n<div>\n<span>Hey can i get details of my order </span><br><span>How much more time will it take to get delivered? </span><br><span>Order no-</span><h2 style=\"font-weight: normal; margin: 0px;\"><font><span style=\"background-color: rgba(255, 255, 255, 0);\">100403837</span></font></h2>\n<span></span><br><span>Sent from my iPhone</span><br>\n</div>",
"description_text": "Hey can i get details of my order \r\nHow much more time will it take to get delivered? \r\nOrder no-\r\n100403837\r\n\r\nSent from my iPhone\n",
"custom_fields": {
},
"created_at": "2016-09-01T07:42:50Z",
"updated_at": "2016-09-06T09:00:13Z"
},
{
"cc_emails": [
],
"fwd_emails": [
],
"reply_cc_emails": [
],
"fr_escalated": true,
"spam": false,
"email_config_id": 1000062780,
"group_id": 1000179078,
"priority": 1,
"requester_id": 1022204690,
"responder_id": 1021145209,
"source": 1,
"company_id": null,
"status": 5,
"subject": "Refund",
"to_emails": [
"contact#stalkbuylove.com"
],
"product_id": null,
"id": 174080,
"type": "Refund query",
"due_by": "2016-09-01T15:36:26Z",
"fr_due_by": "2016-09-01T09:36:26Z",
"is_escalated": true,
"description": "<div>\r<br>Bank statement as asked for refund! Please intiate the proccedings asap!<br>\n</div>",
"description_text": "\r\nBank statement as asked for refund! Please intiate the proccedings asap!\n",
"custom_fields": {
},
"created_at": "2016-09-01T07:36:26Z",
"updated_at": "2016-09-07T08:00:19Z"
}
]
library(jsonlite)
df <- stream_in(file("~/data/sample.json"))
The stream_in function converts the input directly into a data frame. Note that it reads newline-delimited JSON (one record per line).