Fill JSON with missing data - json

I have a JSON document in which, in my example, the values for the years 1996 to 2012 were left out. Is there any way to insert the missing years? If so, how? In the end the JSON should be completely filled in (1988 to 2021), with the value 0 inserted for every year that was missing.
Kind regards, and thank you.
{
"data": {
"ArrayValue": [
0.0030350000000000004,
0.003661,
0.0080348532,
0.0053554275,
0.004284,
0.008569710000000001,
0.008569710000000001,
0.007498282499999999,
0.189286,
0.42142999999999997,
0.461429,
0.5075000000000001,
0.5575,
0.615,
0.705,
0.76,
0.8075,
0.865,
0.89
],
"ArrayYears": [
"1988",
"1989",
"1990",
"1991",
"1992",
"1993",
"1994",
"1995",
"2012",
"2013",
"2014",
"2015",
"2016",
"2017",
"2018",
"2019",
"2020",
"2021",
"TTM"
]
}
}

2012 is already present in the data, so only 1996 to 2011 need to be filled in:
'use strict';
const fs = require('fs');

// Read and parse the input document.
const rawdata = fs.readFileSync('input.json');
const obj = JSON.parse(rawdata);

const len = 16;      // number of missing years (1996 to 2011)
const insertPos = 8; // index of "2012", the first entry after the gap

// Placeholder values (0) and labels for the missing years.
const tempArrData = new Array(len).fill(0);
const tempArrYears = new Array(len);
let i = 0;
for (let y = 1996; y < 2012; y += 1) {
  tempArrYears[i] = y + '';
  i += 1;
}

// Insert the placeholders into both arrays at the same position.
const arrV = obj.data.ArrayValue;
arrV.splice(insertPos, 0, ...tempArrData);
const arrY = obj.data.ArrayYears;
arrY.splice(insertPos, 0, ...tempArrYears);

// Print every label next to its value.
for (let k = 0; k < arrY.length; k += 1) {
  console.log(`${arrY[k]} , ${arrV[k]}`);
}
Output:
1988 , 0.0030350000000000004
1989 , 0.003661
1990 , 0.0080348532
1991 , 0.0053554275
1992 , 0.004284
1993 , 0.008569710000000001
1994 , 0.008569710000000001
1995 , 0.007498282499999999
1996 , 0
1997 , 0
1998 , 0
1999 , 0
2000 , 0
2001 , 0
2002 , 0
2003 , 0
2004 , 0
2005 , 0
2006 , 0
2007 , 0
2008 , 0
2009 , 0
2010 , 0
2011 , 0
2012 , 0.189286
2013 , 0.42142999999999997
2014 , 0.461429
2015 , 0.5075000000000001
2016 , 0.5575
2017 , 0.615
2018 , 0.705
2019 , 0.76
2020 , 0.8075
2021 , 0.865
TTM , 0.89
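The answer above hard-codes the gap (16 years, inserted at index 8). A minimal sketch of a more general variant, written in Python and assuming the same `ArrayValue` / `ArrayYears` layout, looks up each wanted year and falls back to 0 when it is absent. The function name `fill_missing_years` and its parameters are hypothetical, and non-year labels such as "TTM" are assumed to belong at the end.

```python
import json

def fill_missing_years(data, first_year, last_year, fill_value=0):
    """Return new arrays containing every year in [first_year, last_year]."""
    # Map each existing label (a year string or "TTM") to its value.
    by_label = dict(zip(data["ArrayYears"], data["ArrayValue"]))
    wanted = [str(year) for year in range(first_year, last_year + 1)]
    # Labels outside the wanted range (e.g. "TTM") are kept at the end, in order.
    extras = [label for label in data["ArrayYears"] if label not in wanted]
    labels = wanted + extras
    return {
        "ArrayValue": [by_label.get(label, fill_value) for label in labels],
        "ArrayYears": labels,
    }

# Shortened, hypothetical input with the same shape as the question's JSON.
doc = {"data": {"ArrayValue": [0.003661, 0.189286, 0.89],
                "ArrayYears": ["1989", "2012", "TTM"]}}
doc["data"] = fill_missing_years(doc["data"], 1988, 2013)
print(json.dumps(doc, indent=2))
```

Because the lookup is by label rather than by position, the same call works regardless of where the gap is or how many gaps there are.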

Related

How can I parse this html?

I'm trying to scrape https://PickleballBrackets.com using Selenium and BeautifulSoup with this code:
browser = webdriver.Safari()
browser.get('https://pickleballbrackets.com')
soup = BeautifulSoup(browser.page_source, 'lxml')
If I look at browser.page_source after I get the html, I can see 50 instances of
<div class="browse-row-box">
but after I create a soup object, they are lost. I believe that means that I have poorly formed html. I've tried all three parsers ('lxml', 'html5lib', 'html.parser') without any luck.
Suggestions on how to proceed?
It is a lot easier to get the data directly from the source:
import pandas as pd
import requests
url = 'https://pickleballbrackets.com/Json.asmx/EventsSearch_PublicUI'
payload = {
'AgeIDs': "",
'Alpha': "All",
'ClubID': "",
'CountryID': "",
'DateFilter': "future",
'EventTypeIDs': "1",
'FormatIDs': "",
'FromDate': "",
'IncludeTestEvents': "0",
'OrderBy': "EventActivityFirstDate",
'OrderDirection': "Asc",
'PageNumber': "1",
'PageSize': 9999,
'PlayerGroupIDs': "",
'PrizeMoney': "All",
'RankIDs': "",
'ReturnType': "json",
'SearchWord': "",
'ShowOnCalendar': "0",
'SportIDs': "dc1894c6-7e85-43bc-bfa2-3993b0dd630f",
'StateIDs': "",
'ToDate': "",
'prt': ""}
jsonData = requests.post(url, json=payload).json()
df = pd.DataFrame(jsonData['d'])
Output:
print(df.head(2).to_string())
RowNumber RecordCount PageCount CurrPage EventID ClubID Title TimeZoneAbbreviation UTCOffset HasDST StartTimesPosted Logo OnlineRegistration_Active Registration_DateOpen Registration_DateClosed IsSanctioned CancelTourney LocationOfEvent_Venue LocationOfEvent_StreetAddress LocationOfEvent_City LocationOfEvent_CountryTitle LocationOfEvent_StateTitle LocationOfEvent_Zip ShowDraws IsFavorite IsPrizeMoney MaxRegistrationsForEntireEvent Sanction_PCO SanctionLevelAppovedStatus_PCO SanctionLevelID_PCO Sanction_SSIPA SanctionLevelAppovedStatus_SSIPA SanctionLevelID_SSIPA Sanction_USAPA SanctionLevelAppovedStatus_USAP SanctionLevelID_USAP Sanction_WPF SanctionLevelAppovedStatus_WPF SanctionLevelID_WPF Sanction_GPA SanctionLevelAppovedStatus_GPA SanctionLevelID_GPA EventActivityFirstDate EventActivityLastDate IsRegClosed Cost_Registration_Current Cost_FeeOnEvents RegistrationCount_InAtLeastOneLiveEvent showResultsButton SantionLevels_PCO_Title SantionLevels_PCO_LevelLogo SantionLevels_SSIPA_Title SantionLevels_SSIPA_LevelLogo SantionLevels_USAP_Title SantionLevels_USAP_LevelLogo SantionLevels_WPF_Title SantionLevels_WPF_LevelLogo SantionLevels_GPA_Title SantionLevels_GPA_LevelLogo mng
0 1 152 1 1 410d04c2-49c5-48a4-847f-0f0ac0aa92f7 91c83e9c-c8e3-460d-b124-52f5c1036336 Cincinnati Pickleball Club 2022 March Mania EST -5 True False 410d04c2-49c5-48a4-847f-0f0ac0aa92f7_Logo.png True 1/24/2022 7:30:00 AM 3/22/2022 5:00:00 PM False False Five Seasons Ohio 11790 Snider Road Cincinnati United States Ohio 45249 -1 0 False 0 False False False False False 3/25/2022 4:00:00 PM 3/27/2022 2:00:00 PM 1 50.0 225.0 238 1 0
1 2 152 1 1 9f0c5976-94e9-4d58-a273-774744bdacec e5cd380b-fe72-4ef4-89e8-5053e94587a3 Flash Fridays Slam Series - March 25th EST -5 True False 9f0c5976-94e9-4d58-a273-774744bdacec_Logo.png True 3/1/2022 5:00:00 PM 3/23/2022 11:45:00 PM False False Holbrook Park 100 Sherwood Dr Huntersville United States North Carolina 28078 1 0 False 6 False False False False False 3/25/2022 4:00:00 PM 3/25/2022 4:00:00 PM 1 25.0 0.0 0 0 0
....
[152 rows x 60 columns]
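With 60 columns, it is usually handy to narrow each record down to a few fields. The sketch below uses only the standard library and assumes the `d` key holds a list of record dictionaries (which is what the DataFrame construction above suggests). The sample records are copied from the output above, and `pick_columns` is a hypothetical helper, not part of any library.

```python
# Hypothetical sample mimicking the assumed response shape: {"d": [record, ...]}.
# Field names and values are taken from the DataFrame output shown above.
sample_response = {
    "d": [
        {
            "Title": "Cincinnati Pickleball Club 2022 March Mania",
            "LocationOfEvent_City": "Cincinnati",
            "LocationOfEvent_StateTitle": "Ohio",
            "EventActivityFirstDate": "3/25/2022 4:00:00 PM",
        },
        {
            "Title": "Flash Fridays Slam Series - March 25th",
            "LocationOfEvent_City": "Huntersville",
            "LocationOfEvent_StateTitle": "North Carolina",
            "EventActivityFirstDate": "3/25/2022 4:00:00 PM",
        },
    ]
}

def pick_columns(records, columns):
    """Keep only the requested keys of each record (None if a key is missing)."""
    return [{column: record.get(column) for column in columns} for record in records]

events = pick_columns(sample_response["d"], ["Title", "LocationOfEvent_City"])
for event in events:
    print(event["Title"], "|", event["LocationOfEvent_City"])
```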

How to insert object of array into JSON format?

Suppose the JSON output looks like this:
{
"Bank_Name":"This is bank name",
"ACC_Name":"Tummy",
"ACC_No":"1122XXXX115",
"Date_Active":"Jan 31 2019 2:16PM",
"Date_Expired":"Nov 17 2020 1:14PM",
"Bank_Status":"Expired",
"email_Notif":[
{"Verification":[{"User_Email":"tfe.master@gmail.com","Send_DateTime":"2020-11-03T13:30:59.7036152"},
{"User_Email":"the.user@outlook.com","Send_DateTime":"2020-11-03T13:31:02.1563596"}]
},{"Verified":[{"DateTime": "2020-11-03T13:31:02.1563596"}]
},
{ "Updating":[{"User_Email":"the.spv@gmail.com","Send_DateTime":"2020-11-03T13:30:59.7036152"},
{"User_Email":"the.officer@outlook.com","Send_DateTime":"2020-11-03T13:31:02.1563596"}]
}
],
"rejection_Statuses":[
{"Verification":"Nov 3 2020 01:31:02 PM"} ,
{"Verified":"Nov 7 2020 01:12:03 PM"} ,
{"Updating":"Nov 17 2020 01:18:03 PM"} ,
{"Re_run":"Nov 27 2020 05:18:03 PM"}
]
}
Questions:
How do I use JSON_MODIFY in SQL Server to insert email_Notif (as an array of objects) if the JSON input looks like this:
{"Bank_Name": "BPD SULAWESI SELATAN",
"ACC_Name": "Tutik",
"ACC_No": "1122000115",
"Date_Active": "Jan 31 2019 2:16PM",
"Date_Expired": "Nov 17 2020 1:14PM",
"Bank_Status": "Expired",
"rejection_Statuses":[
{"Verification":"Nov 3 2020 01:31:02 PM"} ,
{"Verified":"Nov 7 2020 01:12:03 PM"} ,
{"Updating":"Nov 17 2020 01:18:03 PM"} ,
{"Re_run":"Nov 27 2020 05:18:03 PM"}]
}
How do I get the values of email_Notif (Verification, Verified and Updating) in JSON format with a SELECT statement in SQL Server?
If I understand correctly, one way to achieve this is:
JSON:
DECLARE
@email_Verfication nvarchar(max) = N'{
"Verification":[
{"User_Email":"tfe.master@gmail.com", "Send_DateTime":"2020-11-03T13:30:59.7036152"},
{"User_Email":"the.user@outlook.com", "Send_DateTime":"2020-11-03T13:31:02.1563596"}
]
}',
@email_Verified nvarchar(max) = N'{
"Verified":[
{"DateTime":"2020-11-03T13:31:02.1563596"}
]
}',
@email_Updating nvarchar(max) = N'{"Updating":[
{"User_Email":"the.spv@gmail.com", "Send_DateTime":"2020-11-03T13:30:59.7036152"},
{"User_Email":"the.officer@outlook.com", "Send_DateTime":"2020-11-03T13:31:02.1563596"}
]
}',
@detail NVARCHAR (MAX) = N'{
"Bank_Name":"BPD SULAWESI SELATAN",
"ACC_Name":"Tutik",
"ACC_No":"1122000115",
"Date_Active":"Jan 31 2019 2:16PM",
"Date_Expired":"-",
"Bank_Status":"Active",
"rejection_Statuses":[
{"Verification":"Nov 3 2020 01:31:02 PM"},
{"Verified":"Nov 7 2020 01:12:03 PM"},
{"Updating":"Nov 17 2020 01:18:03 PM"},
{"Re_run":"Nov 27 2020 05:18:03 PM"}
]
}'
Modify JSON:
-- JSON_QUERY keeps each fragment as JSON, so it is appended as an object rather than as an escaped string.
SET @detail = JSON_MODIFY (@detail, 'append $.email_Notif', JSON_QUERY(@email_Verfication, '$'))
SET @detail = JSON_MODIFY (@detail, 'append $.email_Notif', JSON_QUERY(@email_Verified, '$'))
SET @detail = JSON_MODIFY (@detail, 'append $.email_Notif', JSON_QUERY(@email_Updating, '$'))
Parse JSON:
-- j1 iterates over the objects in email_Notif; j2 returns the key and value of each object.
SELECT j2.[key], j2.[value]
FROM OPENJSON(@detail, '$.email_Notif') j1
CROSS APPLY OPENJSON(j1.[value], '$') j2
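For readers who want to check the shape of the result outside SQL Server, here is a rough Python analogue of the two steps above: appending objects to an `email_Notif` array, then flattening that array into key/value rows. It is only an illustration of the data transformation, not of how SQL Server implements it. The helper `append_to_array` is hypothetical, and it is assumed that the array should be created when it does not exist yet, which is what the answer relies on.

```python
import json

def append_to_array(doc, key, item):
    """Append `item` to the list stored at doc[key], creating the list if needed."""
    doc.setdefault(key, []).append(item)
    return doc

# Shortened inputs based on the JSON in the answer above.
detail = json.loads('{"Bank_Name": "BPD SULAWESI SELATAN", "Bank_Status": "Active"}')
verified = json.loads('{"Verified": [{"DateTime": "2020-11-03T13:31:02.1563596"}]}')
updating = json.loads(
    '{"Updating": [{"User_Email": "the.spv@gmail.com",'
    ' "Send_DateTime": "2020-11-03T13:30:59.7036152"}]}'
)

append_to_array(detail, "email_Notif", verified)
append_to_array(detail, "email_Notif", updating)

# Counterpart of the nested OPENJSON calls: one (key, value) row per array entry.
rows = [(key, json.dumps(value))
        for entry in detail["email_Notif"]
        for key, value in entry.items()]
for key, value in rows:
    print(key, value)
```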

Constructing request payload in R using rjson/jsonlite

My current code as seen below attempts to construct a request payload (body), but isn't giving me the desired result.
library(df2json)
library(rjson)
y = rjson::fromJSON((df2json::df2json(dataframe)))
globalparam = ""
req = list(
Inputs = list(
input1 = y
)
,GlobalParameters = paste("{",globalparam,"}",sep="")#globalparam
)
body = enc2utf8((rjson::toJSON(req)))
body currently turns out to be
{
"Inputs": {
"input1": [
{
"X": 7,
"Y": 5,
"month": "mar",
"day": "fri",
"FFMC": 86.2,
"DMC": 26.2,
"DC": 94.3,
"ISI": 5.1,
"temp": 8.2,
"RH": 51,
"wind": 6.7,
"rain": 0,
"area": 0
}
]
},
"GlobalParameters": "{}"
}
However, I need it to look like this:
{
"Inputs": {
"input1": [
{
"X": 7,
"Y": 5,
"month": "mar",
"day": "fri",
"FFMC": 86.2,
"DMC": 26.2,
"DC": 94.3,
"ISI": 5.1,
"temp": 8.2,
"RH": 51,
"wind": 6.7,
"rain": 0,
"area": 0
}
]
},
"GlobalParameters": {}
}
So basically global parameters have to be {}, but not hardcoded. It seemed like a fairly simple problem, but I couldn't fix it. Please help!
EDIT:
This is the dataframe
X Y month day FFMC DMC DC ISI temp RH wind rain area
1 7 5 mar fri 86.2 26.2 94.3 5.1 8.2 51 6.7 0.0 0
2 7 4 oct tue 90.6 35.4 669.1 6.7 18.0 33 0.9 0.0 0
3 7 4 oct sat 90.6 43.7 686.9 6.7 14.6 33 1.3 0.0 0
4 8 6 mar fri 91.7 33.3 77.5 9.0 8.3 97 4.0 0.2 0
This is an example of another data frame
> a = data.frame("col1" = c(81, 81, 81, 81), "col2" = c(72, 69, 79, 84))
Using this sample data
dd<-read.table(text=" X Y month day FFMC DMC DC ISI temp RH wind rain area
1 7 5 mar fri 86.2 26.2 94.3 5.1 8.2 51 6.7 0.0 0", header=T)
You can do
globalparam = setNames(list(), character(0))
req = list(
Inputs = list(
input1 = dd
)
,GlobalParameters = globalparam
)
body = enc2utf8((rjson::toJSON(req)))
Note that globalparam looks a bit funny because we need to force it to a named list for rjson to treat it properly. We only have to do this when it's empty.
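The underlying issue is the difference between the string `"{}"` and an empty JSON object `{}`. As a language-neutral illustration, the sketch below builds the same payload in Python, where an empty dictionary serializes to `{}` without any special handling. The function name `build_body` is hypothetical, and the row is a shortened version of the question's data.

```python
import json

def build_body(rows, global_params=None):
    """Serialize the request body; GlobalParameters is always a JSON object."""
    payload = {
        "Inputs": {"input1": rows},
        # dict(...) guarantees an object, so an empty value becomes {} and not "{}".
        "GlobalParameters": dict(global_params or {}),
    }
    return json.dumps(payload)

row = {"X": 7, "Y": 5, "month": "mar", "day": "fri", "FFMC": 86.2}
body = build_body([row])
print(body)

# For contrast: pasting braces into a string produces a JSON string, not an object.
wrong = json.dumps({"GlobalParameters": "{" + "" + "}"})
print(wrong)
```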

R data.frame to JSON with child nodes / hierarchical

I am trying to write a data.frame from R into a JSON file, but in a hierarchical structure with child nodes. I found examples and the RJSONIO package, but I wasn't able to apply them to my case.
This is the data.frame in R
> DF
Date_by_Month CCG Year Month refYear name OC_5a OC_5b OC_5c
1 2010-01-01 MyTown 2010 01 2009 2009/2010 0 15 27
2 2010-02-01 MyTown 2010 02 2009 2009/2010 1 14 22
3 2010-03-01 MyTown 2010 03 2009 2009/2010 1 6 10
4 2010-04-01 MyTown 2010 04 2010 2010/2011 0 10 10
5 2010-05-01 MyTown 2010 05 2010 2010/2011 1 16 7
6 2010-06-01 MyTown 2010 06 2010 2010/2011 0 13 25
In addition to writing the data by month, I would also like to create an aggregate child, the 'yearly' one, which holds the sum (for example) of all the months that fall in that year. This is how I would like the JSON file to look:
[
{
"ccg":"MyTown",
"data":[
{"period":"yearly",
"scores":[
{"name":"2009/2010","refYear":"2009","OC_5a":2, "OC_5b": 35, "OC_5c": 59},
{"name":"2010/2011","refYear":"2010","OC_5a":1, "OC_5b": 39, "OC_5c": 42}
]
},
{"period":"monthly",
"scores":[
{"name":"2009/2010","refYear":"2009","month":"01","year":"2010","OC_5a":0, "OC_5b": 15, "OC_5c": 27},
{"name":"2009/2010","refYear":"2009","month":"02","year":"2010","OC_5a":1, "OC_5b": 14, "OC_5c": 22},
{"name":"2009/2010","refYear":"2009","month":"03","year":"2010","OC_5a":1, "OC_5b": 6, "OC_5c": 10},
{"name":"2010/2011","refYear":"2010","month":"04","year":"2010","OC_5a":0, "OC_5b": 10, "OC_5c": 10},
{"name":"2010/2011","refYear":"2010","month":"05","year":"2010","OC_5a":1, "OC_5b": 16, "OC_5c": 7},
{"name":"2010/2011","refYear":"2010","month":"06","year":"2010","OC_5a":0, "OC_5b": 13, "OC_5c": 25}
]
}
]
}
]
Thank you so much for your help!
Expanding on my comment:
The jsonlite package has a lot of features, but what you're describing doesn't really map to a data frame anymore, so I doubt any canned routine has this functionality. Your best bet is probably to convert the data frame to a more general list (FYI, data frames are stored internally as lists of columns) with a structure that matches the structure of the JSON exactly, then just use the converter to translate it.
This is complicated in general but in your case should be fairly simple. The list will be structured exactly like the JSON data:
list(
list(
ccg = "Town1",
data = list(
list(
period = "yearly",
scores = yearly_data_frame_town1
),
list(
period = "monthly",
scores = monthly_data_frame_town1
)
)
),
list(
ccg = "Town2",
data = list(
list(
period = "yearly",
scores = yearly_data_frame_town2
),
list(
period = "monthly",
scores = monthly_data_frame_town2
)
)
)
)
Constructing this list should be a straightforward case of looping over unique(DF$CCG) and using aggregate at each step, to construct the yearly data.
If you need performance, look to either the data.table or dplyr packages to do the looping and aggregating all at once. The former is flexible and performant but a little esoteric. The latter has relatively easy syntax and is similarly performant, but is designed specifically around building pipelines for data frames so it might take some hacking to get it to produce the right output format.
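The loop-and-aggregate idea can be sketched as follows. This is written in Python rather than R, purely to illustrate the shape of the nested structure. The function name `build_hierarchy` is hypothetical, the rows are the six rows from the question, and the sum is used as the yearly aggregate as the question suggests.

```python
import json

SCORE_COLUMNS = ["OC_5a", "OC_5b", "OC_5c"]
COLUMNS = ["CCG", "Year", "Month", "refYear", "name"] + SCORE_COLUMNS
RAW = [
    ("MyTown", "2010", "01", "2009", "2009/2010", 0, 15, 27),
    ("MyTown", "2010", "02", "2009", "2009/2010", 1, 14, 22),
    ("MyTown", "2010", "03", "2009", "2009/2010", 1, 6, 10),
    ("MyTown", "2010", "04", "2010", "2010/2011", 0, 10, 10),
    ("MyTown", "2010", "05", "2010", "2010/2011", 1, 16, 7),
    ("MyTown", "2010", "06", "2010", "2010/2011", 0, 13, 25),
]
rows = [dict(zip(COLUMNS, values)) for values in RAW]

def build_hierarchy(rows):
    """Group rows by CCG and emit one yearly and one monthly block per CCG."""
    by_ccg = {}
    for row in rows:
        by_ccg.setdefault(row["CCG"], []).append(row)

    result = []
    for ccg, ccg_rows in by_ccg.items():
        yearly, monthly = {}, []
        for row in ccg_rows:
            # One running total per (name, refYear) pair.
            totals = yearly.setdefault(
                (row["name"], row["refYear"]),
                {"name": row["name"], "refYear": row["refYear"],
                 **{column: 0 for column in SCORE_COLUMNS}},
            )
            for column in SCORE_COLUMNS:
                totals[column] += row[column]
            monthly.append({"name": row["name"], "refYear": row["refYear"],
                            "month": row["Month"], "year": row["Year"],
                            **{column: row[column] for column in SCORE_COLUMNS}})
        result.append({"ccg": ccg, "data": [
            {"period": "yearly", "scores": list(yearly.values())},
            {"period": "monthly", "scores": monthly},
        ]})
    return result

hierarchy = build_hierarchy(rows)
print(json.dumps(hierarchy, indent=2))
```

Each entry under `scores` is one record (row-oriented), which matches the layout requested in the question.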
Looks like ssdecontrol has you covered... but here's my solution. Need to loop over unique CCG and Years to create the entire data set...
df <- read.table(textConnection("Date_by_Month CCG Year Month refYear name OC_5a OC_5b OC_5c
2010-01-01 MyTown 2010 01 2009 2009/2010 0 15 27
2010-02-01 MyTown 2010 02 2009 2009/2010 1 14 22
2010-03-01 MyTown 2010 03 2009 2009/2010 1 6 10
2010-04-01 MyTown 2010 04 2010 2010/2011 0 10 10
2010-05-01 MyTown 2010 05 2010 2010/2011 1 16 7
2010-06-01 MyTown 2010 06 2010 2010/2011 0 13 25"), stringsAsFactors=F, header=T)
library(RJSONIO)
to_list <- function(ccg, year){
  # Restrict the data to the requested CCG and year before aggregating.
  df_sub <- subset(df, CCG==ccg & Year==year)
  # Yearly totals: sum the score columns per (name, refYear) pair.
  df_yearly <- aggregate(df_sub[,c("OC_5a", "OC_5b", "OC_5c")], df_sub[,c("name", "refYear")], sum)
  l <- list("ccg"=ccg,
            data=list(list("period" = "yearly",
                           "scores" = as.list(df_yearly)
                           ),
                      list("period" = "monthly",
                           "scores" = as.list(df_sub[,c("name", "refYear", "OC_5a", "OC_5b", "OC_5c")])
                           )
                      )
            )
  return(l)
}
toJSON(to_list("MyTown", "2010"), pretty=T)
Which returns this:
{
"ccg" : "MyTown",
"data" : [
{
"period" : "yearly",
"scores" : {
"name" : [
"2009/2010",
"2010/2011"
],
"refYear" : [
2009,
2010
],
"OC_5a" : [
2,
1
],
"OC_5b" : [
35,
39
],
"OC_5c" : [
59,
42
]
}
},
{
"period" : "monthly",
"scores" : {
"name" : [
"2009/2010",
"2009/2010",
"2009/2010",
"2010/2011",
"2010/2011",
"2010/2011"
],
"refYear" : [
2009,
2009,
2009,
2010,
2010,
2010
],
"OC_5a" : [
0,
1,
1,
0,
1,
0
],
"OC_5b" : [
15,
14,
6,
10,
16,
13
],
"OC_5c" : [
27,
22,
10,
10,
7,
25
]
}
}
]
}

Using Linq to shape structure for Json serialization

I have a pretty simple structure that looks something like this:
var list = new List<CategoryInTimeItem>
{
new CategoryInTimeItem { Name = "Food", Year = 2012, Month = 1, Amount = 100 },
new CategoryInTimeItem { Name = "Food", Year = 2012, Month = 2, Amount = 110 },
new CategoryInTimeItem { Name = "Food", Year = 2012, Month = 3, Amount = 130 },
new CategoryInTimeItem { Name = "Food", Year = 2012, Month = 4, Amount = 130 },
new CategoryInTimeItem { Name = "Transport", Year = 2012, Month = 1, Amount = 1000 },
new CategoryInTimeItem { Name = "Transport", Year = 2012, Month = 2, Amount = 1101 },
new CategoryInTimeItem { Name = "Transport", Year = 2012, Month = 3, Amount = 1301 },
new CategoryInTimeItem { Name = "Transport", Year = 2012, Month = 4, Amount = 1301 }
};
I want to reshape this structure so that when it gets serialized to JSON the result looks like this, with one array for each name:
[
[["2012-1", 100], ["2012-2", 110], ["2012-3", 130], ["2012-4", 130]],
[["2012-1", 1000], ["2012-2", 1101], ["2012-3", 1301], ["2012-4", 1301]]
]
My linq query looks like this:
result.Values =
from d in list
orderby d.Name , d.Year , d.Month
group d by d.Name
into grp
select new[]
{
grp.Select(y => new object[] {y.DateName, y.Amount})
};
This almost works, however I get an extra "level" of arrays, so when serialized to json the result looks like this:
[
[[["2012-1", 100], ["2012-2", 110], ["2012-3", 130], ["2012-4", 130]]],
[[["2012-1", 1000], ["2012-2", 1101], ["2012-3", 1301], ["2012-4", 1301]]]
]
What am I doing wrong here?
You were almost there. Instead of
from d in list
...
select new[]
{
grp.Select(y => new object[] {y.DateName, y.Amount})
}
simply write:
from d in list
...
select grp.Select(y => new object[] {y.DateName, y.Amount}).ToList()
You just added an unnecessary level of array at the end.
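The same sort, group and project pattern, sketched in Python for comparison, makes the nesting levels easy to inspect. It is assumed here that `DateName` is simply the year and month joined by a hyphen, as the expected output suggests. The function name `shape` is hypothetical.

```python
import json
from itertools import groupby
from operator import itemgetter

items = [
    {"Name": "Food", "Year": 2012, "Month": 1, "Amount": 100},
    {"Name": "Food", "Year": 2012, "Month": 2, "Amount": 110},
    {"Name": "Transport", "Year": 2012, "Month": 1, "Amount": 1000},
    {"Name": "Transport", "Year": 2012, "Month": 2, "Amount": 1101},
]

def shape(items):
    """Return one list of [date_name, amount] pairs per name."""
    # groupby only merges adjacent items, so sort by the grouping key first.
    ordered = sorted(items, key=itemgetter("Name", "Year", "Month"))
    return [
        [[f"{item['Year']}-{item['Month']}", item["Amount"]] for item in group]
        for _, group in groupby(ordered, key=itemgetter("Name"))
    ]

print(json.dumps(shape(items)))
```

Wrapping the inner list comprehension in another pair of brackets would reproduce the extra array level from the question.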