Large Json file send batches wise to HubSpot API

Large Json file send batches wise to HubSpot API - json

I tried many ways and tested many scenarios I did R&D a lot but unable to found issue/solution
I have a requirement, The HubSpot API accepts only 15k rec every time so we have large json file so we need to split/divide like batches wise 15k rec need to send api once 15k added in api it sleeps 10 sec and capture each response like this, the process would continue until all rec finished
I try with chunk code and modulus operator but didn't get any response
Not sure below code work or not can anyone please suggest better way
How to send batches wise to HubSpot API, How to post
Thanks in advance, this would great help for me!!!!!!!!
with open(r'D:\Users\lakshmi.vijaya\Desktop\Invalidemail\allhubusers_data.json', 'r') as run:
dict_run = run.readlines()
dict_ready = (''.join(dict_run))
count = 1000
subsets = (dict_ready[x:x + count] for x in range(0, len(dict_ready), count))
url = 'https://api.hubapi.com/contacts/v1/contact/batch'
headers = {'Authorization' : "Bearer pat-na1-**************************", 'Accept' : 'application/json', 'Content-Type' : 'application/json','Transfer-encoding':'chunked'}
for subset in subsets:
#print(subset)
urllib3.disable_warnings()
r = requests.post(url, data=subset, headers=headers,verify=False,
timeout=(15,20), stream=True)
print(r.status_code)
print(r.content)
ERROR:;;
400
b'\r\n400 Bad Request\r\n\r\n400 Bad Request\r\ncloudflare\r\n\r\n\r\n'
This is other method:
with open(r'D:\Users\lakshmi.vijaya\Desktop\Invalidemail\allhubusers_data.json', 'r') as run:
dict_run = run.readlines()
dict_ready = (''.join(dict_run))
url = 'https://api.hubapi.com/contacts/v1/contact/batch'
headers = {'Authorization' : "Bearer pat-na1***********-", 'Accept' : 'application/json', 'Content-Type' : 'application/json','Transfer-encoding':'chunked'}
urllib3.disable_warnings()
r = requests.post(url, data=dict_ready, headers=headers,verify=False,
timeout=(15,20), stream=True)
r.iter_content(chunk_size=1000000)
print(r.status_code)
print(r.content)
ERROR::::
raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='api.hubapi.com', port=443): Max retries exceeded with url: /contacts/v1/contact/batch
(Caused by SSLError(SSLEOFError(8, 'EOF occurred in violation of protocol (_ssl.c:2396)')))
This how json data looks like in large json file
{
"email": "aaazaj21#yahoo.com",
"properties": [
{
"property": "XlinkUserID",
"value": 422211111
},
{
"property": "register_time",
"value": "2021-09-02"
},
{
"property": "linked_alexa",
"value": 1
},
{
"property": "linked_googlehome",
"value": 0
},
{
"property": "fan_speed_switch_0x51_",
"value": 2
}
]
},
{
"email": "zzz7#gmail.com",
"properties": [
{
"property": "XlinkUserID",
"value": 13333666
},
{
"property": "register_time",
"value": "2021-04-24"
},
{
"property": "linked_alexa",
"value": 1
},
{
"property": "linked_googlehome",
"value": 0
},
{
"property": "full_colora19_st_0x06_",
"value": 2
}
]
}
I try with adding list of objects
[
{
"email": "aaazaj21#yahoo.com",
"properties": [
{
"property": "XlinkUserID",
"value": 422211111
},
{
"property": "register_time",
"value": "2021-09-02"
},
{
"property": "linked_alexa",
"value": 1
},
{
"property": "linked_googlehome",
"value": 0
},
{
"property": "fan_speed_switch_0x51_",
"value": 2
}
]
},
{
"email": "zzz7#gmail.com",
"properties": [
{
"property": "XlinkUserID",
"value": 13333666
},
{
"property": "register_time",
"value": "2021-04-24"
},
{
"property": "linked_alexa",
"value": 1
},
{
"property": "linked_googlehome",
"value": 0
},
{
"property": "full_colora19_st_0x06_",
"value": 2
}
]
}
]

You haven't said if your JSON file is a representation of an array of objects or just one object. Arrays are converted to Python lists by json.load and objects are converted to Python dictionaries.
Here is some code that assumes it is an array of objects if is is not an array of objects see https://stackoverflow.com/a/22878842/839338 but the same principle can be used
Assuming you want 15k bytes not records if it is the number of records you can simplify the code and just pass 15000 as the second argument to chunk_list().
import json
import math
import pprint
# See https://stackoverflow.com/a/312464/839338
def chunk_list(list_to_chunk, number_of_list_items):
"""Yield successive chunk_size-sized chunks from list."""
for i in range(0, len(list_to_chunk), number_of_list_items):
yield list_to_chunk[i:i + number_of_list_items]
with open('./allhubusers_data.json', 'r') as run:
json_data = json.load(run)
desired_size = 15000
json_size = len(json.dumps(json_data))
print(f'{json_size=}')
print(f'Divide into {math.ceil(json_size/desired_size)} sub-sets')
print(f'Number of list items per subset = {len(json_data)//math.ceil(json_size/desired_size)}')
if isinstance(json_data, list):
print("Found a list")
sub_sets = chunk_list(json_data, len(json_data)//math.ceil(json_size/desired_size))
else:
exit("Data not list")
for sub_set in sub_sets:
pprint.pprint(sub_set)
print(f'Length of sub-set {len(json.dumps(sub_set))}')
# Do stuff with the sub sets...
text_subset = json.dumps(sub_set) # ...
you may need to adjust the value of desired_size downwards if the sub_sets vary in length of text.
UPDATED IN RESPONSE TO COMMENT
If you just need 15000 records per request this code should work for you
import json
import pprint
import requests
# See https://stackoverflow.com/a/312464/839338
def chunk_list(list_to_chunk, number_of_list_items):
"""Yield successive chunk_size-sized chunks from list."""
for i in range(0, len(list_to_chunk), number_of_list_items):
yield list_to_chunk[i:i + number_of_list_items]
url = 'https://api.hubapi.com/contacts/v1/contact/batch'
headers = {
'Authorization': "Bearer pat-na1-**************************",
'Accept': 'application/json',
'Content-Type': 'application/json',
'Transfer-encoding': 'chunked'
}
with open(r'D:\Users\lakshmi.vijaya\Desktop\Invalidemail\allhubusers_data.json', 'r') as run:
json_data = json.load(run)
desired_size = 15000
if isinstance(json_data, list):
print("Found a list")
sub_sets = chunk_list(json_data, desired_size)
else:
exit("Data not list")
for sub_set in sub_sets:
# pprint.pprint(sub_set)
print(f'Length of sub-set {len(sub_set)}')
r = requests.post(
url,
data=json.dumps(sub_set),
headers=headers,
verify=False,
timeout=(15, 20),
stream=True
)
print(r.status_code)
print(r.content)

Related

How to get the All index values in Groovy JSON xpath

Please find the attached Groovy code which I am using to get the particular filed from the response body.
Query 1 :
It is retrieving the results when the I am using the correct Index value like if the data.RenewalDetails[o], will give output as Value 1 and if the data.RenewalDetails[1], output as Value 2.
But in my real case, I will never know about number of blocks in the response, so I want to get all the values that are satisficing the condition, I tried data.RenewalDetails[*] but it is not working. Can you please help ?
Query 2:
Apart from the above condition, I want to add one more filter, where "FamilyCode": "PREMIUM" in the Itemdetails, Can you help on the same ?
def BoundId = new groovy.json.JsonSlurper().parseText('{"data":{"RenewalDetails":[{"ExpiryDetails":{"duration":"xxxxx","destination":"LHR","from":"AUH","value":2,"segments":[{"valudeid":"xxx-xx6262-xxxyyy-1111-11-11-1111"}]},"Itemdetails":[{"BoundId":"Value1","isexpired":true,"FamilyCode":"PREMIUM","availabilityDetails":[{"travelID":"AAA-AB1234-AAABBB-2022-11-10-1111","quota":"X","scale":"XXX","class":"X"}]}]},{"ExpiryDetails":{"duration":"xxxxx","destination":"LHR","from":"AUH","value":2,"segments":[{"valudeid":"xxx-xx6262-xxxyyy-1111-11-11-1111"}]},"Itemdetails":[{"BoundId":"Value2","isexpired":true,"FamilyCode":"PREMIUM","availabilityDetails":[{"travelID":"AAA-AB1234-AAABBB-2022-11-10-1111","quota":"X","scale":"XXX","class":"X"}]}]}]},"warnings":[{"code":"xxxx","detail":"xxxxxxxx","title":"xxxxxxxx"}]}')
.data.RenewalDetails[0].Itemdetails.find { itemDetail ->
itemDetail.availabilityDetails[0].travelID.length() == 33
}?.BoundId
println "Hello " + BoundId

Something like this:
def txt = '''\
{
"data": {
"RenewalDetails": [
{
"ExpiryDetails": {
"duration": "xxxxx",
"destination": "LHR",
"from": "AUH",
"value": 2,
"segments": [
{
"valudeid": "xxx-xx6262-xxxyyy-1111-11-11-1111"
}
]
},
"Itemdetails": [
{
"BoundId": "Value1",
"isexpired": true,
"FamilyCode": "PREMIUM",
"availabilityDetails": [
{
"travelID": "AAA-AB1234-AAABBB-2022-11-10-1111",
"quota": "X",
"scale": "XXX",
"class": "X"
}
]
}
]
},
{
"ExpiryDetails": {
"duration": "xxxxx",
"destination": "LHR",
"from": "AUH",
"value": 2,
"segments": [
{
"valudeid": "xxx-xx6262-xxxyyy-1111-11-11-1111"
}
]
},
"Itemdetails": [
{
"BoundId": "Value2",
"isexpired": true,
"FamilyCode": "PREMIUM",
"availabilityDetails": [
{
"travelID": "AAA-AB1234-AAABBB-2022-11-10-1111",
"quota": "X",
"scale": "XXX",
"class": "X"
}
]
}
]
}
]
},
"warnings": [
{
"code": "xxxx",
"detail": "xxxxxxxx",
"title": "xxxxxxxx"
}
]
}'''
def json = new groovy.json.JsonSlurper().parseText txt
List<String> BoundIds = json.data.RenewalDetails.Itemdetails*.find { itemDetail ->
itemDetail.availabilityDetails[0].travelID.size() == 33 && itemDetail.FamilyCode == 'PREMIUM'
}?.BoundId
assert BoundIds.toString() == '[Value1, Value2]'
Note, that you will get the BoundIds as a List

If you amend your code like this:
def json = new groovy.json.JsonSlurper().parse(prev.getResponseData()
you would be able to access the number of returned items as:
def size = json.data.RenewalDetails.size()
as RenewalDetails represents a List
Just add as many queries you want using Groovy's && operator:
find { itemDetail ->
itemDetail.availabilityDetails[0].travelID.length() == 33 &&
itemDetail.FamilyCode.equals('PREMIUM')
}
More information:
Apache Groovy - Parsing and producing JSON
Apache Groovy: What Is Groovy Used For?

pyjq - how to use "select" with both query and value as variables

I am writing a code in python3 where i am struggling with usage of variables with "pyjq", the code works without variables but variables are not getting parsed inside pyjq.
The documentation referred is https://github.com/doloopwhile/pyjq/blob/master/README.md#api
Please check the code given below and suggest -
My code
import json, os
import pyjq
from flask import Flask, request, jsonify
def query_records():
args = {"meta.antivirus.enabled": "true"}
for key, value in args.items():
with open('/tmp/data.txt', 'r') as f:
print (key)
print (value)
data = f.read()
records = json.loads(data)
query = ("." + key)
print (query)
#jq '.[]|select(.meta.antivirus.enabled=="true")' filename.json works,issue with variable substitution in python
match = pyjq.all('.[]|select(["$query"]==$value)', records, vars={"value": value,"query": query})
print (match)
query_records()
Content of file "/tmp/data.txt"
[
{
"name": "alpharetta",
"meta": {
"antivirus": {
"enabled": "true"
},
"limits": {
"cpu": {
"enabled": "true",
"value": "250m"
}
}
}
},
{
"meta": {
"allergens": {
"eggs": "true",
"nuts": "false",
"seafood": "false"
},
"calories": 230,
"carbohydrates": {
"dietary-fiber": "4g",
"sugars": "1g"
},
"fats": {
"saturated-fat": "0g",
"trans-fat": "1g"
}
},
"name": "sandwich-nutrition"
},
{
"meta": {
"allergens": {
"eggs": "true",
"nuts": "false",
"seafood": "true"
},
"calories": 440,
"carbohydrates": {
"dietary-fiber": "4g",
"sugars": "2g"
},
"fats": {
"saturated-fat": "0g",
"trans-fat": "1g"
}
},
"name": "random-nutrition"
}
]
Expected output(which works without variables)
{
"name": "alpharetta",
"meta": {
"antivirus": {
"enabled": "true"
},
"limits": {
"cpu": {
"enabled": "true",
"value": "250m"
}
}
}
}
Current output []
seems like some issue with variables not being passed in case of "query" , help would be appreciated.
Edit 1
It works if I hardcode "query" -
match = pyjq.all('.[]|select(.meta.antivirus.enabled==$value)', records, vars={"value": value,"query": query})
but not vice-versa
which probably narrows it down to issue with the variable "query"
JQ is not a necessity and I can use other libraries too,given that json is returned

Variables are intended to be used for values, not for jq expressions (at least not directly).
I think the easiest option here is to go for an fstring:
match = pyjq.all(f'.[]|select({query}==$value)', records, vars={"value": value})
and it probably makes sense to prepend the period inside the fstring:
match = pyjq.all(f'.[]|select(.{key}==$value)', records, vars={"value": value})

Getting an error "Can not deserialize instance of java.lang.String out of START_OBJECT " even though the Json code is from Docs

I took this syntax for the payload from the formal docs of jira, yet i am still getting an error. I am using either python or curl both give the same error. I supppose this is a Json related issue , could you find what is wrong with the jason/payload and how do i go about fixing it?
import requests
import json
url = "https://jira.company.io/rest/api/latest/issue/ISS-37424/transitions"
payload = json.dumps({
"update": {
"comment": [
{
"add": {
"body": {
"type": "doc",
"version": 1,
"content": [
{
"type": "paragraph",
"content": [
{
"text": "Bug has been fixed",
"type": "text"
}
]
}
]
}
}
}
]
},
"transition": {
"id": "2"
}
})
headers = {
'Authorization': 'Basic YmVzQ=LKKJYTFTgfg','
Accept': 'application/json',
'Content-Type': 'application/json',
}
response = requests.request("POST", url, headers=headers, data=payload)
print(response.text)

Json response validation based on substring presence in Karate Framework [duplicate]

This question already has an answer here:
Karate API json response - How to validate the presence of a key which sometimes come and sometimes not in the API response
(1 answer)
Closed 2 years ago.
I am trying to validate a JSON which has Optional keys depending on a condition.
Json response is as follows:
{ "hits": [
{
"_type": "sessions",
"_source": {
"movie_name": "The Greatest Showman (U/A) - English",
"session_data": {
"freeSeating": false,
"mandatoryItems": [],
"areas": [
{
"seatingType": "fixed"
},
{
"seatingType": "free"
},
{
"seatingType": "mixed"
}
]
}
}
},
{
"_type": "sessions",
"_source": {
"movie_name": "The Greatest Showman (U/A) - English",
"session_data": {
"freeSeating": false,
"mandatoryItems": [],
"areas": [
{
"seatingType": "fixed"
},
{
"seatingType": "free"
}
]
}
}
},
{
"_type": "sessions",
"_source": {
"movie_name": "The Greatest Showman 3D (U/A) - English",
"session_data": {
"freeSeating": false,
"mandatoryItems": [
{
"quantity": 1,
"level": "ticket",
"price": 30,
"type": "3dglasses"
}
],
"areas": [
{
"seatingType": "fixed"
}
]
}
}
}
]
}
If movie_name contains "3D" in it then mandatoryItems = [{"quantity": 1,"level": "ticket", "price": 30, "type": "3dglasses"}]
If movie_name does not contains "3D" in it then mandatoryItems = []
I want to achieve above assertions in my feature file.
Note: "movie_name" and "mandatoryItems" are present each element of an array. So I want to assert this condition on entire array.
Thanks in advance!!

Sorry #Peter for the inconvenience caused by me. I worked on this problem statement by referring all possible sources and wrote following code which is giving me desired output:
Given url api_url
When method Get
And def mandatoryItems_present =
"""
{
"quantity": 1,
"level": "ticket",
"price": '#number',
"type": "3dglasses",
}
"""
Then status 200
And print response
And def source_list = karate.jsonPath(response, '$.._source')
And print source_list
And match each source_list[*].session_data contains {'freeSeating': '#boolean','mandatoryItems':'##[] mandatoryItems_present'}
And def movie_names = get source_list[*].movie_name
And def mandatoryItems_list = get source_list[*].session_data.mandatoryItems
And def name_size = names.size();
And print name_size
And def threeD_movie_list = new ArrayList()
And eval for(var i = 0; i < name_size; i++) {if (names[i].match('3D')) threeD_movie_list.add(names[i])}
And print threeD_movie_list
And def threeD_movies_array_size = threeD_movie_list.size();
And print threeD_movies_array_size
And print expected
And def expected = (threeD_movies_array_size == 0 ? {"mandatoryItems" : []} : {'mandatoryItems' : [mandatoryItems_present]} )
And print expected
And match each response.hits[*]._source.session_data[*].mandatoryItems == expected
Please let me know whether this approach is correct.

How to put logic and assert dynamic data using karate

Here is my one response for a particular request
{
"data": {
"foo": [{
"total_value":200,
"applied_value": [{
"type": "A",
"id": 79806,
"value": 200
}]
}]
}
}
Here is my another response for the SAME request
{
"data": {
"foo": [{
"total_value":300,
"applied_value": [{
"type": "A",
"id": 79806,
"value": 200
},
{
"type": "B",
"id": 79809,
"value": 100
}
]
}]
}
}
I am unsure for which scenario will I get which response
So the use case is
Whenever there are 2 values in applied_value add two values and assert
Whenever there is only 1 value in applied_value directly assert

Here's one possible solution:
* def adder = function(array) { var total = 0; for (var i = 0; i < array.length; i++) total += array[i]; return total }
* def response =
"""
{
"data": {
"foo": [{
"total_value":300,
"applied_value": [{
"type": "A",
"id": 79806,
"value": 200
},
{
"type": "B",
"id": 79809,
"value": 100
}
]
}]
}
}
"""
* def expected = get[0] response..total_value
* def values = $response..value
* def total = adder(values)
* match expected == total
Just as an example, an alternate way to implement the adder routine is like this:
* def total = 0
* def add = function(x){ karate.set('total', karate.get('total') + x ) }
* eval karate.forEach(values, add)

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

Large Json file send batches wise to HubSpot API - json

Related

How to get the All index values in Groovy JSON xpath

pyjq - how to use "select" with both query and value as variables

Getting an error "Can not deserialize instance of java.lang.String out of START_OBJECT " even though the Json code is from Docs

Json response validation based on substring presence in Karate Framework [duplicate]

How to put logic and assert dynamic data using karate

Categories

Resources