Spark Read JSON with Request Parameters - json

I'm trying to read a JSON response from IBM Cloud's DB2 Warehouse documentation. This requires me to pass a request body wherein I have to supply userid and password as request parameters.
I could not find any way to supply request parameters when reading with spark.read.json. Is there any way to do that?
Usually I would read the JSON with plain Scala, using the scalaj-http and play-json libraries, like this:
val body = Json.obj(Constants.KEY_USERID -> userid, Constants.KEY_PASSWORD -> password)
val response = Json.parse(Http(url + Constants.KEY_ENDPOINT_AUTH_TOKENS)
  .header(Constants.KEY_CONTENT_TYPE, "application/json")
  .header(Constants.KEY_ACCEPT, "application/json")
  .postData(body.toString())
  .asString.body)
My constraint is that I cannot use these two libraries and have to do it in Scala with the Spark framework.

You cannot use spark.read.json directly for REST API data ingestion.
First, make the API call to get the response data, then convert it to a DataFrame with Spark. Note that if your API is paginated, you'll need to make multiple calls to get all the data.
For your example, you need to call the authentication endpoint to get a Bearer token and then add it to the request header:
Authorization: Bearer <your_token>
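All of this can be done with plain Scala (for example scala.io.Source.fromURL for simple GET requests; a POST with a body, such as the token request, needs something like java.net.HttpURLConnection).
Once you have the response_data, use Spark to convert it to a DataFrame: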
import spark.implicits._
val df = spark.read.json(Seq(response_data).toDS)

Related

Convert GET parameter to POST request body and extract specific JSON property

Background
I am making a spreadsheet that refers to information security license registration provided by the Japanese government (https://riss.ipa.go.jp/). The spreadsheet will be used on Microsoft Excel/LibreOffice Calc on Windows/Linux, so I want to avoid using platform-specific functionality like a script with the XMLHTTP60 module.
The site https://riss.ipa.go.jp has a URI that can retrieve registration information for a registration number (https://riss.ipa.go.jp/ajax/findRissRequest). The URI only works with a POST request whose body is in application/x-www-form-urlencoded style; it does not work with a GET request. The response is in JSON format.
Problem #1
Microsoft Excel and LibreOffice Calc have the WEBSERVICE function that can be used to send a request to a URI. This function is supported on all platforms and is suitable for my use case.
Unfortunately, the WEBSERVICE function only supports GET requests, and the URI I want to use only supports POST requests.
Problem #2
Microsoft Excel and LibreOffice Calc have the FILTERXML function that can be used to extract a specific element from XML.
Unfortunately, the URI I want to use returns its response in JSON format, and neither Microsoft Excel nor LibreOffice Calc has a function to parse JSON.
Question
Is there any way to convert a GET request to a POST request and extract a JSON property?
For example, is there any Web API like http://api.example.com/convert/get-to-post?uri=https://riss.ipa.go.jp/ajax/findRissRequest&reg_no=000006&property=result.reg_date that calls https://riss.ipa.go.jp/ajax/findRissRequest with the POST request body reg_no=000006 and extracts the property result.reg_date from its response?
In the end, I could not find any existing service, so I made a web API service with AWS Lambda and API Gateway.
First, I made a Lambda function like this:
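import json
import urllib.parse
import urllib.request

def lambda_handler(event, context):
    # Query string parameters as exposed by the Method Request passthrough template
    queryStringParameters = event.get('params').get('querystring')
    # Re-encode them as an application/x-www-form-urlencoded request body
    data = urllib.parse.urlencode(queryStringParameters)
    data = data.encode('UTF-8')
    # Supplying data makes urlopen send a POST request instead of a GET
    f = urllib.request.urlopen("https://riss.ipa.go.jp/ajax/findRissRequest", data)
    j = json.loads(f.read().decode('utf-8'))
    return j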
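The GET-to-POST conversion done in the handler can be tried locally without calling the service. The following is a minimal sketch, assuming the query parameters arrive as a plain dict; the helper name build_post_request is hypothetical, and the request is only built, never sent.

```python
import urllib.parse
import urllib.request

# Target endpoint taken from the question; it only accepts POST.
TARGET_URL = "https://riss.ipa.go.jp/ajax/findRissRequest"


def build_post_request(query_params):
    """Turn GET-style query parameters into a form-encoded POST request."""
    # urlencode produces "key=value&key2=value2" with percent-escaping applied.
    body = urllib.parse.urlencode(query_params).encode("utf-8")
    # Passing data= is what makes urllib use POST instead of GET.
    return urllib.request.Request(
        TARGET_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


request = build_post_request({"reg_no": "000006"})
print(request.get_method())  # POST
print(request.data)          # b'reg_no=000006'
```

Passing the resulting object to urllib.request.urlopen would perform the actual call, which is what the one-line urlopen(url, data) form in the handler does implicitly.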
Then I made a resource with a GET method in API Gateway and connected it to the Lambda function.
In Integration Request, you have to use non-proxy integration. Also, you have to specify a mapping template for Content-Type application/json with Method Request passthrough template.
In Integration Response, you have to specify a mapping template for Content-Type application/xml like this:
<?xml version="1.0" encoding="UTF-8" ?>
#set($root = $input.path('$.result[0]'))
<result>
#foreach($key in $root.keySet())
<$key>$root.get($key)</$key>
#end
</result>
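To see what this template produces, here is a rough Python equivalent of the same transformation. It is only an illustration and not part of the API Gateway setup: the function name is hypothetical, and the sample payload is made up (only the result[0] shape and the reg_date property come from the text above).

```python
import xml.etree.ElementTree as ET


def first_result_to_xml(payload):
    """Render the first entry of payload["result"] as a flat <result> element."""
    first = payload["result"][0]
    root = ET.Element("result")
    for key, value in first.items():
        # One child element per JSON property, like the #foreach loop in the template.
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Hypothetical payload; the real service's fields and values may differ.
sample = {"result": [{"reg_no": "000006", "reg_date": "2017-04-01"}]}
xml_text = first_result_to_xml(sample)
print(xml_text)
```

The flat structure is what lets an XPath such as //result/reg_date pick out a single property.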
Then I added HEAD and OPTIONS methods for the resource. This is because the WEBSERVICE function of LibreOffice sends OPTIONS and HEAD requests before the GET request.
You can use a mock in Integration Request with a mapping template for Content-Type application/json like { "statusCode": 200 }.
The result of WEBSERVICE function will be #VALUE! without these methods.
Finally, I can get a property from a web service that only accepts POST requests and returns a JSON with WEBSERVICE and FILTERXML like:
=FILTERXML(WEBSERVICE("https://xxxxxxxxxx.execute-api.ap-northeast-1.amazonaws.com/prod/passthru?reg_no=000006"),"//result/reg_date")

How to convert request data to json in AWS Lambda services?

"error%5Bcode%5D=BAD_REQUEST_ERROR&error%5Bdescription%5D=Payment+failed&error%5Bsource%5D=gateway&error%5Bstep%5D=payment_authorization&error%5Breason%5D=payment_failed&error%5Bmetadata%5D=%7B%22payment_id%22%3A%22pay_Es97gMGzx61l1u%22%2C%22order_id%22%3A%22order_Es96Rxp5OmnVVF%22%7D"
We are currently migrating from Flask to AWS Lambda. In Flask I was able to get the data as a dictionary, but in Lambda I am receiving the data as a string. Does anyone know how to parse this or convert it into JSON or a dictionary?
Thanks for your time (:
This example string looks like a URL-encoded string. Where do you get it from? Could you provide more information about the usage context (API Gateway, or a request from another Lambda) and how you obtain this string?
With Python you can usually get the path parameters and the query string parameters as dictionaries with:
def my_handler(event, context):
    params = event["pathParameters"]
    query = event["queryStringParameters"]
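If the string arrives as a raw form-encoded body instead (with an API Gateway proxy integration it would typically be in event["body"], which is an assumption about the setup here), the standard library can decode it. A minimal sketch, using a shortened version of the string from the question; parse_form_body is a hypothetical helper name.

```python
import json
from urllib.parse import parse_qsl


def parse_form_body(body):
    """Decode an application/x-www-form-urlencoded string into a dict."""
    # parse_qsl percent-decodes keys and values and turns "+" into a space.
    fields = dict(parse_qsl(body))
    # The metadata field carries a JSON document of its own, so decode it too.
    if "error[metadata]" in fields:
        fields["error[metadata]"] = json.loads(fields["error[metadata]"])
    return fields


# Shortened version of the string from the question.
raw = (
    "error%5Bcode%5D=BAD_REQUEST_ERROR&error%5Bdescription%5D=Payment+failed"
    "&error%5Bmetadata%5D=%7B%22payment_id%22%3A%22pay_Es97gMGzx61l1u%22%7D"
)
parsed = parse_form_body(raw)
print(parsed["error[code]"])                    # BAD_REQUEST_ERROR
print(parsed["error[metadata]"]["payment_id"])  # pay_Es97gMGzx61l1u
```

The keys keep their bracketed form (error[code] and so on); turning them into a nested dictionary would need an extra step that is not shown here.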

What is the difference between JSONResponse and JSONRenderer in Django

All I know is that JSONResponse is an HttpResponse with content_type="application/JSON",
and that JSONRenderer converts Python dictionary data to JSON format.
Do they do the same work, or is there any difference between them?
I've read about the difference between JSONParser and JSONRenderer, but that doesn't really solve my problem.
JSONResponse and JSONRenderer are quite similar and perform largely the same action. Both format server responses as JSON; however, their usage differs.
Both convert plain Python data to JSON format through the use of json.dumps and send the output back to the client. See JSONRenderer source and JSONResponse source for the code.
In terms of their difference, a JSONResponse should be returned by a view method in generic Django to send data with the header Content-Type: application/json. JSONRenderer on the other hand is used in Django Rest Framework to format serialized data to JSON format depending on the accept header in the received request. Check the documentation on Django request-responses: (https://docs.djangoproject.com/en/2.1/ref/request-response/) or the docs on DRF renderers (http://www.django-rest-framework.org/api-guide/renderers/) for more on their usage.
As an example, a JSONResponse (the class is spelled JsonResponse in django.http) might be used like this:
from django.http import JsonResponse

def some_view(request):
    data = get_data()
    return JsonResponse(data)
And usage for JSONRenderer in settings.py:
REST_FRAMEWORK = {
    'DEFAULT_RENDERER_CLASSES': (
        'rest_framework.renderers.JSONRenderer',
        'rest_framework.renderers.BrowsableAPIRenderer',
    )
}
The above will render response data for routes using Django Rest Framework in JSON depending on the accept header of the request.

Checking retrieved data format is in JSON or Invalid format using Spring Boot

I am trying to identify the format of the data received at a REST endpoint. I plan to respond only to requests that send JSON when the API is called, and I plan to determine that from the request headers.
I am defining the endpoint with the following structure:
@PostMapping("/login/checkAuthorization")
public PrivillegeResponse checkAuthorizationAction(@RequestBody PrivillegeModel privillegeObj) {
    // code to be executed
    // returning a JSON response
}
Before running the business logic, I need to verify that the data received is JSON. What are the options for achieving this?
For Spring Boot, you need to annotate the class with @RestController.
To restrict the endpoint to JSON, set the consumes = "application/json" attribute on the mapping annotation; requests with another Content-Type are then rejected with 415 Unsupported Media Type.
You can refer to:
Producing and consuming custom JSON Objects in Spring RESTful services
For a generic approach, check:
Spring RequestMapping for controllers that produce and consume JSON

Google Cloud Endpoints allowed method's return types

I'm developing a REST API using Cloud Endpoints. I know that as per the documentation each API method should return an entity that is then automatically converted to a valid JSON string.
However, I'm dealing with an authentication library that in some cases returns a JSON which should be passed back to the client as a response.
Sticking with the default approach, meaning returning an entity, would still be possible, but it would involve a number of obnoxious intermediate steps, like parsing the JSON and filling the right fields of the entity to be returned according to the JSON content.
I was wondering if there is a more straightforward way to instruct the API to directly return the JSON string, instead of converting it to an entity just to have it translated back to the source JSON.
Here is one thing that may help you.
Suppose you have an RPC response class such as
from protorpc import messages
from protorpc import message_types
from protorpc import remote

class ApiResponse(messages.Message):
    response = messages.StringField(1, required=True)
You can return your JSON in the response field by using the json module. First, import json in your endpoints API:
import json
Suppose your response is
json_response = [{'name': 'Jack', 'age':12}, {'name': 'Joe', 'age':13}]
In the API method, you can then return it like this:
return ApiResponse(response = json.dumps({'data': json_response}))
json.dumps() converts your dictionary object into a JSON string that can be passed to the response field of the ApiResponse class.
After you receive the response on the client side (JavaScript), you can parse that string back into an object using
JSON.parse(response)
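Because the JSON travels inside a string field, it is effectively encoded twice: once by json.dumps in the method, and once more when the message itself is serialised. The sketch below simulates that round trip with plain dictionaries; it does not use protorpc, and the envelope with a single "response" key is an assumption about the wire format.

```python
import json

json_response = [{"name": "Jack", "age": 12}, {"name": "Joe", "age": 13}]

# Server side: the JSON document is serialised into a single string field.
field_value = json.dumps({"data": json_response})

# Stand-in for the message on the wire: the field is a string, so its quotes are escaped.
wire = json.dumps({"response": field_value})

# Client side: decode the envelope first, then the string it carries.
envelope = json.loads(wire)
payload = json.loads(envelope["response"])
print(payload["data"][0]["name"])  # Jack
```

On the JavaScript side this corresponds to calling JSON.parse on the value of the response field rather than on the whole message.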