I need to crawl a paginated REST API with offset and limit parameters using Talend.
The API gives me a list of the resources I am interested in.
For instance, the response to the initial request with offset=0 and limit=2 is:
{
"meta": {
"limit": 2,
"next": "/api/v1/request/?offset=2&limit=2",
"offset": 0,
"previous": null,
"total_count": 4300
},
"objects": [
{
"id": 1,
"name": "foo"
},
{
"id": 2,
"name": "bar"
}
]
}
As you can see, the response contains an objects key holding some of the desired resources, and a meta key whose next field gives the next URL to query. So far I am able to perform the initial request with tRESTClient. However, I don't know how to proceed from here and request the remaining pages using the next value.
How can I perform multiple requests to that API so that I iterate over the whole list until next equals null (i.e. the list is exhausted)?
I tried to figure out how tSetGlobalVar and tLoop could help me, but so far without success. Then again, I am a Talend newbie.
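To make the intended loop concrete, here is a minimal sketch in Python rather than Talend. It only illustrates the control flow (request a page, collect objects, follow meta.next until it is null). The fetch callable is a hypothetical stand-in for whatever performs the HTTP request and parses the JSON.

```python
from typing import Callable, Iterator, Optional


def crawl(fetch: Callable[[str], dict], first_url: str) -> Iterator[dict]:
    """Yield every resource by following meta.next until it is null.

    `fetch` is a hypothetical helper: it takes a URL and returns the
    decoded JSON response as a dict.
    """
    url: Optional[str] = first_url
    while url is not None:
        page = fetch(url)
        # Each page carries a slice of the resources under "objects".
        yield from page["objects"]
        # "next" is None (JSON null) on the last page, which ends the loop.
        url = page["meta"]["next"]
```

In a Talend job the same idea would map to a loop whose exit condition checks the stored next value, but the exact component wiring is not shown here.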
I have a question about the JSON extractor in JMeter.
I have a JSON response that includes a list of components. The problem is that the order of the list changes: if I send the request in the morning, the order is not the same as in the evening.
I want to extract the value of "SalesPerson" for the entry with Id = 10606.
If the entry with Id 10606 comes first in the response, everything is fine, but if it appears in a different position, the wrong value is extracted.
Each time I send a request, I need to look up Id 10606 and return its sales person ("Bebeto" in the example).
Can this be done, so that the extractor picks exactly that value regardless of its position in the response?
{
"Error": null,
"ErrorCode": 0,
"Data": [{
"Account": "Pro",
"SalesPerson": "Ronaldo",
"Id": 7722,
"Name": "Brazil"
}, {
"Account": "Basic",
"SalesPerson": "Bebeto",
"Id": 10606,
"Name": "USA"
}, {
"Account": "Basic",
"SalesPerson": "Rivaldo",
"Id": 13017,
"Name": "Greece"
}],
"Totals": 3
}
The extractor I set up works only if 10606 is first in the list. I want it to extract the SalesPerson with Id 10606 regardless of its position in the response: I want to provide 10606 and get the sales person back.
In the scenario shown, the result is wrong because it extracts "Ronaldo".
What am I missing?
Go for the filter operator, something like:
$..[?(@.Id == 10606)].SalesPerson
should do the trick for you.
More information and tricks: JMeter's JSON Path Extractor Plugin - Advanced Usage Scenarios
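For comparison, the same position-independent lookup can be sketched in plain Python. This is not JMeter code; the function name is made up for illustration, and the sample data is a shortened copy of the response from the question.

```python
from typing import Optional

# Shortened copy of the response from the question.
RESPONSE = {
    "Data": [
        {"SalesPerson": "Ronaldo", "Id": 7722},
        {"SalesPerson": "Bebeto", "Id": 10606},
        {"SalesPerson": "Rivaldo", "Id": 13017},
    ]
}


def sales_person_for(payload: dict, wanted_id: int) -> Optional[str]:
    """Return the SalesPerson whose Id matches, wherever it sits in Data."""
    for entry in payload["Data"]:
        if entry["Id"] == wanted_id:
            return entry["SalesPerson"]
    return None  # no entry with that Id


print(sales_person_for(RESPONSE, 10606))
```

The filter expression does the same thing: it selects by the value of Id instead of by index.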
I am designing a REST call that should deliver information for a location (lat/lon) and consider the user context/configuration.
As the number of user properties is high and the properties are nested, I am not sure which is the correct way to design the new query (GET vs. POST). Currently we use a POST request for simplicity. The query payload is custom and very different for each user, and it includes an array of multiple configuration items. Currently the request looks like this:
POST http://api.something.com/locationInformation
{
"location": {
"accuracy": 30,
"coordinates": [
16.34879820048809,
48.230067741347334
],
"provider": "network",
"timestamp": "2016-01-06T12:00:00.000Z"
},
"userConfiguration": [
{
"id": "asdfasdfasdfs09898sdf",
"values": [
"false"
]
},
{
"id": "iojkljio230909sdjklsdf",
"values": [
"99jkjiouio89",
"sdfilkjöjfoi093s09sdf"
]
}
]
}
So my question is: is it OK in such a case to "abuse" a POST request in order to query information?
Is there an elegant way to pass such data using a GET request?
Yes, you can pass this data with a GET request by putting it in a request header.
Use the header() method: initialize a String variable, say String data = /* your JSON */;, and pass it to the header with header("data", data) in your client while building the request.
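A minimal sketch of the same idea in Python, using only the standard library. The header name data comes from the answer above and the URL from the question; the function name is hypothetical. The request is only built here, not sent.

```python
import json
import urllib.request


def build_get_with_json_header(url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a GET request carrying JSON in a header."""
    # json.dumps escapes non-ASCII characters by default (ensure_ascii=True),
    # so the header value stays plain ASCII even for ids containing "ö".
    header_value = json.dumps(payload, separators=(",", ":"))
    return urllib.request.Request(url, headers={"data": header_value}, method="GET")


request = build_get_with_json_header(
    "http://api.something.com/locationInformation",
    {"userConfiguration": [{"id": "asdfasdfasdfs09898sdf", "values": ["false"]}]},
)
print(request.get_method(), request.full_url)
```

Keep in mind that many servers and proxies limit the total size of request headers, so a large, nested configuration may not fit.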
I'm building a new application using Parse.com as a backend, and I'm trying to make fewer requests to Parse. I have a class with a pointer to an object of another class.
Class1(things):
ObjectID Name Category(pointer)
JDFHSJFxv Apple QSGKqf343
Class2(Categories):
ObjectID Name Number Image
QSGKqf343 Fruits 45 http://myserver.com/fruits.jpeg
When I try to retrieve data for my first class, things, using the REST API, I get this JSON object:
{
"results": [
{
"Name": "Apple",
"createdAt": "2015-07-12T02:50:20.291Z",
"objectId": "JDFHSJFxv",
"category": {
"__type": "Pointer",
"className": "Teams",
"objectId": "QSGKqf343"
},
"updatedAt": "2015-07-12T02:55:33.696Z"
}
]
}
The JSON doesn't contain all the data of the object I'm pointing to, so I would have to make another request to get that object's data.
Is there any way to avoid that?
You need to tell Parse to return the related object in your query, via the include key.
E.g., add the following to your curl command: --data-urlencode 'include=category'
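As a rough sketch, this is how the query string could be built in Python. The base URL https://api.parse.com/1/classes/ is an assumption about the hosted Parse REST endpoint and the helper name is made up; only the include=category parameter comes from the answer.

```python
from urllib.parse import urlencode

# Assumption: base URL of the hosted Parse REST API for class queries.
PARSE_CLASSES_URL = "https://api.parse.com/1/classes/"


def build_query_url(class_name: str, include: str) -> str:
    """Return a class query URL asking Parse to inline a pointer field."""
    # "include" names the pointer column whose target object should be
    # returned in full instead of as a bare Pointer.
    return PARSE_CLASSES_URL + class_name + "?" + urlencode({"include": include})


print(build_query_url("things", "category"))
```

The request still needs your application's authentication headers, which are left out here.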
I tried this query to get the list of all dish instances
[{
"id": null,
"name": null,
"type": "/food/dish"
}]
But it only gives me the first page:
http://www.freebase.com/query?autorun=1&q=%5B%7B%22id%22:null,%22name%22:null,%22type%22:%22/food/dish%22%7D%5D
Question 1: How do I add paging to get all 2.5K or so dish instances? I tried adding "cursor: 2" and it didn't work.
Let's say I have the name "pizza". I tried this query to get the details of "pizza":
{
"*": null,
"name": "pizza",
"type": "/food/dish"
}
But that didn't give me the description and images shown on this page: http://www.freebase.com/m/0663v
Question 2: How do I get all the information, or at least the description and image URLs, like on the Freebase page above?
Bonus: I tried to do everything via Freebase Node.js here https://github.com/spencermountain/Freebase.js
I suggest you separate this into 2 questions, so each has their own topic, and it is easier for future visitors to search.
That said:
Question 1
You can increase the number of results you get per page by adding limit: to your query. Regardless, you will have to use paging. To use paging, you need to add the cursor parameter to your mqlread HTTP request. Again: cursor is not part of the MQL query itself, but rather of the HTTP envelope that submits it.
For the first query, issue an empty cursor, and for subsequent queries use the cursor value returned to you by mqlread.
Note that all this will need to be done with the API, not with freebase directly, and as such the URL will need to be:
https://www.googleapis.com/freebase/v1/mqlread?cursor=&query=[{"id":null,"name":null,"type":"/food/dish","limit":5}]
Also note that if you plan on doing this for anything other than testing you will need to obtain a key from Google.
Finally, note that some strings in Freebase are "freebase-encoded", read up on that regarding how to decode them in the result.
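The paging procedure described above can be sketched like this in Python. It assumes each mqlread response is a JSON object with a result list and a cursor value that becomes falsy on the last page. The fetch_page callable is a hypothetical helper that sends the query with the given cursor and returns the decoded response.

```python
from typing import Callable, Iterator


def read_all(fetch_page: Callable[[str], dict]) -> Iterator[dict]:
    """Yield all results by feeding each returned cursor into the next call.

    `fetch_page` is a hypothetical helper: it submits the MQL query with
    the given cursor in the HTTP envelope and returns the decoded JSON.
    """
    cursor = ""  # the first request is issued with an empty cursor
    while True:
        response = fetch_page(cursor)
        yield from response["result"]
        cursor = response.get("cursor")
        # Assumption: a missing or falsy cursor means no more pages.
        if not cursor:
            break
```

Injecting fetch_page keeps the loop independent of the HTTP client and of the API key handling.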
Question 2
If you just want the ingredient names then simply add "/dining/cuisine/ingredients": [] to your query. Note that many dishes do not have ingredients, but Pizza does:
{
"id": "/m/0663v",
"name": null,
"type": "/food/dish",
"/dining/cuisine/ingredients": []
}
Getting the images means adding "/common/topic/image": [{}] to your query, and using the returned id for each image.
Getting an image URL from a given image id is done by prepending https://usercontent.googleapis.com/freebase/v1/image/ to the id.
Edit
Tom correctly noted that I forgot about image descriptions. The description for each image will be available under name: in the returned /common/topic/image array. For example, for the query
[{
"id": "/en/minestrone",
"/common/topic/image": [{
"id": null,
"name": null
}]
}]
you get the following images and their descriptions:
{
"result": [{
"id": "/en/minestrone",
"/common/topic/image": [
{
"id": "/wikipedia/images/commons_id/1492185",
"name": "MinestroneSoup"
},
{
"id": "/wikipedia/images/commons_id/12565901",
"name": "Homemade minestrone"
}
]
}]
}
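A small Python sketch of turning such a result into (description, URL) pairs. The prefix is the one given above. Dropping the leading slash of each id before joining is my assumption, made to avoid a double slash in the URL. The function name is made up.

```python
from typing import List, Tuple

IMAGE_PREFIX = "https://usercontent.googleapis.com/freebase/v1/image/"


def image_links(response: dict) -> List[Tuple[str, str]]:
    """Return (description, image URL) pairs for every topic in the result."""
    links = []
    for topic in response["result"]:
        for image in topic["/common/topic/image"]:
            # Assumption: strip the id's leading "/" so the prefix and id
            # join with a single slash.
            links.append((image["name"], IMAGE_PREFIX + image["id"].lstrip("/")))
    return links


sample = {
    "result": [{
        "id": "/en/minestrone",
        "/common/topic/image": [
            {"id": "/wikipedia/images/commons_id/1492185", "name": "MinestroneSoup"},
        ],
    }]
}
print(image_links(sample))
```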
Your final MQL, then, is:
[{
"id": null,
"name": null,
"type": "/food/dish",
"/common/topic/image": [{
"id": null,
"name": null
}],
"/dining/cuisine/ingredients": []
}]
... and the HTTP envelope will contain a key and a value for cursor.
See Nitzan's answer for the answer to question 1.
For your second question, the easiest way to get descriptions and images is by using the Topic API e.g. https://www.googleapis.com/freebase/v1/topic/m/0663v
What is the right way to format your responses in JSON, and why? I've seen different services do it in two ways. Consider a simple GET /users resource:
{
"success": true,
"message": "User created successfully",
"data": [
{"id": 1, "name": "John"},
{"id": 2, "name": "George"},
{"id": 3, "name": "Bob"},
{"id": 4, "name": "Jane"}
]
}
That is how I usually do it. I have some abstract helper fields like success and message (there may be more), but the question is whether I should nest the data in the data field inside an array named after the resource, users:
{
"success": true,
"message": "User created successfully",
"data": {
"users": [
{"id": 1, "name": "John"},
{"id": 2, "name": "George"},
{"id": 3, "name": "Bob"},
{"id": 4, "name": "Jane"}
]
}
}
Even if we don't use the abstraction:
{
"users": [
{"id": 1, "name": "John"},
{"id": 2, "name": "George"},
{"id": 3, "name": "Bob"},
{"id": 4, "name": "Jane"}
]
}
The users key seems redundant, as any client knows the route it called, /users, which already names the resource, and client code like
$users = $request->perform('http://this.api/users')->body()->json_decode();
looks much better than
$users = $request->perform('http://this.api/users')->body()->json_decode()->users;
as it avoids repeated users.
One use case where the envelope can be useful is when you are expecting to be dealing with large lists and need to do pagination to prevent huge response payloads. The envelope is a good place to put the pagination meta data:
{
"users": [...],
"offset": 0,
"limit": 50,
"total": 10000
}
(This is what we do in a RESTful API I'm working on)
Clearly this is only relevant for requests that return lists of things (e.g. /users/) and not for requests that return single entities (e.g. /users/42). Even for requests that return lists, you don't have to use an envelope: one alternative is to put this metadata in response headers instead.
PS. I would only advise having success and message fields if you have a concrete use case for them. Otherwise don't bother, they are simply unnecessary.
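To illustrate the pagination envelope, here is a minimal Python sketch that slices an in-memory list and wraps it with the same offset, limit and total fields. The function name and the key parameter are hypothetical; a real API would usually push offset and limit down into the database query.

```python
from typing import Any, Dict, List


def paginate(items: List[Any], offset: int = 0, limit: int = 50,
             key: str = "users") -> Dict[str, Any]:
    """Wrap one page of `items` in an envelope with pagination metadata."""
    return {
        key: items[offset:offset + limit],  # the requested slice only
        "offset": offset,
        "limit": limit,
        "total": len(items),  # size of the full list, not of this page
    }


all_users = [{"id": i} for i in range(1, 11)]
print(paginate(all_users, offset=2, limit=3))
```

With total in the envelope, a client can tell how many pages remain without issuing an extra request.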
Just to get on the same page, data is a field in a JSON object. In the first example the value of data is an array. In the second example the value of data is an object.
Either is valid, so to answer your question: no, it is not necessary to nest named objects in a named object. It is necessary that all fields of an object be named, but you are free to nest arrays within an object.
It really just depends on what the processor expects. If data can be anything, then the first approach is fine. If code expects the value of the data field to be an object, then you have to use something like the second example.
Regarding the comment you added under the first comment: more descriptive data is better data, as every piece of information is useful to the consumer of your API (the REST endpoint). So if you know that the content is a user, or whatever it is, it's better to say so in the schema or the endpoint URL.
Better description = better consumption :-)