Cannot construct instance of `org.apache.drill.exec.server.rest.QueryWrapper` - apache-drill

I'm running Apache Drill out of the box in embedded mode.
When I send a POST request to localhost:8047/query.json, it produces 400 with error:
Cannot construct instance of `org.apache.drill.exec.server.rest.QueryWrapper`, problem: null
at [Source: (org.glassfish.jersey.message.internal.EntityInputStream); line: 4, column: 1]
Request:
{
"QueryType": "SQL",
"Query": "SELECT count(*) as `cnt` FROM dfs.`/data/demo/Parquet/*.parquet`"
}
Content-Type: application/json
The same thing happens when running in distributed mode.
Running the query through the web interface seems OK.
According to Google, I'm the only one with this error.
Any ideas?

There was a dumb bug in the request.
The fields in the request must be in camelCase: queryType, not QueryType. The capital 'Q's in the previous request came from wrong serializer settings.
I did not notice this detail for an hour.
This works:
{
"queryType": "SQL",
"query": "SELECT count(*) as `cnt` FROM dfs.`/data/demo/Parquet/*.parquet`"
}
It would be nice if the API returned a normal error like "queryType is missing" instead of enigmatic Cannot construct instance of org.apache.drill.exec.server.rest.QueryWrapper.
Hope this will save somebody an hour of life.
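A minimal Python sketch of building the request body with the field names that worked above. The helper name build_drill_query is hypothetical, and the commented-out POST assumes Drill is listening on localhost:8047 as in the question.

```python
import json


def build_drill_query(sql: str) -> str:
    """Serialize a Drill REST query body using camelCase field names."""
    # The keys must be exactly "queryType" and "query";
    # "QueryType"/"Query" produced the 400 error described above.
    payload = {"queryType": "SQL", "query": sql}
    return json.dumps(payload)


body = build_drill_query(
    "SELECT count(*) as `cnt` FROM dfs.`/data/demo/Parquet/*.parquet`"
)
print(body)

# To send it (assumes Drill is reachable at localhost:8047):
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8047/query.json",
#     data=body.encode("utf-8"),
#     headers={"Content-Type": "application/json"},
#     method="POST",
# )
# print(urllib.request.urlopen(req).read().decode("utf-8"))
```

If the body comes from a serializer, it is worth printing it once like this to check the key casing before sending.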

Fiware Orion batch duplicate

I'm using APPEND_STRICT, but I'm having trouble understanding a certain concept.
For example, I have a single entity in Fiware Orion (already created) and want to create, let's say, 1000 entities in a batch using APPEND_STRICT (v2/op/update).
Among the 1000 entities there is 1 duplicate (the entity I mentioned that is already in Orion).
Is it correct that Orion returns error 422 without indicating the id of the entity that already exists? The error only lists the entity's attributes (I understand why, given the concept of APPEND_STRICT), but showing the id would really help.
Also, if the duplicate entity is at position 400, Orion sends the error but continues to write the remaining entities. This is really hard to manage, because I cannot know when the whole write is done, and I have to show some response while Orion is still working on them in the background.
Are my assumptions correct, and can something be done to avoid this? Is there something I failed to notice?
Thanks.
Edit
Error message:
{ error: 'Unprocessable',
description: 'one or more of the attributes in the request already exist:
[ family, serialNumber, refSortingType, description, refType, storedWasteOrigin, location, address, fillingLevel, cargoWeight, temperature, methaneConcentration, regulation, responsible, owner, dateServiceStarted, dateLastEmptying, nextActuationDeadline, actuationHours, openingHours, dateLastCleaning, nextCleaningDeadline, refDepositPointIsle, status, color, image, annotations, areaServed, dateModified, refDevice ]' } } }
Example request:
{ method: 'POST',
headers:
{ 'Content-Type': 'application/json',
'Fiware-Service': 'waste4think',
'Fiware-ServicePath': '/d',
'X-Auth-Token': 'DssfKZe82e1dyJof416EmrQPdFQ3QK1' },
uri: 'http://localhost:1026/v2/op/update',
body: { actionType: 'APPEND_STRICT', entities: [Array] }
{"actionType":"APPEND_STRICT","entities":[{"id":"xxx","type":"xxx","family":{"value":"Agent","type":"String","metadata":{}},"serialNumber":{"value":"","type":"String","metadata":{}},"refSortingType":{"value":"SortingType:2","type":"String","metadata":{}},"description":{"value":"","type":"String","metadata":{}},"refType":{"value":"DepositPointType:0","type":"String","metadata":{}},"storedWasteOrigin":{"value":"","type":"String","metadata":{}},"location":{"value":{"type":"Point","coordinates":[xxx]},"type":"geo:json"},"address":{"value":"xxxxx.","type":"String","metadata":{}},"fillingLevel":{"value":0,"type":"Float","metadata":{"unit":{"value":"C62","type":"String"}}},"cargoWeight":{"value":0,"type":"Float","metadata":{"unit":{"value":"KGM","type":"String"}}},"temperature":{"value":0,"type":"Float","metadata":{"unit":{"value":"CEL","type":"String"}}},"methaneConcentration":{"value":0,"type":"Float","metadata":{"unit":{"value":"59","type":"String"}}},"regulation":{"value":"Municipal association","type":"String","metadata":{}},"responsible":{"value":"","type":"String","metadata":{}},"owner":{"value":"xxx","type":"String","metadata":{}},"dateServiceStarted":{"value":"","type":"String","metadata":{}},"dateLastEmptying":{"value":"","type":"String","metadata":{}},"nextActuationDeadline":{"value":"","type":"String","metadata":{}},"actuationHours":{"value":[],"type":"List","metadata":{}},"openingHours":{"value":[],"type":"List","metadata":{}},"dateLastCleaning":{"value":"","type":"String","metadata":{}},"nextCleaningDeadline":{"value":"","type":"String","metadata":{}},"refDepositPointIsle":{"value":"","type":"String","metadata":{}},"status":{"value":"ok","type":"String","metadata":{}},"color":{"value":"","type":"String","metadata":{}},"image":{"value":"","type":"String","metadata":{}},"annotations":{"value":"","type":"String","metadata":{}},"areaServed":{"value":"","type":"String","metadata":{}},"dateModified":{"value":"","type":"String","metadata":{}},"refDevice":{"value":""
,"type":"String","metadata":{}}}]}
As for the request, I split the POST part and the body part. As you can see from the error message, it is not possible to know which entity caused this.
I think the functionality is as you describe. Orion responds with a list of the attributes that already exist, but not the entity they belong to. A response like this would probably be more useful:
'one or more of the attributes in the request already exist:
entity23: [ family, serialNumber], entity 42: [refSortingType, description]'
with some capping (e.g. at most 20 entities) to prevent overly large responses.
If you think implementing something like that would be interesting, please create a new issue about it in the Orion repository.
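To make the proposed message format concrete, here is a small Python sketch that builds such a description with a cap. This is only an illustration of the suggestion above, not something Orion does; the function name describe_conflicts and the "(and N more entities)" suffix are assumptions.

```python
def describe_conflicts(conflicts: dict, cap: int = 20) -> str:
    """Build a per-entity error description (hypothetical format, not Orion's)."""
    # Keep only the first `cap` entities so the response stays small.
    shown = list(conflicts.items())[:cap]
    parts = [f"{entity_id}: [{', '.join(attrs)}]" for entity_id, attrs in shown]
    text = (
        "one or more of the attributes in the request already exist: "
        + ", ".join(parts)
    )
    omitted = len(conflicts) - len(shown)
    if omitted > 0:
        # Assumed wording for entities left out by the cap.
        text += f" (and {omitted} more entities)"
    return text


example = {
    "entity23": ["family", "serialNumber"],
    "entity42": ["refSortingType", "description"],
}
print(describe_conflicts(example))
```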
Some additional comments:
APPEND_STRICT is deprecated. The right keyword is appendStrict.
Regarding "Orion send error but continue to write remaining entities, this is really hard to manage because I cannot know when a total write is done and have to show some response while Orion still works on them in the background": Orion doesn't respond until it has finished processing the whole batch in the POST /v2/op/update request. So your REST client can be sure that, when the response is received, everything has been processed (although that processing could involve errors due to duplicated attributes in some entities, as we have been discussing). Have you experienced a different behaviour? In that case, how did you get it? (If Orion is behaving that way it could be a bug, and I'd like to know about it in order to debug it.)

SODA API ERROR: code": "permission_denied", "error": true, "message": "Invalid app_token specified"

I'm using Socrata to access data.
The issue I'm having is this: when I do NOT use my app_token key (String1, see below), it works fine and returns current data, but when I do use my app_token (String2), I get the error shown below. If I use my app_token alone, without any extra data fields like draw_date (draw_date=2016-06-24T00:00:00.000), it works, so I know it's not my key. Any reason why? How do I get this to work correctly?
String1 (WORKS): https://data.ny.gov/resource/h6w8-42p9.json?draw_date=2016-06-24T00:00:00.000
String2 (DOESN'T WORK): https://data.ny.gov/resource/h6w8-42p9.json?$$app_token=MY-TOKEN?draw_date=2016-06-24T00:00:00.000
Getting Error (With String2):
SODA code": "permission_denied", "error": true, "message": "Invalid app_token specified"
First, I'm pretty confident MY-TOKEN isn't your app token, but just in case, make sure you've signed up for a real app token.
Second:
https://data.ny.gov/resource/h6w8-42p9.json?$$app_token=MY-TOKEN?draw_date=2016-06-24T00:00:00.000
...should instead be:
https://data.ny.gov/resource/h6w8-42p9.json?$$app_token=MY-TOKEN&draw_date=2016-06-24T00:00:00.000
There should be an ampersand (&) between your $$app_token and draw_date parameters. The question mark (?) is only used to separate the URL from the parameter set. To our query parser, it looks like your app token is MY-TOKEN?draw_date=2016-06-24T00:00:00.000.
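One way to avoid this kind of mistake is to let a library assemble the query string. The sketch below uses Python's standard urllib.parse; MY-TOKEN is still a placeholder. It also parses the broken URL to show the whole tail ending up inside the token value. Python's parser is only a stand-in here and is not Socrata's query parser.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://data.ny.gov/resource/h6w8-42p9.json"

params = {
    "$$app_token": "MY-TOKEN",  # placeholder, not a real app token
    "draw_date": "2016-06-24T00:00:00.000",
}
# urlencode joins the parameters with "&"; safe="$:" leaves "$" and ":"
# unescaped so the result reads like the hand-written URL.
good_url = BASE_URL + "?" + urlencode(params, safe="$:")
print(good_url)

# The broken URL: a second "?" where "&" should be.
bad_url = BASE_URL + "?$$app_token=MY-TOKEN?draw_date=2016-06-24T00:00:00.000"
bad_params = parse_qs(urlsplit(bad_url).query)
print(bad_params)  # a single parameter whose value swallowed "?draw_date=..."
```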

How to do a RESTful GET on an indefinite number of parameters?

I have a collection of IDs of RESTful resources (all the same type of resource), the number of which can be indefinitely large. I want to make a REST call to get the names of these resources. Something like this:
Send:
['005fc983-fe41-43b5-8555-d9a2310719cd', '4c6e6898-e519-4bac-b03e-e8873d3fa3f0',...]
Receive:
['Resource A', 'Resource B',...]
What is the best way to retrieve the names of these resources RESTfully?
Here are the ideas I have had and the problems I see with each approach:
The naive approach is to iterate through all IDs in my collection and do a 'GET /resource/:id' for each ID. This would be prohibitively slow and resource intensive because of the large number of HTTP calls I would have to make.
The next approach I thought of is to pass the IDs as parameters to a single GET call. The problem here is that most servers have a limit on the URL length, which would be quickly exceeded.
Next, I thought that putting the IDs in the body of a GET would work, but according to Roy Fielding, data in the GET body should not affect the results of a REST call: HTTP GET with request body
I could use a POST request and put the data on the POST body, but POST is intended for creating and modifying resources, which is not what I'm doing. Maybe I should ignore the intent of the verb and use it anyway?
I could split the request into multiple GET requests to avoid exceeding the max URL length. The problem here is that I have to combine the results after all calls have returned, which is potentially slow.
I could create a collection resource within my main resource by posting my list of IDs to 'POST /resource/collection', then use a 'GET /resource/collection/:id' call to retrieve the results. This actually works, but then I have to do a 'DELETE /resource/collection/:id' to clean up. It takes multiple calls, requires cleanup, and seems a bit clunky overall, so it's okay, but not ideal.
Is there a better way to do this?
Your last approach is RESTful and the one I recommend. I'd do this:
Step 1:
Request:
POST /resource/collection
Content-Type: application/json
{
"ids": [
"005fc983-fe41-43b5-8555-d9a2310719cd",
"4c6e6898-e519-4bac-b03e-e8873d3fa3f0"
]
}
Response:
201 Created
Location: /resource/collection/89AB8902-FDF1-11E4-ADDF-CD4FB664A5DC
Step 2:
Request:
GET /resource/collection/89AB8902-FDF1-11E4-ADDF-CD4FB664A5DC
Response:
200 OK
Content-Type: application/json
{
"resources": [ ... ]
}
You wrote: "but then I have to do a 'DELETE /resource/collection/:id' to clean up."
No, that is not necessary. The server could run a job that removes all collections older than a specific timestamp. It is not the client who has to do this.
If a client later accesses a collection that has been removed, the server would respond with
410 Gone
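The flow above, including the server-side cleanup, can be sketched with a small in-memory store in Python. Everything here is illustrative: the class CollectionStore, its method names, the injected clock and the TTL value are assumptions, and a real server would wire these methods to its HTTP framework.

```python
import uuid


class CollectionStore:
    """In-memory stand-in for the server side of /resource/collection."""

    def __init__(self, names_by_id, ttl_seconds, clock):
        self._names_by_id = names_by_id  # resource id -> resource name
        self._ttl = ttl_seconds
        self._clock = clock              # callable returning the current time
        self._live = {}                  # collection id -> (created_at, ids)
        self._expired = set()            # ids of collections already purged

    def create(self, ids):
        """POST /resource/collection: returns (status, Location)."""
        collection_id = str(uuid.uuid4())
        self._live[collection_id] = (self._clock(), list(ids))
        return 201, "/resource/collection/" + collection_id

    def purge(self):
        """Cleanup job run by the server, not by the client."""
        now = self._clock()
        stale = [c for c, (t, _) in self._live.items() if now - t > self._ttl]
        for collection_id in stale:
            del self._live[collection_id]
            self._expired.add(collection_id)

    def get(self, collection_id):
        """GET /resource/collection/:id: 200, 410 if purged, else 404."""
        if collection_id in self._expired:
            return 410, None
        if collection_id not in self._live:
            return 404, None
        _, ids = self._live[collection_id]
        names = [self._names_by_id[i] for i in ids if i in self._names_by_id]
        return 200, {"resources": names}


now = [0.0]  # fake clock so the example does not need to sleep
store = CollectionStore(
    {"a": "Resource A", "b": "Resource B"}, ttl_seconds=60, clock=lambda: now[0]
)
status, location = store.create(["a", "b"])
cid = location.rsplit("/", 1)[1]
first = store.get(cid)
now[0] = 120.0  # two minutes later
store.purge()
second = store.get(cid)
print(status, first, second)
```

Remembering purged ids is what lets the server answer 410 Gone rather than 404; a real implementation would eventually forget them as well.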

REST resource returning different object depending on state

I'm trying to define a REST API and I'm having trouble with one requirement.
I have an action that the API user can do that is the same thing, but can be done in two different ways.
For example, say my user uses my API to change the intensity of a light. I will have a URL something like
api/light/intensity
One option the user has is to set the intensity as a % of the maximum luminosity. The other option is to set the intensity as an exact value in lumens (there is a detector for that); in that case he can also pass the "precision", which can be low, medium or high (it changes the time it takes to reach the correct intensity).
I want the user to be able to GET the current intensity, meaning in which mode he is and depending on the mode, the % or the value in lumens and the precision.
This is where I'm lost, my GET will return a JSON object for example, is it OK to send something like
{
"Mode": "Percent",
"Percent": 50.5
}
when I'm in "percentage" mode and
{
"Mode": "Exact",
"Lumens": 200,
"Precision": "High"
}
When I'm in "lumens" mode?
If that seems OK, how would I tell the user which type of "object" he should parse?
What would be the best way to let the user send his changes? I was thinking about having two URLs, one for each mode, like
PUT /api/light/intensity/exact and PUT /api/light/intensity/percent
with both expecting JSON objects similar to the ones above, without the Mode.
Use HTTP Content negotiation. This allows:
the client to tell the server what representation of a resource it wants to GET,
the server to tell the client what representation of a resource it returns to the client,
the client to tell the server what representation of a resource it is PUTting to the server.
Define two vendor content types:
application/vnd.com.example.light.intensity.percentage+json
application/vnd.com.example.light.intensity.lumens+json
The client tells the server which of the two it wants:
GET /api/light/intensity/
Accept: application/vnd.com.example.light.intensity.percentage+json
The server responds:
200 OK
Content-Type: application/vnd.com.example.light.intensity.percentage+json
{
"Percent": 50.5
}
The client wants to change the intensity:
PUT /api/light/intensity/
Content-Type: application/vnd.com.example.light.intensity.percentage+json
{
"Percent": 42.7
}
The server knows from the Content-Type header how to interpret the JSON body. In this example it handles the request in 'Percent' mode.
If the second content type were used, client and server would know to interpret the request/response in 'Lumens' mode.
Edit: Note that the GET and PUT requests use the same URL because the requests are about the same resource: the light intensity. All that differs is the representation of this resource. The proper way to handle this is with content types.
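On the server side, the dispatch on Content-Type could look roughly like the Python sketch below. It uses the two vendor types defined above; the function name parse_intensity_put, the returned dictionary shape, and the "Lumens"/"Precision" keys (taken from the question's example) are assumptions for illustration only.

```python
import json

PERCENT_TYPE = "application/vnd.com.example.light.intensity.percentage+json"
LUMENS_TYPE = "application/vnd.com.example.light.intensity.lumens+json"


def parse_intensity_put(content_type: str, body: str) -> dict:
    """Interpret a PUT body according to its Content-Type header."""
    if content_type not in (PERCENT_TYPE, LUMENS_TYPE):
        # A real server would answer 415 Unsupported Media Type here.
        raise ValueError("unsupported media type: " + content_type)
    data = json.loads(body)
    if content_type == PERCENT_TYPE:
        return {"mode": "percent", "percent": float(data["Percent"])}
    return {
        "mode": "exact",
        "lumens": float(data["Lumens"]),
        "precision": data["Precision"],
    }


print(parse_intensity_put(PERCENT_TYPE, '{"Percent": 42.7}'))
print(parse_intensity_put(LUMENS_TYPE, '{"Lumens": 200, "Precision": "High"}'))
```

The URL never enters the decision: the same PUT /api/light/intensity/ handler serves both representations.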
The specifics will depend a bit on your API, and the needs of your users. The same GET method call to a RESTful API should always return the same value: a representation of the resource as defined by the information in the URL and nothing else. If you're maintaining state in the system, you're violating a precept of REST. (Edit: as pointed out by Gimly, that statement is unclear. It's not a violation of RESTful design for the system to maintain its own internal state, especially if a request changes the state of the system with a PUT, POST or DELETE. It's a violation for a request to rely on that state to return a representation of the resource, or to request a state change. Each request should be self-contained.)
I'd use a query string to change the format of the representation:
GET /api/light/intensity
GET /api/light/intensity?f=percent
That way /api/light/intensity always refers to the same resource (defaulting to the "exact" representation, which has the most data), and the query string "filters" the representation, similarly to a search query. It removes some data (in this case, the exact luminosity and precision) in favor of a relative representation in percent of some maximum value. Alternately, you could think of it as controlling the output format: GET /foo.json vs GET /foo.xml. The resource is the same, but the representation differs.
For updating a resource, you can take an object as you've described. Your server will have to understand the different formats, but you could either PUT to the bare URL, or again use a query parameter to control the format expected by the server, and then let your payload be more abstract, using value instead of lumens or percentage:
PUT /api/light/intensity
Payload: {"value": 200, "precision": "high"}
PUT /api/light/intensity?f=percent
Payload: {"value": 50.5}
That allows you to structure the API for your light resource in such a way that intensity is one property of the resource. "Percent" then becomes a convenience representation in the output, so when you return the entire light resource, it would read something like:
"light": {
"name": "the light",
"id": 12345,
"intensity": 200,
"max-intensity": 400,
...
}
So the API user could calculate the current percent based on intensity and max-intensity. (You could of course substitute "percent" for "max-intensity" and let the user do the math the other way, but it feels more natural to me to provide absolute values and let the client calculate relative values.)
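For instance, a client could derive the percentage like this. It is a minimal Python sketch using the field names from the example resource; the function name intensity_percent is hypothetical.

```python
def intensity_percent(light: dict) -> float:
    """Relative intensity in percent, derived from the absolute values."""
    return 100.0 * light["intensity"] / light["max-intensity"]


light = {"name": "the light", "id": 12345, "intensity": 200, "max-intensity": 400}
print(intensity_percent(light))  # 50.0
```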
Edit
Please see Tichodroma's answer for the better way of handling this. I'm leaving the answer because the discussion in the comments was useful to me, and may be useful to others in the future.

Rails 3: How to return errors in a JSON request?

How can I return a 800, 404, etc error when a user makes a JSON/XML request to my API?
I've tried
error 404, {:error => "ERror".to_json }
with no success.
Also, I've tried adding a "respond_to", but that doesn't work either (it duplicates the respond_to and gives an error).
Thanks
The same way you return such errors with HTML: the status code is part of the HTTP response.
render json: @myobject, status: :unprocessable_entity
Update, response to comment:
You can get all the status codes from Rack. Rails passes the symbolized status to Rack:
Rack::Utils.status_code(options[:status])
which simply matches the symbol against the list of statuses (the strings are converted to symbols).
Here is the smoking fresh list: https://github.com/rack/rack/blob/master/lib/rack/utils.rb#L575-L638
Scroll a bit lower and you'll see the status_code method. It's fun to read the source code!