Preferred method to format logs from the Couchbase tracing package?

By default, all the logs from the tracing package (such as [com.couchbase.tracing][OverThresholdRequestsRecordedEvent]) are JSON documents such as:
{
"kv": {
"top_requests": [
{
"operation_name": "get",
"last_dispatch_duration_us": 113274,
"last_remote_socket": "xxx:11210",
"last_local_id": "78C03F4700000001/000000008AA7BA2A",
"last_local_socket": "xxx:49908",
"total_dispatch_duration_us": 113274,
"total_server_duration_us": 19,
"operation_id": "0x152f3",
"timeout_ms": 250,
"last_server_duration_us": 19,
"total_duration_us": 113388
}
],
"total_count": 1
}
}
As I use Splunk to process my logs, I would strongly prefer the logs to be in field=value format.
I am aware that Splunk can process JSON documents at runtime, but that adds overhead at query time.
Is there a recommended approach for this?
As an initial approach, I was thinking of subscribing to the Couchbase event bus and formatting the events there, but I am not sure that is the best approach:
clusterEnvironment.eventBus().subscribe(e -> {
    if (e instanceof OverThresholdRequestsRecordedEvent) {
        // format
    }
});
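Whichever hook ends up calling it, the formatting step amounts to flattening the nested document into dotted field=value pairs. The sketch below shows only that flattening logic, written in JavaScript for brevity. `flattenToPairs` is a hypothetical helper, not part of the Couchbase SDK, and the same recursion would have to be ported to Java to run inside the event bus subscriber.

```javascript
// Hypothetical helper (not a Couchbase API): flattens nested JSON into
// "dotted.path=value" pairs that Splunk can index as fields.
function flattenToPairs(value, prefix = '', out = []) {
  if (value !== null && typeof value === 'object') {
    // Objects and arrays both recurse; array indices become path segments.
    for (const [key, child] of Object.entries(value)) {
      const path = prefix ? `${prefix}.${key}` : key;
      flattenToPairs(child, path, out);
    }
  } else {
    // Quote strings so values containing spaces stay a single field.
    const rendered = typeof value === 'string' ? JSON.stringify(value) : String(value);
    out.push(`${prefix}=${rendered}`);
  }
  return out;
}

// Trimmed copy of the event body from the question.
const sampleEvent = {
  kv: {
    top_requests: [{ operation_name: 'get', total_duration_us: 113388, timeout_ms: 250 }],
    total_count: 1,
  },
};

const flatLine = flattenToPairs(sampleEvent).join(' ');
console.log(flatLine);
```

Including the array index in the path keeps entries of top_requests distinguishable when more than one request is reported.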


How to load a json file with unit ML test examples into jmeter http request payload

I have an endpoint that serves an ML model and I want to perform load testing on it. I'm using JMeter 4.0 and its UI to construct a simple test plan, with one thread group that loops for a given duration and continuously performs HTTPS requests.
How do I parse multiple test examples into the payload of an HTTP request, one by one and in JSON format? These examples are contained in a JSON file called samples.json. The nested structure is the following:
{
  "dataset": [
    {
      "id": 1,
      "in": [
        {
          "Feature1": 8.9,
          "Feature2": 7.1
        }
      ],
      "out": "Class1"
    },
    {
      "id": 2,
      "in": [
        {
          "Feature1": 3.2,
          "Feature2": 5.1
        }
      ],
      "out": "Class1"
    }
  ]
}
IMPORTANT: I do not know the number of attributes a priori, so I need to retrieve them from the in key, as they may change for other types of models. Therefore I can't use hardcoded JMeter variables, as is done in the CSV Data Set Config element, where the variable names must be specified for each column of the CSV file.
I have no idea how you're going to use the values from the JSON in the HTTP Request sampler; however, this is how you can parse your samples.json file and get the in values from it in the JSR223 Sampler:
new groovy.json.JsonSlurper().parse(new File('samples.json')).dataset.each { entry ->
    entry.get('in').each { feature ->
        feature.each { value ->
            log.info(value.key + '=' + value.value)
        }
    }
}
The above code basically prints the keys and their respective values to the jmeter.log file.
But you can easily amend it to store the values in JMeter Variables, write them to a CSV file, set the HTTP Request sampler to use them on the fly, etc.
More information:
Groovy: Parsing and producing JSON
Apache Groovy - Why and How You Should Use It
Apache JMeter API

How to interpret Fiware CYGNUS stats service output?

Starting from my own installation of the following Fiware components: Orion Context Broker, CYGNUS NGSI, Fiware STH and MongoDB, after a while I got the following result consuming a stats service which I found inside CYGNUS management API.
Service: GET http://<cygnus_host>:<management_port>/v1/stats
Result:
{
"success":"true",
"stats":{
"sources":[
{
"name":"http-source",
"status":"START",
"setup_time":"2018-05-10T13:35:06.194Z",
"num_received_events":78,
"num_processed_events":78
}
],
"channels":[
{
"name":"sth-channel",
"status":"START",
"setup_time":"2018-05-10T13:35:06.662Z",
"num_events":1,
"num_puts_ok":78,
"num_puts_failed":0,
"num_takes_ok":77,
"num_takes_failed":112
},
{
"name":"mongo-channel",
"status":"START",
"setup_time":"2018-05-10T13:35:06.662Z",
"num_events":0,
"num_puts_ok":78,
"num_puts_failed":0,
"num_takes_ok":78,
"num_takes_failed":139
},
{
"name":"hdfs-channel",
"status":"START",
"setup_time":"2018-05-10T13:35:06.662Z",
"num_events":1,
"num_puts_ok":78,
"num_puts_failed":0,
"num_takes_ok":77,
"num_takes_failed":35
}
],
"sinks":[
{
"name":"hdfs-sink",
"status":"START",
"setup_time":"2018-05-10T13:35:06.341Z",
"num_processed_events":77,
"num_persisted_events":0
},
{
"name":"mongo-sink",
"status":"START",
"setup_time":"2018-05-10T13:35:06.374Z",
"num_processed_events":78,
"num_persisted_events":78
},
{
"name":"sth-sink",
"status":"START",
"setup_time":"2018-05-10T13:35:06.380Z",
"num_processed_events":78,
"num_persisted_events":77
}
]
}
}
What caught my attention was the amount of num_takes_failed on each channel and here is my first question:
What exactly does this variable mean?
Looking into CYGNUS documentation I suppose that a "take" is something like a retry of a certain action in Flume Mongo channel but which action is that?
I looked at the MongoDB log files and did not find anything related to a connection saturation or similar problem, which brings me to my second question.
Should I worry about this statistic? If yes, how do I solve this problem?
Thank you very much in advance for any help.
You don't have to worry about num_takes_failed as long as the number of processed events (num_processed_events) equals the number of persisted events (num_persisted_events). num_takes_failed is the difference between the Flume channel metrics EventTakeAttemptCount and EventTakeSuccessCount. EventTakeAttemptCount is the total number of times the sink(s) attempted to read events from the channel. An attempt does not always return events, since a sink may poll while the channel has no data. EventTakeSuccessCount, on the other hand, is the total number of events that were successfully taken by the sink(s).
Moreover, if you want to know more about how the events are processed by the channels and sinks, you can run Cygnus in debug mode and check the log output to make sure that every event is processed and persisted correctly.
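The check described in this answer (compare processed and persisted events per sink, and read num_takes_failed as attempts minus successes) can be sketched as follows. This is an illustration only: `takesFailed` and `findLaggingSinks` are hypothetical helper names, and the input is a trimmed copy of the sinks section of the stats output from the question.

```javascript
// Hypothetical helper: num_takes_failed as described in the answer,
// i.e. take attempts minus successful takes.
function takesFailed(attemptCount, successCount) {
  return attemptCount - successCount;
}

// Hypothetical helper: names of sinks that persisted fewer events
// than they processed.
function findLaggingSinks(stats) {
  return stats.sinks
    .filter((sink) => sink.num_persisted_events < sink.num_processed_events)
    .map((sink) => sink.name);
}

// Trimmed copy of the "sinks" section from the question.
const cygnusStats = {
  sinks: [
    { name: 'hdfs-sink', num_processed_events: 77, num_persisted_events: 0 },
    { name: 'mongo-sink', num_processed_events: 78, num_persisted_events: 78 },
    { name: 'sth-sink', num_processed_events: 78, num_persisted_events: 77 },
  ],
};

const laggingSinks = findLaggingSinks(cygnusStats);
console.log(laggingSinks);

// 217 attempts is an assumed figure, chosen so that 78 successful takes
// reproduce the 139 failed takes reported for mongo-channel.
console.log(takesFailed(217, 78));
```

With the numbers from the question, hdfs-sink and sth-sink are the sinks that persisted fewer events than they processed, so those would be the ones to look at in the debug log.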

getDegree()/isOutgoing() functions don't work in graphAware/neo4j-to-elasticsearch mapping.json

Neo4j Version: 3.2.2
Operating System: Ubuntu 16.04
I use the getDegree() function in my mapping.json file, but it always returns null. I'm using the Movie/Actor dataset from the Neo4j tutorial.
Output from elasticsearch request
mapping.json
{
"defaults": {
"key_property": "uuid",
"nodes_index": "default-index-node",
"relationships_index": "default-index-relationship",
"include_remaining_properties": true
},
"node_mappings": [
{
"condition": "hasLabel('Person')",
"type": "getLabels()",
"properties": {
"getDegree": "getDegree()",
"getDegree(type)": "getDegree('ACTED_IN')",
"getDegree(direction)": "getGegree('OUTGOING')",
"getDegree('type', 'direction')": "getDegree('ACTED_IN', 'OUTGOING')",
"getDegree-degree": "degree"
}
}
],
"relationship_mappings": [
{
"condition": "allRelationships()",
"type": "type",
}
]
}
Also, if I use isOutgoing(), isIncoming(), or otherNode in the properties part of relationship_mappings, Elasticsearch never loads the relationship data from Neo4j. I think I may be misunderstanding the sentence "only when one of the participating nodes "looking" at the relationship is provided" on this page: https://github.com/graphaware/neo4j-framework/tree/master/common#inclusion-policies
mapping.json
{
"defaults": {
"key_property": "uuid",
"nodes_index": "default-index-node",
"relationships_index": "default-index-relationship",
"include_remaining_properties": true
},
"node_mappings": [
{
"condition": "allNodes()",
"type": "getLabels()"
}
],
"relationship_mappings": [
{
"condition": "allRelationships()",
"type": "type",
"properties": {
"isOutgoing": "isOutgoing()",
"isIncomming": "isIncomming()",
"otherNode": "otherNode"
}
}
]
}
BTW, is there any page that lists all of the functions that we can use in mapping.json? I know two of them:
github.com/graphaware/neo4j-framework/tree/master/common#inclusion-policies
github.com/graphaware/neo4j-to-elasticsearch/blob/master/docs/json-mapper.md
but it seems there are more, since I can use getType(), which hasn't been listed in any of the above pages.
Please let me know if I can provide any further help to solve the problem
Thanks!
The getDegree() function is not available, unlike getType(). I will explain why:
When the mapper (the part responsible for creating the node or relationship representation as an ES document) does its job, it receives a DetachedGraphObject, that is, a detached node or relationship.
Detached means that this happens outside of a transaction, so query operations against the database are no longer available. getType() is available because it is part of the relationship metadata and is cheap. Doing the same for getDegree(), however, could be considerably more costly during the DetachedObject creation (which happens in a transaction), depending on the number of different types, etc.
This is, however, something we are working on, by externalising the mapper into a standalone Java application coupled with a broker such as Kafka or RabbitMQ between Neo4j and this app. We will not, however, offer the possibility to requery the graph in the current version of the module, as it can have serious performance impacts if the user is not very careful.
Lastly, the only suggestion I can give you is to keep a property on your node holding the degree values you need to replicate to ES.
UPDATE
Regarding this part of the documentation :
For Relationships only when one of the participating nodes "looking" at the relationship is provided:
This is used only when not using the JSON definition, so you can use one or the other. The JSON definition was added later, and the two cannot be used together.
To answer this part: it means that the nodes on the incoming or outgoing side, depending on the definition, should be included in the inclusion policy for nodes, for example hasLabel('Employee') || hasProperty('form') || getProperty('age', 0) > 20. If you have an allNodes policy, then it is fine.

Ember Data and mapping JSON objects

I have truly searched and I have not found a decent example of using the serializer to get objects from a differently formatted JSON response. My reason for not changing the format of the JSON response is outlined here http://flask.pocoo.org/docs/security/#json-security.
I'm not very good with javascript yet so I had a hard time understanding the hooks in the serialize_json.js or maybe I should be using mapping (I just don't know). So here is an example of my JSON response for many objects:
{
  "total_pages": 1,
  "objects": [
    {
      "is_completed": true,
      "id": 1,
      "title": "I need to eat"
    },
    {
      "is_completed": false,
      "id": 2,
      "title": "Hey does this work"
    },
    {
      "is_completed": false,
      "id": 3,
      "title": "Go to sleep"
    }
  ],
  "num_results": 3,
  "page": 1
}
When ember-data tries to use this I get the following error:
DEBUG: -------------------------------
DEBUG: Ember.VERSION : 1.0.0-rc.1
DEBUG: Handlebars.VERSION : 1.0.0-rc.3
DEBUG: jQuery.VERSION : 1.9.1
DEBUG: -------------------------------
Uncaught Error: assertion failed: Your server returned a hash with the key total_pages but you have no mapping for it
Which totally makes sense when you look at my code for the data store:
Todos.Store = DS.Store.extend({
  revision: 12,
  adapter: DS.RESTAdapter.create({
    mappings: {objects: "Todos.Todo"},
    namespace: 'api'
  })
});
My question is how do I deal with total_pages, num_results and page? And by deal, I mean ignore so I can just map the objects array.
All root properties you return in your JSON result are mapped to a DS.Model in Ember Data. You should not return properties that are not modelled or you should model them.
If you want to get rid of the error you should make an empty model for the properties you don't use.
Read more here
Why are you returning properties you don't want to use? Or is it out of your control?
The way to accomplish this is with a custom serializer. If all your data is returned from the server in this format you could simply create ApplicationSerializer like this:
Todos.ApplicationSerializer = DS.RESTSerializer.extend({
  // Drop the pagination metadata before Ember Data maps the payload.
  normalizePayload: function(type, payload) {
    delete payload.total_pages;
    delete payload.num_results;
    delete payload.page;
    return payload;
  }
});
That should allow Ember Data to consume your API seamlessly.
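The payload clean-up itself can be tried outside Ember. The sketch below is a plain function, not Ember Data API: `stripPaginationMeta` is a hypothetical name, and it returns a copy rather than mutating the payload the way the serializer hook does.

```javascript
// Hypothetical helper (not Ember Data API): returns a copy of the payload
// without the pagination keys that have no model mapping.
function stripPaginationMeta(payload, metaKeys = ['total_pages', 'num_results', 'page']) {
  const cleaned = {};
  for (const [key, value] of Object.entries(payload)) {
    if (!metaKeys.includes(key)) {
      cleaned[key] = value;
    }
  }
  return cleaned;
}

// Shortened version of the server response from the question.
const serverPayload = {
  total_pages: 1,
  objects: [
    { is_completed: true, id: 1, title: 'I need to eat' },
    { is_completed: false, id: 2, title: 'Hey does this work' },
  ],
  num_results: 2,
  page: 1,
};

const cleanedPayload = stripPaginationMeta(serverPayload);
console.log(Object.keys(cleanedPayload));
```

Returning a copy keeps the original response available in case the pagination values are needed elsewhere, for example to drive a pager.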
Ember is fairly opinionated about how things are done, and Ember Data is no exception. The Ember team works towards certain standards that it thinks are best, which is, in my opinion, a good thing.
Check out this post on where Ember is going. TL;DR: because there are so many varying implementations of API calls, they're focusing their efforts on supporting the JSON API.
From my understanding, there is no easy way to do what you're asking. Your best bet is to write your own custom adapter and serializer. This shouldn't be too hard to do, and has been done before. I recommend having a look at the Tastypie adapter used for Python's Django Tastypie.

Is there any standard for JSON API response format?

Do standards or best practices exist for structuring JSON responses from an API? Obviously, every application's data is different, so that much I'm not concerned with, but rather the "response boilerplate", if you will. An example of what I mean:
Successful request:
{
"success": true,
"payload": {
/* Application-specific data would go here. */
}
}
Failed request:
{
"success": false,
"payload": {
/* Application-specific data would go here. */
},
"error": {
"code": 123,
"message": "An error occurred!"
}
}
Yes, there are a couple of standards (albeit with some liberties taken on the definition of standard) that have emerged:
JSON API - JSON API covers creating and updating resources as well, not just responses.
JSend - Simple and probably what you are already doing.
OData JSON Protocol - Very complicated.
HAL - Like OData but aiming to be HATEOAS like.
There are also JSON API description formats:
Swagger
JSON Schema (used by swagger but you could use it stand alone)
WADL in JSON
RAML
HAL because HATEOAS in theory is self describing.
Google JSON guide
Success responses return data:
{
"data": {
"id": 1001,
"name": "Wing"
}
}
Error responses return error:
{
"error": {
"code": 404,
"message": "ID not found"
}
}
and if your client is JS, you can use if ("error" in response) {} to check if there is an error.
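As a minimal sketch of that client-side check, assuming the data/error envelope shown in this answer: `unwrap` is a hypothetical helper name, and turning the error member into a thrown Error is one possible design, not something the guide prescribes.

```javascript
// Hypothetical client helper for the { data } / { error } envelope:
// returns the data on success, throws on an error response.
function unwrap(response) {
  if ('error' in response) {
    const err = new Error(response.error.message);
    err.code = response.error.code; // keep the numeric code for callers
    throw err;
  }
  return response.data;
}

const wing = unwrap({ data: { id: 1001, name: 'Wing' } });

let caughtError = null;
try {
  unwrap({ error: { code: 404, message: 'ID not found' } });
} catch (e) {
  caughtError = e;
}

console.log(wing.name, caughtError.code, caughtError.message);
```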
I guess a de facto standard has not really emerged (and may never).
But regardless, here is my take:
Successful request:
{
"status": "success",
"data": {
/* Application-specific data would go here. */
},
"message": null /* Or optional success message */
}
Failed request:
{
"status": "error",
"data": null, /* or optional error payload */
"message": "Error xyz has occurred"
}
Advantage: Same top-level elements in both success and error cases
Disadvantage: No error code, but if you want, you can either change the status to be a (success or failure) code, -or- you can add another top-level item named "code".
I am assuming your question is about REST web service design and, more precisely, about reporting success and errors.
I think there are 3 different types of design.
1. Use only the HTTP status code to indicate whether there was an error, and try to limit yourself to the standard ones (usually that should suffice).
   Pros: It is a standard independent of your API.
   Cons: Less information on what really happened.
2. Use the HTTP status plus a JSON body (even for an error). Define a uniform structure for errors (e.g. code, message, reason, type, etc.) and use it for errors; on success, just return the expected JSON response.
   Pros: Still standard, as you use the existing HTTP status codes, and you return JSON describing the error (you provide more information on what happened).
   Cons: The output JSON varies depending on whether it is an error or a success.
3. Forget the HTTP status (e.g. always return status 200), always use JSON, and add at the root of the response a boolean responseValid and an error object (code, message, etc.) that is populated if it is an error; otherwise the other (success) fields are populated.
   Pros: The client deals only with the body of the response, which is a JSON string, and ignores the status(?).
   Cons: The least standard.
It's up to you to choose :)
Depending on the API I would choose 2 or 3 (I prefer 2 for json rest apis).
Another thing I have experienced in designing REST Api is the importance of documentation for each resource (url): the parameters, the body, the response, the headers etc + examples.
I would also recommend you to use jersey (jax-rs implementation) + genson (java/json databinding library).
You only have to drop genson + jersey in your classpath and json is automatically supported.
EDIT:
Solution 2 is the hardest to implement, but the advantage is that you can handle exceptions nicely and not only business errors. The initial effort is greater, but you win in the long term.
Solution 3 is easy to implement on both the server side and the client, but it is not as nice, since you have to encapsulate the objects you want to return in a response object that also contains responseValid and error.
The RFC 7807: Problem Details for HTTP APIs is at the moment the closest thing we have to an official standard.
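For illustration, here is a problem details object built from the members RFC 7807 defines (type, title, status, detail, instance), which is served with the application/problem+json media type. The helper name `problemDetails`, the example type URI, and the instance path are assumptions made for this sketch.

```javascript
// Media type registered by RFC 7807 for JSON problem details.
const PROBLEM_JSON = 'application/problem+json';

// Hypothetical helper: builds a problem details object from the members
// the RFC defines. "type" defaults to "about:blank" when not supplied.
function problemDetails({ type = 'about:blank', title, status, detail, instance }) {
  const problem = { type, title, status };
  if (detail !== undefined) problem.detail = detail;
  if (instance !== undefined) problem.instance = instance;
  return problem;
}

// The type URI and instance path below are made-up example values.
const validationProblem = problemDetails({
  type: 'https://example.com/probs/validation-error',
  title: 'Validation Failed',
  status: 400,
  detail: 'Invalid phone number.',
  instance: '/users/42',
});

console.log(PROBLEM_JSON, JSON.stringify(validationProblem));
```

The status member repeats the HTTP status code of the response, so the body stays meaningful even when it is logged or forwarded without its headers.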
The following is the JSON format Instagram is using:
{
  "meta": {
    "error_type": "OAuthException",
    "code": 400,
    "error_message": "..."
  },
  "data": {
    ...
  },
  "pagination": {
    "next_url": "...",
    "next_max_id": "13872296"
  }
}
I will not be so arrogant as to claim that this is a standard, so I will use the "I prefer" form.
I prefer terse response (when requesting a list of /articles I want a JSON array of articles).
In my designs I use HTTP for status report, a 200 returns just the payload.
400 returns a message of what was wrong with request:
{"message" : "Missing parameter: 'param'"}
Return 404 if the model/controller/URI doesn't exist.
If there was error with processing on my side, I return 501 with a message:
{"message" : "Could not connect to data store."}
From what I've seen quite a few REST-ish frameworks tend to be along these lines.
Rationale:
JSON is supposed to be a payload format, it's not a session protocol. The whole idea of verbose session-ish payloads comes from the XML/SOAP world and various misguided choices that created those bloated designs. After we realized all of it was a massive headache, the whole point of REST/JSON was to KISS it, and adhere to HTTP. I don't think that there is anything remotely standard in either JSend and especially not with the more verbose among them. XHR will react to HTTP response, if you use jQuery for your AJAX (like most do) you can use try/catch and done()/fail() callbacks to capture errors. I can't see how encapsulating status reports in JSON is any more useful than that.
For what it's worth I do this differently. A successful call just has the JSON objects. I don't need a higher level JSON object that contains a success field indicating true and a payload field that has the JSON object. I just return the appropriate JSON object with a 200 or whatever is appropriate in the 200 range for the HTTP status in the header.
However, if there is an error (something in the 400 family) I return a well-formed JSON error object. For example, if the client is POSTing a User with an email address and phone number and one of these is malformed (i.e. I cannot insert it into my underlying database) I will return something like this:
{
  "description" : "Validation Failed",
  "errors" : [ {
    "field" : "phoneNumber",
    "message" : "Invalid phone number."
  } ]
}
Important bits here are that the "field" property must match the JSON field exactly that could not be validated. This allows clients to know exactly what went wrong with their request. Also, "message" is in the locale of the request. If both the "emailAddress" and "phoneNumber" were invalid then the "errors" array would contain entries for both. A 409 (Conflict) JSON response body might look like this:
{
  "description" : "Already Exists",
  "errors" : [ {
    "field" : "phoneNumber",
    "message" : "Phone number already exists for another user."
  } ]
}
With the HTTP status code and this JSON, the client has all it needs to respond to errors in a deterministic way, and it does not create a new error standard that tries to completely replace HTTP status codes. Note that these bodies are only returned for errors in the 400 range. For anything in the 200 range I can just return whatever is appropriate. For me it is often a HAL-like JSON object, but that doesn't really matter here.
The one thing I thought about adding was a numeric error code, either in the "errors" array entries or at the root of the JSON object itself. But so far we haven't needed it.
There is no agreement on REST API response formats among the big software giants (Google, Facebook, Twitter, Amazon and others), though many links have been provided in the answers above, where some people have tried to standardize the response format.
As the needs of APIs can differ, it is very difficult to get everyone on board and agree on one format. If you have millions of users using your API, why would you change your response format?
Following is my take on the response format inspired by Google, Twitter, Amazon and some posts on internet:
https://github.com/adnan-kamili/rest-api-response-format
Swagger file:
https://github.com/adnan-kamili/swagger-sample-template
The point of JSON is that it is completely dynamic and flexible. Bend it to whatever whim you would like, because it's just a set of serialized JavaScript objects and arrays, rooted in a single node.
What the type of the rootnode is is up to you, what it contains is up to you, whether you send metadata along with the response is up to you, whether you set the mime-type to application/json or leave it as text/plain is up to you (as long as you know how to handle the edge cases).
Build a lightweight schema that you like.
Personally, I've found that analytics-tracking and mp3/ogg serving and image-gallery serving and text-messaging and network-packets for online gaming, and blog-posts and blog-comments all have very different requirements in terms of what is sent and what is received and how they should be consumed.
So the last thing I'd want, when doing all of that, is to try to make each one conform to the same boilerplate standard, which is based on XML2.0 or somesuch.
That said, there's a lot to be said for using schemas which make sense to you and are well thought out.
Just read some API responses, note what you like, criticize what you don't, write those criticisms down and understand why they rub you the wrong way, and then think about how to apply what you learned to what you need.
JSON-RPC 2.0 defines a standard request and response format, and is a breath of fresh air after working with REST APIs.
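To make that request and response format concrete, here is a minimal sketch of a JSON-RPC 2.0 handler. The `rpcMethods` table and `handleRpc` function are hypothetical names for this illustration; the envelope members (jsonrpc, method, params, id, result, error) and the error codes -32600 and -32601 come from the specification.

```javascript
// Hypothetical method table for this sketch.
const rpcMethods = {
  subtract: ([a, b]) => a - b,
};

// Minimal handler: validates the envelope, dispatches by method name and
// wraps the outcome in a JSON-RPC 2.0 response object.
// Notifications (requests without an id) and batches are not handled here.
function handleRpc(request) {
  const { jsonrpc, method, params, id } = request;
  if (jsonrpc !== '2.0' || typeof method !== 'string') {
    return {
      jsonrpc: '2.0',
      error: { code: -32600, message: 'Invalid Request' },
      id: id === undefined ? null : id,
    };
  }
  if (!Object.prototype.hasOwnProperty.call(rpcMethods, method)) {
    return { jsonrpc: '2.0', error: { code: -32601, message: 'Method not found' }, id };
  }
  return { jsonrpc: '2.0', result: rpcMethods[method](params), id };
}

const okReply = handleRpc({ jsonrpc: '2.0', method: 'subtract', params: [42, 23], id: 1 });
const missingReply = handleRpc({ jsonrpc: '2.0', method: 'nope', id: 2 });
console.log(JSON.stringify(okReply), JSON.stringify(missingReply));
```

A response carries either result or error, never both, which sidesteps the "success flag plus error object" ambiguity discussed in other answers.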
The basic framework suggested looks fine, but the error object as defined is too limited. One often cannot use a single value to express the problem, and instead a chain of problems and causes is needed.
I did a little research and found that the most common format for returning error (exceptions) is a structure of this form:
{
"success": false,
"error": {
"code": "400",
"message": "main error message here",
"target": "approx what the error came from",
"details": [
{
"code": "23-098a",
"message": "Disk drive has frozen up again. It needs to be replaced",
"target": "not sure what the target is"
}
],
"innererror": {
"trace": [ ... ],
"context": [ ... ]
}
}
}
This is the format proposed by the OASIS data standard OASIS OData and seems to be the most standard option out there, however there does not seem to be high adoption rates of any standard at this point. This format is consistent with the JSON-RPC specification.
You can find the complete open source library that implements this at: Mendocino JSON Utilities. This library supports the JSON Objects as well as the exceptions.
The details are discussed in my blog post on Error Handling in JSON REST API
For those coming later, in addition to the accepted answer that includes HAL, JSend, and JSON API, I would add a few other specifications worth looking into:
JSON-LD, which is a W3C Recommendation and specifies how to build interoperable Web Services in JSON
Ion Hypermedia Type for REST, which describes itself as "a simple and intuitive JSON-based hypermedia type for REST"
There is no lawbreaking or outlaw standard other than common sense. If we abstract this as two people talking, the standard is the best way for them to understand each other accurately in the fewest words and the least time. In our case, 'minimum words' means optimizing bandwidth for transport efficiency, and 'accurately understand' means a structure that is efficient to parse. Ultimately this means the less data and the more common the structure, the better, so that the response can pass through a pinhole and be parsed with a common scope (at least initially).
In almost every case suggested, I see separate responses for the 'Success' and 'Error' scenarios, which seems ambiguous to me. If the responses are different in these two cases, then why do we really need to put a 'Success' flag there? Is it not obvious that the absence of 'Error' is a 'Success'? Is it possible to have a response where 'Success' is TRUE with an 'Error' set? Or, the other way round, where 'Success' is FALSE with no 'Error' set? Isn't just one flag enough? I would prefer to have the 'Error' flag only, because I believe there will be fewer 'Error' responses than 'Success' responses.
Also, should we really make the 'Error' a flag? What about if I want to respond with multiple validation errors? So, I find it more efficient to have an 'Error' node with each error as child to that node; where an empty (counts to zero) 'Error' node would denote a 'Success'.
I used to follow this standard; it was pretty good, easy, and clean on the client layer.
Normally the HTTP status is 200, so that is a standard check I do at the top, and I normally use the JSON shown below.
I also use a template for the APIs:
dynamic response = new ExpandoObject(); // requires: using System.Dynamic;
try {
    // query and what not.
    response.payload = new {
        data = new {
            pagination = new Pagination(),
            customer = new Customer(),
            notifications = 5
        }
    };
    // again something here if we get here success has to be true
    // I follow an exit first strategy, instead of building a pyramid
    // of doom.
    response.success = true;
}
catch (Exception exception) {
    response.success = false;
    response.message = exception.StackTrace;
    _logger.Fatal(exception, this.GetFacadeName());
}
return response;
{
  "success": boolean,
  "message": "some message",
  "payload": {
    "data": [],
    "message": "",
    ... // put whatever you want to here.
  }
}
on the client layer I would use the following:
if (response.code != 200) {
  // whoops, something went wrong.
  return;
}
if (!response.success) {
  console.debug(response.message);
  return;
}
// if we are here then success has to be true.
if (response.payload) {
  ....
}
Notice how I return early, avoiding the pyramid of doom.
I use this structure for REST APIs:
{
"success": false,
"response": {
"data": [],
"pagination": {}
},
"errors": [
{
"code": 500,
"message": "server 500 Error"
}
]
}
A bit late, but here is my take on HTTP error responses: I send the code (via the status), a generic message, and details. Details are optional per endpoint; some errors are self-explanatory, but the details can be a custom message or even a full stack trace, depending on the use case. Success uses a similar format: code, message, and any data in the data property.
ExpressJS response examples:
// Error
res
  .status(422)
  .json({
    error: {
      message: 'missing parameters',
      details: `missing ${missingParam}`,
    }
  });
// or
res
  .status(422)
  .json({
    error: {
      message: 'missing parameters',
      details: 'expected: {prop1, prop2, prop3}',
    }
  });
// Success
res
  .status(200)
  .json({
    message: 'password updated',
    data: {member: { username }}, // [] ...
  });
The best response format for web APIs is one that mobile developers can easily understand.
This is for a "Success" response:
{
"code":"1",
"msg":"Successfull Transaction",
"value":"",
"data":{
"EmployeeName":"Admin",
"EmployeeID":1
}
}
This is for "Error" Response
{
"code": "4",
"msg": "Invalid Username and Password",
"value": "",
"data": {}
}