API response: difference between string and numeric type - json

I have an API used by a mobile app. It has an endpoint named /feed
It returns a collection of objects with different types.
[
{
"type": "USER",
"value": 5632,
},
{
"type": "IMAGE",
"value": 1412,
},
]
I am wondering, whether the type should be a string or a number like this:
[
{
"type": 100,
"value": 5632,
},
{
"type": 200,
"value": 1412,
},
]
Is there a significant difference between the two? The IOS developer of the app stated that numbers are easier to compare than strings.
Found a similar question but it does not have any answers.

Yes, numbers are faster to compare than strings, that's a fact. Then, whether it is a significant gain or not will depend on the algorithm where this json is parsed, the quantity of data it contains, etc. And furthermore, sacrificing code design for performances is often a bad option. It really depends on your project.

Related

Elasticsearch dynamic mapping for object within attribute

Wondering if I can create a "dynamic mapping" within an elasticsearch index. The problem I am trying to solve is the following: I have a schema that has an attribute that contains an object that can differ greatly between records. I would like to mirror this data within elasticsearch if possible but believe that automatic mapping may get in the way.
Imagine a scenario where I have a schema like the following:
{
name: string
origin: string
payload: object // can be of any type / schema
}
Is it possible to create a mapping that supports this? I do not need to query the records by this payload attribute, but it would be great if I can.
Note that I have checked the documentation but am confused on if what elastic calls dynamic mapping is what I am looking for.
It's certainly possible to specify which queryable fields you expect the payload to contain and what those fields' mappings should be.
Let's say each doc will include the fields payload.livemode and payload.created_at. If these are the only two fields you'll want to perform queries on, and you'd like to disable dynamic, index-time mappings autogenerated by Elasticsearch for the rest of the fields, you can use dynamic templates like so:
PUT my-payload-index
{
"mappings": {
"dynamic_templates": [
{
"variable_payload": {
"path_match": "payload",
"mapping": {
"type": "object",
"dynamic": false,
"properties": {
"created_at": {
"type": "date",
"format": "yyyy-MM-dd HH:mm:ss"
},
"livemode": {
"type": "boolean"
}
}
}
}
}
],
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
},
"origin": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
}
}
}
}
}
}
Then, as you ingest your docs:
POST my-payload-index/_doc
{
"name": "abc",
"origin": "web.dev",
"payload": {
"created_at": "2021-04-05 08:00:00",
"livemode": false,
"abc":"def"
}
}
POST my-payload-index/_doc
{
"name": "abc",
"origin": "web.dev",
"payload": {
"created_at": "2021-04-05 08:00:00",
"livemode": true,
"modified_at": "2021-04-05 09:00:00"
}
}
and verify with
GET my-payload-index/_mapping
no new mappings will be generated for the fields payload.abc nor payload.modified_at.
Not only that — the new fields will also be ignored, as per the documentation:
These fields will not be indexed or searchable, but will still appear in the _source field of returned hits.
Side note: if fields are neither stored nor searchable, they're effectively the opposite of enabled.
The Big Picture
Working with variable contents of a single, top-level object is quite standard. Take for instance the stripe event object — each event has an id, an api_version and a few other shared params. Then there's the data object that's analogous to your payload field.
Now, all is fine, until you need to aggregate on the contents of your payload. See, since the content is variable, so are the data paths / accessors. But wildcards in aggregation paths don't work in Elasticsearch. Scripts do but are onerous to maintain.
Back to stripe. They partially solved it through what they call polymorphic, typed hashes — as discussed in their blog on API design:
A pretty neat approach that's worth emulating.
P.S. I discuss dynamic templates in more detail in the chapter "Mapping Automation" of my ES Handbook.

Camunda rest engine json number as BigDecimal

I've downloaded the camunda bpm platform in order to evaluate dmn decisions by using the rest api.
For that I deploy this decision table:
And send this json request:
{
"variables":
{
"a": {
"value": 100.01
},
"b": {
"value": 10.01
}
}
}
I receive the following response:
[
{
"result": {
"type": "Double",
"value": 110.02000000000001,
"valueInfo": {}
}
}
]
I expect that the value of "result" were "110.02" but instead it gives "110.02000000000001".
The issue is that camunda "engine-rest" receives numbers as "Double", so by making a sum it looses precision.
It's there a way for making camunda "engine-rest" receives numbers from json as "BigDecimal" instead of "Double" in order to not loose precision?.
This thread discusses your problem. Looks like it's an open bug.
"Below, you can find a test class which evaluates a simple decision table with a decimal result type, showcasing it being rounded to the nearest double value if it's not surrounded by quotes."
Sounds just like what you're looking for.
Best regards
Sebastian

Is returning only IDs for a JSON API collection allowed?

So let's say I have a resources called articles. These have a numeric id and you can access them under something such as:
GET /articles/1 for a specific article.
And let's say that returns something like:
{
"data": {
"type": "articles",
"id": "1",
"attributes": {
"title": "JSON:API paints my bikeshed!",
"body": "A bunch of text here"
}
}
}
Now my question is how to handle a request to GET /articles. I.e. how to deal with the request to the collection.
You see, accessing the body of the article is slow and painful. The last thing I want this REST API to do is actually try to get all that information. Yet as far as I can tell the JSON API schema seems to assume that you can always return full resources.
Is there any "allowed" way to return just the IDs (or partial attributes, like "title") under JSON API while actively not providing the ability to get the full resource?
Something like:
GET /articles returning:
{
"data": [
{
"type": "article_snubs",
"id": 1,
"attributes": {
"title": "JSON:API paints my bikeshed!"
}
}, {
"type": "article_snubs",
"id": 2,
"attributes": {
"title": "Some second thing here"
}
}
]
}
Maybe with links to the full articles?
Basically, is this at all possible while following JSON API or a REST standard? Because there is absolutely no way that GET /articles is ever going to be returning full resources due to the associate cost of getting the data, which I do not think is a rare situation to be in.
As far as I understand the JSON API specification there is no requirement that an API must return all fields (attributes and relationships) of a resource by default. The only MUST statement regarding fields inclusion that I'm aware of is related to Sparse Fieldsets (fields query param):
Sparse Fieldsets
[...]
If a client requests a restricted set of fields for a given resource type, an endpoint MUST NOT include additional fields in resource objects of that type in its response.
https://jsonapi.org/format/#fetching-sparse-fieldsets
Even so this is not forbidden by spec I would not recommend that approach. Returning only a subset of fields makes consuming your API much harder as you have to consult the documentation in order to get a list of all supported fields. It's much more within the meaning of the spec to let the client decide which information (and related resources) should be included.
The "attributes" object of a JSON-API doc does not need to be a complete representation:
attributes: an attributes object representing some of the resource’s data.
You can provide a "self" link to get the full representation, or perhaps even a "body" link to get just the body:
links: a links object containing links related to the resource.
E.g.
{
"data": [
{
"type": "articles_snubs",
"id": "1",
"attributes": {
"title": "JSON API paints my bikeshed!"
},
"links": {
"self": "/articles/1",
"body": "/articles/1/body"
}
},
{
"type": "article_snubs",
"id": "2",
"attributes": {
"title": "Some second thing here"
},
"links": {
"self": "/articles/2",
"body": "/articles/2/body"
}
}
]
}

How to store different kind of datas in single bucket in couchbase

I want to store the diff kind of data in same bucket in Couchbase, am googled its suggest doc_type and type parameter , but i don't know how i implement that one in my data,I tried below code but i don't know how to do that?
i didn't seen any example,any help will be appreciate,
JsonObject doc = JsonObject.fromJson(consumerRecord.value().toString());
String id=consumerRecord.key().toString();
bucket.upsert(JsonDocument.create(id, doc));
Couchbase buckets store documents of any kind. You don't have to do anything special. As long as it's valid JSON, it can be stored in a Couchbase bucket.
Now, what you might want is some way to differentiate between different types of documents (for querying purposes, for instance). So, you can put a "type" field (or docType or whatever you'd like) as a kind of 'marker' value. You can see examples of this in the built-in travel-sample bucket. For instance, look at the airline_10 and the route_10000 documents:
airline_10
{
"callsign": "MILE-AIR",
"country": "United States",
"iata": "Q5",
"icao": "MLA",
"id": 10,
"name": "40-Mile Air",
"type": "airline"
}
route_10000
{
"airline": "AF",
"airlineid": "airline_137",
"destinationairport": "MRS",
"distance": 2881.617376098415,
"equipment": "320",
"id": 10000,
"schedule": [ ... ],
"sourceairport": "TLV",
"stops": 0,
"type": "route"
}
Notice that one has a type of "airline", the other has a type of "route". An example of a query to find routes:
SELECT t.*
FROM `travel-sample` t
WHERE t.type = 'route'
LIMIT 10
This is one way to differentiate documents. Also notice that the keys are constructed semantically, so they could also be used. Neither of these are a requirement, so it really depends on how you plan to access documents.

Elasticsearch mapping of nested structure

I'm looking for some pointers on mapping a somewhat dynamic structure for consumption by Elasticsearch.
The raw structure itself is json, but the problem is that a portion of the structure contains a variable, rather than the outer elements of the structure being static.
To provide a somewhat redacted example, my json looks like this:
"stat": {
"state": "valid",
"duration": 5,
},
"12345-abc": {
"content_length": 5,
"version": 2
}
"54321-xyz": {
"content_length": 2,
"version", 1
}
The first block is easy; Elasticsearch does a great job of mapping the "stat" portion of the structure, and if I were to dump a lot of that data into an index it would work as expected. The problem is that the next 2 blocks are essentially the same thing, but the raw json is formatted in such a way that a unique element has crept into the structure, and Elasticsearch wants to map that by default, generating a map that looks like this:
"stat": {
"properties": {
"state": {
"type": "string"
},
"duration": {
"type": "double"
}
}
},
"12345-abc": {
"properties": {
"content_length": {
"type": "double"
},
"version": {
"type": "double"
}
}
},
"54321-xyz": {
"properties": {
"content_length": {
"type": "double"
},
"version": {
"type": "double"
}
}
}
I'd like the ability to index all of the "content_length" data, but it's getting separated, and with some of the variable names being used, when I drop the data into Kibana I wind up with really long fieldnames that become next to useless.
Is it possible to provide a generic tag to the structure? Or is this more trivially addressed at the json generation phase, with our developers hard coding a generic structure name and adding an identifier field name.
Any insight / help greatly appreciated.
Thanks!
If those keys like 12345-abc are generated and possibly infinite values, it will get hard (if not impossible) to do some useful queries or aggregations. It's not really clear which exact use case you have for analyzing your data, but you should probably have a look at nested objects (https://www.elastic.co/guide/en/elasticsearch/guide/current/nested-objects.html) and generate your input json accordingly to what you want to query for. It seems that you will have better aggregation results if you put these additional objects into an array with a special field containing what is currently your key.
{
"stat": ...,
"things": [
{
"thingkey": "12345-abc",
"content_length": 5,
"version": 2
},
...
]
}