JSON document inserted as binary object in Couchbase

I'm trying to insert a Java POJO into the Couchbase store via a cas call, and the serialized JSON looks like this:
{
  "key": "sampleKey",
  "myMap": {
    "Messages": [
      { "field": "f1", "label": "l1" },
      { "field": "f2", "label": "l2" },
      { "field": "f3", "label": "l3" },
      { "field": "f4", "label": "l4" }
    ],
    "Orders": [
      { "field": "f1", "label": "l1" },
      { "field": "f2", "label": "l2" },
      { "field": "f3", "label": "l3" },
      { "field": "f4", "label": "l4" },
      { "field": "f5", "label": "l5" }
    ]
  }
}
I have verified that this is valid JSON, yet it is still being inserted as a binary object: when I look up this document via the Couchbase GUI, it shows up as a base64-encoded string. A couple of other documents are fine, though. I am wondering if this happens only for the cas method and not for set.
The relevant Java code is this:
String myJson = objectMapper.writeValueAsString(cacheObject); // serialize the POJO to a JSON string
CASResponse response = couchbaseClient.cas(cacheObject.getKey(), casValue.getCas(), myJson, PersistTo.MASTER); // check-and-set, persisted to the master node
// The Java POJO being serialized
public class CacheObject
{
    private String key;
    private Map<String, List<FieldLabel>> myMap = new HashMap<String, List<FieldLabel>>();
    // setters and getters
}
Any pointers on why this could be happening will be appreciated.
Update 1: I'm using Couchbase Java client version 1.4.4 against server version 2.5.
Update 2: I don't think this has to do with my code or JSON. I tried replacing my JSON with a different large (valid) document and saw the same result in the Couchbase GUI. I suspect this happens because the document size goes over 2.5KB; the JSON pasted above has the actual fields and labels removed, and the real strings are slightly longer.
Strangely, when I modify the document, versions below roughly 960 characters generally show up as JSON, while slightly larger ones are displayed as binary.

If the size of the document is above 2.5KB, the document will not be editable in the console; this threshold can be changed in a file called documents.js.

Elasticsearch dynamic mapping for object within attribute

Wondering if I can create a "dynamic mapping" within an Elasticsearch index. The problem I am trying to solve is the following: I have a schema with an attribute containing an object that can differ greatly between records. I would like to mirror this data within Elasticsearch if possible, but believe that automatic mapping may get in the way.
Imagine a scenario where I have a schema like the following:
{
  name: string
  origin: string
  payload: object  // can be of any type / schema
}
Is it possible to create a mapping that supports this? I do not need to query the records by this payload attribute, but it would be great if I can.
Note that I have checked the documentation but am confused about whether what Elastic calls dynamic mapping is what I am looking for.
It's certainly possible to specify which queryable fields you expect the payload to contain and what those fields' mappings should be.
Let's say each doc will include the fields payload.livemode and payload.created_at. If these are the only two fields you'll want to perform queries on, and you'd like to disable dynamic, index-time mappings autogenerated by Elasticsearch for the rest of the fields, you can use dynamic templates like so:
PUT my-payload-index
{
  "mappings": {
    "dynamic_templates": [
      {
        "variable_payload": {
          "path_match": "payload",
          "mapping": {
            "type": "object",
            "dynamic": false,
            "properties": {
              "created_at": {
                "type": "date",
                "format": "yyyy-MM-dd HH:mm:ss"
              },
              "livemode": {
                "type": "boolean"
              }
            }
          }
        }
      }
    ],
    "properties": {
      "name": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      },
      "origin": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      }
    }
  }
}
Then, as you ingest your docs:
POST my-payload-index/_doc
{
  "name": "abc",
  "origin": "web.dev",
  "payload": {
    "created_at": "2021-04-05 08:00:00",
    "livemode": false,
    "abc": "def"
  }
}

POST my-payload-index/_doc
{
  "name": "abc",
  "origin": "web.dev",
  "payload": {
    "created_at": "2021-04-05 08:00:00",
    "livemode": true,
    "modified_at": "2021-04-05 09:00:00"
  }
}
and verify with
GET my-payload-index/_mapping
no new mappings will be generated for the fields payload.abc or payload.modified_at.
Not only that — the new fields will also be ignored, as per the documentation:
These fields will not be indexed or searchable, but will still appear in the _source field of returned hits.
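The explicitly declared fields remain searchable, though. A standard term query against one of them confirms this:
GET my-payload-index/_search
{
  "query": {
    "term": {
      "payload.livemode": true
    }
  }
}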
Side note: fields that are neither indexed nor searchable behave much like an object mapped with enabled: false, i.e., they are effectively disabled.
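For comparison, here's roughly what disabling the whole payload object would look like (a sketch with a hypothetical index name; the data stays in _source but no index structures are built for it):
PUT my-disabled-payload-index
{
  "mappings": {
    "properties": {
      "payload": {
        "type": "object",
        "enabled": false
      }
    }
  }
}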
The Big Picture
Working with variable contents of a single, top-level object is quite standard. Take for instance the Stripe event object: each event has an id, an api_version, and a few other shared params. Then there's the data object, which is analogous to your payload field.
Now, all is fine until you need to aggregate on the contents of your payload. Since the content is variable, so are the data paths/accessors. But wildcards in aggregation paths don't work in Elasticsearch; scripts do, but they're onerous to maintain.
Back to Stripe: they partially solved this through what they call polymorphic, typed hashes, as discussed in their blog post on API design. It's a pretty neat approach that's worth emulating.
P.S. I discuss dynamic templates in more detail in the chapter "Mapping Automation" of my ES Handbook.

Using Cygnus to write non-string values to MongoDB

I'm trying to use Orion Context Broker and Cygnus to write information about water quality and water consumption, and I need to store the values as floats. However, I can't figure out whether it's possible to write them in float or double format.
Could someone tell me if this possibility exists?
As stated in the FIWARE Orion documentation, you are free to specify your entities' attributes using JSON.
So you will have your entity in the following format:
{
  "id": "entityID",
  "type": "entityType",
  "attr_1": <val_1>,
  "attr_2": <val_2>,
  ...
  "attr_N": <val_N>
}
where each <val_n> has the following format:
{
  "type": <...>,
  "value": <...>,
  "metadata": <...>
}
Thus, you can have some entity described, for example, as:
{
  "id": "sensor_ID",
  "type": "room_sensor",
  "temperature": {
    "type": "float",
    "value": 23.2
  },
  "noise": {
    "type": "integer",
    "value": 35
  }
}
Therefore, you can use float or double as you want.
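For instance, creating that entity through Orion's NGSIv2 API keeps the values numeric end to end (a minimal sketch, assuming a local Orion instance on the default port 1026 and a JavaScript runtime with fetch available):
// Create the entity via POST /v2/entities; the values stay JSON numbers, not strings.
const entity = {
  id: "sensor_ID",
  type: "room_sensor",
  temperature: { type: "float", value: 23.2 },
  noise: { type: "integer", value: 35 }
};

fetch("http://localhost:1026/v2/entities", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify(entity)
}).then((res) => console.log(res.status)); // 201 on success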

Why does a numeric key in the JSON Structure always get displayed first

(Cannot summarize the problem in a single statement, hence the ambiguous title)
I create a JSON structure via Angular/TypeScript; when a user interacts with certain parts of the component, the JSON structure gets updated.
Steps
The JSON under consideration is initially set to the following default:
{
  "keyword": {
    "value": "product",
    "type": "main"
  }
}
For example, a user chooses some parameter Name. Once the user completes certain steps in the UI, the JSON structure gets updated to the following:
{
  "keyword": {
    "value": "product",
    "type": "main"
  },
  "Name": {
    "value": " <hasProperty> Name",
    "type": "dataprop"
  }
}
Once the user selects a numeric value for a parameter like dryTime, the JSON gets updated to the following:
{
  "20": { // WHY WOULD 20 be here?
    "value": "<hasValue> 20",
    "type": "fValue"
  },
  "keyword": {
    "value": "Varnish",
    "type": "main"
  },
  "Name": {
    "value": " <hasProperty> Name",
    "type": "dataprop"
  },
  "dryingTime": {
    "value": " <hasProperty> dryingTime",
    "type": "dataprop"
  }
}
I understand that a JSON object is an unordered data structure. But a previous implementation of something similar actually worked well, i.e., the key 20 here was 20.0 before, and it was displayed after dryingTime in my JSON.
The order is critical for me, as I parse all the keys in the above JSON using a for loop and store them in an array. This array needs to list all the keys in the order of the user interaction.
Where am I going wrong here if I decide to stay with JSON and not with an array to store such interactions?
Yes, JSON object fields are unordered; only a JSON array is ordered. On top of that, JavaScript engines enumerate integer-like string keys such as "20" first, in ascending numeric order, before the remaining keys in insertion order; the ECMAScript specification mandates this. That's why "20" jumps to the front, while "20.0" is not an integer-like key and therefore kept its insertion position in your earlier implementation.
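You can reproduce this in any browser console:
// Integer-like string keys are enumerated first, in ascending numeric order,
// before the remaining keys in insertion order.
const obj = {};
obj["keyword"] = 1;
obj["Name"] = 2;
obj["20.0"] = 3; // not integer-like: keeps its insertion position
obj["20"] = 4;   // integer-like: jumps to the front
console.log(Object.keys(obj)); // ["20", "keyword", "Name", "20.0"]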
If you want to keep the order of the elements inserted, you could build your JSON like so:
{
  "keyword": {
    "value": "Varnish",
    "type": "main"
  },
  "props": [
    {
      "name": "dryingTime",
      "value": 20
    },
    {
      "name": "anotherOrderedField",
      "value": "fieldValue"
    }
  ]
}
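Iterating the props array then yields the fields in user-interaction order, whatever their names are (data being the parsed object above):
// Array order is guaranteed, so this logs the fields in insertion order.
data.props.forEach((prop) => console.log(prop.name, prop.value));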

Issue with JSON Schema object validation (How does it work internally)?

BACKGROUND:
I have mocked up my issue here for reference:
https://jsonschemalint.com/#/version/draft-06/markup/json?gist=4c185903d7aeb13f6977852a09cf5462
and I am using this npm package: https://www.npmjs.com/package/jsonschema
CODE
// I read in JSON from files (the contents of which are below) and parse it into JSON objects. This process works fine.
var jsonDef = JSON.parse(schemaFile); // I store this jsonDef in my database as an object
var jsonObj = JSON.parse(objFile);
var jsonv = new JSONValidator();
var validateResult = jsonv.validate(jsonObj, jsonDef); // jsonDef is read from my database
// validateResult.valid is true
PROBLEM:
I have a general schema + metadata definition like so ("props" contains the actual object schema I want to validate)
schemaFile:
{
  "name": "schoolDemo",
  "displayName": "School Demo",
  "propertiesKey": "assetId",
  "props": {
    "assetId": {
      "type": "string"
    },
    "grade": {
      "type": "number"
    }
  }
}
objFile:
{
  "assetId": "75255972",
  "grade": "A"
}
However, when I validate the user-input object above, it succeeds. Shouldn't it fail because:
(1) there is no "properties" element in the initial metadata+schema definition? This field seems to be required based on the examples shown here: https://www.npmjs.com/package/jsonschema
(2) the type for grade is not a number
What am I doing wrong?
The validation is passing because the schema is not well formed: the "properties" keyword is missing. JSON Schema validators ignore unknown keywords such as "props", so your schema imposes no constraints at all and everything validates.
Try this instead and the validation will fail, since "A" is not a number:
{
  "name": "schoolDemo",
  "displayName": "School Demo",
  "propertiesKey": "assetId",
  "properties": {
    "assetId": {
      "type": "string"
    },
    "grade": {
      "type": "number"
    }
  }
}
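For completeness, this is how the failure surfaces with the jsonschema npm package from the question (a sketch; the exact error text may vary between versions):
var Validator = require('jsonschema').Validator;

var schema = {
  "properties": {
    "assetId": { "type": "string" },
    "grade": { "type": "number" }
  }
};

var v = new Validator();
var result = v.validate({ "assetId": "75255972", "grade": "A" }, schema);
console.log(result.valid);             // false: "A" is not a number
console.log(result.errors.toString()); // e.g. instance.grade is not of a type(s) number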

Reading complex json data without iteration

I am working with data that is often nested, and I am required to perform CRUD operations based on the structure of the data I have. For instance, I have this JSON structure:
{
  "_id": "KnNLkJEhrDsvWedLu",
  "createdAt": {
    "$date": "2016-10-13T11:24:13.843Z"
  },
  "services": {
    "password": {
      "bcrypt": "$2a$30$1/cniPwPNCuwZ/MQDPQkLej..cAATkoGX.qD1TS4iHgf/pwZYE.j."
    },
    "email": {
      "verificationTokens": [
        {
          "token": "qxe_T9IS7jW7gntpK0Q7UQ35RJ9jO9m2lclnokO3z87",
          "address": "drwho#gmail.com",
          "when": {
            "$date": "2016-10-13T11:24:14.428Z"
          }
        }
      ]
    },
    "resume": {
      "loginTokens": []
    }
  },
  "username": "doctorwho",
  "emails": [
    {
      "address": "drwho#gmail.com",
      "verified": false
    }
  ],
  "persodata": {
    "lastlogin": {
      "$date": "2016-10-13T11:29:36.816Z"
    },
    "fname": "Doctor",
    "lname": "Who",
    "mobile": "+4480000000",
    "identity": "1",
    "email": "drwho#gmail.com",
    "gender": null
  }
}
I have several data sets with such complex structure. I need to read the data, edit it, and also delete parts of it. Before I get to iteration, I was wondering how I can read the data without iterating, and iterate only when I absolutely have to.
What rules should I keep in mind when reading such complex JSON structures, so that I can handle any complex structure I come across?
I am currently using JavaScript, but I am looking for rules that apply in other languages as well.
Parsing JSON in JavaScript should be easy: http://www.json.org/js.html.
"Since JSON is a proper subset of JavaScript, the compiler will correctly parse the text and produce an object structure". Just follow the examples on that page.
If you want to use another language: in Java you could use Jackson or Gson to map those JSON strings to objects, and then using them becomes easy. Both libraries are annotation-based and wouldn't be difficult to implement.