How to debug a Vega-Lite specification? No data points plotted

I have the following Vega-Lite specification (open it in the Vega Editor).
It seems very straightforward, but I don't see any data points being plotted.
There is no error, just a bunch of warnings, which I cannot interpret well enough to fix the problem.
I'd like to learn how to debug Vega-Lite specifications, or get help correcting some stupid mistake of mine.
EDIT: to keep the lessons learned complete, here are the specification snippet and the warnings:
{
  "mark": "tick",
  "encoding": {
    "y": {"field": "arrival-time", "type": "temporal"},
    "x": {"field": "eta-variance", "type": "quantitative"}
  },
  "data": [
    {"arrival-time": "2021-07-31 07:07:04", "eta-variance": -30},
    {"arrival-time": "2021-07-31 07:07:04", "eta-variance": -51},
    {"arrival-time": "2021-07-31 07:31:06", "eta-variance": -357},
    {"arrival-time": "2021-07-31 07:31:06", "eta-variance": -395},
    {"arrival-time": "2021-07-31 07:32:55", "eta-variance": -248},
    {"arrival-time": "2021-07-31 07:32:55", "eta-variance": -286},
    {"arrival-time": "2021-07-31 07:40:55", "eta-variance": -14},
    {"arrival-time": "2021-07-31 07:40:55", "eta-variance": 34},
    {"arrival-time": "2021-07-31 07:40:55", "eta-variance": 0},
    {"arrival-time": "2021-07-31 07:40:55", "eta-variance": -12}
  ]
}
[Warning] Validation: /data must be object of #/type
[Warning] Validation: /data must be object of #/type
[Warning] Validation: /data must be object of #/type
[Warning] Validation: /data must match a schema in anyOf of #/anyOf
[Warning] Validation: /data must be object of #/type
[Warning] Validation: /data must be object of #/definitions/SphereGenerator/type
[Warning] Validation: /data must be object of #/type
[Warning] Validation: /data must match a schema in anyOf of #/anyOf
[Warning] Validation: /data must match a schema in anyOf of #/anyOf
[Warning] Validation: /data must be null of #/properties/data/anyOf/1/type
[Warning] Validation: /data must match a schema in anyOf of #/properties/data/anyOf
[Warning] Validation: must have required property 'facet' of #/required
[Warning] Validation: must have required property 'layer' of #/required
[Warning] Validation: must have required property 'repeat' of #/anyOf/0/required
[Warning] Validation: must have required property 'repeat' of #/anyOf/1/required
[Warning] Validation: must match a schema in anyOf of #/anyOf
[Warning] Validation: must have required property 'concat' of #/required
[Warning] Validation: must have required property 'vconcat' of #/required
[Warning] Validation: must have required property 'hconcat' of #/required
[Warning] Validation: must match a schema in anyOf of #/anyOf
[Warning] Infinite extent for field "eta-variance": [Infinity, -Infinity]
[Warning] Infinite extent for field "arrival-time": [Infinity, -Infinity]
[Warning] Infinite extent for field "eta-variance": [Infinity, -Infinity]
[Warning] Infinite extent for field "arrival-time": [Infinity, -Infinity]

Mikhail Akimov spotted my mistake: the specification of data was wrong.
The proper specification should be
{"data": {"values": [...]},
 ...}
The mistake was the missing "values" key and the associated object structure.
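For reference, here is the corrected specification in full (a sketch built by simply moving the original rows under data.values, otherwise unchanged):
{
  "mark": "tick",
  "encoding": {
    "y": {"field": "arrival-time", "type": "temporal"},
    "x": {"field": "eta-variance", "type": "quantitative"}
  },
  "data": {
    "values": [
      {"arrival-time": "2021-07-31 07:07:04", "eta-variance": -30},
      {"arrival-time": "2021-07-31 07:07:04", "eta-variance": -51},
      {"arrival-time": "2021-07-31 07:31:06", "eta-variance": -357},
      {"arrival-time": "2021-07-31 07:31:06", "eta-variance": -395},
      {"arrival-time": "2021-07-31 07:32:55", "eta-variance": -248},
      {"arrival-time": "2021-07-31 07:32:55", "eta-variance": -286},
      {"arrival-time": "2021-07-31 07:40:55", "eta-variance": -14},
      {"arrival-time": "2021-07-31 07:40:55", "eta-variance": 34},
      {"arrival-time": "2021-07-31 07:40:55", "eta-variance": 0},
      {"arrival-time": "2021-07-31 07:40:55", "eta-variance": -12}
    ]
  }
}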
The moral of the story: Vega-Lite warnings can actually be errors, and when the chart does not render, the warnings should be addressed as errors.

Related

Define exact custom properties in OpenAPI 3.1

I have a JSON schema I am trying to describe: a JSON object which has an additionalProperties node containing an array of key-value pairs.
{
  "additionalProperties": [
    {
      "key": "optionA",
      "value": "1"
    },
    {
      "key": "optionB",
      "value": "0"
    },
    {
      "key": "optionC",
      "value": "1"
    }
  ]
}
Whilst I can use quite a generic schema for this, like so:
additionalProperties:
  properties:
    key:
      type: string
    value:
      type: string
  required:
    - key
    - value
  type: object
Ideally, I wish to explain which keys can appear and what they mean, i.e. optionA means this and optionB means that. Is there a way I can describe the exact options which will appear in the array?
The description field is used when you want to provide additional information or context to the reader that isn't necessarily explained by the schema alone.
additionalProperties:
  description: Your explanation goes here. Note that you can use markdown formatting if desired.
  properties:
    key:
      type: string
    value:
      type: string
  required:
    - key
    - value
  type: object
You can also describe your options more accurately in the schema, if they are all known values, using oneOf, allOf, or anyOf. (Documentation here)
additionalProperties:
  anyOf:
    - $ref: '#/components/schemas/optionA'
    - $ref: '#/components/schemas/optionB'
    - $ref: '#/components/schemas/optionC'
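For instance, each option could then be pinned down in its own component schema with enum values; a sketch only, the description text and enum approach below are illustrative assumptions, not from the original answer:
components:
  schemas:
    optionA:
      type: object
      description: optionA toggles feature A ("1" = enabled, "0" = disabled).
      properties:
        key:
          type: string
          enum:
            - optionA
        value:
          type: string
          enum:
            - "0"
            - "1"
      required:
        - key
        - value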

How to define which time_index attribute is used when notifying QuantumLeap?

I am using FIWARE for some time-series data of heat pumps. I use Orion 2.5.2 and QuantumLeap 0.7.6.
My entities have a lot of attributes which are reported in batches. Those data batches have individual timestamps for each attribute, so the exact time of a measurement is known (this is also rather important). I use a little Python tool to split these batches and send them to the IoT Agent separately via HTTP, using the timestamp parameter.
I end up with an entity like this:
...
"attrs": {
  "temp_outdoor": {
    "value": "-6.6",
    "type": "Number",
    "md": {
      "TimeInstant": {
        "type": "DateTime",
        "value": 1613148707.7509995
      }
    },
    "mdNames": [
      "TimeInstant"
    ],
    "creDate": 1612780352.3855166,
    "modDate": 1613148716.1449544
  },
  "temp_return_flow": {
    "value": "40.8",
    "type": "Number",
    "md": {
      "TimeInstant": {
        "type": "DateTime",
        "value": 1613149016.394001
      }
    },
    "mdNames": [
      "TimeInstant"
    ],
    "creDate": 1612780352.3855166,
    "modDate": 1613149021.5991328
  },
  "TimeInstant": {
    "value": 1613149101.1790009,
    "type": "DateTime",
    "mdNames": [],
    "creDate": 1612780352.3855166,
    "modDate": 1613149102.5100079
  },
...
I don't really care about creDate and modDate, but about the TimeInstant in the "md" of each attribute. (The bottom "TimeInstant" attribute is just the value of the last data point, I think.) I would like to use the "md" TimeInstant to create the time_index in CrateDB. Hence, the reported time has to be the custom-metadata one. I tried some different values while subscribing to QuantumLeap but can't get it right.
Can someone tell me how to specify md->TimeInstant as the value for time_index?
I find the documentation rather inconclusive on that topic and hope that someone has already solved that mystery and might let me in on it :)
Thanks!
Looking at your payload, it is not clear which NGSI model is used, which would be the information needed to help you. Anyhow, as reported by the documentation:
A fundamental element in the time-series database is the time index. You may be wondering... where is it stored? QuantumLeap will persist the time index for each notification in a special column called time_index.
The value that is used for the time index of a received notification is defined according to the following policy, which chooses the first present and valid time value from the following ordered list of options.
Custom time index. The value of the Fiware-TimeIndex-Attribute http header. Note that for a notification to contain such header, the corresponding subscription has to be created with an httpCustom block, as detailed in the Subscriptions and Custom Notifications section of the NGSI spec. This is the way you can instruct QL to use custom attributes of the notification payload to be taken as time index indicators.
Custom time index metadata. The most recent custom time index (the value of the Fiware-TimeIndex-Attribute) attribute value found in any of the attribute metadata sections in the notification. See the previous option about the details regarding subscriptions.
TimeInstant attribute. As specified in the FIWARE IoT agent documentation.
TimeInstant metadata. The most recent TimeInstant attribute value found in any of the attribute metadata sections in the notification. (Again, refer to the FIWARE IoT agent documentation.)
timestamp attribute.
timestamp metadata. The most recent timestamp attribute value found in any of the attribute metadata sections in the notification. As specified in the FIWARE data models documentation.
dateModified attribute. If you paid attention in the Orion Subscription section, this is the "dateModified" value notified by Orion.
dateModified metadata. The most recent dateModified attribute value found in any of the attribute metadata sections in the notification.
Finally, QL will use the Current Time (the time of notification reception) if none of the above options is present or none of the values found can actually be converted to a datetime.
This means that (if you understand the NGSI model, the documentation is quite clear), with the following payload:
{
  "id": "Room1",
  "type": "Room",
  "temperature": {
    "value": 24.2,
    "type": "Number",
    "metadata": {
      "myTime": {
        "type": "DateTime",
        "value": "2020-12-16T17:13:46.00Z"
      }
    }
  },
  "pressure": {
    "value": 720,
    "type": "Number",
    "metadata": {
      "TimeInstant": {
        "type": "DateTime",
        "value": "2020-12-16T17:13:46.00Z"
      }
    }
  },
  "dateObserved": "2021-02-02T00:00:00.00Z",
  "dateCreated": "2019-09-24T12:49:02.00Z",
  "dateModified": "2021-02-02T23:00:50.00Z",
  "TimeInstant": {
    "type": "DateTime",
    "value": "2020-12-16T17:13:46.00Z"
  }
}
If in the notification you set a custom header Fiware-TimeIndex-Attribute=dateObserved, time_index will be the value of dateObserved. If you set Fiware-TimeIndex-Attribute=myTime, it will be the myTime metadata attached to temperature. If no Fiware-TimeIndex-Attribute header is passed, the most recent value of the TimeInstant metadata will be picked. Suppose you remove the TimeInstant metadata from the payload above; then the TimeInstant attribute will be picked. If the TimeInstant attribute is removed as well, the dateModified value will be picked. If that is not received either, the current time is used.
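So for the question as asked, the md TimeInstant should be picked up by default, provided the metadata actually reaches QuantumLeap in the notification. A minimal subscription sketch; the URL, entity type, and explicit header are illustrative assumptions, and the header can be dropped to rely on the default policy:
{
  "description": "Notify QuantumLeap, indexing on the TimeInstant attribute metadata",
  "subject": {
    "entities": [{"idPattern": ".*", "type": "HeatPump"}]
  },
  "notification": {
    "httpCustom": {
      "url": "http://quantumleap:8668/v2/notify",
      "headers": {"Fiware-TimeIndex-Attribute": "TimeInstant"}
    },
    "metadata": ["TimeInstant"]
  }
}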

Kafka JDBC sink connector with JSON schema not working

Using the latest Kafka and Confluent JDBC sink connectors. Sending a really simple JSON message:
{
  "schema": {
    "type": "struct",
    "fields": [
      {
        "type": "int",
        "optional": false,
        "field": "id"
      },
      {
        "type": "string",
        "optional": true,
        "field": "msg"
      }
    ],
    "optional": false,
    "name": "msgschema"
  },
  "payload": {
    "id": 222,
    "msg": "hi"
  }
}
But getting error:
org.apache.kafka.connect.errors.DataException: JsonConverter with schemas.enable requires "schema" and "payload" fields and may not contain additional fields. If you are trying to deserialize plain JSON data, set schemas.enable=false in your converter configuration.
JSONLint says the JSON is valid. I have set schemas.enable=true in the Kafka configuration. Any pointers?
You need to tell Connect that your schema is embedded in the JSON you're using.
You have:
value.converter=org.apache.kafka.connect.json.JsonConverter
But need also:
value.converter.schemas.enable=true
In order to use the JDBC sink, your streamed messages must have a schema. This can be achieved either by using Avro with the Schema Registry, or by using JSON with schemas. If schemas.enable=true was configured only after initially running the source properties file, you might need to delete the topic, re-run the sink, and then start the source side once again.
Example:
sink.properties file
name=sink-mysql
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
tasks.max=1
topics=test-mysql-jdbc-foobar
connection.url=jdbc:mysql://127.0.0.1:3306/demo?user=user1&password=user1pass
auto.create=true
and an example worker configuration file connect-avro-standalone.properties:
bootstrap.servers=localhost:9092
key.converter=org.apache.kafka.connect.json.JsonConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=true
value.converter.schemas.enable=true
internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false
# Local storage file for offset data
offset.storage.file.filename=/tmp/connect.offsets
plugin.path=share/java
and execute
./bin/connect-standalone etc/schema-registry/connect-avro-standalone.properties etc/kafka-connect-jdbc/sink.properties
I recently came across the same issue, and it took multiple retries before I figured out what was missing.
The following settings worked for me:
key.converter.schemas.enable=false
value.converter.schemas.enable=true
Also, make sure the table already exists in the database; the connector should not attempt to create one. Set:
auto.create=false

swagger-codegen: errors with \d+ patterns and type: "array"

I'm a newbie to Swagger. I've used the Swagger servlet to generate my swagger.json file from our REST API Java classes. The swagger.json file shows swagger 2.0 (I assume this is the 2.0 schema version). There was nothing fancy in the source files, just @Api and a few @ApiOperation annotations.
Then I tried using swagger-codegen-cli (both version 2.1.4 and 2.1.6-SNAPSHOT, the latest) to generate HTML output from the JSON file. I got the following results with both:
reading from dsm_swagger.json
[main] ERROR io.swagger.codegen.DefaultCodegen - unexpected missing property for name suppressed
[main] WARN io.swagger.codegen.DefaultCodegen - skipping invalid property {
"type" : "array"
}
writing file /home/combs/dsm_swagger/./index.html
So I get an output file, but any types that are flagged as lists of objects are not handled correctly. These do appear to be valid 2.0 constructs.
I'm also getting Jackson errors about invalid escape characters because it sees
"pattern": "\d+"
in the file. I can work around the \d by using [0-9], but I assume it should be handled as-is.
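(A side note on the escaping: \d is not a legal JSON string escape, which is exactly what Jackson is complaining about below; a strictly valid swagger.json would have to double the backslash, i.e.
"pattern": "\\d+"
Whether swagger-core should emit the doubled form is part of the question.)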
Has anybody seen these particular issues and know whether they're fixed or whether there is a workaround in swagger-codegen or the source file? Is swagger-codegen actually handling v2.0 specs correctly? Any pointers to up-to-date info or code would be appreciated!
EDIT:
As mentioned in a comment, by using @JsonIgnore and @JsonProperty in appropriate places and upgrading to v1.5.6 of swagger-core, I got around the issues with the invalid property and type "array" messages. Here's an example of the issue with \d:
"/v1/admins/{adminId}": {
"put": {
"tags": [
"admins"
],
"summary": "Update information about a particular admin, given its ID. The update information is passed in the POST body.",
"description": "Longer notes about what this does",
"operationId": "updateUser",
"consumes": [
"application/json"
],
"produces": [
"application/json"
],
"parameters": [
{
"name": "adminId",
"in": "path",
"required": true,
"type": "integer",
"pattern": "\d+",
"format": "int64"
},
{
"in": "body",
"name": "body",
"required": false,
"schema": {
"$ref": "#/definitions/UserUpdateInfo"
}
}
],
"responses": {
"200": {
"description": "successful operation",
"schema": {
"$ref": "#/definitions/UserInfo"
}
}
}
}
},
This is the exact output of swagger-core, and yet swagger-codegen fails with the following:
combs@dcombs-lap:~/dsm_swagger$ gen_file
reading from dsm_swagger.json
reading from dsm_swagger.json
com.fasterxml.jackson.core.JsonParseException: Unrecognized character escape 'd' (code 100)
at [Source: dsm_swagger.json; line: 411, column: 27]
at com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1419)
at com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:508)
at com.fasterxml.jackson.core.base.ParserMinimalBase._handleUnrecognizedCharacterEscape(ParserMinimalBase.java:485)
at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._decodeEscaped(UTF8StreamJsonParser.java:2924)
at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString2(UTF8StreamJsonParser.java:2209)
at com.fasterxml.jackson.core.json.UTF8StreamJsonParser._finishString(UTF8StreamJsonParser.java:2165)
at com.fasterxml.jackson.core.json.UTF8StreamJsonParser.getText(UTF8StreamJsonParser.java:279)
at com.fasterxml.jackson.databind.deser.std.BaseNodeDeserializer.deserializeObject(JsonNodeDeserializer.java:224)
at com.fasterxml.jackson.databind.deser.std.BaseNodeDeserializer.deserializeArray(JsonNodeDeserializer.java:262)
at com.fasterxml.jackson.databind.deser.std.BaseNodeDeserializer.deserializeObject(JsonNodeDeserializer.java:221)
at com.fasterxml.jackson.databind.deser.std.BaseNodeDeserializer.deserializeObject(JsonNodeDeserializer.java:218)
at com.fasterxml.jackson.databind.deser.std.BaseNodeDeserializer.deserializeObject(JsonNodeDeserializer.java:218)
at com.fasterxml.jackson.databind.deser.std.BaseNodeDeserializer.deserializeObject(JsonNodeDeserializer.java:218)
at com.fasterxml.jackson.databind.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:62)
at com.fasterxml.jackson.databind.deser.std.JsonNodeDeserializer.deserialize(JsonNodeDeserializer.java:14)
at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3066)
at com.fasterxml.jackson.databind.ObjectMapper.readTree(ObjectMapper.java:1861)
at io.swagger.parser.SwaggerCompatConverter.readResourceListing(SwaggerCompatConverter.java:139)
at io.swagger.parser.SwaggerCompatConverter.read(SwaggerCompatConverter.java:74)
at io.swagger.parser.SwaggerParser.read(SwaggerParser.java:73)
at io.swagger.codegen.config.CodegenConfigurator.toClientOptInput(CodegenConfigurator.java:317)
at io.swagger.codegen.cmd.Generate.run(Generate.java:186)
at io.swagger.codegen.SwaggerCodegen.main(SwaggerCodegen.java:35)
Exception in thread "main" java.lang.RuntimeException: missing swagger input or config!
at io.swagger.codegen.DefaultGenerator.generate(DefaultGenerator.java:89)
at io.swagger.codegen.cmd.Generate.run(Generate.java:188)
at io.swagger.codegen.SwaggerCodegen.main(SwaggerCodegen.java:35)
combs@dcombs-lap:~/dsm_swagger$

Is it possible to have an optional field in an Avro schema (i.e. the field does not appear at all in the .json file)?

Is it possible to have an optional field in an Avro schema (i.e. the field does not appear at all in the .JSON file)?
In my Avro schema, I have two fields:
{"name": "author", "type": ["null", "string"], "default": null},
{"name": "importance", "type": ["null", "string"], "default": null},
And in my JSON files those two fields can exist or not.
However, when they do not exist, I receive an error (e.g. when I test such a JSON file using avro-tools command line client):
Expected field name not found: author
I understand that as long as the field name exists in a JSON, it can be null, or a string value, but what I'm trying to express is something like "this JSON is valid if those field names do not exist, OR if they exist and they are null or string".
Is this possible to express in an Avro schema? If so, how?
You can define the default attribute with a placeholder value, for example "undefined", so the field can be skipped:
{
  "name": "first_name",
  "type": "string",
  "default": "undefined"
},
Also, all fields are mandatory in Avro. If you want a field to be optional, union its type with null. Example:
{
  "name": "username",
  "type": [
    "null",
    "string"
  ],
  "default": null
},
According to the Avro specification this is possible, using the default attribute.
See https://avro.apache.org/docs/1.8.2/spec.html
default: A default value for this field, used when reading instances that lack this field (optional). Permitted values depend on the field's schema type, according to the table below. Default values for union fields correspond to the first schema in the union.
In the example you gave, you do add the default attribute with value null, so this should work. However, support also depends on the library you use for reading the Avro message (there are libraries for C, C++, Python, Java, C#, Ruby, etc.). Maybe (probably) the library you use lacks this feature.
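For reference, here are the two fields from the question wrapped in a minimal complete schema (a sketch; the record name is made up):
{
  "type": "record",
  "name": "Article",
  "fields": [
    {"name": "author", "type": ["null", "string"], "default": null},
    {"name": "importance", "type": ["null", "string"], "default": null}
  ]
}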