NiFi failed to parse data in convert record - json

I'm trying to convert JSON to CSV using the ConvertRecord processor, but the only error I get back is "Could not parse incoming data". Since that isn't very descriptive, I'm at a loss as to how to diagnose the issue.
I know that my Avro schema is valid because A) NiFi doesn't throw an error regarding the schema when I insert it into the Schema Registry and B) I tested my schema on here and it didn't give me an issue.
I also know that my JSON is valid because I can load it in Python using json.loads() and it doesn't give me any problems.
I'm just not quite sure where I've gone wrong, nor how to fix it.
JSON
{
"DOC": {
"DOCID": "1234",
"Subjects": {
"Subject_xref": ["2233"]
},
"TXT": {
"COUNTRY": ["United States"],
"ESTATE": ["Mount Vernon"],
"PERSON": ["George Washington"]
},
"RAW_TXT": "George Washington lived in his family home, Mount Vernon, located in the United States.",
"RELINFO": [
{"ID" : "REL-1234-100",
"RELTYPE" : "PER-PROP",
"PERID" : "PER-1234-009",
"PROPID" : "PROP-1234-001",
"SENTID" : "1234-SENT-001",
"PROP_NORM" : "Mount Vernon",
"PROP_MENTION" : "Mount Vernon",
"PER_NORM" : "George Washington",
"PER_MENTION" : "George Washington"}
],
"ENTINFO": [
{"ID": "PER-1234-009", "TYPE": "PERSON", "NORM": "George Washington", "REFID": "PER-1234-009", "MENTION": "George Washington"},
{"ID": "CTRY-1234-003", "TYPE": "COUNTRY", "NORM": "United States", "REFID": "CTRY-1234-003", "MENTION": "United States."},
{"ID": "PROP-1234-001", "TYPE": "ESTATE", "NORM": "Mount Vernon", "REFID": "PROP-1234-001", "MENTION": "Mount Vernon"}
]
}
}
Avro
{
"type": "record",
"namespace": "name.space",
"name": "nlp_output",
"fields": [
{"name": "DOC", "type": {
"name": "DOCDocument", "type": "record", "namespace": "doc.name.space", "fields": [
{"name": "DOCID", "type": ["long","null"], "default": null},
{"name": "Subjects", "type": {
"name": "Subjects", "type": "record", "namespace": "subjects.name.space", "fields": [
{"name": "SubjectIdentificationID", "aliases": ["Subject_xref"], "type": ["long","null"], "default": null}
]
}},
{"name": "TXT", "type": {
"name": "TXT", "type": "record", "namespace": "text.name.space", "fields": [
{"name": "COUNTRY", "type": {"type": "array", "items": ["string", "null"]}, "default": null, "doc": ""},
{"name": "ESTATE", "type": {"type": "array", "items": ["string", "null"]}, "default": null, "doc": ""},
{"name": "PERSON", "type": {"type": "array", "items": ["string", "null"]}, "default": null, "doc": ""}
]
}},
{"name": "RAW_TXT", "type": ["string","null"], "default": null},
{"name": "RELINFO", "type": {
"name": "RelatedEntities", "type": "record", "namespace": "relent.name.space", "fields": [
{"name": "ID", "type": ["string", "null"], "default": null},
{"name": "RELTYPE", "type": ["string", "null"], "default": null},
{"name": "PERID", "type": ["string", "null"], "default": null},
{"name": "PROPID", "type": ["string", "null"], "default": null},
{"name": "SENTID", "type": ["string", "null"], "default": null},
{"name": "PROP_NORM", "type": ["string", "null"], "default": null},
{"name": "PROP_MENTION", "type": ["string", "null"], "default": null},
{"name": "PER_NORM", "type": ["string", "null"], "default": null},
{"name": "PER_MENTION", "type": ["string", "null"], "default": null}
]
}},
{"name": "ENTINFO", "doc": "Sentences stripped of tags for ease of reading", "type": {
"name": "Entities", "type": "record", "namespace": "entities.name.space", "fields": [
{"name": "ID", "type": ["string", "null"], "default": null},
{"name": "TYPE", "type": ["string", "null"], "default": null},
{"name": "NORM", "type": ["string", "null"], "default": null},
{"name": "REFID", "type": ["string", "null"], "default": null},
{"name": "MENTION", "type": ["string", "null"], "default": null}
]
}}
]
}}
]
}

Your schema doesn't match your JSON. SubjectIdentificationID (alias Subject_xref) is defined as long or null, but in the JSON Subject_xref is an array. Declaring the field as an array of long or null resolves that mismatch:
{
"type": "record",
"namespace": "name.space",
"name": "nlp_output",
"fields": [
{"name": "DOC", "type": {
"name": "DOCDocument", "type": "record", "namespace": "doc.name.space", "fields": [
{"name": "DOCID", "type": ["long","null"], "default": null},
{"name": "Subjects", "type": {
"name": "Subjects", "type": "record", "namespace": "subjects.name.space", "fields": [
{"name": "SubjectIdentificationID", "aliases": ["Subject_xref"], "type": {"type": "array", "items": ["long", "null"]}, "default": null}
]
}},
{"name": "TXT", "type": {
"name": "TXT", "type": "record", "namespace": "text.name.space", "fields": [
{"name": "COUNTRY", "type": {"type": "array", "items": ["string", "null"]}, "default": null, "doc": ""},
{"name": "ESTATE", "type": {"type": "array", "items": ["string", "null"]}, "default": null, "doc": ""},
{"name": "PERSON", "type": {"type": "array", "items": ["string", "null"]}, "default": null, "doc": ""}
]
}},
{"name": "RAW_TXT", "type": ["string","null"], "default": null},
{"name": "RELINFO", "type": {
"name": "RelatedEntities", "type": "record", "namespace": "relent.name.space", "fields": [
{"name": "ID", "type": ["string", "null"], "default": null},
{"name": "RELTYPE", "type": ["string", "null"], "default": null},
{"name": "PERID", "type": ["string", "null"], "default": null},
{"name": "PROPID", "type": ["string", "null"], "default": null},
{"name": "SENTID", "type": ["string", "null"], "default": null},
{"name": "PROP_NORM", "type": ["string", "null"], "default": null},
{"name": "PROP_MENTION", "type": ["string", "null"], "default": null},
{"name": "PER_NORM", "type": ["string", "null"], "default": null},
{"name": "PER_MENTION", "type": ["string", "null"], "default": null}
]
}},
{"name": "ENTINFO", "doc": "Sentences stripped of tags for ease of reading", "type": {
"name": "Entities", "type": "record", "namespace": "entities.name.space", "fields": [
{"name": "ID", "type": ["string", "null"], "default": null},
{"name": "TYPE", "type": ["string", "null"], "default": null},
{"name": "NORM", "type": ["string", "null"], "default": null},
{"name": "REFID", "type": ["string", "null"], "default": null},
{"name": "MENTION", "type": ["string", "null"], "default": null}
]
}}
]
}}
]
}
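Because the NiFi error message does not say which field failed, it can help to compare the shape of the JSON against the shape the schema declares. The following is a minimal sketch, not part of NiFi: the helper names (avro_shape, json_shape, compare) are made up for illustration, the schema and document are trimmed copies of the ones above, and only the container shape (record, array, scalar) is compared, not the scalar types.

```python
def avro_shape(avro_type):
    """Return (shape, node) for an Avro type; shape is 'record', 'array' or 'scalar'."""
    if isinstance(avro_type, list):
        # Union such as ["long", "null"]: look at the first non-null branch.
        branches = [t for t in avro_type if t != "null"]
        return avro_shape(branches[0]) if branches else ("scalar", None)
    if isinstance(avro_type, dict):
        if avro_type["type"] in ("record", "array"):
            return avro_type["type"], avro_type
        return avro_shape(avro_type["type"])  # e.g. {"type": "string"}
    return "scalar", None


def json_shape(value):
    """Classify a decoded JSON value the same way."""
    if isinstance(value, dict):
        return "record"
    if isinstance(value, list):
        return "array"
    return "scalar"


def compare(value, avro_type, path="$"):
    """Collect the paths where the JSON shape differs from the declared shape."""
    if value is None:
        return []  # nulls are left to the schema's union handling
    expected, node = avro_shape(avro_type)
    actual = json_shape(value)
    if actual != expected:
        return [f"{path}: JSON has {actual}, schema declares {expected}"]
    problems = []
    if expected == "record":
        for field in node["fields"]:
            # A field may appear under its name or under one of its aliases.
            names = [field["name"], *field.get("aliases", [])]
            key = next((n for n in names if n in value), None)
            if key is not None:
                problems += compare(value[key], field["type"], f"{path}.{key}")
    elif expected == "array":
        for index, item in enumerate(value):
            problems += compare(item, node["items"], f"{path}[{index}]")
    return problems


# Trimmed copies of the question's schema and document (illustration only).
schema = {"type": "record", "name": "nlp_output", "fields": [
    {"name": "DOC", "type": {"type": "record", "name": "DOCDocument", "fields": [
        {"name": "DOCID", "type": ["long", "null"], "default": None},
        {"name": "Subjects", "type": {"type": "record", "name": "Subjects", "fields": [
            {"name": "SubjectIdentificationID", "aliases": ["Subject_xref"],
             "type": ["long", "null"], "default": None}]}},
        {"name": "RELINFO", "type": {"type": "record", "name": "RelatedEntities", "fields": [
            {"name": "ID", "type": ["string", "null"], "default": None}]}},
    ]}},
]}
document = {"DOC": {"DOCID": "1234",
                    "Subjects": {"Subject_xref": ["2233"]},
                    "RELINFO": [{"ID": "REL-1234-100"}]}}

problems = compare(document, schema)
for line in problems:
    print(line)
```

On the trimmed input this reports Subject_xref, and it also reports RELINFO, which is an array in the JSON but a record in the schema. The sketch only reports differences in shape; it makes no claim about which of them NiFi's reader tolerates.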

JsonSchema: Validate an item in the List based on value of another attribute

I need to validate a value based on another attribute in the same object (both are inside a list). For example, if paymentType is SHOPPING_CARDS, I want to make a few attributes mandatory, i.e. paymentReference1 and paymentReference2; otherwise they should be optional. I have added this check and I am passing paymentType as CASH, but I still get a false error that required attributes are missing, even though paymentId1 is passed in every item in the list. Can someone please help me find out what I am doing wrong?
Failing Impl
Error
$.paymentReference1: is missing but it is required
Update: following one of the comments below, I moved the if/else into the schema where this check is required, but now no error is raised when we remove paymentRef.
Json Schema
{
"type": "object",
"required": [
"payment"
],
"properties": {
"payment": {
"$id": "#root/payment",
"title": "Payment",
"type": "object",
"required": [
"requestId",
"requestTimestamp",
"requestType",
"orderNo",
"orderDate",
"enterpriseCode",
"documentType",
"entryType",
"paymentRuleId",
"membershipNo",
"orderTotalAmount",
"priceInfo",
"paymentMethods",
"createServiceId"
],
"properties": {
"requestId": {
"$id": "#root/payment/requestId",
"title": "Requestid",
"type": "string",
"default": "",
"examples": [
"263e9575-09ea-45dc-b1d9-87ec7888d3ff"
],
"pattern": "^.*$"
},
"requestTimestamp": {
"$id": "#root/payment/requestTimestamp",
"title": "Requesttimestamp",
"type": "string",
"default": "",
"examples": [
"2021-12-01T07:16:31Z"
],
"pattern": "^.*$"
},
"requestType": {
"$id": "#root/payment/requestType",
"title": "Requesttype",
"type": "string",
"default": "",
"examples": [
"CREATE_ORDER/ADDITION"
],
"pattern": "^.*$"
},
"orderNo": {
"$id": "#root/payment/orderNo",
"title": "Orderno",
"type": "string",
"default": "",
"examples": [
"9762909359"
],
"pattern": "^.*$"
},
"orderDate": {
"$id": "#root/payment/orderDate",
"title": "Orderdate",
"type": "string",
"default": "",
"examples": [
"2022-05-09T14:01:21.000+0000"
],
"pattern": "^.*$"
},
"enterpriseCode": {
"$id": "#root/payment/enterpriseCode",
"title": "Enterprisecode",
"type": "string",
"default": "",
"examples": [
"SAMS"
],
"pattern": "^.*$"
},
"documentType": {
"$id": "#root/payment/documentType",
"title": "Documenttype",
"type": "string",
"default": "",
"examples": [
"0001"
],
"pattern": "^.*$"
},
"entryType": {
"$id": "#root/payment/entryType",
"title": "Entrytype",
"type": "string",
"default": "",
"examples": [
"ONLINE"
],
"pattern": "^.*$"
},
"paymentRuleId": {
"$id": "#root/payment/paymentRuleId",
"title": "Paymentruleid",
"type": "string",
"default": "",
"examples": [
"SAMS"
],
"pattern": "^.*$"
},
"membershipNo": {
"$id": "#root/payment/membershipNo",
"title": "Membershipno",
"type": "string",
"default": "",
"examples": [
"10142100469959798"
],
"pattern": "^.*$"
},
"orderTotalAmount": {
"$id": "#root/payment/orderTotalAmount",
"title": "Ordertotalamount",
"type": "string",
"default": "",
"examples": [
""
],
"pattern": "^.*$"
},
"priceInfo": {
"$id": "#root/payment/priceInfo",
"title": "Priceinfo",
"type": "object",
"required": [
"currency",
"enterpriseCurrency"
],
"properties": {
"currency": {
"$id": "#root/payment/priceInfo/currency",
"title": "Currency",
"type": "string",
"default": "",
"examples": [
"USD"
],
"pattern": "^.*$"
},
"enterpriseCurrency": {
"$id": "#root/payment/priceInfo/enterpriseCurrency",
"title": "Enterprisecurrency",
"type": "string",
"default": "",
"examples": [
"USD"
],
"pattern": "^.*$"
}
}
},
"paymentMethods": {
"$id": "#root/payment/paymentMethods",
"title": "Paymentmethods",
"type": "array",
"default": [
],
"items": {
"$id": "#root/payment/paymentMethods/items",
"title": "Items",
"type": "object",
"required": [
"sequenceNo",
"customerPONo",
"displaySvcNo",
"maxChargeLimit",
"paymentType",
"svcNo",
"unlimitedCharges",
"paymentDetails"
],
"properties": {
"sequenceNo": {
"$id": "#root/payment/paymentMethods/items/sequenceNo",
"title": "Sequenceno",
"type": "string",
"default": "",
"examples": [
"1"
],
"pattern": "^.*$"
},
"customerPONo": {
"$id": "#root/payment/paymentMethods/items/customerPONo",
"title": "Customerpono",
"type": "string",
"default": "",
"examples": [
"0"
],
"pattern": "^.*$"
},
"displaySvcNo": {
"$id": "#root/payment/paymentMethods/items/displaySvcNo",
"title": "Displaysvcno",
"type": "string",
"default": "",
"examples": [
"719"
],
"pattern": "^.*$"
},
"maxChargeLimit": {
"$id": "#root/payment/paymentMethods/items/maxChargeLimit",
"title": "Maxchargelimit",
"type": "number",
"examples": [
159.27
],
"default": 0.0
},
"paymentId1": {
"$id": "#root/payment/paymentMethods/items/paymentReference1",
"type": "string",
"default": "",
"examples": [
"T4543F/JCgkLnk6OT4qJ/hc+sg=="
],
"pattern": "^.*$"
},
"paymentId2": {
"$id": "#root/payment/paymentMethods/items/paymentReference2",
"title": "Paymentreference2",
"type": "string",
"default": "",
"examples": [
"ebdbbf2f-aec1-414a-a8b7-ec33de993c02"
],
"pattern": "^.*$"
},
"paymentId3": {
"$id": "#root/payment/paymentMethods/items/paymentReference3",
"title": "Paymentreference3",
"type": "string",
"default": "",
"examples": [
"Regular"
],
"pattern": "^.*$"
},
"paymentType": {
"$id": "#root/payment/paymentMethods/items/paymentType",
"title": "Paymenttype",
"type": "string",
"default": "",
"examples": [
"CASH"
],
"pattern": "^.*$"
},
"svcNo": {
"$id": "#root/payment/paymentMethods/items/svcNo",
"title": "Svcno",
"type": "string",
"default": "",
"examples": [
"6194995892193728"
],
"pattern": "^.*$"
},
"unlimitedCharges": {
"$id": "#root/payment/paymentMethods/items/unlimitedCharges",
"title": "Unlimitedcharges",
"type": "string",
"default": "",
"examples": [
"N"
],
"pattern": "^.*$"
},
"paymentDetails": {
"$id": "#root/payment/paymentMethods/items/paymentDetails",
"title": "Paymentdetails",
"type": "object",
"required": [
"internalReturnCode",
"authorizationID",
"authCode",
"internalReturnMessage",
"processedAmount",
"chargeType",
"holdAgainstBook",
"authTime",
"requestAmount",
"authReturnMessage",
"authReturnCode",
"requestId"
],
"properties": {
"internalReturnCode": {
"$id": "#root/payment/paymentMethods/items/paymentDetails/internalReturnCode",
"title": "Internalreturncode",
"type": "string",
"default": "",
"examples": [
""
],
"pattern": "^.*$"
},
"authorizationID": {
"$id": "#root/payment/paymentMethods/items/paymentDetails/authorizationID",
"title": "Authorizationid",
"type": "string",
"default": "",
"examples": [
"940799"
],
"pattern": "^.*$"
},
"authCode": {
"$id": "#root/payment/paymentMethods/items/paymentDetails/authCode",
"title": "Authcode",
"type": "string",
"default": "",
"examples": [
"000"
],
"pattern": "^.*$"
},
"internalReturnMessage": {
"$id": "#root/payment/paymentMethods/items/paymentDetails/internalReturnMessage",
"title": "Internalreturnmessage",
"type": "string",
"default": "",
"examples": [
""
],
"pattern": "^.*$"
},
"processedAmount": {
"$id": "#root/payment/paymentMethods/items/paymentDetails/processedAmount",
"title": "Processedamount",
"type": "string",
"default": "",
"examples": [
"159.27"
],
"pattern": "^.*$"
},
"chargeType": {
"$id": "#root/payment/paymentMethods/items/paymentDetails/chargeType",
"title": "Chargetype",
"type": "string",
"default": "",
"examples": [
"CHARGE"
],
"pattern": "^.*$"
},
"holdAgainstBook": {
"$id": "#root/payment/paymentMethods/items/paymentDetails/holdAgainstBook",
"title": "Holdagainstbook",
"type": "string",
"default": "",
"examples": [
"Y"
],
"pattern": "^.*$"
},
"authTime": {
"$id": "#root/payment/paymentMethods/items/paymentDetails/authTime",
"title": "Authtime",
"type": "string",
"default": "",
"examples": [
"2022-06-08T23:21:49"
],
"pattern": "^.*$"
},
"requestAmount": {
"$id": "#root/payment/paymentMethods/items/paymentDetails/requestAmount",
"title": "Requestamount",
"type": "string",
"default": "",
"examples": [
"159.27"
],
"pattern": "^.*$"
},
"authReturnMessage": {
"$id": "#root/payment/paymentMethods/items/paymentDetails/authReturnMessage",
"title": "Authreturnmessage",
"type": "string",
"default": "",
"examples": [
""
],
"pattern": "^.*$"
},
"authReturnCode": {
"$id": "#root/payment/paymentMethods/items/paymentDetails/authReturnCode",
"title": "Authreturncode",
"type": "string",
"default": "",
"examples": [
"000"
],
"pattern": "^.*$"
},
"requestId": {
"$id": "#root/payment/paymentMethods/items/paymentDetails/requestId",
"title": "Requestid",
"type": "string",
"default": "",
"examples": [
"6bf404e0-d756-4247-9030-669c83d0d825"
],
"pattern": "^.*$"
}
}
}
}
},
"if": {
"properties": {
"paymentType": {
"enum": [
"CASH"
]
}
}
},
"then": {
"required": [
"paymentId1"
]
}
},
"createServiceId": {
"$id": "#root/payment/createServiceId",
"title": "Createserviceid",
"type": "string",
"default": "",
"examples": [
"PRE_FULFILLMENT"
],
"pattern": "^.*$"
}
}
}
}
}
Your use of if/then is fine; it's just in the wrong place. You need to move it into the schema that has the properties you're working with, which here is the items schema of paymentMethods:
{
"type": "object",
"properties": {
"payment": {
"type": "object",
"properties": {
"paymentMethods": {
"type": "array",
"items": {
"type": "object",
"properties": {
"paymentId1": { "type": "string" },
"paymentId2": { "type": "string" },
"paymentType": { "type": "string" }
},
"if": {
"properties": {
"paymentType": { "enum": ["CASH"] }
}
},
"then": { "required": ["paymentId1","paymentId2"] }
}
}
}
}
}
}
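To see why the placement matters, here is a toy validator written for this illustration. It implements only a handful of keywords (properties, required, items, enum, if/then/else) and is not a substitute for a real JSON Schema library; the request data is made up. It shows that an if is evaluated against the same instance as the schema it sits in, so at the root it never sees paymentType at all.

```python
def validate(instance, schema, path="$"):
    """Check `instance` against a tiny subset of JSON Schema keywords.

    Supported: properties, required, items, enum, if/then/else.
    Everything else (type, pattern, ...) is ignored in this sketch.
    """
    errors = []
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}.{key}: is missing but it is required")
        for key, subschema in schema.get("properties", {}).items():
            if key in instance:  # absent properties are simply not checked
                errors += validate(instance[key], subschema, f"{path}.{key}")
    if isinstance(instance, list) and "items" in schema:
        for index, element in enumerate(instance):
            errors += validate(element, schema["items"], f"{path}[{index}]")
    if "enum" in schema and instance not in schema["enum"]:
        errors.append(f"{path}: value is not one of {schema['enum']}")
    if "if" in schema:
        # `if` is evaluated against the same instance as the enclosing schema.
        branch = "then" if not validate(instance, schema["if"], path) else "else"
        if branch in schema:
            errors += validate(instance, schema[branch], path)
    return errors


rule = {
    "if": {"properties": {"paymentType": {"enum": ["CASH"]}}},
    "then": {"required": ["paymentId1", "paymentId2"]},
}
# Made-up request with three payment methods.
request = {"payment": {"paymentMethods": [
    {"paymentType": "CASH", "paymentId1": "a", "paymentId2": "b"},
    {"paymentType": "CASH", "paymentId1": "a"},
    {"paymentType": "SHOPPING_CARDS"},
]}}

# Rule at the root: the root object has no paymentType, so `if` passes
# vacuously and `then` demands paymentId1/paymentId2 on the root itself.
misplaced = {"type": "object", **rule}

# Rule inside `items`: it is evaluated once per paymentMethods element.
nested = {"properties": {"payment": {"properties": {
    "paymentMethods": {"type": "array", "items": {"type": "object", **rule}},
}}}}

misplaced_errors = validate(request, misplaced)
nested_errors = validate(request, nested)
print(misplaced_errors)
print(nested_errors)
```

With the rule at the root, the errors point at $.paymentId1 and $.paymentId2, the same shape as the false error in the question. With the rule inside items, only the CASH entry that lacks paymentId2 is reported. Adding "required": ["paymentType"] inside the if is a common way to stop it from matching objects that have no paymentType at all.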

Debezium MySQL source connector - can't see data in topic

I have defined a Debezium MySQL source connector with the following configuration:
{
"name": "quickstart-debezium-source1",
"config": {
"connector.class": "io.debezium.connector.mysql.MySqlConnector",
"tasks.max": 1,
"database.hostname": "host.docker.internal",
"database.server.name": "connect_test",
"database.server.id": "5555",
"database.port": "3306",
"database.user": "root",
"database.password": "Fintech1!",
"database.history.kafka.topic": "debezium-source",
"database.history.kafka.bootstrap.servers": "broker:29092",
"include.schema.changes": "true",
"key.converter" : "io.confluent.connect.avro.AvroConverter",
"key.converter.schema.registry.url" : "http://host.docker.internal:8081",
"value.converter":"io.confluent.connect.avro.AvroConverter",
"value.converter.schema.registry.url": "http://host.docker.internal:8081",
"topic.creation.default.replication.factor": -1,
"topic.creation.default.partitions": -1,
"topic.creation.default.cleanup.policy": "compact",
"topic.creation.default.compression.type": "lz4"
I also have the following table in MySQL:
I can see that a new topic was created in Kafka, and that new messages arrive when rows are added to or updated in the table.
However, I can't see the new row data in the topic (I am using confluent-cloud-center).
On the topic, the new data looks like this:
I can't see the values of the table's columns.
I am also trying to create a ksqlDB table from this topic, but no results come back.
I hope someone can help me with this.
I also have a Schema Registry, and the schema registered for the topic is:
{
"type": "record",
"name": "Envelope",
"namespace": "connect_test.connect_test.test",
"fields": [
{
"name": "before",
"type": [
"null",
{
"type": "record",
"name": "Value",
"fields": [
{
"name": "id",
"type": "long"
},
{
"name": "name",
"type": [
"null",
"string"
],
"default": null
},
{
"name": "email",
"type": [
"null",
"string"
],
"default": null
},
{
"name": "department",
"type": [
"null",
"string"
],
"default": null
},
{
"name": "modified",
"type": {
"type": "string",
"connect.version": 1,
"connect.default": "1970-01-01T00:00:00Z",
"connect.name": "io.debezium.time.ZonedTimestamp"
},
"default": "1970-01-01T00:00:00Z"
}
],
"connect.name": "connect_test.connect_test.test.Value"
}
],
"default": null
},
{
"name": "after",
"type": [
"null",
"Value"
],
"default": null
},
{
"name": "source",
"type": {
"type": "record",
"name": "Source",
"namespace": "io.debezium.connector.mysql",
"fields": [
{
"name": "version",
"type": "string"
},
{
"name": "connector",
"type": "string"
},
{
"name": "name",
"type": "string"
},
{
"name": "ts_ms",
"type": "long"
},
{
"name": "snapshot",
"type": [
{
"type": "string",
"connect.version": 1,
"connect.parameters": {
"allowed": "true,last,false,incremental"
},
"connect.default": "false",
"connect.name": "io.debezium.data.Enum"
},
"null"
],
"default": "false"
},
{
"name": "db",
"type": "string"
},
{
"name": "sequence",
"type": [
"null",
"string"
],
"default": null
},
{
"name": "table",
"type": [
"null",
"string"
],
"default": null
},
{
"name": "server_id",
"type": "long"
},
{
"name": "gtid",
"type": [
"null",
"string"
],
"default": null
},
{
"name": "file",
"type": "string"
},
{
"name": "pos",
"type": "long"
},
{
"name": "row",
"type": "int"
},
{
"name": "thread",
"type": [
"null",
"long"
],
"default": null
},
{
"name": "query",
"type": [
"null",
"string"
],
"default": null
}
],
"connect.name": "io.debezium.connector.mysql.Source"
}
},
{
"name": "op",
"type": "string"
},
{
"name": "ts_ms",
"type": [
"null",
"long"
],
"default": null
},
{
"name": "transaction",
"type": [
"null",
{
"type": "record",
"name": "ConnectDefault",
"namespace": "io.confluent.connect.avro",
"fields": [
{
"name": "id",
"type": "string"
},
{
"name": "total_order",
"type": "long"
},
{
"name": "data_collection_order",
"type": "long"
}
]
}
],
"default": null
}
],
"connect.name": "connect_test.connect_test.test.Envelope"
}
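The registered schema suggests that the column values (id, name, email, department, modified) are not top-level fields of the message: they sit inside the before and after records of the Envelope. The following is a small sketch of unwrapping one change event. It assumes the Avro message has already been decoded into a Python dict, the unwrap helper is hypothetical, and the sample values are made up.

```python
def unwrap(envelope):
    """Return (op, row) from a decoded Debezium change event."""
    op = envelope["op"]
    # Deletes ("d") carry the last row state in "before"; "after" is null.
    row = envelope["before"] if op == "d" else envelope["after"]
    return op, row


# Made-up event shaped like the Envelope schema above.
event = {
    "before": None,
    "after": {"id": 1, "name": "alice", "email": "alice@example.com",
              "department": "eng", "modified": "2022-01-01T00:00:00Z"},
    "source": {"db": "connect_test", "table": "test"},
    "op": "c",  # "c" = create, "u" = update, "d" = delete, "r" = snapshot read
    "ts_ms": None,
    "transaction": None,
}

op, row = unwrap(event)
print(op, row["name"], row["department"])
```

A consumer or query that looks for name or email at the top level of the message will not find them; with this envelope they are reached through after (or before, for deletes).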

AWS SageMaker SparkML schema error: 'member.environment' failed to satisfy constraint

I am deploying a model to AWS via SageMaker.
I set up my JSON schema as follows:
import json
schema = {
"input": [
{
"name": "V1",
"type": "double"
},
{
"name": "V2",
"type": "double"
},
{
"name": "V3",
"type": "double"
},
{
"name": "V4",
"type": "double"
},
{
"name": "V5",
"type": "double"
},
{
"name": "V6",
"type": "double"
},
{
"name": "V7",
"type": "double"
},
{
"name": "V8",
"type": "double"
},
{
"name": "V9",
"type": "double"
},
{
"name": "V10",
"type": "double"
},
{
"name": "V11",
"type": "double"
},
{
"name": "V12",
"type": "double"
},
{
"name": "V13",
"type": "double"
},
{
"name": "V14",
"type": "double"
},
{
"name": "V15",
"type": "double"
},
{
"name": "V16",
"type": "double"
},
{
"name": "V17",
"type": "double"
},
{
"name": "V18",
"type": "double"
},
{
"name": "V19",
"type": "double"
},
{
"name": "V20",
"type": "double"
},
{
"name": "V21",
"type": "double"
},
{
"name": "V22",
"type": "double"
},
{
"name": "V23",
"type": "double"
},
{
"name": "V24",
"type": "double"
},
{
"name": "V25",
"type": "double"
},
{
"name": "V26",
"type": "double"
},
{
"name": "V27",
"type": "double"
},
{
"name": "V28",
"type": "double"
},
{
"name": "Amount",
"type": "double"
},
],
"output":
{
"name": "features",
"type": "double",
"struct": "vector"
}
}
schema_json = json.dumps(schema)
print(schema_json)
And deployed as:
from sagemaker.model import Model
from sagemaker.pipeline import PipelineModel
from sagemaker.sparkml.model import SparkMLModel
sparkml_data = 's3://{}/{}/{}'.format(s3_model_bucket, s3_model_key_prefix, 'model.tar.gz')
# passing the schema defined above by using an environment variable that sagemaker-sparkml-serving understands
sparkml_model = SparkMLModel(model_data=sparkml_data, env={'SAGEMAKER_SPARKML_SCHEMA' : schema_json})
xgb_model = Model(model_data=xgb_model.model_data, image=training_image)
model_name = 'inference-pipeline-' + timestamp_prefix
sm_model = PipelineModel(name=model_name, role=role, models=[sparkml_model, xgb_model])
endpoint_name = 'inference-pipeline-ep-' + timestamp_prefix
sm_model.deploy(initial_instance_count=1, instance_type='ml.c4.xlarge', endpoint_name=endpoint_name)
I got the error as below:
ClientError: An error occurred (ValidationException) when calling the CreateModel operation: 1 validation error detected: Value '{SAGEMAKER_SPARKML_SCHEMA={"input": [{"type": "double", "name": "V1"}, {"type": "double", "name": "V2"}, {"type": "double", "name": "V3"}, {"type": "double", "name": "V4"}, {"type": "double", "name": "V5"}, {"type": "double", "name": "V6"}, {"type": "double", "name": "V7"}, {"type": "double", "name": "V8"}, {"type": "double", "name": "V9"}, {"type": "double", "name": "V10"}, {"type": "double", "name": "V11"}, {"type": "double", "name": "V12"}, {"type": "double", "name": "V13"}, {"type": "double", "name": "V14"}, {"type": "double", "name": "V15"}, {"type": "double", "name": "V16"}, {"type": "double", "name": "V17"}, {"type": "double", "name": "V18"}, {"type": "double", "name": "V19"}, {"type": "double", "name": "V20"}, {"type": "double", "name": "V21"}, {"type": "double", "name": "V22"}, {"type": "double", "name": "V23"}, {"type": "double", "name": "V24"}, {"type": "double", "name": "V25"}, {"type": "double", "name": "V26"}, {"type": "double", "name": "V27"}, {"type": "double", "name": "V28"}, {"type": "double", "name": "Amount"}], "output": {"type": "double", "name": "features", "struct": "vector"}}}' at 'containers.1.member.environment' failed to satisfy constraint: Map value must satisfy constraint: [Member must have length less than or equal to 1024, Member must have length greater than or equal to 0, Member must satisfy regular expression pattern: [\S\s]*]
When I reduced my features to 20, the deployment succeeded. How can I pass the schema with all 29 attributes?
I do not think the 1024-character limit on environment values will be raised any time soon. As a workaround, you could try rebuilding the SparkML serving container with the SAGEMAKER_SPARKML_SCHEMA env var set in it:
https://github.com/aws/sagemaker-sparkml-serving-container/blob/master/README.md#running-the-image-locally
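Before rebuilding the container, it may be worth measuring how far over the limit the value actually is. The sketch below rebuilds the 29-field schema from the question and compares json.dumps with its default separators against a compact encoding. It assumes the 1024 limit applies to the length of the environment value, as the error message suggests, and that the serving container parses the value with an ordinary JSON parser that ignores whitespace; neither assumption has been checked against the service.

```python
import json

ENV_VALUE_LIMIT = 1024  # limit quoted in the ValidationException above

# Same 29 inputs as in the question: V1..V28 plus Amount.
feature_names = [f"V{i}" for i in range(1, 29)] + ["Amount"]
schema = {
    "input": [{"name": name, "type": "double"} for name in feature_names],
    "output": {"name": "features", "type": "double", "struct": "vector"},
}

# Default separators are ", " and ": ", which add one space per comma and colon.
default_json = json.dumps(schema)
# Compact separators drop that whitespace without changing the JSON content.
compact_json = json.dumps(schema, separators=(",", ":"))

for label, text in (("default", default_json), ("compact", compact_json)):
    verdict = "fits" if len(text) <= ENV_VALUE_LIMIT else "too long"
    print(f"{label}: {len(text)} characters ({verdict})")
```

For this particular schema the default encoding is over 1024 characters while the compact one is under it, so passing compact_json as the env value might be enough here. A schema with many more columns would still exceed the limit, in which case the container rebuild remains the fallback.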

How to make a JSON Schema for a JSON file in Oxygen

I have a JSON file and I need to write a schema for it in Oxygen.
"characters": [
{
"house":"Gryffindor",
"orderOfThePhoenix":false,
"name":"Cuthbert Binns",
"bloodStatus":"unknown",
"deathEater":false,
"dumbledoresArmy":false,
"school":"Hogwarts School of Witchcraft and Wizardry",
"role":"Professor, History of Magic",
"__v":0,
"ministryOfMagic":false,
"_id":"5a0fa67dae5bc100213c2333",
"species":"ghost"
}
],
"spells": [
{
"spell":"Aberto",
"effect":"opens objects",
"_id":"5b74ebd5fb6fc0739646754c",
"type":"Charm"
}
],
"houses": [
{
"values": [
"courage",
"bravery",
"nerve",
"chivalry"
],
"headOfHouse":"Minerva McGonagall",
"mascot":"lion",
"name":"Gryffindor",
"houseGhost":"Nearly Headless Nick",
"founder":"Goderic Gryffindor",
"colors": [
"scarlet",
"gold"
],
"school":"Hogwarts School of Witchcraft and Wizardry",
"__v":0,
"members": [
"5a0fa648ae5bc100213c2332",
"5a0fa67dae5bc100213c2333",
"5a0fa7dcae5bc100213c2338",
"5a123f130f5ae10021650dcc"
],
"_id":"5a05e2b252f721a3cf2ea33f"
},
The actual JSON file is, of course, much bigger. Links to related material or tutorials would help too.
Could you please help me create a schema for it?
If you want to create a JSON Schema, the best way to start is to check the "json-schema.org" tutorials. You can find them here:
https://json-schema.org/learn/getting-started-step-by-step.html
https://json-schema.org/understanding-json-schema/
In the next version of Oxygen there will be support to create a JSON Schema based on a JSON instance or on an XSD, but you will need to check the created schema and customize it for your needs.
For example, for the instance you provided the schema can look something like this:
{
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"characters": {"$ref": "#/definitions/characters_type"},
"spells": {"$ref": "#/definitions/spells_type"},
"houses": {"$ref": "#/definitions/houses_type"}
},
"definitions": {
"characters_type": {
"type": "array",
"minItems": 0,
"items": {
"type": "object",
"properties": {
"house": {"type": "string"},
"orderOfThePhoenix": {"type": "boolean"},
"name": {"type": "string"},
"bloodStatus": {"type": "string"},
"deathEater": {"type": "boolean"},
"dumbledoresArmy": {"type": "boolean"},
"school": {"type": "string"},
"role": {"type": "string"},
"__v": {"type": "number"},
"ministryOfMagic": {"type": "boolean"},
"_id": {"type": "string"},
"species": {"type": "string"}
},
"required": [
"role",
"bloodStatus",
"school",
"species",
"deathEater",
"dumbledoresArmy",
"__v",
"name",
"ministryOfMagic",
"_id",
"orderOfThePhoenix",
"house"
]
}
},
"spells_type": {
"type": "array",
"minItems": 0,
"items": {
"type": "object",
"properties": {
"spell": {"type": "string"},
"effect": {"type": "string"},
"_id": {"type": "string"},
"type": {"type": "string"}
},
"required": [
"spell",
"effect",
"_id",
"type"
]
}
},
"values_type": {
"type": "array",
"minItems": 0,
"items": {"type": "string"}
},
"houses_type": {
"type": "array",
"minItems": 0,
"items": {
"type": "object",
"properties": {
"values": {"$ref": "#/definitions/values_type"},
"headOfHouse": {"type": "string"},
"mascot": {"type": "string"},
"name": {"type": "string"},
"houseGhost": {"type": "string"},
"founder": {"type": "string"},
"colors": {"$ref": "#/definitions/values_type"},
"school": {"type": "string"},
"__v": {"type": "number"},
"members": {"$ref": "#/definitions/values_type"},
"_id": {"type": "string"}
},
"required": [
"headOfHouse",
"houseGhost",
"mascot",
"school",
"founder",
"values",
"__v",
"members",
"name",
"_id",
"colors"
]
}
}
}
}
Best Regards,
Octavian
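For a much bigger file, a first draft of such a schema can be derived mechanically from one instance and then reviewed by hand. The function below is a rough sketch of that idea and makes no claim about how Oxygen generates schemas: infer_schema is a made-up name, arrays are typed from their first element only, every key that is present is marked required, and the sample instance is a shortened copy of the one in the question.

```python
import json


def infer_schema(value):
    """Derive a draft-07 style schema fragment from one decoded JSON value."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"type": "boolean"}
    if isinstance(value, (int, float)):
        return {"type": "number"}
    if isinstance(value, str):
        return {"type": "string"}
    if value is None:
        return {"type": "null"}
    if isinstance(value, list):
        schema = {"type": "array"}
        if value:  # assumption: all elements look like the first one
            schema["items"] = infer_schema(value[0])
        return schema
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {key: infer_schema(item) for key, item in value.items()},
            "required": sorted(value),  # every observed key is marked required
        }
    raise TypeError(f"not a JSON value: {value!r}")


# Shortened copy of the instance from the question.
instance = {
    "spells": [
        {"spell": "Aberto", "effect": "opens objects",
         "_id": "5b74ebd5fb6fc0739646754c", "type": "Charm"}
    ],
    "houses": [
        {"name": "Gryffindor", "colors": ["scarlet", "gold"], "__v": 0}
    ],
}

draft = {"$schema": "http://json-schema.org/draft-07/schema#", **infer_schema(instance)}
print(json.dumps(draft, indent=2))
```

The output still needs the same manual review mentioned above: fields that are optional in the real data have to be taken out of required, and repeated shapes can be factored into definitions by hand.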

Validate Json schema based on the value specified for a property

I have a JSON request with the data below and a corresponding JSON schema for it.
With this request, I want to make a few fields required depending on the mode.
If mode is 1, I want fields a and b in obj1 and field x in obj3 to be required.
If mode is 2, I want fields p, q and r in obj2, fields a and c in obj1, and field y in obj3 to be required.
If mode is 3, I want only fields a and c to be required.
Json request
{
"mode": "1",
"obj1": {
"a": 12,
"b": "test",
"c": "18 June 2019"
},
"obj2": {
"p": 100,
"q": "new",
"r": "19 June 2019",
"s" : "test2"
},
"obj3": {
"x": 12,
"y": "test3"
}
}
Json schema
{
"definitions": {},
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "http://example.com/root.json",
"type": "object",
"properties": {
"mode": {
"$id": "#/properties/mode",
"type": "string",
"examples": [
"1"
]
},
"obj1": {
"$id": "#/properties/obj1",
"type": "object",
"title": "The Obj 1 Schema",
"properties": {
"a": {
"$id": "#/properties/obj1/properties/a",
"type": "integer",
"examples": [
12
]
},
"b": {
"$id": "#/properties/obj1/properties/b",
"type": "string",
"examples": [
"test"
]
},
"c": {
"$id": "#/properties/obj1/properties/c",
"type": "string",
"examples": [
"18 June 2019"
]
}
}
},
"obj 2": {
"$id": "#/properties/obj2",
"type": "object",
"title": "The Obj 2 Schema",
"properties": {
"p": {
"$id": "#/properties/obj2/properties/p",
"type": "integer",
"examples": [
100
]
},
"q": {
"$id": "#/properties/obj2/properties/q",
"type": "string",
"examples": [
"new"
]
},
"r": {
"$id": "#/properties/obj2/properties/r",
"type": "string",
"examples": [
"19 June 2019"
]
},
"s": {
"$id": "#/properties/obj2/properties/s",
"type": "string",
"examples": [
"test2"
]
}
}
},
"obj 3": {
"$id": "#/properties/obj3",
"type": "object",
"title": "The Obj 3 Schema",
"properties": {
"x": {
"$id": "#/properties/obj3/properties/x",
"type": "integer",
"examples": [
12
]
},
"y": {
"$id": "#/properties/obj3/properties/y",
"type": "string",
"examples": [
"test3"
]
}
}
}
}
}
EDIT - Changed the schema to validate based on the suggestion by #gregsdennis
JSON SCHEMA
{
"definitions": {},
"$schema": "http://json-schema.org/draft-07/schema#",
"$id": "http://example.com/root.json",
"type": "object",
"properties": {
"mode": {
"$id": "#/properties/mode",
"type": "string",
"examples": [
"1"
]
},
"obj1": {
"$id": "#/properties/obj1",
"type": "object",
"title": "The Obj 1 Schema",
"properties": {
"a": {
"type": "integer",
"examples": [
12
]
},
"b": {
"type": "string",
"examples": [
"test"
]
},
"c": {
"type": "string",
"examples": [
"18 June 2019"
]
}
}
},
"obj 2": {
"$id": "#/properties/obj2",
"type": "object",
"title": "The Obj 2 Schema",
"properties": {
"p": {
"type": "integer",
"examples": [
100
]
},
"q": {
"type": "string",
"examples": [
"new"
]
},
"r": {
"type": "string",
"examples": [
"19 June 2019"
]
},
"s": {
"type": "string",
"examples": [
"test2"
]
}
}
},
"obj 3": {
"$id": "#/properties/obj3",
"type": "object",
"title": "The Obj 3 Schema",
"properties": {
"x": {
"type": "integer",
"examples": [
12
]
},
"y": {
"type": "string",
"examples": [
"test3"
]
}
}
}
},
"oneOf": [
{
"properties": {
"mode": {"const": 1},
"obj1": {"required": ["a","b"]},
"obj3": {"required": ["x"]}
}
},
{
"properties": {
"mode": {"const": 2},
"obj2": {"required": ["p","q","r"]},
"obj1": {"required": ["a","c"]},
"obj3": {"required": ["y"]}
}
}
]
}
So, in brief, irrespective of how many modes, fields or objects I have, I would like only a few selected fields from different objects to be required at a given time for a particular mode.
Can anyone please suggest any solutions to achieve this? Is it possible to have such validations in the json schema?
What you want is a oneOf where each subschema describes one valid state of the obj* properties. Note that mode is a string in your request, so the const has to be the string "1" rather than the number 1. Each state would look something like this:
{
"properties": {
"mode": {"const": "1"},
"obj1": {"required": ["a","b"]},
"obj3": {"required": ["x"]}
}
}
Create one of these for each of the states you listed in your question, and put them all in a oneOf that lives at the root.
You could do this with if/then/else, but for this case I'd prefer the oneOf to avoid nesting.
Also, I notice that you have a lot of superfluous $ids in the middle that just specify their location within the schema. You want the one at the root, but you don't need the others. Implementations can work out these kinds of location-based references trivially.
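Since the question asks for something that scales to any number of modes, the oneOf branches can be generated from a small table instead of written by hand. This is only a sketch: build_mode_schema and the RULES table are made up for illustration, mode 3 is assumed to mean fields a and c of obj1, and each branch additionally lists mode and the affected objects as required, which goes beyond the answer above but keeps a branch from matching when an object is missing entirely.

```python
import json


def build_mode_schema(rules):
    """Build a oneOf with one branch per mode.

    `rules` maps a mode value to {object name: [required field names]}.
    """
    return {
        "oneOf": [
            {
                "properties": {
                    "mode": {"const": mode},
                    **{name: {"required": fields} for name, fields in per_object.items()},
                },
                # Assumption: mode and the listed objects must themselves be present.
                "required": ["mode", *per_object],
            }
            for mode, per_object in rules.items()
        ]
    }


# The three modes described in the question; mode is a string in the request.
RULES = {
    "1": {"obj1": ["a", "b"], "obj3": ["x"]},
    "2": {"obj2": ["p", "q", "r"], "obj1": ["a", "c"], "obj3": ["y"]},
    "3": {"obj1": ["a", "c"]},  # assumed: a and c refer to obj1
}

mode_schema = build_mode_schema(RULES)
print(json.dumps(mode_schema, indent=2))
```

The resulting oneOf can be merged into the root of the existing schema next to properties. Adding a mode then means adding one entry to the table.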