js2xmlparser not parsing numeric keys - json

I am trying to create an XML file from a JSON object using js2xmlparser. My code is below:
var js2xmlparser = require("js2xmlparser");
var data = {
"product": "painting",
"88CODE": "-2"
};
console.log(js2xmlparser.parse("product", data));
But it throws an error as below:
E:\projects\xml-generator\node_modules\xmlcreate\lib\nodes\XmlElement.js:94
throw new Error("name should not contain characters not"
^
Error: name should not contain characters not allowed in XML names
at XmlElement.set [as name] (E:\projects\xml-generator\node_modules\xmlcreate\lib\nodes\XmlElement.js:94:23)
at new XmlElement (E:\projects\xml-generator\node_modules\xmlcreate\lib\nodes\XmlElement.js:72:20)
at XmlElement.element (E:\projects\xml-generator\node_modules\xmlcreate\lib\nodes\XmlElement.js:218:23)
at parseObjectOrMapEntry (E:\projects\xml-generator\node_modules\js2xmlparser\lib\main.js:130:33)
at parseObjectOrMap (E:\projects\xml-generator\node_modules\js2xmlparser\lib\main.js:152:13)
at parseValue (E:\projects\xml-generator\node_modules\js2xmlparser\lib\main.js:220:9)
at parseToDocument (E:\projects\xml-generator\node_modules\js2xmlparser\lib\main.js:249:5)
at Object.parse (E:\projects\xml-generator\node_modules\js2xmlparser\lib\main.js:265:20)
at Object.<anonymous> (E:\projects\xml-generator\server.js:16:26)
at Module._compile (module.js:570:32)
I want one of the nodes to be <88CODE>. How do I resolve this issue?
Thanks

The XML standard states that an element name must start with a letter or an underscore, hence your error. The data property 88CODE starts with a digit, so it must be renamed.
So the short answer is that an element cannot be named 88CODE; trying to create one will always raise this error. Consider renaming the property to something else, such as _88CODE.
XML Element Naming Rules
Element names are case-sensitive.
Element names must start with a letter or underscore.
Element names should not start with the letters xml (or XML, or Xml, etc.); the XML specification reserves such names.
Element names can contain letters, digits, hyphens, underscores, and periods.
Element names cannot contain spaces.
(Source: w3schools.)

This will work fine. Rename the second field so that it starts with a letter or an underscore:
var js2xmlparser = require("js2xmlparser");
var data = {
"product": "painting",
"_88CODE": "-2"
};
console.log(js2xmlparser.parse("product", data));
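If the keys come from data you do not control, renaming them by hand may not be practical. As a rough illustration of the renaming rule, here is a minimal Python sketch that turns an arbitrary key into a usable element name. The function name to_xml_name is hypothetical, and the rule set is a simplified ASCII-only version of the naming rules listed above, not the full XML Name grammar.

```python
import re

def to_xml_name(key: str) -> str:
    """Return a simplified, ASCII-only XML-safe version of key (illustrative only)."""
    # Replace anything other than letters, digits, underscore, period, or hyphen.
    cleaned = re.sub(r"[^A-Za-z0-9_.\-]", "_", key)
    # A name must start with a letter or underscore, so prefix one if needed.
    if not re.match(r"[A-Za-z_]", cleaned):
        cleaned = "_" + cleaned
    return cleaned

data = {"product": "painting", "88CODE": "-2"}
safe_data = {to_xml_name(k): v for k, v in data.items()}
```

Here safe_data keeps the "product" key unchanged and stores the second value under "_88CODE". Two different keys could map to the same name, so real code would need to check for collisions.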

Related

How to parse invalid JSON containing an invalid number

I work with a legacy customer who sends me webhook events. Sometimes their system sends me a value that looks like this
[{"id":"LXKhRA3RHtaVBhnczVRJLdr","ecc":"0X6","cph":"X1X4X77074", "ts":16XX445656000}]
I am using python's json.loads to parse the data sent to me. Here the ts is an invalid number and python gives json.decoder.JSONDecodeError whenever I try to parse this string.
It is okay with me to get None in ts field if I can not parse it.
What would be a smart (& possibly generic) way to solve this problem?
This may not be so generic, but you can try using yaml to load:
import yaml
s = '[{"id":"LXKhRA3RHtaVBhnczVRJLdr","ecc":"0X6","cph":"X1X4X77074","ts":16XX445656000}]'
yaml.safe_load(s)
Output:
[{'id': 'LXKhRA3RHtaVBhnczVRJLdr',
'ecc': '0X6',
'cph': 'X1X4X77074',
'ts': '16XX445656000'}]
If the problem is always in the ts key, and this value is always a string of numbers and letters, you could just remove it before trying to parse:
import json
import re
jstr = """[{"id":"LXKhRA3RHtaVBhnczVRJLdr","ecc":"0X6","cph":"X1X4X77074", "ts":16XX445656000}]"""
jstr_sanitized = re.sub(r',?\s*\"ts\":[A-Z0-9]+', "", jstr)
jobj = json.loads(jstr_sanitized)
# [{'id': 'LXKhRA3RHtaVBhnczVRJLdr', 'ecc': '0X6', 'cph': 'X1X4X77074'}]
Regex explanation (try online):
,?\s*\"ts\":[A-Z0-9]+
,? Zero or one commas
\s* Any number of whitespace characters
\"ts\": Literally "ts":
[A-Z0-9]+ One or more uppercase letters or numbers
Alternatively, you could catch the JSONDecodeError and look at its pos attribute for the offending character. Then, you could either remove just that character and try again, or look for the next space, comma, or bracket and remove characters until that point before you try again.
import json

jstr = """[{"id":"LXKhRA3RHtaVBhnczVRJLdr","ecc":"0X6","cph":"X1X4X77074", "ts":16XX445656000}]"""
while True:
    try:
        jobj = json.loads(jstr)
        break
    except json.JSONDecodeError as ex:
        # Drop the single offending character and retry.
        jstr = jstr[:ex.pos] + jstr[ex.pos+1:]
This mangles the output so that the ts key is now a valid integer (after removing the Xs) but since you don't care about that anyway, it should be fine:
[{'id': 'LXKhRA3RHtaVBhnczVRJLdr',
'ecc': '0X6',
'cph': 'X1X4X77074',
'ts': 16445656000}]
Since you'd end up repeatedly re-parsing the initial valid part, this is probably not a great idea if you have a huge json string, or there are lots of places that could throw an error, but it should be fine for the kind of example you have shown.
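Since the question says that getting None for the unparseable field is acceptable, the other variant described above (removing everything up to the next delimiter) can be adapted to replace the whole bad token with null. The following is a minimal sketch under that assumption; the function name loads_lenient and the max_fixes limit are hypothetical, and it only handles malformed bare tokens like the one in the example, not errors inside strings.

```python
import json

def loads_lenient(text: str, max_fixes: int = 100):
    """Parse JSON, replacing each unparseable bare token with null (sketch)."""
    for _ in range(max_fixes):
        try:
            return json.loads(text)
        except json.JSONDecodeError as ex:
            # Walk back to the start of the token containing the error.
            start = ex.pos
            while start > 0 and text[start - 1] not in ":,[{ \t\r\n":
                start -= 1
            # Walk forward to the next delimiter after the error.
            end = ex.pos
            while end < len(text) and text[end] not in ",]} \t\r\n":
                end += 1
            if start == end:
                raise  # nothing to replace, so give up with the original error
            text = text[:start] + "null" + text[end:]
    raise ValueError("too many invalid tokens")

events = loads_lenient('[{"id":"LXKhRA3RHtaVBhnczVRJLdr","ecc":"0X6","cph":"X1X4X77074", "ts":16XX445656000}]')
```

With the sample input, events[0]["ts"] is None and the other fields are unchanged. Like the loop above, this re-parses the text after every fix, so it is only reasonable for small payloads.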

Check and print occurrences of an array of string in a dataset in Python

I want to check whether any string from a set of strings occurs in a dataset, and print the rows where it does.
rareTitles = {"Capt", "Col", "Countess", "Don", "Dr", "Jonkheer", "Lady",
"Major", "Mlle", "Mme", "Ms", "Rev", "Sir"}
dataset[rareTitles in (dataset['Title'])]
I am getting the following error:
TypeError: unhashable type: 'set'
First of all, I think the comparison should go the other way around: you are looking for rows where dataset['Title'] contains a string from rareTitles.
You can use the str accessor of a pandas Series, which allows us to use string methods such as contains. Because this method also accepts a regular expression as its pattern, you can pass something like 'Capt|Col...'. To join all elements of the set, you can use the str.join() method.
So the solution would be
dataset[dataset['Title'].str.contains('|'.join(rareTitles))]
Link to documentation: pandas.Series.str.contains
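One thing to keep in mind is that a joined pattern matches substrings, so "Col" would also match a title such as "Colonel". The sketch below uses only the standard re module (not pandas) to show what the joined pattern matches and how an exact match differs; the sample titles list is made up for illustration.

```python
import re

rare_titles = {"Capt", "Col", "Countess", "Don", "Dr", "Jonkheer", "Lady",
               "Major", "Mlle", "Mme", "Ms", "Rev", "Sir"}

# Sort for a stable pattern and escape each title in case it contains
# regex metacharacters.
alternation = "|".join(re.escape(t) for t in sorted(rare_titles))
pattern = re.compile(alternation)

titles = ["Mr", "Dr", "Mrs", "Colonel", "Lady"]  # hypothetical sample values

# search() finds the pattern anywhere in the string (substring semantics).
substring_hits = [t for t in titles if pattern.search(t)]
# fullmatch() requires the entire string to be one of the titles.
exact_hits = [t for t in titles if pattern.fullmatch(t)]
```

Here substring_hits is ['Dr', 'Colonel', 'Lady'] while exact_hits is ['Dr', 'Lady']. If exact matching is what you want in pandas, dataset['Title'].isin(rareTitles) is a simpler alternative to a regular expression.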

How to send MarkDown to API

I'm trying to send some Markdown text to a REST API. I just figured out that line breaks are not accepted in JSON.
For example, how do I send this to my API:
An h1 header
============
Paragraphs are separated by a blank line.
2nd paragraph. *Italic*, **bold**, and `monospace`. Itemized lists
look like:
* this one
* that one
* the other one
Note that --- not considering the asterisk --- the actual text
content starts at 4-columns in.
> Block quotes are
> written like so.
>
> They can span multiple paragraphs,
> if you like.
Use 3 dashes for an em-dash. Use 2 dashes for ranges (ex., "it's all
in chapters 12--14"). Three dots ... will be converted to an ellipsis.
Unicode is supported. ☺
as
{
"body" : " (the markdown) ",
}
As you're trying to send it to a REST API endpoint, I'll assume you're looking for a way to do it in JavaScript (since you didn't specify which tech you are using).
Rule of thumb: unless your goal is to write your own JSON builder, use one that already exists.
And, guess what, JavaScript ships with its own JSON tools (see the documentation for the JSON object).
As shown in the documentation, you can use the JSON.stringify function to convert a value, such as an object or a string, into a JSON-encoded string that can later be decoded on the server side.
This example illustrates how to do so:
var arr = {
    text: "This is some text"
};
var json_string = JSON.stringify(arr);
// Result is:
// '{"text":"This is some text"}'
// Now json_string contains a JSON-compliant encoded string.
You also can decode JSON client-side with javascript using the other JSON.parse() method (see documentation):
var json_string = '{"text":"This is some text"}';
var arr = JSON.parse(json_string);
// Now arr contains an object (not an array) holding the value
// "This is some text", accessible with the key "text"
If that doesn't answer your question, please edit it to make it more precise, especially on what tech you're using. I'll edit this answer accordingly
You need to replace the line-endings with \n and then pass it in your body key.
Also, make sure you escape double-quotes (") by \" else your body will end there.
# An h1 header\n============\n\nParagraphs are separated by a blank line.\n\n2nd paragraph. *Italic*, **bold**, and `monospace`. Itemized lists\nlook like:\n\n * this one\n * that one\n * the other one\n\nNote that --- not considering the asterisk --- the actual text\ncontent starts at 4-columns in.\n\n> Block quotes are\n> written like so.\n>\n> They can span multiple paragraphs,\n> if you like.\n\nUse 3 dashes for an em-dash. Use 2 dashes for ranges (ex., \"it's all\nin chapters 12--14\"). Three dots ... will be converted to an ellipsis.\nUnicode is supported.
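Doing this escaping by hand is error-prone, so it is usually safer to let a JSON library do it. Since the question does not say which language is used, here is a minimal sketch in Python, assuming the API expects an object with a "body" key as in the question; the markdown text is a shortened stand-in for the full document.

```python
import json

# A shortened sample of the Markdown from the question, with real line breaks
# and double quotes in it.
markdown = (
    "An h1 header\n"
    "============\n"
    "\n"
    "Use 2 dashes for ranges (ex., \"it's all\n"
    "in chapters 12--14\")."
)

# json.dumps escapes line breaks as \n and double quotes as \" automatically.
payload = json.dumps({"body": markdown})

# Decoding gives back the original text unchanged.
restored = json.loads(payload)["body"]
```

The resulting payload is a single line of text that can be sent as the request body, and restored is identical to the original markdown string.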

Escape quotes inside quoted fields when parsing CSV in Flink

In Flink, parsing a CSV file using readCsvFile raises an exception when encountering a field containing quotes like "Fazenda São José ""OB"" Airport":
org.apache.flink.api.common.io.ParseException: Line could not be parsed: '191,"SDOB","small_airport","Fazenda São José ""OB"" Airport",-21.425199508666992,-46.75429916381836,2585,"SA","BR","BR-SP","Tapiratiba","no","SDOB",,"SDOB",,,'
I've found in this mailing list thread and this JIRA issue that quoting inside the field should be realized through the \ character, but I don't have control over the data to modify it. Is there a way to work around this?
I've also tried using ignoreInvalidLines() (which is the less preferable solution) but it gave me the following error:
08:49:05,737 INFO org.apache.flink.api.common.io.LocatableInputSplitAssigner - Assigning remote split to host localhost
08:49:05,765 ERROR org.apache.flink.runtime.operators.BatchTask - Error in task code: CHAIN DataSource (at main(Job.java:53) (org.apache.flink.api.java.io.TupleCsvInputFormat)) -> Map (Map at main(Job.java:54)) -> Combine(SUM(1), at main(Job.java:56) (2/8)
java.lang.ArrayIndexOutOfBoundsException: -1
at org.apache.flink.api.common.io.GenericCsvInputFormat.skipFields(GenericCsvInputFormat.java:443)
at org.apache.flink.api.common.io.GenericCsvInputFormat.parseRecord(GenericCsvInputFormat.java:412)
at org.apache.flink.api.java.io.CsvInputFormat.readRecord(CsvInputFormat.java:111)
at org.apache.flink.api.common.io.DelimitedInputFormat.nextRecord(DelimitedInputFormat.java:454)
at org.apache.flink.api.java.io.CsvInputFormat.nextRecord(CsvInputFormat.java:79)
at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:176)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
at java.lang.Thread.run(Thread.java:745)
Here is my code:
DataSet<Tuple2<String, Integer>> csvInput = env.readCsvFile("resources/airports.csv")
    .ignoreFirstLine()
    .ignoreInvalidLines()
    .parseQuotedStrings('"')
    .includeFields("100000001")
    .types(String.class, String.class)
    .map((Tuple2<String, String> value) -> new Tuple2<>(value.f1, 1))
    .groupBy(0)
    .sum(1);
If you cannot change the input data, then you should not call parseQuotedStrings(). The parser will then simply look for the next field delimiter and return everything in between as a string (including the quotation marks). You can then remove the leading and trailing quotation marks in a subsequent map operation.
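The cleanup step itself is independent of Flink. As a language-neutral illustration, here is a minimal Python sketch of what the map function would do to each raw field; the function name unquote_field is hypothetical, and collapsing doubled quotes is an extra step that assumes the data follows the usual CSV convention of writing a literal quote as "". Note that without quoted-string parsing, a quoted field that itself contains the field delimiter would be split in the wrong place.

```python
def unquote_field(raw: str) -> str:
    """Strip one pair of surrounding double quotes and unescape doubled quotes."""
    if len(raw) >= 2 and raw[0] == '"' and raw[-1] == '"':
        # Remove the outer quotes, then turn each "" back into a single ".
        return raw[1:-1].replace('""', '"')
    # Unquoted fields (numbers, empty fields) are returned unchanged.
    return raw

name = unquote_field('"Fazenda São José ""OB"" Airport"')
```

Here name is the string Fazenda São José "OB" Airport, with the outer quotes removed and the inner quotes restored.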

receiving nth character when trying to access JSON values

I am having trouble accessing JSON values generated from a PHP array:
var latlng = '<?php echo json_encode($coordinates); ?>';
alert(latlng) produces:
[{
"1280":{"lat":"-1.197070","lng":"-1.197070"},
"1239":{"lat":"-1.222410","lng":"-1.222410"},
"1258":{"lat":"-1.153020","lng":"-1.153020"},
...
}]
I've tried all sorts of ways to access lat and lng for a specific ID, and the only result other than undefined has been the nth character of latlng, as if it's being treated like a string?!
alert(latlng[10]); // {
alert(latlng[1280]['lat']); // undefined
alert(latlng['1280'].lat); // undefined
You don't want to put the JSON in quotes, so:
var latlng = <?php echo json_encode($coordinates); ?>;
(Technically, that's not JSON at all, it's a JavaScript object initializer. But that's fine, JSON is a subset of initializer syntax and so all valid JSON texts are also valid JavaScript initializers.)
If the structure is really as you've quoted it, it's an array with one entry, which is an object with properties with names like 1280 and 1258, whose values are objects with properties named lat and lng. So you'd access those like this:
alert(latlng[0]["1280"].lat);
latlng is the array, latlng[0] is the one object it holds, and latlng[0]["1280"] is the {"lat":"-1.197070","lng":"-1.197070"} object.
You may be wondering why I've used quotes around 1280 above. It's because those keys are clearly given as strings (as is required in JSON, though not in JavaScript initializers), and so I can't be sure there aren't entries like "0012". Property names are always strings even when not written as strings, so latlng[0][1280] and latlng["0"]["1280"] both mean the same thing (because the 0 and the 1280 are converted to string [yes, really]), but naturally latlng[0]["0012"] is not the same as latlng[0][12] because the latter uses "12", not "0012", as the property name. If you know you won't have leading zeros, you can ditch the quotes.
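The difference between indexing the raw text and indexing the parsed structure is not specific to JavaScript. As a rough analogue, here is a minimal Python sketch using a shortened copy of the data from the question:

```python
import json

raw = '[{"1280":{"lat":"-1.197070","lng":"-1.197070"},"1239":{"lat":"-1.222410","lng":"-1.222410"}}]'

# Indexing the unparsed text returns a single character, as in the question.
first_char = raw[0]

# After parsing, the value is a list holding one dict keyed by ID strings.
parsed = json.loads(raw)
lat = parsed[0]["1280"]["lat"]
```

Here first_char is "[" and lat is "-1.197070". Unlike JavaScript, Python does not convert keys to strings, so parsed[0][1280] raises a KeyError and the string key "1280" must be used.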