Qt - Parse XML with HTML entity notation for umlauts

I hope somebody can help me.
My problem is that I receive XML that uses HTML entities for special characters, like:
<person>
    <firstname>Max</firstname>
    <lastname>M&uuml;ller</lastname>
</person>
<person>
    <firstname>Bernd</firstname>
    <lastname>Sch&auml;fer</lastname>
</person>
I can't find a way in Qt to decode "&uuml;" to a normal "ü". In the Qt DOM tree this entity ends up in a QDomEntityReference object, which has no getter or any other output or parsing functionality.
I use the standard way to parse the XML tree:
QDomDocument doc;
if (!doc.setContent(response, &errors))
    return false;
QDomElement const & root = doc.firstChildElement("person");
for (QDomElement xmlPerson = root.firstChildElement("person"); !xmlPerson.isNull(); xmlPerson = xmlPerson.nextSiblingElement("person"))
{
    QDomNodeList personCont = xmlPerson.childNodes();
    PersonObj person;
    for (int i = 0; i < personCont.count(); i++)
    {
        QDomNode itemNode = personCont.at(i);
        if (itemNode.isElement()) {
            QDomElement item = itemNode.toElement();
            // QDomElement::text() concatenates the element's text children;
            // QDomNode itself has no text() method.
            if (item.tagName() == "firstname")
            {
                person.setFirstname(item.text());
            }
            else if (item.tagName() == "lastname")
            {
                addressBook.setLastname(item.text());
            }
            ...
Result:
Max Mller
Bernd Schfer
Thanks for your great answers.

Use QTextDocument:
QTextDocument doc;
doc.setHtml("Sch&auml;fer");
qDebug() << doc.toPlainText();
In your example:
QTextDocument doc;
// C++ cannot switch on a QString, so compare the tag names with if/else.
if (item.tagName() == "firstname")
{
    doc.setHtml(item.text());
    person.setFirstname(doc.toPlainText());
}
else if (item.tagName() == "lastname")
{
    doc.setHtml(item.text());
    addressBook.setLastname(doc.toPlainText());
}
...
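Another option, offered only as a rough sketch and not as something from the answer above, is to rewrite the named entities into numeric character references before the XML is parsed, since numeric references such as &#252; are valid XML without any DTD. The function name htmlEntitiesToNumeric and the small entity table are assumptions made for illustration; the table covers only the German umlauts and ß, and the code assumes C++11 and plain std::string input.

```cpp
#include <map>
#include <string>

// Replace a few named HTML entities with numeric character references,
// which an XML parser accepts without a DTD.
// The table is an illustrative subset, not the full HTML entity list.
std::string htmlEntitiesToNumeric(const std::string &xml)
{
    static const std::map<std::string, std::string> table = {
        {"auml", "&#228;"}, {"ouml", "&#246;"}, {"uuml", "&#252;"},
        {"Auml", "&#196;"}, {"Ouml", "&#214;"}, {"Uuml", "&#220;"},
        {"szlig", "&#223;"}
    };

    std::string out;
    out.reserve(xml.size());
    std::size_t pos = 0;
    while (pos < xml.size()) {
        if (xml[pos] == '&') {
            const std::size_t end = xml.find(';', pos + 1);
            if (end != std::string::npos) {
                const auto it = table.find(xml.substr(pos + 1, end - pos - 1));
                if (it != table.end()) {
                    out += it->second;   // known entity: emit the numeric reference
                    pos = end + 1;
                    continue;
                }
            }
        }
        out += xml[pos++];               // everything else is copied unchanged
    }
    return out;
}
```

Under these assumptions, "M&uuml;ller" becomes "M&#252;ller", while XML's own entities such as &amp; are left alone. In the Qt code above the converted text would be handed to doc.setContent(); the conversion between the Qt string types and std::string is not shown here.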

Related

Ignore namespace attributes while serializing xml data to json

I am trying to serialize xml directly to json using JsonSerializer() but the namespace attributes are getting added as fields in the final json. Any suggestion on how to remove this? I tried with JsonConvert.Serialize() but some childnodes are missing in the serialized json.
A solution to your problem could be to deserialize your object to a dictionary first. This way you can add some logic in between the conversion to it.
Check the example below:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml;
using Newtonsoft.Json;

var xml = @"<?xml version='1.0' standalone='no'?>
<root>
  <person id='1'>
    <name>Alan</name>
    <url>http://www.google.com</url>
  </person>
  <person id='2'>
    <name>Louis</name>
    <url>http://www.yahoo.com</url>
  </person>
</root>";
XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);
var childNodeList = doc.DocumentElement.ChildNodes;
for (int i = 0; i < childNodeList.Count; i++)
{
    var nodes = childNodeList.Item(i).ChildNodes;
    var dict = new Dictionary<string, object>();
    foreach (XmlNode node in nodes)
    {
        // Serialize each child element on its own, then take its single key/value pair
        var serializedNode = JsonConvert.SerializeXmlNode(node);
        var prop = JsonConvert.DeserializeObject<IDictionary<string, object>>(serializedNode).FirstOrDefault();
        dict.Add(prop.Key, prop.Value ?? " ");
    }
    Console.WriteLine($"item {i}");
    Console.WriteLine(string.Join("\r\n", dict.Select(e => $"{e.Key}: {e.Value}")));
}
Output:
//item 0
//name: Alan
//url: http://www.google.com
//item 1
//name: Louis
//url: http://www.yahoo.com

How can I deserialize invalid JSON? Truncated list of objects

My JSON file is mostly an array that contains objects, but the list is incomplete, so I can't use the last entry. I would like to deserialize the rest of the file while discarding the last, invalid entry:
[ { "key" : "value1" }, { "key " : "value2"}, { "key
Please tell me if there is a way using Newtonsoft.Json library, or do I need some preprocessing.
Thank you!
Looks like on Json.NET 8.0.3 you can stream your string from a JsonTextReader to a JTokenWriter and get a partial result by catching and swallowing the JsonReaderException that gets thrown when parsing the truncated JSON:
JToken root;
string exceptionPath = null;
using (var textReader = new StringReader(badJson))
using (var jsonReader = new JsonTextReader(textReader))
using (JTokenWriter jsonWriter = new JTokenWriter())
{
    try
    {
        jsonWriter.WriteToken(jsonReader);
    }
    catch (JsonReaderException ex)
    {
        exceptionPath = ex.Path;
        Debug.WriteLine(ex);
    }
    root = jsonWriter.Token;
}
Console.WriteLine(root);
if (exceptionPath != null)
{
    Console.WriteLine("Error occurred with token: ");
    var badToken = root.SelectToken(exceptionPath);
    Console.WriteLine(badToken);
}
This results in:
[
  {
    "key": "value1"
  },
  {
    "key ": "value2"
  },
  {}
]
You could then finish deserializing the partial object with JToken.ToObject. You could also delete the incomplete array entry by using badToken.Remove().
It would be better practice not to generate invalid JSON in the first place though. I'm also not entirely sure this is documented functionality of Json.NET, and thus it might not work with future versions of Json.NET. (E.g. conceivably Newtonsoft could change their algorithm such that JTokenWriter.Token is only set when writing is successful.)
You can use the JsonReader class and try to parse as far as you get. Something like the code below will parse as many properties as it gets and then throw an exception. This is of course if you want to deserialize into a concrete class.
public Partial FromJson(JsonReader reader)
{
    while (reader.Read())
    {
        // Break on EndObject
        if (reader.TokenType == JsonToken.EndObject)
            break;
        // Only look for properties
        if (reader.TokenType != JsonToken.PropertyName)
            continue;
        switch ((string) reader.Value)
        {
            case "Id":
                reader.Read();
                Id = Convert.ToInt16(reader.Value);
                break;
            case "Name":
                reader.Read();
                Name = Convert.ToString(reader.Value);
                break;
        }
    }
    return this;
}
Code taken from the CGbR JSON Target.
The second answer above is really good and simple, and it helped me out! Wrapped in a helper method:
static string FixPartialJson(string badJson)
{
    JToken root;
    string exceptionPath = null;
    using (var textReader = new StringReader(badJson))
    using (var jsonReader = new JsonTextReader(textReader))
    using (JTokenWriter jsonWriter = new JTokenWriter())
    {
        try
        {
            jsonWriter.WriteToken(jsonReader);
        }
        catch (JsonReaderException ex)
        {
            // Path of the token that was being read when the input ended
            exceptionPath = ex.Path;
        }
        root = jsonWriter.Token;
    }
    return root.ToString();
}
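The question also asks whether some preprocessing would do. As a rough sketch of that route, written in plain C++11 and not in C# with Json.NET, the truncated text can be cut back to the last complete top-level element and the array closed again before handing it to any JSON parser. The function name trimTruncatedArray is hypothetical, and the sketch assumes that the array elements are objects or arrays (as in the question), not bare numbers or strings.

```cpp
#include <string>

// Cut a truncated JSON array back to its last complete top-level element
// and close it. Returns "[]" if no element is complete.
// Assumption: elements are objects or arrays, not bare scalars.
std::string trimTruncatedArray(const std::string &json)
{
    const std::size_t start = json.find('[');
    if (start == std::string::npos)
        return "[]";

    int depth = 0;                              // nesting depth of [] and {}
    bool inString = false;                      // inside a "..." literal
    bool escaped = false;                       // previous character was a backslash
    std::size_t lastGood = std::string::npos;   // index just past the last complete element

    for (std::size_t i = start; i < json.size(); ++i) {
        const char c = json[i];
        if (inString) {
            if (escaped) escaped = false;
            else if (c == '\\') escaped = true;
            else if (c == '"') inString = false;
            continue;                           // brackets inside strings do not count
        }
        if (c == '"') {
            inString = true;
        } else if (c == '[' || c == '{') {
            ++depth;
        } else if (c == ']' || c == '}') {
            --depth;
            if (depth == 0)                     // the array was complete after all
                return json.substr(start, i - start + 1);
            if (depth == 1)                     // a top-level element just closed
                lastGood = i + 1;
        }
    }
    if (lastGood == std::string::npos)
        return "[]";
    return json.substr(start, lastGood - start) + "]";
}
```

For the input from the question this keeps the first two objects and drops the dangling third one, so the result parses as a normal array. Unlike the JTokenWriter approach, it discards the partial entry entirely, with no empty {} left behind.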

Turn QString to JSON

I have the following:
QString notebookid = ui->notebookid->toPlainText();
QString tagid = ui->tagid->toPlainText();
QString userid = ui->userid->toPlainText();
QString subject = ui->subject->toPlainText();
QString comment = ui->comment->toPlainText();
I need to turn them into JSON, where the key is the notebookid, tagid, etc and the value is in the ui->notebookid, etc.
What's the best way to go about doing this?
Thanks.
I'll answer this based on the fact that you were using Qt 4.8 and would not have the QJsonObject available from Qt5.
I use QJSON for exactly this. It's an easy-to-use library using QVariants to parse and serialize the data.
This would be how you'd turn your data into json using QJSON:
QVariantMap jsonMap;
jsonMap.insert("notebookid", notebookid);
jsonMap.insert("tagid", tagid);
jsonMap.insert("userid", userid );
jsonMap.insert("subject", subject );
jsonMap.insert("comment", comment);
QJson::Serializer serializer;
bool ok;
QByteArray json = serializer.serialize(jsonMap, &ok);
assert (ok);
In Qt 5, you can use QJsonObject. One way is to explicitly select the controls to serialize:
QJsonObject MyDialog::serialize() const {
    QJsonObject json;
    json.insert("notebookid", ui->notebookid->toPlainText());
    ...
    return json;
}
Another way is to have a generic serializer that uses the Qt's metadata. Each named control's user property is then serialized:
QJsonObject serializeDialog(const QWidget * dialog) {
    QJsonObject json;
    foreach (QWidget * widget, dialog->findChildren<QWidget*>()) {
        if (widget->objectName().isEmpty()) continue;
        QMetaProperty prop = widget->metaObject()->userProperty();
        if (! prop.isValid()) continue;
        QJsonValue val(QJsonValue::fromVariant(prop.read(widget)));
        if (val.isUndefined()) continue;
        json.insert(widget->objectName(), val);
    }
    return json;
}
You can convert QJsonDocument to text as follows:
QJsonDocument doc(serializeDialog(myDialog));
QString jsonText = QString::fromUtf8(doc.toJson());
Unfortunately, Qt 5's json code requires a bunch of changes to compile under Qt 4.
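If neither QJSON nor Qt 5 is available, a flat set of string fields like the one in the question can also be written out by hand. The following is only a minimal sketch under stated assumptions: it uses std::string and std::map in place of the Qt types, requires C++11, handles nothing but string values, and the names jsonEscape and toJsonObject are made up for this example.

```cpp
#include <cstdio>
#include <map>
#include <string>

// Escape a UTF-8 string for use inside a JSON string literal.
std::string jsonEscape(const std::string &s)
{
    std::string out;
    for (unsigned char c : s) {
        switch (c) {
        case '"':  out += "\\\""; break;
        case '\\': out += "\\\\"; break;
        case '\n': out += "\\n";  break;
        case '\r': out += "\\r";  break;
        case '\t': out += "\\t";  break;
        default:
            if (c < 0x20) {                     // remaining control characters
                char buf[7];
                std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(c));
                out += buf;
            } else {
                out += static_cast<char>(c);    // UTF-8 bytes pass through unchanged
            }
        }
    }
    return out;
}

// Serialize a flat string-to-string map as a JSON object.
// Keys come out in the map's sorted order.
std::string toJsonObject(const std::map<std::string, std::string> &fields)
{
    std::string out = "{";
    bool first = true;
    for (const auto &kv : fields) {
        if (!first) out += ",";
        first = false;
        out += "\"" + jsonEscape(kv.first) + "\":\"" + jsonEscape(kv.second) + "\"";
    }
    return out + "}";
}
```

The five fields from the question would be inserted into the map under the keys "notebookid", "tagid" and so on. For anything nested or typed, a real JSON library remains the safer choice.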

How do I pull out the JSON field I want using Jackson TreeNode and JsonNode?

I'm a little stumped why I can't pull the "Type" field out of my JSON stream to make a decision. It seems like this should be so easy.
I have the following JSON that I have as input:
[
  {
    "Institution":"ABC",
    "Facility":"XYZ",
    "Make":"Sunrise",
    "Model":"Admission",
    "SerialNumber":"",
    "Revision":"1",
    "Type":"ABC_Admission",
    "ArchiveData":"<CSV file contents>"
  }
]
In my Java I have a try-catch block with a JsonHolder class that implements Serializable to hold the JSON. Here's the Java I currently have:
try {
    // Parse and split the input
    JsonHolder data = JsonHolder.getField("text", input);
    DataExtractor.LOG.info("JsonHolder data= " + data);
    TreeNode node = data.getTreeNode();
    DataExtractor.LOG.info("node size= " + node.size());
    node = node.path("Type");
    JsonNode json = (JsonNode) node;
    DataExtractor.LOG.info("json= " + json.asText());
    // code to decide what to do based on Type found
    if (json.asText().equals("ABC_Admission")) {
        // do one thing
    } else {
        // do something else
    }
} catch (IOException iox) {
    DataExtractor.LOG.error("Error extracting data", iox);
    this.collector.fail(input);
}
When I run my code I get the following output (NOTE: I changed the package name of the class to <proprietary package name> just for this output display):
25741 [Thread-91-DataExtractor] INFO <proprietary package name>.DataExtractor - JsonHolder data= [
{
"Institution":"ABC",
"Facility":"XYZ",
"Make":"Sunrise",
"Model":"Admission",
"SerialNumber":"",
"Revision":"1",
"Type":"ABC_Admission",
"ArchiveData":"<CSV file contents>"
}
]
25741 [Thread-91-DataExtractor] INFO <proprietary package name>.DataExtractor - node size= 1
25741 [Thread-91-DataExtractor] INFO <proprietary package name>.DataExtractor - json=
As you can see I don't get anything out. I just want to extract the value of the field "Type", so I was expecting to get the value "ABC_Admission" in this case. I would have thought the node path would separate out just that field from the rest of the JSON tree.
What am I doing wrong?
After consulting with another developer I found out the issue is my JSON is inside an array. So, I need to iterate over that array and then pull out the Type field from the object.
The updated code to resolve this is below:
try {
    // Parse and split the input
    JsonHolder data = JsonHolder.getField("text", input);
    DataExtractor.LOG.info("JsonHolder data= " + data);
    TreeNode node = data.getTreeNode();
    String type = null;
    // if this is an array of objects, iterate through the array
    // to get the object, and reference the field we want
    if (node.isArray()) {
        ArrayNode ary = (ArrayNode) node;
        for (int i = 0; i < ary.size(); ++i) {
            JsonNode obj = ary.get(i);
            if (obj.has("Type")) {
                type = obj.path("Type").asText();
                break;
            }
        }
    }
    if (type == null) {
        // Do something with failure??
    }
    DataExtractor.LOG.info("json= " + type);
    // Constant-first comparison avoids a NullPointerException when no Type was found
    if ("ABC_Admission".equals(type)) {
        // do one thing
    } else {
        // do something else
    }
} catch (IOException iox) {
    DataExtractor.LOG.error("Error extracting data", iox);
    this.collector.fail(input);
}

Change XML namespace

I have an xml file as following:
<?xml version="1.0" encoding="utf-8"?>
<ABC version="1" xmlns="urn:Company">
</ABC>
I am releasing version 2 and the namespace changed to "NewCompany".
How do you update the namespace?
I tried
XmlDocument xmlDocument = new XmlDocument();
using (XmlReader xmlReader = XmlReader.Create("myfile.xml"))
{
    xmlDocument.Load(xmlReader);
}
XmlNodeList nodeList = xmlDocument.GetElementsByTagName("ABC");
if (nodeList.Count == 1)
{
    XmlElement element = nodeList.Item(0) as XmlElement;
    if (element != null)
    {
        element.SetAttribute("xmlns", "NewCompany");
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = true;
        using (XmlWriter writer = XmlWriter.Create("myfile.xml", settings))
        {
            xmlDocument.WriteTo(writer);
        }
    }
}
But I get
"The prefix '' cannot be redefined from to within the same start element tag."
exception
I ran into this today and found a workaround. If you use XmlTextWriter instead of XmlWriter, the problem goes away. Your code sample would look something like this:
XmlNodeList nodeList = xmlDocument.GetElementsByTagName("ABC");
if (nodeList.Count == 1)
{
    XmlElement element = nodeList.Item(0) as XmlElement;
    if (element != null)
    {
        element.SetAttribute("xmlns", "NewCompany");
        using (XmlTextWriter writer = new XmlTextWriter("myfile.xml", Encoding.UTF8))
        {
            writer.Formatting = Formatting.Indented;
            xmlDocument.WriteTo(writer);
        }
    }
}
I'd have imagined that XmlWriter.Create would simply return an XmlTextWriter, but that doesn't seem to be the case. From looking around in Reflector, XmlWriter.Create seems to return concrete types different from XmlTextWriter.
XmlTextWriter seems to support changing the namespace of the document element, while the writer returned by XmlWriter.Create doesn't.
I realize that this question is four years old, but perhaps my answer will help someone.