How to use Groovy JsonOutput.toJson with data encoded with UTF-8?

I have a UTF-8 encoded file.
I wrote a Groovy script that loads the file as a JSON structure, modifies it and saves it:
import groovy.json.JsonOutput
import groovy.json.JsonSlurper

def originPreviewFilePath = "./xxx.json"
//target the file
def originFile = new File(originPreviewFilePath)
//load the UTF-8 data file as a JSON structure
def originPreview = new JsonSlurper().parse(originFile, 'UTF-8')
//Here is my own code to modify originPreview
//Convert the structure to JSON text
def resultPreviewJson = JsonOutput.toJson(originPreview)
//Beautify the JSON text (indent)
def finalFileData = JsonOutput.prettyPrint(resultPreviewJson)
//save the JSON text (resultPreviewFilePath is assumed to be defined elsewhere in the script)
new File(resultPreviewFilePath).write(finalFileData, 'UTF-8')
The problem is that JsonOutput.toJson writes non-ASCII characters as Unicode escape sequences (\uXXXX) instead of leaving them as UTF-8 text. I don't understand why JsonSlurper().parse can handle UTF-8 but JsonOutput.toJson cannot.
How can I make JsonOutput.toJson keep the UTF-8 characters? I need the exact inverse of JsonSlurper().parse.

In case anyone is still struggling with this, the solution is to disable Unicode escaping with JsonGenerator (available since Groovy 2.5):
new JsonGenerator.Options()
    .disableUnicodeEscaping()
    .build()
    .toJson(object)
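For what it's worth, the escaped and the unescaped output are the same JSON data: a \u00f1 escape and a literal ñ parse back to the same string, so escaping only changes how the file looks. A minimal sketch of that equivalence, written with Python's standard json module rather than Groovy purely for illustration (the sample value is made up):

```python
import json

data = {"name": "ñáé"}  # made-up sample value

# Default behaviour: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(data)
# With ensure_ascii=False the characters are written literally.
literal = json.dumps(data, ensure_ascii=False)

print(escaped)
print(literal)

# Both documents decode to the same data.
same = json.loads(escaped) == json.loads(literal) == data
```

So disabling the escaping matters when people or other tools read the file directly, not for JSON parsers.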

This worked for me in Groovy 3:
StringEscapeUtils.unescapeJavaScript(
JsonOutput.prettyPrint(resultPreviewJson)
)

I believe the encoding is being applied in the wrong place when the file is read.
Change the statements below from:
def originFile = new File(originPreviewFilePath)
def originPreview = new JsonSlurper().parse(originFile,'UTF-8')
To:
def originFile = new File(originPreviewFilePath).getText('UTF-8')
def originPreview = new JsonSlurper().parseText(originFile)

Related

JsonSlurper parsing String containing Json into unexpected format

From a separate system I get a String parameter "messageJson" whose content is in the form:
{"agent1":"smith","agent2":"brown","agent3":{"agent3_1":"jones","agent3_2":"johnson"}}
To use it in my program I parse it with JsonSlurper.
def myJson = new JsonSlurper().parseText(messageJson)
But the resulting Json has the form:
[agent1:smith, agent2:brown, agent3:[agent3_1:jones, agent3_2:johnson]]
Note the square brackets and the lack of double quotes. How can I parse messageJson so that the original structure is kept?
OK, thanks to the hint by cfrick, I was able to find a solution. The bracketed output is just how Groovy prints the parsed map. In case anyone else has a similar problem, all I needed to do was use JsonOutput at the end to convert the map back to JSON.
For example:
def myJson = new JsonSlurper().parseText(messageJson)
myJson << [agent4:"jane"]
def backToJson = JsonOutput.toJson(myJson)
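The same distinction between a parsed structure and JSON text exists in other languages. A small sketch in Python (not Groovy, for illustration only), reusing the message from the question and the agent4 entry from the snippet above:

```python
import json

message_json = '{"agent1":"smith","agent2":"brown","agent3":{"agent3_1":"jones","agent3_2":"johnson"}}'

parsed = json.loads(message_json)   # a dict, not JSON text
parsed["agent4"] = "jane"           # modify it like any other dict

# Serialize again; compact separators mimic the layout of the input.
back_to_json = json.dumps(parsed, separators=(",", ":"))

print(parsed)        # Python's own dict notation, not valid JSON
print(back_to_json)  # valid JSON text with double quotes and braces
```

Printing the parsed object shows the language's own notation; only the serializer output is JSON.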

Importing CSV file in MVC and converting it in JSON using C#

I am importing a .CSV file from an Angular app into MVC, and I am able to get the file like this:
Int32 strLen, strRead;
System.IO.Stream stream = Request.InputStream;
strLen = Convert.ToInt32(stream.Length);
byte[] strArr = new byte[strLen];
strRead = stream.Read(strArr, 0, strLen);
Here the imported file is converted into a byte[] because I am reading it using
System.IO.Stream stream = Request.InputStream
Then I convert it into a string like this
string a = System.Text.Encoding.UTF8.GetString(strArr);
and try to split the content and retrieve the data, but this becomes very complex. I wonder if there is an alternative. For a simple .CSV file like this
I get this result after converting the byte[] to a string
and once I apply logic for splitting the string and retrieving the data, the logic gets very messy, like this
Is there an efficient way to convert the imported .CSV file to JSON?
1. Save the stream as a text file in the TEMP folder.
2. Use any parser for working with the CSV file (for example, FileHelpers).
3. Use any JSON helper to convert it to the output format (for example, Newtonsoft.Json).
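The parse-then-serialize pipeline in the steps above can be sketched as follows. This is an illustration in Python's standard library rather than C#, it parses the uploaded bytes in memory instead of going through a temp file, and the function name and sample data are made up:

```python
import csv
import io
import json

def csv_bytes_to_json(raw: bytes) -> str:
    """Decode uploaded CSV bytes and return a JSON array of row objects."""
    text = raw.decode("utf-8-sig")  # utf-8-sig also drops a leading BOM
    # The first CSV line is used as the header, so each row becomes a dict.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return json.dumps(list(reader), ensure_ascii=False)

# Hypothetical upload: a header line followed by two records.
sample = "name,city\r\nAna,León\r\nBo,Umeå\r\n".encode("utf-8")
result = csv_bytes_to_json(sample)
print(result)
```

Letting a CSV parser handle quoting and line endings is what removes the hand-written splitting logic.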
You can use Cinchoo ETL - an open source library, to convert CSV to JSON easily.
using (var parser = new ChoCSVReader("IgnoreLineFile1.csv")
.WithField("PolicyNumber", 1)
.WithField("VinNumber", 2)
.Configure(c => c.IgnoreEmptyLine = true)
.Configure(c => c.ColumnCountStrict = true)
)
{
using (var writer = new ChoJSONWriter("ignoreLineFile1.json")
.WithField("PolicyNumber", fieldName: "Policy Number")
.WithField("VinNumber", fieldName: "Vin Number")
)
writer.Write(parser.Skip(1));
}
In the example above, you can also pass a stream to the reader and the writer, as your case requires.
Hope this will help.
Disclaimer: I'm the author of this library.

Special character (utf-8) in a django jsonresponse

I'm using JsonResponse in Django, but my data contains special characters (ñ, á, é, etc.).
My view:
def get_agencies(request):
    qr = Agency.objects.all()
    qr_jason = serializers.serialize('json', qr)
    return JsonResponse(qr_jason, safe=False)
But if I enter a special character like ñ, the JSON I receive contains its ASCII escape sequence instead. If I build a dictionary and pass it to JsonResponse it works, but I can't find a way to use serializers.serialize with UTF-8.
JSON received (the \u00f1 sequences are ñ):
// 20170124165944
// http://localhost:8080/get_agencies/
"[
{
\"model\": \"items.agency\",
\"pk\": 1,
\"fields\": {
\"name\": \"asdk\\u00f1ld\",
\"tipo\": \"librevile\",
\"adress\": \"laslkfdli323,
ls\\u00f1\\u00f1\",
\"phone\": \"56549875\",
\"web\": \"http: //www.systmatic.com.mx\",
\"lat\": 23.514646,
\"lng\": -26.152684,
\"created\": \"2017-01-24T00: 56: 28.302Z\",
\"last_updated\": \"2017-01-24T22: 22: 08.856Z\"
}
}
]"
Faster solution:
def get_agencies(request):
    qr = Agency.objects.all().values()
    qr_list = list(qr)
    return JsonResponse(qr_list, safe=False, json_dumps_params={'ensure_ascii': False})
I know that you wrote that you would like to serialize using django.core.serializers.serialize but... you could do a workaround and serialize using json standard lib.
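import json
from django.http import HttpResponse

def get_agencies(request):
    qr = Agency.objects.all().values()
    qr_json = json.dumps(list(qr), ensure_ascii=False, default=str)
    # qr_json is already JSON text, so return it as-is; JsonResponse would encode it a second time
    return HttpResponse(qr_json, content_type='application/json; charset=utf-8')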
I've added default=str parameter to json.dumps because I saw that you have a datetime field in your model, so that should take care of that issue.
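A minimal sketch of what these parameters do, using only the standard json module (no Django) and a made-up row. It also shows what happens when already-serialized text goes through a JSON encoder a second time, which is consistent with the quoted, backslash-escaped output in the question:

```python
import json
from datetime import datetime

# Made-up row standing in for one Agency record.
row = {"name": "asdkñld", "created": datetime(2017, 1, 24, 0, 56, 28)}

# ensure_ascii=False keeps ñ literal; default=str is called for objects
# that json cannot serialize natively (here the datetime).
once = json.dumps(row, ensure_ascii=False, default=str)

# Serializing the resulting *string* again wraps it in quotes and,
# with the default ensure_ascii=True, escapes ñ once more.
twice = json.dumps(once)

print(once)
print(twice)
```

Whichever serializer is used, the text it produces should be encoded exactly once on its way into the response.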

Python convert json file to html

I have a bunch of Pandas data frames. I want to view them in HTML (and also want the json). So, this is what I did:
1. masterDF = pd.concat([df1, df2, df3, ...]) (concatenate all the data frames)
2. masterDF.to_json(jsonFile, orient='records') (this gives a valid JSON file, but in a list format)
3. htmlStr = json2html.convert(json=jsonString)
4. Write htmlStr to myFile.html.
The json file looks like this:
[{"A":1458000000000,"B":300,"C":1,"sid":101,"D":323.4775570025,"score":0.0726},{"A":1458604800000,"B":6767,"C":1,"sid":101,"D":321.8098393263,"score":0.9524},{"A":1458345600000,"B":9999,"C":3,"sid":29987,"D":125.6096891766,"score":0.9874},{"A":1457827200000,"B":3110,"C":2,"sid":787623,"D":3010.9544668798,"score":0.0318}]
Problem I am facing:
DataFrame.to_json writes a JSON file whose top level is an array ([...]), like a list. I am unable to load this JSON file and convert it, like this:
with open(jsonFile) as json_data:
    js = json.load(json_data)
    htmlStr = json2html.convert(json = js)
    return htmlStr
Is there a way to load a JSON file like the one above and convert it to HTML?
Why not use pandas.DataFrame.to_html? It renders the data frame as an HTML table directly, without going through JSON. (http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_html.html)
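If the JSON file is all you have, a top-level array of records can also be turned into a table by hand. The following is a rough standard-library sketch, not json2html or pandas; the function name is made up and the sample is a trimmed version of the data in the question:

```python
import json
from html import escape

def records_to_html(records):
    """Render a list of flat dicts as an HTML table, one row per dict."""
    if not records:
        return "<table></table>"
    columns = list(records[0])  # column order taken from the first record
    head = "".join(f"<th>{escape(str(c))}</th>" for c in columns)
    rows = []
    for rec in records:
        cells = "".join(f"<td>{escape(str(rec.get(c, '')))}</td>" for c in columns)
        rows.append(f"<tr>{cells}</tr>")
    return f"<table><tr>{head}</tr>{''.join(rows)}</table>"

json_text = (
    '[{"A":1458000000000,"B":300,"sid":101,"score":0.0726},'
    '{"A":1458604800000,"B":6767,"sid":101,"score":0.9524}]'
)
html_str = records_to_html(json.loads(json_text))
print(html_str)
```

json.loads returns a plain Python list here, so the [] at the top level is not a problem in itself.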

How to get correct encoding with groovy HtmlParsing from a website?

I am writing a Groovy script that parses the HTML of a Swedish website, and I want to get the Swedish characters Å, Ä and Ö back from the site.
This is an example of what I'm trying to do (not the actual site I'm scraping in my project, just an example).
When I run the script below it gives me the output "Avancerad s�kning" when I really want "Avancerad sökning".
Does anyone know a good way to handle this encoding?
@Grab(group='org.ccil.cowan.tagsoup', module='tagsoup', version='1.2' )
String page= "http://www.webhallen.com/se-sv/"
def tagsoupParser = new org.ccil.cowan.tagsoup.Parser()
def slurper = new XmlSlurper(tagsoupParser)
def htmlParser = slurper.parse(page)
htmlParser.'**'.findAll { it.@class?.text() == 'first-child' }.each {println it.toString()}
Not sure, but it works with nekohtml:
@Grab('net.sourceforge.nekohtml:nekohtml:1.9.21')
String page= "http://www.webhallen.com/se-sv/"
def nekoParser = new org.cyberneko.html.parsers.SAXParser()
def slurper = new XmlSlurper(nekoParser)
def htmlParser = slurper.parse(page)
htmlParser.'**'.findAll { it.@class?.text() == 'first-child' }.each {println it.toString()}
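A � in the output usually means the bytes were decoded with a charset that does not match how they were encoded. The sketch below reproduces the symptom in Python (not Groovy), under the assumption that the page is served as ISO-8859-1 while the parser decodes it as UTF-8. That is only one possible cause and has not been verified for this site:

```python
text = "Avancerad sökning"

# Assumption for illustration: the server sends ISO-8859-1 bytes.
raw = text.encode("iso-8859-1")

# Decoding with the wrong charset: the byte for ö is not valid UTF-8,
# so it is replaced by U+FFFD (the replacement character).
wrong = raw.decode("utf-8", errors="replace")

# Decoding with the matching charset recovers the original text.
right = raw.decode("iso-8859-1")

print(wrong)
print(right)
```

A parser that detects the page's declared charset, or is told it explicitly, avoids the mismatch.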