I am reading in the following bilingual English <-> Japanese dictionary data from the iOS app bundle directory formatted as a json file having identical entries, but with different meanings (i.e. abandon) as shown in 'json data set 1' below:
{
"aardvark":"土豚 (つちぶた)",
"abacus":"算盤 (そろばん)",
"abacus":"十露盤 (そろばん)",
"abalone":"鮑 (あわび)",
"abandon":"乗り捨てる (のりすてる)(a ship or vehicle)",
"abandon":"取り下げる (とりさげる)(e.g. a lawsuit)",
"abandon":"捨て去る (すてさる)(ship)",
"abandon":"泣し沈む (なきしずむ)oneself to grief",
"abandon":"遺棄する (いき)",
"abandon":"握りつぶす (にぎりつぶす)",
"abandon":"握り潰す (にぎりつぶす)",
"abandon":"見限る (みかぎる)",
"abandon":"見切り (みきり)",
"abandon":"見捨てる (みすてる)",
"abandon":"突き放す[見捨てる (つきはなす)",
"abandon":"放り出す (ほうりだす)",
"abandon":"廃棄 (はいき)",
"abandon":"廃棄する (はいき)",
"abandon":"放棄する (ほうき)",
}
I am using the code snippet below to read in the data from the app.bundle directory:
var vocab:[String:String] = [:]
do {
let path = Bundle.main.path(forResource: "words_alpha", ofType: "json")!
let text = try! String(contentsOfFile: path, encoding: String.Encoding.utf8)
do {
vocab = try JSONDecoder().decode([String: String].self, from: Data(text.utf8))
print(text)
}
} catch {
print(error)
}
}
Question: Only the first entry of a duplicate entry is being read in whereas I would like to have all duplicate entries read in as multiple definitions for a single dictionary item/term.
One solution is to reformat duplicate entries in the json data as shown below by adding line returns between different definitions in 'json data set 2':
"abandon":"乗り捨てる (のりすてる)(a ship or vehicle)\n\n取り下げる (とりさげる)(e.g. a lawsuit)\n\n捨て去る (すてさる)(ship)\n\n 泣し沈む (なきしずむ)oneself to grief\n\n",
However, that is a huge amount of work editing a 30MB json data file to make the above changes for duplicate items so I am looking for a quick and dirty way to use swift json string manipulations to read in the data 'as is' using the native 'data set 1' format with each entry being on a line by itself as shown below:
{
"abandon":"乗り捨てる (のりすてる)(a ship or vehicle)",
"abandon":"取り下げる (とりさげる)(e.g. a lawsuit)",
}
Have tried a number of approaches, but none have worked so far. Any suggestions very much appreciated.
Dictionary can't have the same key, there is unicity.
If you use for instance JSONLint, you'll have a "Error: Duplicate key 'abacus'" (it stops at first error found).
However, that is a huge amount of work editing a 30MB json data file to make the above changes for duplicate items so I am looking for a quick and dirty way
Instead of thinking that way, let's pre-process the JSON, and fix it!
So you could write a little script beforehand to fix your JSON. You can do it in Swift!
In a quick & dirty way, you could do this:
All definitions must be one line (else you might have to fix manually for them)
Create a fixJSON.swift file (in Terminal.app: $>touch fixJSON.swift), make it executable ($ chmod +x fixJSON.swift), put that code inside it:
#!/usr/bin/swift
import Foundation
func fixingJSON(_ input: String) -> String? {
let lines = input.components(separatedBy: .newlines)
let regex = try! NSRegularExpression(pattern: "\"(\\w+)\":\"(.*?)\",", options: [])
let output = lines.reduce(into: [String: [String]]()) { partialResult, aLine in
var cleaned = aLine.trimmingCharacters(in: .whitespaces)
guard !cleaned.isEmpty else { return }
if !cleaned.hasSuffix(",") { //Articially add a ",", for the last line case
cleaned.append(",")
}
guard let match = regex.firstMatch(in: cleaned, options: [], range: NSRange(location: 0, length: cleaned.utf16.count)) else { return }
guard let wordRange = Range(match.range(at: 1), in: cleaned),
let definitionRange = Range(match.range(at: 2), in: cleaned) else { return }
partialResult[String(cleaned[wordRange]), default: [String]()] += [String(cleaned[definitionRange])]
}
// print(output)
do {
let encoder = JSONEncoder()
encoder.outputFormatting = [.prettyPrinted, .sortedKeys]
let asJSONData = try encoder.encode(output)
let asJSONString = String(data: asJSONData, encoding: .utf8)
// print(asJSONString!)
return asJSONString
} catch {
print("Error while encoding: \(error)")
return nil
}
}
func main() {
do {
//To change
let path = "translate.json"
let content = try String(contentsOfFile: path)
guard let output = fixingJSON(content) else { return }
//To change
let outputPath = "translate2.json"
try output.write(to: URL(fileURLWithPath: outputPath), atomically: true, encoding: .utf8)
} catch {
print("Oops, error while trying to read or write content of file:\n\(error)")
}
}
main()
Modify path/output path values, it's easier if you put it as the same place as the script file, then the path will be just the name of the file.
Then, in Terminal.app, just write $> ./fixJSON.swift
Okay, now, let's talk about what the script does.
As said, it's quick & dirty, and might have issues.
We read the content of the JSON with issue, I iterate over the lines, then used a regex, to find this:
"englishWord":"anything",
I artificially add a comma if there isn't (special case for the last entry of the JSON which shouldn't have one).
As to why, it's because there could be double quotes in a translation, so it could generate issues. It's just a quick & dirty fix. I might do better, but since it's a quick fix, spending more time to write beautiful code might be overkill for a one time use.
In the end, you'll have a [String: [String]] JSON.
This is fine JSON (except for the last comma, which isn't legal), but Swift's JSONDecoder can't handle it. JSON allows duplicate keys, but Swift doesn't. So you'll need to parse it by hand.
If your data is exactly as given, one record per line, with nothing "weird" (like embedded \" in the Strings), the easiest way to do that is to just parse it line by line, using simple String manipulation or NSRegularExpression.
If this is more arbitrary JSON, then you may want to use a full JSON parser that can handle this, such as RNJSON. Note that this is just a hobby project to build a JSON parser that exactly follows the spec, and as much intended as an example of how to write these things as a serious framework, but it can handle this JSON (as long as you get rid of that last , which is not legal).
import RNJSON
let keyValues = try JSONParser()
.parse(data: json)
.keyValues()
.lazy
.map({($0, [try $1.stringValue()])})
let translations = Dictionary(keyValues, uniquingKeysWith: +)
// [
"abandon": ["乗り捨てる (のりすてる)(a ship or vehicle)", "取り下げる (とりさげる)(e.g. a lawsuit)", "捨て去る (すてさる)(ship)", "泣し沈む (なきしずむ)oneself to grief", "遺棄する (いき)", "握りつぶす (にぎりつぶす)", "握り潰す (にぎりつぶす)", "見限る (みかぎる)", "見切り (みきり)", "見捨てる (みすてる)", "突き放す[見捨てる (つきはなす)", "放り出す (ほうりだす)", "廃棄 (はいき)", "廃棄する (はいき)", "放棄する (ほうき)"],
"aardvark": ["土豚 (つちぶた)"],
"abacus": ["算盤 (そろばん)", "十露盤 (そろばん)"],
"abalone": ["鮑 (あわび)"]
]
It's not that complicated a framework, so you could also adapt it to your own needs (making it accept that last non-standard comma, for example).
But, if possible, I'd personally just parse it line by line with simple String manipulation. That would be the easiest to implement using AsyncLineSequence, which would avoid pulling all 30MB into the memory before parsing.
Related
Here's my problem. Let's say I have a JSON structure that I'm reading using Swift's Codable API. What I want to do is not decode part of the JSON but read it as a string even though it's valid JSON.
In a playground I'm messing about with this code:
import Foundation
let json = #"""
{
"abc": 123,
"def": {
"xyz": "hello world!"
}
}
"""#
struct X: Decodable {
let abc: Int
let def: String
enum CodingKeys: String, CodingKey {
case abc
case def
}
init(decoder: Decoder) throws {
let container = try decoder.container(keyedBy: CodingKeys.self)
abc = try container.decode(Int.self, forKey: .abc)
var defContainer = try container.nestedUnkeyedContainer(forKey: .def)
def = try defContainer.decode(String.self)
// def = try container.decode(String.self, forKey: .def)
}
}
let x = try JSONDecoder().decode(X.self, from: json.data(using: .utf8)!)
Essentially I'm trying to read the def structure as a string instead of a dictionary.
Any clues?
If the resulting string doesn't need to be identical to the corresponding text in the JSON file (i.e. preserve whatever white space is there, etc.), just decode the entire JSON, and then encode the part that you want as a string, and construct a string from the resulting data.
If you do want to preserve exactly the text in the original JSON, including white space, then you'll do better to get that string some other way. Foundation's Scanner class makes it pretty easy to look for some starting token (e.g. "def:") and then read as much data as you want. So consider decoding the JSON in one step, and then separately using a Scanner to dig through the same input data to get the string you need.
Definitely not using JSONDecoder. By the time init(from:) is called, the underlying data has already been thrown away. However you do it, you'll need to parse the JSON yourself. This isn't as hard as it sounds. For example, to extract this string, you could use JSONScanner, which is a few hundred lines of code that you can adjust as you like. With that, you can do things like:
let scanner = JSONScanner()
let string = try scanner.extractData(from: Data(json.utf8), forPath: ["def"])
print(String(data: string, encoding: .utf8)!)
And that will print out:
{
"xyz": "hello world!"
}
(Note that RNAJSON is a sandbox framework of mine. It's not a production-ready framework, but it does a lot of interesting JSON things.)
Integrating this into a system that decodes this in a "Decoder-like" way is definitely buildable along these lines, but there's no simple answer down that path. Caleb's suggestion of re-encoding the data into a JSON string is definitely the easiest way.
Using RNAJSON again, there's a type called JSONValue that you can use like this:
init(from decoder: Decoder) throws {
let container = try decoder.container(keyedBy: CodingKeys.self)
abc = try container.decode(Int.self, forKey: .abc)
let defJSON = try container.decode(JSONValue.self, forKey: .def)
def = String(data: try JSONEncoder().encode(defJSON), encoding: .utf8)!
}
This will make def be a JSON string, but it doesn't promise that key order is maintained or that whitespace is preserved.
Thanks everyone. It's an interesting problem. Not so much in the rather simple example I gave, but in the actual problem I'm thinking about I want to include mustache templating inside the text I'm decoding (which by the way could be JSON or YAML). The top level parts of the data are fixed in that they have pre-defined keys so reading them is fine. but there is a point in the data where the developer generating it can pretty much include any data structure they like, and that data structure can include mustache keys. ie. abc: {{some-value}} which of course, makes no sense to a decoder.
There's lots of ways this could go wrong but I'm thinking that really I would need to run mustache across the entire data structure before decoding it, where as currently I'm decoding then running mustache and to do that I'm already setting the unknown part as a string and reading it as such. I'm contemplating the idea that it would be much nicer if it was just JSON/YAML rather than a string containing JSON/YAML. But it might not be possible :-)
I'm looking for some guidance here on an issue I'm encountering. I am polling an API for stock data and instead of getting back an array or formatted JSON I get chunks of it. Each block of JSON seems to be properly formatted, but the JSON decoders are looking for a properly formatted array.
Prior to using combine I had a different working. I had to parse the data to remove 4 trailing characters, then I was left with lines of JSON that I could split by newline and then do a forEach loop on to parse.
This was messy but it works. The issue now is I'm polling streaming data for live quote updating in my UI. I need to have the JSON chunks read into my class, stored as a variable that will update the UI. I am not worried about all of that code once I can figure out how to get the data read properly.
I know my app is receiving the data because I see the stats in the debugger for network. I also test the same API call in a browser and see what data my app is receiving. I just don't know how to break it up in the combine framework.
I have been working with something like this so far:
return URLSession.shared.dataTaskPublisher(for: request)
.map { $0.data }
.handleEvents(receiveSubscription: { print("Receive subscription: \($0)") },
receiveOutput: { print("Receive output: \($0)") },
receiveCompletion: { print("Receive completion: \($0)") },
receiveCancel: { print("Receive cancel") },
receiveRequest: { print("Receive request: \($0)") })
//.decode(type: CryptoDataContainer.self, decoder: JSONDecoder())
.decode(type: Quote.self, decoder: JSONDecoder())
.receive(on: DispatchQueue.main)
.eraseToAnyPublisher()
}
The Quote struct is very basic:
struct Quote: Codable {
public var Symbol: String
public var Last: Double
public var Bid: Double
public var Ask: Double
public var Close: Double
}
JSON data is posted below, and it streams. I get a few chunks of it at a time, but if I can split by newline or by everything {.*} to get each JSON chunk and decode individually it would work great.
I'm really looking for Last price and Symbol out of the JSON. I could even just look for Last and start one individual streaming session for each symbol I need.
I will also need to split these up using a subject to multicast each publisher to multiple subscribers but I wanted to get the basic functionality working before complicating matters.
Update: From another post I found that the problem is the JSON formatting - The API is returning invalid JSON that is not formatted into an array - it is missing the surrounding []. This is why it doesn't parse. Another option may be to simply convert to string, add square brackets around all of the JSON passed down from the server (there is nothing but objects).. and then parse it correctly? I'm not sure how to do this within a combine map but that seems to be the path forward.
Here’s an example of the map I’m trying to use currently, since the data comes in as json chunks I need to parse them each individually.
.map ({
let stringRepresentation = String(data: $0, encoding: .utf8)
let outputJson = stringRepresentation!.split(whereSeparator: \.isNewline).map(String.init)
if outputJson.count > 1 {
print("DEBUG: outputJson.count > 1 code running..")
let formattedOutputJson:String = outputJson.joined(separator: ",")
let finalJson:String = "{\"quotes\":[" + formattedOutputJson + "]}"
let data:Data? = finalJson.data(using: .utf8)
return data!
} else {
print("DEBUG: outputJson.count = 1 (Else) code running..")
let finalJson:String = "{\"quotes\":[" + outputJson[0] + "]}"
let data:Data? = finalJson.data(using: .utf8)
return data!
}
})
Here is a better example of the JSON formatting in chunks I’m talking about:
{"heartbeat":4,"timestamp":"\/Date(1622898090819)\/"}
{"heartbeat":5,"timestamp":"\/Date(1622898095819)\/"}
{"heartbeat":6,"timestamp":"\/Date(1622898100819)\/"}
{"heartbeat":7,"timestamp":"\/Date(1622898105819)\/"}
{"heartbeat":8,"timestamp":"\/Date(1622898110819)\/"}
If I can even capture the heartbeat number as an Int to use I would be well on my way to getting my code working. The issue is I get zero data out of this. I know I’m connected from packet captures and various other tests. The data is getting to my system, and to the Xcode simulated app, I just don’t think I’m getting it to a point where I can print to the console correctly to display it. I have been working on that prior to attempting to update the UI directly.
{"Description":"E-Mini NASDAQ-100 Jun 2021","PreviousClose":13529.25,"PreviousVolume":552987,"AssetType":"FUTURE","CountryCode":"United States","Halted":false,"Bid":13513.75,"BidSize":3,"Ask":13514.25,"AskSize":1,"Currency":"USD","LastSize":1,"LastVenue":"CME","SymbolRoot":"NQ","Exchange":"CME","LastTradingDate":"2021-06-18","DisplayType":3,"FractionalDisplay":false,"MinPrice":1258200,"PointValue":20,"TradeTime":"2021-06-04T02:40:20.0000000+00:00","MaxPrice":1447600,"MinMove":25,"High52Week":14064,"Open":13531.75,"Low52Week":9591.75,"High":13541.75,"Low":13467.75,"Close":13513.75,"Symbol":"NQM21","Volume":23877,"DailyOpenInterest":231736,"ExpirationDate":"2021-06-18","Last":13513.75,"NameExt":"(D)","DataFeed":"TradeStation","IsDelayed":true,"Restrictions":[null],"VWAP":13499.877658013,"NetChange":-15.50,"NetChangePct":-0.1145665872091948925476282900,"High52WeekTimeStamp":"2021-04-29T00:00:00.0000000+00:00","Low52WeekTimeStamp":"2020-06-11T00:00:00.0000000+00:00","PreviousClosePriceDisplay":"13529.25","BidPriceDisplay":"13513.75","AskPriceDisplay":"13514.25","MinPriceDisplay":"1258200.00","MaxPriceDisplay":"1447600.00","High52WeekPriceDisplay":"14064.00","OpenPriceDisplay":"13531.75","Low52WeekPriceDisplay":"9591.75","HighPriceDisplay":"13541.75","LowPriceDisplay":"13467.75","ClosePriceDisplay":"13513.75","LastPriceDisplay":"13513.75","VWAPDisplay":"13499.88","FirstNoticeDate":"","StrikePrice":0,"TickSizeTier":0,"Underlying":"","StrikePriceDisplay":"0.00"}
{"Last":4188.5,"NameExt":"(D)","DataFeed":"TradeStation","IsDelayed":true,"Restrictions":[null],"VWAP":4184.92506004117,"Description":"E-mini S&P 500 Jun 2021","PreviousClose":4191.25,"PreviousVolume":1384271,"AssetType":"FUTURE","CountryCode":"United States","Halted":false,"Bid":4188.25,"BidSize":39,"Ask":4188.5,"AskSize":23,"Currency":"USD","LastSize":1,"LastVenue":"CME","SymbolRoot":"ES","Exchange":"CME","LastTradingDate":"2021-06-18","DisplayType":3,"FractionalDisplay":false,"MinPrice":3898,"PointValue":50,"TradeTime":"2021-06-04T02:40:10.0000000+00:00","MaxPrice":4484,"MinMove":25,"High52Week":4238.25,"Open":4190.75,"Low52Week":2977.8999,"Hi{"Volume":23880,"TradeTime":"2021-06-04T02:40:23.0000000+00:00","VWAP":13499.879455674,"Symbol":"NQM21"}
{"AskSize":18,"Symbol":"ESM21"}
{"Bid":13514.25,"BidSize":1,"AskSize":1,"BidPriceDisplay":"13514.25","Symbol":"NQM21","VWAP":13499.8800582884,"Volume":23881,"TradeTime":"2021-06-04T02:40:25.0000000+00:00"}
{"VWAP":4184.92521199492,"AskSize":19,"Volume":47807,"TradeTime":"2021-06-04T02:40:27.0000000+00:00","Symbol":"ESM21"}
{"Bid":13514,"BidSize":3,"BidPriceDisplay":"13514.00","Symbol":"NQM21","TradeTime":"2021-06-04T02:40:27.0000000+00:00","Ask":13514.75,"VWAP":13499.8806713352,"Close":13514.5,"Last":13514.5,"Volume":23882,"NetChange":-14.75,"NetChangePct":-0.1090230426668144945211301400,"AskPriceDisplay":"13514.75","ClosePriceDisplay":"13514.50","LastPriceDisplay":"13514.50"}
{"AskSize":18,"Symbol":"ESM21"}
Edit: I was able to pin down the issue to a MUCH more concentrated field. Although this post here isn't necessarily wrong with its assumptions, Swift 4 base64 String to Data not working due to special character is much more clear and has a Playground example.
I have a string that has to be be serialized into a Dictionary in Swift 4. The app lets users upload data (JSON serialized as Data) and download it later. For the latter, the app does the following with the downloaded data (dlData)
if let rootDict = NSKeyedUnarchiver.unarchiveObject(with: dlData) as? Dictionary<String, Any> {
if let content = rootDict["C"] as? String {
if let data = content.data(using: .utf8, allowLossyConversion: true){
let json = try JSONSerialization.jsonObject(with: data, options: .mutableContainers) as? [String:Any]
...
} else {
print("DATA DIDNT WORK") //gets printed with his data
}
Pretty much every time this has worked fine but recently a user has contacted me that on his iPhone there isn't any data showing up. I've added the else path and that's where it goes. It doesn't seem to be able to convert this particular string to Data
When I print() the string, copy the console output and then hardcode it into the method, it works just fine. The string is valid JSON (validated with 3 different online validators), and the JSONSerialization also works. But not in the "live" environment where it uses the downloaded data instead of the hardcoded print()-representation
Where I think the issue might be is that the Xcode console "cleans up" the string and "bad" characters that might be in it which is why copy-pasting makes it work but a direct download does not. The only weird thing I can spot in the print()ed string is the replacement character (the � symbol).
dlData > rootDict > content > data > json
Data > Dictionary > string > Data > Dictionary
Isn't the best possible chaining for this task but I am not in the position to change the infrastructure of this system. And because it does work for at least 95% of the users, I think it should work for all of them.
I've tried doing replacingOccurrences(of: "�", with: "?") but this doesn't affect the string, probably because in the actual string this is no "�" but something else and the "�" only gets displayed because the console doesn't know how else to put it.
I've come across this blog https://natrajbontha.wordpress.com/2017/10/12/replacement-character-in-json-data/ but I would actually prefer to do the cleaning up only when the conversion has already failed once.
The original character at this spot is the American flag emoji and my users could live without having the latest emojis in there but generally, I want emojis to be displayed so replacing all of them isn't a choice.
I've just tried the same in the Playground and it results in the same.
//b64String is dlData but in base64
let decodedData = Data(base64Encoded: b64String)! //works
if let unarchivedDictionary = NSKeyedUnarchiver.unarchiveObject(with: decodedData) as? Dictionary<String, Any> { //works
if let dF = unarchivedDictionary["C"] as? String { //works
print(dF) //prints
if let data = dF.data(using: .utf8, allowLossyConversion: true) { //fails
print(data)
} else {
print("NO DATA") //prints
}
}
}
Swift 3 , Xcode8.2.1,
I'm trying to extract specific values from a json file in the project. The name of the file is city.list.json, and the syntax of the json file is as follows:
{"_id":707860,"name":"Hurzuf","country":"UA","coord":{"lon":34.283333,"lat":44.549999}}
{"_id":519188,"name":"Novinki","country":"RU","coord":{"lon":37.666668,"lat":55.683334}}
The input I have is the country name and i need the id value or the country code relevant returned as a string.
I get an error:
"Type 'Any?' has no subscript members",
The method I wrote:
private func findCountryCodeBy(location: String)->String{
var result:String="";
let bundle = Bundle(for: type(of: self));
if let theURL = bundle.url(forResource: "city.list", withExtension: "json") {
do {
let data = try Data(contentsOf: theURL);
if let parsedData = try? JSONSerialization.jsonObject(with: data, options:[]) as! [String:Any] {
result = parsedData["_id"][location][0] as! String;
}
} catch {
print(error);
result = "error";
}
}
return result;
}
That is not valid JSON. I think the nearest valid JSON equivalent would be EITHER a JSON list like:
[
{"_id":707860,"name":"Hurzuf","country":"UA","coord":{"lon":34.283333,"lat":44.549999}},
{"_id":519188,"name":"Novinki","country":"RU","coord":{"lon":37.666668,"lat":55.683334}}
]
Ie, a list enclosed within square brackets with each item separated by a comma.
OR a JSON dictionary:
{
"707860": {"name":"Hurzuf","country":"UA","coord":{"lon":34.283333,"lat":44.549999}},
"519188": {"name":"Novinki","country":"RU","coord":{"lon":37.666668,"lat":55.683334}}
}
Ie, a dictionary enclosed within curly brackets with the key (in this case I've used your _id as the key) before the : and the value (a dictionary of all the other items" after the :.
(Newlines, tabs, whitespace are ignored, I've just included them to make it obvious what I've done).
I think that the dictionary version may suit your code better, but it depends on what else you want to do with the data. A list may suit some situations better.
I wrote a quick Python script to simply read JSON from a file (and not do anything else with it), and it produced a parsing error for the not-quite-JSON that you had, but it worked fine on both of my JSON examples, above.
NB: If you do NOT have control over the format of the file you are reading (ie, if you are receiving it from some other source which cannot produce it in any other format) then you will have to either modify the format of the file after you receiv it to make it valid JSON, OR you will have to use something other than JSONSerialization to read it. You could modify it by replacing all occurrences of }{ or }\n{ with },{ and then put [ at the beginning and ] at the end. That should do the job for converting this particular file to valid JSON for a list. Converting to a dictionary would be a little more involved.
Ideally though, you may have control over the file format yourself, in which case, just change whatever generates the file to produce correct JSON in the first place.
Once you have your valid JSON and parsed it into your parsedData variable, you'll then need to fix this line:
result = parsedData["_id"][location][0] as! String;
Assuming that location is the the equivalent of the _id string in the JSON, then you may be able to use the dictionary version of the JSON above and replace that line with something like:
result = parsedData[location]["country"];
However, if location is not the _id string in the JSON, then you'd be better off using the list version of the JSON above, and use a for loop to compare the values of each list item (or use a dictionary version of the JSON keyed on whatever location actually relates to in the JSON).
In a JSON data file, I have a unicode character like this:
{
”myUnicodeCharacter”: ”\\u{25a1}”
}
I know how to read data from JSON files. The problem occurs when it contains characters which are represented as above.
I read it in to a String variable, myUnicodeCharacterString, which gets the value ”\u{25a1}”. I couldn't by the way use a single slash in the JSON data file because in such case it doesn't recognize the data in the file to be a proper JSON object, returning nil.
However, the value is not encoded to its graphical representation when it’s assigned to something for displaying it, for example a SKLabelNode:
mySKLabelNode.Text = myUnicodeCharacterString // displays ”\u{25a1}” and not a hollow square
The problem boils down to this:
// A: direct approach, does works
let unicodeValueByValue = UnicodeScalar("\u{25a1}") // ”9633”
let c1 = Character(unicodeValueByValue) // ”a hollow square”
// B: indirect approach, this does not work
let myUnicodeString = "\u{25a1}"
let unicodeValueByVariable = UnicodeScalar(myUnicodeString) // Error: cannot find an initialiser
let c2 = Character(unicodeValueByVariable)
So, how do I display a unicode character of the format "\u{xxxx}" if it's not directly given in code?
A better way would be to use the proper \uNNNN escape sequence
for Unicode characters in JSON (see http://json.org for details).
This is automatically handled by NSJSONSerialization, and you don't
have to convert a hex code.
In your case the JSON data should be
{
"myUnicodeCharacter" : "\u25a1"
}
Here is a full self-contained example:
let jsonString = "{ \"myUnicodeCharacter\" : \"\\u25a1\"}"
println(jsonString)
// { "myUnicodeCharacter" : "\u25a1"}
let dict = NSJSONSerialization.JSONObjectWithData(jsonString.dataUsingEncoding(NSUTF8StringEncoding)!,
options: nil, error: nil) as [String : String]
let myUnicodeCharacterString = dict["myUnicodeCharacter"]!
println(myUnicodeCharacterString)
// □
I came up with a solution, which does not answer the question, but is actually a better way of doing what I'm trying to do.
The unicode character is given instead as its hexadecimal value in the JSON data file, stripping all escape characters:
{
”myUnicodeCharacter”: ”25a1”
}
Then it's processed like this, after reading the value in to myUnicodeCharacterString:
let num = Int(strtoul(myUnicodeCharacterString, nil, 16))
mySKLabelNode.Text = String(UnicodeScalar(num))
And that worked. Now the hollow square showed up.
In Swift, we can implement the following solution
let jsonString = "{ "unicodeCharacter" : "\u00BD"}"
let data = jsonString.data(using: .utf8)
if let data = data {
let dict = try? JSONSerialization.jsonObject(with: data, options: []) as? [String: String]
print(dict?["unicodeCharacter"] ?? "")
}
Output: "½\n"