I am getting the
JSON::GeneratorError: source sequence is illegal/malformed utf-8
when I am using the to_json method. I have not overridden to_json anywhere.
I have referred to this question and also this one.
But as Ruby 1.8 does not have the concept of string encodings, those solutions do not help me.
How can I solve this issue without having to escape the specific non-ASCII characters?
I am on ruby 1.8.7
The only Rails solution I am aware of would be:
# [AM] Monkeypatch to support multibyte utf-8
module ::ActiveSupport::JSON::Encoding
  def self.escape(string)
    if string.respond_to?(:force_encoding)
      string = string.encode(
        ::Encoding::UTF_8,
        :undef => :replace
      ).force_encoding(::Encoding::BINARY)
    end
    json = string.gsub(escape_regex) { |s| ESCAPED_CHARS[s] }
    json = %("#{json}")
    json.force_encoding(::Encoding::UTF_8) if json.respond_to?(:force_encoding)
    json
  end
end
I believe a similar patch could be applied directly to the JSON gem to address JSON::GeneratorError.
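On Ruby 1.9+ the usual fix is simpler: scrub the invalid bytes out of the string before serializing. This is only a sketch (it will not help on 1.8.7, where strings carry no encoding), assuming the bad data is a UTF-8-labelled string containing stray bytes:

```ruby
require 'json'

# A UTF-8-labelled string with one stray invalid byte (\xFF), the kind of
# input that makes the generator raise "source sequence is illegal/malformed utf-8"
str = "caf\xC3\xA9 \xFF".force_encoding('UTF-8')

str.valid_encoding?  # => false

# Ruby 2.1+: drop (or replace) the invalid bytes before serializing
clean = str.scrub('')

# Ruby 1.9/2.0: round-trip through another encoding instead
# clean = str.encode('UTF-16', invalid: :replace, replace: '').encode('UTF-8')

clean.to_json # => "\"café \""
```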
I know that the JSON spec says that property names (keys) should be surrounded by quotes.
But I have a lot of files that I need to read that contain data where the keys might not be quoted.
Earlier, before migrating to Core, I used JavaScriptSerializer (which accepts keys without quotes), but that doesn't exist in .NET Core.
Any ideas or alternatives? I am still searching, but after 4 hours I thought that maybe you guys know this.
So, how can I read "{ apa: 23 }" and create a dictionary in .NET Core?
/thanks
Newtonsoft.Json's reader is lenient: it accepts single-quoted and even unquoted property names.
Install-Package Newtonsoft.Json
string json = @"{
    'Name': 'name',
    'Description': 'des'
}";
Test test = JsonConvert.DeserializeObject<Test>(json);
I have a string:
{"name":"hector","time":"1522379137221"}
I want to parse the string into JSON and expect to get:
{"name":"hector","time":"1522379137221"}
I am doing:
require 'json'
JSON.parse('{"name":"hector","time":"1522379137221"}')
which produces this:
{"name"=>"hector", "time"=>"1522379137221"}
Can someone tell me how I can keep :? I don't understand why it adds =>.
After you parse the JSON data, you see it represented in the data structures of the programming language you are using.
Ruby uses => to separate the key from the value in a hash (while JSON uses :).
So the Ruby output is correct, and the data is ready for you to manipulate in your code. When you convert your hash back to JSON, the json library will convert the => back to :.
JSON does not have a symbol class. Hence, nothing in JSON data corresponds to a Ruby symbol. Under a trivial conversion from JSON to Ruby like JSON.parse, you cannot have a symbol in the output.
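A quick round trip shows that nothing is lost: parsing produces a Ruby hash (printed with =>), and re-serializing restores the : separators. If you prefer symbol keys, JSON.parse also accepts a symbolize_names option:

```ruby
require 'json'

json = '{"name":"hector","time":"1522379137221"}'

hash = JSON.parse(json)
# Ruby's inspect shows hash pairs with =>
hash          # => {"name"=>"hector", "time"=>"1522379137221"}

# Serializing the hash back to JSON restores the : separators
hash.to_json  # => '{"name":"hector","time":"1522379137221"}'

# Optionally, parse with symbol keys instead of strings
JSON.parse(json, symbolize_names: true)
# => {:name=>"hector", :time=>"1522379137221"}
```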
I'm reading strings from a mysql database which isn't set up for Unicode.
Ruby gets the string as ä¸ƒå¤§æ´‹ but I know the correct version should be 七大洋. The "wrong" string is encoded as UTF-8 because Ruby doesn't know it has it wrong. I've tried forcing every encoding on the mangled string but nothing works. I have a feeling that I might be able to do it by fiddling with the bits but I don't even know where to start.
I don't think any information has been lost because the incorrect string actually has more bytes than the correct one. I don't think Ruby is the culprit here because the strings also look mangled when I view the table outside Ruby - so I'm hoping to undo the damage that MySQL has already done.
You can use the following construction to revert the encoding:
"wrong_string".encode(Encoding::SOME_ENCODING).force_encoding('utf-8')
I tried all possible encodings to detect the right one:
Encoding.constants.each_with_object({}) do |encoding_name, result|
  value = "ä¸ƒå¤§æ´‹".encode(Encoding.const_get(encoding_name)).force_encoding('utf-8') rescue nil
  result[encoding_name] = value if value == "七大洋"
end.keys
#=> [:Windows_1252, :WINDOWS_1252, :CP1252, :Windows_1254, :WINDOWS_1254, :CP1254]
Thus, to convert your string to 七大洋 you can use any encoding from above.
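Putting the pieces together with the string from the question (a sketch, assuming the mojibake really is the CP1252 reading of the UTF-8 bytes, as the detection above suggests):

```ruby
# The mangled text: the UTF-8 bytes of "七大洋" that were decoded
# as CP1252 somewhere upstream
mangled = "ä¸ƒå¤§æ´‹"

# Map each character back to its single CP1252 byte, then relabel the
# resulting byte sequence as the UTF-8 it originally was
fixed = mangled.encode('CP1252').force_encoding('UTF-8')
fixed # => "七大洋"
```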
Alexander pointed out my main mistake (you need to encode then force_encoding to find the right encoding). The string is indeed encoded as CP1252!
The best solution is to read binary from MySQL and then force encoding:
client = Mysql2::Client.new(opts.merge encoding: 'binary')
# ...
text.force_encoding('UTF-8')
Or, if you can't change how you're getting the data, you'll be stuck with an Encoding::UndefinedConversionError when you try to encode. As detailed in this blog post, the solution is to specify fallbacks for the five bytes that CP1252 leaves undefined:
fallback = {
  "\u0081" => "\x81".force_encoding("CP1252"),
  "\u008D" => "\x8D".force_encoding("CP1252"),
  "\u008F" => "\x8F".force_encoding("CP1252"),
  "\u0090" => "\x90".force_encoding("CP1252"),
  "\u009D" => "\x9D".force_encoding("CP1252")
}
text.encode('CP1252', fallback: fallback).force_encoding('UTF-8')
I use DBIx::Class for selecting data from the database.
I send the response from the controller to the client, serialized to JSON using Catalyst::View::JSON.
But UTF-8 data selected from the database needs to be decoded to a Perl string before being sent to the client, like this:
use Encode;
...
sub get_fruits :Path('getfruits') :Args(0) {
    my ($self, $c) = @_;
    my $fruits = [$c->model('DB::Fruit')->search({})->hashref_array];
    # Hated decode-data loop
    foreach (@$fruits) {
        $_->{name} = decode('utf8', $_->{name});
    }
    $c->stash({fruits => $fruits});
    $c->forward('View::JSON');
}
Is it possible to decode data automatically in the View?
The Catalyst model always has to ensure that the data is decoded, regardless of where it is used. The view has to ensure the data is encoded correctly.
You have to make sure that your model decodes data coming from the database. If you are using DBIx::Class read Using Unicode.
This may be as simple as ensuring that Catalyst::View::JSON is using a JSON encoder that supports UTF8 encoding. I believe that if you use JSON::XS with Catalyst::View::JSON it will perform UTF8 encoding by default. You can make sure that Catalyst::View::JSON is using JSON::XS using the json_driver config variable.
Alternatively, you can override JSON encoding in Catalyst::View::JSON, as detailed in the docs.
I'm trying to use JsonBuilder in a Groovy servlet (extending HttpServlet).
Here is a snippet:
public void doGet(HttpServletRequest request, HttpServletResponse response) {
    response.setContentType('text/plain')
    response.setCharacterEncoding('utf-8')
    def pw = response.getWriter()
    pw.println(new JsonBuilder(['city': 'Москва']))
    pw.println([сity: 'Москва'])
}
The output is
{"city":"\u041C\u043E\u0441\u043A\u0432\u0430"}
{сity=Москва}
I just don't know anything about UTF escaping in JsonBuilder, and googling also did not give me anything valuable. So I guess I'm stuck.
Does anybody know how to get the output for json exactly in same form as we get the output for regular groovy object?
I have encountered the same issue, and the above methods didn't work.
However this did:
http://groovy.codehaus.org/gapi/groovy/json/StringEscapeUtils.html
StringEscapeUtils.unescapeJavaScript(JsonOutput.toJson('Москва'))
As far as JavaScript and/or JSON goes, it is the exact same output.
You can easily confirm this yourself:
'Москва' == '\u041c\u043e\u0441\u043a\u0432\u0430'; // true
What you're seeing are Unicode string escape sequences, which are defined by the ECMAScript specification (JavaScript) and are allowed in JSON as well.
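The equivalence is easy to confirm with any compliant JSON parser, since the escapes are defined by JSON itself rather than by Groovy; a quick illustration in Ruby:

```ruby
require 'json'

# Both documents decode to the same value: \uXXXX escapes and raw
# UTF-8 characters are interchangeable representations in JSON
escaped = '{"city":"\u041c\u043e\u0441\u043a\u0432\u0430"}'
raw     = '{"city":"Москва"}'

JSON.parse(escaped) == JSON.parse(raw) # => true

# Conversely, a generator can be asked for ASCII-only (escaped) output
JSON.generate({'city' => 'Москва'}, ascii_only: true)
```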
That said, I wouldn't worry about it too much, but if you insist on disabling the string escapes, you can use the JsonOutput object:
JsonOutput.prettyPrint(json.toString());
I've found it, I've found it.
So if you are as stubborn as me and don't want to accept that (in a very wide range of applications) an escaped sequence is exactly the same as a non-escaped one, you can just use the JsonOutput class, which lives in the same standard package, groovy.json:
JsonOutput.prettyPrint(json.toString())
If somebody posts a more detailed answer, I will delete my own and mark theirs as accepted. So I encourage you )))