iPhone: Decode characters like \U05de - json

I used SBJsonParser to parse a JSON string.
Inside, instead of Hebrew characters, I got a string full of escapes of the form \U05de.
What would be the best way to decode these back to Hebrew characters,
so I can put them on controls like UIFieldView?

Eventually I ran a loop that searches the string for the characters \u.
When the loop finds such a substring, I take a range of 6 characters starting at that index,
giving me a substring, for example \u05de, that needs to be fixed.
On this substring I call [str JSONValue], which gives me the correct character; then I simply replace all occurrences of \u05de (for example) with the decoded character.
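The loop described above can also be written as a single regular-expression substitution. Here is a minimal sketch in Python rather than the Objective-C used in the question; the helper name decode_unicode_escapes is hypothetical, and the sketch assumes every escape has exactly four hex digits and does not combine surrogate pairs:

```python
import re

# Matches a backslash, "u" or "U", and exactly four hex digits.
_ESCAPE = re.compile(r"\\[uU]([0-9a-fA-F]{4})")

def decode_unicode_escapes(text: str) -> str:
    """Replace each \\uXXXX escape with the character it names."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)

escaped = "\\u05de\\u05d4 ok"   # the backslashes are literal characters here
decoded = decode_unicode_escapes(escaped)
print(decoded)
```

Text without any escapes passes through unchanged, so the helper is safe to call on every string.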

Related

Fixing invalid JSON in SQL

I'm parsing json in redshift using json_extract_path_text, but this json is invalid (one of the fields contains double quote inside of the string value):
"somefield": "4 *\\"`)(z"
Is there any way to get rid of this quote and replace it with some other value (I do not really care about this particular data as it is wrong anyway, but I want to fetch some other parts of this json).
It looks like you have the wrong number of backslashes in the string. You need either 1, to get just the double quote, or 3, to get a backslash and the double quote. But this isn't really the question.
You can use the REPLACE() function to strip the \" text out. https://docs.aws.amazon.com/redshift/latest/dg/r_REPLACE.html
REPLACE(json_text, '\\"', '')
I believe REPLACE() doesn't do any string interpretation so no additional escaping will be needed.
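The effect of stripping the offending sequence can be checked outside the database. This is a rough sketch in Python, not Redshift SQL; the sample document, the extra "other" field, and the helper name strip_escaped_quotes are made up for illustration, and it assumes the bad sequence is literally two backslashes followed by a double quote:

```python
import json

# Hypothetical sample: two backslashes followed by a quote end the
# string early, so the rest of the value is garbage to a JSON parser.
raw = r'{"somefield": "4 *\\"`)(z", "other": 42}'

def strip_escaped_quotes(text: str) -> str:
    """Remove every backslash-backslash-quote sequence from the text."""
    return text.replace('\\\\"', '')

try:
    json.loads(raw)
    raw_is_valid = True
except json.JSONDecodeError:
    raw_is_valid = False

cleaned = json.loads(strip_escaped_quotes(raw))
print(raw_is_valid, cleaned["other"])
```

The damaged field loses a few characters, but the remaining fields become readable again, which is all the question asks for.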

How to get PostgreSQL to escape text from jsonb_array_element?

I'm loading some JSON from Postgres 13 into Elasticsearch using Logstash and ran into some errors caused by text not being escaped with reverse solidus. I tracked my problem down to this behavior:
SELECT
  json_build_object(
    'literal_text', 'abc\ndef'::text,
    'literal_text_type', pg_typeof('abc\ndef'::text),
    'text_from_jsonb_array_element', a->>0,
    'jsonb_array_element_type', pg_typeof(a->>0)
  )
FROM jsonb_array_elements('["abc\ndef"]') jae (a);
{
"literal_text": "abc\\ndef",
"literal_text_type": "text",
"text_from_jsonb_array_element": "abc\ndef",
"jsonb_array_element_type":"text"
}
json_build_object encodes the literal text as expected (turning \n into \\n); however, it doesn't encode the text retrieved via jsonb_array_element even though both are text.
Why is the text extracted from jsonb_array_element being treated differently (not getting escaped by jsonb_build_object)? I've tried casting, using jsonb_array_elements_text (though my actual use case involves an array of arrays, so I need to split to a set of jsonb), and various escaping/encoding/formatting functions, but haven't found a solution yet.
Is there a trick to cast text pulled from jsonb_array_element so it will get properly encoded by jsonb_build_object?
Thanks for any hints or solutions.
Those strings look awfully similar, but they're actually different. When you create a string literal like '\n', that's a backslash character followed by an "n" character. So when you put that into json_build_object, it needs to add a backslash to escape the backslash you're giving it.
On the other hand, when you call jsonb_array_elements('["abc\ndef"]'), you're saying that the JSON has precisely a \n encoded in it with no second backslash, and therefore when it's converted to text, that \n is interpreted as a newline character, not two separate characters. You can see this easily by running the following:
SELECT a->>0 FROM jsonb_array_elements('["abc\ndef"]') a;
?column?
----------
abc +
def
(1 row)
On encoding that back into a JSON, you get a single backslash again, because it's once again encoding a newline character.
If you want to escape it with an extra backslash, I suggest a simple replace:
SELECT
  json_build_object(
    'text_from_jsonb_with_replace', replace(a->>0, E'\n', '\n')
  )
FROM jsonb_array_elements('["abc\ndef"]') jae (a);
json_build_object
------------------------------------------------
{"text_from_jsonb_with_replace" : "abc\\ndef"}
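The same distinction can be reproduced outside PostgreSQL. This is a small Python sketch of the idea, not of the database's internals; the variable names are illustrative only:

```python
import json

# A string literal holding a backslash followed by "n": 8 characters.
literal = r"abc\ndef"

# Text pulled out of a JSON array: JSON decodes \n to a real newline,
# so this value has 7 characters.
decoded = json.loads('["abc\\ndef"]')[0]

# Encoding the literal doubles its backslash; encoding the newline
# produces a single backslash followed by "n".
encoded_literal = json.dumps(literal)
encoded_decoded = json.dumps(decoded)

# Replacing the newline with backslash + "n" before encoding makes
# both values encode identically, mirroring the replace() call above.
encoded_replaced = json.dumps(decoded.replace("\n", "\\n"))

print(encoded_literal, encoded_decoded, encoded_replaced)
```

Both values have the same type; they differ only in which characters they contain, which is why casting does not help.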

Remove backslash from nested json [duplicate]

When I create a string containing backslashes, they get duplicated:
>>> my_string = "why\does\it\happen?"
>>> my_string
'why\\does\\it\\happen?'
Why?
What you are seeing is the representation of my_string created by its __repr__() method. If you print it, you can see that you've actually got single backslashes, just as you intended:
>>> print(my_string)
why\does\it\happen?
The string below has three characters in it, not four:
>>> 'a\\b'
'a\\b'
>>> len('a\\b')
3
You can get the standard representation of a string (or any other object) with the repr() built-in function:
>>> print(repr(my_string))
'why\\does\\it\\happen?'
Python represents backslashes in strings as \\ because the backslash is an escape character - for instance, \n represents a newline, and \t represents a tab.
This can sometimes get you into trouble:
>>> print("this\text\is\not\what\it\seems")
this ext\is
ot\what\it\seems
Because of this, there needs to be a way to tell Python you really want the two characters \n rather than a newline, and you do that by escaping the backslash itself, with another one:
>>> print("this\\text\is\what\you\\need")
this\text\is\what\you\need
When Python returns the representation of a string, it plays safe, escaping all backslashes (even if they wouldn't otherwise be part of an escape sequence), and that's what you're seeing. However, the string itself contains only single backslashes.
More information about Python's string literals can be found at: String and Bytes literals in the Python documentation.
As Zero Piraeus's answer explains, using single backslashes like this (outside of raw string literals) is a bad idea.
But there's an additional problem: in the future, it will be an error to use an undefined escape sequence like \d, instead of meaning a literal backslash followed by a d. So, instead of just getting lucky that your string happened to use \d instead of \t so it did what you probably wanted, it will definitely not do what you want.
As of 3.6, it already raises a DeprecationWarning, although most people don't see those; since 3.12 it is a SyntaxWarning, which is shown by default. It will become a SyntaxError in some future version.
In many other languages, including C, using a backslash that doesn't start an escape sequence means the backslash is ignored.
In a few languages, including Python, a backslash that doesn't start an escape sequence is a literal backslash.
In some languages, to avoid confusion about whether the language is C-like or Python-like, and to avoid the problem with \Foo working but \foo not working, a backslash that doesn't start an escape sequence is illegal.
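To stay clear of undefined escape sequences in Python, either double every backslash or use a raw string literal. A short sketch showing that both spellings produce the same string (the variable names are illustrative):

```python
# Doubling each backslash and using a raw string literal give the same text.
doubled = "why\\does\\it\\happen?"
raw = r"why\does\it\happen?"

print(raw)        # prints the text with single backslashes
print(repr(raw))  # the representation shows each backslash doubled

same_text = doubled == raw
backslash_count = raw.count("\\")
```

Neither spelling depends on which letters happen to follow the backslash, so neither triggers the warning described above.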

Should I encode curly or square brackets inside a JSON?

In fact, the title says it all. But, I'll go more into details:
I am sending a JSON string from my JS script to the server and vice versa. The JSON contains things such as content the user wrote into a text field, but I know that some user will manage to break the JSON this way sooner or later, so I decided to encode it with encodeURIComponent().
But I see that when I try to encode curly brackets, they aren't encoded at all. Is this going to be a problem?
More precisely, I'm afraid that if someone writes: } , {, the JSON will break. This shouldn't happen, since all of it is inside double quotes like this: "} , {", and if a user writes double quotes or single quotes they are going to be encoded, and from what I know, JSON should handle all of that just fine, but I am not entirely sure.
So, should I encode those brackets?
(Another thing is that the data is inserted into MySQL inside prepared statements, so that shouldn't be a problem, or am I wrong about that?)
A quick quote from the JSON specifications:
A string is a sequence of zero or more Unicode characters, wrapped in double quotes, using backslash escapes.
As you can see in the image that follows the paragraph quoted above, any Unicode character except for ", \ and control characters is represented as-is; no escape is required.
Why would it break? Your JSON string will be inside double quotes.
So will be something like
{"postcontent": "Shouldnt { } be escaped"}
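A quick round trip makes this concrete. The sketch below uses Python's json module instead of the JavaScript from the question, purely as an illustration; the sample text is made up:

```python
import json

# Hypothetical user input containing brackets, quotes and a backslash.
user_text = 'Shouldn\'t { } [ ] be escaped? "quoted" and \\ too'

# The serializer escapes only what JSON requires inside a string.
payload = json.dumps({"postcontent": user_text})

# Parsing the payload gives back exactly what the user typed.
restored = json.loads(payload)["postcontent"]

print(payload)
```

The brackets appear in the payload unchanged, while the double quotes and the backslash are escaped by the serializer, so no manual encoding is needed as long as the JSON is built with a real serializer rather than by string concatenation.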

Too many characters in character literal while converting HTML Tag to Entity reference

I have generated an HTML tag through C# code. I am able to render correctly in the text area. When I googled it, I found this. To render the HTML tags in the text area, we need to convert the '<','>' into HTML entity references. But when I am trying to replace using String.Replace, it throws an error: Too many characters in character literal.
string psHtmlOutput="<html><body><table border='0' cellspacing='3' cellpadding='3'><tr><th> Name </th><th>DomainName</th><th>DomainType</th><th>Defualt</th></tr><tr><td>india.local</td><td>india.local</td><td>Authoritative</td><td>True</td></tr></table></body></html>";
psHtmlOutput.Replace('>','&gt;');
psHtmlOutput.Replace('<','&lt;');
Error: Too many characters in character literal
Please help; how can I proceed?
The String.Replace method has two overloads:
One that operates on Strings.
One that operates on Chars.
In C#, single quotation marks are used to specify Char literals. Because you have used single quotes, the second overload of the method has been used. However, your second argument is not a valid character literal because &gt; is not a single character.
So if you actually want to replace the character with a string, just use the overload that takes strings:
// Strings are immutable, so assign the result back.
psHtmlOutput = psHtmlOutput.Replace(">", "&gt;");
psHtmlOutput = psHtmlOutput.Replace("<", "&lt;");
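For comparison, here is the same transformation as a Python sketch rather than C#; the fragment is a shortened, made-up version of the table from the question. It uses the standard library's html.escape, which also escapes ampersands, and shows that the original string is left untouched:

```python
import html

# Shortened sample of the generated markup.
fragment = "<tr><td>india.local</td><td>True</td></tr>"

# quote=False escapes only &, < and >, leaving quote characters alone.
escaped = html.escape(fragment, quote=False)

# The equivalent chain of string replacements; & must be handled first
# so the ampersands introduced by the entities are not escaped again.
manual = (
    fragment.replace("&", "&amp;")
    .replace("<", "&lt;")
    .replace(">", "&gt;")
)

print(escaped)
```

As in C#, the replacement returns a new string, so the result has to be kept in a variable.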