I'm new to reverse engineering and was solving a CTF. I found the bytecode, but it seems I need to sanitize it (it contains some strings) and unescape it properly.
Here is a chunk of the bytecode.
\x1bLuaS\x00\x19\x93\r\n\x1a\n\x04\x08\x04\x08\x08xV\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00(w#\x01\r#unknown.lua\x00\x00\x00\x00\x00\x00\x00\x00\x00\x02\x02\x03\x00\x00\x00,\x00\x00\x00\x08\x00\x00\x80&\x00\x80\x00\x01\x00\x00\x00\x04\x06check\x01\x00\x00\x00\x01\x00\x01\x00\x00\x00\x00\x01\x00\x00\x00"\x00\x00\x00\x01\x00\t\xbf\x00\x00\x00\\\x00\x00\x00_\x00\xc0\x00\x1e#\x00\x80A#\x00\x00f\x00\x00\x01F\x80#\x00G\xc0\xc0\x00\x80\x00\x00\x00\xc1\x00\x01\x00d\x80\x80\x01_#\xc1\x00\x1e#\x00\x80A#\x00\x00f\x00\x00\x01F\x80#\x00G\xc0\xc0\x00\x80\x00\x00\x00\xc1\x80\x01\x00d\x80\x80\x01_\xc0\xc1\x00\x1e#\x00\x80A#\x00\x00f\x00\x00\x01F\x80#\x00G\xc0\xc0\x00\x80\x00\x00\x00\xc1\x00\x02\x00d\x80\x80\x01_#\xc2\x00\x1e#\x00\x80A#\x00\x00f\x00\x00\x01F\x80#\x00G\xc0\xc0\x00\x80\x00\x00\x00\xc1\x80\x02\x00d\x80\x80\x01_\xc0\xc2\x00\x1e#\x00\x80A#\x00\x00f\x00\x00\x01F\x80#\x00G\xc0\xc0\x00\x80\x00\x00\x00\xc1\x00\x03\x00d\x80\x80\x01_#\xc3\x00\x1e#\x00\x80A#\x00\x00f\x00\x00\x01F\x80#\x00G\xc0\xc0\x00\x80\x00\x00\x00\xc1\x80\x03\x00d\x80\x80\x01_\xc0\xc3\x00\x1e#\x00\x80A#\x00 ...
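A minimal sketch of the unescaping step, in Python, assuming the dump is a Python-style escaped byte string (only ASCII characters plus `\xNN`, `\r`, `\n`, `\\` and similar escapes). The helper name `unescape_to_bytes` is made up for this illustration, and the sample is only the first few bytes of the chunk above, not the full dump.

```python
def unescape_to_bytes(escaped: str) -> bytes:
    """Turn text such as '\\x1bLua' (escapes written out literally) into raw bytes.

    Assumes every character in `escaped` is ASCII and every escape denotes
    a value in the range 0x00-0xFF.
    """
    # unicode_escape interprets the backslash escapes; latin-1 maps code
    # points 0-255 one-to-one onto byte values.
    return escaped.encode("latin-1").decode("unicode_escape").encode("latin-1")


# Raw string: the backslashes are literal characters, as in the pasted dump.
sample = r"\x1bLuaS\x00\x19\x93\r\n\x1a\n"
data = unescape_to_bytes(sample)

print(data[:4])      # b'\x1bLua'
print(hex(data[4]))  # 0x53
```

The resulting bytes can then be written to a file in binary mode and handed to whatever tool is used to inspect the chunk. This sketch does not address the sanitizing step.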
I'm processing JSON files in PowerShell, and it seems that ConvertFrom-Json changes case on its inputs only on some (rare) occasions.
For example, when I do:
$JsonStringSrc = '{"x":2.2737367544323206e-13,"y":1759,"z":33000,"width":664}'
$JsonStringTarget = $JsonStringSrc | ConvertFrom-Json | ConvertTo-Json -Depth 100 -Compress
$JsonStringTarget
It returns:
{"x":2.2737367544323206E-13,"y":1759,"z":33000,"width":664}
Lower case e became an uppercase E, messing up my hashes when validating proper i/o during processing.
Is this expected behavior (perhaps a regional setting)? Is there a setting for ConvertFrom-Json to leave my inputs alone for the output?
The problem lies in the way PowerShell's JSON library outputs CLR floating-point numbers. Converting from JSON turns the JSON string into a CLR/PowerShell object with associated types for numbers, strings, and so on. Converting back to JSON serializes that object, but uses the .NET default formatter configuration to do so. No metadata from the original JSON document is kept to aid the conversion. Rounding errors, truncation, and a different element order may occur here too.
The JSON spec for canonical form (the form you want to use when hashing) is as follows:
MUST represent all non-integer numbers in exponential notation
including a nonzero single-digit significant integer part, and
including a nonempty significant fractional part, and
including no trailing zeroes in the significant fractional part (other than as part of a “.0” required to satisfy the preceding point), and
including a capital “E”, and
including no plus sign in the exponent, and
including no insignificant leading zeroes in the exponent
Source: https://gibson042.github.io/canonicaljson-spec/
That said, the JSON grammar itself allows both forms (e and E):
exponent
""
'E' sign digits
'e' sign digits
Source: https://www.crockford.com/mckeeman.html
You may be able to convert the object to JSON using the Newtonsoft.Json classes directly and passing in a custom converter.
https://stackoverflow.com/a/28743082/736079
A better solution would probably be to use a specialized formatter component that directly manipulates the existing JSON document without converting it to CLR objects first.
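To illustrate why a parse-and-reserialize round trip cannot be relied on to preserve the original bytes, here is a minimal sketch in Python rather than PowerShell. The same effect appears with Python's standard `json` module: the number's original text is discarded on parsing and regenerated by the serializer. The helper names `round_trip` and `digest` are made up for this illustration.

```python
import hashlib
import json


def round_trip(text: str) -> str:
    """Parse JSON text and serialize it again in compact form."""
    return json.dumps(json.loads(text), separators=(",", ":"))


def digest(text: str) -> str:
    """SHA-256 of the UTF-8 encoded text (hypothetical helper for this sketch)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


original = '{"n":1E3}'
rewritten = round_trip(original)

print(rewritten)                              # {"n":1000.0}
print(digest(original) == digest(rewritten))  # False: the bytes differ
print(json.loads(original) == json.loads(rewritten))  # True: same value
```

If the hash has to survive processing, one option is to hash the original text before parsing it; another is to compare parsed values instead of serialized text.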
Postamble for future readers
Elm allows literal C:\Users\myuser in strings
This is consistent with the JSON spec
My problem was unrelated to this, but several layers of escaping convoluted the problem. Future lesson: fully producing a minimal working example would have found the error!
Original question
I have a Clojure backend that talks to an Elm frontend. I hit a bump when decoding JSON values in Elm.
\U below means the literal characters backslash and U, as if read from a text file. "\\U" is the same string as input in Clojure and Elm source (\ must be escaped). Note enclosing "".
Problem: encoding \U
The literal string \U, escaped "\\U" is not accepted by the Elm string decoder.
A blog post suggests that to obtain the literal string \U, this should be encoded in source code as "\\\\U", "escaping the unicode escape".
The literal string I want to send to the client is C:\Users\myuser. I prefer to send valid JSON from the server to the client.
Clojure standard library behavior
clojure.data.json does not do anything special for strings containing the literal \U. The example below shows that \U and \m are treated equally: the backslash is escaped and the following character is left alone.
project.core> (clojure.data.json/write-str "C:\\Users\\myuser")
"\"C:\\\\Users\\\\myuser\""
Manual workaround
Temporary workaround is manually escaping the strings I need:
(defn escape-backslash-u [s]
(clojure.string/replace s "\\U" "\\\\U"))
Concrete questions
Is clojure.data.json/write-str behaving correctly? As I understand the documentation, output should be valid unicode.
Are other JSON libraries behaving similarly?
Is Elm's Json.Decode behaving correctly by rejecting the literal string \U?
Solution progress
A friendly Clojurians Slack user pointed to the JSON standard specification, specifically sections 7. Strings and 8.2. Unicode characters.
I think you may be on the wrong track here.
The string you gave as an example, "C:\\Users\\myuser" is completely unproblematic, it does not contain any Unicode escape sequences. It is a string containing the ASCII characters ‘C’, ‘:’, ‘\’, ‘U’, and so on. The backslash is the escape character in Clojure strings so it needs to be escaped itself to represent a literal backslash.
In any case the string "C:\\Users\\myuser" can be serialized with (clojure.data.json/write-str "C:\\Users\\myuser"), and, as you know, this gives "\"C:\\\\Users\\\\myuser\"". All of this seems perfectly straightforward and sound.
Printing "\"C:\\\\Users\\\\myuser\"" results in the original string "C:\\Users\\myuser" being printed. That string is accepted as valid by JSONLint, again as expected.
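The same round trip can be sketched with another JSON library. This is an illustration in Python using its standard `json` module, not Clojure or Elm: a backslash in the value becomes two backslashes in the JSON text, and a lone backslash followed by `U` is not a valid JSON escape.

```python
import json

# Python source escaping, like Clojure's: the value is C:\Users\myuser
path = "C:\\Users\\myuser"

encoded = json.dumps(path)
print(encoded)  # "C:\\Users\\myuser"  (the JSON text, backslashes doubled)

# Decoding the JSON text gives back the original value.
decoded = json.loads(encoded)
print(decoded == path)  # True

# The JSON text "C:\Users" (single backslash) contains the invalid escape \U,
# so a conforming parser rejects it.
bad = '"C:\\Users"'
try:
    json.loads(bad)
    rejected = False
except json.JSONDecodeError:
    rejected = True
print(rejected)  # True
```

So a decoder that rejects a single backslash followed by `U` is following the spec, and an encoder that emits doubled backslashes is too.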
I understood it as Elm being unable to decode \"C:\\\\User... to "C:\\User... because it interprets \u as the start of an escape sequence.
I tried elm here with the following code:
import Html exposing (text)

main =
    text "\"c:\\\\user\\\\foo\"" -- from clojure.data.json/write-str
which in turn compiles/runs to
"c:\\user\\foo"
which looks fine to me.
Are you sure there is nothing else going on (middleware, transport)?
I've been testing a SHA512 class. I need to generate a hash from a string within Flash CS5, and it has to match the hash produced by ASP.NET (VB). It appears to be adding a zero somewhere in the string, and I don't know why.
These are the files I'm using: Porting SHA512 Javascript implemention to Actionscript.
The hashed string is the name "Karla" in this example.
Example (the brackets show where the difference is):
(asp.net)
C4DB628AD520AFF7308ED19E91635E8E24A6C7CFD4DB2F71BBE2FA6CD63770B315A839143037BB9DB16784C0BDCEB622ECAA4077D4D8(1787)D5023E86734748
(as3)
C4DB628AD520AFF7308ED19E91635E8E24A6C7CFD4DB2F71BBE2FA6CD63770B315A839143037BB9DB16784C0BDCEB622ECAA4077D4D8(17087)D5023E86734748
There's added info in the link I provided, but I don't think it relates to what I need. I don't think I'm using HMAC, just a straight string hash. However, when I do it in VB.NET, I get the bytes from the string first and then I hash the bytes.
I had a feeling that the AS3 code converts the string automatically in the SHA512 class?
Hoping someone has come across this issue as well.
Thanks for any help with this.
Neither of those hashes is correct. The correct SHA512 hash for the string "Karla" is:
C4DB628AD520AFF7308ED19E91635E8E24A6C7CFD4DB2F71BBE2FA6CD63770B315A839143037BB9DB16784C0BDCEB622ECAA4077D4D817087D5023E867347408
However, I would wager that the AS3 hash is actually correct -- the javascript version generates the correct hash, see here -- and was just pasted incorrectly.
In two places the computed hash contains the byte 0x08, but in the ASP.NET version the high 4 bits of that byte are lost, and it is appended to the output string as just "8", not "08".
Basically, your ASP.NET hash generator is mangling bytes less than 0x10 by dropping the leading zero, which gives you malformed hashes.
Another way to tell that something is amiss with your ASP.NET hash is that it's only 126 characters long (504 bits, hex encoded).
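The effect is easy to reproduce. Here is a minimal sketch in Python (not the poster's VB or AS3 code; the helper names `hex_padded` and `hex_unpadded` are made up) that formats the same bytes with and without a fixed two-digit width:

```python
import hashlib


def hex_padded(data: bytes) -> str:
    """Two hex digits per byte, so 0x08 becomes '08'."""
    return "".join(f"{b:02X}" for b in data)


def hex_unpadded(data: bytes) -> str:
    """No fixed width: 0x08 becomes '8', which is the bug described above."""
    return "".join(f"{b:X}" for b in data)


# The three bytes around the spot where the two posted hashes differ.
sample = bytes([0x17, 0x08, 0x7D])
print(hex_padded(sample))    # 17087D
print(hex_unpadded(sample))  # 1787D

# A SHA-512 digest is 64 bytes, so a correct hex string has 128 characters.
digest = hashlib.sha512("Karla".encode("utf-8")).digest()
print(len(hex_padded(digest)))  # 128
```

In .NET terms, the fix is to format each byte with a two-digit hex format rather than the default width.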
I'm looking for the easiest way to turn a CSV file (of floats) into a float list. I'm not well acquainted with reading files in general in Ocaml, so I'm not sure what this sort of function entails.
Any help or direction is appreciated :)
EDIT: I'd prefer not to use a third party CSV library unless I absolutely have to.
https://forge.ocamlcore.org/projects/csv/
If you don't want to include a third-party library, and your CSV files are simply formatted with no quotes or embedded commas, you can parse them easily with standard library functions. Use read_line in a loop or in a recursive function to read each line in turn. To split each line, call Str.split_delim (link your program with str.cma or str.cmxa). Call float_of_string to parse each column into a float.
let comma = Str.regexp ","
let parse_line line = List.map float_of_string (Str.split_delim comma line)
Note that this will break if your fields contain quotes. It would be easy to strip quotes at the beginning and at the end of each element of the list returned by split_delim. However, if there are embedded commas, you need a proper CSV parser. You may have embedded commas if your data was produced by a localized program in a French locale — French uses commas as the decimal separator (e.g. English 3.14159, French 3,14159). Writing floating point data with commas instead of dots isn't a good idea, but it's something you might encounter (some spreadsheet CSV exports, for example). If your data comes out of a Fortran program, you should be fine.
I have a string which gets serialized to JSON in Javascript, and then deserialized to Java.
It looks like if the string contains a degree symbol, then I get a problem.
I could use some help in figuring out who to blame:
is it the Spidermonkey 1.8 implementation? (this has a JSON implementation built-in)
is it Google gson?
is it me for not doing something properly?
Here's what happens in JSDB:
js>s='15\u00f8C'
15°C
js>JSON.stringify(s)
"15°C"
I would have expected "15\u00f8C", which leads me to believe that Spidermonkey's JSON implementation isn't doing the right thing... except that the JSON homepage's syntax description (is that the spec?) says that a char can be
any-Unicode-character-except-"-or-\-or-control-character
so maybe it passes the string along as-is without encoding it as \u00f8... in which case I would think the problem is with the gson library.
Can anyone help?
I suppose my workaround is to use either a different JSON library, or manually escape strings myself after calling JSON.stringify() -- but if this is a bug then I'd like to file a bug report.
This is not a bug in either implementation. There is no requirement to escape U+00B0. To quote the RFC:
2.5. Strings
The representation of strings is similar to conventions used in the C family of programming languages. A string begins and ends with quotation marks. All Unicode characters may be placed within the quotation marks except for the characters that must be escaped: quotation mark, reverse solidus, and the control characters (U+0000 through U+001F).
Any character may be escaped.
Escaping everything inflates the size of the data: every code point fits in four or fewer bytes in all Unicode transformation formats, whereas escaping takes six or twelve bytes each.
It is more likely that you have a text transcoding bug somewhere in your code, and escaping everything down to the ASCII subset masks the problem. The JSON spec requires that all data use a Unicode encoding.
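Both forms are valid JSON and decode to the same string. A minimal sketch in Python, using its standard `json` module as a stand-in for the libraries in the question, shows the equivalence and the size difference:

```python
import json

s = "15\u00b0C"  # the four-character string 15°C

raw = json.dumps(s, ensure_ascii=False)     # keeps the degree sign as-is
escaped = json.dumps(s, ensure_ascii=True)  # writes it as \u00b0

print(escaped)  # "15\u00b0C"

# Either JSON text decodes to the same string.
print(json.loads(raw) == json.loads(escaped) == s)  # True

# Size on the wire when the JSON text is sent as UTF-8.
print(len(raw.encode("utf-8")))      # 7
print(len(escaped.encode("utf-8")))  # 11
```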
hmm, well here's a workaround anyway:
function JSON_stringify(s, emit_unicode)
{
    var json = JSON.stringify(s);
    return emit_unicode ? json : json.replace(/[\u007f-\uffff]/g,
        function(c) {
            return '\\u'+('0000'+c.charCodeAt(0).toString(16)).slice(-4);
        }
    );
}
test case:
js>s='15\u00f8C 3\u0111';
15°C 3◄
js>JSON_stringify(s, true)
"15°C 3◄"
js>JSON_stringify(s, false)
"15\u00f8C 3\u0111"
This is SUPER late and probably not relevant anymore, but if anyone stumbles upon this answer, I believe I know the cause.
So the JSON encoded string is perfectly valid with the degree symbol in it, as the other answer mentions. The problem is most likely in the character encoding that you are reading/writing with. Depending on how you are using Gson, you are probably passing it a java.io.Reader instance. Any time you are creating a Reader from an InputStream, you need to specify the character encoding, or java.nio.charset.Charset instance (it's usually best to use java.nio.charset.StandardCharsets.UTF_8). If you don't specify a Charset, Java will use your platform default encoding, which on Windows is usually CP-1252.
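The kind of corruption this produces can be sketched in a few lines. This is an illustration in Python, not Java or Gson: the same UTF-8 bytes are decoded once with the right charset and once with CP-1252.

```python
# JSON text containing a raw degree sign (U+00B0), encoded as UTF-8 on the wire.
json_bytes = '"15\u00b0C"'.encode("utf-8")

# Decoding with the charset the sender used gives the intended text.
good = json_bytes.decode("utf-8")
print(good)  # "15°C"

# Decoding with a different charset turns the two-byte sequence for the
# degree sign into two separate characters.
bad = json_bytes.decode("cp1252")
print(bad)   # "15Â°C"
```

The JSON itself is fine in both cases; only the byte-to-character step differs, which is why naming the charset explicitly when constructing the reader fixes the problem.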