Handle special characters in JSON payload - json

Below is an example of a JSON payload I'd like to send. How do I wrap the value of "value" so that special characters like ; and " and * and / and carriage returns are handled correctly?
{
"type":"INLINE",
"name":"${varEvidence-1}",
"value":"GET /mxca/userprofile/us/submenudata.do
request_type=authreg_submit&page=)(sn%3d HTTP/1.1
Host: example.com
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:31.0) "Gecko"/20100101
Firefox/31.0
Accept: */*
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
X-Requested-With: XMLHttpRequest
Referer: https://example.com//mxca/userprofile/us/submenudata.do
request_type=authreg_submit&page=*)(sn%3d*
Cookie:
keymaster=PMV54Bye4RKtzab8vXdJyW6S8sCDe37OJuw0vx072smBAr7GW8D3rstCzWPUsEU29aM2MLFoglHzhATZISEQyYbsyNvg%3D%3D;
gatekeeper=4C7205995D7B2AFFA76E7B1335A890FCC4924C9C1E0F554077CDE2A651DD12D0A1CE69FB46D757A75FB64130179FD9001650153DA13887D33581F621453F81F560D81CFCAFE51922B6ADABF8C5A45F932491BA1325866F29B24CB2D73328A0FDBB72F1868E208C1786310849A6E5D2332E045C90CADC559A78DEA614ACE4A18E5262DDD7D9AE9854EB6EA7C9BA8BB68DC3F5DDAEA3C930442FA25FCBAF6D25F5DC6AE0C890E01A83E2CB1E70510B537BF63E653045C3B52B0E5FE728740894D87AA5599885F72DA1FDDF0D8AA9883FB3BE035EBA65CAEC15;
blueboxvalues=f93b466e-1f61c944-ba2aec26-cf40345c;
sessioncookie=easc=D0D8264E667BF43A91A02012B400E9B5A05175762DE38B9FA2D88675680B8419FD83366DE7A706A2E3391F9B0A4FDA9CE7E7168575DF4204233E855BAAC4CD01D386D62B27213A4D7595C69AB6EA15B7770572832321047ABC1627E6F1A1ECE62FAB6532AF78A9E3E888090D3EAFA80C92730CBE395556E578CA2F137E3A121CD20
Connection: keep-alive",
"docLockerId":null
}

As per the JSON spec, a string may contain any character except ", \ and unescaped control characters; these have to be escaped with a \ character.
So you have to replace " with \" and \ with \\, and write control characters such as the carriage returns in your example as \r (and line feeds as \n). Other characters like ;, * and / can stay in the string as-is.
Here is a list of JSON special characters.
\b Backspace
\f Form feed
\n New line
\r Carriage return
\t Tab
\" Double quote
\\ Backslash character
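If the payload is built programmatically, the easiest fix is to let a JSON serializer apply these escapes for you instead of doing it by hand. A minimal sketch in Python (the tool used to send the payload isn't stated in the question, so this is only an illustration; the value is shortened from the example above):
import json

# raw multi-line value containing quotes and CR/LF line breaks
raw_value = 'User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:31.0) "Gecko"/20100101\r\nAccept: */*\r\nConnection: keep-alive'

payload = {
    "type": "INLINE",
    "name": "${varEvidence-1}",
    "value": raw_value,   # quotes become \" and line breaks become \r\n automatically
    "docLockerId": None,
}

print(json.dumps(payload, indent=2))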
This Stack Overflow post has a ton of information regarding this question.
You could also use the specs reference for more information.

Related

How to do HTTP form POST requests with karate reading key-values from a json file

I am trying to do a series of form POST requests using Karate; each form has different fields, so I am trying to read the actual key-value data for the form input fields from a JSON file.
When submitting the form in a browser, the equivalent curl looks like this:
curl 'http://test.example.com:8080/submitform.html' -X POST -H 'Content-Type: multipart/form-data; boundary=---------------------------12345' --data-binary $'-----------------------------12345\r\nContent-Disposition: form-data; name=":formid"\r\n\r\n_submitform_start\r\n-----------------------------12345\r\nContent-Disposition: form-data; name=":formstart"\r\n\r\n/submitform/start\r\n-----------------------------12345\r\nContent-Disposition: form-data; name="_charset_"\r\n\r\nUTF-8\r\n-----------------------------12345\r\nContent-Disposition: form-data; name=":redirect"\r\n\r\n/submitform/thank-you.html\r\n-----------------------------12345\r\nContent-Disposition: form-data; name="xyzHidden"\r\n\r\n\r\n-----------------------------12345\r\nContent-Disposition: form-data; name="first-name"\r\n\r\ntestfirst\r\n-----------------------------12345\r\nContent-Disposition: form-data; name="last-name"\r\n\r\ntestlast\r\n-----------------------------12345\r\nContent-Disposition: form-data; name="email"\r\n\r\ntestemail#example.com\r\n-----------------------------12345\r\nContent-Disposition: form-data; name="telephone-number-with-country-code-optional"\r\n\r\n\r\n-----------------------------12345\r\nContent-Disposition: form-data; name="company-name"\r\n\r\ntestcompany\r\n-----------------------------12345\r\nContent-Disposition: form-data; name="city"\r\n\r\ntestcity\r\n-----------------------------12345\r\nContent-Disposition: form-data; name="country-region"\r\n\r\nAustralia\r\n-----------------------------12345\r\nContent-Disposition: form-data; name="describe-your-question-or-situation"\r\n\r\ntest describe\r\n-----------------------------12345\r\nContent-Disposition: form-data; name="uploadNode_upload"\r\n\r\n/submitform/upload\r\n-----------------------------12345\r\nContent-Disposition: form-data; name="upload"; filename=""\r\nContent-Type: application/octet-stream\r\n\r\n-----------------------------12345--\r\n'
My feature file looks like this:
Feature: Form POST submit from file
Background:
* url sitehost
Scenario Outline: form submit
Given path '<submitpath>'
And def ffParams = read('classpath:forms/<datafile>.json')
And multipart fields ffParams
When method post
Then status 302
And match header Location == sitehost + '<thankyoupath>'
Examples:
| submitpath | datafile | thankyoupath |
| /submitform.html | submitform_data | submitform/thank-you.html |
My data json file looks something like this:
{
":formid":"_submitform_start",
":formstart":"/submitform/start",
"_charset_":"UTF-8",
":redirect":"/submitform/thank-you.html",
"xyzHidden":"",
"first-name":"testfirst",
"last-name":"testlast",
"email":"testemail#example.com",
"telephone-number-with-country-code-optional",
"company-name":"testcompany",
"city":"testcity",
"country-region":"Australia",
"describe-your-question-or-situation":"test describe"
}
I would like to submit the data as multipart/form-data, not application/x-www-form-urlencoded. According to the docs this is possible by using "multipart fields".
But when I do the above, Karate will put a filename="" after each key in my request, e.g. it looks like
content-disposition: form-data; name="first-name"; filename=""
content-type: text/plain; charset=UTF-8
content-length: 4
Completed: true
IsInMemory: true
but it should look like it does in the curl:
Content-Disposition: form-data; name="first-name"\r\n\r\ntestfirst\r\n
So it is not working as expected.
First:
* def foo = null
* def bar = { a: 1, b: '##(foo)' }
* match bar == { a: 1 }
Note how b got removed. Explained in the docs: https://github.com/karatelabs/karate#remove-if-null
There are plenty of ways to manipulate JSON such as remove.
Regarding file-upload, this is notoriously tricky, so try to get things working without it if possible and then refer to this thread.
Also see: https://stackoverflow.com/a/71512836/143475

Powershell not able to convert while converting values from "&" to JSON

RoleFullPath
Applications\User Admin & Support-DEMO
PowerShell Code
$NewJSON.roleFullPath = $Line.RoleFullPath
.
.
.
.
$JSONPath = $RolePath + $FolderName + "-JSON.json"
Convertto-JSON $NewJSON | Out-file -Encoding "UTF8" $JSONPath
Output:
"roleFullPath": "Applications\\User Admin \u0026 Support-DEMO"
While converting from csv to json, character '&' is getting converted to '\u0026'
Any help?
In Windows PowerShell v5.1, ConvertTo-Json indeed unexpectedly encodes & characters as the Unicode escape sequence \u0026, where 0026 is the hexadecimal code point of the & character, U+0026.
(PowerShell Core, by contrast, preserves the & as-is.)
That said, JSON parsers should be able to interpret such escape sequences and, indeed, the complementary ConvertFrom-Json cmdlet does.
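A quick way to see that the escape round-trips (using the value from the question):
# ConvertFrom-Json turns the \u0026 escape back into a literal &
('{ "roleFullPath": "Applications\\User Admin \u0026 Support-DEMO" }' | ConvertFrom-Json).roleFullPath
# Output: Applications\User Admin & Support-DEMO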
Note: The solutions below are general ones that can handle the Unicode escape sequences of any Unicode character; since ConvertTo-Json seemingly only uses these Unicode escape-sequence representations for the characters &, ', < and >, a simpler solution is possible, unless false positives must be ruled out - see this answer.
That said, if you do want to manually convert Unicode escape sequences into their character equivalents in JSON text, you can use the following, limited solution:
# Sample JSON with Unicode escapes.
$json = '{ "roleFullPath": "Applications\\User Admin \u0026 Support-DEMO" }'
# Replace Unicode escapes with the chars. they represent,
# with limitations.
[regex]::replace($json, '\\u[0-9a-fA-F]{4}', {
param($match) [char] [int] ('0x' + $match.Value.Substring(2))
})
The above yields:
{ "roleFullPath": "Applications\\User Admin & Support-DEMO" }
Note how \u0026 was converted to the char. it represents, &.
A robust solution requires more work:
There are characters that must be escaped in JSON and cannot be represented literally, so for the conversion to literal characters to work generically, these characters must be excluded.
Additionally, false positives must be avoided; e.g., \\u0026 is not a valid Unicode escape sequence, because a JSON parser interprets \\ as an escaped \ followed by verbatim u0026.
Finally, the Unicode sequences for " and \ must be translated into their escaped forms, \" and \\, and it is possible to represent a few ASCII-range control characters by C-style escape sequences, e.g., \t for a tab character (\u0009).
The following robust solution addresses all these issues:
# Sample JSON with Unicode escape sequences:
# \u0026 is &, which CAN be converted to the literal char.
# \u000a is a newline (LF) character, which CANNOT be converted, but can
# be translated to escape sequence "\n"
# \\u0026 is *not* a Unicode escape sequence and must be preserved as-is.
$json = '{
"roleFullPath": "Applications\u000aUser Admin \u0026 Support-DEMO-\\u0026"
}'
[regex]::replace($json, '(?<=(?:^|[^\\])(?:\\\\)*)\\u([0-9a-fA-F]{4})', {
param($match)
$codePoint = [int] ('0x' + $match.Groups[1].Value)
if ($codePoint -in 0x22, 0x5c) {
# " or \ must be \-escaped.
'\' + [char] $codePoint
}
elseif ($codePoint -in 0x8, 0x9, 0xa, 0xc, 0xd) {
# Control chars. that can be represented as short, C-style escape sequences.
('\b', '\t', '\n', $null, '\f', '\r')[$codePoint - 0x8]
}
elseif ($codePoint -le 0x1f -or [char]::IsSurrogate([char] $codePoint)) {
# Other control chars. and halves of surrogate pairs must be retained
# as escape sequences.
# (Converting surrogate pairs to a single char. would require much more effort.)
$match.Value
}
else {
# Translate to literal char.
[char] $codePoint
}
})
Output:
{
"roleFullPath": "Applications\nUser Admin & Support-DEMO-\\u0026"
}
To stop PowerShell from doing this, pipe your JSON output through this:
$jsonOutput | ForEach-Object { [System.Text.RegularExpressions.Regex]::Unescape($_) } | Set-Content $jsonPath -Encoding UTF8;
This will prevent the & being converted :)

Issue with payload when passing user defined value to the body data

When trying to pass a user-defined value into the body content, I am getting the error "message": "Bad JSON escape sequence: \S. ---- \r\nUnexpected character encountered while parsing value".
When passing complete raw payload through body data, I am not getting this error.
With a user-defined variable,
"customerBillingAddress":"26 Chestnut St\Suite 2B\Andover, MA 01810",
gets inserted into the body as-is, and the lone "\" is what throws the error.
When testing with raw data I don't get this, because the value is already escaped in the payload:
"customerBillingAddress":"26 Chestnut St\\Suite 2B\\Andover, MA 01810",
Please advise.
You need to escape \ with \\ and double quotes " with \" as per the JSON format guidelines. Here I think you only need to escape \ in your JSON payload, like below.
"customerBillingAddress":"26 Chestnut St\\Suite 2B\\Andover, MA 01810"
You need to escape the following characters in JSON:
\b Backspace (ascii code 08)
\f Form feed (ascii code 0C)
\n New line
\r Carriage return
\t Tab
\" Double quote
\\ Backslash character
In order to do this automatically you can use the __groovy() function, available since JMeter 3.1, like:
${__groovy(org.apache.commons.lang3.StringEscapeUtils.escapeJson(vars.get('json')),)}
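For illustration, here is roughly what that escaping does, shown with the same Apache Commons Lang class the function above relies on (a standalone Groovy sketch, assuming commons-lang3 is on the classpath, as it is inside JMeter):
import org.apache.commons.lang3.StringEscapeUtils

// the address value from the question, containing two literal backslashes
def raw = '26 Chestnut St\\Suite 2B\\Andover, MA 01810'
println StringEscapeUtils.escapeJson(raw)
// prints: 26 Chestnut St\\Suite 2B\\Andover, MA 01810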

Using REST in TCL to POST in JSON format?

Essentially what I'm attempting to do is post to a REST API, but no matter what I do I end up with HTTP 400. Here is my extremely quick and extremely dirty code:
package require rest
package require json
package require tls
::http::register https 443 ::tls::socket
set credentials {username admin password LabPass1}
set url1 [format "%s/%s" "https://127.0.0.1:8834" session]
set unformattedToken [dict get [::json::json2dict [::rest::post $url1 $credentials]] token]
set cookie [format "token=%s" $unformattedToken]
set header [list X-Cookie $cookie Content-type application/json]
set config [list method post format json headers $header]
set url [format "%s/%s" "https://127.0.0.1:8834" scans]
set uuid 7485-2345-566
set name "Testing TCL Network Scan"
set desc "Basic Network Scan using API"
set pid 872
set target 127.0.0.1
set data {{"uuid":"$uuid","settings": {"name":"$name","description":"$desc", "policy_id":"$pid","text_targets":"$target", "launch":"ONETIME","enabled":false,"launch_now":true}}}
set jsonData [json::json2dict $data]
set response [::rest::simple $url $jsonData $config]
I've tried using the above code and I've also tried removing the json::json2dict call and just sending the data. I believe, and I could be wrong, that my issue is the data is going as line-based text data:
POST /scans HTTP/1.1
Host: 127.0.0.1:8834
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 10.0) http/2.8.9 Tcl/8.6.4
Connection: close
X-Cookie: token=301b8dcdf855a29b5b902cf8d93c49750935c925a965445e
Content-type: application/json
Accept: */*
Accept-Encoding: gzip,deflate,compress
Content-Length: 270
uuid=7485-2345-566&settings=name%20%7BTesting%20TCL%20Network%20Scan%7D%20description%20%7BBasic%20Network%20Scan%20using%20API%7D%20policy_id%20872%20text_targets%20127.0.0.1%20launch%20ONETIME%20enabled%20false%20launch_now%20true
I've reviewed the JSON documentation, and the REST documentation but I'm having a hard time finding an example of posting using JSON format. Here is what this looks like in a curl command:
curl https://127.0.0.1:8834/scans -k -X POST -H 'Content-Type: application/json' -H 'X-Cookie: token= <token>' -d '{"uuid":"7485-2345-566","settings":{"name":"Testing TCL Network Scan","description":"Basic Network Scan using API", "policy_id":"872","text_targets":"127.0.0.1", "launch":"ONETIME","enabled":false,"launch_now":true}}'
One problem you have is that the variables in the request data aren't substituted: "uuid":"$uuid" stays as the literal text "uuid":"$uuid", for instance, instead of picking up the value of $uuid. This is because of the braces around the value that data is set to.
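A minimal illustration of the difference (a sketch):
set uuid 7485-2345-566
puts {uuid is $uuid}    ;# braces suppress substitution, prints: uuid is $uuid
puts "uuid is $uuid"    ;# quotes allow substitution,    prints: uuid is 7485-2345-566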
The best solution would seem to be to not create a json object and then convert it to a dict, but instead create the dict directly, like this:
set data [list uuid $uuid settings [list name $name description $desc policy_id $pid text_targets $target launch ONETIME enabled false launch_now true]]
or like this, for shorter lines:
dict set data uuid $uuid
dict set data settings name $name
dict set data settings description $desc
dict set data settings policy_id $pid
dict set data settings text_targets $target
dict set data settings launch ONETIME
dict set data settings enabled false
dict set data settings launch_now true
or by some other method.
Documentation: dict, list, set

Why is JSON::XS Not Generating Valid UTF-8?

I'm getting some corrupted JSON and I've reduced it down to this test case.
use utf8;
use 5.18.0;
use Test::More;
use Test::utf8;
use JSON::XS;
BEGIN {
# damn it
my $builder = Test::Builder->new;
foreach (qw/output failure_output todo_output/) {
binmode $builder->$_, ':encoding(UTF-8)';
}
}
foreach my $string ( 'Deliver «French Bread»', '日本国' ) {
my $hashref = { value => $string };
is_sane_utf8 $string, "String: $string";
my $json = encode_json($hashref);
is_sane_utf8 $json, "JSON: $json";
say STDERR $json;
}
diag ord('»');
done_testing;
And this is the output:
utf8.t ..
ok 1 - String: Deliver «French Bread»
not ok 2 - JSON: {"value":"Deliver «French Bread»"}
# Failed test 'JSON: {"value":"Deliver «French Bread»"}'
# at utf8.t line 17.
# Found dodgy chars "<c2><ab>" at char 18
# String not flagged as utf8...was it meant to be?
# Probably originally a LEFT-POINTING DOUBLE ANGLE QUOTATION MARK char - codepoint 171 (dec), ab (hex)
{"value":"Deliver «French Bread»"}
ok 3 - String: 日本国
ok 4 - JSON: {"value":"æ¥æ¬å½"}
1..4
{"value":"日本国"}
# 187
So the string containing guillemets («») is valid UTF-8, but the resulting JSON is not. What am I missing? The utf8 pragma is correctly marking my source. Further, that trailing 187 is from the diag. That's less than 255, so it almost looks like a variant of the old Unicode bug in Perl. (And the test output still looks like crap. Never could quite get that right with Test::Builder).
Switching to JSON::PP produces the same output.
This is Perl 5.18.1 running on OS X Yosemite.
is_sane_utf8 doesn't do what you think it does. You're supposed to pass strings you've decoded to it. I'm not sure what the point of it is, but it's not the right tool. If you want to check if a string is valid UTF-8, you could use
ok(eval { decode_utf8($string, Encode::FB_CROAK | Encode::LEAVE_SRC); 1 },
'$string is valid UTF-8');
To show that JSON::XS is correct, let's look at the sequence is_sane_utf8 flagged.
+--------------------- Start of two byte sequence
| +---------------- Not zero (good)
| | +---------- Continuation byte indicator (good)
| | |
v v v
C2 AB = [110]00010 [10]101011
00010 101011 = 000 1010 1011 = U+00AB = «
The following shows that JSON::XS produces the same output as Encode.pm:
use utf8;
use 5.18.0;
use JSON::XS;
use Encode;
foreach my $string ('Deliver «French Bread»', '日本国') {
my $hashref = { value => $string };
say(sprintf("Input: U+%v04X", $string));
say(sprintf("UTF-8 of input: %v02X", encode_utf8($string)));
my $json = encode_json($hashref);
say(sprintf("JSON: %v02X", $json));
say("");
}
Output (with some spaces added):
Input: U+0044.0065.006C.0069.0076.0065.0072.0020.00AB.0046.0072.0065.006E.0063.0068.0020.0042.0072.0065.0061.0064.00BB
UTF-8 of input: 44.65.6C.69.76.65.72.20.C2.AB.46.72.65.6E.63.68.20.42.72.65.61.64.C2.BB
JSON: 7B.22.76.61.6C.75.65.22.3A.22.44.65.6C.69.76.65.72.20.C2.AB.46.72.65.6E.63.68.20.42.72.65.61.64.C2.BB.22.7D
Input: U+65E5.672C.56FD
UTF-8 of input: E6.97.A5.E6.9C.AC.E5.9B.BD
JSON: 7B.22.76.61.6C.75.65.22.3A.22.E6.97.A5.E6.9C.AC.E5.9B.BD.22.7D
JSON::XS is generating valid UTF-8, but you're using the resulting UTF-8 encoded byte strings in two different contexts that expect character strings.
Issue 1: Test::utf8
Here are the two main situations when is_sane_utf8 will fail:
1. You have a miscoded character string that had been decoded from a UTF-8 byte string as if it were Latin-1 or from double-encoded UTF-8, or the character string is perfectly fine and just looks like a potentially "dodgy" miscoding (using the terminology from its docs).
2. You have a valid UTF-8 byte string containing the encoded code points U+0080 through U+00FF, for example «French Bread».
The is_sane_utf8 test is intended only for character strings and has the documented potential for false negatives.
Issue 2: Output Encoding
All of your non-JSON strings are character strings while your JSON strings are UTF-8 encoded byte strings, as returned from the JSON encoder. Since you're using the :encoding(UTF-8) PerlIO layer for TAP output, the character strings are being implicitly encoded to UTF-8 with good results, while the byte strings containing JSON are being double encoded. STDERR however does not have an :encoding PerlIO layer set, so the encoded JSON byte strings look good in your warnings since they're already encoded and being passed straight out.
Only use the :encoding(UTF-8) PerlIO layer for IO with character strings, as opposed to the UTF-8 encoded byte strings returned by default from the JSON encoder.
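One way to reconcile the two, sketched below on the snippet from the question, is to decode the encoder's byte string back into a character string before handing it to anything that writes through that layer (alternatively, a JSON::XS object created without the utf8 option returns character strings directly):
use Encode qw(decode_utf8);

my $json_bytes = encode_json($hashref);        # UTF-8 encoded byte string
my $json_chars = decode_utf8($json_bytes);     # back to a character string
is_sane_utf8 $json_chars, "JSON: $json_chars"; # a character string, so it gets encoded only once on output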