JsonDocument incomplete parsing with larger payloads - json
I have an HttpClient that fetches arbitrary JSON data from an endpoint. I previously used Newtonsoft.Json for this without trouble, but after migrating all of the functions to System.Text.Json (STJ), I started to notice what looks like improper parsing.
Platforms tested: macOS & Linux (Google Kubernetes Engine)
Framework: .NET Core 3.1 LTS
The code below calls an API that returns a JSON array. I simply stream it, load it into a JsonDocument, and then attempt to peek into it. Nothing comes out as expected. The code is provided below along with the variable values seen while step debugging.
using System;
using System.ComponentModel;
using System.IO;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;
using System.Web;
using System.Xml;
namespace HttpCallDemo
{
    class Program
    {
        static async Task Main(string[] args)
        {
            using (var httpClient = new HttpClient())
            {
                // FLUSH
                httpClient.DefaultRequestHeaders.Clear();
                httpClient.MaxResponseContentBufferSize = 4096;
                string body = string.Empty, customMediaType = string.Empty; // For POST/PUT
                // Setup the url
                var uri = new UriBuilder("https://api-pub.bitfinex.com/v2/tickers?symbols=ALL");
                uri.Port = -1;
                // Pull in the payload
                var requestPayload = new HttpRequestMessage(HttpMethod.Get, uri.ToString());
                HttpResponseMessage responsePayload;
                responsePayload = await httpClient.SendAsync(requestPayload,
                    HttpCompletionOption.ResponseHeadersRead);
                var byteArr = await responsePayload.Content.ReadAsByteArrayAsync();
                if (byteArr.LongCount() > 4194304) // 4MB
                    return; // Too big.
                // Pull the content
                var contentFromBytes = Encoding.Default.GetString(byteArr);
                JsonDocument payload;
                switch (responsePayload.StatusCode)
                {
                    case HttpStatusCode.OK:
                        // Return the payload distinctively
                        payload = JsonDocument.Parse(contentFromBytes);
#if DEBUG
                        var testJsonRes = Encoding.UTF8.GetString(
                            Utf8Json.JsonSerializer.Serialize(payload.RootElement));
                        // var testRawRes = contentStream.read
                        var testJsonResEl = payload.RootElement.GetRawText();
#endif
                        break;
                    default:
                        throw new InvalidDataException("Invalid HTTP response.");
                }
            }
        }
    }
}
Run the minimal code above and notice that the payload looks different from its original after parsing. I'm sure there's something wrong with the options for STJ; it seems we have to tune or explicitly define its limits to allow it to process that JSON payload.
Diving deeper into the debug content made things even weirder. When the HttpClient obtains the payload and reads it to a string, I get the entire JSON string as is. However, once I parse it into a JsonDocument and then invoke RootElement.Clone(), I end up with a JsonElement that appears to hold far less data and carries an invalid JSON structure (below).
ValueKind = Array : "[["tBTCUSD",11418,70.31212518,11419,161.93475693,258.02141213,0.0231,11418,2980.0289306,11438,11003],["tLTCUSD",58.919,2236.00823543,58.95,2884.6718013699997,1.258,0.0218,58.998,63147.48344762,59.261,56.334],["tLTCBTC",0.0051609,962.80334198,0.005166,1170.07399991,-0.000012,-0.0023,0.0051609,4178.13148459,0.0051852,0.0051],["tETHUSD",396.54,336.52151165,396.55,384.37623341,8.26964946,0.0213,396.50930256,69499.5382821,397.77,380.5],["tETHBTC",0.034731,166.67781664000003,0.034751,356.03450125999996,-0.000054,-0.0016,0.034747,5855.04978836,0.035109,0.0343],["tETCBTC",0.00063087,15536.813429530002,0.00063197,16238.600279749999,-0.00000838,-0.0131,0.00063085,73137.62192801,0.00064135,0.00062819],["tETCUSD",7.2059,9527.40221867,7.2176,8805.54677899,0.0517,0.0072,7.2203,49618.78868196,7.2263,7],["tRRTUSD",0.057476,33577.52064154,0.058614,20946.501210000002,0.023114,0.6511,0.058614,210741.23592011,0.06443,0.0355],["tZECUSD",88.131,821.28048322,88.332,880.37484662,5.925,0.0
And of course, attempting to read its contents would result in:
System.InvalidOperationException: Operation is not valid due to the current state of the object.
at System.Text.Json.JsonElement.get_Item(Int32 index)
at Nozomi.Preprocessing.Abstracts.BaseProcessingService`1.ProcessIdentifier(JsonElement jsonDoc, String identifier) in /Users/nicholaschen/Projects/nozomi/Nozomi.Infra.Preprocessing/Abstracts/BaseProcessingService.cs:line 255
Here's proof that there is a proper 38KBs worth of data coming in from the endpoint.
UPDATE
Further testing with this
if (payload.RootElement.ValueKind.Equals(JsonValueKind.Array))
{
string testJsonArr;
testJsonArr = Encoding.UTF8.GetString(
Utf8Json.JsonSerializer.Serialize(
payload.RootElement.EnumerateArray()));
}
shows that a larger array of arrays (more than 9 elements, each with 11 elements) results in an incomplete JSON structure, causing the issue I'm facing.
For those who are working with JsonDocument and JsonElement, take note that the variables shown while step debugging are not accurate. Do not rely on inspecting them at runtime, because the debugger does not display large values in their entirety.
@dbc has shown that re-serializing the deserialized data produces the complete dataset. I strongly suggest wrapping any serializers used for debugging in a DEBUG preprocessor directive so that these redundant lines are not executed outside development.
To interact with these entities, call .Clone() on any element that must outlive its JsonDocument so that disposal does not invalidate it. Access the RootElement and traverse into it before viewing a value in step debug mode, because large values will not be displayed in full.
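A minimal sketch of checking the parsed document in code instead of trusting the debugger display. It assumes .NET Core 3.1 with the built-in System.Text.Json; the class name JsonDocumentCheck, the helper ParseAndClone, and the shortened sample payload are made up for illustration and are not part of the original project.

```csharp
using System;
using System.Text.Json;

static class JsonDocumentCheck
{
    // Hypothetical helper: parses the text and returns a detached copy of the
    // root element, so it stays valid after the JsonDocument is disposed.
    public static JsonElement ParseAndClone(string json)
    {
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            return doc.RootElement.Clone();
        }
    }

    static void Main()
    {
        // Shortened stand-in for the ticker payload (assumption, not real data).
        string json = "[[\"tBTCUSD\",11418,70.3],[\"tLTCUSD\",58.919,2236.0]]";
        JsonElement root = ParseAndClone(json);

        // Ask the element itself rather than reading the debugger's preview.
        Console.WriteLine(root.GetArrayLength());                    // 2
        Console.WriteLine(root.GetRawText().Length == json.Length);  // True
        Console.WriteLine(root[1][0].GetString());                   // tLTCUSD
    }
}
```

If GetArrayLength() and the raw text length match what the endpoint sent, the parse was complete and only the debugger preview was truncated.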
Related
Modify the rendered string of a response in Grails 3 before sending to the client
I need to create a benchmark report on whether, in the grand scheme of things, minifying + GZIP of dynamic HTML responses (generated through GSPs) on every request is actually better than GZIP without minifying. Minifying adds overhead, because the generated dynamic HTML string has to be parsed before it is compressed using a Java library, but it results in a smaller response size. GZIP without minifying gives a faster response time but a slightly larger response size. I have the feeling that this "improvement" may be insignificant, but I need the benchmark report to back that up to the team.
To do that, I modify controller actions like so:
// import ...MinifyPlugin
class HomeController {
    def get() {
        Map model = [:]
        String htmlBody = groovyPageRenderer.render(view: "/get", model: model)
        // This adds a few milliseconds and reduces a few characters.
        htmlBody = MinifyPlugin.minifyHtmlString(htmlBody)
        render htmlBody
    }
}
But the Grails project has almost a hundred actions, and doing this on every existing action is impractical and not maintainable, especially since after the benchmarking we may decide not to minify the HTML response.
So I was thinking of doing this inside an Interceptor instead:
void afterView() {
    if(response.getContentType().contains("text/html")) {
        // This throws IllegalStateException: getWriter() has already been called for this response
        OutputStream servletOutputStream = response.getOutputStream()
        String htmlBody = new String(servletOutputStream.toByteArray())
        htmlBody = MinifyingPlugin.minifyHtmlString(htmlBody)
        ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream()
        byteArrayOutputStream.write(htmlBody.getBytes())
        response.setCharacterEncoding("UTF-8")
        response.setContentType("text/html")
        response.outputStream << byteArrayOutputStream
    }
}
But it seems that modification of the response body is impossible once it enters the afterView interceptor...?
So is there any other way to do this using Grails 3 Interceptors, or should I update every controller action we have manually and perform the modification there instead?
This is what I like to use Interceptors for. The after() part of the interceptor can act on the model after it is returned from the controller (whereas before() acts on the request before it is sent to the controller). This allows you to manipulate all data for a set of endpoints (or one specific endpoint) prior to returning it to the client. If you want to render to a view, you do that in the interceptor rather than in the controller; you merely return data from the controller.
API Testing / REST Assured Automation Testing Advice and or suggestions
Really looking for some practical advice and general guidance. Below is the current scenario.
I have an Excel document in which each row is considered a test with inputs. There would be hundreds if not thousands of rows. For example, Row1 would look like:
Col1        | Col2           | Col3
TestingUser | TestingSurname | 1980/01/01
This needs to be mapped to a JSON object and then sent (POST) to an API endpoint. I then need to assert on the data that comes back to make sure it has the correct values.
The tools I have looked at are:
ReadyAPI
rest-assured.io
Would you recommend any other tool or framework for this type of testing? If you have worked with something and can provide an example, that would be great.
I can't provide a recommendation, as I haven't worked with RestAssured. However, below are a few advantages of ReadyAPI:
The learning curve is shallow; any tester will be able to build test cases without depending on a programming language.
ReadyAPI has a built-in feature to read data from different data sources (DB, XML, JSON, CSV, Excel, etc.) and invoke a REST endpoint by passing these fields to the header, query, and JSON body of the endpoint.
The response for each call can be dumped to a file using the DataSink option of a test step, for each of the request calls made for the records from the file.
The tool is structured to make it easy to build test cases with multiple test steps. It is more like drag and drop to build your test cases. The hierarchy is Project -> Test Suite -> Test Case -> Test Step.
Easy integration with a Jenkins CI/CD pipeline using testRunner, with a wide variety of test reporting capabilities. Test reports are available as Allure, Jasper reports, and JUnit-style reporting.
More technical testers who need more control can use Groovy or JavaScript to build frameworks.
VirtServer and LoadUI are other tools by SmartBear that can be used to mock services and run performance tests as desired.
I have an important comment to make here: if the file is huge (even 1000 lines), I have seen ReadyAPI struggle, as the tool does the heavy lifting in the background. Hence I would recommend using a Groovy script that uses the Java APIs for any file operations.
OK, so I have created a class using Velocity as a JSON template engine. I have created a test, and within that test I have a normal Java loop. It loops through the entire xls, maps the values, and posts to the API. This is all working as expected. The problem is that the runner displays
Default Suite Total tests run: 1, Passes: 0,
even though the loop does run x number of times. How can I change it so that when I execute the test it shows total tests run as 10, or however many times the loop ran? Hope this makes sense.
@Test
public void generatePostData() throws IOException {
    Workbook wb = WorkbookFactory.create(new File("data\\sc1.xlsx"));
    Sheet sheet = wb.getSheetAt(0);
    for (int i = 1; i < 10; i++) {
        //Get Excel Data
        Cell testNumber = sheet.getRow(i).getCell(1);
        System.out.println(testNumber.getNumericCellValue());
        //Velocity
        VelocityEngine ve = new VelocityEngine();
        ve.init();
        //get the template
        Template t = ve.getTemplate("post.json");
        //create context and add data
        VelocityContext context = new VelocityContext();
        //map data
        context.put("tpltestNumber", testNumber);
        //render to stringWriter
        StringWriter writer = new StringWriter();
        t.merge(context, writer);
        baseURI = "someURL";
        Response response =
            given()
                .contentType("application/json")
                .body(String.valueOf(writer))
            .when()
                .post()
            .then()
                .assertThat()
                .statusCode(200)
                .extract()
                .response();
    }
}
This answers the follow-up question that the original poster asked in the answers section (how to get the executed Excel row count reported as the total executed test case count).
For that, you have to pass the data using a method with the DataProvider annotation. See the TestNG documentation: DataProvider in TestNG.
@DataProvider(name = "dp")
private Object[][] dataProvider() {
    Workbook wb;
    Sheet sheet = null;
    // The loop below reads sheet rows 1 to 9, so the array needs 9 entries;
    // an unfilled (null) entry would be handed to TestNG as a test row.
    Object[][] excelRowArray = new Object[9][];
    try {
        wb = WorkbookFactory.create(new File("data\\sc1.xlsx"));
        sheet = wb.getSheetAt(0);
    } catch (IOException e) {
        e.printStackTrace();
    }
    for (int i = 1; i < 10; i++) { // Here 10 is the row count in the Excel sheet
        // Get Excel data row by row
        Cell testNumber = sheet.getRow(i).getCell(1);
        System.out.println(testNumber.getNumericCellValue());
        // Create an object array with the values taken from a single Excel row
        Object[] excelRow = new Object[]{testNumber};
        // Add the created object array to the 'excelRowArray'
        excelRowArray[i - 1] = excelRow;
    }
    return excelRowArray;
}
@Test(dataProvider = "dp")
public void generatePostData(Cell testNumber) {
    // TestNG calls this method once per Object[] in excelRowArray and maps the
    // elements of that row onto the method parameters (here a single Cell).
    // Total tests executed will be the number of rows in excelRowArray
    // (or the Excel row count in the sheet).
    //Velocity
    VelocityEngine ve = new VelocityEngine();
    ve.init();
    //get the template
    Template t = ve.getTemplate("post.json");
    //create context and add data
    VelocityContext context = new VelocityContext();
    //map data
    context.put("tpltestNumber", testNumber); // Here the row's cell is used as the value
    //render to stringWriter
    StringWriter writer = new StringWriter();
    t.merge(context, writer);
    String baseURI = "someURL";
    Response response =
        given()
            .contentType("application/json")
            .body(String.valueOf(writer))
        .when()
            .post()
        .then()
            .assertThat()
            .statusCode(200)
            .extract()
            .response();
}
Select JSON sub-node
I am querying the Wikipedia API and am getting JSON back that looks like this:
https://en.wikipedia.org/w/api.php?action=query&prop=pageimages&titles=cessna%20172&pithumbsize=500&format=json
{"batchcomplete":"","query":{"normalized":[{"from":"cessna 172","to":"Cessna 172"}],"pages":{"173462":{"pageid":173462,"ns":0,"title":"Cessna 172","thumbnail":{"source":"https://upload.wikimedia.org/wikipedia/commons/thumb/a/ae/Cessna_172S_Skyhawk_SP%2C_Private_JP6817606.jpg/500px-Cessna_172S_Skyhawk_SP%2C_Private_JP6817606.jpg","width":500,"height":333},"pageimage":"Cessna_172S_Skyhawk_SP,_Private_JP6817606.jpg"}}}}
Using .NET Core 2.2, what is the proper way to get the image thumbnail out of this (the source property in this case)?
Parsing JSON is not a built-in feature in .NET Core 2.2, so you will want to add the Newtonsoft.Json package to the project with dotnet add package Newtonsoft.Json --version 12.0.3. From there, include Newtonsoft.Json by adding using Newtonsoft.Json.Linq; to the top of the file, and using System.Net; to use WebClient. The code then retrieves the string from the URL, and JObject.Parse parses the string as a JObject. We can get the property you want by chaining indexers: ["query"]["pages"]["173462"]["thumbnail"]["source"]. The indexers return a JToken, so an explicit cast to string is needed. Full source:
using System;
using System.Net;
using Newtonsoft.Json.Linq;
class Program
{
    static void Main(string[] args)
    {
        const string url = "https://en.wikipedia.org/w/api.php?action=query&prop=pageimages&titles=cessna%20172&pithumbsize=500&format=json";
        using (WebClient client = new WebClient())
        {
            // Download the response body as a string
            string rawString = client.DownloadString(url);
            var jsonResult = JObject.Parse(rawString);
            // JToken converts to string only through an explicit cast
            string thumbnail = (string)jsonResult["query"]["pages"]["173462"]["thumbnail"]["source"];
            Console.WriteLine(thumbnail);
        }
    }
}
Ideally, you will have to define a class and deserialize the JSON into it. Example:
Account account = JsonConvert.DeserializeObject<Account>(json);
More details here. However, at times, when you only need one or two values, using an entire class structure might be overhead. In that case, a quick way is to parse the JSON dynamically. Example, which is taken from here:
public void JValueParsingTest()
{
    var jsonString = @"{""Name"":""Rick"",""Company"":""West Wind"",
                        ""Entered"":""2012-03-16T00:03:33.245-10:00""}";
    dynamic json = JValue.Parse(jsonString);
    // values require casting
    string name = json.Name;
    string company = json.Company;
    DateTime entered = json.Entered;
    Assert.AreEqual(name, "Rick");
    Assert.AreEqual(company, "West Wind");
}
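The chained-indexer approach hard-codes the page ID ("173462"), which changes from article to article. A minimal sketch that walks whatever keys appear under query.pages instead, assuming the Newtonsoft.Json package is referenced; the class name ThumbnailReader, the method FirstThumbnailSource, and the shortened sample JSON with its example.org URL are made up for illustration.

```csharp
using System;
using System.Linq;
using Newtonsoft.Json.Linq;

static class ThumbnailReader
{
    // Hypothetical helper: returns the first thumbnail URL found under
    // query.pages, or null when no page carries a thumbnail.
    public static string FirstThumbnailSource(string json)
    {
        JObject root = JObject.Parse(json);
        var pages = root["query"]?["pages"] as JObject;
        if (pages == null)
        {
            return null;
        }
        // Each property name is a page ID; its value is the page object.
        return pages.Properties()
            .Select(page => (string)page.Value["thumbnail"]?["source"])
            .FirstOrDefault(source => source != null);
    }

    static void Main()
    {
        // Shortened stand-in for the Wikipedia response (assumption).
        const string json = @"{""query"":{""pages"":{""173462"":{""pageid"":173462,
            ""thumbnail"":{""source"":""https://example.org/thumb.jpg"",""width"":500}}}}}";
        Console.WriteLine(FirstThumbnailSource(json));
    }
}
```

Returning null rather than throwing keeps the caller in charge of what a missing thumbnail means; that choice is an assumption of this sketch, not something the answers above prescribe.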
Dynamic parameter as part of request URI with Apache HttpCore
I am looking for existing solutions to match dynamic parameters with HttpCore. What I have in mind is something similar to constraints in Ruby on Rails, or dynamic parameters with Sails (see here for example). My objective is to define a REST API where I could easily match requests like GET /objects/<object_id>.
To give a little bit of context, I have an application that creates an HttpServer using the following code:
server = ServerBootstrap.bootstrap()
        .setListenerPort(port)
        .setServerInfo("MyAppServer/1.1")
        .setSocketConfig(socketConfig)
        .registerHandler("*", new HttpHandler(this))
        .create();
And the HttpHandler class that matches the requested URI and dispatches it to the corresponding backend method:
public void handle(final HttpRequest request, final HttpResponse response, final HttpContext context) {
    String method = request.getRequestLine().getMethod().toUpperCase(Locale.ROOT);
    // Parameters are ignored for the example
    String path = request.getRequestLine().getUri();
    if(method.equals("POST") && path.equals("/object/add")) {
        if(request instanceof HttpEntityEnclosingRequest) {
            addObject(((HttpEntityEnclosingRequest)request).getEntity());
        }
        [...]
I can of course replace path.equals("/object/add") with something more sophisticated using RegEx to match these dynamic parameters, but before doing so I'd like to know whether I am reinventing the wheel, or whether there is an existing lib/class I didn't see in the docs that could help me.
Using HttpCore is a requirement (it is already integrated in the application I am working on). I know some other libraries provide high-level routing mechanisms that support these dynamic parameters, but I can't really afford to switch the entire server code to another library.
I am currently using httpcore 4.4.10, but I can upgrade to a newer version if this might help me.
At present HttpCore does not have a fully featured request routing layer. (The reasons for that are more political than technical.)
Consider using a custom HttpRequestHandlerMapper to implement your application-specific request routing logic.
final HttpServer server = ServerBootstrap.bootstrap()
        .setListenerPort(port)
        .setServerInfo("Test/1.1")
        .setSocketConfig(socketConfig)
        .setSslContext(sslContext)
        .setHandlerMapper(new HttpRequestHandlerMapper() {
            @Override
            public HttpRequestHandler lookup(HttpRequest request) {
                try {
                    URI uri = new URI(request.getRequestLine().getUri());
                    String path = uri.getPath();
                    // do request routing based on the request path
                    return new HttpFileHandler(docRoot);
                } catch (URISyntaxException e) {
                    // Provide a more reasonable error handler here
                    return null;
                }
            }
        })
        .setExceptionLogger(new StdErrorExceptionLogger())
        .create();
How to export data from LinqPAD as JSON?
I want to create a JSON file for use as part of a simple web prototyping exercise. LinqPAD is perfect for accessing the data from my DB in just the shape I need, however I cannot get it out as JSON very easily. I don't really care what the schema is, because I can adapt my JavaScript to work with whatever is returned. Is this possible?
A more fluent solution is to add the following methods to the "My Extensions" file in LINQPad:
public static String DumpJson<T>(this T obj)
{
    return
        obj
        .ToJson()
        .Dump();
}
public static String ToJson<T>(this T obj)
{
    return
        new System.Web.Script.Serialization.JavaScriptSerializer()
        .Serialize(obj);
}
Then you can use them like this in any query you like:
Enumerable.Range(1, 10)
    .Select(i =>
        new
        {
            Index = i,
            IndexTimesTen = i * 10,
        })
    .DumpJson();
I added "ToJson" separately so it can be used with "Expressions".
This is not directly supported, and I have opened a feature request here. Vote for it if you would also find this useful.
A workaround for now is to do the following:
1. Set the language to C# Statement(s)
2. Add an assembly reference (press F4) to System.Web.Extensions.dll
3. In the same dialog, add a namespace import to System.Web.Script.Serialization
4. Use code like the following to dump out your query as JSON
new JavaScriptSerializer().Serialize(query).Dump();
There's a solution with Json.NET, which does indented formatting and renders JSON dates properly. Add Json.NET from NuGet, add a reference to Newtonsoft.Json.dll in your "My Extensions" query, and add the following code as well:
public static object DumpJson(this object value, string description = null)
{
    return GetJson(value).Dump(description);
}
private static object GetJson(object value)
{
    object dump = value;
    var strValue = value as string;
    if (strValue != null)
    {
        // A string is assumed to be JSON already: parse it, then re-indent it
        var obj = JsonConvert.DeserializeObject(strValue);
        dump = JsonConvert.SerializeObject(obj, Newtonsoft.Json.Formatting.Indented);
    }
    else
    {
        dump = JsonConvert.SerializeObject(value, Newtonsoft.Json.Formatting.Indented);
    }
    return dump;
}
Use .DumpJson() like .Dump() to render the result. It's possible to add more .DumpJson() overloads with different signatures if necessary.
As of version 4.47, LINQPad has the ability to export JSON built in. Combined with the new lprun.exe utility, it can also satisfy your needs. http://www.linqpad.net/lprun.aspx
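For scripts that run on .NET Core 3.0 or later, where JavaScriptSerializer is not available, the same idea can be sketched with the built-in System.Text.Json. This is a standalone console sketch rather than LINQPad-specific code; the class name JsonExport and its ToJson helper are made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.Json;

static class JsonExport
{
    // Hypothetical helper: serializes any value to JSON, indented by default.
    public static string ToJson<T>(T value, bool indented = true)
    {
        var options = new JsonSerializerOptions { WriteIndented = indented };
        return JsonSerializer.Serialize(value, options);
    }

    static void Main()
    {
        // Same shape as the Enumerable.Range example above, materialized as an array.
        var rows = Enumerable.Range(1, 3)
            .Select(i => new { Index = i, IndexTimesTen = i * 10 })
            .ToArray();
        Console.WriteLine(ToJson(rows));
    }
}
```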