I have a Spring Boot application that sometimes has to serve a very big JSON payload (several MB) over a REST API, which takes considerable time to download.
The data is read from a DB, serialized into JSON and sent back to the client.
The DB read operation is fast, even for big datasets, usually below 1 second. So my conclusion was that the most time-consuming part is the HTTP exchange.
I've enabled GZIP compression for the HTTP exchange, so the payload should be compressed before being sent. This seems to work (the returned payload is indeed compressed); however, there is no noticeable performance gain.
A curl request to the application's endpoint without compression takes 49 sec and yields a ~10 MB JSON payload:
curl -H "Content-Type: application/json" -H "Accept: application/json" -H "Authorization: Basic <REDACTED>" --data-binary @priorities-request.json 'https://<REDACTED>/api/rest/priorities' > priorities-response.json
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 10.0M    0  9.9M  100 85081   205k   1715  0:00:49  0:00:49 --:--:--  239k
With GZIP compression enabled, the same request takes 42 sec and yields a ~260 KB gzip-compressed JSON payload:
curl -H "Content-Type: application/json" -H "Accept: application/json" -H "Accept-Encoding: gzip,deflate,br" -H "Authorization: Basic <REDACTED>" --data-binary @priorities-request.json 'https://<REDACTED>/api/rest/priorities' > priorities-response.json
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  259k    0  176k  100 85081   4221   1991  0:00:42  0:00:42 --:--:-- 14408
My expectation would be that downloading a compressed 260K payload would take considerably less time than an uncompressed 10 MB download.
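That expectation can be put into rough numbers. The Go sketch below is only back-of-the-envelope arithmetic: it assumes the compressed download could sustain the same ~205 KB/s average rate that curl reported for the uncompressed run, and the function name is made up for illustration.

```go
package main

import "fmt"

// expectedSeconds estimates pure transfer time for a payload of sizeKB
// kilobytes at a sustained rate of rateKBps kilobytes per second.
func expectedSeconds(sizeKB, rateKBps float64) float64 {
	return sizeKB / rateKBps
}

func main() {
	// Assumption: the compressed run could use the same ~205 KB/s that curl
	// reported as the average download speed of the uncompressed run.
	fmt.Printf("compressed payload: about %.1f s\n", expectedSeconds(260, 205))
}
```

Under that assumption the compressed download should finish in a second or two, not 42 seconds.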
What's my mistake?
Edit: Because it's been asked in the comments how I set up the GZIP compression: I set compression="on" and compressableMimeType="application/json" in the server.xml of Tomcat. That's it. The rest is done by Tomcat's org.apache.coyote.http11.filters.GzipOutputFilter class.
Edit 2: To rule out that serializing the data into JSON is where the time is lost, I tested locally with Jackson2JsonMessageConverter, but it took only about 0.5 seconds to write even a huge data structure into a 10MB JSON string.
Edit 3: What I find most puzzling is that the client application that consumes the API, which is running on another Tomcat instance on the same physical machine, still experiences the same delay when retrieving the data.
When you use gzip, there is a trade-off between CPU usage and bandwidth usage that needs to be considered. What is an advantage for the end user (less transfer time, i.e. a faster download) requires additional CPU cycles on the server side.
The performance gain would certainly be visible if you ran a load test. Also, if your API is used internally by other APIs in your application, compression will reduce network usage drastically. The extra CPU usage may not be that significant.
If you plan to send compressed data over the network, then maybe compress it when you write it to the DB rather than when you read it from the DB. That way the compression time is not part of the request at all, and you save storage space. Then disable HTTP compression, read your compressed binary data from the DB and send it over as is.
We figured it out: It turned out the HTTP exchange didn't have anything to do with it.
The bottleneck was in fact the database, but we didn't notice at first because the JPA query returns almost immediately.
What we weren't seeing was that lots of properties in the retrieved objects are loaded "lazily", so the DB queries for them are only executed when the JSON serializer accesses these properties. Those queries didn't use any CPU time on the Tomcat machine, so we couldn't detect the time loss by profiling.
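The effect is easy to reproduce in miniature. The following Go sketch is not JPA and not our actual code; it is a made-up model (fakeDB, order and queriesToSerialize are all hypothetical names) in which the initial query is a single call, but every lazily loaded property costs one more query the moment the JSON serializer touches it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fakeDB counts the queries issued against it; it stands in for a real database.
type fakeDB struct{ queries int }

// order has a lazily loaded customer, similar in spirit to a lazy association.
type order struct {
	ID       int
	db       *fakeDB
	customer *string
}

// findOrders models the fast initial query: one query, n orders,
// none of which has its customer loaded yet.
func (db *fakeDB) findOrders(n int) []*order {
	db.queries++
	orders := make([]*order, n)
	for i := range orders {
		orders[i] = &order{ID: i + 1, db: db}
	}
	return orders
}

// Customer issues one extra query the first time it is called.
func (o *order) Customer() string {
	if o.customer == nil {
		o.db.queries++
		name := fmt.Sprintf("customer-%d", o.ID)
		o.customer = &name
	}
	return *o.customer
}

// MarshalJSON touches the lazy property, as a JSON serializer would.
func (o *order) MarshalJSON() ([]byte, error) {
	return json.Marshal(map[string]interface{}{"id": o.ID, "customer": o.Customer()})
}

// queriesToSerialize reports how many queries it takes to load and serialize n orders.
func queriesToSerialize(n int) (int, error) {
	db := &fakeDB{}
	orders := db.findOrders(n)
	if _, err := json.Marshal(orders); err != nil {
		return 0, err
	}
	return db.queries, nil
}

func main() {
	n, err := queriesToSerialize(1000)
	if err != nil {
		panic(err)
	}
	fmt.Println("queries issued:", n)
}
```

For 1,000 objects the model issues 1,001 queries, all but one of them during serialization, which is why timing the initial query alone told us nothing.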
Related
I have an API (ASP.NET Core) which calls a database (MySQL). I use NHibernate and the MySql.Data connector.
When a lot of requests are sent to my API, I come across a CPU problem (usage near 100%) that leads to HTTP request errors.
I made a dump to analyse what is happening, and I found that most of my CPU time is spent in the MySql.Data.MySqlClient.CharSetMap::GetEncoding method (7% CPU usage for a request that uses 16% of CPU).
I use the default UTF8 collation for my MySQL database.
I tried to set the encoding manually in my ASP.NET Core API so I could avoid the GetEncoding cost, but I can't find how to do it (I don't even know if it's possible).
Any advice to improve my CPU usage?
We have a backend application which has to send telemetry data to an event hub. All data have to be serialized into JSON and compressed.
Should we collect all serialized objects into a single newline-delimited JSON payload, or is it better to use one EventData wrapper per object and send them as a batch? Compression will probably work better with newline-delimited JSON. But will ASA be able to process it?
ASA supports gzip and deflate compression. Every Event Hub message can be up to 256 KB including metadata. On the processing side, every message has some overhead, so for the same number of records, a smaller number of Event Hub messages is better. However, this usually means some buffering on the sending side. Based on your overall latency requirements and the memory footprint you can afford on the sender, you should batch multiple records into every Event Hub message, with compression.
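A minimal Go sketch of the batching step, using only the standard library. The telemetry type, function names and sample data are hypothetical, and the sketch deliberately stops before any Event Hubs SDK call; it only shows building a gzip-compressed newline-delimited JSON payload and checking it against the 256 KB limit mentioned above. Whether the consumer accepts this exact framing is not something the sketch verifies.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// maxMessageBytes is the 256 KB per-message limit; metadata counts toward it
// too, so a real sender should leave some headroom.
const maxMessageBytes = 256 * 1024

// telemetry is a hypothetical record type used only for this sketch.
type telemetry struct {
	Device string  `json:"device"`
	Value  float64 `json:"value"`
}

// gzipNDJSON serializes records as newline-delimited JSON and gzips the result.
func gzipNDJSON(records []telemetry) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	enc := json.NewEncoder(zw) // Encode writes one JSON value followed by '\n'
	for _, r := range records {
		if err := enc.Encode(r); err != nil {
			return nil, err
		}
	}
	if err := zw.Close(); err != nil { // Close flushes remaining data and the gzip trailer
		return nil, err
	}
	return buf.Bytes(), nil
}

// countLines decompresses a payload and counts its lines, as a round-trip check.
func countLines(payload []byte) (int, error) {
	zr, err := gzip.NewReader(bytes.NewReader(payload))
	if err != nil {
		return 0, err
	}
	defer zr.Close()
	raw, err := io.ReadAll(zr)
	if err != nil {
		return 0, err
	}
	return strings.Count(string(raw), "\n"), nil
}

func main() {
	records := make([]telemetry, 5000)
	for i := range records {
		records[i] = telemetry{Device: fmt.Sprintf("dev-%d", i%10), Value: float64(i)}
	}
	payload, err := gzipNDJSON(records)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d records -> %d compressed bytes (fits in one message: %v)\n",
		len(records), len(payload), len(payload) < maxMessageBytes)
}
```

Because the compressed size is only known after compression, a sender would typically check the result against the limit and split the batch if it is too large.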
So after 3 months of hard work in developing & switching the company API from PHP to Go I found out that our Go server can't handle more than 20 req/second.
So basically how our API works:
takes in a request
validates the request
fetches the data from the DB using MySQL
puts the data in a map
sends it back to the client in JSON format
So after writing about 30 APIs I decided to take it for a spin and see how it performs under a load test.
Test 1: ab -n 1 -c 1 http://localhost:8000/sales/report the results are "Time per request: 72.623 [ms] (mean)" .
Test 2: ab -n 100 -c 100 http://localhost:8000/sales/report the results are "Time per request: 4548.155 [ms] (mean)" (No MYSQL errors).
How did the number suddenly spike from 72.623 to 4548 ms in the second test? We expect thousands of requests per day, so I need to solve this issue before we finally release it. I was surprised when I saw the numbers; I couldn't believe it. I know Go can do much better.
So basic info about the server and settings:
Using Go 1.5
16GB RAM
GOMAXPROCS is using all 8 Cores
db.SetMaxIdleConns(1000)
db.SetMaxOpenConns(1000) (also made sure we are using a pool of connections)
Connecting to MYSQL through unix socket
System is running under Ubuntu
External libraries that we are using:
github.com/go-sql-driver/mysql
github.com/gorilla/mux
github.com/elgs/gosqljson
Any ideas what might be causing this? I took a look at this post, but it didn't help; as I mentioned above, I never got any MySQL error. Thank you in advance for any help you can provide.
Your post doesn't have enough information to address why your program is not performing how you expect, but I think this question alone is worth an answer:
How did the number suddenly spike from 72.623 to 4548 ms in the second test?
In your first test, you did one single request (-n 1). In your second test, you did 100 requests in flight simultaneously (-c 100 -n 100).
You mention that your program communicates with an external database, so your program has to wait for that resource to respond. Do you understand how your database performs when you send it 1,000 requests simultaneously? You made no mention of this. Go can certainly handle many hundreds of concurrent requests a second without breaking a sweat, but it depends on what you're doing and how you're doing it. If your program can't complete requests as fast as they are coming in, they will pile up, leading to high latency.
Neither of the tests you told us about is useful for understanding how your server performs under "normal" circumstances, which you said would be "thousands of requests per day" (which isn't very specific, but I'll take it to mean "a few a second"). It would be much more interesting to look at -c 4 -n 1000, or something else that exercises the server over a longer period of time with a number of concurrent requests closer to what you expect to get.
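To put numbers on that, here is a small Go sketch that converts ab's reported mean into throughput. It assumes ab derives "Time per request (mean)" as concurrency × total test time / completed requests, which is my understanding of how ab reports it; the function names are made up for illustration.

```go
package main

import "fmt"

// totalSeconds recovers the wall-clock duration of an ab run from its
// "Time per request (mean)" value.
// Assumption: mean = concurrency * total time / completed requests.
func totalSeconds(meanMs float64, concurrency, completed int) float64 {
	return meanMs / 1000 * float64(completed) / float64(concurrency)
}

// requestsPerSecond is the throughput implied by that duration.
func requestsPerSecond(meanMs float64, concurrency, completed int) float64 {
	return float64(completed) / totalSeconds(meanMs, concurrency, completed)
}

func main() {
	fmt.Printf("test 1: %.1f req/s\n", requestsPerSecond(72.623, 1, 1))
	fmt.Printf("test 2: %.1f req/s\n", requestsPerSecond(4548.155, 100, 100))
}
```

Under that assumption the second run completed its 100 requests in about 4.5 seconds, roughly 22 requests per second, which is more throughput than the single-request run implies. The large mean reflects requests waiting in line rather than each request becoming 60 times slower to process.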
I'm not familiar with gosqljson package, but you say your "query by itself is not really complicated" and you're running it against "a well structured DB table," so I'd suggest dropping the gosqljson and binding your query results to structs, then marshalling those structs to json. That should be faster and incur less memory thrashing than using a map[string]interface{} for everything.
But I don't think gosqljson could possibly be that slow, so it's likely not the main culprit.
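For reference, a minimal sketch of the struct-based approach. The salesRow type and its columns are hypothetical, since the question doesn't show the real schema, and the database part is left as a comment so the example runs without a MySQL server.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// salesRow is a hypothetical row shape; the real columns are not shown in the question.
type salesRow struct {
	ID     int     `json:"id"`
	Region string  `json:"region"`
	Total  float64 `json:"total"`
}

// encodeRows marshals typed rows directly, with no map[string]interface{} in between.
func encodeRows(rows []salesRow) (string, error) {
	b, err := json.Marshal(rows)
	return string(b), err
}

func main() {
	// With database/sql, each row would be filled in a rows.Next() loop via
	// rows.Scan(&r.ID, &r.Region, &r.Total) before being appended to the slice.
	rows := []salesRow{{ID: 1, Region: "north", Total: 12.5}}
	out, err := encodeRows(rows)
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // prints [{"id":1,"region":"north","total":12.5}]
}
```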
The way you're doing your second benchmark is not helpful.
Test 2: ab -n 100 -c 100 http://localhost:8000/sales/report
That's not testing how fast you can handle concurrent requests so much as it's testing how fast you can make connections to MySQL. You're only doing 100 queries and using 100 requests, which means Go is probably spending most of its time making up to 100 connections to MySQL. Go probably doesn't even have time to reuse any of the db connections, considering all the other stuff it's doing to satisfy each query, and then, boom, the test is over. You would need to set the max connections to something like 50 and run 10,000 queries to see how long concurrent requests take once a pool of db connections is already established; right now you're basically testing how long it takes Go to build up a pool of db connections.
Are JSON responses ever incomplete because of server errors, or are they designed to fail loudly? Are there any special concerns for transferring very large sets of data over JSON, and can they be mitigated? I'm open to any suggestions.
Transferring JSON over HTTP is no different than transferring any bytes over HTTP.
Yes, server errors can result in incomplete transfers. Imagine turning your server off half way through a transfer. This is true of any network transfer. Your client will fail loudly if there is such an error. You might get a connection time out or an error status code. Either way you will know about it.
There is no practical limit to the amount of data you can transfer as JSON over HTTP. I have transferred 1GB+ of JSON data in a single HTTP request. When making a large transfer, you want to be sure to use a streaming API on the server side. That is, write to the output stream of the HTTP response while reading the data from your DB, rather than reading all of your data from the DB into RAM, then encoding it to JSON and writing it to the output. This way your client can start processing the response immediately, and your server won't run out of memory.
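A minimal Go sketch of the streaming idea, under a few assumptions: the item type, the counter iterator (standing in for a DB cursor) and the handler are all hypothetical, and error handling is kept to the bare minimum. The point is that each element is encoded and written as soon as it is read, so the full result set never sits in memory.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"strings"
)

// item is a hypothetical record type.
type item struct {
	ID int `json:"id"`
}

// streamJSONArray writes items as one JSON array, one element at a time.
func streamJSONArray(w io.Writer, next func() (item, bool)) error {
	enc := json.NewEncoder(w)
	if _, err := io.WriteString(w, "["); err != nil {
		return err
	}
	first := true
	for {
		it, ok := next()
		if !ok {
			break
		}
		if !first {
			if _, err := io.WriteString(w, ","); err != nil {
				return err
			}
		}
		first = false
		// Encode appends '\n' after each value, which is valid JSON whitespace.
		if err := enc.Encode(it); err != nil {
			return err
		}
	}
	_, err := io.WriteString(w, "]")
	return err
}

// counter yields n items; it stands in for a DB cursor (rows.Next / rows.Scan).
func counter(n int) func() (item, bool) {
	i := 0
	return func() (item, bool) {
		if i >= n {
			return item{}, false
		}
		i++
		return item{ID: i}, true
	}
}

// handler shows how the same function would sit behind an HTTP endpoint.
func handler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	if err := streamJSONArray(w, counter(1000000)); err != nil {
		// Headers are already sent at this point, so the client sees a truncated body.
		return
	}
}

// streamToString runs the streamer against an in-memory writer for demonstration.
func streamToString(n int) (string, error) {
	var sb strings.Builder
	err := streamJSONArray(&sb, counter(n))
	return sb.String(), err
}

func main() {
	out, err := streamToString(2)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

The trade-off is that a failure halfway through cannot change the status code any more; the client only notices because the body is cut off and no longer parses as JSON.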
I have a domain name to test.
Ping is ~20 ms.
'HTTP HEAD' is ~500 ms.
Why is there such a big difference between them? Is this a server-side problem? Isn't a 25x difference too big?
Well, for one, ping goes over a different protocol, ICMP. The server itself directly responds to pings. HTTP is a different protocol handled by an additional application, a web server, that must be running on the server (ping is built into the OS). Depending on how heavy the web server is, it could take significantly more time than something like a ping. Also, HEAD is sent along with a particular URL. If that URL is handled by something like ASP.NET instead of just the web server directly, then there's additional processing that must be done to return the response.
Ping is usually implemented as an ICMP echo request. A simpler datagram protocol: You send a packet, the server replies with the corresponding packet and that's about it.
HTTP HEAD is still HTTP: a TCP connection must be established between both ends and the HTTP server must reply with the headers for your request. It's obviously fast but not as simple as sending a single packet response.
If you're testing a domain, ping is a more adequate tool, while HTTP HEAD is a tool better suited to test an HTTP server.
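To see where the time of a HEAD request goes, one option is to split it into TCP connect time and total time. The Go sketch below uses net/http/httptrace for that; the headTiming type and function names are made up, and the demo runs against a throwaway local test server rather than your domain, so the absolute numbers it prints mean nothing for your case.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/http/httptrace"
	"time"
)

// headTiming splits a HEAD request into TCP connect time and total time.
type headTiming struct {
	Status  int
	Connect time.Duration
	Total   time.Duration
}

// timeHead sends one HEAD request and records how long the TCP connect
// and the whole exchange took.
func timeHead(url string) (headTiming, error) {
	var res headTiming
	var connectStart time.Time
	trace := &httptrace.ClientTrace{
		ConnectStart: func(network, addr string) { connectStart = time.Now() },
		ConnectDone: func(network, addr string, err error) {
			if err == nil {
				res.Connect = time.Since(connectStart)
			}
		},
	}
	ctx := httptrace.WithClientTrace(context.Background(), trace)
	req, err := http.NewRequestWithContext(ctx, http.MethodHead, url, nil)
	if err != nil {
		return res, err
	}
	start := time.Now()
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return res, err
	}
	resp.Body.Close()
	res.Total = time.Since(start)
	res.Status = resp.StatusCode
	return res, nil
}

// localDemo runs timeHead against a throwaway local server.
func localDemo() (headTiming, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()
	return timeHead(srv.URL)
}

func main() {
	res, err := localDemo()
	if err != nil {
		panic(err)
	}
	fmt.Printf("status=%d connect=%v total=%v\n", res.Status, res.Connect, res.Total)
}
```

Against a real host, a connect time close to the ping time with a much larger total would point at the server's request handling rather than the network. DNS lookup and TLS handshake are not broken out separately in this sketch.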
If I'm not mistaken, a ping request is handled on the network driver level, and is extremely fast as a result (sometimes it's handled by the hardware itself, skipping software processing altogether). It will portray network latency fairly well.
An HTTP HEAD request must visit the web server, which is a user-level program, and requires copying bits of data a couple times, and web server code to parse the request, etc. The web server then has to generate the HTTP response headers for the request. Depending on the server and the requested page, this can take a while, as it has to generate the requested page anyway (It just sends you the headers only, and not the page content.)
When you run ping, it responds much quicker because it is designed to respond immediately. It shows you the approximate latency, so if you get consistent results using ping, you cannot get lower latency than that.
When you run HTTP HEAD, you are actually making a request to a specific page: it is processed, executed and rendered, and only the head is returned. It has much more overhead than ping, which is why it takes much longer.