We're designing a REST API with the following technologies:
spring-boot 1.5.7
spring-data-jpa
MySQL 5.5.59
It's a Java REST API built with Spring Boot and connected to a MySQL database via the MySQL Connector/J driver. Most of the repository queries are written as JPA queries.
We build it with Maven and deploy it as a fat JAR running in a Tomcat server.
The problem is that several tables (and the resources they back) can be very large for a single user, so a GET request with a high offset can result in slow queries.
We tried several solutions:
setting Tomcat parameters for the MySQL connection;
using HikariCP, but it has no parameter to kill slow queries;
deploying the REST API on a GlassFish 5 server, but the Spring Data JPA queries throw an exception ("could not extract ResultSet"), so that's a dead end.
Moreover, a lot of people are going to tell me "use pagination and check the offset in the request". The fact is we already do, but the PageRequest class has a tricky behaviour. The resource the API returns with this class has the shape below:
{
  "content": {},
  "links": [
    {
      "href": "string",
      "rel": "string",
      "templated": true
    }
  ],
  "page": {
    "number": 0,
    "size": 0,
    "totalElements": 0,
    "totalPages": 0
  }
}
As you can see, the response contains a page object with a totalElements field. That is the problem when a table has many entries for a single user, because Spring Data runs a "SELECT COUNT(*)" to fill in totalElements. So pagination alone is not the solution; we already use it.
So what we're asking for are best practices, and ideally a way to handle and kill slow queries, in a RESTful API built with Spring Boot.
Thanks!
Regarding pagination, have a look at this answer: Way to disable count query from PageRequest for getting total pages?
Basically, return a List rather than a Page. This of course will not give you the total, so you will need to handle the situation differently. It might cost you one extra query for a next page that returns an empty list, but you save the count query on every page.
If you still need the total for some reason, you will have to run the count.
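For the record, Spring Data also ships a Slice return type that fetches pageSize + 1 rows instead of running a count; the extra row only signals whether a next page exists. The same trick can be sketched in plain Java (the Slice class and fetchSlice method below are illustrative stand-ins, not Spring types):

```java
import java.util.ArrayList;
import java.util.List;

public class SlicePagination {
    // Result of one page fetch without a total count.
    static final class Slice<T> {
        final List<T> content;
        final boolean hasNext;
        Slice(List<T> content, boolean hasNext) {
            this.content = content;
            this.hasNext = hasNext;
        }
    }

    // Fetch pageSize + 1 rows; `source` stands in for the repository query
    // with LIMIT (pageSize + 1) OFFSET (page * pageSize).
    static <T> Slice<T> fetchSlice(List<T> source, int page, int pageSize) {
        int from = Math.min(page * pageSize, source.size());
        int to = Math.min(from + pageSize + 1, source.size());
        List<T> rows = new ArrayList<>(source.subList(from, to));
        boolean hasNext = rows.size() > pageSize;
        if (hasNext) {
            rows.remove(rows.size() - 1); // drop the probe row before returning
        }
        return new Slice<>(rows, hasNext);
    }
}
```

In a repository you would get the same effect by declaring a method that returns Slice instead of Page (method and entity names here are hypothetical, e.g. `Slice<MyEntity> findByOwner(String owner, Pageable pageable)`).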
Regarding your comment "slow queries caused by your database": it is not the database that generates the slow queries; slow queries are generated by you and your code. There may be solutions other than killing the running queries, but you would need to provide more information about your database: structure, number of records, the exact queries issued, what indexes are in place, etc. Maybe you could introduce caching?
If you still insist on killing the query, you could run it inside a CompletableFuture and await completion with a timeout: https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/CompletableFuture.html
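A minimal sketch of that idea, with the slow query simulated by a sleep (caveat: timing out only abandons the wait on the application side; the statement keeps running on MySQL unless you also set a JDBC query timeout):

```java
import java.util.concurrent.*;

public class QueryTimeoutDemo {
    // Stand-in for a slow repository call (hypothetical; simulated with a sleep).
    static String slowQuery(long queryMillis) {
        try {
            Thread.sleep(queryMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "rows";
    }

    // Run the query on another thread and give up after timeoutMillis.
    static String runWithTimeout(long queryMillis, long timeoutMillis) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        CompletableFuture<String> future =
                CompletableFuture.supplyAsync(() -> slowQuery(queryMillis), pool);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // abandons the wait; the DB statement is not killed
            return "timed out";
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(runWithTimeout(5_000, 200));
        System.out.println(runWithTimeout(50, 2_000));
    }
}
```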
After a lot of research, I found a solution to my problem simply by reading the Tomcat DataSource documentation carefully. I found the JDBC interceptors, which I set in my application.yml as below:
jdbc-interceptors: QueryTimeoutInterceptor(queryTimeout=20);SlowQueryReport(threshold=20000,logFailed=true)
This does what I want, i.e. it throws an exception and kills the query if it exceeds 20 seconds!
Thanks anyway for your help!
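For reference, with Spring Boot's Tomcat JDBC pool the interceptor line goes under spring.datasource.tomcat (property path assumed from Boot 1.5's relaxed binding to the pool's jdbcInterceptors property; adjust to your own config layout):

```yaml
spring:
  datasource:
    tomcat:
      jdbc-interceptors: QueryTimeoutInterceptor(queryTimeout=20);SlowQueryReport(threshold=20000,logFailed=true)
```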
In my environment, multiple OrionLD instances are running on a Kubernetes cluster.
The environment consists of two OrionLD (0.8.0) instances, one MongoDB instance, and a load balancer in front of OrionLD.
I created an entity with a new tenant by using the "NGSILD-Tenant" header.
Next, when I tried to retrieve it with "GET /entities", sometimes the retrieval succeeded, and sometimes it failed.
The error message was as follows:
{
  "type": "https://uri.etsi.org/ngsi-ld/errors/NonExistingTenant",
  "title": "No such tenant",
  "detail": "Tenant01"
}
It seems that one OrionLD instance can recognize the new tenant, but the other cannot.
What is a possible cause of this issue?
Thanks.
OK, this seems to be a problem in the broker. Please create an issue on Orion-LD's GitHub: https://github.com/FIWARE/context.Orion-LD/issues.
I recently implemented tenant checks for retrievals. It's OK to create new tenants on the fly (entity create operations), but for queries the tenant must already exist, and the list of tenants is kept in RAM. That means only the broker instance that created the entity knows about the tenant, which completely explains your problem.
I didn't think about this use case, but you are absolutely right.
I will have to improve the way I check for "tenant exists" for retrieval operations.
So, as it seemed, the bug was mine, and it has been fixed and accepted (just to clarify this now "non-issue").
I am attempting to count the number of queries run against the database during a request to my Golang API server, in order to optimize and find bottlenecks in our API.
An example of how to do this in Django is here.
Any idea if there is a simple way to do this in Golang?
In order to find bottlenecks I would suggest going straight to the MySQL database: enable the slow query log and gradually lower long_query_time down to a low millisecond value.
Use tools like pt-query-digest to digest the log and get a frequency of similar queries. Attack those as slow queries that need fixing, then set a lower threshold.
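For example, in my.cnf (values are illustrative; start with a higher long_query_time and lower it step by step as you clear the worst offenders):

```ini
[mysqld]
slow_query_log                = 1
slow_query_log_file           = /var/log/mysql/slow.log
long_query_time               = 0.5   # seconds
log_queries_not_using_indexes = 1
```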
The actual count of queries per request isn't that useful on its own.
When attacking the problem from the Go point of view, measuring the API response time of each endpoint will help you look at the service holistically.
No easy solution that I'm aware of.
You could wrap your db in your own struct and then implement Exec() (or whichever function you use) directly on that struct. Your function would simply call the database one and count it however you see fit.
A similar example, but with a logger, can be found here:
How to log queries to database drivers?
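The wrapper idea itself is language-agnostic; here is a compact sketch of the counting decorator (shown in Java for illustration, with a hypothetical QueryExecutor interface standing in for the driver's Exec):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CountingExecutor {
    // Hypothetical stand-in for the database driver's exec function.
    interface QueryExecutor {
        Object exec(String query);
    }

    private final QueryExecutor delegate;
    private final AtomicInteger count = new AtomicInteger();

    CountingExecutor(QueryExecutor delegate) {
        this.delegate = delegate;
    }

    // Forward the call to the real executor and bump the counter.
    Object exec(String query) {
        count.incrementAndGet();
        return delegate.exec(query);
    }

    int queryCount() {
        return count.get();
    }
}
```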
I have a service that issues the following queryContext request:
{
  "entities": [
    {
      "type": "Call",
      "isPattern": "true",
      "id": ".*"
    }
  ],
  "restriction": {
    "scopes": [
      {
        "type": "FIWARE::StringQuery",
        "value": "status=='open'"
      }
    ]
  }
}
It returns about 400 records in about 15 seconds.
What is the best way to reduce the queryContext time and make the service run faster?
I have had a look at the following docs; it may be a matter of using database indexes or setting the log level.
https://fiware-orion.readthedocs.io/en/master/admin/perf_tuning/index.html
Please correct me if I'm wrong.
Many thanks.
I think you are going in the right direction... the document you cite about Orion performance (https://fiware-orion.readthedocs.io/en/master/admin/perf_tuning/index.html) is the one you should read and apply, especially the following sections:
MongoDB configuration
Database indexes
Write concern
Notification modes and performance
Identifying bottlenecks looking at semWait statistics
Log impact on performance
Mutex policy impact on performance
In addition, ensure MongoDB is not the bottleneck of your system. I mean, it is pointless to tune Orion for maximum performance if your MongoDB server is not performing well, e.g. if it is running on a system with very limited CPU and RAM resources. Please check the MongoDB documentation on this matter.
Another possible bottleneck is the network. Where are you running your queryContext client? Is the result the same if you run queryContext on the same machine Orion runs on (i.e. using the localhost interface)?
I just started using Couchbase and hope to use it as my data store.
One of my requirements is performing a query that returns a certain field from all the documents in the store. This query is run once at server startup.
For this purpose I need all the documents that exist and can't miss any of them.
I understand that views in Couchbase are eventually consistent, but I still hope this query can be done (at the cost of performance).
Notes about my configuration:
I have only one Couchbase server instance (I don't need sharding or replication).
I am using the Java client (1.4.1).
What I have tried is saving my documents this way:
client.set(key, value, PersistTo.ONE).get();
And querying using:
query.setStale(Stale.FALSE);
Adding the PersistTo parameter caused the following exception:
Caused by: net.spy.memcached.internal.CheckedOperationTimeoutException: Timed out waiting for operation - failing node: <unknown>
    at net.spy.memcached.internal.OperationFuture.get(OperationFuture.java:167)
    at net.spy.memcached.internal.OperationFuture.get(OperationFuture.java:140)
So I guess I am actually asking 3 questions:
Is it possible to get the consistent results I need?
If so, is what I suggested the correct way of doing that?
How can I prevent those exceptions?
The map function I'm using:
function (doc, meta) {
  if (doc.doc_type && doc.doc_type == "MyType" && doc.myField) {
    emit(meta.id, null);
  }
}
Thank you
Is it possible to get the consistent results I need?
Yes, it is possible to make Couchbase views consistent by setting the stale flag to false, as you've done. However, there are performance impacts, so depending on your data size the query may be slow; if you are only going to run it once a day, it should be OK.
Couchbase is designed to be a distributed system comprising more than one node; it's not really suitable for single-node deployments. I have read (but can't find the link) that view performance is much better in larger clusters.
You are also forcing a more synchronous processing model onto a system that shines with async requests. PersistTo is OK for some requests, but not system-wide on every call (personal opinion); it will definitely throttle throughput and performance.
If so, is what I suggested the correct way of doing that?
You say the query is run after your application server starts; is this once per day or more often? If once a day, your approach should work (though I'd consider adding nodes ;)). If you have to run this query a lot, and you are hammering the node over and over with sets, then I'd expect to see exactly what you are currently experiencing.
How can I prevent those exceptions?
It could be a variety of reasons. What are the specs of your machine: RAM, CPU, disk? How much RAM is allocated to Couchbase, how much to your bucket, and what percentage of the bucket's RAM is in use?
I've personally seen this when hammering some lower-end AWS instances over some not-so-amazing networks. What version of Couchbase are you using? It could be a whole variety of factors and deserves to be a separate question.
Hope that helps!
EDIT: more information on the stale=false parameter (from the official docs):
http://docs.couchbase.com/couchbase-manual-2.2/#couchbase-views-writing-stale
The index is updated before the query is executed. This ensures that any documents updated (and persisted to disk) are included in the view. The client will wait until the index has been updated before the query has executed, and therefore the response will be delayed until the updated index is available.
So I'm working on a project that connects an IRC bot to a game server, and I need to store the username, the player's stats, and the ban expiration (if the user is banned). I'm fairly new to Node.js and don't know much about Node.js databases, so I kind of made my own using JSON. The JSON file looks like this:
{
  "Player": [
    {
      "name": "LucasTT",
      "stats": [
        103,
        1
      ],
      "banExpires": 0
    },
    ...
I wrote my own functions to access this database and didn't find it hard. So, is it better to use something like CouchDB, or to roll my own (as I kind of did), both performance-wise and in terms of how hard it is to set up?
If you have dozens or hundreds of users, your approach is fine. In fact, it is better than using a database, because it doesn't have that complexity, doesn't require another server, and is usually faster.
But if your user base grows to thousands, and you want several Node.js instances (or even separate servers), using a database will become the only option.
So I'd say go ahead and use JSON. You'll know when you need a database.