I have a server that holds data served through API requests from mobile clients. The data is fairly static and is updated rarely (say, once a week). But the table design is heavy, which makes the API requests slow.
The Web Service is implemented with Yii + PostgreSQL.
Would memcached solve this problem? If yes, how do I handle cached data that becomes dirty?
Is there an alternative solution? Does PostgreSQL have a built-in mechanism like the MEMORY engine in MySQL?
How about Redis?
You could use memcached, but every request would still hit your database server. Since you say the query results are fairly static, it might make more sense to cache the JSON responses from your Web Service.
This can be done with a Reverse Proxy that has a built-in cache. An example is probably the most helpful, so here is how we do it with Jetty (Java) and NGINX:
In our setup, we have a Jetty (Java) instance serving an API for our mobile clients. The API listens on localhost:8080/api and returns JSON results fetched from some queries on a local MySQL database.
At this point, we could serve the API directly to our clients, but here comes the Reverse Proxy:
In front of the API sits an NGINX webserver listening on 0.0.0.0:80/ (all interfaces, port 80).
When a mobile client connects to 0.0.0.0:80/api, the built-in Reverse Proxy tries to fetch the exact query string from its cache. If this fails, it fetches the response from localhost:8080/api, puts it in its cache and serves the new value found in the cache.
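The lookup described above can be modelled in a few lines of Python. This is only a sketch of the idea, not how NGINX is implemented; `ReadThroughCache` and `fake_backend` are made-up names, and the backend is a stand-in for the API on localhost:8080/api.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Toy model of a caching reverse proxy keyed by the request URI."""

    def __init__(self, fetch_from_backend: Callable[[str], str]) -> None:
        self._fetch = fetch_from_backend
        self._store: Dict[str, str] = {}
        self.backend_calls = 0  # counts how often the backend was hit

    def get(self, request_uri: str) -> str:
        if request_uri not in self._store:
            # Cache miss: ask the backend once and remember the answer.
            self.backend_calls += 1
            self._store[request_uri] = self._fetch(request_uri)
        # Hit or freshly filled: the reply always comes from the cache.
        return self._store[request_uri]


def fake_backend(request_uri: str) -> str:
    # Hypothetical stand-in for the JSON API behind the proxy.
    return '{"uri": "%s"}' % request_uri
```

Calling `get` twice with the same URI reaches the backend only once, which is the whole point: the slow queries run once per distinct request, not once per client.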
Benefits:
You can use other NGINX goodies: automatic GZIP compression of the cached JSON files
SSL endpoint termination at NGINX.
NGINX workers might benefit you, when you have a lot more connections, all requesting data from the cache.
You can consolidate your service endpoints
Think about cache-invalidation:
You have to think about cache-invalidation. You can tell NGINX to hold on to its cache for, say, a week for all HTTP 200 responses from localhost:8080/api, and for 1 minute for all other HTTP status codes. But if the time comes when you want to update the API in under a week, the cache is invalid, so you have to delete it somehow or turn the caching time down to an hour or a day (so that most people will still hit the cache).
This is what we do: We chose to delete the cache, when it is dirty. We have another JOB running on the Server listening to an Update-API event triggered via Puppet. The JOB will take care of clearing the NGINX cache for us.
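To make the two strategies concrete, here is a minimal Python sketch of a cache with a long lifetime for 200 responses, a short one for everything else, and an explicit purge for the "delete it when it is dirty" case. It illustrates the policy only; the class name `TtlCache` and the injectable clock are assumptions for the example, and NGINX itself is configured, not programmed, this way.

```python
import time
from typing import Callable, Dict, Optional, Tuple

WEEK = 7 * 24 * 3600  # lifetime for HTTP 200 responses, in seconds
MINUTE = 60           # lifetime for every other status code


class TtlCache:
    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        # key -> (expiry time, status, body)
        self._entries: Dict[str, Tuple[float, int, str]] = {}

    def put(self, key: str, status: int, body: str) -> None:
        ttl = WEEK if status == 200 else MINUTE
        self._entries[key] = (self._clock() + ttl, status, body)

    def get(self, key: str) -> Optional[Tuple[int, str]]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, status, body = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry so the caller refetches it.
            del self._entries[key]
            return None
        return status, body

    def purge(self) -> None:
        # What the update job does: throw everything away at once.
        self._entries.clear()
```

With one big update event, a single `purge()` after the update is simpler than tuning lifetimes, which matches the choice described above.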
Another idea would be to add the clearing cache function inside your Web Service. The reason we decided against this solution is: The Web Service would have to know it runs behind a reverse proxy, which breaks separation of concerns. But I would say, it depends on what you are planning.
Another thing that would make your Web Service more correct would be to serve proper ETag and cache-expiry headers with each JSON file. Again, we did not do that, because we have one big Update Event instead of small ones for each file.
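For illustration, here is roughly what serving an ETag involves, written as a framework-free Python sketch. The function names are made up, the one-week `max-age` is an assumption taken from the update frequency in the question, and the comparison is simplified (a real `If-None-Match` header may contain several tags or weak validators).

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    # Strong validator derived from the content; the quotes belong to the header value.
    return '"%s"' % hashlib.sha256(body).hexdigest()


def respond(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "max-age=604800"}  # assumed: one week
    if if_none_match == etag:
        # The client already has this exact version: send no body.
        return 304, headers, b""
    return 200, headers, body
```

A client that sends back the ETag it received gets a 304 until the JSON actually changes, at which point the hash, and therefore the ETag, differs.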
Side notes:
You do not have to use NGINX, but it is really easy to configure
NGINX and Apache have SSL support
There is also the famous Reverse Proxy Varnish (https://www.varnish-cache.org), but to my knowledge it does not do SSL (yet?)
So, if you were to use Varnish in front of your Web Service + SSL, you would use a configuration like:
NGINX -> Varnish -> Web Service.
References:
- NGINX server: http://nginx.com
- Varnish Reverse Proxy: https://www.varnish-cache.org
- Puppet IT Automation: https://puppetlabs.com
- NGINX reverse proxy tutorial: http://www.cyberciti.biz/faq/howto-linux-unix-setup-nginx-ssl-proxy/ http://www.cyberciti.biz/tips/using-nginx-as-reverse-proxy.html
I have just exposed my database on openshift and it gives me an 'https://....' url
Does anybody know how to connect using DBeaver by using this url that openshift gave to me.
The error that dbeaver says to me is the following
Malformed database URL, failed to parse the main URL sections.
Short answer: You can't with a Route
A Route can only expose http/https traffic
If you want to expose TCP traffic (as for a database), do not create a Route; change your Service type to "NodePort" instead
Check my previous answer for this kind of problem (exposing MQ in this case): How to connect to IBM MQ deployed to OpenShift?
OpenShift doc on NodePorts: https://docs.openshift.com/container-platform/4.7/networking/configuring_ingress_cluster_traffic/configuring-ingress-cluster-traffic-nodeport.html
There's another way to do this.
If your Route is set to "passthrough" it will just look at the SNI headers to determine where to route the traffic but won't unwrap it (and expect http inside) which will let it pass other traffic through to a pod.
I use this mechanism to run a ZNC bouncer (irc traffic) behind SNI.
The downside is you need to provide your own TLS cert inside the pod instead of leveraging the general one available to *.apps.(cluster).com
As for the specific error, "Malformed database URL", I've not used this software but from a quick websearch it looks like you want to rewrite the https://(appname).(clustername).com into a jdbc:.../hostname... string, and then enable TLS in settings.
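As a rough illustration of that rewrite, the sketch below pulls the hostname out of the route URL and builds a JDBC-style string. The database type is not stated in the question, so the `postgresql` driver name, the database name `mydb` and port 443 are all placeholder assumptions; substitute the values for your own setup. Whether the connection then succeeds still depends on the driver and on how the route is configured.

```python
from urllib.parse import urlsplit


def route_url_to_jdbc(route_url: str, driver: str = "postgresql",
                      database: str = "mydb", port: int = 443) -> str:
    # Only the hostname of the https:// route URL is reused.
    host = urlsplit(route_url).hostname
    if not host:
        raise ValueError("no hostname in %r" % route_url)
    # driver, database and port are hypothetical defaults, see above.
    return "jdbc:%s://%s:%d/%s" % (driver, host, port, database)
```

For example, `route_url_to_jdbc("https://myapp.apps.example.com")` gives `jdbc:postgresql://myapp.apps.example.com:443/mydb`, which is the shape of URL a JDBC client expects instead of an https:// address.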
I found this page that talks about setting it up, so it might be helpful if you've not already found it -- https://github.com/dbeaver/dbeaver/issues/9573
I am currently running a Django site on ec2. The site sends a csv back to the client. The CSV is of varying sizes. If it is small the site works fine and client is able to download the file. However, if the file gets large, I get an ERR_EMPTY_RESPONSE. I am guessing this is because the connection is aborting without giving adequate time for the process to run fully. Is there a way to increase this time span?
Here's what my site is returning to the client.
with open('//home/ubuntu/Fantasy-Fire/website/optimizer/lineups.csv') as myfile:
    response = HttpResponse(myfile, content_type='text/csv')
    response['Content-Disposition'] = 'attachment; filename=lineups.csv'
    return response
Is there some other argument that can allow me to ignore this error and keep generating the file even if it is taking awhile or is large?
I believe you have some sort of proxy server in front of Django that resets the connection to the Django backend, which the browser then reports as ERR_EMPTY_RESPONSE. You should reconfigure the timeouts on that proxy. Usually it is nginx or apache used as a reverse proxy server.
What is Reverse Proxy Server
A reverse proxy server is an intermediate connection point positioned at a network’s edge. It receives initial HTTP connection requests, acting like the actual endpoint.
Essentially your network’s traffic cop, the reverse proxy serves as a gateway between users and your application origin server. In so doing it handles all policy management and traffic routing.
A reverse proxy operates by:
Receiving a user connection request
Completing a TCP three-way handshake, terminating the initial connection
Connecting with the origin server and forwarding the original request
More info at https://www.imperva.com/learn/performance/reverse-proxy/
One more possible case - your reverse proxy backend server doesn't have enough free space to process response from Django and aborts the request. You can also check free space on your reverse proxy balancer.
Within gunicorn, there is an argument for the timeout, -t. When you run gunicorn, the default timeout is 30 seconds. Increase that to something you're comfortable with, like 90 or 120 seconds, whatever you think fits your application.
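If you would rather not pass the flag on every start, the same setting can go into a Gunicorn config file, which is plain Python. A minimal sketch, assuming the file is called gunicorn.conf.py and that 120 seconds suits your application:

```python
# gunicorn.conf.py

# Seconds a worker may stay silent before Gunicorn kills and restarts it.
# Same effect as -t 120 on the command line; the default is 30.
timeout = 120
```

Gunicorn picks the file up with `-c gunicorn.conf.py`. If a reverse proxy sits in front of Gunicorn, its own timeout has to be at least as long, or the proxy will still cut the connection first.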
In Azure's Traffic Manager, I am doing some testing with two failover URLs: two different endpoints are configured for the traffic manager (failover1.mysite.com, failover2.mysite.com). However, my local browser (Chrome, for example) seems to be caching the DNS record on its own and sending requests to what it thinks is still the destination, rather than letting Azure Traffic Manager re-route. Trying the request in a new browser or Incognito session results in the request reaching the correct site. But for existing sessions, failover updates are not picked up, and requests still hit the site we are trying to direct traffic away from. Does anyone have any experience with this?
I had the same issue while I was dealing with Azure Traffic Manager or AWS CloudFront.
Every DNS record has a TTL value. There is nothing wrong with Azure Traffic Manager; it is the TTL value that allows the DNS client to cache the IP address.
How to check TTL value of DNS:
If you are using Windows,
https://support.rackspace.com/how-to/nslookup-checking-dns-records-on-windows/
If you are using linux follow the detailed instructions here,
https://www.cyberciti.biz/faq/howto-use-dig-to-find-dns-time-to-live-ttl-values/
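If you want to check this from a script, the TTL is the second field of each record line in dig's answer section. The Python sketch below parses such text; the function name and the sample records in the usage note are made up for illustration and are not real lookup results.

```python
from typing import List


def ttls_from_dig_answer(answer_section: str) -> List[int]:
    """Return the TTL (in seconds) of every record line in dig-style output."""
    ttls = []
    for line in answer_section.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip blank lines and dig's comment lines
        fields = line.split()
        # Record lines look like: name TTL class type data
        if len(fields) >= 5 and fields[1].isdigit():
            ttls.append(int(fields[1]))
    return ttls
```

Given a hypothetical answer with the lines `failover.example.net. 60 IN CNAME failover1.mysite.com.` and `failover1.mysite.com. 300 IN A 192.0.2.10`, the function returns `[60, 300]`. A client is allowed to keep using the old address for up to that many seconds, which is why an existing browser session lags behind the failover.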
Hope it helps.
From Microsoft's overview of their load balancing services:
Traffic Manager is a DNS-based traffic load balancer [...] it load balances only at the domain level. For that reason, it can't fail over as quickly as Front Door, because of common challenges around DNS caching and systems not honoring DNS TTLs.
With Front Door you can route requests to different backends based on rules and/or the health of the backends themselves so it doesn't have the issue you describe.
I am attempting to use purely https with my compute engine. I have a network load balancer created that forwards to a pool with my instance in it. However, the pool has constantly failing health checks because it won't let me configure a health check that uses https.
I'm using apache to redirect 80 to 443. Does anyone know how to either create an https health check or have the http health check follow the redirect?
Thanks for any help.
--edit--
I finally came across some documentation at http://googlecloudplatform.blogspot.com/2015/07/Debugging-Health-Checks-in-Load-Balancing-on-Google-Compute-Engine.html.
Failure 5: Not answering directly with a 200 response code The web server may be configured to redirect to a page that returns an HTTP 200 response code. The health check will not follow the redirect; it expects the health check page to return a 200 directly.
This basic capability has been supported at every other hosting provider we've been on. Why can't this be done? What am I missing?
I spent the whole day trying to configure a purely https based load balancer in GCloud for a Kubernetes cluster with an ingress controller.
I finally got it working, so I want to share my experience with people who struggle with the same configuration. If the health check fails for the instances, you will usually see the following when accessing your website's URL.
Error: Server Error
The server encountered a temporary error and could not complete your request.
Please try again in 30 seconds.
1) Protocol: GCloud introduced new health checks which can be configured for TCP, SSL, HTTP, HTTPS, or HTTP/2 probing. This addresses the original problem, because an HTTPS check is not affected by the redirect from port 80 to port 443.
2) Path: The most common issue is that the "/" path of your application does not return a 200 OK and thus makes the health check fail. This can be prevented by adding a path argument to your health check, e.g. "/index".
3) Ingress HTTPS: This is relatively simple. Adding a secret or a pre-shared-cert to your ingress.yaml will automatically result in an HTTPS Load Balancer instead of HTTP. Further information is available here
Lastly, the guide from the docs for Setting up HTTP Load Balancing with Ingress .
However, even though the new HTTPS Health checks seem to work, they are still in the beta phase and bugs are reported in the issue tracker. The documentation for the gcloud-ingress-controller can be found here.
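Whatever protocol the check uses, the path it probes has to answer 200 directly, because the health check does not follow redirects. The Python sketch below shows one way to exempt a single path from an otherwise global redirect. The path `/healthz` and the target `https://example.com` are placeholders, and this is a standalone illustration, not an Apache or Ingress configuration.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict, Tuple

HEALTH_PATH = "/healthz"  # hypothetical path configured in the health check


def route(path: str) -> Tuple[int, Dict[str, str], bytes]:
    if path == HEALTH_PATH:
        # Answer the probe directly with 200, no redirect involved.
        return 200, {"Content-Type": "text/plain"}, b"ok"
    # Everything else is still redirected to HTTPS.
    return 301, {"Location": "https://example.com" + path}, b""


class Handler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status, headers, body = route(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    # Call serve() to run the sketch; it blocks until interrupted.
    HTTPServer(("0.0.0.0", port), Handler).serve_forever()
```

The same idea applies to a plain HTTP health check against a server that redirects port 80 to 443: leave one path out of the redirect rule and point the check at it.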
Our server setup is the following:
a proxy and load balancer directs all the requests to its machines behind. The problem is, that these machines behind do not know where they are. If the proxy gets the request for
www.bridge.de/m01
it forwards the request to machine01.
Machine01 only knows its local path
m01
To implement password reset functionality in the application, I considered several options.
We decided to pass the value of the URL from 'before the proxy' to the database of machine01. So machine01 'knows' its external context for that specific request.
My question is: Is there a better way to pass external URL context to machines behind a proxy? We are using JavaEE, JSP and MySql for our application. Virtual machines running with CentOS.
Thanks for any suggestions! :D
Your question is not fully clear.
I assume your issue is that your load balancer terminates the connection and forwards the request to you.
Usually your balancer provides you the origin URL of the request, since you may need it from time to time.
In this case you can check your http headers. If it is not provided, you have to reconfigure your balancer to provide you the needed details.
check this: Strategies for dealing with URIs when building an application that sits behind a reverse proxy
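As an illustration of reading those headers, here is a small Python sketch (the application in the question is Java, so treat this as pseudocode for the logic). It assumes the proxy is configured to send `X-Forwarded-Proto`, `X-Forwarded-Host` and `X-Forwarded-Prefix`; the first two are widely used conventions, the prefix header is less common, and none of them is set unless the proxy is told to add it.

```python
from typing import Mapping


def external_base_url(headers: Mapping[str, str], default_scheme: str = "http") -> str:
    # Header names are case-insensitive, so normalise them first.
    h = {name.lower(): value for name, value in headers.items()}
    scheme = h.get("x-forwarded-proto", default_scheme)
    # Fall back to the plain Host header when the proxy sent nothing.
    host = h.get("x-forwarded-host") or h.get("host", "")
    prefix = h.get("x-forwarded-prefix", "").rstrip("/")
    return "%s://%s%s" % (scheme, host, prefix)
```

With the headers `X-Forwarded-Proto: https`, `X-Forwarded-Host: www.bridge.de` and `X-Forwarded-Prefix: /m01`, this yields `https://www.bridge.de/m01`, which is what a password reset link needs. Since clients can forge these headers, only trust them when the request really came through your own proxy.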