Too many connections while using gevent socketio - gunicorn

I am getting a "too many connections" error (1040) when using gevent-socketio. I am using monkey patching right now. Can I limit the number of threads (greenlets) created and make some jobs share the threads? I am using gunicorn and Django.

Try using gevent.pool.Pool in your code; you can give it a limit on how many greenlets it spawns. Beyond that, your question is too general to answer. You need to give more information about the context and the source of the greenlet spawning.
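As a minimal sketch (the job function and the pool size of 100 are made up for illustration), a gevent.pool.Pool caps how many greenlets run at once while extra jobs wait their turn:

```python
import gevent
from gevent.pool import Pool

# A Pool bounds concurrency: at most `size` greenlets run at once,
# and further spawns block until a slot frees up.
pool = Pool(100)  # illustrative limit, tune to your connection budget

def handle_job(n):
    gevent.sleep(0.01)  # stand-in for I/O-bound work (DB call, socket, ...)
    return n * n

greenlets = [pool.spawn(handle_job, i) for i in range(10)]
pool.join()  # wait for all spawned greenlets to finish
results = [g.value for g in greenlets]
```

With a pool like this, the number of concurrent connections is bounded by the pool size rather than by the number of incoming requests.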

Related

Queue data publish/duplicate

I am using an IBM WebSphere MQ 7.5 server as the queue manager for my applications.
I am already receiving data through a single queue.
On the other hand, there are 3 applications that want to process the data.
I have 3 possible solutions to duplicate/distribute the data between them:
1. Use a broker to duplicate the 1 queue to 3 queues - I don't have a broker, so this is not an option for me.
2. Write an application to get messages from the input queue and put them on 3 other queues on the same machine.
3. Define publish/subscribe definitions to publish the input queue to 3 queues on the same machine.
I want to know which of methods 2 and 3 is preferred, has higher performance, and has acceptable operational-management effort.
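Option 2 can be sketched as a small relay. The fan-out logic below is plain Python with the MQ calls injected as callables; the pymqi-style calls and queue names in the comment are hypothetical illustrations, not tested against a real queue manager:

```python
def fan_out_one(get_message, puts):
    # get_message: callable returning one message body (e.g. input_queue.get)
    # puts: callables that each write a message (e.g. [q.put for q in out_queues])
    msg = get_message()
    for put in puts:
        put(msg)
    return msg

# With pymqi it might look roughly like this (untested sketch, names made up):
#   qmgr = pymqi.connect('QM1', 'SVRCONN.CHANNEL', 'host(1414)')
#   input_queue = pymqi.Queue(qmgr, 'APP.INPUT')
#   out_queues = [pymqi.Queue(qmgr, n) for n in ('APP.OUT1', 'APP.OUT2', 'APP.OUT3')]
#   while True:
#       fan_out_one(input_queue.get, [q.put for q in out_queues])
```

The operational downside the answers below mention applies here too: this relay is an extra process you have to monitor and restart yourself.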
Based on the description, I would say that going pub/sub would achieve the goal; try thinking in pure pub/sub terms rather than thinking about the queues. That is, you have an application that publishes to a topic, and then 3 applications each have their own subscription to get copies of the message.
You then have the flexibility to define durable/non-durable subscriptions, for example.
For option # 2, there are (at least) 2 solutions available:
There is an open source application called MMX (Message Multiplexer). It will do exactly what you describe. The only issue is that you will need to manage the application. i.e. If you stop the queue manager then the application will need to be manually restarted.
There is a commercial solution called MQ Message Replication. It is an API Exit that runs within the queue manager and does exactly what you want. Note: There is nothing external to manage as it runs within the queue manager.
I think there is another solution, with MQ only: define a namelist that mirrors queue1 to queue2 and queue3.
It should be defined with: Source, Destination, QueueManager.
Hope it is useful.
Biruk.

Accessing heapster metrics in Kubernetes code

I would like to expand/shrink the number of kubelets being used by kubernetes cluster based on resource usage. I have been looking at the code and have some idea of how to implement it at a high level.
I am stuck on 2 things:
What would be a good way to access the cluster metrics (via Heapster)? Should I try to use kube-dns to find the Heapster endpoint and directly query the API, or is there some other way? Also, I am not sure how to use kube-dns to get the Heapster URL in the former case.
The rescheduler that expands/shrinks the number of nodes will need to kick in every 30 minutes. What would be the best way to do that? Is there some interface in the code that I can use for it, or should I write a code segment that gets called every 30 minutes and put it in the main loop?
Any help would be greatly appreciated :)
Part 1:
What you said about using kubedns to find heapster and querying that REST API is fine.
You could also write a client interface that abstracts the interface to heapster -- that would help with unit testing.
Take a look at this metrics client:
https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/podautoscaler/metrics/metrics_client.go
It doesn't do exactly what you want: it gets per-Pod stats instead of per-cluster or per-node stats. But you could modify it.
In function getForPods, you can see the code that resolves the heapster service and connects to it here:
resultRaw, err := h.client.Services(h.heapsterNamespace).
ProxyGet(h.heapsterService, metricPath, map[string]string{"start": startTime.Format(time.RFC3339)}).
DoRaw()
where heapsterNamespace is "kube-system" and heapsterService is "heapster".
That metrics client is part of the "horizontal pod autoscaler" implementation. It solves a slightly different problem, but you should take a look at it if you haven't already. It is described here: https://github.com/kubernetes/kubernetes/blob/master/docs/design/horizontal-pod-autoscaler.md
FYI: The Heapster REST API is defined here:
https://github.com/kubernetes/heapster/blob/master/docs/model.md
You should poke around and see if there are node-level or cluster-level CPU metrics that work for you.
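As a sketch of the same ProxyGet idea outside the Go client, the request can go through the API server's service proxy. The path below mirrors the Heapster model API as documented, but the exact proxy path depends on your Kubernetes version, so treat it as an assumption to verify against your cluster:

```python
def heapster_proxy_url(apiserver, node, metric="cpu/usage_rate",
                       namespace="kube-system", service="heapster"):
    # Build the API-server service-proxy URL for a Heapster model query,
    # analogous to what ProxyGet() does in the Go metrics client.
    return ("{api}/api/v1/proxy/namespaces/{ns}/services/{svc}"
            "/api/v1/model/nodes/{node}/metrics/{metric}").format(
                api=apiserver, ns=namespace, svc=service,
                node=node, metric=metric)

# You would then GET this URL with your usual HTTP client and cluster credentials.
```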
Part 2:
There is no standard interface for shrinking nodes. It is different for each cloud provider. And if you are on-premises, then you can't shrink nodes.
Related discussion:
https://github.com/kubernetes/kubernetes/issues/11935
Side note: Among kubernetes developers, we typically use the term "rescheduler" when talking about something that rebalances pods across machines by removing a pod from one machine and creating the same kind of pod on another machine. That is a different thing from what you are talking about building. We haven't built a rescheduler yet, but there is an outline of how to build one here:
https://github.com/kubernetes/kubernetes/blob/master/docs/proposals/rescheduler.md
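For the every-30-minutes part of the question, a plain periodic loop in the main control loop is usually enough. A minimal sketch (the interval handling is generic; the resize function named in the comment is hypothetical):

```python
import time

def run_every(interval_seconds, fn, max_runs=None):
    # Call fn repeatedly, compensating for fn's own runtime so the period
    # stays close to interval_seconds. max_runs=None means loop forever.
    runs = 0
    while max_runs is None or runs < max_runs:
        started = time.time()
        fn()
        runs += 1
        remaining = interval_seconds - (time.time() - started)
        if remaining > 0:
            time.sleep(remaining)

# e.g. run_every(30 * 60, check_cluster_and_resize)
# where check_cluster_and_resize is your (hypothetical) metrics + scaling step.
```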

Beanstalkd to ZeroMQ: is it possible to distribute work in the same way?

A common beanstalkd workflow would be to have many workers listening for jobs on a queue/tube, locking that job while they process it, and deleting that job so that no other workers can re-process it. If the job fails (eg. resources are unavailable to complete processing) the job can slip back onto the queue for another worker to pick up the job.
Is this approach possible with ZeroMQ? Eg, using the pub/sub model can multiple subscribers receive the same job and process it at the same time? Would push/pull or req/rep provide a similar setup?
I'm certain ZeroMQ can provide this for you. However, keep in mind that ZeroMQ is not really a queue: it's an advanced networking library. Naturally, with the provided primitives, you can do what you describe.
Your specific case seems like it could be implemented as a pub/sub system if you don't mind having the same work done many times over. I recommend reading the ZeroMQ guide, especially chapter 5.
Although I'm certain you can do what you describe with ZeroMQ, I would first search for a queue that does this already.
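To make the distinction concrete: with PUSH/PULL each message goes to exactly one connected worker (round-robin), which is the closest analogue of beanstalkd's one-worker-per-job behavior, whereas PUB/SUB copies each message to every subscriber. A minimal pyzmq sketch (the inproc endpoint name is arbitrary):

```python
import zmq

ctx = zmq.Context.instance()

# The "ventilator" side hands out jobs; each job goes to exactly one worker.
sender = ctx.socket(zmq.PUSH)
sender.bind("inproc://jobs")

# A worker pulls jobs; adding more PULL sockets shares the work round-robin.
worker = ctx.socket(zmq.PULL)
worker.connect("inproc://jobs")

sender.send(b"job-1")
job = worker.recv()
```

Note that ZeroMQ gives you only the distribution; beanstalkd's reserve/release semantics (re-queueing a failed job) would have to be built on top, for example with an acknowledgement channel back to the sender.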

How to force a multihopping topology with xbee zb?

I use some XBee (S2) modules with the ZB stack for mesh-networking evaluation, so a multi-hopping environment has to be created. The problem is that the firmware handles association by itself, and there is no way deeper into the stack than the API provides. To force the path of the data without disturbing the routing mechanism I wanted to measure, I had to put the modules outside each other's range. Getting only the next hop to associate isn't that easy: I used the lowest output power level, but the distance required for the test setup is too large, and the RF characteristics of the environment change unpredictably.
Therefore my question, has anyone experience with this issue?
Regards, Toby
I don't think it's possible through software and coordinator/routers. You could change the Node Join Time (ATNJ) to force a new router to join through a particular router (disable Node Join on all nodes except one), but that would only affect joining. Once joined to the network, the router will discover that other nodes are within range.
You could possibly do it with sleepy end devices. You can use the ATNJ trick to force an end device to join through a single router, and it will always send its messages to that router. But you won't get that many hops -- end device sends to its parent router, which sends to the target's parent router, which sends to the target end device.
You'll likely need to physically limit the range of the radios to force hopping, as demonstrated in the video you linked of Digi's K-Node test equipment with a network of over 1000 radios. They're putting the radios in RF-shielded boxes and using wired antenna connections with software-controlled attenuators to connect the modules to each other.
If you have XBee modules with the U.fl or RPSMA connector, and don't connect an antenna, it should significantly reduce the range of the module. Otherwise, with a wire whip or integrated PCB antenna, you need to put each radio in some sort of box that attenuates the signal. Perhaps someone else can offer advice on materials that will reduce the signal's range without completely blocking it.
ZigBee nodes try to automatically form an ad-hoc network. That is why they join the network through the strongest connection (best network coverage) available at that moment. These modules are designed in such a way that you do not have to care much about establishing reliable communication; they will solve networking problems most of the time.
What you want to do is somehow force a different situation: you want to create a specific topology in order to get some multi-hopping. That is not the normal behavior of the nodes, but you can still get what you want with some of the AT commands.
The mentioned command "NJ" should work for you. This command locks out joins after a certain time (in seconds). Consider a simple ZigBee network with three nodes: one Coordinator, one Router, and one End Device. Switch on the Coordinator with "NJ" set to, say, two minutes. Then quickly switch on the Router so it can associate with the Coordinator within those two minutes. After the two minutes, the Coordinator will be locked and will not accept more joins. At that moment you can start the End Device, which will necessarily associate through the Router. This way, you will see that messages between the End Device and the Coordinator go through the Router, as you wanted.
You can get a bigger network by applying this idea several times, without needing to play with the modules' antennas. You can control the AT parameters remotely (i.e., from a computer connected to the Coordinator), so you can use some code to help you initialize the network.
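In transparent (AT) mode, the commands above are plain text sent after entering command mode with "+++". A tiny helper for formatting them, as an illustration only (on real hardware you would send these over the serial port, e.g. with pyserial, or use API frames instead):

```python
def at_command(cmd, value=None):
    # Format a transparent-mode AT command line, e.g. ("NJ", 0x78) -> "ATNJ78\r".
    # Parameter values are given in hexadecimal, as the XBee AT command set expects.
    line = "AT" + cmd
    if value is not None:
        line += format(value, "X")
    return line + "\r"

# Sequence sketched in the answer (0x78 = 120 seconds):
#   send at_command("NJ", 0x78) to the Coordinator, then at_command("WR") to save,
#   power on the Router within the window, then start the End Device.
```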

Is it possible to do asynchronous / parallel database query in a Django application?

I have web pages that take 10 - 20 database queries in order to get all the required data.
Normally after a query is sent out, the Django thread/process is blocked waiting for the results to come back, then it'd resume execution until it reaches the next query.
Is there any way to issue all the queries asynchronously so that they can be processed by the database server(s) in parallel?
I'm using MySQL but would like to hear about solutions for other databases too. For example I heard that Postgresql has an async client library - how would I use that in this case?
This very recent blog entry seems to imply that it's not built into either the Django or Rails frameworks. I think it covers the issue well and is quite worth a read, along with the comments.
http://www.eflorenzano.com/blog/post/how-do-we-kick-our-synchronous-addiction/ (broken link)
I think I remember Cal Henderson mentioning this deficiency somewhere in his excellent speech http://www.youtube.com/watch?v=i6Fr65PFqfk
My naive guess is you might be able to hack something together with separate Python libraries, but you would lose a lot of the ORM/template lazy-evaluation stuff Django gives you, to the point that you might as well be using another stack. Then again, if you are only optimizing a few views in a large Django project, it might be fine.
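One hack along those lines that stays inside Django: run the independent queries on a thread pool so they overlap in wall-clock time. This is a sketch under the assumption that the queries don't depend on each other; each thread gets its own database connection, and the query callables are stand-ins for real ORM calls such as lambda: list(MyModel.objects.filter(...)):

```python
from concurrent.futures import ThreadPoolExecutor

def run_queries_in_parallel(query_fns, max_workers=10):
    # Submit every query at once and collect results in the original order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(fn) for fn in query_fns]
        return [f.result() for f in futures]
```

The database still does all the work, but the 10-20 round trips overlap instead of running back to back. In long-running processes, remember that each worker thread holds its own DB connection.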
I had a similar problem, and I solved it with JavaScript/AJAX.
Just load the template with basic markup and then make several AJAX requests to execute the queries and load the data. You can even show a loading animation. The user will get a web 2.0 feel instead of a gloomy page load. Of course, this means several more HTTP requests per page, but it's up to you to decide.
Here is how my example looks: http://artiox.lv/en/search?query=test&where_to_search=all (broken link)
Try Celery. There's a bit of overhead in having to run an AMQP server, but it might do what you want. I'm not sure about concurrency in the DB, though. Also, if you want speed for your DB, I'd recommend MongoDB (but you'll need django-nonrel for that).