How to properly calculate amount of Gunicorn's workers in multi venv environment? - gunicorn

The official Gunicorn docs state:
Generally we recommend (2 x $num_cores) + 1 as the number of workers to start off with
However, what's the proper way of calculating the number of workers if I'm running 10 small websites (5,000 uv/day) on a dedicated server with a 4c/8t CPU? The stack is: Django, Nginx, Gunicorn, Supervisor.
For a single website it's obvious where to start: 2 x 4 + 1 = 9. But what about my situation? How should I scale workers so as not to waste server resources?
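For context, each site runs under Supervisor roughly like this (the program name, paths, and worker count are hypothetical, just to illustrate the multi-venv setup):

```ini
; Hypothetical Supervisor program entry for one of the 10 sites.
[program:site1]
command=/srv/site1/venv/bin/gunicorn site1.wsgi:application --workers 2 --bind unix:/run/site1.sock
directory=/srv/site1
autostart=true
autorestart=true
```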

Related

Compute Engine - Automatic scale

I have one Compute Engine VM hosting simple apps. My apps are growing, and so is the number of users.
My users work basically from 08:00 AM to 07:00 PM; during this period the CPU and memory usage is high, and speed matters a lot.
I'm preparing to expand the memory and processor in the next few days, but I'm looking for a more scalable and cost-effective approach.
Is there a way to automatically add resources when I need them and reduce them when I no longer do?
Thanks
The cost of running your VMs is directly related to a number of factors, e.g. the type of network in use (Premium vs Standard tier), the machine type, the boot disk image you use (premium vs open-source images), and the region/zone where your workloads run, among other things.
Your use case seems to fit managed instance groups (MIGs). With MIGs you essentially configure a template for VMs that share the same attributes. When configuring your MIG, you can specify the CPU/memory threshold beyond which the MIG autoscaler kicks in. When your CPU/memory reading goes back below that threshold, the MIG scales your VMs down toward the minimum number of instances you configured.
You can also use requests per second as a threshold for autoscaling and I would recommend you explore the docs to know more about it.
See docs
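As an illustrative sketch (the instance-group name, zone, and threshold values are assumptions; check the docs for your workload), CPU-based autoscaling on an existing MIG can be enabled from the CLI:

```shell
# Hypothetical MIG name and thresholds; adjust to your workload.
gcloud compute instance-groups managed set-autoscaling my-app-mig \
    --zone us-central1-a \
    --min-num-replicas 1 \
    --max-num-replicas 5 \
    --target-cpu-utilization 0.75 \
    --cool-down-period 120
```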

How many Uvicorn workers should I have in production?

My Environment
FastAPI
Gunicorn & Uvicorn Worker
AWS EC2 c5.2xlarge (8 vCPU)
Document
https://fastapi.tiangolo.com/deployment/server-workers/
Question
Currently I'm using 24 Uvicorn workers in production server. (c5.2xlarge)
gunicorn main:app --workers 24 --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:80
I've learned that one process runs on one core.
Therefore, if I have 8 processes, I can make use of all the cores (a c5.2xlarge has 8 vCPUs).
I'm curious: in this situation, is there any performance benefit to having more than 8 processes?
The recommended number of workers is (2 x number_of_cores) + 1.
You can read more about it at
https://docs.gunicorn.org/en/stable/design.html#:~:text=Gunicorn%20should%20only%20need%204,workers%20to%20start%20off%20with.
In your case, with 8 CPU cores, you should be using 17 workers.
Additional thoughts on async systems:
The "two times cores" figure is not scientific, as the article itself says. But the idea is that one worker can be doing I/O while another does CPU processing at the same time, making maximum use of simultaneous threads. Even in async systems this conceptually holds and should give you maximum efficiency.
In general the best practice is:
number_of_workers = number_of_cores x 2 + 1
or more precisely:
number_of_workers = number_of_cores x num_of_threads_per_core + 1
The reason for it is CPU hyperthreading, which allows each core to run multiple concurrent threads. The number of concurrent threads is decided by the chip designers.
Two concurrent threads per CPU core are common, but some processors can support more than two.
The vCPU count quoted for an AWS EC2 instance is already the hyperthreaded number of processing units on the machine (num_of_cores x num_of_threads_per_core). It should not be confused with the number of physical cores on that machine.
So, in your case, c5.2xlarge has 8 vCPUs, meaning you have 8 available concurrent workers.
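As a rough illustration (a sketch, not official Gunicorn tooling), the rule of thumb can be written in a few lines of Python. Note that `os.cpu_count()` reports logical CPUs (hyperthreads included), which matches what AWS calls vCPUs:

```python
import os

def recommended_workers(logical_cpus=None):
    """Gunicorn's rule of thumb: (2 x cores) + 1.

    os.cpu_count() returns the logical CPU count (hyperthreads
    included), the same figure AWS reports as vCPUs.
    """
    if logical_cpus is None:
        logical_cpus = os.cpu_count() or 1
    return 2 * logical_cpus + 1

print(recommended_workers(8))  # 17, matching the figure above
```

Treat the result as a starting point and load-test from there rather than taking it as a hard rule.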

Apache Superset - Concurrent loading of dashboard slices (Athena)

I've got a dashboard with a few slices setup. Slices are loading one after the other, not concurrently. This results in a bad user experience. My data is sitting on S3 and I'm using the Athena connector for queries. I can see that the calls to Athena are fired in order, with each query waiting for the one before it to finish before running.
I'm using gevent, which as far as I can tell shouldn't be a problem?
Below is an extract from my Superset config:
SUPERSET_WORKERS = 8
WEBSERVER_THREADS = 8
They used to be set to 2 and 1 respectively, but I upped both to 8 to see if that could be the issue. I'm getting the same result though.
Is this a simple misconfiguration issue or am I missing something?
It's important to understand multiprocessing and multithreading before increasing the workers and threads for Gunicorn. For CPU-intensive operations you want many processes, while for I/O-intensive operations you probably want many threads.
With your problem, you don't need many processes, but rather many threads within a process. With that config, the next step would be to debug how you're spawning greenlets (gevent facilitates concurrency, and concurrency !== parallel processing).
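To make the concurrency-vs-parallelism point concrete, here is a small stdlib sketch (threads as a stand-in for greenlets; the sleep simulates an Athena query blocked on I/O). Four 0.2 s waits complete in roughly 0.2 s total, not 0.8 s, because the threads overlap while blocked:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_query(name):
    # Stand-in for an Athena call: the thread just waits on "I/O".
    time.sleep(0.2)
    return f"{name} done"

start = time.monotonic()
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_query, ["slice1", "slice2", "slice3", "slice4"]))
elapsed = time.monotonic() - start

print(results)
print(round(elapsed, 1))  # roughly 0.2, not 0.8: the waits overlap
```

This is concurrency, not parallelism: only one thread runs Python at a time, but all four can wait on I/O simultaneously, which is exactly what concurrent slice loading needs.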
To bootstrap gunicorn with multiple threads you could do something like this:
gunicorn -b localhost:8000 main:app --threads 30 --reload
Please post more code to facilitate targeted help.

Benefit of running Kubernetes in bare metal and cloud with idle VM or machines?

I want to know the high level benefit of Kubernetes running in bare metal machines.
So let's say we have 100 bare metal machines ready, with kubelet deployed on each. Doesn't that mean that when the application only runs on 10 machines, we are wasting the other 90, which just stand by unused?
For cloud, does Kubernetes launch new VMs as needed, so that clients do not pay for idle machines?
How does Kubernetes handle the extra machines that are needed at the moment?
Yes, if you have 100 bare metal machines and use only 10, you are wasting money. You should only deploy the machines you need.
The Node Autoscaler works at certain Cloud Providers like AWS, GKE, or Open Stack based infrastructures.
Now, the Node Autoscaler is useful if your load is not very predictable and/or scales up and down widely over a short period of time (think batch jobs, or cyclic loads like a Netflix-type use case).
If you're running services that just need to scale gradually as your customer base grows, it is less useful, since it is just as easy to add new nodes manually.
Kubernetes will handle some amount of auto-scaling with a fixed number of nodes (i.e. you can run many Pods on one node; you would usually size your machines to run in a safe range while still handling spikes in traffic by spinning up more Pods on those nodes).
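For that Pod-level scaling within a fixed node pool, the Horizontal Pod Autoscaler is the usual mechanism; a minimal sketch (the deployment name and thresholds are assumptions):

```shell
# Hypothetical deployment name; this scales Pods, not nodes.
kubectl autoscale deployment my-app --min=2 --max=10 --cpu-percent=80
```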
As a side note: with bare metal, you typically gain in performance, since you don't have the overhead of a VM / hypervisor, but you need to supply distributed storage, which a cloud provider would typically provide as a service.

Choosing a TSDB for one-off smart-home installation

I'm building a one-off smart-home data collection box. It's expected to run on a raspberry-pi-class machine (~1G RAM), handling about 200K data points per day (each a 64-bit int). We've been working with vanilla MySQL, but performance is starting to crumble, especially for queries on the number of entries in a given time interval.
As I understand it, this is basically exactly what time-series databases are designed for. If anything, the unusual thing about my situation is that the volume is relatively low, and so is the amount of RAM available.
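For scale, a back-of-envelope check of the write rate implied by those numbers:

```python
points_per_day = 200_000
seconds_per_day = 24 * 60 * 60          # 86,400

avg_writes_per_sec = points_per_day / seconds_per_day
bytes_per_day = points_per_day * 8      # 64-bit ints, raw payload only

print(round(avg_writes_per_sec, 1))  # ~2.3 points/sec on average
print(bytes_per_day / 1_000_000)     # ~1.6 MB/day of raw values
```

That is a very modest ingest rate for any TSDB; the constraint here is really the 1 GB RAM budget, not the data volume.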
A quick look at Wikipedia suggests OpenTSDB, InfluxDB, and possibly BlueFlood. OpenTSDB suggests 4G of RAM, though that may be for high-volume settings. InfluxDB actually mentions sensor readings, but I can't find a lot of information on what kind of resources are required.
Okay, so here's my actual question: are there obvious red flags that would make any of these systems inappropriate for the project I describe?
I realize that this is an invitation to flame, so I'm counting on folks to keep it on the bright and helpful side. Many thanks in advance!
InfluxDB should be fine with 1 GB RAM at that volume. Embedded sensors and low-power devices like Raspberry Pis are definitely a core use case, although we haven't done much testing with the latest betas beyond compiling on ARM.
InfluxDB 0.9.0 was just released, and 0.9.x should be available in our Hosted environment in a few weeks. The low end instances have 1 GB RAM and 1 CPU equivalent, so they are a reasonable proxy for your Pi performance, and the free trial lasts two weeks.
If you have more specific questions, please reach out to us at influxdb@googlegroups.com or support@influxdb.com and we'll see how we can help.
Try VictoriaMetrics. It should run on systems with low RAM such as Raspberry Pi. See these instructions on how to build it for ARM.
VictoriaMetrics has the following additional benefits for small systems:
It is easy to configure and maintain since it has zero external dependencies and all the configuration is done via a few command-line flags.
It is optimized for low CPU usage and low persistent storage IO usage.
It compresses data well, so it uses a small amount of persistent storage space compared to other solutions.
Did you try OpenTSDB? We are using OpenTSDB for almost 150 houses to collect smart-meter data, where data is collected every 10 minutes, i.e. a lot of data points per day. But we haven't tested it on a Raspberry Pi. For a Raspberry Pi, OpenTSDB might be quite heavy, since it needs to run a web server, HBase, and Java.
Just a suggestion: you could use the Raspberry Pi as a collecting hub for the smart home and send the data from the Pi to a server, storing all the points there. On the server you can then do whatever you want, like aggregation or statistical analysis, and send the results back to the smart hub.
ATSD supports the ARM architecture and can be installed on a Raspberry Pi 2 to store sensor data. Currently, Ubuntu or Debian OS is required. Make sure the device has at least 1 GB of RAM and an SD card with a high write speed (60 MB/s or more). The size of the SD card depends on how much data you want to store and for how long; we recommend at least 16 GB, and you should plan ahead. Backup battery power is also recommended, to protect against crashes and ungraceful shutdowns.
Here you can find an in-depth guide on setting up a temperature/humidity sensor paired with an Arduino device. Using the guide you will be able to stream the sensor data into ATSD using MQTT or TCP protocol. Open-source sketches are included.