OBD2 PIDs 0101 vs. 0141 w/ regards to injection type - obd-ii

PIDs 0101 (monitor status since DTCs cleared) and 0141 (monitor status this drive cycle) both return the monitor status; however, per the specification only 0101 differentiates between spark ignition and compression ignition, and the bit-to-monitor mapping differs accordingly.
As per the standard documents (and Wikipedia), this distinction is missing in 0141, so how am I supposed to interpret the result of 0141 on a compression ignition vehicle?

All the PID details are in ISO 15031-5. You have to either buy it (around $80) or find it some other way. The information about PIDs on Wikipedia is incomplete and sometimes ambiguous. Below is some information on the differences between 0x01 and 0x41 (not complete, and not enough on its own to parse the responses). Hopefully it helps:
0x01 is Monitor status since DTCs cleared.
The bits in this PID shall report two pieces of information for each monitor:
1) monitor status since DTCs were last cleared, saved in NVRAM or Keep Alive RAM
2) monitors supported on the vehicle.
0x41: the bits in this PID shall report two pieces of information for each monitor:
1) Monitor enable status for the current driving cycle. This bit shall indicate when a monitor is disabled in a manner such that there is no easy way for the driver to operate the vehicle to allow the monitor to run.
Typical examples are
⎯ engine-off soak not long enough (e.g., cold start temperature conditions not satisfied)
⎯ monitor maximum time limit or number of attempts/aborts exceeded
The monitor shall not indicate “disabled” for operator-controlled conditions such as rpm, load, throttle
position. The monitor shall not indicate “disabled” from key-on because minimum time limit has not been exceeded or engine warm-up conditions have not been met, since these conditions will eventually be met as the vehicle continues to be driven.
If the operator drives the vehicle to a different altitude or ambient air temperature conditions, monitor status may change from enabled to disabled. The monitor shall not change from disable to enable if the conditions
change back. This could result in a monitor showing “disabled” but eventually showing “complete”.
2) Monitor completion status for the current driving/monitoring cycle. Status shall be reset to “not complete” upon starting a new monitoring cycle. Note that some monitoring cycles can include various engine operating conditions; other monitoring cycles begin after the ignition key is turned off. Some status bits on a given vehicle can utilize engine-running monitoring cycles while others can utilize engine-off monitoring cycles. Resetting the bits to “not complete” upon starting the engine will accommodate most engine-running
and engine-off monitoring cycles; however, manufacturers are free to define their own monitoring cycles.

In the latest version of the standard (SAE J1979DA-201406), which I have since bought, they have clarified that bit by specifying B3 as the injection-type bit for both 0101 and 0141. Thus this question has been solved.
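For anyone decoding these responses, here is a minimal Python sketch of reading that bit; the four data bytes are labelled A-D as in the standard, but verify the bit position against your copy.

def ignition_type(data: bytes) -> str:
    # data: the four data bytes A, B, C, D returned for PID 0101/0141
    a, b, c, d = data
    # bit 3 of byte B: 0 = spark ignition, 1 = compression ignition
    return "compression" if b & 0x08 else "spark"

print(ignition_type(bytes([0x00, 0x0F, 0xFF, 0x00])))  # -> compression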

Related

MySQL query design for booking database with limited amount of goods and heavy traffic

We're running a site for booking salmon fishing licenses. The site has no problem handling the traffic 364 days a year. The 365th day is when the license sale opens, and that's where the problem occurs. The servers are struggling more and more each year due to increased traffic, and we have to further optimize our booking query.
The licenses are divided into many different types (tbl_license_types), and each license type is connected to one or more fishing zones (tbl_zones).
Each license type can have a seasonal quota, which is a single value set as an integer field in tbl_license_types.
Each zone can have a daily quota, a weekly quota and a seasonal quota. The daily quota is the same for all days, and the seasonal quota of course is a single value. Daily and seasonal are therefore integer fields in tbl_zones. The weekly quota however differs by week, therefore those are specified in the separate tbl_weekly_quotas.
Bookings can be for one or more consecutive dates, but are only stated as From_date and To_date in tbl_shopping_cart (and tbl_bookings). For each booking attempt made by a user, the quotas have to be checked against already allowed bookings in both tbl_shopping_cart and tbl_bookings.
To be able to count/group by date, we use a view called view_season_calendar with a single column containing all the dates of the current season.
In the beginning we used a transaction where we first made a query to check the quotas, and if quotas allowed we would use a second query to insert the booking to tbl_bookings.
However that gave a lot of deadlocks under relatively moderate traffic, so we redesigned it to a single query (pseudo-code):
INSERT INTO tbl_bookings (_booking_)
WHERE _lowest_found_quota_ >= _requested_number_of_licenses_
where _lowest_found_quota_ is a ~330-line SELECT with multiple subqueries, with the related tables joined multiple times in order to check all quotas (a simplified sketch of the pattern follows the example below).
Example: User wants to book License type A, for zones 5 and 6, from 2020-05-19 to 2020-05-25.
The system needs to
count previous bookings of license type A against the license type A seasonal quota.
count previous bookings in zone 5 for each of the 7 dates against the zone 5 daily quota.
same count for zone 6.
count previous bookings in zone 5 for each of the two weeks the dates are part of, against zone 5 weekly quota.
same count for zone 6.
count all previous bookings in zone 5 for the current season against zone 5 seasonal quota.
same count for zone 6.
If all are within quotas, insert the booking.
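Reduced to a single seasonal-quota check, the one-statement pattern might look roughly like this (a hedged sketch: table and column names are guesses, tbl_bookings is assumed to carry a qty column, and the real quota SELECT is far larger):

def try_insert_booking(conn, zone_id, d_from, d_to, qty):
    # conn: any DB-API connection (e.g. mysql.connector / PyMySQL)
    cur = conn.cursor()
    # The INSERT happens only if the quota subquery leaves enough capacity.
    cur.execute(
        "INSERT INTO tbl_bookings (zone_id, from_date, to_date, qty) "
        "SELECT %s, %s, %s, %s FROM dual "
        "WHERE (SELECT z.seasonal_quota - COALESCE(SUM(b.qty), 0) "
        "         FROM tbl_zones z LEFT JOIN tbl_bookings b USING (zone_id) "
        "        WHERE z.zone_id = %s GROUP BY z.seasonal_quota) >= %s",
        (zone_id, d_from, d_to, qty, zone_id, qty),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 means the quota would be exceeded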
As I said, this was working well earlier, but due to higher traffic load we need to optimize this query further. I have some thoughts on how to do this:
Using isolation level READ UNCOMMITTED for each booking until the quotas for the requested zones/license type are nearly full, then falling back to the default REPEATABLE READ. As long as a lot of the quota is left, the count doesn't need to be 100% correct. This will greatly reduce lock waits and deadlocks, right?
Creating one or more views which keep a count of all bookings for each date, week, zone, and license type, then using those views in the WHERE clauses of the insert.
If doing no. 2, use READ UNCOMMITTED in the views. If the views report the relevant quota as nearly full, cancel the INSERT and start a new one with the design we're using today. (Hopefully traffic levels will come down before the quotas fill up.)
I would greatly appreciate thoughts on how the query can be made as efficient as possible.
Rate Per Second = RPS
Suggestions to consider for your AWS parameter group
innodb_lru_scan_depth=100 # from 1024 to conserve 90% of CPU cycles used for function every second
innodb_flush_neighbors=2 # from 1 to reduce time required to lower innodb_buffer_pool_pages_dirty count when busy
read_rnd_buffer_size=128K # from 512K to reduce handler_read_rnd_next RPS of 3,227
innodb_io_capacity=1900 # from 200 to use more of SSD IOPS
log_slow_verbosity=query_plan,explain # to make slow query log more useful
have_symlink=NO # from YES for some protection from ransomware
You will find these changes will allow transactions to complete more quickly. To reduce the select_scan rate of 1,173 per hour, look for JOINs or queries not using indexes. Com_rollback averages 1 every ~2,700 seconds, usually correctable with a consistent read order in maintenance queries.
See if you can upgrade the AWS instance starting a day before the season opens, then downgrade after the rush. It's a small price to pay for what might be a plenty big performance boost.
Rather than the long complex query for counting, decrement some counters as you go. (This may or may not help, so play around with the idea.)
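A minimal sketch of that counter idea (hypothetical table and columns, not from the question): keep a remaining-quota row per zone and day, and decrement it atomically so the affected-row count replaces the big counting query.

def try_book(conn, zone_id, day, qty):
    # conn: any DB-API connection (e.g. mysql.connector / PyMySQL)
    cur = conn.cursor()
    # Atomic conditional decrement: succeeds only if enough quota remains.
    cur.execute(
        "UPDATE tbl_daily_counters "
        "SET remaining = remaining - %s "
        "WHERE zone_id = %s AND day = %s AND remaining >= %s",
        (qty, zone_id, day, qty),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows updated means the quota is full

Because the UPDATE both checks and takes the quota in one statement, the window for deadlocks is much smaller than with a separate check-then-insert transaction.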
Your web server has some limit on the number of connections it will handle; limit that rather than letting 2K users get into MySQL and stumble over each other. Think of what a grocery store is like when the aisles are so crowded that no one is getting finished!
Be sure to use "transactions", but don't let them be too broad. If they encompass too many things, the traffic will degenerate to single file (and/or transactions will abort with deadlocks).
Do as much as you can outside of transactions -- such as collecting and checking user names/addresses, etc. If you do this after issuing the license, be ready to undo the license if something goes wrong. (This should be done in code, not via ROLLBACK.)
VIEWs are syntactic sugar; they do not provide any performance or isolation. OTOH, if you make "materialized" views, there might be something useful.
A long "History list" is a potential performance problem (especially CPU). This can happen when lots of connections are in the middle of a transaction at the same time -- each needs to hang onto its 'snapshot' of the dataset.
Wherever possible, terminate transactions as soon as possible -- even if you turn around and start a new one. An example in Data Warehousing is to do the 'normalization' before starting the main transaction. (Probably this example does not apply to your app.)
Ponder having a background task computing the quotas. The hope is that the regular tasks can run faster by not having the computation inside their transactions.
A technique used in the reservation industry: (And this sounds somewhat like your item 1.) Plow ahead with minimal locking. At the last moment, have the shortest possible transaction to make the reservation and verify that the room (plane seat, etc) is still available.
If the whole task can be split into (1) read some stuff, then (2) write (and reread to verify that the thing is still available), then... If the read step is heavier than the write step, add more Slaves ('replicas') and use them for the read step. Reserve the Master for the write step. Note that Replicas are easy (and cheap) to add and toss for a brief time.

Reason to set queue size of ROS publisher or subscriber to a large value

When looking over the tutorials for the Robot Operating System (ROS), I found that most example code sets the publisher's queue size to a large value such as 1000. I think this leads to losing the real-time responsiveness of the node.
For what purpose do people set it to such a large value?
From ROS docs (http://wiki.ros.org/ROS/Tutorials/WritingPublisherSubscriber):
Message publisher (producer):
"The second parameter to advertise() is the size of the message queue
used for publishing messages. If messages are published more quickly
than we can send them, the number here specifies how many messages to
buffer up before throwing some away."
Message subscriber:
"The second parameter to the subscribe() function is the size of the message queue. If messages are arriving faster than they are being processed, this is the number of messages that will be buffered up before beginning to throw away the oldest ones."
Possible explanation:
Think of the consumer-producer problem.
You can't guarantee that you will consume messages at the rate they arrive, so you create a queue that is filled as messages come in from the sender (some sensor, for instance).
Bad case: if your program is delayed in some other part and you can't read the messages at the rate they arrive, the queue grows.
Good case: as soon as your other processing load diminishes, you can read the queue faster and start to drain it. If you have time available you will end up reducing the queue length to zero.
So, as for your question: if you set the queue size to a large value, you can be fairly sure you will not lose messages. In a simple example you have no memory constraints, so you can do anything you want, like using many GBytes of RAM to create a huge queue and be confident it will always work. And if you create a toy example to explain a concept, you don't want your program to crash for unrelated reasons.
A real-life example is a waiter and a kitchen where dishes get washed.
Suppose the customers finish their meals and the waiter takes their dirty dishes to the kitchen to be washed. He puts them on a table. Whenever the dishwasher can, he goes to the table, picks up dishes, and washes them. In normal operation the table never fills up. But if someone gives the dishwasher another task, the table starts to fill, until at some point the waiter can't put down dishes anymore and has to leave tables dirty (a problem in the system). But if the table is artificially large (say, 1000 square units), the waiter will likely manage his job even while the dishwasher is busy, given that after some time the dishwasher can return to cleaning dishes.
OK, a long answer, but it may help in understanding queues.

Understanding the max.inflight property of kafka producer

I'm benchmarking my Kafka cluster, version 1.0.0-cp1.
For the part of my benchmark that focuses on the maximum possible throughput with an ordering guarantee and no data loss (a topic with only one partition), do I need to set the max.in.flight.requests.per.connection property to 1?
I've read this article,
and I understand that I only have to set max.in.flight to 1 if I enable the retry feature at my producer with the retries property.
Another way to ask my question: is one partition + retries=0 (producer props) sufficient to guarantee ordering in Kafka?
I need to know because increasing max.in.flight drastically increases throughput.
Your use case is slightly unclear. You mention ordering and no data loss but don't specify whether you tolerate duplicate messages, so it's unclear whether you want At Least Once (QoS 1) or Exactly Once semantics.
Either way, as you're using 1.0.0 and only a single partition, you should have a look at the Idempotent Producer instead of tweaking the producer configs. It lets you properly and efficiently guarantee ordering and no data loss.
From the documentation:
Idempotent delivery ensures that messages are delivered exactly once
to a particular topic partition during the lifetime of a single
producer.
The early Idempotent Producer forced max.in.flight.requests.per.connection to 1 (for the same reasons you mentioned), but in the latest releases it can be used with max.in.flight.requests.per.connection set to up to 5 and still keep its guarantees.
Using the Idempotent Producer you'll not only get stronger delivery semantics (Exactly Once instead of At Least Once), but it might even perform better!
I recommend you check the delivery semantics in the docs: http://kafka.apache.org/documentation/#semantics
Back to your question:
Yes, without the idempotent (or transactional) producer, if you want to avoid data loss (QoS 1) and preserve ordering, you have to set max.in.flight.requests.per.connection to 1, allow retries, and use acks=all. As you saw, this comes at a significant performance cost.
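For illustration, that configuration might look like this with the confluent-kafka Python client (the client choice, topic name, and broker address are my assumptions, not the question's):

from confluent_kafka import Producer

# Without idempotence: ordering + no loss requires one request in
# flight, retries enabled, and acks from all in-sync replicas.
producer = Producer({
    'bootstrap.servers': 'localhost:9092',
    'acks': 'all',
    'retries': 5,
    'max.in.flight.requests.per.connection': 1,
})

# With a recent client/broker, enabling idempotence instead keeps the
# same guarantees with up to 5 in-flight requests:
# producer = Producer({'bootstrap.servers': 'localhost:9092',
#                      'enable.idempotence': True})

producer.produce('bench-topic', b'payload')
producer.flush()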
Yes, you must set the max.in.flight.requests.per.connection property to 1.
In the article you read there was an initial mistake (since corrected) where the author wrote:
max.in.flights.requests.per.session
which doesn't exist in the Kafka documentation.
This erratum probably comes from the book "Kafka: The Definitive Guide" (1st edition), where on page 52 you can read:
"...so if guaranteeing order is critical, we recommend setting in.flight.requests.per.session=1 to make sure that while a batch of messages is retrying, additional messages will not be sent..."
IMO, it is invaluable to also know about this issue, which makes things far more interesting and slightly more complicated.
When you enable enable.idempotence=true, every time you send a message to the broker you also send a sequence number, starting from zero. The broker stores that sequence number on its side too. When your next request arrives, the broker compares the incoming sequence number against the last one it stored (say, 3):
if it's 4 - good, it's a new batch of records
if it's 3 - it's a duplicate
if it's 5 (or higher) - it means messages were lost
And now max.in.flight.requests.per.connection: a producer can send this many concurrent requests without actually waiting for an answer from the broker. When we reach 3 (let's say max.in.flight.requests.per.connection=3), we start to ask the broker for the previous results (and in the meantime we can't process any batches, even if they are ready).
Now, for the sake of the example, let's say the broker replies: "1 was OK, I stored it", "2 has failed", and now the important part: because 2 failed, the only possible answer for 3 is "out of order", which means it was not stored. The client now knows that it needs to reprocess 2 and 3, so it creates a list and resends them - in that exact order, if retry is enabled.
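A toy sketch of the broker-side bookkeeping described above (purely illustrative, not actual Kafka code):

def check_sequence(stored_seq, incoming_seq):
    # stored_seq: last sequence number the broker persisted for this
    # producer/partition; incoming_seq: the one just received
    if incoming_seq == stored_seq + 1:
        return 'ok, new batch'
    elif incoming_seq <= stored_seq:
        return 'duplicate'
    else:
        return 'out of order / messages lost'

print(check_sequence(3, 4))  # ok, new batch
print(check_sequence(3, 3))  # duplicate
print(check_sequence(3, 5))  # out of order / messages lost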
This explanation is probably oversimplified, but it is my basic understanding after reading the source code a bit.

Safe maximum amount of nodes in the DOM? [closed]

For a web application, given the available memory in a target mobile device [1] running a target mobile browser [2], how can one estimate the maximum number of DOM nodes, including text nodes, that can be generated via HTML or DHTML?
How can one calculate the estimate before
Failure
Crash
Significant degradation in response
Also, is there a hard limit in any browser that must not be crossed per open tab?
Regarding Prior Closure
This is not like the other questions in the comments below. It is also asking a very specific question seeking a method for estimation. There is nothing duplicated, broad, or opinion based about it, especially now that it is rewritten for clarity without changing its author's expressed interests.
Footnotes
[1] For instance, Android or iOS mobile devices sold from 2013 to 2018 with some specific RAM capacity
[2] Firefox, Chrome, IE 11, Edge, Opera, Safari
This is a question for which only a statistical answer could be accurate and comprehensive.
Why
The appropriate equation is this, where N is the number of nodes, bytes_N is the total bytes required to represent them in the DOM, and the node index n ∈ [0, N):
bytes_N = ∑ (bytesContent_n + bytesOverhead_n) for n from 0 to N−1
The value requested in the question is the maximum value of N in the worst case handheld device, operating system, browser, and operating conditions. Solving for N for each permutation is not trivial. The equation above reveals three dependencies, each of which could drastically alter the answer.
The average size of a node is dependent on the average number of bytes used in each to hold the content, such as UTF-8 text, attribute names and values, or cached information.
The average overhead of a DOM object is dependent on the HTTP user agent that manages the DOM representation of each document. W3C's Document Object Model FAQ states, "While all DOM implementations should be interoperable, they may vary considerably in code size, memory demand, and performance of individual operations."
The memory available to use for DOM representations is dependent upon the browser used by default (which can vary depending on what browser handheld device vendors or users prefer), user override of the default browser, the operating system version, the memory capacity of the handheld device, common background tasks, and other memory consumption.
Rigorous Solution
One could run tests to determine (1) and (2) for each of the common HTTP user agents used on handheld devices. The distribution of user agents for any given site can be obtained by configuring the web server's logging mechanism to record the HTTP_USER_AGENT field if it isn't there by default, and then stripping all but that field from the log and counting the instances of each value.
The number of bytes per character would need to be tested for both attribute values and UTF-8 inner text (or whatever the encoding is) to get a clear pair of factors for calculating (1).
The memory available would need to be tested too under a variety of common conditions, which would be a major research project by itself.
The particular value of N chosen would have to be ZERO to handle the actual worst case, so one would instead choose a certain percentage of typical cases of content, node structures, and run-time conditions. For instance, one might take a sample of cases using some form of randomized in situ (within normal environmental conditions) study and find the N that satisfies 95% of those cases.
Perhaps a set of cases could be tested in the above ways and the results placed in a table. Such would represent a direct answer to your question.
I'm guessing it would take an excellent mobile software engineer with a good math background and a statistics expert working together full time with a substantial budget for about four weeks to get reasonable results.
A More Practical Estimation
One could guess at the worst-case scenario. With a few full days of research and a few proof-of-concept apps, this proposal could be refined. Absent the time to do that, here's a good first guess.
Consider a cell phone that permits 1 GByte for the DOM because normal operating conditions use 3 GBytes out of its 4 GBytes for the purposes mentioned above. One might assume the following average memory consumption per node to get a ballpark figure:
2 bytes per character for 40 characters of inner text per node
2 bytes per character for 4 attribute values of 10 characters each
1 byte per character for 4 attribute names of 4 characters each
160 bytes for the C/C++ node overhead
In this case N_worst_case, the worst-case maximum number of nodes, is
N_worst_case = 1,024 × 1,024 × 1,024 / (2 × 40 + 2 × 4 × 10 + 1 × 4 × 4 + 160)
             ≈ 3,195,660
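The same arithmetic as a runnable check (all the per-node figures are the assumed averages from the list above):

inner_text    = 2 * 40      # 2 bytes/char, 40 chars of inner text
attr_values   = 2 * 4 * 10  # 4 attribute values of 10 chars each
attr_names    = 1 * 4 * 4   # 4 attribute names of 4 chars each
node_overhead = 160         # assumed C/C++ node overhead

dom_budget = 1024 ** 3      # 1 GByte assumed available for the DOM

print(dom_budget // (inner_text + attr_values + attr_names + node_overhead))
# -> 3195660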
I would not, however, build a document in a browser with three million DOM nodes if it could be at all avoided. Consider employing the more common practice below.
Common Practice
The best solution is to stay far below whatever N might be and simply reduce the total number of nodes to the degree possible, using standard design techniques.
Reduce the size and complexity of that which is displayed on any given page, which also improves visual and conceptual clarity.
Request minimal amounts of data from the server, deferring content that is not yet visible using windowing techniques or balancing response time with memory consumption in well-planned ways.
Use asynchronous calls to assist with the above minimalism.
There is no limit for the DOM itself. Instead, there is a limit for the running application, i.e. the browser. Like any other 32-bit application, it has a limit of 4 GB of virtual memory. How much resident memory is used depends on the amount of physical memory: with low RAM you might end up constantly swapping in and out (given a sufficient amount of swap space). Some systems (Linux, Android) have a special kernel task that kills applications if the system runs out of memory. Also, on Linux-like systems the maximum (virtual) memory size of an application is usually capped and can be changed with the ulimit command.

Decrementing money balance stored on master server from numerous backends? (distributed counter, eh?)

I have some backend servers located in two different datacenters (one in the USA and one in Europe). These servers just deliver ads on a CPM basis.
Besides that, I have a big & fat master MySQL server holding the advertisers' ad campaign money balances. Again, all ad campaigns are delivered on a CPM basis.
On every impression served from any of the backends I have to decrement the ad campaign's money balance according to the impression price.
For example, say the price per impression is 1 cent. Backend A has delivered 50 impressions and will decrement the money balance by 50 cents. Backend B has delivered 30 impressions and will decrement the money balance by 30 cents.
So, the main problems as I see them are:
The backends serve about 2-3K impressions every second, so decrementing the money balance on the fly in MySQL is not a good idea IMHO.
The backends are located in US and EU datacenters while the MySQL master server is located in the USA. Network latency could be a problem: [EU backend] <-> [US master]
As possible solutions I see:
Using Cassandra as a distributed counter storage. I'd like to avoid this solution as long as possible.
Reserving part of the money per backend. For example, backend A connects to the master and tries to reserve $1. Once $1 is reserved and stored locally on the backend (in a local Redis, for example), there is no problem decrementing it at light speed. The main problem I see is returning money from a backend to the master server when the backend is removed from the delivery scheme ("disconnected" from the balancer). Anyway, it seems to be a very nice solution and would allow us to stay within our current technology stack.
Any suggestions?
UPD: One important addition: it is not so important to deliver ad impressions with high precision. We can deliver more impressions than requested, but never fewer.
How about, instead of decrementing a balance, you keep a log of all reported work from each backend, and then calculate the balance when you need it by subtracting the sum of all reported work from the campaign's budget?
Tables:
campaign (campaign_id, budget, ...)
impressions (campaign_id, backend_id, count, ...)
Report work:
INSERT INTO impressions VALUES ($campaign_id, $backend_id, $served_impressions);
Calculate the balance of a campaign only when necessary:
SELECT campaign.budget - SUM(impressions.count) * $impression_price AS balance
FROM campaign INNER JOIN impressions USING (campaign_id)
WHERE campaign_id = $campaign_id
GROUP BY campaign.budget;
This is perhaps the most classical ad-serving/impression-counting problem out there. You're basically trying to balance a few goals:
Not under-serving ad inventory, thus not making as much money as you could.
Not over-serving ad inventory, thus serving for free since you can't charge the customer for your mistake.
Not serving the impressions too quickly, because usually customers want an ad to run through a given calendar time period, and serving them all in an hour between 2-3 AM makes those customers unhappy and doesn't do them any good.
This is tricky because you don't necessarily know how many impressions will be available for a given spot (since it depends on traffic), and it gets even more tricky if you do CPC instead of CPM, since you then introduce another unknowable variable of click-through rate.
There isn't a single "right" pattern for this, but what I have seen to be successful through my years of consulting is:
Treat the backend database as your authoritative store. Partition it by customer as necessary to support your goals for scalability and fault tolerance (limiting possible outages to a fraction of customers). The database knows that you have an ad insertion order for e.g. 1000 impressions over the course of 7 days. It is periodically updated (minutes to hours) to reflect the remaining inventory and some basic stats to bootstrap the cache in case of cache loss, such as the actual serving velocity.
Don't bother with money balances at the ad server level. Deal with impression counts, rates, and targets only. Settle that to money balances after the fact through logging and offline processing.
Serve ad inventory from a very lightweight and fast cache (near the web servers) which caches the impression remaining count and target serving velocity of an insertion order, and calculates the actual serving velocity.
Log all served impressions with relevant data.
Periodically collect serving velocities and push them back to the database.
Periodically collect logs and calculate actual served inventory and push it back to the database. (You may need to recalculate from logs due to outages, DoSes, spam, etc.)
Create a service on your big & fat master MySQL server that manages the advertisers' ad campaign money balances.
This service must implement getCampaignFund(idcampaign, requestingServerId, currentLocalAccountBalanceAtTheRequestingServer), which returns a creditLimit to the regional server.
Imagine a credit card mechanism: your master server gives some limit to your regional servers. Once this limit is nearly used up, a threshold triggers a request to get a new limit. But to get the new credit limit the regional server must report how much it has used of the previous limit.
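A rough sketch of that settlement handshake in Python (slice size, storage, and currency units are hypothetical):

CREDIT_SLICE = 100  # cents handed to a regional server per request

campaign_balance = {}  # idcampaign -> unreserved central balance, in cents

def getCampaignFund(idcampaign, requestingServerId, localBalance):
    # requestingServerId would key per-server bookkeeping in a real service.
    # The regional server hands back the unused remainder of its previous
    # limit; money it actually spent never re-enters the pool.
    campaign_balance[idcampaign] += localBalance
    # Hand out a fresh slice, never more than the campaign has left.
    new_limit = min(CREDIT_SLICE, campaign_balance[idcampaign])
    campaign_balance[idcampaign] -= new_limit
    return new_limit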
Your regional servers might additionally implement these services:
currentLocalCampaignAccountBalance getCampaignAccountBalance(idcampaign): to report the current usage of a specific campaign, so the main server can update all campaigns at a specific time.
addCampaign(idcampaign, initialBalance): to register a new campaign and its starting credit limit.
suspendCampaign(idcampaign): to suspend impressions for a campaign.
resumeCampaign(idcampaign): to resume impressions for a campaign.
currentLocalCampaignAccountBalance finishCampaign(idcampaign): to finish a campaign and return the current local account balance.
currentLocalCampaignAccountBalance updateCampaignLimit(idcampaign, newCampaignLimit): to update the limit (reallocation of credit between regional servers). This service updates the campaign credit limit and returns the account balance of the previous credit limit.
Services are great because you get a loosely coupled architecture. Even if your main server goes offline for some time, your regional servers will keep running until they have exhausted their credit limits.
This may not be a detailed canonical answer, but I'll offer my thoughts as possible (and at least partial) solutions.
I'll have to guess a bit here because the question doesn't say much about what measurements have been taken to identify MySQL bottlenecks, which IMHO is the place to start. I say that because IMHO 1-2K transactions per second is not out of range for MySQL. I've easily supported volumes this high (and much higher) with some combination of the following techniques, in no particular order since the right choice depends on what the measurements say the bottlenecks are: (0) database redesign; (1) tuning buffers; (2) adding RAM; (3) solid-state drives; (4) sharding; (5) upgrading to MySQL 5.6+ if on 5.5 or lower. So I'd take some measurements and apply the foregoing as called for by the results.
Hope this helps.
I assume:
Ads are probably bought in batches of at least a couple of thousand.
There are ads from several different batches being delivered at the same time, not all of which will be near empty at the same time.
It is OK to serve some extra ads if your infrastructure is down.
So, here's how I would do it.
The BigFat backend has these methods:
getCurrentBatches(), which delivers a list of batches that can be used for a while. Each batch contains a rate: the number of ads that can be served each second. Each batch also contains a serveMax: how many ads may be served before talking to BigFat again.
deductAndGetNextRateAndMax(batchId, adsServed), which deducts the number of ads served since the last call and returns a new rate (which might be the same) and a new serveMax.
The reason to have a rate for each batch is that when one batch starts to run out of funds it will be served less and less until it's totally depleted.
If one backend doesn't connect to BigFat for a while, it will reach serveMax and only serve ads from other batches.
The backends could have a report period of seconds, minutes, or even hours depending on serveMax. A brand-new batch with millions of impressions left can run safely for a long while before reporting back.
When BigFat gets a call to deductAndGetNextRateAndMax, it deducts the number of served ads and then returns something like 75% of the total remaining impressions, up to a configured max. This means that at the end of a batch, if it isn't refilled, some ads will be delivered after the batch is empty, but it's better that the batch is actually depleted than almost depleted for a long time.
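A toy version of that handout logic (the 75% factor is from the paragraph above; the cap and rate window are my assumptions):

SERVE_FRACTION = 0.75  # hand out 75% of the remaining impressions
SERVE_CAP = 10_000     # configured maximum per handout

remaining = {}         # batch_id -> impressions left in the batch

def deduct_and_get_next_rate_and_max(batch_id, ads_served, window_s=60):
    remaining[batch_id] = max(remaining[batch_id] - ads_served, 0)
    serve_max = min(int(remaining[batch_id] * SERVE_FRACTION), SERVE_CAP)
    # Slow a nearly-empty batch down rather than stopping it abruptly.
    rate = max(serve_max // window_s, 1)
    return rate, serve_max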