Is TTL implemented in Pika?

I'd like my queue to drop messages not processed within a certain time.
I already do this in the consumer by recording the publish time. However, in the case that no one is subscribing, it would be better for the queue to simply drop stale messages.
Can I set an expiry time (TTL) on messages in Pika? The RabbitMQ docs talk about it, but I don't see TTL references in the Pika docs.

You can set a per-message TTL using the expiration property of the BasicProperties object, as shown in the pika documentation here.
Using it would look something like this.
channel.basic_publish(
    exchange='',
    routing_key='hello_world',
    properties=pika.BasicProperties(
        expiration='60000',
    ),
    body='my message'
)
Keep in mind that the expiration is expressed in milliseconds, passed as a string, so '60000' translates to 60 seconds.
You can read more about message-based TTL and its caveats here.

Related

Group messages by key for pulsar key_shared subscription type

Say I have an unbounded set of keys for messages published to a pulsar topic.
{k0, k1, ..., kn}
And a finite set of expected message categories, where the category information is part of the message payload.
{c0, c1, c2}
Whenever all message categories for a given key are consumed I want to invoke an action in my application. For example, if I see the following key/category pairs, I would expect to see an action invoked.
{(k0, c0), (k0, c1), (k0, c2)} => action invoked for key k0
{(k1, c0), (k1, c1), (k1, c2)} => action invoked for key k1
In order to ensure application resiliency I only ack messages once all categories have been consumed. If a message pertaining to the same category is consumed twice I can ack the older message, holding on to one message per category.
Now, let's say I have a single consumer attached to the subscription and configured with the key_shared subscription type. We consume the following key/category pairs.
{(k0, c0), (k0, c1)}
And while waiting for (k0, c2) a second consumer is added to the subscription. According to this issue, the new consumer will not receive messages until the existing consumer acks or nacks the pending messages. This seems to be expected behaviour, and is indeed the behaviour I am seeing.
I am wondering if there is a more idiomatic way I can go about implementing this feature. Does it make sense to delay acking of messages in order to achieve this grouping behaviour?
Using a partitioned topic with the failover subscription type achieves our design goal. Below is a description of the approaches we explored and the behaviour we observed.
Non-partitioned topic with key_shared subscription
When the application is scaled out (more consumers added to the subscription), any pending messages (messages delivered to a consumer, but not yet acked) cause the new consumer to not receive any messages until the pending messages are acked/nacked or the pre-existing consumer unsubscribes.
Partitioned topic with failover subscription
When the application is scaled out, topic/partition pairs are re-assigned evenly across consumers and pending messages (if any) are re-delivered. Consumers need to be informed when ownership of a topic partition changes so they can clear their internal state; for this, the consumer event listener can be used, as sketched below.
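A minimal sketch of that failover setup with the Pulsar Java client; the service URL, topic, and subscription name are placeholders, and the listener bodies only indicate where the state handling would go:

import org.apache.pulsar.client.api.Consumer;
import org.apache.pulsar.client.api.ConsumerEventListener;
import org.apache.pulsar.client.api.PulsarClient;
import org.apache.pulsar.client.api.PulsarClientException;
import org.apache.pulsar.client.api.SubscriptionType;

public class FailoverConsumerSketch {
    public static void main(String[] args) throws PulsarClientException {
        PulsarClient client = PulsarClient.builder()
                .serviceUrl("pulsar://localhost:6650")   // placeholder service URL
                .build();

        Consumer<byte[]> consumer = client.newConsumer()
                .topic("persistent://public/default/my-partitioned-topic")  // placeholder topic
                .subscriptionName("category-grouping")                      // placeholder name
                .subscriptionType(SubscriptionType.Failover)
                .consumerEventListener(new ConsumerEventListener() {
                    @Override
                    public void becameActive(Consumer<?> c, int partitionId) {
                        // This consumer now owns the partition: pending messages will be
                        // re-delivered here, so per-key category state can be rebuilt.
                    }

                    @Override
                    public void becameInactive(Consumer<?> c, int partitionId) {
                        // Ownership moved to another consumer: clear any partially
                        // collected category state for keys on this partition.
                    }
                })
                .subscribe();

        // ... receive messages and track categories per key here ...
    }
}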

Looking for an example of a complete OBD-II data frame

I'm developing an OBD-II reader with an STM32 processor, where I want to send requests to read PID parameters. I already understand what should go in the data field, but the ID is giving me a headache. As I have read, one must send 0x7DF to broadcast a request, and each ECU will respond with its own ID. However, I have been asked to do this within the SAE J1939 protocol, which uses the 29-bit extended identifier, and I don't know what I need to add to this ID.
As I stated in the title, could someone show me some actual data from a bus using this method? I've been searching the internet for real frames but have had no luck so far.
I would also appreciate it if someone could shed some light on whether OBD-II communication needs some acknowledgment to work properly.
Thanks
I would suggest taking a look at the SAE J1939 documentation, more specifically J1939/21, J1939-71, and J1939/73.
Generally, a J1939 transport protocol response sequence can be processed as follows:
Identify the BAM frame, indicating that a new sequence is being initiated (PGN 60416 - 0xEC00, which can be reached via ID 0x1CECFF00)
Extract the J1939 PGN from bytes 6-8 of the BAM payload to use as the identifier of the new frame
Construct the new data payload by concatenating bytes 2-8 of the data transfer frames (i.e. excluding the 1st byte)
The J1939 data transfer messages use ID 0x1CEBFF00 (PGN 60160, or 0xEB00).
Above, the last 3 bytes of the BAM equal E3FE00. When reordered, these equal the PGN FEE3, aka Engine Configuration 1 (EC1). Further, the payload is found by combining the first 39 bytes across the 6 data transfer packets/frames.
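To make those steps concrete, here is a rough, hypothetical Java sketch of the reassembly; CanFrame is a made-up container for a 29-bit ID plus 8 data bytes and is not part of any CAN library:

import java.util.ArrayList;
import java.util.List;

public class BamReassembler {

    // Hypothetical holder for a received extended-ID frame.
    public record CanFrame(int id, byte[] data) {}

    // Reassembles a BAM sequence, assuming the frames are given in arrival order.
    public static byte[] reassemble(List<CanFrame> frames) {
        int announcedSize = 0;
        int packedPgn = 0;
        List<byte[]> chunks = new ArrayList<>();

        for (CanFrame f : frames) {
            int pfPs = (f.id() >> 8) & 0xFFFF;  // PDU format + PDU specific fields of the 29-bit ID

            if (pfPs == 0xECFF && (f.data()[0] & 0xFF) == 0x20) {
                // TP.CM with control byte 0x20 = BAM (e.g. ID 0x1CECFF00):
                // bytes 2-3 carry the total size, bytes 6-8 carry the packed PGN (little-endian).
                announcedSize = (f.data()[1] & 0xFF) | ((f.data()[2] & 0xFF) << 8);
                packedPgn = (f.data()[5] & 0xFF)
                        | ((f.data()[6] & 0xFF) << 8)
                        | ((f.data()[7] & 0xFF) << 16);
            } else if (pfPs == 0xEBFF) {
                // TP.DT (e.g. ID 0x1CEBFF00): byte 1 is the sequence number,
                // bytes 2-8 are the next 7 bytes of the packed payload.
                byte[] chunk = new byte[7];
                System.arraycopy(f.data(), 1, chunk, 0, 7);
                chunks.add(chunk);
            }
        }

        // Concatenate the 7-byte chunks and truncate to the size announced in the BAM.
        byte[] payload = new byte[announcedSize];
        int offset = 0;
        for (byte[] chunk : chunks) {
            int n = Math.min(chunk.length, announcedSize - offset);
            System.arraycopy(chunk, 0, payload, offset, n);
            offset += n;
        }
        System.out.printf("Reassembled PGN %04X with %d bytes%n", packedPgn, announcedSize);
        return payload;
    }
}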
The administrative control device, or any device issuing the vehicle use status PID, should be sensitive to the run switch status (SPN 3046 - 0xFDC0, which can probably be reached via 0x0CFDC000) and any other locally defined criteria for authorized use (i.e., driver log-ons) before the vehicle use status PID is used to generate an unauthorized-use alarm.
Also, don't forget to read and send using extended-ID messages, since J1939 uses the 29-bit identifier.
In fact, I would suggest using can-utils to make your analysis even easier. With a simple candump or cansniffer you can see what is coming in on your bus.
Some cars' DBC files: https://github.com/commaai/opendbc

Caffeine - How to set for each entity its own "expiration time"

We have been using the Guava cache and we want to switch to Caffeine.
We want to set for each entity its own "expiration time", something like - put(K key, V value, long expiration_time).
I saw the three functions above and I wonder what exactly they do; if you can explain the meaning and operation of each of them, that would be great.
For example, should the return value of expireAfterCreate be the duration we want for this entity from its creation until its expiration, or something else?
I'm also wondering why we have the parameter "currentTime" in both expireAfterRead and expireAfterUpdate if we don't use it in the function?
When we used the Guava cache we used expireAfterAccess; what is the substitute for it in Caffeine?
My last question is: how can I set a default expiration time for entities without a unique one?
Thank you,
May
When we used the Guava cache we used expireAfterAccess; what is the substitute for it in Caffeine?
We mirror the Guava API, so this is also available on the cache builder.
My last question is: how can I set a default expiration time for entities without a unique one?
Use expireAfterAccess, expireAfterWrite, or return a constant duration with expireAfter(Expiry).
I saw the three functions above and I wonder what exactly they do; if you can explain the meaning and operation of each of them, that would be great.
Expiry is a callback interface where a single timestamp value is updated. The invoked method corresponds to the operation performed on the cache entry (created, updated, read). An update or read that should have no effect can return currentDuration to no-op.
For example, should the return value of expireAfterCreate be the duration we want for this entity from its creation until its expiration, or something else?
Yes. However, if expireAfterUpdate returns a custom value (something other than currentDuration), then that overrides the prior expiration duration.
I'm also wondering why we have the parameter "currentTime" in both expireAfterRead and expireAfterUpdate if we don't use it in the function?
This can most often be ignored, but is provided if somehow useful. It is the current nano timestamp from the Ticker (not wall clock time).
We want to set for each entity its own "expiration time", something like - put(K key, V value, long expiration_time).
The callback Expiry is required and generally recommended, because ideally entries are loaded through the cache to avoid stampedes (e.g. LoadingCache). A stampede is when multiple threads look up the same entry, miss, load it, and overwrite each other putting it in. That wastes work, rather than having only one thread perform the load while the others wait for the result.
That said, this method is available under Cache.policy().expiresVariably(). Those configuration-specific methods are stashed in that area to offer more power when deemed necessary.
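For illustration, a minimal sketch of per-entry expiration via the Expiry callback on the builder; MyValue is a hypothetical value type that carries its own time-to-live, standing in for your entity:

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.Expiry;
import java.time.Duration;

class MyValue {
    final Duration ttl;
    MyValue(Duration ttl) { this.ttl = ttl; }
}

public class VariableExpirySketch {
    public static void main(String[] args) {
        Cache<String, MyValue> cache = Caffeine.newBuilder()
                .expireAfter(new Expiry<String, MyValue>() {
                    @Override
                    public long expireAfterCreate(String key, MyValue value, long currentTime) {
                        // Duration (in nanoseconds) from creation until expiration.
                        return value.ttl.toNanos();
                    }

                    @Override
                    public long expireAfterUpdate(String key, MyValue value,
                                                  long currentTime, long currentDuration) {
                        // Reset the lifetime on update; return currentDuration to leave it unchanged.
                        return value.ttl.toNanos();
                    }

                    @Override
                    public long expireAfterRead(String key, MyValue value,
                                                long currentTime, long currentDuration) {
                        // Reads do not extend the lifetime in this sketch.
                        return currentDuration;
                    }
                })
                .build();

        cache.put("short-lived", new MyValue(Duration.ofSeconds(30)));
        cache.put("long-lived", new MyValue(Duration.ofHours(1)));
    }
}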
Thank you,
You're very welcome.

Duplicates on Apache Beam / Dataflow inputs even when using withIdAttribute

I am trying to ingest data from a 3rd party API into a Dataflow pipeline. Since the 3rd party doesn't make webhooks available, I wrote a custom script that constantly polls their endpoint for more data.
The data is refreshed every 15 minutes, but since I don't want to miss any datapoints and I want to consume them as soon as new data is available, my "crawler" runs every minute. The script then sends the data to a PubSub topic. It is easy to see that PubSub will receive about 15 repeated messages for each datapoint in the source.
My first attempt to identify and discard those repeated messages was to add a custom attribute to each PubSub message (eventid), created from a hash of its [ID + updated_time] at source.
const attributes = {
  eventid: Buffer.from(`${item.lastupdate}|${item.segmentid}`).toString('base64'),
  timestamp: item.timestamp.toString()
};
const dataBuffer = Buffer.from(JSON.stringify(item));
publisher.publish(dataBuffer, attributes);
Then I configured Dataflow with withIdAttribute() (which is the new idLabel(), based on Record IDs).
PCollection<String> input = p
    .apply("ReadFromPubSub", PubsubIO
        .readStrings()
        .fromTopic(String.format("projects/%s/topics/%s", options.getProject(), options.getIncomingDataTopic()))
        .withTimestampAttribute("timestamp")
        .withIdAttribute("eventid"))
    .apply("OutputToBigQuery", ...)
With that implementation, I was expecting that when the script sends the same datapoint a second time, the repeated eventid would be the same and the message discarded. But for some reason, I still see duplicates on the output dataset.
Some questions:
Is there a clever way to ingest the data to dataflow from that 3rd party API if they don't provide webhooks?
Any ideas on why Dataflow is not discarding the messages in this situation?
I know about the 10-minute restriction for deduplication on Dataflow, but I see duplicated data even on the 2nd insertion (2 minutes).
Any help will be greatly appreciated!
I think you are on the right track. Instead of the hash, I recommend using timestamps. A better way to do this is by using windows; review this document, which filters data that is outside of the window.
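As a rough sketch (not part of the original answer), the input collection built in the question's snippet could be placed into fixed one-minute windows with the Beam Java SDK; a per-window deduplication step could then follow:

import org.apache.beam.sdk.transforms.windowing.FixedWindows;
import org.apache.beam.sdk.transforms.windowing.Window;
import org.apache.beam.sdk.values.PCollection;
import org.joda.time.Duration;

// "input" is the PCollection<String> from the question's pipeline above.
PCollection<String> windowed = input
    .apply("FixedWindows",
        Window.<String>into(FixedWindows.of(Duration.standardMinutes(1))));
// A deduplicating transform (e.g. Distinct.create()) could be applied per window here.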
Regarding the additional duplicate data: if you are using pull subscriptions and the acknowledgement deadline is reached before the data has been processed, the message will be resent, as per at-least-once delivery. In this case, change the acknowledgement deadline; the default is 10 seconds.
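The deadline can be raised on the subscription itself, for example with gcloud (the subscription name below is a placeholder):

gcloud pubsub subscriptions update my-subscription --ack-deadline=60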

Issue when sending SMTP Emails

I am trying to send a mass mailing campaign using PHPList. I have everything working as I need but I am getting an error message from emails sent to Google.
This error occurs in the header of the message:
Received-SPF: permerror (google.com: permanent error in processing during lookup of bounce#planemover.com: exceeds recursive limit) client-ip=xxx.xx.xxx.xx;
Authentication-Results: mx.google.com;
spf=permerror (google.com: permanent error in processing during lookup of bounce#planemover.com: exceeds recursive limit) smtp.mailfrom=bounce#planemover.com
Does anyone know what would cause this error? Will this error cause my domain to be blacklisted?
At the present time, the SPF record on your domain is:
planemover.com. 3600 IN TXT "v=spf1 mx a ip4:71.122.219.173 ip4:71.122.219.172 a:mx1.selling-ac.com include:selling-ac.com include:planemover.com ~all"
It contains an include: directive pointing back at itself. This results in an infinite loop (recursion), which is why Google reports that the lookup "exceeds recursive limit".
You need to remove include:planemover.com from this DNS record. The TTL on the record is 3600 (1 hour), so it may take up to 1 hour after the change is made on all of your hosting nameservers for it to become effective globally.
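For reference, the corrected record, i.e. the same mechanisms with only the self-referencing include removed, would look like this:

planemover.com. 3600 IN TXT "v=spf1 mx a ip4:71.122.219.173 ip4:71.122.219.172 a:mx1.selling-ac.com include:selling-ac.com ~all"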
Also, in the future, this kind of question is more appropriate for Server Fault. It's probably off-topic here on Stack Overflow.