GCP: large Pub/Sub message messing up Cloud Function trigger

I have deployed a simple Pub/Sub-triggered Cloud Function using this tutorial: https://medium.com/#milosevic81/copy-data-from-pub-sub-to-bigquery-496e003228a1
As a test, I pushed a large (over 8 MB) message to the Pub/Sub topic.
As a result, the Cloud Function wrote the following error message to the log: Function execution could not start, status: 'request too large'
The issue is that the Cloud Function kept firing constantly, producing continuous resource usage and log messages. It stopped only after I manually purged the related Pub/Sub topic.
Is there a mechanism/configuration to prevent such behavior? Ideally, the Pub/Sub message should not be picked up again after the Cloud Function trigger has executed.

You have reached one of the Cloud Functions quotas:
Max uncompressed HTTP request size -> 10 MB
One solution is to use Cloud Run (the quota is higher: 32 MB).
For this, you need several changes:
Convert your Cloud Function to Cloud Run. I wrote an article (not dedicated to this, but it includes an example in Python), and I presented this at GDG Ahmedabad last month, in Go this time.
Create a push subscription on your Pub/Sub topic and use the Cloud Run HTTPS endpoint in the "push" HTTP field (a sketch of this step is shown below).
Cloud Run can handle up to 80 concurrent requests per instance, while Cloud Functions handles only one. Because your requests are "big", processing too many requests in the same instance might cause memory issues. You can control this in Cloud Run with the --concurrency parameter; set it to 1 to get the same behavior as Cloud Functions.
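For the push-subscription step, here is a minimal sketch using the Pub/Sub Python client library; the project, topic, subscription, Cloud Run URL and service account below are placeholders, not values taken from the question:
from google.cloud import pubsub_v1

# All names below are illustrative placeholders.
project_id = "my-project"
subscriber = pubsub_v1.SubscriberClient()
topic_path = subscriber.topic_path(project_id, "big-messages")
subscription_path = subscriber.subscription_path(project_id, "big-messages-push")

# Push config pointing at the Cloud Run HTTPS endpoint. The OIDC token lets
# Pub/Sub authenticate against a Cloud Run service that is not public.
push_config = pubsub_v1.types.PushConfig(
    push_endpoint="https://my-service-xyz-uc.a.run.app/",
    oidc_token=pubsub_v1.types.PushConfig.OidcToken(
        service_account_email="pubsub-invoker@my-project.iam.gserviceaccount.com"
    ),
)

subscription = subscriber.create_subscription(
    request={
        "name": subscription_path,
        "topic": topic_path,
        "push_config": push_config,
    }
)
print(f"Created push subscription: {subscription.name}")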

Related

What happens if a Cloud Function cannot process inputs from a Pub/Sub topic as fast as they appear?

When I create a Google Cloud Function, I am offered to set a "Trigger" and the "Maximum number of instances".
What happens if a Pub/Sub triggered Cloud Function faces periods when it receives more messages than it can instantly process?
Concrete example: I am sending 200 strings to the Cloud Function per minute, for 5 minutes. One instance can process 10 strings in a minute, and the "Maximum number of instances" is 10, so in total, 100 strings per minute are processed. What will happen to the other half of the 200 strings: will they "wait" in the Pub/Sub topic until they are processed, or will these inputs get lost?
Initially, this would result in some messages not being processed (they will be nacked) and left behind. However, to work around this, you just need to enable the retry policy on the function.
Cloud Functions guarantees at-least-once execution of an event-driven function for each event emitted by an event source. However, by default, if a function invocation terminates with an error, the function will not be invoked again, and the event will be dropped. When you enable retries on an event-driven function, Cloud Functions will retry a failed function invocation until it completes successfully, or the retry window (by default, 7 days) expires.
You can enable retries by providing the --retry flag when creating the function via the gcloud command-line tool or checking the "Retry on failure" box when creating via the Cloud Console.
To update the subscription to use a retry policy, find the name of the subscription created by Cloud Functions in the Cloud Console Pub/Sub section or use the gcloud pubsub subscriptions update command with the necessary flags.
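For illustration, the subscription's retry policy can also be set with the Pub/Sub Python client library instead of the gcloud command; this is only a sketch, and the project and subscription names are placeholders:
from google.cloud import pubsub_v1
from google.protobuf import duration_pb2, field_mask_pb2

subscriber = pubsub_v1.SubscriberClient()
# Placeholder project and subscription; use the subscription Cloud Functions created.
subscription_path = subscriber.subscription_path("my-project", "my-function-subscription")

subscription = pubsub_v1.types.Subscription(
    name=subscription_path,
    retry_policy=pubsub_v1.types.RetryPolicy(
        minimum_backoff=duration_pb2.Duration(seconds=10),
        maximum_backoff=duration_pb2.Duration(seconds=600),
    ),
)
update_mask = field_mask_pb2.FieldMask(paths=["retry_policy"])

result = subscriber.update_subscription(
    request={"subscription": subscription, "update_mask": update_mask}
)
print(f"Updated retry policy on {result.name}")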
The messages/events just queue up in the subscription until they get processed. No need to specify retry behaviour with --retry.

Permission denied when running scheduled Vertex Pipelines

I wish to schedule a Vertex Pipeline and deploy it from my local machine for now.
I have defined my pipeline, which runs well when I deploy it once using create_run_from_job_spec on AIPlatformClient.
When trying to schedule it with create_schedule_from_job_spec, the Cloud Scheduler object is created correctly, with an HTTP endpoint pointing to a Cloud Function. But when the scheduler runs, it fails with a Permission denied error. I used several service accounts with owner permissions on the project.
Do you know what could have gone wrong?
Since AIPlatformClient from Kubeflow Pipelines raises a deprecation warning, I also want to use PipelineJob from google.cloud.aiplatform, but I can't see any direct way to schedule the pipeline execution.
I've spent about 3 hours banging my head on this too. In my case, what seemed to fix it was either:
disabling and re-enabling the Cloud Scheduler API. Why did I do this? There is supposed to be a service account called service-[project-number]@gcp-sa-cloudscheduler.iam.gserviceaccount.com. If it is missing, re-enabling the API might fix it
for older projects there is an additional step: https://cloud.google.com/scheduler/docs/http-target-auth#add
Simpler explanations include not doing some of the following steps:
creating a service account for the scheduler job. Grant it the Cloud Functions Invoker role during creation
using this service account (see create_schedule_from_job_spec below)
finding the (sneaky) Cloud Function that was created for you (it will be called something like 'templated_http_request-v1') and adding your service account as a Cloud Functions invoker
# Assumes the AIPlatformClient from the Kubeflow Pipelines SDK (kfp)
from kfp.v2.google.client import AIPlatformClient

client = AIPlatformClient(project_id="<project_id>", region="<region>")

response = client.create_schedule_from_job_spec(
    job_spec_path=pipeline_spec,
    schedule="*/15 * * * *",
    time_zone="Europe/London",
    parameter_values={},
    cloud_scheduler_service_account="<your-service-account>@<project_id>.iam.gserviceaccount.com",
)
If you are still stuck, it is also useful to run gcloud scheduler jobs describe <pipeline-name>, as it really helps to understand what the scheduler is doing. You'll see the Cloud Function URL and the POST payload, which is base64 encoded and contains the pipeline YAML, and you'll see that it is using OIDC/a service account for security. It is also useful to view the code of the 'templated_http_request-v1' Cloud Function (sneakily created!). I was able to invoke the Cloud Function from Postman using the payload obtained from the scheduler job.

How to prevent cloud scheduler from triggering a function more than once?

I'm triggering a Cloud Function every minute with Cloud Scheduler [* * * * *].
The Stackdriver logs indicate the function appears to have been triggered and run twice in the same minute. Is this possible?
Pub/Sub promises at-least-once delivery, but I assumed that GCP would automatically handle duplicate triggers for scheduler -> function workflows.
What is a good pattern for preventing this function from running more than once per minute?
Your function needs to be made "idempotent" in order to ensure that a message gets processed only once. In other words, you'll have to maintain state somewhere (maybe a database) that a message was processed successfully, and check that state to make sure a message doesn't get processed twice.
All non-HTTP type Cloud Functions provide a unique event ID in the context parameter provided to the function invocation. If you see a repeat event ID, that means your function is being invoked again for the same message, for whatever reason.
This need for idempotence is not unique to pubsub or cloud scheduler. It's a concern for all non-HTTP type background functions.
A full discussion on writing idempotent functions is a bit too much for a Stack Overflow answer, but there is a post in the Google Cloud blog that covers the issue pretty well.
See also: Cloud functions and Firebase Firestore with Idempotency
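A minimal sketch of what such an idempotent function could look like, using the event ID from the context as a deduplication key in Firestore (the collection name and the placeholder "work" step are assumptions for illustration):
from google.cloud import firestore

db = firestore.Client()

def handle_message(event, context):
    """Background Cloud Function triggered by Pub/Sub."""
    # "processed_events" is an arbitrary collection name used for deduplication.
    doc_ref = db.collection("processed_events").document(context.event_id)

    if doc_ref.get().exists:
        print(f"Event {context.event_id} already processed, skipping")
        return

    # ... do the actual work here (send the email, write to BigQuery, etc.) ...

    # Record the event ID only after the work has succeeded.
    doc_ref.set({"processed_at": firestore.SERVER_TIMESTAMP})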

Run Google Cloud Function at a specific time

I'd like to schedule the execution of a Cloud Function at a specific time. It should run only once.
I basically have a function "startTask" which modifies some data in the Firestore database. After X seconds (the time is passed to the startTask function), the "finishTask" function should be called.
I already tried messing around with Google Cloud Tasks but I feel like this isn't the right way to go.
Google Cloud does not have a service that will do what you need, as far as I am aware. If you need X to happen N seconds after a user does Y, you will need to code that service yourself.
You do not specify what services you are using for compute (App Engine, Compute Engine, Kubernetes, etc.), but writing a task scheduling service in just about any language is not very hard. There are many ways to accomplish this (client-side code / server-side code). Many OS / language combinations support scheduling a function with a timeout and callback.
You can use Cloud Tasks. It allows you to schedule a task that fires after X seconds.
https://cloud.google.com/tasks/docs/creating-http-target-tasks
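A rough sketch of that approach with the Cloud Tasks Python client library, creating an HTTP task that fires X seconds in the future (the project, location, queue and target URL are placeholders):
import datetime

from google.cloud import tasks_v2
from google.protobuf import timestamp_pb2

client = tasks_v2.CloudTasksClient()
# Placeholder project/location/queue; the queue must already exist.
parent = client.queue_path("my-project", "us-central1", "my-queue")

# Fire the task X seconds from now.
delay_seconds = 3600
schedule_time = timestamp_pb2.Timestamp()
schedule_time.FromDatetime(datetime.datetime.utcnow() + datetime.timedelta(seconds=delay_seconds))

task = {
    "http_request": {
        "http_method": tasks_v2.HttpMethod.POST,
        "url": "https://REGION-PROJECT.cloudfunctions.net/finishTask",  # placeholder endpoint
        "headers": {"Content-Type": "application/json"},
        "body": b'{"taskId": "123"}',
    },
    "schedule_time": schedule_time,
}

response = client.create_task(request={"parent": parent, "task": task})
print(f"Created task {response.name}")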
The easiest way is to create a Pub/Sub topic, cron-topic, that your Cloud Function subscribes to. Cloud Scheduler can publish an event to cron-topic on a schedule.
Create the Topic & Subscription
gcloud pubsub topics create cron-topic
# Create cron-sub for testing. The function will create its own subscription.
gcloud pubsub subscriptions create cron-sub --topic cron-topic
Create the Schedule
Command is below, but since it's beta, see the console guide here
# Send a message every 3 hours. For testing, use `*/2 * * * *` for every 2 minutes.
# "cron-job" is an arbitrary job name; --message-body is required for pubsub jobs.
gcloud beta scheduler jobs create pubsub cron-job --topic=cron-topic --schedule='0 */3 * * *' --message-body='{}'
Create a Function to Consume the cron-topic Topic
Put your function code in the current directory and use this command to deploy the function listening to the cron-topic topic
FUNCTION_NAME=cron-topic-listener
gcloud functions deploy ${FUNCTION_NAME} --runtime go111 --trigger-topic cron-topic
Note: Pub/Sub events are delivered at least once. In some cases an event can be delivered more than once. Make sure your function is idempotent.
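For illustration only (the deploy command above uses the Go runtime, so adjust --runtime if you copy this), a Pub/Sub-triggered handler in Python could look roughly like this:
import base64

def cron_topic_listener(event, context):
    """Background Cloud Function triggered by messages on cron-topic."""
    # The Pub/Sub payload arrives base64-encoded in event["data"], if present.
    payload = base64.b64decode(event["data"]).decode("utf-8") if "data" in event else ""
    print(f"Received cron tick {context.event_id}: {payload}")
    # ... do the scheduled work here ...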

Google Cloud Functions to only Ack Pub/Sub on success (Problem resolved by GCP)

An early version of Google Cloud Functions had a limitation with regard to retries when errors occurred. They have since provided enhancements that resolve this issue.
We are using a cloud function triggered by Pub/Sub to ensure delivery of an e-mail. Sometimes the e-mail service takes a long time to respond and our cloud function terminates before we get an error back. Since the message has already been acknowledged our e-mail gets lost.
The Cloud Function appears to ACK the Pub/Sub message automatically when it is invoked. Is there a way to delay the ACK until our code completes successfully? Alternatively, is there a way to catch timeouts and requeue the message for delivery? Is there something else we could try?
I heard from Google support that they do not currently provide a means to delay the ACK when a Cloud Function is invoked by Pub/Sub. If you want to use Cloud Functions with Pub/Sub, you need to handle the error case yourself. For example, you could have your Cloud Function requeue the message for retry with a retry count.
This would seem to make it unnecessarily difficult to guarantee execution with Pub/Sub and cloud functions.
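A rough sketch of the "requeue with a retry count" idea described above: republish the message with an incremented attribute and give up after a limit. The topic name, retry limit and the send_email helper are hypothetical placeholders:
import base64

from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
# Placeholder project/topic; in practice this is the topic the function is bound to.
topic_path = publisher.topic_path("my-project", "email-requests")

MAX_RETRIES = 5  # arbitrary limit for this sketch

def send_email(data: bytes) -> None:
    """Hypothetical helper: replace with the real call to the e-mail service."""
    ...

def send_email_handler(event, context):
    """Background Cloud Function triggered by Pub/Sub."""
    attributes = event.get("attributes") or {}
    retry_count = int(attributes.get("retry_count", 0))
    data = base64.b64decode(event["data"]) if "data" in event else b""

    try:
        send_email(data)
    except Exception as exc:
        if retry_count >= MAX_RETRIES:
            print(f"Giving up after {retry_count} retries: {exc}")
            return
        # Republish the same payload with an incremented retry counter so the
        # work gets another attempt on a fresh invocation.
        publisher.publish(topic_path, data, retry_count=str(retry_count + 1))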
This is a problem because Cloud Functions acking a message on invocation, even if the function crashes, prevents the use of the new dead-letter feature.
Also, it goes against the docs; see the note after this code sample:
https://cloud.google.com/functions/docs/calling/pubsub#sample_code