Permission denied when scheduling Vertex Pipelines - google-cloud-functions

I want to schedule a Vertex Pipeline and, for now, deploy it from my local machine.
I have defined my pipeline, and it runs fine when I deploy it once using create_run_from_job_spec on AIPlatformClient.
When I try to schedule it with create_schedule_from_job_spec, the Cloud Scheduler job is created correctly, with an HTTP endpoint pointing to a Cloud Function. But when the scheduler runs, it fails with a Permission denied error. I have tried several service accounts with owner permissions on the project.
Do you know what could have gone wrong?
Since AIPlatformClient from Kubeflow Pipelines raises a deprecation warning, I also want to use PipelineJob from google.cloud.aiplatform, but I can't see any direct way to schedule the pipeline execution.

I've spent about 3 hours banging my head on this too. In my case, what seemed to fix it was either:
disabling and re-enabling the Cloud Scheduler API. Why did I do this? There is supposed to be a service account called service-[project-number]@gcp-sa-cloudscheduler.iam.gserviceaccount.com. If it is missing, re-enabling the API might recreate it
for older projects there is an additional step: https://cloud.google.com/scheduler/docs/http-target-auth#add
Simpler explanations include missing one of the following steps:
create a service account for the scheduler job, granting it the Cloud Functions Invoker role during creation
use this service account (see create_schedule_from_job_spec below)
find the (sneaky) Cloud Function that was created for you - it will be called something like 'templated_http_request-v1' - and add your service account as a Cloud Functions Invoker on it
response = client.create_schedule_from_job_spec(
    job_spec_path=pipeline_spec,
    schedule="*/15 * * * *",
    time_zone="Europe/London",
    parameter_values={},
    cloud_scheduler_service_account="<your-service-account>@<project_id>.iam.gserviceaccount.com"
)
If you are still stuck, it is also useful to run gcloud scheduler jobs describe <pipeline-name>, as it really helps to understand what the scheduler is doing. You'll see the Cloud Function URL and the POST payload, which is base64-encoded and contains the pipeline YAML, and you'll see that it uses OIDC/service account auth. Also useful is to view the code of the 'templated_http_request-v1' Cloud Function (sneakily created!). I was able to invoke the Cloud Function from Postman using the payload obtained from the scheduler job.
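On the deprecation point in the question: with google.cloud.aiplatform there is no single call equivalent to create_schedule_from_job_spec, but the usual pattern is to wrap the submission in a Cloud Function and have Cloud Scheduler trigger it. A minimal sketch, assuming an HTTP-triggered function; the project, region, bucket paths, and service account below are placeholders, not values from the question:

from google.cloud import aiplatform

def run_pipeline(request):
    # One pipeline run per invocation; Cloud Scheduler supplies the cadence.
    aiplatform.init(project="<project_id>", location="<region>")
    job = aiplatform.PipelineJob(
        display_name="scheduled-pipeline",
        template_path="gs://<bucket>/pipeline.json",  # compiled pipeline spec
        pipeline_root="gs://<bucket>/pipeline-root",
        parameter_values={},
    )
    # submit() returns without waiting; use run() to block until completion.
    job.submit(service_account="<your-service-account>@<project_id>.iam.gserviceaccount.com")
    return "submitted"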

Related

Unable to authenticate HTTP function call from Google Cloud Scheduler

I have created an HTTP Google Cloud Function that does not allow unauthenticated requests.
I have created a service account in the project with one role: Cloud Functions Invoker.
This service account is listed as a principal for my HTTP Cloud Function and shows that role.
I have created a Cloud Scheduler job to run this function.
In the job, I've specified that I want it to obtain an OIDC token for authenticating requests to the HTTP function.
Whenever I trigger the job, it fails with a message indicating the request is unauthenticated.
Things I've tried:
Recreate the function
Recreate the job
Use a different user (the main service account user - that one doesn't work either)
Do a POST instead of a GET from the scheduler job (I've successfully created scheduled jobs for authenticated HTTP functions before, but this is the first time I've done a GET - just grasping at straws really)
Did I miss something? Any idea why it is coming back with the "Unauthenticated" message?
I revisited this today. My IAP-protected HTTP function expects a query string parameter to be passed into it. The Cloud Platform web UI automatically sets the audience to the same URL (including the parameter) when creating the scheduled job. I figured Google knows what they are doing, so I left it that way originally.
Out of desperation I tried removing this parameter from the audience, and that made the authentication work properly.
So, I changed the audience from
https://<myProject>.cloudfunctions.net/myFunction?p=abc
to
https://<myProject>.cloudfunctions.net/myFunction
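For the record, the same fix applies when creating the job programmatically. A hypothetical sketch with the google-cloud-scheduler client (project, region, and account names are placeholders): the URI keeps the query parameter, the audience does not.

from google.cloud import scheduler_v1

client = scheduler_v1.CloudSchedulerClient()
parent = client.common_location_path("<myProject>", "<region>")
job = scheduler_v1.Job(
    name=f"{parent}/jobs/my-job",
    schedule="0 * * * *",
    http_target=scheduler_v1.HttpTarget(
        uri="https://<myProject>.cloudfunctions.net/myFunction?p=abc",  # parameter stays here
        http_method=scheduler_v1.HttpMethod.GET,
        oidc_token=scheduler_v1.OidcToken(
            service_account_email="<invoker-sa>@<myProject>.iam.gserviceaccount.com",
            audience="https://<myProject>.cloudfunctions.net/myFunction",  # no parameter here
        ),
    ),
)
client.create_job(parent=parent, job=job)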

Trigger of a cloud function by a cloud storage bucket in different project

I have a requirement where I need to trigger a Cloud Function, which in turn triggers a Dataflow job, once a file is placed in a Google Cloud Storage bucket of another project. Google documentation says it's not possible; see https://cloud.google.com/functions/docs/calling/storage
However, I tried doing this and my Cloud Function failed to deploy with an error.
It looks like this is a permission issue, and if the required permissions are given, this will work.
Do I need to add the @appspot.gserviceaccount.com account of the project from which I am trying to access the bucket (project A) and give it owner permission in the other project (project B)?
So if the above is true, in my project B IAM page I will see the two entries below:
@appspot.gserviceaccount.com OWNER
@appspot.gserviceaccount.com EDITOR
Any input on this is much appreciated.
It's not possible to catch events from other projects for now. But there is a workaround:
In the project with the bucket, create a Pub/Sub notification on Cloud Storage
On the topic that you created, create a push subscription. Use the Cloud Functions URL, and secure the PubSub call (you can get inspiration from there; if you are stuck, let me know and I will take more time to describe this part)
On the Cloud Function, grant the PubSub service account the cloudfunctions.invoker role
EDIT 1
The security part isn't so easy at the beginning. In your project B (where you have your Cloud Storage), you have created a PubSub topic. On this topic you create a push subscription with a service account created in project B. Take care to fill in the audience correctly.
Then, you need to grant this project B service account the roles/cloudfunctions.invoker role on the Cloud Function in project A:
# Create the service account
gcloud iam service-accounts create pubsub-push --project=<projectB>

# Create the push subscription (note: the subscription needs a name of its own)
gcloud pubsub subscriptions create <subscriptionName> \
    --push-endpoint=https://<region>-<projectA>.cloudfunctions.net/<functionName> \
    --push-auth-service-account=pubsub-push@<projectB>.iam.gserviceaccount.com \
    --push-auth-token-audience=https://<region>-<projectA>.cloudfunctions.net/<functionName> \
    --topic=<GCSNotifTopic> --project=<projectB>

# Grant the service account
gcloud functions add-iam-policy-binding <functionName> \
    --member=serviceAccount:pubsub-push@<projectB>.iam.gserviceaccount.com \
    --role=roles/cloudfunctions.invoker --project=<projectA>
Last traps:
The Cloud Function in project A has a different signature depending on whether it is an HTTP function (callable by a PubSub push subscription) or a background function (callable by events, such as Cloud Storage events). You need to update it according to the documentation.
The PubSub message sent to the Cloud Function is also slightly different from a direct Cloud Storage event. Take care to update the input parameters accordingly (see the sketch below).
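To illustrate that second trap, here is a minimal sketch of what the HTTP function in project A might look like when unwrapping the push message (the handler name is a placeholder; the envelope shape is the standard PubSub push wrapper):

import base64
import json

def gcs_notification_handler(request):
    # PubSub push wraps the message in a JSON envelope:
    # {"message": {"data": "<base64>", "attributes": {...}}, "subscription": "..."}
    envelope = request.get_json()
    payload = base64.b64decode(envelope["message"]["data"]).decode("utf-8")
    event = json.loads(payload)  # the Cloud Storage notification itself
    print(f"Object {event['name']} changed in bucket {event['bucket']}")
    return ("", 204)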
Google documentation says it's not possible
This is all you need to know. It is not possible, and there are no workarounds, regardless of what sort of error messages you might see.

GCP large PubSub message messing up Cloud Function trigger

I have deployed a simple PubSub Cloud Function trigger using this tutorial: https://medium.com/@milosevic81/copy-data-from-pub-sub-to-bigquery-496e003228a1
For testing, I pushed a large (over 8 MB) message to the PubSub topic.
As a result, the Cloud Function returned the following error message to the log: Function execution could not start, status: 'request too large'
The issue is that the Cloud Function started to fire constantly, producing continuous resource usage and log messages. It stopped only after I manually purged the related PubSub topic.
Is there a mechanism/configuration to prevent such behavior? Ideally, a PubSub message should not be picked up again after the Cloud Function trigger executes.
You reached the quotas of Cloud Functions:
Max uncompressed HTTP request size -> 10 MB
One solution is to use Cloud Run (the quota is higher: 32 MB).
For this, you need several changes:
Convert your Cloud Function to Cloud Run. I wrote an article (not dedicated to this, but you have an example in Python), and I presented this at GDG Ahmedabad last month, in Go this time.
Create a push subscription on your PubSub topic and use the Cloud Run HTTPS endpoint in the "push" HTTP field.
Cloud Run can handle up to 80 concurrent requests on one instance; Cloud Functions handles only one. Because your requests are "big", processing too many requests in the same instance might cause memory issues. You can control this with the Cloud Run --concurrency param; set it to 1 to get the same behavior as Cloud Functions. A sketch of such a service follows.
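For illustration, a minimal version of that service, assuming Flask (which the answer does not prescribe):

import base64

from flask import Flask, request

app = Flask(__name__)

@app.route("/", methods=["POST"])
def receive():
    envelope = request.get_json()
    # The large payload arrives base64-encoded inside the push envelope.
    data = base64.b64decode(envelope["message"]["data"])
    # ... process data ...
    return ("", 204)  # a 2xx acks the message so PubSub stops redelivering it

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)

Deploying it with gcloud run deploy --concurrency=1 keeps one large request per instance, as described above.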

Error on Google Cloud Function deploy, service account doesn't exist

Please can you help me? I'm receiving this error when I try to deploy a Google Cloud Function:
HTTP Error: 400, Default service account 'project-name@appspot.gserviceaccount.com' doesn't exist. Please recreate this account (for example by disabling and enabling the Cloud Functions API), or specify a different account.
The command used to deploy is:
firebase deploy --only functions
A temporary solution is fine, but helping me solve it permanently would be better.
Thanks in advance.
In my case, the App Engine default service account was deleted. It looks like this: {project_id}@appspot.gserviceaccount.com
So I had to restore the service account like this:
You can recover deleted service accounts with the undelete API: https://cloud.google.com/iam/reference/rest/v1/projects.serviceAccounts/undelete
You have to get the UniqueID of the service account from the activity log: https://console.cloud.google.com/home/activity
Source: https://stackoverflow.com/a/55277567/888881
The API explorer is an easy way to use the IAM API: https://developers.google.com/apis-explorer/#p/
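If you prefer to script the undelete call, a hypothetical sketch with google-api-python-client (the unique ID is a placeholder; read the real one from the activity log):

from googleapiclient import discovery

iam = discovery.build("iam", "v1")
# Use the numeric unique ID from the activity log, not the account email.
unique_id = "<unique_id>"
iam.projects().serviceAccounts().undelete(
    name=f"projects/-/serviceAccounts/{unique_id}", body={}
).execute()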
I was struggling to resolve this issue, then I raised a case with Google.
Here is a detailed article on my learnings:
https://medium.com/@ashirazee/http-error-400-default-service-account-appspot-gserviceaccount-com-accd178ea32a
Firstly, navigate to Google Cloud Platform and view your service accounts.
Try to find <project_id>@appspot.gserviceaccount.com in your list of service accounts for the Firebase project; it is linked to App Engine.
If <project_id>@appspot.gserviceaccount.com is missing, you cannot deploy anything (SEE EMAIL WITH GOOGLE BELOW). If it isn't missing, check whether it's enabled; try disabling it and enabling it again.
<project_id>@appspot.gserviceaccount.com is pre-installed by default, regardless of whether you have a paid account or not. Try to recall whether you may have deleted it at any time, before or after deployment.
Now, if you deleted it more than 30 days ago, you cannot retrieve it, and you must create a new Firebase project. However, if it is within 30 days, you can undelete it.
EMAIL FROM GOOGLE:
Email #1
"
Hello Ali
I am checking the logs of your project, unfortunately the service account was deleted on Ma, there is no chance to recover it nor recreate it
The only workaround available is to create a new project and deploy the service desired there. I know this could not be the best option for you nevertheless it is the way this works by design.
Do not hesitate to write back if you have more questions.
Cheers,"
Email #2,
"Hello Ali
I am glad to read that you have been able to deploy your functions successfully, unfortunately that service account cannot be recovered after 30 days of being deleted and that is the only solution. If you have other questions, please let us know by contacting us again through our support channel.
Cheers,"
Lastly, here is a helpful command that will help you debug this; however, it won't help if there is no service account, it'll just highlight the obvious:
firebase deploy --only functions --debug
This was my error:
"HTTP Error: 400, Default service account '<project_id>@appspot.gserviceaccount.com' doesn't exist. Please recreate this account (for example by disabling and enabling the Cloud Functions API), or specify a different account."
Following the error message, you could enable the API from the console by accessing this URL and enabling the API.
Or by gcloud command:
gcloud services enable cloudfunctions.googleapis.com --project=<project_id>
As the others stated, sadly, you need to use the default service account.
If you are within the 30-day period, you can use this guide to find and undelete the service account: https://cloud.google.com/iam/docs/creating-managing-service-accounts#undeleting_a_service_account
You have to enter the commands through the Google Cloud Console (one of the buttons will open a terminal on the right of the top blue app bar).
Try to select the Google Cloud Platform (GCP) resource location in the Firebase settings.

Run Google Cloud Function at a specific time

I'd like to schedule the execution of a Cloud Function at a specific time. It should run only once.
I basically have a function "startTask" that modifies some data in the Firestore database. After X seconds (the time is passed to the startTask function), the "finishTask" function should be called.
I already tried messing around with Google Cloud Tasks, but I feel like this isn't the right way to go.
Google Cloud does not have a service that will do what you need, as far as I am aware. If you need X to happen N seconds after a user does Y, you will need to code that service yourself.
You do not specify what services you are using for compute (App Engine, Compute Engine, Kubernetes, etc.), but writing a task scheduling service in just about any language is not very hard. There are many ways to accomplish this (client-side or server-side code). Many OS / language combinations support scheduling a function with a timeout and callback.
You can use Cloud Tasks. It lets you schedule a call for X seconds in the future (see the sketch below).
https://cloud.google.com/tasks/docs/creating-http-target-tasks
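A minimal sketch of that approach with the google-cloud-tasks client - scheduling a single POST to a hypothetical finishTask HTTP function X seconds from now (project, region, queue, and account names are placeholders):

from datetime import datetime, timedelta, timezone

from google.cloud import tasks_v2
from google.protobuf import timestamp_pb2

client = tasks_v2.CloudTasksClient()
parent = client.queue_path("<project_id>", "<region>", "<queue_name>")

# Fire once, X seconds from now.
delay_seconds = 60
schedule_time = timestamp_pb2.Timestamp()
schedule_time.FromDatetime(datetime.now(timezone.utc) + timedelta(seconds=delay_seconds))

task = tasks_v2.Task(
    http_request=tasks_v2.HttpRequest(
        http_method=tasks_v2.HttpMethod.POST,
        url="https://<region>-<project_id>.cloudfunctions.net/finishTask",
        oidc_token=tasks_v2.OidcToken(
            service_account_email="<invoker-sa>@<project_id>.iam.gserviceaccount.com"
        ),
    ),
    schedule_time=schedule_time,
)
client.create_task(parent=parent, task=task)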
The easiest way is to create a Pub/Sub topic, cron-topic, that your Cloud Function subscribes to. Cloud Scheduler can push an event to cron-topic on a schedule.
Create the Topic & Subscription
gcloud pubsub topics create cron-topic
# create cron-sub for testing. The function will create its own subscription
gcloud pubsub subscriptions create cron-sub --topic cron-topic
Create the Schedule
The command is below, but since it's in beta, see the console guide here
# send a message every 3 hours. For testing use `*/2 * * * *` for every 2 min
gcloud beta scheduler jobs create pubsub cron-job --topic=cron-topic --schedule='0 */3 * * *' --message-body='tick'
Create a Function to Consume the cron-topic Topic
Put your function code in the current directory and use this command to deploy the function listening to the cron-topic topic
FUNCTION_NAME=cron-topic-listener
gcloud functions deploy ${FUNCTION_NAME} --runtime go111 --trigger-topic cron-topic
Note: Pub/Sub events are delivered at least once. In some cases an event can be sent more than once, so make sure your function is idempotent (a sketch of the listener follows).
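For completeness, a minimal sketch of the listener, assuming the Python runtime instead of the Go one used in the deploy command above:

import base64

def cron_topic_listener(event, context):
    # Background-function signature for --trigger-topic deployments.
    data = base64.b64decode(event["data"]).decode("utf-8") if "data" in event else ""
    # Delivery is at least once: key any side effects on context.event_id
    # so a redelivered message does not repeat work.
    print(f"Received message {context.event_id}: {data}")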