Is there any API I can use to get the total cost of a VM instance in Google compute?
The usage scenario is like this:
Server starts
Runs for some Hours / Days
Gets shut down
For reporting purposes, we get the cost of the server and save it in our DB
Thanks
Google has a system for exporting billing information each day as a CSV/JSON file to a storage bucket.
http://googlecloudplatform.blogspot.ca/2013/12/ow-get-programmatic-access-to-your-billing-data-with-new-billing-api.html
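If you enable that export, a small script can read the daily file from the bucket and total the Compute Engine line items. Below is a minimal sketch (not an official API), assuming a JSON export; the project ID, bucket name, file name, and line-item field names are placeholders to check against the actual export format:

    import json
    from google.cloud import storage

    client = storage.Client(project="my-project")        # placeholder project ID
    bucket = client.bucket("my-billing-export-bucket")   # placeholder bucket

    # The export writes one file per day; the exact name depends on your settings.
    blob = bucket.blob("billing-2014-01-01.json")
    line_items = json.loads(blob.download_as_text())

    # Sum the cost of every line item that refers to Compute Engine.
    total = sum(
        float(item["cost"]["amount"])
        for item in line_items
        if "compute-engine" in item.get("lineItemId", "")
    )
    print("Compute Engine cost for the day:", total)

Note that the export is broken down by SKU and project rather than by individual VM, so attributing cost to a single instance typically means combining the export with your own start/stop timestamps.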
I would like to analyze users that are currently on my website and assign them to a category (within 1 minute)
What I need is a kind of solution to export data from Google Analytics to a MySQL database within 30 seconds or less.
I have talked to Solution Engineers from HEVO and they told me that it isn't possible to send data from Google Analytics to a MySQL Database within 30 seconds or less.
You cannot do that, because Google Analytics has a processing latency (4 hours for GA360, and up to 24 hours for the free version) to assemble the reports. Only then is the data accessible via the API. (This applies to Universal Analytics; the new version of GA promises to cut down on the processing latency.)
If you have GA360 you can set up a BigQuery export and select the "streaming" option - this comes at a (moderate) fee, with streaming exports being slightly more expensive than daily exports. You can then run SQL queries in BigQuery, or export the data to any other database.
With the free version you are out of luck.
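If you do go the GA360 route, the streaming export lands in BigQuery tables that you can query directly and then push to MySQL yourself. A rough sketch; the project, dataset, table name, and fields are assumptions loosely based on the standard ga_sessions export schema, so verify them against your own export:

    from google.cloud import bigquery

    client = bigquery.Client(project="my-project")  # placeholder project ID

    # Placeholder dataset/table; the streaming export writes per-day realtime tables.
    query = """
    SELECT fullVisitorId, device.deviceCategory AS deviceCategory, COUNT(*) AS hits
    FROM `my-project.my_ga_dataset.ga_realtime_sessions_view_20200101`
    GROUP BY fullVisitorId, deviceCategory
    """

    for row in client.query(query).result():
        # From here you could categorize the visitor and write the result to MySQL.
        print(row.fullVisitorId, row.deviceCategory, row.hits)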
We are using a private GCP account and we would like to process 30 GB of data and do NLP processing using SpaCy. We wanted to use more workers and decided to start with a maximum number of workers of 80, as shown below. We submitted our job and ran into some of the GCP standard user quotas:
QUOTA_EXCEEDED: Quota 'IN_USE_ADDRESSES' exceeded. Limit: 8.0 in region XXX
So I decided to request new quotas of 50 for IN_USE_ADDRESSES in some region (it took me a few iterations to find a region that could accept this request). We submitted a new job and got new quota issues:
QUOTA_EXCEEDED: Quota 'CPUS' exceeded. Limit: 24.0 in region XXX
QUOTA_EXCEEDED: Quota 'CPUS_ALL_REGIONS' exceeded. Limit: 32.0 globally
My question is: if I want to use 50 workers, for example, in one region, which quotas do I need to change? The doc https://cloud.google.com/dataflow/quotas doesn't seem to be up to date, since it only says "To use 10 Compute Engine instances, you'll need 10 in-use IP addresses." As you can see above, this is not enough and other quotas need to be changed as well. Is there some doc, blog or other post where this is documented and explained? Just for one region there are 49 Compute Engine quotas that can be changed!
I would suggest that you start using private IPs instead of public IP addresses. This would help you in two ways:
You can bypass some of the IP address related quotas as they are related to Public IP addresses.
Reduce costs significantly by eliminating network egress charges, as the VMs would not be communicating with each other over the public internet. You can find more details in this excellent article [1].
To start using private IPs, please follow the instructions mentioned here [2].
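For a Beam Python pipeline, switching the Dataflow workers to private IPs is just a matter of pipeline options. A minimal sketch; the project, region, subnetwork, bucket and worker count are placeholders:

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions(flags=[
        "--runner=DataflowRunner",
        "--project=my-project",                                     # placeholder
        "--region=us-central1",                                     # placeholder
        "--temp_location=gs://my-bucket/tmp",                       # placeholder
        "--max_num_workers=50",
        "--subnetwork=regions/us-central1/subnetworks/my-subnet",   # placeholder
        "--no_use_public_ips",  # workers get private IPs only
    ])

    with beam.Pipeline(options=options) as pipeline:
        # Your SpaCy/NLP transforms would go here.
        (pipeline | beam.Create(["example"]) | beam.Map(print))

Keep in mind that a subnetwork used by workers without external IPs generally needs Private Google Access enabled so they can still reach Google APIs.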
Apart from this, you would need to take care of the following quotas:
CPUs
You can increase the quota for a given region by setting the CPUs quota under Compute Engine appropriately.
Persistent Disk
By default, each VM needs 250 GB of storage, so for 100 instances that would be around 25 TB. Please check the disk size of the workers that you are using and set the Persistent Disk quota under Compute Engine appropriately (a rough estimation sketch follows below).
The default disk size is 25 GB for Cloud Dataflow Shuffle batch pipelines.
Managed Instance Groups
You would need to make sure that you have enough quota in the region, as Dataflow needs the following:
One Instance Group per Cloud Dataflow job
One Managed Instance Group per Cloud Dataflow job
One Instance Template per Cloud Dataflow job
Once you review these quotas you should be all set for running the job.
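As a rough back-of-the-envelope check of the figures above, a helper like the following shows which quota requests a given worker count implies; the 4 cores per worker and the 25 GB Shuffle disk default mirror this answer, everything else is a placeholder to adjust:

    def estimate_dataflow_quotas(max_workers, cores_per_worker=4, disk_size_gb=25):
        """Rough per-job quota needs for a single-region Dataflow batch job."""
        return {
            "IN_USE_ADDRESSES": max_workers,                  # only if workers use public IPs
            "CPUS (region)": max_workers * cores_per_worker,
            "CPUS_ALL_REGIONS": max_workers * cores_per_worker,
            "DISKS_TOTAL_GB": max_workers * disk_size_gb,
            "INSTANCE_GROUPS": 1,                             # one per Dataflow job
            "INSTANCE_GROUP_MANAGERS": 1,                     # one per Dataflow job
            "INSTANCE_TEMPLATES": 1,                          # one per Dataflow job
        }

    # e.g. 50 n1-standard-4 workers -> 200 CPUs in the region and globally
    print(estimate_dataflow_quotas(max_workers=50, cores_per_worker=4))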
1 - https://medium.com/#harshithdwivedi/how-disabling-external-ips-helped-us-cut-down-over-80-of-our-cloud-dataflow-costs-259d25aebe74
2 - https://cloud.google.com/dataflow/docs/guides/specifying-networks
There is a lot of activity on my Google Compute Engine API. It's less than 1 request per second, which probably keeps me in the free tier, but how do I figure out what is running and whether I should stop it?
I have some Pub/Sub topics and a Cloud Function that copies data into a Datastore database. But even when I am not publishing any data (for days), I still see activity on the Compute Engine API. Can I disable it, or will that stop my Cloud Functions?
I have created 2 user-defined metrics in the Cloud Logging UI. Those metrics show up in Cloud Monitoring, but their graphs are perpetually showing "no graph data found". Are there any steps to troubleshoot this or are there other requirements to have the data from user-defined log metrics be visible in Cloud Monitoring?
Were there matching log entries after you created the metric? The logs-based metrics start counting matching entries only after metric creation time.
If there were matching log entries after metric creation, did you wait a few minutes to see if there was data in your graphs? It takes a few minutes to update the logs-based metrics in Cloud Monitoring, so you may see log entries in Cloud Logging that are not yet counted in Cloud Monitoring.
If you did wait a few minutes, was there any delay in your log ingestion? For this it would be good to know where the logs were coming from. If a log entry arrives late to Cloud Logging, it will appear in the Logs Viewer but will not be counted in the logs-based metrics. A log entry is considered late if it arrives more than two minutes after the timestamp included in the log entry. The number of late-arriving entries is recorded for each log in the system metric logging.googleapis.com/dropped_log_entry_count.
Some of these steps are documented here: https://cloud.google.com/logging/docs/view/logs_based_metrics#troubleshooting
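For the late-arrival check in particular, you can read the logging.googleapis.com/dropped_log_entry_count metric through the Cloud Monitoring API. A small sketch using the google-cloud-monitoring client; the project ID and the one-hour window are placeholders:

    import time
    from google.cloud import monitoring_v3

    project_id = "my-project"  # placeholder project ID
    client = monitoring_v3.MetricServiceClient()

    now = int(time.time())
    interval = monitoring_v3.TimeInterval({
        "start_time": {"seconds": now - 3600},  # look at the last hour
        "end_time": {"seconds": now},
    })

    results = client.list_time_series(request={
        "name": f"projects/{project_id}",
        "filter": 'metric.type = "logging.googleapis.com/dropped_log_entry_count"',
        "interval": interval,
        "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
    })

    for series in results:
        for point in series.points:
            print(dict(series.metric.labels), point.value.int64_value)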
I assume you're using Cloud Monitoring v2beta custom metrics. I also assume that you have not only created the metrics themselves but also sent timeseries data into these metrics.
I'd start by listing the time series data using the API call "monitoring.projects.timeSeries.list" to see if your data is really there; otherwise the Cloud Monitoring UI will display the metrics but won't have any data in them. You can use the API Explorer to facilitate this test.
P.S. Custom metrics v2 are being deprecated and are being replaced by v3. You might want to update your code to reflect these changes using this guide.
Anyone know if there is a way to download historical cpu usage data via API call for Google Compute Engine?
On the Console overview page, graphs are provided for this type of data for at least a month, but I don't see anything obvious on how to download the actual data directly.
The Google Compute Engine usage export feature recently launched - it sounds like what you're looking for. It gives daily detail in a CSV, as well as a month-to-date summary.
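Once the export is enabled, the daily CSVs and the month-to-date summary simply show up as objects in the Cloud Storage bucket you configured, so downloading them is ordinary Cloud Storage access. A minimal sketch; the project, bucket, report prefix and file name are placeholders that depend on how you set up the export:

    from google.cloud import storage

    client = storage.Client(project="my-project")      # placeholder project ID
    bucket = client.bucket("my-usage-export-bucket")   # placeholder bucket

    # One CSV per day plus a month-to-date summary, named with your report prefix.
    for blob in bucket.list_blobs(prefix="usage_gce"):
        print(blob.name)

    # Download a specific day's report (placeholder file name).
    bucket.blob("usage_gce_20140101.csv").download_to_filename("usage_20140101.csv")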