How to access a date element in JSON? - json

I have this API Endpoint setup on Lambda where my applications talk to and get the data it needs.
Problem am running across right now is trying to access an element that is based on the day before today's date.
Language: Python 3.7
Service: AWS Lambda
Provider: WeatherStack
Example: https://weatherstack.com/documentation -> API Features
-> Weather Forecast
In order to access this element from the API provider; I have to basically setup a JSON structure that goes like this:
"forecast": {
"2020-04-04": {
"date": "2020-04-04",
"date_epoch": 1585958400,
"astro": {
"sunrise": "06:42 AM",
"sunset": "07:31 PM",
"moonrise": "03:26 PM",
"moonset": "04:56 AM",
"moon_phase": "Waxing Gibbous",
"moon_illumination": 79
},
"mintemp": 46,
"maxtemp": 54,
"avgtemp": 50,
"totalsnow": 0,
"sunhour": 7.7,
"uv_index": 2
}
}
Now the problem here is the "2020-04-04" date; as I can't access it simply by calling api_endpoint['forecast'][0] as it will throw an error. I checked using Lens however and did find that it does have one element in 'forecast' which is of course the 2020-04-04 that I'm having trouble trying to access.
I don't know if there's a way to dynamically set the element to be called based on yesterday's date since the api provider will change the forecast date element daily.
I've tried api_endpoint['forecast'][datetime.now()] and got an error.
Is there a way to set the [] after ['forecast] dynamically via variable so that i can always call it based on api_endpoint['forecast'][yesterdaysdate]?
Solution:
from datetime import timedelta, datetime
ts = time.gmtime()
todaysdate = (time.strftime("%Y-%m-%d", ts))
yesterday_date = (datetime.datetime.utcnow() - timedelta(1)).strftime('%Y-%m-%d')
data = api_response['forecast'][yesterday_date]

If I understand corectly, you want to call the data inside the api_endpoint['forecast'][yesterday_date].
If so, this can be achieved by this:
from datetime import datetime, timedelta
yesterday_date = (datetime.now() - timedelta(1)).strftime('%Y-%m-%d')
# call to api
api_endpoint['forecast'][yesterday_date]
If you want to days ago, change timedelta(2) and so on.
Today variable can be assigned by this:
current_date = datetime.now().strftime('%Y-%m-%d')
api_endpoint['forecast'][current_date]
If none of the above solutions answer to your question, leave a comment.

Related

Sending data to CloudWatch using the AWS-SDK

I want to write data to CloudWatch using the AWS-SDK (or whatever may work).
I see this:
The only method that looks remotely like publishing data to CloudWatch is the putMetricData method..but it's hard to find an example of using this.
Does anyone know how to publish data to CloudWatch?
When I call this:
cw.putMetricData({
Namespace: 'ec2-memory-usage',
MetricData: [{
MetricName:'first',
Timestamp: new Date()
}]
}, (err, result) => {
console.log({err, result});
});
I get this error:
{ err:
{ InvalidParameterCombination: At least one of the parameters must be specified.
at Request.extractError (/Users/alex/codes/interos/jenkins-jobs/jobs/check-memory-ec2-instances/node_modules/aws-sdk/lib/protocol/query.js:50:29)
at Request.callListeners (/Users/alex/codes/interos/jenkins-jobs/jobs/check-memory-ec2-instances/node_modules/aws-sdk/lib/sequential_executor.js:106:20)
at Request.emit (/Users/alex/codes/interos/jenkins-jobs/jobs/check-memory-ec2-instances/node_modules/aws-sdk/lib/sequential_executor.js:78:10)
at Request.emit (/Users/alex/codes/interos/jenkins-jobs/jobs/check-memory-ec2-instances/node_modules/aws-sdk/lib/request.js:683:14)
at Request.transition (/Users/alex/codes/interos/jenkins-jobs/jobs/check-memory-ec2-instances/node_modules/aws-sdk/lib/request.js:22:10)
at AcceptorStateMachine.runTo (/Users/alex/codes/interos/jenkins-jobs/jobs/check-memory-ec2-instances/node_modules/aws-sdk/lib/state_machine.js:14:12)
at /Users/alex/codes/interos/jenkins-jobs/jobs/check-memory-ec2-instances/node_modules/aws-sdk/lib/state_machine.js:26:10
at Request.<anonymous> (/Users/alex/codes/interos/jenkins-jobs/jobs/check-memory-ec2-instances/node_modules/aws-sdk/lib/request.js:38:9)
at Request.<anonymous> (/Users/alex/codes/interos/jenkins-jobs/jobs/check-memory-ec2-instances/node_modules/aws-sdk/lib/request.js:685:12)
at Request.callListeners (/Users/alex/codes/interos/jenkins-jobs/jobs/check-memory-ec2-instances/node_modules/aws-sdk/lib/sequential_executor.js:116:18)
message: 'At least one of the parameters must be specified.',
code: 'InvalidParameterCombination',
time: 2019-07-08T19:41:41.191Z,
requestId: '688a4ff3-a1b8-11e9-967e-431915ff0070',
statusCode: 400,
retryable: false,
retryDelay: 7.89360948163893 },
result: null }
You're getting this error because you're not specifying any metric data. You're only setting the metric name and the timestamp. You also need to send some values for the metric.
Let's say your application is measuring the latency of requests and you observed 5 requests, with latencies 100ms, 500ms, 200ms, 200ms and 400ms. You have few options for getting this data into CloudWatch (hence the At least one of the parameters must be specified. error).
You can publish these 5 values one at a time by setting the Value within the metric data object. This is the simplest way to do it. CloudWatch does all the aggregation for you and you get percentiles on your metrics. I would not recommended this approach if you need to publish many observations. This option will result in the most requests made to CloudWatch, which may result in a big bill or throttling from CloudWatch side if you start publishing too many observations.
For example:
MetricData: [{
MetricName:'first',
Timestamp: new Date(),
Value: 100
}]
You can aggregate the data yourself and construct and publish the StatisticValues. This is more complex on your end, but results in the fewest requests to CloudWatch. You can aggregate for a minute for example and execute 1 put per metric every minute. This will not give you percentiles (since you're aggregating data on your end, CloudWatch doesn't know the exact values you observed). I would recommend this if you do not need percentiles.
For example:
MetricData: [{
MetricName:'first',
Timestamp: new Date(),
StatisticValues: {
Maximum: 500,
Minimum: 100,
SampleCount: 5,
Sum: 1400
}
}]
You can count the observations and publish Values and Counts. This is kinda the best of both worlds. There is some complexity on your end, but counting is arguably easier than aggregating into StatisticValues. You're still sending every observation so CloudWatch will do the aggregation for you, so you'll get percentiles. The format also allows more data to be sent than in the option 1. I would recommend this if you need percentiles.
For example:
MetricData: [{
MetricName:'first',
Timestamp: new Date(),
Values: [100, 200, 400, 500],
Counts: [1, 2, 1, 1]
}]
See here for more details for each option: https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/CloudWatch.html#putMetricData-property

How can I create an EMR cluster resource that uses spot instances without hardcoding the bid_price variable?

I'm using Terraform to create an AWS EMR cluster that uses spot instances as core instances.
I know I can use the bid_price variable within the core_instance_group block on a aws_emr_cluster resource, but I don't want to hardcode prices as I'd have to change them manually every time the instance type changes.
Using the AWS Web UI, I'm able to choose the "Use on-demand as max price" option. That's exactly what I'm trying to reproduce, but in Terraform.
Right now I am trying to solve my problem using the aws_pricing_product data source. You can see what I have so far below:
data "aws_pricing_product" "m4_large_price" {
service_code = "AmazonEC2"
filters {
field = "instanceType"
value = "m4.large"
}
filters {
field = "operatingSystem"
value = "Linux"
}
filters {
field = "tenancy"
value = "Shared"
}
filters {
field = "usagetype"
value = "BoxUsage:m4.large"
}
filters {
field = "preInstalledSw"
value = "NA"
}
filters {
field = "location"
value = "US East (N. Virginia)"
}
}
data.aws_pricing_product.m4_large_price.result returns a json containing the details of a single product (you can check the response of the example here). The actual on-demand price is buried somewhere inside this json, but I don't know how can I get it (image generated with http://jsonviewer.stack.hu/):
I know I might be able solve this by using an external data source and piping the output of an aws cli call to something like jq, e.g:
aws pricing get-products --filters "Type=TERM_MATCH,Field=sku,Value=8VCNEHQMSCQS4P39" --format-version aws_v1 --service-code AmazonEC2 | jq [........]
But I'd like to know if there is any way to accomplish what I'm trying to do with pure Terraform. Thanks in advance!
Unfortunately the aws_pricing_product data source docs don't expand on how it should be used effectively but the discussion in the pull request that added it adds some insight.
In Terraform 0.12 you should be able to use the jsondecode function to nicely get at what you want with the following given as an example in the linked pull request:
data "aws_pricing_product" "example" {
service_code = "AmazonRedshift"
filters = [
{
field = "instanceType"
value = "ds1.xlarge"
},
{
field = "location"
value = "US East (N. Virginia)"
},
]
}
# Potential Terraform 0.12 syntax - may change during implementation
# Also, not sure about the exact attribute reference architecture myself :)
output "example" {
values = jsondecode(data.json_query.example.value).terms.OnDemand.*.priceDimensions.*.pricePerUnit.USD
}
If you are stuck on Terraform <0.12 you might struggle to do this natively in Terraform other than the external data source approach you've already suggested.
#cfelipe put that ${jsondecode(data.aws_pricing_product.m4_large_price.value).terms.OnDemand.*.priceDimensions.*.pricePerUnit.USD}" in a Locals

Reference data join on stream analytics input not giving output

I'm trying to set a rule in Azure Stream Analytics job with the use of reference data and input stream which is coming from an event hub.
This is my reference data JSON packet in BLOB storage:
{
"ruleId": 1234,
"Tag" : "TAG1",
"metricName": "velocity",
"alertName": "velocity over 500",
"operator" : "AVGGREATEROREQUAL",
"value": 500
}
And here is the transformation query in the stream analytics job:
WITH
transformedInput AS
(
SELECT
metric = GetArrayElement(DeviceInputStream.data,0),
masterTag = rules.Tag,
ruleId = rules.ruleId,
alertName = rules.alertName,
ruleOperator = rules.operator,
ruleValue = rules.value
FROM
DeviceInputStream
timestamp by EventProcessedUtcTime
JOIN
rules
ON DeviceInputStream.masterTag = rules.Tag
)
--rule output--
SELECT
System.Timestamp as time,
transformedInput.Tag as Tag,
transformedInput.ruleId as ruleId,
transformedInput.alertName as alert,
AVG(metric.velocity) as avg
INTO
alertruleblob
FROM
transformedInput
GROUP BY
transformedInput.masterTag,
transformedInput.ruleId,
transformedInput.alertName,
ruleOperator,
ruleValue,
TumblingWindow(second, 6)
HAVING
ruleOperator = 'AVGGREATEROREQUAL' AND avg(metric.velocity) >= ruleValue
This is not yielding any results. However, when I do a test with sample input and reference data I get the expected results. But this doens't seem to be working with the streaming data. My use case is if the average velocity is greater than 500 for a 6 second window, store that result in another blob storage. The value of velocity has been greater than 500 for sometime, but I'm not getting any results.
What am I doing wrong?
This was working all along. I just had to specify the input path of the reference blob in the reference input path of stream analytics including the file name. I was basically referencing only the blob container without the actual file. So when I changed the path pattern to "filename.json", I got the results. It was a stupid mistake.

Opensearchsever -Search range beteen dates- JSON Restful API

I'm trying to use OpenSearchServer in one of my applications using RestFul JSON API .Can you please provide an example for querying search between 2 dates using the restful JSON api?
Below is my code so far
{"query":"test help","rows":100,
"returnedFields":[
"fileName",
"url"
]
}
Sorry for the bandwidth wastage.
To search between two dates using JSON API, we can use the "Relative date filter " .
Here's what the documentation says :
The Relative date filter can be used for this. Let's say that documents are indexed with the current date in the field indexedDate. In our example the date is expressed using the yyyyMMddHHmmss format - for instance 20141225130512 stands for the 25th of December, 2014, at 1:05:12 PM.
eg:
"filters":[
{
"negative":false,
"type":"RelativeDateFilter",
"from":{
"unit":"days",
"interval":2
},
"to":{
"unit":"days",
"interval":0
},
"field":"indexedDate",
"dateFormat":"yyyyMMddHHmmss"
}
],
Further details can be found here :http://www.opensearchserver.com/documentation/faq/querying/how_to_use_filters_on_query.md

Last Fm API returns nothing with specific time period

I'm using the last fm api to get data from a user but when using the User.getTopTracks method with the time period 1month, it returns nothing:
{
toptracks: {
#text: " ",
user: "RJ",
type: "1month",
page: "",
perPage: "",
totalPages: "",
total: "0"
}
}
This error does not occur when using similar methods (e.g. User.getTopAlbums)
This is a bug in the Last.fm API.
The issue used to exist for getTopArtists as well. It has been fixed for the artists call, but not for getTopTracks.
My solution is to use 3month when the user selects 1month. If I have time I will update the UI to work around this issue.