How to partition a LAG function over IoTHub.ConnectionDeviceId in Azure Stream Analytics?

According to this page
https://azure.microsoft.com/en-us/documentation/articles/stream-analytics-define-inputs/
there is a property from IoT Hub that can be used in Stream Analytics to identify the device. But when I try to use it in a LAG function, I get a compile error:
LAG(brightness, 1, -1) OVER (PARTITION BY IoTHub.ConnectionDeviceId LIMIT DURATION(minute, 10)) as lastBrightness,
Any ideas?

This should be supported; we will look into why the error is thrown.
In the meantime you can use this query as a workaround:
WITH step1 AS
(
    SELECT brightness, IoTHub.ConnectionDeviceId AS deviceid
    FROM input
)
SELECT
    LAG(brightness, 1, -1)
        OVER (PARTITION BY deviceid LIMIT DURATION(minute, 10)) AS lastBrightness
FROM step1
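As a usage sketch (illustrative, not from the original answer): the workaround extends naturally to, for example, emitting an event only when a device's brightness changes, by adding a second step:
WITH step1 AS
(
    SELECT brightness, IoTHub.ConnectionDeviceId AS deviceid
    FROM input
),
step2 AS
(
    SELECT
        deviceid,
        brightness,
        LAG(brightness, 1, -1)
            OVER (PARTITION BY deviceid LIMIT DURATION(minute, 10)) AS lastBrightness
    FROM step1
)
SELECT deviceid, brightness, lastBrightness
FROM step2
WHERE brightness != lastBrightness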

Related

SQL runs on localhost but throws SQLSTATE[42883] error on deployment to Heroku

I have a Laravel/Vue.js project I just deployed to Heroku. Everything works well both locally and in production, except for a query that throws an SQLSTATE[42883] error in production. The SQL query retrieves a count of models created each week:
public function getProductWeeklyData() {
    $products = Product::selectRaw('COUNT(*) AS product_count')
        ->selectRaw('FROM_DAYS(TO_DAYS(created_at) - MOD(TO_DAYS(created_at) - 1, 7)) AS week_starting')
        ->groupBy('week_starting')
        ->orderBy('week_starting')
        ->take(10)->get();
    $products->each->setAppends([]);
    return response()->json($products, 200);
}
This works on localhost and returns an array like so:
0: {product_count: 7, week_starting: "2021-10-31"}
1: {product_count: 12, week_starting: "2021-11-07"}
2: {product_count: 15, week_starting: "2021-11-14"}
which is exactly what I wanted.
However, after deploying to Heroku and switching the database connection to PostgreSQL, the query fails and starts returning this error:
"message": "SQLSTATE[42883]: Undefined function: 7 ERROR: function to_days(timestamp without time zone) does not exist\nLINE 1: select COUNT() AS product_count,FROM_DAYS(TO_DAYS(created_at...\n ^\nHINT: No function matches the given name and argument types. You might need to add explicit type casts. (SQL: select COUNT() AS product_count, FROM_DAYS(TO_DAYS(created_at) -MOD(TO_DAYS(created_at) -1, 7)) AS week_starting from "products" group by "week_starting" order by "week_starting" asc limit 10)"
I tried casting the date as the error hint suggested, by doing this:
...(created_at::text, 'YYYY-MM-DD')
but still got an invalid-query error.
How do I get this query working with PostgreSQL in production?
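A hedged sketch of one way to fix this: replace the MySQL-only TO_DAYS/FROM_DAYS arithmetic with PostgreSQL's DATE_TRUNC. One caveat: DATE_TRUNC('week', ...) buckets by ISO weeks starting Monday, whereas the MySQL expression above starts weeks on Sunday, so the week_starting values shift by a day:
$products = Product::selectRaw('COUNT(*) AS product_count')
    ->selectRaw("DATE_TRUNC('week', created_at)::date AS week_starting")
    ->groupBy('week_starting')
    ->orderBy('week_starting')
    ->take(10)->get();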

Slick MySQL streaming to avoid GC and OOM issues

While querying records from the DB for a specified date range, I am hitting GC issues because the total number of returned records is very large. Being new to Slick, I am not aware of how to use streaming. Could someone help translate the method below into streaming logic?
val res = query.filter { row =>
  (row.category === ServiceConstants.CATEGORY_TYPE.name) &&
  (row.ftrxDate >= trxDateLowerLimit && row.ftrxDate <= trxDateUpperLimit)
}.result
db.run(res)
You can find information on how to stream data from the database in the manual:
https://scala-slick.org/doc/3.3.2/dbio.html#streaming
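For reference, a minimal sketch of what that might look like for this query, assuming Slick 3.x with the MySQL profile (process is a placeholder for your row handler). The statement parameters matter because MySQL's JDBC driver only streams results row by row when fetchSize is Integer.MIN_VALUE:
import slick.jdbc.{ResultSetConcurrency, ResultSetType}

val action = query.filter { row =>
  (row.category === ServiceConstants.CATEGORY_TYPE.name) &&
  (row.ftrxDate >= trxDateLowerLimit && row.ftrxDate <= trxDateUpperLimit)
}.result
  .withStatementParameters(
    rsType = ResultSetType.ForwardOnly,
    rsConcurrency = ResultSetConcurrency.ReadOnly,
    fetchSize = Int.MinValue) // required for row-by-row streaming on MySQL

// db.stream returns a Reactive Streams publisher; rows are handled as they
// arrive instead of being materialized into one large collection.
val publisher = db.stream(action)
publisher.foreach(row => process(row))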

Formatting a BigQuery query to ML-appropriate JSON to pass through ML Predict

Using Python 2.7, I want to pass a query from BigQuery to ML Predict, which has a specific formatting requirement.
First: Is there an easier way to go directly from the BigQuery query to JSON in the correct format so it can be passed to requests.post() instead of going through pandas (from what I understand pandas is still not supported for GCP Standard)?
Second: Is there a way to construct the query to go directly to a JSON format and then modify the JSON to reflect the ML Predict JSON requirements?
Currently my code looks like this:
#I used the bigquery to dataframe option here to view the output.
#I would like to not use pandas in the end code.
logs = log_data.execute(output_options=bq.QueryOutput.dataframe()).result()
data = logs.to_json(orient='index')
print data
'{"0":{"end_time":"2018-04-19","device":"iPad","device_os":"iOS","device_os_version":"5.1.1","latency":0.150959,"megacycles":140.0,"cost":"1.3075e-08","device_brand":"Apple","device_family":"iPad","browser_version":"5.1","app":"567","ua_parse":"0"}}'
#The JSON needs to be in this format according to the Google documentation.
#data = {
#    'instances': [
#        {
#            'key': '',
#            'end_time': '2018-04-19',
#            'device': 'iPad',
#            'device_os': 'iOS',
#            'device_os_version': '5.1.1',
#            'latency': 0.150959,
#            'megacycles': 140.0,
#            'cost': '1.3075e-08',
#            'device_brand': 'Apple',
#            'device_family': 'iPad',
#            'browser_version': '5.1',
#            'app': '567',
#            'ua_parse': '40.9.8'
#        }
#    ]
#}
So all I would need to change is the leading key '0' to 'instances' and I should be all set to pass into requests.post().
Is there a way to accomplish this?
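For what it's worth, the re-keying itself is a couple of lines of plain Python; a sketch against the orient='index' output shown above:
import json

records = json.loads(data)  # {"0": {...}, "1": {...}}
# wrap each row dict in the 'instances' list and add the empty 'key' field
payload = {'instances': [dict(row, key='') for row in records.values()]}
# payload is now ready for requests.post(url, json=payload)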
Edit: adding the BigQuery query:
%%bq query --n log_data
WITH `my.table` AS (
SELECT ARRAY<STRUCT<end_time STRING, device STRING, device_os STRING, device_os_version STRING, latency FLOAT64, megacycles FLOAT64,
cost STRING, device_brand STRING, device_family STRING, browser_version STRING, app STRING, ua_parse STRING>>[] instances
)
SELECT TO_JSON_STRING(t)
FROM `my.table` AS t
WHERE end_time >='2018-04-19'
LIMIT 1
data = log_data.execute().result()
Thanks to @MikhailBerlyant I have adjusted my query and code to look like this:
%%bq query --n log_data
SELECT [TO_JSON_STRING(t)] AS instance
FROM `yourproject.yourdataset.yourtable` AS t
WHERE end_time >='2018-04-19'
LIMIT 1
But when I run logs = log_data.execute().result() I get a QueryResultsTable object back, which results in this error when passing it into requests.post:
TypeError: QueryResultsTable job_zfVEiPdf2W6msBlT6bBLgMusF49E is not JSON serializable
Is there a way within execute() to just return the JSON?
First: Is there an easier way to go directly from the BigQuery query to JSON in the correct format
See the example below:
#standardSQL
WITH yourTable AS (
SELECT ARRAY<STRUCT<id INT64, type STRING>>[(1, 'abc'), (2, 'xyz')] instances
)
SELECT TO_JSON_STRING(t)
FROM yourTable t
The result is in the format you asked for:
{"instances":[{"id":1,"type":"abc"},{"id":2,"type":"xyz"}]}
The above demonstrates the query and how it works.
In your real case, you should use something like below:
SELECT TO_JSON_STRING(t)
FROM `yourproject.yourdataset.yourtable` AS t
WHERE end_time >='2018-04-19'
LIMIT 1
hope this helps :o)
Update based on comments
SELECT [TO_JSON_STRING(t)] AS instance
FROM `yourproject.yourdataset.yourtable` t
WHERE end_time >='2018-04-19'
LIMIT 1
I wanted to add this in case someone has the same problem I had, or is at least stuck on where to go once you have the query.
I was able to write a function that formats the query result in the way Google ML Predict wants it to be passed into requests.post(). This is most likely a horrible way to accomplish this, but I could not find a direct way to go from BigQuery to ML Predict in the correct format.
from google.cloud import bigquery as gcb

def logs(query):
    client = gcb.Client()
    query_job = client.query(query)
    CSV_COLUMNS = 'end_time,device,device_os,device_os_version,latency,megacycles,cost,device_brand,device_family,browser_version,app,ua_parse'.split(',')
    for row in query_job.result():
        var = list(row)
        l1 = dict(zip(CSV_COLUMNS, var))
        l1.update({'key': ''})
        l2 = {'instances': [l1]}
    return l2  # note: only the last row is kept, which is fine for LIMIT 1 queries
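A hypothetical usage sketch; the predict URL and auth headers are placeholders, not from the original post:
import requests

payload = logs(query)  # query string as defined above
resp = requests.post(predict_url, json=payload, headers=auth_headers)
print resp.json()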

Couchbase N1QL query: DELETE causes an error in the query service

I am trying to run this query:
delete from bucket o
use keys (select raw ARRAY_CONCAT(ARRAY_CONCAT(ARRAY_CONCAT(d, t), s), u)
from bucket
use keys 'SS')
I get this response:
{
"status": "Unexpected server error"
}
In the server log I see this:
Service 'query' exited with status 1. Restarting. Messages: runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:2232 +0x1 fp=0xc20e97cfc0 sp=0xc20e97cfb8
created by github.com/couchbase/query/parser/n1ql.NewLexerWithInit
/home/couchbase/jenkins/workspace/watson-unix/goproj/src/github.com/couchbase/query/parser/n1ql/n1ql.nn.go:30999 +0x4a6c9
[goport] 2016/11/29 08:40:11 /opt/couchbase/bin/cbq-engine terminated: signal: aborted (core dumped)
What is the problem with this query?
I am using Couchbase version 4.5.
You can also use the ARRAY_FLATTEN() function or the FIRST operator.
delete from bucket o use keys
ARRAY_FLATTEN( ( select raw ARRAY_CONCAT(ARRAY_CONCAT(ARRAY_CONCAT(d, t), s), u) from bucket use keys 'SS'), 1)
returning meta(o).id;
or
delete from bucket o
use keys FIRST x FOR x IN
( select raw ARRAY_CONCAT(ARRAY_CONCAT(ARRAY_CONCAT(d, t), s), u)
from bucket
use keys 'SS' )
END
returning meta(o).id;
Note that the parentheses around the sub-query are required when it is used as an expression (for example, as a parameter to ARRAY_FLATTEN() or in the FIRST construct).
USE KEYS requires an array of keys. ARRAY_CONCAT() returns an array, and the subquery also returns an array, so the result becomes an array of arrays. Remove one level of array by indexing into the subquery result, as follows:
delete from bucket o
use keys (select raw ARRAY_CONCAT(ARRAY_CONCAT(ARRAY_CONCAT(d, t), s), u)
from bucket
use keys 'SS')[0];
If an ARRAY_CONCAT() argument is missing or null, it may cause the same panic in 4.5. This has been fixed in 4.5.1.

How to retrieve items with Hibernate clustered by TO_DAYS + COUNT(*)

I am pretty new to Hibernate (again), so this might be a noobish question ;).
Without TO_DAYS, clustered by timestamp, it works like this:
CriteriaQuery<Tuple> query = criteriaBuilder.createQuery(Tuple.class);
Root<Session> sessionRoot = query.from(Session.class);
query.multiselect(
    sessionRoot.get("time").alias("time"),
    criteriaBuilder.count(sessionRoot).alias("count")
);
query.groupBy(sessionRoot.get("time"));
List<Tuple> results = this.executeQuery(query);
So I receive:
time|count
13721938721|1
13721938722|2
13721938723|3
13721938724|4
13721938725|2
13721938726|1
13721938727|4
But these are all session counts per millisecond; I need them clustered by day, not by timestamp, which is why I use TO_DAYS in plain MySQL.
In MySQL I perform this query:
SELECT TO_DAYS(`time`) AS `days`, COUNT(*) as `count` FROM sessions WHERE 1 GROUP BY `days`
This gives me:
days|count
777594|123
777595|60
777596|61
777597|74
But I have no idea yet how to achieve the same thing with javax.persistence.criteria.CriteriaBuilder and CriteriaQuery in Hibernate.
I don't know how to do it with CriteriaBuilder, but I do know how with the Hibernate 4 Criteria API:
query.setProjection(
    Projections.sqlProjection(
        "TO_DAYS(time) as days",
        new String[]{"days"},
        new Type[]{StandardBasicTypes.INTEGER}
    )
);
sqlProjection allows you to cast or convert data types, but be careful: using a projection will only retrieve the fields you specify in it, and the resulting list will come up like this:
List<Object[]> results = this.executeQuery(query);
But you can make Hibernate match the aliases to the bean properties using a result transformer:
query.setResultTransformer(new AliasToBeanResultTransformer(Session.class));
and the list comes out like it normally does:
List<Session> results = this.executeQuery(query);
Sorry I could not provide a CriteriaBuilder solution, but I hope this gets you on the right track.
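(If MySQL is the only target, JPA's CriteriaBuilder.function() can also call the native function from the criteria API; a MySQL-only sketch reusing the roots from the question:)
import javax.persistence.criteria.Expression;

// generates TO_DAYS(...) in the SQL, so this will not port to other databases
Expression<Integer> days = criteriaBuilder.function("TO_DAYS", Integer.class, sessionRoot.get("time"));
query.multiselect(days.alias("days"), criteriaBuilder.count(sessionRoot).alias("count"));
query.groupBy(days);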
After some investigation, it turned out that HQL does not support TO_DAYS. Since I want it to work on MySQL as well as other databases, this is my final solution:
Query q = entityManager.createQuery(
    "SELECT concat(day(e.time), '-', month(e.time), '-', year(e.time)) AS days, COUNT(*) " +
    "FROM Event e " +
    "GROUP BY concat(day(e.time), '-', month(e.time), '-', year(e.time))");
The result is:
3-5-2012|980
4-5-2012|200
10-6-2012|123
12-6-2012|144
13-11-2012|500
Afterwards I convert all the ugly date strings into proper milliseconds in Java and have the data I need.
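(For completeness, a minimal sketch of that final conversion step, assuming the "d-M-yyyy" strings produced by the concat above:)
import java.text.SimpleDateFormat;

SimpleDateFormat fmt = new SimpleDateFormat("d-M-yyyy");
// parse() throws a checked ParseException; "3-5-2012" -> epoch millis for 2012-05-03
long millis = fmt.parse("3-5-2012").getTime();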