Data Studio doesn't request all the fields from my Community Connector - google-apps-script

I'm making a Community Connector with the following fields, among others: Age, Gender, and Impressions.
When I try to build a bar chart with Impressions as a metric, Age as a dimension, and Gender as a breakdown dimension (or with Age and Gender swapped), I get the following error:
User Configuration Error
This data source was improperly configured.
Invalid argument type.
Error ID: b44d6288
Debugging, I found that the problem is that Data Studio isn't making a single request to getData() that includes all three fields (which, when processed, would make the right call to the API and return the right data). Instead, it only requests the dimension-metric pair in one request, and sometimes also the breakdown-dimension-metric pair in another (and, sometimes, dimension-metric with filter info), which leaves it with "broken" data it apparently can't make sense of. Since the request to my getData() only includes two fields, I return two fields per row. As far as I can see, information about the third field can't be found anywhere, particularly not in the request parameter.
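For reference, my getData() follows the standard Community Connector shape, roughly like this (the schema helper and the API call are simplified placeholders, not my full code):

function getData(request) {
  // Data Studio tells the connector which fields it wants via request.fields
  var requestedFieldIds = request.fields.map(function(field) {
    return field.name;
  });
  var requestedFields = getFields().forIds(requestedFieldIds); // getFields() builds the full field list

  // fetchReportData() stands in for the real API call
  var rows = fetchReportData(requestedFieldIds).map(function(apiRow) {
    return {
      values: requestedFieldIds.map(function(id) { return apiRow[id]; })
    };
  });

  return {
    schema: requestedFields.build(),
    rows: rows
  };
}

So whatever arrives in request.fields is exactly what gets returned; there's no way for me to add the missing third field on my side.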
This behavior appeared somewhere along the development of the connector -- at some points this exact combination worked normally.
Since this behavior doesn't seem to come from my own code, it's quite worrying. Any ideas would be deeply appreciated.

How to fix a query in Functions within Foundry which is hitting ObjectSet:PagingAboveConfiguredLimitNotAllowed?

I have a Phonograph object with billions of rows and we are querying it through the Object Set Service.
For example, I want to get all DriverLicences from a certain city.
@Function()
public getDriverLicences(city: string): ObjectSet<DriverLicences> {
    let drivers = Objects.search().DriverLicences().filter(row => row.city.exactMatch(city));
    return drivers;
}
I am facing this error when I try to query it from Slate:
ERROR 400: {"errorCode":"INVALID_ARGUMENT","errorName":"ObjectSet:PagingAboveConfiguredLimitNotAllowed","errorInstanceId":"0000-000","parameters":{}}
I understand that I am probably retrieving more than 100,000 results, but I need all of them because the front end is a complex Slate dashboard built by another team that we cannot refactor.
The issue here is that, specifically in the Slate <> Function connector, there is a "translation layer" that serializes the contents of the object set and provides a response data structure that materializes the property:value pairs for each object in the set.
This clearly doesn't work for large object sets where throwing so much data into the browser is likely to overwhelm the resources allocated to the tab.
From context it seems like you might be migrating an existing Slate app over to Functions; in the current version, how does the query limit the number of results returned? Surely it isn't returning several hundred thousand results for further processing on the front end? (If it is, that might be an anti-pattern worth addressing.)
As for options that you could currently explore, you can sort your object set and then specify a smaller limit to return:
Objects.search().DriverLicences().filter(row => row.city.exactMatch(city)).orderBy(date_of_issue).take(100)
You'll find a few more details in the Functions documentation Reference entry on Ontology API: Object Sets in the section on Ordering and limiting.
You can even work around the (current) lack of paging when returning an ObjectSet to Slate by using the last value of the property you ordered on (i.e. date_of_issue) as a filter in the subsequent request and returning the next N objects.
This can work if you need a Slate table or HTML widget that renders one set of results and then, on a user action, fetches the next page.
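As a very rough sketch of that idea (note: the greater-than comparison on the date property and the Timestamp parameter type are assumptions on my part; check the Ontology API reference for the exact helpers):

@Function()
public getNextDriverLicences(city: string, lastDateOfIssue: Timestamp): ObjectSet<DriverLicences> {
    // The caller passes the last date_of_issue it has already rendered,
    // and this returns the following page of results.
    return Objects.search()
        .DriverLicences()
        .filter(row => row.city.exactMatch(city))
        .filter(row => row.date_of_issue.gt(lastDateOfIssue))  // assumed comparison helper
        .orderBy(date_of_issue)
        .take(100);
}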

Accessing Lists in Django request.POST

I am having difficulty accessing all the data returned by my forms in my post function. I notice a significant discrepancy between what is displayed when I print request.POST vs. when my code accesses this data. Hopefully someone can explain this to me.
Output of print(request.POST):
print(request.POST)
<QueryDict: {'csrfmiddlewaretoken': ['AXMPO...'],
'start_date': ['2019-03-01'], 'end_date': ['2019-03-26'],
'reports': ['4', '1']}>
To examine the data my code is dealing with, I used the json module. The behavior of my code during debugging matches this representation:
json.dumps(request.POST)
'{"csrfmiddlewaretoken": "AXMPO...",
"start_date": "2019-03-01", "end_date": "2019-03-26",
"reports": "1"}'
It all looks pretty similar until you see the "reports" value. The user selects these reports via a MultipleSelect widget on my form, and my code iterates through the ID numbers provided. However, no matter how many reports I select, I only get one ID. If anyone can explain why this is happening I would sincerely appreciate it.
Turns out this is a really old-school issue. I do wish this were more prominent in the documentation, though. The explanation by Simon Willison is below:
"""
This is a feature, not a bug. If you want a list of values for a key, use the following:
values = request.POST.getlist('key')
The reasoning behind this is that an API method should consistently return either a string or a list, but never both. The common case in web applications is for a form key to be associated with a single value, so that's what the [] syntax does. getlist() is there for the occasions (like yours) when a single key is submitted with multiple values.
""" - Simon Willson, 13 years ago.

How do I download gridded SST data?

I've recently been introduced to R and am trying out the heatwaveR package. I get an error when loading ERDDAP data... Here's the code I have used so far:
library(rerddap)
library(ncdf4)
info(datasetid = "ncdc_oisst_v2_avhrr_by_time_zlev_lat_lon", url = "https://www.ncei.noaa.gov/erddap/")
And I get the following error:
Error in curl::curl_fetch_memory(x$url$url, handle = x$url$handle) :
schannel: next InitializeSecurityContext failed: SEC_E_INVALID_TOKEN (0x80090308) - The token supplied to the function is invalid
I would like some help with this. I'm new to this website too, so I apologize if the above question is not up to standard (code to be typed in a grey box, etc.).
Someone directed this post to my attention from the heatwaveR issues page on GitHub. Here is the answer I provided for them:
I do not manage the rerddap package so can't say exactly why it may be giving you this error. But I can say that I have noticed lately that the OISST data are often not available on the ERDDAP server in question. I (attempt to) download fresh data every day and am often denied with an error similar to the one you posted. It's gotten to the point where I had to insert some logic gates into my download script so it tells me that the data aren't currently being hosted before it tries to download them. I should also point out that one may download the "final" data from this server, which have roughly a two week delay from present day, as well as the "preliminary (prelim)" data, which are near-real-time but haven't gone through all of the QC steps yet. These two products are accounted for in the following code:
# First download the list of data products on the server
server_data <- rerddap::ed_datasets(which = "griddap", "https://www.ncei.noaa.gov/erddap/")$Dataset.ID
# Check if the "final" data are currently hosted
if(!"ncdc_oisst_v2_avhrr_by_time_zlev_lat_lon" %in% server_data)
stop("Final data are not currently up on the ERDDAP server")
# Check if the "prelim" data are currently hosted
if(!"ncdc_oisst_v2_avhrr_prelim_by_time_zlev_lat_lon" %in% server_data)
stop("Prelim data are not currently up on the ERDDAP server")
If the data are available I then check the times/dates available with these two lines:
# Download final OISST meta-data
final_info <- rerddap::info(datasetid = "ncdc_oisst_v2_avhrr_by_time_zlev_lat_lon", url = "https://www.ncei.noaa.gov/erddap/")
# Download prelim OISST meta-data
prelim_info <- rerddap::info(datasetid = "ncdc_oisst_v2_avhrr_prelim_by_time_zlev_lat_lon", url = "https://www.ncei.noaa.gov/erddap/")
I ran this now and it looks like the data are currently available. Is your error from today, or from a day or two ago? The availability seems to cycle over the week, but I haven't quite made sense of any pattern yet. It is also important to note that about a day before the data go dark they are filled with all sorts of massive errors. So I've also had to add error trapping to my code that stops the data aggregation process once it detects temperatures in excess of some massive number. In this case it is something like 1^90, but the number isn't consistent, meaning it is not a missing-value placeholder.
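As a rough illustration of that error trapping (the data frame and column names here are placeholders, not my actual script):

# Stop the aggregation run if the downloaded SST values are absurdly large,
# which tends to happen about a day before the data go dark.
max_sst <- max(oisst_dat$temp, na.rm = TRUE)  # 'oisst_dat' and 'temp' are placeholder names
if (max_sst > 100) {
  stop("SST values are implausibly large; the hosted data are likely corrupt today")
}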
To manually see for yourself if the data are being hosted you can go to this link and scroll to the bottom:
https://www.ncei.noaa.gov/erddap/griddap/index.html
All the best,
-Robert

kafka-python 1.3.3: KafkaProducer.send with explicit key fails to send message to broker

(Possibly a duplicate of Can't send a keyedMessage to brokers with partitioner.class=kafka.producer.DefaultPartitioner, although the OP of that question didn't mention kafka-python. And anyway, it never got an answer.)
I have a Python program that has been successfully (for many months) sending messages to the Kafka broker, using essentially the following logic:
producer = kafka.KafkaProducer(bootstrap_servers=[some_addr],
                               retries=3)
...
msg = json.dumps(some_message)
res = producer.send(some_topic, value=msg)
Recently, I tried to upgrade it to send messages to different partitions based on a definite key value extracted from the message:
producer = kafka.KafkaProducer(bootstrap_servers=[some_addr],
                               key_serializer=str.encode,
                               retries=3)
...
try:
    key = some_message[0]
except:
    key = None
msg = json.dumps(some_message)
res = producer.send(some_topic, value=msg, key=key)
However, with this code, no messages ever make it out of the program to the broker. I've verified that the key value extracted from some_message is always a valid string. Presumably I don't need to define my own partitioner, since, according to the documentation:
The default partitioner implementation hashes each non-None key using the same murmur2 algorithm as the java client so that messages with the same key are assigned to the same partition.
Furthermore, with the new code, when I try to determine what happened to my send by calling res.get (to obtain a kafka.FutureRecordMetadata), that call throws a TypeError exception with the message "descriptor 'encode' requires a 'str' object but received a 'unicode'".
(As a side question, I'm not exactly sure what I'd do with the FutureRecordMetadata if I were actually able to get it. Based on the kafka-python source code, I assume I'd want to call either its succeeded or its failed method, but the documentation is silent on the point. The documentation does say that the return value of send "resolves to" RecordMetadata, but I haven't been able to figure out, from either the documentation or the code, what "resolves to" means in this context.)
Anyway: I can't be the only person using kafka-python 1.3.3 who's ever tried to send messages with a partitioning key, and I have not seen anything on teh Intertubes describing a similar problem (except for the SO question I referenced at the top of this post).
I'm certainly willing to believe that I'm doing something wrong, but I have no idea what that might be. Is there some additional parameter I need to supply to the KafkaProducer constructor?
The fundamental problem turned out to be that my key value was a unicode, even though I was quite convinced that it was a str. Hence the selection of str.encode for my key_serializer was inappropriate, and was what led to the exception from res.get. Omitting the key_serializer and calling key.encode('utf-8') was enough to get my messages published, and partitioned as expected.
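In other words, the working version looks roughly like this (a sketch using placeholder broker/topic/message values; it mirrors the Python 2 setup from the question):

import json
import kafka

some_addr = "broker.example.com:9092"                 # placeholder broker address
some_topic = "some_topic"                             # placeholder topic name
some_message = [u"customer-42", {"payload": "data"}]  # the key lives at index 0, as above

# Note: no key_serializer -- the key is encoded explicitly below instead.
producer = kafka.KafkaProducer(bootstrap_servers=[some_addr], retries=3)

key = some_message[0]
msg = json.dumps(some_message)

# Encoding the (unicode) key ourselves avoids the str.encode descriptor error.
res = producer.send(some_topic, value=msg, key=key.encode('utf-8'))
record_metadata = res.get(timeout=10)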
A large contributor to the obscurity of this problem (for me) was that the kafka-python 1.3.3 documentation does not go into any detail on what a FutureRecordMetadata really is, nor what one should expect in the way of exceptions its get method can raise. The sole usage example in the documentation:
# Asynchronous by default
future = producer.send('my-topic', b'raw_bytes')

# Block for 'synchronous' sends
try:
    record_metadata = future.get(timeout=10)
except KafkaError:
    # Decide what to do if produce request failed...
    log.exception()
    pass
suggests that the only kind of exception it will raise is KafkaError, which is not true. In fact, get can and will (re-)raise any exception that the asynchronous publishing mechanism encountered in trying to get the message out the door.
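So a more defensive version of that snippet (my own adjustment, not something from the kafka-python docs) would catch more broadly:

try:
    record_metadata = future.get(timeout=10)
except Exception:
    # Not just KafkaError: serializer failures and anything else the async
    # sender hit while publishing the message are re-raised here as well.
    log.exception("produce request failed")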
I also faced the same error. Once I added json.dumps while sending the key, it worked.
producer.send(topic="first_topic",
              key=json.dumps(key).encode('utf-8'),
              value=json.dumps(msg).encode('utf-8')
              ).add_callback(on_send_success).add_errback(on_send_error)

Simperium Data Dictionary or Decoder Ring for Return Value on "all" call?

I've looked through all of the Simperium API docs for all of the different programming languages and can't seem to find this. Is there any documentation for the data returned from an ".all" call (e.g. api.todo.all(:cv=>nil, :data=>false, :username=>false, :most_recent=>false, :timeout=>nil) )?
For example, this is some data returned:
{"ccid"=>"10101010101010101010101010110101010",
"o"=>"M",
"cv"=>"232323232323232323232323232",
"clientid"=>"ab-123123123123123123123123",
"v"=>{
"date"=>{"o"=>"+", "v"=>"2015-08-20T00:00:00-07:00"},
"calendar"=>{"o"=>"+", "v"=>false},
"desc"=>{"o"=>"+", "v"=>"<p>test</p>\r\n"},
"location"=>{"o"=>"+", "v"=>"Los Angeles"},
"id"=>{"o"=>"+", "v"=>43}
},
"ev"=>1,
"id"=>"abababababababababababababab/10101010101010101010101010110101010"}
I can figure out some of it just from context or from the name of the key, but a lot of it is guesswork and trial and error. The one that concerns me is the value returned for the "o" key. I assume that a value of "M" is modify and a value of "+" is add. I've also run into "-" for delete, and just recently discovered that there is also a "! '-'", which is also a delete, but I don't know what else it signifies. What other values can be returned in the "o" key? Are there other keys/values that can be returned but are rare? Is there documentation that details what can be returned (that would be the most helpful)?
If it matters, I am using the Ruby API but I think this is a question that, if answered, can be helpful for all APIs.
The response you are seeing is a list of all of the changes which have occurred in the given bucket since some point in its history. In the case where cv is blank, it tries to get the full history.
You can find some of the details in the protocol documentation though it's incomplete and focused on the WebSocket message syntax (the operations are the same however as with the HTTP API).
The information provided by the v parameter is the result of applying the JSON-diff algorithm to the data between changes. With this diff information you can reconstruct the data at any given version as the changes stream in.
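As a rough illustration (not official Simperium client code, and covering only the operation codes mentioned above), applying one of these change entries to a locally cached hash could look something like this:

# Apply a single change entry (like the one shown in the question) to a cached hash.
# "+" is treated as add, "M" as modify, "-" as delete; anything else is ignored
# here because its meaning isn't documented above.
def apply_change(cached, change)
  return nil if change["o"] == "-"              # whole object removed

  cached = (cached || {}).dup
  if ["+", "M"].include?(change["o"])
    change["v"].each do |field, diff|
      case diff["o"]
      when "+" then cached[field] = diff["v"]   # field added or set
      when "-" then cached.delete(field)        # field removed
      end
    end
  end
  cached
end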