How do I download gridded SST data? - heatwaveR

I've recently been introduced to R and am trying out the heatwaveR package. I get an error when loading ERDDAP data ... Here's the code I have used so far:
library(rerddap)
library(ncdf4)
info(datasetid = "ncdc_oisst_v2_avhrr_by_time_zlev_lat_lon", url = "https://www.ncei.noaa.gov/erddap/")
And I get the following error:
Error in curl::curl_fetch_memory(x$url$url, handle = x$url$handle) :
schannel: next InitializeSecurityContext failed: SEC_E_INVALID_TOKEN (0x80090308) - The token supplied to the function is invalid
I would like some help with this. I'm new to this website too, so I apologize if the above question is not up to standards (code typed in a grey box, etc.)

Someone directed this post to my attention from the heatwaveR issues page on GitHub. Here is the answer I provided for them:
I do not manage the rerddap package, so I can't say exactly why it may be giving you this error. But I can say that I have noticed lately that the OISST data are often not available on the ERDDAP server in question. I (attempt to) download fresh data every day and am often denied with an error similar to the one you posted. It's gotten to the point where I had to insert some logic gates into my download script so that it tells me the data aren't currently being hosted before it tries to download them.
I should also point out that one may download the "final" data from this server, which have roughly a two-week delay from the present day, as well as the "preliminary (prelim)" data, which are near-real-time but haven't gone through all of the QC steps yet. These two products are accounted for in the following code:
# First download the list of data products on the server
server_data <- rerddap::ed_datasets(which = "griddap", "https://www.ncei.noaa.gov/erddap/")$Dataset.ID
# Check if the "final" data are currently hosted
if(!"ncdc_oisst_v2_avhrr_by_time_zlev_lat_lon" %in% server_data)
stop("Final data are not currently up on the ERDDAP server")
# Check if the "prelim" data are currently hosted
if(!"ncdc_oisst_v2_avhrr_prelim_by_time_zlev_lat_lon" %in% server_data)
stop("Prelim data are not currently up on the ERDDAP server")
If the data are available I then check the times/dates available with these two lines:
# Download final OISST meta-data
final_info <- rerddap::info(datasetid = "ncdc_oisst_v2_avhrr_by_time_zlev_lat_lon", url = "https://www.ncei.noaa.gov/erddap/")
# Download prelim OISST meta-data
prelim_info <- rerddap::info(datasetid = "ncdc_oisst_v2_avhrr_prelim_by_time_zlev_lat_lon", url = "https://www.ncei.noaa.gov/erddap/")
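Once the dataset is up and the metadata request succeeds, the actual subsetted download goes through rerddap::griddap(). Here is a minimal sketch reusing final_info from above; the time window, depth (zlev), and coordinate bounds are placeholder assumptions, not values from the question:
# Download a small subset of the final OISST data; all bounds below
# are placeholder assumptions and should be set to your study area
OISST_sub <- rerddap::griddap(final_info,
                              url = "https://www.ncei.noaa.gov/erddap/",
                              time = c("2019-01-01", "2019-01-07"),
                              zlev = c(0, 0),
                              latitude = c(-35, -30),
                              longitude = c(15, 20),
                              fields = "sst")
head(OISST_sub$data)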
I ran this now and it looks like the data are currently available. Is your error from today, or from a day or two ago? The availability seems to cycle over the week, but I haven't quite made sense of any pattern yet. It is also important to note that about a day before the data go dark they are filled with all sorts of massive errors. So I've also had to add error trapping into my code that stops the data aggregation process once it detects temperatures in excess of some massive number. In this case it is something like 1e90, but the number isn't consistent, meaning it is not a missing-value placeholder.
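The error trapping itself can be as blunt as a threshold check. A sketch of the idea, where OISST_sub is a downloaded chunk as above and the 50 °C ceiling is an arbitrary assumption chosen to sit far above any physically plausible sea surface temperature:
# Halt the aggregation pipeline if the server is serving corrupt values;
# the threshold is an arbitrary assumption, not an official flag value
if (any(OISST_sub$data$sst > 50, na.rm = TRUE))
  stop("Implausible SST values detected; the server data may be corrupt today")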
To see for yourself whether the data are being hosted, you can go to this link and scroll to the bottom:
https://www.ncei.noaa.gov/erddap/griddap/index.html
All the best,
-Robert

Related

Seeing "internal error: Huge input lookup [1]" error when trying to create a large powerpoint file using OFFICER package

I don't have a great way to give a reproducible example, but here's my best description. I'm running a loop that generates 60 different PowerPoint slides with officer and builds up a list, which results in a "pptx document with 60 slides" in my R environment. However, when I try to print this list, I see the following error:
Error in read_xml.raw(charToRaw(enc2utf8(x)), "UTF-8", ..., as_html = as_html, :
internal error: Huge input lookup [1]
I tried running the list with only 10 PowerPoint slides, and the print works, creating a slide deck of 10 slides. But I guess 60 is beyond the level that is considered "huge." Is there a way to override this? I saw some other posts about how you can add a HUGE override, but I'm not exactly sure where I would do that.
Set options = c("HUGE") for read_xml().
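For context, that option belongs to xml2::read_xml(), which is what is failing here (note read_xml.raw in the traceback). A minimal sketch of the option itself; the file name is a placeholder, and exactly where to thread the option through officer depends on officer's internals:
library(xml2)
# "HUGE" relaxes libxml2's hard-coded limits on very large documents;
# "large_slide_deck.xml" is a placeholder file name
doc <- read_xml("large_slide_deck.xml", options = c("NOBLANKS", "HUGE"))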

Data Studio doesn't request all the fields from my Community Connector

I'm making a Community Connector with the following fields, among others: Age, Gender, and Impressions.
When I try to do a bar chart with Impressions as a metric, Age as a dimension and Gender as a breakdown dimension (or Gender and Age, inverted) I get the following error:
User Configuration Error
This data source was improperly configured.
Invalid argument type.
Error ID: b44d6288
Debugging, I found the problem is that it isn't making a single request to getData() including all three fields (which, when processed, would make the right call to the API and get the right data). Instead it only requests the dimension-metric pair in one request, and sometimes also breakdown dimension-metric in others (and, sometimes, dimension-metric with filter info), which gives it "broken" data that it apparently can't make sense of. As the request to my getData() only includes two fields, I return two fields per row; info about the third one can't be found anywhere, least of all in the request parameter, as far as I can see.
This behavior appeared somewhere along the development of the connector -- at some point this exact combination worked normally.
As this behavior doesn't seem to be caused by my code, it's really scaring me. Any idea would be deeply appreciated.

How to download IMF Data Through JSON in R

I recently took an interest in retrieving data in R through JSON. Specifically, I want to be able to access data from the IMF. I know virtually nothing about JSON, so I will share what I [think I] know so far and what I have accomplished.
I browsed their web page about JSON, which helped a little bit. It gave me the starting-point URL. Here is the web page: http://datahelp.imf.org/knowledgebase/articles/667681-using-json-restful-web-service
I managed to download some lists (using the GET() and fromJSON() functions), which are really bulky. I know enough about lists to tell that the "call" was successful, but I cannot for the life of me get at the actual data. So far, I have been trying to use the rawToChar() function on the "content" data, but I am virtually stuck there.
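For reference, a sketch of that GET() / rawToChar() pattern, using the starting-point URL from the help page:
library(httr)
library(jsonlite)
# fetch the raw response, convert the bytes to a JSON string, then parse it
resp <- GET("http://dataservices.imf.org/REST/SDMX_JSON.svc/Dataflow/")
txt <- rawToChar(content(resp, as = "raw"))
dataflows <- fromJSON(txt)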
If anything, I managed to create data frames that contain the codes, which I presume would be used somewhere in the JSON link. Here is what I have.
library(jsonlite) # for fromJSON()
all.imf.data = fromJSON("http://dataservices.imf.org/REST/SDMX_JSON.svc/Dataflow/")
str(all.imf.data)
#all.imf.data$Structure$Dataflows$Dataflow$Name[[2]] #for the catalogue of sources
catalogue1 = cbind(all.imf.data$Structure$Dataflows$Dataflow$KeyFamilyRef,
                   all.imf.data$Structure$Dataflows$Dataflow$Name[[2]])
catalogue1 = catalogue1[,-2] # catalogue of all the databases
data.structure = fromJSON("http://dataservices.imf.org/REST/SDMX_JSON.svc/DataStructure/IFS")
info1 = data.frame(data.structure$Structure$Concepts$ConceptScheme$Concept[,c(1,4)])
View(data.structure$Structure$CodeLists$CodeList$Description)
str(data.structure$Structure$CodeLists$CodeList$Code)
# Units
units = data.structure$Structure$CodeLists$CodeList$Code[[1]]
# Countries
countries = data.frame(data.structure$Structure$CodeLists$CodeList$Code[[3]])
countries = countries[,-length(countries)]
# Series codes
codes = data.frame(data.structure$Structure$CodeLists$CodeList$Code[[4]])
codes = codes[,-length(codes)]
# all.imf.data   # JSON from the starting point, provided on the website
# catalogue1     # data frame of all the databases: International Financial Statistics, Government Financial Statistics, etc.
# codes          # codes for the specific data sets (GDP, Current Account, etc.)
# countries      # data frame of all the countries and their ISO codes
# data.structure # large list, with starting URL and endpoint "IFS". Ideally, I want to find some data set somewhere within this database.
# info1          # looks like parameters for retrieving the data (for instance, dates, units, etc.)
# units          # data frame that indicates the options for units
I would just like some advice about how to go about retrieving any data, something as simple as GDP (PPP) for a given year. I have been following an article on R-bloggers (which retrieved data from the EU's database), but I cannot replicate the procedure for the IMF. I feel like I am close to retrieving something useful but cannot quite get there. Given that I have data frames containing the names of the databases, the series, and the codes for the series, I think it is just a matter of figuring out how to construct the appropriate URL for getting the data, but I could be wrong.
The data frame codes contains, I presume, the codes for the data sets. Is there a way to make a call for the data for, let's say, the US for BK_DB_BP6_USD, which is "Balance of Payments, Capital Account, Total, Debit, etc."? How should I go about doing this in the context of R?
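A hedged sketch of the kind of request that usually works against this service: the CompactData endpoint takes a database ID plus a dot-separated key of frequency, area, and indicator codes. BK_DB_BP6_USD is a balance-of-payments series, so the BOP database and the A.US. key below are assumptions rather than values confirmed in the question:
library(jsonlite)
# Assumed URL pattern: .../CompactData/<database>/<freq>.<area>.<indicator>
url <- paste0("http://dataservices.imf.org/REST/SDMX_JSON.svc/",
              "CompactData/BOP/A.US.BK_DB_BP6_USD",
              "?startPeriod=2010&endPeriod=2017")
raw <- fromJSON(url)
# Observations sit deep in the nested list returned by this service
obs <- raw$CompactData$DataSet$Series$Obs
head(obs)  # @TIME_PERIOD / @OBS_VALUE pairs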

Uncommitted transactions in Plone + SqlAlchemy + MySql

We have a hybrid web application integrating a MySQL db with Plone (last upgraded to Plone 4.0), using collective.tin, collective.lead, and SQLAlchemy.
OK, I know that collective.tin was never released and collective.lead has been superseded; however, everything has worked (almost) perfectly for a few years.
Recently we experienced some very strange behaviour and are looking for help in order to understand it.
Among others, we have 2 Plone content types, say A and B, defined by subclassing collective.tin, and the corresponding InnoDB MySQL tables; rows of B have a foreign key towards A.
In the span of 15-20 minutes, 2 different users created 3 A objects and some 10-20 B objects that weren't committed to MySQL but were indexed by Plone; queries I executed with a MySQL client from the Linux shell couldn't find those A rows (I didn't look for B rows); however, queries executed through the web application (the aforementioned component stack) by those 2 users, and also by other users, occasionally still found and correctly displayed some of those 3 A objects.
Only after I restarted the Zope instance was it possible to resume normal activity from the Plone web interface; 3 A rows and many B rows were still missing from the MySQL db, but the autoincrement counter showed the expected increment; I had to remove 3 invalid brains for A objects from the Plone index (I didn't worry about B objects).
Any suggestion on possible causes and on how to investigate the problem?
We had the exact same problem with SQLAlchemy 0.4; the session would get out of sync with the actual database contents. The problem was somewhat masked in our case because users were sent to specific backends in the cluster through session affinity. If the affinity was suddenly lost, messages appeared to have disappeared. The exact details are a little hazy, because I cannot locate the correct (ancient) revision history of the fix I put in place.
From what I can glean from context, the session identity map prevents the session from querying the database for objects it has retrieved before, so it won't see changes made to those objects in different sessions.
The fix is to call .expire_all() on the session after each and every commit or rollback; SQLAlchemy 0.5 and up does this automatically (autoexpire=True on the session, now called expire_on_commit, I believe), but for 0.4 you'll need to register a SessionExtension to do this for you.
Lucky for you, we also use collective.lead for this project, so my fix is your fix:
# The identity map should be flushed on commit.
# SQLAlchemy 0.5 does this properly, but in 0.4 we need to do this via
# a SessionExtension.
from sqlalchemy import __version__

if __version__[:3] == '0.4':
    from sqlalchemy.orm.session import SessionExtension

    class ExpireAllSessionExtension(SessionExtension):
        def after_commit(self, session):
            """Expire the identity map on commit."""
            session.expire_all()

        def after_rollback(self, session):
            """Expire the identity map on rollback."""
            session.expire_all()

    def installExtension():
        # Patch collective.lead.database to let us install the extension
        # on the session created there.
        from collective.lead.database import Database
        old_session = Database.session.fget

        def session(self):
            session = old_session(self)
            if session.extension is None:
                session.extension = ExpireAllSessionExtension()
            return session

        Database.session = property(session)
else:
    def installExtension():
        pass
When defining the mapper, you install this extension with:
from .sessionexpiration import installExtension
# Ensure that sessions get properly expired on commit and rollback.
installExtension()

Save R plot to web server

I'm trying to create a procedure which extracts data from a MySQL server (using the RODBC package), performs some statistical routines on that data in R, then saves the generated plots back to the server so that they can be retrieved in a web browser via a little bit of PHP and web magic.
My plan is to save the plot in a MySQL BLOB field by using the RODBC package to execute a SQL insert into statement. I think I can insert the data directly as a string. The problem is, how do I get that data string, and will this even work? My best thought is to use the savePlot function to save a temp file and then read it back in somehow.
Anybody tried this before or have suggestions on how to approach this?
Regardless of whether you think this is a terrible idea, here is a working answer I was able to piece together from this post:
## open connection
library(RODBC)
channel <- odbcConnect("")
## generate a plot and save it to a temp file
x <- rnorm(100,0,1)
hist(x, col="light blue")
savePlot("temp.jpg", type="jpeg")
## read back in the temp file as binary
## (n is an upper bound on the bytes read; 1e6 is plenty for a small jpeg)
plot_binary <- paste(readBin("temp.jpg", what="raw", n=1e6), collapse="")
## insert it into a table
sqlQuery(channel, paste("insert into test values (1, x'",plot_binary,"')", sep=""))
## close connection
odbcClose(channel)
Before implementing this, I'll make sure to do some soul searching to decide if it should be used rather than the server's file system.
Storing images in databases is often frowned upon. To create an in-memory file in R you can use a textConnection as a connection. This will give you the string. It will work if you don't forget to set the proper MIME type and open the connection as binary.
Saving the plot to the server's file system and writing the filename into the database will also work. But there's this thing called rApache that may help. Plus, Jeroen Ooms has some online demos, including a web interface for Hadley Wickham's famous R graphics package ggplot2.
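For completeness, a minimal sketch of that filename-in-database alternative, reusing the channel and temp file from the answer above; the web-root path and the plots table are assumptions:
library(RODBC)
## copy the saved plot into the web server's document root
## ("/var/www/plots" is an assumed path)
file.copy("temp.jpg", "/var/www/plots/plot_1.jpg")
## store only the relative path; the PHP side can then serve the file directly
## (the "plots" table and its columns are assumptions)
sqlQuery(channel, "insert into plots (id, path) values (1, 'plots/plot_1.jpg')")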