httr: retrieving data with POST() - html

Disclaimer: while I have managed to grab data from another source using httr's POST function, let it be known that I am a complete n00b with regards to httr and HTML forms in general.
I would like to bring some data directly into R from a website using httr. My first attempt involved passing a named list to the body arg (as is shown in this vignette). However, I noticed square brackets in the form input names (at least I think they're the form input arguments). So instead, I tried passing in the body as a string as I think it should appear in the request body:
url <- 'http://research.stlouisfed.org/fred2/series/TOTALSA/downloaddata'
query <- paste('form[native_frequency]=Monthly', 'form[units]=lin',
               'form[frequency]=Monthly', 'form[obs_start_date]="1976-01-01"',
               'form[obs_end_date]="2014-11-01"', 'form[file_format]=txt',
               sep = '&')
response <- POST(url, body = query)
In any case, the above code just returns the webpage source code and I cannot figure out how to properly submit the form so that it returns the same data as manually clicking the form's 'Download Data' button.
In Developer Tools/Network on Chrome, it states in the Response Header under Content-Disposition that there is a text file attachment containing the data when I manually click the 'Download Data' button on the form. It doesn't appear to be in any of the headers associated with the response object in the code above. Why isn't this file getting returned by the POST request--where's the file with the data going?
Feels like I'm missing something obvious. Anyone care to help me connect the dots?

Generally, if you're going to use httr, you let it build and encode the data for you; you just pass in the information as a list of form values. Try
library(httr)

url <- "http://research.stlouisfed.org/fred2/series/TOTALSA/downloaddata"
# Named list of form fields; httr encodes the bracketed names for you
query <- list('form[native_frequency]' = "Monthly",
              'form[units]' = "lin",
              'form[frequency]' = "Monthly",
              'form[obs_start_date]' = "1996-01-01",
              'form[obs_end_date]' = "2014-11-01",
              'form[file_format]' = "txt")
response <- POST(url, body = query)
content(response, "text")
and the return looks something like
[1] "Title: Total Vehicle Sales\r\nSeries ID: TOTALSA\r\n
Source: US. Bureau of Economic Analysis\r\n
Release: Supplemental Estimates, Motor Vehicles\r\n
Seasonal Adjustment: Seasonally Adjusted Annual Rate\r\nFrequency: Monthly\r\n
Units: Millions of Units\r\nDate Range: 1996-01-01 to 2014-11-01\r\n
Last Updated: 2014-12-05 7:16 AM CST\r\nNotes: \r\n\r\nDATE VALUE\r\n
1996-01-01 14.8\r\n1996-02-01 15.6\r\n1996-03-01 16.0\r\n1996-04-01 15.5\r\n
1996-05-01 16.0\r\n1996-06-01 15.3\r\n1996-07-01 15.1\r\n1996-08-01 15.5\r\n
1996-09-01 15.5\r\n1996-10-01 15.3\r
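
To see what a form encoder does with bracketed field names, here is a minimal sketch in Python (not R, and not httr's internals). It runs the answer's field names through the standard library's urlencode, then shows how the literal double quotes in the question's hand-built string end up as part of the value. Whether the FRED endpoint expects exactly this encoding is an assumption.

```python
from urllib.parse import urlencode, parse_qs

# Same field names and values as the list in the answer above
fields = {
    "form[native_frequency]": "Monthly",
    "form[units]": "lin",
    "form[frequency]": "Monthly",
    "form[obs_start_date]": "1996-01-01",
    "form[obs_end_date]": "2014-11-01",
    "form[file_format]": "txt",
}

# The encoder percent-encodes the square brackets in the names
body = urlencode(fields)
print(body)

# Decoding the body gives back the original names and values
decoded = parse_qs(body)
print(decoded["form[obs_start_date]"])

# Wrapping a date in literal double quotes, as in the question's string,
# makes the quotes part of the value that is sent (encoded as %22)
quoted = urlencode({"form[obs_start_date]": '"1976-01-01"'})
print(quoted)
```

The point is only that the encoder, not the caller, should handle special characters in names and values.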

Extract or generate X-Client-TraceId for header in GET-request

I would like to retrieve some historical stock prices via a REST API from the following site:
https://www.boerse-frankfurt.de/zertifikat/de0007873291-open-end-zertifikat-auf-dow-jones-industrial-average
The response is a JSON.
Basically, the query can be done as follows: An OPTIONS call is sent without parameters and then a GET request with header parameters.
Both calls are sent to the following address:
https://api.boerse-frankfurt.de/v1/data/quote_history_derivatives?isin=DE0007873291&mic=XSC&from=2021-11-12T07%3A00%3A00.000Z&to=2021-11-12T21%3A00%3A00.000Z&offset=0&limit=25
The following two parameters are included in the header:
Client-Date: 2021-11-16T23:02:29.529Z
X-Client-TraceId: d2d6911d81ebbbff7a7549555a2c26d6
And now my question: how do you get the X-Client-TraceId? It looks like a UUID, but it doesn't seem to be one. The value changes with every page view in the browser. But you can't just enter any value.
Many greetings,
Trebor
Since this question was asked, someone has written a blog post about this exact topic. The algorithm detailed there still seems to be in use (as of 2022-03-12).
An excerpt of the relevant parts:
Client-Date
This is the current time, converted to a string with Javascript’s toISOString() function.
[...]
X-Client-TraceId
[...]
salt is a fixed string, in this case w4icATTGtnjAZMbkL3kJwxMfEAKDa3MN. Apparently it appears in the source code as-is so it must be constant.
X-Client-TraceId is the md5 of time + url + salt.
Note: time is the string sent in the Client-Date header.
The blog post has some additional information around the process of reverse engineering this algorithm and the X-Security header.
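
The recipe in the excerpt can be sketched in Python as follows. This is a minimal sketch, not a verified client: the function names are made up for illustration, the exact form of url that goes into the hash (for example, whether it includes the query string) is not stated in the excerpt, and UTF-8 encoding of the concatenated string is an assumption.

```python
import hashlib
from datetime import datetime, timezone

# Fixed salt quoted in the blog excerpt above
SALT = "w4icATTGtnjAZMbkL3kJwxMfEAKDa3MN"


def client_date(now=None):
    """Format a timestamp like JavaScript's toISOString(): UTC, milliseconds, 'Z'."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return now.strftime("%Y-%m-%dT%H:%M:%S.") + f"{now.microsecond // 1000:03d}Z"


def trace_id(time_str, url, salt=SALT):
    """md5 of time + url + salt, as described in the excerpt (UTF-8 is assumed)."""
    return hashlib.md5((time_str + url + salt).encode("utf-8")).hexdigest()


def build_headers(url, now=None):
    """Hypothetical helper: both headers must be derived from the same time string."""
    ts = client_date(now)
    return {"Client-Date": ts, "X-Client-TraceId": trace_id(ts, url)}


if __name__ == "__main__":
    url = ("https://api.boerse-frankfurt.de/v1/data/quote_history_derivatives"
           "?isin=DE0007873291&mic=XSC&from=2021-11-12T07%3A00%3A00.000Z"
           "&to=2021-11-12T21%3A00%3A00.000Z&offset=0&limit=25")
    print(build_headers(url))
```

Because the hash includes the time string, the value changes on every request, which matches the behaviour described in the question.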

Apps script JSON.parse() returns unexpected result, how can I solve this?

I am currently working on an external app that uses Google Sheets and JSON for data transmission via the Fetch API. For debugging, I decided to mock the scenario in which a simple JSON payload comes from my external app and is posted to Google Sheets through a prepared Code.gs. The code snippet I run in Apps Script looks like this:
function _doPost(/* e */) {
  // const body = e.postData.contents;
  const bodyJSON = JSON.parse("{\"coords\" : \"123,456,789,112,113,114,115,116\"}" /* instead of : body */);
  const db = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
  db.getRange("A1:A10").setValue(bodyJSON.coords).setNumberFormat("#"); // get range, set value, set text format
}
The problem is the result I get: 123,456,789,112,113,000,000,000. As you can see, everything from 114 onward is output as 000 instead. I thought I would explicitly set the format so that the value is saved as text. When I check the selected range in the Google Sheets UI (Format -> Number), it shows Text.
However, something interesting happens if I change the JSON body so that the sequence ends with 2-digit numbers instead of 3-digit ones (note: these are parts of a string separated by commas, not true numbers): "{\"coords\" : \"123,456,789,112,113,114,115,116,17,18\"}". The result then not only appears as expected, but the values previously "corrupted" into 000,... are restored as well: "{"coords" : "123,456,789,112,113,114,115,116,17,18 "}".
Even Logger.log() returns the initial JSON input as expected. I really have no clue what is going on and would appreciate any help in solving this issue. Thank you.
You can try assigning an object literal directly to your bodyJSON variable instead of parsing a string with JSON.parse.
Part of your code should look like this:
const bodyJSON = {
  "coords" : "123,456,789,112,113,114,115,116"
};
I found a simple workaround after all: I added a pair of leading zeros (0,0,123,...) at the very beginning of coords. This prevents the problem described in my question. If anyone is interested, the external app I am currently building is called Hotspot widget: you play around with the DOM and append a marker whose coordinates (coords) are pushed through Apps Script and saved to Google Sheets. I am providing a link with instructions on how to set up your own copy of the app. It's a decent starting point for learning vanilla JavaScript basics, including a simple database approach on the fly. Thank you and good luck!
Hotspot widget on Github
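
One possible explanation, which the thread does not confirm, is that the cell treats a string shaped like a thousands-grouped number as a number and then keeps only about 15 significant digits. The Python sketch below models that guess; as_sheet_number is a hypothetical helper, not a Sheets or Apps Script API, and the grouping rule and the 15-digit limit are assumptions.

```python
from decimal import Decimal


def as_sheet_number(text, digits=15):
    """Hypothetical model of the suspected behaviour, not an actual Sheets API.

    If the text looks like a thousands-grouped integer, parse it as a
    floating-point number and keep only `digits` significant digits;
    otherwise leave the text unchanged.
    """
    groups = text.split(",")
    looks_numeric = (
        all(g.isdigit() for g in groups)
        and 1 <= len(groups[0]) <= 3              # first group: 1 to 3 digits
        and all(len(g) == 3 for g in groups[1:])  # later groups: exactly 3 digits
    )
    if not looks_numeric:
        return text  # stays a plain string
    value = float("".join(groups))
    rounded = int(Decimal(f"{value:.{digits}g}"))
    return f"{rounded:,}"


print(as_sheet_number("123,456,789,112,113,114,115,116"))
print(as_sheet_number("123,456,789,112,113,114,115,116,17,18"))
print(as_sheet_number("0,0,123,456,789,112,113,114,115,116"))
```

Under this model the first string loses its trailing digits, while the string ending in 17,18 and the one with the leading 0,0 workaround no longer look like a grouped number and pass through unchanged, which is consistent with what the question and the workaround report.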

How to retrieve the rest api complete response in vugen?

I am trying to retrieve the complete JSON response in VuGen. I am new to writing scripts in VuGen. I am using the Web HTTP/HTML protocol and just wrote a simple script to call a REST service with POST.
Action()
{
    web_rest("POST: http://losthost:8181/DBConnector/restServices/cass...",
        "URL=http://losthost:8181/DBConnector/restServices/oep_catalog_v1",
        "Method=POST",
        "EncType=raw",
        "Snapshot=t868726.inf",
        HEADERS,
        "Name=filter", "Value=upc=123456789", ENDHEADER,
        "Name=env", "Value=qa", ENDHEADER,
        LAST);
    return 0;
}
I don't know what to do next. I searched the internet for a command to pull the response value. I found web_reg_save_param, but it just pulls one value. I need the complete response saved in a file or string.
Please help.
VuGen provides several APIs to extract response data.
For example, you can do a boundary-based correlation with empty left and right boundaries. The sample below saves the web_rest response (the body of donuts.js) in the parameter CorrelationParameter3.
web_reg_save_param_ex(
    "ParamName=CorrelationParameter3",
    "LB=",
    "RB=",
    SEARCH_FILTERS,
    "Scope=Body",
    LAST);
web_rest("GET: donuts.js",
    "URL=http://adobe.github.io/Spry/data/json/donuts.js",
    "Method=GET",
    "Snapshot=t769333.inf",
    LAST);
This process of locating, extracting and replacing dynamic values is called “correlation”.
You can read more about correlations in the "LoadRunner correlations kept simple" blog post.
Your manager owes you training and a mentor for a period if you are asked to perform in this capacity.

R: getting data from website, method POST, dropdown menu options change

I'm trying to use R to extract data from a website where I have to select information from 5 dropdown menus and then click on an export or consult button (http://200.20.53.7/dadosaguaweb/default.aspx). I found this excellent thread: Getting data in R as dataframe from web source, but it didn't answer my question because of some differences:
1) The website's form's method is Post, not Get;
I tried using the RHTMLForms package together with RCurl, in a way that would work for Post or Get. Namely:
baseURL <- "http://200.20.53.7/dadosaguaweb/default.aspx"
forms<-getHTMLFormDescription(baseURL)
form1<-forms$form1
dadosAgua<-createFunction(form1)
dadosDef<-dadosAgua(75,"PS0421",1979,2015,6309)
2) The website is one of those where the list of options for the second dropdown menu changes according to what you selected for the first one and so on. Therefore, when I set the first input parameter to "75", it does not accept the second one as "PS0421" because that option is not available when the first parameter is at its default value.
So, I tried a step-by-step approach, changing one parameter at a time, like this:
baseURL <- "http://200.20.53.7/dadosaguaweb/default.aspx"
forms1 <- getHTMLFormDescription(baseURL)
form1 <- forms1$form1
dadosAgua1 <- createFunction(form1)
dadosDef1 <- dadosAgua1(75)
forms2 <- getHTMLFormDescription(dadosDef1)
form2 <- forms2$form1
dadosAgua2 <- createFunction(form2)
dadosDef2 <- dadosAgua2(75,"PS0421")
And I get the error message:
Error in function (type, msg, asError = TRUE) : Empty reply from server
Now I'm completely stuck.
I think what you're trying to do is navigation scripting, i.e. getting code to interact with a webpage. It may be complicated to do that programmatically, because for the fields in the form to change in response to what you click, you have to actually be in a web browser.
An alternative might be to use a tool that can do that for you, like CasperJS, which uses a headless browser, so the page fields can change based on behaviour you script. I don't know how comfortable you are with JavaScript, and I don't know of any R packages that can do what CasperJS does, so I can't recommend anything else.
Edit:
Take a look at RSelenium

How to pass back html and logic information after an ajax call with CI

I have a CI and jQuery based project. I've got a site searching my db. It consists of a jQueryUI accordion. One section contains input fields for an advanced search and the other section is used to display a html table with results.
The search parameters from the first section are sent to the server using ajax post. This is crunched by the server and either a html styled error message or a html table with results (and later some other stuff such as how many results found, how much time consumed etc.) is returned.
Back on the client jQuery must be able to distinguish between the two. Best would be to be able to transmit another variable 'search_success'. If 'search_success' is false, the error is prepended to section one above the input fields. Otherwise the html block is displayed in section two and jQuery opens section 2.
Right now I'm returning plain html with a 0 or 1 prepended. This first char is chopped off by jQuery and used to distinguish between the two possible results. This is kind of ugly.
After reading this post about sending array using json I thought about addressing this problem in json.
I intended to build something like
echo json_encode(array('search_success' => $search_success, 'html' => $html));
This would allow for nice structuring of the data. The problem now is that my 'html' is not a simple PHP variable but a view:
<?php
$template = array('table_open' => '<table id="table" data-url="'.base_url().'">');
$this->table->set_template($template);
$this->table->set_heading($table_header);
echo $this->table->generate($table);
?>
This view could also get a lot more complicated. Of course I could abandon the CI MVC and store the whole html in a php string which I could transform to json with the above code. However, this would defeat the purpose of storing the whole html part in a view.
Is there a way to wrap my whole view in json without relinquishing my view architecture?
Or what approach would be more suitable to the problem?
Thanks, singultus
To bring this topic to an end, the answer is simple:
$json['html'] = $this->load->view('myfile', '', true); // third parameter 'true' returns the view as a string
$json['other_stuff'] = $other_stuff;
echo json_encode($json);
See here at the very end. This approach allows for a nicely structured response from the server.
All credit to @koala_dev!
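
The same pattern, rendering the view to a string first and then wrapping it in a JSON envelope, can be sketched outside of CodeIgniter. The Python below is only an illustration of the idea: render_table and search_response are hypothetical names, and the markup is a stand-in for whatever the real view produces.

```python
import json
from html import escape


def render_table(header, rows):
    """Stand-in for a view: returns the HTML as a string instead of printing it."""
    head = "".join(f"<th>{escape(str(h))}</th>" for h in header)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(cell))}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f'<table id="table"><tr>{head}</tr>{body}</table>'


def search_response(rows, error=None):
    """Wrap either an error message or the rendered table in one JSON object."""
    if error is not None:
        html = f'<p class="error">{escape(error)}</p>'
        return json.dumps({"search_success": False, "html": html})
    html = render_table(["id", "name"], rows)
    return json.dumps({"search_success": True, "html": html})


print(search_response([[1, "alpha"], [2, "beta"]]))
print(search_response([], error="No search term given"))
```

On the client, the flag decides where the html string goes, so there is no need to prepend a 0 or 1 to the markup.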