Mixed data sampling in R (midas_r) - regression

I have the following code, in which I want to use a weekly time series (variable x) to forecast a monthly time series (variable y).
Basically, I want to forecast the current month's value of y using the first 1, 2, 3, or all 4 weeks (variable new_value) of the current month.
I think I am using the correct lags, but I am not sure. I am also not sure how to interpret the starting values in the midas_r function (start = list()).
Any help would be much appreciated.
######################
# MIDAS REGRESSION
######################
library(midasr) # provides midas_r(), mls(), nealmon and hAh_test()
x <- structure(c(1.19, 1.24 , 1.67 , 1.67 , 1.55 , 1.67 , 1.39 , 2.01 , 2.14 , 1.71 , 1.59 , 1.49 , 1.68 , -0.37 , -0.44 , -7.87 , -7.79 , -31.22 , -31.05 , -30.47 , -35.53 , -25.48 , -25.9 , -19.03 , -16.33 ,
10.09 , 13.19 , 13.31 , 16.85 , 14.58 , 14.78 , 14.62 , 15.27 , 15.58 , 15.63 , 14.27 , 14.09 , 4.82 , 3.55 , 3.46 , 3.24 , 2.86 , 2.86 , 2.86 , 2.82),
.Tsp = c(2020, 2020.846154, 52), class = "ts")
x <- diff(x)
y <- structure(c(2.30, 2.64 , 2.77 , 2.83 , -43.91 , 12.32 , 26.68 , 12.06 , 10.08 , 12.01 , 4.71 , 3.85),
.Tsp = c(2020, 2020.91667, 12), class = "ts")
y <- diff(y)
trend <- c(1:length(y))
#RUNNING THE MIDAS REGRESSION
reg <- midas_r(y ~ mls(y, 1, 1) + mls(x, 4:7, m = 4, nealmon), start = list(x = c(1, 1, -1, -1)))
summary(reg)
hAh_test(reg)
#forecast(reg, newdata = list(y = c(NA), x =c(rep(NA, 4))))
#FORECAST WITH WEEKLY VALUES
reg <- midas_r(y ~ mls(y, 1, 1) + mls(x, 3:7, m = 4, nealmon), start = list(x = c(1, 1, -1, -1)))
new_value <- 2.52
#new_value <- c(2.52, 3.12)
forecast(reg, newdata = list(x = c(new_value, rep(NA, 4-length(new_value)))))
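On the two points raised in the question (which lags are picked, and what the start values mean), here is a minimal plain-Python sketch. It rests on my reading of the midasr conventions, which should be checked against ?mls and ?nealmon: lag k of mls(x, k, m) for low-frequency period t is taken to be the high-frequency observation at position t*m - k, and nealmon(p, d) is taken to be p[1] times the normalized exponential of a polynomial in the lag index. The helper names mls_rows and nealmon below are illustrative Python stand-ins, not the package's functions, and the toy series weeks is made up.

```python
import math


def mls_rows(x, lags, m):
    """Align high-frequency values to low-frequency periods.

    Assumed convention: for low-frequency period t (1-based), lag k picks
    the high-frequency observation at 1-based position t*m - k, so lag 0
    is the last week of month t. None marks values before the sample start.
    """
    rows = []
    for t in range(1, len(x) // m + 1):
        positions = [t * m - k for k in lags]
        rows.append([x[p - 1] if p >= 1 else None for p in positions])
    return rows


def nealmon(p, d):
    """Normalized exponential Almon weights for d lags (assumed form).

    p[0] scales the weights; p[1:] are coefficients on i, i^2, i^3, ...
    """
    poly = [sum(c * i ** (j + 1) for j, c in enumerate(p[1:]))
            for i in range(1, d + 1)]
    expo = [math.exp(v) for v in poly]
    total = sum(expo)
    return [p[0] * e / total for e in expo]


# Toy series: weeks numbered 1..12, i.e. three "months" of four weeks each
weeks = list(range(1, 13))
prev_month = mls_rows(weeks, range(4, 8), 4)   # analogue of mls(x, 4:7, m = 4)
with_week1 = mls_rows(weeks, range(3, 8), 4)   # analogue of mls(x, 3:7, m = 4)
weights = nealmon([1, 1, -1, -1], 4)           # the start values from the question
```

Under that reading, lags 4:7 use only the four weeks of the previous month, while 3:7 adds the first week of the current month. The four start values would then be one scale parameter plus three polynomial coefficients of the weight function, not one value per lag.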

Looks like you have a problem with indentation, plus some unnecessary complexity.
Try something like this and see if it works:
import requests
import pandas as pd
from bs4 import BeautifulSoup

for i in range(1, 20):
    # build the URL for result page i
    url = 'https://www.boliga.dk/salg/resultater?propertyType=4&street=&municipality=326&salesDateMin=2020&page=' + str(i)
    page = requests.get(url)
    soup = BeautifulSoup(page.text, 'html.parser')
    table = soup.find('table', {'class': 'table generic-table m-0 mb-3'})
    df = pd.read_html(str(table))[0]
    df = pd.DataFrame(df) #if it all works otherwise, try running the code w/out this line; it may be unnecessary
df

How to merge 2 row headers in a single column in a data table and insert a reactive object?

This is the next step in my attempt to build a user-friendly transition matrix in R, a follow-on to post How to add a vertical line to the first column header in a data table?. I have been spoiled by the ease of drafting eye-friendly tables in Excel and have been struggling with this in R Shiny.
Running the MWE code at the bottom generates the transition table shown on the left side of the image below (with my comments overlaying). Expressing my question in Excel-speak, I'm trying to merge the top 2 cells (rows) in the left-most column (call them cells A1 and A2), eliminate the small bit of line just above "to_state" (cell A2)(item #1 in the image), eliminate that first column's header "to_state" (in cell A2)(item #2 in the image), and into that merged column header space insert an object similar to the object hovering over the "From" columns to the right, that states "To state where end period = x", where x is the value of object transTo() (item #3 in the image). Any suggestions for doing this? Using DT for the table rendering if possible.
I'm open to any other suggestion for drafting a user-friendly, understandable state transition matrix that delineates to/from columns/rows and reactively shows the to/from periods.
Post Shiny: Merge cells in DT::datatable seems promising but it addresses merging rows in the body of the table and not header rows.
Please note that in the fuller code, the table dynamically contracts/expands based on the number of unique states detected in the underlying data. States can range from 2 to 12.
MWE code:
library(DT)
library(shiny)
library(dplyr)
library(htmltools)
library(data.table)
data <-
data.frame(
ID = c(1,1,1,2,2,2,3,3,3),
Period = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
Values = c(5, 10, 15, 0, 2, 4, 3, 6, 9),
State = c("X0","X1","X2","X0","X2","X0", "X2","X1","X0")
)
numTransit <- function(x, from=1, to=3){
setDT(x)
unique_state <- unique(x$State)
all_states <- setDT(expand.grid(list(from_state = unique_state, to_state = unique_state)))
dcast(x[, .(from_state = State[from],
to_state = State[to]),
by = ID]
[,.N, c("from_state", "to_state")]
[all_states,on = c("from_state", "to_state")],
to_state ~ from_state, value.var = "N"
)
}
ui <- fluidPage(
tags$head(tags$style(".datatables .display {margin-left: 0;}")), # < left-align the table
h4(strong("Base data frame:")),
tableOutput("data"),
h4(strong("Transition table inputs:")),
numericInput("transFrom", "From period:", 1, min = 1, max = 3),
numericInput("transTo", "To period:", 2, min = 1, max = 3),
h4(strong("Output transition table:")),
DTOutput("resultsDT"),
)
server <- function(input, output, session) {
results <-
reactive({
results <- numTransit(data, input$transFrom, input$transTo) %>%
replace(is.na(.), 0) %>%
bind_rows(summarise_all(., ~(if(is.numeric(.)) sum(.) else "Sum")))
results <- cbind(results, Sum = rowSums(results[,-1]))
})
output$data <- renderTable(data)
output$resultsDT <- renderDT(server=FALSE, {
req(results())
datatable(
data = results(),
rownames = FALSE,
filter = 'none',
container = tags$table(
class = 'display',
tags$thead(
tags$tr(
tags$th(colspan = 1, '', style = "border-right: solid 1px;"),
tags$th(colspan = 10, sprintf('From state where initial period = %s', input$transFrom))
),
tags$tr(
mapply(tags$th, colnames(results()), style = sprintf("border-right: solid %spx;", c(1L, rep(0, ncol(results())-1L))), SIMPLIFY = FALSE)
)
)
),
options = list(scrollX = F
, dom = 'ft'
, lengthChange = T
, pagingType = "numbers" # hides Next and Previous buttons
, autoWidth = T
, info = FALSE # hide the "Showing 1 of 2..." at bottom of table
, searching = FALSE # removes search box
),
class = "display"
) %>%
formatStyle(c(1), `border-right` = "solid 1px")
})
}
shinyApp(ui, server)
Please see the related posts that led to the solution shown at the bottom: How to merge to row cells in data table?, How to add a vertical line to the first column header in a data table?, and How to add reactive object to secondary column header in output table?
Solution:
library(DT)
library(shiny)
library(dplyr)
library(htmltools)
library(data.table)
data <-
data.frame(
ID = c(1,1,1,2,2,2,3,3,3),
Period = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
Values = c(5, 10, 15, 0, 2, 4, 3, 6, 9),
State = c("X0","X1","X2","X0","X2","X0", "X2","X1","X0")
)
numTransit <- function(x, from=1, to=3){
setDT(x)
unique_state <- unique(x$State)
all_states <- setDT(expand.grid(list(from_state = unique_state, to_state = unique_state)))
dcast(x[, .(from_state = State[from],
to_state = State[to]),
by = ID]
[,.N, c("from_state", "to_state")]
[all_states,on = c("from_state", "to_state")],
to_state ~ from_state, value.var = "N"
)
}
ui <- fluidPage(
tags$head(tags$style(".datatables .display {margin-left: 0;}")), # < left-align the table
h4(strong("Base data frame:")),
tableOutput("data"),
h4(strong("Transition table inputs:")),
numericInput("transFrom", "From period:", 1, min = 1, max = 3),
numericInput("transTo", "To period:", 2, min = 1, max = 3),
h4(strong("Output transition table:")),
DTOutput("resultsDT"),
)
server <- function(input, output, session) {
results <-
reactive({
results <- numTransit(data, input$transFrom, input$transTo) %>%
replace(is.na(.), 0) %>%
bind_rows(summarise_all(., ~(if(is.numeric(.)) sum(.) else "Sum")))
results <- cbind(results, Sum = rowSums(results[,-1]))
})
output$data <- renderTable(data)
output$resultsDT <- renderDT(server=FALSE, {
req(results())
datatable(
data = results(),
rownames = FALSE,
filter = 'none',
container = tags$table(
class = 'display',
tags$thead(
tags$tr(
tags$th(rowspan = 2, sprintf('To state where end period = %s', input$transTo), style = "border-right: solid 1px;"),
tags$th(colspan = 10, sprintf('From state where initial period = %s', input$transFrom))
),
tags$tr(
mapply(tags$th, colnames(results())[-1], style = sprintf("border-right: solid %spx;", rep(0, ncol(results()) - 1L)), SIMPLIFY = FALSE)
)
)
),
options = list(scrollX = F
, dom = 'ft'
, lengthChange = T
, pagingType = "numbers" # hides Next and Previous buttons
, autoWidth = T
, info = FALSE # hide the "Showing 1 of 2..." at bottom of table
, searching = FALSE # removes search box
),
class = "display"
) %>%
formatStyle(c(1), `border-right` = "solid 1px")
})
}
shinyApp(ui, server)

How to add reactive object to secondary column header in output table?

I am working on a transition table module and am wrestling with how to make the output understandable for the user. I used to prepare transition tables in Excel; making the table legible was super easy but deriving the data for table output took hours. Now my problem is the opposite with R: takes a few seconds to generate the table output from millions of rows of data but table presentation is far from simple.
To start, I would like to reflect the user's "From" input (object transFrom) in this reactive table's secondary column header as shown in the image below; any suggestions for how to do this? I am completely clueless with respect to html. I had found this solution here R Shiny app - Render Data Table with double header, and I like it because it uses DT, which I use throughout (though I would have preferred the base R table, using Shiny renderTable(), but I could not make that work). I have researched this and found other packages for drafting nice tables but I am avoiding "package bloat" and would rather stick with base R, Shiny, data.table, and DT package IF POSSIBLE.
Note that the columns reflect the transition states FROM, and the rows reflect the transition states TO.
Here is the MWE code for actively rendering the above:
library(data.table)
library(dplyr)
library(DT) # needed for DTOutput(), renderDT() and datatable()
library(shiny)
# custom table container
sketch = htmltools::withTags(table(
class = 'display',
thead(
tr(
th(colspan = 1, ''),
th(colspan = 10, 'From state where initial period is = ')
),
tr(
lapply(colnames(results), th)
)
)
))
data <-
data.frame(
ID = c(1,1,1,2,2,2,3,3,3),
Period = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
Values = c(5, 10, 15, 0, 2, 4, 3, 6, 9),
State = c("X0","X1","X2","X0","X2","X0", "X2","X1","X0")
)
ui <- fluidPage(
tags$head(tags$style(".datatables .display {margin-left: 0;}")), # < left-align the table
h4(strong("Base data frame:")),
tableOutput("data"),
h4(strong("Transition table inputs:")),
numericInput("transFrom", "From period:", 1, min = 1, max = 3),
numericInput("transTo", "To period:", 2, min = 1, max = 3),
h4(strong("Output transition table:")),
DTOutput("results"),
)
server <- function(input, output) {
numTransit <- function(x, from=1, to=3){
setDT(x)
unique_state <- unique(x$State)
all_states <- setDT(expand.grid(list(from_state = unique_state, to_state = unique_state)))
dcast(x[, .(from_state = State[from],
to_state = State[to]),
by = ID]
[,.N, c("from_state", "to_state")]
[all_states,on = c("from_state", "to_state")],
to_state ~ from_state, value.var = "N"
)
}
results <-
reactive({
results <- numTransit(data,input$transFrom,input$transTo) %>%
replace(is.na(.), 0) %>%
bind_rows(summarise_all(., ~(if(is.numeric(.)) sum(.) else "Sum")))
results <- cbind(results, Sum = rowSums(results[,-1]))
})
output$data <- renderTable(data)
output$results <- renderDT(server=FALSE,{
results() %>%
datatable(rownames = FALSE,
filter = 'none',
container = sketch,
options = list(scrollX = F
, dom = 'ft'
, lengthChange = T
, pagingType = "numbers" # hides Next and Previous buttons
, autoWidth = T
, info = FALSE # hide the "Showing 1 of 2..." at bottom of table
,searching = FALSE # removes search box
),
class = "display"
)
})
}
shinyApp(ui, server)
It seems that htmltools::withTags doesn't play well with using shiny inputs (I filed an issue here).
Please check the following:
library(DT)
library(shiny)
library(dplyr) # needed for bind_rows() and summarise_all()
library(htmltools)
library(data.table)
data <-
data.frame(
ID = c(1,1,1,2,2,2,3,3,3),
Period = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
Values = c(5, 10, 15, 0, 2, 4, 3, 6, 9),
State = c("X0","X1","X2","X0","X2","X0", "X2","X1","X0")
)
numTransit <- function(x, from=1, to=3){
setDT(x)
unique_state <- unique(x$State)
all_states <- setDT(expand.grid(list(from_state = unique_state, to_state = unique_state)))
dcast(x[, .(from_state = State[from],
to_state = State[to]),
by = ID]
[,.N, c("from_state", "to_state")]
[all_states,on = c("from_state", "to_state")],
to_state ~ from_state, value.var = "N"
)
}
ui <- fluidPage(
tags$head(tags$style(".datatables .display {margin-left: 0;}")), # < left-align the table
h4(strong("Base data frame:")),
tableOutput("data"),
h4(strong("Transition table inputs:")),
numericInput("transFrom", "From period:", 1, min = 1, max = 3),
numericInput("transTo", "To period:", 2, min = 1, max = 3),
h4(strong("Output transition table:")),
DTOutput("resultsDT"),
)
server <- function(input, output, session) {
results <-
reactive({
results <- numTransit(data, input$transFrom, input$transTo) %>%
replace(is.na(.), 0) %>%
bind_rows(summarise_all(., ~(if(is.numeric(.)) sum(.) else "Sum")))
results <- cbind(results, Sum = rowSums(results[,-1]))
})
output$data <- renderTable(data)
output$resultsDT <- renderDT(server=FALSE, {
req(results())
datatable(
data = results(),
rownames = FALSE,
filter = 'none',
container = tags$table(
class = 'display',
tags$thead(
tags$tr(
tags$th(colspan = 1, ''),
tags$th(colspan = 10, sprintf('From state where initial period is = %s', input$transFrom))
),
tags$tr(
lapply(colnames(results()), tags$th)
)
)
),
options = list(scrollX = F
, dom = 'ft'
, lengthChange = T
, pagingType = "numbers" # hides Next and Previous buttons
, autoWidth = T
, info = FALSE # hide the "Showing 1 of 2..." at bottom of table
, searching = FALSE # removes search box
),
class = "display"
)
})
}
shinyApp(ui, server)
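For readers who want to check what the table should contain independently of the rendering, here is a minimal plain-Python sketch of the counting step, using the same nine rows as the example data frame. The function name count_transitions is hypothetical, and it looks states up by the Period value, whereas numTransit indexes by row position within each ID; the two agree for this data only because every ID has periods 1 to 3 in order.

```python
from collections import Counter

# (ID, Period, State) rows copied from the example data frame
rows = [
    (1, 1, "X0"), (1, 2, "X1"), (1, 3, "X2"),
    (2, 1, "X0"), (2, 2, "X2"), (2, 3, "X0"),
    (3, 1, "X2"), (3, 2, "X1"), (3, 3, "X0"),
]


def count_transitions(rows, from_period, to_period):
    """Return table[to_state][from_state] = number of IDs making that move."""
    state_at = {(id_, period): state for id_, period, state in rows}
    ids = sorted({id_ for id_, _, _ in rows})
    states = sorted({state for _, _, state in rows})
    pairs = Counter(
        (state_at[(i, from_period)], state_at[(i, to_period)]) for i in ids
    )
    # Rows are TO states, columns are FROM states; missing pairs count as 0
    return {to: {frm: pairs[(frm, to)] for frm in states} for to in states}


table = count_transitions(rows, 1, 2)
```

With from period 1 and to period 2, the three IDs move X0 to X1, X0 to X2, and X2 to X1, so the X0 column sums to 2 and the X2 column to 1.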

Shifting tensor over dimension

Let's say I have a tensor A of shape [batch_size x length x 1024].
I want to do the following:
For each element i in the batch, I want to shift (roll) the 1024-dimensional embedding at each of the 'length' positions by that position's index.
For example, the vector A[0, 0, :] should stay the same, A[0, 1, :] should be shifted (rolled) by 1, and A[0, 15, :] should be shifted by 15.
This applies to all the elements in the batch.
So far I have done it with for loops, but that is not efficient.
Below is my code with for loops:
import torch

x = torch.randn(2, 16, 1024)  # placeholder input of shape [batch_size, length, 1024]
new_embedding = []
llist = []
batch_size = x.shape[0]
seq_len = x.shape[1]
for sample in range(batch_size):
    for token in range(seq_len):
        orig = x[sample, token, :]
        # roll this token's embedding by its position in the sequence
        new_embedding.append(torch.roll(orig, token, 0))
    llist.append(torch.stack(new_embedding, 0))
    new_embedding = []
x = torch.stack(llist, 0)
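One way to think about removing the loops: rolling a vector of length D to the right by t puts the element from source index (j - t) % D at output index j. That index grid depends only on the token position, so it can be built once and reused for every sample. The sketch below shows the idea in plain Python on a made-up toy input; build_roll_index and gather_last_dim are hypothetical helper names. In PyTorch, the same grid could presumably be built with torch.arange and broadcasting, expanded over the batch, and passed to torch.gather(x, 2, index); that variant is not shown or benchmarked here.

```python
def build_roll_index(seq_len, dim):
    """index[t][j] is the source column for output column j of token t."""
    return [[(j - t) % dim for j in range(dim)] for t in range(seq_len)]


def gather_last_dim(sample, index):
    """Plain-Python stand-in for a gather along the last dimension."""
    return [[row[k] for k in idx_row] for row, idx_row in zip(sample, index)]


# Hypothetical toy input of shape [batch_size=2, length=3, dim=4]
x = [
    [[0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2, 3]],
    [[4, 5, 6, 7], [4, 5, 6, 7], [4, 5, 6, 7]],
]

index = build_roll_index(seq_len=3, dim=4)          # built once, shared by all samples
result = [gather_last_dim(sample, index) for sample in x]
```

Token 0 is left unchanged, token 1 is rolled by one position, and token 2 by two, which matches what torch.roll(orig, token, 0) does in the loop version.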

Assigned Octave variable not being saved to file

In the Octave script below I am looping through files in a directory, loading them into Octave to do some manipulation of the data, and then attempting to write the manipulated data (a matrix) to a new file whose name is derived from the name of the input file. The manipulated data is assigned to a variable that has the same name as the file it is to be saved in. All unwanted variables are cleared, and the save command should write the single assigned matrix to the file named by new_filename.
However, this last save/write command is not happening, and I don't understand why not. Without specific variable names, the save function should save all variables in scope, in this case only the one matrix. Why is this not working?
clear all ;
all_raw_OHLC_files = glob( "*_raw_OHLC_daily" ) ; % cell with filenames matching *_raw_OHLC_daily
for ii = 1 : length( all_raw_OHLC_files ) % loop for length of above cell
filename = all_raw_OHLC_files{ii} ; % get files' names
% create a new filename for the output file
split_filename = strsplit( filename , "_" ) ;
new_filename = tolower( [ split_filename{1} "_" split_filename{2} "_ohlc_daily" ] ) ;
% open and read file
fid = fopen( filename , 'rt' ) ;
data = textscan( fid , '%s %f %f %f %f %f %s' , 'Delimiter' , ',' , 'CollectOutput', 1 ) ;
fclose( fid ) ;
ex_data = [ datenum( data{1} , 'yyyy-mm-dd HH:MM:SS' ) data{2} ] ; % extract the file's data
% process the raw data in to OHLC bars
weekday_ix = weekday( ex_data( : , 1 ) ) ;
% find Mondays immediately preceded by Sundays in the data
monday_ix = find( ( weekday_ix == 2 ) .* ( shift( weekday_ix , 1 ) == 1 ) ) ;
sunday_ix = monday_ix .- 1 ;
% replace Monday open with the Sunday open
ex_data( monday_ix , 2 ) = ex_data( sunday_ix , 2 ) ;
% replace Monday high with max of Sunday high and Monday high
ex_data( monday_ix , 3 ) = max( ex_data( sunday_ix , 3 ) , ex_data( monday_ix , 3 ) ) ;
% repeat for min of lows
ex_data( monday_ix , 4 ) = min( ex_data( sunday_ix , 4 ) , ex_data( monday_ix , 4 ) ) ;
% combines volume figures
ex_data( monday_ix , 6 ) = ex_data( sunday_ix , 6 ) .+ ex_data( monday_ix , 6 ) ;
% now delete the sunday data
ex_data( sunday_ix , : ) = [] ;
assignin( "base" , tolower( [ split_filename{1} "_" split_filename{2} "_ohlc_daily" ] ) , ex_data )
clear ans weekday_ix sunday_ix monday_ix ii filename split_filename fid ex_data data all_raw_OHLC_files
% print to file
save new_filename
endfor
save new_filename saves the current workspace to a file with the filename "new_filename". I guess what you want is to create a file with a filename that is stored in "new_filename":
save (new_filename);
Your current approach of "clearing everything I don't need and then storing the whole workspace" is IMHO very ugly; you should instead explicitly store ex_data if this is the only part you want:
save (new_filename, "ex_data");

Crosstables for survey data (weighted and unweighted)

I have survey data that I am working on. I need to make some tables and regression analyses on the data.
After attaching the data, this is the code I use for tables for four variables:
ftable(var1, var2, var3, var4)
And this is the regression code that I use for the data:
logit.1 <- glm(var4 ~ var3 + var2 + var1, family = binomial(link = "logit"))
summary(logit.1)
So far so good for the unweighted analyses. But how can I do the same analyses for the weighted data? Here is some additional info:
There are four variables in the dataset that reflect the sampling structure. These are:
strat: stratum (urban or (sub-county) rural)
clust: batch of interviews that were part of the same random walk
vill_neigh_code: village or neighbourhood code
sweight: weights
library(survey)
data(api)
# example data set
head( apiclus2 )
# instead of var1 - var4, use these four variables:
ftable( apiclus2[ , c( 'sch.wide' , 'comp.imp' , 'both' , 'awards' ) ] )
# move it over to x for faster typing
x <- apiclus2
# also give x a column of all ones
x$one <- 1
# run the glm() function specified.
logit.1 <-
glm(
comp.imp ~ target + cnum + growth ,
data = x ,
family = binomial( link = 'logit' )
)
summary( logit.1 )
# now create the survey object you've described
dclus <-
svydesign(
id = ~dnum + snum , # cluster variable(s)
strata = ~stype , # stratum variable
weights = ~pw , # weight variable
data = x ,
nest = TRUE
)
# weighted counts
svyby(
~one ,
~ sch.wide + comp.imp + both + awards ,
dclus ,
svytotal
)
# weighted counts formatted differently
ftable(
svyby(
~one ,
~ sch.wide + comp.imp + both + awards ,
dclus ,
svytotal ,
keep.var = FALSE
)
)
# run the svyglm() function specified.
logit.2 <-
svyglm(
comp.imp ~ target + cnum + growth ,
design = dclus ,
family = binomial( link = 'logit' )
)
summary( logit.2 )
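To make the weighted counts above less of a black box: the point estimate of a survey total for a column of ones is the sum of the weights in each cell of the cross-tabulation, while the strata and clusters in the design feed into the standard errors rather than the cell totals. A minimal plain-Python sketch with made-up records follows. The field names var1, var2 and sweight echo the question, the values are invented, and weighted_crosstab is a hypothetical helper, not part of the survey package.

```python
from collections import defaultdict


def weighted_crosstab(records, keys, weight_key):
    """Sum the weight column within each combination of the key columns."""
    totals = defaultdict(float)
    for rec in records:
        cell = tuple(rec[k] for k in keys)
        totals[cell] += rec[weight_key]
    return dict(totals)


# Invented example records; sweight plays the role of the sampling weight
records = [
    {"var1": "Yes", "var2": "No", "sweight": 1.5},
    {"var1": "Yes", "var2": "No", "sweight": 2.0},
    {"var1": "No", "var2": "No", "sweight": 0.5},
    {"var1": "Yes", "var2": "Yes", "sweight": 1.0},
]

# Weighted cell totals: each respondent counts sweight times
weighted = weighted_crosstab(records, ["var1", "var2"], "sweight")

# Unweighted counts are the special case where every weight equals 1
ones = [dict(rec, sweight=1.0) for rec in records]
unweighted = weighted_crosstab(ones, ["var1", "var2"], "sweight")
```

This only reproduces the cell totals; for standard errors and the weighted regression, the svydesign() object and svyglm() shown above are still needed.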