Problem retrieving a table from the Chinook dataset due to special characters - mysql

I have created a database in MySQL with data from the Chinook dataset, which contains fictitious information about customers who buy music.
One of the tables ("Invoice") has the billing addresses, which contain characters from several languages:
InvoiceId CustomerId InvoiceDate BillingAddress
1 2 2009-01-01 00:00:00 Theodor-Heuss-Straße 34
2 4 2009-01-02 00:00:00 Ullevålsveien 14
3 8 2009-01-03 00:00:00 Grétrystraat 63
4 14 2009-01-06 00:00:00 8210 111 ST NW
I tried to retrieve the data using R, with the following code:
library(DBI)
library(RMySQL)
library(dplyr)
library(magrittr)
library(lubridate)
library(stringi)
# Step 1 - Connect to the database ----------------------------------------
con <- DBI::dbConnect(MySQL(),
dbname = Sys.getenv("DB_CHINOOK"),
host = Sys.getenv("HST_CHINOOK"),
user = Sys.getenv("USR_CHINOOK"),
password = Sys.getenv("PASS_CHINOOK"),
port = XXXX)
invoices_tbl <- tbl(con, "Invoice") %>%
collect()
The connection is ok, but when trying to visualize the data, I can't see the special characters:
> head(invoices_tbl[,1:4])
# A tibble: 6 x 4
InvoiceId CustomerId InvoiceDate BillingAddress
<int> <int> <chr> <chr>
1 1 2 2009-01-01 00:00:00 "Theodor-Heuss-Stra\xdfe 34"
2 2 4 2009-01-02 00:00:00 "Ullev\xe5lsveien 14"
3 3 8 2009-01-03 00:00:00 "Gr\xe9trystraat 63"
4 4 14 2009-01-06 00:00:00 "8210 111 ST NW"
5 5 23 2009-01-11 00:00:00 "69 Salem Street"
6 6 37 2009-01-19 00:00:00 "Berger Stra\xdfe 10"
My question is, should I change something in the configuration inside MySQL? Or is it an issue with R? How can I see the special characters? What is the meaning of \xdfe?
Please, any help will be greatly appreciated.

The hexadecimal escapes are latin1-encoded bytes (\xdf is ß in latin1); they can be converted with iconv
invoices_tbl$BillingAddress <- iconv(invoices_tbl$BillingAddress,
"latin1", "utf-8")
Output:
invoices_tbl
InvoiceId CustomerId InvoiceDate BillingAddress
1 1 2 2009-01-01 00:00:00 Theodor-Heuss-Straße 34
2 2 4 2009-01-02 00:00:00 Ullevålsveien 14
3 3 8 2009-01-03 00:00:00 Grétrystraat 63
4 4 14 2009-01-06 00:00:00 8210 111 ST NW
5 5 23 2009-01-11 00:00:00 69 Salem Street
6 6 37 2009-01-19 00:00:00 Berger Straße 10
Data:
invoices_tbl <- structure(list(InvoiceId = 1:6, CustomerId = c(2L, 4L, 8L, 14L,
23L, 37L), InvoiceDate = c("2009-01-01 00:00:00", "2009-01-02 00:00:00",
"2009-01-03 00:00:00", "2009-01-06 00:00:00", "2009-01-11 00:00:00",
"2009-01-19 00:00:00"), BillingAddress = c("Theodor-Heuss-Stra\xdfe 34",
"Ullev\xe5lsveien 14", "Gr\xe9trystraat 63", "8210 111 ST NW",
"69 Salem Street", "Berger Stra\xdfe 10")), row.names = c("1",
"2", "3", "4", "5", "6"), class = "data.frame")

MySQL - Getting number of days user spent on each status

I have been trying to extract, from a MySQL database table, the number of days a particular user spent in each status during a given month. The data is saved in log format, which makes it a bit hard to work with. For example, I need to calculate the number of days user 488 spent in each status in June 2022 only.
user_id old_status new_status modified_on
488 3 10 31/05/2022 10:03
488 10 5 01/06/2022 13:05
488 5 16 07/06/2022 16:06
488 16 2 09/06/2022 08:26
488 2 6 30/06/2022 13:51
488 6 2 07/07/2022 09:44
488 2 6 08/08/2022 13:25
488 6 1 15/08/2022 10:37
488 1 11 02/09/2022 13:48
488 11 2 03/10/2022 07:26
488 2 10 10/10/2022 10:17
488 10 6 25/01/2023 17:50
488 6 1 01/02/2023 13:46
The output should look like this:
user status Days
488 5 6
488 16 2
488 2 21
I tried multiple ways to join the same table with itself in order to find the solution but no luck. Any help will be appreciated.
Here is what I think you should do: first join the old_status field in the log table with the status table, then use the DATEDIFF function to subtract modified_on (log table) from created_at (or whatever field in the status table stores the creation time). You can filter the results with a WHERE clause to get certain users on certain dates.
This query might help (I don't know the structure of your tables, so edit it to suit your needs):
SELECT *, DATEDIFF(log.modified_on, st.created_at) AS spent_time_on_status
FROM log_status AS log JOIN status AS st ON st.id = log.old_status
WHERE log.user_id = 488 AND EXTRACT(MONTH FROM st.created_at) = 6
This is a suggestion to get you started. It will not get you all the way (since there are several status changes to and from the same status...)
SELECT
shfrom.userid,
shfrom.new_status as statusName,
shfrom.modified_on as fromdate,
shto.modified_on as todate,
DATEDIFF(shto.modified_on, shfrom.modified_on) as days_spent_in_status
FROM
status_history as shfrom
INNER JOIN status_history as shto
ON shfrom.userid = shto.userid and shfrom.new_status = shto.old_status
WHERE
shfrom.modified_on < shto.modified_on
;
I created a table based on your question and put in the data you provided, in MySQL format:
create table status_history(
userid int,
old_status int,
new_status int,
modified_on datetime
);
insert into status_history values
(488, 3,10, '2022-05-31 10:03'),
(488,10, 5, '2022-06-01 13:05'),
(488, 5,16, '2022-06-07 16:06'),
(488,16, 2, '2022-06-09 08:26'),
(488, 2, 6, '2022-06-30 13:51'),
(488, 6, 2, '2022-07-07 09:44'),
(488, 2, 6, '2022-08-08 13:25'),
(488, 6, 1, '2022-08-15 10:37'),
(488, 1,11, '2022-09-02 13:48'),
(488,11, 2, '2022-10-03 07:26'),
(488, 2,10, '2022-10-10 10:17'),
(488,10, 6, '2023-01-25 17:50'),
(488, 6, 1, '2023-02-01 13:46');
This produces the following result, where days_spent_in_status is the time spent:
userid  statusName  fromdate             todate               days_spent_in_status
488     10          2022-05-31 10:03:00  2022-06-01 13:05:00    1
488     5           2022-06-01 13:05:00  2022-06-07 16:06:00    6
488     16          2022-06-07 16:06:00  2022-06-09 08:26:00    2
488     2           2022-06-09 08:26:00  2022-06-30 13:51:00   21
488     6           2022-06-30 13:51:00  2022-07-07 09:44:00    7
488     2           2022-06-09 08:26:00  2022-08-08 13:25:00   60
488     2           2022-07-07 09:44:00  2022-08-08 13:25:00   32
488     6           2022-06-30 13:51:00  2022-08-15 10:37:00   46
488     6           2022-08-08 13:25:00  2022-08-15 10:37:00    7
488     1           2022-08-15 10:37:00  2022-09-02 13:48:00   18
488     11          2022-09-02 13:48:00  2022-10-03 07:26:00   31
488     2           2022-06-09 08:26:00  2022-10-10 10:17:00  123
488     2           2022-07-07 09:44:00  2022-10-10 10:17:00   95
488     2           2022-10-03 07:26:00  2022-10-10 10:17:00    7
488     10          2022-05-31 10:03:00  2023-01-25 17:50:00  239
488     10          2022-10-10 10:17:00  2023-01-25 17:50:00  107
488     6           2022-06-30 13:51:00  2023-02-01 13:46:00  216
488     6           2022-08-08 13:25:00  2023-02-01 13:46:00  177
488     6           2023-01-25 17:50:00  2023-02-01 13:46:00    7
You still need to filter out the rows that pair an early status change with a later, non-adjacent one; one way to do that is sketched below. I hope this gets you started.
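A sketch of that filter, using the status_history table created above: a pair of rows is kept only if no other status change for the same user falls between them, so each row pairs a change with the immediately following one (on MySQL 8+ a LEAD() window function would be another option). You would still need to restrict the result to June 2022 and aggregate per status.
SELECT
shfrom.userid,
shfrom.new_status AS statusName,
shfrom.modified_on AS fromdate,
shto.modified_on AS todate,
DATEDIFF(shto.modified_on, shfrom.modified_on) AS days_spent_in_status
FROM
status_history AS shfrom
INNER JOIN status_history AS shto
ON shfrom.userid = shto.userid AND shfrom.new_status = shto.old_status
WHERE
shfrom.modified_on < shto.modified_on
AND NOT EXISTS (
    SELECT 1
    FROM status_history AS mid
    WHERE mid.userid = shfrom.userid
      AND mid.modified_on > shfrom.modified_on
      AND mid.modified_on < shto.modified_on
)
;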

web scraping understat website to retrieve table failing in R

I am trying to pull out a table from the website https://understat.com/league/EPL
The table I am trying to import into R is highlighted in red in the screenshot here:
screenshot of website
Using the inspect tools I can see the XPath to the table as follows:
//*[@id="league-chemp"]/table
The full XPath is:
/html/body/div[1]/div[3]/div[3]/div/div[2]/div/table
My code is as follows:
library(rvest)
library(selectr)
library(xml2)
library(jsonlite)
library(htmltab)
library(RCurl)
library(XML)
url <- 'https://understat.com/league/EPL'
webpage <- read_html('https://understat.com/league/EPL')
xpath <- "/html/body/div[1]/div[3]/div[3]/div/div[2]/div/table/tbody"
nodes <- html_nodes(webpage, xpath = xpath)
However, the response is:
> nodes
{xml_nodeset (0)}
I've hit a dead end. I think there may be some embedded JSON and JavaScript within the main HTML body of the response that is causing issues, but it's all above my expertise right now.
I have been able to extract the table with the following code:
library(rvest)
library(RSelenium)
port <- as.integer(4444L + rpois(lambda = 1000, 1))
rd <- rsDriver(chromever = "105.0.5195.52", browser = "chrome", port = port)
remDr <- rd$client
remDr$open()
url <- "https://understat.com/league/EPL"
remDr$navigate(url)
Sys.sleep(5)
html_Content <- remDr$getPageSource()[[1]]
tables <- read_html(html_Content) %>% html_table()
tables
[[1]]
# A tibble: 20 x 12
`<U+2116>` Team M W D L G GA PTS xG xGA xPTS
<int> <chr> <int> <int> <int> <int> <int> <int> <int> <chr> <chr> <chr>
1 1 Arsenal 9 8 0 1 23 10 24 19.48-3.52 8.17-1.83 20.03-3.97
2 2 Manchester City 9 7 2 0 33 9 23 23.27-9.73 5.81-3.19 23.59+0.59
3 3 Tottenham 9 6 2 1 20 10 20 14.78-5.22 10.60+0.60 15.12-4.88
4 4 Chelsea 8 5 1 2 13 10 16 12.10-0.90 10.62+0.62 11.86-4.14
5 5 Manchester United 8 5 0 3 13 15 15 12.35-0.65 11.41-3.59 11.86-3.14
6 6 Newcastle United 9 3 5 1 17 9 14 18.41+1.41 12.13+3.13 15.73+1.73
7 7 Brighton 8 4 2 2 14 9 14 14.52+0.52 8.58-0.42 15.53+1.53
8 8 Bournemouth 9 3 3 3 8 20 12 5.26-2.74 15.39-4.61 6.43-5.57
9 9 Fulham 9 3 2 4 14 18 11 10.22-3.78 21.34+3.34 7.02-3.98
10 10 Liverpool 8 2 4 2 20 12 10 17.02-2.98 12.33+0.33 12.95+2.95
11 11 Brentford 9 2 4 3 16 17 10 13.28-2.72 13.00-4.00 12.78+2.78
12 12 Everton 9 2 4 3 8 9 10 10.33+2.33 14.67+5.67 9.13-0.87
13 13 West Ham 9 3 1 5 8 10 10 11.51+3.51 9.64-0.36 13.64+3.64
14 14 Leeds 8 2 3 3 11 12 9 10.45-0.55 12.28+0.28 9.73+0.73
15 15 Crystal Palace 8 2 3 3 10 12 9 9.91-0.09 13.71+1.71 8.62-0.38
16 16 Aston Villa 8 2 2 4 6 10 8 8.08+2.08 10.45+0.45 10.24+2.24
17 17 Southampton 9 2 1 6 8 17 7 9.32+1.32 13.88-3.12 9.36+2.36
18 18 Wolverhampton Wanderers 9 1 3 5 3 12 6 8.16+5.16 11.84-0.16 9.54+3.54
19 19 Leicester 9 1 1 7 15 24 4 9.06-5.94 15.12-8.88 8.00+4.00
20 20 Nottingham Forest 8 1 1 6 6 21 4 8.62+2.62 15.17-5.83 7.48+3.48
[[2]]
# A tibble: 11 x 11
`<U+2116>` Player Team Apps Min G A xG xA xG90 xA90
<int> <chr> <chr> <int> <int> <int> <int> <chr> <chr> <dbl> <dbl>
1 1 "Erling Haaland" "Manchester City" 9 768 15 3 10.10-4.90 2.61-0.39 1.18 0.31
2 2 "Harry Kane" "Tottenham" 9 804 8 1 6.72-1.28 2.06+1.06 0.75 0.23
3 3 "Roberto Firmino" "Liverpool" 7 473 6 3 4.06-1.94 1.40-1.60 0.77 0.27
4 4 "Aleksandar Mitrovic" "Fulham" 8 666 6 0 4.53-1.47 0.38+0.38 0.61 0.05
5 5 "Ivan Toney" "Brentford" 9 810 6 2 5.24-0.76 1.55-0.45 0.58 0.17
6 6 "Phil Foden" "Manchester City" 9 678 6 4 3.37-2.63 2.49-1.51 0.45 0.33
7 7 "Gabriel Jesus" "Arsenal" 9 794 5 3 6.29+1.29 1.73-1.27 0.71 0.2
8 8 "James Maddison" "Leicester" 8 716 5 2 1.40-3.60 0.97-1.03 0.18 0.12
9 9 "Leandro Trossard" "Brighton" 8 686 5 1 3.01-1.99 0.70-0.30 0.39 0.09
10 10 "Wilfried Zaha" "Crystal Palace" 7 624 4 1 3.33-0.67 1.24+0.24 0.48 0.18
11 NA "" "" NA NA 252 183 250.82-1.18 180.23-2.77 NA NA
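When you are finished, it is usually worth closing the browser and stopping the chromedriver server so the processes do not keep running in the background:
remDr$close()
rd$server$stop()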
Here is another approach that can be considered:
library(RDCOMClient)
url <- "https://understat.com/league/EPL"
IEApp <- COMCreate("InternetExplorer.Application")
IEApp[['Visible']] <- TRUE
IEApp$Navigate(url)
Sys.sleep(5)
doc <- IEApp$Document()
html_Content <- doc$Body()$innerHTML()
tables <- read_html(html_Content) %>% html_table()
tables
[[1]]
# A tibble: 20 x 12
`?` Team M W D L G GA PTS xG xGA xPTS
<int> <chr> <int> <int> <int> <int> <int> <int> <int> <chr> <chr> <chr>
1 1 Arsenal 9 8 0 1 23 10 24 19.48-3.52 8.17-1.83 20.03-3.97
2 2 Manchester City 9 7 2 0 33 9 23 23.27-9.73 5.81-3.19 23.59+0.59
3 3 Tottenham 9 6 2 1 20 10 20 14.78-5.22 10.60+0.60 15.12-4.88
4 4 Chelsea 8 5 1 2 13 10 16 12.10-0.90 10.62+0.62 11.86-4.14
5 5 Manchester United 8 5 0 3 13 15 15 12.35-0.65 11.41-3.59 11.86-3.14
6 6 Newcastle United 9 3 5 1 17 9 14 18.41+1.41 12.13+3.13 15.73+1.73
7 7 Brighton 8 4 2 2 14 9 14 14.52+0.52 8.58-0.42 15.53+1.53
8 8 Bournemouth 9 3 3 3 8 20 12 5.26-2.74 15.39-4.61 6.43-5.57
9 9 Fulham 9 3 2 4 14 18 11 10.22-3.78 21.34+3.34 7.02-3.98
10 10 Liverpool 8 2 4 2 20 12 10 17.02-2.98 12.33+0.33 12.95+2.95
11 11 Brentford 9 2 4 3 16 17 10 13.28-2.72 13.00-4.00 12.78+2.78
12 12 Everton 9 2 4 3 8 9 10 10.33+2.33 14.67+5.67 9.13-0.87
13 13 West Ham 9 3 1 5 8 10 10 11.51+3.51 9.64-0.36 13.64+3.64
14 14 Leeds 8 2 3 3 11 12 9 10.45-0.55 12.28+0.28 9.73+0.73
15 15 Crystal Palace 8 2 3 3 10 12 9 9.91-0.09 13.71+1.71 8.62-0.38
16 16 Aston Villa 8 2 2 4 6 10 8 8.08+2.08 10.45+0.45 10.24+2.24
17 17 Southampton 9 2 1 6 8 17 7 9.32+1.32 13.88-3.12 9.36+2.36
18 18 Wolverhampton Wanderers 9 1 3 5 3 12 6 8.16+5.16 11.84-0.16 9.54+3.54
19 19 Leicester 9 1 1 7 15 24 4 9.06-5.94 15.12-8.88 8.00+4.00
20 20 Nottingham Forest 8 1 1 6 6 21 4 8.62+2.62 15.17-5.83 7.48+3.48
[[2]]
# A tibble: 11 x 11
`?` Player Team Apps Min G A xG xA xG90 xA90
<int> <chr> <chr> <int> <int> <int> <int> <chr> <chr> <dbl> <dbl>
1 1 "Erling Haaland" "Manchester City" 9 768 15 3 10.10-4.90 2.61-0.39 1.18 0.31
2 2 "Harry Kane" "Tottenham" 9 804 8 1 6.72-1.28 2.06+1.06 0.75 0.23
3 3 "Roberto Firmino" "Liverpool" 7 473 6 3 4.06-1.94 1.40-1.60 0.77 0.27
4 4 "Aleksandar Mitrovic" "Fulham" 8 666 6 0 4.53-1.47 0.38+0.38 0.61 0.05
5 5 "Ivan Toney" "Brentford" 9 810 6 2 5.24-0.76 1.55-0.45 0.58 0.17
6 6 "Phil Foden" "Manchester City" 9 678 6 4 3.37-2.63 2.49-1.51 0.45 0.33
7 7 "Gabriel Jesus" "Arsenal" 9 794 5 3 6.29+1.29 1.73-1.27 0.71 0.2
8 8 "James Maddison" "Leicester" 8 716 5 2 1.40-3.60 0.97-1.03 0.18 0.12
9 9 "Leandro Trossard" "Brighton" 8 686 5 1 3.01-1.99 0.70-0.30 0.39 0.09
10 10 "Wilfried Zaha" "Crystal Palace" 7 624 4 1 3.33-0.67 1.24+0.24 0.48 0.18
11 NA "" "" NA NA 252 183 250.82-1.18 180.23-2.77 NA NA
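For completeness, a browserless sketch is also possible, assuming (as the question suspects) that the page ships its data as JSON.parse('...') literals inside script tags with the payload hex-escaped. The variable name teamsData and the exact escaping are assumptions here, so treat this as a starting point rather than a tested solution:
library(rvest)
library(jsonlite)
page <- read_html("https://understat.com/league/EPL")
scripts <- html_text(html_nodes(page, "script"))
script <- scripts[grepl("teamsData", scripts)][1]   # script assumed to hold the league data
# extract the single-quoted literal passed to JSON.parse() and decode the \xNN escapes
literal <- regmatches(script, regexpr("JSON\\.parse\\('[^']*'\\)", script))
json_escaped <- sub("^JSON\\.parse\\('", "", sub("'\\)$", "", literal))
json_txt <- URLdecode(gsub("\\\\x(..)", "%\\1", json_escaped))
teams <- fromJSON(json_txt)
str(teams, max.level = 1)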

SQL query: which students have repeated a subject?

The table structure is like this:
student area yearlevel code year sem result
123010 INFO 9 0002 2015 1 77
123011 INFO 9 0002 2015 1 70
123012 INFO 9 0002 2015 1 55
123037 INFO 9 0002 2016 2 49
123037 INFO 9 0002 2017 1 NULL
123010 COMP 9 0007 2016 1 82
123010 ISYS 9 0026 2015 2 82
123011 ISYS 9 0026 2015 2 88
123012 ISYS 9 0026 2015 2 66
123010 COMP 9 0038 2016 2 77
123010 COMP 9 0041 2016 1 45
123010 COMP 9 0041 2017 1 NULL
123010 ISYS 9 0049 2016 1 88
So student 123037 has repeated subject 0002 (and student 123010 has repeated subject 0041).
Use GROUP BY and a HAVING clause:
SELECT student, code
FROM your_table
GROUP BY student, code
HAVING COUNT(*) > 1
This should work:
select student, code from your_table group by student, code having count(*) > 1
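If you also want to see how many times the subject was taken and in which years, here is a sketch in MySQL syntax (assuming the column names shown in the question):
SELECT student, code, COUNT(*) AS attempts, GROUP_CONCAT(year ORDER BY year) AS years
FROM your_table
GROUP BY student, code
HAVING COUNT(*) > 1;
For the sample data above this should flag 123037 / 0002 and 123010 / 0041, each taken in 2016 and 2017.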

How can I replace empty cells with NA in R?

I'm new to R, and have been trying a bunch of examples but I couldn't get anything to change all of my empty cells into NA.
library(XML)
theurl <- "http://www.pro-football-reference.com/teams/sfo/1989.htm"
table <- readHTMLTable(theurl)
table
Thank you.
The result you get from readHTMLTable is a list of two tables, so you need to work on each list element, which can be done using lapply:
table <- lapply(table, function(x){
x[x == ""] <- NA
return(x)
})
table$team_stats
Player PF Yds Ply Y/P TO FL 1stD Cmp Att Yds TD Int NY/A 1stD Att Yds TD Y/A 1stD Pen Yds 1stPy
1 Team Stats 442 6268 1021 6.1 25 14 350 339 483 4302 35 11 8.1 209 493 1966 14 4.0 124 109 922 17
2 Opp. Stats 253 4618 979 4.7 37 16 283 316 564 3235 15 21 5.3 178 372 1383 9 3.7 76 75 581 29
3 Lg Rank Offense 1 1 <NA> <NA> 2 10 1 <NA> 20 2 1 1 1 <NA> 13 10 12 13 <NA> <NA> <NA> <NA>
4 Lg Rank Defense 3 4 <NA> <NA> 11 9 9 <NA> 25 11 3 9 5 <NA> 1 3 3 8 <NA> <NA> <NA> <NA>
You have a list of data.frames of factors, though the actual data is mostly numeric. Converting to the appropriate type with type.convert will automatically insert the appropriate NAs for you:
df_list <- lapply(table, function(x){
x[] <- lapply(x, function(y){type.convert(as.character(y), as.is = TRUE)});
x
})
df_list[[1]][, 1:18]
## Player PF Yds Ply Y/P TO FL 1stD Cmp Att Yds.1 TD Int NY/A 1stD.1 Att.1 Yds.2 TD.1
## 1 Team Stats 442 6268 1021 6.1 25 14 350 339 483 4302 35 11 8.1 209 493 1966 14
## 2 Opp. Stats 253 4618 979 4.7 37 16 283 316 564 3235 15 21 5.3 178 372 1383 9
## 3 Lg Rank Offense 1 1 NA NA 2 10 1 NA 20 2 1 1 1.0 NA 13 10 12
## 4 Lg Rank Defense 3 4 NA NA 11 9 9 NA 25 11 3 9 5.0 NA 1 3 3
Or more concisely but with a lot of packages,
library(tidyverse) # for purrr functions and readr::type_convert
library(janitor) # for clean_names
df_list <- map(table, ~.x %>% clean_names() %>% dmap(as.character) %>% type_convert())
df_list[[1]]
## # A tibble: 4 × 23
## player pf yds ply y_p to fl x1std cmp att yds_2 td int ny_a
## <chr> <int> <int> <int> <dbl> <int> <int> <int> <int> <int> <int> <int> <int> <dbl>
## 1 Team Stats 442 6268 1021 6.1 25 14 350 339 483 4302 35 11 8.1
## 2 Opp. Stats 253 4618 979 4.7 37 16 283 316 564 3235 15 21 5.3
## 3 Lg Rank Offense 1 1 NA NA 2 10 1 NA 20 2 1 1 1.0
## 4 Lg Rank Defense 3 4 NA NA 11 9 9 NA 25 11 3 9 5.0
## # ... with 9 more variables: x1std_2 <int>, att_2 <int>, yds_3 <int>, td_2 <int>, y_a <dbl>,
## # x1std_3 <int>, pen <int>, yds_4 <int>, x1stpy <int>
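Another option, if the dplyr package is acceptable, is a sketch using na_if() (the columns are converted from factor to character first):
library(dplyr)
table <- lapply(table, function(x) {
  # empty strings become NA, everything else is kept as character
  mutate(x, across(everything(), ~ na_if(as.character(.x), "")))
})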

Webscraping the data using R

Aim: I am trying to scrape the historical daily stock prices for all companies from the webpage http://www.nepalstock.com/datanepse/previous.php. The following code works; however, it always returns the daily stock prices for the most recent date (Feb 5, 2015) only. In other words, the output is the same irrespective of the date that I enter. I would appreciate it if you could help in this regard.
library(RHTMLForms)
library(RCurl)
library(XML)
url <- "http://www.nepalstock.com/datanepse/previous.php"
forms <- getHTMLFormDescription(url)
# we are interested in the second list with date forms
# forms[[2]]
# HTML Form: http://www.nepalstock.com/datanepse/
# Date: [ ]
get_stock<-createFunction(forms[[2]])
#create sequence of dates from start to end and store it as a list
date_daily<-as.list(seq(as.Date("2011-08-24"), as.Date("2011-08-30"), "days"))
# determine the number of elements in the list
num<-length(date_daily)
daily_1<-lapply(date_daily,function(x){
show(x) #displays the particular date
readHTMLTable(htmlParse(get_stock(Date = x)), which = 7)
})
# 18 tables are returned, of which the 7th is the one we want
# change the colnames
col_name<-c("SN","Traded_Companies","No_of_Transactions","Max_Price","Min_Price","Closing_Price","Total_Share","Amount","Previous_Closing","Difference_Rs.")
daily_2<-lapply(daily_1,setNames,nm=col_name)
Output:
> head(daily_2[[1]],5)
SN Traded_Companies No_of_Transactions Max_Price Min_Price Closing_Price Total_Share Amount
1 1 Agricultural Development Bank Ltd 24 489 471 473 2,868 1,359,038
2 2 Arun Valley Hydropower Development Company Limited 40 365 360 362 8,844 3,199,605
3 3 Alpine Development Bank Limited 11 297 295 295 150 44,350
4 4 Asian Life Insurance Co. Limited 10 1,230 1,215 1,225 898 1,098,452
5 5 Apex Development Bank Ltd. 23 131 125 131 6,033 769,893
Previous_Closing Difference_Rs.
1 480 -7
2 363 -1
3 303 -8
4 1,242 -17
5 132 -1
> tail(daily_2[[1]],5)
SN Traded_Companies No_of_Transactions Max_Price Min_Price Closing_Price Total_Share Amount Previous_Closing
140 140 United Finance Ltd 4 255 242 242 464 115,128 255
141 141 United Insurance Co.(Nepal)Ltd. 3 905 905 905 234 211,770 915
142 142 Vibor Bikas Bank Limited 7 158 152 156 710 109,510 161
143 143 Western Development Bank Limited 35 320 311 313 7,631 2,402,497 318
144 144 Yeti Development Bank Limited 22 139 132 139 14,355 1,921,511 134
Difference_Rs.
140 -13
141 -10
142 -5
143 -5
144 5
Here's one quick approach. Note that the site uses a POST request to send the date to the server.
library(rvest)
library(httr)
page <- "http://www.nepalstock.com/datanepse/previous.php" %>%
POST(body = list(Date = "2015-02-01")) %>%
read_html()
page %>%
html_node(".dataTable") %>%
html_table(header = TRUE)
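Building on that, here is a sketch for pulling several days in one go; it reuses the POST parameter name Date and the .dataTable selector from above, and assumes the server accepts dates in yyyy-mm-dd form:
library(rvest)
library(httr)
dates <- as.character(seq(as.Date("2011-08-24"), as.Date("2011-08-30"), "days"))
daily <- lapply(dates, function(d) {
  "http://www.nepalstock.com/datanepse/previous.php" %>%
    POST(body = list(Date = d)) %>%    # send the requested date to the server
    read_html() %>%
    html_node(".dataTable") %>%
    html_table(header = TRUE)
})
names(daily) <- dates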