How to scrape using rvest in pages with multiple tables - html

I'm trying to scrape the data from every table at the hockey-reference awards page. I can scrape the first table for the Hart Memorial Trophy, but when I try the rest of them, I end up with empty vectors. I used Selector Gadget and the rvest package to produce the following code.
library(rvest)
url="https://www.hockey-reference.com/awards/voting-2017.html"
byng<-read_html(url)
byng_node<-html_nodes(byng, "#byng_stats .right , #byng_stats a")
byng_text<-html_text(byng_node)
However, once I run this code, I get no data in the byng variables:
> byng_node
{xml_nodeset (0)}
> byng_text
character(0)
What's happening here? Does Selector Gadget not work for pages with multiple tables? Or does it have nothing to do with that, and there's something HTMLy I don't understand? Any help is greatly appreciated!

@neilfws was right: if you look at the source of the HTML page, you'll see that all but the first table are wrapped in HTML comments, so rvest treats them as comments rather than as part of the document. A quick and dirty workaround is to strip the comment delimiters that hide the tables and parse the result again:
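library(rvest)
url <- "https://www.hockey-reference.com/awards/voting-2017.html"
# Read the page and serialise it to one string so gsub() can edit it
page <- as.character(read_html(url))
# Remove the comment delimiters that hide all but the first table
page <- gsub("<!--", "", page, fixed = TRUE)
page <- gsub("-->", "", page, fixed = TRUE)
# Parse the uncommented HTML again
byng <- read_html(page)
# Get all tables as a list of data frames
tables <- html_table(byng)
# Last table
tables[7]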
[[1]]
                                                                                    Scoring  Scoring  Scoring  Scoring  Goalie Stats  Goalie Stats
1   Place  Player             Age  Tm   Pos  Votes  Vote%  1st  2nd  3rd  4th  5th  G        A        PTS      +/-      W             L
2   1      Connor McDavid     20   EDM  C    762    94.07  141  18   3    0    0    30       70       100      27
3   2      Sidney Crosby      29   PIT  C    526    64.94  20   142  0    0    0    44       45       89       17
4   3      Nicklas Backstrom  29   WSH  C    127    15.68  1    2    116  0    0    23       63       86       17
5   4      Mark Scheifele     23   WPG  C    21     2.59   0    0    21   0    0    32       50       82       18
6   5      Auston Matthews    19   TOR  C    10     1.23   0    0    10   0    0    40       29       69       2
7   6      Evgeni Malkin      30   PIT  C    4      0.49   0    0    4    0    0    33       39       72       18
8   7      John Tavares       26   NYI  C    2      0.25   0    0    2    0    0    28       38       66       4
9   8      Jonathan Toews     28   CHI  C    1      0.12   0    0    1    0    0    21       37       58       7
10  8      Brad Marchand      28   BOS  C    1      0.12   0    0    1    0    0    39       46       85       18
11  8      Ryan Kesler        32   ANA  C    1      0.12   0    0    1    0    0    22       36       58       8
12  8      Ryan Getzlaf       31   ANA  C    1      0.12   0    0    1    0    0    15       58       73       7
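The same effect can be reproduced without the live page. Below is a minimal, language-agnostic sketch in Python (standard library only) of why the selector found nothing: a parser skips anything inside an HTML comment, so a table wrapped in a comment only becomes visible once the delimiters are stripped. The sample markup and the `hart_stats` id are made up for illustration; only `byng_stats` comes from the question.

```python
from html.parser import HTMLParser


class TableIdCollector(HTMLParser):
    """Collects the id attribute of every <table> start tag the parser sees."""

    def __init__(self):
        super().__init__()
        self.table_ids = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.table_ids.append(dict(attrs).get("id"))


def table_ids(html):
    """Return the ids of all tables that are part of the parsed document."""
    parser = TableIdCollector()
    parser.feed(html)
    parser.close()
    return parser.table_ids


def uncomment(html):
    """Strip the comment delimiters, as the gsub() calls in the answer do."""
    return html.replace("<!--", "").replace("-->", "")


# Hypothetical page: the first table is live, the second sits in a comment.
PAGE = """
<div><table id="hart_stats"><tr><td>1</td></tr></table></div>
<div><!--
<table id="byng_stats"><tr><td>2</td></tr></table>
--></div>
"""

visible_before = table_ids(PAGE)
visible_after = table_ids(uncomment(PAGE))
```

Here `visible_before` contains only the first table, while `visible_after` also contains `byng_stats`. Note that stripping the delimiters also exposes any genuine comments on the page, which is why the answer calls it a dirty hack.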

Related

MySQL: return all 360 degrees with strengths

I have a table of wind directions and strengths over a 24-hour period; sample data is at the bottom of this question.
Only directions that have strengths are stored in the database. I'm currently using the following SQL:
SELECT winddirection, avespeed
FROM wp_weather_data
WHERE ID%10 = 0
What I would like to return is an entry for every degree (with a value of 0 for any degree not in the database) and, where there are multiple entries for a given degree, only the highest value. The results also need to be in ascending order of degrees.
Is this possible?
This is so I can plot a wind distribution chart with a polar chart plugin in WordPress.
Sample data returned by the above SQL:
294 2
271 3
269 2
285 3
289 2
123 1
130 1
144 1
160 0
168 0
161 0
135 0
138 0
331 0
115 0
136 0
161 0
267 0
114 0
265 0
204 0
248 1
206 0
199 1
250 2
244 3
257 3
272 5
267 5
208 3
221 3
223 4
253 6
233 5
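One possible approach, sketched here rather than taken from an accepted answer: generate the degrees 0 to 359 with a recursive common table expression, LEFT JOIN the readings onto them, and take MAX per degree with COALESCE supplying 0 for missing degrees. The sketch runs against an in-memory SQLite database from Python so it can be tried without a server. The function name and the choice of 0 to 359 (rather than 1 to 360) are assumptions, and the `ID%10 = 0` filter from the question is left out for brevity.

```python
import sqlite3

# Assumption: degrees run 0..359. Missing degrees get a speed of 0.
QUERY = """
WITH RECURSIVE degrees(d) AS (
    SELECT 0
    UNION ALL
    SELECT d + 1 FROM degrees WHERE d < 359
)
SELECT degrees.d AS winddirection,
       COALESCE(MAX(w.avespeed), 0) AS avespeed
FROM degrees
LEFT JOIN wp_weather_data AS w ON w.winddirection = degrees.d
GROUP BY degrees.d
ORDER BY degrees.d
"""


def wind_distribution(rows):
    """rows: iterable of (winddirection, avespeed) pairs."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wp_weather_data (winddirection INTEGER, avespeed INTEGER)"
    )
    conn.executemany("INSERT INTO wp_weather_data VALUES (?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


# A few of the readings from the sample data above.
distribution = wind_distribution([(294, 2), (267, 0), (267, 5), (272, 5)])
```

`WITH RECURSIVE` is available in MySQL from version 8.0; on older versions a helper table holding the numbers 0 to 359 would have to take the place of the `degrees` expression.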

DXF ASCII to write multiple text strings

I can't seem to add multiple text strings to a DXF file. It works when I have a block of code for only one line of text, but I can't seem to add multiple lines.
I created the code shown below. Any help would be very much appreciated.
0
SECTION
2
ENTITIES
0
TEXT
5
31
8
0
6
BYLAYER
62
256
10
161.25
20
120.25
30
0
40
1
1
Sample Text 1
50
0
41
1
51
0
7
71
0
11
161.25
21
120.25
31
0
210
0
220
0
230
1
73
3
0
TEXT
5
31
8
0
6
BYLAYER
62
256
10
100
20
100
30
0
40
1
1
Sample Text 2
50
0
41
1
51
0
7
71
0
11
100
21
100
31
0
210
0
220
0
230
1
73
3
0
ENDSEC
0
EOF
The DXF file with the code shown above will not open in AutoCAD or MicroStation. However, it does open in LibreCAD, which appears to be more forgiving about syntax.
I'd like it to open in both AutoCAD and MicroStation. Any input would be very much appreciated.
Handles (DXF group 5) should be unique within a file.
As such, you should not use the same handle for both text entities:
0
SECTION
2
ENTITIES
0
TEXT
5
31      <----------+
                   |
< ... >            |
                   |
0                  +----- Identical handles
TEXT               |
5                  |
31      <----------+
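If the file is generated by a program, the handles can be assigned from a counter so that no two entities ever share one. The following is a minimal sketch in Python under that assumption. The function names are hypothetical, only a reduced set of TEXT group codes is written, and the output has not been verified against AutoCAD or MicroStation.

```python
def text_entity(handle, x, y, text, height=1.0):
    """Return the group code/value pairs for one TEXT entity as DXF lines."""
    pairs = [
        (0, "TEXT"),
        (5, format(handle, "X")),  # handle, written as a hexadecimal string
        (8, "0"),                  # layer name
        (10, x),                   # insertion point X
        (20, y),                   # insertion point Y
        (30, 0),                   # insertion point Z
        (40, height),              # text height
        (1, text),                 # the text string itself
    ]
    return "".join(f"{code}\n{value}\n" for code, value in pairs)


def entities_section(texts, first_handle=0x31):
    """texts: list of (x, y, string). Each entity gets the next free handle."""
    body = "".join(
        text_entity(first_handle + i, x, y, s)
        for i, (x, y, s) in enumerate(texts)
    )
    return "0\nSECTION\n2\nENTITIES\n" + body + "0\nENDSEC\n0\nEOF\n"


dxf = entities_section([
    (161.25, 120.25, "Sample Text 1"),
    (100, 100, "Sample Text 2"),
])
```

With the two sample strings from the question, the first entity keeps handle 31 and the second receives 32.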

Mysql find common values for all users

We are developing a local shop recommendation system, and we ran into a problem with one of our SQL queries.
We want to fetch the companies that all users in the same cluster have rated. If any user in the cluster has not rated a company, we do not want to fetch it.
SELECT ml_user_clusters.primaryUser,ml_user_clusters.clusterId,ml_ratings.companyId,ml_ratings.rating,ml_user_labels.groupId FROM ml_user_clusters
LEFT JOIN ml_ratings ON ml_ratings.userId = ml_user_clusters.primaryUser
LEFT JOIN ml_company_user_labels ON ml_company_user_labels.companyId = ml_ratings.companyId
LEFT JOIN ml_user_labels ON ml_user_labels.groupId = ml_company_user_labels.labelId
WHERE ml_user_clusters.clusterId = 0
We started writing a query for this but could not finish it with a proper AND clause.
Our data looks like the table below. In the result we would like only the companies that have groupId = 6, because all users in the same cluster (clusterId = 0) rated a company with groupId = 6.
primaryUser clusterId companyId rating groupId
497 0 135 5 NULL
498 0 135 10 NULL
79 0 135 12 NULL
501 0 135 10 NULL
79 0 85 14 2
79 0 8 4 5
79 0 98 11 5
79 0 3 5 5
497 0 6 7 6
500 0 6 7 6
499 0 29 7 6
497 0 29 7 6
499 0 77 7 6
500 0 29 7 6
498 0 6 7 6
500 0 77 11 6
500 0 130 3 6
498 0 130 3 6
501 0 77 19 6
499 0 6 7 6
79 0 30 1 7
500 0 30 7 7
79 0 48 7 9
79 0 39 1 13
79 0 48 7 13
499 0 6 7 15
497 0 6 7 15
79 0 8 4 15
500 0 6 7 15
79 0 98 11 15
498 0 6 7 15
79 0 3 5 15
79 0 81 7 15
79 0 3 5 17
79 0 82 7 17
79 0 103 7 17
79 0 118 3 17
79 0 63 3 17
501 0 118 7 17
79 0 82 7 19
79 0 118 3 19
79 0 63 3 19
501 0 118 7 19
79 0 39 1 21
79 0 85 14 23
The expected output is as follows (because all unique users in clusterId = 0 have rated a company with groupId = 6):
primaryUser clusterId companyId rating groupId
497 0 6 7 6
500 0 6 7 6
499 0 29 7 6
497 0 29 7 6
499 0 77 7 6
500 0 29 7 6
498 0 6 7 6
500 0 77 11 6
500 0 130 3 6
498 0 130 3 6
501 0 77 19 6
499 0 6 7 6
Do you have any idea how we can fix this problem?
Something like this should work; you should build a fiddle for better testing.
Explanation: count the distinct users per groupId and compare that with the total number of distinct users in the cluster. If the two match, every user in the cluster has rated a company in that group.
SELECT ml_user_labels.groupId
FROM ml_user_clusters
LEFT JOIN ml_ratings ON ml_ratings.userId = ml_user_clusters.primaryUser
LEFT JOIN ml_company_user_labels ON ml_company_user_labels.companyId = ml_ratings.companyId
LEFT JOIN ml_user_labels ON ml_user_labels.groupId = ml_company_user_labels.labelId
WHERE ml_user_clusters.clusterId = 0
GROUP BY ml_user_labels.groupId
HAVING COUNT(DISTINCT ml_user_clusters.primaryUser) =
       (SELECT COUNT(DISTINCT ml_user_clusters.primaryUser)
        FROM ml_user_clusters
        LEFT JOIN ml_ratings ON ml_ratings.userId = ml_user_clusters.primaryUser
        LEFT JOIN ml_company_user_labels ON ml_company_user_labels.companyId = ml_ratings.companyId
        LEFT JOIN ml_user_labels ON ml_user_labels.groupId = ml_company_user_labels.labelId
        WHERE ml_user_clusters.clusterId = 0)
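The counting idea can be tried in isolation. The sketch below runs it from Python against an in-memory SQLite database. It assumes the joins have already been flattened into a hypothetical `cluster_ratings` table with one (primaryUser, groupId) row per rating in a single cluster, and the sample rows are made up.

```python
import sqlite3

# Keep a group only if its distinct-user count equals the cluster-wide count.
QUERY = """
SELECT groupId
FROM cluster_ratings
WHERE groupId IS NOT NULL
GROUP BY groupId
HAVING COUNT(DISTINCT primaryUser) =
       (SELECT COUNT(DISTINCT primaryUser) FROM cluster_ratings)
ORDER BY groupId
"""


def groups_rated_by_all(rows):
    """rows: (primaryUser, groupId) pairs for one cluster; groupId may be None."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cluster_ratings (primaryUser INTEGER, groupId INTEGER)"
    )
    conn.executemany("INSERT INTO cluster_ratings VALUES (?, ?)", rows)
    result = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return result


# Hypothetical data: users 1, 2 and 3 all rated group 6; only 1 and 3 rated group 5.
common_groups = groups_rated_by_all(
    [(1, 6), (2, 6), (3, 6), (1, 5), (3, 5), (2, None)]
)
```

Here `common_groups` is `[6]`. The resulting group ids can then be used to filter the original detail query, for example with `WHERE ml_user_labels.groupId IN (...)`.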

Insert php multiple data into mysql

I have 1000+ customers and need a customer report.
Here debit = potato + onion + ginger, and credit is commission. The balance is updated every time: it is either balance - debit or balance + credit, alternately.
The grocery data report is shown below. Data is entered through a PHP form and read with a mysql_fetch_array query. Only a few customers and a few data fields are shown as a sample.
id  cus_id  cus_name  potato  onion  ginger  debit  credit  balance
1   12      munna     10      25     28      63     0       37
2   16      anil      24      56     84      164    0       136
3   34      palash    17      47     51      115    0       85
4   45      dimpy     35      64     39      138    0       112
Table grocery before and after entering new data:
id  cus_id  cus_name  potato  onion  ginger  debit  credit  balance
1   12      munna     10      25     28      63     0       37
2   16      anil      24      56     84      164    0       136
3   34      palash    17      47     51      115    0       85
4   45      dimpy     35      64     39      138    0       112
5   12      munna     0       0      0       0      6       43
6   16      anil      0       0      0       0      16      152
7   34      palash    0       0      0       0      12      97
8   45      dimpy     0       0      0       0      14      126
My problem is:
I am unable to update the balance column for each customer (by cus_name and cus_id) and insert all of the data into the MySQL database. Please suggest a MySQL query.
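One way to read the sample rows is that each new row's balance is the customer's previous balance, minus the new debit, plus the new credit (for munna, 37 + 6 = 43). Under that reading, a minimal sketch of the insert logic looks like this. It uses Python with an in-memory SQLite database in place of PHP and MySQL; the helper names and the `opening_balance` parameter are hypothetical, and the opening balance of 100 used below is inferred from the first sample row, not stated in the question.

```python
import sqlite3


def make_db():
    """Create an empty grocery table with the columns shown in the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE grocery (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               cus_id INTEGER, cus_name TEXT,
               potato INTEGER, onion INTEGER, ginger INTEGER,
               debit INTEGER, credit INTEGER, balance INTEGER)"""
    )
    return conn


def add_entry(conn, cus_id, cus_name, potato=0, onion=0, ginger=0,
              credit=0, opening_balance=0):
    """Insert one row and return the new running balance for this customer."""
    debit = potato + onion + ginger
    # The customer's latest row holds the current balance.
    row = conn.execute(
        "SELECT balance FROM grocery WHERE cus_id = ? ORDER BY id DESC LIMIT 1",
        (cus_id,),
    ).fetchone()
    previous = row[0] if row is not None else opening_balance
    balance = previous - debit + credit
    conn.execute(
        "INSERT INTO grocery (cus_id, cus_name, potato, onion, ginger,"
        " debit, credit, balance) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        (cus_id, cus_name, potato, onion, ginger, debit, credit, balance),
    )
    return balance


conn = make_db()
after_purchase = add_entry(conn, 12, "munna", 10, 25, 28, opening_balance=100)
after_commission = add_entry(conn, 12, "munna", credit=6)
```

Looking up the latest balance by cus_id keeps each customer's running total separate, so the same call can be repeated in a loop over all submitted form rows.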

Update specific records based on some group

I am using SQL Server 2008 R2.
I have a database table that contains some user data, as shown below:
Id  UserId  Sys  Dia  ReadingType  DataId         IsDeleted
1   10      98   65   last         1390556024216  0
2   10      99   69   average      1390556024216  0
3   10      102  96   last         1390562788540  0
4   10      102  96   average      1390562788540  0
5   11      130  98   last         1390631241547  0
6   11      130  98   average      1390631241547  0
7   2       285  199  first        1390770562374  0
8   2       250  180  last         1390770562374  0
9   2       267  189  average      1390770562374  0
10  1       258  180  first        1391191009457  0
11  1       258  180  last         1391191009457  0
12  1       258  180  average      1391191009457  0
13  1       285  199  additional   1391191009457  0
14  22      110  78   last         1391549208338  0
15  22      123  83   last         1391549208349  0
In this table, there are records that have the same DataId but a different ReadingType.
I want to set IsDeleted=1 for the records that have ReadingType='last' and for which a record with ReadingType='average' exists with the same DataId, Sys, Dia and UserId.
So the desired result should be:
Id  UserId  Sys  Dia  ReadingType  DataId         IsDeleted
1   10      98   65   last         1390556024216  0
2   10      99   69   average      1390556024216  0
3   10      102  96   last         1390562788540  1
4   10      102  96   average      1390562788540  0
5   11      130  98   last         1390631241547  1
6   11      130  98   average      1390631241547  0
7   2       285  199  first        1390770562374  0
8   2       250  180  last         1390770562374  0
9   2       267  189  average      1390770562374  0
10  1       258  180  first        1391191009457  0
11  1       258  180  last         1391191009457  1
12  1       258  180  average      1391191009457  0
13  1       285  199  additional   1391191009457  0
14  22      110  78   last         1391549208338  0
15  22      123  83   last         1391549208349  0
Here the records with Id 3, 5 and 11 should be marked as deleted, because each has ReadingType="last" and shares its UserId, Sys, Dia and DataId with another record that has ReadingType="average".
Can anyone help me find such records and update them?
Just use UPDATE with EXISTS subquery:
UPDATE T
SET IsDeleted = 1
WHERE ReadingType = 'last'
  AND EXISTS (SELECT *
              FROM T AS T1
              WHERE T1.ReadingType = 'average'
                AND T1.DataId = T.DataId
                AND T1.Sys = T.Sys
                AND T1.Dia = T.Dia
                AND T1.UserId = T.UserId
             )
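To check the EXISTS approach against the sample data without a SQL Server instance, here is a minimal sketch that runs the same statement from Python on an in-memory SQLite database. The table name `T` follows the answer, the function name is hypothetical, and only a subset of the sample rows is loaded.

```python
import sqlite3

# A subset of the sample rows:
# (Id, UserId, Sys, Dia, ReadingType, DataId, IsDeleted)
SAMPLE_ROWS = [
    (1, 10, 98, 65, "last", 1390556024216, 0),
    (2, 10, 99, 69, "average", 1390556024216, 0),
    (3, 10, 102, 96, "last", 1390562788540, 0),
    (4, 10, 102, 96, "average", 1390562788540, 0),
    (8, 2, 250, 180, "last", 1390770562374, 0),
    (9, 2, 267, 189, "average", 1390770562374, 0),
    (11, 1, 258, 180, "last", 1391191009457, 0),
    (12, 1, 258, 180, "average", 1391191009457, 0),
    (14, 22, 110, 78, "last", 1391549208338, 0),
]

UPDATE_SQL = """
UPDATE T
SET IsDeleted = 1
WHERE ReadingType = 'last'
  AND EXISTS (SELECT 1
              FROM T AS T1
              WHERE T1.ReadingType = 'average'
                AND T1.DataId = T.DataId
                AND T1.Sys = T.Sys
                AND T1.Dia = T.Dia
                AND T1.UserId = T.UserId)
"""


def mark_duplicate_last_readings(rows):
    """Apply the update and return the Ids that end up with IsDeleted = 1."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE T (Id INTEGER PRIMARY KEY, UserId INTEGER, Sys INTEGER,"
        " Dia INTEGER, ReadingType TEXT, DataId INTEGER, IsDeleted INTEGER)"
    )
    conn.executemany("INSERT INTO T VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    conn.execute(UPDATE_SQL)
    deleted = [
        row[0]
        for row in conn.execute("SELECT Id FROM T WHERE IsDeleted = 1 ORDER BY Id")
    ]
    conn.close()
    return deleted


deleted_ids = mark_duplicate_last_readings(SAMPLE_ROWS)
```

For this subset, `deleted_ids` is `[3, 11]`: rows 1 and 8 stay untouched because their Sys and Dia values differ from the matching 'average' row, and row 14 has no 'average' row at all.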
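You can solve this in many ways; here is one that uses a sub-query to collect the Ids of the 'last' rows that have a matching 'average' row:
UPDATE T
SET IsDeleted = 1
WHERE Id IN (SELECT L.Id
             FROM T AS L
             INNER JOIN T AS A
                ON A.DataId = L.DataId
               AND A.UserId = L.UserId
               AND A.Sys = L.Sys
               AND A.Dia = L.Dia
             WHERE L.ReadingType = 'last'
               AND A.ReadingType = 'average')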