PowerShell HTML method getElementsByClassName returns null

I'm trying to write a PowerShell script which fetches an HTML page from a website and extracts some information from it.
My code looks like this:
$html = (invoke-webrequest -uri $address).parsedHTML;
$bodyHTML = $html.body.getElementsByClassName("news-item")[0].innerText;
The script fetches the website fine. The important part of the website looks like this:
...
<DIV class=news-item>
Important Information
...
The Problem:
I always get the error message "cannot index into a null array".
The getElementsByClassName() method does not return anything.
If I list all divs and show their class names:
$html.body.getElementsByTagName("div") | select className
it lists all the class names including "news-item", which I am looking for.
Does anybody have an idea what the problem might be?

The problem seems to be the PowerShell version used. The PowerShell version running on the machine was 4.11.
With PowerShell 5.1 on a different machine the code worked fine.
As a workaround, because PowerShell could not be updated and I was only looking for div elements, I used the following code:
$bodyHTML = ($html.body.getElementsByTagName("div") | where { $_.className -eq "news-item" })[0].innerText;
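The same idea (walk every div and keep those whose class attribute equals the target, instead of relying on getElementsByClassName) can be sketched outside PowerShell. The following is a minimal Python illustration using only the standard library; the class name ClassTextCollector and the sample markup are assumptions made for this example, not part of the original script.

```python
from html.parser import HTMLParser


class ClassTextCollector(HTMLParser):
    """Collect the text of every <div> whose class attribute equals a target value."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.depth = 0      # greater than 0 while inside a matching <div>
        self.results = []   # one text string per matching <div>

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth:
            # A nested <div> inside a match: track it so the end tags balance.
            self.depth += 1
        elif dict(attrs).get("class") == self.target_class:
            self.depth = 1
            self.results.append("")

    def handle_endtag(self, tag):
        if tag == "div" and self.depth:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.results[-1] += data


# Hypothetical sample markup, modelled on the fragment shown in the question.
sample = '<div class="header">Menu</div><DIV class=news-item>Important Information</DIV>'

parser = ClassTextCollector("news-item")
parser.feed(sample)
parser.close()

# Guard against "no match" instead of indexing blindly into an empty result.
first_news_item = parser.results[0].strip() if parser.results else None
print(first_news_item)
```

Like the -eq comparison in the workaround, this matches the class attribute exactly, so an element carrying several classes would not be found.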

How to parse Table from Wikipedia using htmltab package?

All,
I am trying to parse one table located at https://en.wikipedia.org/wiki/List_of_countries_and_dependencies_by_population#Sovereign_states_and_dependencies_by_population, and I would like to use the htmltab package to do it. My current code is shown below, together with the error it produces. I also tried passing "Rank" and "% of world population" to the which argument, but still received an error. I am not sure what is wrong.
Please note: I am new to R and web scraping, so an explanation of the code would be a great help.
url3 <- "https://en.wikipedia.org/wiki/List_of_countries_and_dependencies_by_population#Sovereign_states_and_dependencies_by_population"
list_of_countries<- htmltab(doc = url3, which = "//th[text() = 'Country(or dependent territory)']/ancestor::table")
Error: Couldn't find the table. Try passing (a different) information to the which argument.
This is an XPath problem, not an R problem. If you inspect the HTML of that table, the relevant header is
<th class="headerSort" tabindex="0" role="columnheader button" title="Sort ascending">
Country<br><small>(or dependent territory)</small>
</th>
So text() on this is just "Country".
For example, this could work (it is not the only option; you will have to try out various XPath selectors to see what matches).
htmltab(doc = url3, which = "//th[text() = 'Country']/ancestor::table")
Alternatively it's the first table on the page, so you could try which=1 instead.
(NB: in Chrome you can run $x("//th[text() = 'Country']") and similar expressions in the developer console to try these out, and other browsers no doubt offer something similar.)
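To see why the original selector found nothing, it can help to look at how the header's text is split around its child elements. The sketch below uses Python's standard xml.etree.ElementTree on a simplified, well-formed copy of the header (attributes and line breaks dropped, which is an assumption made for this example). It only illustrates the text-node split; it is not htmltab's own matching logic.

```python
import xml.etree.ElementTree as ET

# Simplified copy of the header cell from the answer above.
th = ET.fromstring(
    "<th>Country<br/><small>(or dependent territory)</small></th>"
)

# Text that sits directly inside <th>, before its first child element.
first_text_node = th.text

# All text inside <th>, including text nested in <small>.
full_text = "".join(th.itertext())

print(repr(first_text_node))
print(repr(full_text))
```

The direct text of the cell is only "Country"; the remainder belongs to the nested small element, so a comparison against the full string 'Country(or dependent territory)' has nothing to match.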

Removing Headers in eText in BI Publisher

We have a scenario where the header should not be displayed in CSV output generated with an eText template.
Our output looks like this:
Header000001 Header000002
------------ ------------
Adetail1 Bdetail1
Adetail2 Bdetail2
Adetail3 Bdetail3
Desired output is:
Adetail1 Bdetail1
Adetail2 Bdetail2
Adetail3 Bdetail3
We tried every option we could find in the eText template, such as removing the header section, verifying the data with the BI Publisher Desktop tool, and checking the logs.
We are not getting any error in the BI Publisher Desktop tool.
The same question was posted some time ago and marked as resolved, but no solution was provided.
It would be very helpful if anybody could provide the exact solution.
The header will just be another block in your eText template. You can use the <DISPLAY CONDITION> command to skip printing that block in the output. The display condition command specifies when the enclosed record or data field group should be displayed. The command parameter is a boolean expression. When it evaluates to true, the record or data field group is displayed. Otherwise the record or data field group is skipped. You can just give condition as false, and that block will be skipped.
I have created a template using the provided data xml to output a CSV, without headers. A delimiter based template is used, but the header is not printed.

Drupal 7 Services JSON shows fields names with spaces

I have a Drupal 7 deployment with the Services 3 module, configured for JSON output. When I get my results, the custom fields are returned under their labels instead of the actual field names. For example, the built-in Node Title shows up as node_title. However, 1 Year, a custom field stored as field_1_year, shows up as 1 Year. This makes the JSON difficult to parse. Any suggestions?
You can build your own custom JSON feed, for example:
Create a PHP script and add the standard D7 bootstrap at the top:
define('DRUPAL_ROOT', getcwd());
require_once DRUPAL_ROOT . '/includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);
After this code, all of Drupal's functionality is available in your script.
Add your code to get the values you want. You can use Drupal's database API or, even easier, create a view and use the views_get_view_result() function to get the values the view returns:
https://api.drupal.org/api/views/views.module/function/views_get_view_result/7
Then iterate through your results and build another PHP array containing the values you want, structured the way you want.
Use json_encode to convert your array to a JSON string:
http://php.net/manual/en/function.json-encode.php
Print out your JSON string. You can also send a JSON header before it, so that apps consuming the feed know it is JSON (this is sometimes needed):
header('Content-Type: application/json');
Something like that...
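The reshaping step in the list above (iterate over the results, rebuild each row under the keys you want, then JSON-encode) is language independent. Here is a minimal Python sketch of just that step, shown only to make the idea concrete. The LABEL_TO_FIELD mapping, the normalise_row helper and the sample row are all hypothetical; in the PHP script the rows would come from views_get_view_result() and the encoding from json_encode.

```python
import json

# Hypothetical mapping from display labels to machine field names.
LABEL_TO_FIELD = {"1 Year": "field_1_year"}


def normalise_row(row):
    """Return a copy of row whose keys are machine names rather than labels."""
    return {LABEL_TO_FIELD.get(key, key): value for key, value in row.items()}


# Made-up sample row standing in for one result of the view.
rows = [{"node_title": "Example node", "1 Year": "12.5"}]

payload = json.dumps([normalise_row(row) for row in rows])
print(payload)
```

Keys without an entry in the mapping pass through unchanged, so built-in fields such as node_title need no special handling.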

Test.loadData with Custom sObject Throws Exception

I am loading a CSV file via a Static Resource to test my Apex code. I am using the following code in my test:
List<Territory_Zip_Code__c> territoryData = Test.loadData(Territory_Zip_Code__c.sObjectType, TERRITORY_ZIP_CODES_STATIC_RESOURCE_NAME);
The first few lines of the CSV file look like so:
Territory__c,Zip_Code__c
ABC,123
DEF,456
I am getting the following error:
System.StringException: Unknown field: Territory__c
Territory__c is a valid API field name for my custom sObject.
I've also tried adding the sObject name in front of the field name, like My_Territory__c.Territory__c but that didn't work either.
In addition, I tried using the field name, instead of the API name (for example, Territory) but that didn't work either.
There are lots of examples of using Test.loadData with built-in sObjects, such as Account and Contacts, but no examples showing custom sObjects. I'm starting to think this just isn't possible with custom objects.
Using Test.loadData most certainly does work with custom objects. The test data CSV header only needs the field names, as you have in your example.
Your code also looks good. The only difference I could spot is that your variable is a strongly typed list. In my code I use a List<sObject>, which seems to work:
List<sObject> testdata = Test.loadData(MyCustomObject__c.sObjectType, 'mytestdatafile');

Googlechart error on a linechart with tooltip values coming via JSON

I have a Google chart and want to add a custom tooltip. I found some great answers on this site and set about doing this with roles. I also found a link describing this approach, and it looked like the best way.
My data is generated as JSON: I use a PHP file to create a JSON feed. The columns are coded like this:
{"cols": [ {"id":"","label":"Period","pattern":""},
{"id":"","label":"Recorded P/L","type":"number", "role":"data"} ,
{"id":"","label": null,"type":"string", "role":"tooltip"},
{"id":"","label":"Best Available P/L","type":"number", "role":"data"},
{"id":"","label": null,"type":"string", "role":"tooltip"}
]
Then it goes on and adds all the data. The problem is that when I try to run this, I get the error
All series on a given axis must be of the same data type
I have checked the JSON and it is well formed, but I am not sure what I could be doing wrong.
At least part of your problem is that you're not specifying the type for your first column.
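As a rough illustration of that point, the column list from the question can be rebuilt with an explicit type on the first column and then checked for any column that still lacks one. The sketch is in Python purely for convenience (the question's feed is produced by PHP). Using "string" for the Period column is an assumption, since the question does not show the row data, and the answer only says this is part of the problem, so the change may not clear the error on its own.

```python
import json

cols = [
    # "type": "string" is an assumed fix; the original column had no type at all.
    {"id": "", "label": "Period", "pattern": "", "type": "string"},
    {"id": "", "label": "Recorded P/L", "type": "number", "role": "data"},
    {"id": "", "label": None, "type": "string", "role": "tooltip"},
    {"id": "", "label": "Best Available P/L", "type": "number", "role": "data"},
    {"id": "", "label": None, "type": "string", "role": "tooltip"},
]

# Quick sanity check: list the index of every column that still has no type.
untyped = [index for index, col in enumerate(cols) if "type" not in col]

print(json.dumps({"cols": cols}, indent=1))
print("columns without a type:", untyped)
```

Running a check like this against the generated feed is a cheap way to catch an untyped column before the chart library reports a less direct error.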