I have a page that shows several service statuses. I would like to count all the serviceStatus-OK and serviceStatus-DOWN divs on the page. Unfortunately I cannot modify the page, but I need to verify that all services are up and running. My idea is to load the page and count all the OK statuses: if 5 services give 5 OKs, we are good; if there is any DOWN, we are not.
Any ideas?
Source code:
<span id="content">
<div class="status">
<div class="serviceName">
Some name <br />
http://blablaservice1
</div>
<div class="serviceStatus-OK">
OK
</div>
</div>
<div class="status">
<div class="serviceName">
Some other name <br />
http://blablaservice2
</div>
<div class="serviceStatus-DOWN">
DOWN
</div>
</div>
My code:
Elements services = doc.select("span#conent");
Element content = doc.getElementById("content");
Elements servicesOks = content.getElementsByTag("status");
int upCounter = 0;
int downCounter = 0;
for (Element y : servicesOks) {
if (y.hasClass("status-OK")) {
upCounter++;
}
}
for (Element y : servicesOks) {
if (y.hasAttr("status-DOWN")) {
downCounter++;
}
}
System.out.println("OK Systems: " + upCounter);
System.out.println("DOWN Systems: " + upCounter);
My output is:
OK Systems: 0
DOWN Systems: 0
You can find the number of okays and downs like this:
Document doc = Jsoup.parse(input);
Elements OK = doc.select(".serviceStatus-OK");
Elements down = doc.select(".serviceStatus-DOWN");
System.out.println("OK - " + OK.size());
System.out.println("DOWN - " + down.size());
This selects all the elements with the classes serviceStatus-OK and serviceStatus-DOWN, and then counts the number of items of each kind (Elements is just a list).
I'm building a book-like pagination style for my website. The code has four parts: the text to be displayed, the page numbers, the "next" and "previous" buttons, and the script that determines which part/page of the text is displayed.
var current = 1;
var totalPages = document.getElementById("pageContainer").childElementCount;
function showPages(id = 1) {
if (id < 1 || id > totalPages)
return;
curr_page = document.getElementById("page" + current);
curr_page.classList.add("pageHidden");
curr_page.classList.remove("pageDisplayed");
target_page = document.getElementById("page" + id);
target_page.classList.add("pageDisplayed");
target_page.classList.remove("pageHidden");
current = id;
}
.pageHidden
{
display: none;
}
.pageDisplayed
{
display: block;
}
<div id="pageContainer">
<div class="pageDisplayed" id="page1"><p>page 1 displayed</p></div>
<div class="pageHidden" id="page2"><p>That is the second page.</p></div>
<div class="pageHidden" id="page3"><p>And finally a third one.</p></div>
</div>
<h2>pages :
1
2
3
</h2>
<h2>
<span style="float: left;">
Previous
</span>
<span style="float: right;">
Next
</span>
</h2>
Now, I'd like to mark the page that is currently displayed by changing the appearance of its link. For now it shows
"Pages: 1 2 3"
and I'd like it to become, for example:
"Pages: 1 [2] 3" (with a different color for the marked "[2]").
I've found ways to do that, but I couldn't make them work with the "next" and "previous" buttons, which, when triggered, didn't highlight the anchor number.
Basically, if the function showPages displays "page2", the anchor "2" with the id "2" should appear like this: [2].
I think it means that I have to compare the id from the page container to the id from the pagination part. Does anyone know how I can do that?
It works the same way as hiding the previous block and showing the new one. Create a CSS class with the following rules:
.page:before{
content: '['
}
.page:after{
content: ']'
}
This CSS wraps any element that has the page class in brackets [].
Now add the page class to the selected page number and remove it from the old one.
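// Remove the marker from the old page number first, so that calling
// showPages with id === current does not strip the marker again.
pageNum = document.getElementById("" + current);
pageNum.classList.remove("page");
pageNum = document.getElementById("" + id);
pageNum.classList.add("page");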
Here is your working code.
var current = 1;
var totalPages = document.getElementById("pageContainer").childElementCount;
function showPages(id = 1) {
if (id < 1 || id > totalPages)
return;
curr_page = document.getElementById("page" + current);
curr_page.classList.add("pageHidden");
curr_page.classList.remove("pageDisplayed");
target_page = document.getElementById("page" + id);
target_page.classList.add("pageDisplayed");
target_page.classList.remove("pageHidden");
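// Remove the marker from the old page number first, so that calling
// showPages with id === current does not strip the marker again.
pageNum = document.getElementById("" + current);
pageNum.classList.remove("page");
pageNum = document.getElementById("" + id);
pageNum.classList.add("page");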
current = id;
}
.pageHidden
{
display: none;
}
.pageDisplayed
{
display: block;
}
.page:before{
content: '['
}
.page:after{
content: ']'
}
<div id="pageContainer">
<div class="pageDisplayed" id="page1"><p>page 1 displayed</p></div>
<div class="pageHidden" id="page2"><p>That is the second page.</p></div>
<div class="pageHidden" id="page3"><p>And finally a third one.</p></div>
</div>
<h2>pages :
1
2
3
</h2>
<h2>
<span style="float: left;">
Previous
</span>
<span style="float: right;">
Next
</span>
</h2>
Here you go..
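A slightly different way to keep the marker in sync, sketched under some assumptions: instead of tracking the old and new ids separately, walk over all the page-number links and toggle the class by comparing each link's id with the current page. The function name markCurrentPage and the way the links are collected are hypothetical; the only requirement is that each link's id is the page number, as described in the question.

```javascript
// Hypothetical helper: give the "page" class only to the link whose id
// matches the current page number, and remove it from all the others.
// `links` is any iterable of elements (e.g. a NodeList of the <a> tags).
function markCurrentPage(links, currentId) {
  for (const link of links) {
    // classList.toggle(name, force) adds the class when force is true
    // and removes it when force is false.
    link.classList.toggle("page", link.id === String(currentId));
  }
}
```

Calling it at the end of showPages, for example as markCurrentPage(document.querySelectorAll("h2 a"), id) (the selector is an assumption about the markup), would cover the number links and the Previous/Next buttons alike, because they all go through showPages.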
I am trying to build a scraper and I would need some help with the following:
I would like to grab a bunch of data from an a-tag and some divs/spans nested in the same div.
My code looks like this:
page = Nokogiri::HTML(open(website))
page.search('.company').each { |e| companies << e.text.strip }
page.search('.jobtitle').each { |e| jobtitles << e.text.strip }
page.search('.location').each { |e| locations << e.text.strip }
page.xpath('//a[@class="turnstileLink"]').map{ |e| links << e['href'] }
For the first three (company, title and location) I get either 16 or 15 results, but for the last search my array only contains 10 elements. Weirdly, they also don't match the first 10 of one of the other arrays, but rather start matching somewhere around the 3rd or 4th element of one of the other arrays.
The html of a typical card that I would like to target is here:
<div class="row result clickcard" id="pj_81c3e09223cbc6b3" data-jk="81c3e09223cbc6b3" data-advn="4563763653116462" data-tu="">
<a target="_blank" id="sja1" data-tn-element="jobTitle" class="jobtitle turnstileLink" href="/pagead/clk?mo=r&ad=-6NYlbfkN0DhDTzlYIMy8YIuVE6IrMC_kH05KGZgoAT6LTrcTn8STrwXoiuruouegXiAvJy4qud6xIecRibm3b0Q5eOBkpCiV3R04sAyQbvP7gt6NKZVpCRp32eFzXudmk-TIABX3xEZGo90a47Vz9OofqZaLDh37545RNQ3sFjM6VzWNEWwKf_YoXxeGKcAICj9AADyBuYAY7p9UIUxoox7J5U9gO8Zo2dvRW-i5FJtaUr49Vjsl04W0Jp-CN2azbfp6rrfT6RYFbJ_YAc2iI-L37eeygDtI4KXQwv_elrV8ZLEKo9rkcfEzbE129kX7JKeEq5wJ1dj7GJ4ONH1lIPJQd1gJLoqNYJVQlLTKJiBP72Z0RBmgfZQ-69U8AoEyMT6pytz6iqykLCnO-SxClmvFPJsNV96oBGzpMWtWQeVgGQ49jZfBBRq9Ubw7N73iEjCv6oQ70hcW1P4d8DYK0pCI7vu2KfUh0P9vx8AKC6wY2QoAZeeP4OiBIJ8ikKSIUYJTbe3UwKcLYP7r_3_rx1gY_JO1ReG21ctCxfqGH9DnqTSjz3SYCMZ2ZekooXa&vjs=3&p=1&sk=&fvj=1" title="Private Care Jobs With Elder - Immediate Start - £550 to £750 pw" rel="noopener nofollow" onmousedown="sjomd('sja1'); clk('sja1');" onclick="setRefineByCookie([]); sjoc('sja1',0); convCtr('SJ')">Private Care Jobs With Elder - Immediate Start - £550 to £75...</a>
<br>
<div class="sjcl">
<span class="company">
Elder</span>
<span class="location">London</span>
</div>
<div class="">
<table cellpadding="0" cellspacing="0" border="0"><tbody><tr><td class="snip">
<span class="summary">
Pass a full DBS check or have a valid check already. Access to the internet and a smartphone. At Elder, we’re looking for caring individuals to join our...</span>
</td></tr></tbody></table>
</div>
<div class="sjCapt">
<div class="result-link-bar-container">
<div class="result-link-bar"><span class=" sponsoredGray ">Sponsored</span> - <span id="tt_set_10" class="tt_set"><a id="sj_81c3e09223cbc6b3" href="#" class="sl resultLink save-job-link " onclick="changeJobState('81c3e09223cbc6b3', 'save', 'linkbar', true, ''); return false;" title="Save this job to my.indeed">save job</a></span><div id="editsaved2_81c3e09223cbc6b3" class="edit_note_content" style="display:none;"></div><script>if (!window['sj_result_81c3e09223cbc6b3']) {window['sj_result_81c3e09223cbc6b3'] = {};}window['sj_result_81c3e09223cbc6b3']['showSource'] = false; window['sj_result_81c3e09223cbc6b3']['source'] = "Indeed"; window['sj_result_81c3e09223cbc6b3']['loggedIn'] = false; window['sj_result_81c3e09223cbc6b3']['showMyJobsLinks'] = false;window['sj_result_81c3e09223cbc6b3']['undoAction'] = "unsave";window['sj_result_81c3e09223cbc6b3']['jobKey'] = "81c3e09223cbc6b3"; window['sj_result_81c3e09223cbc6b3']['myIndeedAvailable'] = true; window['sj_result_81c3e09223cbc6b3']['showMoreActionsLink'] = window['sj_result_81c3e09223cbc6b3']['showMoreActionsLink'] || false; window['sj_result_81c3e09223cbc6b3']['resultNumber'] = 10; window['sj_result_81c3e09223cbc6b3']['jobStateChangedToSaved'] = false; window['sj_result_81c3e09223cbc6b3']['searchState'] = "l=London&start=20"; window['sj_result_81c3e09223cbc6b3']['basicPermaLink'] = "https://www.indeed.co.uk"; window['sj_result_81c3e09223cbc6b3']['saveJobFailed'] = false; window['sj_result_81c3e09223cbc6b3']['removeJobFailed'] = false; window['sj_result_81c3e09223cbc6b3']['requestPending'] = false; window['sj_result_81c3e09223cbc6b3']['notesEnabled'] = false; window['sj_result_81c3e09223cbc6b3']['currentPage'] = "serp"; window['sj_result_81c3e09223cbc6b3']['sponsored'] = true;window['sj_result_81c3e09223cbc6b3']['showSponsor'] = true;window['sj_result_81c3e09223cbc6b3']['reportJobButtonEnabled'] = false; window['sj_result_81c3e09223cbc6b3']['showMyJobsHired'] = false; 
window['sj_result_81c3e09223cbc6b3']['showSaveForSponsored'] = true; window['sj_result_81c3e09223cbc6b3']['showJobAge'] = true;</script></div></div>
<div class="tab-container">
<div class="sign-in-container result-tab"></div>
<div class="tellafriend-container result-tab email_job_content"></div>
</div>
</div>
</div>
All cards have the same class ".clickcard" and all the relevant links have the class ".turnstileLink", but I can't seem to get consistent results when I try to page.search or page.xpath them. Besides the different number of elements returned, I also have trouble matching up the data from the different arrays correctly.
So my question is: if I want to scrape the company name, location, job title, the URL to that page and possibly another value, how would I best go about this?
I would appreciate any feedback!
Edit:
The contains() expression needs to be more complex:
contains(
concat(' ',normalize-space(@class),' '),
' turnstileLink '
)
to prevent classes like turnstileLinkerCar from matching. It's such a hassle that I would use doc.css() with a css selector like a.turnstileLink, which takes care of matching exactly the specified class name in a string that may have multiple class names.
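To see why the padding with spaces matters, here is the same token test sketched in plain JavaScript. The helper name hasClassToken is hypothetical and is not part of Nokogiri or XPath; it only mirrors what the expression above computes on the class attribute string.

```javascript
// Hypothetical helper mirroring
//   contains(concat(' ', normalize-space(@class), ' '), ' token ')
function hasClassToken(classAttr, token) {
  // normalize-space: trim the ends and collapse runs of whitespace.
  const normalized = classAttr.trim().replace(/\s+/g, " ");
  // Pad both sides so the token can only match a whole class name.
  return (" " + normalized + " ").includes(" " + token + " ");
}

console.log(hasClassToken("jobtitle turnstileLink", "turnstileLink")); // true
console.log(hasClassToken("turnstileLinkerCar", "turnstileLink"));     // false
```

A plain substring test would report true for the second call as well, which is the false match the longer expression avoids.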
Try:
doc.xpath('//a[contains(@class, "turnstileLink")]').each{ |e| links << e['href'] }
Or:
doc.css('a.turnstileLink').each{ |e| links << e['href'] }
Here's the problem:
require 'nokogiri'
my_html = %q{
<html>
<body>
A link
B link
C link
D link
</body>
</html>
}
doc = Nokogiri::HTML(my_html)
links = doc.xpath('//a[@class="c1"]').map{ |e| e["href"] }
p links
--output:--
["aaa"]
The class of the bbb link is "c1 c2" which is not equal to "c1".
Response to comment:
require 'nokogiri'
my_html = %q{
<html>
<body>
<div class="x">
A link
B link
C link
<div>
D link
</div>
</div>
<div class="y">
Y link
</div>
</body>
</html>
}
doc = Nokogiri::HTML(my_html)
links = doc.css('a.c1').map{ |e| e["href"] }
p links
--output:--
["aaa", "bbb", "ccc", "ddd", "yyy"]
But:
links = doc.css('div.x a.c1').map{ |e| e["href"] }
p links
--output:--
["aaa", "bbb", "ccc", "ddd"]
The same thing with xpaths:
links = doc.xpath('//div[contains(@class, "x")]//a[contains(@class, "c1")]').map{ |e| e["href"] }
p links
--output:--
["aaa", "bbb", "ccc", "ddd"]
I have the following code, and it displays the listed text and the pagination at the bottom all fine. But when I try to get the images from the child page into the list it errors for me. Any ideas why?
@inherits Umbraco.Web.Mvc.UmbracoTemplatePage
@{
Layout = "SiteLayout.cshtml";
}
<div role="main" id="main">
<div id="blogPageFeatures">
<!-- loop through all of the child pages (blog posts) and grab data to output as a summary -->
@{
var pageSize =3;
var page =1;int.TryParse(Request.QueryString["page"],out page);
var items =Model.Content.Children().Where(x => x.IsDocumentType("BlogArticle")).OrderByDescending(x => x.CreateDate);
var totalPages =(int)Math.Ceiling((double)items.Count()/(double)pageSize);
if(page > totalPages)
{
page = totalPages;
}
else if(page <1)
{
page =1;
}
}
@foreach(var item in items.Skip((page -1)* pageSize).Take(pageSize))
{
<a href="@item.Url">
<!-- The next image line is where it errors -->
<img src="@Umbraco.Media(item.Children.blogArticleImage).Url">
<h2>@item.Name</h2>
<p class="blogTileDate">@item.CreateDate.ToString("dd/MM/yy")</p>
</a>
}
<div class="clearfix"></div>
<!-- Pagination START -->
@if(totalPages >1)
{
<div class="pagination">
<ul>
@if(page >1)
{
<li>Prev</li>
}
@for(int p =1; p < totalPages +1; p++)
{
var active =(p == page)?" class=\"active\"":string.Empty;
<li@(Html.Raw(active))>
@p
</li>
}
@if(page <totalPages)
{
<li>Next</li>
}
</ul>
</div>
}
<!-- Pagination END -->
<div class="clearfix"></div>
</div>
<div class="clearfix"></div>
</div>
You are getting Model.Content.Children, which is a strongly typed collection of IPublishedContent objects. You are then calling item.Children.blogArticleImage, which is not going to work, as you're trying to get a property on the children of the item, not a property of the item itself. You're also trying to use dynamic syntax, which won't work on the strongly typed objects. Try changing your image code to:
<img src="@Umbraco.TypedMedia(item.GetPropertyValue<int>("blogArticleImage")).Url">
That should do the trick. One thing to note is that if an item doesn't have an image set, or the image that has been selected has been deleted, it'll throw an error. A more robust approach would be something like this:
var imageSrc = "ADD DEFAULT IMAGE PATH HERE";
var media = Umbraco.TypedMedia(item.GetPropertyValue<int>("blogArticleImage"));
if (media != null && !string.IsNullOrEmpty(media.Url))
{
imageSrc = media.Url;
}
<img src="@imageSrc">
I'm trying to make a special menu, but I have a problem selecting the most deeply nested element (a div). The menu will be dynamic, so the number of divs nested in one div can change (parents will be created with new children). I therefore need to select the last one (the most nested) without using more classes or IDs.
Here is the code I have written so far:
<div id="strategy">
<div class="selected">
0
<div class="selected">
some text
<div class="selected"> this is the last div, but it can be anytime changed and more childs of this element can be created</div>
</div>
</div>
<div class="selected">
1
</div>
<div>
2
</div>
</div>
and some of the CSS I tried:
div.selected:only-of-type {background: #F00;}
I also tried nth:last-child, only-child... I think I tried everything, but there must be some way to do it.
if you're open to jQuery...
$(document).ready(function() {
var $target = $('#strategy').children();
while( $target.length ) {
$target = $target.children();
}
var last = $target.end(); // You need .end() to get to the last matched set
var lastHtml = last.html();
$('body').append('<strong>deepest child is: ' + lastHtml + '</strong>');
last.css('color', 'blue');
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<div id="strategy">
<div class="selected">
0
<div class="selected">
some text
<div class="selected"> this is the last div, but it can be anytime changed and more childs of this element can be created</div>
</div>
</div>
<div class="selected">
1
</div>
<div>
2
</div>
</div>
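If jQuery is not an option, the same search can be sketched in plain JavaScript with a small recursive walk. The function name deepestDescendant is hypothetical; it only relies on each node exposing a children collection, which DOM elements do. When several elements share the maximum depth, this sketch returns the first one in document order.

```javascript
// Hypothetical helper: return the most deeply nested descendant of
// `root` together with its depth (0 means root itself has no children).
function deepestDescendant(root) {
  let best = { node: root, depth: 0 };
  for (const child of Array.from(root.children)) {
    const found = deepestDescendant(child);
    // Strict comparison: on a tie the earlier branch wins.
    if (found.depth + 1 > best.depth) {
      best = { node: found.node, depth: found.depth + 1 };
    }
  }
  return best;
}
```

In the browser it could be used as deepestDescendant(document.getElementById("strategy")).node.style.background = "#F00"; and it would need to be called again whenever the menu structure changes.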
I am trying, without luck, to place 6 images in 2 rows.
The problem is that the second row breaks: the last 2 images are pushed down.
I'm using Umbraco and Bootstrap 3.
Here is my code:
@inherits Umbraco.Web.Mvc.UmbracoTemplatePage
@{
Layout = "InnerPage.cshtml";
}
@section slider
{
@{Html.RenderPartial("ImageSlider");}
}
@{
<!-- PROGRAM PAGE -->
<script src="js/script.js"></script>
<div class="caption-button caption-gray left-top" style="font-size:14px;">UPCOMING EVENTS</div>
<div class="padding-top-60">
<div class="row row-lg-height">
<div id="eventCarousel" class="carousel slide vertical" >
<div class="carousel-inner">
@{
var content = Umbraco.Content(1122);
int itemsCount = content.Children.Count();
int sliders = itemsCount / 6;
for (int i = 0; i <= sliders; i++)
{
var items = content.Children;
if (i == 0)
{
@Html.Raw("<div class=\"item active\">")
items = content.Children.Take(6);
}
else
{
items = content.Children.Skip(i * 6).Take(6);
@Html.Raw("<div class=\"item\">")
}
foreach (var childContent in items)
{
var contentService = ApplicationContext.Current.Services.ContentService.GetById(int.Parse(childContent.Id.ToString()));
var title = childContent.Name.ToString();
var image = Umbraco.Media(contentService.Properties["Thumbnail"].Value.ToString());
var description = contentService.Properties["shortDescription"].Value.ToString();
var img = image.Url.ToString();
<div class="col-lg-4">
<img src="@image.Url" class="media-object pull-left img-responsive" />
<h1 class="media-heading" style="color:#606060;">@title</h1>
<div style="padding:0px 5px; ">@MvcHtmlString.Create(@description)</div>
</div>
}
@Html.Raw("</div>")
}
}
</div>
</div>
</div>
</div>
<a class="caption-button caption-red right-bottom" href="#eventCarousel" data-slide="next">MORE</a>
}
I attached 2 images so you can see what Firebug shows me and what happens on the screen.
Please help me! I have been racking my brain over this for 2 days already.
What am I doing wrong?
I think your code is a little bit complicated. With another approach you will be able to debug it much more easily.
Try something like this:
@{
var allNodesToLoop = Umbraco.Content(1122).Children;
}
@foreach(var nodegroup in allNodesToLoop.InGroupsOf(2)) {
<div class="row">
@foreach(var item in nodegroup) {
<div class="col-md-4">
@item.Name
<!-- and other stuff you want to render in this grid cell -->
</div>
}
</div>
}
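For readers who have not used InGroupsOf, the grouping it is relied on for here amounts to cutting the list into consecutive chunks, each of which becomes one row. A minimal sketch of that idea in plain JavaScript (the function below is a hypothetical stand-in, not the Umbraco implementation):

```javascript
// Hypothetical stand-in for the grouping step: split `items` into
// consecutive groups of at most `size` elements.
function inGroupsOf(items, size) {
  const groups = [];
  for (let i = 0; i < items.length; i += size) {
    groups.push(items.slice(i, i + size));
  }
  return groups;
}

console.log(inGroupsOf([1, 2, 3, 4, 5, 6], 3)); // [ [ 1, 2, 3 ], [ 4, 5, 6 ] ]
```

Since col-md-4 takes 4 of Bootstrap's 12 grid columns, three cells fill one row, so a group size of 3 is what would give 6 images in 2 rows; with a group size of 2 each row is only two-thirds full.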