Get text from XML file and print it in HTML File - html

I have a simple XML file created in R that consists of the following lines:
<statistics>
<mean>15.75</mean>
<sd>2.83</sd>
</statistics>
I want to extract the mean and sd to a HTML page, that has a Flash graph and I would like this underneath:
Statistics
Mean = 15.75
Standard Deviation = 2.83
What is the easiest way to achieve this?
Regards,
Anthony.

You could use PHP and SimpleXML.
Just load your XML with SimpleXML:
$xml = new SimpleXMLElement(file_get_contents("statistics.xml"));
and afterwards echo the desired elements to the page (or add them to your template engine). Because <statistics> is the root element of the file, $xml already refers to it:
echo "Mean: ".$xml->mean."<br />";
echo "Standard Deviation: ".$xml->sd."<br />";
If you have more than one statistics element, wrap them in a single root element (an XML document can have only one root; the name results below is just an example):
<results>
<statistics>
<mean>15.75</mean>
<sd>2.83</sd>
</statistics>
<statistics>
<mean>25.75</mean>
<sd>28.3</sd>
</statistics>
</results>
Then simply use a foreach loop to iterate through each element:
foreach ($xml->statistics as $statistic) {
echo "Mean: ".$statistic->mean."<br />";
echo "Standard Deviation: ".$statistic->sd."<br />";
}

jQuery is one way to do it. This will dynamically load the XML file on the client. The downside is that you have to include the library, which you can download from www.jquery.com. It can be used for many other things, so it is worth looking into. Your code would be something like:
$(document).ready(function () {
$.get('address/to/xml.file', function(data) {
var mean = $(data).find('mean').text();
var sd = $(data).find('sd').text();
$("#somecontainer").text(mean + "|" + sd); // "#somecontainer" stands for the id of your own element
});
});
Or do you want some server-side language to do your xml parsing and printing?
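For a file this small, the extraction and formatting step can also be sketched without a library. This is a minimal illustration, not a general XML parser: extractTag and renderStatistics are hypothetical helper names, and the regex approach assumes flat, attribute-free elements like the ones in the question. In a browser, DOMParser would be the sturdier choice.

```javascript
// Hypothetical helper: returns the text of the first <name>...</name> element.
// Assumes a flat element with no attributes and no nested tags.
function extractTag(xml, name) {
  const match = new RegExp("<" + name + ">([^<]*)</" + name + ">").exec(xml);
  return match ? match[1].trim() : null;
}

// Builds the HTML fragment described in the question from the XML text.
function renderStatistics(xml) {
  const mean = extractTag(xml, "mean");
  const sd = extractTag(xml, "sd");
  return "<h3>Statistics</h3>\n" +
    "<p>Mean = " + mean + "</p>\n" +
    "<p>Standard Deviation = " + sd + "</p>";
}

// The XML from the question, inlined here instead of being fetched.
const sample = "<statistics>\n<mean>15.75</mean>\n<sd>2.83</sd>\n</statistics>";
console.log(renderStatistics(sample));
```

The returned string could then be assigned to the innerHTML of whatever element sits underneath the graph.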

Related

Dynamic XML Template in TVML/TVJS

Does anyone know how to dynamically generate a template in an Apple TV app using TVJS/TVML? Basically I want to hit my API, get back an array of objects and then insert that data into my XML template.
I've been searching for info on how to accomplish it but have come up short. I've found many tutorials that use hard coded images, videos, etc but nothing dynamically generated.
Any help would be appreciated.
Finally, I've figured this out. It wouldn't be difficult to generate a template on-the-fly, but instead I wanted to reuse the Presenter and the ResourceLoader, and to have the template as a *.xml.js file. Here is the solution I managed to arrive at.
For the initial view, I used a catalogTemplate, as demonstrated in Ray Wenderlich's tutorial. Instead of conference talks, however, I was displaying categories of men's and women's merchandise. Once a category was selected, I wanted to display a stackTemplate with a number of options for that category. The problem was how to pass any information, the title of the category in the simplest case, to the second template.
In the first template, I had the lockups configured like so:
<lockup categoryTitle="Women: Dresses" categoryDir="w-dresses">
<img src="${this.BASEURL}images/dresses.jpg" width="230" height="288" />
<title>Dresses</title>
</lockup>
In application.js, I had a listener attached, the same way the tutorials show:
doc.addEventListener("select", Presenter.load.bind(Presenter));
Here is the second template (Category.xml.js):
var Template = function(categoryTitle) {
return `<?xml version="1.0" encoding="UTF-8" ?>
<document>
<stackTemplate>
<banner>
<title>${categoryTitle}</title>
</banner>
</stackTemplate>
</document>`
}
This is JavaScript, so in your case you can pass into the function, say, an array of values and then construct the template accordingly. The tricky part was to pass a value.
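To make the "array of values" idea concrete, here is a minimal sketch of a template function that builds one lockup per item. The item shape (title, image), the escapeXml helper, and the collectionList/shelf/section layout are assumptions for illustration; they are not part of the original Category.xml.js.

```javascript
// Hypothetical helper: escape characters that are special in XML text and attributes.
function escapeXml(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Same idea as Category.xml.js, but the template also receives an array of items.
var Template = function(categoryTitle, items) {
  // One <lockup> per item; the values come from the array, not from hard-coded markup.
  var lockups = items.map(function(item) {
    return `<lockup>
            <img src="${escapeXml(item.image)}" width="230" height="288" />
            <title>${escapeXml(item.title)}</title>
          </lockup>`;
  }).join("\n");

  return `<?xml version="1.0" encoding="UTF-8" ?>
<document>
  <stackTemplate>
    <banner>
      <title>${escapeXml(categoryTitle)}</title>
    </banner>
    <collectionList>
      <shelf>
        <section>
${lockups}
        </section>
      </shelf>
    </collectionList>
  </stackTemplate>
</document>`;
};
```

The array would typically be the parsed response of your API call, passed through the loader the same way categoryTitle is passed in the rest of this answer.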
First, I made a couple of changes to the ResourceLoader (this can be done better, of course, it's just a proof of concept). I simply added categoryTitle as an additional parameter to the top-level function and when calling the Template:
ResourceLoader.prototype.loadResource = function(resource, callback, categoryTitle) {
var self = this;
evaluateScripts([resource], function(success) {
if(success) {
var resource = Template.call(self, categoryTitle);
callback.call(self, resource);
} else {
var title = "Resource Loader Error",
description = `Error loading resource '${resource}'. \n\n Try again later.`,
alert = createAlert(title, description);
navigationDocument.presentModal(alert);
}
});
}
Finally, in the Presenter, in the load, I am passing categoryTitle to the resourceLoader:
load: function(event) {
var self = this,
ele = event.target,
categoryTitle = ele.getAttribute("categoryTitle");
if (categoryTitle) {
resourceLoader.loadResource(`${baseURL}templates/Category.xml.js`, function(resource) {
var doc = self.makeDocument(resource);
self.pushDocument(doc);
}, categoryTitle);
}
},
This works for me.
One final note: for some categories, I had titles with an ampersand, like 'Tops & T-shirts'. Naturally, I replaced the ampersand with an XML entity: 'Tops &amp; T-shirts'. This, however, didn't work, probably because the string was decoded twice: the first pass turned the entity into an ampersand, and on the second pass the bare ampersand was flagged as an error. What worked for me was escaping it twice: 'Tops &amp;amp; T-shirts'!
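The double-decoding explanation can be checked with a small sketch. escapeAmpersand, decodeAmpersand and hasBareAmpersand are hypothetical helpers standing in for whatever escaping and entity decoding happens along the way; the answer itself only guesses that two passes occur, so treat this as an illustration of the theory, not a description of TVMLKit internals.

```javascript
// Hypothetical helpers that only deal with the ampersand, which is all this case needs.
function escapeAmpersand(text) {
  return text.replace(/&/g, "&amp;");
}
function decodeAmpersand(text) {
  return text.replace(/&amp;/g, "&");
}
// True if the text contains an ampersand that does not start a known entity,
// which an XML parser would reject.
function hasBareAmpersand(text) {
  return /&(?!(amp|lt|gt|quot|apos);)/.test(text);
}

const title = "Tops & T-shirts";
const escapedOnce = escapeAmpersand(title);        // 'Tops &amp; T-shirts'
const escapedTwice = escapeAmpersand(escapedOnce); // 'Tops &amp;amp; T-shirts'

// First pass: reading the attribute decodes one level of entities.
const onceAfterFirstPass = decodeAmpersand(escapedOnce);
const twiceAfterFirstPass = decodeAmpersand(escapedTwice);

// Second pass: the value is placed into the next template and parsed as XML again.
console.log(hasBareAmpersand(onceAfterFirstPass));  // bare "&" reaches the parser
console.log(hasBareAmpersand(twiceAfterFirstPass)); // still a valid entity
```

An alternative to double-escaping in the markup is to escape the value at the moment it is interpolated into the second template.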
It is simple if you are using atvjs.
// create your dynamic page
ATV.Page.create({
name: 'homepage',
url: 'path/to/your/json/data',
template: function(data) {
// your dynamic template
return `<document>
<alertTemplate>
<title>${data.title}</title>
<description>${data.description}</description>
</alertTemplate>
</document>`;
}
});
// later in your app you can navigate to your page by calling
ATV.Navigation.navigate('homepage');
Disclaimer: I am the creator and maintainer of atvjs and as of writing this answer, it is the only JavaScript framework available for Apple TV development using TVML and TVJS. Hence I could provide references only from this framework. The answer should not be mistaken as a biased opinion.
I'm using PHP to generate the TVML files dynamically, configuring the output as text/javascript format:
<?php
header("Content-type: application/x-javascript");
// [run your PHP API calls here]
$template = '<?xml version="1.0" encoding="UTF-8" ?>
<document>
... [use PHP variables here] ...
</document>';
echo "var Template = function() { return `". $template . "`}";
?>
You can dynamically generate a template by creating a dynamic string that represents the xml in a TVML template.
Review the code in here: https://developer.apple.com/library/prerelease/tvos/samplecode/TVMLCatalog/Listings/client_js_Presenter_js.html#//apple_ref/doc/uid/TP40016505-client_js_Presenter_js-DontLinkElementID_6
This file has functions that can be used to create an XML document that can represent a view.
You can make an XMLHttpRequest (e.g. consuming JSON API calls through TVJS on tvOS), bring back some JSON data, and then dynamically generate an XML document that conforms to one of the TVML templates. Parse it into an XML document and then navigate to the document.

Scraping using Html Agility Package

I am trying to scrape data from a news article using HtmlAgilityPack. The link is as follows: http://www.ndtv.com/india-news/vyapam-scam-documents-show-chief-minister-shivraj-chouhan-delayed-probe-780528
I have written the following code to extract all the comments in this article, but for some reason my variable aTags is null.
Code:
var getHtmlWeb = new HtmlWeb();
var document = getHtmlWeb.Load(txtinputurl.Text);
var aTags = document.DocumentNode.SelectNodes("//div[@class='com_user_text']");
int counter = 1;
if (aTags != null)
{
foreach (var aTag in aTags)
{
lbloutput.Text += counter + ". " + aTag.InnerHtml + "\t" + "<br />"; // prefix each comment with its number
counter++;
}
}
I have also tried this XPath, with the same result: //div[@class='newcomment_list']/ul/li/div[@class='headerwrap']/div[@class='com_user_text']
Please help me with the correct XPath to extract all the comments.
I have searched all over the net but found no solution.
Do a 'View Source' on the page and search for com_user_text. The user comments don't appear at all. They are loaded via javascript after the page is loaded. So when you load the page content via getHtmlWeb.Load(), you don't get user comments.
As this answer says, HTML Agility is not a tool capable of emulating a browser and running javascript. Instead, you need something like WatiN that "allows programmatic access to web pages through a given browser engine and will load the full document."

Parse img from RSS-feed using PHP SIMPLE HTML DOM Parser

I am trying to parse this site (to get the img-link): http://statigr.am/feed/parishilton
This is my code:
include 'parse/simple_html_dom.php';
// Create DOM from URL or file
$html = file_get_html('http://statigr.am/feed/parishilton/');
// Find all images
foreach($html->find('img') as $element)
{
echo $element->src . '<br>';
}
The script doesn't return anything! Why is that ? I want the img link.
It's because all the images are inside CDATA sections, which the parser ignores, so the solution is:
$html = file_get_html('http://statigr.am/feed/parishilton/');
$html = str_replace("<![CDATA[","",$html); // clean-up
$html = str_replace("]]>","",$html); // clean-up
$html = str_get_html($html); // re-construct the dom object
// Loop
foreach($html->find('item description img') as $el)
{
echo $el->src . "<br />";
}
Replace all CDATA from the returned content and then use str_get_html to create DOM object from that string and loop through the images. (Tested and works).
Output :
http://distilleryimage3.s3.amazonaws.com/cc25d8562c9611e3a8b922000a1f8ac2_8.jpg
http://distilleryimage7.s3.amazonaws.com/4d8e22da2c8911e3a6a022000ae81e78_8.jpg
http://distilleryimage5.s3.amazonaws.com/ce6aa38a2be711e391ae22000ae9112d_8.jpg
http://distilleryimage3.s3.amazonaws.com/d64ab4c42bc811e39cbd22000a1fafdb_8.jpg
......
......

DOMDocument issues: Escaping attributes and removing tags from javascript

I am not a fan of DOMDocument because I believe it is not very good for real-world usage. Yet in my current project I need to replace all text in a page (whose source code I don't have access to) with other strings (a sort of translation), so I need to use it.
I tried doing this with DOMDocument and I didn't get the expected result. Here is the code I use:
function Translate_DoHTML($body, $replaceArray){
if ($replaceArray && is_array($replaceArray) && count($replaceArray) > 0){
$body2 = mb_convert_encoding($body, 'HTML-ENTITIES', "UTF-8");
$doc = new DOMDocument();
$doc->resolveExternals = false;
$doc->substituteEntities = false;
$doc->strictErrorChecking = false;
if (@$doc->loadHTML($body2)){
Translate_DoHTML_Process($doc, $replaceArray);
$body = $doc->saveHTML();
}
}
return $body;
}
function Translate_DoHTML_Process($node, $replaceRules){
if($node->hasChildNodes()) {
$nodes = array();
foreach ($node->childNodes as $childNode)
$nodes[] = $childNode;
foreach ($nodes as $childNode)
if ($childNode instanceof DOMText) {
if (trim($childNode->wholeText)){
$text = str_ireplace(array_keys($replaceRules), array_values($replaceRules), $childNode->wholeText);
$node->replaceChild(new DOMText($text),$childNode);
}
}else
Translate_DoHTML_Process($childNode, $replaceRules);
}
}
And here are the problems:
Escaping attributes: There are data-X attributes in file that become escaped. This is not a major problem but it would be great if I could disable this behavior.
Before DOM:
data-link-content=" <a class="submenuitem" href=&quot
After DOM:
data-link-content=' <a class="submenuitem" href="
Removing of closing tags in javascript:
This is actually the main problem for me. I don't know why DOMDocument sees any need to remove these tags, but it does. As you can see in the example below, it removes closing tags inside the JavaScript string. It also removes the last part of the script. It seems that DOMDocument parses the JavaScript inside. Maybe because there is no CDATA section? But this is HTML, and we don't need CDATA in HTML; I thought CDATA was for XHTML. I also have no way to add CDATA here. So can I ask it not to parse script tags?
Before DOM:
<script type="text/javascript"> document.write('<video src="http://x.webm"><p>You will need to Install the latest Flash plugin to view this page properly.</p></video>'); </script>
After DOM:
<script type="text/javascript"> document.write('<video src="http://x.webm"><p>You will need to <a href="http://www.adobe.com/go/getflashplayer" target="_blank">Install the latest Flash plugin to view this page properly.</script>
If there is no way for me to prevent these things, is there any way that I can port this code to SimpleHTMLDOM?
Thank you very much.
Try this: replace the line
$body2 = mb_convert_encoding($body, 'HTML-ENTITIES', "UTF-8");
with
$body2 = convertor($body);
and add this function to your code:
function convertor($ToConvert)
{
$FromConvert = html_entity_decode($ToConvert,ENT_QUOTES,'ISO-8859-1');
$Convert = mb_convert_encoding($FromConvert, "ISO-8859-1", "UTF-8");
return ltrim($Convert);
}
But use the right encoding in the context.
Have a nice day.
Based on my search, reason of the second problem is actually what "Alex" told us in this question: DOM parser that allows HTML5-style </ in <script> tag
But based on their research there is no good parser out there capable of understanding today's HTML. Also, html5lib's last update was 2 years ago and it failed to work in real world situations based on my tests.
So I had only one way to solve the second problem. RegEx. Here is the code I use:
function Translate_DoHTML_GetScripts($body){
$res = array();
if (preg_match_all('/<script\b[^>]*>([\s\S]*?)<\/script>/m', $body, $matches) && is_array($matches) && isset($matches[0])){
foreach ($matches[0] as $key => $match)
$res["<!-- __SCRIPTBUGFIXER_PLACEHOLDER".$key."__ -->"] = $match;
$body = str_ireplace(array_values($res), array_keys($res), $body);
}
return array('Body' => $body, 'Scripts' => $res);
}
function Translate_DoHTML_SetScripts($body, $scripts){
return str_ireplace(array_keys($scripts), array_values($scripts), $body);
}
Using the two functions above, I remove every script from the HTML so I can use DOMDocument to do my work. Then at the end, I add them back exactly where they were.
Yet I am not sure if regex is fast enough for this.
And don't tell me not to use regex for HTML. I know that HTML is not a regular language and so on, but if you read the problem yourself, you will suggest the same approach.
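The placeholder technique itself is language-independent. Below is a minimal JavaScript sketch of the same idea, only to illustrate the round trip; extractScripts, restoreScripts and the placeholder text are hypothetical names, not a port of the PHP functions above.

```javascript
// Replace every <script>...</script> block with a comment placeholder
// and remember the original source of each block.
function extractScripts(html) {
  const scripts = [];
  const body = html.replace(/<script\b[^>]*>[\s\S]*?<\/script>/gi, function (match) {
    const placeholder = "<!-- __SCRIPT_PLACEHOLDER_" + scripts.length + "__ -->";
    scripts.push({ placeholder: placeholder, source: match });
    return placeholder;
  });
  return { body: body, scripts: scripts };
}

// Put the original script blocks back where their placeholders are.
function restoreScripts(body, scripts) {
  return scripts.reduce(function (result, entry) {
    // split/join is used so "$" sequences in the script source are not
    // interpreted as replacement patterns.
    return result.split(entry.placeholder).join(entry.source);
  }, body);
}

const page = '<p>Hi</p><script type="text/javascript">document.write("<b>x</b>");</script><p>Bye</p>';
const extracted = extractScripts(page);
console.log(extracted.body);
console.log(restoreScripts(extracted.body, extracted.scripts) === page);
```

Replacing inside the regex callback numbers the placeholders by position, so two identical script blocks still get separate placeholders.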

Sending values through links

Here is the situation: I have 2 pages.
What I want is to have a number of text links(<a href="">) on page 1 all directing to page 2, but I want each link to send a different value.
On page 2 I want to show that value like this:
Hello you clicked {value}
Another point to take into account is that I can't use any php in this situation, just html.
Can you use any scripting, something like JavaScript? If you can, then pass the values along in the query string (just add "?ValueName=Value" to the end of your links). Then on the target page retrieve the query string value. The following site shows how to parse it out: Parsing the Query String.
Here's the Javascript code you would need:
var qs = new Querystring();
var v1 = qs.get("ValueName");
From there you should be able to work with the passed value.
Javascript can get it. Say, you're trying to get the querystring value from this url: http://foo.com/default.html?foo=bar
var tabvalue = getQueryVariable("foo");
function getQueryVariable(variable)
{
var query = window.location.search.substring(1);
var vars = query.split("&");
for (var i=0;i<vars.length;i++)
{
var pair = vars[i].split("=");
if (pair[0] == variable)
{
return pair[1];
}
}
}
** Not 100% certain if my JS code here is correct, as I didn't test it.
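In current browsers the same lookup can be done with the built-in URLSearchParams, which also takes care of URL decoding. The sketch below is one way to produce the "Hello you clicked {value}" text from the question; the function names and the parameter name "value" are assumptions for illustration.

```javascript
// Returns the decoded value of one query-string parameter, or null if it is absent.
// `search` is a query string such as window.location.search ("?value=Link%20One").
function getQueryValue(search, name) {
  return new URLSearchParams(search).get(name);
}

// Builds the message for page 2 from the query string.
function greeting(search) {
  const value = getQueryValue(search, "value");
  return value === null ? "Hello, no link value was passed" : "Hello you clicked " + value;
}

// On page 1 the links would look like <a href="page2.html?value=Link%20One">.
// On page 2, in a browser, you would call greeting(window.location.search).
console.log(greeting("?value=Link%20One"));
```

When writing the result into the page, assigning it to an element's textContent avoids interpreting the value as HTML.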
You might be able to accomplish this using HTML Anchors.
http://www.w3schools.com/HTML/html_links.asp
Append your data to the href attribute of your links and use JavaScript on the second page to parse the URL and display whatever you want.
http://java-programming.suite101.com/article.cfm/how_to_get_url_parts_in_javascript
It's not clean, but it should work.
Use document.location.search and split()
http://www.example.com/example.html?argument=value
var queryString = document.location.search.substring(1); // search is a property, not a function; substring(1) drops the leading "?"
var parts = queryString.split('=');
document.write(parts[0]); // The argument name
document.write(parts[1]); // The value
Hope it helps
Well this is pretty basic with javascript, but if you want more of this and more advanced stuff you should really look into php for instance. Using php it's easy to get variables from one page to another, here's an example:
the url:
localhost/index.php?myvar=Hello World
You can then access myvar in index.php using this bit of code:
$myvar =$_GET['myvar'];
OK, thanks for all your replies. I'll take a look and see if I can find a way to use the scripts.
It's really annoying since I have to work around a CMS: all pages in it are created with a WYSIWYG editor, which tends to filter out unrecognized tags and scripts.
Edit: OK, it seems that the damn WYSIWYG editor only recognizes HTML tags... (as expected)
Using PHP:
<?php
$passthis = "See you on the other side";
// Escape the value so quotes or markup in it cannot break the HTML attribute.
echo '<form action="whereyouwantittogo.php" target="_blank" method="post">'.
'<input type="text" name="passthis1" value="'.
htmlspecialchars($passthis, ENT_QUOTES) .'" /> '.
'<button type="submit" value="Submit">Submit</button>'.
'</form>';
?>
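The script for the page you would like to pass the info to:
<?php
// Fall back to an empty string if the form field is missing.
$thispassed = isset($_POST['passthis1']) ? $_POST['passthis1'] : '';
// Escape before output so user-supplied markup is not injected into the page.
$safe = htmlspecialchars($thispassed, ENT_QUOTES);
echo '<textarea>'. $safe .'</textarea>';
echo $safe;
?>
Use these two snippets on separate pages, with the latter at whereyouwantittogo.php, and you should be in business.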