Get linked CSS files from an HTML page via a regexp - html

I am trying to parse an HTML page that I have loaded with Perl. I need to extract, for example, src="asd/jkl/xyz.css" from the HTML response so that I can rewrite the path as an absolute one.
The reason is that I need the CSS inline in the head of an e-mail.
My plan is:
load the page via Perl
get the src of the linked CSS
load the CSS files via Perl
parse the CSS and put the contents of the CSS files into the head tag of my generated e-mail.
Does anyone have a better idea or a working regex?

Try something like this:
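#!/usr/bin/env perl
use strict;
use warnings;
use XML::LibXML;
my $parser = XML::LibXML->new();
# recover => 2 lets the parser accept lenient real-world HTML without warnings
my $doc = $parser->load_html(location => "http://mywebsite.com", recover => 2);
# <link> elements carry the stylesheet URL in href, not src
print $_->value, "\n" for $doc->findnodes('//link[@rel="stylesheet"]/@href');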
Reference: http://metacpan.org/pod/XML::LibXML
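The remaining step from the question, turning each relative stylesheet path into an absolute one, could look roughly like this. It is only a sketch in Python using the standard library; the class name StylesheetCollector, the helper stylesheet_urls, and the sample page URL are made up for illustration, and it assumes the stylesheets are referenced through the href attribute of link elements.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class StylesheetCollector(HTMLParser):
    """Collects absolute URLs of <link rel="stylesheet"> elements."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.stylesheets = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        href = attr.get("href")
        if "stylesheet" in rel and href:
            # urljoin resolves a relative path against the URL of the page
            self.stylesheets.append(urljoin(self.base_url, href))


def stylesheet_urls(html, base_url):
    """Return the absolute stylesheet URLs found in an HTML string."""
    collector = StylesheetCollector(base_url)
    collector.feed(html)
    return collector.stylesheets


# Hypothetical sample page, standing in for the downloaded HTML response
sample = '<html><head><link rel="stylesheet" href="asd/jkl/xyz.css"></head></html>'
print(stylesheet_urls(sample, "http://mywebsite.com/page/index.html"))
```

Each returned URL can then be downloaded and its contents placed in a style element in the e-mail head.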

Related

Embed image in HTML R Markdown document that can be shared

I have an R Markdown document which is created using a Shiny app and saved as HTML. I have inserted a logo in the top right-hand corner of the output, which has been done using the following code:
<script>
$(document).ready(function() {
$head = $('#header');
$head.prepend('<img src=\"FILEPATH/logo.png\" style=\"float: right;padding-right:10px;height:125px;width:250px\"/>')
});
</script>
However, when I save the HTML output and share the output, of course the user cannot see the logo since the code is trying to find a file path which will not exist on their computer.
So, my question is - is there a way to include the logo in the output without the use of file paths? Ideally I don't want to upload the image to the web, and change the source to a web address.
You can encode an image file to a data URI with knitr::image_uri. If you want to add it in your document, you can add the html code produced by the following command in your header instead of your script:
htmltools::img(src = knitr::image_uri("FILEPATH/logo.png"),
alt = 'logo',
style = 'float: right;padding-right:10px;height:125px;width:250px')
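To illustrate what a data URI contains, here is a rough Python sketch of the same idea: the image bytes are base64-encoded and placed directly in the src attribute, so no file path is needed when the HTML is shared. The function names image_data_uri and img_tag are hypothetical, and this is not how knitr implements image_uri internally.

```python
import base64
import mimetypes
from pathlib import Path


def image_data_uri(path):
    """Return the file at `path` encoded as a base64 data URI."""
    mime_type, _ = mimetypes.guess_type(str(path))
    if mime_type is None:
        # Assumption: fall back to a generic type when the extension is unknown
        mime_type = "application/octet-stream"
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def img_tag(path, alt="logo"):
    """Build an <img> element whose source is embedded in the tag itself."""
    return f'<img src="{image_data_uri(path)}" alt="{alt}"/>'
```

Calling img_tag("FILEPATH/logo.png") would produce a self-contained tag, at the cost of a larger HTML file.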

Reading a webpage with perl's LWP - output differs from a downloaded html page

I am trying to access and use different NCBI pages, such as
http://www.ncbi.nlm.nih.gov/nuccore/NM_000036
However, when I use the 'get' function from Perl's LWP::Simple, I do not get the same output as when I save the page manually (with the Firefox 'save as HTML' option). The output of 'get' lacks the data I require.
Am I doing something wrong?
Should I use another tool?
My script is :
use strict;
use warnings;
use LWP::Simple;
my $input_name='GENES.txt';
open (INPUT, $input_name ) || die "unable to open $input_name";
open (OUTPUT,'>', 'Selected_Genes')|| die;
my $line;
while ($line = <INPUT>)
{
    chomp $line;
    print OUTPUT '>'.$line."\n";
    my $URL='http://www.ncbi.nlm.nih.gov/nuccore/'.$line;
    #e.g:
    #$URL=http://www.ncbi.nlm.nih.gov/nuccore/NM_000036
    my $text=get($URL);   # LWP::Simple exports get(), not gets()
    print $text."\n";
    $text=~m!\r?\n\r?\s+\/translation="((?:(?:[^"])\r?\n?\r?)*)"!;
    print OUTPUT $1."\n";
}
Thanks in advance!
The page at http://www.ncbi.nlm.nih.gov/nuccore/NM_000036 does a lot of JavaScript processing and also loads a bunch of stuff dynamically. LWP::UserAgent does not do that for you as it cannot run JavaScript.
I suggest you take a look at what is happening in your browser, with Firebug or the Chrome Developer Tools. You'll see it does an XHR request to this URL: http://www.ncbi.nlm.nih.gov/sviewer/viewer.fcgi?val=289547499&db=nuccore&dopt=genbank&extrafeat=976&fmt_mask=0&retmode=html&withmarkup=on&log$=seqview&maxdownloadsize=1000000
Now I am not sure how these params translate to the NM_000036, but you should be able to figure that out by looking at some of the JS code that is being run on the page, or by trying multiple pages and looking at the URLs of the XHR calls.
Since this is probably a public service, and I'm assuming you are allowed to take that data, you should consider asking if they have a proper API that you can hit instead of screen scraping the stuff off of their website.
The content you are looking for is generated by JavaScript. You need to parse the HTML from the first response and find the ID of the data you want:
<meta name="ncbi_uidlist" content="289547499" />
Next you need to make another request to the URL in the form: http://www.ncbi.nlm.nih.gov/sviewer/viewer.fcgi?val=ID_YOU_HAVE
Something like this (untested!):
my $URL='http://www.ncbi.nlm.nih.gov/nuccore/'.$line;
my $html=get($URL);
# extract the numeric ID from the ncbi_uidlist meta tag
my ($id) = $html =~m{name="ncbi_uidlist" \s+ content="([^"]+)"}xi;
if ($id) {
    # the second request returns the record that the page loads via JavaScript
    $html=get( "http://www.ncbi.nlm.nih.gov/sviewer/viewer.fcgi?val=" . $id );
    if ($html=~m!\r?\n\r?\s+\/translation="((?:(?:[^"])\r?\n?\r?)*)"!) {
        print OUTPUT $1."\n";
    }
}
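The two-step lookup above (find the ID in the first response, then build the second URL) can also be sketched without any network access. The Python below is an illustration only: extract_uid and viewer_url are hypothetical helper names, the sample HTML is just the meta tag quoted above, and the actual downloads are left out.

```python
import re
from urllib.parse import urlencode

VIEWER_URL = "http://www.ncbi.nlm.nih.gov/sviewer/viewer.fcgi"

# Same pattern as in the Perl snippet: the ID sits in the content attribute
UID_PATTERN = re.compile(r'name="ncbi_uidlist"\s+content="([^"]+)"', re.IGNORECASE)


def extract_uid(html):
    """Return the ID from the ncbi_uidlist meta tag, or None if it is absent."""
    match = UID_PATTERN.search(html)
    return match.group(1) if match else None


def viewer_url(uid):
    """Build the URL of the second request for a given ID."""
    return VIEWER_URL + "?" + urlencode({"val": uid})


# Stand-in for the HTML returned by the first request
first_response = '<meta name="ncbi_uidlist" content="289547499" />'
uid = extract_uid(first_response)
if uid is not None:
    print(viewer_url(uid))
```

Checking for a missing ID before building the URL mirrors the if ($id) guard in the Perl code.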

HTML/ display images in a HTML page

I created an HTML page.
Now I am trying to display all the pictures that are in a specific folder (/folder1) on this HTML page (note: I don't know the names of these images).
I tried to create a loop that reads all these images and displays them in the page.
Is there an easy way to do that?
You are looking for something which HTML cannot do. You are going to need some sort of backend language, whether that be Rails, PHP, Python, or something else doesn't really matter.
HTML is and always will be only a Markup Language.
Here is a similar post which has code that might help you:
How To Display All Images in Folder
With PHP you can use the function scandir() to retrieve all the files in a directory as an array.
Then iterate over that array and display each image with something like:
echo '<img src="path/to/folder1/' . $files_array[$i] . '">';
where $files_array contains the names of every image file in that directory.
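The same server-side idea, listing a folder and emitting one img tag per image, might look like this in Python. It is a sketch under assumptions: the function name image_tags, the set of extensions, and the url_prefix parameter are invented for the example and are not part of any framework.

```python
import html
from pathlib import Path
from urllib.parse import quote

# Assumption: these extensions are what counts as an image here
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def image_tags(folder, url_prefix):
    """Return one <img> tag per image file found directly in `folder`."""
    tags = []
    for entry in sorted(Path(folder).iterdir()):
        if entry.is_file() and entry.suffix.lower() in IMAGE_EXTENSIONS:
            # quote() percent-encodes spaces and similar characters in the name
            src = html.escape(f"{url_prefix}/{quote(entry.name)}", quote=True)
            tags.append(f'<img src="{src}">')
    return tags
```

The resulting list can be joined and written into the page body by whatever backend serves the HTML.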
If your images are stored on a server, you can read the directory there, collect the image names, and send them to the front end.
If you are working with a local file system such as
/dir/index.html
/dir/images/
/dir/images/xxx.png
/dir/images/aaa.png
/dir/images/other image.png
you can batch-rename all the images to 1.png, 2.png, 3.png, and so on, and then use JavaScript in the HTML page to
generate the image elements:
var body = document.getElementsByTagName("body")[0];
// the files are named 1.png, 2.png, ..., so start at 1; adjust 100 to the number of images
for (var i = 1; i <= 100; ++i) {
    var img = document.createElement("img");
    img.src = "images/" + i + ".png";
    body.appendChild(img);
}

Assigning tags for uploaded HTML files in Concrete5

We are planning to load a number of HTML files as-is into a site built with Concrete5.
We have to do this because there are too many files to load them via the editor.
(We are going to generate the HTML files with MadCap Flare.)
However, I need to use the tag feature of Concrete5 for the content loaded this way.
My developers tell me that this is impossible.
Does anyone know how to use tags for files loaded without going through the C5 editor?
I.e., I want the content of the manually linked HTML files to be searchable and filterable within the site using the search and filter features provided by C5.
HELP!!
I recommend creating a very simple template consisting of the standard C5 header/footer code, with one big block as the contents of the body tag.
You can then import the pages by something along the lines of (pseudo-code):
$parent = Page::getByCollectionPath('/');
$ct = CollectionType::getByHandle('template_name');
$data = array(
'cName' => 'The page title',
'cHandle' => 'The trailing path component'
);
$page = $parent->add($ct, $data);
$blocks = $page->getBlocks('Main');
// Gross hack because the template has one block, and that a 'content' block
$blocks[0]->update('content', 'IMPORTED HTML BODY CONTENT');
After that, you can add tags either via the API or the Dashboard.

What do I have to do for opening and editing existing file with PHP?

I have several HTML files. I want to open, edit, and then save changes to them with PHP (not OOP) in an admin panel, using an HTML textarea tag. What do I have to do for that? Do I need to create a new MySQL database? Could you please show me an example?
You can read the contents of the HTML file using file_get_contents:
$html = 'example.html';
$currentContents = file_get_contents($html);
// set the textarea text to $currentContents
To write the changes, you will have to post the textarea to a PHP script (through an HTML form) and then do something like:
$newContents = $_POST['textareaName'];
$html = 'example.html';
$fh = fopen($html, 'w') or die("File could not be opened.");
fwrite($fh, $newContents);
fclose($fh);
There are some security concerns you need to think about, but this is a basic example of how to achieve your goal. Good luck!
http://us.php.net/file_get_contents
http://us.php.net/fwrite
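For comparison, the read-then-write flow from the answer can be sketched in Python as below. The whitelist EDITABLE_FILES and the helper names are hypothetical; the whitelist is one possible way to address the security concerns mentioned above, by refusing any file name that was not explicitly allowed.

```python
from pathlib import Path

# Hypothetical whitelist: only these files may be edited through the form
EDITABLE_FILES = {"example.html"}


def resolve_editable(base_dir, name):
    """Return the path of an editable file, rejecting any other name."""
    if name not in EDITABLE_FILES:
        raise ValueError(f"{name!r} is not an editable file")
    return Path(base_dir) / name


def load_contents(base_dir, name):
    """Read the current contents, e.g. to pre-fill the textarea."""
    return resolve_editable(base_dir, name).read_text(encoding="utf-8")


def save_contents(base_dir, name, new_contents):
    """Overwrite the file with the text posted from the textarea."""
    resolve_editable(base_dir, name).write_text(new_contents, encoding="utf-8")
```

Because the name is checked against a fixed set, a posted value such as "../config.php" is rejected before any file is opened.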