Remove html/text from RSS feed description except images - html

I am setting up a WordPress blog that uses a plugin to import RSS feeds and publish them automatically on a schedule.
I only want to pull the images from the descriptions, not the text that sometimes appears with them or any other HTML elements.
There could be multiple images in a post, each with a caption or a link.
Ideally I'd like to use Yahoo Pipes to grab the feed, then use its Regex operator to replace everything except <img> elements with nothing, and then send the manipulated feed to the WP plugin.
I've only managed to strip paragraphs so far, using: <p>.*?</p>. But in some cases there is plain text not wrapped in tags, etc.
Any help appreciated :) I'm a bit of a regex newbie.

You can try this to get all the images from the HTML:
preg_match_all('/<img[^>]+>/i', $html, $allimages);
print_r($allimages);
If you want the images stored as a single string, implode the matches with a comma, e.g. implode(',', $allimages[0]).
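For anyone working outside PHP, here is a minimal Python sketch of the same idea, assuming the feed description is already available as a string. The function name keep_only_images and the sample description are made up for illustration. It reuses the pattern from the answer, so it shares the same limits: a regex is not an HTML parser, and an attribute value containing ">" would cut a match short.

```python
import re

# Same pattern as the PHP answer: "<img", a run of non-">" characters, then ">".
IMG_TAG = re.compile(r"<img[^>]+>", re.IGNORECASE)


def keep_only_images(description: str) -> str:
    """Return only the <img> tags found in an HTML fragment, one per line."""
    return "\n".join(IMG_TAG.findall(description))


# Hypothetical feed description mixing loose text, a paragraph, a link and two images.
sample = (
    'Some intro text <p>A caption</p>'
    '<a href="http://example.com/"><img src="a.jpg" alt="first"></a>'
    ' loose text <IMG SRC="b.png" /> more text'
)

print(keep_only_images(sample))
```

Extracting the matches and rebuilding the description avoids having to write a pattern for "everything that is not an image", which is the hard part when some of the text is not wrapped in tags.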

Related

Adding URL links in table charts

We have a Superset table that displays data based on an SQL query. Currently, all the data is rendered in HTML div/span tags.
We need to open a link in a new tab when one of the columns is clicked. If we send the raw link in an anchor tag, it is displayed literally as <a href={{link}}></a>, because the Superset code wraps all the contents in a div/span tag.
Is there any way this can be done?
Since Superset uses df.to_html to render pandas data frames to HTML on dashboards and Explore tabs, you can use HTML tags in your queries. For example, I wrote this simple query that generates a table of charts with CSV download links. Check this out:
Write the query like this:
Try to explore it (click on the Explore button!):
As far as I know, you can't. All visualizations in Superset are based on d3. You might want to look for custom visualizations on their site.

String to HTML conversion so that page can read HTML tags

I'm currently working on a blog using Django and SQLite for the back end. In my setup, I stored my articles in the database in this sort of form:
<p> <strong>The Time/Money Tradeoff</strong> </p> <p> As we flesh out High Life, Low Price, you will notice that sometimes we will suggest deals and solutions that may cost slightly more than their alternatives. We won’t always suggest the cheapest laptop...
On the page itself, I have this code for where I use the session data:
<p>{{request.session.article.0.blog_article}}</p>
I had assumed that the web browser would be able to read the HTML tags. However, the article is printed on the page in that raw form, with the <p> tags and the like visible. I think this is because it's stored as a Unicode string in the database and is put onto the page between two quotation marks. If I paste the HTML code directly into the page, the formatting looks the way I want, but I want it to be an automated process (I tell Django which article ID I want, it plugs the elements of the page into the template, and everything looks great).
How can I get the stored article in a form where the page can see the HTML tags?
By default Django autoescapes all strings in a template, so when you render HTML code in the template it shows up as literal HTML source. You can use the safe filter to turn this off:
<p>{{request.session.article.0.blog_article|safe}}</p>
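To see why the tags were visible, here is a small Python sketch of what escaping does to the stored string. It uses the standard library's html.escape as a stand-in for Django's autoescaping, which is an assumption for illustration and not Django's own code path; the variable names are hypothetical.

```python
from html import escape

# Hypothetical stored article body, as it might come back from the database.
stored_article = "<p> <strong>The Time/Money Tradeoff</strong> </p>"

# Roughly what autoescaping does: markup characters become entities, so the
# browser displays the tags as text instead of interpreting them.
escaped = escape(stored_article)

# Roughly what the safe filter does: the string is passed through unchanged.
unescaped = stored_article

print(escaped)
print(unescaped)
```

Because safe skips escaping entirely, it is best suited to content you control, such as articles you wrote yourself.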

UIWebview URL handling

I have a UIWebView into which I feed different kinds of text, such as links, plain text, and emoticons. It is customized so that it chops a long link into several chunks, e.g.
https://fbcdn-sphotos-d-a.akamaihd.net/hphotos-ak-ash3/s480x480/525188_10200372945753969_1256081282_n.jpg
into
https://fbcdn-sphotos-</BR>
d-a.akamaihd.net/hphot</BR>
os-ak-ash3/s480x480/52</BR>
5188_10200372945753969</BR>
_1256081282_n.jpg
I tried removing the method that chops it into several chunks, but this is what I get:
https://fbcdn-sphotosd-
a.akamaihd.net/hphotos-ak-
ash3/s480x480/525188_10200372945753969_1256081282_n.jpg
The link is complete and working, but it doesn't display correctly in the UI. If only it would wrap onto a new line automatically without breaking the link...
So the problem is that the trailing HTML element breaks the link. My question is:
Is there a way to handle this scenario so that I can include the HTML element tags in the link without breaking it or, better yet, just make the long link display correctly?

Pulling out some text from a giant HTML file using Nokogiri/xpath

I am scraping a website and am trying to pull out certain elements from the HTML. In the sites I am scraping, there are script tags with a bunch of info in them; however, there is only one part inside these tags that I am interested in. The line basically looks like:
'image':'http://ut5.example.com/t/231/3_b_643435.jpg',
There is some other stuff above and below it. The line is different for each page source, except of course for the domain and some of the subfolders that store the images.
How would I go about looking through the source for this specific line and cutting out just the URL? I feel I would need to use regular expressions, as the URLs are dynamic.
The "gsub" method does something similar to what I want, with its ability to use a /regex/. But I don't want to replace anything; I just want to find that URL in the source code using a /regex/ and copy it.
According to your comments, I guess this is what you're looking for:
var regex = /http.+/;
Example http://jsfiddle.net/Km9ZB/
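Note that a greedy pattern such as http.+ runs to the end of the line, so on the sample line it would also pick up the trailing quote and comma. As a hedged alternative, here is a minimal Python sketch that anchors on the 'image' key and captures only what sits between the quotes; the surrounding lines in the sample and the helper name find_image_url are made up for illustration. The same pattern should carry over to Ruby's own matching methods such as String#match or String#scan, which find text without replacing anything.

```python
import re

# Hypothetical excerpt of a script block, modelled on the line in the question.
source = """
'title':'Some item',
'image':'http://ut5.example.com/t/231/3_b_643435.jpg',
'price':'12.00',
"""

# Capture everything between the quotes that follow the 'image' key.
IMAGE_URL = re.compile(r"'image'\s*:\s*'([^']+)'")


def find_image_url(text):
    """Return the first image URL found in the text, or None if there is none."""
    match = IMAGE_URL.search(text)
    return match.group(1) if match else None


print(find_image_url(source))
```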

Keeping links in SimplePie description

I am using SimplePie to feed blog entries to a non-WordPress website. By default it strips out all HTML tags, but there is supposed to be a way to keep them in, by inserting this code near the top of your page:
$feed->strip_htmltags(false);
$feed->init();
$feed->handle_content_type();
However, this doesn't seem to be working. The links are present in my feed reader, so I don't believe the problem is with the feed itself, but rather with the way I'm using SimplePie. Has anyone else encountered this issue and found a solution? Thanks.
"By default it strips out all HTML tags"
Actually, it uses a blacklist to strip certain tags, but it does not strip links (a elements). If links are not appearing, then you're likely accessing the content incorrectly, or something else is stripping them.
One possibility for why this is occurring is that you're accessing the summary of the item instead of the content.