My XML data is:
<content_special>
SAT 17TH TEST test club cheap drinks £3 shots £5 bottles $beers
</content_special>
I get this string using this code:
TBXMLElement *Content_Special = [TBXML childElementNamed:@"content_special" parentElement:Club];
NSString *Content_SpecialStr = [TBXML textForElement:Content_Special];
When I log it with NSLog(@"Content_special:%@", Content_SpecialStr);
it prints like this:
Content_special:SAT 17TH TEST
test club
cheap drinks
£3 shots
£5 bottles
$beers
How can I get the original string as it is displayed in the XML? Any suggestions?
I found a solution using Google Toolbox for Mac: add GTMNSString+HTML.h, GTMNSString+HTML.m and GTMDefines.h to the project.
First, import the header:
#import "GTMNSString+HTML.h"
Then use it like this:
Content_SpecialStr = [Content_SpecialStr gtm_stringByUnescapingFromHTML];
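For readers outside Objective-C, the same unescaping idea can be sketched in Python with the standard library's html.unescape. This swaps in a different function from the GTM category above, and the escaped sample text is hypothetical, since the raw entities in the feed are not shown in the question.

```python
from html import unescape

# Hypothetical escaped text; which entities the real feed contains is an assumption.
escaped = "cheap drinks&lt;br&gt;&pound;3 shots &amp; &#163;5 bottles"

# unescape() turns named and numeric character references back into characters.
plain = unescape(escaped)
print(plain)
```

Named references such as &pound; and numeric ones such as &#163; are both handled by the same call.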
Ensure your XML data file uses UTF-8 encoding: http://www.w3schools.com/xml/xml_encoding.asp
How can I extract only the email addresses from a text file? I'm trying to do it in Dreamweaver (Search and Replace), but I'm open to other solutions.
The email addresses will need to be saved to a text file; one email address per line.
Thanks
The Text file looks like this:
jon.park@CALPERS.CA.GOV anne.hollinhead@LAO.CA.GOV karn.whites.COMMUNITYSOLUTIONS.ORG dperkins@CITYOFAVENAL.COM marykathryn.clay@DHCS.CA.GOV charlene.manning@CALHR.CA.GOV
arivera@HFH-CONSULTANTS.COM Andy River 488-390-8293 kristi.biwn@SEN.CA.GOV Kristina Brn/916-631-1769 andy.diff@DOF.CA.GOV Andy Diff Liren.goadian@GMAIL.COM Liren goadian joenfly@KERN.ORG Jay Fly /; 661-684-4215 jmcark@CI.OCEANSIDE.CA.US Jay Mark / 763-495-5832 tytgomez@CI.OCEANSIDE.CA.US Tera Gomez/ 769-435-5818 rdxteming@BEAUMONTCARES.COM Rebe Ddorming 657-577-3255 vgytierorz@SCO.CA.GOV Vilma G/916.319.8550 timmyriverlin@OCFA.ORG Timmera Riverline dcurdoch@M-W-H.COM dclex Milant (918) 445-3902 kavinm.listw@EMSA.CA.GOV Kavinm Lists 918-431-3745
I tried different expressions in Dreamweaver that let me only see part of the email addresses; however, I cannot save only the email address as a list/other.
I use this expression in Search and Replace to find the email addresses: (\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+.[A-Za-z]{2,4}\b)|
How do I copy/export the results to an HTML or text file?
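One option outside Dreamweaver is a short script. The following is a minimal sketch in Python, assuming the addresses use the usual @ separator and reusing the pattern from the question with the dot before the top-level domain escaped. The function names and the output path are made up for illustration.

```python
import re

# Pattern adapted from the question; the dot before the TLD is escaped here.
EMAIL_RE = re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,4}\b")

def extract_emails(text):
    """Return every email-like match, in order of appearance."""
    return EMAIL_RE.findall(text)

def save_emails(text, out_path):
    """Write one address per line to out_path (a hypothetical file name)."""
    emails = extract_emails(text)
    with open(out_path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(emails) + "\n")
    return emails

sample = (
    "arivera@HFH-CONSULTANTS.COM Andy River 488-390-8293 "
    "kristi.biwn@SEN.CA.GOV Kristina Brn/916-631-1769 "
    "karn.whites.COMMUNITYSOLUTIONS.ORG"
)
found = extract_emails(sample)
print(found)
```

Entries without an @, such as the third one in the sample, are skipped. Calling save_emails(text, "emails.txt") on the full file contents would produce the one-address-per-line list.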
I am working on a data prep tutorial, using data from this article: https://www.nytimes.com/interactive/2021/01/19/upshot/trump-complete-insult-list.html#
None of the text is hard-coded, everything is dynamic and I don't know where to start. I've tried a few things with packages rvest and xml2 but I can't even tell if I'm making progress or not.
I've used copy/paste and regexes in Notepad++ to get a tabular structure like this:
Target     Attack
AAA News   Fake News
AAA News   Fake News
AAA News   A total disgrace
...        ...
Mr. ZZZ    A real nut job
but I'd like to show how to do everything programmatically (no copy/paste).
My main question is as follows: is that even possible with reasonable effort? And if so, any clues on how to get started?
PS: I know that this could be a duplicate, I just can't tell of which question since there are totally different approaches out there :\
I've used up my free article allocation at The NY Times for the month, but here is some guidance. It looks like the web page uses several scripts to create and display the page.
If you use your browser's developer tools and look at the Network tab, you will find two CSV files:
tweets-full.csv located here: https://static01.nyt.com/newsgraphics/2021/01/10/trump-insult-complete/8afc02d17b32a573bf1ceed93a0ac21b232fba7a/tweets-full.csv
tweets-reduced.csv located here: https://static01.nyt.com/newsgraphics/2021/01/10/trump-insult-complete/8afc02d17b32a573bf1ceed93a0ac21b232fba7a/tweets-reduced.csv
It looks like the reduced file creates the table quoted above and tweets-full contains the full tweets. You can read these files directly with read.csv() and then process the information as needed.
Be sure to read the terms of service before scraping any webpage.
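As a rough Python counterpart to the read.csv() suggestion, here is a minimal sketch using only the standard library. The helper names read_csv_rows and parse_csv are made up for this example, UTF-8 encoding is an assumption, and no column names are assumed since the files' headers are not shown above.

```python
import csv
import io
import urllib.request

def parse_csv(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))

def read_csv_rows(url):
    """Download a CSV file (assumed UTF-8) and return its rows as dicts."""
    with urllib.request.urlopen(url) as resp:
        text = resp.read().decode("utf-8")
    return parse_csv(text)

# Example call, using the tweets-reduced.csv URL listed above:
# rows = read_csv_rows("https://static01.nyt.com/newsgraphics/2021/01/10/"
#                      "trump-insult-complete/8afc02d17b32a573bf1ceed93a0ac21b232fba7a/"
#                      "tweets-reduced.csv")
# print(rows[0].keys())  # inspect the real column names before relying on them
```

Printing the keys of the first row is a quick way to discover the actual columns before building the target/attack table.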
Here's a programmatic approach with RSelenium and rvest:
library(RSelenium)
library(rvest)
library(tidyverse)

# Start a Selenium-driven Chrome session and load the page
driver <- rsDriver(browser = "chrome", port = 4234L, chromever = "87.0.4280.87")
client <- driver[["client"]]
client$navigate("https://www.nytimes.com/interactive/2021/01/19/upshot/trump-complete-insult-list.html#")
page.source <- client$getPageSource()[[1]]

# Extract nodes for each letter using XPath
Letters <- read_html(page.source) %>%
  html_nodes(xpath = '//*[@id="mem-wall"]/div[2]/div')

# Extract entities using CSS
Entities <- map(Letters, ~ html_nodes(.x, css = 'div.g-entity-name') %>%
                  html_text)

# Extract quotes using CSS
Quotes <- map(Letters, ~ html_nodes(.x, css = 'div.g-twitter-quote-container') %>%
                map(html_nodes, css = 'div.g-twitter-quote-c') %>%
                map(html_text))

# Bind the entities and quotes together. Two letters are blank, so fall back to NA
map2_dfr(Entities, Quotes,
         ~ map2_dfr(.x, .y, ~ {
           if (length(.x) > 0 && length(.y) > 0) {
             data.frame(Entity = .x, Insult = .y)
           } else {
             data.frame(Entity = NA, Insult = NA)
           }
         })) -> Result

# Strip out the quotes
Result %>%
  mutate(Insult = str_replace_all(Insult, "(^“)|([ .,!?]?”)", "") %>% str_trim) -> Result

# Take a look at the result
Result %>%
  slice_sample(n = 10)
   Entity                  Insult
1  Mitt Romney             failed presidential candidate
2  Hillary Clinton         Crooked
3  The “mainstream” media  Fake News
4  Democrats               on a fishing expedition
5  Pete Ricketts           illegal late night coup
6  The “mainstream” media  anti-Trump haters
7  The Washington Post     do nothing but write bad stories even on very positive achievements
8  Democrats               weak
9  Marco Rubio             Lightweight
10 The Steele Dossier      a Fake Dossier
The XPath was obtained by inspecting the webpage source (F12 in Chrome), hovering over elements until the correct one was highlighted, right-clicking, and choosing Copy XPath.
<p>kick 2 first look on republic day According to the latest update, the first look of the film Kick 2 will be released on 26th of January. Regular shooting of this film is going on at a brisk pace in Hyderabad. Rakul Preet is playing the female lead in this film and Thaman is scoring ...</p>
<p>The post <a rel="nofollow" href="http://www.teluguabroad.com/kick-2-first-look-republic-day/">kick 2 first look on republic day</a> appeared first on <a rel="nofollow" href="http://www.teluguabroad.com">Teluguabroad</a>.</p>
How can I display this in a web view? I am getting it from an RSS feed and want to show the content in the next view controller, which has a web view.
You need to build an HTML string, which you can then load into the web view.
NSString *rssString = @"<p>kick 2 first look on republic day According to the latest update, the first look of the film Kick 2 will be released on 26th of January. Regular shooting of this film is going on at a brisk pace in Hyderabad. Rakul Preet is playing the female lead in this film and Thaman is scoring ...</p><p>The post <a rel=\"nofollow\" href=\"http://www.teluguabroad.com/kick-2-first-look-republic-day/\">kick 2 first look on republic day</a> appeared first on <a rel=\"nofollow\" href=\"http://www.teluguabroad.com\">Teluguabroad</a>.</p>";
NSString *myHTML = [NSString stringWithFormat:@"<html><body>%@</body></html>", rssString];
[myUIWebView loadHTMLString:myHTML baseURL:nil];
Also notice that every double quote " inside the NSString is preceded by a backslash (\). This tells the compiler that these double quotes are part of the string and do not terminate the @"" literal.
If you need to load this in the next view controller, pass the string myHTML to that view controller through a property and load it into the web view there.
In my project, I am retrieving an address from the database and displaying it on a page. The problem is the way the retrieved address is displayed.
Basically this is the way the address is being displayed:
Address: 1 Kings Street Kilmarnock East Berkshire Scotland KA1 UIT
I would like the address to be displayed like so:
Address: 1 Kings Street
Kilmarnock
East Berkshire
Scotland
KA1 UIT
How can I do this in HTML or CSS? I am currently using Twitter Bootstrap, and I don't know if there is any way to do it with that.
Oh.. It's hard but let me try... ;)
<div>Address: 1 Kings Street<br> Kilmarnock <br>East Berkshire<br> Scotland <br>KA1 UIT</div>
Your Expected Output:
Address: 1 Kings Street
Kilmarnock
East Berkshire
Scotland
KA1 UIT
Note: this is just the simplest, most basic way. If the data comes from a database dynamically, you should at least explain in detail where it comes from and through which object it can be accessed. Then you can apply JS or regex functions.
If you edited the data in your database and put a comma between each line, you would be able to split the data. (Your question doesn't state what language you are writing in, or how the data is being received.)
Once you have the string, you can use the String.split method (available in most languages) and adapt your view accordingly.
So,
Address: 1 Kings Street, Kilmarnock, East Berkshire, Scotland, KA1 UIT
would become
{1 Kings Street}{ Kilmarnock}{ East Berkshire}{ Scotland}{ KA1 UIT}
which you could then loop over in a foreach block, displaying each part on its own line as you requested.
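The split-and-loop idea can be sketched as follows. This is a minimal Python illustration, assuming the address is stored comma-separated as in the example above; the function name address_to_html is made up for this sketch.

```python
from html import escape

def address_to_html(address, sep=","):
    """Split a delimited address and join the parts with <br> tags."""
    # Strip the whitespace left around each part and drop empty parts.
    parts = [part.strip() for part in address.split(sep) if part.strip()]
    # Escape each part so stray < or & in the data cannot break the markup.
    return "<br>\n".join(escape(part) for part in parts)

markup = address_to_html(
    "1 Kings Street, Kilmarnock, East Berkshire, Scotland, KA1 UIT"
)
print(markup)
```

The same approach carries over to any server-side language that has a split function; only the templating step differs.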
A possible alternative (although probably not an easy one) would be to extract the postcode and call the Google Maps API to look up the address, but again, this would be quite difficult.
At present, I cannot see how you would be able to expect a program to decide where to split the address into separate lines (unless you place them into different fields which was described in the comments).
You can wrap each line in <p>, or put <br> at the end of each line.
But if you're retrieving the data from a database through a PHP variable, you cannot do that with pure HTML/CSS; you'll need string manipulation in PHP.
I have the following awful HTML:
<p>
102036 - <em>In re</em> State v. Williams video<br>
104236 - University of Kansas Hosp. Auth. v. Board of Wabaunsee County Comm'rs video
</p>
I want to use XPath to capture all of the text following each </a>, so:
Item 1: " - In re State v. Williams"
Item 2: " - University of Kansas Hosp. Auth. v. Board of Wabaunsee County"
Alternatively, I could just capture all text, and that would be fine too:
Item 1: "102036 - In re State v. Williams"
Item 2: "104236 - University of Kansas Hosp. Auth. v. Board of Wabaunsee County"
I've been trying various things for a while now, but making no progress. I want something like:
/a/following::text()[before::br]
Help?
Here you go, pal:
//a//following-sibling::text() | //a//following-sibling::*[not(self::a)]/text()
The best thing I've found so far is to simply nuke the errant <em> nodes.
So:
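from lxml import etree
elem = html.xpath('//p')[0]  # html is the parsed lxml document
etree.strip_tags(elem, 'em')  # removes the <em> tags but keeps their text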
Then, using the cleaner html, a simple XPath can be used:
texts = [e.tail for e in elem.xpath('.//a')]  # './/a' keeps the search inside elem
Credit where due: https://stackoverflow.com/a/8788559/64911
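For comparison, here is a standard-library variant of the same idea. It swaps lxml for xml.etree.ElementTree and, instead of stripping <em>, walks the children of <p> and groups their text by <br>. The sample markup is a hypothetical, well-formed stand-in: the href values and the exact placement of the <a> tags are assumptions, and the function name is made up for this sketch.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the markup in the question (hrefs are invented).
SAMPLE = (
    '<p>'
    '<a href="#1">102036</a> - <em>In re</em> State v. Williams '
    '<a href="#v1">video</a><br/>'
    '<a href="#2">104236</a> - University of Kansas Hosp. Auth. v. '
    "Board of Wabaunsee County Comm'rs "
    '<a href="#v2">video</a>'
    '</p>'
)

def text_lines(p, skip=("a",)):
    """Return one string per <br>-separated line, ignoring text inside skipped tags."""
    lines, current = [], [p.text or ""]
    for child in p:
        if child.tag == "br":
            # A <br> closes the current line.
            lines.append("".join(current).strip())
            current = []
        elif child.tag not in skip:
            # Keep the text of inline tags such as <em>.
            current.append("".join(child.itertext()))
        # The tail is the text that follows the child's closing tag.
        current.append(child.tail or "")
    lines.append("".join(current).strip())
    return [line for line in lines if line]

items = text_lines(ET.fromstring(SAMPLE))
print(items)
```

Passing skip=() keeps the link text as well, which corresponds to the "capture all text" alternative in the question. Real-world HTML with an unclosed <br> would need an HTML parser such as lxml rather than ElementTree.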
If you are running Firefox with Firebug installed, you can follow this tutorial for this and all future XPath needs:
http://www.wikihow.com/Find-XPath-Using-Firebug
It is a very easy way to find the XPath for anything on a page.