Consider this string value:
LCD Soundsystem was the musical project of producer James Murphy, co-founder of dance-punk label DFA Records. Formed in 2001 in New York City, New York, United States, the music of LCD Soundsystem can also be described as a mix of alternative dance and post punk, along with elements of disco and other styles. <br />
How can all HTML tags be removed in Swift?
So the result has to be:
LCD Soundsystem was the musical project of producer James Murphy, co-founder of dance-punk label DFA Records. Formed in 2001 in New York City, New York, United States, the music of LCD Soundsystem can also be described as a mix of alternative dance and post punk, along with elements of disco and other styles.
You can use a regular expression; here is the one I wrote:
var str = "LCD Soundsystem was the musical project of producer <a href='http://www.last.fm/music/James+Murphy' class='bbcode_artist'>James Murphy</a>, co-founder of <a href='http://www.last.fm/tag/dance-punk' class='bbcode_tag' rel='tag'>dance-punk</a> label <a href='http://www.last.fm/label/DFA' class='bbcode_label'>DFA</a> Records. Formed in 2001 in New York City, New York, United States, the music of LCD Soundsystem can also be described as a mix of <a href='http://www.last.fm/tag/alternative%20dance' class='bbcode_tag' rel='tag'>alternative dance</a> and <a href='http://www.last.fm/tag/post%20punk' class='bbcode_tag' rel='tag'>post punk</a>, along with elements of <a href='http://www.last.fm/tag/disco' class='bbcode_tag' rel='tag'>disco</a> and other styles. <br />"
let regex:NSRegularExpression = NSRegularExpression(
pattern: "<.*?>",
options: NSRegularExpressionOptions.CaseInsensitive,
error: nil)!
let range = NSMakeRange(0, countElements(str))
let htmlLessString :String = regex.stringByReplacingMatchesInString(str,
options: NSMatchingOptions.allZeros,
range:range ,
withTemplate: "")
println(htmlLessString)
It converts:
"LCD Soundsystem was the musical project of producer <a href='http://www.last.fm/music/James+Murphy' class='bbcode_artist'>James Murphy</a>, co-founder of <a href='http://www.last.fm/tag/dance-punk' class='bbcode_tag' rel='tag'>dance-punk</a> label <a href='http://www.last.fm/label/DFA' class='bbcode_label'>DFA</a> Records. Formed in 2001 in New York City, New York, United States, the music of LCD Soundsystem can also be described as a mix of <a href='http://www.last.fm/tag/alternative%20dance' class='bbcode_tag' rel='tag'>alternative dance</a> and <a href='http://www.last.fm/tag/post%20punk' class='bbcode_tag' rel='tag'>post punk</a>, along with elements of <a href='http://www.last.fm/tag/disco' class='bbcode_tag' rel='tag'>disco</a> and other styles. <br />"
to
"LCD Soundsystem was the musical project of producer James Murphy, co-founder of dance-punk label DFA Records. Formed in 2001 in New York City, New York, United States, the music of LCD Soundsystem can also be described as a mix of alternative dance and post punk, along with elements of disco and other styles."
The only catch is that I converted all double quotes (") to single quotes before applying the regex; otherwise I would have had to escape each of them with a backslash (\").
Update:
I also tried escaping all the double quotes with a backslash (\") instead, and the result was the same. The new string I used was:
"LCD Soundsystem was the musical project of producer James Murphy, co-founder of dance-punk label DFA Records. Formed in 2001 in New York City, New York, United States, the music of LCD Soundsystem can also be described as a mix of alternative dance and post punk, along with elements of disco and other styles. <br />"
and result:
"LCD Soundsystem was the musical project of producer James Murphy, co-founder of dance-punk label DFA Records. Formed in 2001 in New York City, New York, United States, the music of LCD Soundsystem can also be described as a mix of alternative dance and post punk, along with elements of disco and other styles."
Here is CjCoax's code rewritten for Swift 2.0:
var str = "LCD Soundsystem was the musical project of producer <a href='http://www.last.fm/music/James+Murphy' class='bbcode_artist'>James Murphy</a>, co-founder of <a href='http://www.last.fm/tag/dance-punk' class='bbcode_tag' rel='tag'>dance-punk</a> label <a href='http://www.last.fm/label/DFA' class='bbcode_label'>DFA</a> Records. Formed in 2001 in New York City, New York, United States, the music of LCD Soundsystem can also be described as a mix of <a href='http://www.last.fm/tag/alternative%20dance' class='bbcode_tag' rel='tag'>alternative dance</a> and <a href='http://www.last.fm/tag/post%20punk' class='bbcode_tag' rel='tag'>post punk</a>, along with elements of <a href='http://www.last.fm/tag/disco' class='bbcode_tag' rel='tag'>disco</a> and other styles. <br />"
let regex = try! NSRegularExpression(pattern: "<.*?>", options: [.CaseInsensitive])
// NSRange counts UTF-16 code units, so use utf16.count rather than characters.count
let range = NSMakeRange(0, str.utf16.count)
let htmlLessString: String = regex.stringByReplacingMatchesInString(str, options: [],
    range: range,
    withTemplate: "")
print(htmlLessString)
Here is code for Swift 3.0:
do {
    let regex = "<[^>]+>"
    let expr = try NSRegularExpression(pattern: regex, options: NSRegularExpression.Options.caseInsensitive)
    // originalString is the HTML input; NSRange is measured in UTF-16 code units
    let replacement = expr.stringByReplacingMatches(in: originalString, options: [], range: NSMakeRange(0, originalString.utf16.count), withTemplate: "")
    // replacement is the result
} catch {
    // regex was bad!
}
Try SwiftSoup; it's easy:
do{
let html = "LCD Soundsystem was the musical project of producer James Murphy, co-founder of dance-punk label DFA Records. Formed in 2001 in New York City, New York, United States, the music of LCD Soundsystem can also be described as a mix of alternative dance and post punk, along with elements of disco and other styles. <br />"
let doc: Document = try SwiftSoup.parse(html)
return try doc.text()
} catch Exception.Error(let type, let message) {
    // SwiftSoup reports parse problems through Exception.Error
    print(message)
} catch {
    print("error")
}
I'm using a pre-trained BERT model for question answering. It returns the correct result, but with a lot of spaces inside the text.
The code is below:
import torch
from transformers import BertForQuestionAnswering, BertTokenizer

def get_answer_using_bert(question, reference_text):
    bert_model = BertForQuestionAnswering.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')
    bert_tokenizer = BertTokenizer.from_pretrained('bert-large-uncased-whole-word-masking-finetuned-squad')
    # Encode question and passage as one sequence: [CLS] question [SEP] passage [SEP]
    input_ids = bert_tokenizer.encode(question, reference_text)
    input_tokens = bert_tokenizer.convert_ids_to_tokens(input_ids)
    # Segment ids: 0 for the question (up to the first [SEP]), 1 for the passage
    sep_location = input_ids.index(bert_tokenizer.sep_token_id)
    first_seg_len, second_seg_len = sep_location + 1, len(input_ids) - (sep_location + 1)
    seg_embedding = [0] * first_seg_len + [1] * second_seg_len
    model_scores = bert_model(torch.tensor([input_ids]),
                              token_type_ids=torch.tensor([seg_embedding]))
    # Most likely start and end token positions of the answer span
    ans_start_loc, ans_end_loc = torch.argmax(model_scores[0]), torch.argmax(model_scores[1])
    result = ' '.join(input_tokens[ans_start_loc:ans_end_loc + 1])
    result = result.replace('#', '')
    return result
It is called like this:
reference_text = 'Mukesh Dhirubhai Ambani was born on 19 April 1957 in the British Crown colony of Aden (present-day Yemen) to Dhirubhai Ambani and Kokilaben Ambani. He has a younger brother Anil Ambani and two sisters, Nina Bhadrashyam Kothari and Dipti Dattaraj Salgaonkar. Ambani lived only briefly in Yemen, because his father decided to move back to India in 1958 to start a trading business that focused on spices and textiles. The latter was originally named Vimal but later changed to Only Vimal His family lived in a modest two-bedroom apartment in Bhuleshwar, Mumbai until the 1970s. The family financial status slightly improved when they moved to India but Ambani still lived in a communal society, used public transportation, and never received an allowance. Dhirubhai later purchased a 14-floor apartment block called Sea Wind in Colaba, where, until recently, Ambani and his brother lived with their families on different floors.'
question = 'What is the name of mukesh ambani brother?'
get_answer_using_bert(question, reference_text)
And the output is:
'an il am ban i'
Can anyone help me fix this? It would be really helpful.
You can just use the tokenizer decode function:
bert_tokenizer.decode(input_ids[ans_start_loc:ans_end_loc +1])
Output:
'anil ambani'
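In case you do not want to use decode, you can replace the result.replace('#', '') line with one that also removes the space in front of each word-piece marker:
result = result.replace(' ##', '')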
I am using axios to retrieve data from the Wikipedia API. This is the code I have written:
let axiosData = function(){
let searchString = $('#searchString').val();
console.log(searchString);
let Url = "https://en.wikipedia.org/w/api.php?action=opensearch&search="+ searchString +
"&origin=*&callback=";
axios.get(Url)
.then(function(res){
var linkLists = res.data;
console.log(linkLists);
})
.catch(function(){
console.log("Error")
});
return false;
}
$('form').submit(axiosData);
I do get output when I log it to the console. In this case I searched for the name Jon Snow, and the output is shown below. How can I access the JSON?
/**/(["jon snow",["Jon Snow (character)","Jon Snow (journalist)","Jon Snow","John Snow (cricketer)","Jon Snoddy","John Snow","Jon Snodgrass (musician)","Jon Snodgrass","John Snow College, Durham","John Snow, Inc"],["Jon Snow is a fictional character in the A Song of Ice and Fire series of fantasy novels by American author George R. R.","Jonathan George Snow HonFRIBA (born 28 September 1947) is an English journalist and television presenter.","Jon Snow may refer to:","John Augustine Snow (born 13 October 1941) is a retired English cricketer. He played for Sussex and England in the 1960s and 1970s.","Jon Snoddy is an American technology expert who is currently the Advanced Development Studio Executive SVP at Walt Disney Imagineering.","John Snow (15 March 1813 \u2013 16 June 1858) was an English physician and a leader in the development of anaesthesia and medical hygiene.","Jon Snodgrass is \"the guy with the glasses from Drag the River\".","Jon Snodgrass is a Panamanian author, born on July 27, 1941 in Col\u00f3n, Panama to John Alphonso and Olivia Jane (Chestnut) Snodgrass.","John Snow College is one of 16 constituent colleges of the University of Durham in England. The College takes its name from the nineteenth-century Yorkshire physician Dr John Snow.","John Snow, Inc. (JSI) is a public health research and consulting firm in the United States and around the world."],["https://en.wikipedia.org/wiki/Jon_Snow_(character)","https://en.wikipedia.org/wiki/Jon_Snow_(journalist)","https://en.wikipedia.org/wiki/Jon_Snow","https://en.wikipedia.org/wiki/John_Snow_(cricketer)","https://en.wikipedia.org/wiki/Jon_Snoddy","https://en.wikipedia.org/wiki/John_Snow","https://en.wikipedia.org/wiki/Jon_Snodgrass_(musician)","https://en.wikipedia.org/wiki/Jon_Snodgrass","https://en.wikipedia.org/wiki/John_Snow_College,_Durham","https://en.wikipedia.org/wiki/John_Snow,_Inc"]])
The response starts with '/**/(' and ends with ')', which is JSONP-style padding around the JSON. Cut those characters off with substring, then parse the rest:
let Url = "https://en.wikipedia.org/w/api.php?action=opensearch&format=json&search=jon%20snow&origin=*&callback=";
axios.get(Url)
.then(function(res){
console.log(res);
var linkLists = JSON.parse(res.data.substring(5, res.data.length-1));
console.log(linkLists)
})
.catch(function(){
console.log("Error...")
});
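The fixed offsets in substring(5, ...) assume the padding is always exactly '/**/('. A slightly more defensive sketch of the same idea follows. stripJsonpPadding is a hypothetical helper name, not part of axios or the Wikipedia API, and it assumes the response body arrives as a string:

```javascript
// Hypothetical helper: remove a JSONP-style wrapper such as /**/( ... ) and parse the payload.
function stripJsonpPadding(body) {
  // Optional leading comment, optional callback name, then "(", the payload, ")" and an optional ";".
  const match = /^\s*(?:\/\*[\s\S]*?\*\/)?\s*[\w$.]*\s*\(([\s\S]*)\)\s*;?\s*$/.exec(body);
  // Fall back to the raw body when no wrapper is present.
  return JSON.parse(match ? match[1] : body);
}

// Shortened sample in the same shape as the response shown in the question.
const sample = '/**/(["jon snow",["Jon Snow (character)"],["..."],["https://en.wikipedia.org/wiki/Jon_Snow_(character)"]])';
const [term, titles, descriptions, links] = stripJsonpPadding(sample);
console.log(term, titles[0], links[0]);
```

Inside the .then handler this would be used as var linkLists = stripJsonpPadding(res.data);. It may also be worth trying the request without the callback= parameter, since the padding looks like the result of asking for a callback.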
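If res.data is a plain JSON string (with no wrapper around it), you can parse it with the JSON.parse() function:
var linkJSON = JSON.parse(res.data);
That gives you the parsed data in linkJSON. Note that axios usually parses JSON responses for you, in which case res.data is already an object and can be used directly.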
I am having trouble converting JSON data to readable text. Right now it comes out like this:
{"id":82,"url":"http://www.tvmaze.com/shows/82/game-of-thrones","name":"Game of Thrones","type":"Scripted","language":"English","genres":["Drama","Adventure","Fantasy"],"status":"Running","runtime":60,"premiered":"2011-04-17","schedule":{"time":"21:00","days":["Sunday"]},"rating":{"average":9.3},"weight":10,"network":{"id":8,"name":"HBO","country":{"name":"United States","code":"US","timezone":"America/New_York"}},"webChannel":null,"externals":{"tvrage":24493,"thetvdb":121361,"imdb":"tt0944947"},"image":{"medium":"http://static.tvmaze.com/uploads/images/medium_portrait/53/132622.jpg","original":"http://static.tvmaze.com/uploads/images/original_untouched/53/132622.jpg"},"summary":"<p>Based on the bestselling book series <em>A Song of Ice and Fire</em> by George R.R. Martin, this sprawling new HBO drama is set in a world where summers span decades and winters can last a lifetime. From the scheming south and the savage eastern lands, to the frozen north and ancient Wall that protects the realm from the mysterious darkness beyond, the powerful families of the Seven Kingdoms are locked in a battle for the Iron Throne. This is a story of duplicity and treachery, nobility and honor, conquest and triumph. In the <em>\"Game of Thrones\"</em>, you either win or you die.</p>","updated":1485102249,"_links":{"self":{"href":"http://api.tvmaze.com/shows/82"},"previousepisode":{"href":"http://api.tvmaze.com/episodes/729575"},"nextepisode":{"href":"http://api.tvmaze.com/episodes/937256"}}}
How would I go about converting this data so I can view it as ID, Name, Genre etc.?
If you just want to print it to the console, try something like this:
var json = '{"id":82,"url":"http://www.tvmaze.com/shows/82/game-of-thrones","name":"Game of Thrones","type":"Scripted","language":"English","genres":["Drama","Adventure","Fantasy"],"status":"Running","runtime":60,"premiered":"2011-04-17","schedule":{"time":"21:00","days":["Sunday"]},"rating":{"average":9.3},"weight":10,"network":{"id":8,"name":"HBO","country":{"name":"United States","code":"US","timezone":"America/New_York"}},"webChannel":null,"externals":{"tvrage":24493,"thetvdb":121361,"imdb":"tt0944947"},"image":{"medium":"http://static.tvmaze.com/uploads/images/medium_portrait/53/132622.jpg","original":"http://static.tvmaze.com/uploads/images/original_untouched/53/132622.jpg"},"summary":"<p>Based on the bestselling book series <em>A Song of Ice and Fire</em> by George R.R. Martin, this sprawling new HBO drama is set in a world where summers span decades and winters can last a lifetime. From the scheming south and the savage eastern lands, to the frozen north and ancient Wall that protects the realm from the mysterious darkness beyond, the powerful families of the Seven Kingdoms are locked in a battle for the Iron Throne. This is a story of duplicity and treachery, nobility and honor, conquest and triumph. In the <em>Game of Thrones</em>, you either win or you die.</p>","updated":1485102249,"_links":{"self":{"href":"http://api.tvmaze.com/shows/82"},"previousepisode":{"href":"http://api.tvmaze.com/episodes/729575"},"nextepisode":{"href":"http://api.tvmaze.com/episodes/937256"}}}'
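// JSON.parse is the built-in equivalent; jQuery.parseJSON is deprecated since jQuery 3.0
var obj = JSON.parse(json);
function printLine(obj) {
    for (var i in obj) {
        // typeof null is 'object', so check for null first to print it instead of skipping it
        if (obj[i] === null || typeof obj[i] != 'object') console.log(i + ": " + obj[i]);
        else printLine(obj[i]);
    }
}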
printLine(obj);
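One limitation of printing only the innermost key is that nested keys lose their context: name appears both for the show and for the network. A possible variant, sketched here with the hypothetical helper flattenToLines, returns lines prefixed with the full path so they can be printed or rendered elsewhere:

```javascript
// Sketch: collect "path: value" lines so nested keys stay distinguishable.
function flattenToLines(value, prefix = '', lines = []) {
  for (const key of Object.keys(value)) {
    const path = prefix ? prefix + '.' + key : key;
    const item = value[key];
    if (item !== null && typeof item === 'object') {
      flattenToLines(item, path, lines); // recurse into nested objects and arrays
    } else {
      lines.push(path + ': ' + item);    // primitives and null are printed as-is
    }
  }
  return lines;
}

// Shortened stand-in for the show JSON from the question.
const show = JSON.parse('{"id":82,"name":"Game of Thrones","genres":["Drama","Adventure"],"network":{"name":"HBO"},"webChannel":null}');
console.log(flattenToLines(show).join('\n'));
```

Array elements show up with their index in the path (for example genres.0), which is a design choice of this sketch rather than anything the original answer prescribes.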
I am new to the Ionic framework and I am stuck on the following issue:
data.json:
{ "speakers" : [
{
"name":"Mr Bellingham",
"shortname":"Barot_Bellingham",
"reknown":"Royal Academy of Painting and Sculpture",
"bio":"Barot has just finished his final year at The Royal Academy of Painting and Sculpture, where he excelled in glass etching paintings and portraiture. Hailed as one of the most diverse artists of his generation, Barot is equally as skilled with watercolors as he is with oils, and is just as well-balanced in different subject areas. Barot's collection entitled \"The Un-Collection\" will adorn the walls of Gilbert Hall, depicting his range of skills and sensibilities - all of them, uniquely Barot, yet undeniably different"
},
{
"name":"Jonathan G. Ferrar II",
"shortname":"Jonathan_Ferrar",
"reknown":"Artist to Watch in 2012",
"bio":"The Artist to Watch in 2012 by the London Review, Johnathan has already sold one of the highest priced-commissions paid to an art student, ever on record. The piece, entitled Gratitude Resort, a work in oil and mixed media, was sold for $750,000 and Jonathan donated all the proceeds to Art for Peace, an organization that provides college art scholarships for creative children in developing nations"
},
{
"name":"Hillary Hewitt Goldwynn-Post",
"shortname":"Hillary_Goldwynn",
"reknown":"New York University",
"bio":"Hillary is a sophomore art sculpture student at New York University, and has already won all the major international prizes for new sculptors, including the Divinity Circle, the International Sculptor's Medal, and the Academy of Paris Award. Hillary's CAC exhibit features 25 abstract watercolor paintings that contain only water images including waves, deep sea, and river."
},
{
"name":"Hassum Harrod",
"shortname":"Hassum_Harrod",
"reknown":"Art College in New Dehli",
"bio":"The Art College in New Dehli has sponsored Hassum on scholarship for his entire undergraduate career at the university, seeing great promise in his contemporary paintings of landscapes - that use equal parts muted and vibrant tones, and are almost a contradiction in art. Hassum will be speaking on \"The use and absence of color in modern art\" during Thursday's agenda."
},
{
"name":"Jennifer Jerome",
"shortname":"Jennifer_Jerome",
"reknown":"New Orleans, LA",
"bio":"A native of New Orleans, much of Jennifer's work has centered around abstract images that depict flooding and rebuilding, having grown up as a teenager in the post-flood years. Despite the sadness of devastation and lives lost, Jennifer's work also depicts the hope and togetherness of a community that has persevered. Jennifer's exhibit will be discussed during Tuesday's Water in Art theme."
},
{
"name":"LaVonne L. LaRue",
"shortname":"LaVonne_LaRue",
"reknown":"Chicago, IL",
"bio":"LaVonne's giant-sized paintings all around Chicago tell the story of love, nature, and conservation - themes that are central to her heart. LaVonne will share her love and skill of graffiti art on Monday's schedule, as she starts the painting of a 20-foot high wall in the Rousseau Room of Hotel Contempo in front of a standing-room only audience in Art in Unexpected Places."
},
{
"name":"Constance Olivia Smith",
"shortname":"Constance_Smith",
"reknown":"Fullerton-Brighton-Norwell Award",
"bio":"Constance received the Fullerton-Brighton-Norwell Award for Modern Art for her mixed-media image of a tree of life, with jewel-adorned branches depicting the arms of humanity, and precious gemstone-decorated leaves representing the spouting buds of togetherness. The daughter of a New York jeweler, Constance has been salvaging the discarded remnants of her father's jewelry-making since she was five years old, and won the New York State Fair grand prize at the age of 8 years old for a gem-adorned painting of the Manhattan Bridge."
},
{
"name":"Riley Rudolph Rewington",
"shortname":"Riley_Rewington",
"reknown":"Roux Academy of Art, Media, and Design",
"bio":"A first-year student at the Roux Academy of Art, Media, and Design, Riley is already changing the face of modern art at the university. Riley's exquisite abstract pieces have no intention of ever being understood, but instead beg the viewer to dream, create, pretend, and envision with their mind's eye. Riley will be speaking on the \"Art of Abstract\" during Thursday's schedule"
},
{
"name":"Xhou Ta",
"shortname":"Xhou_Ta",
"reknown":"China International Art University",
"bio":"A senior at the China International Art University, Xhou has become well-known for his miniature sculptures, often the size of a rice granule, that are displayed by rear projection of microscope images on canvas. Xhou will discuss the art and science behind his incredibly detailed works of art."
}
]}
app.js :
angular.module('starter', ['ionic'])
.run(function($ionicPlatform) {
$ionicPlatform.ready(function() {
// Hide the accessory bar by default (remove this to show the accessory bar above the keyboard
// for form inputs)
if(window.cordova && window.cordova.plugins.Keyboard) {
cordova.plugins.Keyboard.hideKeyboardAccessoryBar(true);
}
if(window.StatusBar) {
StatusBar.styleDefault();
}
});
})
.controller('ListController', ['$scope','$http',function($scope,$http){
$http.get('js/data.json').success(function(data){
$scope.artists = data;
});
}])
index.html :
<ion-content ng-controller="ListController" class="has-subheader">
<ion-list>
<ion-item ng-repeat='item in artists' class="item-thumbnail-left item-text-wrap">
<img src="img/{{item.shortname}}_tn.jpg">
<h2>{{item.shortname}}</h2>
<h3>{{item.reknown}}</h3>
<p>
{{item.bio}}
</p>
</ion-item>
</ion-list>
</ion-content>
As you can see, I am trying to list the data from the data.json file, but nothing is displayed. In the browser console's Network tab I can see the request to data.json, but the HTML does not show the data. What am I doing wrong? Please help, thanks in advance.
The root of data.json is an object with a single speakers array, so iterate over that array. Try this:
ng-repeat='item in artists.speakers'
instead of
ng-repeat='item in artists'
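To see why the original expression renders nothing useful, here is a plain JavaScript sketch of what each expression iterates over. The data object is a trimmed-down stand-in for the parsed data.json, and the use of Object.values only approximates how ng-repeat walks an object's property values:

```javascript
// Trimmed-down stand-in for the parsed data.json payload (only two fields kept).
const data = {
  speakers: [
    { name: 'Mr Bellingham', shortname: 'Barot_Bellingham' },
    { name: 'Xhou Ta', shortname: 'Xhou_Ta' }
  ]
};

// Roughly what `item in artists` sees when artists = data: the object's property values.
// That is a single item (the whole array), which has no shortname, reknown or bio.
const itemsFromRoot = Object.values(data);
console.log(itemsFromRoot.length, itemsFromRoot[0].shortname);

// What `item in artists.speakers` sees: one item per speaker.
const itemsFromSpeakers = data.speakers;
console.log(itemsFromSpeakers.map(function (s) { return s.shortname; }));
```

An equivalent fix on the controller side would be to assign $scope.artists = data.speakers; and keep the original ng-repeat expression.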
Given this portion of an HTML file, I am looking for a way to extract the text under "INDICATIONS & USAGE", starting from "Metronidazole ..." and running to the end of the section.
Any suggestions?
<div class="Section" data-sectionCode="34067-9">
<a name="section-4"></a>
<p></p>
<h1>
<span class="Bold">INDICATIONS & USAGE
</span>
</h1>
<p class="First">Metronidazole vaginal gel USP, 0.75% is indicated in the treatment of bacterial vaginosis (formerly referred to as <span class="Italics">Haemophilus</span> vaginitis, <span class="Italics">Gardnerella</span> vaginitis, nonspecific vaginitis, <span class="Italics">Corynebacterium</span> vaginitis, or anaerobic vaginosis).</p>
<dl>
<dt></dt>
<dd>
<p class="First">
<span class="Bold">NOTE:</span> For purposes of this indication, a clinical diagnosis of bacterial vaginosis is usually defined by the presence of a homogeneous vaginal discharge that (a) has a pH of greater than 4.5, (b) emits a “fishy” amine odor when mixed with a 10% KOH solution, and (c) contains clue cells on microscopic examination. Gram’s stain results consistent with a diagnosis of bacterial vaginosis include (a) markedly reduced or absent <span class="Italics">Lactobacillus</span> morphology, (b) predominance of <span class="Italics">Gardnerella</span> morphotype, and (c) absent or few white blood cells.</p>
</dd>
</dl>
<p>Other pathogens commonly associated with vulvovaginitis, e.g., <span class="Italics">Trichomonas vaginalis</span>, <span class="Italics">Chlamydia trachomatis</span>, <span class="Italics">N</span>. <span class="Italics">gonorrhoeae</span>, <span class="Italics">Candida albicans</span>, and <span class="Italics">Herpes simplex</span> virus should be ruled out.</p>
</div>
INDICATIONS & USAGE
Metronidazole vaginal gel USP, 0.75% is indicated in the treatment of bacterial vaginosis (formerly referred to as Haemophilus vaginitis, Gardnerella vaginitis, nonspecific vaginitis, Corynebacterium vaginitis, or anaerobic vaginosis).
NOTE: For purposes of this indication, a clinical diagnosis of bacterial vaginosis is usually defined by the presence of a homogeneous vaginal discharge that (a) has a pH of greater than 4.5, (b) emits a “fishy” amine odor when mixed with a 10% KOH solution, and (c) contains clue cells on microscopic examination. Gram’s stain results consistent with a diagnosis of bacterial vaginosis include (a) markedly reduced or absent Lactobacillus morphology, (b) predominance of Gardnerella morphotype, and (c) absent or few white blood cells.
Other pathogens commonly associated with vulvovaginitis, e.g., Trichomonas vaginalis, Chlamydia trachomatis, N. gonorrhoeae, Candida albicans, and Herpes simplex virus should be ruled out.
You can use an old-school approach:
First, convert the NSString to a character array (the source array).
Create an empty target character array.
Copy characters from the source array to the target array one at a time, using the following rule.
If you find '<' (start of html tag) character stop adding character in target Array till you find '>' (end of html tag) character.
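The steps above can be sketched as follows. The answer is written with NSString in mind; this version is in JavaScript purely for illustration, and stripTagsByScanning is a hypothetical name. It assumes every '<' opens a tag, so a literal '<' in the text or a '>' inside an attribute value would confuse it:

```javascript
// Sketch of the scan: copy characters, skipping everything between '<' and '>'.
function stripTagsByScanning(source) {
  const target = [];
  let insideTag = false;
  for (const ch of source) {
    if (ch === '<') {
      insideTag = true;            // start of a tag: stop copying
    } else if (ch === '>' && insideTag) {
      insideTag = false;           // end of the tag: resume copying
    } else if (!insideTag) {
      target.push(ch);             // ordinary text outside any tag
    }
  }
  return target.join('');
}

console.log(stripTagsByScanning('<p class="First">Metronidazole <span class="Italics">Haemophilus</span> vaginitis</p>'));
```

For real-world HTML a proper parser is usually the safer choice; this only mirrors the character-by-character idea described above.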
Or you could use another trick:
NSString *startTag = @"<";
NSString *endTag = @">";
NSString *replacementString = @"";
// Repeatedly remove the first "<...>" span until no tag delimiters are left
while ([str rangeOfString:startTag].length != 0 && [str rangeOfString:endTag].length != 0)
{
    NSRange range1 = [str rangeOfString:startTag];
    NSRange range2 = [str rangeOfString:endTag];
    // A '>' that appears before the first '<' is not a tag; stop here
    if (range1.location > range2.location)
        break;
    NSRange newRange;
    newRange.length = range2.location - range1.location + range2.length;
    newRange.location = range1.location;
    str = [str stringByReplacingCharactersInRange:newRange withString:replacementString];
}
You can then locate the text under "INDICATIONS & USAGE" with the rangeOfString: method; the text you want follows that range.