Creating multiple text nodes in HTML

If I have the HTML
<html>
<body>
<div>"Foo" "Bar"</div>
</body>
</html>
is there any way to have the browser emit "Foo" and "Bar" as two text nodes so that it renders the same as <div>FooBar</div>? I know that, with JavaScript, you can add two text nodes as children of a div to achieve this result, but I'm wondering if it's possible in vanilla HTML. The reason I ask is that I'm working on an SSR engine that does partial hydration, and the JS engine is looking for several text nodes in an HTML element but I can only manage to create one. Thanks in advance for any tips!
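One approach that may fit here (a sketch, not something confirmed for this particular SSR engine) is to put an empty comment between the pieces of text. An HTML parser creates a separate text node on each side of a comment node, and comments are not rendered, so the div displays as FooBar while holding two text nodes with a comment node between them. The helper names renderTextNodes and escapeHtml are hypothetical:

```javascript
// Escape the characters that would otherwise be parsed as markup.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Join text segments with an empty comment so the parser keeps them as
// separate text nodes instead of merging them into one.
function renderTextNodes(segments) {
  return segments.map(escapeHtml).join("<!-- -->");
}

const html = `<div>${renderTextNodes(["Foo", "Bar"])}</div>`;
console.log(html); // <div>Foo<!-- -->Bar</div>
```

The hydration code then has to skip the comment nodes when it walks the children, and an empty segment would still produce no text node at all.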

Related

HTML XPath: Extracting text mixed in with multiple level and complex tags?

Related earlier questions:
HTML XPath: Extracting text mixed in with multiple tags?
HTML XPath: Selectively avoiding tags when extracting text
(Sorry for my poor English.)
I'm a beginner at writing web crawlers. I'm trying to extract the main content from web pages (in Chinese) with XPath (though I have learned that there are algorithms, both traditional and machine-learning based, for extracting the main content of a web page), and I'm very new to writing XPath rules.
I'm faced with a web page that contains text mixed into complex tags. I summarize it as follows, where a character (e.g. A, A2) means text only and '...' means more tags, possibly nested, without text. I want to get "AA2BB2CDEFGHIJKLMNOP".
...
<div id="artibody" class="art_context">
<div align="center">...</div>
<div align="center"><font>A</font>A2</div>
<div align="left"><br><br><strong>B</strong>B2</div>
<div align="left">
<p>C<a>D</a>E</p>
<p>F<a>G</a>H<a>I</a>J</p>K
</div>
<div align="center">...</div>
<div align="center"><font>L</font></div>
<p>M</p><!--M contains only text luckily-->
<p>N</p>
<p>O</p>
<p>P<span>...</span><div class="shareBox">...</div>
</p>
<span id="arctTailMark"></span>
<script>
var page_navigation = document.getElementById('page_navigation');
...
</script>
<div style="padding:10px 0 30px 0">...</div>
</div>
Thanks to the previous questions, I wrote this rule:
'string(//div[@class=\"art_context\"])'
It gives me all the content I want as plain text without tags, but the JS code in <script> is extracted as well. I tried the following, but it does not help; there is still JS code in the result.
'string(//div[@class=\"art_context\" and not(self::script)])'
The following one returns only "\r\n".
'//div[@class=\"art_context\" and not(self::script)]/text()'
Here are my questions:
1. How do I write an XPath rule that meets my need: extracting the content of div[@id="artibody"] except the code in <script>?
2. Is the rule for question 1 simple and powerful? I may meet more pages with a div[@id="artibody"] whose descendant nodes are quite different.
3. Any further suggestions for my task? I'm extracting content from one website, but the main content lies in <div> elements with different ids, classes, and descendant node structures. I run the spider on my laptop (Intel Core i5 3225, 8 GB RAM), and using machine learning algorithms may slow the crawl significantly. At the same time, writing many XPath rules is tedious.
I'd appreciate any suggestions on this question (and on my English).
To get all descendant text nodes except the script contents, you can use this:
//div[@class="art_context"]//*[not(self::script)]/text()
In natural language: “Get all text nodes that are children of any non-script element descended from a div[@class="art_context"] element”.
The // after div[@class="art_context"] is needed to select descendants, not just children.
In comparison, the //div[@class="art_context" and not(self::script)]/text() expression in the question says “Get all text-node children of all div[@class="art_context"] elements that are not also script elements.”
So the and not(self::script) part in the expression in the question is redundant, because all the expression is doing is selecting just //div[@class="art_context"] anyway, and then the /text() part is selecting only the text-node direct children of that div, which are just line breaks.
Also, if instead of using XPath to just get the set of text nodes, you want the result as a single string, you can use the functions string-join(…) and normalize-space(…) (string-join requires XPath 2.0 or later):
normalize-space(string-join(//div[@class="art_context"]//*[not(self::script)]/text(), ""))
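To make the //*[not(self::script)]/text() step concrete, here is a small plain-JavaScript sketch that mimics the selection by hand. It is not XPath itself: the tree shape ({name, children}, with strings standing in for text nodes), the sample data, and the function name textOutsideScripts are all assumptions made for illustration.

```javascript
// Hypothetical miniature DOM: elements are {name, children}, text nodes are strings.
const artibody = {
  name: "div",
  children: [
    "\n",
    { name: "div", children: [{ name: "font", children: ["A"] }, "A2"] },
    { name: "script", children: ["var page_navigation = 1;"] },
    { name: "p", children: ["M"] },
  ],
};

// Walk the tree in document order and keep each text node whose parent is a
// descendant element of root (not root itself) and is not a <script>.
function textOutsideScripts(root) {
  const texts = [];
  const walk = (element, isRoot) => {
    for (const child of element.children) {
      if (typeof child === "string") {
        if (!isRoot && element.name !== "script") texts.push(child);
      } else {
        walk(child, false);
      }
    }
  };
  walk(root, true);
  return texts;
}

console.log(textOutsideScripts(artibody).join("")); // AA2M
```

The root div's own "\n" text child is skipped because only descendant elements are matched, which is also why the script text never reaches the output.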

How to properly use same text sections across multiple html pages?

I am writing help documentation for an existing piece of software (the kind that opens when you press F1 or navigate to the Help section in the menu bar). I am using simple HTML/CSS/JS pages to do so.
Many of the text descriptions of the software's properties appear on more than one page. The idea is to keep a single text source file containing all the descriptions and then reference a specific text section wherever it is needed.
It is similar to using a CSS stylesheet to apply styles across all of the pages, only this handles text instead of styles. This way I would be able to change text in only one file and the change would apply everywhere it is used.
I ran across the HTML SSI method, but it includes an entire HTML page rather than just a specific text section, which is what I want. I would strongly prefer to avoid using a different file for each text section.
Can anyone please point me in the right direction here?
I think you can write JavaScript functions that contain the common texts and call them wherever you need the text. The JavaScript should live in an external file that you reference from every HTML page that needs it.
For example, you can have one function that returns "Hello World" and assigns it to a "p" element with id="title". Then, on every page that has an element with the id title, you can call your JavaScript function to set its text to "Hello World". Use this link to find out more about this topic:
http://www.w3schools.com/js/js_htmldom_html.asp
UPDATE: I did a little test. I created the following JavaScript:
function helloTitle(){
var text = "Hello World!";
document.getElementById("title").innerHTML = text;
}
And referenced it in some HTML pages like this:
<script src="commonText.js" type="text/javascript"></script>
After that I only need to call the function inside the element I want it to modify:
<p id="title"><script>helloTitle();</script></p>
This is a solution if you are only using JS, CSS and HTML. There should be other ways to achieve this.
Hope this information could help you!
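For many text sections, one function per snippet gets unwieldy. A possible variation, sketched here under assumptions, keeps all strings in one object and fills elements by key. The data-text attribute name, the keys, the sample strings, and the applyCommonTexts name are all hypothetical choices, not part of the answer above.

```javascript
// commonText.js: one shared source for text that appears on several pages.
// The keys and strings below are placeholders for illustration.
const commonTexts = {
  title: "Hello World!",
  "save-button": "Saves the current document.",
};

// Fill every element carrying a data-text attribute with the matching string.
// `doc` is passed in so the function is not tied to the global document.
function applyCommonTexts(doc) {
  for (const el of doc.querySelectorAll("[data-text]")) {
    const key = el.getAttribute("data-text");
    if (Object.prototype.hasOwnProperty.call(commonTexts, key)) {
      el.textContent = commonTexts[key];
    }
  }
}

// In a browser, run once the page has been parsed.
if (typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", () => applyCommonTexts(document));
}
```

A page would then contain markup such as <p data-text="title"></p> and load the file with the same script tag as before. Elements whose key is not in the object are left untouched.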
I figured out how to do it a little more comfortably on a large scale using the HTML iframe element: https://www.w3schools.com/tags/tag_iframe.asp
In your main HTML file you do:
<p> <iframe src="Text.html" width="100%" height="300" style="border:1px solid black;"></iframe> </p>
and then, in Text.html, insert whatever text you want with some basic HTML formatting:
<html>
<body>
hmm idk what i should put here. Test
</body>
</html>
Some CSS formatting will also be needed before it looks perfect, but if you want to make multi-line blocks I think this is the easiest way.

Can I plug text into HTML from elsewhere?

I'm new to web development and I'm working on my second website. I feel this should be a basic question that has probably already been addressed somewhere on Stack Overflow. However, I can't find anything directly relevant because I lack a precise description. The problem is:
Because I'm doing copywriting along the way, I frequently find myself needing to update copy that sits inside HTML wrapped deep inside many divs. It's quite inconvenient, and the text can make the code messy.
I wonder if there's a simple way to leave a "handle" in place of the text inside the HTML code and "plug in" the text from elsewhere, like plugging in style from CSS? I suppose it would work on a concept similar to what a CMS does.
With jQuery, you can use .html() to plug text and symbols into an HTML page:
<html>
<head>
<title>Your page</title>
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js">
</script>
<script type="text/javascript">
$(function(){
$("#statictext").html('<b>jQuery</b>');
$("#symbol").html('©');
});
</script>
</head>
<body>
<div id="statictext"></div>
<div id="symbol"></div>
</body>
</html>
I think what you're looking for is the id HTML attribute. You can use it like this from JavaScript (I'm using JS since you don't specify a language):
var yourelement = document.getElementById('yourelementid');
yourelement.textContent = "Yer text";
with your html being:
<div id="yourelementid"></div>
with the element being a div or any other element that can have text content.
If you need to insert HTML, you can do it through .innerHTML or, preferably, manipulate the DOM by adding and removing elements. CSS also has an attr() function, which lets generated content display the value of an arbitrary attribute on an HTML element (such as piece="textstuff", with the CSS being content: attr(piece) on a ::before or ::after pseudo-element).
You can also construct elements and append them (again, if what you want is to insert HTML markup) by using .appendChild and .removeChild.
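As a minimal sketch of the construct-and-append route: the helper name appendParagraph and its doc parameter are assumptions for illustration, while createElement, textContent, and appendChild are the standard DOM calls mentioned above.

```javascript
// Build a <p> holding plain text and append it to `parent`.
// textContent never parses markup, so "<b>" in `text` shows up literally.
function appendParagraph(doc, parent, text) {
  const p = doc.createElement("p");
  p.textContent = text;
  parent.appendChild(p);
  return p;
}

// In a browser, add a paragraph to the element from the example above.
if (typeof document !== "undefined") {
  const target = document.getElementById("yourelementid");
  if (target) appendParagraph(document, target, "Yer text");
}
```

Passing the document in as a parameter keeps the helper usable from code that works on another document, such as an iframe's.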

How to inject html codes, chrome extensions

I want to insert HTML code using a Chrome extension.
Adding complex HTML using a Chrome content script
This link works, but I need to insert into a more specific place. For example, the linked approach adds the HTML at the top of the page, but I need to insert it at a specific position, like this:
<html>
<body>
...
...
<div>
//codes
</div>
// i want my code goes here
// <div>
// </div>
...
...
</body>
</html>
If I still haven't explained myself clearly: there is a Chrome extension named "Looper for YouTube" that does what I need. Thanks for any help, and sorry for my bad English.
You have to write code in the content script to specify where your HTML will be injected. For example, you could use the insertBefore function to insert a new node before an existing element.
There are many functions surrounding the DOM tree which can help you specify the exact point in the document to insert new objects. Do not think about the HTML as a text file, think about it as a document tree (with parent nodes, child nodes, and ids). Here is a list of functions to get you started:
http://www.w3schools.com/jsref/dom_obj_all.asp
For example, something like:
document.getElementById('test').innerHTML = "HI!";
Will insert "HI!" inside the div tags in the following HTML:
<html>
<body>
<div id='test'>
</div>
</body>
</html>
If you require further assistance, please be more specific with your request. Where (exactly) do you need the HTML to be injected?
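For the layout in the question, where the new markup should land right after an existing div, a content script could combine insertBefore with nextSibling. This is only a sketch: the insertAfter helper name, the "#existing-div" selector, and the "my-extension-panel" id are hypothetical and would need to match the real page.

```javascript
// Insert newNode directly after reference, as a sibling.
// insertBefore treats a null second argument as "append at the end",
// so this also works when reference is the last child.
function insertAfter(reference, newNode) {
  reference.parentNode.insertBefore(newNode, reference.nextSibling);
  return newNode;
}

// In the content script: find the anchor element and add a new div after it.
if (typeof document !== "undefined") {
  const anchor = document.querySelector("#existing-div"); // hypothetical selector
  if (anchor) {
    const panel = document.createElement("div");
    panel.id = "my-extension-panel"; // hypothetical id
    panel.textContent = "Injected by the extension";
    insertAfter(anchor, panel);
  }
}
```

The if (anchor) check matters on pages that build their DOM late, because the target element may not exist yet when the content script runs.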

Using the same HTML id attribute over multiple PHP files

I know that the id selector is used to specify a style for a single element. My question is: if I have a project with multiple PHP files, can these files contain elements with the same id?
Here is an example:
php file 1:
...
<body>
<h1 id="test">header1</h1>
</body>
...
php file 2:
...
<body>
<h3 id="test">header3</h3>
</body>
...
css file:
#test
{
color:red;
}
Is this usage correct?
If they are all rendered in the same HTML page in the browser, it's incorrect as ID should be unique on a single page. If only one is ever rendered then it'll be a-ok.
If you want your Web pages to validate as XHTML or HTML, then you should have unique IDs on your pages.
Yes, that is correct. In fact, that is a good idea. If you do that, you can use the same stylesheet on both pages. As long as you don't combine the files, it's a great idea.
What you are doing is fine, but it looks like class is better for what you are trying to do. You typically use ID to specify a specific element on a specific page and class to apply styling to different elements, on the same or different pages.
Using the same ID on multiple pages WILL work, but imo class is the more proper thing to use.
The id should be unique for each element per (HTML) document.
So, unless you combine the output of your PHP files into a single HTML file there is no problem. In page1 your one h1 heading will be red, in page2 your one h3 heading will be red, etc.
Personally, I prefer CSS classes for appearance and DOM IDs for functions, but they can be mixed.