string concatenation in css - html

I want to achieve the following in CSS. How do I do it in a cross-browser way?
url('../img/icons/' + attr('type') + '_10.png')

I don't think you can. In the content property you can "concatenate" just by separating with a space, but in other places I don't think there is such a feature. Which is a shame.
You'll probably be best off specifying this style in a style attribute whenever the type attribute is used.

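CSS concatenates without any operator (no +, &, etc.): keep your strings in quotes and write the strings, attr(), var(), etc. one after another on one line, separated by spaces. Be aware that browsers generally accept this only where a sequence of strings is allowed, such as the content property; url() expects a single string, so treat the lines below as an illustration of the syntax rather than something that will load an image.
Examples:
url('not/very' '/useful/concatenation'); /* not/very/useful/concatenation */
url('../img/icons/' attr('type') '_10.png'); /* ../img/icons/${type}_10.png */
url(attr('href') '#hash'); /* https://${href}/#hash */
url(var(--hello) ' world'); /* Hello World */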

No, you can't do this in plain CSS because the CSS language has no control structures or anything similar that would allow you to dynamically generate CSS code.
Instead, you can use a JavaScript solution or a solution based on CSS variables generated in PHP.

You can't do dynamic string interpolation in the way that you're suggesting, but if you have a limited number of possible values for the [type] attribute, you could create styles for each one:
.your .selector[type="foo"] {
    background-image: url('../img/icons/foo_10.png');
}
.your .selector[type="bar"] {
    background-image: url('../img/icons/bar_10.png');
}
.your .selector[type="baz"] {
    background-image: url('../img/icons/baz_10.png');
}
If you've got an unreasonable number of types, then you'll probably need to come up with a better solution than I've listed here.
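If the list of types is long but known in advance, one option is to generate those rules with a small script instead of writing them by hand. This is only a sketch: buildIconRules is a made-up helper name, the selector and path are copied from the rules above, and it assumes each type is a simple identifier with no quotes or spaces.

```javascript
// Hypothetical helper: builds one attribute-selector rule per icon type.
// Assumes every type is a plain identifier (no quotes, spaces or brackets).
function buildIconRules(types, selector = '.your .selector') {
  return types
    .map(
      (type) =>
        `${selector}[type="${type}"] {\n` +
        `  background-image: url('../img/icons/${type}_10.png');\n` +
        `}`
    )
    .join('\n');
}

// Example list; replace with the real set of type values.
const css = buildIconRules(['foo', 'bar', 'baz']);
console.log(css);
```

The output can be pasted into a stylesheet or written to a file as part of a build step.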

Regex that extracts a little text from a huge HTML page and ignore tags and the rest

I want to write a regex that extracts only the text content of an HTML page and discards all the rest. I have a huge webpage with a lot of data, so I will show only a fragment:
<div class="flex-row column"><div class="max-360 a-small"> <img class='width-50' src="irobot.io"><h3 class="pad-30">Do you think like a robot?</h3><p> This is not the problem, the problem is about the human failure.</p></div><div class="max-500> <img class='width-50' src="irobot.io"><h3 class="pad-30">
"
I need to get back only:
Do you think like a robot? This is not the problem, the problem is about the human failure.
Can someone help me? I tried something like:
Regex: Do you.*[^<]\b<
But I have never worked with regex before.
Thank you!
Edit: you mentioned that you need the solution to work in Python or Java. All my suggestions stay the same; you should use an XML/HTML parser instead of a regular expression. Python has Lib/xml, and there are multiple options in Java.
If you must use a regular expression, the syntax is the same for Java. In Python, you can use a pattern like (?:<([^> ]*)[^>]*>)(.*)(?:<\/?\1[^>]*>) with all the same restrictions I mentioned for JavaScript/Java. Try it out!
You can try to parse the text contents from a tag using regular expressions, but it's not recommended in most cases. If you must use a regular expression, you can try something like (?:<(?<tag>[^> ]*)[^>]*>)(?<text>.*)(?:<\/?\k<tag>[^>]*>) on a single tag at a time.
Try it out!
const pattern = /(?:<(?<tag>[^> ]*)[^>]*>)(?<text>.*)(?:<\/?\k<tag>[^>]*>)/;
const matches = pattern.exec(document.body.innerHTML);
console.log('The whole tag: ', matches[0]);
console.log('The tag type: ', matches[1]);
console.log('The text content: ', matches[2]);
<h3 class="pad-30">Do you think like a robot?</h3>
This will not work over multiple nested tags. There are better options available if you want to parse the whole tree in one action. For instance:
The browser's document tree parser already provides extensive navigation options including the ability to return the result of concatenating all visible textNode contents given a starting node.
You can use the innerText or textContent properties like so (open the full screen version):
console.log('Option 1:');
console.log(document.querySelector('body').innerText);
console.log('Option 2:');
console.log(document.querySelector('body').textContent);
code {
background-color: lightgray;
border-radius: 0.5em;
padding: 0.25em 0.5em;
}
<p>Do you <em>need</em> to use a <code>regular expression</code> for this task or would one of the following be suitable?</p>
<ol>
<li>Use the innerText property of the <code>body</code> node like this: <code>document.querySelector("body").innerText;</code></li>
<li>Use the textContent property to also get the contents of <code>script</code> tags in the body like this: <code>document.querySelector('body').textContent;</code></li>
</ol>
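If no DOM is available (for example in a script running outside the browser), a rough regex fallback for small, simple fragments might look like the sketch below. The function name roughTextFromHtml and the sample string are made up for illustration, and this is not a parser: it does not decode entities, it keeps the contents of script and style elements, and it breaks on a > inside an attribute value.

```javascript
// Rough fallback, not a real HTML parser.
function roughTextFromHtml(html) {
  return html
    .replace(/<[^>]*>/g, ' ') // replace every tag-like run with a space
    .replace(/\s+/g, ' ')     // collapse runs of whitespace
    .trim();
}

// Sample adapted from the fragment in the question.
const sample =
  '<div class="flex-row column"><div class="max-360 a-small"> ' +
  '<img class="width-50" src="irobot.io">' +
  '<h3 class="pad-30">Do you think like a robot?</h3>' +
  '<p> This is not the problem, the problem is about the human failure.</p>' +
  '</div></div>';

console.log(roughTextFromHtml(sample));
```

For anything beyond a quick one-off, a real parser is still the safer choice.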

Separate each character with a dot in css

How can you transform abc to a. b. c. in pure CSS?
Only if this is not possible, how can you achieve it in Angular (TypeScript)?
With typescript it's easy
"abc".split('').join('. ') + "."
you can make a pipe if you need to reuse this function in several places.
With CSS, I don't know how it is possible. I'm curious to know if there is a CSS solution. I suppose no (:
If you have control over the dom:
You can place a span around each character and then use the ::after pseudo-element to add the periods after each one.
It would look something like:
span::after{
content:"."
}
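If the markup is generated by script, the spans could be produced with something like the sketch below. wrapCharsInSpans is a hypothetical helper name, and the sketch assumes the text contains no characters that need HTML escaping (such as < or &).

```javascript
// Hypothetical helper: wraps each character in its own <span>.
// Assumes the input has no HTML-special characters (<, >, &, quotes).
function wrapCharsInSpans(text) {
  return Array.from(text) // Array.from splits by code point, not UTF-16 unit
    .map((ch) => `<span>${ch}</span>`)
    .join('');
}

console.log(wrapCharsInSpans('abc'));
// <span>a</span><span>b</span><span>c</span>
```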
Unfortunately, you can't write a CSS solution that does this in a functional way. You can handle it the way @MaxiGui suggested in a comment. If you want to achieve it in TypeScript (or JavaScript), you can use this code:
const txt = 'abc';
const transformedTxt = `${txt.split('').join('. ')}.`;

Why does the browser automatically unescape html tag attribute values?

Below I have an HTML tag, and use JavaScript to extract the value of the widget attribute. This code will alert <test> instead of &lt;test&gt;, so the browser automatically unescapes attribute values:
alert(document.getElementById("hau").attributes[1].value)
<div id="hau" widget="&lt;test&gt;"></div>
My questions are:
Can this behavior be prevented in any way, besides doing a double escape of the attribute contents? (It would look like this: &amp;lt;test&amp;gt;)
Does anyone know why the browser behaves like this? Is there any place in the HTML specs that this behavior is mentioned explicitly?
1) It can be done without doing a double escape
Looks like yours is closer to htmlEncode().
If you don't mind using jQuery
alert(htmlEncode($('#hau').attr('widget')))
function htmlEncode(value){
    // create an in-memory div, set its inner text (which jQuery automatically encodes),
    // then grab the encoded contents back out. The div never exists on the page.
    return $('<div/>').text(value).html();
}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<div id="hau" widget="&lt;test&gt;"></div>
If you're interested in a pure vanilla js solution
alert(htmlEncode(document.getElementById("hau").attributes[1].value))
function htmlEncode( html ) {
    return document.createElement( 'a' ).appendChild(
        document.createTextNode( html ) ).parentNode.innerHTML;
};
<div id="hau" widget="&lt;test&gt;"></div>
2) Why does the browser behave like this?
Only because of this behaviour are we able to do a few specific things, such as including quotes inside a pre-filled input field as shown below. That would not be possible if the only way to insert a " were to type the character itself, which would in turn need escaping with another character such as \
<input type='text' value="&quot;You &apos;should&apos; see the double quotes here&quot;" />
The browser unescapes the attribute value as soon as it parses the document (mentioned here). One of the reasons might be that it would otherwise be impossible to include, for example, double quotes in your attribute value (well, technically it would if you put the value in single quotes instead, but then you wouldn't be able to include single quotes in the value).
That said, the behavior cannot be prevented, although if you really must use the value with the HTML entities being part of it, you could simply turn your special characters back into the codes (I recommend Underscore's escape for such task).
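If pulling in a library just for this feels heavy, re-escaping can also be done by hand. The following is a minimal sketch rather than a drop-in replacement for any library function; escapeHtml and HTML_ESCAPES are names chosen here for illustration.

```javascript
// Map of the characters that matter in HTML text and attribute values.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Turns special characters back into entities.
function escapeHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

console.log(escapeHtml('<test>')); // &lt;test&gt;
```

In the question's case it would be called on the value read from the widget attribute.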

HTML rendered incorrectly in .NET

I am trying to take the string "<BR>" in VB.NET and convert it to HTML through XSLT. When the HTML comes out, though, it looks like this:
&lt;BR&gt;
I can only assume it goes ahead and tries to render it. Is there any way I can convert those &lt;/&gt; back into the brackets so I get the line break I'm trying for?
Check the XSLT has:
<xsl:output method="html"/>
edit: explanation from comments
By default XSLT outputs as XML(1), which means it will escape any significant characters. You can override this in specific instances with the attribute disable-output-escaping="yes" (intro here), but it is much more powerful to change the output method to the explicit value of html, which confers the same benefit globally along with the following:
For script and style elements, replace any escaped characters (such as &amp; and &gt;) with their actual values (& and >, respectively).
For attributes, replace any occurrences of > with >.
Write empty elements such as <br>, <img>, and <input> without closing tags or slashes.
Write attributes that convey information by their presence as opposed to their value, such as checked and selected, in minimized form.
from a solid IBM article covering the subject, more recent coverage from stylusstudio here
If HTML output is what you desire HTML output is what you should specify.
(1) There is actually a corner case where output defaults to HTML, but I don't think it's universal and it's kind of obtuse to depend on it.
Try wrapping it with <xsl:text disable-output-escaping="yes">&lt;br&gt;</xsl:text>
Don't know about XSLT but..
One workaround might be using HttpUtility.HtmlDecode from System.Web namespace.
using System;
using System.Web;
class Program
{
    static void Main()
    {
        // Decodes the escaped markup back into a literal <br>
        Console.WriteLine(HttpUtility.HtmlDecode("&lt;br&gt;"));
        Console.ReadKey();
    }
}
...
Got it! On top of the selected answer, I also did something similar to this on my string:
htmlString = htmlString.Replace("&lt;", "<")
htmlString = htmlString.Replace("&gt;", ">")
I think, though, that in the end, I may just end up using <pre> tags to preserve everything.
The string "<br>" is already HTML so you can just Response.Write("<br>").
But you mention XSLT, so I imagine there is some transform going on. In that case surely the transform should be inserting it at the correct place as a node. A better question will likely get a better answer.

How can I remove an entire HTML tag (and its contents) by its class using a regex?

I am not very good with regex but I am learning.
I would like to remove an HTML tag by its class name. This is what I have so far:
<div class="footer".*?>(.*?)</div>
The first .*? is there because the tag might contain other attributes, and the second because it might contain other HTML.
What am I doing wrong? I have tried a lot of variations without success.
Update
Inside the DIV there can be multiple lines, and I am using Perl regex.
As other people said, HTML is notoriously tricky to deal with using regexes, and a DOM approach might be better. E.g.:
use HTML::TreeBuilder::XPath;
my $tree = HTML::TreeBuilder::XPath->new;
$tree->parse_file( 'yourdocument.html' );
for my $node ( $tree->findnodes( '//*[@class="footer"]' ) ) {
    $node->replace_with_content; # delete element, but not the children
}
print $tree->as_HTML;
You will also want to allow for other things before class in the div tag
<div[^>]*class="footer"[^>]*>(.*?)</div>
Also, go case-insensitive. You may need to escape things like the quotes, or the slash in the closing tag. What context are you doing this in?
Also note that HTML parsing with regular expressions can be very nasty, depending on the input. A good point is brought up in an answer below - suppose you have a structure like:
<div>
<div class="footer">
<div>Hi!</div>
</div>
</div>
Trying to build a regex for that is a recipe for disaster. Your best bet is to load the document into a DOM, and perform manipulations on that.
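To make the failure concrete, here is a small JavaScript sketch (JavaScript only because it is easy to run; the same thing happens in Perl). The pattern is the one from the question, and the input is the structure above flattened onto one line; the variable names are made up.

```javascript
// The pattern from the question, applied to nested divs.
const footerPattern = /<div class="footer".*?>(.*?)<\/div>/;
const nested = '<div><div class="footer"><div>Hi!</div></div></div>';

// The lazy (.*?) stops at the FIRST </div>, which closes the inner div,
// so the footer's own closing tag is left behind.
const removed = nested.replace(footerPattern, '');
console.log(removed); // <div></div></div>
```

The leftover closing tag is exactly the kind of damage a DOM-based approach avoids.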
Pseudocode that should map closely to XML::DOM:
document = // load document
divs = document.getElementsByTagName("div");
for(div in divs) {
    if(div.getAttribute("class") == "footer") {
        parent = div.getParent();
        for(child in div.getChildren()) {
            // filter attribute types?
            // move each child up so it sits just before the div being removed
            parent.insertBefore(child, div);
        }
        parent.removeChild(div);
    }
}
Here is a perl library, HTML::DOM, and another, XML::DOM
.NET has built-in libraries to handle dom parsing.
In Perl you need the /s modifier, otherwise the dot won't match a newline.
That said, using a proper HTML or XML parser to remove unwanted parts of an HTML file is much more appropriate.
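The effect of the modifier is easy to see in JavaScript, where the s (dotAll) flag plays the same role as Perl's /s. This is just a sketch with made-up input and variable names, not Perl code:

```javascript
// Same pattern twice: once without and once with the dotAll (s) flag.
const withoutDotAll = /<div[^>]*class="footer"[^>]*>(.*?)<\/div>/i;
const withDotAll = /<div[^>]*class="footer"[^>]*>(.*?)<\/div>/is;

// Made-up input where the footer spans several lines.
const page = '<p>Body</p>\n<div class="footer">\n  Copyright\n</div>\n<p>After</p>';

console.log(withoutDotAll.test(page)); // false: "." refuses to cross the newlines
console.log(JSON.stringify(page.replace(withDotAll, '')));
```

With the flag set, the whole multi-line footer is matched and removed; without it the pattern never matches at all.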
<div[^>]*class="footer"[^>]*>(.*?)</div>
Worked for me, but needed to use backslashes before special characters
<div[^>]*class=\"footer\"[^>]*>(.*?)<\/div>
Partly depends on the exact regex engine you are using - which language etc. But one possibility is that you need to escape the quotes and/or the forward slash. You might also want to make it case insensitive.
<div class=\"footer\".*?>(.*?)<\/div>
Otherwise please say what language/platform you are using - .NET, java, perl ...
Try this:
<([^\s]+).*?class="footer".*?>([.\n]*?)</([^\s]+)>
Your biggest problem is going to be nested tags. For example:
<div class="footer"><b></b></div>
The regexp given would match everything through the </b>, leaving the </div> dangling on the end. You will have to either assume that the tag you're looking for has no nested elements, or you will need to use some sort of parser from HTML to DOM and an XPath query to remove an entire sub-tree.
This will be tricky because of the greediness of regular expressions. (Note that my examples may be specific to Perl, but greediness is a general issue with REs.) The non-greedy .*? stops at the first </div> it finds, while a greedy .* matches as much as possible before the last </div>. With the greedy form, if you have the following:
<div class="SomethingElse"><div class="footer"> stuff </div></div>
The expression will match:
<div class="footer"> stuff </div></div>
which is not likely what you want.
Why not <div class="footer".*?</div>? I'm not a regex guru either, but I don't think you need to specify that last bracket for your open div tag.