How to render parentheses as part of url in gtk label? - html

I am using gtk in an application and I make use of the abilities of gtklabel text to be rendered automatically as a clickable url. This works well most of the time, however with a url which contains parentheses "(" and ")" this does not work. The versions I use are the ones available on debian (old)stable, i.e. debian 6 (2.20) and 7 (3.4.2).
For example, I am trying to display the following url:
https://maps.google.com/maps?q=62.1891,+-141.5372+(Example+text+in+here+will+be+rendered+in+the+maps+label)&iwloc=A&hl=en
When I create a gtklabel with this text, for example:
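text="<a href=\"https://maps.google.com/maps?q=62.1891,+-141.5372+(Example+text+in+here+will+be+rendered+in+the+maps+label)&iwloc=A&hl=en\"><b>Click here for Map</b></a>\n"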
Then it will display fine in the label as an underlined link in bold with the text Click here for Map
However when you click the link it will not show correctly and this error appears:
Gtk-WARNING **: Unable to show '(null)': Operation not supported
It looks like the parentheses mess up the rendering of the url by gtk.
Is there a way to escape the parentheses, or use a different character that works in the map url to create the label?
I have tried various methods of escaping them, such as replacing the parentheses with %28 and %29 and using backslashes as an escape character, but none have been effective so far.
I am using the method described in https://developer.gnome.org/gtk2/2.24/GtkLabel.html and https://developer.gnome.org/gtk3/stable/GtkLabel.html under "Links" which allows automatic rendering of links:
Links
Since 2.18, GTK+ supports markup for clickable hyperlinks in addition
to regular Pango markup. The markup for links is borrowed from HTML,
using the a with href and title attributes. GTK+ renders links similar
to the way they appear in web browsers, with colored, underlined text.
The title attribute is displayed as a tooltip on the link. An example
looks like this:
gtk_label_set_markup (label, "Go to the <a href=\"http://www.gtk.org\" title=\"&lt;i&gt;Our&lt;/i&gt; website\">GTK+ website</a> for more...");
I understand it is working in more recent releases of gtk (2.24 and 3.6), making sure to escape ampersands. But I was wondering if there is a work around for older gtk versions to avoid this problem?

You should be escaping your ampersands as &amp;.
I'm pretty sure GTK prints out a runtime warning telling you this when you call gtk_label_set_markup().
Here's the warning on GTK 3.6.4:
Gtk-WARNING **: Failed to set text from markup due to error parsing markup: Error on line 1: Entity did not end with a semicolon; most likely you used an ampersand character without intending to start an entity - escape ampersand as &amp;

jku is right, the ampersands need to be escaped. Here's an example using the very same string as yours, and it works (tested on 3.6.4 and 2.24.17).
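#include <gtk/gtk.h>

int
main (int argc, char **argv)
{
    gtk_init (&argc, &argv);
    GtkWidget *window = gtk_window_new (GTK_WINDOW_TOPLEVEL);
    // This one won't work, needs ampersand escaping
    // GtkWidget *label = gtk_label_new ("<a href=\"https://maps.google.com/maps?q=62.1891,+-141.5372+(Example+text+in+here+will+be+rendered+in+the+maps+label)&iwloc=A&hl=en\"><b>Click here for Map</b></a>\n");
    // This one works: every & in the URL is written as &amp;
    GtkWidget *label = gtk_label_new ("<a href=\"https://maps.google.com/maps?q=62.1891,+-141.5372+(Example+text+in+here+will+be+rendered+in+the+maps+label)&amp;iwloc=A&amp;hl=en\"><b>Click here for Map</b></a>\n");
    gtk_label_set_use_markup (GTK_LABEL (label), TRUE);
    gtk_container_add (GTK_CONTAINER (window), label);
    gtk_widget_show_all (GTK_WIDGET (window));
    g_signal_connect (window, "destroy", G_CALLBACK (gtk_main_quit), NULL);
    gtk_main ();
    return 0;
}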
Original answer:
Have you tried to call gtk_show_uri with that link? You could then see if that's a problem with what handles URI's, or if it's the way your label is formatted/constructed.


Why do some strings contain "&nbsp;" and some " ", when my input is the same (" ")?

My problem occurs when I try to use some data/strings in a p-element.
I start of with data like this:
data: function() {
  return {
    reportText: {
      text1: "This is some subject text",
      text2: "This is the conclusion",
    }
  }
}
I use this data as follows in my (vue-)html:
<p> {{ reportText.text1 }} </p>
<p> {{ reportText.text2 }} </p>
In my browser, when I inspect my elements I get to see the following results:
<p>This is some subject text</p>
<p>This is the conclusion</p>
As you can see, there is suddenly a difference: one p element uses "&nbsp;" and the other " ", even though I started off with both strings only using " ". I know "&nbsp;" and " " technically represent the same thing, but the problem with the "&nbsp;" string is that it gets treated as a string with 1 large word instead of multiple separate words. This screws up my layout and I can't solve this by using certain CSS properties (word-wrap etc.).
Other things I have tried:
Tried sanitizing the strings by using .replace("&nbsp;", " "), but that doesn't do anything. I assume this is because it basically is the same, so there is nothing to really replace. Same reason why I have to use a code block on Stack Overflow to make the distinction between "&nbsp;" and " ".
Logged the data from Vue to see if there is any noticeable difference, but I can't see any. If I log the data/reportText I again only see strings with " ".
So I have the following questions:
Why does this happen? I can't seem to find any logical explanation why it sometimes uses "&nbsp;" and sometimes uses " ". It seems random, but I am sure I am missing something.
Are there any other things I could try to follow the path my string takes, so I can see where the transformation from " " to "&nbsp;" happens?
Per the comments, the solution devised ended up being a simple unicode character replacement targeting the \u00A0 unicode code point (i.e. replacing unicode non-breaking spaces with ordinary spaces):
str.replace(/\u00A0/g, ' ')
Explanation:
JavaScript typically allows the use of unicode characters in two ways: you can input the rendered character directly, or you can use a unicode code point (i.e. in the case of JavaScript, a hexadecimal code prefixed with \u like \u00A0). It has no concept of an HTML entity (i.e. a character sequence between a & and ; like &nbsp;).
The inspector tool for some browsers, however, utilizes the HTML concept of the HTML entity and will often display unicode characters using their corresponding HTML entities where applicable. If you check the same source code in Chrome's inspector vs. Firefox's inspector (as of writing this answer, anyway), you will see that Chrome uses HTML entities while Firefox uses the rendered character result. While it's a handy feature to be able to see non-printable unicode characters in the inspector, Chrome's use of HTML entities is only a convenience feature, not a reflection of the actual contents of your source code.
With that in mind, we can infer that your source code contains unicode characters in their fully rendered form. Regardless of the form of your unicode character, the fix is identical: you need to target these unicode space characters explicitly and replace them with ordinary spaces.
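To follow where a non-breaking space sneaks in, it can help to print code points rather than rely on what the inspector shows. The following is a minimal sketch; the sample string and the helper names describeSpaces and normalizeSpaces are made up for illustration and are not part of Vue or any library.

```javascript
// Hypothetical sample: two of the three gaps are non-breaking spaces (U+00A0).
const sample = "This\u00A0is some\u00A0text";

// List the code point of every whitespace character, so U+00A0 stands out
// from an ordinary space (U+0020).
function describeSpaces(str) {
  return Array.from(str)
    .filter((ch) => /\s/.test(ch))
    .map((ch) => "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0"));
}

// Replace every non-breaking space with an ordinary space.
function normalizeSpaces(str) {
  return str.replace(/\u00A0/g, " ");
}

console.log(describeSpaces(sample));
console.log(normalizeSpaces(sample));
```

Calling describeSpaces at each step of the data flow (after loading, after any formatting, just before rendering) should show at which step U+00A0 first appears.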

web2py: textareas lose initial newline

I'm not sure if this is a web2py problem or a general html problem, but when I create a form in web2py that contains an editable string in a textarea, and the string contains an initial newline, like "\nsecond_line", the textarea does not display or save the newline - it is cut out. It works fine if there is a character before the newline: "firstline\nsecond_line" shows as on two lines. It is also only relevant for the first newline. If I have a string like "\n\nthird_line", then the textarea shows a single newline at the start.
This is with the most recent (non beta) version of web2py, on safari 9.1.3 and chrome 56.0.2924.87.
Ah. "By HTML 4.0 appendix B chapter 3.1, “a line break immediately following a start tag must be ignored, as must a line break immediately before an end tag. This applies to all HTML elements without exception.”"
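One common workaround, given that rule, is to emit an extra newline right after the textarea start tag, so the parser discards that one and keeps the newline that belongs to the value. The sketch below only builds the markup as a string; renderTextarea and escapeHtml are hypothetical helpers, not web2py functions, and the name argument is assumed to be a plain identifier.

```javascript
// Minimal HTML escaping for text placed inside an element.
function escapeHtml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Build textarea markup. The "\n" written right after the start tag is the
// line break the parser ignores, so a leading newline in `value` survives.
function renderTextarea(name, value) {
  return '<textarea name="' + name + '">\n' + escapeHtml(value) + "</textarea>";
}

console.log(JSON.stringify(renderTextarea("notes", "\nsecond_line")));
```

The same idea applies in any templating layer: always write one newline after the start tag, regardless of what the value begins with.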

Disable conversion of html entities in CKEditor

I have a textarea containing a text like:
Foo &#128247; Bar
When I apply CKEditor to that area, it correctly displays it as:
Foo 📷 Bar
Which is fine.
But unfortunately it converts &#128247; to 📷 while doing so.
Can I disable this somehow?
Edit
I tried the Entities add-on with the setting entities_additional set to true.
This setting actually breaks the 📷 character into an invalid sequence. I'm sure this is a bug and the Entities plugin can't handle multibyte characters.
By default CKEditor should translate entities with either this entities_processNumerical : force or this entities_additional:'#128247' setting.
This is however not the case for 4-byte entities as they get destroyed most likely by replace method.
I have reported this issue here: https://dev.ckeditor.com/ticket/14588
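As an illustration of how such a character can get destroyed (this is not CKEditor's code, just a sketch of a plausible failure mode), note that JavaScript strings are sequences of UTF-16 code units, and 📷 (code point 128247) occupies two of them. The function names below are hypothetical.

```javascript
const camera = "\u{1F4F7}"; // the camera emoji, code point 128247

// Converts one UTF-16 code unit at a time. For characters above U+FFFF this
// emits two surrogate values, which are not valid character references.
function toEntitiesPerUnit(str) {
  let out = "";
  for (let i = 0; i < str.length; i++) {
    out += "&#" + str.charCodeAt(i) + ";";
  }
  return out;
}

// Converts one code point at a time; for...of iterates a string by code point.
function toEntitiesPerCodePoint(str) {
  let out = "";
  for (const ch of str) {
    out += "&#" + ch.codePointAt(0) + ";";
  }
  return out;
}

console.log(camera.length);                  // 2
console.log(toEntitiesPerUnit(camera));      // &#55357;&#56567;
console.log(toEntitiesPerCodePoint(camera)); // &#128247;
```

Any replace callback that works per code unit would produce the first, invalid form, while a code-point-aware conversion yields the expected entity.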
I had a similar problem: my textarea using CKEditor was adding the encoded HTML tags as plain text, so when I displayed the output on a web page the HTML tags showed up as literal text (<p>) in the page instead of being interpreted as markup, which one would not normally see in the browser (one would only see the result, the actual paragraph spacing).
I tried all combinations of:
config.entities = false
config.htmlEncodeOutput = false;
config.entities = true
config.htmlEncodeOutput = true;
Nothing worked until I realised that I was using the PHP htmlspecialchars() in my form to parse the textarea field.
By removing htmlspecialchars() in my form for that field and setting:
config.entities = true;
I resolved the problem.
I got a solution: please use "htmlspecialchars", like echo htmlspecialchars( $content );
This will convert "&" to "&amp;".

How to stop an html TEXTAREA from decoding html entities

I have a strange problem:
In the database, I have a literal ampersand lt semicolon:
&lt;div
Whenever it's printed into an HTML textarea tag, the page shows the &lt; as <.
How do I stop this decoding?
You can't stop entities being decoded in a textarea since the content of a textarea is not (unlike a script or style element) intrinsic CDATA, even though error recovery may sometimes give the impression that it is.
The definition of the textarea element is:
<!ELEMENT TEXTAREA - - (#PCDATA) -- multi-line text field -->
i.e. it contains PCDATA which is described as:
Document text (indicated by the SGML construct "#PCDATA"). Text may contain character references. Recall that these begin with & and end with a semicolon (e.g., Herg&eacute;'s adventures of Tintin contains the character entity reference for the e acute character).
This means that when you type (the invalid HTML of) "start of tag" (<) the browser corrects it to "less than sign" (&lt;) but when you type "start of entity" (&), which is allowed, no error correction takes place.
You need to write what you mean. If you want to include some HTML as data then you must convert any character with special meaning to its respective character reference.
If the data is:
&lt;div
Then the HTML must be:
<textarea>&amp;lt;div</textarea>
You can use the standard functions for converting this (e.g. PHP's htmlspecialchars or Perl's HTML::Entities module).
NB 1: If you were using XHTML[2] (and really using it, it doesn't count if you serve it as text/html) then you could use an explicit CDATA block:
<textarea><![CDATA[&lt;div]]></textarea>
NB 2: Or if browsers implemented HTML 4 correctly
OK, but the question is: why does it decode them anyway? Assuming I've added &lt; and saved the textarea, it will be saved as &lt; but displayed as <. Saving it again will convert it to < (it remained &lt; in the database until then), so saving again stores it as < in the database. Why does the textarea decode it?
The server sends (to the browser) data encoded as HTML.
The browser sends (to the server) data encoded as application/x-www-form-urlencoded (or multipart/form-data).
Since the browser is not sending the data as HTML, the characters are not represented as HTML entities.
If you take the data received from the client and then put it into an HTML document, then you must encode it as HTML first.
In PHP, this can be done using htmlentities(). Example below.
<?php
$content = "This string contains the TM symbol: &trade;";
print "<textarea>". htmlentities($content) ."</textarea>";
?>
Without htmlentities(), the textarea would interpret and display the TM symbol (™) instead of "&trade;".
http://php.net/manual/en/function.htmlentities.php
You have to be sure that this is rendered to the browser:
<textarea name="somename">&amp;lt;div</textarea>
Essentially, this means that the & in &lt; has to be HTML-encoded to &amp;. How to do it will depend on the technologies you're using.
UPDATE: Think about it like this. If you want to display <div> inside a textarea, you'll have to encode < and >, because otherwise <div> would be a normal HTML element to the browser:
<textarea name="somename">&lt;div&gt;</textarea>
Having said this, if you want to display &lt;div&gt; inside a textarea, you'll have to encode the & again, because the browser decodes HTML entities when rendering HTML. It has nothing to do with your database.
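The encoding step can be sketched in a few lines. This is a minimal illustration rather than a replacement for a library routine such as PHP's htmlspecialchars; the escapeHtml helper and the stored value are assumptions made for the example.

```javascript
// Replace the characters that have special meaning in HTML text.
// The ampersand must be handled first, or the other replacements
// would be escaped a second time.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical database value: a literal ampersand, "lt", semicolon, then "div".
const stored = "&lt;div";

const html = '<textarea name="somename">' + escapeHtml(stored) + "</textarea>";
console.log(html); // <textarea name="somename">&amp;lt;div</textarea>
```

When the browser parses that markup, it decodes &amp; back to &, so the textarea shows exactly the stored text and submits it unchanged.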
You can serve your DB-content from a separate page and then place it in the textarea using a Javascript (jQuery) Ajax-call:
var request = $.ajax({
    type: "GET",
    url: "url-with-the-troubled-content.php",
    success: function (data) {
        // Assigning to .value sets raw text, so entities in data are not decoded
        document.getElementById('id-of-text-area').value = data;
    }
});
Explained at
http://www.endtask.net/how-to-prevent-a-textarea-element-from-decoding-html-entities/
I had the same problem and I just made two replacements on the text from the database before letting it into the textarea:
myString = Replace(myString, "&", "&amp;")
myString = Replace(myString, "<", "&lt;")
Replacement no. 1 tricks the textarea into showing the codes.
Replacement no. 2: without it you cannot show the text "</textarea>" inside the textarea (it would end the textarea tag).
(ASP / VBScript code above; translate to the replace method of your language of choice.)
I found an alternative solution for reading and working with in-browser, simply read the element's text() using jQuery, it returns the characters as display characters and allows me to write from a textarea to a div's innerHTML using the property via html()...
With only JS and HTML...
...to answer the actual question, with a bare-minimal example:
<textarea id=myta></textarea>
<script id=mytext type=text/plain>
&trade;
</script>
<script> myta.value = mytext.innerText; </script>
Explanation:
Script tags do not render HTML or entities. Text stored in a script tag remains unadulterated; the problem is that the browser will try to execute it as JavaScript. So we use an empty textarea and store the text in a script tag (here, the first one).
To prevent that execution, we change the MIME type to text/plain instead of its default, which is text/javascript. This will prevent it from running.
Then to populate the textarea, we copy the script tag's content to it (here done in the second script tag).
The only caveats I have found with this are you have to use JavaScript and you cannot include script tags directly in it.

Inserting HTML inside quotes

I want a page break inside the title attribute of a link, but when I put one in, it appears correct in a browser, but returns 7 errors when I validate it.
This is the code.
<a href="images/Bosses/Lord Yarkan Large.jpg" class="hastipz" target="_blank" title="Lord Yarkan, a level 80 Unique from Silkroad Online -- Click for a Larger Image">
<img class="bosspic" src="images/Bosses/Lord Yarkan.jpg" style="float:right; position:relative;" alt="Lord Yarkon; Silkroad Unique"/>
</a>
The reason is because the title attribute appears in a tooltip, and I need a page break inside that tooltip. How can I add a page break inside the quotes without returning errors?
I found this forum post:
There are two approaches:
1) Use the character entity for a carriage return, which is &#013;
Thus:
<...title="Exemplary&#013;website">
(For a full list of character entities, try Googling "HTML Character Codes".)
2) to do any additional styling to your "tooltips", Google "CSS tooltips"
1) is non-standard though. It works in IE/Chrome, but not in Firefox. The new spec appears to recommend &#10; (newline) instead.
Do you need to validate for work?
If not, do not worry about the errors if it works as you want it.
Validation is not the goal. It is a tool to help build better Web sites, which is the goal. ;-)
If you must have it validate, you could try to use some script to switch out a specific keyword / set of characters for a <br /> at dom ready. Although this is untested and I am not sure it wouldn't throw errors, too.
EDIT
As requested, a little jQuery to switch out a word:
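$('a').each(function () {
    var title = $(this).attr('title');
    // Skip links that have no title attribute
    if (title) {
        // Swap every occurrence of the keyword for a newline
        $(this).attr('title', title.replace(/lineBreak/g, '\n'));
    }
});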
Example: http://jsfiddle.net/jasongennaro/qRQaq/1/
Nb:
I used "lineBreak" as the keyword, as this is unlikely to be matched. "br" might be
I replaced it with the \n line break character.
You should try the \n line break character on its own... might work without needing to replace anything.