Direct link to MediaWiki page section

On my Wikipedia page, I have a section called SubtitleA. Earlier in the page, I have a sentence with a link that jumps to the content of that section.
To be more clear, this is a simple illustration:
To do this, you will need `this` (a link to SubtitleA).
To do that, you will do another thing...
== SubtitleA ==
this is how you do it....
I found the following solution:
To do this, you will need [http://wikisite.com/pageName#SubtitleA this].
This has been proven correct; however, one of my subtitles contains spaces, brackets, and a directory-style path, like the following:
== SubtitleA (balabalaA\balabalaB\balabala....) ==
I can no longer use the solution I found because of those spaces and special characters... Can anyone provide me with an alternative solution? Thanks.

To do this, you will need [[pageName#SubtitleA|this]].
Use exactly the same text as in the section heading.
Anchor encoding is similar to percent encoding (with a . instead of a %) but not exactly the same (e.g. spaces are collapsed and encoded as _). If you really, really need to do it directly, you can use {{anchorencode:original title}}.
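For example, assuming the page and heading names from the question, you can let the parser function do the encoding for you (a sketch, not tested against your wiki):
To do this, you will need [[pageName#{{anchorencode:SubtitleA (balabalaA\balabalaB\balabala....)}}|this]].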

I found the solution:
A URL encoder is the key, but not with the standard %xx replacements for special characters. Using .xx instead (e.g. .5C, .28) works in the MediaWiki framework.
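For the example heading above, that gives a link along these lines (a sketch; the exact encoding has to be derived from your real heading text, with spaces becoming _ and literal dots staying as they are):
To do this, you will need [http://wikisite.com/pageName#SubtitleA_.28balabalaA.5CbalabalaB.5Cbalabala.....29 this].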


Label text ignoring html tags

<label for="abc" id="xyz">http://abc.com/player.js</xref>?xyz="foo" </label>
is ignoring the </xref> tag value in the browser. So the displayed output is
http://abc.com/player.js?xyz="foo"
but I want the browser to display
http://abc.com/player.js</xref>?xyz="foo"
Please help me figure out how to achieve this.
It isn't being ignored. It is being treated as an end tag (for a non-HTML element that has no start tag). Use &lt; if you want a < character to appear as data instead of as "start of tag".
That said, this is a URL and raw <, > and " characters shouldn't appear in URIs anyway. So encode it as http://abc.com/player.js%3C/xref%3E?xyz=%22foo%22
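If the goal is simply to display that literal text inside the label, a minimal sketch (reusing the markup from the question) is to escape the angle brackets as character entities:
<label for="abc" id="xyz">http://abc.com/player.js&lt;/xref&gt;?xyz="foo"</label>
The browser then renders the literal < and > characters as data instead of trying to parse a tag.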
You should do it like this:
"http://abc.com/player.js%3C/xref%3E?xyz=foo"
The URL should be encoded properly to be valid. Use encodeURI to encode URLs into a valid form:
var ValidURL = encodeURI("http://abc.com/player.js</xref>?xyz=foo");
See this answer on encodeURI for more details.
I misunderstood the question: I thought the URI was to be used elsewhere within JavaScript. But the question pretty clearly states that the URI is just to be rendered as text.
If the text being displayed is being passed in from a server, then your best bet is to encode it before printing it on the page (or if you're using a template engine, then you can most likely just encode it on the template). Pretty much any web framework/templating engine should have this functionality.
However, if it is just static HTML, just manually encode the characters. If you don't know the codes off the top of your head, you can use an online converter to help, such as:
HTML Encode/Decode:
http://htmlentities.net/
Old Answer:
Try encoding the URI using the JavaScript function encodeURI before using it:
encodeURI('http://abc.com/player.js</xref>?xyz="foo"');
You can also decode it using decodeURI if need be:
decodeURI(yourEncodedURI);
So ultimately I don't think you'll be able to get the browser to display the </xref> tag as is, but you will be able to preserve it (using encodeURI/decodeURI) and use it in your code, if this is what you need.
Fiddle:
http://jsfiddle.net/rk8nR/3/
More info:
When are you supposed to use escape instead of encodeURI / encodeURIComponent?

Trademark symbol is displayed as raw text

If you visit www.startwire.com you'll see in the center of the page (in the yellow box, under the video) the following:
StartWire&trade;
In our dev and stage environments this is not an issue, but it is in production. What could possibly be causing this?
If you look at the page source, you will see &amp;trade; - you are double-encoding the entity.
This should be simply &trade;.
In the HTML you have:
<h2>Sign-up now. StartWire&amp;trade; is completely FREE.</h2>
whereas the correct version would be:
<h2>Sign-up now. StartWire&trade; is completely FREE.</h2>
Notice the extraneous &amp;. It looks like you are double-encoding something on the server.
If you check your page source it says:
&amp;trade;
This means it probably took &trade; and HTML-encoded it again, so the & became &amp;. This is probably due to the use of an htmlentities() function.
Make sure you do not do this conversion twice...
A possible cause is that you are taking the contents from a database, encoded the entries before inserting them into the database, and then encode them a second time when you retrieve them.
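A minimal sketch of how that double encoding happens, assuming PHP on the server since htmlentities() was mentioned (the variable names are made up):
$name  = 'StartWire™';
$once  = htmlentities($name, ENT_QUOTES, 'UTF-8');  // "StartWire&trade;" - renders as ™
$twice = htmlentities($once, ENT_QUOTES, 'UTF-8');  // "StartWire&amp;trade;" - renders as the literal text "&trade;"
The fix is to make sure this encoding step runs exactly once, either when the text is stored or when it is printed, but not both.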
Is the content being "HTML encoded" (or whatever they call it) automatically somewhere in the script? Because this is what appears in the HTML: &amp;trade;.
My suggestion would be to just use the symbol itself in your code (™). If that doesn't work, try escaping the & of &trade; using \ (so that it becomes \&trade;).
Not sure, but I have checked your site and it shows that you have written something like
&amp;trade;
Simply write &trade; instead.

Amend HTML Grammar based on attributes in TextMate

I've recently started experimenting with jQuery Templates, which rely on your ability to wrap HTML within SCRIPT tags.
<script id="movieTemplate" type="text/x-jquery-tmpl">
<li>
<b>${Name}</b> (${ReleaseYear})
</li>
</script>
The problem is, TextMate naturally assumes that anything within SCRIPT tags is JavaScript. I'm sure it's possible to make TextMate treat the content differently based on the type attribute, but I'm struggling with some of the grammar being used in the bundle. I'm pretty confident that the line below is key, but I'm not sure where to start.
begin = '(?:^\s+)?(<)((?i:script))\b(?![^>]*/>)';
Has anyone already dealt with a similar scenario? Would someone be able to point me in the right direction?
Rich
begin = '(?:^\s+)?(<)((?i:script))\b(?!([^>]*text/x-jquery-tmpl[^>]*|[^>]*/>))';
This will stop script tags with "text/x-jquery-tmpl" in them from being treated as JavaScript.
That's a regular expression. You could extend it to check for the type text/javascript like that:
begin = '(?:^\s+)?(<)((?i:script))\b(.*?type="text/javascript")(?![^>]*/>.*)';
I have only tested it a little, but it seems to work. When the type is text/javascript, TextMate treats the content as JavaScript; for every other type it uses PHP (just like outside of script tags).
You can read more about how TextMate uses regular expressions here: Regex (TextMate Manual)
The matching groups are meaningful, so to keep the current matching-group configuration you need to change it to this:
begin = '(?:^\s+)?(<)((?i:script))\b(?:.*?type="text/javascript")(?![^>]*/>)';
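For context, these begin patterns live in the HTML bundle's language grammar, which you can edit via the Bundle Editor. A very rough sketch of what a dedicated rule for the template blocks might look like in the old-style property-list grammar format is shown below; the rule name is my own invention and the end pattern is only modelled on the existing script rule, so treat it as a starting point rather than a drop-in:
{   name = 'meta.tag.template.script.html';
    begin = '(?:^\s+)?(<)((?i:script))\b(?=[^>]*text/x-jquery-tmpl)';
    end = '(</)((?i:script))(>)(?:\s*\n)?';
    patterns = ( { include = 'text.html.basic'; } );
}
The idea is the same as in the answers above: match the opening script tag only when its type attribute marks it as a template, and hand the content back to the plain HTML grammar rather than the JavaScript one.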

Match multiple terms within <body> tags

I want to match any occurrence of a search term (or list of search terms) within the <body> tags of a document. My current solution uses preg (within a Joomla plugin):
$pattern = '/matchthisterm/i';
$article->text = preg_replace($pattern,"<span class=\"highlight\">\\0</span>",$article->text);
But this replaces everything within the HTML of the document, so I need to match the <body> tags first. Is this even the best way to achieve this?
EDIT:
OK, I've used simplehtmldom, but just need some help getting to the correct term. So far I've got:
$pattern = '/(matchthisterm)/i';
$html = str_get_html($buffer);
$es = $html->find('text');
foreach ($es as $term) {
    // Match to the terms within the text nodes
    if (preg_match($pattern, $term->plaintext)) {
        $term->outertext = '<span class="highlight">' . $term->outertext . '</span>';
    }
}
This highlights the entire node text, not just the matched term. Am I OK to use preg_replace in here?
SOLUTION:
//Get the HTML and look at the text nodes
$html = str_get_html($buffer);
$es = $html->find('text');
foreach ($es as $term) {
    // Match to the terms within the text nodes
    $term->outertext = str_ireplace('matchthis', '<span class="highlight">matchthis</span>', $term->outertext);
}
No, processing [X][HT]ML with regex is largely disastrous. In the simplest case for your example, an input like
<a href="/matchthisterm/bar">bof</a>
gives quite thoroughly broken output:
<a href="/<span class="highlight">matchthisterm</span>/bar">bof</a>
The proper way to do it would be to use a proper HTML/XML parser (for example DOMDocument::loadHTML() or simplehtmldom), then scan and replace the contents of each text node separately. Finally, re-save the HTML back to a string.
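For reference, a rough sketch of the DOMDocument variant (the helper name and details are my own illustration, not drop-in Joomla code; the simplehtmldom approach used in the question's solution achieves the same thing):
// Hypothetical helper: wrap every case-insensitive match of $term
// found in the text nodes of $htmlString in a highlight span.
function highlightTerm($htmlString, $term) {
    $doc = new DOMDocument();
    @$doc->loadHTML($htmlString);              // suppress warnings about sloppy markup
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query('//body//text()') as $node) {
        if (stripos($node->nodeValue, $term) === false) {
            continue;                          // nothing to highlight in this node
        }
        $highlighted = preg_replace(
            '/' . preg_quote($term, '/') . '/i',
            '<span class="highlight">$0</span>',
            htmlspecialchars($node->nodeValue)
        );
        $fragment = $doc->createDocumentFragment();
        $fragment->appendXML($highlighted);    // parse the replacement markup
        $node->parentNode->replaceChild($fragment, $node);
    }
    return $doc->saveHTML();
}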
An alternative for search term highlighting is to do it in JavaScript. Since the browser has already parsed the HTML to a DOM, that saves you a processing step. See eg. this question for an example.
I agree processing HTML with regex is not a good solution.
I just read the argument about why regex can't parse HTML here: RegEx match open tags except XHTML self-contained tags
I quite agree with the whole thing, but the problem is MUCH simpler here: we just need to know whether we are inside some HTML tag or not. We don't have to parse an HTML structure, interpret a tree, or handle mismatched tags and other errors. We just know that an HTML tag is something between < and >. I believe regex is a well-adapted, consistent tool here.
The mere fact that we're dealing with some HTML doesn't mean we can't use regex. We need to focus on the real problem here, which I believe isn't really about processing HTML: we only need to know whether we're inside a tag or not. I hope I won't get too many downvotes for this, but I fully stand by my position.
I'll point you to a previous post I made earlier today (where you put a link to this topic): Highlight text, except html tags
Along the same lines, you're using preg_replace() where a simpler function like str_ireplace() would be sufficient. If you just need to replace a word (or a set of words) inside a string and handle case insensitivity, don't use regex. Keep it simple. (I'm assuming you didn't deliberately simplify the replacement you're trying to make just to explain your problem here.)
I haven't used preg, but I've done pattern matching in Perl, Java and ActionScript before. If this is anything similar, you have to escape special characters, for example "\<span class...". I found a website that talks about using preg, in case you haven't come across it; it can be found here.

How can I convert URLs in text to HTML links?

I'm writing a forum-type discussion board in Perl and would like to change automatically http://www.google.com to be an HTML link. This should also be safe, and err on the side of security. Is there a quick, easy, and safe way to add links automatically?
Try something like this:
use Regexp::Common qw /URI/;
$text =~ s|($RE{URI}{HTTP})(?!</a>)|<a href="$1">$1</a>|g
The key here is using Regexp::Common::URI which probably has a more thorough url matcher than anything I could come up with. Also I do a negative lookahead assertion at the end to make sure that the url is not already in a link. That last part isn't exactly thorough, since it's possible that somebody could do something like this:
<a href="http://www.mysite.com">http://www.mysite.com is my website</a>
To do this correctly you'd need to parse the entire submission text and only substitute out urls that are not already part of a link.
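For the simpler case, a minimal sketch of the safer, two-step version (escape the text first, then linkify) might look like this; HTML::Entities and the sub name are my own additions, so treat it as an illustration rather than a finished solution:
use Regexp::Common qw(URI);
use HTML::Entities qw(encode_entities);

sub linkify {
    my ($text) = @_;
    # Escape the raw text first so user input cannot inject markup.
    $text = encode_entities($text, '<>&"');
    # Then wrap anything that looks like an http URL in an anchor tag.
    # (Entities such as &amp; inside a URL may still need extra care.)
    $text =~ s|($RE{URI}{HTTP})|<a href="$1">$1</a>|g;
    return $text;
}
Escaping first neutralises any markup in the submission, which also fits the question's requirement to err on the side of security; if you need to preserve existing HTML (including existing links), you would still need to parse the text as described above.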