Chasing down a problem, I happened to notice that jekyll/convertible.rb's read_yaml() routine seems to allow the frontmatter to be terminated either with three dashes, or with three dots:
if content =~ /\A(---\s*\n.*?\n?)^((---|\.\.\.)\s*$\n?)/m
I don't find the "dots" form documented anywhere. Does it mean something?
It's specified in the YAML spec, in §2.2. It's one way to specify the end of a YAML document.
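To see both terminators accepted, here is a minimal sketch that ports the quoted Ruby pattern to Python. The function name split_front_matter and the sample strings are made up for illustration; this is not Jekyll's own code.

```python
import re

# Python translation of the Ruby pattern quoted above. Ruby's /m flag makes
# "." match newlines (re.DOTALL here), and Ruby's ^ and $ always match at
# line boundaries (re.MULTILINE here).
FRONT_MATTER = re.compile(
    r"\A(---\s*\n.*?\n?)^((---|\.\.\.)\s*$\n?)",
    re.DOTALL | re.MULTILINE,
)


def split_front_matter(text):
    """Return (front_matter, body); front_matter is None if there is none."""
    match = FRONT_MATTER.match(text)
    if match is None:
        return None, text
    return match.group(1), text[match.end():]


dashes = "---\ntitle: Demo\n---\nHello\n"
dots = "---\ntitle: Demo\n...\nHello\n"
print(split_front_matter(dashes))
print(split_front_matter(dots))
```

Both calls should split the text at the same place, since the second group of the pattern accepts either marker.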
My html file contains in many places the code &nbsp;&nbsp;&nbsp;
It is too short and it doesn't really make sense to replace it with a code like
<span class="three-spaces"></span>
I would like to replace it with something like
##TS##
or
%%TS%%
and the file should start with something like:
SET TS = "&nbsp;&nbsp;&nbsp;"
Is there any way to write the HTML this way? I am not looking for compiling a source file into a HTML. I am looking for a solution that allows directly writing macros into HTML files.
Later edit: I'm coming with another example:
I also need to transform
lnk(http://www.example.com)
into
<a target="_blank" href="http://www.example.com">http://www.example.com</a>
Instead of telling him WHY he should not do something, how about telling him HOW he could do it? Maybe his example is not an appropriate need for it, but there are other situations where being able to create a macro would be nice.
For example... I have an HTML page that I'm working on that deals with unit conversions and quite often, I'm having to type things like "cm/in" as "<sup>cm</sup>/<sub>in</sub>" or for volumes "cu-cm/cu-in" as "<sup>cm<sup>3</sup></sup>/<sub>in<sup>3</sup></sub>". It would be really nice from a typing and readability standpoint if I could create macros that were just typed as "%%cm-per-in%%", "%%cc-per-cu-in%%" or something like that.
So, the line in the 'sed' file might look like this:
s/%%cc-per-cu-in%%/<sup>cm<sup>3<\/sup><\/sup>\/<sub>in<sup>3<\/sup><\/sub>/g
Since the "/" is a field separator for the substitute command, you need to explicitly quote it with the backslash character ("\") within the replacement portion of the substitute command.
The way that I have handled things like this in the past was to either write my own preprocessor to make the changes or if the "sed" utility was available, I would use it. So for this sort of thing, I would basically have a "pre-HTML" file that I edited and after running it through "sed" or the preprocessor, it would generate an HTML file that I could copy to the web server.
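For anyone without sed at hand, the same pre-HTML idea can be sketched as a tiny preprocessor in Python. This is only an illustration: the MACROS table, the expand_macros name, and the rule that macro names consist of letters, digits and '-' are assumptions, not part of any existing tool.

```python
import re

# Hypothetical macro table; the names and expansions mirror the examples above.
MACROS = {
    "cm-per-in": "<sup>cm</sup>/<sub>in</sub>",
    "cc-per-cu-in": "<sup>cm<sup>3</sup></sup>/<sub>in<sup>3</sup></sub>",
}

# A macro call is %%name%%, where name is letters, digits, or '-'.
MACRO_PATTERN = re.compile(r"%%([A-Za-z0-9-]+)%%")


def expand_macros(text, macros=MACROS):
    """Replace each %%name%% with its expansion; flag unknown names."""
    def replace(match):
        return macros.get(match.group(1), "<b>***MACRO-ERROR***</b>")
    return MACRO_PATTERN.sub(replace, text)


print(expand_macros("Ratio: %%cm-per-in%% and %%cc-per-in%%"))
```

Reading the pre-HTML file, passing its contents through expand_macros, and writing the result out as the .html file gives the same workflow as the sed script, with a visible error marker for misspelled macro names.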
Now, you could create a javascript function that would do the text substitution for you, but in my opinion, it is not as nice looking as an actual preprocessor macro substitution. For example, to do what I was doing in the sed script, I would need to create a function that would take as a parameter the short form "nickname" for the longer HTML that would be generated. For example:
function S( x )
{
    if (x == "cc-per-cu-in") {
        document.write("<sup>cm<sup>3</sup></sup>/<sub>in<sup>3</sup></sub>");
    } else if (x == "cm-per-in") {
        document.write("<sup>cm</sup>/<sub>in</sub>");
    } else {
        document.write("<B>***MACRO-ERROR***</B>");
    }
}
And then use it like this:
This is a test of cc-per-cu-in <SCRIPT>S("cc-per-cu-in");</SCRIPT> and
cm-per-in <SCRIPT>S("cm-per-in");</SCRIPT> as an alternative to sed.
This is a test of an error <SCRIPT>S("cc-per-in");</SCRIPT> for a
missing macro substitution.
This generates the following:
This is a test of cc-per-cu-in cm3/in3 and cm-per-in cm/in as an alternative to sed. This is a test of an error ***MACRO-ERROR*** for a missing macro substitution.
Yeah, it works, but it is not as readable as if you used a 'sed' substitution.
So, decide for yourself... Which is more readable...
This...
This is a test of cc-per-cu-in <SCRIPT>S("cc-per-cu-in");</SCRIPT> and
cm-per-in <SCRIPT>S("cm-per-in");</SCRIPT> as an alternative to sed.
Or this...
This is a test of cc-per-cu-in %%cc-per-cu-in%% and
cm-per-in %%cm-per-in%% as an alternative to sed.
Personally, I think the second example is more readable and worth the extra trouble to have pre-HTML files that get run through sed to generate the actual HTML files... But, as the saying goes, "Your mileage may vary"...
EDITED: One more thing that I forgot about in the initial post that I find useful when using a pre-processor for the HTML files -- Timestamping the file... Often I'll have a small timestamp placed on a page that says the last time it was modified. Instead of manually editing the timestamp each time, I can have a macro (such as "%%DATE%%", "%%TIME%%", "%%DATETIME%%") that gets converted to my preferred date/time format and put in the file.
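A timestamp macro of this kind could look roughly like the following Python sketch. The three formats in FORMATS and the expand_timestamps name are placeholders chosen for the example; substitute your own preferred date/time style.

```python
import re
from datetime import datetime

# Hypothetical formats; substitute whatever date/time style you prefer.
FORMATS = {
    "DATE": "%Y-%m-%d",
    "TIME": "%H:%M",
    "DATETIME": "%Y-%m-%d %H:%M",
}


def expand_timestamps(text, now=None):
    """Replace %%DATE%%, %%TIME%% and %%DATETIME%% with the formatted time."""
    now = now or datetime.now()

    def replace(match):
        return now.strftime(FORMATS[match.group(1)])

    return re.sub(r"%%(DATE|TIME|DATETIME)%%", replace, text)


print(expand_timestamps("Last modified: %%DATETIME%%"))
```

Passing an explicit now value makes the output reproducible, which is handy when checking the generated HTML.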
Since my background is in 'C' and UNIX, if I can't find a way to do something in HTML, I'll often just use one of the command line tools under UNIX or write a small 'C' program to do it. My HTML editing is always in 'vi' (or 'vim' on the PC) and I find that I am often creating tables for alignment of various portions of the HTML page. I got tired of typing all the TABLE, TR, and TD tags, so I created a simple 'C' program called 'table' that I can execute via the '!}' command in 'vi', similar to how you execute the 'fmt' command in 'vi'. It takes as parameters the number of rows & columns to create, whether the column cells are to be split across two lines, how many spaces to indent the tags, and the column widths and generates an appropriately indented TABLE tag structure. Just a simple utility, but saves on the typing.
Instead of typing this:
<TABLE>
<TR>
<TD width=200>
</TD>
<TD width=300>
</TD>
</TR>
<TR>
<TD>
</TD>
<TD>
</TD>
</TR>
<TR>
<TD>
</TD>
<TD>
</TD>
</TR>
</TABLE>
I can type this:
!}table -r 3 -c 2 -split -w 200 300
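The original 'table' utility is a C program whose source is not shown here, so the following is only a guess at the idea in Python. The function name make_table and its parameters are invented for the sketch and do not correspond to the real tool's options.

```python
def make_table(rows, cols, widths=(), indent=2, split=True):
    """Build an indented TABLE/TR/TD skeleton as a string."""
    pad = " " * indent
    lines = ["<TABLE>"]
    for row in range(rows):
        lines.append(pad + "<TR>")
        for col in range(cols):
            # Widths are only emitted on the first row, as in the output above.
            use_width = row == 0 and col < len(widths)
            width = f" width={widths[col]}" if use_width else ""
            if split:
                # Opening and closing TD tags on separate lines.
                lines.append(pad * 2 + f"<TD{width}>")
                lines.append(pad * 2 + "</TD>")
            else:
                lines.append(pad * 2 + f"<TD{width}></TD>")
        lines.append(pad + "</TR>")
    lines.append("</TABLE>")
    return "\n".join(lines)


print(make_table(3, 2, widths=(200, 300)))
```

Wrapped in a small command-line script, something like this could be run as a filter from 'vi' in the same way as the C version.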
Now, with respect to the portion of the original question about being able to create a macro to do HTML links, that is also possible using 'sed' as a pre-processor for the HTML files. Let's say that you wanted to change:
%%lnk(www.stackoverflow.com)
to:
<a href="www.stackoverflow.com">www.stackoverflow.com</a>
you could create this line in the sed script file:
s/%%lnk(\(.*\))/<a href="\1">\1<\/a>/g
'sed' uses regular expressions and they are not what you might call 'pretty', but they are powerful if you know what you are doing.
One slight problem with this example is that it requires the macro to be on a single line (i.e. you cannot split the macro across lines) and if you call the macro multiple times in a single line, you get a result that you might not be expecting. Instead of doing the macro substitution multiple times, it assumes the argument to the macro starts with the first '(' of the first macro invocation and ends with the last ')' of the last macro invocation. I'm not a sed regular expression expert, so I haven't figured out how to fix this yet. For the multiple line portion though, a possible fix would be to replace all the LF characters in the file with some other special character that would not normally be used, run sed on that result, and then convert the special characters back to LF characters. Of course, the problem there is that the entire file would be a single line and if you are invoking the macro, it is going to have the results that I described above. I suspect awk would not have that problem, but I have never had a need to learn awk.
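The multiple-invocation problem comes from the greedy `.*`, which runs to the last ')' on the line. Matching "anything except a closing parenthesis" avoids it; in the sed script that would mean writing `\([^)]*\)` in place of `\(.*\)`. Here is a minimal Python sketch of the same fix, which also copes with a macro call split across lines because it works on the whole text at once. The expand_links name is made up, and the sketch assumes the URL itself contains no ')'.

```python
import re

# [^)]* stops at the first ")", so each macro call is matched separately.
# The pattern is applied to the whole text, so a call may also span lines.
LNK = re.compile(r"%%lnk\(([^)]*)\)")


def expand_links(text):
    """Turn every %%lnk(url) into an HTML link."""
    def replace(match):
        url = match.group(1).strip()  # drop stray spaces/newlines around the URL
        return f'<a href="{url}">{url}</a>'
    return LNK.sub(replace, text)


print(expand_links("See %%lnk(www.example.com) and %%lnk(www.example.org)."))
```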
Upon further reflection, I think there might be an easier solution to both the multi-line and multiple invocation of a macro on a single line -- the 'm4' macro preprocessor, a standard UNIX utility that is often installed alongside the 'C' compiler (e.g. gcc). I haven't tested it much to see what the downside might be, but it seems to work well enough for the tests that I have performed. You would define a macro as such in your pre-HTML file:
define(`LNK', `<a href="$1">$1</a>')
And yeah, it does use the backwards single quote character to start the text string and the normal single quote character to end the text string.
The only problem that I've found so far is that for the macro names, it only allows the characters 'A'-'Z', 'a'-'z', '0'-'9', and '_' (underscore). Since I prefer to type '-' instead of '_', that is a definite disadvantage to me.
Technically inline JavaScript with a <script> tag could do what you are asking. You could even look into the many templating solutions available via JavaScript libraries.
That would not actually provide any benefit, though. JavaScript changes what is ultimately displayed, not the file itself. Since your use case does not change the display it wouldn't actually be useful.
It would be more efficient to consider why &nbsp;&nbsp;&nbsp; is appearing in the first place and fix that.
This …
My html file contains in many places the code &nbsp;&nbsp;&nbsp;
… is actually what is wrong in your file!
&nbsp; is not meant to be used for layout purposes; you should fix that and use CSS instead to lay it out correctly.
&nbsp; is meant to prevent a line break between words that are separated by a space. For example, numbers and their unit: 5 liters can end up with 5 at the end of the line and liters on the next line.
To keep that together you would use 5&nbsp;liters. That's what you use &nbsp; for and nothing else, especially not for layout purposes.
To still answer your question:
HTML is a markup language not a programming language. That means it is descriptive/static and not functional/dynamic. If you try to generate HTML dynamically you would need to use something like PHP or JavaScript.
Just an observation from a novice. If everyone did as purists suggest (i.e.-the right way), then the web would still be using the same coding conventions it was using 30 years ago. People do things, innovate, and create new ways, then new standards, and deprecate others all the time. Just because someone says "spaces are only for separating words...and nothing else" is silly. For many, many years, when people typed letters, they used one space between words, and two spaces between end punctuation and the next sentence. That changed...yeah, things change. There is absolutely nothing wrong with using spaces and non-breaking spaces in ways which assist layout. It is neither useful nor elegant for someone to use a long span with style over and over and over, rather than simple spaces. You can think it is, and your club of do it right folks might even agree. But...although "right", they are also being rather silly about it. Question: Will a page with 3 non-breaking spaces validate? Interesting.
I'm inserting remark.js slides (as MD files) into a jekyll site (hosted on github, and pre-processing done there).
Since remark.js uses three dashes to indicate a next slide, it's important that these three dashes do not get transformed into a horizontal rule '<hr />'.
Is there a way to turn off jekyll preprocessing within an MD file? Or, change the behavior so that --- are not transformed into <hr /> ?
I believe you would need to enter a backslash before the three hyphens, according to this document linked to from Jekyll's website.
Markdown allows you to use backslash escapes to generate literal characters which would otherwise have special meaning in Markdown’s formatting syntax.
But depending on the markdown processor you are using with Jekyll, the escape character could be something other than a backslash, or you might need to escape each hyphen.
This might be an old post, but recently I hit the same issue:
I couldn't escape --- in markdown such that remarkjs can render them as individual slides. In Jekyll 4.2.2, the --- was converted into <hr /> and this was breaking remarkjs.
My solution was to write my content for slides into an .md file and put it under _includes/presentations. I didn't add any --- at the beginning of this file, so it will not be picked up by Kramdown for processing. Then I added a regular .md file in _posts, and in this file I added the previous one as an include between <pre> tags.
Content of the post file is:
---
layout: presentation
title: TDD Workshop Presentation
permalink: /tdd-workshop-presentation/
---
<pre>{% include presentations/tdd-workshop-1.md %}</pre>
Content of presentations/tdd-workshop-1.md:
# TDD
## Test Driven Development Workshop
---
# Agenda
1. Introduction
2. Deep-dive
3. ...
Please mind the new line at the beginning of this file, as that's necessary for the first tag to be rendered properly.
I hope that this helps.
In my Wikipedia page, I have a section called subtitleA. Before arriving at this point when reading, I have one sentence that has a link that jumps to the content of that section.
To be more clear, this is a simple illustration:
To do this, you will need `this` (link to subtitleA).
To do that, you will do another thing..
== SubtitleA ==
this is how you do it....
I found the following solution:
To do this, you will need [http://wikisite.com/pageName#SubtitleA this].
This works; however, one of my subtitles contains spaces, brackets and a directory-like path, like the following:
== SubtitleA (balabalaA\balabalaB\balabala....) ==
I can no longer use the solution I found because of those spaces... Can anyone provide me with an alternative solution? Thanks.
To do this, you will need [[pageName#SubtitleA|this]].
Use the exact same format as in the section title.
Anchor encoding is similar to percent encoding (with a . instead of a %) but not exactly the same (e.g. spaces are collapsed and encoded to _). If you really, really need to do it directly, you can use {{anchorencode:original title}}.
I found the solution:
URL encoding is the key, but without the standard %xx replacements for special characters. Using .xx instead (e.g. .5C, .28) works in the MediaWiki framework.
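To make the .xx scheme concrete, here is a rough Python approximation of it. It is a sketch of the encoding as described in these answers, not MediaWiki's actual implementation; the legacy_anchor name is invented, and a given wiki may be configured to encode fragments differently, so the [[pageName#Section|text]] form remains the safer choice.

```python
from urllib.parse import quote


def legacy_anchor(title):
    """Approximate the '.xx' anchor encoding described above (a sketch)."""
    collapsed = "_".join(title.split())   # collapse whitespace runs into "_"
    encoded = quote(collapsed, safe=":")  # percent-encode as UTF-8, keep ":"
    return encoded.replace("%", ".")      # use "." where URLs would use "%"


print(legacy_anchor("SubtitleA (balabalaA\\balabalaB)"))
```

For the sample title this yields .28 for the opening bracket, .5C for the backslash and .29 for the closing bracket, matching the codes mentioned above.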
I have written a very simple regular expression to search within an HTML document for any tag - as we are modifying 40+ templates that have been edited by a WYSIWYG editor that was horrible. Basically, it added style="font... tags everywhere - so I want to delete them all.
The problem is, some of them have line breaks between the styles (like you would typically write CSS) - and I can't figure out how to include line breaks within my expression.
Here is what I have:
style="font(.*?)"
I am using textmate to search for it, and it works great except for styles that have hard line breaks in them.
Any help???
Use this RegEx: style="font([\s\S]*?)". The . metacharacter does not match \n by default.
Putting (?s) at the front of your regex causes . to match newlines as well.
This is the most straightforward way to do it:
style="font([^"]*)"
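The three suggestions can be compared side by side with a short Python script. The sample HTML is made up, and Python's regex flavor is assumed; TextMate's engine may differ in details such as where inline flags are allowed.

```python
import re

# Made-up sample with a line break inside the style attribute.
html = '<p style="font-family: Arial;\n  font-size: 12px;" class="x">Hi</p>'

original = re.compile(r'style="font(.*?)"')      # "." stops at the newline
lazy_any = re.compile(r'style="font([\s\S]*?)"')  # [\s\S] matches any character
dotall = re.compile(r'(?s)style="font(.*?)"')    # (?s) lets "." match newlines
negated = re.compile(r'style="font([^"]*)"')     # anything up to the next quote

print(original.search(html))  # no match across the line break
for pattern in (lazy_any, dotall, negated):
    print(pattern.sub("", html))
```

All three corrected patterns remove the attribute here; the negated character class does so without any backtracking over the attribute value.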
Besides the fact that it becomes unreadable for humans, are there any downsides when I remove every linebreak and space from the html source code?
Do browsers render it differently then? Will the rendering get faster (or maybe slower)?
There are many already answered questions about minifying HTML. Here are some:
Why minify assets and not the markup?
HTML Minification
How to minify HTML code
You will have a smaller file size, so it may download faster (though it'll be probably unnoticeable). There are tools for this, indeed.
If you remove line breaks there is no harm. But according to your questions
...when I remove every linebreak and space from the html source code?
If you do remove every line break and space, your purpose may not be served. You should only remove extra line breaks and spaces. Also be careful not to alter the value attributes of form fields, or any other attribute for that matter.
Regarding improvements it can offer:
It may render faster because there is less data to parse, but the speedup is very small. I would even discourage it, as it reduces readability and the speedup is on the order of a few hundred CPU clock cycles. The same goes for download: it saves mere bytes of data (unless the document has a lot of white space).
Instead of this, it's better to use GZIP compression for the output on the server side. The following is a line of PHP that enables it. If you have PHP on your server, just rename your *.html file to *.php, then add the following code before any output:
if (isset($_SERVER['HTTP_ACCEPT_ENCODING']) && substr_count($_SERVER['HTTP_ACCEPT_ENCODING'], 'gzip')) ob_start("ob_gzhandler");
You can also do this using the .htaccess file. Google this for more details.
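To get a feel for how whitespace removal compares with GZIP, a quick experiment along these lines can help. The sample markup is invented and deliberately repetitive, so the numbers it prints only illustrate the trend and say nothing about any real page.

```python
import gzip
import re

# Invented, deliberately repetitive sample page with lots of whitespace.
html = ("<ul>\n" + "    <li>   item   </li>\n" * 200 + "</ul>\n").encode("utf-8")

# Crude minification: collapse every run of whitespace into one space.
minified = re.sub(rb"\s+", b" ", html)

for label, data in (("original", html), ("minified", minified)):
    print(label, len(data), "bytes raw,", len(gzip.compress(data)), "bytes gzipped")
```

On input like this, compressing the untouched file already beats minification alone, which is the point made above.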
A bit late but still... By using output_buffering it is as simple as that:
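function compress($string)
{
    // Remove html comments. The pattern is non-greedy and uses the s
    // flag so that multi-line comments match and two separate comments
    // do not swallow the markup between them.
    $string = preg_replace('/<!--.*?-->/s', '', $string);
    // Merge multiple spaces into one space (note that this also
    // affects whitespace inside <pre> and <textarea>)
    $string = preg_replace('/\s+/', ' ', $string);
    // Remove space between tags. Skip the following if
    // you want as it will also remove the space
    // between <span>Hello</span> <span>World</span>.
    return preg_replace('/>\s+</', '><', $string);
}
ob_start('compress');
// Here goes your html.
ob_end_flush();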