Unexpected link behavior after exporting Org-mode file to HTML

Unexpected link behavior after exporting Org-mode file to HTML - html

I am running Org-mode 8.2.6 in Emacs 24.3.1 on Windows 7 Ultimate and have encountered a peculiarity with how links work in HTML exported from Org-mode. I have used org-id extensively to assign unique IDs to headings in Org-mode files (stored in the :PROPERTIES: drawer of the heading).
Before the new exporter framework was introduced in Org-mode 8.0 all of these links worked without issue. No matter the level of the heading's hierarchy, exported-HTML ID-based links worked fine. However, using the new exporter framework produces different results. Now, ID-based links always fail when the target heading lies at a level below the headline level defined in the export settings, which defaults to level 3 (H:3). Note: This is true only for the exported HTML; ID-based links work perfectly within Emacs.
Here is a minimal example that demonstrates this behavior when I export it to HTML (see annotations for details):
* Headline Level 1
** Headline Level 2
*** Headline Level 3
:PROPERTIES:
:ID: 307db49e-e001-4a7b-9541-96eee2ae6f06
:END:
**** <<heading-level-4>>Non-headline level
:PROPERTIES:
:ID: 3be9179d-f838-4052-93ca-6c76c9aff12d
:END:
** Headline Level 2
*** Headline Level 3
Now I want to link to information that appears elsewhere in the file. Links work as
expected within Emacs. When exported to HTML, however, links do not work as they
did before the new exporter framework was introduced in Org-mode 8.0.
**** ID-based link: [[id:307db49e-e001-4a7b-9541-96eee2ae6f06][Headline Level 3]]
This link /does/ work. Using IDs always works for links to any headline level. By
"headline level" I mean any Org-mode heading that is defined as a headline
(default H:3).
**** ID-based link: [[id:3be9179d-f838-4052-93ca-6c76c9aff12d][Non-headline level]]
This link using the ID /doesn't/ work when exported to HTML using the new exporter
framework. Now, using IDs as the target for links /always/ fails for links to any
headline lower than the headline level defined in the export settings.
**** Non-ID-based link: [[heading-level-4][Non-headline level]]
Using an internal link works, but I have /many/ existing files that depend on IDs
for links at heading levels lower than the levels I want treated as (numbered)
headlines, and I also sometimes link to targets in other files, in which case,
using ID's creates a much simpler workflow.
If the file above is named demo-links.org the default output file is demo-links.html. The HTML for the target of the first working link looks like this:
<h4 id="sec-1-1-1"><a id="ID-307db49e-e001-4a7b-9541-96eee2ae6f06" name="ID-307db49e-e001-4a7b-9541-96eee2ae6f06"></a><span class="section-number-4">1.1.1</span> Headline Level 3</h4>
and the HTML for the link to it looks like this:
Headline Level 3
The link ID is part of the target code but isn't used in the linking code.
The HTML for the target of the non-working link looks like this:
<ul class="org-ul"><li><a id="heading-level-4" name="heading-level-4"></a>Non-headline level<br /><div class="outline-text-5" id="text-1-1-1-1"></div></li></ul>
Notice that the code that is generated does NOT contain the ID (3be9179d-f838-4052-93ca-6c76c9aff12d). Neither does it contain a section ID like the previous link did.
The HTML for the link to it looks like this:
Non-headline level
I believe the relevant code in ox-html.el appears following the comment "Links pointing to a headline" but I'm a novice (at best) with elisp.
My question is this: Is this behavior by design, or is there some setting I can alter that will make the export work the way it did before the new export framework was introduced?

I encounter the problem too. What's more, text enclosed between "<<" and ">>" can be showed in the exported HTML in org-7.9 while it can't be showed in org-8.2
With limited reputations, I can't vote the item up, so I just answer it without a correct answer here. (=_=)

Looking at the code of the org-html-headline function, it seems that the "standard headline" case (whatever is exported to hN) is handling the custom IDs specially:
(let* (...
(ids (remove nil
(list (org-element-property :CUSTOM_ID headline)
(concat "sec-" section-number)
(org-element-property :ID headline))))
(preferred-id (car ids)) ;; <- here, id for the header is sec-XXX
(extra-ids (cdr ids))
...)
(format ...
(format "outline-container-%s"
(or (org-element-property :CUSTOM_ID headline)
(concat "sec-" section-number)))
...
(format "\n<h%d id=\"%s\">%s%s</h%d>\n"
level1
preferred-id
(mapconcat
(lambda (x)
(let ((id (org-export-solidify-link-text
(if (org-uuidgen-p x) (concat "ID-" x)
x))))
(org-html--anchor id))) ;; <- generate a list of <a id=ID-XXX></a>
extra-ids "")
full-text
level1)
...))
That's where those <a id="ID-XXX" name="ID-XXX"></a> snippets come from (placeholder anchor, just to expose an additional ID)
Unfortunately, in the case of headlines that translate into list items, there is no such handling of ID/CUSTOM_ID, which quite frankly looks like a bug to me.
So while it is possible to rewrite org-html-headline to do the same treatment for list items, it's unfortunately not as easy as modifying a setting (and code modification would pretty fragile)
I would suggest opening a bug report, after all it seems to be a regression.

Related

how to write a function syntax in mkdocs

what syntax of markdown do I need to write the api like below.
the html code is below:
link
<dl class="function">
<dt id="create_bootstrap_script">
<code class="descname">create_bootstrap_script</code><span class="sig-paren">(</span><em>extra_text</em><span class="sig-paren">)</span><a class="headerlink" href="#create_bootstrap_script" title="Permalink to this definition">¶</a></dt>
<dd><p>Creates a bootstrap script from <code class="docutils literal"><span class="pre">extra_text</span></code>, which is like
this script but with extend_parser, adjust_options, and after_install hooks.</p>
</dd></dl>
<p>This returns a string that (written to disk of course) can be used
as a bootstrap script with your own customizations. The script
will be the standard virtualenv.py script, with your extra text
added (your extra text should be Python code).</p>
<p>If you include these functions, they will be called:</p>
I've tried to use syntax like this, but turn out to be just like but not the same.
Orange(a, b)
: The fruit of an evergreen tree of the genus Citrus.
Every answer will be appreciated.

In the first example the dt is a codespan which has had syntax highlighting applied.
In the second example, the dt only contains plain text. You can get closer by using a code span as well:
`Orange(a, b)`
: The fruit of an evergreen tree of the genus Citrus.
Notice that the first line is wrapped in backticks (`) which makes it a code span. Of course, you still need syntax highlighting to get an exact match, but MkDocs does not currently support syntax highlighting for code spans, only code blocks. Presumably the virtualenv docs were built using Sphinx rather than MkDocs. While MkDocs comes close to matching Sphinx in looks, it does not match feature for feature and this is one of the differences. That said, it may be possible with a custom theme and/or using some custom Markdown Extensions.

How to parse HTML/XML tags according to NOT conditions in [r]

Dearest StackOverflow homies,
I'm playing with HTML that was output by EverNote and need to parse the following:
Note Title
Note anchor (hyperlink identities of the notes themselves)
Note Creation Date
Note Content, and
Intra-notebook hyperlinks (the
links within the content of a note to another note's anchor)
According to examples by Duncan Temple Lang, author of the [r] XML package and a SO answer by #jdharrison, I have been able to parse the Note Title, Note anchor, and Note Creation Dates with relative ease. For those who may be interested, the commands to do so are
require("XML")
rawHTML <- paste(readLines("EverNotebook.html"), collapse="\n") #Yes... this is noob code
doc = htmlTreeParse(rawHTML,useInternalNodes=T)
#Get Note Titles
html.titles<-xpathApply(doc, "//h1", xmlValue)
#Get Note Title Anchors
html.tAnchors<-xpathApply(doc, "//a[#name]", xmlGetAttr, "name")
#Get Note Creation Date
html.Dates<-xpathApply(doc, "//table[#bgcolor]/tr/td/i", xmlValue)
Here's a fiddle of an example HTML EverNote export.
I'm stuck on parsing 1. Note Contents and 2. Intra-notebook hyperlinks.
Taking a closer look at the code it is apparent the solution for the first part is to return every upper-most* div that does NOT include a table with attribute bgcolor="#D4DDE5." How is this accomplished?
Duncan says that it is possible to use XPath to parse XML according to NOT conditions:
"It allows us to express things such as "find me all nodes named a" or "find me all nodes named a that have no attribute named b" or "nodes a that >have an attribute b equal to 'bob'" or "find me all nodes a which have c as >an ancestor node"
However he does not go on to describe how the XML package can parse exclusions... so I'm stuck there.
Addressing the second part, consider the format of anchors to other notes in the same notebook:
<a href="#13178">
The goal with these is to procure their number and yet this is difficult because they are solely distinguished from www links by the # prefix. Information on how to parse for these particular anchors via partial matching of their value (in this case #) is sparse - maybe even requiring grep(). How can one use the XML package to parse for these special hrefs? I describe both problems here since it's possible a solution to the first part may aid the second... but perhaps I'm wrong. Any advice?
UPDATE 1
By upper-most div I intend to say outer-most div. The contents of every note in an EverNote HMTL export are within the DOMs outer-most divs. Thus the interest is to return every outer-most div that does NOT include a table with attribute bgcolor="#D4DDE5."

"....to return every upper-most div that does NOT include a table with attribute bgcolor="#D4DDE5." How is this accomplished?"
One possible way ignoring 'upper-most' as I don't know exactly how would you define it :
//div[not(table[#bgcolor='#D4DDE5'])]
Above XPath reads: select all <div> not having child element <table> with bgcolor attribute equals #D4DDE5.
I'm not sure about what you mean by "parse" in the 2nd part of the question. If you simply want to get all of those links having special href, you can partially match the href attribute using starts-with() or contains() :
//a[starts-with(#href, '#')]
//a[contains(#href, '#')]
UPDATE :
Taking "outer-most" div into consideration :
//div[not(table[#bgcolor='#D4DDE5']) and not(ancestor::div)]
Side note : I don't know exactly how XPath not() is defined, but if it works like negation in general, (this worked as confirmed by OP in the comment below) you can apply one of De Morgan's law :
"not (A or B)" is the same as "(not A) and (not B)".
so that the updated XPath can be slightly simplified to :
//div[not(table[#bgcolor='#D4DDE5'] or ancestor::div)]

Selenium automation- finding best xpath

I am looking to avoid using xpaths that are 'xpath position'. Reason being, the xpath can change and fail an automation test if a new object is on the page and shifts the expected xpath position.
But on some web pages, this is the only xpath I can find. For example, I am looking to click a tab called 'FooBar'.
If I use the Selenium IDE FireFox plugin, I get:
//td[12]/a/font
If I use the FirePath Firefox plugin, I get:
html/body/form/table[2]/tbody/tr/td[12]/font
If a new tab called "Hello, World" is added to the web page (before FooBar tab) then FooBar tab will change and have an xpath position of
//td[13]/a/font
What would you suggest to do?
TY!

Instead of using absolute xpath you could use relateive xpath which is short and more reliable.
Say
<td id="FooBar" name="FooBar">FooBar</td>
By.id("FooBar");
By.name("FooBar");
By.xpath("//td[text()='FooBar']") //exact match
By.xpath("//td[#id='FooBar']") //with any attribute value
By.xpath("//td[contains(text(),'oBar')]") //partial match with contains function
By.xpath("//td[starts-with(text(),'FooB')]") //partial match with startswith function
This blog post may be useful for you.

Relative xpath is good idea. relative css is even better(faster)
If possible suggest/request id for element.
Check also chrome -> check element -> copy css/xpath

Using //td is not a good idea because it will return all your td nodes. Any predicate such as //td[25] will be a very fragile selection because any td added to any previous table will change its result. Using plugins to generate XPath is great to find quickly what you want, but its always best to use it just as a starting point, and then analyze the structure of the file to write a locator that will be harder to break when changes occur.
The best locators are anchored to invariant values or attributes. Plugins usually won't suggest id or attribute anchors. They usually use absolute positional expressions. If can rewrite your locator path in terms of invariant structures in the file, you can then select the elements or text that you want relative to it.
For example, suppose you have
<body> ...
... lots of code....
<h1>header that has a special word</h1>
... other tags and text but not `h1` ...
<table id="some-id">
...
<td>some-invariant-text</td>
<td>other text</td>
<td>the field that you want</td>
...
The table has an ID. That's the best anchor. Now you can select the table as
//table[#id='some-id']
But many times you don't have the id, or even some other invariant attribute. You can still try to discover a pattern. For example: suppose that the last <h1> before the table you want contains a word you can match, you could still find the table using:
//table[preceding::h1[1][contains(.,'word')]]
Once you have the table, you can use relative axes to find the other nodes. Let's assume you want an td but there are no attributes on any tbody, tr, etc. You can still look for some invariant text. Tables usually have headers, or some fixed text which you can match. In the example above, if you find a td that is 2 fields before the one that you want, you could use:
//table[preceding::h1[1][contains(.,'word')]]/td[preceding-sibling::td[2][.='some-invariant-text']]
This is a simple example. If you apply some of these suggestions to the file you are working on, you can improve your XPath expression and make your selection code more robust.

Latex as copyable text not as image

According to https://chemistry.meta.stackexchange.com/a/88, Stack Exchange sites use MathJax to format math equations.
When I looked at the demo page (http://www.mathjax.org/demos/tex-samples/), the source code for the first example is:
\[\begin{aligned}
\dot{x} & = \sigma(y-x) \\
\dot{y} & = \rho x - y - xz \\
\dot{z} & = -\beta z + xy
\end{aligned} \]
Since the result is text, I am assuming that some fancy CSS makes it look nice like that. My question is can someone help me find a way to get that CSS and convert that code to raw HTML that looks the same?

If you are using Firefox, you can install a browser AddOn called "Web Developer" which will give you an added menu bar. One of the commands available from this bar is CSS/Display Style Information. You can then select any element on the page and the styling for the element will be shown in separate window at the bottom of the page. By using this, you can potentially reconstruct from scratch the HTML styling for a particular element or set of elements.

How can one close HTML tags in Vim quickly?

It's been a while since I've had to do any HTML-like code in Vim, but recently I came across this again. Say I'm writing some simple HTML:
<html><head><title>This is a title</title></head></html>
How do I write those closing tags for title, head and html down quickly? I feel like I'm missing some really simple way here that does not involve me going through writing them all down one by one.
Of course I can use CtrlP to autocomplete the individual tag names but what gets me on my laptop keyboard is actually getting the brackets and slash right.

I find using the xmledit plugin pretty useful. it adds two pieces of functionality:
When you open a tag (e.g. type <p>), it expands the tag as soon as you type the closing > into <p></p> and places the cursor inside the tag in insert mode.
If you then immediately type another > (e.g. you type <p>>), it expands that into
<p>
</p>
and places the cursor inside the tag, indented once, in insert mode.
The xml vim plugin adds code folding and nested tag matching to these features.
Of course, you don't have to worry about closing tags at all if you write your HTML content in Markdown and use %! to filter your Vim buffer through the Markdown processor of your choice :)

I like minimal things,
imap ,/ </<C-X><C-O>

I find it more convinient to make vim write both opening and closing tag for me, instead of just the closing one. You can use excellent ragtag plugin by Tim Pope. Usage looks like this (let | mark cursor position)
you type:
span|
press CTRL+x SPACE
and you get
<span>|</span>
You can also use CTRL+x ENTER instead of CTRL+x SPACE, and you get
<span>
|
</span>
Ragtag can do more than just it (eg. insert <%= stuff around this %> or DOCTYPE). You probably want to check out other plugins by author of ragtag, especially surround.

Check this out..
closetag.vim
Functions and mappings to close open HTML/XML tags
https://www.vim.org/scripts/script.php?script_id=13
I use something similar.

If you're doing anything elaborate, sparkup is very good.
An example from their site:
ul > li.item-$*3 expands to:
<ul>
<li class="item-1"></li>
<li class="item-2"></li>
<li class="item-3"></li>
</ul>
with a <C-e>.
To do the example given in the question,
html > head > title{This is a title}
yields
<html>
<head>
<title>This is a title</title>
</head>
</html>

There is also a zencoding vim plugin: https://github.com/mattn/zencoding-vim
tutorial: https://github.com/mattn/zencoding-vim/blob/master/TUTORIAL
Update: this now called Emmet: http://emmet.io/
An excerpt from the tutorial:
1. Expand Abbreviation
Type abbreviation as 'div>p#foo$*3>a' and type '<c-y>,'.
---------------------
<div>
<p id="foo1">
</p>
<p id="foo2">
</p>
<p id="foo3">
</p>
</div>
---------------------
2. Wrap with Abbreviation
Write as below.
---------------------
test1
test2
test3
---------------------
Then do visual select(line wize) and type '<c-y>,'.
If you request 'Tag:', then type 'ul>li*'.
---------------------
<ul>
<li>test1</li>
<li>test2</li>
<li>test3</li>
</ul>
---------------------
...
12. Make anchor from URL
Move cursor to URL
---------------------
http://www.google.com/
---------------------
Type '<c-y>a'
---------------------
Google
---------------------

Mapping
I like to have my block tags (as opposed to inline) closed immediately and with as simple a shortcut as possible (I like to avoid special keys like CTRL where possible, though I do use closetag.vim to close my inline tags.) I like to use this shortcut when starting blocks of tags (thanks to #kimilhee; this is a take-off of his answer):
inoremap ><Tab> ><Esc>F<lyt>o</<C-r>"><Esc>O<Space>
Sample usage
Type—
<p>[Tab]
Result—
<p>
|
</p>
where | indicates cursor position.
Explanation
inoremap means create the mapping in insert mode
><Tab> means a closing angle brackets and a tab character; this is what is matched
><Esc> means end the first tag and escape from insert into normal mode
F< means find the last opening angle bracket
l means move the cursor right one (don't copy the opening angle bracket)
yt> means yank from cursor position to up until before the next closing angle bracket (i.e. copy tags contents)
o</ means start new line in insert mode and add an opening angle bracket and slash
<C-r>" means paste in insert mode from the default register (")
><Esc> means close the closing tag and escape from insert mode
O<Space> means start a new line in insert mode above the cursor and insert a space

Check out vim-closetag
It's a really simple script (also available as a vundle plugin) that closes (X)HTML tags for you. From it's README:
If this is the current content:
<table|
Now you press >, the content will be:
<table>|</table>
And now if you press > again, the content will be:
<table>
|
</table>
Note: | is the cursor here

Here is yet another simple solution based on easily foundable Web writing:
Auto closing an HTML tag
:iabbrev </ </<C-X><C-O>
Turning completion on
autocmd FileType xml set omnifunc=xmlcomplete#CompleteTags

allml (now Ragtag ) and Omni-completion ( <C-X><C-O> )
doesn't work in a file like .py or .java.
if you want to close tag automatically in those file,
you can map like this.
imap <C-j> <ESC>F<lyt>$a</^R">
( ^R is Contrl+R : you can type like this Control+v and then Control+r )
(| is cursor position )
now if you type..
<p>abcde|
and type ^j
then it close the tag like this..
<p>abcde</p>|

Building off of the excellent answer by #KeithPinson (sorry, not enough reputation points to comment on your answer yet), this alternative will prevent the autocomplete from copying anything extra that might be inside the html tag (e.g. classes, ids, etc...) but should not be copied to the closing tag.
UPDATE I have updated my response to work with filename.html.erb files.
I noticed my original response didn't work in files commonly used in Rails views, like some_file.html.erb when I was using embedded ruby (e.g. <p>Year: <%= #year %><p>). The code below will work with .html.erb files.
inoremap ><Tab> ><Esc>?<[a-z]<CR>lyiwo</<C-r>"><Esc>O
Sample usage
Type:
<div class="foo">[Tab]
Result:
<div class="foo">
|
<div>
where | indicates cursor position
And as an example of adding the closing tag inline instead of block style:
inoremap ><Tab> ><Esc>?<[a-z]<CR>lyiwh/[^%]><CR>la</<C-r>"><Esc>F<i
Sample usage
Type:
<div class="foo">[Tab]
Result:
<div class="foo">|<div>
where | indicates cursor position
It's true that both of the above examples rely on >[Tab] to signal a closing tag (meaning you would have to choose either inline or block style). Personally, I use the block-style with >[Tab] and the inline-style with >>.

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

Unexpected link behavior after exporting Org-mode file to HTML - html

I encounter the problem too. What's more, text enclosed between "<<" and ">>" can be showed in the exported HTML in org-7.9 while it can't be showed in org-8.2 With limited reputations, I can't vote the item up, so I just answer it without a correct answer here. (=_=)

Related

how to write a function syntax in mkdocs

How to parse HTML/XML tags according to NOT conditions in [r]

Selenium automation- finding best xpath

Latex as copyable text not as image

How can one close HTML tags in Vim quickly?

Categories

Resources