I'm using UIWebView to display data from my organization's website (public-facing and legal content). However, I only want to pull specific data from the HTML file rather than loading the whole URL. For example, I want to pull just the "News" section of the HTML, keep the user on that page so they can't navigate to other parts of the website (e.g. the home page or "Contact Us"), and still let them view the PDF articles linked from the HTML file.
I've asked around and read up on the DOM and screen scraping, but it seems the data pulled that way is usually stored in a database instead.
Is there any way I can pull just the "News" section of the HTML, along with the PDF URLs, into my customized HTML file and have it update live? For example, every 30 seconds it would refresh and pull information from the website so that the content and the list of PDFs stay up to date (e.g. if three new articles are added to the main website, my customized HTML file would also refresh, pull the information from the website, and update my article list).
If anyone can point me to a specific method that allows live HTML-to-HTML data passing, that would be great and I can do more research on it. I'm currently very lost and confused, as this is my first time doing this. Any help/feedback will be very much appreciated :)
EDIT: For example, Google Maps or Google Search. I don't want to use the whole Google webpage, just the important part that I want, like the search results or the map display.
This will involve quite a lot of learning on your part - you'll have to learn HTML, the DOM, JavaScript, and iOS/UIWebView.
Let's leave the live-refresh part for now; I'll post another answer or edit this one later.
That's not going to be easy either (check out my earlier posting today on background-execution issues that will affect you, unless the update is only to take place in the foreground: iOS Run Code Once a Day).
You will have to do something like the following. Note that I've never tried this, nor seen postings from people who have, but in theory it should work. There will be a lot of learning, as I've said, and lots of trial and error; it's a big task when you're not familiar with these things.
1) Download the HTML page and load it in a UIWebView, but keep that UIWebView hidden so the user can't see it.
2) When the page has loaded, its DOM will be accessible.
3) You can use JavaScript to access the DOM and look for the parts you want.
How you inject and run the JavaScript in a UIWebView can be answered in a separate question (this answer would get too long if all the exact details were included).
4) Remove the parts of the DOM you are not interested in, or use events to make only the parts you are interested in appear; jQuery can probably help here. A sketch follows this list.
5) Display the UIWebView
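For steps 3 and 4, the injected JavaScript might look something like this. This is only a sketch: the id="news" selector is an assumption about the target page's markup, so adjust it to whatever the real page uses.

// Injected into the hidden UIWebView once the page has loaded.
// Assumes the site wraps its news section in an element with id="news".
var news = document.getElementById('news');
if (news) {
    // Throw everything else away so only the News section
    // (with its PDF links intact) remains.
    document.body.innerHTML = '';
    document.body.appendChild(news);
}

On the iOS side, a string like this can be run against the loaded page with UIWebView's stringByEvaluatingJavaScriptFromString: method.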
Alternatively, the HTML could be saved to a file and string parsing used to search for the bits you are looking for and build a new HTML file from them. I think this would get very messy; it's better to take advantage of the fact that UIWebView will parse the HTML page and create the DOM for you.
I want to create an HTML page to be accessed via the Kindle browser. I want to build a puzzle using a form; when the user solves the puzzle, it will simply create a new puzzle. I'm aiming to use cookies to hold the user's progress. The browser can cope with HTML and CSS 3. Can I get a normal web page to redraw itself after the user submits, without going back to the server?
Before I get started on the project I just wanted to see whether it's possible to do it this way. Ideally I'd like to put the HTML, CSS and any data into a MOBI format, but I'm not sure if this is the right place to ask that.
Anyway, thanks for reading.
Mike
I've been searching for an example of an "HTML snapshot" for the Google bot crawler, but I still don't have a clue what the snapshot looks like. From the way I understand it, I figure it's my page's HTML put together into one large string?
Thanks a lot!
You are right, that's exactly what it is. An HTML snapshot is the static HTML code that you want Google to find when crawling your website. Ten years ago, it was pretty much the same thing as the HTML source code. Today, especially in SPAs (single-page applications), the HTML changes without reloading the page. That means there is not always a proper URL associated with every possible HTML state. The snapshot is that kind of generated HTML that you want to present to Google.
That's why you can find products such as https://ajaxsnapshots.com/
It's JavaScript code that takes "pictures" of your HTML pages as they are generated, to make sure that the code fetched by the Google bot is meaningful.
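As an illustration of how a snapshot might be served, here is a minimal sketch using Node/Express. Everything in it is an assumption for the example: the /news route, the snapshot contents, and the app.html file; the _escaped_fragment_ parameter comes from Google's old (now deprecated) AJAX crawling scheme.

// Serve a pre-rendered snapshot to the crawler, the JS app to everyone else.
const express = require('express');
const app = express();

// Hypothetical pre-rendered snapshot for one route.
const newsSnapshot = '<html><body><h1>News</h1><p>Static, crawlable markup</p></body></html>';

app.get('/news', (req, res) => {
    if ('_escaped_fragment_' in req.query) {
        res.send(newsSnapshot);                 // static HTML for the bot
    } else {
        res.sendFile(__dirname + '/app.html');  // normal JavaScript-driven page
    }
});

app.listen(3000);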
I created an API of sorts that, when you navigate to it, returns information as HTML.
On my website, I would like the web page to reach out to the API and display the information as part of the page (sort of like a webpage reaches out for an img). What HTML tag would be best suited to achieving this result? I came across a couple of tags, such as <iframe>, but I'm not really sure which would be best.
I am building this myself and thus have full control over how the content is delivered back to the page. Is there a specific pattern used for such "modular" sourcing of information? I could rewrite my website to reach out to the API and pull the info itself prior to serving the web page, then include the results in the HTML, but (a) this would be more complex and require changes in several places, and (b) it would become really complex as the number of such API-call results I want to include increases.
You can use an iframe for this purpose; when you receive the HTML you want to display, you can simply write it into the iframe, found by its ID:
document.getElementById('myIframe').contentWindow.document.write("<html><body>Here is your html</body></html>");
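If the API endpoint returns a complete HTML document, a simpler variant is to skip document.write and point the iframe at the endpoint directly ('/api/news' is a hypothetical placeholder for your real URL):

// The browser fetches and renders the API's HTML response by itself.
document.getElementById('myIframe').src = '/api/news';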
Hope this helps.
As far as I know, using iframes is rather deprecated. I always use div tags for such tasks.
document.getElementById("targetdiv").innerHTML = "New HTML-Content";
More info on divs: http://www.w3schools.com/tags/tag_div.asp
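To combine this with the API call from the question, you can fetch the HTML and drop it into the div. A minimal sketch: '/api/news' is a hypothetical placeholder, and note this only works cross-origin if the API allows it via CORS.

// Fetch the API's HTML response and inject it into the div.
fetch('/api/news')
    .then(function (response) { return response.text(); })
    .then(function (html) {
        document.getElementById('targetdiv').innerHTML = html;
    });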
Normally you go to a website and, by right-clicking, you can choose to see the source code, or you just use Firebug and select an element you want to analyse. Is it possible to write the source code into the URL so that it wouldn't be shown by right-clicking and choosing "View Source", or by selecting an element?
I'm asking because I've already seen this phenomenon once while using an iPhone simulator in Safari.
Any ideas or hints on what exactly I'm looking for? Your help would be great.
Edit: This was based on wrong information. You can see the source code by right-clicking, but the URL still contains all the information about the site. I'll get back to you as soon as I have more information to write down clearly. Sorry for all the confusion.
Edit: This is the code in the URL containing the information about the site.
data:text/html;charset=utf-8;base64,PCFET0NUWVBFIGh0bWw%2BDQo8aHRtbCBtYW5pZmVzdD0naHR0cDovL25vdm93ZWIubWZ1c2UuY29tL3dlYmFwcC9TcG9ydGluZ2JldC9wb3J0YWwvc3BvcnRpbmdiZXRQb3J0YWwubWFuaWZlc3QnPg0KPGhlYWQ%2BPHRpdGxlPlNwb3J0aW5nYmV0PC90aXRsZT4NCiAgICA8bWV0YSBodHRwLWVxdWl2PSdjb250ZW50LXR5cGUnIGNvbnRlbnQ9J3RleHQvaHRtbDsgY2hhcnNldD11dGYtOCc%2BDQoJPG1ldGEgbmFtZT0ndmlld3BvcnQnIGNvbnRlbnQ9J21heGltdW0tc2NhbGU9MSwgd2lkdGg9ZGV2aWNlLXdpZHRoLCBoZWlnaHQ9ZGV2aWNlLWhlaWdodCwgdXNlci1zY2FsYWJsZT1ubywgbWluaW11bS1zY2FsZT0xLjAnPg0KICAgIDxtZXRhIG5hbWU9J2FwcGxlLW1vYmlsZS13ZWItYXBwLWNhcGFibGUnIGNvbnRlbnQ9J1lFUyc%2BDQogICAgPG1ldGEgbmFtZT0nYXBwbGUtbW9iaWxlLXdlYi1hcHAtc3RhdHVzLWJhci1zdHlsZScgY29udGVudD0nYmxhY2snPg0KICAgIDxzY3JpcHQgdHlwZT0ndGV4dC9qYXZhc2NyaXB0JyBsYW5ndWFnZT0namF2YXNjcmlwdCc%2BDQogIGlmIChkb2N1bWVudC5yZWZlcnJlciA9PSAnJykNCiAgew0KICAgd2luZG93LmxvY2F0aW9uPSdkYXRhOnRleHQvaHRtbDtjaGFyc2V0PXV0Zi04O2Jhc2U2NCxQR2gwYld3JTJCUEdobFlXUSUyQlBHMWxkR0VnYm1GdFpUMG5kbWxsZDNCdmNuUW5JR052Ym5SbGJuUTlKMjFoZUdsdGRXMHRjMk5oYkdVOU1Td2dkMmxrZEdnOVpHVjJhV05sTFhkcFpIUm9MQ0IxYzJWeUxYTmpZV3hoWW14bFBXNXZMQ0J0YVc1cGJYVnRMWE5qWVd4bFBURXVNQ2MlMkJQRzFsZEdFZ2JtRnRaVDBuWVhCd2JHVXRiVzlpYVd4bExYZGxZaTFoY0hBdFkyRndZV0pzWlNjZ1kyOXVkR1Z1ZEQwbldVVlRKejQ4YldWMFlTQnVZVzFsUFNkaGNIQnNaUzF0YjJKcGJHVXRkMlZpTFdGd2NDMXpkR0YwZFhNdFltRnlMWE4wZVd4bEp5QmpiMjUwWlc1MFBTZGliR0ZqYXljJTJCUEUxRlZFRWdhSFIwY0MxbGNYVnBkajBuY21WbWNtVnphQ2NnWTI5dWRHVnVkRDBuTVR0VlVrdzlhSFIwY0hNNkx5OTNaV0poY0hBdWJXWjFjMlV1WTI5dEwxTndiM0owYVc1blltVjBMMmx3YUc5dVpTOXBibVJsZUMxbGJsOUhRaTVvZEcxc1AybGtQVFUzTkRVME1qa3lNRUUwTURRMk1UVXdNVE01TUVaRFFUTTRNREJFTkRnNEpteHZZMkZzWlQxbGJsOUhRaVpoWm1acGJHbGhkR1ZKUkQwblBqd3ZhR1ZoWkQ0OGMzUjViR1UlMkJZbTlrZVh0aVlXTnJaM0p2ZFc1a0xXTnZiRzl5T2lNd01EQTdkR1Y0ZEMxaGJHbG5ianBqWlc1MFpYSTdZMjlzYjNJNkkwWkdSanRtYjI1MExXWmhiV2xzZVRwQmNtbGhiQ3dnU0dWc2RtVjBhV05oTENCellXNXpMWE5sY21sbU8yWnZiblF0YzJsNlpUb3lNSEI0TzMwOEwzTjBlV3hsUGp4aWIyUjVQanh3UG14dllXUnBibWN1TGk0OEwzQSUyQlBDOWliMlI1UGp3dmFIUnRiRDQ9Jw0KCSB9DQogICAgPC9zY3JpcHQ%2BDQogICAgPGxpbmsgcmVsPSdhcHBsZS10b3VjaC1pY29uLXByZWNvbXBvc2VkJyBocmVmPSdodHRwOi8vbm92b3dlYi5tZnVzZS5jb20vd2ViYXBwL1Nwb3J0aW5nYmV0L3BvcnRhbC9JbWFnZXMvV2ViQ2xpcEljb24tZW5fR0IucG5nJz4NCiAgICA8bGluayByZWw9J3N0eWxlc2hlZXQnIHR5cGU9J3RleHQvY3NzJyBocmVmPSdodHRwOi8vbm92b3dlYi5tZnVzZS5jb20vd2ViYXBwL1Nwb3J0aW5nYmV0L3BvcnRhbC9jc3MvbWFpbi1lbl9HQi5jc3MnPg0KICAgIDxsaW5rIHJlbD0nc3R5bGVzaGVldCcgdHlwZT0ndGV4dC9jc3MnIGhyZWY9J2h0dHA6Ly9ub3Zvd2ViLm1mdXNlLmNvbS93ZWJhcHAvU3BvcnRpbmdiZXQvcG9ydGFsL2Nzcy9UcmFuc2l0aW9ucy5jc3MnPg0KCTxzY3JpcHQgdHlwZT0ndGV4dC9qYXZhc2NyaXB0JyBzcmM9J2h0dHA6Ly9ub3Zvd2ViLm1mdXNlLmNvbS93ZWJhcHAvU3BvcnRpbmdiZXQvcG9ydGFsL1BhcnRzL3V0aWxpdGllcy5qcycgY2hhcnNldD0ndXRmLTgnPjwvc2NyaXB0Pg0KICAgIDxzY3JpcHQgdHlwZT0ndGV4dC9qYXZhc2NyaXB0JyBzcmM9J2h0dHA6Ly9ub3Zvd2ViLm1mdXNlLmNvbS93ZWJhcHAvU3BvcnRpbmdiZXQvcG9ydGFsL1BhcnRzL3NldHVwLWVuX0dCLmpzJyBjaGFyc2V0PSd1dGYtOCc%2BPC9zY3JpcHQ%2BDQogICAgPHNjcmlwdCB0eXBlPSd0ZXh0L2phdmFzY3JpcHQnIHNyYz0naHR0cDovL25vdm93ZWIubWZ1c2UuY29tL3dlYmFwcC9TcG9ydGluZ2JldC9wb3J0YWwvbWFpbi5qcycgY2hhcnNldD0ndXRmLTgnPjwvc2NyaXB0Pg0KICAgIDxzY3JpcHQgdHlwZT0ndGV4dC9qYXZhc2NyaXB0JyBzcmM9J2h0dHA6Ly9ub3Zvd2ViLm1mdXNlLmNvbS93ZWJhcHAvU3BvcnRpbmdiZXQvcG9ydGFsL1BhcnRzL1RleHQuanMnIGNoYXJzZXQ9J3V0Zi04Jz48L3NjcmlwdD4NCiAgICA8c2NyaXB0IHR5cGU9J3RleHQvamF2YXNjcmlwdCcgc3JjPSdodHRwOi8vbm92b3dlYi5tZnVzZS5jb20vd2ViYXBwL1Nwb3J0aW5nYmV0L3BvcnRhbC9QYXJ0cy9QdXNoQnV0dG9uLmpzJyBjaGFyc2V0PSd1dGYtOCc%2BPC9zY3JpcHQ%2BDQogICAgPHNjcmlwdCB0eXBlPSd0ZXh0L2phdmFzY3JpcHQnIHNyYz0naHR0cDovL25vdm93ZWIubWZ1c2UuY29tL3dlYmFwcC9TcG9ydGluZ2JldC9wb3J0YWwvUGFydHMvQnV0dG9uSGFuZGxlci5qcycgY2hhcnNldD0ndXR
mLTgnPjwvc2NyaXB0Pg0KICAgIDxzY3JpcHQgdHlwZT0ndGV4dC9qYXZhc2NyaXB0JyBzcmM9J2h0dHA6Ly9ub3Zvd2ViLm1mdXNlLmNvbS93ZWJhcHAvU3BvcnRpbmdiZXQvcG9ydGFsL1BhcnRzL1RyYW5zaXRpb25zLmpzJyBjaGFyc2V0PSd1dGYtOCc%2BPC9zY3JpcHQ%2BDQogICAgPHNjcmlwdCB0eXBlPSd0ZXh0L2phdmFzY3JpcHQnIHNyYz0naHR0cDovL25vdm93ZWIubWZ1c2UuY29tL3dlYmFwcC9TcG9ydGluZ2JldC9wb3J0YWwvUGFydHMvU3RhY2tMYXlvdXQuanMnIGNoYXJzZXQ9J3V0Zi04Jz48L3NjcmlwdD4NCjwvaGVhZD4NCjxib2R5IG9uTG9hZD0nbG9hZCgpOyc%2BDQogICAgPGRpdiBpZD0nc3RhY2tMYXlvdXQnPjxkaXYgaWQ9J3NlbGVjdGlvbi1wYWdlJz4NCiAgICAgICAgICAgIDxkaXYgaWQ9J2xhbmRpbmdwYWdlJz4NCiAgICAgICAgICAgICAgICA8ZGl2IGlkPSdjZW50cmVUb3BCRyc%2BPC9kaXY%2BPGRpdiBpZD0nY2VudHJlQm90dG9tQkcnPjwvZGl2Pg0KICAgICAgICAgICAgICAgIDxkaXYgaWQ9J2xvZ28nPjwvZGl2Pg0KICAgICAgICAgICAgICAgIDxkaXYgaWQ9J2ljb24nPjwvZGl2Pg0KICAgICAgICAgICAgICAgIDxkaXYgaWQ9J2Rpc3BhbHlib3gnPg0KICAgICAgICAgICAgICAgICAgICA8ZGl2IGlkPSd0ZXh0cDEnPjwvZGl2Pg0KICAgICAgICAgICAgICAgICAgICA8ZGl2IGlkPSd0ZXh0cDInPjwvZGl2Pg0KICAgICAgICAgICAgICAgIDwvZGl2Pg0KICAgICAgICAgICAgICAgIDxkaXYgY2xhc3M9J3ZpZXcyJyBpZD0naXBob25lJz48L2Rpdj4NCiAgICAgICAgICAgICAgICA8ZGl2IGNsYXNzPSd2aWV3MicgaWQ9J2Nhc2lubyc%2BPC9kaXY%2BDQogICAgICAgICAgICAgICAgPGRpdiBpZD0nZGlzcGFseWJveDMnPg0KICAgICAgICAgICAgICAgICAgICA8ZGl2IGlkPSd0ZXh0cDUnPjwvZGl2Pg0KICAgICAgICAgICAgICAgICAgICA8ZGl2IGlkPSd0ZXh0cDYnPjwvZGl2Pg0KICAgICAgICAgICAgICAgICAgICA8ZGl2IGlkPSd0ZXh0cDcnPjwvZGl2Pg0KICAgICAgICAgICAgICAgICAgICA8ZGl2IGlkPSd0ZXh0cDgnPjwvZGl2Pg0KICAgICAgICAgICAgICAgICAgICA8ZGl2IGlkPSd0ZXh0cDknPjwvZGl2Pg0KICAgICAgICAgICAgICAgICAgICA8ZGl2IGlkPSd0ZXh0cDEwJz48L2Rpdj4NCiAgICAgICAgICAgICAgICAgICAgPGRpdiBpZD0ndGV4dHAxMSc%2BPC9kaXY%2BDQogICAgICAgICAgICAgICAgICAgIDxkaXYgaWQ9J3RleHRwMTInPjwvZGl2Pg0KICAgICAgICAgICAgICAgICAgICA8ZGl2IGlkPSd0ZXh0cDEzJz48L2Rpdj4NCiAgICAgICAgICAgICAgICA8L2Rpdj4NCiAgICAgICAgICAgIDwvZGl2Pg0KICAgICAgICA8L2Rpdj48ZGl2IGlkPSdpbnN0YWxsLWFwcC1wYWdlJz4NCiAgICAgICAgICAgIDxkaXYgaWQ9J2luc3RhbGwnPg0KICAgICAgICAgICAgICAgIDxkaXYgaWQ9J2NlbnRyZVRvcEJHMSc%2BPC9kaXY%2BPGRpdiBpZD0nY2VudHJlQm90dG9tQkcxJz48L2Rpdj4NCiAgICAgICAgICAgICAgICA8ZGl2IGlkPSdkaXNwYWx5Ym94MSc%2BDQogICAgICAgICAgICAgICAgICAgIDxkaXYgaWQ9J3RleHRwMyc%2BPC9kaXY%2BDQogICAgICAgICAgICAgICAgICAgIDxkaXYgaWQ9J3RleHRwNCc%2BPC9kaXY%2BDQogICAgICAgICAgICAgICAgPC9kaXY%2BDQogICAgICAgICAgICAgICAgPGRpdiBpZD0naWNvbjEnPjwvZGl2Pg0KICAgICAgICAgICAgICAgIDxkaXYgaWQ9J2xvZ28xJz48L2Rpdj4NCiAgICAgICAgICAgICAgICA8ZGl2IGlkPSdidXR0b24yJz48L2Rpdj4NCiAgICAgICAgICAgIDwvZGl2Pg0KICAgICAgICA8L2Rpdj48L2Rpdj4NCjwvYm9keT4NCjwvaHRtbD4=
No, it's not possible to hide a website's source code. The reason for that is simply that the browser needs that code to display the website, so whenever you see a website, you'll always be able to see as much code as is needed to make the website look like that.
You can mangle the code a bit, but as you have said yourself, things like Firebug are able to display the current state of a website, so you'll also be able to see the correct code.
edit
Just a note: even though Safari with an iPhone user agent isn't able to display the source code, that doesn't mean the code is not there or is somehow encrypted into the URL. If you can see the website, the code is there.
I guess it's a bug (or a feature?) that Safari isn't able to display it in iPhone mode (maybe because the iPhone itself isn't able to display the code either).
edit 2
Okay, it indeed set the URL to the following for me:
data:text/html;charset=utf-8;base64,PGh0bWw%2BPGhlYWQ%2BPG1ldGEgbmFtZT0ndmlld3BvcnQnIGNvbnRlbnQ9J21heGltdW0tc2NhbGU9MSwgd2lkdGg9ZGV2aWNlLXdpZHRoLCB1c2VyLXNjYWxhYmxlPW5vLCBtaW5pbXVtLXNjYWxlPTEuMCc%2BPG1ldGEgbmFtZT0nYXBwbGUtbW9iaWxlLXdlYi1hcHAtY2FwYWJsZScgY29udGVudD0nWUVTJz48bWV0YSBuYW1lPSdhcHBsZS1tb2JpbGUtd2ViLWFwcC1zdGF0dXMtYmFyLXN0eWxlJyBjb250ZW50PSdibGFjayc%2BPE1FVEEgaHR0cC1lcXVpdj0ncmVmcmVzaCcgY29udGVudD0nMTtVUkw9aHR0cHM6Ly93ZWJhcHAubWZ1c2UuY29tL1Nwb3J0aW5nYmV0L2lwaG9uZS9pbmRleC1lbl9HQi5odG1sP2lkPTU4NjIwNEE2MEE0MDQ2MTUwMTM5MEZDQTFBQTdGNDFBJmxvY2FsZT1lbl9HQiZhZmZpbGlhdGVJRD0nPjwvaGVhZD48c3R5bGU%2BYm9keXtiYWNrZ3JvdW5kLWNvbG9yOiMwMDA7dGV4dC1hbGlnbjpjZW50ZXI7Y29sb3I6I0ZGRjtmb250LWZhbWlseTpBcmlhbCwgSGVsdmV0aWNhLCBzYW5zLXNlcmlmO2ZvbnQtc2l6ZToyMHB4O308L3N0eWxlPjxib2R5PjxwPmxvYWRpbmcuLi48L3A%2BPC9ib2R5PjwvaHRtbD4=
This, however, just encodes to a loading-and-redirect page that itself redirects to a different webpage with a special session-like parameter. I guess they didn't want to create real server-side sessions for this, so they put the parameter into the redirect page and encoded the whole thing using the data: URI scheme rather than creating a custom page for it. This neither helps the browser (in terms of speed or anything else) nor hides the source code, as you can just decode it again to see the original source.
What you're referring to is the data URI scheme, which allows base64-encoded data to be embedded directly in the URL itself, where normally an http/etc. URL would initiate a new request.
The data URI scheme is a URI scheme that provides a way to include data in-line in web pages as if they were external resources. It tends to be simpler than other inclusion methods, such as MIME with cid or mid URIs.
Read the Wikipedia page for more details: http://en.wikipedia.org/wiki/Data_URI_scheme
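As a quick illustration of how such a URL is produced, you can build one yourself in the browser console. A sketch only: btoa handles Latin-1 strings, so pages containing other characters need extra UTF-8 handling.

// Build a data: URI from an HTML string and navigate to it.
var html = '<html><body><p>loading...</p></body></html>';
var uri = 'data:text/html;charset=utf-8;base64,' + btoa(html);
window.location = uri;
// Nothing is hidden: atob(uri.split(',')[1]) gives the source right back.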
I don't know what you're trying to achieve, but if you want to hide the source code because "anybody can steal my code": that isn't possible. The source code has to get to the browser in some way so the browser can display it, and if the code is on the client machine (in the browser), there will always be a way to grab it.
Even if you restrict right-clicking or viewing the source, it is impossible to hide it from everybody. Also, placing it in the URL would be bad, very bad (I can't even imagine it).
The HTML is needed for the browser to render the UI; you can't hide it.
You could compress and obfuscate the JavaScript, though, to make it difficult to read and understand. But that's evil :)
Internet Explorer has a URL limit of 2,083 characters, so you would have to compress the content and pray it still fits in the URL after being base64 encoded; then you can use JavaScript to decode it. It would also be extremely difficult to update your pages or allow bookmarking, and it could let users exploit the system.
Chances are nobody will want your source code anyway, and if they did, it wouldn't affect you one little bit. Facebook shows its source, and I don't see its popularity dropping. So just stick with serving your pages the normal way.
1. The length of a URL is limited, so you couldn't fit a whole page into it even if it were possible.
2. Once a thing has been displayed on a client machine, the code cannot be protected.
(Well, disabling right-click with JavaScript might repel a few noobs, but it is still fairly easy to grab the code.)
What I mean by autolinking is the process by which wiki links inlined in page content are turned into either a hyperlink to the page (if it exists) or a create link (if the page doesn't exist).
With the parser I am using, this is a two-step process: first, the page content is parsed and all of the links to wiki pages are extracted from the source markup. Then I feed an array of the existing pages back to the parser before the final HTML markup is generated.
What is the best way to handle this process? It seems as if I need to keep a cached list of every single page on the site rather than extracting the index of page titles each time. Or is it better to check each link separately to see if it exists? That might result in a lot of database lookups if the list weren't cached. Would this still be viable for a larger wiki site with thousands of pages?
In my own wiki I check all the links (without caching), but my wiki is only used by a few people internally. You should benchmark stuff like this.
In my own wiki system the caching is pretty simple: when a page is updated, it checks the links to make sure they are valid and applies the correct formatting/location to those that aren't. The cached page is saved as an HTML page in my cache root.
Pages that are marked as 'not created' during the page update are inserted into a table in the database that holds the page title together with a CSV of the pages that link to it.
When someone creates that page, it initiates a scan through each linking page, re-caching it with the correct link and formatting.
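Here is a rough sketch of that bookkeeping in JavaScript. Everything in it is hypothetical: 'db' stands for any helper exposing run(sql, params) and get(sql, params, callback), and the missing_pages table (a title plus a CSV of linking pages) is made up for the example.

// Hypothetical schema: missing_pages(title TEXT PRIMARY KEY, linked_from TEXT)

function recordMissingLink(db, missingTitle, linkingPage) {
    // Called while re-caching a page whose link target doesn't exist yet.
    // (A real version would append to linked_from if the row already exists.)
    db.run("INSERT INTO missing_pages (title, linked_from) VALUES (?, ?)",
           [missingTitle, linkingPage]);
}

function onPageCreated(db, title, recachePage) {
    // The page now exists: rebuild every page that linked to it, then clean up.
    db.get("SELECT linked_from FROM missing_pages WHERE title = ?", [title],
        function (err, row) {
            if (row) {
                row.linked_from.split(',').forEach(recachePage);
                db.run("DELETE FROM missing_pages WHERE title = ?", [title]);
            }
        });
}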
If you aren't interested in highlighting non-created pages, however, you could just check whether the page exists when someone attempts to access it and, if not, redirect to the creation page. Then just link to pages as normal in other articles.
I tried to do this once and it was a nightmare! My solution was a nasty loop in a SQL procedure, and I don't recommend it.
One thing that gave me trouble was deciding which link to use for a multi-word phrase. Say you had some text saying "I am using Stack Overflow" and your wiki had three pages called "stack", "overflow" and "stack overflow"... which part of your phrase gets linked to where? It will happen!
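A common way around that ambiguity is to try the longest titles first, so "stack overflow" wins over "stack". A sketch in JavaScript, with the titles and text as example data:

// Prefer the longest matching title: sort titles by length, descending,
// then build one alternation regex so longer titles are tried first.
var titles = ['stack', 'overflow', 'stack overflow'];
var pattern = titles
    .sort(function (a, b) { return b.length - a.length; })
    .map(function (t) { return t.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'); })
    .join('|');
var linked = 'I am using Stack Overflow'.replace(
    new RegExp(pattern, 'gi'),
    function (match) {
        return '<a href="/wiki/' + encodeURIComponent(match) + '">' + match + '</a>';
    });
// -> I am using <a href="/wiki/Stack%20Overflow">Stack Overflow</a>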
My idea would be to query the titles, e.g. SELECT title FROM articles, and simply check whether each wikilink is in that array of strings. If it is, you link to the page; if not, you link to the create page.
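As a sketch of that approach (the [[...]] link syntax and the sample data are assumptions; the query follows the example above):

// Load every title once, then each wikilink check is an O(1) set lookup.
var existing = new Set(['stack overflow', 'home']); // from SELECT title FROM articles
var html = 'See [[Stack Overflow]] and [[Missing Page]].'.replace(
    /\[\[([^\]]+)\]\]/g,
    function (wholeMatch, title) {
        var href = existing.has(title.toLowerCase())
            ? '/wiki/' + encodeURIComponent(title)    // page exists: view link
            : '/create/' + encodeURIComponent(title); // missing: create link
        return '<a href="' + href + '">' + title + '</a>';
    });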
In a personal project I made with Sinatra (link text), after running the content through Markdown I do a gsub to replace wiki words and other things (like [[Here is my link]] and whatnot) with proper links, checking for each whether the page exists and linking to the create or view page accordingly.
It's not the best, but I didn't build this app with caching/speed in mind. It's a low-resource, simple wiki.
If speed were more important, you could wrap the app in something that caches it. For example, Sinatra can be wrapped with Rack caching.
Based on my experience developing Juli, which is an offline personal wiki with autolinking, generating static HTML may fix your issue.
As you suspect, it takes a long time to generate an autolinked wiki page. With static HTML generation, however, regenerating autolinked pages only happens when a wiki page is newly added or deleted (in other words, it doesn't happen on ordinary page updates), and the regeneration can run in the background, so it usually doesn't matter how long it takes. Users only ever see the generated static HTML.