Tag pages: session variables or many static pages?

I have a website which contains many news articles. In the database, each article has so-called "tags", which the user sees displayed alongside the article. When the user clicks on a tag, they are directed to a list of other articles that also contain this tag.
Should I generate a distinct HTML page for each newly created tag, or should I create a single page and vary its content based on which tag the user clicked, using session variables?
Obviously, the pages will not be completely static, since I will update them every time a new article with a matching tag is uploaded.

You certainly shouldn't use session data. That is for data that needs to persist but is set on a per-user basis. Using it for per-request data will just break bookmarking and introduce race conditions.
You should have a distinct URI for each tag. From an end user's perspective, it doesn't matter whether you use dynamically generated content (via a query string, or by parsing the URI in your server-side code; most frameworks, e.g. Dancer, will handle this for you) or pre-generated static pages.
Static pages make it easier to handle caching and give better performance on very high traffic systems, but tend to require a rebuild of large sections of the site if content changes. You can get similar performance improvements by using server side caching (e.g. via memcached).
Dynamic pages are usually simpler to implement.
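To make the dynamic option concrete, here is a minimal PHP sketch; the script name, table layout, and column names are assumptions for illustration, not part of the question. Each tag gets its own URL, such as tagpage.php?tag=jquery, and the list is built from the database on each request:
<?php
// tagpage.php?tag=jquery -- one script, but a distinct URL per tag.
$tag = $_GET['tag'] ?? '';

$db = new PDO('mysql:host=localhost;dbname=news', 'user', 'pass');
$stmt = $db->prepare(
    'SELECT a.id, a.title FROM articles a
     JOIN article_tags t ON t.article_id = a.id
     WHERE t.tag = ? ORDER BY a.published DESC'
);
$stmt->execute([$tag]);

echo '<h1>Articles tagged "' . htmlspecialchars($tag) . '"</h1><ul>';
foreach ($stmt as $row) {
    echo '<li><a href="article.php?id=' . (int)$row['id'] . '">'
       . htmlspecialchars($row['title']) . '</a></li>';
}
echo '</ul>';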

I suggest you create a listing page that contains the title and a short description of every article with a particular tag, similar to WordPress.
For example, here is the listing page for the tag jQuery:
http://sarfraznawaz.wordpress.com/tag/jquery/

I would create one page and then rewrite the URL so that it references the tag page, something like this:
Tag element == New
tagpage.aspx
http://www.yourwebsite.com/New.aspx
This gives you a single page in which to maintain the content, while still allowing each tag page to be indexed by Google.
I'm not sure what language you are using, but I would look up URL rewriting.
Here's a link for rewriting in Apache:
http://httpd.apache.org/docs/2.0/misc/rewriteguide.html
Here's a link for rewriting in ASP.NET:
http://msdn.microsoft.com/en-us/library/ms972974.aspx
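If you happen to be on PHP rather than ASP.NET, the same single-page idea can be sketched with a small front controller that parses the path itself (the file and path names here are invented, and the tagpage.php it hands off to is the hypothetical script sketched earlier); you would still need the web server to send requests to this script, which is what the rewriting guides above cover:
<?php
// index.php -- route a clean URL like /tag/New to the single tag page.
$path = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH);

if (preg_match('#^/tag/([\w-]+)$#', $path, $m)) {
    $_GET['tag'] = $m[1];             // hand the tag name to the shared page
    require __DIR__ . '/tagpage.php';
} else {
    http_response_code(404);
    echo 'Not found';
}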

Related

An HTML tag for displaying HTML received from a specific URL

I created an API of sorts that, when you navigate to it, returns information in HTML.
On my website, I would like the web page to reach out to the API and display the information as part of the page (sort of like a web page reaches out for an img). What HTML tag would be best suited to achieving this result? I came across a couple of tags but am not really sure which would be best.
I am building this myself, so I have full control over how the content is delivered back to the page. Is there a specific pattern used for such "modular" sourcing of information? I could rewrite my website to reach out to the API before serving the web page, pull the info itself, and include the results in the HTML, but (a) this would be more complex and require changes in several places, and (b) it would become really complex as the number of such API call results I want to include increases.
You can use an iframe for this purpose. When you receive the HTML you want to display, you can simply write it into the iframe's document:
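// contentWindow.document is the iframe's own document; write() replaces its contents with the given HTML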
document.getElementById('myIframe').contentWindow.document.write("<html><body>Here is your html</body></html>");
Hope this helps.
As far as I know, using iframes is rather deprecated. I always use div tags for such tasks.
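// innerHTML replaces whatever is currently inside the element with id "targetdiv"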
document.getElementById("targetdiv").innerHTML = "New HTML-Content";
More info on divs: http://www.w3schools.com/tags/tag_div.asp

Show different Main Pages based on host name in MediaWiki

I have two domains pointing to the same wiki sharing the same database.
I would like it so that with domainA.com the main page is MainPageA and with domainB.com it is MainPageB.
The only way to change the main page of MediaWiki that I know of is to edit MediaWiki:Mainpage, but that is stored in the MySQL database. Since both wikis share the same database, changing it changes the main page for both.
The database is shared because all articles apply to both wikis; only the logo of the wiki etc. is different.
Is there some kind of PHP conditional I can use to set the Main Page?
You could do this in wikicode, by making your Main Page source look something like this:
{{#switch:{{SERVERNAME}}
|domainA.com={{:Main Page for domainA.com}}
|domainB.com={{:Main Page for domainB.com}}
|#default=<span class=error>Unrecognized domain {{SERVERNAME}}.</span>
}}
or even just:
{{:Main Page for {{SERVERNAME}}}}
For more information, see Help:Magic words at mediawiki.org. (Note that the first version also requires the ParserFunctions extension.)
PS. There might be some issues with MediaWiki's parser caching that could cause the wrong Main Page to appear. If so, a quick and dirty workaround would be to install the MagicNoCache extension and add __NOCACHE__ to the Main Page.
PPS. A better solution for cache issues might be to make sure that the different sites have separate cache keys, by adding the following line to your LocalSettings.php:
$wgRenderHashAppend .= "!$wgServer";
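Since the question also asks about a PHP conditional, here is a minimal sketch of how the per-domain settings could look in LocalSettings.php; the site names and logo file names are invented, and which variables you branch on depends on what actually differs between the two sites:
// In LocalSettings.php: branch on the requested host name.
if ( isset( $_SERVER['HTTP_HOST'] ) && $_SERVER['HTTP_HOST'] === 'domainB.com' ) {
    $wgServer   = 'http://domainB.com';
    $wgSitename = 'Wiki B';                    // per-site name (invented)
    $wgLogo     = "$wgScriptPath/logoB.png";   // per-site logo (invented)
} else {
    $wgServer   = 'http://domainA.com';
    $wgSitename = 'Wiki A';
    $wgLogo     = "$wgScriptPath/logoA.png";
}
// Give each site its own parser cache key, as suggested above.
$wgRenderHashAppend .= "!$wgServer";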

What does "dynamic" actually mean?

I keep hearing, especially here on StackOverflow, about people generating webpage content "dynamically." Does this mean generating content anytime after design time, or only on the client side, or some other definition?
In other words, as it relates to web development, what is the definition of "dynamic"?
This means that you are generating HTML through code, e.g. PHP, Python, etc. Instead of hosting static HTML pages, you generate HTML that reflects the current state of your site/DB.
As with any popular word, people use it to mean many different things.
Original definition: static web pages were just files that the server read off the disk and served verbatim. Dynamic pages included code, such as PHP, that was interpreted by the server and replaced with specially-tailored information before it was sent to the user.
Static pages don't really exist anymore. Any site you care about will be "dynamic" in some form. As a result, the term got recycled to mean any number of things:
A page that rearranges its DOM and/or CSS after it has been received from the server. This is usually accomplished with Javascript, and may involve hiding/showing different parts of the page or displaying them in different ways. For example, a tabbed interface that displays different pieces of the page depending on which tab the user clicks on.
A page that requests new information from the server with AJAX requests and displays it using a method similar to #1. For example, user clicks on "More..." next to an article stub and the entire article is loaded and displayed without the need for a full page refresh.
Anything that requires the server to do more than just transmit a file from its hard disk.
It refers to the possibility of generating complete web pages from content that was not known or available at the time the "scaffolding" for the web pages was created.
A dynamic web page may give you new information on each view. A static web page always has the same information on it, whereas a dynamic web page's contents can change depending on specific variables, like which user is logged in, etc.
Values that are not hard-coded into the code that forms the website. The values can come from a number of sources, including databases whose content is created by users, data scraped from other websites, or any number of other places.
Static content does not change between requests; dynamic content may change depending on time, request parameters, etc. Static content is usually stored in files (HTML, CSS, images, scripts, etc.), while dynamic content is generated. Generation usually involves two parts: a page template, which contains the page markup in a special format with placeholders for the dynamic parts, and data obtained from external sources such as a database or web service. An application combines the template with the data to produce the final HTML (or other content) returned in the response.
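As a tiny illustration of that template-plus-data combination in PHP (the placeholder syntax and the values are invented for the example):
<?php
// A page template with placeholders for the dynamic parts...
$template = '<h1>{title}</h1><p>Hello, {user}! The time is {time}.</p>';

// ...and data obtained per request (e.g. from a session or database).
$data = [
    '{title}' => 'Welcome',
    '{user}'  => 'Alice',        // e.g. the logged-in user's name
    '{time}'  => date('H:i'),
];

// Combining template and data yields the final HTML sent in the response.
echo strtr($template, $data);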
Dynamic content, by definition, changes with time and person. Your Gmail data is different from mine (person). Both of us receive emails regularly (time), at least.
A dynamic web page is one that is prepared with fresh information (content and/or layout) for each individual viewing. It is not static because it changes with the time (e.g. news content), the user (e.g. preferences in a login session), the user's interaction (e.g. a web page game), the context (parametric customization), or all of them.
Ajax combines client-side and server-side dynamic data.
Dynamically has been used to mean:
1. Content or a result generated on the fly, not ahead of time; the generation follows some kind of process where a script or function is invoked.
2. Re-calculated, not cached.
3. Using some kind of lookup (as in the case of dynamic methods in an object).
4. Not static.

Is there any way of making JSON data readable by a Google spider?

Is it possible to make JSON data readable by a Google spider?
Say, for instance, that I have a JSON feed that contains the data for an e-commerce site. This JSON data is used to populate a human-readable page in the user's browser. (I.e. the translation from JSON data to the displayed page is done inside the user's browser; not my choice, just what I've been given to work with: it's an old legacy CGI application, not an actual server-side scripting language.)
My concern is that the Google spiders will not be able to pick up and link directly to the item in question, so when a user clicks on a result in Google they will be presented with an index page full of all the items rather than being taken directly to the item they clicked on.
Is there any way of "informing" the Google spider, in the JSON, that it should feed the user a different link?
While Google does crawl and index JavaScript in some circumstances, it's still best to serve "normal" (X)HTML content if at all possible. In this case, it would help to know the rest of the site's setup, in particular: is the JSON content just used to create a feed of links to the product pages (with static content), or are all product pages also generated from JSON feeds? If the feed is only used to point to the actual product pages (which are static), then one way to make the product pages discoverable could be to create an HTML sitemap page or some other alternate form of navigation. An XML Sitemap file can also help, but I would recommend not using it as the sole way of making the product pages discoverable.
If all of the content is only accessible through JSON feeds, then I think you will have to make some bigger changes if you want that content to be accessible through search results.
One way to handle it could also be to use the new JavaScript crawling/indexing proposal, which basically would result in a headless browser being set up between your site and Google: http://code.google.com/web/ajaxcrawling/ (whether setting this up or revamping the rest of the site is easier is hard to say :-))
You should make a wrapper page in server-side code around the JSON data, and respond to requests with either the wrapper or the regular version depending on the User-Agent.
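A rough sketch of that wrapper idea in PHP; the feed URL, the index.cgi location, and the field names in the feed are all assumptions for illustration:
<?php
// wrapper.php: crawlers get a plain HTML rendering of the feed;
// regular browsers are sent on to the existing JSON-driven page.
$ua = $_SERVER['HTTP_USER_AGENT'] ?? '';

if (stripos($ua, 'Googlebot') === false) {
    header('Location: /index.cgi');   // the existing client-side page
    exit;
}

// Render the feed as plain, crawlable HTML.
$items = json_decode(file_get_contents('http://example.com/feed.json'), true);
echo '<ul>';
foreach ($items as $item) {
    echo '<li><a href="' . htmlspecialchars($item['url']) . '">'
       . htmlspecialchars($item['name']) . '</a></li>';
}
echo '</ul>';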

How should I handle autolinking in wiki page content?

What I mean by autolinking is the process by which wiki links inlined in page content are turned into either a hyperlink to the page (if it exists) or a create link (if the page doesn't exist).
With the parser I am using, this is a two step process - first, the page content is parsed and all of the links to wiki pages from the source markup are extracted. Then, I feed an array of the existing pages back to the parser, before the final HTML markup is generated.
What is the best way to handle this process? It seems as if I need to keep a cached list of every single page on the site, rather than having to extract the index of page titles each time. Or is it better to check each link separately to see if it exists? This might result in a lot of database lookups if the list wasn't cached. Would this still be viable for a larger wiki site with thousands of pages?
In my own wiki I check all the links (without caching), but my wiki is only used by a few people internally. You should benchmark stuff like this.
In my own wiki system the caching is pretty simple: when a page is updated, it checks its links to make sure they are valid and applies the correct formatting/location for those that aren't. The cached page is saved as an HTML page in my cache root.
Pages that are marked as 'not created' during the page update are inserted into a database table that holds the page name along with a CSV list of the pages that link to it.
When someone creates that page, a scan goes through each linking page and re-caches it with the correct link and formatting.
If you aren't interested in highlighting non-created pages, however, you could just check whether a page exists when someone attempts to access it, and if not, redirect to the creation page. Then just link to pages as normal in other articles.
I tried to do this once and it was a nightmare! My solution was a nasty loop in a SQL procedure, and I don't recommend it.
One thing that gave me trouble was deciding which link to use for a multi-word phrase. Say you had some text saying "I am using Stack Overflow" and your wiki had three pages called "stack", "overflow" and "stack overflow": which part of your phrase gets linked to where? It will happen!
My idea would be to query the titles with something like SELECT title FROM articles and simply check whether each wikilink is in that array of strings. If it is, you link to the page; if not, you link to the create page.
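A minimal PHP sketch of that approach (the table name, the [[...]] link syntax, and the URL scheme are assumptions): load the titles once, then do in-memory lookups instead of one database query per link.
<?php
// Load every title once, then check membership with set lookups.
$db = new PDO('mysql:host=localhost;dbname=wiki', 'user', 'pass');
$titles = array_flip($db->query('SELECT title FROM articles')->fetchAll(PDO::FETCH_COLUMN));

$pageMarkup = file_get_contents('page.txt');   // the page's source markup (assumed)

// Replace [[Some Page]] with a view link or a create link.
$html = preg_replace_callback('/\[\[([^\]]+)\]\]/', function ($m) use ($titles) {
    $title = trim($m[1]);
    $slug  = rawurlencode($title);
    return isset($titles[$title])
        ? '<a href="/wiki/' . $slug . '">' . htmlspecialchars($title) . '</a>'
        : '<a class="new" href="/create/' . $slug . '">' . htmlspecialchars($title) . '</a>';
}, $pageMarkup);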
In a personal project I made with Sinatra, after I run the content through Markdown I do a gsub to replace wiki words and other things (like [[Here is my link]] and whatnot) with proper links, checking for each one whether the page exists and linking to the create or view page accordingly.
It's not the best, but I didn't build this app with caching/speed in mind. It's a low-resource, simple wiki.
If speed were more important, you could wrap the app in something to cache it. For example, Sinatra can be wrapped with Rack caching.
Based on my experience developing Juli, which is an offline personal wiki with autolinking, a static HTML generation approach may fix your issue.
As you suspect, it takes a long time to generate an autolinked wiki page. However, with static HTML generation, regenerating the autolinked pages only happens when a wiki page is newly added or deleted (in other words, it doesn't happen when a page is merely updated), and the regeneration can run in the background, so it usually doesn't matter how long it takes. The user only ever sees the generated static HTML.