Creating a category hierarchy in a MediaWiki environment

Background:
I work with a large collaboration which centralizes a lot of documentation in a wiki structure. I have passing familiarity with wiki-markup and can create simple pages with links, etc.
One major deficiency of my collaboration's wiki (which runs on MediaWiki) is that there is very little organization or cross-linkage.
I'm attempting to introduce a hierarchical category structure to the wiki, such that pages are broken down into categories, providing a means of interlinking information.
I know that I can add a [[Category:THISCATEGORY]] tag to any page source, and a special category page which organizes other pages with that category is automatically generated. The major advantage of this method of linking pages is that one gets access to related pages for free (so long as they are tagged), without requiring direct cross linkage between pages explicitly.
Question:
I'm wondering, is there an efficient way to create a root-category node of some kind, which instead of linking to other pages, links to all categories? This would allow the wiki to be effectively cross-linked without major overhauls, and would only require that a page author provide some general category tags for any additional pages they might wish to add.

You can use the special page Special:Categories to show the list of all categories on a wiki.
But if you want a better structure, I think you should also create a hierarchical structure from your categories (like Wikipedia does, starting with Category:Contents). That way, your users will be able to navigate not just between articles in the same category; they will also be able to get to similar categories.

I'm not sure if this is as automatic as you'd like, but you can add a category tag to the page for a category, and it will become a subcategory. For example, go to the page "Category:Foo", edit it, and add [[Category:Bar]]. When you then visit Category:Bar, it will list Foo as a subcategory.
For example, see this page on wikipedia, which has this category as a subcategory, which itself has subcategories.
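To make the relationship concrete, here is a minimal Python sketch that models how category tags in page source give rise to a tree. It is not MediaWiki's implementation: the `pages` dictionary, the regular expression, and the function names are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Matches [[Category:Name]] and [[Category:Name|sort key]] in wikitext.
CATEGORY_TAG = re.compile(
    r"\[\[\s*Category\s*:\s*([^\]|]+?)\s*(?:\|[^\]]*)?\]\]", re.IGNORECASE
)


def categories_of(wikitext):
    """Return the category names tagged in a page's source."""
    return [name.strip() for name in CATEGORY_TAG.findall(wikitext)]


def build_tree(pages):
    """Map each category to the pages and subcategories tagged with it."""
    tree = defaultdict(lambda: {"subcategories": [], "pages": []})
    for title, source in pages.items():
        # A category page that carries a tag becomes a subcategory.
        kind = "subcategories" if title.startswith("Category:") else "pages"
        for parent in categories_of(source):
            tree[parent][kind].append(title)
    return dict(tree)


# Hypothetical wiki content: page title -> page source.
pages = {
    "Category:Foo": "Pages about Foo. [[Category:Bar]]",
    "Detector manual": "How to run the detector. [[Category:Foo]]",
    "Meeting notes": "Weekly notes. [[Category:Bar|Notes]]",
}

tree = build_tree(pages)
print(tree["Bar"])
```

Here Category:Foo ends up listed under Bar as a subcategory, alongside the ordinary page "Meeting notes", which mirrors what the Category:Bar page would show.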

Once your category tree has grown, you can use Special:UncategorizedCategories to make sure every category (apart from the root) belongs to at least one parent category. Similarly, you can use Special:UncategorizedPages to make sure all of your pages live in at least one category.
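The check those special pages perform can be pictured with a short Python sketch. This is an illustration under assumed data, not how MediaWiki computes the reports; `parents_by_title` and the function name are hypothetical.

```python
def uncategorized(parents_by_title):
    """Split titles that carry no category tag into categories and pages."""
    orphan_categories, orphan_pages = [], []
    for title, parents in sorted(parents_by_title.items()):
        if parents:
            continue  # tagged with at least one category, nothing to report
        if title.startswith("Category:"):
            orphan_categories.append(title)
        else:
            orphan_pages.append(title)
    return orphan_categories, orphan_pages


# Hypothetical data: title -> categories that title is tagged with.
parents_by_title = {
    "Category:Contents": [],            # the intended root
    "Category:Hardware": ["Contents"],
    "Category:Software": [],            # forgotten: should sit under the root
    "Detector manual": ["Hardware"],
    "Scratch page": [],
}

orphan_categories, orphan_pages = uncategorized(parents_by_title)
print(orphan_categories)
print(orphan_pages)
```

Only the root should remain in the first list; anything else there (Category:Software in this example) still needs a parent category.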

Related

Beginning html/css designer - how can I add tags to posts that people can use to sort content?

I'm working on a site to help students with ACT prep, and I want to have a page where I can post explanations to questions that people submit. I want to be able to put a few tags on each post so that site visitors can click on or search whatever's relevant for them in the archives ("semicolons", "geometry", etc.) and all the relevant posts will come up, blog style. I'm very new to this, though, and I don't know how to do it or even what to search - when I search for tags I keep getting SEO recommendations, and that doesn't seem like the right thing.

Here's a solution (but it's not great)
It might be the only way to make what you want happen with a static HTML site.
You could, by hand, create pages that you fill with links to all of the posts that fit a certain category or "tag". For example, you could make a page that has links to all of your posts concerning geometry. Let's call this your archive page for geometry.
Then, when you include tags in a post, you would make each tag link to its corresponding archive page.
Why do I say it's not the best solution?
Virtually every blog that you see has a "back end" with a database that stores posts. When someone comes to your website and looks at a post, that post's data is inserted into a template and displayed to the user. You do not have to rewrite the entire web page every time. Things like the header, sidebar, footer, main page background, etc. are all in a template.
Having a database also lets you search the database and return relevant results. And a blog with a back end will typically let you write rules (or have them already written) that say, when you add a "tag" to a post, a link to that post should be automatically added to an archive page etc.
As far as I can tell, you don't have a database, so you'll just be linking static HTML pages. That means that every time you make a new post, you'll have to add a link to each of its relevant archive pages by hand. Maybe you don't mind that now, but eventually it will be a nightmare to maintain.
I would strongly encourage you to look into a blogging platform like WordPress to make your site. It will be more complicated to learn at first, but technology that's meant to do what you want it to do will ultimately be easier to use and maintain than technology that's simply meant to mark up a page.
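To show what "a rule that adds a link to the archive page automatically" amounts to, here is a minimal Python sketch that builds one archive page per tag from a list of posts. It is not how WordPress works internally; the post data, URLs, and function names are made up for the example.

```python
from collections import defaultdict
from html import escape


def build_tag_index(posts):
    """Group posts by tag: tag -> list of posts carrying that tag."""
    index = defaultdict(list)
    for post in posts:
        for tag in post["tags"]:
            index[tag].append(post)
    return dict(index)


def render_archive(tag, tagged_posts):
    """Render a bare-bones HTML archive page for one tag."""
    items = "\n".join(
        f'  <li><a href="{escape(p["url"])}">{escape(p["title"])}</a></li>'
        for p in tagged_posts
    )
    return f"<h1>Posts tagged {escape(tag)}</h1>\n<ul>\n{items}\n</ul>\n"


# Hypothetical posts; in a real blog these would come from the database.
posts = [
    {"title": "Semicolons vs. commas", "url": "posts/semicolons.html",
     "tags": ["semicolons", "grammar"]},
    {"title": "Area of a triangle", "url": "posts/triangle.html",
     "tags": ["geometry"]},
    {"title": "Circles on the ACT", "url": "posts/circles.html",
     "tags": ["geometry"]},
]

index = build_tag_index(posts)
# Archive file name -> HTML; writing these to disk is left out of the sketch.
archives = {
    f"tags/{tag}.html": render_archive(tag, tagged)
    for tag, tagged in index.items()
}
print(archives["tags/geometry.html"])
```

Adding a post to the list and rerunning the script regenerates every affected archive page, which is the chore a blogging platform takes off your hands.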

Scraper: distinguishing meaningful text from meaningless items, hadoop

I'm trying to build a crawler and scraper in Apache Nutch to find all the pages containing a section talking about a particular word-topic (e.g. "election","elections", "vote", etc).
Once I have crawled, Nutch strips stop words and tags from the HTML, but it doesn't remove the menu items (which appear on every page of the website).
So when you look for all the pages talking about elections, you could retrieve a whole website just because it has the word "elections" in its menu and therefore on every page.
I was wondering whether there are techniques that analyze multiple pages of a website to work out the main template of a page. Pointers to useful papers, implementations, or libraries would be welcome.
I was thinking about creating some kind of Hadoop job that analyzes similarities between multiple pages to extract a template. But the same website could have multiple templates, so it is hard to think of an effective way to do that.
E.G.
WEBPage 1:
MENU HOME VOTE ELECTION NEWS
meaningful text... elections ....
WebPage 2:
MENU HOME VOTE ELECTION NEWS
meaningful text... talking about swimming pools ....
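The idea of comparing pages to find the shared template can be illustrated with a toy Python sketch: treat any line that occurs on most pages of a site as template text and drop it before searching. This is only a line-frequency heuristic under assumed inputs, not what Nutch or Tika does, and it would not cope with several templates on one site; the function name and the 0.8 threshold are arbitrary choices.

```python
from collections import Counter


def strip_shared_lines(pages, threshold=0.8):
    """Drop lines that occur on at least `threshold` of a site's pages."""
    line_counts = Counter()
    for text in pages.values():
        # Count each distinct non-empty line once per page.
        line_counts.update(
            {line.strip() for line in text.splitlines() if line.strip()}
        )
    cutoff = threshold * len(pages)
    return {
        url: "\n".join(
            line for line in text.splitlines()
            if line.strip() and line_counts[line.strip()] < cutoff
        )
        for url, text in pages.items()
    }


# The two example pages from above, keyed by a made-up identifier.
pages = {
    "page1": "MENU HOME VOTE ELECTION NEWS\nmeaningful text... elections ....",
    "page2": "MENU HOME VOTE ELECTION NEWS\n"
             "meaningful text... talking about swimming pools ....",
}

cleaned = strip_shared_lines(pages)
matches = [url for url, text in cleaned.items() if "election" in text.lower()]
print(matches)
```

With the menu line removed, only the first page still matches "election". The heuristic needs several pages per site to be meaningful, since with a single page every line counts as shared.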

You didn't mention which branch of Nutch (1.x/2.x) you are using, but at the moment I can think of a couple of approaches:
Take a look at NUTCH-585, which will be helpful if you are not crawling many different sites and you can specify which nodes of your HTML content you want to exclude from the indexed content.
If you're working with many different sites and the previous approach is not feasible, take a look at NUTCH-961, which uses the boilerplate-removal feature inside Apache Tika to guess which text matters in your HTML content. The library provides several extractors; try them and see what works for you. In my experience, I've had some issues with news sites that had a lot of comments, where some of the comments ended up being indexed along with the main article content, but it was a minor issue after all. In any case, this approach could work very well in a lot of cases.
You can also take a peek at NUTCH-1870, which lets you specify XPath expressions to extract specific parts of the webpage as separate fields. Using this with the right boost parameters in Solr could improve your precision.

Using a list of dynamic links throughout website

By "dynamic links", I mean a list of links that will constantly be updated.
To illustrate my question, I have a website that I am constantly writing new articles for. I currently have about 10 articles. If someone is to read article #5, there is a list of links to all 10 articles in the right panel of the page. As I update the site, and article #1 becomes out of date, I'd like to replace article #1 with article #11. Rather than updating the links within every article (so 10 times), is there a way to update the links once and have them all update simultaneously to every page?? Could I create an iframe for this??
Thanks for any and all help!

What's your goal? Do you want to learn to be a web developer? Or are you mostly concerned with getting your articles published?
If you want to be a web developer, I'd recommend steering clear of large CMS systems like WordPress or Drupal. Those are great products, but you want to learn the basics first. I think starting with a PHP tutorial is the way to go.
If you just want to publish your articles, I'd recommend you find a nice place to create a blog. There are so many to choose from. It all depends on how much you want to spend.
Feel free to ask follow-up questions. Web development sounds simple, but it's really a complex topic. I can't imagine what it must be like starting out these days with so many choices and competing technologies.

One way to do it would be to use server-side includes (SSI). They work like this:
<!--#include file="some-content.html" -->
or
<!--#include virtual="/some-folder/some-content.html" -->
The difference is that file="" takes a path relative to the directory of the current page, whereas virtual="" takes a URL path: on Apache, a path with a leading slash is resolved from the domain root, and one without is treated as relative to the current page. Either way, this method can use any type of regular text file as a source. The actual addition of the content is done by the server (hence the name), so its contents will be parsed as regular HTML and all CSS will apply to it as if the file were part of your page. I don't know about compatibility with different hosts, but if your web server supports it, this is probably the easiest way to go.
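To picture what the server does with such a directive, here is a small Python emulation of the file="" form only. It is a sketch for illustration, not Apache's implementation; the file names (`links.html`, `article-5.html`) and the function name are invented.

```python
import re
import tempfile
from pathlib import Path

# Only the file="..." form of the directive is handled in this sketch.
INCLUDE = re.compile(r'<!--#include\s+file="([^"]+)"\s*-->')


def expand_includes(page_path):
    """Return the page's HTML with each file include replaced by that file."""
    page_path = Path(page_path)

    def replace(match):
        # file="" paths are resolved against the including page's directory.
        included = page_path.parent / match.group(1)
        return included.read_text(encoding="utf-8")

    return INCLUDE.sub(replace, page_path.read_text(encoding="utf-8"))


with tempfile.TemporaryDirectory() as site:
    root = Path(site)
    # One shared list of links, edited in a single place.
    (root / "links.html").write_text(
        '<ul><li><a href="article-11.html">Article 11</a></li></ul>',
        encoding="utf-8",
    )
    # Every article only carries the include directive.
    (root / "article-5.html").write_text(
        '<aside><!--#include file="links.html" --></aside>',
        encoding="utf-8",
    )
    expanded = expand_includes(root / "article-5.html")

print(expanded)
```

Because each article holds only the directive, replacing article #1 with article #11 means editing the shared links file once.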

jQuery Mobile -- how to lay out my pages

I have an app that I'm building which is for completing a work order. The main page shows the details of the work order (site address, what needs to be done, etc.) and then there are a couple of listviews which show product and labour for that work order. At the start, there is no labour nor product attached to the work order, so these listviews are empty. When the employee is finished the work order, they can click Add Labour or Add Product to, well, add labour or product to the work order to reflect what work they've done.
I should point out that the main work order, the labour items, and the products are all distinct records in a database, all connected by the work order's primary key.
My question is about how I've laid this out. Currently, I have 3 distinct pages - one for the main work order, another for adding/editing labour and another for adding/editing product. When I say 'distinct page' I don't mean the <div data-role="page">, I mean, clicking "Add Labour" takes the user to a different website URL entirely.
I'm starting to question my logic in this design. Should I have all three (main, labour & product) on the same page, separated only by the <div data-role="page">? Then, when the user adds labour, it just takes them to that 'page' and, upon hitting "Save Labour" the main div (page) is dynamically updated (the labour listview gets an item added to it).
Not entirely sure how I should have this built - I'm new to jQuery Mobile.
Thanks

First let me tell you, you are not doing anything wrong.
There are 2 common ways of creating jQuery Mobile applications.
Single HTML / multiple pages
Multiple HTML files
Each way has a few good and a few bad sides to it. From my experience, people usually choose the multiple HTML files solution, which is to be expected. Single HTML / multiple pages is a new paradigm created by the jQuery Mobile developers, and it will take some time for people to accept it as a normal, common approach.
Because you didn't say what kind of app you are creating, I can't advise you on which approach is best. Usually, if you are creating a classic web page, it is best to use the multiple HTML solution. If you are creating a PhoneGap hybrid mobile app, it is best to use a single HTML file with multiple pages (this will make sure page transitions are smooth).
If you have more questions feel free to ask.

can tags be replacement of taxonomy?

My question is about usability. On most of the sites I have seen and developed, taxonomy is the way a user finds what he is looking for. But quite recently I have come across the concept of tagging, where products, services, or questions are tagged and can be found by tag name. Is tagging an alternative to taxonomy, or should they work together?

I'd say that like most things, it depends on what kind of information you're trying to organize.
For example, here on Stack Overflow, there isn't really a rigid hierarchy by which to sort the questions. They're much more organic in the sense that they can span multiple, and even unrelated, disciplines or fields and create a whole host of dynamic connections. For organizing this type of information, I think tags are an appropriate replacement for traditional, hierarchical taxonification. The decentralized, dehierarchized nature of tagging dovetails perfectly with the general organization of the site's content, especially when the site's users/community is encouraged to participate in cataloguing and organizing the information. Many blogs and social networking sites like Delicious organize their content with a series of tags as well.
Conversely, if you're trying to sell products or provide technical support, you'll probably find that tagging is not a suitable replacement for traditional taxonomic organization. If you're familiar with MSDN, which provides online documentation for developers in the Microsoft ecosystem, you'll observe that most of its content is organized into a natural hierarchy by technology/language, feature, sub-feature, etc. If you want to buy a computer from Dell, you start by narrowing down your choices: do you want a desktop, notebook, or tablet? Do you want a performance-oriented notebook, a desktop-replacement notebook, or an ultra-portable? Etc. Of course, that doesn't mean that you shouldn't consider implementing tags as an alternative way for users to explore the information that you have available, but in the best of cases, they will work together.
Think about the type of content you plan to host on your site and consider the most natural way to organize that information. Your users will appreciate more than anything a site that is intuitive and where they feel it is easy to locate exactly what they're looking for.

This is a topic I have always found interesting, and I reduce it to this question:
In order to find something, is it better to have a hierarchical taxonomy or a flat tag-based taxonomy (possibly a collaborative one, i.e. a folksonomy)?
Well, there's no single answer: depending on the search context, sometimes the former is more convenient and sometimes the latter is.
The best thing would be to have both kinds of taxonomy, but that could be difficult to manage, particularly if the content is created by people and so the classification is up to them.
One solution could be to have tag inheritance, as in the Drupal taxonomy system.
So, for instance, when you want to classify a picture of your dog, you just have to select the tag 'dogs' and your picture will automatically belong to the tags 'dogs' --> 'animals' --> 'living beings' and so on.
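A minimal Python sketch of tag inheritance, assuming each tag has at most one parent. This is not Drupal's API; the `PARENT` mapping, the sample items, and the function names are invented for the example.

```python
# Hypothetical tag hierarchy: tag -> its parent tag.
PARENT = {
    "dogs": "animals",
    "cats": "animals",
    "animals": "living beings",
}


def expand_tag(tag, parent=PARENT):
    """Return the tag followed by all of its ancestors, nearest first."""
    chain = [tag]
    while chain[-1] in parent:
        ancestor = parent[chain[-1]]
        if ancestor in chain:
            break  # guard against an accidental cycle in the hierarchy
        chain.append(ancestor)
    return chain


def items_under(tag, items, parent=PARENT):
    """Return items tagged with `tag` directly or through inheritance."""
    return [
        name for name, tags in items.items()
        if any(tag in expand_tag(t, parent) for t in tags)
    ]


# Hypothetical content: item -> tags chosen by the author.
items = {"rex.jpg": ["dogs"], "oak.jpg": ["trees"]}

print(expand_tag("dogs"))
print(items_under("animals", items))
```

The author only picks 'dogs', yet the picture is also found when browsing 'animals' or 'living beings', which combines flat tagging with a hierarchy.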

This question comes down to how people think:
Sure, it is better if you can find something by a tagged word. But if you don't know the exact word/tag, you will not be able to find it. Others may have tagged the thing you are searching for with a similar but different tag. In that case a (binary) tag search will not give you the correct (or complete) answer.
Still, it is possible to extract a taxonomy out of tags (as long as the words/tags are related). This concept, combined with a vector-oriented search, can be presented to the user and will help him find what he needs.

Although I'd just upvote Cody's answer (I did), I would also like to add something:
The field of usability used to be within the realm of ergonomics before it grew up. So I think it is appropriate to refer to one of ergonomics' core principles.
Every person has a unique set of dimensions, so there is no single set of “correct dimensions” for e.g. a chair. The best dimensions are adjustable dimensions that provide a reasonable range of variability.
It is possible to apply this principle to website navigation as well and provide multiple ways of reaching the same content, so that people with different habits can find stuff using the way they are most comfortable with.
