I administer my own company internal wiki using MediaWiki. I like MediaWiki because many people are already familiar with it from using Wikipedia. Also, it was a joy to configure and I didn't run into many issues, even though I'm not that familiar with PHP. (So I'm not necessarily looking for another solution, like DokuWiki...)
My requirement is that the opening page be a listing of all pages, broken down alphabetically by category - much like a Table of Contents for the entire wiki. It would look like this (on the "Main Page"):
Category 1
    Page A
    Page B
    Page C
Category 2
    Page E
    Page N
    Page X
    Page Z
Category 3
    Page Q
    Page V
Each page has a category assigned to it. I know about the Special:Categories page, but that only shows the categories, and one must drill down (follow the link) to see the pages within each category - so I cannot see pages from multiple categories at once.
I have seen Extension:Hierarchy, but it does not fit my needs because its "Table of Contents" has to be edited by hand rather than being auto-generated from the "parent" or "category" declared on each page itself.
Is there already existing functionality for this for MediaWiki? (I understand that as the wiki grows, so too will this Table of Contents page, but that is okay.)
Alternatively, I know about the MediaWiki API. I can create a server-side process that:
Does a MySQL lookup for all pages and their categories
Sorts them
Uses the MediaWiki API to generate this Table of Contents on the Main Page
And I can run this process periodically. I am up for the challenge, because I am a programmer and it is an interesting exercise, but why reinvent the wheel if I don't have to?
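To make the idea concrete, here is a rough sketch of that periodic job (Node.js is used only as an illustration; the wiki URL is a placeholder, continuation of result sets beyond 500 items is omitted, and the final write to the Main Page needs a logged-in session and CSRF token, so it is only indicated in a comment):

// Rough sketch of the periodic TOC job, not a finished bot.
const api = 'https://wiki.example.com/api.php'; // placeholder wiki URL

async function apiGet(params) {
  const query = new URLSearchParams({ format: 'json', ...params });
  const response = await fetch(`${api}?${query}`);
  return response.json();
}

async function buildToc() {
  // 1. List all categories (continuation beyond 500 omitted for brevity).
  const catData = await apiGet({ action: 'query', list: 'allcategories', aclimit: '500' });
  const categories = catData.query.allcategories.map((c) => c['*']).sort();

  // 2. For each category, list its main-namespace pages alphabetically.
  let wikitext = '';
  for (const cat of categories) {
    const memberData = await apiGet({
      action: 'query',
      list: 'categorymembers',
      cmtitle: `Category:${cat}`,
      cmnamespace: '0',
      cmlimit: '500',
    });
    const pages = memberData.query.categorymembers.map((m) => m.title).sort();
    wikitext += `== ${cat} ==\n`;
    for (const page of pages) wikitext += `* [[${page}]]\n`;
  }

  // 3. Save the wikitext to the Main Page with action=edit; that call needs a
  //    login session and a CSRF token from action=query&meta=tokens.
  return wikitext;
}

buildToc().then((text) => console.log(text));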
CategoryTree is an option. One challenge here is that MediaWiki categories are not hierarchical; in other words, you can have category loops (A>B>C>A). Also, one article can show up in any number of categories, and articles can have no category at all. The only thing that has to be done manually is to put <categorytree>Category Name</categorytree> on the home "Table of Contents" page for each category. Given that new categories are not likely to pop up often, this will not be a terrible issue. However, one solution for this inconvenience is to put all your (top-level) categories into Category:Categories and then display that category via the extension (see the depth and hideroot parameters).
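As a wikitext sketch (category names are placeholders, and the exact accepted values for mode, depth and hideroot should be checked against Extension:CategoryTree's documentation), the "Table of Contents" page could either carry one tag per category:

<categorytree mode="pages">Category 1</categorytree>
<categorytree mode="pages">Category 2</categorytree>

or, with every top-level category gathered into Category:Categories, a single tag:

<categorytree mode="pages" depth="2" hideroot="on">Categories</categorytree>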
wikistats is hard to use, but it produces an HTML representation of the category structure from an XML dump; see e.g. the MediaWiki.org categories.
CatGraph is another analysis tool, seemingly even more complex (though I've not tried setting it up for a wiki of mine, unlike wikistats).
Related
Hello Stack Overflow Community!
I am making a directory of many thousands of custom mods for a game using HTML tables. When I started this project, I thought one HTML page would be slow, but adequate for the ~4k files I was expecting. As I progressed, I realized there are tens of thousands of files I need to have in these tables, and I want to let the user search through them to find what they are missing to load up a new scenario. Each entry has about 20 text fields and a small image (~3KB). I only need to be able to search through one column.
I'm thinking of dividing the tables across several pages on my website to help loading speeds and improve overall organization. But then a user would have to navigate to each page, and perform a search there. This could take a while and be very cumbersome.
I'm not great at website programming. Can someone advise a way to allow the user to search through several web pages and tables from one location? Ideally this would jump to the location in the table on the new webpage, or maybe highlight the entry like the browser's search function does.
You can see my current setup here : https://www.loco-dat-directory.site/
Hopefully someone can point me in the right direction, as I'm quite confused now :-)
These would be my steps:
Copy all my info into an Excel spreadsheet, convert that to JSON, then make that an array for JavaScript (myarray). Then make an input field and, on click, an if statement checking whether input == myarray[0].propertyName (roughly as sketched below).
If you want something more than an exact match, you'd need https://lodash.com/ in your project.
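A minimal sketch of that approach (the element ids, the mods.json file name, and the name property are all placeholders to adjust):

// Rough sketch, not production code. Assumes the page already has
// <input id="search"> and <ul id="results">, and that mods.json is the
// spreadsheet exported to JSON as an array of objects with a "name"
// property; change the property to whichever column you search on.
fetch('mods.json')
  .then((response) => response.json())
  .then((mods) => {
    const input = document.getElementById('search');
    const results = document.getElementById('results');

    input.addEventListener('input', () => {
      const term = input.value.trim().toLowerCase();
      results.innerHTML = '';
      if (!term) return;

      mods
        .filter((mod) => mod.name.toLowerCase().includes(term))
        .slice(0, 50) // cap the list so typing stays responsive
        .forEach((mod) => {
          const li = document.createElement('li');
          li.textContent = mod.name;
          results.appendChild(li);
        });
    });
  });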
Hacky Solution
There is a browser tool called TableCapture that captures data from HTML tables and loads it into Excel/spreadsheets, where you are basically deferring to spreadsheet software to manage the searching.
You would have to see if:
This type of tool would solve your problem - maybe you can pull each HTML page's contents manually, merge those pages into a document with multiple "sheets", and then let people download the "spreadsheet" from your website.
If you do not take on the labor above and instead tell other people to do it, you'd have to see whether you can teach them how to perform the search and use this method on their own, e.g. "download this plugin, use it on these pages, search".
Why your question is difficult to answer
The reason it will be hard for people on stackoverflow.com (where answers are usually code solutions) to answer you is that, in my opinion, you need a more complicated solution than hard-coded tables and HTML/CSS/JavaScript.
This type of situation is exactly why people use databases and APIs that accept requests for information ("term": "something") and deliver responses ("results": [...]).
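For instance, a minimal sketch of that kind of search endpoint, using Node.js with Express purely as an illustration (the route, field names, and in-memory array are placeholders standing in for a real database):

const express = require('express');

const app = express();

// Placeholder data standing in for a real database table of mods.
const mods = [
  { name: 'Example Mod A', author: 'someone' },
  { name: 'Example Mod B', author: 'someone else' },
];

// GET /search?term=example  ->  { "results": [ ... ] }
app.get('/search', (req, res) => {
  const term = String(req.query.term || '').toLowerCase();
  const results = mods.filter((mod) => mod.name.toLowerCase().includes(term));
  res.json({ results });
});

app.listen(3000);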
Thank you everyone for your great advice. I wasn't aware most of these potential solutions existed, and it was good to see how other people were tackling problems of similar scope.
I've decided to go with DataTables for its built-in sorting and filtering: https://datatables.net/
I'm also going to use a JavaScript array with an input field on the main page to allow users to search for which pack their mod is in. This will lead them to separate pages on my site, each with a unique DataTable for a mod pack. Separate pages will load much quicker than one gigantic page trying to show everything.
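For reference, the DataTables setup on each pack page can be as small as the sketch below (the table id and columns here are made up, and the jQuery and DataTables assets from https://datatables.net/ are assumed to already be included on the page):

<table id="mod-table">
  <thead><tr><th>Name</th><th>Author</th></tr></thead>
  <tbody>
    <tr><td>Example Mod A</td><td>someone</td></tr>
  </tbody>
</table>

<script>
  // One call enables sorting, paging, and the built-in filter box.
  $('#mod-table').DataTable();
</script>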
I'm trying to get a "category tree" from Wikipedia for a project I'm working on. The problem is that I only want the more common topics and fields of study, so the larger dumps I've been able to find include way too many peripheral articles.
I recently found the vital articles pages, which seem to be a collection of exactly what I'm looking for. Unfortunately, I don't really know how to extract the information from those pages or how to filter the larger dumps down to only those categories and articles.
To be explicit, my question is: given a vital article level (say level 4), how can I extract the tree of categories and article names for a given list (e.g. People, Arts, Physical sciences, etc.) into a CSV or similar file that I can then import into another program? I don't need the actual content of the articles, just the names (and ideally a reference to each article so I can get more information at a later point).
I'm also open to suggestions about how to better accomplish this task.
Thanks!
Have you used PetScan? It's a Wikimedia-based tool that allows you to extract data from pages based on some conditions.
You can achieve your goal by going to the tool, navigating to the "Templates&links" tab, and typing the page name in the "Linked from All of these pages:" field, e.g. Wikipedia:Vital_articles/Level/4/History. If you want to add more than one page to the textarea, just type them line by line.
Finally, press the "Do it!" button and the data will be generated. After that, you can download the data from the "Output" tab.
I am querying all revision histories for each Wikipedia page. I downloaded the wiki dump containing the list of page titles in the main namespace from https://dumps.wikimedia.org/enwiktionary/20170320/
However, it seems like there are more than 12,000,000 titles in the dump I downloaded, which is way more than what Wikipedia reports (https://en.wikipedia.org/wiki/Wikipedia:Size_comparisons). Can anyone tell me what is going on? Am I using the correct dump?
The reason I am asking is that it looks like it will take a few hundred days to get all revision histories if I query the history by providing article titles. So if there is any better way to extract revision histories, that would be very helpful too.
First of all, that is a dump of pages from Wiktionary, not Wikipedia; Wikipedia's dump id is enwiki. However, even with the right dump, making the counts match takes some effort:
Some pages are redirects (the sketch after this list shows one way to skip them when enumerating titles via the API)
Some pages aren't counted as valid content pages and thus are excluded from the official statistics. To be considered valid, a page should contain at least one internal link.
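As a quick illustration of the redirect point (not an official recipe; it walks the whole title list, so it is slow for a wiki the size of the English Wikipedia, and it does not apply the internal-link rule), the API can enumerate main-namespace titles while skipping redirects:

const endpoint = 'https://en.wikipedia.org/w/api.php';

async function countNonRedirectTitles() {
  let count = 0;
  let apcontinue;

  do {
    const params = new URLSearchParams({
      action: 'query',
      list: 'allpages',
      apnamespace: '0',              // main namespace only
      apfilterredir: 'nonredirects', // skip redirect pages
      aplimit: '500',
      format: 'json',
      origin: '*',                   // allow anonymous cross-origin requests
    });
    if (apcontinue) params.set('apcontinue', apcontinue);

    const data = await (await fetch(`${endpoint}?${params}`)).json();
    count += data.query.allpages.length;
    apcontinue = data.continue && data.continue.apcontinue;
  } while (apcontinue);

  return count;
}

countNonRedirectTitles().then((n) => console.log(`non-redirect titles: ${n}`));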
I want to know whether I should have one HTML file per URL (home, register, login, contact, and so on; I have more than 50), or whether I should consolidate them into something like 5 files and serve them through ?id=1,2,3,4,5,6 etc.
I want to know which method is more convenient. As I understand it, the second method would have to load the whole file, which would be slower than loading a single page.
On the other hand, having a separate file per page means more requests to and from the server, and the HTML files as a whole will be heavier because I have to write a head and include all the assets in each one of them.
In my past experience, I make sure that any component with distinct functionality is placed in its own file. I would consider distinct functionality to be the examples that you listed above (home, register, login, contact, etc.). On the other hand, if you are managing blog posts (or something similar), I would definitely use GET requests (i.e. ?page=1,2,3).
I have also maintained websites with about 50-100 different pages, but they used a content management system. If you feel overwhelmed, this could also be a possibility to explore.
If you choose not to use a CMS, I would recommend using partial files. A good example of a partial would be a header or footer. By using partials, you no longer need to replicate the same code on multiple pages (say goodbye to creating 50 navbars).
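Partials are usually handled server-side (for example PHP includes or a templating engine), but if the site stays as plain static HTML, a small client-side include can illustrate the idea; the header.html file and the placeholder div id below are assumptions:

<div id="shared-header"></div>

<script>
  // Pull the shared header into every page that has the placeholder div,
  // so the navbar markup is written once instead of 50 times.
  fetch('header.html')
    .then((response) => response.text())
    .then((html) => {
      document.getElementById('shared-header').innerHTML = html;
    });
</script>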
I'd like to create a button on a menu bar that generates a link to a random article from my blog posts (much like Wikipedia has). It's for a client, and they'd like to have this functionality on the site. I'm not familiar with PHP, so I'd like to find a way around that, especially since I don't have access to the root user on my server host's MySQL installation (if this is relevant).
I had a theoretical solution: have a .txt or .xml file containing a list of the URLs of all the posts, with a "key" assigned to each of them. Then, when the user clicks the random article button, the current time (e.g. 1:45) is hashed and mapped to a specific URL. I am fairly new to Drupal, however, and I was wondering whether there is some way to have the random article button use a .c file to execute these steps. The site is being hosted on a server that uses Apache 2, and I have looked through some modules that were implemented in C code. I'm pretty new to all of this (although proficient in C), and I have spent many fruitless hours searching for solutions.
In a pure Drupal fashion (I don't know if you are interested in this kind of solution), you could create a view (as a block) which retrieves blog posts, uses a random sort criterion, and limits results to 1 item. Then configure this view to display fields and add only one field, the post title, checking "Link to content" in that field's settings window. You'll get one random blog post title rendered as a link to that blog post.
Finally, in Structure -> Block, assign your new block to a region to see it.
It's a pure Drupal / Views / no-code-just-clicks :) way, but it will be far more maintainable and easier to set up than introducing C for such a simple feature.
Views module
Let me know if you try this and have problems configuring your view or anything else.
Good luck