MediaWiki: only allow a certain number of page creations?

Is it possible, and if so how, to allow users to create only a certain number of pages? For example, when they sign up, only allow them to create one page.

Off the top of my head, the following approach ought to work:
First, hook into the userCan hook for the "edit" action and check whether the page exists (i.e. $title->exists()); if it doesn't, consult a stored count (see the next step). If the decision is to disallow creation, set $result to false and return false to stop further hooks from overriding the decision.
Second, hook into the ArticleInsertComplete hook and update that stored count to reflect that the user ($user) has created another page.
The decision in the first step can be extended with additional logic to support multiple policies in conjunction with, e.g., automatic rights assignment: for example, allowing users to create more than one page once they reach "autoconfirmed" status, or skipping the check entirely for administrators or other users with specific rights.
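Something along these lines ought to do it. It's an untested sketch written against the classic hook signatures; the "createdpages" user option used as the counter is just an illustration (registered via $wgDefaultUserOptions so it gets saved), and you could equally keep the count in its own table. It can live in LocalSettings.php or a small extension file:
$wgDefaultUserOptions['createdpages'] = 0;   // register the (made-up) counter option
$wgHooks['userCan'][] = 'limitPageCreations';
$wgHooks['ArticleInsertComplete'][] = 'countPageCreation';
function limitPageCreations( $title, $user, $action, &$result ) {
	// Only intervene when an "edit" would create a page that does not exist yet.
	if ( $action !== 'edit' || $title->exists() ) {
		return true;
	}
	// Example policy exemption: let sysops create as many pages as they like.
	if ( in_array( 'sysop', $user->getGroups() ) ) {
		return true;
	}
	$created = (int)$user->getOption( 'createdpages', 0 );
	if ( $created >= 1 ) {   // policy: one page per user
		$result = false;
		return false;        // stop further hooks from overriding the decision
	}
	return true;
}
function countPageCreation( $article, $user ) {
	$created = (int)$user->getOption( 'createdpages', 0 );
	$user->setOption( 'createdpages', $created + 1 );
	$user->saveSettings();
	return true;
}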

I don't know exactly what you want to do, but I suspect you want to let them create only a user page or something like that? It might be simpler to just restrict their editing to a certain namespace until they are added to some special group.
Some more info might help us answer the questions better. Do you also want to restrict their editing to the page they create?

Related

How to surface different Workshop pages to different user groups?

I have a Workshop module that addresses different user groups. Hence I would like to surface different pages to different groups by default. Indeed I see an option to control the default page selection based on a variable.
My first thought was to split my users into different Multipass groups and then have a function that queries a given user's Multipass attributes for membership in certain groups. However, I don't seem to be able to check for group membership in this way, probably for security reasons.
What would be the recommended way to go about this?
The Foundry security primitives for resource visibility (as opposed to data visibility) are largely aligned at the resource level rather than within a given resource. (The one exception I know of that's relevant is within the Object View configuration, where you can set visibility on different Tabs).
An approach also depends on whether the resource visibility is a matter of permissions (i.e. a user outside a given group should not see a given page at all - again, separate from the permission to see any data within that page) or one of convenience (i.e. all users can see all the data and all the interfaces, but each group should simply start in a different place).
In the former case (i.e. security), I think it'd be best to make a separate Workshop app for each team and then maybe wrap them all into a Carbon workspace. The resource visibility, configured as the actual resource permissions in Compass, should determine whether it appears in the Carbon workspace for the user.
If it's just for convenience, you could build all the pages in a single Workshop app, then make a separate Carbon workspace for each team and set a parameter to determine the default page, as you mentioned.

How can I disable semantic notations in text areas in Semantic MediaWiki Forms?

I am working on a user-moderated database and settled on MediaWiki with Semantic MediaWiki as an engine. I installed Semantic Forms to force the end users to conform to a certain standard when creating or editing entries. The problem is that since a user can add a semantic notation to any form text input, it can throw off the proper structure of the system; i.e. if it were an IMDb clone, a user could add [[Directed by::Forrest Gump]], which would then result in the movie "Forrest Gump" showing up in a list of directors.
I doubt that there's any setting that can simply turn this off or on, but I've had one or two ideas as to how to get it working.
One, perhaps there's a way to disable semantic notation on specific namespaces and put the forms on those namespaces. I have a feeling that this will cause the forms to merely break.
Another idea is to modify the code. This is clearly the less ideal approach. To get started, I believe I would need to create some sort of filter on SFTextAreaInput which would disable semantic notations for the user inserted text, but alas I'm unsure as to how to get started on that.
Well, Semantic MediaWiki is still a Wiki. In your classical enterprise database, you restrict the users' input options as a means of ensuring data integrity. That isn't what wikis do; the thinking with a wiki is, yes, the user can enter incorrect information, but another user will amend it and let the first user know what was wrong.
I wouldn't try to coerce SMW into rigid data acquisition. I mean, you do have options such as removing the standard input fields in forms:
'''Free text:'''
{{{standard input|free text|rows=10}}}
If users are selecting a movie page when they should be selecting a director page, then you probably want to encourage correct selection by populating the form control from the Directors category, like:
{{{field|Director|input type=combobox|values from category=Directors}}}
Yes, they can still go very far out of their way to select "Forrest Gump", but if that happens then the fact that someone wilfully circumvented the preselected correct options is a more pressing concern than the fact that the system permits it.
Wikis work best when the system encourages rather than enforces valid knowledge.
My name is Wolfgang Fahl, and I am behind the smartMediaWiki approach. You might want to go the smartMediaWiki route; see
http://semantic-mediawiki.org/wiki/SMWCon_Spring_2015/smartMediaWiki
For a start, don't go just by the property values but also, e.g., by a category.
{{#ask: [[Category:Movie]] [[Directed by::+]]
|?Directed by
}}
will only show pages that both have the property set and are in the correct category.
In the smartMediaWiki approach you'd create a topic "Movie", and the entry of movies would be done via forms. This is an elaboration of the SemanticForms and semantic PageSchemas ideas that has recently evolved. You can find out more about it at SMWCon Barcelona 2015 this fall.

How do I prevent my HTML from being exploitable while avoiding GUIDs?

I've recently inherited an ASP.NET MVC 4 code base. One problem I noted was the use of database ids (ints) in the URLs as well as in HTML form submissions. The code in its present state is exploitable through both URL tinkering and crafting custom HTML posts with different numbers.
Now, while I can easily fix the URL problems by using session state or additional auth checks, I'm less sure about the database ids that get embedded into the HTML that the site spits out (e.g. a drop-down I give them to fill in). When the ids come back in a POST, how can I be sure they were the valid options I put there?
What is considered "best practice" in terms of addressing this problem?
While I appreciate I could just "GUID it up" I'm hesitant to do so because I find them a pain in the ass to work with when debugging databases.
Do I have a choice here? Must I GUID to prevent easy guessing of ids or is there some kind of DRY mechanism I can use to validate the usage of ids as they come back into the site?
UPDATE: A commenter asked about the exploits I'm expecting. Let's say I spit out an HTML form with a drop-down list of all the locations one can import "treasure" from. The ids of the locations that the user owns are 1, 2 and 3, and these are provided in the HTML. But the user examines the HTML, fiddles with it, and decides to put together a POST with the id 4 selected. 4 is not his location; it's someone else's.
Validate the ID passed against the IDs the user can modify.
It may seem tedious, but this is really the only way to make sure the user has access to what they're trying to modify. Using GUIDs without validation is security by obscurity: sure, guessing them is hard, but they can still be guessed given enough resources.
You can do this at the top of the controller before you do anything else with the posted data. If there's a violation, just throw an exception and have your global exception handler deal with it; you don't need to handle it in a pretty way since you can safely assume that the user is tampering with data in an unsupported way.
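The question is about ASP.NET MVC, but the check itself is framework-agnostic. A minimal sketch of the idea, shown here in PHP with PDO (the locations table, its owner_id column and the exception are all illustrative assumptions):
// Throws if the posted location id is not one of the ids owned by this user.
function assertUserOwnsLocation( PDO $db, $userId, $postedLocationId ) {
	$stmt = $db->prepare( 'SELECT id FROM locations WHERE owner_id = ?' );
	$stmt->execute( array( $userId ) );
	$ownedIds = array_map( 'intval', $stmt->fetchAll( PDO::FETCH_COLUMN ) );
	if ( !in_array( (int)$postedLocationId, $ownedIds, true ) ) {
		// Anything ending up here is tampering: fail hard and let the global handler log it.
		throw new RuntimeException( 'Posted location id is not owned by the current user.' );
	}
}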
The issue you describe is known as "insecure direct object references," and the OWASP group recommends two policies for dealing with this issue:
using session-based indirect object references, and
validating all accesses to object references.
An example of Suggestion #1 would be that instead of having dropdown options 1, 2, and 3, you assign each option a GUID that is associated with the original ID in a map in the user's session. When you get a POST from that user, you check to see what object the given ID was supposed to be tied to. OWASP's ESAPI has some libraries to help with this in various languages.
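As a rough illustration of Suggestion #1 (again language-agnostic; the session key and helper names below are made up), the idea is to hand the browser opaque tokens and translate them back server-side:
// Map a real database id to an opaque per-session token (emit the token in the form).
function indirectIdFor( $realId ) {
	$token = bin2hex( random_bytes( 16 ) );
	$_SESSION['id_map'][$token] = $realId;
	return $token;
}
// Translate a posted token back to the real id; an unknown token means tampering.
function realIdFor( $token ) {
	if ( !isset( $_SESSION['id_map'][$token] ) ) {
		throw new RuntimeException( 'Unknown or tampered object reference.' );
	}
	return $_SESSION['id_map'][$token];
}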
But in many cases Suggestion #1 is actually counterproductive. For example, you often want URLs that can be copy/pasted from one user to another. Suggestion #2 is generally seen as the most foolproof way to address this issue.
You are describing Broken Access Control with Insecure Ids. Once you've identified the threat and decided which ids are owned by which users, ensure checks for this are in place server-side.

Is there a way to create a custom edit flag in MediaWiki?

I moderate a wiki where many users use the AutoWikiBrowser to rapidly edit. This is fine but it makes it harder to locate and deal with vandalism via the recent changes. Is there any way that I can create a custom edit flag to mark edits as semi-automated and allow users to hide them from the recent changes? Ideally this would come with the ability to mark edits as semi-automated by default, which would allow the functionality I seek without needing a change to the AWB source code.
The ability to mark one's edits as semi-automated shouldn't be open to anyone, so it would need to be restricted to certain user groups (probably rollback and up). I realise that there is the ability to mark edits as bot edits, but this is inaccurate, as they are not truly bots, and inconvenient, since it requires a bureaucrat to mark the user as a bot and then unmark them when their editing is finished. I realise it's a lot to request, and I certainly understand if it's not possible, but I was hoping that it was.
Why not have users use two accounts - one for manual edits and one for bot edits? Or is that too much overhead for the users?
As you say, if your bots have their own accounts you can add them to the bot group. Then users can, in Recent Changes, decide themselves to show / hide bot edits.
Then your admins can patrol the changes as usual.
The solution is very simple: add a user group whose members are able to add and remove themselves (via Special:UserRights) either to the default "bot" group or to a separate "flood" group that has only the "bot" and perhaps "noratelimit" permissions, and then add the trusted users to that group. Their flagged edits can then be hidden from Recent Changes just like bot edits. For example:
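A minimal LocalSettings.php sketch of the "flood" variant (here I assume the trusted users sit in "rollbacker" and "sysop" groups; adjust the group names to your wiki):
# Create a "flood" group whose edits are flagged (and hideable) like bot edits.
$wgGroupPermissions['flood']['bot'] = true;
$wgGroupPermissions['flood']['noratelimit'] = true;
# Let rollbackers and sysops add/remove themselves to/from it via Special:UserRights.
$wgGroupsAddToSelf['rollbacker'][] = 'flood';
$wgGroupsRemoveFromSelf['rollbacker'][] = 'flood';
$wgGroupsAddToSelf['sysop'][] = 'flood';
$wgGroupsRemoveFromSelf['sysop'][] = 'flood';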

What can I do to prevent write-write conflicts on a wiki-style website?

On a wiki-style website, what can I do to prevent or mitigate write-write conflicts while still allowing the site to run quickly and keeping the site easy to use?
The problem I foresee is this:
User A begins editing a file
User B begins editing the file
User A finishes editing the file
User B finishes editing the file, accidentally overwriting all of User A's edits
Here were some approaches I came up with:
Have some sort of check-out / check-in / locking system (although I don't know how to prevent people from keeping a file checked out "too long", and I don't want users to be frustrated by not being allowed to make an edit)
Have some sort of diff system that shows any other changes made when a user commits their changes and allows some sort of merge (but I'm worried this will be hard to create and would make the site "too hard" to use)
Notify users of concurrent edits while they are making their changes (some sort of AJAX?)
Any other ways to go at this? Any examples of sites that implement this well?
Remember the version number (or ID) of the last change. Then, before writing, read the entry again and check whether that version is still the same.
In case of a conflict, inform the user who was trying to save that the entry was changed in the meantime, and support them with a diff.
Most wikis do it this way. MediaWiki, Usemod, etc.
Three-way merging: The first thing to point out is that most concurrent edits, particularly on longer documents, are to different sections of the text. As a result, by noting which revision Users A and B acquired, we can do a three-way merge, as detailed by Bill Ritcher of Guiffy Software. A three-way merge can identify where the edits have been made from the original, and unless they clash it can silently merge both edits into a new article. Ideally, at this point carry out the merge and show User B the new document so that she can choose to further revise it.
Collision resolution:
This leaves you with the scenario when both editors have edited the same section. In this case, merge everything else and offer the text of the three versions to User B - that is, include the original - with either User A's version in the textbox or User B's. That choice depends on whether you think the default should be to accept the latest (the user just clicks Save to retain their version) or force the editor to edit twice to get their changes in (they have to re-apply their changes to editor A's version of the section).
Using three-way merging like this avoids lock-outs, which are very difficult to handle well on the web (how long do you let them have the lock?), and the aggravating 'you might want to look again' scenario, which only works well for forum-style responses. It also retains the post-respond style of the web.
If you want to Ajax it up a bit, dynamically 3-way merge User A's version into User B's version while they are editing it, and notify them. Now that would be impressive.
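As a rough sketch of the mechanics (not how MediaWiki itself does it), the server-side merge can be delegated to the standard diff3 tool; this assumes diff3 is installed and that you kept the base revision both users started from:
// Minimal three-way merge via the external diff3 tool.
// Returns the merged text, or null if the same sections genuinely conflict.
function threeWayMerge( $base, $mine, $theirs ) {
	$dir = sys_get_temp_dir();
	$files = array(
		'base'   => tempnam( $dir, 'base' ),
		'mine'   => tempnam( $dir, 'mine' ),
		'theirs' => tempnam( $dir, 'theirs' ),
	);
	file_put_contents( $files['base'], $base );
	file_put_contents( $files['mine'], $mine );
	file_put_contents( $files['theirs'], $theirs );
	// -m produces merged output; exit status 1 signals overlapping (conflicting) edits.
	exec(
		sprintf( 'diff3 -m %s %s %s',
			escapeshellarg( $files['mine'] ),
			escapeshellarg( $files['base'] ),
			escapeshellarg( $files['theirs'] ) ),
		$output,
		$status
	);
	foreach ( $files as $tmp ) {
		unlink( $tmp );
	}
	return $status === 0 ? implode( "\n", $output ) : null;
}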
In Mediawiki, the server accepts the first change, and then when the second edit is saved a conflicts page comes up, and then the second person merges the two changes together. See Wikipedia: Help:Edit Conflicts
Using a locking mechanism will probably be the easiest to implement. Each article could have a lock field associated with it and a lock time. If the lock time exceeded some set value, you'd consider the lock to be invalid and remove it when checking out the article for edit. You could also keep track of open locks and remove them on session close. You'd also need to implement some concurrency control in the database (autogenerated timestamps, perhaps) so that you could make sure that you are checking in an update to the version that you checked out, just in case two people were able to edit the article at the same time. Only the one with the correct version would be able to successfully check in an edit.
You might also be able to find a difference engine that you could just use to construct differences, though displaying them in a wiki editor may be problematic -- actually displaying the differences is probably harder than constructing the diff. You'd rely on the versioning system to detect when you needed to reject an edit and perform a diff.
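A minimal sketch of how such an expiring lock could be taken in a single atomic statement (PDO; the articles table and its locked_by/locked_at columns are made-up names):
// Try to take the edit lock; succeeds only if it is free or has expired.
function acquireEditLock( PDO $db, $articleId, $userId, $ttlMinutes = 15 ) {
	$now    = date( 'Y-m-d H:i:s' );
	$cutoff = date( 'Y-m-d H:i:s', time() - $ttlMinutes * 60 );
	$stmt = $db->prepare(
		'UPDATE articles SET locked_by = :user, locked_at = :now
		  WHERE id = :id AND (locked_by IS NULL OR locked_at < :cutoff)'
	);
	$stmt->execute( array(
		':user' => $userId, ':now' => $now, ':id' => $articleId, ':cutoff' => $cutoff,
	) );
	return $stmt->rowCount() === 1;   // 0 rows => someone else holds a fresh lock
}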
In Gmail, if we are writing a reply to a mail and someone else sends a reply while we are still typing, a popup appears indicating that there is a new update, and the update itself appears as another post without a page reload. This approach would suit your needs: if you can use Ajax to show the new post, with a link to a diff of what was just updated, while User B is still busy typing their entry, that would be great.
As Ravi (and others) have said, you could use an AJAX approach and inform the user when another change is in progress. When an edit is submitted, just indicate the textual differences and let the second user work out how to merge the two versions.
However, I'd like to add on with something new you could try in addition to that: Open a chat dialog between the editors while they're doing their edits. You could use something like embedded Gabbly for that, for instance.
The best conflict resolution is direct dialog, I say.
Your problem (lost update) is solved best using Optimistic Concurrency Control.
One implementation is to add a version column to each editable entity of the system. On user edit you load the row and display the HTML form to the user. A hidden field carries the version, let's say 3. The update query then needs to look something like:
update articles set ..., version=4 where id=14 and version=3;
If the number of rows affected is 0, then someone has already updated article 14. All you need to do then is decide how to deal with the situation. Some common solutions:
last commit wins
first commit wins
merge conflicting updates
let the user decide
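A minimal sketch of that check around the UPDATE above (PDO; the id and version columns match the example, the body column and the exception are illustrative):
function saveArticle( PDO $db, $id, $expectedVersion, $newText ) {
	$stmt = $db->prepare(
		'UPDATE articles SET body = :body, version = :next
		  WHERE id = :id AND version = :expected'
	);
	$stmt->execute( array(
		':body'     => $newText,
		':next'     => $expectedVersion + 1,
		':id'       => $id,
		':expected' => $expectedVersion,
	) );
	if ( $stmt->rowCount() === 0 ) {
		// Someone else saved first: a lost-update conflict to resolve by one of the policies above.
		throw new RuntimeException( 'Edit conflict: the article was modified by someone else.' );
	}
}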
Instead of an incrementing version int/long you can use a timestamp, but it's not recommended because:
retrieving the current time from the JVM isn't necessarily safe in a clustered environment, where nodes may not be time synchronized.
(quote from Java Persistence with Hibernate)
Some more info at the hibernate documentation.
At my office, we have a policy that all data tables contain 4 fields:
CreatedBy
CreatedDate
LastUpdateBy
LastUpdateDate
That way there is a nice audit trail on who has done what to the records, at least most recently.
But most importantly, it becomes easy enough to compare the LastUpdateDate of the record currently being edited on the screen (this requires you to store it on the page, in a cookie, or wherever) with the value in the database. If the values don't match, you can decide what to do from there.