Is there a way to create a custom edit flag in MediaWiki? - mediawiki

I moderate a wiki where many users use the AutoWikiBrowser to rapidly edit. This is fine but it makes it harder to locate and deal with vandalism via the recent changes. Is there any way that I can create a custom edit flag to mark edits as semi-automated and allow users to hide them from the recent changes? Ideally this would come with the ability to mark edits as semi-automated by default, which would allow the functionality I seek without needing a change to the AWB source code.
The ability to mark one's edits as semi-automated shouldn't be open to anyone, so it would need to be restricted to certain usergroups (probably rollback and up). I realise that there is the ability to mark edits as bot edits, but this is inaccurate as they are not truly bots, and inconvenient, since it requires a bureaucrat to mark the user as a bot, then unmark them when their editing is finished. I realise it's a lot to request, and I certainly understand if it's not possible, but I was hoping that it was.

Why not have users use two accounts - one for manual edits and one for bot edits? Or is that too much overhead for the users?
As you say, if your bots have their own accounts you can add them to the bot group. Then users can, in Recent Changes, decide themselves to show / hide bot edits.
Then your admins can patrol the changes as usual.

The solution is very simple: create a user group whose members are able (via Special:userrights) to add themselves to, and remove themselves from, either the default "bot" group or a separate "flood" group that has only the "bot" and perhaps the "noratelimit" permission; then add the trusted users to that first group.

Retrieve the number of edits made by bots, registered users and anonymous users for a Wikipedia article

I'm trying to retrieve the number of edits made by bots, registered users and anonymous users separated for a specific wikipedia article.
I know I can get all revisions for an article via prop=revisions in the MediaWiki API. I was thinking of using rvprop=user to return the name of the user who made each revision and then doing some processing on the retrieved data.
http://ar.wikipedia.org/w/api.php?action=query&prop=revisions&titles=%D8%A7%D8%A8%D9%86%20%D8%A7%D9%84%D9%86%D9%81%D9%8A%D8%B3&rvlimit=500&rvprop=timestamp%7Cuser|size&format=xml
For revisions by anonymous users, the anon="" attribute is always present, so I can count those. For bots, however, I can't find a way: as far as I know, bot names are not always written in a standardized way.
Any idea how to do it? Or is there an easier way, maybe using another API, to do this task?
The revisions API lets you list the flags for each revision - they include whether an edit was marked as a minor or bot edit. For example, see these revisions.
However, it looks like the edits in your linked data set were made without flagging them as bot edits, either because those bots are not approved bots or because they forgot to set the flag. In that case, you're quite out of luck. You still can filter against the term bot in the username or the known list of bots in your wiki.
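As a rough sketch of that fallback, the counting could look something like the following in Python. It assumes the revisions were fetched with format=json, where anonymous edits carry an `anon` key (mirroring the `anon=""` attribute in the XML output), and that a set of known bot account names has been collected separately. The function name `classify_revisions`, the `known_bots` parameter and the "name ends in bot" heuristic are illustrative assumptions, not part of the MediaWiki API.

```python
from collections import Counter


def classify_revisions(revisions, known_bots=frozenset()):
    """Count revisions by author type: anonymous, bot or registered.

    `revisions` is a list of dicts shaped like the entries of a
    prop=revisions JSON response (only "user" and "anon" are used).
    """
    counts = Counter(anonymous=0, bot=0, registered=0)
    for rev in revisions:
        user = rev.get("user", "")
        if "anon" in rev:
            counts["anonymous"] += 1
        elif user in known_bots or user.lower().endswith("bot"):
            # Heuristic only: unflagged bots with unusual names are missed,
            # and a human whose name ends in "bot" is miscounted.
            counts["bot"] += 1
        else:
            counts["registered"] += 1
    return counts
```

The suffix check will misclassify some accounts, so a maintained list of known bots is the more reliable of the two signals.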

Is it a good idea to let your users change their usernames? [closed]

I'm back and forth on the idea of letting the users on my site change their usernames, which would be displayed throughout the site. On one side I want to give the users flexibility. But on the other, I don't want them to use this feature as a way to hide if they do something unwanted on the site. I know SO & Twitter let you change your display name. What's keeping someone from behaving badly on the site and then changing their name so they can continue behaving badly?
I need feedback on the pros and cons.
Thanks!
Update:
To clear things up a bit. I'm not using the user name as the primary internal account ID.
Each user gets a unique number. My question is not really about my system tracking the user; it's about how other users will be able to track each other.
If userA knows that userB is doing something bad and then userB changes his name to userC. Then userA will no longer know who he is.
What do you mean with "do something bad and then change their name"? If you're implying that users can post content, for instance, with their name attached, and then change their name and the name attached to their posts won't change as well, then I think you need to reconsider your (database) architecture and ensure that a username is a single point of reference and all representations of that username change when someone changes their username.
Edit:
OK, sorry for misunderstanding. But if it's the case that you have a single point of reference, then changing your username is irrelevant to the problem. Let's say my username is Foo and I troll some thread somewhere, then change my name to Bar. As long as people can see what I've posted (eg. a post history page), then it doesn't matter whether I used to be called Foo or not, Bar is associated now with posts made before that were troll material. So perhaps you just need to create transparency, by making something like a post history overview on users' profiles? :)
If the issue of impersonating someone or "cheating" is a factor, you could always do it à la eBay and display an icon next to someone who changed their username in the past 30 days.
Depending on the case, you can keep a history and display it if required.
If you do that make sure that previous usernames are not recycled for new users.
There are often very good reasons why people would want to legitimately change their username.
For example, assume someone signed up as pimpleOnGodsAss while at university and then, a few years later, is in the workforce and wants to network with other professionals through the site ... they're going to want to change!
Also, consider the very common case of people changing their names when they marry - if they used their family name as a part of their username, they'll want to change that too. Martha Jones with username marthajones marries John Smith and wants username marthasmith (if available).
Note too that you can't avoid people achieving this - they can always reregister with a new email address, discarding their old history, and getting the new username they want.
I'd suggest that the benefits of the feature outweigh the costs - people will always find a way to game the system, don't penalise good users by locking away features just because some will find a way to abuse them.
I like Steam's "View aliases" function, which lets you see all names that you've seen that user using. As long as the username itself isn't the primary key in the user table, then sure, let them change names. Add an alias table so if someone's being a dick, you can see who they used to be.
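A minimal sketch of such an alias table, using Python's built-in sqlite3 module. The table and column names (`users`, `aliases`, `old_username`, `changed_at`) and the helper functions are made up for illustration; the only thing taken from the answer is the idea that the numeric id, not the username, is the key.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT UNIQUE NOT NULL);
CREATE TABLE aliases (
    user_id INTEGER NOT NULL REFERENCES users(id),
    old_username TEXT NOT NULL,
    changed_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def rename_user(conn, user_id, new_username):
    """Record the current name as an alias, then switch to the new one."""
    (current,) = conn.execute(
        "SELECT username FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    with conn:  # both statements commit together or not at all
        conn.execute(
            "INSERT INTO aliases (user_id, old_username) VALUES (?, ?)",
            (user_id, current),
        )
        conn.execute(
            "UPDATE users SET username = ? WHERE id = ?",
            (new_username, user_id),
        )


def view_aliases(conn, user_id):
    """Return the user's former names, oldest first."""
    rows = conn.execute(
        "SELECT old_username FROM aliases WHERE user_id = ? ORDER BY rowid",
        (user_id,),
    )
    return [name for (name,) in rows]
```

Because posts would reference `users.id`, a rename never has to touch them, and moderators can still look up who an account used to be.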
You could display both names for a while:
Symbol Guy -> Formerly known as Prince
I wouldn't let the user change their actual username.
However, changing the display name should be safe. You can always track their behavior via their account or username directly. If they're doing something bad, you should flag their account, not necessarily try to track it by their name.
Most sites have this concept (including SO).
If each user has a constant unique id number, they can change the username but are still the same.
Arstechnica.com charges $20 to change your username, and subscribers get one change for free.
It's a feature I'd like to see on more websites.
It's convenient and, as you said it, gives user flexibility. If only changing one's username is enough to hide, there may be another issue.
Keep track of those users another way (user ID, logs, reputation system, ...) and consider showing the original username to your admins/moderators maybe.
I agree with most of the people answering here - let the users change their name, otherwise the site is simply less friendly. As Rahul said, if a user changes their name, make sure that the new name is associated with all their prior activity.
Similarly - If you use email address as your username (as many sites do), let the user change their email address. I can tell from personal experience that not allowing this is a real pain (for users and customer support for whatever site isn't allowing it).
Your database structure shouldn't depend on the username(/email address) being the same, so why enforce that on your users?
If you have the feature, leave it. You would just be encouraging someone to create another account to have the user names goodwitch and badwitch.
If you don't have it, don't add it unless you have nothing better to add.
Personally, I try to make usernames sticky, and a valuable part of the experience of a site.
This is true if other users will see the user names; if there is any social networking involved with the site.
You can always archive really old ones if your site lives for a long time.
As a user, I must say that I absolutely want to be able to change the user name on my account. I wouldn't go as far as asking for multiple aliases, but it is always possible that someone changed his mind, or did not really think much when first setting up the account.
You should not be using the user name as the primary internal account ID.
Yes - ONLY IF users don't interact with other users.
NO - if users can interact: forums and stuff like that. It's kinda confusing to see a user with a different name every day.
I'd allow the change, and perhaps carry both a user (login) name and a display name, and I'd allow a change to both. There are various arguments on both sides, but for me it usually comes down to the fact that either/both of those elements (user & display) typically reflect something about their owner's name, and their owner's name can change.
If your login is an email address, and you change email providers, now what? Or take your name 'Donny V.' I assume you're male, but what if you were female*? And if you got married to Mike Peterson. Wouldn't you now want to be known as "Donny P."? Maybe, maybe not. But many would.
*Yes, I know men can change their last names too.
How about a "formerly known as" display. If a user decides to change the user name, just store the old name in a table and display it additionally. Maybe give the option in all the user profiles to disable this display.
I vote for yes, within reason. Users who have got into trouble (negative total rep, moderation, past warnings, whatever is applicable to your site) should not be allowed.
Other users should be allowed but with a limit (e.g. 3 times a year). There should be a way to keep track of the users past usernames, at the very least for the admins/moderators.
EDIT: However, I find systems such as Steam/WordPress, where a user has separate display and login names, a little confusing, so I would not recommend such an approach. That is just a personal feeling, though.
If userA knows that userB is doing something bad and then userB changes his name to userC. Then userA will no longer know who he is.
If your main concern is related to abuse, perhaps you should provide a method of reporting abuse, and maintain a log of when usernames change.
You could always limit how often users are allowed to change their username, to avoid seeing the same people in the forums under a different name every day; because someone will do it every day if they are allowed to.
Business questions first
Why are you offering this feature instead of spending the time on another feature? Would another feature offer better benefit (such as a status line?)
What will this accomplish?
Are users asking for this?
Will this feature result in increased stickiness or a better experience?
Is this a competitive advantage?
Does your site become more confusing?
Technical questions
What is the potential for misuse? Do you have a denormalized database where the username has been copied many places or is there only one place where the username is stored?
Do you have a way to stream a notification to other users "Your friend 'foo' is now 'bar'?"
Like most things in life, it comes down to the context you are speaking of. Personally, I think any service that could be or is being abused should have persistent usernames.
I suppose it is OK to let users change their names, but the change should also apply to their previous posts: those should be shown with the new username.
...
This reminds me of a password hashing implementation that I found in an internet resource:
password_hash = md5(password + salt + user_name)
If you have the same model, then you should not allow changing the username.
Regards, Pavel
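To see why that hashing scheme gets in the way of renames, here is a small Python sketch of it. The function names are illustrative, and MD5 appears only because the quoted formula uses it; it is not considered suitable for password storage.

```python
import hashlib


def password_hash(password, salt, user_name):
    """The quoted scheme: md5(password + salt + user_name)."""
    data = (password + salt + user_name).encode("utf-8")
    return hashlib.md5(data).hexdigest()


def login_ok(stored_hash, password, salt, user_name):
    """Check a login attempt against the stored hash."""
    return password_hash(password, salt, user_name) == stored_hash
```

A hash stored while the user was called `foo` no longer matches once the name becomes `bar`, even with the correct password. With such a model a rename would have to ask for the password again and re-hash it, or the username should be left out of the hash input altogether.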
Combining some of the ideas here:
Every user has a unique internal ID, a number. So you are free to implement this feature any time you want. No need to code it right away, as you can delay this decision.
In case you want to implement it: Let them only change the username every 6 months and indicate every new username by some symbol for 30 days. Show the username history in the profile and be sure to inform the user about this, so he can decide against changing the username.
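A sketch of that policy in Python, under a few assumptions: "6 months" is approximated as 182 days, the 30-day symbol is rendered as a "(formerly ...)" suffix, and the `Account` class with its method names is invented for the example.

```python
from datetime import datetime, timedelta

CHANGE_COOLDOWN = timedelta(days=182)  # "every 6 months", approximated
BADGE_PERIOD = timedelta(days=30)      # how long a new name stays marked


class Account:
    def __init__(self, user_id, username):
        self.user_id = user_id    # permanent internal ID
        self.username = username
        self.history = []         # list of (old_username, changed_at)

    def last_change(self):
        return self.history[-1][1] if self.history else None

    def can_rename(self, now):
        last = self.last_change()
        return last is None or now - last >= CHANGE_COOLDOWN

    def rename(self, new_username, now):
        if not self.can_rename(now):
            raise ValueError("username was changed too recently")
        self.history.append((self.username, now))
        self.username = new_username

    def display_name(self, now):
        """Mark recently changed names so other users can follow along."""
        last = self.last_change()
        if last is not None and now - last < BADGE_PERIOD:
            return f"{self.username} (formerly {self.history[-1][0]})"
        return self.username
```

Passing `now` in explicitly keeps the rules easy to test; `history` is what a profile page would list as the username history.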

What can I do to prevent write-write conflicts on a wiki-style website?

On a wiki-style website, what can I do to prevent or mitigate write-write conflicts while still allowing the site to run quickly and keeping the site easy to use?
The problem I foresee is this:
User A begins editing a file
User B begins editing the file
User A finishes editing the file
User B finishes editing the file, accidentally overwriting all of User A's edits
Here were some approaches I came up with:
Have some sort of check-out / check-in / locking system (although I don't know how to prevent people from keeping a file checked out "too long", and I don't want users to be frustrated by not being allowed to make an edit)
Have some sort of diff system that shows any other changes made when a user commits their changes and allows some sort of merge (but I'm worried this will be hard to create and would make the site "too hard" to use)
Notify users of concurrent edits while they are making their changes (some sort of AJAX?)
Any other ways to go at this? Any examples of sites that implement this well?
Remember the version number (or ID) of the last change. Then read the entry before writing it and compare if this version is still the same.
In case of a conflict inform the user who was trying to write the entry which was changed in the meantime. Support him with a diff.
Most wikis do it this way. MediaWiki, Usemod, etc.
Three-way merging: The first thing to point out is that most concurrent edits, particularly on longer documents, are to different sections of the text. As a result, by noting which revision Users A and B acquired, we can do a three-way merge, as detailed by Bill Ritcher of Guiffy Software. A three-way merge can identify where the edits have been made from the original, and unless they clash it can silently merge both edits into a new article. Ideally, at this point carry out the merge and show User B the new document so that she can choose to further revise it.
Collision resolution:
This leaves you with the scenario when both editors have edited the same section. In this case, merge everything else and offer the text of the three versions to User B - that is, include the original - with either User A's version in the textbox or User B's. That choice depends on whether you think the default should be to accept the latest (the user just clicks Save to retain their version) or force the editor to edit twice to get their changes in (they have to re-apply their changes to editor A's version of the section).
Using three-way merging like this avoids lock-outs, which are very difficult to handle well on the web (how long do you let them have the lock?), and the aggravating 'you might want to look again' scenario, which only works well for forum-style responses. It also retains the post-respond style of the web.
If you want to Ajax it up a bit, dynamically 3-way merge User A's version into User B's version while they are editing it, and notify them. Now that would be impressive.
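For a feel of how the non-clashing case can work, here is a deliberately simplified line-based three-way merge in Python built on the standard difflib module. It is a sketch, not the algorithm from the article mentioned above: it is conservative (edits that merely touch the same position count as a collision), and the function names are made up.

```python
import difflib


def edits_from(base, other):
    """List the (start, end, replacement) edits that turn base into other."""
    matcher = difflib.SequenceMatcher(a=base, b=other, autojunk=False)
    return [
        (i1, i2, tuple(other[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


def three_way_merge(base, version_a, version_b):
    """Merge two edited lists of lines against their common base.

    Returns the merged lines, or None when the edits collide.
    """
    edits_a = edits_from(base, version_a)
    edits_b = edits_from(base, version_b)
    for start_a, end_a, new_a in edits_a:
        for start_b, end_b, new_b in edits_b:
            touching = start_a <= end_b and start_b <= end_a
            identical = (start_a, end_a, new_a) == (start_b, end_b, new_b)
            if touching and not identical:
                return None  # same region changed differently: collision
    merged = list(base)
    # Apply edits from the end of the document so earlier indices stay valid.
    for start, end, new in sorted(set(edits_a + edits_b), reverse=True):
        merged[start:end] = new
    return merged
```

When the function returns `None`, the site would fall back to the collision handling described above and show User B the three versions of the affected section.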
In MediaWiki, the server accepts the first change, and then when the second edit is saved a conflicts page comes up, and the second person merges the two changes together. See Wikipedia: Help:Edit Conflicts
Using a locking mechanism will probably be the easiest to implement. Each article could have a lock field associated with it and a lock time. If the lock time exceeded some set value you'd consider the lock to be invalid and remove it when checking out the article for edit. You could also keep track of open locks and remove them on session close. You'd also need to implement some concurrency control in the database (autogenerated timestamps, perhaps) so that you could make sure that you are checking in an update to the version that you checked out, just in case two people were able to edit the article at the same time. Only the one with the correct version would be able to successfully check in an edit.
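A minimal in-memory sketch of the lock-with-timeout idea in Python. The class name, the 15-minute default and the injectable `clock` are assumptions for illustration; a real site would keep the lock owner and lock time in the article row, and would still need the version check on check-in mentioned above.

```python
import time

LOCK_TIMEOUT = 15 * 60  # seconds; an arbitrary example value


class ArticleLocks:
    def __init__(self, timeout=LOCK_TIMEOUT, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.locks = {}  # article_id -> (user, acquired_at)

    def check_out(self, article_id, user):
        """Return True if `user` now holds the lock on the article."""
        now = self.clock()
        holder = self.locks.get(article_id)
        if holder is not None:
            owner, acquired_at = holder
            if owner != user and now - acquired_at < self.timeout:
                return False  # someone else holds a lock that is still valid
        # No lock, our own lock, or an expired one: take (or refresh) it.
        self.locks[article_id] = (user, now)
        return True

    def check_in(self, article_id, user):
        """Release the lock; False means another user took it over
        (or it was never held)."""
        holder = self.locks.get(article_id)
        if holder is None or holder[0] != user:
            return False
        del self.locks[article_id]
        return True
```

A `check_in` that returns `False` is the case where the save should be rejected or sent through a diff instead of overwriting.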
You might also be able to find a difference engine that you could just use to construct differences, though displaying them in a wiki editor may be problematic -- actually displaying the differences is probably harder than constructing the diff. You'd rely on the versioning system to detect when you needed to reject an edit and perform a diff.
In Gmail, if we are writing a reply to a mail and someone else sends a reply while we are still typing it, a popup appears indicating that there is a new update and the update itself appears as another post without a page reload. This approach would suit your needs and if you can use Ajax to show the exact post with a link to diff of what was just updated while User B is still busy typing his entry that would be great.
As Ravi (and others) have said, you could use an AJAX approach and inform the user when another change is in progress. When an edit is submitted, just indicate the textual differences and let the second user work out how to merge the two versions.
However, I'd like to add on with something new you could try in addition to that: Open a chat dialog between the editors while they're doing their edits. You could use something like embedded Gabbly for that, for instance.
The best conflict resolution is direct dialog, I say.
Your problem (lost update) is solved best using Optimistic Concurrency Control.
One implementation is to add a version column to each editable entity of the system. On user edit you load the row and display the HTML form to the user. A hidden field gives the version, let's say 3. The update query needs to look something like:
update articles set ..., version=4 where id=14 and version=3;
If the number of rows affected is 0, then someone has already updated article 14. All you need to do then is decide how to deal with the situation. Some common solutions:
last commit wins
first commit wins
merge conflicting updates
let the user decide
Instead of an incrementing version int/long you can use a timestamp but it's not suggested because:
retrieving the current time from the JVM isn't necessarily safe in a clustered environment, where nodes may not be time synchronized.
(quote from Java Persistence with Hibernate)
Some more info at the hibernate documentation.
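The version-column update from this answer can be tried out with Python's built-in sqlite3 module. The `articles` table with its `id` and `version` columns follows the query above; the `body` column and the `save_article` helper are assumptions added for the example.

```python
import sqlite3


def save_article(conn, article_id, new_body, expected_version):
    """Write the edit only if nobody else saved since we loaded the row.

    Returns True on success, False when the version has moved on.
    """
    cur = conn.execute(
        "UPDATE articles SET body = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_body, article_id, expected_version),
    )
    conn.commit()
    # rowcount is the number of rows the UPDATE actually changed.
    return cur.rowcount == 1
```

If users A and B both load article 14 at version 3, whoever saves first succeeds and bumps the version to 4; the other save returns `False`, and the application picks one of the strategies listed above.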
At my office, we have a policy that all data tables contain 4 fields:
CreatedBy
CreatedDate
LastUpdateBy
LastUpdateDate
That way there is a nice audit trail on who has done what to the records, at least most recently.
But most importantly, it becomes easy enough to compare the LastUpdateDate of the current or edited record on the screen (which requires you to store it on the page, in a cookie, or wherever) with the value in the database. If the values don't match, you can decide what to do from there.

Mediawiki-only allow certain number of page creations?

Is it possible, and if so, how can I allow users to create only a certain number of pages? For example, when they sign up, only allow them to create one page?
Off the top of my head, the following approach ought to work:
1. Hook into the userCan event for the "edit" action, check the page's existence (i.e. $title->exists()) and if it doesn't, consult some stored count (see below), and if the decision reached is to disallow creation, set $result to false and return false to stop further hooks overriding the decision.
2. Hook into the ArticleInsertComplete event and update some stored count to reflect that the user ($user) has created another page.
The decision in #1 can be expanded via additional logic to support multiple policies in conjunction with, e.g. automatic rights assignment; for example, to allow users to create more than one page after they've reached so-called "auto-confirmed" status, or to ignore the check for administrators or other users with specific rights.
I don't know exactly what you want to do, but I bet you want to let them make only a user page or something similar? It might be simpler for you to just restrict their editing to a certain namespace until they are added to some special group.
Some more info might help us answer the questions better. Do you also want to restrict their editing to the page they create?

Should you worry about fake accounts/logins on a website?

I'm specifically thinking about the BugMeNot service, which provides user name and password combos to a good number of sites. Now, I realize that pay-for-content sites might be worried about this (and I would suspect that most watch for shared accounts), but how about other sites? Should administrators be on the lookout for these accounts? Should web developers do anything differently to take them into account (and perhaps prevent their use)?
I think it depends on the aim of your site. If usage analytics are all-important, then this is something you'd have to watch out for. If advertising is your only revenue stream, then does it really matter which username someone uses?
Probably the best way to discourage use of bugmenot accounts is to make it worthwhile to have an actual account. E.g.: No one would use that here, since we all want rep and a profile, or if you're sending out useful emails, people want to receive them.
Ask yourself the question "Why do we require users to register to access my site?" Once you have a business reason for this requirement, you can try to work out what the effect would be of having some part of it bypassed by suspect account information.
Work on the basis that at least 10 to 15 percent of account information will be rubbish - and if people using the site can't see any benefit to them personally for registering, and if the registration process is even remotely tedious or an imposition, then accept that you will be either driving more potential visitors away, or increasing your "crap to useful information" ratio.
Why not make registration optional for reading? That is, ask people to register only when you are providing some functionality for them that 'saves' some settings, data, etc. I would imagine a site like Stack Overflow gets fewer fake registrations (reading questions doesn't require an account) than, say, the New York Times, where you need to have an account to read articles.
If that is not under your control, you may consider removing dormant accounts, i.e. removing accounts after a certain period of inactivity.
That entirely depends.
Most sites that find themselves listed on bugmenot.com tend to be the ones that require registration in order to access otherwise-free content.
If registration is required in order to interact with the site (ie, add comments/posts/etc), then chances are most people would rather create their own account than use one that has been made public.
So before considering whether to do things like automatically check bugmenot, think about whether there are problems with your business model.
There are a few situations where pay-to-access content sites (I'm thinking things like, ahem, 'adult' sites) end up with a few user accounts being published publicly (usually because someone has brute-forced some account details), and in that case there may be an argument for putting significant effort into it.
From an administrator viewpoint absolutely. That registration is required for a reason, even if it's something just as simple as user tracking/profile maintaining. Several thousand people using that login entirely defeats the purpose. IP tracking could help mitigate this problem, but it would definitely be hard to eliminate entirely.
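The IP tracking mentioned here could start as something very simple, such as flagging accounts that log in from unusually many distinct addresses. A Python sketch, where the `(username, ip)` event format, the function name and the threshold of 5 are all assumptions:

```python
from collections import defaultdict


def shared_account_suspects(login_events, max_ips=5):
    """Return usernames seen from more than `max_ips` distinct addresses.

    `login_events` is an iterable of (username, ip_address) pairs.
    """
    ips_by_user = defaultdict(set)
    for username, ip in login_events:
        ips_by_user[username].add(ip)
    return sorted(
        user for user, ips in ips_by_user.items() if len(ips) > max_ips
    )
```

This only narrows down candidates: mobile users and people behind changing proxies also appear under many addresses, which is part of why the problem is hard to eliminate entirely.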
No need to worry about BugMeNot: http://www.bugmenot.com/report.php
With bugmenot, keep in mind that this service is not actually there to harm the sites, but rather to make using them easier. You can request to block your site if it is pay-per-view, community-based (i.e. a forum or wiki) or if the accounts include sensitive information (like banking). This means that in virtually all situations where you would think that bugmenot is a bad thing, bugmenot does not want to be used. So maybe things are not as bad as you might think.