For a major school project I am implementing a real-time collaborative editor. For a little background, basically what this means is that two(or more) users can type into a document at the same time, and their changes are automatically propagated to one another (similar to Etherpad).
Now my problem is as follows:
I want to be able to detect what changes a user carried out onto an HTML textfield. They could:
Insert a character
Delete a character
Paste a string of characters
Cut a string of characters
I want to be able to detect which of these changes happened and then notify other clients similar to "insert character 'c' at position 2" etc.
Anyway I was hoping to get some advice on how I would go about implementing the detection of these changes?
My first attempt was to consider the carot position before and after a change occurred, but this failed miserably.
For my second attempt I was thinking about doing a diff on the entire contents of the textfields old and new value. Am I missing anything obvious with this solution? Is there something simpler?
It is a really hard work make this working today, for several reasons, but
maybe you will need to restrict only to some browsers. read: https://developer.mozilla.org/en/XUL/Attribute/oninput the alternative to "oninput" is listen to all input events (keyboard, mouse, dragdrop) I suggest to use "oninput"
html is not perfect... even html5. input and textareas supports only single-range
selections. you can solve this using designmode/contenteditable instead of
textareas/textfield
detecting offsets of what changed is a hard work: read
-- https://developer.mozilla.org/en/Document_Object_Model_%28DOM%29/window.getSelection
-- http://www.quirksmode.org/dom/range_intro.html -- http://msdn.microsoft.com/en-us/library/ms535869%28v=VS.85%29.aspx -- http://msdn.microsoft.com/en-us/library/ms535872%28v=VS.85%29.aspx
you may need a "diff" algorithm written in javascript! http://ejohn.org/projects/javascript-diff-algorithm/
one personal note: detecting words, characters changes may be totally non-sense and not useful, detect instead paragraphs changes, or in case of an excel-like worksheet, the single cell
I hope this helps
feel free to correct my English!
My pseudocode/written out response would be (if I understand your question exactly) to use jQuery to detect keyup events and then save the input to the server via ajax, then also take the response and post it back to the input. This isn't very efficient, but basically the idea is that you're constantly posting and checking what else has been posted. If you want to see what someone else is doing in real time, you can ping the server every second or so and update with the response.
All of this of course can be optimized, but it still is kind of taxing for a server. You could also see if you can implement Google Topeka Wave for your project, or get in touch with Google Topeka to see how they do it :)
Related
I am doing some volunteer work for a charity that is using a couple online systems that store their donors and related data. I would like to find a way to store a URL as a custom field in such a way that they can put corresponding links between donors in one of the systems in order to quickly find the same donor in another system. The only built-in method in the products being used is to store a single value in a field labeled "website" which is originally intended to store a value for any website associated with the donor. I would like to avoid using this field if possible and instead create a custom field.
However, the rub is the custom fields only have a handful of options (clear text, date, currency, etc). There is no option to store a URL or something like rich text). I've thought of a couple less optimal ways to make the values stored in those fields clickable (a browser plugin or a proxy) however both of those have obvious drawbacks that I would like to avoid.
What I am wondering and hoping someone has a possible answer for, is if there are an ways of storing a value in a clear text field that might disrupt or escape the underlying html encoding such that the displayed link is clickable. I already control the values being put into these fields (users cannot enter their own values, they are essentially read-only), so security isn't much of a concern.
I have very limited access or influence to have any system level changes, however I would like to make this possible as it would help them a great deal (their users are all volunteers with limited time and education). I've tried a few tricks but havn't found anything that doesn't get converted to unicode or escaped (it could be that it's completely controlled for at output, i simply don't know).
My current attempts have been limited to using the built in forms submission, I may explore their import and/or API methods on the theory that might allow better low-level access to storing the actual values in the system, however I'm still not certain what to try other than adding .
I have also tried an inline script to add the corresponding tab, however that seems to break the form submission method (perhaps it'll work via csv import or via the API)
Does anyone have suggestions for other things I could try before I go any further? I'm a bit of a novice and feel like there may be something else obvious I haven't tried.
Is it possible to create an automatically generated Rhythm game for Flash Action Script 3 ?
But not just randomly generated, generated from the notes of a song. Or is that something I have to do manually?
How would I go about doing either of these?
I am currently following this tutorial: http://www.flashgametuts.com/tutorials/as3/how-to-make-a-rhythm-game-in-as3-part-4/ so perhaps it can be made to fit around this? (Go to the final part and View Source to see the full thing)
Thanks!
Depending on what you mean by rhythm game, check out the computeSpectrum() function of the SoundMixer class: http://help.adobe.com/en_US/FlashPlatform/reference/actionscript/3/flash/media/SoundMixer.html#computeSpectrum()
There's an example of it working in the link, but basically what it does is take a snapshot of the current sound wave and puts normalised (-1 to 1) values in a ByteArray. What you do with those values is up to you - e.g. you might use them as a height field to generate terrain for example.
Repeat this every frame, and you get the gist
Welcome to SO!
First off, there is nothing already built in, to my knowledge. There may be something lurking around Google that someone else wrote, but you'd need to dig around for that (though I assume you already did.)
Generated from the notes of a song. Hmm, this will take some serious ingenuity and coding on your part. I'll point you in the right direction, but it is up to you to write the code. No one here will do it for you, but we'll happily help with specific problems in your code.
The crazy (yet potentially more fun) approach MAY BE to derive the data in a similar manner that an audio visualizer does...but I can't guarantee that will work. This would work best with MIDI-generated, single instrument songs. Here is a tutorial on visualizers.
A second approach may be to actually convert MIDI files directly. Again, I can't guarantee it will work, but it would theoretically be possible, seeing how MIDI files store data to begin with. Here's an answer on playing MIDI files, to get you started. Consider looking through their class.
However, the "easiest" approach would be to come up with some sort of system by which you store the note values for a song. You can manually enter the values in an array, or in a data file (such as XML) that you can load.
I put "easiest" in quotes because you'd have to account for a LOT of information - not just note values, but note duration, rhythm, and rests.
Anyway, those are just a few ideas to get you started. Good luck!
I'm experimenting with P2P on Flash, and I've come across a little hurdle that I'd like to clarify before moving forward. The technology itself (Flash) doesn't matter for this problem, as I think this problem occurs in other languages.
I'm trying to create a document that can be edited "live" by multiple people. Just like Google Docs pretty much. But I'm wondering, how would you suggest synchronizing everyone's text? I mean, should I message everyone with all the text in the text field every time someone makes a change? That seems very inefficient.
I'm thinking there has to be a design pattern that I can learn and implement, but I'm not sure where to start.
Optimally, the application should send the connected clients only the changes that have occurred to the document, and have some sort of buffer or error correction that can be used for retrieving earlier changes that may have been missed. Is there any established design pattern that deals with this type of issue?
Thanks,
Sandro
I think your "Optimally" solution is actually the one you should go for.
each textfield has a model, the model has a history (a FILO storing last, let's say, 10 values).
every time you edit that textfield you push the whole text into the model and send the delta to other connected clients.
as other clients receive the data they just pick the last value from the model and merge it to the received data.
you can refine the mechanism by putting an idle timer in the middle: as a user types something in the textfield you flag that model as "toBeSentThroughTheNet" and you start a timer. as the timer "ticks" (TimerEvent.TIMER) you stop it, collect the flagged data and send it to other clients. just remember to reset the timer everytime the user is actually typing (a semplification coul be keydown = reset, keyup = start).
one more optimization could be send the data packed in a compressed bytearray, but this requires you write your own protocol and may be not so an easy and quick path :)
If the requirement is that everyone can edit the document at the same time and the changes should be propagated to everyone and no changes should be lost, then it is a non-trivial problem. There are few different approaches out there, but one that is quite robust is Operational Transformation. This is the same algorithm that Google Docs uses for collaborative editing.
Understanding and Applying Operational Transformation and the attendant hacker news discussion are probably other good places to start.
The Wave Protocol was released as open source so you can take a look on how it is implemented.
You could of course forgo the tricky synchronization and just allow people to take turns and only one person can edit the document at a time and this person just pushes the changes to the remainder of the group.
I hate writing error condition code. I guess I don't have a good approach to doing it:
Do you write all of your 'functional'
code first then go back in and add
error handling or vice versa?
How stupid do you assume your users
are?
How granular do you make your
exception throws?
Does anyone have sage advice to pass on to make this easier?
A lot of great answers guys, thank you. I had actually gotten more answers about dealing with the user than I thought. I'm actually more interested in error handling on the back end, dealing with database connection failures and potential effects on the front end, etc. Keep them coming!
I can answer one question: You don't need to assume your users are "stupid", you need to help them to use your application. Show nice prompts for things, validate data and explain why, so it's obvious to them, don't crash in their face if you can't handle what they've done (or more specifically, what you've let them do), show a nice page explaining what they can do instead, and so on.
Treat them with respect, and don't assume they know everything about your system, you are here to help them.
In respect to the first part; I generally write most error-handling at the time, and add a little bit back in later.
I generally don't throw that many exceptions.
Assume your users don't know anything and will break your system any way that it can possibly be broken.
Then write your error handling code accordingly.
First, and foremost, be clear to the user on what you expect. Second, test the input to verify it contains data within the boundaries you expect.
Prime example, I had a form with an email field. We weren't immediately using that data so we didn't put any checking on it. Result: about 1% of the users put in their home address. The field was labeled "Email Address" Apparently the users were just reading the second word and ignoring the first.
The fix was to change the label to simply say "Email" and then test the input. For kicks we went ahead and logged what the users were initially typing in that field just to see if the label change helped. It did.
Also, as a general practice, your functions should test the inputs to verify they contain the data you expect. Use asserts or their equivalent in your language of choice.
When i code, there will be some exceptions which i will expect, i.e. a file may be missing, or some xml serialisation may fail. Those exceptions i know will happen ahead of time, and i can put in handling for them.
There is a lot you cannot anticipate though, and nor should you try to. Put in a global error handler and logger, so that ultimately everything gets caught and logged. Then as your testers and/or users find situations that cause exceptions (i.e. bad input) then you can decide whether you want to put further handling in specifically for it, or maybe leave it as it was.
Summary: validate your input, but don't try to gaze into the crystal ball too much, as you will never anticipate every issue that the user may come up with. Have a global handler and logger, and then refine as necessary.
I have two words for you: Defensive Coding
You have to assume your users are incredibly stupid. Someone will always find a way to give you input that you thought would never happen.
I try to make my exception throws as granular as possible to provide the best feedback when something goes wrong. If you lump everything together, you can't tell which error case caused the problem.
I usually try to handle error cases first (before getting functional code), but that's not necessarily a best practice.
Someone has already mentioned defensive programming. A few thoughts from a user experience perspective, though.
If the user's input is invalid, either (a) correct it if you're reasonably sure you knew what they meant or (b) display a message in line that tells them what corrective action they should take.
Avoid messages like, "FATAL SYSTEM ERROR CODE 02382981." Users (a) don't know what this means, even if you do, and (b) are intimidated and put off by seeing things like this.
Avoid pop-up messages for every—freaking—possible—error you can come up with. You shouldn't disrupt user flow unless you absolutely need them to resolve a problem before they can do anything else.
Log, log, log. When you show an error message to the user, put relevant information that might help you debug in either (a) a log file or (b) a database, depending on the type of application you're creating. This will ease the effort of hunting down information about the error without making the user cry.
Once you identify what your users should and should not be able to do, you'll be able to effectively write error handling code. You can make this easier on yourself with helper methods/classes.
In terms of your question about writing handling before/after/during business logic, think about it this way: if you're making 400,000 sandwiches, it might be faster to add all the mustard at the same time, but it's probably also a lot more boring than making each sandwich individually. Who knows, though, maybe you really like the smell of mustard...
When coding up, say, a registration form from scratch, does it make sense to get it functioning with the expected inputs first, and then go back and catch/handle the unexpected inputs and deal with errors?
The alternative would be to process the input, checking any constraints and insuring that they are handled properly, and then dealing with getting the typical use case functioning correctly.
Is one way preferable to the other, and if so why? Also, is there an alternate way to go about this sort of 2-part task?
To clarify, by validity, I mean more than just data validation, including business rules, such as "No more than X people can register for this event"
The best thing to do, in my opinion, is to get a decent first version working which may not deal with all of the unexpected cases completely, but is made thoughtfully and modularly. You can then go back and perfect the logic so that your tests pass.
In the real world, this method pays off because you're more likely to be productive and interested when there's something working with a few kinks, rather than just being stuck trying to figure out all the edge cases in your head and deal with them at the start.
One way would be to take a TDD approach (assuming you write unit tests). Start off by writing your unit tests, then work on getting those unit tests to pass.
In my opinion UI work should be done last since you probably have plenty to do on the back-end functionality.
I'm strongly in favor of writing the validators first. Once you have something that seems like it's working it will be much harder to go back and add the validators. By writing the validators first (but not necessarily handling the exceptions they raise) you make sure that you get them done. It also gives you a good chance to think about exactly what you're expecting - this can help you think of special cases you'll need to cover later.
Tracer bullets are nice, but the principle doesn't need to be applied to every aspect of a project.
Write the test for a method of the validation object
Write the method of the validation object
Test until pass
Repeat 1-3 until all methods are tested.
Write the form and feed the data into the tested and working validation object methods.
Use the same procedure for the business object handling the data post validation.
I always like coding up the functionality first and then adding validation later. Seems fine to me seeing as most coding is done in a development environment and no one will be submitting data while you're working on it.
But whichever way you feel comfortable with is best.
The downside to my method is that, if you're disorganized, you may forget to add a validity or sanitation check somewhere. And it does happen :-)
If it's not TOO much work, I'd start with basic input validation. Especially for dates or identifiers, like order numbers. It's easier to loosen validation than to tighten it. Basic input validation can save a lot of debugging time further down in the backend.
On the other hand, say you're talking good-looking Javascript enabled validation that supports multiple languages. In that case you'd be better off writing a simple first version, and develop the backend based on that. You can polish up the input form when it's beginning to approach a final version.
Once you get the functionality working, will you REALLY go back and add validation code?
Most of us would like to, but many of us run out of time at the end of the project.
I'm not sure that designing one way or the other is superior... While Mr. Leekman says that cranking out the functional part and then going back and doing the tedious work of edge-cases is more productive, I'm not sure I see how.
Having the functional code is not useful (and even dangerous) if you're allowing non-permissible values (i.e. values your functional code didn't expect) creep in.
Conversely great boundary checks are useless if the values that make it through aren't processed in some way.
In a production environment, you really have nothing to show if you are not covering both validity and functionality. The order in which you do these things really should be an issue of personal preference. If you're not feeling creative and just want something tedious to do, write the validity-checking portion of the code. If you've just figured out the perfect algo to do exactly what you need during your coffee break, sit down and write the functional part and go back to bounds-check later on.
I like to put up a couple of fields worth of functionality, then do the validation checks for that functionality, then the validation for the rest, and finally fill in the rest of the functionality.
If you're breaking the code up into "model" and "UI", then some of this is easier to figure, but basically requires a design choice: are your model objects going to assume correct inputs and put the onus on the UI, or test for correct inputs.
Now, me, being a belt and suspenders guy, I tend to answer that question "yes": the model will check the domain even though the inputs should be checked anyway. But in any case, once you make that decision, you have your answer: if domain checking is part of the definition, you should build it and test it with all the other parts. If you separate the domain checking from the functionality, build and test them separately.
Having seen too many databases where people neglected to do the validation methods, I vote for doing it first.