Tables with many columns in a web application - usability

What is the best way to present data from a table with many columns and possibly long text in the columns? The dilemma is that if I use a table, it becomes very wide and overflows outside its div. What options are there? In a business application, can I list the data the way posts are listed on Stack Overflow? How will that affect usability?

In general, there are no fundamental usability problems with a very wide table with each record occupying a single line as long as:
You provide horizontal scrolling so the user can get to all the fields.
You fix the record identifiers as row headers so that they do not scroll out of view and the users can always identify the record they're looking at.
Fields are relatively short so that many are in view at once; you certainly don't want one field to be wider than the typical window, so that the user has to scroll back and forth for each record to read that field.
The fact that users successfully use very wide spreadsheets shows that wide tables are not necessarily a problem.
However a multi-line-per-record layout like SO is perhaps best for the kind of task that SO is used for: the user scanning records for a particular field value (the question title) to find a record of potential interest. Once the user finds a potentially interesting record, she or he can then without any additional input read the other fields (e.g., number of answers, number of votes) to decide if it’s worth drilling down for more information. Compared to a wide single-line-per-record table, this means more scrolling to scan the same number of records, but less scrolling or clicking to see what’s in a given record. There will thus be less work for the user when:
Some fields are very long (e.g., spanning the whole window).
There are few fields the user is likely to scan on.
Those fields for scanning are made graphically salient to facilitate scanning (e.g., with big, bold, and/or colorful font).
A multi-line-per-record layout is not so good when the user may be scanning on any of a large number of fields. You can make one to three fields more salient than the rest, but if you try to make every field salient then none of them are salient.
In contrast, with a wide single-line-per-record table the user can scroll horizontally to whatever field is of interest and scan down the table. There is no need to make any particular fields more salient than the others.
A single-line-per-record table is also better when the user is working between two or more records on a given field, for example, comparing the records on a field or copying one field value to another. More records for a given field value are visible at a time, reducing scrolling. It’s also easier to compare values when they are directly on top of each other rather than when separated vertically by other fields.

Related

Database Design: How should I store 'word difficulty' in MySQL?

I made a vocabulary app for Android that has a list of ~5000 words stored in a local database (SQLite), and I want to find out which words are more difficult than others.
To find out, I'm thinking of adding a very simple feature that puts two random words on the screen and asks the user to choose the more difficult one. Then another pair of random words will show, and this process can be repeated for as long as the user wants. The more users who participate in this 'more difficult word' exercise, the better the app would in theory be able to distinguish difficult words from easy words.
Since the difficulty would be based on input from all users, I know I need to keep track of it online so that every app could then fetch them from the database on my website (which is MySQL). I'm not sure what would be the most efficient way to keep track of the difficulty, but I came up with two possible solutions:
1) Add a difficulty column that holds integer values to the words table. Then, for every pair of words that a user looks at and ranks, the word that he/she chooses as more difficult would have its difficulty increased by one, and the word not chosen would have its difficulty decreased by one. I could simply order by that integer value to get the most difficult ones.
2) Create a difficulty table with two columns, more and less, that hold words (or IDs of the words, to save space) based on the results of each selection a user makes. I'm still unsure how I would get the most difficult words - some combination of group by and order by?
The benefit of my second solution is that I can know how many times each word has been seen (# of rows from the more column that contain the word + # of rows from the less column that contain the word). That helps with statistics, like if I wanted to find out which word has the highest ratio of more / less. But it would also take up much more space than my first suggested solution would, and I don't know how it would scale.
Which do you think is the better solution, or what other ones should I consider?
Did you try Sphinx for this? I guess a full-text search engine like Sphinx would solve this with great performance.
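For the second option in the question (a difficulty table of more/less pairs), the ranking can indeed come from a combination of GROUP BY and ORDER BY, once the two columns are stacked with UNION ALL. The sketch below is only illustrative: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for the MySQL server, and the sample words and the ratio definition (times chosen as harder divided by times seen) are assumptions, not something specified in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the MySQL server (assumption)
conn.executescript("""
    CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT NOT NULL);
    -- One row per user choice: `more` was picked as harder than `less`.
    CREATE TABLE difficulty (
        more INTEGER NOT NULL REFERENCES words(id),
        less INTEGER NOT NULL REFERENCES words(id)
    );
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO words VALUES (?, ?)",
                 [(1, "cat"), (2, "ubiquitous"), (3, "table")])
conn.executemany("INSERT INTO difficulty VALUES (?, ?)",
                 [(2, 1), (2, 3), (3, 1), (2, 1)])

# Stack both columns into one (word_id, harder) list, then aggregate per word.
RANKING_SQL = """
    SELECT w.word,
           SUM(v.harder)                  AS times_harder,
           COUNT(*)                       AS times_seen,
           1.0 * SUM(v.harder) / COUNT(*) AS ratio
    FROM (SELECT more AS word_id, 1 AS harder FROM difficulty
          UNION ALL
          SELECT less AS word_id, 0 AS harder FROM difficulty) AS v
    JOIN words AS w ON w.id = v.word_id
    GROUP BY w.id, w.word
    ORDER BY ratio DESC, times_seen DESC
"""
ranking = conn.execute(RANKING_SQL).fetchall()

for word, times_harder, times_seen, ratio in ranking:
    print(word, times_harder, times_seen, ratio)
```

The same table also yields the "times seen" count mentioned in the question, which a single integer counter (the first option) cannot provide.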

Large amounts of Data

I've been working with MS Access 2010 for a while now and for the most part everything works. YAY. However, I have large amounts of data to eventually plot (x-axis y-axis pairs) that come from a piece of equipment that I use for work. I can import this data as a separate table, but I am not particularly fond of the idea of having my database overloaded with separate tables that exist purely to store this data. To my understanding, each table should represent an entity that fits into the larger context of the database. Also, for the equipment I'm using right now, all the x-axis data is redundant. The question is, what is the best way to divide the data for efficient storage?
Considerations:
I keep running into the same problems as I think about this question. Suppose that in either case I made two tables, one to store the x-axis data and another to store the y-axis data, and then had a linking table between the two allowing for a many to many relationship.
On the one hand, I could store one value per Record (all values in one Column). But, then there would need to be a tag field in each of these two tables, thus defeating the purpose of the split.
On the other hand, I could store one value per Field (all values in one Row), which in my case would yield over 2000 fields in each table.
There is a third option, the one I'm currently using, to store one pair per row in a single table. However, there is much redundancy.
You should stick with your current method. This is by far the simplest method to both retrieve and add to the data. Below I have my reactions to your other suggestions.
"Suppose that in either case I made two tables, one to store the x-axis data and another to store the y-axis data, and then had a linking table between the two allowing for a many to many relationship."
This might provide a slight hard drive space improvement if X and Y are not integers. However, it would complicate things significantly for questionable benefit.
"On the one hand, I could store one value per Record (all values in one Column). But, then there would need to be a tag field in each of these two tables, thus defeating the purpose of the split."
This would make it a lot more complicated to work with the data and is a bad idea. You would need complicated queries to get both data points in the same row. You could do this, but it would complicate both input and retrieval.
"On the other hand, I could store one value per Field (all values in one Row), which in my case would yield over 2000 fields in each table."
If you do this, you will regret it. This would make it nearly impossible to do any meaningful data analysis later on.
"There is a third option, the one I'm currently using, to store one pair per row in a single table. However, there is much redundancy."
This is ideal. You can easily import your data into the two columns, and the data is easily retrievable. The redundancies are not important unless the value is irrelevant.
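As a rough illustration of the one-pair-per-row layout recommended above, here is a minimal sketch. It uses Python's built-in sqlite3 module rather than MS Access, purely so the example is self-contained. The table name, the run_id column (a hypothetical tag identifying which equipment run a pair belongs to), and the sample values are all assumptions that do not appear in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the Access database (assumption)
conn.execute("""
    CREATE TABLE measurement (
        run_id INTEGER NOT NULL,  -- hypothetical tag for one equipment run
        x      REAL    NOT NULL,
        y      REAL    NOT NULL,
        PRIMARY KEY (run_id, x)
    )
""")

def import_run(run_id, pairs):
    """Store one (x, y) pair per row for the given run."""
    conn.executemany("INSERT INTO measurement VALUES (?, ?, ?)",
                     [(run_id, x, y) for x, y in pairs])

def series(run_id):
    """Return the pairs of one run, ordered by x, ready for plotting."""
    return conn.execute(
        "SELECT x, y FROM measurement WHERE run_id = ? ORDER BY x",
        (run_id,),
    ).fetchall()

# Hypothetical sample data; the x values repeat across runs (the redundancy).
import_run(1, [(0.0, 1.5), (0.5, 1.7), (1.0, 2.1)])
import_run(2, [(0.0, 0.9), (0.5, 1.1), (1.0, 1.4)])

# Analysis stays a plain query because every pair is a row.
run_means = conn.execute(
    "SELECT run_id, AVG(y) FROM measurement GROUP BY run_id ORDER BY run_id"
).fetchall()
print(series(2))
print(run_means)
```

With 2000 fields per row, the per-run average above would need every column spelled out by name, which is the kind of analysis difficulty the answer warns about.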

How many columns in table to keep? - MySQL

I am stuck between a row-based and a column-based table design for storing some items. The decision comes down to which table is easier to manage and, if I go with columns, how many columns it is reasonable to have. For example, I have object metadata: ideally there are 45 pieces of information (after being normalized) on the same level that I need to store per object. So is 45 columns in a heavy read/write table good? Can it work flawlessly in a real-world situation of heavy concurrent reads/writes?
If all or most of your columns are filled with data and this number is fixed, then just use 45 fields. There's nothing inherently bad about 45 columns.
If all of the following conditions are met:
You may have attributes that are neither known nor predictable at design time
The attributes are only occasionally filled (say, 10 or less per entity)
There are many possible attributes (hundreds or more)
No attribute is filled for most entities
then you have a so-called sparse matrix. This (and only this) model can be better represented with an EAV table.
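To make the EAV (entity-attribute-value) idea concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module instead of MySQL so that it runs on its own. The table names, the attribute names, and the choice to store every value as text are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for MySQL (assumption)
conn.executescript("""
    CREATE TABLE entity (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- One row per (entity, attribute) that is actually filled in.
    CREATE TABLE entity_attribute (
        entity_id INTEGER NOT NULL REFERENCES entity(id),
        attribute TEXT    NOT NULL,
        value     TEXT    NOT NULL,
        PRIMARY KEY (entity_id, attribute)
    );
""")

# Hypothetical sample data: each entity fills only a few attributes.
conn.executemany("INSERT INTO entity VALUES (?, ?)",
                 [(1, "lamp"), (2, "radio")])
conn.executemany("INSERT INTO entity_attribute VALUES (?, ?, ?)",
                 [(1, "color", "red"),
                  (1, "weight_kg", "1.2"),
                  (2, "button_count", "6")])

def attributes(entity_id):
    """Return the filled-in attributes of one entity as a dict."""
    rows = conn.execute(
        "SELECT attribute, value FROM entity_attribute WHERE entity_id = ?",
        (entity_id,),
    )
    return dict(rows)

print(attributes(1))
print(attributes(2))
```

Note the cost: values lose their column types, and reassembling one entity takes a query plus application code, which is why the answer reserves this model for genuinely sparse data.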
"There is a hard limit of 4096 columns per table", so 45 columns should be just fine.
Taking the "easier to manage" part of the question:
If the property names you are collecting do not change, then columns are just fine. Even if the table is sparsely populated, disk space is cheap.
However, if you have up to 45 properties per item (row) but those properties might be radically different from one element to another then using rows is better.
For example taking a product catalog. One product might have color, weight, and height. Another might have a number of buttons or handles. These are obviously radically different properties. Further this type of data suggests that new properties will be added that might only be related to a particular set of products. In this case, rows is much better.
Another option is to go NoSQL and use a document-based database server. This would allow you to set the named "columns" on a per-item basis.
All of that said, management of rows will be done by the application. This will require some advanced DB skills. Management of columns will be done by the developer at design time, which is usually easier for most people to get their minds around.
I don't know if I'm correct, but I once read that in MySQL you should keep your tables to a minimum number of columns IF POSSIBLE (read: http://dev.mysql.com/doc/refman/5.0/en/data-size.html ). NOTE: this applies if you are using MySQL; I don't know whether the concept applies to other DBMSs like Oracle, Firebird, PostgreSQL, etc.
You could take a look at your table with 45 columns, analyze what you truly need, and move the optional fields into another table.
Hope it helps, good luck

Best usability practice for accepting long-ish account numbers

A user recently inquired (OK, complained) as to why a 19-digit account number on our web site was broken up into 4 individual text boxes of length [5,5,5,4]. Not being the original designer, I couldn't answer the question, but I'd always assumed that it was done in order to preserve data quality and possibly also to provide a better user experience.
Other more generic examples include Phone with Area Code (10 consecutive digits versus [3,3,4]) and of course SSN (9 digits versus [3,2,4])
It got me wondering whether there are any known standards out there on the topic? When do you split up your ID#? Specifically with regards to user experience and minimizing data entry errors.
I know there was some research into this, the most I can find at the moment is the Wikipedia article on Short-term memory, specifically chunking. There's also The Magical Number Seven, Plus or Minus Two.
When I'm providing IDs to end users, I personally like to break them up into blocks of 5, which appears to be the same convention the original designer of your system used. I've got no logical reason that I can give you for having picked this number other than it "feels right". Short of being able to spend a lot of money on carrying out a study, "gut instinct" and following conventions from other systems is probably the way to go.
That said, if you can make the UI more usable to the user by:
Automatically moving from the end of one field to the start of another when it's complete
Automatically moving from the start of one field to the prior field and deleting the last character when the user presses delete in an empty field that isn't the first one
OR
Replacing it with one long field that has some form of "input mask" on it (not sure if this is doable in plain HTML, but it may be feasible using one of the UI frameworks) so it appears like "_____ - _____ - _____ - ____" and ends up looking like "12345 - 54321 - 12345 - 1234"
It would almost certainly make them happier!
Don't know about standards, but from a personal point of view:
If there are multiple fields, make sure the cursor moves to the next field once a field is full.
If there's only one field, allow spaces/dashes/whatever to be used in that field because you can filter them out. It's really annoying when sites/programs force you to enter dates in "dd/mm/yyyy" format, for example, meaning the day/month must be padded with zeroes. "23/8/2010" should be acceptable.
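The "accept spaces and dashes, then filter them out" approach can be sketched on the server side as follows. This is only an illustration under stated assumptions: the 19-digit length and the [5,5,5,4] grouping come from the question, while the function names, the set of accepted separators, and the display format are hypothetical.

```python
import re

GROUPS = (5, 5, 5, 4)  # grouping from the question; 19 digits in total

def normalize_account_number(raw):
    """Strip whitespace and dashes, then require exactly 19 ASCII digits."""
    digits = re.sub(r"[\s\-]", "", raw)
    if len(digits) != sum(GROUPS) or not all(c in "0123456789" for c in digits):
        raise ValueError("expected %d digits" % sum(GROUPS))
    return digits

def format_account_number(digits):
    """Render the canonical digits in groups so the user can verify them."""
    parts, start = [], 0
    for size in GROUPS:
        parts.append(digits[start:start + size])
        start += size
    return " - ".join(parts)

# The user may paste the number in whatever format they keep it in.
canonical = normalize_account_number("12345-54321 12345 1234")
print(canonical)
print(format_account_number(canonical))
```

Storing the canonical digit string and formatting only for display keeps a single input field friendly to copy-paste while still showing the number in readable chunks.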
You need to consider the wider context of your particular application. There are always pros and cons of any design decision, but their impact changes depending on the situation, so you have to think every time.
Splitting the long number into several fields makes it easier to read, especially if you choose to divide the number the same way as most of your users. You can also often validate the input as soon as the user goes to the next field, so you indicate errors earlier.
On the other hand, users rarely type long numbers like that nowadays: most of the time they just copy-paste them from whatever note-keeping solution they have chosen, in whatever format they have it there. That means that a single field, without any limit on length or allowed characters, suddenly makes a lot of sense -- you can filter the characters out anyway (just make sure you display the final form of the number to the user at some point). There are also issues with moving the focus between fields, with browsers remembering previous values (with a single field you just have to select one number, not 4 parts of the same number), etc.
In general, I would say that as browsers slowly become more and more usable, you should take advantage of the mechanisms they provide by using the stock solutions, and not invent complex solutions of your own. You may be a step ahead of them today, but in two years the browsers will catch up and your site will suck.

MySQL Column Unification, any performance improvements?

I'm designing a MySQL table for an authentication system for a high-traffic personal website. Every time a user comment, article, etc is displayed the following fields will be needed:
login
User Display
User Bio ( A little signature )
Website Account
YouTube Account
Twitter Account
Facebook Account
Lastfm Account
So everything is in one table to prevent the need to call sub-tables. So my question is:
Would there be any improvement if I combined the Website, YouTube, Twitter, Facebook and Lastfm columns into one?
For example:
[website::something.com][youtube::youtube.com/something]
No, combining these columns would not result in any improvement. Indeed, it seems you would increase the overall length (by adding prefixes and separators), hence potentially worsening performance.
A few other tricks however, may help:
reduce the size of the values stored in the "xxxAccount" columns by removing altogether, or replacing with short-hand codes, the most common parts of these values (the examples shown indicate some kind of URL, whose beginning will likely be repeated).
depending on the average length of the bio, and typical text found therein, it may also be useful to find ways of shrinking its [storage] size, with simple replacement of common words, or possibly with actual compression (ZIP and such), although doing so may result in having to store the column in a BLOB column which may then become separated from the table, depending on the server implementation/configuration.
And, of course, independently from any improvements at the level of the database, the usage model indicated seems to call for caching this kind of data aggressively, to avoid the trip to SQL altogether.
Well, I don't think so. Think of it this way: you would need some way to split them, and that would require additional processing. And then why not just have one field in the whole table and put everything in that? :) Don't worry about the performance; it will be better with separate columns.
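The first trick above (dropping the repeated beginning of each account URL) might look like the sketch below. It is a hedged illustration only: the prefix table, the service keys, and the function names are assumptions, and the real URL patterns for each service would need to be checked before relying on them.

```python
import re

# Hypothetical per-column prefixes; one entry per "xxxAccount" column.
PREFIXES = {
    "youtube": "youtube.com/",
    "twitter": "twitter.com/",
    "facebook": "facebook.com/",
    "lastfm": "last.fm/user/",
}

def to_stored(service, url):
    """Drop the scheme, 'www.' and the service prefix; keep only the handle."""
    cleaned = re.sub(r"^(https?://)?(www\.)?", "", url.strip())
    prefix = PREFIXES[service]
    if not cleaned.startswith(prefix):
        raise ValueError("not a %s URL: %r" % (service, url))
    return cleaned[len(prefix):]

def to_display(service, stored):
    """Rebuild the displayable address from the stored handle."""
    return PREFIXES[service] + stored

handle = to_stored("youtube", "http://www.youtube.com/something")
print(handle)
print(to_display("youtube", handle))
```

Each account keeps its own column, so no parsing of a combined string is needed; only the per-column value gets shorter.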