Displaying combined similar data - HTML

I'm designing a web application - prototyping and wireframing the main pages so I've got an idea of what it will do. I'm struggling on how to display my data to users.
We basically provide them with an email inbox, a phone message system and a fax system. This means three different types of data: one textual, one audio and one visual. They share some common properties, however, and the point of our service is to unify users' communications, so it makes sense to combine them.
Mashing the data together in any way results in a very sparse summary; the only information they share is the sender and the date. So after spending 5 hours agonising over design decisions I thought I'd open it up. The options we're leaning toward are:
Show a 'unified inbox' with a link to view the full item details on a per line basis
Drop the idea of a dashboard and just have an individual inbox in the web interface for each service. We can display the number of new messages on the tab for the service so they know there are new messages
Show a very simple summary as the dashboard, merely showing the number of new 'communications' in each of the users inboxes (fax, email, voice).
What is best from a design perspective? We could conduct user testing, but it's a shoestring startup, so the cost of mocking up three complete UIs is prohibitive at this point.

I'm confused about what the question is - should we suggest the UI layout? Or are you looking for ideas on how to prototype / play around with a look / solution?
I use Balsamiq Mockups for all my UI designs; spend some time laying it out - it is a great way to visualize what you want, and it adds a level of interactivity as well.
Hopefully that's somewhat along the lines of what you were asking ;).
Otherwise I would go with something like you mentioned above:
Show a summary / dashboard page showing, say, the last 10 messages (voice / email / fax).
Show # of new items per service, and go from there.

As I understand it, your problem is that you can't show anything useful for fax or voicemail?
Still, what would be gained by separate inboxes? If you want to unify these three services, separating by type is exactly what you don't do. The most important search / access vectors are WHO and WHEN.
(There is of course the need to search for "the fax from Mr. Lyle", so filtering by type should be possible. But it's not the fundamental access filter)
My suggestions (I understand that some of these might be complicated):
Single inbox. Icon for type.
If possible, try using "natural times" such as "a few minutes ago" or "yesterday, 12:31" (if you use them for minutes, you may need to do that Ajaxy thing to refresh them); see the sketch after this list.
E-mail: include the title of the e-mail / text message. If you can, add a line of text filled from the body, omitting line breaks, until you reach a certain character count or the width of the panel.
Fax: it might help to show the # of pages (not sure if this is possible) and a mouseover for a thumbnail. The first deals with people failing to send all pages at once, the second with people inserting them the wrong way around.
Audio: allow playback right from the inbox. Duration might be helpful to filter out "oh, it's voice mail, I'll hang up" calls; it's also a good preview of how much time I need to "read" this message.
Don't add irrelevant data just because it's shared between the types (e.g. size).
Sort by time received (or time sent if available?).
If there are many unread messages in the inbox and there are multiple messages from the same sender without other messages in between, you can collapse them (e.g. only show the first two of the sequence plus a "more messages from Joanna..." link). This keeps a single important message from being drowned out by communicators gone wild.
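To make the "natural times" and preview-line ideas concrete, here is a minimal sketch in JavaScript; the thresholds and the 80-character cut-off are arbitrary choices, not rules:

```javascript
// A sketch of "natural times" plus a one-line body preview.
// Thresholds and the 80-character cut-off are arbitrary, not rules.
function naturalTime(date, now = new Date()) {
  const minutes = Math.round((now - date) / 60000);
  if (minutes < 1) return "just now";
  if (minutes < 60) return `${minutes} min ago`;
  const days = Math.floor(minutes / 1440);
  if (days === 0) return `${Math.floor(minutes / 60)}h ago`;
  if (days === 1) {
    const hh = date.getHours();
    const mm = String(date.getMinutes()).padStart(2, "0");
    return `yesterday, ${hh}:${mm}`;
  }
  return date.toLocaleDateString();
}

// Collapse line breaks and truncate the body for the preview line.
function previewText(body, maxChars = 80) {
  const flat = body.replace(/\s+/g, " ").trim();
  return flat.length <= maxChars ? flat : flat.slice(0, maxChars - 1) + "…";
}
```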
An option would be to group by sender, at least for selected senders, so that it reads:
From Joanna
|V| 5 min ago Hey again Joe, just wanted to say....
o<| 5 min ago Hello, it's me! Hmm, it seems you really are on a business...
72 new since yesterday ( |=| 5 o<| 52 |V| 15)
From Mr Lyle
|=| yesterday, 12:12 7 pages [show]
Other
|V| 1h ago gunk1243@523.com Cheap Torpedoes Your best source of cheap, ...
|V| 1h ago gunk563@523.com Torpedoes Mania ON SALE! GU 537! sinks any ...
12 new since yesterday ( |V| 12)
Mr. Lyle doesn't have an abstract since there is only one new message. Clicking an abstract would expand that list; clicking a user would show you messages (including old ones) only from that user.
Phew. Hope that helps.

Related

If I have a collection of random websites, how do I get specific information from each?

Say I have a collection of websites for accountants, like this:
http://www.johnvanderlyn.com
http://www.rubinassociatespa.com
http://www.taxestaxestaxes.com
http://janus-curran.com
http://ricksarassociates.com
http://www.condoaudits.com
http://www.krco-cpa.com
http://ci.boca-raton.fl.us
What I want to do is crawl each and get the names & emails of the partners. How should I approach this problem, at a high-level?
Assume I know how to actually crawl each site (and all subpages) & parse the HTML elements -- I am using Oga.
What I am struggling with is how to make sense of data that is presented in a wide variety of ways. For instance, the email address for the firm (and/or partner) can be found in one of these ways:
On the About Us page, under the name of the partner.
On the About Us page, as a generic catch-all email.
On the Team page, under the name of the partner.
On the Contact Us page, as a generic catch-all email.
On a Partner's page, under the name of the partner.
Or it could be any other way.
One way I was thinking about approaching the email is just to search for all mailto anchor tags and filter from there.
The obvious downside for this is that there is no guarantee that the email will be for the partner and not some other employee.
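For reference, the mailto pass itself is short. Here's a sketch in JavaScript run in page context (e.g. in a headless browser); an Oga CSS query in Ruby would use the same selector:

```javascript
// A sketch of the mailto idea, run in page context (e.g. in a headless
// browser); an Oga CSS query would use the same selector.
const emails = [...document.querySelectorAll('a[href^="mailto:"]')].map((a) =>
  decodeURIComponent(a.getAttribute("href"))
    .replace(/^mailto:/i, "")
    .split("?")[0] // drop ?subject=... parameters
    .trim()
);
console.log(emails);
```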
Another issue that is more obvious is detecting the partner(s) names just from the markup. I was initially thinking I could just pull all the header tags and text in them, but I have stumbled across a few sites that have the partner names in span tags.
I know SO is usually for specific programming questions, but I am not sure how to approach this and where to ask this. Is there another StackExchange site that this question is more appropriate for?
Any advice on specific direction you can give me would be great.
I looked at the http://ricksarassociates.com/ website and I can't find any partners at all, so make sure you actually stand to gain from this; if not, you'd better look for some other idea.
I have done similar data scraping from time to time, and in Norway we have laws - or should I say "laws" - saying you are not allowed to email people, but you are allowed to email the company - so in a way it's the same problem from another angle.
I wish I knew maths and algorithms by heart, because I am sure there is a fascinating solution hidden in AI and machine learning, but in my mind the only solution I can see is building a rule set that probably gets quite complex over time. Maybe you could apply some Bayesian filtering - it works very well for email.
But, to be a little more productive here: one thing I know is important is to start by creating the crawler environment and building the dataset. Have a database for URLs so you can add more at any time, and start the crawling on what you have already, so that you do your testing by querying your own 100% local copy. This will save you enormous time compared to live scraping while tweaking.
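To illustrate, here is a minimal sketch of that setup, assuming Node with the better-sqlite3 package; the file and table names are just illustrative:

```javascript
// A sketch of the crawl-once, parse-locally setup, assuming Node with the
// better-sqlite3 package; file and table names are illustrative.
const Database = require("better-sqlite3");
const db = new Database("crawl.db");
db.exec(`CREATE TABLE IF NOT EXISTS pages (
  url        TEXT PRIMARY KEY,
  html       TEXT,
  fetched_at TEXT DEFAULT (datetime('now'))
)`);

// Store each fetched page once; later parsing runs query this local copy
// instead of hitting the live sites while you tweak your rules.
const save = db.prepare("INSERT OR REPLACE INTO pages (url, html) VALUES (?, ?)");
save.run("http://www.example.com/about", "<html>...</html>");
```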
I did my own search engine some years ago, scraping all NO domains, though I only needed the index file at the time. It took over a week just to scrape it all down, and I think it was 8GB of data for that single file, and I had to use several proxy servers as well to make it work, due to problems with too much DNS traffic. Lots of problems needed taking care of. I guess I am only saying: if you are crawling at a large scale, you might as well start getting the data down now if you want to work efficiently with the parsing later.
Good luck, and do post if you get a solution. I do not think it is possible without an algorithm or AI though - people design websites the way they like and they pull templates out of their arse, so there are no rules to follow. You will end up with bad data.
Do you have funding for this? If so, it's simpler. Then you could just crawl each site and make a profile for each site. You could employ someone cheap to manually go through the parsed data and remove all the errors. This is probably how most people do it, unless someone has already done it and the database is for sale / available as a web service so it can be scraped.
The links you provide are mainly US sites, so I guess you are focusing on English names. In that case, instead of parsing HTML tags, I would just search the whole webpage for names. (There are free databases of first names and last names.) This may also work if you are doing this for other European companies, but it would be a problem for companies from some countries. Take Chinese as an example: while there is a fixed set of last names, one may use basically any combination of Chinese characters as a first name, so this solution won't work for Chinese sites.
It is easy to find an email in a webpage, as there is a fixed format of (username)@(domain name) with no space in between. Again, I wouldn't treat it as HTML tags but just as a normal string, so that the email can be found whether it is in a mailto tag or in plain text. Then, to determine what email it is:
Only one email in page?
Yes -> catch-all email.
No -> Is a name found in that page as well?
No -> catch-all email (there can be more than one catch-all email, maybe for different purposes like info + employment)
Yes -> The email should be attached to the name found right before it. It is normal for the name to appear before the email.
Then, it should be safe to assume the name appearing first belongs to a more important member, e.g. a chairman or partner.
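That decision tree is only a few lines in code. A rough sketch, assuming the emails and candidate names have already been pulled from the raw page text:

```javascript
// A rough sketch of the decision tree above, assuming `emails` and `names`
// were already extracted from the raw page text.
function classifyEmails(pageText, emails, names) {
  return emails.map((email) => {
    if (names.length === 0) return { email, owner: "catch-all" };
    const at = pageText.indexOf(email);
    // Attach the email to the closest name that appears before it.
    let owner = null;
    let best = -1;
    for (const name of names) {
      const pos = pageText.lastIndexOf(name, at);
      if (pos > best) {
        best = pos;
        owner = name;
      }
    }
    return { email, owner: owner || "catch-all" };
  });
}
```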
I have done similar scraping for these types of pages, and it varies wildly from site to site. If you are trying to make one crawler to sort of auto find the information, it will be difficult. However, the high level looks something like this.
For each site you check, look for element patterns. Divs will often have labels, IDs, and classes which will easily let you grab information. Perhaps you find that many divs have a particular class name. Check for this first.
It is often better to grab too much data from a particular page, and boil it down on your side afterwards. You could, perhaps, look for information which comes up on a screen by utilizing type (is link) or regex (is email) to look for formatted text. Names and occupation will be harder to find by this method, but might be related positionally on many pages to other well formatted items.
Names will often be affixed with honorifics (Mrs., Mr., Dr., JD, MD, etc.). You could come up with a bank of those, and check against them for any page you end up on.
Finally, if you really wanted to make this process general purpose, you could do some heuristics to improve your methods based off of expected information; names, for example, are most often within a particular list. If it was worth your time, you could check certain text for whether it matches a list of more common names.
What you mentioned in your initial question suggests you would benefit a lot from a general-purpose regular-expression crawler, and you could make improvements on it as you learn more about the sites you interact with.
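As a starting point, such a general-purpose pass might look like the sketch below; the email regex and the honorific-based name pattern are deliberately rough seeds to refine per site:

```javascript
// A rough general-purpose pass: an email regex plus an honorific-prefixed
// name pattern. Both are seeds to refine per site, not production rules.
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const NAME_RE = /\b(?:Mr|Mrs|Ms|Dr)\.?\s+([A-Z][a-z]+(?:\s+[A-Z][a-z]+)+)/g;

function roughExtract(pageText) {
  const emails = pageText.match(EMAIL_RE) || [];
  const names = [];
  let m;
  NAME_RE.lastIndex = 0; // reset in case of repeated calls
  while ((m = NAME_RE.exec(pageText)) !== null) names.push(m[1]);
  return { emails, names };
}
```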
There are excellent posts on this topic with a lot of useful links throughout these webpages:
https://www.quora.com/What-is-a-good-web-scraper-for-pulling-emails-names-etc-even-if-the-contact-info-is-another-page-deep-a-browser-add-on-is-a-plus
http://www.hongkiat.com/blog/web-scraping-tools/
http://www.garethjames.net/a-guide-to-web-scraping-tools/
http://www.butleranalytics.com/15-web-scraping-tools/
Some of the applications examined there run on macOS.

Line Breaks in Web Push Notifications on Chrome

So for my app, which sends out web push notifications, I wanted to show an exact preview of how a notification will look when sent out.
I wanted to know the rules/patterns, but could not find any standard doc on this.
How many characters does it allow in the message field at most? (I suppose it is ~100 characters.)
How does it treat line breaks / newlines?
I experimented by sending a few notifications, but the results did not follow a pattern: some showed only a few line breaks (~3 out of many), while in others it was up to 5.
For example, the message content for one of the notifications was:
Exclusive Offer at Reliance Trends Aundh
Welcome to Diwali Offer !
At Reliance Trends
50% Off
On our exclusive
Electronics Limited Period
It went out like this: [screenshot of the resulting notification omitted]
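For anyone wanting to reproduce this, here is roughly how I fire the test notifications; a minimal sketch assuming a service worker is already registered, since Chrome's truncation rules are undocumented:

```javascript
// A minimal sketch for testing, assuming a service worker is already
// registered; Chrome decides how many lines and characters survive.
navigator.serviceWorker.ready.then((reg) => {
  reg.showNotification("Exclusive Offer at Reliance Trends Aundh", {
    body: "Welcome to Diwali Offer !\nAt Reliance Trends\n50% Off\n" +
          "On our exclusive\nElectronics Limited Period",
  });
});
```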

How can I summarize the updates to a table on a page I browse?

I am a student at a university. With the placement process going on, we have an internal placement website that shows updates and status about the various companies I have applied to. Since the number of companies is so large, it becomes cumbersome to scroll through the complete list to find information. Sometimes I just miss things. Now, to tackle this problem, here is what I want to do:
The data is in an HTML table. Each row shows information about one company: some dates, status (Not / Shortlisted / Applied), some yes/no options, etc., each in a different column. Once I open the page, I want to be able to extract information about which companies I got shortlisted in, and in which ones I didn't make it.
What is the right technology to do this? I am thinking of writing a Greasemonkey user script (I have never actually written one, but how hard could it be?). What other options do I have?
Edit: I don't quite understand why this question has been voted to be closed.
I just described a use case for something general: on opening a web page, automatically extract information from the page and display it to the user. What is the easiest and sufficiently powerful way to achieve this?
Since you can't get access to the website's database, Greasemonkey would be your best automation approach. However, this task is likely to be over before you can get a decent script up from scratch.
Your best practical approach is to save the pages and/or copy and summarize the data in MS Excel, or equivalent.
~~~~~~~~~
Here at SO, we will not develop any but the simplest Greasemonkey scripts for you from scratch (unless they are fun somehow ;) ). But you can sometimes get such help in the "Script requests forum" at userscripts.org.
In order for someone to help you, they will need:
A clear idea of exactly what data gets manipulated, and how.
Access to the target site. Or access to saved snapshots of the target pages. GM scripts are extremely dependent on the details of the target page.
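To show the scale of script involved, here is a bare-bones sketch; the @match URL is hypothetical, and the status is assumed to sit in the third column:

```javascript
// ==UserScript==
// @name  Placement shortlist summary
// @match https://placement.example.edu/*
// ==/UserScript==
// Sketch only: the @match URL is hypothetical, and the status text is
// assumed to sit in the third column of the table.
const shortlisted = [];
for (const row of document.querySelectorAll("table tr")) {
  const cells = row.cells;
  if (cells.length > 2 && /shortlisted/i.test(cells[2].textContent)) {
    shortlisted.push(cells[0].textContent.trim()); // company name, first column
  }
}
alert("Shortlisted in:\n" + shortlisted.join("\n"));
```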
"other option":
ctrl + F
enter shortlisted
enter
ctrl + G <-- repeat last search

What features can improve the usability of a WebApp

I am looking for new features and ideas to improve the overall usability of our internal webapp (a straight LOB app with some CRM features).
I bet there is a ton of those waiting to get found.
As an example:
Recently I tried out rememberthemilk.com, a task-tracking application which has a feature to enter dates in natural language. Instead of using the date picker or typing the date itself (grabbing the mouse actually takes longer, and forces you to think about what the date is), you can just write "today" or "tomorrow" or "end of month" or "in 2 weeks". That feature really got me; every time I use another application now, I wonder why I can't do this here. I wonder why other applications make me think about what date "next friday" is. I don't care! But I do care that my boss just said "I need this till next friday".
1 feature/idea per answer please.
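To make the natural-language date idea concrete, here is a minimal sketch handling only a few fixed phrases; real apps usually lean on a parsing library instead:

```javascript
// A sketch of natural-language date entry handling only a few fixed
// phrases; real apps usually lean on a parsing library instead.
function parseNaturalDate(text, now = new Date()) {
  const d = new Date(now);
  switch (text.trim().toLowerCase()) {
    case "today":
      return d;
    case "tomorrow":
      d.setDate(d.getDate() + 1);
      return d;
    case "end of month":
      return new Date(d.getFullYear(), d.getMonth() + 1, 0); // day 0 of next month = last day of this one
  }
  const weeks = /^in (\d+) weeks?$/i.exec(text.trim());
  if (weeks) {
    d.setDate(d.getDate() + 7 * Number(weeks[1]));
    return d;
  }
  return null; // fall back to the date picker
}
```

Returning null for anything unrecognized lets the UI quietly fall back to the ordinary date picker, so the feature never blocks the user.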
Take a look at Nielsen's list of 10 Heuristics for Usability Design. Entire courses and books are designed around these 10 laws - they're very appropriate and I really wish more companies would use them (I'm looking at you, Adobe).
http://www.useit.com/papers/heuristic/heuristic_list.html
Learn User Experience Design
...developers are notoriously bad at developing usable interfaces for their applications. We make them make sense to us, but not to the user.
One of the things that stuck with me about human cognition is that the human mind concentrates best when it has 5 +/- 2 things to worry about. Therefore the most effective interface will, at most, present the user with 5 +/- 2 elements to work with at any given time. Otherwise, they will be overwhelmed.
The #1 way to improve the user experience is to do some usability testing: find someone who is unfamiliar with the application, ask them to do some simple tasks, and record the session with screen recording software.
Sometimes it's difficult for people to explain why they think using an application is difficult; simply watching someone perform basic functions can help you find issues that the user may not even have considered to be a problem.

Web displays: Paging vs. long tables

It seems that the trend in web design is to provide paged output, where long tables are displayed a page at a time. My customers don't like that, and have requested that the web sites I design for them show all entries in long tables. The arguments for paging seem to be mostly based on the performance hit of displaying long tables, and this is less of a concern in a high-bandwidth corporate intranet. Arguments against paging include the ability to print the entire table, do string searches against the entire table, select arbitrary ranges from the entire table for copying, etc. I've pointed out that these features can easily be added to paged web designs (e.g. a print button that prints the entire table, or a button that creates a CSV file of the table), but the paged output still seems inconvenient to them. Our typical table is about 100 to 600 items. Obviously tables that were significantly larger would probably have to be paged.
Questions:
What is your experience with personal or customer preferences for paged vs. full output in long tables?
Web design tools seem to be pushing the paging paradigm. Are they out of touch, or are my customers unusual?
If you're thinking "It depends on the length of the table", what threshold would you use?
I love long one-page listings.
One of the few reasons I can see for paged listings is the one you point out about performance.
I think your customers are very usual and in-touch.
The threshold would be about page loading times: when the server can't produce the full list fast enough, or when the list gets so long that the browser slows down. (The latter can happen for quite short lists if you have non-a-tag hover stuff in your CSS and the browser is IE.)
Give the users a powerful search function and they'll narrow down their page lists themselves.
Why not simply have it be a user-configurable option? It sounds like you plan to essentially implement both anyway.
To be honest, I think that no matter which you choose, someone will complain. At least with it being user-configurable, you have the ability to put it back on the user.
Provide a default page length, and a configurable parameter (e.g. in the query string for programmatic use, and/or a form on the webpage for interactive use) to control how many listings are in a page.
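In sketch form, with an Express-style handler; fetchItems is a hypothetical data-access helper:

```javascript
// A sketch of a query-string page size in an Express-style handler;
// fetchItems(offset, limit) is a hypothetical data-access helper.
const express = require("express");
const app = express();

app.get("/items", (req, res) => {
  const pageSize = Math.min(Number(req.query.pageSize) || 50, 100000); // default 50, capped
  const page = Math.max(Number(req.query.page) || 1, 1);
  const rows = fetchItems((page - 1) * pageSize, pageSize);
  res.json({ page, pageSize, rows });
});
```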
User flexibility is good. Texas Instruments has a parametric search tool for electrical engineers to find ICs that meet certain technical characteristics, and they include links both to "show all" in a webpage and to "download all" as a .csv file. That's a good model; kudos to TI. Ditto for Flickr: their API lets you control (to a large extent) how many results show up on a web service call.
I personally HATE websites that default to 10 listings per page with no way to increase it. It takes FOREVER to browse them, and I'm willing to wait longer if I can get all the stuff at once.
If it's an interactive webpage, I would consider going to an AJAX solution that downloads 100 at a time so there's an indication of progress (and the user can stop it if there are 20000 results).
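A sketch of that incremental approach; the /items endpoint, the #results table, and the `name` field are all illustrative:

```javascript
// A sketch of incremental loading: pull 100 rows at a time so the user
// sees progress and can stop; the endpoint and markup are illustrative.
let offset = 0;
let stopped = false; // set to true from a "Stop" button

async function loadAll() {
  while (!stopped) {
    const res = await fetch(`/items?offset=${offset}&limit=100`);
    const batch = await res.json();
    if (batch.length === 0) break;
    for (const item of batch) {
      const tr = document.createElement("tr");
      tr.textContent = item.name; // assumes each row has a `name` field
      document.querySelector("#results").appendChild(tr);
    }
    offset += batch.length;
  }
}
```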
I agree with PEZ, it's all about responsiveness.
Best solution: Don't provide lists with more than 100 items.
Usually your user doesn't want to read more than 100 or even 600 items. They just don't care. They are searching for one (or possibly a few). Make sure that there's a way for them to get to those items without visual-grep-ing through the list.
And if your client insists on displaying all items, then provide paging with a configurable page size and let him enter "100000 items per page" if he wants to.
One of the seminal books on web design (sorry, I forget which one) used to say not to count on your users scrolling down, because most of them don't know how or can't be bothered. I think a more recent update says that while this is true for the general public, certain sectors of more technical users can be expected to scroll down, and you can make pages that require scrolling IFF (if and only if) you know your users can handle it.
I can understand your situation extremely well; I have been in a similar situation. I moved a business workflow from being manually managed to an automated one. Initially it was carried out using Excel spreadsheets. The stakeholders for my software were in the 55+ age group; they don't like anything Ajaxy or any of the UI patterns you are talking about. In such cases data retrieval logic can be optimized. Any table that touches the 1K mark, or has items like image blobs, should be shown in parts from a performance point of view.
Long outputs slow rendering and will be a performance leech.
Customers don't want change most of the time, and the customer is always right unless you can convince them.
I have put forth my threshold, but it also depends on the content of the rows.
Happy Coding!