I am using this MySQL query to count page views:
UPDATE content SET views=views+1 WHERE id='$id'
For example, to check how many times a single page has been viewed, I just put it at the top of the page code. Unfortunately, I always get a number about 5-10x bigger than what Google Analytics reports.
If I am correct, one refresh should increase the value in my database by 1. Doesn't "Views" in Google Analytics work the same way?
Say Google Analytics tells me that a single page has been viewed 100 times and my database says 450 times. How could such a simple query generate an additional 350 views? And I don't mean visits or unique visits, just regular views.
Is it possible that Google Analytics interprets the data a little differently and my database result is correct?
There are quite a few reasons why this could be occurring. The most usual culprit is bots and spiders. As soon as you use a third-party API like Google Analytics, or Facebook's API, you'll get their bots making hits to your page.
You need to examine each request in more detail. The user agent is a good place to start, although I do recommend researching this area further - discriminating between human and bot traffic is quite a deep subject.
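As a starting point for the user-agent check, here is a minimal sketch in Python (the page in the question is presumably PHP, so treat this as pseudocode for the idea). The pattern list and the function names are my own assumptions, not a complete crawler list, and plenty of bots send a browser-like user agent that this will not catch.

```python
import re

# Hypothetical hints only; real crawler lists are much longer and change over time.
BOT_PATTERN = re.compile(r"bot|crawl|spider|slurp|facebookexternalhit", re.IGNORECASE)

def looks_like_bot(user_agent):
    """Return True when the User-Agent is missing or matches a crawler hint."""
    if not user_agent:
        return True  # many scripted clients send no User-Agent at all
    return bool(BOT_PATTERN.search(user_agent))

def should_count_view(user_agent):
    """Only increment the views column when this returns True."""
    return not looks_like_bot(user_agent)
```

Running the counter only when `should_count_view(...)` is true should bring the number closer to Google Analytics, though it will not close the gap completely.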
In Google Analytics the data is sent by the user's browser. For example:
A user views a page on your domain, and the browser is then in charge of communicating the pageview to Google. If something fails along the way, the data will not be included in the reports.
Your SQL system, on the other hand, is log-based analytics: the data is collected by your own server, which reduces data collection failures.
Seen that way, Google Analytics can miss data from slow connections and from users who don't execute JavaScript (ad blockers or bots), or when the HTML page is not properly rendered***.
That said, 5x is a huge discrepancy; in my experience it is usually around 8-25% (tested at the transaction level; for pageviews it may be more).
What I recommend is:
Save the device and browser information, the IP, and any other metadata that may be useful, and don't forget the timestamp. That way you can isolate the problem: maybe it is robots or users with ad blockers, or in the worst case your code is not properly implemented (located in the footer, for example).
*** I added this because I once had a huge discrepancy that turned out to be a server error: the HTML was not properly output, and the user got an empty HTTP response. MySQL was not fast enough to save the information and process the HTML. I noticed it when a load test (via Screaming Frog) showed a lot of 500 errors (a WordPress blog with no cache).
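To make the "save the metadata" advice concrete, here is a rough sketch that logs one row per view instead of only bumping a counter. It uses Python's built-in sqlite3 as a stand-in for MySQL, and the table and column names (`view_log`, `viewed_at`, and so on) are made up for the example.

```python
import sqlite3
from datetime import datetime, timezone

def open_log(path=":memory:"):
    """Open the database and create the (hypothetical) view_log table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS view_log ("
        "content_id INTEGER, viewed_at TEXT, ip TEXT, user_agent TEXT)"
    )
    return conn

def log_view(conn, content_id, ip, user_agent):
    """Store one row per request, with a UTC timestamp."""
    # Placeholders keep the values out of the SQL text itself.
    conn.execute(
        "INSERT INTO view_log VALUES (?, ?, ?, ?)",
        (content_id, datetime.now(timezone.utc).isoformat(), ip, user_agent),
    )
    conn.commit()

def views_by_user_agent(conn, content_id):
    """Return (user_agent, count) pairs, most frequent first."""
    rows = conn.execute(
        "SELECT user_agent, COUNT(*) FROM view_log "
        "WHERE content_id = ? GROUP BY user_agent ORDER BY COUNT(*) DESC",
        (content_id,),
    )
    return rows.fetchall()
```

Grouping by user agent (or by IP, or by minute) usually shows quickly whether the extra hits come from a handful of crawlers.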
I have a contest entry page on my company's website. In order to enter the contest, you create a login, which is just an email and a 4-digit pin. Here's the PIN field:
<input type="password" name="contest_pin" id="contest_pin" maxlength="4" />
When users submit the form, the account is created in our database, and then they get an email (which I'm copied on) that contains the email address and PIN they created.
Here's the issue: in every browser I've tested (Safari/Chrome/Firefox on Mac, Chrome/Firefox on Linux, IE7/8/9 on Windows) I CANNOT enter more than 4 digits into that PIN field. And yet, several of the emails I've received show that the user has created a pin with more than 4 characters.
How is this possible? Are there browsers that don't support maxlength? I haven't tested in Opera, or on any of the mobile browsers. It's not a huge deal if their pin is longer than 4 digits; the database will accept more. I'm just wondering how they managed to get around maxlength.
EDITED TO ADD
There are too many answers basically saying the same thing for me to respond individually to all of them. I KNOW that I should always do server-side validation for anything important, and we do have PHP code in place sanitizing our data, and if it was hugely important I would also have PHP code enforcing the 4-digit limit. It's not that important to us that they be only 4 characters, so I haven't enforced it. I'm just wondering why the maxlength property is not doing what it's designed to do, which is prevent users from entering more than a certain number of characters. For those of you that suggested malicious scripts or Firebug, I can be 100% certain this is not the case. Only registered users of our site (which is limited to a very specific corporate membership list) can even get to the contest entry page, and I can guarantee that none of the approximately 100 people on that list are going to be deliberately trying to circumvent an input type property.
They very likely are bots that read field names and create GET and POST requests based on those rather than using the HTML form like a normal human user would.
This is why client-side validation of form is never enough to ensure data is correct. Client-side validation is nice as it's responsive for end users, but it's not able to prevent bad data from arriving at your server's doorstep.
As an example, let's say I have an input field in a form whose action is GET. My input field's maxlength is 4. When I press submit, I see the URL ending with ?field=1234. There's nothing stopping me from updating that URL to ?field=123456789 and pressing enter. Similar things can be done with POST actions, but a tool is needed to do it.
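To illustrate, this small Python sketch builds the same kind of request without any browser involved. The base URL is a placeholder and the parameter name `field` is taken from the example above; the point is only that nothing in the request path knows about maxlength.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def build_submission_url(base_url, value):
    """Build the GET URL a form would produce, ignoring any maxlength."""
    return base_url + "?" + urlencode({"field": value})

def read_field(url):
    """What the server sees after parsing the query string."""
    return parse_qs(urlsplit(url).query)["field"][0]

url = build_submission_url("https://example.com/form", "123456789")
print(url)              # the URL carries all nine characters
print(read_field(url))  # and the server receives all nine
```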
I believe every browser supports it. Here are a few links for reference:
Maxlength | SitePointReference
Maxlength | W3 Schools
Obviously there are ways around this. You should ALWAYS have adequate server-side validation, as client-side validation usually isn't enough on its own.
All browsers support maxlength. However, this attribute can easily be removed/changed using DOM methods or, for example, with Firefox's Web Developer Toolbar.
several of the emails I've received show that the user has created a pin with more than 4 characters.
How is this possible? Are there browsers that don't support maxlength?
I would investigate the USER_AGENT and REFERER headers related to those user activities. Perhaps a malicious user submitted forms programmatically circumventing the browser restrictions, just to check your perimeter defense. If so you should see some patterns there.
These educated guesses aside, maxlength should not be treated as a means of securing the input. Anything client-side is not under your control; it exists merely to make the user interface more intuitive and interactive. You should always check everything on the server. In this case, check that the PIN is composed of exactly 4 digits and reject the input otherwise. The golden rule is to treat all user input as hostile and thoroughly validate it on the server.
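The server-side check itself is tiny. Here is a sketch in Python (your site uses PHP, where the equivalent would be a regular expression match on the posted value); the function name and the error message are my own.

```python
import re

# Exactly four ASCII digits, nothing before or after.
PIN_PATTERN = re.compile(r"[0-9]{4}")

def validate_pin(raw):
    """Return the PIN if it is exactly four ASCII digits, else raise ValueError."""
    if not isinstance(raw, str) or PIN_PATTERN.fullmatch(raw) is None:
        raise ValueError("PIN must be exactly 4 digits")
    return raw
```

Using `fullmatch` rather than `match` matters here: `match` would happily accept "12345" because its first four characters are digits.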
In general, trying to enforce rules for user input done client-side is a bad idea. I had an experience where we had contracted out some work to some programmers and their idea of sanitizing user input was making it so that users couldn't input more than 10 characters in any given field. A quick firebug change and, oh look, I can drop the server's database with some minimal SQL injection.
If I were you I'd check maximum lengths with whatever script adds user information to your database and return form validation errors if the user input exceeds the maximum specified length.
This is the type of thing that you should still validate server side, even though the clients will almost always support it. It is very easy to get around a maxlength: the Firefox Web Developer toolbar includes an option to "Remove Maximum Lengths", or a request can very easily be hand-edited. I also seem to remember that in the past you could get around a maxlength in one of the browsers simply by using cut and paste (e.g., the browser wouldn't let you type more characters, but if you pasted a value that was 5+ characters it would enter them all), though I can't remember specifically which browser I saw that on...
I'm designing a web application - prototyping and wireframing the main pages so I've got an idea of what it will do. I'm struggling on how to display my data to users.
We basically provide them with an email inbox, a phone message system and a fax system. This means three different types of data - one is textual, one is audio and one is visual. They share some common properties however, and the point of our service is to unify users communications, so it makes sense to combine them.
Mashing the data together in any way results in a very sparse summary, since the only information they share is the sender and the date. So after spending 5 hours agonising over design decisions I thought I'd open it up. The options we're leaning towards are:
Show a 'unified inbox' with a link to view the full item details on a per line basis
Drop the idea of a dashboard and just have an individual inbox in the web interface for each service. We can display the number of new messages on the tab for the service so they know there are new messages
Show a very simple summary as the dashboard, merely showing the number of new 'communications' in each of the users inboxes (fax, email, voice).
What is best from a design perspective? We could conduct user testing, but it's a shoestring startup, so the costs of mocking up 3 complete UI's is prohibitive at this point.
I'm confused what the question is, should we suggest the UI layout? Or are you looking for ideas on how to prototype / play around with a look / solution?
I use Balsamiq Mockups for all my UI designs. Spend some time laying it out; it is a great way to visualize what you want, and it adds a level of interactivity as well.
Hopefully that's somewhat along the lines of what you were asking ;).
Otherwise I would go with something like you mentioned above:
Show a summary / dashboard page showing say last 10 of the last messages (voice / email / fax).
Show # of new items per service, and go from there.
As I understand, your problem is that you can't show anything useful for Fax or voicemail?
Still, what would be gained by separate inboxes? If you want to unify these three services, separating by type is what you don't do. The most important search / access vectors are WHO and WHEN.
(There is of course the need to search for "the fax from Mr. Lyle", so filtering by type should be possible. But it's not the fundamental access filter)
My suggestions (I understand that some of these might be complicated):
Single inbox. Icon for type.
If possible, try using "natural times" such as "a few minutes ago", "yesterday, 12:31" (if you use it for minutes, you may need to do that ajaxy thing to refresh them).
E-mail: Include the title of the e-mail / text message. If you can, add a line of text filled from the body, omitting line breaks, until you reach a certain character count or the width of the panel.
Fax: it might help to show # of pages (not sure if this is possible) and mouseover for thumbnail. The first deals with people failing to send all pages at once, the second with people inserting them the wrong way around.
Audio: Allow to play right from the inbox. Duration might be helpful to filter out "oh, it's voice mail, I'll hang up" calls, it's also a good preview on how much time I need to "read" this message.
Don't add irrelevant data just because it's shared between the two (e.g. size).
Sort by time received (or time sent if available?).
If there are many unread messages in the inbox, and there are multiple messages from the same sender with no other messages in between, you can collapse them (e.g. show only the first two of the sequence and a "more messages from Joanna..." link). This helps keep an important single message from being drowned out by communicators gone wild.
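The collapsing rule from the last suggestion can be sketched in a few lines of Python. The tuple layout and the function name are assumptions for the example; the input is expected to be sorted by time already.

```python
from itertools import groupby

def collapse_runs(messages, keep=2):
    """messages: list of (sender, text) tuples, already sorted by time.

    Returns display rows: the first `keep` messages of each consecutive run
    from one sender, followed by a "more" row when the run was longer.
    """
    rows = []
    for sender, run in groupby(messages, key=lambda m: m[0]):
        run = list(run)
        rows.extend(run[:keep])
        hidden = len(run) - keep
        if hidden > 0:
            rows.append((sender, "%d more messages from %s..." % (hidden, sender)))
    return rows
```

Because `groupby` only merges adjacent items, a message from someone else in between starts a new run, which is the behaviour described above.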
An option would be to group by sender, at least for selected Senders, so that it reads
From Joanna
|V| 5 min ago Hey again Joe, just wanted to say....
o<| 5 min ago Hello, it's me! Hmm it seems you really are on a business..
72 new since yesterday ( |=| 5 o<| 52 |V| 15)
From Mr Lyle
|=| yesterday, 12:12 7 pages [show]
Other
|V| 1h ago gunk1243#523.com Cheap Torpedoes Your best source of cheap, ...
|V| 1h ago gunk563#523.com Torpedoes Mania ON SALE! GU 537! sinks any ...
12 new since yesterday ( |V| 12)
Mr. Lyle doesn't have an abstract since there is only one new message. Clicking an abstract would expand that list, clicking a user would show you messages (including old ones) only from this user.
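For the grouped view, the per-sender counts in the abstract lines could be computed roughly like this. It is a Python sketch; the dict keys `sender` and `kind` and the `featured` set are my own naming, not something your system necessarily has.

```python
from collections import Counter, OrderedDict

def group_by_sender(messages, featured):
    """messages: list of dicts with 'sender' and 'kind' ('email', 'voice', 'fax').

    Senders in `featured` get their own group; everyone else lands in 'Other'.
    Returns an ordered mapping of group name -> Counter of message kinds,
    in order of first appearance.
    """
    groups = OrderedDict()
    for msg in messages:
        name = msg["sender"] if msg["sender"] in featured else "Other"
        groups.setdefault(name, Counter())[msg["kind"]] += 1
    return groups
```

Each Counter gives you the numbers for a line like "72 new since yesterday" plus the per-type breakdown next to the icons.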
Phew. Hope that helps.
We have a service where we literally give away free money.
Naturally said service is ripe for abuse. To defend against this we do the following:
log ip address
use unique email addresses (only 1 acct/email addy)
collect more info like st. address, phone number, etc.
use signup captcha
BHOs (I've seen poker rooms use these)
Now, let's get real here -- NONE of this will stop a determined user.
Obviously ip addresses can be changed via a proxy (which could be blacklisted via akismet) but change anyways if the user has a dynamic ip or if more than one user is behind a NAT'd network (can we say almost everyone?)
I can sign up for thousands of unique email addresses each hour -- this is no defense.
I can put in fake information taken from lists for street addresses and phone numbers.
I can buy captchas from captcha solving services (1k for $5).
BHOs seem effective only for downloadable software, and this is a website.
What are some other ways to prevent multiple users from abusing the service? How do all the PPC people control click fraud?
I know we could actually call the person but I don't think we are trying to do that anytime soon.
Thanks,
It's pretty difficult to generate lots of fake phone numbers that can send and receive SMS messages. SMS verification could go a long way towards cutting down on fraud. Of course, it also limits you to giving away free money to cell phone owners.
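The code-handling side of SMS verification is simple; here is a sketch in Python. Actually sending the message depends on whatever SMS gateway you pick and is left out, and the function names are mine.

```python
import hmac
import secrets

def new_sms_code(digits=6):
    """Generate a random numeric code to text to the user."""
    # secrets draws from the OS CSPRNG, unlike the random module.
    return "".join(secrets.choice("0123456789") for _ in range(digits))

def code_matches(expected, submitted):
    """Compare the stored code with what the user typed."""
    # Constant-time comparison avoids leaking how many leading digits matched.
    return hmac.compare_digest(expected.encode(), submitted.encode())
```

You would also want to expire codes after a few minutes and limit attempts per number, otherwise a 6-digit code can simply be brute-forced.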
I think the only way is to bind your users' accounts to 'real world' information, like a passport number, for instance. Of course, you'll need to make sure that information is securely stored and find some way to validate it.
Re: signing up for new email accounts...
A user doesn't even need to do that. Please feel free to send your mail to brian_s#mailinator.com, or feydr.asks.a.question#spamherelots.com, or stackoverflow#safetymail.info, or my_arbitrary_username#zippymail.info. I haven't registered any of those email addresses, but all of them will work.
Those domains are owned by ManyBrain, and they (and probably others as well) set the domain to accept any email user. ManyBrain in particular then makes the inboxes for those emails publicly accessible without any registration (stripping everything but text from the email and deleting old mail). Check it out: admin#mailinator.com's email inbox!
Others have mentioned ways to try and keep user identities unique. This is just one more reason to not trust email addresses.
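If you still want to filter these out at signup, a domain blocklist is the usual approach. Here is a minimal Python sketch using only the domains named above; a real list would be far longer, and new throwaway domains appear all the time, so this only raises the bar a little.

```python
# Domains named in this answer; a real blocklist would need regular updates.
DISPOSABLE_DOMAINS = {
    "mailinator.com",
    "spamherelots.com",
    "safetymail.info",
    "zippymail.info",
}

def is_disposable(email):
    """Return True when the address's domain is on the blocklist."""
    # Everything after the last "@" is the domain; compare case-insensitively.
    domain = email.rpartition("@")[2].strip().lower()
    return domain in DISPOSABLE_DOMAINS
```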
First, I suppose (hope) that you don't literally give away free money but rather give it to use your service or something like that.
That matters as there is a big difference between users trying to just get free money from you they can spend on buying expensive cars vs only spending on your service which would be much more limited.
Obviously many more users will try to fool the system in the former case than in the latter.
Why does it matter? Because it is all about the balance between your control and your users' annoyance. I see many answers concentrating on the control part, so let's go through annoyance, shall we?
Log IP address. What if I am the next guy on the computer in say internet shop and the guy before me already used that IP? The other guy left your hot page that I now see but I am screwed because the IP is blocked. Yes, I can go to another computer but it is annoyance and I may have other things to do.
Collecting physical addresses. For what??? Are you going to visit me? Or start sending me spam letters? Let me guess, more often than not you get addresses with misprints at best and fake ones at worst. In fact, it is much less hassle for me to give you a fake address and not deal with whatever possible spam letters I'd have to recycle in an environment-friendly way. :)
Collecting phone numbers. Again, why shall I trust your site? This is the real story. I gave my phone nr to obscure site, then later I started receiving occasional messages full of nonsense like "hit the fly". That I simply deleted. Only later and by accident to discover that I was actually charged 2 euros to receive each of those messages!!! Do I want to get those hassles? Obviously not! So no, buddy, sorry to disappoint but I will not give your site my phone number unless your company is called Facebook or Google. :)
Use signup captcha. I love that :). So what are we trying to achieve here? Will the user who is determined to abuse your service have problems typing in a couple of captchas? I doubt it. But what about the "good user"? Are you aware how annoying captchas are for many users??? What about users with impaired vision? But even without that, most captchas are so bad that they make you feel like you have impaired vision! The best advice I can give: if you care about user experience, avoid captchas like the plague! If you have any doubts, do your online research first!
See here more discussion about control vs annoyance and here some more thoughts about being user-friendly.
You have to bind their information to something that is 'real world', as Rubens says. Of course, you also need to be able to verify this information (I can just make up passport numbers all day if you don't check to make sure they're correct).
How do you deliver the money? Perhaps you can index this off the paypal account, mailing address, or whatever you're sending the money to?
Sometimes the only way to prevent people abusing a system is to not have the system in the first place.
If you're doing what you say you're doing, "giving away money to people", then surprise surprise, there will be tons of people with more time available to try to find ways to game the system than you will have to fix it.
I guess it will never be possible to have an identification system which identifies fake identities that is:
cheap to run (I think it's called "operational cost"?)
cheap to implement (ideally one time cost - how do you call that?)
has no Type-I/Type-II errors
is scalable
But I think you could prevent users from having too many (to say a quite random number: more than 50) accounts.
You might combine the following approaches:
IP address: can be bypassed with VPN
CAPTCHA: can be bypassed with human farms (see this article, for example - although they claim that their test can't be that easily passed to other humans, I doubt this is true)
Ability-based identification: can be faked when you know what is stored and how exactly the identification works by randomly (but with a given distribution) acting (example: brainauth.com)
Real-world interaction: This might be the best one, but I guess it is expensive and not many users will accept it. Also, for some users/countries it might not be possible. (Example: Postident in Germany, where the post office wants to see your identity card. I guess this can only be done at massive scale by the government.)
Other sites/resources: This basically transfers the problem to other sites. You can use services where it is not allowed/uncommon/expensive to have much more than 1 account:
Email
Phone number: e.g. by using SMS, see Multi-factor authentication
Bank account: PayPal; transfer not much money or ask them to transfer a random (small) amount to you (which you will send back).
Social based
When you take the social graph (vertices are people, edges are connections), you will expect some distribution. You know that you are a single human and you know some other people. So you have a "network of trust" (in quotes, because I think this term might be used in other contexts as well). Now you might not trust people / networks who interact heavily with your service but are either isolated (no connections) or connect one large group with another large group ("articulation points"). You also might not trust fast-growing, heavily interacting, new, isolated graphs.
When a user provides content that is liked by many other users (who you trust), this might be an indicator that there is a real human creating it.
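To show what the two suspicious shapes look like in code, here is a Python sketch that finds isolated accounts and articulation points with the standard depth-first search (low-link) method. It assumes an undirected graph stored as a dict in which every account appears as a key; the recursion is fine for a demo but would need an iterative version for a large graph. Being an articulation point is only a signal, not proof of anything.

```python
def articulation_points(graph):
    """graph: dict mapping each node to a set of neighbours (undirected)."""
    order = {}   # discovery index of each visited node
    low = {}     # lowest discovery index reachable from the node's subtree
    cut = set()

    def visit(node, parent):
        order[node] = low[node] = len(order)
        children = 0
        for nxt in graph[node]:
            if nxt == parent:
                continue
            if nxt in order:
                # Back edge to an already visited node.
                low[node] = min(low[node], order[nxt])
            else:
                children += 1
                visit(nxt, node)
                low[node] = min(low[node], low[nxt])
                # The subtree under nxt cannot reach above node: node is a cut vertex.
                if parent is not None and low[nxt] >= order[node]:
                    cut.add(node)
        # A DFS root is a cut vertex only if it has more than one DFS child.
        if parent is None and children > 1:
            cut.add(node)

    for node in graph:
        if node not in order:
            visit(node, None)
    return cut

def isolated(graph):
    """Accounts with no connections at all."""
    return {node for node, nbrs in graph.items() if not nbrs}
```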
We had a similar issue recently on our website. It is really a hassle to solve if your business is built on one-time or monthly recurring free credits.
We have been using a fraud detection solution, https://fraudradar.io, for a while, and it has helped us a lot in cleaning out most of the spam activity. It is pretty customizable, with:
IP checks
Email domain validity
Regex rules
Whitelisting options per IP, email domain etc.
Simple API to communicate through
I would suggest to check that out.
Our motor pool wants to scan drivers’ licenses and have the data imported into our custom system. We're looking for something that will allow us to programmatically get the data from the scanner (including the picture) and let us insert it into our application. I was wondering if anyone has had experience with this type of system and could recommend one or tell us which ones to avoid. Our application is written in PowerBuilder and uses a DB2 database.
Try solutions by idScan.net (www.idScan.net)
There is an SDK that allows driver's license parsing for all US states and Canadian provinces. You can also purchase hardware such as the E-Seek M250 ID scanner, which reads both 2D barcodes and magnetic stripes (software is included).
Good luck!
We support something similar in our records management software. Our application is designed to work with a wedge reader, since they are the easiest to get up and running (no special drivers needed). When a card is swiped, the reader sends keystrokes to the OS for each character that is encoded on the magnetic stripe, with a simulated Enter keypress between each track (an AAMVA-compliant license has 3 data tracks).
It's slightly annoying because it behaves exactly as if someone were typing out the data by hand, so there is no easy way to tell when you have all the data. You could just wait to get 3 lines of information, but then it's difficult to detect invalid cards, such as when someone tries to swipe a student ID card, which might have fewer than 3 tracks encoded; in that case, the application hangs forever waiting for the non-existent third track. To deal with this, we use a "fail-fast" approach: each time we get an Enter keypress, we immediately process the current line, keeping a record of which track we are expecting at that point (1, 2, or 3). If the current track cannot be processed (for example, a different start character appears on the track than what is documented for an AAMVA-format driver's license), we assume the user must have swiped something other than a driver's license.
I'm not sure if the reader we use supports reading image data or not. It can be programmed to return a subset of the data on the card, but we just use the factory default setting, which appears to return only the first three data tracks (and actually I believe image data is encoded in the 2D barcode found on some licenses, not on the magnetic stripe, but I could be wrong).
For more on the AAMVA track format that is used on driver's license magstripes, see Annex F in the current standard.
The basic approach we use is:
Display a modal dialog that has a hidden textbox, which is given focus. The dialog box simply tells the user to swipe the card through the reader.
The user swipes the card, and the reader starts sending keydown events to the hidden textbox.
The keydown event handler for the textbox watches for Enter keypresses. When one is detected, we grab the last line currently stored in the textbox, and pass it to a track parser that attempts to parse the track according to the AAMVA format.
If this "fail-fast" parsing step fails for the current track, we change the dialog's status message to a message telling the user the card could not be read. At this point, the textbox will still receive additional keydown events, but it's OK because subsequent tracks have a high enough chance of also failing that the user will still see the error message whenever the reader stops sending data.
If the parsing is successful, we increment a counter that tells the parser what track it should process next.
If the current track count is greater than 3, we know we've processed 3 tracks. At this point we parse the 3 tracks (which have already split most of the fields up but everything is still stored as strings at this point) into a more usable DriversLicense object, which does additional checks on the track data, and makes it more consumable from our application (converting the DOB field from a string into a real Date object, parsing out the subfields in the AAMVA Name field into first name, middle name, last name, name suffix, etc.). If this second parsing phase fails, we tell the user to reswipe the card. If it succeeds, we close the dialog and pass the DriversLicense object to our main application for further processing.
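The fail-fast part of these steps boils down to a small state machine. The following is a Python sketch of the idea, not our actual code: the start characters in `ASSUMED_START` are placeholders I picked for illustration, so check Annex F and your reader's manual for the real ones, and the class and method names are invented for the example.

```python
# Assumed start characters per track, for illustration only.
ASSUMED_START = {1: "%", 2: ";", 3: "%"}

class SwipeSession:
    """Tracks which magstripe track is expected next and fails fast on a mismatch."""

    def __init__(self, expected_start=ASSUMED_START):
        self.expected_start = expected_start
        self.tracks = []
        self.failed = False

    def on_enter(self, line):
        """Call once per Enter keypress with the line just received.

        Returns 'failed', 'need_more', or 'complete'.
        """
        if self.failed:
            return "failed"    # keep ignoring whatever the reader still sends
        if len(self.tracks) == 3:
            return "complete"  # extra input after a full read is ignored
        track_no = len(self.tracks) + 1
        if not line.startswith(self.expected_start[track_no]):
            self.failed = True
            return "failed"
        self.tracks.append(line)
        return "complete" if len(self.tracks) == 3 else "need_more"
```

Once `on_enter` returns 'complete', `session.tracks` holds the three raw lines for the second parsing phase; on 'failed', the dialog can switch to the "card could not be read" message right away instead of waiting for a third track that may never arrive.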
If your scanner is "TWAIN compliant", you will be able to manage it from your app through an ActiveX control that you can buy on the net, like this one. You'll be able to manage your basic scan parameters (quality, color, single/multiple page scan, output format, etc.), start the scan from your app, save the result as a file, and transfer this file wherever needed. We have been using it with VB code for the last 2 years. It works.
Maybe you want to use a magnetic stripe reader to get the driver's license info from the card. As I remember, most driver's licenses just have the data in plain text on those stripes, so it is relatively straightforward programming-wise.
Magstripe readers are also cheap nowadays.
You can try something from this list: http://www.adams1.com/plugins.html
I have not used them myself, though.
I wrote a parser in C#, and while it's "ok" it's still far from perfect.
I can't seem to find it but a Wikipedia entry used to exist that has the patterns to look for (trust me, parsing this yourself is a pain without any help).
Be aware that different states have different laws for what you can and can't use government issued ID's for. Texas has one.
We use a Dell card reader, and it inputs the data exactly as though it were being typed on a keyboard, followed by the Enter key. This made programming /very/ easy, because you just send focus to the text box and wait for Enter. The main character that breaks the data into chunks is the caret '^'. Split on that and you'll have your basic chunks.
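The splitting step looks roughly like this in Python (my parser was C#, so this is just the idea). The sample line below is made up to show the shape, not real card data, and the exact field layout varies by state.

```python
def split_track(line):
    """Split one swiped line into its caret-delimited chunks, dropping empties."""
    return [chunk for chunk in line.strip().split("^") if chunk]

# Made-up sample in the general shape of a swiped line.
sample = "%TXAUSTIN^DOE$JOHN^123 MAIN ST^?\n"
print(split_track(sample))
```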
You can also use the InfoScan SDK, which you can find at www.scan-monitor.com. The system allows you to use any scanner and does not make you purchase a specific one.