Retrieving a fully qualified street address from ZIP / postal code - google-maps

I have a form in which my users need to enter the following location data:
Full address line (street address, apartment, suite, unit, building, floor)
House number
City
State / province / region
ZIP / Postal code
Country
To simplify the completion of this form, I would like to automatically fill in the fully qualified address (address line, city, state/province, etc.) by letting the user enter only their country, ZIP code and house number.
Is it correct that these 3 items are sufficient to look up the address in the United States? Or is less or more information necessary? And is the answer to this question different for every country? Moreover, is there a service, API, or library that can be utilized for this purpose (e.g. Google Maps or OpenStreetMap)?

Great questions!
Is it correct that these 3 items are sufficient to look up the address in the United States?
No. Unfortunately, these three will get you down to ~hundreds of possible addresses in the US.
Is the answer to this question different for every country?
Yes! Postal systems vary greatly from country to country, and your users will have different expectations about what they need to supply. Brits, for example, don't expect to have to enter a full address.
With the UK, Canada and Australia you can usually get to a single address from the house number and postcode, BUT you cannot guarantee this. There may be sub-premise or business information that requires a bit of interaction with the user to check you have the right address.
Some countries, such as France, do not have complete premise number coverage. With these you can take the premise number and postcode, but depending on the town you have to alter your behavior to either trust and accept the input or prompt the user for a correction.
Another important consideration when planning your workflow is the need to allow for people who do not know their postcode / ZIP. It does not happen often, but sometimes people have just moved, or occasionally a property's postcode / ZIP changes, so it is important to be flexible about the information you require.
Is there a service, API, or library that can be utilized for this purpose?
Yes - there are several solutions around that offer the ability to capture global addresses. Experian Data Quality (my company) offers a hosted or on-premises solution that allows for this.
Try it out here - on the right hand side under the "Do you want to know more?" you can switch countries, the prompt updates and the interaction occurs if needed.

I can only answer about US addresses (I work at SmartyStreets), but the answer is no, that won't work.
Kudos for your desire to improve the user experience. Unfortunately, I would not recommend trying this, and here's why:
A US ZIP code, in its entirety, is actually 11 digits long (12 with the check digit):
The first three digits are the SCF (Sectional Center Facility), kind of like a region code
The first five digits are your typical 5-digit ZIP code that specifies a set of carrier routes
The next 4 digits are more precise, often narrowing down an address to block-level.
The next 2 digits are seldom used except in barcodes, but they indicate the delivery point. In theory, this would specify a particular house, apartment, or mailbox, but in reality, sometimes the 11-digit code is ambiguous (common in large complexes, street blocks, or PO facilities). It's typical for the delivery point to correlate to the house or apartment number, but not always.
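To make the breakdown above concrete, here is a minimal Python sketch that slices a ZIP string into those parts. The names `ZipParts` and `split_zip` are made up for illustration, and the sketch deliberately ignores the optional 12th check digit. It only splits the digits; it says nothing about whether the code actually exists.

```python
from typing import NamedTuple, Optional


class ZipParts(NamedTuple):
    scf: str                       # first 3 digits: Sectional Center Facility
    zip5: str                      # first 5 digits: the familiar ZIP code
    plus4: Optional[str]           # next 4 digits, if supplied
    delivery_point: Optional[str]  # next 2 digits, if supplied


def split_zip(code: str) -> ZipParts:
    """Split a 5-, 9-, or 11-digit ZIP string into its parts."""
    # Drop hyphens, spaces and anything else that is not a digit.
    digits = "".join(ch for ch in code if ch.isdigit())
    if len(digits) not in (5, 9, 11):
        raise ValueError(f"expected 5, 9, or 11 digits, got {len(digits)}")
    plus4 = digits[5:9] if len(digits) >= 9 else None
    delivery_point = digits[9:11] if len(digits) == 11 else None
    return ZipParts(digits[:3], digits[:5], plus4, delivery_point)
```

A user who types only the 5-digit code leaves `plus4` and `delivery_point` empty, which is exactly the information you would need to get anywhere near a single address.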
So in your situation:
Knowing the country narrows down the possibilities to just 350,000,000+ addresses
Knowing the 5-digit ZIP code narrows it down to somewhere around 10,000+ addresses (important note: not everyone knows the 5-digit ZIP code, and they change. What's more, they may not be sure whether to enter their PO box ZIP code or their house ZIP code. And what if their house doesn't receive mail? Or what if they're in the military and their 5-digit ZIP is in flux?)
Knowing the house number may narrow down the address candidates to anywhere from 1-1000. It depends how "big" the ZIP code is. (But ZIP codes are not polygons).
So no, it is not sufficient to know these three parts of the address. The country is practically worthless at that point, and the ZIP code is locality/city-specific at best. The house number might appear dozens, if not hundreds, of times in a ZIP code. (I grew up in the boonies where our house number was unique, but that's rare.)
And yes, the answer to this question varies country to country, but this reasoning holds true for most developed countries. Less developed countries don't have such organization to their postal system.
Is there a service that can do this? Not if you don't want your users to scroll through dozens or hundreds of results. If they have to look through more than just a couple, you're better off just asking them to type their full address.
I answered a very similar question just the other day. You might find it useful.
So now that I've rained doomsday on your idea, how about an alternative? Of course I'm partial to SmartyStreets' autocomplete, which suggests addresses, geo-located close to the user, as they're typing. I should mention that it's free. It doesn't actually verify the address until the user is finished or has chosen one of the suggestions, but it does reduce keystrokes.
Further on this UX vein, I'd recommend putting country as the first field of your address form. This way, you can alter the form's format based on the country they choose. If you use a service like LiveAddress, you can have the user type their address in a format comfortable to them in a single field, rather than across multiple text boxes in your arbitrary order, since LiveAddress can parse their input.

You could easily achieve this by using the Google Maps reverse geocoding API. See its documentation.

I don't know of any country where there is a one-to-one mapping between a post code and a street address, except Singapore (Postal Codes in SG).
In that particular case you can use the post code to fill in the remaining fields. In any other case you can derive the city name and the street address, but probably not the house number.
Example 1: (derive full street address from post code)
https://geocode.xyz/339696?geoit=xml
<geodata>
<latt>1.32035</latt>
<longt>103.87430</longt>
<elevation/>
<standard>
<stnumber>88</stnumber>
<addresst>88 GEYLANG BAHRU</addresst>
<postal>339696</postal>
<city>Singapore</city>
<prov>SG</prov>
<countryname>Singapore</countryname>
<confidence>0.5</confidence>
</standard>
</geodata>
Example 2: (Get most common street address, and other variations of city name)
https://geocode.xyz/27777?region=DE&geoit=xml
<geodata>
<latt>53.06060</latt>
<longt>8.58388</longt>
<elevation/>
<standard>
<stnumber>20</stnumber>
<addresst>20 Bokenbusch</addresst>
<postal>27777</postal>
<city>Ganderlesee</city>
<prov>DE</prov>
<countryname>Germany</countryname>
<confidence>0.5</confidence>
</standard>
<alt>
<loc>
<city>Ganderkesee</city>
<latt>53.06868</latt>
<longt>8.57437</longt>
<cc>951</cc>
</loc>
<loc>
<city>Bremen</city>
<latt>53.07675</latt>
<longt>8.57559</longt>
<cc>172</cc>
</loc>
<loc>
<city>Schierbrok</city>
<latt>53.08639</latt>
<longt>8.58037</longt>
<cc>166</cc>
</loc>
The number in "cc" indicates how many street addresses in that city share the given post code.
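As a rough illustration of how the "cc" counts could be used, the following Python sketch ranks the alternative cities by how many addresses share the post code. The `SAMPLE` string is a trimmed, hand-closed copy of the response above (the listing is cut off before its closing tags, so the `</alt>` and `</geodata>` here are an assumption), and `city_counts` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the response shown above; closing tags are assumed.
SAMPLE = """<geodata>
  <standard><city>Ganderlesee</city><postal>27777</postal></standard>
  <alt>
    <loc><city>Ganderkesee</city><cc>951</cc></loc>
    <loc><city>Bremen</city><cc>172</cc></loc>
    <loc><city>Schierbrok</city><cc>166</cc></loc>
  </alt>
</geodata>"""


def city_counts(xml_text):
    """Return (city, cc) pairs from <alt>/<loc>, largest count first."""
    root = ET.fromstring(xml_text)
    counts = []
    for loc in root.findall("./alt/loc"):
        city = loc.findtext("city")
        cc = loc.findtext("cc")
        # Skip entries with a missing or non-numeric count.
        if city and cc and cc.strip().isdigit():
            counts.append((city, int(cc)))
    return sorted(counts, key=lambda pair: pair[1], reverse=True)
```

The first entry of the returned list is the city most likely to be meant by the post code, which could be used to pre-fill the city field while still letting the user change it.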
Good luck!

Related

Google Maps Autocomplete doesn't include postcodes in address search

For my current project, I have an address lookup for the user to enter an address. In its default state, its results are too ambiguous, and the lookup returns all locations even if it isn't actually an address (eg some of the locations in the list are an entire city or region).
Adding types: ['address'] to the query has solved this; Google now only responds with actual addresses instead of ambiguous regions, however this has lost us the ability to search via postcode, as these two fiddles demonstrate:
http://jsfiddle.net/yj6qvpsg/2/ will list entire cities and regions (bad), but you can still search for an address with a UK postcode (good).
http://jsfiddle.net/yj6qvpsg/1/ will only list addresses (good), but won't search UK postcodes (bad).
How do we get the best of both worlds? I tried playing around with eg, types: ['address', 'postal_code'], but had no luck...
So it turns out that it's only really in the UK that postcodes are tied to physical addresses at street level (you can literally give the house number and postcode as a complete and valid address), while in the rest of Europe a postcode represents a whole region, hence Google considers postcodes as regions throughout.
Maybe one day they'll make an exception for the UK, but in the meantime, if you really need this feature, an alternative is probably something like https://getaddress.io/, which might suck if, like us, your entire data structure is built to match Google's.
This came up as the top result for this question, but is massively outdated.
So for anyone looking for a more in-depth search, set Google's Autocomplete to geocode and you can search by:
Street name
District
Town
City
Postcode
Ace!

Google maps geocoder state bias

I have a google maps application where users can search by Country, State, City or a street address. Users may be anywhere in the world and they may be searching for anywhere else in the world, not just within their own country.
I need the geocoder to have a bias such that if a state is entered (without the country) it geocodes to the state and not to a city with the same name. Our application prioritises countries first, then states, then cities etc... however the geocoder is not doing the same.
Eg. I want to search for "Victoria" which is a state within Australia.
http://maps.googleapis.com/maps/api/geocode/json?address=victoria shows Victoria, BC, Canada.
http://maps.googleapis.com/maps/api/geocode/json?address=victoria&region=au shows the state of Victoria in Australia however I cannot include the region as my users may be anywhere in the world so I have no way of knowing which region they are searching for.
I have looked at "administrative levels" and also "types" but I cannot find a solution which suits my needs of simply prioritising in the order country > state > city.
I ideally want something like this:
maps.googleapis.com/maps/api/geocode/json?address=victoria&components=administrative_area:WILDCARD
OR
maps.googleapis.com/maps/api/geocode/json?address=victoria&types=administrative_area_level_1
Of course neither of these solutions work but I hope they illustrate what I am trying to achieve.
Any suggestions?
Thanks,
Nicole
You can do a query without specifying the address, use
...?components=administrative_area:victoria
and then iterate over the results.address_components to pick out ones where the types include administrative_area_level_1
Update: I noticed that, depending on the search term provided to administrative_area, Google uses some kind of heuristic to determine the certainty of the results. If there's a clear winner, only 1 result is shown. If several locations match similarly well, you will get several. So when there are several results, you can pick towards a higher or lower administrative_area_level to suit your needs.
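A minimal Python sketch of that filtering step, assuming the response has already been fetched and parsed into a dict in the Geocoding JSON shape (a `results` list whose entries carry a `types` list). The helper name `results_of_type` is hypothetical. It checks each result's own `types` rather than its address components, because a city result also contains a state-level component and would otherwise slip through.

```python
def results_of_type(response, wanted="administrative_area_level_1"):
    """Keep only the results that are themselves of the wanted type."""
    return [
        result
        for result in response.get("results", [])
        if wanted in result.get("types", [])
    ]


def prefer_levels(response, levels=("country",
                                    "administrative_area_level_1",
                                    "locality")):
    """Return the first result matching the highest-priority level, or None."""
    for level in levels:
        matches = results_of_type(response, level)
        if matches:
            return matches[0]
    return None
```

`prefer_levels` encodes the country > state > city priority from the question; the tuple of levels is an assumption about which type names map to those three tiers.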

How can I filter out fictional locations (ex. "under a rock", "hiding") from Google Maps API geocode results?

Google Maps API does a great job trying to locate a match for nearly every query. But if I'm only interested in real locations, how can I filter out Google's guesses?
For example, according to Google, "under a rock" is located at "The Rock, Shifnal, Shropshire TF11, UK". But a person who answers the question, "Where are you?" with "Under a rock" does not mean to indicate that they are in Shropshire, UK. Instead they just don't want to tell you — well, either that or they are in real trouble, thankfully with web access, stuck under some rock.
I have several million user generated location strings that I'm attempting to find coordinates for. If someone writes "under a rock" I'd rather just leave the coordinates null instead of putting an obviously wrong point in Shropshire, UK.
Here are some other examples:
under a rock => Shropshire, UK
planet earth => Cheshire, UK
nowhere => Scituate, RI, USA
travelling => Madrid, Spain
hiding => Anderson, CA, USA
global => Midland, TX, USA
on the web => North Part, ON, Canada
internet => Frisco, TX, USA
worldwide => Mie Prefecture, Japan
Ultimately I'm after a solid way to return coordinates from a string but return false if the location is like the above.
I need to build a function that returns the following:
Twin Cities => Return the colloquial coordinates of Minneapolis-St. Paul
right behind you => false [Google gets this one "right" -- at least for my purposes]
under a rock => false
nowhere => false
Canada => Return coordinates
Mission District San Francisco => Return coordinates
Chicago => Return coordinates
a galaxy far far away => false [Google also gets this "right" -- zero results]
What do you recommend?
Here's a comma-delimited array for you to play at home:
'twin cities','right behind you','under a rock','nowhere','canada','mission district san francisco','chicago','a galaxy far far away','london, england','1600 pennsylvania ave, washington, d.c.','california','41.87194,12.56738','global','worldwide','on the internet','mars'
And here's the url format:
'http://maps.googleapis.com/maps/api/geocode/json?address=' + query + '&sensor=false'
ex: http://maps.googleapis.com/maps/api/geocode/json?address=twin+cities&sensor=false
It seems most of your incorrect results have a "partial_match" attribute set to "true".
e.g.
Twin Cities, no partial match:
http://maps.googleapis.com/maps/api/geocode/json?address=Twin%20Cities&sensor=false
under a rock, 10+ results, all with partial match:
http://maps.googleapis.com/maps/api/geocode/json?address=under%20a%20rock&sensor=false
Though the original purpose of this attribute is not to tell whether a locality is correct or not, it's still pretty accurate on the dataset you provided.
From Google Maps API documentation:
partial_match indicates that the geocoder did not return an exact match for the original request, though it was able to match part of the requested address. You may wish to examine the original request for misspellings and/or an incomplete address.
Partial matches most often occur for street addresses that do not exist within the locality you pass in the request. Partial matches may also be returned when a request matches two or more locations in the same locality. For example, "21 Henr St, Bristol, UK" will return a partial match for both Henry Street and Henrietta Street. Note that if a request includes a misspelled address component, the geocoding service may suggest an alternate address. Suggestions triggered in this way will not be marked as a partial match.
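The partial_match heuristic could be sketched in Python like this, assuming the JSON response has already been parsed into a dict. The function name `coordinates_or_none` is made up, and treating every partial match as "not a real location" is the heuristic from this answer rather than documented behaviour, so expect some false rejections (misspelled but genuine addresses, for instance).

```python
def coordinates_or_none(response):
    """Return (lat, lng) of the top result, or None if it looks like a guess."""
    results = response.get("results", [])
    if response.get("status") != "OK" or not results:
        return None  # e.g. ZERO_RESULTS for "a galaxy far far away"
    top = results[0]
    if top.get("partial_match"):
        return None  # geocoder only matched part of the query
    location = top["geometry"]["location"]
    return (location["lat"], location["lng"])
```

Returning `None` rather than `False` keeps the "no coordinates" case easy to store as a null in a database column.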
This might not be a direct answer to your question.
If you are going through 1000s of user inputs already saved in the DB and trying to filter out the invalid ones, I think it is too late and not feasible. The output can only be as good as the input.
The better way is to make the input as valid as possible, since end users don't always know what they want.
I would suggest having users enter their address through autocomplete, so that you always have a valid address:
User enters text and selects one of the suggestions
A marker and info window are shown
When the user confirms the input, you save the info window text as the user input, not the text they typed.
Done this way, you don't need to validate or filter user input.
I know there are Bayes classifier implementations in JavaScript. I have never tried them, though; I currently use a Ruby implementation which works correctly.
You could have two classifications (Real and Unreal), training each of them with as many samples as you want (30, 50 samples each?). The better trained your classifier is, the more accurate it will be.
Then you'd have to test the location before calling GoogleMaps API to filter Unreal locations.
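For illustration only, here is a tiny word-count naive Bayes classifier in Python with Laplace smoothing. It is not the Ruby library mentioned above; the class name `TinyNaiveBayes` and the training phrases are assumptions, and a handful of samples like this is far too few for real use.

```python
import math
from collections import Counter


class TinyNaiveBayes:
    def __init__(self):
        self.word_counts = {}        # label -> Counter of words seen
        self.doc_counts = Counter()  # label -> number of training samples

    def train(self, label, text):
        words = text.lower().split()
        self.word_counts.setdefault(label, Counter()).update(words)
        self.doc_counts[label] += 1

    def classify(self, text):
        vocab = set()
        for counts in self.word_counts.values():
            vocab.update(counts)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus smoothed log likelihood of each word.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(counts.values())
            for word in text.lower().split():
                score += math.log((counts[word] + 1) / (total_words + len(vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


nb = TinyNaiveBayes()
for sample in ["chicago", "london england",
               "mission district san francisco", "canada"]:
    nb.train("real", sample)
for sample in ["under a rock", "nowhere",
               "hiding under a bridge", "on the internet"]:
    nb.train("unreal", sample)
```

With this toy training set, `nb.classify("under a bridge")` comes out as "unreal" and `nb.classify("london canada")` as "real", so only the latter would be passed on to the geocoder.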
To truly succeed here you are going to have to build a database driven system that facilitates both positive and negative lookups with AI that gets smarter over time, just like Google did. I don't believe that there is a single algorithm that will filter out results based on cosmetics alone.
I looked around and found a site that contains every city in the world. Unfortunately, it doesn't provide them as a single list, so you'd have to spend a bit of time harvesting data. The site is http://www.fallingrain.com/world/index.html.
They seem to use individual directories to organize countries, states, and cities, broken down further by alphabet. It is, however, the only comprehensive source that I could find.
If you manage to get all of these locations into a database, you will have the beginnings of a positive lookup system for your queries. You'll also need to start building separate lists of bi-, tri-, and quad-city areas as well as popular destinations and landmarks.
You should also store a negative lookup table for all known mismatches. People tend to generate similar false data and typos across large populations, so the most popular "nowhere" and "planet earth" answers will be repeated over and over again, in every language you can think of.
One of the benefits of this strategy is that you can run relational queries against your data to get matches in bulk as well as one at a time. Since some false negatives will occur at the beginning, your main decision is what to do with unmatched items. You may want to adopt a strategy that lets you both reject non-matches and substitute partial matches with the nearest actual match.
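A very small in-memory Python sketch of the positive/negative lookup idea. A real system would keep these tables in a database as described; the sets below, and the names `normalize` and `lookup_status`, are purely illustrative.

```python
import re

# Hypothetical seed tables; in practice these would be database tables.
NEGATIVE = {"under a rock", "nowhere", "planet earth", "hiding",
            "global", "worldwide", "internet", "on the web"}
POSITIVE = {"chicago", "canada", "twin cities",
            "mission district san francisco"}


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def lookup_status(text):
    """Return 'rejected', 'known', or 'unmatched' for a location string."""
    key = normalize(text)
    if key in NEGATIVE:
        return "rejected"
    if key in POSITIVE:
        return "known"
    return "unmatched"
```

Strings that come back as "unmatched" are the ones to review by hand or pass to the geocoder, and whatever is learned from them can be fed back into either table.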
Anyhow, I hope this helps. It is a bit of effort but if it's important it will be worth it. Who knows, you may end up with a database that's actually worth something. Maybe even a Google maps gateway service for companies/developers who need the same functionality. (:
Take care.

Question about tracking user in a map application using cellid

I am trying to understand the concept of cellid (http://www.opencellid.org/api)
As per that, if we send a request
http://www.opencellid.org/cell/get?key=myapikey&mnc=1&mcc=2&lac=200&cellid=234
it will respond with the latitude and longitude.
I was wondering if this can be used from within a google map application for tracking a user or it needs to be used from within a mobile device?
If it can be used from within a web app, what parameters should it use for
mcc: mobile country code (decimal)
mnc: mobile network code (decimal)
lac: location area code (decimal)
cellid: value of the cell id
E.g., will it work if we know the cell number of the person (e.g., 281 222 6700)?
The request is just a lookup in the opencellid database.
It doesn't matter where the information is coming from.
If you know the MCC, MNC, LAC and CellID of a user/mobile device,
the request will return latitude and longitude if the CellID has been found in the DB.
There is no additional information transferred by using the request from within a J2ME app.
MCC+MNC+LAC+CELLID should be a unique identifier of a cell. (AFAIK those values can change over time, but they should still be unique.)
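Since the request is just a lookup keyed on those four values, building it from a server-side or web application is only a matter of assembling the query string. A small Python sketch, using the endpoint and parameter names from the question (the function name `cell_lookup_url` is made up, and the endpoint may have changed since):

```python
from urllib.parse import urlencode


def cell_lookup_url(key, mcc, mnc, lac, cellid,
                    base="http://www.opencellid.org/cell/get"):
    """Build the lookup URL for one cell; all IDs are decimal integers."""
    params = {"key": key, "mnc": mnc, "mcc": mcc, "lac": lac, "cellid": cellid}
    return f"{base}?{urlencode(params)}"
```

Calling `cell_lookup_url("myapikey", mcc=2, mnc=1, lac=200, cellid=234)` reproduces the example URL from the question. The values themselves still have to come from the handset, since a web page has no way to read them.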
More often than not, knowing just the LAC and CellID is sufficient.
However, you can't use this to track based on a phone number, only by cell tower parameters. Number tracking is a whole different ball game involving VLR & HLR lookups, which are hard to come by, very expensive ($100+ per lookup) and sometimes even illegal.
Google Maps also uses cell ID lookups to approximate the user's location before GPS kicks in (the translucent circle around a dot is actually data from Cell IDs).
That being said, opencellid has very minimal coverage and little or no updates to the project. Check out some paid players who offer wider coverage:
LocationAPI
Combian

Picking the most accurate geocode

I'm using http://maps.google.com/maps/geo? web service to geocode some addresses.
The problem I have is that a fuller address doesn't necessarily give a more accurate geocode.
E.g. passing in Llantysilio, Denbighshire, UK gives a far more accurate result than Llantysilio, Llangollen, Denbighshire, UK
The Accuracy attribute in the XML doesn't seem very helpful in deciding which address to pick.
How have other people dealt with this issue? Is there a good way to pick the best geocode that works most/all of the time?
*edit
A bit of extra info - when I put in the fuller address, the first line of the address is ignored and the geocoder jumps to a different, but exact, address: a central street located in the extra line added to the address. In this example, it picks Castle Street in the middle of Llangollen, seemingly disregarding Llantysilio.
Edit by kdgregory: here are the two API requests that I used (missing API key doesn't seem to be an issue):
http://maps.google.com/maps/geo?q=Llantysilio,+Denbighshire,+UK&sensor=false&output=xml
http://maps.google.com/maps/geo?q=Llantysilio,Llangollen,++Denbighshire,+UK&sensor=false&output=xml
You have to interpret the accuracy, my friend. There are usually 2 parts to an accuracy. The first is the address matching. The second part is the important one: you can geocode something to an accuracy level of a whole country such as the United States, a city, a ZIP code centroid, an interpolated street position or an actual parcel. The first example has a 4 and the second a 9. For this service higher is better.
Accuracy Value Description
0 Unknown accuracy.
1 Country level accuracy.
2 Region (state, province, prefecture, etc.) level accuracy.
3 Sub-region (county, municipality, etc.) level accuracy.
4 Town (city, village) level accuracy.
5 Post code (zip code) level accuracy.
6 Street level accuracy.
7 Intersection level accuracy.
8 Address level accuracy.
9 Premise (building name, property name, shopping center, etc.) level accuracy.
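The table above translates directly into a lookup, sketched here in Python. The names `ACCURACY_LEVELS`, `describe` and `at_least` are hypothetical. Keep in mind that the value only says how fine-grained the match is, not that it is the place you meant, which is the very problem raised in the question.

```python
# Accuracy values from the table above, shortened to one label each.
ACCURACY_LEVELS = {
    0: "unknown", 1: "country", 2: "region", 3: "sub-region", 4: "town",
    5: "post code", 6: "street", 7: "intersection", 8: "address", 9: "premise",
}


def describe(accuracy):
    """Human-readable label for an accuracy value."""
    return ACCURACY_LEVELS.get(accuracy, "unknown")


def at_least(accuracy, minimum_label):
    """True if `accuracy` is at least as fine-grained as `minimum_label`."""
    minimum = next(value for value, label in ACCURACY_LEVELS.items()
                   if label == minimum_label)
    return accuracy >= minimum
```

For the two example requests, `describe(4)` gives "town" and `describe(9)` gives "premise", so a check such as `at_least(accuracy, "street")` would accept the second result even though it is the less useful one here.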
It's probably good to note that Google does not follow the XAL specs, but rather implements them in a subset.
So, this means that you won't necessarily be able to do:
place.AddressDetails.Country.AdministrativeArea.Locality.LocalityName
place.AddressDetails.Country.AdministrativeArea.AdministrativeAreaName
place.AddressDetails.Country.CountryName
Because a country and sub-locality may be provided while an administrative area is not.
The data that is returned is identified with an accuracy gauge that gives you a relative idea of what you can expect for data. So, you can store objects and chop off parts of the full address using this variable and try to geocode in such a fashion - It's not recommended though.
Typically, a full address (without the thoroughfare) is a good way of finding the general location. You can use some of the weighted-preferential logic Google provides to refine the address.
E.g. Use the setViewPort or setCountryCode to give your searches a bit more accuracy.
Remember, Geocoding is not a science. You can't expect consistent results.
With a geocode response's Placemark[0] from gmap you can check what you got, then either accept that level or try again. By default I chose in this order:
place.AddressDetails.Country.AdministrativeArea.Locality.LocalityName
place.AddressDetails.Country.AdministrativeArea.AdministrativeAreaName
place.AddressDetails.Country.CountryName
The fields could be named more logically, as seen above. Note that Google Maps v3 is somewhat incompatible with v2.
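Because any level of that structure may be missing, the fallback order is easiest to apply with a helper that walks the nesting safely. A Python sketch, assuming the placemark has been parsed into nested dicts with the key names shown above; `dig`, `FALLBACK_PATHS` and `best_name` are hypothetical names.

```python
def dig(data, *keys):
    """Follow `keys` through nested dicts; return None if any level is missing."""
    for key in keys:
        if not isinstance(data, dict) or key not in data:
            return None
        data = data[key]
    return data


# Most specific first: locality, then administrative area, then country.
FALLBACK_PATHS = [
    ("AddressDetails", "Country", "AdministrativeArea", "Locality", "LocalityName"),
    ("AddressDetails", "Country", "AdministrativeArea", "AdministrativeAreaName"),
    ("AddressDetails", "Country", "CountryName"),
]


def best_name(place):
    """Return the most specific name available in the placemark, or None."""
    for path in FALLBACK_PATHS:
        value = dig(place, *path)
        if value:
            return value
    return None
```

If `best_name` returns only a country name, that is a cue to retry with a different form of the address rather than accept the result.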
You can try a very ugly hack, which consists of geocoding your full address and every subset of the words it contains. You get a lot of geocodes, which you then feed to a reverse geocoding tool to get the addresses related to them.
Once you have plenty of addresses, you compare them with the one you first gave and take the most accurate geocode...
It takes many requests, with the number of iterations growing with each word you add to your address. Ugly work, but it can be fun to make some statistics ^^
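The candidate-generation half of that hack can be sketched in a few lines of Python. As a simplification (an assumption, not what the answer specifies), this version treats each comma-separated part as a unit instead of each word, which keeps the number of requests manageable. `address_subsets` is a made-up name.

```python
from itertools import combinations


def address_subsets(address):
    """All non-empty, order-preserving subsets of the comma-separated parts."""
    parts = [part.strip() for part in address.split(",") if part.strip()]
    subsets = []
    # Longest candidates first, so the full address is tried before any subset.
    for size in range(len(parts), 0, -1):
        for combo in combinations(parts, size):
            subsets.append(", ".join(combo))
    return subsets
```

A four-part address such as "Llantysilio, Llangollen, Denbighshire, UK" yields 15 candidates, and the count doubles with every extra part, which is why the approach gets expensive quickly.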
In the end I concluded that there are far too many weird blips in address consistency with google's geocoding webservice in the UK, but eventually managed to figure out a way of using postcodes instead, which is far more accurate: how it's done