What is the Royal Mail's PAF Address Database? - street-address

I'm struggling to understand what you would get from the Royal Mail if you bought their PAF file dataset of UK addresses.
I was expecting that PAF was some form of database which you would host yourself, and the Royal Mail provide APIs into that database.
However, after reading this, I'm presuming that all you get is a series of files containing the data. I can't find any obvious information regarding an API.
Are there any libraries available to help you handle these files, especially from Java?
Do you have to parse the file yourself and stick it in your own database, so you can do quick lookups from an application?
If all this is true, why would you ever bother buying this off the Royal Mail? Aren't all the third party providers, with their web based APIs, just far simpler to use - in terms of both programming and data maintenance?
Apologies if I've missed the obvious on this one, but I find the Royal Mail site lacking in information. I'm beginning to think that I've misunderstood their PAF file offering.

The Postcode Address File (PAF) is a set of data files provided by Royal Mail that contain all addresses in the UK. My understanding is that it's normally updated every three months.
I'm aware of two companies that have products that supply APIs into the PAF data: QAS and Capscan. With these you're able to search addresses to find missing postcodes or vice versa. APIs include both web-based solutions and native calls.
Why would you buy direct from Royal Mail? Because you want to write your own query tools rather than rely on third-party products, or you want to do data mining that other products can't provide.
Could you import into a SQL database? Yes, but only after you'd written your own PAF file parser.
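To give a feel for what "write your own parser and load it into SQL" involves, here is a minimal sketch in Python using only the standard library. The four-column comma-separated layout and the sample rows are hypothetical and invented for illustration; the real PAF record layout is defined in Royal Mail's own documentation and is more involved than this.

```python
import csv
import io
import sqlite3

def normalise(postcode):
    """Strip spaces and upper-case so lookups are format-insensitive."""
    return postcode.replace(" ", "").upper()

def load_addresses(conn, csv_file):
    """Parse rows of a HYPOTHETICAL layout (postcode, town, street, number)
    and insert them into an SQLite table. Returns the number of rows loaded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS address ("
        "postcode TEXT, post_town TEXT, thoroughfare TEXT, building_number TEXT)"
    )
    rows = [
        (normalise(r[0]), r[1], r[2], r[3])
        for r in csv.reader(csv_file)
        if len(r) >= 4
    ]
    conn.executemany("INSERT INTO address VALUES (?, ?, ?, ?)", rows)
    # An index on the normalised postcode keeps application lookups fast.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_postcode ON address (postcode)")
    conn.commit()
    return len(rows)

def find_by_postcode(conn, postcode):
    """Return (number, street, town) tuples for the given postcode."""
    cur = conn.execute(
        "SELECT building_number, thoroughfare, post_town FROM address "
        "WHERE postcode = ? ORDER BY building_number",
        (normalise(postcode),),
    )
    return cur.fetchall()

# Made-up sample data, not real PAF content.
sample = io.StringIO(
    "ZZ1 1ZZ,Exampletown,Sample Street,3\n"
    "zz1 1zz,Exampletown,Sample Street,1\n"
    "ZZ2 2ZZ,Othertown,Test Road,7\n"
)
conn = sqlite3.connect(":memory:")
load_addresses(conn, sample)
print(find_by_postcode(conn, "zz1 1zz"))
```

Normalising the postcode on the way in means the index can be used directly at lookup time. A real loader would also have to handle the quarterly updates mentioned above.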
Why use these over web-based tools? Because you're sitting behind an intranet, have limited internet access from your servers, face restrictive licensing from a web-based solution, etc.

It's all in Wikipedia:
http://en.wikipedia.org/wiki/Postcode_Address_File

Check out www.PostcodeAnywhere.co.uk, a web-service based lookup site. A desktop lookup app is also available. The decision is likely to be based on lookup volume, ease of use, costs, etc. But for low-to-medium volumes you get a simple implementation in a few minutes, with 'automatic' maintenance built in.

I've subsequently found this page where you can order a sample data set. It states:
Please be aware that Raw Data contains no software and the data must be processed for use in IT applications. If you do not wish to program PAF or Postzon then we can supply it to you in a pre-written application known as UK Addresses on CD
The UK Addresses on CD page goes on about something called "UK Addresses Utilities", and it states:
The UK Addresses CD also contains a set of dynamic link libraries and provides the ability to interrogate the address datasets programmatically through a .NET 2.0(+) DLL.

I have written something in C# that can parse these files into SQL Server
https://github.com/Telexx/Royal-Mail-PAF-Parser/


Is BizTalk The Correct Solution?

We have about six systems (they are all internal systems) that we need to send data between. Currently we do not have a consistent way of doing this. We use SSIS, SQL Server linked servers to directly update databases, ODBC connections to directly update databases, text files, etc.
Our goals are:
1) Have a consistent way of connecting applications.
2) Have a central way of monitoring and logging the connections between applications.
3) For the applications that offer web services, we would like to start using them instead of connecting directly to the database.
Whatever we use will need to be able to connect to web services, databases, flat files, and should also be able to accept data via a tcp connection.
Is BizTalk a good solution for this, or is it overkill?
It really depends. For the architecture you're describing, it would seem a good fit. However, you will need to validate whether BizTalk can communicate with the systems you are trying to integrate. For example, when these systems use web services, message queues or file-based communication, that may be a good fit.
When you start with BizTalk, you have to be willing to invest in hardware, software, and most of all in learning to use it.
Regarding your points:
1) yes, if you make sure to encapsulate the system connectors correctly
2) yes, BizTalk supports this with BAM
3) yes, that would match perfectly
From what you've described (6 systems), it is definitely a good time to investigate a more formalized approach to integration, as you've no doubt found that a point-to-point / direct integration approach results in a large number of permutations / spaghetti as each new system is added.
BizTalk supports both hub and spoke, and bus type topologies (with the ESB toolkit), either of which will reduce the number of interconnects between your systems.
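To put a number on the "permutations" point, here is a quick back-of-the-envelope sketch. It assumes the worst case for point-to-point integration, where every pair of systems ends up with its own direct link, and compares that with a hub where each system has one connection.

```python
def point_to_point_links(n):
    """Worst case: every pair of systems has its own direct link."""
    return n * (n - 1) // 2

def hub_links(n):
    """Hub and spoke: each system connects once, to the hub."""
    return n

for n in (6, 7, 10):
    print(n, point_to_point_links(n), hub_links(n))
# 6 15 6
# 7 21 7
# 10 45 10
```

With six systems that is up to 15 direct links against 6 hub connections, and adding a seventh system adds up to 6 new links in the point-to-point case but only 1 with a hub.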
To add to oɔɯǝɹ:
1) Yes - ultimately BizTalk converts everything to XML internally and you will use either visual maps or XSLT to transform between message types.
2) Yes. Out of the box there are a lot of WMI and Perfmon counters you can use, plus BizTalk has a SCOM management pack to monitor BizTalk's health. For your apps, there is BAM (TPE for simple monitoring; more advanced stuff can be done with the BAM API).
3) Yes - BizTalk supports all the common WCF binding types, and basic SOAP web services. BizTalk's messagebox can be used as a pub / sub engine which can allow you to 'hook' other processes into messages at a later stage.
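For anyone unfamiliar with the pub / sub idea mentioned in the last point, here is a tiny generic in-memory sketch of the pattern. It is not BizTalk's API, and the `MessageBus` class and message type names are made up for illustration. It only shows why a later process can be 'hooked' in without touching the publisher.

```python
from collections import defaultdict

class MessageBus:
    """Toy publish/subscribe hub (illustrative only, not BizTalk)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, message_type, handler):
        """Register a handler to receive every message of this type."""
        self._subscribers[message_type].append(handler)

    def publish(self, message_type, payload):
        """Deliver the payload to all current subscribers; return their count."""
        handlers = self._subscribers.get(message_type, [])
        for handler in handlers:
            handler(payload)
        return len(handlers)

bus = MessageBus()
received = []
bus.subscribe("order", received.append)

# A second consumer is hooked in later; the publisher code does not change.
audit_log = []
bus.subscribe("order", lambda msg: audit_log.append(msg["id"]))

bus.publish("order", {"id": 42})
print(received, audit_log)
```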
Some caveats:
. BizTalk should be used for messages (e.g. Electronic Documents across the organisation), but not for bulk data synchronisation. SSIS is a better bet for really large data transfers / data migration / data synchronisation patterns.
. As David points out, there is a steep learning curve to BizTalk and the tool itself isn't free (it requires SQL and BizTalk licenses, and usually you will want to use a monitoring tool like SCOM as well). To fast track this, you would need to send devs on BizTalk training, or bring in a BizTalk consultant.
. Microsoft seem to be focusing on Azure Service Bus, and there is speculation that BizTalk is going to be merged into Azure Service Bus at some point in future. If your enterprise strategy isn't entirely Microsoft, you might also want to consider products like NServiceBus and FUSE for an ESB.
Your problem is a typical enterprise problem. Companies start off building isolated applications like HR, Web, Supply Chain, Inventory, Client management etc. over a number of years. Once they reach a point where these applications can no longer live alone and need to talk to each other, they typically start with some hacked solution like data migration at database level.
But very soon they run into problems like no clear visibility, poor management, no standards etc., and they create a real spaghetti. The biggest threat is that applications become dependent on one another and you lose your agility to change anything. Any change to a system will require heavy testing and a long release cycle.
This is the kind of problem a middleware platform like BizTalk Server will solve for you. A lot of replies in the thread focused on the cost of BizTalk Server (some of the costs mentioned are not correct, BTW). It's not a cheap product. But look at the role it plays in your organisation as a central middleware platform connecting all the applications together, and at the non-functional benefits you get out of the box: adapters for most third-party products and protocols (SAP, Oracle, FTP, FILE, Web Services, etc.), the ability to scale your platform easily, performance, durability, long-running workflows with compensation logic, throttling of your environment, and so on. Seen that way, the cost factor soon diminishes.
My recommendation would be to take a look at BizTalk; if you are new to it, engage with your local Microsoft office. Either they can help or they can recommend a partner who can come and analyse your situation.

Arcgis explained? [closed]

I want to look into ArcGIS, and I can't get my head around where it fits in.
I have used the Google Maps API to create some simple maps with markers, overlays, listeners etc.
I have recently started looking at PostGIS, and I fully understand that: it enhances Postgres with additional data types and functions for drawing polygons and mapping areas. Great!
What I don't understand is where ArcGIS fits in.
What does it do? Why would you use it?
I have a large db of addresses.
Ultimately, it comes down to whether you are happier having a big software stack where everything is designed to work together or whether you are happier doing a bit of SQL, and/or Javascript and Python coding (these are the big players in open source GIS), and generally piecing bits together yourself. ESRI, the makers of the ArcGIS family (which includes desktop, server and web based technologies) is essentially the Microsoft of the GIS world -- the big player, whose products are designed to work very well with each other, but are sometimes a bit tardy when it comes to standards compliance or interaction with 3rd party software.
On the open source side, PostGIS, which essentially provides a spatial type extension to Postgres plus many spatial functions, is really an amalgam of various packages: GEOS, which provides many of the spatial predicate functions, PROJ (Proj4), which does coordinate system conversion, and GDAL, which provides a lot of glue functions. Recently PostGIS has added native support for raster, 3D and topology functions, along with the long-existing vector functions, which means that amazingly sophisticated GIS analysis can be performed directly at the database level by chaining together SQL functions.
As has been suggested above, QuantumGIS (generally known as QGIS) provides most of the functionality of ArcGIS desktop, for those wanting to go that route. JavaScript libraries such as OpenLayers or Leaflet (there are many others) can be used to visualize the results of PostGIS queries. In addition, there are tools such as GeoServer (Java servlet based) which allow you to serve up data held either in Postgres/PostGIS tables or ESRI shp files in common formats such as WMS, WFS and WMTS, acting as a bridge between client and server.
The final decision is often as much political as technical. If you work for a large utility company or local government, for example, support contracts and industry norms are likely to outweigh budgetary constraints. If you are working for a small startup and have people who are happy working at the command line, you will likely get much better value for money from the open source stack.
I assume you are asking about ArcMap, which is ESRI's desktop GIS application. The application is an extremely powerful cartographic production and analytic tool that allows the user to perform complex spatial operations. ArcMap also provides a graphical user interface that allows users to edit geometry, supporting complex topologies that model real-world phenomena. The application also supports a wide range of raster analysis that is commonly used for remote sensing.
Some examples of how you could use ArcGIS with your database of addresses:
You can use it to compare addresses to census data and better understand customers.
Generate heat maps of statistically significant clusters.
Use network analysis tools to identify closest facilities or route you to the addresses in the most efficient order.
Find what types of features your data falls within (ie City council districts, fire districts, states, etc) and update your data with that information.
These are just a few things you could do with ArcGIS. In short, this tool allows you to view and analyze your data in a spatial context versus the relational approach you seem to be taking now.
ESRI's ArcGIS is very powerful and has TONS of customization options through their ArcObjects API as well as a new way to add your own custom tools and button commands through a framework they call Add-ins. You can even use Python to create a very simple (code-wise) tool that lets a user click a button and, for example, return a selection on the map of all the telephone poles that are within 50 feet of a tree line. They could then just export this set of pole features as a tabular data report for a tree trimming crew to visit each pole to see if they need to trim back the vegetation. You can also use their ArcGIS Runtime to build a completely custom tool that runs from a USB thumb drive with zero install with only the parts and pieces you create like a Map, Table of contents, and a custom toolbar that has only the buttons and tools you need specific for that application. I've seen a gas utility inspection application written this way that only had the map and three buttons for them to use on an iPad or Android tablet. The options with ArcGIS are very near to endless and they keep updating it all constantly.
My day job is customizing ArcGIS to fit gas, water, and power utilities' needs to match their business workflow. I have been working with ArcGIS since 2004.
If you understand the Google Maps API and PostGIS then you really have no need for ArcGIS. Download QGIS and use it in conjunction with PostGIS: http://www.qgis.org/
ArcGIS is just unnecessary; that's why it doesn't make sense to you (and rightfully so).
Simple Answer: It lets you use MAPS to analyze and store data in a database, where said data has some sort of 'location' attribute on a surface or in 3D space.
Here's a quick example: "Return all of the parcels in Smith County that are within 1000 feet of a school and display them in red on a map."
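To make that kind of query concrete, here is a deliberately simplified sketch in plain Python. It assumes each parcel and school is reduced to a single point in planar coordinates measured in feet, and the parcel IDs and coordinates are made up. A real GIS (ArcGIS, PostGIS, etc.) would work on the actual parcel polygons in a projected coordinate system and use a spatial index rather than comparing every pair.

```python
import math

def within_distance(parcels, schools, max_feet):
    """Return IDs of parcels whose point lies within max_feet of any school.

    parcels: dict of parcel_id -> (x, y); schools: list of (x, y).
    Coordinates are assumed to be planar and in feet.
    """
    result = []
    for parcel_id, (px, py) in parcels.items():
        if any(math.hypot(px - sx, py - sy) <= max_feet for sx, sy in schools):
            result.append(parcel_id)
    return result

# Hypothetical data for illustration.
parcels = {"P1": (0, 0), "P2": (600, 800), "P3": (2000, 0)}
schools = [(0, 100)]
print(within_distance(parcels, schools, 1000))  # ['P1', 'P2']
```

The "display them in red on a map" half is where the GIS front end comes in; this only covers the selection step.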
If I had to answer in one sentence, it would be something like: if you just want to show where something is on the map (and some basic data with it), use the Google Maps API, but if you want to analyze, query and understand your spatial data, use ArcGIS.
ArcGIS is a platform containing Desktop, Server and Portal (a spatial CMS), with various types of geodatabases supported. ArcGIS for Desktop is used for powerful spatial analysis; it includes more than 700 different tools that support strong spatial and alphanumerical analysis. When we talk about spatial analysis, we can talk about different spatial overlays (simple example: where do wolves and foxes live), proximity analysis (factory-to-customer distances; a protected area (buffer) around an oil drill) and spatial statistics (finding patterns in space and time, mapping clusters such as hot/cold spots). Also, since a database is in the background of every serious GIS, you can use SQL to query your alphanumerical (and spatial) data to make better decisions.
Mapping is also a function of ArcGIS Desktop software: our brains understand information much better when it is visualized, and you can and should visualize the results you obtained through analysis. Keep in mind that a map is only a visualization of the data in a geodatabase (or shapefile).
ArcGIS Desktop is also used for data entry with “heads up” editing, for example creating vectors with attributes from ortho-photo images.
Geodatabase management is also part of ArcGIS, and geodatabases vary from file geodatabases to enterprise geodatabases which use SQL Server, Oracle, DB2 and other RDBMSs. A single-user file geodatabase supports one concurrent editor and has no storage limit, while enterprise geodatabases provide multiuser editing, versioning, archiving and backup scenarios. A personal geodatabase is a single-user geodatabase that uses Microsoft Access for storing spatial data.
ArcGIS for Server provides different formats of spatial services containing spatial data (map) along with alphanumerical information (if supported by the format). Types of ArcGIS for Server services that can be published are: Mapping, WCS, WMS, Feature Access, Schematics, Mobile Data Access, Network Analysis, KML, WFS… ArcGIS for Server services are authored using ArcMap, served with Server (of course), and their URLs are used by developers who code from scratch, within Portal for ArcGIS web app templates (which can be customized by developers if needed), or in other viewers such as the Silverlight or Flex Esri viewers.
I would say that if you are already comfortable with PostGIS, you should be fine for any work with vectors. If you are working with raster data then I think that would be where ArcGIS would fit in. In ArcGIS you can run different types of statistics and filters on rasters where I don't think you can with PostGIS but I'm sure that will eventually be added.
One more thing, if you ever need to automate your PostGIS work, I would recommend using Python with the psycopg2 library.

B2B Application Building and Maintenance Cost

I've been considering for some time now getting into the B2B integration business. I've researched the tools available for doing this,
like Oracle's WebLogic Integration, IBM's WebSphere, or Microsoft's BizTalk. They all seem to do the job (each having its ups and downs).
I've also looked at some companies that are already doing this (e.g. www.hubspan.com). It seems that B2B integration is a much-needed service.
Although my background is in integration of commercial products with open source software, I feel that, concerning the B2B integration world,
I still need to fill in some blanks.
So basically I'd like to clear a few things concerning all this:
All the frameworks that I previously mentioned are just that, frameworks. They allow you to build an application ON TOP of them;
they are not intended to be a final product. I assume that this is because the integration needs of different companies vary so much
that an out-of-the-box solution is just not possible. So my question is, do the applications built with these frameworks vary so much
from business to business that it's not possible to reuse them?
Also, is it possible to build a single framework of Suppliers and Customers (a Core of some kind), and connect new Customers and/or
Suppliers as they come? (This is the way HubSpan did it, not counting the development of custom Connectors to the Client ERP systems.)
Or will I have to do a separate integration for each Customer?
How many work hours are required to complete a typical integration project (assuming everything is planned and executed properly)?
(For the sake of simplicity, let's say that the integration includes only 'Query Product Price', 'Query Product Availability',
and 'Purchase Order Management'.)
And finally, is this a job for a single person (can I do this myself, assuming I have the knowledge to do it?) or is a team required?
Thanks in advance for sharing your thoughts and opinions.
Yes they can vary that much.
It depends on the business. Some will integrate easily while others will need custom modules and connectors
There isn't really a "typical" integration project.
Depends on the size of the project. If you're talking fortune 500 companies then no. If you're talking a local manufacturer and local supply house (presumably small) then maybe.
This is probably a question better asked on the programmers.stackexchange
I think it varies a lot. You should probably define what you mean by B2B, and there are a lot of different types these days. From a BizTalk perspective, it is possible to build an application service provider (ASP) version of B2B but it is hard to do.
The level of customization is one of the factors that drives up cost and the length of the project. I think it is difficult to do B2B alone, usually there is so much business domain knowledge specific to each company that you need those business people to help explain the existing systems.

Is there any "RSS hosting" with an API for creating feeds?

I am creating a desktop app that will create some reports. I want to export these reports as RSS or ATOM feeds. I can easily create feeds with the Rome lib for Java. But I have no idea how to distribute them. I thought about embedding httpd into my app, but it's a bad idea, because the computer can be behind NAT or turned off.
I need some kind of "proxy" server where I can push my feeds, and clients will be able to pull content from that server.
I can probably write a server-side app for this, but first I'd like to find out if some dedicated solution is available for problems like this.
I was also thinking about using some blogging platform and using its API. What do you think about this approach?
One more thing I have to consider when choosing a platform is its ability to handle a lot of updates. The desktop app will sometimes be shut down, but when it is running, it generates quite a lot of updates.
Check out Google's feedburner.
EDIT
Here's a better link for their help / FAQ. You'll still need to use some service to generate your feed, but it won't have to handle a heavy load. Feedburner will poll your feed every 30 minutes and their servers will act as a proxy for your feed. As for how to publish the feed for Feedburner to read, I would recommend writing a service to handle this, all the more so considering that you are getting the data for the feeds from a number of desktop applications: it'll probably be easier to write a custom service to interface with them, store your data in a DB, and publish feeds than it would be to try and modify a blogging service for this purpose.
I don't know why I didn't think of this when I first answered your question, but Yahoo has a service called Yahoo Pipes which you could use to generate feeds from various kinds of inputs. I'm not sure how well it would scale but it might work for you.
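As a rough idea of the "publish feeds" half of the custom service suggested earlier in this answer, here is a minimal sketch that turns stored report entries into an RSS 2.0 document using only Python's standard library. The asker is on Java with Rome, so treat this purely as an illustration of the shape of the output. The `build_rss` function, the item field names and the example.com URL are all hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime

def build_rss(title, link, description, items):
    """Build an RSS 2.0 document as a string.

    items: iterable of dicts with 'title', 'body' and 'published'
    (a timezone-aware datetime). These field names are an assumption.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "description").text = item["body"]
        # RSS expects RFC 822 style dates, which format_datetime produces.
        ET.SubElement(node, "pubDate").text = format_datetime(item["published"])
    return ET.tostring(rss, encoding="unicode")

reports = [
    {
        "title": "Nightly report",
        "body": "All checks passed.",
        "published": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    }
]
print(build_rss("Reports", "https://example.com/reports", "Desktop app reports", reports))
```

The service would serve this string at a stable URL, and a proxy such as Feedburner could then poll that URL.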

How to monitor a POP, SMTP and Exchange Server for mail activity

We need to write a .NET (C#) application that monitors all mail activity through a POP, SMTP and Exchange Server (2007 and later) and essentially grabs the mail for archiving into a document management system. I realise that the way to monitor each type of server would probably be different, so I'd like to know what the best (most elegant and reliable) way is to achieve this.
Thanks.
Many countries have rather narrow regulations for what such a system must do and what it must not do in order to be in compliance with the law. If you are developing a product for a company in SA that wants to sell it internationally, I would suggest that you need a more targeted approach.
Depending on the legal framework, your solution will have to intercept and archive all emails, or just a subset.
For instance, some countries do not allow the company to store private emails of employees, in which case the archival process needs to be configurable with rules that the employee can control.
If the intent is to archive each and every email, then the network-level approach that Jimmy Chandra suggested is better, because it is easier to deploy.
I don't think you need to worry about POP, right? It is not used for sending mail (unless you need to monitor access to emails too).
Regarding Exchange, versions from 2000 onwards have journaling support (I don't know about previous ones), so a mail is copied to a mailbox as it is sent/received (there are several different options depending on the Exchange version, so check them out). Then you can read that mailbox, or set a rule to forward it to an external SMTP server and have your app listen there.
For other SMTP servers, it should be possible to take a similar approach with forwarding rules etc., and some might have custom support as Exchange has.
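Whichever way the copies arrive (journaling mailbox or forwarded SMTP), the archiving step comes down to parsing each raw message and pulling out the fields the document management system needs. Here is a minimal sketch of that step using Python's standard `email` package. The question targets .NET, so this is only an illustration of the idea. The `archive_record` function and its field names are made up, and fetching the raw bytes from the mailbox is left out.

```python
from email import message_from_bytes
from email.policy import default

def archive_record(raw_bytes):
    """Parse one raw RFC 5322 message into a flat dict for archiving."""
    msg = message_from_bytes(raw_bytes, policy=default)
    # Prefer the plain-text part; may be None for attachment-only mail.
    body = msg.get_body(preferencelist=("plain",))
    return {
        "from": str(msg.get("From", "")),
        "to": str(msg.get("To", "")),
        "subject": str(msg.get("Subject", "")),
        "message_id": str(msg.get("Message-ID", "")),
        "text": body.get_content() if body is not None else "",
    }

# Made-up sample message.
raw = (
    b"From: alice@example.com\r\n"
    b"To: bob@example.com\r\n"
    b"Subject: Quarterly figures\r\n"
    b"Message-ID: <1@example.com>\r\n"
    b"\r\n"
    b"Figures attached.\r\n"
)
print(archive_record(raw))
```

Storing the Message-ID alongside the record gives a natural key for de-duplicating mail that is seen both on the way out and on the way in.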