Programming, CakePHP and commentary from Britain.

E-mail

kdobson@gmail.com

Encoding coordinates for efficiency

The folks over at Soul Solutions published an article (Encoding for performance, Soul Solutions) outlining the steps for implementing the “Encoded Polyline Algorithm Format” (Encoded Polyline Algorithm Format, Google), an algorithm that packs a list of coordinates into a compact ASCII string and so reduces the data overhead when sending long lists of coordinates over a network.

I have been aware of this for a while but, unfortunately, have not had the opportunity to put it into practice. The duo over at Soul Solutions work primarily with Virtual Earth (Microsoft’s on-line mapping system), so they tend to focus on loading speed and keeping things as snappy as possible. My scenario was a little different: generating KML files containing hundreds of polygons based on geographical boundaries.

The speed of loading was not so important for me, as the KML files are all stored locally and loaded straight into Google Earth. However, the generation of these KML files is where the encoding algorithm proved exceptionally useful.
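To give a feel for the generation side, the sketch below decodes an encoded string back into points and formats them for a KML coordinates element. It assumes the strings were produced with the standard 1e5 precision; the function names `decode_polyline` and `kml_coordinates` are hypothetical, and this is not the code actually used to build my files.

```python
def decode_polyline(encoded: str):
    """Decode an encoded polyline string into a list of (lat, lng) pairs."""
    points = []
    index = 0
    lat = lng = 0
    while index < len(encoded):
        deltas = []
        for _ in range(2):  # one latitude delta, then one longitude delta
            shift = 0
            value = 0
            while True:
                chunk = ord(encoded[index]) - 63
                index += 1
                value |= (chunk & 0x1F) << shift
                shift += 5
                if chunk < 0x20:  # continuation flag not set: last chunk
                    break
            # The lowest bit carries the sign; undo the shift (and inversion).
            deltas.append(~(value >> 1) if value & 1 else value >> 1)
        lat += deltas[0]
        lng += deltas[1]
        points.append((lat / 1e5, lng / 1e5))
    return points


def kml_coordinates(points) -> str:
    """Format (lat, lng) pairs as KML 'lon,lat,alt' tuples separated by spaces."""
    # KML puts longitude first; altitude is fixed at 0 in this sketch.
    return " ".join(f"{lng:.5f},{lat:.5f},0" for lat, lng in points)


if __name__ == "__main__":
    ring = decode_polyline("_p~iF~ps|U_ulLnnqC_mqNvxq`@")
    print(kml_coordinates(ring))
```

Note that a KML polygon ring has to end on the same coordinate it starts with, so the stored boundary should already be closed or be closed before it is written out.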

Imagine a database table containing lat/long coordinates for the lines making up the state boundaries in the United States. There are 13,691 points in this table, each representing a location on a state boundary. With 51 states, that’s an average of just under 269 points per state. This number of points is by no means big enough to cause any sort of problem when generating a KML file.

Now imagine a database table containing all the counties in the US, and another table containing all the ZIP codes in the US. For the latter, that’s 2,249,509 points making up ~29,900 ZIP codes. Again, this is by no means a large amount of data for a SQL server, but querying it takes some time, especially when you want to get 1,000 ZIP codes out of the database quickly.

The approach described by Soul Solutions and Google not only lets us encode coordinates so the overheads are much smaller (25% of the original transfer time), but also reduces the number of database rows per polygon dramatically: all of a polygon’s points collapse into a single encoded string, so one row replaces hundreds. That in turn speeds up query times (the more rows a query has to read, the longer it takes).

With the US state example, we have 51 rows (one per state, each containing an encoded string) instead of 13,691, cutting the row count by a factor of just under 269, to about 0.37% of the original number of rows.

Or, more importantly for query times, storing ~29,900 ZIP codes in ~29,900 rows instead of 2.25 million.
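As a quick check of the figures above, the row reduction can be worked out directly from the point and polygon counts quoted in this post. The helper name `row_reduction` is just for illustration.

```python
def row_reduction(points: int, polygons: int):
    """Return (reduction factor, remaining rows as a percentage of the original)."""
    factor = points / polygons
    percent_remaining = 100 * polygons / points
    return factor, percent_remaining


if __name__ == "__main__":
    # Counts taken from the post: (points, polygons) per data set.
    datasets = {
        "US states": (13_691, 51),
        "US ZIP codes": (2_249_509, 29_900),
    }
    for name, (points, polygons) in datasets.items():
        factor, percent = row_reduction(points, polygons)
        print(f"{name}: {factor:.1f}x fewer rows, {percent:.2f}% of the original")
```

This only counts rows; the encoded strings still have to hold every point, so the saving in storage space is smaller than the saving in row count.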
