What you want to do: Partition a geographical area, where each area contains elementary entities.
You want to set up a web service where users are allowed to add locations by clicking on a map, say using the Google Maps API.
The problem is that adding all the locations at the top level of a tree will quickly lead to a bloated data structure increasing search time and make it hard to scale.
One approach is to divide the area into partitions, a bit like the United States is divided into states.
States work fine as their union does cover the whole country, but their respective shapes is unnecessarily tortuous.
All we need really is a bunch of possibly overlapping circular areas defined by a center and a radius r.
It turns out it is relatively easy to find databases of US populated locations with their coordinates.
Simply use these as your centers and pick a radius that will cover the whole territory, depending on the granularity of your database.
That way, users browse by locations and add markers for each location, allowing your data structure to scale.
A problem that this layout creates is that of marker visibility.
Assuming two towns are rather close from each other and one user adds a place through town A, then all users browsing that area through town B will not see this marker.
A solution is to store all the neighbor locations of a given location.
Given a location L1, a location L2 is a neighbor of L1 iff distance(L1, L2) < 2r, where r is the radius of a location.
Using this schema, town B would be listed as a neighbor of town A and, as such, its list of place markers would be displayed by all its neighbors, including town A.
That way, no matter through which location a user adds a marker, all other users will be able to see it if they browse that area through neighbor locations (locations that would cover the marker).
What if I select a location on the west coast, then add a marker on the east coast?
This will break the system as this marker will not be visible by any other locations than the original location and its neighbors.
That's why in addition to storing neighbors of a location, users should not be able to add place markers more than r distance from a location's center.
If all these constraints are respected, you've got a scalable database of places, under the provision that users know the general location that they are looking for. A reasonable assumption.
No comments:
Post a Comment