Yahoo! recently (eight months ago)
released big parts of their GeoPlanet
data. We have successfully integrated the free dataset from
GeoNames.org into TheLabelFinder
platform and thought the GeoPlanet data was worth reviewing and comparing.
This article is the result of our research.
Both GeoNames and GeoPlanet provide web service interfaces.
GeoNames offers a REST API while GeoPlanet uses SOAP.
But relying on third-party web services can be
tricky. The connection might be slow, unstable or even
completely broken because of a server error.
This is why we prefer integrating the data into our own databases and applications.
The actual downloads can be found on GeoNames.org and
Both data sets are licensed under the Creative Commons Attribution License.
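To illustrate what integrating the data into your own database can look like, here is a minimal sketch that loads rows of a GeoNames dump into SQLite. It is not our production code. The column order is taken from the GeoNames readme as we understand it (tab-separated, no header row, 19 columns), so please verify it against the readme shipped with your download. The table name `toponym`, the function `load_geonames` and the sample row are made up for this example.

```python
import csv
import io
import sqlite3

# Assumed column order of the GeoNames main dump (check the readme).
GEONAMES_COLUMNS = [
    "geonameid", "name", "asciiname", "alternatenames", "latitude",
    "longitude", "feature_class", "feature_code", "country_code", "cc2",
    "admin1_code", "admin2_code", "admin3_code", "admin4_code",
    "population", "elevation", "dem", "timezone", "modification_date",
]


def load_geonames(lines, conn):
    """Insert GeoNames rows from an iterable of text lines into SQLite.

    Returns the number of rows that were inserted.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS toponym ("
        "geonameid INTEGER PRIMARY KEY, name TEXT, "
        "latitude REAL, longitude REAL, country_code TEXT)"
    )
    # QUOTE_NONE: quote characters inside names are plain data, not CSV quoting.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for fields in reader:
        if len(fields) != len(GEONAMES_COLUMNS):
            continue  # skip malformed lines instead of guessing
        record = dict(zip(GEONAMES_COLUMNS, fields))
        rows.append((
            int(record["geonameid"]),
            record["name"],
            float(record["latitude"]),
            float(record["longitude"]),
            record["country_code"],
        ))
    conn.executemany(
        "INSERT OR REPLACE INTO toponym VALUES (?, ?, ?, ?, ?)", rows
    )
    conn.commit()
    return len(rows)


# Invented sample row, only here to show the expected shape of a line.
sample_fields = [
    "1", "Example Town", "Example Town", "", "52.5", "13.4", "P", "PPL",
    "DE", "", "", "", "", "", "0", "", "0", "Europe/Berlin", "2010-01-01",
]
conn = sqlite3.connect(":memory:")
inserted = load_geonames(io.StringIO("\t".join(sample_fields) + "\n"), conn)
```

For a real import you would pass an open file handle instead of the in-memory sample and probably keep more of the columns.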
GeoNames and GeoPlanet have a lot in common.
The following list shows features which are roughly the same or
at least comparable.
As you can see, their concepts overlap, but as always:
The devil is in the details.
The following table shows the most important differences.
| || GeoNames.org || Yahoo! GeoPlanet
| Geo coordinates || yes || no
| Structure || flat || hierarchical
| Neighboring || no || yes
The most obvious disadvantage of the GeoPlanet data is that
it doesn't come with geo coordinates. That's really sad, because
the web service they offer not only supports them but also
provides the bounding box of a given place which would be a really
But in contrast to GeoNames, GeoPlanet excels at structuring the data.
GeoNames' structure is flat. No record (GeoNames calls them toponyms) knows about its surrounding
location. E.g. Berlin does not know about Germany, which itself doesn't know about Europe and so forth.
GeoPlanet records, or places as Yahoo! calls them, all (except one)
have a reference to their parent place and therefore offer relations between
places like the following:
Parent (direct surrounding place)
Child (direct sub-places)
Siblings (places sharing the same parent and place type)
Ancestors (set of all parents)
If you take e.g. our company's district you will get the following
There is a fifth relationship GeoPlanet offers:
Back to our local district example, this would be:
Ortsteil Prenzlauer Berg (WOEID 26821872)
Ortsteil Mitte (WOEID 26821864)
Ortsteil Tiergarten (WOEID 26821854)
Ortsteil Friedrichshain (WOEID 26821877)
Ortsteil Weißensee (WOEID 26821880)
Ortste… you got the point, right?
In case you didn't already guess which place has no parent place in the GeoPlanet data:
It's Earth (WOEID 1).
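Because every place except Earth stores a reference to its parent, the parent, child, sibling and ancestor relations described above can all be derived from that single reference. The following is a rough sketch of the idea, not GeoPlanet's own implementation. Only WOEID 1 (Earth) is taken from this article. All other IDs, names and place types in `PLACES` are made-up placeholders, and the function names are our own.

```python
# woeid -> (name, place_type, parent_woeid)
# Only WOEID 1 is real; every other row is a hypothetical placeholder.
PLACES = {
    1: ("Earth", "Supername", None),
    10: ("Europe", "Continent", 1),
    20: ("Germany", "Country", 10),
    30: ("Berlin", "State", 20),
    40: ("District A", "Suburb", 30),
    41: ("District B", "Suburb", 30),
    42: ("Some Park", "POI", 30),
}


def parent(woeid):
    """Direct surrounding place, or None for the root."""
    return PLACES[woeid][2]


def children(woeid):
    """Direct sub-places: all places whose parent is the given place."""
    return sorted(w for w, (_, _, p) in PLACES.items() if p == woeid)


def siblings(woeid):
    """Places sharing the same parent and place type, excluding the place itself."""
    _, place_type, parent_id = PLACES[woeid]
    return sorted(
        w for w, (_, t, p) in PLACES.items()
        if w != woeid and p == parent_id and t == place_type
    )


def ancestors(woeid):
    """All parents, from the direct parent up to the root."""
    chain = []
    current = parent(woeid)
    while current is not None:
        chain.append(current)
        current = parent(current)
    return chain
```

With the placeholder data, `ancestors(40)` walks from Berlin up to Earth, while `siblings(40)` only contains District B, because Some Park has the same parent but a different place type. Neighbors cannot be derived this way, since they are not a property of the hierarchy and have to be stored separately.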
As mentioned before, we are actively using the GeoNames data in
a production-critical environment and are very happy with it.
The data quality suits our needs, but since we haven't
worked with the GeoPlanet dataset yet, it would be unfair to try
to judge it in this respect. If anybody already has some experience
or concrete examples, please let us know.
We can't really compare data quality, but what we can
do is compare quantity. It might not be useful,
but it was easy to collect, so here you go:
| || GeoNames.org || Yahoo! GeoPlanet
| Places/Toponyms || 7,069,291 || 5,332,310
| Aliases/Alternate Names || 2,928,296 || 1,950,735
| Neighbors || n/a || 8,521,075
| Size (all files, unzipped) || 882M || 504M
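Counts like these are easy to reproduce for any tab-separated dump. The snippet below is one possible way to do it and not necessarily how the numbers above were collected. Whether a given file starts with a header row is an assumption you need to check per file, which is why `has_header` is a parameter. The function name and the sample data are invented for this example.

```python
import io


def count_records(lines, has_header=False):
    """Count non-empty data lines in a tab-separated dump."""
    count = 0
    for index, line in enumerate(lines):
        if has_header and index == 0:
            continue  # the first line holds column names, not a record
        if line.strip():
            count += 1
    return count


# Invented sample data: one header line, two records, one blank line.
sample = io.StringIO("id\tname\n1\tExample A\n2\tExample B\n\n")
record_count = count_records(sample, has_header=True)
```

For a real file you would pass an open file handle, which is read line by line and therefore does not need to fit into memory.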
GeoPlanet's hierarchical structure looks promising and makes it possible to think about
some really neat features. Then again, the lack of important
basic information like geo coordinates really upsets me. If Yahoo! decided to integrate
center and bounding box information, the GeoPlanet data would be a real
competitor to GeoNames.
GeoNames, on the other hand, is a community-driven project. The data might (who knows?)
not be as good as its GeoPlanet counterpart, but you are free to register and change it yourself.
Many people (reasonably) fear a dependency on a big company like Yahoo! or Google.
In that case GeoNames might be the better choice.