This repository has been archived by the owner on Jun 11, 2024. It is now read-only.

addresses-importer

The goal of this project is to aggregate multiple sources of addresses and then merge them into one. Currently we're using OpenAddresses and OpenStreetMap.

The bulk of this project is the deduplication process and cleaning the data.

Workflow

The project first loads address data using the importers. Take a look in the importers folder if you want more information on a specific importer.

To make sure the importers all generate the same kind of data, we wrote a trait called CompatibleDB, which is available in tools/src/lib.rs alongside an Address type. This forces all importers to provide the same information in the same format. It's then up to the caller to implement the trait however they want.
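To illustrate the idea, here is a minimal sketch of how a shared trait can force every importer to emit the same shape of data. The fields of `Address`, the methods `insert` and `count`, the `InMemoryDB` type and the `import_sample` function are all assumptions made for this example; the actual definitions in tools/src/lib.rs may differ.

```rust
/// Hypothetical, simplified stand-in for the `Address` type.
/// The real type in tools/src/lib.rs may have different fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Address {
    pub lat: f64,
    pub lon: f64,
    pub number: Option<String>,
    pub street: Option<String>,
    pub city: Option<String>,
}

/// Hypothetical shape of the trait: importers only talk to this
/// interface, so they all have to produce `Address` values.
pub trait CompatibleDB {
    fn insert(&mut self, addr: Address);
    fn count(&self) -> usize;
}

/// One possible caller-side implementation (an assumption for this
/// sketch): keep everything in memory.
#[derive(Default)]
pub struct InMemoryDB {
    addresses: Vec<Address>,
}

impl CompatibleDB for InMemoryDB {
    fn insert(&mut self, addr: Address) {
        self.addresses.push(addr);
    }

    fn count(&self) -> usize {
        self.addresses.len()
    }
}

/// A toy importer: it is generic over any `CompatibleDB`, so it does
/// not care how the caller stores the data.
pub fn import_sample<T: CompatibleDB>(db: &mut T) {
    db.insert(Address {
        lat: 48.8698,
        lon: 2.3311,
        number: Some("12".to_string()),
        street: Some("Rue de la Paix".to_string()),
        city: Some("Paris".to_string()),
    });
}
```

Because the importer is generic over the trait, a caller could swap `InMemoryDB` for a file-backed or database-backed implementation without touching the importer.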

Once the imports are done, all the data is merged into one big file. However, the same address may have been imported several times from different sources, and sometimes several times within the same source. This is where the [deduplicator](./deduplicator) comes in. As usual, more information can be found in its README file.
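As a rough illustration of why duplicates appear and how they can be collapsed, the sketch below keeps the first occurrence of each address after normalizing case and whitespace. This is not the deduplicator's actual algorithm, which is described in its own README; the `ImportedAddress` type, the `normalize` and `dedup_key` helpers and the `deduplicate` function are hypothetical names used only for this example.

```rust
use std::collections::HashSet;

/// Hypothetical record for this sketch; not the project's real type.
#[derive(Debug, Clone, PartialEq)]
pub struct ImportedAddress {
    pub number: String,
    pub street: String,
    pub city: String,
    pub source: String,
}

/// Lowercase the text and collapse runs of whitespace.
fn normalize(s: &str) -> String {
    s.split_whitespace()
        .collect::<Vec<_>>()
        .join(" ")
        .to_lowercase()
}

/// Two addresses are treated as duplicates when their normalized
/// number, street and city match, whatever source they came from.
fn dedup_key(addr: &ImportedAddress) -> (String, String, String) {
    (
        normalize(&addr.number),
        normalize(&addr.street),
        normalize(&addr.city),
    )
}

/// Keep the first occurrence of each key and drop later ones.
pub fn deduplicate(addresses: Vec<ImportedAddress>) -> Vec<ImportedAddress> {
    let mut seen = HashSet::new();
    addresses
        .into_iter()
        .filter(|addr| seen.insert(dedup_key(addr)))
        .collect()
}
```

Exact matching on normalized text is the simplest possible strategy. It would miss duplicates that differ by abbreviations, spelling or slightly different coordinates, which is the kind of case a dedicated deduplication step has to handle.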