Input Data
Our pipeline uses overlapping drone imagery taken by a DJI Phantom drone. There are several ways to provide this input data to the pipeline. As a preliminary example, we will illustrate the simplest directory structure.
Note first that the pipeline expects the drone images from each surveyed region to be in separate directories, like this:
flower_map
├── config.yml
├── data
│   ├── samples.tsv
│   ├── region1
│   │   ├── DJI_001.JPG
│   │   ├── DJI_002.JPG
│   │   └── DJI_003.JPG
│   ├── region2
│   │   ├── DJI_001.PNG
│   │   ├── DJI_002.PNG
│   │   └── DJI_003.PNG
│   └── region3
│       ├── DJI_001.JPG
│       ├── DJI_002.PNG
│       └── DJI_003.JPG
├── envs
├── LICENSE
├── metashape.lic
├── out
├── README.md
├── run.bash
├── scripts
└── Snakefile
We've placed all of our data in a data/ folder within the project root. If your data exists in a separate place on your filesystem, you can symlink it into the data/ directory or symlink the data/ directory itself. You may even choose not to have a data/ directory at all. The important thing is that each region has its own directory of drone image files.
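As a minimal sketch of the symlinking approach described above, the helper below links an existing image directory into data/ under a region name. The function name link_region and its arguments are hypothetical and not part of the pipeline; an `ln -s` command from the shell achieves the same thing.

```python
from pathlib import Path


def link_region(source_dir, data_dir, region_name):
    """Symlink an existing image directory into data_dir under region_name.

    Hypothetical helper for illustration; the pipeline does not provide it.
    """
    source = Path(source_dir).resolve()
    if not source.is_dir():
        raise NotADirectoryError(f"{source} is not a directory")

    data = Path(data_dir)
    data.mkdir(parents=True, exist_ok=True)

    link = data / region_name
    # target_is_directory only has an effect on Windows.
    link.symlink_to(source, target_is_directory=True)
    return link


# Example call (paths are placeholders):
# link_region("/mnt/surveys/site_a", "data", "region1")
```

After linking, data/region1 behaves like a regular directory of images, so the path written in samples.tsv does not need to change.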
Inside the data/ directory, we created a samples.tsv file describing the paths to these datasets:
region1 data/region1
region2 data/region2
region3 data/region3 .PNG
The samples.tsv file has three tab-separated columns and one line for each dataset that you'd like to analyze. The first column is a unique identifier you assign to the dataset. The pipeline uses it when it creates its output, so it is best to avoid spaces in your unique identifiers. The second column is the path to the dataset from the root of the project directory.
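To make the column layout concrete, here is a rough sketch of how a file in this format could be read with Python's standard csv module. The function read_samples is a hypothetical illustration, not the pipeline's own parser, and the duplicate check is an assumption that follows from the identifiers being unique.

```python
import csv
import io


def read_samples(handle):
    """Map each dataset identifier to a (path, extension or None) tuple.

    Hypothetical reader for illustration only.
    """
    samples = {}
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines.
        if not any(cell.strip() for cell in row):
            continue
        if len(row) < 2:
            raise ValueError(f"expected at least two columns, got: {row!r}")
        name, path = row[0].strip(), row[1].strip()
        # The third column is optional.
        ext = row[2].strip() if len(row) > 2 and row[2].strip() else None
        if name in samples:
            raise ValueError(f"duplicate identifier: {name}")
        samples[name] = (path, ext)
    return samples


# The example file from this page, with tabs written out explicitly.
example = (
    "region1\tdata/region1\n"
    "region2\tdata/region2\n"
    "region3\tdata/region3\t.PNG\n"
)
samples = read_samples(io.StringIO(example))
```

When reading from disk, open the file with `open(path, newline="")` as the csv documentation recommends.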
The third column is optional and denotes the extension of the image files in the dataset's directory. If it is not specified, the pipeline uses the most common extension in that directory. In our example, the pipeline would default to .JPG for region3, since data/region3 has only one .PNG file. But by specifying .PNG in our samples.tsv file, we instruct the pipeline to use only the .PNG file in data/region3.
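The defaulting rule can be sketched as follows. This is an illustration, not the pipeline's actual code: pick_extension is a hypothetical name, and the case-sensitive comparison and tie handling (the first extension seen wins) are assumptions.

```python
from collections import Counter
from pathlib import PurePath


def pick_extension(filenames, requested=None):
    """Return the requested extension, or else the most common one.

    Hypothetical helper; matching is case-sensitive by assumption.
    """
    if requested:
        return requested
    suffixes = [PurePath(name).suffix for name in filenames]
    counts = Counter(suffix for suffix in suffixes if suffix)
    if not counts:
        raise ValueError("no files with an extension found")
    return counts.most_common(1)[0][0]


def select_images(filenames, requested=None):
    """Keep only the files whose extension matches the chosen one."""
    ext = pick_extension(filenames, requested)
    return [name for name in filenames if PurePath(name).suffix == ext]


# The contents of data/region3 from the example above.
region3 = ["DJI_001.JPG", "DJI_002.PNG", "DJI_003.JPG"]
default_images = select_images(region3)       # no third column given
png_images = select_images(region3, ".PNG")   # third column set to .PNG
```

With no third column, the two .JPG files are selected; with .PNG specified, only DJI_002.PNG is.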
Once you're done constructing your samples.tsv file, specify the path to it in your config.yml configuration file.
It is best to list all of your datasets in samples.tsv even if you only plan to use a few of them at first. A separate configuration option in config.yml called SAMP_NAMES allows you to use only a subset of the datasets at once.
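Conceptually, the subsetting works like the sketch below. It assumes that SAMP_NAMES holds a list of identifiers from the first column of samples.tsv and that leaving it empty selects every dataset; both are assumptions, so check the comments in config.yml for the actual format. The function select_samples is hypothetical.

```python
def select_samples(samples, samp_names=None):
    """Return only the datasets named in samp_names, or all if it is empty.

    Hypothetical helper; assumes samp_names is a list of identifiers.
    """
    if not samp_names:
        return dict(samples)
    missing = [name for name in samp_names if name not in samples]
    if missing:
        raise KeyError(f"not listed in samples.tsv: {missing}")
    return {name: samples[name] for name in samp_names}


# Datasets as (path, extension or None), mirroring the example samples.tsv.
all_samples = {
    "region1": ("data/region1", None),
    "region2": ("data/region2", None),
    "region3": ("data/region3", ".PNG"),
}
subset = select_samples(all_samples, ["region1", "region3"])
```

Listing every dataset in samples.tsv and narrowing the selection this way means the file does not have to be edited each time you switch between datasets.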