ARADEEPOPSIS is a software tool that enables plant researchers to non-invasively score plant growth, biomass accumulation and senescence from image data in a highly parallelized, high-throughput, yet easy-to-use manner.
It is built upon the published convolutional neural network (CNN) DeepLabv3+[1], which performs semantic image segmentation. A pretrained checkpoint of this model was trained further on manually annotated top-view images of Arabidopsis thaliana plants of different ages. The code that was used for training can be found here. The different models are available here.
The training and validation datasets including ground truth annotations are available at
The pipeline is implemented using open source software such as Nextflow[2], TensorFlow[3], ImageMagick, scikit-image[4] and shiny[5].
The pipeline uses either a conda environment or a Docker container to resolve dependencies, ensuring a high level of reproducibility and portability. It is largely platform independent and scales from personal computers to High Performance Computing (HPC) infrastructure, allowing for time-efficient analysis of hundreds of thousands of images within a day.
Note: To ensure reproducibility, container-based profiles are recommended over conda.
Once the pipeline is fed with images of single plants, it converts the images into chunks of arbitrary size by saving the image data into an IO-optimized binary file format.
These file records are then, in parallel, served to a deep learning model, allowing for pixel-by-pixel classification of the image data.
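The chunking step can be pictured with a minimal Python sketch. It only groups image paths into fixed-size batches, which is assumed here to correspond to the `--chunksize` parameter; the function name `chunk_paths` and the file names are hypothetical, and the serialization into a binary record format is not shown.

```python
from typing import Iterator, List, Sequence


def chunk_paths(paths: Sequence[str], chunksize: int = 10) -> Iterator[List[str]]:
    """Yield consecutive groups of at most `chunksize` image paths."""
    if chunksize < 1:
        raise ValueError("chunksize must be a positive integer")
    for start in range(0, len(paths), chunksize):
        yield list(paths[start:start + chunksize])


# Hypothetical file names, for illustration only.
images = [f"plant_{i:03d}.png" for i in range(25)]
chunks = list(chunk_paths(images, chunksize=10))
print([len(chunk) for chunk in chunks])  # [10, 10, 5]
```

Each chunk can then be processed independently, which is what makes the parallel inference step possible.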
Three distinct models are available and should be chosen according to the research interest.
Depending on the model of choice, the pipeline extracts relevant phenotypic information such as:
- plant area (model A/B/C)
- area of senescent/necrotic tissue (model B/C)
- area of anthocyanin-rich tissue (model C)
- color composition and color indices as described by Del Valle et al. 2018[6] for each class supported by the respective model
- a variety of morphometric traits for each class supported by the respective model
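As an illustration of how area traits follow from a pixel-by-pixel classification, the sketch below counts pixels per class in a toy segmentation mask. The label encoding in `CLASS_NAMES`, the function names, and the definition of plant area as "all non-background pixels" are assumptions made for this example and may not match the pipeline's implementation. Plain nested lists are used to keep the example dependency-free.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical label encoding; the pipeline's actual class indices may differ.
CLASS_NAMES = {0: "background", 1: "green", 2: "senescent", 3: "anthocyanin"}


def class_areas(mask: List[List[int]]) -> Dict[str, int]:
    """Count the pixels assigned to each class in a segmentation mask."""
    counts = Counter(label for row in mask for label in row)
    return {name: counts.get(label, 0) for label, name in CLASS_NAMES.items()}


def plant_area(areas: Dict[str, int]) -> int:
    """Total plant area, assumed here to be every non-background pixel."""
    return sum(n for name, n in areas.items() if name != "background")


# Toy 4x4 mask, for illustration only.
mask = [
    [0, 0, 1, 1],
    [0, 1, 1, 2],
    [0, 1, 3, 2],
    [0, 0, 0, 0],
]
areas = class_areas(mask)
print(areas, plant_area(areas))
```

With a model that supports fewer classes, the corresponding counts would simply be zero.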
The pipeline has a modular character, also allowing the use of custom models obtained from the Deep Plant Phenomics[7] framework for the semantic segmentation step.
In addition, it is possible to feed the pipeline with segmentation masks obtained by other methods, while still making use of the automated trait extraction and visualization capabilities of the pipeline.
Running the pipeline with the provided models requires at least 6GB of memory for single-scale inference, and at least 12GB for --multiscale.
Note: Windows users will have to set up the WSL first.
- Install Nextflow
- Install either conda, Docker, podman, Charliecloud or Singularity.
Note: To run the pipeline on macOS with Docker, it is necessary to increase the allowed memory usage from the default setting of 2GB to 6GB (see the Docker documentation for instructions).
To run the pipeline you have to provide single-pot plant images:
nextflow run Gregor-Mendel-Institute/aradeepopsis --images 'path/to/images/*{png|jpg}' -profile {conda|docker|podman|singularity|charliecloud}
module load nextflow/20.01.0
nextflow run Gregor-Mendel-Institute/aradeepopsis --images 'path/to/images/*{png|jpg}' -profile cbe,singularity
Default Parameters
Parameter | Default value | Type |
---|---|---|
--model | C | <Character> |
--images | None | <Path> |
--masks | false | <Path> |
--multiscale | false | <Boolean> |
--chunksize | 10 | <Integer> |
--ignore_senescence | true | <Boolean> |
--outdir | ./results | <Path> |
--save_overlay | true | <Boolean> |
--save_mask | true | <Boolean> |
--save_hull | true | <Boolean> |
--summary_diagnostics | false | <Boolean> |
--shiny | true | <Boolean> |
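When launching many runs with different settings, it can help to assemble the command line programmatically. The following Python sketch builds an argument list from a dictionary of the parameters listed above. The helper `build_command` is hypothetical, the image path is a placeholder, and writing booleans as lowercase `true`/`false` is an assumption about how the values are passed on the command line.

```python
import shlex
from typing import Dict, List, Union

ParamValue = Union[str, int, bool]


def build_command(params: Dict[str, ParamValue], profile: str) -> List[str]:
    """Assemble an argument list for `nextflow run` from pipeline parameters."""
    cmd = ["nextflow", "run", "Gregor-Mendel-Institute/aradeepopsis"]
    for name, value in params.items():
        # Booleans are written as lowercase true/false (an assumption about
        # how the values are passed on the command line).
        text = str(value).lower() if isinstance(value, bool) else str(value)
        cmd += [f"--{name}", text]
    cmd += ["-profile", profile]
    return cmd


# Placeholder path and example settings, for illustration only.
cmd = build_command(
    {"images": "path/to/images/*png", "model": "B", "multiscale": True, "chunksize": 20},
    profile="docker",
)
print(shlex.join(cmd))
```

The resulting list can be passed directly to `subprocess.run`, which avoids shell quoting issues with the glob pattern.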
Pipeline Output
The pipeline computes a total of 78 morphometric and color-related traits from the analysed images. A description of all traits is available here.
[1] Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation.
Chen, L.-C. et al., 2018. arXiv [cs.CV]. Available at: http://arxiv.org/abs/1802.02611.
[2] Nextflow enables reproducible computational workflows.
Di Tommaso, P. et al., 2017. Nature biotechnology, 35(4), pp.316–319.
[3] TensorFlow: Large-scale machine learning on heterogeneous systems.
Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Rafal Jozefowicz, Yangqing Jia, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Mike Schuster, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng, 2015.
[4] scikit-image: Image processing in Python.
Stéfan van der Walt, Johannes L. Schönberger, Juan Nunez-Iglesias, François Boulogne, Joshua D. Warner, Neil Yager, Emmanuelle Gouillart, Tony Yu and the scikit-image contributors. PeerJ 2:e453 (2014)
[5] shiny: Easy web applications in R
RStudio Inc. (2014)
[6] Digital photography provides a fast, reliable, and noninvasive method to estimate anthocyanin pigment concentration in reproductive and vegetative plant tissues
Del Valle JC, Gallardo-López A, Buide ML, Whittall JB, Narbona E, 2018. Ecol Evol. 8(6):3064–76.
[7] Deep Plant Phenomics: A Deep Learning Platform for Complex Plant Phenotyping Tasks
Ubbens JR, Stavness I. Front Plant Sci. 2017 Jul 7;8:1190.