
Merge pull request #353 from NervanaSystems/jitendra/sync-public-master
Sync public master - Release 1.4.0
jitendra42 authored Jul 3, 2019
2 parents 76bc438 + 90976e9 commit 3a8206d
Showing 257 changed files with 12,025 additions and 3,951 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -8,3 +8,4 @@
.coverage
.tox
test_data/
*.bak
191 changes: 191 additions & 0 deletions Contribute.md
@@ -0,0 +1,191 @@
# Contributing to the Model Zoo for Intel® Architecture

## Adding scripts for a new TensorFlow model

### Code updates

To add a new model to the zoo, there are a few things that are
required:

1. Set up the directory structure to allow the
[launch script](/docs/general/tensorflow/LaunchBenchmark.md) to find
your model. This involves creating folders for:
`/benchmarks/<use case>/<framework>/<model name>/<mode>/<precision>`.
Note that you will need to add an `__init__.py` file in each new
directory that you add, so that python can find the code (a sketch of
the setup commands is shown after the diagram below).

![Directory Structure](benchmarks_directory_structure.png)
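
   For example, here is a minimal sketch of the setup commands for a
   hypothetical FP32 inference model (the `my_model` name and
   `image_recognition` use case are placeholders for illustration):

```bash
# Hypothetical example: directories for an FP32 inference model named
# "my_model" under the image_recognition use case (names are placeholders)
mkdir -p benchmarks/image_recognition/tensorflow/my_model/inference/fp32

# Each new directory needs an __init__.py so that python can find the code
touch benchmarks/image_recognition/tensorflow/my_model/__init__.py
touch benchmarks/image_recognition/tensorflow/my_model/inference/__init__.py
touch benchmarks/image_recognition/tensorflow/my_model/inference/fp32/__init__.py
```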

2. Next, in the leaf folder that was created in the previous step, you
will need to create `config.json` and `model_init.py` files:

![Add model init](add_model_init_and_config.png)

The `config.json` file contains the best known KMP environment variable
settings to get optimal performance for the model. The default settings
below are recommended for most of the models in the Model Zoo.

```json
{
"optimization_parameters": {
"KMP_AFFINITY": "granularity=fine,verbose,compact,1,0",
"KMP_BLOCKTIME": 1,
"KMP_SETTINGS": 1
}
}
```

The `model_init.py` file is used to initialize the best known configuration for the
model, and then start executing inference or training. When the
[launch script](/docs/general/tensorflow/LaunchBenchmark.md) is run,
it will look for the appropriate `model_init.py` file to use
according to the model name, framework, mode, and precision that are
specified by the user.

The contents of the `model_init.py` file will vary by framework. For
TensorFlow models, we typically use the
[base model init class](/benchmarks/common/base_model_init.py) that
includes functions for doing common tasks such as setting the best
known environment variables (`KMP_BLOCKTIME`, `KMP_SETTINGS`, and
`KMP_AFFINITY`, loaded from `config.json`, along with `OMP_NUM_THREADS`)
and setting the number of intra-op and inter-op threads. The
`model_init.py` file also builds the command string that
will ultimately be used to run inference or model training, which
normally includes the use of `numactl` and sending all of the
appropriate arguments to the model's script. Also, if your model
requires any non-standard arguments (arguments that are not part of
the [launch script flags](/docs/general/tensorflow/LaunchBenchmark.md#launch_benchmarkpy-flags)),
the `model_init.py` file is where you would define and parse those
args.
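
As a rough illustration, a TensorFlow `model_init.py` might look something
like the sketch below. The method names and attributes used here
(`set_kmp_vars`, `set_num_inter_intra_threads`, `run_command`,
`args.model_source_dir`) are assumptions for illustration only; check the
[base model init class](/benchmarks/common/base_model_init.py) for the
actual API.

```python
# Hypothetical model_init.py sketch -- the method names and attributes below
# are assumptions; see benchmarks/common/base_model_init.py for the real API.
import os

from common.base_model_init import BaseModelInitializer


class ModelInitializer(BaseModelInitializer):
    def __init__(self, args, custom_args=[], platform_util=None):
        super(ModelInitializer, self).__init__(args, custom_args, platform_util)

        # Load the best known KMP_* settings from this model's config.json
        config_file = os.path.join(
            os.path.dirname(os.path.realpath(__file__)), "config.json")
        self.set_kmp_vars(config_file)

        # Choose the number of intra-op and inter-op threads
        self.set_num_inter_intra_threads()

        # Build the command that run() will execute (model-specific args
        # parsed from self.custom_args would also be appended here)
        script = os.path.join(self.args.model_source_dir, "eval.py")
        self.cmd = "python " + script + " --batch-size " + \
            str(self.args.batch_size)

    def run(self):
        # Execute the inference (or training) command
        self.run_command(self.cmd)
```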

3. [start.sh](/benchmarks/common/tensorflow/start.sh) is a shell script
that is called by the `launch_benchmark.py` script in the docker
container. This script installs dependencies that are required by
the model, sets up the `PYTHONPATH` environment variable, and then
calls the [run_tf_benchmark.py](/benchmarks/common/tensorflow/run_tf_benchmark.py)
script with the appropriate args. That run script will end up calling
the `model_init.py` file that you have defined in the previous step.

To add support for a new model in the `start.sh` script, you will
need to add a function with the same name as your model. Note that
this function name should match the `<model name>` folder from the
first step, where you set up the directories for your model. In this
function, add commands to install any third-party dependencies within
an `if [ ${NOINSTALL} != "True" ]; then` conditional block. The
purpose of the `NOINSTALL` flag is to be able to skip the installs
for quicker iteration when running on bare metal or debugging. If
your model requires the `PYTHONPATH` environment variable to be set up
to find model code or dependencies, that should be done in the
model's function. Next, set up the command that will be run. The
standard launch script args are already added to the `CMD` variable,
so your model function will only need to add on more args if you have
model-specific args defined in your `model_init.py`. Lastly, call the
`run_model` function with the `PYTHONPATH` and the `CMD` string.

Below is a sample template of a `start.sh` model function that
installs dependencies from a `requirements.txt` file, sets up the
`PYTHONPATH` to find model source files, adds a custom steps flag
to the run command, and then runs the model:
```bash
function <model_name>() {
  if [ ${PRECISION} == "fp32" ]; then
    # Install the model's third-party dependencies, unless installs
    # are being skipped for faster iteration on bare metal
    if [ ${NOINSTALL} != "True" ]; then
      pip install -r ${MOUNT_EXTERNAL_MODELS_SOURCE}/requirements.txt
    fi

    # Allow python to find the model's source files
    export PYTHONPATH=${PYTHONPATH}:${MOUNT_EXTERNAL_MODELS_SOURCE}

    # Append model-specific args to the standard launch script args
    CMD="${CMD} $(add_steps_args)"
    PYTHONPATH=${PYTHONPATH} CMD=${CMD} run_model
  else
    echo "PRECISION=${PRECISION} is not supported for ${MODEL_NAME}"
    exit 1
  fi
}
```

Optional step:
* If there is CPU-optimized model code that has not been upstreamed to
the original repository, then it can be added to the
[models](/models) directory in the zoo repo. As with the first step
in the previous section, the directory structure should be set up like:
`/models/<use case>/<framework>/<model name>/<mode>/<precision>`.

![Models Directory Structure](models_directory_structure.png)

If there are model files that can be shared by multiple modes or
precisions, they can be placed in the higher-level directory. For
example, if a file could be shared by both `FP32` and `Int8`
precisions, then it could be placed in the directory at:
`/models/<use case>/<framework>/<model name>/<mode>` (omitting the
`<precision>` directory). Note that if you do this, you need to
ensure that the license associated with the original model
repository is compatible with the license of the model zoo.
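
For example, a hypothetical layout (the model and file names are
placeholders) where preprocessing code is shared by both precisions:

```bash
# Hypothetical layout -- model and file names are placeholders
models/image_recognition/tensorflow/my_model/inference/preprocessing.py  # shared by fp32 and int8
models/image_recognition/tensorflow/my_model/inference/fp32/eval.py
models/image_recognition/tensorflow/my_model/inference/int8/eval.py
```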

### Debugging

There are a couple of options for debugging and quicker iteration when
developing new scripts:
* Use the `--debug` flag in the `launch_benchmark.py` script, which will
give you a shell into the docker container. See the
[debugging section](/docs/general/tensorflow/LaunchBenchmark.md#debugging)
of the launch script documentation for more information on using this
flag, and see the example command after this list.
* Run the launch script on bare metal (without a docker container). The
launch script documentation also has a
[section](/docs/general/tensorflow/LaunchBenchmark.md#alpha-feature-running-on-bare-metal)
with instructions on how to do this. Note that when running without
docker, you are responsible for installing all dependencies on your
system before running the launch script. If you are using this option
during development, be sure to also test _with_ a docker container to
ensure that the `start.sh` script dependency installation is working
properly for your model.
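
For example, a hypothetical `--debug` run might look like the following
(the model name and docker image are placeholders; see the launch script
documentation for the full list of flags):

```bash
# Hypothetical example -- model name and docker image are placeholders
python launch_benchmark.py \
    --framework tensorflow \
    --model-name my_model \
    --mode inference \
    --precision fp32 \
    --docker-image intelaipg/intel-optimized-tensorflow:latest \
    --debug
```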

### Documentation updates

1. Create a `README.md` file in the
`/benchmarks/<use case>/<framework>/<model name>` directory:

![Add README file](add_readme.png)

This README file should describe all of the steps necessary to run
the model, including downloading and preprocessing the dataset,
downloading the pretrained model, cloning repositories, and running
the model script with the appropriate arguments. Most models
have best known settings for batch and online inference performance
testing, as well as for accuracy testing. The README file should specify
how to set these configs using the `launch_benchmark.py` script.
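
   For example, a README might include hypothetical commands along these
   lines (the model name, paths, and docker image are placeholders):

```bash
# Hypothetical online inference (latency) run, using batch size 1
python launch_benchmark.py --framework tensorflow --model-name my_model \
    --mode inference --precision fp32 --batch-size 1 \
    --in-graph /home/<user>/my_model_fp32.pb \
    --docker-image intelaipg/intel-optimized-tensorflow:latest

# Batch inference would typically use a larger batch size (such as 128),
# and accuracy testing would add the dataset location and --accuracy-only:
#   --batch-size 100 --accuracy-only --data-location /home/<user>/dataset
```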

2. Update the table in the [main `benchmarks` README](/benchmarks/README.md)
with a link to the model that you are adding. Note that the models
in this table are ordered alphabetically by use case, framework, and
model name. The model name should link to the original paper for the
model. The instructions column should link to the README
file that you created in the previous step.

### Testing

1. After you've completed the above steps, run the model according to the
instructions in the README file for the new model. Ensure that the
performance and accuracy metrics are on par with what you would
expect.
2. Add unit tests to cover the new model.
* For TensorFlow models, there is a
[parameterized test](/tests/unit/common/tensorflow/test_run_tf_benchmarks.py#L80)
that checks the flow from `run_tf_benchmark.py` to the
inference command that is executed by the `model_init.py` file. The
test ensures that the inference command has all of the expected
arguments.
To add a new parameterized instance of the test for your
new model, add a new JSON file named `tf_<model_name>_args.json` to the [tf_model_args](/tests/unit/common/tensorflow/tf_model_args)
directory (a sample file is sketched after this list). Each file contains
a list of dictionaries, and each dictionary has three items:
(1) `_comment`, a description of the command;
(2) `input`, the `run_tf_benchmark.py` command with the appropriate
flags to run the model; and (3) `output`, the expected inference or training
command that should get run by the `model_init.py` file.
* If any launch script or base class files were changed, then
additional unit tests should be added.
* Unit tests and style checks are run when you post a GitHub PR, and
the tests must be passing before the PR is merged.
* For information on how to run the unit tests and style checks
locally, see the [tests documentation](/tests/README.md).
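
For instance, a minimal `tf_my_model_args.json` might look like the
following sketch (the model name and commands are placeholders; a real
file uses the full set of flags and the complete expected command):

```json
[
    {
        "_comment": "my_model FP32 inference with batch size 1",
        "input": "run_tf_benchmark.py --framework=tensorflow --model-name=my_model --precision=fp32 --mode=inference --batch-size=1",
        "output": "python eval.py --batch-size=1"
    }
]
```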
4 changes: 2 additions & 2 deletions Jenkinsfile
@@ -16,8 +16,8 @@ node('skx') {
sudo apt-get install -y python3-dev || sudo yum install -y python36-devel.x86_64
# virtualenv 16.3.0 is broken do not use it
-python2 -m pip install --force-reinstall --user --upgrade pip virtualenv!=16.3.0 tox
-python3 -m pip install --force-reinstall --user --upgrade pip virtualenv!=16.3.0 tox
+python2 -m pip install --no-cache-dir --user --upgrade pip==19.0.3 virtualenv!=16.3.0 tox
+python3 -m pip install --no-cache-dir --user --upgrade pip==19.0.3 virtualenv!=16.3.0 tox
"""
}
stage('Style tests') {
6 changes: 5 additions & 1 deletion README.md
@@ -8,7 +8,8 @@ This repository contains **links to pre-trained models, sample scripts, best pra
- Show how to efficiently execute, train, and deploy Intel-optimized models
- Make it easy to get started running Intel-optimized models on Intel hardware in the cloud or on bare metal

-***DISCLAIMER: These scripts are not intended for benchmarking Intel platforms. For any performance and/or benchmarking information on specific Intel platforms, visit [https://www.intel.ai/blog](https://www.intel.ai/blog).***
+***DISCLAIMER: These scripts are not intended for benchmarking Intel platforms.
+For any performance and/or benchmarking information on specific Intel platforms, visit [https://www.intel.ai/blog](https://www.intel.ai/blog).***

## How to Use the Model Zoo

@@ -31,3 +32,6 @@ We hope this structure is intuitive and helps you find what you are looking for;
![Repo Structure](repo_structure.png)

*Note: For model quantization and optimization tools, see [https://github.com/IntelAI/tools](https://github.com/IntelAI/tools)*.

## How to Contribute
If you would like to add a new benchmarking script, please use [this guide](/Contribute.md).
Binary file added add_model_init_and_config.png
Binary file added add_readme.png
23 changes: 18 additions & 5 deletions benchmarks/README.md
@@ -11,31 +11,44 @@ dependencies to be installed:
* [git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)
* `wget` for downloading pre-trained models

-## Use Cases
+## TensorFlow Use Cases

| Use Case | Framework | Model | Mode | Instructions |
| -----------------------| --------------| ------------------- | --------- |------------------------------|
| Adversarial Networks | TensorFlow | [DCGAN](https://arxiv.org/pdf/1511.06434.pdf) | Inference | [FP32](adversarial_networks/tensorflow/dcgan/README.md#fp32-inference-instructions) |
| Content Creation | TensorFlow | [DRAW](https://arxiv.org/pdf/1502.04623.pdf) | Inference | [FP32](content_creation/tensorflow/draw/README.md#fp32-inference-instructions) |
| Face Detection and Alignment | TensorFlow | [FaceNet](https://arxiv.org/pdf/1503.03832.pdf) | Inference | [FP32](face_detection_and_alignment/tensorflow/facenet/README.md#fp32-inference-instructions) |
| Face Detection and Alignment | TensorFlow | [MTCC](https://arxiv.org/pdf/1604.02878.pdf) | Inference | [FP32](face_detection_and_alignment/tensorflow/mtcc/README.md#fp32-inference-instructions) |
| Image Recognition | TensorFlow | [DenseNet169](https://arxiv.org/pdf/1608.06993.pdf) | Inference | [FP32](image_recognition/tensorflow/densenet169/README.md#fp32-inference-instructions) |
| Image Recognition | TensorFlow | [Inception ResNet V2](https://arxiv.org/pdf/1602.07261.pdf) | Inference | [Int8](image_recognition/tensorflow/inception_resnet_v2/README.md#int8-inference-instructions) [FP32](image_recognition/tensorflow/inception_resnet_v2/README.md#fp32-inference-instructions) |
| Image Recognition | TensorFlow | [Inception V3](https://arxiv.org/pdf/1512.00567.pdf) | Inference | [Int8](image_recognition/tensorflow/inceptionv3/README.md#int8-inference-instructions) [FP32](image_recognition/tensorflow/inceptionv3/README.md#fp32-inference-instructions) |
| Image Recognition | TensorFlow | [Inception V4](https://arxiv.org/pdf/1602.07261.pdf) | Inference | [Int8](image_recognition/tensorflow/inceptionv4/README.md#int8-inference-instructions) [FP32](image_recognition/tensorflow/inceptionv4/README.md#fp32-inference-instructions) |
-| Image Recognition | TensorFlow | [MobileNet V1](https://arxiv.org/pdf/1704.04861.pdf) | Inference | [FP32](image_recognition/tensorflow/mobilenet_v1/README.md#fp32-inference-instructions) |
+| Image Recognition | TensorFlow | [MobileNet V1](https://arxiv.org/pdf/1704.04861.pdf) | Inference | [Int8](image_recognition/tensorflow/mobilenet_v1/README.md#int8-inference-instructions) [FP32](image_recognition/tensorflow/mobilenet_v1/README.md#fp32-inference-instructions) |
| Image Recognition | TensorFlow | [ResNet 101](https://arxiv.org/pdf/1512.03385.pdf) | Inference | [Int8](image_recognition/tensorflow/resnet101/README.md#int8-inference-instructions) [FP32](image_recognition/tensorflow/resnet101/README.md#fp32-inference-instructions) |
| Image Recognition | TensorFlow | [ResNet 50](https://arxiv.org/pdf/1512.03385.pdf) | Inference | [Int8](image_recognition/tensorflow/resnet50/README.md#int8-inference-instructions) [FP32](image_recognition/tensorflow/resnet50/README.md#fp32-inference-instructions) |
| Image Recognition | TensorFlow | [ResNet 50v1.5](https://github.com/tensorflow/models/tree/master/official/resnet) | Inference | [Int8](image_recognition/tensorflow/resnet50v1_5/README.md#int8-inference-instructions) [FP32](image_recognition/tensorflow/resnet50v1_5/README.md#fp32-inference-instructions) |
| Image Recognition | TensorFlow | [SqueezeNet](https://arxiv.org/pdf/1602.07360.pdf) | Inference | [FP32](image_recognition/tensorflow/squeezenet/README.md#fp32-inference-instructions) |
| Image Segmentation | TensorFlow | [Mask R-CNN](https://arxiv.org/pdf/1703.06870.pdf) | Inference | [FP32](image_segmentation/tensorflow/maskrcnn/README.md#fp32-inference-instructions) |
| Image Segmentation | TensorFlow | [UNet](https://arxiv.org/pdf/1505.04597.pdf) | Inference | [FP32](image_segmentation/tensorflow/unet/README.md#fp32-inference-instructions) |
| Language Modeling | TensorFlow | [LM-1B](https://arxiv.org/pdf/1602.02410.pdf) | Inference | [FP32](language_modeling/tensorflow/lm-1b/README.md#fp32-inference-instructions) |
| Language Translation | TensorFlow | [GNMT](https://arxiv.org/pdf/1609.08144.pdf) | Inference | [FP32](language_translation/tensorflow/gnmt/README.md#fp32-inference-instructions) |
| Language Translation | TensorFlow | [Transformer Language](https://arxiv.org/pdf/1706.03762.pdf)| Inference | [FP32](language_translation/tensorflow/transformer_language/README.md#fp32-inference-instructions) |
| Language Translation | TensorFlow | [Transformer_LT_Official ](https://arxiv.org/pdf/1706.03762.pdf)| Inference | [FP32](language_translation/tensorflow/transformer_lt_official/README.md#fp32-inference-instructions) |
-| Object Detection | TensorFlow | [R-FCN](https://arxiv.org/pdf/1605.06409.pdf) | Inference | [FP32](object_detection/tensorflow/rfcn/README.md#fp32-inference-instructions) |
+| Object Detection | TensorFlow | [R-FCN](https://arxiv.org/pdf/1605.06409.pdf) | Inference | [Int8](object_detection/tensorflow/rfcn/README.md#int8-inference-instructions) [FP32](object_detection/tensorflow/rfcn/README.md#fp32-inference-instructions) |
| Object Detection | TensorFlow | [Faster R-CNN](https://arxiv.org/pdf/1506.01497.pdf) | Inference | [Int8](object_detection/tensorflow/faster_rcnn/README.md#int8-inference-instructions) [FP32](object_detection/tensorflow/faster_rcnn/README.md#fp32-inference-instructions) |
-| Object Detection | TensorFlow | [SSD-MobileNet](https://arxiv.org/pdf/1704.04861.pdf) | Inference | [FP32](object_detection/tensorflow/ssd-mobilenet/README.md#fp32-inference-instructions) |
-| Object Detection | TensorFlow | [SSD-ResNet34](https://arxiv.org/pdf/1512.02325.pdf) | Inference | [FP32](object_detection/tensorflow/ssd-resnet34/README.md#fp32-inference-instructions) |
+| Object Detection | TensorFlow | [SSD-MobileNet](https://arxiv.org/pdf/1704.04861.pdf) | Inference | [Int8](object_detection/tensorflow/ssd-mobilenet/README.md#int8-inference-instructions) [FP32](object_detection/tensorflow/ssd-mobilenet/README.md#fp32-inference-instructions) |
+| Object Detection | TensorFlow | [SSD-ResNet34](https://arxiv.org/pdf/1512.02325.pdf) | Inference | [Int8](object_detection/tensorflow/ssd-resnet34/README.md#int8-inference-instructions) [FP32](object_detection/tensorflow/ssd-resnet34/README.md#fp32-inference-instructions) |
| Object Detection | TensorFlow | [SSD-VGG16](https://arxiv.org/pdf/1512.02325.pdf) | Inference | [Int8](object_detection/tensorflow/ssd_vgg16/README.md#int8-inference-instructions) [FP32](object_detection/tensorflow/ssd_vgg16/README.md#fp32-inference-instructions) |
| Recommendation | TensorFlow | [NCF](https://arxiv.org/pdf/1708.05031.pdf) | Inference | [FP32](recommendation/tensorflow/ncf/README.md#fp32-inference-instructions) |
| Recommendation | TensorFlow | [Wide & Deep Large Dataset](https://arxiv.org/pdf/1606.07792.pdf) | Inference | [Int8](recommendation/tensorflow/wide_deep_large_ds/README.md#int8-inference-instructions) [FP32](recommendation/tensorflow/wide_deep_large_ds/README.md#fp32-inference-instructions) |
| Recommendation | TensorFlow | [Wide & Deep](https://arxiv.org/pdf/1606.07792.pdf) | Inference | [FP32](recommendation/tensorflow/wide_deep/README.md#fp32-inference-instructions) |
| Text-to-Speech | TensorFlow | [WaveNet](https://arxiv.org/pdf/1609.03499.pdf) | Inference | [FP32](text_to_speech/tensorflow/wavenet/README.md#fp32-inference-instructions) |


## TensorFlow Serving Use Cases


| Use Case | Framework | Model | Mode | Instructions |
| -----------------------| --------------| ------------------- | --------- |------------------------------|
| Image Recognition | TensorFlow Serving | [Inception V3](https://arxiv.org/pdf/1512.00567.pdf) | Inference | [FP32](image_recognition/tensorflow_serving/inceptionv3/README.md#fp32-inference-instructions) |

