ci: Fix shape and reformat free tensor handling in the input byte size check #7444

Merged · 6 commits · Jul 27, 2024
34 changes: 34 additions & 0 deletions docs/user_guide/model_configuration.md
@@ -598,6 +598,40 @@
input1: [4, 4, 6] <== shape of this tensor [3]
Currently, only TensorRT supports shape tensors. Read [Shape Tensor I/O](https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#shape_tensor_io)
to learn more about shape tensors.

## Non-Linear I/O Formats

For models that process input or output data in non-linear formats, the _is_non_linear_format_io_ property
must be set. The following example model configuration shows how to specify that INPUT0 and INPUT1 use non-linear I/O data formats.

```
name: "mytensorrtmodel"
platform: "tensorrt_plan"
max_batch_size: 8
input [
{
name: "INPUT0"
data_type: TYPE_FP16
dims: [ 3,224,224 ]
is_non_linear_format_io: true
},
{
name: "INPUT1"
data_type: TYPE_FP16
dims: [ 3,224,224 ]
is_non_linear_format_io: true
}
]
output [
{
name: "OUTPUT0"
data_type: TYPE_FP16
dims: [ 1,3 ]
}
]
```

Currently, only TensorRT supports this property. To learn more about I/O formats, refer to the [I/O Formats documentation](https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#reformat-free-network-tensors).
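As an illustration (not part of this PR or of Triton's tooling), here is a minimal Python sketch of how one might scan a `config.pbtxt` for tensors that set this flag. The scanner is a naive line-based scan, not a real protobuf text-format parser, and the embedded config is a trimmed version of the example above:

```python
import re

CONFIG = """
name: "mytensorrtmodel"
platform: "tensorrt_plan"
input [
  {
    name: "INPUT0"
    data_type: TYPE_FP16
    is_non_linear_format_io: true
  },
  {
    name: "INPUT1"
    data_type: TYPE_FP16
  }
]
"""

def non_linear_tensors(config_text):
    """Return names of tensors whose block sets is_non_linear_format_io: true.

    Walks the text line by line, remembering the most recent `name:` seen,
    and records it when the flag appears. Good enough for flat config.pbtxt
    blocks like the one above; not a full protobuf parser.
    """
    names = []
    current = None
    for line in config_text.splitlines():
        m = re.search(r'name:\s*"([^"]+)"', line)
        if m:
            current = m.group(1)
        if "is_non_linear_format_io: true" in line and current:
            names.append(current)
    return names

print(non_linear_tensors(CONFIG))  # ['INPUT0']
```

A real tool would parse the config with the compiled `ModelConfig` protobuf instead of regexes; the sketch only shows where the property lives in the file.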

## Version Policy

Each model can have one or more
@@ -0,0 +1,26 @@
max_batch_size: 8
input [
{
name: "INPUT0"
data_type: TYPE_FP32
dims: [ 16 ]
is_non_linear_format_io: true
},
{
name: "INPUT1"
data_type: TYPE_FP32
dims: [ 16 ]
}
]
output [
{
name: "OUTPUT0"
data_type: TYPE_FP32
dims: [ 16 ]
},
{
name: "OUTPUT1"
data_type: TYPE_FP32
dims: [ 16 ]
}
]
@@ -0,0 +1 @@
'INPUT0' uses a linear IO format, but 'is_non_linear_format_io' is incorrectly set to true in the model configuration.
@@ -0,0 +1,26 @@
max_batch_size: 8
input [
{
name: "INPUT0"
data_type: TYPE_FP32
dims: [ 16 ]
},
{
name: "INPUT1"
data_type: TYPE_FP32
dims: [ 16 ]
}
]
output [
{
name: "OUTPUT0"
data_type: TYPE_FP32
dims: [ 16 ]
},
{
name: "OUTPUT1"
data_type: TYPE_FP32
dims: [ 16 ]
is_non_linear_format_io: true
}
]
@@ -0,0 +1 @@
'OUTPUT1' uses a linear IO format, but 'is_non_linear_format_io' is incorrectly set to true in the model configuration.
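Taken together, the two expected-error files encode the rule the new check enforces: a tensor the engine reports as linear must not set `is_non_linear_format_io` in the configuration. A hedged Python sketch of that rule (the function name and signature are hypothetical; the actual check lives in Triton's C++ TensorRT backend):

```python
def check_format_flag(name, engine_is_linear, config_non_linear):
    """Mirror the validation implied by the expected-error files:
    flagging a linear-format tensor as non-linear is a config error.
    Returns the error string, or None if the configuration is consistent."""
    if engine_is_linear and config_non_linear:
        return (f"'{name}' uses a linear IO format, but "
                "'is_non_linear_format_io' is incorrectly set to true "
                "in the model configuration.")
    return None

print(check_format_flag("OUTPUT1", True, True))
```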
@@ -0,0 +1,57 @@
name: "no_config_non_linear_format_io"
platform: "tensorrt_plan"
backend: "tensorrt"
version_policy {
latest {
num_versions: 1
}
}
max_batch_size: 8
input {
name: "INPUT0"
data_type: TYPE_FP32
dims: -1
dims: 2
dims: 1
is_non_linear_format_io: true
}
input {
name: "INPUT1"
data_type: TYPE_FP32
dims: -1
dims: 2
dims: 1
is_non_linear_format_io: true
}
output {
name: "OUTPUT0"
data_type: TYPE_FP32
dims: -1
dims: 2
dims: 1
}
output {
name: "OUTPUT1"
data_type: TYPE_FP32
dims: -1
dims: 2
dims: 1
}
optimization {
input_pinned_memory {
enable: true
}
output_pinned_memory {
enable: true
}
}
dynamic_batching {
preferred_batch_size: 8
}
instance_group {
name: "no_config_non_linear_format_io"
kind: KIND_GPU
count: 1
gpus: 0
}
default_model_filename: "model.plan"
13 changes: 12 additions & 1 deletion qa/L0_model_config/test.sh
@@ -56,10 +56,12 @@
for modelpath in \
autofill_noplatform/tensorrt/bad_input_shape/1 \
autofill_noplatform/tensorrt/bad_input_type/1 \
autofill_noplatform/tensorrt/bad_input_shape_tensor/1 \
autofill_noplatform/tensorrt/bad_input_non_linear_format_io/1 \
autofill_noplatform/tensorrt/bad_output_dims/1 \
autofill_noplatform/tensorrt/bad_output_shape/1 \
autofill_noplatform/tensorrt/bad_output_type/1 \
autofill_noplatform/tensorrt/bad_output_shape_tensor/1 \
autofill_noplatform/tensorrt/bad_output_non_linear_format_io/1 \
autofill_noplatform/tensorrt/too_few_inputs/1 \
autofill_noplatform/tensorrt/too_many_inputs/1 \
autofill_noplatform/tensorrt/unknown_input/1 \
@@ -92,6 +94,14 @@
$modelpath/.
done
$modelpath/.
done

# Copy TensorRT plans with non-linear format IO into the test model repositories.
for modelpath in \
autofill_noplatform_success/tensorrt/no_config_non_linear_format_io/1 ; do
mkdir -p $modelpath
cp /data/inferenceserver/${REPO_VERSION}/qa_trt_format_model_repository/plan_CHW32_LINEAR_float32_float32_float32/1/model.plan \
$modelpath/.
done

# Copy variable-sized TensorRT plans into the test model repositories.
for modelpath in \
autofill_noplatform_success/tensorrt/no_name_platform_variable/1 \
@@ -593,7 +603,8 @@
for TARGET_DIR in `ls -d autofill_noplatform_success/*/*`; do
# that the directory is an entire model repository.
rm -fr models && mkdir models
if [ -f ${TARGET_DIR}/config.pbtxt ] || [ "$TARGET" = "no_config" ] \
|| [ "$TARGET" = "no_config_variable" ] || [ "$TARGET" = "no_config_shape_tensor" ] ; then
|| [ "$TARGET" = "no_config_variable" ] || [ "$TARGET" = "no_config_shape_tensor" ] \
|| [ "$TARGET" = "no_config_non_linear_format_io" ] ; then
cp -r ${TARGET_DIR} models/.
else
cp -r ${TARGET_DIR}/* models/.