awslabs · EthanBunce · Oct 18, 2024
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### **Added**
 - added GitHub as alternate option for code repository support along with AWS CodeCommit for sagemaker-templates-service-catalog module
+- added SageMaker Ground Truth labeling module
 ### **Changed**
 - updated manifests to idf release 1.12.0
 

diff --git a/README.md b/README.md
@@ -50,6 +50,7 @@ End-to-end example use-cases built using modules in this repository.
 | [SageMaker Model Package Promote Pipeline Module](modules/sagemaker/sagemaker-model-package-promote-pipeline/README.md)  | Deploy a Pipeline to promote SageMaker Model Packages in a multi-account setup. The pipeline can be triggered through an EventBridge rule in reaction of a SageMaker Model Package Group state event change (Approved/Rejected). Once the pipeline is triggered, it will promote the latest approved model package, if one is found.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
 | [SageMaker Model Monitoring Module](modules/sagemaker/sagemaker-model-monitoring/README.md)                              | Deploy data quality, model quality, model bias, and model explainability monitoring jobs which run against a SageMaker Endpoint.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
 | [SageMaker Model CICD Module](modules/sagemaker/sagemaker-model-cicd/README.md)                                          | Creates a comprehensive CICD pipeline using AWS CodePipelines to build and deploy a ML model on SageMaker.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
+| [SageMaker Ground Truth Labeling Module](modules/sagemaker/sagemaker-ground-truth-labeling/README.md)                    | Creates a state machine to allow labeling of images and text files uploaded to the upload bucket, using various built-in task types in SageMaker Ground Truth. |
 
 ### Mlflow Modules
 

diff --git a/examples/sagemaker-ground-truth-labeling/README.md b/examples/sagemaker-ground-truth-labeling/README.md
@@ -0,0 +1,40 @@
+# SageMaker Ground Truth labeling examples
+
+### Description
+
+This folder contains examples for each of the built-in task types for the SageMaker Ground Truth labeling module. Each folder contains an example manifest and any templates it needs. Upload the templates to an S3 bucket and update the manifest with their locations.
+
+### Additional workers
+
+For tasks without a verification step (all except `image_bounding_box` and `image_semantic_segmentation`), we recommend increasing the number of human reviewers per object to improve accuracy. This only works if your work team has at least that many reviewers, because the same reviewer cannot review the same item twice. To adjust the number of workers, add the following parameters to your manifest:
+
+```yaml
+  - name: labeling-human-task-config
+    value:
+      NumberOfHumanWorkersPerDataObject: 5
+      TaskAvailabilityLifetimeInSeconds: 21600
+      TaskTimeLimitInSeconds: 300
+```
+
+### Using public workforce
+
+As mentioned in the README, you can use a public workforce for your task (at an additional cost). More information on using a public workforce such as Amazon Mechanical Turk is available [here](https://docs.aws.amazon.com/sagemaker/latest/dg/sms-workforce-management-public.html). Labeling and verification task prices are specified in USD; see [here](https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_PublicWorkforceTaskPrice.html) for allowed values. [This page](https://aws.amazon.com/sagemaker/groundtruth/pricing/) provides suggested pricing based on task type. To use a public workforce, add or adjust the following parameters in your manifest:
+
+```yaml
+  - name: labeling-workteam-arn
+    value: 'arn:aws:sagemaker:<region>:394669845002:workteam/public-crowd/default'
+  - name: labeling-task-price
+    value:
+      AmountInUsd:
+        Dollars: 0
+        Cents: 3
+        TenthFractionsOfACent: 6
+  - name: verification-workteam-arn
+    value: 'arn:aws:sagemaker:<region>:394669845002:workteam/public-crowd/default'
+  - name: verification-task-price
+    value:
+      AmountInUsd:
+        Dollars: 0
+        Cents: 3
+        TenthFractionsOfACent: 6
+```
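
Note on the price fields above: `AmountInUsd` splits the per-task price into dollars, cents, and tenths of a cent, so the example values (0, 3, 6) work out to $0.036 per task. The Python sketch below illustrates that arithmetic. The helper names `price_in_usd` and `estimated_worker_payment` are hypothetical and not part of the module, and the payment estimate assumes each worker assigned to an object is paid the task price once; it ignores any other Ground Truth charges.

```python
from decimal import Decimal


def price_in_usd(amount_in_usd: dict) -> Decimal:
    """Convert an AmountInUsd mapping (as written in the manifest) to dollars."""
    dollars = Decimal(amount_in_usd.get("Dollars", 0))
    cents = Decimal(amount_in_usd.get("Cents", 0))
    tenths = Decimal(amount_in_usd.get("TenthFractionsOfACent", 0))
    # 1 cent = 1/100 of a dollar, 1 tenth of a cent = 1/1000 of a dollar.
    return dollars + cents / Decimal(100) + tenths / Decimal(1000)


def estimated_worker_payment(
    amount_in_usd: dict, objects: int, workers_per_object: int = 1
) -> Decimal:
    """Rough estimate (assumption): task price x objects x workers per object."""
    return price_in_usd(amount_in_usd) * objects * workers_per_object


example_price = {"Dollars": 0, "Cents": 3, "TenthFractionsOfACent": 6}
print(price_in_usd(example_price))  # 0.036
print(estimated_worker_payment(example_price, objects=1000, workers_per_object=5))
```

Using `Decimal` rather than `float` keeps the tenth-of-a-cent component exact, which matters when the per-task price is multiplied across many objects and workers.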
diff --git a/...aker-ground-truth-labeling/image_bounding_box/image_bounding_box_labeling_categories.json b/...aker-ground-truth-labeling/image_bounding_box/image_bounding_box_labeling_categories.json
@@ -0,0 +1 @@
+{"labels": [{"label": "Plane"}, {"label": "Boat"}]}
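
Note: the categories file above is a single JSON object with a `labels` array of `{"label": ...}` entries. If you keep your label names elsewhere, a small script can produce the same layout. This is only a sketch; `build_categories` is a hypothetical helper and not part of the module.

```python
import json


def build_categories(label_names: list[str]) -> str:
    """Serialize label names into the {"labels": [{"label": ...}]} layout."""
    return json.dumps({"labels": [{"label": name} for name in label_names]})


print(build_categories(["Plane", "Boat"]))
# {"labels": [{"label": "Plane"}, {"label": "Boat"}]}
```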
diff --git a/...emaker-ground-truth-labeling/image_bounding_box/image_bounding_box_labeling_template.html b/...emaker-ground-truth-labeling/image_bounding_box/image_bounding_box_labeling_template.html
@@ -0,0 +1,28 @@
+<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
+<crowd-form>
+  <crowd-bounding-box
+    name="boundingBox"
+    src="{{ task.input.taskObject | grant_read_access }}"
+    header="Please draw a box around all planes and boats in the image."
+    labels="{{ task.input.labels | to_json | escape }}"
+  >
+    <full-instructions header="Bounding box instructions">
+      <ol>
+        <li><strong>Inspect</strong> the image</li>
+        <li><strong>Determine</strong> whether any objects matching the specified labels are visible in the picture.</li>
+        <li><strong>Outline</strong> each instance of the specified label in the image using the provided “Box” tool.</li>
+      </ol>
+      <ul>
+        <li>Boxes should fit tightly around each object.</li>
+        <li>Do not include parts of the object that are overlapping or that cannot be seen, even if you think you can interpolate the whole shape.</li>
+        <li>Avoid including shadows.</li>
+        <li>If the target is off screen, draw the box up to the edge of the image.</li>
+      </ul>
+    </full-instructions>
+
+    <short-instructions>
+      <strong>Outline</strong> each instance of the specified label in the image using the provided “Box” tool.
+      <!-- You may wish to include examples of correctly and incorrectly labeled images here -->
+    </short-instructions>
+  </crowd-bounding-box>
+</crowd-form>
diff --git a/...-ground-truth-labeling/image_bounding_box/image_bounding_box_verification_categories.json b/...-ground-truth-labeling/image_bounding_box/image_bounding_box_verification_categories.json
@@ -0,0 +1 @@
+{"labels":[{"label":"Label(s) correct"},{"label":"Incorrect label - missed object"},{"label":"Incorrect label - bounding box not accurate enough"}]}
diff --git a/...-ground-truth-labeling/image_bounding_box/image_bounding_box_verification_template.liquid b/...-ground-truth-labeling/image_bounding_box/image_bounding_box_verification_template.liquid
@@ -0,0 +1,39 @@
+<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
+<crowd-form>
+  <crowd-image-classifier
+    name="annotatedResult"
+    src="{{ task.input.taskObject | grant_read_access }}"
+    header="Review the existing labels on the objects and choose the appropriate option."
+    categories="{{ task.input.labels | to_json | escape }}"
+    overlay="{
+      'boundingBox': {
+        labels: ['Plane','Boat'],
+        value: [
+          {% for box in task.input.manifestLine["label"].annotations %}
+            {% capture class_id %}{{ box.class_id }}{% endcapture %}
+            {% assign label = task.input.manifestLine["label-metadata"].class-map[class_id] %}
+          {
+            label: {{label | to_json}},
+            left: {{box.left}},
+            top: {{box.top}},
+            width: {{box.width}},
+            height: {{box.height}},
+          },
+          {% endfor %}
+        ]
+      }
+    }"
+  >
+    <full-instructions header="Label verification - Bounding box instructions">
+      <ol>
+        <li><strong>Read</strong> the task carefully and inspect the image.</li>
+        <li><strong>Read</strong> the options and review the examples provided to understand more about the labels.</li>
+        <li><strong>Choose</strong> the appropriate label that best suits the image.</li>
+      </ol>
+    </full-instructions>
+    <short-instructions>
+      <strong>Choose</strong> the appropriate label that best suits the image.
+      <!-- You may wish to include examples of correctly and incorrectly labeled images here -->
+    </short-instructions>
+  </crowd-image-classifier>
+</crowd-form>
diff --git a/...s/sagemaker-ground-truth-labeling/image_bounding_box/sagemaker-ground-truth-labeling.yaml b/...s/sagemaker-ground-truth-labeling/image_bounding_box/sagemaker-ground-truth-labeling.yaml
@@ -0,0 +1,34 @@
+name: ground-truth-labeling
+path: modules/sagemaker/sagemaker-ground-truth-labeling
+targetAccount: primary
+parameters:
+  - name: job_name
+    value: 'plane-and-boat-bounding-box'
+  - name: task_type
+    value: 'image_bounding_box'
+
+  - name: labeling-workteam-arn
+    value: 'arn:aws:sagemaker:<region>:<account>:workteam/private-crowd/<workteam_name>'
+  - name: labeling-instructions-template-s3-uri
+    value: 's3://<bucket_name>/image_bounding_box_labeling_template.html'
+  - name: labeling-categories-s3-uri
+    value: 's3://<bucket_name>/image_bounding_box_labeling_categories.json'
+  - name: labeling-task-title
+    value: 'Labeling - Bounding boxes: Draw bounding boxes around all planes and boats in the image'
+  - name: labeling-task-description
+    value: 'Draw bounding boxes around all planes and boats in the image'
+  - name: labeling-task-keywords
+    value: [ 'image', 'object', 'detection' ]
+
+  - name: verification-workteam-arn
+    value: 'arn:aws:sagemaker:<region>:<account>:workteam/private-crowd/<workteam_name>'
+  - name: verification-instructions-template-s3-uri
+    value: 's3://<bucket_name>/image_bounding_box_verification_template.liquid'
+  - name: verification-categories-s3-uri
+    value: 's3://<bucket_name>/image_bounding_box_verification_categories.json'
+  - name: verification-task-title
+    value: 'Label verification - Bounding boxes: Review the existing labels on the objects and choose the appropriate option.'
+  - name: verification-task-description
+    value: 'Verify that all of the planes and boats in the image are correctly labeled'
+  - name: verification-task-keywords
+    value: ['image', 'object', 'detection', 'label verification', 'bounding boxes']
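
Note: the example manifest above leaves `<region>`, `<account>`, `<workteam_name>`, and `<bucket_name>` for you to fill in. One way to catch a placeholder that was missed before deploying is a quick text scan. The sketch below assumes placeholders always look like lowercase words in angle brackets, as they do in these examples; `unresolved_placeholders` is a hypothetical helper, not part of the module.

```python
import re

# Matches tokens such as <region> or <bucket_name> (assumed placeholder style).
PLACEHOLDER = re.compile(r"<[a-z_]+>")


def unresolved_placeholders(manifest_text: str) -> list[str]:
    """Return the distinct placeholder tokens still present, in first-seen order."""
    seen: list[str] = []
    for token in PLACEHOLDER.findall(manifest_text):
        if token not in seen:
            seen.append(token)
    return seen


sample = (
    "value: 'arn:aws:sagemaker:<region>:<account>:workteam/private-crowd/<workteam_name>'\n"
    "value: 's3://<bucket_name>/image_bounding_box_labeling_template.html'\n"
)
print(unresolved_placeholders(sample))
```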
diff --git a/...ruth-labeling/image_multi_label_classification/image_multi_label_labeling_categories.json b/...ruth-labeling/image_multi_label_classification/image_multi_label_labeling_categories.json
@@ -0,0 +1 @@
+{"labels": [{"label": "Plane"}, {"label": "Boat"}]}
diff --git a/...-truth-labeling/image_multi_label_classification/image_multi_label_labeling_template.html b/...-truth-labeling/image_multi_label_classification/image_multi_label_labeling_template.html
@@ -0,0 +1,21 @@
+<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
+<crowd-form>
+  <crowd-image-classifier-multi-select
+    name="crowd-image-classifier-multi-select"
+    src="{{ task.input.taskObject | grant_read_access }}"
+    header="Please select the correct categories for this image"
+    categories="{{ task.input.labels | to_json | escape }}"
+    exclusion-category="{ text: 'None of the above' }"
+  >
+    <full-instructions header="Classification Instructions">
+      <p>If more than one label applies to the image, select multiple labels.</p>
+      <p>If no labels apply, select <b>None of the above</b></p>
+    </full-instructions>
+
+    <short-instructions>
+      <p>Read the task carefully and inspect the image.</p>
+      <p>Choose the appropriate label(s) that best suit the image.</p>
+      <!-- You may wish to include examples of correctly and incorrectly labeled images here -->
+    </short-instructions>
+  </crowd-image-classifier-multi-select>
+</crowd-form>
diff --git a/...ound-truth-labeling/image_multi_label_classification/sagemaker-ground-truth-labeling.yaml b/...ound-truth-labeling/image_multi_label_classification/sagemaker-ground-truth-labeling.yaml
@@ -0,0 +1,21 @@
+name: ground-truth-labeling
+path: modules/sagemaker/sagemaker-ground-truth-labeling
+targetAccount: primary
+parameters:
+  - name: job_name
+    value: 'vehicle-classification'
+  - name: task_type
+    value: 'image_multi_label_classification'
+
+  - name: labeling-workteam-arn
+    value: 'arn:aws:sagemaker:<region>:<account>:workteam/private-crowd/<workteam_name>'
+  - name: labeling-instructions-template-s3-uri
+    value: 's3://<bucket_name>/image_multi_label_labeling_template.html'
+  - name: labeling-categories-s3-uri
+    value: 's3://<bucket_name>/image_multi_label_labeling_categories.json'
+  - name: labeling-task-title
+    value: 'Labeling - Multi-Classification: Classify all images as containing a plane and/or a boat'
+  - name: labeling-task-description
+    value: 'Classify all images as containing a plane and/or a boat, selecting all of the appropriate labels'
+  - name: labeling-task-keywords
+    value: [ 'image', 'object', 'multi classification' ]
diff --git a/...labeling/image_semantic_segmentation/image_semantic_segmentation_labeling_categories.json b/...labeling/image_semantic_segmentation/image_semantic_segmentation_labeling_categories.json
@@ -0,0 +1 @@
+{"labels": [{"label": "Plane"}, {"label": "Boat"}]}
diff --git a/...h-labeling/image_semantic_segmentation/image_semantic_segmentation_labeling_template.html b/...h-labeling/image_semantic_segmentation/image_semantic_segmentation_labeling_template.html
@@ -0,0 +1,22 @@
+<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
+<crowd-form>
+  <crowd-semantic-segmentation
+    name="crowd-semantic-segmentation"
+    src="{{ task.input.taskObject | grant_read_access }}"
+    header="Please fill all planes and boats in the image"
+    labels="{{ task.input.labels | to_json | escape }}"
+  >
+    <full-instructions header="Segmentation Instructions">
+      <ol>
+          <li><strong>Read</strong> the task carefully and inspect the image.</li>
+          <li><strong>Read</strong> the options and review the examples provided to understand more about the labels.</li>
+          <li><strong>Choose</strong> the appropriate label that best suits the image.</li>
+      </ol>
+    </full-instructions>
+
+    <short-instructions>
+      Use the tools to label the requested items in the image
+      <!-- You may wish to include examples of correctly and incorrectly labeled images here -->
+    </short-instructions>
+  </crowd-semantic-segmentation>
+</crowd-form>
diff --git a/...ling/image_semantic_segmentation/image_semantic_segmentation_verification_categories.json b/...ling/image_semantic_segmentation/image_semantic_segmentation_verification_categories.json
@@ -0,0 +1 @@
+{"labels":[{"label":"Label(s) correct"},{"label":"Incorrect label - missed object"},{"label":"Incorrect label - segmentation not accurate enough"}]}
diff --git a/...ling/image_semantic_segmentation/image_semantic_segmentation_verification_template.liquid b/...ling/image_semantic_segmentation/image_semantic_segmentation_verification_template.liquid
@@ -0,0 +1,44 @@
+<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
+<crowd-form>
+  <crowd-image-classifier
+    name="annotatedResult"
+    src="{{ task.input.taskObject | grant_read_access }}"
+    header="Review the existing labels on the objects and choose the appropriate option."
+    categories="{{ task.input.labels | to_json | escape }}"
+    overlay="{
+      'semanticSegmentation': {
+        'labels': [
+          {% for key_value in task.input.manifestLine.label-ref-metadata.internal-color-map %}
+            {% assign item = key_value[1] %}
+            {% if item['class-name'] != 'BACKGROUND' %}
+              '{{ item['class-name'] }}',
+            {% endif %}
+          {% endfor %}
+        ],
+        labelMappings: {
+          {% for key_value in task.input.manifestLine.label-ref-metadata.internal-color-map %}
+            {% assign item = key_value[1] %}
+            {% if item['class-name'] != 'BACKGROUND' %}
+              {{ item['class-name'] }}: {
+                color: '{{ item['hex-color'] }}'
+              },
+            {% endif %}
+          {% endfor %}
+        },
+        src: '{{ task.input.manifestLine['label-ref'] | grant_read_access }}',
+      }
+    }"
+  >
+    <full-instructions header="Label verification instructions">
+      <ol>
+        <li><strong>Read</strong> the task carefully and inspect the image.</li>
+        <li><strong>Read</strong> the options and review the examples provided to understand more about the labels.</li>
+        <li><strong>Choose</strong> the appropriate label that best suits the image.</li>
+      </ol>
+    </full-instructions>
+    <short-instructions>
+      <strong>Choose</strong> the appropriate label that best suits the image.
+      <!-- You may wish to include examples of correctly and incorrectly labeled images here -->
+    </short-instructions>
+  </crowd-image-classifier>
+</crowd-form>
diff --git a/...er-ground-truth-labeling/image_semantic_segmentation/sagemaker-ground-truth-labeling.yaml b/...er-ground-truth-labeling/image_semantic_segmentation/sagemaker-ground-truth-labeling.yaml
@@ -0,0 +1,34 @@
+name: ground-truth-labeling
+path: modules/sagemaker/sagemaker-ground-truth-labeling
+targetAccount: primary
+parameters:
+  - name: job_name
+    value: 'plane-and-boat-sem-seg'
+  - name: task_type
+    value: 'image_semantic_segmentation'
+
+  - name: labeling-workteam-arn
+    value: 'arn:aws:sagemaker:<region>:<account>:workteam/private-crowd/<workteam_name>'
+  - name: labeling-instructions-template-s3-uri
+    value: 's3://<bucket_name>/image_semantic_segmentation_labeling_template.html'
+  - name: labeling-categories-s3-uri
+    value: 's3://<bucket_name>/image_semantic_segmentation_labeling_categories.json'
+  - name: labeling-task-title
+    value: 'Labeling - Semantic segmentation: Fill all planes and boats in the image'
+  - name: labeling-task-description
+    value: 'Fill all planes and boats in the image using the appropriate label'
+  - name: labeling-task-keywords
+    value: [ 'image', 'object', 'detection' ]
+
+  - name: verification-workteam-arn
+    value: 'arn:aws:sagemaker:<region>:<account>:workteam/private-crowd/<workteam_name>'
+  - name: verification-instructions-template-s3-uri
+    value: 's3://<bucket_name>/image_semantic_segmentation_verification_template.liquid'
+  - name: verification-categories-s3-uri
+    value: 's3://<bucket_name>/image_semantic_segmentation_verification_categories.json'
+  - name: verification-task-title
+    value: 'Label verification - Semantic segmentation: Review the existing labels on the objects and choose the appropriate option.'
+  - name: verification-task-description
+    value: 'Verify that all of the planes and boats in the image are correctly labeled'
+  - name: verification-task-keywords
+    value: ['image', 'object', 'detection', 'label verification', 'semantic segmentation']
diff --git a/...th-labeling/image_single_label_classification/image_single_label_labeling_categories.json b/...th-labeling/image_single_label_classification/image_single_label_labeling_categories.json
@@ -0,0 +1 @@
+{"labels": [{"label": "Plane"}, {"label": "Boat"}, {"label": "Neither"}]}
diff --git a/...ruth-labeling/image_single_label_classification/image_single_label_labeling_template.html b/...ruth-labeling/image_single_label_classification/image_single_label_labeling_template.html
@@ -0,0 +1,19 @@
+<script src="https://assets.crowd.aws/crowd-html-elements.js"></script>
+<crowd-form>
+  <crowd-image-classifier
+    name="crowd-image-classifier"
+    src="{{ task.input.taskObject | grant_read_access }}"
+    header="Please select the correct category for this image"
+    categories="{{ task.input.labels | to_json | escape }}"
+  >
+    <full-instructions header="Classification Instructions">
+      <p>Read the task carefully and inspect the image.</p>
+      <p>Choose the appropriate label that best suits the image.</p>
+    </full-instructions>
+
+    <short-instructions>
+      Choose the appropriate label that best suits the image.
+      <!-- You may wish to include examples of correctly and incorrectly labeled images here -->
+    </short-instructions>
+  </crowd-image-classifier>
+</crowd-form>