# openapi_extraction.yml

openapi: 3.0.3
servers:
  - url: https://api.sensible.so/v0
    description: Production server (uses live data)
info:
  title: Extraction
  version: 0.0.0
  license:
    name: Sensible API
    url: https://www.TBD.org/licenses/LICENSE-2.0.html
  description: Extract structured data from documents with the Sensible API.


security:
  - bearerAuth: []

tags:
  - name: Document
    description: "Extract data from documents"
  - name: Portfolio
    description: "Extract data from multiple documents bundled into single PDF files"
  - name: Retrieve extractions
    description: "Retrieve data extracted asynchronously from documents"
  - name: Get Excel from documents
    description: "Convert extracted document data to spreadsheet"


paths:


  /extract/{document_type}:
    post:
      operationId: extract-data-from-a-document
      summary: Extract data from a document (sync)

      description: |
        
        **Note:** Use this endpoint for testing. Use the asynchronous extraction endpoints for production.

        Extract data from a local document synchronously.

        To explore this endpoint, use this interactive API reference, or use one of the following options:

        - For a quick "hello world" response to this endpoint, see the [API quickstart](https://sensible.mintlify.app/.app/integrations/quickstart).
        - For a step-by-step tutorial about calling this endpoint, see [Try synchronous extraction](https://sensible.mintlify.app/.app/api-guides/api-tutorial/api-tutorial-sync).
        - Run this endpoint in the Sensible Postman collection. [Run in Postman](https://god.gw.postman.com/run-collection/16839934-45339059-3fec-4c31-a891-9a12a3e1c22b?action=collection%2Ffork&collection-url=entityId%3D16839934-45339059-3fec-4c31-a891-9a12a3e1c22b%26entityType%3Dcollection%26workspaceId%3Ddbde09dc-b7dd-487d-a68f-20d32b008f90)

        There are two options for posting the document bytes:
          1. (Often preferred) Specify the non-encoded document bytes as the entire request body, and specify the `Content-Type` header, for example, `application/pdf` or `image/jpeg`.
             See the following for supported file formats.
          2. Base64-encode the document bytes, specify them in a `document` field in the body, and specify `application/json` for the `Content-Type` header.

        For a list of supported document file types, see [Supported file types](https://sensible.mintlify.app/.app/senseml-reference/concepts/file-types).
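
        The two options can be sketched in Python as follows. This is a minimal, illustrative sketch rather than an official client: the helper names `build_raw_request` and `build_base64_request` are hypothetical, and the functions only build the requests without sending them.

        ```python
        import base64
        import json
        import urllib.request

        API_BASE = "https://api.sensible.so/v0"


        def build_raw_request(document_type, doc_bytes, content_type, api_key):
            """Option 1: send the non-encoded bytes as the entire request body."""
            return urllib.request.Request(
                f"{API_BASE}/extract/{document_type}",
                data=doc_bytes,
                method="POST",
                headers={
                    "Authorization": f"Bearer {api_key}",
                    "Content-Type": content_type,  # for example, application/pdf
                },
            )


        def build_base64_request(document_type, doc_bytes, api_key):
            """Option 2: Base64-encode the bytes into a JSON `document` field."""
            encoded = base64.b64encode(doc_bytes).decode("ascii")
            return urllib.request.Request(
                f"{API_BASE}/extract/{document_type}",
                data=json.dumps({"document": encoded}).encode("utf-8"),
                method="POST",
                headers={
                    "Authorization": f"Bearer {api_key}",
                    "Content-Type": "application/json",
                },
            )
        ```

        To send either request, pass it to `urllib.request.urlopen` and decode the JSON response.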

      parameters:
        - $ref: '#/components/parameters/document_type'
        - $ref: '#/components/parameters/environment'
        - $ref: '#/components/parameters/document_name'
      requestBody:
        required: true

        content:
          image/jpeg:
            schema:
              type: string
              format: binary
          image/png:
            schema:
              type: string
              format: binary
          image/tiff:
            schema:
              type: string
              format: binary
              description: non-encoded document bytes as the entire request body
          application/pdf:
            schema:
              type: string
              format: binary
              description: non-encoded document bytes as the entire request body
          application/msword:
            schema:
              type: string
              format: binary
              description: non-encoded document bytes as the entire request body
          application/vnd.openxmlformats-officedocument.wordprocessingml.document:
            schema:
              type: string
              format: binary
              description: non-encoded document bytes as the entire request body

          application/json:
            schema:
              $ref: '#/components/schemas/encodedPdf'


      tags:
      - Document
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ExtractionSingleResponse'
          description: |
            The structured data extracted from the document.
        '401':
          $ref: '#/components/responses/401'
        '400':
          $ref: '#/components/responses/400'
        '415':
          $ref: '#/components/responses/415'
        '429':
          $ref: '#/components/responses/429'
        '500':
          $ref: '#/components/responses/500'
  /generate_upload_url/{document_type}:
    post:
      operationId: generate-an-upload-url
      summary: Extract doc at a Sensible URL
      description: |
        Extract data asynchronously from a document with the following steps:
          1. Use this endpoint to generate a Sensible URL.
          2. PUT your document at the `upload_url` returned from the previous step. Sensible extracts data from the document.
          3. To retrieve the extraction, use a webhook, or use the extraction `id` returned in the response to poll the `GET /documents/{id}` endpoint.

        For supported file size and types, see [Supported file types](https://sensible.mintlify.app/.app/senseml-reference/concepts/file-types).

        For example, if your call to `/generate_upload_url` specifies the document's file type with the `content_type` body parameter (recommended), your first two steps are as follows:

        Step 1. Generate the Sensible URL:

        ```curl
        curl --location 'https://api.sensible.so/v0/generate_upload_url/<YOUR_DOCUMENT_TYPE>' \
        --header 'Content-Type: application/json' \
        --header 'Accept: application/json' \
        --header 'Authorization: Bearer REDACTED' \
        --data '{"content_type":"application/pdf"}'
        ```

        Step 2. PUT the document:

        ```curl
        curl --location --request PUT 'https://sensible-so-utility-bucket-dev-us-west-2.s3.us-west-2.amazonaws.com/REDACTED' \
        --header 'Content-Type: application/pdf' \
        --data-binary '@YOUR_PATH_TO_DOCUMENT.pdf'
        ```

        Note that in step 2:
          - You must omit the authorization header.
          - The `Content-Type` header must match the `content_type` body parameter in step 1.
          - The pre-signed `upload_url` doesn't support Base64-encoded documents, so you PUT the document bytes directly to the URL.
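
        The same two steps can be sketched in Python. This is an illustrative sketch only: the helper names are hypothetical, and the functions build the requests without sending them.

        ```python
        import json
        import urllib.request

        API_BASE = "https://api.sensible.so/v0"


        def build_generate_url_request(document_type, content_type, api_key):
            """Step 1: ask Sensible for an upload URL for the given file type."""
            return urllib.request.Request(
                f"{API_BASE}/generate_upload_url/{document_type}",
                data=json.dumps({"content_type": content_type}).encode("utf-8"),
                method="POST",
                headers={
                    "Authorization": f"Bearer {api_key}",
                    "Content-Type": "application/json",
                    "Accept": "application/json",
                },
            )


        def build_upload_request(upload_url, doc_bytes, content_type):
            """Step 2: PUT the raw bytes, with no Authorization header."""
            return urllib.request.Request(
                upload_url,
                data=doc_bytes,
                method="PUT",
                # Must match the content_type sent in step 1.
                headers={"Content-Type": content_type},
            )
        ```

        In a real call, the `upload_url` argument for step 2 comes from the `upload_url` field of the step 1 response.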


        For a step-by-step tutorial on calling this endpoint, see
        [Try asynchronous extraction from a Sensible URL](https://docs.sensible.so/docs/api-tutorial-async-2).

      parameters:
        - $ref: '#/components/parameters/document_type'
        - $ref: '#/components/parameters/environment'
        - $ref: '#/components/parameters/document_name'
      requestBody:
        content:
            application/json:
              schema:
                $ref: '#/components/schemas/GenerateUrlRequest'
      tags:
      - Document
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/UploadResponse'
          description: Returns the upload_url at which to PUT the document for extraction
        '401':
          $ref: '#/components/responses/401'
        '400':
          $ref: '#/components/responses/400'
        '415':
          $ref: '#/components/responses/415'
        '429':
          $ref: '#/components/responses/429'
        '500':
          $ref: '#/components/responses/500'
  /extract_from_url/{document_type}:
    post:
      operationId: provide-a-download-url
      summary: Extract doc at your URL
      description: |
        Extract data asynchronously from a document at the specified `document_url`.

        For supported file size and types, see [Supported file types](https://sensible.mintlify.app/.app/senseml-reference/concepts/file-types).

        Take the following steps:
        1. Run this endpoint.
        2. To retrieve the extraction, use a webhook, or use the extraction `id` returned in the response to poll the `GET /documents/{id}` endpoint.

        For a step-by-step tutorial on calling this endpoint,
        see [Try asynchronous extraction from your URL](https://sensible.mintlify.app/.app/api-guides/api-tutorial/api-tutorial-async-1).
      parameters:
        - $ref: '#/components/parameters/document_type'
        - $ref: '#/components/parameters/environment'
        - $ref: '#/components/parameters/document_name'
      requestBody:
        content:
            application/json:
              schema:
                $ref: '#/components/schemas/ExtractFromUrlRequest'
      tags:
      - Document
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ExtractFromUrlResponse'
          description: Returns the ID to use to retrieve the extraction
        '401':
          $ref: '#/components/responses/401'
        '400':
          $ref: '#/components/responses/400'
        '415':
          $ref: '#/components/responses/415'
        '429':
          $ref: '#/components/responses/429'
        '500':
          $ref: '#/components/responses/500'


  /extract/{document_type}/{config_name}:
    post:
      operationId: extract-data-from-a-document-with-config
      summary: Extract data from a document using specified config

      description: |
        This endpoint's behavior is identical to the [Extract data from a document](ref:extract-data-from-a-document) endpoint's behavior, except that Sensible uses the specified config to extract data from the document instead of automatically choosing the best-scoring extraction in the document type.

      parameters:
        - $ref: '#/components/parameters/document_type'
        - $ref: '#/components/parameters/config_name'
        - $ref: '#/components/parameters/environment'
        - $ref: '#/components/parameters/document_name'
      requestBody:
        required: true

        content:
          image/jpeg:
            schema:
              type: string
              format: binary
          image/png:
            schema:
              type: string
              format: binary
          image/tiff:
            schema:
              type: string
              format: binary
              description: non-encoded document bytes as the entire request body
          application/pdf:
            schema:
              type: string
              format: binary
              description: non-encoded document bytes as the entire request body
          application/msword:
            schema:
              type: string
              format: binary
              description: non-encoded document bytes as the entire request body
          application/vnd.openxmlformats-officedocument.wordprocessingml.document:
            schema:
              type: string
              format: binary
              description: non-encoded document bytes as the entire request body

          application/json:
            schema:
              $ref: '#/components/schemas/encodedPdf'


      tags:
      - Document
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ExtractionSingleResponse'
          description: |
            The structured data extracted from the document.
        '401':
          $ref: '#/components/responses/401'
        '400':
          $ref: '#/components/responses/400'
        '415':
          $ref: '#/components/responses/415'
        '429':
          $ref: '#/components/responses/429'
        '500':
          $ref: '#/components/responses/500'
  /generate_upload_url/{document_type}/{config_name}:
    post:
      operationId: generate-an-upload-url-with-config
      summary: Extract doc at a Sensible URL using specified config
      description: |
        This endpoint's behavior is identical to the [Extract doc at a Sensible URL](ref:generate-an-upload-url) endpoint's behavior, except that Sensible uses the specified config to extract data from the document instead of automatically choosing the best-scoring extraction in the document type.
      parameters:
        - $ref: '#/components/parameters/document_type'
        - $ref: '#/components/parameters/environment'
        - $ref: '#/components/parameters/document_name'
        - $ref: '#/components/parameters/config_name'
      requestBody:
        content:
            application/json:
              schema:
                $ref: '#/components/schemas/GenerateUrlRequest'
      tags:
      - Document
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/UploadResponse'
          description: Returns the upload_url at which to PUT the document for extraction
        '401':
          $ref: '#/components/responses/401'
        '400':
          $ref: '#/components/responses/400'
        '415':
          $ref: '#/components/responses/415'
        '429':
          $ref: '#/components/responses/429'
        '500':
          $ref: '#/components/responses/500'
  /extract_from_url/{document_type}/{config_name}:
    post:
      operationId: provide-a-download-url-with-config
      summary: Extract doc at your URL using specified config
      description: |
        This endpoint's behavior is identical to the [Extract doc at your URL](ref:provide-a-download-url) endpoint's behavior, except that Sensible uses the specified config to extract data from the document instead of automatically choosing the best-scoring extraction in the document type.
      parameters:
        - $ref: '#/components/parameters/document_type'
        - $ref: '#/components/parameters/environment'
        - $ref: '#/components/parameters/document_name'
        - $ref: '#/components/parameters/config_name'
      requestBody:
        content:
            application/json:
              schema:
                $ref: '#/components/schemas/ExtractFromUrlRequest'
      tags:
      - Document
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ExtractFromUrlResponse'
          description: Returns the ID to use to retrieve the extraction
        '401':
          $ref: '#/components/responses/401'
        '400':
          $ref: '#/components/responses/400'
        '415':
          $ref: '#/components/responses/415'
        '429':
          $ref: '#/components/responses/429'
        '500':
          $ref: '#/components/responses/500'


  /generate_upload_url:
    post:
      operationId: generate-an-upload-url-for-a-pdf-portfolio
      summary: Extract portfolio at a Sensible URL

      description:  |
        Use this endpoint with multiple documents that are packaged into one file (a "portfolio"). For a list of supported file types, see [Supported file types](https://sensible.mintlify.app/.app/senseml-reference/concepts/file-types).

        This endpoint segments a portfolio file into the specified document types (for example, 1099, w2, and bank_statement) and then runs extractions
        asynchronously for each document Sensible finds in the portfolio. Take the following steps:

        1. Use this endpoint to generate a Sensible URL.
        2. PUT the document you want to extract data from at the `upload_url` returned in this endpoint's response. For more information about how to PUT the document, see the [generate_upload_url/{document_type}](ref:generate-an-upload-url) endpoint.
        3. To retrieve the extraction, use a webhook, or use the extraction `id` returned in the response to poll the `GET /documents/{id}` endpoint.

        For more about extracting from portfolios, see [Multi-document extractions](https://sensible.mintlify.app/.app/layout-based-extractions/portfolio).

      parameters:
        - $ref: '#/components/parameters/environment'
        - $ref: '#/components/parameters/document_name'

      requestBody:
        content:
            application/json:
              schema:
                type: object
                properties:
                  webhook:
                    $ref: '#/components/schemas/Webhook'
                  types:
                    $ref: '#/components/schemas/DocumentTypeNames'
                required:
                  - types
      tags:
      - Portfolio
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/UploadPortfolioResponse'
          description: Returns the upload_url at which to PUT the document for extraction
        '401':
          $ref: '#/components/responses/401'
        '400':
          $ref: '#/components/responses/400'
        '415':
          $ref: '#/components/responses/415'
        '429':
          $ref: '#/components/responses/429'
        '500':
          $ref: '#/components/responses/500'

  /extract_from_url:
    post:
      operationId: provide-a-download-url-for-a-pdf-portfolio
      summary: Extract portfolio at your URL
      description:  |

        Use this endpoint with multiple documents that are packaged into one file (a "portfolio"). For a list of supported file types, see [Supported file types](https://sensible.mintlify.app/.app/senseml-reference/concepts/file-types).

        This endpoint segments a portfolio file at the specified `document_url` into the specified document types (for example, 1099, w2, and bank_statement)
        and then runs extractions asynchronously for each document Sensible finds in the portfolio. Take the following steps:

        1. Run this endpoint.
        2. To retrieve the extraction, use a webhook, or use the extraction `id` returned in the response to poll the `GET /documents/{id}` endpoint.

        For more about extracting from portfolios, see [Multi-document extractions](https://sensible.mintlify.app/.app/layout-based-extractions/portfolio).
      parameters:
        - $ref: '#/components/parameters/environment'
        - $ref: '#/components/parameters/document_name'
      requestBody:
        content:
            application/json:
              schema:
                type: object
                properties:
                  document_url:
                    $ref: '#/components/schemas/DocumentUrl'
                  types:
                    $ref: '#/components/schemas/DocumentTypeNames'
                  webhook:
                    $ref: '#/components/schemas/Webhook'
                required:
                  - types
                  - document_url
      tags:
      - Portfolio
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ExtractFromUrlPortfolioResponse'
          description: Returns the ID to use to retrieve the extraction.
        '401':
          $ref: '#/components/responses/401'
        '400':
          $ref: '#/components/responses/400'
        '415':
          $ref: '#/components/responses/415'
        '429':
          $ref: '#/components/responses/429'
        '500':
          $ref: '#/components/responses/500'


  /documents/{id}:
    get:
      operationId: retrieving-results
      summary: Retrieve extraction by ID
      description: |
        Use this endpoint in conjunction with asynchronous extraction requests to retrieve your results.
        You can also use this endpoint to retrieve the results for document extractions from the synchronous `/extract` endpoint.
        To poll extraction status, check the `status` field in this endpoint's response.
        When the extraction completes, the returned status is `COMPLETE` and the response includes results in the
        `parsed_document` field.  For fields in the extraction for which Sensible couldn't find a value, Sensible returns null.
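
        A polling loop might look like the following Python sketch. It is illustrative only: the helper names, the 5-second interval, and the attempt cap are arbitrary assumptions, and the loop doesn't handle any status other than `COMPLETE`.

        ```python
        import json
        import time
        import urllib.request

        API_BASE = "https://api.sensible.so/v0"


        def fetch_extraction(extraction_id, api_key):
            """GET /documents/{id} and return the decoded JSON response."""
            req = urllib.request.Request(
                f"{API_BASE}/documents/{extraction_id}",
                headers={"Authorization": f"Bearer {api_key}"},
            )
            with urllib.request.urlopen(req) as resp:
                return json.load(resp)


        def poll_until_complete(extraction_id, api_key, fetch=fetch_extraction,
                                interval_s=5.0, max_attempts=60):
            """Poll until `status` is COMPLETE, then return `parsed_document`."""
            for _ in range(max_attempts):
                extraction = fetch(extraction_id, api_key)
                if extraction.get("status") == "COMPLETE":
                    return extraction["parsed_document"]
                time.sleep(interval_s)  # wait before the next poll
            raise TimeoutError(f"extraction {extraction_id} did not complete")
        ```

        The `fetch` argument is injectable so that you can exercise the loop without network access. For production workloads, a webhook avoids polling altogether.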
      parameters:
        - $ref: '#/components/parameters/id'
      tags:
      - Retrieve extractions
      responses:
        '200':
          content:
            application/json:
              schema:
                oneOf:
                 - $ref: '#/components/schemas/ExtractionSingleRetrievalResponse'
                 - $ref: '#/components/schemas/ExtractionPortfolioRetrievalResponse'
          description: Returns the extraction.
        '401':
          $ref: '#/components/responses/401'
        '400':
          $ref: '#/components/responses/400'
        '415':
          $ref: '#/components/responses/415'
        '500':
          $ref: '#/components/responses/500'


  /extractions:
    get:
      operationId: list-extractions
      summary: List extractions
      tags:
      - Retrieve extractions
      description: |
        Use this endpoint to get a filtered list of past extractions.
        This endpoint returns a summary for each extraction, listed in reverse chronological order.
        To get details about an extraction, use the [Retrieve extraction by ID](ref:retrieving-results) endpoint.
        This endpoint uses keyset pagination to retrieve the next page of results.
        By default, it returns a first page of 20 extractions and an opaque `continuation_token` that you can pass in the next request to get the next page of results, until the endpoint returns a null `continuation_token` to indicate the last page.
        Use the `limit` parameter to configure page size.
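
        The pagination loop can be sketched in Python as follows. This sketch makes two assumptions that this description doesn't confirm: that the response lists extractions under an `extractions` field, and that a missing or empty `continuation_token` marks the last page. The helper names are hypothetical.

        ```python
        import json
        import urllib.parse
        import urllib.request

        API_BASE = "https://api.sensible.so/v0"


        def fetch_page(api_key, params):
            """GET /extractions with the given query parameters."""
            url = f"{API_BASE}/extractions?{urllib.parse.urlencode(params)}"
            req = urllib.request.Request(
                url, headers={"Authorization": f"Bearer {api_key}"}
            )
            with urllib.request.urlopen(req) as resp:
                return json.load(resp)


        def iter_extractions(api_key, limit=20, fetch=fetch_page):
            """Yield extraction summaries page by page until no token remains."""
            params = {"limit": limit}
            while True:
                page = fetch(api_key, dict(params))
                # Assumption: summaries are listed under "extractions".
                yield from page.get("extractions", [])
                token = page.get("continuation_token")
                if not token:  # assumption: no token means last page
                    return
                params["continuation_token"] = token
        ```

        Because the function is a generator, callers can stop iterating early without requesting the remaining pages.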


      parameters:
        - $ref: '#/components/parameters/start_date'
        - $ref: '#/components/parameters/end_date'
        - $ref: '#/components/parameters/page_limit'
        - $ref: '#/components/parameters/continuation_token'
        - $ref: '#/components/parameters/configuration_ids'
        - $ref: '#/components/parameters/document_type_ids'
        - $ref: '#/components/parameters/environments'
        - $ref: '#/components/parameters/statuses'
        - $ref: '#/components/parameters/min_coverage'
        - $ref: '#/components/parameters/max_coverage'

      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/ExtractionsResponseFiltered'
          description: Returns list of summarized extractions.
        '401':
          $ref: '#/components/responses/401'
        '400':
          $ref: '#/components/responses/400'
        '415':
          $ref: '#/components/responses/415'
        '500':
          $ref: '#/components/responses/500'

  /extractions/statistics:
    get:
      operationId: statistics
      summary: Get extraction statistics
      tags:
      - Retrieve extractions
      description: |
        Returns daily extraction coverage statistics per config. Sensible returns coverage for each config that was used for at least one extraction performed in the production environment in the specified time period. For more information about coverage, see [Monitoring extractions](metrics).
      parameters:
        - $ref: '#/components/parameters/start_date_config'
        - $ref: '#/components/parameters/end_date_config'
      responses:
        '200':
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/StatisticsResponse'
          description: Returns daily statistics for configs in the specified time period.
        '401':
          $ref: '#/components/responses/401'
        '400':
          $ref: '#/components/responses/400'
        '415':
          $ref: '#/components/responses/415'
        '500':
          $ref: '#/components/responses/500'


  /generate_excel/{ids}:
    get:
      operationId: get-excel-extraction
      summary: Get Excel extraction
      description: |
        You can use this endpoint to get Excel files from documents, for example, from PDFs. In more detail, this endpoint converts your JSON document extraction to an Excel spreadsheet.
        To compile multiple documents into one Excel file, specify the IDs of their recent extractions in the request separated by commas, for example,
        `/generate_excel/867514cc-fce7-40eb-8e9d-e6ec48cdac34,5093c65f-05bd-46a3-8df7-da3ed00f6d35`.
        For the best compiled spreadsheet results, configure your SenseML so that the documents output identically named fields.
        For more information about the conversion process, see [SenseML to spreadsheet reference](https://sensible.mintlify.app/.app/integrations/quick-extraction/excel-reference).

        For portfolio extractions, Sensible returns an Excel file containing fields for all the documents it finds in the PDF. For more information, see [Multi-document spreadsheet](https://sensible.mintlify.app/.app/integrations/quick-extraction/excel-reference#multi-document-spreadsheet).

        For a list of document file types that Sensible can extract data from, see [Supported file types](https://sensible.mintlify.app/.app/senseml-reference/concepts/file-types).
        Call this endpoint after an extraction completes. For more information about checking extraction status,
        see the `GET /documents/{id}` endpoint.
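
        As an illustration, the following Python sketch joins extraction IDs into the request path and reads the `url` field from the response. The helper names are hypothetical, and the sketch assumes that the extractions have already completed.

        ```python
        import json
        import urllib.request

        API_BASE = "https://api.sensible.so/v0"


        def excel_endpoint(extraction_ids):
            """Join extraction IDs with commas into the /generate_excel/{ids} URL."""
            if not extraction_ids:
                raise ValueError("at least one extraction ID is required")
            return f"{API_BASE}/generate_excel/{','.join(extraction_ids)}"


        def get_excel_download_url(extraction_ids, api_key):
            """Call the endpoint and return the download URL from the response."""
            req = urllib.request.Request(
                excel_endpoint(extraction_ids),
                headers={"Authorization": f"Bearer {api_key}"},
            )
            with urllib.request.urlopen(req) as resp:
                return json.load(resp)["url"]
        ```

        Download the file promptly, because the returned link expires after 15 minutes.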
      parameters:
        - $ref: '#/components/parameters/ids'
      tags:
      - Get Excel from documents
      responses:
        '200':
          description: |
            Indicates the extraction successfully converted to an Excel file. This response contains the download URL for the Excel file. The link
            expires after 15 minutes.
          content:
            application/json:
              schema:
                properties:
                  url:
                    type: string
                    format: url
                    description: The download URL for the Excel file
                    example: https://sensible-so-document-type-bucket-dev-us-west-2.s3.us-west-2.amazonaws.com/sensible/fc3484c5-3f35-4129-bb29-0ad1291ee9f8/EXTRACTION/14d82783-c12b-4e70-b0ae-ca1ce35a9836.xlsx?REDACTED

        '401':
          $ref: '#/components/responses/401'
        '400':
          $ref: '#/components/responses/400'
        '415':
          $ref: '#/components/responses/415'
        '500':
          $ref: '#/components/responses/500'


  /generate_csv/{ids}:
    get:
      operationId: get-csv-extraction
      summary: Get CSV extraction
      description: |
        You can use this endpoint to get CSV files from documents, for example, from PDFs. In more detail, this endpoint converts your JSON document extraction to a comma-separated values (CSV) file.
        To compile multiple documents into one CSV file, specify the IDs of their recent extractions in the request separated by commas, for example,
        `/generate_csv/867514cc-fce7-40eb-8e9d-e6ec48cdac34,5093c65f-05bd-46a3-8df7-da3ed00f6d35`.
        For the best compiled spreadsheet results, configure your SenseML so that the documents output identically named fields.
        For more information about the conversion process, see [SenseML to spreadsheet reference](https://sensible.mintlify.app/.app/integrations/quick-extraction/excel-reference).
        For a list of document file types that Sensible can extract data from, see [Supported file types](https://sensible.mintlify.app/.app/senseml-reference/concepts/file-types).
        Call this endpoint after an extraction completes. For more information about checking extraction status,
        see the `GET /documents/{id}` endpoint.
      parameters:
        - $ref: '#/components/parameters/ids'
      tags:
      - Get Excel from documents
      responses:
        '200':
          description: |
            Indicates the extraction successfully converted to a CSV file. This response contains the download URL for the CSV file. The link
            expires after 15 minutes.
          content:
            application/json:
              schema:
                properties:
                  url:
                    type: string
                    format: url
                    description: The download URL for the CSV file
                    example: https://sensible-so-document-type-bucket-dev-us-west-2.s3.us-west-2.amazonaws.com/sensible/fc3484c5-3f35-4129-bb29-0ad1291ee9f8/EXTRACTION/14d82783-c12b-4e70-b0ae-ca1ce35a9836.csv?REDACTED

        '401':
          $ref: '#/components/responses/401'
        '400':
          $ref: '#/components/responses/400'
        '415':
          $ref: '#/components/responses/415'
        '500':
          $ref: '#/components/responses/500'


components:

  responses:

    '401':
      description: Not authorized
      content:
        text/plain:
          schema:
            title: Unauthorized
            type: string
            example: Unauthorized
    '400':
      description: Bad Request
      content:
        text/plain:
          schema:
            title: Bad Request
            type: string
            example: >-
              Either a specific set of messages about fields in the request, or error messages like the following examples -
              Not available to logged in users
              To use the asynchronous flow you must have persistence enabled
              Specified document type does not exist
              Specified document type ${named type} does not exist
              No published configurations found for environment ${environment}
              Specified golden does not exist
              Specified configuration/version does not exist
              Specified configuration/version is not valid
              Must provide the Content-Type header when request body is present
              Content-Type must be application/json
              Missing request body or body.document
              Could not determine the content type of the document
              Could not determine the content type of the document. Please check that the document was correctly encoded as Base64
              This PDF is invalid. If you submitted this PDF using Base64 encoding, please check that the encoding is correct
              This PDF is password protected. Please resubmit with password protection disabled
              This PDF is empty
              This PDF exceeds the maximum dimensions for OCR of 17 x 17 inches
              This PDF exceeds the maximum size for OCR of 50MB
              No fingerprints match for this PDF and fingerprint_mode is set to strict
              Content type of ${found} does not match declared type of ${expected}
              Document is not present
              The start date must be before the end date

    '415':
      description: Unsupported Media Type
      content:
        text/plain:
          schema:
            title: Unsupported Media Type
            type: string
            example: >-
              One of the following error messages -
              Content-Type must be application/json
              Content-Type must be application/json or application/pdf or image/jpeg or image/png or image/tiff
    '429':
      description: Too Many Requests
      content:
        text/plain:
          schema:
            title: Too Many Requests
            type: string
            example: >-
              One of the following error messages -
              Attempt limit exceeded, please retry after some time.
              Free accounts are limited to 150 API calls per month. Please upgrade your account to make additional calls.
              Pro accounts are limited to 5,000 API calls per month. Please upgrade your account to make additional calls.
    '500':
      description: Internal Server Error
      content:
        text/plain:
          schema:
            title: Sensible encountered an unknown error
            type: string
            example: Sensible encountered an unknown error
  parameters:
    id:
      name: id
      required: true
      in: path
      description: Unique ID for the extraction, used to retrieve the extraction.
      schema:
        $ref: '#/components/schemas/ExtractionId'
    ids:
      name: ids
      required: true
      in: path
      description: Comma-delimited list of unique extraction IDs.
      schema:
        $ref: '#/components/schemas/ExtractionId'


    document_type:
      name: document_type
      required: true
      in: path
      description: |
        Type of document to extract from. Create your custom type in the Sensible app (for example, `rate_confirmation`, `certificate_of_insurance`, or `home_inspection_report`).
        To quickly test this endpoint using the `Try It` button in this interactive explorer, use the `senseml_basics` tutorial document type with this [example document](https://github.com/sensible-hq/sensible-docs/raw/main/readme-sync/assets/v0/pdfs/1_extract_your_first_data.pdf).
        As a convenience, Sensible automatically detects the best-fit extraction from among the extraction queries ("configs") in the document type.
        For example, if you create an `auto_insurance_quotes` document type, you can add `carrier 1`, `carrier 2`, and `carrier 3` configs
        to the document type in the Sensible app. Then, you can extract data from all these carriers using the same document type, without specifying the carrier in the API request.
      schema:
        type: string
      example: senseml_basics

    config_name:
      name: config_name
      required: true
      in: path
      description: >-
        User-friendly name of the config to use to extract data from the document.
      schema:
        type: string
      example: anyco_insurance_auto_declarations

    document_name:
      name: document_name
      in: query
      description: >-
        If you specify the filename of the document using this parameter, then Sensible returns the filename in the extraction response.
      schema:
        type: string
      example: test.pdf

    environment:
      name: environment
      in: query
      description: >-
        If you specify `development`, Sensible extracts preferentially using config versions
        published to the development environment in the Sensible app. The extraction runs all configs in the doc type before
        picking the best fit. For each config, Sensible falls back to the production version if no development version of the config exists.
      schema:
        type: string
        enum: [production, development]
        default: production


    start_date:
      name: start_date
      in: query
      required: false
      description: >-
         Retrieves extractions with a `created` date that is equal to or later than this date-time.
         The default is the Unix epoch.
      schema:
        type: string
        format: date-time
        default: 1970-01-01T00:00:00Z
      example: 2020-10-10T00:00:00.000Z


    end_date:
      name: end_date
      in: query
      required: false
      description: >-
         Retrieves extractions with a `created` date that is equal to or earlier than this date-time.
         The default is the current date-time.
      schema:
        type: string
        format: date-time
      example: 2024-01-20T00:00:00.000Z


    page_limit:
      name: limit
      in: query
      required: false
      description: >-
         Use the limit to define the number of items you receive on each page of the paginated response.
         The default is 20.
      schema:
        type: number
        default: 20
      example: 100

    continuation_token:
      name: continuation_token
      in: query
      required: false
      description: >-
         Get the next page of results by making a new request and passing the opaque `continuation_token` parameter
         that Sensible returns in the current page of responses.
         Sensible returns a null `continuation_token` in the response to indicate the last page.
      schema:
        type: string
      example: eyJpZCI6IjRiNTg1Mjc4LWUwOWMtNGJiOS04ODJiLThmYjFhZTA3ZGU3ZiIsInVzZXIiOiJjMDI0Y2QxYy01ZMMzLTRhODItYjJlYS0yYzgwN2U0NDk4OGIiLCJjcmVhdGVkIjoiMjAyNC0wNS0wMVQyMjo11Do1NS43MzMaIn1


    start_date_config:
      name: start_date
      in: query
      required: true
      description: >-
         Retrieves statistics for configs used in production on this day and later.
         Sensible returns daily statistics, so if you specify a time in addition to a date, Sensible ignores the time.
      schema:
        type: string
        format: date-time
      example: 2020-10-10T00:00:00.000Z


    end_date_config:
      name: end_date
      in: query
      required: true
      description: >-
        Retrieves daily statistics for configs used in production on this day and earlier.
      schema:
        type: string
        format: date-time
      example: 2020-10-20T00:00:00.000Z


    document_type_ids:
      name: document_type_ids
      in: query
      description: >-
        Comma-delimited list of document types by which to filter the retrieved extractions.
      schema:
        type: string
      example: 4e95e3d0-8d69-49b0-9501-2cca8b902a45, 24d82783-b12b-4e70-b0ae-ca1ce35a9836


    configuration_ids:
      name: configuration_ids
      in: query
      description: >-
        Comma-delimited list of configurations by which to filter the retrieved extractions.
      schema:
        type: string
      example: 1417523c-f318-4037-90e9-ed7ade06031d,23be500b-4b7f-43dd-b0db-f06ec5c6c8de


    statuses:
      name: statuses
      in: query
      description: >-
         Comma-delimited list of statuses (WAITING, PROCESSING, FAILED, COMPLETE) by which to filter the retrieved extractions.
      schema:
        type: string
      example: COMPLETE, WAITING

    min_coverage:
      name: min_coverage
      in: query
      description: >-
         Minimum extraction coverage score by which to filter the retrieved extractions. For more information about scoring, see [Monitoring extractions](https://sensible.mintlify.app/.app/best-practices/metrics).
      schema:
        type: number
      example: 0.8

    max_coverage:
      name: max_coverage
      in: query
      description: >-
         Maximum extraction coverage score by which to filter the retrieved extractions. For more information about scoring, see [Monitoring extractions](https://sensible.mintlify.app/.app/best-practices/metrics).
      schema:
        type: number
      example: 1.0


    environments:
      name: environments
      in: query
      description: >-
         Comma-delimited list of environments (PRODUCTION, DEVELOPMENT) by which to filter the retrieved extractions.
      schema:
        type: string
      example: PRODUCTION,DEVELOPMENT


  securitySchemes:
    bearerAuth:
      type: http
      scheme: bearer
      description: >-
        Sensible uses API keys to authenticate requests.
        Keep your API keys secure and do not share them in publicly accessible areas such as GitHub, client-side code, etc.
        Authentication to the API is performed via Bearer Authentication. Provide your API key as the bearer auth value.

  schemas:

    Coverage:
      type: number
      description: The coverage score measures how fully an extraction captured all your target data in the document. It's a percentage comparing non-null, [validated](https://sensible.mintlify.app/.app/best-practices/validate-extractions) fields to total fields returned by a config for a document. For example, a coverage score of 70% for an extraction with no validation errors means that 30% of fields were null. For more information about scoring, see [Monitoring extractions](https://sensible.mintlify.app/.app/best-practices/metrics).
      example: 0.75
    Environment:
      type: string
      description: >-
        The environment in which the version of the config used to run the extraction was published.
      example: DEVELOPMENT


    ExtractionSingleResponse:
      allOf:
        - $ref: '#/components/schemas/ExtractionSingleBase'
        - $ref: '#/components/schemas/ExtractionContent'
        - type: object
          properties:
            completed:
              $ref: '#/components/schemas/ExtractionCompleted'
            classification_summary:
              $ref: '#/components/schemas/ClassificationSummary'
            page_count:
              type: integer
              example: 100
              description: Total number of pages in the document.
            environment:
              $ref: '#/components/schemas/EnvironmentName'
            document_name:
              $ref: '#/components/schemas/DocName'
            coverage:
              $ref: '#/components/schemas/Coverage'

    DocName:
      type: string
      description: >-
        If you specify the filename of the document using the `document_name` parameter, then
        Sensible displays the name in extraction history in the Sensible app
        and returns the name in the extraction response.
      example: example.pdf

    StatisticsResponse:
      type: object
      properties:
        statistics:
          type: array
          items:
            $ref: '#/components/schemas/ConfigStats'

    ConfigStats:
      type: object
      properties:
        date:
          type: string
          format: date
          description: The day for which Sensible gets statistics for this config.
        configuration_id:
          $ref: '#/components/schemas/ConfigurationId'
        configuration_name:
          $ref: '#/components/schemas/ConfigurationName'
        document_type_id:
          $ref: '#/components/schemas/DocumentTypeId'
        document_type_name:
          $ref: '#/components/schemas/DocumentTypeName'
        coverage_histogram:
          description: |
            Array of numbers that describe the number of extractions that fell into each coverage bucket for the `date` for this config.
            The buckets are as follows:

            - [0, 10)
            - [10, 20)
            - [20, 30)
            - [30, 40)
            - [40, 50)
            - [50, 60)
            - [60, 70)
            - [70, 80)
            - [80, 90)
            - [90, 95)
            - [95, 100)
            - [100]

            `[` denotes inclusive and `)` denotes exclusive.
            For example, when this endpoint returns `"coverage_histogram":[7,5,3,3,2,1,1,4,7,9,13,15]`, the first and last items in the array show that on the specified date for the specified config, 7 extractions scored in the lowest bucket of 0-10%, and 15 scored in the highest bucket of 100%.
            For more information about extraction coverage scores, see [Monitoring extractions](https://sensible.mintlify.app/.app/best-practices/metrics).
            From the payload returned by this endpoint, you can calculate other metrics, for example:

            - total number of extractions in a time period
            - doc type and config usage
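
            As a minimal sketch of such calculations, assuming the `statistics` array shape that this schema describes, the following Python totals extractions and counts usage per config. The helper names and the sample payload are hypothetical illustrations, not part of the Sensible API:

            ```python
            from collections import Counter


            def total_extractions(statistics):
                """Sum every coverage bucket across all daily config entries."""
                return sum(sum(entry["coverage_histogram"]) for entry in statistics)


            def usage_by_config(statistics):
                """Count extractions per configuration name across all entries."""
                usage = Counter()
                for entry in statistics:
                    usage[entry["configuration_name"]] += sum(entry["coverage_histogram"])
                return dict(usage)


            # Hypothetical payload: two daily entries for one config.
            sample = [
                {"configuration_name": "config_for_x_company",
                 "coverage_histogram": [7, 5, 3, 3, 2, 1, 1, 4, 7, 9, 13, 15]},
                {"configuration_name": "config_for_x_company",
                 "coverage_histogram": [1, 3, 5, 4, 6, 5, 3, 7, 8, 2, 4, 9]},
            ]
            print(total_extractions(sample))  # prints 127
            ```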
          type: array
          example: [1,3,5,4,6,5,3,7,8,2,4,9]
          minItems: 12
          maxItems: 12
          items:
            type: integer

    ExtractionsResponseFiltered:
      type: object
      properties:
        extractions:
          type: array
          items:
            anyOf:
              - $ref: '#/components/schemas/SingleExtractionSummaryResponse'
              - $ref: '#/components/schemas/MultiExtractionSummaryResponse'

        cutoff_date:
            type: string
            format: date-time
            example: null
            description: >-
              DEPRECATED. The `continuation_token` and `limit` parameters replace this parameter.
              DESCRIPTION: Pass the cutoff_date parameter in the next request as the `end_date` parameter to retrieve the next page of extractions.
              Note that since Sensible applies the date range filters before all other filters, the `cutoff_date` can represent
              the date-time of an extraction that Sensible retrieved using the date range filter, and then removed using other filters.


    SingleExtractionSummaryResponse:
      allOf:
        - $ref: '#/components/schemas/ExtractionResponseBase'
      properties:
        type:
          $ref: '#/components/schemas/DocumentTypeName'
        configuration:
          $ref: '#/components/schemas/ConfigurationName'
        errors:
          $ref: '#/components/schemas/Errors'
        validations:
          $ref: '#/components/schemas/Validations'


    MultiExtractionSummaryResponse:
      allOf:
        - $ref: '#/components/schemas/ExtractionResponseBase'
      properties:
        types:
          $ref: '#/components/schemas/DocumentTypeNames'
        documents:
          type: array
          items:
            $ref: '#/components/schemas/MultiExtractionSummaryDocument'


    ExtractionSingleRetrievalResponse:
      allOf:
        - $ref: '#/components/schemas/ExtractionSingleResponse'
        - type: object
          properties:
            download_url:
              $ref: '#/components/schemas/DownloadUrlDocument'


    ExtractionPortfolioRetrievalResponse:
      allOf:
        - $ref: '#/components/schemas/PortfolioBase'
        - type: object
          properties:
            completed:
              $ref: '#/components/schemas/ExtractionCompleted'
            page_count:
              type: integer
              example: 100
              description: Total number of pages in the portfolio.
            environment:
              $ref: '#/components/schemas/EnvironmentName'
            document_name:
              $ref: '#/components/schemas/DocName'
            download_url:
              $ref: '#/components/schemas/DownloadUrlDocument'
            documents:
              type: array
              items:
                $ref: '#/components/schemas/DocumentInPortfolio'
            coverage:
              $ref: '#/components/schemas/Coverage'

    ExtractFromUrlResponse:
      allOf:
        - $ref: '#/components/schemas/ExtractionSingleBase'
        - type: object
          properties:
            environment:
              $ref: '#/components/schemas/EnvironmentName'
            document_name:
              $ref: '#/components/schemas/DocName'
            errors:
              $ref: '#/components/schemas/Errors'


    UploadResponse:
      allOf:
        - $ref: '#/components/schemas/ExtractionSingleBase'
        - type: object
          properties:
            upload_url:
              type: string
              format: url
              description: URL at which to PUT the PDF bytes array for extraction. For example, `curl -T ./sample.pdf "YOUR_UPLOAD_URL"`.
              example: https://sensible-so-utility-bucket-prod-us-west-2.s3.us-west-2.amazonaws.com/EXTRACTION_UPLOAD/sensible/fc3484c5-3f35-4129-bb29-0ad1291ee9f8/EXTRACTION/14d82783-c12b-4e70-b0ae-ca1ce35a9836.pdf?AWSAccessKeyId=REDACTED&Expires=1623861476&Signature=REDACTED&x-amz-security-token=REDACTED


    ExtractFromUrlPortfolioResponse:
      allOf:
        - $ref: '#/components/schemas/PortfolioBase'
        - type: object
          properties:
            types:
              $ref: '#/components/schemas/DocumentTypeNames'


    UploadPortfolioResponse:
      allOf:
        - $ref: '#/components/schemas/PortfolioBase'
        - type: object
          properties:
            upload_url:
              type: string
              format: url
              description: URL at which to PUT the PDF bytes array for extraction. For example, `curl -T ./sample.pdf "YOUR_UPLOAD_URL"`.
              example: https://sensible-so-utility-bucket-prod-us-west-2.s3.us-west-2.amazonaws.com/EXTRACTION_UPLOAD/sensible/fc3484c5-3f35-4129-bb29-0ad1291ee9f8/EXTRACTION/14d82783-c12b-4e70-b0ae-ca1ce35a9836.pdf?AWSAccessKeyId=REDACTED&Expires=1623861476&Signature=REDACTED&x-amz-security-token=REDACTED


    ExtractionResponseBase:
      type: object
      properties:
        id:
          $ref: '#/components/schemas/ExtractionId'
        created:
          $ref: '#/components/schemas/ExtractionCreated'
        completed:
          $ref: '#/components/schemas/ExtractionCompleted'
        status:
         $ref: '#/components/schemas/ExtractionStatus'
        validation_summary:
         $ref: '#/components/schemas/ValidationsSummary'
        page_count:
          type: integer
          example: 100
          description: Total number of pages in the document.
        document_name:
         $ref: '#/components/schemas/DocName'
        environment:
         $ref: '#/components/schemas/Environment'
        coverage:
          $ref: '#/components/schemas/Coverage'


    ExtractionSingleBase:
      type: object
      properties:
        id:
          $ref: '#/components/schemas/ExtractionId'
        created:
          $ref: '#/components/schemas/ExtractionCreated'
        type:
         $ref: '#/components/schemas/DocumentTypeName'
        status:
         $ref: '#/components/schemas/ExtractionStatus'

    ExtractionContent:
      type: object
      properties:
        configuration:
          $ref: '#/components/schemas/ConfigurationName'
        parsed_document:
          $ref: '#/components/schemas/ParsedDocument'
        validations:
          $ref: '#/components/schemas/Validations'
        file_metadata:
          $ref: '#/components/schemas/FileMetadata'
        validation_summary:
          $ref: '#/components/schemas/ValidationsSummary'
        errors:
           $ref: '#/components/schemas/Errors'


    PortfolioBase:
      type: object
      properties:
        id:
          $ref: '#/components/schemas/ExtractionId'
        created:
          $ref: '#/components/schemas/ExtractionCreated'
        status:
         $ref: '#/components/schemas/ExtractionStatus'


    ExtractionContentPortfolio:
      allOf:
        - $ref: '#/components/schemas/ExtractionContent'
        - type: object
          properties:
            classification_summary:
              $ref: '#/components/schemas/ClassificationSummaryPortfolio'


    MultiExtractionSummaryDocument:
      type: object
      properties:
        documentType:
          $ref: '#/components/schemas/DocumentTypeName'
        configuration:
          $ref: '#/components/schemas/ConfigurationName'
        startPage:
          type: integer
          description: Page in the portfolio on which the document for this extraction starts.
          example: 2
        endPage:
          type: integer
          description: Page in the portfolio on which the document for this extraction ends.
          example: 6
        output:
          type: object
          properties:
            errors:
              $ref: '#/components/schemas/Errors'
            validations:
              $ref: '#/components/schemas/Validations'

    DocumentInPortfolio:
      type: object
      properties:
        documentType:
          $ref: '#/components/schemas/DocumentTypeName'
        configuration:
          $ref: '#/components/schemas/ConfigurationName'
        startPage:
          type: integer
          description: Page in the portfolio on which the document for this extraction starts.
          example: 2
        endPage:
          type: integer
          description: Page in the portfolio on which the document for this extraction ends.
          example: 6
        output:
          $ref: '#/components/schemas/ExtractionContentPortfolio'


    GenerateUrlRequest:
      type: object
      properties:
        webhook:
          $ref: '#/components/schemas/Webhook'
        content_type:
          $ref: '#/components/schemas/ContentTypeParameter'

    ExtractFromUrlRequest:
      type: object
      properties:
        webhook:
          $ref: '#/components/schemas/Webhook'
        document_url:
          $ref: '#/components/schemas/DocumentUrl'
        content_type:
          $ref: '#/components/schemas/ContentTypeParameter'
      required:
        - document_url


    EnvironmentName:
      description: Name of the environment to which the configuration used by this extraction was published.
      example: development
      type: string


    Classification:
      type: object
      properties:
        configuration:
          $ref: '#/components/schemas/ConfigurationName'
        fingerprints_present:
          type: integer
          example: 1
          description: The number of this config's fingerprints that Sensible found in the document.
        fingerprints:
          type: integer
          example: 1
          description: The number of fingerprints defined in this config.
        score:
          $ref: '#/components/schemas/Score'


    ConfigurationName:
      type: string
      description: >-
        Name of the "configuration", a collection of SenseML queries for extracting document data.
      example: config_for_x_company

    ConfigurationId:
      type: string
      format: uuid
      description: >-
        ID of the "configuration", a collection of SenseML queries for extracting document data.
      example: 24d82783-c12b-4e70-b0ae-ca1ce35a98

    DocumentTypeName:
      description: Unique user-friendly name for a document type
      example: auto_insurance_quotes_all_carriers
      type: string

    DocumentTypeId:
      description: Unique ID for a document type
      type: string
      format: uuid
      example: 11c82772-a12c-1e71-c0a1-1f1ce35bc7


    ClassificationSummary:
      type: array
      description:  >-
            Metadata about how Sensible scores configs against the document to extract from.
            By default, Sensible compares all configs in the document type, then chooses the best extraction using
            fingerprints, scores, or a combination of the two.
            When two extractions tie by score and fingerprints, Sensible chooses the
            first configuration in alphabetic order.
            For more information, see [fingerprints](https://docs.sensible.so/docs/fingerprint#notes).
      items:
        $ref: '#/components/schemas/Classification'
      example:
        - configuration: config_for_x_company
          fingerprints: 2
          fingerprints_present: 2
          score:
            value: 3
            fields_present: 4
            penalties: 0.5
        - configuration: acme_co
          fingerprints: 2
          fingerprints_present: 2
          score:
            value: 0
            fields_present: 2
            penalties: 1.5

    ClassificationSummaryPortfolio:
      type: array
      description:  >-
            Metadata about how Sensible chose the config to use for this extraction.
            The summary doesn't return fingerprints information for portfolio extractions.
      items:
        $ref: '#/components/schemas/ClassificationPortfolio'
      example:
        - configuration: config_for_x_company
          score:
            value: 3
            fields_present: 4
            penalties: 0.5
        - configuration: acme_co
          score:
            value: 0
            fields_present: 2
            penalties: 1.5

    ClassificationPortfolio:
      type: object
      properties:
        configuration:
          $ref: '#/components/schemas/ConfigurationName'
        score:
          $ref: '#/components/schemas/Score'


    FileMetadata:
      type: object
      description:  >-
        Metadata about the PDF file, for example author, authoring tool, and modified date.
      properties:
        metadata:
          type: object
          description: Raw metadata embedded in the PDF. Returned if available, without data normalization.
        error:
          type: string
          description: Errors Sensible encountered when attempting to retrieve metadata
          example: "Error retrieving PDF metadata: Invalid PDF structure"
        info:
          type: object
          description: Normalized metadata about the PDF, returned if available.
          properties:
            author:
              type: string
              description: The name of the person who created the document.
              example: Jay S. Schiller
            title:
              type: string
              description: Title assigned to the PDF by the PDF producer.
              example: file123
            creator:
              type: string
              description: If the document was converted to PDF from another format, the name of the application that created the original document from which it was converted.
              example:  macOS Version 11.2 (Build 20D64) Quartz PDFContext
            producer:
              type: string
              description: If the document was converted to PDF from another format, the name of the application that converted it to PDF
              example: Preview
            creation_date:
              type: string
              description: File creation date
              example: 2022-08-02T18:09:31.000+00:00
            modification_date:
              type: string
              description: File modification date
              example: 2022-08-03T15:09:23.000+00:00
            error:
              type: string
              description: Errors Sensible encountered when attempting to retrieve metadata.

    Score:
      type: object
      description: The score for the extraction, used to help choose the best extraction.
      properties:
        value:
          type: number
          example: 17
          description: The score total is fields_present minus penalty points. In the absence of fingerprints, Sensible returns the extraction in the document type with the highest score.
        fields_present:
          type: integer
          example: 17
          description: Number of non-null fields Sensible extracted from the document using this config
        penalties:
          type: number
          example: 1.5
          description: Errors are 1 penalty point and warnings are 0.5 points. See the validation_summary for a breakdown.

    ParsedDocument:
      description: |
        Data extracted from the document, structured as an array of fields.
        Configure the verbosity parameter in the SenseML configuration to return
        extraction metadata, such as:

        - page numbers
        - the bounding polygons that define line coordinates
        - confidence scores for text that Sensible OCR'd

        For more information, see [Verbosity](https://sensible.mintlify.app/.app/senseml-reference/config-settings/verbosity).
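
        As a minimal sketch of reading this object, assuming each extracted field is either null or an object with a `value` key as in this schema's example, the following Python collects field values by field ID. The `field_values` helper is a hypothetical illustration, not part of the Sensible API:

        ```python
        def field_values(parsed_document):
            """Map each field ID to its extracted value.

            Fields that are not objects (for example, null fields) pass through unchanged.
            """
            return {
                field_id: field.get("value") if isinstance(field, dict) else field
                for field_id, field in parsed_document.items()
            }


        # Sample mirrors this schema's example, with line metadata omitted.
        sample = {
            "policy_number": {"type": "number", "value": 123456789},
            "name_insured": {"type": "string", "value": "Petar Petrov"},
            "broker_email": None,  # hypothetical null field
        }
        print(field_values(sample))
        ```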
      type: object
      example:
        policy_number:
          type: number
          value: 123456789
          lines:
          - text: '123456789'
            page: 0
            boundingPolygon:
            - x: 6.458
              y: 2.601
            - x: 7.354
              y: 2.601
            - x: 7.354
              y: 2.767
            - x: 6.458
              y: 2.767
        name_insured:
          type: string
          value: Petar Petrov
          lines:
          - text: Petar Petrov
            page: 0
            boundingPolygon:
            - x: 1
              y: 5.515
            - x: 1.935
              y: 5.515
            - x: 1.935
              y: 5.674
            - x: 1
              y: 5.674

    Validation:
      type: object
      properties:
        description:
          type: string
          description: Description of the validation
          example: Dollar amount should be more than $100
        severity:
          type: string
          enum: [error, warning, skipped]
          example: warning
          description: Severity of the failing validation (error, warning, skipped)
        message:
          type: string
          description: Messages about why the validation failed
          example: >-
            Missing prerequisites: broker.email

    Validations:
      description: Which extracted fields failed validation rules you write in the Sensible app
      type: array
      items:
        $ref: '#/components/schemas/Validation'
      example:
        - description: Policy number must be 11 digits
          severity: error
        - description: Company email must be in format string@string
          severity: skipped
          message: Missing prerequisites - company_email

    ValidationsSummary:
      type: object
      description: Summary of the extracted fields that fail validation rules you write in the Sensible app.
      properties:
        fields:
          type: integer
          description: Number of fields specified in the SenseML config to extract from the document
          example: 6
        fields_present:
          type: integer
          description: Actual number of non-null fields extracted from the document
          example: 4
        errors:
          type: number
          description: Number of validation errors in the extraction
          example: 0
        warnings:
          type: number
          description: Number of validation warnings in the extraction
          example: 1
        skipped:
          type: integer
          description: Number of fields skipped in the extraction because a prerequisite field was null
          example: 1

    Errors:
      type: array
      description: Extraction error messages.
      items:
        $ref: '#/components/schemas/ExtractionError'

    ExtractionError:
      type: object
      description: Extraction error message
      properties:
        field_id:
          type: string
          description: ID of the extracted field.
          example: phone_number
        message:
          type: string
          description: Description of the error
          example: "ConfigurationError: width <=0"
        type:
          type: string
          description: Error type
          example: configuration


    DownloadUrlDocument:
      type: string
      description: URL of the document extraction
      example: https://sensible-so-document-type-bucket-dev-us-west-2.s3.us-west-2.amazonaws.com/sensible/fc3484c5-3f35-4129-bb29-0ad1291ee9f8/EXTRACTION/246a6f60-0e5b-11eb-b720-295a6fba723e.pdf?AWSAccessKeyId=REDACTED


    ExtractionId:
      type: string
      format: uuid
      description: Unique ID for the extraction, used to retrieve the extraction
      example: 246a6f60-0e5b-11eb-b720-295a6fba723e

    ExtractionCreated:
      type: string
      format: date-time
      example: 2022-10-31T16:27:53.433Z
      description: Date and time Sensible created the initial empty extraction and set its status to WAITING.

    ExtractionCompleted:
      type: string
      format: date-time
      example: 2022-10-31T16:27:53.741Z
      description: Date and time Sensible set the extraction's status to COMPLETE.

    ExtractionStatus:
      type: string
      description: |
         Status of the extraction:
         - WAITING: Sensible created an initial empty extraction and is waiting for the document.
         - PROCESSING: Sensible received the document and is extracting data.
         - FAILED: The extraction failed.
         - COMPLETE: The extraction is complete.
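
         As a minimal sketch of acting on these statuses, the following Python polls until an extraction reaches COMPLETE or FAILED. The `wait_for_terminal_status` helper and its `fetch_status` callback are hypothetical: the callback stands in for whatever code you use to retrieve the extraction and read its `status`:

         ```python
         import time

         TERMINAL_STATUSES = {"COMPLETE", "FAILED"}


         def wait_for_terminal_status(fetch_status, interval_seconds=2.0, max_attempts=30):
             """Poll until the status is terminal or the attempts run out.

             Returns the last status seen, which may be non-terminal on timeout.
             """
             status = None
             for attempt in range(max_attempts):
                 status = fetch_status()
                 if status in TERMINAL_STATUSES:
                     break
                 if attempt < max_attempts - 1:
                     time.sleep(interval_seconds)
             return status


         # Simulated status sequence in place of real API calls.
         simulated = iter(["WAITING", "PROCESSING", "COMPLETE"])
         print(wait_for_terminal_status(lambda: next(simulated), interval_seconds=0))
         ```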
      enum: [WAITING, PROCESSING, COMPLETE, FAILED]
      example: COMPLETE
    DocumentTypeNames:
      type: array
      description: Specifies the document types contained in the PDF portfolio.
      items:
        type: string
      example: [tax_returns, bank_statements, credit_reports]
    Webhook:
      type: object
      description: >-
        Specifies to return extraction results to the defined webhook as soon as they're complete,
        so you don't have to poll for results status. Sensible also calls this webhook on error.
      properties:
        url:
          type: string
          format: url
          description: Webhook destination. Sensible will POST to this URL when the extraction is complete.
          example: https://example.com/example_webhook_url
        payload:
          type: string
          description: Information additional to the API response, for example a UUID for verification. Can be any of the following types - [string, number, boolean, array, object].
          example: info extra to the default extraction payload
    DocumentUrl:
      type: string
      format: url
      description:  >-
        URL that responds to a GET request with the bytes of the document you want to extract data from.
        This URL must be either publicly accessible, or presigned with a security token as part of the URL path.
        To check if the URL meets these criteria, open the URL with a web browser.
        The browser must either render the document as a full-page view with no other data, or download the document, without prompting for authentication.
      example: https://github.com/sensible-hq/sensible-docs/raw/main/readme-sync/assets/v0/pdfs/auto_insurance_anyco.pdf
    ContentTypeParameter:
      type: string
      enum: ["application/pdf", "image/jpeg", "image/png", "image/tiff", "application/msword", "application/vnd.openxmlformats-officedocument.wordprocessingml.document"]

      description: >-
        Content type of the document being presented for extraction.


    encodedPdf:
      type: object
      required:
        - document
      properties:
        document:
          type: string
          description: |
            This parameter shows option \#2 for posting PDF bytes. To populate it, you can encode a document, like this [example](https://github.com/sensible-hq/sensible-docs/raw/main/readme-sync/assets/v0/pdfs/1_extract_your_first_data.pdf) using a free online PDF-to-base64 encoder and paste the resulting bytes into this parameter, or you can right-click this parameter field and select **Use Example Value**.
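
            As an alternative to an online encoder, the following minimal Python sketch Base64-encodes document bytes and wraps them in a JSON body with the `document` key that this schema requires. The `build_request_body` helper and the placeholder bytes are hypothetical illustrations:

            ```python
            import base64
            import json


            def build_request_body(pdf_bytes):
                """Return a JSON string with the document bytes Base64-encoded."""
                encoded = base64.b64encode(pdf_bytes).decode("ascii")
                return json.dumps({"document": encoded})


            # Placeholder bytes stand in for a real file read with open(path, "rb").
            print(build_request_body(b"%PDF-1.4 example"))
            ```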