Query explainer UI #423

Open
sergiimk opened this issue Sep 12, 2024 · 1 comment · May be fixed by #452
Assignees
Labels
epic A major chunk of work composed out of smaller tasks

sergiimk commented Sep 12, 2024

Context

When querying Kamu, a user can ask for the response to include a cryptographic commitment. The commitment holds the Kamu node that provided it permanently accountable for the returned data and allows other nodes on the network to reproduce and verify the request.

See for details:

Requirements

Client applications (LLM chat bots) that give end users information based on queries executed in Kamu should be able to provide a link that takes the end user to Kamu and shows them:

  • The query that was executed
  • Input datasets that were used in the query
  • The identity of the node that originally executed the query
  • Whether the commitment is valid
  • If the commitment is NOT valid, which part of it failed validation (e.g. invalid signature, data hash mismatch, etc.)
  • A sample of the output data
  • The query explainer page should be available only to admins

Data Flow

  • The client application queries Kamu with the include=proof parameter and displays the data to the end user
  • The client application also presents the end user with a URL that takes them to Kamu and shows the explanation and verification info for the query
  • This link should somehow include the commitment data (e.g. encoded as a query parameter)

Example

This is just a rough idea - exact implementation TBD:

Client app makes a query:

echo '{
  "query": "select * from \"kamu/covid19.quebec.case-details\" limit 1",
  "include": ["proof"]
}' | xh POST "https://api.demo.kamu.dev/query"

Response:

{
    "input": {
        "query": "select * from \"kamu/covid19.quebec.case-details\" limit 1",
        "queryDialect": "SqlDataFusion",
        "dataFormat": "JsonAoS",
        "include": [
            "Input",
            "Proof"
        ],
        "datasets": [
            {
                "id": "did:odf:fed0122c8ddb48715dbaac81f90d9d19f306cc744f69f263bd0b0b8375c1c7bdf3f0c",
                "alias": "kamu/covid19.quebec.case-details",
                "blockHash": "f162016c72ca766d9ecb5e9abeca1486f311eb127ed575dd7c7f71cc700339cf746c7"
            }
        ],
        "skip": 0,
        "limit": 100
    },
    "output": {
        "data": [
            {
                "age_group": "20-29",
                "case_status": "Recovered",
                "date_reported": "2021-04-07T12:00:00Z",
                "exposure": "Close Contact",
                "gender": "Female",
                "health_region": "Ottawa Public Health",
                "hr_uid": 3551,
                "latitude": 45.179577,
                "longitude": -75.79995,
                "objectid": 399001,
                "offset": 1048576,
                "op": 0,
                "province": "Ontario",
                "province_abbr": "ON",
                "row_id": 398455,
                "system_time": "2024-09-02T22:38:09.491Z"
            }
        ],
        "dataFormat": "JsonAoS"
    },
    "subQueries": [],
    "commitment": {
        "inputHash": "f1620f6be0fe2861ed79532991d568ab7c28ac1dc6ec2b1969480f745f0c72b136ce0",
        "outputHash": "f162036c56d50832e88b480a4b58264497defc9847fa5ff5b5461048aedd0471057e3",
        "subQueriesHash": "f1620ca4510738395af1429224dd785675309c344b2b549632e20275c69b15ed1d210"
    },
    "proof": {
        "type": "Ed25519Signature2020",
        "verificationMethod": "did:key:z6MksqxvcNrqUYMC2LrAFcEtjjqEMLsUi1Ek8x6VFA8vA4xF",
        "proofValue": "uutkx5Sr2n_sRT1UAnQWZM02OWP0pgDzfD9LtjPw4PWwgZLxRDqDPyt3AWOfgqSf-Lkcy5xpPbM7yntrSkldSAA"
    }
}

To take the end user to the query explanation, the client app takes the commitment part of the response (everything except the output section):

{
    "input": {
        "query": "select * from \"kamu/covid19.quebec.case-details\" limit 1",
        "queryDialect": "SqlDataFusion",
        "dataFormat": "JsonAoS",
        "include": [
            "Input",
            "Proof"
        ],
        "datasets": [
            {
                "id": "did:odf:fed0122c8ddb48715dbaac81f90d9d19f306cc744f69f263bd0b0b8375c1c7bdf3f0c",
                "alias": "kamu/covid19.quebec.case-details",
                "blockHash": "f162016c72ca766d9ecb5e9abeca1486f311eb127ed575dd7c7f71cc700339cf746c7"
            }
        ],
        "skip": 0,
        "limit": 100
    },
    "subQueries": [],
    "commitment": {
        "inputHash": "f1620f6be0fe2861ed79532991d568ab7c28ac1dc6ec2b1969480f745f0c72b136ce0",
        "outputHash": "f162036c56d50832e88b480a4b58264497defc9847fa5ff5b5461048aedd0471057e3",
        "subQueriesHash": "f1620ca4510738395af1429224dd785675309c344b2b549632e20275c69b15ed1d210"
    },
    "proof": {
        "type": "Ed25519Signature2020",
        "verificationMethod": "did:key:z6MksqxvcNrqUYMC2LrAFcEtjjqEMLsUi1Ek8x6VFA8vA4xF",
        "proofValue": "uutkx5Sr2n_sRT1UAnQWZM02OWP0pgDzfD9LtjPw4PWwgZLxRDqDPyt3AWOfgqSf-Lkcy5xpPbM7yntrSkldSAA"
    }
}

It then Base64-encodes it into:

ewogICA...AgICB9Cn0K

And presents the end user with a link pointing to:

https://platform.demo.kamu.dev/query?commitment=ewogICA...AgICB9Cn0K
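
The client-side steps above could be sketched roughly as follows. This is only an illustration, not a settled design: the function name `build_explainer_link` is hypothetical, and the use of compact JSON and the URL-safe Base64 alphabet is an assumption made here to keep the link short and free of `+` and `/` (the encoded sample above appears to use pretty-printed JSON instead).

```python
import base64
import json
from urllib.parse import urlencode

# Base URL taken from the example link in this issue; the final route is TBD.
EXPLAINER_URL = "https://platform.demo.kamu.dev/query"


def build_explainer_link(response: dict) -> str:
    """Build a link that carries the commitment part of a query response."""
    # Keep everything except the output data, as in the example above.
    payload = {key: value for key, value in response.items() if key != "output"}
    # Assumption: compact separators, to keep the encoded form short.
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    # Assumption: URL-safe alphabet, so no '+' or '/' ends up in the URL.
    encoded = base64.urlsafe_b64encode(raw).decode("ascii")
    # urlencode percent-encodes any '=' padding characters.
    return f"{EXPLAINER_URL}?{urlencode({'commitment': encoded})}"
```

Whichever encoding is chosen, the Web UI has to decode with the same alphabet and JSON conventions the client used.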

When the user clicks on the link, the Web UI will:

  • Decode and visualize the commitment data
  • Send it for verification to https://api.demo.kamu.dev/verify endpoint
  • Visualize the verification status
  • Show a sample of output data
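
The first two steps of this list could look roughly like the sketch below. It assumes the `commitment` query parameter holds URL-safe Base64 of a JSON document, and that the /verify endpoint accepts that JSON as-is in a POST body; neither is confirmed by this issue. The function names are hypothetical, and the request is only constructed here, not sent.

```python
import base64
import json
import urllib.request
from urllib.parse import parse_qs, urlparse

# Endpoint named in this issue; the request shape below is an assumption.
VERIFY_URL = "https://api.demo.kamu.dev/verify"


def decode_commitment(link: str) -> dict:
    """Extract and decode the commitment data carried by an explainer link."""
    query = parse_qs(urlparse(link).query)
    encoded = query["commitment"][0]
    # Restore '=' padding in case the client stripped it.
    encoded += "=" * (-len(encoded) % 4)
    return json.loads(base64.urlsafe_b64decode(encoded))


def make_verify_request(commitment: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request to the verification endpoint."""
    body = json.dumps(commitment).encode("utf-8")
    return urllib.request.Request(
        VERIFY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the prepared request to `urllib.request.urlopen` would perform the call; rendering the resulting verification status is left to the UI design tasks below.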

Verification statuses are currently not documented but can all be found in this test.

Technological Uncertainty

The biggest question is whether we can fit the entire commitment data into a URL parameter.

We need to investigate typical HTTP URL length limits to ensure we stay safely within them.

Otherwise, we should design an alternative way to redirect the user to the platform with the commitment data.
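
As a starting point for that investigation, a small helper can estimate how long the link would be for a given commitment payload. This is a sketch under stated assumptions: `ASSUMED_URL_LIMIT` is a placeholder to be replaced with whatever the investigation finds, the function names are hypothetical, and zlib compression is shown only as one possible way to shrink the payload, not as a decided approach. The estimate ignores percent-encoding of Base64 padding, so it is approximate.

```python
import base64
import json
import zlib

# Placeholder value, NOT a standard; replace with the investigated limit.
ASSUMED_URL_LIMIT = 2000


def encoded_length(payload: dict, compress: bool = False) -> int:
    """Return the Base64 length of the payload, optionally zlib-compressed."""
    raw = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    if compress:
        raw = zlib.compress(raw, level=9)
    # Base64 turns every 3 input bytes into 4 output characters.
    return len(base64.urlsafe_b64encode(raw))


def fits_in_url(
    payload: dict,
    base_url: str,
    limit: int = ASSUMED_URL_LIMIT,
    compress: bool = False,
) -> bool:
    """Check whether base_url plus the encoded payload stays within limit."""
    prefix = len(base_url) + len("?commitment=")
    return prefix + encoded_length(payload, compress) <= limit
```

Since the query text and the list of input datasets are both part of the payload, its size is unbounded in principle, so long queries are the cases worth measuring first.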

Tasks

  • Design how commitment data will be passed to the Web UI
  • Design the UI view for visualizing the query
  • Design the new URL path route for the view
  • Add test plan for verification
  • ...
@sergiimk sergiimk added the epic A major chunk of work composed out of smaller tasks label Sep 12, 2024
zaychenko-sergei commented

[image]

@dmitriy-borzenko dmitriy-borzenko self-assigned this Oct 4, 2024
@dmitriy-borzenko dmitriy-borzenko linked a pull request Oct 17, 2024 that will close this issue
3 participants