Skip to content
This repository has been archived by the owner on Jun 12, 2024. It is now read-only.

Sort keys of objects returned by SELECT * FROM table by creation order instead of dictionary order #5

Open
thanhnguyen2187 opened this issue Feb 4, 2024 · 10 comments
Assignees
Labels
bug Something isn't working

Comments

@thanhnguyen2187
Copy link

Hi!

Suppose I have this table:

CREATE TABLE `folders` (
	`id` text PRIMARY KEY NOT NULL,
	`name` text NOT NULL,
	`position` real DEFAULT 1 NOT NULL,
	`updated_at` text DEFAULT CURRENT_TIMESTAMP NOT NULL,
	`created_at` text DEFAULT CURRENT_TIMESTAMP NOT NULL
);

And this data:

INSERT INTO `folders`(id, name, position)
VALUES ('default', 'Default', 0)
ON CONFLICT(id) DO NOTHING;

SELECT * FROM folders would return:

{
  "results": [
    {
      "success": true,
      "resultSet": [
        {
          "created_at": "2024-02-04 12:23:08",
          "id": "default",
          "name": "Default",
          "position": 0.0,
          "updated_at": "2024-02-04 12:23:08"
        }
      ]
    }
  ]
}

I think it should follow creation order`, which returns:

  • id
  • name
  • position
  • updated_at
  • created_at

It is the behavior of the behavior of sqlite3 as well:

default|Default|0.0|2024-02-04 12:54:32|2024-02-04 12:54:32

Thanks!

@thanhnguyen2187
Copy link
Author

More context on why I need this: I was wrapping Drizzle around sqliterg and encountered a strange error where Drizzle would not map the returned values correctly. After messing around a bit, I found that for SELECT *, Drizzle expects an array of values that follow the creation order as well.

@proofrock proofrock self-assigned this Feb 4, 2024
@proofrock proofrock added the bug Something isn't working label Feb 4, 2024
@proofrock
Copy link
Owner

proofrock commented Feb 4, 2024

Hi! I agree with you that this should be corrected. The code is this:

Some(row) => {
    let mut map: JsonMap<String, JsonValue> = JsonMap::new();
    for (i, col_name) in column_names.iter().enumerate() {
        let value: Value = row.get_unwrap(i);
        map.insert(col_name.to_string(), val_db2val_json(value));
    }
    response.push(JsonValue::Object(map));
}

where JsonMap is an alias for serde_json::Map. I think (I may be wrong) this is a "pure" hashmap with no ordering guarantee... so I need to work around this.

I cannot give you a ETA for this, it's not a good period, but as soon as I manage I'll take a look at it.

Thanks!

G.

@proofrock
Copy link
Owner

proofrock commented Feb 4, 2024

I found that the JSON specification says that (emphasis mine):

An object is an unordered collection of zero or more name/value
pairs, where a name is a string and a value is a string, number,
boolean, null, object, or array.

So, serde_json seems to take it as a feature the reordering of the keys in alphabetical order. And Drizzle may seems to be at fault.

Not easy to circumvent that; but something similar to this PR may be a way forward

contain-rs/linked-hash-map#48

Would it be helpful to present the values in a list (of lists)? And the headers in another list?

@proofrock
Copy link
Owner

proofrock commented Feb 4, 2024

To expand a bit more: the request may be:

...
        {
            "query": "SELECT * FROM folders",
            "resultFormat": "list",
            ...
        }
...

with values of list or map, by default/if absent map would be used, for retrocompatibility. If list, your result would become:

{
  "results": [
    {
      "success": true,
      "resultHeaders": ["id", "name", "position", "updated_at", "created_at"],
      "resultSet": [
        [ "default", "Default", 0.0, "2024-02-04 12:23:08", "2024-02-04 12:23:08" ]
      ]
    }
  ]
}

@thanhnguyen2187
Copy link
Author

Many thanks for your answers!

"resultFormat": "list" is a really good idea! Let us hope that there would be more use case for it apart from my Drizzle wrapping 😅.

On "fixing" the current code, I was thinking about linked-hash-map as well. I used the data structure (Ordered Dict) in another library (https://github.com/thanhnguyen2187/darkest-savior), where I needed to deserialize the data to a binary format with the correct order. I think using it instead of "normal" hash map is quite straight forward.

I'm only using sqliterg in a side project, so there is no rush! Please take your time!

@proofrock
Copy link
Owner

I wish I could use a linked hash map! Unfortunately for how the code is done, it must be used something from the serde "realm" so to speak, not just any map. 🙁 Besides, the fact that a js object is unordered opens the possibility that some additional "layer" that one may use scrambles the ordering. It is just not the right structure.

I have actually no issues in adding a parameter, if it's useful, sane and has a decent preset that doesn't break anything. 😉 Just have a bit of patience.

Thanks!

@thanhnguyen2187
Copy link
Author

thanhnguyen2187 commented Feb 10, 2024

Hi @proofrock. Do you think that there is any workaround for this issue? It has became a major blocker for my side project. I tried ws4sqlite, and it has the same issue. Specifying column names does not seem to work either 😅.

@proofrock
Copy link
Owner

I put down a quick (and dirty!) implementation for ws4sqlite, in a feature branch

https://github.com/proofrock/ws4sqlite/tree/feature/list-response-format

There's no choice, only the "new" schema is implemented.

Request:

{
    "transaction": [
        {
            "statement": "CREATE TABLE IF NOT EXISTS `folders` (`id` text PRIMARY KEY NOT NULL,`name` text NOT NULL,`position` real DEFAULT 1 NOT NULL,`updated_at` text DEFAULT CURRENT_TIMESTAMP NOT NULL,`created_at` text DEFAULT CURRENT_TIMESTAMP NOT NULL);"
        },
		{
            "statement": "INSERT INTO `folders`(id, name, position) VALUES ('default', 'Default', 0) ON CONFLICT(id) DO NOTHING;"
        },
		{
            "query": "SELECT * FROM folders"
        }
    ]
}

Response:

{
    "results": [
        {
            "success": true,
            "rowsUpdated": 0
        },
        {
            "success": true,
            "rowsUpdated": 1
        },
        {
            "success": true,
            "resultHeaders": [
                "id",
                "name",
                "position",
                "updated_at",
                "created_at"
            ],
            "resultSet": [
                [
                    "default",
                    "Default",
                    0,
                    "2024-02-11 08:01:39",
                    "2024-02-11 08:01:39"
                ]
            ]
        }
    ]
}

So maybe you can experiment a little bit? If you are not able to compile it, please tell me your arch and I'll get you some binaries.

@thanhnguyen2187
Copy link
Author

thanhnguyen2187 commented Feb 11, 2024

Thanks for the quick response! I have been tinkering ws4sqlite, too. Above is a PR that uses OrderedDict to replace resultSet's records. I'll try to add unit tests and cherry pick your branch if it is possible.

@proofrock
Copy link
Owner

Hi Thanh! Please take a look at proofrock/ws4sqlite#44 and contribute if you want!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants