[BUG] Can't use filter "icu_folding" with "unicode_set_filter" #591

Closed
MikePieperSer opened this issue Aug 2, 2023 · 4 comments
Labels: bug, untriaged

Comments

@MikePieperSer
Contributor

What is the bug?

Can't use filter "icu_folding" with "unicode_set_filter".
When I try to create an index with these filter settings via the Java client, the server complains about a missing "[".

When I create the index directly via JSON (Dev Tools) instead, the client throws an exception while reading the index settings back, complaining about an unexpected "[".

How can one reproduce the bug?

        // create an index
        CreateIndexRequest createIndexRequest = new CreateIndexRequest.Builder().index(index)
            .settings(b->b
                .analysis(ab -> ab
                    .filter("de_folding", fb -> fb
                        .definition(tfdb -> tfdb
                            .icuFolding(ifb -> ifb
                                .unicodeSetFilter("^äöüÄÖÜß")
                            )
                        )
                    )
                )
            ).build();
        CreateIndexResponse createIndexResponse = client.indices().create(createIndexRequest);

        final GetIndexResponse getIndexResponse = client.indices().get(b -> b
            .index(index + "*")
            .ignoreUnavailable(true)
            .allowNoIndices(true)
            .expandWildcards(ExpandWildcard.Open));
        System.out.println("Get index response: " + getIndexResponse);
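
The pattern in the snippet above is passed without surrounding brackets, while the JSON request later in this thread uses the bracketed form "[^åäöÅÄÖ]", which the server accepts. That is presumably what the "missing [" complaint refers to. A minimal sketch of a guard for the simple case follows; `UnicodeSetPatterns` and `ensureBracketed` are hypothetical names and not part of the java-client, and the check does not attempt to validate full ICU UnicodeSet syntax.

```java
// Hypothetical helper, not part of opensearch-java.
final class UnicodeSetPatterns {
    private UnicodeSetPatterns() {}

    /**
     * Returns the pattern wrapped in "[" and "]" unless it already is.
     * Assumption: the pattern is a plain character set such as "^abc";
     * other forms of set pattern are not considered here.
     */
    static String ensureBracketed(String pattern) {
        String trimmed = pattern.trim();
        if (trimmed.startsWith("[") && trimmed.endsWith("]")) {
            return trimmed; // already in bracketed form
        }
        return "[" + trimmed + "]";
    }
}
```

In the builder chain above this could be used as `.unicodeSetFilter(UnicodeSetPatterns.ensureBracketed("^äöüÄÖÜß"))`. It only addresses index creation, not reading the settings back.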

What is the expected behavior?

That the index can be created with this filter and can be read back.

What is your host/environment?

AWS OpenSearch Serverless / java-client 2.6.0

Do you have any screenshots?

No.

Do you have any additional context?

No.

@MikePieperSer MikePieperSer added the bug and untriaged labels Aug 2, 2023
@MikePieperSer MikePieperSer changed the title from [BUG] to [BUG] Can't use filter "icu_folding" with "unicode_set_filter" Aug 2, 2023
@dblock
Member

dblock commented Aug 2, 2023

Looks like a legit bug. Want to try to turn it into a (failing) unit test?

@MikePieperSer
Contributor Author

The bug is not in the java-client but in AWS OpenSearch Serverless.
When creating an index like this:

PUT icu_sample
{
  "settings": {
    "analysis": {
      "index": {
        "filter": {
          "swedish_folding": {
            "type": "icu_folding",
            "unicode_set_filter": "[^åäöÅÄÖ]"
          }
        }
      }
    }
  }
}

The server creates the index. But when one then tries to read the settings back, Serverless answers with:

GET icu_sample

{
  "icu_sample": {
    "settings": {
      "index": {
        "analysis": {
          "index": {
            "filter": {
              "swedish_folding": {
                "type": "icu_folding",
                "unicode_set_filter": [
                  "^åäöÅÄÖ"
                ]
              }
            }
          }
        }
      }
    }
  }
}

As you can see, "unicode_set_filter" has now become an array.
That means the java-client is absolutely right to complain.
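
Until the server side is fixed, code that reads the settings could tolerate both shapes. The following is only a sketch, assuming the value has already been parsed into either a String or a single-element List; `UnicodeSetFilterSetting` and `normalize` are hypothetical names and not part of the java-client. Re-adding the brackets is likewise an assumption, based on the response shown above where they are missing.

```java
import java.util.List;

// Hypothetical helper, not part of opensearch-java.
final class UnicodeSetFilterSetting {
    private UnicodeSetFilterSetting() {}

    /**
     * Accepts "unicode_set_filter" either as a plain string (the shape sent
     * in the request) or as a single-element list (the shape returned above)
     * and returns the bracketed pattern string.
     */
    static String normalize(Object raw) {
        String value;
        if (raw instanceof String) {
            value = (String) raw;
        } else if (raw instanceof List<?>
                && ((List<?>) raw).size() == 1
                && ((List<?>) raw).get(0) instanceof String) {
            value = (String) ((List<?>) raw).get(0);
        } else {
            throw new IllegalArgumentException("Unexpected unicode_set_filter value: " + raw);
        }
        // Assumption: a value without brackets lost them on the server side.
        boolean bracketed = value.startsWith("[") && value.endsWith("]");
        return bracketed ? value : "[" + value + "]";
    }
}
```

With this, both `"[^åäöÅÄÖ]"` and a list containing `"^åäöÅÄÖ"` map to the same pattern string, and any other shape fails loudly instead of being silently accepted.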

@dblock
Member

dblock commented Aug 3, 2023

@MikePieperSer Did you open a ticket with AWS?

@MikePieperSer
Contributor Author

Not yet, but it's on my task list.
