[Question] How is string feature handled? #129

Open
GeekAlexis opened this issue Oct 11, 2021 · 1 comment

Comments

@GeekAlexis

GeekAlexis commented Oct 11, 2021

Thanks for the great project.

As far as I know, the CRF model only takes real-valued features. If I use the suffix of a word (say, the last 4 letters) as a feature, is it internally converted to binary features covering every combination of the 4 letters (up to 26 x 26 x 26 x 26 dimensions)? I don't see this documented anywhere.

@gurmitteotia

I know you asked this question quite a long time ago, but the answer may still be useful to someone.

According to the ItemSequence documentation, string feature values are converted to floats. For example, if you pass a feature as {"word1": "hello"}, it is converted to {"word1=hello": 1.0}. Have a look at the documentation; it has many examples.
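To make the conversion described above concrete, here is a minimal pure-Python sketch applied to the suffix feature from the question. `flatten_features` is a hypothetical helper written for illustration, not the library's code, and it only covers string, bool, and numeric values. It also suggests why the 26^4 worry likely does not apply: a feature name is created only for a key=value pair that actually occurs in the data.

```python
def flatten_features(features):
    """Hypothetical helper: mimic the documented {"key": "value"} ->
    {"key=value": 1.0} conversion. Not the library's implementation."""
    flat = {}
    for key, value in features.items():
        if isinstance(value, str):
            # A string value becomes part of the feature name, with weight 1.0.
            flat[f"{key}={value}"] = 1.0
        elif isinstance(value, bool):
            flat[key] = 1.0 if value else 0.0
        else:
            # Numeric values are kept as float weights.
            flat[key] = float(value)
    return flat


words = ["running", "jumping", "singing", "running"]

# One feature dict per token, using the last 4 letters as the suffix feature.
sequence = [flatten_features({"suffix4": w[-4:], "bias": 1.0}) for w in words]

# Distinct feature names seen in this toy data set.
observed = sorted({name for item in sequence for name in item})

print(sequence[0])
print(observed)
```

In this toy data only three suffix features appear (one per distinct suffix), plus the bias. Assuming the project is python-crfsuite, the real conversion can be inspected by calling `items()` on a `pycrfsuite.ItemSequence`.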
