utf8

UTF-8 support for Nix

Why

Strings in Nix are byte strings, and builtin functions like substring (and, by extension, some lib functions in nixpkgs) process bytes instead of UTF-8 code points. As a result, these functions can produce invalid strings when given strings that contain multi-byte UTF-8 characters. This library lets you convert a string into a list of UTF-8 code points instead.
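To make the byte-level behaviour concrete, here is a minimal sketch that uses only Nix builtins (nothing from this library). The file name example.nix is hypothetical; it can be evaluated with something like `nix-instantiate --eval --strict example.nix`.

```nix
# example.nix (hypothetical file name)
let
  # Each of these two characters is encoded as 3 bytes in UTF-8.
  s = "你好";
in
{
  # stringLength counts bytes, so this is 6 rather than 2.
  byteLength = builtins.stringLength s;

  # Taking exactly 3 bytes happens to end on a code point boundary: "你".
  firstThreeBytes = builtins.substring 0 3 s;

  # Taking only 1 byte would cut "你" apart and yield an invalid string:
  #   builtins.substring 0 1 s
}
```

The offsets only work here because the byte width of each character is known in advance, which is exactly what a code point aware helper avoids.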

Usage

Try it out with flakes

nix repl github:figsoda/utf8#lib --extra-experimental-features "flakes nix-command repl-flake"

or locally

nix repl -f .

`chars`

Type: String -> [ String ]

Split a string into a list of code points

nix-repl> chars "你好，世界！"
[ "你" "好" "，" "世" "界" "！" ]

`head`

Type: String -> String

Return the first code point of the string

nix-repl> head "你好，世界！"
"你"

`tail`

Type: String -> String

Return the string without the first code point

nix-repl> tail "你好，世界！"
"好，世界！"

`length`

Type: String -> Int

Return the number of code points in the string

nix-repl> length "你好，世界！"
6
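The functions above can also be combined with plain builtins. As a rough sketch, a code point aware "take the first n characters" could look like the following. The function and the file name take-chars.nix are hypothetical and not part of this library; the sketch assumes `utf8` is an attribute set exposing `chars` as documented above (for example the flake's lib output) and that `n` is non-negative.

```nix
# take-chars.nix (hypothetical helper, not part of this library)
# Arguments:
#   utf8: attribute set providing `chars` (assumed to be this library)
#   n:    number of code points to keep (assumed non-negative)
#   s:    input string
utf8: n: s:
let
  cs = utf8.chars s;                       # list of code points
  total = builtins.length cs;
  count = if n < total then n else total;  # clamp to the available length
in
# Rebuild a string from the first `count` code points.
builtins.concatStringsSep "" (builtins.genList (builtins.elemAt cs) count)
```

Given the `chars` output shown earlier, `import ./take-chars.nix utf8 2 "你好，世界！"` should evaluate to "你好", whereas slicing by bytes with builtins.substring would require knowing the byte width of each character.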

Development

nix run ./dev # regenerate table.nix

nix develop ./dev
namaka check # run tests
namaka review # review pending snapshots
