The wordcase package provides methods for converting words to different "casing" forms, eg:
- SnakeCase
- KebabCase
- DotCase
- ScreamingSnakeCase
- CamelCase
- PascalCase
- Words
- TitleCase
When those methods aren't enough, it also offers a way to build a function that behaves exactly* how you want.
eg, to create a title casing method you could do something like:
titleCase := wordcase.
	NewPipeline().
	TokenizeUsing(SimpleCategorizer, unicode.IsSpace, true). // tokenise on whitespace, don't consider the separator as part of the token
	WithAllFormatter(strings.ToLower). // convert all tokens to lowercase
	WithFormatter(strings.ToUpper, wordcase.KeyWords). // convert keywords to uppercase (eg HTML, XML, ID -- can supply your own list)
	WithAllFormatter(wordcase.UppercaseFirst). // convert the first character of each token to uppercase
	JoinWith(" ") // join the tokens back together using a space as a separator
// use it a la:
fmt.Println(titleCase("oNe two Html THREE")) // will print: One Two HTML Three
This package is pretty much overkill, but it does handle a bunch of edge cases pretty nicely, which is hard to do with straight regex substitution.
* exactly-ish
The package contains some example pipelines for common conversions.
If the methods don't behave exactly how you'd like, you can copy the pipelines
from the standalone.go file and modify them to suit.
eg
SnakeCase("One example id")
-> "one_example_id"
eg
KebabCase("One example id")
-> "one-example-id"
eg
DotCase("One example id")
-> "one.example.id"
eg
ScreamingSnakeCase("One example id")
-> "ONE_EXAMPLE_ID"
eg
CamelCase("One example id")
-> "oneExampleID"
eg
PascalCase("One example id")
-> "OneExampleID"
Converts an "otherly-cased" string into words separated by spaces (actual case of each word left untouched)
eg
Words("oneExampleID")
-> "one Example ID"
Converts the first letter of each word in a string to uppercase.
Note: there's a Title function in strings which may be what you want if you don't want non-alphanumeric characters stripped, or your string is already broken into space-separated words.
eg
TitleCase("ONE_EXAMPLE_ID")
-> "One Example ID"
Perform transformations using a pipeline of processors.
A Pipeline type has the signature func(string) Tokens.
For a function to work as a pipeline stage, it must have a Pipeline-compatible receiver and return a Pipeline.
Pipelines are built using the following kinds of stages:
- Tokenization
- Formatting
- Combination
Splits a string into tokens.
Performed with the method:
TokenizeUsing(test SeparatorTest, sep IsRuneSeparator, del bool)
Where:
- a SeparatorTest holds the logic for deciding whether to split on a given rune
- an IsRuneSeparator determines whether a given rune is considered a separator
- set del to true to delete the separating rune, or false to make it part of the next token
Multiple tokenization steps can be specified, creating a multi-pass parser.
This can simplify gnarly situations with complicated rules on what makes a token.
A SeparatorTest determines whether the given index in the given string is a token separator. Because it has the whole text being parsed as reference, it can perform smarter tests using lookahead/lookbehind along with the IsRuneSeparator test.
It has the signature: func(word string, idx int, test IsRuneSeparator) bool
Two SeparatorTests are currently provided:
- LookAroundCategorizer considers a rune to be a separator if it passes IsRuneSeparator(c) and both the preceding and succeeding runes are NOT separators
- SimpleCategorizer considers only the current rune when deciding on separators
Formatters are applied to tokens, usually to change the case in some way, but any string operation can be applied.
Performed with the method:
WithFormatter(formatter Formatter, selector TokenSelector)
There is also a convenience method:
WithAllFormatter(formatter Formatter)
which is simple shorthand for:
WithFormatter(formatter, ToAll)
Where:
- Formatter is a function that takes a string, transforms it, and returns a string
- TokenSelector is a function that specifies which tokens to apply formatting to
A formatter has the function signature func(string) string, so many of the operations in the strings library can be applied (eg ToLower, ToUpper).
One additional method is currently supplied: UppercaseFirst, which converts the first character of a string to uppercase and leaves the rest untouched.
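Any function with the func(string) string shape can serve as a formatter. The sketch below shows one way a formatter like UppercaseFirst could be written. The name upperFirst is hypothetical and this is not the package's implementation; in particular, returning empty or invalid input unchanged is an assumption.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// upperFirst is a hypothetical formatter with the func(string) string shape.
// It uppercases the first rune and leaves the remainder untouched.
func upperFirst(s string) string {
	r, size := utf8.DecodeRuneInString(s)
	if r == utf8.RuneError {
		return s // empty or invalid input is returned unchanged
	}
	return string(unicode.ToUpper(r)) + s[size:]
}

func main() {
	fmt.Println(upperFirst("example")) // Example
	fmt.Println(upperFirst("iD"))      // ID
	fmt.Println(upperFirst("éclair"))  // Éclair
}
```

Decoding the first rune rather than indexing the first byte keeps multi-byte characters intact.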
A TokenSelector determines which tokens to apply formatting to.
They have the signature func(t Tokens) []int, where t is the current list of tokens, and []int is a slice of indexes on which the formatter should operate.
Provided Selectors include:
- ToFirst - returns the index of the first token
- ToLast - returns the index of the last token
- ToRest - returns the indexes of all tokens except the first
- ToAll - returns the indexes of all tokens
There are 3 logic selectors that operate on other selectors:
- Not(sFn TokenSelector) - inverts the selection list of the provided selector
- And(a, b TokenSelector) - returns a selector that matches where both selectors a and b match
- Or(a, b TokenSelector) - returns a selector that matches where either of the selectors a or b matches
There is a TokenSelector generator method, KeyWordFn(keywords []string), which creates a selector using a list of words provided.
Two lists are provided currently:
- GoLintKeywords - a list of words that golint uses as a set of common initialisms
- UsefulKeyWords - a list that includes all the words from GoLintKeywords plus a few more initialisms such as grpc, yaml and toml
A combination stage is for joining the tokens into new words again.
Performed with the stage:
JoinWith(sep string)
Currently, only one Combiner is supported, and it must be the last stage of a pipeline.
This method simply joins the tokens together using the string sep as glue.