Open-source library providing conversion between string, u16string,
u32string and u8string. It is platform-independent and is based on the
Unicode UTF encodings.
This library distinguishes std::string and std::u8string under C++20,
but still assumes that std::string objects contain UTF-8 values.
#include <utf/utf.hpp>
Apart from as_u8(string_view) and as_str8(u8string_view), all the
functions decode each Unicode code point of the input (using uint32_t as
the interlingua) and encode it in the output. If the decoding fails, an
empty string is returned.
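The decode-then-encode approach described above can be illustrated with a minimal, standalone sketch. It is not the library's implementation: the names decode_utf8_sketch, encode_utf16_sketch and utf8_to_utf16_sketch are hypothetical, and the exact set of inputs the library rejects (for example overlong forms) is an assumption here.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical helper (not part of the library): decodes one code point
// starting at src[pos] and advances pos. Returns false on malformed input.
inline bool decode_utf8_sketch(std::string_view src, std::size_t& pos,
                               std::uint32_t& cp) {
    auto const lead = static_cast<unsigned char>(src[pos++]);
    std::size_t extra = 0;     // number of continuation bytes expected
    std::uint32_t lowest = 0;  // smallest code point valid for this length
    if (lead < 0x80) {
        cp = lead;
        return true;
    } else if ((lead & 0xE0) == 0xC0) {
        cp = lead & 0x1F; extra = 1; lowest = 0x80;
    } else if ((lead & 0xF0) == 0xE0) {
        cp = lead & 0x0F; extra = 2; lowest = 0x800;
    } else if ((lead & 0xF8) == 0xF0) {
        cp = lead & 0x07; extra = 3; lowest = 0x10000;
    } else {
        return false;  // stray continuation byte or invalid lead byte
    }
    for (; extra > 0; --extra) {
        if (pos >= src.size()) return false;  // truncated sequence
        auto const next = static_cast<unsigned char>(src[pos]);
        if ((next & 0xC0) != 0x80) return false;  // not a continuation byte
        ++pos;
        cp = (cp << 6) | (next & 0x3F);
    }
    if (cp < lowest) return false;                   // overlong encoding
    if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // surrogate range
    return cp <= 0x10FFFF;                           // beyond Unicode range
}

// Hypothetical helper: appends one code point to `out` as UTF-16.
inline void encode_utf16_sketch(std::uint32_t cp, std::u16string& out) {
    if (cp < 0x10000) {
        out.push_back(static_cast<char16_t>(cp));
        return;
    }
    cp -= 0x10000;  // split into a high/low surrogate pair
    out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
    out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
}

// Uses a 32-bit code point as the intermediate value and returns an
// empty string as soon as decoding fails, mirroring the text above.
inline std::u16string utf8_to_utf16_sketch(std::string_view src) {
    std::u16string out;
    std::size_t pos = 0;
    while (pos < src.size()) {
        std::uint32_t cp = 0;
        if (!decode_utf8_sketch(src, pos, cp)) return {};
        encode_utf16_sketch(cp, out);
    }
    return out;
}
```

With this sketch, a valid input produces the corresponding UTF-16 string, while a truncated sequence such as a lone 0xC3 byte yields an empty result. An empty result is also what a valid empty input produces, so a caller who needs to tell the two apart has to check validity separately.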
Conversion between string[_view] and u8string[_view] is done by simple
re-interpretation of the contents.
Versions marked with a "C++20" comment are only available if the standard
library defines __cpp_lib_char8_t.
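As an illustration of both points, the following sketch guards char8_t-based helpers with the same feature-test macro and converts between the two UTF-8 string types without decoding anything. The names to_u8_sketch and to_str8_sketch are hypothetical, and the element-wise copy is an assumed stand-in for whatever re-interpretation the library actually performs.

```cpp
#include <string>
#include <string_view>

#ifdef __cpp_lib_char8_t
// Only compiled when the standard library provides the char8_t string
// types, the same condition the "C++20" overloads depend on.

// Hypothetical helper: copies every byte unchanged into a std::u8string.
inline std::u8string to_u8_sketch(std::string_view src) {
    return std::u8string(src.begin(), src.end());
}

// Hypothetical helper: the reverse direction, again byte for byte.
inline std::string to_str8_sketch(std::u8string_view src) {
    return std::string(src.begin(), src.end());
}
#endif
```

Because no decoding takes place, such a conversion cannot fail and does not check that the bytes are valid UTF-8.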
bool utf::is_valid(std::u8string_view src); // C++20
bool utf::is_valid(std::string_view src);
bool utf::is_valid(std::u16string_view src);
Tries to decode the string one character at a time and returns false as
soon as decoding fails; otherwise, returns true. If utf::is_valid returns
false for an argument, then any as_xxx function will return an empty
string for the same argument.
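For the UTF-16 case, a validity check of this kind could look roughly like the sketch below. It is an assumption about what "decoding fails" means for UTF-16 (unpaired surrogates), not the library's code, and is_valid_utf16_sketch is a hypothetical name.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for a UTF-16 validity check: walks the input one
// code point at a time and stops at the first unpaired surrogate.
inline bool is_valid_utf16_sketch(std::u16string_view src) {
    for (std::size_t i = 0; i < src.size(); ++i) {
        char16_t const unit = src[i];
        if (unit >= 0xD800 && unit <= 0xDBFF) {
            // A high surrogate must be followed by a low surrogate.
            if (i + 1 >= src.size()) return false;
            char16_t const low = src[i + 1];
            if (low < 0xDC00 || low > 0xDFFF) return false;
            ++i;  // consume the low surrogate as well
        } else if (unit >= 0xDC00 && unit <= 0xDFFF) {
            return false;  // low surrogate without a preceding high one
        }
    }
    return true;
}
```

A caller can run such a check first to distinguish "invalid input" from "valid but empty input", since both produce an empty string when converted.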
bool utf::is_valid(std::u32string_view src);
Returns true.
std::u8string utf::as_u8(std::u16string_view src); // C++20
std::u8string utf::as_u8(std::u32string_view src); // C++20
std::u8string utf::as_u8(std::string_view src); // C++20
(C++20) Converts other UTF strings to std::u8string. The behavior is
that of utf::as_str8, except for the type of the character used.
std::string utf::as_str8(std::u8string_view src); // C++20
std::string utf::as_str8(std::u16string_view src);
std::string utf::as_str8(std::u32string_view src);
Converts other UTF strings to std::string encoded as UTF-8. If compiled as
C++20, the behavior is that of utf::as_u8, except for the type of the
character used.
std::u16string utf::as_u16(std::u8string_view src); // C++20
std::u16string utf::as_u16(std::string_view src);
std::u16string utf::as_u16(std::u32string_view src);
Converts other UTF strings to std::u16string.
std::u32string utf::as_u32(std::u8string_view src); // C++20
std::u32string utf::as_u32(std::string_view src);
std::u32string utf::as_u32(std::u16string_view src);
Converts other UTF strings to std::u32string.
#include <utf/version.hpp>
constexpr semver::project_version utf::version;
Current version of the library to link against.
semver::project_version utf::get_version();
Current version of the loaded library (when linked dynamically) or the
same value as utf::version (when linked statically).
#define UTFCONV_NAME "utfconv"
Name of the library.
#define UTFCONV_VERSION_MAJOR
#define UTFCONV_VERSION_MINOR
#define UTFCONV_VERSION_PATCH
#define UTFCONV_VERSION_STABILITY
C macros representing the same information as the utf::version variable;
that is, UTFCONV_VERSION_MAJOR / MINOR / PATCH have the same values as
utf::version.get_major() / get_minor() / get_patch().
UTFCONV_VERSION_STABILITY contains the same string that would be returned
by utf::version.get_prerelease().to_string(), that is, either an empty
string or a string starting with a hyphen, for easy concatenation into
version strings.
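The concatenation mentioned above can be sketched with preprocessor stringification. The EXAMPLE_* macros below are hypothetical stand-ins with made-up values; the sketch assumes the numeric macros expand to plain integer literals and the stability macro to a string literal, which the real header may or may not do.

```cpp
#include <string>

// Hypothetical stand-ins for the UTFCONV_VERSION_* macros; the values
// are invented for this example.
#define EXAMPLE_VERSION_MAJOR 1
#define EXAMPLE_VERSION_MINOR 2
#define EXAMPLE_VERSION_PATCH 3
#define EXAMPLE_VERSION_STABILITY "-beta"

// Two-level stringification so the macro argument is expanded first.
#define EXAMPLE_STR_IMPL(x) #x
#define EXAMPLE_STR(x) EXAMPLE_STR_IMPL(x)

// Adjacent string literals are joined by the compiler; the leading
// hyphen in the stability string means no separator is needed.
#define EXAMPLE_VERSION_STRING                 \
    EXAMPLE_STR(EXAMPLE_VERSION_MAJOR) "."     \
    EXAMPLE_STR(EXAMPLE_VERSION_MINOR) "."     \
    EXAMPLE_STR(EXAMPLE_VERSION_PATCH)         \
    EXAMPLE_VERSION_STABILITY

inline std::string example_version_string() {
    return EXAMPLE_VERSION_STRING;
}
```

With an empty stability string the same macro yields a plain "major.minor.patch" string, which is the point of the leading hyphen.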