annotate dep/utf8proc/utf8proc.h @ 337:a7d4e5107531

dep/animone: REFACTOR ALL THE THINGS
1: animone now has its own syntax divergent from anisthesia, making different platforms actually have their own sections
2: process names in animone are now called `comm' (this will probably break things). this is what it's called in bsd/linux so I'm just going to use it everywhere
3: the X11 code now checks for the existence of a UTF-8 window title and passes it if available
4: ANYTHING THAT'S NOT LINUX IS 100% UNTESTED AND CAN AND WILL BREAK! I still actually need to test the bsd code. to be honest I'm probably going to move all of the bsds into separate files because they're all essentially different operating systems at this point
author Paper <paper@paper.us.eu.org>
date Wed, 19 Jun 2024 12:51:15 -0400
parents ff0b2052b234
children
rev   line source
265
ff0b2052b234 *: add missing utf8proc files
Paper <paper@paper.us.eu.org>
parents:
diff changeset
/*
 * Copyright (c) 2014-2021 Steven G. Johnson, Jiahao Chen, Peter Colberg, Tony Kelman, Scott P. Jones, and other contributors.
 * Copyright (c) 2009 Public Software Group e. V., Berlin, Germany
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
 * DEALINGS IN THE SOFTWARE.
 */


/**
 * @mainpage
 *
 * utf8proc is a free/open-source (MIT/expat licensed) C library
 * providing Unicode normalization, case-folding, and other operations
 * for strings in the UTF-8 encoding, supporting up-to-date Unicode versions.
 * See the utf8proc home page (http://julialang.org/utf8proc/)
 * for downloads and other information, or the source code on github
 * (https://github.com/JuliaLang/utf8proc).
 *
 * For the utf8proc API documentation, see: @ref utf8proc.h
 *
 * The features of utf8proc include:
 *
 * - Transformation of strings (@ref utf8proc_map) to:
 *    - decompose (@ref UTF8PROC_DECOMPOSE) or compose (@ref UTF8PROC_COMPOSE) Unicode combining characters (http://en.wikipedia.org/wiki/Combining_character)
 *    - canonicalize Unicode compatibility characters (@ref UTF8PROC_COMPAT)
 *    - strip "ignorable" (@ref UTF8PROC_IGNORE) characters, control characters (@ref UTF8PROC_STRIPCC), or combining characters such as accents (@ref UTF8PROC_STRIPMARK)
 *    - case-folding (@ref UTF8PROC_CASEFOLD)
 * - Unicode normalization: @ref utf8proc_NFD, @ref utf8proc_NFC, @ref utf8proc_NFKD, @ref utf8proc_NFKC
 * - Detecting grapheme boundaries (@ref utf8proc_grapheme_break and @ref UTF8PROC_CHARBOUND)
 * - Character-width computation: @ref utf8proc_charwidth
 * - Classification of characters by Unicode category: @ref utf8proc_category and @ref utf8proc_category_string
 * - Encode (@ref utf8proc_encode_char) and decode (@ref utf8proc_iterate) Unicode codepoints to/from UTF-8.
 */

/** @file */

#ifndef UTF8PROC_H
#define UTF8PROC_H

/** @name API version
 *
 * The utf8proc API version MAJOR.MINOR.PATCH, following
 * semantic-versioning rules (http://semver.org) based on API
 * compatibility.
 *
 * This is also returned at runtime by @ref utf8proc_version; however, the
 * runtime version may append a string like "-dev" to the version number
 * for prerelease versions.
 *
 * @note The shared-library version number in the Makefile
 *       (and CMakeLists.txt, and MANIFEST) may be different,
 *       being based on ABI compatibility rather than API compatibility.
 */
/** @{ */
/** The MAJOR version number (increased when backwards API compatibility is broken). */
#define UTF8PROC_VERSION_MAJOR 2
/** The MINOR version number (increased when new functionality is added in a backwards-compatible manner). */
#define UTF8PROC_VERSION_MINOR 9
/** The PATCH version (increased for fixes that do not change the API). */
#define UTF8PROC_VERSION_PATCH 0
/** @} */

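The three version macros above can be compared at build time or from ordinary code. The following is a minimal sketch, not part of utf8proc: `api_version_at_least` and `version_example` are hypothetical names, and the macro values are stand-ins copied from this listing so the snippet compiles without the header.

```c
#include <assert.h>

/* Stand-ins copied from the listing; skipped when the real header is included. */
#ifndef UTF8PROC_VERSION_MAJOR
#  define UTF8PROC_VERSION_MAJOR 2
#  define UTF8PROC_VERSION_MINOR 9
#  define UTF8PROC_VERSION_PATCH 0
#endif

/* Hypothetical helper (not a utf8proc function): returns 1 if the header's
 * API version is at least major.minor.patch, comparing field by field. */
static int api_version_at_least(int major, int minor, int patch) {
  assert(major >= 0 && minor >= 0 && patch >= 0);
  if (UTF8PROC_VERSION_MAJOR != major) return UTF8PROC_VERSION_MAJOR > major;
  if (UTF8PROC_VERSION_MINOR != minor) return UTF8PROC_VERSION_MINOR > minor;
  return UTF8PROC_VERSION_PATCH >= patch;
}

/* Usage sketch against the 2.9.0 values shown in this listing. */
void version_example(void) {
  assert(api_version_at_least(2, 9, 0));   /* exactly the listed version */
  assert(api_version_at_least(2, 8, 5));   /* an older minor release */
  assert(!api_version_at_least(2, 10, 0)); /* a newer minor release */
}
```

Note that this only inspects the compile-time API version; as the comment above says, the runtime string may carry a suffix such as "-dev", and the shared-library version follows ABI rather than API compatibility.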
#include <stdlib.h>

#if defined(_MSC_VER) && _MSC_VER < 1800
// MSVC prior to 2013 lacked stdbool.h and inttypes.h
typedef signed char utf8proc_int8_t;
typedef unsigned char utf8proc_uint8_t;
typedef short utf8proc_int16_t;
typedef unsigned short utf8proc_uint16_t;
typedef int utf8proc_int32_t;
typedef unsigned int utf8proc_uint32_t;
#  ifdef _WIN64
typedef __int64 utf8proc_ssize_t;
typedef unsigned __int64 utf8proc_size_t;
#  else
typedef int utf8proc_ssize_t;
typedef unsigned int utf8proc_size_t;
#  endif
#  ifndef __cplusplus
// emulate C99 bool
typedef unsigned char utf8proc_bool;
#    ifndef __bool_true_false_are_defined
#      define false 0
#      define true 1
#      define __bool_true_false_are_defined 1
#    endif
#  else
typedef bool utf8proc_bool;
#  endif
#else
#  include <stddef.h>
#  include <stdbool.h>
#  include <inttypes.h>
typedef int8_t utf8proc_int8_t;
typedef uint8_t utf8proc_uint8_t;
typedef int16_t utf8proc_int16_t;
typedef uint16_t utf8proc_uint16_t;
typedef int32_t utf8proc_int32_t;
typedef uint32_t utf8proc_uint32_t;
typedef size_t utf8proc_size_t;
typedef ptrdiff_t utf8proc_ssize_t;
typedef bool utf8proc_bool;
#endif
#include <limits.h>

#ifdef UTF8PROC_STATIC
#  define UTF8PROC_DLLEXPORT
#else
#  ifdef _WIN32
#    ifdef UTF8PROC_EXPORTS
#      define UTF8PROC_DLLEXPORT __declspec(dllexport)
#    else
#      define UTF8PROC_DLLEXPORT __declspec(dllimport)
#    endif
#  elif __GNUC__ >= 4
#    define UTF8PROC_DLLEXPORT __attribute__ ((visibility("default")))
#  else
#    define UTF8PROC_DLLEXPORT
#  endif
#endif

#ifdef __cplusplus
extern "C" {
#endif

/**
 * Option flags used by several functions in the library.
 */
typedef enum {
  /** The given UTF-8 input is NUL-terminated. */
  UTF8PROC_NULLTERM  = (1<<0),
  /** Unicode Versioning Stability has to be respected. */
  UTF8PROC_STABLE    = (1<<1),
  /** Compatibility decomposition (i.e. formatting information is lost). */
  UTF8PROC_COMPAT    = (1<<2),
  /** Return a result with composed characters. */
  UTF8PROC_COMPOSE   = (1<<3),
  /** Return a result with decomposed characters. */
  UTF8PROC_DECOMPOSE = (1<<4),
  /** Strip "default ignorable characters" such as SOFT-HYPHEN or ZERO-WIDTH-SPACE. */
  UTF8PROC_IGNORE    = (1<<5),
  /** Return an error if the input contains unassigned codepoints. */
  UTF8PROC_REJECTNA  = (1<<6),
  /**
   * Indicates that NLF-sequences (LF, CRLF, CR, NEL) represent a
   * line break, and should be converted to the codepoint for line
   * separation (LS).
   */
  UTF8PROC_NLF2LS    = (1<<7),
  /**
   * Indicates that NLF-sequences represent a paragraph break, and
   * should be converted to the codepoint for paragraph separation
   * (PS).
   */
  UTF8PROC_NLF2PS    = (1<<8),
  /** Indicates that the meaning of NLF-sequences is unknown. */
  UTF8PROC_NLF2LF    = (UTF8PROC_NLF2LS | UTF8PROC_NLF2PS),
  /** Strips and/or converts control characters.
   *
   * NLF-sequences are transformed into space, except if one of the
   * NLF2LS/PS/LF options is given. HorizontalTab (HT) and FormFeed (FF)
   * are treated as an NLF-sequence in this case. All other control
   * characters are simply removed.
   */
  UTF8PROC_STRIPCC   = (1<<9),
  /**
   * Performs Unicode case folding, to be able to do a case-insensitive
   * string comparison.
   */
  UTF8PROC_CASEFOLD  = (1<<10),
  /**
   * Inserts 0xFF bytes at the beginning of each sequence that
   * represents a single grapheme cluster (see UAX#29).
   */
  UTF8PROC_CHARBOUND = (1<<11),
  /** Lumps certain characters together.
   *
   * E.g. HYPHEN U+2010 and MINUS U+2212 to ASCII "-". See lump.md for details.
   *
   * If NLF2LF is set, this includes a transformation of paragraph and
   * line separators to ASCII line-feed (LF).
   */
  UTF8PROC_LUMP      = (1<<12),
  /** Strips all character markings.
   *
   * This includes non-spacing, spacing and enclosing marks (e.g. accents).
   * @note This option works only with @ref UTF8PROC_COMPOSE or
   *       @ref UTF8PROC_DECOMPOSE
   */
  UTF8PROC_STRIPMARK = (1<<13),
  /**
   * Strip unassigned codepoints.
   */
  UTF8PROC_STRIPNA   = (1<<14),
} utf8proc_option_t;

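Because each option is a distinct bit, options are combined with bitwise OR and tested with bitwise AND. The sketch below illustrates two details from the comments above: `UTF8PROC_NLF2LF` occupies two bits, so a test must compare against the whole mask, and `UTF8PROC_STRIPMARK` is documented to work only together with `UTF8PROC_COMPOSE` or `UTF8PROC_DECOMPOSE`. The helpers `has_option`, `stripmark_is_usable` and `option_flags_example` are hypothetical and not part of utf8proc; the enum is a stand-in copy of a few values from this listing so the snippet builds on its own.

```c
#include <assert.h>

#ifndef UTF8PROC_H
/* Stand-in copies of a few flags from the listing, so the sketch builds
 * without utf8proc.h. With the real header included they are skipped. */
typedef enum {
  UTF8PROC_STABLE    = (1<<1),
  UTF8PROC_COMPAT    = (1<<2),
  UTF8PROC_COMPOSE   = (1<<3),
  UTF8PROC_DECOMPOSE = (1<<4),
  UTF8PROC_NLF2LS    = (1<<7),
  UTF8PROC_NLF2PS    = (1<<8),
  UTF8PROC_NLF2LF    = (UTF8PROC_NLF2LS | UTF8PROC_NLF2PS),
  UTF8PROC_STRIPMARK = (1<<13)
} utf8proc_option_t;
#endif

/* Hypothetical helper: 1 only if every bit of `flag` is set in `options`. */
static int has_option(int options, int flag) {
  assert(flag != 0); /* an empty mask would match trivially */
  return (options & flag) == flag;
}

/* Hypothetical helper: checks the documented STRIPMARK precondition. */
static int stripmark_is_usable(int options) {
  if (!has_option(options, UTF8PROC_STRIPMARK)) return 1;
  return has_option(options, UTF8PROC_COMPOSE) ||
         has_option(options, UTF8PROC_DECOMPOSE);
}

/* Usage sketch. */
void option_flags_example(void) {
  int opts = UTF8PROC_STABLE | UTF8PROC_COMPOSE | UTF8PROC_COMPAT;
  assert(has_option(opts, UTF8PROC_COMPOSE));
  assert(!has_option(opts, UTF8PROC_DECOMPOSE));
  /* NLF2LF is two bits: NLF2LS alone does not satisfy it. */
  assert(!has_option(UTF8PROC_NLF2LS, UTF8PROC_NLF2LF));
  assert(stripmark_is_usable(opts | UTF8PROC_STRIPMARK));
  assert(!stripmark_is_usable(UTF8PROC_STRIPMARK));
}
```

Comparing `(options & flag) == flag` rather than testing for a non-zero result is the design choice that keeps the two-bit `UTF8PROC_NLF2LF` from being confused with either of its component flags.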
/** @name Error codes
 * Error codes returned by almost all functions.
 */
/** @{ */
/** Memory could not be allocated. */
#define UTF8PROC_ERROR_NOMEM -1
/** The given string is too long to be processed. */
#define UTF8PROC_ERROR_OVERFLOW -2
/** The given string is not a legal UTF-8 string. */
#define UTF8PROC_ERROR_INVALIDUTF8 -3
/** The @ref UTF8PROC_REJECTNA flag was set and an unassigned codepoint was found. */
#define UTF8PROC_ERROR_NOTASSIGNED -4
/** Invalid options have been used. */
#define UTF8PROC_ERROR_INVALIDOPTS -5
/** @} */

/* @name Types */

/** Holds the value of a property. */
typedef utf8proc_int16_t utf8proc_propval_t;

/** Struct containing information about a codepoint. */
typedef struct utf8proc_property_struct {
  /**
   * Unicode category.
   * @see utf8proc_category_t.
   */
  utf8proc_propval_t category;
  utf8proc_propval_t combining_class;
  /**
   * Bidirectional class.
   * @see utf8proc_bidi_class_t.
   */
  utf8proc_propval_t bidi_class;
  /**
   * @anchor Decomposition type.
   * @see utf8proc_decomp_type_t.
   */
  utf8proc_propval_t decomp_type;
  utf8proc_uint16_t decomp_seqindex;
  utf8proc_uint16_t casefold_seqindex;
  utf8proc_uint16_t uppercase_seqindex;
  utf8proc_uint16_t lowercase_seqindex;
  utf8proc_uint16_t titlecase_seqindex;
  utf8proc_uint16_t comb_index;
  unsigned bidi_mirrored:1;
  unsigned comp_exclusion:1;
  /**
   * Can this codepoint be ignored?
   *
   * Used by @ref utf8proc_decompose_char when @ref UTF8PROC_IGNORE is
   * passed as an option.
   */
  unsigned ignorable:1;
  unsigned control_boundary:1;
  /** The width of the codepoint. */
  unsigned charwidth:2;
  unsigned pad:2;
  /**
   * Boundclass.
   * @see utf8proc_boundclass_t.
   */
  unsigned boundclass:6;
  unsigned indic_conjunct_break:2;
} utf8proc_property_t;

/** Unicode categories. */
typedef enum {
  UTF8PROC_CATEGORY_CN = 0, /**< Other, not assigned */
  UTF8PROC_CATEGORY_LU = 1, /**< Letter, uppercase */
  UTF8PROC_CATEGORY_LL = 2, /**< Letter, lowercase */
  UTF8PROC_CATEGORY_LT = 3, /**< Letter, titlecase */
  UTF8PROC_CATEGORY_LM = 4, /**< Letter, modifier */
  UTF8PROC_CATEGORY_LO = 5, /**< Letter, other */
  UTF8PROC_CATEGORY_MN = 6, /**< Mark, nonspacing */
  UTF8PROC_CATEGORY_MC = 7, /**< Mark, spacing combining */
  UTF8PROC_CATEGORY_ME = 8, /**< Mark, enclosing */
  UTF8PROC_CATEGORY_ND = 9, /**< Number, decimal digit */
  UTF8PROC_CATEGORY_NL = 10, /**< Number, letter */
  UTF8PROC_CATEGORY_NO = 11, /**< Number, other */
  UTF8PROC_CATEGORY_PC = 12, /**< Punctuation, connector */
  UTF8PROC_CATEGORY_PD = 13, /**< Punctuation, dash */
  UTF8PROC_CATEGORY_PS = 14, /**< Punctuation, open */
  UTF8PROC_CATEGORY_PE = 15, /**< Punctuation, close */
  UTF8PROC_CATEGORY_PI = 16, /**< Punctuation, initial quote */
  UTF8PROC_CATEGORY_PF = 17, /**< Punctuation, final quote */
  UTF8PROC_CATEGORY_PO = 18, /**< Punctuation, other */
  UTF8PROC_CATEGORY_SM = 19, /**< Symbol, math */
  UTF8PROC_CATEGORY_SC = 20, /**< Symbol, currency */
  UTF8PROC_CATEGORY_SK = 21, /**< Symbol, modifier */
  UTF8PROC_CATEGORY_SO = 22, /**< Symbol, other */
  UTF8PROC_CATEGORY_ZS = 23, /**< Separator, space */
  UTF8PROC_CATEGORY_ZL = 24, /**< Separator, line */
  UTF8PROC_CATEGORY_ZP = 25, /**< Separator, paragraph */
  UTF8PROC_CATEGORY_CC = 26, /**< Other, control */
  UTF8PROC_CATEGORY_CF = 27, /**< Other, format */
  UTF8PROC_CATEGORY_CS = 28, /**< Other, surrogate */
  UTF8PROC_CATEGORY_CO = 29, /**< Other, private use */
} utf8proc_category_t;

/** Bidirectional character classes. */
typedef enum {
  UTF8PROC_BIDI_CLASS_L = 1, /**< Left-to-Right */
  UTF8PROC_BIDI_CLASS_LRE = 2, /**< Left-to-Right Embedding */
  UTF8PROC_BIDI_CLASS_LRO = 3, /**< Left-to-Right Override */
  UTF8PROC_BIDI_CLASS_R = 4, /**< Right-to-Left */
  UTF8PROC_BIDI_CLASS_AL = 5, /**< Right-to-Left Arabic */
  UTF8PROC_BIDI_CLASS_RLE = 6, /**< Right-to-Left Embedding */
  UTF8PROC_BIDI_CLASS_RLO = 7, /**< Right-to-Left Override */
  UTF8PROC_BIDI_CLASS_PDF = 8, /**< Pop Directional Format */
  UTF8PROC_BIDI_CLASS_EN = 9, /**< European Number */
  UTF8PROC_BIDI_CLASS_ES = 10, /**< European Separator */
  UTF8PROC_BIDI_CLASS_ET = 11, /**< European Number Terminator */
  UTF8PROC_BIDI_CLASS_AN = 12, /**< Arabic Number */
  UTF8PROC_BIDI_CLASS_CS = 13, /**< Common Number Separator */
  UTF8PROC_BIDI_CLASS_NSM = 14, /**< Nonspacing Mark */
  UTF8PROC_BIDI_CLASS_BN = 15, /**< Boundary Neutral */
  UTF8PROC_BIDI_CLASS_B = 16, /**< Paragraph Separator */
  UTF8PROC_BIDI_CLASS_S = 17, /**< Segment Separator */
  UTF8PROC_BIDI_CLASS_WS = 18, /**< Whitespace */
  UTF8PROC_BIDI_CLASS_ON = 19, /**< Other Neutrals */
  UTF8PROC_BIDI_CLASS_LRI = 20, /**< Left-to-Right Isolate */
  UTF8PROC_BIDI_CLASS_RLI = 21, /**< Right-to-Left Isolate */
  UTF8PROC_BIDI_CLASS_FSI = 22, /**< First Strong Isolate */
  UTF8PROC_BIDI_CLASS_PDI = 23, /**< Pop Directional Isolate */
} utf8proc_bidi_class_t;

/** Decomposition type. */
typedef enum {
  UTF8PROC_DECOMP_TYPE_FONT = 1, /**< Font */
  UTF8PROC_DECOMP_TYPE_NOBREAK = 2, /**< Nobreak */
  UTF8PROC_DECOMP_TYPE_INITIAL = 3, /**< Initial */
  UTF8PROC_DECOMP_TYPE_MEDIAL = 4, /**< Medial */
  UTF8PROC_DECOMP_TYPE_FINAL = 5, /**< Final */
  UTF8PROC_DECOMP_TYPE_ISOLATED = 6, /**< Isolated */
  UTF8PROC_DECOMP_TYPE_CIRCLE = 7, /**< Circle */
  UTF8PROC_DECOMP_TYPE_SUPER = 8, /**< Super */
  UTF8PROC_DECOMP_TYPE_SUB = 9, /**< Sub */
  UTF8PROC_DECOMP_TYPE_VERTICAL = 10, /**< Vertical */
  UTF8PROC_DECOMP_TYPE_WIDE = 11, /**< Wide */
  UTF8PROC_DECOMP_TYPE_NARROW = 12, /**< Narrow */
  UTF8PROC_DECOMP_TYPE_SMALL = 13, /**< Small */
  UTF8PROC_DECOMP_TYPE_SQUARE = 14, /**< Square */
  UTF8PROC_DECOMP_TYPE_FRACTION = 15, /**< Fraction */
  UTF8PROC_DECOMP_TYPE_COMPAT = 16, /**< Compat */
} utf8proc_decomp_type_t;

/** Boundclass property. (TR29) */
typedef enum {
  UTF8PROC_BOUNDCLASS_START = 0, /**< Start */
  UTF8PROC_BOUNDCLASS_OTHER = 1, /**< Other */
  UTF8PROC_BOUNDCLASS_CR = 2, /**< Cr */
  UTF8PROC_BOUNDCLASS_LF = 3, /**< Lf */
  UTF8PROC_BOUNDCLASS_CONTROL = 4, /**< Control */
  UTF8PROC_BOUNDCLASS_EXTEND = 5, /**< Extend */
  UTF8PROC_BOUNDCLASS_L = 6, /**< L */
  UTF8PROC_BOUNDCLASS_V = 7, /**< V */
  UTF8PROC_BOUNDCLASS_T = 8, /**< T */
  UTF8PROC_BOUNDCLASS_LV = 9, /**< Lv */
  UTF8PROC_BOUNDCLASS_LVT = 10, /**< Lvt */
  UTF8PROC_BOUNDCLASS_REGIONAL_INDICATOR = 11, /**< Regional indicator */
  UTF8PROC_BOUNDCLASS_SPACINGMARK = 12, /**< Spacingmark */
  UTF8PROC_BOUNDCLASS_PREPEND = 13, /**< Prepend */
  UTF8PROC_BOUNDCLASS_ZWJ = 14, /**< Zero Width Joiner */

  /* the following are no longer used in Unicode 11, but we keep
     the constants here for backward compatibility */
  UTF8PROC_BOUNDCLASS_E_BASE = 15, /**< Emoji Base */
  UTF8PROC_BOUNDCLASS_E_MODIFIER = 16, /**< Emoji Modifier */
  UTF8PROC_BOUNDCLASS_GLUE_AFTER_ZWJ = 17, /**< Glue_After_ZWJ */
  UTF8PROC_BOUNDCLASS_E_BASE_GAZ = 18, /**< E_BASE + GLUE_AFTER_ZWJ */

  /* the Extended_Pictographic property is used in the Unicode 11
     grapheme-boundary rules, so we store it in the boundclass field */
  UTF8PROC_BOUNDCLASS_EXTENDED_PICTOGRAPHIC = 19,
  UTF8PROC_BOUNDCLASS_E_ZWG = 20, /* UTF8PROC_BOUNDCLASS_EXTENDED_PICTOGRAPHIC + ZWJ */
} utf8proc_boundclass_t;

/** Indic_Conjunct_Break property. (TR44) */
typedef enum {
  UTF8PROC_INDIC_CONJUNCT_BREAK_NONE = 0,
  UTF8PROC_INDIC_CONJUNCT_BREAK_LINKER = 1,
  UTF8PROC_INDIC_CONJUNCT_BREAK_CONSONANT = 2,
  UTF8PROC_INDIC_CONJUNCT_BREAK_EXTEND = 3,
} utf8proc_indic_conjunct_break_t;

/**
 * Function pointer type passed to @ref utf8proc_map_custom and
 * @ref utf8proc_decompose_custom, which is used to specify a user-defined
 * mapping of codepoints to be applied in conjunction with other mappings.
 */
typedef utf8proc_int32_t (*utf8proc_custom_func)(utf8proc_int32_t codepoint, void *data);

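A minimal sketch of a callback with the `utf8proc_custom_func` shape, assuming the caller passes a small replacement table through `data`. All `sketch_*` names and the table type are hypothetical, and the typedef is repeated locally (with `int32_t` standing in for `utf8proc_int32_t`) so the fragment compiles without the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of utf8proc_custom_func; int32_t stands in for utf8proc_int32_t. */
typedef int32_t (*sketch_custom_func)(int32_t codepoint, void *data);

/* Hypothetical user data: a list of (from, to) codepoint pairs. */
typedef struct {
    const int32_t (*pairs)[2];
    size_t count;
} sketch_replacements;

/* Returns the replacement for `codepoint` if the table has one,
 * otherwise returns `codepoint` unchanged. */
static int32_t sketch_replace(int32_t codepoint, void *data)
{
    const sketch_replacements *table = (const sketch_replacements *)data;
    assert(table != NULL);
    for (size_t i = 0; i < table->count; i++) {
        if (table->pairs[i][0] == codepoint)
            return table->pairs[i][1];
    }
    return codepoint;
}

/* Applies `func` to every element of `buf` in place, the way a mapping
 * pass might invoke a user-supplied callback once per codepoint. */
static void sketch_apply(int32_t *buf, size_t len,
                         sketch_custom_func func, void *data)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = func(buf[i], data);
}
```

The `data` pointer is what lets one callback serve several tables without global state. How the library itself sequences the callback relative to its other mappings is not shown here.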
/**
 * Array containing the byte lengths of a UTF-8 encoded codepoint based
 * on the first byte.
 */
UTF8PROC_DLLEXPORT extern const utf8proc_int8_t utf8proc_utf8class[256];

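The idea of deriving a sequence length from the first byte can be sketched without the table. This illustration is based on the UTF-8 lead-byte bit patterns and is not a copy of the contents of `utf8proc_utf8class`. The names `sketch_utf8_length` and `sketch_fill_utf8class` are hypothetical, and returning 0 for bytes that cannot start a sequence is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Length in bytes of the UTF-8 sequence that starts with `first`,
 * or 0 if `first` cannot start a sequence (assumption of this sketch). */
static int sketch_utf8_length(uint8_t first)
{
    if (first < 0x80) return 1;           /* 0xxxxxxx: ASCII */
    if ((first & 0xE0) == 0xC0) return 2; /* 110xxxxx */
    if ((first & 0xF0) == 0xE0) return 3; /* 1110xxxx */
    if ((first & 0xF8) == 0xF0) return 4; /* 11110xxx */
    return 0;                             /* continuation byte or invalid lead */
}

/* Fills a 256-entry lookup table indexed by first byte, mirroring the
 * shape (not necessarily the values) of the exported array. */
static void sketch_fill_utf8class(int8_t table[256])
{
    assert(table != NULL);
    for (int b = 0; b < 256; b++)
        table[b] = (int8_t)sketch_utf8_length((uint8_t)b);
}
```

A precomputed table trades 256 bytes of storage for a branch-free lookup on the first byte.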
/**
 * Returns the utf8proc API version as a string MAJOR.MINOR.PATCH
 * (http://semver.org format), possibly with a "-dev" suffix for
 * development versions.
 */
UTF8PROC_DLLEXPORT const char *utf8proc_version(void);

/**
 * Returns the utf8proc supported Unicode version as a string MAJOR.MINOR.PATCH.
 */
UTF8PROC_DLLEXPORT const char *utf8proc_unicode_version(void);

/**
 * Returns an informative error string for the given utf8proc error code
 * (e.g. the error codes returned by @ref utf8proc_map).
 */
UTF8PROC_DLLEXPORT const char *utf8proc_errmsg(utf8proc_ssize_t errcode);

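As an illustration of mapping the negative error codes to text, here is a self-contained sketch. The numeric values are the ones defined for the `UTF8PROC_ERROR_*` macros in this header; the function `sketch_errmsg`, the `SKETCH_*` macro names, and the message wording (taken from the macro comments, not from the library's own strings) are assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

/* Same numeric values as the UTF8PROC_ERROR_* macros in this header. */
#define SKETCH_ERROR_NOMEM       (-1)
#define SKETCH_ERROR_OVERFLOW    (-2)
#define SKETCH_ERROR_INVALIDUTF8 (-3)
#define SKETCH_ERROR_NOTASSIGNED (-4)
#define SKETCH_ERROR_INVALIDOPTS (-5)

/* Hypothetical stand-in for utf8proc_errmsg: returns a static string
 * describing `errcode`. The wording follows the header comments. */
static const char *sketch_errmsg(long errcode)
{
    switch (errcode) {
    case SKETCH_ERROR_NOMEM:
        return "Memory could not be allocated.";
    case SKETCH_ERROR_OVERFLOW:
        return "The given string is too long to be processed.";
    case SKETCH_ERROR_INVALIDUTF8:
        return "The given string is not a legal UTF-8 string.";
    case SKETCH_ERROR_NOTASSIGNED:
        return "An unassigned codepoint was found.";
    case SKETCH_ERROR_INVALIDOPTS:
        return "Invalid options have been used.";
    default:
        return "Unknown error code.";
    }
}
```

A caller would typically test a returned length for `< 0` and only then look up the message.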
/**
 * Reads a single codepoint from the UTF-8 sequence pointed to by `str`.
 * The maximum number of bytes read is `strlen`, unless `strlen` is
 * negative (in which case up to 4 bytes are read).
 *
 * If a valid codepoint could be read, it is stored in the variable
 * pointed to by `codepoint_ref`; otherwise that variable is set to -1.
 * On success, the number of bytes read is returned; otherwise, a
 * negative error code is returned.
 */
UTF8PROC_DLLEXPORT utf8proc_ssize_t utf8proc_iterate(const utf8proc_uint8_t *str, utf8proc_ssize_t strlen, utf8proc_int32_t *codepoint_ref);

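The contract documented for `utf8proc_iterate` (bytes consumed on success, a negative code and -1 in the output variable on failure) leads to a simple decoding loop. The sketch below uses a hypothetical stand-in, `sketch_iterate`, written from the UTF-8 encoding rules so that the fragment compiles without the library. It is not utf8proc's implementation, and returning 0 for an empty input is an assumption of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ERROR_INVALIDUTF8 (-3) /* same value as UTF8PROC_ERROR_INVALIDUTF8 */

/* Hypothetical stand-in for utf8proc_iterate: decodes one codepoint from
 * `str`, reading at most `len` bytes (or up to 4 if `len` is negative). */
static ptrdiff_t sketch_iterate(const uint8_t *str, ptrdiff_t len, int32_t *cp)
{
    assert(str != NULL && cp != NULL);
    *cp = -1;
    if (len == 0) return 0;
    ptrdiff_t avail = (len < 0 || len > 4) ? 4 : len;
    uint8_t b0 = str[0];
    ptrdiff_t need;
    int32_t value, min;
    if (b0 < 0x80) { *cp = b0; return 1; }
    else if ((b0 & 0xE0) == 0xC0) { need = 2; value = b0 & 0x1F; min = 0x80; }
    else if ((b0 & 0xF0) == 0xE0) { need = 3; value = b0 & 0x0F; min = 0x800; }
    else if ((b0 & 0xF8) == 0xF0) { need = 4; value = b0 & 0x07; min = 0x10000; }
    else return SKETCH_ERROR_INVALIDUTF8;
    if (need > avail) return SKETCH_ERROR_INVALIDUTF8;
    for (ptrdiff_t i = 1; i < need; i++) {
        if ((str[i] & 0xC0) != 0x80) return SKETCH_ERROR_INVALIDUTF8;
        value = (value << 6) | (str[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates, and values past U+10FFFF. */
    if (value < min || value > 0x10FFFF || (value >= 0xD800 && value <= 0xDFFF))
        return SKETCH_ERROR_INVALIDUTF8;
    *cp = value;
    return need;
}

/* Counts the codepoints in `str[0..len)`, or returns the negative error
 * code of the first sequence that fails to decode. */
static ptrdiff_t sketch_count_codepoints(const uint8_t *str, ptrdiff_t len)
{
    ptrdiff_t count = 0;
    while (len > 0) {
        int32_t cp;
        ptrdiff_t n = sketch_iterate(str, len, &cp);
        if (n < 0) return n;
        str += n;
        len -= n;
        count++;
    }
    return count;
}
```

The loop advances by the returned byte count and stops at the first negative result, which is the pattern the return-value contract is designed for.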
/**
 * Check if a codepoint is valid (regardless of whether it has been
 * assigned a value by the current Unicode standard).
 *
 * @return 1 if the given `codepoint` is valid, otherwise 0.
 */
UTF8PROC_DLLEXPORT utf8proc_bool utf8proc_codepoint_valid(utf8proc_int32_t codepoint);

451 /**
ff0b2052b234 *: add missing utf8proc files
Paper <paper@paper.us.eu.org>
parents:
diff changeset
452 * Encodes the codepoint as an UTF-8 string in the byte array pointed
ff0b2052b234 *: add missing utf8proc files
Paper <paper@paper.us.eu.org>
parents:
diff changeset
453 * to by `dst`. This array must be at least 4 bytes long.
ff0b2052b234 *: add missing utf8proc files
Paper <paper@paper.us.eu.org>
parents:
diff changeset
454 *
ff0b2052b234 *: add missing utf8proc files
Paper <paper@paper.us.eu.org>
parents:
diff changeset
455 * In case of success the number of bytes written is returned, and
ff0b2052b234 *: add missing utf8proc files
Paper <paper@paper.us.eu.org>
parents:
diff changeset
456 * otherwise 0 is returned.
ff0b2052b234 *: add missing utf8proc files
Paper <paper@paper.us.eu.org>
parents:
diff changeset
457 *
ff0b2052b234 *: add missing utf8proc files
Paper <paper@paper.us.eu.org>
parents:
diff changeset
458 * This function does not check whether `codepoint` is valid Unicode.
ff0b2052b234 *: add missing utf8proc files
Paper <paper@paper.us.eu.org>
parents:
diff changeset
459 */
ff0b2052b234 *: add missing utf8proc files
Paper <paper@paper.us.eu.org>
parents:
diff changeset
460 UTF8PROC_DLLEXPORT utf8proc_ssize_t utf8proc_encode_char(utf8proc_int32_t codepoint, utf8proc_uint8_t *dst);
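The contract above (a destination of at least 4 bytes, a return value of the byte count or 0) matches the standard UTF-8 bit layout. The following is a minimal standalone sketch of that layout; `sketch_encode_char` is a hypothetical stand-in, not the library routine, and like the documented function it does not reject surrogates.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in (not part of utf8proc): writes 1 to 4 bytes to `dst`
 * and returns the count, or returns 0 if `cp` is outside 0..0x10FFFF. */
ptrdiff_t sketch_encode_char(int32_t cp, uint8_t *dst)
{
    if (cp < 0)
        return 0;
    if (cp < 0x80) {                       /* 0xxxxxxx */
        dst[0] = (uint8_t)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 110xxxxx 10xxxxxx */
        dst[0] = (uint8_t)(0xC0 | (cp >> 6));
        dst[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    /* 1110xxxx 10xxxxxx 10xxxxxx */
        dst[0] = (uint8_t)(0xE0 | (cp >> 12));
        dst[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        dst[2] = (uint8_t)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp < 0x110000) {                   /* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        dst[0] = (uint8_t)(0xF0 | (cp >> 18));
        dst[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
        dst[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        dst[3] = (uint8_t)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                              /* beyond the Unicode code space */
}
```

The 4-byte minimum follows from the last branch: no codepoint up to 0x10FFFF needs more than four bytes.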

/**
 * Look up the properties for a given codepoint.
 *
 * @param codepoint The Unicode codepoint.
 *
 * @returns
 * A pointer to a (constant) struct containing information about
 * the codepoint.
 * @par
 * If the codepoint is unassigned or invalid, a pointer to a special struct is
 * returned in which `category` is 0 (@ref UTF8PROC_CATEGORY_CN).
 */
UTF8PROC_DLLEXPORT const utf8proc_property_t *utf8proc_get_property(utf8proc_int32_t codepoint);

/** Decompose a codepoint into an array of codepoints.
 *
 * @param codepoint the codepoint.
 * @param dst the destination buffer.
 * @param bufsize the size of the destination buffer.
 * @param options one or more of the following flags:
 * - @ref UTF8PROC_REJECTNA  - return an error if `codepoint` is unassigned
 * - @ref UTF8PROC_IGNORE    - strip "default ignorable" codepoints
 * - @ref UTF8PROC_CASEFOLD  - apply Unicode casefolding
 * - @ref UTF8PROC_COMPAT    - replace certain codepoints with their
 *                             compatibility decomposition
 * - @ref UTF8PROC_CHARBOUND - insert 0xFF bytes before each grapheme cluster
 * - @ref UTF8PROC_LUMP      - lump certain different codepoints together
 * - @ref UTF8PROC_STRIPMARK - remove all character marks
 * - @ref UTF8PROC_STRIPNA   - remove unassigned codepoints
 * @param last_boundclass
 * Pointer to an integer variable containing
 * the previous codepoint's (boundclass + indic_conjunct_break << 1) if the @ref UTF8PROC_CHARBOUND
 * option is used. If the string is being processed in order, this can be initialized to 0 for
 * the beginning of the string, and is thereafter updated automatically. Otherwise, this parameter is ignored.
 *
 * @return
 * In case of success, the number of codepoints written is returned; in case
 * of an error, a negative error code is returned (@ref utf8proc_errmsg).
 * @par
 * If the number of written codepoints would be bigger than `bufsize`, the
 * required buffer size is returned, while the buffer will be overwritten with
 * undefined data.
 */
UTF8PROC_DLLEXPORT utf8proc_ssize_t utf8proc_decompose_char(
  utf8proc_int32_t codepoint, utf8proc_int32_t *dst, utf8proc_ssize_t bufsize,
  utf8proc_option_t options, int *last_boundclass
);

/**
 * The same as @ref utf8proc_decompose_char, but acts on a whole UTF-8
 * string and orders the decomposed sequences correctly.
 *
 * If the @ref UTF8PROC_NULLTERM flag in `options` is set, processing
 * stops when a NULL byte is encountered; otherwise `strlen`
 * bytes are processed. The result (in the form of 32-bit Unicode
 * codepoints) is written into the buffer being pointed to by
 * `buffer` (which must contain at least `bufsize` entries). In case of
 * success, the number of codepoints written is returned; in case of an
 * error, a negative error code is returned (@ref utf8proc_errmsg).
 * See @ref utf8proc_decompose_custom to supply additional transformations.
 *
 * If the number of written codepoints would be bigger than `bufsize`, the
 * required buffer size is returned, while the buffer will be overwritten with
 * undefined data.
 */
UTF8PROC_DLLEXPORT utf8proc_ssize_t utf8proc_decompose(
  const utf8proc_uint8_t *str, utf8proc_ssize_t strlen,
  utf8proc_int32_t *buffer, utf8proc_ssize_t bufsize, utf8proc_option_t options
);
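Because the required size is returned when the buffer is too small, one common way to use a contract like this is a two-pass call: measure with an empty buffer, allocate, then fill. The sketch below shows that pattern against a toy stand-in; `toy_decompose` and `decompose_alloc` are hypothetical names, and the toy only expands U+00E9, so it says nothing about the real decomposition tables.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Toy stand-in (hypothetical, not utf8proc): expands U+00E9 into
 * U+0065 U+0301 and copies every other codepoint unchanged. It returns the
 * number of codepoints the full result needs, even when that exceeds
 * `bufsize`; entries beyond `bufsize` are simply not written. */
ptrdiff_t toy_decompose(const int32_t *src, ptrdiff_t len,
                        int32_t *buffer, ptrdiff_t bufsize)
{
    ptrdiff_t n = 0;
    for (ptrdiff_t i = 0; i < len; i++) {
        if (src[i] == 0x00E9) {
            if (n < bufsize) buffer[n] = 0x0065;
            if (n + 1 < bufsize) buffer[n + 1] = 0x0301;
            n += 2;
        } else {
            if (n < bufsize) buffer[n] = src[i];
            n += 1;
        }
    }
    return n;
}

/* Two-pass pattern: the first call only measures, the second call fills.
 * Returns a malloc'd array (the caller frees it) and stores its length in
 * `*out_len`, or returns NULL if the allocation fails. */
int32_t *decompose_alloc(const int32_t *src, ptrdiff_t len, ptrdiff_t *out_len)
{
    ptrdiff_t need = toy_decompose(src, len, NULL, 0);
    int32_t *buf = malloc((size_t)(need > 0 ? need : 1) * sizeof *buf);
    if (buf == NULL)
        return NULL;
    *out_len = toy_decompose(src, len, buf, need);
    return buf;
}
```

With a real decomposition routine the first call could also return a negative error code, which would need to be checked before allocating.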

/**
 * The same as @ref utf8proc_decompose, but also takes a `custom_func` mapping function
 * that is called on each codepoint in `str` before any other transformations
 * (along with a `custom_data` pointer that is passed through to `custom_func`).
 * The `custom_func` argument is ignored if it is `NULL`. See also @ref utf8proc_map_custom.
 */
UTF8PROC_DLLEXPORT utf8proc_ssize_t utf8proc_decompose_custom(
  const utf8proc_uint8_t *str, utf8proc_ssize_t strlen,
  utf8proc_int32_t *buffer, utf8proc_ssize_t bufsize, utf8proc_option_t options,
  utf8proc_custom_func custom_func, void *custom_data
);
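The callback-plus-user-pointer arrangement described above can be sketched in isolation. The callback shape used here (codepoint and data pointer in, mapped codepoint out) is an assumption, since the typedef for `utf8proc_custom_func` is not shown in this part of the header; `sketch_custom_func`, `replace_one`, and `apply_custom` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed callback shape: receives a codepoint and the pass-through user
 * pointer, and returns the codepoint to use in its place. */
typedef int32_t (*sketch_custom_func)(int32_t codepoint, void *data);

/* Example callback: replaces the codepoint stored in `*data` with U+FFFD
 * and leaves every other codepoint alone. */
int32_t replace_one(int32_t cp, void *data)
{
    const int32_t *target = data;
    return cp == *target ? 0xFFFD : cp;
}

/* Applies the callback to each codepoint before any later processing.
 * A NULL callback is ignored, mirroring the behaviour documented above. */
void apply_custom(int32_t *cps, ptrdiff_t len, sketch_custom_func f, void *data)
{
    for (ptrdiff_t i = 0; i < len; i++) {
        if (f != NULL)
            cps[i] = f(cps[i], data);
    }
}
```

Passing state through the `void *` argument keeps the callback free of global variables, which is the usual reason for this design.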

/**
 * Normalizes the sequence of `length` codepoints pointed to by `buffer`
 * in-place (i.e., the result is also stored in `buffer`).
 *
 * @param buffer the (native-endian UTF-32) Unicode codepoints to normalize.
 * @param length the length (in codepoints) of the buffer.
 * @param options a bitwise or (`|`) of one or more of the following flags:
 * - @ref UTF8PROC_NLF2LS  - convert LF, CRLF, CR and NEL into LS
 * - @ref UTF8PROC_NLF2PS  - convert LF, CRLF, CR and NEL into PS
 * - @ref UTF8PROC_NLF2LF  - convert LF, CRLF, CR and NEL into LF
 * - @ref UTF8PROC_STRIPCC - strip or convert all non-affected control characters
 * - @ref UTF8PROC_COMPOSE - try to combine decomposed codepoints into composite
 *                           codepoints
 * - @ref UTF8PROC_STABLE  - prohibit combining characters that would violate
 *                           the Unicode versioning stability
 *
 * @return
 * In case of success, the length (in codepoints) of the normalized UTF-32 string is
 * returned; otherwise, a negative error code is returned (@ref utf8proc_errmsg).
 *
 * @warning The entries of the array pointed to by `buffer` have to be in the
 *          range `0x0000` to `0x10FFFF`. Otherwise, the program might crash!
 */
UTF8PROC_DLLEXPORT utf8proc_ssize_t utf8proc_normalize_utf32(utf8proc_int32_t *buffer, utf8proc_ssize_t length, utf8proc_option_t options);
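In-place normalization works because the output never grows past the input position. As a rough illustration, the sketch below performs only the newline conversion described for the NLF2LF flag (LF, CRLF, CR and NEL all become LF); `sketch_nlf_to_lf` is a hypothetical helper, and the real function handles more flags than this.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of utf8proc): rewrites `buffer` in place so
 * that CRLF, CR and NEL (U+0085) each become a single LF, and returns the
 * new length in codepoints. The write index never passes the read index,
 * so no second buffer is needed. */
ptrdiff_t sketch_nlf_to_lf(int32_t *buffer, ptrdiff_t length)
{
    ptrdiff_t w = 0;
    for (ptrdiff_t r = 0; r < length; r++) {
        int32_t c = buffer[r];
        if (c == 0x000D) {
            /* CRLF counts as one newline: skip the LF that follows the CR. */
            if (r + 1 < length && buffer[r + 1] == 0x000A)
                r++;
            buffer[w++] = 0x000A;
        } else if (c == 0x0085) {
            buffer[w++] = 0x000A;
        } else {
            buffer[w++] = c;   /* includes plain LF, which stays as it is */
        }
    }
    return w;
}
```

The returned length can be shorter than the input, which is why the caller has to use the return value rather than the original `length`.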

/**
 * Reencodes the sequence of `length` codepoints pointed to by `buffer`
 * as UTF-8 data in-place (i.e., the result is also stored in `buffer`).
 * Can optionally normalize the UTF-32 sequence prior to UTF-8 conversion.
 *
 * @param buffer the (native-endian UTF-32) Unicode codepoints to re-encode.
 * @param length the length (in codepoints) of the buffer.
 * @param options a bitwise or (`|`) of one or more of the following flags:
 * - @ref UTF8PROC_NLF2LS  - convert LF, CRLF, CR and NEL into LS
 * - @ref UTF8PROC_NLF2PS  - convert LF, CRLF, CR and NEL into PS
 * - @ref UTF8PROC_NLF2LF  - convert LF, CRLF, CR and NEL into LF
 * - @ref UTF8PROC_STRIPCC - strip or convert all non-affected control characters
 * - @ref UTF8PROC_COMPOSE - try to combine decomposed codepoints into composite
 *                           codepoints
 * - @ref UTF8PROC_STABLE  - prohibit combining characters that would violate
 *                           the Unicode versioning stability
 * - @ref UTF8PROC_CHARBOUND - insert 0xFF bytes before each grapheme cluster
 *
 * @return
 * In case of success, the length (in bytes) of the resulting nul-terminated
 * UTF-8 string is returned; otherwise, a negative error code is returned
 * (@ref utf8proc_errmsg).
 *
 * @warning The amount of free space pointed to by `buffer` must
 *          exceed the amount of the input data by one byte, and the
 *          entries of the array pointed to by `buffer` have to be in the
 *          range `0x0000` to `0x10FFFF`. Otherwise, the program might crash!
 */
UTF8PROC_DLLEXPORT utf8proc_ssize_t utf8proc_reencode(utf8proc_int32_t *buffer, utf8proc_ssize_t length, utf8proc_option_t options);
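The one-extra-byte warning follows from how an in-place conversion of this kind can work: each codepoint occupies 4 bytes as input and at most 4 bytes as UTF-8, so the output never overtakes the input, but the terminating nul may land just past the last input entry. The sketch below illustrates that reasoning only; `put_utf8` and `sketch_reencode_inplace` are hypothetical and skip the optional normalization step.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical UTF-8 encoder: writes 1 to 4 bytes, returns the count
 * (0 for values outside 0..0x10FFFF). */
static size_t put_utf8(int32_t cp, unsigned char *dst)
{
    if (cp < 0) return 0;
    if (cp < 0x80) { dst[0] = (unsigned char)cp; return 1; }
    if (cp < 0x800) {
        dst[0] = (unsigned char)(0xC0 | (cp >> 6));
        dst[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        dst[0] = (unsigned char)(0xE0 | (cp >> 12));
        dst[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        dst[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp < 0x110000) {
        dst[0] = (unsigned char)(0xF0 | (cp >> 18));
        dst[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        dst[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        dst[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Converts `length` codepoints in `buffer` to nul-terminated UTF-8 stored in
 * the same memory and returns the byte length. The storage behind `buffer`
 * must hold at least 4 * length + 1 bytes, because the nul terminator can
 * fall one byte past the input when every codepoint needs 4 bytes. */
ptrdiff_t sketch_reencode_inplace(int32_t *buffer, ptrdiff_t length)
{
    unsigned char *out = (unsigned char *)buffer;
    ptrdiff_t w = 0;
    for (ptrdiff_t r = 0; r < length; r++) {
        int32_t cp = buffer[r];   /* read first: the write may reuse these bytes */
        w += (ptrdiff_t)put_utf8(cp, out + w);
    }
    out[w] = 0;
    return w;
}
```

A caller would typically reserve one spare `int32_t` entry at the end of the array, which is more than the single byte strictly required.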

/**
 * Given a pair of consecutive codepoints, return whether a grapheme break is
 * permitted between them (as defined by the extended grapheme clusters in UAX#29).
 *
 * @param codepoint1 The first codepoint.
 * @param codepoint2 The second codepoint, occurring consecutively after `codepoint1`.
 * @param state Beginning with Version 29 (Unicode 9.0.0), this algorithm requires
 *              state to break graphemes. This state can be passed in as a pointer
 *              in the `state` argument and should initially be set to 0. If the
 *              state is not passed in (i.e. a null pointer is passed), UAX#29 rules
 *              GB10/12/13 which require this state will not be applied, essentially
 *              matching the rules in Unicode 8.0.0.
 *
 * @warning If the state parameter is used, `utf8proc_grapheme_break_stateful` must
 *          be called IN ORDER on ALL potential breaks in a string. However, it
 *          is safe to reset the state to zero after a grapheme break.
 */
UTF8PROC_DLLEXPORT utf8proc_bool utf8proc_grapheme_break_stateful(
    utf8proc_int32_t codepoint1, utf8proc_int32_t codepoint2, utf8proc_int32_t *state);
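Why a pairwise test needs carried state is easiest to see with regional indicators, which group into clusters two at a time: whether a break is allowed between two of them depends on how many came before. The toy below models only that rule plus "no break before a combining mark in U+0300..U+036F"; `toy_break_stateful` and `count_clusters` are hypothetical, and the state encoding is invented for this sketch rather than taken from the library.

```c
#include <stddef.h>
#include <stdint.h>

static int is_regional_indicator(int32_t c) { return c >= 0x1F1E6 && c <= 0x1F1FF; }
static int is_combining_mark(int32_t c)     { return c >= 0x0300 && c <= 0x036F; }

/* Toy break test (hypothetical). `*state` is 1 while `cp1` is the second
 * half of a regional-indicator pair, else 0. With a NULL state, consecutive
 * regional indicators are never split. */
int toy_break_stateful(int32_t cp1, int32_t cp2, int32_t *state)
{
    if (is_regional_indicator(cp1) && is_regional_indicator(cp2)) {
        if (state != NULL && *state == 1) {
            *state = 0;        /* cp1 closed a pair, so cp2 starts a new cluster */
            return 1;
        }
        if (state != NULL)
            *state = 1;        /* cp2 becomes the second half of the pair */
        return 0;
    }
    if (is_combining_mark(cp2))
        return 0;              /* a mark attaches to the preceding codepoint */
    if (state != NULL)
        *state = 0;            /* resetting after a break is safe */
    return 1;
}

/* Counts clusters by testing every adjacent pair in order, carrying the
 * state from one call to the next (or passing NULL when use_state is 0). */
ptrdiff_t count_clusters(const int32_t *cps, ptrdiff_t len, int use_state)
{
    if (len <= 0)
        return 0;
    int32_t state = 0;         /* initial state is 0, as documented above */
    ptrdiff_t count = 1;
    for (ptrdiff_t i = 1; i < len; i++) {
        if (toy_break_stateful(cps[i - 1], cps[i], use_state ? &state : NULL))
            count++;
    }
    return count;
}
```

For four consecutive regional indicators this toy reports two clusters with state and one without, which is the kind of difference the `state` parameter exists to capture.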

/**
 * Same as @ref utf8proc_grapheme_break_stateful, except without support for the
 * Unicode 9 additions to the algorithm. Supported for legacy reasons.
 */
UTF8PROC_DLLEXPORT utf8proc_bool utf8proc_grapheme_break(
    utf8proc_int32_t codepoint1, utf8proc_int32_t codepoint2);


/**
 * Given a codepoint `c`, return the codepoint of the corresponding
 * lower-case character, if any; otherwise (if there is no lower-case
 * variant, or if `c` is not a valid codepoint) return `c`.
 */
UTF8PROC_DLLEXPORT utf8proc_int32_t utf8proc_tolower(utf8proc_int32_t c);

/**
 * Given a codepoint `c`, return the codepoint of the corresponding
 * upper-case character, if any; otherwise (if there is no upper-case
 * variant, or if `c` is not a valid codepoint) return `c`.
 */
UTF8PROC_DLLEXPORT utf8proc_int32_t utf8proc_toupper(utf8proc_int32_t c);

/**
 * Given a codepoint `c`, return the codepoint of the corresponding
 * title-case character, if any; otherwise (if there is no title-case
 * variant, or if `c` is not a valid codepoint) return `c`.
 */
UTF8PROC_DLLEXPORT utf8proc_int32_t utf8proc_totitle(utf8proc_int32_t c);

/**
 * Given a codepoint `c`, return `1` if the codepoint corresponds to a lower-case character
 * and `0` otherwise.
 */
UTF8PROC_DLLEXPORT int utf8proc_islower(utf8proc_int32_t c);

/**
 * Given a codepoint `c`, return `1` if the codepoint corresponds to an upper-case character
 * and `0` otherwise.
 */
UTF8PROC_DLLEXPORT int utf8proc_isupper(utf8proc_int32_t c);

/**
 * Given a codepoint, return a character width analogous to `wcwidth(codepoint)`,
 * except that a width of 0 is returned for non-printable codepoints
 * instead of -1 as in `wcwidth`.
 *
 * @note
 * If you want to check for particular types of non-printable characters
 * (analogous to `isprint` or `iscntrl`), use @ref utf8proc_category. */
UTF8PROC_DLLEXPORT int utf8proc_charwidth(utf8proc_int32_t codepoint);

/**
 * Return the Unicode category for the codepoint (one of the
 * @ref utf8proc_category_t constants).
 */
UTF8PROC_DLLEXPORT utf8proc_category_t utf8proc_category(utf8proc_int32_t codepoint);

/**
 * Return the two-letter (nul-terminated) Unicode category string for
 * the codepoint (e.g. `"Lu"` or `"Co"`).
 */
UTF8PROC_DLLEXPORT const char *utf8proc_category_string(utf8proc_int32_t codepoint);

/**
 * Maps the given UTF-8 string pointed to by `str` to a new UTF-8
 * string, allocated dynamically by `malloc` and returned via `dstptr`.
 *
 * If the @ref UTF8PROC_NULLTERM flag in the `options` field is set,
 * the length is determined by a NUL terminator, otherwise the
 * parameter `strlen` is evaluated to determine the string length, but
 * in any case the result will be NUL terminated (though it might
 * contain NUL characters within the string if `str` contained NUL
 * characters). Other flags in the `options` field are passed to the
 * functions defined above, and regarded as described. See also
 * @ref utf8proc_map_custom to supply a custom codepoint transformation.
 *
 * In case of success the length of the new string is returned,
 * otherwise a negative error code is returned.
 *
 * @note The memory of the new UTF-8 string will have been allocated
 * with `malloc`, and should therefore be deallocated with `free`.
 */
UTF8PROC_DLLEXPORT utf8proc_ssize_t utf8proc_map(
  const utf8proc_uint8_t *str, utf8proc_ssize_t strlen, utf8proc_uint8_t **dstptr, utf8proc_option_t options
);

/**
 * Like @ref utf8proc_map, but also takes a `custom_func` mapping function
 * that is called on each codepoint in `str` before any other transformations
 * (along with a `custom_data` pointer that is passed through to `custom_func`).
 * The `custom_func` argument is ignored if it is `NULL`.
 */
UTF8PROC_DLLEXPORT utf8proc_ssize_t utf8proc_map_custom(
  const utf8proc_uint8_t *str, utf8proc_ssize_t strlen, utf8proc_uint8_t **dstptr, utf8proc_option_t options,
  utf8proc_custom_func custom_func, void *custom_data
);

/** @name Unicode normalization
 *
 * Returns a pointer to newly allocated memory holding an NFD, NFC, NFKD, NFKC
 * or NFKC_Casefold normalized version of the null-terminated string `str`.
 * These are shortcuts to calling @ref utf8proc_map with @ref UTF8PROC_NULLTERM
 * combined with @ref UTF8PROC_STABLE and flags indicating the normalization.
 */
/** @{ */
/** NFD normalization (@ref UTF8PROC_DECOMPOSE). */
UTF8PROC_DLLEXPORT utf8proc_uint8_t *utf8proc_NFD(const utf8proc_uint8_t *str);
/** NFC normalization (@ref UTF8PROC_COMPOSE). */
UTF8PROC_DLLEXPORT utf8proc_uint8_t *utf8proc_NFC(const utf8proc_uint8_t *str);
/** NFKD normalization (@ref UTF8PROC_DECOMPOSE and @ref UTF8PROC_COMPAT). */
UTF8PROC_DLLEXPORT utf8proc_uint8_t *utf8proc_NFKD(const utf8proc_uint8_t *str);
/** NFKC normalization (@ref UTF8PROC_COMPOSE and @ref UTF8PROC_COMPAT). */
UTF8PROC_DLLEXPORT utf8proc_uint8_t *utf8proc_NFKC(const utf8proc_uint8_t *str);
/**
 * NFKC_Casefold normalization (@ref UTF8PROC_COMPOSE, @ref UTF8PROC_COMPAT,
 * @ref UTF8PROC_CASEFOLD and @ref UTF8PROC_IGNORE).
 */
UTF8PROC_DLLEXPORT utf8proc_uint8_t *utf8proc_NFKC_Casefold(const utf8proc_uint8_t *str);
/** @} */

#ifdef __cplusplus
}
#endif

#endif