In my “RPi Matrix” project I wanted to render UTF-8 text on a 2D raster. To rasterize vector fonts I used a library called FreeType, which takes an unsigned long character code as input to render a single character. So I had to get the Unicode code for each character in my string. The confusion already started with the difference between UTF-8 and Unicode: Unicode assigns every character a number (a code point), while UTF-8 is a variable-length encoding that stores each code point in one to four bytes. So my conclusion was that I had to convert UTF-8 to some other encoding, preferably one that uses a single long (or wchar_t) per character.
The problem gets even bigger if you want cross-platform support. On the Linux side you have iconv. On Windows you have the rather ugly MultiByteToWideChar(…) function. Luckily I only have to support Linux.
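To make the UTF-8 versus Unicode difference a bit more concrete, here is a minimal sketch that counts the code points in a UTF-8 string. The helper name utf8_codepoints is made up for this illustration, and it assumes the input is valid UTF-8. It is not part of the project code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: count the Unicode code points in a UTF-8 string.
 * Assumes the input is valid, NUL-terminated UTF-8. */
static size_t utf8_codepoints(const char *s) {
    assert(s != NULL);
    size_t n = 0;
    for (; *s != '\0'; s++) {
        /* Continuation bytes have the form 10xxxxxx;
         * every other byte starts a new character. */
        if (((unsigned char) *s & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

For the string "\xC3\xA9" (the letter é), strlen() reports two bytes, while this helper reports one character. That gap is exactly why a byte-wise loop over the string cannot feed FreeType directly.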
So let’s get started with some code:
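#include <glib.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

#define UTF8_BUFERR_SIZE 256 // Max number of wchar_t, including the terminator

static wchar_t *utf8towchar(char *utf8) {
    // calloc zero-fills the buffer, so the result is always L'\0'-terminated
    wchar_t *text = calloc(UTF8_BUFERR_SIZE, sizeof(wchar_t));
    if (text == NULL)
        return NULL;
    char *output = (char *) text;

    gchar *input = g_strdup(utf8); // iconv advances the input pointer, so work on a copy
    gchar *def_copy = input;       // keep the original pointer for g_free()

    iconv_t foo = iconv_open("WCHAR_T", "UTF-8"); // Convert UTF-8 to WCHAR_T
    if (foo == (iconv_t) -1) {
        g_free(def_copy);
        free(text);
        return NULL;
    }
    size_t ibl = strlen(input); // Input length in bytes
    size_t obl = (UTF8_BUFERR_SIZE - 1) * sizeof(wchar_t); // Max output length in bytes
    iconv(foo, &input, &ibl, &output, &obl);
    iconv_close(foo);

    g_free(def_copy);

    return text;
}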
Note: the caller has to free() the returned wchar_t array!
Now convert the wchar_t array to an unsigned long array and pass each value to the FreeType function FT_Get_Char_Index(face, c).
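That last conversion step could look roughly like the sketch below. The helper name wchar_to_ulong is my own invention for this example, and it assumes that wchar_t holds a full code point, which is the case on Linux with glibc where wchar_t is 32 bits wide.

```c
#include <assert.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: copy a L'\0'-terminated wchar_t string into a newly
 * allocated, 0-terminated unsigned long array. The number of characters is
 * stored in *count. Assumes each wchar_t holds one full code point. */
static unsigned long *wchar_to_ulong(const wchar_t *text, size_t *count) {
    assert(text != NULL && count != NULL);
    size_t len = wcslen(text);
    unsigned long *codes = malloc((len + 1) * sizeof(unsigned long));
    if (codes == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++)
        codes[i] = (unsigned long) text[i];
    codes[len] = 0; /* terminator */
    *count = len;
    return codes;
}
```

Each codes[i] can then be handed to FT_Get_Char_Index(face, codes[i]), where face is an FT_Face you have already loaded. The returned array has to be freed with free() as well.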