C: UTF-8 to wide character

In my “RPi Matrix” project I wanted to render UTF-8 text on a 2D raster. To rasterize vector fonts I used a library called FreeType, which takes the character code as an unsigned long (FT_ULong) to render a single character. So I had to get the Unicode code point for each character in my string. The confusion started with the difference between UTF-8 and Unicode: Unicode assigns a number (a code point) to every character, while UTF-8 is one way of encoding those numbers as a sequence of one to four bytes. So my conclusion was that I had to convert UTF-8 to some other encoding, preferably one that uses a single long (or wchar_t) per character.
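To make the difference concrete, here is a minimal sketch of what decoding a single UTF-8 sequence into a code point looks like by hand. It is not the approach used in the rest of this post (that one relies on iconv); the function name utf8_decode is made up for illustration, and the sketch does not reject overlong encodings or surrogate code points.

```c
#include <stddef.h>

/* Decode one UTF-8 sequence starting at s into *cp.
 * Returns the number of bytes consumed, or 0 if the lead byte or a
 * continuation byte is malformed. Hypothetical helper for illustration. */
static size_t utf8_decode(const char *s, unsigned long *cp) {
  const unsigned char *p = (const unsigned char *) s;
  size_t len;
  unsigned long c;

  /* The lead byte tells us how long the sequence is. */
  if (p[0] < 0x80)                { c = p[0];        len = 1; }
  else if ((p[0] & 0xE0) == 0xC0) { c = p[0] & 0x1F; len = 2; }
  else if ((p[0] & 0xF0) == 0xE0) { c = p[0] & 0x0F; len = 3; }
  else if ((p[0] & 0xF8) == 0xF0) { c = p[0] & 0x07; len = 4; }
  else return 0;

  /* Each continuation byte (10xxxxxx) contributes six more bits. */
  for (size_t i = 1; i < len; i++) {
    if ((p[i] & 0xC0) != 0x80) return 0;  /* also catches a premature '\0' */
    c = (c << 6) | (p[i] & 0x3F);
  }

  *cp = c;
  return len;
}
```

For example, the three bytes E2 82 AC (the euro sign in UTF-8) decode to the single code point U+20AC. That one number per character is what FreeType wants.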

The problem gets even bigger if you want cross-platform support. On the Linux side you have iconv. On Windows you have the ugly MultiByteToWideChar(…) function. Luckily I only have to support Linux.

So let’s get started with some code:

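#include <glib.h>
#include <iconv.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

#define UTF8_BUFFER_SIZE 256  // capacity in wide characters, including the terminator

static wchar_t *utf8towchar(char *utf8) {
  // calloc zero-fills the buffer, so the result is always L'\0'-terminated
  wchar_t *text = calloc(UTF8_BUFFER_SIZE, sizeof(wchar_t));
  if (text == NULL) return NULL;
  char *output = (char *) text;

  gchar *input = g_strdup(utf8);                 // iconv() advances the input pointer, so work on a copy
  gchar *def_copy = input;                       // keep the start of the copy for g_free()

  iconv_t foo = iconv_open("WCHAR_T", "UTF-8");  // convert UTF-8 to WCHAR_T
  if (foo != (iconv_t) -1) {
    size_t ibl = strlen(input);                                // input length in bytes
    size_t obl = (UTF8_BUFFER_SIZE - 1) * sizeof(wchar_t);     // output space in bytes, terminator reserved
    iconv(foo, &input, &ibl, &output, &obl);                   // input that does not fit is dropped
    iconv_close(foo);
  }

  g_free(def_copy);

  return text;
}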

Note: the wchar_t array has to be freed!

Now convert the wchar_t array to an unsigned long array and pass each long to the FreeType function FT_Get_Char_Index(face, c).
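That last step could look roughly like the sketch below. The helper name wchar_to_ulong is made up for illustration, and it assumes a platform such as Linux with glibc, where each wchar_t holds one full Unicode code point (on Windows wchar_t is only 16 bits wide, so this would not be enough there).

```c
#include <stdlib.h>
#include <wchar.h>

/* Copy a L'\0'-terminated wchar_t string into a newly allocated
 * unsigned long array. *count receives the number of characters.
 * The result is 0-terminated and has to be freed by the caller.
 * Hypothetical helper for illustration. */
static unsigned long *wchar_to_ulong(const wchar_t *text, size_t *count) {
  size_t n = wcslen(text);
  unsigned long *codes = malloc((n + 1) * sizeof(unsigned long));
  if (codes == NULL) return NULL;

  for (size_t i = 0; i < n; i++) {
    codes[i] = (unsigned long) text[i];  /* one code point per element */
  }
  codes[n] = 0;

  *count = n;
  return codes;
}
```

Each codes[i] can then be handed to FT_Get_Char_Index(face, codes[i]) to look up the glyph index for that character. Remember to free both the wchar_t array and the unsigned long array afterwards.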

Do you have questions? Send an email to max@maxammann.org