Sunday 26 January 2014

Flexible and efficient text display utilities, Part 1.

One of my recent goals was to implement text and font handling facilities like I never had before but always wanted. Each iteration of my framework got a more complete text display solution, but they were still somewhat incomplete compared to my initial plans. This time I tried to accomplish every goal I planned last time and even added a couple of new ideas to the bag. In fact I planned to finish this up in about 2-3 weeks. Given that I do this only in my spare time, after my day job, this was a tight schedule, but the Christmas holidays were coming and my plan was to use the free time - what happened some of you might guess from the last post. It is almost February and I am still finishing up this part of my engine ;-) I hope this series will be long and regular, as so many topics worth sharing came up while doing this stuff, so I will start off with an overview and a part on image compositing and alpha blending.


Text rendering utilities design.

There are two approaches when it comes to text rendering: offline and online. What I mean by offline is a bitmap font prerendered to a texture with all the glyphs we could possibly need and use. The texture comes with a data file that is fed into a system which positions and renders textured quads presenting the text. There have been so many discussions on the topic that I won't delve into it any deeper, but it is easy to imagine that such a solution is just plain bad and inflexible. The ugly face of this approach is especially apparent with Asian languages and their huge number of distinct glyphs. Another problem is scaling and font quality. If you use bitmaps and you want to keep your screen layout resolution-independent, then you have to scale texts, and scaled fonts do not look good, especially if you need to upscale them.

The second approach, referred to as online, is generating what you need at run-time. To do so you need to incorporate some tool that is capable of handling glyphs directly from their vector representations stored in TTF or OTF files - preferably a library. I won't be unique here, as there is pretty much one and only one industrial-strength standard that is cross-platform and well written - the FreeType2 library.
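For reference, loading a face and rasterizing a single glyph with FreeType boils down to a few calls (a minimal sketch with error handling omitted; the file path is just an example):

#include <ft2build.h>
#include FT_FREETYPE_H

FT_Library library;
FT_Face    face;

// One library instance, one face per font file
FT_Init_FreeType(&library);
FT_New_Face(library, "fonts/arial.ttf", 0, &face);

// Pick a pixel size and rasterize a single glyph
FT_Set_Pixel_Sizes(face, 0, 12);
FT_Load_Char(face, 'A', FT_LOAD_RENDER);

// face->glyph->bitmap now holds an 8-bit coverage bitmap;
// face->glyph->bitmap_left / bitmap_top tell where to place it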

There are quite a few tutorials, blog posts and examples on the web on how to use FreeType, yet I feel that some of them try to present too many concepts at a time. FreeType is a great library, but the documentation is still not the best I have worked with. Hell, there are functions that have almost identical names, differ by one argument, and whose documentation is almost exactly the same. The documentation would be a nice place to explain why I should prefer one over the other, or what the differences are exactly. But this is not the topic of this post. That being said - FreeType2 was my choice for the back-end of the whole system.

The idea is the following - a FontManager class that is responsible for loading, listing and creating fonts. The format for a font is a simple textual definition file containing a couple of pieces of information. What I keep in such a definition is the following:
  • Family name, e.g. Arial
  • Weights and associated TTF files.
So this file is pretty simple. What I mean by weights and associated files is that a family of fonts, say Arial, can come in a couple of flavours like Arial Narrow, Arial, Arial Bold and Arial Black. These are four different weights, each one thicker than the previous. I tried to match CSS a little bit in this matter, so I assume that weight 400 is the regular font, 600 is thicker and so on. By defining which file goes with which weight, the system can pick the proper file when a user specifies a font to draw with. Also, each weight can, and should, have a corresponding italic font if we want to support those as well. A definition could look something like the sketch below.
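(The syntax below is made up purely for illustration; the real format only needs to carry the family name and the weight-to-file mapping.)

# Arial.fontdef - hypothetical syntax, for illustration only
family "Arial"
weight 400 file "arial.ttf"   italic "ariali.ttf"
weight 600 file "arialbd.ttf" italic "arialbi.ttf"
weight 800 file "ariblk.ttf"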

As you can see this file is really simple. There is no information about sizes, glyph metrics etc. These are all generated at run-time and not preserved. There is also another structure that describes the appearance of a paragraph; it can carry additional spacing etc.

Font generation and caching.

What I really wanted to avoid this time were the massive amounts of set-up code that had to be written just to draw some text. In fact I modeled some of the API on GDI and other libraries I have had contact with in the past, but these are a pain in the ass to use. I wanted a clean and easy API this time, with lots of customization and flexibility over how I render and display my glyphs.

One of the major changes compared to my previous approaches was getting rid of the idea of a particular Font object instance. What I mean is that whenever I needed, say, a white Arial font sized 12 with a black stroke, I tended to write something like:
 
// face arial, 12 size, color white with 1px black stroke, antialiased, non italic
Font* f = g_fontManager->Create("Arial", 12, Color::White, 1, Color::Black, true, false); 
// text, x, y, opacity
f->Draw("Some text", 50, 50, 0.8f);

This has its benefits of course, but it soon becomes apparent that when you need a couple of different font flavours you end up creating and managing multiple instances of the object. While this is fine from a resource-management standpoint, as you are clearly in control of what you created and when you release a given resource, it was such a pain in the ass that I decided to remodel it a little bit. Now instead I have something like this:
 
HFont font = g_fontManager->Create("Arial");
...
DrawText("Some text", x, y, style_using_font);

When I need to change some settings I can easily do:
 
auto style = other.Clone();
style.OutterGlow->Color = Color::Cyan;
style.OutterGlow->Blending = BLEND_SCREEN;
DrawText("Glowy text", x, y, style);

No need to create new font objects etc. Of course this has some implications. Something is definitely working behind your back, and you may feel uneasy that you don't know what's cooking under the hood, but I will soon show why this is not a real problem. Instead we get a system where doing the most common stuff is easier, while still leaving room for control over what is created and disposed, how and when.

The font object is separated from the visual appearance of the text it outputs. Visuals are handled by something I called ParagraphStyle (though I have a strong feeling I will tweak the naming a little bit).

It looks something like this (some parts stripped away):
struct ParagraphStyle
{
  ParagraphStyle();
  ~ParagraphStyle();

  f32             Opacity;
  u32             Hash;

  struct FontInfo
  {
    HFont         Family;           // Handle to font object
    f32           Size;             // Size of the text
    Vec3          Color;            // Color of the text
    f32           Fill;             // Percentage of fill visibility, 1 = 100%
    u16           Weight;           // Weight of the font. Default is 400
    bool          Antialiased;      // If text should be antialiased
    bool          Italic;           // If text uses italics
  } Font;

  // The position of layers in this structure represents order of drawing
  ShadowInfo*     Shadow;
  OutterGlowInfo* OutterGlow;
  InnerGlowInfo*  InnerGlow;
  StrokeInfo*     Stroke;
};
The structure contains pointers to optional additional nodes that modify the appearance. Two of them contain data such as:
struct StrokeInfo
{
  bool            Enabled;          // If effect is enabled and rendered
  EStrokeType     Type;             // Stroke can be outer or inner
  EBlendMode      Blending;         // Blend mode for the stroke
  Vec3            Color;            // Color of the stroke
  f32             Opacity;          // Opacity of the stroke, combined with global
  f32             Size;             // Size of the stroke in pixels
};

struct ShadowInfo
{
  bool            Enabled;          // If shadow is enabled and rendered
  EBlendMode      Blending;         // Blend mode for the shadow
  Vec3            Color;            // Color of the shadow
  f32             Opacity;          // Opacity of the shadow, combined with global
  f32             Angle;            // Angle of the light source
  f32             Distance;         // Distance from the glyph in pixels
  f32             Size;             // Size of the shadow (uses stroking)
  f32             Spread;           // Spread of the shadow. Uses blurring
  bool            MaskedShadow;     // Determines if glyph should mask the shadow
};
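To make the structures concrete, here is how a drop shadow could be set up (a hypothetical sketch; the constructors and default values are my assumption - only the Clone-based variant shown earlier is actual API):

ParagraphStyle style;
style.Opacity     = 1.0f;
style.Font.Family = g_fontManager->Create("Arial");
style.Font.Size   = 14.0f;
style.Font.Color  = Vec3(1, 1, 1);   // white text
style.Font.Weight = 600;             // semi-bold flavour

// Optional layers are plain pointers - allocate only what you enable
style.Shadow = new ShadowInfo();
style.Shadow->Enabled  = true;
style.Shadow->Color    = Vec3(0, 0, 0);
style.Shadow->Opacity  = 0.5f;
style.Shadow->Angle    = 120.0f;     // light from the upper left
style.Shadow->Distance = 2.0f;       // pixels away from the glyph

DrawText("Shadowed text", x, y, style);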
This should give some idea of how the text effects work and how one can use them to model various looks - but how exactly is this done under the hood? Well, that will be the second part of this article. There is one more thing I want to mention now - blending, image compositing and some of their implications.

Image compositing, alpha blending and premultiplied alpha.

One important thing that came up when I was doing my bachelor's degree project was image compositing. I had an idea for an optimization: render elements to a render target only when they need updating, and then draw that large screen overlay each frame. It was supposed to speed up GUI rendering, where you have A LOT of small elements like buttons, text boxes, windows and sliders, and where potentially a lot of overdraw happens. You have to render with alpha blending, in back-to-front order, if you wish to keep a visually appealing appearance, so a GUI with all its widgets can introduce a lot of overdraw. Long story short, I did a prototype and threw it away. One of the apparent problems with compositing transparent stuff in a render buffer and then drawing it again with transparency was incorrect blending. Artifacts occurred, colors bled, etc. I did some quick research and found out about the concept of premultiplied alpha.

There is a lot of great material on the web on this subject. One example is Shawn Hargreaves' blog and his article on premultiplied alpha.

What I would like to pinpoint really quickly are the benefits of premultiplied alpha over regular alpha blending (the srcalpha-invsrcalpha model):

  1. Easier to compress - variations in fully transparent pixels are hidden, as they are all black
  2. Better for compositing transparent layers - removes artifacts
  3. Plays nicer with filtering
  4. Can implement all Porter-Duff composition models - more on that later
  5. Fewer state changes, as normal and additive blending can be done with one state set by manipulating the alpha value (see the sketch below)
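The last point deserves a sketch. With premultiplied alpha the blend state is always ONE / INV_SRC_ALPHA; scaling the alpha you premultiply by (not the color) slides a sprite between normal and additive blending without touching render state. This is the trick from Shawn Hargreaves' article; the helper below is my own illustration, and Vec4 is assumed to match the engine's other math types:

// With the single blend state src = ONE, dst = INV_SRC_ALPHA:
//   additivity = 0 -> regular alpha blending
//   additivity = 1 -> purely additive (alpha is 0, so nothing
//                     is subtracted from the destination)
Vec4 Premultiply(const Vec4& straight, f32 additivity)
{
  Vec4 out;
  out.x = straight.x * straight.w;   // color times alpha
  out.y = straight.y * straight.w;
  out.z = straight.z * straight.w;
  out.w = straight.w * (1.0f - additivity);
  return out;
}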

I mentioned Porter and Duff's work. This is something I only familiarized myself with recently, as it is a truly impressive paper on image compositing methods that helps you achieve lots of different effects and solves multiple problems. If you think blending ends with normal and additive, then you should definitely read up on it. I found quite a good explanation of the topic, but somehow, trying to implement the formulas given in that blog post, I ended up with some artifacts on the edges:



I am not sure what I could possibly have done wrong, but I decided to scrap it in favor of the formulas from the original paper. Anyway, that post is still really good info on the topic and I highly recommend it along with the paper.
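For reference, with premultiplied pixels every Porter-Duff operator reduces to the same simple equation:

  result = src * FA + dst * FB

where FA and FB are factors built from the two alpha values, e.g. FA = 1, FB = 1 - alpha_src for "over", and FA = alpha_dst, FB = 0 for "in". The weights a, b, c, d in the code below just encode FA and FB per mode.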

The code I use for image compositing is as follows (almost a 100% copy-paste):
static void GetCompositionWeights(ECompositionMode mode,
  f32* a, f32* b, f32* c, f32* d) 
{
  // FA = 1 * a - ab * b;
  // FB = 1 * c - aa * d;
  // where ab is the destination alpha and aa the source alpha
  switch (mode) {
    case COMPOSITE_SRC: 
      *a = 1; *b = 0; *c = 0; *d = 0; break;
    case COMPOSITE_DEST:
      *a = 0; *b = 0; *c = 1; *d = 0; break;
    case COMPOSITE_OVER:
      *a = 1; *b = 0; *c = 1; *d = 1; break;
    case COMPOSITE_DEST_OVER:
      *a = 1; *b = 1; *c = 1; *d = 0; break;
    case COMPOSITE_IN:
      *a = 0; *b = -1; *c = 0; *d = 0; break;
    ...
  }
}

// This structure is temporarily used for
// conversion from straight alpha to premultiplied.
// In the end this step won't be necessary, as there
// will be a content processor handling asset conversion
struct rgba
{
  f32 r;
  f32 g;
  f32 b;
  f32 a;

  rgba() {}
  rgba(const u8* p) {
    static const f32 c = 1 / 255.0f;
    // read alpha first, while p still points at the red byte
    a = p[3] * c;
    // premultiply the color channels while converting to [0, 1]
    r = (*p++ * c) * a;
    g = (*p++ * c) * a;
    b = (*p++ * c) * a;
  }
};

bool Image::Composite(const Image& src, u32 srcX, u32 srcY, u32 width, u32 height,
  u32 destX, u32 destY, ECompositionMode mode)
{
  const u32 srcStride =
    src.GetWidth() * src.GetChannelsCount() * src.GetBpp() / 8;
  const u32 destStride = m_info.Width * m_info.Channels * m_info.Bpp / 8;
  const u8* data = src.GetData();

  /* Equation reduction based on the i and j weights:

     Equation: 1 * i - ab * j
       i = 1, j =  0   ->  1
       i = 1, j =  1   ->  1 - ab
       i = 0, j = -1   ->  ab
       i = 0, j =  0   ->  0
  ------------------------------------*/

  rgba srcPixel;
  rgba dstPixel;
  u32 di, si;
  f32 asrc, adst, sa, da;
  f32 a = 0; f32 b = 0; f32 c = 0; f32 d = 0;
  GetCompositionWeights(mode, &a, &b, &c, &d);

  for (u32 y = 0; y < height; ++y) {
    for (u32 x = 0; x < width; ++x) {
      di = (y + destY) * destStride + (x + destX) * m_info.Channels;
      si = (y + srcY) * srcStride + (x + srcX) * src.GetChannelsCount();

      // create rgba helper structs that premultiply the alpha
      srcPixel = rgba(&data[si]);
      dstPixel = rgba(&m_buffer[di]);

      // per-pixel blend factors: FA for the source, FB for the destination
      asrc = 1 * a - dstPixel.a * b;
      adst = 1 * c - srcPixel.a * d;

      sa = srcPixel.a;
      da = dstPixel.a;
      f32 alpha = (sa * asrc + da * adst);
      m_buffer[di + 3] = alpha * 255;

      m_buffer[di]     = (srcPixel.r * asrc + dstPixel.r * adst) * 255;
      m_buffer[di + 1] = (srcPixel.g * asrc + dstPixel.g * adst) * 255;
      m_buffer[di + 2] = (srcPixel.b * asrc + dstPixel.b * adst) * 255;
    }
  }
  return true;
}
The results are exactly the same as in the mentioned blog post. Why did I bother to implement this? Well, I am amazed by how people tend to complicate stuff. Unfortunately, with FreeType (at least to my knowledge), stroking a glyph produces an image where the rasterizer fills the whole stroked glyph instead of rendering the outline only. Sometimes this is not a problem (if we render the stroke and, on top of it, the glyph itself), but if we want to use the outline as a base for effects achieved by blurring it, this might not be enough.

When rendering a stroked glyph I have to render the glyph and its stroked version either way. Having both bitmaps, I can use compositing to produce the outline only. This is fast and I have everything I need at my disposal. I have seen people trying to parse the vector outlines, doing crazy work drawing lines to create raster versions of them, even integrating entire extra libraries just to get antialiased line rendering. Wow, this was too crazy for me and I wanted a cleaner and easier solution. So here you go, outline rendering of a glyph:


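In Porter-Duff terms this subtraction is a "destination out": keep the stroked bitmap only where the plain glyph is absent. A hedged sketch of the call (COMPOSITE_DEST_OUT and the image names are hypothetical; such a mode would live among the cases elided from GetCompositionWeights, with a = 0, b = 0, c = 1, d = 1):

// destination = stroked glyph bitmap, source = plain glyph bitmap;
// wherever the glyph covers a pixel the stroke is erased,
// leaving only the outline ring
strokedGlyph.Composite(plainGlyph,
  0, 0, plainGlyph.GetWidth(), plainGlyph.GetHeight(),
  strokeSize, strokeSize,   // stroked bitmap is larger by the stroke size
  COMPOSITE_DEST_OUT);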
For today this is everything. I plan 4 parts in the series. The second one will talk more about how the glyphs are rendered using FreeType and how the caching works. This involves an introduction to something I have been working on - atlas support and bin-packing - plus a little bit on memory management and whatever else feels relevant to the topic. The third part will cover text styles and effects, including inner/outer glows, strokes and shadows. Implementing those effects requires heavy use of stroking (FreeType comes with this, so that part is easy) but also blurring images. I will try to replicate the behaviour of Photoshop's Layer Styles, as this is the standard when it comes to raster graphics. Finally, the last part should focus on some details of the overall API, implementation details, efficient rendering and optimizations.
