Monday 30 August 2010

Multilingual database design approaches

A while ago I needed to create a database that would support storing data in multiple languages e.g. translated product descriptions for customer from different countries. When I started wondering how to reflect this in database schema it came out the answer is not straightforward. While searching Web for the one-and-only best practice i stumbled upon many opinions and approaches. Here are some most popular ones:

  1. Additional columns
    This is the simplest one, it's basically about creating an additional column for each text that needs to be translated e.g.
    CREATE TABLE app_product (
    Id Int IDENTITY NOT NULL,
    Description_en Text,
    Description_pl Text,
    PRIMARY KEY (Id)
    );


    Advantages:
    + simplicity
    + easy querying (no joins required)

    Disadvantages:
    - adding new language support requires schema changes for each table with multilingual content
    - if not all translations are required (e.g. at some places default language should always be used) it may cause redundant data or empty db fields.
    - hard to maintain

  2. Single translations table

    Approach with single translation table seems to be the cleanest one from database structure perspective. You store all texts that need to be translated in a single translation table:
    CREATE TABLE ref_language (
    Code Char(2)NOT NULL,
    Name Varchar(20) NOT NULL,
    PRIMARY KEY (Code)
    );

    CREATE TABLE app_translation (
    Id Int IDENTITY NOT NULL,
    PRIMARY KEY (Id)
    );

    CREATE TABLE app_translation_entry (
    TranslationId Int NOT NULL,
    LanguageCode Char(2) NOT NULL,
    Text Text NOT NULL,
    FOREIGN KEY (TranslationId) REFERENCES app_translation(Id),
    FOREIGN KEY (LanguageCode) REFERENCES ref_language(Code)
    );

    CREATE TABLE app_product (
    Id Int IDENTITY NOT NULL,
    Description Int NOT NULL,
    PRIMARY KEY (Id),
    FOREIGN KEY (Description) REFERENCES app_translation(Id)
    );


    Advantages:
    + adding new languages doesn't require schema changes
    + seems like clean, relational approach
    + all translations in one place (some may say it's a disadvantage because less readable/maintainable)

    Disadvantages:
    - complex querying (multiple joins required to retrieve correct product description)
    - overcomplicated

  3. Additional translation table for each table with multilingual content

    For each table that stores information that may need to be translated an additional table is created. The original table stores only language insensitive data and the new one all translated info:
    CREATE TABLE ref_language (
    Code Char(2)NOT NULL,
    Name Varchar(20) NOT NULL,
    PRIMARY KEY (Code)
    );

    CREATE TABLE app_product (
    Id Int IDENTITY NOT NULL,
    PRIMARY KEY (Id)
    );

    CREATE TABLE app_product_translation (
    ProductId Int NOT NULL,
    LanguageCode Char(2) NOT NULL,
    Description Text NOT NULL,
    FOREIGN KEY (ProductId) REFERENCES app_product(Id),
    FOREIGN KEY (LanguageCode) REFERENCES ref_language(Code)
    );


    Advantages:
    + adding new languages doesn't require schema changes
    + relatively simple querying (1 join required)

    Disadvantages:
    - may double the amount of tables

The 3 examples presented above give us an idea how different approaches may be used here. These are of course not all possible options, just the most popular ones. You can always modify them e.g. by introducing some additional views that would save you writing complex joins direct from your code.
The solution you choose depends mostly on your project requirements. If you need simplicity and are sure that the number of supported languages is small and fixed you could go with option 1. If you require bit more flexibility and can afford a simple join when querying for multilingual data option 3 would be a possible solution.

Friday 27 August 2010

Using custom fonts on your website

Sometimes you want to make your website look better by changing fonts of some elements e.g. headings. When you set element's font you should choose one of so-called 'web safe fonts'. Web safe fonts are the ones that are by default installed on most of the computers, regardless operating system e.g. Arial, Verdana, Times New Roman etc. To increase the chances that the content of your website is displayed to each visitor using fonts you chose you can specify more than one font in your CSS:

.someclass {
font-family: Arial, Verdana;
}
In this case if the first font is not installed on client's machine the next one will be used.

Custom fonts
In some cases you may need to use a custom font that is most likely not installed on most of the client machines. Common example for such situation is when you receive a design from your client who insists you implement their website exactly like it was designed.

One of possible solutions would be using graphics for all page elements that were designed to use custom fonts. Disadvantages of this approach are:
  • This is not SEO friendly
  • Not suitable for dynamic text
The other option is to place the the font file on your webserver and using CSS point the browser to the font location:

@font-face {
font-family: custom_font;

/* for IE */
src: url(fonts/custom_font.eot);

/* non-IE */
src: local("Unique font name"), url(fonts/custom_font.ttf) format("truetype");
}

/* now you can use your custom font like any other */
.someclass {
font-family: custom_font;
}
Please not that browser needs to download the font first so it may slow down initial loading of your page. For IE you'll need the .eot version of the font. You can convert your ttf font into eot using this free converter. Each browser should only download one of the files.

If possible always try to use web safe fonts which can also produce great results.