Spiga

Multilanguage models in zend framework using Zend_Db_Table

January 20, 09 by Gabi Solomon

As i wrote in a previous post there are more ways to design a database for a multilanguage website, but the one i decided to go with is Coupled Translation table approach, where basicly for each table you have another one where you store a row for each language wich has the primary table id, language id and the other columns that need to be translated.

The Requirements and Implementation

Having this idea in mind i wrote a new class that extends Zend_Db_Table to provide an easy access to the translated tables. Basically the class a couple of things:
Read the rest of this entry »

php class for Unicode Manipulation

January 03, 09 by Gabi Solomon

If you ever had to build a site in a language that had special characters or a multilanguage website, then you have had problems with UTF encoding for sure.

Well recently a new class was published on phpclasses.org by Rubens Takiguti Ribeiro called Unicode Manipulation that will solve your problems.
The class is a complete solution to manipulate Unicode encoded text with support for UTF-16 and UTF-32 besides UTF-8.

This class can be used to manipulate text with Unicode encodings.

It can perform several types of operations that involve text strings encoded as UTF-8, UTF-16 or UTF-32, like:

- Get the text sequence for byte order mark for little and big endian
- Convert a given character code to Unicode encoded text and vice-versa
- Get the byte length of a given Unicode encoded character
- Convert text encoding between UTF-8, UTF-16 and UTF-32
- Get a part of an Unicode encoded string from a given position and an optional length
- Get the string length of an Unicode encoded text
- Determine whether a given string has Unicode encoded text

[Class Page]

Hope you find it usefull,
Cheers.

Multilanguage database design approach

December 26, 08 by Gabi Solomon

Preinfo

Before you go right to the comment section and recommend gettext or other similar ways, know that i am talking about content that is manageable from an admin panel or is added by the user

Also this article is based on personal experience and is not necessary the best way to do this.

Building a metalanguage website poses a lot of problems, and one of them is how you store the content in the database for each language.

If you do a google search you will find little resources about it, and most of the are on forums. This seem a bit strange to me, so after i had decided on a database schema for a metalanguage website i decided to post it here in the hope that other people might find it useful and save them some googling time.

As far as i searched there are more or less 4 databases schemas for metalanguage website.

1. Column approach

This approach is very common and basically it duplicates the column of content for each language.

table pages
-- id (int)
-- title_en (varchar)
-- title_es (varchar)
-- content_en (varchar)
-- content_es (varchar)

The way you would query it is by automatically selecting the right columns according to the language chosen:

SQL:
  1. SELECT `id`, `title_en` AS `title`, `content_en` AS `content` FROM `pages`

Or you could select all and do the column selection from php :

PHP:
  1. echo $rowPage['title' . $_SESSION['currentLanguage']];

Advantages

  • It doesn't have duplicate content, since there is only one row for each record, and only the language columns are duplicated
  • Easy to implement

Disadvantages

  • You need to build the watch what column you are working with depending on the language
  • Hard to maintain. Although this is a easy way for 2-3 languages its becomes a real drag when you have a lot of columns or a lot of languages
  • Hard to add a new language

2. Multirow approach

Another approach that i saw but i have never worked with it. It is simillar to the one above but instead of duplicating the content in columns it does it in rows.

table pages
-- id (int)
-- language_id (int)
-- title (varchar)
-- content (varchar)

So you will basically have 3 rows for the same page if you have 3 languages. The main problem i see with this approach is that it would be a bit tricky to know witch id you will use for the table relations.

Sorry but since i dont really have experience with this i cant show you sql & php examples.

Advantages

  • Ease in adding a new language

Disadvantages

  • Need to watch the table relations
  • A lot of duplicate content. You will have duplicate content for all the columns that are not translated

3. Single Translation table approach

This is an approach that becomes a little more complex then the other 2, but it is more suited for dinamic websites and which have a large number of languages or which intend to add a new language in the future and want to do it with ease.

table languages
-- id (int)
-- name (varchar)

table pages
-- id (int)
-- language_id (int)
-- title (int fk)
-- content (int fk)

table translation
-- id (int)

table translation_entry
-- translation_id (int)
-- language_id (int)
-- content (text)

In this approach you would store the id from the translation table in the title and content columns from the pages table, and then do a join with the translation_entry table based on the language id.

Advantages

  • Proper normalization
  • Ease in adding a new language

Disadvantages

  • Longer joins and query to get the content
  • All the translated content goes into one table
  • For me it just looks hard to work with and maintain

4. Coupled Translation table approach [my aproach :D ]

This is a variation of the above approach that to me seems easier to maintain and work with.
Instead of having just one translation table, you have one for each table. and you move the columns from the pages that need to be translated to the translation table.

table languages
-- id (int)
-- name (varchar)

table pages
-- id (int)

table pages_translation
-- id (int)
-- page_id (int)
-- language_id (int)
-- title (text)
-- content (text)

To get your data you just do a simple join:

SQL:
  1. SELECT * FROM `pages` JOIN `pages_translation` ON `pages`.`id` = `pages_translation`.`page_id` WHERE `map_landmarks_translation`.`language_id`='1'

Advantages

  • Proper normalization
  • Ease in adding a new language
  • Easy to query
  • Columns keep there names

Disadvantages

  • You have to create translation tables for all your tables that have columns that need to be translated

Conclusion

I am sure that there are other methods of doing a multilingual website, this are just the ones that i thought are most commonly used. My solution is the best, its just the best for me, because it works for my project and its easier for me to work with compared to other approaches.
In the end the best approach is the one that is the best for you. The one that you find the most easier to work with and maintain.

Cheers,
and good coding.