Multilanguage database design approach
December 26, 08 by Gabi SolomonPreinfo
Before you go right to the comment section and recommend gettext or other similar ways, know that i am talking about content that is manageable from an admin panel or is added by the user
Also this article is based on personal experience and is not necessary the best way to do this.
Building a metalanguage website poses a lot of problems, and one of them is how you store the content in the database for each language.
If you do a google search you will find little resources about it, and most of the are on forums. This seem a bit strange to me, so after i had decided on a database schema for a metalanguage website i decided to post it here in the hope that other people might find it useful and save them some googling time.
As far as i searched there are more or less 4 databases schemas for metalanguage website.
1. Column approach
This approach is very common and basically it duplicates the column of content for each language.
table pages
-- id (int)
-- title_en (varchar)
-- title_es (varchar)
-- content_en (varchar)
-- content_es (varchar)
The way you would query it is by automatically selecting the right columns according to the language chosen:
-
SELECT `id`, `title_en` AS `title`, `content_en` AS `content` FROM `pages`
Or you could select all and do the column selection from php :
Advantages
- It doesn't have duplicate content, since there is only one row for each record, and only the language columns are duplicated
- Easy to implement
Disadvantages
- You need to build the watch what column you are working with depending on the language
- Hard to maintain. Although this is a easy way for 2-3 languages its becomes a real drag when you have a lot of columns or a lot of languages
- Hard to add a new language
2. Multirow approach
Another approach that i saw but i have never worked with it. It is simillar to the one above but instead of duplicating the content in columns it does it in rows.
table pages
-- id (int)
-- language_id (int)
-- title (varchar)
-- content (varchar)
So you will basically have 3 rows for the same page if you have 3 languages. The main problem i see with this approach is that it would be a bit tricky to know witch id you will use for the table relations.
Sorry but since i dont really have experience with this i cant show you sql & php examples.
Advantages
- Ease in adding a new language
Disadvantages
- Need to watch the table relations
- A lot of duplicate content. You will have duplicate content for all the columns that are not translated
3. Single Translation table approach
This is an approach that becomes a little more complex then the other 2, but it is more suited for dinamic websites and which have a large number of languages or which intend to add a new language in the future and want to do it with ease.
table languages
-- id (int)
-- name (varchar)table pages
-- id (int)
-- language_id (int)
-- title (int fk)
-- content (int fk)table translation
-- id (int)table translation_entry
-- translation_id (int)
-- language_id (int)
-- content (text)
In this approach you would store the id from the translation table in the title and content columns from the pages table, and then do a join with the translation_entry table based on the language id.
Advantages
- Proper normalization
- Ease in adding a new language
Disadvantages
- Longer joins and query to get the content
- All the translated content goes into one table
- For me it just looks hard to work with and maintain
4. Coupled Translation table approach [my aproach
]
This is a variation of the above approach that to me seems easier to maintain and work with.
Instead of having just one translation table, you have one for each table. and you move the columns from the pages that need to be translated to the translation table.
table languages
-- id (int)
-- name (varchar)table pages
-- id (int)table pages_translation
-- id (int)
-- page_id (int)
-- language_id (int)
-- title (text)
-- content (text)
To get your data you just do a simple join:
-
SELECT * FROM `pages` JOIN `pages_translation` ON `pages`.`id` = `pages_translation`.`page_id` WHERE `map_landmarks_translation`.`language_id`='1'
Advantages
- Proper normalization
- Ease in adding a new language
- Easy to query
- Columns keep there names
Disadvantages
- You have to create translation tables for all your tables that have columns that need to be translated
Conclusion
I am sure that there are other methods of doing a multilingual website, this are just the ones that i thought are most commonly used. My solution is the best, its just the best for me, because it works for my project and its easier for me to work with compared to other approaches.
In the end the best approach is the one that is the best for you. The one that you find the most easier to work with and maintain.
Cheers,
and good coding.

