Database modeling for international and multilingual purposes - mysql

I need to create a large-scale DB model for a multilingual web application.
One doubt I have every time I think about how to do it is how to handle multiple translations for a field. An example case:
The table for language levels, which administrators can edit from the backend, can have multiple items such as basic, advance, fluent, mattern... In the near future there will probably be one more type. The admin goes to the backend and adds a new level, which is sorted into the right position. But how do I handle all the translations for the end users?
Another problem with internationalizing a database is that user studies will probably differ between the USA, the UK, DE... Every country has its own levels (probably equivalent to another country's, but ultimately different). And what about billing?
How do you model this at a large scale?

Here is the way I would design the database:
[Schema diagram not reproduced here; visualization by DB Designer Fork]
The i18n table only contains a PK, so that any table just has to reference this PK to internationalize a field. The translation table is then in charge of linking this generic ID with the correct list of translations.
locale.id_locale is a VARCHAR(5) to handle both the en and en_US ISO syntaxes.
currency.id_currency is a CHAR(3) to handle the ISO 4217 syntax.
You can find two examples: page and newsletter. Both of these admin-managed entities need to internationalize their fields, respectively title/description and subject/content.
Here is an example query:
select
    t_subject.tx_translation as subject,
    t_content.tx_translation as content
from newsletter n
-- join for subject
inner join translation t_subject
    on t_subject.id_i18n = n.i18n_subject
-- join for content
inner join translation t_content
    on t_content.id_i18n = n.i18n_content
inner join locale l
    -- condition for subject
    on l.id_locale = t_subject.id_locale
    -- condition for content
    and l.id_locale = t_content.id_locale
-- locale condition
where l.id_locale = 'en_GB'
    -- other conditions
    and n.id_newsletter = 1
Note that this is a normalized data model. If you have a huge dataset, you could consider denormalizing it to optimize your queries. You can also use indexes to improve query performance (in some databases, foreign keys are indexed automatically, e.g. MySQL/InnoDB).
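As a rough, runnable sketch of the schema described above, the snippet below builds the i18n, locale, translation and newsletter tables and runs the example query. It uses Python's built-in sqlite3 module as a stand-in for MySQL. The column types, the composite primary key on translation, the sample rows and the helper name newsletter_text are assumptions made for illustration, not part of the original design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table i18n (id_i18n integer primary key);
    create table locale (id_locale varchar(5) primary key);
    -- Assumed key: one translation per (i18n id, locale) pair.
    create table translation (
        id_i18n integer not null references i18n (id_i18n),
        id_locale varchar(5) not null references locale (id_locale),
        tx_translation text not null,
        primary key (id_i18n, id_locale)
    );
    create table newsletter (
        id_newsletter integer primary key,
        i18n_subject integer not null references i18n (id_i18n),
        i18n_content integer not null references i18n (id_i18n)
    );
    -- Hypothetical sample data.
    insert into i18n values (1), (2);
    insert into locale values ('en_GB'), ('fr_FR');
    insert into translation values
        (1, 'en_GB', 'Welcome'), (1, 'fr_FR', 'Bienvenue'),
        (2, 'en_GB', 'Hello, reader'), (2, 'fr_FR', 'Bonjour, lecteur');
    insert into newsletter values (1, 1, 2);
""")


def newsletter_text(newsletter_id, locale):
    """Return (subject, content) of one newsletter in the given locale."""
    return conn.execute("""
        select t_subject.tx_translation, t_content.tx_translation
        from newsletter n
        inner join translation t_subject on t_subject.id_i18n = n.i18n_subject
        inner join translation t_content on t_content.id_i18n = n.i18n_content
        inner join locale l
            on l.id_locale = t_subject.id_locale
            and l.id_locale = t_content.id_locale
        where l.id_locale = ? and n.id_newsletter = ?
    """, (locale, newsletter_id)).fetchone()


print(newsletter_text(1, "en_GB"))
```

Because both translation joins are tied to the same locale row, a newsletter that lacks either translation in the requested locale returns no row at all; a fallback locale would need outer joins.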

Some previous StackOverflow questions on this topic:
What are best practices for multi-language database design?
What's the best database structure to keep multilingual data?
Schema for a multilanguage database
How to use multilanguage database schema with ORM?
Some useful external resources:
Creating multilingual websites: Database Design
Multilanguage database design approach
Propel Gets I18n Behavior, And Why It Matters
Often the best approach is, for every existing table, to create a new table into which the text items are moved; the PK of the new table is the PK of the old table together with the language.
In your case:
The table for language levels, which administrators can edit from the backend, can have multiple items such as basic, advance, fluent, mattern... In the near future there will probably be one more type. The admin goes to the backend and adds a new level, which is sorted into the right position. But how do I handle all the translations for the end users?
Your existing table probably looks something like this:
+----+-------+---------+
| id | price | type    |
+----+-------+---------+
|  1 |   299 | basic   |
|  2 |   299 | advance |
|  3 |   399 | fluent  |
|  4 |     0 | mattern |
+----+-------+---------+
It then becomes two tables:
+----+-------+   +----+------+-------------+
| id | price |   | id | lang | type        |
+----+-------+   +----+------+-------------+
|  1 |   299 |   |  1 | en   | basic       |
|  2 |   299 |   |  2 | en   | advance     |
|  3 |   399 |   |  3 | en   | fluent      |
|  4 |     0 |   |  4 | en   | mattern     |
+----+-------+   |  1 | fr   | élémentaire |
                 |  2 | fr   | avance      |
                 |  3 | fr   | couramment  |
                 :    :      :             :
                 +----+------+-------------+
Another problem with internationalizing a database is that user studies will probably differ between the USA, the UK, DE... Every country has its own levels (probably equivalent to another country's, but ultimately different). And what about billing?
All localisation can occur through a similar approach. Instead of just moving text fields to the new table, you could move any localisable fields - only those which are common to all locales will remain in the original table.
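The two-table split above could look roughly like the following sketch, written with Python's sqlite3 module as a stand-in for MySQL. The table names level and level_translation and the helper levels are hypothetical; the rows are the ones shown in the tables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Locale-independent columns stay in the original table.
    create table level (id integer primary key, price integer not null);
    -- Text moves here; the PK is the old PK together with the language.
    create table level_translation (
        id integer not null references level (id),
        lang text not null,
        type text not null,
        primary key (id, lang)
    );
    insert into level values (1, 299), (2, 299), (3, 399), (4, 0);
    insert into level_translation values
        (1, 'en', 'basic'), (2, 'en', 'advance'),
        (3, 'en', 'fluent'), (4, 'en', 'mattern'),
        (1, 'fr', 'élémentaire'), (2, 'fr', 'avance'), (3, 'fr', 'couramment');
""")


def levels(lang):
    """List (id, price, translated type) for one language, in id order."""
    return conn.execute("""
        select l.id, l.price, t.type
        from level l
        inner join level_translation t on t.id = l.id
        where t.lang = ?
        order by l.id
    """, (lang,)).fetchall()


print(levels("fr"))
```

With an inner join, a level that has no translation yet in the requested language is simply left out of the result; whether that or a fallback to a default language is preferable depends on the application.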

Best practice to store array-like data in MySQL or similar database?

I have two tables that I want to relate to each other. The issue is any product can have n-number of POs, so individual columns wouldn't work in a traditional DB.
I was thinking of using JSON fields to store an array, or using XML. I would need to insert additional POs, so I'm concerned with the lack of editing support for XML.
What is the standard way of handling n-number of attributes in a single field?
|id | Product | Work POs|
| - | ------- | ------- |
| 1 | bicycle | 002,003 |
| 2 | unicycle| 001,003 |
|PO | Job |
|-- | ---------------- |
|001|Install 1 wheel |
|002|Install 2 wheels |
|003|Install 2 seats |
The standard way to store multi-valued attributes in a relational database is to create another table, so you can store one value per row. This makes it easy to add or remove a value, to search for a specific value, to count POs per product, and to run many other types of queries.
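|id | Product |
| - | ------- |
| 1 | bicycle |
| 2 | unicycle|

|product_id | PO |
| --------- | -- |
| 1         |002 |
| 1         |003 |
| 2         |001 |
| 2         |003 |

|PO | Job              |
|-- | ---------------- |
|001|Install 1 wheel   |
|002|Install 2 wheels  |
|003|Install seat      |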
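A minimal sketch of the three tables above, using Python's sqlite3 module as a stand-in for MySQL. The table names product, po and product_po are assumptions; the rows come from the tables above. It shows two of the queries that become simple once each PO sits on its own row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table product (id integer primary key, product text not null);
    -- PO numbers are kept as text so leading zeros survive.
    create table po (po text primary key, job text not null);
    -- One row per (product, PO) pair.
    create table product_po (
        product_id integer not null references product (id),
        po text not null references po (po),
        primary key (product_id, po)
    );
    insert into product values (1, 'bicycle'), (2, 'unicycle');
    insert into po values
        ('001', 'Install 1 wheel'), ('002', 'Install 2 wheels'), ('003', 'Install seat');
    insert into product_po values (1, '002'), (1, '003'), (2, '001'), (2, '003');
""")

# Count POs per product.
pos_per_product = conn.execute("""
    select p.product, count(*)
    from product p
    inner join product_po pp on pp.product_id = p.id
    group by p.id, p.product
    order by p.id
""").fetchall()

# Find every product that uses a specific PO.
products_with_003 = [row[0] for row in conn.execute("""
    select p.product
    from product p
    inner join product_po pp on pp.product_id = p.id
    where pp.po = '003'
    order by p.id
""")]

print(pos_per_product, products_with_003)
```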
I also recommend reading my answer to Is storing a delimited list in a database column really that bad?
In some cases you really do need to store array-like data in one field.
In MySQL 5.7.8+ you can use the JSON data type:
ALTER TABLE `some_table` ADD `po` JSON NOT NULL;
-- PO numbers are quoted: JSON numbers cannot have leading zeros
UPDATE `some_table` SET `po` = '["002","003"]' WHERE `some_table`.`id` = 1;
See examples here: https://sebhastian.com/mysql-array/
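One detail worth checking before writing to a JSON column: PO numbers such as 002 carry leading zeros, which JSON numbers do not allow, so they should be sent as strings. The sketch below uses Python's standard json module to build such a value in application code and to show that an unquoted list is rejected by a strict parser. The helper is_valid_json is hypothetical, and treating Python's parser as representative of MySQL's validation is an assumption.

```python
import json

# Keep PO numbers as strings so the leading zeros are preserved.
po_numbers = ["002", "003"]
payload = json.dumps(po_numbers)  # text to bind as the JSON column value


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(payload, is_valid_json(payload), is_valid_json("[002,003]"))
```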

Many-to-many multi level queries

I have a table called "obj_rels" where I have fields such as:
pri_type
pri_type_id
sec_type
sec_type_id
The type represents the table the relationship is in. For example COM for comment, FIL for file, SBM for submission or ENT for entity.
I want to know the entity for each object type. For example,
Let's say I have a comment (#4) on a file (#3) uploaded to respond to a submission (#2) for entity (#1). The obj_rels table would have:
+----------+-------------+----------+-------------+
| pri_type | pri_type_id | sec_type | sec_type_id |
+----------+-------------+----------+-------------+
| COM | 4 | FIL | 3 |
| SBM | 2 | FIL | 3 |
| SBM | 2 | ENT | 1 |
+----------+-------------+----------+-------------+
There is no logic to Primary and Secondary entry, so it needs to look at both sides to find a match. How can I write a query that would dig back into the relationships to find the "ENT" associated with the "COM"?
The one I've written repeats a union query five times, thinking I probably won't have more than 5 levels ever, but it is extremely slow. I only have about 9k records in the table, and it takes over 30 seconds to run the query.
What is best practice for this type of relationship search?
Thanks!
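One possible direction, offered only as a sketch: a recursive common table expression can walk the relationships in both directions without hard-coding the number of levels. The example below runs on SQLite through Python's sqlite3 module; MySQL supports WITH RECURSIVE from version 8.0, and whether that is available here is an assumption. The CTE names edges and reachable and the helper related_ids are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table obj_rels (
        pri_type text not null, pri_type_id integer not null,
        sec_type text not null, sec_type_id integer not null
    );
    insert into obj_rels values
        ('COM', 4, 'FIL', 3),
        ('SBM', 2, 'FIL', 3),
        ('SBM', 2, 'ENT', 1);
""")


def related_ids(start_type, start_id, target_type):
    """Ids of all target_type objects reachable from the start object."""
    rows = conn.execute("""
        with recursive
        -- Treat every relationship as usable in both directions.
        edges (a_type, a_id, b_type, b_id) as (
            select pri_type, pri_type_id, sec_type, sec_type_id from obj_rels
            union
            select sec_type, sec_type_id, pri_type, pri_type_id from obj_rels
        ),
        -- UNION (not UNION ALL) drops already-seen objects, so cycles end.
        reachable (obj_type, obj_id) as (
            select ?, ?
            union
            select e.b_type, e.b_id
            from reachable r
            inner join edges e on e.a_type = r.obj_type and e.a_id = r.obj_id
        )
        select obj_id from reachable where obj_type = ? order by obj_id
    """, (start_type, start_id, target_type)).fetchall()
    return [row[0] for row in rows]


print(related_ids("COM", 4, "ENT"))
```

Indexes on (pri_type, pri_type_id) and (sec_type, sec_type_id) would likely matter more for speed than the shape of the query, since each step looks rows up by those pairs.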

How do I handle linking a record to another table?

I'm very new to Access and my teacher is... hard to follow. So I feel like there's something pretty basic I'm probably missing here. I think the biggest problem I'm having with this question is that I'm struggling to find the words to communicate what I actually need to do, which is really putting a damper on my google-fu.
In terms of what I think I want to do, I want to make a record reference another table in its entirety.
Main
+----+-------+--------+-------+----------------------------+
| PK | Name | Phone# | [...] | Cards |
+----+-------+--------+-------+----------------------------+
| 1 | Bob | [...] | [...] | < Reference to 2nd table > |
| 2 | Harry | [...] | [...] | [...] |
| 3 | Ted | [...] | [...] | [...] |
+----+-------+--------+-------+----------------------------+
Bob's Cards
+----+-------------+-----------+-------+-------+-------+
| PK | Card Name | Condition | Year | Price | [...] |
+----+-------------+-----------+-------+-------+-------+
| 1 | Big Slugger | Mint | 1987 | .20 | [...] |
| 2 | Quick Pete | [...] | [...] | [...] | [...] |
| 3 | Mac Donald | [...] | [...] | [...] | [...] |
+----+-------------+-----------+-------+-------+-------+
This would necessitate an entire new table for each record in the main table though, if it's even possible.
But the only alternative solution I can think of is to add 'Card1, Condition1, [...], Card2, Condition2, [...], Card3, [...]' fields to the main table and having to add another set of fields any time someone increases the maximum number of cards stored.
So I'm sort of left believing there is some other approach I should be taking that our teacher has failed to properly explain. We haven't even touched on forms and reports yet so I don't need to worry about working them in.
Any pointers?
(Also, the entirety of this data and structure is only a rough facsimile of my own, as I'd rather learn how to do it and apply it myself than be like 'here's my data, pls fix.')
Third option successfully found in comments by the helpful Minty.
This depends on a number of things; however, to keep it simple, you would normally add one field to the cards table, with a Number data type, called CardOwnerID. In your example it would be 1, indicating Bob. This is known as a foreign key (FK). However, if you have a table of cards and multiple possible owners, then you need a third table: a junction table. This would consist of the Main Person ID and the Card ID. – Minty
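The foreign-key idea from the comment can be sketched as follows. This uses Python's sqlite3 module rather than Access, so the syntax is only an approximation of what Access would use; the table and column names (main, cards, card_owner_id) and the helper cards_of are assumptions based on the example tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table main (pk integer primary key, name text not null);
    -- One shared cards table; each card points at its owner.
    create table cards (
        pk integer primary key,
        card_name text not null,
        card_owner_id integer not null references main (pk)
    );
    insert into main values (1, 'Bob'), (2, 'Harry'), (3, 'Ted');
    insert into cards values
        (1, 'Big Slugger', 1), (2, 'Quick Pete', 1), (3, 'Mac Donald', 1);
""")


def cards_of(name):
    """Card names owned by the person with the given name."""
    rows = conn.execute("""
        select c.card_name
        from cards c
        inner join main m on m.pk = c.card_owner_id
        where m.name = ?
        order by c.pk
    """, (name,)).fetchall()
    return [row[0] for row in rows]


print(cards_of("Bob"))
```

Adding a card for anyone is then one new row in cards, with no new table or extra columns on the main table.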

How to add related properties to a table in mysql correctly

We have been developing the system at my place of work for some time now and I feel the database design is getting somewhat out of hand.
For example we have a table widgets (I'm spoofing these somewhat):
+-----------------------+
| Widget |
+-----------------------+
| Id | Name | Price |
| 1 | Sprocket | 100 |
| 2 | Dynamo | 50 |
+-----------------------+
*There's about 40+ columns on this table already
We want to add a property to each widget for packaging information. We need to know whether it has packaging information, whether it doesn't, or whether we don't know. We then also need to store the type of packaging details (assuming it has them; if it doesn't, this is redundant info).
We already have another table which stores the details information (I personally think this table should be divided up but that's another issue).
PD = PackageDetails
+--------------------------------+
| System Properties |
+--------------------------------+
| Id | Type | Value |
| 28 | PD | Boxed |
| 29 | PD | Vacuum Sealed |
+--------------------------------+
*There's thousands of rows in the table for all system wide table properties
Instinctively I would create a number of mapping tables to capture this information. I have however been instructed to just add another column onto each table to avoid doing a join.
My solution:
Create tables:
+---------------------------------------------------+
| widgets_packaging |
+---------------------------------------------------+
| Id | widget_id | packing_info | packing_detail_id |
| 1 | 27 | PACKAGED | 2 |
| 2 | 28 | UNKNOWN | NULL |
+---------------------------------------------------+
+--------------------+
| packaging |
+--------------------+
| Id | |
| 1 | Boxed |
| 2 | Vacuum Sealed |
+--------------------+
If I want to know what packaging a widget has I join through to widgets_packaging and join again to packaging if I want to know the exact details. Therefore no more columns on the widgets table.
I have however been told to ignore this and add one int column for the packing information and another as a foreign key to the System Properties table to find the packaging details. This adds another two columns to the table and creates yet more rows in the System Properties table to store package details.
+------------------------------------------------------------+
| Widget |
+------------------------------------------------------------+
| Id | Name |Price | has_packaging | packaging_details |
| 1 | Sprocket |100 | 1 | 28 |
| 2 | Dynamo |50 | 0 | 29 |
+------------------------------------------------------------+
The reason for this is because it's simpler and doesn't involve a join if you only want to know if the widget has packaging (there are lots of widgets). They are concerned that more joins will slow things down.
Which is the more correct solution here, and are their concerns about speed legitimate? My gut instinct is that we can't just keep adding columns to the widgets table, as it keeps growing with flags for properties.
The answer to this really depends on whether the application(s) using this database are read or write intensive. If it's read intensive, the de-normalized structure is a better approach because you can make use of indexes. Selects are faster with fewer joins, too.
However, if your application is write intensive, normalization is a better approach (the structure you're suggesting is a more normalized approach). Tables tend to be smaller, which means they have a better chance of fitting into the buffer. Also, normalization tends to lead to less duplication of data, which means updates and inserts only need to be done in one place.
To sum it up:
Write Intensive --> normalization
- smaller tables have a better chance of fitting into the buffer
- less duplicated data, which means quicker updates / inserts
Read Intensive --> de-normalization
- better structure for indexes
- fewer joins means better performance
If your application is not heavily weighted toward reads over writes, then a more mixed approach would be better.
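To make the trade-off concrete, here is a sketch of the normalized variant from the question, again using Python's sqlite3 module as a stand-in for MySQL. The column name detail, the status value 'UNPACKAGED', the unique constraint on widget_id and the sample rows are assumptions. Reading the packaging of every widget costs two joins here, which is the overhead the denormalized design avoids.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table widget (
        id integer primary key, name text not null, price integer not null
    );
    create table packaging (id integer primary key, detail text not null);
    create table widgets_packaging (
        id integer primary key,
        widget_id integer not null unique references widget (id),
        packing_info text not null
            check (packing_info in ('PACKAGED', 'UNPACKAGED', 'UNKNOWN')),
        packing_detail_id integer references packaging (id)
    );
    -- Hypothetical sample data.
    insert into widget values (1, 'Sprocket', 100), (2, 'Dynamo', 50);
    insert into packaging values (1, 'Boxed'), (2, 'Vacuum Sealed');
    insert into widgets_packaging values
        (1, 1, 'PACKAGED', 2), (2, 2, 'UNKNOWN', null);
""")

# Two joins to answer "what packaging does each widget have?".
packaging_report = conn.execute("""
    select w.name, wp.packing_info, p.detail
    from widget w
    left join widgets_packaging wp on wp.widget_id = w.id
    left join packaging p on p.id = wp.packing_detail_id
    order by w.id
""").fetchall()

print(packaging_report)
```

Whether those joins are actually slow is something to measure on the real data; with an index on widgets_packaging.widget_id (implied by the unique constraint here) they are lookups by key.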

SQL - database design, need suggestion

I'm building a website where each user has different classes:
+----+-----------+---------+
| id | subject | user_id |
+----+-----------+---------+
| 1 | Math 140 | 2 |
| 2 | ART 240 | 2 |
+----+-----------+---------+
Each class will then have a bunch of Homework files, Class-Papers files and so on.
And here I need your help. Which is the better approach: build one table like this:
+----+----------+-------------------------------------------------+--------------+
| id | subject  | Homework                                        | Class-Papers |
+----+----------+-------------------------------------------------+--------------+
| 1  | Math 140 | www.example.com/subjects/Math+140/file_name.pdf | bla-bla      |
| 2  | Math 140 | www.example.com/subjects/Math+140/file_name.pdf | bla-bla      |
| 3  | Math 140 | www.example.com/subjects/Math+140/file_name.pdf | bla-bla      |
| 4  | ART 240  | www.example.com/subjects/ART +240/file_name.pdf | bla-bla      |
| 5  | ART 240  | www.example.com/subjects/ART +240/file_name.pdf | bla-bla      |
+----+----------+-------------------------------------------------+--------------+
And then just separate the content when I want to display it,
OR build a table for every single subject and then just load the necessary table?
Or if you can suggest something better or more common/useful/efficient please go ahead.
You should read about normalization and relational design before attempting this.
This is a one-to-many relationship - model it as such.
A table for every subject is crazy. You'll have to add a new table for every subject.
A better solution will make it possible to add new subjects simply by adding data. That's what the relational model is all about.
Don't worry about tables; think about it in natural language first.
A SUBJECT(calculus) can have many COURSES(differential, integral, multi-variable).
A COURSE(differential calculus) can have many SECTIONs (Mon 9-10 am in room 2 of the math building).
A STUDENT(first name, last name, student id) can sign up for zero or more SECTIONs. The list of SECTIONs for a given STUDENT is a TRANSCRIPT.
Each STUDENT has one TRANSCRIPT per semester (fall 2012).
A SECTION can have zero or more ASSIGNMENTs.
These are the tables you'll need for this simple problem. Worry about the names and how they relate before you start writing SQL. You'll be glad you did.
I most definitely would NOT create a separate table for each subject. If you did, then if you wanted a query like "list all the homework for student X", you would have to access different tables depending on which subjects that student was enrolled in. Worse, any time someone added a new subject, you would have to create a new table. If down the road you decide you need a new attribute of homework, instead of updating one table, you would have to update every one of these subject tables. It's just bad news all around.
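As a small sketch of the one-to-many alternative, the snippet below keeps the classes table from the question and adds a single files table, so "list all the homework for student X" is one query. It uses Python's sqlite3 module as a stand-in for the real database. The table name class_files, the kind column, the file names in the URLs and the helper homework_for are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table classes (
        id integer primary key,
        subject text not null,
        user_id integer not null
    );
    -- One row per file; a new subject or a new file is just new data.
    create table class_files (
        id integer primary key,
        class_id integer not null references classes (id),
        kind text not null check (kind in ('homework', 'class-paper')),
        url text not null
    );
    insert into classes values (1, 'Math 140', 2), (2, 'ART 240', 2);
    -- Hypothetical file names.
    insert into class_files (class_id, kind, url) values
        (1, 'homework', 'www.example.com/subjects/Math+140/hw1.pdf'),
        (1, 'class-paper', 'www.example.com/subjects/Math+140/paper1.pdf'),
        (2, 'homework', 'www.example.com/subjects/ART+240/hw1.pdf');
""")


def homework_for(user_id):
    """All homework files for one user, across every class they have."""
    return conn.execute("""
        select c.subject, f.url
        from classes c
        inner join class_files f on f.class_id = c.id
        where c.user_id = ? and f.kind = 'homework'
        order by c.id, f.id
    """, (user_id,)).fetchall()


print(homework_for(2))
```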