Data Modeling: ethnicities with parent-child relationship? - mysql

I have a site where I want users to be able to identify their ethnicities. What's the best way to model this if there is only 1 level of hierarchy?
Solution 1 (single table):
Ethnicity
- Id
- Parent Id
- Name
Solution 2 (two tables):
Ethnicity Group
- Id
- Name
Ethnicity
- Id
- Ethnicity Group Id
- Name
I will be using this so that users can search for other users based on ethnicity. Which of the 2 approaches will work better for me? Is there another approach I have not considered? I'm using MySQL.

Well there is such a thing as an Ethnicity Group in the real world, so you do need two tables, not one. The real world has three levels (the top-most would be Race), but I understand that may not be necessary here. If you squash the three levels into two, you have to be careful, and lay them all out properly at the beginning. However, they will be vulnerable to people saying they want the real thing, and you may have to change it, or change the structure to fit more in ... (much more work later).
If you do it correctly, as per real world, that problem is eliminated. Let me know if you want Race, and I will change the model.
The tables are far too small, and the keys are too meaningful, to add Id-iot columns to them; leave them as pure Relational keys, otherwise you will lose the power of the Relational engine. If you really want narrow keys, use a CHAR(2) EthnicityCode, rather than a NUMERIC(10,0) or a meaningless number.
Link to Ethnicity Data Model (plus the answer to your other question)
Link to IDEF1X Notation for those who are unfamiliar with the Relational Modelling Standard.

If there is nothing like an "ethnicity group" in the real world, I'd suggest you don't introduce one in your data model.
All the queries you can do with the second one you can also do with the first one, because you can just select FROM ethnicity AS e1 JOIN ethnicity AS e2 ON (e2.ethnicity_id = e1.parent_id).
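A minimal sketch of that self-join, using Python's built-in sqlite3 as a stand-in for MySQL. The table layout, column names and the placeholder rows are assumptions made for illustration only; the question does not show real data.

```python
import sqlite3

# SQLite stands in for MySQL here; names and rows are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ethnicity (
        ethnicity_id INTEGER PRIMARY KEY,
        parent_id    INTEGER REFERENCES ethnicity(ethnicity_id),  -- NULL for a top-level group
        name         TEXT NOT NULL
    );
    INSERT INTO ethnicity VALUES
        (1, NULL, 'Group A'),
        (2, 1,    'Ethnicity A1'),
        (3, 1,    'Ethnicity A2'),
        (4, NULL, 'Group B'),
        (5, 4,    'Ethnicity B1');
""")


def children_with_group(conn):
    """Return (child name, group name) pairs by joining the table to itself."""
    return conn.execute("""
        SELECT e1.name, e2.name
        FROM ethnicity AS e1
        JOIN ethnicity AS e2 ON e2.ethnicity_id = e1.parent_id
        ORDER BY e1.ethnicity_id
    """).fetchall()


pairs = children_with_group(conn)
print(pairs)
```

Top-level rows have no parent, so the inner join drops them and only the child rows come back, each paired with its group.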

I don't want to be awkward, but what are you going to do with people of mixed descent? I think that the best that you can hope for is a simple single-level enumeration like the kind of thing you get on census forms (e.g. 'Black', 'White', 'Asian', 'Hispanic' etc). It's not ideal, but it allows people to fairly easily self-identify. Concepts like race and ethnicity are wooly enough without trying to create additional (largely meaningless) hierarchies on top of them, so my gut feeling is to keep it simple.

Related

database design: X:X to 1:many vs X:X to 0:many

(edited 1/5 10:22hr. added some explanation about my notation. And added some additional information I received)
I am doing a course on database design and currently we're doing ERDs and designing databases in MySQL Workbench. Think 1st, 2nd and 3rd NF, creating schemas, tables, constraints, etc.
Most of it is pretty clear to me.
However there's one aspect where things remain unclear: the X:X to 1:many relationship vs the X:X to 0:many relationship (meaning: whatever to 0:many, vs whatever to 1:many, etc).
In some cases it's obvious, in others not so much. Whenever it's unclear to me, it's mostly something like this:
Example :
an artist has 1 to many paintings. A painting has 1-and only 1 artist.
Relationship:
|artist| 1:1 -------- 1:many |painting|
the same in another notation
|artist| ||------------ 1< |painting|
This seems fair, but....Then there's the thought: I could be a new artist, not having produced a painting yet.
Or: I could be entering a new artist into an artist table, not having entered his paintings yet (which could lead to a practical issue).
Another example:
A workshop has 1 to many participants. A participant enters 0-to-many workshops.
Relationship:
|workshop| many:0 ------- 1:many |participant|
Okay. However: a workshop could have 0 participants (no one wants to participate, probably leading to cancellation).
Or: I could be entering a new workshop into a table, not having added any participants yet.
Another example:
An event is held at 1 and only 1 location. A location has 1 to many events.
Relationship: |event| many:1 -------- 1:1 |location|
However, maybe you're entering a new (future) location, and there have not been events there yet.
Long story short: I am having a hard time establishing the minimum cardinality in cases like the above.
Also, when I'm designing a db and get Workbench to forward engineer the SQL for creating the tables (based on my ERD), there doesn't seem to be any difference between a X to 1/many vs a X to 0/many variant. Which makes me wonder: what's the actual (practical) effect or implication of doing one or the other? Maybe the implications (further down the road) make for an easier choice?
Can anyone explain this matter in a simple (fool-proof) way?
Looking forward to understanding!
Addition 1/5:
I've talked about my question/issue with a teacher. He agreed with me that certain minimum cardinalities could lead to a deadlock:
one table cannot be inserted into without there being an occurrence in the other, and vice versa.
He explained to me that the ERD is a logical model, not per se a physical model. In other words, the ERD's minimum cardinality is not necessarily meant for technical implementation.
Well, if that is the case, I understand his point. Usually an artist has at least one painting. A workshop normally has at least one participant. A location usually has at least one event. So on a logical level, that seems fine.
On a technical/implementation level, it is another deal. You should be able to enter an artist, workshop or location without there already being occurrences in another table.
My question now is:
is this true? Is an ERD a logical model, not a technical model?
and if that is so, WHAT is the reason for adding the minimum cardinality? It seems of little use.
Let's continue with your artist::paintings relationship, I think that's the clearest. When we say 1 to many, we often mean 0/1 to 0/many, but not all the 4 permutations work or are meaningful.
How does "I can have no artists with no paintings" sound? That's the zero-to-zero permutation. It does not sound wrong, but it's a degenerate case that is of no use to us.
1 artist to 0 paintings is OK, as matbailie described, and presents no problems.
1 artist to many paintings is OK, and is the main use case.
0 artists to many paintings is just not correct and should not be supported in the model. A painting must have an artist.
Because cases 1 & 4 don't work, it is not really correct to say 0/1 to 0/many. It is more correct to characterize the cardinality as 1 to 0/many, which encompasses cases 2 & 3 above.
You might say, 'but there are cases where we don't know the artist so we should leave an opportunity to have paintings with no artists'. This statement feels to me like you are leaving the realm of the ERD and entering into physical design. From a design standpoint you could just as easily say there is an artist we just don't know who, so let's create an UNKNOWN ARTIST record and connect those paintings to it.
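As a rough illustration of the "1 to 0/many" reading, here is a sketch using Python's sqlite3 in place of MySQL. The table and column names are assumptions. A NOT NULL foreign key on painting makes the artist side mandatory, nothing forces an artist to have paintings, and an UNKNOWN ARTIST row covers paintings whose artist is not known.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to
conn.executescript("""
    CREATE TABLE artist (
        artist_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE painting (
        painting_id INTEGER PRIMARY KEY,
        title       TEXT NOT NULL,
        artist_id   INTEGER NOT NULL REFERENCES artist(artist_id)  -- minimum cardinality 1
    );
""")

# An artist with zero paintings is allowed (the "1 to 0" case).
conn.execute("INSERT INTO artist VALUES (1, 'New Artist')")
# Sentinel row for paintings whose artist is not known.
conn.execute("INSERT INTO artist VALUES (2, 'UNKNOWN ARTIST')")
conn.execute("INSERT INTO painting VALUES (10, 'Untitled', 2)")


def is_rejected(conn, sql, params):
    """Run an INSERT and report whether a constraint refused it."""
    try:
        conn.execute(sql, params)
    except sqlite3.IntegrityError:
        return True
    return False


insert_painting = "INSERT INTO painting (painting_id, title, artist_id) VALUES (?, ?, ?)"
no_artist_rejected = is_rejected(conn, insert_painting, (11, "Orphan", None))
bad_artist_rejected = is_rejected(conn, insert_painting, (12, "Ghost", 99))

# LEFT JOIN keeps artists that have no paintings and reports a count of 0.
counts = conn.execute("""
    SELECT a.name, COUNT(p.painting_id)
    FROM artist AS a
    LEFT JOIN painting AS p ON p.artist_id = a.artist_id
    GROUP BY a.artist_id
    ORDER BY a.artist_id
""").fetchall()
print(no_artist_rejected, bad_artist_rejected, counts)
```

Requiring at least one painting per artist (a true 1 to 1/many) cannot be expressed with a plain foreign key, which is one reason the minimum on the "many" side tends to stay a logical-model statement.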
Your second example (workshops::participants) looks more like a 0/many to 0/many. If you run through the same 4 permutations, they all look credible, though the 0-to-0 case still seems kind of ludicrous.
Your last example is another 1 to 0/many because events without locations cannot be held. When you get to the physical level you can talk about the best way to handle virtual events.
So, none of your examples seem to show a true zero-to-many (which is more accurately stated 0/1 to 0/many). I'm thinking they are pretty rare, if they exist at all. It would have to be associated with some kind of optional activity where if you did enroll in it there were a constrained set of choices.
But... A participant can be in 0-N workshops. So that needs to be many-to-many.
In general "zero" is a degenerate case of "1" or "N"; it is rarely worth worrying about.
I like to start by identifying the "entities" in the model. Your participant, workshop, painting, event, artist, location are excellent examples of such. Then I look for "relations" between obvious pairs.
I say there are only 3 cases. And I like to remember that the two "entities" are manifested as two "database tables":
1:1 -- At which point I ask why the two entities (database tables) are not merged together. The two tables share the same unique key.
1:many -- Represented by the id of the "1" as a column in the "many" table.
Many:many -- This needs a link table between the two tables. This linking table has two ids.
"Id" means a unique key for a table. It is usually a number, but that is not a requirement.
For "0"...
1:0 or many:0 -- You may need a LEFT JOIN to provide NULLs when the entry on the "0" side is missing
Many:many -- If either id is non-existent (the zero case), then there are no rows for that relationship.
Then comes defining the INDEXes for efficient access. And, optionally, FOREIGN KEYs for integrity. The indexes that represent how two entities are related are prime candidates for FKs. Other INDEXes should be added to optimize WHERE clauses in SQL queries.
In all cases, the id/FK/index may be "composite" -- meaning that it is two or more columns that are used for a single id/FK/index.
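The many:many case and the "zero" handling described above could look roughly like this. It is a sketch in Python's sqlite3 standing in for MySQL; the table names, column names and sample rows are assumptions. The link table carries the two ids as a composite key, and a LEFT JOIN surfaces the workshop that has no participants.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE workshop (
        workshop_id INTEGER PRIMARY KEY,
        title       TEXT NOT NULL
    );
    CREATE TABLE participant (
        participant_id INTEGER PRIMARY KEY,
        name           TEXT NOT NULL
    );
    -- Link table: one row per (workshop, participant) pair, composite key.
    CREATE TABLE workshop_participant (
        workshop_id    INTEGER NOT NULL REFERENCES workshop(workshop_id),
        participant_id INTEGER NOT NULL REFERENCES participant(participant_id),
        PRIMARY KEY (workshop_id, participant_id)
    );
    INSERT INTO workshop VALUES (1, 'Pottery'), (2, 'Welding');
    INSERT INTO participant VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO workshop_participant VALUES (1, 1), (1, 2);
""")

# The "zero" case is simply the absence of link rows; LEFT JOIN still lists
# the workshop and COUNT over the NULL-extended column yields 0.
headcount = conn.execute("""
    SELECT w.title, COUNT(wp.participant_id)
    FROM workshop AS w
    LEFT JOIN workshop_participant AS wp ON wp.workshop_id = w.workshop_id
    GROUP BY w.workshop_id
    ORDER BY w.workshop_id
""").fetchall()
print(headcount)
```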

Is an almost or half empty column worth it, or is another table better? How to code fast or space-saving SQL?

I would like to know how to decide between different database design solutions.
I guess the best way to describe my question is to give an example.
Let's say we want to create a database for cars. Every car has a number of properties we want to save.
There are a lot of Properties every Car has like:
Producer, Model, Color, Age,...
But here are also Properties that are just found in a subcategory or in a small group of cars like:
Draw Bar, Roof Rack, Cargo area, 4 Wheel Drive,...
Some Properties may even only be relevant for less than 5% of the Cars. There are different solutions to solve this.
- The first is dump everything into one table. Of Course Normalized! (not mentioned below)
- The second solution would be creating a table with Properties that every car has. Adding a CartoDrawbar ... table to establish an m:m connection between the rare Properties and the Cars.
- The third possibility i can imagine would be creating Tables for Car Groups like SUVs, Notchback, Truck, Compact, Pickup ... to group cars with similar Properties. (my rare Properties were not the best choice to resemble this).
- Last idea is creating a table with all shared Properties and add a Char or Text Column to fill in everything special.
But which is the best solution, or the fitting solution? Did I forget an important one? Are there differences in speed, file size or ... to consider? Or are there thresholds for when to choose this or that solution? I have a personal favorite but I don't want to influence you, and I don't have enough knowledge about relational databases or management software to judge the speed or file size of a table.
There is no "best" solution. In fact, most of your "rare" columns look more like flags -- a car has 4-wheel drive or it does not, a car has a roof-rack or it does not.
My suggestion is to put these into one table, with separate columns, of the appropriate type.
Then, if you really do have optional features, like say the number of gears in a manual transmission, you can then think about how to implement a list. Nowadays, most databases support JSON and that would be a natural choice for such elements.
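A small sketch of that suggestion, with Python's sqlite3 standing in for the real database. The column names, the sample cars and the `extras` column are assumptions. Here the JSON is stored as text and decoded in Python so the example runs anywhere; in MySQL a native JSON column type would be the closer fit.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE car (
        car_id               INTEGER PRIMARY KEY,
        producer             TEXT NOT NULL,
        model                TEXT NOT NULL,
        -- Rare properties as plain flag columns, defaulting to "no".
        has_four_wheel_drive INTEGER NOT NULL DEFAULT 0,
        has_roof_rack        INTEGER NOT NULL DEFAULT 0,
        has_draw_bar         INTEGER NOT NULL DEFAULT 0,
        -- Truly optional details kept as a JSON document (stored as text here).
        extras               TEXT NOT NULL DEFAULT '{}'
    )
""")
conn.execute(
    "INSERT INTO car (car_id, producer, model, has_four_wheel_drive, extras) "
    "VALUES (?, ?, ?, ?, ?)",
    (1, "ExampleCo", "Trail", 1, json.dumps({"manual_gears": 6})),
)
conn.execute(
    "INSERT INTO car (car_id, producer, model) VALUES (?, ?, ?)",
    (2, "ExampleCo", "City"),
)

# Flags are ordinary columns, so filtering on them needs no join.
four_wd = [row[0] for row in conn.execute(
    "SELECT model FROM car WHERE has_four_wheel_drive = 1"
)]

# Optional details are decoded from the JSON text per row.
extras_by_model = {
    model: json.loads(extras)
    for model, extras in conn.execute("SELECT model, extras FROM car ORDER BY car_id")
}
print(four_wd, extras_by_model)
```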

Restructure Inventory Management Database (2 to 3 Tables; Development Stage)

I’m developing a database. I’d appreciate some help restructuring 2 to 3 tables so the database is both compliant with the first 3 normal forms; and practical to use and to expand on / add to in the future. I want to invest time now to reduce effort / and confusion later.
PREAMBLE
Please be aware that I'm both a newbie and an amateur, though I have a certain amount of experience and skill and an abundance of enthusiasm!
BACKGROUND TO PROJECT
I am writing a small (though ambitious!) web application (using PHP and AJAX to a MySQL database). It is essentially an inventory management system, for recording and viewing the current location of each individual piece of equipment, and its maintenance history. If relevant, transactions will be very low (probably less than 100 a day, but with a possibility of simultaneous connections / operations). Row count will also be very low (maybe a few thousand).
It will deal with many completely different categories of equipment, eg bikes and lamps (to take random examples). Each unit of equipment will have its details or specifications recorded in the database. For a bike, an important specification might be frame colour, whereas a lamp it might require information regarding lampshade material.
Since the categories of equipment have so little in common, I think the most logical way to store the information is 1 table per category. That way, each category can have columns specific to that category.
I intend to store a list of categories in a separate table. Each category will have an id which is unique to that category. (Depending on the final design, this may function as a lookup table and / or as a table to run queries against.) There are likely to be very few categories (perhaps 10 to 20), unless the system is particularly successful and it expands.
A list of bikes will be held in the bikes table.
Each bike will have an id which is unique to that bike (eg bike 0001).
But the same id will exist in the lamp table (ie lamp 0001).
With my application, I want the user to select (from a dropdown list) the category type (eg bike).
They will then enter the object's numeric id (eg 0001).
The combination of these two ids is sufficient information to uniquely identify an object.
Images:
Current Table Design
Proposed Additional Table
PROBLEM
My gut feeling is that there should be an “overarching table” that encompasses every single article of equipment no matter what category it comes from. This would be far simpler to query against than god knows how many mini tables. But when I try to construct it, it seems like it will break various normal forms. Eg introducing redundancy, possibility of inconsistency, referential integrity problems etc. It also begins to look like a domain table.
Perhaps the overarching table should be a query or view rather than an entity?
Could you please have a look at the screenshots and let me know your opinion. Thanks.
For various reasons, I’d prefer to use surrogate keys rather than natural keys if possible. Ideally, I’d prefer to have that surrogate key in a single column.
Currently, the bike (or lamp) table uses just the first column as its primary key. Should I expand this to a composite key including the Equipment_Category_ID column too? Then make the Equipment_Article table into a view joining on these two columns (iteratively for each equipment category). Optionally Bike_ID and Lamp_ID columns could be renamed to something generic like Equipment_Article_ID. This might make the query simpler, but is there a risk of losing specificity? It would / could still be qualified by the table name.
Speaking of redundancy, the Equipment_Category_ID in the current lamp or bike tables seems a bit redundant (if every item / row in that table has the same value in that column).
It all still sounds messy! But surely this must be very common problem for eg online electronics stores, rental shops, etc. Hopefully someone will say oh that old chestnut! Fingers crossed! Sorry for not being concise, but I couldn't work out what bits to leave out. Most of it seems relevant, if a bit chatty. Thanks in advance.
UPDATE 27/03/2014 (Reply to #ElliotSchmelliot)
Hi Elliot.
Thanks for your reply and for pointing me in the right direction. I studied OOP (in Java) but wasn't aware that something similar was possible in SQL. I read the link you sent with interest, and the rest of the site/book looks like a great resource.
Does MySQL InnoDB Support Specialization & Generalization?
Unfortunately, after 3 hours searching and reading, I still can't find the answer to this question. Keywords I'm searching with include: MySQL + (inheritance | EER | specialization | generalization | parent | child | class | subclass). The only positive result I found is here: http://en.wikipedia.org/wiki/Enhanced_entity%E2%80%93relationship_model. It mentions MySQL Workbench.
Possible Redundancy of Equipment_Category (Table 3)
Yes and No. Because this is a lookup table, it currently has a function. However because every item in the Lamp or the Bike table is of the same category, the column itself may be redundant; and if it is then the Equipment_Category table may be redundant... unless it is required elsewhere. I had intended to use it as the RowSource / OptionList for a webform dropdown. Would it not also be handy to have Equipment_Category as a column in the proposed Equipment parent table. Without it, how would one return a list of all Equipment_Names for the Lamp category (ignoring distinct for the moment).
Implementation
I have no way of knowing what new categories of equipment may need to be added in future, so I’ll have to limit attributes included in the superclass / parent to those I am 100% sure would be common to all (or allow nulls I suppose); sacrificing duplication in many child tables for increased flexibility and hopefully simpler maintenance in the long run. This is particularly important as we will not have professional IT support for this project.
Changes really do have to be automated. So I like the idea of the stored procedure. And the CreateBike example sounds familiar (in principle if not in syntax) to creating an instance of a class in Java.
Lots to think about and to teach myself! If you have any other comments, suggestions etc, they'd be most welcome. And, could you let me know what software you used to create your UML diagram. Its styling is much better than those that I've used.
Cheers!
You sound very interested in this project, which is always awesome to see!
I have a few suggestions for your database schema:
You have individual tables for each Equipment entity i.e. Bike or Lamp. Yet you also have an Equipment_Category table, purely for identifying a row in the Bike table as a Bike or a row in the Lamp table as a Lamp. This seems a bit redundant. I would assume that each row of data in the Bike table represents a Bike, so why even bother with the category table?
You mentioned that your "gut" feeling is telling you to go for an overarching table for all Equipment. Are you familiar with the practice of generalization and specialization in database design? What you are looking for here is specialization (also called "top-down".) I think it would be a great idea to have an overarching or "parent" table that represents Equipment. Then, each sub-entity such as Bike or Lamp would be a child table of Equipment. A parent table only has the fields that all child tables share.
With these suggestions in mind, here is how I might alter your schema:
In the above schema, everything starts as Equipment. However, each Equipment can be specialized into Lamp, Bike, etc. The Equipment entity has all of the common fields. Lamp and Bike each have fields specific to their own type. When creating an entity, you first create the Equipment, then you create the specialized entity. For example, say we are adding the "BMX 200 Ultra" bike. We first create a record in the Equipment table with the generic information (equipmentName, dateOfPurchase, etc.) Then we create the specialized record, in this case a Bike record with any additional bike-specific fields (wheelType, frameColor, etc.) When creating the specialized entities, we need to make sure to link them back to the parent. This is why both the Lamp and Bike entities have a foreign key for equipmentID.
An easy and effective way to add specialized entities is to create a stored procedure. For example, lets say we have a stored procedure called CreateBike that takes in parameters bikeName, dateOfPurchase, wheelType, and frameColor. The stored procedure knows we are creating a Bike, and therefore can easily create the Equipment record, insert the generic equipment data, create the bike record, insert the specialized bike data, and maintain the foreign key relationship.
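The parent/child layout and the CreateBike idea can be sketched as follows. This uses Python's sqlite3 and an ordinary function in place of a MySQL stored procedure, so it shows the shape of the technique rather than the real thing. The snake_case names are assumptions, as is the choice to make the bike's foreign key its primary key too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Parent table: only the fields every kind of equipment shares.
    CREATE TABLE equipment (
        equipment_id     INTEGER PRIMARY KEY,
        equipment_name   TEXT NOT NULL,
        date_of_purchase TEXT NOT NULL
    );
    -- Child table: bike-specific fields, linked back to the parent row.
    CREATE TABLE bike (
        equipment_id INTEGER PRIMARY KEY REFERENCES equipment(equipment_id),
        wheel_type   TEXT NOT NULL,
        frame_color  TEXT NOT NULL
    );
""")


def create_bike(conn, bike_name, date_of_purchase, wheel_type, frame_color):
    """Insert the generic row, then the specialized row, as one transaction."""
    with conn:  # commits on success, rolls back if either insert fails
        cur = conn.execute(
            "INSERT INTO equipment (equipment_name, date_of_purchase) VALUES (?, ?)",
            (bike_name, date_of_purchase),
        )
        equipment_id = cur.lastrowid
        conn.execute(
            "INSERT INTO bike (equipment_id, wheel_type, frame_color) VALUES (?, ?, ?)",
            (equipment_id, wheel_type, frame_color),
        )
    return equipment_id


bike_id = create_bike(conn, "BMX 200 Ultra", "2014-03-01", "20 inch", "blue")

# A join is only needed when fields from both parent and child are wanted.
blue_bikes = conn.execute("""
    SELECT e.equipment_name, b.frame_color
    FROM equipment AS e
    JOIN bike AS b ON b.equipment_id = e.equipment_id
    WHERE b.frame_color = 'blue'
""").fetchall()
print(bike_id, blue_bikes)
```

Wrapping both inserts in one transaction is what keeps a half-created bike (an equipment row with no bike row) from being left behind if the second insert fails.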
Using specialization will make your transactional life very simple. For example, if you want all Equipment purchased before 1/1/14, no joins are needed. If you want all Bikes with a frameColor of blue, no joins are needed. If you want all Lamps made of felt, no joins are needed. The only time you will need to join a specialized table back to the Equipment table is if you want data both from the parent entity and the specialized entity. For example, show all Lamps that use 100 Watt bulbs and are named "Super Lamp."
Hope this helps and best of luck!
Edit
Specialization and Generalization, as mentioned in your provided source, is part of the Enhanced Entity Relationship (EER) model, which helps define a conceptual data model for your schema. As such, it does not need to be "supported" per se; it is more of a design technique. Therefore any database schema naturally supports specialization and generalization as long as the designer implements it.
As far as your Equipment_Category table goes, I see where you are coming from. It would indeed make it easy to have a dropdown of all categories. However, you could simply have a static table (only contains Strings that represent each category) to help with this population, and still keep your Equipment tables separate. You mentioned there will only be around 10-20 categories, so I see no reason to have a bridge between Equipment and Equipment_Category. The fewer joins the better. Another option would be to include an "equipmentCategory" field in the Equipment table instead of having a whole table for it. Then you could simply query for all unique equipmentCategory values.
I agree that you will want to keep your Equipment table to guaranteed common values between all children. Definitely. If things get too complicated and you need more defined entities, you could always break entities up again down the road. For example maybe half of your Bike entities are RoadBikes and the other half are MountainBikes. You could always continue the specialization break down to better get at those unique fields.
Stored Procedures are great for automating common queries. On top of that, parametrization provides an extra level of defense against security threats such as SQL injections.
I use SQL Server. The diagram I created is straight out of SQL Server Management Studio (SSMS). You can simply expand a database, right click on the Database Diagrams folder, and create a new diagram with your selected tables. SSMS does the rest for you. If you don't have access to SSMS I might suggest trying out Microsoft Visio or if you have access to it, Visual Paradigm.

Database Model for Retail Sporting Goods Company

I am a professional designer that has done some databases. I would like some feedback on this on any big mistakes I am making in the table configurations and how the PK and FK relate.
The blue boxes represent data that will come from another database.
Click here to see database design
Click here to see New Design Changed the product sizes and color table
In keeping with what Gilbert Le Blanc described, you could make this more scalable and efficient as follows:
A. Anytime you find yourself adding columns for items which represent possible user choices, consider whether they should actually be modelled as ROWS in a new table. This is referred to as "Normalization" (there's more to it than that, but for this purpose, it should cover what I'm trying to say . . .), and is key to proper database design. If you fail to normalize properly, you will experience extensive pain and regret down the road. Imagine one of your suppliers introduces a new color 6 months after you go live with your database. You will have to re-code your data-access routines just to add that color to whatever front-end presentation you are creating.
B. You MIGHT want to combine some of your Category/Sub-Category/Class Structure into one or two tables. While I don't have a concrete suggestion without knowing more about the retail biz, it seems like there may be any number of hierarchies, depending upon the product. In theory, you could actually get away with a SINGLE table for this:
**tblCategories**
CategoryID Int PK
ParentCategoryID Int FK on tblCategories CategoryID
CategoryName
Records with a ParentCategoryID > 0 are sub-categories.
I am going to attempt attaching an image (I have not done that here on SO before) of what I have just described. Caveats:
I am working in SQL Server, so things might look a tad different to you.
I have over-simplified the model for the purpose of this example. But it does illustrate the relationships I am describing.
There may be others with better suggestions for modelling the Product/Categories. The concept I have presented can be challenging to keep straight in your head, but makes use of recursive relationships to create a very flexible/scalable table structure.
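To make the recursive relationship concrete, here is a sketch of the single-table layout with a recursive query that walks one branch. It runs on Python's sqlite3 rather than SQL Server. The sample categories are invented, and top-level rows use a NULL ParentCategoryID here (an assumption; the description above implies 0 for top-level rows). MySQL 8.0 accepts the same WITH RECURSIVE form, while SQL Server writes recursive CTEs without the RECURSIVE keyword.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblCategories (
        CategoryID       INTEGER PRIMARY KEY,
        ParentCategoryID INTEGER REFERENCES tblCategories(CategoryID),  -- NULL = top level
        CategoryName     TEXT NOT NULL
    );
    INSERT INTO tblCategories VALUES
        (1, NULL, 'Footwear'),
        (2, 1,    'Running'),
        (3, 2,    'Trail Running'),
        (4, NULL, 'Apparel');
""")


def subtree(conn, root_id):
    """Return (name, depth) for a category and everything beneath it."""
    return conn.execute("""
        WITH RECURSIVE tree(CategoryID, CategoryName, depth) AS (
            SELECT CategoryID, CategoryName, 0
            FROM tblCategories
            WHERE CategoryID = ?
            UNION ALL
            SELECT c.CategoryID, c.CategoryName, t.depth + 1
            FROM tblCategories AS c
            JOIN tree AS t ON c.ParentCategoryID = t.CategoryID
        )
        SELECT CategoryName, depth FROM tree ORDER BY depth, CategoryID
    """, (root_id,)).fetchall()


footwear_tree = subtree(conn, 1)
print(footwear_tree)
```

Because the depth is not fixed by the schema, adding a fourth or fifth level later is just more rows, not more tables.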
I think you are on the right track. However, there are still some areas for (potentially) significant improvement in your normalization. I say potentially because I don't know enough about the sports apparel business, sizing, and the like. However, some observations:
A. I see the same entities represented in several different tables, i.e. Nike, Adidas, etc. While I understand that one vendor may have several different brands, your table structure could make this more clear. If "Nike" is the vendor, then possible Brands of Nike might be Nike, Converse, whatever other brands Nike provides. If this is what your table does, then forgive.
B. Your apparel sizing table might have some potential for additional normalization, or maybe not. Seems complex, and again, I don't know enough about the relationships represented there. I DO see what appears to be repetition of data in fields which might be better represented as rows in other tables.
C. An example of what I describe in B. is to be had with the footwear sizing. THIS can be normalized more effectively. Note that I have rather arbitrarily placed the FK for GenderCategory in tblFootwear_Sizing_Index, it MIGHT belong in tblFootwearSizes. Again, don't know enough about the footwear industry. But beyond that quibble, you will find the following arrangement more effective and manageable:
There are other areas in your model which might lend themselves to similar restructuring. However, as I said, it becomes hard for me to see given my lack of knowledge of your industry. I STILL think you might want to re-examine the many flavors of "Category" and "Class". Further, you most definitely should find some more descriptive names for some of those Category/Class tables (or any table, really). Think "ProductCategory", "GenderCategory", "FootwearCategory", etc. Also, don't be too afraid of longer table names, if they make it easier for you (or more importantly, your successor four years from now) to discern what is going on in your code. It may be more cumbersome to type now, but 6 months after you go live, when you are trying to figure out why one of your queries is not returning as expected, you will be glad you did. After all, you can always alias the table names in general use.
I strongly recommend checking out some info on database normalization, then try to apply it to your model. Getting the back-end db model right from jump can make or break your application. Here is one of many articles I got back by googling "Database Normalization":
http://databases.about.com/od/specificproducts/a/3nf.htm
This article is focused on the Third Normal Form (3NF), but provides links to 1NF and 2NF, which are pre-requisites for 3NF.
You should always strive for a minimum of 3NF in a database design.
Hope that helps, and I would love to hear how you progress on this.
You have 2 footwear size tables.
Taking the apparel size table as an example, you get more flexibility if you make the size one of the columns.
apparelSizeId   Size   Sort order
1               M      1
1               L      2
1               XL     3
2               S      1
2               M      2
2               L      3
With this type of table design, it's easier to add new sizes.
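A quick sketch of the sizes-as-rows table above, using Python's sqlite3. The snake_case names and the composite key are assumptions; the rows are the ones listed in the table. It shows that a new size is a single INSERT rather than a schema change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE apparel_size (
        apparel_size_id INTEGER NOT NULL,  -- identifies a size range, not a single size
        size            TEXT    NOT NULL,
        sort_order      INTEGER NOT NULL,
        PRIMARY KEY (apparel_size_id, size)
    );
    INSERT INTO apparel_size VALUES
        (1, 'M', 1), (1, 'L', 2), (1, 'XL', 3),
        (2, 'S', 1), (2, 'M', 2), (2, 'L', 3);
""")

# Adding a size later is one new row; no column has to be added.
conn.execute("INSERT INTO apparel_size VALUES (1, 'XXL', 4)")


def sizes_for(conn, apparel_size_id):
    """List the sizes in one range, in display order."""
    return [row[0] for row in conn.execute(
        "SELECT size FROM apparel_size WHERE apparel_size_id = ? ORDER BY sort_order",
        (apparel_size_id,),
    )]


print(sizes_for(conn, 1), sizes_for(conn, 2))
```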
You can also combine a lot of your size and style tables into one table, although it does make the design harder for business types to understand.

Anyone used SQL Server 2008 HierarchyID type to store genealogy data

I have a genealogical database (about sheep actually) that is used by breeders to research genetic information. In each record I store fatherid and motherid. In a separate table I store complete 'roll up' information so that I can quickly tell the complete family tree of any animal without recursing through the entire database...
I recently discovered the hierarchyid type built into SQL Server 2008. On the surface it sounds promising, but I am wondering if anyone has used it enough to know whether or not it would be appropriate in my type of app (i.e. two parents, multiple kids). All the samples I have found/read so far deal with manager/employee type relationships where a given boss can have multiple employees, and each employee can have a single boss.
The needs of my app are similar, but not quite the same.
I am sure I will dig into this new technology anyway, but it would be nice to shortcut my research if someone already knew that it was not designed in such a fashion that it would allow me to make use of it.
I am also curious what kind of performance people are seeing using this new data type versus other methods that do the same thing.
Assuming each sheep has one male parent and one female parent, and that no sheep can be its own parent (leading to an Ovine Temporal Paradox), then what about using two HierarchyIDs?
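CREATE TABLE dbo.Sheep(
    MotherHID hierarchyid NOT NULL,  -- position in the maternal hierarchy
    FatherHID hierarchyid NOT NULL,  -- position in the paternal hierarchy
    Name int NOT NULL
)
GO
-- Joint primary key over both hierarchy positions
ALTER TABLE dbo.Sheep
ADD CONSTRAINT PK_Sheep PRIMARY KEY CLUSTERED (
    MotherHID,
    FatherHID
)
GO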
By making them a joint PK, you'd be uniquely identifying each sheep as the product of its maternal hierarchy and its paternal hierarchy.
There may be some inherent problem lurking here, so proceed cautiously with a couple simple prototypes - but initially it seems like it would work for you.
I can't see how it would work; in a regular hierarchy, there is a single chain to the root, so it can store the path (which is what the binary is) to each node. However, with multiple parents, this isn't possible: even if you split matriarchy and patriarchy, you still have 1 mother, 2 grandmothers, 4 great-grandmothers, etc. (not even getting into some of the more "interesting" scenarios possible, especially with livestock). There is no single logical path to encode, so no: I don't think that this can work in your case.
I'm happy to be corrected, though.
Using two separate HierarchyID to indicate father and mother would work well.
However, you definitely would NOT want to use those as a unique indicator of the row, since it's a 2-to-many situation. (Two sheep can have multiple children.)
I don't see anything inherently wrong with using HierarchyId for ancestry--for Sheep at least. For people, the relationships are much more complicated than "this person begat that person", so obviously the use would be limited to breeding.
SQL Server hierarchyID is not a robust solution for many genealogy analytic questions. It is based on ORDPATH and I've used it for awhile in genealogy; but there are too many scenarios in genealogy that cannot be readily addressed with ORDPATH methods for directed acyclic graphs. A graph database is much more robust and well suited for genealogy. I use Neo4j: http://stumpf.org/genealogy-blog/graph-databases-in-genealogy.
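For comparison, the plain fatherid/motherid layout from the question can be walked with a recursive query, with no hierarchyid or graph database involved. This is only a sketch of that alternative, using Python's sqlite3 instead of SQL Server; the table, the column names and the five sample animals are all assumptions. The sample deliberately includes a sibling mating so that two ancestors are reachable along more than one path.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sheep (
        sheep_id  INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        father_id INTEGER REFERENCES sheep(sheep_id),
        mother_id INTEGER REFERENCES sheep(sheep_id)
    );
    INSERT INTO sheep VALUES
        (1, 'Ram A',  NULL, NULL),
        (2, 'Ewe B',  NULL, NULL),
        (3, 'Ram C',  1,    2),
        (4, 'Ewe D',  1,    2),
        (5, 'Lamb E', 3,    4);
""")


def ancestors(conn, sheep_id):
    """Return the names of all ancestors of one animal, each listed once."""
    rows = conn.execute("""
        WITH RECURSIVE
        -- Flatten the two parent columns into (child, parent) edges.
        parent_edge(child_id, parent_id) AS (
            SELECT sheep_id, father_id FROM sheep WHERE father_id IS NOT NULL
            UNION ALL
            SELECT sheep_id, mother_id FROM sheep WHERE mother_id IS NOT NULL
        ),
        -- Follow edges upward; UNION drops ancestors reached by several paths.
        anc(sheep_id) AS (
            SELECT parent_id FROM parent_edge WHERE child_id = :id
            UNION
            SELECT e.parent_id
            FROM parent_edge AS e
            JOIN anc AS a ON e.child_id = a.sheep_id
        )
        SELECT s.name
        FROM anc
        JOIN sheep AS s ON s.sheep_id = anc.sheep_id
        ORDER BY s.sheep_id
    """, {"id": sheep_id}).fetchall()
    return [row[0] for row in rows]


lamb_ancestors = ancestors(conn, 5)
print(lamb_ancestors)
```

Ram A and Ewe B are reached through both Ram C and Ewe D, which is the multiple-path situation a single encoded hierarchy path cannot represent.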