Drupal MySql database design question - mysql

I was just looking at the MySQL database created by Drupal after I installed it.
All the tables are in MyISAM.
With software as complex as Drupal, wouldn't it make more sense to
use foreign keys and hence InnoDB tables to enforce referential integrity?
Without foreign keys, all the constraint checking has to happen on the
PHP side.

MySQL offers a variety of storage engines for a reason: different engines have different advantages and disadvantages. InnoDB is a great engine that offers referential integrity as well as transaction safety, but it is poorly optimized for the typical web site use case, where reads outnumber writes by an order of magnitude.
MyISAM offers the best performance for a web site where most hits need only read access to the database. In such cases referential integrity can most often be maintained by writing your inserts and deletes in such a way that they cannot succeed if they would compromise integrity.
For example, instead of writing
DELETE FROM mytable WHERE id = 5;
you can write
DELETE mytable FROM mytable LEFT JOIN linkedtable ON mytable.id = linkedtable.ref WHERE mytable.id = 5 AND linkedtable.ref IS NULL;
This deletes the row only when there are no external references to it.
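The guarded delete described above can be tried out without a MySQL server. The following is only a sketch, assuming Python's built-in sqlite3 module as a stand-in: SQLite does not accept MySQL's multi-table DELETE syntax, so the same guard is written with NOT EXISTS instead of a LEFT JOIN. The helper name guarded_delete and the sample rows are hypothetical.

```python
import sqlite3


def guarded_delete(conn, row_id):
    """Delete a mytable row only if no linkedtable row references it.

    Returns the number of rows actually deleted (0 or 1).
    """
    cur = conn.execute(
        "DELETE FROM mytable "
        "WHERE id = ? "
        "AND NOT EXISTS ("
        "    SELECT 1 FROM linkedtable WHERE linkedtable.ref = mytable.id"
        ")",
        (row_id,),
    )
    return cur.rowcount


# Assumption: an in-memory SQLite database stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER PRIMARY KEY);
    CREATE TABLE linkedtable (ref INTEGER);  -- no foreign key declared
    INSERT INTO mytable (id) VALUES (5), (6);
    INSERT INTO linkedtable (ref) VALUES (5);
""")

deleted_referenced = guarded_delete(conn, 5)    # row 5 is still referenced
deleted_unreferenced = guarded_delete(conn, 6)  # row 6 has no references
remaining_ids = [r[0] for r in conn.execute("SELECT id FROM mytable ORDER BY id")]
```

Note that the caller has to check the affected-row count to learn whether the delete went through; unlike a real foreign key, a guarded statement does not raise an error when it refuses to delete.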

MySql | relational database vs non relational database in terms of Performance

What I want to ask is: if we define relations (one-to-one, one-to-many, etc.), will that improve performance compared to not creating relations and just joining the tables on the fly, like
select * from employee inner join user on user.user_id = employee.user_id
I know this question has been asked before, and most answers I have found say that performance isn't affected by not using relations.
But I have also heard that creating indexes makes queries faster, so is it possible to create indexes on the foreign key columns without creating relations? I'm a little confused about indexes.
And what if we have a large database, say 100+ tables and a lot of records: will relations matter for query performance?
I'm using MySQL and PHP.
Foreign keys are basically used for data integrity.
Of course, indexing boosts performance.
Regarding performance with or without foreign keys: when people say foreign keys improve performance, it is because defining a foreign key implicitly defines an index. That index is created on the referencing table automatically if one does not already exist.
Relations are used to maintain the referential integrity of the database. They do not affect the performance of "select" queries at all. They do reduce the performance of "insert", "update" and "delete" queries, but you rarely want a relational database without referential integrity.
Indexes are what make "select" queries run faster. They also make insert and update queries slower, because every index has to be maintained on each write. To learn more about how indexes work, go to use-the-index-luke. It is by far the best site on this topic that I have found.
That said, databases usually create indexes automatically when you declare a primary key, and some of them (MySQL in particular) also create indexes automatically when you define a foreign key. You can read all about why they do that on the above site.
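To make the point about indexes without relations concrete, here is a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (in MySQL the same CREATE INDEX statement applies, and EXPLAIN plays the role of EXPLAIN QUERY PLAN). The extra columns and the index name idx_employee_user_id are hypothetical; note that SQLite, unlike MySQL, never creates an index for a foreign key on its own.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, name TEXT);
    -- employee.user_id is a plain column: no FOREIGN KEY is declared
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        user_id INTEGER,
        title TEXT
    );
    -- the index is created explicitly, independent of any relation
    CREATE INDEX idx_employee_user_id ON employee (user_id);
    INSERT INTO user (user_id, name) VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO employee (user_id, title) VALUES (1, 'engineer'), (2, 'analyst');
""")

# The join from the question works without any declared relation.
join_rows = conn.execute(
    "SELECT user.name, employee.title "
    "FROM employee INNER JOIN user ON user.user_id = employee.user_id "
    "ORDER BY user.user_id"
).fetchall()

# Ask the planner how it would look up employees by user_id.
plan_rows = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM employee WHERE user_id = ?", (1,)
).fetchall()
plan_text = " ".join(row[3] for row in plan_rows)  # 4th column is the plan detail
```

The plan text names idx_employee_user_id, which shows the lookup uses the index even though no foreign key exists.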

What do you think about cascading deletions on mysql tables?

The question is in the title!
The database I'm using to store data from my (production) website contains a lot of ON DELETE CASCADE constraints.
I'd just like to know whether that is a good thing, or whether it is better to code all deletions manually.
On the one hand, it's not very explicit: deletions happen by magic. On the other hand, it makes development easier: I don't have to keep the entire schema of my database in my head.
I think maintaining referential integrity is a good thing to be doing. The last thing you'd want is orphaned rows in your database.
See the MySQL documentation on things to consider when not using referential integrity:
MySQL gives database developers the choice of which approach to use. If you don't need foreign keys and want to avoid the overhead associated with enforcing referential integrity, you can choose another storage engine instead, such as MyISAM. (For example, the MyISAM storage engine offers very fast performance for applications that perform only INSERT and SELECT operations. In this case, the table has no holes in the middle and the inserts can be performed concurrently with retrievals. See Section 8.10.3, “Concurrent Inserts”.)
If you choose not to take advantage of referential integrity checks, keep the following considerations in mind:
In the absence of server-side foreign key relationship checking, the application itself must handle relationship issues. For example, it must take care to insert rows into tables in the proper order, and to avoid creating orphaned child records. It must also be able to recover from errors that occur in the middle of multiple-record insert operations.
If ON DELETE is the only referential integrity capability an application needs, you can achieve a similar effect as of MySQL Server 4.0 by using multiple-table DELETE statements to delete rows from many tables with a single statement. See Section 13.2.2, “DELETE Syntax”.
A workaround for the lack of ON DELETE is to add the appropriate DELETE statements to your application when you delete records from a table that has a foreign key. In practice, this is often as quick as using foreign keys and is more portable.
Be aware that the use of foreign keys can sometimes lead to problems:
Foreign key support addresses many referential integrity issues, but it is still necessary to design key relationships carefully to avoid circular rules or incorrect combinations of cascading deletes.
It is not uncommon for a DBA to create a topology of relationships that makes it difficult to restore individual tables from a backup. (MySQL alleviates this difficulty by enabling you to temporarily disable foreign key checks when reloading a table that depends on other tables. See Section 14.3.5.4, “FOREIGN KEY Constraints”. As of MySQL 4.1.1, mysqldump generates dump files that take advantage of this capability automatically when they are reloaded.)
Source: http://dev.mysql.com/doc/refman/5.5/en/ansi-diff-foreign-keys.html
Cascading deletes are a great tool for you to use provided you make sure only to use them where it makes perfect sense to do so.
The main situation in which you would opt for a cascading delete is when you have a table that models entities that are "owned" by one (and only one) row in another table. For example, suppose you have a table that models people and a table that models phone numbers. Here your phone numbers table would have a foreign key to your people table. Now if you decide you no longer want your application to keep track of someone, say "Douglas", it makes perfect sense that you don't want to keep track of Douglas's phone numbers any more, either. There is no sense in having a phone number floating around in your database without knowing whose it is.
But at the same time, when you want to delete a person from the "people" table, you don't want to first have to laboriously check whether you have any phone numbers for that person and delete them. Why do that when you can encode into the database structure the rule that when a person is deleted, their phone numbers can all go as well? That is what a cascading delete will do for you. Just make sure you know what cascading deletes you have, and that they all make sense.
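The people and phone numbers example can be sketched as follows. This assumes Python's built-in sqlite3 module as a stand-in for MySQL with InnoDB; the ON DELETE CASCADE clause is standard SQL, while the PRAGMA line is SQLite-specific. Table and column names, and the sample rows, are hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL/InnoDB here.
conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE people (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE phone_numbers (
        id INTEGER PRIMARY KEY,
        -- each phone number is owned by exactly one person
        person_id INTEGER NOT NULL REFERENCES people (id) ON DELETE CASCADE,
        number TEXT NOT NULL
    );
    INSERT INTO people (id, name) VALUES (1, 'Douglas'), (2, 'Ada');
    INSERT INTO phone_numbers (person_id, number)
        VALUES (1, '555-0100'), (1, '555-0101'), (2, '555-0199');
""")

# Deleting the person is the only statement the application issues;
# the database removes the owned phone numbers itself.
conn.execute("DELETE FROM people WHERE name = 'Douglas'")
remaining = conn.execute(
    "SELECT person_id, number FROM phone_numbers ORDER BY id"
).fetchall()
```

After the delete, only the other person's phone number is left.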
NB. If you use triggers, you need to be more careful. MySQL doesn't fire triggers on cascading deletes.

Django foreign key integrity with MyISAM

If MyISAM doesn't have FK integrity, how does a Django app that uses MyISAM tables enforce the integrity of the FK constraints?
Poorly. It does its level best to issue updates and deletes when the referent changes, based on information it has already loaded from prior interactions, but there's just nothing protecting your data from becoming inconsistent.
The ForeignKey construct exists less to declare integrity constraints than to tell Django how the different tables link together, so you can traverse in Python through the attributes to model instances of other types. The ORM-driven cascading is at best a band-aid over the shortcomings of storage engines like MyISAM. If this is important to you (and it should be), you should migrate away from the MyISAM engine to InnoDB or PostgreSQL.

Best practice advice for deleting tables in PHP/MySQL framework?

What are some best-practice tips for tinkering, deleting tables, and making reversible changes on a MySQL testing (not production) server? In my case I'm learning a PHP/MySQL framework.
The only general tool I have in my toolbox is to rename files before I delete them. If there is a problem I can always return a file to its original name. I would imagine it should be OK to apply the same practice to a database, since clients can lose their connection to a host. Yet, how does a web application framework proceed when referential integrity is broken only in one place?
I guess you are referring to transactions. The InnoDB engine in MySQL supports transactions as well as foreign key constraints.
In a transactional design, you can execute a group of queries that must run as a single unit in order to be meaningful and to maintain data integrity. A transaction is started; if something goes wrong it is rolled back, reverting every change made so far, and otherwise the entire set of modifications is committed to the database.
Foreign keys are constraints on referential data. In a master-detail relationship you cannot, for example, refer to a master record that does not exist. If there is a table comments with a user_id column referring to users.id, you are not allowed to enter a comment for a non-existent user.
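Both ideas, all-or-nothing transactions and a foreign key rejecting a comment for a non-existent user, can be seen together in a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL with InnoDB; the helper add_comments and the sample data are hypothetical.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL/InnoDB here.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable FK checks
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE comments (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users (id),
        body TEXT NOT NULL
    );
    INSERT INTO users (id, name) VALUES (1, 'alice');
""")


def add_comments(conn, rows):
    """Insert all (user_id, body) rows as one unit.

    Returns True if the whole batch was committed, False if it was rolled back.
    """
    try:
        # The connection context manager commits on success and
        # rolls back if an exception escapes the block.
        with conn:
            conn.executemany(
                "INSERT INTO comments (user_id, body) VALUES (?, ?)", rows
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = add_comments(conn, [(1, "first"), (1, "second")])
# User 999 does not exist, so the foreign key rejects the second row
# and the rollback also discards the valid row inserted just before it.
failed = add_comments(conn, [(1, "third"), (999, "orphan")])
comment_count = conn.execute("SELECT COUNT(*) FROM comments").fetchone()[0]
```

Only the two comments from the first batch survive; the valid "third" comment is undone along with the invalid one because they belonged to the same transaction.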
Read more here if you will
http://dev.mysql.com/doc/refman/5.0/en/innodb-transaction-model.html
and for foreign keys
http://dev.mysql.com/doc/refman/5.0/en/innodb-foreign-key-constraints.html

Session / Log tables keys design question

I have almost always heard people say not to use FKs with user session and log tables, as those are usually high-write tables and, once written, the data almost always stays forever without any updates or deletes.
But the question is, I have columns like these:
User_id (link a session or activity log to the user)
activity_id (linking the log activity table to the system activity lookup table)
session_id (linking the user log table with the parent session)
... and there are 4-5 more columns.
So if I don't use FKs, how will I "relate" these columns? Can I join tables and get the user info without FKs? Can I write correct data without FKs? Is there any performance impact, or do people just talk and say this is a no-no?
Another question I have: if I don't use FKs, can I still connect my data with lookup tables?
In fact, you can build the whole database without real FKs in MySQL. If you're using MyISAM as a storage engine, the FKs aren't real anyway.
You can nevertheless do all the joins you like, as long as the join keys match.
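As a quick illustration of joining on "loose" columns, here is a sketch that assumes Python's built-in sqlite3 module as a stand-in for MySQL. The table layout is a hypothetical reduction of the columns listed in the question. It also shows the price of skipping FKs: a log row pointing at a missing user is accepted silently.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
    -- user_id and activity_id are plain columns, not declared foreign keys
    CREATE TABLE activity_log (
        log_id INTEGER PRIMARY KEY,
        user_id INTEGER,
        activity_id INTEGER
    );
    -- an ordinary index keeps the join fast without any constraint
    CREATE INDEX idx_activity_log_user_id ON activity_log (user_id);
    INSERT INTO users (user_id, name) VALUES (1, 'alice');
    -- user 42 does not exist, yet nothing rejects this row
    INSERT INTO activity_log (user_id, activity_id) VALUES (1, 10), (42, 11);
""")

# The join works as long as the key values match.
joined = conn.execute(
    "SELECT u.name, l.activity_id "
    "FROM activity_log AS l JOIN users AS u ON u.user_id = l.user_id "
    "ORDER BY l.log_id"
).fetchall()

# Orphaned log rows have to be found by the application itself.
orphans = conn.execute(
    "SELECT l.user_id "
    "FROM activity_log AS l LEFT JOIN users AS u ON u.user_id = l.user_id "
    "WHERE u.user_id IS NULL"
).fetchall()
```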
The performance impact depends on how much data you stuff into the referenced table. It takes extra time if you have a FK in a table and insert data into it, or update a FK value. Upon insertion or modification, the FK value needs to be looked up in the referenced table to ensure referential integrity.
On heavily used tables that don't really need referential integrity, I'd just stick with loose columns instead of FKs.
AFAIK InnoDB is currently the only engine supporting real foreign keys (unless MySQL 5.5 got new or updated storage engines which support them as well). Storage engines like MyISAM accept the syntax, but don't actually validate referential integrity.
FKs can be detrimental in "history log" tables. This kind of table wants to preserve the exact state of what happened at a point in time.
The problem with FKs is that they don't store the value, just a pointer to the value. If the value changes, then the history is lost. You DO NOT WANT updates to cascade into your history log. It's OK to have a "fake foreign key" that you can join on, but you also want to intentionally denormalize the relevant fields to preserve the history.
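A minimal sketch of that denormalization, assuming Python's built-in sqlite3 module as a stand-in for MySQL. The history_log layout, the email_at_event column and the log_event helper are hypothetical names chosen for illustration.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE history_log (
        log_id INTEGER PRIMARY KEY,
        user_id INTEGER,        -- "fake foreign key": joinable, not enforced
        email_at_event TEXT,    -- denormalized copy of the value at log time
        event TEXT
    );
    INSERT INTO users (user_id, email) VALUES (1, 'old@example.com');
""")


def log_event(conn, user_id, event):
    """Record an event together with a snapshot of the user's current email."""
    conn.execute(
        "INSERT INTO history_log (user_id, email_at_event, event) "
        "SELECT user_id, email, ? FROM users WHERE user_id = ?",
        (event, user_id),
    )


log_event(conn, 1, "login")
# The user later changes their email; the log row must not follow the change.
conn.execute("UPDATE users SET email = 'new@example.com' WHERE user_id = 1")

snapshot, current = conn.execute(
    "SELECT h.email_at_event, u.email "
    "FROM history_log AS h JOIN users AS u ON u.user_id = h.user_id"
).fetchone()
```

The join still works through user_id, but the log keeps the email as it was when the event happened, while the users table shows the current one.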