Session / Log tables keys design question - mysql

I have almost always heard people say not to use FKs with user session and log tables, as those are usually high-write tables, and once written the data almost always stays forever without any updates or deletes.
But here is my question. I have columns like these:
User_id (link a session or activity log to the user)
activity_id (linking the log activity table to the system activity lookup table)
session_id (linking the user log table with the parent session)
... and there are 4-5 more columns.
So if I don't use FKs, how will I "relate" these columns? Can I join tables and get the user info without FKs? Can I write correct data without FKs? Is there any real performance impact, or do people just repeat that this is a no-no?
Another question: if I don't use FKs, can I still connect my data with lookup tables?

In fact, you can build the whole database without real FKs in MySQL. If you're using MyISAM as a storage engine, the FKs aren't real anyway.
You can nevertheless do all the joins you like, as long as the join keys match.
The performance impact depends on how much data you put into the referenced table. Inserting into a table that has an FK, or updating an FK value, takes extra time: on every insert or modification, the FK value has to be looked up in the referenced table to ensure referential integrity.
On heavily used tables that don't really need referential integrity, I'd just stick with loose columns instead of FKs.
AFAIK InnoDB is currently the only one supporting real foreign keys (unless MySQL 5.5 got new or updated storage engines which support them as well). Storage engines like MyISAM do support the syntax, but don't actually validate the referential integrity.
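The point about joining on loose columns can be tried out with a small sketch. It uses Python's built-in sqlite3 module purely as a self-contained stand-in for MySQL, and the table and column names (users, activity_log, action) are assumptions made up for the illustration, not the asker's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE activity_log (
        log_id  INTEGER PRIMARY KEY,
        user_id INTEGER,   -- loose column: no FOREIGN KEY clause at all
        action  TEXT
    );
    -- An explicit index gives the join the same help an FK index would.
    CREATE INDEX idx_activity_log_user ON activity_log (user_id);
    INSERT INTO users VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO activity_log (user_id, action) VALUES
        (1, 'login'), (2, 'login'), (1, 'logout');
""")

# The join works because the key values match, not because an FK is declared.
rows = conn.execute("""
    SELECT u.name, l.action
    FROM activity_log AS l
    JOIN users AS u ON u.user_id = l.user_id
    ORDER BY l.log_id
""").fetchall()

# The trade-off: nothing rejects a log row that points at a missing user.
conn.execute("INSERT INTO activity_log (user_id, action) VALUES (99, 'login')")
orphans = conn.execute("""
    SELECT COUNT(*)
    FROM activity_log AS l
    LEFT JOIN users AS u ON u.user_id = l.user_id
    WHERE u.user_id IS NULL
""").fetchone()[0]
```

Writing correct data then becomes the application's job, which is why an occasional orphan count like the last query is a reasonable safety net.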

FKs can be detrimental in "history log" tables. This kind of table wants to preserve the exact state of what happened at a point in time.
The problem with FKs is that they don't store the value, just a pointer to the value. If the value changes, the history is lost. You DO NOT WANT updates to cascade into your history log. It's OK to have a "fake foreign key" that you can join on, but you also want to intentionally de-normalize the relevant fields to preserve the history.
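A minimal sketch of that de-normalization, assuming a hypothetical users table with an email column and an activity_log table that copies the email at write time. SQLite (via Python's sqlite3) stands in for MySQL here; all names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE activity_log (
        log_id        INTEGER PRIMARY KEY,
        user_id       INTEGER,  -- "fake foreign key": joinable, not enforced
        email_at_time TEXT,     -- snapshot copied when the row is written
        action        TEXT
    );
    INSERT INTO users VALUES (1, 'old@example.com');
""")

# Writing the log row copies the current value instead of pointing at it.
conn.execute("""
    INSERT INTO activity_log (user_id, email_at_time, action)
    SELECT user_id, email, 'login' FROM users WHERE user_id = 1
""")

# The user later changes their email.
conn.execute("UPDATE users SET email = 'new@example.com' WHERE user_id = 1")

# The log still shows what was true when the event happened ...
snapshot = conn.execute(
    "SELECT email_at_time FROM activity_log WHERE log_id = 1"
).fetchone()[0]

# ... while a join on the loose key shows only the current value.
current = conn.execute("""
    SELECT u.email
    FROM activity_log AS l
    JOIN users AS u ON u.user_id = l.user_id
    WHERE l.log_id = 1
""").fetchone()[0]
```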

Related

MySQL does FK reduce insert/update operations?

Does anybody know, does FK reduce insert/update operations in MySQL?
I use engine INNODB.
Having an FK on a table implicitly creates (and maintains) an index on the referencing columns, unless a usable index already exists.
When doing certain write operations, the FK's implicit INDEX is checked to verify the existence of the appropriate row in the other table. This is a minor performance burden during writes.
When doing SELECT ... JOIN for which you failed to explicitly provide the appropriate index, the implicit index produced by some FK may come into play. This is a big benefit to some JOINs, but does not require an FK, since you could have added the INDEX manually.
If the FK definition includes ON DELETE or ON UPDATE, then even more work may be done, especially for CASCADE. The effect of CASCADE can be achieved with a SELECT plus more code, but not as efficiently as letting CASCADE do the work.
FKs are limited in what they can do. Stack Overflow is littered with questions like "How can I get an FK to do X?"
Does any of this sound like "reducing insert/update operations"?
does FK reduce insert/update operations in MySQL?
This is not specific to MySQL, but yes, an FK adds work to insert/update operations. Creating an FK on a column creates a secondary index, so every DML operation has to update that index as well. Keeping it current also keeps the table statistics correct, so that the optimizer can generate a correct and efficient query plan.

What do you think about cascading deletions on mysql tables?

The question is in the title!
The database I'm using to store data from my (production) website contains a lot of ON DELETE CASCADE.
I would just like to know whether that's a good thing or whether it's better to code all deletions manually.
On one hand, it's not very explicit: deletions happen by magic. On the other hand, it makes development easier: I don't have to keep the entire schema of my database in my head.
I think maintaining referential integrity is a good thing to be doing. The last thing you'd want is orphaned rows in your database.
See the MySQL documentation on things to consider when not using referential integrity:
MySQL gives database developers the choice of which approach to use. If you don't need foreign keys and want to avoid the overhead associated with enforcing referential integrity, you can choose another storage engine instead, such as MyISAM. (For example, the MyISAM storage engine offers very fast performance for applications that perform only INSERT and SELECT operations. In this case, the table has no holes in the middle and the inserts can be performed concurrently with retrievals. See Section 8.10.3, “Concurrent Inserts”.)
If you choose not to take advantage of referential integrity checks, keep the following considerations in mind:
In the absence of server-side foreign key relationship checking, the application itself must handle relationship issues. For example, it must take care to insert rows into tables in the proper order, and to avoid creating orphaned child records. It must also be able to recover from errors that occur in the middle of multiple-record insert operations.
If ON DELETE is the only referential integrity capability an application needs, you can achieve a similar effect as of MySQL Server 4.0 by using multiple-table DELETE statements to delete rows from many tables with a single statement. See Section 13.2.2, “DELETE Syntax”.
A workaround for the lack of ON DELETE is to add the appropriate DELETE statements to your application when you delete records from a table that has a foreign key. In practice, this is often as quick as using foreign keys and is more portable.
Be aware that the use of foreign keys can sometimes lead to problems:
Foreign key support addresses many referential integrity issues, but it is still necessary to design key relationships carefully to avoid circular rules or incorrect combinations of cascading deletes.
It is not uncommon for a DBA to create a topology of relationships that makes it difficult to restore individual tables from a backup. (MySQL alleviates this difficulty by enabling you to temporarily disable foreign key checks when reloading a table that depends on other tables. See Section 14.3.5.4, “FOREIGN KEY Constraints”. As of MySQL 4.1.1, mysqldump generates dump files that take advantage of this capability automatically when they are reloaded.)
Source: http://dev.mysql.com/doc/refman/5.5/en/ansi-diff-foreign-keys.html
Cascading deletes are a great tool for you to use provided you make sure only to use them where it makes perfect sense to do so.
The main situation in which you would opt for using a cascading delete is when you have a table that models entities that are "owned" by one (and only one) row in another table. For example, if you have a table that models people and a table that models phone numbers. Here your phone numbers table would have a foreign key to your people table. Now if you decide you no longer want your application to keep track of someone - say "Douglas" - it makes perfect sense that you don't want to keep track of Douglas's phone numbers any more, either. There is no sense in having a phone number floating around in your database and not know whose it is.
But at the same time, when you want to delete a person from the "people" table, you don't want to first have to laboriously check whether you have any phone numbers for that person and delete them. Why do that when you can encode into the database structure the rule that when a person is deleted, their phone numbers can all go as well? That is what a cascading delete will do for you. Just make sure you know what cascading deletes you have, and that they all make sense.
NB. If you use triggers, you need to be more careful. MySQL doesn't fire triggers on cascading deletes.
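The people / phone numbers example can be sketched as follows. This is an illustration run against SQLite through Python's sqlite3 module, not MySQL itself, and the column names are assumptions. The constraint is written as a table-level FOREIGN KEY clause, the form InnoDB enforces; the PRAGMA line is SQLite-specific and has no MySQL counterpart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: foreign key enforcement is off unless enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE people (
        person_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL
    );
    CREATE TABLE phone_numbers (
        phone_id  INTEGER PRIMARY KEY,
        person_id INTEGER NOT NULL,
        number    TEXT NOT NULL,
        -- Each number is "owned" by exactly one person and goes when they go.
        FOREIGN KEY (person_id) REFERENCES people (person_id) ON DELETE CASCADE
    );
    INSERT INTO people VALUES (1, 'Douglas'), (2, 'Ada');
    INSERT INTO phone_numbers (person_id, number) VALUES
        (1, '555-0100'), (1, '555-0101'), (2, '555-0199');
""")

# One statement against the parent table; no manual cleanup of the children.
conn.execute("DELETE FROM people WHERE name = 'Douglas'")

remaining = conn.execute(
    "SELECT person_id, number FROM phone_numbers ORDER BY phone_id"
).fetchall()
```

Only the row belonging to the other person is left, which is the "encode the rule into the database structure" behaviour described above.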

Quick question about relational one-to many database

I'm doing a venue/events database and I've created my tables and would like some confirmation from someone if I did everything right :)
I have 2 tables:
Venues
Events
The primary key of Venues is VENUE_ID, which is set to auto_increment. I have the same column in Events, which will contain the number of the Venue ID. This should connect them, right?
Also, the table engine is MyISAM.
It does not automatically link the tables to each other, and the referenced columns don't necessarily have to have the same name (in fact, there are situations where this is impossible, e.g. when a table has two columns that both reference the same column in another table).
Read up on foreign keys; they're standard SQL and do exactly what you want. Note, however, that the MyISAM storage engine cannot enforce foreign key constraints, so as long as any of the tables involved uses MyISAM, the foreign key declaration doesn't add much (it does, however, document the relationship, at least in your SQL scripts).
I suggest you use InnoDB (or, if that's feasible, switch to PostgreSQL - not only does it provide foreign key constraints, it also has full support for transactions, unlike MySQL, which will silently commit a pending transaction whenever you do something that's not supported in a transaction, with potentially devastating results). If you have to / want to use MySQL, I suggest you use InnoDB for everything, unless you know you need the extra performance you can get out of MyISAM and you can afford the caveats. Also keep in mind that migrating large tables from MyISAM to InnoDB later in production can be painful or even outright impossible.
Your db structure is right.
You can use InnoDB to add foreign key constraints. Also, don't forget to add an index on the joining column of the second table so that joins between the two tables are faster.
More info about FK http://dev.mysql.com/doc/refman/5.5/en/innodb-foreign-key-constraints.html
Note to comments:
InnoDB allows concurrent selects and inserts/updates. MyISAM allows the same as long as you never delete from the MyISAM table; otherwise MyISAM will lock the whole table.
Generally, yes. This is how you indicate a one-to-many relation between two tables. You may also specifically encode the relationship into the database by setting up a foreign key constraint. This allows additional logic such as cascading.

Best practice advice for deleting tables in PHP/MySQL framework?

What are some best practices tips for tinkering, deleting tables, making reversible changes in MySQL (not production) testing server? In my case I'm learning a PHP/MySQL framework.
The only general tool I have in my toolbox is to rename files before I delete them. If there is a problem I can always return a file to its original name. I would imagine it should be OK to apply the same practice to a database, since clients can lose their connection to a host. Yet, how does a web application framework proceed when referential integrity is broken only in one place?
I guess you are referring to transactions. InnoDB engine in MySQL supports transactions as well as Foreign Key constraints.
In transactional design, you can execute a group of queries that must be applied as a single unit in order to be meaningful and to maintain data integrity. A transaction is started, and if something goes wrong it is rolled back, reverting every change made so far; otherwise it is committed, and the entire set of modifications is applied to the database.
Foreign keys are constraints on referential data. In a master-detail relationship you cannot, for example, refer to a master record that does not exist. If there is a table comments with a user_id referring to the users.id field, you are not allowed to enter a comment for a non-existent user.
Read more here if you will
http://dev.mysql.com/doc/refman/5.0/en/innodb-transaction-model.html
and for foreign keys
http://dev.mysql.com/doc/refman/5.0/en/innodb-foreign-key-constraints.html
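The comments / users.id rule can be seen in action with a short sketch. It runs on SQLite through Python's sqlite3 module as a stand-in for InnoDB, so the PRAGMA line and the exception type are SQLite-specific; the body column and the sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific switch; InnoDB enforces declared foreign keys by default.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE comments (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        body    TEXT,
        FOREIGN KEY (user_id) REFERENCES users (id)
    );
    INSERT INTO users VALUES (1, 'alice');
""")

# A comment for an existing user is accepted.
conn.execute("INSERT INTO comments (user_id, body) VALUES (1, 'hello')")

# A comment for a user that does not exist is refused by the constraint.
try:
    conn.execute("INSERT INTO comments (user_id, body) VALUES (42, 'ghost')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

comment_count = conn.execute("SELECT COUNT(*) FROM comments").fetchone()[0]
```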

MYSQL mass deletion FROM 80 tables

I have a 50 GB MySQL database (80 tables) from which I need to delete some content.
I have a reference table that contains a list of product IDs that need to be deleted from the other tables.
The other tables, which contain the items that need to be deleted, can be 2 GB each.
My question is: since this is not a small database, what is the safest way to delete the data in one shot while avoiding problems?
What is the best method to verify that all of the data was deleted?
This probably doesn't help anymore, but you should keep it in mind when creating a database. In MySQL (depending on the table storage engine, for instance InnoDB) you can specify relations, called foreign key constraints. With these relations, deleting a row from one table (for instance products) can automatically update or delete the rows in other tables that reference it through a foreign key (such as product_storage). These relations guarantee a 100% consistent state. However, they can be hard to add in hindsight. If you plan to do this kind of cleanup more often, it is definitely worth researching whether you can add them to your database; they will save you a lot of work (all kinds of queries become simpler).
Without these relations you can't be 100% sure. You would have to go over all the tables, note which columns you need to check, and write a set of SQL queries to make sure there are no entries left.
As Thirler has pointed out, it would be nice if you had foreign keys. Without them, burnall's solution can be used with transactions to ensure that no inconsistencies creep in.
Regardless of how you do it, this could take a long time, even hours so please be prepared for that.
As pointed out earlier, foreign keys would be nice here. But regarding question 1, you could run the changes within a transaction from the MySQL prompt. This assumes you are using a transaction-safe storage engine like InnoDB. You can convert from MyISAM to InnoDB if you need to. Anyway, something like this:
START TRANSACTION;
...Perform changes...
...Control changes...
COMMIT;
...or...
ROLLBACK;
Is it acceptable to have any downtime?
When working with PostgreSQL databases larger than 250 GB, we use this technique on production servers to perform database changes. If the outcome isn't as expected, we just roll back the transaction. Of course there is a penalty, as the I/O system has to work a bit harder.
// John
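The START TRANSACTION / check / COMMIT-or-ROLLBACK pattern above can be sketched end to end. The sketch uses SQLite through Python's sqlite3 module instead of MySQL (SQLite spells the opening statement BEGIN), and the table names, the helper delete_with_check, and the "expected row count" check are all assumptions chosen for the illustration.

```python
import sqlite3

# isolation_level=None: the module issues no implicit BEGIN, so the
# transaction boundaries below are exactly the ones written out.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.executescript("""
    CREATE TABLE doomed_products (product_id INTEGER PRIMARY KEY);
    CREATE TABLE product_storage (
        id INTEGER PRIMARY KEY, product_id INTEGER, qty INTEGER
    );
    INSERT INTO doomed_products VALUES (2), (3);
    INSERT INTO product_storage (product_id, qty) VALUES
        (1, 5), (2, 7), (3, 1), (3, 4);
""")

def delete_with_check(conn, expected_deletes):
    """Delete inside a transaction; keep the change only if the count matches."""
    conn.execute("BEGIN")
    cur = conn.execute("""
        DELETE FROM product_storage
        WHERE product_id IN (SELECT product_id FROM doomed_products)
    """)
    if cur.rowcount == expected_deletes:   # the "control changes" step
        conn.execute("COMMIT")
        return "committed"
    conn.execute("ROLLBACK")
    return "rolled back"

def row_count(conn):
    return conn.execute("SELECT COUNT(*) FROM product_storage").fetchone()[0]

first_outcome = delete_with_check(conn, expected_deletes=10)   # wrong guess
rows_after_rollback = row_count(conn)
second_outcome = delete_with_check(conn, expected_deletes=3)   # correct guess
rows_after_commit = row_count(conn)
```

On a 50 GB database the control step would more likely be a handful of verification queries than a single row count, but the shape of the transaction stays the same.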
I agree with Thirler that using foreign keys is preferable. They guarantee the referential integrity and consistency of the whole database.
I can believe that life sometimes requires trickier logic.
So you could use manual queries like
delete from a where id in (select id from `keys`);
(KEYS is a reserved word in MySQL, so a table with that name has to be quoted with backticks.)
You could delete all records at once, by range of keys, or using LIMIT in DELETE. A proper index is a must.
To verify consistency you need a function or query. For example:
delimiter //
create function check_consistency() returns boolean
reads sql data
begin
  -- true only when no child row refers to a parent that is missing
  return not exists(select * from child where id not in (select id from parent))
     and not exists(select * from child2 where id not in (select id from parent));
  -- and so on
end//
delimiter ;
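The same consistency check can also be written as an anti-join per child table, which makes it easy to report how many orphans each table holds. The following is a sketch on SQLite through Python's sqlite3 module rather than MySQL; the parent_id column, the CHILD_TABLES list, and the helper orphan_counts are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child  (child_id INTEGER PRIMARY KEY, parent_id INTEGER);
    CREATE TABLE child2 (child_id INTEGER PRIMARY KEY, parent_id INTEGER);
    INSERT INTO parent VALUES (1), (2);
    INSERT INTO child  (parent_id) VALUES (1), (2), (2);
    INSERT INTO child2 (parent_id) VALUES (1), (3);  -- 3 has no parent row
""")

# Fixed list of tables to verify; never build this from user input,
# because the names are interpolated into the SQL text below.
CHILD_TABLES = ["child", "child2"]

def orphan_counts(conn):
    """Return {table: number of rows whose parent_id matches no parent}."""
    counts = {}
    for table in CHILD_TABLES:
        sql = f"""
            SELECT COUNT(*)
            FROM {table} AS c
            LEFT JOIN parent AS p ON p.id = c.parent_id
            WHERE p.id IS NULL   -- also counts rows whose parent_id is NULL
        """
        counts[table] = conn.execute(sql).fetchone()[0]
    return counts

before = orphan_counts(conn)

# Clean up the orphans, then verify again.
conn.execute(
    "DELETE FROM child2 WHERE parent_id NOT IN (SELECT id FROM parent)"
)
after = orphan_counts(conn)
is_consistent = all(n == 0 for n in after.values())
```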
Something else to look into is partitioning of MySQL tables. For more information, check out the reference manual:
http://dev.mysql.com/doc/refman/5.1/en/partitioning.html
In short, you can divide a table into partitions, for example by datetime values or by sets of index values.