I have a table that holds various users by user_id with many other indexed columns of ids from third-party tools they use.
For example, table users
user_id | user_name | zendesk_id | mailchimp_id | todoist_id
We have crons configured to hit these third parties for a list of users so that our db has any new users inserted. In the event of a user_name change on the third party, we also want to update that information on our side to keep in sync.
user_id is a primary key with auto-increment.
All third-party service columns (e.g. zendesk_id) are unique and indexed.
According to the MySQL documentation, auto-increment does not behave cleanly when other columns carry unique indexes (see this similar, but not duplicate, question explaining the problem: "ON DUPLICATE KEY + AUTO INCREMENT issue mysql").
Since MySQL has this documented problem, my question is this:
How can I keep a reasonably clean auto-increment on user_id when I am pulling full user lists from these third parties to insert new users and update existing users?
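One possible approach, sketched here rather than taken from the thread, is to look each third-party id up first and only INSERT when no row exists, so an existing user never triggers an insert attempt. The sketch uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so the example runs anywhere); the sync_user helper and the sample ids are hypothetical. It also assumes a single cron writer, so the gap between the SELECT and the INSERT is not a concurrency concern.

```python
import sqlite3

def sync_user(conn, zendesk_id, user_name):
    """Update the row for zendesk_id if it exists, otherwise insert it."""
    with conn:  # one transaction per synced user
        row = conn.execute(
            "SELECT user_id FROM users WHERE zendesk_id = ?", (zendesk_id,)
        ).fetchone()
        if row is None:
            # Only genuinely new users reach the INSERT and consume an id.
            conn.execute(
                "INSERT INTO users (user_name, zendesk_id) VALUES (?, ?)",
                (user_name, zendesk_id),
            )
        else:
            conn.execute(
                "UPDATE users SET user_name = ? WHERE user_id = ?",
                (user_name, row[0]),
            )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " user_id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " user_name TEXT NOT NULL,"
    " zendesk_id INTEGER UNIQUE)"
)

sync_user(conn, 501, "alice")          # new user
sync_user(conn, 501, "alice_renamed")  # same third-party id, renamed
sync_user(conn, 502, "bob")            # new user

rows = conn.execute(
    "SELECT user_id, user_name, zendesk_id FROM users ORDER BY user_id"
).fetchall()
```

The rename goes through the UPDATE branch, so the second new user gets the next consecutive user_id.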
Related
Currently I have a composite-primary key consisting of (user, id). My user is John Smith and there are say 30 rows that pertain to him, hence id auto increments each time a new entry is made.
However, if I wanted to add a new user, say Jill Smith, to the same table, is there a way in which I can start at (Jill Smith, 1) and have the id auto-increment without messing up the previous entries?
No. AUTO_INCREMENT in MySQL cannot have multiple "states" to keep track of multiple counters. To get the described behaviour, you need to implement your own application logic (without using the auto-increment feature) and calculate the number part of the key before inserting new rows.
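A minimal sketch of that application logic, assuming a hypothetical entries table and add_entry helper, and using Python's built-in sqlite3 module in place of MySQL so it can be run as-is. The next id for a user is computed as MAX(id) + 1 inside the same transaction as the insert. With concurrent writers on MySQL you would additionally need some form of locking, which this sketch leaves out.

```python
import sqlite3

def add_entry(conn, user, note):
    """Insert a row for `user` using a hand-computed per-user counter."""
    with conn:  # commit on success, roll back on error
        (next_id,) = conn.execute(
            "SELECT COALESCE(MAX(id), 0) + 1 FROM entries WHERE user = ?",
            (user,),
        ).fetchone()
        conn.execute(
            "INSERT INTO entries (user, id, note) VALUES (?, ?, ?)",
            (user, next_id, note),
        )
    return next_id

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entries ("
    " user TEXT NOT NULL,"
    " id INTEGER NOT NULL,"
    " note TEXT,"
    " PRIMARY KEY (user, id))"  # composite key, as in the question
)

john_ids = [add_entry(conn, "John Smith", f"entry {n}") for n in range(3)]
jill_id = add_entry(conn, "Jill Smith", "first entry")
```

Each user gets an independent sequence, so Jill's first row is numbered 1 regardless of how many rows John already has.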
UPDATE
The above is true in general in MySQL but how AUTO_INCREMENT works depends on the storage engine.
The documentation is quite specific on your particular scenario for MyISAM tables:
If the AUTO_INCREMENT column is part of multiple indexes, MySQL
generates sequence values using the index that begins with the
AUTO_INCREMENT column, if there is one. For example, if the animals
table contained indexes PRIMARY KEY (grp, id) and INDEX (id), MySQL
would ignore the PRIMARY KEY for generating sequence values. As a
result, the table would contain a single sequence, not a sequence per
grp value.
https://dev.mysql.com/doc/refman/5.6/en/example-auto-increment.html
I am making a small system to clean up the database. Every person who visits the site gets put in the db, but anyone who doesn't register should be removed from the database by a cronjob or similar once their first visit is more than 2 days old. The date is stored in MySQL as a timestamp and looks like this: 2013-06-05 01:18:43.
So what I thought about doing was the following:
$STH = $DBH->query("DELETE FROM user WHERE type=0 AND joindate < '".date('Y-m-d H:i:s', time()-$userLife)."'");
Like this, the format of the timestamp is the same as in MySQL. I'm using $userLife so I can easily adjust this var at the beginning of my script.
The problem is however, that I also need to do queries for other tables containing this user_id. For example the table pages:
id | user_id | level | time | views
In this table, it is possible that there are multiple instances of user_id.
Can this be done in one single query, or do I need to first loop through all the users, for each user then do the DELETE-queries for 3 other tables and after that loop delete all the users?
Ideally, you'd define a FOREIGN KEY constraint with ON DELETE CASCADE, which automagically deletes all that related data for you. If that's not possible for some reason (being stuck with a MyISAM table, for instance), you could simply JOIN the related tables in the DELETE (yes, you can delete from more than one table at once). If it's your first time doing that, try it on a test database, and certainly not in production.
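To illustrate the cascade option, here is a small sketch using Python's built-in sqlite3 module rather than MySQL (an assumption made so it runs without a server; the PRAGMA line is SQLite-specific). The user and pages tables are trimmed to the columns that matter, and the sample rows and cutoff are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this to enforce FKs
conn.executescript(
    """
    CREATE TABLE user (
        id INTEGER PRIMARY KEY,
        type INTEGER NOT NULL,
        joindate TEXT NOT NULL
    );
    CREATE TABLE pages (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES user(id) ON DELETE CASCADE,
        views INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO user (id, type, joindate) VALUES
        (1, 0, '2013-06-01 01:18:43'),  -- unregistered and old
        (2, 0, '2013-06-05 01:18:43'),  -- unregistered but recent
        (3, 1, '2013-06-01 09:00:00');  -- registered
    INSERT INTO pages (user_id, views) VALUES (1, 3), (1, 7), (2, 1), (3, 4);
    """
)

cutoff = "2013-06-03 00:00:00"  # hypothetical "now minus user lifetime"
with conn:
    # A single DELETE on the parent table; child rows go with it.
    conn.execute("DELETE FROM user WHERE type = 0 AND joindate < ?", (cutoff,))

remaining_users = [r[0] for r in conn.execute("SELECT id FROM user ORDER BY id")]
remaining_pages = [r[0] for r in conn.execute("SELECT user_id FROM pages ORDER BY user_id")]
```

Both pages rows for user 1 disappear with the one DELETE, so no per-user loop is needed.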
Two tables share a unique identifier 'id'. Both tables are meant to be joined by using 'id'.
Defining 'id' as an auto incrementing primary key in both tables may risk update inconsistencies.
Is there some general pattern to avoid such a situation or do I have to deal with updating table1 first and table2 by utilizing the last inserted id after (therefore not declaring id as auto inc in table2)?
First, if you use the InnoDB storage engine in MySQL, you can use both transactions and foreign keys for data consistency.
Second, after the insert in the first table, you could get the last insert id (depending on the way you access the db) and use it as foreign key.
Eg
Table 1: Users: user_id, username
Table 2: User_Profiles: user_id, name, phone
In User_Profiles you don't need to define user_id as auto-increment. First insert a record into the Users table, then use its user_id for the User_Profiles record. If you do this in a transaction, the Users record won't be visible outside the transaction's connection until it is committed. This guarantees that even if something bad happens after you insert the user but before you insert the profile, there won't be messed-up data.
You could also define the user_id column in User_Profiles as a foreign key to the Users table with ON DELETE CASCADE, so that if someone deletes a record from Users, the database automatically deletes the matching one in User_Profiles. There are many other options; read more about that.
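A rough sketch of that insert order, again using Python's built-in sqlite3 module as a stand-in for MySQL/InnoDB (an assumption; with a MySQL driver the equivalent of lastrowid would be the last insert id). The create_user helper, the column constraints, and the sample values are hypothetical.

```python
import sqlite3

def create_user(conn, username, name, phone):
    """Insert the Users row, then reuse its generated id for User_Profiles."""
    with conn:  # both inserts commit together or roll back together
        cur = conn.execute("INSERT INTO Users (username) VALUES (?)", (username,))
        user_id = cur.lastrowid  # id generated by the first insert
        conn.execute(
            "INSERT INTO User_Profiles (user_id, name, phone) VALUES (?, ?, ?)",
            (user_id, name, phone),
        )
    return user_id

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.executescript(
    """
    CREATE TABLE Users (
        user_id INTEGER PRIMARY KEY AUTOINCREMENT,
        username TEXT NOT NULL UNIQUE
    );
    CREATE TABLE User_Profiles (
        user_id INTEGER PRIMARY KEY REFERENCES Users(user_id) ON DELETE CASCADE,
        name TEXT NOT NULL,
        phone TEXT
    );
    """
)

new_id = create_user(conn, "jsmith", "John Smith", "555-0100")

# The profile insert fails here (name is NOT NULL), so the Users row
# inserted just before it is rolled back as well.
try:
    create_user(conn, "broken", None, None)
except sqlite3.IntegrityError:
    pass

usernames = [r[0] for r in conn.execute("SELECT username FROM Users")]
profile_ids = [r[0] for r in conn.execute("SELECT user_id FROM User_Profiles")]
```

Only the fully created user survives; the half-finished one leaves no orphaned Users row behind.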
There is no problem with same column name 'id' in any number of tables.
Several persistence-layer frameworks do it the same way.
Just use aliases in your SQL to distinguish your tables accordingly.
do I have to deal with updating table1 first and table2 by utilizing the last inserted id after (therefore not declaring id as auto inc in table2)?
Yes. And make id a foreign key so it can only exist in table2 if it already exists in table1.
Yes you do, and remember to wrap the operation in a transaction.
I had to implement the following into my database:
The activities that users engage in. Each activity can have a name with up to 80 characters, and only distinct activities should be stored. That is, if two different users like “Swimming”, then the activity “Swimming” should only be stored once as a string.
Which activities each individual user engages in. Note that a user can have more than one hobby!
So I have to implement tables for this purpose and I must also make any modifications to existing tables if and as required and implement any keys and foreign key relationships needed.
All this must be stored with minimal amount of storage, i.e., you must choose the appropriate data types from the MySQL manual. You may assume that new activities will be added frequently, that activities will almost never be removed, and that the total number of distinct activities may reach 100,000.
So I already have a 'User' table with 'user_id' as my primary key.
MY SOLUTION TO THIS:
Create a table called 'Activities' with 'activity_id' as the PK (mediumint(5)) and 'activity' storing the hobby name (varchar(80)). Then create another table called 'Link' that uses 'user_id' as an FK to the user table and 'activity_id' as an FK to the 'Activities' table, to record which activities each user likes to do.
Is my approach to this question right? Is there another way I can do this to make it more efficient?
How would I show if one user pursues more than one activity in the foreign key table 'Link'?
Your idea is the correct, and only(?), way. It's called a many-to-many relationship.
Just to reiterate: what you're proposing is a user table with a userid, and an activity table with an activityid.
To form the relationship you'll have a third table. For performance's sake it doesn't require a primary key; however, you should index both columns (userid and activityid).
In your logic, when someone enters an activity name, check the activity table for the entered value. If it doesn't exist, add it to the table and get back the new activityid, then add an entry to the user_activity table linking the activityid to the userid.
If it already exists just add an entry linking that activity id to the userid.
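The logic above might look roughly like this. It is a sketch using Python's built-in sqlite3 module instead of MySQL (an assumption, so it runs standalone); the add_user_activity helper, the UNIQUE constraints, and the sample data are hypothetical additions, not part of the original answer.

```python
import sqlite3

def add_user_activity(conn, user_id, activity_name):
    """Reuse the activity row if the name exists, then link it to the user."""
    with conn:
        row = conn.execute(
            "SELECT activity_id FROM activities WHERE activity = ?",
            (activity_name,),
        ).fetchone()
        if row is None:
            cur = conn.execute(
                "INSERT INTO activities (activity) VALUES (?)", (activity_name,)
            )
            activity_id = cur.lastrowid
        else:
            activity_id = row[0]
        conn.execute(
            "INSERT INTO user_activity (user_id, activity_id) VALUES (?, ?)",
            (user_id, activity_id),
        )
    return activity_id

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE activities (
        activity_id INTEGER PRIMARY KEY AUTOINCREMENT,
        activity VARCHAR(80) NOT NULL UNIQUE  -- each name stored once
    );
    CREATE TABLE user_activity (
        user_id INTEGER NOT NULL,
        activity_id INTEGER NOT NULL REFERENCES activities(activity_id),
        UNIQUE (user_id, activity_id)  -- assumption: no duplicate links
    );
    CREATE INDEX idx_user_activity_activity ON user_activity (activity_id);
    """
)

swim_a = add_user_activity(conn, 1, "Swimming")
swim_b = add_user_activity(conn, 2, "Swimming")  # reuses the same row
add_user_activity(conn, 1, "Chess")

activity_count = conn.execute("SELECT COUNT(*) FROM activities").fetchone()[0]
user1_activities = [
    r[0]
    for r in conn.execute(
        "SELECT a.activity FROM user_activity ua"
        " JOIN activities a ON a.activity_id = ua.activity_id"
        " WHERE ua.user_id = ? ORDER BY a.activity",
        (1,),
    )
]
```

A user with several hobbies simply has several rows in the link table, which the final JOIN query lists.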
So your approach is right, the final question just indicates you should google for 'many to many' relationships for some more info if needed.
I've seen a lot of discussion regarding this. I'm just seeking for your suggestions regarding this. Basically, what I'm using is PHP and MySQL. I have a users table which goes:
users
------------------------------
uid(pk) | username | password
------------------------------
12 | user1 | hashedpw
------------------------------
and another table which stores updates by the user
updates
--------------------------------------------
uid | date | content
--------------------------------------------
12 | 2011-11-17 08:21:01 | updated profile
12 | 2011-11-17 11:42:01 | created group
--------------------------------------------
The user's profile page will show the 5 most recent updates of a user. The questions are:
1. For the updates table, would it be possible to set both uid and date as a composite primary key, with uid referencing uid from users?
2. Or would it be better to just create another column in updates which auto-increments and will be used as the primary key (while uid will be an FK to uid in users)?
Your idea (under 1.) rests on the assumption that a user can never do two "updates" within one second. That is very poor design. You never know what functions you will implement in the future, but chances are that some day 1 click leads to 2 actions and therefore 2 lines in this table.
I say "updates" quoted because I see this more as a logging table. And who knows what you may want to log somewhere in the future.
As for unusual primary keys: don't do it, it almost always comes right back in your face and you have to do a lot of work to add a proper autoincremented key afterwards.
It depends on the requirement but a third possibility is that you could make the key (uid, date, content). You could still add a surrogate key as well but in that case you would presumably want to implement both keys - a composite and a surrogate - not just one. Don't make the mistake of thinking you have to make an either/or choice.
Whether it is useful to add the surrogate or not depends on how it's being used: don't add a surrogate unless or until you need it. In any case, I would assume uid to be a foreign key referencing the users table.
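For comparison, here is a sketch of the surrogate-key option from the question, using Python's built-in sqlite3 module in place of MySQL (an assumption made so it runs as-is). The update_id column name, the index name, and the "joined group" row are hypothetical; that extra row shares a timestamp with another update to show why (uid, date) alone would not be a safe key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        uid INTEGER PRIMARY KEY,
        username TEXT NOT NULL,
        password TEXT NOT NULL
    );
    CREATE TABLE updates (
        update_id INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        uid INTEGER NOT NULL REFERENCES users(uid),
        date TEXT NOT NULL,
        content TEXT NOT NULL
    );
    CREATE INDEX idx_updates_uid_date ON updates (uid, date);
    INSERT INTO users (uid, username, password) VALUES (12, 'user1', 'hashedpw');
    """
)

events = [
    (12, "2011-11-17 08:21:01", "updated profile"),
    (12, "2011-11-17 11:42:01", "created group"),
    (12, "2011-11-17 11:42:01", "joined group"),  # same second as the row above
]
with conn:
    conn.executemany(
        "INSERT INTO updates (uid, date, content) VALUES (?, ?, ?)", events
    )

# Five most recent updates for the profile page; update_id breaks ties
# between rows that share a timestamp.
recent = [
    r[0]
    for r in conn.execute(
        "SELECT content FROM updates WHERE uid = ?"
        " ORDER BY date DESC, update_id DESC LIMIT 5",
        (12,),
    )
]
```

With a composite primary key on (uid, date), the third insert would have been rejected as a duplicate.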