Alternatives to having nullable columns in primary key - mysql

I have a forum where users can subscribe to receive notifications when new threads are created. They can subscribe independently for each category in the forum. It should also be possible to subscribe to all categories, so that they do not have to specify each category if they want to receive notifications for all categories. (And also so that they will automatically receive notifications for any new categories that might be created in the future.) It is also possible to subscribe at different levels. For example, one can subscribe to notifications on new threads only, or, in addition to the notifications, automatically subscribe to the new threads, with the effect of receiving notifications on new posts in the threads as well.
My first idea was to create a table containing UserId, CategoryId, and LevelId, having (UserId, CategoryId) as PRIMARY KEY. If CategoryId is NULL, it means all categories. This would allow for a base subscription to all categories, and then one could change the subscription level independently for specific categories.
+--------+------------+---------+
| UserId | CategoryId | LevelId |
+--------+------------+---------+
| 1 | 1 | 2 |
| 1 | 2 | 2 |
| 1 | NULL | 1 |
| 2 | NULL | 2 |
+--------+------------+---------+
The problem is that columns that are part of a PRIMARY KEY do not allow NULL values. I could of course use a separate value, such as -1, as "all categories", but this would break the FOREIGN KEY on CategoryId. Another idea is to use a UNIQUE KEY, which allows NULL values, instead of a PRIMARY KEY, but then there could be several rows with CategoryId = NULL for each user, since MySQL does not treat two NULLs as equal in a unique index.
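The duplicate-NULL behaviour described above can be reproduced with a small sketch. It uses an in-memory SQLite database through Python's sqlite3 module as a stand-in for MySQL (an assumption made so the example runs anywhere); SQLite treats NULLs in a unique index the same way. The table name subscriptions is made up for illustration.

```python
import sqlite3

# In-memory SQLite as a stand-in for MySQL; column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE subscriptions (
        UserId     INTEGER NOT NULL,
        CategoryId INTEGER,            -- NULL means "all categories"
        LevelId    INTEGER NOT NULL,
        UNIQUE (UserId, CategoryId)
    )
""")

# Two "all categories" rows for the same user: both are accepted,
# because the unique index treats the two NULLs as distinct values.
conn.execute("INSERT INTO subscriptions VALUES (1, NULL, 1)")
conn.execute("INSERT INTO subscriptions VALUES (1, NULL, 2)")
null_rows = conn.execute(
    "SELECT COUNT(*) FROM subscriptions WHERE UserId = 1 AND CategoryId IS NULL"
).fetchone()[0]

# A duplicate with a non-NULL CategoryId is rejected as expected.
conn.execute("INSERT INTO subscriptions VALUES (1, 2, 2)")
try:
    conn.execute("INSERT INTO subscriptions VALUES (1, 2, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(null_rows, duplicate_rejected)
```

So the unique key protects the per-category rows but not the "all categories" row, which is exactly the gap described.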
A different idea is to create a table using only UserId as PRIMARY KEY, and Categories as a bitmask for which categories the user is subscribing to. If Categories is NULL, this could mean all categories. This approach does not allow for a base subscription that is overridden by individual subscriptions. Also, the same subscription level is required for all categories, and there cannot be a FOREIGN KEY on Categories.
+--------+------------+---------+
| UserId | Categories | LevelId |
+--------+------------+---------+
| 1 | 3 | 2 |
| 2 | NULL | 1 |
+--------+------------+---------+
Are there any better suggestions for how to solve this issue?

Related

Better way to design payment database with multiple payment methods

I am trying to make a payment/transaction database for a pretend online store (just trying to learn). 1 payment can purchase 1 to many items. 1 payment can only have 1 payment method.
To keep the example simple, there are 2 payment methods, PayPal and Bitcoin. Each payment method has different attributes, hence they must be different tables.
I have my payments table, which tells me which transaction bought which item(s). However, you can see that if the paypal_id is NULL then the bitcoin_id column is not. This means there are a lot of NULLs, which I think is not a good design. How can I have a good design in a case like this?
paypal table
paypal_id | txn_id | buyer_email | amount
1 | 3sd7fgudf23sdf34 | john#mail.com | 50.00
2 | 45shfik45345fg2s | mike#gmail.com | 100.00
bitcoin table
bitcoin_id | txn_id | amount
1 | 34327yhujndreygdiusfsdf324 | 0.19203
2 | sdfgurjibdsfhubhsdfinjo332 | 0.04123
items table
item_id | item name | price
1 | ball | 50.00
2 | shirt | 50.00
payments
payment_id | item_id | paypal_id | bitcoin_id
1 | 1 | 1 | NULL
2 | 1 | 2 | NULL
3 | 2 | 2 | NULL
4 | 1 | NULL | 1
5 | 1 | NULL | 1
6 | 1 | NULL | 2
Your design is fine. But you might want to consider an alternative where you have a payment_transactions table and then related tables that use the same primary key:
create table payment_transactions (
    payments_transactions_id int auto_increment primary key,
    type varchar(255),
    payment_datetime datetime, -- probably common to all payment methods
    . . . other columns if you like,
    unique (type, payments_transactions_id) -- this will be used for foreign key references
);
create table bitcoin_payments (
    bitcoin_payments_transaction_id int primary key,
    type varchar(255) generated always as ('bitcoin') stored,
    . . . , -- columns specific to bitcoins
    foreign key (type, bitcoin_payments_transaction_id) references payment_transactions (type, payments_transactions_id)
);
-- similar for paypal
Then your payments table can have a foreign key to payment_transactions.
This handles much of the data modeling issue:
You have proper declared foreign key relationships.
Only one column is needed in payments regardless of the number of types.
You can easily introduce new types.
This guarantees one type per payment (via the inclusion of type in the foreign key reference).
One downside is that you need to insert each transaction twice. First into the payment_transactions table and then into the proper table.
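The two-step insert can be sketched as follows. This is only an illustration under stated assumptions: it runs against in-memory SQLite via Python's sqlite3 rather than MySQL, the generated column is replaced by a DEFAULT plus a CHECK constraint, the txn_id and amount columns are borrowed from the question's bitcoin table, and record_bitcoin_payment is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when this is on
conn.executescript("""
    CREATE TABLE payment_transactions (
        payments_transactions_id INTEGER PRIMARY KEY,
        type TEXT NOT NULL,
        payment_datetime TEXT,
        UNIQUE (type, payments_transactions_id)
    );
    CREATE TABLE bitcoin_payments (
        bitcoin_payments_transaction_id INTEGER PRIMARY KEY,
        -- assumption: DEFAULT + CHECK stands in for the generated column
        type TEXT NOT NULL DEFAULT 'bitcoin' CHECK (type = 'bitcoin'),
        txn_id TEXT NOT NULL,
        amount TEXT NOT NULL,
        FOREIGN KEY (type, bitcoin_payments_transaction_id)
            REFERENCES payment_transactions (type, payments_transactions_id)
    );
""")

def record_bitcoin_payment(txn_id, amount):
    """Insert the parent row first, then the bitcoin-specific row, in one transaction."""
    with conn:
        cur = conn.execute(
            "INSERT INTO payment_transactions (type, payment_datetime) "
            "VALUES ('bitcoin', datetime('now'))"
        )
        conn.execute(
            "INSERT INTO bitcoin_payments "
            "(bitcoin_payments_transaction_id, txn_id, amount) VALUES (?, ?, ?)",
            (cur.lastrowid, txn_id, amount),
        )
        return cur.lastrowid

bitcoin_id = record_bitcoin_payment("34327yhujndreygdiusfsdf324", "0.19203")

# A parent row of another type cannot be claimed by bitcoin_payments:
# the composite foreign key includes the type column.
with conn:
    paypal_id = conn.execute(
        "INSERT INTO payment_transactions (type) VALUES ('paypal')"
    ).lastrowid
try:
    with conn:
        conn.execute(
            "INSERT INTO bitcoin_payments "
            "(bitcoin_payments_transaction_id, txn_id, amount) VALUES (?, 'x', '0.1')",
            (paypal_id,),
        )
    type_mismatch_rejected = False
except sqlite3.IntegrityError:
    type_mismatch_rejected = True
```

Wrapping both inserts in one transaction keeps the parent and child rows from drifting apart if the second insert fails.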
Payments are actually more complicated than you present. A more realistic data model would handle:
Transaction status.
Retries.
Partial payments.
Once you get the basic structure down, you might want to try adding in new capabilities.

MySQL link two tables together implicitly

Suppose we have two tables
A table called people, with each person linked to a bank account
| id | name | account_id |
--------------------------
| 1 | bob | 11 |
--------------------------
| 2 | sam | 22 |
A table called accounts with bank account balances
| id | value |
--------------
| 11 | 200 |
--------------
| 22 | 500 |
In order to link the two tables you can do
SELECT p.id, p.name, a.value AS account_balance
FROM people p
LEFT JOIN accounts a ON p.account_id = a.id
WHERE p.name = 'bob'
This would return
id => 1
name => bob
account_balance => 200
That's cool - but I am wondering if there is a more implicit way to do this via SQL linkage (foreign keys or otherwise). Can we in MySQL add links in some other way so that when we do a SELECT, it already knows to return value instead of account_id?
I'm asking this because I am creating a system where my users can create lookup tables and link them to other tables - but it must be do-able without any programming. The only other way I can think of is to set the name of account_id for example to accounts.value and treat that as a foreign key when doing a SELECT.
I would have to get the column structure and analyze and then determine that there is a foreign key and then return the appropriate foreign column by looking at the column name.
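The "analyze the column structure" idea could look roughly like the sketch below. It is an assumption-laden illustration: it runs on in-memory SQLite via Python's sqlite3 and reads foreign keys with SQLite's PRAGMA foreign_key_list, whereas on MySQL the same metadata would more likely come from information_schema.KEY_COLUMN_USAGE. The helper select_with_lookups and its display_columns mapping are hypothetical names, and table names are interpolated into SQL, so they must come from a trusted source.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, value INTEGER);
    CREATE TABLE people (
        id INTEGER PRIMARY KEY,
        name TEXT,
        account_id INTEGER REFERENCES accounts (id)
    );
    INSERT INTO accounts VALUES (11, 200), (22, 500);
    INSERT INTO people VALUES (1, 'bob', 11), (2, 'sam', 22);
""")

def select_with_lookups(table, display_columns):
    """Build a SELECT that swaps each foreign key column for a display column.

    display_columns maps a referenced table to the column to show,
    e.g. {"accounts": "value"}; this mapping is a hypothetical convention.
    """
    # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
    fks = {row[3]: (row[2], row[4])
           for row in conn.execute(f"PRAGMA foreign_key_list({table})")}
    # PRAGMA table_info rows: (cid, name, type, ...)
    columns = [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]
    select_parts, joins = [], []
    for col in columns:
        if col in fks and fks[col][0] in display_columns:
            ref_table, ref_col = fks[col]
            shown = display_columns[ref_table]
            select_parts.append(f"{ref_table}.{shown} AS {ref_table}_{shown}")
            joins.append(
                f"LEFT JOIN {ref_table} ON {table}.{col} = {ref_table}.{ref_col}"
            )
        else:
            select_parts.append(f"{table}.{col}")
    return f"SELECT {', '.join(select_parts)} FROM {table} " + " ".join(joins)

sql = select_with_lookups("people", {"accounts": "value"})
bob = conn.execute(sql + " WHERE people.name = ?", ("bob",)).fetchone()
print(bob)
```

The database itself still returns account_id for a plain SELECT; the substitution happens in the layer that generates the query, which is why a declared foreign key is more reliable here than encoding the link in the column name.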

How To Design A Database for a "Check In" Social Service

I want to build a "check in" service like FourSquare or Untappd.
How do I design a suitable database schema for storing check-ins?
For example, suppose I'm developing "CheeseSquare" to help people keep track of the delicious cheeses they've tried.
The table for the items into which one can check in is fairly simple and would look like
+----+---------+---------+-------------+--------+
| ID | Name | Country | Style | Colour |
+----+---------+---------+-------------+--------+
| 1 | Brie | France | Soft | White |
| 2 | Cheddar | UK | Traditional | Yellow |
+----+---------+---------+-------------+--------+
I would also have a table for the users, say
+-----+------+---------------+----------------+
| ID | Name | Twitter Token | Facebook Token |
+-----+------+---------------+----------------+
| 345 | Anne | qwerty | poiuyt |
| 678 | Bob | asdfg | mnbvc |
+-----+------+---------------+----------------+
What's the best way of recording that a user has checked in to a particular cheese?
For example, I want to record how many French cheeses Anne has checked into, which cheeses Bob has checked into, whether Cersei has eaten Camembert more than 5 times, etc.
Am I best putting this information in the user's table? E.g.
+-----+------+------+--------+------+------+---------+---------+
| ID | Name | Blue | Yellow | Soft | Brie | Cheddar | Stilton |
+-----+------+------+--------+------+------+---------+---------+
| 345 | Anne | 1 | 0 | 2 | 1 | 0 | 5 |
| 678 | Bob | 3 | 1 | 1 | 1 | 1 | 2 |
+-----+------+------+--------+------+------+---------+---------+
That looks rather ungainly and hard to maintain. So should I have separate tables for recording check-ins?
No, don't put it into the users table. That information is better stored in a join table which represents a many-to-many relationship between users and cheeses.
The join table (we'll call it cheeses_users) must have at least two columns (user_ID, cheese_ID), but a third (a timestamp) would be useful too. If you default the timestamp column to CURRENT_TIMESTAMP, you need only insert the user_ID and cheese_ID into the table to log a checkin.
cheeses (ID) ⇒ (cheese_ID) cheeses_users (user_ID) ⇐ users (ID)
Created as:
CREATE TABLE cheeses_users (
  cheese_ID INT NOT NULL,
  user_ID INT NOT NULL,
  -- timestamp defaults to current time
  checkin_time DATETIME DEFAULT CURRENT_TIMESTAMP,
  -- (add any other column *specific to* this checkin (user+cheese+time))
  -- The primary key is the combination of all 3
  -- It becomes impossible for the same user to log the same cheese
  -- at the same second in time...
  PRIMARY KEY (cheese_ID, user_ID, checkin_time),
  -- FOREIGN KEYs to your other tables
  FOREIGN KEY (cheese_ID) REFERENCES cheeses (ID),
  FOREIGN KEY (user_ID) REFERENCES users (ID)
) ENGINE=InnoDB; -- InnoDB is necessary for the FK's to be honored and useful
To log a checkin for Bob & Cheddar, insert with:
INSERT INTO cheeses_users (cheese_ID, user_ID) VALUES (2, 678);
To query them, you join through this table. For example, to see the number of each cheese type for each user, you might use:
SELECT
u.Name AS username,
c.Name AS cheesename,
COUNT(*) AS num_checkins
FROM
users u
JOIN cheeses_users cu ON u.ID = cu.user_ID
JOIN cheeses c ON cu.cheese_ID = c.ID
GROUP BY
u.Name,
c.Name
To get the 5 most recent checkins for a given user, something like:
SELECT
c.Name AS cheesename,
cu.checkin_time
FROM
cheeses_users cu
JOIN cheeses c ON cu.cheese_ID = c.ID
WHERE
-- Limit to Anne's checkins...
cu.user_ID = 345
ORDER BY checkin_time DESC
LIMIT 5
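For the specific questions asked (French cheeses for Anne, which cheeses Bob has tried), a runnable sketch might look like this. It uses in-memory SQLite through Python's sqlite3 as a stand-in for MySQL, trims the cheeses and users tables to the columns needed, and adds a Camembert row and sample timestamps that are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for SQLite to enforce the FKs
conn.executescript("""
    CREATE TABLE cheeses (ID INTEGER PRIMARY KEY, Name TEXT, Country TEXT);
    CREATE TABLE users (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE cheeses_users (
        cheese_ID INTEGER NOT NULL REFERENCES cheeses (ID),
        user_ID INTEGER NOT NULL REFERENCES users (ID),
        checkin_time TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP,
        PRIMARY KEY (cheese_ID, user_ID, checkin_time)
    );
    INSERT INTO cheeses VALUES (1, 'Brie', 'France'), (2, 'Cheddar', 'UK'),
                               (3, 'Camembert', 'France');  -- Camembert row is made up
    INSERT INTO users VALUES (345, 'Anne'), (678, 'Bob');
""")

# Explicit timestamps keep the sample data deterministic; a real check-in
# would omit checkin_time and rely on the column default.
checkins = [
    (1, 345, "2014-05-04 19:04:38"),
    (3, 345, "2014-05-08 19:04:38"),
    (2, 345, "2014-05-09 19:04:38"),
    (2, 678, "2014-05-09 20:00:00"),
]
with conn:
    conn.executemany("INSERT INTO cheeses_users VALUES (?, ?, ?)", checkins)

# How many distinct French cheeses has Anne (ID 345) checked into?
french_for_anne = conn.execute("""
    SELECT COUNT(DISTINCT c.ID)
    FROM cheeses_users cu
    JOIN cheeses c ON cu.cheese_ID = c.ID
    WHERE cu.user_ID = ? AND c.Country = 'France'
""", (345,)).fetchone()[0]

# Which cheeses has Bob (ID 678) checked into?
bobs_cheeses = [row[0] for row in conn.execute("""
    SELECT DISTINCT c.Name
    FROM cheeses_users cu
    JOIN cheeses c ON cu.cheese_ID = c.ID
    WHERE cu.user_ID = ?
    ORDER BY c.Name
""", (678,))]

print(french_for_anne, bobs_cheeses)
```

A "more than 5 times" check follows the same shape, with GROUP BY on user and cheese and a HAVING COUNT(*) > 5 clause.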
Let's define more clearly, so you can tell me if I'm wrong:
Cheese instances exist and aren't divisible ("Cheddar/UK/Traditional/Yellow" is a valid checkinable cheese, but "Cheddar" isn't, nor is "Yellow" or "Cheddar/France/...")
Users check into a single cheese instance at a given time
Users can re-check into the same cheese instance at a later date.
If this is the case, then to store fully normalized data, and to be able to retrieve that data's history, you need a third relational table linking the two existing tables.
+-----+-----------+---------------------+
| uid | cheese_id | timestamp           |
+-----+-----------+---------------------+
| 345 | 1         | 2014-05-04 19:04:38 |
| 345 | 2         | 2014-05-08 19:04:38 |
| 678 | 1         | 2014-05-09 19:04:38 |
+-----+-----------+---------------------+
etc. You can add extra columns to correspond to the cheese data, but strictly speaking you don't need to.
By putting all this in a third table, you potentially improve both performance and flexibility. You can always reconstruct the additions to the users table you mooted, using aggregate queries.
If you really decide you don't need the timestamps, then you'd replace them with basically the equivalent of a COUNT(*) field:
+-----+-----------+--------------+
| uid | cheese_id | num_checkins |
+-----+-----------+--------------+
| 345 | 1         | 15           |
| 345 | 2         | 3            |
| 678 | 1         | 8            |
+-----+-----------+--------------+
That would dramatically reduce the size of your joining table, although obviously there's less of a "paper trail", should you need to reconstruct your data (and possibly say to a user "oh, yeah, we forgot to record your checkin on such-a-date.")
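Maintaining such a counter could be sketched as below, again on in-memory SQLite via Python's sqlite3 rather than MySQL. The table name checkin_counts and the helper record_checkin are hypothetical; the update-then-insert pair is used because it is portable, and on MySQL an INSERT ... ON DUPLICATE KEY UPDATE statement would probably be the more natural choice.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE checkin_counts (
        uid          INTEGER NOT NULL,
        cheese_id    INTEGER NOT NULL,
        num_checkins INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (uid, cheese_id)
    )
""")

def record_checkin(uid, cheese_id):
    """Increment the counter, creating the row on the first check-in."""
    with conn:  # one transaction for the update-then-insert pair
        cur = conn.execute(
            "UPDATE checkin_counts SET num_checkins = num_checkins + 1 "
            "WHERE uid = ? AND cheese_id = ?",
            (uid, cheese_id),
        )
        if cur.rowcount == 0:  # no existing row for this user and cheese
            conn.execute(
                "INSERT INTO checkin_counts (uid, cheese_id, num_checkins) "
                "VALUES (?, ?, 1)",
                (uid, cheese_id),
            )

for _ in range(3):
    record_checkin(345, 2)
record_checkin(678, 1)

counts = conn.execute(
    "SELECT uid, cheese_id, num_checkins FROM checkin_counts ORDER BY uid, cheese_id"
).fetchall()
print(counts)
```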
The entities 'User' and 'Cheese' have a many-to-many relationship. A user can have multiple cheeses he checked into, and a cheese can have multiple people that checked into it.
The only right way to design this in a relational database is to store it into a separate table. There are many reasons why storing it into the user table for instance, is a very bad idea. Read up on normalizing databases for more info on this.
Your table should look something like this:
CheckIns(CheeseId, UserId, (etc...))
Other useful columns might include date or rating, or whatever you want to store about a particular relationship between a user and a cheese.

Bad MYSQL design and how to improve it

I have the following problem trying to sort out a database which, when I started, was an unholy mix of MySQL, handwritten back-of-envelope notes, Excel spreadsheets and completely missing records (don't ask). I have now reduced the problem down to the following.
I have 4 tables; in greatly simplified form they are:
customers
---------
id int not null primary key,
name varchar(50)
users
---------
id2 int not null primary key,
name varchar(50)
address
-------
id int,
id2 int,
country varchar(50)
product
-------
id int,
id2 int,
item varchar(50)
Sample data:
Examples select * from customers;
+----+------+
| id | name |
+----+------+
| 1 | Fred |
+----+------+
select * from users;
+----+---------+
|id2 | name |
+----+---------+
| 1 | Wilma |
| 2 | Pebbles |
+----+---------+
select * from address;
+----+-----+---------+
| id | id2 | country |
+----+-----+---------+
| 1 | 1 | Bedrock |
| 1 | 2 | Bedrock |
+----+-----+---------+
select * from product;
+----+-----+---------+
| id | id2 | item |
+----+-----+---------+
| 1 | 1 | Item1 |
| 1 | 2 | Item2 |
+----+-----+---------+
customers.id is a primary key and links to address.id and product.id
users.id2 is a primary key and links to address.id2 and product.id2
This arrangement fails where more than one user shares an address with a customer. The only way to work around this at present is to duplicate the record in address and change the id2 number. At present this only affects one case in the database.
Where users don't share an address with a customer, I am unable to work out a select statement that will find the customer's name, the customer's address, the user's name and the user's address.
This situation applies to approximately 30% of the records.
Any suggestions on how to sort this chaos would be most welcome.
Richard
To summarize what I think you're describing: An address can be shared by multiple users, but a user/customer can only have one address.
It sounds like you have your many-to-one relationship backwards.
Meaning, you have a foreign key in the address table pointing to customer/user, when what you really want is a foreign key in the customer/user table pointing to address. But that's kind of ugly... so a couple of ways you might clean it up:
Is a customer also a user? If so, have a single foreign key in Address that points to User, and a foreign key from User to Customer would be fairly clean.
If not, you might consider a table that bridges User/Customer and address. Something like this:
User_Cust_ID, Type, Address_ID
1 Customer 5
2 Customer 1
1 User 1
That will allow you to resolve a user and customer that share the same address.
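A rough sketch of the bridge-table idea follows, using in-memory SQLite via Python's sqlite3 as a stand-in for MySQL. Several things here are assumptions rather than part of the original schema: the table names addresses and party_address, the second address 'Rock Vegas', and the use of the product table to decide which user belongs with which customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE users (id2 INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product (id INTEGER, id2 INTEGER, item TEXT);
    CREATE TABLE addresses (address_id INTEGER PRIMARY KEY, country TEXT);
    -- Bridge table: one row per customer or user, pointing at a shared address.
    CREATE TABLE party_address (
        user_cust_id INTEGER NOT NULL,
        type TEXT NOT NULL CHECK (type IN ('Customer', 'User')),
        address_id INTEGER NOT NULL REFERENCES addresses (address_id),
        PRIMARY KEY (user_cust_id, type)  -- at most one address per party
    );
    INSERT INTO customers VALUES (1, 'Fred');
    INSERT INTO users VALUES (1, 'Wilma'), (2, 'Pebbles');
    INSERT INTO product VALUES (1, 1, 'Item1'), (1, 2, 'Item2');
    INSERT INTO addresses VALUES (1, 'Bedrock'), (2, 'Rock Vegas');
    INSERT INTO party_address VALUES
        (1, 'Customer', 1), (1, 'User', 1), (2, 'User', 2);
""")

# Customer name and address alongside user name and address, whether or not
# the two share an address row.
rows = conn.execute("""
    SELECT c.name, ca.country, u.name, ua.country
    FROM product p
    JOIN customers c ON c.id = p.id
    JOIN users u ON u.id2 = p.id2
    JOIN party_address cpa ON cpa.user_cust_id = c.id AND cpa.type = 'Customer'
    JOIN addresses ca ON ca.address_id = cpa.address_id
    JOIN party_address upa ON upa.user_cust_id = u.id2 AND upa.type = 'User'
    JOIN addresses ua ON ua.address_id = upa.address_id
    ORDER BY u.id2
""").fetchall()
print(rows)
```

Fred and Wilma share address 1 without any duplicated address row, while Pebbles points at a different one.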
HTH...

Private Message Database Design

Alright, so I think I'm pretty close to having what I need, but I'm unsure about a couple of things:
TABLE messages
message_id
message_type
sender_id
timestamp
TABLE message_type
message_type_code (1, 2, 3)
name (global, company, personal)
TABLE message_to_user
message_id
receiver_id
status (read/unread)
Goals:
Be able to send GLOBAL messages to all users.
Send PERSONAL messages between 1 or more users.
Determine if any of these messages have been read or not by the receiver.
Questions:
Does my schema take care of all that it needs to?
What would a sample SQL query look like to populate someone's inbox, bringing in GLOBAL messages as well as PERSONAL messages? I'd like to be able to determine which is which for the UI.
And please feel free to add to my schema if you feel it would benefit.
Schema looks like it will work. Should probably have a Created date too. There's no way to know if you've read a global message though without creating entries for everyone.
Here's some SQL:
SELECT M.*, MTU.*
FROM messages M
LEFT JOIN message_to_user MTU ON MTU.message_id=M.message_id
WHERE MTU.receiver_id={$UserID} OR M.message_type={$GlobalType}
ORDER BY M.created_on DESC
[EDIT]
Problem: Every user needs to have their own unique "read" status for global e-mails. You probably also want to give them the ability to "delete"/hide this e-mail so they don't have to be looking at it all the time. There is no way around this without creating status rows. You could create a row for each recipient as the e-mail goes out, but that many INSERTs all at once is probably taxing. Better yet, don't create a status row until the message is read. This way, INSERTs for global e-mails will only occur when the message is read.
messages
    message_id
    message_type
    sender_id
    timestamp
message_recipient
    message_id
    user_id
message_status
    message_status_id
    message_id
    user_id
    is_read
    read_datetime
    is_deleted
    deleted_datetime
SELECT M.*, MR.*, MS.*
FROM messages M
LEFT JOIN message_recipient MR ON MR.message_id=M.message_id
LEFT JOIN message_status MS ON MS.message_id=M.message_id AND MS.user_id={$UserId}
WHERE
(MS.message_status_id IS NULL OR MS.is_deleted = 0)
AND (MR.user_id={$UserId} OR M.message_type={$GlobalType})
ORDER BY M.timestamp DESC
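The "create the status row only when the message is read" idea could be sketched like this. It runs on in-memory SQLite via Python's sqlite3 instead of MySQL, and several details are assumptions: the type codes 0 and 1, the composite primary key on message_status in place of message_status_id, the sample rows, and the helper names mark_read and inbox. SQLite's INSERT OR IGNORE plays the role that INSERT IGNORE would on MySQL.

```python
import sqlite3

GLOBAL, PERSONAL = 0, 1  # hypothetical type codes

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE messages (message_id INTEGER PRIMARY KEY, message_type INTEGER,
                           sender_id INTEGER, timestamp TEXT);
    CREATE TABLE message_recipient (message_id INTEGER, user_id INTEGER,
                                    PRIMARY KEY (message_id, user_id));
    CREATE TABLE message_status (
        message_id INTEGER, user_id INTEGER,
        is_read INTEGER NOT NULL DEFAULT 0, read_datetime TEXT,
        is_deleted INTEGER NOT NULL DEFAULT 0, deleted_datetime TEXT,
        PRIMARY KEY (message_id, user_id));
    -- made-up sample data: one global message and two personal ones
    INSERT INTO messages VALUES (1, 0, 1, '2011-04-13 17:05:00'),
                                (2, 1, 2, '2011-04-12 00:09:00'),
                                (3, 1, 2, '2011-04-12 02:05:00');
    INSERT INTO message_recipient VALUES (2, 3), (3, 4);
""")

def mark_read(message_id, user_id):
    """Create the status row lazily, then flag it as read."""
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO message_status (message_id, user_id) VALUES (?, ?)",
            (message_id, user_id),
        )
        conn.execute(
            "UPDATE message_status SET is_read = 1, read_datetime = datetime('now') "
            "WHERE message_id = ? AND user_id = ?",
            (message_id, user_id),
        )

def inbox(user_id):
    """Personal messages addressed to the user plus all global ones, newest first."""
    return conn.execute("""
        SELECT M.message_id, M.message_type, COALESCE(MS.is_read, 0) AS is_read
        FROM messages M
        LEFT JOIN message_recipient MR
               ON MR.message_id = M.message_id AND MR.user_id = :uid
        LEFT JOIN message_status MS
               ON MS.message_id = M.message_id AND MS.user_id = :uid
        WHERE (MR.user_id IS NOT NULL OR M.message_type = :global)
          AND COALESCE(MS.is_deleted, 0) = 0
        ORDER BY M.timestamp DESC
    """, {"uid": user_id, "global": GLOBAL}).fetchall()

mark_read(1, 3)       # user 3 reads the global message
inbox_3 = inbox(3)
inbox_4 = inbox(4)    # user 4 has no status rows at all yet
print(inbox_3, inbox_4)
```

A missing status row is read as "unread and not deleted", so no rows need to exist for a global message until someone opens or hides it.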
[EDIT]
Whether to use message_type as a DB table or simply as settings within your code is partly a personal preference and partly your needs. If you need to query the DB and see the text "personal" and "global" directly from your query, then you want to use the message_type table. However, if you only need the "type" to handle your business logic, but don't need to see it in query results, then I would go with an "Enum" style approach. Enums are a C# thing...in PHP, the closest you've got is a class with constants...something like:
class MessageTypes {
public const Global = 0;
public const Personal = 1;
}
So, your query would be: WHERE ... message_type=".MessageTypes::Global."...
One method is to separate the global messages from the personal messages, as I think you have tried to do already.
To effectively get a read status for a global message, you would need to add a table with a composite key containing the global_message_id and user_id together.
messages_tbl
- message_id | int(11) | Primary Key / Auto_Increment
- message_type | int(11)
- sender_id | int(11) | FK to sender
- receiver_id | int(11) | FK to receiver
- status | int(1) | 0/1 for Unread / Read
- message | text
- date | datetime
global_message_tbl
- g_message_id | int(11) | Primary Key / Auto_Increment
- g_message_type | int(11)
- sender_id | int(11) | FK to sender
- date | datetime
global_readstatus_tbl
- user_id | int(11) | Primary Key
- g_message_id | int(11) | Primary Key
- date | datetime
Alternatively, merge the messages_tbl and global_message_tbl so that each user is sent a global message personally in a loop. This reduces your schema right down to one table.
messages_tbl
- message_id | int(11) | Primary Key / Auto_Increment
- sender_id | int(11) | FK to sender
- receiver_id | int(11) | FK to receiver
- status | int(1) | 0/1 for Unread / Read
- message_type | varchar(8) | Personal / Global / Company
- message | text
- date | datetime
- type | varchar(8)
If you want the ability to normalise your table a bit better, and make it easier to add message types in the future, move message_type back into its own table again, and make message_type a FK referencing message_type_id.
message_type_tbl
- message_type_id | int(11) | Primary Key / Auto_Increment
- message_type | varchar(8) | Personal / Global / Company
Update - Sample Table (1 Table)
message_tbl
message_id | message_type | sender_id | receiver_id | status | message | datetime
1 | personal | 2 | 3 | read | foobar | 12/04/11 00:09:00
2 | personal | 2 | 4 | unread | foobar | 12/04/11 00:09:00
3 | personal | 3 | 2 | unread | barfoo | 12/04/11 02:05:00
4 | global | 1 | 2 | unread | gmessage | 13/04/11 17:05:00
5 | global | 1 | 3 | unread | gmessage | 13/04/11 17:05:00
6 | global | 1 | 4 | read | gmessage | 13/04/11 17:05:00
user_tbl
user_id | name
1 | Admin
2 | johnsmith
3 | mjordan
4 | spippen
The above assumes users 2, 3 and 4 are general users sending messages to each other, user 1 is the admin account that will be used to send global messages (delivered directly to each user individually) allowing you to see the same information as if it were a personal message.
To send a global message in this format, you would simply loop over the users table to obtain all the IDs you want to send the global message out to, then INSERT a row for each user in the messages_tbl.
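A minimal sketch of that fan-out is shown below, on in-memory SQLite via Python's sqlite3 as a stand-in for MySQL. Note that it swaps the application-side loop for a single INSERT ... SELECT, which does the same per-user insert inside the database; send_global is a hypothetical helper name and the column types are simplified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_tbl (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE messages_tbl (
        message_id   INTEGER PRIMARY KEY AUTOINCREMENT,
        message_type TEXT,
        sender_id    INTEGER,
        receiver_id  INTEGER,
        status       TEXT NOT NULL DEFAULT 'unread',
        message      TEXT,
        date         TEXT
    );
    INSERT INTO user_tbl VALUES (1, 'Admin'), (2, 'johnsmith'),
                                (3, 'mjordan'), (4, 'spippen');
""")

def send_global(sender_id, text):
    """Deliver one copy of a global message to every user except the sender."""
    with conn:
        cur = conn.execute("""
            INSERT INTO messages_tbl (message_type, sender_id, receiver_id, message, date)
            SELECT 'global', ?, user_id, ?, datetime('now')
            FROM user_tbl
            WHERE user_id <> ?
        """, (sender_id, text, sender_id))
        return cur.rowcount  # number of copies created

sent = send_global(1, "gmessage")
receivers = [row[0] for row in conn.execute(
    "SELECT receiver_id FROM messages_tbl ORDER BY receiver_id")]
print(sent, receivers)
```

Each copy starts out as 'unread', so the per-user read status works exactly as it does for personal messages.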
If you don't anticipate your users sending millions of messages a day as well as regular global messages to millions of users then the number of rows shouldn't be an issue. You can always purge old read messages from users by creating a cleanup script.