How to design the db for column update voting system? - mysql

Assuming that the db has a table 'houses' -> house_id(pk) | color | nbr_of_windows;
The houses table is populated by 1 user but later , if a column has to be updated , the update will be finalised only if a number of users agree on it. Should I make another 2 tables that will store data about update proposals (the voting system data) for each houses table column? Are there any better solutions?

Related

How do I store all record change in history, so that I can keep track easily in MySQL

I'm a beginner at MySQL.
I have a table with 160rows .
I need to keep tracking this table everytime someone make change to all row ,
example
tabelname = customer
id
user_id
name
lastname
datebirth
accupation
spouse
etc...
etc..
until 160 rows in total
When someone changes a record, let say datebirth, I need all detail record in the history table.
When someone changes 10times, I need all the records to change, so I have 10 save records on history tables.
You can maintain another table for history. Whenever any update happen in base table you just need to copy current row to history table and you can additionally maintain user_id (user who updated row) and timestamp (time when update occurred). This way you can fetch the history of the row with group by operation on customer_id, time etc.
For Implementation you can follow this approach - https://dev.to/zhiyueyi/design-a-table-to-keep-historical-changes-in-database-10fn
for more info - 1-> SQL history table design
2-> how to have a history of the Sql table column.?
3-> How to store historical records in a history table in SQL Server

How to better build a database

We have a DB on SQL, where we have a table (1) for users and a table (2) for user's saved information. Each piece of information is one line in table (2). So my question is the following - If we are intending to grow number of users to more than 1.000.000 and each user can have more than 10 piece of information, which of the following is a better way to build our DB:
a) Having 2 tables - 1 for users and 1 for information from all users, related to users with ID
b) Having a separate table for each user.
Thanks in advance.
Definitely it should be having a single table for the user is much better. Think from the DB prospective. You are thinking about the search time in a 1.000.000 row for a sorted ID. In the second case you have to search 1.000.000 table to get into a right table. So better go for option A.
I'm going to agree that option A is the better of the two options presented.
That being said, I would personally break up the information for the users into more tables as well. This would all be connected using foreign keys and will allow for more specific querying of the information.
SQL is not really horizontally scalable, so if you end up with users with less or more information than others, then you'll have NULL columns and this requires dealing with in various ways.
By using separate tables, you can still have all of the information contained, but not have to worry if one user has a home and cell phone number, while another only has a cell number.
If and when you do need to access a lot of the information at once, SQL is very good at dealing with this through joins and the like.
Option B is not bad, it just does not fit SQL. I would work if the DB in question was document based instead of tables. In that case, creating a single document for each user is a good idea, and likely preferred.
Option C)
table for users with a unique UserID as Clustered Index (Primary Key)
table for Type of saved information with a unique InformationID as Clustered Index (Primary Key)
table for UserInformation with unique UserInformationID as Clustered Index (Primary Key), a column for UserID (nonclustered index, foreign key to user table) and a column for InformationID (nonclustered index, foreign key to Information table). Have a "Value" or similar column to hold the data being save as it relates to the type of information.
Example:
Users Table
UserID UserName
1 | UserName1
2 | UserName2
Information Table
InfoID InfoName
1 | FavoriteColor
2 | FavoriteNumber
3 | Birthday
UserInformation Table
ID UserID InfoID Value
1 | 1 | 1 | Blue
2 | 1 | 2 | 7
3 | 1 | 3 | '11/01/1999'
4 | 2 | 3 | '05/16/1960'
This method allows for you to save any combination of values for any user without recording any of the non-supplied user information. It keeps the information table 'clean' because you won't need to keep adding columns for each new piece of information you wish to track. Just add a new record to the Info table, and then record only the values submitted to the UserInformation table.

Create a new lookup table where data already exists

I am working on a database in MS Access 2013 which has a considerable amount of non-normalised data, and I want to move them out to alternate tables and use them as lookups in the main table. However, when I create a lookup column, MS Access deletes the data and there is far too much data to reset every record by hand.
Is there a way in Access 2013 to create such a lookup without losing the data?
Please don't comment about how using lookup tables in Access is bad. I have read posts like the one below and I disagree with most of the points there, and some of them are just simply wrong.
http://access.mvps.org/access/lookupfields.htm
Below is a sample of my data. I need to extract the 2nd and 3rd fields to other tables. If I can do this with them, I can do it with the others.
Presently this is stored as text in the fields. I would like to remove them and replace them with FK id's.
You could create your second table and add the data to the table. Then update the first table to match the records to each other
Let say you have the following table:
CustOrders
ID Customer DateOrdered
123 K-Mart 01/01/2013
124 K Mart 01/05/2013
125 Walmart 02/05/2013
126 Walmart 03/07/2013
127 Miejers 03/11/2013
128 K-Mart 03/12/2013
You could find out all of the Customers that are in the CustOrders table by performing the following:
SELECT DISTINCT Customer From CustOrders
Then create a record in the following table for each:
Customers
ID Customer
1 K-Mart
2 Walmart
3 Miejers
Then you could Update the CustOrders table by performing the following:
UPDATE CustOrders SET Customer = 1 WHERE Customer = 'K-Mart' OR Customer = 'K Mart'
Of course you would have to do this for each distinct customer you have.
Then if you want you could change the data type of the Customer field in the CustOrders table to Long Integer.
Lastly I would create a combo box on any entry/edit form that has the following RowSource:
SELECT ID, Customer FROM Customers ORDER BY Customer
Set the combo box to limit from list and bind to column 1.

Is there a MySQL option/feature to track history of changes to records?

I've been asked if I can keep track of the changes to the records in a MySQL database. So when a field has been changed, the old vs new is available and the date this took place. Is there a feature or common technique to do this?
If so, I was thinking of doing something like this. Create a table called changes. It would contain the same fields as the master table but prefixed with old and new, but only for those fields which were actually changed and a TIMESTAMP for it. It would be indexed with an ID. This way, a SELECT report could be run to show the history of each record. Is this a good method? Thanks!
Here's a straightforward way to do this:
First, create a history table for each data table you want to track (example query below). This table will have an entry for each insert, update, and delete query performed on each row in the data table.
The structure of the history table will be the same as the data table it tracks except for three additional columns: a column to store the operation that occured (let's call it 'action'), the date and time of the operation, and a column to store a sequence number ('revision'), which increments per operation and is grouped by the primary key column of the data table.
To do this sequencing behavior a two column (composite) index is created on the primary key column and revision column. Note that you can only do sequencing in this fashion if the engine used by the history table is MyISAM (See 'MyISAM Notes' on this page)
The history table is fairly easy to create. In the ALTER TABLE query below (and in the trigger queries below that), replace 'primary_key_column' with the actual name of that column in your data table.
CREATE TABLE MyDB.data_history LIKE MyDB.data;
ALTER TABLE MyDB.data_history MODIFY COLUMN primary_key_column int(11) NOT NULL,
DROP PRIMARY KEY, ENGINE = MyISAM, ADD action VARCHAR(8) DEFAULT 'insert' FIRST,
ADD revision INT(6) NOT NULL AUTO_INCREMENT AFTER action,
ADD dt_datetime DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP AFTER revision,
ADD PRIMARY KEY (primary_key_column, revision);
And then you create the triggers:
DROP TRIGGER IF EXISTS MyDB.data__ai;
DROP TRIGGER IF EXISTS MyDB.data__au;
DROP TRIGGER IF EXISTS MyDB.data__bd;
CREATE TRIGGER MyDB.data__ai AFTER INSERT ON MyDB.data FOR EACH ROW
INSERT INTO MyDB.data_history SELECT 'insert', NULL, NOW(), d.*
FROM MyDB.data AS d WHERE d.primary_key_column = NEW.primary_key_column;
CREATE TRIGGER MyDB.data__au AFTER UPDATE ON MyDB.data FOR EACH ROW
INSERT INTO MyDB.data_history SELECT 'update', NULL, NOW(), d.*
FROM MyDB.data AS d WHERE d.primary_key_column = NEW.primary_key_column;
CREATE TRIGGER MyDB.data__bd BEFORE DELETE ON MyDB.data FOR EACH ROW
INSERT INTO MyDB.data_history SELECT 'delete', NULL, NOW(), d.*
FROM MyDB.data AS d WHERE d.primary_key_column = OLD.primary_key_column;
And you're done. Now, all the inserts, updates and deletes in 'MyDb.data' will be recorded in 'MyDb.data_history', giving you a history table like this (minus the contrived 'data_columns' column)
ID revision action data columns..
1 1 'insert' .... initial entry for row where ID = 1
1 2 'update' .... changes made to row where ID = 1
2 1 'insert' .... initial entry, ID = 2
3 1 'insert' .... initial entry, ID = 3
1 3 'update' .... more changes made to row where ID = 1
3 2 'update' .... changes made to row where ID = 3
2 2 'delete' .... deletion of row where ID = 2
To display the changes for a given column or columns from update to update, you'll need to join the history table to itself on the primary key and sequence columns. You could create a view for this purpose, for example:
CREATE VIEW data_history_changes AS
SELECT t2.dt_datetime, t2.action, t1.primary_key_column as 'row id',
IF(t1.a_column = t2.a_column, t1.a_column, CONCAT(t1.a_column, " to ", t2.a_column)) as a_column
FROM MyDB.data_history as t1 INNER join MyDB.data_history as t2 on t1.primary_key_column = t2.primary_key_column
WHERE (t1.revision = 1 AND t2.revision = 1) OR t2.revision = t1.revision+1
ORDER BY t1.primary_key_column ASC, t2.revision ASC
Edit:
Oh wow, people like my history table thing from 6 years ago :P
My implementation of it is still humming along, getting bigger and more unwieldy, I would assume. I wrote views and pretty nice UI to look at the history in this database, but I don't think it was ever used much. So it goes.
To address some comments in no particular order:
I did my own implementation in PHP that was a little more involved, and avoided some of the problems described in comments (having indexes transferred over, signifcantly. If you transfer over unique indexes to the history table, things will break. There are solutions for this in the comments). Following this post to the letter could be an adventure, depending on how established your database is.
If the relationship between the primary key and the revision column seems off it usually means the composite key is borked somehow. On a few rare occasions I had this happen and was at a loss to the cause.
I found this solution to be pretty performant, using triggers as it does. Also, MyISAM is fast at inserts, which is all the triggers do. You can improve this further with smart indexing (or lack of...). Inserting a single row into a MyISAM table with a primary key shouldn't be an operation you need to optimize, really, unless you have significant issues going on elsewhere. In the entire time I was running the MySQL database this history table implementation was on, it was never the cause of any of the (many) performance problems that came up.
if you're getting repeated inserts, check your software layer for INSERT IGNORE type queries. Hrmm, can't remember now, but I think there are issues with this scheme and transactions which ultimately fail after running multiple DML actions. Something to be aware of, at least.
It's important that the fields in the history table and the data table match up. Or, rather, that your data table doesn't have MORE columns than the history table. Otherwise, insert/update/del queries on the data table will fail, when the inserts to the history tables put columns in the query that don't exist (due to d.* in the trigger queries), and the trigger fails. t would be awesome if MySQL had something like schema-triggers, where you could alter the history table if columns were added to the data table. Does MySQL have that now? I do React these days :P
It's subtle.
If the business requirement is "I want to audit the changes to the data - who did what and when?", you can usually use audit tables (as per the trigger example Keethanjan posted). I'm not a huge fan of triggers, but it has the great benefit of being relatively painless to implement - your existing code doesn't need to know about the triggers and audit stuff.
If the business requirement is "show me what the state of the data was on a given date in the past", it means that the aspect of change over time has entered your solution. Whilst you can, just about, reconstruct the state of the database just by looking at audit tables, it's hard and error prone, and for any complicated database logic, it becomes unwieldy. For instance, if the business wants to know "find the addresses of the letters we should have sent to customers who had outstanding, unpaid invoices on the first day of the month", you likely have to trawl half a dozen audit tables.
Instead, you can bake the concept of change over time into your schema design (this is the second option Keethanjan suggests). This is a change to your application, definitely at the business logic and persistence level, so it's not trivial.
For example, if you have a table like this:
CUSTOMER
---------
CUSTOMER_ID PK
CUSTOMER_NAME
CUSTOMER_ADDRESS
and you wanted to keep track over time, you would amend it as follows:
CUSTOMER
------------
CUSTOMER_ID PK
CUSTOMER_VALID_FROM PK
CUSTOMER_VALID_UNTIL PK
CUSTOMER_STATUS
CUSTOMER_USER
CUSTOMER_NAME
CUSTOMER_ADDRESS
Every time you want to change a customer record, instead of updating the record, you set the VALID_UNTIL on the current record to NOW(), and insert a new record with a VALID_FROM (now) and a null VALID_UNTIL. You set the "CUSTOMER_USER" status to the login ID of the current user (if you need to keep that). If the customer needs to be deleted, you use the CUSTOMER_STATUS flag to indicate this - you may never delete records from this table.
That way, you can always find what the status of the customer table was for a given date - what was the address? Have they changed name? By joining to other tables with similar valid_from and valid_until dates, you can reconstruct the entire picture historically. To find the current status, you search for records with a null VALID_UNTIL date.
It's unwieldy (strictly speaking, you don't need the valid_from, but it makes the queries a little easier). It complicates your design and your database access. But it makes reconstructing the world a lot easier.
You could create triggers to solve this. Here is a tutorial to do so (archived link).
Setting constraints and rules in the database is better than writing
special code to handle the same task since it will prevent another
developer from writing a different query that bypasses all of the
special code and could leave your database with poor data integrity.
For a long time I was copying info to another table using a script
since MySQL didn’t support triggers at the time. I have now found this
trigger to be more effective at keeping track of everything.
This trigger will copy an old value to a history table if it is changed
when someone edits a row. Editor ID and last mod are stored in the
original table every time someone edits that row; the time corresponds
to when it was changed to its current form.
DROP TRIGGER IF EXISTS history_trigger $$
CREATE TRIGGER history_trigger
BEFORE UPDATE ON clients
FOR EACH ROW
BEGIN
IF OLD.first_name != NEW.first_name
THEN
INSERT INTO history_clients
(
client_id ,
col ,
value ,
user_id ,
edit_time
)
VALUES
(
NEW.client_id,
'first_name',
NEW.first_name,
NEW.editor_id,
NEW.last_mod
);
END IF;
IF OLD.last_name != NEW.last_name
THEN
INSERT INTO history_clients
(
client_id ,
col ,
value ,
user_id ,
edit_time
)
VALUES
(
NEW.client_id,
'last_name',
NEW.last_name,
NEW.editor_id,
NEW.last_mod
);
END IF;
END;
$$
Another solution would be to keep an Revision field and update this field on save. You could decide that the max is the newest revision, or that 0 is the most recent row. That's up to you.
Here is how we solved it
a Users table looked like this
Users
-------------------------------------------------
id | name | address | phone | email | created_on | updated_on
And the business requirement changed and we were in a need to check all previous addresses and phone numbers a user ever had.
new schema looks like this
Users (the data that won't change over time)
-------------
id | name
UserData (the data that can change over time and needs to be tracked)
-------------------------------------------------
id | id_user | revision | city | address | phone | email | created_on
1 | 1 | 0 | NY | lake st | 9809 | #long | 2015-10-24 10:24:20
2 | 1 | 2 | Tokyo| lake st | 9809 | #long | 2015-10-24 10:24:20
3 | 1 | 3 | Sdny | lake st | 9809 | #long | 2015-10-24 10:24:20
4 | 2 | 0 | Ankr | lake st | 9809 | #long | 2015-10-24 10:24:20
5 | 2 | 1 | Lond | lake st | 9809 | #long | 2015-10-24 10:24:20
To find the current address of any user, we search for UserData with revision DESC and LIMIT 1
To get the address of a user between a certain period of time
we can use created_on bewteen (date1 , date 2)
MariaDB supports System Versioning since 10.3 which is the standard SQL feature that does exactly what you want: it stores history of table records and provides access to it via SELECT queries. MariaDB is an open-development fork of MySQL. You can find more on its System Versioning via this link:
https://mariadb.com/kb/en/library/system-versioned-tables/
Why not simply use bin log files? If the replication is set on the Mysql server, and binlog file format is set to ROW, then all the changes could be captured.
A good python library called noplay can be used. More info here.
Just my 2 cents. I would create a solution which records exactly what changed, very similar to transient's solution.
My ChangesTable would simple be:
DateTime | WhoChanged | TableName | Action | ID |FieldName | OldValue
1) When an entire row is changed in the main table, lots of entries will go into this table, BUT that is very unlikely, so not a big problem (people are usually only changing one thing)
2) OldVaue (and NewValue if you want) have to be some sort of epic "anytype" since it could be any data, there might be a way to do this with RAW types or just using JSON strings to convert in and out.
Minimum data usage, stores everything you need and can be used for all tables at once. I'm researching this myself right now, but this might end up being the way I go.
For Create and Delete, just the row ID, no fields needed. On delete a flag on the main table (active?) would be good.
The direct way of doing this is to create triggers on tables. Set some conditions or mapping methods. When update or delete occurs, it will insert into 'change' table automatically.
But the biggest part is what if we got lots columns and lots of table. We have to type every column's name of every table. Obviously, It's waste of time.
To handle this more gorgeously, we can create some procedures or functions to retrieve name of columns.
We can also use 3rd-part tool simply to do this. Here, I write a java program
Mysql Tracker
In MariaDB 10.5+ this is as easy to setup as
CREATE TABLE t (x INT) WITH SYSTEM VERSIONING
PARTITION BY SYSTEM_TIME;
Past history can then be queried by doing
SELECT * FROM t FOR SYSTEM_TIME AS OF TIMESTAMP '2016-10-09 08:07:06';
There is currently no counterpart for this in MySQL.
See the documentation for more info. If you're on an older version of MariaDB, the documentation has an alternate syntax that has been available since MariaDB 10.3.4.

Design of MySQL DB to avoid having a table with mutually exclusive fields

I'm creating a new DB and I have this problem: I have two type of users that can place orders: registered users (that is, they have a login) and guest users (that is, no login). The data for registered users and guest users are different and that's why I'm thinking of using two different tables, but the orders (that share the same workflow) are all the same, so I'm thinking about using only one table.
I've read here and here (even if I don't understand fully this example) that I can enforce a MySQL rule to have mutually exclusive columns in a table (in my case they'd be "idGuest" and "idUser") but I don't like that approach.
Is there a better way to do it?
There are several approaches, which depends on the number of records and number of unique fields. For example, if you would say they differ in only two fields, I would have suggested that you just put everything in the same table.
My approach, assuming they differ a lot, would be to think "objects":
You have a main user table, and for each user type you have another table that "elaborates" that user info.
Users
-----
id,email,phone,user_type(guest or registered)
reg_users
---------
users_id, username,password etc.....
unreg_users
-----------
user_id,last_known_address, favorite_color....etc
Where user_id is foreign key to users table
Sounds like mostly a relational supertype/subtype issue. I've answered a similar question and included sample code that you should be able to adapt without much trouble. (Make sure you read the comments.)
The mildly complicating factor for you is that one subtype (guest users) could someday become a different subtype (registered users). How you'd handle that would be application-dependent. (Meaning you'd know, but probably nobody else would.)
I think I would have three tables :
A user table, that would contain :
One row for each user, no matter what type of user
The data that's present for both guests and registered
A field that indicates if a row corresponds to a registered or a guest
A guest table, that would contain :
One row per guest user,
The data that's specific to guests
And a registered table, that would contain :
One row per registered user,
The data that's specific to registered users
Then, when referencing a user (in your orders table, for example), you'd always use the id of the user table.
What you are describing is a polymorphic table. It sounds scary, but it really isn't so bad.
You can keep your separate User and Guest tables. For your Orders table, you have two columns: foreign_id and foreign_type (you can name them anything). The foreign_id is the id of the User or Guest in your case, and the content of the foreign_type is going to be either user or guest:
id | foreign_id | foreign_type | other_data
-------------------------------------------------
1 | 1 | user | ...
2 | 1 | guest | ...
To select rows for a particular user or guest, just specify the foreign_type along with the ID:
SELECT * FROM orders WHERE foreign_id = 1 AND foreign_type = 'guest';
The foreign key in the Orders table pointing back to the Customer entity that placed the order is typically a non-nullable column. If you have two different Customer tables (RegisteredCustomer and GuestCustomer) then you would requiree two separate nullable columns in the Orders table pointing back to the separate customer tables. What I would suggest is to have only one Customers table, containing only those rows (EDIT: sorry, meant to write only those COLUMNS) that are common to registered users and guest users, and then a RegisteredUsers table which has a foreign-key relationship with the Customers table.