How to solve a database design issue that violates the normalization rule?

I have two tables: Orders, which contains the available orders, and ExecutorsOfferOffers, which contains the offers made by a specific user for a specific order.
An order can be in one of four statuses, including accepted, canceled, and finished. ExecutorsOfferOffers holds a history of rows recording when a user accepts, rejects, or completes the order (the status field).
At the same time, the Orders table also has a status field that shows the current status of the order. I am unsure which way to go: move executor_id into the Orders table as a foreign key alongside status, or keep both in ExecutorsOfferOffers and retrieve the order's status by selecting one row, ordered by id, from ExecutorsOfferOffers.
In the second case I run into a problem when a user accepts and then cancels an order (two rows are inserted into ExecutorsOfferOffers with different statuses).

If I understand correctly:
ExecutorsOrderOffers contains the history of user requests to change the status of the order.
Orders contains the current status of the order.
One Order can have many ExecutorsOrderOffers.
They're not recording the same data.
This information is not necessarily the same. Just because a User requested an order be cancelled does not mean it is cancelled; someone might need to make a phone call or an API might need to be contacted. Perhaps you should leave the status decoupled.
This leaves more flexibility for the business logic to determine the relationship between ExecutorsOrderOffers and Orders.
Use a join table to record the Order status.
Status flags get messy. You have to remember to add them to every WHERE clause, and they make indexing complicated. Instead, consider using a join table to record the status of an order.
-- One for each status.
create table AcceptedOrders (
    Order_Id int not null,
    foreign key (Order_Id) references Orders(Id)
);
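As a rough illustration of how such a status table might be queried, here is a minimal sketch. It uses Python's built-in sqlite3 module rather than MySQL so that it runs without a server; the Title column, the sample rows, and the UNIQUE constraint are assumptions added for the example and are not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (Id INTEGER PRIMARY KEY, Title TEXT NOT NULL);  -- Title is hypothetical
-- One join table per status, as suggested above. UNIQUE is an added
-- assumption so that an order cannot be marked accepted twice.
CREATE TABLE AcceptedOrders (
    Order_Id INTEGER NOT NULL UNIQUE,
    FOREIGN KEY (Order_Id) REFERENCES Orders(Id)
);
INSERT INTO Orders (Id, Title) VALUES (1, 'Paint fence'), (2, 'Fix roof');
INSERT INTO AcceptedOrders (Order_Id) VALUES (2);
""")

# Accepted orders are found with a join; no status flag appears in a WHERE clause.
accepted = conn.execute("""
    SELECT o.Id, o.Title
    FROM Orders o
    JOIN AcceptedOrders a ON a.Order_Id = o.Id
""").fetchall()
print(accepted)
```

Changing an order's status then becomes deleting its row from one status table and inserting it into another, ideally inside one transaction.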
Add a timestamp to ExecutorsOrderOffers.
IDs are not a surrogate for timestamps, and you're probably going to want to know when a user made a change; add a Created_At timestamp.
Add an index on ExecutorsOrderOffers(Created_At, Orders_ID)
Index Created_At with Orders_ID in that order. This will cover searches and order by for Created_At as well as when combined with an Orders_ID.
The foreign key index already covers search by Orders_ID alone.
Now you can efficiently look up the latest user request to change the status of an order.
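The lookup described above might look like the following sketch. It runs on SQLite through Python's sqlite3 module purely for convenience; the Executor_Id and Status columns, the index name, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Orders (Id INTEGER PRIMARY KEY);
CREATE TABLE ExecutorsOrderOffers (
    Id INTEGER PRIMARY KEY,
    Orders_ID INTEGER NOT NULL REFERENCES Orders(Id),
    Executor_Id INTEGER NOT NULL,   -- hypothetical column name
    Status TEXT NOT NULL,
    Created_At TEXT NOT NULL        -- ISO-8601 text, so it sorts chronologically
);
-- The index suggested in the answer, in the same column order.
CREATE INDEX idx_offers_created ON ExecutorsOrderOffers (Created_At, Orders_ID);
INSERT INTO Orders (Id) VALUES (1);
INSERT INTO ExecutorsOrderOffers (Orders_ID, Executor_Id, Status, Created_At) VALUES
    (1, 42, 'accepted', '2024-01-01 10:00:00'),
    (1, 42, 'canceled', '2024-01-01 10:05:00');
""")

def latest_request(order_id):
    """Return (Status, Created_At) of the newest request for an order, or None."""
    return conn.execute("""
        SELECT Status, Created_At
        FROM ExecutorsOrderOffers
        WHERE Orders_ID = ?
        ORDER BY Created_At DESC
        LIMIT 1
    """, (order_id,)).fetchone()

latest = latest_request(1)
print(latest)
```

With the accept-then-cancel history from the question, the newest row is the cancellation, while Orders.status can still be set independently by the business logic.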

Related

Database design: Value(s) per user per day

I'm setting up a system where for every user (1000+), I want to add a set of values every single day.
Hypothetically:
A system where I can log when Alice and Bob woke up and what they had for dinner on August 1st of 2019 or 2024.
Any suggestions on how to best structure the database tables?
A person table with a primary person ID?
rows: n
A date table with a primary date ID?
rows: m
And a personDate table with the person ID and date ID as foreign keys?
rows: n x m

I don't think you need a date table unless you want to use it to make specific queries easier, such as a left join against the dates to see which days are missing events. Nevertheless, I would stick to DATE or DATETIME as the field and avoid making a separate surrogate foreign key. It won't save any space, it will potentially perform worse, and it will be more difficult for the developer to use.
This seems simple and fine to me. I wouldn't worry too much about the performance based upon the number of elements alone. You can insert a billion records with no problem and that implies a very large site.
Just don't insert records if the event didn't happen. In other words you want your database to grow in relation to the real usage. Avoid growth based upon phantom events and you should be okay.
person
    person_id
action
    action_id
personAction
    person_id
    action_id
    action_datetime
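The three tables above could be sketched roughly as follows. The example uses SQLite via Python's sqlite3 module so it runs as-is; the name and detail columns and the sample events are assumptions added for illustration. Rows are inserted only for events that actually happened.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE action (action_id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE personAction (
    person_id INTEGER NOT NULL REFERENCES person(person_id),
    action_id INTEGER NOT NULL REFERENCES action(action_id),
    action_datetime TEXT NOT NULL,  -- stored directly, no separate date table
    detail TEXT                     -- hypothetical free-text column, e.g. the meal
);
INSERT INTO person VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO action VALUES (1, 'woke up'), (2, 'dinner');
-- Bob logged no dinner, so no row exists for it.
INSERT INTO personAction VALUES
    (1, 1, '2019-08-01 07:30:00', NULL),
    (1, 2, '2019-08-01 19:00:00', 'pasta'),
    (2, 1, '2019-08-01 09:15:00', NULL);
""")

# Everything that was logged on one calendar day.
day = conn.execute("""
    SELECT p.name, a.name, pa.action_datetime, pa.detail
    FROM personAction pa
    JOIN person p ON p.person_id = pa.person_id
    JOIN action a ON a.action_id = pa.action_id
    WHERE date(pa.action_datetime) = ?
    ORDER BY p.name, pa.action_datetime
""", ("2019-08-01",)).fetchall()
for row in day:
    print(row)
```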

Insert/Update on table with autoincrement and foreign key

I have a table as such:
id  entity_id  first_year  last_year  sessions_attended  age
1   2020       1996        2008       3                  34.7
2   2024       1993        2005       2                  45.1
3   ...        ...         ...
id is auto-increment primary key, and entity_id is a foreign key that must be unique for the table.
I have a query that calculates first and last year of attendance, and I want to be able to update this table with fresh data each time it is run, only updating the first and last year columns:
This is my insert/update for "first year":
insert into my_table (entity_id, first_year)
( select contact_id, @sd := year(start_date)
  from
  ( select contact_id, event_id, start_date from participations
    join events on participations.event_id = events.id where events.event_type_id = 7
    group by contact_id order by event_id ASC) as starter)
ON DUPLICATE KEY UPDATE first_year_85 = @sd;
I have one similar that does "last year", identical except for the target column and the order by.
The queries alone return the desired values, but I am having issues with the insert/update queries. When I run them, I end up with the same values for both fields (the correct first_year value).
Does anything stand out as the cause for this?
Anecdotal Note: This seems to work on MySQL 5.5.54, but when run on my local MariaDB, it just exhibits the above behavior...
Update:
Not my table design to dictate. This is a CRM that allows custom fields to be defined by end-users, I am populating the data via external queries.
The participations table holds all event registrations for all entity_ids, but the start dates are held in a separate events table, hence the join.
The variable is there because the ON DUPLICATE KEY UPDATE clause will not accept a reference to the column without it.
Age is actually slightly more involved: It is age by the start date of the next active event of a certain type.
Fields are being "hard" updated as the values in this table are being pulled by in-CRM reports and searches, they need to be present, can't be dynamically calculated.

Since you have a 'natural' PK (entity_id), why have the id?
age? Are you going to have to change that column daily, or at least monthly? Not a good design. It would be better to have the constant birth_date in the table, then compute the ages in SELECT.
"calculates first and last year of attendance" -- This implies you have a table that lists all years of attendance (yoa)? If so, MAX(yoa) and MIN(yoa) would probably be a better way to compute things.
One rarely needs @variables in queries.
Munch on my comments; come back for more thoughts after you provide a new query, SHOW CREATE TABLE, EXPLAIN, and some sample data.
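To make the MIN/MAX suggestion concrete, here is a minimal sketch of a single upsert that fills both year columns without a user variable. It is written for SQLite through Python's sqlite3 module (upsert needs SQLite 3.24 or newer), so it uses ON CONFLICT ... DO UPDATE with excluded.column where MySQL would use ON DUPLICATE KEY UPDATE. The table layouts and sample rows are assumptions reconstructed from the question, not the real CRM schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (id INTEGER PRIMARY KEY, event_type_id INTEGER, start_date TEXT);
CREATE TABLE participations (contact_id INTEGER, event_id INTEGER REFERENCES events(id));
CREATE TABLE my_table (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    entity_id INTEGER NOT NULL UNIQUE,
    first_year INTEGER,
    last_year INTEGER
);
INSERT INTO events VALUES (1, 7, '1996-05-01'), (2, 7, '2008-06-01'), (3, 5, '1990-01-01');
INSERT INTO participations VALUES (2020, 1), (2020, 2), (2020, 3);
""")

# One statement computes both aggregates per contact and inserts or updates.
UPSERT = """
INSERT INTO my_table (entity_id, first_year, last_year)
SELECT p.contact_id,
       MIN(CAST(strftime('%Y', e.start_date) AS INTEGER)),
       MAX(CAST(strftime('%Y', e.start_date) AS INTEGER))
FROM participations p
JOIN events e ON e.id = p.event_id
WHERE e.event_type_id = 7
GROUP BY p.contact_id
ON CONFLICT(entity_id) DO UPDATE SET
    first_year = excluded.first_year,
    last_year = excluded.last_year
"""
conn.execute(UPSERT)
conn.execute(UPSERT)  # the second run updates the existing row instead of inserting

rows = conn.execute("SELECT entity_id, first_year, last_year FROM my_table").fetchall()
print(rows)
```

Because the aggregates do not depend on row order, the result does not rely on GROUP BY picking a particular row, which is one plausible reason the original query behaves differently across servers.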

Is it OK to store redundant data in case records from foreign table are deleted

Let's say I have a database table called products which has a list of products, with the primary key product_id
I then have a database table called purchase_order_products which has a list of products assigned to a purchase order, with a foreign key product_id.
Now, if I enforce referential integrity between the two tables, it only requires a single purchase order to reference a product, and it won't be possible to ever delete that particular product from the database (unless the purchase orders for that product are also deleted).
It seems I have a few options:
1) Enforce referential integrity and don't allow the product to ever be deleted.
2) Don't enforce referential integrity, and if anyone ever views a purchase order where the product no longer exists, simply display the product name as "UNKNOWN" or "DELETED".
3) The final option is to not only store the product name in the products table but also store it in the purchase_order_products table alongside the foreign key. Obviously this is redundant data, but it would allow the product to be deleted from the products table, whilst still allowing users to see the names of now non-existent products that were part of purchase orders in the past.
I'm swaying towards option #3 but wondered what is the "correct" way of handling this.

You can enforce referential integrity and use ON DELETE SET NULL, then display "UNKNOWN" or "DELETED" when a purchase order's product_id is null. Thus, options 1 and 2 aren't mutually exclusive.
Option 3 is valid. Having two copies of product_name isn't redundant if the relations they're used in express different predicates. "Product <x>'s current name is <y>" is different from "When purchase_order <z> was created, product <x>'s name was <y>". It's a common technique to record current and historical prices separately; the same can be done for names or any other attributes of a product.
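A minimal sketch of combining ON DELETE SET NULL with a name snapshot follows. It uses SQLite through Python's sqlite3 module, where foreign keys must be switched on per connection; the purchase_order_id and product_name_at_purchase columns and the sample data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    product_name TEXT NOT NULL               -- the product's current name
);
CREATE TABLE purchase_order_products (
    purchase_order_id INTEGER NOT NULL,
    product_id INTEGER REFERENCES products(product_id) ON DELETE SET NULL,
    product_name_at_purchase TEXT NOT NULL   -- hypothetical snapshot column
);
INSERT INTO products VALUES (1, 'Widget');
INSERT INTO purchase_order_products VALUES (500, 1, 'Widget');
DELETE FROM products WHERE product_id = 1;
""")

line = conn.execute("""
    SELECT product_id, product_name_at_purchase
    FROM purchase_order_products
    WHERE purchase_order_id = 500
""").fetchone()
print(line)
```

After the delete, the order line keeps the name it was created with while its product_id becomes NULL, which the application can display as a deleted product.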

There is no reason to duplicate data. A simple solution is to implement a soft delete on the products. The best way is to have a date field called something appropriate like Deleted and set it to a date far in the future, like 12/31/9999, for current products. To delete a product, just set the Deleted value to the date the product is deleted. This way, to list currently available products, filter out the products where Deleted is in the past.
When showing purchase orders, ignore the Deleted value so it shows all products, even the ones no longer available. Optionally, you could show by some indicator if a product is one that is no longer available.
You might also want to create a view that ignores deleted products for those times when it would not be appropriate to show them, such as when creating new purchase orders.
You would also want to write a delete trigger on the products table to convert the delete process to just change the value in the Deleted field. You would also want to have a function in the API to allow a product to be "deleted" as of a certain date. Maybe the product was removed a month ago but the database was not updated. Or the product is slated to be removed at a future date so go ahead and set the date. The product will simply disappear from the current products view when that date is reached.
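The soft-delete approach might be sketched as follows, again on SQLite via Python's sqlite3 module. The view name current_products, the soft_delete helper, and the sample rows are assumptions; the far-future default is stored as ISO text ('9999-12-31') so that plain string comparison orders dates correctly. The delete trigger mentioned above is left out of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    product_name TEXT NOT NULL,
    Deleted TEXT NOT NULL DEFAULT '9999-12-31'  -- far future means still available
);
CREATE TABLE purchase_order_products (
    purchase_order_id INTEGER NOT NULL,
    product_id INTEGER NOT NULL REFERENCES products(product_id)
);
-- Products whose deletion date has not been reached yet.
CREATE VIEW current_products AS
    SELECT product_id, product_name FROM products WHERE Deleted > date('now');
INSERT INTO products (product_id, product_name) VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO purchase_order_products VALUES (500, 1);
""")

def soft_delete(product_id, as_of):
    """Mark a product as deleted as of the given ISO date (past or future)."""
    conn.execute("UPDATE products SET Deleted = ? WHERE product_id = ?",
                 (as_of, product_id))

soft_delete(1, "2020-01-01")

current = conn.execute(
    "SELECT product_name FROM current_products ORDER BY product_id").fetchall()
# Purchase orders join the base table and ignore Deleted entirely.
history = conn.execute("""
    SELECT pop.purchase_order_id, p.product_name
    FROM purchase_order_products pop
    JOIN products p ON p.product_id = pop.product_id
""").fetchall()
print(current, history)
```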

Multiple vote options storing in MySQL table

I have a poll which has an undefined number of options (it can have only 2 options, but it can also have 10 or 20 or more options to choose from). I need to store the current vote count in MySQL table. I can't think of a centralized way of storing them except:
Create a field vote_count and store a serialized array of voting options mapped to counts.
When new vote data comes in this field is read, unserialized, appropriate values are incremented, then field is written to. This needs 2 queries and there might be 5 or more votes incoming per second.
So I need a way to store voting counts for an unknown number of voting options and be able to quickly access it (I need up to date counts for every option displayed on the voting page) and quickly update it (when new votes come in). It has to be within MySQL table. There is no "upper" limit for the number of voting options.

The normative pattern for handling multi-valued attributes, or repeating values, is to add a second table.
Consider a purchase order that can have more than one line item on it. We represent the line items in a child table, with a foreign key to the parent in the purchase order table:
CREATE TABLE `purchase_order` (id int not null, foo varchar(200), ... );
CREATE TABLE `line_item` (id int not null, order_id int not null, ... );
ALTER TABLE `line_item` ADD FOREIGN KEY (order_id) REFERENCES purchase_order(id);
INSERT INTO purchase_order (id, foo) VALUES (101, 'bar');
INSERT INTO line_item (id, order_id) VALUES (783, 101);
INSERT INTO line_item (id, order_id) VALUES (784, 101);
INSERT INTO line_item (id, order_id) VALUES (785, 101);
We can get a count of the line items associated with a purchase order, like this:
SELECT COUNT(1)
FROM line_item
WHERE order_id = 101;
Or, we can get a count of line items for every purchase order, like this:
SELECT o.id, COUNT(l.id) AS count_line_items
FROM purchase_order o
LEFT JOIN line_item l
ON l.order_id = o.id
GROUP BY o.id;
In your case, what are the entities you need to represent? (An entity is a person, place, thing, concept or event which can be uniquely identified and which you need to store information about.)
I'm having difficulty conceptualizing which entities you need to represent.
poll -
poll_question - a single question on a given poll
poll_question_answer - a possible answer to a question to a given poll question
voter -
ballot - associated with one voter and one poll (?)
vote - the answer given to a particular poll question
Good database design comes from an understanding of the entities and the relationships, and developing a suitable model.

Can't you just have one table of questions, and another table of possible answers (multiple rows per question, as many as you want). Then either store the counts on the table of answers, or (better) have another table of actual entered answers (this way you can log the details of the person doing the answers, and easily use SUM / COUNT to work out how many votes each option has).
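One way to read the two answers above is the following sketch: one row per poll option, one row per vote cast, and the counts derived with COUNT. It runs on SQLite through Python's sqlite3 module; the table names poll_option and vote, the voter column, and the helper functions are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE poll_option (
    id INTEGER PRIMARY KEY,
    poll_id INTEGER NOT NULL,
    label TEXT NOT NULL
);
CREATE TABLE vote (
    id INTEGER PRIMARY KEY,
    option_id INTEGER NOT NULL REFERENCES poll_option(id),
    voter TEXT                      -- hypothetical: who cast the vote
);
CREATE INDEX idx_vote_option ON vote (option_id);
-- A poll can have any number of options; each is just another row.
INSERT INTO poll_option (id, poll_id, label) VALUES
    (1, 10, 'Yes'), (2, 10, 'No'), (3, 10, 'Maybe');
""")

def cast_vote(option_id, voter):
    """Record a vote with a single INSERT; no read-modify-write cycle is needed."""
    conn.execute("INSERT INTO vote (option_id, voter) VALUES (?, ?)",
                 (option_id, voter))

def tally(poll_id):
    """Return (label, count) for every option, including options with no votes."""
    return conn.execute("""
        SELECT o.label, COUNT(v.id)
        FROM poll_option o
        LEFT JOIN vote v ON v.option_id = o.id
        WHERE o.poll_id = ?
        GROUP BY o.id
        ORDER BY o.id
    """, (poll_id,)).fetchall()

cast_vote(1, "alice")
cast_vote(1, "bob")
cast_vote(2, "carol")
result = tally(10)
print(result)
```

If counting rows ever becomes too slow, a vote_count column on the option table, incremented with a single UPDATE ... SET vote_count = vote_count + 1, is the other variant mentioned above.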

MySQL - Table Implementation

I had to implement the following into my database:
The activities that users engage in. Each activity can have a name with up to 80 characters, and only distinct activities should be stored. That is, if two different users like “Swimming”, then the activity “Swimming” should only be stored once as a string.
Which activities each individual user engages in. Note that a user can have more than one hobby!
So I have to implement tables for this purpose and I must also make any modifications to existing tables if and as required and implement any keys and foreign key relationships needed.
All this must be stored with minimal amount of storage, i.e., you must choose the appropriate data types from the MySQL manual. You may assume that new activities will be added frequently, that activities will almost never be removed, and that the total number of distinct activities may reach 100,000.
So I already have a 'User' table with 'user_id' as my primary key.
MY SOLUTION TO THIS:
Create a table called 'Activities' and have 'activity_id' as PK (mediumint(5) ) and 'activity' as storing hobbies (varchar(80)) then I can create another table called 'Link' and use the 'user_id' FK from user table and the 'activity_id' FK from the 'Activities' table to show user with the activities that they like to do.
Is my approach to this question right? Is there another way I can do this to make it more efficient?
How would I show if one user pursues more than one activity in the foreign key table 'Link'?

Your idea is correct, and as far as I know it is the only way. It's called a many-to-many relationship.
Just to reiterate, what you're proposing is a user table with a userid, and an activity table with an activityid.
To form the relationship you'll have a third table, which for performance's sake doesn't require a primary key; however, you should index both columns (userid and activityid).
In your logic, when someone enters an activity name, look it up in the activity table. If it does not exist, add it to the table, get back the new activityid, and then add an entry to the user_activity table linking the activityid to the userid.
If it already exists just add an entry linking that activity id to the userid.
So your approach is right, the final question just indicates you should google for 'many to many' relationships for some more info if needed.
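The flow described above can be sketched as follows, using SQLite through Python's sqlite3 module. The UNIQUE constraint on the activity name, the index names, and the add_activity helper are assumptions; the helper uses INSERT OR IGNORE followed by a lookup instead of reading the whole activity table, which is a swapped-in technique rather than the exact steps in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (user_id INTEGER PRIMARY KEY);
CREATE TABLE activity (
    activity_id INTEGER PRIMARY KEY,
    name VARCHAR(80) NOT NULL UNIQUE   -- each distinct activity is stored once
);
CREATE TABLE user_activity (
    user_id INTEGER NOT NULL REFERENCES user(user_id),
    activity_id INTEGER NOT NULL REFERENCES activity(activity_id)
);
-- Index both columns; the unique index also blocks duplicate links.
CREATE UNIQUE INDEX idx_ua_user ON user_activity (user_id, activity_id);
CREATE INDEX idx_ua_activity ON user_activity (activity_id);
INSERT INTO user (user_id) VALUES (1), (2);
""")

def add_activity(user_id, name):
    """Link a user to an activity, creating the activity row only if it is new."""
    conn.execute("INSERT OR IGNORE INTO activity (name) VALUES (?)", (name,))
    (activity_id,) = conn.execute(
        "SELECT activity_id FROM activity WHERE name = ?", (name,)).fetchone()
    conn.execute(
        "INSERT OR IGNORE INTO user_activity (user_id, activity_id) VALUES (?, ?)",
        (user_id, activity_id))

add_activity(1, "Swimming")
add_activity(2, "Swimming")   # reuses the existing 'Swimming' row
add_activity(1, "Chess")      # user 1 now has two activities

stored = conn.execute("SELECT COUNT(*) FROM activity").fetchone()[0]
user1 = [row[0] for row in conn.execute("""
    SELECT a.name
    FROM user_activity ua
    JOIN activity a ON a.activity_id = ua.activity_id
    WHERE ua.user_id = ?
    ORDER BY a.name
""", (1,))]
print(stored, user1)
```

A user with several activities simply has several rows in user_activity, one per activity.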
