MySQL best way to store data

I'm currently doing a school project.
I have a simple database that stores transaction information.
There is one table named "transactions" with columns:
id (key, auto inc.)
itemID (int, about 500 unique id's)
value (int, value of transaction)
dateTime (dateTime, in which entry was added)
At the moment it is all dumped into one table. Would it be better to have a table for every itemID and store all the transactions for that particular itemID there, or is that not good practice?

In terms of scalability, you are doing it right. With one table per item, every new item that comes into play would require creating a new table. With your current design you just insert the new item into an items table and insert its transactions into the transactions table, referencing the item through the foreign key.
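A minimal sketch of the single-table design, assuming an `items` table with a hypothetical `name` column and using Python's built-in SQLite as a stand-in for MySQL (the SQL is close to what MySQL accepts, but types and auto-increment syntax differ):

```python
import sqlite3

# SQLite in memory stands in for MySQL; the items table and its "name"
# column are assumptions added for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE items (
    itemID INTEGER PRIMARY KEY,
    name   TEXT NOT NULL
);
CREATE TABLE transactions (
    id       INTEGER PRIMARY KEY AUTOINCREMENT,
    itemID   INTEGER NOT NULL REFERENCES items(itemID),
    value    INTEGER NOT NULL,
    dateTime TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- An index on itemID keeps per-item lookups fast as the table grows.
CREATE INDEX idx_transactions_item ON transactions(itemID);
""")

with conn:
    # Adding a new item is one row, not a new table.
    conn.execute("INSERT INTO items (itemID, name) VALUES (?, ?)", (1, "widget"))
    conn.execute("INSERT INTO items (itemID, name) VALUES (?, ?)", (2, "gadget"))
    conn.executemany(
        "INSERT INTO transactions (itemID, value) VALUES (?, ?)",
        [(1, 10), (1, 25), (2, 7)],
    )

# All transactions for one item come from the shared table via a WHERE clause.
item_1_values = [
    row[0]
    for row in conn.execute(
        "SELECT value FROM transactions WHERE itemID = ? ORDER BY id", (1,)
    )
]
print(item_1_values)
```

With about 500 items, the index on itemID is what makes the "one big table" approach behave like a table per item for lookups.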

Related

Insert value to Star schema fact table

I got stuck while designing a star schema. Here is my problem:
I have several dimension tables already designed,
with customer table (customer_id, name, address) (200 rows)
inventory table (film, category,inv_id) (400 rows)
store table (store_id, store) (4 rows)
and sale table (sale_date,sale_id) (16500 rows)
I'm trying to insert values into the fact table (newly created, empty) payment (FK customer_id, FK inv_id, FK store_id, FK sale_id, payment_amount).
I have 15650 payment records. How can I insert these values into the fact table?
When I use
insert into payment_amount
select amount
from original
it raises an error: NOT NULL violation for the foreign keys.
What should I do to get these values into the fact table?
I know I have a conceptual error here, and I hope you can clarify it for me.
The error may be thrown because these values do not exist in the parent dimension tables (customer, inventory, store, sale), or because your insert supplies only the amount, so the remaining fact table columns with a NOT NULL constraint receive no value.
Your design is missing the basic relationships between the tables. For example, how is a customer related to a sale, or a sale to a store?
You could have relationships such as store -> customer, sale and inventory. These relations can be one-to-many, one-to-one or many-to-one, so that you can identify a unique sale to a customer from a particular store for a particular inventory item.
You need to design the data flow in this way:
1. First, a staging table, which may be non-persistent and serves as the landing area for everything ingested from the different sources
2. Intermediate tables that contain the transformed data from the staging tables
3. Create the dimension table from the staging and intermediate tables
4. Create the Fact table from dimension and intermediate tables
As a best practice, load the records in the order below so that there are no key conflicts:
Staging -> Intermediate -> Dimension -> Facts
In an ideal scenario you do not need to specify the keys explicitly; instead, have a cleaning job on top of the load.
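As a rough sketch of the last step (dimension -> fact), the fact table can be filled with an INSERT ... SELECT that supplies every foreign key together with the amount. The question does not show the columns of `original`, so this assumes it carries `customer_id`, `inv_id`, `store_id`, `sale_id` and `amount`. SQLite is used here in place of the asker's database, and the sample rows are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer  (customer_id INTEGER PRIMARY KEY, name TEXT, address TEXT);
CREATE TABLE inventory (inv_id INTEGER PRIMARY KEY, film TEXT, category TEXT);
CREATE TABLE store     (store_id INTEGER PRIMARY KEY, store TEXT);
CREATE TABLE sale      (sale_id INTEGER PRIMARY KEY, sale_date TEXT);
-- Assumed shape of the source table: one row per payment with its keys.
CREATE TABLE original  (customer_id INTEGER, inv_id INTEGER, store_id INTEGER,
                        sale_id INTEGER, amount REAL);
CREATE TABLE payment (
    customer_id    INTEGER NOT NULL REFERENCES customer(customer_id),
    inv_id         INTEGER NOT NULL REFERENCES inventory(inv_id),
    store_id       INTEGER NOT NULL REFERENCES store(store_id),
    sale_id        INTEGER NOT NULL REFERENCES sale(sale_id),
    payment_amount REAL NOT NULL
);
INSERT INTO customer  VALUES (1, 'Ann', '1 Main St');
INSERT INTO inventory VALUES (10, 'Some Film', 'Drama');
INSERT INTO store     VALUES (100, 'Downtown');
INSERT INTO sale      VALUES (1000, '2020-01-01');
-- The second row points at a customer that is not in the dimension table.
INSERT INTO original  VALUES (1, 10, 100, 1000, 4.99), (2, 10, 100, 1000, 2.99);
""")

with conn:
    # Name the target table and every NOT NULL column; the joins keep only
    # rows whose keys already exist in the dimension tables.
    conn.execute("""
        INSERT INTO payment (customer_id, inv_id, store_id, sale_id, payment_amount)
        SELECT o.customer_id, o.inv_id, o.store_id, o.sale_id, o.amount
        FROM original AS o
        JOIN customer  AS c  ON c.customer_id = o.customer_id
        JOIN inventory AS i  ON i.inv_id      = o.inv_id
        JOIN store     AS s  ON s.store_id    = o.store_id
        JOIN sale      AS sa ON sa.sale_id    = o.sale_id
    """)

payment_rows = conn.execute("SELECT * FROM payment").fetchall()
print(payment_rows)
```

Rows that fail the joins (here the one for customer 2) are the ones a cleaning job would need to look at before they can be loaded.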

Advice needed on database design

I am new to database design. In my case I have to generate a large number of keys per user per product, so I have two options:
Create one table with product_id and key for all the users, or
Create a separate table for each user
In the former case I will have a single table, but querying might take more time because the entries for all users are in the same table.
In the latter case queries might return results faster, but there are more tables, and if the number of users passes 100 or more that means a lot of tables.
Definitely do not create a table for each user. If you create a single table for all users, you can use relational database design: add information specific to each user, such as address or employee information, in other tables and use the primary key from the users table as a foreign key. There will not be any noticeable lag, and maintenance will be a whole lot easier.
If you want to build a relation between your users and products, make a table like the one below:
user_product [table name]
id [Primary Key]
user_id [Reference key of user table]
product_id [Reference key of product table]
key
This is the table schema you should use.
If you create a table per user, the database and its relationships become more complex to manage. So just use the row-based format above.
If that is helpful, let me know.
Thanks
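A small sketch of that user_product layout, again using SQLite in place of MySQL. The key column is named `product_key` here as an assumption, because KEY is a reserved word in MySQL and would otherwise need backtick quoting. The `users` and `products` tables and the sample data are also hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE users    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE user_product (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id     INTEGER NOT NULL REFERENCES users(id),
    product_id  INTEGER NOT NULL REFERENCES products(id),
    product_key TEXT NOT NULL
);
-- The index lets one user's keys be found without scanning other users' rows.
CREATE INDEX idx_user_product ON user_product(user_id, product_id);
INSERT INTO users    VALUES (1, 'alice'), (2, 'bob');
INSERT INTO products VALUES (10, 'editor');
""")

with conn:
    conn.executemany(
        "INSERT INTO user_product (user_id, product_id, product_key) VALUES (?, ?, ?)",
        [(1, 10, "K-1"), (1, 10, "K-2"), (2, 10, "K-3")],
    )


def keys_for(conn, user_id, product_id):
    """Return all keys generated for one user and one product."""
    rows = conn.execute(
        "SELECT product_key FROM user_product "
        "WHERE user_id = ? AND product_id = ? ORDER BY id",
        (user_id, product_id),
    )
    return [row[0] for row in rows]


alice_keys = keys_for(conn, 1, 10)
print(alice_keys)
```

The index on (user_id, product_id) addresses the worry about query time in a shared table: lookups for one user do not depend on how many other users exist.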

Database design: auto-increment key & update inconsistencies

Two tables share a unique identifier 'id'. Both tables are meant to be joined by using 'id'.
Defining 'id' as an auto incrementing primary key in both tables may risk update inconsistencies.
Is there some general pattern to avoid such a situation or do I have to deal with updating table1 first and table2 by utilizing the last inserted id after (therefore not declaring id as auto inc in table2)?
First, if you use the InnoDB storage engine in MySQL, you can use both transactions and foreign keys for data consistency.
Second, after the insert into the first table, you can get the last insert id (how depends on the way you access the db) and use it as the foreign key.
E.g.
Table 1: Users: user_id, username
Table 2: User_Profiles: user_id, name, phone
In User_Profiles you don't need to define user_id as auto increment. Instead, first insert a record into the Users table and use its user_id for the User_Profiles record. If you do this in a transaction, the Users record won't be visible outside the connection running the transaction until it is committed. This guarantees that even if something bad happens after you insert the user but before you insert the profile, there won't be messed-up data.
You could also define the user_id column in the User_Profiles table as a foreign key to the Users table with ON DELETE CASCADE, so that if someone deletes a record from the Users table, the database automatically deletes the one in User_Profiles. Without the cascade option, the default behaviour is to reject the delete while a profile still references the user. There are many other options; read more about them.
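The pattern from this answer can be sketched as follows. This uses Python's sqlite3 module rather than MySQL, so `cursor.lastrowid` plays the role of MySQL's LAST_INSERT_ID(); the function name `create_user` and the sample values are made up for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Users (
    user_id  INTEGER PRIMARY KEY AUTOINCREMENT,
    username TEXT NOT NULL UNIQUE
);
-- user_id is NOT auto-increment here; it is copied from Users.
CREATE TABLE User_Profiles (
    user_id INTEGER PRIMARY KEY
            REFERENCES Users(user_id) ON DELETE CASCADE,
    name    TEXT NOT NULL,
    phone   TEXT
);
""")


def create_user(conn, username, name, phone):
    """Insert the user and its profile atomically; return the new user_id."""
    with conn:  # commits on success, rolls back if any statement raises
        cur = conn.execute("INSERT INTO Users (username) VALUES (?)", (username,))
        user_id = cur.lastrowid  # id generated by the insert just executed
        conn.execute(
            "INSERT INTO User_Profiles (user_id, name, phone) VALUES (?, ?, ?)",
            (user_id, name, phone),
        )
    return user_id


alice_id = create_user(conn, "alice", "Alice", "555-0100")

# The profile insert fails (name is NOT NULL), so the Users row is rolled back too.
try:
    create_user(conn, "bob", None, None)
except sqlite3.IntegrityError:
    pass
user_count_after_failure = conn.execute("SELECT COUNT(*) FROM Users").fetchone()[0]

# Deleting the user removes the profile through ON DELETE CASCADE.
with conn:
    conn.execute("DELETE FROM Users WHERE user_id = ?", (alice_id,))
profile_count_after_delete = conn.execute(
    "SELECT COUNT(*) FROM User_Profiles"
).fetchone()[0]
print(user_count_after_failure, profile_count_after_delete)
```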
There is no problem with having the same column name 'id' in any number of tables.
Several persistence layer frameworks do it the same way.
Just use aliases in your SQL to distinguish your tables.
"do I have to deal with updating table1 first and table2 by utilizing the last inserted id after (therefore not declaring id as auto inc in table2)?"
Yes. And make id a foreign key so it can only exist in table2 if it already exists in table1.
Yes you do, and remember to wrap the operation in a transaction.

MySQL - Table Implementation

I had to implement the following into my database:
The activities that users engage in. Each activity can have a name with up to 80 characters, and only distinct activities should be stored. That is, if two different users like “Swimming”, then the activity “Swimming” should only be stored once as a string.
Which activities each individual user engages in. Note that a user can have more than one hobby!
So I have to implement tables for this purpose and I must also make any modifications to existing tables if and as required and implement any keys and foreign key relationships needed.
All this must be stored with minimal amount of storage, i.e., you must choose the appropriate data types from the MySQL manual. You may assume that new activities will be added frequently, that activities will almost never be removed, and that the total number of distinct activities may reach 100,000.
So I already have a 'User' table with 'user_id' as my primary key.
MY SOLUTION TO THIS:
Create a table called 'Activities' and have 'activity_id' as PK (mediumint(5) ) and 'activity' as storing hobbies (varchar(80)) then I can create another table called 'Link' and use the 'user_id' FK from user table and the 'activity_id' FK from the 'Activities' table to show user with the activities that they like to do.
Is my approach to this question right? Is there another way I can do this to make it more efficient?
How would I show if one user pursues more than one activity in the foreign key table 'Link'?
Your idea is the correct, and possibly the only, way. It's called a many-to-many relationship.
Just to reiterate, what you're proposing is a user table with a userid, and an activity table with an activityid.
To form the relationship you'll have a 3rd table, which for performance's sake doesn't need an id column of its own; however, you should index both columns (userid and activityid).
In your logic, when someone enters an activity name, query the activity table to check whether that name already exists. If it does not, add it to the table, get back the new activityid, and then add an entry to the user_activity table linking the activityid to the userid.
If it already exists, just add an entry linking that activityid to the userid.
So your approach is right. As for the final question, a user with more than one activity simply has one row per activity in the link table; search for 'many-to-many relationships' if you need more information.
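Here is one way the three tables and the "look up, else insert" logic could look, sketched with SQLite instead of MySQL. The composite primary key on `Link` is one common way to index the pair and prevent duplicate links; it is a choice made for this sketch, not something the answer prescribes. `INSERT OR IGNORE` is SQLite syntax (MySQL spells it `INSERT IGNORE`), and the helper names are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE User (user_id INTEGER PRIMARY KEY);
CREATE TABLE Activities (
    activity_id INTEGER PRIMARY KEY AUTOINCREMENT,
    activity    VARCHAR(80) NOT NULL UNIQUE  -- each name is stored only once
);
CREATE TABLE Link (
    user_id     INTEGER NOT NULL REFERENCES User(user_id),
    activity_id INTEGER NOT NULL REFERENCES Activities(activity_id),
    PRIMARY KEY (user_id, activity_id)       -- one row per (user, activity) pair
);
CREATE INDEX idx_link_activity ON Link(activity_id);
INSERT INTO User VALUES (1), (2);
""")


def get_or_create_activity(conn, name):
    """Return the id for an activity name, inserting the name if it is new."""
    row = conn.execute(
        "SELECT activity_id FROM Activities WHERE activity = ?", (name,)
    ).fetchone()
    if row:
        return row[0]
    cur = conn.execute("INSERT INTO Activities (activity) VALUES (?)", (name,))
    return cur.lastrowid


def add_user_activity(conn, user_id, name):
    """Link a user to an activity, creating the activity on first use."""
    with conn:
        activity_id = get_or_create_activity(conn, name)
        conn.execute(
            "INSERT OR IGNORE INTO Link (user_id, activity_id) VALUES (?, ?)",
            (user_id, activity_id),
        )


add_user_activity(conn, 1, "Swimming")
add_user_activity(conn, 2, "Swimming")  # reuses the existing Activities row
add_user_activity(conn, 1, "Chess")     # user 1 now has two rows in Link

activity_count = conn.execute("SELECT COUNT(*) FROM Activities").fetchone()[0]
user_1_activities = [
    row[0]
    for row in conn.execute(
        "SELECT a.activity FROM Link AS l "
        "JOIN Activities AS a ON a.activity_id = l.activity_id "
        "WHERE l.user_id = ? ORDER BY a.activity",
        (1,),
    )
]
print(activity_count, user_1_activities)
```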

MySQL Insert Race Condition

I have a webapp that currently stores all of a user's searches in a search_log table. I now want to create another table called results_log that stores all the results we supply to the user. The search_log table contains a primary key called id_search, and the results_log table has the foreign key id_search and one other field, id_result. The id_search field is an auto-incrementing field in both tables.
In my web app I would do the inserts in this sequential order:
insert into search_log table
insert into results_log table
I am worried this may cause a race condition. If user A and user B both finish the webapp and reach this part of the code at about the same time, is it possible that the order would go:
User A -> Insert into search_log
User B -> Insert into search_log
User B -> Insert into results_log
User A -> Insert into results_log
Since both tables are auto-incrementing on the id_search field, I'm worried User A and User B will have their data swapped. I also thought about querying for the id_search, but that seems like an even worse solution.
My question is:
-Is there a way to fix this race condition?
-Would one solution be inserting into two tables with one SQL query? Is this possible?
If those tables are related, you should pass the auto-increment ID along explicitly when inserting. After inserting into search_log, get the last insert ID (no lookup needed), then include it as the id_search field in the results_log insert.
Never rely on auto increment IDs being the same in different tables.
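A short sketch of that advice, with the interleaving from the question replayed by hand. It uses SQLite through Python, where `cursor.lastrowid` returns the id generated by that cursor's own insert; in MySQL the equivalent LAST_INSERT_ID() is tracked per connection, so another client's insert does not change the value you read. The `query` and `result` columns and the function names are assumptions for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE search_log (
    id_search INTEGER PRIMARY KEY AUTOINCREMENT,
    query     TEXT NOT NULL
);
-- id_search is a plain foreign key here, not auto-increment.
CREATE TABLE results_log (
    id_result INTEGER PRIMARY KEY AUTOINCREMENT,
    id_search INTEGER NOT NULL REFERENCES search_log(id_search),
    result    TEXT NOT NULL
);
""")


def log_search(conn, query):
    """Insert a search and return the id generated for this insert."""
    cur = conn.execute("INSERT INTO search_log (query) VALUES (?)", (query,))
    return cur.lastrowid


def log_results(conn, id_search, results):
    """Store results under the explicit id of the search they belong to."""
    conn.executemany(
        "INSERT INTO results_log (id_search, result) VALUES (?, ?)",
        [(id_search, r) for r in results],
    )


# Same order as in the question: A, B, then B's results before A's.
search_a = log_search(conn, "cats")
search_b = log_search(conn, "dogs")
log_results(conn, search_b, ["dog1", "dog2"])
log_results(conn, search_a, ["cat1"])
conn.commit()

rows = conn.execute(
    "SELECT s.query, r.result FROM search_log AS s "
    "JOIN results_log AS r ON r.id_search = s.id_search "
    "ORDER BY r.id_result"
).fetchall()
print(rows)
```

Because each results row carries the id_search it was given, the order in which the two users reach the second insert no longer matters.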