Suppose we have one SQL table, customers.
Now suppose we have a table with two columns, customer_name and orders_name. One customer may have multiple orders (a one-to-many relationship), so we choose customer_name as the foreign key in this table. But if one customer has 100 orders, we have to write the same customer_name 100 times, which is a waste of memory.
customer_name,customer_orders table is
So I was thinking: can't we just make a table named after the customer? For example, for a customer_name of bill we could create a table named bill's orders and write all of his orders in it. Now we are not using any foreign key.
bill's orders table is
We can create more such tables for other users. But how would it be possible to delete the table when we delete that customer_name from the main table? Any idea?
You solve the issue of wasted space by using surrogate keys. Instead of copying a huge alphanumeric field (names) to child tables, you would create an ID of sorts using a more compact data type (byteint, smallint, int, etc.). In the approach you propose where you create a separate table for each customer, you will run into the following issues:
cannot run aggregates across customers, i.e., you cannot simply do a sum, avg, min, etc. for sets of customers slicing the data different ways
SQL will be far more complex with each extra customer added to the queries
your data dictionary is going to grow huge and at some point you will incur major performance issues that are not easy to fix
The point of using a relational database is to allow for users to dynamically slice and dice the data. The method that you are proposing would not be useful for querying.
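To make the surrogate-key idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table layout, the column names (customer_id, order_id, order_name) and the sample rows are my assumptions, not taken from the question. It shows one orders table for all customers, an aggregate across customers in a single query, and a cascading delete that removes a customer's orders together with the customer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE customers (
        customer_id   INTEGER PRIMARY KEY,   -- compact surrogate key (assumed name)
        customer_name TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL
                    REFERENCES customers (customer_id) ON DELETE CASCADE,
        order_name  TEXT NOT NULL
    );
""")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "bill"), (2, "anna")])
conn.executemany(
    "INSERT INTO orders (customer_id, order_name) VALUES (?, ?)",
    [(1, "book"), (1, "lamp"), (1, "desk"), (2, "pen")],
)

# One query aggregates across all customers; no per-customer tables needed.
orders_per_customer = conn.execute("""
    SELECT c.customer_name, COUNT(*)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.customer_name
    ORDER BY c.customer_id
""").fetchall()

# Deleting a customer removes that customer's orders through the cascade.
conn.execute("DELETE FROM customers WHERE customer_id = 1")
remaining_orders = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The name is stored once per customer, and each order row repeats only a small integer.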
I'm stuck designing a star schema. Here is my problem.
I have several dimension tables already designed:
with customer table (customer_id, name, address) (200 rows)
inventory table (film, category,inv_id) (400 rows)
store table (store_id, store) (4 rows)
and sale table (sale_date,sale_id) (16500 rows)
I'm trying to insert values into the fact table (newly created, empty): payment (FK customer_id, FK inv_id, FK store_id, FK sale_id, payment_amount)
I have 15650 payment records. How could I insert these values into fact table?
When I use
insert into payment_amount
select amount
from original
It raises an error: a NOT NULL violation for the foreign keys.
What should I do to get these values into the fact table?
I know I have a conceptual error here; I hope you can give me a good clarification.
The error may be thrown because these values do not exist in the parent dimension tables (customer, inventory, store, sale), or because the insert loads only the amount into the fact table and leaves the other columns, which carry a NOT NULL constraint, without values.
Your design misses the basic relationships between the tables. For example, how is a customer related to a sale, or how is a sale related to a store?
For example, you can relate the store table to the customer, sale and inventory tables. These relations can be one-to-many, one-to-one or many-to-one, so that you can identify a unique sale to a customer from a particular store for a particular inventory item.
You need to design the data flow as follows:
1. First, have a staging table, which may be non-persistent and serves as the landing area for everything ingested from the different sources
2. Have some intermediate tables which contain the transformed data from the staging tables
3. Create the dimension tables from the staging and intermediate tables
4. Create the fact table from the dimension and intermediate tables
As a best practice, load the records in the order below so that there is no conflict of keys:
Staging -> Intermediate -> Dimension -> Facts
In an ideal scenario you need not mention the keys explicitly; instead, have a cleaning job on top of the load.
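As a rough illustration of the dimension-to-fact step, here is a sketch using Python's built-in sqlite3 module. The column layout of the original table (name, film, store, sale_date, amount) is purely an assumption on my part, as is the idea that each of those values identifies exactly one dimension row. The point is that every fact row gets its foreign keys by joining the source rows to the dimensions in the same INSERT ... SELECT, instead of inserting the amount alone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer  (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE inventory (inv_id INTEGER PRIMARY KEY, film TEXT);
    CREATE TABLE store     (store_id INTEGER PRIMARY KEY, store TEXT);
    CREATE TABLE sale      (sale_id INTEGER PRIMARY KEY, sale_date TEXT);
    CREATE TABLE payment (
        customer_id    INTEGER NOT NULL REFERENCES customer (customer_id),
        inv_id         INTEGER NOT NULL REFERENCES inventory (inv_id),
        store_id       INTEGER NOT NULL REFERENCES store (store_id),
        sale_id        INTEGER NOT NULL REFERENCES sale (sale_id),
        payment_amount NUMERIC NOT NULL
    );
    -- Hypothetical layout of the source table: natural values plus the amount.
    CREATE TABLE original (name TEXT, film TEXT, store TEXT, sale_date TEXT, amount NUMERIC);

    INSERT INTO customer  VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO inventory VALUES (10, 'Alien'), (11, 'Brazil');
    INSERT INTO store     VALUES (100, 'North');
    INSERT INTO sale      VALUES (1000, '2020-01-01'), (1001, '2020-01-02');
    INSERT INTO original  VALUES
        ('Ann', 'Alien',  'North', '2020-01-01', 4.99),
        ('Bob', 'Brazil', 'North', '2020-01-02', 2.99);
""")

# Look up every foreign key while copying the amount, so no key column is NULL.
conn.execute("""
    INSERT INTO payment (customer_id, inv_id, store_id, sale_id, payment_amount)
    SELECT c.customer_id, i.inv_id, s.store_id, sa.sale_id, o.amount
    FROM original AS o
    JOIN customer  AS c  ON c.name = o.name
    JOIN inventory AS i  ON i.film = o.film
    JOIN store     AS s  ON s.store = o.store
    JOIN sale      AS sa ON sa.sale_date = o.sale_date
""")
fact_rows = conn.execute(
    "SELECT customer_id, inv_id, store_id, sale_id, payment_amount "
    "FROM payment ORDER BY sale_id"
).fetchall()
```

Source rows that match no dimension row are silently dropped by the inner joins, so comparing the row counts of original and payment after the load is a cheap sanity check.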
I am converting a spreadsheet to a database, but how do I accommodate multiple values for a field?
This is a database tracking orders with factories.
Import PO# is the unique key. Sometimes one order will have 0, 1, 2, 3, 4 or more customers requiring that we place their price tickets on the product in the factory. Every order is different. What's the proper way to accommodate multiple values in one field?
Generally, having multiple values in a field is bad database design. Maybe a one to many relationship will work in this scenario.
So you will have an Order table with PO# as the primary key,
Then you will have an OrderDetails table with the PO# as a foreign key, i.e. it will not be designated as a primary key.
For each row in the Order table you will have a unique PO# that will not repeat across rows.
In the OrderDetails table you will have a customer per row and because the PO# is not a primary key, it can repeat across rows. This will allow you to designate multiple customers per order. Therefore each row will have its own PriceTicketsOrdered field so you can know per customer what the price is.
Note that each customer can repeat across rows in the OrderDetails table as long as it's for a different PO# and/or product.
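The two tables described above could look roughly like this, sketched with Python's built-in sqlite3 module. The snake_case names (orders, order_details, po_number, price_tickets_ordered) and the sample rows are my assumptions, and the sketch assumes one row per customer per PO#; if a customer can appear several times on one PO# for different products, a product column would have to join the key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
    CREATE TABLE orders (
        po_number TEXT PRIMARY KEY          -- the Import PO#, unique per order
    );
    CREATE TABLE order_details (
        po_number             TEXT NOT NULL REFERENCES orders (po_number),
        customer              TEXT NOT NULL,
        price_tickets_ordered INTEGER NOT NULL,
        PRIMARY KEY (po_number, customer)   -- a customer appears once per PO#
    );
""")
conn.executemany("INSERT INTO orders VALUES (?)", [("PO-1",), ("PO-2",)])
conn.executemany(
    "INSERT INTO order_details VALUES (?, ?, ?)",
    [("PO-1", "Acme", 500), ("PO-1", "Globex", 200)],  # PO-2 has no ticket customers
)

# An order with zero customers still shows up thanks to the LEFT JOIN.
customers_per_order = conn.execute("""
    SELECT o.po_number, COUNT(d.customer)
    FROM orders AS o
    LEFT JOIN order_details AS d ON d.po_number = o.po_number
    GROUP BY o.po_number
    ORDER BY o.po_number
""").fetchall()
```

An order with no ticket customers simply has no rows in order_details, so the 0, 1, 2, 3, 4 or more cases all fit the same two tables.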
This is the best I can tell you based on the clarity of your question.
Personally, I normally spend time designing my database on paper or using some drawing software like Visio before I start implementing it in specific software like MySQL or PostgreSQL.
Reading up on ER diagrams (Entity Relationship diagrams) might help you.
You should also read up on database normalization, probably before anything else.
here is a link that might help:
http://code.tutsplus.com/articles/sql-for-beginners-part-3-database-relationships--net-8561
Sorry, I'm not sure the question title reflects the real question, but here goes:
I am designing a system which has a standard orders table, but with additional previous and next columns.
The question is which approach for foreign keys is better.
Here I have a basic table with the columns previous and next, which are self-referencing foreign keys. The problem with this table is that the first placed order doesn't have previous and next values, so those columns are left empty. If I have, say, 10,000 records and 30% of them have those columns empty, that's 3,000 rows, which is quite a lot I think, and I expect the numbers to grow. In, let's say, a year's time it could come to 30,000 rows with empty columns, and I am not sure if that's OK.
The solution I've come up with is to have the main table plus 2 other tables which have foreign keys to it. In this case those 2 additional tables are identifying tables and nothing more, and there are no longer rows with empty columns.
So the question is which solution is better when considering query speed, table optimization, and common good practices, or maybe there's an even better one that I don't know? (P.S. I am using MySQL with the InnoDB engine.)
If your aim is to do order sets, you could simply add a new table for that, and just have a single column as a foreign key to that table in the order table.
The orders could also include a rank column to indicate in which order orders belonging to the same set come.
create table order_sets (
    id int not null auto_increment,
    -- customer related data, etc...
    primary key (id)
);
create table orders (
    id int not null auto_increment,
    name varchar(255),
    quantity int,
    set_id int not null,  -- the set this order belongs to
    set_rank int,         -- position of the order within its set
    primary key (id),
    foreign key (set_id) references order_sets (id)
);
Then inserting a new order means updating the rank of all other orders which come after in the same set, if any.
Likewise, grouping queries are much easier than having to follow prev and next links. I'm pretty sure you will need these queries, and performance will be much better that way.
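To show what the rank update looks like in practice, here is a small sketch using Python's built-in sqlite3 module rather than MySQL, so the DDL is simplified. The helper name insert_order_at and the sample data are hypothetical. It shifts the ranks of the later orders in the set and inserts the new order in one transaction, then reads the set back in order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_sets (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (
        id       INTEGER PRIMARY KEY,
        name     TEXT,
        set_id   INTEGER NOT NULL REFERENCES order_sets (id),
        set_rank INTEGER NOT NULL
    );
    INSERT INTO order_sets (id) VALUES (1);
    INSERT INTO orders (name, set_id, set_rank) VALUES
        ('first', 1, 1), ('second', 1, 2), ('third', 1, 3);
""")


def insert_order_at(conn, set_id, rank, name):
    """Insert an order at position `rank`, pushing later orders down by one."""
    with conn:  # both statements commit together or not at all
        conn.execute(
            "UPDATE orders SET set_rank = set_rank + 1 "
            "WHERE set_id = ? AND set_rank >= ?",
            (set_id, rank),
        )
        conn.execute(
            "INSERT INTO orders (name, set_id, set_rank) VALUES (?, ?, ?)",
            (name, set_id, rank),
        )


insert_order_at(conn, 1, 2, "inserted")

# Reading a set in sequence is a plain ORDER BY, with no links to follow.
names_in_order = [row[0] for row in conn.execute(
    "SELECT name FROM orders WHERE set_id = ? ORDER BY set_rank", (1,)
)]
```

If orders are only ever appended to the end of a set, the UPDATE touches no rows and the insert is as cheap as any other.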
I am new to database design. In my case I have to generate a lot of keys per user per product. So I have two options:
Create one table with product_id and key for all the users, or
Create a separate table for each user
In the former case I will have a single table but querying might take more time as all the entries are in the same table for all the users.
In the latter case queries might return results faster, but there are more tables, and if the users exceed 100 or more then it means a lot of tables.
Definitely do not create a table for each user. If you create a single table for all users, you can use relational database design: add specific information pertaining to each user, like address or employee information, and use the primary key from the users table as a foreign key. There will not be any noticeable lag, and maintenance will be a whole lot easier.
If you want to build a relation between your users and products, then make a table like the one below:
user_product [table name]
id [Primary Key]
user_id [Reference key of user table]
product_id [Reference key of product table]
key
This is the table schema you should use.
If you generate a table per user, database and relation management becomes more complex. So just use the row-based format above.
If that is helpful, let me know.
Thanks
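For illustration, the single user_product table could be sketched like this with Python's built-in sqlite3 module. The users and products tables, the index name, the sample rows and the column name product_key (used instead of key, which is an SQL keyword) are all my assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE user_product (
        id          INTEGER PRIMARY KEY,
        user_id     INTEGER NOT NULL REFERENCES users (id),
        product_id  INTEGER NOT NULL REFERENCES products (id),
        product_key TEXT NOT NULL UNIQUE   -- the "key" column, renamed (assumption)
    );
    -- The index keeps per-user lookups fast however many users share the table.
    CREATE INDEX idx_user_product_user ON user_product (user_id, product_id);

    INSERT INTO users    VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO products VALUES (1, 'editor');
    INSERT INTO user_product (user_id, product_id, product_key) VALUES
        (1, 1, 'AAAA-1111'), (1, 1, 'AAAA-2222'), (2, 1, 'BBBB-1111');
""")

# All keys of one user for one product, from the shared table.
alice_keys = [row[0] for row in conn.execute(
    "SELECT product_key FROM user_product "
    "WHERE user_id = ? AND product_id = ? ORDER BY product_key",
    (1, 1),
)]
```

With an index on (user_id, product_id), a lookup for one user does not have to scan the rows of the other users, which addresses the worry about everything living in one table.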
The order_products table holds data of products with the product name and price. It is a list of records of what customers have bought.
There are also two fields called product_name and price which are duplicate data from the products table.
Is it worth it to normalize the order_products table and create a history (audit) table for product name and price? Then I wouldn't need product_name and price in the order_products table anymore?
I assume you need to store product name and price at the time of the order. Both will change in the course of time. If that happens a lot, your current approach may be good enough.
I would consider a normalized approach, especially if you have many rows in order_products per (product name, price). Have an additional table that stores the volatile states of a product every time they change. It could be called product_history, like you already hinted. Just save the date (or timestamp) with every new state. Have a foreign key link to the table product to preserve referential integrity. Like this:
create table product_history
(product_id integer
,valid_from date -- or timestamp
,product_name varchar
,price decimal
,PRIMARY KEY (product_id, valid_from)
,FOREIGN KEY (product_id) REFERENCES product(product_id)
    ON DELETE CASCADE
    ON UPDATE CASCADE);
A fast query to look up the applicable volatile attributes:
SELECT *
FROM product_history
WHERE product_id = $my_product_id
AND valid_from <= $my_date
ORDER BY valid_from DESC
LIMIT 1;
You definitely need an index on (product_id, valid_from) to speed up this query. The primary key in my example will probably do.
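Here is a runnable sketch of that lookup using Python's built-in sqlite3 module. The helper name state_on and the sample rows are hypothetical, and dates are stored as ISO text because SQLite has no dedicated date type. It returns the name and price that applied on a given date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product (product_id INTEGER PRIMARY KEY);
    CREATE TABLE product_history (
        product_id   INTEGER NOT NULL REFERENCES product (product_id),
        valid_from   TEXT NOT NULL,        -- ISO dates sort correctly as text
        product_name TEXT,
        price        NUMERIC,
        PRIMARY KEY (product_id, valid_from)
    );
    INSERT INTO product VALUES (1);
    INSERT INTO product_history VALUES
        (1, '2020-01-01', 'Widget', 10),
        (1, '2020-06-01', 'Widget Pro', 12);
""")


def state_on(conn, product_id, date):
    """Return (product_name, price) valid for the product on the given date."""
    return conn.execute(
        "SELECT product_name, price FROM product_history "
        "WHERE product_id = ? AND valid_from <= ? "
        "ORDER BY valid_from DESC LIMIT 1",
        (product_id, date),
    ).fetchone()


march_state = state_on(conn, 1, "2020-03-15")   # first state still applies
july_state = state_on(conn, 1, "2020-07-01")    # second state has taken over
before_any = state_on(conn, 1, "2019-12-31")    # no state recorded yet
```

An order row then only needs product_id and its own order date to recover the name and price it was sold at.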
That depends. What is the purpose of that table?
In general, tables like that can be used for statistical analysis of market trends, so it's important to have both product_name and price: the product price today may be different from what it was one month ago, but you may want to know at which prices products were most bought.
However, if the price is present in that table because it is part of the product's primary key, then that is just bad practice and the key should be reduced.
It's not possible to make this judgement knowing just the database structure. It depends on how you use your database (ie. inserts, selects, updates and deletes... And how frequently?).
At one end, if your solution is a reporting solution on a read-only database, you should keep those duplicates! But if, at the other end, your solution is a logging solution that only writes information and never retrieves it, I'd go for the normalized model you're suggesting.
Fully normalized databases are not optimized for read performance. You often have to denormalize your database design.
Very often a model that has a certain degree of redundant data is the fastest one. When denormalizing you just have to keep a steady eye on the balance between faster queries and slower insertions/updates!
Check these answers and maybe you'll find further help making your decision: "When to Denormalize a Database Design".
Yes, that's a good idea, but a better idea is to create one field in the order_products table and dump all your order info there after serializing it. With this approach you don't have to create 2 new tables (maybe more if you want to do the same for gift coupon info, shipping info, etc.).
The rationale behind the approach is that order_products rows are placed orders, which means they are "published records". Published records don't change much and shouldn't be modified, and these records should be kept for future audits.
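A minimal sketch of the serialized-field idea, using Python's built-in json and sqlite3 modules. The column name order_info and the contents of the snapshot are my assumptions. The state of the order at purchase time is frozen into one text column and read back unchanged.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE order_products (id INTEGER PRIMARY KEY, order_info TEXT NOT NULL)"
)

# Freeze what the customer actually bought, independent of later product changes.
snapshot = {"product_name": "Widget", "price": "9.99", "quantity": 2}
conn.execute(
    "INSERT INTO order_products (order_info) VALUES (?)",
    (json.dumps(snapshot),),
)

# Reading the record back gives the same data that was stored.
stored = conn.execute("SELECT order_info FROM order_products").fetchone()[0]
restored = json.loads(stored)
```

Keep in mind that values inside the serialized field are awkward to filter or aggregate in plain SQL, so this fits records that are mostly read back whole.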