Insert value to Star schema fact table - mysql

I'm stuck designing a star schema. Here is my problem.
I have several dimension tables already designed:
customer table (customer_id, name, address) (200 rows)
inventory table (film, category, inv_id) (400 rows)
store table (store_id, store) (4 rows)
and sale table (sale_date, sale_id) (16500 rows)
I'm trying to insert values into the fact table (newly created, empty): payment (FK customer_id, FK inv_id, FK store_id, FK sale_id, payment_amount)
I have 15650 payment records. How can I insert these values into the fact table?
When I use
insert into payment (payment_amount)
select amount
from original
It throws an error: NOT NULL violation for the foreign keys.
What should I do to get these values into the fact table?
I know I have a conceptual error here; I hope you can give me a good clarification.

The error is probably thrown either because these values do not exist in the parent dimension tables (customer, inventory, store, sale), or because you are loading only payment_amount into the fact table, so the remaining columns with NOT NULL constraints are never populated. Given that the message is a NOT NULL violation, the second cause is the likely one here.
Your design is missing the basic relationships between the tables. For example, how is customer related to sale, or how is sale related to store?
For example, you can have relationships such as store table -> customer, sale and inventory tables. These relationships can be one-to-many, one-to-one or many-to-one, so that you can identify a unique sale to a customer from a particular store for a particular inventory item.
You need to design the data flow along these lines:
1. First, a staging table, which may be non-persistent and serves as the landing area for everything ingested from the different sources
2. An intermediate table that holds the transformed data from the staging tables
3. Create the dimension tables from the staging and intermediate tables
4. Create the fact table from the dimension and intermediate tables
As a best practice, load the records in the following order so that there are no key conflicts:
Staging -> Intermediate -> Dimension -> Facts
In an ideal scenario you need not mention the keys explicitly; instead, have a cleaning job on top of it.
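To make the fact load concrete, here is a minimal sketch of populating payment by joining the source rows to the dimension tables so that every foreign key gets a value. It uses Python's built-in sqlite3 module as a stand-in for MySQL; the INSERT ... SELECT itself is plain SQL. The columns of original (customer_name, film, store, sale_id, amount) and the sample rows are assumptions for illustration, since the question does not show that table's layout.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; only the INSERT ... SELECT matters here.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customer  (customer_id INTEGER PRIMARY KEY, name TEXT, address TEXT);
CREATE TABLE inventory (inv_id INTEGER PRIMARY KEY, film TEXT, category TEXT);
CREATE TABLE store     (store_id INTEGER PRIMARY KEY, store TEXT);
CREATE TABLE sale      (sale_id INTEGER PRIMARY KEY, sale_date TEXT);
CREATE TABLE payment (
    customer_id    INTEGER NOT NULL REFERENCES customer(customer_id),
    inv_id         INTEGER NOT NULL REFERENCES inventory(inv_id),
    store_id       INTEGER NOT NULL REFERENCES store(store_id),
    sale_id        INTEGER NOT NULL REFERENCES sale(sale_id),
    payment_amount REAL    NOT NULL
);
-- Hypothetical source layout: one row per payment, carrying the
-- natural values that identify each dimension row.
CREATE TABLE original (
    customer_name TEXT, film TEXT, store TEXT, sale_id INTEGER, amount REAL
);

INSERT INTO customer  VALUES (1, 'Ann', '1 Main St'), (2, 'Bob', '2 Oak Ave');
INSERT INTO inventory VALUES (10, 'Alien', 'Sci-Fi'), (11, 'Heat', 'Crime');
INSERT INTO store     VALUES (1, 'Downtown'), (2, 'Airport');
INSERT INTO sale      VALUES (100, '2024-01-05'), (101, '2024-01-06');
INSERT INTO original  VALUES
    ('Ann', 'Alien', 'Downtown', 100, 4.99),
    ('Bob', 'Heat',  'Airport',  101, 2.99);
""")

# Look up every dimension key while inserting, so no FK column is left NULL.
conn.execute("""
INSERT INTO payment (customer_id, inv_id, store_id, sale_id, payment_amount)
SELECT c.customer_id, i.inv_id, s.store_id, sa.sale_id, o.amount
FROM original o
JOIN customer  c  ON c.name     = o.customer_name
JOIN inventory i  ON i.film     = o.film
JOIN store     s  ON s.store    = o.store
JOIN sale      sa ON sa.sale_id = o.sale_id
""")
conn.commit()

fact_rows = conn.execute(
    "SELECT customer_id, inv_id, store_id, sale_id, payment_amount "
    "FROM payment ORDER BY sale_id"
).fetchall()
print(fact_rows)
```

Because these are inner joins, a source row whose customer, film, store or sale has no match in a dimension is silently skipped, so comparing the row count of payment against the source is a sensible check after loading.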

Related

MySQL best way to store data

Currently doing a school project.
I currently have a simple database that stores transaction information.
There is one table named "transactions" with columns:
id (key, auto inc.)
itemID (int, about 500 unique IDs)
value (int, value of transaction)
dateTime (dateTime, in which entry was added)
At the moment it is all dumped into one table. Would it be better to have a table for every itemID and store all the transactions for that particular itemID in it? Or is that not good practice?
In terms of scalability you are doing it right. Consider what happens when a new item comes into play: with a table per item, you would have to create a new table for it. The way you are working now, you just insert the new item into the items table and insert the transactions associated with that item into the transactions table, referencing the item by its foreign key.
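A minimal sketch of the single-table layout, using Python's built-in sqlite3 as a stand-in for MySQL. The name column on items, the index name and the sample rows are assumptions for illustration; the transactions columns follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE items (
    itemID INTEGER PRIMARY KEY,
    name   TEXT              -- hypothetical descriptive column
);
CREATE TABLE transactions (
    id       INTEGER PRIMARY KEY AUTOINCREMENT,
    itemID   INTEGER NOT NULL REFERENCES items(itemID),
    value    INTEGER NOT NULL,
    dateTime TEXT    NOT NULL
);
-- An index on itemID keeps per-item lookups fast without splitting tables.
CREATE INDEX idx_transactions_item ON transactions(itemID);
""")

# A new item is just a new row, not a new table.
conn.execute("INSERT INTO items VALUES (1, 'widget')")
conn.execute("INSERT INTO items VALUES (2, 'gadget')")
conn.executemany(
    "INSERT INTO transactions (itemID, value, dateTime) VALUES (?, ?, ?)",
    [
        (1, 50, "2024-01-01 10:00:00"),
        (1, 70, "2024-01-02 11:00:00"),
        (2, 20, "2024-01-02 12:00:00"),
    ],
)
conn.commit()

# All transactions for one item: a WHERE clause replaces the per-item table.
widget_values = [
    row[0]
    for row in conn.execute(
        "SELECT value FROM transactions WHERE itemID = ? ORDER BY id", (1,)
    )
]

# Totals across every item in one query, which per-item tables cannot do easily.
totals = conn.execute(
    "SELECT itemID, SUM(value) FROM transactions GROUP BY itemID ORDER BY itemID"
).fetchall()
print(widget_values, totals)
```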

Sql creating table without using foreign key

Suppose we have one SQL table, customers.
Now suppose we have a table with two columns, customer_name and orders_name. One customer may have multiple orders (a one-to-many relationship), so we have a table in which we choose customer_name as the foreign key. But if we have 100 orders for one customer_name, we have to write the same customer_name 100 times, which is a waste of memory.
customer_name,customer_orders table is
So I was thinking: can't we just make a table named after the customer's orders? For example, if we have a customer_name bill, we can create a table named bill's orders and write all his orders in it. Now we are not using any foreign key.
bill's orders table is
We can create more tables like this for other users. But then how can the table be deleted when we delete that customer_name from the main table? Any ideas?
You solve the issue of wasted space by using surrogate keys. Instead of copying a huge alphanumeric field (names) to child tables, you would create an ID of sorts using a more compact data type (byteint, smallint, int, etc.). In the approach you propose where you create a separate table for each customer, you will run into the following issues:
cannot run aggregates across customers, i.e., you cannot simply do a sum, avg, min, etc. for sets of customers slicing the data different ways
SQL will be far more complex with each extra customer added to the queries
your data dictionary is going to grow huge and at some point you will incur major performance issues that are not easy to fix
The point of using a relational database is to allow for users to dynamically slice and dice the data. The method that you are proposing would not be useful for querying.
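A minimal sketch of the surrogate-key approach, using Python's built-in sqlite3 as a stand-in for whatever SQL database is in use. The table and column names and the sample rows are assumptions for illustration. The ON DELETE CASCADE clause is an addition not mentioned above; it is one common way to remove a customer's orders when the customer row is deleted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE customers (
    customer_id   INTEGER PRIMARY KEY,   -- compact surrogate key
    customer_name TEXT NOT NULL
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL
                REFERENCES customers(customer_id) ON DELETE CASCADE,
    order_name  TEXT NOT NULL
);
INSERT INTO customers VALUES (1, 'bill'), (2, 'alice');
INSERT INTO orders (customer_id, order_name) VALUES
    (1, 'keyboard'), (1, 'mouse'), (2, 'monitor');
""")

# One query aggregates across all customers; no per-customer tables needed.
orders_per_customer = conn.execute("""
SELECT c.customer_name, COUNT(*)
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id
GROUP BY c.customer_id
ORDER BY c.customer_name
""").fetchall()

# Deleting the customer row removes that customer's orders via the cascade.
conn.execute("DELETE FROM customers WHERE customer_name = 'bill'")
conn.commit()
remaining_orders = conn.execute("SELECT order_name FROM orders").fetchall()
print(orders_per_customer, remaining_orders)
```

The name is stored once in customers; each order repeats only the small integer key.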

Database design: same table - mixed data VS several tables - same schema

I would like to store information about people (who have a person_id) that is quite similar to each other, such as:
profession
nationality
tags
etc. = limited amount of characteristics which is not expected to grow in number
Since one person can have more than one tag (or profession, for example), it makes sense to normalise the database. All of this information requires a simple table design: primary key (id) + varchar.
I am wondering what makes more sense:
Store mixed information in one table = one schema
Store information in distinct tables, but tables have the same schema
Edit
This information and the people are connected in a third table: primary key | person_id | property_id
1. You should store the information in distinct tables with the same schema if your database is OLTP (online transaction processing). You can then use joins to retrieve the data.
2. You should keep the mixed information in one table if your database serves a data mart, data warehouse or data mining purpose, where performance is not an issue and MIS-related information carries more weight.
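For comparison, here is a small sketch of both layouts side by side, using Python's built-in sqlite3. The kind column that distinguishes rows in the shared table, and all table names and sample values, are assumptions for illustration; the link table follows the primary key | person_id | property_id shape from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT);

-- Layout 1: mixed information in one table; 'kind' tells the rows apart.
CREATE TABLE property (
    property_id INTEGER PRIMARY KEY,
    kind        TEXT NOT NULL,
    value       TEXT NOT NULL
);
CREATE TABLE person_property (
    id          INTEGER PRIMARY KEY,
    person_id   INTEGER NOT NULL REFERENCES person(person_id),
    property_id INTEGER NOT NULL REFERENCES property(property_id)
);

-- Layout 2: one table per characteristic, all with the same schema.
CREATE TABLE profession (profession_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE person_profession (
    id            INTEGER PRIMARY KEY,
    person_id     INTEGER NOT NULL REFERENCES person(person_id),
    profession_id INTEGER NOT NULL REFERENCES profession(profession_id)
);

INSERT INTO person VALUES (1, 'Dana');
INSERT INTO property VALUES
    (1, 'profession', 'engineer'),
    (2, 'profession', 'teacher'),
    (3, 'nationality', 'French');
INSERT INTO person_property (person_id, property_id) VALUES (1, 1), (1, 2), (1, 3);
INSERT INTO profession VALUES (1, 'engineer'), (2, 'teacher');
INSERT INTO person_profession (person_id, profession_id) VALUES (1, 1), (1, 2);
""")

# Layout 1 needs an extra filter on 'kind' to pick out professions.
mixed = [r[0] for r in conn.execute("""
SELECT p.value
FROM person_property pp
JOIN property p ON p.property_id = pp.property_id
WHERE pp.person_id = 1 AND p.kind = 'profession'
ORDER BY p.value
""")]

# Layout 2 gets the same answer from the dedicated tables.
separate = [r[0] for r in conn.execute("""
SELECT pr.name
FROM person_profession pp
JOIN profession pr ON pr.profession_id = pp.profession_id
WHERE pp.person_id = 1
ORDER BY pr.name
""")]
print(mixed, separate)
```

Both layouts answer the same question; the separate tables let a foreign key guarantee that a profession link can only point at a profession, while the shared table needs fewer tables but relies on the kind filter.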

Beginner Database architecture

I am converting a spreadsheet to a database, but how do I accommodate multiple values for a field?
This is a database tracking orders with factories.
Import PO# is the unique key. Sometimes one order will have 0, 1, 2, 3, 4 or more customers requiring that we place their price tickets on the product in the factory. Every order is different. What's the proper way to accommodate multiple values in one field?
Generally, having multiple values in a field is bad database design. A one-to-many relationship will probably work in this scenario.
So you will have an Order table with PO# as the primary key.
Then you will have an OrderDetails table with PO# as a foreign key, i.e. it will not be designated as the primary key there.
For each row in the Order table you will have a unique PO# that will not repeat across rows.
In the OrderDetails table you will have a customer per row and because the PO# is not a primary key, it can repeat across rows. This will allow you to designate multiple customers per order. Therefore each row will have its own PriceTicketsOrdered field so you can know per customer what the price is.
Note that each customer can repeat across rows in the OrderDetails table as long as it is for a different PO# and/or product.
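A minimal sketch of that one-to-many layout, using Python's built-in sqlite3. The tables are named orders and order_details here because ORDER is a reserved word in SQL; po_number stands in for PO#. The detail_id surrogate key, the factory column and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE orders (
    po_number TEXT PRIMARY KEY,      -- unique per order, never repeats
    factory   TEXT                   -- hypothetical order-level column
);
CREATE TABLE order_details (
    detail_id           INTEGER PRIMARY KEY,
    po_number           TEXT NOT NULL REFERENCES orders(po_number),
    customer            TEXT NOT NULL,
    PriceTicketsOrdered INTEGER NOT NULL
);
INSERT INTO orders VALUES ('PO-1001', 'Factory X'), ('PO-1002', 'Factory Y');
-- PO-1001 has two customers needing price tickets; PO-1002 has none.
INSERT INTO order_details (po_number, customer, PriceTicketsOrdered) VALUES
    ('PO-1001', 'Retailer A', 500),
    ('PO-1001', 'Retailer B', 250);
""")

# LEFT JOIN keeps orders with zero customers in the result.
customers_per_order = conn.execute("""
SELECT o.po_number, COUNT(d.customer)
FROM orders o
LEFT JOIN order_details d ON d.po_number = o.po_number
GROUP BY o.po_number
ORDER BY o.po_number
""").fetchall()
print(customers_per_order)
```

An order with no ticketing customers simply has no rows in order_details, so the 0, 1, 2, 3, 4 or more cases all fit the same two tables.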
This is the best I can tell you based on the clarity of your question.
Personally, I normally spend time designing my database on paper or in drawing software like Visio before I start implementing it in a specific product like MySQL or PostgreSQL.
Reading up on ER diagrams (entity relationship diagrams) might help you.
You should also read up on database normalization, probably before anything else.
Here is a link that might help:
http://code.tutsplus.com/articles/sql-for-beginners-part-3-database-relationships--net-8561

Should I store "quotation_request" as a table in my DB?

I'm working on a very simple DB.
Imagine I have a customer table and a seller table.
The customer is able to request a quotation for some products.
There will be a simple form that allows customers to select products and submit the quotation.
Now, should I create a table "Quotation" and store all quotations (with id_quotation, etc.)?
Thank you all
Without knowing all of the business rules that govern the requirements of this database, perhaps the following design will help to answer your question and explain a few concepts in the process.
In database terms, an entity is a person, place, or thing about which we want to collect and store data. From your description we can already see two entities: seller and customer. This is important since the entities we identify conceptually become database tables in their own right.
The seller table should contain data applicable only to sellers. Thus, the qualities (attributes) about sellers that we want to store become columns in our seller table. Each row (record) in the seller table represents an individual seller. Each individual seller is uniquely identified in the seller table with a unique value stored in its primary key column, which we can name seller_id.
A simplified version of such a table could look like this:
In a similar manner, the customer table should contain data only applicable to customers. The qualities (attributes) about customers that we wish to store become the columns in the customer table. Each row (record) in the customer table represents an individual customer. Each individual customer is uniquely identified in that table with a unique value in its primary key column, which we can declare as customer_id.
A simplified version of this table:
I'm guessing the business rules state that any customer is able to request any number of products, from any seller, any number of times...since surely any seller would want as many sales and customers as possible!
How can we express and record the interactions (relationship) between seller and customer?
This is done with a new kind of entity: a composite entity. It becomes a new table, having its own primary key, and contains seller_id and customer_id as foreign keys. The foreign keys in this table connect (relate) the seller table to the customer table.
We can name this new table quotation (if that is your preferred name). Each row of this table is intended to capture and record each and every individual transaction between a customer and a seller. The columns (attributes) of this table are the data that apply to a transaction between a customer and seller, such as amount or date of sale.
A very simplified version of this composite entity:
Note that the foreign key values that exist in this table must already exist in their respective tables as a primary key value. That is, a foreign key value cannot be entered into this table unless it already exists as a primary key value in its own table. This is important, and it is called referential integrity: it ensures that there is no record of a customer purchasing from a non-existent seller, etc.
In the example above we can see that Builder B requested a quotation from Acme Construction in the amount of $3500.00. They then requested another quotation at another time for the amount of $1800.00. What else does it reveal? All existing customers have ordered something. Acme Lumber has not made a sale at all (yet), etc.
A design such as this enables the database to store any number of transactions between sellers and customers. Likewise, it supports the addition of any number of new customers and sellers, even if they have not sold or purchased anything yet. Queries can be run that reveal which sellers have sold the most or least, and so on.
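The three tables and the referential integrity rule can be sketched as follows, using Python's built-in sqlite3. The seller and customer names and the two amounts come from the description above; the column names beyond seller_id and customer_id, the key values and the rejected test row are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE seller   (seller_id   INTEGER PRIMARY KEY, seller_name   TEXT NOT NULL);
CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, customer_name TEXT NOT NULL);
-- Composite entity: its own primary key plus two foreign keys.
CREATE TABLE quotation (
    quotation_id INTEGER PRIMARY KEY,
    seller_id    INTEGER NOT NULL REFERENCES seller(seller_id),
    customer_id  INTEGER NOT NULL REFERENCES customer(customer_id),
    amount       REAL    NOT NULL
);
INSERT INTO seller   VALUES (1, 'Acme Construction'), (2, 'Acme Lumber');
INSERT INTO customer VALUES (1, 'Builder B');
INSERT INTO quotation (seller_id, customer_id, amount) VALUES
    (1, 1, 3500.00),
    (1, 1, 1800.00);
""")

# Referential integrity: seller 99 does not exist, so this row is refused.
try:
    conn.execute(
        "INSERT INTO quotation (seller_id, customer_id, amount) VALUES (99, 1, 100.0)"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Sellers with no quotation at all (yet).
idle_sellers = [r[0] for r in conn.execute("""
SELECT s.seller_name
FROM seller s
LEFT JOIN quotation q ON q.seller_id = s.seller_id
WHERE q.quotation_id IS NULL
""")]
print(rejected, idle_sellers)
```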
Good luck with your studies!