I'm sure this has been asked somewhere, but I keep only finding discussions about phone numbers.
I'm designing a system for a supplier group that controls multiple online stores. There are 5 different types/levels of stores (based on their buying group, blah).
These accounts can order products from our warehouse and also have their own collection of customers that can order products from them.
Since both store accounts and their customer accounts will be stored in and accessed by our system, but stores will access their customer accounts independently, I want to set up an organized structure for account numbers that works for both sides of the system.
I was planning on 10-digit account numbers, with the first 5 digits identifying the store (the first of which will identify the store type/level), and the last 5 identifying the customer.
For example,
First LEVEL1 store account : 10001-00000
Their customer accounts : 10001-00001, 10001-00002, etc.
Third LEVEL2 store account : 20003-00000
Their customer accounts : 20003-00001, 20003-00002, etc.
This should allow for enough growth to support our potential number of stores and their number of customers.
My question is, should I separate the numbers with a dash or not?
It certainly makes it clearer for humans, but it feels like it would be better in the database to be stored as a true int, though I don't know how often I'd be comparing them, etc.
Should I store it as an INT with no dash, and just format it when displaying it to users? (then obviously be sure to accept account number input with or without a dash)
1NF says that values in each attribute must be atomic. You are violating the 1NF by adding dashes and storing two attributes as one.
What must you (ideally) do?
Each store has an ID, so the store's table should have an ID column that contains that.
Next, the customers should also have their own IDs in their table.
Finally, the relationship (or junction) table between store and customer should contain the store's ID and the customer's ID in each row.
Alternatively, the customer can have a foreign key which tells which store the customer should be shopping from, assuming a customer is tied to only one store.
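As a rough sketch of that advice, the store part and the customer part can live in separate columns, with the dashed form produced only for display. This uses Python's built-in sqlite3 module; the table names, column names, the per-store customer number and the helpers `format_account` and `parse_account` are all illustrative assumptions, not something from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE store (
    store_id INTEGER PRIMARY KEY,   -- e.g. 10001
    level    INTEGER NOT NULL       -- store type/level kept as its own column
);
CREATE TABLE customer (
    store_id    INTEGER NOT NULL REFERENCES store(store_id),
    customer_no INTEGER NOT NULL,   -- assumed per-store sequence: 1, 2, 3, ...
    PRIMARY KEY (store_id, customer_no)
);
""")
conn.execute("INSERT INTO store VALUES (10001, 1)")
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(10001, 1), (10001, 2)])


def format_account(store_id, customer_no=0):
    """Build the human-readable form; customer_no 0 stands for the store itself."""
    return f"{store_id:05d}-{customer_no:05d}"


def parse_account(text):
    """Accept input with or without the dash and split it into its two parts."""
    digits = text.strip().replace("-", "")
    if len(digits) != 10 or not digits.isdigit():
        raise ValueError(f"not a 10-digit account number: {text!r}")
    return int(digits[:5]), int(digits[5:])
```

With this layout the dash never reaches the database: `format_account(10001, 2)` gives `10001-00002`, and `parse_account` accepts either `10001-00002` or `1000100002`.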
I have a conceptual question.
My DB has a table that stores information about people. One of its fields is their phone number (8 digits in my country).
The thing is that in some cases, two or more people will have the same phone number.
My question is: would it be a better choice to store the phone numbers in another table and reference them by a foreign key, instead of just storing them as a field? If so, will the answer be the same whatever the size of the DB?
I don't know if this will make any difference, but the table will have no more than 600.000 - 800.000 records, and I guess the coincident phone numbers will be about 10% of the total records.
EDIT:
-Each record can have up to 4 phone numbers (two lines and two cells)
-Both cases will occur: sometimes users will look for all the people who have a specific number, and sometimes they will want to know all the phone numbers a person has
Technically, to be normalized, you would have a separate PhoneNumber table and then a PersonPhoneNumber table.
However, in practice I have rarely seen this structure for phone numbers and addresses. For one, it is easy to make a mistake and update more than one person's address or phone when you only mean to change one. For another, it adds an extra join that seems unneeded to most people. You aren't really gaining much by going to this level other than removing a small amount of duplication.
The real decider is how you are going to use and update the numbers. If you frequently want to update all the people with the same number, it is better to go fully normalized. If you will usually only want to update one person at a time, it is probably less risky to have only a Person table and a PersonPhoneNumber table.
If you want history, then I would go with a Person table and a PersonPhoneNumber history table. It would have the person ID, the phone number, the start date and the end date. So when John and Mary get divorced, his phone number would have an end date but hers would not, and you could clearly see who had the number when.
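A minimal sketch of such a history table, using Python's sqlite3 module. The table and column names, the sample rows and the helper `holders_on` are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE person_phone_history (
    person_id  INTEGER NOT NULL REFERENCES person(person_id),
    phone      TEXT NOT NULL,
    start_date TEXT NOT NULL,   -- ISO dates compare correctly as text
    end_date   TEXT             -- NULL means the number is still in use
);
INSERT INTO person VALUES (1, 'John'), (2, 'Mary');
INSERT INTO person_phone_history VALUES
    (1, '55512345', '2015-01-01', '2020-06-30'),   -- John gave the number up
    (2, '55512345', '2015-01-01', NULL);           -- Mary still has it
""")


def holders_on(phone, day):
    """Return the names of everyone who had `phone` on the given ISO date."""
    rows = conn.execute(
        """
        SELECT p.name
        FROM person_phone_history h
        JOIN person p ON p.person_id = h.person_id
        WHERE h.phone = ?
          AND h.start_date <= ?
          AND (h.end_date IS NULL OR h.end_date >= ?)
        ORDER BY p.name
        """,
        (phone, day, day),
    ).fetchall()
    return [name for (name,) in rows]
```

Asking `holders_on` about a date before the end date returns both people, and a later date returns only the person whose row has no end date.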
If you have more than one phone number per person, there is a good reason to set up a new table like:
id, user_id, phone, type, description
So type could be one of
Home, Work, Office2, Boss, Wife, Fax, Mobile etc...
and description like
"work hours only", "evening", "24x7", "Emergency only" etc
If you really are managing a phone book in your application, it is a good idea to separate phone numbers from the original user table.
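The table described above could be sketched like this with Python's sqlite3 module. The `users` table, the two indexes, the sample rows and the helpers `phones_of` and `users_with` are assumptions added for illustration; they cover both lookups the question mentions (all numbers for a person, all people for a number).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE phones (
    id          INTEGER PRIMARY KEY,
    user_id     INTEGER NOT NULL REFERENCES users(id),
    phone       TEXT NOT NULL,
    type        TEXT NOT NULL,      -- Home, Work, Mobile, Fax, ...
    description TEXT                -- "evening", "24x7", ...
);
CREATE INDEX phones_by_number ON phones(phone);    -- search by number
CREATE INDEX phones_by_user   ON phones(user_id);  -- search by person
INSERT INTO users VALUES (1, 'Ana'), (2, 'Luis');
INSERT INTO phones (user_id, phone, type, description) VALUES
    (1, '22334455', 'Home',   'evening'),
    (1, '66778899', 'Mobile', '24x7'),
    (2, '22334455', 'Home',   'evening');   -- same line shared by two people
""")


def phones_of(user_id):
    """All numbers (with their type) that belong to one person."""
    return conn.execute(
        "SELECT phone, type FROM phones WHERE user_id = ? ORDER BY id",
        (user_id,),
    ).fetchall()


def users_with(phone):
    """Names of all people who share a given number."""
    rows = conn.execute(
        """
        SELECT u.name FROM phones p
        JOIN users u ON u.id = p.user_id
        WHERE p.phone = ?
        ORDER BY u.name
        """,
        (phone,),
    ).fetchall()
    return [name for (name,) in rows]
```

Note that the shared number is stored once per person here, so changing it for one person does not silently change it for the other.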
If you have two people with the same phone number you will encounter a problem when searching for a specific phone number. A search for a specific phone number will sometimes return more than one result (10% according to your estimate). If you search by phone number AND by people, you can require all searches for a phone number to include a user identifier (first name, last name, location, etc). It depends on what your objective is.
Usually the phone number is just a number, and without a person has no meaning.
So you store it in the Person table.
But if you work for a telephone company, for which a phone number has a different meaning and usage (like history, phone lookup, billing), then it should be stored in a separate table. I hope this helps.
I've been using Access to create simple databases for a while with great success, but have run into a problem I can't find an answer to.
We ship individually serialized units to various end users, and occasionally to resellers that stock them for end users. I must keep track of which serial numbers end up with each end user.
The first database I created to handle this recorded company information in one table using their account number as primary key, order information in a second table using the order number as the primary key and linked via the company name, and unit information in a third table with the serial number as the primary key and linked via the order number.
This worked very well until I had to account for these stock orders with a reseller. As it was structured, every unit was linked to one company via the sales order. The issue is that I may ship 20 units on one order to Company A, who then sells 5 to Company B and 3 to Company C.
I realized I needed to link the company name directly to the units, not the orders, and have fixed that.
My issue now is simplicity in entering information in the form. My previous database involved the employee in our shipping department merely entering the sales order, selecting the customer name from a drop down menu, then scanning the serial numbers in a subform. This was to ensure simplicity and try to eliminate human error. He had only three things to input, and most of the input was done by scanning barcodes.
As it is currently structured, the employees in shipping would have to populate the company name for every record in the subform along with the serial number, and that complicates things in a way that is unacceptable. At the point of shipping, the company name will always be the same for every unit in the subform.
So.
How would I go about creating a form where the company name is entered once in the form, and automatically populates itself for every record in the subform? The caveat here is that I must also be able to go back occasionally and change the company name of individual units in an order without necessarily affecting the rest of the order. I suppose it starts out as a one-to-many relationship that then must be able to change.
I hope that makes sense.
I have looked for answers using various approaches with auto-fill and relationships and not preserving data integrity, but I feel the answer is just beyond my reach.
The only solution I can think of is to create another field in the unit table for the end user, and perhaps write a formula that sets its default value to the company name from the order that shipped it. This seems unnecessarily complicated and redundant; there has to be a better way.
I am working on an e-commerce website that will sell clothes but also provide a way to book a service, like a manicure. I am trying to create a single shopping cart to which users can add both. The user should have a single shopping cart and a single total to pay at the end. The problem is: how should I store these in the database?
Clothes will have size and colour as params, while the services will have the date and time when the user wants to get the service. Storing all these parameters in a single order_items table doesn't look too wise, while storing params as a serialized string doesn't feel any better.
What is the most common way of storing this data together?
I also work for an e-commerce store and we had the same problem. The easiest approach is to have a "cart" table and a "cart_item" table, where the columns shared between the products are stored, e.g. price, quantity, etc.
Then you have a table "cart_item_voucher" and "cart_item_product" which save the details specific to the voucher or product. Each one will reference "cart_item" with the "cart_item_id" foreign key.
Also "cart_item" will have a type field and from there you can distinguish the difference.
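A sketch of that layout, adapted to the question's clothes-plus-services case, using Python's sqlite3 module. Here a hypothetical `cart_item_service` table takes the place of the voucher table, and the column names, integer-cent prices and helper functions are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cart (cart_id INTEGER PRIMARY KEY, user_id INTEGER NOT NULL);
CREATE TABLE cart_item (                 -- columns shared by every item type
    cart_item_id INTEGER PRIMARY KEY,
    cart_id      INTEGER NOT NULL REFERENCES cart(cart_id),
    type         TEXT NOT NULL CHECK (type IN ('product', 'service')),
    price_cents  INTEGER NOT NULL,
    quantity     INTEGER NOT NULL DEFAULT 1
);
CREATE TABLE cart_item_product (         -- details only clothes have
    cart_item_id INTEGER PRIMARY KEY REFERENCES cart_item(cart_item_id),
    size         TEXT NOT NULL,
    colour       TEXT NOT NULL
);
CREATE TABLE cart_item_service (         -- details only bookings have
    cart_item_id INTEGER PRIMARY KEY REFERENCES cart_item(cart_item_id),
    starts_at    TEXT NOT NULL
);
INSERT INTO cart VALUES (1, 42);
INSERT INTO cart_item VALUES (1, 1, 'product', 1999, 2), (2, 1, 'service', 3500, 1);
INSERT INTO cart_item_product VALUES (1, 'M', 'blue');
INSERT INTO cart_item_service VALUES (2, '2024-05-01 14:00');
""")


def cart_total_cents(cart_id):
    """One total for the whole cart, whatever the item types are."""
    return conn.execute(
        "SELECT COALESCE(SUM(price_cents * quantity), 0) FROM cart_item WHERE cart_id = ?",
        (cart_id,),
    ).fetchone()[0]


def cart_lines(cart_id):
    """Each item with its type-specific details; unrelated columns come back as None."""
    return conn.execute(
        """
        SELECT i.type, p.size, p.colour, s.starts_at
        FROM cart_item i
        LEFT JOIN cart_item_product p ON p.cart_item_id = i.cart_item_id
        LEFT JOIN cart_item_service s ON s.cart_item_id = i.cart_item_id
        WHERE i.cart_id = ?
        ORDER BY i.cart_item_id
        """,
        (cart_id,),
    ).fetchall()
```

The total only ever touches `cart_item`, which is the main benefit of keeping the shared columns in one place.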
I want to create a small invoicing application. My first step is creating a database with two tables where each client can have multiple addresses and phone numbers.
I think it's safe to assume that I am dealing with a one-to-many relationship. But how should I design the tables and fields so they can store multiple addresses and phone numbers per user? By multiple I also mean an indeterminate number, so I can't just have a fixed number of columns in my table. Offhand, here is what I was thinking:
Clients
- Id
- Notes
CONTACTS
- id
- client_id
- address
- phone number
Now I am trying to avoid a third join table like this:
Clients_Contacts
- id
- client_id
- contact id
because I still want a one-to-many and not a many-to-many relationship between contacts and clients. Is there something wrong with the way I am thinking about this? Can someone please help me design this database and show me what a sample query for multiple addresses would look like?
Why would you need a join table?
I would have
Client
- ID (PK, referenced by Contact.client_id)
- Notes
Contact
- ID (PK)
- client_id (FK to Client.ID)
- address
- phone_number
Then just set the ID of the client to be the value of client_id within the contact information.
This way, multiple Contact records can have the same client_id.
Your first idea looks fine for your use-case (one to many). You'll only need the third join table if you want a many to many relationship, which you do not want.
A sample query to get all addresses from a specific client would be:
SELECT address FROM Contacts WHERE client_id = 1
You say you want to store multiple addresses and phone numbers per user (= client, I assume?). Your current model already supports this. Every client can have as many contacts as needed (each contact would have the client_id set to the client's id). Only if you want to assign a single contact (address/phone number) to multiple clients as well would you need the third table.
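For completeness, here is a small sketch of the two-table model with a join that lists every address per client, using Python's sqlite3 module. The lowercase table names and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE clients (id INTEGER PRIMARY KEY, notes TEXT);
CREATE TABLE contacts (
    id           INTEGER PRIMARY KEY,
    client_id    INTEGER NOT NULL REFERENCES clients(id),  -- one client, many contacts
    address      TEXT,
    phone_number TEXT
);
INSERT INTO clients VALUES (1, 'first client'), (2, 'second client');
INSERT INTO contacts (client_id, address, phone_number) VALUES
    (1, '1 Main St',  '555-0100'),
    (1, '2 Side St',  '555-0101'),
    (2, '9 Other Rd', '555-0199');
""")

# Every client paired with each of its addresses; no junction table involved.
rows = conn.execute(
    """
    SELECT c.id, k.address
    FROM clients c
    JOIN contacts k ON k.client_id = c.id
    ORDER BY c.id, k.id
    """
).fetchall()
```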
We have a requirement in our application where we need to store references for later access.
Example: when a user commits an invoice, everything the invoice refers to (customer address, calculated amounts, product descriptions), along with its calculations, should be preserved as it was at that time.
We need to hold the references somehow, but what if, for example, the product name changes? So somehow we need to copy everything so that it is documented for later and not affected by future changes. Even when products are deleted, they need to be reviewable later when the stored invoice is viewed.
What is the best practice here regarding database design? And what is the most flexible approach, e.g. when the user wants to edit his invoice later and restore it from the DB?
Thank you!
Here is one way to do it:
Essentially, we never modify or delete the existing data. We "modify" it by creating a new version. We "delete" it by setting the DELETED flag.
For example:
If product changes the price, we insert a new row into PRODUCT_VERSION while old orders are kept connected to the old PRODUCT_VERSION and the old price.
When buyer changes the address, we simply insert a new row in CUSTOMER_VERSION and link new orders to that, while keeping the old orders linked to the old version.
If product is deleted, we don't really delete it - we simply set the PRODUCT.DELETED flag, so all the orders historically made for that product stay in the database.
If customer is deleted (e.g. because (s)he requested to be unregistered), set the CUSTOMER.DELETED flag.
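The product side of this versioning scheme might be sketched as follows with Python's sqlite3 module. PRODUCT, PRODUCT_VERSION and the DELETED flag come from the description above; the `order_line` table, the column names, the integer-cent prices and the helper `add_version` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE product (
    product_id INTEGER PRIMARY KEY,
    deleted    INTEGER NOT NULL DEFAULT 0     -- soft-delete flag
);
CREATE TABLE product_version (
    product_id  INTEGER NOT NULL REFERENCES product(product_id),
    version_no  INTEGER NOT NULL,
    name        TEXT NOT NULL,
    price_cents INTEGER NOT NULL,
    PRIMARY KEY (product_id, version_no)
);
CREATE TABLE order_line (                     -- assumed table for illustration
    order_id   INTEGER NOT NULL,
    product_id INTEGER NOT NULL,
    version_no INTEGER NOT NULL,
    quantity   INTEGER NOT NULL,
    FOREIGN KEY (product_id, version_no)
        REFERENCES product_version(product_id, version_no)
);
""")


def add_version(product_id, name, price_cents):
    """'Modify' a product by inserting its next version; old rows are untouched."""
    next_no = conn.execute(
        "SELECT COALESCE(MAX(version_no), 0) + 1 FROM product_version WHERE product_id = ?",
        (product_id,),
    ).fetchone()[0]
    conn.execute(
        "INSERT INTO product_version VALUES (?, ?, ?, ?)",
        (product_id, next_no, name, price_cents),
    )
    return next_no


conn.execute("INSERT INTO product (product_id) VALUES (1)")
v1 = add_version(1, "Widget", 1000)
conn.execute("INSERT INTO order_line VALUES (100, 1, ?, 3)", (v1,))  # order placed at v1
v2 = add_version(1, "Widget Pro", 1200)                              # name and price change
conn.execute("UPDATE product SET deleted = 1 WHERE product_id = 1")  # "delete" the product

# The old order still resolves to the version it was placed against.
old_order = conn.execute(
    """
    SELECT v.name, v.price_cents
    FROM order_line o
    JOIN product_version v
      ON v.product_id = o.product_id AND v.version_no = o.version_no
    WHERE o.order_id = 100
    """
).fetchone()
```

Customers would follow the same pattern with a CUSTOMER table and a CUSTOMER_VERSION table.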
Caveats:
If the product name needs to be unique, that can't be enforced declaratively in the model above. You'll either need to "promote" the NAME from PRODUCT_VERSION to PRODUCT, make it a key there and give up the ability to "evolve" the product's name, or enforce uniqueness on only the latest PRODUCT_VERSION (probably through triggers).
There is a potential problem with the customer's privacy. If a customer is deleted from the system, it may be desirable to physically remove its data from the database and just setting CUSTOMER.DELETED won't do that. If that's a concern, either blank-out the privacy-sensitive data in all the customer's versions, or alternatively disconnect existing orders from the real customer and reconnect them to a special "anonymous" customer, then physically delete all the customer versions.
This model uses a lot of identifying relationships. This leads to "fat" foreign keys and could be a bit of a storage problem since MySQL doesn't support leading-edge index compression (unlike, say, Oracle), but on the other hand InnoDB always clusters the data on PK and this clustering can be beneficial for performance. Also, JOINs are less necessary.
Equivalent model with non-identifying relationships and surrogate keys would look like this:
You could add a column in the product table indicating whether or not it is being sold. Then when the product is "deleted" you just set the flag so that it is no longer available as a new product, but you retain the data for future lookups.
To deal with name changes, you should be using IDs to refer to products rather than using the name directly.
You've opened up an eternal debate between the purist and practical approach.
From a normalization standpoint of your database, you "should" keep all the relevant data. In other words, say a product name changes, save the date of the change so that you could go back in time and rebuild your invoice with that product name, and all other data as it existed that day.
A "de"normalized approach is to view that invoice as a "moment in time", recording in the relevant tables the data as it actually was that day. This approach lets you pull up that invoice without any dependencies at all, but you could never recreate that invoice from scratch.
The problem you're facing is, as I'm sure you know, a result of database normalization. One of the approaches to resolve this can be taken from Business Intelligence techniques: archiving the data in a de-normalized state in a data warehouse.
Normalized data:
Orders table
- OrderId
- CustomerId
Customers table
- CustomerId
- Firstname
- etc
Items table
- ItemId
- Itemname
- ItemPrice
OrderDetails table
- ItemDetailId
- OrderId
- ItemId
- ItemQty
- etc
When queried and stored de-normalized, the data warehouse table looks like:
- OrderId
- CustomerId
- CustomerName
- CustomerAddress
- (other Customer fields)
- ItemDetailId
- ItemId
- ItemName
- ItemPrice
- (other OrderDetail and Item fields)
Typically, there is either some sort of scheduled job that pulls data from the normalized tables into the data warehouse, or, if your design allows, it could be done when an order reaches a certain status (such as shipped). It could be that the records are stored at each change of status (with a field called OrderStatus tracking the current status), so the fully de-normalized data is available for each step of the order/fulfillment process. When and how to archive the data into the warehouse will vary based on your needs.
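A sketch of the archiving step, using Python's sqlite3 module: one INSERT ... SELECT flattens the normalized rows into a warehouse table. The snake_case names, the `order_archive` table, the sample data and the helper `archive_order` are assumptions for illustration; when to call it (a scheduled job or a status change) is left open.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT, address TEXT);
CREATE TABLE items (item_id INTEGER PRIMARY KEY, item_name TEXT, item_price_cents INTEGER);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_status TEXT);
CREATE TABLE order_details (
    item_detail_id INTEGER PRIMARY KEY, order_id INTEGER, item_id INTEGER, item_qty INTEGER
);
CREATE TABLE order_archive (              -- de-normalized copy, no foreign keys
    order_id INTEGER, order_status TEXT,
    customer_id INTEGER, customer_name TEXT, customer_address TEXT,
    item_detail_id INTEGER, item_id INTEGER, item_name TEXT,
    item_price_cents INTEGER, item_qty INTEGER
);
INSERT INTO customers VALUES (7, 'Pat', '1 Main St');
INSERT INTO items VALUES (3, 'Widget', 1000);
INSERT INTO orders VALUES (1, 7, 'shipped');
INSERT INTO order_details VALUES (50, 1, 3, 2);
""")


def archive_order(order_id):
    """Copy the order, as it looks right now, into the flat archive table."""
    conn.execute(
        """
        INSERT INTO order_archive
        SELECT o.order_id, o.order_status,
               c.customer_id, c.name, c.address,
               d.item_detail_id, i.item_id, i.item_name,
               i.item_price_cents, d.item_qty
        FROM orders o
        JOIN customers c     ON c.customer_id = o.customer_id
        JOIN order_details d ON d.order_id = o.order_id
        JOIN items i         ON i.item_id = d.item_id
        WHERE o.order_id = ?
        """,
        (order_id,),
    )


archive_order(1)
# Later changes to the live tables do not touch the archived row.
conn.execute("UPDATE items SET item_name = 'Widget v2', item_price_cents = 1500 WHERE item_id = 3")
archived = conn.execute(
    "SELECT customer_name, item_name, item_price_cents, item_qty FROM order_archive"
).fetchall()
```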
There is a lot of overhead involved in the above, but the other common approach I'm aware of carries even MORE overhead.
The other approach would be to make the tables read-only. If a customer wants to change their address, you don't edit their existing address, you insert a new record.
So if my address is AddressId 12 when I first order on your site in January, then I move on July 4, I get a new AddressId tied to my account. (Say AddressId 123123 because your site is very successful and has attracted a ton of customers.)
Orders I placed before July 4 would have AddressId 12 associated with them, and orders placed on or after July 4 have AddressId 123123.
Repeat that pattern with every table that needs to retain historical data.
I do have a third approach, but searching it is difficult. I use this in one app only, and it actually works out pretty well in this single instance, which had some pretty specific business needs for reconstructing the data exactly as it was at a specific point in time. I wouldn't use it unless I had similar business needs.
At a specific status, serialize the data into an XML document, or some other document you can use to reconstruct the data. This allows you to save the data as it was at the time it was serialized, retaining the original table structure and relations.
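As a rough illustration of the serialization idea, the sketch below stores the snapshot as a JSON document rather than XML, using Python's json and sqlite3 modules. The `invoice_snapshot` table, the shape of the invoice dictionary and both helpers are assumptions, not the answerer's actual implementation.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE invoice_snapshot (
        invoice_id INTEGER PRIMARY KEY,
        taken_at   TEXT NOT NULL,
        document   TEXT NOT NULL      -- the serialized invoice
    )
    """
)


def save_snapshot(invoice_id, taken_at, invoice):
    """Freeze the invoice, with all related data nested inside it, as one document."""
    conn.execute(
        "INSERT INTO invoice_snapshot VALUES (?, ?, ?)",
        (invoice_id, taken_at, json.dumps(invoice, sort_keys=True)),
    )


def load_snapshot(invoice_id):
    """Rebuild the invoice exactly as it was when it was saved, or None if absent."""
    row = conn.execute(
        "SELECT document FROM invoice_snapshot WHERE invoice_id = ?",
        (invoice_id,),
    ).fetchone()
    return json.loads(row[0]) if row else None


invoice = {
    "customer": {"name": "Pat", "address": "1 Main St"},
    "lines": [{"product": "Widget", "price_cents": 1000, "qty": 2}],
    "total_cents": 2000,
}
save_snapshot(1, "2024-05-01", invoice)
restored = load_snapshot(1)
```

As the answer notes, the trade-off is that searching inside such documents is harder than querying ordinary columns.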
When you have time-sensitive data, you use things like the product and Customer tables as lookup tables and store the information directly in your Orders/orderdetails tables.
So the order table might contain the customer name and address, and the details would contain all relevant information about the product, especially the price (you never want to rely on the product table for price information beyond the initial lookup at the time of the order).
This is NOT denormalizing. The data changes over time but you need the historical value, so you must store it at the time the record is created or you will lose data integrity. You don't want your financial reports to suddenly indicate you sold 30% more last year because you have had price updates. That's not what you sold.
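A small sketch of that point, using Python's sqlite3 module: the price is looked up once when the order line is created and copied into the detail row, so a later price update leaves past revenue unchanged. The table names, column names and helpers are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT, price_cents INTEGER);
CREATE TABLE order_detail (
    order_id         INTEGER NOT NULL,
    product_id       INTEGER NOT NULL,
    product_name     TEXT NOT NULL,      -- copied at order time
    unit_price_cents INTEGER NOT NULL,   -- copied at order time
    quantity         INTEGER NOT NULL
);
INSERT INTO product VALUES (1, 'Widget', 1000);
""")


def add_order_line(order_id, product_id, quantity):
    """Use the product table as a lookup only, then store the values on the line."""
    name, price = conn.execute(
        "SELECT name, price_cents FROM product WHERE product_id = ?",
        (product_id,),
    ).fetchone()
    conn.execute(
        "INSERT INTO order_detail VALUES (?, ?, ?, ?, ?)",
        (order_id, product_id, name, price, quantity),
    )


def revenue_cents():
    """Revenue is computed from what was actually charged, not from current prices."""
    return conn.execute(
        "SELECT COALESCE(SUM(unit_price_cents * quantity), 0) FROM order_detail"
    ).fetchone()[0]


add_order_line(1, 1, 3)
revenue_before = revenue_cents()
conn.execute("UPDATE product SET price_cents = 1300 WHERE product_id = 1")  # 30% price rise
revenue_after = revenue_cents()
```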