MySQL Database Layout - mysql

I have created the following database layout, and started coding the application. The more I read, the more I realize my database layout is probably incorrect / inefficient / a bad idea. Before I develop too much code using this layout, I want to make sure I am doing it "correctly".
Basically I have a list of ~2000 stores, and a list of ~50 promotional codes. I need to store whether or not each code is valid at each store. Right now I chose to store each store number as a column header, with the first column containing all of the different possible codes. Here's an image of part of the table so far (1 represents the code being valid, 0 invalid at that store).
The promotional codes will change relatively frequently, but the store numbers should be relatively static, and not change very much.
This is my first time creating a database from scratch like this, and I am a beginner at using mysql, so any help is much appreciated!

You would be better off using a table for your stores. If you don't, then every time a store is added you'll spend a lot of time adding a new field.
Here is what I would do:
The table store will contain your ~2000 stores:
id
name
The table code will contain your ~50 codes:
id
name
The table code_store will contain only the valid code IDs and the related store IDs (no need to save the invalid ones, I guess):
code_id
store_id

This type of relation is called many-to-many. I typically have three tables for this type of situation: one table for the stores, one for the promo codes, and a third relational table with two columns, the store id and the promo id.
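As a rough illustration of the three-table layout described above, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL so it runs without a server; the sample store names, code names, and the is_valid helper are hypothetical and only there for the demo.

```python
import sqlite3

# SQLite stands in for MySQL here; the table layout is the point, not the engine.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE store (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE code  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- One row per (code, store) pair where the code is valid; no row means invalid.
CREATE TABLE code_store (
    code_id  INTEGER NOT NULL REFERENCES code(id),
    store_id INTEGER NOT NULL REFERENCES store(id),
    PRIMARY KEY (code_id, store_id)
);
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO store (id, name) VALUES (?, ?)",
                 [(1, "Store 0001"), (2, "Store 0002")])
conn.executemany("INSERT INTO code (id, name) VALUES (?, ?)",
                 [(1, "SPRING10"), (2, "FREESHIP")])
conn.executemany("INSERT INTO code_store (code_id, store_id) VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 1)])


def is_valid(code_name, store_id):
    """Return True if the named code has a row for the given store."""
    row = conn.execute(
        """SELECT 1
           FROM code_store cs
           JOIN code c ON c.id = cs.code_id
           WHERE c.name = ? AND cs.store_id = ?""",
        (code_name, store_id),
    ).fetchone()
    return row is not None
```

With this layout, adding a store or a promotional code is an INSERT rather than a schema change, and the composite primary key on code_store prevents duplicate pairs.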

Related

Need help starting simple MySQL database using data from Excel

I'm an intern and I've been tasked with something I'm pretty unfamiliar with. My manager has asked me to create a simple MySQL database using data from one or more Excel files, and I have no idea where to start. I would normally ask someone here for help, but everyone seems to be really busy. Basically, the purpose of the database is to see how different object groups relate to one another so as to keep things standardized. I'm trying not to go into detail about things that aren't really relevant.
I was asked to first design a schema for the database and then I would get an update on how to implement it. Would I just start by writing queries to create tables? I'm assuming I would need to convert the Excel files to .csv, how do I read this data and send it to the correct table based on Object Type (an attribute of each object, represented in a column)?
I don't want to ask too much right now, but if someone could help me understand what I need to do to get started I would really appreciate it.
Look at the column headers in your spreadsheet.
Decide which columns relate to Objects and which columns relate to Groups
The columns that relate to just Objects will become your field names for the Object table. Give this table an ID field so you can uniquely identify each Object.
The columns that relate to the Groups will become field names for a Group table. Give this table an ID field so you can uniquely identify each Group.
Think about whether an Object can be in more than one Group. If so, you will probably need an Object-Group table. This table would most likely contain an ObjectID and a GroupID.
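The steps above could look roughly like the following sketch. It assumes the spreadsheet was exported to CSV with three hypothetical columns (object_name, object_type, group_name) and uses Python's sqlite3 module in place of MySQL so it can run standalone; the real column names will come from your own file.

```python
import csv
import io
import sqlite3

# Hypothetical CSV export of the spreadsheet; real headers will differ.
CSV_TEXT = """object_name,object_type,group_name
Pump A,Pump,Cooling
Pump A,Pump,Maintenance
Valve B,Valve,Cooling
"""

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE object (
    id          INTEGER PRIMARY KEY,
    name        TEXT UNIQUE NOT NULL,
    object_type TEXT
);
-- "group" is a reserved word in SQL, so the table is called grp here.
CREATE TABLE grp (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE object_group (
    object_id INTEGER NOT NULL REFERENCES object(id),
    group_id  INTEGER NOT NULL REFERENCES grp(id),
    PRIMARY KEY (object_id, group_id)
);
""")

for row in csv.DictReader(io.StringIO(CSV_TEXT)):
    # INSERT OR IGNORE is SQLite syntax; MySQL spells it INSERT IGNORE.
    conn.execute("INSERT OR IGNORE INTO object (name, object_type) VALUES (?, ?)",
                 (row["object_name"], row["object_type"]))
    conn.execute("INSERT OR IGNORE INTO grp (name) VALUES (?)",
                 (row["group_name"],))
    # Link the object to the group by looking up both generated ids.
    conn.execute(
        """INSERT OR IGNORE INTO object_group (object_id, group_id)
           SELECT o.id, g.id FROM object o, grp g
           WHERE o.name = ? AND g.name = ?""",
        (row["object_name"], row["group_name"]))

groups_of_pump_a = [r[0] for r in conn.execute(
    """SELECT g.name
       FROM grp g
       JOIN object_group og ON og.group_id = g.id
       JOIN object o ON o.id = og.object_id
       WHERE o.name = ?
       ORDER BY g.name""", ("Pump A",))]
```

Each CSV row contributes at most one new object, one new group, and one link between them, so repeated objects in the spreadsheet do not create duplicate rows.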

Redshift Usage - 1 row by 400 columns per user or (20-400) rows by 4 columns per user

We are building an analytics engine which has to store attribute preference score for each user. We are expecting 400 attributes and they may change(at what frequency is not known as yet). We are planning to store this in Redshift.
My question is:
Should we store 1 row per user with 400 columns (1 column for each attribute),
or should we go for a table structure like
(uid, attribute id, attribute value, preference score), which will be (20-400) rows by 4 columns per user?
Which kind of storage would lead to better performance in Redshift?
Should we really consider NoSQL for this?
Note:
1. This is a backend for a real-time application with an increasing number of users.
2. For processing, the above table has to be read with the entire information of all attributes for one user, i.e. we indirectly create a 1*400 matrix at runtime.
Please help me decide which design would be ideal for such a use case. Thank you.
You can use tables like the ones given in this example and then use bitwise functions:
http://docs.aws.amazon.com/redshift/latest/dg/r_bitwise_examples.html
For your problem, I would suggest a two-table design. It's more pain in the beginning but will help in the future.
The first table would be a key-value table that stores all the base data. It is reasonably future-proof: you can add or remove attributes and this table will continue working.
The second table would be an N-column table (N = 400 in your case), which you can build from the first table. You can start it with a bare minimum set of columns, let's say only 50 out of those 400, so that querying this table is really fast. The structure of this table can be refreshed periodically to match the current reporting requirements. You will also always have the base table in case you need to backfill any data.
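A minimal sketch of that two-table idea follows, with Python's sqlite3 standing in for Redshift. The table names, the attr_N column naming, and the REPORTED_ATTRIBUTES subset are assumptions made for the example, and the attribute value column from the question is left out for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Base key-value table: one row per (user, attribute). Attributes can be
# added or removed without changing this schema.
conn.execute("""
CREATE TABLE user_attribute (
    uid              INTEGER NOT NULL,
    attribute_id     INTEGER NOT NULL,
    preference_score REAL NOT NULL,
    PRIMARY KEY (uid, attribute_id)
)""")
conn.executemany("INSERT INTO user_attribute VALUES (?, ?, ?)",
                 [(1, 1, 0.9), (1, 2, 0.1), (2, 1, 0.4), (2, 3, 0.7)])

# Hypothetical subset of attributes the wide table currently exposes.
REPORTED_ATTRIBUTES = [1, 2]


def rebuild_wide_table(conn, attribute_ids):
    """Drop and rebuild the wide table from the base table (a pivot)."""
    # int() guards the generated SQL, since ids are interpolated into it.
    cols = ", ".join(
        f"MAX(CASE WHEN attribute_id = {int(a)} THEN preference_score END) "
        f"AS attr_{int(a)}"
        for a in attribute_ids)
    conn.execute("DROP TABLE IF EXISTS user_attribute_wide")
    conn.execute(
        f"CREATE TABLE user_attribute_wide AS "
        f"SELECT uid, {cols} FROM user_attribute GROUP BY uid")


rebuild_wide_table(conn, REPORTED_ATTRIBUTES)
wide_rows = conn.execute(
    "SELECT uid, attr_1, attr_2 FROM user_attribute_wide ORDER BY uid"
).fetchall()
```

Users with no score for an exposed attribute get NULL in that column, and changing the reported set only requires calling rebuild_wide_table again with a different list.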

Best practices for storing data from hundreds of fields

I have a form with 500+ fields (it's a 10-page form with different data types). Can you please advise me on the best way to store the data from the form? I could create 500 fields in multiple, logically divided tables, but that seems like a lot (or maybe that's the best way?), since I have a few of these forms. I am looking into serializing the data and storing it in a MySQL LONGTEXT field. That will have its drawbacks (the one I am thinking of is if the customer wants to search individual fields in the future), but it does seem like a pretty fast solution. I would appreciate it if you would share your experience with a similar situation.
Presumably you don't expect the user to fill in the form in a single sitting, so you will need some sort of workflow to store drafts, amend previous copies, etc.
I am also assuming some parts of the form are optional.
You could either define a set of database tables, with a master table to track status, user name, etc., and a child table for each optional part of the form,
or you could define an XML schema that contains all the possible fields in the form plus some status information.
If you always process the entire form and don't want to search through your collection of forms, then the XML solution is slightly better, as there are some nifty tricks for moving data from XML to HTML forms and back again. If you need to search based on values inside the form, then the SQL-based solution is preferable.
You may need 500 columns - unless they can be placed in other tables. It can be hard to tell without seeing your requirements.
Serialising it would make one of the advantages of using a database impossible - querying against certain column values.
-- One row per (user, field) pair instead of one column per field.
create table profile_details (
user_id int not null,
field_name varchar(64) not null,
field_value varchar(255)
);
Now you are not only free of any limit on the number of fields, you are also pretty free to add and remove them as you keep developing and maintaining your app.
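-- Each condition on a different field needs its own join to profile_details.
select p.firstname, p.lastname, p.zipcode
from profiles p
join profile_details d1 on (p.user_id=d1.user_id)
join profile_details d2 on (p.user_id=d2.user_id)
where d1.field_name='hobby' and d1.field_value='fishing'
-- field_value is stored as a string, so cast it before comparing numerically
and d2.field_name='income' and cast(d2.field_value as unsigned)>250000;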
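To see the field-per-row approach end to end, here is a small runnable sketch using Python's sqlite3 in place of MySQL. The profiles table and all sample rows are hypothetical, and SQLite's CAST(... AS INTEGER) is used where MySQL would use CAST(... AS UNSIGNED).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE profiles (
    user_id   INTEGER PRIMARY KEY,
    firstname TEXT,
    lastname  TEXT,
    zipcode   TEXT
);
-- One row per (user, field); every value is stored as text.
CREATE TABLE profile_details (
    user_id     INTEGER NOT NULL REFERENCES profiles(user_id),
    field_name  TEXT NOT NULL,
    field_value TEXT,
    PRIMARY KEY (user_id, field_name)
);
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO profiles VALUES (?, ?, ?, ?)",
                 [(1, "Ann", "Lee", "10001"), (2, "Bob", "Ray", "94105")])
conn.executemany("INSERT INTO profile_details VALUES (?, ?, ?)",
                 [(1, "hobby", "fishing"), (1, "income", "300000"),
                  (2, "hobby", "fishing"), (2, "income", "90000")])

# One join per field being filtered on; the income value is cast because
# comparing it as text would order "90000" after "250000".
matches = conn.execute("""
SELECT p.firstname, p.lastname, p.zipcode
FROM profiles p
JOIN profile_details d1 ON p.user_id = d1.user_id
JOIN profile_details d2 ON p.user_id = d2.user_id
WHERE d1.field_name = 'hobby'  AND d1.field_value = 'fishing'
  AND d2.field_name = 'income' AND CAST(d2.field_value AS INTEGER) > 250000
""").fetchall()
```

The trade-off is visible in the query: the schema never changes when fields are added, but every additional filter costs another self-join and typed comparisons need explicit casts.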

What is the best dynamic column solution for advertisement webpage?

I'm developing a website that will host categorized advertisements. Each category will have different input fields (for example, a car will have an engine size, a cat will have a breed). So I'm thinking about how to build a database to manage this (I will use a MySQL database). One way is shown in the attached picture. I know another solution is to create a table for each value data type, but I wonder whether that would slow down the website. The solution in the picture will also generate empty fields in the sp_advertisement_value table, which isn't good either.
What is, in your opinion, the best solution? Maybe there is something else?
p.s. Here is a link to database market.
You can store it as name/value pairs (more or less the same as what is described in the image you attached).
A simple schema would be a table with two columns, name and value. Instead of having a column for each data type, like value_int, value_string, etc., have one single value column whose data type is varchar (or text, as seems fit to you). You can do all the data conversion in your application code as per your needs.
You can do some normalization here too. For instance, instead of saving the name, you can make a separate lookup table named parameters that has id, name, and other related information, and keep the parameter_id in the table where you are storing the parameter values.
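A minimal sketch of the name/value layout with a parameters lookup table is shown below, again using Python's sqlite3 in place of MySQL. The data_type column, the advertisement_value table name, the CONVERTERS mapping, and the sample rows are all assumptions added to illustrate doing the conversion in application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Lookup table for parameter names; data_type is an assumed extra column
-- that tells the application how to interpret the stored text.
CREATE TABLE parameters (
    id        INTEGER PRIMARY KEY,
    name      TEXT UNIQUE NOT NULL,
    data_type TEXT NOT NULL
);
-- A single text value column instead of value_int, value_string, etc.
CREATE TABLE advertisement_value (
    advertisement_id INTEGER NOT NULL,
    parameter_id     INTEGER NOT NULL REFERENCES parameters(id),
    value            TEXT NOT NULL,
    PRIMARY KEY (advertisement_id, parameter_id)
);
""")

# Hypothetical sample data: ad 10 is a car, ad 11 is a cat.
conn.executemany("INSERT INTO parameters VALUES (?, ?, ?)",
                 [(1, "engine_size_cc", "int"), (2, "breed", "str")])
conn.executemany("INSERT INTO advertisement_value VALUES (?, ?, ?)",
                 [(10, 1, "1600"), (11, 2, "Siamese")])

# Application-side conversion from stored text to Python types.
CONVERTERS = {"int": int, "float": float, "str": str}


def load_advertisement(ad_id):
    """Return the ad's parameters as a dict with converted values."""
    rows = conn.execute(
        """SELECT p.name, p.data_type, v.value
           FROM advertisement_value v
           JOIN parameters p ON p.id = v.parameter_id
           WHERE v.advertisement_id = ?""", (ad_id,))
    return {name: CONVERTERS[data_type](value)
            for name, data_type, value in rows}
```

Because only parameters that actually apply to an advertisement get a row, this layout avoids the empty fields mentioned in the question.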

Shall I put contact information in a separate table?

I'm planning a database that has a couple of tables containing plenty of address information: city, zip code, email address, phone #, fax #, and so on (about 11 columns' worth). One of them is an organizations table containing up to 2 addresses (the legal contact and the contact that should actually be used), and every user has the same information tied to them.
We are going to have to run some geolocation stuff on those addresses too (like every address that's within X Kilometers from another address).
I have a bunch of options, each with its own problem:
1. I could put all the information inside every table, but that would make for tables with a very large number of columns, which I'd have problems indexing, and if I change my address format it will take a while to fix.
2. I could put all the information inside an array and serialize it, then store the serialized information in one field. This has the same problems as the previous method, with fewer columns and much less availability through MySQL queries.
3. I could create a separate table with address information and link it to the other tables by either
3.1 putting an address_id column in the users and organizations tables, or
3.2 putting related_id and related_table columns in the addresses table.
That should keep stuff tidier, but it might create some unforeseen problems with excessive joining or whatever.
Personally I think that solution 3.2 is the best, but I'm not too confident about it, so I'm asking for opinions.
Option 2 is definitely out, as it would put the filtering logic into your code instead of letting the DBMS handle it.
Whether to pick option 1 or 3 depends on your needs.
If you need fast access to all the data, and you usually access both addresses along with the organization information, then you might consider option 1. But querying it will become difficult (i.e. slow) if the table gets too big in MySQL.
Option 3 is good provided you index the tables correctly.
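For reference, here is one way the separate-table option could be sketched, using the address_id variant so the links can be real foreign keys. Python's sqlite3 stands in for MySQL; the reduced column list, the legal_address_id and contact_address_id names, the index names, and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Only a few of the ~11 address columns are shown.
CREATE TABLE addresses (
    id       INTEGER PRIMARY KEY,
    city     TEXT,
    zip_code TEXT,
    email    TEXT,
    phone    TEXT
);
CREATE TABLE organizations (
    id                 INTEGER PRIMARY KEY,
    name               TEXT NOT NULL,
    legal_address_id   INTEGER REFERENCES addresses(id),
    contact_address_id INTEGER REFERENCES addresses(id)
);
CREATE TABLE users (
    id         INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    address_id INTEGER REFERENCES addresses(id)
);
-- Index the linking columns so lookups by address stay cheap.
CREATE INDEX idx_org_legal_address   ON organizations(legal_address_id);
CREATE INDEX idx_org_contact_address ON organizations(contact_address_id);
CREATE INDEX idx_users_address       ON users(address_id);
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO addresses (id, city, zip_code) VALUES (?, ?, ?)",
                 [(1, "Berlin", "10115"), (2, "Hamburg", "20095")])
conn.execute("INSERT INTO organizations VALUES (?, ?, ?, ?)",
             (1, "Acme", 1, 2))
conn.execute("INSERT INTO users VALUES (?, ?, ?)", (1, "Ann", 2))

# Fetch an organization together with both of its addresses.
org_row = conn.execute(
    """SELECT o.name, legal.city, contact.city
       FROM organizations o
       LEFT JOIN addresses legal   ON legal.id   = o.legal_address_id
       LEFT JOIN addresses contact ON contact.id = o.contact_address_id
       WHERE o.id = ?""", (1,)).fetchone()
```

With all addresses in one table, a change to the address format touches a single schema, and any geolocation query only has to scan the addresses table.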