I have a table with a field that requires 3 letters and 3 numbers (which have to be between the values 2000 and 7000).
I've been reading around, and I'm still not sure which is the better way to handle this: a simple data type such as char(6), or two separate fields, one containing only the 3 letters and another containing the 3 numbers, with a check constraint to ensure that the values of that field are between 2000 and 7000.
Any help you can offer would be appreciated. Thanks in advance.
You may have to be more specific about the requirements, but it sounds to me like a single column is the best option -- especially if order matters. If the letters and numbers have meanings separately, then they should be in two columns; otherwise you'll just end up having to concatenate them again.
char(6) is fine as long as you know it will always be exactly 6 characters. You can't enforce a limit as specific as 2000 to 7000 at the column level anyway (and isn't that 4 digits, not 3?)
Every field should represent an attribute of the entities the table holds. In other words, if these three letters and three numbers represent different attributes, they should be in separate fields; otherwise (e.g. if together they form a serial number) you can keep them in one field.
Another approach is to think of a possible use case, like: am I ever going to query on the second number alone? If the answer is yes, they should be in separate fields; otherwise they represent one attribute and belong in one field.
Hope it helps.
If the value is "one" value, use one column, say char(6), but...
Here's a surprising fact: MySQL doesn't enforce CHECK constraints (at least not before version 8.0.16)!
MySQL allows CHECK constraints to be defined, however they are completely ignored and accepted only for compatibility with SQL from other databases.
If you want to enforce a format, you'll need to use a trigger; older MySQL versions can't raise exceptions from a trigger (SIGNAL only arrived in 5.5), so you may have to use a workaround.
The best option is probably to use app code for validation.
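Since the database itself won't enforce the rule, a minimal sketch of that app-level validation might look like this (Python for illustration; the exact format is an assumption, 3 letters followed by the number, and since 2000-7000 implies 4 digits rather than 3, a 4-digit number is assumed here):

```python
import re

def is_valid_code(code):
    """App-level validation for the code field.

    Assumed (hypothetical) format: 3 letters followed by a number
    between 2000 and 7000. Note that this range needs 4 digits,
    not 3 as originally stated.
    """
    m = re.fullmatch(r"[A-Za-z]{3}(\d{4})", code)
    if m is None:
        return False
    return 2000 <= int(m.group(1)) <= 7000

print(is_valid_code("ABC2500"))   # True
print(is_valid_code("ABC1999"))   # False: number out of range
print(is_valid_code("AB12500"))   # False: wrong format
```

Run this check before the INSERT; the column then only ever receives values the application has already vetted.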
I'm trying to understand why this happens, but I couldn't find anything about it on the internet.
I have a table of meds (called Medicamento) which has 23600 rows in it.
When I try to fetch a row via the IdMed column, it only finds values with fewer than 6 digits. Example 1:
SELECT * FROM `Medicamento` WHERE IdMed=100
Example 2:
SELECT * FROM `Medicamento` WHERE IdMed=200703
At this point I thought that the med with that Id simply didn't exist, so I ran this last query, which left me not knowing where the mistake is:
SELECT * FROM `Medicamento` WHERE IdMed>200702
Result:
As you can see, the first row is the one with Id 200703. What I cannot understand is why it finds rows with IDs such as 12700 or 100 but not rows whose IDs have 6 digits. I thought it could be a matter of formats, but I didn't find anything helpful.
The table's data was imported from 2 different .xlsx files, which is why I suspected the formats.
PS: Sorry for my bad English. I hope the problem is understandable.
EDIT:
Table data types
In a nutshell, what's happening is that your value is losing precision because you're using an inexact data type. float is for floating-point numbers and shouldn't normally be used as a primary key, so your best bet is to change this column to an integer type. By the looks of the comments that may not be viable, in which case you're probably best off creating another column and using THAT as the primary key instead. What's likely happening with 200703, for example, is that it's stored in the database as something like 200703.000001 or 200702.99999, and you're searching for a value that doesn't exactly match what the database holds.
As a suggestion, you may want to change the current float column to a double column instead, to retain a little more precision beyond the decimal point.
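To see the effect outside MySQL, here is a small self-contained Python illustration of what a 4-byte FLOAT does to values like these (the exact stored values in the questioner's table are unknown; these are representative examples):

```python
import struct

def as_float32(x):
    """Round-trip a value through a 4-byte IEEE 754 float,
    which is how a MySQL FLOAT column stores it."""
    return struct.unpack('<f', struct.pack('<f', x))[0]

# A value that arrived from a spreadsheet with a stray fractional
# part never equals the integer you search for:
print(as_float32(200703.4) == 200703)                 # False

# And sufficiently large integer keys get silently rounded,
# colliding with their neighbours:
print(as_float32(16777217) == as_float32(16777216))   # True: both 16777216.0
```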
I am developing a small survey system in which all the form data is inserted at once (around 90 questions will be answered), and the data can be consulted later. With this in mind, I wanted to ask about the advantages of using a schema like this:
I have several questions in the survey that will have multiple checkboxes (some have 15+), and people will be able to select MULTIPLE of those checkboxes as their answer; I will store all their selected options in the DB. I'm achieving this by giving every checkbox input belonging to a question the same name attribute, like this: name="q_01[]". The "problem" here (not really a problem, more of a performance/storage optimization) is that I don't want to store the same values over and over. Say I had 20 checkboxes whose values (the value="Real Value" attribute in HTML) were something long like Strawberry, Something, etc.; I would be duplicating the same value over and over, wasting space.
Instead, I want to store an integer that maps to a table holding the real value. That way, I would only store 4 bytes instead of up to 255 characters as VARCHAR.
I have heard of such systems, but I have not built one myself, and I don't know what they are called. Could you point me in the right direction (an example, a video, or a page showing one)? What are these tables called? I know the basics of foreign keys and relational tables, so I know the answer lies somewhere in there.
Also, if you could include a hint of how I would query such tables, that would be awesome!
Thank you for your help in advance!
Cheers!
If a question has 15 possible answers which can be chosen independently of each other, then each possible answer becomes one column of your answer table, with column names like strawberry, raspberry, etc. The column name doesn't have to be represented in the database itself (although it will be present in MySQL's information schema columns table, https://dev.mysql.com/doc/refman/5.0/en/columns-table.html, which your software can read for advanced use cases). The columns taking the checkbox values should all be boolean or tinyint(1); they should contain neither the string strawberry nor a foreign key to a record containing strawberry.
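A minimal sketch of that layout, using Python's built-in sqlite3 in place of MySQL for a self-contained example (the column names are illustrative, not prescribed):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One boolean-ish column per possible answer; in MySQL these
# would be declared tinyint(1).
conn.execute("""
    CREATE TABLE q01_answers (
        respondent_id INTEGER PRIMARY KEY,
        strawberry    INTEGER NOT NULL DEFAULT 0,
        raspberry     INTEGER NOT NULL DEFAULT 0,
        blueberry     INTEGER NOT NULL DEFAULT 0
    )
""")
conn.execute("INSERT INTO q01_answers VALUES (1, 1, 0, 1)")
# Counting how many respondents ticked a given box is a plain SUM:
n = conn.execute("SELECT SUM(strawberry) FROM q01_answers").fetchone()[0]
print(n)   # 1
```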
See the SET datatype.
See SMALLINT UNSIGNED, it has 16 bits, each of which could indicate one answer.
More on the latter case: Best datatype to store a long number made of 0 and 1
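The bit-field idea can be sketched as follows (plain Python for illustration; the mapping of answers to bit positions 0-15 is an assumption the application would have to fix):

```python
def encode_answers(selected, n_options=16):
    """Pack a set of selected option indices (0..15) into one integer
    that fits a 16-bit SMALLINT UNSIGNED column."""
    mask = 0
    for i in selected:
        if not 0 <= i < n_options:
            raise ValueError(f"option index out of range: {i}")
        mask |= 1 << i
    return mask

def decode_answers(mask, n_options=16):
    """Recover the selected option indices from the stored integer."""
    return [i for i in range(n_options) if mask & (1 << i)]

m = encode_answers([0, 2, 5])
print(m)                   # 37 (binary 100101)
print(decode_answers(m))   # [0, 2, 5]
```

The trade-off is the one discussed in this thread: the column is very compact, but you can no longer query a single answer without bit arithmetic in SQL.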
I will be storing draw numbers (1-60) in fixed values and fixed order.
One draw has 4 numbers
another has 6 numbers
and another 2 numbers
My idea was to put each draw type in a separate SQL table. The question is: would it be better to store the numbers in a single column, separated by a delimiter...
ID(int) | numbers(varchar)
or store each number in a separate column instead?
ID(int) | num1(tinyint) | num2(tinyint) | num3(tinyint) | num4(tinyint)
I won't be needing to search for the numbers when they're stored.
If you don't ever need to search for them separately or retrieve them separately, then they are just one opaque "blob" from the database's perspective, and you won't be violating the principle of atomicity or 1NF by storing them in a single field.
But just because that's the case now doesn't mean it won't change in the future. So at least use the second option; it would also let the DBMS enforce domain integrity and ensure these are actually numbers and not just any strings.
However, to future-proof your data, I'd go even further and use the following structure:
In addition to treating the numbers uniformly and avoiding many NULLs, it will also let you easily vary the maximum count of numbers if that ever becomes necessary. I suspect querying will be easier in this structure too.
BTW, if there are no other fields and a draw cannot exist without at least one number, you can dispense with the DRAW table altogether and just use DRAW_NUMBER.
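The structure referred to above isn't shown here, but it is presumably along these lines (sketched with Python's sqlite3 so it can be run as-is; the table and column names are assumptions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE draw (
        draw_id   INTEGER PRIMARY KEY,
        draw_type TEXT NOT NULL            -- e.g. '4-number draw'
    );
    CREATE TABLE draw_number (
        draw_id  INTEGER NOT NULL REFERENCES draw(draw_id),
        position INTEGER NOT NULL,         -- preserves the fixed order
        number   INTEGER NOT NULL CHECK (number BETWEEN 1 AND 60),
        PRIMARY KEY (draw_id, position)
    );
""")
conn.execute("INSERT INTO draw VALUES (1, '4-number draw')")
conn.executemany("INSERT INTO draw_number VALUES (1, ?, ?)",
                 [(1, 7), (2, 23), (3, 42), (4, 58)])

# One draw, any count of numbers, retrieved in their fixed order:
numbers = [n for (n,) in conn.execute(
    "SELECT number FROM draw_number WHERE draw_id = 1 ORDER BY position")]
print(numbers)   # [7, 23, 42, 58]
```

One schema then serves the 2-, 4- and 6-number draws alike, instead of three tables with different column counts.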
separate columns (Database normalization)
If you don't need to search for the numbers (i.e. find which draw has a certain number), then I would store the numbers in the same field.
CLARIFICATION
He said it himself: he's just storing the data and doesn't need to perform any sort of operation on it. What that data is doesn't matter. It happens to be between 2 and 6 numbers, but that's irrelevant. There is no reason to put them in separate columns unless you need to for some reason.
What I would do is to use only one table, with three columns: id, draw_type, numbers
It's much easier to work with than 3 different tables with 3 to 7 columns each.
I've been inserting some numbers as INT UNSIGNED in a MySQL database. I search on this column using SELECT * FROM tablename WHERE A LIKE 'B'. I'm coming across some number formats that are either too long for an unsigned integer or have dashes in them, like 123-456-789.
What are some good options for modifying the table here? I see two options (are there others?):
Make another column (VARCHAR(50)) to store numbers with dashes. When a search query detects numbers with dashes, look in this new column.
Recreate the table using a VARCHAR(50) instead of unsigned integer for this column in question.
I'm not sure which way is better in terms of (a) database structure and (b) search speed. I'd love some input on this. Thank you.
Update: I guess I should have included more info.
These are order numbers. The numbers without dashes are for one store (A), and the ones with dashes are for Amazon (B; 13 or 14 digits, I think, with two dashes). A's order numbers should be sortable. I'm not sure if B's have to be, since those numbers don't really mean anything to me (just a unique number).
If I remove the dashes and store them all together as BIGINT, will there be any decrease in performance of the search queries?
The most important question is how you would like to use the data. What do you need? If you make it a varchar and then want to sort it as a number, you won't be able to, since it will be treated as a string.
You can always consider BIGINT, but the question is: do you need the dashes, or can you just ignore them at the application level? If you need them, you need varchar. In that case it might make sense to have two columns if you want to be able to, for example, sort them as numbers or perform calculations; otherwise one column probably makes more sense.
You should really provide more context about the problem.
MySQL has PROCEDURE ANALYSE (removed in MySQL 8.0), which suggests optimal data types based on your existing data.
Given that you are mainly running queries like WHERE A LIKE 'B', you can also try full-text search if the contents of "A" vary a lot.
I think option 2 makes the most sense. Just add a new column as varchar(50), copy everything from the int column into it, and drop the int column. Having 2 separate columns to maintain just isn't a good idea.
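A sketch of that migration, using sqlite3 so it runs standalone (in MySQL the conversion could instead be done in place with ALTER TABLE ... MODIFY; table and column names here are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_no INTEGER)")
conn.executemany("INSERT INTO orders (order_no) VALUES (?)",
                 [(12345,), (67890,)])

# Add the varchar-style column and copy the integers across;
# the old int column would then be dropped.
conn.execute("ALTER TABLE orders ADD COLUMN order_no_txt TEXT")
conn.execute("UPDATE orders SET order_no_txt = CAST(order_no AS TEXT)")

# Dashed Amazon-style numbers now fit in the same column:
conn.execute("INSERT INTO orders (order_no_txt) VALUES ('123-456-789')")
order_nos = [r[0] for r in conn.execute(
    "SELECT order_no_txt FROM orders ORDER BY id")]
print(order_nos)   # ['12345', '67890', '123-456-789']
```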
I have content of various kinds, each with an ID, and a piece of content can have multiple types.
The question is: should I use multiple rows to store the multiple types, or should I use a single type field, putting the types in it separated by commas and parsing them in PHP?
Multiple Rows
`content_id` | `type`
1 | 1
1 | 2
1 | 3
VS
Single Row
`content_id` | `type`
1 | 1,2,3
EDIT
I'm looking for the faster answer, not the easier one; please keep this in mind. Performance is really important to me, as I'm talking about a really huge database with millions or tens of millions of rows.
I'd generally always recommend the "multiple rows" approach as it has several advantages:
You can use SQL to return for example WHERE type=3 without any great difficulty as you don't have to use WHERE type LIKE '%3%', which is less efficient
If you ever need to store additional data against each content_id and type pair, you'll find it a lot easier in the multiple row version
You'll be able to apply one, or more, indexes to your table when it's stored in the "multiple row" format to improve the speed at which data is retrieved
It's easier to write a query to add/remove content_id and type pairs when each pair is stored separately than when they are stored as a comma-separated list
It'll (nearly) always be quicker to let SQL process the data to give you a subset than to pass it to PHP, or anything else, for processing
In general, let SQL do what it does best, which is allow you to store the data, and obtain subsets of the data.
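The first point above can be demonstrated directly (sqlite3 stands in for MySQL; the data is made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE content_type (content_id INTEGER, type INTEGER);
    CREATE TABLE content_csv  (content_id INTEGER, types TEXT);
""")
conn.executemany("INSERT INTO content_type VALUES (?, ?)",
                 [(1, 1), (1, 2), (1, 3), (2, 12)])
conn.executemany("INSERT INTO content_csv VALUES (?, ?)",
                 [(1, '1,2,3'), (2, '12')])

# Multiple rows: an exact, indexable match.
exact = conn.execute(
    "SELECT content_id FROM content_type WHERE type = 2").fetchall()
print(exact)    # [(1,)]

# Comma-separated list: the naive LIKE wrongly matches the '12' row too.
fuzzy = conn.execute(
    "SELECT content_id FROM content_csv WHERE types LIKE '%2%'").fetchall()
print(fuzzy)    # [(1,), (2,)]
```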
I always use multiple rows. If you use single rows your data is hard to read and you have to split it up once you grab it from the database.
Use multiple rows. That way, you can index that type column later, and search it faster if you need to in the future. Also it removes a dependency on your front-end language to do parsing on query results.
Normalised vs de-normalised design.
Usually I would recommend sticking to the "multiple rows" style (normalised),
although sometimes (for performance/storage reasons) people deliberately implement the "single row" style.
Have a look here:
http://www.databasedesign-resource.com/denormalization.html
The single row could be better in a few cases; reporting tends to be easier with some denormalization, which is the main example. So if your code is cleaner/performs better with the single row, then go for that. Otherwise, multiple rows would be the way to go.
Never, ever, ever cram multiple logical fields into a single field with comma separators.
The right way is to create multiple rows.
If there's some performance reason that demands you use a single row, at least make multiple fields in the row. But that said, there is almost never a good performance reason to do this. First make a good design.
Do you ever want to know all the records with, say, type=2? With multiple rows, this is easy: "select content_id from mytable where type=2". With the crammed field, you would have to say "select content_id from mytable where type like '%2%'". Oh, except what happens when there are more than 11 types? The above query would also find "12". Okay, you could say "where type like '%,2,%'", except that doesn't work if 2 is the first or the last in the list. Even if you came up with a way to do it reliably, a LIKE search with a leading % means a sequential read of every record in the table, which is very slow.
How big will you make the crammed field? What if the string of types grows too big to fit in your maximum?
Do you carry any data about the types? If you create a second table with a key of "type" and, say, a description of each type, how will you join to that table? With multiple rows, you could simply write "select content_id, type_id, description from content join type using (type_id)". With a crammed field ... not so easy.
If you add a new type, how do you keep the field consistent? Suppose it used to say "3,7,9" and now you add "5": can you just write "3,7,9,5", or do they have to be in order? If they're not in order, it's impossible to check for equality, because "1,2" and "2,1" will not look equal even though they are really equivalent. Either way, updating a type field now becomes a program rather than a single SQL statement.
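The equality problem in the last paragraph is easy to see (plain Python):

```python
def same_types(a, b):
    """Compare two comma-separated type lists as sets, since the raw
    strings can't be compared directly when order differs."""
    return set(a.split(',')) == set(b.split(','))

print('1,2' == '2,1')             # False: the raw strings differ
print(same_types('1,2', '2,1'))   # True: as sets they are equivalent
```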
If there is some trivial performance gain, it's just not worth it.