I've just been handed a database schema that looks a little odd to me. The database sits behind a Soap web service and I've noticed that all the table ID's are strings in hex format
eg: 0x1D283F
I'm going to have to duplicate the data into a MySQL database. I've never used Hex's as IDs, so I've no idea wether it's a good idea / bad idea or doesn't matter either way. I'm guessing Auto Increment wouldn't work here, which makes me think this is going to be a bad idea.
I could convert them to integers, or leave them as they are, but what are the implications.
First of all, whether a number is 'hexadecimal' or 'decimal' is just a matter of outer representation, it doesn't change the storage format. Using strings of numbers as database indices is obviously inefficient and may prevent certain db features and optimizations, though.
Therefore, convert the hex strings to plain integers and use them as database indices and you'll be fine.
Broadly speaking, you want your primary keys to be meaningless, and (obviously) unique. Hex is just one way of representing a number - underneath, it's still an integer. So, I'd convert to int before storing in the DB.
Bigger question is - are you sure they're unique? If not, you can't use them as your primary key....
It's a bad idea. SELECTs will be faster with integer keys. There might be advantages to using string keys in some instances, but using numeric values stored as strings as keys is just silly.
Related
I'm looking for some advice on the best way to store long strings of data from the mySQL experts.
I have a general purpose table which is used to store any kind of data, by which I mean it should be able to hold alphanumeric and numeric data.
Currently, the table structure is simple with an ID and the actual data stored in a single column as follows:
id INT(11)
data VARCHAR(128)
I now have a requirement to store a larger amount of data (up to 500 characters) and am wondering whether the best way would be to simply increase the varchar column size, or whether I should add a new column (a TEXT type column?) for the times I need to store longer strings.
If any experts out there has any advice I'm all ears!
My preferred method would be to simply increase the varchar column, but that's because I'm lazy.
The mySQL version I'm running is 5.0.77.
I should mention the new 500 character requirement will only be for the odd record; most records in the table will be not longer than 50 characters.
I thought I'd be future-proofing by making the column 128. Shows how much I knew!
Generally speaking, this is not a question that has a "correct" answer. There is no "infinite length" text storage type in MySQL. You could use LONGTEXT, but that still has an (absurdly high) upper limit. Yet if you do, you're kicking your DBMS in the teeth for having to deal with that absurd blob of a column for your 50-character text. Not to mention the fact that you hardly do anything with it.
So, most futureproofness(TM) is probably offered by LONGTEXT. But it's also a very bad method of resolving the issue. Honestly, I'd revisit the application requirements. Storing strings that have no "domain" (as in, being well-defined in their application) and arbitrary length is not one of the strengths of RDBMS.
If I'd want to solve this on the "application design" level, I'd use NoSQL key-value store for this (and I'm as anti-NoSQL-hype as they get, so you know it's serious), even though I recognize it's a rather expensive change for such a minor change. But if this is an indication of what your DBMS is eventually going to hold, it might be more prudent to switch now to avoid this same problem hundred times in the future. Data domain is very important in RDBMS, whereas it's explicitly sidelined in non-relational solutions, which seems to be what you're trying to solve here.
Stuck with MySQL? Just increase it to VARCHAR(1000). If you have no requirements for your data, it's irrelevant what you do anyway.
Careful if using text. TEXT data is not stored in the database server’s memory, therefore, whenever you query TEXT data, MySQL has to read from it from the disk, which is much slower in comparison with CHAR and VARCHAR as it cannot make use of indexes.The better way to store long string will be nosql databases
We can use varchar(<maximum_limit>). The maximum limit that we can pass is 65535 bytes.
Note: This maximum length of a VARCHAR is shared among all columns except TEXT/BLOB columns and the character set used.
We need to store many rows in a MySQL (InnoDB) table, all of them having a 8-byte binary string as primary key.
I was wondering wether it was best to use the BIGINT column type (which contains 64-bit, thus 8-byte, integers) or BINARY(8), which is fixed length.
Since we're using those ids as strings in our application, and not numbers, storing them as binary strings sounds more coherent to me. However, I wonder if there are performance issues with this. Does it make any difference?
If that matters, we are reading/storing these ids using hex notation (like page_id = 0x1122334455667788).
We wouldn't use integers in queries anyway, since we're writing a PHP application and, as you surely know, there isn't a "unsigned long long int" type, so all integers are machine-dependant size.
I'd use the binary(8) if this matches your design.
Otherwise you'll always have a conversion overhead in performance or complexity somewhere. There won't be much (if any) difference between the types at the RDBMS level
I saw a similar post on Stack Overflow already, but wasn't quite satisfied.
Let's say I offer a Web service. http://foo.com/SERVICEID
SERVICEID is a unique String ID used to reference the service (base 64, lower/uppercase + numbers), similar to how URL shortener services generate ID's for a URL.
I understand that there are inherent performance issues with comparing strings versus integers.
But I am curious of how to maximally optimize a primary key of type String.
I am using MySQL, (currently using MyISAM engine, though I admittedly don't understand all the engine differences).
Thanks.
update for my purpose the string was actually just a base62 encoded integer, so the primary key was an integer, and since you're not likely to ever exceed bigint's size it just doesn't make too much sense to use anything else (for my particular use case)
There's nothing wrong with using a CHAR or VARCHAR as a primary key.
Sure it'll take up a little more space than an INT in many cases, but there are many cases where it is the most logical choice and may even reduce the number of columns you need, improving efficiency, by avoiding the need to have a separate ID field.
For instance, country codes or state abbreviations already have standardised character codes and this would be a good reason to use a character based primary key rather than make up an arbitrary integer ID for each in addition.
If your external ID is base64, your internal ID is a binary string. Use that as the key in your database with type BINARY(n) (if fixed length) or VARBINARY if variable length. The binary version is 3/4 shorter than the base64 one.
And just convert from/to base64 in your service.
Using string as the type of primary column is not a good approach because If our values can not be generated sequentially and with an Incremental pattern, this may cause database fragmentation and decrease the database performance.
I had a feeling that searching domain names taking time more than as usual in mysql. actually domain name column has a unique index though query seems slow.
My question is do I need to convert to binary mode?? say md5 hash or something??
Normally keeping the domain names in a "VARCHAR" data type, with an UNIQUE index defined for that field, is the most simple & efficient way of managing your data.
Never try to use any complexity (like using Binary mode or "BLOB" data type), for the sake of one / two field(s), as it will further deteriorate your MySQL performance.
Hope it helps.
I have an array of values called A, B... X, Y, Z. Fun though it would be to have 26 columns in the table I can't help but feel there is a better way. I have considered creating a second table with the id value of row from the first table, the id of the item in the array and then the boolean value but it seems clunky and confusing.
Is there a better way?
Short answer, no. Long answer, it depends.
You can store binary data in a bunch of ways - abusing a number, using a BINARY OR VARBINARY, using a BLOB or TINYBLOB, etc. BINARY types will generally be faster than BLOB types, provided your data is a known size.
However, relational databases aren't designed for doing anything intelligent with binary data. On a project I used to work on, there was a table where each record had as specific binary pattern - stored as some sort of integer - and searching required a lot of ANDs, ORs, XORs and NOTs. It never really worked very well, performance sucked, and it held the whole project down. Looking back, I would have taken a completely different approach.
So if you just want to drop the data in and pull it out again, great. If you want to use it for anything intelligent, tough.
The situation may be different on other database vendors. In fact, have you considered using something else in place of the database? Some sort of object persistence?
Are your possible array values static?
If so, try using MySQL's SET data type.
You can try storing it as a TINYBLOB, or even an UNSIGNED INT, but you'll have to do bit masking in your code.
You can store it as a string and use text manipulation functions to (re)create your array.