How to implement unique IDs across many tables - MySQL

If you look at Facebook's Graph API, it seems as though all objects share the same ID space, and all IDs are unique even if they are in different tables.
Is there a feature in MySQL that handles this? (if not, high level idea of how to implement?)

You may want to check out UUID(). It returns a globally unique ID, so your IDs will never clash.
To store it compactly, you can convert it to binary with
UNHEX(REPLACE(UUID(),'-',''))
and keep it in a BINARY(16) column.
(Source for the conversion: Nicholas Sherlock's comment in the MySQL reference manual.)
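A minimal sketch of storing IDs this way (the table and column names are just placeholders):
CREATE TABLE object (
  id BINARY(16) NOT NULL PRIMARY KEY,
  object_type VARCHAR(50) NOT NULL
);
-- Generate a UUID and store it in compact binary form
INSERT INTO object (id, object_type)
VALUES (UNHEX(REPLACE(UUID(), '-', '')), 'Event');
-- Read it back in hexadecimal text form
SELECT HEX(id) AS id, object_type FROM object;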

If you have a single master database server, you can create a table called Object that has an integer auto-incrementing primary key and an object type column. Every time you create an object, you insert a row into this Object table first, get the id, then use that id to insert a row into whatever table will hold the object information. So to create an Event:
INSERT INTO Object (object_type) VALUES ('Event')
Get the last insert id, let's say it's 12345.
INSERT INTO Event (id, name, location) VALUES (12345, 'Cookout', 'My back yard')
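A rough sketch of the whole setup (the Event columns are just those from the example):
CREATE TABLE Object (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  object_type VARCHAR(50) NOT NULL
);
CREATE TABLE Event (
  id INT NOT NULL PRIMARY KEY,  -- comes from Object.id; no AUTO_INCREMENT here
  name VARCHAR(100),
  location VARCHAR(100)
);
-- Allocate a global id, then reuse it for the concrete row
INSERT INTO Object (object_type) VALUES ('Event');
INSERT INTO Event (id, name, location)
VALUES (LAST_INSERT_ID(), 'Cookout', 'My back yard');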

Alternatively, drive all ID values from a single sequence. MySQL has no native SEQUENCE object (MariaDB does, from 10.3 on), so in practice that means a generator table like the Object table above.

Related

What is the best way to store a single value in a MySQL DB?

I want to store a single column with just one value in a MySQL DB. This column is not related to any other table in the DB, so to store a single value, should I create an entire table, or is there a better way to store a single key-value pair in a SQL DB?
For example, a boolean variable isActive needs to be stored and queried.
It is not uncommon to do this, but don't create a new table for every singleton value. Make one table for this purpose, with two columns: the first column holds a name identifying the value, and the second column holds the actual value. Let the data type of the value be string. That is a minor disadvantage when you actually need a boolean, but it keeps the table flexible.
For instance:
create table params (
  paramName varchar(100) not null primary key,
  paramValue varchar(100)
);
insert into params values ('isActive', '1');
commit;
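Reading a value back is then a simple keyed lookup, for example:
SELECT paramValue FROM params WHERE paramName = 'isActive';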
See also Variant data type in DB, which touches on the need to store different data types in the same column. The consensus is to use the string data type unless the specific data type is really essential, in which case I would suggest creating a separate "parameterXXX" table per data type, so all booleans go in one table, all dates in another, etc.

unique id without auto_increment

I have an existing schema with a non-auto-incrementing primary key. The key is used as a foreign key in a dozen other tables.
I have inherited a program with major performance problems. Currently, when a new row is added to this table, this is how a new unique id is created:
1) all existing primary key values are retrieved with a query
2) a random number is generated
3) if the number does not exist among the retrieved values, use it; otherwise go to (2)
The app is multi-threaded and multi-server, so simply grabbing the existing ids once at startup isn't an option. I do not have unique information from the initiating request to grab and convert into a pseudo-unique value (like a member id).
I understand it is theoretically possible to perform surgery on the internals to add auto-increment behavior to an existing primary key. I understand it would also be possible to systematically drop all foreign keys pointing to this table, then create-rename-insert a new version of the table, then add back the foreign keys, but this table's format is dictated by a third-party app, and if I mess this up then Bad Things happen.
Is there a way to leverage SQL/MySQL to come up with unique row values?
The closest I have come up with is choosing a number randomly from a large space and hoping it is unique in the database, then retrying when the odd collision occurs.
Ideas?
If the table has a primary key that isn't being used for foreign key references, then drop that primary key. The goal is to make your column an auto-incremented primary key.
So, look for the maximum value and then the following should do what you want:
alter table t modify id int not null auto_increment primary key;
alter table t auto_increment = <maximum value> + 1;
I don't think it is necessary to explicitly set the auto_increment value, but I like to be sure.
I think you can SELECT MAX(`strange-id-column`) + 1 (note the backticks; with single quotes MySQL treats it as a string literal). That value will be unique, and you can put that SQL inside a transaction together with the INSERT code in order to prevent errors.
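If you go this route, something like the following keeps the read and the insert in one transaction (a sketch; legacy_table and payload are placeholder names). Under InnoDB, FOR UPDATE makes a concurrent writer block until the first transaction commits, so two sessions can't compute the same value:
START TRANSACTION;
-- Lock the current maximum and compute the next id
SELECT COALESCE(MAX(id), 0) + 1 INTO @next_id FROM legacy_table FOR UPDATE;
INSERT INTO legacy_table (id, payload) VALUES (@next_id, 'new row');
COMMIT;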
It seems really expensive to pull back a list of all primary key values (for large sets), and then to generate a pseudo-random value and verify it's unique by checking it against the list.
One of the big problems I see with this approach is that a pseudo-random number generator will generate the same sequence of values when it is started with the same seed value.
If that ever happens, then there will be collision after collision after collision until the sequence reaches a value that hasn't yet been used. And the next time it happens, you'd spin through that whole list again, to add one more value.
I don't understand why the value has to be random.
If there's not a requirement for pseudo-randomness, and an ascending value would be okay, here's what I would do if I didn't want to make any changes to the existing table:
I'd create another "id-generator" table that has an auto_increment column, and perform inserts to that table to generate id values.
Instead of running a query to pull back all existing id values from the existing table, I'd perform an INSERT into the "id-generator" table, then a SELECT LAST_INSERT_ID() to retrieve the id of the row I just inserted, and use that as the "generated" id value.
Basically, this emulates an Oracle SEQUENCE object. It wouldn't be necessary to keep all of the rows in the "id-generator" table, so I could periodically DELETE all rows with an id value less than the current maximum.
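A sketch of that emulation (names are illustrative):
CREATE TABLE id_generator (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
);
-- Each insert allocates the next id
INSERT INTO id_generator VALUES (NULL);
SELECT LAST_INSERT_ID();  -- use this as the new "generated" id
-- Housekeeping: rows below the current maximum are no longer needed
DELETE FROM id_generator WHERE id < LAST_INSERT_ID();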
If there is a requirement for pseudo-randomness (shudder) I'd probably just attempt the INSERT as a way to find out if the key exists or not. If the insert fails due to a duplicate key, I'd try again with a different id value.
The repeated sequence from a pseudo-random generator scares me... if I got several collisions in a row, are these from a previously used sequence, or values from a different sequence? I don't have any way of knowing. And abandoning the sequence and restarting with a new seed doesn't help: if that seed has been used before, I'm off chasing another series of previously generated values.
For low levels of concurrency (average concurrent ongoing inserts < 1), you can use optimistic locking to achieve a unique id without auto_increment:
set up a one-row table for this function and seed it, eg:
create table last_id (last_id bigint not null default 0);
insert into last_id values (0);
To get your next id, retrieve this value in your app code, apply your newId function, and then attempt to update the value, eg:
select last_id from last_id; // In DB
newId = lastId + 1 // In app code
update last_id set last_id=$newId where last_id=$lastId // In DB
Check the number of rows that were updated. If it was 0, another server beat you to it and you should return to step 1.
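Concretely, one successful round might look like this (assuming the stored value is currently 41):
select last_id from last_id;                         -- returns 41; app computes newId = 42
update last_id set last_id = 42 where last_id = 41;
select row_count();                                  -- 1 = we claimed 42; 0 = lost the race, retry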

Getting an object from database using identity specification

I have inserted an object into a table in a database. The object is identified by an ID that is assigned using identity specification (auto-increment).
Right after I insert my object, I want to get its ID back from the table.
How can I get its ID when each object is only distinguishable by its ID (there might be identical objects apart from their IDs)?
Is there a way to get the ID of, for example, the last row inserted into the table?
Thanks in advance
INSERT
INTO mytable (mycol1, mycol2)
VALUES ('val1', 'val2');
SELECT LAST_INSERT_ID();
LAST_INSERT_ID() is maintained per connection, so it returns the ID generated by your own INSERT even if other clients are inserting at the same time.

Normalizing MySQL data

I'm new to MySQL, and just learned about the importance of data normalization. My database has a simple structure:
I have 1 table called users with fields:
userName (string)
userEmail (string)
password (string)
requests (an array of dictionaries in JSON string format)
data (another array of dictionaries in JSON string format)
deviceID (string)
Being very new to MySQL, I'm really not seeing why the above structure is a bad idea. Why would I need to normalize this and make separate tables? That's the first question: why? (Some have also said not to put JSON in my table. Why or why not?)
The second question is how? With the above structure, how many tables should I have, and what would be in each table?
Edit:
So maybe normalization is not absolutely necessary here, but maybe there's a better way to implement my data field? The data field is an array of dictionaries: each dictionary is just a note item with a few keys (title, author, date, body). What I do now, which I think might be inefficient, is this: every time a user composes a new note, I send it from my app to PHP. PHP fetches the JSON array of dictionaries already stored in that user's data field, converts it to a PHP array, appends the new note to the end, converts the whole thing back to JSON, and puts it back in the table. This process is repeated every time a new note is composed. Is there a better way? Maybe a user's data should be a table, with each row being a note, but I'm not really sure how that would work?
The answer to all your questions really depends on what the JSON data is for, and whether you'll ever need to use some property of that data to determine which rows are returned.
If your data truly has no schema, and you're really just using it to store data that will be used by an application that knows how to retrieve the correct row by some other criteria (such as one of the other fields) every time, there's no reason to store it as anything other than exactly as that application expects it (in this case, JSON).
If the JSON data DOES contain some structure that is the same for all entries, and if it's useful to query this data directly from the database, you would want to create one or more tables (or maybe just some more fields) to hold this data.
As a practical example of this, if the data field contains JSON enumerating services for that user in an array, and each service has a unique id, type, and price, you might want a separate table with the following fields (using your own naming conventions):
serviceId (integer)
userName (string)
serviceType (string)
servicePrice (float)
Each service for that user would get its own row. You could then query for users that have a particular service, which, depending on your needs, could be very useful. In addition to easy querying, indexing certain fields of the separate tables can also make for very quick queries.
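For instance, a sketch of such a table (following the field names above; 'premium' is a made-up example value, and DECIMAL rather than FLOAT is a common choice to avoid rounding on prices):
CREATE TABLE services (
  serviceId INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  userName VARCHAR(255) NOT NULL,
  serviceType VARCHAR(100),
  servicePrice DECIMAL(10,2)
);
-- Find users that have a particular service
SELECT userName FROM services WHERE serviceType = 'premium';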
Update: Based on your explanation of the data stored, and the way you use it, you probably do want it normalized. Something like the following:
# user table
userId (integer, auto-incrementing)
userName (string)
userEmail (string)
password (string)
deviceID (string)
# note table
noteId (integer, auto-incrementing)
userId (integer, matches user.userId)
noteTime (datetime)
noteData (string, possibly split into separate fields depending on content, such as subject, etc.)
# request table
requestId (integer, auto-incrementing)
userId (integer, matches user.userId)
requestTime (datetime)
requestData (string, again split as needed)
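In DDL form, that schema might look like this (the types are reasonable guesses, not requirements):
CREATE TABLE user (
  userId INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  userName VARCHAR(255),
  userEmail VARCHAR(255),
  password VARCHAR(255),
  deviceID VARCHAR(255)
) ENGINE=InnoDB;
CREATE TABLE note (
  noteId INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  userId INT NOT NULL,
  noteTime DATETIME,
  noteData TEXT,
  FOREIGN KEY (userId) REFERENCES user(userId)
) ENGINE=InnoDB;
CREATE TABLE request (
  requestId INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  userId INT NOT NULL,
  requestTime DATETIME,
  requestData TEXT,
  FOREIGN KEY (userId) REFERENCES user(userId)
) ENGINE=InnoDB;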
You could then query like so:
# Get a user
SELECT * FROM user WHERE userId = '123';
SELECT * FROM user WHERE userName = 'foo';
# Get all requests for a user
SELECT * FROM request WHERE userId = '123';
# Get a single request
SELECT * FROM request WHERE requestId = '325325';
# Get all notes for a user
SELECT * FROM note WHERE userId = '123';
# Get all notes from last week
SELECT * FROM note WHERE userId = '123' AND noteTime > CURDATE() - INTERVAL 1 WEEK;
# Add a note to user 123
INSERT INTO note (noteId, userId, noteData) VALUES (null, 123, 'This is a note');
Notice how much more you can do with normalized data, and how easy it is? It's trivial to locate, update, append, or delete any specific component.
Normalization is a philosophy. Some people think it fits their database approach, some don't. Many modern database solutions even focus on denormalization to improve speeds.
Normalization often doesn't improve speed. However, it greatly improves the simplicity of accessing and writing data. For example, to add a request you would have to rewrite the entire JSON field; if the data were normalized, you could simply add a row to a table.
In normalization terms, "array of dictionaries in JSON string format" is always bad. An array of dictionaries translates naturally into a list of rows, which is a table.
If you're new to databases: NORMALIZE. Denormalization is something for professionals.
A main benefit of normalization is to eliminate redundant data, but since each user's data is unique to that user, there is no benefit to splitting this table and normalizing. Furthermore, since the front-end will employ the dictionaries as JSON objects anyway, undue complication and a decrease in performance would result from trying to decompose this data.
Okay, here is a normalized MySQL data model. Note: you can separate authors and titles into two tables to further reduce data redundancy. You can probably use similar techniques for the "requests" dictionaries:
CREATE TABLE USERS (
  UID int NOT NULL AUTO_INCREMENT PRIMARY KEY,
  userName varchar(255) UNIQUE,
  password varchar(30),
  userEmail varchar(255) UNIQUE,
  deviceID varchar(255)
) ENGINE=InnoDB;

CREATE TABLE BOOKS (
  BKID int NOT NULL AUTO_INCREMENT PRIMARY KEY,
  FKUSERS int,
  Title varchar(255),
  Author varchar(50)
) ENGINE=InnoDB;

ALTER TABLE BOOKS
  ADD FOREIGN KEY (FKUSERS) REFERENCES USERS(UID);

CREATE TABLE NOTES (
  ID int NOT NULL AUTO_INCREMENT PRIMARY KEY,
  FKUSERS int,
  FKBOOKS int,
  Date date,
  Notes text
) ENGINE=InnoDB;

ALTER TABLE NOTES
  ADD FOREIGN KEY BKNO (FKUSERS) REFERENCES USERS(UID);
ALTER TABLE NOTES
  ADD FOREIGN KEY (FKBOOKS) REFERENCES BOOKS(BKID);
In your case, I would abstract out the class that handles this table and keep the data denormalized for now. If, in the future, the data access patterns change and I need to normalize the data, I can do so with less impact on the program: I just change the class that handles this set of data to query the normalized tables, but return the data as if the database structure never changed.

MySQL enum column from another table's column

I'm sure this is either totally impossible or really easy:
If I'm creating a table and I want one of the columns to have limited options, it seems that I should use either the ENUM or SET value type. But I have to define the possible values at that moment. What if I have another table with two columns, a primary key column and a data column, and I want the ENUM values for my new table to come from that existing table's primary key column?
I'm sure I can just write in the values long-hand, but ideally what I need is for new values to be entered into the list table and for the table with the enum column to just accept that the value choices will include anything new added to that list table.
Is this possible without needing to manipulate the structure of the new table each time something is added to the list?
I think this link will help: http://dev.mysql.com/doc/refman/5.0/en/enum.html
There is a discussion of it in the user comments, starting with:
"In MySQL 5.0, you can convert an enum's values into a dynamically-defined table of values, which then provides effectively a language-neutral method to handle this kind of conversion (rather than relying on PHP, Tcl, C, C++, Java, etc. specific code)."
The commenter does it with a stored PROCEDURE.
The easiest way is to use a regular column without constraints. If you're interested in all the current values, use DISTINCT to query them:
select distinct YourColumn from YourTable
That way, you don't have any maintenance and can store whatever you like in the table.
The foreign key table you mention is also a good option. The foreign key will limit the original column. Before you do the actual insert, you run a query to expand the "enum" table:
insert into EnumTable (name)
select 'NewEnumValue' from dual
where not exists (select * from EnumTable where name = 'NewEnumValue');
Not sure what exactly you're trying to achieve btw; limit the column, but automatically expand the choices when someone breaks the limit?
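If you do want the database to enforce the limit, here is a sketch of the lookup-table approach (EnumTable and YourTable as used above):
CREATE TABLE EnumTable (
  name VARCHAR(100) NOT NULL PRIMARY KEY
) ENGINE=InnoDB;
CREATE TABLE YourTable (
  id INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  YourColumn VARCHAR(100) NOT NULL,
  FOREIGN KEY (YourColumn) REFERENCES EnumTable(name)
) ENGINE=InnoDB;
-- Adding a row to EnumTable immediately widens the set of allowed
-- values; no ALTER TABLE on YourTable is needed
INSERT INTO EnumTable (name) VALUES ('NewEnumValue');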