Primary Key Index Automatic - MySQL

I'm currently doing a project using MySQL and am a complete beginner with it. I made a table with the following columns:
ID // an integer column which is the primary key
Date // a DATE column
Day // a string column
Now I just want to know whether there exists any method by which the ID column's value is generated automatically on insertion.
For example: if I insert Date = 4/10/1992 and Day = WED, the MySQL server should automatically generate an integer value starting from 1, checking whether each value already exists.
That is, in a table containing the values
ID   Date         Day
1    01/02/1987   Sun
3    04/08/1990   Sun
if I insert the Date and Day values specified in the example above, the row should be inserted as
2    04/10/1992   WED
I tried AUTO_INCREMENT, but I'm afraid it only ever increments the ID value; it never fills the gaps.

There's a way to do this, but it's going to affect performance. Keep auto_increment on the column anyway, for the first insert and for the times you want to insert quickly.
Even with auto_increment on a column, you can specify the value, so long as it doesn't collide with an existing value.
To get the next value or first gap:
SELECT a.ID + 1 AS NextID
FROM tbl a
LEFT JOIN tbl b ON b.ID = a.ID + 1
WHERE b.ID IS NULL
ORDER BY a.ID
LIMIT 1;
If you get an empty set (the table is empty), just use 1, or let auto_increment do its thing. Note that this query never reports a gap below the smallest existing ID, so a free ID of 1 has to be checked for separately.
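One way to cover that edge case as well, sketched as a UNION of the two checks:

SELECT 1 AS NextID FROM DUAL
WHERE NOT EXISTS (SELECT 1 FROM tbl WHERE ID = 1)
UNION ALL
SELECT a.ID + 1
FROM tbl a
LEFT JOIN tbl b ON b.ID = a.ID + 1
WHERE b.ID IS NULL
ORDER BY NextID
LIMIT 1;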
For concurrency's sake, you will need to lock the table to keep other sessions from using the next ID which you just found.
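Putting it together, a minimal sketch of the locked sequence, assuming the question's table is named tbl (note that under LOCK TABLES each alias used in a query must be locked separately):

LOCK TABLES tbl WRITE, tbl AS a READ, tbl AS b READ;

SELECT a.ID + 1 AS NextID
FROM tbl AS a
LEFT JOIN tbl AS b ON b.ID = a.ID + 1
WHERE b.ID IS NULL
ORDER BY a.ID
LIMIT 1;

-- insert using the ID found above (2, per the question's example)
INSERT INTO tbl (ID, `Date`, `Day`) VALUES (2, '1992-10-04', 'WED');

UNLOCK TABLES;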

Well, I understood your problem: you want to generate the entries in such a way that you can control their limit.
I've got a solution which is quite whacky; you may use it if you feel like it.
Create your table with your primary key in AUTO_INCREMENT mode using an unsigned INT (as everyone suggested here).
Now consider two situations:
If your table needs to be cleared every year or after some other fixed duration (if such a situation exists), perform an ALTER TABLE operation to disable AUTO_INCREMENT mode, delete all your contents, and then enable it again.
If what you are doing is some sort of data warehousing, so the database accumulates for years, then run an SQL query to find the largest primary key value before you insert; if it is more than 2^33, create a new table with the same definition, and maintain a separate table to track the number of tables of this type.
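A rough sketch of that pre-insert check (all table names here are hypothetical):

-- find the largest key currently in use
SELECT MAX(ID) FROM log_table_1;

-- if the application sees a value above its chosen threshold,
-- it creates a fresh table with the same definition and records it
CREATE TABLE log_table_2 LIKE log_table_1;
INSERT INTO table_registry (table_name) VALUES ('log_table_2');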
The trick is a bit complicated, and I'm afraid there is no simple way to do what you expected.

You really don't need to fill the gaps created by deleting values from integer primary key columns. They were specifically designed to ignore those gaps.
The auto-increment mechanism could have been designed to take into consideration either the gaps at the top (after you delete the rows with the biggest id values) or all gaps. But it wasn't, because it was designed not to save space but to save time and to ensure that different transactions don't accidentally generate the same id.
In fact PostgreSQL implements its SEQUENCE object / SERIAL column (its equivalent of MySQL's auto_increment) in such a way that if a transaction asks the sequence to increment a few times but ends up not using those ids, they never get used. That, too, is designed to avoid the possibility of transactions ever accidentally generating and using the same id.
You can't even save space, because when you decide your table is going to use SMALLINT, that's a fixed-length 2-byte integer; it takes 2 bytes whether the values are all 0 or maxed out. A normal INTEGER is a fixed-length 4-byte integer.
If you use an UNSIGNED BIGINT, that's an 8-byte integer, i.e. 64 bits, which can count up to 2^64 - 1. Even if your application works continuously for years and years, it shouldn't reach a 20-digit number like 18,446,744,073,709,551,615 (if it does, what the hell are you counting, the molecules in the known universe?).
But, assuming you really are concerned that the ids might run out in a couple of years, perhaps you should be using UUIDs instead of integers.
Wikipedia states that "Only after generating 1 billion UUIDs every second for the next 100 years, the probability of creating just one duplicate would be about 50%".
UUIDs can be stored as BINARY(16) if you convert them to raw binary, as CHAR(32) if you strip the dashes, or as CHAR(36) if you keep the dashes.
Out of the 16 bytes = 128 bits, a random (version-4) UUID uses 122 random bits plus 6 fixed version/variant bits, while a version-1 UUID is constructed from information about when and where it was created. Either way it is safe to create billions of UUIDs on different computers, and the likelihood of a collision is overwhelmingly minuscule (as opposed to generating auto-incremented integers on different machines).
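A minimal sketch of the BINARY(16) approach (MySQL's built-in UUID() returns a version-1 UUID; the table and column names here are made up):

CREATE TABLE items (
    id BINARY(16) PRIMARY KEY,
    name VARCHAR(64)
);

INSERT INTO items (id, name)
VALUES (UNHEX(REPLACE(UUID(), '-', '')), 'example');

-- read the id back in hex form
SELECT HEX(id), name FROM items;

MySQL 8.0 also provides UUID_TO_BIN() and BIN_TO_UUID() to do the same conversion.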


Updating single table frequently vs using another table and CRON to import changes into main table in MySQL?

I have a table with login logs; it is an EXTREMELY busy and large InnoDB table. New rows are inserted all the time, the table is queried by other parts of the system, and it is by far the busiest table in the DB. In this table there is logid, which is the PRIMARY KEY and is generated by the software as a random hash (not an auto-increment ID). I also want to store some data, like the number of items viewed.
create table loginlogs
(
    logid bigint unsigned primary key,
    some_data varchar(255),
    viewed_items bigint unsigned
)
viewed_items is a value that will get updated very often, for many rows (assume thousands of updates per second). The dilemma I am facing now is:
Should I
UPDATE loginlogs SET viewed_items = XXXX WHERE logid = YYYYY
or should I create
create table loginlogs_viewed_items
(
    logid bigint unsigned primary key,
    viewed_items bigint unsigned,
    exported tinyint unsigned default 0
)
and then execute with CRON
UPDATE loginlogs_viewed_items t
INNER JOIN loginlogs l ON l.logid = t.logid
SET
t.exported = 1,
l.viewed_items = t.viewed_items
WHERE
t.exported = 0;
e.g. every hour?
Note that either way the viewed_items counter will be updated MANY TIMES per logid; it can be as much as 100/hour/logid, and there are tons of rows. So whichever table I choose for this, the main one or the separate one, it will be updated quite frequently.
I want to avoid unnecessary locking of the loginlogs table, and at the same time I do not want to degrade performance by duplicating data in another table.
Hmm, I wonder why you'd want to change log entries and not just add new ones...
But anyway, as you said either way the updates have to happen, whether individually or in bulk.
If you have less busy time windows, then updating in bulk might have an advantage. Otherwise the bulk update may have a more noticeable impact while it runs, in contrast to individual updates, which can "interleave" better with the other operations and make the impact less perceptible.
If the column you need to update is not needed all the time, you could think about keeping it in a separate table. That way, queries that only need the other columns will be less affected by the updates.
"Tons of rows" -- To some people, that is "millions". To others, even "billions" is not really big. Please provide some numbers; the answer can be different. Meanwhile, here are some general principles.
I will assume the table is ENGINE=InnoDB.
UPDATEing one row at a time is 10 times as costly as updating 100 rows at a time.
UPDATEing more than 1000 rows in a single statement is problematic. It will lock each row, potentially leading to delays in other statements and maybe even deadlocks.
Having a 'random' PRIMARY KEY (as opposed to AUTO_INCREMENT or something roughly chronologically ordered) is very costly when the table is bigger than the buffer_pool. How much RAM do you have?
"the table is queried by other parts of the system" -- by the random PK? One row at a time? How frequently?
Please elaborate on how exported works. For example, does it get reset to 0 by something else?
Is there a single client doing all the work? Or are there multiple servers throwing data and queries at the table? (Different techniques are needed.)
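Putting the batching principles above into practice, here is one possible shape for the CRON job's update, bounded to 1000 rows per pass (a sketch against the question's own schema; the temporary table name is made up):

START TRANSACTION;

-- pick a bounded batch of unexported rows
CREATE TEMPORARY TABLE batch AS
SELECT logid, viewed_items
FROM loginlogs_viewed_items
WHERE exported = 0
ORDER BY logid
LIMIT 1000;

UPDATE loginlogs l
INNER JOIN batch b ON b.logid = l.logid
SET l.viewed_items = b.viewed_items;

UPDATE loginlogs_viewed_items t
INNER JOIN batch b ON b.logid = t.logid
SET t.exported = 1;

DROP TEMPORARY TABLE batch;
COMMIT;

Run it in a loop until the batch comes back empty.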

Mysql Auto Increment For Group Entries

I need to set up a table that will have two auto-increment fields. One field will be a standard primary key for each record added. The other field will be used to link multiple records together.
Here is an example.
field 1 | field 2
   1    |    1
   2    |    1
   3    |    1
   4    |    2
   5    |    2
   6    |    3
Notice that each value in field 1 gets the normal auto-increment. Field 2 increments differently: records 1, 2, and 3 were made at the same time; records 4 and 5 were made at the same time; record 6 was made individually.
Would it be best to read the last entry for field 2 and then increment it by one in my PHP program? Just looking for the best solution.
You should have two separate tables.
ItemsToBeInserted
id, batch_id, field, field, field
BatchesOfInserts
id, created_time, field, field field
You would then create a batch record, and add the insert id for that batch to all of the items that are going to be part of the batch.
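In SQL, that pattern looks roughly like this (a sketch assuming the two tables above, each with an AUTO_INCREMENT id):

INSERT INTO BatchesOfInserts (created_time) VALUES (NOW());
SET @batch_id = LAST_INSERT_ID();

INSERT INTO ItemsToBeInserted (batch_id, field)
VALUES (@batch_id, 'a'), (@batch_id, 'b'), (@batch_id, 'c');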
You get bonus points if you add a batch_hash field to the batches table and then check that each batch is unique so that you don't accidentally submit the same batch twice.
If you are looking for a more awful way to do it that only uses one table, you could do something like:
$batch_id = ...; // result of: SELECT MAX(batch_id) + 1 AS new_batch_id FROM myTable
and add that id to all of the inserted records. I wouldn't recommend that though. You will run into trouble down the line.
MySQL only offers one auto-increment column per table. You can't define two, nor would it make sense to do so.
Your question doesn't say what logic you want to use to control the incrementing of the second field you've called auto-increment. Presumably your PHP program will drive that logic.
Don't use PHP to query the largest ID number, then increment it and use it. If you do your system is vulnerable to race conditions. That is, if more than one instance of your PHP program tries that simultaneously, they will occasionally get the same number by mistake.
The Oracle DBMS has an object called a sequence which gives back guaranteed-unique numbers. But you're using MySQL. You can obtain unique numbers with a programming pattern like the following.
First create a table for the sequence. It has an auto-increment field and nothing else.
CREATE TABLE sequence (
    sequence_id INT NOT NULL AUTO_INCREMENT,
    PRIMARY KEY (`sequence_id`)
);
Then when you need a unique number in your program, issue these three queries one after the other:
INSERT INTO sequence () VALUES ();
DELETE FROM sequence WHERE sequence_id < LAST_INSERT_ID();
SELECT LAST_INSERT_ID() AS sequence;
The third query is guaranteed to return a unique sequence number. This guarantee holds even if you have dozens of different client programs connected to your database. That's the beauty of AUTO_INCREMENT.
The second query (DELETE) keeps the table from getting big and wasting space. We don't care about any rows in the table except for the most recent one.

A rather complicated auto increment

This idea is one step more complicated than a one-to-many relationship. I have a bunch of tables like photos, posts, users, etc. that can be commented on. My comments table contains 3 fields that identify a comment:
item-id - the id of the item the comment belongs to
table - the table in which item-id resides (saved as an integer, but displayed as a name below to avoid confusion)
id - the id of the comment, relative to the item-id
A sample for better understanding:
id | item-id | table
1  |    1    | photos
2  |    1    | photos
1  |    1    | posts
2  |    1    | posts
1  |    2    | posts
1  |    1    | users
Now the problem is with inserts. I find it hard to determine the current last id. Given the table above, if a user comments on a photo with item-id = 1, then the new comment needs to have an id of 3. The only way I could think of is to run a subquery on insert, but I'm not a big fan of subqueries. Is there some mechanism built into MySQL that can help me achieve this, or any other easy and robust way?
From your comment:
I've come up with this because of the fear of unique ids running out. I know that the maximum integer value MySQL can store is 1×10^19 or something, which is a ridiculously large number, but not infinite. And don't numbers that huge take up more space?
MySQL's signed INT type can go up to 2^31 - 1. An unsigned INT can go up to 2^32 - 1, which is 4,294,967,295.
You're right this is not infinite, but 4.2 billion is pretty high and easily able to handle most needs.
You can also use a signed or unsigned BIGINT, which is 8 bytes, twice the size of an INT; but if you need values larger than INT can hold, that's what you must use.
An unsigned BIGINT goes up to 2^64 - 1, or 18,446,744,073,709,551,615. You're really, really, really unlikely to exhaust these values in your lifetime, even if you reload your entire database multiple times per hour.
Re your comment.
Yes, most data types are fixed-size, meaning they use the same number of bytes on every row, regardless of the value you store in it on any given row. The reason for this is that you could change the value later, and if MySQL had to find more space to grow a small numeric value into a large numeric value, it would lead to other kinds of performance problems.
See http://dev.mysql.com/doc/refman/5.6/en/storage-requirements.html for more info on the number of bytes MySQL uses for each data type.
The exception is the string data types (VARCHAR, VARBINARY, TEXT, BLOB), which use a variable amount of space per row depending on the lengths of the strings you actually store.
But there are no numeric or date/time data types in MySQL that vary in size.
Another comment: you should ask yourself how much time & effort you're spending on optimizing this, and whether it would be more economical to just get a bigger disk. It's true the extra 4 bytes per row per integer adds up if you have a large database, but you'd need to store billions of rows before it really matters.
One thing you should consider: why is this important to you? The purpose of an ID is to be a unique identifier. Sure, it can represent order, in that it's monotonically increasing, but is there any reason it specifically has to go from 1 to 2 to 3 for each (item-id, table) pair? Would it be that harmful if it were instead 1, 6, 20?
If you're using PHP you'll still receive the rows in the same order, and in PHP it'll be very easy to tell which is 1st, 2nd, and 3rd.
MyISAM allows you to do this easily:
For MyISAM and BDB tables you can specify AUTO_INCREMENT on a
secondary column in a multiple-column index.
However, it's limited to two columns, so you would still need to normalize this to remove one of the columns.
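For this case it would look roughly like the sketch below, with item-id as the single grouping column (MyISAM computes the next id per distinct prefix value; the body column is made up):

CREATE TABLE comments (
    `item-id` INT NOT NULL,
    id INT NOT NULL AUTO_INCREMENT,
    body TEXT,
    PRIMARY KEY (`item-id`, id)
) ENGINE=MyISAM;

INSERT INTO comments (`item-id`, body)
VALUES (1, 'first'), (1, 'second'), (2, 'first');
-- id restarts at 1 for each distinct `item-id`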
Otherwise, you can insert the next users row (item-id 1) like this:
INSERT INTO table1 (id, `item-id`, `table`)
SELECT MAX(id) + 1, 1, 'users' FROM table1 WHERE `item-id` = 1 AND `table` = 'users'
To extend it a little, adding an IFNULL allows you to use the same statement for inserting the first row:
INSERT INTO table1 (id, `item-id`, `table`)
SELECT IFNULL(MAX(id), 0) + 1, 2, 'users' FROM table1 WHERE `item-id` = 2 AND `table` = 'users'
In this case, you would probably have a multi-column primary key, consisting of all three columns.

Best solution for saving boolean values and saving CPU and memory on searches

What is the best solution for storing boolean values in a database if you want better query performance and minimal wasted memory on SELECT statements?
For example:
I have a table with 36 fields, 30 of which hold boolean values (zero or one), and I need to search for records using only the boolean fields that have true values.
SELECT * FROM `myTable`
WHERE
`field_5th` = 1
AND `field_12th` = 1
AND `field_20` = 1
AND `field_8` = 1
Is there any solution?
If you want to store boolean values or flags there are basically three options:
Individual columns
This is reflected in your example above. The advantage is that you will be able to put indexes on the flags you intend to use most often for lookups. The disadvantage is that this takes up more space (since the minimum column size that can be allocated is 1 byte).
However, if your column names are really going to be field_20, field_21, etc., then this is absolutely NOT the way to go. Numbered columns are a sign you should use one of the other two methods.
Bitmasks
As was suggested above you can store multiple values in a single integer column. A BIGINT column would give you up to 64 possible flags.
Setting flags would look something like:
UPDATE table SET flags = flags | b'100';
UPDATE table SET flags = flags | b'10000';
Then the field would look something like: 10100
That would represent having two flag values set. To query for rows with a particular flag set, you would do:
SELECT flags FROM table WHERE flags & b'100';
The advantage of this is that your flags are very compact space-wise. The disadvantage is that you can't place indexes on the field which would help improve the performance of searching for specific flags.
One-to-many relationship
This is where you create another table, and each row there would have the id of the row it's linked to, and the flag:
CREATE TABLE main (
    main_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY
);
CREATE TABLE flag (
    main_id INT UNSIGNED NOT NULL,
    name VARCHAR(16)
);
Then you would insert multiple rows into the flag table.
The advantage is that you can use indexes for lookups, and you can have any number of flags per row without changing your schema. This works best for sparse values, where most rows do not have a value set. If every row needs all flags defined, then this isn't very efficient.
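Usage would look something like this (a sketch against the tables above, borrowing flag names from the question; an index on flag (name, main_id) would support the lookup):

INSERT INTO flag (main_id, name) VALUES (1, 'field_5th'), (1, 'field_12th');

-- find rows that have a given flag set
SELECT m.*
FROM main m
INNER JOIN flag f ON f.main_id = m.main_id
WHERE f.name = 'field_5th';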
For a performance comparison, you can read a blog post I wrote on the topic:
Set Performance Compare
Also, when you ask which is "best", that's a very subjective question. Best at what? It all really depends on what your data looks like, what your requirements are, and how you want to query it.
Keep in mind that if you want to do a query like:
SELECT * FROM table WHERE some_flag=true
Indexes will only help you if few rows have that value set. If most of the rows in the table have some_flag=true, then MySQL will ignore the index and do a full table scan instead.
How many rows of data are you querying over? You can store the boolean values in a single integer and use bit operations to test for them. That's not indexable, but the storage is very compact. With indexed TINYINT fields, MySQL will pick one index to use and scan from there.

Is there an extant implementation of a reverse "AUTO_INCREMENT" in either PostgreSQL or MySQL?

Without having to do it manually (which I'm open to implementing if no other option exists), is there a way in either PostgreSQL or MySQL to have an automatic counter/field that decrements instead of increments?
For a variety of reasons in a current application, it would be nice to know how many more entries (from a datatype point of view) can still be added to a table just by looking at the most-recently-added record, rather than subtracting the most recent ID from the max for the datatype.
So, is there an "AUTO_DECREMENT" or similar for either system?
You have to do a bit of manual configuration in PostgreSQL, but you can define a sequence like this:
create sequence example_seq
    increment by -1
    minvalue 1
    maxvalue 5
    start with 5;

create table example(
    example_id int primary key default nextval('example_seq'),
    data text not null
);

alter sequence example_seq owned by example.example_id;
I suppose it would be equivalent to create the table with a serial column and then alter the auto-generated sequence.
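That variant would look roughly like this (a sketch; PostgreSQL names the implicit sequence example_example_id_seq):

create table example(
    example_id serial primary key,
    data text not null
);

alter sequence example_example_id_seq
    increment by -1
    minvalue 1
    maxvalue 5
    restart with 5;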
Now if I insert some rows, I get example_id counting down from 5. If I try to insert more than 5 rows, I get: nextval: reached minimum value of sequence "example_seq" (1).