I have a table where I do bulk imports from CSV files.
The first column is the Id field with auto-increment.
What bothers me is:
When I do a
SELECT COUNT(*)
and a
SELECT MAX(Id)
I get different values. I would have expected those to be identical?
What am I missing?
If you insert 10 rows, delete 5, then insert 10 more, your COUNT(*) will not match MAX(id).
You can also insert an id far ahead of where it should be: in an empty table, INSERT ... (id) VALUES (9000000) will push MAX(id) up dramatically even though there is only 1 row.
Rolled-back transactions can also interfere with this.
If you want to know the next increment, check the AUTO_INCREMENT value, but be aware that this is only a guess; the actual value used may differ by the time you actually get around to inserting.
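For example, here is a quick sketch (table and column names are made up purely for illustration) showing how the two values drift apart:

CREATE TABLE demo (id INT AUTO_INCREMENT PRIMARY KEY, val VARCHAR(10));
INSERT INTO demo (val) VALUES ('a'), ('b'), ('c'), ('d'), ('e');
DELETE FROM demo WHERE id IN (2, 4);
INSERT INTO demo (val) VALUES ('f');
SELECT COUNT(*) FROM demo;  -- 4, because two rows were deleted
SELECT MAX(id) FROM demo;   -- 6, because ids already handed out are not reused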
If you want them to match then you need to:
Start with a table where AUTO_INCREMENT=1, as in it's either brand new or has been cleared with TRUNCATE.
Insert using auto-generated id values as one transaction, or as a series of transactions where all of them have been fully committed.
If I have a table that has these rows:
animal (primary)
-------
man
dog
cow
and I want to delete all the rows and insert my new rows (that may contain some of the same data), such as:
animal (primary)
-------
dog
chicken
wolf
I could simply do something like:
delete from animal;
and then insert the new rows.
But when I do that, for a split second, 'dog' won't be accessible through the SELECT statement.
I could simply INSERT IGNORE the new data and then delete the rest, one by one, but that doesn't feel like the right solution when I have a lot of rows.
Is there a way to insert the new data and then have MySQL automatically delete the rest afterward?
I have a program that selects data from this table every 5 minutes (and the code I'm writing now will be updating this table once every 30 minutes), so I would like to be as accurate as possible at all times, and I would rather have too many rows for a split second than too few rows for the same time.
Note: I know that this may seem like it is unnecessary but I just feel like if I leave too many of those unlikely possibilities in different places, there will be times where things go wrong.
You may want to use TRUNCATE instead of DELETE here. TRUNCATE is faster than DELETE and resets the table back to its empty state (meaning AUTO_INCREMENT counters are reset to their starting values as well).
Not sure why you're having problems with selecting a value that was deleted and re-added; maybe I'm missing some context. But if you're wiping the table clean, you might want to use TRUNCATE instead.
You could add another timestamp column and change the SELECT statement to accommodate this scenario, where it needs to check for the latest value.
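A minimal sketch of that idea, assuming a new loaded_at column (the column name is just illustrative) that every refresh batch shares:

ALTER TABLE animal ADD COLUMN loaded_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP;
-- insert each refresh batch with one shared timestamp, e.g.:
-- SET @batch = NOW();
-- INSERT INTO animal (animal, loaded_at) VALUES ('dog', @batch), ('chicken', @batch), ('wolf', @batch);
-- the reading program then only looks at the most recent batch
SELECT a.animal
FROM animal a
WHERE a.loaded_at = (SELECT MAX(loaded_at) FROM animal);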
If this is for school, I would argue that you need a timestamp and that is what your professor is looking for. You shouldn't need to truncate a table to get the latest values; you need to adjust the thinking behind the table and how you are querying data. Hope this helps!
Check out these:
How to make a mysql table with date and time columns?
Why not update values instead?
My other questions would be:
How are you loading this into the table?
What does that code look like?
Can you change the way you SELECT from the table?
What values are being "updated" and change in such a way that you need to truncate the entire table?
If you don't want to add a new column, there is another method.
1. First, update the table in a way that marks all existing rows for future deletion. For example:
UPDATE `table_name` SET `animal`=CONCAT('MUST_BE_DELETED_', `animal`)
2. Next, insert the new rows.
3. Finally, remove all marked rows:
DELETE FROM `table_name` WHERE `animal` LIKE 'MUST_BE_DELETED_%'
You could implement this by making the updated_on column a TIMESTAMP; you may even use some default values, but let's go with an example without them.
I presume the table would look something like this:
CREATE TABLE `new_table` (
`animal` varchar(255) NOT NULL,
`updated_on` timestamp,
PRIMARY KEY (`animal`)
) ENGINE=InnoDB
This is just a dummy table example. What's important are the two queries later on.
You would simply perform a query to insert the data, such as:
insert into my_table(animal)
select animal from my_view where animal = 'dogs'
on duplicate key update
updated_on = current_timestamp;
Please notice that my_view is your table/view/query by which you supply the values to insert into your table. Also notice that you need a primary/unique key constraint on your animal column for this example to work.
Then, you proceed with the following query, to "purge" (delete) the old values:
delete from my_table
where updated_on < (
select *
from (
select max(updated_on) from my_table
) as max_date
);
Please notice that you could create a separate view to obtain this max_date value for the updated_on column. It should indicate the timestamp of your last updated/inserted values from the previous query, so you can use it in the WHERE clause to delete the old records that you no longer want/need.
IMPORTANT NOTE:
Since you are running multiple queries that are supposed to form a single operation, I'd advise you to run them within a single transaction and roll back properly on the various potential failure outcomes (e.g. in case of MySQL exceptions). You might wish to use a proper stored procedure for that.
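For example, a bare-bones version of that single operation (without the error handling a real stored procedure would add) could look like this, reusing the two queries above:

START TRANSACTION;
INSERT INTO my_table (animal)
SELECT animal FROM my_view WHERE animal = 'dogs'
ON DUPLICATE KEY UPDATE updated_on = CURRENT_TIMESTAMP;
DELETE FROM my_table
WHERE updated_on < (SELECT * FROM (SELECT MAX(updated_on) FROM my_table) AS max_date);
COMMIT;  -- or ROLLBACK if either statement fails (from application code or a handler)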
I have an existing sql table with 3 columns and 100+ entries/rows. There is an id column with autoincrement.
Now, I want to add 10 new rows at the beginning of the table with id from 1 to 10. But I cannot lose any existing row. So, how do I do it?
One idea that just came to my mind is perhaps I can increase the existing id by adding 10, like 1+10 becomes 11, 25+10 becomes 35, and then I can add rows at the beginning. What will be the script for this IF this is possible?
All you need to do for this is to set the auto_increment for that table to whatever number you need to create space for the new records you want to insert.
For example, if you inserted rows with id's 1-100, you might:
Check the next auto_increment value by running:
select auto_increment as val from information_schema.tables where table_schema='myschema' and table_name='mytable';
Let's assume that value would be 101 (the value that would be used if you inserted a new row). You can "advance" the auto_increment value by running:
alter table myschema.mytable auto_increment = 111;
If you insert a new row like this:
insert into mytable (not_the_id_column) values ('test');
It will get the "next" id of 111. But if you specify id values manually, you are ok in this case as long as you use any value less than 111, so you could insert your desired records like this:
insert into mytable (id, not_the_id_column) values (101, 'test101');
insert into mytable (id, not_the_id_column) values (102, 'test102');
... -- more inserts as needed
Now, you still must take proper precautions when updating PK values, or any value that has dependencies on it (Foreign Key or otherwise), but it is completely legitimate to forcibly advance and/or backfill the id values, as long as the resulting auto_increment value doesn't duplicate one that's already in the table.
I agree with juergen d's comment that you should not do this, but I realize there are situations where this kind of thing must be done.
-- @x = number of ids spanned by the existing rows; shift everything above the current range
SELECT MAX(id)-MIN(id)+1 INTO @x FROM theTable;
UPDATE theTable SET id = id + @x;
-- now slide the block back down so the lowest existing id becomes 11, freeing ids 1-10
SELECT MIN(id) INTO @x FROM theTable;
UPDATE theTable SET id = 11 + id - @x;
If the id is the primary key, value collisions within an update can cause MySQL to reject the update. (Hence the pair of updates to avoid such a possibility.)
Edit: Factoring N.B.'s strong objection into this, it would also probably be good to verify that the table's next auto-increment value is not going to collide with the updated records after the update is completed. I don't have an appropriate database on hand to verify whether UPDATE statements affect it; even if they do, you may want to reduce it so as not to create an unnecessary gap (gaps should ideally not be a problem, but if they are, or you are just mildly OCD, it is worth looking into).
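If you do need to check and adjust it, something along these lines should work, reusing the information_schema query from the earlier answer (schema and table names are placeholders):

SELECT auto_increment
FROM information_schema.tables
WHERE table_schema = 'myschema' AND table_name = 'theTable';
-- if that value is at or below the shifted MAX(id), bump it past it (a literal is required here)
ALTER TABLE theTable AUTO_INCREMENT = 111;  -- i.e. whatever the new MAX(id) + 1 is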
I'm trying to make it so that it copies all the data from oa_tags into member_info, but the problem is that I have a unique auto_increment key in both oa_tags and member_info (it's the same in both, called ID). I need it to copy all the data from oa_tags into member_info, but obviously it has to ignore the entries with the same "ID" column.
This is what I have so far -
INSERT INTO member_info
SELECT *
FROM oa_tags, member_info WHERE oa_tags.ID > member_info.ID;
It's throwing this error at me - "#1136 - Column count doesn't match value count at row 1"
Any suggestions are welcome.
Thanks
This is how MySQL supports what you're wanting to do.
http://dev.mysql.com/doc/refman/5.0/en/ansi-diff-select-into-table.html
This explains why I wanted to know the field list/structure of both tables. (I don't like the NOT IN; I think there has to be a way to do it with EXISTS, but I'm struggling, and I'm not sure you really care about performance since this seems to be a one-time thing.)
INSERT INTO member_info (FIELD LIST)
SELECT FIELD LIST FROM oa_tags WHERE ID NOT IN (SELECT ID FROM member_info)
This might work, but I doubt it, and it's far from best practice; but if it's one-time throwaway code, it might get the job done.
INSERT INTO member_info
SELECT * from oa_tags where ID not in (Select ID from member_info)
If you follow the best practice of identifying all the columns, it will fix many of the problems you are currently having. In general it is poor practice to use * for columns outside of testing. Naming/qualifying your columns (even when there are a lot of them) will prevent a lot of future issues when tables change.
You can name the columns in both the INSERT and the SELECT so that they match, and insert only the ones you are interested in, such as skipping the 'ID' column.
The problem is that your column count is off. Let's assume each table has five columns.
You are trying to insert into member_info, which has five columns. Your SELECT joins both tables, which means you will get a total of ten columns in your SELECT. Run just the SELECT to verify, but that is why you are getting the error.
To fix it you can try changing:
select *
to
select oa_tags.*
However, I don't believe your current statement will work. Say you have a row in oa_tags with id 10. The WHERE clause will match that row to rows in member_info where the id is 1 through 9, and you will end up inserting duplicates. Also, both tables could have an id of 10, but if member_info had an id of 9, that statement would still try to insert id 10.
I would let the auto increment do its job and try:
INSERT INTO member_info (column1, column2, ...)  -- every column but id
SELECT column1, column2, ...                     -- every column but id, to allow the table to auto-increment
FROM oa_tags
If you don't want to insert every record in oa_tags into member_info, you can still filter on the id by adding:
WHERE oa_tags.ID not in (select member_info.ID from member_info);
I have two scripts; one of them inserts rows into the database, and the other processes newly entered, so-far-unprocessed rows.
CREATE TABLE table (id INT NOT NULL PRIMARY KEY AUTO_INCREMENT, col1 VARCHAR(32), col2 VARCHAR(32));
So the first script does several separate insert queries:
INSERT INTO table (id, col1 ,col2) VALUES (0, 'val1_1', 'val1_2');
INSERT INTO table (id, col1 ,col2) VALUES (0, 'val2_1', 'val2_2');
INSERT INTO table (id, col1 ,col2) VALUES (0, 'val3_1', 'val3_2');
...
Then the second script uses something like this to select the unprocessed rows:
SELECT * FROM table WHERE id > (SELECT MAX(id) FROM table_processed) ORDER BY id LIMIT 1000;
(do some processing)
(for each id processed from table: INSERT INTO table_processed (id) VALUES ({table.id});)
Sometimes, the first script will need to insert something like 5000 rows. I noticed that there was at least one instance when the processing script seemed to skip over many of the rows (basically skipped 3000 of them), and was wondering what could cause this and how to prevent it (if it skips over them once, then the next time it'll continue to skip over them since it uses > MAX(id)).
Or is this not supposed to happen? (in which case I guess it'd have to be error with the second script query)
If 2 insert transactions are running, and the later transaction (the one that gets higher auto-incremented ids) finishes earlier, those higher auto-increment ids become visible to other transactions (e.g. your processing one) before the lower ones, which may still sit in a not-yet-committed transaction, or possibly even a rolled-back one. Every INSERT takes its id from the global sequence, so those 2 transactions do not even get a single contiguous range of ids each, but a sort of striped use of said range. A good way to work is to never rely on either the order or the value of auto-incremented ids; do not use them for anything but an identifier.
The most obvious solutions are:
Do not use that MAX(id); instead LEFT JOIN table to table_processed and use the rows that do not yet exist in table_processed, though this may be heavy on the selecting side.
Let the INSERTs take an exclusive LOCK on the table (undesirable in busy scenarios; you already seem to have multiple concurrent INSERTs).
Let the INSERTs be done with an indexed processed=0 column (possibly just as the column default, so you can omit it in the insert), and have the processing script just SELECT .. FROM table WHERE processed=0 and set it to 1 when done, as sketched below.
A simple mistake is to say: OK, I'll just COMMIT after every single insert so that the transaction is done as soon as possible. That is still vulnerable to race conditions, so don't use that.
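A rough sketch of that third option (the flag and index names are just examples):

ALTER TABLE `table` ADD COLUMN processed TINYINT NOT NULL DEFAULT 0,
                    ADD INDEX idx_processed (processed);
-- the processing script only ever sees committed, unprocessed rows
SELECT * FROM `table` WHERE processed = 0 ORDER BY id LIMIT 1000;
-- after processing, mark exactly the rows that were handled
UPDATE `table` SET processed = 1 WHERE id IN (/* ids just processed */);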
I'm using MySQL's AUTO_INCREMENT field and InnoDB to support transactions. I noticed when I rollback the transaction, the AUTO_INCREMENT field is not rollbacked? I found out that it was designed this way but are there any workarounds to this?
It can't work that way. Consider:
Program one: you open a transaction and insert into a table FOO, which has an auto-increment primary key (arbitrarily, we say it gets 557 for its key value).
Program two starts, it opens a transaction and inserts into table FOO getting 558.
Program two inserts into table BAR which has a column which is a foreign key to FOO. So now the 558 is located in both FOO and BAR.
Program two now commits.
Program three starts and generates a report from table FOO. The 558 record is printed.
After that, program one rolls back.
How does the database reclaim the 557 value? Does it go into FOO and decrement all the other primary keys greater than 557? How does it fix BAR? How does it erase the 558 printed on the report program three output?
Oracle's sequence numbers are also independent of transactions for the same reason.
If you can solve this problem in constant time, I'm sure you can make a lot of money in the database field.
Now, suppose you have a requirement that your auto-increment field never have gaps (for auditing purposes, say). Then you cannot roll back your transactions. Instead you need a status flag on your records. On first insert, the record's status is "Incomplete"; then you start the transaction, do your work, and update the status to "Complete" (or whatever you need). Then when you commit, the record is live. If the transaction rolls back, the incomplete record is still there for auditing. This will cause you many other headaches, but it is one way to deal with audit trails.
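A rough illustration of that pattern, with invented table and column names:

INSERT INTO orders (status) VALUES ('Incomplete');  -- reserves the id immediately, outside the transaction
SET @order_id = LAST_INSERT_ID();
START TRANSACTION;
-- ... do the real work against @order_id ...
UPDATE orders SET status = 'Complete' WHERE id = @order_id;
COMMIT;
-- if this transaction rolls back, the 'Incomplete' row remains,
-- so the id is still accounted for in the audit trail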
Let me point out something very important:
You should never depend on the numeric features of autogenerated keys.
That is, other than comparing them for equality (=) or inequality (<>), you should not do anything else with them. No relational operators (<, >), no sorting by them, etc. If you need to sort by "date added", have a "date added" column.
Treat them as apples and oranges: Does it make sense to ask if an apple is the same as an orange? Yes. Does it make sense to ask if an apple is larger than an orange? No. (Actually, it does, but you get my point.)
If you stick to this rule, gaps in the continuity of autogenerated indexes will not cause problems.
I had a client who needed the ID to roll back on a table of invoices, where the order must be consecutive.
My solution in MySQL was to remove the AUTO_INCREMENT and pull the latest Id from the table, add one (+1), and then insert it manually.
If the table is named "TableA" and the Auto-increment column is "Id"
INSERT INTO TableA (Id, Col2, Col3, Col4, ...)
VALUES (
(SELECT Id FROM TableA t ORDER BY t.Id DESC LIMIT 1)+1,
Col2_Val, Col3_Val, Col4_Val, ...)
Why do you care if it is rolled back? AUTO_INCREMENT key fields are not supposed to have any meaning so you really shouldn't care what value is used.
If you have information you're trying to preserve, perhaps another non-key column is needed.
I do not know of any way to do that. According to the MySQL Documentation, this is expected behavior and will happen with all innodb_autoinc_lock_mode lock modes. The specific text is:
In all lock modes (0, 1, and 2), if a transaction that generated auto-increment values rolls back, those auto-increment values are “lost.” Once a value is generated for an auto-increment column, it cannot be rolled back, whether or not the “INSERT-like” statement is completed, and whether or not the containing transaction is rolled back. Such lost values are not reused. Thus, there may be gaps in the values stored in an AUTO_INCREMENT column of a table.
If you set auto_increment to 1 after a rollback or deletion, on the next insert, MySQL will see that 1 is already used and will instead get the MAX() value and add 1 to it.
This will ensure that if the row with the last value is deleted (or the insert is rolled back), it will be reused.
To set the auto_increment to 1, do something like this:
ALTER TABLE tbl auto_increment = 1
This is not as efficient as simply continuing on with the next number because MAX() can be expensive, but if you delete/rollback infrequently and are obsessed with reusing the highest value, then this is a realistic approach.
Be aware that this does not prevent gaps from records deleted in the middle or if another insert should occur prior to you setting auto_increment back to 1.
INSERT INTO prueba (id)
VALUES ((SELECT IFNULL(MAX(id), 0) + 1 FROM prueba target));
The IFNULL(..., 0) handles the case where the table is empty (zero rows), so the first id becomes 1.
The alias (target) is added to avoid MySQL's "You can't specify target table for update in FROM clause" error when selecting from the same table you are inserting into.
If you need to have the ids assigned in numerical order with no gaps, then you can't use an autoincrement column. You'll need to define a standard integer column and use a stored procedure that calculates the next number in the insert sequence and inserts the record within a transaction. If the insert fails, then the next time the procedure is called it will recalculate the next id.
Having said that, it is a bad idea to rely on ids being in some particular order with no gaps. If you need to preserve ordering, you should probably timestamp the row on insert (and potentially on update).
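A minimal sketch of such a procedure, with invented table and column names (a real one would add error handling):

DELIMITER //
CREATE PROCEDURE insert_invoice(IN p_total DECIMAL(10,2))
BEGIN
  START TRANSACTION;
  -- lock the current maximum so concurrent callers serialize on it
  SELECT COALESCE(MAX(id), 0) + 1 INTO @next_id FROM invoice FOR UPDATE;
  INSERT INTO invoice (id, total) VALUES (@next_id, p_total);
  COMMIT;
END //
DELIMITER ;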
A concrete answer to this specific dilemma (which I also had) is the following:
1) Create a table that holds different counters for different documents (invoices, receipts, RMAs, etc.); insert a record for each of your document types and set the initial counter to 0.
2) Before creating a new document, do the following (for invoices, for example):
UPDATE document_counters SET counter = LAST_INSERT_ID(counter + 1) where type = 'invoice'
3) Get the last value that you just updated to, like so:
SELECT LAST_INSERT_ID()
or just use your PHP (or whatever) mysql_insert_id() function to get the same thing
4) Insert your new record along with the primary ID that you just got back from the DB. This will override the current auto-increment index and make sure you have no ID gaps between your records.
This whole thing needs to be wrapped inside a transaction, of course. The beauty of this method is that, when you roll back a transaction, your UPDATE statement from step 2 is rolled back too, and the counter does not change. Other concurrent transactions will block until the first transaction is either committed or rolled back, so they will not have access to either the old counter or a new one until the earlier transactions have finished.
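For example, the whole thing could look roughly like this (the invoices table and its columns are just placeholders; the document_counters row for 'invoice' is assumed to exist):

START TRANSACTION;
UPDATE document_counters SET counter = LAST_INSERT_ID(counter + 1) WHERE type = 'invoice';
-- LAST_INSERT_ID() now returns the counter value reserved above
INSERT INTO invoices (id, customer, total) VALUES (LAST_INSERT_ID(), 'ACME', 100.00);
COMMIT;  -- a ROLLBACK here undoes the counter update too, so no number is wasted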
SOLUTION:
Let's use 'tbl_test' as an example table, and suppose the field 'Id' has the AUTO_INCREMENT attribute:
CREATE TABLE tbl_test (
Id int NOT NULL AUTO_INCREMENT ,
Name varchar(255) NULL ,
PRIMARY KEY (`Id`)
)
;
Let's suppose that the table already has hundreds or thousands of rows and you don't want to use AUTO_INCREMENT anymore, because every time you roll back a transaction the AUTO_INCREMENT value for the field 'Id' still advances by 1.
So to avoid that, you might do this:
Let's remove the AUTO_INCREMENT attribute from column 'Id' (this won't delete your inserted rows):
ALTER TABLE tbl_test MODIFY COLUMN Id int(11) NOT NULL FIRST;
Finally, we create a BEFORE INSERT trigger to generate the 'Id' value automatically. Done this way, the Id value is not affected even if you roll back a transaction.
CREATE TRIGGER trg_tbl_test_1
BEFORE INSERT ON tbl_test
FOR EACH ROW
BEGIN
SET NEW.Id= COALESCE((SELECT MAX(Id) FROM tbl_test),0) + 1;
END;
That's it! You're done!
You're welcome.
// Uses the legacy mysql_* extension; demonstrates resetting AUTO_INCREMENT after each
// statement so rolled-back inserts do not leave gaps in the id sequence.
$masterConn = mysql_connect("localhost", "root", '');
mysql_select_db("sample", $masterConn);

for ($i = 1; $i <= 10; $i++) {
    mysql_query("START TRANSACTION", $masterConn);
    $qry_insert = "INSERT INTO `customer` (`id`, `a`, `b`) VALUES (NULL, '$i', 'a')";
    mysql_query($qry_insert, $masterConn);

    // Commit odd iterations, roll back even ones...
    if ($i % 2 == 1) mysql_query("COMMIT", $masterConn);
    else             mysql_query("ROLLBACK", $masterConn);

    // ...then reset AUTO_INCREMENT so MySQL falls back to MAX(id) + 1 on the next insert
    mysql_query("ALTER TABLE customer AUTO_INCREMENT = 1", $masterConn);
}
echo "Done";