How to ensure no gaps in auto_increment numbers? - mysql

I have a problem with auto_increment. This is my table; at first the id* column incremented smoothly:
id* name
1 name1
2 name2
3 name3
4 name4
5 name5
6 name6
but when I delete a record and insert a new record the id starts from 7.
id* name
1 name1
2 name2
3 name3
5 name5
6 name6
7 name7
This is what I want instead:
id* name
1 name1
2 name2
3 name3
4 name7
5 name5
6 name6
I would like a solution where every number is filled in, so that if I delete a row, the next auto_increment number is the one I deleted rather than the next higher number.

First off, it's completely fine to have these gaps. There is no problem. It's just your OCD that forces you to think these numbers have to follow a pattern - they DON'T.
auto_increment is a MySQL feature, not a PHP feature.
auto_increment guarantees that every row gets a unique number. It does not guarantee that the numbers are consecutive.
auto_increment works safely in a concurrent environment: many users can be connected to MySQL and inserting at the same time, and no two of them may get the same id for a row. The mechanism that guarantees this is fairly complex, and it is one of the reasons auto_increment yields gaps.
InnoDB uses the primary key for the physical organization of records on disk. auto_increment produces a number that is larger than the previous one (larger, not necessarily consecutive), so the b-tree is built in order and records are written in sequence on disk. Tampering with the ids makes InnoDB rebalance the tree, which means going through records and rebuilding the index. That's something you don't want. Ever.
When you think about it, what do you even get with sequential numbers? Nothing really, except your brain probably hurts less because there's some imaginary order.
For sequential numbers, use triggers to create them. auto_increment has one job and one job only - to produce unique numbers.

If you're trying to get something that looks like a list, I suggest you leave the field "ID" as is and add another field to use for names sorted numerically.
Anyway, you can get the same result with just a query like this:
SELECT name, @Rk := @Rk + 1 AS `Rank`
FROM mynamestable, (SELECT @Rk := 0) AS Rk
Edit:
This query returns every value of the name column from the table mynamestable, plus a column named Rank that holds an incrementing number starting at 1, so the result will be something like:
name Rank
Name1 1
Name2 2
Name3 3
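On MySQL 8.0 or later, the same numbering can also be produced with the ROW_NUMBER() window function instead of a user variable. A minimal sketch, assuming SQLite (3.25+) through Python's sqlite3 module as a stand-in for MySQL; the row_pos alias and the sample rows are only for illustration:

```python
import sqlite3

# SQLite stands in for MySQL 8.0+ here; both support ROW_NUMBER().
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mynamestable (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO mynamestable (id, name) VALUES (?, ?)",
    # id 4 is missing on purpose, as in the question.
    [(1, "name1"), (2, "name2"), (3, "name3"),
     (5, "name5"), (6, "name6"), (7, "name7")],
)

# The stored ids keep their gap; the gap-free number is computed at read time.
rows = conn.execute(
    "SELECT id, name, ROW_NUMBER() OVER (ORDER BY id) AS row_pos "
    "FROM mynamestable ORDER BY id"
).fetchall()

for row in rows:
    print(row)
```

The ids stay untouched, so nothing that references them has to be rewritten when a row is deleted.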

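For this you can use an AFTER DELETE trigger that decrements every id greater than the deleted one:
CREATE TRIGGER update_ids AFTER DELETE ON test_table
FOR EACH ROW UPDATE test_table SET id = id - 1 WHERE id > OLD.id;
You must also reset auto_increment, or write another trigger for insert that sets the new id to max(id) + 1. Be aware that MySQL does not let a trigger modify the table it is defined on (error 1442), so in practice the UPDATE has to be run as a separate statement after the DELETE.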

Select all rows that contain the same value in a column

I want to select every package_id that contains product_id 2.
In this case, package_id 1, 3 and 5 contain product_id 2.
Table: product_package
package_id package_name product_id
---------------------------------------------
1 Gold 1,2,3
2 Platinum 4,5,12
3 Diamond 2,11,5
4 Titanium 3,5
5 Basic 2
I tried:
SELECT
*
FROM
product_package
WHERE product_id IN(2)
It is outputting package_id 3 and 5 only. How do I output this properly?
product_id structure is varchar(256). Should I change the structure or add Foreign keys?
We always recommend against storing delimited lists in a column; see "Is storing a delimited list in a database column really that bad?"
You can still use FIND_IN_SET, but it has to examine every row, so it is slow on large tables:
SELECT
*
FROM
product_package
WHERE FIND_IN_SET(2,product_id)
package_id package_name product_id
1 Gold 1,2,3
3 Diamond 2,11,5
5 Basic 2
First, let me explain what is happening in your query.
You have WHERE product_id IN(2), but product_id is a misnomer and should rather be product_ids, because it is multiple IDs unfortunately stored in a string. IN is made to look up a value in a list. Your list, however, only consists of one element, so you can just as well use the equality operator: WHERE product_id = 2.
What you have is WHERE string = number, so the DBMS has to convert one of the values in order to compare the two. It converts the string to a number (so '2' matches 2 and '002' matches 2, too, as it should). But your strings are not numbers. The DBMS should raise an error on '1,2,3' for instance, because '1,2,3' is not a number. MySQL, however, has a design flaw here and converts the string regardless. It takes as many characters from the left as still form a number. '1' does, but the comma that follows is not considered numeric (yes, MySQL cannot deal with a thousands separator when converting strings to numbers implicitly). So converting '1,2,3' to a number results in 1. Equally, '2,11,5' results in 2, so rather surprisingly '2,11,5' = 2 in MySQL. This is why you are getting that row.
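To make the conversion rule concrete, here is a small emulation in Python. It is only a sketch of the behaviour described above, not MySQL's actual implementation: the leading_int helper is a made-up name, and it handles integer prefixes only (MySQL also accepts decimals and exponents).

```python
import re


def leading_int(text: str) -> int:
    """Return the integer formed by the leading digits of text, or 0 if there are none."""
    match = re.match(r"\s*[+-]?\d+", text)
    return int(match.group()) if match else 0


# package_id -> product_id string, copied from the question.
product_package = {1: "1,2,3", 2: "4,5,12", 3: "2,11,5", 4: "3,5", 5: "2"}

# Emulates WHERE product_id = 2 with the implicit string-to-number conversion.
matched = [pid for pid, ids in product_package.items() if leading_int(ids) == 2]
print(matched)  # [3, 5]
```

Only the strings that start with 2 survive, which is exactly the result reported in the question.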
You ask "Should I change the structure", and the answer to this is yes. So far your table doesn't comply with the first normal form and should thus not exist in a relational database. You'll want two tables instead forming the 1:n relation:
Table: package
package_id package_name
1 Gold
2 Platinum
3 Diamond
4 Titanium
5 Basic
Table: product_package
package_id product_id
1 1
1 2
1 3
2 4
2 5
2 12
3 2
3 11
3 5
4 3
4 5
5 2
You ask "or add Foreign keys?", and the answer is "and add foreign keys". So with the changed structure you want product_package(product_id) to reference product(product_id) and product_package(package_id) to reference package(package_id).
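With the two-table structure, the original question becomes a plain join. A sketch under assumptions: SQLite through Python's sqlite3 module stands in for MySQL, the product table is left out because the question never shows it, and the column types are guesses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE package (
    package_id   INTEGER PRIMARY KEY,
    package_name TEXT NOT NULL
);
-- One row per (package, product) pair instead of a comma-separated string.
CREATE TABLE product_package (
    package_id INTEGER NOT NULL REFERENCES package (package_id),
    product_id INTEGER NOT NULL,
    PRIMARY KEY (package_id, product_id)
);
INSERT INTO package VALUES
    (1, 'Gold'), (2, 'Platinum'), (3, 'Diamond'), (4, 'Titanium'), (5, 'Basic');
INSERT INTO product_package VALUES
    (1, 1), (1, 2), (1, 3), (2, 4), (2, 5), (2, 12),
    (3, 2), (3, 11), (3, 5), (4, 3), (4, 5), (5, 2);
""")

# All packages that contain product 2: an ordinary, indexable equality test.
rows = conn.execute(
    "SELECT p.package_id, p.package_name "
    "FROM package AS p "
    "JOIN product_package AS pp ON pp.package_id = p.package_id "
    "WHERE pp.product_id = ? "
    "ORDER BY p.package_id",
    (2,),
).fetchall()
print(rows)  # [(1, 'Gold'), (3, 'Diamond'), (5, 'Basic')]
```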
Disregarding that you should not be storing multiple values in a single field, you can use LIKE operator to achieve what you are looking for. I'm going with assumptions:
all values are delimited with commas
all values are integers
there are no whitespaces (or any other characters besides integers and commas)
select * from product_package
where product_id like '2,%'
or product_id like '%,2,%'
or product_id like '%,2'
or product_id like '2'
Alternatively, you can use REGEXP operator:
select * from product_package
where product_id regexp '(^|,)2(,|$)'
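When matching an id inside a delimited string, the pattern has to be anchored on both sides by either a comma or the start/end of the string, otherwise 2 also matches inside 12 or 25. A quick check of that idea with Python's re module; the contains_id helper is a hypothetical name, and the same assumptions as above apply (comma-delimited integers, no whitespace):

```python
import re


def contains_id(id_list: str, wanted: int) -> bool:
    """True if wanted appears as a whole element of the comma-separated id_list."""
    pattern = rf"(^|,){wanted}(,|$)"
    return re.search(pattern, id_list) is not None


# package_id -> product_id string, copied from the question.
rows = {1: "1,2,3", 2: "4,5,12", 3: "2,11,5", 4: "3,5", 5: "2"}
matches = [pid for pid, ids in rows.items() if contains_id(ids, 2)]
print(matches)  # [1, 3, 5]

# Partial matches are rejected because both sides are anchored.
print(contains_id("4,25", 2))  # False
print(contains_id("12,3", 2))  # False
```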

Update a row if a field is a subsequence of a string

I have a string S = "1-2-3-4-5-6-7-8"
This is how my database table rows look like:
id SubSequence
1 1-2-4-5
2 1-3-4-5
3 2-5-7-8
4 5-8-9-10
5 6-7-10-11
and so on ...
I want to write a query that would update (in this example) only the first 3 rows because they're a subsequence of string S.
The current solution I have is to programmatically go thru each row, check if it's a subsequence, and update. But I'm wondering if there's a way to do it at the MySQL level for performance.
Update: I don't mind changing the way data is stored. For example, String S could be an array holding those numbers, and the "SubSequence" column can hold those numbers as an array.
No, there is not a way to do the query you describe with good performance in SQL when you store the subsequences as strings like you have done. The reason is that doing substring comparisons cannot be optimized with indexes, so your query will be forced to do the comparisons row by row.
In general, when you try to store sets of values as a string, but you want to use SQL to treat them as discrete values, it's bound to be awkward, difficult to code, and ultimately have bad performance.
In this case, what I would do is make two tables: one that numbers your entities, and a second in which each value of a subsequence is stored on a row by itself.
SubSequences:
id
1
2
SubSequenceElements:
id SubSequenceElement
1 1
1 2
1 4
1 5
2 1
2 3
2 4
2 5
And so on.
Then you can use relational-division techniques to find cases where every element of this set exists in the set you want to compare it to.
Here's an example:
SELECT s.id
FROM SubSequences AS s
LEFT OUTER JOIN (
SELECT id
FROM SubSequenceElements
WHERE SubSequenceElement NOT IN (1,2,3,4,5,6,7,8)
) AS invalid USING (id)
WHERE invalid.id IS NULL;
In other words, you want to return rows from SubSequences such that no match is found in SubSequenceElements with an element value that is not in the set you're trying to match.
It's a bit confusing, because you have to think about the problem as a double-don't-match-this-set problem. But once you get relational division, it can be very powerful.
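The same double negative can be written with NOT EXISTS, which some people find easier to read. A runnable sketch, assuming SQLite through Python's sqlite3 module as a stand-in for MySQL and using the five rows from the question; the column types are guesses. Like the query above, it treats each subsequence as a set and does not check element order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SubSequences (id INTEGER PRIMARY KEY);
CREATE TABLE SubSequenceElements (
    id INTEGER NOT NULL REFERENCES SubSequences (id),
    SubSequenceElement INTEGER NOT NULL,
    PRIMARY KEY (id, SubSequenceElement)
);
""")

# The five rows from the question, one element per row.
data = {
    1: [1, 2, 4, 5],
    2: [1, 3, 4, 5],
    3: [2, 5, 7, 8],
    4: [5, 8, 9, 10],
    5: [6, 7, 10, 11],
}
for seq_id, elements in data.items():
    conn.execute("INSERT INTO SubSequences (id) VALUES (?)", (seq_id,))
    conn.executemany(
        "INSERT INTO SubSequenceElements VALUES (?, ?)",
        [(seq_id, element) for element in elements],
    )

# Keep a subsequence only if it has NO element outside S = 1..8.
matching = [row[0] for row in conn.execute("""
    SELECT s.id
    FROM SubSequences AS s
    WHERE NOT EXISTS (
        SELECT 1
        FROM SubSequenceElements AS e
        WHERE e.id = s.id
          AND e.SubSequenceElement NOT IN (1, 2, 3, 4, 5, 6, 7, 8)
    )
    ORDER BY s.id
""")]
print(matching)  # [1, 2, 3]
```

The ids returned here are the ones an UPDATE ... WHERE id IN (...) would then target.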
If the set can be represented by the numbers 0 through 63 (or some subset of that), then...
Using a column like this
elements BIGINT UNSIGNED NOT NULL DEFAULT '0'
Then "2-5-7-8" could be put into it thus:
UPDATE ...
SET elements = (1<<2) | (1<<5) | (1<<7) | (1<<8);
Then various operations can be done in a single expression:
WHERE elements = (1<<2) | (1<<5) | (1<<7) | (1<<8) -- Test for exactly that set
WHERE (elements & ~( (1<<2) | (1<<5) | (1<<7) | (1<<8) )) != 0
-- checks to see if any other bits are turned on
This last example is close to what you need. One side of the "and not" would have the 1..8 of your example, the other would have the elements column.
Your example has S represented as 0x1FE;
WHERE elements & ~0x1FE
will be 0 (false) for ids 1,2,3; non-zero (true) for ids 4 and 5.
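The bit arithmetic is easy to verify outside the database. A small Python sketch of the same test, using the rows from the question; to_mask is a made-up helper name, and values are assumed to lie in 0..63 so that they fit a BIGINT UNSIGNED column:

```python
def to_mask(subsequence: str) -> int:
    """Turn a string like '2-5-7-8' into an integer with those bit positions set."""
    mask = 0
    for part in subsequence.split("-"):
        value = int(part)
        if not 0 <= value <= 63:
            raise ValueError(f"{value} does not fit in a 64-bit mask")
        mask |= 1 << value
    return mask


S_MASK = to_mask("1-2-3-4-5-6-7-8")
print(hex(S_MASK))  # 0x1fe

rows = {1: "1-2-4-5", 2: "1-3-4-5", 3: "2-5-7-8", 4: "5-8-9-10", 5: "6-7-10-11"}

# A row is a subset of S when it has no bit set outside S_MASK.
subset_ids = [rid for rid, seq in rows.items() if (to_mask(seq) & ~S_MASK) == 0]
print(subset_ids)  # [1, 2, 3]
```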

When comparing current row with previous row the query is too slow

When subtracting the previous row from the current row the query is too slow, is there a more efficient way to do this?
I am trying to create a data filter that distinguishes events which occur sequentially from those that do not. I have a table of machine operational data, 'source', which is ordered chronologically. Using a WHERE clause I filter out the data that is less relevant to this particular analysis. The remaining data is inserted into a new table, 'filtered'. Using the inserted ID numbers from 'source', I compare each row with its preceding row to find the difference in value: if the difference is 1, the events occurred in sequence, and if the difference is null, they did not. My problem is the length of time it takes to compare a row with the previous row. I have reduced my data volume to just 2.5% (275000 rows) of what its full volume will be, and the query takes 3012 seconds according to the MySQL Workbench action output. I have experimented with structuring the query differently but have ultimately reached dead ends. So my question is: is there a more efficient way to compare a row with its previous row?
OK – here are some more details.
/*First I create the table for the filtered data */
drop table if exists filtered_dta;
create table filtered_dta
(
ID int (11) not null auto_increment,
IDx1 int (11),
primary key (ID)
);
/* Then I insert the filtered data */
insert into filtered_dta (IDx1)
select seq from source
WHERE range_value < -1.75
and range_value > -5 ;
/* Then I compare each row with its previous */
select t1.ID, t1.IDx1,(t1.IDx1-t2.IDx1)
as seq_value
from filtered_dta t1
left outer join filtered_dta t2
on t1.IDx1 = t2.IDx1+1
order by IDx1
;
Here are sample tables.
Table - filtered_dta
ID IDx1
1 3
2 4
3 7
4 12
5 13
6 14
Results
ID IDx1 seq_value
1 3 null
2 4 1
3 7 null
4 12 null
5 13 1
6 14 1
A full data set from the source table is expected to be between 3 and 10 million rows. The database will create and use about 50 tables. This database is being used as a back end engine for simulation software which does not have the capacity to process this amount of data and give an appropriate analysis of the system which the data represents.
I have spent some time on the issue and have come across the following:
It may be that the find_seq table is created with MyISAM and needs converting to an InnoDB table. I tried setting the default engine to InnoDB but saw no noticeable difference.
The question "MySQL query painfully slow on large data" describes a similar slow-query problem, but its issue lay in having a function in a WHERE clause. From my action output I can see that the WHERE clause is not too slow.
I would appreciate any input anyone may have on this. Also I am not a proficient user of MySQL so if possible give details.
Kind regards.
You can use something like this template to identify sequential "islands" without a self-join:
SELECT @island := @island + IF(seqId <> @lastSeqId + 1, 1, 0) AS island
  , orderingQ.[fieldsYouWant]
  , @lastSeqId := seqId
FROM (
  SELECT [fieldsYouWant], [sequentialIdentifier] AS seqId
  FROM [theTable] AS t
  , (SELECT @island := 0, @lastSeqId := [somethingItCannotBe]) AS init_dnr -- Initializes variables, do not reference
  WHERE [filteringConditionsMet]
  ORDER BY [orderingCriteria]
) AS orderingQ
;
I tried keeping it as generic as possible, but you'll note I had to resort to the assumption that seqId is numeric and expected to increment by one. Conditions in the island calculation can be much more complicated if needed (for cases such as where (A, 1), (A, 2), (B, 3) should be two islands based on the sequence not being defined by a single value).
You can take this template further, to identify "island" boundaries and sizes, simply by using the above query as a subquery for something like:
SELECT island, MIN(seqId), MAX(seqId), COUNT(seqId)
FROM ([above query]) AS islandQ
GROUP BY island
;
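If the server is MySQL 8.0 or later, the LAG() window function is another way to compare each row with its predecessor in a single pass, without a self-join or user variables. This is a different technique from the template above, shown only as a sketch: SQLite (3.25+) through Python's sqlite3 module stands in for MySQL, and the table is filled with the six sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE filtered_dta (ID INTEGER PRIMARY KEY, IDx1 INTEGER)")
conn.executemany(
    "INSERT INTO filtered_dta (IDx1) VALUES (?)",
    [(3,), (4,), (7,), (12,), (13,), (14,)],
)

# LAG(IDx1) is the previous row's IDx1 in IDx1 order (NULL for the first row).
# seq_value is 1 when the row directly follows its predecessor, NULL otherwise.
rows = conn.execute("""
    SELECT ID,
           IDx1,
           CASE WHEN IDx1 - LAG(IDx1) OVER (ORDER BY IDx1) = 1 THEN 1 END AS seq_value
    FROM filtered_dta
    ORDER BY IDx1
""").fetchall()

for row in rows:
    print(row)
```

The seq_value column reproduces the Results table from the question: null, 1, null, null, 1, 1.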

Iterate through a column and summarize findings

I have a table (t1) in mySQL that generates the following table:
type time full
0 11 yes
1 22 yes
0 11 no
3 13 no
I would like to create a second table (t2) from this that will summarize the information found in t1 like the following:
type time num_full total
0 11 1 2
1 22 1 1
3 13 0 1
I want to be able to iterate through the type column in order to be able to start this summary, something like a for-loop. The types can be up to a value of n, so I would rather not write n+1 WHERE statements, then have to update the code every time more types are added.
Notice how t2 skipped type 2? This has also been escaping me when I try looping. I only want the types actually found in t1 to have rows created in t2.
While a direct answer would be nice, it would be much more helpful to be pointed to some sources where I could figure this out, or both.
This may do what you want
create table if not exists t2 as
select type, time, sum(`full` = 'yes') as num_full, count(*) as total
from t1
group by type, time
order by type, time;
depending on how you want to aggregate the time column.
This is a starting point for reference on the group by functions : https://dev.mysql.com/doc/refman/5.0/en/group-by-functions.html
and here for the CREATE TABLE syntax:
https://dev.mysql.com/doc/refman/5.6/en/create-table.html
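To see that GROUP BY does the "looping" on its own, including skipping types that never occur, here is the summary run end to end on the four sample rows. A sketch, assuming SQLite through Python's sqlite3 module as a stand-in for MySQL; the CASE expression is used instead of a boolean sum because it is portable across both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (type INTEGER, time INTEGER, `full` TEXT)")
conn.executemany(
    "INSERT INTO t1 VALUES (?, ?, ?)",
    [(0, 11, "yes"), (1, 22, "yes"), (0, 11, "no"), (3, 13, "no")],
)

# One output row per (type, time) pair that actually exists in t1.
conn.execute("""
    CREATE TABLE t2 AS
    SELECT type,
           time,
           SUM(CASE WHEN `full` = 'yes' THEN 1 ELSE 0 END) AS num_full,
           COUNT(*) AS total
    FROM t1
    GROUP BY type, time
""")

summary = conn.execute(
    "SELECT type, time, num_full, total FROM t2 ORDER BY type, time"
).fetchall()
print(summary)  # [(0, 11, 1, 2), (1, 22, 1, 1), (3, 13, 0, 1)]
```

No row is created for type 2 because no input row has that type.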

MySQL 3 column unique constraint not working, possibly due to NULL

This is my table, GameAdmin:
game_id company_id user_id
1 5 NULL
1 5 NULL
1 NULL 2
1 NULL 3
1 NULL 3
It links games to entities that can edit them (either a company or a user).
I have a UNIQUE index on all columns, but as you can see it's not working as expected.
What is wrong? Is it because of the NULLs?
I know I could make it work by changing the structure to:
game_id admin_type admin_id
1 company 5
1 company 5
1 user 2
1 user 3
1 user 3
But that's not compatible with my JPA/Hibernate setup, or at least very inconvenient, because it doesn't allow me to set the relations like this:
@ManyToOne(optional=true)
private User user;
@ManyToOne(optional=true)
private Company company;
Oh, the solution is so simple. I split the constraint up, so there's one for game_id and company_id, and for game_id and user_id.
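Here is a quick check that the split constraints behave as intended. It is a sketch, assuming SQLite through Python's sqlite3 module as a stand-in for MySQL (both treat NULLs in a UNIQUE index as distinct); the index names and the try_insert helper are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE GameAdmin (
    game_id    INTEGER NOT NULL,
    company_id INTEGER,
    user_id    INTEGER
);
-- Two two-column constraints instead of one three-column constraint.
CREATE UNIQUE INDEX uq_game_company ON GameAdmin (game_id, company_id);
CREATE UNIQUE INDEX uq_game_user    ON GameAdmin (game_id, user_id);
""")


def try_insert(game_id, company_id, user_id):
    """Insert one row; return True on success, False if a unique index rejects it."""
    try:
        conn.execute(
            "INSERT INTO GameAdmin VALUES (?, ?, ?)",
            (game_id, company_id, user_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


# The five rows from the question, in order.
results = [
    try_insert(1, 5, None),  # accepted
    try_insert(1, 5, None),  # rejected: duplicate (game_id, company_id)
    try_insert(1, None, 2),  # accepted
    try_insert(1, None, 3),  # accepted
    try_insert(1, None, 3),  # rejected: duplicate (game_id, user_id)
]
print(results)  # [True, False, True, True, False]
```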
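It is because of the NULL values. In a UNIQUE index, NULLs are never considered equal to each other, so two rows that both have NULL in an indexed column do not count as duplicates. If you don't have foreign keys on those fields, you might use 0 instead of NULL.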
If you know (and I mean know, not guess) that you will never have more than 2 possible classes (user/company), just use negative IDs on one of them.
I know this is not 1NF
I know this is hacky
I know this is not really beautiful
But IMHO this is on the acceptable side of the "thin red line".