I'm creating a PHP script to insert rows into a table called orders, based on a shopping cart that is stored in the session as an associative array, $_SESSION['cart']. The table looks something like this:
orders
+----+----------+---------+---------+---------+
| Id | Username | Item1Id | Item2Id | Item3Id |
+----+----------+---------+---------+---------+
| 1  | a@aa.com | 8000001 | 8000002 | 800003  |
| 5  | a@aa.com | 7000001 | 6000002 | 700003  |
| 7  | b@bb.com | 8000001 | 8000002 | NULL    |
| 10 | a@aa.com | 3000001 | 1000002 | 800009  |
+----+----------+---------+---------+---------+
Id column type is CHAR(20) as I may choose to use letters later on.
As part of inserting the row, I need to assign an Id (Primary Key) to the order row which will be set to 1 higher than the current highest Id number found.
The whole script works perfectly; query finds highest Id in the table and I increment that by 1 and assign it to a variable to use as part of the insert query. The only problem is that "SELECT MAX(Id) FROM orders" can't seem to find anything higher than 9. Is there a condition which prevents the SELECT MAX(Id) from identifying anything in double digits?
I've got it written like:
$highestID = mysqli_query($conn, "SELECT MAX(Id) FROM orders");
$orderID = $highestID +1;
I've emptied the table except for Id numbers 1 and 2. Running the PHP script inserts new rows with Id numbers 3, 4, 5 and so on, but when it gets to 10 the script fails with a duplicate primary key error for '10' (from $orderID's value). Even when I manually enter a row into the table with an Id of '25', $orderID still only returns '10' when I echo out its result.
I have not set any specific limits to the amount of rows that can be entered or anything like that.
Id is CHAR(20), so MAX(Id) compares the values as strings, and as a string '9' sorts after '10' and '25'. Use the CAST or CONVERT function so that the values are compared as numbers.
Like:
select max(cast(Id as unsigned)) from orders
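To see the string comparison in action without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. The table contents are made up for illustration, and SQLite spells the cast as CAST(Id AS INTEGER) where MySQL uses CAST(Id AS UNSIGNED).

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (Id CHAR(20) PRIMARY KEY)")
conn.executemany(
    "INSERT INTO orders (Id) VALUES (?)",
    [("1",), ("2",), ("9",), ("10",), ("25",)],
)

# Compared as text: '9' is the largest value because '9' > '2' > '1'.
string_max = conn.execute("SELECT MAX(Id) FROM orders").fetchone()[0]

# Compared as numbers after the cast.
numeric_max = conn.execute(
    "SELECT MAX(CAST(Id AS INTEGER)) FROM orders"
).fetchone()[0]

print(repr(string_max), numeric_max)
```

The text maximum stays at '9' no matter how many two-digit Ids exist, which is consistent with the script always computing 10 as the next Id.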
You really do not need to go through all that trouble for an auto-incrementing PK. Here's how you can go about it.
Step 1: In phpMyAdmin, edit your table and check the A_I checkbox for your PK column.
Step 2: When inserting from PHP, leave the PK column out of the column list (or pass NULL for it). MySQL will automatically assign the next auto-increment value to your PK.
E.g.,
$query = "INSERT INTO mytable (name) VALUES ('Name1'), ('Name2')";
Edit: AUTO_INCREMENT needs a numeric column, so you cannot keep a CHAR(20) PK and expect the increment to work.
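As a rough illustration of letting the database hand out the key, the sketch below uses Python's sqlite3 module rather than PHP and MySQL. The table name mytable comes from the example above; everything else is an assumption, and SQLite's INTEGER PRIMARY KEY AUTOINCREMENT only approximates MySQL's AUTO_INCREMENT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
)

# The id column is omitted from the INSERT; the database assigns it.
first_id = conn.execute(
    "INSERT INTO mytable (name) VALUES (?)", ("Name1",)
).lastrowid
second_id = conn.execute(
    "INSERT INTO mytable (name) VALUES (?)", ("Name2",)
).lastrowid

print(first_id, second_id)
```

In PHP with mysqli, the counterpart of lastrowid would be mysqli_insert_id($conn), so there is no need to query MAX(Id) first.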
I have a MySQL table with around 3 million rows (listings) at the moment. These listings are updated 24/7 (around 30 listings/sec) by a Python script (Scrapy) using pymysql, so the performance of the queries matters!
If a listing doesn't exist (i.e. its UNIQUE url is not in the table yet), a new record is inserted (this happens for roughly every hundredth listing). The id is set to auto_increment and I am using an INSERT INTO listings ... ON DUPLICATE KEY UPDATE last_seen_at = CURRENT_TIMESTAMP. The update of last_seen_at is necessary to check whether the item is still online, as I am crawling the search results page with multiple listings on it and not checking each individual URL each time.
+--------------+-------------------+-----+----------------+
| Field | Type | Key | Extra |
+--------------+-------------------+-----+----------------+
| id | int(11) unsigned | PRI | auto_increment |
| url | varchar(255) | UNI | |
| ... | ... | | |
| last_seen_at | timestamp | | |
| ... | ... | | |
+--------------+-------------------+-----+----------------+
The problem:
At first, it all went fine. Then I noticed larger and larger gaps in the auto_incremented id column and found out it's due to the INSERT INTO ... statement: MySQL attempts to do the insert first. This is when the id gets auto incremented. Once incremented, it stays. Then the duplicate is detected and the update happens.
Now my question is: which solution is best for performance in the long term?
Option A: Set the id column to unsigned INT or BIGINT and just ignore the gaps. Problem here is I'm afraid of hitting the maximum after a couple of years updating. I'm already at an auto_increment value of around 12,000,000 for around 3,000,000 listings after two days of updating...
Option B: Switch to an INSERT IGNORE ... statement, check the affected rows and UPDATE ... if necessary.
Option C: SELECT ... the existing listings, check for existence within Python and INSERT ... or UPDATE ... accordingly.
Any other wise options?
Additional info: I need an id for information related to a listing that is stored in other tables (e.g. listings_images, listings_prices etc.). IMHO using the URL (which is unique) won't be the best option for foreign keys.
+------------+-------------------+
| Field | Type |
+------------+-------------------+
| listing_id | int(11) unsigned |
| price | int(9) |
| created_at | timestamp |
+------------+-------------------+
I was in exactly the same situation as you.
I have millions of records being entered into a table by a scraper that runs every day.
I tried the following, but it failed:
Load all URLs into a Python tuple or list and, while scraping, only scrape those that are not in the list. FAILED because loading the URLs into a Python tuple or list consumed too much of the server's RAM.
Check each record before inserting it. FAILED because it made the INSERT process too slow: it first has to query the table with millions of rows and then decide whether to INSERT or not.
THE SOLUTION THAT WORKED FOR ME (for a table with millions of rows):
I removed the id column because it is irrelevant and I do not need it.
Make url the PRIMARY KEY, since it will be unique.
Add a UNIQUE INDEX. THIS IS A MUST: it will improve your table's performance drastically.
Do bulk inserts instead of inserting one by one (see the pipeline code below).
Notice that it uses INSERT IGNORE INTO, so only new records are inserted; if a record already exists, it is ignored completely.
If you use REPLACE INTO instead of INSERT IGNORE INTO in MySQL, new records are inserted as before, but when a record already exists, the old row is deleted and replaced by the new one.
class BatchInsertPipeline(object):

    def __init__(self):
        self.items = []
        self.query = None

    def process_item(self, item, spider):
        table = item['_table_name']
        del item['_table_name']

        if self.query is None:
            placeholders = ', '.join(['%s'] * len(item))
            columns = '`' + '`, `'.join(item.keys()).rstrip(' `') + '`'
            self.query = 'INSERT IGNORE INTO '+table+' ( %s ) VALUES ( %s )' \
                % (columns, placeholders)

        self.items.append(tuple(item.values()))

        if len(self.items) >= 500:
            self.insert_current_items(spider)

        return item

    def insert_current_items(self, spider):
        spider.cursor.executemany(self.query, self.items)
        self.items = []

    def close_spider(self, spider):
        self.insert_current_items(spider)
        self.items = []
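For readers who need to keep the numeric id (the question uses it as a foreign key), one further option that neither post spells out is to try the UPDATE first and INSERT only when no row was touched, so that an auto-increment value is requested only for listings that are really new. The sketch below is an assumption-laden illustration using Python's sqlite3 module in place of pymysql; the helper name touch_or_insert and the example URLs are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE listings (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        url TEXT NOT NULL UNIQUE,
        last_seen_at TEXT
    )
    """
)

def touch_or_insert(conn, url):
    """Refresh last_seen_at for a known url; insert only if nothing was updated."""
    cur = conn.execute(
        "UPDATE listings SET last_seen_at = CURRENT_TIMESTAMP WHERE url = ?",
        (url,),
    )
    if cur.rowcount == 0:
        # No existing row matched, so this is a new listing.
        conn.execute(
            "INSERT INTO listings (url, last_seen_at) "
            "VALUES (?, CURRENT_TIMESTAMP)",
            (url,),
        )

# Hypothetical crawl: the same listing is seen three times, then a new one.
for seen_url in [
    "http://example.com/a",
    "http://example.com/a",
    "http://example.com/a",
    "http://example.com/b",
]:
    touch_or_insert(conn, seen_url)

rows = conn.execute("SELECT id, url FROM listings ORDER BY id").fetchall()
print(rows)
```

Two caveats to verify before relying on this with MySQL: the affected-row count there reports changed rows by default, so a row whose timestamp does not actually change may report 0, and with several concurrent writers the UPDATE and INSERT are not atomic, so the INSERT should still tolerate a duplicate-key error.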
I have a table as shown below
| id | name | doc_no |
|:-----------|------------:|:------------:|
| 1 | abc | D11710001 |
| 2 | efg | D21710001 |
| 3 | hij | D31710001 |
| 4 | klm | D41710001 |
| 5 | nop | D51710001 |
| 1 | qrs | D11710002 |
I want to generate a unique id based on the id given. For example, when I have an item to be stored in this table, a unique id should be generated based on the id in the table.
Note: The id in this table is a foreign key. The doc no can be changed manually by the user into their own format.
The id format: D 'id' 'year' 'month' 0001 (auto increment)
How can I write the SQL to generate the unique id when storing data?
Continuing with the comment by @strawberry, I would recommend not storing the ID in your database. Besides the fact that accessing the auto increment ID at the same time as you are inserting the record might be tricky, storing this generated ID would duplicate information already stored elsewhere in your table. Instead of storing your ID, just generate it when you query, e.g.
SELECT
    id, name, doc_no,
    CONCAT('D', id, DATE_FORMAT(`date`, '%y%m'), LPAD(auto_id, 4, '0')) AS unique_id
FROM yourTable;
This assumes that you would be storing the insertion date of each record in a date column called date. It also assumes that your table has an auto increment column called auto_id. Note that having the date of insertion stored may be useful to you in other ways, e.g. if you want to search for data in your table based on date or time.
You could create a trigger that updates the column, or you can run an UPDATE statement right after your INSERT:
insert into <YOUR_TABLE>(NAME,DOC_NO) values('hello','dummy');
update <YOUR_TABLE> set DOC_NO=CONCAT('D',
CAST(YEAR(NOW()) AS CHAR(4)),
CAST(MONTH(NOW()) AS CHAR(4)),
LAST_INSERT_ID())
WHERE id=LAST_INSERT_ID();
Please note that the SQL above may cause a race condition when the server receives multiple requests simultaneously.
@Tim Biegeleisen has a good point, though: it is better to construct the id when you are SELECTing the data.
I have 2 MySQL tables.
One table has a column that lists all the states
colStates | column2 | column 3
------------------------------
AK | stuff | stuff
AL | stuff | stuff
AR | stuff | stuff
etc.. | etc.. | etc..
The second table has a column(randomStates) with all NULL values that need to be populated with a randomly selected state abbreviation.
Something like...
UPDATE mytable SET `randomStates`= randomly selected state value WHERE randomStates IS NULL
Can someone help me with this statement. I have looked around at other posts, but I don't understand them.
This works for me with trial data in SQLite:
UPDATE mytable
SET randomStates = (SELECT colStates FROM
(SELECT * FROM first_table ORDER BY RANDOM())
WHERE randomStates IS NULL)
Without the first SELECT portion, you end up with the same random value inserted into all the NULL randomStates fields (i.e. if you just do SELECT StateValue FROM counts ORDER BY RANDOM(), you don't get what you want).
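A small way to try this idea out is sketched below with Python's sqlite3 module. The table and column names follow the question, while the id column and the twenty sample rows are assumptions. The subquery refers to the row being updated, which is what is meant to make SQLite evaluate it again for each row, and the outer WHERE keeps rows that already have a value untouched.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE first_table (colStates TEXT)")
conn.executemany(
    "INSERT INTO first_table (colStates) VALUES (?)",
    [("AK",), ("AL",), ("AR",)],
)
conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, randomStates TEXT)")
conn.executemany(
    "INSERT INTO mytable (id) VALUES (?)", [(i,) for i in range(1, 21)]
)

# The reference to mytable.randomStates correlates the subquery with the
# row being updated; ORDER BY RANDOM() LIMIT 1 picks one state for it.
conn.execute(
    """
    UPDATE mytable
    SET randomStates = (
        SELECT colStates
        FROM first_table
        WHERE mytable.randomStates IS NULL
        ORDER BY RANDOM()
        LIMIT 1
    )
    WHERE randomStates IS NULL
    """
)

values = [row[0] for row in conn.execute("SELECT randomStates FROM mytable")]
print(values)
```

This is SQLite syntax; MySQL names the function RAND() and has its own rules about subqueries in UPDATE statements, so the statement would need adapting there.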
I am trying to find the best way to do this, better if I could use Zend_db_table.
Basically I am inserting a row and one of the values comes from the same DB, this value changes constantly so I need to be sure the data inserted is valid.
I can't query first for the value and then append it to the insert query because between the two queries the data could change and I end up inserting the wrong value. I wonder if LOCKING the table is the way to go or if Zend has a shortcut.
I'm using Mysql.
[EDITED]
For example: This table has a field called item_number, and for each new row I take the last item_number+1 (from the same item_family) and insert with it. It is a manual increment.
TABLE ITEMS
| item_id | item_family | item_number | name |
| 15 | 1 | 10 | Pan |
| 16 | 2 | 1 | Dress |
| 17 | 1 | 11 | Spoon |
In this example you see that the next row from item_family 1 has its item_number = 11 because the previous row from the same item_family was 10.
Thanks!
My solution (using Zend) was to LOCK the table, then query the item_number, append the result to the insert query, insert and UNLOCK the table. Here is how to LOCK and UNLOCK:
$sql = "LOCK TABLE items WRITE";
$this->getAdapter()->query($sql);
//run select to get last item_number
//append result to insert array
//insert
$sql = "UNLOCK TABLES";
$this->getAdapter()->query($sql);
Another way is to write the query so that the value is selected during the insert. Here is an example:
$sql = "INSERT INTO items (item_id, item_family, item_name, item_number)
VALUES (item_id, item_family, item_name, (SELECT item_number FROM... )+1)";
$this->getAdapter()->query($sql);
More info about this kind of query is available on the MySQL website.
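One way the "select during the insert" idea might look when written out in full is an INSERT ... SELECT that computes the next item_number for the family in the same statement. The sketch below uses Python's sqlite3 module instead of Zend and MySQL; the helper add_item is hypothetical, and the sample rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE items (
        item_id INTEGER PRIMARY KEY AUTOINCREMENT,
        item_family INTEGER NOT NULL,
        item_number INTEGER NOT NULL,
        name TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO items (item_id, item_family, item_number, name) "
    "VALUES (?, ?, ?, ?)",
    [(15, 1, 10, "Pan"), (16, 2, 1, "Dress"), (17, 1, 11, "Spoon")],
)

def add_item(conn, family, name):
    """Insert a row whose item_number is the family's current maximum plus one."""
    cur = conn.execute(
        "INSERT INTO items (item_family, item_number, name) "
        "SELECT ?, COALESCE(MAX(item_number), 0) + 1, ? "
        "FROM items WHERE item_family = ?",
        (family, name, family),
    )
    # Read back the number that was actually stored for the new row.
    return conn.execute(
        "SELECT item_number FROM items WHERE item_id = ?", (cur.lastrowid,)
    ).fetchone()[0]

next_in_family_1 = add_item(conn, 1, "Fork")
first_in_family_3 = add_item(conn, 3, "Hat")
print(next_in_family_1, first_in_family_3)
```

A single statement narrows the window for conflicts but does not by itself rule out two sessions computing the same number, so a UNIQUE key on (item_family, item_number) or the table lock shown earlier is still worth considering.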
Provided that item_id is the primary key of your table and is set as an auto increment field, the Zend_Db_Table::insert() function will return the primary key of the row you just inserted.
For example:
$table = new Items();
$data = array(
'item_family' => '1',
'item_number' => '10',
'name' => 'Pan'
);
$itemId = $table->insert($data);
You can also call the MySQL last-insert-id function directly:
$itemId = $this->getAdapter()->lastInsertId('items');
My problem is: I have a table with an auto_increment column. When I insert some values, all is right.
Insert first row : ID 1
Insert second row : ID 2
Now I want to insert a row at ID 10.
My problem is, that after this there are only rows inserted after ID 10 (which is the normal behaviour ).
But I want that the database first fills up ID 3-9 before making that.
Any suggestions?
EDIT:
To clarify: this is for an URL shortener I want to build for myself.
I convert the id to a word (a-zA-Z0-9) for searching, and for saving in the database I convert it to a number, which is the ID of the table.
The problem now is:
I shorten the first link (without a name) -> ID is 1 and the automatic name is 1 converted to a-zA-Z0-9, which is a
Next, the same happens -> ID is 2 and the name is b, which is 2 converted.
Next, it gets interesting: somebody wants to name the link test -> ID is 4597691, which is test converted
Now if somebody adds another link with no name -> ID is 4597692, which would be tesu because the number is converted.
I want new rows to be inserted automatically at the last gap that was made (here 3)
You could have another integer column for URL IDs.
Your process then might look like this:
If a default name is generated for a link, then you simply insert a new row, fill the URL ID column with the auto-increment value, then convert the result to the corresponding name.
If a custom name is specified for a URL, then, after inserting a row, the URL ID column would be filled with the number obtained from converting the chosen name to an integer.
And so on. When looking up integer IDs, you would then use the URL ID column, not the table's auto-increment column.
If I'm missing something, please let me know.
You could do 6 dummy inserts and delete/update them later as you need. The concept of the auto increment, by design, is meant to limit the application's or user's control over the number to ensure a unique value for every single record entered into the table.
ALTER TABLE MY_TABLE AUTO_INCREMENT = 3;
You would have to find the first unused id, store it in a user variable and use it as the id for the insert.
SELECT @id := t1.id +1
FROM sometable t1 LEFT JOIN sometable t2
ON t2.id = t1.id +1 WHERE t2.id IS NULL LIMIT 1;
INSERT INTO sometable(id, col1, col2, ... ) VALUES(@id, 'aaa', 'bbb', ... );
You will have to run both queries for every insert as long as you still have gaps; it's up to you to decide whether it is worth doing.
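The gap search can be tried out with Python's sqlite3 module as sketched below. The helper first_free_id and the sample ids are assumptions; the query is the LEFT JOIN from above with an ORDER BY added so that LIMIT 1 returns the lowest gap, and an empty table is treated as a special case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (id INTEGER PRIMARY KEY, col1 TEXT)")
conn.executemany(
    "INSERT INTO sometable (id, col1) VALUES (?, ?)",
    [(1, "a"), (2, "b"), (10, "c")],
)

def first_free_id(conn):
    """Return the lowest id whose predecessor exists but which is itself unused."""
    row = conn.execute(
        """
        SELECT t1.id + 1
        FROM sometable t1
        LEFT JOIN sometable t2 ON t2.id = t1.id + 1
        WHERE t2.id IS NULL
        ORDER BY t1.id
        LIMIT 1
        """
    ).fetchone()
    # An empty table has no rows to join on, so start at 1.
    return row[0] if row else 1

new_id = first_free_id(conn)
conn.execute("INSERT INTO sometable (id, col1) VALUES (?, ?)", (new_id, "aaa"))
next_id = first_free_id(conn)
print(new_id, next_id)
```

Note that this query only looks after existing ids, so it will not report a gap below the smallest id, and the search and the insert are two separate steps that concurrent writers could interleave.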
Not 100% sure what you're trying to achieve, but something like this might work:
drop table if exists foo;
create table foo
(
id int unsigned not null auto_increment primary key,
row_id tinyint unsigned unique not null default 0
)
engine=innodb;
insert into foo (row_id) values (1),(2),(10),(3),(7),(5);
select * from foo order by row_id;
+----+--------+
| id | row_id |
+----+--------+
|  1 |      1 |
|  2 |      2 |
|  4 |      3 |
|  6 |      5 |
|  5 |      7 |
|  3 |     10 |
+----+--------+
6 rows in set (0.00 sec)