I created a database called orthomcl
CREATE DATABASE orthomcl;
CREATE USER 'orthomcl'@'localhost' IDENTIFIED BY 'password';
GRANT ALL PRIVILEGES ON orthomcl.* TO 'orthomcl'@'localhost';
SELECT * FROM mysql.db WHERE Db = 'orthomcl'\G
I then created a table called similarSequences in the database orthomcl.
To check whether I have duplicate entries in the table, I used the following command:
USE orthomcl;
select * from similarSequences group by query_id,subject_id having count(*)>1;
This command then returned the following result:
134674 rows in set (5 min 20.81 sec)
Then I created a new table that will have only distinct rows.
create table holdup as select distinct * from similarSequences;
And this resulted in
mysql> create table holdup as select distinct * from similarSequences;
Query OK, 11320619 rows affected (5 min 53.82 sec)
Records: 11320619 Duplicates: 0 Warnings: 0
Now, I would like to select the distinct rows from the holdup table, delete all the rows in the similarSequences table, and then insert the rows from the holdup table. I am not sure how to proceed, as this is my first time with MySQL.
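I think this is what you're trying to get at. Since holdup already contains only distinct rows, you can empty the original table and copy the rows back:
TRUNCATE TABLE similarSequences;
INSERT INTO similarSequences SELECT * FROM holdup;
DROP TABLE holdup;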
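An alternative to deleting and re-inserting roughly 11 million rows is to swap the de-duplicated copy into place. The following is only a sketch under assumptions: similarSequences_old is a hypothetical name, and it assumes nothing else (foreign keys, views) depends on the original table.

```sql
USE orthomcl;

-- CREATE TABLE ... AS SELECT copies columns and rows but not indexes,
-- so inspect the original definition and recreate any indexes on holdup first.
SHOW CREATE TABLE similarSequences;

-- Swap the two tables in a single statement.
RENAME TABLE similarSequences TO similarSequences_old,
             holdup           TO similarSequences;

-- Only after checking the new table: drop the old copy.
DROP TABLE similarSequences_old;
```

Note that SELECT DISTINCT * removes only rows that are identical in every column, whereas the earlier check grouped on query_id and subject_id alone, so the two may not agree on what counts as a duplicate.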
I have a table with three records. There is one field defined as auto_increment. The IDs are 1, 2, 3...
When I run the query
SELECT LAST_INSERT_ID() FROM myTable
The result is 5, even though the last ID is 3. Why?
And which is better to use: LAST_INSERT_ID() or SELECT MAX(ID) FROM myTable?
The LAST_INSERT_ID() function only returns the most recent autoincremented id value for the most recent INSERT operation, to any table, on your MySQL connection.
If you haven't just done an INSERT it returns an unpredictable value. If you do several INSERT queries in a row it returns the id of the most recent one. The ids of the previous ones are lost.
If you use it within a MySQL transaction, the row you just inserted won't be visible to another connection until you commit the transaction. So, it may seem like there's no row matching the returned LAST_INSERT_ID() value if you're stepping through code to debug it.
You don't have to use it within a transaction, because it is a connection-specific value. If you have two connections (two MySQL client programs) inserting stuff, they each have their own distinct value of LAST_INSERT_ID() for the INSERT operations they are doing.
Edit: If you are trying to create a parent-child relationship, for example name and email addresses, you might try this kind of sequence of MySQL statements.
INSERT INTO user (name) VALUES ('Josef');
SET @userId := LAST_INSERT_ID();
INSERT INTO email (user_id, email) VALUES (@userId, 'josef@example.com');
INSERT INTO email (user_id, email) VALUES (@userId, 'josef@josefsdomain.me');
This uses LAST_INSERT_ID() to get the autoincremented ID from the user row after you insert it. It then makes a copy of that id in @userId, and uses it twice, to insert two rows in the child table. By using more INSERT INTO email requests, you could insert an arbitrary number of child rows for a single parent row.
Pro tip: SELECT MAX(id) FROM table is a bad, bad way to figure out the ID of the most recently inserted row. It's vulnerable to race conditions. So it will work fine until you start scaling up your application, then it will start returning the wrong values at random. That will ruin your weekends.
last_insert_id() has no relation to specific tables. Within the same connection, all tables share the same value.
Below is a demo:
mysql> create table t1(c1 int primary key auto_increment);
Query OK, 0 rows affected (0.11 sec)
mysql> create table t2(c1 int primary key auto_increment);
Query OK, 0 rows affected (0.06 sec)
mysql> insert into t1 values(null);
Query OK, 1 row affected (0.01 sec)
mysql> insert into t2 values(4);
Query OK, 1 row affected (0.00 sec)
mysql> insert into t2 values(null);
Query OK, 1 row affected (0.02 sec)
mysql> select last_insert_id() from t1;
+------------------+
| last_insert_id() |
+------------------+
| 5 |
+------------------+
1 row in set (0.00 sec)
I don't think this function does what you think it does. It returns the last id inserted on the current connection.
Compare that to SELECT MAX(ID), which selects the highest ID irrespective of connection. Be careful not to get them mixed up or you will get unexpected results.
As for why it is showing 5, it's probably because that is the last id to have been inserted. I believe this value will remain even if the record is removed; perhaps someone could confirm this.
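The behaviour described above can be checked with a small experiment. This is a sketch only: demo_ids is a hypothetical table, and it assumes a fresh connection with default auto-increment settings.

```sql
CREATE TABLE demo_ids (
  id   INT PRIMARY KEY AUTO_INCREMENT,
  note VARCHAR(20)
);

-- One statement inserting three rows.
INSERT INTO demo_ids (note) VALUES ('a'), ('b'), ('c');

-- For a multi-row INSERT, LAST_INSERT_ID() reports the id generated
-- for the first row of that statement, not the last.
SELECT LAST_INSERT_ID();

-- Deleting rows does not change the value remembered by the connection.
DELETE FROM demo_ids WHERE note <> 'a';
SELECT LAST_INSERT_ID();

-- MAX(id) reflects the current table contents, regardless of connection.
SELECT MAX(id) FROM demo_ids;
```

Running the last two SELECT statements from a second connection should show the difference: MAX(id) is the same everywhere, while LAST_INSERT_ID() is tracked per connection.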
Table-level triggers are what can come to the rescue here, e.g. a BEFORE INSERT trigger.
Maybe you should close the database connection and then reconnect to get fresh data.
When I try to create a database which already exists,
CREATE DATABASE IF NOT EXISTS test;
Query OK, 1 row affected (0.00 sec)
CREATE DATABASE IF NOT EXISTS test;
Query OK, 1 row affected, 1 warning (0.00 sec)
Why does it show a "1 row affected" message the second time, even though it is not creating a new database with the same name?
Although the CREATE DATABASE IF NOT EXISTS test; command won't directly modify rows in an existing instance of the test database, it may affect the details stored internally in the mysql database, or possibly in one of the derived meta views, like the information_schema or performance_schema etc.
The reported Query OK, 1 row affected (0.00 sec) is referring to a row in one of these internal data constructs. When you reissue the CREATE DATABASE command, and it fails gracefully thanks to the IF NOT EXISTS clause, it is still likely to store meta-data internally, maybe an accumulating field that counts warnings or similar, or even just a 'last acted on' timestamp against this database's row. In any case the stored data in this record is changed, and is reflected as an 'affected' row.
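The "1 warning" reported by the second statement can be inspected directly, which at least shows that the server recognised the database as already existing. A minimal sketch, to be run in the same session immediately after the statement:

```sql
CREATE DATABASE IF NOT EXISTS test;

-- Lists the warnings and notes produced by the previous statement.
-- When the database already exists this is typically a Note with code 1007
-- saying the database exists.
SHOW WARNINGS;
```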
I'm looking for a query in which I can update a field, with WHERE conditions that depend on different tables.
To make sure that I do not update every row that matches the id, I need another condition that also checks whether a service_name like '%DISK%' exists in table service_members:
However this results in (logically):
ERROR 1093 (HY000): You can't specify target table 'service_members' for update in FROM clause
My best try so far:
My initial query is not stringent enough, as it matches on an id/hostname and this id is present multiple times:
update service_members
set check_command_data = replace(check_command_data,'80%!90%','90%!95%')
where host_name in (select id from host where host_name like '%server-01%');
Output:
Query OK, 3 rows affected (0.00 sec)
Rows matched: 23 Changed: 3 Warnings: 0
It works of course, but it updates every row for that host whose value contains the 'old value', not just the DISK services.
What I need to achieve in one query:
update in table service_members, setting a new value for field check_command_data (works)
host_name needs to be a select id from table host where the host_name is like %something% (works but not stringent enough)
service_name in table service_members needs to match like %DISK%
I'm a bit puzzled about how to get this query working in the right way. Any advice/suggestions?
Thanks in advance!
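One possible approach, sketched under the assumption that service_name is a column of service_members itself (as the list above suggests): the extra condition can go straight into the WHERE clause, so no subquery on the target table is needed and error 1093 does not arise.

```sql
UPDATE service_members
SET check_command_data = REPLACE(check_command_data, '80%!90%', '90%!95%')
WHERE host_name IN (SELECT id FROM host WHERE host_name LIKE '%server-01%')
  AND service_name LIKE '%DISK%';   -- restrict to the DISK services only
```

Running the same WHERE clause in a SELECT first is a cheap way to confirm that only the intended rows match before changing anything.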
I am trying to do something in MySQL Server 5.1 on Windows.
I am positive this type of query worked in an older version of MySQL, as I supplied it to a client previously without a problem.
Basically, a field in one of my tables contains several ids; such as 1,2,3,4,5
The field is of type varchar
I am trying to see if a value exists in the field by using an IN statement, like below. But it returns nothing.
What am I doing wrong? Is there a better way? Thanks.
mysql> create database testing;
Query OK, 1 row affected (0.00 sec)
mysql> use testing;
Database changed
mysql> create table table1(field1 char(20));
Query OK, 0 rows affected (0.01 sec)
mysql> create table table2(field2 char(20));
Query OK, 0 rows affected (0.00 sec)
mysql> insert into table1 values('1');
Query OK, 1 row affected (0.00 sec)
mysql> insert into table2 values('1,2,3');
Query OK, 1 row affected (0.00 sec)
mysql> select * from table1 where field1 in (select field2 from table2);
Empty set (0.00 sec)
From your query select * from table1 where field1 in (select field2 from table2);, what I'm imagining is like this:
Dissecting the sub-query select field2 from table2, you will have:
field2
'1,2,3'
Then the main query will be (substitution):
select * from table1 where field1 in ('1,2,3');
Obviously it will return no rows since the only value that table1.field1 has is '1'. And '1' <> '1,2,3'.
Well, I bet you are looking for this: FIND_IN_SET
Sample query:
SELECT FIND_IN_SET('1', '1,2,3'); will return 1.
insert into table2 values('1,2,3');
most likely needs to be
insert into table2 values('1');
insert into table2 values('2');
insert into table2 values('3');
Then, your sub select select field2 from table2 returns ('1', '2', '3'), and the IN operator can be used to check if the result of the corresponding field from the main select is contained in this set.
According to the comments, the same schema seems to have worked before. I am not aware that the IN operator can be used like in the question, and splitting of column values into a row set seems to be non-trivial.
Using the FIND_IN_SET() function as proposed by @KaeL, the following query should work:
SELECT b.*
FROM table2 a
INNER JOIN table1 b ON FIND_IN_SET(b.field1, a.field2) > 0;
See also
Single MySQL field with comma separated values
Sample query on SQLFiddle
In any case, you should consider normalizing your schema - using string lists as values of single fields can usually be much better handled by a relational database when the separate values are stored in separate rows in a separate table.
http://sqlfiddle.com/#!2/797cc/3 shows a possible solution.
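To illustrate the normalization advice, here is a minimal sketch. The table table2_items and its columns list_id and item are hypothetical names, not part of the original schema.

```sql
-- One row per value instead of the single string '1,2,3'.
CREATE TABLE table2_items (
  list_id INT      NOT NULL,   -- which original list the value belonged to
  item    CHAR(20) NOT NULL,
  PRIMARY KEY (list_id, item)
);

INSERT INTO table2_items (list_id, item) VALUES (1, '1'), (1, '2'), (1, '3');

-- With separate rows, a plain IN subquery works as originally intended.
SELECT t1.*
FROM table1 AS t1
WHERE t1.field1 IN (SELECT item FROM table2_items);
```

Unlike FIND_IN_SET() on a string column, this form lets the server use the index on table2_items.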
I need to load a table with a large amount of test data. This is to be used for testing performance and scaling.
How can I easily create 100,000 rows of random/junk data for my database table?
You could also use a stored procedure. Consider the following table as an example:
CREATE TABLE your_table (id int NOT NULL PRIMARY KEY AUTO_INCREMENT, val int);
Then you could add a stored procedure like this:
DELIMITER $$
CREATE PROCEDURE prepare_data()
BEGIN
DECLARE i INT DEFAULT 0; -- start at 0 so the loop inserts exactly 100,000 rows
WHILE i < 100000 DO
INSERT INTO your_table (val) VALUES (i);
SET i = i + 1;
END WHILE;
END$$
DELIMITER ;
When you call it, you'll have 100k records:
CALL prepare_data();
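If the server is MySQL 8.0 or later (an assumption, since the question does not state a version), a recursive common table expression can generate the rows in a single statement without a stored procedure. This sketch reuses your_table from above and fills val with random integers; the upper bound of 1,000,000 is arbitrary.

```sql
-- The default recursion limit is 1000, so raise it for this session.
SET SESSION cte_max_recursion_depth = 100000;

INSERT INTO your_table (val)
WITH RECURSIVE seq (n) AS (
  SELECT 1
  UNION ALL
  SELECT n + 1 FROM seq WHERE n < 100000   -- produces 100,000 rows in total
)
SELECT FLOOR(RAND() * 1000000) FROM seq;
```

Because it is one INSERT, it avoids the per-statement overhead of 100,000 separate inserts.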
For multiple row cloning (data duplication) you could use
DELIMITER $$
CREATE PROCEDURE insert_test_data()
BEGIN
DECLARE i INT DEFAULT 1;
WHILE i < 100000 DO
INSERT INTO `table` (`user_id`, `page_id`, `name`, `description`, `created`)
SELECT `user_id`, `page_id`, `name`, `description`, `created`
FROM `table`
WHERE id = 1;
SET i = i + 1;
END WHILE;
END$$
DELIMITER ;
CALL insert_test_data();
DROP PROCEDURE insert_test_data;
Here is a solution with pure math and SQL:
create table t1(x int primary key auto_increment);
insert into t1 () values (),(),();
mysql> insert into t1 (x) select x + (select count(*) from t1) from t1;
Query OK, 1265 rows affected (0.01 sec)
Records: 1265 Duplicates: 0 Warnings: 0
mysql> insert into t1 (x) select x + (select count(*) from t1) from t1;
Query OK, 2530 rows affected (0.02 sec)
Records: 2530 Duplicates: 0 Warnings: 0
mysql> insert into t1 (x) select x + (select count(*) from t1) from t1;
Query OK, 5060 rows affected (0.03 sec)
Records: 5060 Duplicates: 0 Warnings: 0
mysql> insert into t1 (x) select x + (select count(*) from t1) from t1;
Query OK, 10120 rows affected (0.05 sec)
Records: 10120 Duplicates: 0 Warnings: 0
mysql> insert into t1 (x) select x + (select count(*) from t1) from t1;
Query OK, 20240 rows affected (0.12 sec)
Records: 20240 Duplicates: 0 Warnings: 0
mysql> insert into t1 (x) select x + (select count(*) from t1) from t1;
Query OK, 40480 rows affected (0.17 sec)
Records: 40480 Duplicates: 0 Warnings: 0
mysql> insert into t1 (x) select x + (select count(*) from t1) from t1;
Query OK, 80960 rows affected (0.31 sec)
Records: 80960 Duplicates: 0 Warnings: 0
mysql> insert into t1 (x) select x + (select count(*) from t1) from t1;
Query OK, 161920 rows affected (0.57 sec)
Records: 161920 Duplicates: 0 Warnings: 0
mysql> insert into t1 (x) select x + (select count(*) from t1) from t1;
Query OK, 323840 rows affected (1.13 sec)
Records: 323840 Duplicates: 0 Warnings: 0
mysql> insert into t1 (x) select x + (select count(*) from t1) from t1;
Query OK, 647680 rows affected (2.33 sec)
Records: 647680 Duplicates: 0 Warnings: 0
If you want more control over the data, try something like this (in PHP):
<?php
$conn = mysql_connect(...);
$num = 100000;
$sql = 'INSERT INTO `table` (`col1`, `col2`, ...) VALUES ';
for ($i = 0; $i < $num; $i++) {
mysql_query($sql . generate_test_values($i));
}
?>
where function generate_test_values would return a string formatted like "('val1', 'val2', ...)". If this takes a long time, you can batch them so you're not making so many db calls, e.g.:
for ($i = 0; $i < $num; $i += 10) {
$values = array();
for ($j = 0; $j < 10; $j++) {
$values[] = generate_test_values($i + $j);
}
mysql_query($sql . join(", ", $values));
}
would only run 10000 queries, each adding 10 rows.
Try filldb.
You can either post your own schema or use an existing one, generate dummy data, export it from the site, and import it into your database.
I really like the mysql_random_data_load utility from Percona, you can find more details about it here.
mysql_random_data_load is a utility that connects to the MySQL database and fills the specified table with random data. If foreign keys are present in the table, they will also be correctly filled.
This utility has a cool feature: the speed of data generation can be limited.
For example, to generate 30,000 records in the sakila.film_actor table at a rate of 500 records per second, you need the following command:
mysql_random_data_load sakila film_actor 30000 --host=127.0.0.1 --port=3306 --user=my_user --password=my_password --qps=500 --bulk-size=1
I have successfully used this tool to simulate a workload in a test environment by running this utility on multiple threads at different speeds for different tables.
create table mydata as select * from information_schema.columns;
insert into mydata select * from mydata;
-- repeating the insert 11 times will give you at least 6 mln rows in the table.
I am terribly sorry if this is out of place, but I wanted to offer some explanation on this code as I know just enough to explain it and how the answer above is rather useful if you only understand what it does.
The first line creates a table called mydata from a SELECT on information_schema.columns, the view in which the MySQL server describes every column of every table. CREATE TABLE ... AS SELECT copies both the column layout and the rows of that result, so you get a ready-made table that is already filled with data, which is very handy.
The second line is an INSERT statement that targets the new mydata table and inserts a copy of its own current rows back into it, so each execution doubles the number of rows. The last line is just a comment suggesting you repeat the insert a few times if you want to generate more data.
Lastly in conclusion, in my testing, one execution of this script generated 6,956 rows of data. If you are needing a quick way to generate some records, this isn't a bad method. However, for more advanced testing, you might want to ALTER the table to include a primary key that auto increments so that you have a unique index as a database without a primary key is a sad database. It also tends to have unpredictable results since there can be duplicate entries. All that being said, I wanted to offer some insight into this code because I found it useful, I think others might as well, if only they had spent the time to explain what it is doing. Most people aren't a fan of executing code that they have no idea what it is going to do, even from a trusted source, so hopefully someone else found this useful as I did. I'm not offering this as "the answer" but rather as another source of information to help provide some logistical support to the above answer.
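The primary key suggested above could be added along these lines. This is a sketch only: row_id is a hypothetical column name, chosen on the assumption that it does not clash with any column copied from information_schema.columns.

```sql
-- Adds an auto-incrementing surrogate key as the first column;
-- existing rows are numbered automatically.
ALTER TABLE mydata
  ADD COLUMN row_id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY FIRST;
```

Once the key exists, the doubling statement has to name the copied columns explicitly, because a plain insert into mydata select * from mydata would try to reuse existing row_id values and fail with a duplicate key error.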
This is a more performant modification of @michalzuber's answer. The only difference is removing the WHERE id = 1, so that the inserts accumulate on each run.
Each iteration doubles the table, so starting from a single row the number of records after n iterations is 2^n:
So for 10 iterations, 2^10 = 1024 records
For 20 iterations, 2^20 = 1048576 records, and so on.
DELIMITER $$
CREATE PROCEDURE insert_test_data()
BEGIN
DECLARE i INT DEFAULT 1;
WHILE i <= 10 DO
INSERT INTO `table` (`user_id`, `page_id`, `name`, `description`, `created`)
SELECT `user_id`, `page_id`, `name`, `description`, `created`
FROM `table`;
SET i = i + 1;
END WHILE;
END$$
DELIMITER ;
CALL insert_test_data();
DROP PROCEDURE insert_test_data;