I am using MySQL Workbench and MySQL Server to query a database. I have two tables, t1 and t2, each with one column: t1_name and t2_name respectively. t2 has 3 million records and t1 has 1 million.
I need to select every t2_name that is neither equal to any t1_name nor a substring of one. When I try the query below:
SELECT DISTINCT `t2_name`
FROM `t2`, `t1`
`t2`.`t2_name` NOT LIKE CONCAT('%',`t1`.`t1_name`,'%'));
I get this error:
mysql Error Code: 1066. Not unique table/alias: 't2'
Can you explain and correct my query please? I previously asked about this in another post and tried this query:
SELECT DISTINCT `t2_name`
FROM `t2`
WHERE NOT EXISTS (SELECT * FROM `t1`
WHERE `t2_name` LIKE CONCAT('%',`t2_name`,'%'));
but it takes forever and never ends.
Start by qualifying all column names. Does this still cause an error?
SELECT DISTINCT t2.t2_name
FROM t2 JOIN
t1
ON t2.t2_name NOT LIKE CONCAT('%', t1.t1_name, '%');
If your issue is performance, NOT EXISTS is going to be better, and it does not need the DISTINCT:
SELECT t2_name
FROM t2
WHERE NOT EXISTS (SELECT 1
FROM t1
WHERE t2.t2_name LIKE CONCAT('%', t1.t1_name, '%')
);
However, this is not going to be much of an improvement. Unfortunately, LIKE patterns that begin with a wildcard cannot use an ordinary index, so queries like this are highly inefficient. Often, you can restructure the data model so that you can write a more efficient query.
You're missing the WHERE keyword. Without it, the parser reads the t2 that follows t1 as an alias for t1, but the name t2 is already taken by the first table in the FROM clause.
Insert WHERE (and remove the extra closing parenthesis at the end):
SELECT DISTINCT `t2_name`
FROM `t2`, `t1`
WHERE `t2`.`t2_name` NOT LIKE CONCAT('%',`t1`.`t1_name`,'%');
Side note: I'm afraid your attempt at building the Cartesian product won't perform any better than the NOT EXISTS. More likely it will perform much, much worse...
I think you have mistyped the WHERE clause of the second query; it should say:
SELECT DISTINCT `t2_name`
FROM `t2`
WHERE NOT EXISTS (SELECT * FROM `t1`
WHERE `t1_name` LIKE CONCAT('%',`t2_name`,'%'));
At the moment you are effectively comparing t2_name with itself.
It's going to be jolly slow anyway because MySQL is going to do a table scan for that. Have a look at your data structure and content and see whether you might be better off doing some data cleansing/restructuring before you start trying to use it for analysis.
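The difference between the corrected NOT EXISTS query and the Cartesian-product query is not only speed. A minimal sketch on toy data follows, using Python's built-in sqlite3 module as a stand-in for MySQL (an assumption: SQLite has no guarantee of matching MySQL's optimizer, and it builds the pattern with || where MySQL uses CONCAT). The sample rows are made up. It follows the direction used in the last answer, where a t2_name is excluded if it is contained in some t1_name.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (t1_name TEXT);
CREATE TABLE t2 (t2_name TEXT);
INSERT INTO t1 VALUES ('alphabet'), ('gamma');
INSERT INTO t2 VALUES ('alpha'), ('gamma'), ('delta');
""")

# Anti-join: keep a t2_name only if NO t1_name contains it.
anti_join = conn.execute("""
SELECT t2.t2_name
FROM t2
WHERE NOT EXISTS (SELECT 1
                  FROM t1
                  WHERE t1.t1_name LIKE '%' || t2.t2_name || '%')
ORDER BY t2.t2_name
""").fetchall()

# Cartesian product with NOT LIKE: keeps a t2_name as soon as ANY
# t1_name fails to contain it, which is a different condition.
cross_join = conn.execute("""
SELECT DISTINCT t2.t2_name
FROM t2, t1
WHERE t1.t1_name NOT LIKE '%' || t2.t2_name || '%'
ORDER BY t2.t2_name
""").fetchall()

print(anti_join)   # only 'delta' survives
print(cross_join)  # all three names survive
```

On this data the anti-join returns only 'delta', while the cross join returns every name, because each t2_name has at least one t1 row it does not match. So even after the syntax error is fixed, the FROM t2, t1 form answers a different question.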
Related
I'm trying to understand this: my query became more than 10x faster when I stopped using a subquery. table2 has about 5000 rows, while table1 is pretty huge, a few hundred thousand rows.
Original Statement
SELECT *
FROM table1
WHERE indexedCol IN (
SELECT indexedCol
FROM table2
WHERE iCol2 = "somevalue"
)
So somehow this is way more efficient.
SELECT *
FROM table1
WHERE indexedCol IN
(*comma separated result of SELECT FROM table2*)
Is there something I am missing here? Or is a subquery never a good idea?
The real question is whether the sub-query is correlated, that is, whether it references table1. If it doesn't, then the answer is simple: if you have two queries
SELECT *
FROM table1
and
SELECT indexedCol
FROM table2
WHERE iCol2 = "somevalue"
The time it takes to run one of them is less than the time it takes to run both of them. This could be even worse (as suggested in the comments) if one of them is run for every row.
This query could be rewritten to use a join like this:
SELECT *
FROM table1
JOIN table2 ON table1.indexedCol = table2.indexedCol AND table2.iCol2 = 'somevalue'
Which will probably solve your problem.
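A minimal sketch of the two forms side by side, again using Python's sqlite3 as a stand-in for MySQL (an assumption, so timings and query plans will not carry over). The id column, the index name, and the sample rows are made up for illustration; table2.indexedCol is declared unique here, which is what makes the JOIN return the same rows as the IN subquery.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY, indexedCol INTEGER);
CREATE TABLE table2 (indexedCol INTEGER PRIMARY KEY, iCol2 TEXT);
CREATE INDEX idx_table1_indexedCol ON table1 (indexedCol);
INSERT INTO table1 (id, indexedCol) VALUES (1, 10), (2, 20), (3, 30), (4, 10);
INSERT INTO table2 (indexedCol, iCol2)
VALUES (10, 'somevalue'), (20, 'other'), (30, 'somevalue');
""")

# Original form: uncorrelated IN subquery.
in_rows = conn.execute("""
SELECT table1.id
FROM table1
WHERE table1.indexedCol IN (SELECT indexedCol
                            FROM table2
                            WHERE iCol2 = 'somevalue')
ORDER BY table1.id
""").fetchall()

# Rewritten form: inner join with the filter in the ON clause.
join_rows = conn.execute("""
SELECT table1.id
FROM table1
JOIN table2 ON table1.indexedCol = table2.indexedCol
           AND table2.iCol2 = 'somevalue'
ORDER BY table1.id
""").fetchall()

print(in_rows)
print(join_rows)
```

Both queries return ids 1, 3 and 4 here. On the real server, running EXPLAIN on each form is the way to see which plan MySQL actually picks.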
Some SELECT statements are stored in a table as a field. I need to write a SELECT statement that joins with the result of a SELECT that is itself returned by another SELECT.
For example:
SELECT *
FROM table1
JOIN (SELECT t_select FROM table2 WHERE = 'some_condition')
The last SELECT (SELECT t_select FROM table2) returns a SELECT statement as text.
I need to join table1 with the result of the query that is stored in t_select.
Do I understand correctly? Basically, you want to "evaluate" the SELECT that is stored in the table? That seems like a really poor design to me.
If you really need to do this, you'll need to pull the SELECT statement out yourself, and send it as a second query. You can't do this in pure MySQL.
Do you just want a subquery?
SELECT *
FROM table1 t1 JOIN
(SELECT t2.* FROM table2 t2 WHERE = 'some_condition') t2
on t1.<somecol> = t2.<someothercol>;
All in all, you can't execute a query that is stored in a table from within another query. You will have to retrieve the query first, prepare it, and then execute it. Have a look at EXECUTE IMMEDIATE:
http://dev.mysql.com/worklog/task/?id=2793
http://www.postgresql.org/docs/9.1/static/ecpg-sql-execute-immediate.html
Storing SQL statements in a table is not very common, and there are usually better ways to do it.
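The "retrieve it first, then send it as a second query" approach could look roughly like the sketch below, written client-side in Python with sqlite3 standing in for MySQL (an assumption). The lookup table, the name column used to pick the stored statement, and all sample rows are hypothetical, since the question does not show them. Note that this executes whatever text is in t_select, so it is only reasonable when that table's contents are fully trusted.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE lookup (id INTEGER PRIMARY KEY, extra TEXT);  -- hypothetical
CREATE TABLE table2 (name TEXT PRIMARY KEY, t_select TEXT);
INSERT INTO table1 VALUES (1, 'a'), (2, 'b'), (3, 'c');
INSERT INTO lookup VALUES (1, 'x'), (3, 'z');
INSERT INTO table2 VALUES ('some_condition', 'SELECT id, extra FROM lookup');
""")

# Query 1: pull the stored SELECT text out of table2.
# The `name` column is a made-up key for this sketch.
(stored_sql,) = conn.execute(
    "SELECT t_select FROM table2 WHERE name = ?", ("some_condition",)
).fetchone()

# Query 2: embed the retrieved text as a derived table and join against it.
# SQL text cannot be passed as a bound parameter, hence the string
# formatting; this is why the stored text must be trusted.
outer_sql = (
    "SELECT t1.id, t1.label, s.extra "
    "FROM table1 t1 "
    f"JOIN ({stored_sql}) s ON s.id = t1.id "
    "ORDER BY t1.id"
)
rows = conn.execute(outer_sql).fetchall()
print(rows)
```

With these sample rows the join yields the two table1 rows that have a match in the stored query's result.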
When I execute a MySQL query like
select * from t1 where colomn1 in (select colomn1 from t2)
what really happens?
I want to know whether it executes the inner statement for every row.
PS: I have 300,000 rows in t1 and 50,000 rows in t2, and it is taking a hell of a time.
I'm flabbergasted to see that everyone suggests using JOIN as if it were the same thing. IT IS NOT, at least not with the information given here. E.g., what if t2.column1 has duplicates?
=> Assuming there are no duplicates in t2.column1, then yes, put a UNIQUE INDEX on said column and use a JOIN construction, as it is more readable and easier to maintain. Whether it is going to be faster depends on what the query engine makes of it. In MSSQL the query optimizer would (probably) consider them the same thing; maybe MySQL is 'not so eager' to recognize this... don't know.
=> Assuming there can be duplicates in t2.column1, put a (non-unique) INDEX on said column and rewrite the WHERE IN (SELECT ..) into a WHERE EXISTS ( SELECT * FROM t2 WHERE t2.column1 = t1.column1). Again, this is mostly for readability and ease of maintenance; most likely the query engine will treat them the same...
The things to remember are:
Always make sure you have proper indexing (but don't go overboard).
Always realize that what really happens will be an interpretation of your SQL code, not a 'direct translation'. You can write the same functionality in different ways to achieve the same goal, and some of these are indeed more resilient to different scenarios.
If you only have 10 rows, pretty much everything works. If you have 10M rows, it could be worth examining the query plan... which most likely will be different from the one with 10 rows.
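The duplicates point can be checked on a toy dataset. This is a minimal sketch using Python's sqlite3 as a stand-in for MySQL (an assumption); the id column, the index name, and the sample values are made up, and t2.column1 deliberately contains the value 100 twice.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (id INTEGER PRIMARY KEY, column1 INTEGER);
CREATE TABLE t2 (column1 INTEGER);
CREATE INDEX idx_t2_column1 ON t2 (column1);  -- non-unique index
INSERT INTO t1 VALUES (1, 100), (2, 200), (3, 300);
INSERT INTO t2 VALUES (100), (100), (300);    -- 100 appears twice
""")

def ids(sql):
    """Run a query and return the first column of every row as a list."""
    return [row[0] for row in conn.execute(sql)]

in_ids = ids("""
SELECT t1.id FROM t1
WHERE t1.column1 IN (SELECT t2.column1 FROM t2)
ORDER BY t1.id
""")

exists_ids = ids("""
SELECT t1.id FROM t1
WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.column1 = t1.column1)
ORDER BY t1.id
""")

join_ids = ids("""
SELECT t1.id FROM t1
INNER JOIN t2 ON t1.column1 = t2.column1
ORDER BY t1.id
""")

print(in_ids, exists_ids, join_ids)
```

IN and EXISTS each return row 1 once, while the INNER JOIN returns it twice, once per matching t2 row. Adding DISTINCT to the join would hide the difference, but at the cost of an extra de-duplication step.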
A join would be quicker, viz:
select t1.* from t1 INNER JOIN t2 on t1.colomn1=t2.colomn1
Try with INNER JOIN
SELECT t1.*
FROM t1
INNER JOIN t2 ON t1.column1=t2.column1
You should index the join column in both tables, and then you can use an inner join.
To create the indexes:
CREATE INDEX index1 ON t1 (colomn1);
CREATE INDEX index2 ON t2 (colomn1);
select t1.* from t1 INNER JOIN t2 on t1.colomn1=t2.colomn1
Hi, I have this query, but it's giving me the error "Operand should contain 1 column(s)" and I'm not sure why.
Select *,
(Select *
FROM InstrumentModel
WHERE InstrumentModel.InstrumentModelID=Instrument.InstrumentModelID)
FROM Instrument
According to your query, you want to get data from both the Instrument and InstrumentModel tables. In your case the parser expects "FROM table_name" right after your SELECT *. When the subselect runs to get its result, it does not find the table for Instrument.InstrumentModelID. To fetch matching rows from both tables you can use a join, or you can select particular fields as tableName.fieldName and put your condition in the WHERE clause.
Like this:
select Instrument.x,InstrumentModel.y
from instrument,instrumentModel
where instrument.x=instrumentModel.y
You can use a join to select from 2 connected tables
select *
from Instrument i
join InstrumentModel m on m.InstrumentModelID = i.InstrumentModelID
When you use subqueries in the column list, they need to return exactly one value. You can read more in the documentation.
As a user commented in the documentation, using subqueries like this can ruin your performance:
when the same subquery is used several times, mysql does not use this fact to optimize the query, so be careful not to run into performance problems.
example:
SELECT
col0,
(SELECT col1 FROM table1 WHERE table1.id = table0.id),
(SELECT col2 FROM table1 WHERE table1.id = table0.id)
FROM
table0
WHERE ...
the join of table0 with table1 is executed once for EACH subquery, leading to very bad performance for this kind of query.
Therefore you should rather join the tables, as described by the other answer.
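A minimal sketch of the "one value per subquery" rule next to the join alternative, using Python's sqlite3 as a stand-in for MySQL (an assumption; SQLite reports the multi-column case with its own error message, not MySQL's). The InstrumentID and ModelName columns and the sample rows are hypothetical, since the question does not list the table columns.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE InstrumentModel (InstrumentModelID INTEGER PRIMARY KEY,
                              ModelName TEXT);
CREATE TABLE Instrument (InstrumentID INTEGER PRIMARY KEY,
                         InstrumentModelID INTEGER);
INSERT INTO InstrumentModel VALUES (1, 'Model A'), (2, 'Model B');
INSERT INTO Instrument VALUES (10, 1), (11, 2), (12, 1);
""")

# A subquery in the column list is fine when it selects exactly ONE
# column and matches at most one row per outer row.
scalar_rows = conn.execute("""
SELECT i.InstrumentID,
       (SELECT m.ModelName
        FROM InstrumentModel m
        WHERE m.InstrumentModelID = i.InstrumentModelID)
FROM Instrument i
ORDER BY i.InstrumentID
""").fetchall()

# The join returns the same data and can expose as many columns from
# InstrumentModel as needed without repeating the lookup.
join_rows = conn.execute("""
SELECT i.InstrumentID, m.ModelName
FROM Instrument i
JOIN InstrumentModel m ON m.InstrumentModelID = i.InstrumentModelID
ORDER BY i.InstrumentID
""").fetchall()

print(scalar_rows)
print(join_rows)
```

Both queries return the same three rows on this data. The original query fails because its subquery selects *, which is more than one column.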
I have MySQL queries with a WHERE IN clause.
SELECT * FROM table1 WHERE id IN (1, 2, 15, 17, 150 ....)
How will it perform with hundreds of ids in the IN clause? Is it designed to work with many arguments? (My table will have hundreds of thousands of rows, and id is the primary key.)
Is there a better way to do it?
EDIT: I am getting the Ids from the result set of a search server query. So not from the database. I guess a join statement wouldn't work.
I am not sure how WHERE ... IN performs, but to me it sounds like a JOIN or maybe a subselect would be the better choice here.
See also: MYSQL OR vs IN performance and http://www.slideshare.net/techdude/how-to-kill-mysql-performance
You should put the IN clause "arguments" into a table, for instance table2.
Afterwards you can run this:
SELECT t1.* FROM table1 t1
INNER JOIN table2 t2 ON t1.Id = t2.Id
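Since the ids come from a search server rather than from the database, both options can be driven from client code. A minimal sketch follows, using Python's sqlite3 as a stand-in for MySQL (an assumption, so it says nothing about MySQL's actual timings). The payload column and the generated ids are made up; table2 is created as a temporary table here, which is one possible way to follow the suggestion above.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, payload TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(i, f"row-{i}") for i in range(1, 1001)],
)

# Pretend these ids came back from the search server (made-up data).
search_ids = list(range(5, 1000, 5))

# Option A: a parameterized IN list, one placeholder per id.
placeholders = ", ".join("?" for _ in search_ids)
in_rows = conn.execute(
    f"SELECT id FROM table1 WHERE id IN ({placeholders}) ORDER BY id",
    search_ids,
).fetchall()

# Option B: load the ids into a temporary table, then join on it.
conn.execute("CREATE TEMP TABLE table2 (Id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO table2 (Id) VALUES (?)", [(i,) for i in search_ids]
)
join_rows = conn.execute("""
SELECT t1.id
FROM table1 t1
INNER JOIN table2 t2 ON t1.id = t2.Id
ORDER BY t1.id
""").fetchall()

print(len(in_rows), len(join_rows))
```

Both options return the same rows. With a few hundred ids against a primary key, the IN list is usually the simpler choice; the temporary table starts to pay off when the list gets very long or is reused across several queries.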