I have two large tables in a database. They both contain a column called "name". My goal is to locate rows that contain names that are in one database but not the other.
I'm guessing there will be a join statement and a where, but I cannot figure out how to use the two in tandem in order to create a successful query.
Suggestions?
SELECT * FROM TABLE_A WHERE NAME NOT IN
( SELECT NAME FROM TABLE_B )
EXISTS might be faster than IN, see Difference between EXISTS and IN in SQL?.
You can use EXISTS like this. It's useful to know both approaches since they are not exactly equal. You can swap the EXISTS quantifier for SOME, ALL or ANY. I think you can figure out what would happen :)
select * from a1 where not exists(select 1 from a2 where name=a1.name);
Note that they are not 100% equal! SQL has three-valued logic!
Related
I am completely new to database coding, and I've tried Googling but cannot seem to figure this out. I imagine there's a simple solution. I have a very large table with MemberIDs and a few other relevant variables that I want to pull records from (table1). I also have a second large table of distinct MemberIDs (table2). I want to pull rows from table 1 where the MemberID exists in table2.
Here’s how I tried to do it, and for some reason I suspect this isn’t working correctly, or there may be a much better way to do this.
proc sql;
create table tablewant as select
MemberID, var1, var2, var3
from table1
where exists (select MemberID from table2)
;
quit;
Is there anything wrong with the way I’m doing this? What's the best way to solve this when working with extremely large tables (over 100 million records)? Would doing some sort of join be better? Also, do I need to change
where exists (select MemberID from table2)
to
where exists (select MemberID from table2 where table1.MemberID = table2.MemberID)
?
You want to implement a "semi-join". You second solution is correct:
select MemberID, var1, var2, var3
from table1
where exists (
select 1 from table2 where table1.MemberID = table2.MemberID
)
Notes:
There's no need to select anything special in the subquery since it's not checking for values, but for row existence instead. For example, 1 will do, as well as *, or even null. I tend to use 1 for clarity.
The query needs to access table2 and this should be optimized specially for such large tables. You should consider adding the index below, if you haven't created it already:
create index ix1 on table2 (MemberID);
The query does not have a filtering criteria. That means that the engine will read 100 million rows and will check each one of them for the matching rows in the secondary table. This will unavoidably take a long time. Are you sure you want to read them all? Maybe you need to add a filtering condition, but I don't know your requirements in this respect.
I am new to mySQL.
I am following Mosh's tutorial to familiarize myself to SQL.
Here's my question for the following code.
SELECT *
FROM order_items
WHERE order_id = 6 AND unit_price*quantity > 30
When I looked up about SELECT *, it says: * means to return all all columns of the queried tables. Then I think SELECT * means that it grabs all tables from all schema.
My question is: Isn't it a bit inefficient and confusing to return all column provided my understanding is right? If the database become bigger and bigger, it will consume unnecessary effort to look up the keyword, so I think SELECT should specify what table it is referring to. Thanks for reading! 🥰
SELECT * does not fetch all tables from all schema. It only fetches the columns from the table you reference in your FROM clause. It only fetches the rows that match your WHERE clause.
The mistake is understandable given this statement in the MySQL documentation:
A select list consisting only of a single unqualified * can be used as shorthand to select all columns from all tables:
SELECT * FROM t1 INNER JOIN t2 ...
What they mean by "all tables" is only all tables referenced in this query. And only those in FROM or JOIN clauses. Not all tables everywhere.
I have a mysql table that has a column "my_set", a non-unique key, and another column "my_element", a unique key. Each my_sel value can correspond to multiple my_element values, whereas each my_element corresponds to only one my_set.
All values of theese two columns are integers unsigned (11).
Starting from a my_element value, in a single query and without nested selects, I need to find all other my_elements that has same my_set.
The solution I would think was nested select
select my_element
from table
where my_set = (
select my_set
from table
where my_element = <elementValue>
)
But, as I explained, I'd like to find a better, maybe faster way to do it without subselect, as performance is being an issue due to the huge number of similar queries in the db scheduled maintenance phase.
Also, better db structure advice could be appreciated, but currently db refactoring is not allowed.
I am not 100% sure what you are asking, but I will try to answer from what I understood.
I think you need to use self join to get all the elements related to given element( which are related by same my_set). Try the below query.
select t2.my_element
from table t1
join table t2 on t1.my_set = m2.my_set and t2.my_element <> t1.my_element
where t1.my_element = "element";
If it does not work. Create a sql fiddle with sample data, that would make it easy for us.
You can try the below query. If i understand it correctly, use the left outer join between two tables to get the ordered set of set and element.
select
mysetid,
myelementid
from
tableset a
left outer join
tableset b
on
a.setid = b.setid
order by
a.setid,b.elementid
Thanks.
I have a query like this :
SELECT * FROM (SELECT linktable FROM adm_linkedfields WHERE name = 'company') as cbo WHERE group='BEST'
Basically, the table name for the main query is fetched through the subquery.
I get an error that #1054 - Unknown column 'group' in 'where clause'
When I investigate (removing the where clause), I find that the query only returns the subquery result at all times.
Subquery table adm_linkedfields has structure id | name | linktable
Currently am using MySQL with PDO but the query should be compatible with major DBs (viz. Oracle, MSSQL, PgSQL and MySQL)
Update:
The subquery should return the name of the table for the main query. In this case it will return tbl_company
The table tbl_company for the main query has this structure :
id | name | group
Thanks in advance.
Dynamic SQL doesn't work like that, what you created is an inline-view, read up on that. What's more, you can't create a dynamic sql query that will work on every db. If you have a limited number of linktables you could try using left-joins or unions to select from all tables but if you don't have a good reason you don't want that.
Just select the tablename in one query and then make another one to access the right table (by creating the query string in php).
Here is an issue:
SELECT * FROM (SELECT linktable FROM adm_linkedfields WHERE name = 'company') as cbo
WHERE group='BEST';
You are selecting from DT which contains only one column "linktable", then you cant put any other column in where clause of outer block. Think in terms of blocks the outer select is refering a DT which contains only one column.
Your problem is similar when you try to do:
create table t1(x1 int);
select * from t1 where z1 = 7; //error
Your query is:
SELECT *
FROM (SELECT linktable
FROM adm_linkedfields
WHERE name = 'company'
) cbo
WHERE group='BEST'
First, if you are interested in cross-database compatibility, do not name columns or tables after SQL reserved words. group is a really, really bad name for a column.
Second, the from clause is returning a table containing a list of names (of tables, but that is irrelevant). There is no column called group, so that is the problem you are having.
What can you do to fix this? A naive solution would be to run the subquery, run it, and use the resulting table name in a dynamic statement to execute the query you want.
The fundamental problem is your data structure. Having multiple tables with the same structure is generally a sign of a bad design. You basically have two choices.
One. If you have control over the database structure, put all the data in a single table, linktable for instance. This would have the information for all companies, and a column for group (or whatever you rename it). This solution is compatible across all databases. If you have lots and lots of data in the tables (think tens of millions of rows), then you might think about partitioning the data for performance reasons.
Two. If you don't have control over the data, create a view that concatenates all the tables together. Something like:
create view vw_linktable as
select 'table1' as which, t.* from table1 t union all
select 'table2', t.* from table2 t
This is also compatible across all databases.
I was wondering if there is a way to do something like selecting all without ... some columns here
something like SELECT */column1,column2 , is there a way to do this ?
I just need to output something like
column1 , column2 ( from another table ) , here all other columns without column1 ( or something to make the select skip the first few columns)
EDIT:
The thing is that i need this to be dynamic , so i cant just select what i don't know. I never know how many columns there will be , i just know the 1st and the 2nd column
EDIT: here is a picture http://oi44.tinypic.com/xgdyiq.jpg
I don't need the second id column , just the last column like i have pointed.
Start building custom views, which are geared aorund saving developers time and encapsulating them from the database schema.
Oh, so select all but certain fields. You have two options.
One is a little slow.. Copy the table, drop the fields you don't want, then SELECT *
The other is to build the field list from a subquery to information_schema or something, then remove occurrences of 'field_i_dont_want' in that list.
SELECT ( SELECT THE TABLES YOU WANT AND CONCAT INTO ONE STRING ) FROM TABLE
If you need to combine records from multiple tables, you need to find a way to relate them together. Primary Keys, Foreign Keys, or anything common among this.
I will try to explain this with a sql similar to your problem.
SELECT table1.id, table2.name, table1.column3, table1.column4
FROM table1
INNER JOIN table2 On table2.commmonfield = table1.commonfield
If you have 'n' columns in your table as in Col1,Col2,Col3....Coln you can select whatever columns you want to select from the table.
SELECT Col1,Col2 FROM YOURTABLE;
You either select all columns (*) or especify the columns you want one by one. There is no way to select 'all but some'.
The SQL language lets you either select a wildcard set of columns or enumerated single columns from a singular table. However you can join a secondary table and get a wildcard there.
SELECT
a.col1,
b.*
FROM
table_a as a
JOIN table_b as b ON (a.col5 = b.col_1)