Hive SQL: How to create flag occurrence while join with other table

I want to check whether each member from table A is present in table B. The problem is that both Table A and Table B have millions of records, and table B has duplicate records, so I can't use a left join: it takes hours to run.
Table A
Table B
Output

Use this:
select member,
case when EXISTS (select 1 from TableB where TableB.member = tableA.member) then 1 else 0 end as Flag
from tableA
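The query above can be sanity-checked on toy data. The sketch below uses Python's built-in sqlite3 module purely as a stand-in for Hive (an assumption made so the example is runnable; Hive may treat subqueries in the SELECT list differently), and the member values are made up. It illustrates that duplicates in TableB do not multiply the rows coming from tableA.

```python
import sqlite3

def flag_members(a_members, b_members):
    """Return (member, Flag) rows; Flag is 1 when the member appears in TableB."""
    conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Hive
    conn.execute("CREATE TABLE tableA (member TEXT)")
    conn.execute("CREATE TABLE TableB (member TEXT)")
    conn.executemany("INSERT INTO tableA VALUES (?)", [(m,) for m in a_members])
    conn.executemany("INSERT INTO TableB VALUES (?)", [(m,) for m in b_members])
    rows = conn.execute(
        """
        SELECT member,
               CASE WHEN EXISTS (SELECT 1 FROM TableB
                                 WHERE TableB.member = tableA.member)
                    THEN 1 ELSE 0 END AS Flag
        FROM tableA
        ORDER BY member
        """
    ).fetchall()
    conn.close()
    return rows

# TableB contains duplicates of m1 and m3; tableA still yields one row per member.
flags = flag_members(["m1", "m2", "m3"], ["m1", "m1", "m3", "m3", "m3"])
print(flags)
```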

Not a very good solution, but you can try this.
Use NOT IN (or NOT EXISTS) to get the members without a match, use IN (or EXISTS) to get the members with a match, and then UNION ALL the two sets to get the complete set.
select
a.* , 0 flag
from tableA a where member not in ( select member from tableB)
union all
select
a.* , 1 flag
from tableA a where member in ( select member from tableB)
One trick: you can run these as two separate queries instead of combining them with UNION ALL, which may perform better.
NOT EXISTS works the same way but may give you better performance.
SELECT a.*, 0 flag
FROM tableA a
WHERE NOT EXISTS(
SELECT 1 FROM tableB b WHERE (a.member=b.member))
union all
SELECT a.*, 1 flag
FROM tableA a
WHERE EXISTS(
SELECT 1 FROM tableB b WHERE (a.member=b.member))
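Apart from performance, the NOT IN and NOT EXISTS forms can differ in results when tableB.member contains NULLs: under standard SQL three-valued logic, NOT IN against a set containing NULL never evaluates to true. A minimal sketch, assuming SQLite (via Python's sqlite3) as a stand-in engine and made-up member values:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in, not Hive
conn.executescript("""
    CREATE TABLE tableA (member TEXT);
    CREATE TABLE tableB (member TEXT);
    INSERT INTO tableA VALUES ('m1'), ('m2');
    INSERT INTO tableB VALUES ('m1'), (NULL);   -- note the NULL member
""")

# 'm2' NOT IN ('m1', NULL) evaluates to NULL, so the row is filtered out.
not_in_rows = conn.execute(
    "SELECT member FROM tableA "
    "WHERE member NOT IN (SELECT member FROM tableB)"
).fetchall()

# NOT EXISTS only asks whether a matching row exists, so NULLs do not interfere.
not_exists_rows = conn.execute(
    "SELECT a.member FROM tableA a "
    "WHERE NOT EXISTS (SELECT 1 FROM tableB b WHERE a.member = b.member)"
).fetchall()

print(not_in_rows, not_exists_rows)
```

If the column can hold NULLs, the NOT EXISTS variant is the safer of the two.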

Related

Check if table a primary key is exist in table b

Table A:
ID, Name, etc.
Table B:
ID, TableA-ID.
SELECT * FROM A;
and I want to return a boolean value in the same result for this condition ( if A.ID Exists in Table B).

There are several ways of achieving what you need. Below are three possibilities. They differ in their execution plans and in how the database actually executes them, so depending on your record count one may be more efficient than another. It's best to test them yourself.
1) Use LEFT JOIN and check that a non-nullable field from B is not null to determine whether the record exists. Then apply DISTINCT if the relationship is 1:N, so that rows from A are shown without duplicates.
select distinct a.*, b.id is not null as exists_b
from a
left join b on
a.id = b.tablea-id
2) Use the EXISTS() subquery expression, which is evaluated for each row returned from table A.
select a.*, exists(select 1 from b where a.id = b.tablea-id) as exists_b
from a
3) Use the subquery expression EXISTS and its negation, NOT EXISTS, in two queries to find the records that do and do not have a match in table B. Then use UNION ALL to combine both results into one.
select *, true as exists_b
from a
where exists (
select 1
from b
where a.id = b.tablea-id
)
union all
select *, false as exists_b
from a
where not exists (
select 1
from b
where a.id = b.tablea-id
)
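To compare the three variants on the same data, here is a small sketch using Python's sqlite3 module. SQLite is an assumption chosen for runnability, the rows are invented, and the column a_id is a hypothetical stand-in for the TableA-ID column (an unquoted hyphenated name would otherwise be parsed as a subtraction).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE b (id INTEGER PRIMARY KEY, a_id INTEGER);  -- a_id stands in for TableA-ID
    INSERT INTO a VALUES (1, 'x'), (2, 'y'), (3, 'z');
    INSERT INTO b VALUES (10, 1), (11, 1), (12, 3);         -- a.id = 1 matches twice
""")

# 1) LEFT JOIN + DISTINCT
left_join = sorted(conn.execute("""
    SELECT DISTINCT a.id, a.name, b.id IS NOT NULL AS exists_b
    FROM a LEFT JOIN b ON a.id = b.a_id
""").fetchall())

# 2) EXISTS evaluated per row of a
exists_col = sorted(conn.execute("""
    SELECT a.id, a.name,
           EXISTS (SELECT 1 FROM b WHERE a.id = b.a_id) AS exists_b
    FROM a
""").fetchall())

# 3) EXISTS / NOT EXISTS combined with UNION ALL
union_all = sorted(conn.execute("""
    SELECT a.id, a.name, 1 AS exists_b FROM a
    WHERE EXISTS (SELECT 1 FROM b WHERE a.id = b.a_id)
    UNION ALL
    SELECT a.id, a.name, 0 AS exists_b FROM a
    WHERE NOT EXISTS (SELECT 1 FROM b WHERE a.id = b.a_id)
""").fetchall())

print(left_join)
```

All three return one row per record of a with the same flag; which one is fastest depends on the database and the data, so compare execution plans on the real tables.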

select A.*, IFNULL((select 1 from B where B.`TableA-ID` = A.ID limit 1),0) as `exists` from A;
The above statement returns 1 if the key exists and 0 if it does not. LIMIT 1 is important if there are multiple matching records in B, because a scalar subquery must return at most one row.

Compare one table with another in mysql and display matched record

I have two MySQL tables.
Table1:opened_datatable
Table2:unidata
Table1 has only one column:Emails
Table2 has 45 columns, three of them are:Email_Office, Email_Personal1, Email_Personal2
I want to fetch full rows from Table2 (unidata) if the Emails column of Table1 matches either Email_Office, Email_Personal1, or Email_Personal2. I am getting a little confused here. I tried it this way:
select a.emails
from opened_datatable a
where a.Emails in (select *
from unidata b
where b.email_office=a.emails
or b.Email_Personal1=a.emails
or b.Email_Personal2=a.Emails
)
It shows only one row of the first table, while I want to show the matched rows of Table2 (unidata). I think I need to select from Table2 first and then match it against Table1 (opened_datatable). But how can I do that?

Try this:
SELECT a.emails, b.*
FROM opened_datatable a
INNER JOIN unidata b ON a.emails IN (b.email_office, b.Email_Personal1, b.Email_Personal2)
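As a rough check of how this join behaves, the sketch below runs it with Python's sqlite3 module. SQLite is a stand-in for MySQL here (an assumption), unidata is reduced to the three email columns plus a hypothetical id column, and the addresses are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE opened_datatable (Emails TEXT);
    CREATE TABLE unidata (
        id INTEGER,               -- hypothetical key, stands in for the other 42 columns
        Email_Office TEXT,
        Email_Personal1 TEXT,
        Email_Personal2 TEXT
    );
    INSERT INTO opened_datatable VALUES ('a@x.org'), ('b@x.org'), ('c@x.org');
    INSERT INTO unidata VALUES
        (1, 'a@x.org', NULL, NULL),
        (2, 'z@x.org', 'b@x.org', NULL),
        (3, 'q@x.org', NULL, NULL);
""")

# A unidata row is returned when the email equals any of its three email columns.
matched = conn.execute("""
    SELECT a.Emails, b.id
    FROM opened_datatable a
    INNER JOIN unidata b
        ON a.Emails IN (b.Email_Office, b.Email_Personal1, b.Email_Personal2)
    ORDER BY b.id
""").fetchall()

print(matched)
```

Replacing b.id with b.* in the SELECT list returns the full unidata rows, as in the query above.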

Your current query should return an error, because the subquery inside IN selects more than one column.
Try a correlated subquery using EXISTS, quite similar to your approach:
select a.emails
from opened_datatable a
where EXISTS
( select *
from unidata b
where b.email_office=a.emails
or b.Email_Personal1=a.emails
or b.Email_Personal2=a.Emails
)
You will probably not get good performance due to the OR-ed conditions.
Edit:
If performance is too bad, you might try a UNION approach:
select a.emails
from opened_datatable a
where a.emails
IN
( select email_office
from unidata b
UNION
select Email_Personal1
from unidata b
UNION
select b.Email_Personal2
from unidata b
)

Find values missing in a column from a set (mysql)

I am using mysql.
I have a table that has a column id.
Let us say I have an input set of ids. I want to know which all ids are missing in the table.
If the set is "ida", "idb", "idc" and the table only contains "idb", then the returned value should be "ida", "idc".
Is this possible with a single sql query? If not, what is the most efficient way to execute this.
Note that I am not allowed to use stored procedure.

MySQL will only return rows that exist. To return missing rows you must have two tables.
The first table can be temporary (session/connection specific) so that multiple instances can run simultaneously.
create temporary table tmpMustExist (id text);
insert into tmpMustExist select "ida";
insert into tmpMustExist select "idb";
-- etc
select a.id from tmpMustExist as a
left join table b on b.id=a.id
where b.id is null; -- returns results from a table that are missing from b table.
Is this possible with a single sql query?
Well, yes it is. Let me work my way to that, first with a union all to combine the select statements.
create temporary table tmpMustExist (id text);
insert into tmpMustExist select "ida" union all select "idb" union all select "etc...";
select a.id from tmpMustExist as a left join table as b on b.id=a.id where b.id is null;
Note that I use union all which is a bit faster than union because it skips over deduplication.
You can use create table...select. I do this frequently and really like it. (It is a great way to copy a table as well, but it will drop indexes.)
create temporary table tmpMustExist as select "ida" union all select "idb" union all select "etc...";
select a.id from tmpMustExist as a left join table as b on b.id=a.id where b.id is null;
And finally you can use what's called a "derived" table to bring the whole thing into a single, portable select statement.
select a.id from (select "ida" union all select "idb" union all select "etc...") as a left join table as b on b.id=a.id where b.id is null;
Note: the as keyword is optional, but it clarifies what I'm doing with a and b. I'm simply creating short names to be used in the join and in the select field lists.
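When the input set comes from application code, the derived table can be generated with one bound parameter per id rather than by concatenating quoted strings. A minimal sketch, assuming Python's sqlite3 module as a stand-in for MySQL; the table name items and the helper missing_ids are hypothetical names introduced only for this example.

```python
import sqlite3

def missing_ids(conn, wanted):
    """Return the ids in `wanted` that have no row in the (hypothetical) items table."""
    if not wanted:
        return []
    # One "SELECT ? AS id" per input value, glued together with UNION ALL.
    derived = " UNION ALL ".join("SELECT ? AS id" for _ in wanted)
    sql = (
        f"SELECT a.id FROM ({derived}) AS a "
        "LEFT JOIN items AS b ON b.id = a.id "
        "WHERE b.id IS NULL "
        "ORDER BY a.id"
    )
    return [row[0] for row in conn.execute(sql, list(wanted))]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id TEXT)")
conn.execute("INSERT INTO items VALUES ('idb')")

missing = missing_ids(conn, ["ida", "idb", "idc"])
print(missing)
```

Only the placeholders are interpolated into the SQL text; the id values themselves are passed as parameters.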

There's a trick. You can either create a table with the expected values or use a union of SELECT statements, one per value.
Then you need to find all the values that are in the reference set but not in the tested table.
CREATE TABLE IF NOT EXISTS `single` (
`id` varchar(10) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
INSERT INTO `single` (`id`) VALUES
('idb');
SELECT a.id FROM (
SELECT 'ida' as id
UNION
SELECT 'idb' as id
UNION
SELECT 'idc' AS id
) a WHERE a.id NOT IN (SELECT id FROM single)

-- You can pass each string of the set into the query.
-- Programmatically, you can build the quoted strings.
-- Columns must use a utf8 collation.
select * from
(SELECT 'ida' as col
union
SELECT 'idb' as col
union
SELECT 'idc' as col ) as setresult where col not in (SELECT value FROM `tbl`)

select self join if only one resulting row

Is it possible/economical to perform a SELF JOIN of a table (for this example, my table called myTable has two columns pk and fk), and return a record if there is only one resulting record? I am thinking of something like the following, however, only_one_row() is a fictional function that would need to be replaced with something real:
SELECT fk
FROM myTable as t1
INNER JOIN myTable AS t2 ON t2.fk=t1.fk
WHERE t1.pk=1
AND only_one_row();
For instance, if myTable(id,fk) had the following records, only one record is produced, and I wish to select that record:
1 1
2 1
3 2
However, if myTable(id,fk) had the following records, two '1' records are produced, and the select should not return any rows:
1 1
2 1
3 2
4 1
I could use PHP to do so, but would rather just use SQL if feasible.

Use a HAVING clause that counts the results.
SELECT t1.fk
FROM myTable as t1
INNER JOIN myTable AS t2 ON t2.fk=t1.fk
WHERE t1.pk=1
HAVING COUNT(*) = 1

How about this:
SELECT t1.fk
FROM myTable as t1
INNER JOIN myTable AS t2 ON t2.fk=t1.fk
WHERE t1.pk=1
GROUP BY t1.fk
HAVING COUNT(t1.fk) = 1
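It helps to see what the count actually measures. In the sketch below (Python's sqlite3 as a stand-in for MySQL, which is an assumption; only_match is a hypothetical helper), the count covers every joined row, including the match of the t1 row with itself. A row is therefore returned only when no other row shares its fk.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE myTable (pk INTEGER PRIMARY KEY, fk INTEGER);
    INSERT INTO myTable VALUES (1, 1), (2, 1), (3, 2), (4, 1);
""")

def only_match(pk):
    """Return the fk of row `pk` only when the self join yields exactly one row."""
    return conn.execute(
        """
        SELECT t1.fk
        FROM myTable AS t1
        INNER JOIN myTable AS t2 ON t2.fk = t1.fk
        WHERE t1.pk = ?
        GROUP BY t1.fk
        HAVING COUNT(*) = 1
        """,
        (pk,),
    ).fetchall()

shared = only_match(1)  # fk 1 is shared by pk 1, 2 and 4, so the join yields 3 rows
unique = only_match(3)  # fk 2 belongs to pk 3 only, so the join yields 1 row
print(shared, unique)
```

If the row's own match should not be counted, adding a condition such as t2.pk <> t1.pk to the join and adjusting the expected count is one option.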

Eliminating duplicates from SQL query

What would be the best way to return one item for each id instead of all of the other items within the table? Currently the query below returns all manufacturers, once per product:
SELECT m.name
FROM `default_ps_products` p
INNER JOIN `default_ps_products_manufacturers` m ON p.manufacturer_id = m.id

I have solved my question by using the DISTINCT value in my query:
SELECT DISTINCT m.name, m.id
FROM `default_ps_products` p
INNER JOIN `default_ps_products_manufacturers` m ON p.manufacturer_id = m.id
ORDER BY m.name

There are 4 main ways I can think of to delete duplicate rows.
Method 1
Delete all rows with a rowid bigger than the smallest (or smaller than the greatest) rowid for the same key. Example:
delete from tableName a where rowid> (select min(rowid) from tableName b where a.key=b.key and a.key2=b.key2)
Method 2
Usually faster, but you must recreate all indexes, constraints and triggers afterward.
Pull all distinct rows into a new table, then drop the first table and rename the new table to the old table name.
Example:
create table t2 as select distinct * from t1; drop table t1; rename t2 to t1;
Method 3
Delete using WHERE EXISTS based on rowid. Example:
delete from tableName a where exists(select 'x' from tableName b where a.key1=b.key1 and a.key2=b.key2 and b.rowid >a.rowid)
Note: if the key columns can contain nulls, wrap the column names in nvl.
Method 4
Collect the first row for each key value and delete the rows not in this set. Example:
delete from tableName a where rowid not in(select min(rowid) from tableName b group by key1, key2)
Note that you don't have to use nvl for method 4.
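Method 4 can be tried out in SQLite, whose ordinary tables also expose an implicit rowid. The sketch below uses Python's sqlite3 module with made-up key values; it is an illustration under that assumption, not a statement about how the database the answer was written for behaves.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableName (key1 TEXT, key2 TEXT);
    INSERT INTO tableName VALUES
        ('a', 'x'), ('a', 'x'), ('b', 'y'), ('a', 'x'), ('b', 'z');
""")

# Keep the row with the smallest rowid in each (key1, key2) group, delete the rest.
conn.execute("""
    DELETE FROM tableName
    WHERE rowid NOT IN (
        SELECT MIN(rowid) FROM tableName GROUP BY key1, key2
    )
""")

remaining = conn.execute(
    "SELECT key1, key2 FROM tableName ORDER BY key1, key2"
).fetchall()
print(remaining)
```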

Using DISTINCT is often a bad practice. It may be a sign that there is something wrong with your SELECT statement, or that your data structure is not normalized.
In your case I would use this (assuming that default_ps_products_manufacturers has unique records).
SELECT m.id, m.name
FROM default_ps_products_manufacturers m
WHERE EXISTS (SELECT 1 FROM default_ps_products p WHERE p.manufacturer_id = m.id)
Or an equivalent query with IN:
SELECT m.id, m.name
FROM default_ps_products_manufacturers m
WHERE m.id IN (SELECT p.manufacturer_id FROM default_ps_products p)
One caveat: among all the possible queries, it is better to pick the one with the best execution plan, which may depend on your vendor and on the physical structure, statistics, etc. of your database.
I think in most cases EXISTS will work better.
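The difference between the plain join and the EXISTS form is easy to see on a few rows. A small sketch, assuming Python's sqlite3 module as a stand-in for MySQL, with the two tables reduced to the columns used in the queries and invented manufacturer names:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE default_ps_products_manufacturers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE default_ps_products (id INTEGER PRIMARY KEY, manufacturer_id INTEGER);
    INSERT INTO default_ps_products_manufacturers VALUES (1, 'Acme'), (2, 'Bolt'), (3, 'Cog');
    INSERT INTO default_ps_products VALUES (1, 1), (2, 1), (3, 2);  -- Acme has two products
""")

# Plain join: one output row per product, so 'Acme' is repeated.
joined = conn.execute("""
    SELECT m.name
    FROM default_ps_products p
    INNER JOIN default_ps_products_manufacturers m ON p.manufacturer_id = m.id
    ORDER BY m.name
""").fetchall()

# EXISTS: one output row per manufacturer that has at least one product.
with_exists = conn.execute("""
    SELECT m.id, m.name
    FROM default_ps_products_manufacturers m
    WHERE EXISTS (SELECT 1 FROM default_ps_products p WHERE p.manufacturer_id = m.id)
    ORDER BY m.name
""").fetchall()

print(joined, with_exists)
```

Because the EXISTS query reads each manufacturer once, no DISTINCT is needed to remove repeats.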
