I have a table like this:
Table1
ID | Val | Val2 |
606541 |3175031503131004|3175032612900004|
606542 |3175031503131004|3175032612900004|
677315 |3175031503131004|3175032612980004|
222222 |1111111111111111|8888888888888888|
231233 |1111111111111111|3175032612900004|
111111 |9999992222211111|1111111111111111|
57 |3173012102121018|3173015101870020|
59 |3173012102121018|3173021107460002|
2 |900 |7000 |
4 |900 |7001 |
I have two conditions on the columns Val and Val2. A row should be shown only if:
the Val column has duplicate values (the same value appears two or more times) AND
the Val2 column has no duplicate values (unique)
For example :
Sample 1
ID | Val | Val2 |
606541 |3175031503131004|3175032612900004|
606542 |3175031503131004|3175032612900004|
677315 |3175031503131004|3175032612980004|
False, because even though the Val column
has two or more duplicates, the Val2 column
also has a duplicate value (ID 606541 and 606542)
Sample Expected 1 Result
No records
Sample 2
ID | Val | Val2 |
222222 |1111111111111111|8888888888888888|
231233 |1111111111111111|3175032612900004|
111111 |9999992222211111|1111111111111111|
True, because both conditions match:
the Val column has a duplicate value AND Val2 has unique values
Sample 2 Expected Result
ID | Val | Val2 |
222222 |1111111111111111|8888888888888888|
231233 |1111111111111111|3175032612900004|
Sample 3
ID | Val | Val2 |
606541 |3175031503131004|3175032612900004|
606542 |3175031503131004|3175032612900004|
677315 |3175031503131004|3175032612980004|
222222 |1111111111111111|8888888888888888|
231233 |1111111111111111|3175032612900004|
111111 |9999992222211111|1111111111111111|
Note: this is a false condition. Even though the rows with ID 606541, 606542, and
677315 share a Val value that appears
two or more times, their Val2 values are not all unique (it would be a true condition if ID 606541,
606542, and 677315 had 3 different values in Val2).
Note 2: for ID 222222 and 231233, which share a duplicate Val value, the condition is still false, because the
Val2 of ID 231233 has the same value as ID 606542 and 606541 (3175032612900004), so it doesn't match
the second condition, which requires Val2 to have no duplicate values.
Sample 3 Expected Result
No records
Now, back to Table1 from earlier: I tried to get the result for the two conditions with this query
SELECT
tb.* FROM table1 tb
WHERE
tb.Val2 IN (
SELECT ta.Val2
FROM (
SELECT
t.*
FROM
table1 t
WHERE
t.Val IN (
SELECT Val FROM table1
GROUP BY Val
HAVING count( Val ) > 1 )
) ta
GROUP BY
ta.Val2
HAVING
count( ta.Val2 ) = 1
)
The result
ID Val Val2
677315 3175031503131004 3175032612980004
222222 1111111111111111 8888888888888888
57 3173012102121018 3173015101870020
59 3173012102121018 3173021107460002
2 900 7000
4 900 7001
While I expected the result to be like this:
ID Val Val2
57 3173012102121018 3173015101870020
59 3173012102121018 3173021107460002
2 900 7000
4 900 7001
Is there something wrong with my query?
Here is my DB Fiddle.
Excuse any mistakes, as this is my first answer on this forum.
Could you also try the query below? I agree with the answer that uses a window function, though.
SELECT t.*
FROM table1 t
WHERE t.val IN (SELECT val
FROM table1
GROUP BY val
HAVING COUNT(val) > 1
AND COUNT(val) = COUNT(DISTINCT val2)
)
AND t.val NOT IN (SELECT t.val
FROM table1 t
WHERE EXISTS (SELECT 1
FROM table1 tai
WHERE tai.id != t.id
AND tai.val2 = t.val2));
/*
The first part of the WHERE clause makes sure that a repeated value in column val has distinct values in column val2.
The second part (NOT IN) makes sure that none of those val2 values is shared with a different id.
*/
-- reverse order query (not sure it gives the expected result)
SELECT t.*
FROM table2 t
WHERE t.val IN (SELECT val FROM table2 GROUP BY val HAVING COUNT(val) = 1)
AND t.val2 IN (SELECT t.val2
FROM table2 ta
WHERE EXISTS (SELECT 1
FROM table2 tai
WHERE tai.id != ta.id
AND tai.val = ta.val));
You have to use GROUP BY to find the val and val2 values that are duplicated, and then use INNER JOIN and LEFT JOIN to include or eliminate records according to the given conditions (as opposed to IN, NOT IN, etc., which might cause performance issues if you're dealing with large data).
Please find the query below:
select t1.* from table1 t1 left join
(select val from table1
where val2 in (select val2 from table1 group by val2 having count(id) > 1)
) t2
on t1.val = t2.val
inner join
(select val from table1 group by val having count(id) >1) t3
on t1.val = t3.val
where t2.val is null
Query for Reverse Condition:
select t1.* from table1 t1 inner join
(select val from table1 group by val having count(id) = 1)
t2
on t1.val = t2.val
inner join
(select val2 from table1 group by val2 having count(id) >1) t3
on t1.val2 = t3.val2
Please find fiddle for both queries here.
Can you try this and let me know the results? SQL fiddle
SELECT t1.id, t1.val, t1.val2 FROM table1 t1
JOIN (
select val from
(select id, val, val2 from table1 group by val2 having count(1) = 1) a
group by a.val having count(1) > 1
)t2 on t1.val = t2.val;
you can use group by :
select * from (select * from #table1 where Val2 in (select Val2 val from #table1 group by Val2 having COUNT(*) =1 )) select1
where select1.val in (select Val val from #table1 group by Val having COUNT(*) >1)
or you can use RANK :
select * from ( SELECT
i.id,
i.Val val,
RANK() OVER (PARTITION BY i.val ORDER BY i.id DESC) AS Rank1,
RANK() OVER (PARTITION BY i.val2 ORDER BY i.id DESC) AS Rank2
FROM #table1 AS i
) select1 where select1.Rank1 >1 or select1.Rank2 =2
You don't need group by or having. Sub-selects will do the job just fine.
SELECT * FROM MyTable a
WHERE (SELECT Count(*) FROM MyTable b WHERE a.val = b.val) >= 2
AND (SELECT Count(*) FROM MyTable c WHERE a.val2 = c.val2) = 1;
This looks at the table as if it was 3 identical tables, but it's just one. The first sub select
(SELECT Count(*) FROM MyTable b WHERE a.val = b.val)
returns a number containing how many occurrences of "Val" are in the table; if there are at least 2 we're good to go. The second sub select
(SELECT Count(*) FROM MyTable c WHERE a.val2 = c.val2)
returns a number containing how many occurrences of "Val2" are in the table; if it's 1 and the first sub select returns at least 2 then we print the record.
If you want a solution, I think this will help.
I took the
val2 values that have no duplicates
val values that appear more than once
and joined them
Select t.* from
table1 t
inner join
(Select val2 from table1 group by val2 having count(*) = 1) tv2 on t.val2 = tv2.val2
inner join
(Select val from table1 group by val having count(*) > 1) tv on t.val = tv.val;
You can do it with EXISTS and NOT EXISTS.
If you want only the column Val:
select t1.val from table1 t1
where not exists (
select 1 from table1
where val = t1.val and val2 in (select val2 from table1 group by val2 having count(*) > 1)
)
group by t1.val
having count(t1.val) > 1
If you want full rows:
select t1.* from table1 t1
where exists (select 1 from table1 where id <> t1.id and val = t1.val)
and not exists (
select 1 from table1
where val = t1.val and val2 in (select val2 from table1 group by val2 having count(*) > 1)
)
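To check the full-row EXISTS / NOT EXISTS query against the sample data from the question, here is a minimal sketch that runs it in an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for MySQL here (an assumption made so the example is self-contained), and the helper name matching_ids is hypothetical.

```python
import sqlite3

# Sample rows copied from Table1 in the question.
ROWS = [
    (606541, "3175031503131004", "3175032612900004"),
    (606542, "3175031503131004", "3175032612900004"),
    (677315, "3175031503131004", "3175032612980004"),
    (222222, "1111111111111111", "8888888888888888"),
    (231233, "1111111111111111", "3175032612900004"),
    (111111, "9999992222211111", "1111111111111111"),
    (57, "3173012102121018", "3173015101870020"),
    (59, "3173012102121018", "3173021107460002"),
    (2, "900", "7000"),
    (4, "900", "7001"),
]

# Same logic as the full-row query above: the row's val must occur on another
# row, and no row sharing that val may carry a val2 that is duplicated anywhere.
QUERY = """
SELECT t1.id, t1.val, t1.val2
FROM table1 t1
WHERE EXISTS (SELECT 1 FROM table1 WHERE id <> t1.id AND val = t1.val)
  AND NOT EXISTS (
        SELECT 1 FROM table1
        WHERE val = t1.val
          AND val2 IN (SELECT val2 FROM table1 GROUP BY val2 HAVING COUNT(*) > 1)
  )
ORDER BY t1.id
"""


def matching_ids():
    """Load the sample rows into SQLite (stand-in for MySQL) and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (id INTEGER PRIMARY KEY, val TEXT, val2 TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?, ?)", ROWS)
    ids = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return ids


print(matching_ids())
```

On this sample data the sketch should list IDs 2, 4, 57 and 59, which are the four rows the question expects.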
And one solution with window functions for MySql 8.0+:
select t.id, t.val, t.val2
from (
select *, max(counter2) over (partition by val) countermax
from (
select *,
count(*) over (partition by val) counter,
count(*) over (partition by val2) counter2
from table1
) t
) t
where t.counter > 1 and t.countermax = 1
See the demo.
Common Table Expressions may help readability and perhaps performance as well.
with dup as (select val, count(*) -- two or more of val
from table1
group by val
having count(*)>1)
select tb1.*
from table1 tb1
inner join dup
on dup.val = tb1.val
where not exists (select val2, count(*) -- Not exists is generally fast
from table1
where val = tb1.val
group by 1
having count(*) > 1)
Fiddle
I'm going through your dataset at the moment, and I feel like your final result is accurate when you compare the results to your original dataset. Your criteria used are:
Val is duplicated at least once
Val2 is unique
9999992222211111 is the only unique value in the Val list, so that's the only value I don't expect to see in the final result. For Val2, the only duplicated value is 3175032612900004, so I don't expect to see it in the final result.
What it sounds like you're trying to do is to apply the original conditions to your final result table (which is different from your original data table). If that's what you're after, you can go through the same process applied to the original table to your new table, in which you'll get the exact result you want.
I've taken that and included all of this in my fiddle below. You'll see two output queries, one with the result you're seeing, and one with the result you want. Let me know if this answers your question! =)
Here's my fiddle: fiddle
The answer to your query
Is there something wrong with my query ?
is in your Note 2 of Sample 3
Note 2: for ID 222222 and 231233, which share a duplicate Val value, the condition is still false, because the
Val2 of ID 231233 has the same value as ID 606542 and 606541 (3175032612900004), so it doesn't match
the second condition, which requires Val2 to have no duplicate values.
You are not eliminating the records whose Val2 duplicates that of a record outside the set. So, all you need to do is add the condition below to your query
AND tb.Val NOT IN (SELECT t.Val
FROM table1 t
WHERE t.Val2 IN (SELECT Val2 FROM table1 GROUP BY Val2 HAVING count( Val2 ) > 1 ))
I have added this condition to your query and I see the expected results. See the fiddle below
My Fiddle
The answer given by @Govind feels like a better rewrite of your requirements. It checks for duplicates in the Val column only when there are no duplicates in the Val2 column. Very neat and concise query.
Answer by Govind
Something like this?
SELECT *
FROM table1
WHERE val IN
(SELECT val
FROM table1
GROUP BY val
HAVING COUNT(*) > 1 AND COUNT(DISTINCT val2) = COUNT(*))
AND val NOT IN (SELECT t.val
FROM table1 t
INNER JOIN (SELECT val2
FROM table1
GROUP BY val2
HAVING COUNT(*) > 1) x
ON x.val2 = t.val2);
select val, count(*) from table1 group by val having count(*)>=2;
val count(*)
1111111111111111 2
3173012102121018 2
3175031503131004 3
900 2
Val column has at least two or more duplicate values - TRUE
select val2, count(*) from table1 group by val2 having count(*)>1;
val2 count(*)
3175032612900004 3
Val2 column has no duplicate value (unique) - FALSE
So, ideally, you should get no records found, right?
Related
I couldn't find or come up with a solution to this problem in MySQL.
I want to update two columns of a table with values taken from another table, chosen at random.
Here's a tentative example:
update table1 a1,
(select col1, col2
from table2
ORDER BY RAND() limit 1) a2
set a1.col1 = a2.col1, a1.col2 = a2.col2
where a1.col3 is not null;
With this form, the same row from table2 is always used.
table1 | table2
id col1 col2 | id col1 col2
1 aaa bbb | 1 xxx yyy
2 ccc ddd | 2 www ttt
| 3 uuu vvv
I want the values (col1, col2) from table2 to be assigned to table1 (col1 and col2) at random.
Without LIMIT 1, every row is still updated with the same record, as if there were only 1 record in table2.
What I want is that, for each row being updated, a subquery picks a random record from the other table.
You can use join and row_number(), but multiple times:
update table1 t1 join
(select *, row_number() over (order by rand()) as seqnum
from table1 t1
) tt1
on tt1.id = t1.id join
(select *, row_number() over (order by rand()) as seqnum
from table2 t2
) t2
on t2.seqnum = tt1.seqnum
set t1.col1 = t2.col1,
t1.col2 = t2.col2;
This adds a sequence number defined randomly to the two tables and joins on that for the matching. The extra join is to implement the update.
I have 2 tables, table_1 and table_2. table_1 contains all the data that I need to copy into table_2.
table_1
column_2  column_3
b1        b1
b2        b2
table_2
column_1  column_2  column_3  column_4
1         a1        a1        a
2         a         a         a
2         a         a         a
1         a2        a2        a
2         a         a         a
I need to put all data of table_1 to table_2 where column_1 is a specific number, for example, 1. However, I don't have any foreign key to join these two tables. The only relationship is that table_1 has n rows, table_2 also has n rows where column_1 = 1, and I want n rows in table_1 to be updated to these n rows in table_2.
My result would look like this:
column_1  column_2  column_3  column_4
1         b1        b1        a
2         a         a         a
2         a         a         a
1         b2        b2        a
2         a         a         a
Any help would be appreciated.
I think you should do this with a scripting language instead of pure SQL.
Get everything from table_2 where column_1 = 1, ordered by column_2, into an array of objects like
[
{column_1: 1, column_2: a1, column_3: a1, column_4: a},
{column_1: 1, column_2: a2, column_3: a2, column_4: a}
]
Then get everything from table_1, ordered by column_2, into another array of objects
[
{column_2: b1, column_3: b1},
{column_2: b2, column_3: b2}
]
Then, for every element from table_1, update the table_2 row at the same position, using its column_1, column_2, column_3 and column_4 values in the WHERE clause.
I can't think of any other way to do this. It really is a pain when the structure is like that.
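The scripting approach described above could look roughly like the following sketch, written in Python with an in-memory SQLite database as a stand-in for the real server. It assumes table_2 has a primary key, here a hypothetical column named pk, and uses that key in the UPDATE instead of matching on all four columns; the function names copy_by_position and demo are also made up for illustration.

```python
import sqlite3


def copy_by_position(conn):
    """Copy table_1 rows onto the table_2 rows with column_1 = 1, pairing by sort order."""
    # Keys of the target rows, ordered by column_2 (pk is an assumed primary key).
    targets = [row[0] for row in conn.execute(
        "SELECT pk FROM table_2 WHERE column_1 = 1 ORDER BY column_2")]
    # Source rows, ordered the same way.
    sources = conn.execute(
        "SELECT column_2, column_3 FROM table_1 ORDER BY column_2").fetchall()
    if len(targets) != len(sources):
        raise ValueError("table_1 and the filtered table_2 must have the same row count")
    # Pair the n-th source row with the n-th target row and update by key.
    conn.executemany(
        "UPDATE table_2 SET column_2 = ?, column_3 = ? WHERE pk = ?",
        [(c2, c3, pk) for (c2, c3), pk in zip(sources, targets)])
    conn.commit()


def demo():
    """Build the sample tables from the question, run the copy, return table_2."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_1 (column_2 TEXT, column_3 TEXT)")
    conn.execute("CREATE TABLE table_2 (pk INTEGER PRIMARY KEY, column_1 INTEGER, "
                 "column_2 TEXT, column_3 TEXT, column_4 TEXT)")
    conn.executemany("INSERT INTO table_1 VALUES (?, ?)", [("b1", "b1"), ("b2", "b2")])
    conn.executemany(
        "INSERT INTO table_2 (column_1, column_2, column_3, column_4) VALUES (?, ?, ?, ?)",
        [(1, "a1", "a1", "a"), (2, "a", "a", "a"), (2, "a", "a", "a"),
         (1, "a2", "a2", "a"), (2, "a", "a", "a")])
    copy_by_position(conn)
    rows = conn.execute(
        "SELECT column_1, column_2, column_3, column_4 FROM table_2 ORDER BY pk").fetchall()
    conn.close()
    return rows


print(demo())
```

Updating by key avoids touching more than one row when several table_2 rows happen to hold identical values, which matching on all four columns could not guarantee.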
It's unclear by what logic you would like which rows in table1 to update which in table2. I will assume you just want to go in order: row 1 to row 1, 2 to 2 etc.
What we can do is add a row_number() to each table, then join on that.
I'm not 100% on MySQL syntax but hopefully you should get the idea. See also here for further "update through join" syntaxes:
WITH t2 AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS rn
FROM table_2
WHERE column_1 = 1
),
t1 AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS rn
FROM table_1
)
UPDATE t2
INNER JOIN t1 ON t1.rn = t2.rn
SET
t2.column_2 = t1.column_2,
t2.column_3 = t1.column_3;
If you cannot do an update on a WITH table, then you must self-join table2. You haven't indicated the PK of that table, I will just use column PK:
UPDATE table_2
INNER JOIN (
SELECT *, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS rn
FROM table_2
WHERE column_1 = 1
) AS t2 ON t2.PK = table_2.PK
INNER JOIN (
SELECT *, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS rn
FROM table_1
) AS t1 ON t1.rn = t2.rn
SET
table_2.column_2 = t1.column_2,
table_2.column_3 = t1.column_3;
I have the table below:
ID Value
U-1 ACB
U-1 ART
U-1 DDD
U-2 ACB
U-2 DDD
U-3 XCC
U-3 DFC
I want to fetch all rows for those IDs that have a row where Value is DDD and fewer than 3 rows in total.
Required Output:
ID Value
U-2 ACB
U-2 DDD
You could use a self join on the same table. The inner query calculates the row count per id and keeps only the ids where the count is less than 3 and at least one Value is 'DDD'.
select a.*
from table1 a
join (
select id, count(*) total
from table1
group by id
having total < 3
and sum(`Value` = 'DDD') > 0
) t using(id);
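As a quick check of the join approach, the sketch below runs an equivalent query on the sample rows in an in-memory SQLite database via Python's sqlite3 (SQLite is an assumption here, standing in for MySQL). The SUM over a CASE expression replaces sum(Value = 'DDD') only to keep the SQL portable, and the function name ids_with_ddd is hypothetical.

```python
import sqlite3

# Sample rows from the question.
ROWS = [
    ("U-1", "ACB"), ("U-1", "ART"), ("U-1", "DDD"),
    ("U-2", "ACB"), ("U-2", "DDD"),
    ("U-3", "XCC"), ("U-3", "DFC"),
]

# Inner query: ids with fewer than 3 rows and at least one 'DDD' row.
# Outer query: every row belonging to those ids.
QUERY = """
SELECT a.ID, a.Value
FROM table1 a
JOIN (
    SELECT ID
    FROM table1
    GROUP BY ID
    HAVING COUNT(*) < 3
       AND SUM(CASE WHEN Value = 'DDD' THEN 1 ELSE 0 END) > 0
) t ON t.ID = a.ID
ORDER BY a.ID, a.Value
"""


def ids_with_ddd():
    """Load the sample rows into SQLite and return the rows selected by QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table1 (ID TEXT, Value TEXT)")
    conn.executemany("INSERT INTO table1 VALUES (?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


print(ids_with_ddd())
```

Only U-2 survives: U-1 has a DDD row but three rows in total, and U-3 has two rows but no DDD.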
Demo
OR
select a.*
from table1 a
where (
select count(*)
from table1
where ID = a.ID
having sum(`Value` = 'DDD') > 0
) < 3
but I prefer the join approach
updated demo
How about this?
SELECT * FROM
(SELECT * FROM sof t1
WHERE (SELECT COUNT(id) FROM sof t2 WHERE t2.id = t1.id) < 3) as temp2
WHERE id IN (SELECT id FROM sof WHERE value = 'DDD')
The input and output match for your case, at least on my end.
Demo: http://rextester.com/AZLA7822
I would like to divide multiple columns from two statements, as follows:
TBL1
NAME VAL1 VAL2 VAL3
A 2 3 3
TBL2
NAME VAL1 VAL2 VAL3
B 2 3 3
ERROR SCRIPT
select (select * from tbl1)/(select * from TBL2) as result
The result I need is as follows:
VAL1 VAL2 VAL3
2/2 3/3 3/3
There should be an ON clause, but I'm not sure what it should be
SELECT t1.VAL1/t2.VAL1, t1.VAL2/t2.VAL2, t1.VAL3/t2.VAL3
FROM TBL1 t1, TBL2 t2
The best thing that I can come up with is
SET @COUNTER1 = 0;
SET @COUNTER2 = 0;
SELECT T1.VAL1 / T2.VAL1,
T1.VAL2 / T2.VAL2,
T1.VAL3 / T2.VAL3
FROM (SELECT *, (@COUNTER1 := @COUNTER1 + 1) AS id FROM TBL1) AS T1
INNER JOIN (SELECT *, (@COUNTER2 := @COUNTER2 + 1) AS id FROM TBL2) AS T2
ON T1.id = T2.id
Select Tbl1.Val1 / Tbl2.Val1 As Val1
, Tbl1.Val2 / Tbl2.Val2 As Val2
, Tbl1.Val3 / Tbl2.Val3 As Val3
From Tbl1
Cross Join Tbl2
Of course, this probably isn't what you want. Firstly, there is nothing that correlates the rows in Table 1 with the rows in Table 2. I.e., if both tables have three rows each, the result will have nine rows. In short, you will get a Cartesian product between the two tables. Second, there is no logic to deal with divide-by-zero errors. Should those values simply be set to zero? Should they be null?
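Both caveats can be seen in a small sketch, run here in an in-memory SQLite database through Python's sqlite3 as a stand-in for MySQL. The second row in each table ("C" and "D") is invented for illustration, and wrapping the divisor in NULLIF is just one possible choice: it turns a zero divisor into NULL rather than zero. The function name divide_all is hypothetical.

```python
import sqlite3

# Cross join with NULLIF guarding each divisor: x / NULL evaluates to NULL.
QUERY = """
SELECT tbl1.name, tbl2.name,
       tbl1.val1 / NULLIF(tbl2.val1, 0) AS val1,
       tbl1.val2 / NULLIF(tbl2.val2, 0) AS val2,
       tbl1.val3 / NULLIF(tbl2.val3, 0) AS val3
FROM tbl1
CROSS JOIN tbl2
ORDER BY tbl1.name, tbl2.name
"""


def divide_all():
    """Build two 2-row tables and divide every tbl1 row by every tbl2 row."""
    conn = sqlite3.connect(":memory:")
    for table in ("tbl1", "tbl2"):
        conn.execute(f"CREATE TABLE {table} (name TEXT, val1 REAL, val2 REAL, val3 REAL)")
    # Rows "A" and "B" come from the question; "C" and "D" are made up.
    conn.executemany("INSERT INTO tbl1 VALUES (?, ?, ?, ?)",
                     [("A", 2, 3, 3), ("C", 4, 6, 9)])
    conn.executemany("INSERT INTO tbl2 VALUES (?, ?, ?, ?)",
                     [("B", 2, 3, 3), ("D", 0, 1, 3)])
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


for row in divide_all():
    print(row)
```

With two rows per table the result has four rows, one per pairing, and every pairing with row "D" gets NULL (None in Python) for val1 because its divisor is zero.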
MySQL Join Syntax. (Yes MySQL supports the ISO/ANSI standard Cross Join syntax).
SQL Fiddle version
Edit
If what you are trying to do is concatenate the values into a string expression of #/#, then you need to use the Concat function:
Select Concat(Tbl1.Val1,'/',Tbl2.Val1) As Val1
, Concat(Tbl1.Val2,'/',Tbl2.Val2) As Val2
, Concat(Tbl1.Val3,'/',Tbl2.Val3) As Val3
From Tbl1
Cross Join Tbl2
SQL Fiddle version.
Given this table with 2 million+ rows, an auto-incrementing ID, and the indexes index1(MainID,SubID,Column1) and
index2(MainID,SubID,Column2):
ID MainID SubID Column1 Column2
--------------------------------------
1 1 A 1A_data_1
2 1 A 1A_data_2
3 2 B 2B_data_1
4 2 B 2B_data_2
5 1 A ignore_me
6 1 A 1A_data_3
I can get the row ID that contains the desired column value using indexes with:
Select max(ID)
From table where column1 is not null and column1 <>'ignore_me'
Group By MainID,SubID
Select max(id)
From table where column2 is not null and column2 <>'ignore_me'
Group By MainID,SubID
But what I can't do is find an efficient way to join these against a MainID,SubID group by to get these results:
MainID SubID Column1 Column2
--------------------------------
1 A 1A_data_1 1A_data_3
2 B 2B_data_1 2B_data_2
I've tried a lot of different approaches, but nothing that doesn't take forever. Do I need another index? I feel like I'm overlooking something simple, as the group by queries are super fast. Can anyone point me in the right direction?
You can calculate the two IDs in a single query using conditional aggregation:
SELECT
MainID,
SubID,
MAX(CASE WHEN Column1 <> 'ignore_me' THEN ID END) AS ID1,
MAX(CASE WHEN Column2 <> 'ignore_me' THEN ID END) AS ID2
FROM atable
GROUP BY
MainID,
SubID
;
You could also explicitly add AND ColumnN IS NOT NULL to the WHEN conditions but that's not necessary, NULL values would be ignored anyway.
Now you can simply do two left joins with the above subquery as a derived table:
SELECT
tm.MainID,
tm.SubID,
t1.Column1,
t2.Column2
FROM (
SELECT
MainID,
SubID,
MAX(CASE WHEN Column1 <> 'ignore_me' THEN ID END) AS ID1,
MAX(CASE WHEN Column2 <> 'ignore_me' THEN ID END) AS ID2
FROM atable
GROUP BY
MainID,
SubID
) tm
LEFT JOIN atable t1 ON tm.ID1 = t1.ID
LEFT JOIN atable t2 ON tm.ID2 = t2.ID
;
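A minimal sketch of the conditional-aggregation query with its two left joins, run in an in-memory SQLite database via Python's sqlite3 (an assumption, standing in for MySQL). The sample table in the question does not show clearly which column each value belongs to, so the placement below, including putting ignore_me in Column1, is a guess chosen to reproduce the expected result; the function name latest_values is hypothetical.

```python
import sqlite3

# (ID, MainID, SubID, Column1, Column2); the column placement is assumed.
ROWS = [
    (1, 1, "A", "1A_data_1", None),
    (2, 1, "A", None, "1A_data_2"),
    (3, 2, "B", "2B_data_1", None),
    (4, 2, "B", None, "2B_data_2"),
    (5, 1, "A", "ignore_me", None),
    (6, 1, "A", None, "1A_data_3"),
]

# The CASE yields NULL for NULL or 'ignore_me' values, and MAX skips NULLs,
# so ID1/ID2 are the highest IDs holding a usable Column1/Column2 value.
QUERY = """
SELECT tm.MainID, tm.SubID, t1.Column1, t2.Column2
FROM (
    SELECT MainID, SubID,
           MAX(CASE WHEN Column1 <> 'ignore_me' THEN ID END) AS ID1,
           MAX(CASE WHEN Column2 <> 'ignore_me' THEN ID END) AS ID2
    FROM atable
    GROUP BY MainID, SubID
) tm
LEFT JOIN atable t1 ON tm.ID1 = t1.ID
LEFT JOIN atable t2 ON tm.ID2 = t2.ID
ORDER BY tm.MainID, tm.SubID
"""


def latest_values():
    """Load the assumed sample rows and return one row per (MainID, SubID)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE atable (ID INTEGER PRIMARY KEY, MainID INTEGER, "
                 "SubID TEXT, Column1 TEXT, Column2 TEXT)")
    conn.executemany("INSERT INTO atable VALUES (?, ?, ?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


print(latest_values())
```

Under these assumptions the output matches the two rows the question asks for: (1, A, 1A_data_1, 1A_data_3) and (2, B, 2B_data_1, 2B_data_2).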
UPDATE (converting to a view, in answer to comments)
So far I can see only one alternative that would be VIEW-friendly:
SELECT
MainID,
SubID,
(
SELECT Column1
FROM atable
WHERE MainID = t.MainID
AND SubID = t.SubID
AND Column1 <> 'ignore_me'
ORDER BY ID DESC
LIMIT 1
) AS Column1,
(
SELECT Column2
FROM atable
WHERE MainID = t.MainID
AND SubID = t.SubID
AND Column2 <> 'ignore_me'
ORDER BY ID DESC
LIMIT 1
) AS Column2
FROM atable t
GROUP BY
MainID,
SubID
;
This query may be slower than the previous one, though: it uses two correlated subqueries, and I'm not sure if queries (or, in particular, views) with correlated subqueries can be efficient in MySQL. Proper indexing might help. In general, you'll probably need to test this for yourself.