Delete from table according to count of attribute - mysql

Let's say I have a column named source in a table x. Individual entries can look like this:
Id c1 c2 source ...
1 a b something
2 b a something
3 a b somethingelse
4 c a somethingelse
5 a b something
6 b c something
How can I delete entries whose source value appears fewer than 3 times? For example, since the source value somethingelse appears only 2 times, I need all entries that have somethingelse removed.

DELETE a
FROM tableName a
INNER JOIN
(
SELECT source
FROM tableName
GROUP BY SOURCE
HAVING COUNT(*) < 3
) b ON a.source = b.source
One more thing: for faster performance, add an INDEX on column SOURCE.
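As a rough sketch of the index suggestion, assuming source is a VARCHAR column (a TEXT column would need a prefix length) and using idx_source as a made-up index name:

```sql
-- idx_source is a hypothetical name; tableName is the placeholder used in the answer.
CREATE INDEX idx_source ON tableName (source);

-- Optional check: inspect the plan of the grouping subquery
-- to see whether the new index is considered.
EXPLAIN
SELECT source
FROM tableName
GROUP BY source
HAVING COUNT(*) < 3;
```

Whether the optimizer actually picks the index depends on the table size and data distribution, so it is worth checking the EXPLAIN output on real data.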

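Roughly something like this would do the job. The subquery has to return SOURCE (not ID), and MySQL requires it to be wrapped in a derived table because it reads from the table being deleted from:
DELETE FROM TABLE_T1 WHERE SOURCE IN (
SELECT SOURCE FROM (
SELECT SOURCE FROM TABLE_T1 GROUP BY SOURCE HAVING COUNT(*) < 3
) AS t
)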

DELETE a
FROM yourtable a
JOIN (
SELECT source
FROM yourtable
GROUP BY
source
HAVING COUNT(*) < 3
) b
ON a.source = b.source

DELETE FROM x WHERE source IN
( SELECT source FROM
( SELECT source, COUNT(*) AS n
FROM x GROUP BY source
HAVING n < 3
) AS t
)
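Before running any DELETE of this kind, it can help to preview the rows that would be removed. A minimal sketch, assuming the table is named x as in the question:

```sql
-- Lists the rows whose source value occurs fewer than 3 times.
-- Same join as the DELETE, but read-only.
SELECT a.*
FROM x AS a
INNER JOIN (
    SELECT source
    FROM x
    GROUP BY source
    HAVING COUNT(*) < 3
) AS b ON a.source = b.source;
```

With the sample data from the question this should list the two somethingelse rows (ids 3 and 4), since something occurs 4 times.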

Related

Using "or" ,"and", "like" several times in same column

If I have a table like this:
id car
1 A
1 B
1 C
1 D
2 A
2 B
2 C
2 F
3 A
3 C
3 E
3 F
3 G
what I want is different "id" which have ("A" or "C") and "B" in car. For example:
id car
1 A
1 B
1 C
2 A
2 B
2 C
what I did was
select * from table where (car like "A" or car like"C") and (car like "B")
but it gives me an empty row.
Any clue?
You can use a self-join
SELECT DISTINCT t1.id
FROM yourTable AS t1
JOIN yourTable AS t2 ON t1.id = t2.id
WHERE t1.car IN ('A', 'C')
AND t2.car = 'B'
BTW, you should generally only use LIKE when you're doing a pattern match. For exact matches use =, or IN for matching any of multiple items.
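To illustrate the difference between the three operators, here is a small sketch against the same yourTable placeholder:

```sql
-- Exact match on a single value: use =
SELECT id FROM yourTable WHERE car = 'B';

-- Exact match on any of several values: use IN
SELECT id FROM yourTable WHERE car IN ('A', 'C');

-- Pattern match: LIKE with a wildcard, here any value starting with 'A'
SELECT id FROM yourTable WHERE car LIKE 'A%';
```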
Untested, but something like this should work: get all rows that have the ID such that it is found on the list of all IDs that have an A or a C, and also on the list of all IDs that have a B.
SELECT t.*
FROM mytable t
WHERE t.id IN (
SELECT DISTINCT id
FROM mytable t2
WHERE t2.car='A' OR t2.car='C'
)
AND t.id IN (
SELECT DISTINCT id
FROM mytable t3
WHERE t3.car='B'
)
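The original query returns nothing because WHERE is evaluated one row at a time, and no single row can have car equal to 'B' and to 'A' or 'C' at once. An alternative sketch, not taken from the answers above, checks the conditions per id with GROUP BY and HAVING. It relies on MySQL treating a comparison as 1 or 0, and it assumes the output should contain only the A, B and C rows, as in the question's example:

```sql
SELECT t.*
FROM yourTable AS t
JOIN (
    SELECT id
    FROM yourTable
    GROUP BY id
    HAVING SUM(car IN ('A', 'C')) > 0   -- id has at least one A or C row
       AND SUM(car = 'B') > 0           -- id has at least one B row
) AS ok ON ok.id = t.id
WHERE t.car IN ('A', 'B', 'C');         -- keep only the rows shown in the example
```

Dropping the final WHERE would return every row of the qualifying ids instead.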

SELECT only one entry of multiple occurrences

Let's say I have a Table that looks like this:
id fk value
------------
1 1 'lorem'
2 1 'ipsum'
3 1 'dolor'
4 2 'sit'
5 2 'amet'
6 3 'consetetur'
7 3 'sadipscing'
Each fk can appear multiple times, and for each fk I want to select the last row (or, more precisely, the row with the highest id for that fk), like this:
id fk value
------------
3 1 'dolor'
5 2 'amet'
7 3 'sadipscing'
I thought I could use the keyword DISTINCT here like this:
SELECT DISTINCT id, fk, value
FROM table
but I am not sure which row DISTINCT will return, and it must be the last one.
Is there anything like this (pseudocode):
SELECT id, fk, value
FROM table
WHERE MAX(id)
FOREACH DISTINCT(fk)
I hope I am making any sense here :)
thank you for your time
SELECT *
FROM table
WHERE id IN (SELECT MAX(id) FROM table GROUP BY fk)
Try this:
SELECT a.id, a.fk, a.value
FROM tableA a
INNER JOIN (SELECT MAX(a.id) id, a.fk FROM tableA a GROUP BY a.fk
) AS b ON a.fk = b.fk AND a.id = b.id;
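Or the following, which relies on MySQL's non-standard GROUP BY handling: it fails when ONLY_FULL_GROUP_BY is enabled, and the server is not guaranteed to pick the first row of each group: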
SELECT a.id, a.fk, a.value
FROM (SELECT a.id, a.fk, a.value FROM tableA a ORDER BY a.fk, a.id DESC) AS a
GROUP BY a.fk;
Try this:
SELECT t.* FROM
table1 t
JOIN (
SELECT MAX(id) as id
FROM table1
GROUP BY fk
) t1
ON t.id = t1.id
Inner query will give you highest id for each fk using MAX(). Then we join this inner table with your main table.
You could also do
SELECT id, fk, value FROM table GROUP BY fk HAVING id = MAX(id)
I don't have mysql here, but it works in Sybase ASE
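If window functions are available (this assumes MySQL 8.0 or later, which none of the answers above require), the same result can be sketched with ROW_NUMBER(). The table name tableA is borrowed from one of the answers, since table itself is a reserved word:

```sql
SELECT id, fk, value
FROM (
    SELECT id, fk, value,
           -- number the rows of each fk, highest id first
           ROW_NUMBER() OVER (PARTITION BY fk ORDER BY id DESC) AS rn
    FROM tableA
) AS ranked
WHERE rn = 1;   -- keep only the row with the highest id per fk
```

Changing rn = 1 to rn <= n would keep the last n rows per fk instead of just one.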

Select Matched Pairs from Two Tables

I need to select matched pairs from two tables containing similarly structured data. "Matched Pair" here means two rows that reference each other in the 'match' column.
A single-table matched pair example:
TABLE
----
id | matchid
1 | 2
2 | 1
ID 1 and 2 are a matched pair because each has a match entry for the other.
Now the real question: what is the best (fastest) way to select the matched pairs that appear in both tables:
Table ONE (id, matchid)
Table TWO (id, matchid)
Example data:
ONE TWO
---- ----
id | matchid id | matchid
1 | 2 2 | 3
2 | 3 3 | 2
3 | 2
4 | 5
5 | 4
The desired result is a single row with IDs 2 and 3.
RESULT
----
id | id
2 | 3
This is because 2 & 3 are a matched pair in table ONE and in table TWO. 4 & 5 are a matched pair in table ONE but not TWO, so we don't select them. 1 and 2 are not a match pair at all since 2 does not have a matching entry for 1.
I can get the matched pairs from one table with this:
SELECT a.id, b.id
FROM ONE a JOIN ONE b
ON a.id = b.matchid AND a.matchid = b.id
WHERE a.id < b.id
How should I build a query that selects only the matching pairs that appear in both tables?
Should I:
Select the query above for each table and WHERE EXISTS them together?
Select the query above for each table and JOIN them together?
Select the query above then JOIN table TWO twice, once for 'id' and once for 'matchid'?
Select the query above for each table and loop through to compare them back in php?
Somehow filter table TWO down so we only have to look at the IDs in matched pairs in table ONE?
Do something totally different?
(Since this is a question of efficiency, it is worth noting that the matches will be quite sparse, maybe 1/1000 or less, and each table will have 100,000+ rows.)
I think I get your point. You want to filter the records for which the pair exists in both tables.
SELECT LEAST(a.ID, a.MatchID) ID, GREATEST(a.ID, a.MatchID) MatchID
FROM One a
INNER JOIN Two b
ON a.ID = b.ID AND
a.matchID = b.matchID
GROUP BY LEAST(a.ID, a.MatchID), GREATEST(a.ID, a.MatchID)
HAVING COUNT(*) > 1
Try this Query:
select
O.id,
O.matchid
from
ONE O
where
CONCAT(CAST(O.id as CHAR(50)), '~', CAST(O.matchid as CHAR(50)))
in (select CONCAT(CAST(T.id as CHAR(50)), '~', CAST(T.matchid as CHAR(50))) from TWO T)
Edited Query:
select distinct
Least(O.id,O.matchid) ID,
Greatest(O.id,O.matchid) MatchID
from
ONE O
where
CONCAT(CAST(O.id as CHAR(50)), '~', CAST(O.matchid as CHAR(50)))
in (select CONCAT(CAST(T.id as CHAR(50)), '~', CAST(T.matchid as CHAR(50))) from TWO T)
and CONCAT(CAST(O.matchid as CHAR(50)), '~', CAST(O.id as CHAR(50)))
in (select CONCAT(CAST(T.id as CHAR(50)), '~', CAST(T.matchid as CHAR(50))) from TWO T)
Naive version, which checks all the four rows that need to exist:
-- EXPLAIN ANALYZE
WITH both_one AS (
SELECT o.id, o.matchid
FROM one o
WHERE o.id < o.matchid
AND EXISTS ( SELECT * FROM one x WHERE x.id = o.matchid AND x.matchid = o.id)
)
, both_two AS (
SELECT t.id, t.matchid
FROM two t
WHERE t.id < t.matchid
AND EXISTS ( SELECT * FROM two x WHERE x.id = t.matchid AND x.matchid = t.id)
)
SELECT *
FROM both_one oo
WHERE EXISTS (
SELECT *
FROM both_two tt
WHERE tt.id = oo.id AND tt.matchid = oo.matchid
);
This one is simpler :
-- EXPLAIN ANALYZE
WITH pair AS (
SELECT o.id, o.matchid
FROM one o
WHERE EXISTS ( SELECT * FROM two x WHERE x.id = o.id AND x.matchid = o.matchid)
)
SELECT *
FROM pair pp
WHERE EXISTS (
SELECT *
FROM pair xx
WHERE xx.id = pp.matchid AND xx.matchid = pp.id
)
AND pp.id < pp.matchid
;
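The WITH syntax above needs a database that supports common table expressions (in MySQL that means 8.0 or later). A plain-join sketch of the same four-row check, which extends the single-table query from the question, might look like this. The index names are made up, and it assumes each (id, matchid) combination occurs at most once per table:

```sql
-- Hypothetical index names; they let each join be resolved by an index lookup.
CREATE INDEX idx_one_pair ON ONE (id, matchid);
CREATE INDEX idx_two_pair ON TWO (id, matchid);

SELECT a.id, a.matchid
FROM ONE AS a
JOIN ONE AS b ON b.id = a.matchid AND b.matchid = a.id       -- reverse row in ONE
JOIN TWO AS c ON c.id = a.id      AND c.matchid = a.matchid  -- same row in TWO
JOIN TWO AS d ON d.id = a.matchid AND d.matchid = a.id       -- reverse row in TWO
WHERE a.id < a.matchid;                                      -- report each pair once
```

On the example data this should return the single pair 2 and 3. Since matches are sparse, most rows of ONE fail the first join immediately, so the later lookups run only for the few candidate pairs.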

Selecting two rows in a table which have the same data for a particular column

There is a column in a table (contracts) called service location. I have to show all the rows whose service location matches that of any other row in the table.
Table Example
A B C
1 2 3
3 2 1
2 5 3
I require a query where the first and second rows will be returned based on a comparison on the second column. I am assuming I will need to use a HAVING COUNT(B) > 1
I came up with this
SELECT `contract_number`
FROM `contracts`
WHERE `import_id` = 'fe508764-54a9-41f7-b36e-50ebfd95971b'
GROUP BY `service_location_id`
HAVING COUNT(`service_location_id` ) >1
But it doesn't generate exactly what I need.
HAVING would do it, but you would need to use it like this:
SELECT *
FROM Contracts
INNER JOIN
( SELECT B
FROM Contracts
GROUP BY B
HAVING COUNT(*) > 1 -- MORE THAN ONE ROW WITH THE SAME VALUE
) dupe
ON dupe.B = Contracts.B
Depending on your indexing, you may find a self-join performs better though:
SELECT DISTINCT t1.*
FROM contracts t1
INNER JOIN contracts t2
ON t1.B = t2.B
AND t1.A <> t2.A
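Applied to the column names from the question, the derived-table version could be sketched as follows. This assumes duplicates should only be counted within the same import_id, which the question does not state explicitly:

```sql
SELECT c.contract_number, c.service_location_id
FROM contracts AS c
INNER JOIN (
    -- service locations used by more than one contract in this import
    SELECT service_location_id
    FROM contracts
    WHERE import_id = 'fe508764-54a9-41f7-b36e-50ebfd95971b'
    GROUP BY service_location_id
    HAVING COUNT(*) > 1
) AS dupe ON dupe.service_location_id = c.service_location_id
WHERE c.import_id = 'fe508764-54a9-41f7-b36e-50ebfd95971b'
ORDER BY c.service_location_id, c.contract_number;
```

The original attempt returned only one row per group because GROUP BY collapses each service_location_id into a single row; joining back to contracts restores the individual rows.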
SELECT *
FROM sheet1
WHERE C
IN (
SELECT C
FROM sheet1
GROUP BY C
HAVING COUNT( C ) >1
)
ORDER BY C
LIMIT 0 , 5000

Get first/last n records per group by

I have two tables : tableA (idA, titleA) and tableB (idB, idA, textB) with a one to many relationship between them. For each row in tableA, I want to retrieve the last 5 rows corresponding in tableB (ordered by idB).
I've tried
SELECT * FROM tableA INNER JOIN tableB ON tableA.idA = tableB.idA LIMIT 5
but it's just limiting the global result of INNER JOIN whereas I want to limit the result for each different tableA.id
How can I do that ?
Thanks
A much simplified and corrected version of Carlos's solution (his solution would return the first 5 rows, not the last):
SELECT tB1.idA, tB1.idB, tB1.textB
FROM tableB as tB1
JOIN tableB as tB2
ON tB1.idA = tB2.idA AND tB1.idB <= tB2.idB
GROUP BY tB1.idA, tB1.idB
HAVING COUNT(*) <= 5
In MySQL, you may use tB1.textB even though it is a GROUP BY query, because you are grouping by idB in the first table, so there is only a single value of tB1.textB for each group.
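To see why the HAVING COUNT(*) <= 5 filter works, it may help to look at the count itself. In this sketch, which assumes idB is unique within each idA, the count is each row's position counted from the end of its group (position_from_end is a made-up alias):

```sql
SELECT tB1.idA,
       tB1.idB,
       COUNT(*) AS position_from_end   -- 1 = highest idB within its idA
FROM tableB AS tB1
JOIN tableB AS tB2
  ON tB1.idA = tB2.idA AND tB1.idB <= tB2.idB   -- pair each row with itself and all later rows
GROUP BY tB1.idA, tB1.idB
ORDER BY tB1.idA, position_from_end;
```

Each row is joined to itself and to every row of the same idA with a higher idB, so the row with the highest idB gets a count of 1, the next one 2, and so on.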
I think this is what you need:
SELECT tableA.idA, tableA.titleA, temp.idB, temp.textB
FROM tableA
INNER JOIN
(
SELECT tB1.idB, tB2.idA,
(
SELECT textB
FROM tableB
WHERE tableB.idB = tB1.idB
) as textB
FROM tableB as tB1
JOIN tableB as tB2
ON tB1.idA = tB2.idA AND tB1.idB >= tB2.idB
GROUP BY tB1.idA, tB1.idB
HAVING COUNT(*) <= 5
ORDER BY idA, idB
) as temp
ON tableA.idA = temp.idA
More info about this method here:
http://www.sql-ex.ru/help/select16.php
Ensure your "B" table has an index on ( idA, idB ) for optimized order by purposes so for each "A" ID, it can quickly have the "B" order descending thus putting the newest to the top PER EACH "A" ID. Using the MySQL variables, every time the "A" ID changes, it resets the rank back to 1 for the next "A" id.
select
B.idA,
B.idB,
B.textB,
@RankSeq := if( @LastAGroup = B.idA, @RankSeq +1, 1 ) ARankSeq,
@LastAGroup := B.idA as ignoreIt
from
tableB B
JOIN tableA A
on B.idA = A.idA,
(select @RankSeq := 0, @LastAGroup := 0 ) SQLVars
having
ARankSeq <= 5
order by
B.idA,
B.idB DESC
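Where window functions are available (assuming MySQL 8.0 or later, which this answer does not require), the same per-group ranking can be sketched without user variables:

```sql
SELECT idA, idB, textB
FROM (
    SELECT B.idA, B.idB, B.textB,
           -- restart the numbering for every idA, newest idB first
           ROW_NUMBER() OVER (PARTITION BY B.idA ORDER BY B.idB DESC) AS rn
    FROM tableB AS B
) AS ranked
WHERE rn <= 5
ORDER BY idA, idB DESC;
```

Here the ordering used for the ranking is stated inside the OVER clause, so it does not depend on the order in which the server happens to read the rows.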
select * from tablea ta, tableb tb
where ta.ida=tb.ida and tb.idb in
(select top 5 idb from tableB order by idb asc/desc)
(Use asc if you want the lowest ids, desc if you want the highest ids.)
This is less complicated and makes it easy to include more conditions.
If the TOP clause is not available in MySQL, use a LIMIT clause instead (I don't have much knowledge about MySQL).