Find duplicate records in MySQL without named column

Find duplicate records in MySQL without named column - mysql

I have a table like this:
**lead_id** **form_id** **field_number** **value**
1 2 1 Richard
1 2 2 Garriot
2 2 1 Hellen
2 2 2 Garriot
3 2 1 Richard
3 2 2 Douglas
4 2 1 Tomas
4 2 2 Anderson
Where field_number = 1 is the name and field_number = 2 is the surname.
I would like to find entries that are equal by name OR surname and group them by lead_id, so the output could be like this:
1
2
3
Any thoughts on how this can be done?

This should work and be reasonably efficient (depending upon indexes):
select distinct lead_id
from tablename as t1
where exists (
select 1
from tablename as t2
where t1.field_number = t2.field_number
and t1.value = t2.value
and t1.lead_id <> t2.lead_id
)

Select leadid from (
Select DISTINCT leadid,value from tablename
Where fieldnumber=1
Group by leadid,value
Having count(value) >1
Union all
Select DISTINCT leadid,value from tablename
Where fieldnumber=2
Group by leadid,value
Having count(value) >1
) as temp
Surely there is a faster option

Related

MySql , How to select the not repeated rows only

I have a table with 100 000 record, I want to select only the none repeated.
In another word, if the row are duplicated did not show it at all
ID Name Reslut
1 Adam 10
2 Mark 10
3 Mark 10
result
ID Name Reslut
1 Adam 10
any ideas ?

You could join a query on the table with a query that groups by the name only returns the unique names:
SELECT *
FROM mytable t
JOIN (SELECT name
FROM mytable
GROUP BY name
HAVING COUNT(*) = 1) s ON t.name = s.name

Using the same set :
ID Name Result
1 Adam 10
2 Mark 10
3 Mark 10
4 Mark 20
I'm guessing the final solution would be:
ID Name Result
1 Adam 10
4 Mark 20
Using the above query previously suggested I modified it to take the result into consideration:
SELECT t1.*
FROM myTable t1
JOIN
(
SELECT name, result
FROM myTable
GROUP BY name, result
HAVING COUNT(*) = 1
) t2
WHERE
t1.name=t2.name and
t1.result = t2.result;

MySQL count from different Tables

Table A
Status
1
1
1
2
2
3
Table B
Status
1
1
2
2
2
2
3
3
How to Get following Result Table
Result Table
Status Count
1 5
2 6
3 3
Help me to build SQL Query to get Count in Result Table.

Try this
SELECT Status, COUNT(*) FROM
(SELECT a.Status FROM TableA AS a
UNION ALL
SELECT b.Status FROM TableB AS b) UN
GROUP BY Status

SELECT STATUS, COUNT(*) FROM
(SELECT STATUS FROM TABLE1
UNION ALL
SELECT STATUS FROM TABLE2) un
GROUP BY STATUS
FIDDLE

MySQL Group where any 3 of 5 columns match

I am searching an addresses table for duplicates, using SOUNDEX to find the duplicates. This works fine, and it requires all 5 soundex columns to match in order to group
However, I want to GROUP where ANY 3 of my 5 SOUNDEX columns match.
Here is my current query:
SELECT `Address`.`id`,
SOUNDEX(`Address`.`address_company_name`) as soundex_address_company_name,
SOUNDEX(`Address`.`contact_name`) as soundex_contact_name,
SOUNDEX(`Address`.`street_address`) as soundex_street_address,
SOUNDEX(`Address`.`suburb`) as soundex_suburb,
SOUNDEX(`Address`.`city`) as soundex_city,
`Address`.`address_country_id`,
`Address`.`address_zone_id`,
`Address`.`postcode`,
COUNT(*)
FROM
`addresses` AS `Address`
WHERE
((`Address`.`address_company_name` IS NOT NULL)
OR (`Address`.`contact_name` IS NOT NULL))
GROUP BY
SOUNDEX(address_company_name),
SOUNDEX(contact_name),
SOUNDEX(street_address),
SOUNDEX(suburb),
SOUNDEX(city),
address_country_id,
address_zone_id,
postcode
HAVING
COUNT(*) > 1
I understand how to do this with multiple queries, ie: loop through each address in our database and then re-query the database for addresses which match any 3 of the 5 columns, however I am hoping to do this in fewer queries as the above query executes very quickly.
I also understand that were this possible, some records may be grouped multiple times, I don't mind if this is the case but I am unsure whether this flies in the face of MySQL logic?

You can try something like this
SELECT a.id, b.id id2, COUNT(*) no_matches
FROM
(
SELECT id,
column_id,
CASE column_id
WHEN 1 THEN SOUNDEX(address_company_name)
WHEN 2 THEN SOUNDEX(contact_name)
WHEN 3 THEN SOUNDEX(street_address)
WHEN 4 THEN SOUNDEX(suburb)
WHEN 5 THEN SOUNDEX(city)
END column_value
FROM addresses a CROSS JOIN
(
SELECT 1 column_id UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5
) i
WHERE address_company_name IS NOT NULL
OR contact_name IS NOT NULL
) a CROSS JOIN
(
SELECT id,
column_id,
CASE column_id
WHEN 1 THEN SOUNDEX(address_company_name)
WHEN 2 THEN SOUNDEX(contact_name)
WHEN 3 THEN SOUNDEX(street_address)
WHEN 4 THEN SOUNDEX(suburb)
WHEN 5 THEN SOUNDEX(city)
END column_value
FROM addresses a CROSS JOIN
(
SELECT 1 column_id UNION ALL
SELECT 2 UNION ALL
SELECT 3 UNION ALL
SELECT 4 UNION ALL
SELECT 5
) i
WHERE address_company_name IS NOT NULL
OR contact_name IS NOT NULL
) b
WHERE a.column_value = b.column_value
AND a.id < b.id
GROUP BY a.id, b.id
HAVING COUNT(*) > 2
Sample output:
| ID | ID2 | NO_MATCHES |
|----|-----|------------|
| 1 | 2 | 4 |
| 4 | 5 | 3 |
Here is SQLFiddle demo

Group dates based on variable periods

i have two tables as follows------
table-1
CalenderType periodNumber periodstartdate
1 1 01-01-2013
1 2 11-01-2013
1 3 15-01-2013
1 4 25-01-2013
2 1 01-01-2013
2 2 15-01-2013
2 3 20-01-2013
2 4 25-01-2013
table2
Incidents Date
xyz 02-01-2013
xxyyzz 03-01-2013
ccvvb 12-01-2013
vvfg 16-01-2013
x3 17-01-2013
x5 24-01-2013
Now i want to find out the number of incidents took place in a given period(the Calendar type may change on runtime like)
the query should look something like this
select .......
from ......
where CalendarType=1
which should return
CalendarType PeriodNumber Incidents
1 1 2
1 2 1
1 3 3
1 4 0
can someone suggest me an approach or any method how this can be achieved.
Note:each period is variable in size.peroid1 may have 10 days period2 may have 5 days etc.

I think this does what you want, although I don't understand how you arrived at your sample output:
select t.CalenderType, t.periodNumber, count(*) as Incidents
from Table1 t
inner join (
select t2.Date, t2.Incidents, max(t1.periodstartdate) as PeriodStartDate
from Table2 t2
inner join Table1 t1 on t2.Date >= t1.periodstartdate
where CalenderType = 1
group by t2.Date, t2.Incidents
) a on t.periodstartdate = a.PeriodStartDate
where CalenderType=1
group by t.CalenderType, t.periodNumber
SQL Fiddle Example

Try this, a bit more general solution,SQLFiddle (Thanks RedFilter for schema):
SELECT t1.CalenderType, t1.periodNumber, count(Incidents)
FROM Table1 t1, Table1 t11, Table2
WHERE
(
(
t1.CalenderType = t11.CalenderType
AND t1.periodNumber = t11.periodNumber - 1
AND Date BETWEEN t1.periodstartdate AND t11.periodstartdate
)
OR
(
t1.periodNumber = (SELECT MAX(periodNumber) FROM Table1 WHERE t1.CalenderType = CalenderType)
AND Date > t1.periodstartdate
)
)
GROUP BY t1.CalenderType, t1.periodNumber
ORDER BY t1.CalenderType, t1.periodNumber

first item used by a user

I am writing a query to grab the items that a specific user_id was the first to use. Here is some sample data -
item_id used_user_id date_used
1 1 2012-08-25
1 2 2012-08-26
1 3 2012-08-27
2 2 2012-08-27
3 1 2012-08-27
4 1 2012-08-21
4 3 2012-08-24
5 3 2012-08-23
query
select item_id as inner_item_id, ( select used_user_id
from test
where test.item_id = inner_item_id
order by date_used asc
limit 1 ) as first_to_use_it
from test
where used_user_id = 1
group by item_id
It returns the correct values
inner_item_id first_to_use_it
1 1
3 1
4 1
but the query is VERY slow on a giant table. Is there a certain index that I can use or a better query that I can write?

i can't get exactly what you mean because in your inner query you have sorted it by their used_user_id and and on your outer query you have filtered it also by their userid. Why not do this directly?
SELECT DISTINCT item_id AS inner_item_id,
used_user_id AS first_to_use_it
FROM test
WHERE used_user_id = 1
UPDATE 1
SELECT b.item_id,
b.used_user_id AS first_to_use_it
FROM
(
SELECT item_ID, MIN(date_used) minDate
FROM tableName
GROUP BY item_ID
) a
INNER JOIN tableName b
ON a.item_ID = b.item_ID AND
a.minDate = b.date_used
WHERE b.used_user_id = 1

We Keep Coding

html mysql json google-apps-script actionscript-3 ms-access google-chrome google-maps reporting-services sql-server-2008

Find duplicate records in MySQL without named column - mysql

This should work and be reasonably efficient (depending upon indexes): select distinct lead_id from tablename as t1 where exists ( select 1 from tablename as t2 where t1.field_number = t2.field_number and t1.value = t2.value and t1.lead_id <> t2.lead_id )

Select leadid from ( Select DISTINCT leadid,value from tablename Where fieldnumber=1 Group by leadid,value Having count(value) >1 Union all Select DISTINCT leadid,value from tablename Where fieldnumber=2 Group by leadid,value Having count(value) >1 ) as temp Surely there is a faster option

Related

MySql , How to select the not repeated rows only

MySQL count from different Tables

MySQL Group where any 3 of 5 columns match

Group dates based on variable periods

first item used by a user

Categories

Resources