Finding Partial Duplicates in MySQL (Keys with last 6 characters the same) - mysql

Got stuck on this one, know it should be simple.
But I have a list of unique IDs that looks like AB123456, XY584234, CDE987654. The last six characters mean something, so I need to find all rows that have the same last six characters as another (substring).
So ABCD1234 would match XYCD1234, and return both of them. I need to run this on the whole database and get all the matches, preferably with the matches next to each other.
Is that possible?

You can do this with group by and right. The following returns a list of all ids that look similar:
select right(id, 6), group_concat(id)
from tablename t
group by right(id, 6);
You might want to add:
having count(*) > 1
If you don't want singletons.
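To get the matching IDs listed next to each other, one row per ID, the grouping can be used as a filter instead. Here is a minimal sketch, assuming an in-memory SQLite database as a stand-in for MySQL and a hypothetical table called items; SQLite spells RIGHT(id, 6) as substr(id, -6).

```python
import sqlite3

# SQLite stands in for MySQL; the table name "items" is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id TEXT PRIMARY KEY)")
conn.executemany(
    "INSERT INTO items (id) VALUES (?)",
    [("ABCD1234",), ("XYCD1234",), ("AB123456",), ("CDE987654",)],
)

# Keep only IDs whose last six characters occur more than once,
# sorted so that IDs sharing a suffix sit next to each other.
rows = conn.execute(
    """
    SELECT substr(id, -6) AS suffix, id
    FROM items
    WHERE substr(id, -6) IN (
        SELECT substr(id, -6)
        FROM items
        GROUP BY substr(id, -6)
        HAVING COUNT(*) > 1
    )
    ORDER BY suffix, id
    """
).fetchall()

for suffix, item_id in rows:
    print(suffix, item_id)
```

In MySQL the same shape should work with right(id, 6) in place of substr(id, -6).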

Please use below query to get your result.
select * from tablename where right(columnname,6)= value

Related

MySQL query - Get first & last rows from groups of larger query

I have a table (names) with about 10M rows (id, first, last, etc), and I need to break it down into digestible groups by last name letter (e.g. All last names ending in A in groups of 100), and grabbing the first and last record of each group.
I'm not sure what the most efficient way is, and I'm not familiar with subqueries. I think I should count the rows by last name letter (all the A's), divide it by 100, and select the first and last row? Struggling here to get an efficient query to work.
SELECT COUNT(id)
FROM names
WHERE last REGEXP '^[A].*$' / 100
gives me count of groups
SELECT COUNT (id), min(first), max(last),
(SELECT COUNT(id)
FROM names
WHERE last REGEXP '^[A].*$' / 100)
FROM names
can't get right syntax
OK let's start with the basics. First of all, to do the pagination, you would need a query like:
SELECT last, first
FROM names
WHERE last LIKE '%a'
ORDER BY last ASC
LIMIT 0,100 /* query for first page */
The challenge you then have is how to get the very first and last name from each of these groups. Unfortunately, there is no really straightforward way to do this other than inspecting the first and last record of the result set above and then repeating the same thing for each increment of 100. You would be best served by an application-side DB library that lets you easily move your pointer within the result set. Assuming you can easily move the pointer, you could also do this with a single non-paginated query and just move your pointer to the 1st, 100th, 101st, 200th, etc. record to extract the value.
This is probably a pretty unreasonable action for your application to take every time you want to render your navigation elements, as you would need to do this 26 times. This might cause you to rethink your navigation experience altogether, or come up with a solution to reasonably cache the results for use in navigational display.
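The application-side approach (one non-paginated query, then picking the 1st, 100th, 101st, 200th, ... records) could look roughly like this. It is only a sketch: the function name page_boundaries and the sample data are hypothetical, and it assumes the rows were already fetched in ORDER BY last order.

```python
def page_boundaries(sorted_names, page_size=100):
    """Return the (first, last) entry of each consecutive page.

    Assumes sorted_names is already in the order the query returned it.
    """
    boundaries = []
    for start in range(0, len(sorted_names), page_size):
        page = sorted_names[start:start + page_size]
        boundaries.append((page[0], page[-1]))
    return boundaries


# Hypothetical stand-in for the result of the non-paginated query.
names = sorted(f"name{i:04d}" for i in range(250))
result = page_boundaries(names, page_size=100)

for first, last in result:
    print(first, "...", last)
```

The last page is simply shorter when the row count is not a multiple of the page size.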
Alternatives could include using a surrogate counter field to number all rows from 1 to x for each first-letter grouping and using mathematical means (i.e. modulus) to get the rows:
SET @x=0;
SELECT `last`, `first`
FROM (
SELECT @x:=@x+1 AS `counter`, `last`, `first`
FROM names
WHERE last LIKE '%a'
ORDER BY `last` ASC
) AS all_rows
WHERE `counter` MOD 100 = 0
OR `counter` MOD 100 = 1
Though again you would need to do this 26 times if you wanted to generate all of your second level navigation options.

SQL statement for displaying unique values

Below is the data in my table:
TABLE:
abc-ac
abc-dc
aax-i
bcs-o-dc
ddd-o-poe-dc
I need to write a query which will display only the unique entries as a result:
abc-ac
aax-i
bcs-o-dc
ddd-o-poe-dc
So basically, since the first two entries start with "abc", it should be treated as one and displayed.
Thanks.
If you're not picky about which one of the two abc-* records that it shows you can use this:
SELECT f1 FROM mytable GROUP BY substring_index(f1, '-', 1)
That substring_index() function splits the value in your field by - and returns the first part, so your records get grouped by only that first part. This is one of the few times we can take advantage of MySQL's lenient GROUP BY behavior, which allows you to leave non-aggregated fields out of the GROUP BY. Note that this only works when ONLY_FULL_GROUP_BY is not part of sql_mode; it is enabled by default from MySQL 5.7.5 on, in which case the query is rejected.
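If you would rather not depend on which row MySQL happens to pick, an aggregate such as MIN() makes the choice explicit. A minimal sketch, assuming an in-memory SQLite database as a stand-in for MySQL; SQLite has no substring_index(), so the part before the first - is taken with substr() and instr() instead, and every value is assumed to contain a -.

```python
import sqlite3

# SQLite stands in for MySQL; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (f1 TEXT)")
conn.executemany(
    "INSERT INTO mytable (f1) VALUES (?)",
    [("abc-ac",), ("abc-dc",), ("aax-i",), ("bcs-o-dc",), ("ddd-o-poe-dc",)],
)

# Group by the text before the first '-' and keep the smallest value
# in each group, so the result does not depend on row order.
rows = conn.execute(
    """
    SELECT MIN(f1) AS entry
    FROM mytable
    GROUP BY substr(f1, 1, instr(f1, '-') - 1)
    ORDER BY entry
    """
).fetchall()

unique_entries = [row[0] for row in rows]
print(unique_entries)
```

In MySQL the equivalent would presumably be MIN(f1) grouped by substring_index(f1, '-', 1), which also satisfies ONLY_FULL_GROUP_BY.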

MySQL : Count returning double the number of entries when using distinct

So I do a count like so
select distinct count(prod.id) from product as prod....
I get back 175590
I do a select like so
select distinct prod.id from product as prod.... (rest of the query is exactly the same)
and I limit it. Now if I limit the query to return anything over the half way point it returns nothing. It appears as if count is returning double the number of entries each time.
Does anyone know of anything that may be causing this?
Thanks
Tracey
The DISTINCT keyword tells MySQL to strip the duplicate rows from the result set. Because SELECT COUNT(prod.id) returns a single row (I guess this, I cannot tell for sure until I see the complete query), adding DISTINCT in front of COUNT() does not change its behaviour in any way.
What you probably want is SELECT COUNT(DISTINCT prod.id) and that's a totally different thing. It removes the duplicate values of prod.id before counting them.
Your first query is counting how many prod.id's there are.
Your second query is showing all distinct prod.id's.
This is quite different.
If you ran the second query without the DISTINCT keyword, the number of rows returned would match the count.
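The difference is easy to see on a small table. A minimal sketch, assuming an in-memory SQLite database as a stand-in for MySQL and a hypothetical product table in which every id appears twice, as it might after a join that duplicates rows:

```python
import sqlite3

# SQLite stands in for MySQL; the data is made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER)")
conn.executemany(
    "INSERT INTO product (id) VALUES (?)",
    [(1,), (1,), (2,), (2,), (3,), (3,)],
)

# DISTINCT applied to the single-row result of COUNT(): no effect.
distinct_of_count = conn.execute(
    "SELECT DISTINCT COUNT(id) FROM product"
).fetchone()[0]

# DISTINCT applied to the values before they are counted.
count_of_distinct = conn.execute(
    "SELECT COUNT(DISTINCT id) FROM product"
).fetchone()[0]

print(distinct_of_count, count_of_distinct)
```

The first number counts every row, duplicates included; the second counts each id once.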

Cut mysql data when selecting

I have a table with "unique" values. The problem is that the program that adds these values also appends one of 3 different suffixes (2 characters at the end of the value). As a result, I have three variants of each value, one per suffix. So I need to get only the unique values from the DB, somehow ignoring the last two characters. Any ideas?
Which camera_id should be returned (first, last, maximum, minimum?) if rows share one "unique" value but have different camera_ids? Try something like this:
select
LEFT(camera_name,CHAR_LENGTH(camera_name)-2), max(camera_id)
from cameras
where site_id=1
group by LEFT(camera_name,CHAR_LENGTH(camera_name)-2)
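A small runnable check of this grouping, assuming an in-memory SQLite database as a stand-in for MySQL and made-up camera names; SQLite has no LEFT(), so substr() with a computed length is used instead.

```python
import sqlite3

# SQLite stands in for MySQL; the sample rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cameras (camera_id INTEGER, camera_name TEXT, site_id INTEGER)"
)
conn.executemany(
    "INSERT INTO cameras (camera_id, camera_name, site_id) VALUES (?, ?, ?)",
    [
        (1, "lobby_a", 1),
        (2, "lobby_b", 1),
        (3, "lobby_c", 1),
        (4, "gate_a", 1),
        (5, "gate_b", 2),  # different site, filtered out below
    ],
)

# Strip the two-character suffix and keep the highest camera_id per name.
rows = conn.execute(
    """
    SELECT substr(camera_name, 1, length(camera_name) - 2) AS base_name,
           MAX(camera_id) AS camera_id
    FROM cameras
    WHERE site_id = 1
    GROUP BY substr(camera_name, 1, length(camera_name) - 2)
    ORDER BY base_name
    """
).fetchall()

print(rows)
```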
Do you want to retrieve the values with the first letter only?
SELECT DISTINCT SUBSTRING(ColumnName, 1,1) a
FROM tablename
ORDER BY a
Can you show sample records? It helps a lot when you're asking a question.

Delete rows with Sub Query?

I can't seem to get the SQL to work when using LIKE
DELETE FROM `customer_numbers`
WHERE number NOT LIKE (SELECT number FROM number_part)%
Basically, delete all the rows from the customer_numbers table if number is not contained in the number_part table.
Example:
customer_numbers.number = 0559354544 and number_part.number = 05593: it shouldn't delete it. However, if 05593 is not contained in customer_numbers.number, then delete the row from the customer_numbers table. It should match the first 5 digits against number_part.
You can't use NOT LIKE with a list (in most databases, and I'm pretty sure this is true in MySQL).
Instead, you can use a correlated subquery:
DELETE FROM `customer_numbers`
WHERE not exists (SELECT number FROM number_part
where customer_numbers.number like concat(number, '%')
)
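To check that the NOT EXISTS form keeps exactly the rows whose number starts with one of the prefixes, here is a minimal sketch using an in-memory SQLite database as a stand-in for MySQL. The sample numbers other than the one from the question are made up, and SQLite concatenates with || where MySQL uses concat().

```python
import sqlite3

# SQLite stands in for MySQL; numbers are stored as text to keep leading zeros.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_numbers (number TEXT)")
conn.execute("CREATE TABLE number_part (number TEXT)")
conn.executemany(
    "INSERT INTO customer_numbers (number) VALUES (?)",
    [("0559354544",), ("0712345678",), ("0660011223",)],
)
conn.executemany(
    "INSERT INTO number_part (number) VALUES (?)",
    [("05593",), ("06600",)],
)

# Delete every customer number that does not start with any known prefix.
cur = conn.execute(
    """
    DELETE FROM customer_numbers
    WHERE NOT EXISTS (
        SELECT 1
        FROM number_part AS p
        WHERE customer_numbers.number LIKE p.number || '%'
    )
    """
)
deleted = cur.rowcount

remaining = [
    row[0]
    for row in conn.execute("SELECT number FROM customer_numbers ORDER BY number")
]
print(deleted, remaining)
```

Running the subquery as a SELECT first is a cheap way to preview which rows would survive before deleting anything.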
Your query is rather broken, in at least two ways:
1. Your subquery returns multiple rows, but is in a place where it looks like you expect a single result.
2. You need your LIKE string to be quoted.
2 is probably easy to fix. Try:
... WHERE number LIKE CONCAT((SELECT ... LIMIT 1),'%');
1 is really your problem, though. If you run your subselect as a single command, I expect you'll get multiple rows, right? How do you expect to treat a list of numbers (let's say, 1, 2, 3, 4, 5) as part of a LIKE string?
What I'm guessing you're hoping for is something like LIKE '1%' OR LIKE '2%' OR LIKE '3%'..., etc... No?
At any rate, if you can tell us more precisely what you're trying to do, we can probably help you solve your problem better.
As your question is worded, all I can say is: It doesn't work that way.
You named your fields numbers but it seems like they are character values because of the leading 0s so I am going to assume they are character fields.
If your part number is always going to be 5 characters long then you can do the following:
DELETE FROM `customer_numbers`
WHERE substring(number,1,5) NOT IN (SELECT number FROM number_part)
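Standard SQL doesn't define LIKE on a numerical field (MySQL permits it as an extension, comparing the value as a string).
You can't use an equality or LIKE comparison against a sub-select that returns more than one row.
You must use "IN":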
DELETE FROM `customer_numbers` WHERE number NOT IN (SELECT number FROM number_part)
How about something like this:
DELETE FROM customer_numbers WHERE number NOT IN (SELECT number FROM number_part)