Query build automation - mysql

I have the following query
select count(t1.guid)
from table t1
where t1.id=X;
where X is the result set of this query:
select ID
from table t2
where t2.flags=65537;
The above query returns 84 results, all of INT datatype.
id is the primary key in table t2 and a foreign key in table t1;
guid is the primary key in table t1 and doesn't exist anywhere else.
Object O1 has a unique identifier in the table that declares all objects and their properties (t2).
GUID in table t1 uniquely identifies every instance of object O1 called by upper layers.
I want to see the number of duplicates of every object that fulfills the condition in the second query.
I suppose I should declare a variable and a function that uses it, but I have no clue where to start or how to go about it.
I solved the problem once by hand-hacking the query 84 times, but I'm looking for a more elegant and adaptable solution.

After a whole day spent on this, I figured it out.
Simply combine the two posted queries, changing the "=" operator to "in":
select count(t1.guid)
from table t1
where t1.id in
(select t2.ID
from table t2
where t2.flags=65537);
hand-hacking session avoided!
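Note that the combined query returns a single total over all matching objects. If one count per object is wanted instead, grouping by the id is one option. Here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL; the table names t1 and t2 (in place of the placeholder "table") and the sample rows are assumptions made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema mirroring the question: t2 declares objects, t1 holds instances
    CREATE TABLE t2 (id INTEGER PRIMARY KEY, flags INTEGER);
    CREATE TABLE t1 (guid INTEGER PRIMARY KEY, id INTEGER REFERENCES t2(id));
    INSERT INTO t2 VALUES (1, 65537), (2, 65537), (3, 0);
    INSERT INTO t1 VALUES (10, 1), (11, 1), (12, 2), (13, 3);
""")

# Total number of instances over all objects with the wanted flags
total = conn.execute("""
    SELECT COUNT(t1.guid) FROM t1
    WHERE t1.id IN (SELECT t2.id FROM t2 WHERE t2.flags = 65537)
""").fetchone()[0]

# Same filter, but one count per object id
per_object = conn.execute("""
    SELECT t1.id, COUNT(t1.guid) FROM t1
    WHERE t1.id IN (SELECT t2.id FROM t2 WHERE t2.flags = 65537)
    GROUP BY t1.id
    ORDER BY t1.id
""").fetchall()

print(total)       # 3
print(per_object)  # [(1, 2), (2, 1)]
```

The two SELECT statements use only standard SQL, so they should carry over to MySQL once the real table names are substituted.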

Related

Removing near identical values from mysql table

Is there a way of removing near-identical values from a table in MySQL? My table has more than 10K records, and one of the companies appears like this:
id name
123 Vianet Group Inc
5214 Vianet Group, Inc.
Running describe tablename gives this:
Field  Type          Null  Key  Default  Extra
id     int           NO    PRI           auto_increment
name   varchar(150)  NO    UNI
The company names are the same, but I would like to delete the second instance from the table so that only a single instance of the name remains. This is just one example; there are others like it. Is there a quick way of removing near-identical values from the column? Please help.
You could try using soundex to find the "near identical" values -
SELECT *
FROM tablename t1
INNER JOIN tablename t2
ON t1.id < t2.id
AND SOUNDEX(t1.name) = SOUNDEX(t2.name)
You will need to test it with some of your example "near identical" values to see what it does and does not work for. As suggested by Akina you will probably need to go for some kind of normalisation process (stored function) or the Levenshtein distance function linked by Slava.
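As a rough illustration of the normalisation idea, the sketch below groups names that become identical once case and punctuation are ignored. It is plain Python rather than a MySQL stored function, the helper names normalize and group_near_duplicates are hypothetical, and the "Acme Ltd" row is invented sample data.

```python
import re
from collections import defaultdict


def normalize(name):
    """Lowercase the name, turn punctuation into spaces, and trim the ends."""
    cleaned = re.sub(r"[^a-z0-9]+", " ", name.lower())
    return cleaned.strip()


def group_near_duplicates(rows):
    """Map each normalized name to its ids, keeping only names shared by several ids."""
    groups = defaultdict(list)
    for row_id, name in rows:
        groups[normalize(name)].append(row_id)
    return {key: sorted(ids) for key, ids in groups.items() if len(ids) > 1}


rows = [
    (123, "Vianet Group Inc"),
    (5214, "Vianet Group, Inc."),
    (7, "Acme Ltd"),  # invented row with no near-duplicate
]
duplicates = group_near_duplicates(rows)
print(duplicates)  # {'vianet group inc': [123, 5214]}
```

Within each group the lowest id could be kept and the others deleted. This only catches differences in case and punctuation; misspellings would still need something like SOUNDEX or a Levenshtein distance.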

MYSQL drop duplicates of userid

I thought I'd made the column userid in my table "userslive" unique, but somehow must have made a mistake. I've seen multiple answers to this question, but I'm afraid of messing up again so I hope someone can help me directly.
So this table has no unique columns, but I've got a column "timer" which was the timestamp of scraping the data. If possible I'd like to drop rows with the lowest "timer" with duplicate "userid" column.
It's a fairly big table at about 2 million rows (20 columns). There are about 1000 duplicate userids, which I found using this query:
SELECT userid, COUNT(userid) as cnt FROM userslive GROUP BY userid HAVING (cnt > 1);
Is this the correct syntax? I tried this on a backup table, but I suspect it is too heavy for a table this big (unless left to run for a very long time).
DELETE FROM userslive using userslive,
userslive e1
where userslive.timer < e1.timer
and userslive.userid = e1.userid
Is there a quicker way to do this?
EDIT: I should say the "timer" is not a unique column.
DELETE t1.* /* delete from a copy named t1 only */
FROM userslive t1, userslive t2
WHERE t1.userid = t2.userid
AND t1.timer < t2.timer
Logic: if, for some record (in the copy aliased as t1), we can find a record (in the copy aliased as t2) with the same userid but a greater/later timer value, the t1 record must be deleted.
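The same logic can be tried out on a small sample before touching the real table. The sketch below uses Python's built-in sqlite3 module; SQLite has no multi-table DELETE, so the self-join is written as a correlated EXISTS, which is an assumption-level translation rather than the MySQL statement itself. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE userslive (userid INTEGER, timer INTEGER);
    -- Invented sample: userid 1 and 3 are duplicated, userid 2 is not
    INSERT INTO userslive VALUES
        (1, 100), (1, 200), (2, 50), (3, 10), (3, 30), (3, 20);
""")

# Delete every row for which a later row with the same userid exists
conn.execute("""
    DELETE FROM userslive
    WHERE EXISTS (
        SELECT 1 FROM userslive AS later
        WHERE later.userid = userslive.userid
          AND later.timer > userslive.timer
    )
""")

remaining = conn.execute(
    "SELECT userid, timer FROM userslive ORDER BY userid"
).fetchall()
print(remaining)  # [(1, 200), (2, 50), (3, 30)]
```

Because timer is not unique, two rows that share the same userid and the same latest timer would both survive this delete. On a table of about 2 million rows, an index on (userid, timer) would probably help the lookup, though that has not been measured here.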
I've done this in the past and the easiest way to solve this is to add an id column and then select userid, max(new_id) into a new table and join that for the delete. Something like this.
ALTER TABLE `userslive`
ADD `new_id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY;
Now that you have a new unique column, create a new table that holds the rows to delete.
CREATE TABLE `users_to_delete`
AS
SELECT userid, new_id
FROM (
SELECT userid, max(new_id) new_id, count(*) user_rows
FROM `userslive`
GROUP BY 1
) dataset
WHERE user_rows > 1
Then use that to delete your duplicate rows by joining it into a DELETE statement like this:
DELETE `userslive` FROM `userslive`
INNER JOIN `users_to_delete` USING(userid,new_id);
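Make sure you back everything up before you delete anything, just in case. Note that, as written, this removes only the row with the highest new_id for each duplicated userid and does not look at timer, so it may need to be repeated if a userid appears more than twice.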

Execute records based on where clause from another table

I have two tables with the following structure:
Table 1
schid (NOT A PRIMARY KEY)
name
cost
type
Table 2
schid (NOT A PRIMARY KEY)
details
oldcost
I am unable to write a query that displays the records from Table 2 whose type is, say, A or B (as you can see, the type field is in Table 1). Note also that schid is not a primary key. The query I am running retrieves more records than expected, I think because of the join. Can I do this without using a join?
The query I am running:
SELECT t2.*
FROM table2 t2
JOIN table1 t1 ON t1.schid = t2.schid
WHERE t1.type = 'A'
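One possible way to avoid the extra rows is to filter with a subquery instead of a join. When schid is not unique in Table 1, a join repeats each Table 2 row once per matching Table 1 row, whereas EXISTS only checks whether a match exists. The sketch below uses Python's built-in sqlite3 module in place of MySQL and assumes that the tables are named table1 and table2 and are related through schid; the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (schid INTEGER, name TEXT, cost INTEGER, type TEXT);
    CREATE TABLE table2 (schid INTEGER, details TEXT, oldcost INTEGER);
    -- schid 1 appears twice in table1, so a plain JOIN repeats table2 rows
    INSERT INTO table1 VALUES
        (1, 'n1', 10, 'A'), (1, 'n2', 20, 'A'), (2, 'n3', 30, 'C');
    INSERT INTO table2 VALUES (1, 'd1', 5), (2, 'd2', 6);
""")

# Join version: one output row per matching (table2, table1) pair
joined = conn.execute("""
    SELECT t2.* FROM table2 t2
    JOIN table1 t1 ON t1.schid = t2.schid
    WHERE t1.type IN ('A', 'B')
""").fetchall()

# EXISTS version: each table2 row appears at most once
filtered = conn.execute("""
    SELECT t2.* FROM table2 t2
    WHERE EXISTS (
        SELECT 1 FROM table1 t1
        WHERE t1.schid = t2.schid
          AND t1.type IN ('A', 'B')
    )
""").fetchall()

print(len(joined))  # 2
print(filtered)     # [(1, 'd1', 5)]
```

If Table 2 itself contains repeated rows for the same schid, those are still returned as they are; EXISTS only stops Table 1 from multiplying them.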

Most efficient way to select data from one sql table and see if it matches data on another table in the same database

I have a database with 2 tables, both tables have around 200,000 records.
Let's call these tables TableA and TableB.
Currently I have a function that triggers a select query; this query grabs all records in TableA that match a condition. Once I have that data, I have a foreach loop that uses the data from TableA to see if it matches any record in TableB.
The problem is that this takes a while because there are so many records. I know the way I'm doing it works, because it does what it's supposed to, but the script takes a good 3 minutes to finish. Is there a faster, more efficient way to do something like this?
Thank you in advance for the help.
PS: I'm using PHP.
The most efficient way to achieve what you want is to:
1. Create a primary key column for each table (if you do not already have one). Example schema where column "id" is a unique identifier for the table row:
TableA
id firstname lastname
1 Michael Douglas
2 Michael Jackson
TableB
id table_a_id pet
1 1 cat
2 2 ape
3 1 dog
Google or search here on Stack Overflow for how to create or add a primary key for a MySQL table column. An example of creating TableA with a primary key:
CREATE TABLE `TableA` (
`id` int(11) unsigned AUTO_INCREMENT,
`firstname` varchar(100),
`lastname` varchar(100),
PRIMARY KEY (`id`)
)
2. Create an SQL-query to fetch what you need. For example:
To get all rows with at least one match in BOTH tables:
SELECT TableA.id, TableA.firstname, TableA.lastname, TableB.pet
FROM TableA
INNER JOIN TableB
ON TableA.id = TableB.table_a_id;
To instead get all rows from TableA, and only the matching rows from TableB:
SELECT TableA.id, TableA.firstname, TableA.lastname, TableB.pet
FROM TableA
LEFT JOIN TableB
ON TableA.id=TableB.table_a_id;
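To see the difference between the two joins, here is a small sketch that loads the example rows from above into an in-memory SQLite database through Python's built-in sqlite3 module (a stand-in for MySQL and for the PHP script). The third TableA row, a person with no pets, is an invented addition so that the LEFT JOIN has something extra to show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT);
    CREATE TABLE TableB (id INTEGER PRIMARY KEY, table_a_id INTEGER, pet TEXT);
    INSERT INTO TableA VALUES
        (1, 'Michael', 'Douglas'),
        (2, 'Michael', 'Jackson'),
        (3, 'Michael', 'Caine');   -- invented row without a match in TableB
    INSERT INTO TableB VALUES (1, 1, 'cat'), (2, 2, 'ape'), (3, 1, 'dog');
""")

# Only rows that have a match in both tables
inner = conn.execute("""
    SELECT TableA.id, TableA.firstname, TableA.lastname, TableB.pet
    FROM TableA
    INNER JOIN TableB ON TableA.id = TableB.table_a_id
    ORDER BY TableB.id
""").fetchall()

# Every TableA row, with NULL (None in Python) where TableB has no match
left = conn.execute("""
    SELECT TableA.id, TableA.firstname, TableA.lastname, TableB.pet
    FROM TableA
    LEFT JOIN TableB ON TableA.id = TableB.table_a_id
    ORDER BY TableA.id, TableB.id
""").fetchall()

print(len(inner))  # 3
print(left[-1])    # (3, 'Michael', 'Caine', None)
```

Either query replaces the per-row loop with a single statement, so the database does the matching. In MySQL an index on TableB.table_a_id would likely matter for tables of around 200,000 rows.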
The answer to your question ultimately depends on what you mean by "if it matches."
Let's assume, for a moment, that you have primary keys on each of these tables, TableA and TableB, and that you're NOT matching those. But that you have one or more other columns, the actual data that you're storing in each row, which you are considering for your matching. Let's call those ColA and ColB.
In that case you could use:
SELECT TableA.id, TableB.id, TableA.ColA, TableB.ColB
FROM TableA
LEFT JOIN TableB
ON (TableA.ColA = TableB.ColA)
AND (TableA.ColB = TableB.ColB);
... notice that we're using a complex expression on which to JOIN. You'd want to add an AND (TableA.XXX = TableB.XXX) for each column that you want to consider significant in your matching.
Of course I'm assuming that these tables don't share a common surrogate key (otherwise MicKri's JOIN would be simpler ... or a "NATURAL JOIN" would be even simpler still).
What you're doing, conceptually, is defining a pair of (mathematical) sets and finding the intersection between them. The complication of doing this in SQL is that real world tables often have these extra columns (surrogate primary keys, and foreign keys) which aren't attributes of the underlying entities ... but which serve to map relationships among them.
In my example I'm just showing a way to formulate a JOIN query that finds the intersection based only on the attributes that are significant for your purposes.
(By the way, the parentheses in my example are there for human legibility. They should not be required by your SQL engine ... though they don't hurt, either).
Here's one of a number of visual explanations of SQL JOINs that's handy for learning this sort of thing. An INNER JOIN is an intersection. The ON and WHERE clauses define the subsets of the data (columns and rows, respectively) which are to be related.
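The attribute-based matching described in this answer can be sketched as follows, again with Python's built-in sqlite3 module standing in for MySQL. The column names ColA and ColB come from the answer; the ids and values are invented, and an INNER JOIN is used here so that only the intersection is returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (id INTEGER PRIMARY KEY, ColA TEXT, ColB TEXT);
    CREATE TABLE TableB (id INTEGER PRIMARY KEY, ColA TEXT, ColB TEXT);
    -- Surrogate ids differ between the tables; only ColA/ColB carry meaning
    INSERT INTO TableA VALUES (1, 'x', 'y'), (2, 'x', 'z');
    INSERT INTO TableB VALUES (10, 'x', 'y'), (11, 'q', 'z');
""")

# Rows count as matching only if every significant column agrees
matches = conn.execute("""
    SELECT TableA.id, TableB.id
    FROM TableA
    INNER JOIN TableB
        ON TableA.ColA = TableB.ColA
       AND TableA.ColB = TableB.ColB
""").fetchall()

print(matches)  # [(1, 10)]
```

Row 2 of TableA shares ColA with one TableB row and ColB with another, but no single TableB row agrees on both, so it is not part of the intersection.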

Delete all not referenced (by foreign key) records from a table in mysql

I have an address table which is referenced from 6 other tables (sometimes from several of them at once). Some of those tables have around half a million records (and the address table around 750000 records). I want to run a periodic query that deletes all address records that are not referenced from any of those tables.
The following sub-queries are not an option, because the query never finishes; the scope is too big.
delete from address where address_id not in (select ...)
and not in (select ...) and not in (select ...) ...
What I was hoping was that I could use the foreign key constraint and simply delete all records for which the foreign key constraint does not stop me (because there is no reference to the record). I could not find a way to do this (or is there one?). Does anybody have another good idea to tackle this problem?
You can try either of these approaches:
DELETE
address
FROM
address
LEFT JOIN other_table ON (address.id = other_table.ref_field)
LEFT JOIN other_table2 ON (address.id = other_table2.ref_field)
WHERE
other_table.id IS NULL AND other_table2.id IS NULL
OR
DELETE
FROM address A
WHERE NOT EXISTS (
SELECT 1
FROM other_table B
WHERE B.a_key = A.id
)
I always use this:
DELETE FROM my_table WHERE id NOT IN (SELECT id FROM other_table)
I'd do this by first creating a TEMPORARY TABLE (t) that is a UNION of the IDs in the 6 referencing tables, then run:
DELETE x FROM x LEFT JOIN t USING (ID) WHERE t.ID IS NULL;
Where x is the address table.
See 'Multiple-table syntax' here:
http://dev.mysql.com/doc/refman/5.0/en/delete.html
Obviously, your temporary table should have its PRIMARY KEY on ID. It may take some time to query and join, but I can't see a way round it. It should be optimized, unlike the multiple sub-query version.
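The temporary-table approach can be sketched end to end as follows. This uses Python's built-in sqlite3 module rather than MySQL, so the final delete is written with NOT EXISTS because SQLite has no multi-table DELETE. The tables customer and supplier stand in for the 6 referencing tables, and every name and row here is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE address (address_id INTEGER PRIMARY KEY, street TEXT);
    CREATE TABLE customer (id INTEGER PRIMARY KEY, address_id INTEGER);
    CREATE TABLE supplier (id INTEGER PRIMARY KEY, address_id INTEGER);
    INSERT INTO address VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd');
    INSERT INTO customer VALUES (1, 1), (2, 1);
    INSERT INTO supplier VALUES (1, 3), (2, NULL);

    -- UNION removes duplicates, so address_id can serve as the primary key
    CREATE TEMPORARY TABLE referenced_ids (address_id INTEGER PRIMARY KEY);
    INSERT INTO referenced_ids
        SELECT address_id FROM customer WHERE address_id IS NOT NULL
        UNION
        SELECT address_id FROM supplier WHERE address_id IS NOT NULL;

    -- Remove every address that no referencing table points to
    DELETE FROM address
    WHERE NOT EXISTS (
        SELECT 1 FROM referenced_ids r
        WHERE r.address_id = address.address_id
    );
""")

remaining = [
    row[0]
    for row in conn.execute("SELECT address_id FROM address ORDER BY address_id")
]
print(remaining)  # [1, 3]
```

The NULL references are filtered out while building the temporary table. That matters most for the NOT IN variants shown earlier: if a NOT IN subquery returns even one NULL, the condition is never true and no rows are deleted.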