MYSQL using count(1) in the where clause? - mysql

I'm doing facial recognition. I have a database of people from group A and people from group B. I want to check every person in A with every person in B. I have a number of different algorithms I'm running to verify the faces. To do this I set up the following tables
comparison (
id int,
personA_id int,
personB_id int,
)
facerecScore (
id int,
score int,
comparison_id int,
algo_id int,
)
So let's say I have an eigenfaces program running as the first algorithm I'm testing. Eigenfaces would have an algo_id of 1.
What I want is a query that selects personA and personB from comparison where there is no record in the facerecScore table with an algo_id of 1 for that comparison.
In other words, if I have already run eigenfaces on these two people, I don't want to run it again. Thus I don't want to select a comparison that already has a record in the facerecScore table with an algo_id of 1.

You could try something like the following, which finds all rows in comparison that do not have a record in facerecScore for the algo_id given by the parameter :current_algo:
SELECT *
FROM comparison
WHERE id not in (
SELECT comparison_id
FROM facerecScore
WHERE algo_id = :current_algo
);
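As a minimal sketch of how the NOT IN query above behaves, the following Python script runs it against an in-memory SQLite database. The sample rows and the helper name pending_comparisons are made up for illustration; only the table and column names come from the question, and SQLite stands in for MySQL here.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comparison (id INTEGER PRIMARY KEY, personA_id INT, personB_id INT);
    CREATE TABLE facerecScore (id INTEGER PRIMARY KEY, score INT,
                               comparison_id INT NOT NULL, algo_id INT NOT NULL);
    INSERT INTO comparison VALUES (1, 10, 20), (2, 10, 21), (3, 11, 20);
    -- comparison 1 is scored by algorithms 1 and 2, comparison 2 only by algorithm 2
    INSERT INTO facerecScore VALUES (1, 87, 1, 1), (2, 45, 1, 2), (3, 60, 2, 2);
""")

def pending_comparisons(conn, algo_id):
    """Return (personA_id, personB_id) pairs that have no score yet for algo_id."""
    sql = """
        SELECT personA_id, personB_id
        FROM comparison
        WHERE id NOT IN (
            SELECT comparison_id
            FROM facerecScore
            WHERE algo_id = :current_algo
        )
        ORDER BY id
    """
    return conn.execute(sql, {"current_algo": algo_id}).fetchall()

pending_algo_1 = pending_comparisons(conn, 1)
pending_algo_2 = pending_comparisons(conn, 2)
print(pending_algo_1)
print(pending_algo_2)
```

In this sketch comparison_id is declared NOT NULL, which keeps the NOT IN subquery from ever returning a NULL.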
If you want to find, for every algo_id, all comparison rows that do not have a corresponding record in facerecScore, you could use something like the following:
SELECT *
FROM comparison, (SELECT algo_id FROM facerecScore GROUP BY algo_id) algo
WHERE id not in (
SELECT comparison_id
FROM facerecScore
WHERE algo_id = algo.algo_id
);
Put simply, this query first builds every combination of comparison row and algo_id, then removes from the result set any combination that already has a record in facerecScore.

For anyone who hates correlated subqueries (e.g. for performance reasons, if the original query wasn't optimised), it's possible with a left join and excluding any rows that were actually joined:
Update: Inspired by @penfold's "find all" answer, this is a join+union alternative if the list of algo_ids is known (and short):
select '1' algo_id, c.*
from comparison c
left join facerecScore f
on c.id = f.comparison_id
and f.algo_id = 1
where f.id is null
union all
select '2' algo_id, c.*
from comparison c
left join facerecScore f
on c.id = f.comparison_id
and f.algo_id = 2
where f.id is null
...
Or a more general one (not sure which one will perform better):
select a.algo_id, c.id
from comparison c
cross join (select algo_id from facerecScore group by algo_id) a
left join facerecScore f
on c.id = f.comparison_id
and f.algo_id = a.algo_id
where f.id is null
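To see what the general cross join plus left join version returns, here is a small sketch that runs it on an in-memory SQLite database from Python. The sample rows are invented for illustration; the query itself is the one above with an ORDER BY added so the output is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comparison (id INTEGER PRIMARY KEY, personA_id INT, personB_id INT);
    CREATE TABLE facerecScore (id INTEGER PRIMARY KEY, score INT,
                               comparison_id INT, algo_id INT);
    -- Hypothetical data: comparison 1 has scores for algorithms 1 and 2,
    -- comparison 2 has a score for algorithm 2 only.
    INSERT INTO comparison VALUES (1, 10, 20), (2, 10, 21);
    INSERT INTO facerecScore VALUES (1, 87, 1, 1), (2, 45, 1, 2), (3, 60, 2, 2);
""")

# Every (algo_id, comparison) combination, minus the ones that found a score row.
missing = conn.execute("""
    SELECT a.algo_id, c.id
    FROM comparison c
    CROSS JOIN (SELECT algo_id FROM facerecScore GROUP BY algo_id) a
    LEFT JOIN facerecScore f
           ON c.id = f.comparison_id
          AND f.algo_id = a.algo_id
    WHERE f.id IS NULL
    ORDER BY a.algo_id, c.id
""").fetchall()

print(missing)
```

One thing to keep in mind: the list of algorithms is derived from facerecScore itself, so an algorithm that has not produced any score yet will not show up in the result at all.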

You can use this; it will return the first combination that hasn't been touched. Remove the final LIMIT 1 and you will get all the combinations that haven't been touched. (LIMIT 1,1 would skip the first row and return the second one, because the first number is an offset.)
SELECT *
FROM comparison
WHERE id NOT IN (
SELECT comparison_id
FROM facerecScore
WHERE algo_id = 1)
LIMIT 1

SELECT personA_id, personB_id FROM comparison WHERE id NOT IN (SELECT comparison_id FROM facerecScore WHERE algo_id = 1);
This is probably pretty bad on efficiency with the subquery, but it should give you the right results. Possibly someone else can find a more efficient solution.

Related

SQL SELECT looking for a name like a substring in a join

How should I solve this question?
My SQL statement is:
select ename
from employee e
inner join certified c on e.eid = c.eid
inner join aircraft a on c.aid = a.aid
where cruisingrange> 1000 && a.aname not like'%b'
The answer I get with this query is Jacob and Emily which is wrong. Jacob should not be retrieved.
How should I modify or add to the SQL statement?
Script:
CREATE TABLE Employee (
Eid int,
Ename nvarchar(100),
Salary int
)
INSERT INTO Employee VALUES
(1,'Jacob',85000),(2,'Michael',55000),(3,'Emily',80000),
(4,'Ashley',110000),(5,'Daniel',80000),(6,'Olivia',70000)
CREATE TABLE Aircraft (
Aid int,
Aname nvarchar(100),
Cruisingrange int
)
INSERT INTO Aircraft VALUES
(1,'a1',800),(2,'a2b',700),(3,'a3',1000),
(4,'a4b',1100),(5,'a5',1200)
CREATE TABLE Flight (
Flno int,
Fly_from nvarchar(100),
Fly_to nvarchar(100),
Distance int,
Price int
)
INSERT INTO Flight VALUES
(1,'LA','SF',600,65000),(2,'LA','SF',700,70000),(3,'LA','SF',800,90000),
(4,'LA','NY',1000,85000),(5,'NY','LA',1100,95000)
CREATE TABLE Certified (
Eid int,
Aid int,
CertDate date
)
INSERT INTO Certified VALUES
(1, 1, '2005-01-01'),(1, 2, '2001-01-01'),(1, 3, '2000-01-01'),
(1, 5, '2000-01-01'),(2, 3, '2002-01-01'),(2, 2, '2003-01-01'),
(3, 3, '2003-01-01'),(3, 5, '2004-01-01')
I would read the "not certified on any" to be a check for the existence of a row.
If a matching row exists, then don't return the employee. Only return the employee if a matching row doesn't exist.
How would you find a matching row, to find out if an employee is "certified on any"?
There are several approaches. The two best approaches are 1) an anti-join and 2) NOT EXISTS (correlated subquery).
Example: NOT EXISTS (correlated subquery)
Of the two approaches, this is the one whose workings are easier to see.
SELECT e.ename
FROM employee e
WHERE NOT EXISTS ( SELECT 1
FROM certified c
JOIN aircraft a
ON a.aid = c.aid
WHERE a.aname LIKE '%b%'
AND c.eid = e.eid
)
Note the reference to the outer table (e.eid) in the predicate of the subquery. The subquery is "correlated" with the outer query.
Think of it this way: for every row returned by the outer query, the subquery is executed, passing in the value of e.eid. (The optimizer doesn't have to perform the operation this way; that's just an easy way of thinking about what we're asking for.)
If the subquery returns 1 or more rows, the condition EXISTS is satisfied, and returns TRUE. If the subquery returns zero rows, EXISTS evaluates to FALSE.
example of anti-join pattern
This approach can take a bit to get your brain wrapped around. Once you do "get it", it's an invaluable tool to keep handy in the SQL toolbelt.
If we use an OUTER JOIN, and pull back all rows from e along with any matching rows, then we can "exclude" the rows that found a match.
SELECT e.ename
FROM employee e
LEFT
JOIN ( SELECT c.eid
FROM certified c
JOIN aircraft a
ON a.aid = c.aid
WHERE a.aname LIKE '%b%'
GROUP BY c.eid
) b
ON b.eid = e.eid
WHERE b.eid IS NULL
The inline view query is materialized into a derived table named b. That query is intended to return the id of every employee that is certified to fly any aircraft meeting the specified criteria. Then the rows in the derived table are outer joined to e.
The "trick" is the outer join (to include both rows with matches and rows without matches) combined with the condition in the WHERE clause that excludes the rows that had matches.
I expect someone else will provide an example of how to use a NOT IN (subquery). With that approach, beware of what happens if the subquery returns any NULL values. (HINT: you will want to ensure that the subquery will never ever return a NULL.)
This demonstrates only two of several possible approaches to satisfying the "is not certified on any" criteria.
Obviously, additional joins/subqueries will need to be added to evaluate the other criteria in the query.
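As a sketch of how the remaining criteria might be combined with the NOT EXISTS check, the script below loads the sample data from the question into an in-memory SQLite database and runs one possible full query. It assumes the task is "employees certified on at least one aircraft with a cruising range above 1000 who are not certified on any aircraft with 'b' in its name"; that reading is inferred from the question's query and is not stated outright. The Flight table is left out because the query does not touch it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (Eid int, Ename nvarchar(100), Salary int);
    INSERT INTO Employee VALUES
      (1,'Jacob',85000),(2,'Michael',55000),(3,'Emily',80000),
      (4,'Ashley',110000),(5,'Daniel',80000),(6,'Olivia',70000);
    CREATE TABLE Aircraft (Aid int, Aname nvarchar(100), Cruisingrange int);
    INSERT INTO Aircraft VALUES
      (1,'a1',800),(2,'a2b',700),(3,'a3',1000),(4,'a4b',1100),(5,'a5',1200);
    CREATE TABLE Certified (Eid int, Aid int, CertDate date);
    INSERT INTO Certified VALUES
      (1, 1, '2005-01-01'),(1, 2, '2001-01-01'),(1, 3, '2000-01-01'),
      (1, 5, '2000-01-01'),(2, 3, '2002-01-01'),(2, 2, '2003-01-01'),
      (3, 3, '2003-01-01'),(3, 5, '2004-01-01');
""")

names = [row[0] for row in conn.execute("""
    SELECT e.Ename
    FROM Employee e
    WHERE EXISTS (            -- certified on some long-range aircraft
            SELECT 1
            FROM Certified c
            JOIN Aircraft a ON a.Aid = c.Aid
            WHERE c.Eid = e.Eid
              AND a.Cruisingrange > 1000 )
      AND NOT EXISTS (        -- and on no aircraft with 'b' in its name
            SELECT 1
            FROM Certified c
            JOIN Aircraft a ON a.Aid = c.Aid
            WHERE c.Eid = e.Eid
              AND a.Aname LIKE '%b%' )
    ORDER BY e.Ename
""")]

print(names)
```

With this data Jacob is excluded because he is certified on a2b, and employees without any certification are excluded by the EXISTS check, which leaves only Emily.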
You can use this query in SQL Server (maybe it will work on MySQL too):
SELECT e.Ename
FROM Employee e
INNER JOIN Certified c
ON e.Eid = c.Eid
LEFT JOIN aircraft a
ON c.Aid = a.Aid and a.Cruisingrange> 1000
LEFT JOIN aircraft a1
ON c.Aid = a1.Aid and a1.aname like'%b%'
GROUP BY e.Ename
HAVING MAX(a.Aid) IS NOT NULL AND MAX(a1.Aid) IS NULL
We join all the tables we need with the conditions we need. Then we group by Ename: MAX(a.Aid) IS NOT NULL means that the pilot is certified on an aircraft with Cruisingrange > 1000, and MAX(a1.Aid) IS NULL means that the pilot is not certified on any aircraft with b in its name.

How to adjust this query to get the result?

I have a query that matches members whose name starts with the letter 'a'; the result also contains the user's friends.
What I would like is for the user's friends to come first in the result, followed by the remaining users.
Here is the query:
SELECT `a`.`mem_id`
FROM `members` `a`
INNER JOIN
(
SELECT DISTINCT `n2`.`mem_id`
FROM `network` `n1`,`network` `n2`
WHERE `n1`.`frd_id` = `n2`.`mem_id`
AND `n1`.`mem_id`='777'
AND `n2`.`frd_id`='777'
) `b`
WHERE `a`.`mem_id`=`b`.`mem_id`
AND `a`.`profilenam` LIKE 'a%'
AND `a`.`deleted` ='N'
ORDER BY `profilenam`
As I understand your question, here is a query that will return the data you're looking for:
SELECT M.mem_id
FROM members M
LEFT OUTER JOIN (SELECT DISTINCT N1.mem_id
,N1.frd_id
FROM network N1
INNER JOIN network N2 ON N2.mem_id = N1.frd_id
AND N2.frd_id = N1.mem_id) F ON F.mem_id = M.mem_id
AND F.frd_id = '777'
WHERE M.profilenam LIKE 'a%'
AND M.deleted = 'N'
ORDER BY CASE
WHEN F.mem_id IS NOT NULL THEN 0
ELSE 1
END, M.profilenam
Some explanation about my query:
The table members has a LEFT OUTER JOIN on a subquery that returns every existing friendship (based on your query, I assumed that two members count as friends only if there is a two-way connection in the network table).
This join, with the condition F.frd_id = '777', ensures that a row is joined only if the member is a friend of the member with id '777'.
The last element is the key one you were looking for: the ORDER BY clause that puts the friends first and then the other members. The first sort key is a simple CASE expression that tests whether the join found a match; if it did, the member is a friend, otherwise not. Every friend gets the value 0 and every other member gets the value 1, so an ascending sort on this value orders the result as desired.
Hope this will help.
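The friends-first ordering can be tried out with a small sketch like the one below, which runs the same shape of query on an in-memory SQLite database from Python. The member and network rows are invented for illustration, mem_id values are stored as integers here, and the column name profilenam follows the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE members (mem_id INT, profilenam TEXT, deleted TEXT);
    CREATE TABLE network (mem_id INT, frd_id INT);
    -- Hypothetical members: aaron is deleted, bob does not match 'a%'.
    INSERT INTO members VALUES
      (1,'adam','N'),(2,'alice','N'),(3,'amy','N'),(4,'bob','N'),(5,'aaron','Y');
    -- 777 and 3 are connected both ways (friends); 777 -> 1 is one-way only.
    INSERT INTO network VALUES (777, 3), (3, 777), (777, 1);
""")

ordered_ids = [row[0] for row in conn.execute("""
    SELECT M.mem_id
    FROM members M
    LEFT OUTER JOIN (SELECT DISTINCT N1.mem_id, N1.frd_id
                     FROM network N1
                     INNER JOIN network N2 ON N2.mem_id = N1.frd_id
                                          AND N2.frd_id = N1.mem_id) F
                 ON F.mem_id = M.mem_id
                AND F.frd_id = 777
    WHERE M.profilenam LIKE 'a%'
      AND M.deleted = 'N'
    ORDER BY CASE WHEN F.mem_id IS NOT NULL THEN 0 ELSE 1 END,  -- friends first
             M.profilenam
""")]

print(ordered_ids)
```

Here amy (id 3) is the only mutual friend of 777, so she sorts ahead of adam and alice even though her name comes last alphabetically.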

Nested SQL Query, what is actually occurring at each nesting?

I was wondering how this query works:
SELECT empname FROM Employee WHERE not exists (
SELECT projid FROM Project WHERE not exists (
SELECT empid, projid FROM Assigned WHERE empid = Employee.empid and projid = Project.projid
)
)
It is supposed to return the names of all employees who are assigned to every project, and it does work; however, I am confused as to how/why it works correctly.
Schema is:
Employee(empID INT,empName VARCHAR(100),job VARCHAR(100),deptID INT,salary INT);
Assigned(empID INT,projID INT,role VARCHAR(100));
Project(projID INT,title VARCHAR(100),budget INT,funds INT);
I am new to SQL so a detailed/simple explanation would be appreciated.
When I need to try to understand what's going on, I look for the inner-most query and work my way outwards. In your case, let's start with:
SELECT empid, projid
FROM Assigned
WHERE empid = Employee.empid and projid = Project.projid
This is matching all records in the Assigned table where the empid and projid are in the previous tables (hence the Employee.empid and Project.projid).
Assume there are 5 projects in the Project table and Employee1 is assigned to each. That would return 5 records. Also assume Employee2 is assigned to 1 of those projects, thus returning 1 record.
Next look at:
SELECT projid FROM Project WHERE not exists (
...
)
Now this says for those found records in the previous query (Employee1 with 5 projects and Employee2 with 1 project), select any projid from the Project table where there aren't any matches (not exists) from the previous query. In other words, Employee1 would return no projects from this query but Employee2 would return 4 projects.
Finally, look at
SELECT empname FROM Employee WHERE not exists (
...
)
Just as with the 2nd query, for any records found in the previous query (no records to match those employees with all projects such as Employee1 and some records if the employee isn't assigned to every project such as Employee2), select any employee from the Employee table where there aren't any matches (again, not exists). In other words, Employee1 would return since no projects were returned from the previous query, and Employee2 would not return, since 1 or more projects were returned from the previous query.
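The Employee1/Employee2 walk-through can be reproduced with a short sketch. The script below builds that scenario in an in-memory SQLite database (five projects, Employee1 assigned to all of them, Employee2 to one) and runs both the middle query for a single employee and the full nested query. The data and the helper name missing_projects are illustrative assumptions; the tables are trimmed to the columns the query uses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (empID INT, empName VARCHAR(100));
    CREATE TABLE Project (projID INT, title VARCHAR(100));
    CREATE TABLE Assigned (empID INT, projID INT);
    INSERT INTO Employee VALUES (1, 'Employee1'), (2, 'Employee2');
    INSERT INTO Project VALUES (1,'P1'),(2,'P2'),(3,'P3'),(4,'P4'),(5,'P5');
    -- Employee1 is on every project, Employee2 only on project 1.
    INSERT INTO Assigned VALUES (1,1),(1,2),(1,3),(1,4),(1,5),(2,1);
""")

def missing_projects(empid):
    """Middle level: projects for which this employee has no Assigned row."""
    rows = conn.execute("""
        SELECT p.projID
        FROM Project p
        WHERE NOT EXISTS (SELECT 1 FROM Assigned a
                          WHERE a.empID = ? AND a.projID = p.projID)
        ORDER BY p.projID
    """, (empid,))
    return [r[0] for r in rows]

# Outer level: keep employees for whom the middle level returns nothing.
on_every_project = [r[0] for r in conn.execute("""
    SELECT empName FROM Employee WHERE NOT EXISTS (
        SELECT projID FROM Project WHERE NOT EXISTS (
            SELECT empID, projID FROM Assigned
            WHERE empID = Employee.empID AND projID = Project.projID))
""")]

print(missing_projects(1), missing_projects(2), on_every_project)
```

Employee2 has four missing projects, so the outer NOT EXISTS is false for that row; Employee1 has none, so that row is returned.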
Hope this helps. Here's some additional information about EXISTS:
http://dev.mysql.com/doc/refman/5.0/en/exists-and-not-exists-subqueries.html
And from that article:
What kind of store is present in all cities?
SELECT DISTINCT store_type FROM stores s1 WHERE NOT EXISTS (
SELECT * FROM cities WHERE NOT EXISTS (
SELECT * FROM cities_stores
WHERE cities_stores.city = cities.city AND cities_stores.store_type = stores.store_type));
The last example is a double-nested NOT EXISTS query. That is, it has
a NOT EXISTS clause within a NOT EXISTS clause. Formally, it answers
the question “does a city exist with a store that is not in Stores”?
But it is easier to say that a nested NOT EXISTS answers the question
“is x TRUE for all y?”
Good luck.
A NOT EXISTS (subquery) predicate will return TRUE when the resultset from the subquery has no rows. It will return FALSE when a matching row is found.
Essentially, the query is asking
for each row in Employee... check each row from the Project table, to see if there is a row in the Assigned table that has an empid matching the empid on the Employee row and a projid matching the projid on the Project row.
The row from Employee will be returned only if no Project row is found that lacks such a matching Assigned row.
Note that the expressions in the SELECT list of the subquery are not important; all that is being checked is whether that subquery returns one (or more) rows or not. (Normally, we use a literal 1 in the SELECT list; that reminds us that what we are checking is whether a row is found or not.)
I would typically write that query in a style that looks like this:
SELECT e.empname
FROM Employee e
WHERE NOT EXISTS
( SELECT 1
FROM Project p
WHERE NOT EXISTS
( SELECT 1
FROM Assigned a
WHERE a.empid = e.empid
AND a.projid = p.projid
)
)
(I read the "SELECT 1" as "select one row".)
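A simpler, single-level question ("which employees are not assigned to any project?") can be written with NOT IN. Note that this is not the same resultset as the double NOT EXISTS query above, which returns the employees assigned to every project, and that the NOT IN form is usually much less efficient: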
SELECT e.empname
FROM Employee e
WHERE e.empid NOT IN
( SELECT a.empid
FROM Assigned a
JOIN Project p
ON a.projid = p.projid
WHERE a.empid IS NOT NULL
GROUP
BY a.empid
)
The NOT IN query can be a little easier to understand, because you can run that subquery and see that it returns something. (What can be kind of confusing about the NOT EXISTS subquery is that it doesn't matter what expressions are returned in the SELECT list; what matters is whether a row is returned or not.) There are some "gotchas" with the NOT IN subquery besides really bad performance; you need to be careful to ensure that the subquery does not return a NULL value, because then the NOT IN (NULL,...) will never return true.
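The NULL gotcha is easy to demonstrate. The sketch below uses a tiny made-up data set in an in-memory SQLite database, where one Assigned row has a NULL empid, and compares a NOT IN subquery with and without the IS NOT NULL filter, plus the NOT EXISTS form. SQLite applies the same three-valued logic to NOT IN here; the row values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (empid INT, empname VARCHAR(100));
    CREATE TABLE Assigned (empid INT, projid INT);
    INSERT INTO Employee VALUES (1, 'Ann'), (2, 'Bob');
    -- Ann is assigned; the second row has a NULL empid.
    INSERT INTO Assigned VALUES (1, 100), (NULL, 101);
""")

def names(sql):
    return [row[0] for row in conn.execute(sql)]

# The subquery returns (1, NULL), so "2 NOT IN (1, NULL)" is unknown, not true.
unfiltered = names("""
    SELECT empname FROM Employee
    WHERE empid NOT IN (SELECT empid FROM Assigned)
""")

# Filtering the NULLs out of the subquery restores the expected result.
filtered = names("""
    SELECT empname FROM Employee
    WHERE empid NOT IN (SELECT empid FROM Assigned WHERE empid IS NOT NULL)
""")

# NOT EXISTS is not affected by the NULL row.
not_exists = names("""
    SELECT e.empname FROM Employee e
    WHERE NOT EXISTS (SELECT 1 FROM Assigned a WHERE a.empid = e.empid)
""")

print(unfiltered, filtered, not_exists)
```

The unfiltered NOT IN returns no rows at all, even though Bob has no assignment, while the other two forms return Bob.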
An equivalent resultset can be returned using an anti-join pattern as well:
SELECT e.empname
FROM Employee e
LEFT
JOIN ( SELECT a.empid
FROM Assigned a
JOIN Project p
ON a.projid = p.projid
WHERE a.empid IS NOT NULL
GROUP
BY a.empid
) o
ON o.empid = e.empid
WHERE o.empid IS NULL
In that query, we are looking for "matches" on empid. The LEFT keyword tells MySQL to also return any rows from Employee (the table on the left side of the JOIN) which do not have a match. For those rows, a NULL value is returned in place of the values of the columns that would have been returned if there had been a matching row. The "trick" is then to throw out all the rows that matched. We do that by checking for a NULL in a column that would not be NULL if there had been a match.
If I were going to write that single-level query (employees not assigned to any project) using a NOT EXISTS predicate, I would probably favor writing it like this:
SELECT e.empname
FROM Employee e
WHERE NOT EXISTS
( SELECT 1
FROM Assigned a
JOIN Project p
ON a.projid = p.projid
WHERE a.empid = e.empid
)

Is this possible in a single mysql query?

I have two tables with a one to many relationship, offer and offer_rows
I want to fetch multiple offers with their content rows. That on its own is not difficult; I just use an
INNER JOIN on offer.offer_id = offer_rows.offer_id
However, the offer_rows table contains a field called revision and the query needs to always fetch all the rows with the highest revision number. Is this possible with a single query?
I realize I could change the database design, by adding a third table called offer_revision, I could join this table with a select condition to fetch the latest revision number and then connect this table to the rows. This however would take considerable refactoring so I only want to do it if I have to.
I also want to do this with a direct query - no stored procedures.
Of course it is possible:
SELECT o.*, r.revision, r.something_else
FROM offer o,
offer_rows r
WHERE o.offer_id = r.offer_id
AND r.revision = (
SELECT max(revision)
FROM offer_rows
WHERE offer_id = o.offer_id
)
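To illustrate the correlated MAX(revision) subquery, here is a sketch that runs it on an in-memory SQLite database from Python. The offer and offer_rows contents, and the row_id, title, and description columns, are hypothetical; only offer_id and revision come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE offer (offer_id INT, title TEXT);
    CREATE TABLE offer_rows (row_id INT, offer_id INT, revision INT, description TEXT);
    INSERT INTO offer VALUES (1, 'Offer A'), (2, 'Offer B');
    -- Offer 1 has an old revision 1 and two rows in revision 2; offer 2 has one row.
    INSERT INTO offer_rows VALUES
      (1, 1, 1, 'a-old'),
      (2, 1, 2, 'a-new-1'),
      (3, 1, 2, 'a-new-2'),
      (4, 2, 1, 'b-only');
""")

latest_rows = conn.execute("""
    SELECT o.offer_id, r.revision, r.description
    FROM offer o,
         offer_rows r
    WHERE o.offer_id = r.offer_id
      AND r.revision = (
            SELECT MAX(revision)      -- highest revision for this offer
            FROM offer_rows
            WHERE offer_id = o.offer_id
          )
    ORDER BY o.offer_id, r.row_id
""").fetchall()

print(latest_rows)
```

All rows that share the highest revision number of an offer are returned, and the older revision of offer 1 is dropped.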
If you only need the highest revision number for each offer, rather than the rows that carry it, you can aggregate offer_rows with MAX(revision) and JOIN the offer table (no nested query is required):
SELECT o.offer_id, MAX(r.revision) AS latest_revision
FROM offer_rows r
INNER JOIN offer o USING (offer_id)
GROUP BY o.offer_id
Yes, this is possible with a single query. You could use a subquery in the WHERE clause that gets the highest revision.
I've used the following comparison to get a latest version entry:
AND `outer`.`version` = (
SELECT MAX( `inner`.`version` )
FROM `content` `inner`
WHERE `inner`.`id` = `outer`.`id`
AND `inner`.`language` = `outer`.`language`
)

How do you do a mysql join where the join may come from one or another table

This is the query that I am using to match up a members name to an id.
SELECT eve_member_list.`characterID` ,eve_member_list.`name`
FROM `eve_mining_op_members`
INNER JOIN eve_member_list ON eve_mining_op_members.characterID = eve_member_list.characterID
WHERE op_id = '20110821105414-741653460';
My issue is that I have two different member lists: one lists members that belong to our group, and the second lists members that do not belong to our group.
How do I write this query so that, if a member is not found in the eve_member_list table, it looks in the eve_nonmember_member_list table and matches eve_mining_op_members.characterID to the charName?
I apologize in advance if the question is hard to read as I am not quite sure how to properly ask what it is that I am looking for.
Change your INNER JOIN to a LEFT JOIN and join with both the tables. Use IFNULL to select the name if it appears in the first table, but if it is NULL (because no match was found) then it will use the value found from the second table.
SELECT
characterID,
IFNULL(eve_member_list.name, eve_nonmember_member_list.charName) AS name
FROM eve_mining_op_members
LEFT JOIN eve_member_list USING (characterID)
LEFT JOIN eve_nonmember_member_list USING (characterID)
WHERE op_id = '20110821105414-741653460';
If you have control of the database design you should also consider whether it is possible to redesign your database so that both members and non-members are stored in the same table. You could for example use a boolean to specify whether or not they are members. Or you could create a person table and have information that is only relevant to members stored in a separate memberinfo table with a nullable foreign key from the person table to the memberinfo table. This will make queries relating to both members and non-members easier to write and perform better.
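The LEFT JOIN plus IFNULL fallback can be checked with a sketch like this one, which runs against an in-memory SQLite database from Python. The sample characters and the op_id value 'op-1' are invented; the table and column names follow the question. Explicit ON clauses and table aliases are used here instead of USING, which is a stylistic choice for the sketch rather than a requirement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE eve_mining_op_members (op_id TEXT, characterID INT);
    CREATE TABLE eve_member_list (characterID INT, name TEXT);
    CREATE TABLE eve_nonmember_member_list (characterID INT, charName TEXT);
    -- Hypothetical data: character 1 is a member, 2 is a non-member,
    -- 3 appears in neither list.
    INSERT INTO eve_mining_op_members VALUES ('op-1', 1), ('op-1', 2), ('op-1', 3);
    INSERT INTO eve_member_list VALUES (1, 'Alpha');
    INSERT INTO eve_nonmember_member_list VALUES (2, 'Beta');
""")

resolved = conn.execute("""
    SELECT m.characterID,
           IFNULL(ml.name, nl.charName) AS name   -- member name first, then non-member
    FROM eve_mining_op_members m
    LEFT JOIN eve_member_list ml
           ON ml.characterID = m.characterID
    LEFT JOIN eve_nonmember_member_list nl
           ON nl.characterID = m.characterID
    WHERE m.op_id = ?
    ORDER BY m.characterID
""", ("op-1",)).fetchall()

print(resolved)
```

A character found in neither list still appears in the result, with a NULL name (None in Python), because both joins are outer joins.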
You could try a left join on both tables, and then keep only the rows where a name was found in either one:
select x.characterID, coalesce(y1.name, y2.charName) as name
from eve_mining_op_members as x
left join eve_member_list as y1 on x.characterID = y1.characterID
left join eve_nonmember_member_list as y2 on x.characterID = y2.characterID
where coalesce(y1.name, y2.charName) is not null
Or, you could try the same thing with a union and inner joins (assuming the two name columns have compatible types):
select x.characterID, y1.name
from eve_mining_op_members as x
inner join eve_member_list as y1 on x.characterID = y1.characterID
UNION
select x.characterID, y2.charName
from eve_mining_op_members as x
inner join eve_nonmember_member_list as y2 on x.characterID = y2.characterID
You can throw in your op_id condition where you see fit (sorry, I didn't really understand where it came from). Good luck!
You have several options, but by using a UNION between the eve_member_list and eve_nonmember_member_list tables and JOINing the result of this UNION with your original eve_mining_op_members table, you will get your required results.
SQL Statement
SELECT lst.`characterID`
, lst.`name`
FROM `eve_mining_op_members` AS m
INNER JOIN (
SELECT characterID
, name
FROM eve_member_list
UNION ALL
SELECT characterID
, charName
FROM eve_nonmember_member_list
) AS lst ON lst.characterID = m.characterID
WHERE op_id = '20110821105414-741653460';