Count clause -> incorrect count value - mysql

We have an issue using a counting combination with inner/left join that we cannot figure out how to solve.
We would appreciate any help on the matter!
We have 4 tables in the example:
1: providers: Including 2 providers
2: providers_categories: Including 2 categories. 1 provider can be in multiple categories (this seems to be causing the issue)
3: connections_providers_categories: connecting the providers to the categories
4: reviews_providers: currently we have included 1 rating per provider
Goal: to output the review count from the table reviews_providers.
Issue: Provider 2 is included in 2 categories. The review count is doubled: 1 count for each provider category: A total of 2 reviews are printed even though only 1 entry exists.
Thank you!
Code:
SELECT prov.id, prov.title, prov_cat.title AS category, AVG(reviews.rating) AS rating, COUNT(reviews.rating) AS count
FROM connections_providers_categories conn
INNER JOIN providers_categories prov_cat
ON prov_cat.id = conn.category_id
LEFT JOIN reviews_providers reviews
ON reviews.provider_id = conn.provider_id
INNER JOIN providers prov
ON prov.id = conn.provider_id
GROUP BY prov.id
ORDER BY prov.title ASC
CREATE TABLE `connections_providers_categories` (
`provider_id` int(4) UNSIGNED NOT NULL,
`category_id` int(4) UNSIGNED NOT NULL
) ENGINE=MyISAM;
INSERT INTO `connections_providers_categories` (`provider_id`, `category_id`) VALUES
(1, 1),
(2, 1),
(2, 2);
CREATE TABLE `providers` (
`id` int(4) UNSIGNED NOT NULL AUTO_INCREMENT,
`title` varchar(100) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=3;
INSERT INTO `providers` (`id`, `title`) VALUES
(1, 'Provider 1'),
(2, 'Provider 2');
CREATE TABLE `providers_categories` (
`id` int(4) UNSIGNED NOT NULL AUTO_INCREMENT,
`title` varchar(60) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=3;
INSERT INTO `providers_categories` (`id`, `title`) VALUES
(1, 'Category 1'),
(2, 'Category 2');
CREATE TABLE `reviews_providers` (
`id` int(4) UNSIGNED NOT NULL AUTO_INCREMENT,
`provider_id` int(4) UNSIGNED NOT NULL,
`rating` enum('1','2','3','4','5') DEFAULT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM AUTO_INCREMENT=3;
INSERT INTO `reviews_providers` (`id`, `provider_id`, `rating`) VALUES
(1, 2, '5'),
(2, 1, '3');
Our question might resemble the following question, but we could not find the answer there or confirm that it is the same case, even though both questions involve multiplied counts: count is multiplied after adding left join
It seems we might need a subquery, but we are not sure how to do this.
Any suggestions?
Thanks!

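You can use a subquery to get your result: aggregate the reviews per provider first, then join that result, so the extra category rows no longer multiply the count.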
SELECT prov.id, prov.title, GROUP_CONCAT(prov_cat.title) AS category, reviews.rating , reviews.count
FROM connections_providers_categories conn
INNER JOIN providers_categories prov_cat
ON prov_cat.id = conn.category_id
LEFT JOIN (SELECT provider_id, AVG(rating) AS rating, COUNT(provider_id) AS count FROM reviews_providers GROUP BY provider_id) reviews
ON reviews.provider_id = conn.provider_id
INNER JOIN providers prov
ON prov.id = conn.provider_id
GROUP BY prov.id,prov.title
ORDER BY prov.title ASC
id | title | category | rating | count
-: | :--------- | :-------------------- | -----: | ----:
1 | Provider 1 | Category 1 | 3 | 1
2 | Provider 2 | Category 2,Category 1 | 5 | 1
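The double count and the pre-aggregation fix can be reproduced in a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL, keeps only the columns needed for the count, and uses a simplified version of the fixed query that leaves the category list out.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE providers (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE connections_providers_categories (provider_id INTEGER, category_id INTEGER);
CREATE TABLE reviews_providers (id INTEGER PRIMARY KEY, provider_id INTEGER, rating INTEGER);
INSERT INTO providers VALUES (1, 'Provider 1'), (2, 'Provider 2');
INSERT INTO connections_providers_categories VALUES (1, 1), (2, 1), (2, 2);
INSERT INTO reviews_providers VALUES (1, 2, 5), (2, 1, 3);
""")

# Reviews joined directly: Provider 2 has two category rows,
# so its single review appears twice before COUNT runs.
naive = db.execute("""
SELECT prov.id, COUNT(reviews.rating)
FROM connections_providers_categories conn
LEFT JOIN reviews_providers reviews ON reviews.provider_id = conn.provider_id
INNER JOIN providers prov ON prov.id = conn.provider_id
GROUP BY prov.id
ORDER BY prov.id
""").fetchall()

# Reviews aggregated per provider first, then joined once per provider.
fixed = db.execute("""
SELECT prov.id, COALESCE(r.cnt, 0)
FROM providers prov
LEFT JOIN (SELECT provider_id, COUNT(*) AS cnt
           FROM reviews_providers
           GROUP BY provider_id) r ON r.provider_id = prov.id
ORDER BY prov.id
""").fetchall()

print(naive)  # Provider 2 is counted twice here
print(fixed)  # one review per provider
```

The derived table has at most one row per provider, so joining it cannot multiply the count regardless of how many categories a provider belongs to.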

Related

mysql GROUP BY idea

I have the following scenario: there is one table with books and two pairs of tables (HD/IT) holding Sales Order and Purchase Order transactions, connected through the Sales Order id.
The table structure follows:
CREATE TABLE `books` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`isbn` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
`it_id` int(11) NOT NULL,
`kind` tinyint(4) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `books` (`id`, `isbn`, `it_id`, `kind`) VALUES
(1, '12345', 1, 1),
(2, '12345', 1, 2),
(3, '67890', 2, 1),
(4, '1111111', 2, 2);
CREATE TABLE `porders_hd` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`dt` date NOT NULL,
`so_id` int(11) DEFAULT NULL,
`customer` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `porders_hd` (`id`, `dt`, `so_id`, `customer`) VALUES
(1, '2017-07-02', 1, 1),
(2, '2017-08-03', NULL, 3);
CREATE TABLE `porders_it` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`hd_id` int(11) NOT NULL,
`isbn` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
`dscr` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
`qty` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `porders_it` (`id`, `hd_id`, `isbn`, `dscr`, `qty`) VALUES
(1, 1, '12345', 'Book 1', 1),
(2, 2, '1111111', 'Book 2', 1);
CREATE TABLE `sorders_hd` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`dt` date NOT NULL,
`customer` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `sorders_hd` (`id`, `dt`, `customer`) VALUES
(1, '2017-07-01', 1),
(2, '2017-08-01', 2);
CREATE TABLE `sorders_it` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`hd_id` int(11) NOT NULL,
`isbn` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
`dscr` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
`qty` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `sorders_it` (`id`, `hd_id`, `isbn`, `dscr`, `qty`) VALUES
(1, 1, '12345', 'Book 1', 1),
(2, 2, '67890', 'Book 2', 1);
In summary there are:
* 1 Sales Order (#1) also existing in the Purchase Order (#1)
* 1 Sales Order (#2) still pending
* 1 Purchase Order (#2) created without a Sales Order
I want to be able to grab all Sales and Purchase Orders per book's isbn, and a connected SO and PO must be on the same line. The output must look like the one below:
so_id so_date po_id po_date isbn dscr
NULL NULL 2 2017-08-03 1111111 Book 2
1 2017-07-01 1 2017-07-02 12345 Book 1
2 2017-08-01 NULL NULL 67890 Book 3
I tried to grab the rows using a query like the one below:
SELECT
GROUP_CONCAT(so_id) so_id,
GROUP_CONCAT(so_date) so_date,
GROUP_CONCAT(po_id) po_id,
GROUP_CONCAT(po_date) po_date,
isbn,
dscr
FROM (
SELECT
hd.so_id so_id,
NULL so_date,
hd.id po_id,
hd.dt po_date,
bk.isbn,
it.dscr
FROM porders_hd hd,
porders_it it,
books bk
WHERE it.hd_id = hd.id
AND bk.isbn = it.isbn
AND kind = 2
UNION
SELECT
hd.id so_id,
hd.dt so_date,
NULL po_id,
NULL po_date,
bk.isbn,
it.dscr
FROM sorders_hd hd,
sorders_it it,
books bk
WHERE it.hd_id = hd.id
AND bk.isbn = it.isbn
AND kind = 1
) as table1
GROUP BY isbn, so_id, po_id
but since there is info missing I get the following result:
so_id so_date po_id po_date isbn dscr
NULL NULL 2 2017-08-03 1111111 Book 2
1 2017-07-01 NULL NULL 12345 Book 1
1 NULL 1 2017-07-02 12345 Book 1
2 2017-08-01 NULL NULL 67890 Book 3
Any ideas how I can achieve this?
I think this is what you're after, but I can't figure out the role of kind from your code. Here is a query that, for each book, gets the associated PO line item, finds the corresponding SO line item, and joins the header rows so the dates are available. Note my assumption that a sales order can't exist without a corresponding PO.
SELECT books.isbn, porders_it.dscr, sorders_hd.id, sorders_hd.dt, porders_hd.id, porders_hd.dt
FROM books
join porders_it on porders_it.isbn = books.isbn
join porders_hd on porders_hd.id = porders_it.hd_id
left outer join sorders_it on sorders_it.hd_id=porders_hd.so_id and sorders_it.isbn = porders_it.isbn
left outer join sorders_hd on sorders_hd.id = sorders_it.hd_id
You could normalize your tables so that dscr need not be repeated, and also use books.id in the other tables rather than isbn.
I'm adding a new answer because the previous one and its comments are still illustrative. Based on that discussion, this requires a FULL OUTER JOIN, which must be emulated with UNION ALL in MySQL (which may be what the OP was attempting originally).
Here is my new code, taking that into account:
SELECT sorders_hd.id as so_id, sorders_hd.dt as so_dt,
porders_hd.id as po_id, porders_hd.dt as po_dt,
books.isbn, porders_it.dscr
from books
left outer join porders_it on porders_it.isbn=books.isbn
join porders_hd on porders_hd.id=porders_it.hd_id
left outer join sorders_it on sorders_it.isbn=books.isbn and sorders_it.hd_id=porders_hd.so_id
left outer join sorders_hd on sorders_hd.id=sorders_it.hd_id
where books.kind=2
UNION ALL
SELECT sorders_hd.id as so_id, sorders_hd.dt as so_dt,
porders_hd.id as po_id, porders_hd.dt as po_dt,
books.isbn, sorders_it.dscr
from books
left outer join sorders_it on sorders_it.isbn=books.isbn
join sorders_hd on sorders_hd.id=sorders_it.hd_id
left outer join porders_it on porders_it.isbn=books.isbn
left outer join porders_hd on porders_hd.id=porders_it.hd_id and porders_hd.so_id=sorders_hd.id
where porders_hd.id is null and books.kind=1;
The output result is:
so_id so_dt po_id po_dt isbn dscr
1 2017-07-01 1 2017-07-02 12345 Book 1
(null) (null) 2 2017-08-03 1111111 Book 2
2 2017-08-01 (null) (null) 67890 Book 2
The "trick" is to use UNION ALL with one of the two queries excluding the records that link both sides (to get the 'right' side of the FULL OUTER JOIN).
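The UNION ALL emulation is easier to see on a reduced example. The sketch below assumes Python's built-in sqlite3 module as a stand-in for MySQL and two hypothetical tables, so and po, which collapse the header and item tables into one row per order.

```python
import sqlite3

# Hypothetical reduced schema: one row per sales order (so) and purchase order (po).
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE so (id INTEGER PRIMARY KEY, dt TEXT);
CREATE TABLE po (id INTEGER PRIMARY KEY, dt TEXT, so_id INTEGER);
INSERT INTO so VALUES (1, '2017-07-01'), (2, '2017-08-01');
INSERT INTO po VALUES (1, '2017-07-02', 1), (2, '2017-08-03', NULL);
""")

rows = db.execute("""
-- Left side: every purchase order, with its sales order when one exists.
SELECT so.id AS so_id, so.dt AS so_dt, po.id AS po_id, po.dt AS po_dt
FROM po
LEFT JOIN so ON so.id = po.so_id
UNION ALL
-- Right side: only the sales orders that no purchase order links to,
-- so matched pairs are not emitted a second time.
SELECT so.id, so.dt, po.id, po.dt
FROM so
LEFT JOIN po ON po.so_id = so.id
WHERE po.id IS NULL
""").fetchall()

for row in rows:
    print(row)
```

Without the WHERE po.id IS NULL filter in the second branch, the matched pair (SO 1, PO 1) would be returned by both branches.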
+1 to OP for providing the DDL and sample data!
I agree that the data model could be reworked, and could be normalized. The existing model still has at least the problem of a duplicate book record when a sales order and purchase order match (one of them is ignored). It seems to me that one improvement would be to have a master book list and include the id (or isbn if that is the primary key) from that table in porders_it and sorders_it, and eliminate the current books table.

Joining table with min(amount) does not work

I have 3 tables, but data is only fetched from 2 of them.
I'm trying to get the lowest bids for selected items and display the name of the user with the lowest bid.
Currently the query works until we display the user name: it shows the wrong user name, which does not match the bid.
Below is working example of structure and query.
MySQL 5.6 Schema Setup:
CREATE TABLE `bid` (
`id` int(11) NOT NULL,
`amount` float NOT NULL,
`user_id` int(11) NOT NULL,
`item_id` int(11) NOT NULL
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=latin1;
INSERT INTO `bid` (`id`, `amount`, `user_id`, `item_id`) VALUES
(1, 9, 1, 1),
(2, 5, 2, 1),
(3, 4, 3, 1),
(4, 3, 4, 1),
(5, 4, 2, 2),
(6, 22, 5, 1);
-- --------------------------------------------------------
CREATE TABLE `item` (
`id` int(11) NOT NULL,
`name` varchar(100) NOT NULL
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=latin1;
INSERT INTO `item` (`id`, `name`) VALUES
(1, 'chair'),
(2, 'sofa'),
(3, 'table'),
(4, 'box');
-- --------------------------------------------------------
CREATE TABLE `user` (
`id` int(11) NOT NULL,
`name` varchar(100) NOT NULL
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=latin1;
INSERT INTO `user` (`id`, `name`) VALUES
(1, 'James'),
(2, 'Don'),
(3, 'Hipes'),
(4, 'Sam'),
(5, 'Zakam');
ALTER TABLE `bid`
ADD PRIMARY KEY (`id`);
ALTER TABLE `item`
ADD PRIMARY KEY (`id`);
ALTER TABLE `user`
ADD PRIMARY KEY (`id`);
ALTER TABLE `bid`
MODIFY `id` int(11) NOT NULL AUTO_INCREMENT,AUTO_INCREMENT=7;
ALTER TABLE `item`
MODIFY `id` int(11) NOT NULL AUTO_INCREMENT,AUTO_INCREMENT=5;
ALTER TABLE `user`
MODIFY `id` int(11) NOT NULL AUTO_INCREMENT,AUTO_INCREMENT=5;
Query 1:
SELECT b.id, b.item_id, MIN(b.amount) as amount, b.user_id, p.name
FROM bid b
LEFT JOIN user p ON p.id = b.user_id
WHERE b.item_id in (1, 2)
GROUP BY b.item_id
ORDER BY b.amount, b.item_id
Results:
| id | item_id | amount | user_id | name |
|----|---------|--------|---------|-------|
| 5 | 2 | 4 | 2 | Don |
| 1 | 1 | 3 | 1 | James |
Explanation of query:
Get the selected items (1, 2).
Get the lowest bid for those items - MIN(b.amount)
Display the name of the user who placed the bid - LEFT JOIN user p on p.id = b.user_id (this is not working or I'm doing something wrong)
[Note] I can't use a sub-query; I'm doing this in Doctrine 2 (PHP code), which limits MySQL sub-queries.
No, you are not necessarily fetching the user_id who has given the bid. You group by item_id, so you get one result row per item. So you are aggregating and for every column you say what value you want to see for that item. E.g.:
MIN(b.amount) - the minimum amount of the item's records
MAX(b.amount) - the maximum amount of the item's records
AVG(b.amount) - the average amount of the item's records
b.amount - one of the amounts of the item's records, arbitrarily chosen (as there are many amounts and you don't specify which you want to see, the DBMS simply chooses one of them)
This said, b.user_id isn't necessarily the user who made the lowest bid, but just an arbitrary one of the users who made a bid.
Instead, find the minimum bids and join again with your bid table to access the related records:
select bid.id, bid.item_id, bid.amount, user.id as user_id, user.name
from bid
join
(
select item_id, min(amount) as amount
from bid
group by item_id
) as min_bid on min_bid.item_id = bid.item_id and min_bid.amount = bid.amount
join user on user.id = bid.user_id
order by bid.amount, bid.item_id;
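The join-back approach can be checked against the sample data with a short sketch, assuming Python's built-in sqlite3 module as a stand-in for MySQL. Because the question rules out subqueries, the sketch also includes a subquery-free variant based on a self LEFT JOIN (an anti-join); that variant is an addition for illustration, not part of the answer above, and whether it maps cleanly onto Doctrine is not verified here.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE bid (id INTEGER PRIMARY KEY, amount REAL, user_id INTEGER, item_id INTEGER);
CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO bid VALUES (1, 9, 1, 1), (2, 5, 2, 1), (3, 4, 3, 1),
                       (4, 3, 4, 1), (5, 4, 2, 2), (6, 22, 5, 1);
INSERT INTO user VALUES (1, 'James'), (2, 'Don'), (3, 'Hipes'), (4, 'Sam'), (5, 'Zakam');
""")

# Variant 1: compute the minimum per item, then join back to bid
# to pick up the row (and user) that actually holds that minimum.
derived = db.execute("""
SELECT bid.item_id, bid.amount, user.name
FROM bid
JOIN (SELECT item_id, MIN(amount) AS amount
      FROM bid
      GROUP BY item_id) AS min_bid
  ON min_bid.item_id = bid.item_id AND min_bid.amount = bid.amount
JOIN user ON user.id = bid.user_id
WHERE bid.item_id IN (1, 2)
ORDER BY bid.item_id
""").fetchall()

# Variant 2 (no subquery): keep a bid only if no lower bid exists for the same item.
self_join = db.execute("""
SELECT b1.item_id, b1.amount, u.name
FROM bid b1
LEFT JOIN bid b2 ON b2.item_id = b1.item_id AND b2.amount < b1.amount
JOIN user u ON u.id = b1.user_id
WHERE b2.id IS NULL AND b1.item_id IN (1, 2)
ORDER BY b1.item_id
""").fetchall()

print(derived)
print(self_join)
```

Both variants return every bid that ties for the minimum, so an item with two equal lowest bids yields two rows.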
You can solve this using a subquery. I am not 100% sure if this is the most efficient way, but at least it works.
SELECT b1.id, b1.item_id, b1.amount, b1.user_id, p.name
FROM bid b1
LEFT JOIN user p ON p.id = b1.user_id
WHERE b1.id = (
SELECT b2.id
FROM bid b2
WHERE b2.item_id IN (1, 2)
ORDER BY b2.amount LIMIT 1
)
This first selects the lowest bid among items 1 and 2 and then uses the id of that bid to find the information you need. Note that it returns a single row overall, not one row per item.
Edit
You are saying that Doctrine does not support subqueries. I have not used Doctrine a lot, but something like this should work:
$subQueryBuilder = $entityManager->createQueryBuilder();
$subQuery = $subQueryBuilder
->select('b2.id')
->from('bid', 'b2')
->where('b2.item_id IN (:items)')
->orderBy('b2.amount')
->setMaxResults(1)
->getDql();
$queryBuilder = $entityManager->createQueryBuilder();
$query = $queryBuilder
->select('b1.id', 'b1.item_id', 'b1.amount', 'b1.user_id', 'p.name')
->from('bid', 'b1')
->leftJoin('user', 'p', 'with', 'p.id = b1.user_id')
->where('b1.id = (' . $subQuery . ')')
->setParameter('items', [1, 2])
->getQuery()->getSingleResult();

Function to find first available option based on count of records and condition

I need to write an SQL statement to get the first 'free' poule (pool / collection of teams) for my team. Let's explain a bit.
I have two tables, one table poules with 4 poules each having a TEAMQTY of 4 (the max. number of teams allowed in a poule):
ID TOURNID NAME TEAMQTY
1 1 Poule 1 4
2 1 Poule 2 4
3 1 Poule 3 4
4 1 Poule 4 4
and a table teams
ID TOURNID NAME POULEID
1 1 Team 1 1
2 1 Team 2 1
3 1 Team 3 1
4 1 Team 4 1
I want to write a function in mysql which based on the situation above suggest a pouleid of 2 since poule 1 is completely filled up with teams. IOW I should be able to insert 4 more teams in PouleId 2, after that my function should return PouleID 3 as a suggestion.
I'm new to mysql (an sql noob) and I've tried:
SELECT id FROM POULES WHERE TOURNID = 1 AND
teamqty > (SELECT COUNT(ID) FROM TEAMS WHERE TOURNID = 1) LIMIT 1
Needless to say, my experimental SQL code is useless.
Do I need a while loop here or would an SQL statement do?
Here's my supporting code:
CREATE TABLE IF NOT EXISTS `poules` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`TOURNID` int(11) NOT NULL,
`NAME` varchar(20) NOT NULL,
`TEAMQTY` int(11) NOT NULL,
PRIMARY KEY (`ID`),
KEY `TOURNID` (`TOURNID`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=5 ;
INSERT INTO `poules` (`ID`, `TOURNID`, `NAME`, `TEAMQTY`) VALUES
(1, 1, '1', 4),
(2, 1, '2', 4),
(3, 1, '3', 4),
(4, 1, '4', 4);
CREATE TABLE IF NOT EXISTS `teams` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`TOURNID` int(11) NOT NULL,
`NAME` varchar(50) NOT NULL,
`POULEID` int(11) DEFAULT NULL,
PRIMARY KEY (`ID`),
UNIQUE KEY `NAME` (`NAME`),
KEY `TOURNID` (`TOURNID`))
ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=6 ;
INSERT INTO `teams` (`ID`, `TOURNID`, `NAME`, `POULEID`) VALUES
(1, 1, '1', 1),
(2, 1, '2', 1),
(3, 1, '3', 1),
(4, 1, '4', 1);
TIA Mike
You can do a left join with a subquery that counts the teams per poule and compare that count with the TEAMQTY in the main table.
You can use LIMIT to get a single result, based on an ORDER BY on the team count.
select p.id as pouleid, ifnull(t.teamcount,0), p.tournid
from poules p
left join ( select count(pouleid) as teamcount, pouleid, tournid
from teams
group by pouleid, tournid
)t
on p.id = t.pouleid
and p.tournid = t.tournid
where ifnull(t.teamcount,0) < p.teamqty
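A minimal sketch of the whole lookup, assuming Python's built-in sqlite3 module as a stand-in for MySQL. The helper name first_free_poule is hypothetical, and ordering by poule id to define "first" is an assumption of this sketch (the answer suggests ordering on the team count instead).

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE poules (id INTEGER PRIMARY KEY, tournid INTEGER, name TEXT, teamqty INTEGER);
CREATE TABLE teams (id INTEGER PRIMARY KEY, tournid INTEGER, name TEXT, pouleid INTEGER);
INSERT INTO poules VALUES (1, 1, '1', 4), (2, 1, '2', 4), (3, 1, '3', 4), (4, 1, '4', 4);
INSERT INTO teams VALUES (1, 1, '1', 1), (2, 1, '2', 1), (3, 1, '3', 1), (4, 1, '4', 1);
""")

def first_free_poule(db, tournid):
    """Return the lowest poule id that still has room, or None if all are full."""
    row = db.execute("""
    SELECT p.id
    FROM poules p
    LEFT JOIN (SELECT pouleid, COUNT(*) AS teamcount
               FROM teams
               WHERE tournid = ?
               GROUP BY pouleid) t ON t.pouleid = p.id
    WHERE p.tournid = ?
      AND COALESCE(t.teamcount, 0) < p.teamqty  -- poules without teams count as 0
    ORDER BY p.id                               -- assumption: "first" means lowest id
    LIMIT 1
    """, (tournid, tournid)).fetchone()
    return row[0] if row else None

before = first_free_poule(db, 1)

# Fill poule 2 with four more teams and ask again.
db.executemany(
    "INSERT INTO teams (tournid, name, pouleid) VALUES (1, ?, 2)",
    [("5",), ("6",), ("7",), ("8",)],
)
after = first_free_poule(db, 1)

print(before, after)
```

A single statement is enough here; no loop is needed, because the per-poule counts are computed in one pass and the filter discards the full poules.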

MySQL ORDER BY strange sorting

Problem: a MySQL query returns results in a strange order that looks random. It happens on only one hosting provider; localhost and another hosting provider work fine. I want to understand why it happens and how to prevent it.
Schema:
CREATE TABLE `product` (
`product_id` int(11) NOT NULL AUTO_INCREMENT,
`sort_order` int(11) NOT NULL DEFAULT '0',
`status` tinyint(1) NOT NULL DEFAULT '0',
PRIMARY KEY (`product_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
insert into product (sort_order, status)
values
(0, 1),
(0, 1),
(0, 1),
(0, 1),
(0, 1);
CREATE TABLE `product_description` (
`product_id` int(11) NOT NULL,
`name` varchar(255) NOT NULL,
PRIMARY KEY (`product_id`),
KEY `name` (`name`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
insert into product_description
values
(1, 'product_1'),
(2, 'product_2'),
(3, 'product_3'),
(4, 'product_4'),
(5, 'product_5');
CREATE TABLE `product_to_category` (
`product_id` int(11) NOT NULL,
`product_category_id` int(11) NOT NULL,
PRIMARY KEY (`product_id`,`product_category_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
insert into product_to_category
values
(1, 1),
(2, 1),
(3, 1),
(4, 1),
(5, 1);
CREATE TABLE `product_category_path` (
`product_category_id` int(11) NOT NULL,
`path_id` int(11) NOT NULL,
`level` int(11) NOT NULL,
PRIMARY KEY (`product_category_id`,`path_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
insert into product_category_path values (1, 1, 0);
Query:
SELECT p.product_id, pc.product_category_id, pd.name
FROM `product` p
LEFT JOIN `product_description` pd ON pd.product_id = p.product_id
LEFT JOIN `product_to_category` pc ON pc.product_id = p.product_id
WHERE p.status = 1 AND pc.product_category_id IN (SELECT product_category_id FROM `product_category_path` WHERE path_id = 1)
ORDER BY p.sort_order ASC;
On localhost and on one hosting provider the result is always the same: 1,2,3,4,5. But on the other hosting provider it shows 1,3,2,5,4 or 2,1,5,3,4, with a new ordering every time. Why?
Update
http://dev.mysql.com/doc/refman/5.5/en/order-by-optimization.html
http://s.petrunia.net/blog/?p=24
SQL systems (of any make and model) are allowed to return result set rows in any order they find convenient unless you specify the order specifically. To put it another way, the order of a result set is formally unpredictable unless it's specified in ORDER BY. To put it a third way, on your localhost server, it's entirely accidental that your results are in the order you think they should be in. Tables have no inherent order.
You are really lucky your production server exposed this flaw in your query so quickly. Often developers don't find out about this stuff until their tables grow to tens of thousands of rows.
As this modification of your query shows (http://sqlfiddle.com/#!2/211536/2/0), all rows in your resultset have the same value of SORT_ORDER.
Query:
SELECT p.sort_order, p.product_id, pc.product_category_id, pd.name
FROM `product` p
LEFT JOIN `product_description` pd ON pd.product_id = p.product_id
LEFT JOIN `product_to_category` pc ON pc.product_id = p.product_id
WHERE p.status = 1 AND pc.product_category_id IN (SELECT product_category_id FROM `product_category_path` WHERE path_id = 1)
ORDER BY p.sort_order ASC
Results:
| SORT_ORDER | PRODUCT_ID | PRODUCT_CATEGORY_ID | NAME |
|------------|------------|---------------------|-----------|
| 0 | 1 | 1 | product_1 |
| 0 | 2 | 1 | product_2 |
| 0 | 3 | 1 | product_3 |
| 0 | 4 | 1 | product_4 |
| 0 | 5 | 1 | product_5 |
You've told SQL to order them that way. Both servers have done so.
If you need them to be ordered by PRODUCT_ID as well as SORT_ORDER just specify it (http://sqlfiddle.com/#!2/211536/4/0).
ORDER BY p.sort_order ASC, p.product_id ASC;
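The effect of the tie-breaker can be sketched with Python's built-in sqlite3 module (a stand-in for MySQL, reduced to the product table). The rows are inserted out of order on purpose; with every sort_order equal to 0, only the second ORDER BY key pins the result down.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE product ("
    "product_id INTEGER PRIMARY KEY, "
    "sort_order INTEGER NOT NULL DEFAULT 0)"
)
# All rows share sort_order = 0, as in the question's data.
db.executemany(
    "INSERT INTO product (product_id, sort_order) VALUES (?, 0)",
    [(3,), (1,), (5,), (2,), (4,)],
)

# sort_order alone leaves ties unresolved; product_id makes the order fully specified.
ids = [
    row[0]
    for row in db.execute(
        "SELECT product_id FROM product "
        "ORDER BY sort_order ASC, product_id ASC"
    )
]
print(ids)
```

Any unique column works as the final key; the point is that the ORDER BY list as a whole must distinguish every pair of rows.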

MySQL filter query with relation

I'm having the following problem with 2 MySQL tables that have a relation:
I can easily query table 1 (address) when I want a full list or filter the result by name or email or such. But now I need to query table 1 and filter it based on the relational content of table 2 (interests). So, I need to find a row (usually many rows) in table 1 only if a (or more) conditions are met in table 2.
Here are the tables:
CREATE TABLE IF NOT EXISTS `address` (
`id` int(11) unsigned NOT NULL AUTO_INCREMENT,
`name` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
`email` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
`countryCode` char(2) COLLATE utf8_unicode_ci DEFAULT NULL,
`languageCode` char(2) COLLATE utf8_unicode_ci DEFAULT NULL,
`timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
UNIQUE KEY `emailUnique` (`email`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `address` (`id`, `name`, `email`, `countryCode`, `languageCode`, `timestamp`) VALUES
(1, '', 'dummy#test.com', 'BE', 'nl', '2010-07-16 14:07:00'),
(2, '', 'test#somewhere.com', 'BE', 'fr', '2010-07-16 14:10:25');
CREATE TABLE IF NOT EXISTS `interests` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`address_id` int(11) unsigned NOT NULL,
`cat` char(2) COLLATE utf8_unicode_ci NOT NULL,
`subcat` char(2) COLLATE utf8_unicode_ci NOT NULL,
`timestamp` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
PRIMARY KEY (`id`),
KEY `address_id` (`address_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
INSERT INTO `interests` (`id`, `address_id`, `cat`, `subcat`, `timestamp`) VALUES
(1, 1, 'aa', 'xx', '2010-07-16 14:07:00'),
(2, 1, 'aa', 'yy', '2010-07-16 14:07:00'),
(3, 2, 'aa', 'xx', '2010-07-16 14:07:00'),
(4, 2, 'bb', 'zz', '2010-07-16 14:07:00'),
(5, 2, 'aa', 'yy', '2010-07-16 14:07:00');
ALTER TABLE `interests`
ADD CONSTRAINT `interests_ibfk_1` FOREIGN KEY (`address_id`) REFERENCES `address` (`id`) ON DELETE CASCADE ON UPDATE NO ACTION;
For example, I need to find the address(es) that has (have) as interest cat=aa and subcat=xx. Or, another example, I need the address(es) with as interest both cat=aa and subcat=xx AND cat=aa and subcat=yy. Especially the latter is important, and one has to keep in mind that both the address and the interest tables will be long lists and that the number of cat/subcat combinations will vary. I'm working with reference queries through Zend_Db_Table (findDependentRowset) at the moment, but that solution is way too slow for address lists numbering 100s and even 1000s of hits.
Thank you for your help.
SELECT a.name FROM address a
INNER JOIN interests i ON (a.id = i.address_id)
WHERE i.cat = "aa" AND i.subcat IN ('xx', 'yy')
I added another row in your interests table, to demonstrate a different result set between the two examples:
INSERT INTO interests VALUES (6, 2, 'aa', 'vv', '2010-07-16 14:07:00');
Then you may want to try using correlated subqueries as follows:
SELECT *
FROM address a
WHERE EXISTS (SELECT id
FROM interests
WHERE address_id = a.id AND
(cat = 'aa' and subcat = 'xx'));
Result:
+----+------+--------------------+-------------+--------------+---------------------+
| id | name | email | countryCode | languageCode | timestamp |
+----+------+--------------------+-------------+--------------+---------------------+
| 1 | | dummy#test.com | BE | nl | 2010-07-16 14:07:00 |
| 2 | | test#somewhere.com | BE | fr | 2010-07-16 14:10:25 |
+----+------+--------------------+-------------+--------------+---------------------+
2 rows in set (0.00 sec)
For the second example, we're testing for the new row added previously in order not to have the same result as above:
SELECT *
FROM address a
WHERE EXISTS (SELECT id
FROM interests
WHERE address_id = a.id AND
(cat = 'aa' and subcat = 'xx')) AND
EXISTS (SELECT id
FROM interests
WHERE address_id = a.id AND
(cat = 'aa' and subcat = 'vv'));
Result:
+----+------+--------------------+-------------+--------------+---------------------+
| id | name | email | countryCode | languageCode | timestamp |
+----+------+--------------------+-------------+--------------+---------------------+
| 2 | | test#somewhere.com | BE | fr | 2010-07-16 14:10:25 |
+----+------+--------------------+-------------+--------------+---------------------+
1 row in set (0.00 sec)
Using correlated subqueries is easy and straightforward. However keep in mind that it might not be the best in terms of performance, because the correlated subqueries will be executed once for each address in the outer query.
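Since the number of cat/subcat combinations varies, one alternative to chaining EXISTS clauses is a single GROUP BY with a HAVING count. This is a different technique from the answer above, sketched here with Python's built-in sqlite3 module as a stand-in for MySQL; the helper name addresses_with_all is hypothetical, and the data includes the extra row (6) added earlier.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE interests (id INTEGER PRIMARY KEY, address_id INTEGER, cat TEXT, subcat TEXT);
INSERT INTO interests VALUES
  (1, 1, 'aa', 'xx'), (2, 1, 'aa', 'yy'), (3, 2, 'aa', 'xx'),
  (4, 2, 'bb', 'zz'), (5, 2, 'aa', 'yy'), (6, 2, 'aa', 'vv');
""")

def addresses_with_all(db, pairs):
    """Return the address ids that have every (cat, subcat) pair in `pairs`."""
    pairs = list(dict.fromkeys(pairs))  # drop duplicate pairs, keep order
    if not pairs:
        return []
    # One "(cat = ? AND subcat = ?)" term per requested pair, OR-ed together.
    conds = " OR ".join("(cat = ? AND subcat = ?)" for _ in pairs)
    params = [value for pair in pairs for value in pair] + [len(pairs)]
    sql = f"""
    SELECT address_id
    FROM interests
    WHERE {conds}
    GROUP BY address_id
    -- '||' is SQLite string concatenation; MySQL would use CONCAT(cat, '/', subcat).
    HAVING COUNT(DISTINCT cat || '/' || subcat) = ?
    ORDER BY address_id
    """
    return [row[0] for row in db.execute(sql, params)]

print(addresses_with_all(db, [("aa", "xx")]))
print(addresses_with_all(db, [("aa", "xx"), ("aa", "vv")]))
```

The WHERE clause keeps only rows matching any requested pair, and the HAVING clause then requires that all of them were found for the address. The matching ids can be joined back to address to fetch the full rows.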