MySQL: Combining multiple where conditions - mysql

I'm working on a menu system that takes a url and then queries the db to build the menu.
My menu table is:
+---------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| node_id | int(11) | YES | | NULL | |
| parent | int(11) | YES | | NULL | |
| weight | int(11) | YES | | NULL | |
| title | varchar(250) | YES | | NULL | |
| alias | varchar(250) | YES | | NULL | |
| exclude | int(11) | YES | | NULL | |
+---------+--------------+------+-----+---------+----------------+
The relevant columns for my question are alias, parent and node_id.
So for a url like: http://example.com/folder1/folder2/filename
Alias would potentially = "filename", "folder1", "folder2"
Parent = the node_id of the parent folder.
What I know is how to split the url up into an array and check the alias for a match to each part.
What I don't know is how to have it then filter by parent whose alias matches "folder2" and whose parent alias matches "folder1".
I'm imagining a query like so:
select * from menu
where alias='filename' and
where parent = node_id
where alias='folder2' and parent = node_id
where alias='folder1'
Except I know that the above is wrong. I'm hoping this can be done in a single query.
Thanks for any help in advance!

select * from menu
where alias='filename' and
parent = (select node_id from menu
where alias='folder2' and
parent = (select node_id from menu
where alias='folder1'
)
)

I can't get what query you want but here is 2 rules for you:
WHERE statement should be only one.
resulting expression will be applied to the each row, one-by-one. Not to the "whole" table, as you probably think. So, you have to think hard when creating expression. there would be no match with alias='folder2' and alias='folder1' at the same time

This should do it for you. It links all the nodes through parent (if there is one), and will retrieve info for all of the levels in one record.
select *
from menu m1
left outer join menu m2 on m1.parent = m2.node_id
left outer join menu m3 on m2.parent = m3.node_id
where m1.alias = 'filename'
and m2.alias = 'folder2'
and m3.alias = 'folder1'

Related

MySQL - How can you select multiple columns on a nested IFNULL...GROUP_CONCAT() condition?

I have a web application which is connected to a MySQL (5.5.64-MariaDB) database.
One of the queries is as follows:
SELECT
d.id,
d.label AS display_label,
d.anchor,
r.id AS regulation_id,
IFNULL(
(SELECT GROUP_CONCAT(value) FROM display_substances `ds`
WHERE `ds`.`display_id` = `d`.`id`
AND ds.substance_id = 1 -- For example, substance ID = 1
GROUP BY `ds`.`display_id`
), "Not Listed"
) `display_value` FROM displays `d`
JOIN groups g ON d.group_id = g.id
JOIN regulations r ON g.regulation_id = r.id
An example of the output is as follows:
+-----+------------------------------------+------------------------------------------------------------------------------------------+
| id | name | display_value |
+-----+------------------------------------+------------------------------------------------------------------------------------------+
| 4 | techfunction | Intermediate / monomer; Corrosion inhibitor / anodiser / galvaniser; Catalyst; Additive |
| 323 | russia_chemsafety_register_display | Not Listed |
| 733 | peru_pcb_display | Not Listed |
+-----+------------------------------------+------------------------------------------------------------------------------------------+
This query does what we need. For explanatory purposes:
There are 2 tables, displays and display_substances
The query is obtaining display_substances.value for each displays.id
If there is no corresponding display_substances.value then the string "Not Listed" (refer to query above) is returned. If there is a corresponding value then display_substances.value is returned. So in the example data above, IDs 323 and 733 refer to a scenario where there is no corresponding entry, therefore we want "Not Listed". Conversely ID 4 does have a value ("Intermediate / monomer; Corrosion inhibitor / anodiser / galvaniser; Catalyst; Additive") so we get that.
The table structures are as follows:
DESCRIBE displays;
+----------+----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------+----------------------+------+-----+---------+----------------+
| id | smallint(5) unsigned | NO | PRI | NULL | auto_increment |
| name | varchar(127) | NO | | NULL | |
| label | varchar(255) | NO | | NULL | |
+----------+----------------------+------+-----+---------+----------------+
DESCRIBE display_substances;
+--------------+-----------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------------+-----------------------+------+-----+---------+----------------+
| id | mediumint(8) unsigned | NO | PRI | NULL | auto_increment |
| display_id | smallint(5) unsigned | NO | MUL | NULL | |
| substance_id | mediumint(8) unsigned | NO | MUL | NULL | |
| value | text | NO | | NULL | |
| automated | tinyint(4) | YES | | NULL | |
+--------------+-----------------------+------+-----+---------+----------------+
I want to be able to return display_substances.automated (refer to table structure above) as a column from my query. But I can't see how to do this.
The reference to the display_substances table is ds, so I cannot use that in the initial SELECT statement because at that point there's no alias. Equally there is no JOIN condition that would make it possible, because not every row returned obtains data from display_substances (i.e. those that are "Not Listed" are not getting anything from that table).
If I want an additional column next to display_value in the sample output above that shows display_substances.automated, or NULL if it doesn't exist, how can I achieve that?
For reference the automated field either contains a 1 (to represent data that has been obtained through automated processes by our application), or NULL if it isn't automated.
there is no JOIN condition that would make it possible, because not
every row returned obtains data from display_substances
For this case you can use a LEFT JOIN:
SELECT d.id, d.label display_label, d.anchor, r.id regulation_id,
COALESCE(ds.value, 'Not Listed') display_value,
ds.automated
FROM displays d
INNER JOIN groups g ON d.group_id = g.id
INNER JOIN regulations r ON g.regulation_id = r.id
LEFT JOIN (
SELECT display_id, GROUP_CONCAT(value) value, MAX(automated) automated
FROM display_substances
WHERE substance_id = 1
GROUP BY display_id
) ds ON ds.display_id = d.id
I used MAX(automated) as the returned column, but you can use GROUP_CONCAT(automated) just like you do for value and also COALESCE():
COALESCE(ds.automated, 'Not Listed')

MySql LEFT OUTER JOIN causing duplicate rows

Im running a query to grab the first 10 profiles (think of them as an article that shows when a shop opens and holds information about that shop). I'm using the OUTER JOIN to select * images that belong to the profile PK.
Im running the following query, the main part I'm trying to focus on is the JOIN. I won't post the whole query as it's just a whole bunch of 'table'.'colname' = 'table.colname'.
But here is where the magic happens during my outer join.
LEFT JOIN `content_image` AS `image` ON `profile`.`content_ptr_id` = `image`.`content_id`
Full Query:
I've formatted like this so everyone can see the query without scrolling endlessly to the right.
select `profile`.`content_ptr_id` AS `profile.content_ptr_id`,
`profile`.`body` AS `profile.body`,
`profile`.`web_site` AS `profile.web_site`,
`profile`.`email` AS `profile.email`,
`profile`.`hours` AS `profile.hours`,
`profile`.`price_range` AS `profile.price_range`,
`profile`.`price_range_high` AS `profile.price_range_high`,
`profile`.`primary_category_id` AS `profile.primary_category_id`,
`profile`.`business_contact_email` AS `profile.business_contact_email`,
`profile`.`business_contact_phone` AS `profile.business_contact_phone`,
`profile`.`show_in_directory` AS `profile.show_in_directory`,
`image`.`id` AS `image.id`,
`image`.`content_id` AS `image.content_id`,
`image`.`type` AS `image.type`,
`image`.`order` AS `image.order`,
`image`.`caption` AS `image.caption`,
`image`.`author_id` AS `image.author_id`,
`image`.`image` AS `image.image`,
`image`.`link_url` AS `image.link_url`
FROM content_profile AS profile
LEFT JOIN `content_image` AS `image` ON `profile`.`content_ptr_id` = `image`.`content_id`
GROUP BY profile.content_ptr_id
LIMIT 10, 12
Is there a way I can group my results per profile? E.g all images will show in the one profile result? I can't use group by as I'm getting an error
Error: ER_WRONG_FIELD_WITH_GROUP: Expression #12 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'broadsheet.image.id' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by]
code: 'ER_WRONG_FIELD_WITH_GROUP',
errno: 1055,
sqlState: '42000',
index: 0 }
Is there a possible way around this group by error or another query I could run?
Tables:
content_image
+------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| content_id | int(11) | NO | MUL | NULL | |
| type | varchar(255) | NO | | NULL | |
| order | int(11) | NO | | NULL | |
| caption | longtext | NO | | NULL | |
| author_id | int(11) | YES | MUL | NULL | |
| image | varchar(255) | YES | | NULL | |
| link_url | varchar(200) | YES | | NULL | |
+------------+--------------+------+-----+---------+----------------+
content_profile
+------------------------+----------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------------------+----------------------+------+-----+---------+-------+
| content_ptr_id | int(11) | NO | PRI | NULL | |
| body | longtext | NO | | NULL | |
| web_site | varchar(200) | NO | | NULL | |
| email | varchar(75) | NO | | NULL | |
| menu | longtext | NO | | NULL | |
| hours | longtext | NO | | NULL | |
| price_range | smallint(5) unsigned | YES | MUL | NULL | |
| price_range_high | smallint(5) unsigned | YES | | NULL | |
| primary_category_id | int(11) | NO | | NULL | |
| business_contact_name | varchar(255) | NO | | NULL | |
| business_contact_email | varchar(75) | NO | | NULL | |
| business_contact_phone | varchar(20) | NO | | NULL | |
| show_in_directory | tinyint(1) | NO | | NULL | |
+------------------------+----------------------+------+-----+---------+-------+
From reading your question, I think you don't have a grasp of how the GROUP BY clause works.
So the short summary of my answer is: learn the fundamentals of the GROUP BY clause.
I will use only a small number of columns to make the explanation easier.
The first problem with your query is that you are not using the group by clause properly - when using a group by clause, all columns that are selected must be either in the group by clause OR be selected with an aggregate function.
Lets suppose these are the only columns you are selecting:
profile.content_ptr_id
profile.body
profile.web_site
image.id
image.content_id
And the query looked like this:
SELECT `profile.content_ptr_id`, `profile.body`, `profile.web_site`, `image.id`, `image.content_id`
FROM ...
GROUP BY `profile.content_ptr_id`
This query will error out as you did not specify how you want to consolidate multiple rows to one row for profile.body, profile.web_site, image.id, image.content_id. The database does not know how you want to consolidate the other columns as you can group, or use aggregate functions such as min(), max(), count(), etc.
So one solution to fix the error raised in the query above would be the following:
SELECT `profile.content_ptr_id`, `profile.body`, `profile.web_site`, `image.id`, `image.content_id`
FROM ...
GROUP BY `profile.content_ptr_id`, `profile.body`, `profile.web_site`, `image.id`, `image.content_id`
Here, I put all the columns in the group by clause which makes the query group and select all the unique combinations of profile.content_ptr_id, profile.body, profile.web_site, image.id, image.content_id columns.
Following is an example query which does not have all the columns included in the group by clause:
Lets say, you want to find out how many images there are for each of the profiles. You can use a query such as the following:
SELECT `profile.content_ptr_id`, `profile.body`, `profile.web_site`, COUNT(`image.id`)
FROM ...
GROUP BY `profile.content_ptr_id`, `profile.body`, `profile.web_site`
This query lets you find out how many images there are for every unique combination of profile.content_ptr_id, profile.body, profile.web_site columns.
Be aware that in my previous two examples, all the columns that are selected are either included in the group by clause or are selected with an aggregate function. This is a rule all queries need to follow when using the group by clause, otherwise an error will be raised by the database.
Now, lets get onto answering your question:
"Is there a way I can group my results per profile? E.g all images will show in the one profile result?"
I will use the following mock data to explain:
profile
+----------------+--------------+---------------+
| content_ptr_id | body | web_site |
+----------------+--------------+---------------+
| 100 | body1 | web1 |
+----------------+--------------+---------------+
image
+--------+-------------+
| id | content_id |
+--------+-------------+
| iid1 | 100 |
| iid2 | 100 |
+--------+-------------+
Following would be what the result would look like if you don't do a join:
SELECT `profile.content_ptr_id`, `profile.body`, `profile.web_site`, `image.id`, `image.content_id`
FROM ...
+----------------+--------------+---------------+--------+-------------+
| content_ptr_id | body | web_site | id | content_id |
+----------------+--------------+---------------+--------+-------------+
| 100 | body1 | web1 | iid1 | 100 |
| 100 | body1 | web1 | iid2 | 100 |
+----------------+--------------+---------------+--------+-------------+
You can't achieve your objective of grouping your results per profile (combining to only show one line per profile) by grouping by all the columns as the result will be the same:
SELECT `profile.content_ptr_id`, `profile.body`, `profile.web_site`, `image.id`, `image.content_id`
FROM ...
GROUP BY `profile.content_ptr_id`, `profile.body`, `profile.web_site`, `image.id`, `image.content_id`
will return
+----------------+--------------+---------------+--------+-------------+
| content_ptr_id | body | web_site | id | content_id |
+----------------+--------------+---------------+--------+-------------+
| 100 | body1 | web1 | iid1 | 100 |
| 100 | body1 | web1 | iid2 | 100 |
+----------------+--------------+---------------+--------+-------------+
The question you need to answer is how you want to display the non-unique columns you want to combine - in this case image.id. You can use count, but this will only return you a number. If you want to display all the text, you can use GROUP_CONCAT() which will concatenate all the values delimited by comma by default. If you use GROUP_CONCAT() the result will look like the following:
SELECT `profile.content_ptr_id`, `profile.body`, `profile.web_site`, GROUP_CONCAT(`image.id`), GROUP_CONCAT(`image.content_id`)
FROM ...
GROUP BY `profile.content_ptr_id`, `profile.body`, `profile.web_site`
This query will return:
+----------------+--------------+---------------+--------------------+-------------+
| content_ptr_id | body | web_site | GROUP_CONCAT(id) | content_id |
+----------------+--------------+---------------+--------------------+-------------+
| 100 | body1 | web1 | iid1,iid2 | 100 |
+----------------+--------------+---------------+--------------------+-------------+
If GROUP_CONCAT() is what you want to use for all the image columns, then go ahead, but doing this for many columns consolidating many rows may make the table less readable. But either way, I would suggest you read some articles to familiarise yourself with how the GROUP BY clause works.
Remove the GROUP BY clause.
I suspect you didn't want to do a GROUP BY operation, given that the expression in the group by is the PRIMARY KEY of the content_profile table.
What is up with all the single quotes? Those are used to enclose string literals, not identifiers.
Thank for sparing us from "scrolling endlessly to the right".
Are you aware that spaces and linebreaks can be included in the SQL text, without altering the meaning of the statement? The parser can easily deal with extra whitespace, and adding the extra whitespace to format the statement can make it much easier for a human reader to decipher.
It's not at all clear why the statement is skipping over the first ten rows, and then returning the next twelve. Very strange.
SELECT p.content_ptr_id AS `profile.content_ptr_id`
, p.body AS `profile.body`
, p.web_site AS `profile.web_site`
, p.email AS `profile.email`
, p.hours AS `profile.hours`
, p.price_range AS `profile.price_range`
, p.price_range_high AS `profile.price_range_high`
, p.primary_category_id AS `profile.primary_category_id`
, p.business_contact_email AS `profile.business_contact_email`
, p.business_contact_phone AS `profile.business_contact_phone`
, p.show_in_directory AS `profile.show_in_directory`
, i.id AS `image.id`
, i.content_id AS `image.content_id`
, i.type AS `image.type`
, i.order AS `image.order`
, i.caption AS `image.caption`
, i.author_id AS `image.author_id`
, i.image AS `image.image`
, i.link_url AS `image.link_url`
FROM `content_profile` p
LEFT
JOIN `content_image` i
ON i.content_id = p.content_ptr_id
ORDER
BY p.content_ptr_id
, i.id
Because content_id is not unique in the content_image table, duplicate rows from content_profile are the expected result.
If your code can't handle the "duplicate" rows, i.e. identifying when the row that was just fetched has the same value for content_ptr_id as the previous row, then your SQL shouldn't do a join operation that creates the duplicated values.

Query on two tables for one report (Advanced)

I'm having some trouble with an advanced SQL query, and it's been a long time since I've worked with SQL databases. We use MySQL.
Background:
We will be working with two tables:
"Transactions Table"
table: expire_history
+---------------+-----------------------------+------+-----+-------------------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------+-----------------------------+------+-----+-------------------+-------+
| m_id | int(11) | NO | PRI | 0 | |
| m_a_ordinal | int(11) | NO | PRI | 0 | |
| a_expired_date| datetime | NO | PRI | | |
| a_state | enum('EXPIRED','UNEXPIRED') | YES | | NULL | |
| t_note | text | YES | | NULL | |
| t_updated_by | varchar(40) | NO | | | |
| t_last_update | timestamp | NO | | CURRENT_TIMESTAMP | |
+---------------+-----------------------------+------+-----+-------------------+-------+
"Information Table"
table: information
+---------------------+---------------+------+-----+---------------------+-------+
| Field | Type | Null | Key | Default | Extra |
+---------------------+---------------+------+-----+---------------------+-------+
| m_id | int(11) | NO | PRI | 0 | |
| m_a_ordinal | int(11) | NO | PRI | 0 | |
| a_type | varchar(15) | YES | MUL | NULL | |
| a_class | varchar(15) | YES | MUL | NULL | |
| a_state | varchar(15) | YES | MUL | NULL | |
| a_publish_date | datetime | YES | | NULL | |
| a_expire_date | date | YES | | NULL | |
| a_updated_by | varchar(20) | NO | | | |
| a_last_update | timestamp | NO | | CURRENT_TIMESTAMP | |
+---------------------+---------------+------+-----+---------------------+-------+
We have a set of fields in one table that describe the record. Each record is comprised of a m_id (the person) and an ordinal (a person can have multiple records). So for instance, my m_id could be 1, and i could have multiple ordinals, (1, 2, 3, 4, etc), each with their own individual set of data. The m_id and the m_a_ordinal comprise a composite key in the "information" table, and the m_id, m_a_ordinal, and a_expired_date fields in the "transactions" table comprises a composite key as well.
Essentially when we expire a record, the a_state field in the information table is updated to expired. At the same time, a record is created in the transactions table with the m_id, m_a_ordinal, and a_expired_date. We've found in the past that people get impatient and can click a button twice, so through some previous help I've managed to narrow down the most recent transaction for each expired record using the following query:
SELECT e1.m_id, e1.m_a_ordinal, e1.a_expired_date, e1.t_note, e1.t_updated_by
FROM expire_history e1
INNER JOIN (SELECT m_id, m_a_ordinal, MAX(a_expired_date) AS a_expired_date
FROM expire_history GROUP BY m_id, m_a_ordinal) e2
ON (e2.m_id = e1.m_id AND e2.m_a_ordinal = e1.m_a_ordinal AND e2.a_expired_date = e1.a_expired_date)
WHERE e2.a_expired_date > '2008-05-15 00:00:00' ORDER BY a_date_expired;
Seems simple enough, right?
Let's add some complexity. Each record in the "information" table has a "natural expiration date" as well. The original developer of our software, however, didn't code it to change the state of the record to "expired" once it's reached it's natural expiration date. It also does not write a transaction to the transaction table once it's expired (which I understand because this is only to keep records of ones that were expired by a person, as opposed to automagically). Also, when a record is expired manually, the original expiration date does not change. This is why this is so complicated :P~~.
Essentially I need to build a report that shows all aspects of expiration, whether it was expired manually, or naturally.
This report should take the data from the query above, and combines it with another query on the "information table" that says if a_expire_date <= CURDATE show record, except if record exisits in (query above from expire_history), then show record from (query on expire_history).
a rough structure of the raw logic is as follows:
for x in record_total
if (m_id m_a_ordinal) exists in expire_history
display m_id, m_a_ordinal, a_expired_date, a_state)
else if (m_id_a_ordinal) exists in information AND a_expire_date <= CURDATE
display (m_id, m_a_ordinal, a_expire_date, a_state)
end if
x++
I hope that this is concise enough.
Thanks for any help you can provide!
SELECT i.m_id, I.m_a_ordinal,
coalesce(e1.a_expired_date, I.A_Expire_Date) as Expire_DT,
coalesce(e1.t_note,'insert related item column'),
coalesce(e1.t_updated_by, I.A_Updated_by) as Updated_By
FROM Information I
LEFT JOIN expire_history e1
ON E1.M_ID = I.M_ID
AND I.m_a_ordinal=e1.M_a_ordinal
INNER JOIN
(SELECT m_id, m_a_ordinal, MAX(a_expired_date) AS a_expired_date
FROM expire_history GROUP BY m_id, m_a_ordinal) e2
ON (e2.m_id = e1.m_id
AND e2.m_a_ordinal = e1.m_a_ordinal
AND e2.a_expired_date = e1.a_expired_date)
WHERE coalesce(e2.a_expired_date,i.A_Expire_Date) > '2008-05-15 00:00:00'
ORDER BY a_date_expired;
Syntax may be off a bit don't ahve time to test; but you can get the gist of it from this I hope:
Again what coalesce does is simply return the first NON-null value in a series of values. If you're only dealing with two NULLIF may work as well.

$query returns results but not the ones i want - having count?

I'll start again,
Lets say My data is:
Table element (id,name,....)
1, name element 1, ....
2, name element 2, ....
3, name element 3, ....
Table tags (id,name,id_element, ....)
1, happy , 1
2, result, 1
3, very , 1
4, element, 2
5, another, 3
6, element, 1
7, happy, 2
So if search is 'very, happy,element,result': Results i would like
1) element with id = 2 because it has all tags
2) element with id = 1 because it has the tag 'element' and the tag 'happy' (only 2 less taggs)
3) .... (only 3 less taggs)
So if search is 'happy,element': Results i would like
1) element with id = 1 because it has all tags (and no more)
2) element with id = 2 because it has the tag 'element' and the tag 'happy' (and two more tags)
3) .... and 3 more tags
This is an echo to my query:
(it doesn't fit al requirements i wrote, but its first test to find with matched tags)
SELECT element.id as id_deseada,tagg.* FROM element,tagg WHERE tagg.id_element = element.id AND tagg.nombre IN ('happy','tagg','result') GROUP BY tagg.id_element ORDER BY element.votos
This returns 10 duplicated elements... :S and doen't even have all taggs (and on database there are taggs with 'happy' results)
if it helps, thats how i get the elements of a tag (by name and with only one tagg)
$query = "SELECT element.id FROM element,tagg WHERE tagg.nombre = '$nombre_tagg' AND tagg.id_element = element.id AND lan = '$lan' GROUP BY tagg.id_element";
I hope it's a bit easier to understand now, excuse my english.. :)
------- EDIT ----------
I Got this query working, goted from: http://www.pui.ch/phred/archives/2005/04/tags-database-schemas.html (second example is my estructure)
SELECT b.*
FROM scBookmarks b, scCategories c
WHERE c.bId = b.bId
AND (c.category IN ('bookmark', 'webservice', 'semweb'))
GROUP BY b.bId
HAVING COUNT( b.bId )=3
imagine bookmark is element and category is tag (for my example)
It works great, it returns all elements with ALL tags, BUT it return aswell the articles with extra tags,
for that query a possible ressult would be:
Element 6 with tags (bookmark, webserbice, semweb, extratag)
and i want this extra tag (actually any other extra tags), is something to do with HAVING COUNT ?
I know
Thanks a lot for you possible aportation!
If I understand correctly, you have a many-to-many relationship where you want to satisfy multiple conditions at once on the same field.
In order to do this, you need to use MySQL's HAVING clause. Since I don't know your database structure, I'll make one for this example.
object
+------------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+------------+--------------+------+-----+---------+----------------+
| object_id | int(11) | NO | PRI | NULL | auto_increment |
| title | varchar(255) | NO | | NULL | |
| otherfield | text | YES | | NULL | |
+------------+--------------+------+-----+---------+----------------+
tag
+--------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+--------+--------------+------+-----+---------+----------------+
| tag_id | int(11) | NO | PRI | NULL | auto_increment |
| name | varchar(255) | NO | | NULL | |
+--------+--------------+------+-----+---------+----------------+
object_tag
+-----------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-----------+---------+------+-----+---------+-------+
| object_id | int(11) | NO | | NULL | |
| tag_id | int(11) | NO | | NULL | |
+-----------+---------+------+-----+---------+-------+
Now, to retrieve an object with all its tags, you would do:
SELECT object.*,tag.name as tag FROM object
LEFT JOIN object_tag ON object_tag.object_id = object.object_id
LEFT JOIN tag ON object_tag.tag_id = tag.tag_id
That gives you something like:
+-----------+-----------+------------+------+
| object_id | title | otherfield | tag |
+-----------+-----------+------------+------+
| 1 | My object | something | foo |
| 1 | My object | something | bar |
| 1 | My object | something | baz |
+-----------+-----------+------------+------+
Now, since you cannot do a WHERE tag='foo' AND tag='bar', because it's impossible to match, you have to use the HAVING clause.
SELECT object.* FROM object
LEFT JOIN object_tag ON object_tag.object_id = object.object_id
LEFT JOIN tag ON object_tag.tag_id = tag.tag_id
WHERE tag.name IN ('foo', 'bar')
GROUP BY object.object_id
HAVING COUNT(1) >= 2;
Now this will give you:
+-----------+-----------+------------+
| object_id | title | otherfield |
+-----------+-----------+------------+
| 1 | My object | something |
+-----------+-----------+------------+
The HAVING clause tells MySQL to return only if 2 tags or more (COUNT(1) >= 2) matches the result set.

Fast complex query to select bookings

I'm trying to write a query to get a courses information and the number of bookings and attendees. Each course can have many bookings and each booking can have many attendees.
We already have a working report, but it uses multiple queries to get the required information. One to get the courses, one to get the bookings, and one to get the number of attendees. This is very slow because of the size that the database has grown to.
There are a number of extra conditions for the reports:
Bookings must be made more than 5
minutes ago, or have been confirmed
The booking must not be canceled
The course must not be marked as deleted
The courses venue and location must be LIKE a search string
Courses with no bookings must appear in the results
This is the table structure: (I've omitted the unneeded information. All fields are not null and have no default)
mysql> DESCRIBE first_aid_courses;
+------------------+--------------+-----+----------------+
| Field | Type | Key | Extra |
+------------------+--------------+-----+----------------+
| id | int(11) | PRI | auto_increment |
| course_date | date | | |
| region_id | int(11) | | |
| location | varchar(255) | | |
| venue | varchar(255) | | |
| number_of_spaces | int(11) | | |
| deleted | tinyint(1) | | |
+------------------+--------------+-----+----------------+
mysql> DESCRIBE first_aid_bookings;
+-----------------------+--------------+-----+----------------+
| Field | Type | Key | Extra |
+-----------------------+--------------+-----+----------------+
| id | int(11) | PRI | auto_increment |
| first_aid_course_id | int(11) | | |
| placed | datetime | | |
| confirmed | tinyint(1) | | |
| cancelled | tinyint(1) | | |
+-----------------------+--------------+-----+----------------+
mysql> DESCRIBE first_aid_attendees;
+----------------------+--------------+-----+----------------+
| Field | Type | Key | Extra |
+----------------------+--------------+-----+----------------+
| id | int(11) | PRI | auto_increment |
| first_aid_booking_id | int(11) | | |
+----------------------+--------------+-----+----------------+
mysql> DESCRIBE regions;
+----------+--------------+-----+----------------+
| Field | Type | Key | Extra |
+----------+--------------+-----+----------------+
| id | int(11) | PRI | auto_increment |
| name | varchar(255) | | |
+----------+--------------+-----+----------------+
I need to select the following:
Course ID: first_aid_courses.id
Date: first_aid_courses.course_date
Region regions.name
Location: first_aid_courses.location
Bookings: COUNT(first_aid_bookings)
Attendees: COUNT(first_aid_attendees)
Spaces Remaining: COUNT(first_aid_bookings) - COUNT(first_aid_attendees)
This is what I have so far:
SELECT `first_aid_courses`.*,
COUNT(`first_aid_bookings`.`id`) AS `bookings`,
COUNT(`first_aid_attendees`.`id`) AS `attendees`
FROM `first_aid_courses`
LEFT JOIN `first_aid_bookings`
ON `first_aid_courses`.`id` =
`first_aid_bookings`.`first_aid_course_id`
LEFT JOIN `first_aid_attendees`
ON `first_aid_bookings`.`id` =
`first_aid_attendees`.`first_aid_booking_id`
WHERE ( `first_aid_courses`.`location` LIKE '%$search_string%'
OR `first_aid_courses`.`venue` LIKE '%$search_string%' )
AND `first_aid_courses`.`deleted` = 0
AND ( `first_aid_bookings`.`placed` > '$five_minutes_ago'
AND `first_aid_bookings`.`cancelled` = 0
OR `first_aid_bookings`.`confirmed` = 1 )
GROUP BY `first_aid_courses`.`id`
ORDER BY `course_date` DESC
Its not quite working, can any one help me with writing the correct query? Also there are 1000s of rows in this database, so any help on making it fast is appreciated (like which fields to index).
Ok, Ive answered my own question. Sometimes it helps to ask a question for you to figure out the answer.
SELECT `first_aid_courses`.*,
`regions`.`name` AS `region_name`,
COUNT(DISTINCT `first_aid_bookings`.`id`) AS `bookings`,
COUNT(`first_aid_attendees`.`id`) AS `attendees`
FROM `first_aid_courses`
JOIN `regions`
ON `first_aid_courses`.`region_id` = `regions`.`id`
LEFT JOIN `first_aid_bookings`
ON `first_aid_courses`.`id` =
`first_aid_bookings`.`first_aid_course_id`
LEFT JOIN `first_aid_attendees`
ON `first_aid_bookings`.`id` =
`first_aid_attendees`.`first_aid_booking_id`
WHERE ( `first_aid_courses`.`location` LIKE '%$search_string%'
OR `first_aid_courses`.`venue` LIKE '%$search_string%' )
AND `first_aid_courses`.`deleted` = 0
AND ( `first_aid_bookings`.`cancelled` = 0
AND `first_aid_bookings`.`confirmed` = 1 )
GROUP BY `first_aid_courses`.`id`
ORDER BY `course_date` ASC
This is completely untested, but maybe try selecting a count of non-null rows for bookings and attendees, like this:
SUM(IF(`first_aid_bookings`.`id` IS NOT NULL, 1, 0)) AS `bookings`,
COUNT(IF(`first_aid_attendees`.`id` IS NOT NULL, 1, 0)) AS `attendees`
Unless you have it but just do not show it, have a good look on indexes, without them you loose an order of magnitude on performance on any query that references anything but primary key.
Another major performance hit are the LIKE '%nnn%'.
Would it be possible to do something with those?
But with some good indexes, this query should be fine if you have the hardware to back it up.
I have queries doing LIKE on tables with millions of rows. its not a problem if the rest of the query can eliminate any unnecessary matchings.
You could go for subqueries to lessen the scope for the LIKE queries.