Match all keywords using LIKE with GROUP BY in MySQL

I have a table of keywords (ID, ean, keyword) and another table with product details. I want the search to return only the EANs where every search term matches at least one keyword. The closest I have got is the query below, but it also returns products where a single term matches three times.
For example, say I have a product called 'Generic headphones - iPhone, iPad, iPod' and I search for 'gen%', 'hea%' and 'ip%'. That product matches, as it should, but so does 'Apple headphones - iPhone, iPad, iPod', because of the three 'ip' words, which is not what I want.
I want only EAN 1 to match, so there must be at least one match for each term.
Any help would be much appreciated.
SELECT Count(keywords.ean) AS cc,
products.*
FROM keywords
INNER JOIN products
ON products.ean = keywords.ean
WHERE (
keyword LIKE 'gen%'
|| keyword like 'ip%'
|| keyword LIKE 'hea%')
GROUP BY (keywords.ean)
HAVING cc>=3
ORDER BY `products`.`ean` ASC
UPDATE: This gets the desired results, but there must be more efficient ways to do this.
SELECT products.*
FROM products
INNER JOIN (SELECT ean, count(*) as tc1
FROM keywords
WHERE ( keyword like 'gen%' )
GROUP BY ean
HAVING tc1 > 0 ) as t1 ON t1.ean = products.ean
INNER JOIN (SELECT ean, count(*) as tc2
FROM keywords
WHERE ( keyword like 'ip%' )
GROUP BY ean
HAVING tc2 > 0 ) as t2 ON t2.ean = products.ean
INNER JOIN (SELECT ean, count(*) as tc3
FROM keywords
WHERE ( keyword like 'hea%' )
GROUP BY ean
HAVING tc3 > 0 ) as t3 ON t3.ean = products.ean
ORDER BY products.ean

Perhaps you're after something more like this...
SELECT p.ean
, p.description
FROM products p
JOIN keywords k
ON k.ean = p.ean
WHERE k.keyword LIKE 'iP%'
OR k.keyword LIKE 'hea%'
OR k.keyword LIKE 'gen%'
GROUP
BY p.ean
HAVING COUNT(DISTINCT CASE WHEN k.keyword LIKE 'iP%' THEN 'iP'
WHEN k.keyword LIKE 'hea%' THEN 'hea'
WHEN k.keyword LIKE 'gen%' THEN 'gen'
ELSE keyword END) = 3;
http://sqlfiddle.com/#!9/270f9/25
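To make the difference concrete, here is a minimal sketch that runs the idea against an in-memory SQLite database from Python. The sample rows are invented to mirror the two products in the question, and SQLite stands in for MySQL, so treat it as an illustration rather than a drop-in query. It checks each term separately with MAX(... LIKE ...), which is a variant of the per-term test above, and compares it with the naive COUNT(*) >= 3 approach.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (ean INTEGER PRIMARY KEY, description TEXT);
CREATE TABLE keywords (id INTEGER PRIMARY KEY, ean INTEGER, keyword TEXT);
-- Hypothetical sample data modelled on the question's two products.
INSERT INTO products VALUES
  (1, 'Generic headphones - iPhone, iPad, iPod'),
  (2, 'Apple headphones - iPhone, iPad, iPod');
INSERT INTO keywords (ean, keyword) VALUES
  (1, 'generic'), (1, 'headphones'), (1, 'iphone'), (1, 'ipad'), (1, 'ipod'),
  (2, 'apple'),   (2, 'headphones'), (2, 'iphone'), (2, 'ipad'), (2, 'ipod');
""")

# Naive version: counts matching rows, so three 'ip%' hits are enough.
naive_sql = """
SELECT p.ean
FROM products p
JOIN keywords k ON k.ean = p.ean
WHERE k.keyword LIKE 'gen%' OR k.keyword LIKE 'hea%' OR k.keyword LIKE 'ip%'
GROUP BY p.ean
HAVING COUNT(*) >= 3
ORDER BY p.ean
"""

# Per-term version: each LIKE yields 0 or 1, and MAX() is 1 only if
# at least one keyword row of the product matched that term.
all_terms_sql = """
SELECT p.ean
FROM products p
JOIN keywords k ON k.ean = p.ean
GROUP BY p.ean
HAVING MAX(k.keyword LIKE 'gen%') = 1
   AND MAX(k.keyword LIKE 'hea%') = 1
   AND MAX(k.keyword LIKE 'ip%')  = 1
ORDER BY p.ean
"""

naive_eans = [row[0] for row in conn.execute(naive_sql)]
matching_eans = [row[0] for row in conn.execute(all_terms_sql)]
print("naive:", naive_eans)
print("all terms:", matching_eans)
```

With this data the naive query returns both EANs, while the per-term query returns only EAN 1.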

This is how I'd do it in PostgreSQL. MySQL may have a slightly different syntax.
SELECT kc.cc AS cc,
products.*
FROM products
INNER JOIN ( SELECT ean, count(*) AS cc
FROM keywords
WHERE ( keyword like 'ip%'
OR keyword like 'ai%'
OR keyword like 'bei%' )
GROUP BY ean
HAVING count(*) >= 3 ) AS kc
ON kc.ean = products.ean
ORDER BY Products.ean;

Related

How should I merge these selects and narrow the result set?

I have this huge query that filters results out of a series of keywords.
select distinct textures.id from textures
WHERE ((textures.id in (
select tt.texture_id
from tag_texture tt join tags t
on t.id = tt.tag_id
where t.name in ('tag1', 'tag2')
group by tt.texture_id
HAVING COUNT(DISTINCT t.id) = 2
)
) OR (textures.id in (
select ct.texture_id
from category_texture ct join categories c
on c.id = ct.category_id
where c.name in ('category1', 'category2')
group by ct.texture_id
HAVING COUNT(DISTINCT c.id) = 2
)
) OR (textures.id in (
select tex.id
from textures tex
where tex.name LIKE 'texturename'
group by tex.id
HAVING COUNT(DISTINCT tex.id) = 1
)
) ) AND textures.is_published = 1
The problem is that if I search for texturename tag1, all texturename results are found, even if they have nothing to do with the tags. However, if I search for "tag1 tag2", the resulting list is narrowed (fewer results than searching for tag1 alone). Changing those ORs to AND widens the results even more, obviously.
What's the best way to merge these results so that each time a word is filtered the result set is narrowed down?
Changing all the OR to AND should solve the problem:
SELECT id, name
FROM textures
WHERE ((textures.id in (
select tt.texture_id
from tag_texture tt join tags t
on t.id = tt.tag_id
where t.name in ('1k', 'test')
group by tt.texture_id
HAVING COUNT(DISTINCT t.id) = 2
)
) AND (textures.id in (
select ct.texture_id
from category_texture ct join categories c
on c.id = ct.category_id
where c.name in ('mine')
group by ct.texture_id
HAVING COUNT(DISTINCT c.id) = 1
)
) AND (textures.id in (
select tex.id
from textures tex
where tex.name LIKE '%apple%'
group by tex.id
HAVING COUNT(DISTINCT tex.id) = 1
)
) ) AND textures.is_published = 1
There's no need to use DISTINCT in this query. You're not joining with any other tables, so nothing is going to cause the results to multiply.
If you want to search for the same keywords in all the fields, and require that at least one of them match each field, get rid of the GROUP BY and HAVING clauses.
select textures.id, textures.name from textures
WHERE ((textures.id in (
select tt.texture_id
from tag_texture tt join tags t
on t.id = tt.tag_id
where t.name in ('1k', 'test', 'apple', 'mine')
)
) AND (textures.id in (
select ct.texture_id
from category_texture ct join categories c
on c.id = ct.category_id
where c.name in ('1k', 'test', 'apple', 'mine')
)
) AND (textures.id in (
select tex.id
from textures tex
where tex.name LIKE '%1k%' OR tex.name LIKE '%test%' OR tex.name LIKE '%apple%'
OR tex.name LIKE '%mine%'
)
) ) AND textures.is_published = 1
I added mine to the list of keywords, because otherwise there was no match in the categories table.

Is there any way to replace the LEFT JOINs, or are they necessary?

I would like to know if there is a better way to write the following query, or whether it needs improving at all (considering that the DB load is not that big). The only requirement is that the 3 "variables" have to be combined with AND; the query can have more or fewer LEFT JOINs.
select `candidates`.* from `candidates`
inner join `candidate_tag` on `candidates`.`id` = `candidate_tag`.`candidate_id`
left join `tags` t1 on `t1`.`id` = `candidate_tag`.`tag_id` and t1.name like '%foo%'
left join `tags` t2 on `t2`.`id` = `candidate_tag`.`tag_id` and t2.name like '%baz%'
left join `tags` t3 on `t3`.`id` = `candidate_tag`.`tag_id` and t3.name like '%zoo%'
group by candidates.id
order by `candidates`.`last_name` asc
My first attempt was a conditional query with the AND operator, but it didn't give me any results, which is why I changed it to left joins.
Thanks!
Try this:
select `candidates`.id from `candidates`
inner join `candidate_tag` on `candidates`.`id` = `candidate_tag`.`candidate_id`
inner join `tags` t on `t`.`id` = `candidate_tag`.`tag_id`
group by candidates.id
HAVING SUM(CASE WHEN t.name LIKE '%foo%' THEN 1 ELSE 0 END) >= 1
AND SUM(CASE WHEN t.name LIKE '%bar%' THEN 1 ELSE 0 END) >= 1
AND SUM(CASE WHEN t.name LIKE '%aaa%' THEN 1 ELSE 0 END) >= 1
order by `candidates`.`last_name` asc
If you want to find candidates who have at least one of the three tags, you can filter in the WHERE clause and aggregate:
select c.*
from candidates c join
candidate_tag ct
on ct.candidate_id = c.id join
tags t
on ct.tag_id = t.id
where t.name like '%foo%' or
t.name like '%baz%' or
t.name like '%zoo%'
group by c.id
order by c.`last_name` asc;
You can also add group_concat(t.name) as tags to get the tags that candidates match. Or, alternatively, if you only want candidates with all three, you can add a having clause:
having count(distinct t.name) = 3
or
having (max(t.name like '%foo%') +
max(t.name like '%baz%') +
max(t.name like '%zoo%')) = 3
You need this version if there are two "foo" tags, for instance.
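A small sketch of why the two HAVING variants can disagree, run from Python against an in-memory SQLite database. The table layout follows the question, but the rows are made up: candidate 1 has two tags containing "foo" plus "baz" and no "zoo" tag, and candidate 2 has all three.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE candidates (id INTEGER PRIMARY KEY, last_name TEXT);
CREATE TABLE tags (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE candidate_tag (candidate_id INTEGER, tag_id INTEGER);
-- Hypothetical sample data.
INSERT INTO candidates VALUES (1, 'Adams'), (2, 'Brown');
INSERT INTO tags VALUES (1, 'foo'), (2, 'foobar'), (3, 'baz'), (4, 'zoo');
INSERT INTO candidate_tag VALUES (1, 1), (1, 2), (1, 3), (2, 1), (2, 3), (2, 4);
""")

base_sql = """
SELECT c.id
FROM candidates c
JOIN candidate_tag ct ON ct.candidate_id = c.id
JOIN tags t ON t.id = ct.tag_id
WHERE t.name LIKE '%foo%' OR t.name LIKE '%baz%' OR t.name LIKE '%zoo%'
GROUP BY c.id
HAVING {having}
ORDER BY c.id
"""

# Variant 1: counts distinct matching tag names, whatever pattern they matched.
by_distinct = [r[0] for r in conn.execute(
    base_sql.format(having="COUNT(DISTINCT t.name) = 3"))]

# Variant 2: one 0/1 flag per pattern, so each pattern must match at least once.
by_flags = [r[0] for r in conn.execute(base_sql.format(
    having="MAX(t.name LIKE '%foo%') + MAX(t.name LIKE '%baz%')"
           " + MAX(t.name LIKE '%zoo%') = 3"))]

print("count(distinct):", by_distinct)
print("per-pattern flags:", by_flags)
```

Here the count(distinct) variant accepts candidate 1 because 'foo', 'foobar' and 'baz' are three distinct names, while the per-pattern variant accepts only candidate 2.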

Limit in subquery

When I use the following query without LIMIT nested in a subquery
SELECT `c`.*,
GROUP_CONCAT(g.photo SEPARATOR "|") AS `photos_list`
FROM `contests` AS `c`
LEFT JOIN
(
SELECT `gallery`.`contest_id`,
`gallery`.`photo`
FROM `gallery`
) AS `g` ON c.id = g.contest_id
GROUP BY `c`.`id`
everything works fine:
id  title   photos_list
1   title1  50026c35632eb.jpg
2   title2  50026ac53567f.jpg|50026ac5ec82e.jpg|500e71557270f....
But when I add LIMIT, only one row gets a "photos_list". The following query
SELECT `c`.*,
GROUP_CONCAT(g.photo SEPARATOR "|") AS `photos_list`
FROM `contests` AS `c`
LEFT JOIN
(
SELECT `gallery`.`contest_id`,
`gallery`.`photo`
FROM `gallery`
LIMIT 0, 2
) AS `g` ON c.id = g.contest_id
GROUP BY `c`.`id`
will return
id  title   photos_list
1   title1  NULL
2   title2  50026ac46ea05.jpg|50026ac53567f.jpg
The item with id = 1 should have a photos_list, but it doesn't. Note that LIMIT does work for the item with id = 2.
What should I do to get a correct result?
SELECT `c`.*,
GROUP_CONCAT(g.photo SEPARATOR "|") AS `photos_list`
FROM `contests` AS `c`
LEFT JOIN
(
SELECT `gallery`.`contest_id`,
`gallery`.`photo`
FROM `gallery`
) AS `g` ON c.id = g.contest_id
GROUP BY `c`.`id`
Change GROUP_CONCAT to this:
SUBSTRING_INDEX(GROUP_CONCAT(g.photo SEPARATOR "|"),'|',2) AS `photos_list`
This works as well. However, without wrapping another SELECT clause around it, if there are no photos for a contest, the contest will not show up.
SELECT c.*, GROUP_CONCAT(g.photo SEPARATOR "|") AS photo_list
FROM
contests c
LEFT JOIN
(SELECT *, @num := IF(@contest = contest_id, @num + 1, 1) AS row_num,
@contest := contest_id AS c_id
FROM gallery
ORDER BY contest_id) AS g
ON c.id = g.contest_id
WHERE g.row_num <= 2
GROUP BY c.id, c.title
SELECT c.*, ((
SELECT GROUP_CONCAT(temp.photo SEPARATOR "|")
FROM (SELECT photo FROM gallery g WHERE c.id = g.contest_id LIMIT 2) temp
)) AS photo_list
FROM contests c
Sorry for the incorrect answer. I'm not saying the following solution is optimal, but at least it works. BTW, in this new solution I've assumed that your gallery table has a primary key named id.
SELECT c.*, GROUP_CONCAT(g.photo SEPARATOR "|") AS photos_list
FROM contests AS c
LEFT JOIN (
SELECT
g_0.*
FROM (
SELECT
g_1.*
, ((SELECT COUNT(*) FROM gallery g_2 WHERE g_2.contest_id = g_1.contest_id AND g_2.id <= g_1.id)) AS i
FROM gallery g_1
) g_0
WHERE
g_0.i <= 2
) g ON (c.id = g.contest_id)
GROUP BY c.id
How do you decide which 2 of the possible photos for a particular contest should be returned? Is it meant to be random? Or is it the 2 most recent photos, or the 2 highest rated photos, or some other criterion? Once you can set a condition for choosing the photos, the rest is straightforward. This query would get you the 2 photos with the highest photo_ids for each contest_id:
SELECT contest_id, photo, photo_id
FROM gallery gsub
WHERE (
SELECT COUNT(*) FROM gallery
WHERE contest_id=gsub.contest_id -- for each contest
AND photo_id > gsub.photo_id
) < 2 -- keep this photo if fewer than 2 photos in the contest have a higher photo_id
ORDER BY contest_id
You can do similar things with timestamps (e.g. AND photo_date > gsub.photo_date) or more complex criteria. The only caveat is that if there are several rows that all match the conditions (e.g. several photos have identical timestamps), all of them will be included. That's why I chose photo_id, which is presumably unique.
Insert it into your original query like so:
SELECT c.id, c.title,
GROUP_CONCAT(g.photo SEPARATOR "|") AS photos_list
FROM contests AS c
LEFT JOIN (
-- put the query from above here
) AS g
ON c.id = g.contest_id GROUP BY c.id
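Putting the two pieces together, here is a minimal sketch that can be run from Python against an in-memory SQLite database. The rows are invented, the gallery key is assumed to be called photo_id as in the query above, and SQLite's GROUP_CONCAT(expr, separator) form replaces MySQL's SEPARATOR keyword, so it illustrates the shape of the query and is not MySQL-ready.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contests (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE gallery (photo_id INTEGER PRIMARY KEY, contest_id INTEGER, photo TEXT);
-- Hypothetical sample data: contest 1 has one photo, contest 2 has three,
-- contest 3 has none.
INSERT INTO contests VALUES (1, 'title1'), (2, 'title2'), (3, 'title3');
INSERT INTO gallery VALUES
  (1, 1, 'a.jpg'),
  (2, 2, 'b.jpg'), (3, 2, 'c.jpg'), (4, 2, 'd.jpg');
""")

top_two_sql = """
SELECT c.id, c.title, GROUP_CONCAT(g.photo, '|') AS photos_list
FROM contests AS c
LEFT JOIN (
    -- Keep a photo only if fewer than 2 photos of the same contest
    -- have a higher photo_id, i.e. the 2 highest ids per contest.
    SELECT gsub.contest_id, gsub.photo
    FROM gallery AS gsub
    WHERE (SELECT COUNT(*)
           FROM gallery AS newer
           WHERE newer.contest_id = gsub.contest_id
             AND newer.photo_id > gsub.photo_id) < 2
) AS g ON c.id = g.contest_id
GROUP BY c.id, c.title
ORDER BY c.id
"""

# The order inside photos_list is not guaranteed, so compare as sets.
photos_by_contest = {
    contest_id: set(photos_list.split("|")) if photos_list else set()
    for contest_id, _title, photos_list in conn.execute(top_two_sql)
}
print(photos_by_contest)
```

Because the limit is applied per contest inside the derived table and the outer join is a LEFT JOIN, a contest without photos still appears, with a NULL photos_list.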

Count distinct items from MySQL query

I'm trying to get a distinct count of the number of "threads" in a query:
SELECT COUNT( ft.thread_id ) AS num_items
FROM filter_thread ft
INNER JOIN filter f ON ft.filter_id = f.filter_id
WHERE f.tag LIKE '%foo%'
OR f.tag LIKE '%bar%'
The above works, but due to the way the tables are set up, it counts duplicates. I've tried adding DISTINCT in many places, but had no luck.
For more information...this information is required to correctly list page numbers and associated posts for an AJAX comment section
Try this
SELECT COUNT( ft.thread_id ) AS num_items
FROM filter_thread ft
INNER JOIN filter f ON ft.filter_id = f.filter_id
WHERE f.tag LIKE '%foo%'
OR f.tag LIKE '%bar%'
GROUP BY ft.thread_id
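For comparison, a quick sketch in Python with an in-memory SQLite database and invented rows. It runs three versions side by side: the original COUNT, the GROUP BY version, and COUNT(DISTINCT ft.thread_id), which is an alternative not shown above. With this data, GROUP BY gives one count per thread rather than one total, so COUNT(DISTINCT ...) may be closer to what the page-number calculation needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE filter (filter_id INTEGER PRIMARY KEY, tag TEXT);
CREATE TABLE filter_thread (filter_id INTEGER, thread_id INTEGER);
-- Hypothetical sample data: thread 10 is linked to two matching filters.
INSERT INTO filter VALUES (1, 'foo'), (2, 'foobar'), (3, 'bar'), (4, 'other');
INSERT INTO filter_thread VALUES (1, 10), (2, 10), (3, 11), (4, 12);
""")

from_where = """
FROM filter_thread ft
INNER JOIN filter f ON ft.filter_id = f.filter_id
WHERE f.tag LIKE '%foo%' OR f.tag LIKE '%bar%'
"""

# Original query: counts every joined row, so thread 10 is counted twice.
total_rows = conn.execute(
    "SELECT COUNT(ft.thread_id) " + from_where).fetchone()[0]

# GROUP BY version: one row per thread, each holding that thread's row count.
per_thread = [r[0] for r in conn.execute(
    "SELECT COUNT(ft.thread_id) " + from_where
    + " GROUP BY ft.thread_id ORDER BY ft.thread_id")]

# COUNT(DISTINCT ...): a single number, each thread counted once.
distinct_threads = conn.execute(
    "SELECT COUNT(DISTINCT ft.thread_id) " + from_where).fetchone()[0]

print(total_rows, per_thread, distinct_threads)
```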

Nested query and grouping by unions (SQL)

I need help writing a query to get some information but I am having trouble writing it.
[table_People]
int id
var name
[table_Tools]
int id
var name
[table_Activity1]
int person_id
int tool_id
date delivery_date
[table_Activity2]
int person_id
int tool_id
date installation_date
The query needs to return a list of all people and the name of the most recent tool they used in either activity 1 or 2 (the most recent activity that happened between the two).
SELECT
people.id AS personId,
people.name AS personName,
(
SELECT
tools.name AS toolName
FROM
activity1
JOIN
tools ON tools.id=activity1.tool_id
WHERE
activity1.id=people.id
UNION ALL
SELECT
tools.name AS toolName
FROM
activity2
JOIN
tools ON tools.id=activity2.tool_id
WHERE
activity2.id=people.id
ORDER BY
installationDate,deliveryDate
) AS toolName
FROM
people
ORDER BY
people.name
ASC
The problem I am having is that I can't sort by date (delivery or installation) as I get errors because they are different column names.
Using UNION in a subquery creates a derived temporary table. Columns that aren't selected are not in the result set, so you can't ORDER on a column that's not in the SELECT clause.
When using UNION, the first column name that is used in the SELECT clause is used in the result set (similar to an alias, though you could also use an alias).
Just be sure to name the column in the SELECT clause.
You also need a LIMIT clause to restrict the subquery to a single row:
SELECT
people.id AS personId,
people.name AS personName,
(
SELECT
recent.toolName
FROM
(
SELECT
activity1.person_id, tools.name AS toolName, delivery_date AS activityDate
FROM
activity1
JOIN
tools ON tools.id=activity1.tool_id
UNION ALL
SELECT
activity2.person_id, tools.name AS toolName, installation_date
FROM
activity2
JOIN
tools ON tools.id=activity2.tool_id
) AS recent
WHERE
recent.person_id=people.id
ORDER BY
recent.activityDate DESC
LIMIT 1
) AS toolName
FROM
people
ORDER BY
people.name
ASC
Here's a simpler example to illustrate the issue:
SELECT fish FROM sea
UNION
SELECT dog FROM land
ORDER BY fish
Is the same as:
SELECT fish AS animal FROM sea
UNION
SELECT dog AS animal FROM land
ORDER BY animal
The results are put into a derived temporary table, and you can name the columns whatever you want, but the first name that you use sticks.
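The naming rule is easy to check with a throwaway script. This sketch uses Python's sqlite3 module with made-up sea and land tables, and it gives the second SELECT a different alias (creature, a hypothetical name) to show that only the first one survives. SQLite is used here for convenience, so take it as an illustration of the rule as described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sea (fish TEXT);
CREATE TABLE land (dog TEXT);
-- Hypothetical sample data.
INSERT INTO sea VALUES ('tuna'), ('cod');
INSERT INTO land VALUES ('beagle');
""")

cur = conn.execute("""
SELECT fish AS animal FROM sea
UNION
SELECT dog AS creature FROM land
ORDER BY animal
""")

# The result set has a single column, named after the first SELECT's alias.
column_name = cur.description[0][0]
animals = [row[0] for row in cur]
print(column_name, animals)
```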
My solution puts the unions together in a subquery and then orders by them. You only want the first row, so you need a limit clause (or rownum = 1 in Oracle or top 1 in MSSQL):
SELECT people.id AS personId,
people.name AS personName,
(SELECT t.name
FROM ((SELECT person_id, tool_id, delivery_date AS thedate
FROM activity1
) UNION ALL
(SELECT person_id, tool_id, installation_date AS thedate
FROM activity2
)
) a JOIN
tools t
ON a.tool_id = t.id
WHERE a.person_id = people.id
ORDER BY a.thedate DESC
limit 1
) AS toolName
FROM people
ORDER BY people.name ASC
To simplify the query, I also removed the innermost join to tools.
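As a sanity check of the union-then-order idea, here is a small sketch run from Python against an in-memory SQLite database. The table and column names follow the schema in the question (without the table_ prefix), and the rows are invented. Dates are stored as ISO strings so that they sort correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tools (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE activity1 (person_id INTEGER, tool_id INTEGER, delivery_date TEXT);
CREATE TABLE activity2 (person_id INTEGER, tool_id INTEGER, installation_date TEXT);
-- Hypothetical sample data; Cy has no activity at all.
INSERT INTO people VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO tools VALUES (1, 'drill'), (2, 'saw'), (3, 'hammer');
INSERT INTO activity1 VALUES (1, 1, '2020-01-01'), (2, 2, '2020-03-01');
INSERT INTO activity2 VALUES (1, 3, '2020-02-01'), (2, 1, '2020-01-15');
""")

latest_tool_sql = """
SELECT p.name,
       (SELECT t.name
        FROM (SELECT person_id, tool_id, delivery_date AS used_on
              FROM activity1
              UNION ALL
              SELECT person_id, tool_id, installation_date
              FROM activity2) AS a
        JOIN tools t ON t.id = a.tool_id
        WHERE a.person_id = p.id      -- correlate on the person
        ORDER BY a.used_on DESC       -- most recent activity first
        LIMIT 1) AS tool_name
FROM people p
ORDER BY p.name
"""

latest_tools = conn.execute(latest_tool_sql).fetchall()
print(latest_tools)
```

People with no activity still come back, with NULL as the tool name, because the lookup is a scalar subquery and not an inner join.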
In a UNION, you cannot sort by a column unless it is listed after your SELECT.
For Example:
select name from people order by name
select a1.delivery_date, t.name from activity1 a1, tools t
order by a1.delivery_date,t.name
With a UNION, every column named in the ORDER BY must also appear in the select list. In your example both SELECT statements return only tools.name as toolName, but you want to sort by other columns.
MySQL version (possibly you can get more compact code from it):
Select * from
(
Select toolName, person_id, Max(TargetDate) as MaxDate
From
(
SELECT tools.name AS toolName, activity1.person_id, activity1.delivery_date as targetDate
FROM
activity1
JOIN
tools ON tools.id=activity1.tool_id
UNION ALL
SELECT
tools.name AS toolName, activity2.person_id, activity2.installation_date as TargetDate
FROM
activity2
JOIN
tools ON tools.id=activity2.tool_id
)
Group by toolName, person_id
) preselect
join
(
Select toolName, person_id, Max(TargetDate)
From
(
SELECT tools.name AS toolName, activity1.person_id, activity1.delivery_date as targetDate
FROM
activity1
JOIN
tools ON tools.id=activity1.tool_id
UNION ALL
SELECT
tools.name AS toolName, activity2.person_id, activity2.installation_date as TargetDate
FROM
activity2
JOIN
tools ON tools.id=activity2.tool_id
)) result on result.toolName = preselect.toolName and result.person_id = preselect.person_id and result.TargetDate = preselect.MaxDate
Do
SELECT
people.id AS personId,
people.name AS personName,
IF (
(SELECT
delivery_date AS dDate
FROM
activity1
WHERE
person_id=people.id) -- assuming only one row is returned here; otherwise limit by some condition
>
(SELECT
installation_date AS iDate
FROM
activity2
WHERE
person_id=people.id) -- assuming only one row is returned here; otherwise limit by some condition
, (SELECT
tools.name AS toolName
FROM
activity1
JOIN
tools ON tools.id=activity1.tool_id
WHERE
activity1.person_id=people.id)
, (SELECT
tools.name AS toolName
FROM
activity2
JOIN
tools ON tools.id=activity2.tool_id
WHERE
activity2.person_id=people.id)
) AS toolName
FROM
people
ORDER BY
people.name
ASC
This query assumes there is only one record per person in each activity table. If there are more, you need to limit each subquery's result set by some condition, such as the maximum date, which only you know.
SELECT
p.id AS person_id
, p.name AS person_name
, CASE WHEN COALESCE(a1.delivery_date, '1000-01-01')
> COALESCE(a2.installation_date, '1000-01-01')
THEN t1.name
ELSE t2.name
END AS tool_name
FROM
People AS p
LEFT JOIN
Activity1 AS a1
ON (a1.tool_id, a1.delivery_date) =
( SELECT tool_id, delivery_date
FROM Activity1 AS a
WHERE a.person_id = p.id
ORDER BY delivery_date DESC
LIMIT 1
)
LEFT JOIN
Tools AS t1
ON t1.id = a1.tool_id
LEFT JOIN
Activity2 AS a2
ON (a2.tool_id, a2.installation_date) =
( SELECT tool_id, installation_date
FROM Activity2 AS a
WHERE a.person_id = p.id
ORDER BY installation_date DESC
LIMIT 1
)
LEFT JOIN
Tools AS t2
ON t2.id = a2.tool_id
select people.id as person_id,people.name as person_name,tools.name as toolsname
from table_people people left join
(
select
case when installation_date>delivery_date then act2.tool_id
else act1.tool_id end as recent_most_tool_id,
case when installation_date>delivery_date then act2.person_id
else act1.person_id end as recent_most_person_id
from table_activity1 act1 inner join table_activity2 act2
on act1.person_id=act2.person_id)X
on people.id=X.recent_most_person_id
inner join table_tools tools
on tools.id=X.recent_most_tool_id