Search a table based on multiple rows in another table - mysql

Basically I have three MySQL tables:
Users - contains base information on users
Fields - describes additional fields for said users (e.g. location, dob etc.)
Data - Contains user data described via links to the fields table
The basic design is as follows (a stripped-down version):
Users:
ID | username | password | email | registered_date
Fields
ID | name | type
Data:
ID | User_ID | Field_ID | value
What I want to do is search Users by the values of the fields they have. Example fields might be:
Full Name
Town/City
Postcode
etc.
I've got the following, which works when you're only wanting to search by one field:
SELECT `users`.`ID`,
`users`.`username`,
`users`.`email`,
`data`.`value`,
`fields`.`name`
FROM `users`,
`fields`,
`data`
WHERE `data`.`Field_ID` = '2'
AND `data`.`value` LIKE 'london'
AND `users`.`ID` = `data`.`User_ID`
AND `data`.`Field_ID` = `fields`.`ID`
GROUP BY `users`.`ID`
But what if you want to search by multiple fields? For example, say I want to search for Full Name "Joe Bloggs" with Town/City set to "London". This is the real sticking point for me.
Is something like this possible with MySQL?

I'm going with the assumption that "searching multiple fields" is talking about the Entity-Attribute-Value structure.
In that case, I propose that the first step is to create a derived query - basically, we want to limit the "EAV data joined" to only include the records that have the values we are interested in finding. (I've altered some column names, but the same premise holds.)
SELECT d.userId
FROM data d
JOIN fields f
ON f.fieldId = d.fieldId
-- now that we establish data/field relation, filter rows
WHERE f.type = "location" AND d.value = "london"
OR f.type = "job" AND d.value = "programmer"
The resulting rows are derived from the filtered EAV triplets that match our conditions. Only the userId is selected in this case (as it will be used to join against the user relation), but it is also possible to push fieldId/value/etc. through.
Then we can use all of this as a derived query:
SELECT *
FROM users u
JOIN (
-- look, just goes in here :)
SELECT DISTINCT d.userId
FROM data d
JOIN fields f
ON f.fieldId = d.fieldId
WHERE f.type = "location" AND d.value = "london"
OR f.type = "job" AND d.value = "programmer"
) AS e
ON e.userId = u.userId
Notes:
The query planner will figure all the RA stuff out peachy keen; don't worry about this "nesting" as there is no dependent subquery.
I avoid the use of implicit cross-joins as I feel they muddle most queries, this case being a particularly good example.
I've "cheated" and added a DISTINCT to the derived query. This will ensure that at most one record will be joined/returned per user and avoids the use of GROUP BY.
While the above handles "OR" semantics well (it's both easier and I may have misread the question), modifications are required to get "AND" semantics. Here are some ways the derived query can be written to achieve that. (And at this point I must apologize to Tony - I forgot that I've already done all the plumbing to generate such queries trivially in my environment.)
Count the number of matches to ensure that all conditions match. This only works if each field appears at most once per user (duplicate rows for a field would throw the count off). It also eliminates the need for DISTINCT to maintain correct multiplicity.
SELECT d.userId
FROM data d
JOIN fields f
ON f.fieldId = d.fieldId
-- now that we establish data/field relation, filter rows
WHERE f.type = "location" AND d.value = "london"
OR f.type = "job" AND d.value = "programmer"
GROUP BY d.userId
HAVING COUNT(*) = 2
Find the intersecting matches (note that MySQL itself only supports INTERSECT from version 8.0.31 onward):
SELECT d.userId
FROM data d
JOIN fields f ON f.fieldId = d.fieldId
WHERE f.type = "location" AND d.value = "london"
INTERSECT
SELECT d.userId
FROM data d
JOIN fields f ON f.fieldId = d.fieldId
WHERE f.type = "job" AND d.value = "programmer"
Using JOINS (see Tony's answer).
SELECT d1.userId
FROM data d1
JOIN data d2 ON d2.userId = d1.userId
JOIN fields f1 ON f1.fieldId = d1.fieldId
JOIN fields f2 ON f2.fieldId = d2.fieldId
-- requires AND here across row
WHERE f1.type = "location" AND d1.value = "london"
AND f2.type = "job" AND d2.value = "programmer"
An inner JOIN itself provides conjunction semantics when applied outside of the condition. In this case I "re-normalize" the data. This can also be written such that [sub-]selects appear in the select clause.
SELECT userId
FROM (
-- renormalize, many SO questions on this
SELECT q1.userId, q1.value as location, q2.value as job
FROM (SELECT d.userId, d.value
FROM data d
JOIN fields f ON f.fieldId = d.fieldId
WHERE f.type = "location") AS q1
JOIN (SELECT d.userId, d.value
FROM data d
JOIN fields f ON f.fieldId = d.fieldId
WHERE f.type = "job") AS q2
ON q1.userId = q2.userId
) AS q
WHERE location = "london"
AND job = "programmer"
The above duplication is relatively easy to generate via code, and some databases (such as SQL Server) support CTEs, which make writing such queries much simpler. YMMV.
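As a rough illustration of "generating via code", here is a minimal sketch of the COUNT/HAVING variant, run against SQLite from Python so it can be tried without a MySQL server. The sample rows and the helper name `users_matching_all` are assumptions made for the demo, and it relies on each field appearing at most once per user, as noted above.

```python
import sqlite3

# Hypothetical miniature of the EAV schema used in this answer.
# User 1 matches both criteria, user 2 only the location, user 3 only the job.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fields (fieldId INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE data (userId INTEGER, fieldId INTEGER, value TEXT);
    INSERT INTO fields VALUES (1, 'location'), (2, 'job');
    INSERT INTO data VALUES
        (1, 1, 'london'), (1, 2, 'programmer'),
        (2, 1, 'london'), (2, 2, 'baker'),
        (3, 1, 'paris'),  (3, 2, 'programmer');
""")

def users_matching_all(conn, criteria):
    """Return userIds whose EAV rows satisfy every (type, value) pair."""
    # One "(f.type = ? AND d.value = ?)" group per criterion, OR-ed together;
    # HAVING then keeps only users that matched as many rows as there are criteria.
    where = " OR ".join("(f.type = ? AND d.value = ?)" for _ in criteria)
    sql = f"""
        SELECT d.userId
        FROM data d
        JOIN fields f ON f.fieldId = d.fieldId
        WHERE {where}
        GROUP BY d.userId
        HAVING COUNT(*) = ?
        ORDER BY d.userId
    """
    params = [p for pair in criteria for p in pair] + [len(criteria)]
    return [row[0] for row in conn.execute(sql, params)]

both = users_matching_all(conn, [("location", "london"), ("job", "programmer")])
london_only = users_matching_all(conn, [("location", "london")])
print(both, london_only)
```

The values are passed as bound parameters rather than spliced into the SQL text; only the repeated condition template is generated. The sketch expects at least one criterion.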

If I understood you right, this is what you want:
SELECT `users`.`ID`, `users`.`username`, `users`.`email`
FROM `users`,
`data` `location`,
`data` `name`
WHERE `location`.`Field_ID` = '2'
AND `location`.`value` LIKE 'london'
AND `name`.`Field_ID` = 'whathere? something for its name'
AND `name`.`value` LIKE 'Joe Bloggs'
AND `users`.`ID` = `location`.`User_ID`
AND `users`.`ID` = `name`.`User_ID`
I'd prefer joins though

Well here you hit one of the downsides of the EAV you are using
SELECT u.ID, u.username, u.email, d1.value, f1.name, d2.value, f2.name
FROM `users` u
inner join data d1 On d1.User_id = u.id
inner join data d2 On d2.User_id = u.id
inner join fields f1 on f1.id = d1.field_id
inner join fields f2 on f2.id = d2.field_id
WHERE d1.Field_id = '2' and d1.Value = 'london'
and d2.field_id = '??' and d2.value = 'Joe Bloggs'
GROUP BY u.ID
Messy, isn't it? Bet you can't wait to go for four or five values. Or think about (Forename = Joe Or surname = Bloggs) and City = London...
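One way to keep the messiness out of hand-written SQL is to generate one self-join per search condition. The following is only a sketch under assumptions: the helper name `build_search`, the sample users, and the field IDs (1 for full name, 2 for town) are invented for the demo, and SQLite is used from Python so it runs standalone.

```python
import sqlite3

def build_search(criteria):
    """Build (sql, params) with one self-join on `data` per (field_id, value) pair."""
    joins, params = [], []
    for i, (field_id, value) in enumerate(criteria):
        alias = f"d{i}"  # generated alias, never taken from user input
        joins.append(
            f"INNER JOIN data {alias} ON {alias}.User_ID = u.ID "
            f"AND {alias}.Field_ID = ? AND {alias}.value = ?"
        )
        params += [field_id, value]
    sql = "SELECT u.ID, u.username FROM users u " + " ".join(joins) + " ORDER BY u.ID"
    return sql, params

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (ID INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE data (ID INTEGER PRIMARY KEY, User_ID INTEGER, Field_ID INTEGER, value TEXT);
    INSERT INTO users VALUES (1, 'joe'), (2, 'jane');
    INSERT INTO data (User_ID, Field_ID, value) VALUES
        (1, 1, 'Joe Bloggs'), (1, 2, 'london'),
        (2, 1, 'Jane Doe'),   (2, 2, 'london');
""")

# Town = london AND full name = Joe Bloggs (field IDs are assumptions).
sql, params = build_search([(2, "london"), (1, "Joe Bloggs")])
rows = conn.execute(sql, params).fetchall()
print(rows)
```

Each extra condition costs one more join, so this scales in query size rather than in hand-written complexity; mixed AND/OR searches would need a different generator.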

Related

How to create a SQL search query with varying arguments

I'm implementing search functionality into my application, and I need to write the SQL query that will search the database for records based on a wide range of possible parameters. I'm including a sample of what I have so far:
SELECT m.*, a.*, p.*, e.*
FROM members m
LEFT JOIN addresses a ON m.member_id = a.address_member_id AND a.preferred_address = TRUE
LEFT JOIN phones p ON m.member_id = p.phone_member_id AND p.preferred_phone = TRUE
LEFT JOIN emails e ON m.member_id = e.email_member_id AND e.preferred_email = TRUE
WHERE m.member_id IN (
(SELECT address_member_id
FROM addresses
WHERE address = '123 Hart Ct' AND unit IS NULL AND city = 'Hometown' AND state = 'NJ' AND zip_code = '08895'),
(SELECT phone_member_id
FROM phones
WHERE area_code = '732' AND prefix = '619' AND line_number = '7826'),
(SELECT email_member_id
FROM emails
WHERE email_address = 'craigmiller160#gmail.com')
)
AND m.first_name = 'Jane' AND m.middle_name = 'Deborah' AND m.last_name = 'Foster'
ORDER BY member_id ASC;
As you can see, I'm searching for records in the "members" table based on the values in that table, as well as the values of the "preferred" address, phone, email, etc, that are associated with it.
The problem is that the query above requires all of those values to be provided. In most cases, the user will only provide a handful of the actual values to perform the search.
Now, I know that one option is to build the query dynamically in my application layer, concatenating Strings together and whatnot. That way the WHERE clauses would only have the values that are needed for that specific query. But before I do that, I'm wondering if there's any way to accomplish this in a purely-SQL way?
You have two options for doing this in a purely SQL way.
The first is a stored procedure: inside it you can use IF conditions to concatenate the query string and then execute it (I can provide a demo if you want).
The second is to use an OR condition for each parameter:
SELECT m.*, a.*, p.*, e.*
FROM members m
LEFT JOIN addresses a ON m.member_id = a.address_member_id
AND a.preferred_address = TRUE
LEFT JOIN phones p ON m.member_id = p.phone_member_id
AND p.preferred_phone = TRUE
LEFT JOIN emails e ON m.member_id = e.email_member_id
AND e.preferred_email = TRUE
WHERE m.member_id IN ((SELECT address_member_id
FROM addresses
WHERE ('123 Hart Ct' is NULL OR address = '123 Hart Ct')
AND (NULL is NULL OR unit = NULL)
AND ('Hometown' is NULL OR city = 'Hometown')
AND ('NJ' is NULL OR state = 'NJ')
AND ('08895' IS NULL OR zip_code = '08895'));
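Each quoted literal above stands in for the corresponding search parameter: when the parameter is NULL, the first half of the OR is true and that filter is skipped. Note that you can replace IS NULL with = "" if an unused parameter will be passed as an empty string rather than NULL.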
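The OR pattern is easier to see with real bound parameters than with literals. Here is a small sketch using SQLite from Python; the trimmed `addresses` table, its rows, and the `search` helper are assumptions made for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE addresses (address_member_id INTEGER, address TEXT, city TEXT, state TEXT);
    INSERT INTO addresses VALUES
        (1, '123 Hart Ct', 'Hometown', 'NJ'),
        (2, '9 Elm St',    'Hometown', 'NJ'),
        (3, '4 Oak Ave',   'Elsewhere', 'NY');
""")

# One fixed query text; a NULL parameter switches its own filter off.
SEARCH = """
    SELECT address_member_id
    FROM addresses
    WHERE (:address IS NULL OR address = :address)
      AND (:city    IS NULL OR city    = :city)
      AND (:state   IS NULL OR state   = :state)
    ORDER BY address_member_id
"""

def search(address=None, city=None, state=None):
    """Each argument left as None disables its filter."""
    params = {"address": address, "city": city, "state": state}
    return [row[0] for row in conn.execute(SEARCH, params)]

by_city = search(city="Hometown")
by_city_and_address = search(city="Hometown", address="9 Elm St")
everything = search()
print(by_city, by_city_and_address, everything)
```

The trade-off is that the database sees the same statement regardless of which filters are active, so it may not pick the best index for every combination; building the WHERE clause dynamically avoids that at the cost of more application code.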

MySQL UNION DISTINCT - exclude

I have query like this:
SELECT cs_event.*, cs_file.name, cs_file.extension, cs_user.first_name, cs_user.last_name
FROM cs_event
LEFT JOIN cs_file ON cs_event.idfile = cs_file.idfile
LEFT JOIN cs_user ON cs_event.iduser = cs_user.iduser
WHERE type != 51
AND idportal = 1
UNION DISTINCT
SELECT cs_event.*, cs_file.name, cs_file.extension, cs_user.first_name, cs_user.last_name
FROM cs_event
LEFT JOIN cs_file ON cs_event.idfile = cs_file.idfile
LEFT JOIN cs_user ON cs_event.iduser = cs_user.iduser
WHERE shared_with_users LIKE '%i:2;%'
AND idportal = 1
ORDER BY add_date DESC
LIMIT 6
The problem is the following:
A regular user can't see certain types of events (for now, type 51), and can only see things that are shared with him.
The shared_with_users column can be NULL or have a value. It has a value only for one type of event (type = 50); for all other events it is NULL.
I need to do the following:
The user can access all events except those of type 51, and if the event is of type 50, I need to check whether it is shared with him (shared_with_users column) and collect it too. Is it possible to write this kind of query?
Try this
SELECT cs_event.*, cs_file.name, cs_file.extension, cs_user.first_name, cs_user.last_name
FROM cs_event
LEFT JOIN cs_file ON cs_event.idfile = cs_file.idfile
LEFT JOIN cs_user ON cs_event.iduser = cs_user.iduser
WHERE (type != 51 OR (type = 50 AND shared_with_users LIKE '%i:2;%'))
AND idportal = 1
ORDER BY add_date DESC
LIMIT 6
I think you can do this as a single query, with logic in the WHERE clause:
SELECT e.*, f.name, f.extension, u.first_name, u.last_name
FROM cs_event e LEFT JOIN
cs_file f
ON e.idfile = f.idfile LEFT JOIN
cs_user u
ON e.iduser = u.iduser
WHERE idportal = 1 AND
(type <> 51 OR shared_with_users LIKE '%i:2;%');
Some notes:
I don't think the LEFT JOINs are necessary. The WHERE clause may be turning them into inner joins anyway, but it is hard to tell without qualified column names.
I added table aliases so the query is easier to write and to read.
The logic for shared_with_users suggests that you have stored a list of values in a string. That is a bad choice.
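Both answers depend on how AND and OR combine: AND binds more tightly than OR, so a trailing `AND idportal = 1` without parentheses only guards the last branch. The sketch below (SQLite from Python, with an invented five-row `cs_event` table) compares three WHERE clauses. The third, stricter variant is only one possible reading of the question, not something either answer states.

```python
import sqlite3

# Rows: 1 = ordinary event on portal 1, 2 = ordinary event on another portal,
# 3 = type 50 shared with user 2, 4 = type 50 shared with someone else,
# 5 = hidden type 51.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cs_event (idevent INTEGER PRIMARY KEY, type INTEGER,
                           idportal INTEGER, shared_with_users TEXT);
    INSERT INTO cs_event VALUES
        (1, 10, 1, NULL),
        (2, 10, 2, NULL),
        (3, 50, 1, 'i:2;'),
        (4, 50, 1, 'i:7;'),
        (5, 51, 1, NULL);
""")

def ids(where):
    """Return the event ids selected by a WHERE clause (demo only, fixed strings)."""
    sql = f"SELECT idevent FROM cs_event WHERE {where} ORDER BY idevent"
    return [row[0] for row in conn.execute(sql)]

shared = "shared_with_users LIKE '%i:2;%'"

# AND binds tighter than OR, so idportal only guards the right-hand branch.
unparenthesized = ids(f"type != 51 OR {shared} AND idportal = 1")
# Parentheses make the portal filter apply to both branches.
parenthesized = ids(f"idportal = 1 AND (type != 51 OR {shared})")
# Stricter reading (an assumption): type 50 is visible only when shared.
strict = ids(f"idportal = 1 AND type != 51 AND (type != 50 OR {shared})")
print(unparenthesized, parenthesized, strict)
```

In this data the unparenthesized form leaks the event from the other portal, while the stricter form additionally hides the type 50 event that was shared with a different user.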

Get results with various WHERE combinations within the same JOIN

I'm trying to work out the best way to get all "searchproducts" which have parts with specified attributes. This is what I have at the moment, but it won't return anything despite there being a part that has both attributes/attribute headers.
SELECT DISTINCT * FROM searchproduct
JOIN part ON searchproduct.id = part.searchproduct_id
LEFT JOIN part_attribute ON part.id = part_attribute.part_id
JOIN part_attribute ON part_attribute.id = part_attribute.part_attributeheader_id
WHERE (part_attribute.name = 'Colour' AND part_attribute.value IN ('Black')) AND (part_attribute.name = 'Size' AND part_attribute.value IN ('11'));
Each searchproduct has multiple part, each part has multiple part_attribute, each part_attribute has one part_attributeheader, each part_attributeheader has one name.
I am thinking I possibly need to add some sort of grouping? Everything I have tried returns no results.
Here's an example of the data (is there a better way to show it?)
samplesearchproduct        (searchproduct)
        |
        v
     part B                (part)
     |     |
     v     v
   "11"  "Black"           (part_attribute.value)
     |     |
     v     v
  "Size" "Colour"          (part_attributeheader.name)
I am struggling to understand your data layout (are part_attribute and part_attributeheader different tables?), but I think you need two joins. Something like this:
SELECT DISTINCT *
FROM searchproduct
JOIN part ON searchproduct.id = part.searchproduct_id
INNER JOIN part_attribute p0 ON part.id = p0.part_id
INNER JOIN part_attribute P1 ON part.id = P1.part_id
WHERE (p0.name = 'Colour' AND p0.value IN ('Black'))
AND (P1.name = 'Size' AND P1.value IN ('11'));
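The reason a single join returns nothing is that one attribute row can never be both 'Colour' and 'Size' at once. A sketch of the difference, using SQLite from Python: the column layout below (value on part_attribute, name on part_attributeheader) is an assumption pieced together from the question's description, not a confirmed schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE searchproduct (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE part (id INTEGER PRIMARY KEY, searchproduct_id INTEGER);
    CREATE TABLE part_attributeheader (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE part_attribute (id INTEGER PRIMARY KEY, part_id INTEGER,
                                 part_attributeheader_id INTEGER, value TEXT);
    INSERT INTO searchproduct VALUES (1, 'samplesearchproduct');
    INSERT INTO part VALUES (1, 1);
    INSERT INTO part_attributeheader VALUES (1, 'Size'), (2, 'Colour');
    INSERT INTO part_attribute VALUES (1, 1, 1, '11'), (2, 1, 2, 'Black');
""")

# One join: a single attribute row would have to be both Colour and Size.
single_join = conn.execute("""
    SELECT DISTINCT sp.name
    FROM searchproduct sp
    JOIN part p ON p.searchproduct_id = sp.id
    JOIN part_attribute pa ON pa.part_id = p.id
    JOIN part_attributeheader h ON h.id = pa.part_attributeheader_id
    WHERE (h.name = 'Colour' AND pa.value = 'Black')
      AND (h.name = 'Size' AND pa.value = '11')
""").fetchall()

# Two joins: one attribute/header pair per condition, both tied to the same part.
double_join = conn.execute("""
    SELECT DISTINCT sp.name
    FROM searchproduct sp
    JOIN part p ON p.searchproduct_id = sp.id
    JOIN part_attribute pa0 ON pa0.part_id = p.id
    JOIN part_attributeheader h0 ON h0.id = pa0.part_attributeheader_id
    JOIN part_attribute pa1 ON pa1.part_id = p.id
    JOIN part_attributeheader h1 ON h1.id = pa1.part_attributeheader_id
    WHERE h0.name = 'Colour' AND pa0.value = 'Black'
      AND h1.name = 'Size' AND pa1.value = '11'
""").fetchall()
print(single_join, double_join)
```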

Referencing (Using an alias) the value from nested MySQL query in an outer conditional statement

I'm using a nested query to get a value from a table that I then need to use in a conditional expression. However, every time I try this I get an error saying unknown column (format) in the field list.
SELECT
(SELECT format FROM competition_stages WHERE comp_id = "5" AND rid = "24") AS format,
a.tie_id, b.name AS team_a, b.team_id AS team_a_id, c.name AS team_b, c.team_id AS team_b_id, SUM(e.bonus) AS team_a_bonus, SUM(f.bonus) AS team_b_bonus,
SUM(CASE
WHEN (a.team_a = e.team_id AND format = "0") THEN e.score
END) as team_a_agg,
SUM(CASE
WHEN (a.team_b = f.team_id AND format = "0") THEN f.score
END) as team_b_agg
FROM competition_tie a
INNER JOIN teams b ON (a.team_a = b.team_id)
INNER JOIN teams c ON (a.team_b = c.team_id)
LEFT JOIN fixtures d ON (a.tie_id = d.tie_id)
LEFT JOIN fixture_scores e ON (d.fx_id = e.fx_id AND a.team_a = e.team_id)
LEFT JOIN fixture_scores f ON (d.fx_id = f.fx_id AND a.team_b = f.team_id)
WHERE a.comp_id = "5" AND a.rid = "24" AND a.season_id = "5"
GROUP BY a.tie_id
ORDER BY a.tie_id ASC
I can read the value of the format column from my results when I loop through them, but it seems I can't use it within the query itself.
Thanks for your help!
Instead of using a subquery, simply join the competition_stages table to your query so that you can reference the format column directly. Assuming there is no apparent relation between the competition_stages table and the other tables in the query (at least not from the information on hand), you can join the table using the two conditions you specified, provided these conditions yield only one row in the competition_stages table. Something like this:
SELECT cs.format, a.tie_id, ....
FROM competition_tie a ...
INNER JOIN competition_stages cs ON cs.comp_id = "5" AND cs.rid = "24"

SQL: Get latest entries from history table

I have 3 tables
person (id, name)
area (id, number)
history (id, person_id, area_id, type, datetime)
In these tables I store which person had which area at a specific time. It is like a salesman who travels in an area for a while and then gets another area. He can also have multiple areas at a time.
history type = 'I' for CheckIn or 'O' for Checkout.
Example:
id person_id area_id type datetime
1 2 5 'O' '2011-12-01'
2 2 5 'I' '2011-12-31'
A person started traveling in area 5 on 2011-12-01 and gave it back on 2011-12-31.
Now I want to have a list of all the areas all persons have right now.
person1.name, area1.number, area2.number, area6.number
person2.name, area5.number, area9.number
....
The output could be like this too (it doesn't matter):
person1.name, area1.number
person1.name, area2.number
person1.name, area6.number
person2.name, area5.number
....
How can I do that?
This question is, indeed, quite tricky. You need a list of the entries in history where, for a given user and area, there is an 'O' record with no subsequent 'I' record. Working with just the history table, that translates to:
SELECT ho.person_id, ho.area_id, ho.type, MAX(ho.datetime)
FROM History AS ho
WHERE ho.type = 'O'
AND NOT EXISTS(SELECT *
FROM History AS hi
WHERE hi.person_id = ho.person_id
AND hi.area_id = ho.area_id
AND hi.type = 'I'
AND hi.datetime > ho.datetime
)
GROUP BY ho.person_id, ho.area_id, ho.type;
Then, since you're really only after the person's name and the area's number (though why the area number can't be the same as its ID I am not sure), you need to adapt slightly, joining with the extra two tables:
SELECT p.name, a.number
FROM History AS ho
JOIN Person AS p ON ho.person_id = p.id
JOIN Area AS a ON ho.area_id = a.id
WHERE ho.type = 'O'
AND NOT EXISTS(SELECT *
FROM History AS hi
WHERE hi.person_id = ho.person_id
AND hi.area_id = ho.area_id
AND hi.type = 'I'
AND hi.datetime > ho.datetime
);
The NOT EXISTS clause is a correlated sub-query; that tends to be inefficient. You might be able to recast it as a LEFT OUTER JOIN with appropriate join and filter conditions:
SELECT p.name, a.number
FROM History AS ho
JOIN Person AS p ON ho.person_id = p.id
JOIN Area AS a ON ho.area_id = a.id
LEFT OUTER JOIN History AS hi
ON hi.person_id = ho.person_id
AND hi.area_id = ho.area_id
AND hi.type = 'I'
AND hi.datetime > ho.datetime
WHERE ho.type = 'O'
AND hi.person_id IS NULL;
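Since the SQL above is marked as unverified, here is a small check that the NOT EXISTS form and the LEFT OUTER JOIN form agree, using SQLite from Python. The people, area numbers, and the two extra history rows are invented for the test; 'O' is treated as "took the area" and 'I' as "gave it back", matching the question's example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE area (id INTEGER PRIMARY KEY, number INTEGER);
    CREATE TABLE history (id INTEGER PRIMARY KEY, person_id INTEGER,
                          area_id INTEGER, type TEXT, datetime TEXT);
    INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO area VALUES (5, 105), (9, 109);
    INSERT INTO history VALUES
        (1, 2, 5, 'O', '2011-12-01'),
        (2, 2, 5, 'I', '2011-12-31'),
        (3, 2, 9, 'O', '2012-01-05'),
        (4, 1, 5, 'O', '2012-01-10');
""")

# Anti-join written with a correlated NOT EXISTS.
not_exists = conn.execute("""
    SELECT p.name, a.number
    FROM history AS ho
    JOIN person AS p ON ho.person_id = p.id
    JOIN area AS a ON ho.area_id = a.id
    WHERE ho.type = 'O'
      AND NOT EXISTS (SELECT 1
                      FROM history AS hi
                      WHERE hi.person_id = ho.person_id
                        AND hi.area_id = ho.area_id
                        AND hi.type = 'I'
                        AND hi.datetime > ho.datetime)
    ORDER BY p.name, a.number
""").fetchall()

# The same anti-join written as LEFT OUTER JOIN ... IS NULL.
left_join = conn.execute("""
    SELECT p.name, a.number
    FROM history AS ho
    JOIN person AS p ON ho.person_id = p.id
    JOIN area AS a ON ho.area_id = a.id
    LEFT OUTER JOIN history AS hi
           ON hi.person_id = ho.person_id
          AND hi.area_id = ho.area_id
          AND hi.type = 'I'
          AND hi.datetime > ho.datetime
    WHERE ho.type = 'O'
      AND hi.person_id IS NULL
    ORDER BY p.name, a.number
""").fetchall()
print(not_exists, left_join)
```

Bob's first stint in area 5 is closed by the later 'I' row, so only his open area 9 and Ann's open area 5 remain. Dates are stored as ISO strings here, which compare correctly as text.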
All SQL unverified.
You're looking for results where each row may have a different number of columns? I think you may want to look into GROUP_CONCAT()
SELECT p.`id`, GROUP_CONCAT(a.`number` SEPARATOR ',') AS `areas` FROM `person` p LEFT JOIN `history` h ON h.`person_id` = p.`id` LEFT JOIN `area` a ON a.`id` = h.`area_id` GROUP BY p.`id`
I haven't tested this query, but I have used group concat in similar ways before. Naturally, you will want to tailor this to fit your needs. Of course, group concat will return a string so it will require post processing to use the data.
EDIT: I think your question has been edited since I began responding. My query does not really fit your request anymore...
Try this:
select *
from person p
inner join history h on h.person_id = p.id
left outer join history h2 on h2.person_id = p.id and h2.area_id = h.area_id and h2.type = 'O'
inner join area a on a.id = h.area_id
where h2.person_id is null and h.type = 'I'