The image shows the structure of my table. The first line means tutorB gives 10 marks to studentD. The second line means tutorE does not give any marks to studentD yet.
How can I generate the following table? I have looked at another Stack Overflow post, "Collaborative filtering in MySQL?", but I am still quite confused.
In the image above, "o" means recommended (the rate is higher than or equal to 7) and "x" means not recommended (the rate is less than 7).
For example, tutorB gives studentD 10 marks, so the second line in the image has an "o" in the StudentD column. (The data in the other three rows is just randomly assigned for now.)
Now suppose I want to recommend a student to TutorA. The ranks (or similarities) of TutorB, TutorC and TutorD are 0, 2 and 3 respectively.
How can I write SQL that converts the rate to "o" and "x" and calculates the rank? Most importantly, I want to recommend StudentH to TutorA, as in the image.
How should I modify the code from the previous post? And is the idea described above correct?
Thanks.
============================================================================
EDITED
I have the following data in the database. The first row means 10 marks is given by tutorA to studentC.
I convert it as another table for a better understanding. v is the value of Rate.
create temporary table ub_rank as
select similar.NameA,count(*) rank
from tbl_rating target
join tbl_rating similar on target.NameB= similar.NameB and target.NameA != similar.NameA
where target.NameA = "tutorA"
group by similar.NameA;
select similar.NameB, sum(ub_rank.rank) total_rank
from ub_rank
join ub similar on ub_rank.NameA = similar.NameA
left join ub target on target.NameA = "tutorA" and target.NameB = similar.NameB
where target.NameB is null
group by similar.NameB
order by total_rank desc;
select * from ub_rank;
The code above is referenced from Collaborative filtering in MySQL?. I have a few questions.
There are 2 parts in the SQL. I can select * from the first part. However, if I run the whole SQL as shown above, the system reports "Table 'mydatabase.ub' doesn't exist". How should I modify the code?
The code finds the similarity. How should I change it so that marks less than 7 are changed to o and all other marks to v, and then count the similarity for a given user?
Shamelessly borrowing from the answer to this previous question, see if this does the trick:
SET @sql = NULL;
SELECT
GROUP_CONCAT(DISTINCT
CONCAT(
'max(case when NameB = ''',
NameB,
''' then (case when rate >= 7 then ''x'' else ''o'' end) else '' '' end) AS ',
replace(NameB, ' ', '')
)
) INTO @sql
from tbl_rating
where RoleA = 'Tutor';
SET @sql = CONCAT('SELECT NameA, ', @sql,
' from tbl_rating
where RoleA = ''Tutor''
group by NameA');
PREPARE stmt FROM @sql;
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
Here is a SQL Fiddle.
Your DB schema is actually not very easy to work with.
Here's a query to get an exhaustive rating table:
SELECT Tutor.Name, Student.Name,
CASE WHEN Rating.Rate IS NULL THEN ''
WHEN Rating.Rate > 6 THEN 'o'
ELSE 'x' END
FROM (
SELECT DISTINCT NameB AS Name
FROM tbl_rating
WHERE RoleB='Tutor'
UNION
SELECT DISTINCT NameA AS Name
FROM tbl_rating
WHERE RoleA='Tutor'
ORDER BY Name) AS Tutor
CROSS JOIN (
SELECT DISTINCT NameB AS Name
FROM tbl_rating
WHERE RoleB='Student'
UNION
SELECT DISTINCT NameA AS Name
FROM tbl_rating
WHERE RoleA='Student'
ORDER BY Name) AS Student
LEFT JOIN tbl_rating AS Rating
ON Tutor.Name = Rating.NameA
AND Student.Name = Rating.NameB
ORDER BY Tutor.Name, Student.Name
The above query works by extracting from the table the list of all tutors (first subquery, aliased to Tutor) and the list of all students (second subquery, Student), then taking the product of both sets to obtain every possible combination of tutor and student. It then does an outer join with the rating table, which finds all the ratings done by students on tutors and fills in NULL for nonexistent ratings.
(The query to obtain the opposite rating, i.e. student rating by tutors, can be obtained by swapping NameA and NameB in the LEFT JOIN clauses.)
The CASE turns numerical (or null) ratings to symbols as requested.
For similarities, we need to add two more joins:
one more on Tutor,
and another one on Rating
thus giving:
SELECT T1.Name AS Tutor1 , T2.Name AS Tutor2,
SUM( CASE
WHEN (R1.Rate > 6 && R2.Rate > 6) ||
(R1.Rate < 7 && R2.Rate < 7) THEN 1
ELSE 0 END) AS SIMILARITY
FROM (
SELECT DISTINCT NameB AS Name
FROM tbl_rating
WHERE RoleB='Tutor'
UNION
SELECT DISTINCT NameA AS Name
FROM tbl_rating
WHERE RoleA='Tutor'
ORDER BY Name) AS T1
CROSS JOIN (
SELECT DISTINCT NameB AS Name
FROM tbl_rating
WHERE RoleB='Tutor'
UNION
SELECT DISTINCT NameA AS Name
FROM tbl_rating
WHERE RoleA='Tutor'
ORDER BY Name) AS T2
CROSS JOIN (
SELECT DISTINCT NameB AS Name
FROM tbl_rating
WHERE RoleB='Student'
UNION
SELECT DISTINCT NameA AS Name
FROM tbl_rating
WHERE RoleA='Student'
ORDER BY Name) AS Student
LEFT JOIN tbl_rating AS R1
ON T1.Name = R1.NameA
AND Student.Name = R1.NameB
LEFT JOIN tbl_rating AS R2
ON T2.Name = R2.NameA
AND Student.Name = R2.NameB
WHERE T1.Name < T2.Name
GROUP BY Tutor1, Tutor2
ORDER BY Tutor1, Tutor2
You could make these queries much more efficient by extracting the student-specific and tutor-specific data into their own tables, splitting the rating table into student ratings and tutor ratings, and using foreign keys:
Table student : Id | Name
Table tutor: Id | Name
Table tutor_rating: StudentId | TutorId | Rate
Table student_rating: StudentId | TutorId | Rate
and possibly a tutor_similarity table to avoid recomputing the whole dataset all the time, with a couple of triggers on the rating tables to update it (the similarity computation would then be incremental, and queries would just dump its content).
Table tutor_similarity: TutorId1 | TutorId2 | Similarity
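The normalized layout above can be tried out quickly. What follows is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL, the sample rows are made up, and it assumes that student_rating holds the marks a tutor gives to a student. Similarity is counted as in the query above: one point for every student on whom two tutors agree (both at 7 or above, or both below 7).

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the rows below are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tutor   (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE student (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
-- Assumption: student_rating stores the marks a tutor gives to a student.
CREATE TABLE student_rating (
    StudentId INTEGER NOT NULL REFERENCES student(Id),
    TutorId   INTEGER NOT NULL REFERENCES tutor(Id),
    Rate      INTEGER NOT NULL,
    PRIMARY KEY (StudentId, TutorId)
);
INSERT INTO tutor   VALUES (1, 'tutorA'), (2, 'tutorB'), (3, 'tutorC');
INSERT INTO student VALUES (1, 'studentD'), (2, 'studentE'), (3, 'studentF');
INSERT INTO student_rating VALUES
    (1, 1, 9), (2, 1, 3),              -- tutorA: D recommended, E not
    (1, 2, 10), (2, 2, 8), (3, 2, 7),  -- tutorB: D, E and F recommended
    (1, 3, 8), (2, 3, 2);              -- tutorC: D recommended, E not
""")

# Pair up ratings of the same student by two different tutors and count agreements.
similarity = conn.execute("""
SELECT t1.Name, t2.Name,
       SUM(CASE WHEN (r1.Rate >= 7) = (r2.Rate >= 7) THEN 1 ELSE 0 END) AS Similarity
FROM student_rating r1
JOIN student_rating r2
  ON r2.StudentId = r1.StudentId AND r1.TutorId < r2.TutorId
JOIN tutor t1 ON t1.Id = r1.TutorId
JOIN tutor t2 ON t2.Id = r2.TutorId
GROUP BY t1.Id, t2.Id
ORDER BY t1.Name, t2.Name
""").fetchall()

for row in similarity:
    print(row)
```

Joining on integer keys instead of name strings is what makes the self-join cheap; the result of this query is also what a tutor_similarity table would cache.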
This is really a comment but it is too long for a comment.
First, you cannot easily create a table with a variable number of columns. Do you know the columns in advance? In general, you represent a matrix the way you do in your original table . . . the "x" and "y" values are columns and the value goes in a third column.
Second, is the x and o based on the rating from the tutor to the student or vice versa? Your question is entirely ambiguous.
Third, to convert a rating to an "x" or "o", just use a case statement:
select (case when rating >= 7 then 'x' else 'o' end)
Fourth, you say the similarities from A to B, C, and D are 0, 2, and 3 respectively. I have no idea how you are getting this from the matrix that you show. If it is by overlap of "x"s, then the values would seem to be 0, 1, and 2.
My final conclusion is that you don't need to create a matrix like that at all because you already have the data in the correct format.
Related
I have a relation between users and groups. Users can be in a group or not.
EDIT : Added some stuff to the model to make it more convenient.
Let's say I have a rule that adds users to a group if they live in a specific town and have a given custom metadata value (such as age 18).
Currently, I do this to find which users I have to add to the group of people living in Paris who are 18:
SELECT user.id AS 'id'
FROM user
LEFT JOIN
(
SELECT user_id
FROM user_has_role_group
WHERE role_group_id = 1 -- Group for Paris
)
AS T1
ON user.id = T1.user_id
WHERE
(
user.town = 'Paris' AND JSON_EXTRACT(user.custom_metadata, '$.age') = 18
)
AND T1.user_id IS NULL
It works & gives me the IDs of the users to insert in group.
But when I have 50 groups to process, for example for 50 towns or various ages, it forces me to run 50 queries, which is very slow and inefficient for my database.
How could I generate a result for each group ?
Something like :
role_group_id user_to_add
1 1
1 2
2 1
2 3
The only way I know to do that for now is to do an UNION on several sub queries like the one above, but of course it's very slow.
Note that the custom_metadata field is a user defined field. I can't create specific columns or tables.
Thanks a lot for your help.
If I understood you correctly:
select user.id, grp.id
from user, role_group grp
where (user.id, grp.id) not in (select user_id, role_group_id from user_has_role_group) and user.town in ('Paris', 'Warsav')
That code gives the list of users from one of the listed towns together with the groups they do not belong to.
To add the missing entries to user_has_role_group, you might want to have some mapping between those town names and their group_id's.
The example below is just using a subquery with unions for that.
But you could replace that with a select from a table.
Maybe even from role_group, if those names correlate with the user town names.
insert into user_has_role_group (user_id, role_group_id)
select u.id, g.group_id
from user u
join (
select 'Paris' as name, 1 as group_id union all
select 'Rome', 2
-- add more towns here
) g on (u.town = g.name)
left join user_has_role_group ug
on (ug.user_id = u.id and ug.role_group_id = g.group_id)
where u.town in ('Paris','Rome') -- add more towns here
and json_extract(u.custom_metadata, '$.age') = 18
and ug.user_id is null;
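The mapping idea above can be sketched end to end. This is only an illustration under stated assumptions: it runs on Python's built-in sqlite3 rather than MySQL, the group_rule table is hypothetical (one row per group rule, replacing the inline union), and the age is a plain column here, whereas in MySQL it would come from JSON_EXTRACT on custom_metadata.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE user (id INTEGER PRIMARY KEY, town TEXT, age INTEGER);
CREATE TABLE user_has_role_group (user_id INTEGER, role_group_id INTEGER);
-- Hypothetical rule table: one row per group instead of one query per group.
CREATE TABLE group_rule (role_group_id INTEGER, town TEXT, age INTEGER);
INSERT INTO user VALUES (1, 'Paris', 18), (2, 'Paris', 18), (3, 'Rome', 18), (4, 'Rome', 30);
INSERT INTO group_rule VALUES (1, 'Paris', 18), (2, 'Rome', 18);
INSERT INTO user_has_role_group VALUES (2, 1);  -- user 2 is already in group 1
""")

# Join every rule to the users it matches, then keep only the pairs that have
# no membership row yet (anti-join via LEFT JOIN ... IS NULL).
missing = conn.execute("""
SELECT r.role_group_id, u.id AS user_to_add
FROM group_rule r
JOIN user u ON u.town = r.town AND u.age = r.age
LEFT JOIN user_has_role_group ug
       ON ug.user_id = u.id AND ug.role_group_id = r.role_group_id
WHERE ug.user_id IS NULL
ORDER BY r.role_group_id, u.id
""").fetchall()

print(missing)
```

A single pass over the rule table yields the (role_group_id, user_to_add) pairs for all groups at once, so 50 groups no longer mean 50 queries.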
I'm having trouble figuring out how to structure a SQL query. Let's say we have a User table and a Pet table. Each user can have many pets and Pet has a breed column.
User:
id | name
______|________________
1 | Foo
2 | Bar
Pet:
id | owner_id | name | breed |
______|________________|____________|_____________|
1 | 1 | Fido | poodle |
2 | 2 | Fluffy | siamese |
The end goal is to provide a query that will give me all the pets for each user that match the given where clause while allowing sort and limit parameters to be used. So the ability to limit each user's pets to say 5 and sorted by name.
I'm working on building these queries dynamically for an ORM so I need a solution that works in MySQL and Postgresql (though it can be two different queries).
I've tried something like this which doesn't work:
SELECT "user"."id", "user"."name", "pet"."id", "pet"."owner_id", "pet"."name",
"pet"."breed"
FROM "user"
LEFT JOIN "pet" ON "user"."id" = "pet"."owner_id"
WHERE "pet"."id" IN
(SELECT "pet"."id" FROM "pet" WHERE "pet"."breed" = 'poodle' LIMIT 5)
In Postgres (8.4 or later), use the window function row_number() in a subquery:
SELECT user_id, user_name, pet_id, owner_id, pet_name, breed
FROM (
SELECT u.id AS user_id, u.name AS user_name
, p.id AS pet_id, owner_id, p.name AS pet_name, breed
, row_number() OVER (PARTITION BY u.id ORDER BY p.name, p.id) AS rn
FROM "user" u
LEFT JOIN pet p ON p.owner_id = u.id
AND p.breed = 'poodle'
) sub
WHERE rn <= 5
ORDER BY user_name, user_id, pet_name, pet_id;
When using a LEFT JOIN, you can't combine that with WHERE conditions on the right table. That forcibly converts the LEFT JOIN to a plain [INNER] JOIN (and possibly removes rows from the result you did not want removed). Pull such conditions up into the join clause.
The way I have it, users without pets are included in the result - as opposed to your query stub.
The additional id columns in the ORDER BY clauses are supposed to break possible ties between non-unique names.
Never use a reserved word like user as identifier.
Work on your naming convention. id or name are terrible, non-descriptive choices, even if some ORMs suggest this nonsense. As you can see in the query, it leads to complications when joining a couple of tables, which is what you do in SQL.
Should be something like pet_id, pet, user_id, username etc. to begin with.
With a proper naming convention we could just SELECT * in the subquery.
MySQL before version 8.0 does not support window functions; there are fidgety substitutes ...
SELECT user.id, user.name, pet.id, pet.name, pet.breed, pet.owner_id,
SUBSTRING_INDEX(group_concat(pet.owner_id order by pet.owner_id DESC), ',', 5)
FROM user
LEFT JOIN pet on user.id = pet.owner_id GROUP BY user.id
The above is rough/untested, but this source has exactly what you need; see step 4. Also, you don't need any of those " characters.
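For engines without window functions, one portable substitute is a correlated subquery that counts how many pets of the same owner sort before the current one. The sketch below is an assumption-laden illustration, not the method from either answer: it runs on Python's built-in sqlite3, the table is called owner instead of user to sidestep the reserved word, the sample rows are invented, and the per-owner limit is 2 to keep the output short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE owner (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE pet (id INTEGER PRIMARY KEY, owner_id INTEGER, name TEXT, breed TEXT);
INSERT INTO owner VALUES (1, 'Foo'), (2, 'Bar');
INSERT INTO pet VALUES
    (1, 1, 'Fido', 'poodle'), (2, 1, 'Ace', 'poodle'), (3, 1, 'Zed', 'poodle'),
    (4, 1, 'Tom', 'siamese'), (5, 2, 'Fluffy', 'siamese');
""")

PER_OWNER = 2  # hypothetical limit of pets per owner

# A pet is kept when fewer than PER_OWNER matching pets of the same owner sort
# before it (by name, with id as tie-breaker). All filters live in the ON
# clause, so owners without a matching pet still appear with NULLs.
rows = conn.execute("""
SELECT o.name, p.name
FROM owner o
LEFT JOIN pet p
       ON p.owner_id = o.id
      AND p.breed = 'poodle'
      AND (SELECT COUNT(*) FROM pet q
           WHERE q.owner_id = p.owner_id AND q.breed = p.breed
             AND (q.name < p.name OR (q.name = p.name AND q.id < p.id))) < ?
ORDER BY o.name, p.name
""", (PER_OWNER,)).fetchall()

print(rows)
```

The subquery runs once per candidate row, so this is best suited to modest table sizes or a well-indexed (owner_id, breed, name) combination.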
This query suggests friendship based on how many words users have in common. in_common sets this threshold.
I was wondering if it was possible to make this query completely % based.
What I want to do is have user suggested to current user, if 30% of their words match.
current_user total words 100
in_common threshold 30
some_other_user total words 10
3 out of these match current_users list.
Since 3 is 30% of 10, this is a match for the current user.
Possible?
SELECT users.name_surname, users.avatar, t1.qty, GROUP_CONCAT(words_en.word) AS in_common, (users.id) AS friend_request_id
FROM (
SELECT c2.user_id, COUNT(*) AS qty
FROM `connections` c1
JOIN `connections` c2
ON c1.user_id <> c2.user_id
AND c1.word_id = c2.word_id
WHERE c1.user_id = :user_id
GROUP BY c2.user_id
HAVING count(*) >= :in_common) as t1
JOIN users
ON t1.user_id = users.id
JOIN connections
ON connections.user_id = t1.user_id
JOIN words_en
ON words_en.id = connections.word_id
WHERE EXISTS(SELECT *
FROM connections
WHERE connections.user_id = :user_id
AND connections.word_id = words_en.id)
GROUP BY users.id, users.name_surname, users.avatar, t1.qty
ORDER BY t1.qty DESC, users.name_surname ASC
SQL fiddle: http://www.sqlfiddle.com/#!2/c79a6/9
OK, so the issue is that "users in common" is defined as an asymmetric relation. To fix it, let's assume that the in_common percentage threshold is checked against the user with the fewest words.
Try this query (fiddle); it gives you the full list of users with at least 1 word in common, marking friendship suggestions:
SELECT user1_id, user2_id, user1_wc, user2_wc,
count(*) AS common_wc, count(*) / least(user1_wc, user2_wc) AS common_wc_pct,
CASE WHEN count(*) / least(user1_wc, user2_wc) > 0.7 THEN 1 ELSE 0 END AS friendship_suggestion
FROM (
SELECT u1.user_id AS user1_id, u2.user_id AS user2_id,
u1.word_count AS user1_wc, u2.word_count AS user2_wc,
c1.word_id AS word1_id, c2.word_id AS word2_id
FROM connections c1
JOIN connections c2 ON (c1.user_id < c2.user_id AND c1.word_id = c2.word_id)
JOIN (SELECT user_id, count(*) AS word_count
FROM connections
GROUP BY user_id) u1 ON (c1.user_id = u1.user_id)
JOIN (SELECT user_id, count(*) AS word_count
FROM connections
GROUP BY user_id) u2 ON (c2.user_id = u2.user_id)
) AS shared_words
GROUP BY user1_id, user2_id, user1_wc, user2_wc;
friendship_suggestion is in the SELECT list for clarity; you probably need to filter by it, so you may just move it to the HAVING clause.
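The percentage check against the user with fewer words can be sketched as follows. This is a minimal illustration on Python's built-in sqlite3 with invented data; SQLite's two-argument MIN plays the role of MySQL's LEAST, and the threshold is passed as a whole-number percentage (30) so the comparison stays in integer arithmetic. The filter sits in HAVING, as suggested above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE connections (user_id INTEGER, word_id INTEGER)")

# Invented data: user 1 has words 1..10; user 2 shares 3 of them out of 10 words;
# user 3 shares only 1 out of 10 words.
data = [(1, w) for w in range(1, 11)]
data += [(2, w) for w in (1, 2, 3, 11, 12, 13, 14, 15, 16, 17)]
data += [(3, w) for w in (1, 21, 22, 23, 24, 25, 26, 27, 28, 29)]
conn.executemany("INSERT INTO connections VALUES (?, ?)", data)

CURRENT_USER = 1
THRESHOLD_PCT = 30  # hypothetical parameter: percent of the smaller word list

suggestions = conn.execute("""
WITH wc AS (SELECT user_id, COUNT(*) AS n FROM connections GROUP BY user_id)
SELECT c2.user_id        AS other_user,
       COUNT(*)          AS common_wc,
       MIN(w1.n, w2.n)   AS smaller_wc
FROM connections c1
JOIN connections c2 ON c2.word_id = c1.word_id AND c2.user_id <> c1.user_id
JOIN wc w1 ON w1.user_id = c1.user_id
JOIN wc w2 ON w2.user_id = c2.user_id
WHERE c1.user_id = ?
GROUP BY c2.user_id, w1.n, w2.n
HAVING COUNT(*) * 100 >= ? * MIN(w1.n, w2.n)
ORDER BY common_wc DESC
""", (CURRENT_USER, THRESHOLD_PCT)).fetchall()

print(suggestions)
```

User 2 shares 3 of 10 words (exactly 30%) and is suggested; user 3 shares 1 of 10 and is not.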
I throw this option into your querying consideration... The first part of the from query is to do nothing but get the one user you are considering as the basis to find all others having common words. The where clause is for that one user (alias result OnePerson).
Then, add to the from clause (WITHOUT A JOIN): since the OnePerson record will always be a single record, we want its total word count available, but I didn't actually see how you worked your 100 to 30 threshold if another person only had 10 words to match 3... I actually think it's bloat and unnecessary, as you'll see later in the where of PreQuery.
So, the next table is the connections table (aliased c2) and that is normal INNER JOIN to the words table for each of the "other" people being considered.
This c2 is then joined again to the connections table again alias OnesWords based on the common word Id -- AND -- the OnesWords user ID is that of the primary user_id being compared against. This OnesWords alias is joined to the words table so IF THERE IS a match to the primary person, we can grab that "common word" as part of the group_concat().
So, now we grab the original single person's total words (still not SURE you need it), a count of ALL the words for the other person, and a count (via sum/case when) of all words that ARE IN COMMON with the original person grouped by the "other" user ID. This gets them all and results as alias "PreQuery".
Now, from that, we can join that to the user's table to get the name and avatar along with respective counts and common words, but apply the WHERE clause based on the total per "other users" available words to the "in common" with the first person's words (see... I didn't think you NEEDED the original query/count as basis of percentage consideration).
SELECT
u.name_surname,
u.avatar,
PreQuery.*
from
( SELECT
c2.user_id,
OnePerson.TotalWords,
COUNT(*) as OtherUserWords,
GROUP_CONCAT(words_en.word) AS InCommonWords,
SUM( case when OnesWords.word_id IS NULL then 0 else 1 end ) as InCommonWithOne
from
( SELECT c1.user_id,
COUNT(*) AS TotalWords
from
`connections` c1
where
c1.user_id = :PrimaryPersonBasis ) OnePerson,
`connections` c2
LEFT JOIN `connections` OnesWords
ON c2.word_id = OnesWords.word_id
AND OnesWords.user_id = OnePerson.User_ID
LEFT JOIN words_en
ON OnesWords.word_id = words_en.id
where
c2.user_id <> OnePerson.User_ID
group by
c2.user_id ) PreQuery
JOIN users u
ON PreQuery.user_id = u.id
where
PreQuery.InCommonWithOne >= PreQuery.OtherUserWords * :nPercentToConsider
order by
PreQuery.InCommonWithOne DESC,
u.name_surname
Here's a revised version WITHOUT the need to prequery the total original words of the first person.
SELECT
u.name_surname,
u.avatar,
PreQuery.*
from
( SELECT
c2.user_id,
COUNT(*) as OtherUserWords,
GROUP_CONCAT(words_en.word) AS InCommonWords,
SUM( case when OnesWords.word_id IS NULL then 0 else 1 end ) as InCommonWithOne
from
`connections` c2
LEFT JOIN `connections` OnesWords
ON c2.word_id = OnesWords.word_id
AND OnesWords.user_id = :PrimaryPersonBasis
LEFT JOIN words_en
ON OnesWords.word_id = words_en.id
where
c2.user_id <> :PrimaryPersonBasis
group by
c2.user_id
having
SUM( case when OnesWords.word_id IS NULL then 0 else 1 end ) >=
COUNT(*) * :nPercentToConsider ) PreQuery
JOIN users u
ON PreQuery.user_id = u.id
order by
PreQuery.InCommonWithOne DESC,
u.name_surname
The query might need some tweaking, but your original query leads me to believe you can easily find simple things like alias or field name typos.
Another option might be to prequery ALL users and how many words each of them has UP FRONT, then use the primary person's words to compare to anyone else explicitly ON those common words... This might be more efficient, as the multiple joins would work on the smaller result set. What if you have 10,000 users and user A has 30 words, and only 500 other users have one or more of those words in common... why compare against all 10,000? A simple up-front summary of each user and how many words they have should be an almost instant query basis.
SELECT
u.name_surname,
u.avatar,
PreQuery.*
from
( SELECT
OtherUser.User_ID,
AllUsers.EachUserWords,
COUNT(*) as CommonWordsCount,
group_concat( words_en.word ) as InCommonWords
from
`connections` OneUser
JOIN words_en
ON OneUser.word_id = words_en.id
JOIN `connections` OtherUser
ON OneUser.word_id = OtherUser.word_id
AND OneUser.user_id <> OtherUser.user_id
JOIN ( SELECT
c1.user_id,
COUNT(*) as EachUserWords
from
`connections` c1
group by
c1.user_id ) AllUsers
ON OtherUser.user_id = AllUsers.User_ID
where
OneUser.user_id = :nPrimaryUserToConsider
group by
OtherUser.User_id,
AllUsers.EachUserWords ) as PreQuery
JOIN users u
ON PreQuery.User_ID = u.id
where
PreQuery.CommonWordsCount >= PreQuery.EachUserWords * :nPercentToConsider
order by
PreQuery.CommonWordsCount DESC,
u.name_surname
May I suggest a different way to look at your problem?
You might look into a similarity metric, such as Cosine Similarity which will give you a much better measure of similarity between your users based on words. To understand it for your case, consider the following example. You have a vector of words A = {house, car, burger, sun} for a user u1 and another vector B = {flat, car, pizza, burger, cloud} for user u2.
Given these individual vectors you first construct another that positions them together so you can map to each user whether he/she has that word in its vector or not. Like so:
| -- | house | car | burger | sun | flat | pizza | cloud |
----------------------------------------------------------
| A | 1 | 1 | 1 | 1 | 0 | 0 | 0 |
----------------------------------------------------------
| B | 0 | 1 | 1 | 0 | 1 | 1 | 1 |
----------------------------------------------------------
Now you have a vector for each user where each position corresponds to the value of each word to each user. Here it represents a simple count but you can improve it using different metrics based on word frequency if that applies to your case. Take a look at the most common one, called tf-idf.
Having these two vectors, you can compute the cosine similarity between them as follows:
cos(A, B) = (A · B) / (|A| |B|)
This is the sum of the products of the corresponding positions of the two vectors (the dot product), divided by the product of their magnitudes. In our example, that is 2 / (2 × √5) ≈ 0.45, in a range that can vary between 0 and 1; the higher the value, the more similar the two vectors are.
If you choose to go this way, you don't need to do this calculation in the database. You compute the similarity in your code and just save the result in the database. There are several libraries that can do that for you. In Python, take a look at the numpy library. In Java, look at Weka and/or Apache Lucene.
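As a small illustration of computing this in application code, here is a sketch in plain Python (standard library only, so no numpy is needed for vectors this small). The function name cosine_similarity and the binary 0/1 weighting are choices made for this example; a tf-idf weighting would replace the 0/1 values.

```python
import math

# Word sets for the two users from the example above.
a_words = {"house", "car", "burger", "sun"}
b_words = {"flat", "car", "pizza", "burger", "cloud"}


def cosine_similarity(a, b):
    """Cosine similarity of two word sets, treated as binary vectors."""
    vocabulary = sorted(a | b)  # one vector position per distinct word
    va = [1 if word in a else 0 for word in vocabulary]
    vb = [1 if word in b else 0 for word in vocabulary]
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    return dot / norm if norm else 0.0  # empty sets have no direction


similarity = cosine_similarity(a_words, b_words)
print(round(similarity, 3))  # 0.447
```

The two users share "car" and "burger", so the dot product is 2, and the magnitudes are √4 and √5.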
I am using SQL Server 2008 and I have 4 tables StudentAbsentees, Students, StudentSections and Sections
In the StudentAbsentees table I am storing the studentId that are absent (absentees only) on a particular day like,
StudentId Time Date
----------- ------ ------
1 10:00 2012-04-13
and in the StudentSections I am storing the studentId in a particular section like
StudentId SectionId
---------- ------------
1 1
2 1
3 1
and in the Students table I am storing student details, likewise in Sections table I have section details like name and capacity of that section.
I need to join these tables and display whether the student is present/absent on a particular day... the result should be
StudentId Status
--------- ------
1 Absent
2 Present
3 Present
I can get the absentees list from these tables, I dunno how to display whether they are present/absent....can anyone help me here
select * from (
select s.id,
case
when sa.date = '2012-01-01'
then 'absent'
else 'present'
end as status,
ROW_NUMBER() OVER (PARTITION BY s.id ORDER BY CASE WHEN sa.date = '2012-01-01' THEN 1 ELSE 2 END) AS RowNumber
from students s
left outer join studentabsentees sa on s.id = sa.studentid
)
as a where a.RowNumber = 1
Your query to show the status of all students for a particular day would look like:
select s.id, s.name, a.status
from students s
left join studentabsentees a on s.id = a.studentid
and a.date = ?
Obviously you have to supply a date.
Note: Your question uses "inner join" in the title. I think left is a better fit because it would show for all students. But if you really wanted just the ones that have a record in the absentee table then you could just change the word "left" in the query to "inner".
Note2: My query assumes a status field. If you don't have one then look at juergen d's answer.
No need for joins, you can just use set operators:
SELECT StudentID, 'Absent'
FROM StudentAbsentees
WHERE [date] = ...
UNION
(
SELECT StudentID, 'Present'
FROM Students
EXCEPT
SELECT StudentID, 'Present'
FROM StudentAbsentees
WHERE [date] = ...
)
You can display 'Present' and 'Absent' by just selecting them as constant. It's easy to get the list of all the absent students. Then union this with all the present students. Present students are found by taking the complete student list and using the except operator on the missing students. But in this except part make sure you select the absent students as present so they subtract nicely from the list of all students with present next to their name that you have just created.
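For comparison, the join-based variant can be sketched like this. It is only an illustration on Python's built-in sqlite3 rather than SQL Server; the student names are invented and the key column of Students is assumed to be called StudentId. The date test sits in the ON clause, so students without an absence row on that day are kept and reported as present.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Students (StudentId INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE StudentAbsentees (StudentId INTEGER, Time TEXT, Date TEXT);
INSERT INTO Students VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
INSERT INTO StudentAbsentees VALUES
    (1, '10:00', '2012-04-13'),
    (2, '09:00', '2012-04-12');  -- absent on a different day
""")

# The date filter is part of the join condition, not the WHERE clause,
# so the LEFT JOIN still returns every student.
status = conn.execute("""
SELECT s.StudentId,
       CASE WHEN a.StudentId IS NULL THEN 'Present' ELSE 'Absent' END AS Status
FROM Students s
LEFT JOIN StudentAbsentees a
       ON a.StudentId = s.StudentId AND a.Date = ?
ORDER BY s.StudentId
""", ("2012-04-13",)).fetchall()

print(status)
```

If a student can have several absence rows on the same day, a DISTINCT or a GROUP BY on StudentId would be needed to avoid duplicate lines.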
I am working on a MySQL query and it's giving me a headache!
The Scenario:
I am building a website where people can select industries they are interested in (NOTIFY_INDUSTRY). I join the selected values and store in a database field.
Example: a member selects agriculture (id = 9) and oil and gas (id = 13). I join them as 9-13 and store in the database.
Users can select several industries, not limited to two.
Also, each member selects the industry their company belongs to (COMPANY_INDUSTRY), for example Information Technology, which is stored in the database too.
Sample table (members):
ID
EMAIL
COMPANY_NAME
COMPANY_INDUSTRY
NOTIFY_INDUSTRY
The problem:
When a new user registers on the website, mail (the mails are sent on daily basis) is sent to existing users who have the new user's industry (COMPANY_INDUSTRY) as one of their interested industries (NOTIFY_INDUSTRY).
What i have done:
$sql="select id, email
from members
where notify_industry in (
select company_industry
from members
where datediff($today, date_activated) <= 1)"
This does not select the right members and i do not know the right way to go about it
EDIT - Exact Problem with current output:
Does not return any row, even when it should.
Assuming the new user's company_industry is 9, and there is an existing user with notify_industry: 10-9-20; it is meant to return the existing members email as the new member is in the existing member's categories of interest; but i get blanks
As @Shiplu pointed out, this is largely a normalization issue. Despite what some people seem to think, multi-value columns are murder to try to get right.
Your basic issue is:
You have members, who are interested in one or more companies/industries, which belong to one or more industries. Your table structure should probably start as:
Industry
===============
id -- autoincrement
name -- varchar
Company
==============
id -- autoincrement
name -- varchar
Company_Industry
===============
companyId -- fk reference to Company.id
industryId -- fk reference to Industry.id
Member
===============
id -- autoincrement
name -- varchar
email -- varchar
Member_Interest_Industry
=========================
memberId -- fk reference to Member.id
industryId -- fk reference to Industry.id
Member_Interest_Company
========================
memberId -- fk reference to Member.id
companyId -- fk reference to Company.id
To get all companies a member is interested in (directly, or through an industry), you can then run something like this:
SELECT a.name, a.email, c.name
FROM Member as a
JOIN Member_Interest_Company as b
ON b.memberId = a.id
JOIN Company as c
ON c.id = b.companyId
WHERE a.id = :inputParm
UNION
SELECT a.name, a.email, d.name
FROM Member as a
JOIN Member_Interest_Industry as b
ON b.memberId = a.id
JOIN Company_Industry as c
ON c.industryId = b.industryId
JOIN Company as d
ON d.id = c.companyId
WHERE a.id = :inputParm
You should redesign the tables, as others have suggested.
However, barring that, there is a gross hack you can do:
SET sql_mode = 'ANSI';
SELECT notify_members.id, notify_members.email
FROM members notify_members
INNER JOIN members new_members
WHERE CURRENT_DATE - new_members.date_activated <= 1
AND
new_members.company_industry RLIKE ('[[:<:]](' || REPLACE(notify_members.notify_industry, '-', '|') || ')[[:>:]]');
Yuck. Basically, you turn 9-13 into the MySQL regular expression [[:<:]](9|13)[[:>:]], which matches 9, 13, 13-27-61, etc., but does not match 19-131 and the like. (This supports a compound COMPANY_INDUSTRY field, too.)
Use JOIN syntax rather than an IN (SELECT ...) subquery.
You need to join the members table to itself.
Currently:
select id, email
from members where notify_industry in
(select company_industry
from members
where datediff($today, date_activated) <= 1
)
Use this style:
select m1.id, m1.email
from members m1
inner join members m2 on m1.notify_industry = m2.company_industry
where datediff($today, m2.date_activated) <= 1
Note the use of aliasing to m1 and m2 to help understand which id and emails are returned.
This may get a little ugly but you could try the following
WARNING This will make a Cartesian Product worthy of any Mad Scientist
SELECT NotifyIndustry.id,NotifyIndustry.email
FROM
(
SELECT CONCAT('-',COMPANY_INDUSTRY,'-') company FROM members
WHERE datediff($today, date_activated) <= 1
) CompanyIndustry
INNER JOIN
(
SELECT id, email, CONCAT('-', NOTIFY_INDUSTRY,'-') who_to_notify
FROM members
) NotifyIndustry
ON LOCATE(company,who_to_notify)>0;
probably not the fastest query ever but this should do the job:
select m_to_notify.id, m_to_notify.email
from members m_to_notify
join members m_new_member
on '-' || m_to_notify.notify_industry || '-'
like '%-' || m_new_member.company_industry || '-%'
where datediff($today, m_new_member.date_activated) <= 1
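The delimiter trick used in the query above can be checked on a small data set. This is a sketch under assumptions: it runs on Python's built-in sqlite3, where || is string concatenation by default (in MySQL that requires the PIPES_AS_CONCAT or ANSI sql_mode, or CONCAT instead), and a hypothetical is_new flag replaces the date_activated test so the example does not depend on today's date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE members (id INTEGER PRIMARY KEY, email TEXT,
                      company_industry TEXT, notify_industry TEXT,
                      is_new INTEGER);  -- hypothetical stand-in for date_activated
INSERT INTO members VALUES
    (1, 'a@example.com', '4', '10-9-20', 0),
    (2, 'b@example.com', '5', '19-131',  0),
    (3, 'c@example.com', '9', '',        1);
""")

# Wrapping both sides in '-' makes '-9-' match inside '-10-9-20-'
# but not inside '-19-131-'.
to_notify = conn.execute("""
SELECT m_to_notify.id, m_to_notify.email
FROM members m_to_notify
JOIN members m_new_member
  ON '-' || m_to_notify.notify_industry || '-'
     LIKE '%-' || m_new_member.company_industry || '-%'
WHERE m_new_member.is_new = 1
  AND m_to_notify.id <> m_new_member.id
ORDER BY m_to_notify.id
""").fetchall()

print(to_notify)
```

Only the member interested in 10-9-20 is returned for a new member in industry 9; the one interested in 19-131 is correctly skipped.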