SQL Query to find number of users in a Job Area - mysql

I have three tables:
jobAreas (id, title)
jobSkills (id,title, jobAreaID)
userSkills (id, userID, jobSkillID)
Each jobSkills entry belongs to a JobArea (linked by foreign key jobAreaID). And each userSkills entry has a JobSkill that is related to a jobSkill.
I am trying to create a SQL select query that will list the number of users that belong to each Job Area.
SELECT ja.id, ja.title, COUNT(*) as numUsers FROM user_skill_types uskills INNER JOIN job_areas ja INNER JOIN skill_types st ON ja.id = st.parent_id GROUP BY ja.id
But the numbers I am getting are not correct.

Given the following example (based on the table structure provided in the question).
CREATE TABLE `jobareas` (
`id` int(11) NOT NULL,
`title` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
INSERT INTO `jobareas` (`id`, `title`) VALUES
(1, 'area1'),
(2, 'area2'),
(3, 'area3'),
(4, 'area4'),
(5, 'area5'),
(6, 'area6'),
(7, 'area7'),
(8, 'area8');
-- --------------------------------------------------------
CREATE TABLE `jobskills` (
`id` int(11) NOT NULL,
`title` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`jobAreaID` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
INSERT INTO `jobskills` (`id`, `title`, `jobAreaID`) VALUES
(1, 'skill1', 1),
(2, 'skill2', 3),
(3, 'skill3', 3),
(4, 'skill4', 7),
(5, 'skill5', 4),
(6, 'skill6', 5),
(7, 'skill7', 1),
(8, 'skill8', 7),
(9, 'skill9', 6),
(10, 'skill10', 3),
(11, 'skill11', 4),
(12, 'skill12', 2),
(13, 'skill13', 6),
(14, 'skill14', 7),
(15, 'skill15', 2);
-- --------------------------------------------------------
CREATE TABLE `userskills` (
`id` int(11) NOT NULL,
`userID` int(11) NOT NULL,
`jobSkillID` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
INSERT INTO `userskills` (`id`, `userID`, `jobSkillID`) VALUES
(1, 5, 10),
(2, 2, 11),
(3, 4, 14),
(4, 4, 6),
(5, 2, 8),
(6, 6, 9),
(7, 3, 9),
(8, 1, 12),
(9, 1, 3),
(10, 5, 10);
ALTER TABLE `jobareas`
ADD UNIQUE KEY `id` (`id`);
ALTER TABLE `jobskills`
ADD PRIMARY KEY (`id`),
ADD KEY `jobAreaID` (`jobAreaID`);
ALTER TABLE `userskills`
ADD PRIMARY KEY (`id`),
ADD KEY `userID` (`userID`),
ADD KEY `jobSkillID` (`jobSkillID`);
ALTER TABLE `jobskills`
ADD CONSTRAINT `jobskills_ibfk_1` FOREIGN KEY (`jobAreaID`) REFERENCES `jobareas` (`id`);
ALTER TABLE `userskills`
ADD CONSTRAINT `userskills_ibfk_1` FOREIGN KEY (`jobSkillID`) REFERENCES `jobskills` (`id`);
Your query should use DISTINCT.
SELECT COUNT(DISTINCT(`us`.`userID`)) AS `num`,`ja`.`title` FROM `userskills` `us`
INNER JOIN `jobskills` `js` ON `js`.`id` = `us`.`jobSkillID`
INNER JOIN `jobareas` `ja` ON `ja`.`id` = `js`.`jobAreaID`
GROUP BY `ja`.`id`;
The results can be checked in this SQLFiddle

Your SQL Query shared does not seem to match the schema shared. Also you have not specified how to join the job_areas table
Use
select
ja.id, ja.title , count(us.id) as numUsers
from jobAreas ja
INNER JOIN jobSkills js on ja.id = js.jobAreaID
INNER JOIN userSkills us on js.id = us.jobSkillID
GROUP BY ja.id, ja.title

You are probably getting duplicates in your result because of users having multiple skills or jobs having multiple areas, or both. Rather than COUNT(*), use COUNT(DISTINCT userID) to work around that:
SELECT ja.id, ja.title, COUNT(DISTINCT us.userID) as numUsers
FROM jobAreas ja
JOIN jobSkills js ON js.jobAreaID = ja.id
JOIN userSkills us ON us.jobSkillsID = js.id
GROUP BY ja.id, ja.title
Note I've written the query based on the schema in your question. Based on the query you have written, it should probably look something like (it's not clear what the user_skill_types userID column is called, or how to JOIN user_skill_types to job_skills):
SELECT ja.id, ja.title, COUNT(DISTINCT uskills.userID) as numUsers
FROM job_areas ja
JOIN skill_types st ON ja.id = st.parent_id
JOIN user_skill_types uskills ON uskills.jobSkillID = st.id
GROUP BY ja.id, ja.title

Related

How to implement IF condition in two relational tables?

I have two relational tables, and I would like to filter data using IF condition. The problem is that using LEFT JOIN I got records that cannot be grouped.
The tables that I have are:
calendar
bookers
The first table consists of lessons that can be booked by more people, and the second table contains data who booked each lesson. The IF condition that I would like to implement is: return '2' if lesson is booked by specific user, return '1' if lesson is booked, but by another user, and return '0' if lesson is not booked.
What I would like to get according to above tables is given in the figure below.
Expected result
But, when I use LEFT JOIN to link those tables, I got record for every user that booked specific lesson.
SELECT calendar.id, calendarId, lessonType, description,
CASE
WHEN bookedBy then IF(bookedBy = 8, '2', '1')
ELSE '0'
END AS bb,
(select count(bookedBy) from bookers where calendar.id = bookers.lessonId) as nOfBookers
FROM calendar
LEFT JOIN bookers ON calendar.id = bookers.lessonId
WHERE `calendarId`= 180
Without the LEFT JOIN (fiddle), counts are shown properly, but I cannot include IF condition, because the table bookers is not defined.
I would appreciate any help. Thank you very much in advance.
Here is the Fiddle.
CREATE TABLE `calendar` (
`id` int(11) NOT NULL,
`calendarId` varchar(50) NOT NULL,
`lessonType` varchar(255) DEFAULT NULL,
`description` varchar(255) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
INSERT INTO `calendar`
(`id`, `calendarId`, `lessonType`, `description`)
VALUES
(1, '180', 'A', ''),
(2, '180', 'A', ''),
(3, '180', 'A', ''),
(4, '180', 'B', ''),
(5, '180', 'B', ''),
(6, '180', 'B', ''),
(7, '180', 'B', ''),
(8, '180', 'B', ''),
(9, '180', 'B', '');
CREATE TABLE `bookers` (
`id` int(11) NOT NULL,
`lessonId` int(11) DEFAULT NULL,
`bookedBy` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
--
-- Dumping data for table `bookers`
--
INSERT INTO `bookers` (`id`, `lessonId`, `bookedBy`) VALUES
(4, 1, 8),
(5, 2, 8),
(6, 2, 28),
(7, 2, 17),
(8, 3, 11);
--
-- Indexes for dumped tables
--
ALTER TABLE `calendar`
ADD PRIMARY KEY (`id`),
ADD UNIQUE KEY `id` (`id`);
--
-- Indexes for table `bookers`
--
ALTER TABLE `bookers`
ADD PRIMARY KEY (`id`);
--
-- AUTO_INCREMENT for dumped tables
--
--
-- AUTO_INCREMENT for table `bookers`
--
ALTER TABLE `bookers`
MODIFY `id` int(11) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=9;
COMMIT;
select version();
Try this:
SELECT id, calendarid, lessontype, description,
CASE WHEN FIND_IN_SET(8,vbb)>0 THEN 2
WHEN vbb IS NOT NULL THEN 1
ELSE 0 END AS bb,
nOfBookers
FROM
(SELECT c.id, calendarId, lessonType, GROUP_CONCAT(bookedby) AS vbb, description,
(SELECT COUNT(bookedby) FROM bookers WHERE c.id = bookers.lessonId) AS nOfBookers
FROM calendar c
LEFT JOIN bookers b ON c.id = b.lessonId
WHERE `calendarId`= 180
GROUP BY c.id, calendarId, lessonType, description) A;
In addition to your original LEFT JOIN attempt, I've added GROUP_CONCAT(bookedby) AS vbb which will return a comma separated bookedby value; which is 17,28,8. After that, I make the query as a sub-query and do CASE expression with FIND_IN_SET function on vbb to look for specific bookedby.
Here's an update fiddle: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=0933e9fc3cb7445311c34c6705d11637

Condition based selecting from group_concat in Mysql query

Here is my table and sample data.
CREATE TABLE `articles`
(
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`title` varchar(100) NOT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE `tags`
(
`id` int(11) NOT NULL,
`name` varchar(100) NOT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE `article_tags`
(
`article_id` int(11) NOT NULL,
`tag_id` int(11) NOT NULL
);
INSERT INTO `tags` (`id`, `name`) VALUES
(1, 'Wap Stories'),
(2, 'App Stories');
INSERT INTO `articles` (`id`, `title`) VALUES
(1, 'USA'),
(2, 'England'),
(3, 'Germany'),
(4, 'India'),
(5, 'France'),
(6, 'Dubai'),
(7, 'Poland'),
(8, 'Japan'),
(9, 'China'),
(10, 'Australia');
INSERT INTO `article_tags` (`article_id`, `tag_id`) VALUES
(1, 1),
(1, 2),
(4, 1),
(5, 1),
(2, 2),
(2, 1),
(6, 2),
(7, 2),
(8, 1),
(9, 1),
(3, 2),
(9, 2),
(10, 2);
How can I get the below output I have tried using group_concat function. It gives all the results. But my requirement is I need to groupconcat values as
a. Combination of 1,2 can be there, only 1 can be there but 2 alone cannot be there.
b. Combination of 2,1 can be there, only 2 can be there but 1 alone cannot be there
Below is the output I need
id, title, groupconcat
--------------------------
1, USA, 1,2
2, England, 1,2
4, India, 1
5, France, 1
8, Japan, 1
9, China, 1,2
SqlFiddle Link
The query which I am using is
select id, title, group_concat(tag_id order by tag_id) as 'groupconcat' from articles a
left join article_tags att on a.id = att.article_id
where att.tag_id in (1,2)
group by article_id order by id
You can try like this
SELECT id, title, GROUP_CONCAT(tag_id ORDER BY tag_id) AS 'groupconcat'
FROM articles a
LEFT join article_tags att on a.id = att.article_id
WHERE att.tag_id in (1,2)
GROUP BY article_id
HAVING SUBSTRING_INDEX(groupconcat,',',1) !='2'
ORDER BY id

How to refactor, (shorten) this query

I have a database with tables: applicant (or candidate for a job), application (candidate applied for a certain job), test, selected_test(any application has a defined set of tests) and test_result.
When I need to show which applicant scored what result for any application and test I would use this query:
SELECT applicant.first_name, applicant.last_name, application.job, test.name, test_result.score
FROM applicant
INNER JOIN application ON application.applicant_id=applicant.id
INNER JOIN selected_test ON application.id=selected_test.application_id
INNER JOIN test ON selected_test.test_id=test.id
INNER JOIN test_result ON selected_test.test_id=test_result.test_id AND applicant.id=test_result.applicant_id
What I need to accomplish is sorting by certain test type (test.name) along with test.score
This is what I mean:
SELECT a.first_name, a.last_name, app.job, iq.score AS iqScore, math.score AS mathScore, personality.score AS personalityScore, logic.score AS logicScore
FROM applicant a
INNER JOIN application app ON a.id=app.applicant_id
LEFT JOIN
(SELECT app.id AS appId, tr.score
FROM applicant a
INNER JOIN application app ON app.applicant_id=a.id
INNER JOIN selected_test st ON app.id=st.application_id
INNER JOIN test t ON st.test_id=t.id AND t.name='iq'
INNER JOIN test_result tr ON st.test_id=tr.test_id AND a.id=tr.applicant_id) AS iq ON app.id=iq.appId
LEFT JOIN
(SELECT app.id AS appId, tr.score
FROM applicant a
INNER JOIN application app ON app.applicant_id=a.id
INNER JOIN selected_test st ON app.id=st.application_id
INNER JOIN test t ON st.test_id=t.id AND t.name='math'
INNER JOIN test_result tr ON st.test_id=tr.test_id AND a.id=tr.applicant_id) AS math ON app.id=math.appId
LEFT JOIN
(SELECT app.id AS appId, tr.score
FROM applicant a
INNER JOIN application app ON app.applicant_id=a.id
INNER JOIN selected_test st ON app.id=st.application_id
INNER JOIN test t ON st.test_id=t.id AND t.name='personality'
INNER JOIN test_result tr ON st.test_id=tr.test_id AND a.id=tr.applicant_id) AS personality ON app.id=personality.appId
LEFT JOIN
(SELECT app.id AS appId, tr.score
FROM applicant a
INNER JOIN application app ON app.applicant_id=a.id
INNER JOIN selected_test st ON app.id=st.application_id
INNER JOIN test t ON st.test_id=t.id AND t.name='logic'
INNER JOIN test_result tr ON st.test_id=tr.test_id AND a.id=tr.applicant_id) AS logic ON app.id=logic.appId
ORDER BY mathScore DESC, iqScore DESC, logicScore DESC
The query returns a set of applications, showing applicant data, job, test names and scores.
For instance, if I want candidate applications with higher "math" score, followed by highest scores in "IQ" and then in "logic" to be on top, 'ORDER BY' clause looks like the above.
The query works correct but the problem is that in real situation it deals with large data sets and I need a way to shorten/refactor this query.
Example database it works on is here:
CREATE TABLE IF NOT EXISTS `applicant` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`first_name` varchar(255) NOT NULL,
`last_name` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=8 ;
--
-- Dumping data for table `applicant`
--
INSERT INTO `applicant` (`id`, `first_name`, `last_name`) VALUES
(2, 'Jack', 'Redburn'),
(4, 'Barry', 'Leon'),
(6, 'Elisabeth', 'Logan'),
(7, 'Jane', 'Doe');
-- --------------------------------------------------------
--
-- Table structure for table `application`
--
CREATE TABLE IF NOT EXISTS `application` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`applicant_id` int(11) NOT NULL,
`job` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=10 ;
--
-- Dumping data for table `application`
--
INSERT INTO `application` (`id`, `applicant_id`, `job`) VALUES
(2, 2, 'Salesman'),
(4, 4, 'Policeman'),
(6, 6, 'Journalist'),
(8, 6, 'Hostess'),
(9, 7, 'Journalist');
-- --------------------------------------------------------
--
-- Table structure for table `selected_test`
--
CREATE TABLE IF NOT EXISTS `selected_test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`application_id` int(11) NOT NULL,
`test_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=24 ;
--
-- Dumping data for table `selected_test`
--
INSERT INTO `selected_test` (`id`, `application_id`, `test_id`) VALUES
(1, 1, 1),
(2, 1, 2),
(3, 1, 3),
(5, 2, 1),
(6, 2, 2),
(7, 2, 3),
(8, 2, 4),
(9, 3, 4),
(10, 3, 2),
(11, 4, 1),
(12, 4, 2),
(13, 4, 3),
(14, 4, 4),
(15, 5, 2),
(16, 5, 3),
(17, 6, 1),
(18, 6, 4),
(19, 7, 3),
(20, 7, 2),
(21, 7, 1),
(22, 8, 2),
(23, 8, 3);
-- --------------------------------------------------------
--
-- Table structure for table `test`
--
CREATE TABLE IF NOT EXISTS `test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=5 ;
--
-- Dumping data for table `test`
--
INSERT INTO `test` (`id`, `name`) VALUES
(1, 'math'),
(2, 'logic'),
(3, 'iq'),
(4, 'personality');
-- --------------------------------------------------------
--
-- Table structure for table `test_result`
--
CREATE TABLE IF NOT EXISTS `test_result` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`applicant_id` int(11) NOT NULL,
`test_id` int(11) NOT NULL,
`score` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=24 ;
--
-- Dumping data for table `test_result`
--
INSERT INTO `test_result` (`id`, `applicant_id`, `test_id`, `score`) VALUES
(2, 2, 1, 6),
(3, 4, 1, 7),
(6, 6, 1, 3),
(7, 7, 1, 8),
(9, 2, 2, 15),
(11, 4, 2, 12),
(13, 6, 2, 11),
(14, 7, 2, 9),
(15, 7, 3, 105),
(16, 6, 3, 112),
(18, 4, 3, 108),
(20, 2, 3, 117),
(22, 4, 4, 70);
And here is what results look like:
First query is just to show you how data is related:
The large query, shows score data horizontally so it is possible to sort by test name and score:
caveat I don't know mysql
Googling mysql pivot gives this result http://en.wikibooks.org/wiki/MySQL/Pivot_table
So if we apply the same logic using the test.id as the seed number (which is exam in the example from the google search) we get this:
SQLFIDDLE
select first_name, last_name, job,
sum(score*(1-abs(sign(testid-1)))) as math,
sum(score*(1-abs(sign(testid-2)))) as logic,
sum(score*(1-abs(sign(testid-3)))) as iq,
sum(score*(1-abs(sign(testid-4)))) as personality
from
(
SELECT applicant.first_name, applicant.last_name, application.job, test.name, test_result.score, test.id as testid
FROM applicant
INNER JOIN application ON application.applicant_id=applicant.id
INNER JOIN selected_test ON application.id=selected_test.application_id
INNER JOIN test ON selected_test.test_id=test.id
INNER JOIN test_result ON selected_test.test_id=test_result.test_id AND applicant.id=test_result.applicant_id
) t
group by first_name, last_name, job
Now you've got your short query yu can apply sorting as required - you can use case statement in you order by to dynamically change the order as required...
I noticed that you have only defined primary keys. You should see a noticeable performance improvement when you index other fields. Index at least the following: application.applicant_id, selected_test.application_id, selected_test.test_id, test_result.applicant_id, test_result.test_id, test_result.score.
You might be surprised how much this speeds things up for you. In fact, mysql tells us this is the best way to improve performance: https://dev.mysql.com/doc/refman/5.5/en/optimization-indexes.html.

search a set of values in other set of values for a row

Hello I am having issues with execution time on a query that searches for users ( from users table ) that are having at least one interest from one specified interests set and a location from a specified locations set. So I have this test DB:
CREATE TABLE IF NOT EXISTS `interests` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=10 ;
--
-- Dumping data for table `interests`
--
INSERT INTO `interests` (`id`, `name`) VALUES
(1, 'auto'),
(2, 'moto'),
(3, 'health'),
(4, 'garden'),
(5, 'house'),
(6, 'music'),
(7, 'video'),
(8, 'games'),
(9, 'it');
-- --------------------------------------------------------
--
-- Table structure for table `locations`
--
CREATE TABLE IF NOT EXISTS `locations` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(50) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=11 ;
--
-- Dumping data for table `locations`
--
INSERT INTO `locations` (`id`, `name`) VALUES
(1, 'engalnd'),
(2, 'austia'),
(3, 'germany'),
(4, 'france'),
(5, 'belgium'),
(6, 'italy'),
(7, 'russia'),
(8, 'poland'),
(9, 'norway'),
(10, 'romania');
-- --------------------------------------------------------
--
-- Table structure for table `users`
--
CREATE TABLE IF NOT EXISTS `users` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`email` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=11 ;
--
-- Dumping data for table `users`
--
INSERT INTO `users` (`id`, `email`) VALUES
(1, 'email1#test.com'),
(2, 'email2#test.com'),
(3, 'email3#test.com'),
(4, 'email4#test.com'),
(5, 'email5#test.com'),
(6, 'email6#test.com'),
(7, 'email7#test.com'),
(8, 'email8#test.com'),
(9, 'email9#test.com'),
(10, 'email10#test.com');
-- --------------------------------------------------------
--
-- Table structure for table `users_interests`
--
CREATE TABLE IF NOT EXISTS `users_interests` (
`user_id` int(11) NOT NULL,
`interest_id` int(11) NOT NULL,
PRIMARY KEY (`user_id`,`interest_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
--
-- Dumping data for table `users_interests`
--
INSERT INTO `users_interests` (`user_id`, `interest_id`) VALUES
(1, 1),
(1, 2),
(2, 5),
(2, 7),
(2, 8),
(3, 1),
(4, 1),
(4, 5),
(4, 6),
(4, 7),
(4, 8),
(5, 1),
(5, 2),
(5, 8),
(6, 3),
(6, 7),
(6, 8),
(7, 7),
(7, 9),
(8, 5);
-- --------------------------------------------------------
--
-- Table structure for table `users_locations`
--
CREATE TABLE IF NOT EXISTS `users_locations` (
`user_id` int(11) NOT NULL,
`location_id` int(11) NOT NULL,
PRIMARY KEY (`user_id`,`location_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
--
-- Dumping data for table `users_locations`
--
INSERT INTO `users_locations` (`user_id`, `location_id`) VALUES
(2, 5),
(2, 7),
(2, 8),
(3, 1),
(4, 1),
(4, 5),
(4, 6),
(4, 7),
(4, 8),
(5, 1),
(5, 2),
(5, 8),
(6, 3),
(6, 7),
(6, 8),
(7, 7),
(7, 9),
(8, 5);
Is there a better way to query it than this:
SELECT email,
GROUP_CONCAT( DISTINCT ui.interest_id ) AS interests,
GROUP_CONCAT( DISTINCT ul.location_id ) AS locations
FROM `users` u
LEFT JOIN users_interests ui ON u.id = ui.user_id
LEFT JOIN users_locations ul ON u.id = ul.user_id
GROUP BY u.id
HAVING IF( interests IS NOT NULL , FIND_IN_SET( 2, interests )
OR FIND_IN_SET( 3, interests ) , 1 )
AND IF( locations IS NOT NULL , FIND_IN_SET( 2, locations )
OR FIND_IN_SET( 3, locations ) , 1 )
This is the best solution I found but it still slow on a 500k and 1mil rows in the relational tables ( locations and interests ). Especially when you are matching against a large set of values ( let's say above 50 locations and interests ).
So I am trying to achieve the result this query produces, but a bit faster:
email interests locations
email1#test.com 1,2 [BLOB - 0B]
email5#test.com 1,2,8 1,2,8
email6#test.com 3,7,8 3,7,8
email9#test.com [BLOB - 0B] [BLOB - 0B]
email10#test.com [BLOB - 0B] [BLOB - 0B]
I also tried to join against an SELECT UNION table - for the matching set - but it was even slower. Like this:
SELECT *
FROM `users` u
LEFT JOIN users_interests ui ON u.id = ui.user_id
LEFT JOIN users_locations ul ON u.id = ul.user_id
LEFT JOIN (SELECT 2 as interest UNION SELECT 3 as interest) as `is` ON ui.interest_id = is.interest
LEFT JOIN (SELECT 2 as location UNION SELECT 3 as location ) as `ls` ON ul.location_id = ls.location
WHERE IF(ui.user_id IS NOT NULL, `is`.interest IS NOT NULL,1) AND
IF(ul.user_id IS NOT NULL, ls.location IS NOT NULL,1)
GROUP BY u.id
I am using this for a basic targeting system.
I would appreciate very much, any suggestion! Thank you!
you have IS is reserved word for mysql
and also your group by can slow your query but i dont see any meaning to use group by u.id here since the u.id is already unique id.
look demo
try use backticks around it.
SELECT *
FROM `users` u
LEFT JOIN users_interests ui ON u.id = ui.user_id
LEFT JOIN users_locations ul ON u.id = ul.user_id
LEFT JOIN (SELECT 2 as interest UNION SELECT 3 as interest) as `is`
ON ui.interest_id = `is`.interest
LEFT JOIN (SELECT 2 as location UNION SELECT 3 as location ) as `ls`
ON ul.location_id = `ls`.location
WHERE IF(ui.user_id IS NOT NULL, `is`.interest IS NOT NULL,1)
AND
IF(ul.user_id IS NOT NULL, `ls`.location IS NOT NULL,1)

Mysql - Help me alter this search query to get desired results

Following is a dump of the tables and data needed to answer understand the system:-
The system consists of tutors and classes.
The data in the table All_Tag_Relations stores tag relations for each tutor registered and each class created by a tutor. The tag relations are used for searching classes.
CREATE TABLE IF NOT EXISTS `Tags` (
`id_tag` int(10) unsigned NOT NULL auto_increment,
`tag` varchar(255) default NULL,
PRIMARY KEY (`id_tag`),
UNIQUE KEY `tag` (`tag`),
KEY `id_tag` (`id_tag`),
KEY `tag_2` (`tag`),
KEY `tag_3` (`tag`),
KEY `tag_4` (`tag`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `Tags` (`id_tag`, `tag`) VALUES
(1, 'Sandeepan'),
(2, 'Nath'),
(3, 'first'),
(4, 'class'),
(5, 'new'),
(6, 'Bob'),
(7, 'Cratchit');
CREATE TABLE IF NOT EXISTS `All_Tag_Relations` (
`id_tag` int(10) unsigned NOT NULL default '0',
`id_tutor` int(10) default NULL,
`id_wc` int(10) unsigned default NULL,
KEY `All_Tag_Relations_FKIndex1` (`id_tag`),
KEY `id_wc` (`id_wc`),
KEY `id_tag` (`id_tag`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `All_Tag_Relations` (`id_tag`, `id_tutor`, `id_wc`) VALUES
(1, 1, NULL),
(2, 1, NULL),
(3, 1, 1),
(4, 1, 1),
(6, 2, NULL),
(7, 2, NULL),
(5, 2, 2),
(4, 2, 2),
(8, 1, 3),
(9, 1, 3);
Following is my query:-
This query searches for "first class" (tag for first = 3 and for class = 4, in Tags table) and returns all those classes such that both the terms first and class are present in the class name.
SELECT wtagrels.id_wc,SUM(DISTINCT( wtagrels.id_tag =3)) AS
key_1_total_matches,
SUM(DISTINCT( wtagrels.id_tag =4)) AS
key_2_total_matches
FROM all_tag_relations AS wtagrels
WHERE ( wtagrels.id_tag =3
OR wtagrels.id_tag =4 )
GROUP BY wtagrels.id_wc
HAVING key_1_total_matches = 1
AND key_2_total_matches = 1
LIMIT 0, 20
And it returns the class with id_wc = 1.
But, I want the search to show all those classes such that all the search terms are present in the class name or its tutor name
So that searching "Sandeepan class" (wtagrels.id_tag = 1,4) or "Sandeepan Nath" also returns the class with id_wc=1. And Searching. Searching "Bob First" should not return any classes.
Please modify the above query or suggest a new query, if possible using MyIsam - fulltext search, but somehow help me get the result.
I think this query would help you:
SET #tag1 = 1, #tag2 = 4; -- Setting some user variables to see where the ids go. (you can put the values in the query)
SELECT wtagrels.id_wc,
SUM(DISTINCT( wtagrels.id_tag =#tag1 OR wtagrels.id_tutor =#tag1)) AS key_1_total_matches,
SUM(DISTINCT( wtagrels.id_tag =#tag2 OR wtagrels.id_tutor =#tag2)) AS key_2_total_matches
FROM all_tag_relations AS wtagrels
WHERE ( wtagrels.id_tag =#tag1 OR wtagrels.id_tag =#tag2 )
GROUP BY wtagrels.id_wc
HAVING key_1_total_matches = 1 AND key_2_total_matches = 1
LIMIT 0, 20
It returns id_wc = 1.
For (6, 3) the query returns nothing.