How to refactor (shorten) this query - MySQL

I have a database with these tables: applicant (a candidate for a job), application (a candidate's application for a certain job), test, selected_test (each application has a defined set of tests) and test_result.
To show which applicant scored what on each test for each application, I use this query:
SELECT applicant.first_name, applicant.last_name, application.job, test.name, test_result.score
FROM applicant
INNER JOIN application ON application.applicant_id=applicant.id
INNER JOIN selected_test ON application.id=selected_test.application_id
INNER JOIN test ON selected_test.test_id=test.id
INNER JOIN test_result ON selected_test.test_id=test_result.test_id AND applicant.id=test_result.applicant_id
What I need to accomplish is sorting by a certain test type (test.name) along with its score (test_result.score).
This is what I mean:
SELECT a.first_name, a.last_name, app.job, iq.score AS iqScore, math.score AS mathScore, personality.score AS personalityScore, logic.score AS logicScore
FROM applicant a
INNER JOIN application app ON a.id=app.applicant_id
LEFT JOIN
(SELECT app.id AS appId, tr.score
FROM applicant a
INNER JOIN application app ON app.applicant_id=a.id
INNER JOIN selected_test st ON app.id=st.application_id
INNER JOIN test t ON st.test_id=t.id AND t.name='iq'
INNER JOIN test_result tr ON st.test_id=tr.test_id AND a.id=tr.applicant_id) AS iq ON app.id=iq.appId
LEFT JOIN
(SELECT app.id AS appId, tr.score
FROM applicant a
INNER JOIN application app ON app.applicant_id=a.id
INNER JOIN selected_test st ON app.id=st.application_id
INNER JOIN test t ON st.test_id=t.id AND t.name='math'
INNER JOIN test_result tr ON st.test_id=tr.test_id AND a.id=tr.applicant_id) AS math ON app.id=math.appId
LEFT JOIN
(SELECT app.id AS appId, tr.score
FROM applicant a
INNER JOIN application app ON app.applicant_id=a.id
INNER JOIN selected_test st ON app.id=st.application_id
INNER JOIN test t ON st.test_id=t.id AND t.name='personality'
INNER JOIN test_result tr ON st.test_id=tr.test_id AND a.id=tr.applicant_id) AS personality ON app.id=personality.appId
LEFT JOIN
(SELECT app.id AS appId, tr.score
FROM applicant a
INNER JOIN application app ON app.applicant_id=a.id
INNER JOIN selected_test st ON app.id=st.application_id
INNER JOIN test t ON st.test_id=t.id AND t.name='logic'
INNER JOIN test_result tr ON st.test_id=tr.test_id AND a.id=tr.applicant_id) AS logic ON app.id=logic.appId
ORDER BY mathScore DESC, iqScore DESC, logicScore DESC
The query returns a set of applications, showing applicant data, the job, and one score column per test.
For instance, if I want the applications with the highest "math" score on top, followed by the highest "IQ" and then "logic" scores, the ORDER BY clause looks like the one above.
The query works correctly, but in a real situation it deals with large data sets, so I need a way to shorten/refactor it.
The example database it runs on is here:
CREATE TABLE IF NOT EXISTS `applicant` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`first_name` varchar(255) NOT NULL,
`last_name` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=8 ;
--
-- Dumping data for table `applicant`
--
INSERT INTO `applicant` (`id`, `first_name`, `last_name`) VALUES
(2, 'Jack', 'Redburn'),
(4, 'Barry', 'Leon'),
(6, 'Elisabeth', 'Logan'),
(7, 'Jane', 'Doe');
-- --------------------------------------------------------
--
-- Table structure for table `application`
--
CREATE TABLE IF NOT EXISTS `application` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`applicant_id` int(11) NOT NULL,
`job` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=10 ;
--
-- Dumping data for table `application`
--
INSERT INTO `application` (`id`, `applicant_id`, `job`) VALUES
(2, 2, 'Salesman'),
(4, 4, 'Policeman'),
(6, 6, 'Journalist'),
(8, 6, 'Hostess'),
(9, 7, 'Journalist');
-- --------------------------------------------------------
--
-- Table structure for table `selected_test`
--
CREATE TABLE IF NOT EXISTS `selected_test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`application_id` int(11) NOT NULL,
`test_id` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=24 ;
--
-- Dumping data for table `selected_test`
--
INSERT INTO `selected_test` (`id`, `application_id`, `test_id`) VALUES
(1, 1, 1),
(2, 1, 2),
(3, 1, 3),
(5, 2, 1),
(6, 2, 2),
(7, 2, 3),
(8, 2, 4),
(9, 3, 4),
(10, 3, 2),
(11, 4, 1),
(12, 4, 2),
(13, 4, 3),
(14, 4, 4),
(15, 5, 2),
(16, 5, 3),
(17, 6, 1),
(18, 6, 4),
(19, 7, 3),
(20, 7, 2),
(21, 7, 1),
(22, 8, 2),
(23, 8, 3);
-- --------------------------------------------------------
--
-- Table structure for table `test`
--
CREATE TABLE IF NOT EXISTS `test` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=5 ;
--
-- Dumping data for table `test`
--
INSERT INTO `test` (`id`, `name`) VALUES
(1, 'math'),
(2, 'logic'),
(3, 'iq'),
(4, 'personality');
-- --------------------------------------------------------
--
-- Table structure for table `test_result`
--
CREATE TABLE IF NOT EXISTS `test_result` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`applicant_id` int(11) NOT NULL,
`test_id` int(11) NOT NULL,
`score` int(11) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 AUTO_INCREMENT=24 ;
--
-- Dumping data for table `test_result`
--
INSERT INTO `test_result` (`id`, `applicant_id`, `test_id`, `score`) VALUES
(2, 2, 1, 6),
(3, 4, 1, 7),
(6, 6, 1, 3),
(7, 7, 1, 8),
(9, 2, 2, 15),
(11, 4, 2, 12),
(13, 6, 2, 11),
(14, 7, 2, 9),
(15, 7, 3, 105),
(16, 6, 3, 112),
(18, 4, 3, 108),
(20, 2, 3, 117),
(22, 4, 4, 70);
And here is what the results look like:
The first query just shows how the data is related:
The large query shows the score data horizontally, so it is possible to sort by test name and score:

Caveat: I don't know MySQL.
Googling "mysql pivot" gives this result: http://en.wikibooks.org/wiki/MySQL/Pivot_table
If we apply the same logic, using test.id as the seed number (which is "exam" in the example on that page), we get this:
select first_name, last_name, job,
sum(score*(1-abs(sign(testid-1)))) as math,
sum(score*(1-abs(sign(testid-2)))) as logic,
sum(score*(1-abs(sign(testid-3)))) as iq,
sum(score*(1-abs(sign(testid-4)))) as personality
from
(
SELECT applicant.first_name, applicant.last_name, application.job, test.name, test_result.score, test.id as testid
FROM applicant
INNER JOIN application ON application.applicant_id=applicant.id
INNER JOIN selected_test ON application.id=selected_test.application_id
INNER JOIN test ON selected_test.test_id=test.id
INNER JOIN test_result ON selected_test.test_id=test_result.test_id AND applicant.id=test_result.applicant_id
) t
group by first_name, last_name, job
Now that you've got your short query, you can apply sorting as required. You can use a CASE expression in your ORDER BY to change the order dynamically.
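Another common way to pivot is conditional aggregation: one MAX(CASE WHEN ...) column per test name, so the four derived tables collapse into a single pass over the joins. Below is a minimal sketch, not a tuned solution. It runs the query from Python against an in-memory SQLite database as a stand-in for MySQL (an assumption made only so the example is self-contained), loaded with the sample rows for the five existing applications. The LEFT JOINs are my choice, to keep applications without results the way the long query does.

```python
import sqlite3

# Assumption: SQLite in memory stands in for MySQL; the SELECT is portable SQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE applicant (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
CREATE TABLE application (id INTEGER PRIMARY KEY, applicant_id INT, job TEXT);
CREATE TABLE selected_test (application_id INT, test_id INT);
CREATE TABLE test (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE test_result (applicant_id INT, test_id INT, score INT);
INSERT INTO applicant VALUES
 (2,'Jack','Redburn'),(4,'Barry','Leon'),(6,'Elisabeth','Logan'),(7,'Jane','Doe');
INSERT INTO application VALUES
 (2,2,'Salesman'),(4,4,'Policeman'),(6,6,'Journalist'),(8,6,'Hostess'),(9,7,'Journalist');
INSERT INTO selected_test VALUES
 (2,1),(2,2),(2,3),(2,4),(4,1),(4,2),(4,3),(4,4),(6,1),(6,4),(8,2),(8,3);
INSERT INTO test VALUES (1,'math'),(2,'logic'),(3,'iq'),(4,'personality');
INSERT INTO test_result VALUES
 (2,1,6),(4,1,7),(6,1,3),(7,1,8),(2,2,15),(4,2,12),(6,2,11),(7,2,9),
 (7,3,105),(6,3,112),(4,3,108),(2,3,117),(4,4,70);
""")

# One row per application; each CASE picks out the score of a single test.
PIVOT_QUERY = """
SELECT a.first_name, a.last_name, app.job,
       MAX(CASE WHEN t.name = 'math'        THEN tr.score END) AS mathScore,
       MAX(CASE WHEN t.name = 'iq'          THEN tr.score END) AS iqScore,
       MAX(CASE WHEN t.name = 'logic'       THEN tr.score END) AS logicScore,
       MAX(CASE WHEN t.name = 'personality' THEN tr.score END) AS personalityScore
FROM applicant a
INNER JOIN application app ON app.applicant_id = a.id
LEFT JOIN selected_test st ON st.application_id = app.id
LEFT JOIN test t ON t.id = st.test_id
LEFT JOIN test_result tr ON tr.test_id = st.test_id AND tr.applicant_id = a.id
GROUP BY app.id, a.first_name, a.last_name, app.job
ORDER BY mathScore DESC, iqScore DESC, logicScore DESC
"""

rows = conn.execute(PIVOT_QUERY).fetchall()
for row in rows:
    print(row)
```

The SELECT uses only standard SQL, so it should carry over to MySQL as written. Both engines sort NULL scores last under DESC, so applications without a math result end up at the bottom.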

I noticed that you have only defined primary keys. You should see a noticeable performance improvement when you index other fields. Index at least the following: application.applicant_id, selected_test.application_id, selected_test.test_id, test_result.applicant_id, test_result.test_id, test_result.score.
You might be surprised how much this speeds things up. In fact, the MySQL manual describes indexing as the best way to improve the performance of SELECT operations: https://dev.mysql.com/doc/refman/5.5/en/optimization-indexes.html.
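To see what an index on a join column changes, you can compare the query plan before and after creating it. The sketch below does that in SQLite through Python's sqlite3 module, purely as a self-contained stand-in; the index name idx_application_applicant is made up for the example. In MySQL the corresponding steps would be CREATE INDEX and EXPLAIN, and the plan output looks different.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; only the idea (scan vs. index lookup) carries over.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE application (id INTEGER PRIMARY KEY, applicant_id INT NOT NULL, job TEXT NOT NULL);
INSERT INTO application VALUES
 (2,2,'Salesman'),(4,4,'Policeman'),(6,6,'Journalist'),(8,6,'Hostess'),(9,7,'Journalist');
""")


def plan(sql):
    """Return the planner's description of how it would run the statement."""
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))


query = "SELECT job FROM application WHERE applicant_id = 6"

before = plan(query)  # no index on applicant_id yet: full table scan
# Hypothetical index name, chosen for this example only.
conn.execute("CREATE INDEX idx_application_applicant ON application (applicant_id)")
after = plan(query)   # the planner can now look rows up through the index

print("before:", before)
print("after: ", after)
```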

Related

How to implement IF condition in two relational tables?

I have two related tables, and I would like to filter data using an IF condition. The problem is that with a LEFT JOIN I get records that cannot be grouped.
The tables that I have are:
calendar
bookers
The first table consists of lessons that can each be booked by several people, and the second table records who booked each lesson. The IF condition that I would like to implement is: return '2' if the lesson is booked by a specific user, return '1' if the lesson is booked, but only by other users, and return '0' if the lesson is not booked.
What I would like to get from the above tables is shown in the figure below.
Expected result
But when I use a LEFT JOIN to link those tables, I get one record for every user who booked a given lesson.
SELECT calendar.id, calendarId, lessonType, description,
CASE
WHEN bookedBy then IF(bookedBy = 8, '2', '1')
ELSE '0'
END AS bb,
(select count(bookedBy) from bookers where calendar.id = bookers.lessonId) as nOfBookers
FROM calendar
LEFT JOIN bookers ON calendar.id = bookers.lessonId
WHERE `calendarId`= 180
Without the LEFT JOIN (fiddle), the counts are shown properly, but I cannot include the IF condition, because the bookers table is not joined.
I would appreciate any help. Thank you very much in advance.
Here is the Fiddle.
CREATE TABLE `calendar` (
`id` int(11) NOT NULL,
`calendarId` varchar(50) NOT NULL,
`lessonType` varchar(255) DEFAULT NULL,
`description` varchar(255) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
INSERT INTO `calendar`
(`id`, `calendarId`, `lessonType`, `description`)
VALUES
(1, '180', 'A', ''),
(2, '180', 'A', ''),
(3, '180', 'A', ''),
(4, '180', 'B', ''),
(5, '180', 'B', ''),
(6, '180', 'B', ''),
(7, '180', 'B', ''),
(8, '180', 'B', ''),
(9, '180', 'B', '');
CREATE TABLE `bookers` (
`id` int(11) NOT NULL,
`lessonId` int(11) DEFAULT NULL,
`bookedBy` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
--
-- Dumping data for table `bookers`
--
INSERT INTO `bookers` (`id`, `lessonId`, `bookedBy`) VALUES
(4, 1, 8),
(5, 2, 8),
(6, 2, 28),
(7, 2, 17),
(8, 3, 11);
--
-- Indexes for dumped tables
--
ALTER TABLE `calendar`
ADD PRIMARY KEY (`id`),
ADD UNIQUE KEY `id` (`id`);
--
-- Indexes for table `bookers`
--
ALTER TABLE `bookers`
ADD PRIMARY KEY (`id`);
--
-- AUTO_INCREMENT for dumped tables
--
--
-- AUTO_INCREMENT for table `bookers`
--
ALTER TABLE `bookers`
MODIFY `id` int(11) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=9;
COMMIT;
select version();
Try this:
SELECT id, calendarid, lessontype, description,
CASE WHEN FIND_IN_SET(8,vbb)>0 THEN 2
WHEN vbb IS NOT NULL THEN 1
ELSE 0 END AS bb,
nOfBookers
FROM
(SELECT c.id, calendarId, lessonType, GROUP_CONCAT(bookedby) AS vbb, description,
(SELECT COUNT(bookedby) FROM bookers WHERE c.id = bookers.lessonId) AS nOfBookers
FROM calendar c
LEFT JOIN bookers b ON c.id = b.lessonId
WHERE `calendarId`= 180
GROUP BY c.id, calendarId, lessonType, description) A;
In addition to your original LEFT JOIN attempt, I've added GROUP_CONCAT(bookedby) AS vbb, which returns the bookedby values of a lesson as a comma-separated list (for example 17,28,8). I then wrap the query in a subquery and apply a CASE expression with FIND_IN_SET on vbb to look for a specific bookedby value.
Here's an updated fiddle: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=0933e9fc3cb7445311c34c6705d11637
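The same flag can also be computed without building a string, by aggregating the comparison directly. This is only a sketch of that alternative, run from Python on an in-memory SQLite database as a stand-in for MySQL (both evaluate bookedBy = 8 to 1 or 0). It uses the sample bookers rows and the first five calendar rows, and USER_ID is a name I introduced for the user being checked.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; the query relies only on SUM/COUNT/CASE.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE calendar (id INTEGER PRIMARY KEY, calendarId TEXT, lessonType TEXT);
CREATE TABLE bookers (id INTEGER PRIMARY KEY, lessonId INT, bookedBy INT);
INSERT INTO calendar VALUES (1,'180','A'),(2,'180','A'),(3,'180','A'),(4,'180','B'),(5,'180','B');
INSERT INTO bookers VALUES (4,1,8),(5,2,8),(6,2,28),(7,2,17),(8,3,11);
""")

USER_ID = 8  # hypothetical parameter: the user whose bookings map to 2

rows = conn.execute("""
SELECT c.id, c.lessonType,
       CASE WHEN SUM(b.bookedBy = ?) > 0 THEN 2   -- booked by this user
            WHEN COUNT(b.bookedBy) > 0 THEN 1     -- booked, but only by others
            ELSE 0 END AS bb,                     -- not booked at all
       COUNT(b.bookedBy) AS nOfBookers
FROM calendar c
LEFT JOIN bookers b ON b.lessonId = c.id
WHERE c.calendarId = '180'
GROUP BY c.id, c.lessonType
ORDER BY c.id
""", (USER_ID,)).fetchall()

for row in rows:
    print(row)
```

For lessons without bookers the SUM is NULL, so the first WHEN is skipped and the row falls through to 0. COUNT(b.bookedBy) also replaces the correlated subquery for nOfBookers.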

SQL Query to find number of users in a Job Area

I have three tables:
jobAreas (id, title)
jobSkills (id,title, jobAreaID)
userSkills (id, userID, jobSkillID)
Each jobSkills entry belongs to a jobArea (linked by the foreign key jobAreaID), and each userSkills entry references a jobSkill (via jobSkillID).
I am trying to write a SQL SELECT query that lists the number of users that belong to each job area.
SELECT ja.id, ja.title, COUNT(*) as numUsers FROM user_skill_types uskills INNER JOIN job_areas ja INNER JOIN skill_types st ON ja.id = st.parent_id GROUP BY ja.id
But the numbers I am getting are not correct.
Given the following example (based on the table structure provided in the question).
CREATE TABLE `jobareas` (
`id` int(11) NOT NULL,
`title` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
INSERT INTO `jobareas` (`id`, `title`) VALUES
(1, 'area1'),
(2, 'area2'),
(3, 'area3'),
(4, 'area4'),
(5, 'area5'),
(6, 'area6'),
(7, 'area7'),
(8, 'area8');
-- --------------------------------------------------------
CREATE TABLE `jobskills` (
`id` int(11) NOT NULL,
`title` varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
`jobAreaID` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
INSERT INTO `jobskills` (`id`, `title`, `jobAreaID`) VALUES
(1, 'skill1', 1),
(2, 'skill2', 3),
(3, 'skill3', 3),
(4, 'skill4', 7),
(5, 'skill5', 4),
(6, 'skill6', 5),
(7, 'skill7', 1),
(8, 'skill8', 7),
(9, 'skill9', 6),
(10, 'skill10', 3),
(11, 'skill11', 4),
(12, 'skill12', 2),
(13, 'skill13', 6),
(14, 'skill14', 7),
(15, 'skill15', 2);
-- --------------------------------------------------------
CREATE TABLE `userskills` (
`id` int(11) NOT NULL,
`userID` int(11) NOT NULL,
`jobSkillID` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;
INSERT INTO `userskills` (`id`, `userID`, `jobSkillID`) VALUES
(1, 5, 10),
(2, 2, 11),
(3, 4, 14),
(4, 4, 6),
(5, 2, 8),
(6, 6, 9),
(7, 3, 9),
(8, 1, 12),
(9, 1, 3),
(10, 5, 10);
ALTER TABLE `jobareas`
ADD UNIQUE KEY `id` (`id`);
ALTER TABLE `jobskills`
ADD PRIMARY KEY (`id`),
ADD KEY `jobAreaID` (`jobAreaID`);
ALTER TABLE `userskills`
ADD PRIMARY KEY (`id`),
ADD KEY `userID` (`userID`),
ADD KEY `jobSkillID` (`jobSkillID`);
ALTER TABLE `jobskills`
ADD CONSTRAINT `jobskills_ibfk_1` FOREIGN KEY (`jobAreaID`) REFERENCES `jobareas` (`id`);
ALTER TABLE `userskills`
ADD CONSTRAINT `userskills_ibfk_1` FOREIGN KEY (`jobSkillID`) REFERENCES `jobskills` (`id`);
Your query should use DISTINCT.
SELECT COUNT(DISTINCT(`us`.`userID`)) AS `num`,`ja`.`title` FROM `userskills` `us`
INNER JOIN `jobskills` `js` ON `js`.`id` = `us`.`jobSkillID`
INNER JOIN `jobareas` `ja` ON `ja`.`id` = `js`.`jobAreaID`
GROUP BY `ja`.`id`;
The results can be checked in this SQLFiddle
The SQL query you shared does not seem to match the schema you shared. Also, you have not specified how to join the job_areas table.
Use:
select
ja.id, ja.title , count(us.id) as numUsers
from jobAreas ja
INNER JOIN jobSkills js on ja.id = js.jobAreaID
INNER JOIN userSkills us on js.id = us.jobSkillID
GROUP BY ja.id, ja.title
You are probably getting duplicates in your result because users have multiple skills, job areas have multiple skills, or both. Rather than COUNT(*), use COUNT(DISTINCT userID) to work around that:
SELECT ja.id, ja.title, COUNT(DISTINCT us.userID) as numUsers
FROM jobAreas ja
JOIN jobSkills js ON js.jobAreaID = ja.id
JOIN userSkills us ON us.jobSkillID = js.id
GROUP BY ja.id, ja.title
Note I've written the query based on the schema in your question. Based on the query you have written, it should probably look something like this (it's not clear what the userID column in user_skill_types is called, or how to JOIN user_skill_types to skill_types):
SELECT ja.id, ja.title, COUNT(DISTINCT uskills.userID) as numUsers
FROM job_areas ja
JOIN skill_types st ON ja.id = st.parent_id
JOIN user_skill_types uskills ON uskills.jobSkillID = st.id
GROUP BY ja.id, ja.title
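To make the difference between counting rows and counting users concrete, here is a small check against the example data from this thread. It is a sketch run from Python on an in-memory SQLite database standing in for MySQL. The column alias numRows is mine, added only to show the two counts side by side.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; COUNT(DISTINCT ...) behaves the same in both.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE jobareas (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE jobskills (id INTEGER PRIMARY KEY, title TEXT, jobAreaID INT);
CREATE TABLE userskills (id INTEGER PRIMARY KEY, userID INT, jobSkillID INT);
INSERT INTO jobareas VALUES (1,'area1'),(2,'area2'),(3,'area3'),(4,'area4'),
 (5,'area5'),(6,'area6'),(7,'area7'),(8,'area8');
INSERT INTO jobskills VALUES (1,'skill1',1),(2,'skill2',3),(3,'skill3',3),(4,'skill4',7),
 (5,'skill5',4),(6,'skill6',5),(7,'skill7',1),(8,'skill8',7),(9,'skill9',6),(10,'skill10',3),
 (11,'skill11',4),(12,'skill12',2),(13,'skill13',6),(14,'skill14',7),(15,'skill15',2);
INSERT INTO userskills VALUES (1,5,10),(2,2,11),(3,4,14),(4,4,6),(5,2,8),
 (6,6,9),(7,3,9),(8,1,12),(9,1,3),(10,5,10);
""")

rows = conn.execute("""
SELECT ja.title,
       COUNT(*) AS numRows,                    -- one per matching userskills row
       COUNT(DISTINCT us.userID) AS numUsers   -- one per user, however many skills
FROM jobareas ja
JOIN jobskills js ON js.jobAreaID = ja.id
JOIN userskills us ON us.jobSkillID = js.id
GROUP BY ja.id, ja.title
ORDER BY ja.id
""").fetchall()

for row in rows:
    print(row)
```

In area3 the two counts differ: user 5 has skill10 recorded twice and user 1 has skill3, so there are three rows but only two users. Areas with no users at all (area1, area8) drop out because of the inner joins; a LEFT JOIN from jobareas would keep them with a count of 0.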

How to use the result of a subquery

I have three SQL tables as below:
(The "orders" table below is not complete)
How can I answer the following question using just one SQL query:
Select the customers who ordered in 2014 all the products (at least) that the customers named 'Smith' ordered in 2013.
Is this possible?
I have thought about this:
First, find all the products that the customers named "Smith" ordered in 2013.
Second, find the customers who ordered at least all of those products in 2014.
Which brings me to a SQL query like this:
SELECT cname,
FROM customers
NATURAL JOIN orders
WHERE YEAR(odate) = '2014'
AND "do_something_here"
(SELECT DISTINCT pid
FROM orders
NATURAL JOIN customers
WHERE LOWER(cname)='smith'
AND YEAR(odate)='2013') as results;
in which the subquery results (the list of products that "Smith" ordered in 2013) should be used to find the customers needed.
But I don't know if this is a good approach.
Sorry for my English because I am not a native speaker.
If you want to test it out on phpMyAdmin, here's the SQL:
-- phpMyAdmin SQL Dump
-- version 4.7.5
-- https://www.phpmyadmin.net/
--
-- Host: localhost
-- Generation Time: Mar 21, 2018 at 02:49 PM
-- Server version: 5.7.20
-- PHP Version: 7.1.7
SET SQL_MODE = "NO_AUTO_VALUE_ON_ZERO";
SET AUTOCOMMIT = 0;
START TRANSACTION;
SET time_zone = "+00:00";
/*!40101 SET @OLD_CHARACTER_SET_CLIENT=@@CHARACTER_SET_CLIENT */;
/*!40101 SET @OLD_CHARACTER_SET_RESULTS=@@CHARACTER_SET_RESULTS */;
/*!40101 SET @OLD_COLLATION_CONNECTION=@@COLLATION_CONNECTION */;
/*!40101 SET NAMES utf8mb4 */;
--
-- Database: `tp1`
--
-- --------------------------------------------------------
--
-- Table structure for table `customers`
--
CREATE TABLE `customers` (
`cid` int(11) NOT NULL,
`cname` varchar(30) NOT NULL,
`residence` varchar(50) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
--
-- Dumping data for table `customers`
--
INSERT INTO `customers` (`cid`, `cname`, `residence`) VALUES
(0, 'didnotorder', 'Great Britain'),
(1, 'Jones', 'USA'),
(2, 'Blake', NULL),
(3, 'Dupond', 'France'),
(4, 'Smith', 'Great Britain'),
(5, 'Gupta', 'India'),
(6, 'Smith', 'France');
-- --------------------------------------------------------
--
-- Table structure for table `orders`
--
CREATE TABLE `orders` (
`pid` int(11) NOT NULL,
`cid` int(11) NOT NULL,
`odate` date NOT NULL,
`quantity` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
--
-- Dumping data for table `orders`
--
INSERT INTO `orders` (`pid`, `cid`, `odate`, `quantity`) VALUES
(1, 1, '2014-12-12', 2),
(1, 4, '2014-11-12', 6),
(2, 1, '2014-06-02', 6),
(2, 1, '2014-08-20', 6),
(2, 1, '2014-12-12', 2),
(2, 2, '2010-11-12', 1),
(2, 2, '2014-07-21', 3),
(2, 3, '2014-10-01', 1),
(2, 3, '2014-11-12', 1),
(2, 4, '2014-01-07', 1),
(2, 4, '2014-02-22', 1),
(2, 4, '2014-03-19', 1),
(2, 4, '2014-04-07', 1),
(2, 4, '2014-05-22', 1),
(2, 4, '2014-09-12', 4),
(2, 6, '2014-10-01', 1),
(3, 1, '2014-12-12', 1),
(3, 2, '2013-01-01', 1),
(3, 4, '2015-10-12', 1),
(3, 4, '2015-11-12', 1),
(4, 1, '2014-12-12', 3),
(4, 2, '2014-06-11', 2),
(4, 5, '2014-10-12', 1),
(4, 5, '2014-11-13', 5),
(5, 2, '2015-07-21', 3),
(6, 2, '2015-07-21', 7),
(6, 3, '2014-12-25', 1);
-- --------------------------------------------------------
--
-- Table structure for table `products`
--
CREATE TABLE `products` (
`pid` int(11) NOT NULL,
`pname` varchar(30) NOT NULL,
`price` decimal(7,2) NOT NULL,
`origin` varchar(20) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
--
-- Dumping data for table `products`
--
INSERT INTO `products` (`pid`, `pname`, `price`, `origin`) VALUES
(0, 'wasnotordered', '11.00', NULL),
(1, 'chocolate', '5.00', 'Belgium'),
(2, 'sugar', '0.75', 'India'),
(3, 'milk', '0.60', 'France'),
(4, 'tea', '10.00', 'India'),
(5, 'chocolate', '7.50', 'Switzerland'),
(6, 'milk', '1.50', 'France');
--
-- Indexes for dumped tables
--
--
-- Indexes for table `customers`
--
ALTER TABLE `customers`
ADD PRIMARY KEY (`cid`);
--
-- Indexes for table `orders`
--
ALTER TABLE `orders`
ADD PRIMARY KEY (`pid`,`cid`,`odate`),
ADD KEY `orders_fk_cid` (`cid`);
--
-- Indexes for table `products`
--
ALTER TABLE `products`
ADD PRIMARY KEY (`pid`);
--
-- Constraints for dumped tables
--
--
-- Constraints for table `orders`
--
ALTER TABLE `orders`
ADD CONSTRAINT `orders_fk_cid` FOREIGN KEY (`cid`) REFERENCES `customers` (`cid`),
ADD CONSTRAINT `orders_fk_pid` FOREIGN KEY (`pid`) REFERENCES `products` (`pid`);
COMMIT;
/*!40101 SET CHARACTER_SET_CLIENT=@OLD_CHARACTER_SET_CLIENT */;
/*!40101 SET CHARACTER_SET_RESULTS=@OLD_CHARACTER_SET_RESULTS */;
/*!40101 SET COLLATION_CONNECTION=@OLD_COLLATION_CONNECTION */;
You can try something like the following. Basically, cross join the customers with all the products the Smiths ordered in 2013, then LEFT JOIN with the products each customer bought in 2014. If both counts are equal for a customer, every product the Smiths ordered in 2013 was bought by that customer at least once in 2014.
SELECT
C.cid
FROM
Customers C
CROSS JOIN (
SELECT DISTINCT
P.pid
FROM
Customers C
INNER JOIN Orders O ON C.cid = O.cid
INNER JOIN Products P ON O.pid = P.pid
WHERE
C.cname = 'Smith' AND
YEAR(O.odate) = 2013) X
LEFT JOIN (
SELECT DISTINCT
C.cid,
P.pid
FROM
Customers C
INNER JOIN Orders O ON C.cid = O.cid
INNER JOIN Products P ON O.pid = P.pid
WHERE
YEAR(O.odate) = 2014) R ON C.cid = R.cid AND X.pid = R.pid
GROUP BY
C.cid
HAVING
COUNT(X.pid) = COUNT(R.pid)
If you want to see the customers even if there are no products from Smith in 2013, you can switch the CROSS JOIN for a LEFT JOIN (...) X ON 1 = 1 (MySQL does not support FULL JOIN).
You can solve this by finding all the products that each cid has in common with the Smith customers. Then, just check that the count covers all the products:
select o2014.cid, count(distinct o2013.pid) as num_products,
group_concat(distinct o2013.pid) as products
from orders o2013 join
orders o2014
on o2013.pid = o2014.pid and
year(o2013.odate) = 2013 and year(o2014.odate) = 2014
-- IN rather than =, because more than one customer can be named Smith
where o2013.cid in (select c.cid from customers c where c.cname = 'Smith')
group by o2014.cid
having num_products = (select count(distinct s.pid)
from orders s
where year(s.odate) = 2013 and
s.cid in (select c.cid from customers c where c.cname = 'Smith')
);
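This kind of "ordered all of them" question is relational division, and the counting approach can be tried out on a tiny data set. The sketch below is illustrative only: the rows are hypothetical (in the dump above no Smith has an order dated 2013), SQLite stands in for MySQL, and strftime('%Y', odate) replaces MySQL's YEAR(odate) for that reason. SMITH_2013 is a helper name I introduced for the shared subquery.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; strftime('%Y', ...) plays the role of YEAR(...).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (cid INTEGER PRIMARY KEY, cname TEXT);
CREATE TABLE orders (pid INT, cid INT, odate TEXT);
-- Hypothetical rows, not the data from the dump.
INSERT INTO customers VALUES (1,'Jones'),(2,'Blake'),(4,'Smith');
INSERT INTO orders VALUES
 (1,4,'2013-03-01'),(2,4,'2013-07-15'),                     -- Smith, 2013: products 1, 2
 (1,1,'2014-02-02'),(2,1,'2014-06-02'),(3,1,'2014-12-12'),  -- Jones, 2014: products 1, 2, 3
 (2,2,'2014-07-21'),(1,2,'2013-01-01');                     -- Blake: 2 in 2014, 1 only in 2013
""")

# Products ordered in 2013 by any customer named Smith.
SMITH_2013 = """
SELECT s.pid FROM orders s
JOIN customers sc ON sc.cid = s.cid
WHERE sc.cname = 'Smith' AND strftime('%Y', s.odate) = '2013'
"""

# Keep each customer's 2014 orders of Smith's products, then require that the
# number of distinct products matched equals the size of Smith's 2013 set.
rows = conn.execute(f"""
SELECT c.cid, c.cname
FROM customers c
JOIN orders o ON o.cid = c.cid
WHERE strftime('%Y', o.odate) = '2014'
  AND o.pid IN ({SMITH_2013})
GROUP BY c.cid, c.cname
HAVING COUNT(DISTINCT o.pid) = (SELECT COUNT(DISTINCT pid) FROM ({SMITH_2013}))
ORDER BY c.cid
""").fetchall()

print(rows)
```

Jones qualifies because his 2014 orders cover both of Smith's products. Blake ordered product 1 only in 2013, so his 2014 count is 1 and he is filtered out. If Smith's 2013 set is empty, this form returns no rows.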

Condition based selecting from group_concat in Mysql query

Here is my table and sample data.
CREATE TABLE `articles`
(
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`title` varchar(100) NOT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE `tags`
(
`id` int(11) NOT NULL,
`name` varchar(100) NOT NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE `article_tags`
(
`article_id` int(11) NOT NULL,
`tag_id` int(11) NOT NULL
);
INSERT INTO `tags` (`id`, `name`) VALUES
(1, 'Wap Stories'),
(2, 'App Stories');
INSERT INTO `articles` (`id`, `title`) VALUES
(1, 'USA'),
(2, 'England'),
(3, 'Germany'),
(4, 'India'),
(5, 'France'),
(6, 'Dubai'),
(7, 'Poland'),
(8, 'Japan'),
(9, 'China'),
(10, 'Australia');
INSERT INTO `article_tags` (`article_id`, `tag_id`) VALUES
(1, 1),
(1, 2),
(4, 1),
(5, 1),
(2, 2),
(2, 1),
(6, 2),
(7, 2),
(8, 1),
(9, 1),
(3, 2),
(9, 2),
(10, 2);
How can I get the output below? I have tried using the GROUP_CONCAT function, but it returns all the results. My requirement is that the concatenated values follow these rules:
a. Combination of 1,2 can be there, only 1 can be there but 2 alone cannot be there.
b. Combination of 2,1 can be there, only 2 can be there but 1 alone cannot be there
Below is the output I need
id, title, groupconcat
--------------------------
1, USA, 1,2
2, England, 1,2
4, India, 1
5, France, 1
8, Japan, 1
9, China, 1,2
SqlFiddle Link
The query which I am using is
select id, title, group_concat(tag_id order by tag_id) as 'groupconcat' from articles a
left join article_tags att on a.id = att.article_id
where att.tag_id in (1,2)
group by article_id order by id
You can try it like this:
SELECT id, title, GROUP_CONCAT(tag_id ORDER BY tag_id) AS 'groupconcat'
FROM articles a
LEFT join article_tags att on a.id = att.article_id
WHERE att.tag_id in (1,2)
GROUP BY article_id
HAVING SUBSTRING_INDEX(groupconcat,',',1) !='2'
ORDER BY id
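If the rule is read as "keep an article only when tag 1 is among its tags", the filter can also test the tags themselves instead of the concatenated string. Here is a sketch of that variant, run from Python on an in-memory SQLite database as a stand-in for MySQL. Because SQLite's GROUP_CONCAT order is not guaranteed, the tag list is sorted in Python; in MySQL, GROUP_CONCAT(tag_id ORDER BY tag_id) does that inside the query.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; both evaluate (tag_id = 1) to 1 or 0.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE article_tags (article_id INT, tag_id INT);
INSERT INTO articles VALUES (1,'USA'),(2,'England'),(3,'Germany'),(4,'India'),(5,'France'),
 (6,'Dubai'),(7,'Poland'),(8,'Japan'),(9,'China'),(10,'Australia');
INSERT INTO article_tags VALUES (1,1),(1,2),(4,1),(5,1),(2,2),(2,1),(6,2),(7,2),
 (8,1),(9,1),(3,2),(9,2),(10,2);
""")

raw = conn.execute("""
SELECT a.id, a.title, GROUP_CONCAT(att.tag_id) AS tags
FROM articles a
JOIN article_tags att ON att.article_id = a.id
WHERE att.tag_id IN (1, 2)
GROUP BY a.id, a.title
HAVING SUM(att.tag_id = 1) > 0   -- at least one row of the group carries tag 1
ORDER BY a.id
""").fetchall()

# Normalise the tag order on the client side (see the note above).
result = [(aid, title, ",".join(sorted(tags.split(",")))) for aid, title, tags in raw]

for row in result:
    print(row)
```

On the sample data this yields the six articles listed in the question and drops the ones tagged only with 2.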

search a set of values in other set of values for a row

Hello, I am having issues with the execution time of a query that searches for users (from the users table) who have at least one interest from a specified set of interests and a location from a specified set of locations. I have this test DB:
CREATE TABLE IF NOT EXISTS `interests` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=10 ;
--
-- Dumping data for table `interests`
--
INSERT INTO `interests` (`id`, `name`) VALUES
(1, 'auto'),
(2, 'moto'),
(3, 'health'),
(4, 'garden'),
(5, 'house'),
(6, 'music'),
(7, 'video'),
(8, 'games'),
(9, 'it');
-- --------------------------------------------------------
--
-- Table structure for table `locations`
--
CREATE TABLE IF NOT EXISTS `locations` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(50) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=11 ;
--
-- Dumping data for table `locations`
--
INSERT INTO `locations` (`id`, `name`) VALUES
(1, 'engalnd'),
(2, 'austia'),
(3, 'germany'),
(4, 'france'),
(5, 'belgium'),
(6, 'italy'),
(7, 'russia'),
(8, 'poland'),
(9, 'norway'),
(10, 'romania');
-- --------------------------------------------------------
--
-- Table structure for table `users`
--
CREATE TABLE IF NOT EXISTS `users` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`email` varchar(255) NOT NULL,
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 AUTO_INCREMENT=11 ;
--
-- Dumping data for table `users`
--
INSERT INTO `users` (`id`, `email`) VALUES
(1, 'email1@test.com'),
(2, 'email2@test.com'),
(3, 'email3@test.com'),
(4, 'email4@test.com'),
(5, 'email5@test.com'),
(6, 'email6@test.com'),
(7, 'email7@test.com'),
(8, 'email8@test.com'),
(9, 'email9@test.com'),
(10, 'email10@test.com');
-- --------------------------------------------------------
--
-- Table structure for table `users_interests`
--
CREATE TABLE IF NOT EXISTS `users_interests` (
`user_id` int(11) NOT NULL,
`interest_id` int(11) NOT NULL,
PRIMARY KEY (`user_id`,`interest_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
--
-- Dumping data for table `users_interests`
--
INSERT INTO `users_interests` (`user_id`, `interest_id`) VALUES
(1, 1),
(1, 2),
(2, 5),
(2, 7),
(2, 8),
(3, 1),
(4, 1),
(4, 5),
(4, 6),
(4, 7),
(4, 8),
(5, 1),
(5, 2),
(5, 8),
(6, 3),
(6, 7),
(6, 8),
(7, 7),
(7, 9),
(8, 5);
-- --------------------------------------------------------
--
-- Table structure for table `users_locations`
--
CREATE TABLE IF NOT EXISTS `users_locations` (
`user_id` int(11) NOT NULL,
`location_id` int(11) NOT NULL,
PRIMARY KEY (`user_id`,`location_id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;
--
-- Dumping data for table `users_locations`
--
INSERT INTO `users_locations` (`user_id`, `location_id`) VALUES
(2, 5),
(2, 7),
(2, 8),
(3, 1),
(4, 1),
(4, 5),
(4, 6),
(4, 7),
(4, 8),
(5, 1),
(5, 2),
(5, 8),
(6, 3),
(6, 7),
(6, 8),
(7, 7),
(7, 9),
(8, 5);
Is there a better way to query it than this:
SELECT email,
GROUP_CONCAT( DISTINCT ui.interest_id ) AS interests,
GROUP_CONCAT( DISTINCT ul.location_id ) AS locations
FROM `users` u
LEFT JOIN users_interests ui ON u.id = ui.user_id
LEFT JOIN users_locations ul ON u.id = ul.user_id
GROUP BY u.id
HAVING IF( interests IS NOT NULL , FIND_IN_SET( 2, interests )
OR FIND_IN_SET( 3, interests ) , 1 )
AND IF( locations IS NOT NULL , FIND_IN_SET( 2, locations )
OR FIND_IN_SET( 3, locations ) , 1 )
This is the best solution I have found, but it is still slow with 500k to 1 million rows in the relational tables (locations and interests), especially when matching against a large set of values (say, more than 50 locations and interests).
So I am trying to achieve the result this query produces, but a bit faster:
email interests locations
email1@test.com 1,2 [BLOB - 0B]
email5@test.com 1,2,8 1,2,8
email6@test.com 3,7,8 3,7,8
email9@test.com [BLOB - 0B] [BLOB - 0B]
email10@test.com [BLOB - 0B] [BLOB - 0B]
I also tried joining against a SELECT ... UNION derived table for the matching set, but it was even slower. Like this:
SELECT *
FROM `users` u
LEFT JOIN users_interests ui ON u.id = ui.user_id
LEFT JOIN users_locations ul ON u.id = ul.user_id
LEFT JOIN (SELECT 2 as interest UNION SELECT 3 as interest) as `is` ON ui.interest_id = is.interest
LEFT JOIN (SELECT 2 as location UNION SELECT 3 as location ) as `ls` ON ul.location_id = ls.location
WHERE IF(ui.user_id IS NOT NULL, `is`.interest IS NOT NULL,1) AND
IF(ul.user_id IS NOT NULL, ls.location IS NOT NULL,1)
GROUP BY u.id
I am using this for a basic targeting system.
I would appreciate very much, any suggestion! Thank you!
IS is a reserved word in MySQL, so the alias `is` has to be quoted with backticks everywhere it is used.
Also, your GROUP BY can slow the query down, and I don't see the point of grouping by u.id here, since u.id is already a unique ID.
Try using backticks around it:
SELECT *
FROM `users` u
LEFT JOIN users_interests ui ON u.id = ui.user_id
LEFT JOIN users_locations ul ON u.id = ul.user_id
LEFT JOIN (SELECT 2 as interest UNION SELECT 3 as interest) as `is`
ON ui.interest_id = `is`.interest
LEFT JOIN (SELECT 2 as location UNION SELECT 3 as location ) as `ls`
ON ul.location_id = `ls`.location
WHERE IF(ui.user_id IS NOT NULL, `is`.interest IS NOT NULL,1)
AND
IF(ul.user_id IS NOT NULL, `ls`.location IS NOT NULL,1)
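Another option worth measuring is to drop the joins and GROUP BY altogether and express each condition with EXISTS, so that no GROUP_CONCAT strings have to be built and searched. The sketch below is not benchmarked. It reproduces the rule from the question (a user matches if they have one of the wanted interests or no interests at all, and likewise for locations) on the test data, using an in-memory SQLite database from Python as a stand-in for MySQL and returning only user ids.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; EXISTS / NOT EXISTS are standard SQL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY);
CREATE TABLE users_interests (user_id INT, interest_id INT, PRIMARY KEY (user_id, interest_id));
CREATE TABLE users_locations (user_id INT, location_id INT, PRIMARY KEY (user_id, location_id));
INSERT INTO users VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10);
INSERT INTO users_interests VALUES (1,1),(1,2),(2,5),(2,7),(2,8),(3,1),(4,1),(4,5),(4,6),
 (4,7),(4,8),(5,1),(5,2),(5,8),(6,3),(6,7),(6,8),(7,7),(7,9),(8,5);
INSERT INTO users_locations VALUES (2,5),(2,7),(2,8),(3,1),(4,1),(4,5),(4,6),(4,7),(4,8),
 (5,1),(5,2),(5,8),(6,3),(6,7),(6,8),(7,7),(7,9),(8,5);
""")

rows = conn.execute("""
SELECT u.id
FROM users u
WHERE (    NOT EXISTS (SELECT 1 FROM users_interests ui WHERE ui.user_id = u.id)
        OR EXISTS (SELECT 1 FROM users_interests ui
                   WHERE ui.user_id = u.id AND ui.interest_id IN (2, 3)))
  AND (    NOT EXISTS (SELECT 1 FROM users_locations ul WHERE ul.user_id = u.id)
        OR EXISTS (SELECT 1 FROM users_locations ul
                   WHERE ul.user_id = u.id AND ul.location_id IN (2, 3)))
ORDER BY u.id
""").fetchall()

matching_ids = [r[0] for r in rows]
print(matching_ids)
```

Each subquery can be answered from the (user_id, interest_id) and (user_id, location_id) primary keys, which is why this shape is often cheaper than grouping a large join. Whether it actually is faster on 500k to 1 million rows is something only EXPLAIN and a timing run on the real data can tell.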