Currently I have two tables.
Customers:
id | name | status
1  | adam | 1
2  | bob  | 1
3  | cain | 2
Orders:
customer_id | item
1           | apple
1           | banana
1           | bonbon
2           | carrot
3           | egg
I'm trying to do an INNER JOIN first and then query against the resulting table, so that a user can type in a partial name or a partial item and get back all the matching names with their items.
For example, if a user types in "b", it would return:
customer_id | name | status | items
1           | adam | 1      | apple/banana/bonbon
2           | bob  | 1      | carrot
What I am currently doing is:
SELECT * FROM(
SELECT customers.* , GROUP_CONCAT(orders.item SEPARATOR '|') as items
FROM customers
LEFT JOIN orders
ON customers.id = orders.customer_id
group by customers.id
) as t
WHERE t.status = 1 AND ( t.name LIKE "%b%" OR t.items LIKE "%b%")
This works, but it is incredibly slow (2+ seconds).
The strange part is that if I run the queries individually, the subquery executes in 0.0004 seconds and the outer query executes in 0.006 seconds.
But for some reason, combining them increases the wait time a lot.
Is there a more efficient way to do this?
CREATE TABLE IF NOT EXISTS `customers` (
`id` int(6),
`name` varchar(255) ,
`status` int(6),
PRIMARY KEY (`id`,`name`,`status`)
);
INSERT INTO `customers` (`id`, `name` , `status`) VALUES
('1', 'Adam' , 1),
('2', 'bob' , 1),
('3', 'cain' , 2);
CREATE TABLE IF NOT EXISTS `orders` (
`customer_id` int(6),
`item` varchar(255) ,
PRIMARY KEY (`customer_id`,`item`)
);
INSERT INTO `orders` (`customer_id`, `item`) VALUES
('1', 'apple'),
('1', 'banana'),
('1', 'bonbon'),
('2', 'carrot'),
('3', 'egg');
According to the query, you are trying to perform a full-text search on the fields name and item. I would suggest adding full-text indexes to them using ngram tokenisation as you are looking up by part of a word:
ALTER TABLE customers ADD FULLTEXT INDEX ft_idx_name (name) WITH PARSER ngram;
ALTER TABLE orders ADD FULLTEXT INDEX ft_idx_item (item) WITH PARSER ngram;
In this case, your query would look as follows:
SELECT
customers.*, GROUP_CONCAT(orders.item SEPARATOR '|')
FROM
customers
LEFT JOIN orders on customers.id = orders.customer_id
WHERE
orders.customer_id IS NOT NULL
AND customers.status = 1
AND (MATCH(customers.name) AGAINST('bo')
OR MATCH(orders.item) AGAINST('bo'))
GROUP BY
customers.id
If needed, you can change the ngram_token_size MySQL system variable. Its default value is 2, which means the search term must be at least two characters long; a single-character search such as "b" would need a token size of 1.
Another approach is to implement it by means of a dedicated search engine, e.g. Elasticsearch, when requirements evolve.
SELECT * FROM(
SELECT customers.* , GROUP_CONCAT(orders.item SEPARATOR '|') as items
FROM customers
LEFT JOIN orders
ON customers.id = orders.customer_id AND customers.name LIKE "%adam" AND orders.item LIKE "%b"
group by customers.id
) as t
Filtering the records in the join condition, before grouping, should be faster.
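A minimal sketch of the "filter before grouping" idea, run against SQLite through Python's sqlite3 module rather than MySQL (an assumption made so the example is self-contained). It keeps the OR semantics from the question by testing the item match with EXISTS, so a customer matched through one item still gets all of their items concatenated. The helper name `search` is hypothetical, and SQLite spells the separator as a second argument to GROUP_CONCAT instead of SEPARATOR. Whether this is faster on MySQL depends on the data and indexes, so it is worth checking with EXPLAIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, status INTEGER);
    CREATE TABLE orders (customer_id INTEGER, item TEXT,
                         PRIMARY KEY (customer_id, item));
    INSERT INTO customers VALUES (1, 'Adam', 1), (2, 'bob', 1), (3, 'cain', 2);
    INSERT INTO orders VALUES (1, 'apple'), (1, 'banana'), (1, 'bonbon'),
                              (2, 'carrot'), (3, 'egg');
""")


def search(conn, term):
    """Return active customers whose name or any item contains `term`."""
    pattern = f"%{term}%"
    return conn.execute(
        """
        SELECT c.id, c.name, c.status, GROUP_CONCAT(o.item, '|') AS items
        FROM customers c
        LEFT JOIN orders o ON o.customer_id = c.id
        WHERE c.status = 1
          AND (c.name LIKE ?
               -- the item test runs per customer, before aggregation
               OR EXISTS (SELECT 1 FROM orders m
                          WHERE m.customer_id = c.id AND m.item LIKE ?))
        GROUP BY c.id, c.name, c.status
        ORDER BY c.id
        """,
        (pattern, pattern),
    ).fetchall()


results = search(conn, "b")
for row in results:
    print(row)
```

The filter is applied to base-table rows, so no derived table has to be materialised and then scanned with LIKE.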
I'm trying to write a simple SQL query to show all possible combinations of data in a single table. Here's the table:
id | fruit
1  | apple
2  | orange
3  | pear
4  | plum
I've only got as far as pairing all the data using CROSS JOIN: "apple,orange", "apple,pear" etc.
SELECT t1.fruit, t2.fruit
FROM fruits t1
CROSS JOIN fruits t2
WHERE t1.fruit < t2.fruit
Instead I'm looking for all unique combinations in alphabetical order, e.g.
apple
apple,orange
apple,orange,pear
apple,orange,pear,plum
apple,pear
apple,plum
apple,orange,plum
apple,pear,plum
orange
orange,pear
orange,pear,plum
orange,plum
pear
pear,plum
plum
i.e. as long as a combination exists once, it doesn't need to appear again in a different order, e.g. with apple,orange, there is no need for orange,apple
This should work for any table size.
Result here
Note: this requires MySQL 8+.
-- TABLE
CREATE TABLE IF NOT EXISTS `fruits`
(
`id` int(6) NOT NULL,
`fruit` char(20)
);
INSERT INTO `fruits` VALUES (1, 'apple');
INSERT INTO `fruits` VALUES (2, 'orange');
INSERT INTO `fruits` VALUES (3, 'pear');
INSERT INTO `fruits` VALUES (4 ,'plum');
-- QUERY
WITH RECURSIVE cte ( combination, curr ) AS (
SELECT
CAST(t.fruit AS CHAR(80)),
t.id
FROM
fruits t
UNION ALL
SELECT
CONCAT(c.combination, ', ', CAST( t.fruit AS CHAR(100))),
t.id
FROM
fruits t
INNER JOIN
cte c
ON (c.curr < t.id)
)
SELECT combination FROM cte;
Credit:
Code adapted from this answer
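As a quick check of the recursive-CTE idea above, here is a minimal sketch run against SQLite through Python's sqlite3 module (an assumption; the original targets MySQL 8+). It uses SQLite's `||` operator in place of CONCAT and a plain comma as the delimiter. Each recursion step extends a combination only with fruits whose id is higher than the last one added, which is what prevents "orange,apple" from appearing alongside "apple,orange".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fruits (id INTEGER NOT NULL, fruit TEXT);
    INSERT INTO fruits VALUES (1, 'apple'), (2, 'orange'), (3, 'pear'), (4, 'plum');
""")

combinations = [
    row[0]
    for row in conn.execute(
        """
        WITH RECURSIVE cte (combination, curr) AS (
            -- seed: every fruit on its own
            SELECT fruit, id FROM fruits
            UNION ALL
            -- step: append any fruit with a higher id than the last one used
            SELECT c.combination || ',' || t.fruit, t.id
            FROM fruits t
            JOIN cte c ON c.curr < t.id
        )
        SELECT combination FROM cte ORDER BY combination
        """
    )
]

for combination in combinations:
    print(combination)
```

For n rows this yields 2^n - 1 combinations, so the result grows quickly with table size.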
EDIT: This query doesn't give all the possible combinations.
The query below should work:
WITH RECURSIVE cte AS (
SELECT A.id,
CONCAT(A.fruit,',',GROUP_CONCAT(B.fruit ORDER BY B.id)) AS combinations,
COUNT(*) AS count_of_delims
FROM fruits A
INNER JOIN fruits B
ON A.id<B.id
GROUP BY A.id,A.fruit
UNION ALL
SELECT id,
SUBSTRING_INDEX(combinations,',',count_of_delims),
count_of_delims-1
FROM cte
WHERE count_of_delims>0
)
SELECT combinations FROM cte ORDER BY id;
Here is a working example in DB Fiddle.
I have two tables, one of products and the other of product tags.
CREATE TABLE IF NOT EXISTS `products` (
`id` int(6) unsigned NOT NULL,
`name` varchar(5) NOT NULL,
PRIMARY KEY (`id`)
) DEFAULT CHARSET=utf8;
INSERT INTO `products` (`id`, `name`) VALUES
(1, 'Shirt'),
(2, 'Pants'),
(3, 'Socks');
CREATE TABLE IF NOT EXISTS `tags` (
`tag_id` int(6) unsigned NOT NULL,
`product_id` int(6) unsigned NOT NULL
) DEFAULT CHARSET=utf8;
INSERT INTO `tags` (`tag_id`, `product_id`) VALUES
(50, 1),
(51, 1),
(50, 2);
Fiddle: http://sqlfiddle.com/#!9/3f58a16/1
1 - I need a query that will get all products that have ALL of the given tags. The number of tags to filter by can vary (e.g. 50 AND 51 AND ...).
SELECT products.id, products.name
FROM products
JOIN (
SELECT product_id, count(DISTINCT tag_id) AS c
FROM tags
WHERE tags.tag_id IN(50,51)
GROUP BY product_id
) t ON t.product_id = products.id
WHERE t.c = 2
2 - I need a query that will get all products that have ANY of the given tags. The number of tags to filter by can vary (e.g. 50 OR 51 OR ...).
SELECT products.id, products.name
FROM products
JOIN (
SELECT product_id, count(DISTINCT tag_id) AS c
FROM tags
WHERE tags.tag_id IN(50,51)
GROUP BY product_id
) t ON t.product_id = products.id
My question is whether this is a good way to get the results I need.
products
id | name
1 Shirt
2 Shoes
3 Pants
tags
product_id | tag_id
1 50
1 51
2 50
Desired result (where tags are 50 AND 51)
id | name
1 Shirt
I would be happy to edit the title if someone can suggest better phrasing...
For case 1, you can try the following:
SELECT products.id, products.name
FROM products join tags on products.id=product_id
where tag_id in (50,51)
group by products.id, products.name
having count(distinct tag_id)=2
For case 2, you don't need the GROUP BY with a HAVING clause:
SELECT distinct products.id, products.name
FROM products join tags on products.id=product_id
where tag_id in (50,51)
You can use EXISTS for the first query as follows (this form assumes exactly two tags are being filtered on; DISTINCT keeps a product from being returned once per matching tag):
SELECT DISTINCT p.id, p.name
FROM products p join tags t on p.id=t.product_id
where t.tag_id in (50,51)
And exists
(Select 1 from tags tt
Where tt.tag_id in (50,51)
And tt.tag_id <> t.tag_id
And tt.product_id = t.product_id)
For the second query, just use IN as mentioned in the other answer.
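Since the number of tags varies, both cases can share one parameterised query. The sketch below runs against SQLite through Python's sqlite3 module (an assumption, standing in for MySQL), and the helper name `products_with_tags` is hypothetical. It builds the IN list from placeholders and varies only the HAVING threshold: the number of distinct requested tags for ALL, and 1 for ANY. It assumes `tag_ids` is non-empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE tags (tag_id INTEGER NOT NULL, product_id INTEGER NOT NULL);
    INSERT INTO products VALUES (1, 'Shirt'), (2, 'Pants'), (3, 'Socks');
    INSERT INTO tags VALUES (50, 1), (51, 1), (50, 2);
""")


def products_with_tags(conn, tag_ids, match_all):
    """Products carrying all (match_all=True) or any of the given tags."""
    placeholders = ",".join("?" for _ in tag_ids)
    # ALL: every distinct requested tag must be present; ANY: at least one.
    required = len(set(tag_ids)) if match_all else 1
    sql = f"""
        SELECT p.id, p.name
        FROM products p
        JOIN tags t ON t.product_id = p.id
        WHERE t.tag_id IN ({placeholders})
        GROUP BY p.id, p.name
        HAVING COUNT(DISTINCT t.tag_id) >= ?
        ORDER BY p.id
    """
    return conn.execute(sql, (*tag_ids, required)).fetchall()


all_match = products_with_tags(conn, [50, 51], match_all=True)
any_match = products_with_tags(conn, [50, 51], match_all=False)
print(all_match)
print(any_match)
```

Only placeholders are interpolated into the SQL text; the tag values themselves are bound as parameters.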
I am trying to limit returned results of users to results that are "recent" but where users have a parent, I also need to return the parent.
CREATE TABLE `users` (
`id` int(0) NOT NULL,
`parent_id` int(0) NULL,
`name` varchar(255) NULL,
PRIMARY KEY (`id`)
);
CREATE TABLE `times` (
`id` int(11) NOT NULL,
`time` datetime DEFAULT NULL,
PRIMARY KEY (`id`)
);
INSERT INTO `users`(`id`, `parent_id`, `name`) VALUES (1, NULL, 'Alan');
INSERT INTO `users`(`id`, `parent_id`, `name`) VALUES (2, 1, 'John');
INSERT INTO `users`(`id`, `parent_id`, `name`) VALUES (3, NULL, 'Jerry');
INSERT INTO `users`(`id`, `parent_id`, `name`) VALUES (4, NULL, 'Bill');
INSERT INTO `users`(`id`, `parent_id`, `name`) VALUES (5, 1, 'Carl');
INSERT INTO `times`(`id`, `time`) VALUES (2, '2019-01-01 14:40:38');
INSERT INTO `times`(`id`, `time`) VALUES (4, '2019-01-01 14:40:38');
http://sqlfiddle.com/#!9/91db19
In this case I would want to return Alan, John and Bill, but not Jerry because Jerry doesn't have a record in the times table, nor is he a parent of someone with a record. I am on the fence about what to do with Carl, I don't mind getting the results for him, but I don't need them.
I am filtering tens of thousands of users with hundreds of thousands of times records, so performance is important. In general I have about 3000 unique ids coming from times, each of which could be either an id or a parent_id.
The above is a stripped-down example of what I am trying to do. The full query includes more joins and CASE statements (it is nearly 100 lines), but the example above should be what we work with. Here is a sample of the query I am using:
SELECT id AS reference_id,
CASE WHEN (id != parent_id)
THEN
parent_id
ELSE null END AS parent_id,
parent_id AS family_id,
Rtrim(last_name) AS last_name,
Rtrim(first_name) AS first_name,
Rtrim(email) AS email,
missedappt AS appointment_missed,
appttotal AS appointment_total,
To_char(birth_date, 'YYYY-MM-DD 00:00:00') AS birthday,
To_char(first_visit_date, 'YYYY-MM-DD 00:00:00') AS first_visit,
billing_0_30
FROM users AS p
RIGHT JOIN(
SELECT p.id,
s.parentid,
Count(p.id) AS appttotal,
missedappt,
billing0to30 AS billing_0_30
FROM times AS p
JOIN (SELECT missedappt, parent_id, id
FROM users) AS s
ON p.id = s.id
LEFT JOIN (SELECT parent_id, billing0to30
FROM aging) AS aging
ON aging.parent_id = p.id
WHERE p.apptdate > To_char(Timestampadd(sql_tsi_year, -1, Now()), 'YYYY-MM-DD')
GROUP BY p.id,
s.parent_id,
missedappt,
billing0to30
) AS recent ON recent.patid = p.patient_id
This example is for a Faircom C-Tree database, but I also need to implement a similar solution in Sybase, MySQL, and Pervasive, so I am just trying to understand what I should do for best performance.
Essentially, what I need to do is somehow get the RIGHT JOIN to also include the user's parent.
NOTES:
Based on your fiddle config, I'm assuming you're using MySQL 5.6 and thus don't have support for Common Table Expressions (CTEs).
I'm assuming each name (child or parent) is to be presented as a separate record in the final result set.
We want to limit the number of times we have to join the times and users tables (a CTE would make this a bit easier to code/read).
The main query (times -> users(u1) -> users(u2)) will give us child and parent names in separate columns, so we'll use a 2-row dynamic table plus a case statement to pivot the columns into their own rows (NOTE: I don't work with MySQL and didn't have time to research whether there's a pivot capability in MySQL 5.6).
-- we'll let 'distinct' filter out any duplicates (eg, 2 'children' have same 'parent')
select distinct
final.name
from
-- cartesian product of 'allnames' and 'pass' will give us
-- duplicate lines of id/parent_id/child_name/parent_name so
-- we'll use a 'case' statement to determine which name to display
(select case when pass.pass_no = 1
then allnames.child_name
else allnames.parent_name
end as name
from
-- times join users left join users; gives us pairs of
-- child_name/parent_name or child_name/NULL
(select u1.id,u1.parent_id,u1.name as child_name,u2.name as parent_name
from times t
join users u1
on u1.id = t.id
left
join users u2
on u2.id = u1.parent_id) allnames
join
-- poor man's pivot code:
-- 2-row dynamic table; no join clause w/ allnames will give us a
-- cartesian product; the 'case' statement will determine which
-- name (child vs parent) to display
(select 1 as pass_no
union
select 2) pass
) final
-- eliminate 'NULL' as a name in our final result set
where final.name is not NULL
order by 1
Result set:
name
==============
Alan
Bill
John
MySQL fiddle
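An alternative to the pivot is to collect the wanted ids with a UNION and join back to users once. The sketch below runs against SQLite through Python's sqlite3 module (an assumption, standing in for MySQL 5.6); the alias `wanted` and the column name `user_id` are hypothetical. UNION (not UNION ALL) removes the duplicate that would otherwise appear when two children share a parent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
    CREATE TABLE times (id INTEGER PRIMARY KEY, time TEXT);
    INSERT INTO users VALUES (1, NULL, 'Alan'), (2, 1, 'John'), (3, NULL, 'Jerry'),
                             (4, NULL, 'Bill'), (5, 1, 'Carl');
    INSERT INTO times VALUES (2, '2019-01-01 14:40:38'), (4, '2019-01-01 14:40:38');
""")

names = [
    row[0]
    for row in conn.execute(
        """
        SELECT u.name
        FROM users u
        JOIN (
            -- users that have a times record themselves
            SELECT u1.id AS user_id
            FROM times t JOIN users u1 ON u1.id = t.id
            UNION
            -- parents of users that have a times record
            SELECT u1.parent_id
            FROM times t JOIN users u1 ON u1.id = t.id
            WHERE u1.parent_id IS NOT NULL
        ) wanted ON wanted.user_id = u.id
        ORDER BY u.name
        """
    )
]
print(names)
```

Carl is left out here because he has no times record of his own, which matches the "don't need them" case in the question.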
I'm sure somebody has already asked this question somewhere, but I can't seem to find it.
Is it possible in MySQL to combine SUM or GROUP_CONCAT (aggregate functions) with DISTINCT?
Example: I have an order product which can have many options and many beneficiaries. How is it possible in one query (without using a subquery) to get the list of options, the sum of the option prices, and the list of beneficiaries?
I've constructed a sample data set:
CREATE TABLE `order`
(id INT NOT NULL PRIMARY KEY);
CREATE TABLE order_product
(id INT NOT NULL PRIMARY KEY
,order_id INT NOT NULL
);
CREATE TABLE order_product_options
(id INT NOT NULL PRIMARY KEY
,title VARCHAR(20) NOT NULL
,price INT NOT NULL
,order_product_id INT NOT NULL
);
CREATE TABLE order_product_beneficiary
(id INT NOT NULL PRIMARY KEY
,name VARCHAR(20) NOT NULL
,order_product_id INT NOT NULL
);
INSERT INTO `order` (`id`) VALUES (1);
INSERT INTO `order_product` (`id`, `order_id`) VALUES (1, 1);
INSERT INTO `order_product_options` (`id`, `title`, `price`, `order_product_id`)
VALUES (1,'option1', 1, 1), (2, 'option2', 2, 1), (3, 'option3', 3, 1), (4, 'option3', 3, 1);
INSERT INTO `order_product_beneficiary` (`id`, `name`, `order_product_id`)
VALUES (1,'mark', 1), (2, 'jack', 1), (3, 'jack', 1);
http://sqlfiddle.com/#!9/37e383/2
The result I would like to have is
id: 1
options: option1, option2, option3, option3
options price: 9
beneficiaries: mark, jack, jack
Is this possible in MySQL without using subqueries? (I know it is possible in Oracle.)
If it's possible, how would you do it?
Thanks
Based on your description, I think you just want DISTINCT in the GROUP_CONCAT(). However, that won't work because of the duplicates (as explained in a comment but not the question).
One solution is to include the ids in the results:
SELECT op.id,
GROUP_CONCAT(DISTINCT opo.title, '(', opo.id, ')' SEPARATOR ', ') AS options,
SUM(opo.price) AS options_price,
GROUP_CONCAT(DISTINCT opb.name, '(', opb.id, ')' SEPARATOR ', ') AS 'beneficiaries'
FROM order_product op INNER JOIN
order_product_options opo
ON opo.order_product_id = op.id INNER JOIN
order_product_beneficiary opb
ON opb.order_product_id = op.id
GROUP BY op.id;
This is not exactly your results, but it might suffice.
EDIT:
Oh, I see. You are joining along two different dimensions and getting a Cartesian product. The solution is to aggregate before joining:
SELECT op.id, opo.options, opo.options_price,
opb.beneficiaries
FROM order_product op INNER JOIN
(SELECT opo.order_product_id,
GROUP_CONCAT(opo.title SEPARATOR ', ') AS options,
SUM(opo.price) AS options_price
FROM order_product_options opo
GROUP BY opo.order_product_id
) opo
ON opo.order_product_id = op.id INNER JOIN
(SELECT opb.order_product_id,
GROUP_CONCAT(opb.name SEPARATOR ', ') AS beneficiaries
FROM order_product_beneficiary opb
GROUP BY opb.order_product_id
) opb
ON opb.order_product_id = op.id;
Here is the SQL Fiddle.
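To make the Cartesian-product effect concrete, here is a minimal sketch using the question's sample data, run against SQLite through Python's sqlite3 module (an assumption, standing in for MySQL). Joining both child tables directly repeats each option row once per beneficiary, so the naive SUM is inflated, while aggregating each child table before joining gives the expected total. The variable names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_product (id INTEGER PRIMARY KEY, order_id INTEGER NOT NULL);
    CREATE TABLE order_product_options (
        id INTEGER PRIMARY KEY, title TEXT NOT NULL,
        price INTEGER NOT NULL, order_product_id INTEGER NOT NULL);
    CREATE TABLE order_product_beneficiary (
        id INTEGER PRIMARY KEY, name TEXT NOT NULL,
        order_product_id INTEGER NOT NULL);
    INSERT INTO order_product VALUES (1, 1);
    INSERT INTO order_product_options VALUES
        (1, 'option1', 1, 1), (2, 'option2', 2, 1),
        (3, 'option3', 3, 1), (4, 'option3', 3, 1);
    INSERT INTO order_product_beneficiary VALUES
        (1, 'mark', 1), (2, 'jack', 1), (3, 'jack', 1);
""")

# Joining both child tables at once: 4 options x 3 beneficiaries = 12 rows.
naive_total = conn.execute("""
    SELECT SUM(opo.price)
    FROM order_product op
    JOIN order_product_options opo ON opo.order_product_id = op.id
    JOIN order_product_beneficiary opb ON opb.order_product_id = op.id
    GROUP BY op.id
""").fetchone()[0]

# Aggregating each child table first keeps one row per order product.
correct_total, beneficiary_count = conn.execute("""
    SELECT opo.options_price, opb.beneficiary_count
    FROM order_product op
    JOIN (SELECT order_product_id, SUM(price) AS options_price
          FROM order_product_options
          GROUP BY order_product_id) opo ON opo.order_product_id = op.id
    JOIN (SELECT order_product_id, COUNT(*) AS beneficiary_count
          FROM order_product_beneficiary
          GROUP BY order_product_id) opb ON opb.order_product_id = op.id
""").fetchone()

print(naive_total, correct_total, beneficiary_count)
```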
Something like this: do the price summation in an inner query, and then join it with the order_product table.
SELECT
op.id,
MAX(opo.title) AS 'options',
MAX(opo.price) AS 'options price',
GROUP_CONCAT(opb.name SEPARATOR ', ') AS 'beneficiaries'
FROM
order_product op
INNER JOIN (
SELECT order_product_id, SUM(price) price, GROUP_CONCAT(title SEPARATOR ', ') title
FROM order_product_options
GROUP BY order_product_id
) opo ON opo.order_product_id = op.id
INNER JOIN order_product_beneficiary opb ON opb.order_product_id = op.id
GROUP BY op.id
Demo
I'm trying to create a SQL query that will detect (possible) duplicate customers in my database:
I have two tables:
Customer with the columns: cid, firstname, lastname, zip. Note that cid is the unique customer id and primary key for this table.
IgnoreForDuplicateCustomer with the columns: cid1, cid2. Both columns are foreign keys that reference Customer(cid). This table is used to say that the customer with cid1 is not the same as the customer with cid2.
So for example, if I have
a Customer entry with cid = 1, firstname="foo", lastname="anonymous" and zip="11231"
and another Customer entry with cid=2, firstname="foo", lastname="anonymous" and zip="11231".
My SQL query should search for customers that have the same firstname, lastname and zip, and then detect that the customer with cid = 1 is the same as the customer with cid = 2.
However, it should be possible to say that customers cid = 1 and cid = 2 are not the same, by storing a new entry in the IgnoreForDuplicateCustomer table with cid1 = 1 and cid2 = 2.
Detecting the duplicate customers works well with this SQL query:
SELECT cid, firstname, lastname, zip, COUNT(*) AS NumOccurrences
FROM Customer
GROUP BY firstname, lastname, zip
HAVING ( COUNT(*) > 1 )
My problem is that I am not able to integrate the IgnoreForDuplicateCustomer table so that, as in my previous example, the customers with cid = 1 and cid = 2 are not reported as the same, since there is an entry/rule in the IgnoreForDuplicateCustomer table.
So I tried to extend my previous query by adding a WHERE clause:
SELECT cid, firstname, lastname, COUNT(*) AS NumOccurrences
FROM Customer
WHERE cid NOT IN (
SELECT cid1 FROM IgnoreForDuplicateCustomer WHERE cid2=cid
UNION
SELECT cid2 FROM IgnoreForDuplicateCustomer WHERE cid1=cid
)
GROUP BY firstname, lastname, zip
HAVING ( COUNT(*) > 1 )
Unfortunately this additional WHERE clause has absolutely no impact on my result.
Any suggestions?
Here you are:
Select a.*
From (
select c1.cid 'CID1', c2.cid 'CID2'
from Customer c1
join Customer c2 on c1.firstname=c2.firstname
and c1.lastname=c2.lastname and c1.zip=c2.zip
and c1.cid < c2.cid) a
Left Join (
Select cid1 'CID1', cid2 'CID2'
From ignoreforduplicatecustomer one
Union
Select cid2 'CID1', cid1 'CID2'
From ignoreforduplicatecustomer two) b on a.cid1 = b.cid1 and a.cid2 = b.cid2
where b.cid1 is null
This will get you the IDs of duplicate records from the customer table that are not listed in the ignoreforduplicatecustomer table.
Tested with:
CREATE TABLE IF NOT EXISTS `customer` (
`CID` int(11) NOT NULL AUTO_INCREMENT,
`Firstname` varchar(50) NOT NULL,
`Lastname` varchar(50) NOT NULL,
`ZIP` varchar(10) NOT NULL,
PRIMARY KEY (`CID`))
ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=100 ;
INSERT INTO `customer` (`CID`, `Firstname`, `Lastname`, `ZIP`) VALUES
(1, 'John', 'Smith', '1234'),
(2, 'John', 'Smith', '1234'),
(3, 'John', 'Smith', '1234'),
(4, 'Jane', 'Doe', '1234');
And:
CREATE TABLE IF NOT EXISTS `ignoreforduplicatecustomer` (
`CID1` int(11) NOT NULL,
`CID2` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO `ignoreforduplicatecustomer` (`CID1`, `CID2`) VALUES
(1, 2);
Results for my test setup are:
CID1 CID2
1 3
2 3
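The same anti-join can also be written with NOT EXISTS instead of LEFT JOIN ... IS NULL. The sketch below reproduces the test setup above against SQLite through Python's sqlite3 module (an assumption, standing in for MySQL); it checks the ignore table in both directions so that a rule stored as (2, 1) would be honoured as well as (1, 2).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        cid INTEGER PRIMARY KEY, firstname TEXT NOT NULL,
        lastname TEXT NOT NULL, zip TEXT NOT NULL);
    CREATE TABLE ignoreforduplicatecustomer (
        cid1 INTEGER NOT NULL, cid2 INTEGER NOT NULL);
    INSERT INTO customer VALUES
        (1, 'John', 'Smith', '1234'), (2, 'John', 'Smith', '1234'),
        (3, 'John', 'Smith', '1234'), (4, 'Jane', 'Doe', '1234');
    INSERT INTO ignoreforduplicatecustomer VALUES (1, 2);
""")

pairs = conn.execute("""
    SELECT c1.cid, c2.cid
    FROM customer c1
    JOIN customer c2
      ON c1.firstname = c2.firstname
     AND c1.lastname = c2.lastname
     AND c1.zip = c2.zip
     AND c1.cid < c2.cid            -- report each pair once
    WHERE NOT EXISTS (
        SELECT 1
        FROM ignoreforduplicatecustomer i
        WHERE (i.cid1 = c1.cid AND i.cid2 = c2.cid)
           OR (i.cid1 = c2.cid AND i.cid2 = c1.cid)
    )
    ORDER BY c1.cid, c2.cid
""").fetchall()
print(pairs)
```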
Edit as per TPete's comment (didn't try it):
SELECT
C1.cid, C1.firstname, C1.lastname
FROM
Customer C1,
Customer C2
WHERE
C1.cid < C2.cid AND
C1.firstname = C2.firstname AND
C1.lastname = C2.lastname AND
C1.zip = C2.zip AND
NOT EXISTS
(SELECT 1 FROM IgnoreForDuplicateCustomer I WHERE (I.cid1 = C1.cid AND I.cid2 = C2.cid) OR (I.cid1 = C2.cid AND I.cid2 = C1.cid));
Initially I thought that IgnoreForDuplicateCustomer was a field in the customer table.
Crazy, but I think it works :)
First I join the Customer table with itself on the names to get the duplicates.
Then I exclude the keys in the IgnoreForDuplicateCustomer table (the union is there because the first query returns both cid1, cid2 and cid2, cid1).
Each pair will appear twice in the result, but I think you can get the info you need:
select c1.cid, c2.cid
from Customer c1
join Customer c2 on c1.firstname=c2.firstname
and c1.lastname=c2.lastname and c1.zip=c2.zip
and c1.cid!=c2.cid
except
(
select cid1,cid2 from IgnoreForDuplicateCustomer
UNION
select cid2,cid1 from IgnoreForDuplicateCustomer
)
second shot:
select firstname,lastname,zip from Customer
group by firstname,lastname,zip
having (count(*)>1)
except
select c1.firstname, c1.lastname, c1.zip
from Customer c1 join IgnoreForDuplicateCustomer IG on c1.cid=ig.cid1 join Customer c2 on ig.cid2=c2.cid
third:
select firstname,lastname,zip from (
select firstname,lastname,zip from Customer
group by firstname,lastname,zip
having (count(*)>1)
) X
where firstname not in (
select c1.firstname
from Customer c1 join IgnoreForDuplicateCustomer IG on c1.cid=ig.cid1 join Customer c2 on ig.cid2=c2.cid
)