SQL Join For Exact Result - mysql

I have the following table in my database. Its purpose is to hold colour sets, e.g. [red + black], [blue + green + yellow], etc.
CREATE TABLE `df_productcolours`
(
`id` int(11) NOT NULL AUTO_INCREMENT,
`id_colourSet` int(11) NOT NULL,
`id_colour` int(11) NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `UNIQUE` (`id_colourSet`,`id_colour`),
KEY `idx_colourSet` (`id_colourSet`),
KEY `idx_colour_id` (`id_colour`),
CONSTRAINT `fk_colourid` FOREIGN KEY (`id_colour`) REFERENCES `df_lu_color` (`id`)
ON DELETE NO ACTION ON UPDATE NO ACTION
)
I made a stored proc that takes an array of id_colour integers as input, and returns a colour set id. What it's meant to do is return the set that contains those colours, and ONLY those colours that are provided as input. What it's actually doing is returning sets that contain the colours requested plus some others.
This is the code that I have so far:
SET @count = (SELECT COUNT(*) FROM tempTable_inputColours);
SELECT A.id_colourSet
FROM df_productcolours AS A
INNER JOIN tempTable_inputColours AS B
ON A.id_colour = B.id_colour
GROUP BY A.id_colourSet
HAVING COUNT(A.id_colour) = @count
AND COUNT(B.id_colour) = @count;
I have a feeling the issue may be with the way I'm joining, but I just can't seem to get it. Any help would be appreciated. Thanks.

You can try this:
SELECT A.id_colourSet
FROM df_productcolours AS A
INNER JOIN tempTable_inputColours AS B
ON A.id_colour = B.id_colour
WHERE A.id_colourSet IN (SELECT id_colour FROM tempTable_inputColours)
AND A.id_colour IN (SELECT id_colour FROM tempTable_inputColours)
EDIT
SELECT A.id_colourSet
FROM df_productcolours AS A
INNER JOIN tempTable_inputColours AS B
ON A.id_colour = B.id_colour
WHERE A.id_colourSet =(SELECT SUM(id_colour) FROM tempTable_inputColours)

I think I solved it myself after a few days of punishment. Here's the code:
SET clrCount = (SELECT COUNT(*) FROM _tmp_ColourSet);
-- The first half of the query does an inner join,
-- it will return all sets that have ANY of our requested colours.
-- But the HAVING condition will make it return sets that have AT LEAST all of the colours we are requesting.
-- So at this point we have all the super-sets, if you will.
-- Then, the second half of the query will restrict that further,
-- to only sets that have the same number of colours as we are requesting.
-- And voila :)
-- FIND ALL COLOUR SETS THAT HAVE ALL REQUESTED COLOURS
SET colourSetId = (SELECT A.id_colourSet
FROM df_productcolours AS A
INNER JOIN _tmp_ColourSet AS B
ON A.id_colour = B.id_colour
GROUP BY A.id_colourSet
HAVING COUNT(A.id_colour) = clrCount
-- FIND ALL COLOUR SETS THAT HAVE EXACTLY N COLOURS
AND A.id_colourSet IN (SELECT A.id_colourSet
FROM df_productcolours AS A
GROUP BY A.id_colourSet
HAVING COUNT(A.id_colour) = clrCount));
Hope it saves someone pulling their hair out.
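
The "exact set" idea above can be checked on a tiny data set. The following is a minimal sketch, not the poster's stored procedure: it uses Python's sqlite3 module as a stand-in for MySQL, the table tmp_input and the sample rows are made up for illustration, and it collapses the two HAVING conditions into one query by using a LEFT JOIN. It assumes (id_colourSet, id_colour) pairs are unique, as the UNIQUE key in the question guarantees, and that the input colours are unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE df_productcolours (id_colourSet INTEGER, id_colour INTEGER,
                                    UNIQUE (id_colourSet, id_colour));
    CREATE TABLE tmp_input (id_colour INTEGER PRIMARY KEY);  -- hypothetical input table
    -- set 1 = {1, 2}, set 2 = {1, 2, 3} (a superset), set 3 = {1} (a subset)
    INSERT INTO df_productcolours VALUES (1, 1), (1, 2), (2, 1), (2, 2), (2, 3), (3, 1);
    INSERT INTO tmp_input VALUES (1), (2);
""")

exact_match_sql = """
    SELECT A.id_colourSet
    FROM df_productcolours AS A
    LEFT JOIN tmp_input AS B ON A.id_colour = B.id_colour
    GROUP BY A.id_colourSet
    HAVING COUNT(*) = (SELECT COUNT(*) FROM tmp_input)  -- set size equals input size
       AND COUNT(B.id_colour) = COUNT(*)                -- every colour in the set was requested
"""
matching_sets = [row[0] for row in conn.execute(exact_match_sql)]
print(matching_sets)
```

Because the LEFT JOIN keeps colours that were not requested, COUNT(*) is the full size of each set, while COUNT(B.id_colour) counts only the requested ones; a set matches only when both equal the input size.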

Related

Improve Efficiency Of This Query: Update With Joins and Subqueries

I have a MySQL database with:
- a table of parcels that need to be sent to people (16,000 records here), with indexes on account_no and service
- a table of price rates (500,000 records); the rate depends on delivery area, customer price rate and type of service (e.g. next day), with indexes on area, price rate and service
- a table of the first part of the postcode (or ZIP code), which gives the area (3,000 records)
- a table of customer accounts, containing the price rate (1,600 records), with an index on price rate
The query finds the price it will cost to send each parcel and updates the customer price for that parcel, identified by its unique id.
It takes 70 seconds to update the 16,000 parcel records with the price to send each parcel:
UPDATE
tbl_parcel AS t20, (
SELECT
id, service, rate_group, area,
(
SELECT
rate
FROM
tbl_rates_all t4
WHERE
t4.service = t10.service
AND t4.area = t10.area
AND t4.rate_group = t10.rate_group
)
AS price
FROM
(
SELECT
id,
t1.service,
rate_group,
area
FROM
tbl_parcel t1
JOIN
tbl_account t2
ON t1.account_no = t2.account_no
JOIN
tbl_pr_postcode t3
ON LEFT(full_pcode, locate(' ', full_pcode) - 1) = t3.postcode
) t10
) AS src
SET
t20.customer_price = src.price
WHERE
t20.id = src.id
Ultimately, it is this part that is killing the efficiency:
FROM
tbl_rates_all t4
WHERE
t4.service = t10.service
AND t4.area = t10.area
AND t4.rate_group = t10.rate_group
I could have separate rates tables for each price rate, as in the original design, so that a variable would select e.g. tbl_rates001, which might have only 3,000 records instead of 500,000. The problem with doing this in MySQL is that a table name cannot be built on the fly without a prepared statement, so I thought this method was no good. It is a shame you cannot hold the price rate number in a user variable and append it to the rates table name.
I'm quite new to databases and queries, so if something is screaming at you, any input would be appreciated.
Regards
ADDITION: SCHEMA, AS REQUESTED
CREATE TABLE `tbl_x_rate_all` (
`id` bigint(20) NOT NULL,
`service` varchar(4) NOT NULL,
`chargetype` char(1) NOT NULL,
`area` smallint(6) NOT NULL,
`rate` float(7,2) NOT NULL,
`rate_group` smallint(6) NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1;
ALTER TABLE `tbl_x_rate_all` ADD PRIMARY KEY (`id`), ADD KEY `rate_group` (`rate_group`), ADD KEY `area` (`area`), ADD KEY `service` (`service`),
ADD KEY `chargetype` (`chargetype`);
Assuming id, rate_group and area come from t1 inside t10, your query is probably a slower version of the one below:
UPDATE
tbl_parcel AS t20
INNER JOIN (
SELECT
t1.id,
t4.rate as price
FROM tbl_parcel t1
JOIN tbl_account t2 ON t1.account_no = t2.account_no
JOIN tbl_pr_postcode t3 ON LEFT(full_pcode, locate(' ', full_pcode) - 1) = t3.postcode
LEFT JOIN tbl_rates_all t4 ON t1.service = t4.service AND t1.area = t4.area
AND t1.rate_group = t4.rate_group
) src ON t20.id = src.id
SET
t20.customer_price = src.price
WHERE
t20.id = src.id
I am guessing you can go further and drop the subquery, which tends to be cumbersome:
UPDATE
tbl_parcel AS t20
INNER JOIN tbl_parcel t1 ON t20.id = t1.id
INNER JOIN tbl_account t2 ON t1.account_no = t2.account_no
INNER JOIN tbl_pr_postcode t3 ON LEFT(full_pcode, locate(' ', full_pcode) - 1) = t3.postcode
LEFT JOIN tbl_rates_all t4 ON t1.service = t4.service AND t1.area = t4.area
AND t1.rate_group = t4.rate_group
SET
t20.customer_price = t4.rate
WHERE
t20.id = t1.id
-- can also replace with
-- TRUE
-- or lose it altogether
;
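
To see why the correlated subquery can be replaced by a LEFT JOIN, here is a small sketch that runs both forms and compares the results. It uses Python's sqlite3 module rather than MySQL, and the simplified tables parcel and rates plus their sample rows are hypothetical. It assumes each (service, area, rate_group) combination appears at most once in the rates table; with duplicates the join would return extra rows and the scalar subquery would fail in MySQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parcel (id INTEGER PRIMARY KEY, service TEXT, area INTEGER, rate_group INTEGER);
    CREATE TABLE rates (service TEXT, area INTEGER, rate_group INTEGER, rate REAL,
                        UNIQUE (service, area, rate_group));
    INSERT INTO parcel VALUES (1, 'ND', 10, 1), (2, 'ND', 20, 1), (3, '2D', 10, 2);
    INSERT INTO rates VALUES ('ND', 10, 1, 4.50), ('ND', 20, 1, 6.25);
""")

# Form 1: one scalar subquery evaluated per parcel row.
correlated = conn.execute("""
    SELECT p.id,
           (SELECT r.rate FROM rates r
             WHERE r.service = p.service AND r.area = p.area
               AND r.rate_group = p.rate_group) AS price
    FROM parcel p
    ORDER BY p.id
""").fetchall()

# Form 2: the same lookup expressed as a LEFT JOIN.
joined = conn.execute("""
    SELECT p.id, r.rate AS price
    FROM parcel p
    LEFT JOIN rates r ON r.service = p.service AND r.area = p.area
                     AND r.rate_group = p.rate_group
    ORDER BY p.id
""").fetchall()

print(correlated == joined)
```

Parcel 3 has no matching rate, so both forms give it a NULL price; an INNER JOIN would drop that parcel instead.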
You could try adding indexes on the join columns of t1 and t4 if you have reason to believe that the join is the bottleneck:
create index tbl_rates_all_service_area_rate_group_index
on tbl_rates_all (service, area, rate_group);
create index tbl_parcel_service_area_rate_group_index
on tbl_parcel (service, area, rate_group);
@Rich --
Nothing's jumping out. You might be able to get some marginal improvements by building some temp tables, handling the SET outside your main query, and replacing the nested queries with joins (OUTER APPLY, the usual tool for this in SQL Server, is not available in MySQL).
If you're fairly new to MySQL and databases in general, the EXPLAIN statement can be very useful for optimization:
https://dev.mysql.com/doc/refman/5.5/en/using-explain.html
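
As a rough illustration of what a query plan tells you, the sketch below inspects a rate lookup before and after a composite index is created. It is an analogy only: it uses SQLite's EXPLAIN QUERY PLAN through Python's sqlite3 module, whose output format differs from MySQL's EXPLAIN, and the index name idx_rates_lookup and the literal values in the query are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl_rates_all (service TEXT, area INTEGER, rate_group INTEGER, rate REAL)"
)

lookup = ("SELECT rate FROM tbl_rates_all "
          "WHERE service = 'ND' AND area = 10 AND rate_group = 1")

def plan(sql):
    # The last column of each EXPLAIN QUERY PLAN row is a human-readable description.
    return " | ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

plan_before = plan(lookup)  # no usable index yet, so a full table scan is expected
conn.execute("CREATE INDEX idx_rates_lookup ON tbl_rates_all (service, area, rate_group)")
plan_after = plan(lookup)   # the composite index can now serve all three equality tests

print(plan_before)
print(plan_after)
```

In MySQL the equivalent check would be to prefix the SELECT (or, in recent versions, the UPDATE) with EXPLAIN and look at the key column for each table.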

Getting data from multiple tables in one Crystal Report

I have two tables, tbl_issued and tbl_transaction.
tbl_issued has the columns ItemID, Item, Serial, Quantity and Size, while tbl_transaction has the columns Released, Received, Approved and Department.
My problem is that I want to get the columns of both in one query. This is my MySQL query:
SELECT `ItemID`,`Item`,`Serial`,`Quantity`,`Size`,`Class`,`Unit`,(SELECT `Released` FROM `tbl_transaction` WHERE `TransactionID` = 12458952) AS `Released`,
(SELECT `Received` FROM `tbl_transaction` WHERE `TransactionID` = 12458952) AS `Received`,
(SELECT `Approved` FROM `tbl_transaction` WHERE `TransactionID` = 12458952) AS `Approved`,
(SELECT `Department` FROM `tbl_transaction` WHERE `TransactionID` = 12458952) AS `Department`
FROM `tbl_issued` WHERE `TransactionID` = 12458952
but running this from VB.NET does not produce any output.
Any ideas how I can translate this query to VB.NET? Thanks in advance for your help!
I don't know exactly what you are trying to do, but if you want to simplify it, have you tried an INNER JOIN? It looks like this:
SELECT ItemID, Item, Serial, Quantity, Size, Class, Unit, Released, Received,
Approved, Department FROM tbl_issued a INNER JOIN tbl_transaction b ON
a.TransactionID = b.TransactionID WHERE a.TransactionID = 12458952
I assume that both tables have a TransactionID column, based on your query.
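
A quick way to convince yourself that the join returns one row per issued item with the transaction columns attached is to run it on a few rows. This is only a sketch: it uses Python's sqlite3 module instead of MySQL and VB.NET, keeps just a handful of the columns, and the sample values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_issued (TransactionID INTEGER, ItemID INTEGER, Item TEXT);
    CREATE TABLE tbl_transaction (TransactionID INTEGER PRIMARY KEY,
                                  Released TEXT, Department TEXT);
    INSERT INTO tbl_issued VALUES (12458952, 1, 'Cable'), (12458952, 2, 'Router'),
                                  (7, 3, 'Switch');
    INSERT INTO tbl_transaction VALUES (12458952, 'Alice', 'IT'), (7, 'Bob', 'HR');
""")

# The transaction id is passed as a bound parameter rather than pasted into the SQL text.
rows = conn.execute("""
    SELECT a.ItemID, a.Item, b.Released, b.Department
    FROM tbl_issued AS a
    INNER JOIN tbl_transaction AS b ON a.TransactionID = b.TransactionID
    WHERE a.TransactionID = ?
    ORDER BY a.ItemID
""", (12458952,)).fetchall()

for row in rows:
    print(row)
```

Each of the two items issued under transaction 12458952 comes back with the same Released and Department values, which is what the four scalar subqueries in the question were computing one column at a time.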

Stuck with a MySQL query

I have a table schema as
create table Location(
id int primary key,
city varchar(255),
state varchar(100),
country varchar(255)
);
create table Person(
id int primary key,
name varchar(100)
);
create table Photographer(
id int primary key references Person(id) on update cascade on delete cascade,
livesIn int not null references Location(id) on update cascade on delete no action
);
create table Specialty(
photographer int references Photographer(id) on update cascade on delete cascade,
type enum('portrait','landscape','sport'),
primary key(photographer, type)
);
create table Photo(
id int primary key,
takenAt timestamp not null,
takenBy int references Photographer(id) on update cascade on delete no action,
photographedAt int references Location(id) on update cascade on delete no action
);
create table Appearance(
shows int references Person(id) on update cascade on delete cascade,
isShownIn int references Photo(id) on update cascade on delete cascade,
primary key(shows, isShownIn)
);
I am stuck at two queries :
1) The photos such that the photo only shows photographers that live in the same location. List each photo once. That is, photos must have persons that are photographers, and they all need to live in the same place.
2) The locations that have the property that every photo in the location was taken by a photographer who is not shown in any photo in Massachusetts? For each location show only the city, and show each location only once.
My tries :
1)
SELECT ph.id, ph.takenAt, ph.takenBy, ph.photographedAt FROM
(SELECT * FROM Photo p, Appearance ap WHERE p.id = ap.isShownIn
HAVING ap.shows IN (SELECT person.id FROM Person,Photographer WHERE person.id =
photographer.id)) ph
WHERE ph.photographedAt = (SELECT location.id FROM location WHERE location.id =
(SELECT livesIn FROM Photographer WHERE id = ph.takenBy))
2)
select distinct city from location where location.id in (
select photographedAt from photo, (select * from appearance where appearance.shows in
(select photographer.id from photographer)) ph
where photo.id = ph.isShownIn )
and location.state <> 'Massachusetts'
Can anyone help in creating these queries ??
Your queries are both of the "list individual items that have properties X and Y, where X and Y are in different tables" variety.
These types of questions are commonly solved using correlated sub-queries with EXISTS and NOT EXISTS.
Using EXISTS takes care of the "show each item only once" part. Otherwise you would need to use grouping in conjunction with complex joins, and this can get messy very quickly.
Question 1 requires:
[...] photos must have persons that are photographers, and they all need to live in the same place.
Note that this definition doesn't say "do not show photos if they contain other people, too". If that's what you really meant, it's up to you to draw conclusions from the SQL below and to write better definitions next time. ;)
SELECT
*
FROM
Photo p
WHERE
EXISTS (
-- ...that has at least one appearance of a photographer
SELECT
1
FROM
Appearance a
INNER JOIN Photographer r ON r.id = a.shows
INNER JOIN Location l ON l.id = r.livesIn
WHERE
a.isShownIn = p.id
-- AND l.id = <optional location filter would go here>
AND NOT EXISTS (
-- ...that does not have an appearance of a photographer from
-- some place else
SELECT
1
FROM
Appearance a1
INNER JOIN Photographer r1 ON r1.id = a1.shows
INNER JOIN Location l1 ON l1.id = r1.livesIn
WHERE
a1.isShownIn = p.Id
AND l1.id <> l.id
)
)
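
The nested EXISTS / NOT EXISTS pattern can be tried on a handful of rows. The sketch below is an illustration under assumptions, not a tested MySQL solution: it runs on SQLite via Python's sqlite3 module, the sample ids are invented, and it compares r.livesIn directly instead of joining Location, which is equivalent here because livesIn holds the location id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Photographer (id INTEGER PRIMARY KEY, livesIn INTEGER NOT NULL);
    CREATE TABLE Photo (id INTEGER PRIMARY KEY);
    CREATE TABLE Appearance (shows INTEGER, isShownIn INTEGER, PRIMARY KEY (shows, isShownIn));
    -- photographers 1 and 2 live in location 10, photographer 3 in location 20;
    -- person 4 is not a photographer
    INSERT INTO Photographer VALUES (1, 10), (2, 10), (3, 20);
    INSERT INTO Photo VALUES (100), (101), (102), (103);
    INSERT INTO Appearance VALUES
        (1, 100), (2, 100),   -- photo 100: two photographers, same location
        (1, 101), (3, 101),   -- photo 101: two photographers, different locations
        (1, 102), (4, 102),   -- photo 102: one photographer plus a non-photographer
        (4, 103);             -- photo 103: no photographer at all
""")

photo_ids = [row[0] for row in conn.execute("""
    SELECT p.id
    FROM Photo p
    WHERE EXISTS (
        -- at least one photographer appears in the photo ...
        SELECT 1
        FROM Appearance a
        JOIN Photographer r ON r.id = a.shows
        WHERE a.isShownIn = p.id
          AND NOT EXISTS (
              -- ... and no photographer from another location appears in it
              SELECT 1
              FROM Appearance a1
              JOIN Photographer r1 ON r1.id = a1.shows
              WHERE a1.isShownIn = p.id
                AND r1.livesIn <> r.livesIn
          )
    )
    ORDER BY p.id
""")]
print(photo_ids)
```

Photo 102 is returned even though it also shows a non-photographer, which is exactly the reading of the requirement discussed above; excluding such photos would need one more NOT EXISTS on appearances without a matching Photographer row.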
The second question reads
[...] locations that have the property that every photo in the location was taken by a photographer who is not shown in any photo in Massachusetts? For each location show only the city, and show each location only once.
The corresponding SQL would look like this:
SELECT
city
FROM
Location l
WHERE
NOT EXISTS (
-- ...a photo at this location taken by a photographer who makes
-- an appearance in another photo which was taken in Massachusetts
SELECT
1
FROM
Photo p
INNER JOIN Photographer r ON r.id = p.takenBy
INNER JOIN Appearance a ON a.shows = r.id
INNER JOIN Photo p1 ON p1.id = a.isShownIn
WHERE
p.photographedAt = l.Id
AND p1.photographedAt = <the location id of Massachusetts>
)
My attempt for Query1. Photos that show photographers that live in the same city.
select ph.id, ph.takenAt, ph.takenBy, ph.photographedAt from Photo as ph
join Appearance as a on ph.id = a.isShownIn
join Photographer as p on a.shows = p.id where p.livesIn in
(select p1.id from Photographer as p1, Photographer as p2
where p1.id != p2.id and p1.livesIn = p2.livesIn);
My attempt for Query2. Take references of people shown in a photo taken in Massachusetts, then list all the pictures not taken by those people.
select * from Photo where takenBy not in
(select a.shows from Photo as ph
join Location as l on ph.photographedAt = l.id
join Appearance as a on a.isShownIn = ph.id
where l.state = 'Massachusetts');
Hope that helps.

Sum of long vectors in SQL

I know it is easy to compute a sparse dot product in SQL, but what is the best way to do a sum (for very long vectors)?
A join is not enough because if a coordinate is filled in one vector but not in the other, it will be ignored.
Thus, I computed the sum with a PHP loop... and that was a pretty stupid idea.
I'm currently thinking of filling the missing 0's in order to prepare an inner join, but is there a shortcut (like an outer join converting NULL to 0)?
Edit. Here is the structure of my table of vectors:
CREATE TABLE `eigaki_vectors` (
`name` varchar(2) COLLATE utf8_unicode_ci NOT NULL,
`i1` int(10) NOT NULL,
`i2` int(10) NOT NULL,
`value` double NOT NULL,
UNIQUE KEY `key` (`name`,`i1`,`i2`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
In this particular case, a vector has composite indices, v_{i_1, i_2}, but this has nothing to do with the problem.
I expected to do something like (thanks xQbert):
SELECT v1.i1, v1.i2, isNull(v1.value, 0) + isNull(v2.value, 0)
FROM eigaki_vectors v1 FULL OUTER JOIN eigaki_vectors v2
ON v1.i1 = v2.i1 AND v1.i2 = v2.i2
AND v1.name = 'a' AND v2.name = 'b'
to add vectors a and b. But FULL OUTER JOIN doesn't exist on MySQL, and I think I'm clumsy with the name column. Any ideas?
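coalesce(Field, OtherField, AnotherField, 0)
COALESCE is basically an endless if-then chain: it picks the first non-null value from its argument list.
IFNULL does the same thing but for only two values (ISNULL(Field, 0) is the SQL Server spelling; in MySQL, ISNULL takes a single argument and returns a boolean):
IFNULL(Field, 0)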
I managed to get something, thanks to the snippet provided in MySQL: Union of a Left Join with a Right Join:
SELECT IFNULL(v1.value, 0) + IFNULL(v2.value, 0) FROM
(
SELECT i1, i2 FROM eigaki_vectors WHERE name = 'a'
UNION
SELECT i1, i2 FROM eigaki_vectors WHERE name = 'b'
) indices
LEFT OUTER JOIN eigaki_vectors v1 ON indices.i1 = v1.i1 AND indices.i2 = v1.i2 AND v1.name = 'a'
LEFT OUTER JOIN eigaki_vectors v2 ON indices.i1 = v2.i1 AND indices.i2 = v2.i2 AND v2.name = 'b'
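
Here is the same query run on two small sparse vectors, as a sketch on SQLite through Python's sqlite3 module (IFNULL behaves the same way there); the sample coordinates are invented. For comparison it also computes the sum with a plain GROUP BY over both names, an alternative that does not appear in the thread and that assumes the UNIQUE key on (name, i1, i2) holds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE eigaki_vectors (name TEXT, i1 INTEGER, i2 INTEGER, value REAL,
                                 UNIQUE (name, i1, i2));
    -- a has entries at (1,1) and (1,2); b has entries at (1,2) and (2,1)
    INSERT INTO eigaki_vectors VALUES
        ('a', 1, 1, 1.0), ('a', 1, 2, 2.0),
        ('b', 1, 2, 10.0), ('b', 2, 1, 20.0);
""")

# Union of the index pairs, then one LEFT JOIN per vector; missing entries count as 0.
vector_sum = conn.execute("""
    SELECT idx.i1, idx.i2, IFNULL(v1.value, 0) + IFNULL(v2.value, 0) AS total
    FROM (SELECT i1, i2 FROM eigaki_vectors WHERE name = 'a'
          UNION
          SELECT i1, i2 FROM eigaki_vectors WHERE name = 'b') AS idx
    LEFT JOIN eigaki_vectors v1 ON v1.i1 = idx.i1 AND v1.i2 = idx.i2 AND v1.name = 'a'
    LEFT JOIN eigaki_vectors v2 ON v2.i1 = idx.i1 AND v2.i2 = idx.i2 AND v2.name = 'b'
    ORDER BY idx.i1, idx.i2
""").fetchall()

# Alternative (not from the thread): group all rows of both vectors by coordinate.
grouped_sum = conn.execute("""
    SELECT i1, i2, SUM(value) AS total
    FROM eigaki_vectors
    WHERE name IN ('a', 'b')
    GROUP BY i1, i2
    ORDER BY i1, i2
""").fetchall()

print(vector_sum)
print(grouped_sum)
```

The GROUP BY form only works for addition-like operations where a missing coordinate contributes nothing; the UNION plus LEFT JOIN form keeps both values side by side, so it also covers operations such as a difference.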

Mysql query to check if all sub_items of a combo_item are active

I am trying to write a query that looks through all combo_items and only returns the ones where all sub_items that it references have Active=1.
I think I should be able to count how many sub_items there are in a combo_item total and then compare it to how many are Active, but I am failing pretty hard at figuring out how to do that...
My table definitions:
CREATE TABLE `combo_items` (
`c_id` int(11) NOT NULL,
`Label` varchar(20) NOT NULL,
PRIMARY KEY (`c_id`)
)
CREATE TABLE `sub_items` (
`s_id` int(11) NOT NULL,
`Label` varchar(20) NOT NULL,
`Active` int(1) NOT NULL,
PRIMARY KEY (`s_id`)
)
CREATE TABLE `combo_refs` (
`r_id` int(11) NOT NULL,
`c_id` int(11) NOT NULL,
`s_id` int(11) NOT NULL,
PRIMARY KEY (`r_id`)
)
So for each combo_item, there are at least two rows in the combo_refs table linking to its sub_items. My brain is about to make bigbadaboom :(
I would just join the three tables as usual and then, per combo item, sum up the total number of sub-items and the number of active sub-items:
SELECT ci.c_id, ci.Label, SUM(1) AS total_sub_items, SUM(si.Active) AS active_sub_items
FROM combo_items AS ci
INNER JOIN combo_refs AS cr ON cr.c_id = ci.c_id
INNER JOIN sub_items AS si ON si.s_id = cr.s_id
GROUP BY ci.c_id
Of course, instead of using SUM(1) you could just say COUNT(ci.c_id), but I wanted an analog of SUM(si.Active).
The approach proposed assumes Active to be 1 (active) or 0 (not active).
To get only those combo items whose sub-items are all active, compare the two sums in a HAVING clause. (Adding WHERE si.Active = 1 instead would not work: it only discards the inactive rows, so any combo item with at least one active sub-item would still be returned.)
SELECT ci.c_id, ci.Label
FROM combo_items AS ci
INNER JOIN combo_refs AS cr ON cr.c_id = ci.c_id
INNER JOIN sub_items AS si ON si.s_id = cr.s_id
GROUP BY ci.c_id
HAVING SUM(si.Active) = COUNT(*)
By the way, the INNER JOIN ensures that each returned combo item has at least one sub-item.
(I have not tested it.)
See this answer:
MySQL: Selecting foreign keys with fields matching all the same fields of another table
Select ...
From combo_items As C
Where Exists (
Select 1
From sub_items As S1
Join combo_refs As CR1
On CR1.s_id = S1.s_id
Where CR1.c_id = C.c_id
)
And Not Exists (
Select 1
From sub_items As S2
Join combo_refs As CR2
On CR2.s_id = S2.s_id
Where CR2.c_id = C.c_id
And S2.Active = 0
)
The first subquery ensures that at least one sub_item exists. The second ensures that none of the sub_items are inactive.
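
The two approaches in this thread, counting versus EXISTS / NOT EXISTS, can be compared on a small data set. The sketch below uses SQLite through Python's sqlite3 module in place of MySQL, with invented sample rows, and assumes Active is always 0 or 1. For contrast it also runs a version that filters only with WHERE si.Active = 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE combo_items (c_id INTEGER PRIMARY KEY, Label TEXT);
    CREATE TABLE sub_items (s_id INTEGER PRIMARY KEY, Label TEXT, Active INTEGER);
    CREATE TABLE combo_refs (r_id INTEGER PRIMARY KEY, c_id INTEGER, s_id INTEGER);
    INSERT INTO combo_items VALUES (1, 'all active'), (2, 'one inactive'), (3, 'no sub-items');
    INSERT INTO sub_items VALUES (1, 'x', 1), (2, 'y', 1), (3, 'z', 0);
    INSERT INTO combo_refs VALUES (1, 1, 1), (2, 1, 2), (3, 2, 2), (4, 2, 3);
""")

def ids(sql):
    return [row[0] for row in conn.execute(sql)]

# Counting: the number of active sub-items must equal the number of sub-items.
by_count = ids("""
    SELECT ci.c_id
    FROM combo_items ci
    JOIN combo_refs cr ON cr.c_id = ci.c_id
    JOIN sub_items si ON si.s_id = cr.s_id
    GROUP BY ci.c_id
    HAVING COUNT(*) = SUM(si.Active)
    ORDER BY ci.c_id
""")

# EXISTS / NOT EXISTS: at least one sub-item, and none of them inactive.
by_exists = ids("""
    SELECT c.c_id
    FROM combo_items c
    WHERE EXISTS (SELECT 1 FROM combo_refs cr WHERE cr.c_id = c.c_id)
      AND NOT EXISTS (SELECT 1
                      FROM combo_refs cr
                      JOIN sub_items s ON s.s_id = cr.s_id
                      WHERE cr.c_id = c.c_id AND s.Active = 0)
    ORDER BY c.c_id
""")

# Filtering with WHERE alone keeps any combo that has at least one active sub-item.
where_only = ids("""
    SELECT ci.c_id
    FROM combo_items ci
    JOIN combo_refs cr ON cr.c_id = ci.c_id
    JOIN sub_items si ON si.s_id = cr.s_id
    WHERE si.Active = 1
    GROUP BY ci.c_id
    ORDER BY ci.c_id
""")

print(by_count, by_exists, where_only)
```

Combo item 2 has one inactive sub-item, so only the WHERE-only variant returns it; combo item 3 has no sub-items and is excluded by all three queries.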