inner join on null condition : bug or feature? - mysql

Here is the test setup:
CREATE TABLE A (
id bigint NOT NULL AUTO_INCREMENT,
value bigint,
PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
INSERT INTO A (id, value) VALUES (1, 22);
INSERT INTO A (id, value) VALUES (2, 25);
INSERT INTO A (id, value) VALUES (3, 25);
CREATE TABLE B (
id bigint NOT NULL AUTO_INCREMENT,
value bigint,
PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Important note: table B does not contain any rows.
Test query:
select * from A inner join B on (A.value=25 OR B.value=null);
Surprise: the query returns an empty result set.
If table B contains a row, for example:
INSERT INTO B (id, value) VALUES (3, 66);
Then the same query will return 2 rows:
id value id value
-- ----- -- -----
2 25 3 66
3 25 3 66
Is this a bug or a feature of MySQL?

By definition, INNER JOIN returns matching records only. If a table does not have any records, then there cannot be any matching record. This is standard behaviour across all RDBMSs. Use a LEFT or RIGHT join instead of an inner join if you want to return rows from one table regardless of matching rows in the other.
select * from A left join B on ... where A.value=25 ;
Moreover, a comparison of the form anything = NULL never evaluates to true: its result is NULL (unknown), because NULL is not equal to anything, not even to another NULL value. If you want to test whether a field is NULL, use the fieldname IS NULL expression.
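The behaviour described above can be reproduced with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made so the example runs without a server), and the variable names are illustrative. SQLite applies the same rules here for inner joins against an empty table and for comparisons with NULL.

```python
import sqlite3

# SQLite stands in for MySQL here (an assumption); both follow the same
# rules for inner joins against an empty table and for "= NULL".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (id INTEGER PRIMARY KEY, value INTEGER);
    CREATE TABLE B (id INTEGER PRIMARY KEY, value INTEGER);
    INSERT INTO A (id, value) VALUES (1, 22), (2, 25), (3, 25);
""")

QUERY = """
    SELECT A.id, A.value, B.id, B.value
    FROM A INNER JOIN B ON (A.value = 25 OR B.value = NULL)
    ORDER BY A.id
"""

# B is empty: there is no B row to pair with, so nothing is returned.
rows_when_b_empty = conn.execute(QUERY).fetchall()

# One row in B: every A row with value = 25 pairs with it.
conn.execute("INSERT INTO B (id, value) VALUES (3, 66)")
rows_when_b_has_row = conn.execute(QUERY).fetchall()

# "= NULL" yields NULL (None in Python), while IS NULL yields 1 (true).
null_equals_null, null_is_null = conn.execute(
    "SELECT NULL = NULL, NULL IS NULL"
).fetchone()

print(rows_when_b_empty)
print(rows_when_b_has_row)
print(null_equals_null, null_is_null)
```

The empty result is therefore expected: the ON condition is only ever evaluated for pairs of rows, and with B empty there are no pairs to evaluate.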


SQL Syntax for checking duplicates prior to update (not DUPLICATE KEY UPDATE)

I have a syntax question about attempting an UPDATE if two non-key fields match, and an INSERT if they do not.
Let me start by saying I have a working query that involves a SELECT with an ON DUPLICATE KEY UPDATE. Now I am just curious, can it be done differently?
Out of sheer curiosity, I am trying this in a manner that will not require the primary key. This really is just an experiment to see if it can be done.
What I want is like ON DUPLICATE KEY UPDATE -- However:
let's pretend I don't know the key , and
let's pretend I can't get the key with a SELECT
Here is the data structure I have:
+----+---------------+---------------+---------------+
| id | contractor_id | email_type_id | email_address |
+----+---------------+---------------+---------------+
Table creation:
CREATE TABLE `email_list` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`contractor_id` int(11) NOT NULL,
`email_type_id` int(11) NOT NULL,
`email_address` varchar(45) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`id`),
UNIQUE KEY `id_UNIQUE` (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=13 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
Now what I am trying to do without a select and without ON DUPLICATE KEY UPDATE is -- If contractor_id and email_type_id are matched -- UPDATE the email_address -- else INSERT.
I have tried this (I know I am breaking my own rule of no SELECT):
IF EXISTS (SELECT * FROM email_list WHERE contractor_id = 1166)
UPDATE email_list SET (email_address='herky_jerky@snailmail.com')
WHERE contractor_id = 1166 AND email_type_id = 4
ELSE
INSERT INTO email_list VALUES (
contractor_id = 1166,
email_type_id = 4,
email_address = 'herky_jerky@snailmail.com');
I understand why this doesn't work .. I just don't know what the fix for it is -- It feels a little clunky too using an IF - ELSE statement. Also I don't want to use a SELECT -- So then I thought about using just an IF like:
UPDATE email_list SET email_address = 'herky@jerky.com'
WHERE contractor_id = 1166 AND email_type_id = 4
IF @@ROWCOUNT=0
INSERT INTO email_list VALUES (
contractor_id = 1166,
email_type_id = 4,
email_address = 'herky@jerky.com');
But I don't understand why that one doesn't work. This is just an exercise to see how creative one can be with this type of query. I think both of my ideas are doable -- Can anyone find a fix for either query to make it work?
I'd also love to see other, more creative, ways of attempting what I am asking as well!
I'd implement this using an UPDATE followed by a test of ROW_COUNT() and if no rows were updated, then INSERT.
drop table if exists t;
create table t (id int, x int, y int, str varchar(255));
insert into t (id, x, y, str) values (1, 2, 3, 'foo');
select * from t;
update t set str = 'bar'
where x = 2 and y = 3;
insert into t (id, x, y, str)
select 1, 2, 3, 'inserted'
from dual
where row_count() = 0;
select * from t;
update t set str = 'baz'
where x = 20 and y = 30;
insert into t (id, x, y, str)
select 10, 20, 30, 'baz'
from dual
where row_count() = 0;
select * from t;
drop table t;
You can see it in action here: https://rextester.com/FRFTE79537
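The idea here is that you do the UPDATE first, followed by an INSERT ... SELECT where the SELECT only returns a row if ROW_COUNT() = 0 is true, and that's only true if the UPDATE didn't change any rows. Note that by default MySQL counts rows actually changed, so an UPDATE that matches a row but sets it to the value it already has also reports 0.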
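The same update-then-insert pattern can also be driven from client code. The following is a minimal sketch, assuming Python's sqlite3 module as a stand-in for MySQL; the helper name upsert_by_pair and its parameters are hypothetical, and cursor.rowcount plays the role that ROW_COUNT() plays in the SQL version.

```python
import sqlite3

def upsert_by_pair(conn, x, y, text, new_id):
    """UPDATE the row matching (x, y); INSERT one only if nothing was updated."""
    cur = conn.execute(
        "UPDATE t SET str = ? WHERE x = ? AND y = ?", (text, x, y)
    )
    # cur.rowcount is the number of rows the UPDATE touched.
    if cur.rowcount == 0:
        conn.execute(
            "INSERT INTO t (id, x, y, str) VALUES (?, ?, ?, ?)",
            (new_id, x, y, text),
        )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, x INTEGER, y INTEGER, str TEXT)")
conn.execute("INSERT INTO t (id, x, y, str) VALUES (1, 2, 3, 'foo')")

upsert_by_pair(conn, 2, 3, "bar", new_id=1)     # matches: row is updated
upsert_by_pair(conn, 20, 30, "baz", new_id=10)  # no match: row is inserted

rows = conn.execute("SELECT id, x, y, str FROM t ORDER BY id").fetchall()
print(rows)
```

Without a unique constraint on (x, y), two sessions running this at the same time could both see zero updated rows and both insert, so the pattern is best kept to experiments like this one or wrapped in suitable locking.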

mysql/mariadb - LEFT JOIN aggregate not returning all values

First up, I don't know how to word this question so if there's any better terminology or phrasing, feel free to edit.
So here's my schema: http://sqlfiddle.com/#!9/ca46c1/2
CREATE TABLE map
(
id INT UNSIGNED PRIMARY KEY NOT NULL AUTO_INCREMENT
);
CREATE TABLE vote_map
(
id INT UNSIGNED PRIMARY KEY NOT NULL AUTO_INCREMENT,
user_id INT UNSIGNED NOT NULL,
map_id INT UNSIGNED NOT NULL,
score ENUM("-1", "0", "1")
);
CREATE VIEW view_vote_map_rank AS
SELECT
map.id AS map_id,
COALESCE( SUM(CAST(CAST(score AS char) AS SIGNED)), 0) AS score
FROM vote_map
RIGHT JOIN
map ON map.id = vote_map.map_id
GROUP BY map_id;
INSERT INTO map (id) VALUES (1);
INSERT INTO map (id) VALUES (2);
INSERT INTO map (id) VALUES (3);
INSERT INTO map (id) VALUES (4);
INSERT INTO map (id) VALUES (5);
INSERT INTO vote_map (user_id, map_id, score) VALUES (1, 1, '1');
INSERT INTO vote_map (user_id, map_id, score) VALUES (2, 2, '1');
SELECT * FROM map;
SELECT * FROM view_vote_map_rank;
The results I'm getting are
map_id score
3 0
1 1
2 1
However this is incomplete. I was also expecting id 4 and 5 there as well, with a score of 0 too. I'm not sure why it's stopping after the first 0. What am I missing?
In the view, you could use the following:
SELECT
a.id,
sum(IF(b.score IS NULL,0,b.score)) as `score`
FROM map a
LEFT JOIN vote_map b
ON a.id = b.map_id
GROUP BY a.id
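This is simpler and, in your SQL Fiddle, it seems to return the correct results. The original view groups by map_id, which MySQL resolves to the vote_map.map_id column rather than to the SELECT alias; that column is NULL for every map without votes, so maps 3, 4 and 5 collapse into a single group. Grouping by a.id (the id from the map table) avoids this.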
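For reference, here is a small runnable sketch of the LEFT JOIN plus GROUP BY on the map table's own id. It assumes Python's sqlite3 module in place of MySQL, and because SQLite has no ENUM type the score column is a plain integer here; those substitutions are mine, not part of the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE map (id INTEGER PRIMARY KEY);
    CREATE TABLE vote_map (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        map_id INTEGER NOT NULL,
        score INTEGER            -- plain integer instead of the ENUM
    );
    INSERT INTO map (id) VALUES (1), (2), (3), (4), (5);
    INSERT INTO vote_map (user_id, map_id, score) VALUES (1, 1, 1), (2, 2, 1);
""")

# Group by the id of the outer (map) table so unmatched maps keep
# their own group instead of sharing one NULL group.
ranking = conn.execute("""
    SELECT m.id AS map_id, COALESCE(SUM(v.score), 0) AS score
    FROM map m
    LEFT JOIN vote_map v ON v.map_id = m.id
    GROUP BY m.id
    ORDER BY m.id
""").fetchall()
print(ranking)
```

Every map appears once, and maps without votes get a score of 0 from COALESCE.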

Joining table with min(amount) does not work

I have 3 tables, but data is only fetched from 2 of them.
I'm trying to get the lowest bids for selected items and display the name of the user who placed each lowest bid.
The query works until the user name is displayed: it shows the wrong user name, one that does not match the bid.
Below is a working example of the structure and the query.
SQL Fiddle
MySQL 5.6 Schema Setup:
CREATE TABLE `bid` (
`id` int(11) NOT NULL,
`amount` float NOT NULL,
`user_id` int(11) NOT NULL,
`item_id` int(11) NOT NULL
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=latin1;
INSERT INTO `bid` (`id`, `amount`, `user_id`, `item_id`) VALUES
(1, 9, 1, 1),
(2, 5, 2, 1),
(3, 4, 3, 1),
(4, 3, 4, 1),
(5, 4, 2, 2),
(6, 22, 5, 1);
-- --------------------------------------------------------
CREATE TABLE `item` (
`id` int(11) NOT NULL,
`name` varchar(100) NOT NULL
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=latin1;
INSERT INTO `item` (`id`, `name`) VALUES
(1, 'chair'),
(2, 'sofa'),
(3, 'table'),
(4, 'box');
-- --------------------------------------------------------
CREATE TABLE `user` (
`id` int(11) NOT NULL,
`name` varchar(100) NOT NULL
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=latin1;
INSERT INTO `user` (`id`, `name`) VALUES
(1, 'James'),
(2, 'Don'),
(3, 'Hipes'),
(4, 'Sam'),
(5, 'Zakam');
ALTER TABLE `bid`
ADD PRIMARY KEY (`id`);
ALTER TABLE `item`
ADD PRIMARY KEY (`id`);
ALTER TABLE `user`
ADD PRIMARY KEY (`id`);
ALTER TABLE `bid`
MODIFY `id` int(11) NOT NULL AUTO_INCREMENT,AUTO_INCREMENT=7;
ALTER TABLE `item`
MODIFY `id` int(11) NOT NULL AUTO_INCREMENT,AUTO_INCREMENT=5;
ALTER TABLE `user`
MODIFY `id` int(11) NOT NULL AUTO_INCREMENT,AUTO_INCREMENT=5;
Query 1:
SELECT b.id, b.item_id, MIN(b.amount) as amount, b.user_id, p.name
FROM bid b
LEFT JOIN user p ON p.id = b.user_id
WHERE b.item_id in (1, 2)
GROUP BY b.item_id
ORDER BY b.amount, b.item_id
Results:
| id | item_id | amount | user_id | name |
|----|---------|--------|---------|-------|
| 5 | 2 | 4 | 2 | Don |
| 1 | 1 | 3 | 1 | James |
Explanation of query:
Get the selected items (1, 2).
Get the lowest bid for those items - MIN(b.amount)
Display the names of the users who placed those bids - LEFT JOIN user p on p.id = b.user_id (this is not working, or I'm doing something wrong)
[Note] I can't use a sub-query; I'm doing this in Doctrine2 (PHP code), which limits MySQL sub-queries.
No, you are not necessarily fetching the user_id of the user who placed the bid. You group by item_id, so you get one result row per item. You are aggregating, so for every column you say which value you want to see for that item. E.g.:
MIN(b.amount) - the minimum amount of the item's records
MAX(b.amount) - the maximum amount of the item's records
AVG(b.amount) - the average amount of the item's records
b.amount - one of the amounts of the item's records, arbitrarily chosen (as there are many amounts and you don't specify which you want to see, the DBMS simply chooses one of them)
That said, b.user_id isn't necessarily the user who made the lowest bid, but just an arbitrary one of the users who bid on the item.
Instead, find the minimum bids and join again with your bid table to access the related records:
select bid.id, bid.item_id, bid.amount, user.id as user_id, user.name
from bid
join
(
select item_id, min(amount) as amount
from bid
group by item_id
) as min_bid on min_bid.item_id = bid.item_id and min_bid.amount = bid.amount
join user on user.id = bid.user_id
order by bid.amount, bid.item_id;
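To check the result of this join-back approach against the sample data, here is a small sketch using Python's sqlite3 module as a stand-in for MySQL (an assumption). The WHERE item_id IN (1, 2) filter inside the derived table is added here to match the question and is not part of the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bid (id INTEGER PRIMARY KEY, amount REAL,
                      user_id INTEGER, item_id INTEGER);
    CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO bid (id, amount, user_id, item_id) VALUES
        (1, 9, 1, 1), (2, 5, 2, 1), (3, 4, 3, 1),
        (4, 3, 4, 1), (5, 4, 2, 2), (6, 22, 5, 1);
    INSERT INTO user (id, name) VALUES
        (1, 'James'), (2, 'Don'), (3, 'Hipes'), (4, 'Sam'), (5, 'Zakam');
""")

# Step 1 (derived table): the minimum amount per item.
# Step 2 (join back): the bid rows that carry exactly that amount.
lowest_bids = conn.execute("""
    SELECT bid.id, bid.item_id, bid.amount, user.id, user.name
    FROM bid
    JOIN (
        SELECT item_id, MIN(amount) AS amount
        FROM bid
        WHERE item_id IN (1, 2)
        GROUP BY item_id
    ) AS min_bid
      ON min_bid.item_id = bid.item_id AND min_bid.amount = bid.amount
    JOIN user ON user.id = bid.user_id
    ORDER BY bid.amount, bid.item_id
""").fetchall()
print(lowest_bids)
```

With the sample data, the lowest bid on item 1 belongs to Sam (amount 3) and the lowest bid on item 2 to Don (amount 4). If two users tie for the lowest amount on an item, this approach returns both rows.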
You can solve this using a subquery. I am not 100% sure if this is the most efficient way, but at least it works.
SELECT b1.id, b1.item_id, b1.amount, b1.user_id, p.name
FROM bid b1
LEFT JOIN user p ON p.id = b1.user_id
WHERE b1.id = (
SELECT b2.id
FROM bid b2
WHERE b2.item_id IN (1, 2)
ORDER BY b2.amount LIMIT 1
)
This first selects the lowest bid among items 1 and 2 and then uses the id of that bid to find the information you need. Note that it returns a single row (the lowest bid overall), not one row per item.
Edit
You are saying that Doctrine does not support subqueries. I have not used Doctrine a lot, but something like this should work:
$subQueryBuilder = $entityManager->createQueryBuilder();
$subQuery = $subQueryBuilder
->select('b2.id')
->from('bid', 'b2')
->where('b2.item_id IN (:items)')
->orderBy('b2.amount')
->setMaxResults(1)
->getDql();
$queryBuilder = $entityManager->createQueryBuilder();
$query = $queryBuilder
->select('b1.id', 'b1.item_id', 'b1.amount', 'b1.user_id', 'p.name')
->from('bid', 'b1')
->leftJoin('user', 'p', 'with', 'p.id = b1.user_id')
->where('b1.id = (' . $subQuery . ')')
->setParameter('items', [1, 2])
->getQuery()->getSingleResult();

Using SQL Sub-queries in an INSERT Statement

Here are the two tables created:
CREATE TABLE category_tbl(
id int NOT NULL AUTO_INCREMENT,
name varchar(255) NOT NULL,
subcategory varchar(255) NOT NULL,
PRIMARY KEY(id),
CONSTRAINT nameSubcategory UNIQUE KEY(name, subcategory)
) ENGINE=InnoDB;
CREATE TABLE device(
id INT NOT NULL AUTO_INCREMENT,
cid INT DEFAULT NULL,
name VARCHAR(255) NOT NULL,
received DATE,
isbroken BOOLEAN,
PRIMARY KEY(id),
FOREIGN KEY(cid) REFERENCES category_tbl(id)
) ENGINE=InnoDB;
Below is the instruction that was given to me:
-- insert the following device instances into the device table (you should use a subquery to set up foreign key references, no hard-coded numbers):
-- cid - reference to name: phone subcategory: maybe a tablet?
-- name - Samsung Atlas
-- received - 1/2/1970
-- isbroken - True
I'm getting errors on the insert statement below from attempting to use a sub-query within an insert statement. How would you solve this issue?
INSERT INTO devices(cid, name, received, isbroken)
VALUES((SELECT id FROM category_tbl WHERE subcategory = 'tablet') , 'Samsung Atlas', 1/2/1970, 'True');
You have different table names in CREATE TABLE and INSERT INTO, so choose one: device or devices.
When inserting a date, use a proper format such as DATE('1970-02-01').
When inserting a boolean, use TRUE without quotes, I believe.
http://sqlfiddle.com/#!9/b7180/1
INSERT INTO devices(cid, name, received, isbroken)
VALUES((SELECT id FROM category_tbl WHERE subcategory = 'tablet') , 'Samsung Atlas', DATE('1970-02-01'), TRUE);
It's not possible to use a SELECT in an INSERT ... VALUES ... statement. The key here is the VALUES keyword. (EDIT: It is actually possible, my bad.)
If you remove the VALUES keyword, you can use the INSERT ... SELECT ... form of the INSERT statement.
For example:
INSERT INTO mytable ( a, b, c) SELECT 'a','b','c'
In your case, you could run a query that returns the needed value of the foreign key column, e.g.
SELECT c.id
FROM category_tbl c
WHERE c.name = 'tablet'
ORDER BY c.id
LIMIT 1
If we add some literals in the SELECT list, like this...
SELECT c.id AS `cid`
, 'Samsung Atlas' AS `name`
, '1970-01-02' AS `received`
, TRUE AS `isBroken`
FROM category_tbl c
WHERE c.name = 'tablet'
ORDER BY c.id
LIMIT 1
That will return a "row" that we could insert. Just precede the SELECT with
INSERT INTO device (`cid`, `name`, `received`, `isbroken`)
NOTE: The expressions returned by the SELECT are "lined up" with the columns in the column list by position, not by name. The aliases assigned to the expressions in the SELECT list are arbitrary, they are basically ignored. They could be omitted, but I think having the aliases assigned makes it easier to understand when we run just the SELECT portion.
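Put together, the INSERT ... SELECT form might look like the sketch below. It assumes Python's sqlite3 module rather than MySQL, adapts the column types to SQLite, and seeds one category row ('phone' / 'maybe a tablet?') taken from the assignment text so that the lookup has something to find; the date follows the '1970-01-02' reading used above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE category_tbl (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        name TEXT NOT NULL,
        subcategory TEXT NOT NULL,
        UNIQUE (name, subcategory)
    );
    CREATE TABLE device (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        cid INTEGER DEFAULT NULL REFERENCES category_tbl (id),
        name TEXT NOT NULL,
        received DATE,
        isbroken BOOLEAN
    );
    -- Sample category row (assumed, taken from the assignment text).
    INSERT INTO category_tbl (name, subcategory)
    VALUES ('phone', 'maybe a tablet?');
""")

# The SELECT supplies the foreign key plus literal values; columns are
# matched to the INSERT column list by position, not by name.
conn.execute(
    """
    INSERT INTO device (cid, name, received, isbroken)
    SELECT c.id, 'Samsung Atlas', '1970-01-02', 1
    FROM category_tbl c
    WHERE c.name = ? AND c.subcategory = ?
    """,
    ("phone", "maybe a tablet?"),
)

device_row = conn.execute(
    "SELECT cid, name, received, isbroken FROM device"
).fetchone()
print(device_row)
```

If no category matches the WHERE clause, the SELECT returns no rows and nothing is inserted, which may be preferable to inserting a device with a NULL cid.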

Complicated SELECT query

I have a table which defines what things another table can have, for example:
CREATE TABLE `objects` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
`name` VARCHAR(50) NOT NULL
);
INSERT INTO `objects` (`name`) VALUES ('Test');
INSERT INTO `objects` (`name`) VALUES ('Test 2');
CREATE TABLE `properties` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
`name` VARCHAR(50) NOT NULL
);
INSERT INTO `properties` (`name`) VALUES ('colour');
INSERT INTO `properties` (`name`) VALUES ('size');
CREATE TABLE `objects_properties` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
`object_id` INT UNSIGNED NOT NULL,
`property_id` INT UNSIGNED NOT NULL,
`value` VARCHAR(50) NOT NULL,
FOREIGN KEY (`object_id`)
REFERENCES `objects` (`id`),
FOREIGN KEY (`property_id`)
REFERENCES `properties` (`id`)
);
INSERT INTO `objects_properties` (`object_id`, `property_id`, `value`) VALUES (1, 1, 'red');
INSERT INTO `objects_properties` (`object_id`, `property_id`, `value`) VALUES (1, 2, 'small');
INSERT INTO `objects_properties` (`object_id`, `property_id`, `value`) VALUES (2, 1, 'blue');
INSERT INTO `objects_properties` (`object_id`, `property_id`, `value`) VALUES (2, 2, 'large');
Hopefully this makes sense. Basically instead of having columns for colour, size etc. in the objects table, I have two other tables, one that defines the properties any object can have, and another that links objects to some or all of these properties.
My question is if there's some way to retrieve this information like this:
+--------+------------+------------+
| object | colour | size |
+--------+------------+------------+
| Test | red | small |
| Test 2 | blue | large |
+--------+------------+------------+
So you can see the column headings are actually row values. I'm not sure if it's possible or how costly it would be compared to doing a few separate queries and putting everything together in PHP.
SELECT o.name, c.colour, s.size
FROM objects o
LEFT JOIN (SELECT op.object_id, op.value colour
FROM objects_properties op
join properties p on op.property_id = p.id and p.name = 'colour') c
ON o.id = c.object_id
LEFT JOIN (SELECT op.object_id, op.value size
FROM objects_properties op
join properties p on op.property_id = p.id and p.name = 'size') s
ON o.id = s.object_id
The keyword here is "crosstab" (a "pivot table" also lies in that direction), and no, MySQL cannot do this directly. You can write a query that selects this, but you will have to explicitly define the columns yourself in the query; they cannot be fetched from another table. Other RDBMSs may have capabilities for this.
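One common way to write such an explicit crosstab is conditional aggregation: one MAX(CASE ...) expression per output column. This is a different technique from the joins shown earlier, and the sketch below assumes Python's sqlite3 module as a stand-in for MySQL; the column list is still hard-coded, as described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE objects (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE properties (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE objects_properties (
        id INTEGER PRIMARY KEY,
        object_id INTEGER NOT NULL REFERENCES objects (id),
        property_id INTEGER NOT NULL REFERENCES properties (id),
        value TEXT NOT NULL
    );
    INSERT INTO objects (name) VALUES ('Test'), ('Test 2');
    INSERT INTO properties (name) VALUES ('colour'), ('size');
    INSERT INTO objects_properties (object_id, property_id, value) VALUES
        (1, 1, 'red'), (1, 2, 'small'), (2, 1, 'blue'), (2, 2, 'large');
""")

# Each CASE keeps the value only for its own property; MAX then picks
# that single non-NULL value within the object's group.
crosstab = conn.execute("""
    SELECT o.name AS object,
           MAX(CASE WHEN p.name = 'colour' THEN op.value END) AS colour,
           MAX(CASE WHEN p.name = 'size'   THEN op.value END) AS size
    FROM objects o
    LEFT JOIN objects_properties op ON op.object_id = o.id
    LEFT JOIN properties p ON p.id = op.property_id
    GROUP BY o.id, o.name
    ORDER BY o.id
""").fetchall()
print(crosstab)
```

An object that lacks a property simply gets NULL in that column, because of the LEFT JOINs.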
A pivot (or something like it) could be useful. In MS SQL Server you can use PIVOT, but the values to pivot on must be constants, or you can use a stored procedure to compute them.
SELECT o.*,
(
SELECT op.value
FROM objects_properties op
WHERE op.object_id = o.id
AND op.property_id = $prop_color_id
) AS color,
(
SELECT op.value
FROM objects_properties op
WHERE op.object_id = o.id
AND op.property_id = $prop_size_id
) AS size
FROM objects o
Substitute $prop_color_id and $prop_size_id with the ids of the colour and size properties.
For this query to be efficient, make (object_id, property_id) the PRIMARY KEY of objects_properties and get rid of the surrogate key.