mysql join with sub-query - mysql

This is my schema:
mysql> describe stocks;
+-----------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| symbol | varchar(32) | NO | | NULL | |
| date | datetime | NO | | NULL | |
| value | float(10,3) | NO | | NULL | |
| contracts | int(8) | NO | | NULL | |
| open | float(10,3) | NO | | NULL | |
| close | float(10,3) | NO | | NULL | |
| high | float(10,3) | NO | | NULL | |
| low | float(10,3) | NO | | NULL | |
+-----------+-------------+------+-----+---------+----------------+
9 rows in set (0.03 sec)
I added the columns open and low, and I want to fill them using the data already in the table.
The open/close values are per day, so the rows with the minimum and maximum id for each day should give me the correct values. My first idea is to get the list of dates and then left join it with the table:
SELECT DISTINCT(DATE(date)) as date FROM stocks
but I'm stuck because I can't get the max/min id or the first/last value. Thanks

You will get the day-wise min and max ids from the query below:
SELECT DATE_FORMAT(date, "%d/%m/%Y"),min(id) as min_id,max(id) as max_id FROM stocks group by DATE_FORMAT(date, "%d/%m/%Y")
But the other requirement is not clear.

Solved!
mysql> UPDATE stocks s JOIN
-> (SELECT k.date, k.value as v1, y.value as v2 FROM (SELECT x.date, x.min_id, x.max_id, stocks.value FROM (SELECT DATE(date) as date,min(id) as min_id,max(id) as max_id FROM stocks group by DATE(date)) AS x LEFT JOIN stocks ON x.min_id = stocks.id) AS k LEFT JOIN stocks y ON k.max_id = y.id) sd
-> ON DATE(s.date) = sd.date
-> SET s.open = sd.v1, s.close = sd.v2;
Query OK, 995872 rows affected (1 min 50.38 sec)
Rows matched: 995872 Changed: 995872 Warnings: 0
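For anyone who wants to check the inner part of that UPDATE on a small data set first, here is a minimal sketch of the derived table (first and last value per day via MIN(id)/MAX(id)). It uses Python's built-in sqlite3 module as a stand-in for MySQL, the five sample rows are made up for illustration, and it assumes the table holds a single symbol (the query above does not group by symbol either).

```python
import sqlite3

# In-memory SQLite database standing in for MySQL; the rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stocks (id INTEGER PRIMARY KEY, symbol TEXT, date TEXT, value REAL)"
)
conn.executemany(
    "INSERT INTO stocks (id, symbol, date, value) VALUES (?, ?, ?, ?)",
    [
        (1, "ABC", "2020-01-02 09:00:00", 10.0),
        (2, "ABC", "2020-01-02 12:00:00", 11.5),
        (3, "ABC", "2020-01-02 17:00:00", 10.5),
        (4, "ABC", "2020-01-03 09:00:00", 12.0),
        (5, "ABC", "2020-01-03 17:00:00", 13.0),
    ],
)

# Step 1: per day, find the smallest and largest id.
# Step 2: join back to stocks twice to fetch the value stored at each of those ids.
daily = conn.execute(
    """
    SELECT d.day, o.value AS open_value, c.value AS close_value
    FROM (SELECT DATE(date) AS day, MIN(id) AS min_id, MAX(id) AS max_id
          FROM stocks
          GROUP BY DATE(date)) AS d
    JOIN stocks AS o ON o.id = d.min_id
    JOIN stocks AS c ON c.id = d.max_id
    ORDER BY d.day
    """
).fetchall()

for day, open_value, close_value in daily:
    print(day, open_value, close_value)
```

The UPDATE ... JOIN then only has to match each row's DATE(date) against the day column of this result.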

Related

When is type casting SQL columns required?

I ran the first query below and did not get the results I expected. I then realized that 1 slope was interpreted as an integer, so t.slope in t.slope*pchp.sign*p.slope slope, t.intercept+t.slope*pchp.sign*p.intercept intercept was also acting as an integer.
I then repeated the query, this time casting both slope and intercept as decimals, and obtained the correct results.
I then repeated the query once more, this time casting only slope (whose value is non-zero) and not intercept, and again obtained the correct results.
Which leads me to my question: when is type casting SQL columns required?
MariaDB [testing]> WITH RECURSIVE t AS (
-> SELECT id, id pointsId, type, 0 value, 0 prevValue, 1 slope, 0 intercept
-> FROM points
-> WHERE id IN (406, 428)
-> UNION ALL
-> SELECT t.id, pchp.pointsId, p.type, p.value, p.prevValue, t.slope*pchp.sign*p.slope slope, t.intercept+t.slope*pchp.sign*p.intercept intercept
-> FROM t
-> INNER JOIN points_custom_has_points pchp ON pchp.pointsCustomId=t.pointsId
-> INNER JOIN points p ON p.id=pchp.pointsId
-> )
-> SELECT id, SUM(slope*value+intercept) value, SUM(slope*prevValue+intercept) prevValue FROM t WHERE type='real' GROUP BY id;
+-----+--------+-----------+
| id | value | prevValue |
+-----+--------+-----------+
| 406 | 0 | 0 |
| 428 | 123702 | 123702 |
+-----+--------+-----------+
2 rows in set (0.00 sec)
MariaDB [testing]> WITH RECURSIVE t AS (
-> SELECT id, id pointsId, type, 0 value, 0 prevValue, CAST(1 AS DECIMAL(12,4)) slope, CAST(0 AS DECIMAL(12,4)) intercept
-> FROM points
-> WHERE id IN (406, 428)
-> UNION ALL
-> SELECT t.id, pchp.pointsId, p.type, p.value, p.prevValue, t.slope*pchp.sign*p.slope slope, t.intercept+t.slope*pchp.sign*p.intercept intercept
-> FROM t
-> INNER JOIN points_custom_has_points pchp ON pchp.pointsCustomId=t.pointsId
-> INNER JOIN points p ON p.id=pchp.pointsId
-> )
-> SELECT id, SUM(slope*value+intercept) value, SUM(slope*prevValue+intercept) prevValue FROM t WHERE type='real' GROUP BY id;
+-----+-------------+-------------+
| id | value | prevValue |
+-----+-------------+-------------+
| 406 | 49480.8000 | 49480.8000 |
| 428 | 123702.0000 | 123702.0000 |
+-----+-------------+-------------+
2 rows in set (0.00 sec)
MariaDB [testing]> WITH RECURSIVE t AS (
-> SELECT id, id pointsId, type, 0 value, 0 prevValue, CAST(1 AS DECIMAL(12,4)) slope, 0 intercept
-> FROM points
-> WHERE id IN (406, 428)
-> UNION ALL
-> SELECT t.id, pchp.pointsId, p.type, p.value, p.prevValue, t.slope*pchp.sign*p.slope slope, t.intercept+t.slope*pchp.sign*p.intercept intercept
-> FROM t
-> INNER JOIN points_custom_has_points pchp ON pchp.pointsCustomId=t.pointsId
-> INNER JOIN points p ON p.id=pchp.pointsId
-> )
-> SELECT id, SUM(slope*value+intercept) value, SUM(slope*prevValue+intercept) prevValue FROM t WHERE type='real' GROUP BY id;
+-----+-------------+-------------+
| id | value | prevValue |
+-----+-------------+-------------+
| 406 | 49480.8000 | 49480.8000 |
| 428 | 123702.0000 | 123702.0000 |
+-----+-------------+-------------+
2 rows in set (0.00 sec)
MariaDB [testing]>
Table definitions are as follows:
MariaDB [testing]> explain points;
+----------------+-------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+----------------+-------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| idPublic | int(11) | NO | MUL | 0 | |
| accountsId | int(11) | NO | MUL | NULL | |
| name | varchar(45) | NO | MUL | NULL | |
| value | float | YES | | NULL | |
| prevValue | float | YES | | NULL | |
| units | varchar(45) | YES | | NULL | |
| type | char(8) | NO | MUL | NULL | |
| slope | float | NO | | 1 | |
| intercept | float | NO | | 0 | |
| tsValueUpdated | datetime | YES | | NULL | |
| sourceTypeId | tinyint(4) | YES | MUL | NULL | |
+----------------+-------------+------+-----+---------+----------------+
12 rows in set (0.00 sec)
MariaDB [testing]> explain points_custom_has_points;
+----------------+------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+----------------+------------+------+-----+---------+-------+
| pointsCustomId | int(11) | NO | PRI | NULL | |
| pointsId | int(11) | NO | PRI | NULL | |
| sign | tinyint(4) | NO | MUL | 1 | |
+----------------+------------+------+-----+---------+-------+
3 rows in set (0.00 sec)
MariaDB [testing]>
Change CAST(1 AS DECIMAL(12,4)) slope to 1.0 slope. If that works, then here is the explanation:
In a recursive CTE, the first (non-recursive) SELECT in the UNION determines the datatype of each column. So simply saying 1 would set slope to some flavor of integer (I think BIGINT).
You may have similar issues with other columns.
I suspect you cannot swap the SELECTs in that UNION because of WITH. But that might be another thing to try. (Please report back if you do try it.)

How to calculate a moving average in MYSQL

I have an application that stores stock quotes into my MySQL database.
I have a table called stock_history:
mysql> desc stock_history;
+-------------------+---------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+---------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| date | date | NO | MUL | NULL | |
| close | decimal(12,5) | NO | MUL | NULL | |
| dmal_3 | decimal(12,5) | YES | MUL | NULL | |
+-------------------+---------------+------+-----+---------+----------------+
5 rows in set (0.01 sec)
These are all the values in this table:
mysql> select date, close, dmal_3 from stock_history order by date asc;
+------------+----------+----------+
| date | close | dmal_3 |
+------------+----------+----------+
| 2000-01-03 | 2.00000 | NULL |
| 2000-01-04 | 4.00000 | NULL |
| 2000-01-05 | 6.00000 | NULL |
| 2000-01-06 | 8.00000 | NULL |
| 2000-01-07 | 10.00000 | NULL |
| 2000-01-10 | 12.00000 | NULL |
| 2000-01-11 | 14.00000 | NULL |
| 2000-01-12 | 16.00000 | NULL |
| 2000-01-13 | 18.00000 | NULL |
| 2000-01-14 | 20.00000 | NULL |
+------------+----------+----------+
10 rows in set (0.01 sec)
I am guaranteed that there will be 0 or 1 record for each date.
Can I write a single query that will insert the three-day moving average (ie: the average closing prices of that day and the two previous trading days before it) into the dmal_3 field? How?
When the query is done, I want the table to look like this:
mysql> select date, close, dmal_3 from stock_history order by date asc;
+------------+----------+----------+
| date | close | dmal_3 |
+------------+----------+----------+
| 2000-01-03 | 2.00000 | NULL |
| 2000-01-04 | 4.00000 | NULL |
| 2000-01-05 | 6.00000 | 4.00000 |
| 2000-01-06 | 8.00000 | 6.00000 |
| 2000-01-07 | 10.00000 | 8.00000 |
| 2000-01-10 | 12.00000 | 10.00000 |
| 2000-01-11 | 14.00000 | 12.00000 |
| 2000-01-12 | 16.00000 | 14.00000 |
| 2000-01-13 | 18.00000 | 16.00000 |
| 2000-01-14 | 20.00000 | 18.00000 |
+------------+----------+----------+
10 rows in set (0.01 sec)
That is what I call a good challenge. My solution first numbers the rows with a counter and uses that numbered result as a table. I select everything from it and, for each row, run the same numbered query as a subquery, comparing the counter positions of the two. Once the query works, it just needs an inner join with the actual table to do the update. Here is my solution:
update stock_history tb1
inner join
(
  select a.id,
         case when a.step < 3 then null
         else
           (select avg(b.close)
            from (
              select hh.*,
                     @stp:=@stp+1 stp
              from stock_history hh,
                   (select @sum:=0, @stp:=0) x
              order by hh.dt
              limit 17823232
            ) b
            where b.stp >= a.step-2 and b.stp <= a.step
           )
         end dmal_3
  from (select h1.*,
               @step:=@step+1 step
        from stock_history h1,
             (select @sum:=0, @step:=0) x
        order by h1.dt
        limit 17823232
       ) a
) x on tb1.id = x.id
set tb1.dmal_3 = x.dmal_3;
I changed some column names to make my test easier. Here is the working SQLFiddle: http://sqlfiddle.com/#!9/e7dc00/1
If you have any doubts, let me know so I can clarify!
Edit
The limit 17823232 clause was added to the subqueries because I don't know which version of MySQL you are on. Depending on the version (>= 5.7, not sure exactly), the optimizer will ignore the inner order by clauses, so the query would not work the way it should. I just chose an arbitrary big number; usually you can use the maximum allowed.
The only column whose name differs between your table and mine is the date one, which I named dt because date is a reserved word and you would need backticks ( ` ) to use such a column name. I have therefore left it as dt in the query above.
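If your server supports window functions (MySQL 8.0+; I have not checked other versions), the same three-row average can be written without user variables. The following is only a sketch: it runs against Python's built-in sqlite3 module (which needs SQLite 3.25+ for window functions) instead of MySQL, and it just computes the values with a SELECT rather than performing the UPDATE.

```python
import sqlite3

# In-memory SQLite table loaded with the ten sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock_history (id INTEGER PRIMARY KEY, date TEXT, close REAL)"
)
dates = [
    "2000-01-03", "2000-01-04", "2000-01-05", "2000-01-06", "2000-01-07",
    "2000-01-10", "2000-01-11", "2000-01-12", "2000-01-13", "2000-01-14",
]
conn.executemany(
    "INSERT INTO stock_history (date, close) VALUES (?, ?)",
    [(d, 2.0 * (i + 1)) for i, d in enumerate(dates)],
)

# AVG over the current row and the two rows before it (ordered by date).
# ROW_NUMBER() keeps the first two rows NULL, since they have fewer than 3 rows.
rows = conn.execute(
    """
    SELECT date,
           close,
           CASE WHEN ROW_NUMBER() OVER (ORDER BY date) >= 3
                THEN AVG(close) OVER (ORDER BY date
                                      ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
           END AS dmal_3
    FROM stock_history
    ORDER BY date
    """
).fetchall()

for date, close, dmal_3 in rows:
    print(date, close, dmal_3)
```

Because the frame counts rows rather than calendar days, the weekend gap between 2000-01-07 and 2000-01-10 is handled the same way as in the counter-based query.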

mysql subquery understanding

I am trying to find all sale_id's that have an entry in sales_item_taxes table, but do NOT have a corresponding entry in the sales_items table.
mysql> describe phppos_sales_items_taxes;
+------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------+---------------+------+-----+---------+-------+
| sale_id | int(10) | NO | PRI | NULL | |
| item_id | int(10) | NO | PRI | NULL | |
| line | int(3) | NO | PRI | 0 | |
| name | varchar(255) | NO | PRI | NULL | |
| percent | decimal(15,3) | NO | PRI | NULL | |
| cumulative | int(1) | NO | | 0 | |
+------------+---------------+------+-----+---------+-------+
6 rows in set (0.01 sec)
mysql> describe phppos_sales_items;
+--------------------+----------------+------+-----+--------------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------------+----------------+------+-----+--------------+-------+
| sale_id | int(10) | NO | PRI | 0 | |
| item_id | int(10) | NO | PRI | 0 | |
| description | varchar(255) | YES | | NULL | |
| serialnumber | varchar(255) | YES | | NULL | |
| line | int(3) | NO | PRI | 0 | |
| quantity_purchased | decimal(23,10) | NO | | 0.0000000000 | |
| item_cost_price | decimal(23,10) | NO | | NULL | |
| item_unit_price | decimal(23,10) | NO | | NULL | |
| discount_percent | int(11) | NO | | 0 | |
+--------------------+----------------+------+-----+--------------+-------+
9 rows in set (0.00 sec)
mysql>
Proposed Query:
SELECT DISTINCT sale_id
FROM phppos_sales_items_taxes
WHERE item_id NOT IN
(SELECT item_id FROM phppos_sales_items WHERE sale_id = phppos_sales_items_taxes.sale_id)
The part I am confused by is the subquery. The query seems to work as intended, but I do not understand the subquery part. How does it do the lookup for each sale?
For example if I have the following data:
mysql> select * from phppos_sales;
+---------------------+-------------+-------------+---------+-------------------------+---------+--------------------+-----------+-----------+------------+---------+-----------+-----------------------+-------------+---------+
| sale_time | customer_id | employee_id | comment | show_comment_on_receipt | sale_id | payment_type | cc_ref_no | auth_code | deleted_by | deleted | suspended | store_account_payment | location_id | tier_id |
+---------------------+-------------+-------------+---------+-------------------------+---------+--------------------+-----------+-----------+------------+---------+-----------+-----------------------+-------------+---------+
| 2014-08-09 17:53:38 | NULL | 1 | | 0 | 1 | Cash: $12.96<br /> | | | NULL | 0 | 0 | 0 | 1 | NULL |
| 2014-08-09 17:56:59 | NULL | 1 | | 0 | 2 | Cash: $12.96<br /> | | | NULL | 0 | 0 | 0 | 1 | NULL |
+---------------------+-------------+-------------+---------+-------------------------+---------+--------------------+-----------+-----------+------------+---------+-----------+-----------------------+-------------+---------+
mysql> select * from phppos_sales_items;
+---------+---------+-------------+--------------+------+--------------------+-----------------+-----------------+------------------+
| sale_id | item_id | description | serialnumber | line | quantity_purchased | item_cost_price | item_unit_price | discount_percent |
+---------+---------+-------------+--------------+------+--------------------+-----------------+-----------------+------------------+
| 2 | 1 | | | 1 | 1.0000000000 | 10.0000000000 | 12.0000000000 | 0 |
+---------+---------+-------------+--------------+------+--------------------+-----------------+-----------------+------------------+
1 row in set (0.00 sec)
mysql> select * from phppos_sales_items_taxes;
+---------+---------+------+-----------+---------+------------+
| sale_id | item_id | line | name | percent | cumulative |
+---------+---------+------+-----------+---------+------------+
| 1 | 1 | 1 | Sales Tax | 8.000 | 0 |
| 2 | 1 | 1 | Sales Tax | 8.000 | 0 |
+---------+---------+------+-----------+---------+------------+
2 rows in set (0.00 sec)
When I run the query below, it does find sale_id 1. But how does the subquery know to filter correctly? I guess I am not understanding how the subquery works.
mysql> SELECT DISTINCT sale_id
-> FROM phppos_sales_items_taxes
-> WHERE item_id NOT IN
-> (SELECT item_id FROM phppos_sales_items WHERE sale_id = phppos_sales_items_taxes.sale_id)
-> ;
+---------+
| sale_id |
+---------+
| 1 |
+---------+
1 row in set (0.00 sec)
Duffy356's link to the SQL-Joins article is good, but seeing it with your own data might make more sense...
First, your query as written (understandable, since you are learning) will be very expensive for the engine. It knows what to include because it is doing a correlated sub-query, meaning that FOR every record IN the sales_items_taxes table it runs a query against the sales_items table, which returns every item for that sale_id. Then it comes back to the main query and compares the item_id of the current sales_items_taxes record against that list. If it does NOT find it, the sale_id is included in the result set. Then it goes to the next record in the sales_items_taxes table.
(Your query reformatted for better readability)
SELECT DISTINCT
sale_id
FROM
phppos_sales_items_taxes
WHERE
item_id NOT IN ( SELECT item_id
FROM phppos_sales_items
WHERE sale_id = phppos_sales_items_taxes.sale_id)
Now, think about this. You have 1 sale with 100 items: the correlated sub-query runs 100 times. Do this with 1,000 sale ids, each with however many items, and it gets expensive quickly.
A better alternative is to take advantage of the database and do a left join. The indexes work directly with the LEFT JOIN (or inner join) and are optimized by the engine. Also, notice I am using "aliases" for the tables and qualifying the columns with those aliases for readability. Start with your sales items taxes table (the one in which you are looking for extra entries) as the basis. Now, left join it to the sales items table on the two key components, sale_id and item_id. I would suggest that each table have an index ON (sale_id, item_id) to match the join condition here.
SELECT DISTINCT
sti.sale_id
FROM
phppos_sales_items_taxes sti
LEFT JOIN phppos_sales_items si
ON sti.sale_id = si.sale_id
AND sti.item_id = si.item_id
WHERE
si.sale_id IS NULL
So, from here, think of it that each table is lined-up side-by-side with each other and all you are getting are those on the left side (sale items taxes) that DO NOT have an entry on the right side (sales_items).
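To see this on the sample rows from the question, here is a small sketch that loads them into an in-memory database and runs the correlated NOT IN, the LEFT JOIN ... IS NULL form, and a NOT EXISTS form side by side. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the tables are trimmed to the columns the queries touch.

```python
import sqlite3

# In-memory SQLite tables holding the sample rows from the question
# (only the columns needed by the queries are kept).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE phppos_sales_items (sale_id INTEGER, item_id INTEGER, line INTEGER);
    CREATE TABLE phppos_sales_items_taxes (sale_id INTEGER, item_id INTEGER,
                                           line INTEGER, name TEXT, percent REAL);
    INSERT INTO phppos_sales_items VALUES (2, 1, 1);
    INSERT INTO phppos_sales_items_taxes VALUES (1, 1, 1, 'Sales Tax', 8.0);
    INSERT INTO phppos_sales_items_taxes VALUES (2, 1, 1, 'Sales Tax', 8.0);
    """
)

queries = {
    # Correlated subquery: re-evaluated for each row of the taxes table.
    "not_in": """
        SELECT DISTINCT sale_id
        FROM phppos_sales_items_taxes
        WHERE item_id NOT IN (SELECT item_id FROM phppos_sales_items
                              WHERE sale_id = phppos_sales_items_taxes.sale_id)
    """,
    # Anti-join: keep taxes rows that found no partner on the right side.
    "left_join": """
        SELECT DISTINCT sti.sale_id
        FROM phppos_sales_items_taxes sti
        LEFT JOIN phppos_sales_items si
               ON sti.sale_id = si.sale_id AND sti.item_id = si.item_id
        WHERE si.sale_id IS NULL
    """,
    # NOT EXISTS: same idea expressed as an existence test.
    "not_exists": """
        SELECT DISTINCT t1.sale_id
        FROM phppos_sales_items_taxes t1
        WHERE NOT EXISTS (SELECT 1 FROM phppos_sales_items t2
                          WHERE t2.sale_id = t1.sale_id AND t2.item_id = t1.item_id)
    """,
}

results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}
for name, found in results.items():
    print(name, found)
```

All three forms return only sale_id 1 on this data; which one performs best on large tables depends on the engine and the indexes, so it is worth checking with EXPLAIN.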
Your problem can be fixed by using joins.
Read the following article about SQL-Joins and think about your problem -> you will be able to fix it ;)
The IN-clause is not the best solution, because some databases have limits on the number of arguments contained in it.
What you really wanted here is:
SELECT DISTINCT sale_id
FROM phppos_sales_items_taxes
WHERE sale_id NOT IN
(SELECT sale_id FROM phppos_sales_items)
WHERE field NOT IN (SELECT field FROM anothertable WHERE ...) is a perfectly fine query construct.
Your original query:
SELECT DISTINCT sale_id
FROM phppos_sales_items_taxes
WHERE item_id NOT IN
(SELECT item_id FROM phppos_sales_items WHERE sale_id = phppos_sales_items_taxes.sale_id)
Here you are pulling all the item_ids from the phppos_sales_items table where sale_id matches with taxes table, and removing those item_ids from the final result.
You can also get the same results in a couple of other ways, which may be easier to understand.
Use an IN query with multiple columns:
select distinct sale_id
from phppos_sales_items_taxes
where (sale_id, item_id) not in (select sale_id, item_id from phppos_sales_items)
-- This form of query is easy to read and understand. Performance may not be good for large tables.
Exists / not exists format:
select distinct sale_id
from phppos_sales_items_taxes t1
where not exists (select '1' from phppos_sales_items t2
where t2.sale_id = t1.sale_id
and t2.item_id = t1.item_id
)
I would have also suggested the same solution as 'bwperrin' did - not sure why you didn't get any output by running the query. If your criteria is to filter on sale_id - that is the best solution. But looks like you are using (sale_id, item_id) as a way to identify sales record. Make sure your table structure makes sense.

Optimize SQL query (Facebook-like application)

My application is similar to Facebook, and I'm trying to optimize the query that gets a user's records. A user's records are the entries where the user is either src or dst. The src is stored directly in usermuralentry; the dst list is in usermuralentry_user.
So, an entry can have one src and many dst.
I have those tables:
mysql> desc usermuralentry ;
+-----------------+------------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------+------------------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| user_src_id | int(11) | NO | MUL | NULL | |
| private | tinyint(1) | NO | | NULL | |
| content | longtext | NO | | NULL | |
| date | datetime | NO | | NULL | |
| last_update | datetime | NO | | NULL | |
+-----------------+------------------+------+-----+---------+----------------+
10 rows in set (0.10 sec)
mysql> desc usermuralentry_user ;
+-------------------+---------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------------+---------+------+-----+---------+----------------+
| id | int(11) | NO | PRI | NULL | auto_increment |
| usermuralentry_id | int(11) | NO | MUL | NULL | |
| userinfo_id | int(11) | NO | MUL | NULL | |
+-------------------+---------+------+-----+---------+----------------+
3 rows in set (0.00 sec)
And the following query to retrieve information from two users.
mysql> explain
SELECT *
FROM usermuralentry AS a
, usermuralentry_user AS b
WHERE a.user_src_id IN ( 1, 2 )
OR
(
a.id = b.usermuralentry_id
AND b.userinfo_id IN ( 1, 2 )
);
+----+-------------+-------+------+-------------------------------------------------------------------------------------------+------+---------+------+---------+------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+-------------------------------------------------------------------------------------------+------+---------+------+---------+------------------------------------------------+
| 1 | SIMPLE | b | ALL | usermuralentry_id,usermuralentry_user_bcd7114e,usermuralentry_user_6b192ca7 | NULL | NULL | NULL | 147188 | |
| 1 | SIMPLE | a | ALL | PRIMARY | NULL | NULL | NULL | 1371289 | Range checked for each record (index map: 0x1) |
+----+-------------+-------+------+-------------------------------------------------------------------------------------------+------+---------+------+---------+------------------------------------------------+
2 rows in set (0.00 sec)
but it is taking A LOT of time...
Any tips to optimize it? Could the table schema be improved for my application?
Use this query
SELECT *
FROM usermuralentry AS a
left join usermuralentry_user AS b
on b.usermuralentry_id = a.id
WHERE a.user_src_id IN(1, 2)
OR (a.id = b.usermuralentry_id
AND b.userinfo_id IN(1, 2));
And here is a tip:
You are listing two tables in the FROM clause without a join condition, which produces a Cartesian product; that takes a lot of time and also gives undesired results. Always use joins in this situation.
I think your join isn't properly formed, and you need to change the query to use UNION. The OR condition in the where clause is killing performance as well:
SELECT *
FROM usermuralentry AS a
JOIN usermuralentry_user AS b ON a.id = b.usermuralentry_id /* use explicit JOIN! */
WHERE a.user_src_id IN (1 , 2)
UNION
SELECT *
FROM usermuralentry AS a
JOIN usermuralentry_user AS b ON a.id = b.usermuralentry_id
WHERE b.userinfo_id IN ( 1, 2 )
You also need an index: ALTER TABLE usermuralentry_user ADD INDEX (usermuralentry_id)

MYSQL outer join, doesn't return non-matching rows

I have these two tables in my db
describe external_review_sources;
+-------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------+--------------+------+-----+---------+----------------+
| ersID | int(11) | NO | PRI | NULL | auto_increment |
| name | varchar(50) | NO | UNI | NULL | |
| logo | varchar(255) | NO | | NULL | |
+-------+--------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)
And
describe listing_external_review_source_rel;
+---------+--------------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+---------+--------------+------+-----+---------+----------------+
| lersrID | int(11) | NO | PRI | NULL | auto_increment |
| bid | int(10) | NO | | NULL | |
| url | varchar(255) | NO | | NULL | |
| ersID | int(11) | YES | | NULL | |
| active | int(10) | NO | | NULL | |
| order | int(10) | NO | | NULL | |
+---------+--------------+------+-----+---------+----------------+
6 rows in set (0.00 sec)
I query these tables this way:
SELECT *
FROM
listing_external_review_source_rel
RIGHT JOIN
external_review_sources USING(ersID)
where bid=902028 or bid IS NULL;
+-------+---------------+------+---------+--------+-------+--------+-------+
| ersID | name | logo | lersrID | bid | url | active | order |
+-------+---------------+------+---------+--------+-------+--------+-------+
| 1 | G1 | a | 17 | 902028 | url11 | 1 | 0 |
| 2 | D1 | b | 18 | 902028 | url22 | 0 | 0 |
+-------+---------------+------+---------+--------+-------+--------+-------+
2 rows in set (0.00 sec)
As you can see, results show up for bid=902028; however, for a bid such as 866696 that does NOT exist in listing_external_review_source_rel, the results are empty:
SELECT *
FROM listing_external_review_source_rel
RIGHT JOIN external_review_sources USING(ersID)
where bid=866696 or bid IS NULL;
Empty set (0.00 sec)
I expect the results to be this:
+-------+---------------+------+---------+--------+-------+--------+-------+
| ersID | name | logo | lersrID | bid | url | active | order |
+-------+---------------+------+---------+--------+-------+--------+-------+
| 1 | G1 | NULL | NULL| NULL | NULL | NULL | NULL |
| 2 | D1 | NULL | NULL| NULL | NULL | NULL | NULL |
+-------+---------------+------+---------+--------+-------+--------+-------+
2 rows in set (0.00 sec)
That's why I used the "or bid IS NULL" condition.
What am I doing wrong and what query would give me this result? I basically am interested in having nonmatching rows in my results as well.
Most people use LEFT JOIN so I'll rewrite it to be more standard:
SELECT *
FROM external_review_sources a LEFT JOIN
listing_external_review_source_rel b ON a.ersID=b.ersID AND bid=866696;
Remember that an outer join returns all rows where the ON condition matches, and NULLs where it doesn't. In this case your match condition is more than just the ersID.
Try this
SELECT * FROM
external_review_sources e
LEFT JOIN
listing_external_review_source_rel r ON e.ersID = r.ersID AND r.bid = 866696
WHERE
r.bid IS NULL;
When filtering on "outer tables", the filter needs to be a derived table or in the JOIN because you want to filter before the WHERE (logically). Also, a best practice is to use LEFT JOIN for clarity.
With a derived table
SELECT * FROM
external_review_sources e
LEFT JOIN
(
SELECT *
FROM listing_external_review_source_rel
WHERE bid = 866696
) r USING (ersID)
WHERE
r.bid IS NULL;
Try it this way, moving the test of the bid column out of the WHERE clause and into the JOIN:
SELECT *
FROM listing_external_review_source_rel ler
RIGHT JOIN external_review_sources ers
ON ler.ersID = ers.ersID
AND ler.bid=866696;
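To make the difference between filtering in WHERE and filtering in ON concrete, here is a minimal sketch on the two sample rows from the question. It uses Python's built-in sqlite3 module in place of MySQL, writes the join as a LEFT JOIN, and trims the rel table to the columns that matter; the helper names where_filter and on_filter are made up for this example.

```python
import sqlite3

# In-memory SQLite tables with the sample data from the question
# (the rel table is reduced to the columns used here).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE external_review_sources (ersID INTEGER PRIMARY KEY, name TEXT, logo TEXT);
    CREATE TABLE listing_external_review_source_rel (
        lersrID INTEGER PRIMARY KEY, bid INTEGER, url TEXT, ersID INTEGER);
    INSERT INTO external_review_sources VALUES (1, 'G1', 'a'), (2, 'D1', 'b');
    INSERT INTO listing_external_review_source_rel VALUES
        (17, 902028, 'url11', 1), (18, 902028, 'url22', 2);
    """
)


def where_filter(bid):
    """bid tested in WHERE: runs after the join, so joined rows with another bid are dropped."""
    return conn.execute(
        """
        SELECT e.ersID, e.name, r.bid
        FROM external_review_sources e
        LEFT JOIN listing_external_review_source_rel r ON r.ersID = e.ersID
        WHERE r.bid = ? OR r.bid IS NULL
        ORDER BY e.ersID
        """,
        (bid,),
    ).fetchall()


def on_filter(bid):
    """bid tested in ON: a source without a row for this bid still appears, padded with NULL."""
    return conn.execute(
        """
        SELECT e.ersID, e.name, r.bid
        FROM external_review_sources e
        LEFT JOIN listing_external_review_source_rel r
               ON r.ersID = e.ersID AND r.bid = ?
        ORDER BY e.ersID
        """,
        (bid,),
    ).fetchall()


print(where_filter(866696))  # every source joined to a 902028 row, so WHERE removes them all
print(on_filter(866696))     # both sources kept, with NULL (None) for bid
print(on_filter(902028))     # both sources kept, with their matching bid
```

The WHERE version comes back empty because each source did find a partner row (for bid 902028), so bid is never NULL after the join; moving the test into ON lets the join itself decide what counts as a match.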