Combining 2 Tables with an OUTER JOIN on Another Table - mysql

I need to combine 2 tables that may or may not have data in them, but then I need a full outer join so that, if the last table has rows where IsActive = 1, its data is shown instead of the combined first 2 tables.
I currently have this:
( SELECT qp.ItemName AS name
, qp.TimeAdded AS created
, '' AS effective
, qp.VendorName AS supplier
, qp.Source AS source
, qp.VendorType AS type
, qp.Price AS cost
, '' AS price
, '' AS markup
, '' AS customer
, '' AS customerListID
, qp.VendorListID AS vendorListID
, '' AS itemListID
FROM wp_quantum_purchases AS qp
WHERE qp.IsActive = 1 AND
NOT EXISTS ( SELECT 1
FROM wp_hunter_quote_parts AS hqp
WHERE qp.ItemName = hqp.ItemName AND
hqp.IsActive = 1 ))
UNION ALL
( SELECT qs.ItemName AS name
, qs.TimeAdded AS created
, qs.SalesDate AS effective
, '' AS supplier
, qs.Source AS source
, '' AS type
, '' AS cost
, qs.Price AS price
, '' AS markup
, qs.CustomerName AS customer
, qs.CustomerListID AS customerListID
, '' AS vendorListID
, '' AS itemListID
FROM wp_quantum_sales AS qs
WHERE qs.IsActive = 1 AND
NOT EXISTS ( SELECT 1
FROM wp_hunter_quote_parts AS hqp
WHERE qs.ItemName = hqp.ItemName AND
hqp.IsActive = 1 ))
UNION ALL
( SELECT hqp.ItemName AS name
, hq.Quote_Date AS created
, hqp.SalesDate AS effective
, hqp.VendorName AS supplier
, hqp.Source AS source
, hqp.VendorType AS type
, hqp.Cost AS cost
, hqp.Price AS price
, CAST(( ( ( CAST(hqp.Price AS DECIMAL(10, 2)) - CAST(hqp.Cost AS DECIMAL(10, 2)) ) / CAST(hqp.Cost AS DECIMAL(10, 2)) ) * 100 ) AS DECIMAL(10, 2)) AS markup
, IFNULL(hq.Customer_FullName, 'N/A') AS customer
, hq.Customer_ListID AS customerListID
, hqp.VendorListID AS vendorListID
, hqp.Item_ListID AS itemListID
FROM wp_hunter_quote_parts AS hqp
LEFT JOIN wp_hunter_quotes AS hq
ON ( hq.id = hqp.QuoteID )
WHERE hqp.IsActive = 1)
ORDER BY NAME ASC;
But this duplicates the data from the 1st and 2nd tables, so it shows up twice. I need the data from the 1st and 2nd tables combined into one row (if it exists), but the last table (wp_hunter_quote_parts) should take priority as the source whenever it has a row with IsActive = 1 for that ItemName. If wp_hunter_quote_parts has no IsActive = 1 row for the ItemName, then I would like to combine wp_quantum_purchases and wp_quantum_sales as if they were 1 row.
I cannot use a LEFT JOIN, since data could exist in wp_quantum_purchases but not in wp_quantum_sales, or in wp_quantum_sales but not in wp_quantum_purchases, or in neither of them and only in wp_hunter_quote_parts; it might not even exist in wp_hunter_quote_parts.
So, basically, if ItemName exists in wp_quantum_purchases AND IsActive = 1 AND wp_hunter_quote_parts does not have ItemName in the table, get purchase data from wp_quantum_purchases; else if ItemName exists in wp_hunter_quote_parts, get data from wp_hunter_quote_parts instead.
If ItemName exists in wp_quantum_sales AND IsActive = 1 AND wp_hunter_quote_parts does not have ItemName in the table, get sales data from wp_quantum_sales; else if ItemName exists in wp_hunter_quote_parts, get data from wp_hunter_quote_parts instead.
How can I combine the first and second tables, then do an outer join on the result with another table?
Another attempt:
(SELECT IFNULL(qp.ItemName, qs.ItemName) AS name, IFNULL(qp.TimeAdded, qs.TimeAdded) AS created, qs.SalesDate AS effective, qp.VendorName AS supplier, qp.Source AS source, qp.VendorType AS type, qp.Price AS cost, qs.Price AS price, CAST((((CAST(qs.Price AS DECIMAL(10,2)) - CAST(qp.Price AS DECIMAL(10,2))) / CAST(qp.Price AS DECIMAL(10,2))) * 100) AS DECIMAL(10,2)) AS markup, qs.CustomerName AS customer, qs.CustomerListID AS customerListID, qp.VendorListID AS vendorListID, '' AS itemListID
FROM wp_quantum_purchases AS qp, wp_quantum_sales AS qs
WHERE (qp.IsActive = 1 OR qs.IsActive = 1)
AND NOT EXISTS (
SELECT 1
FROM wp_hunter_quote_parts AS hqp
WHERE (qp.ItemName = hqp.ItemName || qs.ItemName = hqp.ItemName) AND hqp.IsActive = 1
)
)
UNION ALL
(SELECT hqp.ItemName AS name, hq.Quote_Date AS created, hqp.SalesDate AS effective, hqp.VendorName AS supplier, hqp.Source AS source, hqp.VendorType AS type, hqp.Cost AS cost, hqp.Price AS price, CAST((((CAST(hqp.Price AS DECIMAL(10,2)) - CAST(hqp.Cost AS DECIMAL(10,2))) / CAST(hqp.Cost AS DECIMAL(10,2))) * 100) AS DECIMAL(10,2)) AS markup, IFNULL(hq.Customer_FullName, 'N/A') AS customer, hq.Customer_ListID AS customerListID, hqp.VendorListID AS vendorListID, hqp.Item_ListID AS itemListID
FROM wp_hunter_quote_parts AS hqp
LEFT JOIN wp_hunter_quotes AS hq ON (hq.id = hqp.QuoteID)
WHERE (hqp.IsActive = 1))
ORDER BY name ASC
I figured this one would work, but it just keeps running and never seems to finish. There are no errors that I can see. These tables are very small, so that is odd...

I may not be understanding your question fully, but you could create a view of the first two tables and then do an outer join with the third table.
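A minimal sketch of that idea, run against SQLite through Python's sqlite3 module rather than MySQL, with the tables cut down to a few columns and filled with made-up rows. The view name quantum_combined and the origin column are hypothetical. Instead of a literal outer join against the third table, it reuses the NOT EXISTS filter from the question and appends the wp_hunter_quote_parts rows with UNION ALL. It assumes at most one active row per ItemName in each table; otherwise the joins inside the view would multiply rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Cut-down versions of the question's tables (assumed columns, sample data).
CREATE TABLE wp_quantum_purchases (ItemName TEXT, Price REAL, IsActive INTEGER);
CREATE TABLE wp_quantum_sales (ItemName TEXT, Price REAL, IsActive INTEGER);
CREATE TABLE wp_hunter_quote_parts (ItemName TEXT, Cost REAL, Price REAL, IsActive INTEGER);
INSERT INTO wp_quantum_purchases VALUES ('bolt', 1.0, 1), ('gear', 4.0, 1), ('nut', 0.5, 1);
INSERT INTO wp_quantum_sales VALUES ('bolt', 1.5, 1), ('pipe', 9.0, 1), ('nut', 0.8, 1);
INSERT INTO wp_hunter_quote_parts VALUES ('nut', 0.4, 0.9, 1), ('rod', 2.0, 3.0, 1);

-- Hypothetical view: one row per item that is active in purchases and/or sales.
-- The UNION of names plus two LEFT JOINs emulates a full outer join.
CREATE VIEW quantum_combined AS
SELECT n.ItemName AS name, qp.Price AS cost, qs.Price AS price
FROM (
    SELECT ItemName FROM wp_quantum_purchases WHERE IsActive = 1
    UNION
    SELECT ItemName FROM wp_quantum_sales WHERE IsActive = 1
) AS n
LEFT JOIN wp_quantum_purchases AS qp ON qp.ItemName = n.ItemName AND qp.IsActive = 1
LEFT JOIN wp_quantum_sales AS qs ON qs.ItemName = n.ItemName AND qs.IsActive = 1;
""")

QUERY = """
SELECT qc.name AS name, qc.cost AS cost, qc.price AS price, 'quantum' AS origin
FROM quantum_combined AS qc
WHERE NOT EXISTS (SELECT 1
                  FROM wp_hunter_quote_parts AS hqp
                  WHERE hqp.ItemName = qc.name AND hqp.IsActive = 1)
UNION ALL
SELECT hqp.ItemName, hqp.Cost, hqp.Price, 'hunter'
FROM wp_hunter_quote_parts AS hqp
WHERE hqp.IsActive = 1
ORDER BY name ASC
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Note that the second attempt in the question lists wp_quantum_purchases and wp_quantum_sales in the FROM clause with no join condition between them, so every purchase row is paired with every sales row; that may be part of why it appears never to finish.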

Related

JOIN nested subquery returning NULL while updating calculated value

I'm trying to work around the "You can't specify target table for update in FROM clause" MySQL error, which means I've got a nested subquery (temp table). Note, I'm trying to get a SELECT to work before I move on to the actual UPDATE.
What I've got is a ledger table where I'm trying to find a row's calculated unit total from active_units (what the row started with) plus every adjustment to active units. There is one payment row and multiple adjustment rows in the same table, associated by commission ID and schedule number and grouped by month_num. So if active_units = 18 and there are three adjustment rows, one of them with active_units_chg = -2, then I should end up with 18 + -2 = 16.
When I do this:
SELECT
active_units
, active_units_chg
, active_units_total
, CRM_commission_payments.active_units + (
SELECT SUM(active_units_chg)
FROM CRM_commission_payments AS cp
WHERE cp.CRM_commissions_item_id = CRM_commission_payments.CRM_commissions_item_id
AND cp.schedule_a_no = CRM_commission_payments.schedule_a_no
AND cp.month_num = CRM_commission_payments.month_num
AND cp.item_active = 1
) AS active_units_cal1
FROM CRM_commission_payments
WHERE CRM_commission_payments.payment_type = 'payment'
AND CRM_commission_payments.CRM_quotes_item_id = 2457
active_units_cal1 is correct for the row. However, when I do this with a nested JOIN'd subquery, I get NULL for active_units_calc.calc_chg:
SELECT
active_units
, active_units_chg
, active_units_total
, CRM_commission_payments.active_units + (
SELECT SUM(active_units_chg)
FROM CRM_commission_payments AS cp
WHERE cp.CRM_commissions_item_id = CRM_commission_payments.CRM_commissions_item_id
AND cp.schedule_a_no = CRM_commission_payments.schedule_a_no
AND cp.month_num = CRM_commission_payments.month_num
AND cp.item_active = 1
) AS active_units_cal1
, active_units_calc.calc_chg
FROM CRM_commission_payments
LEFT JOIN (
SELECT
source.calc_chg
, source.CRM_commissions_item_id
, source.schedule_a_no
, source.month_num
FROM CRM_commission_payments AS cp1
INNER JOIN (
SELECT
SUM(active_units_chg) AS calc_chg
, cp.CRM_commissions_item_id
, cp.schedule_a_no
, cp.month_num
FROM CRM_commission_payments AS cp
WHERE cp.item_active = 1
) AS source
WHERE source.CRM_commissions_item_id = cp1.CRM_commissions_item_id
AND source.schedule_a_no = cp1.schedule_a_no
AND source.month_num = cp1.month_num
) AS active_units_calc ON (
active_units_calc.CRM_commissions_item_id = CRM_commission_payments.CRM_commissions_item_id
AND active_units_calc.schedule_a_no = CRM_commission_payments.schedule_a_no
AND active_units_calc.month_num = CRM_commission_payments.month_num
)
WHERE CRM_commission_payments.payment_type = 'payment'
AND CRM_commission_payments.CRM_quotes_item_id = 2457
What am I doing wrong?
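One likely culprit is the innermost derived table: it calls SUM() without a GROUP BY, so it collapses to a single row whose key columns need not match the payment row being looked up. A minimal sketch of a grouped version follows, run on SQLite through Python's sqlite3 module with made-up rows. The reduced column list, the sample values, and the assumption that the payment row itself carries active_units_chg = 0 are illustrative, not taken from the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Reduced, assumed version of the ledger table.
CREATE TABLE CRM_commission_payments (
    payment_type TEXT,
    CRM_commissions_item_id INTEGER,
    schedule_a_no INTEGER,
    month_num INTEGER,
    active_units INTEGER,
    active_units_chg INTEGER,
    item_active INTEGER
);
INSERT INTO CRM_commission_payments VALUES
    ('payment',    7, 1, 3, 18,  0, 1),
    ('adjustment', 7, 1, 3,  0, -2, 1),
    ('adjustment', 7, 1, 3,  0,  0, 1),
    ('adjustment', 7, 1, 3,  0,  0, 1),
    ('payment',    8, 1, 3, 10,  0, 1),
    ('adjustment', 8, 1, 3,  0,  5, 1);
""")

QUERY = """
SELECT p.CRM_commissions_item_id,
       p.active_units + COALESCE(calc.calc_chg, 0) AS active_units_calc
FROM CRM_commission_payments AS p
LEFT JOIN (
    -- One summed row per (commission, schedule, month) group.
    SELECT CRM_commissions_item_id, schedule_a_no, month_num,
           SUM(active_units_chg) AS calc_chg
    FROM CRM_commission_payments
    WHERE item_active = 1
    GROUP BY CRM_commissions_item_id, schedule_a_no, month_num
) AS calc
  ON calc.CRM_commissions_item_id = p.CRM_commissions_item_id
 AND calc.schedule_a_no = p.schedule_a_no
 AND calc.month_num = p.month_num
WHERE p.payment_type = 'payment'
ORDER BY p.CRM_commissions_item_id
"""

totals = conn.execute(QUERY).fetchall()
print(totals)
```

With the grouping done inside the derived table, the extra self-join on cp1 from the question is no longer needed.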

Split values then resolve the values to a name

I need to be able to do something with my column (below) that can contain multiple values. The 'HearAboutEvent' column has multiple values separated by a comma. Each one of these values corresponds to an entry in another table. So the value of 11273 will equal facebook, 11274 will mean radio, and 11275 will mean commercial.
The data I am working with looks like this:
weather | ID | MemberID | SubscriptionID | DateEntered | ParticipatedBefore | ParticipatedBeforeCities | WeatherDependent | NonRefundable | TShirtSize | HearAboutEvent
Yes | 24 | 18 | 1 | 2013-12-19 | 0 | NULL | 10950 | 10952 | 10957 | 11273, 11274, 11275
I am able to do the proper join to resolve the value of 'weather'; note that it appears as the first column and is resolved from the 8th column (WeatherDependent).
This is the query I have created so far to resolve the values of WeatherDependent:
SELECT CFS1.Name as 'weather', *
FROM FSM_CustomForm_693 t
LEFT JOIN FSM_CustomFormSelectOptions CFS1 ON CFS1.ID = t.WeatherDependent
where t.ID = 24
Ultimately I need to have the data look like this:
weather | ID | MemberID | SubscriptionID | DateEntered | ParticipatedBefore | ParticipatedBeforeCities | WeatherDependent | NonRefundable | TShirtSize | HearAboutEvent
Yes | 24 | 18 | 1 | 2013-12-19 | 0 | NULL | 10950 | 10952 | 10957 | Facebook, radio, commercial
Things I think you could use to accomplish this are:
A Split TVF FUNCTION - http://msdn.microsoft.com/en-us/library/ms186755.aspx
CROSS APPLY - http://technet.microsoft.com/en-us/library/ms175156.aspx
STUFF & FOR XML PATH - http://msdn.microsoft.com/en-us/library/ms188043.aspx & http://msdn.microsoft.com/en-us/library/ms190922.aspx
Going one step further, you need something like this:
Excuse my profuse use of sub queries.
CREATE FUNCTION dbo.Split (@sep char(1), @s varchar(512))
RETURNS table
AS
RETURN (
WITH Pieces(pn, start, stop) AS (
SELECT 1, 1, CHARINDEX(@sep, @s)
UNION ALL
SELECT pn + 1, stop + 1, CHARINDEX(@sep, @s, stop + 1)
FROM Pieces
WHERE stop > 0
)
SELECT pn,
SUBSTRING(@s, start, CASE WHEN stop > 0 THEN stop-start ELSE 512 END) AS s
FROM Pieces
)
GO
SELECT
O.A,O.B,O.C,O.D,O.E,O.F,O.G,O.H,O.I,O.J,O.Stuffed
FROM (
SELECT
*
,STUFF((
SELECT ', ' + Name
FROM (
SELECT
V.*
,Y.Name
FROM (
SELECT
'Yes' AS A
,24 AS B
,18 AS C
,1 AS D
,'2013-12-19' AS E
,0 AS F
,NULL AS G
,10950 AS H
,10952 AS I
,10957 AS J
,'11273, 11274, 11275' AS K
)
AS V
CROSS APPLY dbo.Split(',',REPLACE(K,' ','')) AS P
JOIN (
SELECT 11273 AS Id , 'Facebook' AS Name UNION ALL
SELECT 11274 AS Id , 'radio' AS Name UNION ALL
SELECT 11275 AS Id , 'commercial' AS Name
)Y ON y.Id = p.s) ExampleTable
FOR XML PATH('')
), 1, 2, '' ) -- strip the leading ', ' (comma and space)
AS [Stuffed]
FROM (
SELECT
V.*
FROM (
SELECT
'Yes' AS A
,24 AS B
,18 AS C
,1 AS D
,'2013-12-19' AS E
,0 AS F
,NULL AS G
,10950 AS H
,10952 AS I
,10957 AS J
,'11273, 11274, 11275' AS K
)
AS V
CROSS APPLY dbo.Split(',',REPLACE(K,' ','')) AS P
JOIN (
SELECT 11273 AS Id , 'Facebook' AS Name UNION ALL
SELECT 11274 AS Id , 'radio' AS Name UNION ALL
SELECT 11275 AS Id , 'commercial' AS Name
)Y ON y.Id = p.s
)Z
) O
GROUP BY O.A,O.B,O.C,O.D,O.E,O.F,O.G,O.H,O.I,O.J,O.K,O.Stuffed
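For comparison, here is a minimal sketch of the same split-then-resolve idea on SQLite through Python's sqlite3 module. It is not a T-SQL solution: a recursive CTE stands in for dbo.Split, and the names are glued back together in Python rather than with STUFF and FOR XML PATH. The two tables are reduced to the columns needed here and loaded with the sample values from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Reduced, assumed versions of the two tables from the question.
CREATE TABLE FSM_CustomForm_693 (ID INTEGER, HearAboutEvent TEXT);
CREATE TABLE FSM_CustomFormSelectOptions (ID INTEGER, Name TEXT);
INSERT INTO FSM_CustomForm_693 VALUES (24, '11273, 11274, 11275');
INSERT INTO FSM_CustomFormSelectOptions VALUES
    (11273, 'Facebook'), (11274, 'radio'), (11275, 'commercial');
""")

SPLIT_QUERY = """
WITH RECURSIVE pieces(form_id, pn, piece, rest) AS (
    -- Seed row: strip spaces and append a trailing comma so every piece ends in ','.
    SELECT ID, 0, '', REPLACE(HearAboutEvent, ' ', '') || ','
    FROM FSM_CustomForm_693
    UNION ALL
    -- Peel one piece off the front of the remaining string per step.
    SELECT form_id, pn + 1,
           substr(rest, 1, instr(rest, ',') - 1),
           substr(rest, instr(rest, ',') + 1)
    FROM pieces
    WHERE rest <> ''
)
SELECT p.form_id, o.Name
FROM pieces AS p
JOIN FSM_CustomFormSelectOptions AS o ON o.ID = CAST(p.piece AS INTEGER)
WHERE p.pn > 0
ORDER BY p.form_id, p.pn
"""

# Collect the resolved names per form row, keeping their original order.
resolved = {}
for form_id, name in conn.execute(SPLIT_QUERY):
    resolved.setdefault(form_id, []).append(name)
hear_about = {form_id: ", ".join(names) for form_id, names in resolved.items()}
print(hear_about)
```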

Join on max(T.<column>) including further information of T

I have two tables
create table item( id int )
insert into item ( id ) values ( 1 ), ( 2 ), ( 3 )
create table itemstatus
(
itemid int
, ts datetime
, "status" int
)
insert into itemstatus ( itemid, ts, status ) values
( 1, '2013-12-01T12:00:00.000', 1 ),
( 1, '2013-12-01T11:00:00.000', 2 ),
( 1, '2014-01-01T12:00:00.000', 1 ),
( 2, '2011-01-01T12:00:00.000', 1 )
I'd like to get all items with the last status set, in this case
1, '2014-01-01T12:00:00.000', 1
2, '2011-01-01T12:00:00.000', 1
3, NULL, NULL
What's the most efficient way to solve this?
I tried a subselect and I get the latest timestamp, but I'm not able to add the status, since this field is not included in an aggregate function or the GROUP BY. If I add it, the results get grouped by status (logically), which means I get too many result rows and would have to add a further condition or subselect.
You may use the Fiddle-link for created tables and testdata. The second query includes the status-field.
Edit:
adding a further join does the trick, but I doubt that's the way to do it.
select
i.*
, d.*
, s.status
from
item i
left join ( select ts = max(ts), itemid from itemstatus group by itemid ) d
on 1 = 1
and i.id = d.itemid
left join itemstatus s
on 1 = 1
and s.itemid = d.itemid
and s.ts = d.ts
See SQL-fiddle for testing.
You can use row_number partitioned by itemid and ordered by ts desc to get the latest registration in itemstatus per itemid.
select I.id,
S.ts,
S.status
from item as I
left outer join (
select S.status,
S.ts,
S.itemid,
row_number() over(partition by S.itemid
order by S.ts desc) as rn
from itemstatus as S
) as S
on I.id = S.itemid and
S.rn = 1
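The row_number() approach can be tried end to end with the question's sample data. This is a minimal sketch on SQLite through Python's sqlite3 module, assuming the bundled SQLite is version 3.25 or later (the first release with window functions). The only change to the query is an added ORDER BY so the output order is stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (id INTEGER);
INSERT INTO item (id) VALUES (1), (2), (3);
CREATE TABLE itemstatus (itemid INTEGER, ts TEXT, status INTEGER);
INSERT INTO itemstatus (itemid, ts, status) VALUES
    (1, '2013-12-01T12:00:00.000', 1),
    (1, '2013-12-01T11:00:00.000', 2),
    (1, '2014-01-01T12:00:00.000', 1),
    (2, '2011-01-01T12:00:00.000', 1);
""")

QUERY = """
SELECT I.id, S.ts, S.status
FROM item AS I
LEFT OUTER JOIN (
    -- Number each item's status rows, newest first; rn = 1 is the latest one.
    SELECT status, ts, itemid,
           row_number() OVER (PARTITION BY itemid ORDER BY ts DESC) AS rn
    FROM itemstatus
) AS S
  ON I.id = S.itemid AND S.rn = 1
ORDER BY I.id
"""

latest = conn.execute(QUERY).fetchall()
for row in latest:
    print(row)
```

Because the rn = 1 condition sits in the ON clause rather than in a WHERE clause, item 3, which has no status rows at all, still appears with NULLs.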

mysql explain result interpretation

The query below does exactly what I expect it to do, is intuitive and doesn't generate intermediary tables. The downside is that it takes a long time to complete.
What I'll do in such cases is break the query in steps and create those intermediary tables & indexes. This time around I'd like to get a better handle on the hints provided by explain, and would appreciate any pointers: what obvious optimization steps am I missing in the query below?
Following the advice in "MySQL query optimization and EXPLAIN for a noob", I've created indices on order_number, order_type and item in orders_raw. It's unclear, however, how these would carry over to character processing/regexes.
SELECT bundle_headers.order_number , bundle_headers.title , digital_subs.subscription_id , 1 as bundle_component
from
(
select order_number , substring( item , 1 , 3 ) as title , quantity from orders_raw
where order_type in (4,6)
) bundle_headers
inner join
(
select order_number , subscription_id , item as title , quantity from orders_raw
where order_type = 0 and length( item ) = 4
) digital_subs
on bundle_headers.order_number = digital_subs.order_number and
digital_subs.title regexp concat( '.*' , bundle_headers.title , '.*' ) and
bundle_headers.quantity = digital_subs.quantity
UNION
SELECT bundle_headers.order_number , bundle_headers.title , print_subs.subscription_id , 1 as bundle_component
from
(
select order_number , substring( item , 1 , 3 ) as title , quantity from orders_raw
where order_type in (4,6)
) bundle_headers
inner join
(
select order_number , subscription_id , item as title , quantity from orders_raw
where order_type = 0 and length( item ) = 3
) print_subs
on bundle_headers.order_number = print_subs.order_number and
print_subs.title regexp concat( '.*' , bundle_headers.title , '.*' ) and
bundle_headers.quantity = print_subs.quantity;
EDIT, @tin tran: I've yet to rigorously time both the query above and your query (after a couple of corrections, copied below), starting out on an idle machine. I did run it, and didn't see an obvious reduction in run time.
SELECT bundle_headers.order_number,
substring(bundle_headers.item,1,3) as title,
subs.subscription_id,
1 as bundle_component
FROM orders_raw bundle_headers
INNER JOIN orders_raw subs ON (bundle_headers.order_number = subs.order_number)
WHERE (bundle_headers.order_type = 4 OR bundle_headers.order_type = 6)
AND subs.order_type = 0
AND bundle_headers.quantity = subs.quantity
AND subs.item LIKE CONCAT('%',substring(bundle_headers.item,1,3),'%')
AND (length(subs.item) = 4 OR length(subs.item) = 3)
Please try this query and see if it produces the same result, and if it's any faster:
SELECT bundle_headers.order_number,substring(bundle_headers.item,1,3) as title,subs.subscription_id,1 as bundle_component
FROM orders_raw bundle_headers
INNER JOIN orders_raw subs ON (bundle_headers.order_number = subs.order_number)
WHERE (bundle_headers.order_type = 4 OR bundle_headers.order_type = 6)
AND subs.order_type = 0
AND bundle_headers.quantity = subs.quantity
AND subs.item LIKE CONCAT('%',substring(bundle_headers.item,1,3),'%')
AND (length(subs.item) = 4 OR length(subs.item) = 3)

Filter out orphan table entries

Suppose there is a table with only two columns (an example is shown below). Every '1' entry should be followed (in the sorted order given below) by a '0'. However, as you can see, in the table, there are some 'orphans' where there are two consecutive '1's.
How can I create a query that returns all the rows, except for the first of any consecutive '1's? (This would reduce the example below from 16 rows to 14)
1 E
0 A
1 T
0 S
1 R
0 E
1 F
0 T
1 G
1 T
0 R
1 X
1 R
0 R
1 E
0 T
I'm going to try to clarify my problem, as I think I simplified it too much above. Imagine one table called logs, with four columns:
user (a string containing a username)
machine (a string uniquely identifying various PCs)
type (event's type: a 1 for login and a 0 for logout)
time (the time of the event being logged)
[The machine/time pair provides a unique key, as no machine can be logged in or out of twice at the same instant. Presumably an 'ID' column could be artificially created based on machine/time sort if needed.]
The idea is that every login event should be accompanied by a logout event. In an ideal world it would be fairly easy to match logins to logouts, and hence analyse the time spent logged in.
However, in the case of a power cut, the logout will not be recorded. Therefore (considering only one machine's data, sorted by time) if there are two login events in a row, we want to ignore the first login, because we don't have any reliable data from it. This is the problem I am trying to solve.
Provided that:
only 1's are dupes, never 0's;
you want to get rid of all but the last 1 in any run of consecutive 1's.
Your text says "except for the first of any consecutive", but I think this is what you want. If there can only ever be 2 in a row, it is the same thing.
SELECT x.*
FROM x
LEFT JOIN x y on y.id = (x.id + 1)
WHERE (x.nr = y.nr) IS NOT TRUE -- OR x.nr = 0
ORDER BY x.id
If you want to preserve double 0's, use the commented clause additionally, but probably not needed.
Edit after question edit:
You may want to add an auto-increment column to your data to make this simpler:
Generate (i.e. write) a row number index column in MySQL
Other RDBMS (PostgreSQL, Oracle, SQL Server, ..) have window functions like row_number() or lag() and lead() that make such an operation much easier.
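To illustrate the window-function route, here is a minimal sketch using lead() on SQLite through Python's sqlite3 module, assuming SQLite 3.25 or later. It loads the 16 sample rows from the question into the logs table for a single made-up machine 'pc1', uses the row position as an integer time, and stores the letter from the second column in user purely as a placeholder. A login is dropped when the next event on the same machine is also a login.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE logs ("user" TEXT, machine TEXT, type INTEGER, "time" INTEGER)'
)

# The 16 (type, code) pairs from the question, in their given order.
events = [
    (1, "E"), (0, "A"), (1, "T"), (0, "S"), (1, "R"), (0, "E"), (1, "F"), (0, "T"),
    (1, "G"), (1, "T"), (0, "R"), (1, "X"), (1, "R"), (0, "R"), (1, "E"), (0, "T"),
]
conn.executemany(
    "INSERT INTO logs VALUES (?, 'pc1', ?, ?)",
    [(code, typ, t) for t, (typ, code) in enumerate(events, start=1)],
)

QUERY = """
SELECT e."user", e.machine, e.type, e."time"
FROM (
    -- Attach the type of the following event on the same machine to each row.
    SELECT l.*,
           LEAD(type) OVER (PARTITION BY machine ORDER BY "time") AS next_type
    FROM logs AS l
) AS e
WHERE e.type = 0                      -- keep every logout
   OR COALESCE(e.next_type, 0) = 0    -- keep a login unless another login follows
ORDER BY e.machine, e."time"
"""

kept = conn.execute(QUERY).fetchall()
dropped_times = set(range(1, len(events) + 1)) - {row[3] for row in kept}
print(len(kept), sorted(dropped_times))
```

The COALESCE matters for the last event on a machine: its next_type is NULL, and without the default the comparison would evaluate to NULL and silently drop the row.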
Assuming you get an id (add column, set column id = record number in database) use:
select a.*
from the_table a
left join the_table b on b.id = a.id + 1
and b.col1 = 0
where a.col1 = 1
and b.id is null
Try:
select l.*
from logs l
where l.type = 0 or
not coalesce((select n.type -- type of the next event for the same machine and user
from logs n
where n.machine = l.machine and
n.user = l.user and
n.time > l.time
order by n.`time` asc
limit 1), 0)
Using a CTE to separate the lag logic from the selection criteria:
DROP TABLE tmp.bits;
CREATE TABLE tmp.bits
( id SERIAL NOT NULL
, bit INTEGER NOT NULL
, code CHAR(1)
);
INSERT INTO tmp.bits(bit, code) VALUES
(1, 'T' )
, (0, 'S' )
, (1, 'R' )
, (0, 'E' )
, (1, 'F' )
, (0, 'T' )
, (1, 'G' )
, (1, 'T' )
, (0, 'R' )
, (1, 'X' )
, (1, 'R' )
, (0, 'R' )
, (1, 'E' )
, (0, 'T' )
;
SET search_path='tmp';
SELECT * FROM bits;
-- EXPLAIN ANALYZE
WITH prevnext AS (
SELECT
bt.id AS thisid
, bt.bit AS thisbit
, bt.code AS thiscode
, bp.bit AS prevbit
, bp.code AS prevcode
FROM bits bt
LEFT JOIN bits bp ON (bt.id > bp.id)
AND NOT EXISTS ( SELECT * FROM bits nx
WHERE nx.id > bp.id
AND nx.id < bt.id
)
)
SELECT thisid, thisbit, thiscode
FROM prevnext
WHERE thisbit=0
OR prevbit IS NULL OR thisbit <> prevbit
;
EDIT:
For those poor souls who cannot use CTEs, it is easy to create a view instead:
CREATE VIEW prevnext AS (
SELECT
bt.id AS thisid
, bt.bit AS thisbit
,bt.code AS thiscode
, bp.bit AS prevbit
, bp.code AS prevcode
FROM bits bt
LEFT JOIN bits bp ON (bt.id > bp.id)
AND NOT EXISTS ( SELECT * FROM bits nx
WHERE nx.id > bp.id
AND nx.id < bt.id
)
)
;
SELECT thisid, thisbit, thiscode
FROM prevnext
WHERE thisbit=0
OR prevbit IS NULL OR thisbit <> prevbit
;