Join 2 tables on dynamically changing column - mysql

I have these 2 tables (part of an inventory feature):
-- ins table
+------+-------------+-------------+
| id | direction | quantity |
+------+-------------+-------------+
| 1 | in | 5 |
| 2 | in | 3 |
+------+-------------+-------------+
-- outs table
+------+-------------+-------------+
| id | direction | quantity |
+------+-------------+-------------+
| 1 | out | 4 |
| 2 | out | 1 |
| 3 | out | 2 |
| 4 | out | 1 |
+------+-------------+-------------+
How can I join rows from the outs table to the row from the ins table whose quantity covers (equals the sum of) the quantities of the outs rows joined to it? In other words, how do I get a result like this?
-- result
+------+-------------+-------------+------+-------------+-------------+
| id | direction | quantity | id | direction | quantity |
+------+-------------+-------------+------+-------------+-------------+
| 1 | out | 4 | 1 | in | 5 |
| 2 | out | 1 | 1 | in | 5 |
| 3 | out | 2 | 2 | in | 3 |
| 4 | out | 1 | 2 | in | 3 |
+------+-------------+-------------+------+-------------+-------------+
As you can see, rows 1 and 2 from the outs table are taken from (joined to) row 1 of the ins table, and rows 3 and 4 from the outs table are taken from (joined to) row 2 of the ins table.
NOTE: the quantities in the 2 tables are guaranteed to line up exactly (a row from the ins table always has a quantity exactly equal to the sum of the quantities of 1 or more rows from the outs table).
I wish I could just do something like this:
-- pseudo SQL
SELECT
whatever
FROM
outs left join
ins on outs.quantity <= (ins.quantity - previously joined outs.quantities);

This is painful to do in MySQL for a couple of reasons. First, MySQL (before version 8.0, which added window functions) doesn't have very good support for cumulative sums, which is what you want to compare.
And second, your result set is a little bit weak. It makes more sense to show all the ins records that contribute to each outs record, not just one of them.
For this purpose, you can use a join on cumulative sums, which looks like this:
select o.*, (o.to_quantity - o.quantity) as from_quantity,
       i.*
from (select o.*,
             (select sum(o2.quantity)
              from outs o2
              where o2.id <= o.id
             ) as to_quantity
      from outs o
     ) o join
     (select i.*,
             (select sum(i2.quantity)
              from ins i2
              where i2.id <= i.id
             ) as to_quantity
      from ins i
     ) i
     on (o.to_quantity - o.quantity) < i.to_quantity and
        o.to_quantity > (i.to_quantity - i.quantity)
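To make the cumulative-sum join concrete, here is a minimal sketch that runs the same idea against the sample data from the question. It uses Python's built-in sqlite3 module as a stand-in for MySQL (an assumption made only so the example is runnable); the inner aliases o1 and i1 are illustrative, and the join condition is the one from the query above.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ins  (id INTEGER PRIMARY KEY, direction TEXT, quantity INTEGER);
    CREATE TABLE outs (id INTEGER PRIMARY KEY, direction TEXT, quantity INTEGER);
    INSERT INTO ins  VALUES (1, 'in', 5), (2, 'in', 3);
    INSERT INTO outs VALUES (1, 'out', 4), (2, 'out', 1), (3, 'out', 2), (4, 'out', 1);
""")

# Each row gets a running total (to_quantity). An outs row covers the interval
# (to_quantity - quantity, to_quantity]; it joins every ins row whose own
# interval overlaps that one.
query = """
SELECT o.id, o.quantity, i.id, i.quantity
FROM (SELECT o1.*,
             (SELECT SUM(o2.quantity) FROM outs o2 WHERE o2.id <= o1.id) AS to_quantity
      FROM outs o1) AS o
JOIN (SELECT i1.*,
             (SELECT SUM(i2.quantity) FROM ins i2 WHERE i2.id <= i1.id) AS to_quantity
      FROM ins i1) AS i
  ON (o.to_quantity - o.quantity) < i.to_quantity
 AND o.to_quantity > (i.to_quantity - i.quantity)
ORDER BY o.id, i.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)  # (out id, out quantity, in id, in quantity)
```

With the sample data, outs rows 1 and 2 pair with ins row 1, and outs rows 3 and 4 pair with ins row 2, which matches the result table in the question.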

A correlated-subquery approach might also be useful:
select t.id, t.direction, t.quantity, i.id, i.direction, i.quantity
from (
select id, direction, quantity,
quantity + coalesce((select quantity from outs where id < o.id order by id desc limit 1),
(select quantity from outs where id > o.id order by id limit 1)) Qty
from outs o
)t inner join ins i on i.quantity = t.Qty

Related

how to get AVG for every record in SQL

I need to get the AVG for every course in SQL. For example:
This is the first table:
+-----------+-------------+
| course_id | course_name |
+-----------+-------------+
| 1         | a           |
| 2         | b           |
| 3         | c           |
| 4         | g           |
+-----------+-------------+
This is the second table:
+--------------------+------+-----------+
| course_feedback_id | rate | course_id |
+--------------------+------+-----------+
| 1                  | 4    | 1         |
| 2                  | 3    | 1         |
| 3                  | 2    | 2         |
+--------------------+------+-----------+
I need to get the AVG of rate for both course_id 1 and 2.
This is the final result that I need:
+-----------+-----------+
| course_id | AVG(rate) |
+-----------+-----------+
| 1         | 3.5       |
| 2         | 2         |
+-----------+-----------+
I tried this solution, but it gives me only the first row, not all records:
SELECT *, AVG(`rate`) from secondTable
Please help.
SELECT `id`, AVG(`rate`) FROM `your_table` GROUP BY `id`
Try this:
SELECT c.course_id, AVG(fb.rate)
FROM course AS c
INNER JOIN course_feedback AS fb ON fb.course_id = c.course_id
GROUP BY c.course_id
SELECT t1.course_id, t2.rate FROM table1 t1 JOIN (SELECT course_id, AVG(rate) AS rate FROM table2 GROUP BY course_id) t2 ON t2.course_id = t1.course_id
When an id has multiple (redundant) entries and you want an aggregate for each id, as in this case, use GROUP BY. As the name says, GROUP BY groups the records by the column it is applied to, and an aggregate such as AVG is then computed per group rather than over the whole table. For example, id 1 has 2 entries, so AVG is applied to just those 2 entries, and likewise for every other group.
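The difference between an aggregate with and without GROUP BY can be seen in a minimal sketch on the question's sample data. SQLite (via Python's sqlite3) stands in for MySQL here, which is an assumption made for runnability; the table names course and course_feedback are the ones used in the join answer above.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table names follow the join answer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE course (course_id INTEGER PRIMARY KEY, course_name TEXT);
    CREATE TABLE course_feedback (
        course_feedback_id INTEGER PRIMARY KEY,
        rate INTEGER,
        course_id INTEGER
    );
    INSERT INTO course VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'g');
    INSERT INTO course_feedback VALUES (1, 4, 1), (2, 3, 1), (3, 2, 2);
""")

# Without GROUP BY the whole table is one group, so only one row comes back.
overall = conn.execute("SELECT AVG(rate) FROM course_feedback").fetchall()

# With GROUP BY there is one group, and so one average, per course_id.
per_course = conn.execute("""
    SELECT course_id, AVG(rate)
    FROM course_feedback
    GROUP BY course_id
    ORDER BY course_id
""").fetchall()

print(overall)     # a single row covering every rating
print(per_course)  # one row per course that has feedback
```

Courses 3 and 4 have no feedback rows, so they do not appear; a LEFT JOIN from course would be needed to list them with a NULL average.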

MySQL - Return Latest Date and Total Sum from two rows in a column for multiple entries

For every ID_Number, there is a bill_date and then two types of bills that happen. I want to return the latest date (max date) for each ID number and then add together the two types of bill amounts. So, based on the table below, it should return:
| 1 | 201604 | 10.00 |
| 2 | 201701 | 28.00 |
tbl_charges
+-----------+-----------+-----------+--------+
| ID_Number | Bill_Date | Bill_Type | Amount |
+-----------+-----------+-----------+--------+
| 1 | 201601 | A | 5.00 |
| 1 | 201601 | B | 7.00 |
| 1 | 201604 | A | 4.00 |
| 1 | 201604 | B | 6.00 |
| 2 | 201701 | A | 15.00 |
| 2 | 201701 | B | 13.00 |
+-----------+-----------+-----------+--------+
Then, if possible, I want to be able to do this in a join in another query, using ID_Number as the column for the join. Would that change the query here?
Note: I am initially only wanting to run the query for about 200 distinct ID_Numbers out of about 10 million. I will be adding an 'IN' clause for those IDs. When I do the join for the final product, I will need to know how to get those latest dates out of all the other join possibilities. (ie, how do I get ID_Number 1 to join with 201604 and not 201601?)
I would use NOT EXISTS and GROUP BY
select t1.id_number, max(t1.bill_date), sum(t1.amount)
from tbl_charges t1
where not exists (
select 1
from tbl_charges t2
where t1.id_number = t2.id_number and
t1.bill_date < t2.bill_date
)
group by t1.id_number
The NOT EXISTS filters out the rows that are not on the latest date, and the GROUP BY does the sum.
I would be inclined to filter in the where:
select id_number, sum(c.amount)
from tbl_charges c
where c.bill_date = (select max(c2.bill_date)
from tbl_charges c2
where c2.id_number = c.id_number and c2.bill_type = c.bill_type
)
group by id_number;
Or, another fun way is to use in with tuples:
select id_number, sum(c.amount)
from tbl_charges c
where (c.id_number, c.bill_type, c.bill_date) in
(select c2.id_number, c2.bill_type, max(c2.bill_date)
from tbl_charges c2
group by c2.id_number, c2.bill_type
)
group by id_number;
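The question also asks how to reuse this in a join. One possible way, sketched below under assumptions, is to compute the latest bill_date per id_number in a derived table and join back to it; the alias latest is hypothetical, and SQLite (via Python's sqlite3) stands in for MySQL so the example can run.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; "latest" is an illustrative alias.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_charges (
        id_number INTEGER, bill_date INTEGER, bill_type TEXT, amount REAL
    );
    INSERT INTO tbl_charges VALUES
        (1, 201601, 'A', 5.00), (1, 201601, 'B', 7.00),
        (1, 201604, 'A', 4.00), (1, 201604, 'B', 6.00),
        (2, 201701, 'A', 15.00), (2, 201701, 'B', 13.00);
""")

# The derived table finds the latest bill_date per id_number; joining back on
# both columns keeps only the charges from that date, which are then summed.
query = """
SELECT c.id_number, c.bill_date, SUM(c.amount) AS total
FROM tbl_charges c
JOIN (SELECT id_number, MAX(bill_date) AS bill_date
      FROM tbl_charges
      GROUP BY id_number) latest
  ON latest.id_number = c.id_number
 AND latest.bill_date = c.bill_date
WHERE c.id_number IN (1, 2)
GROUP BY c.id_number, c.bill_date
ORDER BY c.id_number
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)  # (id_number, latest bill_date, summed amount)
```

Because the result has one row per id_number, the whole query can itself be used as a derived table and joined to other tables on id_number.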

Identifying the pairs of ID's in a column with the highest number of matches in SQL

I am trying to find the pairs of businesses with the highest number of common customers using MySQL.
The table is like the following:
+------------+------------+
| BusinessID | CustomerID |
+------------+------------+
| A | 1 |
| A | 2 |
| A | 3 |
| B | 4 |
| B | 1 |
| B | 3 |
| B | 2 |
| C | 3 |
| C | 4 |
| C | 5 |
+------------+------------+
And I want the output to be the pairs of businesses and the number of common customers, like this:
+-------------+-------------+------------------------+
| BusinessID | BusinessID | Common Customers Count |
+-------------+-------------+------------------------+
| A | B | 3 |
| A | C | 1 |
| B | C | 2 |
+-------------+-------------+------------------------+
This is the query I wrote:
SELECT a.BusinessID,b.BusinessID,COUNT(*) AS ncom
FROM (SELECT BusinessID, CustomerID FROM MYTABLE) AS a JOIN
(SELECT BusinessID,CustomerID FROM MYTABLE) AS b
ON a.BusinessID < b.BusinessID AND a.CustomerID = b.CustomerID
GROUP BY a.BusinessID, b.BusinessID
ORDER BY ncom
The problem is that my dataset has about 5m rows, and this seems to be too inefficient on large datasets. I tested the query on smaller datasets by limiting the data -- it took 8 seconds to process 10k rows and 30 seconds for 20k rows, so this query wouldn't be feasible to run for 5m rows. How else can I write the query to make it faster?
Don't use subqueries to get the columns from the table; that's probably preventing MySQL from using indexes.
SELECT a.BusinessID, b.BusinessID, COUNT(*) as ncom
FROM MYTABLE AS a
JOIN MYTABLE AS b ON a.BusinessID < b.BusinessID AND a.CustomerID = b.CustomerID
GROUP BY a.BusinessID, b.BusinessID
ORDER BY ncom
Also, give the table the following index:
CREATE INDEX ix_cust_bus ON MYTABLE (CustomerID, BusinessID);
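As a small check that the direct self-join produces the counts the question expects, here is a sketch on the sample data, with the suggested index created first. SQLite (via Python's sqlite3) stands in for MySQL, so it says nothing about MySQL's query plan or timing; it only verifies the logic. The ordering is switched to descending so the pairs with the most common customers come first, which is an assumption about the desired output.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; this checks logic, not performance.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MYTABLE (BusinessID TEXT, CustomerID INTEGER);
    INSERT INTO MYTABLE VALUES
        ('A', 1), ('A', 2), ('A', 3),
        ('B', 4), ('B', 1), ('B', 3), ('B', 2),
        ('C', 3), ('C', 4), ('C', 5);
    CREATE INDEX ix_cust_bus ON MYTABLE (CustomerID, BusinessID);
""")

# Join the table to itself on CustomerID; a.BusinessID < b.BusinessID keeps
# each unordered pair of businesses exactly once.
query = """
SELECT a.BusinessID, b.BusinessID, COUNT(*) AS ncom
FROM MYTABLE AS a
JOIN MYTABLE AS b
  ON a.BusinessID < b.BusinessID AND a.CustomerID = b.CustomerID
GROUP BY a.BusinessID, b.BusinessID
ORDER BY ncom DESC, a.BusinessID
"""
pairs = conn.execute(query).fetchall()
for pair in pairs:
    print(pair)  # (business, other business, common customer count)
```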

MySQL JOIN with LIMIT query results

I have 2 tables, products and origins
Products:
p_id | name | origin_id
------------------------
1 | P1 | 1
2 | P2 | 2
3 | P3 | 1
Origins:
o_id | name
-------------
1 | O1
2 | O2
I am using the following query :
SELECT * FROM `products` LEFT OUTER JOIN `origins`
ON ( `products`.`origin_id` = `origins`.`o_id` ) LIMIT 2
I am getting the below results
p_id | name | origin_id | o_id | name
-----------------------------------------
1 | P1 | 1 | 1 | O1
3 | P3 | 1 | 1 | O1
I was wondering how the LEFT OUTER JOIN affects the result where I am getting the first and the third row rather than the first and the second row?
When you don't use an ORDER BY clause, there is no guaranteed order for the rows your SELECT query returns.
So use ORDER BY whenever you need a specific order.
See this: MySQL Ref: What is The Default Sort Order of SELECT with no ORDER BY Clause
You don't control the inherent ordering of rows in a table. It behaves like a set. If you want to order it, use order by clause.
SELECT * FROM `products` p LEFT OUTER JOIN `origins` o
ON ( p.`origin_id` = o.`o_id` ) ORDER BY p.`name` LIMIT 2
Output :
p_id | name | origin_id | o_id | name
-----------------------------------------
1 | P1 | 1 | 1 | O1
2 | P2 | 2 | 2 | O2
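A minimal sketch of the same point, using the question's tables: once an ORDER BY on a unique column is added, LIMIT 2 always picks the same two rows. SQLite (via Python's sqlite3) stands in for MySQL here, and ordering by p_id rather than name is an illustrative choice.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; ordering by p_id is illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE origins  (o_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE products (p_id INTEGER PRIMARY KEY, name TEXT, origin_id INTEGER);
    INSERT INTO origins  VALUES (1, 'O1'), (2, 'O2');
    INSERT INTO products VALUES (1, 'P1', 1), (2, 'P2', 2), (3, 'P3', 1);
""")

# ORDER BY on a unique column fixes the row order before LIMIT is applied,
# so the result no longer depends on how the engine executes the join.
query = """
SELECT p.p_id, p.name, p.origin_id, o.o_id, o.name
FROM products p
LEFT OUTER JOIN origins o ON p.origin_id = o.o_id
ORDER BY p.p_id
LIMIT 2
"""
first_two = conn.execute(query).fetchall()
for row in first_two:
    print(row)
```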

Select SUM from multiple tables for every record in MySQL table

I have a table with the main invoice data, and two tables with invoice items:
items which are based on hourly work, with an hourly rate and a number of hours
items which are products, with a unit count and a unit price
For the invoice overview page, I'd like to retrieve all invoices and their total amounts with one query.
A simplified schema
invoices_main
| invoice_id |
| 1 |
| 2 |
| 3 |
invoices_items_products
| item_id | invoice_id | item_count | item_unit_price |
| 1 | 1 | 1 | 999.95 |
| 2 | 1 | 20 | 49.50 |
| 3 | 2 | 3 | 15.00 |
| 4 | 2 | 5 | 5.00 |
| 5 | 3 | 2 | 150.00 |
invoices_items_hours
| item_id | invoice_id | item_hours | item_hourly_rate |
| 1 | 1 | 3.50 | 90.00 |
| 2 | 1 | 1.00 | 140.00 |
| 3 | 2 | 12.00 | 90.00 |
| 4 | 3 | 1.50 | 90.00 |
With the help of this question, I've constructed the following query:
SELECT
I.invoice_id,
IFNULL(
SUM(ROUND(P.item_unit_price * P.item_count, 2)),
0
) + IFNULL(
SUM(ROUND(H.item_hourly_rate * H.item_hours, 2)),
0
) AS invoice_total_amount
FROM
invoices_main I
LEFT JOIN invoices_items_products P ON I.invoice_id = P.invoice_id
LEFT JOIN invoices_items_hours H ON I.invoice_id = H.invoice_id
GROUP BY
I.invoice_id
It kind of works, but if an invoice has both products and hourly items, with multiple entries for at least one of the two, items are duplicated by the joins and the total amount becomes far too high.
Thus, in the example schema above, it goes wrong for invoice_id 1 and 2, but works for 3.
How can I retrieve a list of invoices with their respective total amounts, in a way that works even if an invoice has multiple products and multiple hourly items?
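The duplication described above can be made visible by counting the rows the two LEFT JOINs produce per invoice. The sketch below does that on the sample data; SQLite (via Python's sqlite3) stands in for MySQL, and the hourly table is named invoices_items_hours as in the query.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table names follow the query above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices_main (invoice_id INTEGER PRIMARY KEY);
    CREATE TABLE invoices_items_products (
        item_id INTEGER, invoice_id INTEGER, item_count INTEGER, item_unit_price REAL
    );
    CREATE TABLE invoices_items_hours (
        item_id INTEGER, invoice_id INTEGER, item_hours REAL, item_hourly_rate REAL
    );
    INSERT INTO invoices_main VALUES (1), (2), (3);
    INSERT INTO invoices_items_products VALUES
        (1, 1, 1, 999.95), (2, 1, 20, 49.50), (3, 2, 3, 15.00),
        (4, 2, 5, 5.00), (5, 3, 2, 150.00);
    INSERT INTO invoices_items_hours VALUES
        (1, 1, 3.50, 90.00), (2, 1, 1.00, 140.00),
        (3, 2, 12.00, 90.00), (4, 3, 1.50, 90.00);
""")

# Joining both item tables to the invoice pairs every product row with every
# hourly row of the same invoice, so the row count is their product.
query = """
SELECT I.invoice_id, COUNT(*) AS joined_rows
FROM invoices_main I
LEFT JOIN invoices_items_products P ON I.invoice_id = P.invoice_id
LEFT JOIN invoices_items_hours H ON I.invoice_id = H.invoice_id
GROUP BY I.invoice_id
ORDER BY I.invoice_id
"""
row_counts = conn.execute(query).fetchall()
print(row_counts)  # invoice 1: 2 products x 2 hourly items, and so on
```

Invoice 1 yields 4 joined rows and invoice 2 yields 2, so their item amounts are summed more than once; invoice 3 has one item of each kind and is unaffected.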
Try replacing both left joins with correlated subqueries instead.
SELECT
I.invoice_id,
IFNULL
(
(
SELECT SUM(ROUND(H.item_hourly_rate * H.item_hours, 2))
FROM invoices_items_hours AS H
WHERE H.invoice_id = I.invoice_id
)
, 0
) +
IFNULL
(
(
SELECT SUM(ROUND(P.item_unit_price * P.item_count, 2))
FROM invoices_items_products AS P
WHERE P.invoice_id = I.invoice_id
)
, 0
) AS invoice_total_amount
FROM invoices_main AS I
GROUP BY I.invoice_id
As mentioned in the comments, you should sum up the revenue in each table per invoice_id before doing the join. If you're looking to get the revenue from both of these places then you can add (B.unit_revenue + C.hourly_revenue) total_revenue to the first SELECT statement below.
SELECT A.invoice_id, B.unit_revenue, C.hourly_revenue
FROM invoices_main AS A
JOIN (
    SELECT invoice_id, SUM(item_count * item_unit_price) unit_revenue
    FROM invoices_items_products
    GROUP BY invoice_id
) B ON A.invoice_id = B.invoice_id
JOIN (
    SELECT invoice_id, SUM(item_hours * item_hourly_rate) hourly_revenue
    FROM invoices_items_hours
    GROUP BY invoice_id
) C ON A.invoice_id = C.invoice_id
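A runnable sketch of the aggregate-then-join idea on the question's sample data follows. It differs from the query above in two ways that are assumptions on my part: it uses LEFT JOIN with COALESCE so an invoice lacking one kind of item would still be listed, and it adds the two sums into one total. SQLite (via Python's sqlite3) stands in for MySQL, and the aliases product_total and hourly_total are illustrative.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; aliases are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoices_main (invoice_id INTEGER PRIMARY KEY);
    CREATE TABLE invoices_items_products (
        item_id INTEGER, invoice_id INTEGER, item_count INTEGER, item_unit_price REAL
    );
    CREATE TABLE invoices_items_hours (
        item_id INTEGER, invoice_id INTEGER, item_hours REAL, item_hourly_rate REAL
    );
    INSERT INTO invoices_main VALUES (1), (2), (3);
    INSERT INTO invoices_items_products VALUES
        (1, 1, 1, 999.95), (2, 1, 20, 49.50), (3, 2, 3, 15.00),
        (4, 2, 5, 5.00), (5, 3, 2, 150.00);
    INSERT INTO invoices_items_hours VALUES
        (1, 1, 3.50, 90.00), (2, 1, 1.00, 140.00),
        (3, 2, 12.00, 90.00), (4, 3, 1.50, 90.00);
""")

# Each derived table collapses its items to one row per invoice before the
# join, so the joins can no longer multiply rows.
query = """
SELECT m.invoice_id,
       ROUND(COALESCE(p.product_total, 0) + COALESCE(h.hourly_total, 0), 2)
           AS invoice_total_amount
FROM invoices_main m
LEFT JOIN (SELECT invoice_id,
                  SUM(ROUND(item_unit_price * item_count, 2)) AS product_total
           FROM invoices_items_products
           GROUP BY invoice_id) p ON p.invoice_id = m.invoice_id
LEFT JOIN (SELECT invoice_id,
                  SUM(ROUND(item_hourly_rate * item_hours, 2)) AS hourly_total
           FROM invoices_items_hours
           GROUP BY invoice_id) h ON h.invoice_id = m.invoice_id
ORDER BY m.invoice_id
"""
totals = dict(conn.execute(query).fetchall())
for invoice_id, total in totals.items():
    print(invoice_id, total)
```

For money columns a DECIMAL type is usually preferable to floating point in MySQL; REAL is used here only because SQLite has no fixed-point type.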