How to sum values of two tables and group by date - mysql

I am building a trading system where users need to know their running account balance by date for a specific user (uid) including how much they made from trading (results table) and how much they deposited or withdrew from their accounts (adjustments table).
Here is the sqlfiddle and tables: http://sqlfiddle.com/#!9/6bc9e4/1
Adjustments table:
+-------+-----+-----+--------+------------+
| adjid | aid | uid | amount | date |
+-------+-----+-----+--------+------------+
| 1 | 1 | 1 | 20 | 2019-08-18 |
| 2 | 1 | 1 | 50 | 2019-08-21 |
| 3 | 1 | 1 | 40 | 2019-08-21 |
| 4 | 1 | 1 | 10 | 2019-08-19 |
+-------+-----+-----+--------+------------+
Results table:
+-----+-----+-----+--------+-------+------------+
| tid | uid | aid | amount | taxes | date |
+-----+-----+-----+--------+-------+------------+
| 1 | 1 | 1 | 100 | 3 | 2019-08-19 |
| 2 | 1 | 1 | -50 | 1 | 2019-08-20 |
| 3 | 1 | 1 | 100 | 2 | 2019-08-21 |
| 4 | 1 | 1 | 100 | 2 | 2019-08-21 |
+-----+-----+-----+--------+-------+------------+
How do I get the below results for uid (1)
+--------------+------------+------------------+----------------+------------+
| ResultsTotal | TaxesTotal | AdjustmentsTotal | RunningBalance | Date |
+--------------+------------+------------------+----------------+------------+
| - | - | 20 | 20 | 2019-08-18 |
| 100 | 3 | 10 | 133 | 2019-08-19 |
| -50 | 1 | - | 84 | 2019-08-20 |
| 200 | 4 | 90 | 378 | 2019-08-21 |
+--------------+------------+------------------+----------------+------------+
Where RunningBalance is the current account balance for the particular user (uid).
Based on @Gabriel's answer, I came up with the following, but it gives me an empty balance and duplicate records:
SELECT SUM(ResultsTotal), SUM(TaxesTotal), SUM(AdjustmentsTotal), @runningtotal:= @runningtotal+SUM(ResultsTotal)+SUM(TaxesTotal)+SUM(AdjustmentsTotal) as Balance, date
FROM (
SELECT 0 AS ResultsTotal, 0 AS TaxesTotal, adjustments.amount AS AdjustmentsTotal, adjustments.date
FROM adjustments LEFT JOIN results ON (results.uid=adjustments.uid) WHERE adjustments.uid='1'
UNION ALL
SELECT results.amount AS ResultsTotal, taxes AS TaxesTotal, 0 as AdjustmentsTotal, results.date
FROM results LEFT JOIN adjustments ON (results.uid=adjustments.uid) WHERE results.uid='1'
) unionTable
GROUP BY DATE ORDER BY date

For what you are asking, you want to UNION the rows from both tables and then group the combined rows by date; this should give the results you want. However, I recommend calculating the running balance outside of MySQL, since doing it in SQL adds some complexity to the query.
Weird things could start to happen, for example, if someone has already defined the @runningBalance variable earlier in the same session.
SELECT aggregateTable.*, @runningBalance := IFNULL(@runningBalance, 0) + TOTAL AS RunningBalance
FROM (
SELECT SUM(ResultsTotal), SUM(TaxesTotal), SUM(AdjustmentsTotal)
, SUM(ResultsTotal) + SUM(TaxesTotal) + SUM(AdjustmentsTotal) as TOTAL
, date
FROM (
SELECT 0 AS ResultsTotal, 0 AS TaxesTotal, amount AS AdjustmentsTotal, date
FROM adjustments
WHERE uid = 1 -- restrict to the user from the question
UNION ALL
SELECT amount AS ResultsTotal, taxes AS TaxesTotal, 0 as AdjustmentsTotal, date
FROM results
WHERE uid = 1 -- same user filter on the second branch
) unionTable
GROUP BY date
) aggregateTable
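
As a rough illustration of the "calculate it outside of MySQL" suggestion, here is a minimal Python sketch. It assumes the rows for uid 1 have already been fetched; the function name `running_balance` and the tuple layouts are made up for this example. It adds results, taxes and adjustments per day, exactly as TOTAL does in the query, and carries the balance forward in date order.

```python
from collections import defaultdict

# Sample rows for uid 1, copied from the question.
# adjustments: (amount, date); results: (amount, taxes, date)
adjustments = [(20, "2019-08-18"), (50, "2019-08-21"),
               (40, "2019-08-21"), (10, "2019-08-19")]
results = [(100, 3, "2019-08-19"), (-50, 1, "2019-08-20"),
           (100, 2, "2019-08-21"), (100, 2, "2019-08-21")]


def running_balance(adjustments, results):
    """Return (date, results, taxes, adjustments, balance) rows ordered by date."""
    per_day = defaultdict(lambda: [0, 0, 0])  # [results, taxes, adjustments]
    for amount, day in adjustments:
        per_day[day][2] += amount
    for amount, taxes, day in results:
        per_day[day][0] += amount
        per_day[day][1] += taxes

    balance = 0
    rows = []
    for day in sorted(per_day):  # ISO date strings sort chronologically
        res, tax, adj = per_day[day]
        balance += res + tax + adj  # same sum as TOTAL in the query
        rows.append((day, res, tax, adj, balance))
    return rows


for row in running_balance(adjustments, results):
    print(row)
```

With the sample data this gives balances of 20, 133, 84 and 378, matching the RunningBalance column the question asks for.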

Related

How to get Total Overdraft amounts from a particular Date in SQL

I'm trying to get the total overdraft amount as of a past date; the goal is to get the total as it was on the 31st of January.
I have the following tables Users and Transactions.
USERS (currently)
| user_id | name | account_balance |
|---------|---------|------------------|
| 1 | Wells | 1.00 |
| 2 | John | -10.00 |
| 3 | Sahar | -5.00 |
| 4 | Peter | 1.00 |
TRANSACTIONS (daily transactions, can go back in time)
| trans_id | user_id | amount_tendered | trans_datetime |
|------------|---------|-------------------|---------------------|
| 1 | 1 | 2 | 2021-02-16 |
| 2 | 2 | 3 | 2021-02-16 |
| 3 | 3 | 5 | 2021-02-16 |
| 4 | 4 | 2 | 2021-02-16 |
| 5 | 1 | 10 | 2021-02-15 |
so the current total overdraft amount is
SELECT sum(account_balance) AS O_D_Amount
FROM users
WHERE account_balance < 0;
| O_D_Amount |
|------------|
| -15 |
I need help rolling this amount back to a date in the past.
Assuming overdrafts are based on the sum of transactions up to a point, you can use a subquery:
select sum(total) as total_overdraft
from (select user_id, sum(amount_tendered) as total
from transactions t
where t.trans_datetime <= ?
group by user_id
) t
where total < 0;
The ? is a parameter placeholder for the date/time you care about.
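
The same idea can be sketched in Python, under the answer's assumption that a balance is the sum of a user's signed transactions up to the cutoff. The transaction rows below are hypothetical (the sample data in the question contains no negative amounts), and `overdraft_as_of` is an illustrative name only.

```python
from collections import defaultdict
from datetime import date

# Hypothetical signed transactions: (user_id, amount, trans_date).
transactions = [
    (1, 10, date(2021, 1, 10)),
    (2, -25, date(2021, 1, 20)),
    (3, -5, date(2021, 1, 31)),
    (2, 15, date(2021, 2, 16)),  # after a 31 January cutoff, so ignored there
]


def overdraft_as_of(transactions, cutoff):
    """Sum the negative per-user totals of all transactions up to cutoff."""
    totals = defaultdict(int)
    for user_id, amount, when in transactions:
        if when <= cutoff:
            totals[user_id] += amount
    return sum(total for total in totals.values() if total < 0)


print(overdraft_as_of(transactions, date(2021, 1, 31)))
```

The filter on `when <= cutoff` plays the role of the `?` parameter, and the final `sum` corresponds to the outer `where total < 0`.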

Query with dynamic date intervals

Given a statuses table that holds information about products availability, how do I select the date that corresponds to the 1st day in the latest 20 days that the product has been active?
Yes I know the question is hard to follow. I think another way to put it would be: I want to know how many times each product has been sold in the last 20 days that it was active, meaning the product could have been active for years, but I'd only want the sales count from the latest 20 days that it had a status of "active".
It's something easily doable in the server-side (i.e. getting any collection of products from the DB, iterating them, performing n+1 queries on the statuses table, etc), but I have hundreds of thousands of items so it's imperative to do it in SQL for performance reasons.
table : products
+-------+-----------+
| id | name |
+-------+-----------+
| 1 | Apple |
| 2 | Banana |
| 3 | Grape |
+-------+-----------+
table : statuses
+-------+-------------+---------------+---------------+
| id | name | product_id | created_at |
+-------+-------------+---------------+---------------+
| 1 | active | 1 | 2018-01-01 |
| 2 | inactive | 1 | 2018-02-01 |
| 3 | active | 1 | 2018-03-01 |
| 4 | inactive | 1 | 2018-03-15 |
| 6 | active | 1 | 2018-04-25 |
| 7 | active | 2 | 2018-03-01 |
| 8 | active | 3 | 2018-03-10 |
| 9 | inactive | 3 | 2018-03-15 |
+-------+-------------+---------------+---------------+
table : items (ordered products)
+-------+---------------+-------------+
| id | product_id | order_id |
+-------+---------------+-------------+
| 1 | 1 | 1 |
| 2 | 1 | 2 |
| 3 | 1 | 3 |
| 4 | 1 | 4 |
| 5 | 1 | 5 |
| 6 | 2 | 3 |
| 7 | 2 | 4 |
| 8 | 2 | 5 |
| 9 | 3 | 5 |
+-------+---------------+-------------+
table : orders
+-------+---------------+
| id | created_at |
+-------+---------------+
| 1 | 2018-01-02 |
| 2 | 2018-01-15 |
| 3 | 2018-03-02 |
| 4 | 2018-03-10 |
| 5 | 2018-03-13 |
+-------+---------------+
I want my final results to look like this:
+-------+-----------+----------------------+--------------------------------+
| id | name | recent_sales_count | date_to_start_counting_sales |
+-------+-----------+----------------------+--------------------------------+
| 1 | Apple | 3 | 2018-01-30 |
| 2 | Banana | 0 | 2018-04-09 |
| 3 | Grape | 1 | 2018-03-10 |
+-------+-----------+----------------------+--------------------------------+
So this is what I mean by latest 20 active days for e.g. Apple:
It was last activated at '2018-04-25'. That's 4 days ago.
Before that, it was inactive since '2018-03-15', so all these days until '2018-04-25' don't count.
Before that, it was active since '2018-03-01'. That's 14 more days until '2018-03-15'.
Before that, inactive since '2018-02-01'.
Finally, it was active since '2018-01-01', so it should only count the missing 2 days (4 + 14 + 2 = 20) backwards from '2018-02-01', resulting in date_to_start_counting_sales = '2018-01-30'.
With the '2018-01-30' date in hand, I'm then able to count Apple orders in the last 20 active days: 3.
Hope that makes sense.
Here is a fiddle with the data provided above.
I've got a standard SQL solution that does not use any window functions, as you are on MySQL 5.
My solution requires 3 stacked views.
It would have been better with a CTE, but your version doesn't support them. I don't like stacking views and always try to avoid it, but sometimes there is no other choice, because your MySQL version doesn't accept subqueries in the FROM clause of a view.
CREATE VIEW VIEW_product_dates AS
(
SELECT product_id, created_at AS active_date,
(
SELECT created_at
FROM statuses ti
WHERE name = 'inactive' AND ta.created_at < ti.created_at AND ti.product_id=ta.product_id
GROUP BY product_id
) AS inactive_date
FROM statuses ta
WHERE name = 'active'
);
CREATE VIEW VIEW_product_dates_days AS
(
SELECT product_id, active_date, inactive_date, datediff(IFNULL(inactive_date, SYSDATE()),active_date) AS nb_days
FROM VIEW_product_dates
);
CREATE VIEW VIEW_product_dates_days_cumul AS
(
SELECT product_id, active_date, ifnull(inactive_date,sysdate()) AS inactive_date, nb_days,
IFNULL((SELECT SUM(V2.nb_days) + V1.nb_days
FROM VIEW_product_dates_days V2
WHERE V2.active_date >= IFNULL(V1.inactive_date, SYSDATE()) AND V1.product_id=V2.product_id
),V1.nb_days) AS cumul_days
FROM VIEW_product_dates_days V1
);
The final view produces this:
| product_id | active_date | inactive_date | nb_days | cumul_days |
|------------|----------------------|----------------------|---------|------------|
| 1 | 2018-01-01T00:00:00Z | 2018-02-01T00:00:00Z | 31 | 49 |
| 1 | 2018-03-01T00:00:00Z | 2018-03-15T00:00:00Z | 14 | 18 |
| 1 | 2018-04-25T00:00:00Z | 2018-04-29T11:28:39Z | 4 | 4 |
| 2 | 2018-03-01T00:00:00Z | 2018-04-29T11:28:39Z | 59 | 59 |
| 3 | 2018-03-10T00:00:00Z | 2018-03-15T00:00:00Z | 5 | 5 |
So it lists all active periods of all products, counts the number of days in each period, and accumulates the active days counted backwards from the current date.
Then we can query this final view to get the desired date for each product. I set a variable for your 20 days, so you can change that number easily if you want.
SET @cap_days = 20;
SELECT PD.id, PD.name,
SUM(CASE WHEN o.created_at > PD.date_to_start_counting_sales THEN 1 ELSE 0 END) AS recent_sales_count ,
PD.date_to_start_counting_sales
FROM
(
SELECT P.*,
(CASE WHEN LowerCap.max_cumul_days IS NULL
THEN ADDDATE(ifnull(HigherCap.min_inactive_date,sysdate()),(-@cap_days))
ELSE
CASE WHEN LowerCap.max_cumul_days < @cap_days AND HigherCap.min_inactive_date IS NULL
THEN ADDDATE(ifnull(LowerCap.max_inactive_date,sysdate()),(-LowerCap.max_cumul_days))
ELSE ADDDATE(ifnull(HigherCap.min_inactive_date,sysdate()),(LowerCap.max_cumul_days-@cap_days))
END
END) as date_to_start_counting_sales
FROM products P
LEFT JOIN
(
SELECT product_id, MAX(cumul_days) AS max_cumul_days, MAX(inactive_date) AS max_inactive_date
FROM VIEW_product_dates_days_cumul
WHERE cumul_days <= @cap_days
GROUP BY product_id
) LowerCap ON P.id=LowerCap.product_id
LEFT JOIN
(
SELECT product_id, MIN(cumul_days) AS min_cumul_days, MIN(inactive_date) AS min_inactive_date
FROM VIEW_product_dates_days_cumul
WHERE cumul_days > @cap_days
GROUP BY product_id
) HigherCap ON P.id=HigherCap.product_id
) PD
LEFT JOIN items i ON PD.id = i.product_id
LEFT JOIN orders o ON o.id = i.order_id
GROUP BY PD.id, PD.name, PD.date_to_start_counting_sales
Returns
| id | name | recent_sales_count | date_to_start_counting_sales |
|----|--------|--------------------|------------------------------|
| 1 | Apple | 3 | 2018-01-30T00:00:00Z |
| 2 | Banana | 0 | 2018-04-09T20:43:23Z |
| 3 | Grape | 1 | 2018-03-10T00:00:00Z |
FIDDLE : http://sqlfiddle.com/#!9/804f52/24
Not sure which version of MySQL you're working with, but if you can use 8.0, that version came out with a lot of functionality that makes things slightly more doable (CTEs, row_number(), partition, etc.).
My recommendation would be to create a view like in this DB-Fiddle Example, call the view on the server side and iterate programmatically. There are ways of doing it in SQL, but it'd be a bear to write and test, and would likely be less efficient.
Assumptions:
Products cannot be sold during inactive date ranges
Statuses table will always alternate status active/inactive/active for each product. I.e. no date ranges where a certain product is both active and inactive.
View Results:
+------------+-------------+------------+-------------+
| product_id | active_date | end_date | days_active |
+------------+-------------+------------+-------------+
| 1 | 2018-01-01 | 2018-02-01 | 31 |
+------------+-------------+------------+-------------+
| 1 | 2018-03-01 | 2018-03-15 | 14 |
+------------+-------------+------------+-------------+
| 1 | 2018-04-25 | 2018-04-29 | 4 |
+------------+-------------+------------+-------------+
| 2 | 2018-03-01 | 2018-04-29 | 59 |
+------------+-------------+------------+-------------+
| 3 | 2018-03-10 | 2018-03-15 | 5 |
+------------+-------------+------------+-------------+
View:
CREATE OR REPLACE VIEW days_active AS (
WITH active_rn
AS (SELECT *, Row_number()
OVER ( partition BY NAME, product_id
ORDER BY created_at) AS rownum
FROM statuses
WHERE name = 'active'),
inactive_rn
AS (SELECT *, Row_number()
OVER ( partition BY NAME, product_id
ORDER BY created_at) AS rownum
FROM statuses
WHERE name = 'inactive')
SELECT x1.product_id,
x1.created_at AS active_date,
CASE WHEN x2.created_at IS NULL
THEN Curdate()
ELSE x2.created_at
END AS end_date,
CASE WHEN x2.created_at IS NULL
THEN Datediff(Curdate(), x1.created_at)
ELSE Datediff(x2.created_at,x1.created_at)
END AS days_active
FROM active_rn x1
LEFT OUTER JOIN inactive_rn x2
ON x1.rownum = x2.rownum
AND x1.product_id = x2.product_id ORDER BY
x1.product_id);
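
The "iterate programmatically" step could look roughly like the Python sketch below. It assumes the view rows have already been fetched and grouped per product, and that "today" is 2018-04-29 as in the View Results table; `periods` and `start_of_last_active_days` are hypothetical names for this example. It walks the active periods from newest to oldest and stops once 20 active days have been covered.

```python
from datetime import date, timedelta

# (active_date, end_date) per product_id, taken from the View Results table.
periods = {
    1: [(date(2018, 1, 1), date(2018, 2, 1)),
        (date(2018, 3, 1), date(2018, 3, 15)),
        (date(2018, 4, 25), date(2018, 4, 29))],
    2: [(date(2018, 3, 1), date(2018, 4, 29))],
    3: [(date(2018, 3, 10), date(2018, 3, 15))],
}


def start_of_last_active_days(product_periods, cap_days=20):
    """Return the date on which the most recent cap_days active days begin."""
    remaining = cap_days
    start = None
    for active, end in sorted(product_periods, reverse=True):  # newest first
        days = (end - active).days
        if days >= remaining:
            # The cap is reached inside this period: step back from its end.
            return end - timedelta(days=remaining)
        remaining -= days
        start = active
    # Fewer than cap_days active days in total: use the earliest active date.
    return start


for product_id, product_periods in periods.items():
    print(product_id, start_of_last_active_days(product_periods))
```

For the sample data this yields 2018-01-30, 2018-04-09 and 2018-03-10, the same dates as date_to_start_counting_sales in the question's expected output.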

Complex SQL query suggestions please

I have three tables with schema as below:
Table: Apps
| ID (bigint) | USERID (Bigint)| START_TIME (datetime) |
-------------------------------------------------------------
| 1 | 13 | 2013-05-03 04:42:55 |
| 2 | 13 | 2013-05-12 06:22:45 |
| 3 | 13 | 2013-06-12 08:44:24 |
| 4 | 13 | 2013-06-24 04:20:56 |
| 5 | 13 | 2013-06-26 08:20:26 |
| 6 | 13 | 2013-09-12 05:48:27 |
Table: Hosts
| ID (bigint) | APPID (Bigint)| DEVICE_ID (Bigint) |
-------------------------------------------------------------
| 1 | 1 | 1 |
| 2 | 2 | 1 |
| 3 | 1 | 1 |
| 4 | 3 | 3 |
| 5 | 1 | 4 |
| 6 | 2 | 3 |
Table: Usage
| ID (bigint) | APPID (Bigint)| HOSTID (Bigint) | Factor (varchar) |
-------------------------------------------------------------------------------------
| 1 | 1 | 1 | Low |
| 2 | 1 | 3 | High |
| 3 | 2 | 2 | Low |
| 4 | 3 | 4 | Medium |
| 5 | 1 | 5 | Low |
| 6 | 2 | 2 | Medium |
Now, given a userid as input, I want to get the count of Usage rows (across all apps) for each "Factor", month by month, for the last 6 months.
If a DEVICE_ID appears more than once in a month (based on START_TIME, after joining Apps and Hosts), only the latest rows of Usage (based on the combination of Apps, Hosts and Usage) should be considered when calculating the count.
Example output of the query for the above example should be: (for input user id=13)
| MONTH | USAGE_COUNT | FACTOR |
-------------------------------------------------------------
| 5 | 0 | High |
| 6 | 0 | High |
| 7 | 0 | High |
| 8 | 0 | High |
| 9 | 0 | High |
| 10 | 0 | High |
| 5 | 2 | Low |
| 6 | 0 | Low |
| 7 | 0 | Low |
| 8 | 0 | Low |
| 9 | 0 | Low |
| 10 | 0 | Low |
| 5 | 1 | Medium |
| 6 | 1 | Medium |
| 7 | 0 | Medium |
| 8 | 0 | Medium |
| 9 | 0 | Medium |
| 10 | 0 | Medium |
How is this calculated?
For Month May 2013 (05-2013), there are two Apps from table Apps
In table Hosts , these apps are associated with device_id's 1,1,1,4,3
For this month (05-2013) for device_id=1, the latest value of start_time is: 2013-05-12 06:22:45 (from tables hosts,apps), so in table Usage, look for combination of appid=2&hostid=2 for which there are two rows one with factor Low and other Medium,
For this month (05-2013) for device_id=4, by following same procedure we get one entry i.e 0 Low
Similarly all the values are calculated.
To get the last 6 months in a query, I'm trying the following:
SELECT MONTH(DATE_ADD(NOW(), INTERVAL aInt MONTH)) AS aMonth
FROM
(
SELECT 0 AS aInt UNION SELECT -1 UNION SELECT -2 UNION SELECT -3 UNION SELECT -4 UNION SELECT -5
)
Please check sqlfiddle: http://sqlfiddle.com/#!2/55fc2
Because the calculation you're doing involves the same join multiple times, I started by creating a view.
CREATE VIEW `app_host_usage`
AS
SELECT a.id "appid", h.id "hostid", u.id "usageid",
a.userid, a.start_time, h.device_id, u.factor
FROM apps a
LEFT OUTER JOIN hosts h ON h.appid = a.id
LEFT OUTER JOIN `usage` u ON u.appid = a.id AND u.hostid = h.id
WHERE a.start_time > DATE_ADD(NOW(), INTERVAL -7 MONTH)
The WHERE condition is there because I made the assumption that you don't want July 2005 and July 2006 to be grouped together in the same count.
With that view in place, the query becomes
SELECT months.Month, COUNT(DISTINCT device_id), factors.factor
FROM
(
-- Get the last six months
SELECT (MONTH(NOW()) + aInt + 11) % 12 + 1 "Month" FROM
(SELECT 0 AS aInt UNION SELECT -1 UNION SELECT -2 UNION SELECT -3 UNION SELECT -4 UNION SELECT -5) LastSix
) months
JOIN
(
-- Get all known factors
SELECT DISTINCT factor FROM `usage`
) factors
LEFT OUTER JOIN
(
-- Get factors for each device...
SELECT
MONTH(start_time) "Month",
device_id,
factor
FROM app_host_usage a
WHERE userid=13
AND start_time IN (
-- ...where the corresponding usage row is connected
-- to an app row with the highest start time of the
-- month for that device.
SELECT MAX(start_time)
FROM app_host_usage a2
WHERE a2.device_id = a.device_id
GROUP BY MONTH(start_time)
)
GROUP BY MONTH(start_time), device_id, factor
) usageids ON usageids.Month = months.Month
AND usageids.factor = factors.factor
GROUP BY factors.factor, months.Month
ORDER BY factors.factor, months.Month
which is insanely complicated, but I've tried to comment explaining what each part does. See this sqlfiddle: http://sqlfiddle.com/#!2/5c871/1/0
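
The month arithmetic in the `months` subquery is the least obvious part, so here is a small Python sketch of just that expression. The function name `last_six_months` is made up for this example; the formula is the same `(MONTH(NOW()) + aInt + 11) % 12 + 1` with offsets 0 to -5.

```python
def last_six_months(current_month):
    """Month numbers (1-12) for the current month and the five before it."""
    # Adding 11 before the modulo and 1 after it shifts the month by the
    # offset while keeping the result in 1..12 and the operand non-negative.
    return [(current_month + offset + 11) % 12 + 1 for offset in range(0, -6, -1)]


print(last_six_months(10))
print(last_six_months(3))
```

Running it for October gives months 10 down to 5, the range shown in the expected output of the question, and for March it wraps around to 12, 11 and 10 instead of producing 0 or negative month numbers.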

Order by and group by

I am trying to show delivery charges for a shop I am building. There are three tables in the database: one for the service (e.g. Royal Mail, Carrier...), one for the band (e.g. UK, Europe, Worldwide1, etc.) and one for the charges (qty = weight).
When joined, the three tables produce the following:
+------------------+-----+-----------+-------+---------+---------------+----------+-------+-------------+
| name | qty | serviceID | basis | bandID | initial_charge | chargeID | price | total_price |
+------------------+-----+-----------+-------+---------+---------------+----------+-------+-------------+
| Collect in store | 0 | 3 | | 1 | 3 | 0.00 | 2 | 0.00 |
| Royal mail | 0 | 1 | 2 | 4 | 2.00 | 3 | 0.00 | 2.00 |
| Royal mail | 1 | 1 | 2 | 4 | 2.00 | 4 | 1.00 | 3.00 |
| APC | 0 | 2 | 1 | 1 | 0.00 | 6 | 5.95 | 5.95 |
+------------------+-----+-----------+-------+---------+---------------+----------+-------+-------------+
As you can see, Royal Mail has two entries because more than one charge row matches in the joined table. What I would like to do is show the highest of the two Royal Mail entries (I was initially trying to group by serviceID) while keeping the two other services with different service IDs.
Any assistance would be great as this is driving me mad. I feel like I have tried every combination going!
In the example below the qty (weight) of the items is 3kg
SELECT
`service`.`name`,
`charge`.`qty`,
`service`.`serviceID`,
`band`.`bandID`,
`band`.`initial_charge`,
`charge`.`chargeID`,
`charge`.`price`,
`band`.`initial_charge` + `charge`.`price` AS `total_price`
FROM
`delivery_band` AS `band`
LEFT JOIN
`delivery_charge` AS `charge`
ON
`charge`.`bandID` = `band`.`bandID`
AND
`charge`.`qty` < '3'
LEFT JOIN
`delivery_service` AS `service`
ON
`service`.`serviceID` = `band`.`serviceID`
WHERE
FIND_IN_SET( '225', `band`.`accepted_countries` )
AND
(
`band`.`min_qty` >= '3'
OR
`band`.`min_qty` = '0'
)
AND
(
`band`.`max_qty` <= '3'
OR
`band`.`max_qty` = '0'
)
delivery_service
+-----------+------------------+
| serviceID | name |
+-----------+------------------+
| 1 | Royal mail |
| 2 | APC |
| 3 | Collect in store |
+-----------+------------------+
delivery_band
+--------+-----------+-----------------+----------------+---------+---------+-------------------------------------------------------+
| bandID | serviceID | name | initial_charge | min_qty | max_qty | accepted_countries |
+--------+-----------+-----------------+----------------+---------+---------+-------------------------------------------------------+
| 1 | 2 | UK Mainland | 0.00 | 0 | 0 | 225 |
| 2 | 2 | UK Offshore | 14.00 | 0 | 0 | 240 |
| 3 | 3 | Bradford Store | 0.00 | 0 | 0 | 225 |
| 4 | 1 | UK | 2.00 | 0 | 0 | 225 |
| 5 | 2 | World wide | 15.00 | 0 | 0 | 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20... |
| 6 | 1 | World wide Mail | 5.00 | 0 | 0 | 1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20... |
+--------+-----------+-----------------+----------------+---------+---------+-------------------------------------------------------+
delivery_charge
+----------+--------+-----+-------+
| chargeID | bandID | qty | price |
+----------+--------+-----+-------+
| 1 | 2 | 0 | 5.00 |
| 2 | 3 | 0 | 0.00 |
| 3 | 4 | 0 | 0.00 |
| 4 | 4 | 1 | 1.00 |
| 5 | 4 | 5 | 3.00 |
| 6 | 1 | 0 | 5.95 |
| 7 | 1 | 10 | 10.95 |
| 8 | 2 | 10 | 14.00 |
| 9 | 5 | 0 | 0.00 |
| 10 | 5 | 3 | 5.00 |
| 11 | 5 | 6 | 10.00 |
| 12 | 5 | 9 | 15.00 |
| 13 | 6 | 0 | 0.00 |
| 14 | 6 | 2 | 5.00 |
| 15 | 6 | 4 | 10.00 |
| 16 | 6 | 6 | 15.00 |
+----------+--------+-----+-------+
When I tried adding the charge table as a sub query and then limiting that query it gave me NULL's for all the charge table fields
If I try the following query:
SELECT
`service`.`name`,
`charge`.`qty`,
`service`.`serviceID`,
`band`.`bandID`,
`band`.`initial_charge`,
`charge`.`chargeID`,
MAX( `charge`.`price` ) AS `price`,
`band`.`initial_charge` + `charge`.`price` AS `total_price`
FROM
`delivery_band` AS `band`
LEFT JOIN
`delivery_charge` AS `charge`
ON
`charge`.`bandID` = `band`.`bandID`
AND
`charge`.`qty` < '3'
LEFT JOIN
`delivery_service` AS `service`
ON
`service`.`serviceID` = `band`.`serviceID`
WHERE
FIND_IN_SET( '225', `band`.`accepted_countries` )
AND
(
`band`.`min_qty` >= '3'
OR
`band`.`min_qty` = '0'
)
AND
(
`band`.`max_qty` <= '3'
OR
`band`.`max_qty` = '0'
)
GROUP BY
`service`.`serviceID`
I get this returned:
+------------------+-----+-----------+--------+----------------+----------+-------+-------------+
| name | qty | serviceID | bandID | initial_charge | chargeID | price | total_price |
+------------------+-----+-----------+--------+----------------+----------+-------+-------------+
| Royal mail | 0 | 1 | 4 | 2.00 | 3 | 1.00 | 2.00 |
| APC | 0 | 2 | 1 | 0.00 | 6 | 5.95 | 5.95 |
| Collect in store | 0 | 3 | 3 | 0.00 | 2 | 0.00 | 0.00 |
+------------------+-----+-----------+--------+----------------+----------+-------+-------------+
Which looks fine in principle until you realise that the chargeID = 3 has a price of 0.00 and yet the table is showing a price of 1.00 so the values seem to have become disassociated
What I would like to do is show the highest of the two royal mail entries
You can use MAX to obtain the maximum of a given column, e.g.
SELECT … MAX(charge.price) … FROM …
If you absolutely need the other columns (like charge.chargeID) to match, things will become a lot more complicated. So make sure you actually need that. For details on the general idea behind this kind of query, have a closer look at Select one value from a group based on order from other columns. Adapting this answer by @RichardTheKiwi, I came up with the following query:
SELECT s.name,
c.qty,
s.serviceID,
b.bandID,
b.initial_charge,
c.chargeID,
c.price,
b.initial_charge + c.price AS total_price
FROM delivery_band AS b,
delivery_service AS s,
(SELECT chargeID, price, qty,
@rowctr := IF(bandId = @lastBand, @rowctr+1, 1) AS rowNumber,
@lastBand := bandId AS bandId
FROM (SELECT @rowctr:=0, @lastBand:=null) init,
delivery_charge
WHERE qty < 3
ORDER BY bandId, price DESC
) AS c
WHERE FIND_IN_SET(225, b.accepted_countries)
AND (b.min_qty >= 3 OR b.min_qty = 0)
AND (b.max_qty <= 3 OR b.max_qty = 0)
AND s.serviceID = b.serviceID
AND c.bandID = b.bandID
AND c.rowNumber = 1
See this fiddle for the corresponding output. Note that I only do inner joins, not left joins, since that seems sufficient for the query in question and keeps things a lot more readable, so you can concentrate on the important parts, i.e. those involving rowNumber. The idea is that the subquery generates row numbers for the items of the same band, ordered by price descending, and resets them for the next band. When you select only rows with rowNumber equal to 1, you get only the highest price per band, with all other columns associated with that row.
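
To make the "first row per band" idea concrete outside SQL, here is a minimal Python sketch over the delivery_charge rows from the question. The function name `top_charge_per_band` and the tuple layout are assumptions for illustration; it sorts by band and descending price, then keeps the first row it sees for each band, which is what rowNumber = 1 selects.

```python
# (chargeID, bandID, qty, price) rows from the delivery_charge table.
charges = [
    (1, 2, 0, 5.00), (2, 3, 0, 0.00), (3, 4, 0, 0.00), (4, 4, 1, 1.00),
    (5, 4, 5, 3.00), (6, 1, 0, 5.95), (7, 1, 10, 10.95), (8, 2, 10, 14.00),
    (9, 5, 0, 0.00), (10, 5, 3, 5.00), (11, 5, 6, 10.00), (12, 5, 9, 15.00),
    (13, 6, 0, 0.00), (14, 6, 2, 5.00), (15, 6, 4, 10.00), (16, 6, 6, 15.00),
]


def top_charge_per_band(charges, weight):
    """Map bandID to its highest-priced charge row with qty below weight."""
    eligible = [c for c in charges if c[2] < weight]  # WHERE qty < weight
    best = {}
    # ORDER BY bandID, price DESC; the first row of each band is "row 1".
    for charge in sorted(eligible, key=lambda c: (c[1], -c[3])):
        best.setdefault(charge[1], charge)
    return best


for band_id, charge in sorted(top_charge_per_band(charges, 3).items()):
    print(band_id, charge)
```

For a 3 kg order, band 4 (Royal Mail, UK) resolves to chargeID 4 with price 1.00, so the chargeID and price stay associated instead of drifting apart as in the GROUP BY attempt.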

Conditional cumulative SUM in MySQL

I have the following table:
+-----+-----------+----------+------------+------+
| key | idStudent | idCourse | hourCourse | mark |
+-----+-----------+----------+------------+------+
| 0 | 1 | 1 | 10 | 78 |
| 1 | 1 | 2 | 20 | 60 |
| 2 | 1 | 4 | 10 | 45 |
| 3 | 3 | 1 | 10 | 90 |
| 4 | 3 | 2 | 20 | 70 |
+-----+-----------+----------+------------+------+
Using a simple query, I can show student with their weighted average according to hourCourse and mark:
SELECT idStudent,
SUM( hourCourse * mark ) / SUM( hourCourse ) AS WeightedAvg
FROM `test`.`test`
GROUP BY idStudent;
+-----------+-------------+
| idStudent | WeightedAvg |
+-----------+-------------+
| 1 | 60.7500 |
| 3 | 76.6667 |
+-----------+-------------+
But now I need to select the records until the cumulative sum of hourCourse per student reaches a threshold. For example, for a threshold of 30 hourCourse, only the following records should be taken into account:
+-----+-----------+----------+------------+------+
| key | idStudent | idCourse | hourCourse | mark |
+-----+-----------+----------+------------+------+
| 0 | 1 | 1 | 10 | 78 |
| 1 | 1 | 2 | 20 | 60 |
| 3 | 3 | 1 | 10 | 90 |
| 4 | 3 | 2 | 20 | 70 |
+-----+-----------+----------+------------+------+
key 2 is not taken into account, because idStudent 1 already reached 30 hourCourse with idCourse 1 and 2.
Finally, the query solution should be the following:
+-----------+-------------+
| idStudent | WeightedAvg |
+-----------+-------------+
| 1 | 66.0000 |
| 3 | 76.6667 |
+-----------+-------------+
Is there any way to create an inline query for this? Thanks in advance.
Edit: The criteria while selecting the courses is from highest to the lowest mark.
Edit: Registers are included while the cumulative sum of hourCourse is less than 30. For instance, two registers of 20 hours each would be included (sum 40), and the following not.
You can calculate the cumulative sums per idStudent in a sub-query, then only select the results where the cumulative sum is <= 30:
select idStudent,
SUM( hourCourse * mark ) / SUM( hourCourse ) AS WeightedAvg
from
(
SELECT t.*,
case when @idStudent<>t.idStudent
then @cumSum:=hourCourse
else @cumSum:=@cumSum+hourCourse
end as cumSum,
@idStudent:=t.idStudent
FROM `test` t,
(select @idStudent:=0,@cumSum:=0) r
order by idStudent, `key`
) t
where t.cumSum <= 30
group by idStudent;
Demo: http://www.sqlfiddle.com/#!2/f5d07/23
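
For comparison, here is a minimal Python sketch of the capped weighted average that follows the two edits in the question: courses are taken from highest to lowest mark, and a row is included as long as the hours accumulated before it are below the threshold. The name `capped_weighted_avg` is hypothetical, and a positive threshold is assumed.

```python
from itertools import groupby

# (key, idStudent, idCourse, hourCourse, mark) rows from the question.
rows = [
    (0, 1, 1, 10, 78),
    (1, 1, 2, 20, 60),
    (2, 1, 4, 10, 45),
    (3, 3, 1, 10, 90),
    (4, 3, 2, 20, 70),
]


def capped_weighted_avg(rows, threshold=30):
    """Weighted average per student over rows taken until threshold hours."""
    averages = {}
    # Group by student, best marks first (per the first edit in the question).
    ordered = sorted(rows, key=lambda r: (r[1], -r[4]))
    for student, group in groupby(ordered, key=lambda r: r[1]):
        hours = 0
        weighted = 0
        for _key, _student, _course, hour, mark in group:
            if hours >= threshold:  # threshold already reached: skip the rest
                break
            hours += hour
            weighted += hour * mark
        averages[student] = weighted / hours
    return averages


print(capped_weighted_avg(rows))
```

With the sample rows this returns 66.0 for student 1 and roughly 76.6667 for student 3, the values the question expects.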