Sum columns from two tables in sql - mysql

I have two tables: a cost table and a payment table. The cost table contains the cost of each product along with the product name.
Cost Table
id | cost | name
1 | 100 | A
2 | 200 | B
3 | 200 | A
Payment Table
pid | amount | costID
1 | 10 | 1
2 | 20 | 1
3 | 30 | 2
4 | 50 | 1
Now I have to sum the cost for rows that share the same name, and also sum the payment amounts by costID, giving a result like the table below:
totalTable
name | sum(cost) | sum(amount) |
A | 300 | 80 |
B | 200 | 30 |
However I have been working my way around this using the query below but I think I am doing it very wrong.
SELECT
b.name,
b.sum(cost),
a.sum(amount)
FROM
`Payment Table` a
LEFT JOIN
`Cost Table` b
ON
b.id=a.costID
GROUP by b.name,a.costID
I would be grateful if somebody would help me with my queries or better still an idea as to how to go about it. Thank you

This should work:
select t2.name, sum(t2.cost), coalesce(sum(t1.amount), 0) as amount
from (
select id, name, sum(cost) as cost
from `Cost`
group by id, name
) t2
left join (
select costID, sum(amount) as amount
from `Payment`
group by CostID
) t1 on t2.id = t1.costID
group by t2.name

You need to do each calculation in a separate query and then join the results together.
The first one is straightforward.
For the second one, you need to get the name associated with each payment based on its costID.
SELECT C.`name`, C.`sum_cost`, COALESCE(P.`sum_amount`,0 ) as `sum_amount`
FROM (
SELECT `name`, SUM(`cost`) as `sum_cost`
FROM `Cost`
GROUP BY `name`
) C
LEFT JOIN (
SELECT `Cost`.`name`, SUM(`Payment`.`amount`) as `sum_amount`
FROM `Payment`
JOIN `Cost`
ON `Payment`.`costID` = `Cost`.`id`
GROUP BY `Cost`.`name`
) P
ON C.`name` = P.`name`
OUTPUT
| name | sum_cost | sum_amount |
|------|----------|------------|
| A | 300 | 80 |
| B | 200 | 30 |
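As a quick check of the "aggregate separately, then join" approach, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database and runs the same query shape. SQLite is only a stand-in for MySQL here, and the table names `Cost` and `Payment` are the ones the answers assume.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table names Cost/Payment follow the answers.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Cost (id INTEGER PRIMARY KEY, cost INTEGER, name TEXT);
CREATE TABLE Payment (pid INTEGER PRIMARY KEY, amount INTEGER, costID INTEGER);
INSERT INTO Cost VALUES (1, 100, 'A'), (2, 200, 'B'), (3, 200, 'A');
INSERT INTO Payment VALUES (1, 10, 1), (2, 20, 1), (3, 30, 2), (4, 50, 1);
""")

# Each derived table is aggregated to one row per name before the join,
# so neither sum can be multiplied by rows from the other table.
totals = conn.execute("""
SELECT C.name, C.sum_cost, COALESCE(P.sum_amount, 0) AS sum_amount
FROM (SELECT name, SUM(cost) AS sum_cost FROM Cost GROUP BY name) C
LEFT JOIN (
    SELECT Cost.name, SUM(Payment.amount) AS sum_amount
    FROM Payment
    JOIN Cost ON Payment.costID = Cost.id
    GROUP BY Cost.name
) P ON C.name = P.name
ORDER BY C.name
""").fetchall()

print(totals)  # [('A', 300, 80), ('B', 200, 30)]
```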

A couple of issues. For one thing, the column references should be qualified, not the aggregate functions.
This is invalid:
table_alias.SUM(column_name)
Should be:
SUM(table_alias.column_name)
This query should return the first two columns you are looking for:
SELECT c.name AS `name`
, SUM(c.cost) AS `sum(cost)`
FROM `Cost Table` c
GROUP BY c.name
ORDER BY c.name
When you introduce a join to another table, like Payment Table, where costid is not UNIQUE, you have the potential to produce a (partial) Cartesian product.
To see what that looks like, to see what's happening, remove the GROUP BY and the aggregate SUM() functions, and take a look at the detail rows returned by a query with the join operation.
SELECT c.id AS `c.id`
, c.cost AS `c.cost`
, c.name AS `c.name`
, p.pid AS `p.pid`
, p.amount AS `p.amount`
, p.costid AS `p.costid`
FROM `Cost Table` c
LEFT
JOIN `Payment Table` p
ON p.costid = c.id
ORDER BY c.id, p.pid
That's going to return:
c.id | c.cost | c.name | p.pid | p.amount | p.costid
1 | 100 | A | 1 | 10 | 1
1 | 100 | A | 2 | 20 | 1
1 | 100 | A | 4 | 50 | 1
2 | 200 | B | 3 | 30 | 2
3 | 200 | A | NULL | NULL | NULL
Notice that we are getting three copies of the id=1 row from Cost Table.
So, if we modified that query, adding a GROUP BY c.name, and wrapping c.cost in a SUM() aggregate, we're going to get an inflated value for total cost.
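The inflation described above can be reproduced with the sample data. This is a small sketch using an in-memory SQLite database as a stand-in for MySQL; the table names `cost_table` and `payment_table` are placeholders chosen for the example.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table names are placeholders.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cost_table (id INTEGER PRIMARY KEY, cost INTEGER, name TEXT);
CREATE TABLE payment_table (pid INTEGER PRIMARY KEY, amount INTEGER, costid INTEGER);
INSERT INTO cost_table VALUES (1, 100, 'A'), (2, 200, 'B'), (3, 200, 'A');
INSERT INTO payment_table VALUES (1, 10, 1), (2, 20, 1), (3, 30, 2), (4, 50, 1);
""")

# Joining first and aggregating afterwards: the id=1 cost row appears three
# times (once per payment), so its cost of 100 is summed three times.
inflated = conn.execute("""
SELECT c.name, SUM(c.cost) AS sum_cost, IFNULL(SUM(p.amount), 0) AS sum_amount
FROM cost_table c
LEFT JOIN payment_table p ON p.costid = c.id
GROUP BY c.name
ORDER BY c.name
""").fetchall()

print(inflated)  # [('A', 500, 80), ('B', 200, 30)]
```

The payment total for A is still 80, but the cost total comes out as 500 (100 + 100 + 100 + 200) instead of the expected 300.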
To avoid that, we can aggregate the amount from the Payment Table, so we get only one row for each costid. Then when we do the join operation, we won't be producing duplicate copies of rows from Cost.
Here's a query to aggregate the total amount from the Payment Table, so we get a single row for each costid.
SELECT p.costid
, SUM(p.amount) AS tot_amount
FROM `Payment Table` p
GROUP BY p.costid
ORDER BY p.costid
That would return:
costid | tot_amount
1 | 80
2 | 30
We can use the results from that query as if it were a table, by making that query an "inline view". In this example, we assign an alias of v to the query results. (In the MySQL vernacular, an "inline view" is called a "derived table".)
SELECT c.name AS `name`
, SUM(c.cost) AS `sum_cost`
, IFNULL(SUM(v.tot_amount),0) AS `sum_amount`
FROM `Cost Table` c
LEFT
JOIN ( -- inline view to return total amount by costid
SELECT p.costid
, SUM(p.amount) AS tot_amount
FROM `Payment Table` p
GROUP BY p.costid
ORDER BY p.costid
) v
ON v.costid = c.id
GROUP BY c.name
ORDER BY c.name

Related

mysql count the number of matches based on a column

This is my example dataset. I have groups with students assigned to them, as shown below:
uid | groupid | studentid
49 | PZV7cUZCnLwNkSS | wTsBSkkg4Weo8R3
50 | PZV7cUZCnLwNkSS | aIuDhxfChg3enCf
97 | CwvkffFcBCRbzdw | hEwLxJmnJmZFAic
99 | CwvkffFcBCRbzdw | OKFfl58XVQMrAyC
126 | CwvkffFcBCRbzdw | dlH8udyTjNV3nXM
142 | 2vu1eqTCWVjgE58 | Q01Iz3lC2uUMBSB
143 | 2vu1eqTCWVjgE58 | vB5s8hfTaVtx3wO
144 | 2vu1eqTCWVjgE58 | 5O9HA5Z7wVhgi6l
145 | 2vu1eqTCWVjgE58 | OiEUOXNjK2D2s8F
I am trying to output with the following information.
The problem I am having is the Group Size column getting it to output a count.
Studentid | Groupid | Group Size
wTsBSkkg4Weo8R3 | PZV7cUZCnLwNkSS | 2
aIuDhxfChg3enCf | PZV7cUZCnLwNkSS | 2
hEwLxJmnJmZFAic | CwvkffFcBCRbzdw | 3
OKFfl58XVQMrAyC | CwvkffFcBCRbzdw | 3
dlH8udyTjNV3nXM | CwvkffFcBCRbzdw | 3
I have researched whether I can use a WHERE clause inside the count, and it does not seem like it will let me do that. I thought about doing a sum but couldn't make that happen either. I feel like I am missing something simple.
An easy way to solve this is to use a self-JOIN:
SELECT a.studentid AS Studentid, a.groupid AS Groupid, COUNT(*) AS `Group Size`
FROM `table` AS a
JOIN `table` AS b ON a.groupid = b.groupid
GROUP BY a.studentid, a.groupid
So here we join the table with itself and use a GROUP BY to group on the studentid and groupid and then use COUNT(*) to count the number of rows in b that have the same groupid.
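The self-join explanation above can be tried out on the question's rows. The sketch below uses an in-memory SQLite database in place of MySQL, and `membership` is a hypothetical table name standing in for the asker's real table.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; "membership" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE membership (uid INTEGER PRIMARY KEY, groupid TEXT, studentid TEXT)")
conn.executemany(
    "INSERT INTO membership VALUES (?, ?, ?)",
    [
        (49, "PZV7cUZCnLwNkSS", "wTsBSkkg4Weo8R3"),
        (50, "PZV7cUZCnLwNkSS", "aIuDhxfChg3enCf"),
        (97, "CwvkffFcBCRbzdw", "hEwLxJmnJmZFAic"),
        (99, "CwvkffFcBCRbzdw", "OKFfl58XVQMrAyC"),
        (126, "CwvkffFcBCRbzdw", "dlH8udyTjNV3nXM"),
        (142, "2vu1eqTCWVjgE58", "Q01Iz3lC2uUMBSB"),
        (143, "2vu1eqTCWVjgE58", "vB5s8hfTaVtx3wO"),
        (144, "2vu1eqTCWVjgE58", "5O9HA5Z7wVhgi6l"),
        (145, "2vu1eqTCWVjgE58", "OiEUOXNjK2D2s8F"),
    ],
)

# Every row of "a" is paired with all rows of "b" in the same group,
# so COUNT(*) per (studentid, groupid) equals the size of that group.
rows = conn.execute("""
SELECT a.studentid, a.groupid, COUNT(*) AS group_size
FROM membership AS a
JOIN membership AS b ON a.groupid = b.groupid
GROUP BY a.studentid, a.groupid
ORDER BY MIN(a.uid)
""").fetchall()

group_sizes = [size for _, _, size in rows]
print(group_sizes)  # [2, 2, 3, 3, 3, 4, 4, 4, 4]
```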
Try this:
SELECT *
FROM pony a
LEFT JOIN (
SELECT COUNT(*) AS group_size, groupid
FROM pony
GROUP BY groupid
) b ON a.groupid = b.groupid
Try this:
SELECT T1.Studentid, T1.Groupid, T2.GroupCount
FROM Your_Table T1
INNER JOIN ( SELECT Groupid, count(*) AS GroupCount FROM Your_Table GROUP BY Groupid ) T2
ON T1.Groupid = T2.Groupid
You should try:
SELECT Groupid, COUNT(*) AS Groupsize FROM `table` GROUP BY Groupid;
It seems that what you're trying to do is simple. If I understand correctly, a SELECT COUNT statement with a GROUP BY gives you the size of each group. To count each distinct value only once, use COUNT(DISTINCT column_name).

MySQL query for distinct rows on count

I have a query that returns the bestselling items across shops. At the moment it works fine, but now I want only one product from each shop, that is, a distinct si.shop_id with only the one bestselling product for that shop.
SELECT `si`.`id`, si.shop_id,
(SELECT COUNT(*)
FROM `transaction_item` AS `tis`
JOIN `transaction` as `t`
ON `t`.`id` = `tis`.`transaction_id`
WHERE `tis`.`shop_item_id` = `si`.`id`
AND `t`.`added_date` >= '2014-02-26 00:00:00')
AS `count`
FROM `shop_item` AS `si`
INNER JOIN `transaction_item` AS `ti`
ON ti.shop_item_id = si.id
GROUP BY `si`.`id`
ORDER BY `count` DESC LIMIT 7
and that gives me a result like:
+--------+---------+-------+
| id | shop_id | count |
+--------+---------+-------+
| 425030 | 38027 | 111 |
| 291974 | 5368 | 20 |
| 425033 | 38027 | 18 |
| 291975 | 5368 | 12 |
| 142776 | 5368 | 10 |
| 397016 | 38027 | 9 |
| 291881 | 5368 | 8 |
+--------+---------+-------+
any ideas?
EDIT
so I created a fiddle for it
http://sqlfiddle.com/#!2/cfc4c/1
Now the query returns the best-selling products. I want it to return only one product from each shop, so the result for the fiddle should be:
+----+---------+-------+
| ID | SHOP_ID | COUNT |
+----+---------+-------+
| 1 | 222 | 3 |
| 4 | 333 | 2 |
| 8 | 555 | 1 |
| 9 | 777 | 1 |
+----+---------+-------+
Possibly something like this:-
SELECT si.shop_id,
SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(':', si.id, sub1.item_count) ORDER BY sub1.item_count DESC), ',', 1) AS `count`
FROM shop_item AS si
INNER JOIN
(
SELECT tis.shop_item_id, COUNT(*) AS item_count
FROM transaction_item AS tis
JOIN `transaction` as t
ON t.id = tis.transaction_id
AND t.added_date >= '2014-02-26 00:00:00'
GROUP BY tis.shop_item_id
) sub1
ON sub1.shop_item_id = si.id
GROUP BY si.shop_id
ORDER BY `count` DESC LIMIT 7
The sub query gets the count of transactions for each shop item. Then the main query concatenates the item id and the item count together, group concatenates all those for a single shop together (ordered by the count descending) and then uses SUBSTRING_INDEX to grab the first one (i.e., everything before the first comma).
You will have to split up the count field to get the item id and count separately (the separator is a : ).
This is taking a few guesses about what you really want, and with no table declares or data it isn't tested.
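To make the string handling in that query easier to follow, here is a plain-Python imitation of the three steps (concatenate, group-concatenate in descending order, take the first piece). It only mimics what CONCAT_WS, GROUP_CONCAT and SUBSTRING_INDEX do to the strings; it is not how MySQL implements them. The rows are the three items of shop 38027 from the question's sample output.

```python
# (item_id, item_count) rows for one shop, from the question's sample output.
items = [(425033, 18), (425030, 111), (397016, 9)]

# GROUP_CONCAT(CONCAT_WS(':', id, item_count) ORDER BY item_count DESC)
ordered = sorted(items, key=lambda row: row[1], reverse=True)
concatenated = ",".join(f"{item_id}:{count}" for item_id, count in ordered)

# SUBSTRING_INDEX(concatenated, ',', 1): everything before the first comma
best = concatenated.split(",", 1)[0]

# SUBSTRING_INDEX(best, ':', 1) and SUBSTRING_INDEX(best, ':', -1)
best_id, best_count = best.split(":")

print(concatenated)         # 425030:111,425033:18,397016:9
print(best_id, best_count)  # 425030 111

# Both pieces are strings. Sorting such values as text does not match
# numeric order, so convert to a number before ordering by the count.
as_text = sorted(["111", "20", "18"], reverse=True)
as_numbers = sorted(["111", "20", "18"], key=int, reverse=True)
print(as_text)     # ['20', '18', '111']
print(as_numbers)  # ['111', '20', '18']
```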
EDIT - now tested with the SQL fiddle example:-
SELECT SUBSTRING_INDEX(`count`, ':', 1) AS ID,
shop_id,
SUBSTRING_INDEX(`count`, ':', -1) AS `count`
FROM
(
SELECT si.shop_id,
SUBSTRING_INDEX(GROUP_CONCAT(CONCAT_WS(':', si.id, sub1.item_count) ORDER BY sub1.item_count DESC), ',', 1) AS `count`
FROM shop_item AS si
INNER JOIN transaction_item AS ti
ON ti.shop_item_id = si.id
INNER JOIN
(
SELECT tis.shop_item_id, COUNT(*) AS item_count
FROM transaction_item AS tis
JOIN `transaction` as t
ON t.id = tis.transaction_id
AND t.added_date >= '2014-02-26 00:00:00'
GROUP BY tis.shop_item_id
) sub1
ON sub1.shop_item_id = si.id
GROUP BY si.shop_id
) sub2
ORDER BY `count` DESC LIMIT 7;

Making large SQL query efficient

I'm stuck on a rather complex query.
I'm looking to write a query that shows the "top five customers" as well as some key metrics (counts with conditions) about each of those customers. Each of the different metrics uses a totally different join structure.
customer
id | Name
1 | Customer1
2 | Customer2
metricn
id | customer_id
1 | 1
2 | 2
metricn_lineitem
id | metricn_id
1 | 1
2 | 1
The issue is that I always want to group by this customer table.
I first tried to put all of my joins into the original query, but the query was abysmal with performance. I then tried using subqueries, but I couldn't get them to group by the original hospital id.
Here's a sample query
SELECT
customer.name,
(SELECT COUNT(metric1_lineitem.id)
FROM metric1 INNER JOIN metric1_lineitem
ON metric1_lineitem.metric1_id = metric1.id
WHERE metric1.customer_id = customer_id
) as metric_1,
(SELECT COUNT(metric2_lineitem.id)
FROM metric2 INNER JOIN metric2_lineitem
ON metric2_lineitem.metric2_id = metric2.id
WHERE metric2.customer_id = customer_id
) as metric_2
FROM customer
GROUP BY customer.name
SORT BY COUNT(metric1.id) DESC
LIMIT 5
Any advice? Thanks!
SELECT name, metric_1, metric_2
FROM customer AS c
LEFT JOIN (SELECT customer_id, COUNT(*) AS metric_1
FROM metric1 AS m
INNER JOIN metric1_lineitem AS l ON m.id = l.metric1_id
GROUP BY customer_id) m1
ON m1.customer_id = c.id
LEFT JOIN (SELECT customer_id, COUNT(*) AS metric_2
FROM metric2 AS m
INNER JOIN metric2_lineitem AS l ON m.id = l.metric2_id
GROUP BY customer_id) m2
ON m2.customer_id = c.id
ORDER BY metric_1 DESC
LIMIT 5
You should also avoid using COUNT(columnname) when you can use COUNT(*) instead. The former has to test every value to see if it's null.
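The difference between the two forms is easy to see on a column that contains a NULL. A minimal sketch, using an in-memory SQLite database as a stand-in for MySQL and a made-up table `t`:

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; table "t" is invented for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (id INTEGER PRIMARY KEY, note TEXT);
INSERT INTO t VALUES (1, 'x'), (2, NULL), (3, 'y');
""")

# COUNT(*) counts rows; COUNT(note) counts only rows where note IS NOT NULL.
count_all, count_note = conn.execute("SELECT COUNT(*), COUNT(note) FROM t").fetchone()

print(count_all, count_note)  # 3 2
```

So the two are interchangeable only when the counted column cannot be NULL.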
Although your data structure may be lousy, your query may not be so bad, with two exceptions. I don't think you need the aggregation at the outer level. Also, the "correlations" in the WHERE clauses (such as metric1.customer_id = customer_id) are not doing anything, because the unqualified customer_id resolves to the subquery's own tables. You need to reference the outer table explicitly, as in metric1.customer_id = c.id:
SELECT c.name,
(SELECT COUNT(metric1_lineitem.id)
FROM metric1 INNER JOIN
metric1_lineitem
ON metric1_lineitem.metric1_id = metric1.id
WHERE metric1.customer_id = c.id
) as metric_1,
(SELECT COUNT(metric2_lineitem.id)
FROM metric2 INNER JOIN
metric2_lineitem
ON metric2_lineitem.metric2_id = metric2.id
WHERE metric2.customer_id = c.id
) as metric_2
FROM customer c
ORDER BY metric_1 DESC
LIMIT 5;
How can you make this run faster? One way is to introduce indexes. I would recommend metric1(customer_id), metric2(customer_id), metric1_lineitem(metric1_id) and metric2_lineitem(metric2_id).
This may be faster than the aggregation method (proposed by Barmar) because MySQL is inefficient with aggregations. This should allow the aggregations to take place only using indexes instead of the base tables.
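The recommended indexes and the correlated-subquery form can be sketched as follows. This runs against an in-memory SQLite database rather than MySQL, so it says nothing about MySQL's actual query plans or speed. The index names and the `metric2` rows are invented for the example; the `customer` and `metric1` rows follow the question's sample tables.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; index names and metric2 data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE metric1 (id INTEGER PRIMARY KEY, customer_id INTEGER);
CREATE TABLE metric1_lineitem (id INTEGER PRIMARY KEY, metric1_id INTEGER);
CREATE TABLE metric2 (id INTEGER PRIMARY KEY, customer_id INTEGER);
CREATE TABLE metric2_lineitem (id INTEGER PRIMARY KEY, metric2_id INTEGER);

-- One index per column used to correlate or join.
CREATE INDEX idx_metric1_customer ON metric1 (customer_id);
CREATE INDEX idx_metric2_customer ON metric2 (customer_id);
CREATE INDEX idx_m1_lineitem_metric ON metric1_lineitem (metric1_id);
CREATE INDEX idx_m2_lineitem_metric ON metric2_lineitem (metric2_id);

INSERT INTO customer VALUES (1, 'Customer1'), (2, 'Customer2');
INSERT INTO metric1 VALUES (1, 1), (2, 2);
INSERT INTO metric1_lineitem VALUES (1, 1), (2, 1);
INSERT INTO metric2 VALUES (1, 2);
INSERT INTO metric2_lineitem VALUES (1, 1);
""")

# Each subquery is correlated to the outer row through c.id.
top_customers = conn.execute("""
SELECT c.name,
       (SELECT COUNT(*)
        FROM metric1 m
        JOIN metric1_lineitem l ON l.metric1_id = m.id
        WHERE m.customer_id = c.id) AS metric_1,
       (SELECT COUNT(*)
        FROM metric2 m
        JOIN metric2_lineitem l ON l.metric2_id = m.id
        WHERE m.customer_id = c.id) AS metric_2
FROM customer c
ORDER BY metric_1 DESC
LIMIT 5
""").fetchall()

print(top_customers)  # [('Customer1', 2, 0), ('Customer2', 0, 1)]
```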

left join, return non matching rows, where clause on right table, group by

Sorry about the complicated title.
I have two tables, customers and orders:
customers - names may be duplicated, ids are unique:
name | cid
a | 1
a | 2
b | 3
b | 4
c | 5
orders - pid is unique, join on cid:
pid | cid | date
1 | 1 | 01/01/2012
2 | 1 | 01/01/2012
3 | 2 | 01/01/2012
4 | 3 | 01/01/2012
5 | 3 | 01/01/2012
6 | 3 | 01/01/2012
So I used this code to get a count:
select customers.name, orders.date, count(*) as count
from customers
left JOIN orders ON customers.cid = orders.cid
where date between '01/01/2012' and '02/02/2012'
group by name,date
which worked fine but didn't give me null rows when the cid of a customer didn't match a cid in orders, e.g. name c, cid 5.
select customers.name, orders.date, count(*) as count
from customers
left JOIN orders ON customers.cid = orders.cid
AND date between '01/01/2012' and '02/02/2012'
group by name,date
So I changed the where to apply to the join instead, which works fine, it gives me the null rows.
So in this example I would get:
name | date | count
a | 01/01/2012 | 3
b | null | 1
b | 01/01/2012 | 3
c | null | 1
But because one name can have several cids, it gives me a null row even when the name itself does have rows in orders, which I don't want.
So I'm looking for a way for the null rows to only be returned when any other cid's that share the same name also do not have any rows in orders.
Thanks for any help.
---EDIT---
I have edited the counts for null rows, count never returns null but 1.
The result of
select * from (select customers.name, orders.date, count(*) as count
from customers
left JOIN orders ON customers.cid = orders.cid
AND date between '01/01/2012' and '02/02/2012'
group by name,date) as t1 group by name
is
name | date | count
a | 01/01/2012 | 3
b | null | 1
c | null | 1
First, select your data grouped by (name, date), excluding NULLs, then join it to the set of distinct names:
SELECT names.name, grouped.date, grouped.count
FROM ( SELECT DISTINCT name FROM customers ) as names
LEFT JOIN (
SELECT customers.name, orders.date, COUNT(*) as count
FROM customers
LEFT JOIN orders ON customers.cid = orders.cid
WHERE date BETWEEN '01/01/2012' AND '02/02/2012'
GROUP BY name,date
) grouped
ON names.name = grouped.name
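To compare the three variants discussed in this thread (filter in WHERE, filter in ON, and the distinct-names join above), here is a sketch on the question's sample rows. It uses an in-memory SQLite database as a stand-in for MySQL, and the dates are stored as ISO strings (`2012-01-01`) so that BETWEEN compares them correctly as text; both are assumptions of the example.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; dates are ISO-formatted text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (name TEXT, cid INTEGER PRIMARY KEY);
CREATE TABLE orders (pid INTEGER PRIMARY KEY, cid INTEGER, date TEXT);
INSERT INTO customers VALUES ('a', 1), ('a', 2), ('b', 3), ('b', 4), ('c', 5);
INSERT INTO orders VALUES
  (1, 1, '2012-01-01'), (2, 1, '2012-01-01'), (3, 2, '2012-01-01'),
  (4, 3, '2012-01-01'), (5, 3, '2012-01-01'), (6, 3, '2012-01-01');
""")

# 1) Filter in WHERE: unmatched customers are discarded after the join.
where_rows = conn.execute("""
SELECT c.name, o.date, COUNT(*) FROM customers c
LEFT JOIN orders o ON c.cid = o.cid
WHERE o.date BETWEEN '2012-01-01' AND '2012-02-02'
GROUP BY c.name, o.date ORDER BY c.name, o.date
""").fetchall()

# 2) Filter in ON: every cid without orders yields its own NULL row.
on_rows = conn.execute("""
SELECT c.name, o.date, COUNT(*) FROM customers c
LEFT JOIN orders o ON c.cid = o.cid
 AND o.date BETWEEN '2012-01-01' AND '2012-02-02'
GROUP BY c.name, o.date ORDER BY c.name, o.date
""").fetchall()

# 3) Distinct names LEFT JOINed to the NULL-free grouped result.
name_rows = conn.execute("""
SELECT names.name, grouped.date, grouped.count
FROM (SELECT DISTINCT name FROM customers) AS names
LEFT JOIN (
    SELECT c.name, o.date, COUNT(*) AS count FROM customers c
    JOIN orders o ON c.cid = o.cid
    WHERE o.date BETWEEN '2012-01-01' AND '2012-02-02'
    GROUP BY c.name, o.date
) grouped ON names.name = grouped.name
ORDER BY names.name
""").fetchall()

print(where_rows)  # [('a', '2012-01-01', 3), ('b', '2012-01-01', 3)]
print(on_rows)     # adds ('b', None, 1) and ('c', None, 1)
print(name_rows)   # [('a', '2012-01-01', 3), ('b', '2012-01-01', 3), ('c', None, None)]
```

Only the third form returns a NULL row for c while leaving out the extra NULL row for b.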
The best approach would be to group the rows by cid first and then by the other parameters.
That way the LEFT OUTER JOIN gives you the proper output, including the NULL values.

mysql small issue

I am trying following query...
SELECT b.name AS batch_name, b.id AS batch_id,
COUNT( s.id ) AS total_students,
COALESCE( sum(s.open_bal), 0 ) AS open_balance,
sum( COALESCE(i.reg_fee,0) + COALESCE(i.tut_fee,0) + COALESCE(i.other_fee,0) ) AS gross_fee
FROM batches b
LEFT JOIN students s on s.batch = b.id
LEFT JOIN invoices i on i.student_id = s.id
GROUP BY b.name, b.id;
result set
| batch_name | batch_id | total_students | open_balance | gross_fee |
+------------+-----------+----------------+--------------+-----------+
| ba | 11 | 44 | 0 | 1782750 |
+------------+-----------+----------------+--------------+-----------+
But it's giving unexpected results, and if I remove sum( COALESCE(i.reg_fee,0) + COALESCE(i.tut_fee,0) + COALESCE(i.other_fee,0) ) AS gross_fee and LEFT JOIN invoices i on i.student_id = s.id, it gives the expected/correct results, as follows...
| batch_name | batch_id | total_students | open_balance | gross_fee |
+------------+-----------+----------------+--------------+-----------+
| ba | 11 | 34 | 0 | 0 |
+------------+-----------+----------------+--------------+-----------+
I am sure I am doing something wrong, and I have been trying every option for the last hour. Please help.
I assume your question is something like:
Why does COUNT(s.id) return 44 in the first query and 34 in the second query, and how can I make it count 34 students while I sum the invoices in the same query?
You have multiple invoices for some of your students, and the join results in multiple rows with the same s.id. When you count them, it counts each of these multiple rows.
You should use COUNT(DISTINCT s.id) to make the query count each student id only once, even when it appears multiple times as a consequence of the join to invoices.
Re your question about what to change, just change COUNT(s.id) to COUNT(DISTINCT s.id). The rest of the query looks fine, if I have a correct understanding of what you want it to do.
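The effect of the join on the count can be seen with a tiny data set. The sketch below uses an in-memory SQLite database in place of MySQL; the three students and three invoices are invented, with one student holding two invoices.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; all rows are invented for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE batches (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE students (id INTEGER PRIMARY KEY, batch INTEGER, open_bal INTEGER);
CREATE TABLE invoices (id INTEGER PRIMARY KEY, student_id INTEGER,
                       reg_fee INTEGER, tut_fee INTEGER, other_fee INTEGER);
INSERT INTO batches VALUES (11, 'ba');
INSERT INTO students VALUES (1, 11, 0), (2, 11, 50), (3, 11, 0);
INSERT INTO invoices VALUES (1, 2, 100, 0, 0), (2, 2, 200, 0, 0), (3, 1, 100, 0, 0);
""")

# Student 2 has two invoices, so that student appears twice after the join.
plain, distinct, open_balance, gross_fee = conn.execute("""
SELECT COUNT(s.id), COUNT(DISTINCT s.id),
       COALESCE(SUM(s.open_bal), 0),
       SUM(COALESCE(i.reg_fee,0) + COALESCE(i.tut_fee,0) + COALESCE(i.other_fee,0))
FROM batches b
LEFT JOIN students s ON s.batch = b.id
LEFT JOIN invoices i ON i.student_id = s.id
GROUP BY b.name, b.id
""").fetchone()

# Pre-aggregating invoices to one row per student avoids the duplication.
pre_count, pre_balance, pre_fee = conn.execute("""
SELECT COUNT(s.id), COALESCE(SUM(s.open_bal), 0), COALESCE(SUM(f.fee), 0)
FROM batches b
LEFT JOIN students s ON s.batch = b.id
LEFT JOIN (
    SELECT student_id,
           SUM(COALESCE(reg_fee,0) + COALESCE(tut_fee,0) + COALESCE(other_fee,0)) AS fee
    FROM invoices GROUP BY student_id
) f ON f.student_id = s.id
GROUP BY b.name, b.id
""").fetchone()

print(plain, distinct, open_balance, gross_fee)  # 4 3 100 400
print(pre_count, pre_balance, pre_fee)           # 3 50 400
```

COUNT(DISTINCT s.id) corrects the student count, but in this sample SUM(s.open_bal) is also doubled for the student with two invoices (100 instead of 50). If opening balances can be non-zero for such students, aggregating invoices per student before the join, as in the second query, keeps both figures right.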