MySQL SUM() over two joined tables multiplies the result

I have two tables: invoices and invoiceitems.
invoiceitems contains the items on each invoice,
e.g.:
invoices
----------------------------------
| id |status| net | tax | total |
----------------------------------
| 72 |paid | 100 | 120 | 220 |
| 73 |unpaid| 50 | 5 | 55 |
| 74 |paid | 400 | 45 | 445 |
| 75 |paid | 250 | 67 | 317 |
invoiceitems
-------------------------------
| invoiceid |itemdescription |
-------------------------------
| 72 | apples |
| 72 | pears |
| 72 | oranges |
| 73 | lemons |
| 73 | oranges |
As you can see, invoice number 72 has 3 items in the example.
I want to search my invoices for certain things and display counts and sums of certain fields.
My problem is that the SUM value seems to get multiplied by the number of matching rows in the second table.
$sql = "SELECT COUNT(DISTINCT invoices.id) AS num,
SUM(CASE invoices.status WHEN 'Paid' THEN 1 ELSE 0 END) AS numpaid,
SUM(CASE invoices.status WHEN 'Paid' THEN invoices.total ELSE 0 END) AS sumtotal
FROM invoices
LEFT JOIN invoiceitems ON invoices.id=invoiceitems.invoiceid
WHERE invoices.id LIKE :invoiceid
AND IFNULL(invoiceitems.itemdescription, '') LIKE :itemdescription
AND invoices.net LIKE :net
AND invoices.tax LIKE :tax
AND invoices.total LIKE :total
AND ......"
So using the above, the total for invoice 72 is multiplied by 3.
I'm sorry if this is badly explained; I can't think of another way to put it. I have been searching for ages but can't find a solution. I hope someone can help. Thanks.

One way to do what you want is to pre-aggregate the invoiceItems table before joining:
SELECT COUNT(i.id) AS num,
       SUM(CASE i.status WHEN 'Paid' THEN 1 ELSE 0 END) AS numpaid,
       SUM(CASE i.status WHEN 'Paid' THEN i.total ELSE 0 END) AS sumtotal
FROM invoices i LEFT JOIN
     -- one row per invoice, so the join no longer repeats invoice rows
     (select ii.invoiceid, sum(. . .) as . . .
      from invoiceitems ii
      where IFNULL(ii.itemdescription, '') LIKE :itemdescription
      group by ii.invoiceid
     ) ii
     ON i.id = ii.invoiceid
WHERE i.id LIKE :invoiceid AND
      i.net LIKE :net AND
      i.tax LIKE :tax AND
      i.total LIKE :total AND .....
Your query doesn't actually use invoiceitems in the from clause, so it is hard to provide a more detailed example.

When you do a join, you produce records created by matching up ones from the original tables. Thus, you will have 3 records for invoice #72, each created by matching up the single invoices record for #72 with each of the invoice items for #72. Each combined record will have the same total (in this case, 220), and thus the sum would be 3 times that.
It sounds like you just want total, then; you could just use total directly, or you could take your sum and divide it by the count (which you appear to also be computing).
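The row multiplication described above can be reproduced with the sample data from the question. This is a minimal sketch, assuming an in-memory SQLite database as a stand-in for MySQL; the `item_count` column is a hypothetical placeholder for whatever the pre-aggregated subquery needs to return, and the status is compared in lowercase because SQLite string comparison is case-sensitive.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoices (id INTEGER PRIMARY KEY, status TEXT,
                       net INTEGER, tax INTEGER, total INTEGER);
CREATE TABLE invoiceitems (invoiceid INTEGER, itemdescription TEXT);
INSERT INTO invoices VALUES
    (72, 'paid', 100, 120, 220), (73, 'unpaid', 50, 5, 55),
    (74, 'paid', 400, 45, 445),  (75, 'paid', 250, 67, 317);
INSERT INTO invoiceitems VALUES
    (72, 'apples'), (72, 'pears'), (72, 'oranges'),
    (73, 'lemons'), (73, 'oranges');
""")

# Joining first: invoice 72 appears once per item, so its total is added 3 times.
fanned_out = conn.execute("""
    SELECT SUM(CASE WHEN i.status = 'paid' THEN i.total ELSE 0 END)
    FROM invoices i
    LEFT JOIN invoiceitems ii ON i.id = ii.invoiceid
""").fetchone()[0]

# Collapsing invoiceitems to one row per invoice before joining keeps
# every invoice at a single row, so each total is added exactly once.
pre_aggregated = conn.execute("""
    SELECT SUM(CASE WHEN i.status = 'paid' THEN i.total ELSE 0 END)
    FROM invoices i
    LEFT JOIN (SELECT invoiceid, COUNT(*) AS item_count
               FROM invoiceitems
               GROUP BY invoiceid) ii ON i.id = ii.invoiceid
""").fetchone()[0]

print(fanned_out, pre_aggregated)  # 1422 982
```

The first figure is 3 x 220 + 445 + 317, the second is 220 + 445 + 317, which is the intended sum of paid invoices.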

Related

Click-through ratio of articles with two different tables using SQL

I would like to calculate the Click-Through Ratio (CTR) of several articles of a website using SQL.
The formula of the CTR is CTR = number clicks / number impressions, i.e. a ratio of how many times an article has been clicked and how many times it has been shown.
I have two tables:
`article_click`: A table with several columns, namely `article_id` (denoting the id of the article), `description` (a brief description of the article), `timestamp` (when it has been clicked), among others. Every time a user clicks an article, a new row is created in the table.
`article_impression`: Similarly, a table with several columns, namely `article_id` (denoting the id of the article), `description` (a brief description of the article), `timestamp` (when it has been shown), among others. Every time an article is shown to a user, a new row is created in the table.
Both tables 1 and 2 look like this:
+------------+-------------+------------------+-----+
| article_id | description | timestamp | ... |
+------------+-------------+------------------+-----+
| 102 | Potatoe | 2021-01-01 13:45 | ... |
| 11 | Lettuce | 2020-02-11 11:00 | ... |
| 34 | Train | 2019-12-12 09:31 | ... |
| 21 | Car | 2011-11-11 08:32 | ... |
| 201 | Train | 2014-02-10 02:12 | ... |
| ... | ... | ... | ... |
+------------+-------------+------------------+-----+
And I would like to create a table such that:
+------------+-----+
| article_id | CTR |
+------------+-----+
| 11 | 0.4 |
| 23 | 0.6 |
| 34 | 0.2 |
| 44 | 0.8 |
| 45 | 0.3 |
| ... | ... |
+------------+-----+
In order to do so, I have tried:
SELECT article_click.article_id, COUNT(article_click.article_id) / COUNT(article_impression.article_id) AS CTR
FROM article_click
INNER JOIN article_impression ON article_click.article_id = article_impression.article_id
GROUP BY article_click.article_id DESC;
But I obtain something like:
+------------+-----+
| article_id | CTR |
+------------+-----+
| 11 | 1.0 |
| 23 | 1.0 |
| 34 | 1.0 |
| 44 | 1.0 |
| 45 | 1.0 |
| ... | ... |
+------------+-----+
Can anyone spot the mistake here? I'm using MySQL as RDBMS.
If the click-through-rate (CTR) is number clicks / number impressions then you'll need to calculate the number of clicks on an article and the number of impressions on an article before joining them to perform the calculation.
You could do this with subqueries or CTEs, but I've opted for the former here.
SELECT c.article_id, c.click_count / i.impression_count AS CTR
FROM (
SELECT article_id, COUNT(*) AS click_count
FROM article_click
GROUP BY article_id) AS c
INNER JOIN (
SELECT article_id, COUNT(*) AS impression_count
FROM article_impression
GROUP BY article_id) AS i
ON c.article_id = i.article_id;
Note that using an INNER JOIN will exclude articles that have impressions but no clicks, so you won't get results where the CTR is 0. If you want those, you can use a LEFT JOIN from impressions to clicks. Since an article cannot be clicked if it has not been shown, we know that a LEFT JOIN from impressions to clicks is sufficient to show all data.
SELECT i.article_id, COALESCE(c.click_count, 0) / i.impression_count AS CTR
FROM (
SELECT article_id, COUNT(*) AS impression_count
FROM article_impression
GROUP BY article_id) AS i
LEFT JOIN (
SELECT article_id, COUNT(*) AS click_count
FROM article_click
GROUP BY article_id) AS c
ON i.article_id = c.article_id;
Note that we have to select the article_id from the impressions subquery (i), since c.article_id is NULL for articles that were never clicked. For the same reason, we COALESCE the click_count so that those articles get a CTR of 0 instead of NULL.
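To see why the original query always returns 1.0 and how the pre-aggregated LEFT JOIN behaves, here is a small sketch. It assumes an in-memory SQLite database instead of MySQL, a hypothetical `ts` column in place of `timestamp`, and made-up sample rows. SQLite divides integers as integers, so the sketch multiplies by 1.0; MySQL's `/` already returns a decimal.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE article_click (article_id INTEGER, ts TEXT);
CREATE TABLE article_impression (article_id INTEGER, ts TEXT);
-- Article 11: 2 clicks / 5 impressions. Article 34: 0 clicks / 4 impressions.
INSERT INTO article_click VALUES (11, 't1'), (11, 't2');
INSERT INTO article_impression VALUES
    (11, 't1'), (11, 't2'), (11, 't3'), (11, 't4'), (11, 't5'),
    (34, 't1'), (34, 't2'), (34, 't3'), (34, 't4');
""")

# Joining the raw tables pairs every click with every impression (2 x 5 rows
# for article 11), so both COUNTs see the same 10 rows and the ratio is 1.
naive = conn.execute("""
    SELECT c.article_id, COUNT(c.article_id) / COUNT(i.article_id)
    FROM article_click c
    INNER JOIN article_impression i ON c.article_id = i.article_id
    GROUP BY c.article_id
""").fetchall()

# Counting each table on its own first gives one row per article on each side.
ctr = conn.execute("""
    SELECT i.article_id,
           1.0 * COALESCE(c.click_count, 0) / i.impression_count
    FROM (SELECT article_id, COUNT(*) AS impression_count
          FROM article_impression GROUP BY article_id) AS i
    LEFT JOIN (SELECT article_id, COUNT(*) AS click_count
               FROM article_click GROUP BY article_id) AS c
      ON i.article_id = c.article_id
    ORDER BY i.article_id
""").fetchall()

print(naive)  # [(11, 1)]
print(ctr)    # [(11, 0.4), (34, 0.0)]
```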
Duplicate rows must be avoided before joining. Compute the count for each table separately, then join the two aggregated subqueries.
select c.article_id, c.click_count / i.impression_count as ctr
from (select article_id, count(*) as click_count
      from article_click
      group by article_id) c
inner join (select article_id, count(*) as impression_count
            from article_impression
            group by article_id) i
on c.article_id = i.article_id;
Another option is to stack both tables with UNION ALL and count the rows of each type:
WITH
v_article AS
( SELECT 'S' AS type, article_id FROM article_impression
UNION ALL
SELECT 'C' AS type, article_id FROM article_click
)
SELECT
v_article.article_id,
COUNT(CASE WHEN v_article.type = 'S' THEN 1 END) AS nb_show,
COUNT(CASE WHEN v_article.type = 'C' THEN 1 END) AS nb_click,
CASE
WHEN COUNT(CASE WHEN v_article.type = 'S' THEN 1 END) > 0 THEN
ROUND(COUNT(CASE WHEN v_article.type = 'C' THEN 1 END) / COUNT(CASE WHEN v_article.type = 'S' THEN 1 END), 2)
END AS ratio_click_show
FROM v_article
GROUP BY
v_article.article_id
;
If you're sure an article can be clicked only if it has previously been shown (nb_show > 0 and nb_show >= nb_click), you can remove the CASE around the ratio calculation.

How to calculate the sum of amounts for accounts in two tables based on the leading characters of the account code

I have two tables: tblaccounts and tblamounts.
In tblaccounts I have the fields AccID, AccCode, and AccType (Main or Sub).
In tblamounts I have the fields AccID and TheAmount.
I need to calculate the total amount for each account, following the rule shown in this data:
+---------+--------+---------+
| AccCode | Amount | AccType |
+---------+--------+---------+
| 1 | 2400 | Main |
+---------+--------+---------+
| 11 | 1600 | Main |
+---------+--------+---------+
| 111 | 100 | Sub |
+---------+--------+---------+
| 112 | 1000 | Sub |
+---------+--------+---------+
| 113 | 500 | Sub |
+---------+--------+---------+
| 12 | 800 | Main |
+---------+--------+---------+
| 121 | 500 | Sub |
+---------+--------+---------+
| 122 | 300 | Sub |
+---------+--------+---------+
the amounts for (Main)12 = Sub121(500)+Sub122(300) = 800
the amounts for (Main)11 = Sub111(100)+Sub112(1000)+Sub113(500) = 1600
the amounts for (Main)1 = Main11(1600)+Main12(800) = 2400
I tried to sum each account based on its leading digits: to get the sum for AccCode 1, I find the accounts that start with 1 and sum all their amounts. But how do I handle accounts with more than one character, like 12, where I want 121 + 122?
Updates:
I used the next code:
SELECT AccCode,
(
SELECT SUM(TheAmount) xResult FROM tblamounts
INNER JOIN tblaccounts ON tblaccounts.AccID = tblamounts.AccID
WHERE AccCode LIKE xacc.AccCode%
) FROM tblaccounts xacc
If I understand correctly, you want:
select a.*,
       (select sum(am.TheAmount)
        from tblamounts am
        join tblaccounts a2 on a2.AccID = am.AccID
        where a2.AccCode like concat(a.AccCode, '%')
       ) as TotalAmount
from tblaccounts a;
That is, sum the values from all accounts that start with the same characters of a given account.
You are on the right track, but for a Main account you need to sum only the accounts below it, not the account itself:
SELECT a.AccCode AS AccountCode,
(SELECT SUM(tblamounts.TheAmount) FROM tblamounts, tblaccounts g
WHERE tblamounts.AccID = g.AccID AND (g.AccCode LIKE CONCAT(a.AccCode,'%')
AND NOT (g.AccCode = a.AccCode AND LOWER(a.AccType) = 'main'))) AS AccountSum ,
a.AccType AS AccountType
FROM tblaccounts a;
So in your result AccountCode 12 will be the sum of 121 and 122.
The assumption is that AccountType 'Main' will not have its own amount, and will only be a sum of the 'Sub' account types starting with similar code.
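The prefix-matching idea can be checked against the figures in the question. This is a sketch under the stated assumption that only Sub accounts carry amounts; it runs on an in-memory SQLite database rather than MySQL, so string concatenation uses `||` where MySQL would use `CONCAT`, and the AccID values are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblaccounts (AccID INTEGER PRIMARY KEY, AccCode TEXT, AccType TEXT);
CREATE TABLE tblamounts (AccID INTEGER, TheAmount INTEGER);
INSERT INTO tblaccounts VALUES
    (1, '1', 'Main'), (2, '11', 'Main'), (3, '111', 'Sub'), (4, '112', 'Sub'),
    (5, '113', 'Sub'), (6, '12', 'Main'), (7, '121', 'Sub'), (8, '122', 'Sub');
-- Only Sub accounts have rows in tblamounts (assumption from the answer above).
INSERT INTO tblamounts VALUES (3, 100), (4, 1000), (5, 500), (7, 500), (8, 300);
""")

# For each account, sum the amounts of every account whose code starts
# with this account's code (which includes the account itself).
rows = conn.execute("""
    SELECT a.AccCode,
           (SELECT SUM(am.TheAmount)
            FROM tblamounts am
            JOIN tblaccounts g ON g.AccID = am.AccID
            WHERE g.AccCode LIKE a.AccCode || '%') AS AccountSum
    FROM tblaccounts a
    ORDER BY a.AccCode
""").fetchall()

totals = dict(rows)
print(totals["1"], totals["11"], totals["12"])  # 2400 1600 800
```

If Main accounts could also hold amounts of their own, the extra `NOT (...)` condition from the answer above would be needed to leave them out.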
Easy solution
https://www.w3schools.com/sql/sql_like.asp
select sum(Amount) from tab where AccCode LIKE "12_"
select sum(Amount) from tab where AccCode LIKE "11_"
select sum(Amount) from tab where AccCode LIKE "1_"
EDIT:
SELECT m1.AccCode, sum(m2.Amount) from tab m1
-- CONCAT rather than ||, which MySQL treats as logical OR by default
inner join tab m2 on m2.AccCode LIKE CONCAT(m1.AccCode, '_')
where m1.accType = 'Main'
group by m1.AccCode

MySql Sum different types of expenses from 'expense' table based on value in 'expense type' group by employee

A more generic title for this post would be
MySql Sum different columns in same table based on value of another column, group by yet another column
I have a table of employee expenses:
id | employee_id | expense_cat_id | expense_amount |
1  | 11          | 1              | 100            |
2  | 11          | 1              | 200            |
3  | 12          | 1              | 120            |
4  | 12          | 1              | 140            |
5  | 11          | 2              | 5              |
6  | 12          | 2              | 8              |
and I want to produce a report like this:
Employee Id | Expense Cat 1 Total Amount | Expense Cat 2 Total Amount
11 | 300 | 5
12 | 260 | 8
So initially I thought I could use 2 table aliases for the same table like this:
SELECT
employee_id,
sum(expense_cat_1.expense_amount) as expense_1_total,
sum(expense_cat_2.expense_amount) as expense_2_total
FROM
expenses as expense_cat_1 where expense_cat_1.expense_cat_id=1 ,
expenses as expense_cat_2 where expense_cat_2.expense_cat_id=2
group by employee_id
but this was not correct SQL syntax, which makes sense to me.
So I thought I could do two joins between the employees table and the expenses table:
SELECT
employees.id as employee_id,
sum(expenses_cat_1.expense_amount) as expense_1_total,
sum(expenses_cat_2.expense_amount) as expense_2_total
FROM employees
join expenses as expenses_cat_1 on employees.id = expenses_cat_1.employee_id and expenses_cat_1.expense_cat_id=1
join expenses as expenses_cat_2 on employees.id = expenses_cat_2.employee_id and expenses_cat_2.expense_cat_id=2
group by employees.id
Which comes close, but is wrong:
employee_id | expense_1_total | expense_2_total
11 | 300 | 10
12 | 260 | 16
as the expense 2 total is doubled! I think this is because the join produces one row per pair of category 1 and category 2 expenses, so each category 2 expense appears twice (once per category 1 expense) and is summed twice.
I also tried a sub-query approach:
SELECT (SELECT sum(expense_amount)
FROM expenses
WHERE expense_cat_id = 1) AS sum1 ,
(SELECT sum(expense_amount)
FROM expenses
WHERE expense_cat_id = 2) AS sum2,
employee_id
FROM expenses group by employee_id
but this has the same problem as the join approach - totals for cat 2 are doubled.
How do I make the second join only include the expense_2_total once ???
I have a personal dislike of sql case statements as they seem more of a procedural language construct (and sql is declarative), but am happy to consider their use in this case - but I put the challenge out there for sql experts to solve this elegantly.
You are looking for conditional aggregation:
SELECT employee_id,
sum(case when expense_cat_id = 1 then expense_amount else 0 end) as expense_1_total,
sum(case when expense_cat_id = 2 then expense_amount else 0 end) as expense_2_total
FROM expenses e
GROUP BY employee_id;
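As a quick check, the conditional aggregation can be run against the sample rows from the question. This sketch assumes an in-memory SQLite database in place of MySQL; the SQL itself is the same in both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE expenses (id INTEGER PRIMARY KEY, employee_id INTEGER,
                       expense_cat_id INTEGER, expense_amount INTEGER);
INSERT INTO expenses VALUES
    (1, 11, 1, 100), (2, 11, 1, 200), (3, 12, 1, 120),
    (4, 12, 1, 140), (5, 11, 2, 5),   (6, 12, 2, 8);
""")

# Each row is read once; the CASE decides which total it contributes to,
# so no self-join is needed and nothing is counted twice.
report = conn.execute("""
    SELECT employee_id,
           SUM(CASE WHEN expense_cat_id = 1 THEN expense_amount ELSE 0 END),
           SUM(CASE WHEN expense_cat_id = 2 THEN expense_amount ELSE 0 END)
    FROM expenses
    GROUP BY employee_id
    ORDER BY employee_id
""").fetchall()

print(report)  # [(11, 300, 5), (12, 260, 8)]
```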

MySQL SUM previous row by date column using Union

I am hoping I am just stumped because its the end of the work day on a Monday, and someone here can give me a hand.
Basically I have 2 tables that have invoice information and a table that has payment information. Using the following I get the first part of my display.
SELECT d.id, i.id as invid, i.company_id, d.total, created, adjustment FROM tbl_finance_invoices as i
LEFT JOIN tbl_finance_invoice_details as d ON d.invoice_id = i.id
WHERE company_id = '69350'
UNION
SELECT id, 0, comp_id, amount_paid, uploaded_date, 'paid' FROM tbl_finance_invoice_paid_items
WHERE comp_id = '69350'
ORDER BY created
What I want to do is:
Create a new column called "Balance" that adds total to the previous total by the created column regardless of how the rest of the table is sorted.
To give a quick example, my current output is something like:
id | invid | company_id | total | created | adjustment
12 | 16 | 1 | 40 | 01/01/16| 0
100| 0 | 1 | 10 | 01/05/16| 0
50 | 20 | 1 | 50 | 05/01/16| 0
What my goal is would be:
id | invid | company_id | total | created | adjustment | balance |Notes
12 | 16 | 1 | 40 | 01/01/16| 0 | 40 | 0 + 40
100| 0 | 1 | 10 | 01/05/16| 1 | 50 | 40 + 10
50 | 20 | 1 | 50 | 05/01/16| 0 | 100 | 50 + 50
And regardless of sorting by id, invid, total, created, etc, the balance would always be tied to the created date.
So if I added a "Where adjustment = '1'" to my sql, I would get:
100| 0 | 1 | 10 | 01/05/16| 1 | 50 | 40 + 10
Since the OP confirmed my understanding in comments, I'm basing my answer on the following assumption:
The running total would be tied to the order of created_date. The
running total would only be affected by company id as a filtering
criterion, all other filters should be disregarded for that
calculation.
Since the running total may have a different order by and different filtering criteria than the rest of the query, the running total calculation has to be placed in a subquery.
The other assumption I have to make is that there cannot be more than one invoice with the same created date for a single customer id, since the original query in the OP does not have any group by or summing either.
I prefer the approach suggested by @OMG Ponies in this post on SO, where he initializes the MySQL variable holding the running total in a subquery, so there is no need to initialize the variable in a separate SET statement.
SELECT d.id, i.id as invid, i.company_id, rt.total, rt.cumulative_sum, rt.created, adjustment
FROM tbl_finance_invoices as i
LEFT JOIN tbl_finance_invoice_details as d ON d.invoice_id = i.id
LEFT JOIN
(SELECT d.total, created, @running_total := @running_total + d.total AS cumulative_sum
FROM tbl_finance_invoices as i
LEFT JOIN tbl_finance_invoice_details as d ON d.invoice_id = i.id
JOIN (SELECT @running_total := 0) r -- no join condition, so this produces a Cartesian join
WHERE company_id = '69350'
ORDER BY created) rt
ON i.created=rt.created -- this is also an assumption, I do not know which original table holds the created field
WHERE company_id = '69350' and adjustment=1
ORDER BY d.id
If you need to take the amounts from the tbl_finance_invoice_paid_items into account as well, then you need to add that to the subquery.
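The key point, that the balance is computed independently of the outer query's filter and sort order, can be illustrated with a simplified sketch. It does not use MySQL user variables: it swaps in a correlated subquery for the running total, runs on an in-memory SQLite database, and uses a hypothetical single table `ledger` with ISO dates in place of the three finance tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical flattened table standing in for the UNION of invoices and payments.
CREATE TABLE ledger (id INTEGER, created TEXT, total INTEGER, adjustment INTEGER);
INSERT INTO ledger VALUES
    (12, '2016-01-01', 40, 0), (100, '2016-01-05', 10, 1), (50, '2016-05-01', 50, 0);
""")

# The balance for a row is the sum of all totals created on or before it.
# The subquery reads the whole table, so outer filters do not change it.
query = """
    SELECT l.id, l.total,
           (SELECT SUM(p.total) FROM ledger p
            WHERE p.created <= l.created) AS balance
    FROM ledger l
    {where}
    ORDER BY l.id
"""

all_rows = conn.execute(query.format(where="")).fetchall()
adjusted = conn.execute(query.format(where="WHERE l.adjustment = 1")).fetchall()

print(all_rows)  # [(12, 40, 40), (50, 50, 100), (100, 10, 50)]
print(adjusted)  # [(100, 10, 50)]
```

Sorting by id instead of created, or filtering on adjustment, leaves each row's balance unchanged, which is the behaviour asked for in the question.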

mysql 3 queries on 2 tables in one

I have 2 tables that I need to query
**tbl_jobs**
jobid | description | someinfo
1 foo bar
2 fuu buu
**tbl_invlog**
idinv | jobid | type | ammount
1 1 add 100
2 1 rem 50
3 1 rem 15
4 1 add 8
5 2 add 42
the result should be to make a sum of the inventory "add" and "rem" and give a total of sum(add)-sum(rem) for each jobid, including the rest of the job information.
jobid | description | someinfo | amountadd | amountrem | totaladdrem
1 | foo | bar | 108 | 65 | 43
2 | fuu | buu | 42 | 0 | 42
I have made a quadruple select statement with select * from (select .... ) without using joins or other cool stuff, which is terribly slow. I am quite new to MySQL.
I would be glad for any idea on how to solve this.
Thanks in advance.
This is a query that requires a join and conditional aggregation:
-- the amount column is spelled "ammount" in tbl_invlog, as in the question
select j.jobid, j.description, j.someinfo,
       sum(case when il.type = 'add' then il.ammount else 0 end) as AmountAdd,
       sum(case when il.type = 'rem' then il.ammount else 0 end) as AmountRem,
       (sum(case when il.type = 'add' then il.ammount else 0 end) -
        sum(case when il.type = 'rem' then il.ammount else 0 end)
       ) as totaladdrem
from tbl_jobs j left outer join
     tbl_invlog il
     on j.jobid = il.jobid
group by j.jobid, j.description, j.someinfo;
Note some things. First, the tables have table aliases, defined in the from clause. This allows you to say which table the columns come from. Second, the table aliases are always used for all columns in the query.
MySQL would allow you to just do group by j.jobid, using a feature called "hidden columns". I think this is a bad habit (except in a few cases), so this aggregates by all the columns in the jobs table.
The conditional aggregation is done by putting a condition (the case expression) inside the sum() function.
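A short sketch of the join plus conditional aggregation, run against the sample rows from the question. It assumes an in-memory SQLite database instead of MySQL and keeps the column name `ammount` exactly as it appears in the question's table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tbl_jobs (jobid INTEGER PRIMARY KEY, description TEXT, someinfo TEXT);
-- "ammount" is spelled as in the question's table definition.
CREATE TABLE tbl_invlog (idinv INTEGER PRIMARY KEY, jobid INTEGER,
                         type TEXT, ammount INTEGER);
INSERT INTO tbl_jobs VALUES (1, 'foo', 'bar'), (2, 'fuu', 'buu');
INSERT INTO tbl_invlog VALUES
    (1, 1, 'add', 100), (2, 1, 'rem', 50), (3, 1, 'rem', 15),
    (4, 1, 'add', 8),   (5, 2, 'add', 42);
""")

# One pass over the joined rows; each CASE routes the amount to the right total.
rows = conn.execute("""
    SELECT j.jobid, j.description, j.someinfo,
           SUM(CASE WHEN il.type = 'add' THEN il.ammount ELSE 0 END) AS amountadd,
           SUM(CASE WHEN il.type = 'rem' THEN il.ammount ELSE 0 END) AS amountrem,
           SUM(CASE WHEN il.type = 'add' THEN il.ammount ELSE 0 END)
             - SUM(CASE WHEN il.type = 'rem' THEN il.ammount ELSE 0 END) AS totaladdrem
    FROM tbl_jobs j
    LEFT JOIN tbl_invlog il ON j.jobid = il.jobid
    GROUP BY j.jobid, j.description, j.someinfo
    ORDER BY j.jobid
""").fetchall()

print(rows)  # [(1, 'foo', 'bar', 108, 65, 43), (2, 'fuu', 'buu', 42, 0, 42)]
```

Because of the LEFT JOIN, a job with no log rows at all would still appear, with all three totals equal to 0.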