SUM() of COUNT() column in MySQL

I have the following query:
SELECT COUNT(employees.id) AS emp_count
FROM `orders`
INNER JOIN `companies` ON `companies`.`id` = `orders`.`company_id`
INNER JOIN employees ON employees.company_id = companies.id
AND (employees.deleted_at >= companies.activate_at OR employees.deleted_at IS NULL)
AND employees.created_at <= companies.activate_at
WHERE
(companies.activate_at BETWEEN '2012-01-31 23:00:00' AND '2012-02-29 22:59:59'
AND orders.type = 'Order::TrialActivation'
AND orders.state = 'completed')
I need the SUM of all the "emp_count" values. Currently I iterate over all the rows returned by the query above and then sum "emp_count" in Ruby. But there are many rows, so this takes up a lot of memory.
How would I SUM on "emp_count" in SQL and return just this number?

--- Update: ---
Since the question has been updated, I'll update my answer:
If you are just trying to get the number of rows that match the WHERE clause of your query, then you can use COUNT(). However, if you want the sum of all of the employees.id values from the query, then change COUNT(employees.id) to SUM(employees.id), and that should do what you want.
--- Original Answer ---
Try using a subquery, kinda like this (code not tested):
SELECT SUM(x.emp_count) FROM (
SELECT COUNT(employees.id) AS emp_count
FROM `orders`
INNER JOIN `companies` ON `companies`.`id` = `orders`.`company_id`
INNER JOIN employees ON employees.company_id = companies.id
AND (employees.deleted_at >= companies.activate_at OR employees.deleted_at IS NULL)
AND employees.created_at <= companies.activate_at
WHERE
(companies.activate_at BETWEEN '2012-01-31 23:00:00' AND '2012-02-29 22:59:59'
AND orders.type = 'Order::TrialActivation'
AND orders.state = 'completed')
) x;
You can read more on subqueries in the MySQL documentation, and see also: How to SUM() multiple subquery rows in MySQL?
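To make the subquery idea concrete, here is a minimal sketch in Python. It assumes the per-row counts in the question come from a GROUP BY per company that the posted query does not show. SQLite (in memory) stands in for MySQL, and the two-table schema and its data are made up for illustration.

```python
import sqlite3

# SQLite stands in for MySQL here; the schema is a cut-down, hypothetical one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (id INTEGER PRIMARY KEY);
    CREATE TABLE employees (id INTEGER PRIMARY KEY, company_id INTEGER);
    INSERT INTO companies (id) VALUES (1), (2), (3);
    INSERT INTO employees (id, company_id) VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 2);
""")

# Inner query: one emp_count row per company. Outer query: add those rows up.
sum_of_counts = conn.execute("""
    SELECT SUM(x.emp_count) FROM (
        SELECT COUNT(employees.id) AS emp_count
        FROM companies
        INNER JOIN employees ON employees.company_id = companies.id
        GROUP BY companies.id
    ) x
""").fetchone()[0]

# Same join without GROUP BY: a single COUNT over all joined rows.
plain_count = conn.execute("""
    SELECT COUNT(employees.id)
    FROM companies
    INNER JOIN employees ON employees.company_id = companies.id
""").fetchone()[0]

print(sum_of_counts, plain_count)
```

In this toy data both numbers agree, which suggests that when only the grand total is needed, dropping the GROUP BY may be enough and the outer SUM is optional.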

Related

MySQL: aggregate data from distinct rows into other distinct rows?

I know the title is an abomination, but I can't think of a succinct way to describe my problem.
I have a table called onsite_notes. onsite_notes's PK is a field called onsite_note_id. I'm trying to get all of the notes' time added up for each customer. Currently, my query is returning double entries for some rows. I'm not sure why, but it's really annoying. What I want to do is count distinct rows that have a specific FK (customer_id). Here's the current query.
SELECT c.searchable_name, co.*, sum(n.time)
as worked_hours_onsite, 'onsite' as type
FROM customers c
LEFT JOIN contracts co on c.customer_id = co.customer_id
LEFT JOIN onsite_tickets t ON t.customer_id = c.customer_id
LEFT JOIN onsite_notes n ON t.onsite_id = n.onsite_id
and (n.date >= 1464760800)
and (n.date < 1467352800)
and (n.isbillable = 1)
WHERE co.contract_type != '' AND
((timestamp(now()) between co.start_date and co.end_date)
OR ((timestamp(now()) <= co.end_date) AND (co.start_date = 0))
OR ((timestamp(now()) >= co.start_date) AND (co.end_date = 0))
OR ((co.start_date = 0) AND (co.end_date=0)))
GROUP BY c.customer_id DESC ) ....
That's the general idea of the thing. It's got a unioned bit, but it's giving me the same headache. Basically, how can I ensure that I'm getting unique rows for each customer? My customer rows are unique, but the aggregated data isn't unique and I want it unique.
If you want a sum, you need a GROUP BY that matches the rows you want grouped, and every non-aggregated column in the SELECT list must also appear in the GROUP BY.
If you only need distinct rows, don't use GROUP BY at all.
Finally, GROUP BY doesn't need DESC; DESC belongs to ORDER BY. So the query should look something like this:
SELECT c.searchable_name, co.*, sum(n.time)
as worked_hours_onsite, 'onsite' as type
FROM customers c
LEFT JOIN contracts co on c.customer_id = co.customer_id
LEFT JOIN onsite_tickets t ON t.customer_id = c.customer_id
LEFT JOIN onsite_notes n ON t.onsite_id = n.onsite_id
and (n.date >= 1464760800)
and (n.date < 1467352800)
and (n.isbillable = 1)
WHERE co.contract_type != '' AND
((timestamp(now()) between co.start_date and co.end_date)
OR ((timestamp(now()) <= co.end_date) AND (co.start_date = 0))
OR ((timestamp(now()) >= co.start_date) AND (co.end_date = 0))
OR ((co.start_date = 0) AND (co.end_date=0)))
GROUP BY c.searchable_name -- plus each selected co column, listed by name (co.* is not valid in GROUP BY)
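The double counting described in the question most likely comes from join fan-out: if a customer has two contracts, every note row is joined twice before SUM runs. The sketch below reproduces that and shows one possible fix, summing the notes per customer in a derived table first. SQLite stands in for MySQL; the contract_id column and all data are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY);
    CREATE TABLE contracts (contract_id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE onsite_tickets (onsite_id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE TABLE onsite_notes (onsite_note_id INTEGER PRIMARY KEY, onsite_id INTEGER, time INTEGER);
    INSERT INTO customers VALUES (1);
    INSERT INTO contracts VALUES (10, 1), (11, 1);          -- two contracts
    INSERT INTO onsite_tickets VALUES (100, 1);
    INSERT INTO onsite_notes VALUES (1000, 100, 2), (1001, 100, 3);  -- 5 hours in total
""")

# Naive version: contracts and notes are joined side by side, so each note
# appears once per contract and the sum is inflated.
naive = conn.execute("""
    SELECT c.customer_id, SUM(n.time)
    FROM customers c
    LEFT JOIN contracts co ON co.customer_id = c.customer_id
    LEFT JOIN onsite_tickets t ON t.customer_id = c.customer_id
    LEFT JOIN onsite_notes n ON n.onsite_id = t.onsite_id
    GROUP BY c.customer_id
""").fetchall()

# Pre-aggregated version: collapse notes to one row per customer before joining.
fixed = conn.execute("""
    SELECT c.customer_id, w.worked_hours_onsite
    FROM customers c
    LEFT JOIN (
        SELECT t.customer_id, SUM(n.time) AS worked_hours_onsite
        FROM onsite_tickets t
        JOIN onsite_notes n ON n.onsite_id = t.onsite_id
        GROUP BY t.customer_id
    ) w ON w.customer_id = c.customer_id
""").fetchall()

print(naive, fixed)
```

The date and isbillable filters from the question would go inside the derived table in this arrangement.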

MySQL Count returns More Rows than it Should

I am attempting to count the number of rows from a given query. But count returns more rows than it should. What is happening?
This query returns only 1 row.
select *
from `opportunities`
inner join `companies` on `opportunities`.`company_id` = `companies`.`id`
left join `opportunityTags` on `opportunities`.`id` = `opportunityTags`.`opportunity_id`
where `opportunities`.`isPublished` = '1' and `opportunities`.`Company_id` = '1'
group by `opportunities`.`id` ;
This query returns that there are 3 rows.
select count(*) as aggregate
from `opportunities`
inner join `companies` on `opportunities`.`company_id` = `companies`.`id`
left join `opportunityTags` on `opportunities`.`id` = `opportunityTags`.`opportunity_id`
where `opportunities`.`isPublished` = '1' and `opportunities`.`Company_id` = '1'
group by `opportunities`.`id`;
When you select count(*) with a GROUP BY, it counts the joined rows inside each group rather than the number of groups. You can probably (unfortunately my realm is SQL Server and I don't have a MySQL instance to test) fix this by using the OVER() clause.
For example:
select count(*) over (partition by `opportunities`.`id`)
EDIT: Actually, it doesn't look like this is available in MySQL (window functions only arrived in MySQL 8.0), my bad. How about just wrapping the whole thing in a new SELECT statement? It's not the most elegant solution, but it will give you the figure you're after.
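A rough sketch of the wrapping suggestion, again with SQLite standing in for MySQL and a made-up dataset of one opportunity with three tags. It also tries COUNT(DISTINCT ...), which is an alternative not mentioned in the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE opportunities (id INTEGER PRIMARY KEY, company_id INTEGER, isPublished INTEGER);
    CREATE TABLE opportunityTags (id INTEGER PRIMARY KEY, opportunity_id INTEGER);
    INSERT INTO opportunities VALUES (1, 1, 1);
    INSERT INTO opportunityTags VALUES (1, 1), (2, 1), (3, 1);
""")

# Shared FROM/WHERE part of the queries in the question (companies join omitted).
body = """
    FROM opportunities o
    LEFT JOIN opportunityTags ot ON ot.opportunity_id = o.id
    WHERE o.isPublished = 1 AND o.company_id = 1
"""

# COUNT(*) with GROUP BY: one row per group, holding the joined-row count.
per_group = conn.execute("SELECT COUNT(*) " + body + " GROUP BY o.id").fetchall()

# Wrapping the grouped query and counting its rows gives the number of groups.
wrapped = conn.execute(
    "SELECT COUNT(*) FROM (SELECT o.id " + body + " GROUP BY o.id) AS grouped"
).fetchone()[0]

# Alternative: count distinct ids without grouping at all.
distinct = conn.execute("SELECT COUNT(DISTINCT o.id) " + body).fetchone()[0]

print(per_group, wrapped, distinct)
```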

Order a Group BY

There are a lot of questions on this topic, but I still can't figure out a way to make this work.
The query I'm doing is:
SELECT `b`.`ads_id` AS `ads_id`,
`b`.`bod_bedrag` AS `bod_bedrag`,
`a`. `ads_naam` AS `ads_naam`,
`a`.`ads_url` AS `ads_url`,
`a`.`ads_prijs` AS `ads_price`,
`i`.`url` AS `img_url`,
`c`.`url` AS `cat_url`
FROM `ads_market_bids` AS `b`
INNER JOIN `ads_market` AS `a`
ON `b`.`ads_id` = `a`.`id`
INNER JOIN `ads_images` AS `i`
ON `b`.`ads_id` = `i`.`ads_id`
INNER JOIN `ads_categories` AS `c`
ON `a`.`cat_id` = `c`.`id`
WHERE `i`.`img_order` = '0'
AND `b`.`u_id` = '285'
GROUP BY `b`.`ads_id`
HAVING MAX(b.bod_bedrag)
ORDER BY `b`.`bod_bedrag` ASC
But, the problem I keep seeing is that I need b.bod_bedrag to be sorted before the GROUP BY is taking place or so. Don't know how to explain it exactly.
The bod_bedrag values I'm getting now are the lowest of the bids in the table. I need the highest.
I've tried just about everything, and even thought of not grouping but using DISTINCT. That didn't work either. I tried ordering by MAX and everything else I know or could find on the internet.
Image 1 is the situation without the group by. Order By works great (ofc).
Image 2 is with the group by. As you can see, the lowest bid is taken as bod_bedrag. I need the highest.
Judging by your output you want:
SELECT amb.ads_id,
MAX(amb.bod_bedrag) max_bod_bedrag,
am.ads_naam,
am.ads_url,
am.ads_prijs ads_price,
ai.url img_url,
ac.url cat_url
FROM ads_market_bids amb
JOIN ads_images ai
ON ai.ads_id = amb.ads_id
AND ai.img_order = 0
JOIN ads_market am
ON am.id = amb.ads_id
JOIN ads_categories ac
ON ac.id = am.cat_id
WHERE amb.u_id = 285
GROUP BY amb.ads_id,
am.ads_naam,
am.ads_url,
am.ads_prijs,
ai.url,
ac.url
ORDER BY max_bod_bedrag ASC
I have also removed all the unnecessary backtickery and aliasing of columns to the same name.
Your HAVING was doing nothing, as all the groups 'have' a MAX(amb.bod_bedrag).
select distinct `b`.`ads_id` as `ads_id`, max(`b`.`bod_bedrag`) as `bod_bedrag`,
`a`.`ads_naam` as `ads_naam`, `a`.`ads_url` as `ads_url`, `a`.`ads_prijs` as `ads_price`,
`i`.`url` as `img_url`, `c`.`url` as `cat_url`
from `ads_market_bids` as `b`
inner join `ads_market` as `a` on `b`.`ads_id` = `a`.`id`
inner join `ads_images` as `i` on `b`.`ads_id` = `i`.`ads_id`
inner join `ads_categories` as `c` on `a`.`cat_id` = `c`.`id`
where `i`.`img_order` = '0' and `b`.`u_id` = '285'
group by b.ads_id, a.ads_naam, a.ads_url, a.ads_prijs, i.url, c.url
One approach is to simulate row_number() (which MySQL lacked before version 8.0). It selects whole records rather than aggregates, which may come from disparate source records. It works by adding two variables to each row (it does not increase the number of rows). Then, using an ordered subquery, RN is set to 1 for the row with the highest b.bod_bedrag for each b.ads_id; all other rows for that b.ads_id get a higher RN value. At the end we filter on RN = 1, which corresponds to the record containing the highest bid value.
SELECT *
FROM (
SELECT
-- RN restarts at 1 whenever ads_id changes; rows arrive highest bid first
@row_num := IF(@prev_value = `b`.`ads_id`, @row_num + 1, 1) AS RN
,`b`.`ads_id` AS `ads_id`
,`b`.`bod_bedrag` AS `bod_bedrag`
,`a`.`ads_naam` AS `ads_naam`
,`a`.`ads_url` AS `ads_url`
,`a`.`ads_prijs` AS `ads_price`
,`i`.`url` AS `img_url`
,`c`.`url` AS `cat_url`
-- remember the ads_id of this row for the comparison on the next row
, @prev_value := `b`.`ads_id`
FROM `ads_market_bids` AS `b`
INNER JOIN `ads_market` AS `a` ON `b`.`ads_id` = `a`.`id`
INNER JOIN `ads_images` AS `i` ON `b`.`ads_id` = `i`.`ads_id`
INNER JOIN `ads_categories` AS `c` ON `a`.`cat_id` = `c`.`id`
CROSS JOIN
( SELECT @row_num := 1
, @prev_value := ''
) vars
WHERE `i`.`img_order` = '0'
AND `b`.`u_id` = '285'
ORDER BY `b`.`ads_id`, `b`.`bod_bedrag` DESC
) AS ranked
WHERE RN = 1;
You can even turn off that silly GROUP BY extension, details in the man page:
MySQL Extensions to GROUP BY
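Another way to pick the whole record holding the highest bid per ad, not taken from the answers above, is to join the bids table to a subquery of per-ad maxima. The sketch below uses SQLite in place of MySQL, keeps only the bids table, and assumes a hypothetical id column plus invented bid amounts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ads_market_bids (id INTEGER PRIMARY KEY, ads_id INTEGER, u_id INTEGER, bod_bedrag INTEGER);
    INSERT INTO ads_market_bids VALUES
        (1, 1, 285, 10), (2, 1, 285, 30), (3, 1, 285, 20),
        (4, 2, 285, 5),  (5, 2, 285, 15),
        (6, 1, 7, 99);   -- another user's bid, must be ignored
""")

# The derived table finds the highest bid per ad for user 285; joining back on
# (ads_id, amount) returns the complete bid row that holds that maximum.
highest_bids = conn.execute("""
    SELECT b.id, b.ads_id, b.bod_bedrag
    FROM ads_market_bids b
    JOIN (
        SELECT ads_id, MAX(bod_bedrag) AS max_bid
        FROM ads_market_bids
        WHERE u_id = 285
        GROUP BY ads_id
    ) best ON best.ads_id = b.ads_id AND best.max_bid = b.bod_bedrag
    WHERE b.u_id = 285
    ORDER BY b.bod_bedrag ASC
""").fetchall()

print(highest_bids)
```

If two bids on the same ad tie for the maximum, this form returns both rows, so a tie-breaker may be needed in practice.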

MySQL Help: Return invoices and payments by date

I am having trouble getting a MySQL query to work for me. Here is the setup.
A customer has asked me to compile a report from some accounting data. He wants to select a date (and possibly other criteria) and have it return all of the following (an OR statement):
1.) All invoices that were inserted on or after that date
2.) All invoices regardless of their insert date that have corresponding payments in a separate table whose insert dates are on or after the selected date.
The first clause is basic, but I am having trouble pairing it with the second.
I have assembled a comparable set of test data in an SQL Fiddle. The query that I currently have is provided.
http://www.sqlfiddle.com/#!2/d8d9c/3/2
As noted in the comments of the fiddle, I am working with July 1, 2013 as my selected date. For the test to work, I need invoices 1 through 5 to appear, but not invoice #6.
Try this: http://www.sqlfiddle.com/#!2/d8d9c/9
Here are the summarized changes
I got rid of your GROUP BY. You did not have any aggregate functions. I used DISTINCT instead to eliminate duplicate records
I removed your implicit joins and put explicit joins in their place for readability. Then I changed them to LEFT JOINs. I am not sure what your data looks like but at a minimum, I would assume you need the payments LEFT JOINed if you want to select an invoice that has no payments.
This will probably get you the records you want, but those subselects in the SELECT clause may perform better as LEFT JOINs with the SUM function than as subselects.
Here is the query
SELECT DISTINCT
a.abbr landowner,
CONCAT(f.ForestLabel, '-', l.serial, '-', l.revision) leasenumber,
i.iid,
FROM_UNIXTIME(i.dateadded,'%M %d, %Y') InvoiceDate,
(SELECT IFNULL(SUM(ch.amount), 0.00) n FROM test_charges ch WHERE ch.invoiceid = i.iid) totalBilled,
(SELECT SUM(p1.amount) n FROM test_payments p1 WHERE p1.invoiceid = i.iid AND p1.transtype = 'check' AND p1.status = 2) checks,
(SELECT SUM(p1.amount) n FROM test_payments p1 WHERE p1.invoiceid = i.iid AND p1.transtype = 'ach' AND p1.status = 2) ach,
CASE WHEN i.totalbilled < 0 THEN i.totalbilled * -1 ELSE 0.00 END credits,
CASE WHEN i.balance >= 0 THEN i.balance ELSE 0.00 END balance,
t.typelabel, g.groupname
FROM test_invoices i
LEFT JOIN test_contracts c
ON i.contractid = c.cid
LEFT JOIN test_leases l
ON c.leaseid = l.bid
LEFT JOIN test_forest f
ON l.forest = f.ForestID
LEFT JOIN test_leasetypes t
ON l.leasetype = t.tid
LEFT JOIN test_accounts a
ON l.account = a.aid
LEFT JOIN test_groups g
ON c.groupid = g.gid
LEFT JOIN test_payments p
ON p.invoiceid = i.iid
WHERE (i.dateadded >= @startdate) OR (p.dateadded >= @startdate)
Try this.
http://www.sqlfiddle.com/#!2/d8d9c/11/2
TL;DR:
… AND (i.dateadded >= @startdate
OR EXISTS (
SELECT * FROM test_payments
WHERE test_payments.invoiceid = i.iid
AND test_payments.dateadded >= @startdate))
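To illustrate why EXISTS avoids the duplicates that the joined version has to remove with DISTINCT, here is a small sketch. SQLite replaces MySQL, a bound parameter replaces the start-date variable, and the pid column, the integer timestamps, and the four invoices are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test_invoices (iid INTEGER PRIMARY KEY, dateadded INTEGER);
    CREATE TABLE test_payments (pid INTEGER PRIMARY KEY, invoiceid INTEGER, dateadded INTEGER);
    INSERT INTO test_invoices VALUES (1, 150), (2, 50), (3, 50), (4, 50);
    INSERT INTO test_payments VALUES
        (1, 2, 120), (2, 2, 130),   -- invoice 2: two recent payments
        (3, 3, 60);                 -- invoice 3: only an old payment
""")
start = 100  # hypothetical cut-off timestamp

# Joined form without DISTINCT: invoice 2 comes back once per matching payment.
joined = [row[0] for row in conn.execute("""
    SELECT i.iid
    FROM test_invoices i
    LEFT JOIN test_payments p ON p.invoiceid = i.iid
    WHERE i.dateadded >= ? OR p.dateadded >= ?
    ORDER BY i.iid
""", (start, start))]

# EXISTS form: each invoice is tested once, so no duplicates appear.
with_exists = [row[0] for row in conn.execute("""
    SELECT i.iid
    FROM test_invoices i
    WHERE i.dateadded >= ?
       OR EXISTS (SELECT 1 FROM test_payments p
                  WHERE p.invoiceid = i.iid AND p.dateadded >= ?)
    ORDER BY i.iid
""", (start, start))]

print(joined, with_exists)
```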

MySQL Update query with left join and group by

I am trying to create an update query and making little progress in getting the right syntax.
The following query is working:
SELECT t.Index1, t.Index2, COUNT( m.EventType )
FROM Table t
LEFT JOIN MEvents m ON
(m.Index1 = t.Index1 AND
m.Index2 = t.Index2 AND
(m.EventType = 'A' OR m.EventType = 'B')
)
WHERE (t.SpecialEventCount IS NULL)
GROUP BY t.Index1, t.Index2
It creates a list of triplets Index1,Index2,EventCounts.
It only does this for case where t.SpecialEventCount is NULL. The update query I am trying to write should set this SpecialEventCount to that count, i.e. COUNT(m.EventType) in the query above. This number could be 0 or any positive number (hence the left join). Index1 and Index2 together are unique in Table t and they are used to identify events in MEvent.
How do I have to modify the select query to become an update query? I.e. something like
UPDATE Table SET SpecialEventCount=COUNT(m.EventType).....
but I am confused what to put where and have failed with numerous different guesses.
I take it that (Index1, Index2) is a unique key on Table, otherwise I would expect the reference to t.SpecialEventCount to result in an error.
Edited query to use subquery as it didn't work using GROUP BY
UPDATE
Table AS t
LEFT JOIN (
SELECT
Index1,
Index2,
COUNT(EventType) AS NumEvents
FROM
MEvents
WHERE
EventType = 'A' OR EventType = 'B'
GROUP BY
Index1,
Index2
) AS m ON
m.Index1 = t.Index1 AND
m.Index2 = t.Index2
SET
t.SpecialEventCount = COALESCE(m.NumEvents, 0) -- rows with no matching events get 0 instead of staying NULL
WHERE
t.SpecialEventCount IS NULL
Doing a left join with a subquery can materialize a large in-memory temporary table, which may have no indexes. For updates, try avoiding joins and using correlated subqueries instead:
UPDATE
Table AS t
SET
t.SpecialEventCount = (
SELECT COUNT(m.EventType)
FROM MEvents m
WHERE m.EventType in ('A','B')
AND m.Index1 = t.Index1
AND m.Index2 = t.Index2
)
WHERE
t.SpecialEventCount IS NULL
Do some profiling, but this can be significantly faster in some cases.
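The correlated-subquery update can be tried out as follows. This is only a sketch: SQLite stands in for MySQL, the table is called Items (a hypothetical name, since Table is a reserved word), and the rows are invented. The point is that COUNT returns 0 for rows without matching events, so no NULL handling is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Items (Index1 INTEGER, Index2 INTEGER, SpecialEventCount INTEGER,
                        PRIMARY KEY (Index1, Index2));
    CREATE TABLE MEvents (Index1 INTEGER, Index2 INTEGER, EventType TEXT);
    INSERT INTO Items VALUES (1, 1, NULL), (1, 2, NULL), (2, 1, 7);
    INSERT INTO MEvents VALUES (1, 1, 'A'), (1, 1, 'B'), (1, 1, 'C'), (2, 1, 'A');
""")

# For every row whose count is still NULL, count its 'A'/'B' events.
# The subquery refers back to the row being updated via Items.Index1/Index2.
conn.execute("""
    UPDATE Items
    SET SpecialEventCount = (
        SELECT COUNT(m.EventType)
        FROM MEvents m
        WHERE m.EventType IN ('A', 'B')
          AND m.Index1 = Items.Index1
          AND m.Index2 = Items.Index2
    )
    WHERE SpecialEventCount IS NULL
""")

counts = conn.execute(
    "SELECT Index1, Index2, SpecialEventCount FROM Items ORDER BY Index1, Index2"
).fetchall()
print(counts)
```

The row that already had a count (7) is left untouched because of the WHERE clause.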
My example:
update card_crowd as cardCrowd
LEFT JOIN
(
select cc.id , count(ccr.crowd_id) as num -- count(1) would report 1 for a crowd with no joined rows
from card_crowd cc LEFT JOIN
card_crowd_r ccr on cc.id = ccr.crowd_id
group by cc.id
) as tt
on cardCrowd.id = tt.id
set cardCrowd.join_num = tt.num;