In a totals row (Location is the row group, date is the column group), I'm not sure how to calculate a total based on the expression below, which I've been using to calculate daily "overcapacity" counts. I'm trying to total up the daily overcap counts.
The expression works fine under the "date" column grouping: each day in the range picked by the user displays an "overcapacity" count (if VisitsCount > 7, the result is VisitsCount - 7; otherwise it is 0).
But I'm not sure how to total up the resulting overcapacity counts (I can total up VisitsCount fine). The issue is that the total takes the FULL count for the whole range and THEN subtracts 7, instead of summing the previously calculated daily counts (where, if a daily count exceeds 7, 7 is subtracted from that day's count to give its "overcapacity" count).
=IIF(SUM(IIF(Fields!Location.Value = "LOC3" OR Fields!Location.Value = "LOC4",Fields!VisitsCount.Value,0)) > 7,
SUM(SUM(IIF(Fields!Location.Value = "LOC3" OR Fields!Location.Value = "LOC4",Fields!VisitsCount.Value,0))-7),
0)
ADDITIONAL INFORMATION
Dataset Query:
SELECT
L.Location
, D.[date]
, COUNT(DISTINCT CAST(V.VisitID AS VARCHAR(10))+V.Location+V.[Room-Bed]) AS VisitsCount
FROM dbo.DateTable(@StartDate, @EndDate) AS D
CROSS JOIN (
SELECT DISTINCT Location FROM dbo.vHIMOverFlowBedReport2)
AS L LEFT JOIN dbo.vHIMOverFlowBedReport2 AS V ON D.[date] BETWEEN
V.EffectiveDate AND ISNULL(V.ServiceEndDate, @EndDate) AND V.Location = L.Location
WHERE V.EffectiveDate IS NULL OR
(V.EffectiveDate <= @EndDate OR
V.ServiceEndDate >= @StartDate OR
V.ServiceEndDate IS NULL)
GROUP BY
L.Location
, D.[date]
ORDER BY L.Location, D.[date]
OverCap New Column Testing
Dataset Query with Added OverCap Column:
SELECT
x.Location
, x.[date]
, x.VisitsCount
, OverCap = IIF(VisitsCount-7 <0 ,0, VisitsCount-7)
FROM (
SELECT
L.Location
, D.[date]
, COUNT(DISTINCT CAST(V.VisitID AS VARCHAR(10))+V.Location+V.[Room-Bed]) AS VisitsCount
FROM dbo.DateTable(@StartDate, @EndDate) AS D
CROSS JOIN (
SELECT DISTINCT Location FROM dbo.vHIMOverFlowBedReport2)
AS L LEFT JOIN dbo.vHIMOverFlowBedReport2 AS V ON D.[date] BETWEEN
V.EffectiveDate AND ISNULL(V.ServiceEndDate, @EndDate) AND V.Location = L.Location
WHERE V.EffectiveDate IS NULL OR
(V.EffectiveDate <= @EndDate OR
V.ServiceEndDate >= @StartDate OR
V.ServiceEndDate IS NULL)
GROUP BY
L.Location
, D.[date]
) x
ORDER BY x.Location, x.[date]
Full Report Layout with OverCap added
Grouped by Location (row) and [date] (column), with a totals row outside row group for combined location.
[Image: report layout]
Current Resultset with OverCap Column
[Image: CurrentResultSet]
NOTES:
Result for LOC3-4 is 129 (see CurrentResultSet) if I use:
OverCap = IIF(VisitsCount-7 <0 ,0, VisitsCount-7)
Result for LOC3-4 is 34 if I use:
OverCap = IIF(Location = 'LOC3' OR Location = 'LOC4', VisitsCount,0)
Result for LOC3-4 is 24 if I use:
OverCap = IIF((Location = 'LOC3' OR Location = 'LOC4') AND VisitsCount > 7, VisitsCount,0)
Expressions Used Successfully for GROUPED locations:
DailyVisitorCount (used for both DailyVisitorCount and TotalVisitorCount).
=Sum(Fields!VisitsCount.Value)
DailyOverCapacityCount (used for both DailyOverCapacityCount and TotalOverCapacityCount):
=SWITCH(
Fields!Location.Value = "LOC1" AND Fields!VisitsCount.Value > 24, SUM(Fields!VisitsCount.Value - 24),
Fields!Location.Value = "LOC2" AND Fields!VisitsCount.Value > 16, SUM(Fields!VisitsCount.Value - 16),
Fields!Location.Value = "LOC3" AND Fields!VisitsCount.Value > 7, SUM(Fields!VisitsCount.Value - 7),
Fields!Location.Value = "LOC4" AND Fields!VisitsCount.Value > 7, SUM(Fields!VisitsCount.Value - 7),
Fields!Location.Value = "LOC5" AND Fields!VisitsCount.Value > 11, SUM(Fields!VisitsCount.Value - 11),
True, 0)
Averages were calculated by using the above expressions but adding to the end:
/CountDistinct(Fields!date.Value)
Expressions Used for combined location (outside grouped location row)
DailyVisitorCount (used successfully for both DailyVisitorCount and TotalVisitorCount).
=IIF(Fields!Location.Value = "LOC3" OR Fields!Location.Value = "LOC4", Sum(Fields!VisitsCount.Value), 0)
TotalOverCapCount (used successfully for the DAILY TotalOverCapCount, but not for the Total TotalOverCapCount):
=IIF(SUM(IIF(Fields!Location.Value = "LOC3" OR Fields!Location.Value = "LOC4",Fields!VisitsCount.Value,0)) > 7,
SUM(SUM(IIF(Fields!Location.Value = "LOC3" OR Fields!Location.Value = "LOC4",Fields!VisitsCount.Value,0))-7),
0)
Expressions still needed:
Total TotalOverCapCount (adds up daily totals for the duration)
Average TotalOverCapCount (average of daily totals for the duration)
Because the expression you are trying to sum combines two differently scoped aggregates, it is difficult to sum its results directly.
What you would need is a nested aggregate along these lines:
=SUM(
SUM(Fields!MyField.Value, "ColumnGroup")
, "RowGroup")
This is probably not possible in your scenario. I abandoned that approach quickly and instead updated the dataset query to return the data I needed.
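To make the nested-scope idea concrete outside SSRS, here is a minimal Python sketch (not SSRS syntax) of aggregating per day first and only then summing across days. The rows reuse the LOC3 and LOC4 visit counts from the sample data in this thread, the combined threshold of 7 is taken from the question's expression, and the function names are made up for illustration.

```python
from collections import defaultdict

CAPACITY = 7  # combined LOC3+LOC4 threshold, taken from the question's expression

# (location, date, visits) rows; LOC3/LOC4 values from the sample data in this thread
rows = [
    ("LOC3", "2022-10-31", 8), ("LOC3", "2022-11-01", 8), ("LOC3", "2022-11-02", 8),
    ("LOC4", "2022-10-31", 5), ("LOC4", "2022-11-01", 5), ("LOC4", "2022-11-02", 7),
]

def daily_overcap(rows, capacity=CAPACITY):
    """Inner aggregate: combined visits per day, then the excess over capacity."""
    per_day = defaultdict(int)
    for _location, date, visits in rows:
        per_day[date] += visits
    return {date: max(total - capacity, 0) for date, total in per_day.items()}

def total_overcap(rows, capacity=CAPACITY):
    """Outer aggregate: sum the already-computed daily excesses."""
    return sum(daily_overcap(rows, capacity).values())

def naive_total_overcap(rows, capacity=CAPACITY):
    """What the totals cell effectively does: subtract capacity once from the grand total."""
    grand_total = sum(visits for _location, _date, visits in rows)
    return max(grand_total - capacity, 0)

print(daily_overcap(rows))        # per-day excess
print(total_overcap(rows))        # sum of the daily excesses
print(naive_total_overcap(rows))  # grand total minus capacity
```

With these rows the daily excesses are 6, 6 and 8, so the summed total is 20, whereas subtracting 7 once from the grand total of 41 gives 34.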
So the dataset query looked like this (using your sample data):
DECLARE @t TABLE([Location] varchar(10), [Date] date, [VisitsCount] int)
INSERT INTO @t VALUES
('LOC1', '2022-10-31', 18), ('LOC1', '2022-11-01', 19),
('LOC1', '2022-11-02', 19), ('LOC2', '2022-10-31', 34),
('LOC2', '2022-11-01', 30), ('LOC2', '2022-11-02', 35),
('LOC3', '2022-10-31', 8), ('LOC3', '2022-11-01', 8),
('LOC3', '2022-11-02', 8), ('LOC4', '2022-10-31', 5),
('LOC4', '2022-11-01', 5), ('LOC4', '2022-11-02', 7),
('LOC5', '2022-10-31', 11), ('LOC5', '2022-11-01', 11),
('LOC5', '2022-11-02', 11)
SELECT
[Location], [Date], VisitsCount
, OverCap = IIF(VisitsCount-7 < 0, 0, VisitsCount-7)
FROM @t
As you can see, I added an OverCap column. Now all we need to do is sum that in the report.
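As a rough illustration of why a per-row column makes the totals trivial, this Python sketch mirrors the query above: it computes OverCap for each row and then simply sums it per location. The threshold of 7 is applied to every location, exactly as in the sample query; per-location limits would need a lookup instead. The function names are hypothetical.

```python
from collections import defaultdict

# (location, date, visits_count) rows, copied from the sample data above
rows = [
    ("LOC1", "2022-10-31", 18), ("LOC1", "2022-11-01", 19), ("LOC1", "2022-11-02", 19),
    ("LOC2", "2022-10-31", 34), ("LOC2", "2022-11-01", 30), ("LOC2", "2022-11-02", 35),
    ("LOC3", "2022-10-31", 8), ("LOC3", "2022-11-01", 8), ("LOC3", "2022-11-02", 8),
    ("LOC4", "2022-10-31", 5), ("LOC4", "2022-11-01", 5), ("LOC4", "2022-11-02", 7),
    ("LOC5", "2022-10-31", 11), ("LOC5", "2022-11-01", 11), ("LOC5", "2022-11-02", 11),
]

def with_overcap(rows, capacity=7):
    """Add the OverCap value to every row, like the extra column in the query."""
    return [(loc, date, visits, max(visits - capacity, 0)) for loc, date, visits in rows]

def overcap_by_location(rows, capacity=7):
    """Sum the per-row OverCap values per location, like Sum(Fields!OverCap.Value)."""
    totals = defaultdict(int)
    for loc, _date, _visits, overcap in with_overcap(rows, capacity):
        totals[loc] += overcap
    return dict(totals)

print(overcap_by_location(rows))
```

Because the threshold is applied before any summing, the report total no longer depends on the scope in which the aggregate is evaluated.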
The report design looks like this..
and the final report looks like this...
Obviously you might need to adapt this to suit whatever grouping you have going but I think it's probably a much simpler approach.
Edit after update by OP
You can wrap your original query in an outer SELECT and move the ORDER BY clause out of the subquery. That should give you the same results.
SELECT
[Location], [Date], VisitsCount
, OverCap = IIF(VisitsCount-7 <0 ,0, VisitsCount-7)
FROM (
SELECT
L.Location
, D.[date]
, COUNT(DISTINCT CAST(V.VisitID AS VARCHAR(10))+V.Location+V.[Room-Bed]) AS VisitsCount
FROM dbo.DateTable(@StartDate, @EndDate) AS D
CROSS JOIN (
SELECT DISTINCT Location FROM dbo.vHIMOverFlowBedReport2)
AS L LEFT JOIN dbo.vHIMOverFlowBedReport2 AS V ON D.[date] BETWEEN
V.EffectiveDate AND ISNULL(V.ServiceEndDate, @EndDate) AND V.Location = L.Location
WHERE V.EffectiveDate IS NULL OR
(V.EffectiveDate <= @EndDate OR
V.ServiceEndDate >= @StartDate OR
V.ServiceEndDate IS NULL)
GROUP BY
L.Location
, D.[date]
) x
ORDER BY [Location], [date]
If you just want a combined LOC3+4 row at the end of the table, you can do this in SQL. All I've done here is take the existing query, dump the results into a temp table, then return those results plus an extra row that contains only LOC3 and LOC4 data.
Note: I added a GroupOrder column so you can sort on this in the report to make sure the combined row appears at the end.
SELECT
[Location], [Date], VisitsCount
, OverCap = IIF(VisitsCount-7 <0 ,0, VisitsCount-7)
INTO #t
FROM (
SELECT
L.Location
, D.[date]
, COUNT(DISTINCT CAST(V.VisitID AS VARCHAR(10))+V.Location+V.[Room-Bed]) AS VisitsCount
FROM dbo.DateTable(@StartDate, @EndDate) AS D
CROSS JOIN (
SELECT DISTINCT Location FROM dbo.vHIMOverFlowBedReport2)
AS L LEFT JOIN dbo.vHIMOverFlowBedReport2 AS V ON D.[date] BETWEEN
V.EffectiveDate AND ISNULL(V.ServiceEndDate, @EndDate) AND V.Location = L.Location
WHERE V.EffectiveDate IS NULL OR
(V.EffectiveDate <= @EndDate OR
V.ServiceEndDate >= @StartDate OR
V.ServiceEndDate IS NULL)
GROUP BY
L.Location
, D.[date]
) x
SELECT
GroupOrder = 1
, [Location], [Date], VisitsCount, OverCap
FROM #t
UNION ALL
SELECT
GroupOrder = 2
, [Location] = 'LOC3+4', MIN([Date]), SUM(VisitsCount), SUM(OverCap)
FROM #t
WHERE [Location] IN ('LOC3', 'LOC4')
GROUP BY YEAR([Date]), MONTH([Date])
I am trying to get the number of 'critics' and 'promoters' from the average of ratings in a joined table, for a specific group of questions:
SELECT category
, SUM( IF( round(avg(items.value) ) <= 6, 1, 0) ) AS critics
, SUM( IF( round(avg(items.value) ) >= 9, 1, 0) ) AS promoters
FROM reviews
INNER JOIN items
ON reviews.id = items.review_id
AND items.question_id in (1, 2, 4)
GROUP BY category
However I get the error:
General error: 1111 Invalid use of group function
I think you should try using HAVING, something like this:
SELECT
category,
COUNT(items.id) AS critics
FROM reviews
INNER JOIN items ON reviews.id = items.review_id AND
items.question_id IN (1, 2, 4)
GROUP BY category
HAVING ROUND(AVG(items.value)) <= 6
First retrieve the rounded average value per category, then apply the conditions to flag each category as critic or promoter.
-- MySQL
SELECT t.category
, CASE WHEN t.avg_value <= 6
THEN 1
ELSE 0
END critics
, CASE WHEN t.avg_value >= 9
THEN 1
ELSE 0
END promoters
FROM (SELECT category
, ROUND(AVG(items.value)) avg_value
FROM reviews
INNER JOIN items
ON reviews.id = items.review_id
AND items.question_id IN (1, 2, 4)
GROUP BY category) t
See this fiddle for a demo: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=2679b2be50c3059c73ab9754c612179c
Alternatively, first retrieve the rounded average value per category and review_id, then apply the conditions and count the critics and promoters per category.
SELECT t.category
, SUM(CASE WHEN t.avg_value <= 6
THEN 1
ELSE 0
END) critics
, SUM(CASE WHEN t.avg_value >= 9
THEN 1
ELSE 0
END) promoters
FROM (SELECT category
, items.review_id
, ROUND(AVG(items.value)) avg_value
FROM reviews
INNER JOIN items
ON reviews.id = items.review_id
AND items.question_id IN (1, 2, 4)
GROUP BY category
, items.review_id) t
GROUP BY t.category
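The same two-stage idea (aggregate per review first, then count per category) can be sketched in Python. The rows below are invented purely for illustration, and `round_half_up` is a hypothetical helper standing in for MySQL's `ROUND()`, which for exact-value numbers rounds halves away from zero, unlike Python's built-in `round()`.

```python
from collections import defaultdict
import math

# (category, review_id, value) rows as they would come out of the join,
# already filtered to question_id IN (1, 2, 4); invented sample data
rows = [
    ("hotel", 1, 9), ("hotel", 1, 10), ("hotel", 1, 9),
    ("hotel", 2, 5), ("hotel", 2, 6), ("hotel", 2, 7),
    ("cafe", 3, 8), ("cafe", 3, 8), ("cafe", 3, 7),
    ("cafe", 4, 3), ("cafe", 4, 4), ("cafe", 4, 6),
]

def round_half_up(x):
    """Round a non-negative number to the nearest integer, halves up."""
    return math.floor(x + 0.5)

def critics_and_promoters(rows):
    # Stage 1: collect values per (category, review_id), like the inner GROUP BY
    values = defaultdict(list)
    for category, review_id, value in rows:
        values[(category, review_id)].append(value)

    # Stage 2: classify each review's rounded average and count per category
    result = defaultdict(lambda: {"critics": 0, "promoters": 0})
    for (category, _review_id), vals in values.items():
        avg_value = round_half_up(sum(vals) / len(vals))
        result[category]["critics"] += avg_value <= 6
        result[category]["promoters"] += avg_value >= 9
    return {category: dict(counts) for category, counts in result.items()}

print(critics_and_promoters(rows))
```

The point of the split is the same as in the SQL: the average has to be fully computed per review before anything can be counted, which is why nesting `AVG()` inside `SUM()` in a single query level is rejected.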
I'm having trouble getting the right frequencies for cross-tabulated data. My expected output is something like this:
I tried replacing the COUNT expressions with SUM:
SUM(IF(product.product_id = 1, line_item.quantity, 0)) AS Soda,
SUM(IF(product.product_id = 2, line_item.quantity, 0)) AS Liquor,
SUM(IF(product.product_id = 3, line_item.quantity, 0)) AS Lemon,
SUM(IF(product.product_id = 4, line_item.quantity, 0)) AS Mango,
SUM(IF(product.product_id = 5, line_item.quantity, 0)) AS Inhaler,
SUM(1) AS Count
FROM line_item
JOIN product USING (product_id)
JOIN ( SELECT 0 lo, 500 hi UNION
SELECT 501 , 1000 UNION
SELECT 1001 , 1500 UNION
SELECT 1501 , 2000 UNION
SELECT 2001 , 2500 ) ranges ON (product.price * line_item.quantity) BETWEEN ranges.lo AND ranges.hi
GROUP BY ranges.lo, ranges.hi
This is getting closer: the rows are now distributed into their ranges, but the values are not correct. I am expecting to see something like this:
[Expected Result][1]
[1]: https://i.stack.imgur.com/YuB92.png
After reviewing my code here is the answer:
SUM(product.product_id = 1) AS Soda,
SUM(product.product_id = 2) AS Liquor,
SUM(product.product_id = 3) AS Lemon,
SUM(product.product_id = 4) AS Mango,
SUM(product.product_id = 5) AS Inhaler,
SUM(1) AS Count
FROM line_item
JOIN product USING (product_id)
JOIN ( SELECT 0 lo, 500 hi UNION
SELECT 501 , 1000 UNION
SELECT 1001 , 1500 UNION
SELECT 1501 , 2000 UNION
SELECT 2001 , 2500 ) ranges ON product.price * line_item.quantity BETWEEN ranges.lo AND ranges.hi
GROUP BY ranges.lo, ranges.hi
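The corrected query works because MySQL evaluates a comparison such as `product.product_id = 1` to 1 or 0, so `SUM()` over it counts matching rows instead of adding up quantities. Here is a small Python sketch of the same counting logic, using the ranges from the query and invented prices and quantities; the function name is made up for illustration.

```python
RANGES = [(0, 500), (501, 1000), (1001, 1500), (1501, 2000), (2001, 2500)]

# (product_id, price, quantity) per line item; invented sample data
line_items = [
    (1, 50, 4),     # total 200  -> range 0-500
    (1, 50, 6),     # total 300  -> range 0-500
    (2, 300, 2),    # total 600  -> range 501-1000
    (3, 120, 10),   # total 1200 -> range 1001-1500
    (2, 300, 1),    # total 300  -> range 0-500
]

def crosstab(line_items, product_ids=(1, 2, 3, 4, 5)):
    """Count line items per product within each price*quantity range."""
    table = {}
    for lo, hi in RANGES:
        in_range = [pid for pid, price, qty in line_items if lo <= price * qty <= hi]
        if not in_range:
            continue  # an inner join produces no row for an empty range
        # sum() over booleans counts the True values, like SUM(product_id = n)
        counts = {pid: sum(p == pid for p in in_range) for pid in product_ids}
        counts["Count"] = len(in_range)
        table[(lo, hi)] = counts
    return table

for bounds, counts in crosstab(line_items).items():
    print(bounds, counts)
```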
I have a table that looks like this:
Company  Year  Revenue  Cost  Profit
ABC      1     10       6     4
ABC      2     12       7     5
ABC      3     14       8     6
XYZ      1     25       18    7
XYZ      2     27       19    8
XYZ      3     29       20    9
I want it to look like this:
Company  Item     1   2   3
ABC      Revenue  10  12  14
ABC      Cost     6   7   8
ABC      Profit   4   5   6
XYZ      Revenue  25  27  29
XYZ      Cost     18  19  20
XYZ      Profit   7   8   9
A crosstab query only allows one value. I can do it using separate crosstab queries for Revenue, Cost and Profit and then combine them with UNION, but there must be an easier way.
Any help would be really appreciated.
Max
Variation for you to try.
SELECT Company,
item,
SUBSTRING_INDEX(SUBSTRING_INDEX(item_details, ',', 1), ',', -1) AS `1`,
SUBSTRING_INDEX(SUBSTRING_INDEX(item_details, ',', 2), ',', -1) AS `2`,
SUBSTRING_INDEX(SUBSTRING_INDEX(item_details, ',', 3), ',', -1) AS `3`
FROM
(
SELECT Company, 'Revenue' AS item, GROUP_CONCAT(Revenue ORDER BY `Year`) AS item_details
FROM SomeTable
GROUP BY Company
UNION
SELECT Company, 'Cost' AS item, GROUP_CONCAT(Cost ORDER BY `Year`)
FROM SomeTable
GROUP BY Company
UNION
SELECT Company, 'Profit' AS item, GROUP_CONCAT(Profit ORDER BY `Year`)
FROM SomeTable
GROUP BY Company
) Sub1
ORDER BY Company, FIELD(Item, 'Revenue', 'Cost', 'Profit')
SQL fiddle for you:-
http://www.sqlfiddle.com/#!2/995a0/6
For reasons of scalability (and flexibility), problems like this are best left to the application level code (e.g. a simple PHP loop on a well-ordered result set) but, just for fun...
SELECT company
, item
, MAX(CASE WHEN year = 1 THEN value END) y1
, MAX(CASE WHEN year = 2 THEN value END) y2
, MAX(CASE WHEN year = 3 THEN value END) y3
FROM
( SELECT company, year, 'revenue' item, revenue value FROM my_table
UNION
SELECT company, year, 'cost',cost FROM my_table
UNION
SELECT company, year, 'profit',profit FROM my_table
) x
GROUP
BY company
, item
ORDER
BY company
, FIELD(item,'Revenue','Cost','Profit');
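For comparison, the application-level approach mentioned above might look like the following Python sketch (Python rather than PHP only to keep the example short). It takes the rows from the question and emits one output row per company and item, with the years as columns; the function name is made up for illustration.

```python
# (company, year, revenue, cost, profit) rows from the question
rows = [
    ("ABC", 1, 10, 6, 4),
    ("ABC", 2, 12, 7, 5),
    ("ABC", 3, 14, 8, 6),
    ("XYZ", 1, 25, 18, 7),
    ("XYZ", 2, 27, 19, 8),
    ("XYZ", 3, 29, 20, 9),
]

ITEMS = ("Revenue", "Cost", "Profit")

def pivot(rows, years=(1, 2, 3)):
    # Index the three measures by company and year
    by_company = {}
    for company, year, revenue, cost, profit in rows:
        by_company.setdefault(company, {})[year] = dict(zip(ITEMS, (revenue, cost, profit)))

    # One output row per (company, item); a missing year becomes None
    out = []
    for company in sorted(by_company):
        for item in ITEMS:
            values = [by_company[company].get(year, {}).get(item) for year in years]
            out.append((company, item, *values))
    return out

for row in pivot(rows):
    print(*row)
```

Adding a fourth year here only means passing a longer `years` tuple, whereas each SQL variant needs another hand-written column.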
Try this:
SELECT Company, Item, Col1 AS `1`, Col2 AS `2`, Col3 AS `3`
FROM (SELECT a.Company, 'Revenue' AS Item, MAX(IF(a.Year = 1, a.Revenue, 0)) AS Col1,
MAX(IF(a.Year = 2, a.Revenue, 0)) AS Col2, MAX(IF(a.Year = 3, a.Revenue, 0)) AS Col3
FROM tableA a
GROUP BY a.Company
UNION
SELECT a.Company, 'Cost' AS Item, MAX(IF(a.Year = 1, a.Cost, 0)) AS Col1,
MAX(IF(a.Year = 2, a.Cost, 0)) AS Col2, MAX(IF(a.Year = 3, a.Cost, 0)) AS Col3
FROM tableA a
GROUP BY a.Company
UNION
SELECT a.Company, 'Profit' AS Item, MAX(IF(a.Year = 1, a.Profit, 0)) AS Col1,
MAX(IF(a.Year = 2, a.Profit, 0)) AS Col2, MAX(IF(a.Year = 3, a.Profit, 0)) AS Col3
FROM tableA a
GROUP BY a.Company
) AS A
ORDER BY Company, FIELD(Item, 'Revenue', 'Cost', 'Profit')
A second way of doing it. This is untested (so it probably contains some typos): it uses SQL to fetch the data, then loops over the rows, handing them to an object that prints each output row. It copes with a company that is missing data for a year.
Note that you could greatly simplify the SQL if you had a table of the years you are interested in and a table of companies.
<?php
$sql = "SELECT Sub1.Year, Sub2.Company, IFNULL(SomeTable.Revenue, 0) AS aValue, 'Revenue' AS Item
FROM
(
SELECT DISTINCT Year FROM SomeTable
) Sub1
CROSS JOIN
(
SELECT DISTINCT Company FROM SomeTable
) Sub2
LEFT OUTER JOIN SomeTable
ON Sub1.Year = SomeTable.Year
AND Sub2.Company = SomeTable.Company
UNION
SELECT Sub1.Year, Sub2.Company, IFNULL(SomeTable.Cost, 0) AS aValue, 'Cost' AS Item
FROM
(
SELECT DISTINCT Year FROM SomeTable
) Sub1
CROSS JOIN
(
SELECT DISTINCT Company FROM SomeTable
) Sub2
LEFT OUTER JOIN SomeTable
ON Sub1.Year = SomeTable.Year
AND Sub2.Company = SomeTable.Company
UNION
SELECT Sub1.Year, Sub2.Company, IFNULL(SomeTable.Profit, 0) AS aValue, 'Profit' AS Item
FROM
(
SELECT DISTINCT Year FROM SomeTable
) Sub1
CROSS JOIN
(
SELECT DISTINCT Company FROM SomeTable
) Sub2
LEFT OUTER JOIN SomeTable
ON Sub1.Year = SomeTable.Year
AND Sub2.Company = SomeTable.Company
ORDER BY Company, FIELD(Item, 'Revenue', 'Cost', 'Profit'), Year";
// Assumption: $db is an open mysqli connection (the original DB wrapper is not shown).
$query = $db->query($sql) or die($db->error);
if ($row = $query->fetch_assoc())
{
    echo "<table>";
    $PrevCompany = $row['Company'];
    $PrevItem = $row['Item'];
    $aLine = new ProcessLine($PrevCompany, $PrevItem, true);
    do
    {
        if ($PrevCompany != $row['Company'] or $PrevItem != $row['Item'])
        {
            unset($aLine); // destroying the object prints the finished row
            $PrevCompany = $row['Company'];
            $PrevItem = $row['Item'];
            $aLine = new ProcessLine($PrevCompany, $PrevItem);
        }
        $aLine->Assign_Detail($row['Year'], $row['aValue']);
    } while ($row = $query->fetch_assoc());
    unset($aLine);
    echo "</table>";
}
class ProcessLine
{
    private $Company;
    private $Item;
    private $row_details = array();
    private $FirstRow = false;

    public function __construct($Company, $Item, $FirstRow = false)
    {
        $this->Company = $Company;
        $this->Item = $Item;
        $this->FirstRow = $FirstRow;
    }

    // Prints the row when the object is destroyed; the first object also prints the header row.
    public function __destruct()
    {
        if ($this->FirstRow)
        {
            echo "<tr><th>Company</th><th>Item</th>";
            foreach ($this->row_details as $row_year => $row_value)
            {
                echo "<th>$row_year</th>";
            }
            echo "</tr>";
        }
        echo "<tr><td>".$this->Company."</td><td>".$this->Item."</td><td>".implode("</td><td>", $this->row_details)."</td></tr>";
    }

    public function Assign_Detail($in_year, $in_value)
    {
        $this->row_details[$in_year] = $in_value;
    }
}
?>
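The gap-filling that the CROSS JOIN of distinct years and distinct companies performs can be sketched in a few lines of Python. The data below is invented, with one company deliberately missing a year; a missing combination is filled with 0, as `IFNULL(..., 0)` does in the query. The function name is hypothetical.

```python
from itertools import product

# (company, year) -> revenue; XYZ has no row for year 2 (invented data)
revenue = {("ABC", 1): 10, ("ABC", 2): 12, ("XYZ", 1): 25}

def fill_gaps(values, default=0):
    """Return a value for every company/year pair, using `default` where no row exists."""
    companies = sorted({company for company, _year in values})
    years = sorted({year for _company, year in values})
    # Every company paired with every year, like the CROSS JOIN of the two DISTINCT subqueries
    return {
        (company, year): values.get((company, year), default)
        for company, year in product(companies, years)
    }

print(fill_gaps(revenue))
```

Without this step a company with a missing year would produce a shorter output row, and its values would shift into the wrong year columns.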
I have written a fairly complex SQL query to get some statistics about animals from an animal sampling database. This query includes a number of subqueries and I would now like to see if it is possible to rewrite this query in any way to use joins instead of subqueries. I have a dim idea that this might reduce query time. (it's now about 23s on a mac mini).
Here's the query:
SELECT COUNT(DISTINCT a.AnimalID), TO_DAYS(a.VisitDate) AS day,
DATE_FORMAT(a.VisitDate, '%b %d %Y'), a.origin,
(
SELECT COUNT(DISTINCT a.AnimalID)
FROM samples AS a
JOIN
custom_animals AS b
ON a.AnimalID = b.animal_id
WHERE
b.organism = 2
AND
TO_DAYS(a.VisitDate) = day
) AS Goats,
(
SELECT COUNT(DISTINCT a.AnimalID)
FROM samples AS a
JOIN custom_animals AS b
ON a.AnimalID = b.animal_id
WHERE
b.organism = 2
AND
b.sex = 'Female'
AND
TO_DAYS(a.VisitDate) = day
) AS GF,
(
SELECT COUNT(DISTINCT a.AnimalID)
FROM samples AS a
JOIN custom_animals AS b
ON a.AnimalID = b.animal_id
WHERE
b.organism = 3
AND
b.sex = 'Female'
AND
TO_DAYS(a.VisitDate) = day
) AS SF
FROM
samples AS a
JOIN custom_animals AS b
ON a.AnimalID = b.animal_id
WHERE
project = 5
AND
AnimalID LIKE 'AVD%'
GROUP BY
TO_DAYS(a.VisitDate);
Thanks to ksogor, my query is now much faster:
SELECT DATE_FORMAT(s.VisitDate, '%b %d %Y') AS date,
s.origin,
SUM(IF(project = 5 AND s.AnimalID LIKE 'AVD%', 1, 0)) AS sampled_animals,
SUM(IF(ca.organism = 2, 1, 0)) AS sampled_goats,
SUM(IF(ca.organism = 2 AND ca.sex = 'Female', 1, 0)) AS female_goats,
SUM(IF(ca.organism = 3 AND ca.sex = 'Female', 1, 0)) AS female_sheep
FROM samples s JOIN custom_animals ca ON s.AnimalID = ca.animal_id
GROUP BY date;
I would still need to make this query select distinct s.AnimalID though as right now it counts the samples we have taken from these animals instead of the animals themselves. Anyone got any idea?
After some more help from ksogor I now have a great query:
SELECT DATE_FORMAT(s.VisitDate, '%b %d %Y') AS date,
s.origin,
SUM(IF(project = 5 AND s.AnimalID LIKE 'AVD%', 1, 0)) AS sampled_animals,
SUM(IF(ca.organism = 2, 1, 0)) AS sampled_goats,
SUM(IF(ca.organism = 2 AND ca.sex = 'Female', 1, 0)) AS female_goats,
SUM(IF(ca.organism = 3 AND ca.sex = 'Female', 1, 0)) AS female_sheep
FROM (
SELECT DISTINCT AnimalID AS AnimalID,
VisitDate,
origin,
project
FROM samples
) s
JOIN custom_animals ca ON s.AnimalID = ca.animal_id
GROUP BY date;
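To see why deduplicating before aggregating matters, here is a small Python sketch with invented sample rows: an animal sampled twice on the same day is counted twice when the rows are counted directly, but only once after the rows are made distinct. The function names are made up for illustration.

```python
from collections import Counter

# (animal_id, visit_date) per sample; AVD001 was sampled twice on the same day
samples = [
    ("AVD001", "2012-03-01"),
    ("AVD001", "2012-03-01"),
    ("AVD002", "2012-03-01"),
    ("AVD003", "2012-03-02"),
]

def samples_per_day(samples):
    """Counts every sample row, like SUM(IF(...)) directly over the joined rows."""
    return dict(Counter(date for _animal, date in samples))

def animals_per_day(samples):
    """Counts each animal once per day, like aggregating over a SELECT DISTINCT subquery."""
    distinct = set(samples)  # drops the duplicate (animal, date) pair
    return dict(Counter(date for _animal, date in distinct))

print(samples_per_day(samples))
print(animals_per_day(samples))
```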
You can just use if or case statements, like this:
SELECT SUM(IF(project = 5 AND s.AnimalID LIKE 'AVD%', 1, 0)) AS countbyproj,
TO_DAYS(s.VisitDate) AS day,
DATE_FORMAT(s.VisitDate, '%b %d %Y') AS date,
s.origin,
SUM(IF(ca.organism = 2, 1, 0)) AS countGoats,
SUM(IF(ca.organism = 2 AND ca.sex = 'Female', 1, 0)) AS countGF,
SUM(IF(ca.organism = 3 AND ca.sex = 'Female', 1, 0)) AS countSF
FROM samples s JOIN custom_animals ca ON s.AnimalID = ca.animal_id
GROUP BY TO_DAYS(s.VisitDate);
I can't test the query, and I don't know what result you expect or which tables and relations you have, so this is only an example of the idea.
If you need to count unique AnimalIDs for each day:
SELECT SUM(byproj) AS countbyproj,
day,
date,
origin,
SUM(Goats) AS countGoats,
SUM(GF) AS countGF,
SUM(SF) AS countSF
FROM (
SELECT s.AnimalID,
IF(project = 5 AND s.AnimalID LIKE 'AVD%', 1, 0) AS byproj,
TO_DAYS(s.VisitDate) AS day,
DATE_FORMAT(s.VisitDate, '%b %d %Y') AS date,
s.origin,
IF(ca.organism = 2, 1, 0) AS Goats,
IF(ca.organism = 2 AND ca.sex = 'Female', 1, 0) AS GF,
IF(ca.organism = 3 AND ca.sex = 'Female', 1, 0) AS SF
FROM samples s JOIN custom_animals ca ON s.AnimalID = ca.animal_id
) dataset
GROUP BY dataset.day, dataset.AnimalID;