Using "totals" functions on an already grouped query - ms-access

I am having trouble using aggregate functions in a query that also uses GROUP BY. Instead of applying to the entire query recordset, the aggregate functions apply only to the groups defined by the query. For example:
Person Date Able
-----------------------------
A 21/05/13 0
B 21/05/13 -1
C 21/05/13 -1
D 21/05/13 0
(grouped by Person, Date, Able)
When applying aggregate functions:
Person Date Able Max(Able) Min(Date)
----------------------------------------------------
A 21/05/13 0 0 21/05/13
B 22/05/13 -1 -1 22/05/13
C 23/05/13 -1 -1 23/05/13
D 24/05/13 0 0 24/05/13
The aggregate functions are made entirely redundant unless the data is completely ungrouped. So far, I have been getting around it by:
1) Using another query to reference the initial query and determine the true aggregate values.
2) Having the form call this second query using the domain ("D") functions (DLookup, DCount, etc.).
In my particular scenario, I have a list (very similar to above) that needs to be presented in a certain order (ranked based on ID). However, I am using an expression in the query to define a different type of ranking. The idea is to show (using conditional formatting) the first record in this new rank. Illustrated below
Person Date ID CalculatedRank
--------------------------------------------
A 21/05/13 1 4
B 21/05/13 2 2
C 21/05/13 3 3
D 21/05/13 4 1
Ideally I would like to have another column that determines which one is first, which could easily be achieved by:
first: [CalculatedRank] = Min( [CalculatedRank] )
But as described above, Min() is not giving me 1; it is evaluated on a per-row basis (the minimum isn't always 1, so I can't set this arbitrarily).
Right now I am using a separate query to reference this first query, and I sort that based on the calculated rank. Conditional formatting can then use DLookup to determine whether a record is first or not from the second query. However, every time the form refreshes, or a requery is called, every single row's conditional formatting triggers another DLookup, which then references the first query and recalculates the rank, for every row!
As you can imagine, the delay is noticeable, leaving the cursor idle for more than 5 seconds. I am not too sure about the internal mechanisms of Access, but using the built-in debugger, a requery on a recordset of 4 rows caused my CalculateRank() function to be called 12 times, purely through the conditional formatting calling the second query.
In summary, I have pretty much narrowed it down to requiring a separate query (and therefore dlookup) to properly use the aggregate functions. If I was able to keep everything in one query, the conditional formatting wouldn't need to use dlookup on another query to determine its status.
I am sure I am not the only one that has had problems with this and was wondering if any solutions exist where I can avoid all the stacked querying.
As always, any help is much appreciated!

Wow, I see what you mean! For my table [Table1]
Person Date ID
------ ---------- --
A 2013-05-21 1
B 2013-05-21 2
C 2013-05-21 3
D 2013-05-21 4
and my query [qryTable1Ranked]
SELECT Table1.*, CalculateRank([ID]) AS CalculatedRank
FROM Table1;
which uses the following function in a standard VBA Module
Public Function CalculateRank(ID As Long) As Long
Dim r As Long
Select Case ID
Case 1
r = 4
Case 4
r = 1
Case Else
r = ID
End Select
CalculateRank = r
Debug.Print "x"
End Function
and returns
Person Date ID CalculatedRank
------ ---------- -- --------------
A 2013-05-21 1 4
B 2013-05-21 2 2
C 2013-05-21 3 3
D 2013-05-21 4 1
when I just double-click the query to open it in Datasheet View my ranking function gets called 4 times, once for each row.
If I create a continuous form based on that query and open the form my function gets called 4 times. Then if I add conditional formatting on the [CalculatedRank] text box using Value = DMin("CalculatedRank", "qryTable1Ranked") then my function gets called 32 times!
I found that I can cut that by half (to 16 times) if I add an invisible unbound textbox named [txtMinCalculatedRank], use the following code behind the form...
Option Compare Database
Option Explicit
Private Sub Form_Load()
UpdateMinCalculatedRank
End Sub
Private Sub UpdateMinCalculatedRank()
Me.txtMinCalculatedRank.Value = DMin("CalculatedRank", "qryTable1Ranked")
End Sub
...and change the Conditional Formatting rule to Value = [txtMinCalculatedRank].
I found that I could cut that by half again (to 8 times) if I changed the Record Source of the Form from qryTable1Ranked to Table1 (the base table) and changed the Control Source of the [CalculatedRank] text box to =CalculateRank([ID]) (still using the tricks from the previous tweak).
I think that's probably as good as it gets without going so far as to create a temporary table, or persisting the CalculatedRank (and perhaps an "IsMin" flag) in the base table.
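The temporary-table idea mentioned above can be sketched outside Access. The following is a minimal illustration using Python's sqlite3 module rather than Access itself; the table name tmpRanked and the Python stand-in for CalculateRank are assumptions made for the example. It evaluates the ranking function once per row, stores the result, and derives an IsMin flag from the stored values so the function is not called again.

```python
import sqlite3

calls = 0  # counts how often the ranking function runs


def calculate_rank(record_id):
    """Stand-in for the VBA CalculateRank(): swaps ranks 1 and 4."""
    global calls
    calls += 1
    return {1: 4, 4: 1}.get(record_id, record_id)


def build_ranked_table():
    conn = sqlite3.connect(":memory:")
    conn.create_function("CalculateRank", 1, calculate_rank)
    conn.execute('CREATE TABLE Table1 (Person TEXT, "Date" TEXT, ID INTEGER)')
    conn.executemany(
        "INSERT INTO Table1 VALUES (?, ?, ?)",
        [("A", "2013-05-21", 1), ("B", "2013-05-21", 2),
         ("C", "2013-05-21", 3), ("D", "2013-05-21", 4)],
    )
    # Evaluate the expensive function once per row and persist the result.
    conn.execute(
        "CREATE TEMP TABLE tmpRanked AS "
        "SELECT Person, ID, CalculateRank(ID) AS CalculatedRank FROM Table1"
    )
    # The aggregate reads the stored values, so the function is not re-run.
    return conn.execute(
        "SELECT Person, CalculatedRank, "
        "CalculatedRank = (SELECT MIN(CalculatedRank) FROM tmpRanked) AS IsMin "
        "FROM tmpRanked ORDER BY ID"
    ).fetchall()


rows = build_ranked_table()
for row in rows:
    print(row)
print("CalculateRank calls:", calls)
```

In Access the equivalent would presumably be a make-table query followed by a form bound to the resulting table; the trade-off is that the table has to be rebuilt whenever the underlying data changes.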

SumIIF Access Query

I am struggling to get the results I need using an Access query, and was wondering whether what I want is achievable in one query, or whether I would need two queries and then have to export to Excel and interrogate the results there.
I have a table with a number of columns; I am specifically looking at three of them:
Row Type - this will either be populated with A or U
Account Number - there will be potentially multiple instances of account number within the table. Although only once against row type "A", and multiple on row type "U"
Value - a currency field. At Account number level, the sum of "U" row, should equal the "A" value
I am looking to produce a query that will list three things.
[Account Number]
Sum of [Value] when [RowType] = "U"
[Value] when [RowType] = "A"
Would I need to create a new column in my table to generate a value for the requirement "Sum of Value when 'U'"?
I've tried
SUM(IIF([ROWTYPE]='U',[Value],0)) - but that doesn't seem to work.
I've also tried to use the builder within the Query to replicate the same, but again that also doesn't seem to work.
If all else fails I'm content to run two queries in Access and then join them in Excel, but I think for my own learning and knowledge it would be good to know if what I am trying to do is possible.
I was hoping this could be done in Access, but my knowledge of the application is seriously lacking, and despite looking at the MS Access support pages, and also some of the responses on the StackOverflow forums, I still can't get my head around what I need to do.
Example of the data
Row Type  Account ID  Value
--------  ----------  ------
A         123456789   50.00
U         123456789   30.00
U         123456789   20.00
A         987654321   100.00
U         987654321   80.00
U         987654321   20.00
The data has been loaded into Access, table called "TEST"
This is the SQL i've got, but doesn't give me the desired results.
SELECT [TEST].[ROW TYPE], SUM([TEST].[VALUE]) AS [TEST].[ACCOUNT ID]
FROM [TEST]
GROUP BY [TEST].[ROW TYPE], [TEST].[ACCOUNTID]
When the query runs, I would hope to see two rows, one for each account number, with three columns:
Account Number
Sum of Value (where Row Type is U)
Value (where Row Type is A)
I currently get 4 rows in the query: two results for each account number, one being the Value when Row Type = A, the other when Row Type = U.
I guess this is what you are after:
SELECT
[Account ID],
Sum(IIf([Row Type]="A",[Value],0)) AS A,
Sum(IIf([Row Type]="U",[Value],0)) AS U
FROM
TEST
GROUP BY
[Account ID];
Output:
Account ID  A       U
----------  ------  ------
123456789   50,00   50,00
987654321   100,00  100,00
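For readers without Access to hand, the same conditional aggregation can be tried in SQLite from Python. This is only a sketch: SQLite has no IIf() in older versions, so CASE WHEN is swapped in for it, and the sample rows are the ones from the question.

```python
import sqlite3


def totals_by_account():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE TEST ("Row Type" TEXT, "Account ID" TEXT, "Value" REAL)'
    )
    conn.executemany("INSERT INTO TEST VALUES (?, ?, ?)", [
        ("A", "123456789", 50.0), ("U", "123456789", 30.0),
        ("U", "123456789", 20.0), ("A", "987654321", 100.0),
        ("U", "987654321", 80.0), ("U", "987654321", 20.0),
    ])
    # CASE WHEN plays the role of IIf(): rows of the other type contribute 0.
    return conn.execute('''
        SELECT "Account ID",
               SUM(CASE WHEN "Row Type" = 'A' THEN "Value" ELSE 0 END) AS A,
               SUM(CASE WHEN "Row Type" = 'U' THEN "Value" ELSE 0 END) AS U
        FROM TEST
        GROUP BY "Account ID"
        ORDER BY "Account ID"
    ''').fetchall()


for account, a_total, u_total in totals_by_account():
    print(account, a_total, u_total)
```

Grouping by the account alone (and not by Row Type) is what collapses the four rows into two.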

How to request lists that contain certain items in MySQL

In the application I am developing, the user has to set parameters to define the end product he will get.
My tables look like this :
Categories
-------------
Id Name
1 Material
2 Color
3 Shape
Parameters
-------------
Id CategoryId Name
1 1 Wood
2 1 Plastic
3 1 Metal
4 2 Red
5 2 Green
6 2 Blue
7 3 Round
8 3 Square
9 3 Triangle
Combinations
-------------
Id
1
2
...
ParametersCombinations
----------------------
CombinationId ParameterId
1 1
1 4
1 7
2 1
2 5
2 7
Now only some combinations of parameters are available to the user. In my example, he could get a red round wooden thingy or a green round wooden thingy but not a blue one because I can't produce it.
Let's say the user selected wood and round parameters. How do I make a request to know that there's only red and green available so I can disable the blue option for him ?
Or is there some better way to model my database ?
Let us assume you provide the selected parameters id in the following format
// I call this a **parameterList** for convenience sake.
(1,7) // this is parameter id 1 and id 7.
I am also assuming you are using some scripting language to help you with your app. Like ruby or php.
I am also assuming you want to keep as much logic as possible out of your stored procedures and MySQL queries.
Another assumption is that you are using one of the Rapid Application MVC Frameworks like Rails, Symfony or CakePHP.
Your logic would be:
Find all the combinations that contain ALL the parameters in your parameterList and put these found combinations in a list called relevantCombinations
Find all the ParametersCombinations rows whose combination is in the list relevantCombinations. Retrieve only the unique parameter values.
These two steps can be solved using simple Model::find methods and a for loop in the frameworks I described above.
If you are not using frameworks, it is also cool to use the scripting language raw.
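The two steps above can be sketched in a plain scripting language. The version below uses Python with the ParametersCombinations rows from the question held in memory; the function name available_parameters is made up for the example, and in a real app the rows would come from a query or a framework's find call.

```python
# ParametersCombinations rows from the question: (CombinationId, ParameterId)
PARAMETERS_COMBINATIONS = [(1, 1), (1, 4), (1, 7), (2, 1), (2, 5), (2, 7)]


def available_parameters(selected, rows=PARAMETERS_COMBINATIONS):
    # Group the parameter ids by combination.
    by_combination = {}
    for combination_id, parameter_id in rows:
        by_combination.setdefault(combination_id, set()).add(parameter_id)

    selected = set(selected)
    # Step 1: keep the combinations that contain ALL selected parameters.
    relevant = [c for c, params in by_combination.items() if selected <= params]

    # Step 2: collect the unique parameters of those combinations,
    # leaving out the ones the user has already chosen.
    found = set()
    for combination_id in relevant:
        found |= by_combination[combination_id]
    return sorted(found - selected)


# Wood (1) and Round (7) selected: Red (4) and Green (5) remain, Blue (6) does not.
print(available_parameters([1, 7]))
```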
If you require them as MySQL queries, here are some possible ones. Be aware that these are not necessarily the best queries.
First one is
SELECT * FROM (
SELECT `PossibleList`.`CombinationId`, COUNT(`PossibleList`.`CombinationId`) as number
FROM (
SELECT `CombinationId` FROM `ParametersCombinations`
WHERE `ParameterId` IN (1, 7)
) `PossibleList` GROUP BY `PossibleList`.`CombinationId`
) `PossibleGroupedList` WHERE `number` = 2;
-- note that the (1, 7) and the number 2 need to be supplied by your app.
-- 2 refers to the number of parameters supplied.
-- In this case you supplied 1 and 7 therefore 2.
To confirm, look at http://sqlfiddle.com/#!2/16831/3.
Note how I purposely have a Combination 3 which only has the Parameter 1 but not 7. Therefore the query did not give you back 3, but only 1 and 2. Feel free to tweak the asterisk * in the first line.
Second one is
SELECT DISTINCT(`ParameterID`)
FROM `ParametersCombinations`
WHERE `CombinationId` IN (1, 2);
-- note that (1, 2) is the result we expect from the first step.
-- the one we call relevantCombinations
To confirm, look at http://sqlfiddle.com/#!2/16831/5
I do not recommend being a masochist and attempting to get your answer in a single query.
I also do NOT recommend using the MySQL queries I have supplied. That is less masochistic, but still masochistic enough for me NOT to recommend it.
Since you did not indicate any tag other than mysql, I suspect that you are stronger with mysql. Hence my answer contains mysql.
My strongest suggestion would be my first. Make full use of established frameworks and put your logic in the business logic layer. Not in the data layer. Even if you don't use frameworks and just use raw php and ruby, that is still a better place for you to place your logic in than MySQL.
I saw that T gave an answer in a single MySQL query but I can tell you that (s)he considers only 1 parameter.
See this part:
WHERE ParameterId = 7 -- 7 is the selected parameter
You can adapt his/her answer with some trickery using a forloop and appending OR clauses.
Again, I do NOT recommend that in the big picture of building an app.
I have also tested his/her answer with http://sqlfiddle.com/#!2/2eda4/2. There may be 1 or 2 small bugs.
In summary, my recommendations in descending order of strength:
Use a framework like Rails or CakePHP and the pseudocode step 1 and 2 and as many find as you need. (STRONGEST)
Use raw scripting language and the pseudocode step 1 and 2 and as many simple queries as you need.
Use the raw MySQL queries I created. (LEAST STRONG)
P.S. I left out the part in my queries as to how to get the name of the Parameters. But given that you can get the ParameterIDs from my answer, I think that is trivial. I have also left out how you may need to remove the already selected parameters (1, 7). Again, that should be trivial to you.
Try the following
SELECT p.*, pc.CombinationId
FROM Parameters p
-- get the parameter combinations for all the parameters
JOIN ParametersCombinations pc
ON pc.ParameterId = p.Id
-- filter the parameter combinations to only combinations that include the selected parameter
JOIN (
SELECT CombinationId
FROM ParametersCombinations
WHERE ParameterId = 7 -- 7 is the selected parameter
) f ON f.CombinationId = pc.CombinationId
Or removing the already selected parameters
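SELECT p.*, pc.CombinationId
FROM Parameters p
JOIN ParametersCombinations pc
ON pc.ParameterId = p.Id
JOIN (
SELECT CombinationId
FROM ParametersCombinations
WHERE ParameterId IN (7, 1)
-- keep only combinations that contain every selected parameter
GROUP BY CombinationId
HAVING COUNT(DISTINCT ParameterId) = 2 -- 2 is the number of selected parameters
) f ON f.CombinationId = pc.CombinationId
WHERE pc.ParameterId NOT IN (7, 1)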
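To try the single-query approach with several selected parameters, here is a small sketch run against SQLite from Python using the sample data from the question. It assumes the filter should keep only combinations that contain every selected parameter, which it enforces with GROUP BY and HAVING COUNT(DISTINCT ...); the function name remaining_options is hypothetical.

```python
import sqlite3


def remaining_options(selected):
    selected = sorted(set(selected))  # duplicates would break the HAVING count
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Parameters (Id INTEGER, CategoryId INTEGER, Name TEXT);
        INSERT INTO Parameters VALUES
            (1,1,'Wood'),(2,1,'Plastic'),(3,1,'Metal'),
            (4,2,'Red'),(5,2,'Green'),(6,2,'Blue'),
            (7,3,'Round'),(8,3,'Square'),(9,3,'Triangle');
        CREATE TABLE ParametersCombinations
            (CombinationId INTEGER, ParameterId INTEGER);
        INSERT INTO ParametersCombinations VALUES
            (1,1),(1,4),(1,7),(2,1),(2,5),(2,7);
    """)
    marks = ",".join("?" * len(selected))  # one placeholder per selected id
    sql = f"""
        SELECT DISTINCT p.Id, p.Name
        FROM Parameters p
        JOIN ParametersCombinations pc ON pc.ParameterId = p.Id
        JOIN (
            SELECT CombinationId
            FROM ParametersCombinations
            WHERE ParameterId IN ({marks})
            GROUP BY CombinationId
            HAVING COUNT(DISTINCT ParameterId) = ?
        ) f ON f.CombinationId = pc.CombinationId
        WHERE pc.ParameterId NOT IN ({marks})
        ORDER BY p.Id
    """
    args = selected + [len(selected)] + selected
    return [name for _id, name in conn.execute(sql, args)]


# Wood (1) and Round (7) selected: only Red and Green are still possible.
print(remaining_options([1, 7]))
```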

RowNumber for group in SSRS 2005

I have a table in a SSRS report that is displaying only a group, not the table details. I want to find out the row number for the items that are being displayed so that I can use color banding. I tried using "Rowcount(Nothing)", but instead I get the row number of the detail table.
My underlying data is something like
ROwId Team Fan
1 Yankees John
2 Yankees Russ
3 Red Socks Mark
4 Red Socks Mary
...
8 Orioles Elliot
...
29 Dodgers Jim
...
43 Giants Harry
My table showing only the groups looks like this:
ROwId Team
2 Yankees
3 Red Socks
8 Orioles
29 Dodgers
43 Giants
I want it to look like
ROwId Team
1 Yankees
2 Red Socks
3 Orioles
4 Dodgers
5 Giants
You can do this with a RunningValue expression, something like:
=RunningValue(Fields!Team.Value, CountDistinct, "DataSet1")
DataSet1 being the name of the underlying dataset.
Consider the data:
Creating a simple report and comparing the RowNumber and RunningValue approaches shows that RunningValue gives your required results:
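As a rough illustration of why a running distinct count yields consecutive group numbers, the behaviour can be emulated outside SSRS. The Python sketch below is not SSRS code; it only mimics what RunningValue with CountDistinct computes row by row, using a shortened version of the sample data.

```python
def running_count_distinct(values):
    """Return, for each row, the number of distinct values seen so far."""
    seen = set()
    out = []
    for value in values:
        seen.add(value)
        out.append(len(seen))
    return out


teams = ["Yankees", "Yankees", "Red Socks", "Red Socks", "Orioles"]
numbers = running_count_distinct(teams)

# Every detail row of a team carries the same number, so the group row
# can use it directly; odd/even then drives the colour banding.
for team, number in zip(teams, numbers):
    print(number, team, "shaded" if number % 2 == 0 else "plain")
```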
You can easily achieve this with a little bit of vbcode. Go to Report - Properties - code and type something like:
Dim rownumber As Integer = 0
Function writeRow() As Integer
    ' Increment the counter on every call and return the new value
    rownumber = rownumber + 1
    Return rownumber
End Function
Then on your cell, call this function by using =Code.writeRow()
As soon as you start using groups inside the tables, the RowNumber and RunningValue functions start behaving in unexpected ways, so it's easier to just write a bit of code to do what you want.
I am not convinced that the suggestions above are a one-size-fits-all solution. In my scenario the grouping has multiple columns. I could not use the accepted RunningValue solution because I don't have a single column to pass to the function, unless I combine them all (say, in a computed column) into a single unique column.
I could not use the custom code function as is for the same reason, since I had to use the same value across multiple columns, and across multiple properties for that matter. If I knew the number of uses (say N columns * M properties), I could update the row number only on every (N*M)th call; however, I could not find any function that counts columns, so whenever I added a column I would also need to increase my N constant. I also did not want to add a new column to my grouping, as was also suggested, because I could not figure out how to hide it. Nor could I write a system where function A returns nothing but updates the value (i.e. is called only once per group row) and another function, GetRowNumber, simply returns the rownumber variable, because the colouring was done before the call, so one column was always out of sync with the rest.
The only other two solutions I could think of are to put the combined column mentioned earlier in the query itself, or to use DENSE_RANK and sort on all group columns, i.e.
DENSE_RANK() OVER (ORDER BY GroupCol1, GroupCol2, ...) AS RowNumber
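To see what DENSE_RANK over several group columns produces, here is a small sketch run in SQLite from Python (window functions need SQLite 3.25 or later). The Fans table and its League column are hypothetical, added only so that the grouping spans two columns.

```python
import sqlite3


def group_row_numbers():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Fans (League TEXT, Team TEXT, Fan TEXT)")
    conn.executemany("INSERT INTO Fans VALUES (?, ?, ?)", [
        ("AL", "Yankees", "John"), ("AL", "Yankees", "Russ"),
        ("AL", "Red Socks", "Mark"), ("AL", "Orioles", "Elliot"),
        ("NL", "Dodgers", "Jim"), ("NL", "Giants", "Harry"),
    ])
    # Rows sharing the same (League, Team) get the same number, and the
    # numbers have no gaps, so they can drive banding on the group row.
    return conn.execute("""
        SELECT League, Team, Fan,
               DENSE_RANK() OVER (ORDER BY League, Team) AS RowNumber
        FROM Fans
        ORDER BY League, Team, Fan
    """).fetchall()


for row in group_row_numbers():
    print(row)
```

Because the number comes from the query, every cell and property in the group row can read the same field without any shared counter state.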

Populating with '0' when Data in SSRS Does not exist

I'm trying to create a report in SSRS where I have a matrix, which has gender as the column headings and specifically defined age groups as the rows. The report is sorted by date (i.e., the records being displayed are filtered by the modifedAt value). My problem is that I want all of the age group categories to be displayed, even if the dataset does not return any data for that row.
So, for example, if I set the date to one for which there are no db rows with Age5-16 children, I still want to display the category name, with the cells in that row displaying '0'. Instead, the report just drops the whole row because, obviously, the query returns no data.
Is the solution to have a separate dataset that brings back the entire list of categories and then somehow fit them together? I'm stuck here so any help is appreciated!
I can think of a few ways to do this:
DataSet level
Instead of just returning the relevant data in the underlying data in the DataSet, include all the categories you want to display in all cases.
e.g. For a database query it might be the difference between an inner and left join, i.e. going from something like:
select *
from AgeGroup
inner join MyData on ...
to:
select *
from AgeGroup
left join MyData on ...
So the report always has all the age groups to display. Where there are NULL values, just display 0.
I think this is the best option if you have control over the DataSet - you won't have to update your report at all, with luck the actual DataSet changes should be minimal, there is still only one DataSet call, and it's by far the simplest to maintain.
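The difference between the two joins can be checked on a toy data set. The sketch below runs in SQLite from Python; the column names (Id, Name, AgeGroupId, Gender) and the rows are assumptions, since the original only gives the table names AgeGroup and MyData.

```python
import sqlite3


def counts_per_group(join_type):
    """Count MyData rows per age group using an INNER or LEFT join."""
    if join_type not in ("INNER", "LEFT"):
        raise ValueError("join_type must be 'INNER' or 'LEFT'")
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE AgeGroup (Id INTEGER, Name TEXT);
        INSERT INTO AgeGroup VALUES (1, '0-4'), (2, '5-16'), (3, '17+');
        CREATE TABLE MyData (AgeGroupId INTEGER, Gender TEXT);
        INSERT INTO MyData VALUES (1, 'F'), (1, 'M'), (3, 'F');
    """)
    # COUNT(column) skips NULLs, so unmatched groups come back as 0.
    sql = f"""
        SELECT g.Name, COUNT(d.AgeGroupId) AS Total
        FROM AgeGroup g
        {join_type} JOIN MyData d ON d.AgeGroupId = g.Id
        GROUP BY g.Id, g.Name
        ORDER BY g.Id
    """
    return conn.execute(sql).fetchall()


print(counts_per_group("INNER"))  # the empty '5-16' group is dropped
print(counts_per_group("LEFT"))   # the empty '5-16' group is kept with 0
```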
Hard code groups into the report
Here you include a table header row for each group you want to display, so these are always displayed in all cases.
Here you have some sort of conditional expression to display the values, e.g. For each group row it will be tailored to that group:
=Sum(IIf(Fields!AgeGroup.Value = "5-16", Fields!Amount.Value, Nothing))
This is not too flexible and will need updates as you change groups, and doesn't have as many options for layout. There is still only one DataSet call, so that is a plus.
Subreports
You can have a parent DataSet that displays one row for each age group, then embed a subreport in each row that displays the data you want for that row.
This allows you flexibility in layout but it will add complexity to the report(s) and will mean that you make a lot of DataSet calls that could be avoided with other options.
I know this is old, but I wanted to elaborate on Ian's section 1 above using joins at the dataset level. (His answer was super helpful to me for a report I'm working on.)
per op:
Is the solution to have a separate dataset that brings back the entire list of categories and then somehow fit them together?
That is how I've handled it successfully, but you can do so without actually creating a separate dataset by using common table expressions (or temp tables, of course).
For these example tables:
AGE_Table
ID Group Group_Desc Toys
1 A 00-10 Teddy Bear
2 B 11-20 Video Game
3 C 21-30 Sports Car
4 D 31-40 Mansion
5 E 41-50 Jewelry
People_Table (filtered for whatever date)
ID Name Age Gender Age_Group
1 Ariel 07 F A
2 Brandon 23 M C
3 Chelsea 27 F C
4 Derek 06 M A
You want to see 2 results for the 00-10 row, 2 for the 21-30 row, and then still see rows for the other age groups even if there aren't any results.
We want to create a dataset with all the different age groupings and then join on it. Behold a solution using common table expressions:
WITH CTE_Age AS
-- AGE_Table stores the code in its [Group] column; alias it to match People_Table
(SELECT DISTINCT [Group] AS Age_Group FROM AGE_Table)
SELECT ID, Name, Age, Gender, CTE_Age.Age_Group FROM People_Table
RIGHT JOIN CTE_Age ON
People_Table.Age_Group = CTE_Age.Age_Group
This will return:
ID Name Age Gender Age_Group
1 Ariel 7 F A
4 Derek 6 M A
NULL NULL NULL NULL B
2 Brandon 23 M C
3 Chelsea 27 F C
NULL NULL NULL NULL D
NULL NULL NULL NULL E
Once you have that in your dataset, you can change NULL values to 0 on the report builder side -- I think in 2008R2 the default is just blank.
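If you would rather have the zeros produced by the query than by the report, the counts can be computed per age group and gender directly. The sketch below uses the example tables above in SQLite from Python; it swaps the RIGHT JOIN for an equivalent LEFT JOIN starting from the CTE, because older SQLite versions lack RIGHT JOIN, and the F and M output columns are an assumption about how the matrix would consume the data.

```python
import sqlite3


def gender_counts_by_age_group():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE AGE_Table
            (ID INTEGER, "Group" TEXT, Group_Desc TEXT, Toys TEXT);
        INSERT INTO AGE_Table VALUES
            (1,'A','00-10','Teddy Bear'), (2,'B','11-20','Video Game'),
            (3,'C','21-30','Sports Car'), (4,'D','31-40','Mansion'),
            (5,'E','41-50','Jewelry');
        CREATE TABLE People_Table
            (ID INTEGER, Name TEXT, Age INTEGER, Gender TEXT, Age_Group TEXT);
        INSERT INTO People_Table VALUES
            (1,'Ariel',7,'F','A'), (2,'Brandon',23,'M','C'),
            (3,'Chelsea',27,'F','C'), (4,'Derek',6,'M','A');
    """)
    # Start from the full list of age groups so none of them can drop out;
    # unmatched groups fall into the ELSE branch and sum to 0.
    return conn.execute("""
        WITH CTE_Age AS (
            SELECT DISTINCT "Group" AS Age_Group, Group_Desc FROM AGE_Table
        )
        SELECT c.Group_Desc,
               SUM(CASE WHEN p.Gender = 'F' THEN 1 ELSE 0 END) AS F,
               SUM(CASE WHEN p.Gender = 'M' THEN 1 ELSE 0 END) AS M
        FROM CTE_Age c
        LEFT JOIN People_Table p ON p.Age_Group = c.Age_Group
        GROUP BY c.Age_Group, c.Group_Desc
        ORDER BY c.Age_Group
    """).fetchall()


for row in gender_counts_by_age_group():
    print(row)
```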

Can SQL query do this?

I have a table "audit" with a "description" column, a "record_id" column and a "record_date" column. I want to select only those records where the description matches one of two possible strings (say, LIKE "NEW%" OR LIKE "ARCH%") where the record_id in each of those two matches each other. I then need to calculate the difference in days between the record_date of each other.
For instance, my table may contain:
id description record_id record_date
1 New Sub 1000 04/14/13
2 Mod 1000 04/14/13
3 Archived 1000 04/15/13
4 New Sub 1001 04/13/13
I would want to select only rows 1 and 3 and then calculate the number of days between 4/15 and 4/14 to determine how long it took to go from New to Archived for that record (1000). Both a New and an Archived entry must be present for any record for it to be counted (I don't care about ones that haven't been archived). Does this make sense and is it possible to calculate this in a SQL query? I don't know much beyond basic SQL.
I am using MySQL Workbench to do this.
The following is untested, but it should work, assuming that any given record_id can only show up once with "New Sub" and once with "Archived":
select n.id as new_id
,a.id as archive_id
,record_id
,n.record_date as new_date
,a.record_date as archive_date
,DateDiff(a.record_date, n.record_date) as days_between
from audit n
join audit a using(record_id)
where n.description = 'New Sub'
and a.description = 'Archived';
I changed from OR to AND, because I thought you wanted only the number of days for records that were actually archived.
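The self-join can be checked against the sample rows from the question. The sketch below runs in SQLite from Python, so two things differ from MySQL and are assumptions of the example: the dates are stored as ISO strings, and the day difference is computed with julianday() in place of DateDiff(). It also uses the LIKE patterns from the question instead of exact matches.

```python
import sqlite3


def days_to_archive():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE audit "
        "(id INTEGER, description TEXT, record_id INTEGER, record_date TEXT)"
    )
    conn.executemany("INSERT INTO audit VALUES (?, ?, ?, ?)", [
        (1, "New Sub", 1000, "2013-04-14"),
        (2, "Mod", 1000, "2013-04-14"),
        (3, "Archived", 1000, "2013-04-15"),
        (4, "New Sub", 1001, "2013-04-13"),
    ])
    # Join the table to itself: n holds the 'New' row, a the 'Archived' row.
    # Records without an archived row (1001) find no partner and drop out.
    return conn.execute("""
        SELECT n.record_id,
               CAST(julianday(a.record_date) - julianday(n.record_date)
                    AS INTEGER) AS days_between
        FROM audit n
        JOIN audit a ON a.record_id = n.record_id
        WHERE n.description LIKE 'New%'
          AND a.description LIKE 'Arch%'
    """).fetchall()


print(days_to_archive())
```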
My test was in SQL Server, so the syntax might need to be tweaked slightly for your database (especially the DATEDIFF function), but you can select from the same table twice, one side grabbing the 'new' rows and one grabbing the 'archived' rows, then link them by record_id...
SELECT
newsub.id,
newsub.description,
newsub.record_date,
arc.id,
arc.description,
arc.record_date,
DATEDIFF(day, newsub.record_date, arc.record_date) AS DaysBetween
FROM
foo1 arc
, foo1 newsub
WHERE
(newsub.description LIKE 'NEW%')
AND
(arc.description LIKE 'ARC%')
AND
(newsub.record_id = arc.record_id)