Spotfire intersect first 'n' periods - function

Is there a way to use an Over and Intersect function to get the average sales for the first 3 periods (not always consecutive months, sometimes a month is skipped) for each Employee?
For example:
EmpID 1 is 71.67 ((80 + 60 + 75)/3) despite skipping "3/1/2007"
EmpID 3 is 250 ((350 + 250 + 150)/3).
I'm not sure how EmpID 2 would work because there are just two data points.
I've used a workaround: a calculated column using DenseRank over Date, "asc", EmpID, then a second Boolean calculated column that is TRUE where the DenseRank column is <= 3, and then Over functions restricted to the rows where that Boolean is TRUE. But I want to figure out the correct way to do this.
There are Last 'n' Period functions but I haven't seen anything resembling a First 'n' Period function.
EmpID Date Sales
1 1/1/2007 80
1 2/1/2007 60
1 4/1/2007 75
1 5/1/2007 30
1 9/1/2007 100
2 2/1/2007 200
2 3/1/2007 100
3 12/1/2006 350
3 1/1/2007 250
3 3/1/2007 150
3 4/1/2007 275
3 8/1/2007 375
3 9/1/2007 475
3 10/1/2007 300
3 12/1/2007 200

I suppose the solution depends on where you want this data represented, but here is one example:
If((Rank([Date],"asc",[EmpID])<=3) and (Max(Rank([Date],"asc",[EmpID])) OVER ([EmpID])>=3),Avg([Sales]) over ([EmpID]))
You can insert this as a calculated column and it will give you what you want (assuming your data is sorted by date when imported).
You may want to see the row numbering. In that case, insert this as a calculated column as well and name it RN:
Rank([Date],"asc",[EmpID])
Explanation
Rank([Date],"asc",[EmpID])
This part of the function is basically applying a row number (labeled as RN in the results below) to each EmpID grouping.
Rank([Date],"asc",[EmpID])<=3
This is how we take the first 3 rows regardless of whether months are skipped. If your data isn't sorted, we'd have to create one additional calculated column, but the same logic applies.
(Max(Rank([Date],"asc",[EmpID])) OVER ([EmpID])>=3)
This is where we are basically ignoring EmpID = 2, or any EmpID who doesn't have at least 3 rows. Removing this would give you the average (dynamically) for each EmpID based on their first 1, 2, or 3 months respectively.
Avg([Sales]) over ([EmpID])
Now that our data is limited to the rows we care about, just take the average for each EmpID.
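To make the intended logic concrete outside Spotfire, here is a minimal Python sketch of the same idea: rank each employee's rows by date, keep the first n, and average them. It is not Spotfire expression syntax, and the function name first_n_average and its require_full flag are made up for this illustration. The flag mirrors the Max(Rank) >= 3 check described above.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (EmpID, Date, Sales)
rows = [
    (1, date(2007, 1, 1), 80), (1, date(2007, 2, 1), 60),
    (1, date(2007, 4, 1), 75), (1, date(2007, 5, 1), 30),
    (1, date(2007, 9, 1), 100),
    (2, date(2007, 2, 1), 200), (2, date(2007, 3, 1), 100),
    (3, date(2006, 12, 1), 350), (3, date(2007, 1, 1), 250),
    (3, date(2007, 3, 1), 150), (3, date(2007, 4, 1), 275),
    (3, date(2007, 8, 1), 375), (3, date(2007, 9, 1), 475),
    (3, date(2007, 10, 1), 300), (3, date(2007, 12, 1), 200),
]

def first_n_average(rows, n=3, require_full=True):
    """Average Sales over each employee's n earliest dates (hypothetical helper)."""
    by_emp = defaultdict(list)
    for emp_id, day, sales in rows:
        by_emp[emp_id].append((day, sales))
    result = {}
    for emp_id, entries in by_emp.items():
        entries.sort()                        # rank by date, ascending
        first = [s for _, s in entries[:n]]   # rows with rank <= n
        if require_full and len(first) < n:
            result[emp_id] = None             # fewer than n rows: no value
        else:
            result[emp_id] = sum(first) / len(first)
    return result

averages = first_n_average(rows)
print(averages)
```

With require_full=False, an employee with fewer than n rows gets the average of whatever rows exist, which corresponds to removing the Max(Rank) condition.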

@Chris - Here is the solution I came up with:
Step 1: Inserted a calculated column 'rank' with the expression below
DenseRank([Date],"asc",[EmpID])
Step 2: Created a cross table visualization from the data table and limited data with the expression below

Related

How to Query the Same Data within a Table but the Output Row Positions are Different

I have a table in my database like the sample below, and I would like to query the same data twice: Column 1 as stored, and Column 2 shifted by one row so that each row also shows the value from the previous row.
P.S. I'm building a system for electric meter readings, and I need the current (Column 1) and the previous (Column 2) reading so that I can compute the total consumption of the electric meter. I'm having a hard time doing it. Any suggestions would be deeply appreciated. Thank you. :)
Example data:
Desired Query Output:
Keep in mind that SQL table rows have no inherent order. They're just bags of records.
You must order them based on some column value or other criterion. In your case I guess you want the most recent and the second most recent meter reading for each account. Presumably your reading table has columns something like this:
reading_id customer_id datestamp value
1 1122 2009-02-11 112
2 1234 2009-02-13 18
3 1122 2009-03-08 125
4 1234 2009-03-10 40
5 1122 2009-04-12 160
6 1234 2009-04-11 62
I guess you need this sort of result set
customer_id datestamp value previous
1122 2009-03-08 125 112
1122 2009-04-12 160 125
1234 ...etcetera.
How can you get this? For each row in the table, you need a way to find the previous reading for the same customer: that is, the row with
the same customer id
the latest datestamp that occurs before the current datestamp.
This is a job for a so-called correlated subquery. Here's the query, with its subquery. (https://www.db-fiddle.com/f/hWGAbq4uAbA5f15j7oZY9o/0)
SELECT aft.customer_id,
       aft.datestamp,
       ( SELECT bef.value
           FROM r bef                              /* row from table.... */
          WHERE bef.datestamp < aft.datestamp      /* with datestamp < present datestamp */
            AND bef.customer_id = aft.customer_id  /* and same customer id */
          ORDER BY bef.datestamp DESC              /* most recent first */
          LIMIT 1                                  /* only most recent */
       ) prev,
       aft.value
  FROM r aft
 ORDER BY aft.customer_id, aft.datestamp
Notice that dealing with the first reading for each customer takes some thought in your business process.
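As a quick way to try this locally, the sketch below loads the sample readings into an in-memory SQLite database from Python and runs the same correlated subquery. The choice of SQLite and the column types are assumptions for illustration. The table name r follows the query above. Skipping rows whose previous reading is NULL is just one possible way to handle each customer's first reading.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE r (reading_id INTEGER, customer_id INTEGER, "
    "datestamp TEXT, value INTEGER)"
)
conn.executemany("INSERT INTO r VALUES (?, ?, ?, ?)", [
    (1, 1122, "2009-02-11", 112),
    (2, 1234, "2009-02-13", 18),
    (3, 1122, "2009-03-08", 125),
    (4, 1234, "2009-03-10", 40),
    (5, 1122, "2009-04-12", 160),
    (6, 1234, "2009-04-11", 62),
])

query = """
SELECT aft.customer_id,
       aft.datestamp,
       aft.value,
       (SELECT bef.value
          FROM r bef
         WHERE bef.customer_id = aft.customer_id
           AND bef.datestamp < aft.datestamp
         ORDER BY bef.datestamp DESC
         LIMIT 1) AS prev
  FROM r aft
 ORDER BY aft.customer_id, aft.datestamp
"""
readings = conn.execute(query).fetchall()

# Consumption = current reading minus previous reading.
# The first reading per customer has prev = NULL (None) and is skipped here.
consumption = [
    (cust, day, value - prev)
    for cust, day, value, prev in readings
    if prev is not None
]
for row in consumption:
    print(row)
```

The ISO-formatted date strings compare correctly as text, which is why a TEXT column is enough for this sketch.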

How to sum specific rows and columns in SQL?

pnr mnd pris
1 1 600
1 7 900
2 1 600
2 7 600
3 1 40
3 7 40
I'm having trouble summing specific rows of a column. The table above is called travel and it has 3 columns:
pnr - Personal Number
mnd - Month
Pris - Price
What I want is the total price for a specific month. In this case, month 1 should give 1240 USD and month 7 should give 1540 USD.
I'm having trouble getting the query right. This is what I have tried so far:
SELECT t.pnr, t.mnd, SUM(t.pris)
FROM travel AS t
WHERE t.mnd = 1
The result I get is 3720 USD, and I have no idea how SQL arrived at that number.
I'd appreciate it if someone could help me out!
For this you need to drop the pnr column from the output (it is not relevant and will cause your data to split) and add a GROUP BY:
SELECT t.mnd, SUM(t.pris)
FROM travel AS t
WHERE t.mnd = 1
GROUP BY t.mnd
Live demo: https://dbfiddle.uk/?rdbms=mysql_8.0&fiddle=b34ec2bb9c077c2d74ffc66748c5c142
(Selecting a non-aggregated column alongside an aggregate function without a GROUP BY, as your query does now, is not standard SQL. MySQL permits it only when the ONLY_FULL_GROUP_BY SQL mode is disabled, and when it is permitted you might not get the result you expected or intended.)
Just group your result by the mnd column:
SELECT t.mnd, SUM(t.pris)
FROM travel AS t
group by t.mnd
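To check the grouped totals against the sample data, here is a small Python sketch that loads the table into an in-memory SQLite database and runs the GROUP BY query. SQLite and the INTEGER column types are assumptions made only for this illustration. The table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE travel (pnr INTEGER, mnd INTEGER, pris INTEGER)")
conn.executemany("INSERT INTO travel VALUES (?, ?, ?)", [
    (1, 1, 600), (1, 7, 900),
    (2, 1, 600), (2, 7, 600),
    (3, 1, 40),  (3, 7, 40),
])

# One output row per month, with the prices of that month summed.
totals = dict(conn.execute(
    "SELECT t.mnd, SUM(t.pris) FROM travel AS t GROUP BY t.mnd"
).fetchall())
print(totals)
```

Adding WHERE t.mnd = 1 before the GROUP BY restricts the output to the single month, as in the first answer.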

Calculate max value of list of numbers with a maximum combination of "x"

OK, I'm not sure if I can explain this right.
Let's say I have a table with three columns (id, price, maxcombo).
Maybe there are 5 rows in this table with random numbers for price (id is just an incremental unique key).
maxcombo specifies that the price can be in a combination of up to whatever number it is.
If x was 3, I would need to find the combination that has the maximum value of the sum of 1-3 columns.
So say the table had:
1 - 100 - 1
2 - 50 - 3
3 - 10 - 3
4 - 15 - 3
5 - 20 - 2
the correct answer would be just row id 1,
since 100 alone (and can only be alone based on the maxcombo number)
is greater than say 50 + 20 + 15 or 20 + 15 or 10 + 20 etc.
Does that make sense?
I mean, I could just calculate all the different combinations and see which has the largest value, but I imagine that would take a very long time if the table had many more than 5 rows.
I was wondering if any math genius or super dev out there had some advice or a creative way to figure this out more efficiently.
Thanks ahead of time!
I built this solution to achieve the desired query. However, it hasn't been tested in terms of efficiency.
Following the example of columns 1-3:
SELECT max(a+b+c) FROM sample_table WHERE a < 3;
EDIT:
Looking at:
The correct answer will be just row id 1
...I considered that maybe I misunderstood your question, and that you want the query to return just the row id. So, I made this other one:
SELECT a FROM sum_combo WHERE a+b+c=(
SELECT max(a+b+c) FROM sum_combo WHERE a > 3
);
Which would for sure take too long in larger tables than just 5 rows.
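For comparison, here is a sketch in Python of one way to avoid enumerating every combination. It rests on one reading of the question, which is an assumption: a combination of k rows is valid only if k <= x and every chosen row has maxcombo >= k, and prices are non-negative. Under that reading, the best combination of each size k is simply the k highest-priced eligible rows, so it is enough to try each size from 1 to x. The function name best_combination is made up for this example.

```python
def best_combination(rows, x):
    """rows: list of (id, price, maxcombo). Returns (best_total, ids) or None.

    Assumes a combination of size k may only contain rows with maxcombo >= k.
    """
    best = None
    for k in range(1, x + 1):
        # Rows allowed in a combination of exactly k items, highest price first.
        eligible = sorted(
            (r for r in rows if r[2] >= k),
            key=lambda r: r[1],
            reverse=True,
        )
        if len(eligible) < k:
            continue  # not enough rows to form a combination of this size
        chosen = eligible[:k]
        total = sum(price for _, price, _ in chosen)
        if best is None or total > best[0]:
            best = (total, [row_id for row_id, _, _ in chosen])
    return best

table = [(1, 100, 1), (2, 50, 3), (3, 10, 3), (4, 15, 3), (5, 20, 2)]
print(best_combination(table, 3))
```

This sorts the table once per size, so the work grows roughly as x times n log n rather than with the number of possible combinations.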

Finding out max number of entries based on a pre-condition in mysql

I have a timestamp column with the following time entries. I'm writing them as letters for convenience.
Person Time
1 A
2 B
3 C
4 D
5 E
5 F
5 G
6 H
Now the objective is to group all entries generated by the same person that have a time difference of less than 2 hours between them, and to count the elements in each group.
Say I had 100 entries. If I consider the first 10, I need to check that all 10 are from the same person and that the time difference between successive entries is less than 2 hours; if so, the count is 10. If the time difference between the 10th and 11th entries is larger, the 11th is not counted. If successive entries were generated by different people, they cannot be grouped for the count.
So in principle it's like grouping successive entries that fit these criteria, dividing the table into sets (not breaking up the table, just grouping), and finding which set has the max count for a person. If entries 86 to 100 fit the criteria, the count is 15, provided they were all generated by the same person; if every other set has fewer than 10, the output of the query should be the person who produced this max count.
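The grouping rule described above can be sketched as a single pass over the rows. This is a Python illustration of the logic rather than a MySQL query, and it assumes the rows are already ordered by time. The function name longest_run and the sample timestamps are hypothetical, since the question only gives placeholder letters.

```python
from datetime import datetime, timedelta

def longest_run(entries, max_gap=timedelta(hours=2)):
    """entries: list of (person, datetime), ordered by time.

    Returns (person, count) for the longest run of successive entries by the
    same person where each gap between neighbours is below max_gap.
    """
    best = (None, 0)
    run_person, run_count, prev_time = None, 0, None
    for person, stamp in entries:
        continues_run = (
            person == run_person
            and prev_time is not None
            and stamp - prev_time < max_gap
        )
        if continues_run:
            run_count += 1
        else:
            run_person, run_count = person, 1   # start a new run
        prev_time = stamp
        if run_count > best[1]:
            best = (run_person, run_count)
    return best

# Hypothetical timestamps standing in for A..H in the question.
day = datetime(2024, 1, 1)
entries = [
    (1, day.replace(hour=8)),
    (2, day.replace(hour=8, minute=30)),
    (3, day.replace(hour=9)),
    (4, day.replace(hour=9, minute=30)),
    (5, day.replace(hour=10)),
    (5, day.replace(hour=11, minute=30)),
    (5, day.replace(hour=12, minute=45)),
    (6, day.replace(hour=13)),
]
print(longest_run(entries))
```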

To calculate sum of the fields in a matrix with column grouping

I am working on an SSRS report with column grouping. The following is my scenario.
Matrix 1:
ID 2012 2013
1 20 40
1 30 50
Total 50 90
Matrix 2:
ID 2012 2013
1 60 70
1 60 80
Total 120 150
I need the sum of matrix1 and matrix2 like below:
ID 2012 2013
1 170 240
But I got the result like :
ID 2012 2013
1 410 410
I have applied column grouping in all the 3 matrices and gave the expression to get sum for matrix 3 as: =Sum(Fields!amount1.Value, "dsmatrix1") + Sum(Fields!Tamount1.Value, "dsmatrix2")
Please help me to get a solution for this.
Thanks!
I think I know what's going on. Correct me if I'm wrong.
Based on what I'm seeing, I'm guessing that Matrix 1 and Matrix 2 only have three fields each, an ID field, an amount field (being "amount1" or "Tamount1"), and a year field.
Your column grouping is manipulating the display of the data to show all values broken out by year. This works fine when looking at data from a single dataset. However, your formula is specifying that the sum of everything in the Amount1 field of dsmatrix1 and the Tamount1 field of dsmatrix2 should be added. This does not take into account the column grouping. Your expression is essentially taking all of the values from both datasets and adding them together.
Not knowing more about your query structure or how the data is filtered, my best guess is that you need another SQL dataset. In this case, you would take the queries from your two previous datasets and union them with the "Union All" command. Note that you will want to use Union All and not just Union. More on that here: What is the difference between UNION and UNION ALL?
Your end result should look something like this:
--This will be your dsmatrix1 query copied and pasted
Select ...
Union All
--This will be your dsmatrix2 query copied and pasted
Select ...
--Place one single Order by clause at the bottom
Order by ...
Note: for your two queries to be unioned properly, you'll need to make sure that each has the same number of fields, with matching data types. Then you can point your third matrix to the new dataset.
Hope that helps!
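To see the arithmetic behind both the wrong and the desired totals, here is a small Python sketch using an in-memory SQLite database. It is only an illustration outside SSRS: the table layout, the yr column name, and the use of GROUP BY in place of the matrix's column grouping are assumptions. The amount field names follow the expression in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dsmatrix1 (ID INTEGER, yr INTEGER, amount1 INTEGER)")
conn.execute("CREATE TABLE dsmatrix2 (ID INTEGER, yr INTEGER, Tamount1 INTEGER)")
conn.executemany("INSERT INTO dsmatrix1 VALUES (?, ?, ?)",
                 [(1, 2012, 20), (1, 2013, 40), (1, 2012, 30), (1, 2013, 50)])
conn.executemany("INSERT INTO dsmatrix2 VALUES (?, ?, ?)",
                 [(1, 2012, 60), (1, 2013, 70), (1, 2012, 60), (1, 2013, 80)])

unioned = """
    SELECT ID, yr, amount1 AS amount FROM dsmatrix1
    UNION ALL
    SELECT ID, yr, Tamount1 FROM dsmatrix2
"""

# Summing per year over the combined rows gives the desired matrix 3 values.
per_year = conn.execute(
    f"SELECT ID, yr, SUM(amount) FROM ({unioned}) GROUP BY ID, yr ORDER BY yr"
).fetchall()

# Summing everything with no year grouping reproduces the unexpected 410.
grand_total = conn.execute(f"SELECT SUM(amount) FROM ({unioned})").fetchone()[0]

print(per_year, grand_total)
```

Because dsmatrix2 contains two identical rows (ID 1, 2012, 60), a plain UNION would collapse them into one and undercount, which is why UNION ALL matters here.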