How to build up EDGAR filing calculations when a fact is summed up multiple ways? - xbrl

The EDGAR documentation has only limited information on how to handle facts with different dimensional breakdowns. Take the AAPL annual report as an example:
On page 29 the total Net Sales (365,817) is split into products and services.
On page 37 the same total is split by Apple product line.
I am trying to figure out from the available files which elements should be added up to arrive at the total Net Sales. The problem is that in the XBRL extract file all the dimension sub-elements (product/service and iPhone/Mac/etc.) have the same tag (us-gaap:RevenueFromContractWithCustomerExcludingAssessedTax) and very similar contexts, each with a segment of <xbrldi:explicitMember dimension="srt:ProductOrServiceAxis">. The only difference is that one of the dimension sets is in the us-gaap: namespace and the other is in the aapl: namespace, but I do not think this is enough to go on in general. What if, for example, there were a third split of the total Net Sales, domestic vs. foreign, also in the aapl: namespace?
The EDGAR Filer Manual covers calculation rules in chapter 6.14.5, which says that facts in a calculation must appear in the same presentation, but in this case there is no calculation for adding up the dimension elements. If one clicks on the iPhone value, for example, it does not show that it adds up to the total Net Sales; it shows that it adds up to Gross Profit, because it is not an individual fact, only a dimension of the same fact as the total.
The other place where I found a reference is 6.15.3, but again it talks about adding up different facts to get to the same total. As said above, what is added up here is not different facts but only dimensions of the same fact.
I could probably separate the sets based on where these values appear in a presentation, but I would think there is a better way to identify which values form one set of a dimension and which form another.

FASB provides guidance on this topic here: https://www.fasb.org/consolidatedandnonconsolidatedentities_2018
Here's the data from the report:
tag                                                  value        uom  segments                                       iprx
RevenueFromContractWithCustomerExcludingAssessedTax  3.65817e+11  USD                                                 0
RevenueFromContractWithCustomerExcludingAssessedTax  3.65817e+11  USD                                                 1
RevenueFromContractWithCustomerExcludingAssessedTax  3.65817e+11  USD                                                 2
RevenueFromContractWithCustomerExcludingAssessedTax  1.53306e+11  USD  BusinessSegments=AmericasSegment;              0
RevenueFromContractWithCustomerExcludingAssessedTax  8.9307e+10   USD  BusinessSegments=EuropeSegment;                0
RevenueFromContractWithCustomerExcludingAssessedTax  6.8366e+10   USD  BusinessSegments=GreaterChinaSegment;          0
RevenueFromContractWithCustomerExcludingAssessedTax  2.8482e+10   USD  BusinessSegments=JapanSegment;                 0
RevenueFromContractWithCustomerExcludingAssessedTax  2.6356e+10   USD  BusinessSegments=RestOfAsiaPacificSegment;     0
RevenueFromContractWithCustomerExcludingAssessedTax  6.8366e+10   USD  Geographical=CN;                               0
RevenueFromContractWithCustomerExcludingAssessedTax  1.63648e+11  USD  Geographical=OtherCountries;                   0
RevenueFromContractWithCustomerExcludingAssessedTax  1.33803e+11  USD  Geographical=US;                               0
RevenueFromContractWithCustomerExcludingAssessedTax  3.1862e+10   USD  ProductOrService=IPad;                         0
RevenueFromContractWithCustomerExcludingAssessedTax  1.91973e+11  USD  ProductOrService=IPhone;                       0
RevenueFromContractWithCustomerExcludingAssessedTax  3.519e+10    USD  ProductOrService=Mac;                          0
RevenueFromContractWithCustomerExcludingAssessedTax  2.97392e+11  USD  ProductOrService=Product;                      0
RevenueFromContractWithCustomerExcludingAssessedTax  6.8425e+10   USD  ProductOrService=Service;                      0
RevenueFromContractWithCustomerExcludingAssessedTax  6.8425e+10   USD  ProductOrService=Service;                      1
RevenueFromContractWithCustomerExcludingAssessedTax  3.8367e+10   USD  ProductOrService=WearablesHomeandAccessories;  0
Here is a query that shows how the AAPL results can be rolled up consistently:
select sum(value) as total_sum
from num join dim on num.dimh=dim.dimhash
join pre on num.adsh=pre.adsh and num.tag=pre.tag and num.version=pre.version
where num.adsh='0000320193-21-000105' and ddate='20210930' and pre.stmt='IS' and num.tag='RevenueFromContractWithCustomerExcludingAssessedTax' and segments is null and iprx=0
union all select sum(value) as business_segments_sum
from num join dim on num.dimh=dim.dimhash
join pre on num.adsh=pre.adsh and num.tag=pre.tag and num.version=pre.version
where num.adsh='0000320193-21-000105' and ddate='20210930' and pre.stmt='IS' and num.tag='RevenueFromContractWithCustomerExcludingAssessedTax' and segments like 'BusinessSegments=%' and iprx=0
union all select sum(value) as geographical_sum
from num join dim on num.dimh=dim.dimhash
join pre on num.adsh=pre.adsh and num.tag=pre.tag and num.version=pre.version
where num.adsh='0000320193-21-000105' and ddate='20210930' and pre.stmt='IS' and num.tag='RevenueFromContractWithCustomerExcludingAssessedTax' and segments like 'Geographical=%' and iprx=0
union all select sum(value) as product_or_service_sum
from num join dim on num.dimh=dim.dimhash
join pre on num.adsh=pre.adsh and num.tag=pre.tag and num.version=pre.version
where num.adsh='0000320193-21-000105' and ddate='20210930' and pre.stmt='IS' and num.tag='RevenueFromContractWithCustomerExcludingAssessedTax' and segments like 'ProductOrService=%' and (segments like '%=Product;' or segments like '%=Service;') and iprx=0
Yielding:
total_sum
365817000000.0 (Reported total)
365817000000.0 (total by Business Segments)
365817000000.0 (total by Geographical Segments)
365817000000.0 (total by Products and Services)
The obvious drawback in this instance is that Apple did not use SubSegments to clarify that iPad, iPhone, Mac, and WearablesHomeandAccessories all belong to the Product segment and must not be added on top of the Product total, which would count them twice.
I have no idea why Apple ignores the clear FASB guidance on this subject, but they are the ones with a trillion-plus dollar market cap, not me.
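The double-counting risk described above can be checked mechanically. The following is a minimal Python sketch, not part of the EDGAR tooling: it groups the iprx=0 facts from the table by the axis name in `segments`, sums each group, and compares the result with the reported total. The `CHILD_MEMBERS` mapping is a hand-maintained assumption that stands in for the sub-segment relationship Apple did not tag; `sum_by_axis` is a hypothetical helper name.

```python
from collections import defaultdict

TOTAL = 365_817_000_000  # reported total Net Sales (empty segments, iprx=0)

# (segments, value) pairs with iprx=0, taken from the table above
FACTS = [
    ("BusinessSegments=AmericasSegment;", 153_306_000_000),
    ("BusinessSegments=EuropeSegment;", 89_307_000_000),
    ("BusinessSegments=GreaterChinaSegment;", 68_366_000_000),
    ("BusinessSegments=JapanSegment;", 28_482_000_000),
    ("BusinessSegments=RestOfAsiaPacificSegment;", 26_356_000_000),
    ("Geographical=CN;", 68_366_000_000),
    ("Geographical=OtherCountries;", 163_648_000_000),
    ("Geographical=US;", 133_803_000_000),
    ("ProductOrService=IPad;", 31_862_000_000),
    ("ProductOrService=IPhone;", 191_973_000_000),
    ("ProductOrService=Mac;", 35_190_000_000),
    ("ProductOrService=Product;", 297_392_000_000),
    ("ProductOrService=Service;", 68_425_000_000),
    ("ProductOrService=WearablesHomeandAccessories;", 38_367_000_000),
]

# Assumption: members that are a breakdown of another member on the same axis.
# The filing does not tag this, so the mapping is maintained by hand.
CHILD_MEMBERS = {
    "ProductOrService": {"IPad", "IPhone", "Mac", "WearablesHomeandAccessories"},
}


def sum_by_axis(facts, skip=None):
    """Sum fact values per axis, optionally skipping known child members."""
    skip = skip or {}
    totals = defaultdict(int)
    for segments, value in facts:
        axis, member = segments.rstrip(";").split("=", 1)
        if member in skip.get(axis, set()):
            continue
        totals[axis] += value
    return dict(totals)


naive = sum_by_axis(FACTS)
adjusted = sum_by_axis(FACTS, skip=CHILD_MEMBERS)
for axis in naive:
    print(axis, "naive ok:", naive[axis] == TOTAL, "adjusted ok:", adjusted[axis] == TOTAL)
```

An axis whose naive sum overshoots the total is a hint that it mixes a member with its own breakdown, which is the case for ProductOrService here.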

Use a transaction database to calculate the probability of an item appearing in a future transaction using R or SQL

I have a database of transactions like in the table below
user_id order_id order_number product_name n
<int> <int> <int> <fctr> <int>
1 11878590 3 Pistachios 1
1 11878590 3 Soda 1
1 12878790 4 Yogurt 1
1 12878790 4 Cheddar Popcorn 1
1 12878790 4 Cinnamon Toast Crunch 1
2 12878791 11 Milk Chocolate Almonds 1
2 12878791 11 Half & Half 1
2 12878791 11 String Cheese 1
11 12878792 19 Whole Milk 1
11 12878792 19 Pistachios 1
11 12878792 19 Soda 1
11 12878792 19 Paper Towel Rolls 1
The table has multiple users who each have multiple transactions. Some users only have 3 transactions, other users have 15, etc. This is all in one table.
I'm trying to calculate a transition matrix for a Markov model. I want to find the probability that an item will be in a new basket given that it was present in the previous basket of transactions.
I want my final table to look something like this
user_id product_name probability_present probability_absent
1 Soda .5 .5
1 Pistachios .5 .5
I'm having trouble figuring out how to get the data into a form so that I can calculate the probabilities and specifically coming up with a way to compare all of the t,t-1 combinations.
I have code that I've written to get things into this form, but I'm stuck at this point. I've written my code using the dplyr R package, but I could translate something in SQL into the R code. I can post my code in R if it will be helpful, but it is pretty simple at this point as I just had to do a few joins to get the table into this shape.
What else do I have to do to get the table/values that I'm trying to calculate?
This seems to give you the desired probabilities:
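SELECT t.user_id,
       t.product_name,
       -- share of this user's orders that contain the product
       COUNT(DISTINCT t.order_number) * 1.0 / u.n_orders AS prob_present,
       1 - COUNT(DISTINCT t.order_number) * 1.0 / u.n_orders AS prob_absent
FROM tbl t
JOIN (SELECT user_id, COUNT(DISTINCT order_number) AS n_orders
      FROM tbl GROUP BY user_id) u ON u.user_id = t.user_id
WHERE t.user_id = 1
GROUP BY t.user_id, t.product_name, u.n_orders;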
Or at least it gives you the numbers you have. If this is not right, please provide a slightly more complex example dataset.
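For the t versus t-1 comparison the question describes, here is a minimal Python sketch. It assumes that "previous basket" means the same user's immediately preceding order by order_number, and it estimates, for each user and product, how often a product that was in one basket is also in the next. The function name `transition_probabilities` and the sample rows are hypothetical, not taken from the question's data.

```python
from collections import defaultdict


def transition_probabilities(rows):
    """rows: iterable of (user_id, order_number, product_name).

    Returns {(user_id, product): P(in next basket | in previous basket)}.
    """
    # user_id -> order_number -> set of products in that basket
    baskets = defaultdict(lambda: defaultdict(set))
    for user_id, order_number, product in rows:
        baskets[user_id][order_number].add(product)

    seen = defaultdict(int)  # times the product was in basket t-1
    stay = defaultdict(int)  # times it was also in basket t
    for user_id, orders in baskets.items():
        ordered = [orders[k] for k in sorted(orders)]
        for prev, nxt in zip(ordered, ordered[1:]):
            for product in prev:
                seen[(user_id, product)] += 1
                stay[(user_id, product)] += product in nxt
    return {key: stay[key] / seen[key] for key in seen}


# Hypothetical sample: one user with three consecutive orders
sample = [
    (1, 1, "Soda"), (1, 1, "Pistachios"),
    (1, 2, "Soda"),
    (1, 3, "Yogurt"),
]
probs = transition_probabilities(sample)
print(probs)
```

probability_absent is one minus each value. Products that only appear in a user's last basket have no following basket to compare against, so they are left out of the result.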

Consolidated Data, except under certain conditions

I have the following results (temp table):
Product Side(buy/sell) TotalQuantity AverageWeightedPrice Cost
Prod1 1 100 120 12,000
Prod1 2 -50 130 -6,500
Prod2
Prod2
So on and so forth for multiple products.
I consolidated it to (with a groupby):
Product Side(buy/sell) TotalQuantity AverageWeightedPrice Cost
Prod1 1 50 110 5,500
Prod2
I want the consolidated results, except under certain conditions:
When Side 1 and Side 2 have the same quantity, the consolidated quantity would be 0 and I would no longer be able to calculate the AverageWeightedPrice.
Besides a consolidated quantity of 0, the other condition is when TotalQuantity and Cost have opposite signs, i.e. when Quantity is positive and Cost is negative (or Quantity is negative and Cost is positive).
Under those conditions, I want to return the unconsolidated data.
I am having trouble switching between consolidated and unconsolidated data in the same result set.
This is not something that SQL is really good at.
It is entirely possible to produce this result in pure SQL, but the solution is very complex, and you should avoid it if you have any other option: SQL offers no optimization for complex tasks like this, the syntax can be hard to master, and maintaining the query could prove difficult in the future.
In cases like this you should use SQL only to fetch the data and then format (consolidate) it in whichever programming language or application executes the query.
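As an illustration of doing the consolidation in application code, here is a minimal Python sketch. It assumes the rows have already been fetched as dictionaries, that the consolidated price is net cost divided by net quantity, and that the side of a consolidated row follows the sign of the net quantity (1 for positive, 2 for negative). The function name `consolidate` and the Prod2 rows are hypothetical.

```python
from collections import defaultdict


def consolidate(rows):
    """Net rows per product, but keep the original rows when netting
    would be meaningless (zero net quantity, or opposite signs)."""
    by_product = defaultdict(list)
    for row in rows:
        by_product[row["product"]].append(row)

    result = []
    for product, group in by_product.items():
        qty = sum(r["qty"] for r in group)
        cost = sum(r["cost"] for r in group)
        if qty == 0 or qty * cost < 0:
            result.extend(group)  # return the unconsolidated rows
        else:
            result.append({
                "product": product,
                "side": 1 if qty > 0 else 2,  # assumption: side follows sign
                "qty": qty,
                "price": cost / qty,
                "cost": cost,
            })
    return result


rows = [
    {"product": "Prod1", "side": 1, "qty": 100, "price": 120, "cost": 12000},
    {"product": "Prod1", "side": 2, "qty": -50, "price": 130, "cost": -6500},
    # hypothetical rows whose quantities cancel out
    {"product": "Prod2", "side": 1, "qty": 100, "price": 10, "cost": 1000},
    {"product": "Prod2", "side": 2, "qty": -100, "price": 12, "cost": -1200},
]
out = consolidate(rows)
for row in out:
    print(row)
```

Prod1 collapses to a single row (quantity 50, price 110, cost 5,500), while both Prod2 rows are returned unchanged because their net quantity is 0.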

Isolating unique observations and calculating the average in Stata

Currently I have a dataset that appears as follows:
mnbr firm contribution
1591 2 1
9246 6 1
812 6 1
674 6 1
And so on. The idea is that mnbr is the member number of employees who work at firm # whatever. If contribution is 1 (and I have dropped all the 0s for this purpose) said employee has contributed to a certain fund.
I additionally used codebook to determine the number of unique firms that exist. The goal is to determine the average number of contributions per firm, i.e. there was 1 contribution for firm 2, 3 contributions for firm 6, and so on. The problem I run into is accessing the number of unique values that codebook reports.
I read some documentation online for
inspect *varlist*
display r(N_unique)
which suggests to me that r(N_unique) stores that value, yet unfortunately this method did not work for me. So that is part 1.
Part 2: I'd also like to create a variable that shows the share of contributions in each firm, i.e.
mnbr firm contribution average
1591 2 1 1
9246 6 . 2/3
812 6 1 2/3
674 6 1 2/3
to show that for firm 6, 2 out of the 3 employees contributed to this fund.
Thanks in advance for the help.
To answer your comment, this works for me:
clear
set more off
input ///
mnbr firm cont
1591 2 1
9246 6 .
812 6 1
674 6 1
end
list
// problem 1
inspect firm
display r(N_unique)
// problem 2
bysort firm: egen totc = total(cont)
by firm: gen share = totc / _N
list
You have to use r(N_unique) before running another Stata command, or it can get lost. You can also save that result to a local or scalar.
Problem 2 is also addressed.
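For readers who want to cross-check the Stata output outside Stata, the same two quantities can be sketched in plain Python. This is only an illustration using the four example rows; a missing contribution (`.` in Stata) is represented as `None` and counted as 0 in the total, which matches what egen total() does.

```python
# (mnbr, firm, cont) rows from the example; None stands for Stata's missing "."
rows = [(1591, 2, 1), (9246, 6, None), (812, 6, 1), (674, 6, 1)]

# firm -> (total contributions, number of employees)
firms = {}
for mnbr, firm, cont in rows:
    total, n = firms.get(firm, (0, 0))
    firms[firm] = (total + (cont or 0), n + 1)

# problem 1: number of unique firms and average contributions per firm
n_unique = len(firms)
avg_per_firm = sum(total for total, _ in firms.values()) / n_unique

# problem 2: share of employees in each firm who contributed
share = {firm: total / n for firm, (total, n) in firms.items()}

print(n_unique, avg_per_firm, share)
```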

Access Calculated Field

I am having difficulty trying to make a calculated field that I need. So here is what I am trying to do:
I have a query that combines information from three tables. The most important fields for the application are as follows:
Family Income Age Patient
15,000 18 Yes
28,000 25 No
30,000 1 Yes
From here I want to make a calculated field that gives the correct program the patient was enrolled in, based on these fields, i.e.:
Program Minimum Income Maximum Income Minimum Age Maximum Age Patient
Children's 0 20,000 1 19 Yes
Adult 0 12,000 19 65 No
Non Patient 0 20,000 1 19 No
Adult 2 12,000 50,000 19 65 No
Etc.
to create:
Family Income Age Patient Program
15,000 18 Yes Children's
28,000 25 No Adult 2
30,000 1 Yes Children's 2
I know I can use IIf to hard-code it into the field, but then it will be really difficult for other people to update the information as the guidelines change. Is it possible to store the information in a table and use it from there, or will I need to use IIf?
Any ideas? Is it possible to dynamically create the IIf in SQL using VBA while pulling the information from the table?
EDIT:
Thank you for your response and for formatting my tables. I still have no idea how you changed it, but it looks amazing!
I tried to add the SQL you posted below, but I was not able to make it work. I'm not sure if I made a mistake, so I have included the SQL of my query. The query currently returns 0 rows, so I think I messed something up. (The real query is embarrassing... I'm sorry for that.) Unfortunately, I have done everything in my power to avoid SQL, and now I am paying the price.
SELECT qry_CombinedIndividual.qry_PrimaryApplicant.[Application Date],
qry_CombinedIndividual.qry_PrimaryApplicant.[Eligibility Rep],
qry_CombinedIndividual.qry_PrimaryApplicant.Name,
qry_CombinedIndividual.qry_PrimaryApplicant.Clinic,
qry_CombinedIndividual.qry_PrimaryApplicant.Outreach,
qry_CombinedIndividual.qry_PrimaryApplicant.[Content Type ID],
qry_CombinedIndividual.qry_PrimaryApplicant.[Application Status],
qry_CombinedIndividual.qry_PrimaryApplicant.Renewal,
qry_CombinedIndividual.qry_Enrolled.EthnicityEnr,
qry_CombinedIndividual.qry_Enrolled.GenderEnr, qry_CombinedIndividual.AgeAtApp,
qry_CombinedIndividual.[Percent FPL], tbl_ChildrensMedical.MinPercentFPL,
tbl_ChildrensMedical.MaxPercentFPL, tbl_ChildrensMedical.MinAge,
tbl_ChildrensMedical.MaxAge, tbl_ChildrensMedical.Program
FROM qry_CombinedIndividual
INNER JOIN tbl_ChildrensMedical ON qry_CombinedIndividual.qry_Enrolled.Patient = tbl_ChildrensMedical.Patient
WHERE (((qry_CombinedIndividual.AgeAtApp)>=[tbl_ChildrensMedical].[MinAge]
And (qry_CombinedIndividual.AgeAtApp)<[tbl_ChildrensMedical].[MinAge])
AND ((qry_CombinedIndividual.[Percent FPL])>=[tbl_ChildrensMedical].[MinPercentFPL]
And (qry_CombinedIndividual.[Percent FPL])<[tbl_ChildrensMedical].[MaxPercentFPL]));
Also there are many different programs. Here is the real Children's Table (eventually I would like to add adults if possible)
*Note: the actual table uses FPL (which takes family size into account, but is used the same way as income). I am again at a total loss as to how you formatted the table.
Program Patient MinPercentFPL MaxPercentFPL MinAge MaxAge
SCHIP (No Premium) No 0 210 1 19
SCHIP (Tier 1) No 210 260 1 19
SCHIP (Tier 2) No 260 312 1 19
Newborn No 0 300 0 1
Newborn (Patient) Yes 0 300 0 1
Children's Medical Yes 0 200 1 19
CHIP (20 Premium) Yes 200 250 1 19
CHIP (30 Premium) Yes 250 300 1 19
Do I have the correct implementation for the table I have, or should I be changing something? I can also send more information/sample data if that would help.
Thank you again!
I just created some tables with your sample data and used the following SQL. Your 3rd 'patient' doesn't match any of the ranges (Age 1, Income $30K)
SELECT tblPatient.PatName, tblPatient.FamInc, tblPatient.Age, tblPatient.Patient,
tblPatientRange.Program, tblPatientRange.MinInc, tblPatientRange.MaxInc, tblPatientRange.MinAge,
tblPatientRange.MaxAge, tblPatientRange.Patient
FROM tblPatient INNER JOIN tblPatientRange ON tblPatient.Patient = tblPatientRange.Patient
WHERE (((tblPatient.FamInc)>=[tblPatientRange]![MinInc] And (tblPatient.FamInc)<=[tblPatientRange]![MaxInc])
AND ((tblPatient.Age)>=[tblPatientRange]![MinAge] And (tblPatient.Age)<=[tblPatientRange]![MaxAge]));
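The table-driven lookup can be tried out quickly outside Access. Below is a minimal Python sketch using the standard sqlite3 module and the Children's table from the question. The table names `applicants` and `ranges` and the three sample applicants are hypothetical, and the ranges are treated as half-open (minimum inclusive, maximum exclusive), which is an assumption chosen so that adjacent tiers do not overlap. One thing worth checking in the query posted in the question: its age condition compares AgeAtApp against MinAge on both sides (>= MinAge and < MinAge), which no row can satisfy and would explain an empty result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ranges (Program TEXT, Patient TEXT,
                     MinPercentFPL REAL, MaxPercentFPL REAL,
                     MinAge INTEGER, MaxAge INTEGER);
INSERT INTO ranges VALUES
  ('SCHIP (No Premium)', 'No',    0, 210, 1, 19),
  ('SCHIP (Tier 1)',     'No',  210, 260, 1, 19),
  ('SCHIP (Tier 2)',     'No',  260, 312, 1, 19),
  ('Newborn',            'No',    0, 300, 0,  1),
  ('Newborn (Patient)',  'Yes',   0, 300, 0,  1),
  ('Children''s Medical','Yes',   0, 200, 1, 19),
  ('CHIP (20 Premium)',  'Yes', 200, 250, 1, 19),
  ('CHIP (30 Premium)',  'Yes', 250, 300, 1, 19);

-- hypothetical applicants for illustration
CREATE TABLE applicants (Name TEXT, PercentFPL REAL, Age INTEGER, Patient TEXT);
INSERT INTO applicants VALUES
  ('A', 150, 5, 'Yes'),
  ('B', 220, 5, 'No'),
  ('C', 100, 0, 'Yes');
""")

# Join on the Patient flag, then keep the row whose ranges contain the applicant.
rows = conn.execute("""
SELECT a.Name, r.Program
FROM applicants AS a
JOIN ranges AS r ON a.Patient = r.Patient
WHERE a.PercentFPL >= r.MinPercentFPL AND a.PercentFPL < r.MaxPercentFPL
  AND a.Age >= r.MinAge AND a.Age < r.MaxAge
ORDER BY a.Name
""").fetchall()

programs = dict(rows)
print(programs)
```

Because the rules live in a table, changing a guideline means editing a row rather than rewriting nested IIf expressions.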

Removing redundant values in SSRS report group

I am developing an SSRS report with the following dataset. There is a filter for 'Period'. It is a multi-select filter. Data is grouped by 'Account' field. I need to display Total Expense for each group (which was easy). I also need to display 'Budget' on the same group level. The problem is the budget data is redundant - see below.
Say for the first group (Account=100 AND Period=201301), Sum([Budget]) would generate 200, which is not true. I can use the Average function which helps if user selects only one Period from the filter. If they select multiple values (e.g. 201301,201302) then the average will be (100+100+150+150)/4=125, which would be wrong because it has to be 100+150=250. I don't want to average among all rows in the returned dataset.
ID Account Period Expense Budget
1 100 201301 20 100
2 100 201301 30 100
3 100 201302 10 150
4 100 201302 40 150
5 200 ...................
So, how do I write an expression to make this happen?
A dirty workaround would be to eliminate redundant values in the Budget column so I can safely use Sum([Budget]) without worrying about duplication. The updated dataset would look like this:
ID Account Period Expense Budget
1 100 201301 20 100
2 100 201301 30 NULL
3 100 201302 10 150
4 100 201302 40 NULL
5 200 ...................
Please advise on either approach. Thank you.
The most elegant way is to use the FIRST() aggregate function.
=FIRST(Fields!Budget.Value, "MyAccountGroupName")
There are some situations where this won't work. Then you need to move the logic to your query as you describe or you can get fancy with embedded code in your report.
I would follow your "dirty workaround" approach. You might possibly be able to achieve the result just inside SSRS with some fancy calculations, but it will be totally obscure.
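To make the intended arithmetic concrete, here is a minimal Python sketch of the de-duplication idea, using the four example rows. It is not an SSRS expression: it only shows that keeping one Budget value per (Account, Period) pair before summing gives 100 + 150 = 250 when both periods are selected. The function name `account_totals` is hypothetical.

```python
# (ID, Account, Period, Expense, Budget) rows from the example dataset
rows = [
    (1, 100, 201301, 20, 100),
    (2, 100, 201301, 30, 100),
    (3, 100, 201302, 10, 150),
    (4, 100, 201302, 40, 150),
]


def account_totals(rows, periods):
    """Return {account: (total expense, total budget)} for the selected periods."""
    expense = {}
    budget = {}  # one Budget value per (account, period), repeats overwrite
    for _id, account, period, exp, bud in rows:
        if period not in periods:
            continue
        expense[account] = expense.get(account, 0) + exp
        budget[(account, period)] = bud

    totals = {}
    for (account, _period), bud in budget.items():
        exp_total, bud_total = totals.get(account, (expense[account], 0))
        totals[account] = (exp_total, bud_total + bud)
    return totals


both = account_totals(rows, {201301, 201302})
single = account_totals(rows, {201301})
print(both, single)
```

The same effect can be had in the dataset query by returning Budget only on the first row of each Account and Period, as in the workaround described in the question.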