In what situation do we use ATTR Function in Tableau? - function

As the title, why and in what situation do we use ATTR Function in Tableau?
i understand what it is, and its simplest form.
thank you guys.

I know you say you understand what it is in its simplest form, but just for folks who stumble upon this question, I'd also offer this blog post, which gives an example and a "plain English" break-down of this Tableau Knowledge Base article "When to Use the Attribute (ATTR) Function". This same article is also linked to in this older SO question of a similar vein.
In a phrase from the blog post:
It returns a value if it is unique, else it returns *
As another example, this thread discusses ATTR in the context of calculating an average where date is less than the value from another data source.

The typical case is for a tooltip where most of the time there is only one value for a field among all the data rows at the level of detail of the viz, but there are a few exceptions. In that case, you may not want to create new marks by treating the field as a dimension, which would increase the level of detail of the viz. This is so common that Tableau automatically uses ATTR() when you put dimensions on the tooltip shelf.
Other cases are when you are putting a dimension on the detail or row shelves, and want to "debug" the level of detail of your viz. Say, you expect that using A, B, and C as dimensions will cause each mark to have the same value for D. You can use ATTR(D) to concisely show which combinations of A, B, C differ from your expectation.
ATTR is also useful with calculated fields in some cases, say in a IF condition when defining an aggregate calc.

Related

two-way ANOVA output - summary or apa.aov.table?

If I calculate a two-way ANOVA with the code "aov", what should I use to interpret the data, the "summary" of the ANOVA shown in the console or the apa.aov.table which produces a APA style table in word? The p-Values differ a lot.
I suspect that the reason why the p values differs is that the aov() function defaults to type I sums of squares. However, the apa.aov.table() function defaults to type III sums of squares, unless specified otherwise (documentation here: https://www.rdocumentation.org/packages/apaTables/versions/2.0.8/topics/apa.aov.table).
At least in my area, type III sums of squares is kind of the standard at the moment, so if that is also the case for you, I would recommend the Anova() function from the "Car" package, which allows you to specify which type you would like to use.

CUDA: Scatter communication pattern

I am learning CUDA from the Udacity's course on parallel programming. In a quiz, they have a given a problem of sorting a pre-ranked variable(player's height). Since, it is a one-one correspondence between input and output array, should it not be a Map communication pattern instead of a Scatter?
CUDA makes no canonical definition of these terms, that I know of. Therefore my answer is merely a suggestion of how it might be or have been interpreted.
"Since, it is a one-one correspondence between input and output array"
This statement doesn't appear to be supported by the diagram, which shows gaps in the output array, which have no corresponding input point associated with them.
If a smaller set of values are distributed into a larger array (with resultant gaps in the output array, therefore, in which no input value corresponds to the gap location(s)), then a scatter might be used to describe that operation. Both scatters and maps have maps which describe where the input values go, but it might be that the instructor has defined scatter and map in such a way as to differentiate between these two cases, such as the following plausible definitions:
Scatter: one-to-one relationship from input to output (ie. unidirectional relationship). Every input location has a corresponding output location, but not every output location has a corresponding input location.
Map: one-to-one relationship between input and output (ie. bidirectional relationship). Every input location has a corresponding output location, and every output location has a corresponding input location.
Gather: one-to-one relationship from output to input (ie. unidirection relationship). Every output location has a corresponding input location, but not every input location has a corresponding output location.
The definition of each communication pattern (map, scatter, gather, etc.) varies slightly from one language/environment/context to another, but since I have followed that same Udacity course I'll try to explain that term as I understand it in the context of the course:
The Map operation calculates each output element as a function of its corresponding input element, i.e.:
output[tid] = foo(input[tid]);
The Gather pattern calculates each output element as a function of one or more (usually more) input elements, not necessarily the corresponding one (typically these are elements from a neighborhood). For example:
output[tid] = (input[tid-1] + input[tid+1]) / 2;
Lastly, the Scatter operation has each input element contribute to one or more (again, usually more) output elements. For instance,
atomicAdd( &(output[tid-1]), input[tid]);
atomicAdd( &(output[tid]), input[tid]);
atomicAdd( &(output[tid+1]), input[tid]);
The example given in the question is clearly not a Map, because each output is calculated from an input at a different location.
Also, it is hard to see how the same example can be a scatter, because each input element only causes one write to the output, but it is indeed a scatter because each input causes a write to an output whose location is determined by the input.
In other words, each CUDA thread processes an input element at the location associated with its tid(thread ID number), and calculates where to write the result. More usually a scatter would write on several places instead of only one, so this is a particular case that might as well be named differently.
Each player has 3 properties (name, height, rank).
So I think scatter is correct, because we should consider these three things to make output.
If player has only one property like rank,
then Map is correct I think.
reference: Parallel Communication Patterns Recap in this lecture
reference: map/reduce/gather/scatter with image

sLDA. How much values response variable may have?

I try to understand in general how sLDA works. In contrast to LDA, it has 'a response variable associated with each document'. Is each document labeled just by one topic in training set or it might be labeled by multiple topics?
If it must use just one topic as label for one document, is there another LDA model which takes as input several labels for each document in training set?
If sLDA might use more then one topic as label, is there any implementation (in Python, R, C/C++, Matlab) for sLDA with multi-labels?
sLDA has a response variable that is a label, but that really has nothing to do directly with the topics. The topics are still inferred exactly as they are with regular LDA, using probability calculations to build up N topics. Each document ends up with a vector of length N indicating how strongly it "contains" each topic. In sLDA it goes one step further - where it also in the model internally correlates the response label with the topics, to be able to predict what the response label should be for a never before seen document based upon its topic vector.

What is a 'value' in the context of programming?

Can you suggest a precise definition for a 'value' within the context of programming without reference to specific encoding techniques or particular languages or architectures?
[Previous question text, for discussion reference: "What is value in programming? How to define this word precisely?"]
I just happened to be glancing through Pierce's "Types and Programming Languages" - he slips a reasonably precise definition of "value" in a programming context into the text:
[...] defines a subset of terms, called values, that are possible final results of evaluation
This seems like a rather tidy definition - i.e., we take the set of all possible terms, and the ones that can possibly be left over after all evaluation has taken place are values.
Based on the ongoing comments about "bits" being an unacceptable definition, I think this one is a little better (although possibly still flawed):
A value is anything representable on a piece of possibly-infinite Turing machine tape.
Edit: I'm refining this some more.
A value is a member of the set of possible interpretations of any possibly-infinite sequence of symbols.
That is equivalent to the earlier definition based on Turing machine tape, but it actually generalises better.
Here, I'll take a shot: A value is a piece of stored information (in the information-theoretical sense) that can be manipulated by the computer.
(I won't say that a value has meaning; a random number in a register may have no meaning, but it's still a value.)
In short, a value is some assigned meaning to a variable (the object containing the value)
For example type=boolean; name=help; variable=a storage location; value=what is stored in that location;
Further break down:
X = 2; where X is a variable while 2 is the value stored in X.
Have you checked the article in wikipedia?
In computer science, a value is a sequence of bits that is interpreted according to some data type. It is possible for the same sequence of bits to have different values, depending on the type used to interpret its meaning. For instance, the value could be an integer or floating point value, or a string.
Read the Wiki
Value = Value is what we call the "contents" that was stored in the variable
Variables = containers for storing data values
Example: Think of a folder named "Movies"(Variables) and inside of it are it contents which are namely; Pirates of the Carribean, Fantastic Beast, and Lala land, (this in turn is what we now call it's Values )

What's that CS "big word" term for the same action always having the same effect

There's a computer science term for this that escapes my head, one of those words that ends with "-icity".
It means something like a given action will always produce the same result, IE there won't be any hysteresis, or the action will not alter the functioning of the system...
Ring a bell, anyone? Thanks.
Apologies for the tagging, I'm only tagging it Java b/c I learned about this in a Java class back in school and I figure that crowd tends to have more CS background...
This could mean two different things:
deterministic - meaning that given the same initial state, the same operation (with exactly the same data) will always produce the same resulting state (and optional output.) - http://en.wikipedia.org/wiki/Deterministic_algorithm
i.e. same action has the same effect - assuming you start from the same place in the same system. (Nothing random about it, nothing fed in from the outside that could effect the result...)
idempotent - meaning applying a function to a value once e.g. f(x) = v produces the same result as applying the function multiple times e.g. f(f(f(x))) = v - http://en.wikipedia.org/wiki/Idempotence
i.e. one or more function applications yields the same value given the same initial value
you mean idempotent ??
Referential transparency is also used in some CS circles.
Nullipotent?
deterministic ,.,-=
Are you looking for invariant?
http://en.wikipedia.org/wiki/Invariant_%28computer_science%29
In computer science, a predicate is
called an invariant to a sequence of
operations if the predicate always
evaluates at the end of the sequence
to the same value as before starting
the sequence.
side effect-free?
In math, a function 'f' is idempotent if multiple applications do not change the result.
you mean idempotence?
or the action will not alter the functioning of the system...
Are you looking for ‘idempotence’?
The "ends with -icity" part of your question makes me think you might be looking for monotonicity, even though it does not quite match description/definition of the word. From the Wikipedia article:
In mathematics, a monotonic function (or monotone function) is a function which preserves the given order. This concept first arose in calculus, and was later generalized to the more abstract setting of order theory.
In the following illustrations (also borrowed from the Wikipedia article) three functions are drawn:
A:
B:
C:
A and B and both monotonic (increasing and decreasing respectively), while C is not monotonic.
You mean an atomic block of code?
The A in ACID.
Atomicity - states that database modifications must follow an “all or nothing” rule. Each transaction is said to be “atomic.” If one part of the transaction fails, the entire transaction fails.
It sounds like what you're describing would be a memoryless function. Although the term memorylessness is usually used for stochastic distributions, I don't quite remember if it has a programming equivalent...