What are Boolean Networks? - language-agnostic

I was reading this SO question and I got intrigued by what are Boolean Networks, I've looked it up in Wikipedia but the explanation is too vague IMO. Can anyone explain me what Boolean Networks are? And if possible with some examples too?

Boolean networks represent a class of networks where the nodes have states and the edges represent transitions between states. In the simplest case, these states are either 1 or 0 – i.e. boolean.
Transitions may be simple activations or inactivations. For example, consider nodes a and b with an edge from a to b.
f
a ------> b
Here, f is a transition function. In the case of activation, f may be defined as:
f(x) = x
i.e. b's value is 1 if and only if a's value is 1. Conversely, an inactivation (or repression) might look like this:
f(x) = NOT x
More complex networks use more involved boolean functions. E.g. consider:
a b
\ /
\ /
\ /
v
c
Here, we've got edges from a to c and from b to c. c might be defined in terms of a and b as follows.
f(a, b) = a AND NOT b
Thus, c is activated only if a is active and b is inactive, at the same time.
Such networks can be used to model all kinds of relations. One that I know of is in systems biology where they are used to model (huge) interaction networks of chemicals in living cells. These networks effectively model how certain aspects of the cells work and they can be used to find deficiencies, points of attack for drugs and similarities between unrelated components that point to functional equivalence. This is fundamentally important in understanding how life works.

Related

Can 1D CNNs infer a feature from two other included features?

I'm using a 1D CNN on temporal data. Let's say that I have two features A and B. The ratio between A and B (i.e. A/B) is important - let's call this feature C. I'm wondering if I need to explicitly calculate and include feature C, or can the CNN theoretically infer feature C from the given features A and B?
I understand that in deep learning, it's best to exclude highly-correlated features (such as feature C), but I don't understand why.
The short answer is NO. Using the standard DNN layers will not automatically capture this A/B relationship, because standard layers like Conv/Dense will only perform the matrix multiplication operations.
To simplify the discussion, let us assume that your input feature is two-dimensional, where the first dimension is A and the second is B. Applying a Conv layer to this feature simply learns a weight matrix w and bias b
y = w * [f_A, f_B] + b = w_A * f_A + w_B * f_B + b
As you can see, there is no way for this representation to mimic or even approximate the ratio operation between A and B.
You don't have to use the feature C in the same way as feature A and B. Instead, it may be a better idea to keep feature C as an individual input, because its dynamic range may be very different from those of A and B. This means that you can have a multiple-input network, where each input has its own feature extraction layers and the resulting features from both inputs can be concatenated together to predict your target.

Universal Quantification in Isabelle/HOL

It has come to my attention that there are several ways to deal with universal quantification when working with Isabelle/HOL Isar. I am trying to write some proofs in a style that is suitable for undergraduate students to understand and reproduce (that's why I'm using Isar!) and I am confused about how to express universal quantification in a nice way.
In Coq for example, I can write forall x, P(x) and then I may say "induction x" and that will automatically generate goals according to the corresponding induction principle. However, in Isabelle/HOL Isar, if I want to directly apply an induction principle I must state the theorem without any quantification, like this:
lemma foo: P(x)
proof (induct x)
And this works fine as x is then treated as a schematic variable, as if it was universally quantified. However, it lacks the universal quantification in the statement which is not very educational. Another way I have fund is by using \<And> and \<forall>. However, I can not directly apply the induction principle if I state the lemma in this way, I have to first fix the universally quantified variables... which again seems inconvenient from an educational point of view:
lemma foo: \<And>x. P(x)
proof -
fix x
show "P(x)"
proof (induct x)
What is a nice proof pattern for expressing universal quantification that does not require me to explicitly fix variables before induction?
You can use induct_tac, case_tac, etc. These are the legacy variant of the induct/induction and cases methods used in proper Isar. They can operate on bound meta-universally-quantified variables in the goal state, like the x in your second example:
lemma foo: "⋀x. P(x :: nat)"
proof (induct_tac x)
One disadvantage of induct_tac over induction is that it does not provide cases, so you cannot just write case (Suc x) and then from Suc.IH and show ?case in your proof. Another disadvantage is that addressing bound variables is, in general, rather fragile, since their names are often generated automatically by Isabelle and may change when Isabelle changes. (not in the case you have shown above, of course)
This is one of the reasons why Isar proofs are preferred these days. I would strongly advise against showing your students ‘bad’ Isabelle with the intention that it is easier for them to understand.
The facts are these: free variables in a theorem statement in Isabelle are logically equivalent to universally-quantified variables and Isabelle automatically converts them to schematic variables after you have proven it. This convention is not unique to Isabelle; it is common in mathematics and logic, and it helps to reduce clutter. Isar in particular tries to avoid explicit use of the ⋀ operator in goal statements (i.e. have/show; they still appear in assume).
Or, in short: free variables in theorems are universally quantified by default. I doubt that students will find this hard to understand; I certainly did not when I started with Isabelle as a BSc student. In fact, I found it much more natural to state a theorem as xs # (ys # zs) = (xs # ys) # zs instead of ∀xs ys zs. xs # (ys # zs) = (xs # ys) # zs.

Why does Feynman refer to this reversible gate as a NAND?

In chapter 2 of The Pleasure of Finding Things Out, Richard P. Feynman discusses the physical limitations associated with building computers of very small size. He introduces the concept of reversible logic gates:
The great discovery of Bennett and, independently, of Fredkin is that it is possible to do computation with a different kind of fundamental gate unit, namely, a reversible gate unit. I have illustrated their idea---with a unit which I could call a reversible NAND gate.
He goes on to describe the behaviour of what he calls a reversible NAND gate:
It has three inputs and three outputs. Of the outputs, two, A' and B', are the same as two of the inputs, A and B, but the third input works this way. C' is the same as C unless A and B are both 1, in which case C it changes whatever C is. For instance, if C is 1 it is changed to 0, if C is 0 it is changed to 1---but these changes only happen if both A and B are 1.
The book contains a diagram of that reversible gate, which I've attached below (link):
The truth table of a classical, irreversible NAND gate (inputs: A, B; output: C) is as follows:
If I understand Feynman's description correctly, the truth table of what Feynman refers to as a reversible NAND gate should be as follows:
However, I don't understand why Feynman is calling his gate a NAND. How is one supposed to derive the result of NAND(A,B) with his gate? It seems to me that NAND(A,B) cannot be derived directly from any of the three outputs (A',B',C'). NAND(A,B) is given by XOR(C,C'), but that would require an additional XOR gate. So why is Feynman calling his gate a NAND gate?
As others have mentioned, when input C = 1, then output C' = A NAND B.
The key, however, is that no information is destroyed. Given the outputs, the inputs can be determined. Contrast this with a normal NAND gate where the input cannot be determined given only the output; information is destroyed. With the reversible NAND gate, not information is lost. No information loss is significant because theoretically there then can be no energy loss.
If you leave C set, then C' is A NAND B.
(I haven't read the paper, but I'm guessing that C is what makes the gate "reversible" - with C set it's a NAND gate, and with C clear it's an AND gate.)

Pattern matching with associative and commutative operators

Pattern matching (as found in e.g. Prolog, the ML family languages and various expert system shells) normally operates by matching a query against data element by element in strict order.
In domains like automated theorem proving, however, there is a requirement to take into account that some operators are associative and commutative. Suppose we have data
A or B or C
and query
C or $X
Going by surface syntax this doesn't match, but logically it should match with $X bound to A or B because or is associative and commutative.
Is there any existing system, in any language, that does this sort of thing?
Associative-Commutative pattern matching has been around since 1981 and earlier, and is still a hot topic today.
There are lots of systems that implement this idea and make it useful; it means you can avoid write complicated pattern matches when associtivity or commutativity could be used to make the pattern match. Yes, it can be expensive; better the pattern matcher do this automatically, than you do it badly by hand.
You can see an example in a rewrite system for algebra and simple calculus implemented using our program transformation system. In this example, the symbolic language to be processed is defined by grammar rules, and those rules that have A-C properties are marked. Rewrites on trees produced by parsing the symbolic language are automatically extended to match.
The maude term rewriter implements associative and commutative pattern matching.
http://maude.cs.uiuc.edu/
I've never encountered such a thing, and I just had a more detailed look.
There is a sound computational reason for not implementing this by default - one has to essentially generate all combinations of the input before pattern matching, or you have to generate the full cross-product worth of match clauses.
I suspect that the usual way to implement this would be to simply write both patterns (in the binary case), i.e., have patterns for both C or $X and $X or C.
Depending on the underlying organisation of data (it's usually tuples), this pattern matching would involve rearranging the order of tuple elements, which would be weird (particularly in a strongly typed environment!). If it's lists instead, then you're on even shakier ground.
Incidentally, I suspect that the operation you fundamentally want is disjoint union patterns on sets, e.g.:
foo (Or ({C} disjointUnion {X})) = ...
The only programming environment I've seen that deals with sets in any detail would be Isabelle/HOL, and I'm still not sure that you can construct pattern matches over them.
EDIT: It looks like Isabelle's function functionality (rather than fun) will let you define complex non-constructor patterns, except then you have to prove that they are used consistently, and you can't use the code generator anymore.
EDIT 2: The way I implemented similar functionality over n commutative, associative and transitive operators was this:
My terms were of the form A | B | C | D, while queries were of the form B | C | $X, where $X was permitted to match zero or more things. I pre-sorted these using lexographic ordering, so that variables always occurred in the last position.
First, you construct all pairwise matches, ignoring variables for now, and recording those that match according to your rules.
{ (B,B), (C,C) }
If you treat this as a bipartite graph, then you are essentially doing a perfect marriage problem. There exist fast algorithms for finding these.
Assuming you find one, then you gather up everything that does not appear on the left-hand side of your relation (in this example, A and D), and you stuff them into the variable $X, and your match is complete. Obviously you can fail at any stage here, but this will mostly happen if there is no variable free on the RHS, or if there exists a constructor on the LHS that is not matched by anything (preventing you from finding a perfect match).
Sorry if this is a bit muddled. It's been a while since I wrote this code, but I hope this helps you, even a little bit!
For the record, this might not be a good approach in all cases. I had very complex notions of 'match' on subterms (i.e., not simple equality), and so building sets or anything would not have worked. Maybe that'll work in your case though and you can compute disjoint unions directly.

Examples of monoids/semigroups in programming

It is well-known that monoids are stunningly ubiquitous in programing. They are so ubiquitous and so useful that I, as a 'hobby project', am working on a system that is completely based on their properties (distributed data aggregation). To make the system useful I need useful monoids :)
I already know of these:
Numeric or matrix sum
Numeric or matrix product
Minimum or maximum under a total order with a top or bottom element (more generally, join or meet in a bounded lattice, or even more generally, product or coproduct in a category)
Set union
Map union where conflicting values are joined using a monoid
Intersection of subsets of a finite set (or just set intersection if we speak about semigroups)
Intersection of maps with a bounded key domain (same here)
Merge of sorted sequences, perhaps with joining key-equal values in a different monoid/semigroup
Bounded merge of sorted lists (same as above, but we take the top N of the result)
Cartesian product of two monoids or semigroups
List concatenation
Endomorphism composition.
Now, let us define a quasi-property of an operation as a property that holds up to an equivalence relation. For example, list concatenation is quasi-commutative if we consider lists of equal length or with identical contents up to permutation to be equivalent.
Here are some quasi-monoids and quasi-commutative monoids and semigroups:
Any (a+b = a or b, if we consider all elements of the carrier set to be equivalent)
Any satisfying predicate (a+b = the one of a and b that is non-null and satisfies some predicate P, if none does then null; if we consider all elements satisfying P equivalent)
Bounded mixture of random samples (xs+ys = a random sample of size N from the concatenation of xs and ys; if we consider any two samples with the same distribution as the whole dataset to be equivalent)
Bounded mixture of weighted random samples
Let's call it "topological merge": given two acyclic and non-contradicting dependency graphs, a graph that contains all the dependencies specified in both. For example, list "concatenation" that may produce any permutation in which elements of each list follow in order (say, 123+456=142356).
Which others do exist?
Quotient monoid is another way to form monoids (quasimonoids?): given monoid M and an equivalence relation ~ compatible with multiplication, it gives another monoid. For example:
finite multisets with union: if A* is a free monoid (lists with concatenation), ~ is "is a permutation of" relation, then A*/~ is a free commutative monoid.
finite sets with union: If ~ is modified to disregard count of elements (so "aa" ~ "a") then A*/~ is a free commutative idempotent monoid.
syntactic monoid: Any regular language gives rise to syntactic monoid that is quotient of A* by "indistinguishability by language" relation. Here is a finger tree implementation of this idea. For example, the language {a3n:n natural} has Z3 as the syntactic monoid.
Quotient monoids automatically come with homomorphism M -> M/~ that is surjective.
A "dual" construction are submonoids. They come with homomorphism A -> M that is injective.
Yet another construction on monoids is tensor product.
Monoids allow exponentation by squaring in O(log n) and fast parallel prefix sums computation. Also they are used in Writer monad.
The Haskell standard library is alternately praised and attacked for its use of the actual mathematical terms for its type classes. (In my opinion it's a good thing, since without it I'd never even know what a monoid is!). In any case, you might check out http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-Monoid.html for a few more examples:
the dual of any monoid is a monoid: given a+b, define a new operation ++ with a++b = b+a
conjunction and disjunction of booleans
over the Maybe monad (aka "option" in Ocaml), first and last. That is,first (Just a) b = Just a
first Nothing b = band likewise for last
The latter is just the tip of the iceberg of a whole family of monoids related to monads and arrows, but I can't really wrap my head around these (other than simply monadic endomorphisms). But a google search on monads monoids turns up quite a bit.
A really useful example of a commutative monoid is unification in logic and constraint languages. See section 2.8.2.2 of 'Concepts, Techniques and Models of Computer Programming' for a precise definition of a possible unification algorithm.
Good luck with your language! I'm doing something similar with a parallel language, using monoids to merge subresults from parallel computations.
Arbitrary length Roman numeral value computation.
https://gist.github.com/4542999