While using relation algebra (DBMS), what is the order of evaluation of the predicate?
For eg.
σA = B ^ D > 5(r)
Is A = B evaluated first (left to right) or D > 5 (right to left)?
Also, is there a precedence table in relation algebra?
Also, is there a precedence table in relation algebra?
Philip is correct to ask which version/which definition of RA you are using.
In Codd's original 1972 RA you couldn't combine conditions with an AND (you've used ^) like that. You'd have to write that restriction as
σA = B(r) ∩ σD > 5(r)
If you're asking these questions because you think RA is some sort of execution engine for SQL: it isn't; in fact the semantics of RA are different to SQL in several important respects.
So if you're really asking about an execution plan for a query in SQL, I would look at the execution plan in your SQL.
There is no single RA (relational algebra) or associated algebra-oriented query language. What is your textbook/tool name & edition/version? What are its relevant definitions?
Binding tells you how to parse into arguments. When there are no state changes & no undefined names, it doesn't make sense to talk about "evaluated first" or "left to right" between arguments. When names can be undefined then there might be operators that allow defined results despite undefined arguments. Eg C's && (aka CAND aka conditional AND) that is like AND but only evaluates its second argument if the first is false.
An algebra is a collection of values & operators. Many presentations of relational algebras confuse & conflate their relational algebra with their language/notation for writing nested calls to relational algebra operators. Sometimes the language/notation has assignment, which has nothing to do with relational algebra per se.
In relational query languages a restriction/selection formula evaluates to a mapping from a tuple to a boolean. It doesn't evaluate to a boolean. We can reasonably say that the mapping is applied to each tuple in a relation to get a boolean from it.
The standard convention in logic for formulas is that binding gets weaker from parentheses to function calls to AND to OR to IMPLIES.
Many relational algebra-oriented query languages reduce ambiguity by forcing parentheses to be present around relations in calls of some relational operators. Eg Typically the parentheses in notation like σ A = B AND D > 5 (r) are obligatory.
Related
I'm not even sure if this is the right place to ask a question like this.
As a part of my MSc thesis, I am doing some parallel algorithm stuff. To put it simply part of the thing that I am doing is evaluating thousands of expression trees in parallel (expressions like sin(exp (x + y) * cos (z))). What I am doing right now is converting these expression trees to Prefix/Postfix expressions and evaluating them using conventional methods (stack, recursion, etc). These are the basic things that we've all been taught in Data Structures and basic Computer Science courses.
I'm wondering if there is anything else to be used instead of expression trees for dealing with expressions. I know that compilers are heavily using expression trees for parsing phase so I'm assuming there are no alternatives to expression trees (or else someone would have used it in a compiler).
Are there any alternative evaluation methods for such expressions (rather than stacks and recursion). Something more "parallel" friendly? Parsing such expression with stack is sequential and will create a bottleneck in parallel systems. (Exotic/weird/theoretic methods -if any- are also acceptable for my work)
I think that evaluating expression trees is parallelizable, you just don't convert them to the prefix or postfix form.
For example, the tree for the expression you gave would look like this:
sin
|
*
/ \
exp cos
| |
+ z
/ \
x y
When you encounter the *, you could evaluate the exp subexpression on one thread and the cos subexpression on another one. (You could use a future here to make the code simpler, assuming your programming language supports them.)
Although if your expressions really are as simple as this one and you have thousands of them, then I don't see any reason why you would need to evaluate a single expression in parallel. Parallelizing on the expressions themselves should be more than enough (e.g. with 1000 expressions and 2 cores, evaluate 500 on one core and the rest on the other core).
I've read definition of these 2 notions on wiki, but the difference is still not clear. Could someone give examples and some easy explanation?
A declarative specification describes an operation or a function with a constraint that relates the output to the input. Rather than giving you a recipe for computing the output, it gives you a rule for checking that the output is correct. For example, consider a function that takes an array a and a value e, and returns the index of an element of the array matching e. A declarative specification would say exactly that:
function index (a, e)
returns i such that a[i] = e
In contrast, an operational specification would look like code, eg with a loop through the indices to find i. Note that declarative specifications are often non-deterministic; in this case, if e matches more than one element of e, there are several values of i that are valid (and the specification doesn't say which to return). This is a powerful feature. Also, declarative specifications are often not total: here, for example, the specification assumes that such an i exists (and in some languages you would add a precondition to make this explicit).
To support declarative specification, a language must generally provide logical operators (especially conjunction) and quantifiers.
A model-based language is one that uses standard mathematical structures (such as sets and relations) to describe the state. Alloy and Z are both model based. In contrast, algebraic languages (such as OBJ and Larch) use equations to describe state implicitly. For example, to specify an operation that inserts an element in a set, in an algebraic language you might write something like
member(insert(s,e),x) = (e = x) or member(s,x)
which says that after you insert e into s, the element x will be a member of the set if you just inserted that element (e is equal to x) or if it was there before (x is a member of s). In contrast, in a language like Z or Alloy you'd write something like
insert (s, e)
s' = s U {e}
with a constraint relating the new value of the set (s') to the old value (s).
For many examples of declarative, model-based specification, see the materials on Alloy at http://alloy.mit.edu, or my book Software Abstractions. You can also see comparisons between model-based declarative languages through an example in the appendix of the book, available at the book's website http://softwareabstractions.org.
This seems to me pretty obvious, There is not but I might be leaving a special case.
As I see it 1SAT (only one literal per clause) and 2SAT can be easily transformed into 3SAT.
An any clause with more than 3 literas has been proven it can be transformed into 3SAT.
So maybe the question should be asked as:
Do all boolean algebra can be put into SAT? or
can we define boolean algebra with ony these operators? AND OR and NOT
No, there is not.
I will not give the full proof but here is the main idea: Write the given formula in a normal form i.e. conjunction of disjunctions. Use induction on the number of variables on an expression. Pick the longest subexpression with n+1 variables, introduce a new variable for some part of subexpression to leave an expression of n variables, add the constraints for the new variable to the formula, repeat the procedure as many times as needed to have a formula where the longest subexpression has n variables.
Pattern matching (as found in e.g. Prolog, the ML family languages and various expert system shells) normally operates by matching a query against data element by element in strict order.
In domains like automated theorem proving, however, there is a requirement to take into account that some operators are associative and commutative. Suppose we have data
A or B or C
and query
C or $X
Going by surface syntax this doesn't match, but logically it should match with $X bound to A or B because or is associative and commutative.
Is there any existing system, in any language, that does this sort of thing?
Associative-Commutative pattern matching has been around since 1981 and earlier, and is still a hot topic today.
There are lots of systems that implement this idea and make it useful; it means you can avoid write complicated pattern matches when associtivity or commutativity could be used to make the pattern match. Yes, it can be expensive; better the pattern matcher do this automatically, than you do it badly by hand.
You can see an example in a rewrite system for algebra and simple calculus implemented using our program transformation system. In this example, the symbolic language to be processed is defined by grammar rules, and those rules that have A-C properties are marked. Rewrites on trees produced by parsing the symbolic language are automatically extended to match.
The maude term rewriter implements associative and commutative pattern matching.
http://maude.cs.uiuc.edu/
I've never encountered such a thing, and I just had a more detailed look.
There is a sound computational reason for not implementing this by default - one has to essentially generate all combinations of the input before pattern matching, or you have to generate the full cross-product worth of match clauses.
I suspect that the usual way to implement this would be to simply write both patterns (in the binary case), i.e., have patterns for both C or $X and $X or C.
Depending on the underlying organisation of data (it's usually tuples), this pattern matching would involve rearranging the order of tuple elements, which would be weird (particularly in a strongly typed environment!). If it's lists instead, then you're on even shakier ground.
Incidentally, I suspect that the operation you fundamentally want is disjoint union patterns on sets, e.g.:
foo (Or ({C} disjointUnion {X})) = ...
The only programming environment I've seen that deals with sets in any detail would be Isabelle/HOL, and I'm still not sure that you can construct pattern matches over them.
EDIT: It looks like Isabelle's function functionality (rather than fun) will let you define complex non-constructor patterns, except then you have to prove that they are used consistently, and you can't use the code generator anymore.
EDIT 2: The way I implemented similar functionality over n commutative, associative and transitive operators was this:
My terms were of the form A | B | C | D, while queries were of the form B | C | $X, where $X was permitted to match zero or more things. I pre-sorted these using lexographic ordering, so that variables always occurred in the last position.
First, you construct all pairwise matches, ignoring variables for now, and recording those that match according to your rules.
{ (B,B), (C,C) }
If you treat this as a bipartite graph, then you are essentially doing a perfect marriage problem. There exist fast algorithms for finding these.
Assuming you find one, then you gather up everything that does not appear on the left-hand side of your relation (in this example, A and D), and you stuff them into the variable $X, and your match is complete. Obviously you can fail at any stage here, but this will mostly happen if there is no variable free on the RHS, or if there exists a constructor on the LHS that is not matched by anything (preventing you from finding a perfect match).
Sorry if this is a bit muddled. It's been a while since I wrote this code, but I hope this helps you, even a little bit!
For the record, this might not be a good approach in all cases. I had very complex notions of 'match' on subterms (i.e., not simple equality), and so building sets or anything would not have worked. Maybe that'll work in your case though and you can compute disjoint unions directly.
It is well-known that monoids are stunningly ubiquitous in programing. They are so ubiquitous and so useful that I, as a 'hobby project', am working on a system that is completely based on their properties (distributed data aggregation). To make the system useful I need useful monoids :)
I already know of these:
Numeric or matrix sum
Numeric or matrix product
Minimum or maximum under a total order with a top or bottom element (more generally, join or meet in a bounded lattice, or even more generally, product or coproduct in a category)
Set union
Map union where conflicting values are joined using a monoid
Intersection of subsets of a finite set (or just set intersection if we speak about semigroups)
Intersection of maps with a bounded key domain (same here)
Merge of sorted sequences, perhaps with joining key-equal values in a different monoid/semigroup
Bounded merge of sorted lists (same as above, but we take the top N of the result)
Cartesian product of two monoids or semigroups
List concatenation
Endomorphism composition.
Now, let us define a quasi-property of an operation as a property that holds up to an equivalence relation. For example, list concatenation is quasi-commutative if we consider lists of equal length or with identical contents up to permutation to be equivalent.
Here are some quasi-monoids and quasi-commutative monoids and semigroups:
Any (a+b = a or b, if we consider all elements of the carrier set to be equivalent)
Any satisfying predicate (a+b = the one of a and b that is non-null and satisfies some predicate P, if none does then null; if we consider all elements satisfying P equivalent)
Bounded mixture of random samples (xs+ys = a random sample of size N from the concatenation of xs and ys; if we consider any two samples with the same distribution as the whole dataset to be equivalent)
Bounded mixture of weighted random samples
Let's call it "topological merge": given two acyclic and non-contradicting dependency graphs, a graph that contains all the dependencies specified in both. For example, list "concatenation" that may produce any permutation in which elements of each list follow in order (say, 123+456=142356).
Which others do exist?
Quotient monoid is another way to form monoids (quasimonoids?): given monoid M and an equivalence relation ~ compatible with multiplication, it gives another monoid. For example:
finite multisets with union: if A* is a free monoid (lists with concatenation), ~ is "is a permutation of" relation, then A*/~ is a free commutative monoid.
finite sets with union: If ~ is modified to disregard count of elements (so "aa" ~ "a") then A*/~ is a free commutative idempotent monoid.
syntactic monoid: Any regular language gives rise to syntactic monoid that is quotient of A* by "indistinguishability by language" relation. Here is a finger tree implementation of this idea. For example, the language {a3n:n natural} has Z3 as the syntactic monoid.
Quotient monoids automatically come with homomorphism M -> M/~ that is surjective.
A "dual" construction are submonoids. They come with homomorphism A -> M that is injective.
Yet another construction on monoids is tensor product.
Monoids allow exponentation by squaring in O(log n) and fast parallel prefix sums computation. Also they are used in Writer monad.
The Haskell standard library is alternately praised and attacked for its use of the actual mathematical terms for its type classes. (In my opinion it's a good thing, since without it I'd never even know what a monoid is!). In any case, you might check out http://www.haskell.org/ghc/docs/latest/html/libraries/base/Data-Monoid.html for a few more examples:
the dual of any monoid is a monoid: given a+b, define a new operation ++ with a++b = b+a
conjunction and disjunction of booleans
over the Maybe monad (aka "option" in Ocaml), first and last. That is,first (Just a) b = Just a
first Nothing b = band likewise for last
The latter is just the tip of the iceberg of a whole family of monoids related to monads and arrows, but I can't really wrap my head around these (other than simply monadic endomorphisms). But a google search on monads monoids turns up quite a bit.
A really useful example of a commutative monoid is unification in logic and constraint languages. See section 2.8.2.2 of 'Concepts, Techniques and Models of Computer Programming' for a precise definition of a possible unification algorithm.
Good luck with your language! I'm doing something similar with a parallel language, using monoids to merge subresults from parallel computations.
Arbitrary length Roman numeral value computation.
https://gist.github.com/4542999