I'm doing some homework, and having a hard time understanding closures. This is in relation to boolean algebra mostly, not any specific programming language.
Here's an example:
Are the following sets closed under the following operations?
The language {a,b} under concatenation.
Now, from this: http://en.wikipedia.org/wiki/Closure_%28mathematics%29, it would seem that because the concatenation of the language {a,b} can produce results that are not members of the original set {a,b}, like ab, aa, bb, etc, the set is NOT closed under the concatenation operation.
Am I looking at this correctly? I feel its easy to misinterpret that definition. I feel like it might mean that if the operation produces results that CAN be created by the given language, then the set is closed under that operation.
Anyone want to take a stab at this and help me out? :)
Thanks!
In computational theory, you usually distinguish betweeen the set of symbols (characters, letters, etc.) and the set of words. It never really makes sense to discuss whether the set of characters is closed under an operation, instead you ask if some set of words is closed under an operation.
In the example you give, {a,b} is the symbol set; the set S of all words over that symbol set is closed under concatenation, since concatenating two words from S results in a word still in S.
Related
I have a table of regular expressions that are in an MySQL table that I match text against.
Is there a way, using MySQL or any other language (preferably Perl) that I can take this list of expressions and determine which of them MAY overlap. This should be independent of whatever text may be supplied to the expressions.
All of the expression have anchors.
Here is an example of what I am trying to get:
Expressions:
^a$
^b$
^ab
^b.*c
^batch
^catch
Result:
'^b.*c' and '^batch' MAY overlap
Thoughts?
Thanks,
Scott
Further explanation:
I have a list of user-created regexes and an imported list of strings that are to be matched against the regexes. In this case the strings are "clean" data (ie they are not user-created but imported from another source - they must not change).
When a user adds to the list of regexes I do not want any collisions on either the existing list of strings nor any future strings (which can not be guessed ahead of time - the only constraints being they are ASCII printable characters no longer than 255 characters).
A brute-force method would be to create a "rainbow" table of all of the permutations of strings and each time a regex is added run all of the regexes against the rainbow table. However I'd like to avoid this (I'm not even sure of the cost) and so was wondering aloud as to the possibility of an algorithm that would AT LEAST show which regexes in a list MAY collide.
I will punt on full REs. Even limiting to BREs and/or MySQL-pre-8.0 will be challenging. Here are some thoughts.
If end-anchored and no + or *, the calculate the length. The fixed-length can be used as a discriminator. Also, it could be used for toning back the "brute force" by perhaps an order of magnitude.
Anything followed by + or * gets turned into .* for simplicity. (Re the "may collide" rule.)
Any RE with explicit characters (including those followed by +) becomes a discriminator in some situations. Eg, ^a.*b$ vs ^a.*c$.
For those anchored at the end, reverse the pattern and test it that way. (I don't know how difficult reversing is.)
If you can say that a particular character must be at any position, then use it as a discriminator: ^a.b.*c$ -- a in pos 1; b in pos 3; c at end. Perhaps this can be extended to character classes: ^\w may match, but ^\d and ^a.*\d$ can't.
I'm trying to create a "scrabble-solver" to run stress-tests on a scrabble-like game I'm developing. I have a database containing ~200.000 words and I'm now looking for a way to match the scrabble tiles given with the words in the database.
Example:
Given tiles: A, P, E, F, O, L, M
Result: APE, POLE, PALE, MOLE, PAL...
Is this possible by using a simple SELECT-statement with REGEXP? If possible I would also like to add letters on specific positions and be able to determine max/min length.
I hope this question made sense :)
I've been googling my eyes out but I can't seem to find what I'm looking for. Anyone got an idea?
Thanks! :)
It doesn't sound like a regex problem. I think you'll be better off simply creating all possible combinations of letters from the existing tiles and then running your SELECT statement with the IN clause. For example, with tiles:
A, P, E
your SELECT clause will be
SELECT word FROM words WHERE word IN ('APE', 'AEP', 'PAE' ,'PEA', 'EPA', 'EAP');
You'll get the list of valid words from your table.
A regex would not help you much in this case. You need to construct the possible words by yourself.
The problem is that you have a limited number of each possible letter and a regex cannot encode that information. If you had infinite supply of each letter, then you could use a regex like [APEFOI]*.
You will have to enumerate all the possible words yourself. The implementation would depend on the language your using, but your best bet might be a next_permutation function or better a function that enumerates all permutations. A simple (and slightly inefficient) implementation (in Python-like pseudocode) would be:
words = []
for permutation in permutations(letters): # enumerate all character orders
for i in range(1, len(permutation)): # enumerate all lengths of words
words.append(letters[:i]) # append to candidate set
At that point words will contain all the candidate words you would then use in a SELECT ... IN statement.
That isn't the most efficient approach, but should be practical enough to get you started.
Can you suggest a precise definition for a 'value' within the context of programming without reference to specific encoding techniques or particular languages or architectures?
[Previous question text, for discussion reference: "What is value in programming? How to define this word precisely?"]
I just happened to be glancing through Pierce's "Types and Programming Languages" - he slips a reasonably precise definition of "value" in a programming context into the text:
[...] defines a subset of terms, called values, that are possible final results of evaluation
This seems like a rather tidy definition - i.e., we take the set of all possible terms, and the ones that can possibly be left over after all evaluation has taken place are values.
Based on the ongoing comments about "bits" being an unacceptable definition, I think this one is a little better (although possibly still flawed):
A value is anything representable on a piece of possibly-infinite Turing machine tape.
Edit: I'm refining this some more.
A value is a member of the set of possible interpretations of any possibly-infinite sequence of symbols.
That is equivalent to the earlier definition based on Turing machine tape, but it actually generalises better.
Here, I'll take a shot: A value is a piece of stored information (in the information-theoretical sense) that can be manipulated by the computer.
(I won't say that a value has meaning; a random number in a register may have no meaning, but it's still a value.)
In short, a value is some assigned meaning to a variable (the object containing the value)
For example type=boolean; name=help; variable=a storage location; value=what is stored in that location;
Further break down:
X = 2; where X is a variable while 2 is the value stored in X.
Have you checked the article in wikipedia?
In computer science, a value is a sequence of bits that is interpreted according to some data type. It is possible for the same sequence of bits to have different values, depending on the type used to interpret its meaning. For instance, the value could be an integer or floating point value, or a string.
Read the Wiki
Value = Value is what we call the "contents" that was stored in the variable
Variables = containers for storing data values
Example: Think of a folder named "Movies"(Variables) and inside of it are it contents which are namely; Pirates of the Carribean, Fantastic Beast, and Lala land, (this in turn is what we now call it's Values )
I have a homework question regarding operator precedence in Fortran. In order to understand the question I need to know how to use binary numbers in Fortran. Can someone give me an example of how to use binary numbers in fortran? (Specifically with subtraction).
You need to be a bit clearer about what you mean by 'binary numbers in fortran'. In one sense, not terribly useful, all Fortran numbers are binary, as indeed most numbers in most programming languages are binary once they get onto the computer.
Fortran, in the standard at least, does not have the concept of a binary intrinsic data type, it has integers, reals, complex numbers, logicals and characters. Of course, your compiler might implement other types as well, but you don't tell us what that compiler is.
Standard Fortran does have the concept of binary input and output formats -- look for the 'B edit descriptor' in your documentation. This can be used on input and output to read and write binary representations of integers. But the numbers are, to Fortran, integers. So, if you were to read a, b as binary numbers, you would subtract them with the statement a-b.
Fortran does have a set of bit intrinsic procedures, which go by the names iand, ibclr, ieor and so forth but these are really for bit-twiddling.
If you can clarify your questions, I, or some other SOer, might be able to clarify an answer.
Finally, I think it's rather odd that you think you need to know about Fortran 'binary' numbers in order to understand operator precedence. Perhaps you could explain a bit more.
Is it possible to construct a turing-complete language in which every string is a correct program?
Any examples? Even better, any real-world examples?
Precisions: by "correct" I mean "compiles", although "runs without error" and "runs without error, and finishes in finite time" would be interesting questions too :)
By string I mean any sequence of bytes, although a restriction to a set of characters will do.
Yes (assuming by correct you mean compiles, not does something useful). Take brainfuck and map multiple letters to the eight commands.
Edit... oh and redefine an unmatched [ or ] to print "meh. nitpickers" to the screen.
One PhD please ;)
This is a compiler for a C-like language expressed in BNF as
<program> ::= <character> | <character> <program>
#!/bin/bash
# full_language.sh
gcc "$1"
if [ $? != 0 ]
then
echo -e "#!/bin/bash\necho 'hi'" > a.out
chmod +x a.out
fi
We can build this up out of any turing-complete language. Take C, for example. If input is a correct C program, than do what it intended to. Otherwise, print "Hello, world!". Or just do nothing.
That makes a turing-complete language where every string is a correct program.
Existence proof: perl.
No, because your definition of 'correct' would leave no room for 'incorrect' since 'correct' would include all computable numbers and non-halting programs. To answer in the affirmative would make the question meaningless as 'correct' loses it's definition.
Combinatory logic is very near to the requirement You pose. Every string (over the {K, S, #} alphabet) can be extended to a program. Thus, althogh Your requirement is not entirely fulfilled, but its straighforward weakening to prefix property is satisfied by combinatory logic.
Although these programs are syntactically correct, but they do not necessarily halt. That is not necessarily a problem: combinatory logic has originally been developed for investigating theoretical questions, not for a practical programming language (although can be used as such). Are non-halting combinatory logic "programs# interesting? Do they have at least a theoretical relevance? Of course some of them do! For example, Omega is a non-halting combinatory logic term, but it is subject of articles, book chapters, it has theroetical interestingness, thus we can say, it is meaningful.
Summary: if we regard combinatory logic over alphabet {K, S, #}, the we can say, every possible strings over this alphabet can be extended (as a prefix) to a syntactically correct combinatory logic program. Some of these won't halt, but even those who don't halt can be theoretically interesting, thus "meaningful" (e.g. Omega).
The answer TokenMacGuy provided is better than mine, becasue it approaches the poblem from a broader view, and also because Jot is inspired combinatory logic, thus TokenMacGuy's answer supercedes mine.
If by "correct" you mean syntactically, then certainly, yes.
http://en.wikipedia.org/wiki/One_instruction_set_computer
http://en.wikipedia.org/wiki/Whitespace_(programming_language)
etc
Turing-complete and "finishes in finite time" are not possible.
Excerpt from wikipedia: http://en.wikipedia.org/wiki/Turing_completeness
"One important result from computability theory is that it is impossible in general to determine whether a program written in a Turing-complete language will continue executing forever or will stop within a finite period of time (see halting problem)."
What you are describing is essentially similar to a mapping from Godel number to original program. Briefly, the idea is that every program should be reducible to a unique integer, and you could use that to draw conclusions about the program, such as with some sort of oracle. One such mapping is the Jot language, which has only two operators, 1 and 0, and the first operator must be a 1.