After reviewing all related entries on Stack Overflow and also after googling a lot, I cannot find the algorithm I need.
Having expressions similar to this one (in postfix notation):
"This is a string" 1 2 * 4.2 ceil mid "is i" ==
I need an algorithm that transforms it into an Expression Tree.
I've found many places that explain how this transformation can be done, but they only use operators (+, -, *, &&, ||, etc.). I cannot find how to do it when functions (with zero or more arguments) are involved, like in the example above.
Just for clarity, the postfix expression above will look as follows when using infix notation:
mid( "This is a string", 1*2, ceil( 4.2 ) ) == "is i"
A general algorithm in pseudo-code or Java or JavaScript or C would be very much appreciated (please keep this in mind: from postfix to expression-tree).
Thanks a million in advance.
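For later readers, here is a minimal sketch of the usual stack-based construction, in Java since the question allows it. It assumes the expression is already split into tokens and that every operator or function name maps to a known, fixed arity; the ARITY table and the Node class below are illustrative assumptions, not something given in the question.

import java.util.*;

class Node {
    final String token;
    final List<Node> children;
    Node(String token, List<Node> children) {
        this.token = token;
        this.children = children;
    }
}

public class PostfixToTree {
    // Hypothetical arity table; in practice this comes from your operator/function catalogue.
    static final Map<String, Integer> ARITY = Map.of(
            "*", 2, "==", 2,    // binary operators
            "ceil", 1,          // one-argument function
            "mid", 3);          // three-argument function

    static Node build(List<String> postfix) {
        Deque<Node> stack = new ArrayDeque<>();
        for (String token : postfix) {
            Integer arity = ARITY.get(token);
            if (arity == null) {
                // Not in the arity table: an operand (literal or variable), i.e. a leaf.
                stack.push(new Node(token, List.of()));
            } else {
                // Operator/function: pop 'arity' subtrees. They come off the stack in
                // reverse, so prepend each one to restore left-to-right argument order.
                LinkedList<Node> args = new LinkedList<>();
                for (int i = 0; i < arity; i++) {
                    args.addFirst(stack.pop());
                }
                stack.push(new Node(token, args));
            }
        }
        return stack.pop();   // exactly one node remains: the root of the expression tree
    }

    public static void main(String[] args) {
        Node root = build(List.of("\"This is a string\"", "1", "2", "*",
                                  "4.2", "ceil", "mid", "\"is i\"", "=="));
        System.out.println(root.token);   // ==
    }
}

Operands push leaf nodes; an operator or function of arity n pops n subtrees and pushes a new node with them as children. A function with zero arguments simply gets arity 0 in the table and produces a childless node. Run on the example, the sketch leaves == at the root, with the mid(...) subtree and the literal "is i" as its two children.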
Most programming languages use // or # for a single-line comment (see wiki). It seems that # is especially used for interpreted languages. According to this question, the reason seems to be that one of the early shells (the Bourne shell) used '#' for comments and later made further use of it (the shebang).
Is there a logical reason for choosing # as a comment sign (e.g., does # symbolize crossing something out)? And why do we use // as a comment sign in many compiled languages (especially in C, as it seems to be one of the earliest compiled languages with that symbol)? Are there logical reasons for that? Why not use # instead of //, or // instead of #?
Is there a logical reason for choosing # as a comment sign [in early shells]?
The Bourne shell tokenizer is quite simple. To add comment line support, a single character identifier was the simplest, and logical, choice.
The set of single characters you can choose from, if you wish to be compatible with both EBCDIC and ASCII (the two major character sets used at that time), is quite small:
! (logical not in bc)
#
% (modulo in bc)
@
^ (power in bc)
~ (used in paths)
Now, I've listed the ones used in bc, the calculator used in the same time period, not because they were a reason, but because you should understand the context of the Bourne shell developers and users. The bc notation did not arrive out of thin air; the prevailing preferences influenced the choice, because the developers wanted the syntax to be intuitive, at least for themselves. The above bc notes are therefore useful in showing what kind of associations contemporary developers had with specific characters. I don't intend to imply that bc necessarily had an impact on the Bourne shell, but I do believe it did: one of the reasons for developing the Bourne shell was to make using and automating tools like bc easier.
Effectively, only # and @ were "unused" characters available in both ASCII and EBCDIC; and it appears "hash" won over "at".
And why do we use // as a comment sign in many compiled languages?
The // comment style is from BCPL. Many of the BCPL tokens and operators were already multiple characters long, and I suspect that at the time the developers considered it better (for interoperability) to reuse an already used character for the comment line token, rather than introduce a completely new character.
I suspect that the // comment style has a historical background in margin notes; a double vertical line used to separate the actual content from notes or explanations is a clear visual separator even to those not familiar with the practice.
Why not use # instead of //, or [vice versa]?
In both of the cases above, there is clear logic. However, that does not mean that these were the only logical choices available. These are just the ones that made the most sense to the developers at the time when the choice was made -- and I've tried to shed some light on the possible reasons, the context for the choices, above.
If these kinds of questions interest you, I recommend you find old math and science (physics in particular) books, and perhaps even reproductions of old notes. The best tools are intuitive, you see; and to find what was intuitive to someone, you need to find out the context they worked in. I am absolutely certain you can find interesting "reasons" -- things that made certain choices logical and intuitive to them, while to us they may seem odd -- by finding out the habits of the early developers and their colleagues and mentors.
Is it possible to construct a Turing-complete language in which every string is a correct program?
Any examples? Even better, any real-world examples?
To be precise: by "correct" I mean "compiles", although "runs without error" and "runs without error, and finishes in finite time" would be interesting questions too :)
By string I mean any sequence of bytes, although a restriction to a set of characters will do.
Yes (assuming by correct you mean compiles, not does something useful). Take brainfuck and map multiple letters to the eight commands.
Edit... oh and redefine an unmatched [ or ] to print "meh. nitpickers" to the screen.
One PhD please ;)
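As a rough sketch of that mapping idea, in Java (the byte-to-command assignment below is arbitrary, and the unmatched-bracket redefinition from the edit is assumed to be layered on top):

public class EveryStringIsBrainfuck {
    static final char[] COMMANDS = {'>', '<', '+', '-', '.', ',', '[', ']'};

    // Every possible byte selects one of Brainfuck's eight commands,
    // so any byte string at all translates to a program.
    static String translate(byte[] input) {
        StringBuilder program = new StringBuilder();
        for (byte b : input) {
            program.append(COMMANDS[(b & 0xFF) % 8]);
        }
        return program.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("Hello".getBytes()));
    }
}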
This is a compiler for a C-like language expressed in BNF as
<program> ::= <character> | <character> <program>
#!/bin/bash
# full_language.sh
gcc "$1"
if [ $? -ne 0 ]
then
    # gcc rejected the input, so emit a fallback a.out: every string "compiles".
    echo -e "#!/bin/bash\necho 'hi'" > a.out
    chmod +x a.out
fi
We can build this up out of any Turing-complete language. Take C, for example. If the input is a correct C program, then do what it was intended to do. Otherwise, print "Hello, world!". Or just do nothing.
That makes a Turing-complete language where every string is a correct program.
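A hedged Java analogue of the same wrapper trick, using the standard javax.tools compiler API; it only checks that the input compiles, which matches the "compiles" reading of "correct", and the class name is made up for illustration:

import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

public class EveryStringCompiles {
    public static void main(String[] args) {
        JavaCompiler javac = ToolProvider.getSystemJavaCompiler();
        // Try to compile the file named on the command line as ordinary Java.
        int status = javac.run(null, null, null, args[0]);
        if (status != 0) {
            // Not valid Java: the "program" still has a well-defined meaning.
            System.out.println("Hello, world!");
        }
    }
}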
Existence proof: perl.
No, because your definition of 'correct' would leave no room for 'incorrect', since 'correct' would include all computable numbers and non-halting programs. To answer in the affirmative would make the question meaningless, as 'correct' loses its definition.
Combinatory logic comes very near to the requirement you pose. Every string (over the {K, S, @} alphabet) can be extended to a program. Thus, although your requirement is not entirely fulfilled, its straightforward weakening to a prefix property is satisfied by combinatory logic.
Although these programs are syntactically correct, they do not necessarily halt. That is not necessarily a problem: combinatory logic was originally developed for investigating theoretical questions, not as a practical programming language (although it can be used as such). Are non-halting combinatory logic "programs" interesting? Do they have at least a theoretical relevance? Of course some of them do! For example, Omega is a non-halting combinatory logic term, but it is the subject of articles and book chapters; it has theoretical interest, thus we can say it is meaningful.
Summary: if we regard combinatory logic over the alphabet {K, S, @}, then we can say that every possible string over this alphabet can be extended (as a prefix) to a syntactically correct combinatory logic program. Some of these won't halt, but even those that don't halt can be theoretically interesting, thus "meaningful" (e.g. Omega).
The answer TokenMacGuy provided is better than mine, because it approaches the problem from a broader view, and also because Jot is inspired by combinatory logic; thus TokenMacGuy's answer supersedes mine.
If by "correct" you mean syntactically, then certainly, yes.
http://en.wikipedia.org/wiki/One_instruction_set_computer
http://en.wikipedia.org/wiki/Whitespace_(programming_language)
etc
Turing-complete and "finishes in finite time" are not possible.
Excerpt from wikipedia: http://en.wikipedia.org/wiki/Turing_completeness
"One important result from computability theory is that it is impossible in general to determine whether a program written in a Turing-complete language will continue executing forever or will stop within a finite period of time (see halting problem)."
What you are describing is essentially similar to a mapping from Gödel number to original program. Briefly, the idea is that every program should be reducible to a unique integer, and you could use that to draw conclusions about the program, such as with some sort of oracle. One such mapping is the Jot language, which has only two operators, 1 and 0, and the first operator must be a 1.
There are common slang names for period, exclamation mark, and asterisk:
. "dot"
! "bang"
* "star"
but what are the teched-out, one-syllable names for characters like
%
&
<
{
[
;
etc.
Here's a few, poetry style:
<> !*''#
^"`$$-
!*=@$_
%*<> ~#4
&[]../
|{,,SYSTEM HALTED
The poem can be appreciated only by reading it aloud:
Waka waka bang splat tick tick hash,
Caret quote back-tick dollar dollar dash,
Bang splat equal at dollar under-score,
Percent splat waka waka tilde number four,
Ampersand bracket bracket dot dot slash,
Vertical-bar curly-bracket comma comma CRASH.
Source: http://babek.info/libertybasicfiles/lbnews/nl123/fun.htm
The Internet knows plenty of these (archive)
Or, check the jargon file.
The INTERCAL reference manual is the definitive guide:
. spot identify 16-bit variable
: two-spot identify 32-bit variable
, tail identify 16-bit array
; hybrid identify 32-bit array
# mesh identify constant
= half-mesh
' spark grouper
` backspark
! wow equivalent to spark-spot
? what unary exclusive OR (ASCII)
" rabbit-ears grouper
". rabbit equivalent to ears-spot
| spike
% double-oh-seven percentage qualifier
- worm used with angles
< angle used with worms
> right angle
( wax precedes line label
) wane follows line label
[ U turn
] U turn back
{ embrace
} bracelet
* splat flags invalid statements
& ampersand unary logical AND
V V unary logical OR
(or book)
V- bookworm unary exclusive OR
(or universal qualifier)
$ big money unary exclusive OR (ASCII)
c| change binary mingle
~ squiggle binary select
_ flat worm
overline indicates "times 1000"
+ intersection separates list items
/ slat
\ backslat
# whirlpool
-' hookworm
^ shark
(or simply sharkfin)
Strive for clarity and don't use slang, but instead use the commonly accepted name of each symbol. Thus,
. dot or period
! exclamation point
* asterisk or star
% percent (sometimes mod)
& ampersand
< less than
{ left brace
[ left bracket
; semicolon
Be sure to find the more British-flavored ones. I like the #! (read "crunch bang"), as in CrunchBang Linux. And for pure weirdness, I enjoy calling "#" the "octothorpe."
Some other fun ones that I've heard (I'd be hard pressed to source these):
= "gets" (when reading C code "int num = 5;" would be "num gets five")
` "quasiquote" (lisp)
#! together are often pronounced "shebang" (consider that a kind of tech-diphthong)
Consider these extras:
! not
# these (Perl)
$ this (Perl)
: otherwise (ternary operator)
; stop (read as a telegram?)
* anything (regex-esque)
& and
| or
In addition to Tyler McHenry's list there is also Jeff's blog post which links to the ASCII entry in The New Hacker's Dictionary.
# = "pound"
as in
#sand
or
#puppy
@ is pronounced "dog" (собака) in Russian (for the visual picture).
Hehe, when I was a kid, I called the "#" character (don't laugh!) "eh". And "$" was called "xsssss".
Why do most computer programming languages not allow binary numbers to be used like decimal or hexadecimal?
In VB.NET you could write a hexadecimal number like &H4
In C you could write a hexadecimal number like 0x04
Why not allow binary numbers?
&B010101
0y1010
Bonus Points!... What languages do allow binary numbers?
Edit
Wow! So the majority think it's because of brevity, and poor old "waves" thinks it's due to the technical aspects of the binary representation.
Because hexadecimal (and rarely octal) literals are more compact and people using them usually can convert between hexadecimal and binary faster than deciphering a binary number.
Python 2.6+ allows binary literals, and so do Ruby and Java 7, where you can use the underscore to make byte boundaries obvious. For example, the hexadecimal value 0x1b2a can now be written as 0b00011011_00101010.
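For instance, a small self-contained check of that equivalence in Java (the class name is just for illustration):

public class BinaryLiteralDemo {
    public static void main(String[] args) {
        int fromHex = 0x1b2a;
        // Java 7+: a binary literal, with the underscore marking the byte boundary.
        int fromBin = 0b00011011_00101010;
        System.out.println(fromHex == fromBin);   // true: both are 6954
    }
}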
In C++0x, binary numbers will be supported via user-defined literals. I'm not sure whether this will make it into the standard, but at worst you'll be able to enable it yourself:
int operator"" _B(const char* digits);   // raw literal operator: receives the digits as the string "1010"
assert( 1010_B == 10 );
In order for a bit representation to be meaningful, you need to know how to interpret it.
You would need to specify what type of binary number you're using (signed/unsigned, two's complement, one's complement, signed magnitude).
The only languages I've ever used that properly support binary numbers are hardware description languages (Verilog, VHDL, and the like). They all have strict (and often confusing) definitions of how numbers entered in binary are treated.
See perldoc perlnumber:
NAME
perlnumber - semantics of numbers and numeric operations in Perl
SYNOPSIS
$n = 1234; # decimal integer
$n = 0b1110011; # binary integer
$n = 01234; # octal integer
$n = 0x1234; # hexadecimal integer
$n = 12.34e-56; # exponential notation
$n = "-12.34e56"; # number specified as a string
$n = "1234"; # number specified as a string
Slightly off-topic, but newer versions of GCC added a C extension that allows binary literals. So if you only ever compile with GCC, you can use them. Documentation is here.
Common Lisp allows binary numbers, using #b... (bits going from highest-to-lowest power of 2). Most of the time, it's at least as convenient to use hexadecimal numbers, though (by using #x...), as it's fairly easy to convert between hexadecimal and binary numbers in your head.
Hex and octal are just shorter ways to write binary. Would you really want a 64-character long constant defined in your code?
Common wisdom holds that long strings of binary digits, e.g. 32 bits for an int, are too difficult for people to conveniently parse and manipulate. Hex is generally considered easier, though I've not used either enough to have developed a preference.
Ruby, as already mentioned, attempts to resolve this by allowing _ to be liberally inserted in the literal, allowing, for example:
irb(main):005:0> 1111_0111_1111_1111_0011_1100
=> 111101111111111100111100
D supports binary literals using the syntax 0[bB][01]+, e.g. 0b1001. It also allows embedded _ characters in numeric literals to allow them to be read more easily.
Java 7 now has support for binary literals. So you can simply write 0b110101. There is not much documentation on this feature. The only reference I could find is here.
While C only has native support for bases 8, 10, and 16, it is actually not that hard to write a preprocessor macro that makes writing 8-bit binary numbers quite simple and readable:
#define BIN(d7,d6,d5,d4, d3,d2,d1,d0) \
( \
((d7)<<7) + ((d6)<<6) + ((d5)<<5) + ((d4)<<4) + \
((d3)<<3) + ((d2)<<2) + ((d1)<<1) + ((d0)<<0) \
)
int my_mask = BIN(1,1,1,0, 0,0,0,0);
This can also be used for C++.
For the record, and to answer this:
Bonus Points!... What languages do allow binary numbers?
Specman (aka e) allows binary numbers. Though to be honest, it's not quite a general purpose language.
Every language should support binary literals. I go nuts not having them!
Bonus Points!... What languages do allow binary numbers?
Icon allows literals in any base from 2 to 16, and possibly up to 36 (my memory grows dim).
It seems that, from a readability and usability standpoint, the hex representation is a better way of defining binary numbers. The fact that they don't add it is probably more a matter of user need than a technology limitation.
I expect that the language designers just didn't see enough of a need to add binary numbers. The average coder can parse hex just as well as binary when handling flags or bit masks. It's great that some languages support binary as a representation, but I think on average it would be little used. Although binary, if available in C, C++, Java, or C#, would probably be used more than octal!
In Smalltalk it's like 2r1010. You can use any base up to 36 or so.
Hex is just less verbose, and can express anything a binary number can.
Ruby has nice support for binary numbers, if you really want it. 0b11011, etc.
In Pop-11 you can use a prefix made of number (2 to 32) + colon to indicate the base, e.g.
2:11111111 = 255
3:11111111 = 3280
16:11111111 = 286331153
31:11111111 = 28429701248
32:11111111 = 35468117025
Forth has always allowed numbers of any base to be used (up to the size limit of the CPU, of course). Want to use binary? 2 BASE ! Octal? 8 BASE ! And so on. Want to work with time? 60 BASE ! These examples all assume the base is currently set to 10 (decimal). To change base, you must express the desired base in the current number base: if you are in binary and want to switch back to decimal, then 1010 BASE ! will work. Most Forth implementations have 'words' to shift to common bases, e.g. DECIMAL, HEX, OCTAL, and BINARY.
Although it's not direct, most languages can also parse a string. Java can convert "10101000" into an int with a method.
Not that this is efficient or anything... Just saying it's there. If it were done in a static initialization block, it might even be done at compile time depending on the compiler.
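For example, using the standard Integer.parseInt(String, int radix) (the wrapper class is just for illustration):

public class ParseBinaryString {
    public static void main(String[] args) {
        // Radix 2 tells parseInt to read the digits as binary.
        int value = Integer.parseInt("10101000", 2);
        System.out.println(value);   // 168
    }
}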
If you're any good at binary, even with a short number it's pretty straightforward to see 0x3c as 4 ones followed by 2 zeros, whereas even that short a number in binary would be 0b111100, which might make your eyes hurt before you were certain of the number of ones.
0xff9f is exactly 4+4+1 ones, 2 zeros and 5 ones (on sight the bitmask is obvious). Trying to count out 0b1111111110011111 is much more irritating.
I think the issue may be that language designers are always heavily invested in hex/octal/binary/whatever and just think this way. If you are less experienced, I can totally see how these conversions wouldn't be as obvious.
Hey, that reminds me of something I came up with while thinking about base conversions. A sequence--I didn't think anyone could figure out the "Next Number", but one guy actually did, so it is solvable. Give it a try:
10
11
12
13
14
15
16
21
23
31
111
?
Edit:
By the way, this sequence can be created by feeding sequential numbers into a single built-in function in most languages (Java for sure).
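If you want to experiment, here is a sketch of the kind of one-liner that hint seems to point at, built on Java's Integer.toString(int, radix); the choice of n, and therefore the answer to the puzzle, is deliberately left out:

public class SequencePuzzle {
    public static void main(String[] args) {
        int n = Integer.parseInt(args[0]);   // pick n yourself; the right choice reproduces the list above
        // Print n in every base from n down to 2.
        for (int base = n; base >= 2; base--) {
            System.out.println(Integer.toString(n, base));
        }
    }
}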