Accurate JSON text encoding detection - json

In RFC4627 a method for identifying the Unicode encoding when a BOM is not present was described. This relied on the first 2 characters in the JSON text to always be ASCII characters. But in RFC7159, the specification defines JSON text as "ws value ws"; implying that a single string value would also be valid. So the first character will be the opening quote, but then any Unicode character allowed in a string could follow. Considering that RFC7159 also discourages the use of a BOM; and no longer describes the procedure for detecting the encoding from the first 4 octets (bytes), how should one detect it? UTF-32 should still work correctly as described in RFC4627, because the first character is four bytes and should still be ASCII, but what about UTF-16? The second (2-byte) character might not contain a zero-byte to help identify the correct encoding.

Update
This answer has been written for JSON according RFC 7159 whose character encoding was allowed to use all unicode schemes (UTF-8, UTF-16, or UTF-32), where UTF-8 was the "default".
The current RFC 8259 restricts the character encoding to be only UTF-8, which is a reasonable constraint.
After taking a look at an implementation which I made a couple years ago, I can tell that it's possible to unambiguously detect the given Unicode Scheme from just one character, given the following assumptions:
The input must be Unicode
The first character must be ASCII
There must be no BOM
Consider this:
Assuming the first character is a "[" (0x5B) - an ASCII.
Then, we may get these byte patterns:
UTF_32LE: 5B 00 00 00
UTF_32BE: 00 00 00 5B
UTF_16LE: 5B 00 xx xx
UTF_16BE: 00 5B xx xx
UTF_8: 5B xx xx xx
where "xx" is either EOF or any other byte.
We should also note, that according RFC7159, the shortest valid JSON can be just one character. That is, it can be possibly 1, 2 or 4 byte - depending of the Unicode Scheme.
So, if it helps, here is an implementation in C++:
namespace json {
//
// Detect Encoding
//
// Tries to determine the Unicode encoding of the input starting at
// first. A BOM shall not be present (you might check with function
// json::unicode::detect_bom() whether there is a BOM, in which case
// you don't need to call this function when a BOM is present).
//
// Return values:
//
// json::unicode::UNICODE_ENCODING_UTF_8
// json::unicode::UNICODE_ENCODING_UTF_16LE
// json::unicode::UNICODE_ENCODING_UTF_16BE
// json::unicode::UNICODE_ENCODING_UTF_32LE
// json::unicode::UNICODE_ENCODING_UTF_32BE
//
// -1: unexpected EOF
// -2: unknown encoding
//
// Note:
// detect_encoding() requires to read ahead a few bytes in order to deter-
// mine the encoding. In case of InputIterators, this has the consequences
// that these iterators cannot be reused, for example for a parser.
// Usually, this requires to reset the istreambuff, that is using the
// member functions pubseekpos() or pupseekoff() in order to reset the get
// pointer of the stream buffer to its initial position.
// However, certain istreambuf implementations may not be able to set the
// stream pos at arbitrary positions. In this case, this method cannot be
// used and other edjucated guesses to determine the encoding may be
// needed.
template <typename Iterator>
inline int
detect_encoding(Iterator first, Iterator last)
{
// Assuming the input is Unicode!
// Assuming first character is ASCII!
// The first character must be an ASCII character, say a "[" (0x5B)
// UTF_32LE: 5B 00 00 00
// UTF_32BE: 00 00 00 5B
// UTF_16LE: 5B 00 xx xx
// UTF_16BE: 00 5B xx xx
// UTF_8: 5B xx xx xx
uint32_t c = 0xFFFFFF00;
while (first != last) {
uint32_t ascii;
if (static_cast<uint8_t>(*first) == 0)
ascii = 0; // zero byte
else if (static_cast<uint8_t>(*first) < 0x80)
ascii = 0x01; // ascii byte
else if (*first == EOF)
break;
else
ascii = 0x02; // non-ascii byte, that is a lead or trail byte
c = c << 8 | ascii;
switch (c) {
// reading first byte
case 0xFFFF0000: // first byte was 0
case 0xFFFF0001: // first byte was ASCII
++first;
continue;
case 0xFFFF0002:
return -2; // this is bogus
// reading second byte
case 0xFF000000: // 00 00
++first;
continue;
case 0xFF000001: // 00 01
return json::unicode::UNICODE_ENCODING_UTF_16BE;
case 0xFF000100: // 01 00
++first;
continue;
case 0xFF000101: // 01 01
return json::unicode::UNICODE_ENCODING_UTF_8;
// reading third byte:
case 0x00000000: // 00 00 00
case 0x00010000: // 01 00 00
++first;
continue;
//case 0x00000001: // 00 00 01 bogus
//case 0x00000100: // 00 01 00 na
//case 0x00000101: // 00 01 01 na
case 0x00010001: // 01 00 01
return json::unicode::UNICODE_ENCODING_UTF_16LE;
// reading fourth byte
case 0x01000000:
return json::unicode::UNICODE_ENCODING_UTF_32LE;
case 0x00000001:
return json::unicode::UNICODE_ENCODING_UTF_32BE;
default:
return -2; // could not determine encoding, that is,
// assuming the first byte is an ASCII.
} // switch
} // while
// premature EOF
return -1;
}
}

Related

Get the IBAN number from an emv card

i have some issues with reading the IBAN number from the german CashCards (also known as Geldkarte).
I can communicate with my card, and i get some informations from it. but i don`t know which commandApdu i must send to the card, to get the IBAN number...
The application runs on Java 7 and i use the java.smartcardio api
Protocoll is T=1
my commandApdu to get the date looks like this:
byte[] commandBytes = new byte[]{0x00, (byte)0xa4, 0x04, 0x00, 0x07, (byte)0xa0, 0x00, 0x00, 0x00,0x04, 0x30, 0x60, 0x00};
the information i get is:
6F 32 84 07 A0 00 00 00 04 30 60 A5 27 50 07 4D 61 65 73 74 72 6F 87 01 03 9F 38 09 9F 33 02 9F 35 01 9F 40 01 5F 2D 04 64 65 65 6E BF 0C 05 9F 4D 02 19 0A
Can anyone tell me the correct apdu for getting the IBAN number?
i am sorry if i forgott some information needed, but this is my first question in this board :-)
Okay so the card has sent back this:
6F328407A0000000043060A52750074D61657374726F8701039F38099F33029F35019F40015F2D046465656EBF0C059F4D02190A
Which translates to:
6F File Control Information (FCI) Template
84 Dedicated File (DF) Name
A0000000043060
A5 File Control Information (FCI) Proprietary Template
50 Application Label
M a e s t r o
87 Application Priority Indicator
03
9F38 Processing Options Data Object List (PDOL)
9F33029F35019F4001
5F2D Language Preference
d e e n
BF0C File Control Information (FCI) Issuer Discretionary Data
9F4D Log Entry
190A
So now you've selected the application you'll want to send a series of 'Read Record' commands to it to get the data out of it like (Card number, expiry date, card holder name, IBAN (if it's in there, haven't seen it before)). The structure of the 'Read Record' command can be found in EMV Book 3 however here's some rough psuedocode as to what your Read Record loop should look like. Off the top of my head I usually set NUM_SFIS to 5 and NUM_RECORDS to 16 as there's not usually anything past these points.
for (int sfiNum = 1; sfiNum <= NUM_SFIS; sfiNum++)
{
for (int rec = 1; rec <= NUM_RECORDS; rec++)
{
byte[] response = tag.transceive(new byte[]{0x00,(byte)0xB2 (byte)rec, (byte)((byte)(sfiNum << 3) | 4), 0x00});
}
}
i solved my problem after a long time this way:
At first send a command to the card, to select the aid (application identifier):
private static byte[] aidWithPossibleIban = new byte[] { 0x00, (byte) 0xa4,
0x04, 0x00, 0x09, (byte) 0xa0, 0x00, 0x00, 0x00, 0x59, 0x45, 0x43,
0x01, 0x00, 0x00 };
then i hat to raise the security-level:
private static byte[] cmdRaiseSecurityLevel = new byte[] { 0x00, 0x22,
(byte) 0xf3, 0x02 };
last thing to do was to read the record:
private static byte[] readSelectedRecord = new byte[] { 0x00, (byte) 0xb2,
0x01, (byte) 0xa4, 0x00 };
regards
Andreas
I would like to add, the IBAN returning from the card is not straightforward.
The IBAN returned is that of the main Bank, and then the account number from the card holder in other record. Therefore one must come out with the right IBAN through code as the check digit has to be calculated as seen here
Since in records we find Country Code (DE), Bankleitzahl BLZ (8 Digits) and Account Number (10 digits), Check Digit can be calculated through
public string ReturnIBAN(string lkz, string blz, string kntnr, bool groupedReturn = true)
{
string bban = string.Empty;
lkz = lkz.ToUpper();
switch (lkz)
{
case "AT":
{
bban = blz.PadLeft(5, '0') + kntnr.PadLeft(11, '0');
}
break;
case "DE":
{
bban = blz.PadLeft(8, '0') + kntnr.PadLeft(10, '0');
}
break;
case "CH":
{
bban = blz.PadLeft(5, '0') + kntnr.PadLeft(12, '0');
}
break;
}
string sum = bban + lkz.Aggregate("", (current, c) => current + (c - 55).ToString()) + "00";
var d = decimal.Parse(sum);
var checksum = 98 - (d % 97);
string iban = lkz + checksum.ToString().PadLeft(2, '0') + bban;
return groupedReturn ? iban.Select((c, i) => (i % 4 == 3) ? c + " " : c + "").Aggregate("", (current, c) => current + c) : iban;
}
Source (In German): here

Code Golf: Conway's Game of Life

Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
The Challenge: Write the shortest program that implements John H. Conway's Game of Life cellular automaton. [link]
EDIT: After about a week of competition, I have selected a victor: pdehaan, for managing to beat the Matlab solution by one character with perl.
For those who haven't heard of Game of Life, you take a grid (ideally infinite) of square cells. Cells can be alive (filled) or dead (empty). We determine which cells are alive in the next step of time by applying the following rules:
Any live cell with fewer than two live neighbours dies, as if caused by under-population.
Any live cell with more than three live neighbours dies, as if by overcrowding.
Any live cell with two or three live neighbours lives on to the next generation.
Any dead cell with exactly three live neighbours becomes a live cell, as if by reproduction.
Your program will read in a 40x80 character ASCII text file specified as a command-line argument, as well as the number of iterations (N) to perform. Finally, it will output to an ASCII file out.txt the state of the system after N iterations.
Here is an example run with relevant files:
in.txt:
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
..................................XX............................................
..................................X.............................................
.......................................X........................................
................................XXXXXX.X........................................
................................X...............................................
.................................XX.XX...XX.....................................
..................................X.X....X.X....................................
..................................X.X......X....................................
...................................X.......XX...................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
Iterate 100 times:
Q:\>life in.txt 100
Resultant Output (out.txt)
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
..................................XX............................................
..................................X.X...........................................
....................................X...........................................
................................XXXXX.XX........................................
................................X.....X.........................................
.................................XX.XX...XX.....................................
..................................X.X....X.X....................................
..................................X.X......X....................................
...................................X.......XX...................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
................................................................................
The Rules:
You need to use file I/O to read/write the files.
You need to accept an input file and the number of iterations as arguments
You need to generate out.txt (overwrite if it exists) in the specified format
You don't need to deal with the edges of the board (wraparound, infinite grids .etc)
EDIT: You do need to have newlines in your output file.
The winner will be determined by character count.
Good luck!
Mathematica - 179 163 154 151 chars
a = {2, 2, 2};
s = Export["out.txt",
CellularAutomaton[{224, {2, {a, {2, 1, 2}, a}}, {1,1}},
(ReadList[#1, Byte, RecordLists → 2>1] - 46)/ 42, #2]〚#2〛
/. {0 → ".", 1 → "X"}, "Table"] &
Spaces added for readability
Invoke with
s["c:\life.txt", 100]
Animation:
You can also get a graph of the mean population over time:
A nice pattern for generating gliders from Wikipedia
AFAIK Mathematica uses a Cellular Automaton to generate random numbers using Rule 30.
MATLAB 7.8.0 (R2009a) - 174 171 161 150 138 131 128 124 characters
Function syntax: (124 characters)
Here's the easier-to-read version (with unnecessary newlines and whitespace added for better formatting):
function l(f,N),
b=char(importdata(f))>46;
for c=1:N,
b=~fix(filter2(ones(3),b)-b/2-3);
end;
dlmwrite('out.txt',char(b*42+46),'')
And here's how the program is run from the MATLAB Command Window:
l('in.txt',100)
Command syntax: (130 characters)
After a comment about calling functions with a command syntax, I dug a little deeper and found out that MATLAB functions can in fact be invoked with a command-line format (with some restrictions). You learn something new every day!
function l(f,N),
b=char(importdata(f))>46;
for c=1:eval(N),
b=~fix(filter2(ones(3),b)-b/2-3);
end;
dlmwrite('out.txt',char(b*42+46),'')
And here's how the program is run from the MATLAB Command Window:
l in.txt 100
Additional Challenge: Tweetable GIF maker - 136 characters
I thought for fun I'd see if I could dump the output to a GIF file instead of a text file, while still keeping the character count below 140 (i.e. "tweetable"). Here's the nicely-formatted code:
function l(f,N),
b=char(importdata(f))>46;
k=ones(3);
for c=1:N+1,
a(:,:,:,c)=kron(b,k);
b=~fix(filter2(k,b)-b/2-3);
end;
imwrite(~a,'out.gif')
Although IMWRITE is supposed to create a GIF that loops infinitely by default, my GIF is only looping once. Perhaps this is a bug that has been fixed in newer versions of MATLAB. So, to make the animation last longer and make the evolution steps easier to see, I left the frame delay at the default value (which seems to be around half a second). Here's the GIF output using the Gosper Glider Gun pattern:
Improvements
Update 1: Changed the matrix b from a logical (i.e. "boolean") type to a numerical one to get rid of a few conversions.
Update 2: Shortened the code for loading the file and used the function MAGIC as a trick to create the convolution kernel in fewer characters.
Update 3: Simplified the indexing logic, replaced ~~b+0 with b/42, and replaced 'same' with 's' as an argument to CONV2 (and it surprisingly still worked!).
Update 4: I guess I should have searched online first, since Loren from The MathWorks blogged about golfing and the Game of Life earlier this year. I incorporated some of the techniques discussed there, which required me to change b back to a logical matrix.
Update 5: A comment from Aslak Grinsted on the above mentioned blog post suggests an even shorter algorithm for both the logic and performing the convolution (using the function FILTER2), so I "incorporated" (read "copied") his suggestions. ;)
Update 6: Trimmed two characters from the initialization of b and reworked the logic in the loop to save 1 additional character.
Update 7: Eric Sampson pointed out in an e-mail that I could replace cell2mat with char, saving 4 characters. Thanks Eric!
Ruby 1.9 - 189 178 159 155 153 chars
f,n=$*
c=IO.read f
n.to_i.times{i=0;c=c.chars.map{|v|i+=1
v<?.?v:('...X'+v)[[83,2,-79].map{|j|c[i-j,3]}.to_s.count ?X]||?.}*''}
File.new('out.txt',?w)<<c
Edit:
Handles newlines with 4 chars less.
Can remove 7 more (v<?.?v:) if you allow it to clobber newlines when the live cells reach the edges.
perl, 127 129 135 chars
Managed to strip off a couple more characters...
$/=pop;#b=split'',<>;map{$n=-1;#b=map{++$n;/
/?$_:($t=grep/X/,#b[map{$n+$_,$n-$_}1,80..82])==3|$t+/X/==3?X:'.'}#b}1..$/;print#b
Python - 282 chars
might as well get the ball rolling...
import sys
_,I,N=sys.argv;R=range(3e3);B=open(I).read();B=set(k for k in R if'A'<B[k])
for k in R*int(N):
if k<1:b,B=B,set()
c=sum(len(set((k+o,k-o))&b)for o in(1,80,81,82))
if(c==3)+(c==2)*(k in b):B.add(k)
open('out.txt','w').write(''.join('.X\n'[(k in B)-(k%81<1)]for k in R))
Python 2.x - 210/234 characters
Okay, the 210-character code is kind of cheating.
#coding:l1
exec'xÚ=ŽA\nÂ#E÷sŠº1­ƒÆscS‰ØL™Æª··­âî¿GÈÿÜ´1iÖ½;Sçu.~H®J×Þ-‰­Ñ%ª.wê,šÖ§J®d꘲>cÉZË¢V䀻Eîa¿,vKAËÀå̃<»Gce‚ÿ‡ábUt¹)G%£êŠ…óbÒüíÚ¯GÔ/n×Xši&ć:})äðtÏÄJÎòDˆÐÿG¶'.decode('zip')
You probably won't be able to copy and paste this code and get it to work. It's supposed to be Latin-1 (ISO-8859-1), but I think it got perverted into Windows-1252 somewhere along the way. Additionally, your browser may swallow some of the non-ASCII characters.
So if it doesn't work, you can generate the file from plain-old 7-bit characters:
s = """
23 63 6F 64 69 6E 67 3A 6C 31 0A 65 78 65 63 27 78 DA 3D 8E 41 5C 6E C2
40 0C 45 F7 73 8A BA 31 13 AD 83 15 11 11 C6 73 08 63 17 05 53 89 D8 4C
99 C6 AA B7 B7 AD E2 EE BF 47 C8 FF DC B4 31 69 D6 BD 3B 53 E7 75 2E 7E
48 AE 4A D7 DE 90 8F 2D 89 AD D1 25 AA 2E 77 16 EA 2C 9A D6 A7 4A AE 64
EA 98 B2 3E 63 C9 5A CB A2 56 10 0F E4 03 80 BB 45 16 0B EE 04 61 BF 2C
76 0B 4B 41 CB C0 E5 CC 83 03 3C 1E BB 47 63 65 82 FF 87 E1 62 55 1C 74
B9 29 47 25 A3 EA 03 0F 8A 07 85 F3 62 D2 FC ED DA AF 11 47 D4 2F 6E D7
58 9A 69 26 C4 87 3A 7D 29 E4 F0 04 74 CF C4 4A 16 CE F2 1B 44 88 1F D0
FF 47 B6 27 2E 64 65 63 6F 64 65 28 27 7A 69 70 27 29
"""
with open('life.py', 'wb') as f:
f.write(''.join(chr(int(i, 16)) for i in s.split()))
The result of this is a valid 210-character Python source file. All I've done here is used zip compression on the original Python source code. The real cheat is that I'm using non-ASCII characters in the resultant string. It's still valid code, it's just cumbersome.
The noncompressed version weighs in at 234 characters, which is still respectable, I think.
import sys
f,f,n=sys.argv
e=open(f).readlines()
p=range
for v in p(int(n)):e=[''.join('.X'[8+16*(e[t][i]!='.')>>sum(n!='.'for v in e[t-1:t+2]for n in v[i-1:i+2])&1]for i in p(80))for t in p(40)]
open('out.txt','w').write('\n'.join(e))
Sorry about the horizontal scroll, but all newlines in the above are required, and I've counted them as one character each.
I wouldn't try to read the golfed code. The variable names are chosen randomly to achieve the best compression. Yes, I'm serious. A better-formatted and commented version follows:
# get command-line arguments: infile and count
import sys
ignored, infile, count = sys.argv
# read the input into a list (each input line is a string in the list)
data = open(infile).readlines()
# loop the number of times requested on the command line
for loop in range(int(count)):
# this monstrosity applies the rules for each iteration, replacing
# the cell data with the next generation
data = [''.join(
# choose the next generation's cell from '.' for
# dead, or 'X' for alive
'.X'[
# here, we build a simple bitmask that implements
# the generational rules. A bit from this integer
# will be chosen by the count of live cells in
# the 3x3 grid surrounding the current cell.
#
# if the current cell is dead, this bitmask will
# be 8 (0b0000001000). Since only bit 3 is set,
# the next-generation cell will only be alive if
# there are exactly 3 living neighbors in this
# generation.
#
# if the current cell is alive, the bitmask will
# be 24 (8 + 16, 0b0000011000). Since both bits
# 3 and 4 are set, this cell will survive if there
# are either 3 or 4 living cells in its neighborhood,
# including itself
8 + 16 * (data[y][x] != '.')
# shift the relevant bit into position
>>
# by the count of living cells in the 3x3 grid
sum(character != '.' # booleans will convert to 0 or 1
for row in data[y - 1 : y + 2]
for character in row[x - 1 : x + 2]
)
# select the relevant bit
& 1
]
# for each column and row
for x in range(80)
)
for y in range(40)
]
# write the results out
open('out.txt','w').write('\n'.join(data))
Sorry, Pythonistas, for the C-ish bracket formatting, but I was trying to make it clear what each bracket was closing.
Haskell - 284 272 232 chars
import System
main=do f:n:_<-getArgs;s<-readFile f;writeFile"out.txt"$t s$read n
p '\n'_='\n'
p 'X'2='X'
p _ 3='X'
p _ _='.'
t r 0=r
t r n=t[p(r!!m)$sum[1|d<-1:[80..82],s<-[1,-1],-m<=d*s,m+d*s<3240,'X'==r!!(m+d*s)]|m<-[0..3239]]$n-1
F#, 496
I could reduce this a lot, but I like this as it's still in the ballpark and pretty readable.
open System.IO
let mutable a:_[,]=null
let N y x=
[-1,-1;-1,0;-1,1;0,-1;0,1;1,-1;1,0;1,1]
|>Seq.sumBy(fun(i,j)->try if a.[y+i,x+j]='X' then 1 else 0 with _->0)
[<EntryPoint>]
let M(r)=
let b=File.ReadAllLines(r.[0])
a<-Array2D.init 40 80(fun y x->b.[y].[x])
for i=1 to int r.[1] do
a<-Array2D.init 40 80(fun y x->
match N y x with|3->'X'|2 when a.[y,x]='X'->'X'|_->'.')
File.WriteAllLines("out.txt",Array.init 40(fun y->
System.String(Array.init 80(fun x->a.[y,x]))))
0
EDIT
428
By request, here's my next stab:
open System
let mutable a,k=null,Array2D.init 40 80
[<EntryPoint>]
let M r=
a<-k(fun y x->IO.File.ReadAllLines(r.[0]).[y].[x])
for i=1 to int r.[1] do a<-k(fun y x->match Seq.sumBy(fun(i,j)->try if a.[y+i,x+j]='X'then 1 else 0 with _->0)[-1,-1;-1,0;-1,1;0,-1;0,1;1,-1;1,0;1,1]with|3->'X'|2 when a.[y,x]='X'->'X'|_->'.')
IO.File.WriteAllLines("out.txt",Array.init 40(fun y->String(Array.init 80(fun x->a.[y,x]))))
0
That's a 14% reduction with some basic golfing. I can't help but feel that I'm losing by using a 2D-array/array-of-strings rather than a 1D array, but don't feel like doing that transform now. Note how I elegantly read the file 3200 times to initialize my array :)
Ruby 1.8: 178 175 chars
f,n=$*;b=IO.read f
n.to_i.times{s=b.dup
s.size.times{|i|t=([82,1,-80].map{|o|b[i-o,3]||''}*'').count 'X'
s[i]=t==3||b[i]-t==?T??X:?.if s[i]>13};b=s}
File.new('out.txt','w')<<b
Newlines are significant (although all can be replaced w/ semicolons.)
Edit: fixed the newline issue, and trimmed 3 chars.
Java, 441... 346
Update 1 Removed inner if and more ugliness
Update 2 Fixed a bug and gained a character
Update 3 Using lots more memory and arrays while ignoring some boundaries issues. Probably a few chars could be saved.
Update 4 Saved a few chars. Thanks to BalusC.
Update 5 A few minor changes to go below 400 and make it just that extra bit uglier.
Update 6 Now things are so hardcoded may as well read in the exact amount in one go. Plus a few more savings.
Update 7 Chain the writing to the file to save a char. Plus a few odd bits.
Just playing around with BalusC's solution. Limited reputation means I couldnt add anything as a comment to his.
class M{public static void main(String[]a)throws Exception{int t=3240,j=t,i=new Integer(a[1])*t+t;char[]b=new char[i+t],p={1,80,81,82};for(new java.io.FileReader(a[0]).read(b,t,t);j<i;){char c=b[j],l=0;for(int n:p)l+=b[j+n]/88+b[j-n]/88;b[j+++t]=c>10?(l==3|l+c==90?88:'.'):c;}new java.io.FileWriter("out.txt").append(new String(b,j,t)).close();}}
More readable(?) version:
class M{
public static void main(String[]a)throws Exception{
int t=3240,j=t,i=new Integer(a[1])*t+t;
char[]b=new char[i+t],p={1,80,81,82};
for(new java.io.FileReader(a[0]).read(b,t,t);j<i;){
char c=b[j],l=0;
for(int n:p)l+=b[j+n]/88+b[j-n]/88;
b[j+++t]=c>10?(l==3|l+c==90?88:'.'):c;
}
new java.io.FileWriter("out.txt").append(new String(b,j,t)).close();
}
}
Scala - 467 364 339 chars
object G{def main(a:Array[String]){val l=io.Source.fromFile(new java.io.File(a(0)))getLines("\n")map(_.toSeq)toSeq
val f=new java.io.FileWriter("out.txt")
f.write((1 to a(1).toInt).foldLeft(l){(t,_)=>(for(y<-0 to 39)yield(for(x<-0 to 79)yield{if(x%79==0|y%39==0)'.'else{val m=t(y-1)
val p=t(y+1);val s=Seq(m(x-1),m(x),m(x+1),t(y)(x-1),t(y)(x+1),p(x-1),p(x),p(x+1)).count('X'==_)
if(s==3|(s==2&t(y)(x)=='X'))'X'else'.'}})toSeq)toSeq}map(_.mkString)mkString("\n"))
f.close}}
I think there is much room for improvement...
[Edit] Yes, it is:
object G{def main(a:Array[String]){var l=io.Source.fromFile(new java.io.File(a(0))).mkString
val f=new java.io.FileWriter("out.txt")
var i=a(1).toInt
while(i>0){l=l.zipWithIndex.map{case(c,n)=>if(c=='\n')'\n'else{val s=Seq(-83,-82,-81,-1,1,81,82,83).map(_+n).filter(k=>k>=0&k<l.size).count(l(_)=='X')
if(s==3|(s==2&c=='X'))'X'else'.'}}.mkString
i-=1}
f.write(l)
f.close}}
[Edit] And I have the feeling there is still more to squeeze out...
object G{def main(a:Array[String]){val f=new java.io.FileWriter("out.txt")
f.write(((1 to a(1).toInt):\(io.Source.fromFile(new java.io.File(a(0))).mkString)){(_,m)=>m.zipWithIndex.map{case(c,n)=>
val s=Seq(-83,-82,-81,-1,1,81,82,83)count(k=>k+n>=0&k+n<m.size&&m(k+n)=='X')
if(c=='\n')c else if(s==3|s==2&c=='X')'X'else'.'}.mkString})
f.close}}
The following solution uses my own custom domain-specific programming language which I have called NULL:
3499538
In case you are wondering how this works: My language consists of only of one statment per program. The statement represents a StackOverflow thread ID belonging to a code golf thread. My compiler compiles this into a program that looks for the best javascript solution (with the SO API), downloads it and runs it in a web browser.
Runtime could be better for new threads (it may take some time for the first upvoted Javascript answer to appear), but on the upside it requires only very little coding skills.
Javascript/Node.js - 233 236 characters
a=process.argv
f=require('fs')
m=46
t=f.readFileSync(a[2])
while(a[3]--)t=[].map.call(t,function(c,i){for(n=g=0;e=[-82,-81,-80,-1,1,80,81,82][g++];)t[i+e]>m&&n++
return c<m?c:c==m&&n==3||c>m&&n>1&&n<4?88:m})
f.writeFile('out.txt',t)
C - 300
Just wondered how much smaller and uglier my java solution could go in C. Reduces to 300 including the newlines for the preprocessor bits. Leaves freeing the memory to the OS! Could save ~20 by assuming the OS will close and flush the file too.
#include<stdio.h>
#include<stdlib.h>
#define A(N)j[-N]/88+j[N]/88
int main(int l,char**a){
int t=3240,i=atoi(a[2])*t+t;
char*b=malloc(i+t),*j;
FILE*f;
fread(j=b+t,1,t,fopen(a[1],"r"));
for(;j-b-i;j++[t]=*j>10?l==3|l+*j==90?88:46:10)
l=A(1)+A(80)+A(81)+A(82);
fwrite(j,1,t,f=fopen("out.txt","w"));
fclose(f);
}
MUMPS: 314 chars
L(F,N,R=40,C=80)
N (F,N,R,C)
O F:"RS" U F D C F
.F I=1:1:R R L F J=1:1:C S G(0,I,J)=($E(L,J)="X")
F A=0:1:N-1 F I=1:1:R F J=1:1:C D S G(A+1,I,J)=$S(X=2:G(A,I,J),X=3:1,1:0)
.S X=0 F i=-1:1:1 F j=-1:1:1 I i!j S X=X+$G(G(A,I+i,J+j))
S F="OUT.TXT" O F:"WNS" U F D C F
.F I=1:1:R F J=1:1:C W $S(G(N,I,J):"X",1:".") W:J=C !
Q
Java, 556 532 517 496 472 433 428 420 418 381 chars
Update 1: replaced 1st StringBuffer by Appendable and 2nd by char[]. Saved 24 chars.
Update 2: found a shorter way to read file into char[]. Saved 15 chars.
Update 3: replaced one if/else by ?: and merged char[] and int declarations. Saved 21 chars.
Update 4: replaced (int)f.length() and c.length by s. Saved 24 chars.
Update 5: made improvements as per hints of Molehill. Major one was hardcoding the char length so that I could get rid of File. Saved 39 chars.
Update 6: minor refactoring. Saved 6 chars.
Update 7: replaced Integer#valueOf() by new Integer() and refactored for loop. Saved 8 chars.
Update 8: Improved neighbour calculation. Saved 2 chars.
Update 9: Optimized file reading since file length is already hardcoded. Saved 37 chars.
import java.io.*;class L{public static void main(String[]a)throws Exception{int i=new Integer(a[1]),j,l,s=3240;int[]p={-82,-81,-80,-1,1,80,81,82};char[]o,c=new char[s];for(new FileReader(a[0]).read(c);i-->0;c=o)for(o=new char[j=s];j-->0;){l=0;for(int n:p)l+=n+j>-1&n+j<s?c[n+j]/88:0;o[j]=c[j]>13?l==3|l+c[j]==90?88:'.':10;}Writer w=new FileWriter("out.txt");w.write(c);w.close();}}
More readable version:
import java.io.*;
class L{
public static void main(String[]a)throws Exception{
int i=new Integer(a[1]),j,l,s=3240;
int[]p={-82,-81,-80,-1,1,80,81,82};
char[]o,c=new char[s];
for(new FileReader(a[0]).read(c);i-->0;c=o)for(o=new char[j=s];j-->0;){
l=0;for(int n:p)l+=n+j>-1&n+j<s?c[n+j]/88:0;
o[j]=c[j]>10?l==3|l+c[j]==90?88:'.':10;
}
Writer w=new FileWriter("out.txt");w.write(c);w.close();
}
}
Closing after writing is absoletely mandatory, else the file is left empty. It would otherwise have saved another 21 chars.
Further I could also save one more char when I use 46 instead of '.', but both javac and Eclipse jerks with a compilation error Possible loss of precision. Weird stuff.
Note: this expects an input file with \n newlines, not \r\n as Windows by default uses!
PHP - 365 328 322 Characters.
list(,$n,$l) = $_SERVER["argv"];
$f = file( $n );
for($j=0;$j<$l;$j++){
foreach($f as $k=>$v){
$a[$k]="";
for($i=0;$i < strlen( $v );$i++ ){
$t = 0;
for($m=-1;$m<2;$m++){
for($h=-1;$h<2;$h++){
$t+=ord($f[$k + $m][$i + $h]);
}
}
$t-=ord($v[$i]);
$a[$k] .= ( $t == 494 || ($t == 452 && ord($v[$i])==88)) ? "X" : "." ;
}
}
$f = $a;
}
file_put_contents("out.txt", implode("\n", $a ));
I'm sure this can be improved upon but I was curious what it would look like in PHP. Maybe this will inspire someone who has a little more code-golf experience.
Updated use list() instead of $var = $_SERVER["argv"] for both args. Nice one Don
Updated += and -= this one made me /facepalm heh cant believe i missed it
Updated file output to use file_put_contents() another good catch by Don
Updated removed initialization of vars $q and $w they were not being used
R 340 chars
cgc<-function(i="in.txt",x=100){
require(simecol)
z<-file("in.txt", "rb")
y<-matrix(data=NA,nrow=40,ncol=80)
for(i in seq(40)){
for(j in seq(80)){
y[i,j]<-ifelse(readChar(z,1) == "X",1,0)
}
readChar(z,3)
}
close(z)
init(conway) <- y
times(conway)<-1:x
o<-as.data.frame(out(sim(conway))[[100]])
write.table(o, "out.txt", sep="", row.names=FALSE, col.names=FALSE)
}
cgc()
I feel it's slightly cheating to have an add in package that does the actual automata for you, but I'm going with it cos I still had to thrash around with matricies and stuff to read in the file with 'X' instead of 1.
This is my first 'code golf', interesting....
c++ - 492 454 386
my first code golf ;)
#include<fstream>
#define B(i,j)(b[i][j]=='X')
int main(int i,char**v){for(int n=0;n<atoi(v[2]);++n){std::ifstream f(v[1]);v[1]="out.txt";char b[40][83];for(i=0;i<40;++i)f.getline(b[i],83);std::ofstream g("out.txt");g<<b[0]<<'\n';for(i=1;i<39;++i){g<<'.';for(int j=1;j<79;++j){int k=B(i-1,j)+B(i+1,j)+B(i,j-1)+B(i,j+1)+B(i-1,j-1)+B(i+1,j+1)+B(i+1,j-1)+B(i-1,j+1);(B(i,j)&&(k<2||k>3))?g<<'.':(!B(i,j)&&k==3)?g<<'X':g<<b[i][j];}g<<".\n";}g<<b[0]<<'\n';}}
A somewhat revised version, replacing some of the logic with a table lookup+a few other minor tricks:
#include<fstream>
#define B(x,y)(b[i+x][j+y]=='X')
int main(int i,char**v){for(int n=0;n<atoi(v[2]);++n){std::ifstream f(v[1]);*v="out.txt";char b[40][83], O[]="...X.....";for(i=0;i<40;++i)f>>b[i];std::ofstream g(*v);g<<b[0]<<'\n';for(i=1;i<39;++i){g<<'.';for(int j=1;j<79;++j){O[2]=b[i][j];g<<O[B(-1,0)+B(1,0)+B(0,-1)+B(0,1)+B(-1,-1)+B(1,1)+B(1,-1)+B(-1,1)];}g<<".\n";}g<<b[0]<<'\n';}}
Perl – 214 chars
What, no perl entries yet?
$i=pop;#c=<>;#c=map{$r=$_;$u='';for(0..79)
{$K=$_-1;$R=$r-1;$u.=((&N.(&N^"\0\W\0").&N)=~y/X//
|(substr$c[$r],$_,1)eq'X')==3?'X':'.';}$u}keys#c for(1..$i);
sub N{substr$c[$R++],$K,3}open P,'>','out.txt';$,=$/;print P#c
Run with: conway.pl infile #times
Another Java attempt, 361 chars
class L{public static void main(final String[]a)throws Exception{new java.io.RandomAccessFile("out.txt","rw"){{int e=88,p[]={-1,1,-80,80,-81,81,-82,82},s=3240,l=0,i=new Byte(a[1])*s+s,c;char[]b=new char[s];for(new java.io.FileReader(a[0]).read(b);i>0;seek(l=++l%s),i--){c=b[l];for(int n:p)c+=l+n>=0&l+n<s?b[l+n]/e:0;write(c>13?(c==49|(c|1)==91?e:46):10);}}};}}
And a little more readable
class L {
public static void main(final String[]a) throws Exception {
new java.io.RandomAccessFile("out.txt","rw"){{
int e=88, p[]={-1,1,-80,80,-81,81,-82,82},s=3240,l=0,i=new Byte(a[1])*s+s,c;
char[] b = new char[s];
for (new java.io.FileReader(a[0]).read(b);i>0;seek(l=++l%s),i--) {
c=b[l];
for (int n:p)
c+=l+n>=0&l+n<s?b[l+n]/e:0;
write(c>13?(c==49|(c|1)==91?e:46):10);
}
}};
}
}
Very similar to Molehill’s version. I've tried to use a different FileWriter and to count the cell's neighbors without an additional variable.
Unfortunately, RandomAccessFile is a pretty long name and it is required that you pass an file access mode.
RUST - 469 characters
Don't know if I should post this here, (this post is 3 years old) but anyway, my try on this, in rust (0.9):
use std::io::fs::File;fn main(){
let mut c=File::open(&Path::new(std::os::args()[1])).read_to_end();
for _ in range(0,from_str::<int>(std::os::args()[2]).unwrap()){
let mut b=c.clone();for y in range(0,40){for x in range(0,80){let mut s=0;
for z in range(x-1,x+2){for t in range(y-1,y+2){
if z>=0&&t>=0&&z<80&&t<40&&(x !=z||y !=t)&&c[t*81+z]==88u8{s +=1;}}}
b[y*81+x]=if s==3||(s==2&&c[y*81+x]==88u8){88u8} else {46u8};}}c = b;}
File::create(&Path::new("out.txt")).write(c);}
For people interested, here is the code before some agressive golfing:
use std::io::fs::File;
fn main() {
let f = std::os::args()[1];
let mut c = File::open(&Path::new(f)).read_to_end();
let n = from_str::<int>(std::os::args()[2]).unwrap();
for _ in range(0,n)
{
let mut new = c.clone();
for y in range(0,40) {
for x in range(0,80) {
let mut sum = 0;
for xx in range(x-1,x+2){
for yy in range(y-1,y+2) {
if xx >= 0 && yy >= 0 && xx <80 && yy <40 && (x != xx || y != yy) && c[yy*81+xx] == 88u8
{ sum = sum + 1; }
}
}
new[y*81+x] = if sum == 3 || (sum == 2 && c[y*81+x] == 88u8) {88u8} else {46u8};
}
}
c = new;
}
File::create(&Path::new("out.txt")).write(c);
}
ét voilà
you may want to use this html file. no file input, but a textarea that does the job!
there is also some html and initiation and vars. the main routine has only 235 characters.
It's hand-minified JS.
<!DOCTYPE html>
<html><body><textarea id="t" style="width:600px;height:600px;font-family:Courier">
</textarea></body><script type="text/javascript">var o,c,m=new Array(3200),
k=new Array(3200),y,v,l,p;o=document.getElementById("t");for(y=0;y<3200;y++)
{m[y]=Math.random()<0.5;}setInterval(function(){p="";for(y=0;y<3200;y++){c=0;
for(v=-1;v<2;v+=2){c+=m[y-1*v]?1:0;for(l=79;l<82;l++)c+=m[y-l*v]?1:0;}
k[y]=c==3||m[y]&&c==2;}p="";for(y=0;y<3200;y++){p+=(y>0&&y%80==0)?"\n":"";
m[y]=k[y];p+=(m[y]?"O":"-");}o.innerHTML=p;},100);</script></html>
One of the classic patterns
***
..*
.*
My avatar was created using my version of the Game of Life using this pattern and rule(note that it is not 23/3):
#D Thanks to my daughter Natalie
#D Try at cell size of 1
#R 8/1
#P -29 -29
.*********************************************************
*.*******************************************************.*
**.*****************************************************.**
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
****************************.*.****************************
***********************************************************
****************************.*.****************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
***********************************************************
**.*****************************************************.**
*.*******************************************************.*
.*********************************************************
IMHO - as I learned Conway's Game of Life the trick wasn't writing short code, but code that could do complex life forms quickly. Using the classic pattern above and a wrapped world of 594,441 cells the best I could ever do was around 1,000 generations / sec.
Another simple pattern
**********
.
................*
.................**
................**.......**********
And gliders
........................*...........
......................*.*...........
............**......**............**
...........*...*....**............**
**........*.....*...**..............
**........*...*.**....*.*...........
..........*.....*.......*...........
...........*...*....................
............**......................

Code Golf: Build Me an Arc

Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
Challenge
The shortest program by character count that accepts standard input of the form X-Y R, with the following guarantees:
R is a non-negative decimal number less than or equal to 8
X and Y are non-negative angles given in decimal as multiples of 45° (0, 45, 90, 135, etc.)
X is less than Y
Y is not 360 if X is 0
And produces on standard output an ASCII "arc" from the starting angle X to the ending angle Y of radius R, where:
The vertex of the arc is represented by o
Angles of 0 and 180 are represented by -
Angles of 45 and 225 are represented by /
Angles of 90 and 270 are represented by |
Angles of 135 and 315 are represented by \
The polygonal area enclosed by the two lines is filled with a non-whitespace character.
The program is not required to produce meaningful output if given invalid input. Solutions in any language are allowed, except of course a language written specifically for this challenge, or one that makes unfair use of an external utility. Extraneous horizontal and vertical whitespace is allowed in the output provided that the format of the output remains correct.
Happy golfing!
Numerous Examples
Input:
0-45 8
Output:
/
/x
/xx
/xxx
/xxxx
/xxxxx
/xxxxxx
/xxxxxxx
o--------
Input:
0-135 4
Output:
\xxxxxxxx
\xxxxxxx
\xxxxxx
\xxxxx
o----
Input:
180-360 2
Output:
--o--
xxxxx
xxxxx
Input:
45-90 0
Output:
o
Input:
0-315 2
Output:
xxxxx
xxxxx
xxo--
xxx\
xxxx\
Perl, 235 211 225 211 207 196 179 177 175 168 160 156 146 chars
<>=~/-\d+/;for$y(#a=-$'..$'){print+(map$_|$y?!($t=8*($y>0)+atan2(-$y,$_)/atan2 1,1)&-$&/45==8|$t>=$`/45&$t<=-$&/45?qw(- / | \\)[$t%4]:$":o,#a),$/}
Perl using say feature, 161 149 139 chars
$ echo -n '<>=~/-\d+/;for$y(#a=-$'"'"'..$'"'"'){say map$_|$y?!($t=8*($y>0)+atan2(-$y,$_)/atan2 1,1)&-$&/45==8|$t>=$`/45&$t<=-$&/45?qw(- / | \\)[$t%4]:$":o,#a}' | wc -c
139
$ perl -E '<>=~/-\d+/;for$y(#a=-$'"'"'..$'"'"'){say map$_|$y?!($t=8*($y>0)+atan2(-$y,$_)/atan2 1,1)&-$&/45==8|$t>=$`/45&$t<=-$&/45?qw(- / | \\)[$t%4]:$":o,#a}'
Perl without trailing newline, 153 143 chars
<>=~/-\d+/;for$y(#a=-$'..$'){print$/,map$_|$y?!($t=8*($y>0)+atan2(-$y,$_)/atan2 1,1)&-$&/45==8|$t>=$`/45&$t<=-$&/45?qw(- / | \\)[$t%4]:$":o,#a}
Original version commented:
$_=<>;m/(\d+)-(\d+) (\d+)/;$e=$1/45;$f=$2/45; # parse angles and radius, angles are 0-8
for$y(-$3..$3){ # loop for each row and col
for$x(-$3..$3){
$t=atan2(-$y,$x)/atan2 1,1; # angle of this point
$t+=8if($t<0); # normalize negative angles
#w=split//,"-/|\\"x2; # array of ASCII symbols for enclosing lines
$s.=!$x&&!$y?"o":$t==$e||$t==$f?$w[$t]:$t>$e&&$t<$f?"x":$";
# if it's origin -> "o", if it's enclosing line, get symbol from array
# if it's between enclosing angles "x", otherwise space
}
$s.=$/;
}
print$s;
EDIT 1: Inlined sub, relational and equality operators return 0 or 1.
EDIT 2: Added version with comments.
EDIT 3: Fixed enclosing line at 360º. Char count increased significantly.
EDIT 4: Added a shorter version, bending the rules.
EDIT 5: Smarter fix for the 360º enclosing line. Also, use a number as fill. Both things were obvious. Meh, I should sleep more :/
EDIT 6: Removed unneeded m from match operator. Removed some semicolons.
EDIT 7: Smarter regexp. Under 200 chars!
EDIT 8: Lots of small improvements:
Inner for loop -> map (1 char)
symbol array from split string -> qw (3 chars)
inlined symbol array (6 chars, together with the previous improvement 9 chars!)
Logical or -> bitwise or (1 char)
Regexp improvement (1 char)
Use arithmethic for testing negative angles, inspired by Jacob's answer (5 chars)
EDIT 9: A little reordering in the conditional operators saves 2 chars.
EDIT 10: Use barewords for characters.
EDIT 11: Moved print inside of loop, inspired by Lowjacker's answer.
EDIT 12: Added version using say.
EDIT 13: Reuse angles characters for fill character, as Gwell's answer does. Output isn't as nice as Gwell's though, that would require 5 additional chars :) Also, .. operator doen't need parentheses.
EDIT 14: Apply regex directly to <>. Assign range operator to a variable, as per Adrian's suggestion to bta's answer. Add version without the final newline. Updated say version.
EDIT 15: More inlining. map{block}#a -> map expr,#a.
Lua, 259 characters
Slightly abuses the non-whitespace character clause to produce a dazzling display and more importantly save strokes.
m=math i=io.read():gmatch("%d+")a=i()/45 b=i()/45 r=i()for y=r,-r,-1 do for x=-r,r do c=m.atan2(y,x)/m.pi*4 c=c<0 and c+8 or c k=1+m.modf(c+.5)io.write(x==0 and y==0 and'o'or c>=a and c<=b and('-/|\\-/|\\-'):sub(k,k)or c==0 and b==8 and'-'or' ')end print()end
Input: 45-360 4
\\\|||///
\\\|||//
\\\\|//
--\\|/
----o----
--//|\\--
////|\\\\
///|||\\\
///|||\\\
Able to handle odd angles
Input: 15-75 8
|/////
|//////
|//////
|//////
///////
|//////-
////---
//-
o
MATLAB, 188 chars :)
input '';[w x r]=strread(ans,'%d-%d%d');l='-/|\-/|\-';[X Y]=meshgrid(-r:r);T=atan2(-Y,X)/pi*180;T=T+(T<=0)*360;T(T>w&T<x)=-42;T(T==w)=-l(1+w/45);T(T==x)=-l(1+x/45);T(r+1,r+1)=-'o';char(-T)
Commented code:
%%# Get the string variable (enclose in quotes, e.g. '45-315 4')
input ''
%%# Extract angles and length
[w x r]=strread(ans,'%d-%d%d');
%%# Store characters
l='-/|\-/|\-';
%%# Create the grid
[X Y]=meshgrid(-r:r);
%%# Compute the angles in degrees
T=atan2(-Y,X)/pi*180;
%%# Get all the angles
T=T+(T<=0)*360;
%# Negative numbers indicate valid characters
%%# Add the characters
T(T>w&T<x)=-42;
T(T==w)=-l(1+w/45);
T(T==x)=-l(1+x/45);
%%# Add the origin
T(r+1,r+1)=-'o';
%%# Display
char(-T)
Mathematica 100 Chars
Out of competition because graphics are too perfect :)
f[x_-y_ z_]:=Graphics#Table[
{EdgeForm#Red,Disk[{0,0},r,{x °,y °}],{r,z,1,-1}]
SetAttributes[f,HoldAll]
Invoke with
f[30-70 5]
Result
alt text http://a.imageshack.us/img80/4294/angulosgolf.png
alt text http://a.imageshack.us/img59/7892/angulos2.png
Note
The
SetAttributes[f, HoldAll];
is needed because the input
f[a-b c]
is otherwise interpreted as
f[(a-b*c)]
GNU BC, 339 chars
Gnu bc because of read(), else and logical operators.
scale=A
a=read()/45
b=read()/45
c=read()
for(y=c;y>=-c;y--){for(x=-c;x<=c;x++){if(x==0)if(y<0)t=-2else t=2else if(x>0)t=a(y/x)/a(1)else if(y<0)t=a(y/x)/a(1)-4else t=a(y/x)/a(1)+4
if(y<0)t+=8
if(x||y)if(t==a||t==b||t==b-8){scale=0;u=(t%4);scale=A;if(u==0)"-";if(u==1)"/";if(u==2)"|";if(u==3)"\"}else if(t>a&&t<b)"x"else" "else"o"};"
"}
quit
MATLAB 7.8.0 (R2009a) - 168 163 162 characters
Starting from Jacob's answer and inspired by gwell's use of any non-whitespace character to fill the arc, I managed the following solution:
[w x r]=strread(input('','s'),'%d-%d%d');
l='o -/|\-/|\-';
X=meshgrid(-r:r);
T=atan2(-X',X)*180/pi;
T=T+(T<=-~w)*360;
T(T>x|T<w)=-1;
T(r+1,r+1)=-90;
disp(l(fix(3+T/45)))
And some test output:
>> arc
0-135 4
\||||////
\|||///-
\||//--
\|/---
o----
I could reduce it further to 156 characters by removing the call to disp, but this would add an extra ans = preceding the output (which might violate the output formatting rules).
Even still, I feel like there are some ways to reduce this further. ;)
Ruby, 292 276 186 chars
x,y,r=gets.scan(/\d+/).map{|z|z.to_i};s=(-r..r);s.each{|a|s.each{|b|g=Math::atan2(-a,b)/Math::PI*180/1%360;print a|b==0?'o':g==x||g==y%360?'-/|\\'[g/45%4].chr: (x..y)===g ?'*':' '};puts}
Nicer-formatted version:
x, y, r = gets.scan(/\d+/).map{|z| z.to_i}
s = (-r..r)
s.each {|a|
s.each {|b|
g = (((Math::atan2(-a,b) / Math::PI) * 180) / 1) % 360
print ((a | b) == 0) ? 'o' :
(g == x || g == (y % 360)) ? '-/|\\'[(g / 45) % 4].chr :
((x..y) === g) ? '*' : ' '
}
puts
}
I'm sure someone out there who got more sleep than I did can condense this more...
Edit 1: Switched if statements in inner loop to nested ? : operator
Edit 2: Stored range to intermediate variable (thanks Adrian), used stdin instead of CLI params (thanks for the clarification Jon), eliminated array in favor of direct output, fixed bug where an ending angle of 360 wouldn't display a line, removed some un-needed parentheses, used division for rounding instead of .round, used modulo instead of conditional add
Ruby, 168 characters
Requires Ruby 1.9 to work
s,e,r=gets.scan(/\d+/).map &:to_i;s/=45;e/=45;G=-r..r;G.map{|y|G.map{|x|a=Math.atan2(-y,x)/Math::PI*4%8;print x|y!=0?a==s||a==e%8?'-/|\\'[a%4]:a<s||a>e ?' ':8:?o};puts}
Readable version:
start, _end, radius = gets.scan(/\d+/).map &:to_i
start /= 45
_end /= 45
(-radius..radius).each {|y|
(-radius..radius).each {|x|
angle = Math.atan2(-y, x)/Math::PI * 4 % 8
print x|y != 0 ? angle==start || angle==_end%8 ? '-/|\\'[angle%4] : angle<start || angle>_end ? ' ' : 8 : ?o
}
puts
}
Perl - 388 characters
Since it wouldn't be fair to pose a challenge I couldn't solve myself, here's a solution that uses string substitution instead of trigonometric functions, and making heavy use of your friendly neighbourhood Perl's ability to treat barewords as strings. It's necessarily a little long, but perhaps interesting for the sake of uniqueness:
($x,$y,$r)=split/\D/,<>;for(0..$r-1){$t=$r-1-$_;
$a.=L x$_.D.K x$t.C.J x$t.B.I x$_."\n";
$b.=M x$t.F.N x$_.G.O x$_.H.P x$t."\n"}
$_=$a.E x$r.o.A x$r."\n".$b;$x/=45;$y/=45;$S=' ';
sub A{$v=$_[0];$x==$v||$y==$v?$_[1]:$x<$v&&$y>$v?x:$S}
sub B{$x<=$_[0]&&$y>$_[0]?x:$S}
#a=!$x||$y==8?'-':$S;
push#a,map{A$_,'\\'.qw(- / | \\)[$_%4]}1..7;
push#a,!$x?x:$S,map{B$_}1..7;
eval"y/A-P/".(join'',#a)."/";print
All newlines are optional. It's fairly straightforward:
Grab user input.
Build the top ($a) and bottom ($b) parts of the pattern.
Build the complete pattern ($_).
Define a sub A to get the fill character for an angle.
Define a sub B to get the fill character for a region.
Build an array (#a) of substitution characters using A and B.
Perform the substitution and print the results.
The generated format looks like this, for R = 4:
DKKKCJJJB
LDKKCJJBI
LLDKCJBII
LLLDCBIII
EEEEoAAAA
MMMFGHPPP
MMFNGOHPP
MFNNGOOHP
FNNNGOOOH
Where A-H denote angles and I-P denote regions.
(Admittedly, this could probably be golfed further. The operations on #a gave me incorrect output when written as one list, presumably having something to do with how map plays with $_.)
C# - 325 319 chars
using System;class P{static void Main(){var s=Console.ReadLine().Split(' ');
var d=s[0].Split('-');int l=s[1][0]-48,x,y,r,a=int.Parse(d[0]),b=int.Parse(d[1]);
for(y=l;y>=-l;y--)for(x=-l;x<=l;)Console.Write((x==0&&y==0?'o':a<=(r=((int)
(Math.Atan2(y,x)*57.3)+360)%360)&&r<b||r==b%360?
#"-/|\"[r/45%4]:' ')+(x++==l?"\n":""));}}
Newlines not significant.
Sample input/output
45-180 8
\||||||||////////
\\|||||||///////
\\\||||||//////
\\\\|||||/////
\\\\\||||////
\\\\\\|||///
\\\\\\\||//
\\\\\\\\|/
--------o
135-360 5
\
\\
\\\
\\\\
\\\\\
-----o-----
----/|\\\\\
---//||\\\\
--///|||\\\
-////||||\\
/////|||||\
Java - 304 chars
class A{public static void main(String[]a){String[]b=a[0].split("-");int e=new Integer(b[1]),r=new Integer(a[1]),g,x,y=r;for(;y>=-r;y--)for(x=-r;x<=r;)System.out.print((x==0&y==0?'o':new Integer(b[0])<=(g=((int)(Math.atan2(y,x)*57.3)+360)%360)&g<e|g==e%360?"-/|\\".charAt(g/45%4):' ')+(x++<r?"":"\n"));}}
More readable version:
class A{
public static void main(String[]a){
String[]b=a[0].split("-");
int e=new Integer(b[1]),r=new Integer(a[1]),g,x,y=r;
for(;y>=-r;y--)for(x=-r;x<=r;)System.out.print((
x==0&y==0
?'o'
:new Integer(b[0])<=(g=((int)(Math.atan2(y,x)*57.3)+360)%360)&g<e|g==e%360
?"-/|\\".charAt(g/45%4)
:' '
)+(x++<r?"":"\n"));
}
}
C (902 byte)
This doesn't use trigonometric functions (like the original perl version), so it's quite ``bloated''. Anyway, here is my first code-golf submission:
#define V(r) (4*r*r+6*r+3)
#define F for(i=0;i<r;i++)
#define C ;break;case
#define U p-=2*r+2,
#define D p+=2*r+2,
#define R *++p=
#define L *--p=
#define H *p='|';
#define E else if
#define G(a) for(j=0;j<V(r)-1;j++)if(f[j]==i+'0')f[j]=a;
#define O(i) for(i=0;i<2*r+1;i++){
main(int i,char**v){char*p,f[V(8)];
int j,m,e,s,x,y,r;p=*++v;x=atoi(p);while(*p!=45)p++;
char*h="0123";y=atoi(p+1);r=atoi(*++v);
for(p=f+2*r+1;p<f+V(r);p+=2*r+2)*p=10;
*(p-2*r-2)=0;x=x?x/45:x;y/=45;s=0;e=2*r;m=r;p=f;O(i)O(j)
if(j>e)*p=h[0];E(j>m)*p=h[1];E(j>s)*p=h[2];else*p=h[3];p++;}
if(i+1==r){h="7654";m--;e--;}E(i==r){s--;}E(i>r){s--;e++;}
else{s++;e--;}p++;}for(p=f+V(r)/2-1,i=0;i<r;i++)*++p=48;
for(i=0;i<8;i++)if(i>=x&&i<y){G(64);}else G(32);
y=y==8?0:y;q:p=f+V(r)/2-1;*p='o';switch(x){
C 0:F R 45 C 1:F U R 47 C 2:F U H C 3:F U L 92
C 4:F L 45 C 5:F D L 47 C 6:F D H C 7:F D R 92;}
if(y!=8){x=y;y=8;goto q;}puts(f);}
also, the #defines look rather ugly, but they save about 200 bytes so I kept them in, anyway. It is valid ANSI C89/C90 and compiles with very few warnings (two about atoi and puts and two about crippled form of main).

Convert a string into Morse code [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
The challenge
The shortest code by character count, that will input a string using only alphabetical characters (upper and lower case), numbers, commas, periods and question mark, and returns a representation of the string in Morse code.
The Morse code output should consist of a dash (-, ASCII 0x2D) for a long beep (AKA 'dah') and a dot (., ASCII 0x2E) for short beep (AKA 'dit').
Each letter should be separated by a space (' ', ASCII 0x20), and each word should be separated by a forward slash (/, ASCII 0x2F).
Morse code table:
alt text http://liranuna.com/junk/morse.gif
Test cases:
Input:
Hello world
Output:
.... . .-.. .-.. --- / .-- --- .-. .-.. -..
Input:
Hello, Stackoverflow.
Output:
.... . .-.. .-.. --- --..-- / ... - .- -.-. -.- --- ...- . .-. ..-. .-.. --- .-- .-.-.-
Code count includes input/output (that is, the full program).
C (131 characters)
Yes, 131!
main(c){for(;c=c?c:(c=toupper(getch())-32)?
"•ƒŒKa`^ZRBCEIQiw#S#nx(37+$6-2&#/4)'18=,*%.:0;?5"
[c-12]-34:-3;c/=2)putch(c/2?46-c%2:0);}
I eeked out a few more characters by combining the logic from the while and for loops into a single for loop, and by moving the declaration of the c variable into the main definition as an input parameter. This latter technique I borrowed from strager's answer to another challenge.
For those trying to verify the program with GCC or with ASCII-only editors, you may need the following, slightly longer version:
main(c){for(;c=c?c:(c=toupper(getchar())-32)?c<0?1:
"\x95#\x8CKa`^ZRBCEIQiw#S#nx(37+$6-2&#/4)'18=,*%.:0;?5"
[c-12]-34:-3;c/=2)putchar(c/2?46-c%2:32);}
This version is 17 characters longer (weighing in at a comparatively huge 148), due to the following changes:
+4: getchar() and putchar() instead of the non-portable getch() and putch()
+6: escape codes for two of the characters instead of non-ASCII characters
+1: 32 instead of 0 for space character
+6: added "c<0?1:" to suppress garbage from characters less than ASCII 32 (namely, from '\n'). You'll still get garbage from any of !"#$%&'()*+[\]^_`{|}~, or anything above ASCII 126.
This should make the code completely portable. Compile with:
gcc -std=c89 -funsigned-char morse.c
The -std=c89 is optional. The -funsigned-char is necessary, though, or you will get garbage for comma and full stop.
135 characters
c;main(){while(c=toupper(getch()))for(c=c-32?
"•ƒŒKa`^ZRBCEIQiw#S#nx(37+$6-2&#/4)'18=,*%.:0;?5"
[c-44]-34:-3;c;c/=2)putch(c/2?46-c%2:0);}
In my opinion, this latest version is much more visually appealing, too. And no, it's not portable, and it's no longer protected against out-of-bounds input. It also has a pretty bad UI, taking character-by-character input and converting it to Morse Code and having no exit condition (you have to hit Ctrl+Break). But portable, robust code with a nice UI wasn't a requirement.
A brief-as-possible explanation of the code follows:
main(c){
while(c = toupper(getch())) /* well, *sort of* an exit condition */
for(c =
c - 32 ? // effectively: "if not space character"
"•ƒŒKa`^ZRBCEIQiw#S#nx(37+$6-2&#/4)'18=,*%.:0;?5"[c - 44] - 34
/* This array contains a binary representation of the Morse Code
* for all characters between comma (ASCII 44) and capital Z.
* The values are offset by 34 to make them all representable
* without escape codes (as long as chars > 127 are allowed).
* See explanation after code for encoding format.
*/
: -3; /* if input char is space, c = -3
* this is chosen because -3 % 2 = -1 (and 46 - -1 = 47)
* and -3 / 2 / 2 = 0 (with integer truncation)
*/
c; /* continue loop while c != 0 */
c /= 2) /* shift down to the next bit */
putch(c / 2 ? /* this will be 0 if we're down to our guard bit */
46 - c % 2 /* We'll end up with 45 (-), 46 (.), or 47 (/).
* It's very convenient that the three characters
* we need for this exercise are all consecutive.
*/
: 0 /* we're at the guard bit, output blank space */
);
}
Each character in the long string in the code contains the encoded Morse Code for one text character. Each bit of the encoded character represents either a dash or a dot. A one represents a dash, and a zero represents a dot. The least significant bit represents the first dash or dot in the Morse Code. A final "guard" bit determines the length of the code. That is, the highest one bit in each encoded character represents end-of-code and is not printed. Without this guard bit, characters with trailing dots couldn't be printed correctly.
For instance, the letter 'L' is ".-.." in Morse Code. To represent this in binary, we need a 0, a 1, and two more 0s, starting with the least significant bit: 0010. Tack one more 1 on for a guard bit, and we have our encoded Morse Code: 10010, or decimal 18. Add the +34 offset to get 52, which is the ASCII value of the character '4'. So the encoded character array has a '4' as the 33rd character (index 32).
This technique is similar to that used to encode characters in ACoolie's, strager's(2), Miles's, pingw33n's, Alec's, and Andrea's solutions, but is slightly simpler, requiring only one operation per bit (shifting/dividing), rather than two (shifting/dividing and decrementing).
EDIT:
Reading through the rest of the implementations, I see that Alec and Anon came up with this encoding scheme—using the guard bit—before I did. Anon's solution is particularly interesting, using Python's bin function and stripping off the "0b" prefix and the guard bit with [3:], rather than looping, anding, and shifting, as Alec and I did.
As a bonus, this version also handles hyphen (-....-), slash (-..-.), colon (---...), semicolon (-.-.-.), equals (-...-), and at sign (.--.-.). As long as 8-bit characters are allowed, these characters require no extra code bytes to support. No more characters can be supported with this version without adding length to the code (unless there's Morse Codes for greater/less than signs).
Because I find the old implementations still interesting, and the text has some caveats applicable to this version, I've left the previous content of this post below.
Okay, presumably, the user interface can suck, right? So, borrowing from strager, I've replaced gets(), which provides buffered, echoed line input, with getch(), which provides unbuffered, unechoed character input. This means that every character you type gets translated immediately into Morse Code on the screen. Maybe that's cool. It no longer works with either stdin or a command-line argument, but it's pretty damn small.
I've kept the old code below, though, for reference. Here's the new.
New code, with bounds checking, 171 characters:
W(i){i?W(--i/2),putch(46-i%2):0;}c;main(){while(c=toupper(getch())-13)
c=c-19?c>77|c<31?0:W("œ*~*hXPLJIYaeg*****u*.AC5+;79-#6=0/8?F31,2:4BDE"
[c-31]-42):putch(47),putch(0);}
Enter breaks the loop and exits the program.
New code, without bounds checking, 159 characters:
W(i){i?W(--i/2),putch(46-i%2):0;}c;main(){while(c=toupper(getch())-13)
c=c-19?W("œ*~*hXPLJIYaeg*****u*.AC5+;79-#6=0/8?F31,2:4BDE"[c-31]-42):
putch(47),putch(0);}
Below follows the old 196/177 code, with some explanation:
W(i){i?W(--i/2),putch(46-i%2):0;}main(){char*p,c,s[99];gets(s);
for(p=s;*p;)c=*p++,c=toupper(c),c=c-32?c>90|c<44?0:W(
"œ*~*hXPLJIYaeg*****u*.AC5+;79-#6=0/8?F31,2:4BDE"[c-44]-42):
putch(47),putch(0);}
This is based on Andrea's Python answer, using the same technique for generating the morse code as in that answer. But instead of storing the encodable characters one after another and finding their indexes, I stored the indexes one after another and look them up by character (similarly to my earlier answer). This prevents the long gaps near the end that caused problems for earlier implementors.
As before, I've used a character that's greater than 127. Converting it to ASCII-only adds 3 characters. The first character of the long string must be replaced with \x9C. The offset is necessary this time, otherwise a large number of characters are under 32, and must be represented with escape codes.
Also as before, processing a command-line argument instead of stdin adds 2 characters, and using a real space character between codes adds 1 character.
On the other hand, some of the other routines here don't deal with input outside the accepted range of [ ,.0-9\?A-Za-z]. If such handling were removed from this routine, then 19 characters could be removed, bringing the total down as low as 177 characters. But if this is done, and invalid input is fed to this program, it may crash and burn.
The code in this case could be:
W(i){i?W(--i/2),putch(46-i%2):0;}main(){char*p,s[99];gets(s);
for(p=s;*p;p++)*p=*p-32?W(
"œ*~*hXPLJIYaeg*****u*.AC5+;79-#6=0/8?F31,2:4BDE"
[toupper(*p)-44]-42):putch(47),putch(0);}
Using a Morse Code Font?
Console.Write(params[0]);
Perl, 170 characters (with a little help from accomplished golfer mauke). Wrapped for clarity; all newlines are removable.
$_=uc<>;y,. ,|/,;s/./$& /g;#m{A..Z,0..9,qw(| , ?)}=
".-NINNN..]IN-NII..AMN-AI---.M-ANMAA.I.-].AIAA-NANMMIOMAOUMSMSAH.B.MSOIONARZMIZ"
=~/../g;1while s![]\w|,?]!$m{$&}!;print
Explanation:
Extract the morse dictionary. Each symbol is defined in terms of two chars, which can be either literal dots or dashes, or a reference to the value of another defined char. E and T contain dummy chars to avoid desyncing the decoder; we'll remove them later.
Read and format the input. "Hello world" becomes "H E L L O / W O R L D"
The next step depends on the input and output dictionaries being distinct, so turn dots in the input to an unused char (vertical bar, |)
Replace any char in the input that occurs in the morse dictionary with its value in the dictionary, until no replacements occur.
Remove the dummy char mentioned in step 1.
Print the output.
In the final version, the dictionary is optimized for runtime efficiency:
All one-symbol characters (E and T) and two-symbol characters (A, I, M, and N) are defined directly and decode in one pass.
All three-symbol characters are defined in terms of a two-symbol character and a literal symbol, decoding in two passes.
All four-symbol characters are defined in terms of two two-symbol characters, decoding in two passes with three replacements.
The five- and six-symbol characters (numbers and punctuation) decode in three passes, with four or five replacements respectively.
Since the golfed code only replaces one character per loop (to save one character of code!) the number of loops is limited to five times the length of the input (three times the length of the input if only alphabetics are used). But by adding a g to the s/// operation, the number of loops is limited to three (two if only alphabetics are used).
Example transformation:
Hello 123
H E L L O / 1 2 3
II .] AI AI M- / AO UM SM
.... . .-.. .-.. --- / .-M- .A-- I.--
.... . .-.. .-.. --- / .---- ..--- ...--
Python list comprehension, 159-character one-liner
for c in raw_input().upper():print c<","and"/"or bin(ord("•ƒwTaQIECBRZ^`šŒ#S#n|':<.$402&9/6)(18?,*%+3-;=>"[ord(c)-44])-34)[3:].translate(" "*47+"/.-"+" "*206),
Uses the similar data packing to P Daddy's C implementation, but does not store the bits in reverse order and uses bin() to extract the data rather than arithmetic. Note also that spaces are detected using inequality; it considers every character "less than comma" to be a space.
Python for loop, 205 chars including newlines
for a in raw_input().upper():
q='_ETIANMSURWDKGOHVF_L_PJBXCYZQ__54_3___2__+____16=/_____7___8_90'.find(a);s=''
while q>0:s='-.'[q%2]+s;q=~-q/2
print['/','--..--','..--..','.-.-.-',''][' ,?.'.find(a)]+s,
I was dorking around with a compact coding for the symbols, but I don't see if getting any better than the implicit trees already in use, so I present the coding here in case some one else can use it.
Consider the string:
--..--..-.-.-..--...----.....-----.--/
which contains all the needed sequences as substrings. We could code the symbols by offset and length like this:
ET RRRIIGGGJJJJ
--..--..-.-.-..--...----.....-----.--/
CCCC DD WWW 00000
,,,,,, AALLLL BBBB 11111
--..--..-.-.-..--...----.....-----.--/
?????? KKK MMSSS 22222
FFFF PPPP 33333
--..--..-.-.-..--...----.....-----.--/
UUU XXXX 44444
NN PPPP OOO 55555
--..--..-.-.-..--...----.....-----.--/
ZZZZ 66666
77777 YYYY
--..--..-.-.-..--...----.....-----.--/
...... 88888 HHHH
99999 VVVV QQQQ
--..--..-.-.-..--...----.....-----.--/
with the space (i.e. word boundary) starting and ending on the final character (the '/'). Feel free to use it, if you see a good way.
Most of the shorter symbols have several possible codings, of course.
P Daddy found a shorter version of this trick (and I can now see at least some of the redundancy here) and did a nice c implementation. Alec did a python implementation with the first (buggy and incomplete) version. Hobbs did a pretty compact perl version that I don't understand at all.
J, 124 130 134 characters
'.- /'{~;2,~&.>(]`(<&3:)#.(a:=])"0)}.&,&#:&.></.40-~a.i.')}ggWOKIHX`dfggggggg-#B4*:68,?5</.7>E20+193ACD'{~0>.45-~a.i.toupper
J beats C! Awesome!
Usage:
'.- /'{~;2,~&.>(]`(<&3:)#.(a:=])"0)}.&,&#:&.></.40-~a.i.')}ggWOKIHX`dfggggggg-#B4*:68,?5</.7>E20+193ACD'{~0>.45-~a.i.toupper 'Hello World'
.... . .-.. .-.. --- / .-- --- .-. .-.. -..
'.- /'{~;2,~&.>(]`(<&3:)#.(a:=])"0)}.&,&#:&.></.40-~a.i.')}ggWOKIHX`dfggggggg-#B4*:68,?5</.7>E20+193ACD'{~0>.45-~a.i.toupper 'Hello, Stackoverflow.'
.... . .-.. .-.. --- .-.-.- / ... - .- -.-. -.- --- ...- . .-. ..-. .-.. --- .-- --..--
Python 3 One Liner: 172 characters
print(' '.join('/'if c==' 'else''.join('.'if x=='0'else'-'for x in bin(ord("ijÁĕÁÿïçãáàðøüþÁÁÁÁÁČÁÅ×ÚÌÂÒÎÐÄ×ÍÔÇÆÏÖÝÊÈÃÉÑËÙÛÜ"[ord(c)-44])-192)[3:])for c in input().upper()))
(Encoding the tranlation table into unicode code points. Works fine, and they display here fine in my test on my Windows Vista machine.)
Edited to pare down to 184 characters by removing some unnecessary spaces and brackets (making list comps gen exps).
Edit again: More spaces removed that I didn't even know was possible before seeing other answers here - so down to 176.
Edit again down to 172 (woo woo!) by using ' '.join instead of ''.join and doing the spaces separately. (duh!)
C# 266 chars
The 131 char C solution translated to C# yields 266 characters:
foreach(var i in Encoding.ASCII.GetBytes(args[0].ToUpper())){var c=(int)i;for(c=(c-32!=0)?Encoding.ASCII.GetBytes("•ƒŒKa`^ZRBCEIQiw#S#nx(37+$6-2&#/4)'18=,*%.:0;?5")[c-44]-34:-3;c!=0;c/=2)Console.Write(Encoding.ASCII.GetChars(new byte[]{(byte)((c/2!=0)?46-c%2:0)}));}
which is more readable as:
foreach (var i in Encoding.ASCII.GetBytes(args[0].ToUpper()))
{
var c = (int)i;
for (c = ((c - 32) != 0) ? Encoding.ASCII.GetBytes("•ƒŒKa`^ZRBCEIQiw#S#nx(37+$6-2&#/4)'18=,*%.:0;?5")[c - 44] - 34 : -3
; c != 0
; c /= 2)
Console.Write(Encoding.ASCII.GetChars(new byte[] { (byte)((c / 2 != 0) ? 46 - c % 2 : 0) }));
}
Golfscript - 106 chars - NO FUNNY CHARS :)
newline at the end of the input is not supported, so use something like this
echo -n Hello, Stackoverflow| ../golfscript.rb morse.gs
' '/{{.32|"!etianmsurwdkgohvf!l!pjbxcyzq"?)"UsL?/'#! 08<>"#".,?0123456789"?=or
2base(;>{'.-'\=}%' '}%}%'/'*
Letters are a special case and converted to lowercase and ordered in their binary positions.
Everything else is done by a translation table
Python
Incomplete solution, but maybe somebody can make a full solution out of it. Doesn't handle digits or punctuation, but weighs in at only 154 chars.
def e(l):
i='_etianmsurwdkgohvf_l_pjbxcyzq'.find(l.lower());v=''
while i>0:v='-.'[i%2]+v;i=(i-1)/2;return v or '/'
def enc(s):return ' '.join(map(e,s))
C (248 characters)
Another tree-based solution.
#define O putchar
char z[99],*t=
" ETINAMSDRGUKWOHBL~FCPJVX~YZQ~~54~3~~~2~~+~~~~16=/~~.~~7,~~8~90";c,p,i=0;
main(){gets(z);while(c=z[i++]){c-46?c-44?c:O(45):O(c);c=c>96?c-32:c;p=-1;
while(t[++p]!=c);for(;p;p/=2){O(45+p--%2);}c-32?O(32):(O(47),O(c));}}
Could be errors in source tree because wikipedia seems to have it wrong or maybe I misunderstood something.
F#, 256 chars
let rec D i=if i=16 then" "else
let x=int"U*:+F8c]uWjGbJ0-0Dnmd0BiC5?\4o`h7f>9[1E=pr_".[i]-32
if x>43 then"-"+D(x-43)else"."+D x
let M(s:string)=s.ToUpper()|>Seq.fold(fun s c->s+match c with
|' '->"/ "|','->"--..-- "|'.'->".-.-.- "|_->D(int c-48))""
For example
M("Hello, Stack.") |> printfn "%s"
yields
.... . .-.. .-.. --- --..-- / ... - .- -.-. -.- .-.-.-
I think my technique may be unique so far. The idea is:
there is an ascii range of chars that covers most of what we want (0..Z)
there are only 43 chars in this range
thus we can encode one bit (dash or dot) plus a 'next character' in a range of 86 chars
the range ascii(32-117) is all 'printable' and can serve as this 86-char range
so the string literal encodes a table along those lines
There's a little more to it, but that's the gist. Comma, period, and space are not in the range 0..Z so they're handled specially by the 'match'. Some 'unused' characters in the range 0..Z (like ';') are used in the table as suffixes of other morse translations that aren't themselves morse 'letters'.
Here's my contribution as a console application in VB.Net
Module MorseCodeConverter
Dim M() As String = {".-", "-...", "-.-.", "-..", ".", "..-.", "--.", "....", "..", ".---", "-.-", ".-..", "--", "-.", "---", ".--.", "--.-", ".-.", "...", "-", "..-", "...-", ".--", "-..-", "-.--", "--..", "-----", ".----", "..---", "...--", "....-", ".....", "-....", "--...", "---..", "----."}
Sub Main()
Dim I, O
Dim a, b
While True
I = Console.ReadLine()
O = ""
For Each a In I
b = AscW(UCase(a))
If b > 64 And b < 91 Then
O &= M(b - 65) & " "
ElseIf b > 47 And b < 58 Then
O &= M(b - 22) & " "
ElseIf b = 46 Then
O &= ".-.-.- "
ElseIf b = 44 Then
O &= "--..-- "
ElseIf b = 63 Then
O &= "..--.. "
Else
O &= "/"
End If
Next
Console.WriteLine(O)
End While
End Sub
End Module
I left he white space in to make it readable. Totals 1100 characters. It will read the input from the command line, one line at a time, and send the corresponding output back to the output stream. The compressed version is below, with only 632 characters.
Module Q
Dim M() As String={".-","-...","-.-.","-..",".","..-.","--.","....","..",".---","-.-",".-..","--","-.","---",".--.","--.-",".-.","...","-","..-","...-",".--","-..-","-.--","--..","-----",".----","..---","...--","....-",".....","-....","--...","---..","----."}
Sub Main()
Dim I,O,a,b:While 1:I=Console.ReadLine():O="":For Each a In I:b=AscW(UCase(a)):If b>64 And b<91 Then:O &=M(b-65)&" ":ElseIf b>47 And b<58 Then:O &=M(b-22)&" ":ElseIf b=46 Then:O &=".-.-.- ":ElseIf b=44 Then:O &="--..-- ":ElseIf b=63 Then:O &= "..--.. ":Else:O &="/":End IF:Next:Console.WriteLine(O):End While
End Sub
End Module
C (233 characters)
W(n,p){while(n--)putch(".-.-.--.--..--..-.....-----..../"[p++]);}main(){
char*p,c,s[99];gets(s);for(p=s;*p;){c=*p++;c=toupper(c);c=c>90?35:c-32?
"È#À#¶µ´³²±°¹¸·#####Ê##i Že‘J•aEAv„…`q!j“d‰ƒˆ"[c-44]:63;c-35?
W(c>>5,c&31):0;putch(0);}}
This takes input from stdin. Taking input from the command line adds 2 characters. Instead of:
...main(){char*p,c,s[99];gets(s);for(p=s;...
you get:
...main(int i,char**s){char*p,c;for(p=s[1];...
I'm using Windows-1252 code page for characters above 127, and I'm not sure how they'll turn up in other people's browsers. I notice that, in my browser at least (Google Chrome), two of the characters (between "#" and "i") aren't showing up. If you copy out of the browser and paste into a text editor, though, they do show up, albeit as little boxes.
It can be converted to ASCII-only, but this adds 24 characters, increasing the character count to 257. To do this, I first offset each character in the string by -64, minimizing the number of characters that are greater than 127. Then I substitute \xXX character escapes where necessary. It changes this:
...c>90?35:c-32?"È#À#¶µ´³²±°¹¸·#####Ê##i Že‘J•aEAv„…`q!j“d‰ƒˆ"[c-44]:63;
c-35?W(...
to this:
...c>90?99:c-32?"\x88#\x80#vutsrqpyxw#####\x8A#\0PA)\xE0N%Q\nU!O\5\1\66DE 1
\xE1*S$ICH"[c-44]+64:63;c-99?W(...
Here's a more nicely formatted and commented version of the code:
/* writes `n` characters from internal string to stdout, starting with
* index `p` */
W(n,p){
while(n--)
/* warning for using putch without declaring it */
putch(".-.-.--.--..--..-.....-----..../"[p++]);
/* dmckee noticed (http://tinyurl.com/n4eart) the overlap of the
* various morse codes and created a 37-character-length string that
* contained the morse code for every required character (except for
* space). You just have to know the start index and length of each
* one. With the same idea, I came up with this 32-character-length
* string. This not only saves 5 characters here, but means that I
* can encode the start indexes with only 5 bits below.
*
* The start and length of each character are as follows:
*
* A: 0,2 K: 1,3 U: 10,3 4: 18,5
* B: 16,4 L: 15,4 V: 19,4 5: 17,5
* C: 1,4 M: 5,2 W: 4,3 6: 16,5
* D: 9,3 N: 1,2 X: 9,4 7: 25,5
* E: 0,1 O: 22,3 Y: 3,4 8: 24,5
* F: 14,4 P: 4,4 Z: 8,4 9: 23,5
* G: 5,3 Q: 5,4 0: 22,5 .: 0,6
* H: 17,4 R: 0,3 1: 21,5 ,: 8,6
* I: 20,2 S: 17,3 2: 20,5 ?: 10,6
* J: 21,4 T: 1,1 3: 19,5
*/
}
main(){ /* yuck, but it compiles and runs */
char *p, c, s[99];
/* p is a pointer within the input string */
/* c saves from having to do `*p` all the time */
/* s is the buffer for the input string */
gets(s); /* warning for use without declaring */
for(p=s; *p;){ /* begin with start of input, go till null character */
c = *p++; /* grab *p into c, increment p.
* incrementing p here instead of in the for loop saves
* one character */
c=toupper(c); /* warning for use without declaring */
c = c > 90 ? 35 : c - 32 ?
"È#À#¶µ´³²±°¹¸·#####Ê##i Že‘J•aEAv„…`q!j“d‰ƒˆ"[c - 44] : 63;
/**** OR, for the ASCII version ****/
c = c > 90 ? 99 : c - 32 ?
"\x88#\x80#vutsrqpyxw#####\x8A#\0PA)\xE0N%Q\nU!O\5\1\66DE 1\xE1"
"*S$ICH"[c - 44] + 64 : 63;
/* Here's where it gets hairy.
*
* What I've done is encode the (start,length) values listed in the
* comment in the W function into one byte per character. The start
* index is encoded in the low 5 bits, and the length is encoded in
* the high 3 bits, so encoded_char = (char)(length << 5 | position).
* For the longer, ASCII-only version, 64 is subtracted from the
* encoded byte to reduce the necessity of costly \xXX representations.
*
* The character array includes encoded bytes covering the entire range
* of characters covered by the challenge, except for the space
* character, which is checked for separately. The covered range
* starts with comma, and ends with capital Z (the call to `toupper`
* above handles lowercase letters). Any characters not supported are
* represented by the "#" character, which is otherwise unused and is
* explicitly checked for later. Additionally, an explicit check is
* done here for any character above 'Z', which is changed to the
* equivalent of a "#" character.
*
* The encoded byte is retrieved from this array using the value of
* the current character minus 44 (since the first supported character
* is ASCII 44 and index 0 in the array). Finally, for the ASCII-only
* version, the offset of 64 is added back in.
*/
c - 35 ? W(c >> 5, c & 31) : 0;
/**** OR, for the ASCII version ****/
c - 99 ? W(c >> 5, c & 31) : 0;
/* Here's that explicit check for the "#" character, which, as
* mentioned above, is for characters which will be ignored, because
* they aren't supported. If c is 35 (or 99 for the ASCII version),
* then the expression before the ? evaluates to 0, or false, so the
* expression after the : is evaluated. Otherwise, the expression
* before the ? is non-zero, thus true, so the expression before
* the : is evaluated.
*
* This is equivalent to:
*
* if(c != 35) // or 99, for the ASCII version
* W(c >> 5, c & 31);
*
* but is shorter by 2 characters.
*/
putch(0);
/* This will output to the screen a blank space. Technically, it's not
* the same as a space character, but it looks like one, so I think I
* can get away with it. If a real space character is desired, this
* must be changed to `putch(32);`, which adds one character to the
* overall length.
} /* end for loop, continue with the rest of the input string */
} /* end main */
This beats everything here except for a couple of the Python implementations. I keep thinking that it can't get any shorter, but then I find some way to shave off a few more characters. If anybody can find any more room for improvement, let me know.
EDIT:
I noticed that, although this routine rejects any invalid characters above ASCII 44 (outputting just a blank space for each one), it doesn't check for invalid characters below this value. To check for these adds 5 characters to the overall length, changing this:
...c>90?35:c-32?"...
to this:
...c-32?c>90|c<44?35:"...
REBOL (118 characters)
A roughly 10 year-old implementation
foreach c ask""[l: index? find" etinamsdrgukwohblzfcpövxäqüyj"c while[l >= 2][prin pick"-."odd? l l: l / 2]prin" "]
Quoted from: http://www.rebol.com/oneliners.html
(no digits though and words are just separated by double spaces :/ ...)
Python (210 characters)
This is a complete solution based on Alec's one
def e(l):
i=(' etianmsurwdkgohvf_l_pjbxcyzq__54_3___2%7s16%7s7___8_90%12s?%8s.%29s,'%tuple('_'*5)).find(l.lower());v=''
while i>0:v='-.'[i%2]+v;i=(i-1)/2
return v or '/'
def enc(s):return ' '.join(map(e,s))
C, 338 chars
338 with indentation and all removable linebreaks removed:
#define O putchar
#define W while
char*l="x#####ppmmmmm##FBdYcbcbSd[Kcd`\31(\b1g_<qCN:_'|\25D$W[QH0";
int c,b,o;
main(){
W(1){
W(c<32)
c=getchar()&127;
W(c>96)
c^=32;
c-=32;
o=l[c/2]-64;
b=203+(c&1?o>>3:0);
o=c&1?o&7:o>>3;
W(o>6)
O(47),o=0;
c/=2;
W(c--)
b+=(l[c]-64&7)+(l[c]-64>>3);
b=(((l[b/7]<<7)+l[b/7+1])<<(b%7))>>14-o;
W(o--)
O(b&(1<<o)?46:45);
O(32);
}
}
This isn't based on the tree approach other people have been taking. Instead, l first encodes the lengths of all bytes between 32 and 95 inclusive, two bytes to a character. As an example, D is -.. for a length of 3 and E is . for a length of 1. This is encoded as 011 and 001, giving 011001. To make more characters encodable and avoid escapes, 64 is then added to the total, giving 1011001 - 89, ASCII Y. Non-morse characters are assigned a length of 0. The second half of l (starting with \031) are the bits of the morse code itself, with a dot being 1 and a dash 0. To avoid going into high ASCII, this data is encoded 7 bits/byte.
The code first sanitises c, then works out the morse length of c (in o), then adds up the lengths of all the previous characters to produce b, the bit index into the data.
Finally, it loops through the bits, printing dots and dashes.
The length '7' is used as a special flag for printing a / when encountering a space.
There are probably some small gains to be had from removing brackets, but I'm way off from some of the better results and I'm hungry, so...
C# Using Linq (133 chars)
static void Main()
{
Console.WriteLine(String.Join(" ", (from c in Console.ReadLine().ToUpper().ToCharArray()
select m[c]).ToArray()));
}
OK, so I cheated. You also need to define a dictionary as follows (didn't bother counting the chars, since this blows me out of the game):
static Dictionary<char, string> m = new Dictionary<char, string>() {
{'A', ".-"},
{'B', "-.."},
{'C', "-.-."},
{'D', "-.."},
{'E', "."},
{'F', "..-."},
{'G', "--."},
{'H', "...."},
{'I', ".."},
{'J', ".---"},
{'K', "-.-"},
{'L', ".-.."},
{'M', "--"},
{'N', "-."},
{'O', "---"},
{'P', ".--."},
{'Q', "--.-"},
{'R', ".-."},
{'S', "..."},
{'T', "-"},
{'U', "..-"},
{'V', "...-"},
{'W', ".--"},
{'X', "-..-"},
{'Y', "-.--"},
{'Z', "--.."},
{'0', "-----"},
{'1', ".----"},
{'2', "..---"},
{'3', "...--"},
{'4', "....-"},
{'5', "....."},
{'6', "-...."},
{'7', "--..."},
{'8', "---.."},
{'9', "----."},
{' ', "/"},
{'.', ".-.-.-"},
{',', "--..--"},
{'?', "..--.."},
};
Still, can someone provide a more concise C# implementation which is also as easy to understand and maintain as this?
Perl, 206 characters, using dmckee's idea
This is longer than the first one I submitted, but I still think it's interesting. And/or awful. I'm not sure yet. This makes use of dmckee's coding idea, plus a couple other good ideas that I saw around. Initially I thought that the "length/offset in a fixed string" thing couldn't come out to less data than the scheme in my other solution, which uses a fixed two bytes per char (and all printable bytes, at that). I did in fact manage to get the data down to considerably less (one byte per char, plus four bytes to store the 26-bit pattern we're indexing into) but the code to get it out again is longer, despite my best efforts to golf it. (Less complex, IMO, but longer anyway).
Anyway, 206 characters; newlines are removable except the first.
#!perl -lp
($a,#b)=unpack"b32C*",
"\264\202\317\0\31SF1\2I.T\33N/G\27\308XE0=\x002V7HMRfermlkjihgx\207\205";
$a=~y/01/-./;#m{A..Z,0..9,qw(. , ?)}=map{substr$a,$_%23,1+$_/23}#b;
$_=join' ',map$m{uc$_}||"/",/./g
Explanation:
There are two parts to the data. The first four bytes ("\264\202\317\0") represent 32 bits of morse code ("--.-..-.-.-----.....--..--------") although only the first 26 bits are used. This is the "reference string".
The remainder of the data string stores the starting position and length of substrings of the reference string that represent each character -- one byte per character, in the order (A, B, ... Z, 0, 1, ... 9, ".", ",", "?"). The values are coded as 23 * (length - 1) + pos, and the decoder reverses that. The last starting pos is of course 22.
So the unpack does half the work of extracting the data and the third line (as viewed here) does the rest, now we have a hash with $m{'a'} = '.-' et cetera, so all there is left is to match characters of the input, look them up in the hash, and format the output, which the last line does... with some help from the shebang, which tells perl to remove the newline on input, put lines of input in $_, and when the code completes running, write $_ back to output with newlines added again.
Python 2; 171 characters
Basically the same as Andrea's solution, but as a complete program, and using stupid tricks to make it shorter.
for c in raw_input().lower():print"".join(".-"[int(d)]for d in bin(
(' etianmsurwdkgohvf_l_pjbxcyzq__54_3___2%7s16%7s7___8_90%12s?%8s.%29s,'
%(('',)*5)).find(c))[3:])or'/',
(the added newlines can all be removed)
Or, if you prefer not to use the bin() function in 2.6, we can get do it in 176:
for c in raw_input():C=lambda q:q>0and C(~-q/2)+'-.'[q%2]or'';print C(
(' etianmsurwdkgohvf_l_pjbxcyzq__54_3___2%7s16%7s7___8_90%12s?%8s.%29s,'%
(('',)*5)).find(c.lower()))or'/',
(again, the added newlines can all be removed)
C89 (293 characters)
Based off some of the other answers.
EDIT: Shrunk the tree (yay).
#define P putchar
char t['~']="~ETIANMSURWDKGOHVF~L~PJBXCYZQ~~54~3",o,q[9],Q=10;main(c){for(;Q;)t[
"&./7;=>KTr"[--Q]]="2167890?.,"[Q];while((c=getchar())>=0){c-=c<'{'&c>96?32:0;c-
10?c-32?0:P(47):P(10);for(o=1;o<'~';++o)if(t[o]==c){for(;o;o/=2)q[Q++]=45+(o--&1
);for(;Q;P(q[--Q]));break;}P(32);}}
Here's another approach, based on dmckee's work, demonstrating just how readable Python is:
Python
244 characters
def h(l):p=2*ord(l.upper())-88;a,n=map(ord,"AF__GF__]E\\E[EZEYEXEWEVEUETE__________CF__IBPDJDPBGAHDPC[DNBSDJCKDOBJBTCND`DKCQCHAHCZDSCLD??OD"[p:p+2]);return "--..--..-.-.-..--...----.....-----.-"[a-64:a+n-128]
def e(s):return ' '.join(map(h,s))
Limitations:
dmckee's string missed the 'Y' character, and I was too lazy to add it. I think you'd just have to change the "??" part, and add a "-" at the end of the second string literal
it doesn't put '/' between words; again, lazy
Since the rules called for fewest characters, not fewest bytes, you could make at least one of my lookup tables smaller (by half) if you were willing to go outside the printable ASCII characters.
EDIT: If I use naïvely-chosen Unicode chars but just keep them in escaped ASCII in the source file, it still gets a tad shorter because the decoder is simpler:
Python
240 characters
def h(l):a,n=divmod(ord(u'\x06_7_\xd0\xc9\xc2\xbb\xb4\xad\xa6\x9f\x98\x91_____\x14_AtJr2<s\xc1d\x89IQdH\x8ff\xe4Pz9;\xba\x88X_f'[ord(l.upper())-44]),7);return "--..--..-.-.-..--...----.....-----.-"[a:a+n]
def e(s):return ' '.join(map(h,s))
I think it also makes the intent of the program much clearer.
If you saved this as UTF-8, I believe the program would be down to 185 characters, making it the shortest complete Python solution, and second only to Perl. :-)
Here's a third, completely different way of encoding morse code:
Python
232 characters
def d(c):
o='';b=ord("Y_j_?><80 !#'/_____f_\x06\x11\x15\x05\x02\x15\t\x1c\x06\x1e\r\x12\x07\x05\x0f\x16\x1b\n\x08\x03\r\x18\x0e\x19\x01\x13"[ord(c.upper())-44])
while b!=1:o+='.-'[b&1];b/=2
return o
e=lambda s:' '.join(map(d,s))
If you can figure out a way to map this onto some set of printable characters, you could save quite a few characters. This is probably my most direct solution, though I don't know if it's the most readable.
OK, now I've wasted way too much time on this.
Haskell
type MorseCode = String
program :: String
program = "__5__4H___3VS__F___2 UI__L__+_ R__P___1JWAE"
++ "__6__=B__/_XD__C__YKN__7_Z__QG__8_ __9__0 OMT "
decode :: MorseCode -> String
decode = interpret program
where
interpret = head . foldl exec []
exec xs '_' = undefined : xs
exec (x:y:xs) c = branch : xs
where
branch (' ':ds) = c : decode ds
branch ('-':ds) = x ds
branch ('.':ds) = y ds
branch [] = [c]
For example, decode "-- --- .-. ... . -.-. --- -.. ." returns "MORSE CODE".
This program is from taken from the excellent article Fun with Morse Code.
PHP
I modified the previous PHP entry to be slightly more efficient. :)
$a=array(32=>"/",44=>"--..--",1,".-.-.-",48=>"-----",".----","..---","...--","....-",".....","-....","--...","---..","----.",63=>"..--..",1,".-","-...","-.-.","-..",".","..-.","--.","....","..",".---","-.-",".-..","--","-.","---",".--.","--.-",".-.","...","-","..-","...-",".--","-..-","-.--","--..");
foreach(str_split(strtoupper("hello world?"))as$k=>$v){echo $a[ord($v)]." ";}
Komodo says 380 characters on 2 lines - the extra line is just for readability. ;D
The interspersed 1s in the array is just to save 2 bytes by filling that array position with data instead of manually jumping to the array position after that.
Consider the first vs. the second. The difference is clearly visible. :)
array(20=>"data",22=>"more data";
array(20=>"data",1,"more data";
The end result, however, is exactly as long as you use the array positions rather than loop through the contents, which we don't do on this golf course.
End result: 578 characters, down to 380 (198 characters, or ~34.26% savings).
Bash, a script I wrote a while ago (time-stamp says last year) weighing in at a hefty 1661 characters. Just for fun really :)
#!/bin/sh
txt=''
res=''
if [ "$1" == '' ]; then
read -se txt
else
txt="$1"
fi;
len=$(echo "$txt" | wc -c)
k=1
while [ "$k" -lt "$len" ]; do
case "$(expr substr "$txt" $k 1 | tr '[:upper:]' '[:lower:]')" in
'e') res="$res"'.' ;;
't') res="$res"'-' ;;
'i') res="$res"'..' ;;
'a') res="$res"'.-' ;;
'n') res="$res"'-.' ;;
'm') res="$res"'--' ;;
's') res="$res"'...' ;;
'u') res="$res"'..-' ;;
'r') res="$res"'.-.' ;;
'w') res="$res"'.--' ;;
'd') res="$res"'-..' ;;
'k') res="$res"'-.-' ;;
'g') res="$res"'--.' ;;
'o') res="$res"'---' ;;
'h') res="$res"'....' ;;
'v') res="$res"'...-' ;;
'f') res="$res"'..-.' ;;
'l') res="$res"'.-..' ;;
'p') res="$res"'.--.' ;;
'j') res="$res"'.---' ;;
'b') res="$res"'-...' ;;
'x') res="$res"'-..-' ;;
'c') res="$res"'-.-.' ;;
'y') res="$res"'-.--' ;;
'z') res="$res"'--..' ;;
'q') res="$res"'--.-' ;;
'5') res="$res"'.....' ;;
'4') res="$res"'....-' ;;
'3') res="$res"'...--' ;;
'2') res="$res"'..---' ;;
'1') res="$res"'.----' ;;
'6') res="$res"'-....' ;;
'7') res="$res"'--...' ;;
'8') res="$res"'---..' ;;
'9') res="$res"'----.' ;;
'0') res="$res"'-----' ;;
esac;
[ ! "$(expr substr "$txt" $k 1)" == " " ] && [ ! "$(expr substr "$txt" $(($k+1)) 1)" == ' ' ] && res="$res"' '
k=$(($k+1))
done;
echo "$res"
C89 (388 characters)
This is incomplete as it doesn't handle comma, fullstop, and query yet.
#define P putchar
char q[10],Q,tree[]=
"EISH54V 3UF 2ARL + WP J 1TNDB6=X/ KC Y MGZ7 Q O 8 90";s2;e(x){q[Q++]
=x;}p(){for(;Q--;putchar(q[Q]));Q=0;}T(int x,char*t,int s){s2=s/2;return s?*t-x
?t[s2]-x?T(x,++t+s2,--s/2)?e(45):T(x,t,--s/2)?e(46):0:e(45):e(46):0;}main(c){
while((c=getchar())>=0){c-=c<123&&c>96?32:0;if(c==10)P(10);if(c==32)P(47);else
T(c,tree,sizeof(tree)),p();P(' ');}}
Wrapped for readability. Only two of the linebreaks are required (one for the #define, one after else, which could be a space). I've added a few non-standard characters but didn't add non-7-bit ones.
C, 533 characters
I took advice from some comments and switched to stdin. Killed another 70 characters roughly.
#include <stdio.h>
#include <ctype.h>
char *u[36] = {".-","-...","-.-.","-..",".","..-.","--.","....","..",".---","-.-",".-..","--","-.","---",".--.","--.-",".-.","...","-","..-","...-",".--","-..-","-.--","--..","-----",".----","..---","...--","....-",".....","-....","--...","---..","----."};
main(){
char*v;int x;char o;
do{
o = toupper(getc(stdin));v=0;if(o>=65&&o<=90)v=u[o-'A'];if(o>=48&&o<=57)v=u[o-'0'+26];if(o==46)v=".-.-.-";if(o==44)v="--..--";if(o==63)v="..--..";if(o==32)v="/";if(v)printf("%s ", v);} while (o != EOF);
}
C (381 characters)
char*p[36]={".-","-...","-.-.","-..",".","..-.","--.","....","..",".---","-.-",".-..","--","-.","---",".--.","--.-",".-.","...","-","..-","...-",".--","-..-","-.--","--..","-----",".----","..---","...--","....-",".....","-....","--...","---..","----."};
main(){int c;while((c=tolower(getchar()))!=10)printf("%s ",c==46?".-.-.-":c==44?"--..--":c==63?"..--..":c==32?"/":*(p+(c-97)));}
C, 448 bytes using cmdline arguments:
char*a[]={".-.-.-","--..--","..--..","/",".-","-...","-.-.","-..",".","..-.","--.","....","..",".---","-.-",".-..","--","-.","---",".--.","--.-",".-.","...","-","..-","...-",".--","-..-","-.--","--..","-----",".----","..---","...--","....-",".....","-....","--...","---..","----."},*k=".,? ",*s,*p,x;main(int _,char**v){for(;s=*++v;putchar(10))for(;x=*s++;){p=strchr(k,x);printf("%s ",p?a[p-k]:isdigit(x)?a[x-18]:isalpha(x=toupper(x))?a[x-61]:0);}}
C, 416 bytes using stdin:
char*a[]={".-.-.-","--..--","..--..","/",".-","-...","-.-.","-..",".","..-.","--.","....","..",".---","-.-",".-..","--","-.","---",".--.","--.-",".-.","...","-","..-","...-",".--","-..-","-.--","--..","-----",".----","..---","...--","....-",".....","-....","--...","---..","----."},*k=".,? ",*p,x;main(){while((x=toupper(getchar()))-10){p=strchr(k,x);printf("%s ",p?a[p-k]:isdigit(x)?a[x-18]:isalpha(x)?a[x-61]:0);}}

How do I display the binary representation of a float or double?

I'd like to display the binary (or hexadecimal) representation of a floating point number. I know how to convert by hand (using the method here), but I'm interested in seeing code samples that do the same.
Although I'm particularly interested in the C++ and Java solutions, I wonder if any languages make it particularly easy so I'm making this language agnostic. I'd love to see some solutions in other languages.
EDIT: I've gotten good coverage of C, C++, C#, and Java. Are there any alternative-language gurus out there who want to add to the list?
C/C++ is easy.
union ufloat {
float f;
unsigned u;
};
ufloat u1;
u1.f = 0.3f;
Then you just output u1.u.
Doubles just as easy.
union udouble {
double d;
unsigned long u;
}
because doubles are 64 bit.
Java is a bit easier: use Float.floatToRawIntBits() combined with Integer.toBinaryString() and Double.doubleToRawLongBits combined with Long.toBinaryString().
In C:
int fl = *(int*)&floatVar;
&floatVar would get the adress memory then (int*) would be a pointer to this adress memory, finally the * to get the value of the 4 bytes float in int.
Then you can printf the binary format or hex format.
Java: a google search finds this link on Sun's forums
specifically (I haven't tried this myself)
long binary = Double.doubleToLongBits(3.14159);
String strBinary = Long.toBinaryString(binary);
Apparently nobody cared to mention how trivial is to obtain hexadecimal exponent notation, so here it is:
#include <iostream>
#include <cstdio>
using namespace std;
int main()
{
// C++11 manipulator
cout << 23.0f << " : " << std::hexfloat << 23.0f << endl;
// C equivalent solution
printf("23.0 in hexadecimal is: %A\n", 23.0f);
}
In .NET (including C#), you have BitConverter that accepts many types, allowing access to the raw binary; to get the hex, ToString("x2") is the most common option (perhaps wrapped in a utility method):
byte[] raw = BitConverter.GetBytes(123.45);
StringBuilder sb = new StringBuilder(raw.Length * 2);
foreach (byte b in raw)
{
sb.Append(b.ToString("x2"));
}
Console.WriteLine(sb);
Oddly, base-64 has a 1-line conversion (Convert.ToBase64String), but base-16 takes more effort. Unless you reference Microsoft.VisualBasic, in which case:
long tmp = BitConverter.DoubleToInt64Bits(123.45);
string hex = Microsoft.VisualBasic.Conversion.Hex(tmp);
I did it this way:
/*
#(#)File: $RCSfile: dumpdblflt.c,v $
#(#)Version: $Revision: 1.1 $
#(#)Last changed: $Date: 2007/09/05 22:23:33 $
#(#)Purpose: Print C double and float data in bytes etc.
#(#)Author: J Leffler
#(#)Copyright: (C) JLSS 2007
#(#)Product: :PRODUCT:
*/
/*TABSTOP=4*/
#include <stdio.h>
#include "imageprt.h"
#ifndef lint
/* Prevent over-aggressive optimizers from eliminating ID string */
extern const char jlss_id_dumpdblflt_c[];
const char jlss_id_dumpdblflt_c[] = "#(#)$Id: dumpdblflt.c,v 1.1 2007/09/05 22:23:33 jleffler Exp $";
#endif /* lint */
union u_double
{
double dbl;
char data[sizeof(double)];
};
union u_float
{
float flt;
char data[sizeof(float)];
};
static void dump_float(union u_float f)
{
int exp;
long mant;
printf("32-bit float: sign: %d, ", (f.data[0] & 0x80) >> 7);
exp = ((f.data[0] & 0x7F) << 1) | ((f.data[1] & 0x80) >> 7);
printf("expt: %4d (unbiassed %5d), ", exp, exp - 127);
mant = ((((f.data[1] & 0x7F) << 8) | (f.data[2] & 0xFF)) << 8) | (f.data[3] & 0xFF);
printf("mant: %16ld (0x%06lX)\n", mant, mant);
}
static void dump_double(union u_double d)
{
int exp;
long long mant;
printf("64-bit float: sign: %d, ", (d.data[0] & 0x80) >> 7);
exp = ((d.data[0] & 0x7F) << 4) | ((d.data[1] & 0xF0) >> 4);
printf("expt: %4d (unbiassed %5d), ", exp, exp - 1023);
mant = ((((d.data[1] & 0x0F) << 8) | (d.data[2] & 0xFF)) << 8) |
(d.data[3] & 0xFF);
mant = (mant << 32) | ((((((d.data[4] & 0xFF) << 8) |
(d.data[5] & 0xFF)) << 8) | (d.data[6] & 0xFF)) << 8) |
(d.data[7] & 0xFF);
printf("mant: %16lld (0x%013llX)\n", mant, mant);
}
static void print_value(double v)
{
union u_double d;
union u_float f;
f.flt = v;
d.dbl = v;
printf("SPARC: float/double of %g\n", v);
image_print(stdout, 0, f.data, sizeof(f.data));
image_print(stdout, 0, d.data, sizeof(d.data));
dump_float(f);
dump_double(d);
}
int main(void)
{
print_value(+1.0);
print_value(+2.0);
print_value(+3.0);
print_value( 0.0);
print_value(-3.0);
print_value(+3.1415926535897932);
print_value(+1e126);
return(0);
}
Running on a SUN UltraSPARC, I got:
SPARC: float/double of 1
0x0000: 3F 80 00 00 ?...
0x0000: 3F F0 00 00 00 00 00 00 ?.......
32-bit float: sign: 0, expt: 127 (unbiassed 0), mant: 0 (0x000000)
64-bit float: sign: 0, expt: 1023 (unbiassed 0), mant: 0 (0x0000000000000)
SPARC: float/double of 2
0x0000: 40 00 00 00 #...
0x0000: 40 00 00 00 00 00 00 00 #.......
32-bit float: sign: 0, expt: 128 (unbiassed 1), mant: 0 (0x000000)
64-bit float: sign: 0, expt: 1024 (unbiassed 1), mant: 0 (0x0000000000000)
SPARC: float/double of 3
0x0000: 40 40 00 00 ##..
0x0000: 40 08 00 00 00 00 00 00 #.......
32-bit float: sign: 0, expt: 128 (unbiassed 1), mant: 4194304 (0x400000)
64-bit float: sign: 0, expt: 1024 (unbiassed 1), mant: 2251799813685248 (0x8000000000000)
SPARC: float/double of 0
0x0000: 00 00 00 00 ....
0x0000: 00 00 00 00 00 00 00 00 ........
32-bit float: sign: 0, expt: 0 (unbiassed -127), mant: 0 (0x000000)
64-bit float: sign: 0, expt: 0 (unbiassed -1023), mant: 0 (0x0000000000000)
SPARC: float/double of -3
0x0000: C0 40 00 00 .#..
0x0000: C0 08 00 00 00 00 00 00 ........
32-bit float: sign: 1, expt: 128 (unbiassed 1), mant: 4194304 (0x400000)
64-bit float: sign: 1, expt: 1024 (unbiassed 1), mant: 2251799813685248 (0x8000000000000)
SPARC: float/double of 3.14159
0x0000: 40 49 0F DB #I..
0x0000: 40 09 21 FB 54 44 2D 18 #.!.TD-.
32-bit float: sign: 0, expt: 128 (unbiassed 1), mant: 4788187 (0x490FDB)
64-bit float: sign: 0, expt: 1024 (unbiassed 1), mant: 2570638124657944 (0x921FB54442D18)
SPARC: float/double of 1e+126
0x0000: 7F 80 00 00 ....
0x0000: 5A 17 A2 EC C4 14 A0 3F Z......?
32-bit float: sign: 0, expt: 255 (unbiassed 128), mant: 0 (0x000000)
64-bit float: sign: 0, expt: 1441 (unbiassed 418), mant: -1005281217 (0xFFFFFFFFC414A03F)
Well both the Float and Double class (in Java) have a toHexString('float') method so pretty much this would do for hex conversion
Double.toHexString(42344);
Float.toHexString(42344);
Easy as pie!
I had to think about posting here for a while because this might inspire fellow coders to do evil things with C. I decided to post it anyway but just remember: do not write this kind of code to any serious application without proper documentation and even then think thrice.
With the disclaimer aside, here we go.
First write a function for printing for instance a long unsigned variable in binary format:
void printbin(unsigned long x, int n)
{
if (--n) printbin(x>>1, n);
putchar("01"[x&1]);
}
Unfortunately we can't directly use this function to print our float variable so we'll have to hack a little. The hack probably looks familiar to everyone who have read about Carmack's Inverse Square Root trick for Quake. The idea is set a value for our float variable and then get the same bit mask for our long integer variable. So we take the memory address of f, convert it to a long* value and use that pointer to get the bit mask of f as a long unsigned. If you were to print this value as long unsigned, the result would be a mess, but the bits are the same as in the original float value so it doesn't really matter.
int main(void)
{
long unsigned lu;
float f = -1.1f;
lu = *(long*)&f;
printbin(lu, 32);
printf("\n");
return 0;
}
If you think this syntax is awful, you're right.
In Haskell, there is no internal representation of floating points accessible. But you can do binary serialiazation from many formats, including Float and Double. The following solution is generic to any type that has instance of Data.Binary supports:
module BinarySerial where
import Data.Bits
import Data.Binary
import qualified Data.ByteString.Lazy as B
elemToBits :: (Bits a) => a -> [Bool]
elemToBits a = map (testBit a) [0..7]
listToBits :: (Bits a) => [a] -> [Bool]
listToBits a = reverse $ concat $ map elemToBits a
rawBits :: (Binary a) => a -> [Bool]
rawBits a = listToBits $ B.unpack $ encode a
Conversion can be done with rawBits:
rawBits (3.14::Float)
But, if you need to access the float value this way, you are probably doing something wrong. The real question might be How to access exponent and significand of a floating-point number? The answers are exponent and significand from Prelude:
significand 3.14
0.785
exponent 3.14
2
Python:
Code:
import struct
def float2bin(number, hexdecimal=False, single=False):
bytes = struct.pack('>f' if single else '>d', number)
func, length = (hex, 2) if hexdecimal else (bin, 8)
byte2bin = lambda byte: func(ord(byte))[2:].rjust(length, '0')
return ''.join(map(byte2bin, bytes))
Sample:
>>> float2bin(1.0)
'0011111111110000000000000000000000000000000000000000000000000000'
>>> float2bin(-1.0)
'1011111111110000000000000000000000000000000000000000000000000000'
>>> float2bin(1.0, True)
'3ff0000000000000'
>>> float2bin(1.0, True, True)
'3f800000'
>>> float2bin(-1.0, True)
'bff0000000000000'
You can easy convert float variable to int variable (or double to long) using such code in C#:
float f = ...;
unsafe
{
int i = *(int*)&f;
}
For future reference, C++ 2a introduce a new function template bit_cast that does the job.
template< class To, class From >
constexpr To bit_cast(const From& from) noexcept;
We can simply call,
float f = 3.14;
std::bit_cast<int>(f);
For more details, see https://en.cppreference.com/w/cpp/numeric/bit_cast
In C++ you can show the binary representation it in this way:
template <class T>
std::bitset<sizeof(T)*8> binary_representation(const T& f)
{
typedef unsigned long TempType;
assert(sizeof(T)<=sizeof(TempType));
return std::bitset<sizeof(T)*8>(*(reinterpret_cast<const TempType*>(&f)));
}
the limit here is due the fact that bitset longer parameter is an unsigned long,
so it works up to float, you can use something else than bitset and the extend
that assert.
BTW, cletus suggestion fails in the sense that you need an "unsingned long long" to cover a double, anyway you need something that shows the binary (1 or 0) representation.