I'm trying to create a calculator (a real, physical calculator) from scratch, with only simple parts (relays, diodes, etc.). But I've hit a chronic problem: how the hell do I convert a binary number, e.g. 00101110, to decimal, so that I can make my display show 46? Making the display translate 0000...1001 (0...9) is easy, but what about after that, when a number has two or more decimal digits (e.g. 10000000 = 128)? I know it can be tricky to explain, so is there somewhere I can find the answer, maybe a schematic?
This isn't about a programming language; it's literally a relay computer (remember the old IBM Harvard Mark I?). I just want to make a relay calculator that does binary calculations (the calculation part is theoretically finished; for now, only addition).
What I can't do is turn the binary result into something that can be shown on a 7-segment display.
An easy example: with "0000 0111" I can make the display show the number 7, because it has only one decimal digit. With "0011 0100" the situation changes: the number would be 52. Simply making "52" appear on a display is not the challenge; the problem is, how does a processor translate binary numbers from 0000 on upwards into a form you can put on a display?
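From what I've been able to dig up so far, the usual hardware answer seems to be a binary-to-BCD converter, often built with the "double dabble" (shift-and-add-3) method. Here is a rough Python sketch of what that logic computes, just to make the idea concrete; the function name and digit count are my own choices, and the relay implementation is a different story.

    # Rough model of the "double dabble" (shift-and-add-3) binary-to-BCD conversion.
    # This only illustrates the logic, not relay hardware; the names are mine.
    def double_dabble(value, digits=3):
        """Convert an 8-bit value into `digits` BCD digits, most significant first."""
        bcd = [0] * digits                       # one 4-bit "register" per decimal digit
        for i in range(7, -1, -1):               # walk the 8 input bits, MSB first
            for d in range(digits):              # add 3 to any digit that is 5 or more
                if bcd[d] >= 5:
                    bcd[d] += 3
            carry = (value >> i) & 1             # next input bit enters the ones digit
            for d in range(digits - 1, -1, -1):  # shift the whole BCD register left by one
                bcd[d] = (bcd[d] << 1) | carry
                carry = bcd[d] >> 4              # overflow bit moves to the next digit up
                bcd[d] &= 0xF
        return bcd

    print(double_dabble(0b00101110))   # [0, 4, 6] -> drive the display with "046" (46)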
I don't necessarily need a definitive answer; even a website, a book, any light at the end of the tunnel would help.
Related
I need to convert a decimal number to a base between 2 and 9 using Little Man Computer. How do I proceed?
I believe successive divisions are the best method. In my opinion, I need to write code that divides two numbers, then saves the integer quotient for the next division, as well as all of the remainders in an array of indefinite size, but I've been struggling with the division code for hours now. I tried searching for code that divides two numbers, but all the versions I tried have mistakes or don't work. I'm stuck at the easiest part of the problem; I can't imagine how I'm ever going to be able to write self-modifying code that manages an array at ever-increasing line positions and backtracks through it at the end to extract all the remainders. I'm at a loss here; any help would be appreciated.
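Just to make the scheme concrete (this is plain Python standing in for LMC code, not a working LMC program; all the names are mine), successive division with the division itself done by repeated subtraction looks like this:

    # Sketch only: successive divisions for base conversion, with division done by
    # repeated subtraction, which is all the LMC can do.
    def divide_by_subtraction(dividend, divisor):
        """Return (quotient, remainder) using nothing but subtraction and comparison."""
        quotient = 0
        while dividend >= divisor:
            dividend -= divisor
            quotient += 1
        return quotient, dividend

    def to_base(number, base):
        """Convert a non-negative decimal number to a list of digits in base 2..9."""
        if number == 0:
            return [0]
        remainders = []
        while number > 0:
            number, r = divide_by_subtraction(number, base)
            remainders.append(r)       # remainders come out least significant first
        return remainders[::-1]        # backtrack through them at the end

    print(to_base(46, 2))   # [1, 0, 1, 1, 1, 0]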
I need to analyse a binary data file containing raw data from a scientific instrument. A quick look in a hex viewer indicates there's probably no encryption or anything fancy: integers are probably written as integers (though I don't know in what byte order), and who knows about floating point.
I have access to a (closed source) program that can view the contents of the file. So I can see that a certain value is 74078. What I'm not sure about is how to actually search for that value: do I search for 00 01 21 5E, some other byte order, etc.? (Hex Fiend doesn't support searching for decimal values.) And how would I find a floating point number?
The software that produces these files runs on XP. I'd prefer tools that run on OSX if possible.
(Hmm, I wrote up this question, forgot to post it, then solved the problem. I guess I will write my own answer.)
In the end, Hex Fiend turned out to be just enough. What I was expecting to do:
Convert a known value into hex
Search for it
What I actually did:
Pick a random chunk of hex that looked like it might be a useful value
Tell Hex Fiend to display it as an integer, or as a float, in either little endian or big endian, until it gave a plausible-looking result (i.e., 45.000 is a lot more plausible than some huge integer)
Search for that result in the results I had from the closed source program.
Document it, go back to step 1. (Except that normally the next chunk wouldn't be 'random', but would follow sequentially.)
In this case there were really only three (binary) variables for how to interpret data:
float or integer
2 bytes or 4 bytes
little or big endian
With more variables the task would be a lot harder. It would have been nice if Hex Fiend could search for integers/floats directly, perhaps trying out the different combinations. Perhaps other hex viewers do.
And to answer one of my original questions, 74078 turned out to be stored as 5E2101. A bit more trial and error and I would have got there. :)
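If anyone wants to do the same trial and error programmatically, a few lines with Python's struct module cover it (this is just a sketch, not part of the workflow I actually used):

    import struct

    # 74078 as a 4-byte little-endian unsigned integer gives the bytes seen in the file.
    print(struct.pack('<I', 74078).hex())        # '5e210100'  ->  5E 21 01 00

    # Going the other way: try each plausible interpretation of a 4-byte chunk and
    # keep whichever looks sensible (45.0 is more plausible than a huge integer).
    chunk = bytes.fromhex('00003442')
    for fmt, label in [('<i', 'int32 LE'), ('>i', 'int32 BE'),
                       ('<f', 'float LE'), ('>f', 'float BE')]:
        print(label, struct.unpack(fmt, chunk)[0])   # here 'float LE' gives 45.0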
UPDATE
If I was doing this over, I'd use "Synalyze It!", a tool designed for exactly this purpose.
I went to my bank website the other day and entered my account number with a trailing space. An error message popped up that said, "Account number must consist of numeric values only." I thought to myself, "Seriously?! You couldn't have just stripped the space for me?". If I were any less of a computer geek, I might even have thought, "What? There are only numbers in there!" (not being able to see the space).
The Calculator that comes with Ubuntu on the other hand merrily accepts spaces and commas, but oddly doesn't like trailing dots (without any ensuing digits).
So, that begs the question. Exactly how forgiving should web forms be? I don't think trimming whitespace is too much to ask, but what about other integer fields?
Should they allow +/- signs?
How many spaces should be allowed between the sign and the number?
What about commas for thousands separators?
What about other parts of the world that use dots instead?
What if they're in between every 4 digits instead of every 3?
What about hexadecimal and octal representations?
Scientific notation?
What if I accidentally hit the quote button when I'm trying to hit enter, should that be stripped too?
It would be very easy for me to strip out all non-digit characters, and that would be extremely forgiving, but what if the user made an actual mistake that affects the input and should have been caught, but now I've just stripped it out?
What about things like phone numbers (which have a huge variety of formats), postal codes, zip codes, credit card numbers, usernames, emails, URLs (should I assume http? What about .com while I'm at it?)?
Where do you draw the line?
For something as important as banking, I don't mind it complaining about my input, especially if the other option is mistakenly transferring a bucketload of money into some stranger's account instead of my wife's (because of a missing or incorrect digit for example).
A classic example is one of my banks which disallows monetary values unless they have ".99" at the end (where 9 can be any digit of course). The vast majority of things I do are for exact dollar amounts and it's a tad annoying to have to always enter 500.00 instead of just 500.
But I'll be happier about that the first time I avoid accidentally paying somebody $5072 instead of $50.72 just because I forgot the decimal point. Actually, that's pretty unlikely since it also asks for confirmation and I'm pretty anal in controlling my money :-)
Having said that, the general rule I try to follow is "be liberal in what you accept, be strict in what you produce".
This allows other software using my output to expect a limited range of possibilities (making their lives easier). But it makes my software more useful if it can handle simple misteaks.
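As a (purely illustrative) example of that rule applied to a whole-number field, in Python:

    # Sketch only: liberal on input, strict on output.  Exactly which characters to
    # forgive is the judgement call being discussed; here it's padding, a sign and
    # thousands commas.
    def parse_whole_number(text):
        cleaned = text.strip().replace(',', '')
        sign = 1
        if cleaned[:1] in ('+', '-'):
            sign = -1 if cleaned[0] == '-' else 1
            cleaned = cleaned[1:]
        if not cleaned.isdigit():
            raise ValueError("not a whole number: %r" % text)
        return sign * int(cleaned)     # output is always a plain int

    print(parse_whole_number('  1,234 '))   # 1234
    print(parse_whole_number('+500'))       # 500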
You draw the line at the point where the computer is guessing at what the correct input should be.
For example, a license key input box I wrote once accepts spaces and dashes and both upper and lower case, even though internally the keys were without said spaces, dashes and were all upper case. I could do that, since I knew that none of the keys actually had spaces or dashes.
Your example about URLs is another good one. I've noticed that in modern browsers (I'm using Chrome), when something like 'flowers' is typed into the address bar, the browser knows it should search for it since it's not a valid URL. If instead I type 'st', it auto-corrects (or auto-suggests) 'stackoverflow.com' since it's a bookmark.
A well-written input system will complain when it would otherwise be forced to guess what the correct input should be.
Numeric input:
Stripping non-digits seems reasonable to me, but the problem is conflicting decimal notation. Some regions expect , (comma) to denote the decimal separator, while others use . (period). Unless the input would likely be in other bases, I would only assume base 10. If it's reasonable to assume non-base 10 input (base-16 for color input, for example), I would go with standard conventions for denoting the bases: leading 0 means base 8, leading 0x means base 16.
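Just to spell out that convention (a sketch; whether a web form should honour it at all is exactly the question here):

    # Conventional prefix rules: leading 0x means base 16, a bare leading 0 means
    # base 8, anything else is base 10.  Illustration only.
    def parse_prefixed_int(text):
        s = text.strip().lower()
        if s.startswith('0x'):
            return int(s, 16)
        if s.startswith('0') and len(s) > 1:
            return int(s, 8)
        return int(s, 10)

    print(parse_prefixed_int('0x1f'))    # 31
    print(parse_prefixed_int('01234'))   # 668
    print(parse_prefixed_int('1234'))    # 1234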
String input:
This gets a lot more complicated. It mostly depends on what the input is actually meant to represent. A username should exclude characters that will cause trouble, but the meaning of 'cause trouble' will vary depending on the use of the application and the system itself. URLs have a concrete definition of what qualifies, but that definition is rather broad. Fortunately, many languages come with tools to discern URLs, without you having to code your own parsing (whether the language does it perfectly or not is another question).
In the end, it's really a case-by-case basis. I do like paxadiablo's general rule, though: Accept as much as you can, output only what you must.
It totally depends on how the data is going to be used.
If the input is a monetary amount, for a transaction for example, then the inputted variable should be normalised to a set of standards for sure.
If it's simply a case of a phone number, then it is unlikely the stored data will serve any functional use, so you can be more forgiving.
There is nothing wrong with forcing a correct format to make the displayed data look nicer, but you have to balance user irritation against micro benefits.
Once you start collecting data you can scan through it and see what sort of patterns emerge, and you can automatically strip the input formatting.
Where do you draw the line?
When the consequences of accepting "invalid" data outweigh the irritation of not accepting it.
Should they allow +/- signs?
If negative values are valid, then of course they should.
If not, then don't just silently strip minus signs, as it totally changes the meaning of the data. Stripping pluses is less of a problem.
What if [thousands separators are] in between every 4 digits instead of every 3?
In countries that use three-digit grouping, "1,0000" can be assumed to be a typo. But is it a typo for "10,000" or for "1,000"? I wouldn't dare guess, as a wrong guess could cost the user $9,000.
What about hexadecimal and octal representations?
Unless you're running the search feature for unicode.org, I can't imagine why anyone would use hexadecimal in a web form.
And "01234" is almost certainly intended to be 1234 instead of 668.
What about things like...credit card numbers
Please allow spaces or hyphens in credit card numbers. It's really annoying when I have to type an undelimited 16-digit number.
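A sketch of the forgiving version (strip the usual separators, then use the standard Luhn check digit to catch most typos rather than silently accepting them; the function names are mine):

    # Illustration only: normalise a card number, then run the Luhn check.
    def normalise_card_number(text):
        digits = text.replace(' ', '').replace('-', '')
        if not digits.isdigit():
            raise ValueError("card number may contain only digits, spaces and hyphens")
        return digits

    def luhn_ok(digits):
        total = 0
        for i, ch in enumerate(reversed(digits)):
            d = int(ch)
            if i % 2 == 1:        # double every second digit from the right
                d *= 2
                if d > 9:
                    d -= 9
            total += d
        return total % 10 == 0

    print(luhn_ok(normalise_card_number('4111 1111 1111 1111')))   # True (test number)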
I think you're overreacting a little bit. If there's anything in the field that shouldn't be there, strip it. Otherwise, try to force the input into whatever format you want, and if it doesn't fit, reject it.
I would say "Accept anything but process only valid data".
Expect your users to behave like a computer noob. Validate the input data using regular expressions and other validators.
Search for standard regular expressions for urls, emails and stuff.
Throw in a regular expression like "/([a-zA-Z0-9]+)[\s,]+([a-zA-Z0-9]+)$/" for comma- or space-separated values. With minor tweaking this expression will work for any number of comma-separated values.
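For instance, one such tweak (a sketch, in Python) repeats the value-plus-separator group so the list can be any length:

    import re

    # Anchored version that accepts any number of comma- or space-separated values.
    pattern = re.compile(r'^(?:[a-zA-Z0-9]+[\s,]+)*[a-zA-Z0-9]+$')

    print(bool(pattern.match('red, green, blue')))   # True
    print(bool(pattern.match('red green blue')))     # True
    print(bool(pattern.match('red; green')))         # False, ';' is not allowed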
The one that irritates me as a user is credit card numbers. Conventionally these appear as groups of 4 digits with spaces separating them, but the odd web form will only accept a single string of digits, with no spaces and no indication that this is the format it's seeking. Similarly with telephone numbers: humans often use spaces to improve clarity, and web forms sometimes accept the spaces and sometimes don't.
I'd like to write a program that lets users draw points, lines, and circles as though with a straightedge and compass. Then I want to be able to answer the question, "are these three points collinear?" To answer correctly, I need to avoid rounding error when calculating the points.
Is this possible? How can I represent the points in memory?
(I looked into some unusual numeric libraries, but I didn't find anything that claimed to offer both exact arithmetic and exact comparisons that are guaranteed to terminate.)
Yes.
I highly recommend Introduction to constructions, which is a good basic guide.
Basically you need to be able to compute with constructible numbers - numbers that are either rational, or of the form a + b sqrt(c) where a, b, c were previously constructed (see page 6 of that PDF). This could be done with an algebraic data type (e.g. data C = Rational Integer Integer | Root C C C in Haskell, where Root a b c = a + b sqrt(c)). However, I don't know how to perform comparisons with that representation.
Two possible approaches are:
Constructible numbers are a subset of algebraic numbers, so you can use algebraic numbers.
Every algebraic number can be represented by a polynomial of which it is a root. The operations are computable, so if you represent a number a with polynomial p and b with polynomial q (p(a) = q(b) = 0), then it is possible to find a polynomial r such that r(a+b) = 0. This is done in some CASes such as Mathematica, for example. See also: Computational algebraic number theory - chapter 4
Use Tarski's test and represent numbers by formulas. It is slow (doubly exponential or so), but it works :) Example: to represent sqrt(2), use the formula x^2 - 2 = 0 && x > 0. You can write equations for lines there, check whether points are collinear, etc. See A suite of logic programs, including Tarski's test
If you turn to general computable numbers, then equality, collinearity, etc. become undecidable.
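As a concrete illustration of the symbolic approach (my own sketch; none of the methods above is tied to this particular tool), a computer algebra system such as SymPy can settle collinearity exactly:

    from sympy import sqrt, Rational, Matrix, simplify

    # Three points; the coordinates involve sqrt(2), which no float represents exactly.
    A = (Rational(0), Rational(0))
    B = (Rational(1), sqrt(2))
    C = (Rational(2), 2 * sqrt(2))

    # The points are collinear exactly when this determinant (twice the signed
    # triangle area) is zero; SymPy decides that symbolically, not by rounding.
    det = Matrix([[1, A[0], A[1]],
                  [1, B[0], B[1]],
                  [1, C[0], C[1]]]).det()

    print(simplify(det) == 0)   # True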
I think the only way this would be possible is if you used a symbolic representation, as opposed to trying to represent coordinate values directly -- so you would have to avoid trying to coerce values like sqrt(2) into some numerical format. You will be dealing with irrational numbers that are not finitely representable in binary, decimal, or any other positional notation.
To expand on Jim Lewis's answer slightly, if you want to operate on points that are constructible from the integers with exact arithmetic, you will need to be able to operate on representations of the form:
a + b sqrt(c)
where a, b, and c are either rational numbers, or representations in the form given above. Wikipedia has a pretty decent article on the subject of what points are constructible.
Answering the question of exact equality (as necessary to establish collinearity) with such representations is a rather tricky problem.
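For the simplest case, where every coordinate lives in a single field Q(sqrt(c)) for one fixed c, the pair (a, b) is a unique representation, so equality is just component-wise comparison. A minimal Python sketch (my own illustration, not code from this answer):

    from fractions import Fraction

    class QuadSqrt2:
        """Exact numbers of the form a + b*sqrt(2) with rational a and b (sketch only)."""
        def __init__(self, a, b=0):
            self.a, self.b = Fraction(a), Fraction(b)
        def __add__(self, other):
            return QuadSqrt2(self.a + other.a, self.b + other.b)
        def __mul__(self, other):
            # (a + b*sqrt(2)) * (c + d*sqrt(2)) = (ac + 2bd) + (ad + bc)*sqrt(2)
            return QuadSqrt2(self.a * other.a + 2 * self.b * other.b,
                             self.a * other.b + self.b * other.a)
        def __eq__(self, other):
            return self.a == other.a and self.b == other.b   # exact, no rounding

    x = QuadSqrt2(1, 1)                # 1 + sqrt(2)
    print(x * x == QuadSqrt2(3, 2))    # True: (1 + sqrt(2))^2 = 3 + 2*sqrt(2), exactly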
If you try to compare co-ordinates for your points, then you have a problem. Leaving aside co-linearity for a moment, how about just working out whether two points are the same or not?
Suppose one point has given co-ordinates, and the other is a compass-and-straightedge construction starting from certain other co-ordinates; you want to determine with certainty whether they're the same point or not. Either way it is a theorem of Euclidean geometry; it's not something you can just measure. You can prove they aren't the same by spotting some difference in their co-ordinates (for example by computing decimal places of each until you encounter a difference). But in general, proving that they are the same cannot be done by approximate methods. Compute as many decimal places as you like of expansions of 1/sqrt(2) and sqrt(2)/2, and you can prove they're very close together, but you won't ever prove they're equal. That takes algebra (or geometry).
Similarly, to show that three points are collinear you will need theorem-proving software. Represent the points A, B, C by their constructions, and attempt to prove the theorem "A, B and C are collinear". This is very hard - your program will prove some theorems but not others. Much easier is to ask the user for a proof that they are collinear, and then verify (or refute) that proof, but that's probably not what you want.
In general, constructible points may have an arbitrarily complex symbolic form, so you must use a symbolic representation to work with them exactly. As Stephen Canon noted above, you often need numbers of the form a + b*sqrt(c), where a and b are rational and c is an integer. For a fixed c, all numbers of this form are closed under the arithmetic operations. I have written some C++ classes (see rational_radical1.h) to work with these numbers if that is all you need.
It is also possible to construct numbers which are sums of any number of terms of rational multiples of radicals. When dealing with more than a single radicand, the numbers are no longer closed under multiplication and division, so you will need to store them as variable length rational coefficient arrays. The time complexity of operations will then be quadratic in the number of terms.
To go even further, you can construct the square root of any given number, so you could potentially have nested square roots. Here, the representations must be tree-like structures to deal with root hierarchy. While difficult to implement, there is nothing in principle preventing you from working with these representations. I'm not sure just what additional numbers can be constructed, but beyond a certain point, your symbolic representation will be expressive enough to handle very large classes of numbers.
Addendum
Found this Google Books link.
If the grid axes are integer-valued then the answer is fairly straightforward: the points are either exactly collinear or they are not.
Typically, however, one works with real numbers (well, floating point) and then draws the rounded values on the screen, which does exist in integer space. In this case you have no choice but to pick a tolerance and use it to determine collinearity. Keep it small and the users will never know the difference.
You seem to be asking, in effect, "Can the normal mathematics (integer or floating point) used by computers be made to represent real numbers perfectly, with no rounding errors?" And, of course, the answer to that is "No." If you want theoretical correctness, then you will be stuck with the much harder problem of symbolic manipulation and coding up the equivalent of the inferences that are done in geometry. (In short, I'm agreeing with Steve Jessop, above.)
Some thoughts in the hope that they might help.
The sort of constructions you're talking about will require multiplication and division, which means that to preserve exactness you'll have to use rational numbers, which are generally easy to implement on top of a suitable sort of big integer (i.e., of unbounded magnitude). (Common Lisp has these built-in, and there have to be other languages.)
Now, you need to represent square roots of arbitrary numbers, and these have to be mixed in.
Therefore, a number is one of: a rational number, a rational number multiplied by a square root of a rational number (or, alternately, just the square root of a rational), or a sum of numbers. In order to prove anything, you're going to have to get these numbers into some sort of canonical form, which for all I can figure offhand may be annoying and computationally expensive.
This of course means that the users will be restricted to rational points and cannot use arbitrary rotations, but that's probably not important.
I would recommend not trying to make it perfectly exact.
The first reason is the one you are asking about here: the rounding error and all the other baggage that comes with floating-point calculations.
The second is that you have to round your input anyway, since the mouse and screen work with integers. So initially all user input would be integers, and your output would be integers.
Besides, from a usability point of view, it's easier to click in the neighbourhood of another point (on a line, for example) and have the interface consider that you are clicking on the point itself.
I understand the reasons for and against ROT13, but I'm wondering why specifically people have chosen 13 places to shift the alphabet? I understand it's halfway around, but is there an elegant reason to go -that- far, but not 12 or 14 spots?
It seems to me that making each letter "as far away" as possible from its starting position is only meaningful to a human who might recognize "close" characters (although I doubt this is possible/probable).
Anyone know the answer to this?
Because it has the nice property of being involutive, that is to say, ROT13(ROT13(alphaOnlyString)) = alphaOnlyString.
According to Wikipedia:
A shift of thirteen was chosen over other values, such as three as in the original Caesar cipher, because thirteen is the value for which encoding and decoding are equivalent, thereby allowing the convenience of a single command for both.
Probably because it is its own inverse. The same algorithm can be used for "encryption" as well as "decryption".
Because shifting by 13 moves the characters half way around the alphabet (which has 26 places). So, to get back to plaintext you only need to shift it 13 moves again. This way, you don't have to have separate functions for encoding or decoding because the same operation will be encode or decode.
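A tiny illustration of that "same operation both ways" property (Python, just for demonstration; the standard library's rot_13 codec gives the same result):

    import codecs

    def rot13(text):
        out = []
        for ch in text:
            if 'a' <= ch <= 'z' or 'A' <= ch <= 'Z':        # ASCII letters only
                base = ord('A') if ch.isupper() else ord('a')
                out.append(chr(base + (ord(ch) - base + 13) % 26))
            else:
                out.append(ch)                               # leave everything else alone
        return ''.join(out)

    msg = "Why thirteen?"
    print(rot13(msg))                     # Jul guvegrra?
    print(rot13(rot13(msg)) == msg)       # True: applying it twice gets the original back
    print(codecs.encode(msg, 'rot_13'))   # Jul guvegrra?  (same thing via the stdlib codec)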