I am tasked with finding the binary representation of the number 3.4219087*10^12. This is a very large number (and I have to do this by hand), so I was wondering if there is some sort of shortcut or technique I could use to convert it instead.
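For what it's worth, the usual shortcut is to go through hexadecimal: repeatedly divide by 16 (far fewer divisions than dividing by 2), then expand each hex digit into its 4-bit pattern. Here is a small Python sketch you could use to check a hand computation against (the number is the one from the question):

    n = 3_421_908_700_000                     # 3.4219087 * 10**12
    digits = []
    while n:                                  # repeated division by 16
        n, r = divmod(n, 16)
        digits.append("0123456789abcdef"[r])
    print("".join(reversed(digits)))          # hex digits, most significant first
    print(format(3_421_908_700_000, "x"))     # built-in check: same hex string
    print(format(3_421_908_700_000, "b"))     # full binary, 4 bits per hex digit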
I am currently struggling with a university homework assignment, which can be summarized as follows:
The user inputs a binary number.
That number should be split up into parts. Every part is assigned to a Thread, and the Thread takes care of the conversion. After all Threads have finished converting their parts, the results are added and the final decimal number is displayed.
Maybe my way of thinking is completely wrong, but is it even possible to just split a binary number up into parts, convert every part, and add the results up to get the original decimal number?
Thanks in advance, and also sorry in advance if I'm completely dumb!
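The splitting does work, because binary is positional: each chunk just has to be weighted by the number of bits that sit to its right before the partial results are summed. Here is a minimal, sequential Python sketch of the idea (the function name, chunk count, and example input are my own; in the actual homework each chunk would go to its own Thread):

    def chunked_binary_to_decimal(bits: str, n_parts: int = 4) -> int:
        size = -(-len(bits) // n_parts)                    # ceiling division
        chunks = [bits[i:i + size] for i in range(0, len(bits), size)]
        total = 0
        for idx, chunk in enumerate(chunks):
            shift = sum(len(c) for c in chunks[idx + 1:])  # bits to the right of this chunk
            total += int(chunk, 2) << shift                # convert the part, then weight it
        return total

    bits = "1101011010111001"
    print(chunked_binary_to_decimal(bits), int(bits, 2))   # both print the same value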
I have about 300 measurements (each stored in a dat file) that I would like to read using MATLAB or Python. The files can be exported to text or csv using a proprietary program, but this has to be done one by one.
The question is: what would be the best approach to crack the format of the binary file using the known content from the exported file?
Not sure if this makes the cracking any easier, but the files are just two columns of (900k) numbers, and from the dat files' size (1,800,668 bytes) it appears as if each number is 16 bits (float) and there is some other information (possibly the header).
I tried using HEX-Editor, but wasn't able to pick up any trends from there.
Lastly, I want to make sure to specify that these are measurements I made and the data in them belongs to me. I am not trying to obtain data that I am not supposed to.
Thanks for any help.
EDIT: Reading up a little more, I realized that there may be some kind of compression going on. When you look at the data in StreamWare, it gives 7 decimal places, leading me to believe that it is a single precision value (4 bytes). However, the size of the files suggests that each value only takes 2 bytes.
After thinking about it a little more, I finally figured it out. This is very specific, but just in case another Dantec StreamWare user runs into the same problem, it could save him/her a little time.
First, the data is actually only a single vector. The time column is calculated from the length of the recorded signal and the sampling frequency. That information is probably in the header (but I wasn't able to crack that portion).
To obtain the values in MATLAB, I skipped the header bytes using fseek(fid, 668, 'bof'), then I read the data as uint16 using fread(fid, 900000, 'uint16'). This gives you integers.
To get the float value, all you have to do is divide by 2^16 (it's a 16-bit resolution system) and multiply by ten. I assume the factor of ten depends on the range of your data acquisition system.
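Since the question also mentioned Python, here is a rough equivalent of the MATLAB snippet above as a sketch (the file name is hypothetical; the header offset, sample count, byte order, scale factor, and sampling frequency are assumptions based on the description and may differ for other setups):

    import numpy as np

    HEADER_BYTES = 668
    N_SAMPLES = 900_000
    FS = 1_000.0                                           # sampling frequency in Hz (replace with yours)

    with open("measurement.dat", "rb") as f:
        f.seek(HEADER_BYTES)                               # skip the header
        raw = np.fromfile(f, dtype="<u2", count=N_SAMPLES) # little-endian uint16

    values = raw / 2**16 * 10        # 16-bit resolution, assumed 0-10 V input range
    t = np.arange(N_SAMPLES) / FS    # time axis reconstructed from the sampling frequency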
I hope this helps.
I was wondering why a computer would need binary code converters to convert from BCD to Excess-3, for example. Why is this necessary? Can't computers just use one form of binary code?
Some older forms of binary representation persist even after a newer, "better" form comes into use. For example, legacy hardware may still be in use, running legacy code that would be too costly to rewrite. Word lengths were not standardized in the early years of computing, so machines with words varying from 5 to 12 bits in length naturally required different schemes for representing the same numbers.
In some cases, a company might persist in using a particular representation for compatibility with its own older products, or because it's an ingrained habit or "the company way." For example, the use of big-endian representation in Motorola and PowerPC chips vs. little-endian representation in Intel chips. (Though note that many PowerPC processors support both types of endian-ness, even if manufacturers typically only use one in a product.)
The previous paragraph only really touches upon byte ordering, but that can still be an issue for data interchange.
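As a tiny illustration of the byte-ordering point (Python's struct module here, but any language shows the same thing):

    import struct

    # The same 32-bit value written out in the two byte orders mentioned above.
    print(struct.pack(">I", 0x12345678).hex())   # big-endian:    '12345678'
    print(struct.pack("<I", 0x12345678).hex())   # little-endian: '78563412'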
Even for BCD, there are many ways to store it (e.g., 1 BCD digit per word, or 2 BCD digits packed per byte). IBM has a clever representation called zoned decimal where they store a value in the high-order nybble which, combined with the BCD value in the low-order nybble, forms an EBCDIC character representing the value. This is pretty useful if you're married to the concept of representing characters using EBCDIC instead of ASCII (and using BCD instead of 2's complement or unsigned binary).
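For example, here is a minimal sketch of the packed-BCD scheme (two digits per byte); the zoned-decimal/EBCDIC variant instead puts a zone value in the high-order nybble, which I've left out since it depends on the character set:

    def to_packed_bcd(value: int) -> bytes:
        digits = [int(d) for d in str(value)]
        if len(digits) % 2:                    # pad to an even number of digits
            digits.insert(0, 0)
        return bytes((hi << 4) | lo for hi, lo in zip(digits[0::2], digits[1::2]))

    print(to_packed_bcd(1234).hex())           # '1234': one decimal digit per nybble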
Tangentially related: IBM mainframes from the 1960s apparently converted BCD into an intermediate form called qui-binary before performing an arithmetic operation, then converted the result back to BCD. This is sort of a Rube Goldberg contraption, but according to the linked article, the intermediate form gives some error detection benefits.
The IBM System/360 (and probably a bunch of newer machines) supported both packed BCD and pure binary representations, though you have to watch out for IBM nomenclature — I have heard an old IBMer refer to BCD as "binary," and pure binary (unsigned, 2's complement, whatever) as "binary coded hex." This provides a lot of flexibility; some data may naturally be best represented in one format, some in the other, and the machine provides instructions to convert between forms conveniently.
In the case of floating point arithmetic, there are some values that cannot be represented exactly in binary floating point, but can be with BCD or a similar representation. For example, the number 0.1 has no exact binary floating point equivalent. This is why BCD and fixed-point arithmetic are preferred for things like representing amounts of currency, where you need to exactly represent things like $3.51 and can't allow floating point error to creep in when adding.
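A quick illustration of that point (Python's float is a binary double; Decimal stands in for the BCD/fixed-point side):

    from decimal import Decimal

    print(0.1 + 0.2 == 0.3)                    # False: binary rounding error creeps in
    print(Decimal("3.51") + Decimal("0.10"))   # exactly 3.61

    # Converting an existing float is best done via its string form; otherwise
    # you capture the float's binary rounding error exactly as it is.
    print(Decimal(0.1))                        # 0.1000000000000000055511151231257827...
    print(Decimal(str(0.1)))                   # 0.1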
Intended application is important. Arbitrary precision arithmetic (e.g., Java's BigDecimal class) requires a different representation strategy than the fixed-width registers in your CPU. Many languages support arbitrary precision (e.g., Scheme, Haskell), though the underlying implementation of arbitrary precision numbers varies. I'm honestly not sure what is preferable for arbitrary precision, a BCD-type scheme or a denser pure binary representation. In the case of Java's BigDecimal, conversion from binary floating point to BigDecimal is best done by first converting to a String; this makes such conversions potentially inefficient, so you really need to know ahead of time whether float or double is good enough, or whether you really need arbitrary precision, and when.
Another tangent: Groovy, a JVM language, quietly treats all floating point numeric literals in code as BigDecimal values, and uses BigDecimal arithmetic in preference to float or double. That's one reason Groovy is very popular with the insurance industry.
tl;dr There is no one-size-fits-all numeric data type, and as long as that remains the case (probably until the heat death of the universe), you'll need to convert between representations.
There are some scenarios where programmers need or want to find grossly large numbers. These are often so large that they defy the programmer's comprehension. I'm talking about things like the largest known prime number (with 12,978,189 digits) and the recently calculated 10 trillion digits of pi.
How can you create a program that handles these? This far exceeds an integer, a long, a double, a BigInteger, a BigDecimal, or anything of the sort. How do these kinds of programs for discovering these numbers get created? How can you even store them in memory when no appropriate datatypes exist, and they would likely consume gigabytes of data each?
To address your specific examples:
A 12 million digit integer isn't terribly large for a typical "large integer" class to handle; in binary form it takes only around 5 MB, so it fits comfortably in memory.
To store 10 trillion digits of π, you could use a disk file and memory-map it. You'll need a 64 bit OS and application, but you can simply create a 10 terabyte file on disk (you'll probably need a few disks and a filesystem like ZFS that can store it across disks), and map it into CPU address space. The algorithms that calculate π (such as BBP) conveniently calculate one hex digit at a time which fits well into half a byte of memory.
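Here is a sketch of the memory-mapping idea, scaled down to a toy file (the file name and size are made up; a real multi-terabyte file additionally needs a 64-bit OS and a filesystem that can hold it):

    import mmap

    N_BYTES = 1 << 20                      # 1 MiB here, just to keep the demo small

    with open("pi_digits.bin", "wb") as f:
        f.truncate(N_BYTES)                # create a file of the desired size

    with open("pi_digits.bin", "r+b") as f:
        digits = mmap.mmap(f.fileno(), 0)  # map the whole file into address space
        digits[0:2] = b"\x31\x41"          # read and write it like one big bytearray
        print(digits[0:4])
        digits.close()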
The (abstract) answer is to write algorithms using the machine's native types that produce the results you want. For instance, when you do addition by hand on paper of two very large integers, the biggest actual calculation you need is only 9+9+1 (nine plus nine plus one for the carry). Of course you need paper large enough to write the two numbers down in the first place, and the answer as well. So as long as the two numbers and the answer can be stored on a computer's hard disk (the paper), an algorithm can be written that does it with variables that only ever need to hold a value up to 19; even a char variable is more than capable of handling that, let alone an int.
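A minimal sketch of that "paper" addition, with the numbers kept as digit strings (the "paper") so the only arithmetic ever performed fits in the smallest native type:

    def add_decimal_strings(a: str, b: str) -> str:
        width = max(len(a), len(b))
        a, b = a.zfill(width), b.zfill(width)
        result, carry = [], 0
        for da, db in zip(reversed(a), reversed(b)):
            carry, digit = divmod(int(da) + int(db) + carry, 10)  # at most 9 + 9 + 1
            result.append(str(digit))
        if carry:
            result.append(str(carry))
        return "".join(reversed(result))

    print(add_decimal_strings("987654321987654321", "123456789123456789"))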
The (concrete) answer is that really good programmers have already done this, and there are even FOSS libraries for it. One good one is the GNU Project's GMP library, which has loads of functions to handle arbitrary-size integer arithmetic and arbitrary-precision floating point arithmetic. So as long as your computer can store the information needed during the calculation, it can be done. You'll need to invest the time to read the documentation, of course.
I am clustering a bunch of words with the k-means algorithm in RapidMiner 5.2.
I am converting nominal to numerical before the clustering. However, to really inspect my clustering, I need to see the numbers back as words. How can I convert them back?
Use the Parse Numbers or Guess Types operators.