Automatic integer factoring for 310-digit decimal numbers - prime-factoring

Is there software capable of factoring a 310-digit decimal integer into primes? I successfully used msieve to factor 120-digit numbers, but 310 digits exceeds msieve's maximum allowed input of 308 digits.
PS: the number to factor has 2 prime factors, and p-1, p+1, and other easy and fast factoring methods are likely to fail.
UPDATE: It seems only GGNFS will work, and there are some Python scripts to automate the factoring.
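For context on the "easy" methods mentioned above: Pollard's p-1 only finds a prime factor p when p-1 is a product of small primes, which is why it is expected to fail on a well-chosen semiprime. A minimal sketch in Python follows; the function name and the `bound` parameter are illustrative and not taken from msieve or GGNFS.

```python
from math import gcd

def pollard_p_minus_1(n, bound=1000):
    """Try to find a nontrivial factor of n with Pollard's p-1 method.

    Succeeds only if some prime factor p of n has p-1 dividing k! for a
    small k <= bound. Returns a factor, or None if nothing was found.
    """
    a = 2
    for k in range(2, bound + 1):
        a = pow(a, k, n)          # after this step, a = 2^(k!) mod n
        d = gcd(a - 1, n)
        if 1 < d < n:
            return d
    return None

# 299 = 13 * 23, and 13 - 1 = 12 divides 4!, so the factor 13 is found quickly.
print(pollard_p_minus_1(299))
```

For a 310-digit semiprime whose factors were chosen so that p-1 has a large prime factor, this loop would simply run out its bound and return None.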

Use Lenstra's EC algorithm if it's not a semiprime. Otherwise use Pomerance's NFS. Good introductions exist for both of these "boxes." My bet is to browse the homepages of Lenstra and Pomerance; they're both really good at exposition. Or check out "Number Theory: A Programmer's Guide", by Mark Herkommer. It's just what you need, nothing more, and very clear.
EDIT: Although a 1000-bit modulus may be a bit of a stretch, assuming you have conventional hardware.
EDIT: Sure, some additional links: http://tinyurl.com/herkAmzon for the Herkommer book.
A 1987 paper on EC factoring from Hendrik Lenstra's homepage: Factoring integers with elliptic curves, Ann. of Math. 126, 649-673.
From the vast net: A very simple Python source code for the above algorithm (which I haven't proofread).
Carl Pomerance's homepage and a relevant paper on the Number Field Sieve is here.
You may also find this narrative on the sieve's development useful, or this exposition on the quadratic version, also from Pomerance's page.
Check out this site dedicated to an implementation of the GNFS, but I strongly recommend finding a copy of the Herkommer book, which contains clear, simple source code on a few pages.
EDIT: Also consider running the factoring on the Elastic Compute Cloud. I hear a guy does it overnight for $75, as per this Wired article.


Reasons and history for choice of common comment signs

Most programming languages use // or # for a single-line comment (see wiki). It seems that # is used especially in interpreted languages. According to this question, the reason seems to be that one of the early shells (the Bourne shell) used '#' as a comment and made use of it (shebang).
Is there a logical reason to choose # as a comment sign (e.g. to symbolize crossing out with #)? And why do we use // as a comment sign in many compiled languages (especially in C, as it seems to be one of the earliest compiled languages with that symbol)? Are there logical reasons for that? Why not use # instead of //, or // instead of #?
Is there a logical reason why to choose # as a comment sign [in early shells]?
The Bourne shell tokenizer is quite simple. To add comment line support, a single character identifier was the simplest, and logical, choice.
The set of single characters you can choose from, if you wish to be compatible with both EBCDIC and ASCII (the two major character sets used at that time), is quite small:
! (logical not in bc)
#
% (modulo in bc)
@
^ (power in bc)
~ (used in paths)
Now, I've listed the ones used in bc, the calculator used in the same time period, not because they were a reason, but because you should understand the context of the Bourne shell developers and users. The bc notation did not appear out of thin air; the prevailing preferences influenced the choice, because the developers wanted the syntax to be intuitive, at least for themselves. The above bc notes are therefore useful in showing what kind of associations contemporary developers had with specific characters. I don't intend to imply that bc necessarily had an impact on Bourne shell -- but I do believe it did; that one of the reasons for developing the Bourne shell was to make using and automating tools like bc easier.
Effectively, only # and @ were "unused" characters available in both ASCII and EBCDIC; and it appears "hash" won over "at".
And why do we use // as a comment sign in many compiled languages?
The // comment style is from BCPL. Many of the BCPL tokens and operators were already multiple characters long, and I suspect that at the time the developers considered it better (for interoperability) to reuse an already used character for the comment line token, rather than introduce a completely new character.
I suspect that the // comment style has a historical background in margin notes; a double vertical line used to separate the actual content from notes or explanations being a clear visual separator to even those not familiar with the practice.
Why not use # instead of //, or [vice versa]?
In both of the cases above, there is clear logic. However, that does not mean that these were the only logical choices available. These are just the ones that made the most sense to the developers at the time when the choice was made -- and I've tried to shed some light on the possible reasons, the context for the choices, above.
If these kinds of questions interest you, I recommend you find old math and science (physics in particular) books, and perhaps even reproductions of old notes. Best tools are intuitive, you see; and to find what was intuitive to someone, you need to find out the context they worked in. I am absolutely certain you can find interesting "reasons" -- things that made certain choices logical and intuitive to them, while to us they may seem odd -- by finding out the habits of the early developers and their colleagues and mentors.

Reverse engineering GCN binary (ELF) in C#, but not an expert on elf or elf conventions. Do some of these sections/section names look familiar?

My first exposure to the ELF format binary began less than 2 weeks ago; please excuse the crudeness of my grasp of them (and of course, correction of any misconceptions I display here would be welcome).
The story so far: I have some GCN binaries which I am trying to fully reverse engineer so that I might be able to generate my own with a higher degree of control (i.e. limiting the number of intermediate steps executed by code not my own and not entirely within my understanding). What I've found from some resources online and my own delving is that each binary contains two ELF structures; the first is fairly small, containing three sections (no program headers) named "", ".shstrtab", and ".ddiPipelineHeader".
The ".ddiPipelineHeader" section is 48 bytes long, with byte 0 being a 1, and bytes 16-19 containing what appears to be a 32-bit integer that corresponds to the number of bytes in the binary from the start of the second ELF structure. All the other bytes in this section are 0. A Google search for ".ddiPipelineHeader" returned exactly 1 result, which I didn't find useful. Before I run off all half-cocked into dangerous, crashy GPU experimentation-land, does this section's structure sound at all familiar? Is there possibly an explanation of what each byte would do (e.g. bytes 4-15 are 0 padding, etc. etc.)?
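To make the observed layout concrete, here is a minimal Python sketch that decodes only what is described above. The field names (`flag`, `second_elf_size`) are hypothetical, and little-endian byte order is an assumption; nothing here comes from vendor documentation.

```python
import struct

HEADER_SIZE = 48  # section size reported in the question

def parse_ddi_pipeline_header(section: bytes) -> dict:
    """Decode the observed fields of a .ddiPipelineHeader section.

    All field names are hypothetical labels for the observations above.
    """
    if len(section) != HEADER_SIZE:
        raise ValueError(f"expected {HEADER_SIZE} bytes, got {len(section)}")
    flag = section[0]
    # Assumption: bytes 16-19 hold a little-endian unsigned 32-bit integer.
    (second_elf_size,) = struct.unpack_from("<I", section, 16)
    # List any other non-zero byte so unexpected fields do not go unnoticed.
    other_nonzero = [i for i, b in enumerate(section)
                     if b != 0 and i != 0 and not 16 <= i <= 19]
    return {
        "flag": flag,
        "second_elf_size": second_elf_size,
        "other_nonzero_offsets": other_nonzero,
    }
```

Running this over several binaries and comparing `second_elf_size` with the actual distance from the second ELF header to the end of the file is a cheap way to confirm or refute the size interpretation before experimenting on the GPU.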
I also have all the sections contained in the second ELF to ask about, but I figure I'll be able to delve more deeply into those with a better foundation gleaned here, so I'll hold off on that part for now.
Thanks for any insight!

Handling Double values on CUDA ( Compute Capability 1.1) [duplicate]

I am writing a program for embedded hardware that only supports 32-bit single-precision floating-point arithmetic. The algorithm I am implementing, however, requires 64-bit double-precision addition and comparison. I am trying to emulate the double datatype using a tuple of two floats. So a double d will be emulated as a struct containing the tuple: (float d.hi, float d.low).
The comparison should be straightforward using a lexicographic ordering. The addition, however, is a bit tricky because I am not sure which base I should use. Should it be FLT_MAX? And how can I detect a carry?
How can this be done?
Edit (Clarity): I need the extra significant digits rather than the extra range.
double-float is a technique that uses pairs of single-precision numbers to achieve almost twice the precision of single-precision arithmetic, accompanied by a slight reduction of the single-precision exponent range (due to intermediate underflow and overflow at the far ends of the range). The basic algorithms were developed by T.J. Dekker and William Kahan in the 1970s. Below I list two fairly recent papers that show how these techniques can be adapted to GPUs; however, much of the material covered in these papers is platform-independent, so it should be useful for the task at hand.
https://hal.archives-ouvertes.fr/hal-00021443
Guillaume Da Graça, David Defour
Implementation of float-float operators on graphics hardware,
7th conference on Real Numbers and Computers, RNC7.
http://andrewthall.org/papers/df64_qf128.pdf
Andrew Thall
Extended-Precision Floating-Point Numbers for GPU Computation.
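As a rough illustration of the idea in those papers: there is no "base" and no carry. The low word simply holds the rounding error of the high word, recovered with error-free transformations. The sketch below simulates binary32 arithmetic in Python by rounding after every operation; the helper names (`f32`, `split`, `df_add`, `df_less`) are my own, and the addition is the simpler "sloppy" variant, not a full implementation from either paper.

```python
import struct

def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def two_sum(a, b):
    """Knuth's TwoSum: s = fl(a + b) and e such that s + e == a + b exactly."""
    s = f32(a + b)
    bb = f32(s - a)
    e = f32(f32(a - f32(s - bb)) + f32(b - bb))
    return s, e

def quick_two_sum(a, b):
    """Dekker's FastTwoSum; assumes |a| >= |b|."""
    s = f32(a + b)
    e = f32(b - f32(s - a))
    return s, e

def split(x):
    """Represent a binary64 value as a (hi, lo) pair of binary32 values."""
    hi = f32(x)
    return hi, f32(x - hi)

def df_add(x, y):
    """Add two (hi, lo) pairs; the result is again a normalized pair."""
    s, e = two_sum(x[0], y[0])
    e = f32(e + f32(x[1] + y[1]))
    return quick_two_sum(s, e)

def df_less(x, y):
    """Lexicographic comparison, valid for normalized pairs."""
    return (x[0], x[1]) < (y[0], y[1])

a = split(1.0 + 2.0**-30)   # hi = 1.0, lo = 2**-30
b = split(2.0**-30)
print(df_add(a, b))          # the 2**-29 survives in the low word
```

A plain binary32 addition of the same two values would return 1.0, because 2**-30 is below half an ulp of 1.0 in single precision.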
This is not going to be simple.
A float (IEEE 754 single-precision) has 1 sign bit, 8 exponent bits, and 23 bits of mantissa (well, effectively 24).
A double (IEEE 754 double-precision) has 1 sign bit, 11 exponent bits, and 52 bits of mantissa (effectively 53).
You can use the sign bit and 8 exponent bits from one of your floats, but how are you going to get 3 more exponent bits and 29 bits of mantissa out of the other?
Maybe somebody else can come up with something clever, but my answer is "this is impossible". (Or at least, "no easier than using a 64-bit struct and implementing your own operations")
It depends a bit on what types of operations you want to perform. If you only care about additions and subtractions, Kahan Summation can be a great solution.
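A minimal sketch of Kahan summation, written in Python with ordinary doubles purely to show the principle; on the target hardware the same four lines would run in single precision. The function names are illustrative.

```python
def kahan_sum(values):
    """Sum values while carrying a running compensation for lost low bits."""
    total = 0.0
    c = 0.0                      # compensation: error of the previous addition
    for x in values:
        y = x - c                # apply the correction to the next term
        t = total + y            # low-order bits of y may be lost here
        c = (t - total) - y      # recover what was lost (negated)
        total = t
    return total

def naive_sum(values):
    """Plain left-to-right accumulation, for comparison."""
    total = 0.0
    for x in values:
        total += x
    return total

data = [1.0] + [2.0**-53] * 4
print(naive_sum(data), kahan_sum(data))
```

In this example every small term is individually too small to change 1.0, so the naive loop returns 1.0, while the compensated loop accumulates them and returns 1 + 2**-51.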
If you need both the precision and a wide range, you'll be needing a software implementation of double precision floating point, such as SoftFloat.
(For addition, the basic principle is to break the representation (e.g. 64 bits) of each value into its three constituent parts - sign, exponent and mantissa; then shift the mantissa of one part based on the difference in the exponents, add to or subtract from the mantissa of the other part based on the sign bits, and possibly renormalise the result by shifting the mantissa and adjusting the exponent correspondingly. Along the way, there are a lot of fiddly details to account for, in order to avoid unnecessary loss of accuracy, and deal with special values such as infinities, NaNs, and denormalised numbers.)
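The first step described above, splitting a 64-bit value into sign, exponent and mantissa, can be sketched in Python as follows. This shows only the decomposition, not the full addition; the function name is illustrative.

```python
import struct

def decompose(x: float):
    """Split an IEEE 754 binary64 value into (sign, biased exponent, fraction)."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    sign = bits >> 63                   # 1 bit
    exponent = (bits >> 52) & 0x7FF     # 11 bits, biased by 1023
    fraction = bits & ((1 << 52) - 1)   # 52 stored bits, implicit leading 1 omitted
    return sign, exponent, fraction

print(decompose(1.0))    # (0, 1023, 0)
print(decompose(-2.5))   # -1.25 * 2**1
```

A software adder would then align the two fractions (with the implicit leading 1 restored) by the exponent difference before adding or subtracting them.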
Given all the constraints for high precision over 23 magnitudes, I think the most fruitful method would be to implement a custom arithmetic package.
A quick survey shows Briggs' doubledouble C++ library should address your needs and then some. See this.[*] The default implementation is based on double to achieve 30 significant figure computation, but it is readily rewritten to use float to achieve 13 or 14 significant figures. That may be enough for your requirements if care is taken to segregate addition operations with similar magnitude values, only adding extremes together in the last operations.
Beware though, the comments mention messing around with the x87 control register. I didn't check into the details, but that might make the code too non-portable for your use.
[*] The C++ source is linked by that article, but only the gzipped tar was not a dead link.
This is similar to the double-double arithmetic used by many compilers for long double on some machines that have only hardware double calculation support. It's also used as float-float on older NVIDIA GPUs where there's no double support. See Emulating FP64 with 2 FP32 on a GPU. This way the calculation will be much faster than a software floating-point library.
However, most microcontrollers have no hardware support for floats, so they're implemented purely in software. Because of that, using float-float may not increase performance, and it may introduce some memory overhead to store the extra exponent bytes.
If you really need the longer mantissa, try using a custom floating-point library. You can choose whatever is enough for you; for example, change the library to adopt a new 48-bit float type of your own if only 40 bits of mantissa and 7 bits of exponent are needed. There is then no need to spend time calculating or storing the unnecessary 16 bits. Note, however, that such a library must be very efficient to compete, because compilers' libraries often have assembly-level optimizations for their own float types.
Another software-based solution that might be of use: GNU MPFR
It takes care of many other special cases and allows arbitrary precision (better than 64-bit double) that you would have to otherwise take care of yourself.
That's not practical. If it were, every embedded 32-bit processor (or compiler) would emulate double precision by doing that. As it stands, none that I am aware of do. Most of them just substitute float for double.
If you need the precision and not the dynamic range, your best bet would be to use fixed point. If the compiler supports 64-bit integers, this will be easier too.
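A minimal fixed-point sketch in Python, assuming a Q32.32 layout (32 integer bits, 32 fractional bits) stored in a 64-bit integer. The layout and helper names are my own choice for illustration. Python integers do not overflow, so a real implementation would also have to handle the 64-bit range limit and the wider intermediate product in multiplication.

```python
FRAC_BITS = 32                 # assumed Q32.32 layout
ONE = 1 << FRAC_BITS

def to_fixed(x: float) -> int:
    """Convert to fixed point, rounding to the nearest representable step."""
    return round(x * ONE)

def to_float(a: int) -> float:
    return a / ONE

def fixed_add(a: int, b: int) -> int:
    """Addition is plain integer addition; comparison is integer comparison."""
    return a + b

def fixed_mul(a: int, b: int) -> int:
    """The product has 64 fractional bits; shift back down to 32."""
    return (a * b) >> FRAC_BITS

x = fixed_add(to_fixed(1.5), to_fixed(2.25))
print(to_float(x))
```

With this layout the resolution is 2**-32 across the whole range, which is finer than single precision for any value above 2**-8 in magnitude.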

Real number arithmetic in a general purpose language?

As (hopefully) most of you know, floating point arithmetic is different from real number arithmetic. For starters, it's imprecise. Many numbers, especially decimals (0.1, 0.3), cannot be represented exactly, leading to problems like this. A more thorough list can be found here.
Are there any general purpose languages that have built-in support for something closer to real number arithmetic? If not, what are good libraries that support this?
EDIT: Arbitrary precision decimal datatypes are not what I am looking for. I want to be able to represent numbers like 1/3, sqrt(3), or 1 + 2i as well.
Though I hate to say it, Fortran. It has extensive support for arbitrary-precision arithmetic and tons of support for big-number calculations. It's ancient and gross, but it gets the job done.
All the numbers used in your examples are algebraic numbers, and can be represented
finitely as roots of polynomials with integer coefficients.
The same cannot be said of real numbers in general, which is easily seen when one
considers that the reals are uncountable, but the set of computer programs is
countable. Therefore most reals will not have a finite representation in code.
What you are looking for is symbolic calculation (MATLAB and other tools used in math and engineering are good at it).
If you want a general-purpose language, I think expression trees in C# are a good place to start. In essence, the ability to store the expression (instead of evaluating the expression into real values) is the key to being able to perform symbolic calculation. Note that an expression tree does not provide symbolic calculation; it just provides the data structure that supports symbolic calculation.
This question is interesting, but raises some issues. First, you will never be able to represent all the real numbers using a (even theoretically infinite) computer, for cardinality reasons.
What you are looking for is a "symbolic numbers" datatype. You can imagine some sort of expression tree, with predefined constants, arithmetical operations, and perhaps algebraic (roots of polynomials) and transcendental (exp, sin, cos, log, etc) functions.
Now the fun part of the story: you cannot find an algorithm which tells whether two such trees represent the same number (or equivalently, which tests whether such a tree is zero). I won't state anything precise, but as a hint, this is similar to the Halting Problem (for computer scientists) or the Gödel Incompleteness Theorem (for mathematicians).
This renders such a class pretty useless.
For some subfields of the reals, you have canonical forms, like a/b for the rationals, or finite algebraic extensions of the rationals (a/b + ic/d for complex rationals, a/b + sqrt(2) * c/d for Q[sqrt(2)], etc). These can be used to represent some particular sets of algebraic numbers.
In practice, this is the most complicated thing you will need. If you have a particular necessity, like ranges of floating point numbers (to prove some result is within a specified interval, this is probably the closest you can get to real numbers), or arbitrary precision numbers, you have freely available classes everywhere. Google Boost's interval library (boost::numeric::interval) for the former, and gmp for the latter.
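To illustrate the canonical-form idea for Q[sqrt(2)], here is a minimal Python sketch in which every element is stored as a pair of rationals. The class name `QSqrt2` is hypothetical, and only addition, multiplication and equality are shown.

```python
from fractions import Fraction

class QSqrt2:
    """Element a + b*sqrt(2) of Q[sqrt(2)], with exact rational a and b."""

    def __init__(self, a, b=0):
        self.a = Fraction(a)
        self.b = Fraction(b)

    def __add__(self, other):
        return QSqrt2(self.a + other.a, self.b + other.b)

    def __mul__(self, other):
        # (a + b*r)(c + d*r) = (ac + 2bd) + (ad + bc)*r, using r*r = 2
        return QSqrt2(self.a * other.a + 2 * self.b * other.b,
                      self.a * other.b + self.b * other.a)

    def __eq__(self, other):
        # The (a, b) pair is a canonical form, so equality is decidable.
        return (self.a, self.b) == (other.a, other.b)

    def __repr__(self):
        return f"{self.a} + {self.b}*sqrt(2)"

x = QSqrt2(1, 1)          # 1 + sqrt(2)
print(x * x)              # 3 + 2*sqrt(2)
```

Because each number has exactly one such pair, the equality test that is undecidable for general expression trees is a simple comparison here.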
There are several languages with support for rational and complex numbers. Scheme, for instance, has support built in for arbitrarily precise rational numbers, and complex numbers with either rational, floating point, or integral coefficients:
> (+ 1/2 1/3)
5/6
> (* 3 1+1/2i)
3+3/2i
> (+ 1/2 .5)
1.0
If you want to go beyond rational numbers or complex numbers with rational coefficients, to algebraic numbers such as sqrt(2) or closed-form numbers like e, you will probably have to look beyond general purpose programming languages, and use a special purpose mathematical language like Mathematica or Maxima.
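For comparison, the rational part of the Scheme session above can be reproduced in Python with the standard `fractions` module. This is only a sketch of the analogous behaviour; note that Python's built-in `complex` type uses floating-point components, so there is no direct counterpart to Scheme's exact 3+3/2i.

```python
from fractions import Fraction

a = Fraction(1, 2) + Fraction(1, 3)   # exact rational arithmetic
b = 3 * Fraction(3, 2)                # integers mix exactly with rationals
c = Fraction(1, 2) + 0.5              # mixing with a float falls back to float

print(a, b, c)
```

As in Scheme, exactness is lost as soon as an inexact (floating-point) operand enters the expression.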
To cover the real numbers with any flair you'll need a symbolic package.
Boost, the C++ project, has a Rational library, but that's only part of the story.
You have irrational numbers in all sorts of forms (pi, base of the natural logarithm, square and cube roots, the Champernowne constant, to name only a few). The only way I know of to handle arithmetic operations is a symbolic package with smarts as to the relationship amongst all of these numbers. Assuming you could express e^pi, how would you add one to it? Or take the square root of it?
Mathematica might handle these cases.
Java: java.math.BigDecimal
C#: decimal
A lot of languages have support for that: Java has BigDecimal, Perl has Math::BigFloat and Math::BigRat, Haskell has Integer, and many more libraries and languages are listed on Wikipedia.
Ada natively supports fixed-point math as well as floating-point. Fixed-point can be much more exact than floating-point, as long as the number's exponents remain in range.
If you need floating-points, but more precision than IEEE gives, there are bignum packages around for just about every language.
I think that's about the best you can do. Neither scheme can exactly represent repeating decimals (like 1/3). It would probably be possible to come up with a scheme that does, but I know of no language that supports such a thing with a built-in type. Even that won't help you with irrational numbers (like pi and e). I believe there's even a theorem that says there will always be unrepresentable numbers, no matter what scheme you come up with.
EDIT: Arbitrary precision decimal datatypes are not what I am looking for. I want to be able to represent numbers like 1/3, sqrt(3), or 1 + 2i as well.
Ruby has a Rational class, so 1/3 can be expressed exactly as Rational(1,3). It also has a Complex class.
Scheme defines rationals, bignums, floating point and complex numbers. An implementation is not required to support them all, but if they are present, you can mix them and they will do "the right thing".
While it's not "built-in", I think C++ (maybe C#) is your best bet. There are classes out there that have been written for this purpose.
http://www.oonumerics.org/oon/

2D non-polynomial function fitting from the command line

I just wrote a simple Unix command line utility that could be implemented a lot more efficiently. I can measure its performance by just running it on a number of inputs and measuring the time it takes. This will produce a set of pairs of numbers, s t, where s is the input size and t the processing time. In order to determine the performance characteristics of my utility, I need to fit a function through these data points. I can do this manually, but I prefer to be lazy and let a utility do it for me.
Does such a utility exist?
Its input is a sequence of pairs of numbers.
Its output is a formula that expresses how the second number depends as a function on the first, plus an error measure.
One step of the way is to have a utility that does this just for polynomials.
This has been discussed here but it didn't produce a ready-to-use solution.
The next step is to extend the utility to try non-polynomial terms: negative-degree polynomials (as in y = 1/x) and logarithmic terms (as in y = x log x) will need to be tried as well. One idea to cope with the non-polynomial terms is to just surround the polynomial fitting with x and y scale transformations. I don't know whether that will do. This question is related but not exactly the same.
As I said, I'm lazy: I'm not looking for ideas on how to write this myself, I'm looking for a reliable result of a project that has already done it for me. Any suggestions?
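To make the scale-transformation idea concrete: taking logarithms of both s and t turns a power law t = a * s^b into a straight line, which ordinary least squares can fit. The Python sketch below covers only that one model and is not an existing utility; the function name and the choice of RMSE as the error measure are my own assumptions.

```python
import math

def fit_power_law(points):
    """Fit t = a * s**b to (s, t) pairs by least squares in log-log space.

    Requires s > 0 and t > 0. Returns (a, b, rmse), where rmse is the
    root-mean-square error of the fitted curve in the original units.
    """
    xs = [math.log(s) for s, _ in points]
    ys = [math.log(t) for _, t in points]
    n = len(points)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                          # slope = exponent
    a = math.exp(mean_y - b * mean_x)      # intercept = log of the coefficient
    rmse = math.sqrt(sum((t - a * s ** b) ** 2 for s, t in points) / n)
    return a, b, rmse

# Synthetic timings that follow t = 3 * s^2 exactly.
samples = [(s, 3.0 * s ** 2) for s in (1, 2, 4, 8, 16)]
print(fit_power_law(samples))
```

An exponent b close to 1 or 2 suggests linear or quadratic behaviour; a logarithmic factor such as s log s shows up as a slope slightly above 1 that drifts with the range of s, so it would need its own candidate model.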
I believe that SAS has this, RS/1 has this, and I think that Mathematica has this. Excel and most spreadsheets have a primitive form of this, and usually there are add-ons available for more advanced forms. There are lots of lab analysis and statistical analysis tools that have features like this.
Re: command-line tools:
SAS, RS/1 and Minitab were all command-line tools 20 years ago when I used them. I bet at least one of them still has this capability.