Bliss bug? Automorphism group generators depend on branching heuristic

I'm trying to use Bliss to compute the automorphism group generators of a graph. I thought I'd rather ask here on SO before I bother the author with a "bug" that is actually just my fault.
One of the options of bliss is the branching heuristic, which specifies which cell (of a partition of the vertices) is considered next.
e.g.
f means First non-unit cell.
fs means First smallest non-unit cell.
fsm means First smallest maximally non-trivially connected non-unit cell.
etc.
This graph gives me a headache: http://pastebin.com/Ppq7N1mN (file format: http://www.tcs.hut.fi/Software/bliss/fileformat.shtml)
Surprisingly, the numbers of generators differ for different branching heuristics... The modes f, fl return 2 generators, but I think there are 3, which fs,fm,fsm and flm confirm.
It's weird that the automorphism group size matches |Aut|=8. I checked why, and somehow bliss thinks that the orbit of one of the generators has size 4 and thus calculates the order 2*4=8. I don't exactly know how the algorithm works nor do I understand the code well enough so that I could find a bug.
So my question is: Am I missing something and this behaviour is normal or is that a bug in the library?
And here's the output. That the canonical labellings are different is expected!
bliss -directed -can -sh=f test.dimacs
Generator: (2,4)(10,18)(11,20)(12,19)(13,21)(40,76)(41,77)(42,78)(43,79)(44,80)(45,81)(46,82)(47,83)(48,84)(49,85)(50,86)(51,87)(52,88)(53,89)(54,90)(55,91)(56,92)(57,93)
Generator: (1,2)(3,4)(6,10)(7,11)(8,13)(9,12)(14,18)(15,20)(16,21)(17,19)(22,43)(23,42)(24,41)(25,40)(26,57)(27,56)(28,55)(29,54)(30,53)(31,52)(32,51)(33,50)(34,49)(35,48)(36,47)(37,46)(38,45)(39,44)(58,79)(59,78)(60,77)(61,76)(62,93)(63,92)(64,91)(65,90)(66,89)(67,88)(68,87)(69,86)(70,85)(71,84)(72,83)(73,82)(74,81)(75,80)(94,95)
Canonical labeling: (1,4)(6,20,14,21)(7,16,8,9,12,11,15,17,13)(10,19)(22,77,66,24,68,30,25,73,88,90,39,44,43,75,45,29,41,67,92,54,40,71,51,32,33,49,83,34,85,82,86,46,87,31,93,64,80,42,61,72,37,89)(23,58,76,70,84,52,91,78,60,69,48,53)(26,62,63,57,65,38)(27,56,55,79,74)(28,81)(35,50,47)
Nodes: 6
Leaf nodes: 4
Bad nodes: 0
Canrep updates: 1
Generators: 2
Max level: 2
|Aut|: 8
Total time: 0.06 seconds
bliss -directed -can -sh=fsm test.dimacs
Generator: (2,4)(10,18)(11,20)(12,19)(13,21)(40,76)(41,77)(42,78)(43,79)(44,80)(45,81)(46,82)(47,83)(48,84)(49,85)(50,86)(51,87)(52,88)(53,89)(54,90)(55,91)(56,92)(57,93)
Generator: (1,3)(6,14)(7,15)(8,16)(9,17)(22,58)(23,59)(24,60)(25,61)(26,62)(27,63)(28,64)(29,65)(30,66)(31,67)(32,68)(33,69)(34,70)(35,71)(36,72)(37,73)(38,74)(39,75)
Generator: (1,2)(3,4)(6,10)(7,11)(8,13)(9,12)(14,18)(15,20)(16,21)(17,19)(22,43)(23,42)(24,41)(25,40)(26,57)(27,56)(28,55)(29,54)(30,53)(31,52)(32,51)(33,50)(34,49)(35,48)(36,47)(37,46)(38,45)(39,44)(58,79)(59,78)(60,77)(61,76)(62,93)(63,92)(64,91)(65,90)(66,89)(67,88)(68,87)(69,86)(70,85)(71,84)(72,83)(73,82)(74,81)(75,80)(94,95)
Canonical labeling: (1,2,4,3)(6,21,8,7,15,14,20,16)(9,13)(10,19)(11,17,12)(22,75,42,59,60,66)(23,61,72,34,83,36,35,53,25,73,86,48,51,31,93,62,64,80,44,45,27,55,79,76,70,82,88,90,38,29,41,69,46,89,24,67,92,56,57,63,54,39,43,77,68,32,33,47,37,87,30)(26,65,40,71,52,91,78,58,74,28,81)(49,85,84,50)(94,95)
Nodes: 9
Leaf nodes: 4
Bad nodes: 0
Canrep updates: 1
Generators: 3
Max level: 3
|Aut|: 8
Total time: 0.06 seconds
I didn't know how to tag this question, so I chose igraph - a library that uses bliss for this problem.

A friend of mine suggested that the generating set of the second method might not be minimal. And indeed you can get the missing generator from the other two:
(1 2)(3 4)... * (2 4)(10 18)... * (1 2)(3 4)... = (1 3)(6 14)...
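The conjugation above can be checked mechanically. The following is a minimal sketch in plain Python, not part of bliss: the helper names (from_cycles, compose, group_order) are made up for illustration, the generators are transcribed from the -sh=fsm output above using ranges for the regular runs of 2-cycles, and N = 95 is assumed to be the largest vertex label because it is the largest one appearing in the output.

```python
N = 95  # assumption: largest vertex label seen in the bliss output


def from_cycles(cycles):
    """Build a permutation (tuple of images, index 0 unused) from disjoint 2-cycles."""
    images = list(range(N + 1))
    for a, b in cycles:
        images[a], images[b] = b, a
    return tuple(images)


def compose(*perms):
    """Apply the given permutations one after another, left to right."""
    result = []
    for v in range(N + 1):
        for p in perms:
            v = p[v]
        result.append(v)
    return tuple(result)


def group_order(generators):
    """Order of the generated group, by brute-force closure (fine for tiny groups)."""
    identity = tuple(range(N + 1))
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for h in generators:
            product = compose(g, h)
            if product not in seen:
                seen.add(product)
                frontier.append(product)
    return len(seen)


# Generators from the -sh=fsm run, written compactly.
g1 = from_cycles([(2, 4), (10, 18), (11, 20), (12, 19), (13, 21)]
                 + [(x, x + 36) for x in range(40, 58)])
g2 = from_cycles([(1, 3), (6, 14), (7, 15), (8, 16), (9, 17)]
                 + [(x, x + 36) for x in range(22, 40)])
g3 = from_cycles([(1, 2), (3, 4), (6, 10), (7, 11), (8, 13), (9, 12),
                  (14, 18), (15, 20), (16, 21), (17, 19)]
                 + [(22 + i, 43 - i) for i in range(4)]
                 + [(26 + i, 57 - i) for i in range(14)]
                 + [(58 + i, 79 - i) for i in range(4)]
                 + [(62 + i, 93 - i) for i in range(14)]
                 + [(94, 95)])

# The "missing" generator is a conjugate of the first one.
print(compose(g3, g1, g3) == g2)
# Two generators already generate the whole group reported by bliss.
print(group_order([g1, g3]), group_order([g1, g2, g3]))
```

If the transcription is right, both generating sets give a group of order 8, which is consistent with the |Aut| value printed by both runs.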

Related

Find the Relationship Between Two Logarithmic Equations

No idea if I am asking this question in the right place, but here goes...
I have a set of equations that were calculated based on numbers ranging from 4 to 8. So an equation for when this number is 5, one for when it is 6, one for when it is 7, etc. These equations were determined from graphing a best fit line to data points in a Google Sheet graph. Here is an example of a graph...
Example...
When the number is between 6 and 6.9, this equation is used: windGust6to7 = -29.2 + (17.7 * log(windSpeed))
When the number is between 7 and 7.9, this equation is used: windGust7to8 = -70.0 + (30.8 * log(windSpeed))
I am using these equations to create an image in python, but the image is too choppy since each equation covers a range from x to x.9. In order to smooth this image out and make it more accurate, I really would need an equation for every 0.1 change in number. So an equation for 6, a different equation for 6.1, one for 6.2, etc.
Here is an example output image that is created using the current equations:
So my question is: Is there a way to find the relationship between the two example equations I gave above in order to use that to create a smoother looking image?

This is not about logarithms; for the purposes of this derivation, log(windspeed) is a constant term. Rather, you're trying to find a fit for your mapping:
6 (-29.2, 17.7)
7 (-70.0, 30.8)
...
... and all of the other numbers you have already. You need to determine two basic search parameters:
(1) Where in each range is your function an exact fit? For instance, for the first one, is it exactly correct at 6.0, 6.5, 7.0, or elsewhere? Change the left-hand column to reflect that point.
(2) What sort of fit do you want? You are basically fitting a pair of parameterized equations, one for each coefficient:
x    y (intercept)      x    y (slope)
6    -29.2              6    17.7
7    -70.0              7    30.8
For each of these, you want to find the coefficients of a good matching function. This is a large field of statistical and algebraic study. Since you have four ranges, you will have four points for each function. It is straightforward to fit a cubic equation to each set of points in Cartesian space. However, the resulting function may not be as smooth as you like; in such a case, you may well find that a 4th- or 5th- degree function fits better, or perhaps something exponential, depending on the actual distribution of your points.
You need to work with your own problem objectives and do a little more research into function fitting. Once you determine the desired characteristics, look into scikit for fitting functions to do the heavy computational work for you.
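As a starting point, here is a minimal sketch of the simplest possible fit: straight-line interpolation of both coefficients between the two equations quoted in the question. Several things here are assumptions rather than facts from the question: that each equation is exact at the lower end of its range (6.0 and 7.0), that log means the natural logarithm, and the function names themselves. A cubic or other fit through all four points would replace coefficients() only.

```python
import math

# (number, intercept, slope) taken from the two example equations.
ANCHORS = [(6.0, -29.2, 17.7), (7.0, -70.0, 30.8)]


def coefficients(x):
    """Linearly interpolate intercept and slope for a number between the anchors."""
    (x0, a0, b0), (x1, a1, b1) = ANCHORS
    t = (x - x0) / (x1 - x0)
    return a0 + t * (a1 - a0), b0 + t * (b1 - b0)


def wind_gust(x, wind_speed):
    """Evaluate the interpolated equation; natural log is an assumption."""
    intercept, slope = coefficients(x)
    return intercept + slope * math.log(wind_speed)


for x in (6.0, 6.1, 6.5, 7.0):
    print(x, coefficients(x), wind_gust(x, 20.0))
```

With this, every 0.1 step gets its own pair of coefficients, and the two original equations are reproduced exactly at 6.0 and 7.0.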

Compressing a binary matrix

We were asked to find a way to compress a square binary matrix as much as possible, and if possible, to add redundancy bits to check and maybe correct errors.
The redundancy thing is easy to implement in my opinion. The complicated part is compressing the matrix. I thought about using run-length after reshaping the matrix to a vector because there will be more zeros than ones, but I only achieved a 40bits compression (we are working on small sizes) although I thought it'd be better.
Also, after run-length an idea was Huffman coding the matrix, but a dictionary must be sent in order to recover the original information.
I'd like to know what would be the best way to compress a binary matrix?
After reading some comments, yes @Adam you're right, the 14x14 matrix should be compressed in 128bits, so if I only use the coordinates (rows&cols) for each non-zero element, still it would be 160bits (since there are twenty ones). I'm not looking for an exact solution but for a useful idea.

You can only talk about compressing something if you have a distribution and a representation. That's the issue of the dictionary you have to send along: you always need some sort of dictionary or protocol to uncompress something. It just so happens that things like .zip and .mpeg already have those dictionaries/codecs. Even something as simple as Huffman-encoding is an algorithm; on the other side of the communication channel (you can think of compression as communication), the other person already has a bit of code (the dictionary) to perform the Huffman decompression scheme.
Thus you cannot even begin to talk about compressing something without first thinking "what kinds of matrices do I expect to see?", "is the data truly random, or is there order?", and if so "how can I represent the matrices to take advantage of order in the data?".
You cannot compress some matrices without increasing the size of other objects (by at least 1 bit). This is bad news if all matrices are equally probable, and you care equally about them all.
Addenda:
The answer to use sparse matrix machinery is not necessarily the right answer. The matrix could for example be represented in python as [[(r+c)%2 for c in range (cols)] for r in range(rows)] (a checkerboard pattern), and a sparse matrix wouldn't compress it at all, but the Kolmogorov complexity of the matrix is the above program's length.
Well, I know every matrix will have the same number of ones, so this is kind of deterministic. The only thing I don't know is where the 1's will be. Also, if I transmit the matrix with a dictionary and there are burst errors, maybe the dictionary gets affected, so... wouldn't the resulting information be corrupted? That's why I was trying to use lossless data compression such as run-length: the decoder just doesn't need a dictionary. --original poster
How many 1s does the matrix have as a fraction of its size, and what is its size (NxN -- what is N)?
Furthermore, this is an incorrect assertion and should not be used as a reason to desire run-length encoding (which still requires a program); when you transmit data over a channel, you can always add error-correction to this data. "Data" is just a blob of bits. You can transmit both the data and any required dictionaries over the channel. The error-correcting machinery does not care at all what the bits you transmit are for.
Addendum 2:
There are (14*14) choose 20 possible arrangements, which I assume are randomly chosen. If this number were larger than 2^128, what you're trying to do would be impossible. Fortunately log_2((14*14) choose 20) ~= 90bits < 128bits so it's possible.
The simple solution of writing down 20 numbers like 32,2,67,175,52,...,168 won't work because log_2(14*14)*20 ~= 153bits > 128bits. This would be equivalent to run-length encoding. We want to do something like this but we are on a very strict budget and cannot afford to be "wasteful" with bits.
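The two bit counts above are easy to reproduce. This is a small sketch using only Python's standard library (math.comb needs Python 3.8 or later); the variable names are made up for illustration.

```python
from math import comb, log2

arrangements = comb(14 * 14, 20)              # ways to place 20 ones in 196 cells
optimal_bits = (arrangements - 1).bit_length()  # whole bits needed to number them all

fractional_index_bits = 20 * log2(14 * 14)    # 20 indices at log2(196) bits each
whole_index_bits = 20 * 8                     # 20 indices at 8 whole bits each

print(optimal_bits, fractional_index_bits, whole_index_bits)
```

The first number is the budget an ideal numbering of the arrangements needs; the other two are the cost of writing down 20 positions one by one, with fractional and with whole-bit indices.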
Because you care about each possibility equally, your "dictionary"/"program" will simulate a giant lookup table. Matlab's sparse matrix implementation may work but is not guaranteed to work and is thus not a correct solution.
If you can create a bijection between the number range [0,2^128) and subsets of size 20, you're good to go. This corresponds to enumerating ways to descend the pyramid in http://en.wikipedia.org/wiki/Binomial_coefficient to the 20th element of row 196. This is the same as enumerating all "k-combinations". See http://en.wikipedia.org/wiki/Combination#Enumerating_k-combinations
Fortunately I know that Mathematica and Sage and other CAS software can apparently generate the "5th" or "12th" or arbitrarily numbered k-subset. Looking through their documentation, we come upon a function called "rank", e.g. http://www.sagemath.org/doc/reference/sage/combinat/subset.html
So then we do some more searching, and come across some arcane Fortran code like http://people.sc.fsu.edu/~jburkardt/m_src/subset/ksub_rank.m and http://people.sc.fsu.edu/~jburkardt/m_src/subset/ksub_unrank.m
We could reverse-engineer it, but it's kind of dense. But now we have enough information to search for k-subset rank unrank, which leads us to http://www.site.uottawa.ca/~lucia/courses/5165-09/GenCombObj.pdf -- see the section "Generating k-subsets (of an n-set): Lexicographical Ordering" and the rank and unrank algorithms on the next few pages.
In order to achieve the exact theoretically optimal compression, in the case of a uniformly random distribution of 1s, we must thus use this technique to biject our matrices to our output number of range <2^128. It just so happens that combinations have a natural ordering, known as ranking and unranking of combinations. You assign a number to each combination (ranking), and if you know the number you automatically know the combination (unranking). Googling k-subset rank unrank will probably yield other algorithms.
Thus your solution would look like this:
serialize the matrix into a list
    e.g. [[0,0,1][0,1,1][1,0,0]] -> [0,0,1,0,1,1,1,0,0]
take the indices of the 1s:
    e.g. [0,0,1,0,1,1,1,0,0] -> [3,5,6,7]
    (positions are numbered 1 2 3 4 5 6 7 8 9, so this is a k=4-subset of an n=9 set)
take the rank
    e.g. compressed = rank([3,5,6,7], n=9)
    compressed==101 in lexicographic order, counting from 0 (any valid rank is below 9 choose 4 = 126)
you're done!
    e.g. 101 -binary-> 1100101 (7 bits suffice here, since 126 <= 2^7 = 128)
to uncompress, unrank it
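The steps above can be sketched as follows. This is one possible lexicographic rank/unrank pair written from scratch, not the code from the linked references; the function names, the 1-based positions and the 0-based ranks are choices made for this illustration.

```python
from math import comb


def rank(subset, n):
    """0-based lexicographic rank of a sorted k-subset of {1, ..., n}."""
    k = len(subset)
    r = 0
    prev = 0
    for i, x in enumerate(subset):
        # Count the subsets that share the first i elements but have a
        # smaller element at position i.
        for smaller in range(prev + 1, x):
            r += comb(n - smaller, k - i - 1)
        prev = x
    return r


def unrank(r, n, k):
    """Inverse of rank(): the sorted k-subset of {1, ..., n} with rank r."""
    subset = []
    x = 1
    for i in range(k):
        # Skip every candidate whose block of subsets lies entirely before r.
        while comb(n - x, k - i - 1) <= r:
            r -= comb(n - x, k - i - 1)
            x += 1
        subset.append(x)
        x += 1
    return subset


matrix = [[0, 0, 1], [0, 1, 1], [1, 0, 0]]
flat = [bit for row in matrix for bit in row]
ones = [i + 1 for i, bit in enumerate(flat) if bit]   # 1-based positions
compressed = rank(ones, len(flat))
print(ones, compressed, unrank(compressed, len(flat), len(ones)))
```

For the 14x14 case the same two functions apply with n=196 and k=20, and the resulting integer can be written out with int.to_bytes(16, "big") to get exactly 128 bits.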

I'll get to 128 bits in a sec, first here's how you fit a 14x14 boolean matrix with exactly 20 nonzeros into 136 bits. It's based on the CSC sparse matrix format.
You have an array c with 14 4-bit counters that tell you how many nonzeros are in each column.
You have another array r with 20 4-bit row indices.
56 bits (c) + 80 bits (r) = 136 bits.
Let's squeeze 8 bits out of c:
Instead of 4-bit counters, use 2-bit. c is now 2*14 = 28 bits, but can't support more than 3 nonzeros per column. This leaves us with 128-80-28 = 20 bits. Use that space for array a4c with 5 4-bit elements that "add 4 to an element of c" specified by the 4-bit element. So, if a4c={2,2,10,15, 15} that means c[2] += 4; c[2] += 4 (again); c[10] += 4;.
The "most wasteful" distribution of nonzeros is one where the column count will require an add-4 to support 1 extra nonzero: so 5 columns with 4 nonzeros each. Luckily we have exactly 5 add-4s available.
Total space = 28 bits (c) + 20 bits (a4c) + 80 bits (r) = 128 bits.
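A minimal sketch of the column-count part of this scheme, assuming 0-based column numbers and using the value 15 as the "unused" marker in a4c (as in the example {2,2,10,15,15}); the function names are hypothetical and the row-index array r is left out because it is stored as-is.

```python
UNUSED = 15  # a4c slot that adds nothing (columns are numbered 0..13)


def encode_counts(counts):
    """Split 14 column counts into 2-bit counters plus five 4-bit add-4 slots."""
    low = [c % 4 for c in counts]            # fits in 2 bits each
    a4c = []
    for col, c in enumerate(counts):
        a4c.extend([col] * (c // 4))         # one slot per extra group of 4
    if len(a4c) > 5:
        raise ValueError("more than 5 add-4 slots needed")
    return low, a4c + [UNUSED] * (5 - len(a4c))


def decode_counts(low, a4c):
    """Rebuild the original column counts."""
    counts = list(low)
    for col in a4c:
        if col != UNUSED:
            counts[col] += 4
    return counts


counts = [0, 1, 9, 0, 0, 2, 0, 0, 0, 0, 5, 0, 3, 0]   # sums to 20
low, a4c = encode_counts(counts)
print(low, a4c, decode_counts(low, a4c) == counts)
print(2 * 14 + 4 * 5 + 4 * 20)   # total bits including the 20 row indices
```

Since the counts sum to 20, the number of add-4 slots needed is at most 20 // 4 = 5, which is why five slots are always enough.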

Your input is a perfect candidate for a sparse matrix. You said you're using Matlab, so you already have a good sparse matrix built for you.
spm = sparse(dense_matrix)
Matlab's sparse matrix implementation uses Compressed Sparse Columns, which has memory usage on the order of 2*(# of nonzeros) + (# of columns), which should be pretty good in your case of 20 nonzeros and 14 columns. Storing 20 values sure is better than storing 196...
Also remember that all matrices in Matlab are going to be composed of doubles. Just because your matrix can be stored as a 1-bit boolean doesn't mean Matlab won't stick it into a 64-bit floating point value... If you do need it as a boolean you're going to have to make your own type in C and use .mex files to interface with Matlab.

After thinking about this again, if all your matrices are going to be this small and they're all binary, then just store them as a binary vector (bitmask). Going off your 14x14 example, that requires 196 bits or 25 bytes (plus n, m if your dimensions are not constant). That same vector in Matlab would use 64 bits per element, or 1568 bytes. So storing the matrix as a bitmask takes as much space as 4 elements of the original matrix in Matlab, for a compression ratio of 62x.
Unfortunately I don't know if Matlab supports bitmasks natively or if you have to resort to .mex files. If you do get into C++ you can use STL's vector<bool> which implements a bitmask for you.
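For comparison, the bitmask idea is only a few lines in a language with arbitrary-size integers. This is a sketch in Python rather than Matlab or C++, with made-up function names, packing the matrix row by row, most significant bit first.

```python
def pack(matrix):
    """Pack a 0/1 matrix into bytes, row by row, most significant bit first."""
    bits = 0
    for row in matrix:
        for value in row:
            bits = (bits << 1) | (1 if value else 0)
    n_bits = len(matrix) * len(matrix[0])
    return bits.to_bytes((n_bits + 7) // 8, "big")


def unpack(data, rows, cols):
    """Inverse of pack(); the dimensions must be known or sent separately."""
    bits = int.from_bytes(data, "big")
    n_bits = rows * cols
    flat = [(bits >> (n_bits - 1 - i)) & 1 for i in range(n_bits)]
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


board = [[(r + c) % 2 for c in range(14)] for r in range(14)]
packed = pack(board)
print(len(packed), unpack(packed, 14, 14) == board)
```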

Determining edge weights given a list of walks in a graph

These questions regard a set of data with lists of tasks performed in succession and the total time required to complete them. I've been wondering whether it would be possible to determine useful things about the tasks' lengths, either as they are or with some initial guesstimation based on appropriate domain knowledge. I've come to think graph theory would be the way to approach this problem in the abstract, and have a decent basic grasp of the stuff, but I'm unable to know for certain whether I'm on the right track. Furthermore, I think it's a pretty interesting question to crack. So here we go:
Is it possible to determine the weights of edges in a directed weighted graph, given a list of walks in that graph with the lengths (summed weights) of said walks? I recognize the amount and quality of permutations on the routes taken by the walks will dictate the quality of any possible answer, but let's assume all possible walks and their lengths are given. If a definite answer isn't possible, what kind of things can be concluded about the graph? How would you arrive at those conclusions?
What if there were several similar walks with possibly differing lengths given? Can you calculate a decent average (or other illustrative measure) for each edge, given enough permutations on different routes to take? How will discounting some permutations from the available data set affect the calculation's accuracy?
Finally, what if you had a set of initial guesses as to the weights and had to refine those using the walks given? Would that improve upon your guesstimation ability, and how could you apply the extra information?
EDIT: Clarification on the difficulties of a plain linear algebraic approach. Consider the following set of walks:
a = 5
b = 4
b + c = 5
a + b + c = 8
A matrix equation with these values is unsolvable, but we'd still like to estimate the terms. There might be some helpful initial data available, such as in scenario 3, and in any case we can apply knowledge of the real world - such as that the length of a task can't be negative. I'd like to know if you have ideas on how to ensure we get reasonable estimations and that we also know what we don't know - eg. when there's not enough data to tell a from b.

Seems like an application of linear algebra.
You have a set of linear equations which you need to solve. The variables being the lengths of the tasks (or edge weights).
For instance if the tasks lengths were t1, t2, t3 for 3 tasks.
And you are given
t1 + t2 = 2 (task 1 and 2 take 2 hours)
t1 + t2 + t3 = 7 (all 3 tasks take 7 hours)
t2 + t3 = 6 (tasks 2 and 3 take 6 hours)
Solving gives t1 = 1, t2 = 1, t3 = 5.
You can use any linear algebra technique (e.g. http://en.wikipedia.org/wiki/Gaussian_elimination) to solve these, which will tell you if there is a unique solution, no solution or an infinite number of solutions (there are no other possibilities).
If you find that the linear equations do not have a solution, you can try adding a very small random number to some of the task weights/coefficients of the matrix and try solving it again. (I believe this falls under perturbation theory.) Matrices are notorious for radically changing behavior with small changes in the values, so this will likely give you an approximate answer reasonably quickly.
Or maybe you can try introducing some 'slack' task in each walk (i.e add more variables) and try to pick the solution to the new equations where the slack tasks satisfy some linear constraints (like 0 < s_i < 0.0001 and minimize sum of s_i), using Linear Programming Techniques.
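To make the three possible outcomes concrete, here is a minimal Gaussian elimination sketch over exact fractions. It is an illustration written for this answer, not a library routine; the function name and the ("unique" / "none" / "infinite") return convention are made up. It is run on the t1, t2, t3 example above and on the four walks from the question's edit.

```python
from fractions import Fraction


def solve(rows, rhs):
    """Gaussian elimination; returns ("unique", values), ("none", None) or ("infinite", None)."""
    n = len(rows[0])
    m = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(rows, rhs)]
    pivot_row = 0
    pivot_cols = []
    for col in range(n):
        # Find a row at or below pivot_row with a nonzero entry in this column.
        r = next((i for i in range(pivot_row, len(m)) if m[i][col] != 0), None)
        if r is None:
            continue
        m[pivot_row], m[r] = m[r], m[pivot_row]
        p = m[pivot_row][col]
        m[pivot_row] = [v / p for v in m[pivot_row]]
        # Clear this column in every other row.
        for i in range(len(m)):
            if i != pivot_row and m[i][col] != 0:
                f = m[i][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[pivot_row])]
        pivot_cols.append(col)
        pivot_row += 1
        if pivot_row == len(m):
            break
    # A row reading 0 = nonzero means the walks contradict each other.
    if any(all(v == 0 for v in row[:n]) and row[n] != 0 for row in m):
        return "none", None
    if len(pivot_cols) < n:
        return "infinite", None
    return "unique", [m[i][n] for i in range(n)]


# t1 + t2 = 2, t1 + t2 + t3 = 7, t2 + t3 = 6
print(solve([[1, 1, 0], [1, 1, 1], [0, 1, 1]], [2, 7, 6]))
# a = 5, b = 4, b + c = 5, a + b + c = 8 (from the question's edit)
print(solve([[1, 0, 0], [0, 1, 0], [0, 1, 1], [1, 1, 1]], [5, 4, 5, 8]))
# a + b = 3 alone cannot tell a from b
print(solve([[1, 1]], [3]))
```

The "none" and "infinite" cases are exactly where the perturbation or slack-variable ideas would come in.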

Assume you have an unlimited number of arbitrary characters to represent each edge. (a,b,c,d etc)
w is a list of all the walks, in the form of 0,a,b,c,d,e etc. (the 0 will be explained later.)
i = 1
if #w[i] ~= 1 then
replace w[2] with the LENGTH of w[i], minus all other values in w.
repeat forever.
Example:
0,a,b,c,d,e 50
0,a,c,b,e 20
0,c,e 10
So:
a is the first. Replace all instances of "a" with 50, -b,-c,-d,-e.
New data:
50, 50
50,-b,-d, 20
0,c,e 10
And, repeat until one value is left, and you finish! Alternatively, the first number can simply be subtracted from the length of each walk.

I'd forget about graphs and treat lists of tasks as vectors: every task is represented as a component with a value equal to its cost (time to complete, in this case).
If tasks are in different orders initially, that's where to use domain knowledge to bring them to a canonical form and assign multipliers if domain knowledge tells you that the ratio of costs will be substantially influenced by ordering / timing. Timing is implicit in the initial ordering, but you may have to make a function of time just for adjustment factors (say driving at lunch time vs driving at midnight). The function might be tabular/discrete. In general it's always much easier to evaluate ratios and relative biases (the hardness of doing something). You may need a functional language to do repeated rewrites of your vectors till there's nothing more that domain knowledge and rules can change.
With canonical vectors, consider just the presence and absence of a task (just 0|1 for this iteration) and look for minimal diffs - single task diffs first - that will provide estimates with a small number of variables. Keep doing this recursively, be ready to backtrack, and have a heuristic rule for the goodness or quality of estimates so far. Keep track of good "rounds" that you backtracked from.
When you reach a minimal irreducible state - you can't make any more diffs, all vectors have the same remaining tasks - then you can do some basic statistics like variance, mean, median and look for big outliers and ways to improve the initial domain-knowledge-based estimates that led to the canonical form. If you find a lot of them and can infer new rules, take them in and start the whole process from the start.
Yes, this can cost a lot :-)

How many combinations of k neighboring pixels are there in an image?

I suck at math, so I can't figure this out: how many combinations of k neighboring pixels are there in an image? Combinations of k pixels out of n * n total pixels in the image, but with the restriction that they must be neighbors, for each k from 2 to n * n. I need the sum for all values of k for a program that must take into account that many elements in a set that it's reasoning about.
Neighbors are 4-connected and do not wrap-around.

Once you get the number of distinct shapes for a blob of pixels of size k (here's a reference) then it comes down to two things:
How many ways on your image can you place this blob?
How many of these are the same so that you don't double-count (because of symmetries)?
Getting an exact answer is a huge computational job (you're looking at more than 10^30 distinct shapes for k=56 -- imagine if k = 10,000) but you may be able to get good enough for what you need by fitting for the first 50 values of k.
(Note: the reference in the wikipedia article takes care of duplicates with their definition of A_k.)
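For very small images the count can simply be brute-forced, which is useful for checking any formula or fit. This is a sketch under the question's rules (4-connected, no wrap-around); the function names are made up, and the approach is exponential, so it is only practical up to roughly a 4 x 4 image.

```python
from itertools import combinations


def is_connected(cells):
    """True if the given set of (row, col) cells is 4-connected."""
    cells = set(cells)
    start = next(iter(cells))
    stack, seen = [start], {start}
    while stack:
        r, c = stack.pop()
        for nb in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if nb in cells and nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return len(seen) == len(cells)


def connected_subset_counts(n):
    """Map each k in 2..n*n to the number of connected k-pixel subsets of an n x n image."""
    grid = [(r, c) for r in range(n) for c in range(n)]
    return {k: sum(1 for s in combinations(grid, k) if is_connected(s))
            for k in range(2, n * n + 1)}


counts = connected_subset_counts(3)
print(counts, sum(counts.values()))
```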

It seems that you are working on a problem that can be mapped to Markovian Walks.
If I understand your question, you are trying to count paths of length k like this:
Start (end)-> any pixel after visiting k neighbours
* - - - - -*
| |
| |
- - - -
in a structure that is similar to a chess board, and you want to connect only vertical and horizontal neighbours.
I think that you want the paths to be self avoiding, meaning that a pixel should not be traversed twice in a walk (meaning no loops). This condition leads to a classical problem called SAWs (Self Avoiding Walks).
Well, now the bad news: The problem is open! No one solved it yet.
You can find a nice intro to the problem here, starting at page 54 (or page 16, the counting is confusing because the page numbers are repeating in the doc). But the whole paper is very interesting and easy to read. It manages to explain the mathematical background, the historical anecdotes and the scientific importance of markovian chains in a few slides.
Hope this helps ... to avoid the problem.

If you were planning to iterate over all possible polyominoes, I'm afraid you'll be waiting a long time. From the Wikipedia page about polyominoes, it's going to be at least O(4.0626^n) and probably closer to O(8^n). By the time n=14, the count will be over 5 billion and too big to fit into an int. By the time n=30, the count will be more than 17 quintillion and you won't be able to fit it into a long. If all the world governments pooled together their resources to iterate through all polyominoes in a 32 x 32 icon, they would not be able to do it before the sun goes supernova.
Now that doesn't mean what you want to do is intractable. It is likely that almost all the work you do on one polyomino was done in part on others. It may be a fun task to get an exponential speedup using dynamic programming. What is it you're trying to accomplish?

Of Ways to Count the Limitless Primes [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 7 years ago.
Alright, so maybe I shouldn't have shrunk this question sooo much... I have seen the post on the most efficient way to find the first 10000 primes. I'm looking for all possible ways. The goal is to have a one stop shop for primality tests. Any and all tests people know for finding prime numbers are welcome.
And so:
What are all the different ways of finding primes?

Some prime tests only work with certain numbers, for instance, the Lucas–Lehmer test only works for Mersenne numbers.
Most prime tests used for big numbers can only tell you that a certain number is "probably prime" (or, if the number fails the test, it is definitely not prime). Usually you can continue the algorithm until you have a very high probability of a number being prime.
Have a look at this page and especially its "See Also" section.
The Miller-Rabin test is, I think, one of the best tests. In its standard form it gives you probable primes - though it has been shown that if you apply the test to a number beneath 3.4*10^14, and it passes the test for each parameter 2, 3, 5, 7, 11, 13 and 17, it is definitely prime.
The AKS test was the first deterministic, proven, general, polynomial-time test. However, to the best of my knowledge, its best implementation turns out to be slower than other tests unless the input is ridiculously large.
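A minimal sketch of the Miller-Rabin test in the deterministic form described above, using the bases 2, 3, 5, 7, 11, 13 and 17. The exact limit 341,550,071,728,321 is the commonly quoted value behind the "3.4*10^14" figure; the function name is made up, and the function refuses inputs at or above the limit rather than guessing.

```python
BASES = (2, 3, 5, 7, 11, 13, 17)
LIMIT = 341_550_071_728_321  # commonly quoted bound for these seven bases


def is_prime_below_limit(n):
    """Deterministic Miller-Rabin for n < LIMIT."""
    if n >= LIMIT:
        raise ValueError("n is too large for this fixed set of bases")
    if n < 2:
        return False
    for p in BASES:
        if n % p == 0:
            return n == p
    # Write n - 1 as d * 2^s with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for a in BASES:
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False  # a is a witness: n is definitely composite
    return True


print([n for n in range(60) if is_prime_below_limit(n)])
print(is_prime_below_limit(1_000_000_007), is_prime_below_limit(2047))
```

For larger inputs the same loop with randomly chosen bases gives the "probably prime" behaviour described earlier.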

For a given integer, the fastest primality check I know is:
Take a list of 2 to the square root of the integer.
Loop through the list, taking the remainder of the integer / current number
If the remainder is zero for any number in the list, then the integer is not prime.
If the remainder was non-zero for all numbers in the list, then the integer is prime.
It uses significantly less memory than The Sieve of Eratosthenes and is generally faster for individual numbers.
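The steps above can be sketched in a few lines of Python. The function name is made up, and math.isqrt (Python 3.8 or later) is used so the square root stays an exact integer; range() does not build the list in memory.

```python
import math


def is_prime_trial_division(n):
    """Trial division by every integer from 2 up to the square root of n."""
    if n < 2:
        return False
    for divisor in range(2, math.isqrt(n) + 1):
        if n % divisor == 0:
            return False  # a zero remainder means n is not prime
    return True


print([n for n in range(40) if is_prime_trial_division(n)])
```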

The Sieve of Eratosthenes is a decent algorithm:
Take the list of positive integers 2 to any given Ceiling.
Take the next item in the list (2 in the first iteration) and remove all multiples of it (beyond the first) from the list.
Repeat step two until you reach the given Ceiling.
Your list is now composed purely of primes.
There is a functional limit to this algorithm in that it exchanges speed for memory. When generating very large lists of primes the memory capacity needed skyrockets.
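A minimal sketch of the sieve in Python, with a made-up function name. It differs from the steps above in two standard ways that do not change the result: it marks numbers in a boolean list instead of removing them, and it starts crossing out at p*p and stops once p*p exceeds the ceiling.

```python
def sieve(ceiling):
    """Return all primes up to and including ceiling."""
    if ceiling < 2:
        return []
    is_candidate = [True] * (ceiling + 1)
    is_candidate[0] = is_candidate[1] = False
    p = 2
    while p * p <= ceiling:
        if is_candidate[p]:
            # Smaller multiples of p were already crossed out by smaller primes.
            for multiple in range(p * p, ceiling + 1, p):
                is_candidate[multiple] = False
        p += 1
    return [i for i, keep in enumerate(is_candidate) if keep]


print(sieve(50))
```

The memory cost mentioned above is visible here: the list has one entry per integer up to the ceiling.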

@akdom's question to me:
Looping would work fine on my previous suggestion, and you don't need to do any calculations to determine if a number is even; in your loop, simply skip every even number, as shown below:
// Assuming theInteger is the number to be tested for primality.
// First check whether theInteger is divisible by 2. If not, run this loop.
// This loop skips all even numbers.
for (int i = 3; i <= theInteger / i; i += 2)  // same as i*i <= theInteger, without overflow
{
    if (theInteger % i == 0)
    {
        // Getting here means that theInteger is not prime:
        // somehow indicate that some number, i, divides it and break.
        break;
    }
}

A Rutgers grad student recently found a recurrence relation that generates primes. The difference of its successive numbers will generate either primes or 1's.
a(1) = 7
a(n) = a(n-1) + gcd(n,a(n-1)).
It makes a lot of crap that needs to be filtered out. Benoit Cloitre also has this recurrence that does a similar task:
b(1) = 1
b(n) = b(n-1) + lcm(n,b(n-1))
then the ratio of successive numbers, minus one [b(n)/b(n-1)-1], is either 1 or a prime. A full account of all this can be read at Recursivity.
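Both recurrences are short enough to try directly. This is a sketch with made-up function names; each function returns the raw sequence, so the 1s that need to be filtered out are still visible.

```python
from math import gcd


def rowland_differences(count):
    """First differences a(n) - a(n-1) for n = 2 .. count + 1, with a(1) = 7."""
    a = 7
    differences = []
    for n in range(2, count + 2):
        step = gcd(n, a)
        differences.append(step)
        a += step
    return differences


def cloitre_ratios(count):
    """Values b(n)/b(n-1) - 1 for n = 2 .. count + 1, with b(1) = 1."""
    b = 1
    ratios = []
    for n in range(2, count + 2):
        following = b + n * b // gcd(n, b)   # n*b // gcd(n, b) is lcm(n, b)
        ratios.append(following // b - 1)
        b = following
    return ratios


print(rowland_differences(11))
print(cloitre_ratios(6))
```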
For the sieve, you can do better by using a wheel instead of adding one each time; check out the Improved Incremental Prime Number Sieves. Here is an example of a wheel. Suppose 2 and 5 are the primes whose multiples we want to skip. Their wheel is [2,4,2,2], the gaps between successive numbers that are divisible by neither (1, 3, 7, 9, 11).

In your algorithm using the list from 2 to the root of the integer, you can improve performance by only testing odd numbers after 2. That is, your list only needs to contain 2 and all odd numbers from 3 to the square root of the integer. This cuts the number of times you loop in half without introducing any more complexity.

@theprise
If I were wanting to use an incrementing loop instead of an instantiated list (problems with memory for massive numbers...), what would be a good way to do that without building the list?
It doesn't seem like it would be cheaper to do a divisibility check for the given integer (X % 3) than just the check for the normal number (N % X).

If you're wanting to find a way of generating prime numbers, this has been covered in a previous question.
