What is the fastest algorithm that exists to solve a particular NP-Complete problem? For example, a naive implementation of travelling salesman is O(n!), but with dynamic programming it can be done in O(n^2 * 2^n). Is there perhaps an "easier" NP-Complete problem that has a better running time?
I'm curious about exact solutions, not approximations.
[...] with dynamic programming it can be done in O(n^2 * 2^n). Is there perhaps an "easier" NP-Complete problem that has a better running time?
Sort of. You can get rid of any polynomial factor by creating an artificial problem that encodes the same solution in a polynomially larger input. As long as the input is only polynomially larger, the resulting problem is still NP-complete. Since complexity is by definition the function that maps input size to running time, growing the input size moves the same algorithm into lower O classes.
So the same algorithm, run on TSP instances padded with n^2 useless bits, has complexity roughly O(2^sqrt(N)) in terms of the padded input size N.
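To see the arithmetic: if the padded input has size N ~ n^2, then n ~ sqrt(N), and the running time n^2 * 2^n becomes N * 2^sqrt(N) = 2^(sqrt(N) + log N). Since log N grows far slower than sqrt(N), this is 2^((1 + o(1)) * sqrt(N)) - the polynomial factor has been absorbed into a lower-order term of the exponent.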
A characteristic of the NP-Complete problems is that any of the problems in NP can be mechanically transformed into any of the NP-Complete problems in, at most, polynomial time.
Therefore, whatever the best solution for any given NP-Complete problem is, it is automatically a similarly-good solution for any other NP problem.
Given that dynamic programming can solve the Travelling Salesman Problem in 2^n time and 2^n space, roughly the same must be true of all other NP problems [well, plus the time to apply the transformation - and since the reduction can enlarge the instance polynomially, the honest bound is 2^poly(n) rather than just 2^(n+1)].
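For concreteness, here is a minimal Python sketch of the Held-Karp dynamic program the question refers to (the O(n^2 * 2^n) algorithm). It assumes dist is a complete n x n distance matrix and that the tour starts and ends at city 0:

from itertools import combinations

def held_karp(dist):
    # Exact TSP in O(n^2 * 2^n) time and O(n * 2^n) space.
    # dp[(S, j)] = cheapest path from city 0 through all cities in the
    # frozenset S (0 not in S), ending at city j (j in S).
    n = len(dist)
    dp = {(frozenset([j]), j): dist[0][j] for j in range(1, n)}
    for size in range(2, n):
        for subset in combinations(range(1, n), size):
            S = frozenset(subset)
            for j in S:
                dp[(S, j)] = min(dp[(S - {j}, k)] + dist[k][j]
                                 for k in S - {j})
    full = frozenset(range(1, n))
    # Close the tour by returning to city 0.
    return min(dp[(full, j)] + dist[j][0] for j in range(1, n))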
Generally you cannot find the best solution for the fully general Travelling Salesman Problem without an exponential search over the combinations (there might be negative distances, etc.).
By adding additional restrictions and loosening the requirement of getting the best solution, you can speed up things quite a bit.
For instance, you can get polynomial running time if the distances obey the triangle inequality - going directly from A to B is never longer than going from A to C to B - and you can live with the result being at most 1.5 times the optimal value (Christofides' algorithm). See http://en.wikipedia.org/wiki/Travelling_salesman_problem#Metric_TSP
Related
I am using Modelica to solve a system of equations for heat transfer problems, and one of them is radiation, which is written as
Ta^4-Tb^4
Can someone say whether it is computationally faster to solve the system with the equation written as:
(Ta-Tb)(Ta+Tb)(Ta^2+Tb^2)
?
There cannot be a definitive answer to this question. This is because the Modelica specification is used to formally define the problem statement but it says nothing about how tools solve such equations. Furthermore, since most Modelica tools do symbolic manipulation anyway, it is hard to predict what steps they might take with such an equation. For example, a tool may very well transform this into a Horner polynomial on its own (without your manual intervention).
If you are going to solve for the temperatures in such an equation as a non-linear system, be careful about negative temperature solutions. You should investigate the "start" attribute to specify initial (positive) guesses when these temperatures are iteration variables in non-linear problems.
I would say that there are two reasons why splitting it into (Ta-Tb)(Ta+Tb)(Ta^2+Tb^2) is SLOWER and NOT FASTER.
(Ta^2+Tb^2) requires 2 multiplications and 1 addition, which means that (Ta-Tb)(Ta+Tb)(Ta^2+Tb^2) requires 4 multiplications and 3 additions/subtractions. On the other hand, I would guess that Ta^4-Tb^4 is evaluated as ((Ta^2)^2 - (Tb^2)^2), which means 4 multiplications and 1 subtraction.
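As a rough sanity check of those counts (this measures Python, not a Modelica tool, so it only illustrates the operation-count argument, not what a compiled model would do):

import timeit

Ta, Tb = 350.0, 300.0   # arbitrary example temperatures

def direct():
    # ((Ta^2)^2 - (Tb^2)^2): 4 multiplications, 1 subtraction
    a2 = Ta * Ta
    b2 = Tb * Tb
    return a2 * a2 - b2 * b2

def factored():
    # (Ta-Tb)(Ta+Tb)(Ta^2+Tb^2): 4 multiplications, 3 additions/subtractions
    return (Ta - Tb) * (Ta + Tb) * (Ta * Ta + Tb * Tb)

print(timeit.timeit(direct, number=1_000_000))
print(timeit.timeit(factored, number=1_000_000))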
A Modelica tool, like any generic compiler, probably knows very well how to optimise these very simple expressions. This means it is generally safer, in terms of computation time, to use simple patterns that will be easily caught and translated into efficient machine code.
I may of course be wrong, but I cannot see any reason why (Ta-Tb)(Ta+Tb)(Ta^2+Tb^2) could be FASTER. Hope it helps.
Oscar
I'm tackling an interesting machine learning problem and would love to hear if anyone knows a good algorithm to deal with the following:
The algorithm must learn to approximate a function of N inputs and M outputs
N is quite large, e.g. 1,000-10,000
M is quite small, e.g. 5-10
All inputs and outputs are floating point values, could be positive or negative, likely to be relatively small in absolute value but no absolute guarantees on bounds
Each time period I get N inputs and need to predict the M outputs; at the end of the time period the actual values of the M outputs are provided (i.e. this is a supervised learning situation where learning needs to take place online)
The underlying function is non-linear, but not too nasty (e.g. I expect it will be smooth and continuous over most of the input space)
There will be a small amount of noise in the function, but the signal/noise ratio is likely to be good - I expect the N inputs will explain 95%+ of the output values
The underlying function is slowly changing over time - unlikely to change drastically in a single time period but is likely to shift slightly over the 1000s of time periods range
There is no hidden state to worry about (other than the changing function), i.e. all the information required is in the N inputs
I'm currently thinking some kind of back-propagation neural network with lots of hidden nodes might work - but is that really the best approach for this situation and will it handle the changing function?
With your number of inputs and outputs, I'd also go for a neural network; it should give a good approximation. The slow drift suits a back-propagation technique well, since the network should not have to 'de-learn' much.
I think stochastic gradient descent (http://en.wikipedia.org/wiki/Stochastic_gradient_descent) would be a straightforward first step; it will probably work nicely given the operating conditions you have.
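As a minimal sketch of what that looks like (assuming numpy, and using a plain linear model for brevity - a hidden layer with a nonlinearity would follow the same predict/update pattern):

import numpy as np

class OnlineRegressor:
    # Linear model y = W x + b trained by stochastic gradient descent,
    # one (x, y) pair at a time. A constant learning rate lets the model
    # keep tracking a slowly drifting target function.
    def __init__(self, n_inputs, m_outputs, lr=1e-3):
        self.W = np.zeros((m_outputs, n_inputs))
        self.b = np.zeros(m_outputs)
        self.lr = lr

    def predict(self, x):
        return self.W @ x + self.b

    def update(self, x, y_true):
        err = self.predict(x) - y_true      # gradient of the squared error
        self.W -= self.lr * np.outer(err, x)
        self.b -= self.lr * err

# Each time period: predict first, then learn from the revealed outputs.
model = OnlineRegressor(n_inputs=1000, m_outputs=5)
x = np.random.randn(1000)
y_hat = model.predict(x)
y_true = np.random.randn(5)     # stand-in for the observed outputs
model.update(x, y_true)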
I'd also go for an ANN. A single layer might do fine since your input space is large. You might want to give that a shot before adding a lot of hidden layers.
#mikera What is it going to be used for? Is it an assignment in an ML course?
I'm looking for an algorithm to calculate ln(1-x). x is often small (<0.01), but occasionally it might be larger. The algorithm needs to be accurate and not too slow. I'd rather not use a library function for ln(x), because I might lose accuracy.
Depending on the accuracy you want, -x is a good approximation to ln(1-x) for small x (it is the first term of the Taylor series -x - x^2/2 - x^3/3 - ...). From here.
Edit: If the reason for needing the algorithm is to get the best accuracy, there are many libraries that are specialised for log(1+x). For example, in Python use math.log1p; C and C++ provide log1p as well.
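In Python the whole problem then reduces to one call, since ln(1-x) = log1p(-x):

import math

def ln_one_minus(x):
    # Accurate ln(1 - x), even for tiny x where 1 - x rounds to 1.0.
    return math.log1p(-x)

print(ln_one_minus(1e-20))    # -1e-20, correct
print(math.log(1 - 1e-20))    # 0.0 - the naive form loses everything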
If you are using MATLAB the log1p() function was designed specifically for calculating ln(1+x) for small values of x.
I have already finished my application, which multiplies a CRS matrix by a vector (SpMV), and the only thing left to do is to count the FLOPs it performs. In my opinion it's really hard to estimate the number of floating-point operations in sparse matrix-vector multiplication, because the number of multiplications per row fluctuates.
I have only tried measuring time using "cudaprof" (available in the ./CUDA/bin directory) - it works fine.
Any suggestions and code snippets appreciated!
That's not just your opinion; it's a simple fact that the number of operations in the case of a sparse matrix is data-dependent, so you can't get a reasonable answer without knowing something about the data. That makes it impossible to have a one-number-fits-all-data estimate.
This is probably one of the sorts of situations where you could think hard about it for many hours (and do lots of research) to make a maybe-accurate estimate, or you could spend a few minutes writing a variant of your existing implementation that increments a counter each time it does an operation. Sure, that's going to take quite a while to run (especially if you don't do it in a CUDA-enabled form), but probably a lot less time than it would take to do the thinking, and when the answer comes out, you don't have to do a lot of work to convince yourself that it's right.
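For example, a minimal counting variant (plain Python rather than CUDA, so it is only a sketch of the idea). For CRS SpMV the counter simply comes out as twice the number of stored nonzeros: one multiply and one add per entry.

def spmv_crs_counted(values, col_idx, row_ptr, x):
    # y = A * x for a matrix in CRS (CSR) format, counting every
    # floating-point operation as it happens.
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    flops = 0
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
            flops += 2          # 1 multiply + 1 add
        y[i] = acc
    return y, flops             # flops == 2 * len(values)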
Is there a known algorithm or method to find all complete sub-graphs within a graph? I have an undirected, unweighted graph and I need to find all subgraphs within it where each node in the subgraph is connected to each other node in the subgraph.
Is there an existing algorithm for this?
This is known as the clique problem; it is NP-complete in general (in its decision version), and yes, there are many algorithms for it.
If the graph has additional properties (e.g. it's bipartite), the problem becomes considerably easier and is solvable in polynomial time; otherwise it's very hard, and exactly solvable in practice only for small graphs.
From Wikipedia
In computer science, the clique problem refers to any of the problems related to finding particular complete subgraphs ("cliques") in a graph, i.e., sets of elements where each pair of elements is connected.
Clique problems include:
finding the maximum clique (a clique with the largest number of vertices),
finding a maximum weight clique in a weighted graph,
listing all maximal cliques (cliques that cannot be enlarged), and
solving the decision problem of testing whether a graph contains a clique larger than a given size.
These problems are all hard: the clique decision problem is NP-complete (one of Karp's 21 NP-complete problems), the problem of finding the maximum clique is both fixed-parameter intractable and hard to approximate, and listing all maximal cliques may require exponential time as there exist graphs with exponentially many maximal cliques. Nevertheless, there are algorithms for these problems that run in exponential time or that handle certain more specialized input graphs in polynomial time.
See also
Bron–Kerbosch algorithm
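For reference, a compact Python sketch of the basic (non-pivoting) form of that algorithm, which reports each maximal clique exactly once:

def bron_kerbosch(R, P, X, adj, cliques):
    # R: vertices in the current clique, P: candidates that could extend
    # it, X: vertices already covered by earlier branches.
    if not P and not X:
        cliques.append(set(R))      # R is maximal
        return
    for v in list(P):
        bron_kerbosch(R | {v}, P & adj[v], X & adj[v], adj, cliques)
        P.remove(v)
        X.add(v)

# adj maps each vertex to the set of its neighbours (undirected graph).
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {4}, 4: {3}}
cliques = []
bron_kerbosch(set(), set(adj), set(), adj, cliques)
print(cliques)    # [{0, 1, 2}, {3, 4}]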
Well, the brute-force approach to finding a k-vertex clique in a graph with n vertices has complexity

O(n^k * k^2)

since there are O(n^k) vertex subsets of size k to check, and each of them has O(k^2) pairs of vertices that must be tested for edges.
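In code, the brute force looks like this (a sketch, using Python sets for adjacency):

from itertools import combinations

def has_k_clique(adj, k):
    # Try all C(n, k) = O(n^k) vertex subsets; for each, test all
    # O(k^2) pairs for an edge.
    for subset in combinations(adj, k):
        if all(v in adj[u] for u, v in combinations(subset, 2)):
            return True
    return False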
What you are asking for, listing all maximal complete subgraphs (cliques) in a graph, is hard in general, since a graph can contain exponentially many maximal cliques; the Bron-Kerbosch algorithm mentioned above is the standard way to do it.