How to convert the intersection of two DFAs into a minimal DFA

I have the following problem: there are two deterministic finite automata that should be intersected and the result converted into a single minimal deterministic finite automaton. Is there an algorithm to do that?
I know that I can create an NFA by taking the Cartesian product of the two automata and then convert the outcome into a DFA, but this is a time-consuming procedure. Is there an easier way to create the intersection of two automata?
By the way, here is how far I got towards the solution:
I tried the approach as described below, but I can't see how to reach the solution: computing the complement of each of the two DFAs gives me two new DFAs with exactly two accepting states. Now I have to combine them and minimize them, but where do I get the third accepting state from?

To the best of my knowledge, there's no "direct" algorithm that accomplishes this. You can do this by
minimizing the two input DFAs,
computing their Cartesian product (which produces another DFA, by the way, not an NFA), then
minimizing the result.
It's not strictly necessary to minimize the two input DFAs, but it can help with efficiency. If you have an n-state and an m-state DFA, their Cartesian product will have O(mn) states. The DFA minimization algorithm runs in time O(k²), where k is the number of states in the DFA, so if the original DFAs have sizes n and m, computing the Cartesian product and then minimizing will take time O(m²n²), whereas minimizing, then computing the Cartesian product, then minimizing again takes time O(m² + n² + m′²n′²), where m′ and n′ are the sizes of the minimized DFAs.
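For concreteness, here is a minimal sketch of the product construction in Python. The DFA representation (a dict mapping (state, symbol) pairs to states, plus a start state and a set of accepting states) is my own choice for illustration; only product states reachable from the start pair are generated, and any standard minimization algorithm (e.g. Hopcroft's) can then be run on the result.

```python
def intersect_dfas(delta1, start1, accept1, delta2, start2, accept2):
    """Product construction: returns a DFA accepting L(D1) ∩ L(D2).

    delta1/delta2: dict mapping (state, symbol) -> state.
    Only pairs reachable from (start1, start2) are generated.
    """
    alphabet = {sym for (_, sym) in delta1} & {sym for (_, sym) in delta2}
    start = (start1, start2)
    delta, accept = {}, set()
    frontier, seen = [start], {start}
    while frontier:
        (p, q) = frontier.pop()
        if p in accept1 and q in accept2:      # accept iff both DFAs accept
            accept.add((p, q))
        for sym in alphabet:
            nxt = (delta1[(p, sym)], delta2[(q, sym)])
            delta[((p, q), sym)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return delta, start, accept

# Tiny usage example: D1 accepts strings with an even number of 'a's,
# D2 accepts strings ending in 'a'; the product accepts both at once.
d1 = {(0, 'a'): 1, (1, 'a'): 0, (0, 'b'): 0, (1, 'b'): 1}
d2 = {('x', 'a'): 'y', ('y', 'a'): 'y', ('x', 'b'): 'x', ('y', 'b'): 'x'}
delta, start, accept = intersect_dfas(d1, 0, {0}, d2, 'x', {'y'})
print(start, sorted(accept))
```

The key point is that the result is again deterministic: each product state has exactly one successor per symbol, so no subset construction is needed.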
Hope this helps!

Related

Simulating a matrix of variables with predefined correlation structure

For a simulation study I am working on, we are trying to test an algorithm that aims to identify specific culprit factors that predict a binary outcome of interest from a large mixture of possible exposures that are mostly unrelated to the outcome. To test this algorithm, I am trying to simulate the following data:
A binary dependent variable
A set of, say, 1000 variables, most binary and some continuous, that are not associated with the outcome (that is, are completely independent from the binary dependent variable, but that can still be correlated with one another).
A group of 10 or so binary variables which will be associated with the dependent variable. I will determine a priori the magnitude of the correlation with the binary dependent variable, as well as their frequency in the data.
Generating a random set of binary variables is easy. But is there a way of doing this while ensuring that none of these variables are correlated with the dependent outcome?
Thank you!
"But is there a way of doing this while ensuring that none of these variables are correlated with the dependent outcome?"
With statistical sampling you can't ensure anything, you can only adjust the acceptable risk. Finding an acceptable level of risk may be harder than many people think.
Spurious correlations are a very real phenomenon. Real independent observations will often contain correlations, and if you want to actually test your algorithm to see how it will perform in reality then your tests should produce such phenomena in a manner similar to the real world—you should be generating independent candidate factors and allowing spurious correlations to occur.
If you are performing ~1000 independent tests of candidate factors and you're targeting a risk level of α = 0.05, you can expect around 50 of the truly non-predictive factors to appear significant and leak through into your analysis. To avoid this, you need to adjust your testing threshold using something along the lines of a Bonferroni correction. Recall that statistical discriminating power is based on standard error, which is inversely proportional to the square root of the sample size. Bonferroni says that 1000 simultaneous tests need their individual significance threshold divided by 1000, which in turn means you need a considerably larger sample size than when performing a single test for significance if you want to retain the same power.
So in summary, I'd say that you shouldn't attempt to ensure a lack of correlation; it's going to occur in the real world. You can mitigate the risk of non-predictive factors being included due to spurious correlation by generating massive amounts of data. In practice there will be non-predictors that leak through unless you can obtain enough data, so I'd suggest that your testing should address the rates at which this occurs as a function of the number of candidate factors and the sample size.
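To make the point concrete, here is a small simulation sketch in Python. The sample size, the number of candidate factors, and the use of a per-factor Pearson correlation test are illustrative choices of mine: every factor is generated independently of the outcome, yet roughly 5% of them pass an uncorrected α = 0.05 test, while a Bonferroni-corrected threshold admits almost none.

```python
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n_obs, n_factors, alpha = 500, 1000, 0.05

# Binary outcome and 1000 binary candidate factors, all generated independently.
y = rng.integers(0, 2, size=n_obs)
X = rng.integers(0, 2, size=(n_obs, n_factors))

p_values = []
for j in range(n_factors):
    r, p = pearsonr(X[:, j], y)   # test each factor against the outcome
    p_values.append(p)
p_values = np.array(p_values)

print("significant at alpha:", np.sum(p_values < alpha))                      # roughly 50 spurious hits
print("significant after Bonferroni:", np.sum(p_values < alpha / n_factors))  # usually 0
```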

Can I find price floors and ceilings with CUDA?

Background
I'm trying to convert an algorithm from sequential to parallel, but I am stuck.
Point and Figure Charts
I am creating point and figure charts.
Decreasing
While the stock is going down, add an O every time it breaks through the floor.
Increasing
While the stock is going up, add an X every time it breaks through the ceiling.
Reversal
If the stock reverses direction but the change is less than a reversal threshold (3 units), do nothing. If the change is greater than the reversal threshold, start a new column (X or O).
Sequential vs Parallel
Sequentially, this is pretty straightforward. I keep a variable for the floor and the ceiling. If the current price breaks through the floor or ceiling, or changes by more than the reversal threshold, I can take the appropriate action; a sketch of that sequential logic is below.
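A minimal sequential sketch in Python (the function name, the simplifications, and the toy price series are illustrative assumptions, not the actual charting code):

```python
def reversals(prices, reversal=3.0):
    """Return indices where the running trend reverses by at least
    `reversal` units (a simplified stand-in for starting a new column)."""
    points = []
    direction = 0            # +1 rising, -1 falling, 0 undecided
    extreme = prices[0]      # current ceiling (rising) or floor (falling)
    for i, p in enumerate(prices[1:], start=1):
        if direction >= 0 and p > extreme:        # still breaking the ceiling
            extreme, direction = p, 1
        elif direction <= 0 and p < extreme:      # still breaking the floor
            extreme, direction = p, -1
        elif abs(p - extreme) >= reversal:        # moved against the trend far enough
            points.append(i)                       # start of a new column here
            direction, extreme = -direction, p
    return points

print(reversals([10, 11, 12, 11.5, 9, 9.5, 13]))   # [4, 6]
```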
My question is, is there a way to find these reversal points in parallel? I'm fairly new to thinking in parallel, so I'm sorry if this is trivial. I am trying to do this in CUDA, but I have been stuck for weeks. I have tried using the finite difference algorithms from NVIDIA. These produce local maxima/minima but not the reversal points. Small fluctuations produce numerous relative maxima/minima, but most of them are trivial because the change is not greater than the reversal size.
My question is, is there a way to find these reversal points in parallel?
One possible approach:
use thrust::unique to remove periods where the price is numerically constant
use thrust::adjacent_difference to produce 1st difference data
use thrust::adjacent_difference on the 1st difference data to get the 2nd difference data, i.e. the points where there is a change in the sign of the slope.
use these points of change in sign of slope to identify separate regions of data - build a key vector from these (e.g. with a prefix sum). This key vector segments the price data into "runs" where the price change is in a particular direction.
use thrust::exclusive_scan_by_key on the 1st difference data, to produce the net change of the run
Wherever the net change of the run exceeds a threshold, flag as a "reversal"
Your description of what constitutes a reversal may also be slightly unclear. The above method would not flag a reversal on certain data patterns that you might classify as a reversal. I suspect you are looking beyond a single run as I have defined it here. If that is the case, there may be a method to address that as well - with more steps.
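To illustrate the logic of the pipeline above without the GPU part, here is a CPU sketch in Python/NumPy; the run segmentation and the per-run net change mirror what the Thrust calls would compute, and the threshold value plus the toy price series are assumptions for illustration.

```python
import numpy as np

def flag_runs(prices, threshold=3.0):
    """CPU sketch of the segmentation idea: remove flat stretches, split the
    series into runs of constant slope sign, and flag runs whose net change
    meets the threshold."""
    p = np.asarray(prices, dtype=float)
    p = p[np.concatenate(([True], np.diff(p) != 0))]    # analogous to thrust::unique
    d1 = np.diff(p)                                      # 1st difference (slope)
    sign_change = np.concatenate(([False], np.sign(d1[1:]) != np.sign(d1[:-1])))
    keys = np.cumsum(sign_change)                        # run id per slope entry
                                                         # (prefix sum -> key vector)
    flagged = []
    for k in np.unique(keys):                            # per-run reduction, i.e. the
        net = d1[keys == k].sum()                        # role of the scan by key
        if abs(net) >= threshold:
            flagged.append((int(k), net))
    return flagged

print(flag_runs([10, 11, 12, 12, 11.5, 9, 9.5, 13]))     # [(1, -3.0), (2, 4.0)]
```

Each step has a direct Thrust counterpart: the de-duplication corresponds to thrust::unique, the differencing to thrust::adjacent_difference, the key vector to a prefix sum, and the per-run net change to a scan by key.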

Solve linear equation AX=B

I currently solve the equation Ax=b two times,
where A is a sparse NxN matrix and
x, b are vectors of size N. (I have b1 and b2.)
I want to reduce the time by solving both of them in one shot using cusparse functions.
So what I thought is to build, from the two b's I have, one Nx2 matrix B and solve it against A, as the equation AX=B allows.
Is that theoretically right?
Which cusparse function should I use?
Please note that I'm working with a sparse matrix and not a dense matrix.
Thanks!
To answer your questions:
Yes, it is possible to solve a suitably well conditioned sparse problem for multiple RHS vectors in this way.
Unless your LHS sparse matrix is either tridiagonal or triangular, you can't use cusparse directly for this.
cusolver 7.5 contains several "low level" routines for factorising sparse matrices, meaning that you can factorise once and reuse the factorisation several times with different RHS; for example, cusolverSpXcsrluSolve() can be called after an LU factorisation to solve using the same precomputed factorisation as many times as you require. (Note I originally had assumed that there was a sparse getrs-like function in cusolver, and it appears there isn't. I certainly talked to NVIDIA about the use case for one some years ago and thought they had added it; sorry for the confusion there.)
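The factor-once, solve-many pattern described above is independent of the particular library. Here is a CPU sketch of the idea using SciPy's sparse LU (splu), purely to illustrate the concept; the cusolver routines named above play the analogous role on the GPU, and the small matrix and right-hand sides are made up for the example.

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import splu

# Small sparse system, just for illustration.
A = csc_matrix(np.array([[4.0, 1.0, 0.0],
                         [1.0, 3.0, 1.0],
                         [0.0, 1.0, 2.0]]))
b1 = np.array([1.0, 2.0, 3.0])
b2 = np.array([4.0, 5.0, 6.0])

lu = splu(A)          # factorise A once (the expensive step)
x1 = lu.solve(b1)     # reuse the factorisation for each right-hand side
x2 = lu.solve(b2)

# Equivalently, stack the RHS vectors into an N x 2 matrix B and solve AX = B.
X = lu.solve(np.column_stack([b1, b2]))
print(np.allclose(X, np.column_stack([x1, x2])))   # True
```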

Logic or lookup table: Best practices

Suppose you have a function/method that uses two metrics to return a value — essentially a 2D matrix of possible values. Is it better to use logic (nested if/switch statements) to choose the right value, or just build that matrix (as an Array/Hash/Dictionary/whatever), so that returning the value becomes simply a matter of performing a lookup?
My gut feeling says that for an M⨉N matrix, relatively small values for both M and N (like ≤3) would be OK to use logic, but for larger values it would be more efficient to just build the matrix.
What are general best practices for this? What about for an N-dimensional matrix?
The decision depends on multiple factors, including:
Which option makes the code more readable and hence easier to maintain
Which option performs faster, especially if the lookup happens squillions of times
How often do the values in the matrix change? If the answer is "often", then it is probably better to externalise the values out of the code and put them in a matrix stored in a way that can be edited easily.
Not only how big is the matrix but how sparse is it?
What I'd say is that about nine conditions is the limit for an if .. else ladder or a switch. So if you have a 2D cell you can reasonably hard-code the up, down, diagonals, and so on. If you go to three dimensions you have 27 cases and that's too much, but it's OK if you're restricted to the six cube faces.
Once you've got a lot of conditions, start coding via look-up tables.
But there's no universal answer. For example, Windows message loops need to deal with a lot of different messages, and you can't sensibly encode the handling code in look-up tables.
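For what it's worth, here is a small Python sketch of the two styles side by side; the metrics and values are invented for illustration. The lookup version also makes it easy to move the table out of the code entirely (e.g. load it from a file) if the values change often.

```python
# Style 1: nested logic. Readable for a handful of cases, unwieldy beyond that.
def shipping_cost_if(size, speed):
    if size == "small":
        if speed == "standard":
            return 5
        else:           # "express"
            return 12
    else:               # "large"
        if speed == "standard":
            return 9
        else:
            return 20

# Style 2: a lookup table. Scales to larger/denser matrices and can be
# externalised (loaded from JSON/CSV) when the values change often.
SHIPPING_COST = {
    ("small", "standard"): 5,
    ("small", "express"): 12,
    ("large", "standard"): 9,
    ("large", "express"): 20,
}

def shipping_cost_table(size, speed):
    return SHIPPING_COST[(size, speed)]

assert shipping_cost_if("large", "express") == shipping_cost_table("large", "express")
```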

Multiple classifications from predictive classifier

I have a classification problem where I would like to predict an outcome, but I would like my classifier to get several 'attempts' at the answer (something like placing an each-way bet), rather than a single classification which is either correct or incorrect, and I was wondering about the best process for this.
Example: given outcomes A, B, C, and D, I would like to predict that it will be 'A or B', or 'A or C', and have the 'correct' solutions (those that at least contain the right individual answer) affect the learning process accordingly.
So far, my thoughts have been to split the data set up into bins, more or less as above (A or C), and train a classifier in the usual way, or to train multiple classifiers such that they are diverse and simply combine the results, but I was wondering if there is a better/different way? I'm sure this can't be a unique problem, but I'm not sure of the correct terminology to Google.
I don't know if it's a related problem, but is there also a way to include 'I don't know' among the options, i.e. not make a classification at all?
A lot of classifiers can do what you want.
Naive Bayes can give you probabilities for each label, so you can take the k most probable labels instead of just the single most probable one and output those.
Logistic regression and SVMs can also give you a score for each label, letting you do something similar.
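For example, with scikit-learn (the dataset, the classifier, k, and the abstention threshold are illustrative choices of mine; any classifier exposing predict_proba works the same way):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
clf = GaussianNB().fit(X, y)

k = 2
proba = clf.predict_proba(X[:1])[0]          # class probabilities for one sample
top_k = np.argsort(proba)[::-1][:k]          # indices of the k most probable classes
print(clf.classes_[top_k], proba[top_k])

# An "I don't know" option: abstain when even the best class is not confident enough.
if proba.max() < 0.6:                        # threshold is an illustrative choice
    print("abstain")
```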
Another trick is to slightly perturb the input feature vector and feed it to the classifier. Repeat that several times, and you get not one output label, but several. You can count and sort them by frequency to get multiple potential answers. You can then apply some cutoff criterion to pick only a subset of those labels and return them to the user.
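A sketch of that perturbation trick, again with illustrative choices for the dataset, the noise scale, and the number of repeats:

```python
from collections import Counter
import numpy as np
from sklearn.datasets import load_iris
from sklearn.naive_bayes import GaussianNB

X, y = load_iris(return_X_y=True)
clf = GaussianNB().fit(X, y)
rng = np.random.default_rng(0)

x = X[:1]                                    # a single query point
n_repeats, noise_scale = 50, 0.1             # arbitrary illustrative choices

votes = Counter()
for _ in range(n_repeats):
    x_perturbed = x + rng.normal(scale=noise_scale, size=x.shape)
    votes[int(clf.predict(x_perturbed)[0])] += 1

print(votes.most_common())                   # labels sorted by how often they were predicted
```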