MiniZinc: Pairwise intersection of int arrays

I have some arrays (possibly more than 10) of int variables. I am looking for an efficient way to constrain the pairwise intersection count of these arrays, i.e. each array may not have more than x elements in common with any other array.
Example in pseudocode: [1,4,4] and [2,2,1] would have one element in common -> the number 1. [4,4,4] and [9,4,4] have the element 4 in common; the duplicate 4s should be ignored.
In my current implementation, I iterate over all pairs of arrays and, for each pair, check for each element whether it is also in the other array. This is of course very slow, and duplicates are not eliminated as they should be.
The interesting part of my code looks like this:
constraint matches [0] = exists ( i in index_set(values1) ) ( values1[i]==values2[0] );
constraint matches [1] = exists ( i in index_set(values1) ) ( values1[i]==values2[1] );
constraint matches [2] = exists ( i in index_set(values1) ) ( values1[i]==values2[2] );
constraint matches [3] = exists ( i in index_set(values1) ) ( values1[i]==values2[3] );
constraint matches [4] = exists ( i in index_set(values1) ) ( values1[i]==values2[4] );
constraint sum(matches) < x;
I have thought about using MiniZinc sets, as they support some set operations, but I could not get them to work with variables.
Any ideas?

Perhaps something like this, using array2set to convert the arrays to sets and then card and intersect to calculate the size of the intersection of each pair.
int: rows = 4; % number of rows
int: cols = 5; % number of columns
array[1..rows,1..cols] of int: a = array2d(1..rows,1..cols,
[
4,6,9,5,6,
5,3,7,1,3,
3,8,3,3,1,
1,1,4,7,2,
]);
% convert the arrays to sets
array[1..rows] of set of int: s = [array2set([a[i,j] | j in 1..cols]) | i in 1..rows];
% decision variables
var int: z;
solve satisfy;
constraint
z = sum(r1,r2 in 1..rows where r1 < r2) (
card(s[r1] intersect s[r2])
)
;
output
[
"z:\(z)\n"
] ++
[
show(s[i]) ++ "\n"
| i in 1..rows
];
The output of this model is
z:7
{4,5,6,9}
{1,3,5,7}
{1,3,8}
{1,2,4,7}
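For reference, the same set-based counting is easy to sanity-check outside MiniZinc; a small Python sketch on the example data above (the per-pair limit x = 2 is just an illustrative value):
from itertools import combinations

a = [[4, 6, 9, 5, 6],
     [5, 3, 7, 1, 3],
     [3, 8, 3, 3, 1],
     [1, 1, 4, 7, 2]]

sets = [set(row) for row in a]   # duplicates disappear automatically
pairs = {(r1, r2): len(sets[r1] & sets[r2])
         for r1, r2 in combinations(range(len(sets)), 2)}

print(sum(pairs.values()))                   # 7, the same total as z above
x = 2                                        # hypothetical per-pair limit
print(all(c <= x for c in pairs.values()))   # the asker's pairwise cap holds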

Related

How to calculate a probability vector and an observation count vector for a range of bins?

I want to test the hypothesis that some 30 observed counts follow a Poisson distribution.
#GNU Octave
X = [8 0 0 1 3 4 0 2 12 5 1 8 0 2 0 1 9 3 4 5 3 3 4 7 4 0 1 2 1 2]; #30 observations
bins = {0, 1, [2:3], [4:5], [6:20]}; #each bin can be single value or multiple values
I am trying to use Pearson's chi-square statistic here and have coded the function below. I want a Poisson vector to contain the corresponding Poisson probability for each bin, and to count the observations for each bin. I feel the loop is rather redundant and ugly. Can you please let me know how I can refactor the function without the loop and make the whole calculation cleaner and more vectorized?
function result= poissonGoodnessOfFit(bins, observed)
assert(iscell(bins), "bins should be a cell array");
assert(all(cellfun("ismatrix", bins)) == 1, "bin entries either scalars or matrices");
assert(ismatrix(observed) && rows(observed) == 1, "observed data should be a 1xn matrix");
lambda_head = mean(observed); #poisson lambda parameter estimate
k = length(bins); #number of bin groups
n = length(observed); #number of observations
poisson_probability = []; #variable for poisson probability for each bin
observations = []; #variable for observation counts for each bin
for i=1:k
if isscalar(bins{1,i}) #this bin contains a single value
poisson_probability(1,i) = poisspdf(bins{1, i}, lambda_head);
observations(1, i) = histc(observed, bins{1, i});
else #this bin contains a range of values
inner_bins = bins{1, i}; #retrieve the range
inner_bins_k = length(inner_bins); #number of values inside
inner_poisson_probability = []; #variable to store individual probability of each value inside this bin
inner_observations = []; #variable to store observation counts of each value inside this bin
for j=1:inner_bins_k
inner_poisson_probability(1,j) = poisspdf(inner_bins(1, j), lambda_head);
inner_observations(1, j) = histc(observed, inner_bins(1, j));
endfor
poisson_probability(1, i) = sum(inner_poisson_probability, 2); #assign over the sum of all inner probabilities
observations(1, i) = sum(inner_observations, 2); #assign over the sum of all inner observation counts
endif
endfor
expected = n .* poisson_probability; #expected observations if indeed poisson using lambda_head
chisq = sum((observations - expected).^2 ./ expected, 2); #Pearson Chi-Square statistics
pvalue = 1 - chi2cdf(chisq, k-1-1);
result = struct("actual", observations, "expected", expected, "chi2", chisq, "pvalue", pvalue);
return;
endfunction
There are a couple of things worth noting in the code.
First, the 'scalar' case in your if block is actually identical to your 'range' case, since a scalar is simply a range of 1 element. So no special treatment is needed for it.
Second, you don't need to create such explicit subranges; your bin groups are amenable to being used as indices into a larger result (as long as you add 1 to convert from 0-based values to 1-based indices).
Therefore my approach would be to calculate the expected and observed numbers over the entire domain of interest (as inferred from your bin groups), and then use the bin groups themselves (shifted by 1) as indices to pick out the desired subgroups, summing accordingly.
Here's example code, written in the Octave/MATLAB-compatible subset of both languages:
function Result = poissonGoodnessOfFit( BinGroups, Observations )
% POISSONGOODNESSOFFIT( BinGroups, Observations) calculates the [... etc, etc.]
pkg load statistics; % only needed in octave; for matlab buy statistics toolbox.
assert( iscell( BinGroups ), 'Bins should be a cell array' );
assert( all( cellfun( @ismatrix, BinGroups ) ) == 1, 'Bin entries either scalars or matrices' );
assert( ismatrix( Observations ) && rows( Observations ) == 1, 'Observed data should be a 1xn matrix' );
% Define helpful variables
RangeMin = min( cellfun( @min, BinGroups ) );
RangeMax = max( cellfun( @max, BinGroups ) );
Domain = RangeMin : RangeMax;
LambdaEstimate = mean( Observations );
NBinGroups = length( BinGroups );
NObservations = length( Observations );
% Get expected and observed numbers per 'bin' (i.e. discrete value) over the *entire* domain.
Expected_Domain = NObservations * poisspdf( Domain, LambdaEstimate );
Observed_Domain = histc( Observations, Domain );
% Apply BinGroup values as indices
Expected_byBinGroup = cellfun( @(c) sum( Expected_Domain(c+1) ), BinGroups );
Observed_byBinGroup = cellfun( @(c) sum( Observed_Domain(c+1) ), BinGroups );
% Perform a Chi-Square test on the Bin-wise Expected and Observed outputs
O = Observed_byBinGroup; E = Expected_byBinGroup ; df = NBinGroups - 1 - 1;
ChiSquareTestStatistic = sum( (O - E) .^ 2 ./ E );
PValue = 1 - chi2cdf( ChiSquareTestStatistic, df );
Result = struct( 'actual', O, 'expected', E, 'chi2', ChiSquareTestStatistic, 'pvalue', PValue );
end
Running with your example gives:
X = [8 0 0 1 3 4 0 2 12 5 1 8 0 2 0 1 9 3 4 5 3 3 4 7 4 0 1 2 1 2]; % 30 observations
bins = {0, 1, [2:3], [4:5], [6:20]}; % each bin can be single value or multiple values
Result = poissonGoodnessOfFit( bins, X )
% Result =
% scalar structure containing the fields:
% actual = 6 5 8 6 5
% expected = 1.2643 4.0037 13.0304 8.6522 3.0493
% chi2 = 21.989
% pvalue = 0.000065574
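As a quick sanity check of the arithmetic, the chi-square statistic and p-value can be reproduced directly from the 'actual' and 'expected' vectors above; a small Python snippet (using numpy/scipy, outside Octave, purely for verification):
import numpy as np
from scipy.stats import chi2

observed = np.array([6, 5, 8, 6, 5])                            # 'actual' above
expected = np.array([1.2643, 4.0037, 13.0304, 8.6522, 3.0493])  # 'expected' above

stat = np.sum((observed - expected) ** 2 / expected)   # Pearson chi-square
pvalue = chi2.sf(stat, df=len(observed) - 1 - 1)       # df = k - 1 - 1 = 3
print(stat, pvalue)   # roughly 21.99 and 6.6e-05, matching the output above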
A general comment about the code: it is always preferable to write self-explanatory code rather than code that does not make sense by itself in the absence of a comment. Comments should generally only be used to explain the 'why', rather than the 'how'.

How to truncate double precision value in PostgreSQL by keeping exactly first two decimals?

I'm trying to truncate a double precision value when I build JSON using the json_build_object() function in PostgreSQL 11.8, but with no luck. To be more precise, I'm trying to truncate the number 19.9899999999999984 to ONLY two decimals, but making sure it DOES NOT get rounded to 20.00 (which is what happens), but stays at 19.98.
BTW, what I've tried so far was to use:
1) TRUNC(found_book.price::numeric, 2) and I get value 20.00
2) ROUND(found_book.price::numeric, 2) and I get value 19.99 -> so far this is the closest value, but not what I need
3) ROUND(found_book.price::double precision, 2) and I get
[42883] ERROR: function round(double precision, integer) does not exist
Also, here is the whole code I'm using:
create or replace function public.get_book_by_book_id8(b_id bigint) returns json as
$BODY$
declare
found_book book;
book_authors json;
book_categories json;
book_price double precision;
begin
-- Load book data:
select * into found_book
from book b2
where b2.book_id = b_id;
-- Get assigned authors
select case when count(x) = 0 then '[]' else json_agg(x) end into book_authors
from (select aut.*
from book b
inner join author_book as ab on b.book_id = ab.book_id
inner join author as aut on ab.author_id = aut.author_id
where b.book_id = b_id) x;
-- Get assigned categories
select case when count(y) = 0 then '[]' else json_agg(y) end into book_categories
from (select cat.*
from book b
inner join category_book as cb on b.book_id = cb.book_id
inner join category as cat on cb.category_id = cat.category_id
where b.book_id = b_id) y;
book_price = trunc(found_book.price, 2);
-- Build the JSON response:
return (select json_build_object(
'book_id', found_book.book_id,
'title', found_book.title,
'price', book_price,
'amount', found_book.amount,
'is_deleted', found_book.is_deleted,
'authors', book_authors,
'categories', book_categories
));
end
$BODY$
language 'plpgsql';
select get_book_by_book_id8(186);
How do I keep EXACTLY and ONLY the FIRST two decimal digits, i.e. 19.98? (Any suggestion/help is greatly appreciated.)
P.S. PostgreSQL version is 11.8
In PostgreSQL 11.8 or 12.3 I cannot reproduce:
# select trunc('19.9899999999999984'::numeric, 2);
trunc
-------
19.98
(1 row)
# select trunc(19.9899999999999984::numeric, 2);
trunc
-------
19.98
(1 row)
# select trunc(19.9899999999999984, 2);
trunc
-------
19.98
(1 row)
Actually I can reproduce with the right type and a special setting:
# set extra_float_digits=0;
SET
# select trunc(19.9899999999999984::double precision::text::numeric, 2);
trunc
-------
19.99
(1 row)
And a possible solution:
# show extra_float_digits;
extra_float_digits
--------------------
3
(1 row)
select trunc(19.9899999999999984::double precision::text::numeric, 2);
trunc
-------
19.98
(1 row)
But note that:
Note: The extra_float_digits setting controls the number of extra
significant digits included when a floating point value is converted
to text for output. With the default value of 0, the output is the
same on every platform supported by PostgreSQL. Increasing it will
produce output that more accurately represents the stored value, but
may be unportable.
As @pifor suggested, I've managed to get it done by passing trunc(found_book.price::double precision::text::numeric, 2) directly as the value in json_build_object, like this:
json_build_object(
'book_id', found_book.book_id,
'title', found_book.title,
'price', trunc(found_book.price::double precision::text::numeric, 2),
'amount', found_book.amount,
'is_deleted', found_book.is_deleted,
'authors', book_authors,
'categories', book_categories
)
Using book_price = trunc(found_book.price::double precision::text::numeric, 2); and passing it as the value for the 'price' key didn't work (presumably because book_price is declared as double precision, so the truncated numeric result gets cast straight back to a double).
Thank you for your help. :)

NxN detect win for tic-tac-toe

I have tried to generalise my tic-tac-toe game for an NxN grid. I have everything working but am finding it hard to get the code needed to detect a win.
This is my function at the moment, where I loop over the rows and columns of the board. I can't figure out why it isn't working. Thanks.
def check_win(array_board):
global winner
for row in range(N):
for i in range(N-1):
if array_board[row][i] != array_board[row][i+1] or array_board[row][i] == 0:
break
if i == N-1:
winner = array_board[row][0]
pygame.draw.line(board, (0, 0, 0), (75, (row * round(height / N) + 150)), (825, (row * round(height / N) + 150)), 3)
for col in range(N):
for j in range(N-1):
if array_board[j][col] == 0 or array_board[col][j] != array_board[col][i+1]:
break
if j == N - 1:
winner = array_board[0][col]
pygame.draw.line(board, (0, 0, 0), (col * round(width / N) + 150, 75), (col * round(width / N) + 150, 825), 3)
You don't specify in your question, so my noughts-and-crosses grid is a 2D array of characters, with some default "empty" string (a single space).
def getEmptyBoard( size, default=' ' ):
""" Create a 2D array <size> by <size> of empty string """
grid = []
for j in range( size ):
row = []
for i in range( size ): # makes a full empty row
row.append( default )
grid.append( row )
return ( size, grid )
So given a 2D grid of strings, how does one check for a noughts-and-crosses win? A win occurs when the count of the same character in a particular row or column equals the size of the grid.
Thus if you have a 5x5 grid, any row with 5 of the same item (say 'x') is a winner. Similarly for a column... 5 lots of 'o' is a win.
So given a 2D array, how do you check for these conditions? One way is to tally the number of occurrences of each distinct symbol along a row or column. If that tally reaches the grid size (5 here), then whatever that symbol is, it's a winner.
def checkForWin( board, default=' ' ):
winner = None
size = board[0]
grid = board[1]
### Tally the row and column
for j in range( size ):
col_results = {}
### Count the symbols in this column
for i in range( size ):
value = grid[i][j]
if ( value in col_results.keys() ):
col_results[ value ] += 1
else:
col_results[ value ] = 1
### Check the tally for a winning count
for k in col_results.keys():
if ( k != default and col_results[k] >= size ):
winner = k # Found a win
print("Winner: column %d" % ( j ) )
break
if ( winner != None ):
break
# TODO: also implement for rows
# TODO: also implement for diagonals
return winner # returns None, or 'o', 'x' (or whatever used for symbols)
The above function uses two loops and a Python dictionary to keep a tally of what's been found. It's possible to check both the rows and the columns in the same pair of loops, so it's not really row-by-row or column-by-column, just two loops over the grid size.
Anyway, during the loop, when we first encounter an x it is added to the dictionary with a value of 1. The next time we find an x, the dictionary is used to tally that occurrence, dict['x'] → 2, and so forth for the entire column.
At the end of the loop, we iterate through the dictionary keys (which might be ' ', 'o', and 'x'), checking the counts. When a count equals the grid size, it's a winning line.
Obviously, if no win is found, the tally is reset and the outer loop moves on to the next column/row.
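The diagonal TODO above follows the same idea; here's a minimal sketch (not part of the tally code, just assuming the same (size, grid) board tuple used by getEmptyBoard and checkForWin):
def checkDiagonals( board, default=' ' ):
    """ Minimal sketch of the diagonal checks, for the same (size, grid) board """
    size = board[0]
    grid = board[1]
    # Main diagonal: every cell (i, i) must hold the same non-default symbol
    main = [ grid[i][i] for i in range( size ) ]
    if main[0] != default and all( v == main[0] for v in main ):
        return main[0]
    # Anti-diagonal: cells (i, size - 1 - i)
    anti = [ grid[i][size - 1 - i] for i in range( size ) ]
    if anti[0] != default and all( v == anti[0] for v in anti ):
        return anti[0]
    return None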

Count unique values of consts

In my project I defined the following consts:
(declare-const v0_g Int)
(declare-const v1_g Int)
(declare-const v2_g Int)
(declare-const v3_g Int)
(declare-const v4_g Int)
...
As a result, I got the following values in my model:
...
(define-fun v4_g () Int
2)
(define-fun v3_g () Int
10)
(define-fun v2_g () Int
10)
(define-fun v1_g () Int
8)
(define-fun v0_g () Int
0)
...
Now I want to define a new const called cost and assign to it the number of unique values among the vi_g (in the example above cost == 4, i.e. {0, 2, 8, 10}). How can I achieve this using the z3 solver?
The only idea I came up with is:
1) Knowing the maximum value (MAXVAL) that can be assigned to any of the vi_g, define MAXVAL boolean consts (ci),
2) For each of these consts, add an assert such as c0 = (v0_g == 0) v (v1_g == 0) v ... v (vn_g == 0),
3) Count how many ci consts equal True.
However, this requires a lot of additional clauses if MAXVAL is large.
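In z3py terms (the Python bindings), the idea would look roughly like this; MAXVAL = 10 and five variables are just placeholders for illustration:
from z3 import Int, Bool, Or, If, Sum, Solver

MAXVAL = 10                                      # assumed small upper bound
vs = [Int('v%d_g' % i) for i in range(5)]

s = Solver()
s.add([v >= 0 for v in vs] + [v <= MAXVAL for v in vs])

# c_k is true iff the value k is used by at least one v_i
cs = [Bool('c%d' % k) for k in range(MAXVAL + 1)]
for k, c in enumerate(cs):
    s.add(c == Or([v == k for v in vs]))

# cost = number of distinct values actually used
cost = Int('cost')
s.add(cost == Sum([If(c, 1, 0) for c in cs]))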
There's no easy way to count the number of models of a general formula. Either your specific formula allows some sort of simplification, or it isn't easy. See e.g. the literature on #SAT (https://en.wikipedia.org/wiki/Sharp-SAT).
A naïve way is to implement the counting with a linear loop, blocking one model at a time (or potentially several at once if the model is partial).
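A rough sketch of such a loop with the Python bindings (count_models and its arguments are illustrative, not an existing z3 API):
from z3 import Solver, Or, sat

def count_models(solver, variables):
    # Naive model counting: enumerate models, blocking each one in turn
    count = 0
    while solver.check() == sat:
        model = solver.model()
        count += 1
        # Block the current assignment so the next check must differ somewhere
        solver.add(Or([v != model[v] for v in variables]))
    return count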

How to order a list of Delaunay triangles into an ordered percolation list in Octave?

Given a list of all triangles
v2_T = delaunay(v2_p)
from a list of all points "v2_p" and given a list of all triangle neighbors
v2_N = neighbors(v2_T)
how can I order "v2_T" such that, starting from the first triangle and going up, the next triangle found in "v2_T" always has at least one triangle neighbor that was listed previously? The closest function I can think of that performs a similar task might be a binary tree search or something involving a recursive algorithm.
Could someone provide sample Octave code? Thanks.
Here is my (uncommented) solution to the above question. It is a dynamically linked function for Octave, written in C++, in a file named "dlf_percolate.cc". To compile it, use the command system('mkoctfile filedirectory/dlf_percolate.cc'), or alternatively mkoctfile "filedirectory/dlf_percolate.cc" in the Octave terminal, where "filedirectory" is the directory in which "dlf_percolate.cc" is saved. To test the function with v1_I = dlf_percolate(v2_N), one needs a generated list of neighbors v2_N = neighbors(v2_T), where v2_T is the generated list of Delaunay triangles and neighbors() is a function that does not exist in Octave yet. The neighbors v2_N can be calculated with functions from the "msh" package (http://octave.sourceforge.net/msh/). Once one has v2_N, the numerically labelled triangles can be put into percolated order with v1_I = dlf_percolate(v2_N, v_first_neigh), where "v_first_neigh" is the triangle from which to start calculating the percolated order "v1_I".
#include <octave/oct.h>
void func_perc
(
Matrix & v2_neigh_list
,
ColumnVector & v1_perc_list
,
ColumnVector & b1_toggled_neigh
,
int & v0_perc_index
,
int v0_next_neigh
) ;
DEFUN_DLD (dlf_percolate, args, ,
"Returns a list of sorted indices of the neighbors in percolated order."
) {
int v0_first_neigh = 1 ;
switch( args.length() )
{
case 1:
// v0_first_neigh = 1 default value
break;
case 2:
v0_first_neigh = args(1).scalar_value() ;
break;
default:
error("Only one or two inputs are needed!") ;
return args;
break;
}
octave_value_list o1_retval ;
Matrix v2_neigh_list = args(0).matrix_value() ;
int v0_cols = v2_neigh_list.cols();
int v0_rows = v2_neigh_list.rows();
if( ( v0_first_neigh <= 0 ) || ( v0_rows < v0_first_neigh ) )
{
error("v0_first_neigh must be a valid member of the list!") ;
return args;
}
ColumnVector v1_perc_list(v0_rows,0);
ColumnVector b1_toggled_neigh(v0_rows,false);
int v0_perc_index = 0 ;
func_perc
(
v2_neigh_list
,
v1_perc_list
,
b1_toggled_neigh
,
v0_perc_index
,
v0_first_neigh
) ;
o1_retval(0) = v1_perc_list ;
return o1_retval ;
}
void func_perc
(
Matrix & v2_neigh_list
,
ColumnVector & v1_perc_list
,
ColumnVector & b1_toggled_neigh
,
int & v0_perc_index
,
int v0_next_neigh
)
{
if
(
( v0_next_neigh > 0 )
&&
( ( v0_perc_index ) < v1_perc_list.length() )
&&
( b1_toggled_neigh( v0_next_neigh - 1 ) == false )
)
{
v1_perc_list( v0_perc_index ) = v0_next_neigh ;
v0_perc_index++;
b1_toggled_neigh( v0_next_neigh - 1 ) = true ;
for( int v0_i = 0 ; v0_i < v2_neigh_list.cols() ; v0_i++ )
{
func_perc
(
v2_neigh_list
,
v1_perc_list
,
b1_toggled_neigh
,
v0_perc_index
,
v2_neigh_list( v0_next_neigh - 1 , v0_i )
) ;
}
}
return ;
}
I believe any calculated percolation path must involve a recursive algorithm; if not, recursion at least makes these types of problems easier to implement. The first build of this function, written in Octave script, called an Octave function recursively and ran progressively slower at each step of the recursion. I believe recursion in Octave functions is not very efficient because of the function-call overhead of the interpreted language. Writing native functions in C++ for Octave is a better way to implement recursive algorithms efficiently. The C++ function func_perc() is the recursive algorithm used by dlf_percolate().
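For comparison, the traversal itself is essentially a depth-first search over the neighbor lists; a minimal Python sketch (not a drop-in replacement for the .oct file), assuming v2_N is given as a list of rows of 1-indexed neighbor labels with 0 meaning "no neighbor on that side":
def percolate(neigh, first=1):
    # Return triangle indices in 'percolated' order: every triangle after the
    # first is a neighbor of some triangle already listed (depth-first order).
    order, seen = [], set()
    stack = [first]
    while stack:
        t = stack.pop()
        if t == 0 or t in seen:        # 0 marks a missing neighbor
            continue
        seen.add(t)
        order.append(t)
        stack.extend(neigh[t - 1])     # neigh is 1-indexed by triangle number
    return order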