Reverse Engineering Z3 SMT solver solutions

I am using the Z3 SMT solver to generate a new set of random numbers from a given vector under some constraints; I am doing this in order to hide my input stream. The corresponding code can be found below:
from z3 import *
import sys
import io
import math

X0 = Real('X0')
X1 = Real('X1')
X2 = Real('X2')
X3 = Real('X3')
X4 = Real('X4')
X5 = Real('X5')
X6 = Real('X6')
X7 = Real('X7')
X8 = Real('X8')
X9 = Real('X9')
X10 = Real('X10')
X11 = Real('X11')
X12 = Real('X12')
X13 = Real('X13')
X14 = Real('X14')

DistinctParameter = [Distinct(X0, X1, X2, X3, X4, X5, X6, X7, X8, X9, X10, X11, X12, X13, X14)]

maxPossibleValue = max(InputStream)
AggregateValue = 0
for x in InputStream:
    AggregateValue = AggregateValue + float(x)

S_Con_Comparison1 = [(X0 < maxPossibleValue)]
S_Con_Comparison2 = [(X1 < maxPossibleValue)]
S_Con_Comparison3 = [(X2 < maxPossibleValue)]
S_Con_Comparison4 = [(X3 < maxPossibleValue)]
S_Con_Comparison5 = [(X4 < maxPossibleValue)]
S_Con_Comparison6 = [(X5 < maxPossibleValue)]
S_Con_Comparison7 = [(X6 < maxPossibleValue)]
S_Con_Comparison8 = [(X7 < maxPossibleValue)]
S_Con_Comparison9 = [(X8 < maxPossibleValue)]
S_Con_Comparison10 = [(X9 < maxPossibleValue)]
S_Con_Comparison11 = [(X10 < maxPossibleValue)]
S_Con_Comparison12 = [(X11 < maxPossibleValue)]
S_Con_Comparison13 = [(X12 < maxPossibleValue)]
S_Con_Comparison14 = [(X13 < maxPossibleValue)]
S_Con_Comparison15 = [(X14 < maxPossibleValue)]
S_Con_Comparison = S_Con_Comparison1 + S_Con_Comparison2 + S_Con_Comparison3 + S_Con_Comparison4 + S_Con_Comparison5 + S_Con_Comparison6 + S_Con_Comparison7 + S_Con_Comparison8 + S_Con_Comparison9 + S_Con_Comparison10 + S_Con_Comparison11 + S_Con_Comparison12 + S_Con_Comparison13 + S_Con_Comparison14 + S_Con_Comparison15

S_Con = [(X0 + X1 + X2 + X3 + X4 + X5 + X6 + X7 + X8 + X9 + X10 + X11 + X12 + X13 + X14 == AggregateValue)]

Solve = S_Con + S_Con_Comparison + DistinctParameter

s = Solver()
s.add(Solve)

output = [0] * len(InputStream)
if s.check() == sat:
    m = s.model()
    for d in m.decls():
        location = int(repr(d).replace("X", ""))
        x = round(float(m[d].numerator_as_long()) / float(m[d].denominator_as_long()), 5)
        output[location] = x
print(output)
Each of the values of the input stream can be taken from a possible set of size 2^25. As per my understanding, the only way to find the input stream is to brute-force the resulting stream. Given these circumstances, I want to know if it is possible to reverse engineer the input stream from the corresponding output stream.

As mentioned in the comments, SMT solvers should not be entrusted with the task of generating truly random models. Having said this, it doesn't look like you need such a property to be guaranteed for your application.
I fixed your model so as to impose X_i >= 0, since this is a requirement mentioned in the comments.
from z3 import *

def obfuscate(input_stream):
    X_list = [Real('X_{0}'.format(idx)) for idx in range(0, len(input_stream))]
    assert len(X_list) == len(input_stream)

    max_input_value = max(input_stream)
    aggregate_value = sum(input_stream)

    distinct_cs = Distinct(X_list)
    lower_cs = [(0 <= Xi) for Xi in X_list]
    upper_cs = [(Xi < max_input_value) for Xi in X_list]
    same_sum_cs = (Sum(X_list) == aggregate_value)

    s = Solver()
    s.add(distinct_cs)
    s.add(lower_cs)
    s.add(upper_cs)
    s.add(same_sum_cs)

    status = s.check()
    if status == sat:
        r_ret = []
        fp_ret = []
        m = s.model()
        for Xi in X_list:
            r_value = m.eval(Xi)
            r_ret.append(r_value)
            num = r_value.numerator_as_long()
            den = r_value.denominator_as_long()
            fp_value = round(float(num) / float(den), 5)
            fp_ret.append(fp_value)
        return input_stream, aggregate_value, "sat", r_ret, fp_ret, sum(fp_ret)
    else:
        return input_stream, aggregate_value, "unsat", None, None, None

if __name__ == '__main__':
    print("Same-value inputs are all unsat")
    print(obfuscate([0.0, 0.0, 0.0]))
    print(obfuscate([1.0, 1.0, 1.0]))
    print(obfuscate([2.0, 2.0, 2.0]))

    print("\nRe-ordering input does not change output")
    print(obfuscate([1.0, 2.0, 3.0]))
    print(obfuscate([1.0, 3.0, 2.0]))
    print(obfuscate([3.0, 2.0, 1.0]))
    print("")
    print(obfuscate([0.1, 0.0, 0.0]))
    print(obfuscate([0.0, 0.1, 0.0]))
    print(obfuscate([0.0, 0.0, 0.1]))

    print("\nSame-sum input do not necessarily map to the same outputs")
    print(obfuscate([0.1, 0.9, 2.0]))
    print(obfuscate([1.1, 0.1, 1.8]))

    print("\nSame outputs may result from different inputs")
    print(obfuscate([0.6, 1.3, 1.1]))
    print(obfuscate([1.3, 0.7, 1.0]))
The output is:
Same-value inputs are all unsat
([0.0, 0.0, 0.0], 0.0, 'unsat', None, None, None)
([1.0, 1.0, 1.0], 3.0, 'unsat', None, None, None)
([2.0, 2.0, 2.0], 6.0, 'unsat', None, None, None)
Re-ordering input does not change output
([1.0, 2.0, 3.0], 6.0, 'sat', [5/2, 11/4, 3/4], [2.5, 2.75, 0.75], 6.0)
([1.0, 3.0, 2.0], 6.0, 'sat', [5/2, 11/4, 3/4], [2.5, 2.75, 0.75], 6.0)
([3.0, 2.0, 1.0], 6.0, 'sat', [5/2, 11/4, 3/4], [2.5, 2.75, 0.75], 6.0)
([0.1, 0.0, 0.0], 0.1, 'sat', [1/30, 1/15, 0], [0.03333, 0.06667, 0.0], 0.09999999999999999)
([0.0, 0.1, 0.0], 0.1, 'sat', [1/30, 1/15, 0], [0.03333, 0.06667, 0.0], 0.09999999999999999)
([0.0, 0.0, 0.1], 0.1, 'sat', [1/30, 1/15, 0], [0.03333, 0.06667, 0.0], 0.09999999999999999)
Same-sum input do not necessarily map to the same outputs
([0.1, 0.9, 2.0], 3.0, 'sat', [4/3, 5/3, 0], [1.33333, 1.66667, 0.0], 3.0)
([1.1, 0.1, 1.8], 3.0, 'sat', [7/5, 8/5, 0], [1.4, 1.6, 0.0], 3.0)
Same outputs may result from different inputs
([0.6, 1.3, 1.1], 3.0, 'sat', [23/20, 49/40, 5/8], [1.15, 1.225, 0.625], 3.0)
([1.3, 0.7, 1.0], 3.0, 'sat', [23/20, 49/40, 5/8], [1.15, 1.225, 0.625], 3.0)
This simple example allows us to make the following observations:
- the output is determined by the values in the input, but it is not affected by their order
- the obfuscation procedure can be sensitive to variations in the input stream
Therefore, even if an attacker attempts to use rainbow tables to find the potential input multiset that generated an output sequence, they still cannot find the exact order of the values in the input stream.
Let's disregard the fact that building such rainbow tables is impractical due to the large number of input sequences of size 15 that can be generated with a pool of 2^25 candidate values (a loose upper bound would be 2^375), and assume that we have a way to access the table efficiently.
Given an output sequence O, generated with obfuscate(), we can look for a match M inside our rainbow table, where M is a list of multisets that, when used as input, would result in the same output O. Let M[i] be the i-th input multiset in M, containing n elements of which the j-th distinct value appears with multiplicity m_j. Then the number of distinct permutations of M[i] is (source: Wikipedia):
n! / (m_1! * m_2! * ... * m_k!)
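For concreteness, a small Python sketch of this count (the helper name is illustrative):
from collections import Counter
from math import factorial

def distinct_orderings(multiset):
    # n! divided by the factorial of each distinct value's multiplicity
    count = factorial(len(multiset))
    for multiplicity in Counter(multiset).values():
        count //= factorial(multiplicity)
    return count

print(distinct_orderings([1.0, 2.0, 3.0]))  # 6 = 3!
print(distinct_orderings([0.0, 0.0, 0.1]))  # 3 = 3!/2!
print(distinct_orderings(range(15)))        # 1307674368000 = 15!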
In the simplest scenario, in which every value in the input stream is different from the others, there are up to 15! = 1,307,674,368,000 permutations for each candidate solution M[i] in the match M. In your application, would the attacker have the time to try all of them?


What does Lambda do in this code (Python Keras)?

def AdaIN(x):
    # Normalize x[0] (image representation)
    mean = K.mean(x[0], axis=[1, 2], keepdims=True)
    std = K.std(x[0], axis=[1, 2], keepdims=True) + 1e-7
    y = (x[0] - mean) / std

    # Reshape scale and bias parameters
    pool_shape = [-1, 1, 1, y.shape[-1]]
    scale = K.reshape(x[1], pool_shape)
    bias = K.reshape(x[2], pool_shape)

    # Multiply by x[1] (GAMMA) and add x[2] (BETA)
    return y * scale + bias

def g_block(input_tensor, latent_vector, filters):
    gamma = Dense(filters, bias_initializer='ones')(latent_vector)
    beta = Dense(filters)(latent_vector)

    out = UpSampling2D()(input_tensor)
    out = Conv2D(filters, 3, padding='same')(out)
    out = Lambda(AdaIN)([out, gamma, beta])
    out = Activation('relu')(out)
    return out
Please see the code above. I am currently studying StyleGAN and trying to convert this code to PyTorch, but I can't seem to understand what Lambda does in g_block. AdaIN needs only one input based on its declaration, so how are gamma and beta also used as inputs? Please explain what Lambda does in this code.
Thank you very much.
Lambda layers in Keras are used to call custom functions inside the model. In g_block, Lambda calls the AdaIN function and passes out, gamma and beta as arguments inside a list. The AdaIN function receives these three tensors encapsulated within a single list as x, and accesses them by indexing the list x (x[0], x[1], x[2]).
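A minimal sketch of this pattern (assuming tf.keras; tensor names are arbitrary):
from tensorflow.keras.layers import Input, Lambda

a = Input(shape=(4,))
b = Input(shape=(4,))
# The wrapped function receives the list [a, b] as its single argument x
summed = Lambda(lambda x: x[0] + x[1])([a, b])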
Here's the PyTorch equivalent:
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaIN(nn.Module):
    def forward(self, out, gamma, beta):
        bs, ch = out.size()[:2]
        mean = out.reshape(bs, ch, -1).mean(dim=2).reshape(bs, ch, 1, 1)
        std = out.reshape(bs, ch, -1).std(dim=2).reshape(bs, ch, 1, 1) + 1e-7
        y = (out - mean) / std
        bias = beta.unsqueeze(-1).unsqueeze(-1).expand_as(out)
        scale = gamma.unsqueeze(-1).unsqueeze(-1).expand_as(out)
        return y * scale + bias

class g_block(nn.Module):
    def __init__(self, filters, latent_vector_shape, input_tensor_channels):
        super().__init__()
        self.gamma = nn.Linear(in_features=latent_vector_shape, out_features=filters)
        # Initialize all biases to 1
        self.gamma.bias.data = torch.ones(filters)
        self.beta = nn.Linear(in_features=latent_vector_shape, out_features=filters)
        # Padding of 1 keeps the spatial size for a 3x3 kernel
        self.conv = nn.Conv2d(input_tensor_channels, filters, 3, 1, padding=1)
        self.adain = AdaIN()

    def forward(self, input_tensor, latent_vector):
        gamma = self.gamma(latent_vector)
        beta = self.beta(latent_vector)
        # Check the default interpolation mode in Keras and replace the mode below if different
        out = F.interpolate(input_tensor, scale_factor=2, mode='nearest')
        out = self.conv(out)
        out = self.adain(out, gamma, beta)
        out = torch.relu(out)
        return out

# Sample:
input_tensor = torch.randn((1, 3, 10, 10))
latent_vector = torch.randn((1, 5))
g = g_block(3, latent_vector.shape[1], input_tensor.shape[1])
out = g(input_tensor, latent_vector)
print(out)
Note: you need to pass the latent_vector and input_tensor shapes when creating g_block.

How to get channel=3 from MRI slices?

I'm trying to use VGG, but its input requires 3 channels while my input_shape has channel=1.
I use nibabel to slice the MRI (nii) volumes.
ValueError: The input must have 3 channels; got input_shape=(256, 256, 1)
Here is my code for the MRI slices:
images = []
images_ground = []
for f in range(len(g)):
    a = nib.load(g[f])
    a = a.get_data()
    b = nib.load(g[f])
    b = b.get_data()
    a = a[:, :, 48:166]
    b = transform.resize(b, (64, 64, 256))
    b = b[:, :, 48:166]
    for i in range(a.shape[2]):
        images_ground.append(a[:, :, i])
        images.append(b[:, :, i])
images_ground = np.array(images_ground)
images_ground = images_ground.reshape(-1, 256, 256, 1)
images = np.array(images)
images = images.reshape(-1, 64, 64, 1)
m = np.max(images)
mi = np.min(images)
images = (images - mi) / (m - mi)
n = np.max(images_ground)
ni = np.min(images_ground)
images_ground = (images_ground - ni) / (n - ni)
return images, images_ground
I too had the same problem.
The VGG model is trained on 3-channel (RGB) input images, but you are providing a grayscale image with only 1 channel, hence the error.
If you want to solve it with TensorFlow, use
tf.image.grayscale_to_rgb()
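For example (a sketch assuming the images array from the question, with shape (N, 256, 256, 1)):
import tensorflow as tf

# Replicates the single channel: (N, 256, 256, 1) -> (N, 256, 256, 3)
images_rgb = tf.image.grayscale_to_rgb(tf.convert_to_tensor(images, dtype=tf.float32))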
If it's Keras:
datagen_train = ImageDataGenerator()
train_generator = datagen_train.flow_from_directory(directory_name,
                                                    <other parameters>, color_mode="rgb")

Error received while building the autoencoder

I am trying to build an autoencoder for my term project, using a CNN as the encoder and an LSTM as the decoder. However, when I display the summary of the model, I receive the following error:
ValueError: Input 0 is incompatible with layer lstm_10: expected ndim=3, found ndim=2
x.shape = (45406, 100, 100)
y.shape = (45406,)
I already tried changing the input shape for the LSTM, but it didn't work.
def keras_model(image_x, image_y):
    model = Sequential()
    model.add(Lambda(lambda x: x / 127.5 - 1., input_shape=(image_x, image_y, 1)))
    last = model.output
    x = Conv2D(3, (3, 3), padding='same')(last)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = MaxPooling2D((2, 2), padding='valid')(x)
    encoded = Flatten()(x)
    x = LSTM(8, return_sequences=True, input_shape=(100, 100))(encoded)
    decoded = LSTM(64, return_sequences=True)(x)
    x = Dropout(0.5)(decoded)
    x = Dense(400, activation='relu')(x)
    x = Dense(25, activation='relu')(x)
    final = Dense(1, activation='relu')(x)
    autoencoder = Model(model.input, final)
    autoencoder.compile(optimizer="Adam", loss="mse")
    autoencoder.summary()

model = keras_model(100, 100)
Given that you are using an LSTM, you need a time dimension. So your input shape should be: (time, image_x, image_y, nb_image_channels).
I would suggest getting a more in-depth understanding of autoencoders, LSTMs and 2D convolution, as all of these play together here. These are helpful intros: https://machinelearningmastery.com/lstm-autoencoders/ and https://blog.keras.io/building-autoencoders-in-keras.html.
Also have a look at this example, where someone implemented an LSTM with Conv2D: How to reshape 3 channel dataset for input to neural network. The TimeDistributed layer comes in useful here (a sketch follows after the fixed code below).
However, just to get your error fixed, you can add a Reshape() layer to fake the extra dimension:
def keras_model(image_x, image_y):
    model = Sequential()
    model.add(Lambda(lambda x: x / 127.5 - 1., input_shape=(image_x, image_y, 1)))
    last = model.output
    x = Conv2D(3, (3, 3), padding='same')(last)
    x = BatchNormalization()(x)
    x = Activation('relu')(x)
    x = MaxPooling2D((2, 2), padding='valid')(x)
    encoded = Flatten()(x)
    # (50, 50, 3) is the output shape of the max pooling layer (see model summary)
    encoded = Reshape((50 * 50 * 3, 1))(encoded)
    x = LSTM(8, return_sequences=True)(encoded)  # input shape can be removed
    decoded = LSTM(64, return_sequences=True)(x)
    x = Dropout(0.5)(decoded)
    x = Dense(400, activation='relu')(x)
    x = Dense(25, activation='relu')(x)
    final = Dense(1, activation='relu')(x)
    autoencoder = Model(model.input, final)
    autoencoder.compile(optimizer="Adam", loss="mse")
    print(autoencoder.summary())

model = keras_model(100, 100)
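For completeness, a minimal sketch of the TimeDistributed route mentioned above (assumes tf.keras; the sequence length of 10 is illustrative). The convolutional encoder is applied to every time step, and the LSTM then consumes a genuine (batch, time, features) sequence:
from tensorflow.keras.layers import Input, TimeDistributed, Conv2D, MaxPooling2D, Flatten, LSTM
from tensorflow.keras.models import Model

inp = Input(shape=(10, 100, 100, 1))   # (time, image_x, image_y, channels)
x = TimeDistributed(Conv2D(3, (3, 3), padding='same', activation='relu'))(inp)
x = TimeDistributed(MaxPooling2D((2, 2)))(x)
x = TimeDistributed(Flatten())(x)      # -> (batch, time, 50*50*3)
x = LSTM(8, return_sequences=True)(x)
model = Model(inp, x)
model.summary()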

How can I use three Conv1d layers on the three axes of my 3*n matrix in PyTorch?

The following is my CNN. Its input is a (3, 64) matrix, and I want to use three convolution kernels to process the x, y and z axes respectively.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Char_CNN(nn.Module):
    def __init__(self):
        super(Char_CNN, self).__init__()
        self.convdx = nn.Conv1d(1, 12, 20)
        self.convdy = nn.Conv1d(1, 12, 20)
        self.convdz = nn.Conv1d(1, 12, 20)
        self.fc1 = nn.Linear(540, 1024)
        self.fc2 = nn.Linear(1024, 30)
        self.fc3 = nn.Linear(30, 13)

    def forward(self, x):
        after_convd = [self.convdx(x[:, :, 0]), self.convdy(x[:, :, 1]), self.convdz(x[:, :, 2])]
        after_pool = [F.max_pool1d(F.relu(value), 3) for value in after_convd]
        x = torch.cat(after_pool, 1)
        x = x.view(x.size(0), -1)
        x = self.fc1(x)
        x = self.fc2(x)
        x = self.fc3(x)
        x = F.softmax(x)
        return x
But when loss = criterion(out, target) runs, a RuntimeError occurs:
RuntimeError: Assertion `cur_target >= 0 && cur_target < n_classes' failed.
I'm very new to PyTorch, so I cannot find the mistake in my code.
Can you help me?
The way of convolution is okay. The problem was that my labels were between 1 and 13, while the correct range is 0 to 12.
After modifying them, my CNN works successfully.
But as a newcomer to PyTorch and deep learning, I guess my convolution code could be clearer and simpler; feel free to point out my errors! A sketch of both points follows below.
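A minimal sketch of the label shift, plus one possible simplification of the per-axis convolutions using a single grouped Conv1d (this assumes the input can be laid out as (batch, 3, 64); names are illustrative):
import torch
import torch.nn as nn

# Label fix: nn.CrossEntropyLoss expects class indices in [0, n_classes).
labels = torch.tensor([1, 13, 5])   # 1-based labels as in the question
target = labels - 1                 # -> tensor([0, 12, 4])

# Grouped-convolution variant: with groups=3, each of the three input
# channels (x, y, z) gets its own 12 filters, which is equivalent to the
# three separate Conv1d(1, 12, 20) layers above.
conv = nn.Conv1d(in_channels=3, out_channels=36, kernel_size=20, groups=3)
signal = torch.randn(8, 3, 64)      # (batch, axes, samples)
print(conv(signal).shape)           # torch.Size([8, 36, 45])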

How to convert an upper/lower gpuarray to the specific format required by cublasStbsv?

I am currently using pycuda and scikits.cuda to solve the linear equation A*x = b, where A is an upper/lower triangular matrix. However, the cublasStbsv routine requires a specific format.
To give an example: if a lower matrix A = [[1, 0, 0], [2, 3, 0], [4, 5, 6]], then the input required by cublasStbsv should be [[1, 3, 6], [2, 5, 0], [4, 0, 0]], where the rows are the diagonal, subdiagonal 1 and subdiagonal 2, respectively. With numpy this can easily be done with stride_tricks.as_strided, but I don't know how to do similar things with pycuda.gpuarray. Any help would be appreciated, thanks. I found pycuda.compyte.array.as_strided, but it cannot be applied to gpuarray.
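For reference, the NumPy conversion mentioned above can be sketched like this (the padding trick and helper name are my own, illustrative choices):
import numpy as np
from numpy.lib.stride_tricks import as_strided

def lower_to_banded(A):
    # Rearranges a lower-triangular matrix so that row k holds the k-th
    # (sub)diagonal, the layout cublasStbsv expects. Works on a C-ordered
    # copy; banded[k, j] = A[j + k, j], i.e. flat index k*n + j*(n + 1).
    n = A.shape[0]
    # Pad so strides that fall past the matrix read zeros instead of garbage.
    flat = np.concatenate([np.ascontiguousarray(A).ravel(),
                           np.zeros(n * (n - 1), dtype=A.dtype)])
    sz = flat.itemsize
    return as_strided(flat, shape=(n, n), strides=(n * sz, (n + 1) * sz)).copy()

A = np.array([[1, 0, 0], [2, 3, 0], [4, 5, 6]], dtype=np.float32)
print(lower_to_banded(A))
# [[1. 3. 6.]
#  [2. 5. 0.]
#  [4. 0. 0.]]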
I got it done using Theano: first convert the array to a CudaNdarray, change the strides, and make a copy back to a gpuarray. Just be careful about the differences between Fortran and C order.
Update:
I finally got it done using gpuarray.multi_take_put:
def make_triangle(s_matrix, uplo='L'):
    """Convert a triangular matrix to the specific format
    required by cublasStbsv; the matrix should be in Fortran order.
    s_matrix: gpuarray
    """
    # make sure the dtype is float32
    if s_matrix.dtype != 'f':
        s_matrix = s_matrix.astype('f')
    dim = s_matrix.shape[0]
    if uplo == 'L':
        idx_tuple = np.tril_indices(dim)
        gidx = gpuarray.to_gpu(idx_tuple[0] + idx_tuple[1] * dim)
        gdst = gpuarray.to_gpu(idx_tuple[0] + idx_tuple[1] * (dim - 1))
        return gpuarray.multi_take_put([s_matrix], gdst, gidx, (dim, dim))[0]
    else:
        idx_tuple = np.triu_indices(dim)
        gidx = gpuarray.to_gpu(idx_tuple[0] + idx_tuple[1] * dim)
        gdst = gpuarray.to_gpu(idx_tuple[0] + (idx_tuple[1] + 1) * (dim - 1))
        return gpuarray.multi_take_put([s_matrix], gdst, gidx, (dim, dim))[0]