Invalid shape error when plotting a 2D image in Python for the MNIST Sign Language dataset with pixel values as columns - deep-learning

I have the MNIST Sign Language dataset with pixel values stored as columns.
I get the following error when I try to plot the image at one of the indices:
import pandas as pd
import matplotlib.pyplot as plt

# Training dataset
dfr = pd.read_csv("sign_mnist_train.csv")
X_train_orig = dfr.iloc[:, 1:]
Y_train_orig = dfr['label']

# Testing dataset
dfe = pd.read_csv("sign_mnist_test.csv")
X_test_orig = dfe.iloc[:, 1:]
Y_test_orig = dfe['label']

# Shapes of the datasets
print(dfr.shape)  # (27455, 785)
print(dfe.shape)  # (7172, 785)

# Example of a picture
index = 1
plt.imshow(X_train_orig.iloc[index])
# TypeError: Invalid shape (784,) for image data

It looks like the image you are trying to plot is flattened: the data has shape [B, N], where N = 784 (1x28x28) and B = 27455 is the number of images, giving (27455, 784). That layout is fine if you want to feed each row to a Linear layer expecting a 784-long vector, but to plot a single image you have to reshape it back to 28x28 (or the whole set to [27455, 1, 28, 28]). You can try this out:
import numpy as np

image = X_train_orig.iloc[index]
image = np.reshape(image.values, (28, 28))  # back to a 28x28 grid for plotting
plt.imshow(image)
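If you prefer not to reshape row by row, a minimal sketch (assuming, as for standard MNIST-style CSVs, that the 784 columns are stored row-major, i.e. 28 rows of 28 pixels) is to reshape the whole training set once and index into it:
X_train_images = X_train_orig.values.reshape(-1, 28, 28)  # (27455, 28, 28)
plt.imshow(X_train_images[index], cmap='gray')
plt.show()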

Related

How can I extract labels from results in YOLO v5?

Is there any way to extract the detected labels (person, cat, dog, etc.) that are printed by the results.print() function? I want to save these detected labels in an array and use them later. I am using a YOLOv5 model here.
cap = cv2.VideoCapture(0)
while cap.isOpened():
    ret, frame = cap.read()
    # Make detections
    results = model(frame)
    results.print()
    # Show the boxes and predictions
    cv2.imshow('YOLO', np.squeeze(results.render()))
    if cv2.waitKey(10) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
The printed output of results.print() looks like this:
image 1/1: 480x640 1 person
Speed: 7.0ms pre-process, 80.6ms inference, 3.5ms NMS per image at shape (1, 3, 480, 640)
From this output, I want to extract the person label and store it in an array.
This might not be the optimal solution, but here's an approach that I used for a personal project:
lst = []
cap = cv2.VideoCapture(0)
while cap.isOpened():
    ret, frame = cap.read()
    # Make detections
    results = model(frame)
    cv2.imshow('YOLO', np.squeeze(results.render()))
    df = results.pandas().xyxy[0]
    for i in df['name']:  # the 'name' column holds the labels
        lst.append(i)
    if cv2.waitKey(10) & 0xFF == ord('q'):
        break
cap.release()
cv2.destroyAllWindows()
I used results.pandas().xyxy[0] to get the results as a data frame and then appended the labels to a list.
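As a small variant of the same idea (a sketch reusing the results object from the loop above), the 'name' column can be pulled out in one step instead of looping:
df = results.pandas().xyxy[0]          # one row per detection, with a 'name' column holding the label
detected_labels = df['name'].tolist()  # e.g. ['person']
lst.extend(detected_labels)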
Assuming you use YOLOv5 with PyTorch, please see this link. It details how to interpret the results as JSON objects and also explains the structure.

SHAP multiclass summary plot for Deep Explainer

I want to use a SHAP summary plot for a multiclass classification problem using DeepExplainer. I have 3 classes, and for shap_values I get a list of 3 arrays, each of size (1000, 1, 24). With each array representing a class, I can get the summary plot for an individual class:
import shap
background = train_x[np.random.choice(train_x.shape[0], 1000, replace=False)]
explainer = shap.DeepExplainer(model, background)
back= test_x[np.random.choice(test_x.shape[0], 1000, replace=False)]
shap_values = explainer.shap_values(back)
shap.summary_plot(shap_values[0][:,0,:], plot_type = 'bar', feature_names = features)
but when I try to plot all three classes on a single summary plot with this code
shap.summary_plot(shap_values,back_x, plot_type="bar",feature_names = features)
it gives me the following error:
IndexError: index 12 is out of bounds for axis 0 with size 1
How can I plot all 3 classes on a single summary plot?
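One possible workaround (a sketch only; it assumes the extra middle dimension of size 1 is what breaks the combined plot, since summary_plot expects a list of (samples, features) arrays for multiclass output, and that back has the same (1000, 1, 24) shape as the SHAP arrays):
shap_values_2d = [sv[:, 0, :] for sv in shap_values]  # drop the singleton axis: (1000, 24) per class
back_2d = back[:, 0, :]                               # feature matrix with matching shape
shap.summary_plot(shap_values_2d, back_2d, plot_type="bar", feature_names=features)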

How can I concatenate the 4 corners of an image quickly when loading images in deep learning?

What is the most effective way to concatenate the 4 corners of an image, as shown in this photo?
(done inside __getitem__())
left_img = Image.open('image.jpg')
...
output = right_img
This is how I would do it.
First, I would temporarily convert the image to a tensor:
from torchvision import transforms
tensor_image = transforms.ToTensor()(image)
Now assume you have a 3-channel image (although similar principles apply to matrices with any number of channels, including 1-channel grayscale images).
You can access the red channel with tensor_image[0], the green channel with tensor_image[1], and the blue channel with tensor_image[2].
You can make a for loop iterating through each channel like this:
for i in range(tensor_image.size(0)):
    curr_channel = tensor_image[i]
Now, inside that for loop, for each channel you can extract the
top-left corner pixel with float(curr_channel[0][0]),
top-right corner pixel with float(curr_channel[0][-1]),
bottom-left corner pixel with float(curr_channel[-1][0]),
and bottom-right corner pixel with float(curr_channel[-1][-1]).
Make sure to convert all the pixel values to float or double values before this next appending step
Now you have four values that correspond to the corner pixels of each channel
Then you can make a list called new_image = []
You can then append the above mentioned pixel values using
new_image.append([[curr_channel[0][0], curr_channel[0][-1]], [curr_channel[-1][0], curr_channel[-1][-1]]])
After iterating through every channel, you should have a list containing three (or, in general, tensor_image.size(0)) 2x2 nested lists.
The next step is to convert this list of lists of lists to a torch tensor by running
new_image = torch.tensor(new_image)
To make sure everything is right, new_image.size() should return torch.Size([3, 2, 2]).
If that is the case, you now have the image you wanted, but in tensor format.
The way to convert it back to PIL is to run
final_pil_image = transforms.ToPILImage()(new_image)
If everything went well, you should have a PIL image that fulfills your task. The only code it uses is clever indexing and one for loop.
It may also be possible to avoid the for loop entirely and perform the operation on all channels at once with indexing, as sketched below.
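A minimal sketch of that loop-free variant (assuming tensor_image is the (C, H, W) tensor created above):
corners = tensor_image[:, [0, -1]][:, :, [0, -1]]  # first/last row, then first/last column -> shape (C, 2, 2)
final_pil_image = transforms.ToPILImage()(corners)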
Sarthak Jain
I don't know how quick this is but here:
import numpy as np
from PIL import Image

img = np.array(Image.open('image.jpg'))
h, w = img.shape[0], img.shape[1]  # rows (height) and columns (width)
# the window size:
r = 4
upper_left = img[:r, :r]
lower_left = img[h-r:, :r]
upper_right = img[:r, w-r:]
lower_right = img[h-r:, w-r:]
upper_half = np.concatenate((upper_left, upper_right), axis=1)
lower_half = np.concatenate((lower_left, lower_right), axis=1)
img = np.concatenate((upper_half, lower_half))
or short:
upper_half = np.concatenate((img[:r, :r], img[:r, w-r:]), axis=1)
lower_half = np.concatenate((img[h-r:, :r], img[h-r:, w-r:]), axis=1)
img = np.concatenate((upper_half, lower_half))
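Since the question mentions doing this inside __getitem__, here is a hedged sketch of how the NumPy version above could be wrapped in a Dataset (the class name, the paths attribute, and the tensor conversion are illustrative assumptions, not from the question):
import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset

class CornerDataset(Dataset):
    def __init__(self, paths, r=4):
        self.paths = paths  # list of image file paths
        self.r = r          # corner window size

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        img = np.array(Image.open(self.paths[idx]))
        h, w, r = img.shape[0], img.shape[1], self.r
        upper_half = np.concatenate((img[:r, :r], img[:r, w - r:]), axis=1)
        lower_half = np.concatenate((img[h - r:, :r], img[h - r:, w - r:]), axis=1)
        corners = np.concatenate((upper_half, lower_half))  # (2r, 2r, channels)
        return torch.from_numpy(corners)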

Training loss is NaN using image segmentation on TPU with TFRecords

I am a beginner trying to work with TPUs using TensorFlow in Kaggle Kernels. I previously trained a U-Net model using a dataset on GPU, and now I am trying to implement that on TPU. I made a TFRecord out of the dataset's images and masks, and the TFRecord returns an image and a mask. When I try to train on TPU, the loss is always NaN, even though the accuracy metric is normal. Since this is the same model and loss I used on GPU, I am guessing the problem is in the TFRecord or the dataset loading.
The code for loading data is given below :
def decode_image(image_data):
    image = tf.image.decode_jpeg(image_data, channels=3)
    image = tf.cast(image, tf.float32) / 255.0  # convert image to floats in [0, 1] range
    image = tf.reshape(image, [*IMAGE_SIZE, 3])  # explicit size needed for TPU
    return image

def decode_image_mask(image_data):
    image = tf.image.decode_jpeg(image_data, channels=3)
    image = tf.cast(image, tf.float64) / 255.0  # convert image to floats in [0, 1] range
    image = tf.reshape(image, [*IMAGE_SIZE, 3])  # explicit size needed for TPU
    image = tf.image.rgb_to_grayscale(image)
    image = tf.math.round(image)
    return image

def read_tfrecord(example):
    TFREC_FORMAT = {
        "image": tf.io.FixedLenFeature([], tf.string),  # tf.string means bytestring
        "mask": tf.io.FixedLenFeature([], tf.string),   # shape [] means single element
    }
    example = tf.io.parse_single_example(example, TFREC_FORMAT)
    image = decode_image(example['image'])
    mask = decode_image_mask(example['mask'])
    return image, mask

def load_dataset(filenames, ordered=False):
    # Read from TFRecords. For optimal performance, read from multiple files at once,
    # disregarding data order. Order does not matter since we will be shuffling the data anyway.
    ignore_order = tf.data.Options()
    if not ordered:
        ignore_order.experimental_deterministic = False  # disable order, increase speed
    dataset = tf.data.TFRecordDataset(filenames, num_parallel_reads=AUTO)  # automatically interleaves reads from multiple files
    dataset = dataset.with_options(ignore_order)  # uses data as soon as it streams in, rather than in its original order
    dataset = dataset.map(read_tfrecord, num_parallel_calls=AUTO)
    return dataset

def get_training_dataset():
    dataset = load_dataset(TRAINING_FILENAMES)
    dataset = dataset.repeat()  # the training dataset must repeat for several epochs
    dataset = dataset.shuffle(2048)
    dataset = dataset.batch(BATCH_SIZE, drop_remainder=True)
    dataset = dataset.prefetch(AUTO)  # prefetch next batch while training (autotune prefetch buffer size)
    return dataset

def get_validation_dataset(ordered=False):
    dataset = load_dataset(VALIDATION_FILENAMES, ordered=ordered)
    dataset = dataset.batch(BATCH_SIZE)
    dataset = dataset.cache()
    dataset = dataset.prefetch(AUTO)  # prefetch next batch while training (autotune prefetch buffer size)
    return dataset

def count_data_items(filenames):
    # the number of data items is written in the name of the .tfrec files, e.g. flowers00-230.tfrec = 230 data items
    n = [int(re.compile(r"-([0-9]*)\.").search(filename).group(1)) for filename in filenames]
    return np.sum(n)
So, what am I doing wrong?
It turns out the problem was that I was unbatching the data and re-batching it to 20 to view the images and masks properly in matplotlib, and this was interfering with how data was being sent to the model, hence the NaN loss. Making another copy of the dataset for viewing the images, while sending the original to training, solved the problem.
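A minimal sketch of that arrangement (the small viewing batch size below is illustrative; the training pipeline itself is left untouched):
train_ds = get_training_dataset()                     # passed to model.fit() unchanged

viz_ds = load_dataset(TRAINING_FILENAMES).batch(20)   # separate copy, small batches just for plotting
images, masks = next(iter(viz_ds))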

Tensor shape mismatch error in PyTorch on MNIST dataset, but no error on synthetic data

I am trying to implement a deep learning paper (https://github.com/kiankd/corel2019) and I am getting a weird error when supplying real data (MNIST) to it, but no error when using the same synthetic data as the authors used.
The error happens in this function:
def get_armask(shape, labels, device=None):
    mask = torch.zeros(shape).to(device)
    arr = torch.arange(0, shape[0]).long().to(device)
    mask[arr, labels] = -1.
    return mask
More specifically this line:
mask[arr, labels] = -1.
The error is:
RuntimeError: The shape of the mask [500] at index 0 does not match the shape of the indexed tensor [500, 10] at index 1
The weird thing is that if I use the synthetic data, there is no error and it works perfectly. If I print out the shapes, I get the following (both with synthetic data and with MNIST):
mask torch.Size([500, 10])
arr torch.Size([500])
labels torch.Size([500])
The code used to generate the synthetic data is the following:
X_data = (torch.rand(N_samples, D_input) * 10.).to(device)
labels = torch.LongTensor([i % N_classes for i in range(N_samples)]).to(device)
While the code to load MNIST is this:
train_images = mnist.train_images()
X_data_all = train_images.reshape((train_images.shape[0], train_images.shape[1] * train_images.shape[2]))
X_data = torch.tensor(X_data_all[:500,:]).to(device)
X_data = X_data.type(torch.FloatTensor)
labels = torch.tensor(mnist.train_labels()[:500]).to(device)
get_armask is used in the following way:
def forward(self, predictions, labels):
    mask = get_armask(predictions.shape, labels, device=self.device)
    # make the attractor and repulsor, mask them!
    attraction_tensor = mask * predictions
    repulsion_tensor = (mask + 1) * predictions
    # now, apply the special cosine-COREL rules, taking the argmax and squaring the repulsion
    repulsion_tensor, _ = repulsion_tensor.max(dim=1)
    repulsion_tensor = repulsion_tensor ** 2
    return arloss(attraction_tensor, repulsion_tensor, self.lam)
The actual error seems to be different from what is in the error message, but I have no idea where to look. I tried a few things, like changing the learning rate and normalizing the MNIST data to be more or less in the same range as the test data, but nothing seems to work.
Any suggestions? Thanks a lot in advance!
After exchanging some emails with the author of the paper, we figured out what the problem is. The labels were of type Byte instead of Long, and that caused the error. The error message is very misleading; the actual problem has nothing to do with the sizes...
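For reference, a one-line sketch of that fix against the MNIST loading code above: cast the labels to Long before they are used for indexing.
labels = torch.tensor(mnist.train_labels()[:500]).long().to(device)  # Long (int64), not Byte, so mask[arr, labels] indexes correctly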