Background elimination
Data augmentation to increase performance
SOLO v1 (Single-Stage Instance Segmentation)
Paper Implementation



Figure 1. Left: original image. Middle: image with single-stage instance segmentation. Right: image with data augmentation (background removed) from my code.
Two-stage vs. single-stage
- Two-stage (find the bounding box first, then segment the instance inside it)
  - Detection - Faster R-CNN
  - Segmentation - Mask R-CNN

Figure 2. The Mask R-CNN framework for instance segmentation.
- One-stage (predict the instance mask directly, without first finding a bounding box)
  - Detection - YOLO
  - Segmentation - SOLO

Figure 3. The SOLO framework.
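To make the one-stage idea concrete: SOLO divides the image into an S x S grid, and the cell that contains an object's center is responsible for that object's category and mask, so no bounding box is needed. The sketch below shows only this location-to-cell mapping. The function name `center_to_grid_cell`, the grid size of 5, and the example coordinates are illustrative assumptions and are not taken from the SOLO code base.

```python
def center_to_grid_cell(center_x, center_y, img_w, img_h, num_grid=5):
    """Map an object center (in pixels) to its grid cell and mask channel.

    Returns (row, col, channel), where channel = row * num_grid + col.
    """
    # Clamp so that a center on the right/bottom edge stays inside the grid.
    col = min(int(center_x / img_w * num_grid), num_grid - 1)
    row = min(int(center_y / img_h * num_grid), num_grid - 1)
    return row, col, row * num_grid + col


# Hypothetical object centered at (x=320, y=100) in a 640x426 image.
cell = center_to_grid_cell(320, 100, img_w=640, img_h=426)
print(cell)  # (1, 2, 7)
```

The actual training code assigns a small center region rather than a single point, so treat this as a simplification of the idea, not the implementation.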
- Top-down segmentation
  - The object is detected first and then segmented, so mask quality depends heavily on the accuracy of the bounding box.
  - e.g. Mask R-CNN
- Bottom-up segmentation
  - Uses a pixel-embedding method: each pixel gets an embedding and pixels are then grouped into instances, so the result depends heavily on the quality of the grouping.


Figure 4. Bottom-up segmentation (Pixel-wise embedding)
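As a rough illustration of why bottom-up methods depend on the grouping step, the sketch below assigns each pixel to the nearest instance center in embedding space. It is a toy under strong assumptions: the embeddings and the centers are given by hand, whereas real methods learn the embeddings and usually find the clusters with something like mean-shift. The name `group_pixels` and the `max_dist` threshold are hypothetical.

```python
import numpy as np


def group_pixels(embeddings, centers, max_dist=0.5):
    """Assign each pixel to the nearest center in embedding space.

    embeddings: (H, W, D) per-pixel embedding vectors.
    centers:    (K, D) one embedding per instance (assumed known here).
    Returns an (H, W) int array: instance id in [0, K), or -1 when the
    pixel is farther than max_dist from every center (background).
    """
    # Distance from every pixel to every center: shape (H, W, K).
    diff = embeddings[:, :, None, :] - centers[None, None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    labels = dist.argmin(axis=-1)
    labels[dist.min(axis=-1) > max_dist] = -1
    return labels


# Toy 2x3 image with 2-D embeddings: two instances and one background pixel.
emb = np.array([
    [[0.0, 0.1], [0.1, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [5.0, 5.0], [0.9, 1.1]],
])
centers = np.array([[0.0, 0.0], [1.0, 1.0]])
labels = group_pixels(emb, centers)
print(labels)
```

If the embeddings of two instances end up close together, this grouping merges them, which is the weakness noted in the list above.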
Modifying the SOLO code
import cv2
import mmcv
import numpy as np
from scipy import ndimage


def show_result_ins(img,
                    result,
                    class_names,
                    score_thr=0.3,
                    sort_by_density=False,
                    out_file=None):
    """Visualize the instance segmentation results on the image.

    Args:
        img (str or np.ndarray): Image filename or loaded image.
        result (tuple[list] or list): The instance segmentation result.
        class_names (list[str] or tuple[str]): A list of class names.
        score_thr (float): The threshold to visualize the masks.
        sort_by_density (bool): sort the masks by their density.
        out_file (str, optional): If specified, the visualization result will
            be written to the out file instead of shown in a window.

    Returns:
        np.ndarray or None: If neither `show` nor `out_file` is specified, the
            visualized image is returned, otherwise None is returned.
    """
    assert isinstance(class_names, (tuple, list))
    img = mmcv.imread(img)
    img_show = img.copy()
    h, w, _ = img.shape

    cur_result = result[0]
    seg_label = cur_result[0]  # e.g. shape (9, 426, 640): 9 masks
    seg_label = seg_label.cpu().numpy().astype(np.uint8)
    cate_label = cur_result[1]
    cate_label = cate_label.cpu().numpy()
    score = cur_result[2].cpu().numpy()

    # Keep only the masks whose score is above the threshold.
    vis_inds = score > score_thr
    seg_label = seg_label[vis_inds]
    num_mask = seg_label.shape[0]  # number of masks kept
    cate_label = cate_label[vis_inds]
    cate_score = score[vis_inds]

    if sort_by_density:
        mask_density = []
        for idx in range(num_mask):
            cur_mask = seg_label[idx, :, :]
            cur_mask = mmcv.imresize(cur_mask, (w, h))
            cur_mask = (cur_mask > 0.5).astype(np.int32)
            mask_density.append(cur_mask.sum())
        orders = np.argsort(mask_density)
        seg_label = seg_label[orders]
        cate_label = cate_label[orders]
        cate_score = cate_score[orders]

    np.random.seed(42)
    color_masks = [
        np.random.randint(0, 256, (1, 3), dtype=np.uint8)
        for _ in range(num_mask)
    ]
    for idx in range(num_mask):
        idx = -(idx + 1)
        cur_mask = seg_label[idx, :, :]
        cur_mask = mmcv.imresize(cur_mask, (w, h))
        cur_mask = (cur_mask > 0.5).astype(np.uint8)
        if cur_mask.sum() == 0:
            continue
        color_mask = color_masks[idx]
        # np.bool was removed in NumPy 1.24; the builtin bool is equivalent.
        cur_mask_bool = cur_mask.astype(bool)
        # Instance segmentation -> colorize by blending image and mask color.
        img_show[cur_mask_bool] = img[cur_mask_bool] * 0.5 + color_mask * 0.5

        cur_cate = cate_label[idx]
        cur_score = cate_score[idx]
        label_text = class_names[cur_cate]
        # label_text += '|{:.02f}'.format(cur_score)
        center_y, center_x = ndimage.center_of_mass(cur_mask)
        vis_pos = (max(int(center_x) - 10, 0), int(center_y))
        cv2.putText(img_show, label_text, vis_pos,
                    cv2.FONT_HERSHEY_COMPLEX, 0.3, (255, 255, 255))  # white

    if out_file is None:
        return img
    else:
        # Background = every pixel not covered by any of the kept masks.
        back_ground = ~np.sum(seg_label, axis=0).astype(bool)
        # Set the background to zero (black) to remove it. Assigning 0
        # directly also works when no mask passed the threshold.
        img_show[back_ground] = 0
        mmcv.imwrite(img_show, out_file)
The function above is my modified version of show_result_ins from inference.py in the SOLO code.
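The background-removal step can be tried in isolation, without mmcv or a trained model. The following is a minimal NumPy-only sketch that assumes the masks are already at the image resolution. `remove_background` is a hypothetical helper name used for illustration; it is not part of SOLO or mmcv.

```python
import numpy as np


def remove_background(img, masks):
    """Black out every pixel that is not covered by any instance mask.

    img:   (H, W, 3) uint8 image.
    masks: (N, H, W) array, non-zero where an instance is present.
    """
    # Union of all instance masks; with N == 0 everything is background.
    foreground = np.any(masks.astype(bool), axis=0)
    out = img.copy()
    out[~foreground] = 0
    return out


# Toy 2x2 image with a single mask covering only the top-left pixel.
img = np.full((2, 2, 3), 200, dtype=np.uint8)
masks = np.zeros((1, 2, 2), dtype=np.uint8)
masks[0, 0, 0] = 1
result = remove_background(img, masks)
print(result[:, :, 0])
```

Only the top-left pixel keeps its value; the rest become black. The input image is copied, so the original stays available for comparison.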



Figure 5. In the backbone network, the feature maps are extracted with ResNet.
Result



Figure 6. Left: original image. Middle: image with single-stage instance segmentation. Right: image with data augmentation (background removed) from my code.