DL-C4W3(YOLO)

Intro: defining the bounding box

Suppose we want to classify 80 kinds of objects; then there are 80 class entries c (c1, c2, …, c80).
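
To make the 80 class entries concrete, here is a minimal sketch of how one box label could be laid out. The layout `[pc, bx, by, bh, bw, c1, ..., c80]` and the helper name `make_label` are assumptions for illustration; they do not come from the code later in this post.

```python
import numpy as np

NUM_CLASSES = 80  # number of object classes, as in the text


def make_label(pc, bx, by, bh, bw, class_index):
    """Build one box label: [pc, bx, by, bh, bw, c1, ..., c80] (assumed layout)."""
    label = np.zeros(5 + NUM_CLASSES, dtype=np.float32)
    label[0] = pc                    # confidence that an object is present
    label[1:5] = [bx, by, bh, bw]    # box centre and size
    label[5 + class_index] = 1.0     # one-hot class entry (class_index is 0-based)
    return label


label = make_label(1.0, 0.5, 0.5, 0.2, 0.3, class_index=2)
print(label.shape)  # (85,)
```

With 80 classes each box therefore carries 5 + 80 = 85 numbers.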

YOLO

YOLO stands for "You Only Look Once": the network needs only a single forward pass, and after non-max suppression it can output the target boxes.

Encoding

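An anchor box is what lets a single grid cell detect more than one object. The anchor box shapes have to be defined in advance. Whenever the midpoint of an object is found, the object is assigned not only to the grid cell that contains the midpoint but also to the matching anchor box.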
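
The assignment described above can be sketched roughly as follows. This assumes that "matching" means the anchor whose shape has the highest IoU with the object's shape; the anchor shapes and the names `shape_iou` and `assign` are hypothetical.

```python
import numpy as np


def shape_iou(wh_a, wh_b):
    """IoU of two boxes that share the same centre, each given as (width, height)."""
    inter = min(wh_a[0], wh_b[0]) * min(wh_a[1], wh_b[1])
    union = wh_a[0] * wh_a[1] + wh_b[0] * wh_b[1] - inter
    return inter / union


def assign(box_xywh, anchors, grid_size=19):
    """Return (row, col, anchor_index) for a box given in image-relative units in [0, 1)."""
    x, y, w, h = box_xywh
    col = int(x * grid_size)  # grid column that contains the midpoint
    row = int(y * grid_size)  # grid row that contains the midpoint
    ious = [shape_iou((w, h), anchor) for anchor in anchors]
    return row, col, int(np.argmax(ious))


# Two hypothetical anchor shapes as (width, height): one tall, one wide.
anchors = [(0.1, 0.3), (0.3, 0.1)]
print(assign((0.5, 0.5, 0.08, 0.25), anchors))  # (9, 9, 0): centre cell, tall anchor
```

A tall object and a wide object whose midpoints fall in the same cell would end up in different anchor slots, so both can be predicted.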

For each anchor box, compute the probability that the box contains each class.
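
For a single anchor box this comes down to multiplying the objectness pc by the class probabilities and taking the largest product. A small sketch with three classes instead of 80 (the numbers and the name `best_class_score` are made up for illustration):

```python
import numpy as np


def best_class_score(pc, class_probs):
    """Return (class index, score) of the most likely class for one anchor box."""
    scores = pc * np.asarray(class_probs)  # per-class score = pc * c_i
    best = int(np.argmax(scores))
    return best, float(scores[best])


best_class, best_score = best_class_score(0.9, [0.1, 0.7, 0.2])
print(best_class, round(best_score, 2))  # 1 0.63
```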

Visualizing the predictions

When there are too many boxes, non-max suppression is used to remove the ones that overlap.

The code is below.

filter_box

import tensorflow as tf
from keras import backend as K


def yolo_filter_boxes(box_confidence, boxes, box_class_probs, threshold=.6):
    """
    Filters YOLO boxes by thresholding on object and class confidence.

    Arguments:
    box_confidence -- tensor of shape (19, 19, 5, 1), the pc (confidence that some object is present) of each of the 5 anchor boxes predicted in each of the 19x19 cells
    boxes -- tensor of shape (19, 19, 5, 4), the (px, py, ph, pw) of every anchor box
    box_class_probs -- tensor of shape (19, 19, 5, 80), the detection probabilities (c1, c2, c3, ..., c80) of every anchor box in every cell
    threshold -- real value; a box is kept if its highest class score is at least this value

    Returns:
    scores -- tensor of shape (None,), the class score of each kept box
    boxes -- tensor of shape (None, 4), the (b_x, b_y, b_h, b_w) of each kept box
    classes -- tensor of shape (None,), the index of the class detected by each kept box

    Note: "None" is used because the exact number of selected boxes is unknown; it depends on the threshold.
    For example, if 10 boxes are kept, the actual shape of scores is (10,).
    """

    # Step 1: Compute box scores
    ### START CODE HERE ### (≈ 1 line)
    box_scores = box_confidence * box_class_probs
    ### END CODE HERE ###

    # Step 2: Find the box_classes thanks to the max box_scores, keep track of the corresponding score
    ### START CODE HERE ### (≈ 2 lines)
    box_classes = K.argmax(box_scores, axis=-1)
    box_class_scores = K.max(box_scores, axis=-1)
    ### END CODE HERE ###

    # Step 3: Create a filtering mask based on "box_class_scores" by using "threshold". The mask should have the
    # same dimension as box_class_scores, and be True for the boxes you want to keep (with probability >= threshold)
    ### START CODE HERE ### (≈ 1 line)
    filtering_mask = (box_class_scores >= threshold)
    ### END CODE HERE ###

    # Step 4: Apply the mask to scores, boxes and classes
    ### START CODE HERE ### (≈ 3 lines)
    scores = tf.boolean_mask(box_class_scores, filtering_mask)
    boxes = tf.boolean_mask(boxes, filtering_mask)
    classes = tf.boolean_mask(box_classes, filtering_mask)
    ### END CODE HERE ###

    return scores, boxes, classes
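
To see why the output shapes have a "None" dimension, the same four steps can be mirrored in plain NumPy on random data. This is only a sketch for experimenting with shapes; `filter_boxes_np` is a hypothetical name and the random inputs stand in for real network outputs.

```python
import numpy as np


def filter_boxes_np(box_confidence, boxes, box_class_probs, threshold=0.6):
    """NumPy mirror of the four filtering steps (illustrative only)."""
    box_scores = box_confidence * box_class_probs    # (19, 19, 5, 80)
    box_classes = np.argmax(box_scores, axis=-1)     # (19, 19, 5)
    box_class_scores = np.max(box_scores, axis=-1)   # (19, 19, 5)
    mask = box_class_scores >= threshold             # boolean, (19, 19, 5)
    # Boolean indexing flattens the three leading axes into one axis
    # whose length is the number of True entries in the mask.
    return box_class_scores[mask], boxes[mask], box_classes[mask]


rng = np.random.default_rng(0)
conf = rng.random((19, 19, 5, 1))
coords = rng.random((19, 19, 5, 4))
probs = rng.random((19, 19, 5, 80))

scores, kept, classes = filter_boxes_np(conf, coords, probs, threshold=0.6)
print(scores.shape, kept.shape, classes.shape)
```

Raising or lowering `threshold` changes the length of the first axis, which is what the "None" in the docstring stands for.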

Non-max suppression

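1. Suppose the threshold is first set to 0.6: discard every box with pc <= 0.6. This step removes all the boxes that are very unlikely to contain an object.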

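2. Pick the box with the largest pc as an output, then discard every other box whose IoU (intersection over union) with that output is >= 0.5. Repeat on the remaining boxes until none are left.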
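
The two steps above can be sketched in plain Python as a greedy loop. This is a minimal illustration, assuming boxes are given as (x1, y1, x2, y2) corners; the names `iou` and `nms` and the sample boxes are made up, and the TensorFlow version below relies on tf.image.non_max_suppression instead.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2) corners."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(ix2 - ix1, 0.0) * max(iy2 - iy1, 0.0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def nms(boxes, scores, score_threshold=0.6, iou_threshold=0.5):
    """Return the indices of the kept boxes, highest score first."""
    # Step 1: drop every box whose score is at or below the threshold.
    candidates = [i for i, s in enumerate(scores) if s > score_threshold]
    candidates.sort(key=lambda i: scores[i], reverse=True)
    keep = []
    # Step 2: repeatedly output the best box and drop the boxes that overlap it.
    while candidates:
        best = candidates.pop(0)
        keep.append(best)
        candidates = [i for i in candidates
                      if iou(boxes[best], boxes[i]) < iou_threshold]
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.3]
print(nms(boxes, scores))  # [0, 2]
```

Box 1 is dropped because its IoU with box 0 is 81/119 (about 0.68), box 3 is dropped in step 1 because of its low score, and box 2 does not overlap anything, so it survives.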

def yolo_non_max_suppression(scores, boxes, classes, max_boxes=10, iou_threshold=0.5):
    """
    Applies non-max suppression (NMS) to a set of boxes.

    Arguments:
    scores -- tensor of shape (None,), output of yolo_filter_boxes()
    boxes -- tensor of shape (None, 4), output of yolo_filter_boxes() that have been scaled to the image size (see later)
    classes -- tensor of shape (None,), output of yolo_filter_boxes()
    max_boxes -- integer, maximum number of predicted boxes you'd like
    iou_threshold -- real value, "intersection over union" threshold used for NMS filtering

    Returns:
    scores -- tensor of shape (None,), predicted score for each box
    boxes -- tensor of shape (None, 4), predicted box coordinates
    classes -- tensor of shape (None,), predicted class for each box

    Note: the "None" dimension of the output tensors is at most max_boxes.
    """

    max_boxes_tensor = K.variable(max_boxes, dtype='int32')  # tensor to be used in tf.image.non_max_suppression()
    K.get_session().run(tf.variables_initializer([max_boxes_tensor]))  # initialize variable max_boxes_tensor

    # Use tf.image.non_max_suppression() to get the list of indices corresponding to boxes you keep
    ### START CODE HERE ### (≈ 1 line)
    nms_indices = tf.image.non_max_suppression(
        boxes,
        scores,
        max_boxes_tensor,
        iou_threshold,
        score_threshold=float('-inf'),
        name=None
    )
    ### END CODE HERE ###

    # Use K.gather() to select only nms_indices from scores, boxes and classes
    ### START CODE HERE ### (≈ 3 lines)
    # Gather the entries of scores, boxes and classes listed in nms_indices
    scores = K.gather(scores, nms_indices)
    boxes = K.gather(boxes, nms_indices)
    classes = K.gather(classes, nms_indices)
    ### END CODE HERE ###

    return scores, boxes, classes

Filtering all the boxes

def yolo_eval(yolo_outputs, image_shape=(720., 1280.), max_boxes=10, score_threshold=.6, iou_threshold=.5):
    """
    Converts the output of YOLO encoding (a lot of boxes) to your predicted boxes along with their scores, box coordinates and classes.

    Arguments:
    yolo_outputs -- output of the encoding model (for image_shape of (608, 608, 3)), contains 4 tensors:
                    box_confidence: tensor of shape (None, 19, 19, 5, 1)
                    box_xy: tensor of shape (None, 19, 19, 5, 2)
                    box_wh: tensor of shape (None, 19, 19, 5, 2)
                    box_class_probs: tensor of shape (None, 19, 19, 5, 80)
    image_shape -- tensor of shape (2,) containing the input shape, in this notebook we use (608., 608.) (has to be float32 dtype)
    max_boxes -- integer, maximum number of predicted boxes you'd like
    score_threshold -- real value, if [highest class probability score < threshold], then get rid of the corresponding box
    iou_threshold -- real value, "intersection over union" threshold used for NMS filtering

    Returns:
    scores -- tensor of shape (None,), predicted score for each box
    boxes -- tensor of shape (None, 4), predicted box coordinates
    classes -- tensor of shape (None,), predicted class for each box
    """

    ### START CODE HERE ###

    # Retrieve outputs of the YOLO model (≈1 line)
    box_confidence, box_xy, box_wh, box_class_probs = yolo_outputs

    # Convert boxes to be ready for filtering functions
    # (yolo_boxes_to_corners and scale_boxes are helper functions that are not defined in this post)
    boxes = yolo_boxes_to_corners(box_xy, box_wh)
    # Use one of the functions you've implemented to perform Score-filtering with a threshold of score_threshold (≈1 line)
    scores, boxes, classes = yolo_filter_boxes(box_confidence, boxes, box_class_probs, score_threshold)
    # Scale boxes back to original image shape.
    boxes = scale_boxes(boxes, image_shape)
    # Use one of the functions you've implemented to perform Non-max suppression with a threshold of iou_threshold (≈1 line)
    scores, boxes, classes = yolo_non_max_suppression(scores, boxes, classes, max_boxes, iou_threshold)
    ### END CODE HERE ###

    return scores, boxes, classes
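
The two helper calls in yolo_eval are not shown in this post. The NumPy sketch below is a guess at what they do: convert centre/size boxes to corners, then multiply by the image height and width. The (y_min, x_min, y_max, x_max) ordering and the names `boxes_to_corners` and `scale_to_image` are assumptions, not the helpers' actual code.

```python
import numpy as np


def boxes_to_corners(box_xy, box_wh):
    """Convert (x, y) centres and (w, h) sizes to (y_min, x_min, y_max, x_max)."""
    box_mins = box_xy - box_wh / 2.0
    box_maxes = box_xy + box_wh / 2.0
    return np.concatenate([
        box_mins[..., 1:2],   # y_min
        box_mins[..., 0:1],   # x_min
        box_maxes[..., 1:2],  # y_max
        box_maxes[..., 0:1],  # x_max
    ], axis=-1)


def scale_to_image(boxes, image_shape):
    """Scale relative corner boxes to pixel units for an image of (height, width)."""
    height, width = image_shape
    return boxes * np.array([height, width, height, width])


xy = np.array([[0.5, 0.5]])  # one box centred in the image
wh = np.array([[0.2, 0.4]])  # 20% of the width, 40% of the height
corners = boxes_to_corners(xy, wh)                 # [[0.3, 0.4, 0.7, 0.6]]
pixels = scale_to_image(corners, (720.0, 1280.0))  # [[216., 512., 504., 768.]]
print(corners, pixels)
```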

YOLO summary
