Parking Spot Detection with OpenCV

1. Implementation Pipeline

  1. Train a transfer-learning image classifier based on VGG16 that labels each parking spot as occupied or empty
  2. Process the parking-lot image with OpenCV and segment it into per-spot image crops
  3. Use the trained model to identify the empty spots and draw them on the image

2. Model Training

1.VGG16

VGG16 is a classic convolutional neural network (CNN) proposed by the Visual Geometry Group (VGG) at the University of Oxford in 2014, designed primarily for image classification. Its key characteristics:

  • Simple but deep: 16 weight layers (13 convolutional + 3 fully connected), built by stacking 3x3 kernels (multiple small kernels emulate a large receptive field).
  • An ImageNet milestone: runner-up in the 2014 ImageNet Large Scale Visual Recognition Challenge (ILSVRC) with a 7.3% top-5 error rate, demonstrating how much depth matters for vision tasks.
  • Uniform design: every convolutional layer uses the same configuration (3x3 kernels, stride 1, "same" padding), and the first two fully connected layers each have 4096 neurons.

1. Network Structure

Input (224x224x3)
→ 2 conv layers (64 channels) → max pooling
→ 2 conv layers (128 channels) → max pooling
→ 3 conv layers (256 channels) → max pooling
→ 3 conv layers (512 channels) → max pooling
→ 3 conv layers (512 channels) → max pooling
→ FC layer (4096 neurons) → Dropout
→ FC layer (4096 neurons) → Dropout
→ Output layer (1000 neurons, one per ImageNet class)
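
A quick way to confirm this layout is to print the stock model's layer list. A minimal sketch (my own addition, assuming the TensorFlow-bundled Keras applications module is available):

from tensorflow.keras import applications

# Full VGG16 with the ImageNet classification head attached
model = applications.VGG16(weights="imagenet", include_top=True)
model.summary()  # lists the 13 conv layers, 5 max-pooling layers, and 3 FC layers
print(f"Total parameters: {model.count_params():,}")  # about 138 million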

2. Transfer Learning

The core idea of transfer learning is to take a network pre-trained on a large dataset (such as ImageNet), reuse the generic visual features it has learned, and transfer them to a new task. VGG16 became a classic choice for transfer learning because of:

  1. Strong feature extraction
    • Generic low-level features: the shallow convolutional layers (the first few) learn edges, colors, textures, and other generic visual features that work for most image tasks (e.g. object detection, classification)
    • High-level semantic features: the deeper convolutional layers (the last few) capture higher-level semantics (wheels, windows, body structure, etc.), useful for complex tasks
  2. Pre-trained weight advantages
    • Large-scale training: VGG16 was trained on the ILSVRC subset of ImageNet (about 1.3 million images, 1000 classes) and has learned a rich set of visual patterns
    • Parameter effectiveness: its stacked 3x3 kernels extract features efficiently, and the weight distributions are stable
  3. Easy adaptation
    • Freeze some layers: keep the first few layers frozen (preserving the generic features) and fine-tune only the deeper ones to fit the new task; this post freezes the first 10 layers (see the sketch after this list)
    • Swap the head: remove the original classifier (ImageNet's 1000-way output) and attach a custom classification layer (the two-class head in this example)
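
To make the freezing step concrete, the sketch below (an illustration of the idea, not the training code used later in this post) loads the convolutional base, freezes the first 10 layers, and compares trainable parameter counts:

import numpy as np
from tensorflow.keras import applications

# Convolutional base only; 32x32 matches the spot crops used later in this post
base = applications.VGG16(weights="imagenet", include_top=False, input_shape=(32, 32, 3))

def trainable_count(model):
    return sum(int(np.prod(w.shape)) for w in model.trainable_weights)

print(f"Trainable params before freezing: {trainable_count(base):,}")
for layer in base.layers[:10]:  # the input layer plus the early convolutional blocks
    layer.trainable = False
print(f"Trainable params after freezing:  {trainable_count(base):,}")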

3. Limitations

Although VGG16 is well suited to transfer learning, it has clear drawbacks:

  • Large parameter count: the fully connected layers account for most of the roughly 138 million parameters, which makes overfitting easy
  • High compute cost: the depth makes both training and inference relatively slow
  • Surpassed by newer models: later architectures (e.g. ResNet, EfficientNet) beat it on both accuracy and efficiency
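
To see where the fully connected layers' share comes from: with a 7x7x512 feature map feeding the head, the three FC layers contribute roughly 7·7·512·4096 ≈ 102.8M, 4096·4096 ≈ 16.8M, and 4096·1000 ≈ 4.1M weights, i.e. about 124M of the ~138M total.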

2. Data Processing

_load_data_generator loads and preprocesses the image data for training. It uses ImageDataGenerator for data augmentation and normalization, and returns generators for the training and validation sets.

import os

from tensorflow.keras.preprocessing.image import ImageDataGenerator  # import path may vary by Keras/TF version

def _load_data_generator(_train_data_dir, _valid_data_dir, _img_width, _img_height, _batch_size):
    # 1️⃣ Remove .DS_Store files (macOS only)
    for directory in [_train_data_dir, _valid_data_dir]:
        ds_store_path = os.path.join(directory, ".DS_Store")
        if os.path.exists(ds_store_path):
            os.remove(ds_store_path)

    # 2️⃣ Augmenter for the training data: reduces overfitting and improves generalization
    train_datagen = ImageDataGenerator(
        rescale=1.0 / 255,       # normalize pixel values from [0,255] to [0,1]
        horizontal_flip=True,    # random horizontal flips, fine for left/right-symmetric objects (e.g. cars)
        fill_mode="nearest",     # fill pixels exposed by transforms with the nearest pixel
        zoom_range=0.1,          # random zoom up to 10%
        width_shift_range=0.1,   # random horizontal shift up to 10%
        height_shift_range=0.1,  # random vertical shift up to 10%
        rotation_range=5         # random rotation within ±5 degrees
    )
    # 3️⃣ Validation data is only normalized; no augmentation
    valid_datagen = ImageDataGenerator(rescale=1. / 255)

    # 4️⃣ Build the training and validation generators
    train_generator = train_datagen.flow_from_directory(
        _train_data_dir,
        target_size=(_img_height, _img_width),  # resize images
        batch_size=_batch_size,
        class_mode="categorical"  # multi-class labels as one-hot vectors, e.g. [1, 0]
    )
    validation_generator = valid_datagen.flow_from_directory(
        _valid_data_dir,
        target_size=(_img_height, _img_width),
        batch_size=_batch_size,
        class_mode="categorical"
    )
    return train_generator, validation_generator

3. Model Customization

Build a deep learning model for the two-class task (occupied/empty) by taking VGG16, adding a custom fully connected head, and fine-tuning only part of the network.

from tensorflow.keras import applications, optimizers
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model

def load_model(_img_width, _img_height, _num_classes):
    # 1️⃣ Load the pre-trained VGG16 model (weights trained on ImageNet)
    # include_top=False: drop VGG16's fully connected layers, keep only the convolutional base (feature extraction)
    # the 3 in input_shape is the RGB channel count; Keras expects (height, width, channels)
    model = applications.VGG16(weights='imagenet', include_top=False, input_shape=(_img_height, _img_width, 3))

    # 2️⃣ Exclude the first 10 layers from training and just use their pre-trained weights,
    # which reduces overfitting and speeds up training
    for layer in model.layers[:10]:
        layer.trainable = False  # freeze the first 10 layers of VGG16

    # 3️⃣ Flatten the feature maps
    x = model.output  # feature maps from VGG16's last convolutional block
    x = Flatten()(x)  # flatten into a 1D vector for the fully connected layer

    # 4️⃣ Attach the custom fully connected layer
    # Dense layer with num_classes outputs
    # softmax activation: turns the outputs into per-class probabilities
    predictions = Dense(_num_classes, activation="softmax")(x)  # output layer (2 classes)
    # 5️⃣ Assemble the full model
    # input: VGG16's input layer; output: the custom classification layer `predictions`
    model = Model(inputs=model.input, outputs=predictions)

    # 6️⃣ Compile the model
    # loss="categorical_crossentropy" for multi-class classification
    # optimizer: stochastic gradient descent (SGD) with learning_rate=0.0001 (a small step size keeps
    # training stable) and momentum=0.9 (remembers previous gradients, damping oscillation and speeding convergence)
    model.compile(loss="categorical_crossentropy",
                  optimizer=optimizers.SGD(learning_rate=0.0001, momentum=0.9),
                  metrics=["accuracy"])  # track accuracy during training
    return model
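
A note on scale: with the 32x32 inputs used below, VGG16's five pooling stages shrink the feature map to 1x1x512, so Flatten produces a 512-dimensional vector and the Dense head adds only 512·2 + 2 = 1,026 trainable parameters.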

4. Model Training

import numpy as np
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping

def train():
    # 1️⃣ Training parameters
    train_data_dir = "data/train"  # training set directory
    valid_data_dir = "data/valid"  # validation set directory
    train_files_count = _files_count(train_data_dir)  # number of images (helper defined elsewhere)
    valid_files_count = _files_count(valid_data_dir)
    batch_size = 32   # 32 images per batch
    epochs = 15       # train for 15 epochs
    num_classes = 2   # binary task (occupied/empty)
    img_width = 32    # image size (VGG16 preprocessing needs a fixed size)
    img_height = 32

    # 2️⃣ Build the model
    model = load_model(img_width, img_height, num_classes)

    # 3️⃣ Load the data
    train_generator, validation_generator = _load_data_generator(train_data_dir, valid_data_dir, img_width, img_height, batch_size)

    # 4️⃣ Training callbacks
    # Model checkpoint:
    # automatically saves the best model to car1.keras whenever val_accuracy reaches a new high
    # save_best_only=True: keep only the best model
    # save_weights_only=False: save the full model (architecture + weights)
    checkpoint = ModelCheckpoint("car1.keras", monitor='val_accuracy', verbose=1, save_best_only=True, save_weights_only=False, mode='auto')
    # Early stopping: stop if val_accuracy has not improved within 10 epochs
    early = EarlyStopping(monitor='val_accuracy',
                          min_delta=0,
                          patience=10,  # wait at most 10 epochs
                          verbose=1,
                          mode='max')   # maximize val_accuracy

    # 5️⃣ Compute the number of batches per epoch
    steps_per_epoch = np.ceil(train_files_count / batch_size).astype(int)
    validation_steps = np.ceil(valid_files_count / batch_size).astype(int)

    # 6️⃣ Train the model
    model.fit(
        train_generator,                       # training data
        steps_per_epoch=steps_per_epoch,       # batches per epoch
        epochs=epochs,                         # 15 epochs
        validation_data=validation_generator,  # validation dataset
        validation_steps=validation_steps,     # validation batches
        callbacks=[checkpoint, early]
    )

model.fit() prints progress logs during training, for example:

...
Epoch 13/15
12/12 ━━━━━━━━━━━━━━━━━━━━ 0s 452ms/step - accuracy: 0.9705 - loss: 0.0595
Epoch 13: val_accuracy did not improve from 0.94512
12/12 ━━━━━━━━━━━━━━━━━━━━ 6s 546ms/step - accuracy: 0.9712 - loss: 0.0587 - val_accuracy: 0.9451 - val_loss: 0.1267
Epoch 14/15
12/12 ━━━━━━━━━━━━━━━━━━━━ 0s 444ms/step - accuracy: 0.9904 - loss: 0.0406
Epoch 14: val_accuracy improved from 0.94512 to 0.95122, saving model to car1.keras
12/12 ━━━━━━━━━━━━━━━━━━━━ 7s 547ms/step - accuracy: 0.9905 - loss: 0.0403 - val_accuracy: 0.9512 - val_loss: 0.1178
Epoch 15/15
12/12 ━━━━━━━━━━━━━━━━━━━━ 0s 458ms/step - accuracy: 0.9963 - loss: 0.0310
Epoch 15: val_accuracy did not improve from 0.95122
12/12 ━━━━━━━━━━━━━━━━━━━━ 7s 552ms/step - accuracy: 0.9960 - loss: 0.0314 - val_accuracy: 0.9390 - val_loss: 0.1335

When training finishes, a car1.keras file is produced.
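
As a quick sanity check (my own addition, not part of the training script), the saved file can be reloaded and inspected before wiring it into the OpenCV pipeline; note that this load_model is Keras's own, distinct from the builder function above:

from tensorflow.keras.models import load_model as keras_load_model

model = keras_load_model("car1.keras")
print(model.input_shape)   # expect (None, 32, 32, 3)
print(model.output_shape)  # expect (None, 2): probabilities for the two classes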

3. Parking Spot Segmentation

1. Image Preprocessing

import cv2
import numpy as np

# Extract the white and yellow regions (the parking lines) and return the masked image
def select_rgb_white_yellow(image):
    lower = np.uint8([120, 120, 120])
    upper = np.uint8([255, 255, 255])
    # keep only the colors between lower and upper
    # mask is a binary image: white marks pixels inside the color range, black marks the rest
    mask = cv2.inRange(image, lower, upper)
    # keep the pixels under the white regions of the mask; everything else becomes black
    masked = cv2.bitwise_and(image, image, mask=mask)
    return masked

# Convert to grayscale for the edge-detection step
def convert_gray(image):
    return cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)

# Detect edges with the Canny algorithm; returns a binary edge image
def detect_edges(image, low_threshold=50, high_threshold=200):
    # low_threshold: lower bound used for edge linking; high_threshold: upper bound for strong edges
    # result is binary: edge pixels are white (255), non-edge pixels are black (0)
    return cv2.Canny(image, low_threshold, high_threshold)

white_yellow_image = parking.select_rgb_white_yellow(_image)
cv_show("white_yellow_images", white_yellow_image)
gray_image = parking.convert_gray(white_yellow_image)
cv_show("gray_image", gray_image)
edges_image = parking.detect_edges(gray_image)
cv_show("edges_image", edges_image)



2. Parking Region Selection

# Define the polygon vertices of the parking area (manually calibrated);
# returns the vertex coordinates and an image with the vertices marked
def select_region(image):
    rows, cols = image.shape[:2]
    pt_1 = [int(cols * 0.05), int(rows * 0.90)]
    pt_2 = [int(cols * 0.05), int(rows * 0.70)]
    pt_3 = [int(cols * 0.30), int(rows * 0.55)]
    pt_4 = [int(cols * 0.6), int(rows * 0.13)]
    pt_5 = [int(cols * 0.90), int(rows * 0.15)]
    pt_6 = [int(cols * 0.90), int(rows * 0.95)]

    vertices = np.array([pt_1, pt_2, pt_3, pt_4, pt_5, pt_6], dtype=np.int32)
    cv_vertices = [vertices.reshape(-1, 1, 2)]  # convert to OpenCV polygon format

    point_img = image.copy()
    point_img = cv2.cvtColor(point_img, cv2.COLOR_GRAY2RGB)
    for point in vertices:
        cv2.circle(point_img, (point[0], point[1]), 5, (0, 0, 255), 2)

    return point_img, cv_vertices

# Build a mask from the vertices, keep the parking area, and ignore the irrelevant background
def filter_region(image, vertices):
    mask = np.zeros_like(image)
    if len(mask.shape) == 2:
        cv2.fillPoly(mask, vertices, 255)  # fill the parking area
    return cv2.bitwise_and(image, mask)

(point_img, vertices) = parking.select_region(edges_image)
cv_show("point_img", point_img)
filter_region_image = parking.filter_region(edges_image, vertices)
cv_show("filter_region_image", filter_region_image)


3. Parking Line Detection and Clustering

import operator

# Detect line segments (the parking lines) with the probabilistic Hough transform,
# returning each segment's start and end coordinates
def hough_lines(image):
    # image: must be a single-channel binary image (typically the result of edge detection, e.g. Canny)
    # rho: distance resolution in pixels, set to 0.1 here; smaller values give finer resolution
    # theta: angle resolution in radians, set to np.pi / 10 (18 degrees) here; smaller values give finer angles
    # threshold: accumulator threshold, set to 15; lower values detect more lines, higher values fewer
    # minLineLength: minimum segment length, set to 9; shorter segments are discarded
    # maxLineGap: maximum gap between segments, set to 4; segments with a smaller gap are merged into one
    return cv2.HoughLinesP(image, rho=0.1, theta=np.pi / 10, threshold=15, minLineLength=9, maxLineGap=4)

# Filter and draw the qualifying segments (roughly horizontal, 25-55 pixels long)
def draw_lines(image, lines):
    # filter the lines returned by the Hough transform
    image = np.copy(image)
    cleaned = []
    for line in lines:
        for x1, y1, x2, y2 in line:
            if abs(y2 - y1) <= 1 and 25 <= abs(x2 - x1) <= 55:
                cleaned.append((x1, y1, x2, y2))
                cv2.line(image, (x1, y1), (x2, y2), [255, 0, 0], 1)
    print("Number of lines detected: ", len(cleaned))
    return image

# Cluster the detected segments into columns (one column per lane of spots)
# and compute each column's bounding rectangle
def identify_blocks(image, lines):
    _new_image = np.copy(image)

    # 1️⃣ Filter the segments
    cleaned = []
    for line in lines:
        for x1, y1, x2, y2 in line:
            if abs(y2 - y1) <= 1 and 25 <= abs(x2 - x1) <= 55:
                cleaned.append((x1, y1, x2, y2))

    # 2️⃣ Sort the segments by x1 (then y1)
    list1 = sorted(cleaned, key=operator.itemgetter(0, 1))

    # 3️⃣ Group into columns; each column corresponds to one lane of cars
    clusters = {}
    d_index = 0
    clus_dist = 10

    for i in range(len(list1) - 1):
        distance = abs(list1[i + 1][0] - list1[i][0])
        if distance <= clus_dist:
            if d_index not in clusters:
                clusters[d_index] = []
            clusters[d_index].append(list1[i])
            clusters[d_index].append(list1[i + 1])
        else:
            d_index += 1

    # 4️⃣ Compute each column's bounding coordinates
    rects = {}
    i = 0
    for key in clusters:
        all_list = clusters[key]
        cleaned = list(set(all_list))
        if len(cleaned) > 5:
            cleaned = sorted(cleaned, key=lambda tup: tup[1])
            avg_y1 = cleaned[0][1]
            avg_y2 = cleaned[-1][1]
            avg_x1 = 0
            avg_x2 = 0
            for tup in cleaned:
                avg_x1 += tup[0]
                avg_x2 += tup[2]
            avg_x1 = avg_x1 / len(cleaned)
            avg_x2 = avg_x2 / len(cleaned)
            rects[i] = (avg_x1, avg_y1, avg_x2, avg_y2)
            i += 1

    print("Num Parking Lanes: ", len(rects))
    # 5️⃣ Draw the column rectangles
    buff = 7
    for key in rects:
        tup_top_left = (int(rects[key][0] - buff), int(rects[key][1]))
        tup_bot_right = (int(rects[key][2] + buff), int(rects[key][3]))
        cv2.rectangle(_new_image, tup_top_left, tup_bot_right, (0, 255, 0), 1)
    return _new_image, rects

lines = parking.hough_lines(filter_region_image)
line_image = parking.draw_lines(_image, lines)
cv_show("line_image", line_image)
rect_image, rects = parking.identify_blocks(_image, lines)
cv_show("rect_image", rect_image)


4. Spot Division and Numbering

Divide the spots using each column's bounding rectangle:

  • Draw horizontal lines at a fixed spacing (gap=15.5) inside each column to separate the spots
  • For the interior columns, add a vertical center line splitting each spot into a left and a right half
  • Fine-tune the coordinates (adj_x1, adj_y1, etc.) to fit the actual scene
import random

def draw_parking(image, rects, thickness=1, save=False):
    color = [255, 0, 0]
    new_image = np.copy(image)
    gap = 15.5
    cur_len = 0
    spot_dict = {}  # dictionary: one spot per entry, keyed by its position
    tot_spots = 0
    # manual fine-tuning of each column's box
    adj_y1 = {0: 20, 1: -10, 2: 0, 3: -11, 4: 28, 5: 5, 6: -15, 7: -15, 8: -10, 9: -30, 10: 9, 11: -32}
    adj_y2 = {0: 30, 1: 50, 2: 15, 3: 10, 4: -15, 5: 15, 6: 15, 7: -20, 8: 15, 9: 15, 10: 0, 11: 30}

    adj_x1 = {0: -8, 1: -15, 2: -15, 3: -15, 4: -15, 5: -15, 6: -15, 7: -15, 8: -10, 9: -10, 10: -10, 11: 0}
    adj_x2 = {0: 0, 1: 15, 2: 15, 3: 15, 4: 15, 5: 15, 6: 15, 7: 15, 8: 10, 9: 10, 10: 10, 11: 0}

    for key in rects:
        tup = rects[key]
        # dict.get supplies a default of 0 for keys that are missing
        x1 = int(tup[0] + adj_x1.get(key, 0))
        x2 = int(tup[2] + adj_x2.get(key, 0))
        y1 = int(tup[1] + adj_y1.get(key, 0))
        y2 = int(tup[3] + adj_y2.get(key, 0))
        cv2.rectangle(new_image, (x1, y1), (x2, y2), (0, 255, 0), 1)
        num_splits = int(abs(y2 - y1) // gap)
        for i in range(0, num_splits + 1):
            y = int(y1 + i * gap)
            cv2.line(new_image, (x1, y), (x2, y), color, thickness)
        if 0 < key < len(rects) - 1:
            # vertical center line for the interior (double) columns
            x = int((x1 + x2) / 2)
            cv2.line(new_image, (x, y1), (x, y2), color, thickness)
        # count the spots
        if key == 0 or key == (len(rects) - 1):
            tot_spots += num_splits + 1
        else:
            tot_spots += 2 * (num_splits + 1)

        # record each spot's coordinates and number
        if key == 0 or key == (len(rects) - 1):
            for i in range(0, num_splits + 1):
                cur_len = len(spot_dict)
                y = int(y1 + i * gap)
                spot_dict[(x1, y, x2, y + gap)] = cur_len + 1
        else:
            for i in range(0, num_splits + 1):
                cur_len = len(spot_dict)
                y = int(y1 + i * gap)
                x = int((x1 + x2) / 2)
                spot_dict[(x1, y, x, y + gap)] = cur_len + 1
                spot_dict[(x, y, x2, y + gap)] = cur_len + 2

    print(f"total parking spaces: {tot_spots}, len: {len(spot_dict)}")
    if save:
        filename = f'with_parking_{str(random.randint(1000, 9999))}.jpg'
        cv2.imwrite(filename, new_image)
    return new_image, spot_dict

draw_parking_image, _spot_dict = parking.draw_parking(_image, rects)
cv_show("draw_parking_image", draw_parking_image)


The end result is the spot dictionary spot_dict, which records each spot's coordinates and number.
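
The keys are (x1, y1, x2, y2) crop rectangles and the values are 1-based spot numbers. Illustrative shape (the coordinates below are invented, not taken from the real image):

spot_dict = {
    (44, 355, 140, 370.5): 1,   # first column: full-width spots
    (44, 370, 140, 385.5): 2,
    (162, 128, 220, 143.5): 3,  # interior columns are split at the center line
    (220, 128, 278, 143.5): 4,
    # ...
}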

4. Spot Status Prediction

# Use the trained model to predict a spot's status: empty or occupied
def make_prediction(image, model, class_dictionary):
    # preprocessing: normalize to [0,1], matching training
    img = image / 255.

    # add a batch dimension to form a 4D tensor
    image = np.expand_dims(img, axis=0)

    # run inference with the trained model
    class_predicted = model.predict(image)
    in_id = np.argmax(class_predicted[0])
    label = class_dictionary[in_id]
    return label

1. Image Prediction

# Iterate over all spots, predict each one's status with the model,
# and mark the empty spots on the image (semi-transparent green rectangles)
def predict_on_image(image, spot_dict, model, class_dictionary, color=None, alpha=0.5):
    if color is None:
        color = [0, 255, 0]
    new_image = np.copy(image)
    overlay = np.copy(image)
    cnt_empty = 0
    all_spots = 0
    for spot in spot_dict.keys():
        all_spots += 1
        (x1, y1, x2, y2) = spot
        (x1, y1, x2, y2) = (int(x1), int(y1), int(x2), int(y2))
        spot_img = image[y1:y2, x1:x2]
        spot_img = cv2.resize(spot_img, (32, 32))

        label = make_prediction(spot_img, model, class_dictionary)
        if label == 'empty':
            cv2.rectangle(overlay, (int(x1), int(y1)), (int(x2), int(y2)), color, -1)
            cnt_empty += 1

    cv2.addWeighted(overlay, alpha, new_image, 1 - alpha, 0, new_image)

    cv2.putText(new_image, "Available: %d spots" % cnt_empty, (30, 95),
                cv2.FONT_HERSHEY_SIMPLEX,
                0.7, (255, 255, 255), 2)

    cv2.putText(new_image, "Total: %d spots" % all_spots, (30, 125),
                cv2.FONT_HERSHEY_SIMPLEX,
                0.7, (255, 255, 255), 2)
    return new_image

2. Video Prediction

# Process the video, classifying every 20th frame, and write the detection results to an output video
def predict_on_video(video_name, spot_dict, model, class_dictionary, output_name="parking_video_output.mp4"):
    cap = cv2.VideoCapture(video_name)

    # read the frame rate and resolution of the input video
    fps = int(cap.get(cv2.CAP_PROP_FPS))
    frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

    # set up the video writer (MP4 format)
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')  # codec
    out = cv2.VideoWriter(output_name, fourcc, fps, (frame_width, frame_height))

    frame_count = 0
    alpha = 0.5  # overlay transparency
    color = (0, 255, 0)  # green

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break  # read failed; exit the loop

        frame_count += 1
        if frame_count % 20 != 0:  # only process every 20th frame, for performance
            out.write(frame)  # write the raw frame unchanged
            continue

        overlay = frame.copy()
        empty_spots = 0
        total_spots = len(spot_dict)

        for (x1, y1, x2, y2) in spot_dict.keys():
            x1, y1, x2, y2 = map(int, [x1, y1, x2, y2])  # ensure integer coordinates
            spot_img = frame[y1:y2, x1:x2]

            if spot_img.size == 0:
                continue  # skip invalid crop regions

            spot_img = cv2.resize(spot_img, (32, 32))
            label = make_prediction(spot_img, model, class_dictionary)

            if label == 'empty':
                cv2.rectangle(overlay, (x1, y1), (x2, y2), color, -1)
                empty_spots += 1

        # blend the semi-transparent overlay onto the frame
        cv2.addWeighted(overlay, alpha, frame, 1 - alpha, 0, frame)

        # draw the spot counts
        cv2.putText(frame, f"Available: {empty_spots} spots", (30, 95),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 255, 255), 2)
        cv2.putText(frame, f"Total: {total_spots} spots", (30, 125),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 255, 255), 2)

        out.write(frame)  # write the annotated frame to the output video
        # cv_show('Parking Detection', frame)

        if cv2.waitKey(10) & 0xFF == ord('q'):
            break

    cap.release()
    out.release()
    cv2.destroyAllWindows()
    print(f"Processed video saved as {output_name}")

3. Main Entry Point

# Test a single image: show the original and the prediction result
def image_test(_image, _area_dict, _model, _class_dict):
    cv_show("image", _image)
    predicted_image = parking.predict_on_image(_image, _area_dict, _model, _class_dict)
    cv_show("predicted_image", predicted_image)  # show the annotated image
    filename = f'with_marking_{str(random.randint(1000, 9999))}.jpg'
    cv2.imwrite(filename, predicted_image)  # save the annotated image

# Test a video
def video_test(video_name, _area_dict, _model, _class_dict):
    parking.predict_on_video(video_name, _area_dict, _model, _class_dict)


if __name__ == '__main__':
    # 1. Train the model and generate car1.keras
    # train_model.train()
    model = load_model("car1.keras")  # Keras's load_model, not the builder defined in the training code

    # 2. Extract the parking spot regions
    image = cv2.imread("images/frame_0006.jpg")
    area_dict = get_parking_area(image)

    # 3. Image: mark the empty spots
    class_dict = {0: 'empty', 1: 'occupied'}
    image_test(image, area_dict, model, class_dict)

    # 4. Video: mark the empty spots
    # video_test('parking_video.mp4', area_dict, model, class_dict)


Run the video version yourself to see the result.

5. Summary

1. Model Training

  • Use the pre-trained VGG16, dropping its fully connected layers and keeping only the convolutional base for feature extraction
  • Transfer learning: freeze the first 10 layers and fine-tune only the deeper ones for better generalization
  • Data preprocessing:
    • Augmentation (flips, zoom, shifts, rotation) to reduce overfitting
    • Normalization (pixel values scaled to [0,1])
  • Custom classification head:
    • Fully connected layer + softmax activation for the two-class decision (empty/occupied)
  • Training:
    • categorical_crossentropy as the loss function, SGD as the optimizer
    • Early stopping plus best-model checkpointing (car1.keras)

2. Parking Spot Segmentation

  • Image preprocessing:
    • Keep the white/yellow parking lines, then convert to grayscale
    • Canny edge detection to extract the spot boundaries
  • Parking region selection:
    • Manually calibrate the parking area and keep only the relevant part
    • Build a mask to ignore background clutter
  • Line detection and clustering:
    • Hough transform to detect the horizontal parking lines
    • Cluster the lines into columns (lanes) and compute each column's bounding rectangle

6. Notes

Environment

  • macOS: 15.2
  • python: 3.12.4
  • numpy: 1.26.4
  • opencv-python: 4.11.0.86
  • keras: 3.9.0

Resources and Code

https://github.com/keychankc/dl_code_for_blog/tree/main/008_opencv_park