Parking Spot Detection with OpenCV

1. Implementation Pipeline

  1. Train a transfer-learning image classifier based on VGG16 that labels each parking spot as occupied or empty
  2. Process the parking-lot image with OpenCV and segment it into per-spot image crops
  3. Use the trained model to identify the empty spots and draw them on the image

2. Model Training

1.VGG16

VGG16 is a classic convolutional neural network (CNN) proposed by the Visual Geometry Group (VGG) at the University of Oxford in 2014, designed primarily for image classification. Its key characteristics:

  • Simple but deep: 16 weight layers (13 convolutional + 3 fully connected), built by stacking 3x3 kernels (multiple small kernels emulate a large receptive field).
  • An ImageNet milestone: runner-up in the 2014 ImageNet Large Scale Visual Recognition Challenge (ILSVRC) with a 7.3% top-5 error rate, demonstrating how much depth matters for vision tasks.
  • Uniform design: every convolutional layer uses the same configuration (3x3 kernels, stride 1, "same" padding), and the first two fully connected layers each have 4096 neurons.

1. Network Structure

Input (224x224x3)
→ 2 conv layers (64 channels) → max pooling
→ 2 conv layers (128 channels) → max pooling
→ 3 conv layers (256 channels) → max pooling
→ 3 conv layers (512 channels) → max pooling
→ 3 conv layers (512 channels) → max pooling
→ FC layer (4096 neurons) → Dropout
→ FC layer (4096 neurons) → Dropout
→ Output layer (1000 neurons, one per ImageNet class)
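
A quick way to confirm this layout is to print the stock model's layer list. A minimal sketch (my own addition, assuming the TensorFlow-bundled Keras applications module is available):

from tensorflow.keras import applications

# Full VGG16 with the ImageNet classification head attached
model = applications.VGG16(weights="imagenet", include_top=True)
model.summary()  # lists the 13 conv layers, 5 max-pooling layers, and 3 FC layers
print(f"Total parameters: {model.count_params():,}")  # about 138 million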

2. Transfer Learning

The core idea of transfer learning is to take a network pre-trained on a large dataset (such as ImageNet), reuse the generic visual features it has learned, and transfer them to a new task. VGG16 became a classic choice for transfer learning because of:

  1. Strong feature extraction
    • Generic low-level features: the shallow convolutional layers (the first few) learn edges, colors, textures, and other generic visual features that work for most image tasks (e.g. object detection, classification)
    • High-level semantic features: the deeper convolutional layers (the last few) capture higher-level semantics (wheels, windows, body structure, etc.), useful for complex tasks
  2. Pre-trained weight advantages
    • Large-scale training: VGG16 was trained on the ILSVRC subset of ImageNet (about 1.3 million images, 1000 classes) and has learned a rich set of visual patterns
    • Parameter effectiveness: its stacked 3x3 kernels extract features efficiently, and the weight distributions are stable
  3. Easy adaptation
    • Freeze some layers: keep the first few layers frozen (preserving the generic features) and fine-tune only the deeper ones to fit the new task; this post freezes the first 10 layers (see the sketch after this list)
    • Swap the head: remove the original classifier (ImageNet's 1000-way output) and attach a custom classification layer (the two-class head in this example)
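
To make the freezing step concrete, the sketch below (an illustration of the idea, not the training code used later in this post) loads the convolutional base, freezes the first 10 layers, and compares trainable parameter counts:

import numpy as np
from tensorflow.keras import applications

# Convolutional base only; 32x32 matches the spot crops used later in this post
base = applications.VGG16(weights="imagenet", include_top=False, input_shape=(32, 32, 3))

def trainable_count(model):
    return sum(int(np.prod(w.shape)) for w in model.trainable_weights)

print(f"Trainable params before freezing: {trainable_count(base):,}")
for layer in base.layers[:10]:  # the input layer plus the early convolutional blocks
    layer.trainable = False
print(f"Trainable params after freezing:  {trainable_count(base):,}")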

3. Limitations

Although VGG16 is well suited to transfer learning, it has clear drawbacks:

  • Large parameter count: the fully connected layers account for most of the roughly 138 million parameters, which makes overfitting easy
  • High compute cost: the depth makes both training and inference relatively slow
  • Surpassed by newer models: later architectures (e.g. ResNet, EfficientNet) beat it on both accuracy and efficiency
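
To see where the fully connected layers' share comes from: with a 7x7x512 feature map feeding the head, the three FC layers contribute roughly 7·7·512·4096 ≈ 102.8M, 4096·4096 ≈ 16.8M, and 4096·1000 ≈ 4.1M weights, i.e. about 124M of the ~138M total.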

2. Data Processing

_load_data_generator loads and preprocesses the image data for training. It uses ImageDataGenerator for data augmentation and normalization, and returns generators for the training and validation sets.

import os

from tensorflow.keras.preprocessing.image import ImageDataGenerator  # import path may vary by Keras/TF version

def _load_data_generator(_train_data_dir, _valid_data_dir, _img_width, _img_height, _batch_size):
    # 1️⃣ Remove .DS_Store files (macOS only)
    for directory in [_train_data_dir, _valid_data_dir]:
        ds_store_path = os.path.join(directory, ".DS_Store")
        if os.path.exists(ds_store_path):
            os.remove(ds_store_path)

    # 2️⃣ Augmenter for the training data: reduces overfitting and improves generalization
    train_datagen = ImageDataGenerator(
        rescale=1.0 / 255,       # normalize pixel values from [0,255] to [0,1]
        horizontal_flip=True,    # random horizontal flips, fine for left/right-symmetric objects (e.g. cars)
        fill_mode="nearest",     # fill pixels exposed by transforms with the nearest pixel
        zoom_range=0.1,          # random zoom up to 10%
        width_shift_range=0.1,   # random horizontal shift up to 10%
        height_shift_range=0.1,  # random vertical shift up to 10%
        rotation_range=5         # random rotation within ±5 degrees
    )
    # 3️⃣ Validation data is only normalized; no augmentation
    valid_datagen = ImageDataGenerator(rescale=1. / 255)

    # 4️⃣ Build the training and validation generators
    train_generator = train_datagen.flow_from_directory(
        _train_data_dir,
        target_size=(_img_height, _img_width),  # resize images
        batch_size=_batch_size,
        class_mode="categorical"  # multi-class labels as one-hot vectors, e.g. [1, 0]
    )
    validation_generator = valid_datagen.flow_from_directory(
        _valid_data_dir,
        target_size=(_img_height, _img_width),
        batch_size=_batch_size,
        class_mode="categorical"
    )
    return train_generator, validation_generator

3. Model Customization

Build a deep learning model for the two-class task (occupied/empty) by taking VGG16, adding a custom fully connected head, and fine-tuning only part of the network.

from tensorflow.keras import applications, optimizers
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model

def load_model(_img_width, _img_height, _num_classes):
    # 1️⃣ Load the pre-trained VGG16 model (weights trained on ImageNet)
    # include_top=False: drop VGG16's fully connected layers, keep only the convolutional base (feature extraction)
    # the 3 in input_shape is the RGB channel count; Keras expects (height, width, channels)
    model = applications.VGG16(weights='imagenet', include_top=False, input_shape=(_img_height, _img_width, 3))

    # 2️⃣ Exclude the first 10 layers from training and just use their pre-trained weights,
    # which reduces overfitting and speeds up training
    for layer in model.layers[:10]:
        layer.trainable = False  # freeze the first 10 layers of VGG16

    # 3️⃣ Flatten the feature maps
    x = model.output  # feature maps from VGG16's last convolutional block
    x = Flatten()(x)  # flatten into a 1D vector for the fully connected layer

    # 4️⃣ Attach the custom fully connected layer
    # Dense layer with num_classes outputs
    # softmax activation: turns the outputs into per-class probabilities
    predictions = Dense(_num_classes, activation="softmax")(x)  # output layer (2 classes)
    # 5️⃣ Assemble the full model
    # input: VGG16's input layer; output: the custom classification layer `predictions`
    model = Model(inputs=model.input, outputs=predictions)

    # 6️⃣ Compile the model
    # loss="categorical_crossentropy" for multi-class classification
    # optimizer: stochastic gradient descent (SGD) with learning_rate=0.0001 (a small step size keeps
    # training stable) and momentum=0.9 (remembers previous gradients, damping oscillation and speeding convergence)
    model.compile(loss="categorical_crossentropy",
                  optimizer=optimizers.SGD(learning_rate=0.0001, momentum=0.9),
                  metrics=["accuracy"])  # track accuracy during training
    return model
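
A note on scale: with the 32x32 inputs used below, VGG16's five pooling stages shrink the feature map to 1x1x512, so Flatten produces a 512-dimensional vector and the Dense head adds only 512·2 + 2 = 1,026 trainable parameters.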

4. Model Training

import numpy as np
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping

def train():
    # 1️⃣ Training parameters
    train_data_dir = "data/train"  # training set directory
    valid_data_dir = "data/valid"  # validation set directory
    train_files_count = _files_count(train_data_dir)  # number of images (helper defined elsewhere)
    valid_files_count = _files_count(valid_data_dir)
    batch_size = 32   # 32 images per batch
    epochs = 15       # train for 15 epochs
    num_classes = 2   # binary task (occupied/empty)
    img_width = 32    # image size (VGG16 preprocessing needs a fixed size)
    img_height = 32

    # 2️⃣ Build the model
    model = load_model(img_width, img_height, num_classes)

    # 3️⃣ Load the data
    train_generator, validation_generator = _load_data_generator(train_data_dir, valid_data_dir, img_width, img_height, batch_size)

    # 4️⃣ Training callbacks
    # Model checkpoint:
    # automatically saves the best model to car1.keras whenever val_accuracy reaches a new high
    # save_best_only=True: keep only the best model
    # save_weights_only=False: save the full model (architecture + weights)
    checkpoint = ModelCheckpoint("car1.keras", monitor='val_accuracy', verbose=1, save_best_only=True, save_weights_only=False, mode='auto')
    # Early stopping: stop if val_accuracy has not improved within 10 epochs
    early = EarlyStopping(monitor='val_accuracy',
                          min_delta=0,
                          patience=10,  # wait at most 10 epochs
                          verbose=1,
                          mode='max')   # maximize val_accuracy

    # 5️⃣ Compute the number of batches per epoch
    steps_per_epoch = np.ceil(train_files_count / batch_size).astype(int)
    validation_steps = np.ceil(valid_files_count / batch_size).astype(int)

    # 6️⃣ Train the model
    model.fit(
        train_generator,                       # training data
        steps_per_epoch=steps_per_epoch,       # batches per epoch
        epochs=epochs,                         # 15 epochs
        validation_data=validation_generator,  # validation dataset
        validation_steps=validation_steps,     # validation batches
        callbacks=[checkpoint, early]
    )

model.fit() prints progress logs during training, for example:

...
Epoch 13/15
12/12 ━━━━━━━━━━━━━━━━━━━━ 0s 452ms/step - accuracy: 0.9705 - loss: 0.0595
Epoch 13: val_accuracy did not improve from 0.94512
12/12 ━━━━━━━━━━━━━━━━━━━━ 6s 546ms/step - accuracy: 0.9712 - loss: 0.0587 - val_accuracy: 0.9451 - val_loss: 0.1267
Epoch 14/15
12/12 ━━━━━━━━━━━━━━━━━━━━ 0s 444ms/step - accuracy: 0.9904 - loss: 0.0406
Epoch 14: val_accuracy improved from 0.94512 to 0.95122, saving model to car1.keras
12/12 ━━━━━━━━━━━━━━━━━━━━ 7s 547ms/step - accuracy: 0.9905 - loss: 0.0403 - val_accuracy: 0.9512 - val_loss: 0.1178
Epoch 15/15
12/12 ━━━━━━━━━━━━━━━━━━━━ 0s 458ms/step - accuracy: 0.9963 - loss: 0.0310
Epoch 15: val_accuracy did not improve from 0.95122
12/12 ━━━━━━━━━━━━━━━━━━━━ 7s 552ms/step - accuracy: 0.9960 - loss: 0.0314 - val_accuracy: 0.9390 - val_loss: 0.1335

When training finishes, a car1.keras file is produced.
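
As a quick sanity check (my own addition, not part of the training script), the saved file can be reloaded and inspected before wiring it into the OpenCV pipeline; note that this load_model is Keras's own, distinct from the builder function above:

from tensorflow.keras.models import load_model as keras_load_model

model = keras_load_model("car1.keras")
print(model.input_shape)   # expect (None, 32, 32, 3)
print(model.output_shape)  # expect (None, 2): probabilities for the two classes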

3. Parking Spot Segmentation

1. Image Preprocessing

import cv2
import numpy as np

# Extract the white and yellow regions (the parking lines) and return the masked image
def select_rgb_white_yellow(image):
    lower = np.uint8([120, 120, 120])
    upper = np.uint8([255, 255, 255])
    # keep only the colors between lower and upper
    # mask is a binary image: white marks pixels inside the color range, black marks the rest
    mask = cv2.inRange(image, lower, upper)
    # keep the pixels under the white regions of the mask; everything else becomes black
    masked = cv2.bitwise_and(image, image, mask=mask)
    return masked

# Convert to grayscale for the edge-detection step
def convert_gray(image):
    return cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)

# Detect edges with the Canny algorithm; returns a binary edge image
def detect_edges(image, low_threshold=50, high_threshold=200):
    # low_threshold: lower bound used for edge linking; high_threshold: upper bound for strong edges
    # result is binary: edge pixels are white (255), non-edge pixels are black (0)
    return cv2.Canny(image, low_threshold, high_threshold)

white_yellow_image = parking.select_rgb_white_yellow(_image)
cv_show("white_yellow_images", white_yellow_image)
gray_image = parking.convert_gray(white_yellow_image)
cv_show("gray_image", gray_image)
edges_image = parking.detect_edges(gray_image)
cv_show("edges_image", edges_image)



2. Parking Region Selection

# Define the polygon vertices of the parking area (manually calibrated);
# returns the vertex coordinates and an image with the vertices marked
def select_region(image):
    rows, cols = image.shape[:2]
    pt_1 = [int(cols * 0.05), int(rows * 0.90)]
    pt_2 = [int(cols * 0.05), int(rows * 0.70)]
    pt_3 = [int(cols * 0.30), int(rows * 0.55)]
    pt_4 = [int(cols * 0.6), int(rows * 0.13)]
    pt_5 = [int(cols * 0.90), int(rows * 0.15)]
    pt_6 = [int(cols * 0.90), int(rows * 0.95)]

    vertices = np.array([pt_1, pt_2, pt_3, pt_4, pt_5, pt_6], dtype=np.int32)
    cv_vertices = [vertices.reshape(-1, 1, 2)]  # convert to OpenCV polygon format

    point_img = image.copy()
    point_img = cv2.cvtColor(point_img, cv2.COLOR_GRAY2RGB)
    for point in vertices:
        cv2.circle(point_img, (point[0], point[1]), 5, (0, 0, 255), 2)

    return point_img, cv_vertices

# Build a mask from the vertices, keep the parking area, and ignore the irrelevant background
def filter_region(image, vertices):
    mask = np.zeros_like(image)
    if len(mask.shape) == 2:
        cv2.fillPoly(mask, vertices, 255)  # fill the parking area
    return cv2.bitwise_and(image, mask)

(point_img, vertices) = parking.select_region(edges_image)
cv_show("point_img", point_img)
filter_region_image = parking.filter_region(edges_image, vertices)
cv_show("filter_region_image", filter_region_image)


3. Parking Line Detection and Clustering

import operator

# Detect line segments (the parking lines) with the probabilistic Hough transform,
# returning each segment's start and end coordinates
def hough_lines(image):
    # image: must be a single-channel binary image (typically the result of edge detection, e.g. Canny)
    # rho: distance resolution in pixels, set to 0.1 here; smaller values give finer resolution
    # theta: angle resolution in radians, set to np.pi / 10 (18 degrees) here; smaller values give finer angles
    # threshold: accumulator threshold, set to 15; lower values detect more lines, higher values fewer
    # minLineLength: minimum segment length, set to 9; shorter segments are discarded
    # maxLineGap: maximum gap between segments, set to 4; segments with a smaller gap are merged into one
    return cv2.HoughLinesP(image, rho=0.1, theta=np.pi / 10, threshold=15, minLineLength=9, maxLineGap=4)

# Filter and draw the qualifying segments (roughly horizontal, 25-55 pixels long)
def draw_lines(image, lines):
    # filter the lines returned by the Hough transform
    image = np.copy(image)
    cleaned = []
    for line in lines:
        for x1, y1, x2, y2 in line:
            if abs(y2 - y1) <= 1 and 25 <= abs(x2 - x1) <= 55:
                cleaned.append((x1, y1, x2, y2))
                cv2.line(image, (x1, y1), (x2, y2), [255, 0, 0], 1)
    print("Number of lines detected: ", len(cleaned))
    return image

# Cluster the detected segments into columns (one column per lane of spots)
# and compute each column's bounding rectangle
def identify_blocks(image, lines):
    _new_image = np.copy(image)

    # 1️⃣ Filter the segments
    cleaned = []
    for line in lines:
        for x1, y1, x2, y2 in line:
            if abs(y2 - y1) <= 1 and 25 <= abs(x2 - x1) <= 55:
                cleaned.append((x1, y1, x2, y2))

    # 2️⃣ Sort the segments by x1 (then y1)
    list1 = sorted(cleaned, key=operator.itemgetter(0, 1))

    # 3️⃣ Group into columns; each column corresponds to one lane of cars
    clusters = {}
    d_index = 0
    clus_dist = 10

    for i in range(len(list1) - 1):
        distance = abs(list1[i + 1][0] - list1[i][0])
        if distance <= clus_dist:
            if d_index not in clusters:
                clusters[d_index] = []
            clusters[d_index].append(list1[i])
            clusters[d_index].append(list1[i + 1])
        else:
            d_index += 1

    # 4️⃣ Compute each column's bounding coordinates
    rects = {}
    i = 0
    for key in clusters:
        all_list = clusters[key]
        cleaned = list(set(all_list))
        if len(cleaned) > 5:
            cleaned = sorted(cleaned, key=lambda tup: tup[1])
            avg_y1 = cleaned[0][1]
            avg_y2 = cleaned[-1][1]
            avg_x1 = 0
            avg_x2 = 0
            for tup in cleaned:
                avg_x1 += tup[0]
                avg_x2 += tup[2]
            avg_x1 = avg_x1 / len(cleaned)
            avg_x2 = avg_x2 / len(cleaned)
            rects[i] = (avg_x1, avg_y1, avg_x2, avg_y2)
            i += 1

    print("Num Parking Lanes: ", len(rects))
    # 5️⃣ Draw the column rectangles
    buff = 7
    for key in rects:
        tup_top_left = (int(rects[key][0] - buff), int(rects[key][1]))
        tup_bot_right = (int(rects[key][2] + buff), int(rects[key][3]))
        cv2.rectangle(_new_image, tup_top_left, tup_bot_right, (0, 255, 0), 1)
    return _new_image, rects

lines = parking.hough_lines(filter_region_image)
line_image = parking.draw_lines(_image, lines)
cv_show("line_image", line_image)
rect_image, rects = parking.identify_blocks(_image, lines)
cv_show("rect_image", rect_image)


4. Spot Division and Numbering

Divide the spots using each column's bounding rectangle:

  • Draw horizontal lines at a fixed spacing (gap=15.5) inside each column to separate the spots
  • For the interior columns, add a vertical center line splitting each spot into a left and a right half
  • Fine-tune the coordinates (adj_x1, adj_y1, etc.) to fit the actual scene
import random

def draw_parking(image, rects, thickness=1, save=False):
    color = [255, 0, 0]
    new_image = np.copy(image)
    gap = 15.5
    cur_len = 0
    spot_dict = {}  # dictionary: one spot per entry, keyed by its position
    tot_spots = 0
    # manual fine-tuning of each column's box
    adj_y1 = {0: 20, 1: -10, 2: 0, 3: -11, 4: 28, 5: 5, 6: -15, 7: -15, 8: -10, 9: -30, 10: 9, 11: -32}
    adj_y2 = {0: 30, 1: 50, 2: 15, 3: 10, 4: -15, 5: 15, 6: 15, 7: -20, 8: 15, 9: 15, 10: 0, 11: 30}

    adj_x1 = {0: -8, 1: -15, 2: -15, 3: -15, 4: -15, 5: -15, 6: -15, 7: -15, 8: -10, 9: -10, 10: -10, 11: 0}
    adj_x2 = {0: 0, 1: 15, 2: 15, 3: 15, 4: 15, 5: 15, 6: 15, 7: 15, 8: 10, 9: 10, 10: 10, 11: 0}

    for key in rects:
        tup = rects[key]
        # dict.get supplies a default of 0 for keys that are missing
        x1 = int(tup[0] + adj_x1.get(key, 0))
        x2 = int(tup[2] + adj_x2.get(key, 0))
        y1 = int(tup[1] + adj_y1.get(key, 0))
        y2 = int(tup[3] + adj_y2.get(key, 0))
        cv2.rectangle(new_image, (x1, y1), (x2, y2), (0, 255, 0), 1)
        num_splits = int(abs(y2 - y1) // gap)
        for i in range(0, num_splits + 1):
            y = int(y1 + i * gap)
            cv2.line(new_image, (x1, y), (x2, y), color, thickness)
        if 0 < key < len(rects) - 1:
            # vertical center line for the interior (double) columns
            x = int((x1 + x2) / 2)
            cv2.line(new_image, (x, y1), (x, y2), color, thickness)
        # count the spots
        if key == 0 or key == (len(rects) - 1):
            tot_spots += num_splits + 1
        else:
            tot_spots += 2 * (num_splits + 1)

        # record each spot's coordinates and number
        if key == 0 or key == (len(rects) - 1):
            for i in range(0, num_splits + 1):
                cur_len = len(spot_dict)
                y = int(y1 + i * gap)
                spot_dict[(x1, y, x2, y + gap)] = cur_len + 1
        else:
            for i in range(0, num_splits + 1):
                cur_len = len(spot_dict)
                y = int(y1 + i * gap)
                x = int((x1 + x2) / 2)
                spot_dict[(x1, y, x, y + gap)] = cur_len + 1
                spot_dict[(x, y, x2, y + gap)] = cur_len + 2

    print(f"total parking spaces: {tot_spots}, len: {len(spot_dict)}")
    if save:
        filename = f'with_parking_{str(random.randint(1000, 9999))}.jpg'
        cv2.imwrite(filename, new_image)
    return new_image, spot_dict

draw_parking_image, _spot_dict = parking.draw_parking(_image, rects)
cv_show("draw_parking_image", draw_parking_image)


The end result is the spot dictionary spot_dict, which records each spot's coordinates and number.
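
The keys are (x1, y1, x2, y2) crop rectangles and the values are 1-based spot numbers. Illustrative shape (the coordinates below are invented, not taken from the real image):

spot_dict = {
    (44, 355, 140, 370.5): 1,   # first column: full-width spots
    (44, 370, 140, 385.5): 2,
    (162, 128, 220, 143.5): 3,  # interior columns are split at the center line
    (220, 128, 278, 143.5): 4,
    # ...
}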

4. Spot Status Prediction

# Use the trained model to predict a spot's status: empty or occupied
def make_prediction(image, model, class_dictionary):
    # preprocessing: normalize to [0,1], matching training
    img = image / 255.

    # add a batch dimension to form a 4D tensor
    image = np.expand_dims(img, axis=0)

    # run inference with the trained model
    class_predicted = model.predict(image)
    in_id = np.argmax(class_predicted[0])
    label = class_dictionary[in_id]
    return label

1. Image Prediction

# Iterate over all spots, predict each one's status with the model,
# and mark the empty spots on the image (semi-transparent green rectangles)
def predict_on_image(image, spot_dict, model, class_dictionary, color=None, alpha=0.5):
    if color is None:
        color = [0, 255, 0]
    new_image = np.copy(image)
    overlay = np.copy(image)
    cnt_empty = 0
    all_spots = 0
    for spot in spot_dict.keys():
        all_spots += 1
        (x1, y1, x2, y2) = spot
        (x1, y1, x2, y2) = (int(x1), int(y1), int(x2), int(y2))
        spot_img = image[y1:y2, x1:x2]
        spot_img = cv2.resize(spot_img, (32, 32))

        label = make_prediction(spot_img, model, class_dictionary)
        if label == 'empty':
            cv2.rectangle(overlay, (int(x1), int(y1)), (int(x2), int(y2)), color, -1)
            cnt_empty += 1

    cv2.addWeighted(overlay, alpha, new_image, 1 - alpha, 0, new_image)

    cv2.putText(new_image, "Available: %d spots" % cnt_empty, (30, 95),
                cv2.FONT_HERSHEY_SIMPLEX,
                0.7, (255, 255, 255), 2)

    cv2.putText(new_image, "Total: %d spots" % all_spots, (30, 125),
                cv2.FONT_HERSHEY_SIMPLEX,
                0.7, (255, 255, 255), 2)
    return new_image

2. Video Prediction

# Process the video, classifying every 20th frame, and write the detection results to an output video
def predict_on_video(video_name, spot_dict, model, class_dictionary, output_name="parking_video_output.mp4"):
    cap = cv2.VideoCapture(video_name)

    # read the frame rate and resolution of the input video
    fps = int(cap.get(cv2.CAP_PROP_FPS))
    frame_width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    frame_height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

    # set up the video writer (MP4 format)
    fourcc = cv2.VideoWriter_fourcc(*'mp4v')  # codec
    out = cv2.VideoWriter(output_name, fourcc, fps, (frame_width, frame_height))

    frame_count = 0
    alpha = 0.5  # overlay transparency
    color = (0, 255, 0)  # green

    while cap.isOpened():
        ret, frame = cap.read()
        if not ret:
            break  # read failed; exit the loop

        frame_count += 1
        if frame_count % 20 != 0:  # only process every 20th frame, for performance
            out.write(frame)  # write the raw frame unchanged
            continue

        overlay = frame.copy()
        empty_spots = 0
        total_spots = len(spot_dict)

        for (x1, y1, x2, y2) in spot_dict.keys():
            x1, y1, x2, y2 = map(int, [x1, y1, x2, y2])  # ensure integer coordinates
            spot_img = frame[y1:y2, x1:x2]

            if spot_img.size == 0:
                continue  # skip invalid crop regions

            spot_img = cv2.resize(spot_img, (32, 32))
            label = make_prediction(spot_img, model, class_dictionary)

            if label == 'empty':
                cv2.rectangle(overlay, (x1, y1), (x2, y2), color, -1)
                empty_spots += 1

        # blend the semi-transparent overlay onto the frame
        cv2.addWeighted(overlay, alpha, frame, 1 - alpha, 0, frame)

        # draw the spot counts
        cv2.putText(frame, f"Available: {empty_spots} spots", (30, 95),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 255, 255), 2)
        cv2.putText(frame, f"Total: {total_spots} spots", (30, 125),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 255, 255), 2)

        out.write(frame)  # write the annotated frame to the output video
        # cv_show('Parking Detection', frame)

        if cv2.waitKey(10) & 0xFF == ord('q'):
            break

    cap.release()
    out.release()
    cv2.destroyAllWindows()
    print(f"Processed video saved as {output_name}")

3. Main Entry Point

# Test a single image: show the original and the prediction result
def image_test(_image, _area_dict, _model, _class_dict):
    cv_show("image", _image)
    predicted_image = parking.predict_on_image(_image, _area_dict, _model, _class_dict)
    cv_show("predicted_image", predicted_image)  # show the annotated image
    filename = f'with_marking_{str(random.randint(1000, 9999))}.jpg'
    cv2.imwrite(filename, predicted_image)  # save the annotated image

# Test a video
def video_test(video_name, _area_dict, _model, _class_dict):
    parking.predict_on_video(video_name, _area_dict, _model, _class_dict)


if __name__ == '__main__':
    # 1. Train the model and generate car1.keras
    # train_model.train()
    model = load_model("car1.keras")  # Keras's load_model, not the builder defined in the training code

    # 2. Extract the parking spot regions
    image = cv2.imread("images/frame_0006.jpg")
    area_dict = get_parking_area(image)

    # 3. Image: mark the empty spots
    class_dict = {0: 'empty', 1: 'occupied'}
    image_test(image, area_dict, model, class_dict)

    # 4. Video: mark the empty spots
    # video_test('parking_video.mp4', area_dict, model, class_dict)


Run the video version yourself to see the result.

5. Summary

1. Model Training

  • Use the pre-trained VGG16, dropping its fully connected layers and keeping only the convolutional base for feature extraction
  • Transfer learning: freeze the first 10 layers and fine-tune only the deeper ones for better generalization
  • Data preprocessing:
    • Augmentation (flips, zoom, shifts, rotation) to reduce overfitting
    • Normalization (pixel values scaled to [0,1])
  • Custom classification head:
    • Fully connected layer + softmax activation for the two-class decision (empty/occupied)
  • Training:
    • categorical_crossentropy as the loss function, SGD as the optimizer
    • Early stopping plus best-model checkpointing (car1.keras)

2. Parking Spot Segmentation

  • Image preprocessing:
    • Keep the white/yellow parking lines, then convert to grayscale
    • Canny edge detection to extract the spot boundaries
  • Parking region selection:
    • Manually calibrate the parking area and keep only the relevant part
    • Build a mask to ignore background clutter
  • Line detection and clustering:
    • Hough transform to detect the horizontal parking lines
    • Cluster the lines into columns (lanes) and compute each column's bounding rectangle

6. Notes

Environment

  • macOS: 15.2
  • python: 3.12.4
  • numpy: 1.26.4
  • opencv-python: 4.11.0.86
  • keras: 3.9.0

Resources and Code

https://github.com/keychankc/dl_code_for_blog/tree/main/008_opencv_park