I. Environment setup
1.1 Download the source code
mask2former:
https://github.com/facebookresearch/Mask2Former/tree/main
detectron2:
https://github.com/facebookresearch/detectron2
After downloading, create a new folder with any name (I called mine Mask2Former-main). Name the two project folders Mask2Former and detectron2 respectively, so the directory looks like this:
Mask2Former-main
|——detectron2
|——Mask2Former
When opening the project in PyCharm, open Mask2Former-main itself.
1.2 Configure the environment
1. Create and activate a virtual environment in Anaconda
Create:
conda create -n mask2former python==3.8
Note: PyTorch 2.5.x packages are built for Python 3.9 and newer, so if the install in step 2 cannot be resolved under Python 3.8, recreate the environment with a newer Python version.
Activate:
conda activate mask2former
2. Install torch and torchvision
My GPU is a 2060 Ti with CUDA 12.4.
conda install pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 pytorch-cuda=12.4 -c pytorch -c nvidia
3. Install detectron2
cd to the directory that contains the detectron2 folder (do not cd into detectron2 itself):
python -m pip install -e detectron2
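4. Install the Mask2Former requirements and the MSDA module
cd into the Mask2Former folder and run:
pip install -r requirements.txt
Then cd into mask2former/modeling/pixel_decoder/ops/ and do NOT run:
sh make.sh  # do not run this line
This command does not run on Windows. Opening make.sh in a text editor such as Notepad++ shows that only one line does real work, so run that line directly: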
python setup.py build install
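1.3 Errors I ran into
python setup.py build install
fails.
I have hit this problem many times. I won't explain the cause here; this is just a record of the fix that works on my machine:
:: Step 1: clear INCLUDE; if I skip this step I get the error "/windows was unexpected at this time"
set INCLUDE=
:: Step 2: load the MSVC compiler environment
call "C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Auxiliary\Build\vcvarsall.bat" x64
:: Step 3: set DISTUTILS_USE_SDK=1
set DISTUTILS_USE_SDK=1
:: Step 4: install
pip install -e .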
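After the build finishes, it can help to confirm that the packages are visible from the active environment before moving on. The following is a minimal sketch using only the standard library. The module names are assumptions: in particular, MultiScaleDeformableAttention is taken to be the name of the compiled extension that the ops build installs, and find_missing is a hypothetical helper, not part of either project.

```python
import importlib.util

# Assumed module names; "MultiScaleDeformableAttention" is taken to be the
# extension produced by the ops build step.
REQUIRED_MODULES = ("torch", "torchvision", "detectron2", "MultiScaleDeformableAttention")


def find_missing(modules=REQUIRED_MODULES):
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in modules:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Not found yet:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

This only checks that each module can be located; it does not import them, so it will not catch a build that installs but fails at import time.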
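II. Configure the dataset
Mask2Former/
├── datasets/
│   └── voc2former/ (dataset name)
│       ├── images/ (original images)
│       │   ├── train/
│       │   │   ├── 1.jpg
│       │   │   └── ...
│       │   └── val/
│       │       ├── 2.jpg
│       │       └── ...
│       └── semantic_annotations/ (masks)
│           ├── train/
│           │   ├── 1.png
│           │   └── ...
│           └── val/
│               ├── 2.png
│               └── ...
It took me a long time to get the data into this layout. You will have to search for conversion code yourself; I have not had time to tidy mine up.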
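Once the files are in place, a quick sanity check is to confirm that every image has a mask with the same file stem. The sketch below assumes the layout shown above (.jpg images, .png masks); unpaired_images is a hypothetical helper and the relative dataset path is an assumption, so adjust both to your setup.

```python
from pathlib import Path


def unpaired_images(root, split, image_ext=".jpg", mask_ext=".png"):
    """Return stems of images in images/<split> that have no mask in semantic_annotations/<split>."""
    root = Path(root)
    image_dir = root / "images" / split
    mask_dir = root / "semantic_annotations" / split
    mask_stems = {p.stem for p in mask_dir.glob(f"*{mask_ext}")}
    return sorted(p.stem for p in image_dir.glob(f"*{image_ext}") if p.stem not in mask_stems)


if __name__ == "__main__":
    dataset_root = "datasets/voc2former"  # assumed path, relative to the Mask2Former folder
    for split in ("train", "val"):
        print(split, "images without a mask:", unpaired_images(dataset_root, split))
```

An empty list for both splits means every image has a matching mask file; it says nothing about whether the pixel values inside the masks are valid class indices.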
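III. Modify the code
1. Modify the YAML files
From what I found, these two YAML files are the ones for semantic segmentation; the others appear to be for instance or panoptic segmentation, so I started with these two to get things running.
Edit
Mask2Former/configs/ade20k/semantic-segmentation/maskformer2_R50_bs16_160k.yaml
Set the number of classes to your own. Mine is background + 1 class = 2:
NUM_CLASSES: 2
Edit
Mask2Former/configs/ade20k/semantic-segmentation/Base-ADE20K-SemanticSegmentation.yaml
Set the names of the registered datasets; just follow this pattern:
DATASETS:
  TRAIN: ("ade20k_sem_seg_2train",)
  TEST: ("ade20k_sem_seg_2val",)
My 2060 is too weak, so I reduced the batch size to 2; otherwise it runs out of GPU memory:
IMS_PER_BATCH: 2 # originally 16
Reduce the number of training iterations, otherwise training takes too long. Use a small value first just to check that everything runs:
MAX_ITER: 800 # originally 160000
Reduce the training image size, otherwise GPU memory runs out:
MAX_SIZE_TRAIN: 1024 # originally 2048
Later I got some multithreading-related error (probably because my CPU and RAM are too weak); it only went away after changing this:
NUM_WORKERS: 0 # originally 4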
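The settings above are shown as bare keys, but in the YAML files each one sits under a parent section. As a rough guide, here is a sketch that lists them as dotted paths and flattens them into the KEY VALUE form. The dotted paths are my assumption from the usual detectron2 config layout, not something checked against every version of these files, and to_opts is a hypothetical helper.

```python
# Dotted key paths are assumptions based on the usual detectron2 config layout;
# check them against your own YAML files before relying on them.
OVERRIDES = {
    "MODEL.SEM_SEG_HEAD.NUM_CLASSES": 2,
    "SOLVER.IMS_PER_BATCH": 2,
    "SOLVER.MAX_ITER": 800,
    "INPUT.MAX_SIZE_TRAIN": 1024,
    "DATALOADER.NUM_WORKERS": 0,
}


def to_opts(overrides):
    """Flatten {key: value} into a ["KEY", "VALUE", ...] list of strings."""
    opts = []
    for key, value in overrides.items():
        opts.extend([key, str(value)])
    return opts


if __name__ == "__main__":
    print(" ".join(to_opts(OVERRIDES)))
```

If the paths match your files, the printed string could be appended after the config file on the training command line instead of editing the YAML, which makes it easier to switch back to the original values.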
2. Modify config.py
detectron2/detectron2/config/config.py
I got an encoding error, so I changed line 34 to:
return open(filename, "r", encoding="utf-8") # originally: return PathManager.open(filename, "r")
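A likely cause, though this is my assumption rather than something confirmed here, is that the YAML file is saved as UTF-8 and contains non-ASCII characters, while Windows opens text files with a locale encoding such as GBK by default. The sketch below reproduces the mismatch on a temporary file; read_text and demo are hypothetical names.

```python
import os
import tempfile


def read_text(path, encoding):
    """Read a file with the given encoding; return None if decoding fails."""
    try:
        with open(path, "r", encoding=encoding) as f:
            return f.read()
    except UnicodeDecodeError:
        return None


def demo():
    """Write a UTF-8 file with a non-ASCII comment, then read it back two ways."""
    text = "IMS_PER_BATCH: 2  # 批次改为2\n"
    fd, path = tempfile.mkstemp(suffix=".yaml")
    os.close(fd)
    try:
        with open(path, "w", encoding="utf-8") as f:
            f.write(text)
        return text, read_text(path, "utf-8"), read_text(path, "gbk")
    finally:
        os.remove(path)


if __name__ == "__main__":
    original, as_utf8, as_gbk = demo()
    print("utf-8 read matches:", as_utf8 == original)
    print("gbk read matches:", as_gbk == original)
```

Reading with the wrong encoding either raises a decode error or silently returns garbled text, which is why passing encoding="utf-8" explicitly is the safer choice.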
3. Modify train_net.py to register the dataset
Mask2Former/train_net.py
Add the following directly above if __name__ == "__main__": to register the dataset:
import os
from detectron2.data.datasets import load_sem_seg
from detectron2.data import MetadataCatalog, DatasetCatalog

def register_ade20k_sem_seg(root="E:/GitHub/Mask2Former-main/Mask2Former/datasets/voc2former"):
    # Register the training and validation splits
    for split in ["train", "val"]:
        # Path to the semantic segmentation annotations
        sem_seg_root = os.path.join(root, "semantic_annotations", split)
        # Path to the original images
        image_dir = os.path.join(root, "images", split)
        # Register the dataset in DatasetCatalog. The paths are bound as default
        # arguments so that each split keeps its own directories.
        DatasetCatalog.register(
            f"ade20k_sem_seg_2{split}",
            lambda gt=sem_seg_root, img=image_dir: load_sem_seg(gt, img, gt_ext="png", image_ext="jpg"),
        )
        # Add metadata (class names and colors)
        MetadataCatalog.get(f"ade20k_sem_seg_2{split}").set(
            stuff_classes=["background", "fracture"],  # replace with your own class list
            stuff_colors=[[0, 0, 0], [255, 0, 0]],  # display color for each class
            ignore_label=255,  # label value to ignore
            evaluator_type="sem_seg",  # use the semantic segmentation evaluator
        )

# Call the registration function
register_ade20k_sem_seg()
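One thing to watch when registering datasets inside a loop: a Python lambda looks up the variables it uses when it is called, not when it is created. If the loader lambda refers to per-split path variables directly, every registered split ends up using the paths from the last iteration. A minimal sketch with hypothetical names and no detectron2 dependency:

```python
def make_loaders_late(splits):
    """Each lambda reads `path` when called, so all of them see the last value."""
    loaders = {}
    for split in splits:
        path = f"images/{split}"
        loaders[split] = lambda: path
    return loaders


def make_loaders_bound(splits):
    """A default argument captures the current value of `path` for each lambda."""
    loaders = {}
    for split in splits:
        path = f"images/{split}"
        loaders[split] = lambda path=path: path
    return loaders


if __name__ == "__main__":
    late = make_loaders_late(["train", "val"])
    bound = make_loaders_bound(["train", "val"])
    print(late["train"]())   # images/val
    print(bound["train"]())  # images/train
```

Binding the paths as default arguments means the "train" entry really loads the training directories even though the lambda is only called later, when the dataset is first requested.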
IV. Training and prediction
1. Training
python train_net.py --num-gpus 1 --config-file configs/ade20k/semantic-segmentation/maskformer2_R50_bs16_160k.yaml
2. Prediction
First, change some code in
Mask2Former/demo/demo.py
Replace everything above def setup_cfg(args): with:
# Copyright (c) Facebook, Inc. and its affiliates.
# Modified by Bowen Cheng from: https://github.com/facebookresearch/detectron2/blob/master/demo/demo.py
import os
os.environ['KMP_DUPLICATE_LIB_OK'] = 'TRUE'  # work around the OpenMP duplicate-library conflict
os.environ['CUDA_LAUNCH_BLOCKING'] = '1'  # optional: gives clearer CUDA error messages

import argparse
import glob
import multiprocessing as mp
import os

# fmt: off
import sys

from detectron2.data import MetadataCatalog

sys.path.insert(1, os.path.join(sys.path[0], '..'))
# fmt: on

import tempfile
import time
import warnings

import cv2
import numpy as np
import tqdm

from detectron2.config import get_cfg
from detectron2.data.detection_utils import read_image
from detectron2.projects.deeplab import add_deeplab_config
from detectron2.utils.logger import setup_logger

from mask2former import add_maskformer2_config
from predictor import VisualizationDemo

# Added near the top of demo.py
from detectron2.data import MetadataCatalog
MetadataCatalog.get("ade20k_sem_seg_2val").set(
    stuff_classes=["background", "fracture"],
    stuff_colors=[[0, 0, 0], [255, 0, 0]],
    evaluator_type="sem_seg",
    ignore_label=255,
)
# Added before the prediction code
metadata = MetadataCatalog.get("ade20k_sem_seg_2val")
assert hasattr(metadata, "stuff_classes"), "Metadata was not registered correctly!"
print("Classes:", metadata.stuff_classes)
# constants
WINDOW_NAME = "mask2former demo"
Then run prediction:
python demo/demo.py --config-file "E:/GitHub/Mask2Former-main/Mask2Former/configs/ade20k/semantic-segmentation/maskformer2_R50_bs16_160k.yaml" --input "E:/traindate/StoneCrack_SE_jpg/2025_04_14_09_48_IMG_27561.jpg" --output "E:/GitHub/Mask2Former-main/Mask2Former/output/predictions" --opts MODEL.WEIGHTS "E:/GitHub/Mask2Former-main/Mask2Former/output/model_final.pth"
That's it.
Here is the log from the end of training:
[06/01 02:00:07 d2.evaluation.testing]: copypaste: mIoU,fwIoU,mACC,pACC
[06/01 02:00:07 d2.evaluation.testing]: copypaste: 95.8943,97.4269,97.6957,98.6891
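For readers unfamiliar with the four numbers in that log line, here is a sketch of how these metrics are commonly defined in terms of a pixel-level confusion matrix. It follows the standard definitions and is not copied from detectron2's evaluator, so edge-case handling may differ; sem_seg_metrics and the example matrix are hypothetical and unrelated to my run.

```python
def sem_seg_metrics(confusion):
    """Compute metrics from confusion[g][p] = pixels with ground truth g predicted as p."""
    n = len(confusion)
    tp = [confusion[i][i] for i in range(n)]
    gt_total = [sum(confusion[i]) for i in range(n)]
    pred_total = [sum(confusion[g][i] for g in range(n)) for i in range(n)]
    total = sum(gt_total)
    # Only classes that actually occur in the ground truth are averaged.
    valid = [i for i in range(n) if gt_total[i] > 0]
    iou = {i: tp[i] / (gt_total[i] + pred_total[i] - tp[i]) for i in valid}
    acc = {i: tp[i] / gt_total[i] for i in valid}
    return {
        "mIoU": 100 * sum(iou.values()) / len(valid),  # mean per-class IoU
        "fwIoU": 100 * sum(iou[i] * gt_total[i] / total for i in valid),  # IoU weighted by class frequency
        "mACC": 100 * sum(acc.values()) / len(valid),  # mean per-class pixel accuracy
        "pACC": 100 * sum(tp) / total,  # overall pixel accuracy
    }


if __name__ == "__main__":
    # Hypothetical 2-class matrix: rows are ground truth, columns are predictions.
    example = [[90, 10], [5, 95]]
    for name, value in sem_seg_metrics(example).items():
        print(f"{name}: {value:.4f}")
```

With a large background class, pACC and fwIoU tend to look high even when the small class is segmented poorly, so mIoU is usually the more informative of the four.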
Prediction on a single image: (image not available)