
Detecting

Detecting is a platform for object detection research, implemented with TensorFlow 2 eager execution.

TensorFlow 2.4 Python 3.6

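GitHub: https://github.com/Qinbf/Detecting

Video introduction to the project on Bilibili: https://www.bilibili.com/video/BV1rZ4y1T7Nu

The project aims to provide an object detection tool that is both easy to use and easy to learn from. Detecting offers several pretrained models that can be downloaded and used directly, and all code in the project is commented in detail.

This is an early start: for now only the FasterRCNN algorithm is implemented. The other mainstream algorithms will be added over time. If you find the project useful, please give it a Star. Thank you!

If enough people like Detecting, I will release a free video series explaining from start to finish how this object detection project was built (line by line, covering every detail).

The project certainly has bugs and shortcomings. If you run into problems or have good ideas while using it, please send me feedback.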


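Installation

First make sure a TensorFlow 2 environment is installed, then install the detecting module.

  • Installing with pip is recommended:
pip install detecting
  • Installing from source is also possible:

First clone the project with git clone:

git clone https://github.com/Qinbf/detecting.git

Then cd into the detecting folder and run the install command:

cd detecting
sudo python setup.py install
  • To train or evaluate on the COCO dataset, the pycocotools module must also be installed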

Quick Start

  • Model prediction

Prediction usually takes only a few lines of code:

from detecting.build.fasterrcnn import FasterRCNNModel
# Download and load a pretrained model
# If weights is 'None', a new untrained model is defined
# If weights is a path, trained model parameters are loaded from that path
model = FasterRCNNModel(backbone='resnet101', weights='Path/fasterrcnn_resnet101_1024.h5', input_shape=(1024, 1024))
# Predict and display the result
model.predict_show('test_images/000000018380.jpg')


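Training a New Model

To train Detecting on your own data, first annotate the data in the VOC dataset format.

Annotation tool:

Below, the VOC dataset stands in for a newly annotated dataset of our own.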
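For readers unfamiliar with the VOC annotation format, the sketch below reads one annotation file with Python's standard library. The XML layout (filename, size, object/name, object/bndbox) follows the usual Pascal VOC convention. The sample annotation and the function name parse_voc_annotation are made up for illustration and are not part of Detecting.

```python
import xml.etree.ElementTree as ET

# Illustrative annotation in the usual Pascal VOC layout (values are made up).
SAMPLE_XML = """<annotation>
  <filename>000001.jpg</filename>
  <size><width>353</width><height>500</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>person</name>
    <bndbox><xmin>8</xmin><ymin>12</ymin><xmax>352</xmax><ymax>498</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc_annotation(xml_text):
    """Return the file name, image size and labelled boxes of one annotation."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    width = int(size.findtext("width"))
    height = int(size.findtext("height"))

    objects = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        # Coordinates are stored as text; some tools write them as floats.
        coords = tuple(
            int(float(box.findtext(tag)))
            for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        objects.append((obj.findtext("name"), coords))

    return {
        "filename": root.findtext("filename"),
        "size": (width, height),
        "objects": objects,
    }
```

Each class name found this way would need to appear in the CLASSES list of the training configuration.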

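In principle the training and test sets can be stored anywhere, but for convenience you can follow the layout described below. Create a datasets folder and put both the VOC training set and test set inside it, with the following file structure: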

datasets/
└── VOC
    ├── test
    │   └── VOC2007
    │       ├── Annotations
    │       ├── ImageSets
    │       ├── JPEGImages
    │       ├── SegmentationClass
    │       └── SegmentationObject
    └── train
        └── VOC2007
            ├── Annotations
            ├── ImageSets
            ├── JPEGImages
            ├── SegmentationClass
            └── SegmentationObject
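The layout above can be created or checked with a few lines of standard-library Python. This is a minimal sketch; the helper names expected_voc_dirs, make_voc_layout and missing_voc_dirs are hypothetical and do not come from Detecting.

```python
from pathlib import Path

# Sub-folders that appear under each VOC2007 directory in the tree above.
VOC_SUBDIRS = (
    "Annotations",
    "ImageSets",
    "JPEGImages",
    "SegmentationClass",
    "SegmentationObject",
)


def expected_voc_dirs(root):
    """List every directory of the layout below the given datasets root."""
    root = Path(root)
    return [
        root / "VOC" / split / "VOC2007" / sub
        for split in ("train", "test")
        for sub in VOC_SUBDIRS
    ]


def make_voc_layout(root):
    """Create the empty directory skeleton (existing folders are left alone)."""
    for path in expected_voc_dirs(root):
        path.mkdir(parents=True, exist_ok=True)


def missing_voc_dirs(root):
    """Return the expected directories that do not exist yet."""
    return [path for path in expected_voc_dirs(root) if not path.is_dir()]
```

Calling missing_voc_dirs('datasets') before training gives a quick check that the image and annotation folders are where the configuration file expects them.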

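The Annotations folder holds the image annotations and the JPEGImages folder holds the images. We can define a custom 'train.yml' configuration file with the following content: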

DATASETS:
  NAMES: ('MYDATA')
  CLASSES: ('__background__', 
            'aeroplane', 'bicycle', 'bird', 'boat',
            'bottle', 'bus', 'car', 'cat', 'chair',
            'cow', 'diningtable', 'dog', 'horse',
            'motorbike', 'person', 'pottedplant',
            'sheep', 'sofa', 'train', 'tvmonitor')
  IMAGE_DIR: ('datasets/VOC/train/VOC2007/JPEGImages')
  LABEL_DIR: ('datasets/VOC/train/VOC2007/Annotations')
  SCALE: (1024, 1024)

MODEL:
  BACKBONE: 'resnet101'
  WEIGHTS: 'Path/fasterrcnn_resnet101_1024.h5'
  INPUT_SHAPE: (1024, 1024)
  ANCHOR_SCALES: (64, 128, 256, 512)
  ANCHOR_FEATURE_STRIDES: (16, 16, 16, 16)

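NAMES: ('MYDATA') trains on your own dataset

NAMES: ('COCO') trains on the 'COCO' dataset

NAMES: ('VOC') trains on the 'VOC' dataset

CLASSES: the dataset's class labels

IMAGE_DIR: path to the images

LABEL_DIR: path to the annotations

SCALE: size of the images produced by the data generator

BACKBONE: the model's backbone classifier network

WEIGHTS: model weights. If WEIGHTS is 'None', a new untrained model is defined; if WEIGHTS is a path, trained model parameters are loaded from that path

INPUT_SHAPE: size of the model's input images

ANCHOR_SCALES: sizes of the anchors

ANCHOR_FEATURE_STRIDES: strides of the anchors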
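To make the two anchor settings concrete, here is a rough sketch of how anchor boxes are commonly laid out from a scale and a feature stride. It is not Detecting's own implementation. The aspect ratios (0.5, 1.0, 2.0), the box order (y1, x1, y2, x2) and the function name generate_anchors are assumptions made for illustration.

```python
import math


def generate_anchors(input_shape, scale, stride, ratios=(0.5, 1.0, 2.0)):
    """Place one anchor per ratio at every feature-map cell.

    `ratios` (height / width) is an assumed setting, not taken from the config.
    Boxes are returned as (y1, x1, y2, x2) in input-image pixels.
    """
    height, width = input_shape
    feat_h = math.ceil(height / stride)
    feat_w = math.ceil(width / stride)

    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            # Map the feature-map cell back to input-image coordinates.
            cy, cx = row * stride, col * stride
            for ratio in ratios:
                # Keep the area at scale**2 while changing the aspect ratio.
                h = scale * math.sqrt(ratio)
                w = scale / math.sqrt(ratio)
                anchors.append((cy - h / 2, cx - w / 2, cy + h / 2, cx + w / 2))
    return anchors
```

Under these assumptions, a 1024x1024 input with stride 16 gives a 64x64 grid, so each scale contributes 64 * 64 * 3 = 12288 anchors and the four scales in the example configuration contribute 49152 in total.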

  • Model training

Training usually takes only a few lines of code as well:

from detecting.build.fasterrcnn import FasterRCNNModel
from detecting.datasets.datasets import load_tf_dataset
from detecting.config import cfg
# Merge the settings from the configuration file into the defaults
cfg.merge_from_file('train.yml')
# Load the dataset as tf_dataset
tf_dataset = load_tf_dataset(cfg)
# Build the model
model = FasterRCNNModel(cfg)
# Train the model
model.fit(tf_dataset)

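The most important file in this project is detecting/config/defaults.py, which holds all default configuration values. We can write a custom "*.yml" file to override those defaults.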
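The idea of overriding defaults from a custom file can be illustrated with plain dictionaries. This is only a sketch of the general pattern, not the code behind cfg.merge_from_file. The function merge_config and the default values used in the usage note are hypothetical.

```python
import copy


def merge_config(defaults, overrides):
    """Return a copy of `defaults` with values from `overrides` applied.

    Nested sections are merged key by key. Unknown keys are rejected so that
    a typo in the override file fails loudly (a design choice of this sketch).
    """
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if key not in merged:
            raise KeyError(f"unknown config key: {key}")
        if isinstance(value, dict) and isinstance(merged[key], dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged
```

For example, merging {'MODEL': {'BACKBONE': 'resnet101'}} into made-up defaults {'MODEL': {'BACKBONE': 'resnet50', 'INPUT_SHAPE': (640, 640)}} changes only BACKBONE and leaves INPUT_SHAPE at its default.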

For more usage, see the contents of tutorial and the source code.


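Measured results on the VOC test set

| Detection Model | Backbone | Input resolution | mAP   |
|-----------------|----------|------------------|-------|
| FasterRCNN      | VGG16    | 1024x1024        | 53.97 |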

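Measured results on the COCO validation set

| Detection Model | Backbone  | Input resolution | AP   | AP50 | AP75 | APS  | APM  | APL  |
|-----------------|-----------|------------------|------|------|------|------|------|------|
| FasterRCNN      | ResNet50  | 640x640          | 24.7 | 39.9 | 26.0 | 5.7  | 26.1 | 42.6 |
| FasterRCNN      | ResNet50  | 1024x1024        | 27.5 | 43.8 | 29.5 | 10.8 | 32.6 | 41.5 |
| FasterRCNN      | ResNet101 | 640x640          | 27.0 | 41.2 | 29.2 | 7.2  | 28.6 | 45.0 |
| FasterRCNN      | ResNet101 | 1024x1024        | 32.2 | 47.4 | 35.2 | 12.1 | 35.7 | 50.4 |
| FasterRCNN      | ResNet152 | 640x640          | 27.7 | 41.5 | 29.9 | 7.8  | 29.4 | 46.8 |
| FasterRCNN      | ResNet152 | 1024x1024        | 32.0 | 46.7 | 35.2 | 11.4 | 35.3 | 51.6 |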

Download for models pretrained on the COCO dataset

Baidu Netdisk
Password: owi0


Acknowledgment

tensorflow/models/tree/master/research/object_detection

Viredery/tf-eager-fasterrcnn

matterport/Mask_RCNN


detecting's Issues

Not enough memory on a 3 GB GPU

OOM when allocating tensor with shape[2000,7,7,1024] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc [Op:MaxPool]
Is this caused by insufficient GPU memory? How can I fix it?
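As a rough aid for reading the error, the size of the tensor it names can be estimated from the shape alone, assuming "type float" means 4-byte float32. The helper tensor_bytes is hypothetical and only does the arithmetic.

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Bytes needed for a dense tensor (4 bytes per element assumes float32)."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element


size = tensor_bytes((2000, 7, 7, 1024))
print(f"{size / 2**20:.1f} MiB")  # prints "382.8 MiB"
```

That is roughly 383 MiB for this one intermediate tensor, on top of the model weights and other activations, so the message is consistent with the GPU running out of memory.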
