Change Object Detection to own Objects (fasterRCNN)

MatLabMcLovinPotato on 28 May 2020
Edited: Ali Ozturk on 27 May 2023
Afternoon!
I've been working my way through this example as it's the closest I've found for what I'm trying to do: https://www.mathworks.com/help/vision/examples/object-detection-using-faster-r-cnn-deep-learning.html
This example trains a detector that looks for vehicles; I am trying to use it to detect other things. I need to detect more than one object, of more than one class.
I've changed the above to:
  1. Use my own ground truth table
  2. I have 8 objects I want to detect (+ background)
  3. The main (really only) change is that I replaced the references to 'vehicle' with my ground truth table columns, which hold all of my new objects.
i.e. I changed:
FROM: bldsTest = boxLabelDatastore(testDataTbl(:,'vehicle'));
TO: bldsTest = boxLabelDatastore(testDataTbl(:,2:end));
I also tried listing out all the classes in curly braces rather than indexing columns, with the same errors.
I labelled my own images and have been updating and adjusting where I can. I've changed lines in the options, and I gather I need to edit some layers, but which ones?
However, I've hit the following roadblock: at the trainFasterRCNNObjectDetector step, I get this error:
Error using trainFasterRCNNObjectDetector (line 426)
Invalid network.
Caused by:
Layer 'boxDeltas': The input size must be 1×1×32. This R-CNN box regression layer expects the third input dimension to be 4 times the number of object classes the network should detect (8 classes). See the documentation for more details about creating Fast or Faster R-CNN networks.
Layer 'rcnnClassification': The input size must be 1×1×9. The classification layer expects the third input dimension to be the number of object classes the network should to detect (8 classes) plus 1. The additional class is required for the "background" class. See the documentation for more details about creating Fast or Faster R-CNN networks.
REQUEST
Would it be possible to get an idea of how I can adjust my work to avoid these errors, such as which variable, object, or deity I need to reference? Do I need to work through these steps first? For example: https://www.mathworks.com/help/vision/ug/faster-r-cnn-examples.html
Should my images be smaller? Should there be more of them, or more anti-object images? I've changed the training options, although I'm not sure it was in the right direction. I'm asking what to do about my boxDeltas and rcnnClassification errors. Given this isn't the first post on this topic, I really just ask that if you do feel the need to reply, please don't reword the error message to me. If that's what I was after, I'd have posted this weeks ago...
  1 comment
Ali Ozturk on 27 May 2023
Edited: Ali Ozturk on 27 May 2023
You need to set the numClasses variable to your number of classes in the faster_rcnn.m file.
For example, if you have 8 classes, the line would be:
numClasses = 8;


Answers (1)

Madhav Thakker on 24 Jul 2020
I understand that you want to train a Faster-RCNN for multi-class object detection.
It seems that the Faster R-CNN network is instantiated as expected (8 classes + 1 background). I think the input data is not being read properly. You can check the number of classes in your input dataset with width(dataset)-1.
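As a quick sanity check on the class count, the sketch below builds a small stand-in ground truth table and derives the sizes that the two layers named in the error message would need. The variable name `dataset`, the class names, the file names, and the box values are all made up for illustration and are not taken from the question. It uses only base MATLAB.

```matlab
% Hypothetical ground truth table: the first column holds image file names,
% every remaining column holds the [x y width height] boxes for one class.
% File names, class names, and boxes are placeholders.
classNames = {'classA','classB','classC','classD', ...
              'classE','classF','classG','classH'};
imageFilename = {'img001.png'; 'img002.png'};
dataset = table(imageFilename);
for k = 1:numel(classNames)
    % One cell per image, each holding a 1-by-4 box for this class.
    dataset.(classNames{k}) = {[10 20 30 40]; [15 25 35 45]};
end

% Number of classes = number of columns minus the file name column.
numClasses = width(dataset) - 1;

% Sizes quoted in the error message, derived from the class count.
boxDeltasDepth      = 4 * numClasses;   % 4 box coordinates per class
classificationDepth = numClasses + 1;   % one extra output for "background"

fprintf('numClasses = %d\n', numClasses);
fprintf('boxDeltas expects 1x1x%d, rcnnClassification expects 1x1x%d\n', ...
    boxDeltasDepth, classificationDepth);
```

If the count printed for your own table differs from the numClasses value used when the network was built, the data and the network are out of step.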
fasterRCNNLayers(inputSize,numClasses,anchorBoxes,featureExtractionNetwork,featureLayer) should create a working Faster R-CNN network with the correct number of classes. That class count is what the 'boxDeltas' and 'rcnnClassification' layer errors refer to.
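A minimal sketch of that call follows. It assumes the ResNet-50 backbone and the 'activation_40_relu' feature layer used in the linked vehicle example, and it needs Computer Vision Toolbox and Deep Learning Toolbox with the ResNet-50 support package installed. The anchor boxes are placeholder values, and numClasses is hard-coded to the 8 classes from the question. In practice numClasses could come from width(dataset)-1 and the anchors from estimateAnchorBoxes. Since the error text reports 8 classes found in the data, one possible reading is that the network passed to the trainer was built for a different class count, for instance the single-class network from the original example.

```matlab
% Sketch only: requires Computer Vision Toolbox, Deep Learning Toolbox,
% and the ResNet-50 support package.
inputSize  = [224 224 3];   % network input size, as in the linked example
numClasses = 8;             % object classes only, not counting background

% Placeholder anchor boxes in [height width] form. These values are an
% assumption for illustration; estimateAnchorBoxes can derive them from
% the training data instead.
anchorBoxes = [64 64; 128 128; 256 256];

featureExtractionNetwork = resnet50;   % pretrained backbone
featureLayer = 'activation_40_relu';   % feature layer used in the linked example

lgraph = fasterRCNNLayers(inputSize, numClasses, anchorBoxes, ...
    featureExtractionNetwork, featureLayer);

% Pass this lgraph to training in place of the single-class network, e.g.
% detector = trainFasterRCNNObjectDetector(trainingData, lgraph, options);
```

Rebuilding the layer graph this way, rather than editing individual layers by hand, keeps the box regression and classification outputs consistent with the class count.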
To answer your other questions:
  • The minimum size of input images should be [224, 224, 3], but if you have a powerful GPU, you could even use the original images as input.
  • The more training images you have, the more generalizable and robust the learned network will be.
  • Ideally, you should have some data with no foreground objects, but that depends on the case.
  1 comment
永涛 贾 on 24 May 2021
According to the help documentation: when training a Faster R-CNN for multi-class object detection with a datastore, calling the datastore with the read and readall functions returns a cell array or table with two or three columns. The second column must be a cell array that contains M-by-5 matrices of bounding boxes of the form [xcenter, ycenter, width, height, yaw]. The vectors represent the location and size of the bounding boxes for the objects in each image.
What does the parameter 'yaw' mean? Where does it come from?
