
Faster R-CNN layer error

Brenon on 20 Jul 2024
Commented: Brenon on 30 Jul 2024 at 19:06
I have been at it for a while but cannot figure it out, and I have been lost in the documentation for a few days now. I am attempting to train a Faster R-CNN model with a pretrained ResNet backbone. So far I have found documentation stating not to use lgraph and to use dlnetwork instead, so I attempted it that way as well and got the same error. More documentation stated not to use net=resnet50 either, and to use [net,classNames] = imagePretrainedNetwork instead. The issue is that I cannot figure out how to fit all of these pieces together. The dataset can be found here and downloaded for free: https://www.flir.com/oem/adas/adas-dataset-form/
When the model attempts to run, it appears that it detects only 3 classes. I also used analyzeNetwork and network designer to look at the layers and it appears that the boxdeltas and R-CNN classification layers have the correct number of outputs. Any help is greatly appreciated!!
Here is the code so far (some parts were generated by ChatGPT and others taken from the official documentation); I have several versions of this with slight variations:
%% Define the custom read function
function imgOut = ensureRGB(imgIn)
    [~, ~, numChannels] = size(imgIn);
    if numChannels == 1
        imgOut = repmat(imgIn, [1 1 3]);
    else
        imgOut = imgIn;
    end
end
%% Define the paths
imageFolder = "C:\Users\User\Desktop\FLIR_Thermal_Dataset\FLIR_ADAS_v2\images_thermal_train";
annotationFolder = "C:\Users\User\Documents\MATLAB\trainingData.mat";
matFile = "C:\Users\User\Documents\MATLAB\trainingData.mat"; % MATLAB-format annotations (there is a function to convert the original data into this .mat file if anyone needs it)
%% Load the training data from the MAT-file
load(matFile, 'trainingData');
% Shuffle the training data
rng(0);
shuffledIdx = randperm(height(trainingData));
trainingData = trainingData(shuffledIdx,:);
%% Create image datastore with custom read function and specify file extensions
imds = imageDatastore(trainingData.imageFilename, ...
    'ReadFcn', @(filename) ensureRGB(imread(filename)), ...
    'FileExtensions', {'.jpg', '.jpeg', '.png', '.bmp'});
%% Create box label datastore
blds = boxLabelDatastore(trainingData(:, {'bbox', 'label'}));
%% Combine the datastores
ds = combine(imds, blds);
%% Verify with a sample image
sampleImg = readimage(imds, 1);
[imgHeight, imgWidth, numChannels] = size(sampleImg); % avoid shadowing the height/width functions used on tables above
disp(['Sample Image Number of Channels: ', num2str(numChannels)]);
%% Define number of classes
numClasses = 16;
%% Define input image size and anchor boxes
inputImageSize = [512 640 3];
anchorBoxes = [32 32; 64 64; 128 128];
%% Load the ResNet-50 network
lgraph = layerGraph(resnet50);
% Specify the feature extraction layer
featureLayer = 'activation_40_relu';
% Create Faster R-CNN layers
dlnetwork = fasterRCNNLayers(inputImageSize, numClasses, anchorBoxes, lgraph, featureLayer);
%% Analyze the network to ensure all layers are correct
analyzeNetwork(dlnetwork);
%% Define training options
options = trainingOptions('sgdm', ...
    'MiniBatchSize', 16, ...
    'InitialLearnRate', 1e-4, ...
    'MaxEpochs', 10, ...
    'Verbose', true, ...
    'Shuffle', 'every-epoch', ...
    'Plots', 'training-progress');
% Train the network
detector = trainFasterRCNNObjectDetector(ds, dlnetwork, options);
ERROR:
Training a Faster R-CNN Object Detector for the following object classes:
* car
* light
* person
Error using trainFasterRCNNObjectDetector (line 33)
Invalid network.
Error in untitled (line 74)
detector = trainFasterRCNNObjectDetector(ds, dlnetwork, options);
Caused by:
Layer 'boxDeltas': The input size must be 1×1×12. This R-CNN box regression layer expects the third input dimension to be 4 times the number of object classes the network should detect (3 classes). See the documentation for more details about creating Fast or Faster R-CNN networks.
Layer 'rcnnClassification': The input size must be 1×1×4. The classification layer expects the third input dimension to be the number of object classes the network should detect (3 classes) plus 1. The additional class is required for the "background" class. See the documentation for more details about creating Fast or Faster R-CNN networks.
So far I have tried:
1) using dlnetwork instead of lgraph
2) using [net,classNames] = imagePretrainedNetwork instead of net=resnet50
3) manually changing the layers in the designer
4) changing the channels from 1 to 3 (when loaded into my Python environment the images had three channels; in MATLAB they showed 1)
5) resizing the images
2 comments
Corey Kurowski on 24 Jul 2024
Hey Brenon,
At a surface level, it looks like there may be a mismatch in the input datastore and the network configuration. Unfortunately, I am unable to use the dataset directly, but if you are able to share your network and combinedDatastore, it might provide some clarity as to where this mismatch is occurring. Alternatively, even a set of screenshots of the feature extraction layer and output layers in your dlnetwork (from within analyzeNetwork) and your boxLabelDatastore may yield the information needed.
As a slight aside, what is gearing you towards using Faster RCNN as opposed to another network? It would be good to understand your approach fully to see if another solution/approach might exist.
brenon tate on 24 Jul 2024
Good morning Corey, and thanks for the reply! The reason I am using Faster R-CNN is that I believe it may do a better job of identifying small objects than some of the others. I have already trained a Faster R-CNN model in Python with this set, and want to do the same thing in MATLAB for practice. The ultimate goal is to use this model for my own dataset, which I am in the process of creating with a thermal sensor and which will contain small arms. The plan is to get it working in parallel, then train the model as I capture data and annotate my images.
I'm not sure if these files are what you were requesting, but if they aren't, please let me know and I will upload more. I could not upload the model as it is too large, but there is an image of the last 10 layers.


Accepted Answer

Corey Kurowski on 30 Jul 2024 at 12:26
The following resolved the immediate issue at hand:
Hey Brenon,
My apologies on not seeing your initial reply sooner. After diving into this a bit, the issue seems to stem from your boxLabelDatastore and the categories produced for your labels. There is an underlying expectation that the categories denoted in each row in the combined datastore will provide a comprehensive list of all classes in your datastore. Currently, your combinedDatastore produces only a subset of classes (representing the present labels in the image):
categories(blds.LabelData{1,2})
ans =
3×1 cell array
{'car' }
{'light' }
{'person'}
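A quick way to spot this kind of mismatch is to compare the categories known to a single row with the union across all rows. The following is only a minimal sketch on made-up labels: the `labels` cell array and its class names are hypothetical stand-ins for the label column of `blds.LabelData`, not the actual FLIR data.

```matlab
% Hypothetical per-image label vectors, standing in for blds.LabelData(:,2).
labels = {categorical(["car"; "light"; "person"]); ...
          categorical(["bike"; "car"]); ...
          categorical("sign")};

% Categories known to the first row only.
firstRowCats = categories(labels{1});

% Union of categories over every row
% (concatenating categorical arrays merges their category lists).
allCats = categories(vertcat(labels{:}));

fprintf('First row: %d classes, whole set: %d classes\n', ...
    numel(firstRowCats), numel(allCats));

% Classes that occur somewhere in the data but are unknown to the first row.
missingFromFirstRow = setdiff(allCats, firstRowCats);
disp(missingFromFirstRow);
```

If the two counts differ on the real data, the per-row categorical arrays do not each carry the full class list, which is the situation described above.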
The easiest fix would likely be configuring how your trainingData.mat is formatted. The first column should be the image names and then every column after should be a specific class with each row having associated bounding boxes for that respective image. Similar to:
Then, to create a boxLabelDatastore, you'd call:
blds = boxLabelDatastore(trainingData(:,2:end));
This will automatically configure that underlying comprehensive categorical class assumption. Please try this out when you are able to and see if it allows you to proceed.
In the meantime, it would be great if you could share your trainingData.mat file as it is so we can see what your formatting is. In most cases, we are able to build that underlying assumption upon boxLabelDatastore construction, but it seems like you may have an edge case configuration that we haven't properly captured.
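As a rough illustration of that layout change, the sketch below converts a table with one `bbox` and one `label` column per image into a table with one column per class. The toy rows, file names, and the four-entry `classNames` list are assumptions made up for the example (the real dataset has 16 classes), and the column names `imageFilename`, `bbox`, and `label` are taken from the code in the question.

```matlab
% Hypothetical toy data in the question's layout: one row per image,
% an M-by-4 box matrix and an M-by-1 categorical label vector per row.
imageFilename = ["img1.jpg"; "img2.jpg"];
bbox  = {[10 10 20 20; 50 60 30 30]; [5 5 40 40]};
label = {categorical(["car"; "person"]); categorical("light")};
trainingData = table(imageFilename, bbox, label);

% Full class list (assumed for this example).
classNames = ["car" "light" "person" "bike"];

% Build one column per class; each cell holds that class's boxes for the
% image, or a 0-by-4 empty matrix when the class is absent.
wide = table(trainingData.imageFilename, 'VariableNames', {'imageFilename'});
for c = classNames
    col = cell(height(trainingData), 1);
    for i = 1:height(trainingData)
        mask = string(trainingData.label{i}) == c;
        col{i} = trainingData.bbox{i}(mask, :);
    end
    wide.(char(c)) = col;
end

disp(wide.Properties.VariableNames);
```

With a table shaped like `wide`, the call from the answer would become `blds = boxLabelDatastore(wide(:, 2:end));`. Whether this resolves the error on the full dataset has not been verified here.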

More Answers (1)

Brenon on 27 Jul 2024 at 22:00
I decided to go back to Python as this is consuming way more time than it's worth. I loaded an example to compare my datastore to that one and everything is the same, I loaded an image with bounding boxes to ensure they were correct, then I checked that the inputs to the "rcnnBoxDeltas" and "rcnnClassification" layers were 64 (4*16 classes) and 17 (16 classes + 1), and they were. For some reason this model is only detecting 3 of the 16 classes.
Also, the example Faster R-CNN loads with similar errors.
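For reference, the sizes quoted above and the sizes in the error message follow from the same rule stated in that message: 4 box offsets per class for the box regression layer, and one extra "background" class for the classification layer. A small worked sketch (the two anonymous functions are illustrative helpers, not toolbox functions):

```matlab
% Channel counts the two R-CNN output layers expect for a given class count.
boxDeltaChannels = @(numClasses) 4 * numClasses;  % 4 box offsets per class
classChannels    = @(numClasses) numClasses + 1;  % +1 for the background class

% 16 is the number of classes the network was built for;
% 3 is the number the trainer inferred from the data.
for numClasses = [16 3]
    fprintf('%2d classes: boxDeltas input 1x1x%d, classification input 1x1x%d\n', ...
        numClasses, boxDeltaChannels(numClasses), classChannels(numClasses));
end
```

So a network built for 16 classes (64 and 17) cannot match a trainer that sees only 3 classes in the data (12 and 4), which is consistent with the error text.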
8 comments
Corey Kurowski on 30 Jul 2024 at 16:25
Hey Brenon, one more thing that popped into my mind regarding your choice of Faster RCNN for smaller object identification. I would suggest you give YOLOX a chance if time allows. We recently released it and have seen very good results for smaller objects due to its anchorless design and a feature pyramid network.
Brenon on 30 Jul 2024 at 19:06
Corey,
Thank you for providing that link. I will definitely try that out when the time comes; I may even try it on this FLIR ADAS set this weekend just to play around with it. I won't get to try it on the dataset that I mentioned previously, as I believe it will take me a while to create it.
Brenon


Release

R2024a