File Exchange


Feature fusion using Canonical Correlation Analysis (CCA)

version 1.0.1 (3.02 KB) by Mohammad Haghighat
Feature level fusion using Canonical Correlation Analysis (CCA)


Updated 31 Jan 2020

From GitHub



Feature fusion is the process of combining two feature vectors to obtain a single feature vector, which is more discriminative than any of the input feature vectors.
CCAFUSE applies feature level fusion using a method based on Canonical Correlation Analysis (CCA). It gets the train and test data matrices from two modalities X and Y, and consolidates them into a single feature set Z.
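A minimal usage sketch is below. The `ccaFuse` call signature and the `'concat'`/`'sum'` fusion modes are assumed from the GitHub repository; the matrix sizes are made up for illustration. Each row is a sample, and the two modalities must have the same number of samples but may have different feature dimensions.

```matlab
% Sketch only: ccaFuse signature assumed from the GitHub repository.
n = 100;                        % number of training samples
trainX = rand(n, 40);           % first modality,  n x p
trainY = rand(n, 60);           % second modality, n x q
testX  = rand(20, 40);          % test data, same feature dimensions
testY  = rand(20, 60);

% 'concat' concatenates the projected features; 'sum' adds them
[trainZ, testZ] = ccaFuse(trainX, trainY, testX, testY, 'concat');
```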

Details can be found in:

M. Haghighat, M. Abdel-Mottaleb, W. Alhalabi, "Fully Automatic Face Normalization and Single Sample Face Recognition in Unconstrained Environments," Expert Systems With Applications, vol. 47, pp. 23-34, April 2016.

(C) Mohammad Haghighat, University of Miami

Cite As

Haghighat, Mohammad, et al. “Fully Automatic Face Normalization and Single Sample Face Recognition in Unconstrained Environments.” Expert Systems with Applications, vol. 47, Elsevier BV, Apr. 2016, pp. 23–34, doi:10.1016/j.eswa.2015.10.047.


Comments and Ratings (33)


Sir, your code is excellent and easy to use. I have used it in my work. Thanks a lot for providing such helpful code.

lili mao

Hello, thanks for sharing. May I ask how to classify the training and test feature vectors after fusion?

Pengnian Zhang

Chidiebere Ike

Thank you for uploading this informative code. I would appreciate your candid guidance and opinions. I am working on low-resolution images with occlusion handling. Can this be applied to low-resolution frontal images with occlusion? Please advise. Thank you.

Caroline Loanda

Hi Haghighat,

Thank you for uploading this helpful code. After reading your paper, I would like to ask two favors:
1. Would you mind sharing your MATLAB code for automatic landmark detection?
2. Since I am still learning in the computer vision area, I am considering applying SIFT or SURF feature extraction for face recognition (it seems quite automatic and doesn't really need normalization). In your opinion, will SIFT or SURF work well for face recognition?

Thanks a lot,

Nazeera Sheth

Kyle Peterson

I have used the CCA code to fuse spectral data and it works great! Is there a way to access the importance of each variable before they are fused?

Mohammad Haghighat

Ankit Sharma:
CCA does not use the label information. If you want a supervised fusion technique, I'd like to draw your attention to the Discriminant Correlation Analysis (DCA). Here is the link to the Matlab code:

ankit sharma

Could you please tell me when and how we should provide the training labels to the images for matching purposes: before or after feature fusion?

Mohammad Haghighat

Wasseem Al-Obaydy:

Yes. The numbers of subjects in the test and train data do not have to be the same.


Hi Mohammad, thanks for providing this toolbox. Can I ask whether it could be used to look for multivariate relationships between two matrices (25 observations x 100 variables)?

Wasseem Al-Obaydy

Thank you. Can your fusion technique work on a train set and a test set having different numbers of subjects?

Mohammad Haghighat

HR Ramya:

Please click on the Download button above.

Islem Rekik

HR Ramya

Hello, can you please tell me where to find the code? I need it for my work. Thanks in advance.

anil hazarika

Helpful code for all. Thanks a lot

sapna sapna

I ran your code on feature vectors taken from two camera views. I am confused about the label vector. After fusion, I want to train an SVM classifier, so when do I have to label my data: before fusion or after fusion?

Mokni Raouia

Mohammad Haghighat

Fares Al-shargie:

The maximum length of the projected feature vector (d) is the rank of the between-set covariance matrix (Sxy), which is equal to the minimum rank of the two input feature matrices, X and Y.

d = rank(Sxy) ≤ min(rank(X),rank(Y))
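The bound above can be checked numerically. This is a hypothetical illustration (the data and sizes are made up); it centers two random feature matrices, forms the between-set covariance, and confirms that its rank cannot exceed the rank of either input:

```matlab
% Hypothetical illustration of the dimensionality bound d = rank(Sxy).
n = 100;
X = rand(n, 40);                % first feature set,  n x p
Y = rand(n, 60);                % second feature set, n x q

Xc = X - mean(X);               % center each feature set
Yc = Y - mean(Y);
Sxy = Xc' * Yc;                 % between-set covariance (p x q, up to 1/(n-1))

d = rank(Sxy);                  % maximum length of the projected feature vector
% d <= min(rank(X), rank(Y))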

emad saeidi

Hi, can this be used for neuro-imaging CCA? I hope to hear from you soon. (:

Fares Al-shargie

Dear Mohammad,
I have a 50x120 matrix (n, p) from one modality [note: the 50 rows are 25 testing and 25 training samples] and another 50x120 matrix (n, q). I used your method to maximize the correlation so that feature fusion can improve the detection rate.
My first question is: why does the fused feature vector with 'concat' give me only 25x16?
Another question: how should I interpret the map of this fused feature matrix? Should I compare it with the individual ones and say that the correlation of the fused features was maximized?
Thank you very much.


Mohammad Haghighat


I'm very happy that the code has helped you, and thank you for your 5-star rating.

To the best of my knowledge, early fusion is just another name for data or feature level fusion, and late fusion is the same as decision level fusion. So, our method, using CCA, is kind of an early fusion technique.

In my recent paper, I've compared several well-known early-fusion and late-fusion techniques, including:

Early fusion methods:
- Serial feature fusion
- Parallel feature fusion
- CCA (Canonical Correlation Analysis)
- DCA (Discriminant Correlation Analysis)

Late fusion methods:
- SLR-Sum (Sparse Logistic Regression)
- SLR-Major
- SVM-Sum (Support Vector Machines)
- SVM-Major
- MKL (Multiple Kernel Learning)

You can find references to their papers (or codes) in this paper:

M. Haghighat, M. Abdel-Mottaleb, W. Alhalabi, "Discriminant Correlation Analysis: Real-Time Feature Level Fusion for Multimodal Biometric Recognition," IEEE Transactions on Information Forensics and Security, 2016.

Angel Lebanon

Thank you for this code.. Amazing! It helps me a lot.. Do you have any idea about late fusion and early fusion? I want to compare them with CCA

Mohammad Haghighat


trainX is an nxp matrix containing the first set of training data (n is the number of training samples and p is the dimensionality of the first feature set). Similarly, trainY is an nxq matrix containing the second set of training data. Obviously, you have the same number of training samples (n) but the feature vector length can vary (p vs. q).

can liu

I am glad to be using your code, but I am puzzled about trainLabel (the row vector of length n containing the class labels for the training data). You have given an explanation, but I cannot understand it well. Could you give a detailed explanation? An example would be even better. Thank you.


I am new to using MATLAB for fusion. I ran your code but am not sure what the values should be for trainX, etc.

My training data is 1x14000.
My question is actually about the n x p, n, p and n x q, q, and m x p: are they obtained from my feature vectors?

Thank you

Mohammad Haghighat


This is a very difficult question because there are many different methods for feature-level, matching-score-level, and decision-level fusion, and one may work better than another depending on the application.

However, all in all, fusion at the feature level is expected to provide better recognition results because the feature set contains richer information about the input data than the matching score or the output decision of a matcher [ref].

[ref] Arun Ross and Anil K. Jain, "Multimodal biometrics: An overview," In 12th European Signal Processing Conference (EUSIPCO), pp. 1221-1224, September 2004.

Mike Reno

How does this method work compared with score-level or decision-level fusion?


MATLAB Release Compatibility
Created with R2015b
Compatible with any release
Platform Compatibility
Windows macOS Linux
