filenames2labels

Get list of labels from filenames

Since R2022b

    Description

    lbls = filenames2labels(loc) creates a list of labels lbls based on the filenames in the specified location loc.

    lbls = filenames2labels(ds) creates a list of labels based on the filenames contained in ds. ds can be a datastore, matlab.io.datastore.FileSet object, or matlab.io.datastore.BlockedFileSet object.

    lbls = filenames2labels(___,Name=Value) specifies additional name-value arguments. For example, IncludeSubfolders=true includes subfolders in the scan for labels.

    [lbls,files] = filenames2labels(___) also returns the list of files scanned for labels, using any of the input argument combinations from the previous syntaxes. The ith label in lbls corresponds to the ith file in files.

    Examples

    Specify the folder that contains sample audio signals included with Signal Processing Toolbox™.

    audiofolder = "audiodata";

    Create a list of labels based on the names of the .wav files located in the folder.

    lbls = filenames2labels(audiofolder,FileExtensions=".wav")
    lbls = 3x1 categorical
         GuitarTuneSignal 
         NoisyMusicSignal 
         SpeechDFTSignal 
    
    

    Download a publicly available data set of ultra-wideband (UWB) impulse radar signals [1]. The data set contains recordings of dynamic hand gestures collected from human volunteers and is divided into eight subfolders.

    datasetZipFolder = matlab.internal.examples.downloadSupportFile("SPT","data/uwb-gestures.zip");
    datasetFolder = erase(datasetZipFolder,".zip");
    if ~exist(datasetFolder,"dir")
        downloadLocation = fileparts(datasetZipFolder);
        unzip(datasetZipFolder,downloadLocation);
    end

    Create a signal datastore that points to the files located in the download folder. Include subfolders in the search path.

    ds = signalDatastore(datasetFolder,IncludeSubfolders=true);

    Create a list of labels based on the filenames contained in the datastore. Extract only the substring of the filename that includes the gesture code (G1, G2, ..., G12). Verify that each of the eight subfolders contains one file corresponding to each gesture.

    p = "G"+digitsPattern;
    lbls = filenames2labels(ds,Extract=p);
    countlabels(lbls)
    ans=12×3 table
        Label    Count    Percent
        _____    _____    _______
    
         G1        8      8.3333 
         G10       8      8.3333 
         G11       8      8.3333 
         G12       8      8.3333 
         G2        8      8.3333 
         G3        8      8.3333 
         G4        8      8.3333 
         G5        8      8.3333 
         G6        8      8.3333 
         G7        8      8.3333 
         G8        8      8.3333 
         G9        8      8.3333 
    
    

    For an example that classifies the different hand gestures using a convolutional neural network (CNN), see Hand Gesture Classification Using Radar Signals and Deep Learning.

    References

    [1] Ahmed, Shahzad, Dingyang Wang, Junyoung Park, and Sung Ho Cho. "UWB-Gestures, a Public Dataset of Dynamic Hand Gestures Acquired Using Impulse Radar Sensors." Scientific Data 8, no. 1 (April 12, 2021): 102. https://doi.org/10.1038/s41597-021-00876-0.

    Input Arguments

    loc: Files or folders to scan for labels, specified as a character vector, a cell array of character vectors, a string scalar, or a string array. loc contains the location of local or remote files or folders.

    • Local files or folders — Specify loc as a local path to files or folders. If the files are not in the current folder, the local path must specify full or relative paths. Files within subfolders of the specified folder are not included by default; to include them, set IncludeSubfolders to true. You can use the wildcard character (*) when specifying the local path. This character specifies that the file search include all matching files or all files in the matching folders.

    • A remote location specified using an internationalized resource identifier (IRI).

    • Remote files or folders — Specify loc to be the full paths of the files or folders as a uniform resource locator (URL) of the form hdfs:///path_to_file. For more information, see Work with Remote Data.

    filenames2labels looks for all file formats. To specify a custom list of file extensions to scan, use the FileExtensions argument.

    Example: 'whale.mat'

    Example: '../dir/data/signal.mat'

    Example: "../dir/data/"

    Example: {'dataFiles/Files_1/' 'dataFiles/Files_2/'}

    Example: ["dataFiles/Files_1/" "dataFiles/Files_2/"]

    Data Types: char | string | cell

    ds: Data repository, specified as a datastore, matlab.io.datastore.FileSet object, or matlab.io.datastore.BlockedFileSet object.

    • If you specify a datastore, then ds must contain a Files property from which label names are parsed.

    • If you specify a matlab.io.datastore.FileSet object, the function obtains label names from the filenames listed in the FileInfo property of ds.

    • If you specify a matlab.io.datastore.BlockedFileSet object, the function obtains the label names from the filenames in the BlockInfo property of ds.
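
    As a rough sketch of the FileSet case described above, the following snippet passes a matlab.io.datastore.FileSet object instead of a folder path. The scratch folder and the file names are hypothetical and exist only so that the snippet runs on its own.

```matlab
% Create a scratch folder with a few empty placeholder files (hypothetical names).
folder = tempname;
mkdir(folder)
for name = ["idle_01.txt" "walk_01.txt" "run_01.txt"]
    fclose(fopen(fullfile(folder,name),"w"));
end

% Wrap the folder in a FileSet object and derive labels from its file list.
fs = matlab.io.datastore.FileSet(folder);
lbls = filenames2labels(fs,ExtractBefore="_");
disp(lbls)

% Remove the scratch folder.
rmdir(folder,"s")
```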

    Name-Value Arguments

    Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

    Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

    Example: filenames2labels(loc,"ExtractBetween",[5 8])

    FileExtensions: File extensions to scan, specified as a string scalar, string vector, character vector, or cell array of character vectors. If you do not specify FileExtensions, the function includes the names of all files found in the specified location in the list of labels.

    This argument applies only when the input is a file location.

    Example: [".mat" ".csv"]

    Data Types: char | string | cell
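
    The effect of FileExtensions can be sketched by scanning the same folder with and without the filter. The folder and file names below are made up for illustration; the files are empty placeholders.

```matlab
% Create a scratch folder that mixes two file types (hypothetical names).
folder = tempname;
mkdir(folder)
for name = ["clean.csv" "noisy.csv" "notes.txt"]
    fclose(fopen(fullfile(folder,name),"w"));
end

% Without the filter, every file in the folder contributes a label.
lblsAll = filenames2labels(folder);

% With the filter, only the .csv files are scanned.
lblsCsv = filenames2labels(folder,FileExtensions=".csv");

fprintf("All files: %d labels, .csv only: %d labels\n", ...
    numel(lblsAll),numel(lblsCsv));

% Remove the scratch folder.
rmdir(folder,"s")
```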

    IncludeSubfolders: Subfolder inclusion flag, specified as true or false. If you specify IncludeSubfolders as true, the function includes subfolders in the scan for labels.

    This argument applies only when the input is a file location.

    Data Types: double | logical

    ExtractBefore: Delimiter that marks the end position for the extracted substring, specified as a string scalar, pattern object, or positive integer.

    • If you specify a string or pattern object, the function extracts labels from each filename as the substring that begins with the first character of the filename and ends before the first occurrence of the delimiter string or pattern.

    • If you specify a positive integer, the function extracts labels from each filename as the substring that begins with the first character of the filename and ends before the position specified by the delimiter index.

    If the string or pattern object is not found in a filename, or if the index is 1 or larger than length(char(filename))+1, the function sets the label for that filename to undefined.

    Example: 3

    Example: "S"

    Example: digitsPattern + "_"

    Data Types: double | char | string
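
    A minimal sketch of both ExtractBefore forms follows, assuming a hypothetical naming scheme of the form class_index. With the string delimiter, the label is the text before the first underscore; with the index 3, the label is the text before the third character, that is, the first two characters.

```matlab
% Create a scratch folder with placeholder files (hypothetical names).
folder = tempname;
mkdir(folder)
for name = ["dog_01.txt" "cat_02.txt"]
    fclose(fopen(fullfile(folder,name),"w"));
end

% String delimiter: keep everything before the first underscore.
lblsStr = filenames2labels(folder,ExtractBefore="_");

% Integer delimiter: keep everything before character position 3.
lblsIdx = filenames2labels(folder,ExtractBefore=3);

disp(lblsStr)
disp(lblsIdx)

% Remove the scratch folder.
rmdir(folder,"s")
```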

    ExtractAfter: Delimiter that marks the start position for the extracted substring, specified as a string scalar, pattern object, or nonnegative integer.

    • If you specify a string or pattern object, the function extracts labels from each filename as the substring that begins after the first occurrence of the delimiter string or pattern and ends with the last character of the filename.

    • If you specify a nonnegative integer, the function extracts labels from each filename as the substring that begins after the position specified by the delimiter index and ends with the last character of the filename.

    If the string or pattern object is not found in a filename, or if the index is larger than or equal to length(char(filename))+1, the function sets the label for that filename to undefined.

    Example: 2

    Example: "Subject"

    Example: "_" + wildcardPattern

    Data Types: double | char | string
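
    The following sketch applies ExtractAfter to a hypothetical subject_activity naming scheme, once with a string delimiter and once with an index. In both calls the label starts right after the underscore, which sits at character position 4 in these made-up names.

```matlab
% Create a scratch folder with placeholder files (hypothetical names).
folder = tempname;
mkdir(folder)
for name = ["S01_walk.txt" "S02_run.txt"]
    fclose(fopen(fullfile(folder,name),"w"));
end

% String delimiter: keep everything after the first underscore.
lblsStr = filenames2labels(folder,ExtractAfter="_");

% Integer delimiter: keep everything after character position 4.
lblsIdx = filenames2labels(folder,ExtractAfter=4);

disp(lblsStr)
disp(lblsIdx)

% Remove the scratch folder.
rmdir(folder,"s")
```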

    ExtractBetween: Delimiters that mark the start and end positions for the extracted substring, specified as a two-element string vector, a two-element cell array of character vectors, a two-element vector of pattern objects, or a two-element vector of positive integers. For a delimiter equal to [P S]:

    • If you specify a two-element string vector, cell array of character vectors, or vector of pattern objects, the function extracts labels from each filename as the substring that begins after P and ends before S.

    • If you specify a two-element vector of positive integers, the function extracts labels from each filename as the characters indexed by P:S. S must be larger than or equal to P.

    If there are no characters between P and S, or if index P or S is larger than the length of the filename, the function sets the label for that filename to undefined.

    Example: [3 7]

    Example: ["A" "D"]

    Example: ["_" "_"]

    Data Types: double | char | string | cell
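
    To illustrate the two ExtractBetween forms, the sketch below uses hypothetical filenames of the form rec_class_index. The string pair keeps the text between the two underscores, and the index pair [5 7] keeps characters 5 through 7, which cover the same three letters in these made-up names.

```matlab
% Create a scratch folder with placeholder files (hypothetical names).
folder = tempname;
mkdir(folder)
for name = ["rec_dog_01.txt" "rec_cat_02.txt"]
    fclose(fopen(fullfile(folder,name),"w"));
end

% String delimiters: keep the text between the first pair of underscores.
lblsStr = filenames2labels(folder,ExtractBetween=["_" "_"]);

% Integer delimiters: keep characters 5 through 7 of each filename.
lblsIdx = filenames2labels(folder,ExtractBetween=[5 7]);

disp(lblsStr)
disp(lblsIdx)

% Remove the scratch folder.
rmdir(folder,"s")
```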

    Extract: Pattern that identifies the substring to extract, specified as a pattern object. The function extracts labels from each filename as the substring that matches the pattern object.

    • If no match is found in a filename, the function sets the label for that filename to undefined.

    • If a filename contains more than one match for the pattern, the function returns lbls as a matrix. All filenames must have the same number of pattern matches.

    Example: lettersPattern

    Example: "_" + wildcardPattern + "_"
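
    The multiple-match case can be sketched with hypothetical filenames that each contain two letter-digit codes. Because the pattern matches twice in every filename, lbls is expected to be a matrix rather than a vector, per the description above.

```matlab
% Create a scratch folder with placeholder files (hypothetical names).
folder = tempname;
mkdir(folder)
for name = ["A1_B2.txt" "A3_B4.txt"]
    fclose(fopen(fullfile(folder,name),"w"));
end

% Match one letter followed by one digit; each filename has two such codes.
p = lettersPattern(1) + digitsPattern(1);
lbls = filenames2labels(folder,Extract=p);

disp(size(lbls))
disp(lbls)

% Remove the scratch folder.
rmdir(folder,"s")
```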

    Output Arguments

    lbls: List of labels based on the names of files located in loc or contained in ds, returned as a categorical vector or matrix.

    files: List of files scanned for labels, returned as a string vector. The ith file in files corresponds to the ith label in lbls.
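
    As a small sketch of how the two outputs line up, the snippet below requests both and prints each file next to its label. The scratch folder and file names are hypothetical.

```matlab
% Create a scratch folder with placeholder files (hypothetical names).
folder = tempname;
mkdir(folder)
for name = ["dog_01.txt" "dog_02.txt" "cat_01.txt"]
    fclose(fopen(fullfile(folder,name),"w"));
end

% Request the labels together with the files they came from.
[lbls,files] = filenames2labels(folder,ExtractBefore="_");

% The kth label belongs to the kth file.
for k = 1:numel(files)
    fprintf("%s -> %s\n",files(k),string(lbls(k)));
end

% Remove the scratch folder.
rmdir(folder,"s")
```

    Keeping files alongside lbls is useful when you later need to trace a label back to the file that produced it, for example after shuffling or splitting the data.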

    Version History

    Introduced in R2022b