CuffGFFReadOptions

Option set for cuffgffread

Description

A CuffGFFReadOptions object contains options for the cuffgffread function, which filters and converts GFF and GTF files [1].

Creation

Description

cuffgffreadOpt = CuffGFFReadOptions creates a CuffGFFReadOptions object with the default property values.

CuffGFFReadOptions requires the Cufflinks Support Package for Bioinformatics Toolbox™. If the support package is not installed, then the function provides a download link.

Note

CuffGFFReadOptions is supported on the Mac and UNIX® platforms only.

cuffgffreadOpt = CuffGFFReadOptions(Name,Value) sets the object properties using one or more name-value pair arguments. Enclose each property name in quotes. For example, cuffgffreadOpt = CuffGFFReadOptions('DiscardSingleExon',true) discards transcripts spanning a single exon.

cuffgffreadOpt = CuffGFFReadOptions(S) specifies optional parameters using the string or character vector S.

Input Arguments

S

cuffgffread options, specified as a string or character vector. S must be in the original gffread option syntax (prefixed by one or two dashes).

Example: '-U'
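
As an illustration, several flags can be combined in one string. This is a minimal sketch that assumes the Cufflinks support package is installed; the flag-to-property mapping (-U for DiscardSingleExon, -T for GTFOutput) is taken from the getOptionsTable output in the Examples section.

```matlab
% Build an options object from flags in the original gffread syntax.
% -U corresponds to DiscardSingleExon and -T to GTFOutput.
opt = CuffGFFReadOptions("-U -T");

% The parsed flags are then available as regular object properties.
disp(opt.DiscardSingleExon)
disp(opt.GTFOutput)
```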

Properties

AppendDescription

Flag to add file descriptions from sequence files to the descr attribute of the output GFF record, specified as true or false. Specify the sequence files using the SequenceInfo option.

Example: true

Data Types: logical

CheckOppositeStrand

Flag to check the opposite strand when checking for in-frame stop codons, specified as true or false.

Example: true

Data Types: logical

CheckPhase

Flag to adjust the coding sequence phase when checking for in-frame stop codons, specified as true or false.

Example: true

Data Types: logical

Cluster

Flag to cluster the input transcripts into loci, specified as true or false. This option is the same as the Merge property, except that it does not collapse fully contained transcripts with identical introns.

Example: false

Data Types: logical

CodingOnly

Flag to discard transcripts with no coding sequence (CDS) feature, specified as true or false.

Example: true

Data Types: logical

CollapseContainer

Flag to collapse fully contained transcripts that are shorter and have fewer introns than the container, specified as true or false. This property applies only when you set Merge to true.

Example: true

Data Types: logical

CollapseFull

Flag to collapse shorter transcripts overlapping at least 80% with another single-exon transcript, specified as true or false. This property applies only when you set Merge to true.

Example: true

Data Types: logical

CoordinateRange

Genomic range to filter transcripts, specified as a string or character vector. The format must be "[[<strand>]<chr>:]<start>..<end>", where start and end are genomic positions, chr is an optional chromosome or contig name, and strand is an optional strand indicator ('+' or '-').

Example: "+NC_000912.1:4821..7340"

Data Types: char | string
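
The range string can be assembled from its parts. The following is a minimal sketch, assuming the support package is installed; the variable names are hypothetical and the values reuse the example above. It pairs the range with FullyContained, which is described later in this list.

```matlab
% Hypothetical region of interest (values reuse the documented example).
strand   = '+';
chrName  = 'NC_000912.1';
startPos = 4821;
endPos   = 7340;

% Assemble "<strand><chr>:<start>..<end>".
range = sprintf('%s%s:%d..%d', strand, chrName, startPos, endPos);

% Keep only transcripts that lie fully within the range.
opt = CuffGFFReadOptions('CoordinateRange', range, 'FullyContained', true);
```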

DiscardInvalidCDS

Flag to ignore mRNA transcripts that lack a start or stop codon or that have an in-frame stop codon, specified as true or false.

Example: true

Data Types: logical

DiscardNonCanonicalSplice

Flag to ignore multiexon mRNA transcripts that have an intron with a noncanonical splice sequence, specified as true or false. A noncanonical splice sequence is any splice sequence other than "GT-AG", "GC-AG", or "AT-AC".

Example: true

Data Types: logical

DiscardSingleExon

Flag to ignore transcripts spanning a single exon, specified as true or false.

Example: true

Data Types: logical

DiscardTerminatedCDS

Flag to ignore transcripts with an in-frame stop codon, specified as true or false.

Example: true

Data Types: logical

ExtraCommand

Additional commands, specified as a string or character vector. The commands must be in the original syntax (prefixed by one or two dashes). Use this option to apply undocumented flags and flags without corresponding MATLAB properties. When the function converts the original flags to MATLAB properties, it stores any unrecognized flags in this option.

Example: "-E"

Data Types: char | string

FastaCDSFile

Name of a file to save the spliced coding sequences in the FASTA format, specified as a string or character vector.

Example: "splicedCoding.FASTA"

Data Types: char | string

FastaExonsFile

Name of a file to save the spliced exons in the FASTA format, specified as a string or character vector.

Example: "splicedExon.FASTA"

Data Types: char | string

FastaProteinFile

Name of a file to save the protein translation of coding sequences in the FASTA format, specified as a string or character vector.

Example: "translated.FASTA"

Data Types: char | string

FirstExonOnly

Flag to parse additional attributes only from the first exon, specified as true or false.

Example: true

Data Types: logical

ForceExons

Flag to list the lowest-level GFF features as exon features in the output file, specified as true or false.

Example: true

Data Types: logical

FullyContained

Flag to discard transcripts not fully contained within the range, specified as true or false. Specify the range using the CoordinateRange option.

Example: true

Data Types: logical

GTFOutput

Flag to output GTF-format transcript files, specified as true or false.

Example: true

Data Types: logical

IncludeAll

Flag to include all the object properties with the corresponding default values when converting to the original options syntax, specified as true or false. You can convert the properties to the original syntax prefixed by one or two dashes (such as '-d 100 -e 80') by using getCommand. The default value false means that when you call getCommand(optionsObject), it converts only the specified properties. If the value is true, getCommand converts all available properties, with default values for unspecified properties, to the original syntax.

Example: true

Data Types: logical
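
The two conversion modes can be compared side by side. This sketch assumes that the property described above is named IncludeAll, as in the toolbox's related option objects, and that the support package is installed; no particular output string is claimed here.

```matlab
% Set a single property explicitly.
opt = CuffGFFReadOptions('DiscardSingleExon', true);

% By default, getCommand translates only the properties that were set.
cmdSpecified = getCommand(opt);

% With IncludeAll enabled (assumed property name), getCommand also
% translates the unset properties using their default values.
opt.IncludeAll = true;
cmdAll = getCommand(opt);

disp(cmdSpecified)
disp(cmdAll)
```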

MaxIntronLength

Maximum intron length for a transcript to include in the output file, specified as a positive integer. Inf, the default value, sets no limit on the intron length.

Example: 500

Data Types: double

Merge

Flag to merge transcripts into loci by collapsing transcripts with identical introns, specified as true or false.

Example: true

Data Types: logical

MergeCloseExons

Flag to merge exons into a single exon when they are separated by introns shorter than 4 base pairs, specified as true or false.

Example: true

Data Types: logical

MergeInfoFile

Name of a file to save information on duplicates when merging, specified as a string or character vector. This property applies only when you set Merge to true.

Example: "duplicates.txt"

Data Types: char | string
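
Several properties take effect only together with Merge. The following sketch, which assumes the support package is installed, enables merging along with two of the dependent properties (CollapseContainer and MergeInfoFile, names as listed in the getOptionsTable output in the Examples section). The file name is the example value from above.

```matlab
opt = CuffGFFReadOptions;

% Merge must be true for the dependent properties below to apply.
opt.Merge = true;

% Also collapse fully contained, shorter transcripts.
opt.CollapseContainer = true;

% Record which transcripts were treated as duplicates.
opt.MergeInfoFile = "duplicates.txt";
```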

PreserveAttributes

Flag to retain all attributes in the output file, specified as true or false.

Example: true

Data Types: logical

Pseudo

Flag to filter out records containing the word "pseudo", specified as true or false.

Example: false

Data Types: logical

ReplacementTable

Name of a file containing a replacement table, specified as a string or character vector. The table must have two columns, where the first column contains the original transcript IDs and the second column contains the new transcript IDs. An example table follows.

origTranscript1    newTranscript1
origTranscript2    newTranscript2
origTranscript3    newTranscript3

If you provide a replacement table, the function replaces the transcript IDs found in the first column with the new transcript IDs from the second column and filters out those transcripts not found.

Example: "replaceTbl.txt"

Data Types: char | string
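
A replacement table can be written from MATLAB before it is passed to the options object. This is a minimal sketch: the IDs are the placeholder values from the example table, and the tab separator is an assumption, because the required column delimiter is not stated here.

```matlab
% Placeholder IDs from the example table above.
origIDs = ["origTranscript1"; "origTranscript2"; "origTranscript3"];
newIDs  = ["newTranscript1";  "newTranscript2";  "newTranscript3"];

% Write one "original<TAB>new" pair per line (tab delimiter is assumed).
tblFile = 'replaceTbl.txt';
fid = fopen(tblFile, 'w');
for k = 1:numel(origIDs)
    fprintf(fid, '%s\t%s\n', origIDs(k), newIDs(k));
end
fclose(fid);

% Point the options object at the table file.
opt = CuffGFFReadOptions('ReplacementTable', tblFile);
```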

SequenceFile

Name of a FASTA-format file containing genomic sequences for all input mappings, specified as a string or character vector.

Example: "seqs.fasta"

Data Types: char | string

SequenceInfo

Name of a tab-delimited file with additional information on each input sequence, specified as a string or character vector. This file must have three columns: a sequence name column, a sequence length column, and a sequence description column. If AppendDescription is true, the sequence description is included as an attribute in the output GFF file.

Example: "seqinfo.txt"

Data Types: char | string
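
The three-column file described above can be generated as follows. The sequence names, lengths, and descriptions are hypothetical placeholders, and the sketch assumes the support package is installed.

```matlab
% Hypothetical sequence metadata: name, length, description.
seqNames   = ["seqA"; "seqB"];
seqLengths = [12000; 8500];
seqDescr   = ["hypothetical contig A"; "hypothetical contig B"];

% Write one tab-delimited row per sequence.
infoFile = 'seqinfo.txt';
fid = fopen(infoFile, 'w');
for k = 1:numel(seqNames)
    fprintf(fid, '%s\t%d\t%s\n', seqNames(k), seqLengths(k), seqDescr(k));
end
fclose(fid);

% Ask for the descriptions to be appended to the output GFF records.
opt = CuffGFFReadOptions('SequenceInfo', infoFile, 'AppendDescription', true);
```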

UrlDecode

Flag to decode URL-encoded characters in attribute names, specified as true or false. For instance, "transcript%20description" is decoded to "transcript description".

Example: true

Data Types: logical

UseEnsemblConversion

Flag to use the GTF-to-GFF3 conversion method from Ensembl, specified as true or false.

Example: true

Data Types: logical

UseNonTranscript

Flag to include nontranscript GFF records in the output file, specified as true or false.

Example: true

Data Types: logical

UseTrackName

Flag to use the track name in the second column of the GFF output line, specified as true or false.

Example: true

Data Types: logical

Version

This property is read-only.

Supported version of the original cufflinks software, returned as a string.

Example: "2.2.1"

Data Types: string

WriteCoordinates

Flag to write the exon coordinates projected onto the spliced sequence, specified as true or false. This property applies only when FastaExonsFile or FastaCDSFile is specified.

Example: true

Data Types: logical
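
The FASTA-related properties are typically set together. The sketch below uses the example file names from the property descriptions above. It assumes the support package is installed and that a genomic sequence file (SequenceFile) is needed to produce FASTA output, as with the original gffread tool. Neither assumption is confirmed here.

```matlab
opt = CuffGFFReadOptions;

% Genomic sequences for the input mappings (example file name from above).
opt.SequenceFile = "seqs.fasta";

% Save the spliced exons and include their projected coordinates.
opt.FastaExonsFile   = "splicedExon.FASTA";
opt.WriteCoordinates = true;
```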

Object Functions

getCommand         Translate object properties to original options syntax
getOptionsTable    Return table with all properties and equivalent options in original syntax

Examples

Create a CuffGFFReadOptions object with the default values.

opt = CuffGFFReadOptions;

Create an object using name-value pairs.

opt2 = CuffGFFReadOptions('DiscardSingleExon',true,'FastaExonsFile','exons.fa');

Create an object by using the original syntax.

opt3 = CuffGFFReadOptions('-U -w exons.fa')

Convert a GTF file to a GFF file while retaining all attributes.

cuffgffread('gyrAB.gtf','gyrABOut.gff','PreserveAttributes',true)

You can also set the options using an object. For instance, specify the output to be in the GTF format.

opt = CuffGFFReadOptions;
opt.GTFOutput = true;
opt.PreserveAttributes = true;
cuffgffread('gyrAB.gtf','gyrABOut.gtf',opt);

Once you have the options object, you can retrieve the equivalent original options for all object properties using getOptionsTable.

getOptionsTable(opt)
ans =

  33×3 table

                                        PropertyName                FlagName        FlagShortName
                                 ___________________________    ________________    _____________

    AppendDescription            'AppendDescription'            '-A'                    ''       
    CheckOppositeStrand          'CheckOppositeStrand'          '-B'                    ''       
    CheckPhase                   'CheckPhase'                   '-H'                    ''       
    Cluster                      'Cluster'                      '--cluster-only'        ''       
    CodingOnly                   'CodingOnly'                   '-C'                    ''       
    CollapseContainer            'CollapseContainer'            '-K'                    ''       
    CollapseFull                 'CollapseFull'                 '-Q'                    ''       
    CoordinateRange              'CoordinateRange'              '-r'                    ''       
    DiscardInvalidCDS            'DiscardInvalidCDS'            '-J'                    ''       
    DiscardNonCanonicalSplice    'DiscardNonCanonicalSplice'    '-N'                    ''       
    DiscardSingleExon            'DiscardSingleExon'            '-U'                    ''       
    DiscardTerminatedCDS         'DiscardTerminatedCDS'         '-V'                    ''       
    FastaCDSFile                 'FastaCDSFile'                 '-x'                    ''       
    FastaExonsFile               'FastaExonsFile'               '-w'                    ''       
    FastaProteinFile             'FastaProteinFile'             '-y'                    ''       
    FirstExonOnly                'FirstExonOnly'                '-G'                    ''       
    ForceExons                   'ForceExons'                   '--force-exons'         ''       
    FullyContained               'FullyContained'               '-R'                    ''       
    GTFOutput                    'GTFOutput'                    '-T'                    ''       
    MaxIntronLength              'MaxIntronLength'              '-i'                    ''       
    Merge                        'Merge'                        '--merge'               '-M'     
    MergeCloseExons              'MergeCloseExons'              '-Z'                    ''       
    MergeInfoFile                'MergeInfoFile'                '-d'                    ''       
    PreserveAttributes           'PreserveAttributes'           '-F'                    ''       
    Pseudo                       'Pseudo'                       '--no-pseudo'           ''       
    ReplacementTable             'ReplacementTable'             '-m'                    ''       
    SequenceFile                 'SequenceFile'                 '-g'                    ''       
    SequenceInfo                 'SequenceInfo'                 '-s'                    ''       
    UrlDecode                    'UrlDecode'                    '-D'                    ''       
    UseEnsemblConversion         'UseEnsemblConversion'         '-L'                    ''       
    UseNonTranscript             'UseNonTranscript'             '-O'                    ''       
    UseTrackName                 'UseTrackName'                 '-t'                    ''       
    WriteCoordinates             'WriteCoordinates'             '-W'                    ''       
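
To look up the flag for a single property, the returned table can be filtered by its PropertyName column. This is a minimal sketch that relies only on the column names shown in the output above and assumes the support package is installed.

```matlab
opt = CuffGFFReadOptions;
tbl = getOptionsTable(opt);

% Select the row whose PropertyName matches, then read its FlagName.
isRow = strcmp(tbl.PropertyName, 'GTFOutput');
flag  = tbl.FlagName(isRow);

disp(flag)
```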

References

[1] Trapnell, C., B. Williams, G. Pertea, A. Mortazavi, G. Kwan, J. van Baren, S. Salzberg, B. Wold, and L. Pachter. 2010. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology. 28:511–515.

Introduced in R2019a