getgenbank
Retrieve sequence information from GenBank database
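Syntax
getgenbank(AccessionNumber)
Data = getgenbank(AccessionNumber)
getgenbank(___,Name=Value)
Data = getgenbank(___)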
Description
getgenbank(AccessionNumber) displays information in the MATLAB® Command Window without returning data to a variable. The displayed information consists only of hyperlinks to the URLs used to search for and retrieve the data. The getgenbank function retrieves nucleotide information from the GenBank® database. This database is maintained by the National Center for Biotechnology Information (NCBI). For more details about the GenBank database, see https://www.ncbi.nlm.nih.gov/Genbank/.
Data = getgenbank(AccessionNumber) returns Data, a MATLAB structure containing information for the sequence.
Tip
If an error occurs while retrieving the GenBank-formatted information, try rerunning the query. Errors can occur due to Internet connectivity issues that are unrelated to the GenBank record.
getgenbank(___,Name=Value) specifies options using one or more name-value arguments in addition to the input arguments in previous syntaxes. Each name-value argument is case insensitive.
Data = getgenbank(___) returns Data, a MATLAB structure containing information for the sequence.
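The tip above suggests rerunning a query that fails for connectivity reasons. A minimal sketch of that idea follows. The retry count and pause length (maxTries, pauseSeconds) are illustrative values chosen for this sketch, not documented defaults, and the code assumes a working Internet connection.

```matlab
% Retry a GenBank query a few times before giving up.
% maxTries and pauseSeconds are hypothetical values for illustration.
accession    = 'M10051';
maxTries     = 3;
pauseSeconds = 2;
Data = [];

for attempt = 1:maxTries
    try
        Data = getgenbank(accession);
        break   % success: stop retrying
    catch err
        warning('Attempt %d of %d failed: %s', attempt, maxTries, err.message);
        pause(pauseSeconds);
    end
end

if isempty(Data)
    error('Could not retrieve %s after %d attempts.', accession, maxTries);
end
```

A fixed pause keeps the sketch short. A longer or increasing pause may be more appropriate when the failures come from a busy server.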
Examples
This example shows how to retrieve the sequence with accession number M10051, which comes from chromosome 19 and codes for the human insulin receptor, and store it in a structure, S.
S = getgenbank('M10051')
S = struct with fields:
                LocusName: 'HUMINSR'
      LocusSequenceLength: '4723'
     LocusNumberofStrands: ''
            LocusTopology: 'linear'
        LocusMoleculeType: 'mRNA'
     LocusGenBankDivision: 'PRI'
    LocusModificationDate: '06-JAN-1995'
               Definition: 'Human insulin receptor mRNA, complete cds.'
                Accession: 'M10051'
                  Version: 'M10051.1'
                       GI: ''
                  Project: []
                   DBLink: []
                 Keywords: 'insulin receptor; tyrosine kinase.'
                  Segment: []
                   Source: 'Homo sapiens (human)'
           SourceOrganism: [4×65 char]
                Reference: {[1×1 struct]}
                  Comment: [14×67 char]
                 Features: [50×74 char]
                      CDS: [1×1 struct]
                 Sequence: 'ggggggctgcgcggccgggtcggtgcgcacacgagaaggacgcgcggcccccagcgctcttgggggccgcctcggagcatgacccccgcgggccagcgccgcgcgcctgatccgaggagaccccgcgctcccgcagccatgggcaccgggggccggcggggggcggcggccgcgccgctgctggtggcggtggccgcgctgctactgggcgccgcgggccacctgtaccccggagaggtgtgtcccggcatggatatccggaacaacctcactaggttgcatgagctggagaattgctctgtcatcgaaggacacttgcagatactcttgatgttcaaaacgaggcccgaagatttccgagacctcagtttccccaaactcatcatgatcactgattacttgctgctcttccgggtctatgggctcgagagcctgaaggacctgttccccaacctcacggtcatccggggatcacgactgttctttaactacgcgctggtcatcttcgagatggttcacctcaaggaactcggcctctacaacctgatgaacatcacccggggttctgtccgcatcgagaagaacaatgagctctgttacttggccactatcgactggtcccgtatcctggattccgtggaggataatcacatcgtgttgaacaaagatgacaacgaggagtgtggagacatctgtccgggtaccgcgaagggcaagaccaactgccccgccaccgtcatcaacgggcagtttgtcgaacgatgttggactcatagtcactgccagaaagtttgcccgaccatctgtaagtcacacggctgcaccgccgaaggcctctgttgccacagcgagtgcctgggcaactgttctcagcccgacgaccccaccaagtgcgtggcctgccgcaacttctacctggacggcaggtgtgtggagacctgcccgcccccgtactaccacttccaggactggcgctgtgtgaacttcagcttctgccaggacctgcaccacaaatgcaagaactcgcggaggcagggctgccaccaatacgtcattcacaacaacaagtgcatccctgagtgtccctccgggtacacgatgaattccagcaacttgctgtgcaccccatgcctgggtccctgtcccaaggtgtgccacctcctagaaggcgagaagaccatcgactcggtgacgtctgcccaggagctccgaggatgcaccgtcatcaacgggagtctgatcatcaacattcgaggaggcaacaatctggcagctgagctagaagccaacctcggcctcattgaagaaatttcagggtatctaaaaatccgccgatcctacgctctggtgtcactttccttcttccggaagttacgtctgattcgaggagagaccttggaaattgggaactactccttctatgccttggacaaccagaacctaaggcagctctgggactggagcaaacacaacctcaccaccactcaggggaaactcttcttccactataaccccaaactctgcttgtcagaaatccacaagatggaagaagtttcaggaaccaaggggcgccaggagagaaacgacattgccctgaagaccaatggggacaaggcatcctgtgaaaatgagttacttaaattttcttacattcggacatcttttgacaagatcttgctgagatgggagccgtactggccccccgacttccgagacctcttggggttcatgctgttctacaaagaggccccttatcagaatgtgacggagttcgatgggcaggatgcgtgtggttccaacagttggacggtggtagacattgacccacccctgaggtccaacgaccccaaatcacagaaccacccagggtggctgatgcggggtctcaagccctggacccagtatgccatctttgtgaagaccctggtcaccttttcggatgaacgccggacctatggggccaagagtgaca
tcatttatgtccagacagatgccaccaacccctctgtgcccctggatccaatctcagtgtctaactcatcatcccagattattctgaagtggaaaccaccctccgaccccaatggcaacatcacccactacctggttttctgggagaggcaggcggaagacagtgagctgttcgagctggattattgcctcaaagggctgaagctgccctcgaggacctggtctccaccattcgagtctgaagattctcagaagcacaaccagagtgagtatgaggattcggccggcgaatgctgctcctgtccaaagacagactctcagatcctgaaggagctggaggagtcctcgtttaggaagacgtttgaggattacctgcacaacgtggttttcgtccccagaaaaacctcttcaggcactggtgccgaggaccctaggccatctcggaaacgcaggtcccttggcgatgttgggaatgtgacggtggccgtgcccacggtggcagctttccccaacacttcctcgaccagcgtgcccacgagtccggaggagcacaggccttttgagaaggtggtgaacaaggagtcgctggtcatctccggcttgcgacacttcacgggctatcgcatcgagctgcaggcttgcaaccaggacacccctgaggaacggtgcagtgtggcagcctacgtcagtgcgaggaccatgcctgaagccaaggctgatgacattgttggccctgtgacgcatgaaatctttgagaacaacgtcgtccacttgatgtggcaggagccgaaggagcccaatggtctgatcgtgctgtatgaagtgagttatcggcgatatggtgatgaggagctgcatctctgcgtctcccgcaagcacttcgctctggaacggggctgcaggctgcgtgggctgtcaccggggaactacagcgtgcgaatccgggccacctcccttgcgggcaacggctcttggacggaacccacctatttctacgtgacagactatttagacgtcccgtcaaatattgcaaaaattatcatcggccccctcatctttgtctttctcttcagtgttgtgattggaagtatttatctattcctgagaaagaggcagccagatgggccgctgggaccgctttacgcttcttcaaaccctgagtatctcagtgccagtgatgtgtttccatgctctgtgtacgtgccggacgagtgggaggtgtctcgagagaagatcaccctccttcgagagctggggcagggctccttcggcatggtgtatgagggcaatgccagggacatcatcaagggtgaggcagagacccgcgtggcggtgaagacggtcaacgagtcagccagtctccgagagcggattgagttcctcaatgaggcctcggtcatgaagggcttcacctgccatcacgtggtgcgcctcctgggagtggtgtccaagggccagcccacgctggtggtgatggagctgatggctcacggagacctgaagagctacctccgttctctgcggccagaggctgagaataatcctggccgccctccccctacccttcaagagatgattcagatggcggcagagattgctgacgggatggcctacctgaacgccaagaagtttgtgcatcgggacctggcagcgagaaactgcatggtcgcccatgattttactgtcaaaattggagactttggaatgaccagagacatctatgaaacggattactaccggaaagggggcaagggtctgctccctgtacggtggatggcaccggagtccctgaaggatggggtcttcaccacttcttctgacatgtggtcctttggcgtggtcctttgggaaatcaccagcttggcagaacagccttaccaaggcctgtctaatgaacaggtgttgaaatttgtcatggatggagggtatctggatcaacccgacaactgtccagagagagtcactgacctcatgcgcatgtgctggcaattcaaccccaag
atgaggccaaccttcctggagattgtcaacctgctcaaggacgacctgcaccccagctttccagaggtgtcgttcttccacagcgaggagaacaaggctcccgagagtgaggagctggagatggagtttgaggacatggagaatgtgcccctggaccgttcctcgcactgtcagagggaggaggcggggggccgggatggagggtcctcgctgggtttcaagcggagctacgaggaacacatcccttacacacacatgaacggaggcaagaaaaacgggcggattctgaccttgcctcggtccaatccttcctaacagtgcctaccgtggcgggggcgggcaggggttcccattttcgctttcctctggtttgaaagcctctggaaaactcaggattctcacgactctaccatgtccagtggagttcagagatcgttcctatacatttctgttcatcttaaggtggactcgtttggttaccaatttaactagtcctgcagaggatttaactgtgaacctggagggcaaggggtttccacagttgctgctcctttggggcaacgacggtttcaaaccaggattttgtgttttttcgttccccccacccgcccccagcagatggaaagaaagcacctgtttttacaaattcttttttttttttttttttttttttttttgctggtgtctgagcttcagtataaaagacaaaacttcctgtttgtggaacaaaatttcgaaagaaaaaaccaaa'
                SearchURL: 'https://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&id=M10051'
              RetrieveURL: 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=186439&rettype=gb&retmode=text&api_key=55022f38eb25e2f6b00a772015c7b77d6208'
This example shows how to retrieve only the coding sequence from chromosome 19 that codes for the human insulin receptor and store it in a structure, CDS. To determine that the coding sequence spans positions 139 through 4287, look at the Features field of the structure returned for the full record.
CDS = getgenbank('M10051',PartialSeq=[139,4287])
CDS = struct with fields:
                LocusName: 'HUMINSR'
      LocusSequenceLength: '4149'
     LocusNumberofStrands: ''
            LocusTopology: 'linear'
        LocusMoleculeType: 'mRNA'
     LocusGenBankDivision: 'PRI'
    LocusModificationDate: '06-JAN-1995'
               Definition: 'Human insulin receptor mRNA, complete cds.'
                Accession: 'M10051 REGION: 139..4287'
                  Version: 'M10051.1'
                       GI: ''
                  Project: []
                   DBLink: []
                 Keywords: 'insulin receptor; tyrosine kinase.'
                  Segment: []
                   Source: 'Homo sapiens (human)'
           SourceOrganism: [4×65 char]
                Reference: {[1×1 struct]}
                  Comment: [14×67 char]
                 Features: [50×74 char]
                      CDS: [1×1 struct]
                 Sequence: 'atgggcaccgggggccggcggggggcggcggccgcgccgctgctggtggcggtggccgcgctgctactgggcgccgcgggccacctgtaccccggagaggtgtgtcccggcatggatatccggaacaacctcactaggttgcatgagctggagaattgctctgtcatcgaaggacacttgcagatactcttgatgttcaaaacgaggcccgaagatttccgagacctcagtttccccaaactcatcatgatcactgattacttgctgctcttccgggtctatgggctcgagagcctgaaggacctgttccccaacctcacggtcatccggggatcacgactgttctttaactacgcgctggtcatcttcgagatggttcacctcaaggaactcggcctctacaacctgatgaacatcacccggggttctgtccgcatcgagaagaacaatgagctctgttacttggccactatcgactggtcccgtatcctggattccgtggaggataatcacatcgtgttgaacaaagatgacaacgaggagtgtggagacatctgtccgggtaccgcgaagggcaagaccaactgccccgccaccgtcatcaacgggcagtttgtcgaacgatgttggactcatagtcactgccagaaagtttgcccgaccatctgtaagtcacacggctgcaccgccgaaggcctctgttgccacagcgagtgcctgggcaactgttctcagcccgacgaccccaccaagtgcgtggcctgccgcaacttctacctggacggcaggtgtgtggagacctgcccgcccccgtactaccacttccaggactggcgctgtgtgaacttcagcttctgccaggacctgcaccacaaatgcaagaactcgcggaggcagggctgccaccaatacgtcattcacaacaacaagtgcatccctgagtgtccctccgggtacacgatgaattccagcaacttgctgtgcaccccatgcctgggtccctgtcccaaggtgtgccacctcctagaaggcgagaagaccatcgactcggtgacgtctgcccaggagctccgaggatgcaccgtcatcaacgggagtctgatcatcaacattcgaggaggcaacaatctggcagctgagctagaagccaacctcggcctcattgaagaaatttcagggtatctaaaaatccgccgatcctacgctctggtgtcactttccttcttccggaagttacgtctgattcgaggagagaccttggaaattgggaactactccttctatgccttggacaaccagaacctaaggcagctctgggactggagcaaacacaacctcaccaccactcaggggaaactcttcttccactataaccccaaactctgcttgtcagaaatccacaagatggaagaagtttcaggaaccaaggggcgccaggagagaaacgacattgccctgaagaccaatggggacaaggcatcctgtgaaaatgagttacttaaattttcttacattcggacatcttttgacaagatcttgctgagatgggagccgtactggccccccgacttccgagacctcttggggttcatgctgttctacaaagaggccccttatcagaatgtgacggagttcgatgggcaggatgcgtgtggttccaacagttggacggtggtagacattgacccacccctgaggtccaacgaccccaaatcacagaaccacccagggtggctgatgcggggtctcaagccctggacccagtatgccatctttgtgaagaccctggtcaccttttcggatgaacgccggacctatggggccaagagtgacatcatttatgtccagacagatgccaccaacccctctgtgcccctggatccaatctcagtgtctaactcatcatcccagattattctgaagtggaaaccaccctccgaccccaatggcaacatcacccactacctggttt
tctgggagaggcaggcggaagacagtgagctgttcgagctggattattgcctcaaagggctgaagctgccctcgaggacctggtctccaccattcgagtctgaagattctcagaagcacaaccagagtgagtatgaggattcggccggcgaatgctgctcctgtccaaagacagactctcagatcctgaaggagctggaggagtcctcgtttaggaagacgtttgaggattacctgcacaacgtggttttcgtccccagaaaaacctcttcaggcactggtgccgaggaccctaggccatctcggaaacgcaggtcccttggcgatgttgggaatgtgacggtggccgtgcccacggtggcagctttccccaacacttcctcgaccagcgtgcccacgagtccggaggagcacaggccttttgagaaggtggtgaacaaggagtcgctggtcatctccggcttgcgacacttcacgggctatcgcatcgagctgcaggcttgcaaccaggacacccctgaggaacggtgcagtgtggcagcctacgtcagtgcgaggaccatgcctgaagccaaggctgatgacattgttggccctgtgacgcatgaaatctttgagaacaacgtcgtccacttgatgtggcaggagccgaaggagcccaatggtctgatcgtgctgtatgaagtgagttatcggcgatatggtgatgaggagctgcatctctgcgtctcccgcaagcacttcgctctggaacggggctgcaggctgcgtgggctgtcaccggggaactacagcgtgcgaatccgggccacctcccttgcgggcaacggctcttggacggaacccacctatttctacgtgacagactatttagacgtcccgtcaaatattgcaaaaattatcatcggccccctcatctttgtctttctcttcagtgttgtgattggaagtatttatctattcctgagaaagaggcagccagatgggccgctgggaccgctttacgcttcttcaaaccctgagtatctcagtgccagtgatgtgtttccatgctctgtgtacgtgccggacgagtgggaggtgtctcgagagaagatcaccctccttcgagagctggggcagggctccttcggcatggtgtatgagggcaatgccagggacatcatcaagggtgaggcagagacccgcgtggcggtgaagacggtcaacgagtcagccagtctccgagagcggattgagttcctcaatgaggcctcggtcatgaagggcttcacctgccatcacgtggtgcgcctcctgggagtggtgtccaagggccagcccacgctggtggtgatggagctgatggctcacggagacctgaagagctacctccgttctctgcggccagaggctgagaataatcctggccgccctccccctacccttcaagagatgattcagatggcggcagagattgctgacgggatggcctacctgaacgccaagaagtttgtgcatcgggacctggcagcgagaaactgcatggtcgcccatgattttactgtcaaaattggagactttggaatgaccagagacatctatgaaacggattactaccggaaagggggcaagggtctgctccctgtacggtggatggcaccggagtccctgaaggatggggtcttcaccacttcttctgacatgtggtcctttggcgtggtcctttgggaaatcaccagcttggcagaacagccttaccaaggcctgtctaatgaacaggtgttgaaatttgtcatggatggagggtatctggatcaacccgacaactgtccagagagagtcactgacctcatgcgcatgtgctggcaattcaaccccaagatgaggccaaccttcctggagattgtcaacctgctcaaggacgacctgcaccccagctttccagaggtgtcgttcttccacagcgaggagaacaaggctcccgagagtgaggagctggagatggagtttgaggacatg
gagaatgtgcccctggaccgttcctcgcactgtcagagggaggaggcggggggccgggatggagggtcctcgctgggtttcaagcggagctacgaggaacacatcccttacacacacatgaacggaggcaagaaaaacgggcggattctgaccttgcctcggtccaatccttcctaa'
                SearchURL: 'https://www.ncbi.nlm.nih.gov/entrez/viewer.fcgi?db=nuccore&id=M10051'
              RetrieveURL: 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=186439&rettype=gb&retmode=text&api_key=55022f38eb25e2f6b00a772015c7b77d6208&seq_start=139&seq_stop=4287'
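The length reported for the partial record follows directly from the requested positions, because both endpoints are included. The sketch below works through that arithmetic and then compares it with the retrieved sequence. The variable names (startBP, endBP, expectedLength) are introduced here for illustration, and the comparison assumes the record is unchanged since the output above was generated.

```matlab
% Positions taken from the example above.
startBP = 139;
endBP   = 4287;

% Both endpoints are included, so add 1.
expectedLength = endBP - startBP + 1;   % 4287 - 139 + 1 = 4149

% A complete coding sequence is a whole number of codons.
numCodons = expectedLength / 3;         % 4149 / 3 = 1383

% Retrieve the subsequence and compare (requires Internet access).
CDS = getgenbank('M10051', PartialSeq=[startBP, endBP]);
fprintf('Expected %d bases, retrieved %d bases.\n', ...
    expectedLength, length(CDS.Sequence));
```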
Input Arguments
AccessionNumber
Unique alphanumeric identifier for the sequence record, specified as a character vector or string.
Example: 'M10051'
Data Types: char | string
Name-Value Arguments
Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.
Example: PartialSeq=[139,4287]
PartialSeq
Start and end positions of the subsequence to retrieve, specified as a two-element array of integers [StartBP, EndBP]. StartBP is an integer between 1 and EndBP. EndBP is an integer between StartBP and the length of the sequence.
Example: PartialSeq=[139,4287]
ToFile
Name or path of the file in which to save the data returned from the GenBank database, specified as a character vector or string. If you specify only a file name, the file is saved to the MATLAB current folder. The function does not append data to an existing file. Instead, it overwrites the contents of the existing file without warning.
Example: ToFile='myFile.m'
Data Types: char | string
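Because ToFile overwrites an existing file without warning, it can be worth checking for the file first. The following is a minimal sketch under stated assumptions: the file name M10051.gb is hypothetical, isfile requires a reasonably recent MATLAB release, and genbankread (also part of Bioinformatics Toolbox) is used here only to show that the saved file can be read back.

```matlab
% Hypothetical output file name; saved to the current folder.
outFile = 'M10051.gb';

% Guard against silently overwriting an existing file.
if isfile(outFile)
    error('%s already exists. Choose another name or delete it first.', outFile);
end

% Retrieve the record and save the GenBank-formatted text to disk.
Data = getgenbank('M10051', ToFile=outFile);

% Read the saved file back into a structure.
Saved = genbankread(outFile);
disp(Saved.Definition)
```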
FileFormat
Format for the sequence information, specified as one of the following character vectors or strings:
- 'GenBank': Default when SequenceOnly is false.
- 'FASTA': Default when SequenceOnly is true.
When FileFormat is 'FASTA', Data contains only two fields, Header and Sequence.
Example: FileFormat='FASTA'
Data Types: char | string
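As a short illustration of the FASTA option described above, the sketch below requests the record in FASTA format and inspects the two fields the description names. It assumes Internet access, and the variable name F is chosen here for illustration.

```matlab
% Request the record in FASTA format instead of GenBank format.
F = getgenbank('M10051', FileFormat='FASTA');

% Per the description above, only Header and Sequence are expected.
disp(fieldnames(F))

% Summarize the result.
fprintf('%s\n%d bases\n', F.Header, length(F.Sequence));
```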
SequenceOnly
Option to return only the sequence in Data, specified as false or true. Specify true to return only the sequence, without the other record information.
TimeOut
Connection timeout in seconds, specified as a positive scalar.
Example: TimeOut=6
Data Types: double
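A timeout is most useful when paired with error handling, so that a slow connection produces a message rather than stopping a script. The sketch below is one way to do that. The 10-second value is an arbitrary choice for illustration, not a recommended setting.

```matlab
% Allow up to 10 seconds for the connection (illustrative value).
Data = [];
try
    Data = getgenbank('M10051', TimeOut=10);
catch err
    % Report the failure and continue; Data stays empty.
    fprintf('Request failed: %s\n', err.message);
end

if ~isempty(Data)
    fprintf('Retrieved %s (%s bases).\n', Data.Accession, Data.LocusSequenceLength);
end
```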
Output Arguments
Data
Sequence information, returned as a MATLAB structure.
Version History
Introduced before R2006a
The function no longer appends data to an existing file. The function now overwrites the contents of the existing file without warning.