getSubsequence

Retrieve partial sequences from object

Description

subSeqs = getSubsequence(object,subset,positions) returns the partial sequences subSeqs at the sequence positions specified by positions, taken only from the elements of object specified by subset.

Examples

Store read data from a SAM-formatted file in a BioRead object.

br = BioRead('ex1.sam')
br = 
  BioRead with properties:

     Quality: [1501x1 File indexed property]
    Sequence: [1501x1 File indexed property]
      Header: [1501x1 File indexed property]
       NSeqs: 1501
        Name: ''


Retrieve the sequences (reads) from the object.

seqs = getSequence(br);

Retrieve the first, third, and fifth sequences from the object.

seqs2 = getSequence(br,[1 3 5])
seqs2 = 3x1 cell array
    {'CACTAGTGGCTCATTGTAAATGTGTGGTTTAACTCG'}
    {'AGTGGCTCATTGTAAATGTGTGGTTTAACTCGTCC' }
    {'GCTCATTGTAAATGTGTGGTTTAACTCGTCCATGG' }

Retrieve the first five positions of those sequences.

seqs3 = getSubsequence(br,[1 3 5],[1:5])
seqs3 = 3x1 cell array
    {'CACTA'}
    {'AGTGG'}
    {'GCTCA'}
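
As a rough mental model (an assumption about the observable behavior, not the toolbox implementation), this call behaves like indexing each selected read with the same position vector. The sketch below uses a plain cell array holding the three reads shown in the getSequence output, so it runs without a BioRead object. The variable names are illustrative only.

```matlab
% Reads 1, 3, and 5 of ex1.sam, copied from the getSequence output above.
seqs = {'CACTAGTGGCTCATTGTAAATGTGTGGTTTAACTCG'; ...
        'AGTGGCTCATTGTAAATGTGTGGTTTAACTCGTCC'; ...
        'GCTCATTGTAAATGTGTGGTTTAACTCGTCCATGG'};

% Positions to keep in every read.
positions = 1:5;

% Apply the same position vector to each character vector.
subSeqs = cellfun(@(s) s(positions), seqs, 'UniformOutput', false);
```

Here subSeqs is a 3x1 cell array of character vectors, matching the shape of the getSubsequence output.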

You can also specify a sequence header as the subset to retrieve the sequences that have that header. If multiple sequences share the same header, the function returns all of them.

Get the first five positions of the sequences with the header B7_591:4:96:693:509.

seqs4 = getSubsequence(br,{'B7_591:4:96:693:509'},[1:5])
seqs4 = 1x1 cell array
    {'CACTA'}

Retrieve the first, fourth, and sixth positions of the first three sequences.

seq5 = getSubsequence(br,[1:3],[1 4 6])
seq5 = 3x1 cell array
    {'CTG'}
    {'CGG'}
    {'AGC'}

Input Arguments

object: Object containing the read data, specified as a BioRead or BioMap object.

Example: bioreadObj

subset: Subset of elements in the object, specified as a vector of positive integers, a logical vector, a string vector, or a cell array of character vectors containing valid sequence headers.

Example: [1 3]

Tip

When you use a sequence header (or a cell array of headers) for subset, a repeated header specifies all elements with that header.
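
The header-based selection in this tip can be pictured as a membership test over the stored headers. The following is a minimal sketch under stated assumptions: the header list and the shortened reads are hypothetical (only the header B7_591:4:96:693:509 appears on this page, and the duplicate entry is invented to show the repeated-header case), and ismember stands in for whatever matching the toolbox performs internally.

```matlab
% Hypothetical headers; the repeated entry is made up for illustration.
headers = {'B7_591:4:96:693:509'; 'readB'; 'B7_591:4:96:693:509'};

% Hypothetical shortened reads, one per header.
seqs = {'CACTAGTGGCTC'; 'AGTGGCTCATTG'; 'GCTCATTGTAAA'};

% Headers requested as the subset.
wanted = {'B7_591:4:96:693:509'};

% Logical index that is true for every element whose header matches.
idx = ismember(headers, wanted);

% Take the first five positions of each matching read.
subSeqs = cellfun(@(s) s(1:5), seqs(idx), 'UniformOutput', false);
```

Because the header occurs twice in this made-up list, subSeqs contains two entries, one for each matching element.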

positions: Sequence positions, specified as a vector of positive integers or a logical vector. The last position must fall within the length of every sequence selected by subset.

Example: [2:10]
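
The range requirement can be checked before calling the function. The sketch below is an illustration in plain MATLAB, not toolbox code: it reuses the three reads shown in the examples, tests that the largest requested position fits inside the shortest read, and assumes that a logical positions vector selects the positions where it is true, which is how ordinary MATLAB logical indexing behaves.

```matlab
% Reads 1, 3, and 5 of ex1.sam, copied from the examples on this page.
seqs = {'CACTAGTGGCTCATTGTAAATGTGTGGTTTAACTCG'; ...
        'AGTGGCTCATTGTAAATGTGTGGTTTAACTCGTCC'; ...
        'GCTCATTGTAAATGTGTGGTTTAACTCGTCCATGG'};

positions = [1 4 6];

% The largest position must not exceed the length of the shortest read.
minLen = min(cellfun(@numel, seqs));
isValid = max(positions) <= minLen;

% Equivalent logical mask (assumption: true marks a selected position).
mask = false(1, minLen);
mask(positions) = true;

% Numeric and logical indexing pick the same characters.
fromIdx  = cellfun(@(s) s(positions), seqs, 'UniformOutput', false);
fromMask = cellfun(@(s) s(mask), seqs, 'UniformOutput', false);
```

If isValid is false, at least one selected read is too short for the requested positions, and the subset or the positions need adjusting first.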

Output Arguments

subSeqs: Subsequences from a subset of elements, returned as a cell array of character vectors.

Introduced in R2010a