Decoding DNA sequence into binary

Meghashree G on 23 Sep 2015
Edited: Khaled belkacemi on 4 Apr 2022
I have a sequence TGACTCAGTCGTTCAATCTATGCC. How do I write MATLAB code to convert it into binary? Please help me. Thank you.
  3 comments
Meghashree G on 23 Sep 2015
The output should be 01100011 01110010 01111001 01110000 01110100 01101111
(A=00, T=01, G=10, C=11)
Dabba Do on 14 Feb 2018
Edited: Dabba Do on 14 Feb 2018
If every A pairs with a T, then the output for an A or a T is the same. The same is true of G and C: they are a pair. So there are only two pairs; why not define each pair as a 1 or a 0? If A and T = 1 and G and C = 0, then the sequence TGACTCAGTCGTTCAATCTATGCC would be 101010101001101110111000. By coding each base with both a 1 and a 0, you convolute the bits in your code, when the bases work as pairs representing a single genetic bit.
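A minimal sketch of the one-bit-per-base idea described in this comment, assuming the input contains only the letters A, T, G and C (the variable names `isAT` and `bits` are illustrative, not from the thread):

```matlab
s = 'TGACTCAGTCGTTCAATCTATGCC';   % sequence from the question
isAT = ismember(s, 'AT');          % logical vector: true for A or T, false otherwise
bits = char('0' + isAT);           % '1' for A/T, '0' for G/C
disp(bits)                         % 101010101001101110111000
```

Note that this encoding is lossy: A cannot be told apart from T, nor G from C, so the original sequence cannot be recovered from `bits` alone.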


Accepted Answer

James Tursa on 23 Sep 2015
Edited: James Tursa on 23 Sep 2015
One way:
s = 'TGACTCAGTCGTTCAATCTATGCC'; % Input DNA string
[~,x] = ismember(s,'ATGC'); % Convert the ATGC into indexes
c = {'00','01','10','11'}; % Numeric strings to convert the indexes into
result = cell2mat(c(x)); % Convert the indexes into the numeric strings
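The code above returns one continuous 48-character string. If the space-separated 8-bit groups shown in the question are wanted, one possible follow-up is sketched below. It assumes the number of bases is a multiple of 4, so that the bit string splits evenly into groups of 8; `bytes` and `grouped` are illustrative names.

```matlab
s = 'TGACTCAGTCGTTCAATCTATGCC';              % DNA string from the question
[~, x] = ismember(s, 'ATGC');                % A=1, T=2, G=3, C=4
c = {'00','01','10','11'};                   % A=00, T=01, G=10, C=11
bits = [c{x}];                               % 1x48 char vector of '0' and '1'
bytes = reshape(bits, 8, []).';              % one 8-bit group per row
grouped = strjoin(cellstr(bytes).', ' ');    % join the rows with single spaces
disp(grouped)
% 01100011 01110010 01111001 01110000 01110100 01101111
```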

More Answers (4)

Bastien Chardonnens on 23 Sep 2015
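You can use
str = ('TGACTCAGTCGTTCAATCTATGCC');
[~,~,ind] = unique(double(str)-65); % unique sorts its output, so the indexes are alphabetical: A=1, C=2, G=3, T=4
dec2bin(ind-1) % One row per base. This gives A=00, C=01, G=10, T=11, which differs from the A=00, T=01, G=10, C=11 mapping in the question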

Suresma Jena on 27 Aug 2017
I want to know how to convert a DNA sequence into binary. For example, if x = ATGCAT, then its binary sequences will be xA = 100010, xT = 010001, xG = 001000, xC = 000100.
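The four sequences in this example are per-base indicator sequences: each has a 1 wherever that base occurs in x. A minimal sketch, assuming x is a character vector of uppercase bases:

```matlab
x = 'ATGCAT';                    % example sequence from the post
xA = char('0' + (x == 'A'));     % '100010'
xT = char('0' + (x == 'T'));     % '010001'
xG = char('0' + (x == 'G'));     % '001000'
xC = char('0' + (x == 'C'));     % '000100'
```

If numeric vectors are preferred over character strings, `double(x == 'A')` and so on can be used instead.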
  6 comments
Walter Roberson on 18 Sep 2017
s = fileread('NameOfTextFileGoesHere');
lakshmi boddu on 8 Apr 2018
A pseudo-random binary sequence of size 256 * 256 is generated with two 1-D logistic maps, from which disjoint, consecutive 3-bit binary sequences are extracted to choose the DNA coding rule, as follows:
000 (00-A, 01-C, 10-G, 11-T)
001 (00-A, 01-G, 10-C, 11-T)
010 (00-C, 01-A, 10-T, 11-G)
011 (00-C, 01-T, 10-A, 11-G)
100 (00-G, 01-A, 10-T, 11-C)
101 (00-G, 01-T, 10-A, 11-C)
110 (00-T, 01-C, 10-G, 11-A)
111 (00-T, 01-G, 10-C, 11-A)
Please help me, sir, with how to implement this in MATLAB.
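The rule selection described in this comment can be sketched as a lookup table. This covers only the lookup step; the logistic-map generation of the pseudo-random bits is not shown, because the comment gives no parameters for it. The values of `selector` and `data` below are hypothetical inputs chosen for illustration, and `data` is assumed to have an even number of bits.

```matlab
% Row k+1 holds rule k (k = 0..7); columns are the bases for 00, 01, 10, 11
rules = ['ACGT'; 'AGCT'; 'CATG'; 'CTAG'; 'GATC'; 'GTAC'; 'TCGA'; 'TGCA'];

selector = '010';                  % hypothetical 3-bit rule selector
data = '00011011';                 % hypothetical bit string to encode, 2 bits per base

r = bin2dec(selector) + 1;         % rule 010 is stored in row 3
pairs = reshape(data, 2, []).';    % one 2-bit pair per row
idx = bin2dec(pairs).' + 1;        % 00..11 become column indexes 1..4
dna = rules(r, idx);               % look up one base per pair
disp(dna)                          % CATG
```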



Siyab Khan on 15 Jan 2019
Edited: Siyab Khan on 15 Jan 2019
Please also write code for encoding a DNA sequence in 4 bits
using this schema:
4 2 1 0
N = 0 0 0 1 (N represents a gap in the DNA sequence)
A = 0 1 0 0
T = 1 0 0 0
C = 0 0 1 0
G = 0 1 1 0
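One way to apply this 4-bit schema is to reuse the ismember lookup from the accepted answer with a five-letter alphabet. This is a sketch only: the input string `s` is a hypothetical example, and the error check is an added assumption about how unknown letters should be handled.

```matlab
s = 'ATNCG';                                     % hypothetical input; N marks a gap
codes = {'0001','0100','1000','0010','0110'};    % codes for N, A, T, C, G (schema above)
[found, idx] = ismember(s, 'NATCG');             % index of each letter in 'NATCG'
assert(all(found), 'Sequence contains a letter outside N, A, T, C, G');
result = [codes{idx}];                           % concatenate the 4-bit codes
disp(result)                                     % 01001000000100100110
```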

Khaled belkacemi on 4 Apr 2022
Edited: Khaled belkacemi on 4 Apr 2022
Hello, can someone help me write the code for this situation? Example of input data: 0121212002202021101110002
I want to encode my data as this array
