Does the text analytics toolbox allow users to test out-of-sample perplexity with LDA?
Stephen Bruestle on 15 Oct 2018
Commented: Stephen Bruestle on 30 Nov 2018
I want to split my data into two samples: one for training and one for testing. I then want to fit an LDA model on the training sample and compute the perplexity of the test sample under the fitted model. Is this possible with the Text Analytics Toolbox?
Accepted Answer
Christopher Creutzig on 26 Nov 2018
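Yes. The second output of logp gives you the perplexity, so you can fit the model on the training documents and then call logp on the held-out documents: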
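For reference, perplexity is conventionally defined from the log-probabilities of the held-out documents as shown here. That logp normalizes by the total token count in exactly this way is an assumption, not something stated in this thread; the symbols D (number of test documents), w_d (the words of document d) and N_d (its token count) are introduced only for illustration.

```latex
\[
\mathrm{perplexity}
  = \exp\!\left( -\,\frac{\sum_{d=1}^{D} \log p(w_d)}{\sum_{d=1}^{D} N_d} \right)
\]
```

Lower values mean the fitted model assigns higher probability to the unseen documents.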
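% Load the example sonnets and split the text into one document per sonnet
txt = extractFileText('sonnets.txt');
sonnets = split(txt,[newline newline]);
sonnets = sonnets(5:2:end);
td = tokenizedDocument(sonnets);
% Fit a 5-topic LDA model on the first 50 documents (training sample)
bow = bagOfWords(td(1:50));
mdl = fitlda(bow,5,'Verbose',0);
% Encode three held-out documents with the training vocabulary and score them
[~,perpl] = logp(mdl, encode(bow,td(51:53)))
% perpl = 337.4999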
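If you would rather draw the test sample at random than take a fixed index range, the same steps can be sketched with cvpartition. This is a minimal sketch, not part of the original answer: it assumes the Statistics and Machine Learning Toolbox is available for cvpartition, and the 10% hold-out fraction, the seed, and the variable names tdTrain, tdTest, bowTrain and perplTest are arbitrary choices for illustration.

```matlab
% Sketch: random hold-out split, then out-of-sample perplexity with logp.
txt = extractFileText('sonnets.txt');
sonnets = split(txt,[newline newline]);
sonnets = sonnets(5:2:end);
td = tokenizedDocument(sonnets);

rng(0);                                       % fix the seed so the split is repeatable
cvp = cvpartition(numel(td),'HoldOut',0.1);   % 10% held out (arbitrary choice)
tdTrain = td(training(cvp));                  % logical index of training documents
tdTest  = td(test(cvp));                      % logical index of test documents

bowTrain = bagOfWords(tdTrain);               % vocabulary built from training data only
mdl = fitlda(bowTrain,5,'Verbose',0);         % 5 topics, as in the answer above

% Encode the test documents against the training vocabulary, then score them
[~,perplTest] = logp(mdl, encode(bowTrain,tdTest));
```

Because the test documents are encoded with the training vocabulary, words that appear only in the test sample should not contribute to the count matrix, so the reported perplexity covers in-vocabulary words only.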

