Word Count in a PDF file

Ahmed Alsaadi on 20 Dec 2018
Edited: Omer Yasin Birey on 21 Dec 2018
I have a PDF file, "EHP.pdf", and I want to count the total number of words in it. The file has many sections, and I want to exclude the last section from the count. Any suggestions?
  2 comments
KALYAN ACHARJYA on 20 Dec 2018
Using MATLAB?
Ahmed Alsaadi on 20 Dec 2018
Yes, using MATLAB.

Accepted Answer

Omer Yasin Birey on 20 Dec 2018
Edited: Omer Yasin Birey on 21 Dec 2018
Hi Ahmed, you can use extractFileText. Choose a start word and an end word. The end word must be unique, because counting stops at the first place MATLAB finds it. This way you count the words between the start word and the end word.
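str = extractFileText("EHP.pdf"); % requires Text Analytics Toolbox
firstIdx = strfind(str,"firstWord"); % replace with the first word of your PDF
lastIdx = strfind(str,"lastWord"); % replace with the word where counting should stop; it must be unique
start = firstIdx(1); % character position of the first match
fin = lastIdx(1);
extracted = extractBetween(str,start,fin-1); % text from the start word up to, but not including, the end word
uniqueWordNumbers = wordCloudCounts(extracted); % table of words and their counts
counter = uniqueWordNumbers(:,2); % second column holds the counts (still a table)
counterArray = table2array(counter); % convert to a numeric array before summing
totalWords = sum(counterArray);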
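One caveat on the approach above: wordCloudCounts is meant for building word clouds, and as far as I know it preprocesses the text (for example, dropping common stop words) before counting, so the sum may come out lower than a plain word count. If you need every whitespace-separated word counted, a minimal sketch in base MATLAB could look like the following. The sample text and the "References" marker are made up for illustration and stand in for the output of extractFileText and for whatever heading opens your last section.

```matlab
% Hypothetical sample text standing in for extractFileText("EHP.pdf").
str = "Introduction This paper studies word counts. " + ...
      "Methods We split text on whitespace. " + ...
      "References Smith 2018 Some cited work.";

% Assumed marker: the heading that opens the section to exclude.
marker = "References";

% strfind returns the character position(s) where the marker starts.
idx = strfind(str, marker);

if isempty(idx)
    kept = str;                        % marker not found: keep all text
else
    kept = extractBefore(str, idx(1)); % text before the first match
end

% Treat each run of non-whitespace characters as one word.
words = regexp(kept, '\S+', 'match');
totalWords = numel(words);
```

Note that this counts tokens such as "counts." with the punctuation attached, which does not change the total but matters if you later want per-word frequencies.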
  3 comments
Omer Yasin Birey on 20 Dec 2018
Ah, you are right, Ahmed. I made a typo and also forgot a line there. Try this instead:
counter = uniqueWordNumbers(:,2);
counterArray = table2array(counter);
totalWords = sum(counterArray);
Add the table2array line and pass its output to sum.
Ahmed Alsaadi on 20 Dec 2018
It works now, thank you very much Omer.
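For readers wondering what the table2array step does: indexing a table with parentheses returns another table, so the counts have to be pulled out as a numeric array before summing. A small sketch with a hand-built table shows this, along with dot indexing as an equivalent shortcut. The table contents here are invented; the variable names Word and Count are assumed to match what wordCloudCounts returns.

```matlab
% Hand-built stand-in for the table returned by wordCloudCounts.
T = table(["pdf"; "word"; "count"], [3; 5; 2], ...
    'VariableNames', {'Word', 'Count'});

% Route used in the answer: take column 2 as a table, then convert it.
counter = T(:, 2);
viaArray = sum(table2array(counter));

% Equivalent shortcut: dot indexing returns the numeric column directly.
viaDot = sum(T.Count);
```

Both variables hold the same total, so sum(uniqueWordNumbers.Count) should work as a one-line replacement if the column is indeed named Count.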