extract defined range of pages from a PDF to create multiple PDF files
enrico maggiolini
on 2 Mar 2023
Answered: Rahul
on 7 Nov 2024 at 7:48
Let's say I have a 1000-page PDF file called ALL.pdf.
I want to create multiple PDF files from ALL.pdf, following the page ranges defined in an Excel file, e.g.:
File name    First page    Last page
John         1             10
Luke         11            15
Matt         16            22
......       ......        ......
Adam         996           1000
So MATLAB should extract the first 10 pages from ALL.pdf and generate JOHN.pdf, use pages 11 to 15 to generate LUKE.pdf, and so on up to ADAM.pdf with the last 5 pages of ALL.pdf.
What I don't know is which commands are useful for opening and manipulating PDF files (if I had to extract defined sheets from an .xlsx file there would be no problem at all).
I've searched everywhere but did not find anything on how to do it.
Any advice?
Thanks
Accepted Answer
Rahul
on 7 Nov 2024 at 7:48
To extract the text from particular page ranges of a large PDF file and save each range as a separate PDF file, you can consider the 'extractFileText' function (Text Analytics Toolbox). Its 'Pages' name-value argument accepts a vector of the page numbers to read. Here is an example:
pages = 1:10;
str = extractFileText("ALL.pdf", 'Pages', pages);
% This stores the text content of the first 10 pages in 'str'
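To apply the same call to every row of the spreadsheet from the question, the page vectors can be built first. The following is only a sketch under stated assumptions: the variable names Name, FirstPage and LastPage are hypothetical, and the table is typed inline so the snippet runs on its own. In practice it would presumably be read from the Excel file with readtable.

```matlab
% Stand-in for the spreadsheet; in practice something like
%   ranges = readtable("ranges.xlsx");   % hypothetical file name
ranges = table(["John"; "Luke"; "Matt"], [1; 11; 16], [10; 15; 22], ...
    'VariableNames', {'Name', 'FirstPage', 'LastPage'});

% Basic sanity check: every range must run forwards
assert(all(ranges.FirstPage <= ranges.LastPage), ...
    'Each FirstPage must not exceed its LastPage.');

% One page vector per row, ready to pass as the 'Pages' value
pageSets = arrayfun(@(a, b) a:b, ranges.FirstPage, ranges.LastPage, ...
    'UniformOutput', false);
```

Each cell of pageSets can then take the place of 'pages' in the call above.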
You can then use the 'Document' and 'Paragraph' classes (MATLAB Report Generator) to write 'str' to a new PDF file such as 'JOHN.pdf', as mentioned in the question. Here is an example:
import mlreportgen.dom.*;
doc = Document('JOHN', 'pdf');
% Adding the 'str' as a paragraph to the document
p = Paragraph(str);
append(doc, p);
close(doc);
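The two steps can be combined into a loop over all the ranges. This is a minimal sketch rather than a definitive implementation: splitPdfText is a hypothetical helper, and it assumes a table with hypothetical variables Name, FirstPage and LastPage (for example, read from the Excel file with readtable). It needs both Text Analytics Toolbox and MATLAB Report Generator.

```matlab
function splitPdfText(pdfFile, ranges)
% SPLITPDFTEXT  Write one text-only PDF per row of RANGES (hypothetical helper).
%   RANGES is assumed to be a table with variables Name, FirstPage, LastPage.
    import mlreportgen.dom.*
    for k = 1:height(ranges)
        pages = ranges.FirstPage(k):ranges.LastPage(k);

        % Text of the selected pages only
        str = extractFileText(pdfFile, 'Pages', pages);

        % New document named after the row, e.g. JOHN.pdf
        outName = char(upper(string(ranges.Name(k))));
        doc = Document(outName, 'pdf');
        append(doc, Paragraph(char(str)));
        close(doc);
    end
end
```

Calling splitPdfText("ALL.pdf", ranges) with the table from the question would write JOHN.pdf, LUKE.pdf and so on to the current folder. Because only the text is extracted, the output files will not reproduce the original page layout or images.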
You can refer to the following MathWorks documentation pages to learn more about these functions:
'extractFileText': https://www.mathworks.com/help/releases/R2023a/textanalytics/ref/extractfiletext.html
'Document': https://www.mathworks.com/help/releases/R2023a/rptgen/ug/mlreportgen.dom.document-class.html
Hope this helps! Thanks.