Assume you have a PDF file that displays the string "Account# 345". Several details can prevent you from extracting this string:
- The contents can be compressed and/or encrypted, so that the string does not appear in clear text inside the file.
- Even without encryption or compression, the text need not be stored contiguously. In a valid PDF each character can be stored together with its position on the page, so that the order of the characters inside the file does not matter.
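Both effects can be reproduced without a real PDF. The following is only a small sketch in Python: the content streams are invented for illustration (they use the PDF text operators Tj and TJ, but they are not taken from your files), and zlib stands in for the Flate compression that PDFs commonly use.

```python
import re
import zlib

needle = b"Account# 345"

# Invented content stream: the text is stored in one piece, uncompressed.
plain_stream = (
    b"BT /F1 12 Tf 72 720 Td (Account# 345) Tj ET\n"
    b"BT /F1 12 Tf 72 700 Td (Balance: 100.00) Tj ET\n"
)

# The same stream after Flate (zlib) compression.
compressed_stream = zlib.compress(plain_stream)

# Invented stream that displays the same text, but stores it in fragments.
# The gap between "#" and "3" is a positioning offset, not a space character.
fragmented_stream = b"BT /F1 12 Tf 72 720 Td [(Acc) -20 (ount#) -250 (3) 10 (45)] TJ ET"

print(needle in plain_stream)                         # True
print(needle in compressed_stream)                    # False
print(needle in zlib.decompress(compressed_stream))   # True
print(needle in fragmented_stream)                    # False

# Naively gluing the fragments together still does not give the displayed string.
joined = b"".join(re.findall(rb"\(([^)]*)\)", fragmented_stream))
print(joined)                                         # b'Account#345'
```

Note that even after joining the fragments the space is missing, because it existed only as a position on the page.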
Consequently, searching for a string in the raw PDF file is not reliable. For this reason OCR software is frequently applied to add a layer that contains the contents as searchable strings. But as long as you do not specify any details of your PDFs, we cannot guess whether they contain such strings.
Please note that your problem is not well defined, so suggesting solutions is still based on guessing, although you have posted several related questions in this forum. Ultimately the main problem is that somebody decided to store the data in PDF files, a format that is not well suited for the later extraction of strings. Creating a large and complicated workaround afterwards is inefficient. It would be more stable and faster to obtain the data in a more suitable format, such as a text file.
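For comparison, this is roughly how small the task becomes once the data is available as plain text. Again only a sketch in Python: the text content and the pattern are assumptions about what such an export might look like, not something known about your data.

```python
import re

# Hypothetical plain-text export of the same data.
text = "Customer: Jane Doe\nAccount# 345\nBalance: 100.00\n"

# Look for "Account#" followed by optional whitespace and a number.
match = re.search(r"Account#\s*(\d+)", text)
account = match.group(1) if match else None
print(account)  # 345
```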