How to create Excel spreadsheet from HTML report?
6 visualizaciones (últimos 30 días)
Mostrar comentarios más antiguos
MathWorks Support Team
el 25 de Jun. de 2018
Editada: MathWorks Support Team
el 5 de Feb. de 2024
I have an HTML report with some information, and I would like to use MATLAB to extract a subset of the text and export to an Excel spreadsheet. How can I do this?
Attached is the HTML file, and copied below are its contents. I have also included a screenshot of the result that I would like to achieve after exporting to an Excel spreadsheet:
<div class="info">
<pre>
###########################################
<strong>-------------------------------------------</strong>
<strong>--------------- Section 1 -----------------</strong>
<strong>-------------------------------------------</strong>Date = 01/01/2018
Time = 12:00:00
Type = Type1
Number1 = 1234
Comments = <strong>-------------------------------------------</strong>
<strong>--------------- Section 2 -----------------</strong>
<strong>-------------------------------------------</strong>Number2 = 100.0001
Version = 1.0.0
Program = example.exe</pre>
</div>

Respuesta aceptada
MathWorks Support Team
el 2 de Dic. de 2023
Editada: MathWorks Support Team
el 5 de Feb. de 2024
Attached (and also copy-pasted below) is an example script that illustrates one such approach to convert the HTML file to an Excel spreadsheet as per the following workflow:
1) Read in the HTML file
2) Remove specific sections/patterns of text (via regular expression search and replacement)
3) Parse the remaining text based on a specified delimiter and format
4) Export the result to an Excel Spreadsheet
Please note that this example is provided simply to illustrate the application of these functions and outline an example of a general approach. This example code would likely need to be adapted and tuned for a particular workflow and use-case, since the implementation is highly dependent on the requirements therein.
%% Read in HTML file.
filenameHTML = 'Example.html';
txt = fileread('Example.html');
%% Remove HTML tags, header text, and last section (pertaining to images).
txt = regexprep(txt,'<script.*?/script>','');
txt = regexprep(txt,'<style.*?/style>','');
txt = regexprep(txt,'<.*?>','');
txt = regexprep(txt,'.*#\n','');
txt = regexprep(txt,'--.*?\n','');
txt = regexprep(txt,'\n\n.*','');
%% Set up delimiters and format specification to read columns of data as text:
% For more information, see the TEXTSCAN documentation.
delimiter = {' = '};
formatSpec = '%q%q%[^\n\r]';
%% Read columns of data according to the format.
dataArray = textscan(txt, formatSpec, 'Delimiter', delimiter, ...
'TextType', 'char', 'ReturnOnError', false);
raw = repmat({''},length(dataArray{1}),length(dataArray)-1); %preallocation before loop
for col = 1:(length(dataArray)-1)
raw(1:length(dataArray{col}),col) = dataArray{col};
end;
%% Write data to Excel spreadsheet.
filenameSpreadsheet = 'Example.xlsx';
xlswrite(filenameSpreadsheet,raw)
The following links provide additional information and references to the functions that I used, along with their syntax, arguments, parameters, and options:
Run the below command in your MATLAB instance to access the release specific documentation:
web(fullfile(docroot, 'matlab/ref/fileread.html'))
web(fullfile(docroot, 'matlab/ref/regexprep.html'))
web(fullfile(docroot, 'matlab/ref/textscan.html'))
web(fullfile(docroot, 'matlab/ref/xlswrite.html'))
Please follow the below link to search for the required information regarding the current release:
0 comentarios
Más respuestas (0)
Ver también
Categorías
Más información sobre Spreadsheets en Help Center y File Exchange.
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!