MATLAB Answers

How to read a table from an url?

92 views (last 30 days)
tom3w
tom3w on 2 Dec 2016
Commented: tom3w on 7 Dec 2016
Hi, I'd need some help. How is it possible to read a table from an url?
The following sequence allows constructing a URL object, opening a URL connection, setting up a buffered stream reader, and reading lines (line by line):
url = java.net.URL('http://www.mathworks.com')
is = openStream(url);
isr = java.io.InputStreamReader(is);
br = java.io.BufferedReader(isr);
s = char(readLine(br)); % can be repeated
I think bufferedReader is only appropriate to read contents row by row. In case the webpage contains a table, this code works, but does not read all the elements of the table, i.e. tbody
Example (java contents)
<div class="table-responsive no-padding-top"> : start of table, displayed in Matlab (e.g. command window)
<table width=... > : table formatting, displayed in Matlab
<thead>: start of table header, displayed in Matlab
<tr>: entire row related to table header, displayed in Matlab
<th> ... </th>: 1st element of header, displayed in Matlab
<th> ... </th>: 2nd element of header, displayed in Matlab
...
</tr>, displayed in Matlab
</thead>: end of header/description of column names, displayed in Matlab
<tbody>: full table with its contents, "<tbody>" displayed in Matlab
<tr>: 1st row of table, *not displayed* in Matlab
<td>...</td>: 1st cell of 1st row, *not displayed* in Matlab
<td>...</td>: 2nd cell of 1st row, *not displayed* in Matlab
</tr>: end of 1st row, *not displayed* in Matlab
<tr>: 2nd row of table, *not displayed* in Matlab
<td>...</td>: 1st cell of 1st row, *not displayed* in Matlab
<td>...</td>: 2nd cell of 1st row, *not displayed* in Matlab
</tr>: end of 2nd row, *not displayed* in Matlab
</tbody>: end of table contents, "</tbody>" displayed in Matlab
</table>: end of table object, displayed in Matlab
How can we read the details behind a table body (tbody)?
Many thanks for your support!
Thomas

  0 Comments

Sign in to comment.

Accepted Answer

Sid Jhaveri
Sid Jhaveri on 6 Dec 2016
1) You can try using the " webread " function.
2) You can also refer to the following links:
https://www.mathworks.com/matlabcentral/fileexchange/34968-htmltabletocell
https://www.mathworks.com/matlabcentral/fileexchange/29642-get-html-table-data-into-matlab-via-urlread-and-without-builtin-browser

  1 Comment

tom3w
tom3w on 7 Dec 2016
The webread function works pretty well. Thank you Sid for your suggestions. Kr, Thomas

Sign in to comment.

More Answers (0)

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by