Why do I get 0*1 cells?

reza
reza on 28 Jan 2019
Answered: Milan Roy on 29 Jan 2019
I am trying to do web scraping by following this tutorial: https://medium.com/@roymilaniitd/web-scraping-to-extract-news-using-matlab-dd78b954684 , but when I test the following code:
html = webread('https://www.indiatoday.in/top-stories');
list = extractBetween(html,'<h3 class=”” title=','</a></h3><p>');
list2=extractAfter(list,'<a href="');
list3 = extractAfter(list2,'">');
I get the page contents in the char variable html, but list, list2 and list3 are all 0*1 cells!
Why does this happen?

Answers (2)

Jan
Jan on 28 Jan 2019
Edited: Jan on 28 Jan 2019
You are searching for:
'<h3 class=”” title='
% ^^
I'm sure you mean:
'<h3 class="" title='
with standard double quotes ".
The author of that page seems to have used a tool like MS Word to create the web page, and its automatic replacement turned the straight quotes into smart quotes. This is a very bad idea when posting code on the internet.
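To see the effect without depending on the live site, here is a minimal sketch. The HTML snippet is a made-up stand-in for the downloaded page (an assumption, not the real markup of indiatoday.in), and the variable names are only illustrative. The curly quote is built with char(8221) so the difference stays visible even if an editor converts quotes again.

```matlab
% Hypothetical HTML snippet standing in for the result of webread.
html = '<h3 class="" title="Sample"><a href="/story/1">Sample headline</a></h3><p>';

% Start pattern as copied from the tutorial: right double quotation marks (U+201D).
curlyPattern = ['<h3 class=' char(8221) char(8221) ' title='];
% Start pattern with standard ASCII double quotes (U+0022).
straightPattern = '<h3 class="" title=';

% The curly pattern never occurs in the HTML, so nothing is extracted.
noMatch = extractBetween(html, curlyPattern, '</a></h3><p>');
% The straight pattern occurs once, so one piece of text is extracted.
match = extractBetween(html, straightPattern, '</a></h3><p>');

disp(isempty(noMatch))   % true when the start pattern is not found
disp(isempty(match))     % false when the start pattern is found
```

Because the first call finds nothing, every later extractAfter call has nothing to work on either, which is why all three variables come back empty.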

Milan Roy
Milan Roy on 29 Jan 2019
Yes, just use the standard double quote " instead of the formatted (curly) double quote. It should work fine.
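If retyping the quotes by hand is error-prone, one possible approach is to normalize a pasted pattern before using it. The following is only a sketch under assumptions: the two-headline HTML string is invented for illustration, and pastedPattern and fixedPattern are hypothetical names, not part of the tutorial.

```matlab
% Pattern as it might arrive when pasted from a page with smart quotes.
pastedPattern = ['<h3 class=' char(8221) char(8221) ' title='];

% Replace left (U+201C) and right (U+201D) curly quotes with the ASCII quote.
fixedPattern = replace(pastedPattern, {char(8220), char(8221)}, '"');

% Hypothetical HTML with two headlines, standing in for the webread output.
html = ['<h3 class="" title="First"><a href="/story/1">First headline</a></h3><p>' ...
        '<h3 class="" title="Second"><a href="/story/2">Second headline</a></h3><p>'];

% Same three steps as in the question, now with the normalized pattern.
list  = extractBetween(html, fixedPattern, '</a></h3><p>');
list2 = extractAfter(list, '<a href="');   % drop everything up to the link target
list3 = extractAfter(list2, '">');         % drop the link target, keep the headline

disp(list3)
```

With the real page, the start and end patterns still have to match whatever markup the site currently serves, so an empty result can also mean the page layout has changed.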
