check if url exists for large files without downloading the file

25 visualizaciones (últimos 30 días)
Diana
Diana el 11 de Oct. de 2016
Respondida: Fernando Bello el 28 de Nov. de 2019
I've searched this forum for a while and need some help with checking URLs. I have a set of files to download from a site and loop through the formatting options for the files basically the files are zip files and are extremely large (nearly 100MB). If I use the following statement:
raw_file = websave([p_raw,'\',fname_raw],[URL,MO_name,num2str(YR),'/raw/',fname_raw]);
Then for files that do not actually exist a dummy file with 0KB of data in it is made in the saved location. I want to avoid this. I've tried using:
[str status] = urlread([URL,MO_name,num2str(YR),'/raw/',fname_raw]);
but MATLAB still insists on trying to read the actual file instead of just telling me if it's there or not - wasting time in the process.
The function isurl does not apply to named URLs. How do I check merely the existence of a url without actually downloading it?
Thanks!

Respuestas (4)

Steven Lord
Steven Lord el 11 de Oct. de 2016
Try using Java.
url = 'http://www.mathworks.com';
J = java.net.URL(url);
conn = openConnection(J);
status = getResponseCode(conn)
When I ran this code I received status 200 which is OK. I tried it for a URL to a file that didn't exist and received code 404. I don't think creating an object using openConnection actually retrieves the data; I think you'd have to call that object's getContent method or something similar to obtain the data.

Image Analyst
Image Analyst el 11 de Oct. de 2016
Did you try exist()?
itExists = exist(filename, 'file');
filename will the the url of the file you want to check on.
  1 comentario
Diana
Diana el 11 de Oct. de 2016
Editada: Walter Roberson el 11 de Oct. de 2016
If you mean in this manner:
itexists = exist('http://amisr.com/database/tmp/loucks/Apr2016/raw/20160406.005.tar.gz','file')
nope - gives me a 0 return and I'm staring at the file on the URL.

Iniciar sesión para comentar.


Matthew Eicholtz
Matthew Eicholtz el 11 de Oct. de 2016
It seems I stumbled upon a similar approach to Steven, but a few minutes too late!
I think this function would do the trick:
function tf = urlexist(url)
URL = java.net.URL(url); %create the URL object
% Get the proxy information using the MATLAB proxy API.
proxy = com.mathworks.webproxy.WebproxyFactory.findProxyForURL(URL);
% Open a connection to the URL.
if isempty(proxy)
urlConnection = URL.openConnection;
else
urlConnection = URL.openConnection(proxy);
end
% Try to start the input stream
try
inputStream = urlConnection.getInputStream;
tf = true;
catch
tf = false;
end
end

Fernando Bello
Fernando Bello el 28 de Nov. de 2019
Maybe it is a little bit late, but I'm just reading this post because I had the same trouble.
I solved it this way:
url = ('https://matlab/file.nc'); % the URL where data is, it can be what ever you want, png, doc, mat, etc.
filename = 'filename_test'; % the name you want to save the file
options = weboptions('Username','username','Password','passwd','CertificateFilename','');
try
websave(filename,url,options);
disp('OK') %if the file exist it is downloades
catch
disp('URL read of link was unsuccessful') % if file do not exist
end

Categorías

Más información sobre Downloads en Help Center y File Exchange.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by