Use string or character arrays in code?

Johannes Hougaard on 8 Jun 2020
Commented: Johannes Hougaard on 8 Jun 2020
I am writing a few functions and scripts performing operations on files and content from files.
Often I use sprintf, uigetfile or other character/string-based basic functions (even a simple isa is an example) that need an input to be either a string or a character array (sometimes even a cellstr is a good way of doing it).
Years ago, when I was doing similar stuff in a previous MATLAB release, everything was handled in char arrays (with a little help from a cell when needed).
Nowadays it seems there's a new class in town - namely the string array.
Which is preferred and why? As far as I can tell the char is quite superior in memory usage, but are there any other advantages to using one over the other?

Accepted Answer

Stephen23 on 8 Jun 2020
Edited: Stephen23 on 8 Jun 2020
"Which is preferred and why?"
In my opinion character arrays and string arrays have quite different applications, even though people often seem to treat them as being interchangeable. In a nutshell, I would decide which one to use based on these paradigms:
  • string array: use when you want to consider each string element as an atomic unit (of course it is possible to dig into the string and mess around with its individual characters, but that is not their raison d'être). For example, strings allow for arrays with identical sizes to corresponding numeric arrays (e.g. simpler assigning of function outputs in a for-loop). The various special methods (e.g. compose, etc.) are sometimes very convenient but not particularly efficient.
  • character array: use when you need to work with individual characters, e.g. compression, encoding/decoding, or when you need to write very efficient code. With character arrays you can use standard MATLAB indexing and arithmetic operations etc. to work with individual character codes. There is no way to beat this in terms of speed or memory.
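A minimal sketch of the two paradigms, assuming R2017a or later (for double-quoted string literals). The variable names and values are made up for illustration only:

```matlab
% String array: one atomic element per value, same size as the numeric array.
vals = [1.5 20 300];
labels = strings(size(vals));                  % preallocate a 1x3 string array
for k = 1:numel(vals)
    labels(k) = sprintf("%.1f mm", vals(k));   % each output fills exactly one element
end
% labels is 1x3, just like vals: ["1.5 mm" "20.0 mm" "300.0 mm"]

% Character array: standard indexing and arithmetic on character codes.
txt = 'hello';
codes = double(txt);        % numeric character codes: 104 101 108 108 111
upperTxt = char(txt - 32);  % shift lowercase ASCII letters to uppercase: 'HELLO'
revTxt = txt(end:-1:1);     % plain indexing reverses the characters: 'olleh'
```

With the string array, `labels(k)` lines up with `vals(k)`; with the char vector, each index addresses a single character.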
  1 Comment
Johannes Hougaard on 8 Jun 2020
Thank you Stephen.
This makes a lot of sense - albeit still making the string kind of similar to the cell array of characters.
But this explains a mixed use of the two and gives me some guidance on when to use the string array, while still keeping the character notation on parameter/value pairs, input parameters etc.
The filename used for identification of each cluster of variables is a string, whereas the 'LineWidth' etc. is more of a character thing. It just baffled me a bit that the Live Editor and App Designer apparently encourage the use of strings for everything.


More Answers (1)

dpb on 8 Jun 2020
It's all about what's more convenient for the purpose... char() is most memory efficient, but one always has to drag around and remember to use 2D array indexing with the trailing colon to retrieve the full string from the array, or even a single string. With, of course, the caveat that char() arrays have to be padded to the length of the longest single element, so if there are many shorter strings but only one long one, the memory advantage may not hold for a particular application. Which brings up that one may also need to strtrim them in use. And that in turn brings in the proliferation of similar functionality and name overload: they added strip with the string class, but also extended strtrim to handle the new strings as well. Wish they wouldn't do that! :(
Anyway, continuing on, if one is going to use the string for character manipulation, it's probably more convenient to use the char() array or cellstr(); with a string one still has to use the curlies "{}" in postfix notation to extract individual elements, which has the effect of returning the underlying char() string embedded in it anyway.
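To illustrate the padding and trailing-colon points, here is a small sketch (the sample names are invented; string literals assume R2017a or later):

```matlab
% 2D char array: every row is padded with trailing blanks to the longest row.
names = char('a', 'bb', 'a much longer name');
sz = size(names);             % [3 18]
first = names(1, :);          % the trailing colon is needed to get the whole row
nFirst = numel(first);        % 18, not 1, because of the padding
trimmed = strtrim(first);     % 'a'

% String array: no padding, each element keeps its own length.
s = ["a", "bb", "a much longer name"];
len = strlength(s);           % [1 2 18]

% Overlapping functionality: both work on strings.
a1 = strip("  a  ");          % "a"
a2 = strtrim("  a  ");        % "a"
```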
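A short sketch of the difference between parentheses and curly braces on a string array (example values are hypothetical):

```matlab
s = ["alpha", "beta"];

e = s(2);            % parentheses return a string scalar: "beta"
c = s{2};            % curly braces return the underlying char vector: 'beta'

firstChar = c(1);    % 'b'
sameChar = s{2}(1);  % 'b', braces first, then ordinary character indexing
whole = e(1);        % still "beta": this indexes the 1x1 string array, not its characters
```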
OTOH, strings support a new set of operations that can make some higher level operations pretty easy -- I particularly have found the extractAfter/Before/Between group and compose to come in handy in really quickly being able to do parsing and constructing exercises.
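As a rough sketch of that kind of parsing and constructing (the file-name pattern here is an invented example, not from the original question):

```matlab
fname = "run_042_sensorA.csv";

% Parsing: pull pieces out by the text surrounding them.
stem = extractBefore(fname, ".csv");          % "run_042_sensorA"
runId = extractBetween(fname, "run_", "_");   % "042"
sensor = extractAfter(stem, "042_");          % "sensorA"

% Constructing: compose fills the format once per row of the input.
newNames = compose("run_%03d_sensorA.csv", (1:3)');   % 3x1 string array
```

Here `newNames(1)` is "run_001_sensorA.csv", and so on down the column.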
I've made no effort to compare anything on runtime performance; as usual, I'd expect to write the simplest code first and only worry about optimizing if it proved to be a bottleneck and unacceptable as is.
