- https://www.mathworks.com/help/matlab/import_export/what-is-a-datastore.html
- https://www.mathworks.com/company/newsletters/articles/matrix-indexing-in-matlab.html
Can I specify the records in each datastore partition?
I create a datastore from a single CSV file which I partition into 3 parts.
>> ds = tabularTextDatastore('airlinesmall.csv') ;
>> subds= partition( ds, 3, 1) ; preview( subds)
ans =
  8×29 table
    Year    Month    DayofMonth    DayOfWeek    DepTime    CRSDepTime    ArrTime    CRSArrTime    UniqueCarrier    FlightNum    TailNum    ActualElapsedTime    CRSElapsedTime    AirTime    ArrDelay    DepDelay    Origin      Dest      Distance    TaxiIn    TaxiOut    Cancelled    CancellationCode    Diverted    CarrierDelay    WeatherDelay    NASDelay    SecurityDelay    LateAircraftDelay
    ____    _____    __________    _________    _______    __________    _______    __________    _____________    _________    _______    _________________    ______________    _______    ________    ________    _______    _______    ________    ______    _______    _________    ________________    ________    ____________    ____________    ________    _____________    _________________
    1987     10          21            3          642          630         735          727          {'PS'}          1503       {'NA'}             53                 57          {'NA'}         8          12       {'LAX'}    {'SJC'}      308       {'NA'}    {'NA'}         0             {'NA'}            0           {'NA'}          {'NA'}        {'NA'}        {'NA'}             {'NA'}      
    1987     10          26            1         1021         1020        1124         1116          {'PS'}          1550       {'NA'}             63                 56          {'NA'}         8           1       {'SJC'}    {'BUR'}      296       {'NA'}    {'NA'}         0             {'NA'}            0           {'NA'}          {'NA'}        {'NA'}        {'NA'}             {'NA'}      
    1987     10          23            5         2055         2035        2218         2157          {'PS'}          1589       {'NA'}             83                 82          {'NA'}        21          20       {'SAN'}    {'SMF'}      480       {'NA'}    {'NA'}         0             {'NA'}            0           {'NA'}          {'NA'}        {'NA'}        {'NA'}             {'NA'}      
    1987     10          23            5         1332         1320        1431         1418          {'PS'}          1655       {'NA'}             59                 58          {'NA'}        13          12       {'BUR'}    {'SJC'}      296       {'NA'}    {'NA'}         0             {'NA'}            0           {'NA'}          {'NA'}        {'NA'}        {'NA'}             {'NA'}      
    1987     10          22            4          629          630         746          742          {'PS'}          1702       {'NA'}             77                 72          {'NA'}         4          -1       {'SMF'}    {'LAX'}      373       {'NA'}    {'NA'}         0             {'NA'}            0           {'NA'}          {'NA'}        {'NA'}        {'NA'}             {'NA'}      
    1987     10          28            3         1446         1343        1547         1448          {'PS'}          1729       {'NA'}             61                 65          {'NA'}        59          63       {'LAX'}    {'SJC'}      308       {'NA'}    {'NA'}         0             {'NA'}            0           {'NA'}          {'NA'}        {'NA'}        {'NA'}             {'NA'}      
    1987     10           8            4          928          930        1052         1049          {'PS'}          1763       {'NA'}             84                 79          {'NA'}         3          -2       {'SAN'}    {'SFO'}      447       {'NA'}    {'NA'}         0             {'NA'}            0           {'NA'}          {'NA'}        {'NA'}        {'NA'}             {'NA'}      
    1987     10          10            6          859          900        1134         1123          {'PS'}          1800       {'NA'}            155                143          {'NA'}        11          -1       {'SEA'}    {'LAX'}      954       {'NA'}    {'NA'}         0             {'NA'}            0           {'NA'}          {'NA'}        {'NA'}        {'NA'}             {'NA'}      
>> 
I would like to specify the rows of the source dataset I want in each partition. Is that possible?
>> subds= partition( ds, 3, 2) ; preview( subds)
ans =
  0×29 empty table
>> subds= partition( ds, 3, 3) ; preview( subds)
ans =
  0×29 empty table
>> 
Incidentally, why are my 2nd and 3rd partitions empty?
Answers (1)
  Piyush Dubey on 15 Sep 2023
I understand that you are trying to partition your datastore and that the second and third partitions come back as empty tables.
Please note that in order to create datastore partitions, the datastore needs to contain more than one file. The datastore in your code snippet has only one CSV file and thus cannot be partitioned. You can determine the number of partitions in a datastore with the numpartitions function, as shown below:
X=numpartitions(subds) 
You will see that the number of partitions is 1 both before and after calling partition, because there is only one file in the datastore. This is why any partition with an index greater than 1 is an empty table.
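As a rough illustration of the check described above, the sketch below asks a datastore how many partitions it reports and then reads each one. The file name partition_demo.csv and its ten rows are made up for this example so that it runs without airlinesmall.csv; the reported partition count depends on the data and the MATLAB release, so no particular value is assumed here.

```matlab
% Hypothetical example data: a small CSV written to the temporary folder.
csvFile = fullfile(tempdir, 'partition_demo.csv');
writetable(table((1:10)', (11:20)', 'VariableNames', {'Id', 'Value'}), csvFile);

ds = tabularTextDatastore(csvFile);

% Ask the datastore how many partitions it reports by default.
n = numpartitions(ds);

% Read each partition and count its rows.
total = 0;
for k = 1:n
    subds = partition(ds, n, k);
    t = readall(subds);
    fprintf('Partition %d of %d: %d rows\n', k, n, height(t));
    total = total + height(t);
end
```

Requesting more partitions than numpartitions reports is what produces the empty tables seen in the question.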
If you would like to perform a row-wise partition of a file within the datastore, a possible workaround is to use the read function. Once the data has been read into a variable, it can be sliced, indexed and labeled as shown below:
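ds = tabularTextDatastore('airlinesmall.csv');
% read returns only the next block of rows (up to ds.ReadSize); use readall(ds) for the whole file
temp = read(ds);
% extract a particular column
monthCol = temp.Month;
% extract the first of 3 row-wise parts
n = floor(height(temp)/3);
subds = temp(1:n, :);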
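To choose exactly which rows go into each part, the same indexing idea can be extended with one index vector per part. This is only a sketch: it works on an in-memory table rather than on true datastore partitions, it assumes the whole file fits in memory, and the file rowsplit_demo.csv and the row ranges in rowSets are invented for the example.

```matlab
% Hypothetical example data: a small CSV written to the temporary folder.
csvFile = fullfile(tempdir, 'rowsplit_demo.csv');
writetable(table((1:10)', (11:20)', 'VariableNames', {'Id', 'Value'}), csvFile);

ds = tabularTextDatastore(csvFile);
data = readall(ds);                  % read every row into one table

% Rows wanted in each part (chosen by hand; any valid index vectors work).
rowSets = {1:3, 4:7, 8:10};

parts = cell(size(rowSets));
for k = 1:numel(rowSets)
    parts{k} = data(rowSets{k}, :);  % ordinary table row indexing
end

disp(height(parts{2}))               % number of rows in the second part
```

Each element of parts is an ordinary table, so it can be processed on its own or written back to separate files if a multi-file datastore is needed later.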
Please refer to the MathWorks documentation on datastores and matrix indexing, linked at the top of this page, for more information.
Hope this helps. 

