Contenido principal

upsert

Insert or update data in Apache Cassandra database

Description

upsert(conn,keyspace,tablename,data) exports data from MATLAB® by inserting or updating it in an Apache Cassandra® database table.

example

upsert(conn,keyspace,tablename,data,Name,Value) specifies options using one or more name-value arguments. For example, 'ConsistencyLevel',"TWO" sets the consistency level to specify that two nodes must respond for the CQL query to execute.

example

Examples

collapse all

Insert data from MATLAB® into an Apache™ Cassandra® database using the Apache Cassandra database C++ interface, and display the data by using a Cassandra database connection.

The Cassandra database includes the employees_by_job database table, which contains employee data and the job_id partition key.

Create a Cassandra database connection using the configured data source CassandraDataSource and a blank user name and password. The apacheCassandra function returns conn as a connection object.

datasource = "CassandraDataSource";
username = "";
password = "";
conn = apacheCassandra(datasource,username,password);

Return the names of the Cassandra database tables in the employeedata keyspace. t is a string array that contains the names of these tables.

keyspace = "employeedata";
t = tablenames(conn,keyspace)
t = 3×1 string array
    "employees_by_id"
    "employees_by_job"
    "employees_by_name"

Import employee data into MATLAB from the employees_by_job table by using the Cassandra database connection.

keyspace = "employeedata";
tablename = "employees_by_job";
results = partitionRead(conn,keyspace,tablename);

Display the last few rows of the imported employee data.

tail(results)
ans=8×13 table
    "SH_CLERK"    27-Jan-2004    184    NaN    50    "NSARCHAN"    "Nandita"    "Sarchand"    121    1×2 table    [2;2]    "650.509.1876"    4200
    "MK_REP"    17-Aug-2005    202    NaN    20    "PFAY"    "Pat"    "Fay"    201    1×2 table    [8;10;3]    "603.123.6666"    6000
    "PU_CLERK"    10-Aug-2007    119    NaN    30    "KCOLMENA"    "Karen"    "Colmenares"    114    1×2 table    [9;3;6;1;5]    "515.127.4566"    2500
    "PU_CLERK"    15-Nov-2006    118    NaN    30    "GHIMURO"    "Guy"    "Himuro"    114    1×2 table    [4;6;2;10]    "515.127.4565"    2600
    "PU_CLERK"    24-Dec-2005    116    NaN    30    "SBAIDA"    "Shelli"    "Baida"    114    1×2 table    [3;7]    "515.127.4563"    2900
    "PU_CLERK"    24-Jul-2005    117    NaN    30    "STOBIAS"    "Sigal"    "Tobias"    114    1×2 table    [9;10]    "515.127.4564"    2800
    "PU_CLERK"    18-May-2003    115    NaN    30    "AKHOO"    "Alexander"    "Khoo"    114    1×2 table    [4;6]    "515.127.4562"    3100
    "AC_ACCOUNT"    07-Jun-2002    206    NaN    110    "WGIETZ"    "William"    "Gietz"    205    1×2 table    [2;3]    "515.123.8181"    8300

results is a table that contains these variables:

  • job_id — Job identifier

  • hire_date — Hire date

  • employee_id — Employee identifier

  • commission_pct — Commission percentage

  • department_id — Department identifier

  • email — Email address

  • first_name — First name

  • last_name — Last name

  • manager_id — Manager identifier

  • office — Office location (table that contains two variables for the building and room)

  • performance_ratings — Performance ratings

  • phone_number — Phone number

  • salary — Salary

Display the CQL data types of the columns in the employees_by_job database table.

cols = columninfo(conn,keyspace,tablename);
cols(:,1:2)
ans=13×2 table
                 "job_id"         "text"
              "hire_date"         "date"
            "employee_id"          "int"
         "commission_pct"       "double"
          "department_id"          "int"
                  "email"         "text"
             "first_name"         "text"
              "last_name"         "text"
             "manager_id"          "int"
                 "office"       "office"
    "performance_ratings"    "list<int>"
           "phone_number"         "text"
                 "salary"          "int"

Create a table of data representing one employee to insert into the Cassandra database. Specify the names of the variables. Create a table for the office information. Then, create a table with the employee information that contains the nested table of office information. Set the names of the variables.

varnames = ["job_id" "hire_date" "employee_id" ...
    "commission_pct" "department_id" "email" "first_name" ...
    "last_name" "manager_id" "office" "performance_ratings" ... 
    "phone_number" "salary"];
office = table("South",160, ...
    'VariableNames',["building" "room"]);
data = table("IT_ADMIN",datetime('today'),301,0.25,30,"SMITH123", ...
    "Alex","Smith",114,office,{[4 5]},"515.123.2345",3000);
data.Properties.VariableNames = varnames;

Insert the employee information into the Cassandra database.

upsert(conn,keyspace,tablename,data)

Display the inserted data by importing it into MATLAB using the partition key IT_ADMIN. The employees_by_job table contains a new row.

keyValue = "IT_ADMIN";
results = partitionRead(conn,keyspace,tablename,keyValue)
results=1×13 table
    "IT_ADMIN"    06-Oct-2020    301    0.2500    30    "SMITH123"    "Alex"    "Smith"    114    1×2 table    [4;5]    "515.123.2345"    3000

Close the Cassandra database connection.

close(conn)

Using the Apache™ Cassandra® database C++ interface, update data in an Apache Cassandra database with MATLAB® data. Display the updated data by using a Cassandra database connection. Specify a consistency level for the write operation.

The Cassandra database includes the employees_by_job database table, which contains employee data and the job_id partition key.

Create a Cassandra database connection using the configured data source CassandraDataSource and a blank user name and password. The apacheCassandra function returns conn as a connection object.

datasource = "CassandraDataSource";
username = "";
password = "";
conn = apacheCassandra(datasource,username,password);

Return the names of the Cassandra database tables in the employeedata keyspace. t is a string array that contains the names of these tables.

keyspace = "employeedata";
t = tablenames(conn,keyspace)
t = 3×1 string array
    "employees_by_id"
    "employees_by_job"
    "employees_by_name"

Import employee data into MATLAB from the employees_by_job table by using the Cassandra database connection.

keyspace = "employeedata";
tablename = "employees_by_job";
results = partitionRead(conn,keyspace,tablename);

Display the last few rows of the imported employee data.

tail(results)
ans=8×13 table
    "SH_CLERK"    27-Jan-2004    184    NaN    50    "NSARCHAN"    "Nandita"    "Sarchand"    121    1×2 table    [2;2]    "650.509.1876"    4200
    "MK_REP"    17-Aug-2005    202    NaN    20    "PFAY"    "Pat"    "Fay"    201    1×2 table    [8;10;3]    "603.123.6666"    6000
    "PU_CLERK"    10-Aug-2007    119    NaN    30    "KCOLMENA"    "Karen"    "Colmenares"    114    1×2 table    [9;3;6;1;5]    "515.127.4566"    2500
    "PU_CLERK"    15-Nov-2006    118    NaN    30    "GHIMURO"    "Guy"    "Himuro"    114    1×2 table    [4;6;2;10]    "515.127.4565"    2600
    "PU_CLERK"    24-Dec-2005    116    NaN    30    "SBAIDA"    "Shelli"    "Baida"    114    1×2 table    [3;7]    "515.127.4563"    2900
    "PU_CLERK"    24-Jul-2005    117    NaN    30    "STOBIAS"    "Sigal"    "Tobias"    114    1×2 table    [9;10]    "515.127.4564"    2800
    "PU_CLERK"    18-May-2003    115    NaN    30    "AKHOO"    "Alexander"    "Khoo"    114    1×2 table    [4;6]    "515.127.4562"    3100
    "AC_ACCOUNT"    07-Jun-2002    206    NaN    110    "WGIETZ"    "William"    "Gietz"    205    1×2 table    [2;3]    "515.123.8181"    8300

results is a table that contains these variables:

  • job_id — Job identifier

  • hire_date — Hire date

  • employee_id — Employee identifier

  • commission_pct — Commission percentage

  • department_id — Department identifier

  • email — Email address

  • first_name — First name

  • last_name — Last name

  • manager_id — Manager identifier

  • office — Office location (table that contains two variables for the building and room)

  • performance_ratings — Performance ratings

  • phone_number — Phone number

  • salary — Salary

Display the CQL data types of the columns in the employees_by_job database table.

cols = columninfo(conn,keyspace,tablename);
cols(:,1:2)
ans=13×2 table
                 "job_id"         "text"
              "hire_date"         "date"
            "employee_id"          "int"
         "commission_pct"       "double"
          "department_id"          "int"
                  "email"         "text"
             "first_name"         "text"
              "last_name"         "text"
             "manager_id"          "int"
                 "office"       "office"
    "performance_ratings"    "list<int>"
           "phone_number"         "text"
                 "salary"          "int"

Import the data to update by using the partitionRead function with the partition key value MK_REP. The data is for an employee who is a marketing representative.

keyValue = "MK_REP";
data = partitionRead(conn,keyspace,tablename,keyValue)
data=1×13 table
    "MK_REP"    17-Aug-2005    202    NaN    20    "PFAY"    "Pat"    "Fay"    201    1×2 table    [8;10;3]    "603.123.6666"    6000

Update the commission percentage to 0.25 for the marketing representative. Also, specify the consistency level "ONE" to ensure that one replica node commits the write operation.

data.commission_pct = 0.25;
level = "ONE";
upsert(conn,keyspace,tablename,data,'ConsistencyLevel',level)

Display the updated data by importing it into MATLAB using the partition key value MK_REP. The updated commission percentage for the marketing representative is 0.25.

keyValue = "MK_REP";
results = partitionRead(conn,keyspace,tablename,keyValue)
results=1×13 table
    "MK_REP"    17-Aug-2005    202    0.2500    20    "PFAY"    "Pat"    "Fay"    201    1×2 table    [8;10;3]    "603.123.6666"    6000

Close the Cassandra database connection.

close(conn)

Input Arguments

collapse all

Apache Cassandra database connection, specified as a connection object.

Keyspace, specified as a character vector or string scalar. If you do not know the keyspace, then access the Keyspaces property of the connection object using dot notation to view the keyspaces in the Cassandra database.

Example: "employeedata"

Data Types: char | string

Cassandra database table name, specified as a character vector or string scalar. If you do not know the name of the table, then use the tablenames function to find it.

Example: "employees_by_job"

Data Types: char | string

Data to insert or update in a Cassandra database, specified as a table. You must specify the primary keys of the Cassandra database table, but you can ignore other Cassandra columns. The names of the variables in the table must match the names of the Cassandra columns in the database table, without case sensitivity. The data types of the variables in the table must be compatible with the CQL data types of the Cassandra columns. For details, see Convert CQL Data Types to MATLAB Data Types Using Apache Cassandra Database C++ Interface.

Data Types: table

Name-Value Arguments

collapse all

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Before R2021a, use commas to separate each name and value, and enclose Name in quotes.

Example: upsert(conn,keyspace,tablename,data,'ConsistencyLevel',"ONE",'RequestTimeout',15) exports data by receiving a write response from one node, and the database must wait 15 seconds to perform the write operation before throwing an error.

Consistency level, specified as one of these values.

Consistency Level ValueWrite Operation

"ALL"

Commit on all replica nodes.

"EACH_QUORUM"

Commit on a majority of replica nodes in each data center.

"QUORUM"

Commit on most replica nodes.

"LOCAL_QUORUM"

Commit on most replica nodes in the local data center.

"ONE" (default)

Commit on one replica node.

"TWO"

Commit on two replica nodes.

"THREE"

Commit on three replica nodes.

"LOCAL_ONE"

Commit on one replica node in the local data center.

"ANY"

Commit on at least one replica node.

You can specify the value of the consistency level as a character vector or string scalar.

For details about consistency levels, see Configuring Data Consistency.

Data Types: char | string

This property is read-only.

Request timeout, specified as a positive numeric scalar. The request timeout indicates the number of seconds the database waits to return a CQL query before throwing an error.

Data Types: double

Version History

Introduced in R2021a