126.96.36.199 NDB Cluster Disk Data Objects
NDB Cluster Disk Data storage is implemented using a number of Disk Data objects. These include the following:
Tablespaces act as containers for other Disk Data objects.
Undo log files undo information required for rolling back transactions.
One or more undo log files are assigned to a log file group, which is then assigned to a tablespace.
Data files store Disk Data table data. A data file is assigned directly to a tablespace.
Undo log files and data files are actual files in the file system of each data node; by default they are placed in
ndb_ in the
DataDir specified in the NDB Cluster
config.ini file, and where
node_id is the data node's node ID. It is possible to place these elsewhere by specifying either an absolute or relative path as part of the filename when creating the undo log or data file. Statements that create these files are shown later in this section.
NDB Cluster tablespaces and log file groups are not implemented as files.
Although not all Disk Data objects are implemented as files, they all share the same namespace. This means that each Disk Data object must be uniquely named (and not merely each Disk Data object of a given type). For example, you cannot have a tablespace and a log file group both named
Assuming that you have already set up an NDB Cluster with all nodes (including management and SQL nodes), the basic steps for creating an NDB Cluster table on disk are as follows:
Create a log file group, and assign one or more undo log files to it (an undo log file is also sometimes referred to as an undofile).Note
Undo log files are necessary only for Disk Data tables; they are not used for
NDBCLUSTERtables that are stored only in memory.
Create a tablespace; assign the log file group, as well as one or more data files, to the tablespace.
Create a Disk Data table that uses this tablespace for data storage.
Each of these tasks can be accomplished using SQL statements in the mysql client or other MySQL client application, as shown in the example that follows.
We create a log file group named
CREATE LOGFILE GROUP. This log file group is to be made up of two undo log files, which we name
undo_2.log, whose initial sizes are 16 MB and 12 MB, respectively. (The default initial size for an undo log file is 128 MB.) Optionally, you can also specify a size for the log file group's undo buffer, or permit it to assume the default value of 8 MB. In this example, we set the UNDO buffer's size at 2 MB. A log file group must be created with an undo log file; so we add
CREATE LOGFILE GROUPstatement:
CREATE LOGFILE GROUP lg_1 ADD UNDOFILE 'undo_1.log' INITIAL_SIZE 16M UNDO_BUFFER_SIZE 2M ENGINE NDBCLUSTER;
undo_2.logto the log file group, use the following
ALTER LOGFILE GROUPstatement:
ALTER LOGFILE GROUP lg_1 ADD UNDOFILE 'undo_2.log' INITIAL_SIZE 12M ENGINE NDBCLUSTER;
Some items of note:
.logfile extension used here is not required. We use it merely to make the log files easily recognizable.
There can exist at most one log file group in the same NDB Cluster at any given time.
When you add an undo log file to a log file group using
ADD UNDOFILE ', a file with the name
filenameis created in the
ndb_directory within the
DataDirof each data node in the cluster, where
node_idis the node ID of the data node. Each undo log file is of the size specified in the SQL statement. For example, if an NDB Cluster has 4 data nodes, then the
ALTER LOGFILE GROUPstatement just shown creates 4 undo log files, 1 each on in the data directory of each of the 4 data nodes; each of these files is named
undo_2.logand each file is 12 MB in size.
UNDO_BUFFER_SIZEis limited by the amount of system memory available.
For more information about the
CREATE LOGFILE GROUPstatement, see Section 13.1.15, “CREATE LOGFILE GROUP Statement”. For more information about
ALTER LOGFILE GROUP, see Section 13.1.5, “ALTER LOGFILE GROUP Statement”.
Now we can create a tablespace, which contains files to be used by NDB Cluster Disk Data tables for storing their data. A tablespace is also associated with a particular log file group. When creating a new tablespace, you must specify the log file group which it is to use for undo logging; you must also specify a data file. You can add more data files to the tablespace after the tablespace is created; it is also possible to drop data files from a tablespace (an example of dropping data files is provided later in this section).
Assume that we wish to create a tablespace named
lg_1as its log file group. This tablespace is to contain two data files named
data_2.dat, whose initial sizes are 32 MB and 48 MB, respectively. (The default value for
INITIAL_SIZEis 128 MB.) We can do this using two SQL statements, as shown here:
CREATE TABLESPACE ts_1 ADD DATAFILE 'data_1.dat' USE LOGFILE GROUP lg_1 INITIAL_SIZE 32M ENGINE NDBCLUSTER; ALTER TABLESPACE ts_1 ADD DATAFILE 'data_2.dat' INITIAL_SIZE 48M ENGINE NDBCLUSTER;
Some items of note:
As is the case with the
.logfile extension used in this example for undo log files, there is no special significance for the
.datfile extension; it is used merely for easy recognition of data files.
When you add a data file to a tablespace using
ADD DATAFILE ', a file with the name
filenameis created in the
ndb_directory within the
DataDirof each data node in the cluster, where
node_idis the node ID of the data node. Each data file is of the size specified in the SQL statement. For example, if an NDB Cluster has 4 data nodes, then the
ALTER TABLESPACEstatement just shown creates 4 data files, 1 each in the data directory of each of the 4 data nodes; each of these files is named
data_2.datand each file is 48 MB in size.
NDB 7.6 (and later) reserves 4% of each tablespace for use during data node restarts. This space is not available for storing data.
ALTER TABLESPACEstatements must contain an
ENGINEclause; only tables using the same storage engine as the tablespace can be created in the tablespace. For NDB Cluster tablespaces, the only permitted values for this option are
CREATE TABLE dt_1 ( member_id INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY, last_name VARCHAR(50) NOT NULL, first_name VARCHAR(50) NOT NULL, dob DATE NOT NULL, joined DATE NOT NULL, INDEX(last_name, first_name) ) TABLESPACE ts_1 STORAGE DISK ENGINE NDBCLUSTER;
TABLESPACE ... STORAGE DISKoption tells the
NDBCLUSTERstorage engine to use tablespace
ts_1for disk data storage.
It is also possible to specify whether an individual column is stored on disk or in memory by using a
STORAGEclause as part of the column's definition in a
STORAGE DISKcauses the column to be stored on disk, and
STORAGE MEMORYcauses in-memory storage to be used. See Section 13.1.18, “CREATE TABLE Statement”, for more information.
Indexing of columns implicitly stored on disk. For table
dt_1 as defined in the example just shown, only the
joined columns are stored on disk. This is because there are indexes on the
first_name columns, and so data belonging to these columns is stored in RAM. Only nonindexed columns can be held on disk; indexes and indexed column data continue to be stored in memory. This tradeoff between the use of indexes and conservation of RAM is something you must keep in mind as you design Disk Data tables.
You cannot add an index to a column that has been explicitly declared
STORAGE DISK, without first changing its storage type to
MEMORY; any attempt to do so fails with an error. A column which implicitly uses disk storage can be indexed; when this is done, the column's storage type is changed to
MEMORY automatically. By “implicitly”, we mean a column whose storage type is not declared, but which is which inherited from the parent table. In the following CREATE TABLE statement (using the tablespace
ts_1 defined previously), columns
c3 use disk storage implicitly:
mysql> CREATE TABLE ti ( -> c1 INT PRIMARY KEY, -> c2 INT, -> c3 INT, -> c4 INT -> ) -> STORAGE DISK -> TABLESPACE ts_1 -> ENGINE NDBCLUSTER; Query OK, 0 rows affected (1.31 sec)
c4 are themselves not declared with
STORAGE DISK, it is possible to index them. Here, we add indexes to
c3, using, respectively,
CREATE INDEX and
mysql> CREATE INDEX i1 ON ti(c2); Query OK, 0 rows affected (2.72 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> ALTER TABLE ti ADD INDEX i2(c3); Query OK, 0 rows affected (0.92 sec) Records: 0 Duplicates: 0 Warnings: 0
SHOW CREATE TABLE confirms that the indexes were added.
mysql> SHOW CREATE TABLE ti\G *************************** 1. row *************************** Table: ti Create Table: CREATE TABLE `ti` ( `c1` int(11) NOT NULL, `c2` int(11) DEFAULT NULL, `c3` int(11) DEFAULT NULL, `c4` int(11) DEFAULT NULL, PRIMARY KEY (`c1`), KEY `i1` (`c2`), KEY `i2` (`c3`) ) /*!50100 TABLESPACE `ts_1` STORAGE DISK */ ENGINE=ndbcluster DEFAULT CHARSET=latin1 1 row in set (0.00 sec)
You can see using ndb_desc that the indexed columns (emphasized text) now use in-memory rather than on-disk storage:
shell> ./ndb_desc -d test t1 -- t1 -- Version: 33554433 Fragment type: HashMapPartition K Value: 6 Min load factor: 78 Max load factor: 80 Temporary table: no Number of attributes: 4 Number of primary keys: 1 Length of frm data: 317 Max Rows: 0 Row Checksum: 1 Row GCI: 1 SingleUserMode: 0 ForceVarPart: 1 PartitionCount: 4 FragmentCount: 4 PartitionBalance: FOR_RP_BY_LDM ExtraRowGciBits: 0 ExtraRowAuthorBits: 0 TableStatus: Retrieved Table options: HashMap: DEFAULT-HASHMAP-3840-4 -- Attributes -- c1 Int PRIMARY KEY DISTRIBUTION KEY AT=FIXED ST=MEMORY c2 Int NULL AT=FIXED ST=MEMORY c3 Int NULL AT=FIXED ST=MEMORY c4 Int NULL AT=FIXED ST=DISK -- Indexes -- PRIMARY KEY(c1) - UniqueHashIndex i2(c3) - OrderedIndex PRIMARY(c1) - OrderedIndex i1(c2) - OrderedIndex NDBT_ProgramExit: 0 - OK
Performance note. The performance of a cluster using Disk Data storage is greatly improved if Disk Data files are kept on a separate physical disk from the data node file system. This must be done for each data node in the cluster to derive any noticeable benefit.
You may use absolute and relative file system paths with
ADD UNDOFILE and
ADD DATAFILE. Relative paths are calculated relative to the data node's data directory. You may also use symbolic links; see Section 188.8.131.52, “Using Symbolic Links with Disk Data Objects”, for more information and examples.
A log file group, a tablespace, and any Disk Data tables using these must be created in a particular order. The same is true for dropping any of these objects:
A log file group cannot be dropped as long as any tablespaces are using it.
A tablespace cannot be dropped as long as it contains any data files.
You cannot drop any data files from a tablespace as long as there remain any tables which are using the tablespace.
It is not possible to drop files created in association with a different tablespace than the one with which the files were created. (Bug #20053)
For example, to drop all the objects created so far in this section, you would use the following statements:
mysql> DROP TABLE dt_1; mysql> ALTER TABLESPACE ts_1 -> DROP DATAFILE 'data_2.dat' -> ENGINE NDBCLUSTER; mysql> ALTER TABLESPACE ts_1 -> DROP DATAFILE 'data_1.dat' -> ENGINE NDBCLUSTER; mysql> DROP TABLESPACE ts_1 -> ENGINE NDBCLUSTER; mysql> DROP LOGFILE GROUP lg_1 -> ENGINE NDBCLUSTER;
These statements must be performed in the order shown, except that the two
ALTER TABLESPACE ... DROP DATAFILE statements may be executed in either order.
You can obtain information about data files used by Disk Data tables by querying the
FILES table in the
INFORMATION_SCHEMA database. An extra “
NULL row” provides additional information about undo log files. For more information and examples, see Section 24.9, “The INFORMATION_SCHEMA FILES Table”.