Ignite Persistence

Apache Ignite Native Persistence

❗️

This is legacy Apache Ignite documentation.

The new documentation is hosted here: https://ignite.apache.org/docs/latest/

Overview

Ignite native persistence is a distributed, ACID- and SQL-compliant disk store that transparently integrates with Ignite's durable memory. Persistence is optional and can be turned on or off. When it is turned off, Ignite becomes a pure in-memory store.

With native persistence enabled, Ignite always stores a superset of the data on disk and keeps as much of it in RAM as the available capacity allows. For example, if there are 100 entries and RAM can hold only 20, then all 100 are stored on disk and only 20 are cached in RAM for better performance.
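The 100-entries-versus-20 example can be illustrated with a toy model. The sketch below is not how Ignite manages pages internally; it only assumes a full "disk" map plus a bounded "RAM" map with least-recently-used eviction, and the class name TwoTierStoreSketch is hypothetical.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model only: a full "disk" map plus a bounded, LRU-evicting "RAM" map. */
final class TwoTierStoreSketch {
    private final Map<Integer, String> disk = new HashMap<>();
    private final Map<Integer, String> ram;

    TwoTierStoreSketch(final int ramCapacity) {
        // An access-ordered LinkedHashMap drops its least recently used entry
        // whenever the size exceeds the configured capacity.
        this.ram = new LinkedHashMap<Integer, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, String> eldest) {
                return size() > ramCapacity;
            }
        };
    }

    void put(int key, String value) {
        disk.put(key, value); // every entry is always stored on "disk"
        ram.put(key, value);  // and kept in "RAM" only while capacity allows
    }

    String get(int key) {
        String value = ram.get(key);
        if (value == null) {
            value = disk.get(key);   // RAM miss: fall back to "disk"
            if (value != null) {
                ram.put(key, value); // cache the entry for later reads
            }
        }
        return value;
    }

    int diskSize() { return disk.size(); }

    int ramSize() { return ram.size(); }

    public static void main(String[] args) {
        TwoTierStoreSketch store = new TwoTierStoreSketch(20);
        for (int i = 0; i < 100; i++) {
            store.put(i, "value-" + i);
        }
        System.out.println(store.diskSize() + " on disk, " + store.ramSize() + " in RAM");
    }
}
```

Reading an entry that is no longer in the bounded map still succeeds because the "disk" map holds the full data set.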

As in the pure in-memory case, each cluster node persists only a subset of the data: the partitions for which the node is either primary or backup. Collectively, the cluster contains the full data set.

Native persistence has the following characteristics, which distinguish it from third-party databases and make it useful as an alternative persistence layer in Ignite:

  • SQL queries over the full data set that spans both memory and disk. This means that Apache Ignite can be used as a memory-centric distributed SQL database.
  • No need to have all the data and indexes in memory. Ignite persistence allows storing a superset of data on disk and only the most frequently used subsets in memory.
  • Instantaneous cluster restarts. If the whole cluster goes down, there is no need to warm up memory by preloading data from Ignite persistence. The cluster becomes fully operational once all the cluster nodes are connected to each other.
  • Data and indexes are stored in a similar format both in memory and on disk, which helps avoid expensive transformations when moving data between memory and disk.
  • The ability to create full and incremental cluster snapshots by plugging in third-party solutions.

Enabling Persistent Storage

To enable the native persistence, pass an instance of DataStorageConfiguration to a cluster node configuration:

<bean class="org.apache.ignite.configuration.IgniteConfiguration">
  <!-- Enabling Apache Ignite native persistence. -->
  <property name="dataStorageConfiguration">
    <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
      <property name="defaultDataRegionConfiguration">
        <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
          <property name="persistenceEnabled" value="true"/>
        </bean>
      </property>
    </bean>
  </property>
  
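  <!-- Additional settings. -->
</bean>

import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

// Apache Ignite node configuration.
IgniteConfiguration cfg = new IgniteConfiguration();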

// Ignite persistence configuration.
DataStorageConfiguration storageCfg = new DataStorageConfiguration();
            
// Enabling the persistence.
storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);
            
// Applying settings.
cfg.setDataStorageConfiguration(storageCfg);

When persistence is enabled, data and indexes are stored both in memory and on disk across all the cluster nodes. The picture below depicts the structure of Ignite persistence at the file system level of an individual cluster node:

👍

Persistence per Data Region and Cache

Ignite allows enabling persistence for a specific data region and, thus, for the caches assigned to it. Refer to the data regions configuration section for details.
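As a minimal sketch of per-region persistence, the configuration below defines one persistent and one in-memory data region and assigns a cache to each. The region names and the class name are hypothetical; the cache names are borrowed from the figure on this page for illustration only.

```java
import org.apache.ignite.configuration.CacheConfiguration;
import org.apache.ignite.configuration.DataRegionConfiguration;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class PerRegionPersistenceSketch {
    public static IgniteConfiguration configure() {
        // Region whose caches are persisted to disk (hypothetical name).
        DataRegionConfiguration persistentRegion = new DataRegionConfiguration();
        persistentRegion.setName("persistent_region");
        persistentRegion.setPersistenceEnabled(true);

        // Region whose caches stay purely in memory (hypothetical name).
        DataRegionConfiguration inMemoryRegion = new DataRegionConfiguration();
        inMemoryRegion.setName("in_memory_region");
        inMemoryRegion.setPersistenceEnabled(false);

        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        storageCfg.setDataRegionConfigurations(persistentRegion, inMemoryRegion);

        // Each cache is bound to a region by name and inherits its persistence setting.
        CacheConfiguration<Integer, String> cacheA = new CacheConfiguration<>("Cache_A");
        cacheA.setDataRegionName("persistent_region");

        CacheConfiguration<Integer, String> cacheB = new CacheConfiguration<>("Cache_B");
        cacheB.setDataRegionName("in_memory_region");

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storageCfg);
        cfg.setCacheConfiguration(cacheA, cacheB);
        return cfg;
    }
}
```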


Ignite Native Persistence Structure on the File System

📘

Cluster Doesn’t Start After Field Type Changes

When developing your application, you may need to change the type of a custom object's field. For instance, suppose you have object A with field A.range of type int, and you then change the type of A.range to long in the source code. When you do this, the cluster or the application will fail to restart because Ignite doesn't support field/column type changes.

When this happens and you are still in development, you need to go into the file system and remove the following directories: marshaller/, db/, and wal/, located in the Ignite working directory (db and wal might be located elsewhere if you have redefined their locations).

However, if you are in production then instead of changing field types, add a new field to your object model and remove the old one. This operation is fully supported. At the same time, the ALTER TABLE command can be used to add new columns or remove existing ones at run time.

Firstly, there will be a unique directory for every cache deployed on the node. From the picture above, we can see that there are at least two caches (Cache_A and Cache_B) whose data and indexes are maintained by the node.

Secondly, for every partition for which the node is either a primary or backup, the persistence creates a dedicated file on the file system. For instance, the node from the picture above is responsible for partitions 1, 10 and 564. The indexes are stored in one file per cache.

👍

Cache Groups and Partition Files

If Cache_A and Cache_B belong to the same cache group, there will be only a single directory with partition files shared by both caches. Learn more from the cache groups documentation.

Finally, there are files and directories related to the write-ahead log activities that are explained in the Write-Ahead Log and Checkpointing documentation.

🚧

Cluster Activation

Note that if Ignite persistence is used, the cluster is considered inactive by default, which disallows any CRUD operations. A user has to activate the cluster manually. See the cluster activation page for more information on how to activate the cluster.
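A minimal sketch of manual activation from Java is shown below. It assumes a single-node cluster started in the same JVM and uses IgniteCluster.active(boolean); later 2.x releases also offer IgniteCluster.state(ClusterState.ACTIVE), so check which call your version recommends. The class name is hypothetical.

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class ActivationSketch {
    public static void main(String[] args) {
        // Enable persistence for the default data region.
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setDataStorageConfiguration(storageCfg);

        try (Ignite ignite = Ignition.start(cfg)) {
            // With persistence enabled the cluster starts inactive;
            // activate it once all expected nodes have joined.
            ignite.cluster().active(true);

            System.out.println("Cluster active: " + ignite.cluster().active());
        }
    }
}
```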

Configuring Persistent Storage Path

When persistence is enabled, the node will store user data, indexes and WAL files in the {IGNITE_WORK_DIR}/db directory. This directory is referred to as the storage directory.
You can change the storage directory by setting the storagePath property of the DataStorageConfiguration object, as shown below.

Each node maintains three subdirectories under the storage directory to store cache data, WAL files, and WAL archive files:

| Subdirectory name | Description |
|---|---|
| {WORK_DIR}/db/{nodeId} | This directory contains cache data and indexes. |
| {WORK_DIR}/db/wal/{nodeId} | This directory contains WAL files. |
| {WORK_DIR}/db/wal/archive/{nodeId} | This directory contains WAL archive files. |

Here, nodeId is either the consistent node ID (if it is defined in the node configuration) or an auto-generated node ID. It ensures that each node's directories are unique.
If multiple nodes share the same work directory, they use different subdirectories.

If the work directory contains persistence files for multiple nodes (there are multiple {nodeId} subdirectories with different nodeIds), the node will pick up the first subdirectory that is not being used.
To make sure a node always uses a specific subdirectory and, thus, specific data partitions even after restarts, set IgniteConfiguration.setConsistentId to a cluster-wide unique value in the node configuration.
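To make the default layout concrete, the sketch below resolves the three per-node directories from a work directory and a node ID, following the subdirectory table above. StorageLayoutSketch and its methods are hypothetical helpers, not part of the Ignite API, and the node ID is passed through as-is (Ignite may derive the actual folder name differently).

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper that mirrors the default directory layout described above. */
final class StorageLayoutSketch {
    /** {WORK_DIR}/db/{nodeId}: cache data and indexes. */
    static Path dataDir(String workDir, String nodeId) {
        return Paths.get(workDir, "db", nodeId);
    }

    /** {WORK_DIR}/db/wal/{nodeId}: WAL files. */
    static Path walDir(String workDir, String nodeId) {
        return Paths.get(workDir, "db", "wal", nodeId);
    }

    /** {WORK_DIR}/db/wal/archive/{nodeId}: WAL archive files. */
    static Path walArchiveDir(String workDir, String nodeId) {
        return Paths.get(workDir, "db", "wal", "archive", nodeId);
    }

    public static void main(String[] args) {
        String workDir = "/opt/ignite/work"; // assumed work directory
        String nodeId = "node-1";            // assumed consistent ID

        System.out.println(dataDir(workDir, nodeId));
        System.out.println(walDir(workDir, nodeId));
        System.out.println(walArchiveDir(workDir, nodeId));
    }
}
```

With a fixed consistent ID, the same three directories are resolved after every restart, which is why the node keeps its data partitions.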

You can change the storage directory as follows:

<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="dataStorageConfiguration">
        <bean class="org.apache.ignite.configuration.DataStorageConfiguration">
            <property name="defaultDataRegionConfiguration">
                <bean class="org.apache.ignite.configuration.DataRegionConfiguration">
                    <property name="persistenceEnabled" value="true"/>
                </bean>
            </property>
            <property name="storagePath" value="/opt/storage"/>
        </bean>
    </property>
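</bean>

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

// Apache Ignite node configuration.
IgniteConfiguration cfg = new IgniteConfiguration();

// Data storage configuration.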
DataStorageConfiguration storageCfg = new DataStorageConfiguration();

storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

storageCfg.setStoragePath("/opt/storage");

cfg.setDataStorageConfiguration(storageCfg);

Ignite ignite = Ignition.start(cfg);

🚧

Ensure persistence files are not stored under temp folder

On some systems, the default location might be under a temp folder. In that case, the operating system may remove persistence files when the node process is restarted. To avoid this:

  • Ensure that the WARN logging level is not disabled for Ignite. You will see a warning if the persistence files are written to the temporary directory.
  • Change the location of all persistence files using the DataStorageConfiguration methods setStoragePath(...), setWalPath(...) and setWalArchivePath(...).
  • Set the Ignite work directory to a non-temporary location (IgniteConfiguration#setWorkDirectory).

🚧

Nodes From Isolated Clusters on a Single Box/Machine

Ignite supports running nodes from isolated clusters on a single machine. In this case, every cluster has to store its persistence files under different paths in the file system. Use setStoragePath(...), setWalPath(...) and setWalArchivePath(...) methods of DataStorageConfiguration to redefine the paths for every individual cluster.
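A minimal sketch of giving each isolated cluster its own set of directories follows. The root paths and the storage, wal and wal-archive folder names are assumptions chosen for illustration; only the three setter methods come from the text above.

```java
import org.apache.ignite.configuration.DataStorageConfiguration;
import org.apache.ignite.configuration.IgniteConfiguration;

public class IsolatedClusterPathsSketch {
    /** Builds a persistent storage configuration rooted at a per-cluster directory. */
    static DataStorageConfiguration storageFor(String clusterRoot) {
        DataStorageConfiguration storageCfg = new DataStorageConfiguration();
        storageCfg.getDefaultDataRegionConfiguration().setPersistenceEnabled(true);

        // Keep data, WAL and WAL archive files of this cluster under its own root.
        storageCfg.setStoragePath(clusterRoot + "/storage");
        storageCfg.setWalPath(clusterRoot + "/wal");
        storageCfg.setWalArchivePath(clusterRoot + "/wal-archive");
        return storageCfg;
    }

    public static void main(String[] args) {
        // Hypothetical roots for two clusters that share one machine.
        IgniteConfiguration clusterA = new IgniteConfiguration();
        clusterA.setDataStorageConfiguration(storageFor("/opt/ignite/cluster-a"));

        IgniteConfiguration clusterB = new IgniteConfiguration();
        clusterB.setDataStorageConfiguration(storageFor("/opt/ignite/cluster-b"));

        System.out.println(clusterA.getDataStorageConfiguration().getStoragePath());
        System.out.println(clusterB.getDataStorageConfiguration().getStoragePath());
    }
}
```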

Transactional Guarantees

Ignite native persistence is an ACID-compliant distributed store. Every transactional update that comes to the store is appended to the WAL first. Each update is uniquely identified by an ID. This means that a cluster can always be recovered to the latest successfully committed transaction or atomic update in the event of a crash or restart.
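The log-first ordering can be illustrated with a toy write-ahead log. This is a simplified in-memory sketch, not Ignite's WAL implementation: it assumes a list standing in for the log file and a map standing in for in-memory state, and the class name WalSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy write-ahead log: every update is logged before it is applied. */
final class WalSketch {
    /** One logged update, identified by a unique, increasing ID. */
    static final class Record {
        final long id;
        final String key;
        final String value;

        Record(long id, String key, String value) {
            this.id = id;
            this.key = key;
            this.value = value;
        }
    }

    private final List<Record> log = new ArrayList<>();   // stands in for the WAL file
    private Map<String, String> memory = new HashMap<>(); // stands in for in-memory state
    private long nextId = 1;

    /** Appends the update to the log first, then applies it; returns the update ID. */
    long put(String key, String value) {
        Record rec = new Record(nextId++, key, value);
        log.add(rec);
        memory.put(key, value);
        return rec.id;
    }

    /** Simulates a crash: in-memory state is lost, the log survives. */
    void crash() {
        memory = new HashMap<>();
    }

    /** Rebuilds the state by replaying the log in order. */
    void recover() {
        for (Record rec : log) {
            memory.put(rec.key, rec.value);
        }
    }

    String get(String key) {
        return memory.get(key);
    }
}
```

Because each update reaches the log before it is applied, replaying the log after a crash restores the latest logged value for every key.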

SQL Support

Ignite native persistence allows using Apache Ignite as a distributed SQL database.

There is no need to have all the data in memory to run SQL queries across the cluster. Apache Ignite can execute them over data that resides both in memory and on disk. Moreover, preloading data from persistence into memory after a cluster restart is optional. You can run SQL queries as soon as the cluster is up and running.

Ignite Persistence Internals

This documentation provides a high-level overview of the Ignite persistence. For more technical details refer to these documents:

Performance Tips

The performance suggestions are listed in the durable memory tuning documentation section.

Example

To see how the Ignite native persistence can be used in practice, try this example, available on GitHub and delivered with every Apache Ignite distribution.