Spring Data

Spring Data is a part of the Spring Framework which makes it easy to map data from a Java application onto the Aerospike Database and read it back. To learn more about Spring Data see https://spring.io/projects/spring-data, and to learn about the Spring Framework, see https://spring.io/projects/spring-framework.

The Spring Data Aerospike implementation supports both synchronous and reactive programming paradigms.

Getting Started

To use Spring Data Aerospike in your project, the first step is to add it to your build process. For Maven, this is as simple as:

<dependency>
    <groupId>com.aerospike</groupId>
    <artifactId>spring-data-aerospike</artifactId>
    <version>4.1.0</version>
</dependency>

and for Gradle users:

implementation group: 'com.aerospike', name: 'spring-data-aerospike', version: '4.1.0'

Connecting to the Aerospike Database

Connecting to the database is straightforward with the help of the AbstractAerospikeDataConfiguration class:

@Configuration
@EnableAerospikeRepositories(basePackageClasses = { PersonRepository.class})
public class AerospikeConfiguration extends AbstractAerospikeDataConfiguration {
    @Override
    protected Collection<Host> getHosts() {
        return Collections.singleton(new Host("localhost", 3000));
    }

    @Override
    protected String nameSpace() {
        return "test";
    }
}

@Configuration tells Spring that this class contains configuration data, and @EnableAerospikeRepositories activates Aerospike repositories that can be used for data access. The parameter to this annotation tells Spring Data Aerospike where to look for the repositories. This can be a list of package names as strings using the basePackages value, or a list of classes through the basePackageClasses value. If the latter is used (as in this example), each class is used only to determine which package to scan, and all repositories in that package will be available for use. More details on repositories are below.

The AbstractAerospikeDataConfiguration class exposes a number of beans which Spring Data Aerospike uses internally. Some of these, in particular the AerospikeTemplate bean, are useful in their own right if finer grained control over data access is needed. The primary information required by this configuration is how to connect to the cluster, provided through the getHosts and nameSpace calls.

Creating Functionality

The base functionality for using Spring Data is provided by the AerospikeRepository interface. This typically takes two type parameters:

The type which this class manages, typically an entity class to be stored in the database.
The type of the ID for this class.

Application code typically extends this interface for each of the types to be managed, and methods can be added to the interface to determine how the application can access the data. For example, consider a class Person with a simple structure:

@AllArgsConstructor
@NoArgsConstructor
@Data
@Document
public class Person {
    @Id
    private long id;
    private String firstName;
    private String lastName;
    @Field("dob")
    private Date dateOfBirth;
}

Note that this example uses the Project Lombok annotations to remove the need for explicit constructors, getters and setters. Normal POJOs which define these on their own can omit the @AllArgsConstructor, @NoArgsConstructor and @Data annotations. The @Document annotation tells Spring Data Aerospike that this is a domain object to be persisted to the database, and @Id identifies the primary key of this class. The @Field annotation is used to create a shorter name for the bin in the Aerospike database (dateOfBirth will be stored in a bin called dob in this example).

For the Person object to be persisted to Aerospike, you must create an interface with the desired methods for retrieving data. For example:

public interface PersonRepository extends AerospikeRepository<Person, Long> {
    public List<Person> findByLastName(String lastName);
}

This defines a repository which can write Person entities as well as being able to query people by last name. The AerospikeRepository extends both PagingAndSortingRepository and CrudRepository so methods like count(), findById(), save() and delete() are there by default. For reactive users, use the ReactiveAerospikeRepository instead.

Note that this is just an interface and not an actual class. In the background, when your context gets initialized, actual implementations for your repository descriptions get created and you can access them through regular beans. This means you will save lots of boilerplate code while still exposing full CRUD semantics to your service layer and application.
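To make the idea of an interface receiving an implementation at runtime more concrete, here is a minimal conceptual sketch using a plain JDK dynamic proxy. It is not how Spring Data Aerospike is implemented internally; the NameRepository interface and the RepositoryProxySketch class are hypothetical names invented for this illustration, and the handler simply echoes the argument rather than querying a database.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.List;

// Hypothetical repository-style interface, used only for this illustration.
interface NameRepository {
    List<String> findByLastName(String lastName);
}

final class RepositoryProxySketch {
    // Builds a runtime implementation of an interface that has no hand-written class.
    static NameRepository create() {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("findByLastName")) {
                // A real framework would derive a query from the method name;
                // this sketch just describes the query it would run.
                return List.of("query: lastName == " + args[0]);
            }
            throw new UnsupportedOperationException(method.getName());
        };
        return (NameRepository) Proxy.newProxyInstance(
                NameRepository.class.getClassLoader(),
                new Class<?>[] { NameRepository.class },
                handler);
    }

    public static void main(String[] args) {
        System.out.println(create().findByLastName("Jones"));
    }
}
```

The calling code only ever sees the interface, which is why no implementation class needs to be written by hand.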

Once this is defined, the repository is ready for use. A sample Spring Controller which uses this repository could be:

@RestController
public class ApplicationController {
    @Autowired
    private PersonRepository personRepository;
    
    @GetMapping("/seed")
    public int seedData() {
        // GregorianCalendar months are zero-based, so month 12 rolls over to January 1972.
        Person person = new Person(1, "Bob", "Jones", new GregorianCalendar(1971, 12, 19).getTime());
        personRepository.save(person);
        return 1;
    }
    
    @GetMapping("/findByLastName/{lastName}")
    public List<Person> findByLastName(@PathVariable(name = "lastName", required=true) String lastName) {
        return personRepository.findByLastName(lastName);
    }
}

Invoking the seed method above gives you a record in the Aerospike database which looks like:

aql> select * from test.Person where pk = "1"
+-----+-----------+----------+-------------+-------------------------------------+
| PK  | firstName | lastName | dob         | @_class                             |
+-----+-----------+----------+-------------+-------------------------------------+
| "1" | "Bob"     | "Jones"  | 64652400000 | "com.aerospike.sample.model.Person" |
+-----+-----------+----------+-------------+-------------------------------------+
1 row in set (0.001 secs)

There are 2 important things to notice here:

The fully qualified name of the class is listed in each record. This is needed to instantiate the class correctly, especially in cases where the compile-time type and runtime type of the object differ, for example where a field is declared as a superclass but the instantiated object is a subclass.
The long id field was turned into a String when stored in the database. All @Id fields must be convertible to Strings and are stored in the database as such, then converted back to the original type when the object is read. This is transparent to the application, but needs to be considered when using an external tool like AQL to view the data.
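The id round trip described above can be pictured with the following minimal sketch. It only illustrates the idea of a long being stored as a String and converted back; it is not the converter Spring Data Aerospike actually uses, and the IdRoundTripSketch class and its method names are assumptions made for this example.

```java
final class IdRoundTripSketch {
    // Converts the entity's long id into the String form used as the stored key.
    static String toStoredKey(long id) {
        return Long.toString(id);
    }

    // Converts the stored String key back to the entity's declared id type.
    static long fromStoredKey(String storedKey) {
        return Long.parseLong(storedKey);
    }

    public static void main(String[] args) {
        String storedKey = toStoredKey(1L);
        System.out.println(storedKey);
        System.out.println(fromStoredKey(storedKey));
    }
}
```

This is why the AQL query shown earlier uses pk = "1" (a string) rather than a numeric key.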

Indexing

Notice that findByLastName is not a simple lookup by key, but rather finds all records in a set whose lastName matches. Aerospike has two ways of achieving this:

Scanning all the records in the set and extracting the appropriate records.
Defining a secondary index on the field lastName and using this secondary index to satisfy the query.

The second approach is far more efficient. Aerospike stores the secondary indexes in a memory structure, allowing exceptionally fast identification of the records that match. However, this relies on a secondary index having been created. The index can either be created by system administrators using the asadm tool, or be created on the fly by giving Spring Data Aerospike a hint that such an index is necessary.
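The difference between the two approaches can be sketched in plain Java. This is a conceptual illustration only, with an in-memory list standing in for a set and a HashMap standing in for a secondary index; it does not reflect how the Aerospike server stores or indexes data, and the PersonRecord type and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ScanVersusIndexSketch {
    // Minimal stand-in for a stored record; not an Aerospike type.
    record PersonRecord(long id, String firstName, String lastName) {}

    // Approach 1: visit every record in the set and keep the matches.
    static List<PersonRecord> scan(List<PersonRecord> set, String lastName) {
        List<PersonRecord> matches = new ArrayList<>();
        for (PersonRecord r : set) {
            if (r.lastName().equals(lastName)) {
                matches.add(r);
            }
        }
        return matches;
    }

    // Approach 2: build an in-memory index once, keyed by lastName...
    static Map<String, List<PersonRecord>> buildIndex(List<PersonRecord> set) {
        Map<String, List<PersonRecord>> index = new HashMap<>();
        for (PersonRecord r : set) {
            index.computeIfAbsent(r.lastName(), k -> new ArrayList<>()).add(r);
        }
        return index;
    }

    // ...then answer each query with a single lookup instead of a full pass.
    static List<PersonRecord> lookup(Map<String, List<PersonRecord>> index, String lastName) {
        return index.getOrDefault(lastName, List.of());
    }
}
```

The scan does work proportional to the size of the whole set on every query, while the indexed lookup pays the cost of building the index once and then touches only the matching records.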

To have Spring Data Aerospike create the index, use the @Indexed annotation on the field where an index is required. This changes the Person object as in the following example:

@AllArgsConstructor
@NoArgsConstructor
@Data
@Document
public class Person {
    @Id
    private long id;
    private String firstName;
    @Indexed(name = "lastName_idx", type = IndexType.STRING)
    private String lastName;
    private Date dateOfBirth;
}

This creates the index at runtime if it does not already exist, then the queries can use it. The Spring Data Aerospike adapter has intelligence built into it so it can use a secondary index for a range of queries with multiple predicates.

For example, the requirement might be "LastName matches a passed parameter and FirstName contains a different string." In Spring Data, this is specified by:

public List<Person> findByLastNameAndFirstNameContaining(String lastName, String firstName);

This uses the same secondary index as defined above, and the additional predicate (FirstNameContaining) is applied to the results derived from the first predicate (LastName).
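The two-step evaluation just described, an indexed predicate followed by a second predicate applied to its results, might look like this in plain Java. It is a sketch under stated assumptions rather than the adapter's real query planner: the Map stands in for the lastName secondary index, and the PersonRecord type is hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class IndexThenFilterSketch {
    // Minimal stand-in for a stored record; not an Aerospike type.
    record PersonRecord(String firstName, String lastName) {}

    static List<PersonRecord> findByLastNameAndFirstNameContaining(
            Map<String, List<PersonRecord>> lastNameIndex,
            String lastName,
            String firstNamePart) {
        return lastNameIndex.getOrDefault(lastName, List.of()).stream() // step 1: exact-match index lookup
                .filter(p -> p.firstName().contains(firstNamePart))     // step 2: filter the narrowed results
                .collect(Collectors.toList());
    }
}
```

Only the records returned by the index lookup are examined by the second predicate, so the rest of the set is never read.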

Note that in Aerospike, secondary index queries on string values are case-sensitive and exact-match only. So a method such as

public List<Person> findByLastNameContaining(String lastName);

could not be satisfied by the secondary index. In this case, Aerospike would need to scan the data (the first approach listed above). This can be an expensive operation, as the Aerospike server must read every record in the set and apply the condition to each one. Because of this cost, scans from Spring Data Aerospike are disabled by default. If the cost of the scans is acceptable to an organization, they can be enabled by setting scansEnabled to true in the AerospikeDataSettings. One way to do this is to create a custom bean which overrides the default settings:

@Configuration
@EnableAerospikeRepositories(basePackageClasses = { PersonRepository.class})
public class AerospikeConfiguration extends AbstractAerospikeDataConfiguration {
    @Override
    protected Collection<Host> getHosts() {
        return Collections.singleton(new Host("localhost", 3000));
    }

    @Override
    protected String nameSpace() {
        return "test";
    }
    
    @Bean
    public AerospikeDataSettings aerospikeDataSettings() {
        return AerospikeDataSettings.builder().scansEnabled(true).build();
    }
}

Note: Once this flag is enabled, scans run whenever needed with no warnings or errors. This may or may not be optimal in the particular use case. In the example above, assume there was a new requirement to be able to find by firstName with an exact match:

public interface PersonRepository extends AerospikeRepository<Person, Long> {
    public List<Person> findByLastName(String lastName);
    public List<Person> findByFirstName(String firstName);
}

In this case firstName is not marked as @Indexed so Spring Data Aerospike does not know to create an index on it. Hence it will scan the repository, a costly operation which could be avoided by using an index.

Useful Resources

There are a number of blog posts to help you get started with Spring Data Aerospike.

Known Issues

Spring Data Aerospike is an active project with a large and complex code base. Like all such projects, there are known issues from time to time. Currently the known issues are:

Queries by method name do not support the Distinct and Like keywords.
Date comparisons do not work when using Dates. Pass the comparison parameters as Longs instead by invoking myDate.getTime().
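As a minimal sketch of the date workaround, the following converts a Date into the epoch-millisecond long that would be passed as the comparison parameter. The DateAsLongSketch class is an assumption made for this example, and the repository method mentioned in the comment is hypothetical; it is not taken from the project.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;

final class DateAsLongSketch {
    // Converts a Date into epoch milliseconds, the Long form suggested above.
    static long toEpochMillis(Date date) {
        return date.getTime();
    }

    public static void main(String[] args) {
        // Calendar months are zero-based, so the named constant avoids off-by-one mistakes.
        Date cutoff = new GregorianCalendar(1971, Calendar.DECEMBER, 19).getTime();
        long cutoffMillis = toEpochMillis(cutoff);
        // A hypothetical repository method declared with a long parameter could
        // then receive cutoffMillis instead of the Date itself.
        System.out.println(cutoffMillis);
    }
}
```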

If any other issues are encountered, please raise a GitHub issue on the project.
