Spring Data
Spring Data is a part of the Spring Framework which makes it easy to map data from a Java application onto the Aerospike Database and read it back. To learn more about Spring Data see https://spring.io/projects/spring-data, and to learn about the Spring Framework, see https://spring.io/projects/spring-framework.
The Spring Data Aerospike implementation supports both synchronous and reactive programming paradigms.
Getting Started
To use Spring Data Aerospike in your project, the first step is to add it to your build process. For Maven, this is as simple as:
<dependency>
    <groupId>com.aerospike</groupId>
    <artifactId>spring-data-aerospike</artifactId>
    <version>4.1.0</version>
</dependency>
and for Gradle users:
implementation group: 'com.aerospike', name: 'spring-data-aerospike', version: '4.1.0'
Connecting to the Aerospike Database
Connecting to the database is straightforward with the help of the AbstractAerospikeDataConfiguration class:
@Configuration
@EnableAerospikeRepositories(basePackageClasses = { PersonRepository.class })
public class AerospikeConfiguration extends AbstractAerospikeDataConfiguration {
    @Override
    protected Collection<Host> getHosts() {
        return Collections.singleton(new Host("localhost", 3000));
    }
    @Override
    protected String nameSpace() {
        return "test";
    }
}
@Configuration tells Spring that this class contains configuration data, and @EnableAerospikeRepositories activates Aerospike repositories that can be used for data access. The parameter to this annotation tells Spring Data Aerospike where to look for the repositories. This can be a list of package names as strings using the basePackages value, or a list of classes through the basePackageClasses value. If the latter is used (as in this example), the class is used to determine which package to scan, and all repositories in that package will be available for use. More details on repositories are below.
The AbstractAerospikeDataConfiguration class exposes a number of beans which Spring Data Aerospike uses internally. Some of these, in particular the AerospikeTemplate bean, are useful in their own right if finer-grained control over data access is needed. The primary information required by this configuration is how to connect to the cluster, provided through the getHosts and nameSpace calls.
Creating Functionality
The base functionality for using Spring Data is provided by the AerospikeRepository interface. This typically takes two type parameters:
- The type which this class manages, typically an entity class to be stored in the database.
- The type of the ID for this class.
Application code typically extends this interface for each of the types to be managed, and methods can be added to the interface to determine how the application can access the data. For example, consider a class Person with a simple structure:
@AllArgsConstructor
@NoArgsConstructor
@Data
@Document
public class Person {
    @Id
    private long id;
    private String firstName;
    private String lastName;
    @Field("dob")
    private Date dateOfBirth;
}
Note that this example uses the Project Lombok annotations to remove the need for explicit constructors, getters, and setters. Normal POJOs which define these on their own can ignore the @AllArgsConstructor, @NoArgsConstructor, and @Data annotations. The @Document annotation tells Spring Data Aerospike that this is a domain object to be persisted to the database, and @Id identifies the primary key of this class. The @Field annotation is used to create a shorter name for the bin in the Aerospike database (dateOfBirth will be stored in a bin called dob in this example).
For the Person object to be persisted to Aerospike, you must create an interface with the desired methods for retrieving data. For example:
public interface PersonRepository extends AerospikeRepository<Person, Long> {
    public List<Person> findByLastName(String lastName);
}
This defines a repository which can write Person entities as well as query people by last name. The AerospikeRepository extends both PagingAndSortingRepository and CrudRepository, so methods like count(), findById(), save(), and delete() are there by default. For reactive users, use the ReactiveAerospikeRepository instead.
Note that this is just an interface and not an actual class. In the background, when your context gets initialized, actual implementations for your repository descriptions get created and you can access them through regular beans. This means you will save lots of boilerplate code while still exposing full CRUD semantics to your service layer and application.
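To make the idea of a generated implementation more concrete, the following is a minimal sketch that uses only the Java standard library. It relies on java.lang.reflect.Proxy and an in-memory list; the names RepositoryProxySketch, PersonRecord, and PersonFinder are hypothetical. It is not how Spring Data Aerospike is implemented, only an illustration of how an interface can receive an implementation at runtime.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not Spring Data Aerospike code.
final class RepositoryProxySketch {

    // Simplified stand-in for the Person entity.
    static final class PersonRecord {
        final long id;
        final String firstName;
        final String lastName;

        PersonRecord(long id, String firstName, String lastName) {
            this.id = id;
            this.firstName = firstName;
            this.lastName = lastName;
        }
    }

    // An interface with no hand-written implementation.
    interface PersonFinder {
        List<PersonRecord> findByLastName(String lastName);
    }

    // Builds an implementation of PersonFinder at runtime, backed by an in-memory list.
    static PersonFinder createFinder(List<PersonRecord> store) {
        InvocationHandler handler = (proxy, method, methodArgs) -> {
            if (method.getName().equals("findByLastName")) {
                List<PersonRecord> matches = new ArrayList<>();
                for (PersonRecord person : store) {
                    if (person.lastName.equals(methodArgs[0])) {
                        matches.add(person);
                    }
                }
                return matches;
            }
            // Any other method is outside the scope of this sketch.
            throw new UnsupportedOperationException(method.getName());
        };
        return (PersonFinder) Proxy.newProxyInstance(
                PersonFinder.class.getClassLoader(),
                new Class<?>[] { PersonFinder.class },
                handler);
    }

    public static void main(String[] args) {
        PersonFinder finder = createFinder(List.of(
                new PersonRecord(1, "Bob", "Jones"),
                new PersonRecord(2, "Alice", "Smith")));
        System.out.println(finder.findByLastName("Jones").get(0).firstName); // prints Bob
    }
}
```

The calling code only ever sees the PersonFinder interface, which is the same position a service is in when it receives a repository bean.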
Once this is defined, the repository is ready for use. A sample Spring Controller which uses this repository could be:
@RestController
public class ApplicationController {
    @Autowired
    private PersonRepository personRepository;
    @GetMapping("/seed")
    public int seedData() {
        // Calendar months are zero-based, so month 12 rolls over to January 1972.
        Person person = new Person(1, "Bob", "Jones", new GregorianCalendar(1971, 12, 19).getTime());
        personRepository.save(person);
        return 1;
    }
    @GetMapping("/findByLastName/{lastName}")
    public List<Person> findByLastName(@PathVariable(name = "lastName", required = true) String lastName) {
        return personRepository.findByLastName(lastName);
    }
}
Invoking the seed method above gives you a record in the Aerospike database which looks like:
aql> select * from test.Person where pk = "1"
+-----+-----------+----------+-------------+-------------------------------------+
| PK | firstName | lastName | dob | @_class |
+-----+-----------+----------+-------------+-------------------------------------+
| "1" | "Bob" | "Jones" | 64652400000 | "com.aerospike.sample.model.Person" |
+-----+-----------+----------+-------------+-------------------------------------+
1 row in set (0.001 secs)
There are 2 important things to notice here:
- The fully qualified path of the class is listed in each record. This is needed to instantiate the class correctly, especially in cases where the compile-time type and runtime type of the object differ. For example, where a field is declared as a super class but the instantiated class is a sub-class.
- The long id field was turned into a String when stored in the database. All @Id fields must be convertible to Strings and will be stored in the database as such, then converted back to the original type when the object is read. This is transparent to the application, but needs to be considered if using an external tool like AQL to view the data.
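The round trip described in the second point can be sketched in plain Java. The class and method names below (IdRoundTripSketch, toStoredKey, fromStoredKey) are hypothetical; Spring Data Aerospike performs the real conversion internally, and this sketch only shows why a lookup in an external tool uses "1" rather than 1.

```java
// Hypothetical illustration; Spring Data Aerospike performs the real conversion internally.
final class IdRoundTripSketch {

    // Converts a numeric id to the String form used as the record key.
    static String toStoredKey(long id) {
        return String.valueOf(id);
    }

    // Converts the stored String key back to the entity's declared id type.
    static long fromStoredKey(String storedKey) {
        return Long.parseLong(storedKey);
    }

    public static void main(String[] args) {
        String key = toStoredKey(1L);
        System.out.println(key);                // the key as an external tool would see it
        System.out.println(fromStoredKey(key)); // the id as the application sees it
    }
}
```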
Indexing
Notice that findByLastName is not a simple lookup by key, but rather a query that finds all matching records in a set. Aerospike has 2 ways of achieving this:
- Scanning all the records in the set and extracting the appropriate records.
- Defining a secondary index on the field lastName and using this secondary index to satisfy the query.
The second approach is far more efficient. Aerospike stores the secondary indexes in a memory structure, allowing exceptionally fast identification of the records that match. However, this relies on a secondary index having been created. This can either be created by systems administrators using the asadm tool, or can be created on the fly by giving Spring Data Aerospike a hint that such an index is necessary.
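The difference between the two approaches can be sketched with an in-memory model. This is a simplified illustration using only the Java standard library, not the Aerospike server's implementation; the class ScanVersusIndexSketch and the representation of each record as a {firstName, lastName} array are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model; not the Aerospike server's implementation.
final class ScanVersusIndexSketch {

    // Approach 1: examine every record and keep the ones that match.
    static List<String[]> scan(List<String[]> records, String lastName) {
        List<String[]> matches = new ArrayList<>();
        for (String[] record : records) {
            if (record[1].equals(lastName)) {
                matches.add(record);
            }
        }
        return matches;
    }

    // Approach 2, step 1: build the index once, mapping lastName to its records.
    static Map<String, List<String[]>> buildIndex(List<String[]> records) {
        Map<String, List<String[]>> index = new HashMap<>();
        for (String[] record : records) {
            index.computeIfAbsent(record[1], key -> new ArrayList<>()).add(record);
        }
        return index;
    }

    // Approach 2, step 2: answer the query with a single map lookup.
    static List<String[]> lookup(Map<String, List<String[]>> index, String lastName) {
        return index.getOrDefault(lastName, List.of());
    }

    public static void main(String[] args) {
        List<String[]> records = List.of(
                new String[] {"Bob", "Jones"},
                new String[] {"Alice", "Smith"},
                new String[] {"Carol", "Jones"});
        System.out.println(scan(records, "Jones").size());                 // prints 2
        System.out.println(lookup(buildIndex(records), "Jones").size());   // prints 2
    }
}
```

Both approaches return the same records; the scan touches every record on every query, while the index pays its cost once when it is built.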
To have Spring Data Aerospike create the index, use the @Indexed annotation on the field where an index is required. This changes the Person object as in the following example:
@AllArgsConstructor
@NoArgsConstructor
@Data
@Document
public class Person {
    @Id
    private long id;
    private String firstName;
    @Indexed(name = "lastName_idx", type = IndexType.STRING)
    private String lastName;
    private Date dateOfBirth;
}
This creates the index at runtime if it does not already exist, then the queries can use it. The Spring Data Aerospike adapter has intelligence built into it so it can use a secondary index for a range of queries with multiple predicates.
For example, the requirement might be "LastName matches a passed parameter and FirstName contains a different string." In Spring Data, this is specified by:
public List<Person> findByLastNameAndFirstNameContaining(String lastName, String firstName);
This uses the same secondary index as defined above, and the additional predicate (FirstNameContaining) is applied to the results derived from the first predicate (LastName).
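The two-step evaluation described above can be sketched as follows. This is a hypothetical in-memory model rather than the adapter's actual query planner: the class IndexThenFilterSketch is invented for illustration, each record is represented as a {firstName, lastName} array, and the index is a plain map from lastName to records.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model; not the adapter's actual query planner.
final class IndexThenFilterSketch {

    // Each record is {firstName, lastName}; the index maps lastName to its records.
    static List<String[]> findByLastNameAndFirstNameContaining(
            Map<String, List<String[]>> lastNameIndex, String lastName, String fragment) {
        List<String[]> result = new ArrayList<>();
        // First predicate: exact match, answered by the index.
        for (String[] record : lastNameIndex.getOrDefault(lastName, List.of())) {
            // Second predicate: applied only to the records the index returned.
            if (record[0].contains(fragment)) {
                result.add(record);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String[]>> index = Map.of(
                "Jones", List.of(new String[] {"Bob", "Jones"}, new String[] {"Carol", "Jones"}));
        System.out.println(
                findByLastNameAndFirstNameContaining(index, "Jones", "ob").get(0)[0]); // prints Bob
    }
}
```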
Note that in Aerospike, secondary indexes support only case-sensitive, exact-match queries. So a method such as
public List<Person> findByLastNameContaining(String lastName);
could not be satisfied by the secondary index. In this case, Aerospike would need to scan the data (the first approach listed above). This can be an expensive operation as all records in the set must be read by the Aerospike server and the condition applied to see if they match. Due to the cost of performing this operation, scans from Spring Data Aerospike are disabled by default. If the cost of the scans is acceptable to an organization, they can be enabled by setting scansEnabled to true in the AerospikeDataSettings. One way to do this is to create a custom bean which overrides the default settings:
@Configuration
@EnableAerospikeRepositories(basePackageClasses = { PersonRepository.class })
public class AerospikeConfiguration extends AbstractAerospikeDataConfiguration {
    @Override
    protected Collection<Host> getHosts() {
        return Collections.singleton(new Host("localhost", 3000));
    }
    @Override
    protected String nameSpace() {
        return "test";
    }
    @Bean
    public AerospikeDataSettings aerospikeDataSettings() {
        return AerospikeDataSettings.builder().scansEnabled(true).build();
    }
}
Note: Once this flag is enabled, scans run whenever needed with no warnings or errors. This may or may not be optimal for a particular use case. In the example above, assume there was a new requirement to be able to find by firstName with an exact match:
public interface PersonRepository extends AerospikeRepository<Person, Long> {
    public List<Person> findByLastName(String lastName);
    public List<Person> findByFirstName(String firstName);
}
In this case firstName is not marked as @Indexed, so Spring Data Aerospike does not know to create an index on it. Hence it will scan the set, a costly operation which could be avoided by using an index.
Useful Resources
There are a number of blog posts to help you get started with Spring Data Aerospike. These include:
- Simple Web Application Using Java, Spring Boot, Aerospike and Docker
- How to setup spring-data-aerospike in Spring Boot application
- Basic error handling in spring-data-aerospike
- How to create secondary index in Spring Data Aerospike
- Caching with Spring Boot and Aerospike
- Spring Data Aerospike: Reactive Repositories
- Spring Data Aerospike - Projections
Known Issues
Spring Data Aerospike is an active project with a large and complex code base. Like all such projects, there are known issues from time to time. Currently the known issues are:
- Queries derived from method names do not support the Distinct and Like keywords.
- Date comparisons do not work when using Dates. Pass the comparison parameters as Longs instead by invoking myDate.getTime().
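The workaround for date comparisons can be sketched as follows. The class DateParameterSketch and its method names are hypothetical helpers, not part of Spring Data Aerospike; the sketch only shows the conversion from a Date to the epoch-millisecond long that would be passed as the comparison parameter, using the dob value from the sample record above.

```java
import java.util.Date;

// Hypothetical helper; the names here are illustrative, not part of Spring Data Aerospike.
final class DateParameterSketch {

    // Converts a Date into the epoch-millisecond long to pass as the query parameter.
    static long toQueryParameter(Date date) {
        return date.getTime();
    }

    // Mirrors a "before" comparison against a stored epoch-millisecond value.
    static boolean storedValueIsBefore(long storedMillis, Date cutoff) {
        return storedMillis < toQueryParameter(cutoff);
    }

    public static void main(String[] args) {
        long storedDob = 64652400000L;          // the dob value from the sample record
        Date cutoff = new Date(100000000000L);  // an arbitrary later instant
        System.out.println(toQueryParameter(cutoff));               // prints 100000000000
        System.out.println(storedValueIsBefore(storedDob, cutoff)); // prints true
    }
}
```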
If any other issues are encountered, please raise a GitHub issue on the project.