# Recommendations for Microsoft Azure

## OS

We recommend using the latest Ubuntu Server LTS as it has the most recent optimizations and bug fixes on Azure. You can find Ubuntu through the Azure portal or at [this page](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/canonical.0001-com-ubuntu-server-focal).

## Instance type

The Lsv2 family of VMs is specifically designed for high performance I/O applications. We recommend Lsv2 VMs for deploying Aerospike.

You can, however, use the M and DSv2 VMs if their characteristics match your requirements.

In general, any persistence-based storage should use VM families with the ‘S’ suffix, which indicates that the VMs support Premium (SSD) Storage.

::: note
Aerospike recommends enabling [Write Accelerator](https://docs.microsoft.com/en-us/azure/virtual-machines/windows/how-to-enable-write-accelerator).
:::

## Network setup

By default, machines within the same Virtual Network in Azure can communicate freely with each other. Aerospike uses TCP port 3000 for communication with clients and TCP ports 3001-3002 (fabric and heartbeat) for intra-cluster communication. These ports do not need to be open to the internet. If the database clients are in the same Virtual Network, they can connect over port 3000 without a separate firewall rule.

You need a port for SSH access to your instances. The default TCP port is 22.

All instances in Azure are assigned an internal IP address. These internal IP addresses should be used in the [mesh heartbeat](https://aerospike.com/docs/database/manage/network/heartbeat#mesh-unicast-heartbeat) configuration.
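As a minimal sketch of what this looks like in `aerospike.conf`, the fragment below seeds the mesh with two internal addresses. The addresses `10.0.0.4` and `10.0.0.5` are hypothetical placeholders for the private IPs of your own VMs. The port assumes the default heartbeat port of 3002.

```plaintext
heartbeat {
    mode mesh
    port 3002

    # Internal (private) IP addresses of cluster nodes.
    # Hypothetical values; replace them with your VMs' addresses.
    mesh-seed-address-port 10.0.0.4 3002
    mesh-seed-address-port 10.0.0.5 3002

    interval 150
    timeout 10
}
```

This block belongs inside the `network` section of the configuration file. Listing more than one seed address means a node can still join the cluster when one of the seeds is down.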

## Persistent disks

Azure provides storage in the form of VHD Disks in [Storage Accounts](https://docs.microsoft.com/en-us/azure/storage/storage-introduction).

There are two types of disks. High-performance Premium Storage offers SSD-backed storage, while Standard Storage offers HDD-backed storage. The performance of a disk is closely tied to the size of the disk volume.

::: note
As of this writing, the [best-performing](https://docs.microsoft.com/en-us/azure/virtual-machines/windows/premium-storage#premium-storage-disk-limits) disk is the P40 (2048 GB) disk. There is a larger P50 (4096 GB) disk, but it does not confer additional performance benefits.
:::
::: note
Azure [limits](https://docs.microsoft.com/en-us/azure/storage/storage-premium-storage#scalability-and-performance-targets) individual persistent disk performance. To achieve higher performance, you must provision additional disks and use them in parallel.
:::
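
One way to use several disks in parallel is to list each one as its own `device` line, since Aerospike distributes data across all devices configured for a namespace. The sketch below assumes two attached Premium Storage Disks. The device paths are hypothetical and depend on how the disks enumerate on your VM.

```plaintext
storage-engine device {
    # Hypothetical device paths; check the actual names with lsblk.
    device /dev/sdc
    device /dev/sdd
}
```

Aggregate throughput is still bounded by the VM's own disk limits, so adding disks helps only until the instance limit is reached.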

## Local SSD

Some instance types come with local SSDs. These provide high input/output operations per second (IOPS) and low latency compared to the persistent disk options. However, local SSDs are created and destroyed along with the virtual machine instance. Despite this, local SSD storage can be used with an Aerospike cluster, provided that the data is always replicated across local SSDs attached to multiple virtual machines in the cluster.

Here is an example configuration snippet:

```plaintext
storage-engine device {
    device /dev/sdb
}
```

Local SSD IOPS are not limited by persistent disk IOPS allocations or instance IOPS allocations. Local SSDs have their own allocations.

::: note
See [Premium Storage scalability](https://docs.microsoft.com/en-us/azure/storage/storage-premium-storage#premium-storage-scalability-and-performance-targets).
:::

## Shadow device configuration

As noted above, some Azure instance types have local SSDs. These can be significantly faster than Premium Storage Disks, which are network attached. However, Azure treats local SSDs as cache that is not suitable for long-term data, because these volumes are purged when the instance stops.

To take advantage of local disks with the persistence guarantee of Azure Blob Storage, Aerospike has the [Shadow Device configuration](https://aerospike.com/docs/database/manage/namespace/storage/config/#setup-for-shadow-device) for the storage engine.

Write throughput is still limited by the instance limit and by the Premium Storage Disk volume, so this strategy gives good results when the percentage of writes is low.

An example config would be as follows:

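```plaintext
storage-engine device {
    # The first path is the primary device (local SSD);
    # the second path is its shadow device (Premium Storage Disk).
    device /dev/sdb /dev/sdc

    ...
}
```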

::: note
For data-in-memory use cases with persistence, it may also be preferable to use a local SSD device alongside a Premium Storage Disk volume. Doing so saves the IOPS cost of reads incurred during defragmentation: the reads are performed against the local SSD device, and the rewritten (defragmented) blocks are mirrored directly to the Premium Storage Disk volume.
:::

## Fault tolerance

Azure has the concept of [Availability Sets](https://docs.microsoft.com/en-us/azure/virtual-machines/availability-set-overview). An Availability Set consists of Update Domains and Fault Domains.

![Azure Update Domains and Fault Domains](https://aerospike.com/docs/_astro/azure-udfd.wwegSzTx_22qc67.png)

Update Domains are groups of physical systems that can be rebooted simultaneously in a maintenance event. Fault Domains are groups of physical systems that share a power source and a network switch.

::: note
Aerospike suggests defining the Availability Set with as many Update Domains and Fault Domains as possible.
:::
::: note
Azure will not give notice of VM reboots for Update Domain interruptions.
:::
::: caution
VMs are migrated by Azure automatically should a Fault Domain issue arise. This live-migration process has historically been detrimental to Aerospike. Aerospike can be made more resilient against live migrations, but at the cost of performance.
:::
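
One setting that may be relevant to the trade-off in the caution above is the heartbeat `timeout`. This is the number of missed heartbeat intervals after which a node is considered to have left the cluster. The fragment below is a hedged sketch, assuming mesh heartbeats on the default port. The value 20 is illustrative only and is not an Aerospike recommendation for Azure.

```plaintext
heartbeat {
    mode mesh
    port 3002
    # mesh-seed-address-port entries omitted for brevity.

    # Milliseconds between heartbeats (150 is the default).
    interval 150

    # Illustrative value: missed intervals tolerated before a node
    # is dropped from the cluster (the default is 10).
    timeout 20
}
```

A higher timeout lets the cluster ride out a longer pause without triggering data rebalancing, but genuine node failures take longer to detect.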
::: note
Aerospike Enterprise Edition also provides Cross-Datacenter Replication (XDR) to support disaster recovery (DR) scenarios.
:::

## Additional information

-   [Optimize your Linux VM on Azure](https://docs.microsoft.com/en-us/azure/virtual-machines/linux/optimization)