# Choosing Instance Types

> Source: https://parallelworks.com/docs/compute/instance-types

When configuring a cloud-based cluster, there are many instance types to choose from, and each cloud service provider (CSP) has its own naming conventions. Elements of an instance type name correspond to characteristics of the physical machines that clusters are deployed from. This page explains the instance naming conventions for AWS, Azure, and Google clusters.

Please note that specific instance types are subject to change as CSPs add and remove hardware; however, you can always refer to this page to help identify nodes and any additional features that are included with them. 

## AWS

On AWS, instance types are named following a pattern based on: 
- instance family
- instance generation
- processor family
- additional capabilities
- instance size

![An example of an AWS instance type with labels for each element.](/content-images/docs/compute/instance-types/aws-instance-example.png)
_An example of an AWS instance type from their documentation_

### Instance Families

| Syntax | Description                |
| ------ | -------------------------- |
|  c     | Compute optimized          |
|  d     | Dense storage              |
|  f     | [FPGA](https://aws.amazon.com/ec2/instance-types/f1/) |
|  g     | Graphics intensive         |
| hpc    | High performance computing |
| inf    | [AWS Inferentia](https://aws.amazon.com/machine-learning/inferentia/) |
|  m     | General purpose            |
| mac    | macOS                      |
|  p     | GPU accelerated            |
|  r     | Memory optimized           |
|  t     | Burstable performance      |
| trn    | [AWS Trainium](https://aws.amazon.com/machine-learning/trainium/) |
|  u     | High memory                |
| vt     | Video transcoding          |
|  x     | Memory intensive           |

### Processor Families

| Syntax | Description                |
| ------ | -------------------------- |
|  a     | AMD processors             |
|  g     | [AWS Graviton](https://aws.amazon.com/ec2/graviton/) processors |
|  i     | Intel processors           |

**Note for the `i` syntax**: Many older Intel-based instance types do not include this code. It was likely added when AWS began offering more Graviton and AMD-based options. 

### Additional Capabilities

| Syntax | Description                |
| ------ | -------------------------- |
|  d     | Instance store volumes     |
|  n     | Network and [EBS](https://aws.amazon.com/ebs/) optimized |
|  e     | Extra storage or memory    |
|  z     | High performance           |
| flex   | [Flex instance](https://aws.amazon.com/ec2/instance-types/m7i/) |
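
As a rough illustration of how the elements above combine, the following Python sketch splits an AWS instance type name into its parts. The function name `parse_aws_instance_type` and the regular expression are assumptions made for this example, not an AWS API. The lookup tables contain only the codes listed on this page, so letters outside those tables are reported as unknown, and unusual names (such as the `u` high-memory family) are not handled.

```python
import re

# Lookup tables copied from the tables on this page (not a complete AWS list).
INSTANCE_FAMILIES = {
    "c": "Compute optimized", "d": "Dense storage", "f": "FPGA",
    "g": "Graphics intensive", "hpc": "High performance computing",
    "inf": "AWS Inferentia", "m": "General purpose", "mac": "macOS",
    "p": "GPU accelerated", "r": "Memory optimized",
    "t": "Burstable performance", "trn": "AWS Trainium",
    "vt": "Video transcoding", "x": "Memory intensive",
}
PROCESSOR_FAMILIES = {"a": "AMD", "g": "AWS Graviton", "i": "Intel"}
CAPABILITIES = {
    "d": "Instance store volumes",
    "n": "Network and EBS optimized",
    "e": "Extra storage or memory",
    "z": "High performance",
}

# Assumed layout: family letters, generation digits, optional processor
# letter, capability letters, optional "-flex", then ".size".
_PATTERN = re.compile(
    r"^(?P<family>[a-z]+)(?P<generation>\d+)(?P<processor>[agi])?"
    r"(?P<caps>[a-z]*)(?P<flex>-flex)?\.(?P<size>[a-z0-9-]+)$"
)


def parse_aws_instance_type(name):
    """Split a name such as 'c6in.24xlarge' into its labeled elements."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized instance type name: {name!r}")

    capabilities = [CAPABILITIES.get(c, f"unknown ({c})") for c in match["caps"]]
    if match["flex"]:
        capabilities.append("Flex instance")

    processor = match["processor"]
    return {
        "family": INSTANCE_FAMILIES.get(match["family"], "unknown"),
        "generation": int(match["generation"]),
        # Older Intel-based types carry no processor letter at all.
        "processor": PROCESSOR_FAMILIES[processor] if processor else "not encoded in name",
        "capabilities": capabilities,
        "size": match["size"],
    }


if __name__ == "__main__":
    for example in ("c5n.18xlarge", "c6in.24xlarge", "hpc6a.48xlarge"):
        print(example, parse_aws_instance_type(example))
```

The sketch treats an `a`, `g`, or `i` directly after the generation number as the processor code and everything after it as capability letters. That heuristic matches the examples on this page, but it should be checked against the AWS documentation before relying on it for other instance types.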

### Instance Generations

Older instance generations are usually kept available for a set period of time, but we suggest using newer generations for optimal performance.
- [documentation](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-types.html#current-gen-instances) for the current generation of AWS instances
- [documentation](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-types.html#previous-gen-instances) for the previous generation of AWS instances

### Selection Guidelines

For CPU-based workloads, most ACTIVATE users will want to select **compute-optimized** instance types, which include the **c** and **hpc** instance families: 

- **c5n.18xlarge**: One of ACTIVATE’s default configuration instances. c5 instances are based on Intel Skylake processors. Note that the instance name does not include an `i` because it predates other processors being included in the family.
- **c6in.24xlarge**: A newer generation of the `c` instance family. Note that this instance includes an `i` in the name to separate it from `c6a` (AMD) and `c6g` (Graviton) instances.
- **hpc6a.48xlarge**: AMD EPYC-based instances designed specifically for HPC workloads.

For GPU-based workloads, look for instances in the **g** and **p** families: 

- **g5.48xlarge**: g5 instances are equipped with NVIDIA A10G Tensor Core GPUs and AMD EPYC processors.
- **p3.16xlarge**: p3 instances include Intel Skylake processors and NVIDIA V100 Tensor Core GPUs.

:::info Note
Instance options vary by zone and region. If you're trying to use a specific instance type and it's not visible in the dropdown, try changing to a different region first.
:::

### Further Reading 

You can read more about AWS instances and naming conventions any time by visiting [this page](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-types.html) of their documentation. 

## Azure

Azure's naming structure follows this pattern: 

[Family] + [Sub-family] + [# of vCPUs] + [Constrained vCPUs] + [Additive Features] + [Accelerator Type] + [Version]

On ACTIVATE, we also add an Azure instance's [tier](https://learn.microsoft.com/en-us/azure/search/search-sku-tier#tier-descriptions).

For example, the Azure instance `Standard_HC44rs` can be broken down into: 
- **Tier**: Standard
- **Family**: H
- **Sub-family**: C
- **CPUs**: 44
- **Additive Features**: rs
    - **r**: RDMA capable
    - **s**: Premium Storage capable
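
The breakdown above can be sketched in Python. The regular expression below is an assumption based only on the pattern shown on this page (tier, family, sub-family, vCPUs, constrained vCPUs, additive features, accelerator type, version). The function name `parse_azure_instance_type` is hypothetical, and the sketch has not been checked against every Azure size name.

```python
import re

# Assumed layout, following the pattern on this page:
# Tier_ + Family + Sub-family + vCPUs + [-constrained vCPUs]
#       + additive features + [_Accelerator] + [_vN]
_AZURE_PATTERN = re.compile(
    r"^(?P<tier>[A-Za-z]+)_"
    r"(?P<family>[A-Z])(?P<subfamily>[A-Z]*)"
    r"(?P<vcpus>\d+)"
    r"(?:-(?P<constrained>\d+))?"
    r"(?P<features>[a-z]*)"
    r"(?:_(?P<accelerator>[A-Z][A-Za-z0-9]*))?"
    r"(?:_(?P<version>v\d+))?$"
)


def parse_azure_instance_type(name):
    """Split a name such as 'Standard_HC44rs' into its labeled elements."""
    match = _AZURE_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognized instance type name: {name!r}")

    constrained = match["constrained"]
    return {
        "tier": match["tier"],
        "family": match["family"],
        "subfamily": match["subfamily"] or None,
        "vcpus": int(match["vcpus"]),
        "constrained_vcpus": int(constrained) if constrained else None,
        # Each lowercase letter is treated as one additive feature.
        "features": list(match["features"]),
        "accelerator": match["accelerator"],
        "version": match["version"],
    }


if __name__ == "__main__":
    print(parse_azure_instance_type("Standard_HC44rs"))
```

For `Standard_HC44rs`, this yields the tier `Standard`, family `H`, sub-family `C`, 44 vCPUs, and the features `r` and `s`, matching the list above.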

### Instance Families

| Syntax | Description                                   |
| ------ | --------------------------------------------- |
|  A     | Entry-level VMs for dev/test                  |
|  Bs    | Economical burstable VMs                      |
|  D     | General purpose compute                       |
|  E     | Optimized for in-memory applications          |
|  F     | Compute optimized virtual machines            |
|  G     | Memory and storage optimized virtual machines |
|  H     | High Performance Computing virtual machines   |
|  Ls    | Storage optimized virtual machines            |
|  M     | Memory optimized virtual machines             |
|  Mv2   | Largest memory optimized virtual machines     |
|  N     | GPU-enabled virtual machines                  |

#### Instance Sub-families

Many Azure instance families include sub-families with different features. For example, H-Series instances come in two flavors:
- **HB**: Up to 120 AMD EPYC 7003-series CPU cores, 448 GB of RAM, and no hyperthreading
- **HC**: Up to 44 Intel Xeon Platinum 8168 processor cores, 8 GB of RAM per CPU core, no hyperthreading, and up to 4 Managed Disks

### Additional Capabilities

| Syntax | Description                                   |
| ------ | --------------------------------------------- |
|  a     | AMD-based processor                           |
|  b     | Block Storage performance                     |
|  d     | diskful (that is, a local temp disk is present); this feature is for newer Azure VMs; see [Ddv4 and Ddsv4-series](https://learn.microsoft.com/en-us/azure/virtual-machines/ddv4-ddsv4-series)                                       |
|  i     | isolated size                                 |
|  l     | low memory; a lower amount of memory than the memory intensive size |
|  m     | memory intensive; the largest amount of memory in a particular size |
|  p     | ARM CPU                                       |
|  t     | tiny memory; the smallest amount of memory in a particular size     |
|  s     | Premium Storage capable, including possible use of [Ultra SSD](https://learn.microsoft.com/en-us/azure/virtual-machines/disks-types#ultra-disks)|
|  C     | confidential                                  |
|  NP    | node packing                                  |
|  r     | [RDMA](https://www.microsoft.com/en-us/research/publication/empowering-azure-storage-with-rdma/) capable                                   |

Please note that this is not a complete list of additive features. Additionally, these identifiers are not used in all node types that may apply to them. For example, `Standard_HB60rs` instances have AMD EPYC processors, but don’t have an `a` listed as an additional capability.

### Instance Generations (version)

Like other cloud providers, Azure routinely updates its instances with newer generations. Azure has a product page for each instance series that describes its specifications, current generation, and additional features. Information for the H-Series nodes can be found on [this page](https://learn.microsoft.com/en-us/azure/virtual-machines/sizes-hpc).

### Selection Guidelines

For compute clusters, we suggest using Azure H-Series nodes as they are InfiniBand/RDMA enabled for high-speed networking. Our primary default cluster configuration uses `Standard_HC44rs` instances.

For Lustre, stick to instances that have `d` and `s` listed as additional features for their enhanced storage functionality. 
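
One way to apply the Lustre guideline when scanning a list of candidate names is a small feature check like the sketch below. The helper name `has_features` and its regular expression are assumptions for this example. It only inspects the lowercase feature letters in the name, so it will miss instances whose capabilities are not encoded in their names.

```python
import re

# Captures the lowercase additive-feature letters that follow the vCPU count,
# for example "ds" in "Standard_D4ds_v4" (assumed name layout).
_FEATURES = re.compile(r"^[A-Za-z]+_[A-Z]+\d+(?:-\d+)?(?P<features>[a-z]*)")


def has_features(name, required):
    """Return True if every letter in `required` appears in the name's features."""
    match = _FEATURES.match(name)
    if match is None:
        raise ValueError(f"unrecognized instance type name: {name!r}")
    return set(required) <= set(match["features"])


if __name__ == "__main__":
    for candidate in ("Standard_D4ds_v4", "Standard_HC44rs"):
        print(candidate, has_features(candidate, "ds"))
```

Here `Standard_D4ds_v4` passes the check because its name lists both `d` and `s`, while `Standard_HC44rs` does not because it lists `s` but not `d`.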

### Further Reading
You can read more about Azure naming conventions by visiting [this page](https://learn.microsoft.com/en-us/azure/virtual-machines/vm-naming-conventions) of their documentation. 

You can read more about Azure instance types on [this page](https://azure.microsoft.com/en-us/pricing/details/virtual-machines/series/) of their documentation. 

## Google

### Machine Families & Series

Google instances fall into one of four categories (called families) and are further categorized by their series and generation.

- **General-purpose**: best price-performance ratio for a variety of workloads
    - **e2**
    - **n2, n2d, n1**
    - **c3**
    - **tau t2d, tau t2a**
- **Compute-optimized**: highest performance per core on Compute Engine and optimized for compute-intensive workloads
    - **h3**
    - **c2, c2d**
- **Memory-optimized**: ideal for memory-intensive workloads, offering more memory per core than other machine families, with up to 12 TB of memory
    - **m3, m2, m1**
- **Accelerator-optimized**: ideal for massively parallelized Compute Unified Device Architecture (CUDA) compute workloads, such as machine learning (ML) and high-performance computing (HPC); this family is the best option for workloads that require GPUs
    - **a2**
    - **g2**
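
The family and series list above can be turned into a simple lookup. The sketch below assumes that a machine type name begins with its series followed by a hyphen, as in `h3-standard-88`. The function name `machine_family` is hypothetical, and the mapping covers only the series named on this page.

```python
# Series-to-family mapping copied from the list on this page
# (the Tau series appear as "t2d" and "t2a" in machine type names).
SERIES_TO_FAMILY = {
    "e2": "General-purpose",
    "n2": "General-purpose",
    "n2d": "General-purpose",
    "n1": "General-purpose",
    "c3": "General-purpose",
    "t2d": "General-purpose",
    "t2a": "General-purpose",
    "h3": "Compute-optimized",
    "c2": "Compute-optimized",
    "c2d": "Compute-optimized",
    "m3": "Memory-optimized",
    "m2": "Memory-optimized",
    "m1": "Memory-optimized",
    "a2": "Accelerator-optimized",
    "g2": "Accelerator-optimized",
}


def machine_family(machine_type):
    """Return the machine family for a name such as 'h3-standard-88'."""
    # Assumption: the series is everything before the first hyphen.
    series = machine_type.split("-", 1)[0].lower()
    return SERIES_TO_FAMILY.get(series, "unknown")


if __name__ == "__main__":
    for example in ("h3-standard-88", "c2-standard-60", "n2d-standard-4"):
        print(example, "->", machine_family(example))
```

Series that Google adds after this page was written will be reported as `unknown` until the mapping is extended.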

### Selection Guidelines

`h3-standard-88` is Google’s newest node type that's suitable for HPC workloads. This type features 88 vCPUs (no hyperthreading), 352 GB of memory, and up to 200 Gbps of network egress bandwidth.

`c2-standard-60` instances are smaller than the **h3** nodes, but are also well suited for HPC applications.

For GPUs, try the **a2** series.

### Further Reading 

You can read more about Google instances and naming conventions any time by visiting [this page](https://cloud.google.com/compute/docs/machine-resource) of their documentation.
