Feb. 20, 2023

OpenStack Cinder Terminology

#openstack #cinder #101

Gathering information on storage terminology can be challenging. Below are some definitions to help you understand the terms commonly used with OpenStack storage.


GB

Cinder uses the binary unit of measurement defined by the International Electrotechnical Commission (IEC): the gibibyte, with symbol GiB. This means that when we refer to 1GB, we are actually talking about 1024MB (strictly speaking, 1 GiB = 1024 MiB). The same rule applies to TB and MB.
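As a quick illustration of the binary units described above, here is a minimal Python sketch. The constant and function names are made up for this example and are not part of Cinder.

```python
# Binary (IEC) units: each step is a factor of 1024, not 1000.
MIB = 1024 ** 2  # bytes in one MiB
GIB = 1024 ** 3  # bytes in one GiB
TIB = 1024 ** 4  # bytes in one TiB


def gib_to_bytes(size_gib: float) -> int:
    """Convert a size that Cinder calls "GB" (really GiB) into bytes."""
    return int(size_gib * GIB)


print(GIB // MIB)       # 1024: one Cinder "GB" is 1024 "MB"
print(TIB // GIB)       # 1024: one Cinder "TB" is 1024 "GB"
print(gib_to_bytes(1))  # 1073741824
```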


Volume

A storage volume is the basic unit of storage, whether it's allocated space on a disk or a tape cartridge from the old days.


Pool

A storage pool is a logical collection of volumes. This is where a group of storage volumes is managed together to achieve better utilization and performance.


Storage Array

A storage array, also known as a disk array, is a data storage system used for block-based, file-based, or object storage. Instead of storing data on a server, storage arrays use multiple drives in a collection capable of storing a massive amount of data, managed by a central management system.


Thin and Thick Volumes

- Thick Volumes allocate the total size of the volume upon creation. This means that the total size of the volume is always used up in the pool, regardless of the amount of data stored in it. A thick-provisioned Cinder volume reserves an amount of space from the backend storage system equal to the size of the requested volume. Because users typically do not consume all the space in a volume, this reduces overall storage efficiency.

- Thin Volumes allocate space in the storage pool as data is written into the volume. A thin-provisioned Cinder volume only carves space from the backend storage system as required for actual usage. Thin-provisioning allows for capacity over-subscription, meaning that more storage space can be allocated than is available on the storage controller.

Understanding these terminologies is essential when dealing with storage systems. Knowing these definitions will help you make informed decisions regarding storage allocation and provisioning, leading to better utilization and performance.

Thin-provisioning allows for capacity over-subscription. In other words, more storage space may be allocated than is available on the storage controller.

For example

Suppose a 1TB storage pool contains four 250GB thick-provisioned volumes. If you try to create another Cinder volume, even if all four volumes remain empty, you would need to add more storage capacity to the pool.

Storage Pool: Thick provisioned
Storage Pool capacity = 1TB
Cinder volume One:   250GB allocated
Cinder volume Two:   250GB allocated
Cinder volume Three: 250GB allocated
Cinder volume Four:  250GB allocated
Storage Pool space consumed = 1TB

Thin-provisioning with over-subscription allows flexibility in capacity planning and reduces the likelihood of wasted storage capacity.

Storage Pool: Thin provisioned
Storage Pool capacity = 1TB
Cinder volume One:  250GB allocated
Cinder volume Two:  250GB allocated
Cinder volume Three: 250GB allocated
Cinder volume Four: 250GB allocated
Cinder volume Five: 250GB allocated
Storage Pool space consumed = ~0GB
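The difference between the two pools above can be sketched in a few lines of Python. This is a simplified model, not Cinder code: the Volume class and pool_space_consumed function are hypothetical, and it ignores the small metadata overhead a real thin volume has.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    size_gb: int     # size requested at creation time
    written_gb: int  # data actually written so far
    thin: bool       # True for thin, False for thick


def pool_space_consumed(volumes):
    """Thick volumes consume their full size; thin ones only written data."""
    return sum(v.written_gb if v.thin else v.size_gb for v in volumes)


# Four empty thick volumes versus five empty thin volumes, 250GB each.
thick_pool = [Volume(250, 0, thin=False) for _ in range(4)]
thin_pool = [Volume(250, 0, thin=True) for _ in range(5)]

print(pool_space_consumed(thick_pool))  # 1000, the pool is effectively full
print(pool_space_consumed(thin_pool))   # 0, nothing has been written yet
```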

 

⚠️NOTE: Thin provisioning helps maximize storage utilization. However, if aggregates are overcommitted through thin provisioning, usage must be monitored, and capacity must be increased as usage nears predefined thresholds.

Please check this excellent article that talks about it.

And this article.


Total capacity

The total capacity refers to the physical storage capacity available in the storage pool used by Cinder if there were no volumes present.

Drivers report this as total_capacity_gb and it should be reported in GB with a precision of up to 2 decimal places.

For instance, if the storage array has 5TB of space but the pool used in Cinder is limited to 1TB, the driver should report a total_capacity_gb of 1024GB.


Volume size

It is the maximum physical size that a volume can take in the pool. This is referenced throughout the code as volume_size.

In particular, for a thick volume, volume_size is the same as the free capacity the pool lost when the volume was provisioned.

Finally, for a thin volume, volume_size is the maximum capacity available to the volume. This means that Cinder will show the full size even if the volume hasn't been fully allocated yet.


Free capacity

It is the current physical capacity available in the storage array’s pool being used by Cinder. The number and volume sizes of the thin and thick volumes that have been provisioned by Cinder or directly in the storage array are irrelevant here.

This is currently being reported by the drivers as free_capacity_gb and, as the name indicates, should be reported in GB and with a precision no greater than 2 decimals.

Suppose the storage array has 5TB of space, with a total of 3TB available for all its pools, and Cinder is using a pool that has a limit of 1TB. If Cinder has already used 400GB of that pool and someone has manually created volumes outside of Cinder that are currently using 124GB, then the driver should be reporting a free_capacity_gb of 500GB (1TB = 1024GB = 400GB + 124GB + 500GB).
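The arithmetic in this example can be checked with a short Python sketch. The helper name is hypothetical; a real driver would obtain these figures from the storage array rather than compute them this way.

```python
def free_capacity_gb(total_gb, *used_gb):
    """Free space in the pool: total minus everything consuming it,
    whether it was created by Cinder or directly on the array."""
    return round(total_gb - sum(used_gb), 2)


pool_limit_gb = 1024.0           # the 1TB pool Cinder is using
used_by_cinder_gb = 400.0        # space consumed by Cinder volumes
used_outside_cinder_gb = 124.0   # space consumed by manually created volumes

print(free_capacity_gb(pool_limit_gb, used_by_cinder_gb, used_outside_cinder_gb))
# 500.0
```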


Provisioned capacity

The amount of capacity that would be used in the pool being used by Cinder if all the volumes present there were completely full. This is currently being reported by the drivers as provisioned_capacity_gb.

This includes not only volumes created by Cinder but also all other existing volumes in that backend, but does not include snapshots.

Let’s expand the earlier example from “free capacity”, where 524GB of the available 1TB had already been used. Say that the 124GB created externally is made up of 124 thick volumes of 1GB each, and that Cinder used its 400GB with 400 thick volumes of 1GB each, plus 20 empty thin volumes of 20GB each. In this situation the reported provisioned_capacity_gb value should be 924GB ((124 * 1GB) + (400 * 1GB) + (20 * 20GB)).
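To make the sum explicit, here is a small Python sketch of the same example. The list layout and labels are assumptions made for illustration only.

```python
# (label, size_gb, count) for every volume living in the pool,
# including the ones created outside of Cinder.
volume_groups = [
    ("external thick", 1, 124),
    ("cinder thick", 1, 400),
    ("cinder thin (empty)", 20, 20),
]

# Provisioned capacity counts each volume at its full size,
# no matter how much data it currently holds.
provisioned_capacity_gb = sum(size * count for _, size, count in volume_groups)

print(provisioned_capacity_gb)  # 924
```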

 

⚠️NOTE: If a driver does not report the provisioned_capacity_gb data we’ll use the automatically calculated allocated_capacity_gb as described below.


Allocated capacity

The sum of the sizes of the volumes provisioned by a specific Cinder backend.

⚠️NOTE: This refers to a specific service backend. If you are running a multi-backend Cinder service, or multiple Cinder Volume services, with more than one backend configured to use the same storage array’s pool, then each of these backends will only report the sum of the volume_size of the volumes it created, not the sum of the volume_size of all the volumes created by any Cinder service.

This is currently being reported by the Volume service as allocated_capacity_gb.

For example, if two volumes have been created, one thick and one thin, each of 1GB, the backend will report 2GB as allocated_capacity_gb. If you were to unmanage one of those volumes, it would only report 1GB, even though the volume is still there and will still be counted in provisioned_capacity_gb.

This field is calculated directly by the Cinder core code, and drivers should not calculate or report this information in their get_volume_stats method.
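The manage/unmanage behavior described above can be modeled with a rough Python sketch. This is not how the Cinder core code is written; the dictionaries and the function name are hypothetical stand-ins.

```python
def allocated_capacity_gb(volumes):
    """Sum the sizes of the volumes this backend still manages."""
    return sum(v["size_gb"] for v in volumes if v["managed"])


volumes = [
    {"name": "thick-vol", "size_gb": 1, "managed": True},
    {"name": "thin-vol", "size_gb": 1, "managed": True},
]
print(allocated_capacity_gb(volumes))  # 2

# Unmanaging removes the volume from Cinder's accounting only.
volumes[1]["managed"] = False
print(allocated_capacity_gb(volumes))  # 1

# The volume still exists on the array, so it still counts as provisioned.
print(sum(v["size_gb"] for v in volumes))  # 2
```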


Oversubscription ratio

It is the maximum ratio between the “provisioned capacity” and the “total capacity” represented as a real number. 

🔸In simple words, a ratio expresses one quantity as a fraction or multiple of another. The two numbers in a ratio can only be compared when they have the same unit.

A ratio of 1.0 means that the “provisioned capacity” cannot exceed the “total capacity” whereas a value of 5.0 means that the Cinder backend is allowed to create as much as 5 times the “total capacity” of the storage array’s pool in volumes.

This only takes effect when a thin-provisioned volume is being created, and is ignored for thick-provisioned volumes.

This is currently being reported by the drivers as max_over_subscription_ratio, with a value greater than or equal to 1.0 and preferably no more than 2 decimal places.

This value is optional, and when missing from the driver’s status report the value defined in the [DEFAULT] section on the Cinder scheduler receiving the request will be used. So vendors should make sure that they are correctly returning this value in their drivers if they support thin provisioning and admins should make sure they have a consistent default value of the max_over_subscription_ratio across all scheduler nodes.

Note that this ratio is per backend or per pool depending on driver implementation.

For example

Platform9 only accepts a max_over_subscription_ratio equal to 1.0. As explained above, that means that the “provisioned capacity” cannot exceed the “total capacity”.

 

provisioned_ratio = ((backend_state.provisioned_capacity_gb + requested_size) / total)

 

Consider a Cinder host with an allocated_capacity_gb of 6740 GB and a total_capacity_gb of 6881.28 GB. This information can be retrieved using the following command.

 

# cinder get-pools --detail
| allocated_capacity_gb       | 6740    |
| max_over_subscription_ratio | 1.0     |
| total_capacity_gb           | 6881.28 |

 

Now consider a 200 GB volume that is to be provisioned or migrated.

The calculated provisioned ratio is about 1.01, which is higher than the maximum over-subscription ratio of 1.0.

provisioned ratio = ((6740 + 200) / 6881.28) ≈ 1.01


The ratio of provisioned capacity over total capacity 1.01 has exceeded the maximum over subscription ratio 1.00 on host UUID@Backend#IP:/cinder.
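The check behind this message can be reproduced with a minimal Python sketch that uses the numbers from the example. The function name is illustrative and is not taken from the Cinder source.

```python
def provisioned_ratio(provisioned_gb, requested_gb, total_gb):
    """Ratio of provisioned capacity, including the new request, to total."""
    return (provisioned_gb + requested_gb) / total_gb


max_over_subscription_ratio = 1.0
ratio = provisioned_ratio(6740, 200, 6881.28)

print(round(ratio, 2))                      # 1.01
print(ratio > max_over_subscription_ratio)  # True, so the request is rejected
```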

 

More info.


Reserved percentage

Represents the percentage of the storage array’s “total capacity” that is reserved and should not be used for calculations. It is represented by an integer value going from 0 up to 100.

This is currently being reported by the drivers as reserved_percentage.

Default value is 0 if the field is missing in the status report from the backend or if the user has not defined it in the backend’s Cinder configuration. This is per backend or per pool depending on driver implementation.


Provisioning support

Cinder backends may support up to two different types of provisioning, thin and thick, and drivers are expected to report themselves as capable of at least one of them in their capabilities report.

Support is reported by setting the boolean fields thin_provisioning_support and/or thick_provisioning_support to true. Provisioning types that are not reported default to false.

A Cinder backend may support both provisioning types at the same time.

$ cinder get-pools --detail
+-----------------------------+------------------------------------------------------+
| Property                    | Value                                                |
+-----------------------------+------------------------------------------------------+
| QoS_support                 | False                                                |
| allocated_capacity_gb       | 0                                                    |
| backend_state               | up                                                   |
| cacheable                   | True                                                 |
| driver_version              | 3.0.0                                                |
| filter_function             | None                                                 |
| free_capacity_gb            | 28.5                                                 |
| goodness_function           | None                                                 |
| location_info               | LVMVolumeDriver:enriquetaso:stack-volumes-lvm:thin:0 |
| max_over_subscription_ratio | 20.0                                                 |
| multiattach                 | True                                                 |
| name                        | enriquetaso@lvm#lvm                                  |
| pool_name                   | lvm                                                  |
| provisioned_capacity_gb     | 0.0                                                  |
| reserved_percentage         | 0                                                    |
| storage_protocol            | iSCSI                                                |
| thick_provisioning_support  | False                                                |
| thin_provisioning_support   | True                                                 |
| timestamp                   | 2022-11-14T22:41:36.281567                           |
| total_capacity_gb           | 28.5                                                 |
| total_volumes               | 1                                                    |
| vendor_name                 | Open Source                                          |
| volume_backend_name         | lvm                                                  |
+-----------------------------+------------------------------------------------------+
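As a rough sketch, the capacity-related fields in the output above could be represented as a plain Python dictionary like the one a driver returns from get_volume_stats. Both functions below are hypothetical and only use field names already mentioned in this article; they are not a real driver implementation.

```python
def example_pool_stats():
    """Hypothetical stats for the LVM pool shown in the output above."""
    return {
        "pool_name": "lvm",
        "total_capacity_gb": 28.5,
        "free_capacity_gb": 28.5,
        "provisioned_capacity_gb": 0.0,
        "max_over_subscription_ratio": 20.0,
        "reserved_percentage": 0,
        "thin_provisioning_support": True,
        # thick_provisioning_support is left out on purpose:
        # provisioning types that are not reported default to False.
    }


def supported_provisioning(stats):
    """Read the provisioning flags, treating missing ones as False."""
    return {
        "thin": bool(stats.get("thin_provisioning_support", False)),
        "thick": bool(stats.get("thick_provisioning_support", False)),
    }


print(supported_provisioning(example_pool_stats()))
# {'thin': True, 'thick': False}
```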


 

Volume provisioning type

For Cinder backends that only support one of the provisioning types all volumes created on them will be of that type, and we can use the volume type’s extra specs to make the scheduler filter out backends not supporting a specific provisioning type:

  • ‘thin_provisioning_support’: ‘<is> True’ or ‘<is> False’
  • ‘thick_provisioning_support’: ‘<is> True’ or ‘<is> False’

But if our deployment is using a backend that supports both provisioning types simultaneously, we need to be explicit about the type of provisioning we want for a volume, using the volume type’s extra spec provisioning:type and setting it to thin or thick.

If no provisioning:type is defined for a volume, it will default to thin if the backend is capable of it, and the driver is expected to honor this assumption.
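The selection rules above can be summarized in a short Python sketch. The function is hypothetical and simplified: it is not Cinder or driver code, and it assumes the scheduler has already filtered out backends that cannot satisfy the volume type.

```python
def resolve_provisioning_type(extra_specs, thin_support, thick_support):
    """Pick the provisioning type a new volume would get on a backend."""
    if thin_support and thick_support:
        # Both supported: honor provisioning:type, defaulting to thin.
        return extra_specs.get("provisioning:type", "thin")
    if thin_support:
        return "thin"   # single-type backend: every volume is thin
    if thick_support:
        return "thick"  # single-type backend: every volume is thick
    raise ValueError("backend reports no provisioning support")


print(resolve_provisioning_type({}, True, True))                              # thin
print(resolve_provisioning_type({"provisioning:type": "thick"}, True, True))  # thick
print(resolve_provisioning_type({}, False, True))                             # thick
```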

 


Cinder Focus Topics

Provisioning calculations

Some creation failures are caused by a wrong provisioned_capacity_gb value, but in other cases Cinder’s over-provisioning calculations do not match the industry’s standard definition, which creates confusion and undesired behavior for some admins.

The standard provisioning calculation to check whether a volume of volume_size fits is:

 

((provisioned_capacity_gb + volume_size) <= (total_capacity_gb x (1 - (reserved_percentage / 100.0)) x max_over_subscription_ratio))

 

Whereas the Cinder calculation, which was agreed on as the best one because it is considered safer, is:

 

(volume_size <= (free_capacity_gb  - (total_capacity_gb x reserved_percentage / 100.0)) x max_over_subscription_ratio)


 

You can configure this with over_provisioning_calculation, which takes the values standard and cinder. It defaults to cinder for backward compatibility, and the CapacityFilter uses it to determine which of the two mechanisms to apply.

This configuration option will also affect the CapacityWeigher, as it will need to do the free space calculation according to the standard definition as well. As one would expect, the behavior of thick provisioning is not modified.
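To see how the two formulas can disagree, here is a minimal Python sketch of both checks. It is a direct transcription of the formulas above into a hypothetical volume_fits function, not the actual CapacityFilter code, and it reuses the numbers from the earlier “free capacity” and “provisioned capacity” examples.

```python
def volume_fits(volume_size, stats, mode="cinder"):
    """Return True if a thin volume of volume_size GB fits in the pool."""
    reserved_gb = stats["total_capacity_gb"] * stats["reserved_percentage"] / 100.0
    ratio = stats["max_over_subscription_ratio"]
    if mode == "standard":
        # Compare provisioned capacity against the over-subscribed total.
        limit = (stats["total_capacity_gb"] - reserved_gb) * ratio
        return stats["provisioned_capacity_gb"] + volume_size <= limit
    if mode == "cinder":
        # Compare the request against the over-subscribed free space.
        return volume_size <= (stats["free_capacity_gb"] - reserved_gb) * ratio
    raise ValueError(f"unknown calculation mode: {mode}")


stats = {
    "total_capacity_gb": 1024.0,
    "free_capacity_gb": 500.0,
    "provisioned_capacity_gb": 924.0,
    "reserved_percentage": 0,
    "max_over_subscription_ratio": 1.0,
}

print(volume_fits(200, stats, mode="standard"))  # False: 924 + 200 > 1024
print(volume_fits(200, stats, mode="cinder"))    # True: 200 <= 500
```

With the same pool statistics, the standard calculation rejects the 200GB volume because the pool would be over-provisioned, while the Cinder calculation accepts it because enough physical space is still free.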

 

Link

- Provisioning Improvements — Cinder Specs 0.0.1.dev634 documentation