Apr 13, 2025

#26. Your Workload Optimizations Checklist — Part 3: Storage

The third in a series on optimizing cloud workloads. Let’s focus on your storage.

In a previous article, we introduced the simple 80:20 rule that helps you optimize workloads effectively and achieve cost savings by investigating their business value.

In this series of articles, we will consider together how to optimize the usage of the services used by most organizations (i.e., compute, storage, databases, networking, and SaaS) by following simple checklists that question their business value. In this article we will cover storage services.

The Storage Optimization Checklist

As in the previous article on compute optimizations, I will split the checklist into five reviews: business, performance, technical, license, and rates.

Business Value Review

Storage costs typically stem from objects in blob storage, including business operation artifacts, logs, and database entries stored on volume drives. The first step is to evaluate the business value of stored items. While moving data from hot to cold storage yields significant cost savings, deleting unnecessary objects from storage altogether eliminates their storage costs entirely.

Here are some practical ways to achieve this:

Data Retention Policy

This is the simplest and most automated way to align your storage with business values. First, establish different retention periods based on data type and business value. Then, set up automated deletion of data that exceeds its required retention period.

Retention policies may also include cloud storage tiering policies. For example, keep frequently accessed data on premium storage (SSD, high performance), move infrequently accessed data to standard storage, and archive rarely accessed data to cold/archive storage (e.g., AWS S3 Standard → S3 Infrequent Access → S3 Glacier → S3 Glacier Deep Archive). For that data, this can result in a 70-90% cost reduction.
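
Here is a minimal sketch of such a policy using boto3 and S3 lifecycle rules. The bucket name, prefix, transition days, and the 7-year retention figure are illustrative assumptions, not recommendations.

```python
import boto3

s3 = boto3.client("s3")

# Tier objects down over time and delete them once the assumed retention period ends.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-logs-bucket",  # hypothetical bucket name
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "retention-and-tiering",
                "Status": "Enabled",
                "Filter": {"Prefix": "logs/"},
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                    {"Days": 365, "StorageClass": "DEEP_ARCHIVE"},
                ],
                "Expiration": {"Days": 2555},  # roughly 7 years, then delete
            }
        ]
    },
)
```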

Data Duplication and Redundancy

It's important to regularly evaluate data stored in the organization to identify and eliminate unnecessary duplication. For example, while data may be stored in multiple regions for redundancy, non-critical business data can be stored in a single region instead.

Business Impact Analysis

Conduct a business impact analysis to understand the cost implications of storage choices. A good example of this is logging. Set up a governance mechanism for logging classification and retention that clearly defines logging levels and storage duration.
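
As a sketch of what such a governance mechanism could look like in practice, the snippet below caps CloudWatch log group retention at 30 days unless a group is on an explicit exception list. The group names and the 30-day figure are assumptions for illustration.

```python
import boto3

logs = boto3.client("logs")
KEEP_LONGER = {"/prod/audit-trail"}  # hypothetical exceptions with longer retention needs

paginator = logs.get_paginator("describe_log_groups")
for page in paginator.paginate():
    for group in page["logGroups"]:
        name = group["logGroupName"]
        if name in KEEP_LONGER or "retentionInDays" in group:
            continue  # skip exceptions and groups that already have a policy
        logs.put_retention_policy(logGroupName=name, retentionInDays=30)
```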

Performance Review

IOPS and Throughput

Storage performance is associated with the IOPS and throughput used. The storage medium also affects performance. Premium SSD volumes offer higher IOPS (Input/Output Operations Per Second) compared to standard HDD storage. This also applies to throughput (the amount of data read or written per second), latency (data access speed), and queue depth (the number of pending I/O requests a storage resource can handle simultaneously).

IOPS and throughput can affect business performance when associated with databases. However, high IOPS and throughput storage mediums are often used for drive volumes attached to compute virtual machines that may only run development workloads. Therefore, it's better to evaluate storage performance advantages by measuring actual business output instead.
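
One practical starting point is an inventory of volumes with premium or provisioned IOPS so each can be checked against the workload it actually serves. A minimal sketch with boto3 follows; the volume types listed are just an example.

```python
import boto3

ec2 = boto3.client("ec2")

# List volumes of IOPS-oriented types and where they are attached,
# so dev/test attachments with premium performance stand out.
paginator = ec2.get_paginator("describe_volumes")
for page in paginator.paginate(
    Filters=[{"Name": "volume-type", "Values": ["io1", "io2", "gp3"]}]
):
    for vol in page["Volumes"]:
        attached_to = [a["InstanceId"] for a in vol.get("Attachments", [])]
        print(vol["VolumeId"], vol["VolumeType"], vol.get("Iops"), attached_to)
```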

Data Compression and Deduplication

Data compression reduces data size by encoding information with fewer bits than the original; it identifies and eliminates redundant patterns in data. Data deduplication, on the other hand, removes duplicate copies of data by storing only one unique instance of each data block and replacing duplicates with references to that single copy.

Both techniques work out of the box with backup services and snapshots, greatly reducing storage requirements (e.g., AWS Backup, Azure Backup). However, compression can also be used for data retrieval in production environments. For example, web application content, file servers, and streaming media often use compression and deduplication to improve their network delivery performance.

Therefore, ensure that data compression and deduplication are enabled whenever the use case applies.
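
As a simple illustration, the sketch below gzips an asset before uploading it to object storage, so it is both stored and transferred compressed. The file, bucket, and key names are hypothetical.

```python
import gzip
import boto3

s3 = boto3.client("s3")

with open("report.json", "rb") as f:
    compressed = gzip.compress(f.read())

s3.put_object(
    Bucket="example-assets-bucket",   # hypothetical bucket
    Key="reports/report.json.gz",
    Body=compressed,
    ContentType="application/json",
    ContentEncoding="gzip",           # clients that support it can decompress transparently
)
```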

Data Transfer Optimization

Data transfer costs can represent a significant portion of your cloud bill, often exceeding compute costs for data-intensive applications.

Data transfer costs are influenced by several key factors, such as:

  • Direction of data (inbound, outbound, also known as ingress and egress)
  • Geographic considerations (cross-region transfers are more expensive than same-region)
  • Service-specific pricing (e.g., S3 vs. EC2 vs. RDS outbound rates may differ)

So how can you optimize these data transfer costs?

Use a CDN for content distribution whenever possible to reduce networking costs. If you use multiple regions, leverage private network connectivity: instead of transferring data between regions over the internet, a private direct connection can cut these costs in half. Finally, try to keep traffic between your services in the same region or zone.
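
A rough back-of-the-envelope sketch of why this matters is shown below. The per-GB rates are placeholders, not current price-list values; substitute your provider's actual rates.

```python
monthly_gb = 10_000            # e.g., 10 TB of monthly traffic

internet_egress_rate = 0.09    # $/GB, assumed placeholder
cross_region_rate = 0.02       # $/GB, assumed placeholder
same_region_rate = 0.00        # $/GB, assumed placeholder for same-region traffic

for label, rate in [
    ("Internet egress", internet_egress_rate),
    ("Cross-region", cross_region_rate),
    ("Same-region", same_region_rate),
]:
    print(f"{label}: ${monthly_gb * rate:,.2f} / month")
```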

Technical Review

There are many technical changes that could be implemented to save costs. Here are my top recommendations:

Caching Strategies

Caching is supported at different levels through various techniques and can provide significant performance and cost benefits. For example, while Memory/RAM caching (e.g., Redis, Memcached, application in-memory caches) is used with application data to enable extremely fast data access, it comes with higher costs. However, you can achieve similar performance by switching to SSD storage, which uses flash memory chips, or by implementing the NVMe (Non-Volatile Memory Express) protocol—specifically designed for SSDs to enable rapid data access. These approaches can dramatically improve performance while reducing storage costs.
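
A minimal cache-aside sketch using redis-py is shown below; the connection details, the 5-minute TTL, and load_from_storage() are assumptions for illustration.

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379)

def load_from_storage(key: str) -> dict:
    # Placeholder for the slower backing store (database, object storage, ...)
    return {"key": key, "value": "expensive-to-fetch"}

def get_item(key: str) -> dict:
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)            # fast path: served from memory
    item = load_from_storage(key)            # slow path: fetch from backing store
    r.setex(key, 300, json.dumps(item))      # cache for 5 minutes
    return item
```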

Consider also using Edge Caching, which improves access by storing cached copies of data at geographic locations closer to your end users.

Intelligent-Tiering for Object Storage

💡  Note (as of April 2025): this feature is available on AWS (S3 Intelligent-Tiering) and Google Cloud (Autoclass).

Until recently, we relied on Lifecycle Management for storage, which moved less frequently accessed objects into cheaper storage classes (e.g., cold storage or archival) to save costs.

Now, AWS and Google Cloud offer this management automatically: they monitor object access patterns and move objects between storage tiers with no retrieval charges (a small per-object monitoring fee may apply). Of course, you may still rely on lifecycle management and do the work manually.
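
On AWS, the simplest way to adopt this is to write new objects directly into the Intelligent-Tiering storage class; a minimal sketch with boto3 follows (the bucket, key, and file names are hypothetical). On Google Cloud, Autoclass is enabled as a bucket-level setting instead.

```python
import boto3

s3 = boto3.client("s3")

with open("export.csv", "rb") as body:
    s3.put_object(
        Bucket="example-data-bucket",        # hypothetical bucket
        Key="exports/2025-04/export.csv",
        Body=body,
        StorageClass="INTELLIGENT_TIERING",  # S3 moves the object between access tiers automatically
    )
```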

Utilize Batch Operations

When performing write operations into a database, batch writes are significantly more cost-efficient than individual writes. They reduce API calls, provide improved throughput (meaning fewer provisioned resources are needed), reduce transaction overhead (fewer transactions result in less log writing), and improve network efficiency (fewer network round trips lead to lower data transfer costs and reduced latency in time-sensitive applications).
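
For example, with DynamoDB, boto3's batch_writer() buffers items and sends them in batches instead of issuing one request per item. The table name and item shape below are illustrative assumptions.

```python
import boto3

table = boto3.resource("dynamodb").Table("example-events")  # hypothetical table

events = [{"pk": f"event#{i}", "payload": "sample"} for i in range(1000)]

# batch_writer() groups put_item calls into BatchWriteItem requests (up to 25 items each)
# and retries unprocessed items for you.
with table.batch_writer() as batch:
    for event in events:
        batch.put_item(Item=event)
```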

License Review

Usage of Native Tools

Many enterprises use third-party tools that come with a heavy price tag. For example, instead of NetApp or EMC storage arrays for storage management, consider native tools (e.g., Amazon EFS/FSx, Google Filestore, or Azure Files). The same applies to backup and recovery options: instead of Commvault or Veritas, use AWS Backup, Google Cloud Backup and DR, or Azure Backup.

Rate Review

Reserved Capacity

For constant usage, a 1-3 year commitment or reserved capacity (e.g., AWS EFS/EBS reserved capacity, Azure reserved capacity) can save a good 20-60% of the total costs.

Summary

When it comes to storage options, there are many ways to optimize costs. This article has only scratched the surface.

To summarize this article and the series (Your Workload Optimizations Checklist — Part 1: Databases and Part 2: Compute), divide your cost optimization work into five reviews: business, performance, technical, license, and rates. Not every optimization will be applicable to your situation, so always weigh the effort against potential savings. Start with small changes and add more optimizations as your process matures.

Do you like our blog posts? It means a lot to us when you rate (👍, ❤️, 👏) and share them. Thank you so much for your support.