Eliminating Cloud Cost overruns – Systematic approach to cost optimization

The Customer

Our client is the leading financial systems and money transfer company, with a robust plan to modernize its technology platforms and use technology differentiators to outperform its competition by reducing costs and improving services.

Executive Summary

Our client runs its core business-critical applications on AWS. These applications are typically deployed in four individual AWS tenants (dev, qa, stage and prod). Using AWS Organizations, landing zones and an Account Vending Machine (AVM) pipeline, the client has a firm process for vending secure accounts per business application. With a large number of vended accounts and no central utilization reports at the organization level, our client was occasionally seeing high cloud charges, especially from the lower environments that were not running critical workloads. This affected their annual cloud budget and overall capacity to migrate more applications to the cloud. Sincera’s tool-based objective reports and architecture-based subjective analysis identified unutilized, underutilized and incorrectly selected resources that were driving up cloud computing costs.

Business Challenge

The customer’s cloud costs were difficult to manage due to a lack of visibility into historical usage and spending patterns. This resulted in unpredictable usage spikes and steady spend creep, which often caused actual cloud costs to reach double the original budget. This significantly reduced the business’s capacity to innovate in the cloud: it limited their ability to migrate more applications or add new features to existing business applications, which in turn hurt their ROI and the time to market for new offerings.

Our Solution

Sincera’s cloud optimization approach is built on the Inform, Optimize and Operate framework from the FinOps Foundation.

The Inform phase gathers and organizes the existing cost metrics, configurations and invoices. Our recommendations are then broadly classified under two buckets:

  • Operational Recommendations – Prescriptive guidance on the use of existing compute, storage and network resources to make consumption elastic and in line with actual usage.
  • Architectural Recommendations – Detailed architectural recommendations, with patterns and optional runbooks, for building on cloud services that use key features such as autoscaling, the correct storage type and backups, in line with security, compliance and resiliency requirements.

Sincera’s DWorks tool uses this cloud optimization approach to gather key metrics, collate costs from different organizations and prepare detailed historic and near-real-time consumption reports for all cloud resources. It also helps focus the analysis on low-effort, high-impact areas. We then apply architecture guidance to further improve cost effectiveness and cloud adoption. The following KPIs were captured by this tool:

  • Business Unit or Owner Tag compliance
  • Mean time to upgrade or push a new release
  • Existing monthly costs per
    • Application
    • Environment
  • Storage costs
  • Compute costs
  • Egress costs
  • Forecasting accuracy
  • Instance types
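
To illustrate how two of these KPIs (tag compliance and monthly cost per application and environment) could be derived from a resource inventory, here is a minimal Python sketch. It is not the DWorks implementation: the record fields (application, env, monthly_cost, tags), the tag keys BusinessUnit and Owner, and the sample figures are all hypothetical, and a resource is assumed to be compliant when it carries at least one of the two tags with a non-empty value.

```python
from collections import defaultdict

# Hypothetical tag keys; the real required tags depend on the organization's policy.
REQUIRED_TAGS = ("BusinessUnit", "Owner")


def tag_compliance(resources):
    """Return the percentage of resources carrying a non-empty required tag."""
    if not resources:
        return 0.0
    compliant = sum(
        1
        for r in resources
        if any(r.get("tags", {}).get(key) for key in REQUIRED_TAGS)
    )
    return 100.0 * compliant / len(resources)


def monthly_cost_by(resources, *dimensions):
    """Sum monthly cost grouped by the given record fields (e.g. application, env)."""
    totals = defaultdict(float)
    for r in resources:
        key = tuple(r.get(d, "unknown") for d in dimensions)
        totals[key] += r["monthly_cost"]
    return dict(totals)


# Made-up inventory records, for illustration only.
sample = [
    {"application": "payments", "env": "dev", "monthly_cost": 120.0, "tags": {"Owner": "team-a"}},
    {"application": "payments", "env": "prod", "monthly_cost": 900.0, "tags": {"BusinessUnit": "retail"}},
    {"application": "ledger", "env": "dev", "monthly_cost": 80.0, "tags": {}},
    {"application": "ledger", "env": "dev", "monthly_cost": 40.0, "tags": {"Owner": ""}},
]

print(tag_compliance(sample))
print(monthly_cost_by(sample, "application", "env"))
```

Grouping by arbitrary fields keeps the same function usable for per-application, per-environment or combined views.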

Based on the analysis and the reports gathered, our teams recommended and implemented the following tasks, among others.

Operational

  • AMI retention and governance – AMI backups were taken in development and test environments during ongoing activities such as patching and deployment. However, these AMIs were not deleted after successful patching or deployment. There were also unattached EBS volumes that needed to be deleted. Our team recommended automation jobs to release this storage space. We also found data in S3 buckets that had not been accessed for a long time, and set up jobs to move this historical data to Glacier to lower storage costs.
  • Storage – Our team reviewed the utilization reports for the EC2 and RDS instances. Many EC2 instances in the lower environments were not in use. AWS offers different Savings Plans depending on how long the customer can commit to using AWS cloud resources. Many RDS snapshots had been created by automatic backup settings in misconfigured pipelines. We adjusted the pipeline configuration to reduce backups in the lower environments and recommended that the customer delete many old backups.
  • Environment – Our teams found that many tier 4 and tier 5 non-prod apps were driving up consumption and adversely affecting the monthly cost. We deployed automated reporting and instance management jobs designed to notify account owners of instance shutdowns during off-peak hours. These jobs are configurable and can be tuned to any environment. Because workloads were isolated per tenant, the jobs could be configured to match each business requirement.
  • Log Retention – Here we focused on logs being sent to S3, especially those from dev and QA, which usually had verbose log levels (DEBUG or INFO) enabled. We observed large volumes of stale log lines from old releases being stored for all non-prod environments. DWorks gave us a good breakdown of the percentage of storage cost consumed by the non-prod logs. Our team created an S3 lifecycle policy to move these logs to infrequent-access storage before deleting them after a period of 60 days.
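
A lifecycle policy of the kind described under Log Retention could look roughly like the sketch below, written in Python as the configuration dictionary accepted by the S3 lifecycle API. Only the 60-day deletion comes from the engagement described here; the 30-day transition to the STANDARD_IA storage class, the logs/dev/ prefix, the rule ID and the bucket name are assumptions made for illustration.

```python
def build_log_lifecycle_policy(prefix, transition_days=30, expiration_days=60):
    """Build an S3 lifecycle configuration for non-prod log objects.

    Objects under `prefix` move to infrequent-access storage after
    `transition_days` and are deleted after `expiration_days`.
    The 30-day default for the transition is an assumption.
    """
    if transition_days >= expiration_days:
        raise ValueError("transition must happen before expiration")
    return {
        "Rules": [
            {
                "ID": "nonprod-log-retention",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": transition_days, "StorageClass": "STANDARD_IA"}
                ],
                "Expiration": {"Days": expiration_days},
            }
        ]
    }


policy = build_log_lifecycle_policy("logs/dev/")  # hypothetical prefix
print(policy)

# With boto3 installed and credentials configured, the policy could be applied with:
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="example-nonprod-logs",  # hypothetical bucket name
#     LifecycleConfiguration=policy,
# )
```

Building the policy as plain data keeps it easy to review and to vary per environment before it is applied to any bucket.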

Our checklist-, tool- and analysis-based FinOps approach produced approximately 35 objective recommendations that brought overall AWS consumption costs down by 20-40% after the first iteration.

Impacts and Key Benefits to the client

  1. Freed up resources and budget for cloud migration.
  2. Improved operational usage and EBITA, helping boost ROI.
  3. Monitoring and metrics: Continuously building and monitoring cost KPIs for all environments, together with threshold alarms, kept budget variance in check and avoided cloud sprawl.
  4. Compliant tagging to support showback and chargeback models.
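
The threshold alarms on budget variance mentioned above can be sketched as a simple check, shown here in Python. This is an illustrative sketch only: the 10% threshold, the function names and the spend and budget figures are assumptions, not values from the engagement.

```python
def budget_variance_pct(actual, budget):
    """Return how far actual spend is over (+) or under (-) budget, in percent."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    return 100.0 * (actual - budget) / budget


def check_budgets(spend, budgets, threshold_pct=10.0):
    """Return (environment, variance %) pairs that exceed the alarm threshold."""
    alarms = []
    for env in sorted(spend):
        variance = budget_variance_pct(spend[env], budgets[env])
        if variance > threshold_pct:
            alarms.append((env, round(variance, 1)))
    return alarms


# Made-up monthly figures per environment, for illustration only.
spend = {"dev": 1500.0, "qa": 1020.0, "stage": 900.0, "prod": 5200.0}
budgets = {"dev": 1000.0, "qa": 1000.0, "stage": 1000.0, "prod": 5000.0}

print(check_budgets(spend, budgets))
```

In this made-up data only dev breaches the threshold, which is the kind of lower-environment overrun the alarms are meant to surface early.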

Cost optimization is an ongoing process, not a one-time exercise. A lack of expertise in managing cloud infrastructure can lead to high monthly bills and become a financial burden. Sincera’s tailored approach to FinOps can help organizations with the initial assessment and the setup of managed services, using proven cost optimization checklists and processes to ensure that cloud resources are used optimally and that cloud usage costs stay low.

Sincera can provide both one-time assessment services and ongoing managed services for AWS cloud optimization, according to the needs of the customer.