The move to the cloud continues for businesses across North America. According to a recent study, almost half of all businesses surveyed plan to pursue a cloud-first strategy in 2022, and 20% plan to migrate all of their applications to the cloud. Why are so many companies making the move? What does it take to keep a cloud-based business up and running, and how do you prevent eventual outages from hurting the business?
Tech in Motion recently brought together some of the brightest minds in cloud technology to answer some of these questions during our latest webinar, Cloud Migration Case Studies: Strategies for Success. Presenters included Cody Chandler, Director, Cloud Engineering at Blackline; Jeremy Proffitt, Director, DevOps and SRE for C3 Division at Ally; Mandi Walls, DevOps Advocate at PagerDuty; and Dheeraj Gupta, Cloud Services at Amazon Web Services. Read on for some of the highlights, or click the video below to watch the whole event.
Why Use Cloud Services in the First Place?
While there are many benefits to moving to the cloud, one particular perk was demonstrated by Dheeraj Gupta from Amazon, who went into detail about the process and use cases of “cloud-bursting.” Cloud-bursting means scaling from on-premises infrastructure out to the cloud to meet capacity requirements. It is an ideal solution for compute-intensive workloads such as batch jobs, animation rendering, video processing, and simulations for things like computational fluid dynamics and protein folding. As Gupta put it simply, “Cloud-bursting lets you scale out when demand goes up, scale in when demand comes back in.” With cloud-bursting, companies can add capacity to a data center without investing in additional hardware.
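To make the idea concrete, here is a minimal sketch of the placement decision behind cloud-bursting: fill local capacity first and send only the overflow to the cloud. The names `Placement`, `place_jobs`, `pending_jobs`, and `on_prem_slots` are hypothetical, chosen for illustration. This is not the method Gupta presented, and a real scheduler would also weigh cost, data locality, and job priority.

```python
from dataclasses import dataclass


@dataclass
class Placement:
    on_prem: int  # jobs kept in the data center
    cloud: int    # jobs "burst" out to cloud capacity


def place_jobs(pending_jobs: int, on_prem_slots: int) -> Placement:
    """Fill local capacity first; send only the overflow to the cloud."""
    if pending_jobs < 0 or on_prem_slots < 0:
        raise ValueError("job and slot counts must be non-negative")
    local = min(pending_jobs, on_prem_slots)
    return Placement(on_prem=local, cloud=pending_jobs - local)


# Demand fits within local capacity: nothing bursts.
print(place_jobs(pending_jobs=40, on_prem_slots=100))

# Demand spike: only the overflow scales out to the cloud.
print(place_jobs(pending_jobs=250, on_prem_slots=100))
```

When demand drops back under local capacity, the cloud count returns to zero, which is the "scale in" half of the quote.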
How to Manage, Maintain, and Monitor Your Cloud Usage
Once a company’s cloud strategy is established, how do you maintain it and keep it running efficiently? As Jeremy Proffitt, Director, DevOps and SRE for C3 Division at Ally, said during his presentation, “When it comes to major cloud outages, it’s not just about the fire, it’s about how long it burns.” Proffitt went on to say that two things are never at 100%: system patching and uptime. How you deal with these two realities will decide how successful your work in the cloud will be.
When it comes to patching, Proffitt recommended trusting only your own system, one that goes out and looks at which virtual servers and devices you actually have running. The best monitoring practices start by gathering a list of assets, then checking that each one is returning a state. Any asset that does not return a state should either be on an exception list or trigger an alert.
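The check described above can be sketched in a few lines. This is only an illustration under stated assumptions: the asset names, the `reported_states` mapping, and the `find_unreported` helper are all hypothetical, and in practice the inventory and states would come from your own discovery and patching tools.

```python
def find_unreported(assets, reported_states, exceptions):
    """Return assets with no reported state that are not on the exception list."""
    return sorted(
        asset
        for asset in assets
        if asset not in reported_states and asset not in exceptions
    )


# Hypothetical inventory gathered by your own discovery system.
assets = {"web-01", "web-02", "db-01", "legacy-ftp"}

# Hypothetical patch states reported back by each asset.
reported_states = {"web-01": "patched", "db-01": "pending"}

# Assets known not to report, accepted as exceptions.
exceptions = {"legacy-ftp"}

to_alert = find_unreported(assets, reported_states, exceptions)
print(to_alert)  # ['web-02']
```

Here `web-02` is in the inventory, has no state, and is not an accepted exception, so it is the one asset that should raise an alert.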
As for uptime, companies should safeguard their reliability through a mix of redundant resources and a careful look at when a cloud provider’s downtime occurs. Using multiple servers, resources, data centers, and network layers should be part of your business plan when using the cloud. As for the timing of cloud maintenance, Proffitt said that instead of looking for 99.9% overall uptime, you should look for 100% uptime during the hours you do business, and be okay with a lower percentage at night.
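The difference between overall uptime and business-hours uptime is easy to see with a little arithmetic. The sketch below is an assumption-laden illustration, not a tool from the webinar: the helper names, the 9:00 to 17:00 business window, and the single 30-minute overnight outage are all made up, and it assumes the outages in the list do not overlap one another.

```python
from datetime import datetime, timedelta


def overlap(start_a, end_a, start_b, end_b):
    """Length of time during which two intervals overlap (zero if they do not)."""
    return max(min(end_a, end_b) - max(start_a, start_b), timedelta(0))


def uptime_pct(window_start, window_end, outages):
    """Percent of the window not covered by outages (outages assumed non-overlapping)."""
    total = window_end - window_start
    down = sum(
        (overlap(window_start, window_end, s, e) for s, e in outages),
        timedelta(0),
    )
    return 100.0 * (1 - down / total)


# Hypothetical day with one 30-minute maintenance outage at 2:00 a.m.
day_start = datetime(2022, 3, 1, 0, 0)
day_end = day_start + timedelta(days=1)
biz_start = datetime(2022, 3, 1, 9, 0)
biz_end = datetime(2022, 3, 1, 17, 0)
outages = [(datetime(2022, 3, 1, 2, 0), datetime(2022, 3, 1, 2, 30))]

print(f"overall:        {uptime_pct(day_start, day_end, outages):.2f}%")
print(f"business hours: {uptime_pct(biz_start, biz_end, outages):.2f}%")
```

In this made-up example the overall figure for the day falls below 99.9%, yet the business saw no downtime at all during working hours, which is the trade-off Proffitt described.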
However, when the inevitable issues occur, having a game plan in place is paramount. As Mandi Walls, DevOps Advocate at PagerDuty, explained, the goals of Site Reliability Engineers (SREs) are to create shared ownership, set realistic objectives, reduce the cost of failure, and finally to automate. The endgame is reliability and production performance in a cloud environment. As the tech world has learned more about what the cloud can do, more and more tools have been built to make things better and more efficient. Walls noted that while the options available can be almost awe-inspiring, the sheer number makes it intimidating and exhausting to figure out the right solution for each business’s particular pipeline. The important thing is to not get bogged down in trying to learn and master each and every new tool.
In broader terms, proper cloud maintenance and monitoring is about continually finding ways to solve problems and prevent issues in the future. As Cody Chandler, Director, Cloud Engineering at Blackline, noted, “You have to keep people from falling to the momentum of the way things have been done in the past.” That means thinking not just about how to solve the problem at hand, but about why these problems are happening in the first place and how to prevent them from happening again. To achieve this, Chandler talked about not being afraid to take big swings while also thinking through the smaller, step-by-step process that makes them successful.
These are only a few of the highlights of Tech in Motion’s latest event on Cloud Migration. To watch the full case study, click here, and make sure to bookmark Tech in Motion’s Event Page to stay up to date on the latest webinars!