Speeding up with ephemeral and immutable infrastructure



Introduction

For years, infrastructure management relied on processes and routines that required manual intervention by engineers or technicians. These practices were effective, but the development landscape has changed significantly in recent years. Agile methodologies, shorter development cycles, a stronger focus on time to market, distributed systems, and scaled environments have made it difficult for traditional infrastructure management to keep pace. Cloud transformation and the cloud-native trend were the final push that made the need for change evident.

A new, more agile approach to infrastructure management was needed to respond to these challenges. Instead of treating infrastructure as unique, valuable “pets” that require significant time, effort, and resources to maintain, organizations needed a more standardized, commoditized approach. By viewing infrastructure as replaceable “cattle,” organizations can standardize their systems, reduce the risks associated with manual management, and ensure their infrastructure can meet the demands of modern development.

The pets vs cattle analogy was popularized by Randy Bias to explain the difference between traditional and new approaches to server management.

“In the old way of doing things, we treat our servers like pets, for example Bob the mail server. If Bob goes down, it’s all hands on deck. The CEO can’t get his email and it’s the end of the world. In the new way, servers are numbered, like cattle in a herd. For example, www001 to www100. When one server goes down, it’s taken out back, shot, and replaced on the line.”

In this article, we examine the challenges of relying on mutable, long-lived infrastructure and their effect on cloud-native transformations. We also explore the benefits of adopting an immutable and ephemeral approach to infrastructure.

To provide practical insights, we illustrate each topic with a real-world scenario from our experience at synvert, showing how immutable and ephemeral infrastructure helped one of our customers achieve a cloud-native transformation and deliver software faster and more reliably.

Let’s get deep into the constraints of long-lived and mutable infrastructure

To understand the real benefits of immutable and ephemeral infrastructure, we first need to look at the main challenges and constraints of long-lived, mutable infrastructure in an agile development world:

  • Increased operational complexity and, consequently, reduced reliability. The growth of distributed service architectures and dynamic scaling significantly increases maintenance and monitoring requirements, mainly because the runtime environment keeps changing. Maintenance and configuration processes that span multiple machines or servers are not compatible with flexible, continuously changing environments.
  • Slower deployments, a direct consequence of the previous point. As infrastructure becomes unpredictable because of its many configurations and processes, information about it becomes less accurate and consistent. Time is then wasted fixing configuration issues and debugging the runtime environment because of possible configuration drift.
  • Painful monitoring. Imagine searching for errors on a system that has been running for a long time, with several processes running and several configuration changes applied over time.
  • Fire drills, or out-of-control events: interventions, updates, or patches that you do not fully control, such as a cloud provider reboot or a zone outage. These increase the cost of on-call teams, who are notified to get your infrastructure up and running again.

At the outset, our customer scenario presented several challenges in implementing agile development processes. Despite initial efforts, the organization struggled to achieve the desired results because of infrastructure constraints.

Previously, the company delivered its product every 3 months, which left time to correct any configuration drift manually. However, with the push for more frequent delivery, the seemingly simple task of managing the 12 large virtual machines that hosted the backend and front end became a significant challenge. Configuration drift caused by independently configured instances, and later resource starvation caused by missing log rotation that led to database problems, were just a few of the difficulties faced.
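
Configuration drift of this kind can be detected mechanically. The following is a minimal sketch, not the tooling used in the customer project: it assumes each machine's effective configuration can be exported as a dictionary, and the host names and settings (`vm01`, `log_rotation`, `max_connections`) are purely hypothetical.

```python
import hashlib
import json
from collections import Counter


def fingerprint(config: dict) -> str:
    """Return a stable hash of a configuration mapping (key order is ignored)."""
    canonical = json.dumps(config, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_drift(configs: dict) -> list:
    """Return the hosts whose configuration differs from the most common one."""
    prints = {host: fingerprint(cfg) for host, cfg in configs.items()}
    if not prints:
        return []
    # Treat the most frequent fingerprint as the baseline (an assumption).
    baseline, _ = Counter(prints.values()).most_common(1)[0]
    return sorted(host for host, fp in prints.items() if fp != baseline)


# Hypothetical fleet: three hosts, one of which was changed by hand.
fleet = {
    "vm01": {"log_rotation": True, "max_connections": 200},
    "vm02": {"log_rotation": True, "max_connections": 200},
    "vm03": {"log_rotation": False, "max_connections": 200},
}
print(find_drift(fleet))  # ['vm03']
```

With mutable machines a check like this has to run continuously. With immutable machines the problem mostly disappears, because every instance is built from the same image.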

So, what exactly is immutable and ephemeral infrastructure?

To understand immutable infrastructure, we first need to understand what immutable means. “Immutable” refers to something that cannot be changed, altered, or modified.

In the context of software development and infrastructure, “immutable” describes systems, components, or resources that remain unchanged during their entire lifecycle. Once they are deployed, they are not updated or modified in any way. Instead, a new version of the system, component, or resource is created whenever a change is needed.
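
The same idea can be shown at small scale in code. The sketch below is only an analogy: `ServerImage` and `with_package` are hypothetical names, and a frozen Python dataclass stands in for a deployed machine image. Any attempt to modify the deployed version fails, so a change has to produce a new version.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class ServerImage:
    """A deployed artifact that cannot be modified after creation."""
    name: str
    version: int
    packages: Tuple[str, ...]


def with_package(image: ServerImage, package: str) -> ServerImage:
    """Build a new image version instead of patching the existing one."""
    return replace(
        image,
        version=image.version + 1,
        packages=image.packages + (package,),
    )


v1 = ServerImage(name="backend", version=1, packages=("nginx",))
v2 = with_package(v1, "logrotate")

try:
    v1.version = 2  # an in-place change is rejected
except FrozenInstanceError:
    print("v1 cannot be modified; deploy v2 instead")

print(v1.packages, v2.packages)  # ('nginx',) ('nginx', 'logrotate')
```

Because `v1` is never altered, it stays available as a known-good version to roll back to.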

Now it is time to talk about ephemeral infrastructure, but first let’s look at what the term means. “Ephemeral” refers to something that is short-lived or temporary and does not persist for a long time.

In the context of infrastructure, the term “ephemeral infrastructure” refers to computing resources or components that are created dynamically and destroyed as needed, rather than being persistent and long-lived. This allows for greater flexibility, scalability, and ease of management in cloud-based or other dynamic computing environments.

As we have seen, the two concepts differ in their design principles. Immutable infrastructure prioritizes stability through unchanging components, while ephemeral infrastructure values flexibility through components that are easily replaced. Combining the two produces an infrastructure that can quickly scale, deploy, and recover in response to changes in demand or conditions.
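
A minimal sketch of how the two ideas combine, assuming a hypothetical `Pool` that manages numbered instances (the `www001` naming follows the cattle analogy above). Instances are never repaired or upgraded in place. An unhealthy or outdated instance is simply replaced by a fresh one built from the current image version.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    instance_id: str
    image_version: int
    healthy: bool = True


class Pool:
    """Keeps a fixed number of instances, all built from one image version."""

    def __init__(self, size: int, image_version: int) -> None:
        self._ids = itertools.count(1)
        self.image_version = image_version
        self.instances = [self._launch() for _ in range(size)]

    def _launch(self) -> Instance:
        return Instance(f"www{next(self._ids):03d}", self.image_version)

    def reconcile(self) -> None:
        """Replace, never repair: swap out unhealthy or outdated instances."""
        self.instances = [
            inst
            if inst.healthy and inst.image_version == self.image_version
            else self._launch()
            for inst in self.instances
        ]

    def roll_out(self, image_version: int) -> None:
        """Deploy a new image version by replacing every instance."""
        self.image_version = image_version
        self.reconcile()


pool = Pool(size=3, image_version=1)
pool.instances[1] = Instance("www002", 1, healthy=False)  # simulate a failure
pool.reconcile()
print([i.instance_id for i in pool.instances])  # ['www001', 'www004', 'www003']

pool.roll_out(image_version=2)
print([i.instance_id for i in pool.instances])  # ['www005', 'www006', 'www007']
```

Failure recovery and deployment go through the same code path here, which is a large part of what makes this model simpler to operate.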

Coming back to our scenario, it became evident that those virtual machines needed to be transformed into immutable and ephemeral components. Their persistence was hindering the client’s deployment process, so we needed to make these instances reproducible and externalize any non-reproducible elements.

What are the main advantages of using this type of infrastructure?

Now, let’s look at the advantages of this approach and why it helps organizations with their cloud-native transformation.

  • First, simpler operations. Automated deployment techniques replace outdated resources with updated versions, ensuring your systems remain in a “known-good” state.
  • Second, continuous and faster deployment, with a clear view of what is running and how it behaves. Updating becomes a regular, ongoing process with fewer errors reaching production, and all updates can be tracked through source control and CI/CD processes.
  • Next, fewer errors and increased reliability. New instances can be launched almost instantly and their lifecycle is much shorter. This reduces the risk of data loss or corruption, the risk of configuration drift, the vulnerability surface, and the effort required to meet service level agreements. It helps organizations maintain a high level of reliability and stability, even as their workloads change and evolve over time.
  • Another advantage is readiness for fire drills, that is, cloud-ready components. Once you know the desired state of each machine, operations such as reboot and recovery become routine, and you can be much more confident that a cloud provider reboot will be handled gracefully by the underlying instances, with minimal, if any, application downtime.
  • Improved scalability follows from the previous advantage: it becomes easy to scale up or down as needed without worrying about the underlying hardware. This allows organizations to respond quickly to changing demands and to take advantage of new market opportunities.
  • And finally, a potential reduction in costs. Immutable infrastructure is ready to be dynamic, which matters when provisioning infrastructure with a cloud provider, since resources can be created and destroyed as needed. Costs also fall because less is spent on the upkeep and upgrading of conventional, persistent servers.

Seems good so far, right? Let’s get back to our scenario.

To begin with, we moved the database instance to a Platform-as-a-Service (PaaS) solution to reduce the risk of downtime, which simplified operations and increased reliability. We then followed three steps to make these machines immutable resources:

  • externalization of configurations
  • packaging
  • provisioning

We transferred all configuration management responsibilities to HashiCorp tools, namely Consul and Vault, for service discovery, configuration management, health checks, and secure storage of sensitive data. We used Packer, also from HashiCorp, to create pre-configured virtual machine templates that can be deployed quickly, saving time and reducing manual configuration errors.
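
From the application's point of view, externalized configuration means that nothing environment-specific is baked into the machine image. The sketch below illustrates the principle with plain environment variables as a stand-in. It does not use the Consul or Vault APIs, and the variable names (`APP_DATABASE_URL`, `APP_LOG_LEVEL`) and the example URL are hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AppConfig:
    database_url: str
    log_level: str


def load_config(env: Mapping[str, str] = os.environ) -> AppConfig:
    """Read settings from outside the image; fail fast if a required one is missing."""
    try:
        database_url = env["APP_DATABASE_URL"]
    except KeyError as exc:
        raise RuntimeError("APP_DATABASE_URL must be set") from exc
    return AppConfig(
        database_url=database_url,
        log_level=env.get("APP_LOG_LEVEL", "INFO"),
    )


# The same image can run in any environment; only the injected values differ.
config = load_config({"APP_DATABASE_URL": "postgres://db.example.internal/app"})
print(config.log_level)  # INFO
```

Failing fast on a missing value is a deliberate choice: a freshly started instance that is misconfigured should be rejected immediately rather than patched by hand.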

Finally, we established a deployment process for these machines using Terraform from HashiCorp, a leading Infrastructure as Code tool for provisioning.
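
The core idea of declarative provisioning is to compare what is declared with what is running and derive the actions from the difference. The following is a deliberately simplified sketch of that idea, not Terraform's actual planning algorithm. The `plan` function and the resource names are hypothetical, and every changed resource is replaced rather than updated in place, in line with the immutable approach.

```python
def plan(current: dict, desired: dict) -> dict:
    """Compare the running resources with the declared ones."""
    create = sorted(desired.keys() - current.keys())
    destroy = sorted(current.keys() - desired.keys())
    # A changed resource is replaced (destroyed and recreated), never patched.
    replace = sorted(
        name
        for name in desired.keys() & current.keys()
        if desired[name] != current[name]
    )
    return {"create": create, "replace": replace, "destroy": destroy}


# Hypothetical state: what is running versus what the code declares.
current = {
    "web-1": {"image": "backend-v1"},
    "web-2": {"image": "backend-v1"},
    "old-batch": {"image": "batch-v3"},
}
desired = {
    "web-1": {"image": "backend-v2"},
    "web-2": {"image": "backend-v1"},
    "web-3": {"image": "backend-v2"},
}
print(plan(current, desired))
# {'create': ['web-3'], 'replace': ['web-1'], 'destroy': ['old-batch']}
```

Since the desired state lives in source control, the same plan can be reviewed, repeated, and applied to a different target environment.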

After all these steps, the cloud was only one command away, because we were now able to create reproducible infrastructure. The move to the cloud happened some months later, along with containerization and much more.

Immutable and ephemeral infrastructure can be found in all sizes and forms

So far we have repeatedly mentioned infrastructure, machines, and servers, but what can actually be turned into immutable and ephemeral infrastructure? Nearly everything, but let’s look closer.

Virtualization was the catalyst for the growth of immutable and ephemeral infrastructure. It made it easy to create new servers, firewalls, and so on, on a hypervisor, and if something went wrong, a new machine could be brought online with just a few clicks.

However, virtual machines proved cumbersome because of their weight and the many layers to manage, including the kernel, operating system, packages and dependencies, applications, and more. To address these issues, newer concepts such as containerization emerged, resulting in smaller, lighter, and simpler components for our infrastructure.

With the advent of tools such as Kubernetes, Apache Mesos, Nomad, OpenShift, and others, the concept of immutable and ephemeral infrastructure gained a new perspective. Not only can our servers be transformed into immutable and ephemeral components, but our services and applications can also be made easily replaceable.

Finally, cloud providers delivered the finishing touch to the world of immutable infrastructure. With the ability to provision infrastructure through simple API requests, nearly everything can be made immutable. Resources such as servers, firewalls, load balancers, applications, functions, and more can now be set up quickly, efficiently, and, most importantly, automatically, allowing us to keep pace with our company’s evolving requirements and demands.

Final thoughts

To conclude our scenario, our customer now runs immutable infrastructure resources of all sizes and forms across the company.

After the cloud transformation, the engineering organization saw a significant impact: with containerization already in place, it started delivering software on a weekly basis instead of a quarterly one.

A foundational block behind this improvement is the immutable and ephemeral infrastructure concept, which gave our customer the opportunity to increase the pace of development with flexibility, speed, stability, and reduced costs.