Designing Cloud Native Apps With Twelve-Factor



We regularly hear people use the terms Cloud Native and Cloud interchangeably, and we challenge that! Cloud Native is about how applications are created and deployed, not where. As Joe Beda, co-founder of Kubernetes, said:

“Cloud native is structuring teams, culture and technology to utilize automation and architectures to manage complexity and unlock velocity.”

Cloud Native patterns allow us to design applications as loosely coupled services: small, independent units, each addressing a specific problem. They also let our teams abstract away the underlying infrastructure dependencies and focus on application code. Cloud Native development enables scalable, highly available, easy-to-manage and more resilient applications.

At synvert, our teams leverage Cloud Native patterns to help teams become high performers, shortening Lead Time for Changes and continuously improving the application based on customer feedback.

Of all the benefits, one we highlight is scalability. The flexibility to scale resources up and down on demand, without application downtime, can prevent service disruptions caused by traffic spikes while optimizing the utilization/cost ratio. This is possible because Cloud Native applications are designed to scale horizontally, distributing load across multiple instances of the application. However, new cloud adopters face many challenges: the application is now part of a distributed system.

We will show you how we solve these challenges with the Twelve-Factor App.

Twelve-Factor App

The Twelve-Factor App is a methodology created in 2012 by a group of engineers at Heroku. It consists of a set of twelve practices for building cloud-native apps; the manifesto is available at 12factor.net. Even after all the developments in the cloud-native landscape and tooling since its creation, these factors remain highly relevant today.

With a Twelve-Factor App, we can design applications that scale horizontally at the infrastructure level, simplifying the orchestration of distributed components and the management of resources to scale apps dynamically.

The following describes each of the twelve factors and shows, from our perspective, how each one can be implemented to achieve scalability, elasticity, and portability.

I. Codebase

One codebase tracked in revision control; many deploys.

In terms of scalability and elasticity of the application deployment, a properly designed application can mean the difference between a one-month and a one-day lead time. Considering the codebase to be a source-code repository, or a set of repositories sharing a common root, this principle states that each application should have exactly one codebase, which can be deployed across multiple environments (Dev, Test and Production).

There is only one codebase per app, but there will be many deploys of the app. A deploy is a running instance of the app.

At synvert, we recommend that less mature teams start with a Gitflow branching strategy. Admittedly, this is controversial, since one can argue that Gitflow is not really compliant with the Twelve-Factor App and makes Continuous Integration and Continuous Delivery (CI/CD) harder, with long-lived feature branches and multiple primary branches. However, when we are dealing with projects started from scratch and teams of heterogeneous maturity, our primary concern is to keep a high level of control, balancing the changes introduced against development speed while keeping the Change Failure Rate under control. Gitflow enforces visibility and control, allowing senior developers to invest more time in code review and mentoring.

If there are multiple codebases, it’s not an app — it’s a distributed system. Each component in a distributed system is an app, and each can individually comply with twelve-factor.

As we mentioned before, we believe that applications should be small, independent units. This architecture model makes applications easier to scale and faster to develop.

Based on our experience, we typically recommend adopting a multi-repo approach with one repository per application, instead of a monorepo with multi-module projects in a single repository. This approach has proven more effective in establishing clear logical boundaries between applications, from the application down to the infrastructure level. It also simplifies communication and collaboration within heterogeneous teams who are eager to embrace a DevOps mindset.

By utilizing separate repositories for each application, teams can more easily manage and comprehend the relationships between different components. This approach facilitates the implementation of DevOps practices, as it promotes autonomy and clear ownership over individual applications and their corresponding infrastructure.

Figure 1: Functions of monorepo and multi-repo
II. Dependencies

Explicitly declare and isolate dependencies.

From past experience supporting a customer moving from on-prem to the cloud, we faced the typical issue described below.

Figure 2: Example of Java code

The codebase explicitly used the readlink syscall to obtain the absolute path of a local file. Moving the code to the cloud caused an outage, since the readlink syscall was not available in the underlying operating system the customer needed to use.
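
As an illustration, here is a minimal sketch of the kind of code that causes this problem; the class name and path are hypothetical, and the customer’s original code may have looked different. It shells out to the OS-specific readlink utility instead of using the portable Java API:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

public class PathResolver {

    // Resolves the absolute path of a file by shelling out to the
    // OS-specific `readlink` utility: an implicit system dependency
    // that breaks when the target OS does not provide it.
    static String resolve(String path) throws Exception {
        Process p = new ProcessBuilder("readlink", "-f", path).start();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getInputStream()))) {
            return r.readLine();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(resolve("/tmp/app.conf")); // hypothetical path
    }
}
```

The portable fix is to stay inside the JVM, for example with Paths.get(path).toRealPath(), which removes the dependency on the surrounding system.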

A twelve-factor app never relies on implicit existence of system-wide packages.

To overcome issues like this, application dependencies should be defined in a dependency manifest managed by a dependency manager. Declaring them in a manifest helps teams understand the application and simplifies setup for new developers. In addition, tools like Dependabot and Black Duck can be added to improve the security of the software supply chain against risks from external third-party dependencies.

On JVM stacks, our teams usually work with Maven, where dependencies are declared in a pom file, like the one below. Packages are downloaded from the Maven Central Repository at build time.

Figure 3: Example of a pom file
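
A minimal sketch of such a manifest, assuming a Spring Boot web application; the coordinates and versions are illustrative:

```xml
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- Hypothetical coordinates for the example app -->
  <groupId>com.example</groupId>
  <artifactId>demo-app</artifactId>
  <version>1.0.0</version>

  <dependencies>
    <!-- Explicitly declared; resolved from Maven Central at build time -->
    <dependency>
      <groupId>org.springframework.boot</groupId>
      <artifactId>spring-boot-starter-web</artifactId>
      <version>3.2.0</version>
    </dependency>
  </dependencies>
</project>
```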

Containers have reduced dependency-related issues by preventing implicit dependencies from “leaking in” from the surrounding system. Nevertheless, dependency declaration and dependency isolation must be applied together; one alone will not satisfy the Twelve-Factor requirements.

III. Configuration

Store configuration in the environment.

A Twelve-Factor App should separate configuration from code. Configuration such as database connections, credentials for external services, and the like varies substantially across deployments, whereas code does not.

In a twelve-factor app, env vars are granular controls, each fully orthogonal to other env vars. They are never grouped together as “environments”, but instead are independently managed for each deploy.

Storing configuration in environment variables or a configuration file provides flexibility and reduces downtime risk during deployments. Rapidly changing configuration based on the environment becomes possible without recompiling or redeploying code. As a result, deployment and configuration processes can be automated and repeated, leading to scalable and streamlined applications.

In projects where we use Kubernetes as a container orchestrator, we use ConfigMaps to apply this factor. ConfigMaps are framework- and language-agnostic, and they make it easy to switch application configuration between environments.

Consider the next image. The left side represents a configuration file, an application.properties for a Spring application. On the right side, there is a ConfigMap. After the app is packaged into a container, the ConfigMap is applied to the container according to the environment, overriding the default variables.

Figure 4: Example of a configuration file (left) and ConfigMap (right)
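
A minimal sketch of what such a pairing might look like; all names and values are illustrative (the MONGO_DB variable reappears under Backing Services below):

```yaml
# application.properties baked into the image (shown here as a comment):
#   spring.data.mongodb.uri=${MONGO_DB:mongodb://localhost:27017/app}
#   algolia.api-key=${ALGOLIA_API_KEY:dev-key}

# ConfigMap applied per environment, overriding the defaults above
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
  namespace: production
data:
  MONGO_DB: "mongodb://mongo.prod.internal:27017/app"
  ALGOLIA_API_KEY: "replace-me"   # sensitive values belong in a Secret
```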

Regarding application credentials, Kubernetes also provides a way to store sensitive information using Secrets. However, to enhance the security (data encryption and identity-based access) and management of secrets, we recommend combining it with a secret management tool like HashiCorp Vault.

IV. Backing Services

Treat backing services as attached resources.

A backing service is a service that the app consumes as part of its normal operation, such as databases, messaging/queuing systems, caching systems, etc. Our teams treat backing services as abstractions and don’t differentiate between local and third-party systems. Defining services with a clean contract enables consumption through an interface (API).

Accordingly, our code should not be coupled to any specific backing-service implementation. For example, if the application needs to communicate with a database, the code should be agnostic and abstracted from the concrete implementation, independent of the database type and vendor. This is essential to treating backing services as attached resources.

Figure 5: Mind map of attached resources (MySQL, email service, Amazon S3, Twitter)

Having applications that access attached resources gives us the capability, at deployment time, to switch, for example, from a local MySQL database to a managed database like Amazon RDS without changing any code, achieving scalability and flexibility as the system grows. This is only possible if we access resources via URL, keep the code abstract, and keep the endpoints and credentials in configuration.

Considering again the ConfigMap example illustrated under the Configuration factor, MongoDB and Algolia are backing services. If the app’s database is misbehaving due to a hardware issue, you just need to spin up another database and change the environment variable MONGO_DB in the ConfigMap to the new server endpoint, with no code changes.

V. Build, Release, Run

Strictly separate build and run stages.

This principle breaks the deployment process down into three replicable stages: Build, Release, and Run.

Figure 6: Delivery pipeline (build, release, run)

The codebase is taken through the build stage, which transforms the source code together with its dependencies into a bundle known as a build. The result of the build phase combined with the environment configuration produces a release. Then, the release is deployed into the execution environment and run.

The explicit separation between build and release steps is crucial for sane deployments and rollbacks. At synvert, we achieve this separation through artifact management: after the code is merged and tested, each build result (image or binary) is versioned, published and stored in a registry. We use a private Docker registry for container images and Harbor for Helm charts. This allows releases to be reused and deployed across multiple environments. If something goes wrong, we can audit the release in a given environment and, if necessary, roll back to the previous one. Ideally, this process is fully automated and doesn’t require human intervention.

VI. Processes

Execute the app in one or more stateless processes.

Cloud Native applications should be volatile and highly disposable. Because of this, a Twelve-Factor App never relies on the contents of local storage, whether on disk or in memory, being available, nor on any given request being handled by the same process.

Twelve-factor processes are stateless and share-nothing.

Applications should be executed as one or more stateless processes, meaning that all long-lived state must be external to the application, provided by a backing service such as a database or cache. Applications create and consume temporary state during a request or transaction; in the end, that data should be destroyed. As a result, statelessness does not mean that state cannot exist; it means that the application cannot maintain it.

Figure 7: Applications as stateless processes

Session state is a common scenario we may need to deal with when building web applications. It captures the status of the user interaction, keeping a session per user for as long as they are logged in; this lets apps track recent actions and user personalization, for example. One option here is to store these states in a cache such as Redis, ensuring nothing is stored locally.
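
As a sketch of this approach, assuming a Spring Boot 2.x app with the spring-session-data-redis dependency on the classpath, the session store can be moved out of the process with a few lines of configuration:

```yaml
# application.yml: sessions live in Redis, not in the JVM heap,
# so any instance can serve any request
spring:
  session:
    store-type: redis
  redis:
    host: ${REDIS_HOST}   # backing-service endpoint from the environment
    port: 6379
```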

VII. Port binding

Export services via port binding.

Thinking of a non-cloud environment, we often stumble into scenarios where web apps are executed inside a webserver container, and the container assigns ports to the applications when they start up.

Figure 8: Assignment of ports to applications

In contrast, Cloud Native apps are self-contained, with a web server library bundled into them, and do not require runtime injection into an external container. In light of this, self-contained services should make themselves available to other services by port binding: listening for requests on a defined port number.

In addition, the port number should be defined as an environment variable (Configuration principle). In this way, we can apply port binding per environment without changing code.

This factor has been a standard practice for some time, reinforced by containerization standards, proxies, and load-balancer implementations. It relies on network mapping between the container and the host. Kubernetes, for example, has built-in service discovery using internal DNS names, and you can abstract port bindings by mapping service ports to containers. Although Kubernetes offers several Service types, the example below shows a ClusterIP service.

Figure 9: Code example of a ClusterIP service
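
A minimal sketch of such a Service; the names and ports are illustrative (web-go-app is reused in the Concurrency example below):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web-go-app
spec:
  type: ClusterIP        # internal-only virtual IP, reachable via cluster DNS
  selector:
    app: web-go-app      # routes traffic to pods carrying this label
  ports:
    - port: 80           # port the Service exposes inside the cluster
      targetPort: 8080   # port the container actually binds to
```
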
VIII. Concurrency

Scale out via the process model.

This principle recommends organizing processes by their purpose and dividing them into groups to handle workloads efficiently. By architecting applications to handle workloads by process type, teams can manage resources based on each workload. As a result, multiple distributed processes can scale independently. The key to achieving this is to define disposable, stateless and share-nothing processes that can scale horizontally.

At synvert, we handle this factor using two approaches that can be combined: load balancers and Horizontal Pod Autoscaling (HPA). Regardless of the method, we need to monitor the performance and resource usage of the application to ensure that it remains responsive and performant under heavy load.

  • Load Balancers

Using a load balancer, traffic can be distributed across multiple application instances, preventing the overload of a particular instance.

Figure 10: Load balancer distributing the load across multiple application instances
  • Horizontal Pod Autoscaling

In projects where we use Kubernetes, our teams take advantage of Horizontal Pod Autoscaling to scale the number of pods running in the cluster up or down based on metrics such as average CPU utilization, average memory utilization, or custom metrics. The following YAML shows how to configure HPA for a web-go-app deployment.

Figure 11: autoscaling/v2 HPA manifest
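
A minimal sketch of such a manifest; the replica bounds and the 70% CPU target are illustrative:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-go-app
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-go-app     # the deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # add pods when average CPU exceeds 70%
```
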
IX. Disposability

Maximize robustness with fast startup and graceful shutdown.

We argue that for an application to be robust and scalable, it is essential to have fast startup times, be responsive and shut down gracefully. Containers already provide fast startup times, but that alone does not solve every problem. Defining startup and health checks is central to ensuring systems are operating and to rapidly replacing any failing instances.

Think of the scenario where a web app is connected to a database as a backing service. At startup, it needs to load some data or configuration files. After deployment, the app starts up and appears ready to receive requests. But… what about the database? Is the database up and prepared to receive connections?

At synvert, we adopt monitoring and alerting tools like Prometheus and Datadog. By establishing alerts based on metrics, logs and thresholds, our teams gain more visibility over the systems. We can also detect when an instance of the application is misbehaving or experiencing issues and automatically spin up a replacement instance. This is important to prevent long periods of downtime that cost our clients money.

Another approach in our projects is implementing startup and health checks by defining Kubernetes liveness and readiness probes. Considering the previous web app scenario, we can configure a readiness probe to check whether the app is actually ready to accept traffic, including its database connection. For that, we just need to configure a readinessProbe at the container spec level, like the following example.

Figure 12: Code for readinessProbe
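
A minimal sketch of such a probe at the container spec level; the image, endpoint and timings are illustrative, and the /healthz/ready handler is assumed to verify the database connection:

```yaml
containers:
  - name: web-app
    image: registry.example.com/web-app:1.0.0   # hypothetical image
    ports:
      - containerPort: 8080
    readinessProbe:
      httpGet:
        path: /healthz/ready   # assumed to check backing services too
        port: 8080
      initialDelaySeconds: 10  # give the app time to load data/config
      periodSeconds: 5
      failureThreshold: 3      # pod is removed from Service endpoints on failure
```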

The same care and attention are needed during shutdown. For a graceful shutdown, we can listen for a SIGTERM signal; on receiving it, the web app stops listening on its service port, closes its database connections, and flushes its log files. Neglecting graceful shutdown in a distributed system can lead to cascading effects on the systems that rely on it, eventually affecting customers.

X. Dev/Prod Parity

Keep development, staging and production as similar as possible.

This factor focuses on the importance of keeping development, staging and production environments similar. It is essential to find and catch issues before advancing to production, eliminating the stereotypical development statement, “It runs on my laptop”.

We know that some divergence between development and production environments is “normal”. In complex systems, this factor may be one of the most challenging to implement, often because of budget constraints. However, it is critical to enabling speed as organizations scale.

Containers help us mitigate this risk by providing a uniform environment for running code. Tools like Docker can spin up the necessary containers to build and run the application and any dependencies. Also, having an effective CI/CD pipeline can ensure that the same build and deployment steps are executed in all environments. This factor can also be supported with Terraform, an Infrastructure as Code tool, which makes it easy to replicate environments.
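
As a sketch of the container side, a Compose file like the following, with hypothetical service names and versions, gives every developer the same stack shape that production uses:

```yaml
# docker-compose.yml: run the app plus its backing service locally,
# pinned to the same versions and variable names as production
services:
  app:
    build: .
    ports:
      - "8080:8080"
    environment:
      MONGO_DB: "mongodb://mongo:27017/app"   # same variable as the ConfigMap
  mongo:
    image: mongo:6.0   # pin the same major version production runs
```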

XI. Logs

Treat logs as event streams.

We believe that having a proper logging strategy, metrics and traces is crucial to understanding and managing systems as they evolve and become more complex due to their distributed nature.

A twelve-factor app never concerns itself with routing or storage of its output stream.

Cloud Native applications can make no assumptions about the file system they run on, other than its ephemerality. Logs should be written to stdout and stderr and treated as event streams. Decoupling the aggregation, processing and storage of logs from the app’s core logic, in our vision, empowers elastic scalability. With the application decoupled from logging concerns, we can easily change the logging approach without modifying the application.

At synvert, we use Fluentd as a key player in our logging strategy. Fluentd is an open-source project under the Cloud Native Computing Foundation (CNCF) that acts as a logging agent, managing log collection, parsing and distribution. It can be complemented with Elasticsearch, to store and index JSON documents, and Kibana, for data visualization and discovery. Together, these tools are known as the EFK stack. Fluentd is typically deployed on Kubernetes as a DaemonSet that collects all container logs at the cluster level.

Figure 13: EFK stack, where Fluentd collects and parses logs as streams
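
A minimal sketch of the DaemonSet side; the image tag and mounts are illustrative, and a real deployment also needs RBAC and an Elasticsearch output configuration:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      containers:
        - name: fluentd
          image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
          volumeMounts:
            - name: varlog
              mountPath: /var/log   # read every container's logs on the node
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
```
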
XII. Admin Processes

Run admin and management tasks as one-off processes.

Management or administrative tasks are short-lived processes, such as database migrations, one-time scripts, etc. We believe these tasks should be handled separately from application processes. However, they should run on systems identical to those the app runs on in production, and they should be tested and reviewed like the rest of the codebase to avoid synchronization issues.

For Cloud Native apps, this factor becomes more relevant when creating repeatable tasks. Our teams handle such management tasks through Kubernetes CronJobs. This supports scalability and elasticity, since the tasks are handled natively by Kubernetes, which creates ephemeral containers as needed.
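
A minimal sketch of such a CronJob; the schedule, image and command are illustrative, and the task image should come from the same release as the app:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-db-maintenance
spec:
  schedule: "0 3 * * *"   # every day at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: maintenance
              image: registry.example.com/web-app:1.0.0   # same release as the app
              command: ["./scripts/db-maintenance.sh"]    # hypothetical script
          restartPolicy: OnFailure
```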

Final Thoughts

Cloud Native development presents a multitude of challenges that require careful consideration. Our aim is to provide valuable insights into the implementation of Twelve-Factor App principles, enabling you to construct systems that are scalable, portable, and reliable. By prioritizing code modularity and containerization patterns, you can pave the way towards achieving horizontal scaling, a crucial capability for high-performing teams seeking to leverage the cloud effectively.

With our perspective and guidance, you can navigate the intricacies of Cloud Native development and harness its full potential. By embracing modularity in your codebase and adopting containerization practices, you will establish a foundation for building scalable systems. This empowers your team to operate within clear boundaries and leverage the power of the cloud, resulting in enhanced performance and growth.