Unblock­ing Fea­ture Flags for faster teams



When working at a fast-growing, fast-paced company with a strong online presence, there is usually a lot of experimentation going on at any given time. To keep up the pace of development and allow for several daily deployments to production, engineering teams need something to control what users can see and when.

In our quest to help teams increase their performance, we have various mechanisms at our disposal, and one in particular is feature flagging. Feature flags are a powerful approach that can significantly improve Lead Time for Changes, as they allow much better control and unlock a higher level of collaboration across the full Software Delivery Lifecycle. But while feature flags provide significant flexibility, managing multiple experiments and features concurrently can be challenging.

We will walk through an example implementation that can help others get the most out of feature management tools while overcoming a few of their challenges. It is inspired by a use case we faced with a customer who had low response time requirements and for whom the challenges of implementing feature flags were becoming a hindrance.

By embra­cing this approach, teams can bene­fit from cent­ral­ized con­fig­ur­a­tion and cached stor­age, ulti­mately stream­lin­ing their fea­ture man­age­ment pro­cesses, optim­iz­ing their out­comes, and improv­ing their performance.

Fea­ture Man­age­ment Primer

Tools

There are many tools for managing how features are released and for making product decisions based on experimentation. This article uses Split as its example.

Advant­ages
  • Enables fast flow: It enables trunk-based development
  • Con­trol: Full con­trol over the fea­tures that users can see
  • Toggle vs deployment: No deployment is needed to toggle features on and off
  • Targeted: Define the target audience of specific features (e.g. enable a feature for 50% of users or a specific market).
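
To make the targeting idea concrete, here is a minimal sketch of deterministic percentage bucketing. It is not how Split evaluates its rules: the `TargetingRule` shape, the `isEnabledFor` function, and the choice of an FNV-1a hash are all assumptions made for illustration.

```typescript
// Hypothetical targeting rule: a rollout percentage plus an optional market allow-list.
interface TargetingRule {
  percentage: number; // 0..100
  markets?: string[]; // e.g. ["PT", "DE"]; undefined means every market
}

// FNV-1a 32-bit hash over UTF-16 code units: small and stable, so the same
// user always lands in the same bucket.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

function isEnabledFor(
  flag: string,
  userId: string,
  market: string,
  rule: TargetingRule,
): boolean {
  if (rule.markets && !rule.markets.includes(market)) return false;
  // Salt the hash with the flag name so different flags bucket users independently.
  const bucket = fnv1a(`${flag}:${userId}`) % 100; // 0..99
  return bucket < rule.percentage;
}
```

Because the bucket depends only on the flag name and the user id, a user keeps the same treatment across requests, and raising the percentage only adds users to the enabled group.
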
Chal­lenges

Even though feature management tools are great, they come with a few challenges that teams need to mitigate:

  • Extra HTTP requests: Every time a user accesses a page, we need to make an extra request to Split to get the feature flags
  • Code duplic­a­tion: Each team will likely have its own imple­ment­a­tion to fetch the fea­ture flags
  • No entry point: There is no single entry point to eval­u­ate the fea­ture flags and exper­i­ments dynam­ic­ally if needed

Effi­cient usage of fea­ture flags

Finally, let’s look at a solution that takes these challenges into account and keeps the integration with the feature management tool efficient and flexible.

The diagram below describes the overall system design using Split as the feature management tool, a Cloudflare Worker as the edge server, and Amazon API Gateway as the origin server.

Our goal was to reduce the HTTP requests from the user to only one. To do that, we centralized the logic of collecting and aggregating all the information needed in a Cloudflare Worker that sits between the user and all other backends (Split and API Gateway), similar to the BFF (Backend for Frontend) pattern.

Figure 1: Architecture Diagram

1. Storing feature configuration at the edge

Split is the source of truth for the feature configuration, and we need to keep Cloudflare Workers KV in sync with it. For the KV store to fulfill its purpose, it must always hold the latest configuration. The solution is simply to use Split webhooks to synchronize the configuration between Split and Cloudflare KV.

This way, every time the team changes the fea­ture con­fig­ur­a­tions in Split we ensure that we have the latest con­fig­ur­a­tion at the Edge.
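
As a rough sketch of that synchronization step, the function below applies one change event to a configuration stored under a single KV key. The `FlagChangeEvent` payload is hypothetical (the real Split webhook payload has its own schema), `CONFIG_KEY` is an assumed key name, and `KVLike` only mirrors the `get`/`put` calls of a Workers KV binding so the example can run against an in-memory stand-in.

```typescript
// Minimal subset of a Workers KV binding used here (string get/put).
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string): Promise<void>;
}

// Hypothetical webhook payload, for illustration only.
interface FlagChangeEvent {
  name: string; // feature flag name
  treatment: string; // e.g. "on" or "off"
  changeNumber: number; // increasing version of the change
}

type FlagConfig = Record<string, { treatment: string; changeNumber: number }>;

const CONFIG_KEY = "feature-flags"; // assumed single key holding all flags

// Apply one change event to the stored configuration, ignoring stale events.
async function syncFlagToKv(kv: KVLike, event: FlagChangeEvent): Promise<boolean> {
  const raw = await kv.get(CONFIG_KEY);
  const config: FlagConfig = raw ? JSON.parse(raw) : {};
  const current = config[event.name];
  // Webhooks can arrive out of order, so keep the newest change only.
  if (current && current.changeNumber >= event.changeNumber) return false;
  config[event.name] = { treatment: event.treatment, changeNumber: event.changeNumber };
  await kv.put(CONFIG_KEY, JSON.stringify(config));
  return true;
}

// In-memory stand-in for KV so the sketch runs outside Cloudflare.
function memoryKv(): KVLike {
  const store = new Map<string, string>();
  return {
    get: async (key) => store.get(key) ?? null,
    put: async (key, value) => {
      store.set(key, value);
    },
  };
}
```

Note that this read-modify-write is not atomic, so two webhooks processed at the same moment could overwrite each other. Storing one key per flag is one way to reduce that risk.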

2. Com­mu­nic­a­tion between edge and ori­gin servers

Once the Split configuration is synchronized, our Cloudflare Worker can fetch the stored configuration from Workers KV and forward the request to the origin server with this information in an HTTP header. We should, however, take into account that HTTP requests have size limitations.
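
The sketch below shows one way the flags could be packed into that header and read back at the origin. The `name=treatment` encoding and the function names are assumptions, not the format used in the customer project; headers are modeled as plain objects so the example stays self-contained.

```typescript
type Flags = Record<string, string>; // flag name -> treatment, e.g. { "new-checkout": "on" }

const FLAGS_HEADER = "x-example"; // header name used in this article

// Encode flags as "name=treatment" pairs, sorted by name so that equal
// configurations always produce the same header value.
function encodeFlags(flags: Flags): string {
  return Object.keys(flags)
    .sort()
    .map((name) => `${encodeURIComponent(name)}=${encodeURIComponent(flags[name])}`)
    .join(",");
}

// Return a copy of the incoming headers with the flags header set.
function withFlagsHeader(headers: Record<string, string>, flags: Flags): Record<string, string> {
  return { ...headers, [FLAGS_HEADER]: encodeFlags(flags) };
}

// Origin-side counterpart: decode the header value back into a flag map.
function decodeFlags(value: string): Flags {
  const flags: Flags = {};
  for (const pair of value.split(",")) {
    if (!pair) continue; // an empty header value yields no flags
    const [name, treatment = ""] = pair.split("=");
    flags[decodeURIComponent(name)] = decodeURIComponent(treatment);
  }
  return flags;
}
```

In a Worker, the resulting headers would be attached to the request that is forwarded to the origin. A compact encoding like this one tends to be smaller than JSON, which matters given the header size limits.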

Amazon API Gateway has a hard size limit of 10240 bytes for the headers, so it’s better to have a threshold mechanism for the number of feature flags. This mechanism can also integrate with Slack or a similar tool to alert your team when you’re approaching the size limit.
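
A threshold mechanism of this kind could look roughly like the sketch below. The 80% warning ratio, the way the size is approximated, and the `notify` callback (standing in for a Slack integration) are assumptions for illustration.

```typescript
const HEADER_LIMIT_BYTES = 10240; // limit quoted in this article for Amazon API Gateway
const WARN_RATIO = 0.8; // assumed: alert at 80% of the limit

type BudgetStatus = "ok" | "warn" | "over";

// Approximate the header size as the sum of name and value lengths.
// Assumes ASCII-only headers, so one character equals one byte.
function headersSize(headers: Record<string, string>): number {
  let total = 0;
  for (const [name, value] of Object.entries(headers)) {
    total += name.length + value.length;
  }
  return total;
}

// `notify` is a hypothetical hook, e.g. a function that posts to a Slack channel.
function checkHeaderBudget(
  headers: Record<string, string>,
  notify: (message: string) => void,
  limit: number = HEADER_LIMIT_BYTES,
): BudgetStatus {
  const size = headersSize(headers);
  if (size > limit) {
    notify(`Headers are ${size} bytes, over the ${limit}-byte limit`);
    return "over";
  }
  if (size >= limit * WARN_RATIO) {
    notify(`Headers are ${size} bytes, approaching the ${limit}-byte limit`);
    return "warn";
  }
  return "ok";
}
```

Running the check in the Worker before forwarding gives the team time to clean up old flags before requests start failing at the origin.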

As an improve­ment to reduce the “x‑example” header size, we can try to send only the fea­ture flags we need based on the cur­rent request.
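
One possible way to do that is to keep a map from route prefixes to the flags each route reads, and forward only those. The routes and flag names below are made up for the example, and the prefix matching is an assumption about how requests would be grouped.

```typescript
type Flags = Record<string, string>; // flag name -> treatment

// Hypothetical mapping from route prefix to the flags that route reads.
const FLAGS_BY_ROUTE: Record<string, string[]> = {
  "/checkout": ["new-checkout", "express-pay"],
  "/search": ["new-ranking"],
};

// Select only the flags that the requested path needs.
function flagsForPath(
  path: string,
  allFlags: Flags,
  routes: Record<string, string[]> = FLAGS_BY_ROUTE,
): Flags {
  const selected: Flags = {};
  for (const [prefix, names] of Object.entries(routes)) {
    // Match the prefix itself or any sub-path, but not "/checkouts".
    if (path !== prefix && !path.startsWith(prefix + "/")) continue;
    for (const name of names) {
      if (Object.prototype.hasOwnProperty.call(allFlags, name)) {
        selected[name] = allFlags[name];
      }
    }
  }
  return selected;
}
```

The trade-off is that the route map becomes one more thing to maintain: a flag that is read by a page but missing from the map would silently not reach the origin.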

Bene­fits
  • Cent­ral­ized con­fig­ur­a­tion: There is an entry point where we can manip­u­late data before expos­ing it to the applications.
  • Cloud­flare Work­ers KV as stor­age: The fea­ture flags are stored in a global and low-latency key-value data store with a cache system.
Caveats
  • HTTP request size: Origin servers usually have limits on the HTTP request size. In our case, Amazon API Gateway has a hard limit of 10240 bytes for the HTTP headers.
  • Page cache on the Cloudflare Worker: The cache hit ratio can drop, since the “x-example” header varies with the possible values of the feature flags.

For our customer’s use case, the benefits outweighed the caveats. We ensured every team had a single entry point, minimized the requests and the flicker effect by following the BFF pattern, and the cache hit ratio was still very acceptable.

A few alternatives

There are different approaches to feature flags, depending on what you need to optimize for. Smaller teams may not need the flexibility of a platform like Split, and if the flags are not updated frequently, you can be more aggressive with caching.

  • Cache the request locally and tune the sync interval to your use case
  • Define feature flags as environment variables directly in the code when they don’t change frequently (this means you will need to deploy every time you want to change the configuration)
  • Use AWS Systems Manager (SSM) Parameter Store to store the feature flags and then turn them on and off through the AWS console
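
For the first alternative, a local cache with a configurable sync time can be sketched in a few lines. The `cachedWithTtl` helper and its injectable clock are assumptions for illustration; `load` stands for whatever call fetches the flags from your provider.

```typescript
// Wrap an async loader so its result is reused until `ttlMs` has elapsed.
// `now` is injectable to make the behavior easy to test.
function cachedWithTtl<T>(
  load: () => Promise<T>,
  ttlMs: number,
  now: () => number = Date.now,
): () => Promise<T> {
  let value: T | undefined;
  let hasValue = false;
  let expiresAt = 0;
  return async () => {
    const t = now();
    if (!hasValue || t >= expiresAt) {
      value = await load(); // refresh from the source of truth
      hasValue = true;
      expiresAt = t + ttlMs;
    }
    return value as T;
  };
}

// Usage: refresh the flags at most once per minute.
const getFlags = cachedWithTtl(
  async () => ({ "new-checkout": "on" }), // placeholder loader
  60_000,
);
```

The TTL is the sync time mentioned above: a shorter value picks up flag changes sooner, while a longer one saves more requests.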

Con­clu­sion

At synvert, our mis­sion is to empower teams and trans­form them into high-per­form­ing organizations.

By empower­ing our cus­tomer to integ­rate fea­ture flags into their Soft­ware Devel­op­ment Life­cycle, we achieved a remark­able enhance­ment in the col­lab­or­a­tion between Product Man­agers and Soft­ware Engin­eers. This imple­ment­a­tion provided them with greater con­trol over the fea­tures, lead­ing to stream­lined com­mu­nic­a­tion and improved coordination.

Moreover, this break­through promp­ted a trans­form­a­tion in the branch­ing strategy, trans­ition­ing to trunk-based devel­op­ment. This stra­tegic shift had a pro­found impact on the velo­city of soft­ware releases to pro­duc­tion. The adop­tion of fea­ture flags and trunk-based devel­op­ment res­ul­ted in shorter feed­back loop cycles with end users, enabling rapid iter­a­tion and refinement!