Guide · May 2, 2026 · 18 min read · SignaKit

Feature Flags Best Practices Checklist

29 actionable checks across 5 categories — naming, targeting, monitoring, cleanup, and security. Use this as a team reference for every flag you create, and as a quarterly audit checklist for your existing flag inventory.


The 5 rules that cover 80% of mistakes

  • Name every flag with a type prefix: exp_, rel_, ops_, or perm_
  • Every flag needs a named owner and a planned cleanup date before it goes live
  • Use opaque user IDs (never raw emails) as targeting attributes
  • Run a quarterly audit — any exp_ or rel_ flag older than 90 days with no active rollout is stale
  • Rotate your SIGNAKIT_SDK_KEY quarterly and treat it as a server-side secret

Feature flags are powerful and cheap to create — which is exactly why teams accumulate them. A codebase with 5 flags is easy to reason about. At 50 flags with no naming standard, no owners, and no cleanup process, you have a system nobody wants to touch. The practices below prevent that outcome.

These checks fall into 5 categories. The naming and ownership checks apply at flag creation time; the rest apply on a recurring basis. For deeper background, see our guide on feature flag naming conventions and the complete feature flags guide.

01

Naming & Ownership

Use a type prefix: exp_, rel_, ops_, or perm_

The prefix tells everyone what kind of flag this is, how long it should live, and how to treat it. Without a prefix, all flags look the same in dashboards and alerts.

Write in lowercase_snake_case

Avoid camelCase, PascalCase, and screaming snake. Lowercase snake_case reads cleanly in code, log output, and API responses, and matches how most SDKs serialize flag keys.

Include the product area in the name

exp_checkout_cta_color, not exp_cta_color. The product area makes the flag findable when scanning a list of 40 flags in the dashboard.

Name the behavior, not the expected value

rel_checkout_v2, not checkout_enabled. Names that encode the expected state (enabled, show_new, use_redesign) become misleading when the flag is flipped.
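The four naming rules above can be enforced mechanically. Here is a minimal sketch of a lint check; the prefix list and lowercase_snake_case rule come from this checklist, while the regex and function name are ours. It requires at least two segments after the prefix (product area plus behavior), so it cannot catch a missing product area by itself — that still needs human review.

```typescript
// Sketch of a naming lint for the rules above. Requires: a type prefix
// (exp_, rel_, ops_, perm_), lowercase_snake_case, and at least two
// segments after the prefix (product area + behavior).
const FLAG_KEY_PATTERN = /^(exp|rel|ops|perm)_[a-z0-9]+(_[a-z0-9]+)+$/;

function isValidFlagKey(key: string): boolean {
  return FLAG_KEY_PATTERN.test(key);
}
```

Run this in CI against new flag keys so malformed names never reach the dashboard.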

Record an owner for every flag

The owner is accountable for the flag's lifecycle — including cleaning it up. Use a team name if individuals rotate.

Set a planned cleanup date before the flag goes live

For exp_ and rel_ flags: set a date within 30–90 days. For ops_ flags: set a review date (not a deletion date). For perm_ flags: no deadline needed, but review annually.
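Because the window depends only on the type prefix, the default deadline can be derived automatically at creation time. A sketch, using the outer bounds suggested above (90 days for exp_/rel_, an annual review for ops_ and perm_); the function and type names are illustrative, not SignaKit API:

```typescript
type FlagType = "exp" | "rel" | "ops" | "perm";

// Derive a default cleanup (or review) date from the flag's type prefix,
// per the windows in this checklist: exp_/rel_ within 90 days,
// ops_/perm_ reviewed annually.
function cleanupDeadline(createdAt: Date, type: FlagType): Date {
  const days = type === "exp" || type === "rel" ? 90 : 365;
  const d = new Date(createdAt);
  d.setUTCDate(d.getUTCDate() + days); // UTC to avoid local-timezone drift
  return d;
}
```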

One flag, one responsibility

A single flag controlling two independent code paths creates coupling and makes incident response ambiguous. Split into two flags even if it feels redundant.

02

Targeting & Rollout

Use opaque user IDs, not emails or names, as targeting attributes

The user ID passed to createUserContext() is included in exposure events and stored in your data warehouse. Use a hashed or random identifier — never a raw email address or full name.
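One common way to get an opaque identifier is to hash the email before it ever reaches the SDK. A sketch using Node's built-in crypto module; the salt constant and function name are our own illustration, not something SignaKit requires:

```typescript
import { createHash } from "node:crypto";

// Application-level salt so the hash isn't a plain rainbow-table lookup.
// Placeholder value — store your real salt in configuration, not source.
const SALT = "my-app-targeting-salt";

// Normalize, salt, and hash an email into a stable opaque user ID
// suitable for targeting attributes and exposure events.
function opaqueUserId(email: string): string {
  return createHash("sha256")
    .update(SALT + email.trim().toLowerCase())
    .digest("hex");
}
```

The same input always yields the same ID, so bucketing stays consistent across sessions, while the raw email never lands in your events or warehouse.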

Start gradual rollouts at 1–5%, not 50%

The first 1% reveals most integration bugs and unexpected behavior before they affect a significant portion of users. Expand to 10%, 25%, 50%, 100% with monitoring between each step.
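The rollout ladder above is easy to encode so nobody jumps a step by hand. A sketch; the step values come from the text, the helper itself is illustrative:

```typescript
// The staged rollout ladder from the text: 1 → 5 → 10 → 25 → 50 → 100.
const ROLLOUT_STEPS = [1, 5, 10, 25, 50, 100];

// Returns the next percentage to roll out to, or null at 100%.
function nextRolloutStep(currentPct: number): number | null {
  const next = ROLLOUT_STEPS.find((s) => s > currentPct);
  return next ?? null;
}
```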

Enable bot detection by passing $userAgent

Pass the $userAgent attribute when creating the user context. The SDK automatically excludes bot traffic from experiments and suppresses exposure and conversion events for bots.

Test your flag in staging before enabling in production

Create a separate SDK key for staging. Verify that both branches of every flag render correctly, and that disabling the flag (returning the control path) also works correctly.

Verify that the null/off path works before enabling

decide() can return null if a flag key doesn't exist or is archived. Your code must always handle the null case and render a sensible default experience.
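A thin wrapper makes the null case impossible to forget. In this sketch the `Decide` type is a stand-in for the SignaKit SDK's actual `decide()` signature (assumed here to return a variation key or null), so treat it as illustrative:

```typescript
// Stand-in for the SDK's decide(): returns a variation key, or null when
// the flag key doesn't exist or has been archived.
type Decide = (flagKey: string) => string | null;

// Always resolve to a concrete variation: the SDK's answer when there is
// one, otherwise the caller-supplied default (the control experience).
function decideOrDefault(decide: Decide, flagKey: string, fallback: string): string {
  return decide(flagKey) ?? fallback;
}
```

Callers then never branch on null themselves; every render path gets a real variation key.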

Confirm traffic splits add up to 100%

A 50/50 split with targeting rules that exclude 40% of users means only 60% of your users are bucketed at all — 30% per arm. Check effective sample sizes, not just configured percentages.
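The arithmetic is simple enough to check in code. A sketch (the function is ours, not part of any SDK):

```typescript
// Effective share of ALL users landing in one arm, after targeting rules
// exclude some percentage of traffic. E.g. a 50% arm with 40% of users
// excluded actually covers 50% * 60% = 30% of users.
function effectiveArmShare(armSplitPct: number, excludedPct: number): number {
  return (armSplitPct * (100 - excludedPct)) / 100;
}
```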

03

Monitoring & Alerts

Monitor exposure event counts per flag

A sudden drop in exposure events for a flag often means a targeting rule is misconfigured or the SDK is not being initialized for a code path. Set an alert for >50% drops in exposure volume.
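The >50% drop rule is trivial to wire into whatever alerting you already have. A sketch; the threshold comes from the text, the helper and its zero-baseline guard are our own choices:

```typescript
// Fire an alert when exposure volume falls below half of the previous
// period's count. A zero baseline (brand-new flag) is not treated as a drop.
function exposureDropAlert(previousCount: number, currentCount: number): boolean {
  if (previousCount === 0) return false;
  return currentCount < previousCount * 0.5;
}
```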

Watch for unexpected variant distribution

If a 50/50 split consistently shows 48/52 or 55/45, your user ID distribution may be skewed. Investigate before interpreting experiment results.
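Whether 48/52 is noise or skew depends on sample size, so a quick statistical check beats eyeballing. A sketch using a two-sided z-test against a fair split; the 1.96 cutoff (roughly 95% confidence) is a conventional choice of ours, not a SignaKit default:

```typescript
// Test whether observed counts are consistent with a configured 50/50
// split. Under a fair split the control count is binomial(n, 0.5); we
// flag the split as skewed when the z-score exceeds 1.96 (~95% level).
function splitLooksSkewed(controlCount: number, treatmentCount: number): boolean {
  const n = controlCount + treatmentCount;
  if (n === 0) return false;
  const expected = n / 2;
  const stdDev = Math.sqrt(n * 0.5 * 0.5); // binomial standard deviation
  const z = Math.abs(controlCount - expected) / stdDev;
  return z > 1.96;
}
```

At 10,000 users a 55/45 split is wildly improbable under fair bucketing, while 50.5/49.5 is well within noise — exactly the distinction the raw percentages hide.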

Set up Slack alerts for flag changes (Starter plan and above)

Every flag enable, disable, or configuration change should be visible to the team. Unexpected flag changes are a common source of unexplained production behavior.

Monitor your guardrail metrics during experiments

Define at least one metric you want to protect (error rate, p95 latency, support ticket volume) and check it alongside your primary experiment metric.

Check experiment results on a predefined schedule

Frequent peeking inflates false positive rates. Define a check schedule before the experiment starts (e.g., after 7 days and after 14 days) and stick to it.
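One way to make the schedule hard to ignore is to gate the results view behind it. A sketch; the day-7/day-14 defaults come from the example above, the helper is illustrative:

```typescript
// True only on the predefined check days (counted from experiment start),
// so results are read on schedule rather than peeked at daily.
function isScheduledCheck(
  startDate: Date,
  now: Date,
  scheduleDays: number[] = [7, 14]
): boolean {
  const dayMs = 24 * 60 * 60 * 1000;
  const elapsedDays = Math.floor((now.getTime() - startDate.getTime()) / dayMs);
  return scheduleDays.includes(elapsedDays);
}
```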

04

Cleanup & Lifecycle

Remove the flag code within 30 days of calling a winner

Dead code behind a flag is still dead code. Merge the winning branch directly, delete the losing branch, and remove the flag from the dashboard. The flag is not a safety net after the winner ships.

Run a quarterly flag audit

Review every exp_ and rel_ flag older than 90 days. If it has no active rollout, it's a cleanup candidate. Ask the owner whether it's still needed — not whether it's safe to delete (that's the owner's job to verify).
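The audit query itself is mechanical; only the owner conversation needs a human. A sketch of the filter, assuming you can export your flag inventory — the `FlagRecord` shape is our illustration, not SignaKit's export format:

```typescript
// Illustrative flag inventory record (not SignaKit's actual export shape).
interface FlagRecord {
  key: string;
  createdAt: Date;
  hasActiveRollout: boolean;
}

// The quarterly audit rule from the text: exp_ or rel_ flags older than
// 90 days with no active rollout are cleanup candidates.
function staleFlagCandidates(flags: FlagRecord[], now: Date): FlagRecord[] {
  const cutoffMs = 90 * 24 * 60 * 60 * 1000; // 90 days
  return flags.filter(
    (f) =>
      /^(exp|rel)_/.test(f.key) &&
      !f.hasActiveRollout &&
      now.getTime() - f.createdAt.getTime() > cutoffMs
  );
}
```

Note that ops_ and perm_ flags never appear in the candidate list — they follow the review cadence described below, not this deletion rule.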

Archive flags rather than deleting them immediately

Archiving removes a flag from active evaluation but preserves its history. Delete it only after the code that referenced it has been merged and deployed, which confirms the app no longer depends on it.

Don't leave exp_ flags in production past their experiment deadline

An experiment flag that outlives its deadline is making a silent product decision. Either call the winner, extend the experiment with a new deadline, or shut it down.

Treat ops_ flags differently from release and experiment flags

Operational kill switches should not be deleted. They exist precisely because the underlying risk (a flaky third-party integration, a heavy database query) still exists. Review annually but keep them alive.

05

Security & Access Control

Keep the SDK key (SIGNAKIT_SDK_KEY) server-side only

The SDK key authenticates your application to the SignaKit API. Never expose it in client-side JavaScript bundles, public repositories, or build logs.

Use separate SDK keys for each environment

Your staging and production environments should have different SDK keys. This prevents accidental flag changes in staging from affecting production, and makes it clear which environment a key belongs to.

Rotate SDK keys quarterly

Treat the SDK key like any other server-side credential. Rotate it on a quarterly schedule — create the new key, update your deployment environment, verify the app is functioning, then revoke the old key.

Assign team members the minimum required role

SignaKit supports Owner, Admin, and Member roles with project-level access control. Reserve Admin and Owner roles for engineers who need to manage flags in production. Read-only access is sufficient for most team members.

Review the audit log after unexpected production behavior

SignaKit maintains a full audit log of flag changes — who changed what, when. Before investigating a production incident as a code bug, check whether a flag was modified around the same time.

Never put secrets or PII in flag names, variation keys, or metadata

Flag configurations are not secret. Names, variation keys, and targeting rule labels are visible to anyone with dashboard access. Keep sensitive data out of flag definitions entirely.

Frequently Asked Questions

How many feature flags is too many?

There's no hard limit — large engineering organizations run thousands of active flags simultaneously. The problem isn't quantity; it's flags without owners, without cleanup dates, and without a naming convention. 200 well-named, well-owned flags are easier to manage than 20 flags with arbitrary names and no cleanup process.

Should we use one flag per environment or one flag for all environments?

One flag definition, separate SDK keys per environment. The flag configuration is shared, but environment-specific SDK keys mean you can enable a flag in staging without enabling it in production. This is the standard setup — don't create separate flags for each environment.

What's the right cleanup timeline for release flags vs experiment flags?

Release (rel_) flags should be cleaned up within 2–4 weeks of reaching 100% rollout. Experiment (exp_) flags should be cleaned up within 2 weeks of calling a winner — time enough to merge the winning code and confirm the deployment is stable. Set these deadlines at flag creation, not after the fact.

Can we use feature flags in a monorepo with multiple services?

Yes. Each service initializes its own SDK instance with createInstance(). Flags are evaluated per service; evaluations of the same flag key in your API service and your frontend service are independent, each generating its own exposure event. Use the same flag key across services if you want the same targeting rules to apply everywhere.

Put these into practice

SignaKit makes the checklist easy to follow

Dashboard audit views, audit logs, Slack notifications, and role-based access — free up to 1 million events per month.