Leader Election

TinySystems modules scale horizontally by running several replicas and using Kubernetes-based leader election to decide which replica writes shared state. Any component that may run with more than one replica needs to take leadership into account.

Why Leader Election?

When running multiple replicas of a module:

+-----------------------------------------------------------------------------+
|                    PROBLEM: MULTIPLE REPLICAS                                |
+-----------------------------------------------------------------------------+

Without leader election:

   Pod A                    Pod B                    Pod C
     |                        |                        |
     | Update TinyNode -------|------------------------|
     |                        | Update TinyNode -------|
     |                        |                        | Update TinyNode
     |                        |                        |
     v                        v                        v
+-----------------------------------------------------------------------------+
|                          CONFLICT!                                           |
|   All pods try to update the same CRs                                       |
|   Race conditions, lost updates, inconsistent state                         |
+-----------------------------------------------------------------------------+

With leader election:

   Pod A (LEADER)           Pod B (READER)           Pod C (READER)
     |                        |                        |
     | Update TinyNode        | Watch only             | Watch only
     | Process signals        | Handle messages        | Handle messages
     |                        |                        |
     v                        v                        v
+-----------------------------------------------------------------------------+
|                          CONSISTENT                                          |
|   Only leader writes to CRs                                                 |
|   All pods handle incoming messages                                         |
+-----------------------------------------------------------------------------+
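The conflict in the first diagram can be sketched with a toy in-memory store that, like the Kubernetes API server, rejects a write made against a stale resourceVersion. The `store`, `node`, and `writeFromAll` names are hypothetical and exist only for this illustration; they are not TinySystems code.

```go
package main

import (
    "errors"
    "fmt"
)

// node is a hypothetical stand-in for a TinyNode CR.
type node struct {
    ResourceVersion int
    Status          string
}

var errConflict = errors.New("conflict: object was modified")

// store mimics optimistic concurrency: an update must carry the
// resourceVersion it was read at, otherwise it is rejected.
type store struct{ current node }

func (s *store) Get() node { return s.current }

func (s *store) Update(n node) error {
    if n.ResourceVersion != s.current.ResourceVersion {
        return errConflict
    }
    n.ResourceVersion++
    s.current = n
    return nil
}

// writeFromAll lets every pod read the object first and then write it,
// which is what happens when no leader is elected. It returns the number
// of rejected writes.
func writeFromAll(s *store, pods []string) int {
    reads := make([]node, len(pods))
    for i := range pods {
        reads[i] = s.Get()
    }
    conflicts := 0
    for i, pod := range pods {
        n := reads[i]
        n.Status = "updated by " + pod
        if err := s.Update(n); err != nil {
            conflicts++
        }
    }
    return conflicts
}

func main() {
    pods := []string{"pod-a", "pod-b", "pod-c"}
    fmt.Println("conflicts with three writers:", writeFromAll(&store{}, pods))
    fmt.Println("conflicts with one writer:", writeFromAll(&store{}, pods[:1]))
}
```

With three writers, only the first update lands and the other two are rejected; with a single writer there is nothing to conflict with, which is the property leader election provides.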

Kubernetes Lease-Based Election

TinySystems uses Kubernetes Leases for leader election:

// cli/run.go
func setupLeaderElection(ctx context.Context, namespace, moduleName, podName string) (*atomic.Bool, error) {
    isLeader := &atomic.Bool{}

    // Create lease lock
    lock, err := resourcelock.New(
        resourcelock.LeasesResourceLock,
        namespace,
        fmt.Sprintf("%s-lock", utils.SanitizeResourceName(moduleName)),
        nil,
        coreClient.CoordinationV1(),
        resourcelock.ResourceLockConfig{
            Identity: utils.SanitizeResourceName(podName),
        },
    )
    if err != nil {
        return nil, err
    }

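    // Run the election in the background. RunOrDie blocks until ctx is
    // cancelled or leadership is lost, and it does not campaign again
    // after it returns.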
    go leaderelection.RunOrDie(ctx, leaderelection.LeaderElectionConfig{
        Lock:            lock,
        LeaseDuration:   15 * time.Second,
        RenewDeadline:   10 * time.Second,
        RetryPeriod:     2 * time.Second,
        Callbacks: leaderelection.LeaderCallbacks{
            OnStartedLeading: func(ctx context.Context) {
                log.Info("became leader")
                isLeader.Store(true)
            },
            OnStoppedLeading: func() {
                log.Info("stopped leading")
                isLeader.Store(false)
            },
            OnNewLeader: func(identity string) {
                log.Info("new leader elected", "leader", identity)
            },
        },
    })

    return isLeader, nil
}
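The three durations passed to LeaderElectionConfig above are not independent. To the best of our understanding, client-go rejects a configuration unless LeaseDuration is greater than RenewDeadline and RenewDeadline is greater than RetryPeriod multiplied by its jitter factor of 1.2. The stand-alone sketch below mirrors that check with the standard library only; `checkTimings` and `timingsOK` are hypothetical helper names, not client-go functions.

```go
package main

import (
    "errors"
    "fmt"
    "time"
)

// jitterFactor is assumed to match the JitterFactor constant in
// client-go's leaderelection package.
const jitterFactor = 1.2

// checkTimings reports whether the three durations satisfy the ordering
// constraints described above.
func checkTimings(leaseDuration, renewDeadline, retryPeriod time.Duration) error {
    if leaseDuration <= renewDeadline {
        return errors.New("leaseDuration must be greater than renewDeadline")
    }
    if renewDeadline <= time.Duration(jitterFactor*float64(retryPeriod)) {
        return errors.New("renewDeadline must be greater than retryPeriod*jitterFactor")
    }
    return nil
}

// timingsOK is a convenience wrapper that takes whole seconds.
func timingsOK(leaseSec, renewSec, retrySec int) bool {
    return checkTimings(
        time.Duration(leaseSec)*time.Second,
        time.Duration(renewSec)*time.Second,
        time.Duration(retrySec)*time.Second,
    ) == nil
}

func main() {
    // The values used in setupLeaderElection: 15s / 10s / 2s.
    fmt.Println("15/10/2 valid:", timingsOK(15, 10, 2))
    // A renew deadline equal to the lease duration is rejected.
    fmt.Println("10/10/2 valid:", timingsOK(10, 10, 2))
}
```

Running a check like this before changing the defaults catches an invalid combination early instead of at pod startup.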

The Lease Resource

apiVersion: coordination.k8s.io/v1
kind: Lease
metadata:
  name: common-module-v1-lock
  namespace: tinysystems
spec:
  holderIdentity: common-module-pod-abc123
  leaseDurationSeconds: 15
  acquireTime: "2024-01-15T10:30:00Z"
  renewTime: "2024-01-15T10:30:10Z"
  leaderTransitions: 5
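As a rough illustration of how these fields relate, the sketch below treats a lease as expired once `renewTime` plus `leaseDurationSeconds` has passed. The helper names are hypothetical. Note the simplification: client-go candidates are understood to measure expiry from the moment they locally observed the last renewal, rather than comparing wall clocks, so that clock skew between pods matters less.

```go
package main

import (
    "fmt"
    "time"
)

// leaseExpired reports whether a lease with the given renewTime (RFC 3339)
// and duration has lapsed at the instant now.
func leaseExpired(renewTime string, leaseDurationSeconds int, now time.Time) (bool, error) {
    renewed, err := time.Parse(time.RFC3339, renewTime)
    if err != nil {
        return false, err
    }
    deadline := renewed.Add(time.Duration(leaseDurationSeconds) * time.Second)
    return now.After(deadline), nil
}

// expiredAfter checks the lease secondsLater seconds after its renewTime.
// It returns false if renewTime cannot be parsed.
func expiredAfter(renewTime string, leaseDurationSeconds, secondsLater int) bool {
    renewed, err := time.Parse(time.RFC3339, renewTime)
    if err != nil {
        return false
    }
    now := renewed.Add(time.Duration(secondsLater) * time.Second)
    expired, _ := leaseExpired(renewTime, leaseDurationSeconds, now)
    return expired
}

func main() {
    const renewTime = "2024-01-15T10:30:10Z" // value from the Lease above
    fmt.Println("expired 10s after renewal:", expiredAfter(renewTime, 15, 10))
    fmt.Println("expired 16s after renewal:", expiredAfter(renewTime, 15, 16))
}
```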

Checking Leadership

Components check leadership via context:

import "github.com/tiny-systems/module/pkg/utils"

func (c *Component) Handle(ctx context.Context, output module.Handler, port string, msg any) any {
    if port == v1alpha1.ControlPort {
        // Only leader should process control actions
        if !utils.IsLeader(ctx) {
            return nil  // Ignore on non-leader pods
        }

        // Leader-only logic
        c.startOperation()
    }
    return nil
}
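The implementation of `utils.IsLeader` is not shown here. One common way to carry such a flag is a context value with an unexported key type, sketched below under that assumption. `withLeader`, `isLeader`, and `ranLeaderLogic` are hypothetical names, and the real TinySystems helper may work differently.

```go
package main

import (
    "context"
    "fmt"
)

// leaderKey is an unexported key type so no other package can collide with it.
type leaderKey struct{}

// withLeader returns a child context carrying the leadership flag.
func withLeader(ctx context.Context, leader bool) context.Context {
    return context.WithValue(ctx, leaderKey{}, leader)
}

// isLeader reads the flag; a context without it is treated as non-leader.
func isLeader(ctx context.Context) bool {
    v, ok := ctx.Value(leaderKey{}).(bool)
    return ok && v
}

// ranLeaderLogic mimics the guard in Handle and reports whether the
// leader-only branch ran for a pod with the given leadership state.
func ranLeaderLogic(podIsLeader bool) bool {
    ctx := withLeader(context.Background(), podIsLeader)
    if !isLeader(ctx) {
        return false // ignored on non-leader pods
    }
    return true // leader-only logic would run here
}

// defaultIsLeader shows the result for a context that carries no flag.
func defaultIsLeader() bool {
    return isLeader(context.Background())
}

func main() {
    fmt.Println("leader pod runs logic:", ranLeaderLogic(true))
    fmt.Println("reader pod runs logic:", ranLeaderLogic(false))
    fmt.Println("missing flag counts as leader:", defaultIsLeader())
}
```

Defaulting to non-leader when the flag is missing is the safer choice, because a forgotten flag then skips a write instead of causing a conflicting one.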

Leader Responsibilities

Only the leader pod should:

Action	Why Leader Only
Update TinyModule status	Avoid conflicting updates
Update TinyNode status	Single source of truth
Process TinySignal CRs	Prevent duplicate execution
Expose ports to Ingress	Single ingress configuration
Write to shared metadata	Consistent state

Reader Responsibilities

All pods (including the leader) should:

Action	Why All Pods
Watch CRs for changes	Stay in sync
Handle incoming messages	Load distribution
Apply local reconciliation	Maintain state
Run gRPC server	Accept cross-module calls

Leader Election Flow

+-----------------------------------------------------------------------------+
|                       LEADER ELECTION FLOW                                   |
+-----------------------------------------------------------------------------+

1. STARTUP
   +------------------------------------------------------------------------+
   |  All pods try to acquire the Lease                                     |
   |  Only one succeeds (becomes leader)                                    |
   |  Others become readers                                                  |
   +------------------------------------------------------------------------+
                                     |
                                     v
2. LEADER ACTIVE
   +------------------------------------------------------------------------+
   |  Leader renews lease every ~2 seconds (retry period)                   |
   |  Leader steps down if renewal fails for 10 seconds (renew deadline)    |
   |  Leader updates CRs and processes signals                              |
   |  Readers watch and handle messages                                     |
   +------------------------------------------------------------------------+
                                     |
                                     v
3. LEADER FAILURE
   +------------------------------------------------------------------------+
   |  Leader pod dies or network partition                                  |
   |  Lease expires after 15 seconds                                        |
   +------------------------------------------------------------------------+
                                     |
                                     v
4. NEW ELECTION
   +------------------------------------------------------------------------+
   |  Remaining pods compete for lease                                      |
   |  One becomes new leader                                                |
   |  System continues operating                                            |
   +------------------------------------------------------------------------+
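The four phases above can be walked through with a small deterministic simulation. This is only a sketch: the `lease` struct is a hypothetical in-memory stand-in for the Lease object, time is an integer number of seconds on a fake clock, and the real client-go logic handles more cases (API errors, jitter, clock skew).

```go
package main

import "fmt"

// lease is a hypothetical in-memory stand-in for a coordination.k8s.io Lease.
type lease struct {
    Holder          string
    RenewedAt       int // seconds on a fake clock
    DurationSeconds int
    Transitions     int
}

// tryAcquireOrRenew applies the rule from the flow above: the current holder
// may always renew, anyone else may take over only once the lease has
// expired. It returns true if pod holds the lease afterwards.
func (l *lease) tryAcquireOrRenew(pod string, now int) bool {
    expired := l.Holder == "" || now > l.RenewedAt+l.DurationSeconds
    switch {
    case l.Holder == pod:
        l.RenewedAt = now
        return true
    case expired:
        if l.Holder != "" {
            l.Transitions++ // leadership moved from one pod to another
        }
        l.Holder = pod
        l.RenewedAt = now
        return true
    default:
        return false
    }
}

// simulate walks through the four phases and returns the final holder and
// the number of leadership transitions.
func simulate() (string, int) {
    l := &lease{DurationSeconds: 15}
    l.tryAcquireOrRenew("pod-a", 0)  // 1. startup: pod-a wins
    l.tryAcquireOrRenew("pod-b", 0)  //    pod-b loses and stays a reader
    l.tryAcquireOrRenew("pod-a", 2)  // 2. leader renews
    // 3. pod-a dies after t=2 and stops renewing
    l.tryAcquireOrRenew("pod-b", 10) //    lease still held: 10 <= 2+15
    l.tryAcquireOrRenew("pod-b", 18) // 4. lease expired: 18 > 17, pod-b takes over
    return l.Holder, l.Transitions
}

func main() {
    holder, transitions := simulate()
    fmt.Println("holder:", holder, "transitions:", transitions)
}
```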

Failover Timing

Leader dies
     |
     | <--- Up to 15 seconds (lease duration)
     |
     v
Lease expires
     |
     | <--- Up to 2 seconds (retry period)
     |
     v
New leader elected
     |
     | <--- Immediate
     |
     v
System operational

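Total failover time: roughly 17 seconds in the worst case (15 s lease duration + 2 s retry period). client-go adds random jitter to the retry period, so the observed delay can be a few seconds longer.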
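The arithmetic behind this estimate can be written down as a short helper. It is a sketch under one assumption: that candidates retry with client-go's jitter factor of 1.2, which stretches a single retry interval to at most RetryPeriod × 2.2. `failoverBounds` and `failoverMillis` are hypothetical names.

```go
package main

import (
    "fmt"
    "time"
)

// jitterFactor is assumed to match client-go's JitterFactor constant.
const jitterFactor = 1.2

// failoverBounds returns the worst-case failover delay without jitter
// (lease duration + one retry period) and with the maximum retry jitter.
func failoverBounds(leaseDuration, retryPeriod time.Duration) (nominal, withJitter time.Duration) {
    nominal = leaseDuration + retryPeriod
    maxRetry := retryPeriod + time.Duration(jitterFactor*float64(retryPeriod))
    withJitter = leaseDuration + maxRetry
    return nominal, withJitter
}

// failoverMillis is a convenience wrapper that takes whole seconds and
// returns both bounds in milliseconds.
func failoverMillis(leaseSec, retrySec int) (int64, int64) {
    nominal, withJitter := failoverBounds(
        time.Duration(leaseSec)*time.Second,
        time.Duration(retrySec)*time.Second,
    )
    return nominal.Milliseconds(), withJitter.Milliseconds()
}

func main() {
    nominal, withJitter := failoverBounds(15*time.Second, 2*time.Second)
    fmt.Println("nominal worst case:", nominal)
    fmt.Println("with maximum jitter:", withJitter)
}
```

Shortening the lease duration lowers both bounds, at the cost of more frequent renewals and a higher chance of spurious leadership changes when the API server is slow.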

Using IsLeader in Components

Ticker Component Example

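func (t *Ticker) Handle(ctx context.Context, output module.Handler, port string, msg any) any {
    if port == v1alpha1.ControlPort {
        // Only the leader starts or stops the ticker
        if !utils.IsLeader(ctx) {
            return nil
        }

        // Checked assertion: an unexpected payload yields an error instead of a panic
        control, ok := msg.(Control)
        if !ok {
            return fmt.Errorf("unexpected control message type %T", msg)
        }
        if control.Start {
            go t.startEmitting(ctx, output)
        } else if control.Stop {
            t.stopEmitting()
        }
    }
    return nil
}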
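The bodies of `startEmitting` and `stopEmitting` are not shown above. A minimal sketch of what such a pair might look like follows, with a guard so that a repeated Start message does not launch a second loop. The `emitter` type and its methods are hypothetical and use only the standard library.

```go
package main

import (
    "context"
    "fmt"
    "sync"
    "time"
)

// emitter is a hypothetical stand-in for the Ticker component's internals.
type emitter struct {
    mu     sync.Mutex
    cancel context.CancelFunc // non-nil while the loop is running
}

// start launches the emit loop unless one is already running.
// It reports whether a new loop was started.
func (e *emitter) start(parent context.Context, every time.Duration, emit func()) bool {
    e.mu.Lock()
    defer e.mu.Unlock()
    if e.cancel != nil {
        return false // already emitting; repeated Start messages are no-ops
    }
    ctx, cancel := context.WithCancel(parent)
    e.cancel = cancel
    go func() {
        t := time.NewTicker(every)
        defer t.Stop()
        for {
            select {
            case <-ctx.Done():
                return
            case <-t.C:
                emit()
            }
        }
    }()
    return true
}

// stop cancels the loop if it is running and reports whether it did so.
func (e *emitter) stop() bool {
    e.mu.Lock()
    defer e.mu.Unlock()
    if e.cancel == nil {
        return false
    }
    e.cancel()
    e.cancel = nil
    return true
}

// startTwiceThenStop exercises the guards on both methods.
func startTwiceThenStop() (first, second, stopped, stoppedAgain bool) {
    e := &emitter{}
    first = e.start(context.Background(), time.Millisecond, func() {})
    second = e.start(context.Background(), time.Millisecond, func() {})
    stopped = e.stop()
    stoppedAgain = e.stop()
    return first, second, stopped, stoppedAgain
}

func main() {
    fmt.Println(startTwiceThenStop())
}
```

Keeping the cancel function behind a mutex means Start and Stop messages arriving on different goroutines cannot race on the loop's lifetime.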

HTTP Server Example

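func (s *Server) Handle(ctx context.Context, output module.Handler, port string, msg any) any {
    if port == v1alpha1.ReconcilePort {
        // Checked assertion: an unexpected payload yields an error instead of a panic
        node, ok := msg.(v1alpha1.TinyNode)
        if !ok {
            return fmt.Errorf("unexpected reconcile message type %T", msg)
        }

        // Read the published port from metadata (all pods).
        // Named httpPort so it does not shadow the port parameter.
        httpPort := node.Status.Metadata["http-server-port"]

        if utils.IsLeader(ctx) && httpPort == "" {
            // Leader starts server and publishes port
            actualPort := s.startServer()
            output(ctx, v1alpha1.ReconcilePort, func(n *v1alpha1.TinyNode) {
                // Writing to a nil map panics, so allocate it first if needed
                if n.Status.Metadata == nil {
                    n.Status.Metadata = map[string]string{}
                }
                n.Status.Metadata["http-server-port"] = strconv.Itoa(actualPort)
            })
        } else if httpPort != "" {
            // All pods use the published port
            s.startOnPort(httpPort)
        }
    }
    return nil
}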
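How `startServer` obtains `actualPort` is not shown above. One way to get an OS-assigned port is to listen on port 0 and read the port back from the listener, as sketched below. The function names are hypothetical, and binding to 127.0.0.1 is an assumption made to keep the example local; a pod would normally bind to an address reachable by other pods.

```go
package main

import (
    "fmt"
    "net"
    "net/http"
    "strconv"
)

// startServer listens on an OS-assigned loopback port and serves in the
// background. It returns the chosen port and a function that stops the server.
func startServer(handler http.Handler) (port int, stop func() error, err error) {
    ln, err := net.Listen("tcp", "127.0.0.1:0") // port 0 lets the OS pick
    if err != nil {
        return 0, nil, err
    }
    srv := &http.Server{Handler: handler}
    go srv.Serve(ln) // returns http.ErrServerClosed once stop is called
    return ln.Addr().(*net.TCPAddr).Port, srv.Close, nil
}

// publishPort records the port under the metadata key used above,
// allocating the map if it is nil.
func publishPort(metadata map[string]string, port int) map[string]string {
    if metadata == nil {
        metadata = map[string]string{}
    }
    metadata["http-server-port"] = strconv.Itoa(port)
    return metadata
}

func main() {
    port, stop, err := startServer(http.NotFoundHandler())
    if err != nil {
        fmt.Println("listen failed:", err)
        return
    }
    defer stop()
    fmt.Println("metadata:", publishPort(nil, port))
}
```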

Controller-Level Leadership

Controllers check leadership too, reading the shared atomic flag directly instead of going through the context:

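func (r *TinyNodeReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    // Fetch the TinyNode this request refers to
    node := &v1alpha1.TinyNode{}
    if err := r.Get(ctx, req.NamespacedName, node); err != nil {
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }

    // All pods reconcile locally
    r.Scheduler.Update(ctx, node)

    // Only leader updates status
    if !r.IsLeader.Load() {
        return ctrl.Result{RequeueAfter: 30 * time.Second}, nil
    }

    // Leader-only: update status and surface any error so the request is retried
    if err := r.Status().Update(ctx, node); err != nil {
        return ctrl.Result{}, err
    }
    return ctrl.Result{RequeueAfter: 5 * time.Minute}, nil
}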

Testing Leadership

For local development with a single replica, the election can be skipped so that the pod always acts as leader:

// Local development: always leader
if os.Getenv("FORCE_LEADER") == "true" {
    isLeader.Store(true)
    return isLeader, nil
}

Best Practices

1. Don't Assume Leadership

// Bad: Assumes will always be leader
func (c *Component) Handle(...) {
    c.updateClusterState()  // May not be leader!
}

// Good: Check leadership
func (c *Component) Handle(ctx context.Context, ...) {
    if utils.IsLeader(ctx) {
        c.updateClusterState()
    }
}

2. Handle Leadership Changes

type Component struct {
    cancelFunc context.CancelFunc
    mu         sync.Mutex
}

func (c *Component) Handle(ctx context.Context, output module.Handler, port string, msg any) any {
    if port == v1alpha1.ReconcilePort {
        c.mu.Lock()
        defer c.mu.Unlock()

        if utils.IsLeader(ctx) && c.cancelFunc == nil {
            // Just became leader: run leader-only work under its own cancellable context
            workCtx, cancel := context.WithCancel(ctx)
            c.cancelFunc = cancel
            go c.startLeaderOnlyWork(workCtx)
        } else if !utils.IsLeader(ctx) && c.cancelFunc != nil {
            // Lost leadership: stop the leader-only work
            c.cancelFunc()
            c.cancelFunc = nil
        }
    }
    return nil
}

3. Idempotent Leader Operations

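func (c *Component) Handle(ctx context.Context, output module.Handler, port string, msg any) any {
    if utils.IsLeader(ctx) {
        // Idempotent: safe to call multiple times
        output(ctx, v1alpha1.ReconcilePort, func(n *v1alpha1.TinyNode) {
            // Writing to a nil map panics, so allocate it first if needed
            if n.Status.Metadata == nil {
                n.Status.Metadata = map[string]string{}
            }
            if n.Status.Metadata["initialized"] != "true" {
                n.Status.Metadata["initialized"] = "true"
                // Do initialization...
            }
        })
    }
    return nil
}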
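The same check-then-mark idea can be tried in isolation. The sketch below uses a hypothetical `status` type in place of the TinyNode status and counts how often the setup function actually runs when the operation is repeated, for example after a new leader takes over and replays it.

```go
package main

import "fmt"

// status is a hypothetical stand-in for the TinyNode status.
type status struct{ Metadata map[string]string }

// ensureInitialized runs setup at most once per status object and records
// completion in metadata. It reports whether setup ran on this call.
func ensureInitialized(s *status, setup func()) bool {
    if s.Metadata == nil {
        s.Metadata = map[string]string{}
    }
    if s.Metadata["initialized"] == "true" {
        return false // already done, possibly by a previous leader
    }
    setup()
    s.Metadata["initialized"] = "true"
    return true
}

// countInits calls ensureInitialized the given number of times on one
// status object and returns how many times setup actually ran.
func countInits(calls int) int {
    s := &status{}
    ran := 0
    for i := 0; i < calls; i++ {
        ensureInitialized(s, func() { ran++ })
    }
    return ran
}

func main() {
    fmt.Println("setup runs for 3 calls:", countInits(3))
}
```

Because the marker lives in the shared status rather than in pod memory, a newly elected leader sees that initialization already happened and skips it.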

Next Steps

Leader-Reader Pattern - Full pattern documentation
CR-Based State Propagation - Sharing state
Multi-Replica Coordination - Coordination patterns
