Horizontal Scaling

This guide covers best practices for building TinySystems components that scale horizontally across multiple pod replicas.

Scaling Overview

┌─────────────────────────────────────────────────────────────────────────────┐
│                      HORIZONTALLY SCALED MODULE                              │
└─────────────────────────────────────────────────────────────────────────────┘

                    Kubernetes Deployment
                    replicas: 3

         ┌─────────────────┼─────────────────┐
         │                 │                 │
         ▼                 ▼                 ▼
    ┌─────────┐       ┌─────────┐       ┌─────────┐
    │  Pod 1  │       │  Pod 2  │       │  Pod 3  │
    │ (Leader)│       │(Reader) │       │(Reader) │
    └────┬────┘       └────┬────┘       └────┬────┘
         │                 │                 │
         └─────────────────┼─────────────────┘

                    Kubernetes Service
                    (Load Balancing)

                    ┌──────┴──────┐
                    │             │
              gRPC Traffic    HTTP Traffic
              (from other     (from Ingress)
               modules)

Design Principles

1. Stateless by Default

Components should be stateless; all state comes from:

  • Incoming messages
  • Settings (via settings port)
  • TinyNode metadata (shared state)
go
// Good: Stateless processing
func (c *Component) Handle(ctx context.Context, output module.Handler, port string, msg any) any {
    input := msg.(Input)
    result := transform(input, c.settings)  // Only uses input + settings
    return output(ctx, "output", result)
}

// Bad: Local state
func (c *Component) Handle(ctx context.Context, output module.Handler, port string, msg any) any {
    input := msg.(Input)
    c.cache[input.ID] = input  // This won't be shared across pods!
    return nil
}

2. CR-Based Shared State

For state that must be shared across all pods, store it in the TinyNode CR:

go
// Use TinyNode metadata
func (c *Component) Handle(ctx context.Context, output module.Handler, port string, msg any) any {
    if port == v1alpha1.ReconcilePort {
        node := msg.(v1alpha1.TinyNode)

        // Read shared state (all pods)
        c.sharedConfig = node.Status.Metadata["config"]

        // Write shared state (leader only)
        if utils.IsLeader(ctx) {
            output(ctx, v1alpha1.ReconcilePort, func(n *v1alpha1.TinyNode) {
                n.Status.Metadata["last-update"] = time.Now().Format(time.RFC3339)
            })
        }
    }
    return nil
}

3. Idempotent Operations

Operations should be safe to repeat:

go
func (c *Component) Handle(ctx context.Context, output module.Handler, port string, msg any) any {
    targetPort := c.settings.Port  // desired state comes from settings

    // Idempotent: check current state before acting
    if c.currentPort == targetPort {
        return nil  // Already done
    }

    // Perform the action only when needed
    c.startOnPort(targetPort)
    return nil
}

4. Leader-Only Writes

Only the leader modifies cluster state:

go
func (c *Component) Handle(ctx context.Context, output module.Handler, port string, msg any) any {
    // All pods: handle messages
    if port == "input" {
        return c.processInput(ctx, output, msg)
    }

    // Leader only: update status
    if port == v1alpha1.ReconcilePort && utils.IsLeader(ctx) {
        c.updateClusterState(ctx, output)
    }

    return nil
}

Scaling Patterns

Pattern 1: Load-Balanced Processing

All pods process incoming messages:

go
// All pods can handle this
func (c *Processor) Handle(ctx context.Context, output module.Handler, port string, msg any) any {
    if port == "input" {
        result := c.process(msg.(Input))
        return output(ctx, "output", result)
    }
    return nil
}

Traffic is distributed across the replicas by the module's Kubernetes Service, as sketched below.
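
A plain Service selecting the module's pods is enough to spread connections across replicas; a sketch (the name and port numbers are illustrative, and the Helm chart normally renders this for you):

yaml
apiVersion: v1
kind: Service
metadata:
  name: mymodule
spec:
  selector:
    app.kubernetes.io/name: mymodule
  ports:
    - name: grpc
      port: 9000
      targetPort: 9000

Note that long-lived gRPC connections stick to one pod, so balancing happens per connection rather than per request unless clients reconnect periodically or use client-side load balancing.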

Pattern 2: Coordinated Server

The leader assigns a single port; every pod then listens on it:

go
func (c *Server) handleReconcile(ctx context.Context, output module.Handler, node v1alpha1.TinyNode) any {
    configuredPort := getPortFromMetadata(node)

    if configuredPort == 0 && utils.IsLeader(ctx) {
        // Leader: assign port
        port := c.startOnRandomPort()
        publishPort(ctx, output, port)
        return nil
    }

    if configuredPort > 0 && c.currentPort != configuredPort {
        // All pods: use assigned port
        c.startOnPort(configuredPort)
    }

    return nil
}
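
getPortFromMetadata and publishPort are not SDK calls; a minimal sketch, assuming the port travels through TinyNode metadata under an illustrative "port" key (mirroring the shared-state example above):

go
import "strconv"

// Read the assigned port from shared CR metadata; 0 means not assigned yet
func getPortFromMetadata(node v1alpha1.TinyNode) int {
    p, _ := strconv.Atoi(node.Status.Metadata["port"])
    return p
}

// Leader publishes its chosen port for the other pods to pick up
func publishPort(ctx context.Context, output module.Handler, port int) {
    output(ctx, v1alpha1.ReconcilePort, func(n *v1alpha1.TinyNode) {
        n.Status.Metadata["port"] = strconv.Itoa(port)
    })
}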

Pattern 3: Singleton Operations

Only the leader performs certain operations:

go
func (c *Scheduler) Handle(ctx context.Context, output module.Handler, port string, msg any) any {
    if port == v1alpha1.ControlPort {
        if !utils.IsLeader(ctx) {
            return nil  // Only leader handles control
        }

        control := msg.(Control)
        if control.Start {
            c.startScheduledJobs(ctx, output)
        }
    }
    return nil
}

Pattern 4: Sharded Processing

Distribute work by key:

go
func (c *Sharded) Handle(ctx context.Context, output module.Handler, port string, msg any) any {
    input := msg.(Input)

    // Determine if this pod should handle this key
    podIndex := getPodIndex()
    totalPods := getTotalPods()
    targetPod := hash(input.Key) % totalPods

    if podIndex != targetPod {
        // Not our shard - forward to correct pod
        return c.forwardToShard(ctx, input, targetPod)
    }

    // Our shard - process
    return c.process(ctx, output, input)
}
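
getPodIndex, getTotalPods, and hash are placeholders; one sketch, assuming the module runs as a StatefulSet (so the pod ordinal is in the hostname) and the chart injects the replica count as an environment variable:

go
import (
    "hash/fnv"
    "os"
    "strconv"
    "strings"
)

// Hypothetical: parse the StatefulSet ordinal from the hostname, e.g. "mymodule-2" -> 2
func getPodIndex() int {
    host, _ := os.Hostname()
    parts := strings.Split(host, "-")
    idx, _ := strconv.Atoi(parts[len(parts)-1])
    return idx
}

// Hypothetical: replica count injected by the chart as an env var
func getTotalPods() int {
    n, _ := strconv.Atoi(os.Getenv("REPLICA_COUNT"))
    if n < 1 {
        return 1
    }
    return n
}

// Stable, non-negative hash for shard selection
func hash(key string) int {
    h := fnv.New32a()
    h.Write([]byte(key))
    return int(h.Sum32() & 0x7fffffff)
}

forwardToShard is likewise component-specific; with plain Deployments (no stable ordinals), a work queue or a consistent-hashing library is usually a better fit than hostname parsing.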

Helm Values for Scaling

yaml
# values.yaml
replicaCount: 3

resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi

# Horizontal Pod Autoscaler
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilizationPercentage: 70
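
With autoscaling.enabled, the chart would render a HorizontalPodAutoscaler along these lines (a sketch of the autoscaling/v2 shape; the exact template is chart-specific):

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mymodule
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: mymodule
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70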

Graceful Scaling

Scale Up

1. New pod starts
2. Starts watching TinyModule CRs
3. Discovers other modules
4. Starts receiving gRPC traffic
5. Processes messages

No special handling is needed; Kubernetes and the SDK take care of it.

Scale Down

1. Pod receives SIGTERM
2. Context is cancelled
3. Pod stops accepting new requests
4. In-flight requests complete
5. Pod terminates
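
If your module wires its own entrypoint, the usual Go pattern (a sketch, not SDK-specific code) is to derive the root context from SIGTERM so cancellation reaches every handler:

go
import (
    "context"
    "os"
    "os/signal"
    "syscall"
)

func main() {
    // Cancelled automatically when Kubernetes sends SIGTERM on scale-down
    ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGTERM, os.Interrupt)
    defer stop()

    runModule(ctx) // hypothetical entrypoint; hand ctx to all handlers
}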

Handle context cancellation:

go
func (c *Component) Handle(ctx context.Context, output module.Handler, port string, msg any) any {
    for _, item := range items {
        select {
        case <-ctx.Done():
            return ctx.Err()  // Graceful shutdown
        default:
            c.process(item)
        }
    }
    return nil
}

Monitoring Scaled Deployments

Metrics per Pod

go
import "go.opentelemetry.io/otel/metric"

var (
    messagesProcessed = metric.NewInt64Counter("messages_processed")
)

func (c *Component) Handle(ctx context.Context, ...) any {
    // Increment counter with pod label
    messagesProcessed.Add(ctx, 1, attribute.String("pod", os.Getenv("HOSTNAME")))
    return nil
}

Leader Status

go
var isLeaderGauge, _ = meter.Int64Gauge("is_leader")

func updateLeaderMetric(isLeader bool) {
    value := int64(0)
    if isLeader {
        value = 1
    }
    // Record 1 while this pod holds leadership, 0 otherwise
    isLeaderGauge.Record(context.Background(), value)
}
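
One natural place to keep the gauge current (assuming leadership is checked per message with utils.IsLeader) is the reconcile branch of Handle:

go
// Keep the leader gauge in sync whenever a reconcile message arrives
if port == v1alpha1.ReconcilePort {
    updateLeaderMetric(utils.IsLeader(ctx))
}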

Common Pitfalls

1. Local Caching

go
// Bad: Cache not shared
type Component struct {
    cache map[string]Result
}

// Good: Use TinyNode metadata or external cache
func (c *Component) getFromMetadata(node v1alpha1.TinyNode, key string) string {
    return node.Status.Metadata[key]
}

2. Assuming Single Instance

go
// Bad: Assumes a single instance
var globalCounter int

// Good: Accept per-pod counting (see Metrics per Pod above) or use
// distributed storage such as TinyNode metadata
func (c *Component) Handle(ctx context.Context, output module.Handler, port string, msg any) any {
    // Either: don't count here, or count per pod and aggregate in the backend
    return nil
}

3. File-Based State

go
// Bad: File not shared between pods
f, _ := os.Create("/data/state.json")

// Good: Use ConfigMap, Secret, or TinyNode metadata
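
For small amounts of shared state, a sketch of the CR-backed alternative (reusing the leader-write pattern from earlier; the "state" metadata key and the c.state field are illustrative):

go
// Leader persists small JSON state in the CR instead of a local file
if utils.IsLeader(ctx) {
    output(ctx, v1alpha1.ReconcilePort, func(n *v1alpha1.TinyNode) {
        b, _ := json.Marshal(c.state)
        n.Status.Metadata["state"] = string(b)
    })
}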

4. In-Memory Timers

go
// Bad: Timer only runs on one pod
time.AfterFunc(5*time.Minute, doWork)

// Good: Use leader-only timer with state in CR
if utils.IsLeader(ctx) {
    go c.runTimerLoop(ctx, output)
}
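
runTimerLoop is not shown above; a minimal sketch that also respects graceful shutdown might look like this:

go
// Hypothetical leader-only loop; exits when the pod's context is cancelled
func (c *Component) runTimerLoop(ctx context.Context, output module.Handler) {
    ticker := time.NewTicker(5 * time.Minute)
    defer ticker.Stop()
    for {
        select {
        case <-ctx.Done():
            return
        case <-ticker.C:
            doWork() // the same work the AfterFunc above would have run
        }
    }
}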

Checklist for Scalable Components

  • [ ] All processing is stateless or uses CR metadata
  • [ ] Leader-only operations check utils.IsLeader(ctx)
  • [ ] Context cancellation is respected
  • [ ] Operations are idempotent
  • [ ] No local file storage for shared state
  • [ ] No global variables for shared data
  • [ ] Graceful shutdown implemented
  • [ ] Metrics include pod identifier

Next Steps

Build flow-based applications on Kubernetes