JS Guide
system-design
intermediate
18 min read

Data Layer Architecture

data-fetching
caching
state-management
react-query
swr
graphql
rest
optimistic-updates
bff

The data layer determines how a frontend application fetches, caches, normalizes, and synchronizes data — separating client state from server state and choosing the right fetching and caching strategies is critical for maintainable, performant applications.

Key Points

1. Client vs Server State

Client state (UI, forms, navigation) is owned by the browser; server state (fetched data) is a cached copy of backend data — mixing them in one store causes stale data bugs and manual cache management.

2. Stale-While-Revalidate

Return cached data instantly for fast UI, then revalidate in the background — TanStack Query, SWR, and RTK Query implement this pattern with configurable stale time and cache invalidation.

3. Fetching Trade-offs

REST is simple and cacheable but over/under-fetches; GraphQL gives precise data but adds complexity; tRPC provides zero-cost type safety but couples client and server to TypeScript.

4. Optimistic Updates

Update the UI immediately before server confirmation for latency-sensitive interactions (likes, bookmarks, reordering) — roll back on failure to maintain consistency.

5. BFF Pattern

A Backend for Frontend aggregates multiple API calls into frontend-optimized endpoints, reducing client waterfall requests — Next.js Server Components naturally serve as a BFF layer.

What You'll Learn

  • Distinguish between client state and server state and select appropriate tools for each
  • Explain stale-while-revalidate caching and configure cache keys, stale time, and invalidation strategies
  • Compare REST, GraphQL, and tRPC for data fetching and articulate when to use each
  • Implement optimistic updates with rollback for latency-sensitive user interactions
  • Design a data normalization strategy and explain trade-offs of normalized vs document-based caching

Deep Dive

The data layer is the bridge between your backend APIs and your UI components. Poor data architecture leads to loading waterfalls, stale data bugs, excessive re-renders, and unmaintainable state management code. A well-designed data layer makes data fetching declarative, caching automatic, and synchronization reliable.

Client State vs Server State

The most important architectural decision is separating client state from server state:

Client state is data owned by the frontend — it exists only in the browser and doesn't need to be fetched:

  • UI state: modal open/closed, sidebar collapsed, active tab
  • Form state: input values, validation errors, dirty/touched flags
  • Navigation state: current route, scroll position, history
  • User preferences: theme, language, layout density

Server state is data owned by the backend — it's fetched over the network and cached locally:

  • User profile, settings, permissions
  • Product listings, search results, feed items
  • Comments, messages, notifications
  • Analytics data, dashboard metrics

Mixing these two types in a single store (a common Redux anti-pattern) causes problems: you end up writing manual loading/error/success states for every API call, cache invalidation becomes your responsibility, and stale data bugs appear everywhere.

Modern best practice: use lightweight state management for client state (useState, useReducer, Zustand, Jotai) and a server state library for server state (TanStack Query, SWR, Apollo Client, RTK Query).

Data Fetching Patterns

REST

The most common pattern. Each endpoint returns a specific resource:

  • GET /api/users/123 — fetch user
  • GET /api/users/123/posts?page=1&limit=20 — fetch user's posts
  • POST /api/posts — create post

Pros: Simple, cacheable (HTTP caching), widely understood. Cons: Over-fetching (getting fields you don't need), under-fetching (needing multiple requests for related data).

GraphQL

A query language where the client specifies exactly what data it needs:

GraphQL
query {
  user(id: "123") {
    name
    avatar
    posts(first: 20) {
      title
      createdAt
    }
  }
}

Pros: No over/under-fetching, single request for related data, strongly typed schema. Cons: Complexity, caching is harder (no URL-based HTTP caching), N+1 query risk on server.

tRPC

End-to-end type safety between a TypeScript backend and frontend — no code generation needed. The server defines procedures, and the client calls them with full TypeScript autocompletion.

Pros: Zero-cost type safety, no schema maintenance. Cons: Requires TypeScript on both ends, tightly couples client and server.

Caching Strategies

Server state libraries implement stale-while-revalidate (SWR) caching:

  1. First request fetches from network, stores in cache, renders
  2. Subsequent requests return cached data immediately (instant UI), then revalidate in the background
  3. If fresh data differs, the UI updates seamlessly

Key caching concepts:

  • Cache keys — Unique identifiers for cached data (usually query name + parameters): ['users', 123], ['posts', { page: 1 }]
  • Stale time — How long cached data is considered fresh (no background refetch)
  • Cache time — How long inactive cached data stays in memory before garbage collection
  • Invalidation — Manually marking cached data as stale after mutations: queryClient.invalidateQueries({ queryKey: ['posts'] })
  • Optimistic updates — Immediately update the cache with expected mutation result, roll back if the server request fails
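
The stale-while-revalidate flow and the staleness check above can be sketched in a few lines of plain JavaScript. This is a minimal illustration, not a real client: libraries like TanStack Query add request deduplication, retries, subscriptions, and garbage collection. The clock is injectable so staleness is easy to reason about.

```javascript
// Minimal stale-while-revalidate cache sketch (illustration only).
// `staleTime`: how long an entry counts as fresh; `now`: injectable clock.
function createSWRCache({ staleTime = 0, now = Date.now } = {}) {
  const cache = new Map(); // key -> { data, updatedAt }
  return {
    get(key, fetcher) {
      const entry = cache.get(key);
      if (!entry) {
        // Cache miss: fetch from the "network", store, and return.
        const data = fetcher();
        cache.set(key, { data, updatedAt: now() });
        return data;
      }
      if (now() - entry.updatedAt >= staleTime) {
        // Stale hit: return the cached value instantly, then revalidate.
        // (Done synchronously here for clarity; real libraries refetch in
        // the background and update the UI when fresh data arrives.)
        const stale = entry.data;
        cache.set(key, { data: fetcher(), updatedAt: now() });
        return stale;
      }
      return entry.data; // Fresh hit: no refetch at all.
    },
    peek(key) { return cache.get(key)?.data; },
  };
}
```

Note that a stale read still returns the old value immediately; only the next read observes the revalidated data, which is exactly the "instant UI, update later" behavior described above.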

Optimistic Updates

Optimistic updates make the UI feel instant by updating local state before the server confirms:

  1. User clicks "Like" → immediately show liked state
  2. Send POST /api/like to server in background
  3. If server succeeds → done (cache already matches)
  4. If server fails → roll back to previous state, show error

This pattern is essential for latency-sensitive interactions (likes, bookmarks, todo completion, drag-and-drop reordering).
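
The four steps above can be sketched framework-agnostically. Here `store` is a hypothetical key-value stand-in for a query cache, and `sendRequest` is the network call; a library like TanStack Query wires the same logic into its `onMutate`/`onError` mutation hooks.

```javascript
// Optimistic update with rollback: write the expected result first,
// confirm with the server, and restore the snapshot on failure.
async function toggleLike(store, postId, sendRequest) {
  const previous = store.get(postId);                         // snapshot for rollback
  store.set(postId, { ...previous, liked: !previous.liked }); // 1. optimistic write
  try {
    await sendRequest(postId);                                // 2. confirm in background
    // 3. success: cache already matches the server, nothing to do
  } catch (err) {
    store.set(postId, previous);                              // 4. roll back on failure
    throw err;                                                //    and surface the error
  }
}
```

Taking the snapshot before the optimistic write is the crucial detail; rolling back to a freshly refetched value instead can clobber other in-flight mutations.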

Pagination

Offset-based: GET /api/posts?page=2&limit=20 — simple but breaks when items are added/removed between pages (skipped or duplicated items).

Cursor-based: GET /api/posts?after=abc123&limit=20 — uses an opaque cursor (usually an encoded ID or timestamp) to fetch the next page. Pagination stays stable regardless of insertions and deletions, which makes it the standard choice for infinite scroll.
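
A toy in-memory implementation shows why cursors stay stable where offsets break. This is a sketch: a real API would translate the cursor into an indexed database query, and here the item `id` doubles as the opaque cursor.

```javascript
// Cursor-based pagination over an in-memory list. The cursor is the id of
// the last item on the previous page; null means "start from the top".
function fetchPage(items, { after = null, limit = 20 } = {}) {
  const start = after === null
    ? 0
    : items.findIndex((item) => item.id === after) + 1;
  const page = items.slice(start, start + limit);
  return {
    items: page,
    // Hand back a cursor only when a full page came back.
    nextCursor: page.length === limit ? page[page.length - 1].id : null,
  };
}
```

Because the cursor anchors to an item rather than to a numeric position, new items prepended between requests do not shift the next page; an offset-based `page=2` request would re-serve or skip items in the same scenario.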

Data Normalization

When the same entity appears in multiple places (a user in a post, in comments, in the sidebar), denormalized data duplicates it everywhere. Normalization stores each entity once by ID:

JavaScript
// Denormalized (duplicated user data)
{ posts: [{ id: 1, author: { id: 5, name: "Alice" } }] }
 
// Normalized (single source of truth)
{ users: { 5: { id: 5, name: "Alice" } }, posts: { 1: { id: 1, authorId: 5 } } }

Apollo Client and Redux with normalizr handle this automatically. TanStack Query uses document-based caching (denormalized) which is simpler but requires manual invalidation of all queries containing a modified entity.
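
A minimal normalizer for the shape shown above might look like this (a hand-rolled sketch; libraries like normalizr generalize the idea with declarative schemas):

```javascript
// Flatten posts that embed author objects into id-keyed entity maps:
// each user is stored once, and posts reference the author by id.
function normalizePosts(posts) {
  const users = {};
  const normalized = {};
  for (const { author, ...post } of posts) {
    users[author.id] = author;                              // deduplicated entity
    normalized[post.id] = { ...post, authorId: author.id }; // reference, not copy
  }
  return { users, posts: normalized };
}
```

With this shape, renaming Alice means updating one record in `users`, and every post referencing `authorId: 5` reflects the change.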

BFF Pattern (Backend for Frontend)

A BFF is a thin server layer that aggregates multiple backend APIs into frontend-optimized endpoints. Instead of the client making 5 separate API calls to assemble a page, the BFF makes those calls server-side and returns a single, pre-shaped response.

Benefits: reduces client-side waterfall requests, hides backend complexity, enables per-platform optimization (mobile BFF returns less data than desktop BFF). Next.js API routes and Server Components naturally serve as a BFF layer.
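
The aggregation a BFF endpoint performs can be sketched as a plain async function. The three `services.*` calls are hypothetical stand-ins for real backend APIs; in Next.js this body would live in a route handler or Server Component.

```javascript
// Fan out to several backend services in parallel, then return one
// pre-shaped payload so the client makes a single request.
async function getDashboard(userId, services) {
  const [user, notifications, feed] = await Promise.all([
    services.fetchUser(userId),
    services.fetchNotifications(userId),
    services.fetchFeed(userId),
  ]);
  return {
    user: { id: user.id, name: user.name }, // strip fields the UI never shows
    unreadCount: notifications.filter((n) => !n.read).length,
    feed: feed.slice(0, 10),                // first page only
  };
}
```

The `Promise.all` fan-out is the key move: the three requests that would have been a sequential client-side waterfall run concurrently over the server's fast internal network.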

Key Interview Distinction

The data layer's primary architectural decision is separating client state (owned by the browser) from server state (cached copy of backend data). Server state libraries (TanStack Query, SWR) handle caching, background revalidation, and synchronization automatically. Choosing between REST, GraphQL, and tRPC depends on your team's type safety needs, data shape flexibility, and backend architecture. Pagination strategy (cursor vs offset) depends on data mutability. The BFF pattern reduces client complexity by moving aggregation to the server.

Fun Fact

The term 'stale-while-revalidate' originated as an HTTP Cache-Control directive defined in RFC 5861 (2010), long before React libraries adopted it. SWR (the library by Vercel) is literally named after this HTTP header directive.

Learn These First

Frontend System Design Fundamentals

beginner

Promises & Async/Await

intermediate

Continue Learning

Real-Time System Architecture

advanced

Rendering Strategy Architecture

intermediate

Practice What You Learned

What is the difference between client state and server state?
junior
data-layer
Client state is data owned by the browser (UI state, form inputs, navigation) that doesn't need to be fetched. Server state is data owned by the backend (user profiles, posts, products) that's fetched over the network and cached locally. Using different tools for each (useState/Zustand for client, TanStack Query/SWR for server) prevents stale data bugs and eliminates manual cache management.
Design an autocomplete/typeahead search component
mid
autocomplete
An autocomplete component requires debounced API calls, client-side caching (LRU), keyboard navigation, ARIA combobox accessibility, and virtualization for long result lists to deliver a responsive, inclusive search experience.
Design a social media news feed with infinite scroll
mid
feed
A social media feed requires cursor-based pagination for data fetching, list virtualization to render only visible items, Intersection Observer for infinite scroll triggers, optimistic updates for interactions (likes/comments), and careful memory management to prevent performance degradation during long scroll sessions.
Design an offline-first web application with sync capabilities
senior
data-layer
An offline-first web application uses Service Workers for asset caching and IndexedDB for local data storage, enabling full functionality without network connectivity. Synchronization is handled through a queue-based system with conflict resolution strategies like last-write-wins or CRDTs, while optimistic UI ensures instant feedback for user actions.