Case Study
Interactive Journey Storytelling
Built an Apple-style scroll-driven cinematic storytelling page using AI-generated frames, canvas compositing, and framer-motion — turning a developer biography into an immersive visual narrative.
Full-Stack Engineer (solo) · 1 day
TL;DR
- The optimized WebP frame images averaged 100-170KB per file, with a quality setting of 85%.
- The AI asset generation pipeline was able to produce visually consistent frames across different scenes, with facial consistency maintained at 100% using three personal reference photos.
- The scroll-driven animation system achieved smooth transitions between frames, using a custom canvas rendering engine with a small number of frames (5-8 per chapter).
Context
Portfolio sites typically present career history as a timeline or bullet list — functional but forgettable. The goal was to create something visitors would actually remember: an Apple-style scroll-driven experience that tells the developer's story through cinematic visuals and carefully timed text, transforming a biography into a narrative journey from India to London.
Problem
Scroll-driven frame animations require solving several interconnected challenges: generating visually consistent AI frames that feature a real person across different scenes, building a canvas rendering engine that transitions smoothly between a small number of frames, and ensuring the scroll mechanics work correctly within an existing layout system that was never designed for full-viewport sticky positioning.
My Role
Solo developer across the entire feature: AI asset generation pipeline, canvas compositing engine, scroll animation system, text overlay choreography, chapter transition effects, SEO integration, accessibility fallback, and visual debugging via browser automation.
Approach
Asset Generation Pipeline
Nano Banana 2 (Google's Gemini 3.1 Flash Image model) generates the frame images via API. Three personal photos from different angles serve as reference inputs so the AI maintains facial consistency across all scenes. Each prompt includes explicit style directives — cinematic lighting, volumetric fog, 1920x1080 widescreen — to keep the visual language consistent across chapters. Generated PNGs are converted to optimised WebP via sharp at 85% quality, averaging 100-170KB per frame.
The pipeline is repeatable: re-running the generation script with new prompts replaces frames without any code changes. The scroll engine is asset-agnostic — it reads frameCount from configuration and loads whatever WebP files exist in the chapter directory.
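As a rough illustration of what "asset-agnostic" can mean in practice, the sketch below derives frame URLs from nothing but a chapter's configuration. The ChapterConfig shape, the /journey/frames root, and the zero-padded 01.webp naming scheme are all assumptions made for the example, not details taken from the project.

```typescript
// Hypothetical chapter configuration; field names are illustrative only.
interface ChapterConfig {
  slug: string;       // directory name under the frames root
  frameCount: number; // number of WebP frames expected in that directory
}

// Build the ordered list of frame URLs for one chapter.
// Assumes files are named 01.webp, 02.webp, ... (an assumed convention).
function frameUrls(chapter: ChapterConfig, root: string = "/journey/frames"): string[] {
  const urls: string[] = [];
  for (let i = 1; i <= chapter.frameCount; i++) {
    const name = (i < 10 ? "0" : "") + i;
    urls.push(root + "/" + chapter.slug + "/" + name + ".webp");
  }
  return urls;
}

// Example: a chapter with five frames.
const indiaFrames = frameUrls({ slug: "india", frameCount: 5 });
```

Because the engine only sees a list of URLs, regenerating the images under the same names changes what is drawn without touching the rendering code.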
Canvas Compositing Engine
With only 5-8 frames per chapter, a naive frame-switching approach produces a slideshow effect. The canvas engine solves this with multi-layer compositing on every scroll tick:
- Crossfade blending — Two frames are drawn simultaneously with opposing alpha values, using smoothstep easing to prevent linear pop
- Ken Burns effect — The outgoing frame slowly zooms and drifts while the incoming frame counter-zooms, creating perceived depth
- Light leak flare — A diagonal warm gradient rendered in screen blend mode peaks mid-transition, simulating film exposure
- Color temperature shift — An overlay tint transitions from warm amber (India chapters) to cool teal (London chapters) as scroll progress increases
- Dynamic vignette — A radial gradient darkens edges, intensifying during transitions for cinematic drama
All five effects are computed per-frame from a single scroll progress value (0-1) using pure canvas 2D operations — no WebGL, no shader dependencies.
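The idea of deriving every effect from one progress value can be sketched as a pure function. This is a minimal illustration, not the project's engine: the BlendState shape, the 0.05 zoom range, and the parabolic flare curve are assumptions chosen for the example.

```typescript
// Classic smoothstep: 0 at t = 0, 1 at t = 1, with zero slope at both ends.
function smoothstep(t: number): number {
  const x = Math.min(1, Math.max(0, t));
  return x * x * (3 - 2 * x);
}

interface BlendState {
  from: number;     // index of the outgoing frame
  to: number;       // index of the incoming frame
  alpha: number;    // eased opacity of the incoming frame
  flare: number;    // light-leak intensity, 0 at the ends, 1 mid-transition
  scaleOut: number; // Ken Burns zoom applied to the outgoing frame
  scaleIn: number;  // counter-zoom applied to the incoming frame
}

const ZOOM_RANGE = 0.05; // assumed value, tune to taste

// Map scroll progress (0-1) to a blend state. Requires frameCount >= 2.
function blendState(progress: number, frameCount: number): BlendState {
  const p = Math.min(1, Math.max(0, progress));
  const position = p * (frameCount - 1);
  // Clamp so that progress = 1 still yields a valid frame pair.
  const from = Math.min(Math.floor(position), frameCount - 2);
  const local = position - from; // 0-1 within the current frame pair
  const alpha = smoothstep(local);
  return {
    from,
    to: from + 1,
    alpha,
    flare: 4 * alpha * (1 - alpha),
    scaleOut: 1 + ZOOM_RANGE * local,
    scaleIn: 1 + ZOOM_RANGE * (1 - local),
  };
}
```

In a canvas 2D renderer, a state like this would typically drive ctx.globalAlpha for the two drawImage calls and a "screen" globalCompositeOperation for the flare gradient; the drawing code itself is omitted here.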
Scroll Architecture
The journey page required opting out of the site's existing page transition system. The root template wraps every page in a motion.div with overflow-y-auto, which creates an intermediate scroll container that breaks both position: sticky and framer-motion's useScroll target tracking. A route-level template override was insufficient because Next.js App Router templates nest rather than replace.
The solution: conditionally remove both overflow classes in the transition component when the pathname is /journey. Removing only overflow-y-auto is insufficient: per the CSS specification, when overflow-x is hidden and overflow-y is left at visible, overflow-y computes to auto, so the wrapper remains a scroll container.
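The conditional class removal might look something like the helper below. It is a sketch under assumptions: the overflow-x-hidden class name is inferred from the description rather than quoted from the codebase, and wrapperClasses and STICKY_ROUTES are hypothetical names. In a Next.js client component the pathname would normally come from usePathname in next/navigation.

```typescript
// Classes that turn the wrapper into a scroll container.
// "overflow-x-hidden" is an assumed class name.
const SCROLL_CLASSES: string[] = ["overflow-y-auto", "overflow-x-hidden"];

// Routes that need the viewport to be the scroll container (hypothetical list).
const STICKY_ROUTES: string[] = ["/journey"];

// Return the wrapper's class string, dropping both overflow classes
// on routes that rely on position: sticky and viewport scroll tracking.
function wrapperClasses(pathname: string, base: string[]): string {
  const optOut = STICKY_ROUTES.some(
    (route) => pathname === route || pathname.indexOf(route + "/") === 0
  );
  const classes = optOut
    ? base.filter((c) => SCROLL_CLASSES.indexOf(c) === -1)
    : base;
  return classes.join(" ");
}
```

Both classes are filtered together on purpose: leaving either one in place keeps the wrapper a scroll container.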
Text Choreography
27 story beats use 8 cycling animation patterns — each overlay gets a different combination of entrance direction (rise, slide left/right, drop, zoom), scale curve, and rotation based on its array index. Serif typography (Playfair Display) marks hero moments; italic sans (Roboto) marks reflective beats. Positions alternate between center, bottom-left, bottom-right, and top-center to keep the eye moving.
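Index-based cycling of this kind can be expressed with two modulo lookups. The sketch below is illustrative only: the eight pattern values and the Pattern shape are invented for the example, while the entrance names and the four positions follow the description above.

```typescript
type Entrance = "rise" | "slide-left" | "slide-right" | "drop" | "zoom";

interface Pattern {
  entrance: Entrance;
  scaleFrom: number; // starting scale before easing to 1 (assumed field)
  rotateDeg: number; // starting rotation in degrees (assumed field)
}

// Eight illustrative patterns; the actual values are assumptions.
const PATTERNS: Pattern[] = [
  { entrance: "rise", scaleFrom: 0.96, rotateDeg: 0 },
  { entrance: "slide-left", scaleFrom: 1, rotateDeg: -1 },
  { entrance: "zoom", scaleFrom: 0.9, rotateDeg: 0 },
  { entrance: "slide-right", scaleFrom: 1, rotateDeg: 1 },
  { entrance: "drop", scaleFrom: 1.04, rotateDeg: 0 },
  { entrance: "rise", scaleFrom: 1, rotateDeg: 1 },
  { entrance: "zoom", scaleFrom: 1.1, rotateDeg: -1 },
  { entrance: "drop", scaleFrom: 0.96, rotateDeg: 0 },
];

const POSITIONS: string[] = ["center", "bottom-left", "bottom-right", "top-center"];

// Pick a pattern and a position for the overlay at the given array index.
function choreography(index: number): { pattern: Pattern; position: string } {
  return {
    pattern: PATTERNS[index % PATTERNS.length],
    position: POSITIONS[index % POSITIONS.length],
  };
}
```

With 27 beats and 8 patterns, each pattern is used three or four times, and adjacent beats always differ in both pattern and position.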
Accessibility
When prefers-reduced-motion is active, the entire scroll-driven system is replaced with a static layout: each chapter renders its first frame as a background image with all overlay text visible simultaneously. No canvas, no scroll tracking, no animation — just content.
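One way to structure that switch is a small decision function that takes a matchMedia-like callback, which keeps it testable outside a browser. The renderMode name, the two mode labels, and the choice to fall back to the static layout when no matchMedia is available (for example during server rendering) are assumptions for this sketch.

```typescript
// Minimal structural types so the function does not depend on DOM typings.
type MediaQueryLike = { matches: boolean };
type MatchMediaFn = (query: string) => MediaQueryLike;

type RenderMode = "static" | "cinematic";

// Decide which layout to render.
// Assumption: with no matchMedia available, prefer the static layout.
function renderMode(matchMedia?: MatchMediaFn): RenderMode {
  if (!matchMedia) return "static";
  return matchMedia("(prefers-reduced-motion: reduce)").matches
    ? "static"
    : "cinematic";
}
```

In the browser this would be called with a bound window.matchMedia; framer-motion's useReducedMotion hook is an alternative source for the same preference.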
Results
The page delivers a cinematic scroll experience at under 2MB total asset weight (18 WebP frames). Canvas compositing runs at 60fps on modern hardware with no jank — all effects are computed from a single motion value with zero DOM mutations during scroll. The AI generation pipeline cost under $1 total for all frames, and the asset-agnostic architecture means visual refresh requires zero code changes.
Learnings
AI image generation with personal photo references produces inconsistent results without explicit negative constraints. Specifying "NOT smiling" or "short neat hair, NOT windswept" is as important as describing what you want. Group scenes require a separate reference photo showing different people together — otherwise the model clones the reference face onto every character.
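The two CSS overflow axes are coupled: if one axis is set to a scrollable value such as hidden, scroll, or auto, a visible value on the other axis computes to auto. overflow-x: hidden therefore silently turns overflow-y into auto, which makes the element a scroll container, and position: sticky descendants then stick relative to that container instead of the viewport. This is per-spec behaviour but rarely documented in the context of sticky positioning.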
Canvas crossfade with 5 frames looks better than hard-switching with 20 frames. The compositing approach (blending, zooming, and tinting between a small number of high-quality images) creates perceived smoothness that frame count alone cannot achieve. Fewer, better frames with richer transitions beat more frames with simple cuts.
