Fit Finder: Redesigning trust, not just size

I led the product design evolution of Fit Finder across two companies: first at Fit Analytics, then at Snap after it acquired the company in 2021. The redesign shipped in 2023 to clients including Zara, Hugo Boss, and ASOS, serving millions of shoppers monthly.

Conversion rate uplift via A/B test

+14%

Return rate reduction

–12%

User satisfaction post-launch

+22%

My role

Lead Product Designer

Timeline

Five months (2022–2023)

Team

PM, User Researcher, Data Scientists, Engineers

Platform

Web & Mobile (white-label SDK)

Context

A product I knew from the inside

Fit Finder is a white-label size recommendation tool embedded directly into retailer product pages. When a shopper clicks "Find my size," Fit Finder walks them through a short questionnaire about their body measurements and fit preferences, then surfaces a personalised size recommendation. The goal: reduce the uncertainty that drives returns, and give shoppers the confidence to buy.

I joined Fit Analytics in 2019 as a senior designer, working on the product long before Snap acquired the company in 2021. After the acquisition, Fit Finder became part of Snap's ARES Shopping Suite, and I continued leading design as we planned a major version update. That continuity mattered. I wasn't parachuted in to redesign something I'd never used. I knew what the data said, where users dropped off, and which parts of the product engineering had quietly flagged as technical debt.

The V5.0 redesign was the most significant project of my four-year tenure on the product. Footwear sizing and several earlier iteration cycles aren't documented here, but the upper and lower body redesign is where the outcomes were most measurable.

Fit Finder (V4.2), Upper body / Woman — October 2022

Problem

What was actually breaking

V4.2 had a measurable drop-off problem. Analytics showed significant abandonment at the brand-selection screen, a step that asked users to select their preferred brand after entering body measurements. The intent was to calibrate recommendations to brand-specific sizing, but the data science team had been questioning its value for months.

Beyond that, usability testing pointed to a broader trust gap. Users were completing the flow but not acting on the recommendation. They'd see their size and then still check the size chart manually, or abandon the purchase. The interface felt bureaucratic: too many screens and an inconsistent visual language across clients.

The core problem statement

How do we make users trust Fit Finder's recommendation enough to act on it, and can we get there with fewer steps?

I know I'm usually a medium, but this thing told me large. I didn't really trust it, so I just guessed anyway.

— Usability test participant, 2022

Research & Analysis

What the research actually told us

Two weeks of research before touching a single design file. Working alongside a dedicated User Researcher and the Data Science team, we ran interviews, usability testing, empathy mapping, journey mapping, analytics review, card sorting, and a similarity matrix. The goal was to build enough evidence to make decisions we could defend.

Who we were designing for

We started by defining three user archetypes from interview synthesis, then used them throughout the project to pressure-test design decisions. They weren't fictional personas. They were distilled from real patterns we heard across sessions.

The Confident Buyer

Knows her size, wants confirmation

Mia knows she's a medium. She uses Fit Finder to validate, not discover. If the recommendation contradicts her expectation without explanation, she ignores it and checks the size chart anyway. The trust gap on the recommendation reveal screen was her problem.

The Uncertain Measurer

Wants help, gets stuck on inputs

Omar knows his height and weight, but the unit toggle between metric and imperial gives him pause mid-flow. It's a small friction point, but for a user already uncertain about trusting the recommendation, doubt at the input stage compounds into doubt at the result.

The Brand-Loyal Returner

Shops by brand, not by size

Amaka shops by brand and thinks in brand-specific sizes. The brand-selection screen felt designed for her — but analytics showed she still dropped off there, because her preferred brands were often unlisted. She bore the cost of a screen that had no accuracy benefit.

Three archetypes built from interview synthesis — each one maps to a specific failure point in the V4.2 flow

Where the journey broke down

Journey mapping the V4.2 flow made the problem visible in a way that analytics alone couldn't. We mapped the emotional arc across every screen, from the moment a shopper clicked "Find my size" through to the recommendation reveal, and marked where confidence dropped, where confusion spiked, and where users abandoned the flow entirely.

Two moments stood out. The brand selection screen caused a sharp confidence dip for users whose brand wasn't listed, and the recommendation reveal triggered disbelief for users whose result didn't match their expectation. Neither moment had any recovery mechanism in V4.2.

The user journey map surfaced two critical failure points: the brand screen and the recommendation reveal. Neither had a recovery path in V4.2

Usability test findings mapped to each screen in the V4.2 flow

What the usability tests found, screen by screen

We ran moderated usability sessions with 12 participants across the full V4.2 flow, conducted remotely via Google Meet. The findings were specific enough to drive direct design decisions: not just general friction, but screen-by-screen evidence of where and why users lost confidence.

What empathy mapping revealed about trust

Empathy mapping sessions after the usability tests and interviews helped us understand what users were thinking and feeling at the moments that mattered most. The pattern was consistent: when a recommendation contradicted a user's expectation, they immediately looked for a reason and found nothing.

The product gave users a result but no reasoning. No context about why that size was recommended, no indication of how confident the model was, and no acknowledgement that their usual size might differ from what Fit Finder was suggesting. Without that, the recommendation felt arbitrary, and arbitrary doesn't convert.

Empathy mapping revealed that the trust gap wasn't about accuracy. Users didn't distrust Fit Finder because it was wrong; they distrusted it because it didn't explain itself

Analytics review of the V4.2 flow: the steepest single drop-off in the flow occurred at Brand selection, with completion continuing to fall across every subsequent brand screen

What analytics and the similarity matrix confirmed

Analytics review revealed a clear pattern: users who made it through measurements, body shape, and personal data questions were abandoning the flow at a disproportionate rate once they hit the brand screens. The single steepest drop in the entire upper body funnel was from Age at 53% to Brand selection at 36%, a 17-point fall in one step. The brand screens then continued to erode completion across three consecutive steps, from 36% down to 20%, before only 14% of users ever reached a result.
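The funnel arithmetic above can be sketched in a few lines. This is only an illustration: it uses the four completion percentages quoted in the paragraph, and the step labels other than "Age" and "Brand selection" are placeholders, since the intermediate brand screens are not broken out individually here.

```python
# Completion percentages quoted in the analytics review.
# "Last brand screen" and "Result" are placeholder labels (assumption).
funnel = [
    ("Age", 53),
    ("Brand selection", 36),
    ("Last brand screen", 20),  # spans several brand steps, not one
    ("Result", 14),
]


def step_drops(steps):
    """Return (from, to, point_drop, relative_drop) for each consecutive pair."""
    drops = []
    for (prev_name, prev_pct), (name, pct) in zip(steps, steps[1:]):
        point_drop = prev_pct - pct            # drop in percentage points
        relative_drop = point_drop / prev_pct  # share of remaining users lost
        drops.append((prev_name, name, point_drop, relative_drop))
    return drops


drops = step_drops(funnel)
for prev_name, name, points, rel in drops:
    print(f"{prev_name} -> {name}: -{points} points ({rel:.0%} of remaining users)")

# The largest point drop between two listed stages.
steepest = max(drops, key=lambda d: d[2])
```

Looking at the relative drop as well as the point drop is useful because later steps start from a smaller base: a 6-point fall from 20% loses a larger share of the remaining users than the raw number suggests.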

Card sorting results: users consistently grouped height, weight, age, and bra size as "personal information", with brand preference treated as a separate category entirely

Card sorting provided the mental model evidence to support removing the brand screens. Shoppers consistently grouped measurements, weight, age, and bra size together as "personal details", and placed brand selection in a separate category entirely. We were forcing two distinct mental models into one flow.

The similarity matrix made that pattern quantitative. With 10 respondents, the blue clusters show near-perfect agreement: height, weight, chest shape, belly shape, hip shape, and bra size were grouped together by almost everyone. Brand information scored zero across those same groupings. The data wasn't ambiguous: these were two completely separate mental models, and the V4.2 flow had collapsed them into one.

Similarity matrix from the card sorting exercise — blue clusters show how users grouped personal measurements together and separated brand information into a distinct category
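For readers unfamiliar with how a card-sort similarity matrix is derived, here is a minimal sketch. The card names echo the ones in this case study, but the three participants' groupings are invented for illustration and are not the study data. Each cell is simply the share of participants who placed two cards in the same group.

```python
from itertools import combinations

# Hypothetical card-sort results: one list of groups per participant.
# These groupings are made up for illustration, not the real study data.
sorts = [
    [{"height", "weight", "bra size"}, {"brand"}],
    [{"height", "weight", "bra size"}, {"brand"}],
    [{"height", "weight"}, {"bra size"}, {"brand"}],
]


def similarity_matrix(sorts):
    """Share of participants who placed each pair of cards in the same group."""
    cards = sorted(set().union(*(group for s in sorts for group in s)))
    counts = {pair: 0 for pair in combinations(cards, 2)}
    for participant in sorts:
        for group in participant:
            # Sorting keeps pair keys in the same order as `counts`.
            for pair in combinations(sorted(group), 2):
                counts[pair] += 1
    return {pair: n / len(sorts) for pair, n in counts.items()}


matrix = similarity_matrix(sorts)
for (a, b), score in matrix.items():
    print(f"{a} / {b}: {score:.2f}")
```

A score of 1.0 means every participant grouped the two cards together; a score of 0 means nobody did, which is the pattern described for brand information against the body-measurement cards.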

01 Brand data: no accuracy impact

The Data Science team's analysis indicated brand selection had no statistically significant effect on recommendation quality. The screen was legacy friction with no model benefit.

02 Two mental models, one broken flow

Card sorting showed users clearly separate "about me" data (measurements, age, bra size) from brand preference. V4.2 mixed both into a single linear flow, which was a structural mismatch.

03 40% couldn't find their brand

Usability data showed 40% of testers couldn't locate their brand in the selection screen. Either it wasn't listed, or they couldn't find it in "More Brands." Drop-off followed immediately.

04 30% didn't trust the final recommendation

Post-flow interviews confirmed 30% of testers didn't act on the size recommendation. They second-guessed it and checked the size chart manually. The problem wasn't the model; it was the interface.

Key design decision

We removed the brand selection screens entirely. This was the most significant structural change in the redesign, informed by both the card sort data and the Data Science team's analysis. Fewer screens, cleaner mental model, no accuracy trade-off.

Design

From sketches to a system

Six weeks of design work, moving from Crazy 8s through sketches, wireframes, and high-fidelity screens. The process ran alongside an accessibility audit and regular design critiques. I also involved junior designers throughout as a way to both pressure-test decisions and develop the team's skills in parallel.

1. Accessibility audit

Evaluated V4.2 for heading hierarchy, focus order, screen reader compatibility, and WCAG colour contrast. Identified accessibility issues across the existing flow that needed resolution before any visual redesign.

2. Crazy 8s & sketching

Fast ideation on alternative approaches to the measurement input, recommendation reveal, and trust-building moments. Eight concepts in eight minutes, then down-selection based on feasibility and user insight alignment.

3. Wireframes & flow validation

Mapped the simplified flow without brand screens. Validated with the PM and engineering lead before moving to high fidelity. No wasted polish on concepts that wouldn't ship.

4. High-fidelity design & ARES Design System

Designed within, and contributed to, the ARES Shopping Suite design system. Built reusable components for measurement inputs, recommendation cards, and progress indicators that could scale across Snap's AR tools.

5. Prototype & usability testing

Interactive prototype tested with users before handoff. Iterated on the recommendation reveal screen twice based on feedback. The first version still felt flat; the second added contextual explanation of why the recommendation was made.
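The colour-contrast part of the accessibility audit in step 1 can be made concrete with the WCAG 2.x contrast-ratio formula. The sketch below is a generic implementation of that published formula, not the tooling used on the project, and the example colours are arbitrary rather than taken from the Fit Finder palette.

```python
def _linear(channel):
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_colour):
    """Relative luminance of a '#rrggbb' colour."""
    hex_colour = hex_colour.lstrip("#")
    r, g, b = (int(hex_colour[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(fg, bg):
    """Contrast ratio between two colours, from 1.0 up to 21.0."""
    l1, l2 = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(fg, bg, large_text=False):
    """WCAG AA: at least 4.5:1 for normal text, 3:1 for large text."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)


# Arbitrary example colours (not from the product).
print(round(contrast_ratio("#767676", "#ffffff"), 2))
print(passes_aa("#777777", "#ffffff"))
```

Two greys that look almost identical can land on opposite sides of the 4.5:1 threshold, which is why a check like this is worth running on every text and background pairing rather than judging by eye.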

From research to decisions — annotated V5.0 screens

V5.0 desktop and mobile flows

The V5.0 recommendation reveal screen addressed the trust gap directly with three additions: a confidence bar, a plain-language explanation grounded in purchase behaviour at scale, and a sizing context note for cases where the result differed from a user's usual size.

Every change in V5.0 was driven by a specific finding from usability testing: brand screens removed, radio buttons replaced with buttons, arrows and ellipses stripped from body shape screens, and help text added to the bra size screen to reduce discomfort

Design system contribution

Fit Finder V5.0 was designed within the ARES Shopping Suite design system, a shared component library built to unify Snap's AR tools, including virtual try-on and interactive product displays. I created the sizing-specific components (measurement inputs, recommendation cards, fit preference selectors) so they could be reused across future ARES products without redesign.

Fit Finder components built into the ARES Design System for reuse across Snap's Shopping Suite

Results

Three months, three metrics, one decision

V5.0 launched in Q1 2023. Over a three-month A/B test across 20 retail partners, including global clients in apparel, footwear, and lifestyle, the PM and Data Science team tracked conversion rate, return rate, and user satisfaction against predefined success criteria.

Before the test began, the team aligned on three conditions for a full rollout decision: statistical significance at 95% confidence on conversion rate; no regression on return rate; and a neutral or positive shift in user satisfaction. All three were met. Return rate and satisfaction both exceeded the minimum bar rather than just clearing it.
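As a rough sketch of how criteria like these can be checked, the snippet below runs a two-proportion z-test on conversion and then applies the three rollout conditions. Everything specific here is an assumption: the write-up does not say which statistical test the Data Science team used, the visitor and order counts are invented, and the example treats the uplift as a relative change in conversion rate.

```python
from math import erf, sqrt


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for a difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - normal_cdf)


def relative_uplift(conv_a, n_a, conv_b, n_b):
    """Relative change in conversion rate of variant B over control A."""
    return (conv_b / n_b) / (conv_a / n_a) - 1


def rollout_ready(z, p_value, return_rate_change, satisfaction_change, alpha=0.05):
    """The three conditions: significant conversion uplift at 95% confidence,
    no return-rate regression, neutral or positive satisfaction shift."""
    return (
        z > 0
        and p_value < alpha
        and return_rate_change <= 0
        and satisfaction_change >= 0
    )


# Invented counts (control = V4.2, variant = V5.0), chosen to give a 14% uplift.
z, p_value = two_proportion_z_test(1000, 20000, 1140, 20000)
uplift = relative_uplift(1000, 20000, 1140, 20000)
ready = rollout_ready(z, p_value, return_rate_change=-0.12, satisfaction_change=0.22)
print(f"uplift={uplift:.1%}, z={z:.2f}, p={p_value:.4f}, rollout={ready}")
```

Agreeing on the conditions before the test starts, as the team did, is what keeps a function like `rollout_ready` honest: the thresholds are fixed in advance rather than fitted to the result.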

No single metric told the full story; the three results only made sense together.

Conversion rate uplift via A/B test

+14%

Return rate reduction

–12%

User satisfaction score

+22%

A conversion uplift alone can mean users are being persuaded to buy things that don't fit. A return rate reduction alone could simply mean fewer people are buying. A satisfaction score increase alone could reflect a smoother flow that users enjoyed but didn't trust enough to act on. It was the convergence of all three (more purchases, fewer returns, and users who felt confident throughout) that confirmed the redesign had worked at every level: the flow was easier, the recommendation was trusted, and that trust was warranted.

Removing the brand screens was the single most impactful change, both in drop-off reduction and in its downstream effect on recommendation trust. Shorter flows with less friction correlated directly with higher satisfaction scores. The improved recommendation reveal, which now included a brief contextual explanation alongside a sizing context note for brands that run large or small, was flagged in qualitative feedback as a meaningful trust signal. Two design decisions, each grounded in research, each pulling in the same direction.

Metric	Fit Analytics client range	V5.0 A/B test result
Conversion rate	+1% to +22% (avg. ~+8%)	+14%
Return rate reduction	–2% to –20% (avg. ~–10%)	–12%
User satisfaction	No published benchmark	+22% vs V4.2 baseline

With all three success criteria met, statistical significance confirmed, and results that outperformed the broader Fit Analytics client benchmark, the PM and engineering teams moved to full rollout across all retail partners. V5.0 became the new baseline for all Fit Finder deployments going forward. The decision to roll out fully was straightforward.

Reflection

With more time

Starting with the signal, not the screen

The brand screen removal was the right call, but I pushed for it late in the process. If I'd run the data review with the Data Science team in week one rather than week three, we'd have had more time to test what the opening experience should actually be. Structural decisions like that need to happen before anything goes into Figma.

Sizing the footwear problem earlier

Footwear sizing came up repeatedly in the data but we treated it as a phase two problem. It wasn't. Users who couldn't get a shoe recommendation had already decided the tool didn't work for them before they ever tried it on apparel. I'd have scoped at least a lightweight footwear model into the brief from the start.

Accessibility as a design input, not a check

The accessibility audit happened at the end. It should have happened at the wireframe stage. Several contrast and touch target decisions had to be revisited late, which cost time and introduced small inconsistencies in the final system. Running it in parallel with design would have been cleaner.

Quantifying the return rate link

We knew returns were the business problem. We designed to address it. But we never agreed on how we'd measure whether our specific changes contributed to the reduction versus other variables. I'd want a clearer attribution model in place before the A/B test goes live, so the results are actually defensible.

Design system contribution scope

Contributing to ARES while simultaneously shipping Fit Finder meant some components were built for the project first and generalised after. A few of them weren't quite right for other contexts and needed rework. A clearer split between project-specific and system-level work from the start would have saved that.

Fit Finder (V4.2), Upper body / Woman — October 2022

Role and collaborators

My role and the wider team

I was Lead Product Designer for the full project. That meant research, accessibility audit, high-fidelity design, design system contribution, prototyping, and developer handoff. The decision to remove the brand screens was mine to push for, and I worked directly with the PM and Data Science team to validate it against the recommendation model before it touched a design file. I also ran weekly crits with the junior designers, using the project as a way to give them structured feedback on real work.

Engineering Lead + Developers

Owned implementation, managed the developer handoff process, and ran QA across web and mobile.

Data Scientists

Ran analytics review, validated the recommendation model changes, and led A/B test analysis.

User Researcher

Facilitated the interview programme, ran empathy mapping sessions, and moderated usability testing.

Product Manager

Scoped the project, managed stakeholder alignment across Snap and retail clients, and oversaw the A/B test.
