Measuring What Matters: A 5-Step Checklist for Setting Up AOIs with Eye Tracking AI
Two teams can analyze the same stimulus with eye tracking and reach opposite conclusions. The problem isn’t usually the data. It’s that their Areas of Interest (AOIs) were drawn differently.
AOIs aren’t just boxes on a screen; they are your measurement instrument. Get them wrong, and every metric that follows—like dwell time, first fixation, and attention share—will reflect your setup choices more than your participants’ actual behavior.
This guide covers five key decisions that determine whether your findings will hold up: defining what you’re measuring, sizing for real-world accuracy, handling tricky boundaries, choosing the right setup method, and validating your work. You don’t need a PhD in eye tracking to get this right.
Step 1: Define Your Measurement Intent
Every AOI you create is a stand-in for something your hypothesis cares about. Before you draw a single box, you need to be clear on what “attention” means for this specific study.
Three Questions to Ask Yourself Upfront:
- What’s the decision? The question “Did they notice the CTA?” requires a different AOI than “Did they read the pricing section?”, even on the same page.
- Whole object or a feature? An AOI around a whole hero image tells you something different from an AOI around the tiny logo inside it. Know which level of detail your hypothesis actually needs.
- Static or moving stimulus? Answering this now determines your tools and effort level, saving you from having to redo work later.
Avoid the Post-Hoc Temptation
One pattern that quietly ruins good data is defining AOIs after seeing the results. When you see where fixations clustered, it’s tempting to draw your boxes to confirm the pattern. This is like tuning your ruler after you’ve already measured something. Treat your AOI design as a plan you commit to beforehand, not a labeling exercise you do after the fact.
Pro Tip: For work you’re sharing with stakeholders, especially in creative and UX tests, pairing eye tracking outputs like gaze plots with engagement signals from facial coding can make your interpretations much stronger. Attention to a region is more convincing when you can also show what participants were feeling in that moment.
Step 2: Scale for Real-World Accuracy
No eye tracker is pixel-perfect in the real world. Webcam-based eye tracking, a popular choice for remote research, can be off by a degree or more of visual angle. At typical screen distances, that error translates to dozens of pixels on screen.
If your AOIs don’t account for this, a gaze that lands near a button gets logged as a miss. Your metric is now measuring tracking error, not user behavior.
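To see what a tracker’s stated accuracy means on your own screen, you can convert visual angle to pixels. The sketch below uses the standard geometry (offset = distance × tan(angle)) and assumes the participant looks at the screen head-on. The function name, the 60 cm viewing distance, and the 96 DPI display are illustrative assumptions, not values from any specific tracker.

```python
import math


def degrees_to_pixels(angle_deg, viewing_distance_cm, pixels_per_cm):
    """Convert an angular gaze error into an on-screen offset in pixels.

    Assumes head-on viewing, so offset = distance * tan(angle).
    """
    offset_cm = viewing_distance_cm * math.tan(math.radians(angle_deg))
    return offset_cm * pixels_per_cm


# Hypothetical setup: 60 cm from a 96 DPI monitor (96 pixels per 2.54 cm).
PIXELS_PER_CM = 96 / 2.54

for accuracy_deg in (0.5, 1.5, 2.0):
    offset_px = degrees_to_pixels(accuracy_deg, 60, PIXELS_PER_CM)
    print(f"{accuracy_deg} deg of error is about {offset_px:.0f} px")
```

Rerun the conversion with your own viewing distance and pixel density before choosing a margin, since both change the result.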
Practical Sizing Principles
- Build generous margins: It’s better to think in terms of visual angle, not exact pixels. A tiny AOI that perfectly hugs a small UI element will consistently undercount fixations.
- Account for hardware limits: Webcam setups require more generous sizing and more conservative claims. Don’t report a level of precision your tools can’t deliver.
- Consider peripheral vision: Something can be “seen” even without a direct fixation on it. Your margins can reflect this reality, not just your aesthetic preference for neat boxes.
The 10% Rule: If shrinking an AOI by 10% completely flips your main finding, your measurement was probably too fragile from the start.
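One way to apply this rule is to rescale an AOI around its center and recount how many fixations land inside. This is a minimal sketch, assuming rectangular AOIs stored as (x, y, width, height) and fixations stored as (x, y) points; the function names and sample data are hypothetical.

```python
def scale_rect(rect, factor):
    """Scale an (x, y, width, height) rectangle about its center."""
    x, y, w, h = rect
    new_w, new_h = w * factor, h * factor
    return (x + (w - new_w) / 2, y + (h - new_h) / 2, new_w, new_h)


def contains(rect, point):
    """Return True if the point lies inside or on the edge of the rectangle."""
    x, y, w, h = rect
    px, py = point
    return x <= px <= x + w and y <= py <= y + h


def hit_share(rect, fixations):
    """Share of fixations that land inside the rectangle."""
    if not fixations:
        return 0.0
    return sum(contains(rect, f) for f in fixations) / len(fixations)


def sensitivity(rect, fixations, factors=(0.9, 1.0, 1.1)):
    """Hit share for each scaling factor applied to the AOI."""
    return {f: hit_share(scale_rect(rect, f), fixations) for f in factors}


# Hypothetical AOI and fixations; two fixations sit close to the AOI edge.
cta_button = (100, 100, 200, 100)
fixations = [(150, 150), (295, 150), (400, 400), (105, 195)]
result = sensitivity(cta_button, fixations)
print(result)
```

In this made-up data, shrinking the AOI by 10% drops the hit share from 0.75 to 0.25, which is the kind of fragility the rule is meant to expose.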
Quick Sizing Sanity Checklist
Before you commit to a full layout, run through these questions:
- [ ] What is the expected accuracy range for my tracker and setup?
- [ ] Are margins required? If so, how wide should they be?
- [ ] How will I handle near-misses? (Will you ignore them or categorize them as “ambiguous”?) Document this policy now, not during analysis.
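A near-miss policy is easier to apply consistently when it is written as a rule rather than decided case by case. The sketch below labels each gaze point as a hit, an ambiguous near-miss, or a miss. It assumes a rectangular AOI as (x, y, width, height) and a margin in pixels; the labels and the 40-pixel margin are illustrative choices, not a standard.

```python
def classify_gaze(point, rect, margin_px):
    """Label a gaze point relative to an AOI and its surrounding margin."""
    x, y, w, h = rect
    px, py = point
    if x <= px <= x + w and y <= py <= y + h:
        return "hit"
    inside_margin = (
        x - margin_px <= px <= x + w + margin_px
        and y - margin_px <= py <= y + h + margin_px
    )
    return "ambiguous" if inside_margin else "miss"


# Hypothetical AOI around a headline, with a 40 px margin.
headline = (100, 100, 200, 50)
for gaze in [(150, 120), (320, 120), (400, 120)]:
    print(gaze, classify_gaze(gaze, headline, margin_px=40))
```

Whether "ambiguous" points are later dropped or counted is a separate decision; the point is to record them under one explicit rule.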
Step 3: Design Layout and Boundary Rules
Unclear boundaries lead to metrics you can’t defend. When a gaze point lands on the edge of two AOIs, or inside a region covered by three overlapping shapes, which one gets the credit? If you don’t set a rule, the answer is “whatever the software does by default,” and those defaults vary.
Common Failure Modes
- Overlapping AOIs that double-count fixations or inflate dwell time.
- Tightly packed AOIs where a small tracker wobble flips a gaze point from one label to another.
- No plan for what to do with gaze that falls outside of all AOIs.
Decision Rules That Help
- Use empty space as a buffer: If you can, leave padding between AOIs. Not every single pixel on the screen needs to belong to a region.
- Establish a priority hierarchy: When overlap is unavoidable (like a logo inside a banner), define a priority order or a “nearest-center” rule. Apply it consistently for all participants.
- Define “the wild”: Decide upfront what gaze “outside all AOIs” means. You can ignore it, tag it as an “other” region, or analyze it separately. All are valid choices; having no choice is not.
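The rules above can be made explicit in a few lines, so that every gaze point is assigned to exactly one label. This is a sketch under stated assumptions: rectangular AOIs, a numeric priority where the higher value wins on overlap, and an "other" label for gaze outside all AOIs. The class, field, and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Aoi:
    name: str
    x: float
    y: float
    width: float
    height: float
    priority: int  # higher value wins when AOIs overlap

    def contains(self, px, py):
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


def assign_aoi(px, py, aois, outside_label="other"):
    """Return the name of the single AOI credited with this gaze point."""
    candidates = [a for a in aois if a.contains(px, py)]
    if not candidates:
        return outside_label
    return max(candidates, key=lambda a: a.priority).name


# Hypothetical layout: a logo nested inside a banner.
aois = [
    Aoi("banner", 0, 0, 800, 200, priority=1),
    Aoi("logo", 20, 20, 100, 60, priority=2),
]
print(assign_aoi(50, 50, aois))    # inside both, logo has priority
print(assign_aoi(400, 100, aois))  # inside banner only
print(assign_aoi(400, 500, aois))  # outside all AOIs
```

Because each point gets one label, overlapping regions can no longer double-count fixations or inflate dwell time.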
Note on Dense Interfaces: For navigation bars, checkout flows, or busy creative, creating component-level AOIs often makes more analytical sense than drawing broad page regions.
Step 4: Pick the Right Setup Method
The right method depends on one thing: what moves.
Decision Logic
- If the participant moves in the world (common in wearable eye tracking), you’ll likely need gaze mapping. This transformation maps gaze from the moving scene-camera view onto a fixed reference, such as an image of the object, so that AOIs can be applied.
- If an object on the screen moves (like in video ads or animated UI), you need dynamic AOIs. This involves manual keyframes, interpolation, or automated object tracking. Static AOIs simply won’t work.
- If the object is flat and static (a webpage or print ad), manual or point-based AOIs are usually sufficient and much faster to set up.
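For the dynamic case, the simplest form of keyframing is linear interpolation: you mark the AOI at a few timestamps and compute its position in between. The sketch below assumes rectangular AOIs as (x, y, width, height) and keyframes sorted by time; the function name and sample keyframes are hypothetical, and real tools may use other interpolation schemes.

```python
from bisect import bisect_right


def interpolate_aoi(keyframes, t):
    """Linearly interpolate an AOI rectangle at time t.

    keyframes: list of (time_s, (x, y, width, height)), sorted by time.
    Times before the first or after the last keyframe are clamped.
    """
    times = [time for time, _ in keyframes]
    if t <= times[0]:
        return keyframes[0][1]
    if t >= times[-1]:
        return keyframes[-1][1]
    i = bisect_right(times, t)  # index of the first keyframe after t
    (t0, r0), (t1, r1) = keyframes[i - 1], keyframes[i]
    alpha = (t - t0) / (t1 - t0)
    return tuple(a + alpha * (b - a) for a, b in zip(r0, r1))


# Hypothetical product shot sliding right over two seconds of video.
keyframes = [(0.0, (100, 100, 50, 50)), (2.0, (300, 100, 50, 50))]
print(interpolate_aoi(keyframes, 1.0))
```

Linear interpolation assumes the object moves at constant speed between keyframes, so fast or curved motion needs more keyframes or a visual check of the in-between frames.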
Point-Based AOIs: Defined by a center coordinate and a shape like a circle or rectangle, point-based AOIs are a great alternative to hand-drawn polygons. If someone else needs to replicate your study, giving them coordinates is much easier than telling them, “I drew it approximately here.”
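A point-based AOI can be written down as a few numbers, which is what makes it easy to share. Here is a minimal sketch, assuming screen coordinates in pixels; the class names and the example coordinates are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class CircleAoi:
    name: str
    cx: float
    cy: float
    radius: float

    def contains(self, px, py):
        return math.hypot(px - self.cx, py - self.cy) <= self.radius


@dataclass(frozen=True)
class RectAoi:
    name: str
    cx: float
    cy: float
    width: float
    height: float

    def contains(self, px, py):
        return (abs(px - self.cx) <= self.width / 2
                and abs(py - self.cy) <= self.height / 2)


# Hypothetical AOIs, each defined only by a center point and a size.
play_icon = CircleAoi("play icon", cx=400, cy=300, radius=40)
cta = RectAoi("cta button", cx=400, cy=300, width=120, height=60)
print(play_icon.contains(420, 320), cta.contains(455, 325))
```

Anyone replicating the study can recreate these regions exactly from the printed field values.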
Table: Pick Your AOI Method in 60 Seconds
| Stimulus Type | Best Method | Effort | Consistency | Main Risk |
| --- | --- | --- | --- | --- |
| Static (webpage, print) | Manual or point-based | Low | High | Human subjectivity in boundary placement |
| Video (moving elements) | Dynamic/keyframe + interpolation | High | Medium | Drift between keyframes if not checked |
| Wearable / Scene Camera | Gaze mapping first, then AOIs | Very high | Low without calibration | Reference frame errors corrupt everything |
| Face Video | Landmark-based automation | Low–Medium | High | Occlusion / blink frames need review |
The Role of AI in Face Video: Face video is a special case worth noting. Facial landmarks like the eyes, nose, and mouth are stable reference points that automated tools can track reliably. This makes it one of the better use cases for AI-first AOI automation, with a human reviewing the output instead of doing it all from scratch.
Step 5: Validate and Document
Treat your AOIs like any other measurement instrument: test them before you trust them, and document your choices so others can reproduce your work.
A Practical Validation Toolkit
- Instructed-target pilot: Ask a few people to deliberately look at one element (“focus on the headline”). If your AOI doesn’t capture their gaze, the problem is with your AOI’s size or placement, not the participants.
- Sensitivity check: Expand and contract each AOI by a small amount and rerun your metrics. If key numbers like dwell time shift dramatically, your conclusions are fragile.
- Auto-AOI vs. human review: For automated AOIs, have a researcher review a subset of the output. Look for systematic misses, which are common with occlusions, blinks, or fast-moving content.
- Inter-rater check: If AOIs were drawn by hand, have a second person draw them independently on a small sample. If the two sets of drawings differ substantially, your definitions are too ambiguous.
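For the inter-rater check, you need some way to put a number on how different two drawings are. One option, which is our suggestion rather than a prescribed standard, is intersection over union (IoU): the overlapping area divided by the combined area. The sketch assumes rectangular AOIs as (x, y, width, height), and the example coordinates are made up.

```python
def rect_iou(a, b):
    """Intersection over union of two (x, y, width, height) rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union else 0.0


# Hypothetical drawings of the same "hero image" AOI by two researchers.
rater_a = (0, 0, 100, 100)
rater_b = (50, 0, 100, 100)
print(f"IoU = {rect_iou(rater_a, rater_b):.2f}")
```

An IoU of 1.0 means identical rectangles and 0.0 means no overlap. What counts as "close enough" is a threshold to set before you compare.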
A Mental Model for Eye Tracking AI: AI can be a massive time-saver, but your approach should be “AI first pass, human curation,” rather than full hands-off automation. AI models have their own biases about what constitutes an “object edge.”
Define Your Exit Criteria
Before you pilot, establish your parameters: What capture rate is acceptable, meaning what share of instructed gaze must land inside the target AOI? How much can a metric shift in the sensitivity check before it’s a problem? Answering these questions in advance is what makes this a real validation process.
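Writing the exit criteria down as constants before the pilot keeps them from shifting once results arrive. The sketch below is one way to do that; the 80% capture rate and 15% maximum shift are placeholder thresholds for illustration, not recommended values, and the function names are hypothetical.

```python
# Placeholder thresholds: set your own before running the pilot.
MIN_CAPTURE_RATE = 0.80   # share of instructed gaze that must land in the AOI
MAX_METRIC_SHIFT = 0.15   # largest tolerated relative change in a key metric


def capture_rate(gaze_points, rect):
    """Share of (x, y) gaze points inside an (x, y, width, height) AOI."""
    if not gaze_points:
        return 0.0
    x, y, w, h = rect
    hits = sum(1 for px, py in gaze_points
               if x <= px <= x + w and y <= py <= y + h)
    return hits / len(gaze_points)


def relative_shift(baseline, perturbed):
    """Relative change of a metric after resizing the AOI."""
    if baseline == 0:
        return 0.0 if perturbed == 0 else float("inf")
    return abs(perturbed - baseline) / baseline


def passes_exit_criteria(rate, shift):
    return rate >= MIN_CAPTURE_RATE and shift <= MAX_METRIC_SHIFT


# Hypothetical pilot: participants were told to look at the headline.
headline = (100, 100, 200, 50)
pilot_gaze = [(150, 120), (180, 130), (290, 110), (120, 145), (400, 300)]
rate = capture_rate(pilot_gaze, headline)
shift = relative_shift(baseline=1200, perturbed=1100)  # dwell time in ms
print(rate, round(shift, 3), passes_exit_criteria(rate, shift))
```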
Minimum AOI Documentation Checklist
If you can’t describe your AOI setup clearly, stakeholders will reasonably question your findings. Ensure your research log contains:
- [ ] AOI names and what each one represents (be specific: “hero image logo” vs. “hero image”)
- [ ] Size and shape rules, including your margins and the reason for them
- [ ] Overlap and boundary rules, and how you handled gaze outside of any AOI
- [ ] The method used (manual, keyframed, automated) and a note on any human edits
- [ ] Whether analysis is based on fixations or raw samples, with a brief rationale
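If you prefer a structured log over free text, the checklist maps naturally onto a small record that can be saved as JSON alongside your data. This is only a sketch: the field names and example values are hypothetical, and you should adapt them to your own study.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class AoiRecord:
    name: str
    represents: str
    shape: str
    margin_px: int
    margin_reason: str
    overlap_rule: str
    outside_gaze_rule: str
    method: str              # manual, keyframed, or automated
    human_edits: str
    unit_of_analysis: str    # fixations or raw samples
    rationale: str


# Hypothetical entry for one AOI.
record = AoiRecord(
    name="hero image logo",
    represents="brand logo inside the hero image, not the whole image",
    shape="rectangle",
    margin_px=40,
    margin_reason="webcam tracker accuracy at the expected viewing distance",
    overlap_rule="logo takes priority over hero image",
    outside_gaze_rule="tagged as 'other'",
    method="manual",
    human_edits="none",
    unit_of_analysis="fixations",
    rationale="static stimulus, interest is in processing rather than motion",
)
log_entry = json.dumps(asdict(record), indent=2)
print(log_entry)
```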
Summary: The Repeatable 5-Step Checklist
Here is the full workflow, compressed into a repeatable checklist for your next study:
- Define your measurement intent: Lock in your hypothesis first. Decide what “attention to this region” means.
- Set your sizing rules: Choose AOI sizes based on your tracker’s real-world accuracy. Build in margins.
- Design layout and boundary rules: Use spacing buffers. Define clear rules for any overlap or out-of-bounds gaze.
- Pick and implement the right method: Match your stimulus (static, dynamic, face, wearable) to the best execution method.
- Validate, then document: Run an instructed pilot, do a sensitivity check, and document your specs before analyzing the data.
Conclusion
At its core, eye tracking is a translation science. It takes raw biological behavior—the rapid, semi-conscious movement of human eyes—and translates it into actionable business and design insights.
But this translation is only as reliable as the boundaries you draw. If your AOIs are arbitrary, your metrics will be meaningless. By treating AOI design as a deliberate, scientific step rather than an administrative afterthought, you ensure your data reflects human attention—not tracker error or researcher bias.
If you only do one thing: Pilot your AOI setup with 2–3 test participants before you launch your main study. It takes less than an hour and can save you from having to defend a conclusion built on a poorly placed box.
Frequently Asked Questions (FAQs)
1. Can AI completely automate the AOI drawing process?
Not yet. While AI is incredibly fast at object detection and facial landmark tracking, it lacks the contextual understanding of your study’s hypothesis. AI might perfectly map a “button,” but it doesn’t know if your question is about the text label inside the button or the overall visual weight of the button in the UI. Always use a workflow of “AI first pass, human curation.”
2. How much margin (in pixels) should I add to an AOI?
There is no single pixel rule, because the on-screen size of a visual angle depends on the participant’s distance from the screen and the display’s pixel density. As a starting point for a standard desktop webcam eye tracker (typically accurate to about 1.5 to 2 degrees of visual angle), a margin of 30 to 50 pixels around critical elements is commonly used to absorb tracking error without capturing unrelated content. Convert your tracker’s accuracy to pixels for your own setup, and widen the margin if the result is larger.
3. What should I do if two critical elements are too close together to allow margins?
If elements are tightly clustered (e.g., social media share icons), you have two choices:
- Aggregate: Merge them into a single, collective AOI (e.g., “Social Sharing Group”) and accept that you cannot isolate individual buttons.
- Prioritize / Categorize: Keep them separate but establish a strict “nearest-center” rule, and explicitly tag near-miss gazes in your documentation as “ambiguous.”
4. When should I analyze raw gaze samples versus fixations inside an AOI?
For most UX and creative tests, fixations (pauses of the eye, typically lasting longer than about 100 ms) are the standard unit of analysis because they mark where visual information is most likely being processed. Raw gaze samples are more useful for highly dynamic tasks, usability studies involving fast physical movement, or diagnosing hardware accuracy during your validation phase.
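The two approaches can produce different numbers for the same AOI, so it helps to see how each is computed. The sketch below calculates dwell time both ways. It assumes fixations arrive as (x, y, duration_ms) from your tracker’s own fixation filter and that raw samples arrive at a fixed sampling rate; the function names, the 30 Hz rate, and the data are hypothetical.

```python
def inside(rect, px, py):
    """True if (px, py) is inside an (x, y, width, height) rectangle."""
    x, y, w, h = rect
    return x <= px <= x + w and y <= py <= y + h


def dwell_from_fixations(fixations, rect):
    """Sum the durations (ms) of fixations that land inside the AOI."""
    return sum(d for px, py, d in fixations if inside(rect, px, py))


def dwell_from_samples(samples, rect, sample_rate_hz):
    """Estimate dwell time (ms) as samples inside the AOI times the interval."""
    hits = sum(1 for px, py in samples if inside(rect, px, py))
    return hits * 1000.0 / sample_rate_hz


# Hypothetical data for one participant and one AOI.
pricing = (100, 100, 100, 100)
fixations = [(150, 150, 220), (400, 400, 180), (160, 140, 300)]
samples = [(150, 150), (152, 149), (151, 151), (300, 300),
           (160, 140), (161, 141), (159, 142), (410, 390)]
print(dwell_from_fixations(fixations, pricing))                 # in ms
print(dwell_from_samples(samples, pricing, sample_rate_hz=30))  # in ms
```

Sample-based dwell also counts gaze that passes through the AOI during saccades, which is one reason the two figures can differ.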