Hacking Attention: My Immersive Group-Watching Project for the Living Room

Published on April 11, 2025

Tags: Explainer, Gamification, Immersive Experience, Attention Economy, Experience Design, Flow

Some projects come from pure frustration—and for me, it was the complete chaos of attention when watching a show with friends at home. Picture it: one person on the TV remote, another swiping TikTok, a third half-coding on a laptop, random screens blinking everywhere. You’d think you’re together, but everyone’s reality is fragmented. This project? It’s my way of hacking that distraction—turning every screen and device in the room into a synchronized, immersive, and (honestly) pretty intense collective experience.

 
The Core Idea: Turning Attention Chaos Into a Tool
Here’s the basic scenario:

  • You’re in a living room, watching a show or movie with friends—physically together, not a virtual watch party.
  • Each person brings a phone (sometimes two), maybe a laptop or tablet, plus the main screen (TV, projector, whatever).
  • For two people, you might have three to five active devices. With three or four people, that easily jumps to six, seven, sometimes even ten.
  • No streaming platform actually leverages this absurd number of sensors and displays in a single physical space.
Most streaming services are obsessed with “solo” immersion, or at best, Discord/online watch parties. Bandersnatch-style interactive shows let you change the story, but none of this actually touches the physical, social, and chaotic reality of group watching in the same room.
I wanted something different:
A system where the show plays as usual, but every device in the room—phones, tablets, even smart lights—contribute to a single, coordinated, attention-hijacking experience. Instead of fighting distraction, I wanted to choreograph it.

 
How It Works: Multisensory Synchronized Terror (or Fun)

  • Phones/Tablets: While the episode is playing, people’s devices light up with synced sounds, flashes, subtle vibrations, AR puzzles, “secret” notifications. Maybe during a tense scene, your phone’s screen slowly flickers, or you hear a chilling whisper in your earbuds nobody else can hear.
  • Calling Mechanic: Suddenly, someone’s phone rings. The caller? An AI (or another friend, orchestrated by the system), delivering cryptic messages, jump scares, or even tasks. Sometimes you have to relay the info to the group—or maybe you keep it secret.
  • AR Layer: Point your phone camera at the wall and—bam—a bloody message appears, or a ghost flickers at the edge of the sofa. Only visible on your device.
  • Security Camera Gimmick: One tablet becomes a “security cam”—displaying a different angle of the narrative world, maybe with clues, or even hallucinated threats only some people can see. Sometimes the camera gets “hacked” and pushes a threatening message, or an alternate perspective of the action.

 
Information, Suspicion, and Paranoia as Mechanics
Each device receives unique bits of information, sometimes intentionally contradictory. You can share what you know, or mislead the group. It’s half-watching, half-playing, half-lying.
  • Some info is “cursed”—only one person receives a creepy warning, and sharing it might endanger the whole group (in the narrative, at least).
  • Phones randomly emit whispers or even urgent beeps; group members have to coordinate, or just freak out together.
  • Using AR, a single player might be stalked by a shadow, only visible to them. Nobody else sees it, but the app nudges you to act on it—or maybe just get really, really nervous.

 
Physical and Sensory Participation: Modern Rituals
  • At certain moments, all devices buzz, flash, or play the same jarring sound. Total synchronized panic.
  • “Hold your breath” minigame: when characters hide on screen, your phone asks you to press and hold. Fail, and your device explodes in noise; everyone glares at you, tension spikes.
  • Smart home integration: the system dims lights, flickers bulbs, or triggers color shifts exactly in sync with the most intense moments. Suddenly, the living room is part of the set.
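The “hold your breath” check is easy to sketch. Here is a minimal version (function and parameter names are hypothetical; the real app would receive touch events from the device’s UI layer) that simply compares the player’s press-and-hold window against the scene’s danger window:

```python
from typing import Optional

def held_through_scene(scene_start: float, scene_end: float,
                       press_down: Optional[float],
                       release: Optional[float]) -> bool:
    """Pass only if the player pressed before the danger window opened
    and did not release until it closed. Times are seconds into the
    episode; None means the event never happened."""
    if press_down is None:
        return False                      # never pressed at all
    if press_down > scene_start:
        return False                      # pressed too late
    if release is not None and release < scene_end:
        return False                      # let go too early
    return True

# Example: the hiding scene runs from 754.0 s to 761.5 s.
print(held_through_scene(754.0, 761.5, press_down=753.2, release=762.0))  # True
print(held_through_scene(754.0, 761.5, press_down=755.0, release=762.0))  # False
```

On a fail, the host would cue the punishment sound on that one device rather than the whole room.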

 
Hidden Roles, Social Engineering, and Subtle Betrayal
  • Each person can be assigned a secret “role”: helper, traitor, victim, etc. You might be getting DMs from the show’s villain, trying to sabotage the group—or save them.
  • The system can introduce psychological twists: anonymous messages, creepy “watch your back” warnings, or apparent glitches on only one device.
Paranoia isn’t just for the screen anymore. The line between narrative tension and group dynamic blurs.

 
Technical Architecture: Beyond Second Screen
Most “second screen” apps are glorified trivia quizzes or basic remote controls. What I’m building is more like a real-time, distributed, sensor-driven game engine:
  • Every device is a networked “attention node.”
  • Core sync is managed via local WiFi and precise AV-sync algorithms—yes, hard, but doable with today’s protocols (NTP, multicast triggers, device fingerprinting).
  • AI modules generate or deliver narrative, play roles, manipulate information, and even fake “conversations” by calling group members or texting them in-character.
  • The real physical environment—light, sound, maybe even haptics—is orchestrated by the system, not just the TV.
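The NTP-style sync can be sketched in a few lines. This is a simplified single-round-trip offset estimate (real NTP averages several samples and filters outliers), and `host_clock` is a stand-in for whatever RPC the host app would actually expose:

```python
import time

def estimate_offset(host_clock) -> float:
    """One NTP-style round trip: sample the host's clock, assume the
    network delay is symmetric, and return (host clock - local clock)."""
    t0 = time.monotonic()
    host_time = host_clock()          # host's clock, read roughly mid-flight
    t1 = time.monotonic()
    rtt = t1 - t0
    return (host_time + rtt / 2) - t1

def local_fire_time(effect_host_time: float, offset: float) -> float:
    """Convert a host-timeline timestamp into this device's clock."""
    return effect_host_time - offset

# Sanity check with a fake host whose clock runs 5 s ahead of ours:
fake_host = lambda: time.monotonic() + 5.0
offset = estimate_offset(fake_host)
print(round(offset, 1))  # ~5.0
```

Repeating the estimate every few seconds and low-pass filtering it gives the gentle drift correction described later.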

 
The Philosophy: No Story Branching, Just Reality Branching
Unlike interactive shows (Bandersnatch, etc.), the content itself never changes. The video stream is fixed. All the play happens around the content—how you receive, share, or hide information; how your senses are manipulated; how the room itself becomes part of the tension.

 
Why Do This? And Why Now?
Because the modern living room is already a chaotic mesh of screens, speakers, and divided attention. Instead of fighting this, why not make the chaos the point?
Everyone’s device becomes a vector for fear, suspense, or laughter.
Sometimes you feel isolated, sometimes everyone panics together, sometimes the border between fiction and reality evaporates (at least for a second).
It’s ritual, game, and shared performance all at once—a totally new way to experience media that only makes sense in the real, physical, “messy” space of a home.

 
Next Steps: Adaptive, AI-driven, and Wild
On the roadmap:

  • Adaptive pacing and content, dynamically responding to audience reactions.
  • Personalized scares, depending on emotional/biometric feedback (think smartwatches + phone cameras).
  • External “agent” participation—friends in another city can join as hidden saboteurs or helpers.
  • Full integration with generative AI for roles, voices, even video overlays.
  • And ultimately: a ritualistic, nearly uncanny blend of cinema, LARP, and social deduction.
We’ve only run a handful of late-night test sessions, but the sheer chaos (and the honest-to-god screams) already show this works—maybe even too well.

 

 
Under the Hood: How Does This Actually Work?

 
Core Architecture

  • Device Discovery & Sync:
    On launch, each device joins a local “session” over the same WiFi (Bonjour/mDNS, or fallback to QR code pairing for offline setups). Devices time-sync using an NTP server (or, for nerds, a local “time master” if there’s no internet).
    The TV or main screen runs the “host” app—either a custom player or a browser overlay for existing streaming platforms.
  • AV Synchronization:
    The main video timeline serves as the “source of truth.” All triggers—whether a phone call, a flash, a whisper, or an AR overlay—are tied to timestamps. Devices poll the host for timing, then fire effects at precise moments.
    To handle lag and jitter, there’s continuous drift correction: the phones and tablets gently “nudge” themselves back into sync if latency creeps in.
  • Narrative Engine:
    The system parses “event scripts” written in a JSON or YAML format—think Ren’Py meets Node-RED. Each event is tagged for device(s), user(s), and trigger conditions (e.g., “At 00:12:34, send AR overlay to User 2; at 00:13:00, vibrate all phones if ambient light < 30 lux”).
    Scripts can branch based on live user input or sensor data: e.g., if someone failed the “hold your breath” challenge, spawn a new scare just for them.
  • Local AI Modules:
    For privacy and latency, simple “roleplay AI” (think a mini-LLM or a ruleset-based persona) can run directly on-device. For more complex generative dialogue or real-time content, a cloud AI endpoint is pinged (low latency is critical here, so edge inference may be necessary for bigger deployments).
  • Environmental Controls:
    The app scans for smart home devices (Hue, Nanoleaf, HomeKit, etc.) and can send commands to change lighting, activate speakers, or even control fans and curtains if available. The more hardware you have, the wilder it gets.
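To make the event-script idea concrete, here is a hypothetical schema and matching loop in Python. Field names like `t`, `targets`, and `ambient_lux_below` are illustrative, not a fixed spec; the real engine would also handle the branching conditions described above:

```python
import json

# Hypothetical event script: each event has a timeline timestamp
# (seconds into the episode), target devices, an effect, and optional
# sensor conditions gating whether it fires.
SCRIPT = json.loads("""
[
  {"t": 754.0, "targets": ["user2"], "effect": "ar_overlay",
   "payload": {"asset": "bloody_handprint"}},
  {"t": 780.0, "targets": ["all"], "effect": "vibrate",
   "when": {"ambient_lux_below": 30}}
]
""")

def due_events(script, prev_t, now_t, sensors):
    """Return events whose timestamp falls in (prev_t, now_t] and whose
    sensor conditions hold against the current readings."""
    fired = []
    for ev in script:
        if not (prev_t < ev["t"] <= now_t):
            continue
        cond = ev.get("when", {})
        lux_cap = cond.get("ambient_lux_below")
        if lux_cap is not None and sensors.get("ambient_lux", 0) >= lux_cap:
            continue
        fired.append(ev)
    return fired

# Each poll, the device asks the host for the playback position and
# fires anything that became due since the last poll:
print([e["effect"] for e in due_events(SCRIPT, 750.0, 760.0, {"ambient_lux": 12})])
# -> ['ar_overlay']
```

Because the script is just data keyed to the fixed video timeline, swapping in a different script “mods” the same episode into a completely different night.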

 
Interaction Mechanics: What Actually Happens?

 
Individual vs. Collective Experiences

  • Solo Frights:
    Sometimes, a phone receives a jumpscare, a warning, or a “glitch” effect that only one person perceives. Maybe their camera briefly malfunctions, or they hear a voice that nobody else hears.
  • Group Events:
    The system can force moments of “shared reality,” like all devices buzzing, lights dimming, and a coordinated sound cue creating mass panic (yes, it works).
    These spikes in collective attention act like “emotional punctuation”—everyone’s on edge together, then a sudden calm.

 
Roleplay, Social Deduction, and Betrayal

  • Secret Missions & Roles:
    Some viewers are told to mislead others, or to act as the “eyes” for the on-screen antagonist. The app sends them prompts, suggestions, or even “threats” to keep the deception active.
  • Bluffing and Paranoia:
    Devices can issue conflicting information. You might have a vital clue, but is it real, or are you the target of a prank? This dynamic creates a light layer of “Among Us” energy, but the show itself never changes—only the social experience does.

 
AR and Environmental Manipulation

  • Layered Reality:
    With the AR layer, you can point your camera anywhere in the room. At certain triggers, overlays appear—bloody handprints, writing on walls, or even a lurking figure.
    Some are scripted, some are randomized; sometimes only one user sees them. This isn’t just for horror—imagine using it for a heist show, where only you spot the hidden code.
  • Environmental Feedback Loops:
    The app can use data from device sensors: if the room is too bright, it asks users to dim the lights for “full experience.” Or, if someone’s heart rate (from a smartwatch) spikes, it delays the next jumpscare, so no one passes out (yes, safety first—usually).
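The heart-rate gate could look something like this. The spike ratio and delay are made-up defaults for illustration, not tuned values, and the real feed would come from a smartwatch API rather than plain floats:

```python
def next_scare_time(scheduled_t: float, baseline_bpm: float,
                    current_bpm: float, delay_s: float = 20.0,
                    spike_ratio: float = 1.35) -> float:
    """Postpone a scheduled scare if the viewer's heart rate has spiked
    well above their session baseline; otherwise keep it on schedule."""
    if baseline_bpm > 0 and current_bpm / baseline_bpm >= spike_ratio:
        return scheduled_t + delay_s      # let them recover first
    return scheduled_t

print(next_scare_time(900.0, baseline_bpm=70, current_bpm=110))  # 920.0: delayed
print(next_scare_time(900.0, baseline_bpm=70, current_bpm=80))   # 900.0: on time
```

The same gate can run in reverse for comedy scripts: if nobody is reacting, escalate sooner.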

 
Scalable Design: Why This Can Actually Be Mainstream

  • No Need for Custom Content:
    The system can layer on top of any existing show or movie. As long as you have a timestamped event script, you can “mod” any piece of content into an immersive group experience.
  • Cloud-based Event Marketplace:
    In the future, I see a library of user-created event scripts: “Best scripts for The Haunting of Hill House,” or “Ultimate group-watch for Money Heist.” You can browse, download, rate, remix.
  • Analytics for Continuous Improvement:
    (Optional, privacy-respecting) analytics can track engagement—when people’s devices were most active, which cues got the loudest reactions, etc. Scripts can evolve dynamically based on feedback.

 
Use Cases and Edge Cases: Not Just for Horror

  • Crime/Detective Nights:
    Devices feed out clues, red herrings, and even fake evidence. The group debates, investigates, and experiences revelations or betrayals together—almost like a murder mystery party, but guided by the script and tech.
  • Sci-Fi/Surrealism:
    Imagine using this in a Black Mirror-style episode: AR overlays glitch, you receive a message “from the future” on your phone, lights flicker to “simulate” a power failure, etc.
  • Comedy/Trolling:
    All of this can be used for fun, too—prank messages, “whoopee cushion” sound triggers, or just forcing everyone to take a silly selfie mid-movie.

 
Research, Precedents, Inspirations

  • Partial Parallels:
    Netflix’s “Stranger Things” Interactive AR was phone-only, never group-focused.
    Jackbox Games nail local group play, but are game-centric, not narrative overlay.
    Second Screen Apps (HBO Go, Disney+) have tried trivia and simple polls, but never real, multisensory, room-wide manipulation.
  • Academic Theory:
    Draws from pervasive gaming, ARGs (Alternate Reality Games), and theater practices like environmental/immersive theater.
    Game studies concepts: social play, spect-actor (Boal), bleed between fiction and reality, attention economy.
    Uses recent work on “shared affective synchrony” in physical co-presence (cf. Aarseth, Calleja, and some recent HCI conference papers).

 
The Real Challenge: Friction vs. Magic

  • Tech needs to be seamless.
  • Device setup should take less than two minutes—QR code, join group, done.
  • UI needs to be minimal, so as not to break the fiction or “over-gamify.”
  • Safety and consent: people can always “pause” or opt out of individual triggers.
But when it works? There’s a genuine magic—a real, unrepeatable night. Screams, laughter, moments of real confusion and delight. And (honestly) the group dynamic keeps you far more present than you’d ever be in a normal Netflix session.

 
TL;DR:
The modern living room is a perfect mess of attention, devices, and distracted minds. Instead of fighting that, my project makes the mess intentional, using every phone, every lamp, every heartbeat to blur the line between watcher and participant. Not a game, not an app—an ambient, evolving ritual that turns “just watching TV” into something no algorithm can predict.
Want to know more, test a prototype, or help script the weirdest group-watching night ever? Just ping me. There’s no turning back once you’ve seen your own living room haunt you back.