Apple’s vision of ourselves

Man wearing an Apple Vision Pro headset. A digital version of his eyes is visible through the headset, and he has his tongue sticking out
Figure 1: Vision Pro headset
What is Apple’s spatial computing getting us into?

Apple’s first mixed reality headset, which combines and relates digital visual and auditory information with your surroundings, launched this week. Despite critics questioning its actual use cases, demand for the device has seemingly blown through pre-order availability, with deliveries now tracking more than a month out from launch. Underneath that demand, the assumptions around the mechanics and social interactions Apple Vision Pro creates raise a raft of questions about privacy and security, and about how our social lives will deal with this next form of computing. So how is Apple positioning its next product line, and what does this mean for the rest of us?

Apple aims to redefine personal computing with the Vision Pro, just as its iPhone did sixteen years ago, and the Macintosh forty years ago. However, this new device also redefines privacy concerns through its novel forms of what I have elsewhere called biospatial surveillance (Heemsbergen 2022), which involves putting extensive personal and environmental data to use.

Apple assures us of privacy protections, but the device’s capability to capture and manipulate intimate data, such as biometric and physiological information, raises questions about the implications for personal privacy and social interaction. Interestingly, this surveillance is also used to address a social challenge of wearing headsets: by allowing others to “see” users’ eyes and expressions on the headset, it integrates the user more seamlessly into their social environment.

Spatial computing era

Apple is very careful about its brand, describing its interfaces and guiding its relations with the developers who make software for it in very specific ways. Apple Vision Pro, according to Apple (2024), is never to be referred to as a ‘headset’, and does not make “augmented reality (AR), virtual reality (VR), extended reality (XR), or mixed reality (MR)” experiences. Instead, Apple now offers “spatial computing” experiences, described as blending digital content with the physical world.

In more technical language, Greenwold (2003) has described spatial computing as the interaction between humans and a machine in which the machine maintains and manipulates references to real-world objects and spaces. Spatial computing makes it possible to exploit relationships between digital data and physical data for user experiences. Most simply, that perfectly rendered dinosaur from Apple TV+’s “Prehistoric Planet” can now seem to jump ‘out’ of its digital window and run around your living room. Other than dinosaurs, what is Apple offering?

Whatever the offering, it has big shoes to fill. Sixteen years ago the iPhone arrived and introduced an era of mobile computing. Never before could the device in your pocket hold the entire internet (including Google Maps!), carry all your music (a touch-screen iPod and music store!), and let you easily choose and delete voicemail (visual voicemail!).

These three ‘killer’ applications reconfigured how we interact in life and ‘killed’ off other practices – to the extent that they might now seem silly: do you carry a paper map in your car? Do you use CDs? Do you even remember what voice mail is? Iterations of faster networking and better hardware meant sharing photos and video calls on the run became a ubiquitous human activity, with billions of Apple phones now in existence and around 80% of humans now using a smartphone of some sort.

Mobile phones present a classic socio-technological regime (Schot & Geels, 2008), where pressures of technology and industry, market preferences (apps!), culture, and policies reconfigured not just what technology exists, but how we live with it. While spatial computing might reconfigure human practice once again, the ‘technology’ of Apple’s new product has not yet aligned into a dominant design that takes the world by storm – though Apple is very good at making that happen. See iPhone, AirPods, and Macintosh, to name a few user-experience expectations that have spurred industry, culture, and regulation along. However, at present, the technology press suggests the ‘killer apps’ of spatial computing are still pending on the eve of Apple’s launch.

Launch memories

Despite the press, we note here where Apple is focusing our attention on the Apple Vision Pro. First is offering an entertainment device like no other – the best ‘theatre for one’ in existence. Second is something CAVRN member Chris Chesher has written about: attempting to solve the social problem of wearing a weird-looking headset by letting people ‘see’ your gaze, a critical piece of human-to-human interaction that gets dropped between virtual reality and physical reality; competitor headsets are not ‘see-through’.

Third is capturing and reliving ‘memories’. Apple, up to this point, has not been a memories company. Computer, phone, fitness, entertainment, health-wearable, yes, but not memories (other than indirectly via its creative potential, as per one overly emotional screen-time-for-the-holidays ad). Thus this third offer might seem strange, but Apple’s presentation (see figure 2) and reviewers of Apple’s device keep coming back to the novelty of directly capturing memories:

“this was stuff from my own life, my own memories. I was playing back experiences I had already lived”

(Stein 2023)

“[I viewed them in] an “Immersive” view where the border of the video becomes glowy and dream-like to give it characteristics of a memory”.

(Wong 2023)

Figure 2: Apple’s video explains spatial memories

The potential to select, store, annotate, and share digital ‘memories’ is a patented feature of Apple’s device. For clarity, the ephemeral and intangible stuff of life’s memories is now product: “[Apple technologies] deliver stunning spatial memories in a compact file size” (Apple 2023). This way, memories can outlast the original organic user who captured them and be leveraged for other uses and by other users.

Creating a personal theatre, showing digital versions of our gaze, and transposing ‘memories’ to others all require novel forms of surveillance of our bodies and surroundings; underneath the ‘killer’ applications of Apple’s Vision are vast amounts of intimate data being captured and put to use.

Biospatial surveillance, included

Spatial computing requires vast amounts of data from both your surroundings and your body to work. In my research I have called this biospatial surveillance, as it is a step change from how digital media have ‘tracked’ us previously.

In past social media use, our consumer attention has been tracked through the data we give off, in ways that build profiles around intent and desire. For instance, Instagram or Facebook (parent company Meta) know you’ll want to buy new, slimmer jeans in March because of patterns in your activity not just on their platforms, but from across the web and outside of it (app-based workouts, the geographic points where you purchase Gatorade, increasing ‘Mediterranean’ recipe searches, and previous brand purchase history). Together these patterns suggest which pants you’re now due for. You might even, at this point, tell your friends something about new pants, and swear your phone is listening to you when it serves the relevant ad. Research has found that, on average, some 2,300 companies send data about a single user to Facebook based on this type of behaviour.
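To make that kind of profiling concrete, here is a minimal, purely illustrative sketch of scoring cross-platform signals against a purchase prediction. The signal names, weights, and threshold are invented for illustration only; they do not describe Meta’s or anyone else’s actual systems.

```python
# Purely illustrative sketch of cross-signal ad profiling; the signals, weights,
# and threshold are invented and do not describe any real platform's systems.

# Signals "given off" across apps, purchases, and searches (hypothetical values).
signals = {
    "workout_sessions_per_week": 4,          # from a fitness app
    "sports_drink_purchases_30d": 6,         # from point-of-sale data
    "mediterranean_recipe_searches": 9,      # from web search history
    "months_since_last_jeans_purchase": 14,  # from brand purchase history
}

# Hypothetical weights expressing "this pattern suggests new, slimmer jeans".
weights = {
    "workout_sessions_per_week": 0.15,
    "sports_drink_purchases_30d": 0.05,
    "mediterranean_recipe_searches": 0.04,
    "months_since_last_jeans_purchase": 0.03,
}

# Combine the signals into a single purchase-intent score.
score = sum(weights[k] * v for k, v in signals.items())

if score > 1.0:  # hypothetical threshold for serving the ad
    print(f"Serve slim-jeans ad (score {score:.2f})")
```

The point is not the arithmetic but the aggregation: each signal is trivial on its own, and the profile only emerges when they are combined.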

In spatial computing, intimate data about our bodies and our surroundings is put to use in ways that fundamentally redefine the nature and modality of personal data capture. For instance, no fewer than 64 different physiological and biometric data streams were noted as available for headset design back in 2021 (Bye). These range from eye tracking and pupil response to something called Superconducting Quantum Interference Devices (SQUIDs) that measure subtle changes in the body’s electromagnetic field.

This is not ‘consumer’ data; it is personal bodily data that is better thought of as medical data. For instance, analysing unconscious movements can be exploited for emotional insights or to predict neurodegenerative disease (Abraham et al. 2022). Industry safety advocates from “X Reality Safety Intelligence” (formerly the XR Safety Initiative) call this “biometrically-inferred data”, as users are unaware their bodies are giving it up.

Apple’s privacy policies suggest it doesn’t share this type of data with anyone, and Apple has proven better than most consumer electronics companies at protecting privacy and at leveraging medical-like data from wearables for safety (search “apple watch heart”). That doesn’t mean biospatial surveillance is not already put to use in Apple’s spatial computing.

You can’t even pre-order an Apple Vision Pro on the web without scanning your facial features via your iPhone (to ensure a snug fit).

Screen capture from Apple.com when ordering a Vision Pro. The text says: "Grab an iPhone or an iPad with FaceID to find the right size for you: Scan the image with the Camera app and follow the prompts. Then return here to complete your purchase. The measurement data used to find your fit will not leave your device"
Figure 3: Still from Apple.com when ordering Apple Vision Pro

But wait, there’s more. Apple’s focus on memories seems to rely on a patent (WO2023196257) that talks about memories in terms of their playback control, and about how to ‘guide and direct a user with attention, memory, and cognition’ through feedback loops that monitor “facial recognition, eye tracking, user mood detection, user emotion detection, voice detection, etc. [from a] bio-sensor for tracking biometric characteristics, such as health and activity metrics… and other health-related information”.

Our bodies are at work in spatial computing, whether we know it or not. Apple’s patent further speaks of receiving the informed consent of users or letting them opt out. It is worth quoting in part here to show how new forms of surveillance and control are baked into the technical interfaces of spatial computing solutions:

“The detections made by the eye-tracking sensor can determine where the user is devoting attention. The user’s vision and/or gaze can be monitored to detect an attention level and/or changes thereof. … For example, the head-mountable device can include sensors to detect other conditions of the eye, such as pupil dilation, eyelid state (e.g., closure, openness, droopiness, etc.), blink rate, and the like. Such eye conditions can be compared to corresponding target thresholds to determine a user’s attention level. … In operation …, the conditions can be compared to each other to determine how effective the indicator, the activity, and/or other events were at improving the attention level of the user.”

WO2023196257
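To make the patent’s feedback loop concrete, here is a minimal sketch of the kind of threshold comparison and before/after scoring the quoted passage describes. The sensor values, target thresholds, and scoring function are hypothetical illustrations, not Apple’s implementation or API.

```python
# Hypothetical sketch of the attention-scoring loop described in patent WO2023196257.
# All sensor readings, thresholds, and weights below are invented for illustration;
# they are not Apple's implementation or API.

from dataclasses import dataclass

@dataclass
class EyeConditions:
    pupil_dilation: float   # millimetres (hypothetical scale)
    eyelid_openness: float  # 0.0 (closed) to 1.0 (fully open)
    blink_rate: float       # blinks per minute

# Hypothetical target thresholds the patent says eye conditions are "compared to".
TARGETS = EyeConditions(pupil_dilation=3.5, eyelid_openness=0.8, blink_rate=15.0)

def attention_level(obs: EyeConditions) -> float:
    """Score 0..1: how closely the observed eye conditions match the targets."""
    scores = [
        max(0.0, 1.0 - abs(obs.pupil_dilation - TARGETS.pupil_dilation) / TARGETS.pupil_dilation),
        max(0.0, 1.0 - abs(obs.eyelid_openness - TARGETS.eyelid_openness)),
        max(0.0, 1.0 - abs(obs.blink_rate - TARGETS.blink_rate) / TARGETS.blink_rate),
    ]
    return sum(scores) / len(scores)

def indicator_effectiveness(before: EyeConditions, after: EyeConditions) -> float:
    """Compare attention before and after an 'indicator' (prompt) is shown,
    i.e. how effective the event was at improving the user's attention level."""
    return attention_level(after) - attention_level(before)

# Usage: a drowsy reading, then a reading taken after a prompt nudges the user.
drowsy = EyeConditions(pupil_dilation=2.0, eyelid_openness=0.4, blink_rate=28.0)
alert = EyeConditions(pupil_dilation=3.4, eyelid_openness=0.85, blink_rate=16.0)
print(f"attention before: {attention_level(drowsy):.2f}")
print(f"attention after:  {attention_level(alert):.2f}")
print(f"indicator effect: {indicator_effectiveness(drowsy, alert):+.2f}")
```

Even in this toy form, the loop shows what is being claimed: the device does not just read the eye, it scores the eye against targets and then measures whether its own prompts moved that score.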

On the one hand, this type of data capture is required simply for users to look at things to select them. It may revolutionise computer interfaces in ways akin to having a mouse and pointer instead of a command line, or a multi-touch screen instead of a mouse and monitor. On the other, it presents modes of behaviour modification that are a category change from the (now seemingly simple) data exhaust we offer algorithms for screen-based content like TikTok or Facebook.

To be clear, this type of biospatial surveillance is also at work solving the social problems that donning a headset introduces: a technical solution to a social problem created by the technical constraints of the mixed reality ‘solution’. Headsets remove users from the physical-social world around them; they are in a separate reality. Apple is seemingly the first to try to solve this in a commercial product, as Chesher (2023) has previously argued.

This type of eye-gaze display (EyeSight, in Apple’s terms) constantly measures user expression and eye movement through multiple sensors and displays a simulated approximation on the outward-facing screen. Your face is constantly mapped, computed, and expressed so that others might see it – or rather, see Apple’s vision of it. Likewise, as passers-by come into range of the Apple Vision Pro’s sensors, Apple’s vision of them is automagically rendered into your experience, whether they like it or not.

Apple’s screen on the outside of its device looks weird. But weirder still is not having these social cues as people walk around with spatial computers strapped to their faces.

Apple’s foray into spatial computing exposes new privacy concerns just under the surface of the functional user experience requirements. Extensive biospatial surveillance captures intimate biometric and environmental data, potentially redefining personal data and social interaction; these are ‘killer’ applications of the technology that have yet to be properly communicated to consumers or explored by research. And Apple engineers its devices such that the research still required revolves around an object of desire that customers and developers will be experimenting with, as we all attempt to make sense of the limits we should put on such technology in defining our social realities.


References

Heemsbergen, L., Bowtell, G., & Vincent, J. (2021). Conceptualising Augmented Reality: From virtual divides to mediated dynamics. Convergence, 27(3), 830-846.

Schot, J., & Geels, F. W. (2008). Strategic niche management and sustainable innovation journeys: theory, findings, research agenda, and policy. Technology analysis & strategic management, 20(5), 537-554. https://doi.org/10.1080/09537320802292651


Portions of this post appeared in The Conversation as “Editing memories, spying on our bodies, normalising weird goggles: Apple’s new Vision Pro has big ambitions”.

Recommended citation:

Heemsbergen, L (January 2024). Apple’s Vision of Ourselves, Critical Augmented and Virtual Reality Research Network (CAVRN). https://cavrn.org/apples-vision-of-ourselves/