The room was noisy, and the faces of our judges seemed distracted as we gathered in front of them to pitch our project. It wasn’t an ideal scene to have a conversation, let alone attempt to explain the problem we’d identified and demonstrate our unique solution.
“We are Attentive AI,” Oscar said as we started, but even I could sense the distraction in our team leader, who struggled to focus while a few dozen people worked and talked all around us. To make matters worse, the microphoned voice of the organizer interrupted us with another layer of loudness.
Were our judges even paying attention? Could they even pay attention? What was their level of attentiveness?
Holding a laptop in one hand to show our slides on the screen, Oscar managed his way through the main aspects of our story: we were helping teachers in the classroom, we had talked to teachers about their difficulties, and we had built a solution that addressed some of those challenges. Hurried, distracted, and 3 minutes and 40 seconds in, Oscar passed the rest of the pitch to me and Ming for the demo.
Our lead developer Ming held the 5G phone with our app on it in front of the judges as I tried to speak loudly enough about what we had built. I barely avoided the need to scream as I quickly explained how we used the phone’s camera to detect faces and facial expressions and translated those features into an attentiveness score. Furthermore, we had a custom system to detect and count when hands were raised. All of this was overlaid onto each face in a real-time camera feed.
Team Attentive AI with our attention score fully powered up shortly before the second round of AT&T 5G Hackathon
As one of the judges pointed to the timer on his phone, I stumbled my way into a short summary of what we do, who we were and a high-level one-liner on the impact our technology, 5G and big data could have on education.
Our five minutes ran out. No questions from the judges. Our time was up.
As one of the first teams to present, it was a rough situation to pitch in. It was an open space with loudness and movement all around. No projectors. We were demo-ing and presenting an app on a smartphone screen.
Everyone struggled to focus. As we walked away, we were somewhat flustered and disappointed by the affair. “They weren’t even paying attention,” someone in the group remarked a few seconds later and a few steps away.
To be honest, the irony wasn’t apparent in the moment. But in retrospect, the situation was almost cosmically ironic for us. We were there in Los Angeles at the AT&T 5G Hackathon with IBM Watson. And we were trying to build a piece of technology that was all about attention and how to track it.
Specifically, we wanted to help teachers track whether students were paying attention, monitor other classroom behaviors, and provide a contextual interface about each student. We had built an app that managed to provide this service to some extent and had created a user interface (UI) and user experience (UX) where the technology’s aim was not to distract the teacher but ideally to help them be and become attentive to students’ needs in the moment and afterwards. The hope was to give teachers super powers of sorts to be attentive to what was important, actionable and human.
As we returned to our work area, a degree of self-doubt emerged:
Would our project go on to the second round of judging? In spite of having a talented, multi-functional team and building some amazing tech, would our loud and distracted pitch screw up our chances? How did we get here and what had happened?
In this post, I want to share our tale from the 2019 AT&T 5G Hackathon in Los Angeles. It was an amazing event with a few hundred attendees, support from Samsung, IBM and others, and some $100,000 in prizes up for grabs. We also were a group of individuals that had met, become a team, and later friends.
Overall, it was an inspiring few days and we did (as you’ll see) manage to pass this initial round and went on to become one of the top teams at the event. We had much to be proud of on the conceptual, customer development, product, design and tech sides for our team. Obviously, you can only get so much done in a weekend, but we managed to not only deal with a unique problem on how to track attention and how to “attentively” provide that tracking data and other information back to users, but we also built a working prototype!
But before we jump to conclusions and a bit of a post-mortem on what we built and who we built it for, let’s take a side step or two to talk about attention, what it is and a few different ways we might track it: through cognitive tests, brain scans and, as we would build, through the direction of our faces and eyes.
EDITOR’S NOTE: This post was originally drafted in September and October 2019 and rediscovered in May 2023. I largely just smoothed out the prose and finished a few incomplete sentences and sections. I am publishing it now as a shareable, fun, learning case study. I have added a note at the end with some more updated thoughts.
What is Attention?
Attention involves a general state of wakefulness and a desire and ability to selectively concentrate on something.
Attention: Where Our Bottom-Up Needs Meet Our Top-Down Intentions
Attention is the cognitive function that allocates cognitive resources to space, time, sense, and task (Posner, 2012): it drives enhancement and suppression of relevant and irrelevant sensory information, respectively, with the goal of processing relevant information more accurately. (Battleday, 2015)
Attention is a behavioral and cognitive process of selectively concentrating on certain information while ignoring other information. Without paying attention, you can’t learn, create or decide anything. That said, much of our mental lives pass in the opposite of attention, in what is termed the wandering mind mental state.
Like many psychological concepts, attention does not represent a singular phenomenon. There is no single observational way to know if someone is attentive nor one test that can fully quantify how strongly we are focused. In fact, excluding the obvious conditions like sleep or unconsciousness where there is a total lack of arousal, attention is a fluid and dynamic state. Attention ebbs and flows.
Scientifically, it is more accurate to say that the state of being attentive broadly encompasses two sides:
- First, it requires being awake and wakeful. This state of general arousal and wakefulness reflects certain primal factors like getting enough sleep, being nourished, not being under too much stress, emotional stability, and several other factors. In order to be attentive in a moment, we can’t be hindered by biological lacks like hunger, fatigue and emotional stressors. To be attentive you need to be awake and ready.
- Second, beyond this basic state, attention is also one of our higher cognitive states and involves a process related to selective concentration and, in certain situations, visual processing. It is literally keeping your mental and/or visual focus on something. As such, it’s sometimes more precisely called selective attention. Other higher cognitive functions like memory and executive functions all depend on our attention too. So, while being attentive requires general wakefulness, it also requires a degree of interest and desire to focus for a time on something specific.
To put this another way, sustained attention or focus depends on a bottom-up state of physical and mental wellness and a top-down mental state of engaged awareness. To be attentive, we need to be capable of paying attention (i.e. our bodies and minds are well), AND we need to want to, strive to, and be pulled into the cognitive state of actually doing it.
Now that we’ve briefly defined what attention is, how can we quantify it?
How to Track Attention Cognitively, Neurologically and Behaviorally
Several methods have been used to track attention, ranging from brain scans that map what an attentive brain looks like to behavioral expressions of being attentive, like cognitive tests and games. Attempts have even been made to interpret attention from our gaze, meaning the direction of our eyes and face during an activity.
Let’s look at each of these one by one.
Cognitive Testing: Can We “Score” Our Attention?
In psychology and neurology, there are a whole host of neurocognitive tests, cognitive assessments and mental disorder screenings. These tests are primarily used to detect mental disorders and diseases, especially amongst the elderly. These neuropsychological screens have also enabled us to tease out various cognitive functions like willpower, memory, planning, inhibition control, reaction time, and even attention.
A while back, I looked at both the question of how to quantify my level of attention and explored a method to improve it with meditation. In “Can Meditation Improve Your Attention? Self-Experiment Into Mindfulness and Cognitive Testing,” I took a deep dive into my own n=1 experiment, various cognitive tests and what they test for, and the effect of meditation on attention. I concluded that meditation did not significantly impact my attention level according to the tests I took, and there were indications that it had a negative effect on how attentive I was.
One important thing to realize about testing for any cognitive function is that it is next to impossible to test for just one function in isolation. We use multiple mental capacities in any task. For example, even the simplest of tests, like reaction time, can’t just measure our speed of reaction without considering other factors like information processing and inhibition control, depending on the test used. Motivation has a strong modulating effect on tests too. Basically, while we might be equipped with a range of mental “instruments,” our brains almost always require several sections of the orchestra for both simple and complex mental operations.
Put another way, each test might be a primary indicator of one cognitive function, but it often reveals information on several others at the same time too. For example, most tests of executive function (like the classic Stroop test) ask you to either do or not do something while you are presented with conflicting inputs. You are told to select items with the word “red” but not select those colored in red. Variations of this test not only require executive control but also demand our attention and visual processing.
So, can these tests “score” our attention?
On one level they can. Tests of cognitive functions are used to study various diseases and the effect of different drugs and treatments on our mental capacities. For example, cognitive tests are used to say how well people do under sleep deprivation or how a drug might impede or improve reaction time, memory or planning.
Some common tests of attention include:
- Psychomotor vigilance task (PVT) - A computer- or mobile-based measure of a person’s reaction to a specific change in environment (like a dot appearing). It tests vigilance and displays time-on-task effect depending on duration of test.
- Digit span, especially digits backwards - Primarily a memory test, it also tests one’s ability to pay attention to numbers before the recall stage.
- Paced Auditory Serial Addition Test (PASAT).
- The Stroop Test of response inhibition.
- Timed tests involving letter or star cancellation.
These are just a few. Many video and card games require a good deal of attention to do well, since unless you are attentive to various outputs, you risk making a mistake and losing. Additionally, if you are trying to examine the effects of nootropics or even caffeine, tests of attention can be a good way to quantify the effects, since stimulants largely act to raise your baseline state of arousal and thus enable heightened attention.
Unfortunately, while cognitive tests might be a great way to study attention, they are ill-suited to the real-time classroom needs of teachers. They are also prone to practice effects.
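To make the idea of “scoring” a bit more concrete, here is a minimal Python sketch of how reaction times from a PVT-style session might be summarized. The 500 ms lapse threshold is a commonly used convention rather than something from my own testing, and the function name and sample data are hypothetical.

```python
from statistics import mean, median

LAPSE_THRESHOLD_MS = 500  # common PVT convention; treat as an assumption


def summarize_pvt(reaction_times_ms):
    """Summarize one PVT-style session from a list of reaction times (ms)."""
    if not reaction_times_ms:
        raise ValueError("need at least one reaction time")
    # A "lapse" is a response slow enough to suggest attention drifted.
    lapses = sum(1 for rt in reaction_times_ms if rt >= LAPSE_THRESHOLD_MS)
    return {
        "trials": len(reaction_times_ms),
        "mean_rt_ms": mean(reaction_times_ms),
        "median_rt_ms": median(reaction_times_ms),
        "lapses": lapses,
        "lapse_rate": lapses / len(reaction_times_ms),
    }


# Hypothetical session: mostly fast responses with two slow lapses.
session = [250, 300, 280, 620, 310, 540]
summary = summarize_pvt(session)
print(summary)
```

Comparing summaries like this across sessions (rested versus sleep-deprived, with or without caffeine) is roughly how such tests get used, though repeated sessions also invite practice effects.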
Neurological Markers: Can We Identify an Attentive Brain?
In view of the fact that attention requires both a baseline wellness state and higher cognitive functions, another way we can track attention is through brain scans.
Neuroimaging or brain imaging describes the use of various direct and indirect methods to get a picture of the brain and nervous system. The most common techniques include: Functional Magnetic Resonance Imaging (fMRI), Computed Tomography (CT) Scan, Positron Emission Tomography (PET), Magnetoencephalography (MEG), and Electroencephalography (EEG). There are pros and cons to each of these methods. Some provide greater detail of which areas of the brain are most active but are slower and involve radioactive chemicals. Arguably the cheapest and easiest way to get real-time information on the brain is through EEG, or measuring our brain waves in different parts of the brain.
EEG is a noninvasive test used to measure the electrical activity in the brain. The technique dates to 1924, when Hans Berger showed that one could measure the brain’s electrical activity by placing electrodes on the scalp and amplifying the signal. Those sensors are used to measure patterns in the ionic current within neurons. According to the frequency range of those signals, we are able to identify brain waves. The five most common brainwaves are Delta, Theta, Alpha, Beta, and Gamma. By looking at the location and occurrences of certain brainwaves, researchers have been able to identify a range of patterns and signatures for certain mental states.
Compared to other neuroimaging techniques, EEG is a comparatively cheap technology, mobile and easy to translate, and it offers millisecond-range temporal resolution. It doesn’t require radioactive chemicals either. Its main limitation is poor spatial resolution, meaning it isn’t very precise in detecting location and sources. There are both professional and medical-grade EEGs and a handful of consumer options. Professional EEGs use the 10–20 electrode placement system (10–20 System), while consumer EEGs like the Muse, which is used for neurofeedback meditation, only use a few sensor locations and bands.
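As a rough illustration of what “identifying brain waves by frequency range” means in practice, the Python sketch below splits a signal into the five named bands and sums the power in each. The band edges are conventional approximations that vary between sources, the signal is synthetic rather than a real recording, and the plain discrete Fourier transform is used only to keep the example dependency-free; a real pipeline would likely use an FFT library and proper filtering.

```python
import cmath
import math

# Conventional band edges in Hz; exact boundaries vary between sources.
BANDS = {
    "delta": (0.5, 4.0),
    "theta": (4.0, 8.0),
    "alpha": (8.0, 13.0),
    "beta": (13.0, 30.0),
    "gamma": (30.0, 45.0),
}


def band_powers(samples, sample_rate_hz):
    """Return spectral power per band using a plain discrete Fourier transform."""
    n = len(samples)
    powers = {name: 0.0 for name in BANDS}
    # Only bins below the Nyquist frequency carry unique information.
    for k in range(1, n // 2):
        freq = k * sample_rate_hz / n
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples)
        )
        power = abs(coeff) ** 2 / n
        for name, (low, high) in BANDS.items():
            if low <= freq < high:
                powers[name] += power
    return powers


# Synthetic one-second signal: a strong 10 Hz (alpha) wave plus a weak 20 Hz (beta) wave.
rate = 128
signal = [
    math.sin(2 * math.pi * 10 * t / rate) + 0.3 * math.sin(2 * math.pi * 20 * t / rate)
    for t in range(rate)
]
powers = band_powers(signal, rate)
dominant = max(powers, key=powers.get)
print(dominant, powers)
```

With this synthetic input the alpha band dominates, which is the kind of pattern the studies discussed here look for in particular scalp regions.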
Returning to the question of attention, can neuroimaging and brain scans identify and measure our attention?
Since attention, of course, has a neurological basis, multiple studies have used neuroimaging to show what happens in the brain during sustained and visual attention. Specifically, using more advanced brain imaging techniques, researchers have linked attention to the reticular activating system (brain stem and thalamic nuclei) and to multimodal association areas in the prefrontal and parietal cortices, with a right-hemisphere bias (Hodges, 2017). EEG-derived metrics of attention include alpha decreases and theta increases in the fronto-central region.
Since consumer EEG devices like Muse and MindWave by NeuroSky are specifically designed to measure the pre-frontal cortex, a number of experiments have been done on this region as it relates to meditation, attention and other mental activities.
Because human emotions, mental states, and levels of attentiveness are controlled by the cerebral cortex in the forehead, detecting the EEG signals produced in this area of the brain is a viable method for determining whether students are inattentive. (Liu, 2013)
One study (Liu, 2013), which used a consumer EEG and some advanced machine learning techniques for classification, provides several key points worth considering about brainwave activity in the pre-frontal cortex around attention. They found:
- When the subjects were inattentive (often due to sleepiness) there was greater Theta activity.
- When the subjects were in a relaxed state, Alpha activity was slightly higher.
- When the subjects were attentive, Beta activity was greater.
So, when someone is inattentive, Theta activity increases in the front part of the brain. Conversely, when someone is paying attention, Beta activity increases.
In view of these clues, it seems like EEG and brainwaves offer a powerful method for tracking attention. There are even a few companies offering wearables for tracking student attention in this way. Obviously privacy and “creepy tech” concerns abound with anything around brain monitoring, but if you are looking for a very direct method of tracking attention, the attentive brain is one of the best.
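Those three findings can be turned into a toy rule of thumb. The Python sketch below labels a window of frontal band powers by whichever of Theta, Alpha or Beta holds the largest share. This is only an illustration of the pattern; it is not the classifier from the Liu study (which used machine learning), and the function name and sample values are made up.

```python
def label_attention(theta, alpha, beta):
    """Label a window of frontal EEG band powers by the dominant band.

    Illustrative rule only: theta-dominant -> inattentive,
    alpha-dominant -> relaxed, beta-dominant -> attentive.
    Returns the label and that band's share of the total power.
    """
    total = theta + alpha + beta
    if total <= 0:
        raise ValueError("band powers must sum to a positive value")
    shares = {
        "inattentive": theta / total,
        "relaxed": alpha / total,
        "attentive": beta / total,
    }
    label = max(shares, key=shares.get)
    return label, shares[label]


# Hypothetical band powers for three short windows of recording.
windows = [
    {"theta": 6.0, "alpha": 2.0, "beta": 2.0},
    {"theta": 2.0, "alpha": 5.0, "beta": 3.0},
    {"theta": 1.0, "alpha": 2.0, "beta": 7.0},
]
labels = [label_attention(**w)[0] for w in windows]
print(labels)
```

A real system would need per-person calibration and far more careful signal handling, which is part of why the study leaned on trained classifiers instead of a fixed rule.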
Attentive Eyes: Gaze Tracking for Measuring Attentive Behavior
Cognitive tests are easy to do and sensitive to changes around attention. Brain scans are perhaps the most robust for identifying when we are attentive or not. But both have their limitations. Cognitive tests are disruptive, take time and have practice effects. Brain scans require a lot of technology and can be uncomfortable and “creepy” to students and parents. Fortunately, perhaps, there are behavioral ways to measure our attention, and it’s all in our eyes.
Gaze tracking is a technique used to analyze and understand a person’s visual attention and focus. It involves tracking the movement and direction of a person’s eyes and studying where they fixate and how long they spend looking at specific objects or areas of interest. By monitoring eye movements, researchers and developers can gain insights into the attentive behavior of individuals, including their level of engagement, interest, and cognitive processes. In short, by measuring where our eyes go and what they do, we can intuit the underlying cognitive and neurological states of attention and focus.
The primary technology behind gaze tracking is an eye-tracking system, which typically consists of a camera or sensors that capture the eye movements of the person being observed. Historically these were massive, lab-based systems, often involving computer screens, multiple cameras or even virtual reality headsets. The captured eye movement data is then processed and analyzed to extract meaningful information about the person’s attention patterns.
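One common way to turn raw gaze data into something meaningful is to add up how long the eyes rest on each area of interest. The Python sketch below shows that idea under simple assumptions: gaze points arrive at a fixed sampling interval, and areas of interest are plain rectangles. The region names, coordinates and function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A rectangular area of interest in screen or scene coordinates."""
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, x, y):
        return self.left <= x < self.right and self.top <= y < self.bottom


def dwell_times(gaze_samples, regions, sample_interval_s):
    """Sum how long the gaze rested in each region.

    gaze_samples is a list of (x, y) points captured at a fixed interval.
    Samples outside every region are counted under "elsewhere".
    """
    totals = {region.name: 0.0 for region in regions}
    totals["elsewhere"] = 0.0
    for x, y in gaze_samples:
        hit = next((r.name for r in regions if r.contains(x, y)), "elsewhere")
        totals[hit] += sample_interval_s
    return totals


regions = [
    Region("whiteboard", 0, 0, 100, 50),
    Region("notebook", 0, 50, 100, 100),
]
samples = [(10, 10), (20, 30), (50, 60), (150, 20), (40, 40)]
totals = dwell_times(samples, regions, sample_interval_s=0.5)
print(totals)
```

The share of time spent “elsewhere” is the crude signal of interest here; research-grade systems go further and separate fixations from rapid eye movements before summing.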
Gaze tracking has applications in various fields, including psychology, human-computer interaction, market research, usability testing, and neuroscience. For example, in User Experience (UX) and Human-Computer Interaction (HCI), gaze tracking has been used to assess how users interact with digital interfaces, websites, or software applications. It provides valuable insights into which elements of a design attract the most attention, what areas might cause confusion or frustration, and how to optimize user interfaces for better user experiences. Gaze tracking has also proved popular in market research and advertising as a way to better understand how consumers react to advertisements or product displays.
For our group at this hackathon, we were interested in exploring gaze tracking using a mobile device’s camera and how it might be used in an educational context. By understanding where students focus their attention and if they are paying attention at all, how might we help educators improve their teaching? How might we guide a new teacher in honing their instructional techniques and classroom management? How might we enable even a polished teacher to enhance learning outcomes and notice areas to become a better teacher?
Now that we’ve taken a bit of a conceptual dive into attention and various ways to track it, let’s look at how our weekend went, what we built and what we learned.
What We Built: Attention Gaze Tracking for Education with a Mobile App
Many on our team were working to understand how IoT will revolutionize design and manufacturing. When brainstorming for this hackathon, we sought out innovation by borrowing ideas from those industries and applying them to a problem that we were passionate about: Education. We set out to provide the benefits of IoT data, cleansed through the lens of AI, to provide teachers with the real-time, relevant, and contextual information needed to serve their students. By further discussing with teachers, we learned more about their teaching challenges and refined our approach in order to incorporate key metrics (like attention) and to provide actionable contextual information about the students (academic history, sleep, socioeconomics, etc). And we focused on designing a seamless, in-situation user interface.
Over the course of a single weekend, we built an AI-empowered, AR-enabled app to detect individual and aggregate student attention. The aim of our app was to empower teachers, classrooms, and educators, enabling them to know and track more about what’s going on in the classroom. By extension, we might be able to integrate additional sensor and wearable data into a tool for other teachers, counselors and schools.
We detect how attentive students are in real time in the classroom and throughout the day. We provide a seamless user experience showing if a student is paying attention, how often students engage (like raising their hands), and display other actionable background information about individual students (hunger, happiness, sleep, stress, etc).
We collect video and pictures from the phone camera, detect certain facial features on the device, and use AI/ML for advanced behavior detection.
We believe technology can be more attentive to what’s happening in the classroom and provide background context to each and every student. We are Attentive AI, and we are on a mission to help teachers, classrooms, and schools bring data-driven learning awareness anywhere in the world with a smartphone and 5G.
The app we created uses the device camera to collect and identify facial information (gaze, face posture, eye direction, openness of eyes), all scientifically validated inputs for detecting attention. We use these inputs to calculate a real-time attention score. We then map this metric into a real-time seamless AR overlay for each person and provide additional contextual and actionable items about individual students (hunger, happiness, sleep, stress, etc.).
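For a sense of how facial inputs like these could be combined into a single number, here is a minimal Python sketch. It is not the formula from our app (which was written for React Native); the field names, the 45-degree cutoff and the 60/40 weighting are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class FaceReading:
    """Hypothetical per-face measurements for one camera frame."""
    yaw_deg: float          # head turned left/right; 0 means facing the camera
    pitch_deg: float        # head tilted up/down; 0 means level
    left_eye_open: float    # probability in [0, 1]
    right_eye_open: float   # probability in [0, 1]


MAX_ANGLE_DEG = 45.0  # assumed: beyond this the face counts as fully turned away


def attention_score(face):
    """Combine head direction and eye openness into a 0-100 score."""
    # Head direction: 1.0 when facing the camera, falling to 0.0 at MAX_ANGLE_DEG.
    yaw_term = max(0.0, 1.0 - abs(face.yaw_deg) / MAX_ANGLE_DEG)
    pitch_term = max(0.0, 1.0 - abs(face.pitch_deg) / MAX_ANGLE_DEG)
    eyes_term = (face.left_eye_open + face.right_eye_open) / 2
    # Assumed weights: head direction counts for 60%, eye openness for 40%.
    score = 0.6 * (yaw_term * pitch_term) + 0.4 * eyes_term
    return round(100 * score)


def class_average(faces):
    """Aggregate score across every face detected in the frame."""
    return sum(attention_score(f) for f in faces) / len(faces)


focused = FaceReading(yaw_deg=0, pitch_deg=0, left_eye_open=1.0, right_eye_open=1.0)
looking_away = FaceReading(yaw_deg=45, pitch_deg=0, left_eye_open=1.0, right_eye_open=1.0)
dozing = FaceReading(yaw_deg=0, pitch_deg=0, left_eye_open=0.0, right_eye_open=0.0)
scores = [attention_score(f) for f in (focused, looking_away, dozing)]
print(scores, class_average([focused, looking_away]))
```

A per-face score like this is what would be drawn over each face in the overlay, while the class average gives the teacher a single aggregate reading.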
We used React Native and Expo for developing the cross-platform app for Android (iOS coming soon). We leveraged the camera libraries and facial detection libraries for the main facial information.
Using IBM Watson Visual Recognition, we developed a custom classifier for detecting unique classroom behaviors. Specifically, we collected pictures and trained our classifier for object detection. We created a backend on IBM Cloud. 5G allows us to quickly transmit images and video to the backend and seamlessly detect and share these metrics back into the app.
Check out a demo of our app here:
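One practical detail with any per-frame classifier is turning a stream of detections into a count of events, such as the number of times a hand was raised. The Python sketch below shows one simple way to do that, assuming the classifier output has already been reduced to one true/false value per frame. It does not call the Watson API, and the function name and the three-frame minimum are hypothetical choices.

```python
def count_hand_raises(frame_detections, min_frames=3):
    """Count distinct hand-raise events in a sequence of per-frame detections.

    frame_detections is a list of booleans, one per frame, where True means a
    classifier reported a raised hand. A raise is counted once, at the moment
    the hand has been seen for min_frames consecutive frames, which filters
    out single-frame false positives and avoids counting one raise many times.
    """
    raises = 0
    run_length = 0
    for detected in frame_detections:
        if detected:
            run_length += 1
            if run_length == min_frames:
                raises += 1
        else:
            run_length = 0
    return raises


# Hypothetical stream: one sustained raise, one single-frame blip, one more raise.
frames = [False, True, True, True, True, False, True, False, True, True, True]
raise_count = count_hand_raises(frames)
print(raise_count)
```

The minimum run length trades responsiveness for robustness: a higher value ignores more noise but also misses very brief raises.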
Conclusion (Written May 2023)
Re-reading this write-up nearly three and a half years later, it’s amazing to recall all of the pieces we put together in a single hackathon. It’s a good reminder of what a group of creative technologists can do in such a short span of time. We integrated multiple pieces of technology, found an interesting vision for using the tech, and we built it. We didn’t win the weekend, but we did take a special prize for an integration with the IBM Watson API.
Following the weekend we started a chat channel to discuss next steps. Unfortunately, work challenges made it hard for us to find time to continue this weekend project. Additionally, we got a good deal of pushback from additional testers about the creepiness of the tech and such a strong form of tracking of students. Interestingly, the one area where we did hear some positive vibes was turning the camera around: instead of tracking the students, tracking the teacher. What if a tool like this could be used to help a teacher better see themselves and improve how they instruct and manage a classroom? In view of these challenges we dropped the project. The timing wasn’t right.
Ironically, if we had continued on this project, somehow gotten it more into a working version, and developed a marketing and business side, we might have faced a much bigger opportunity months later when, by March 2020, COVID-19 forced lockdowns, remote learning and working via Zoom. There are nowadays several companies providing AR and tracking solutions that enable you to monitor your Zoom meetings or even get feedback on your speaking presentation. It’s curious to imagine where we might have taken AttentiveAI from then to now.
Personally, I think it would have been interesting to continue to center our work on focus and being attentive. Various research shows that a person’s ability to maintain focus and pay attention is dependent on these two factors:
- Are their basic needs met? Are they hungry? Are they tired? Are they emotionally secure and content?
- Are the task and information in front of them engaging, enjoyable, challenging and/or appropriate for them? Does the presentation of that information instill these sentiments?
Assuming our tech could have provided some insight into these states, how might a teacher, leader or even meeting organizer change how they teach, lead or drive a meeting?
References
- Turner, D. C., Robbins, T. W., Clark, L., Aron, A. R., Dowson, J., & Sahakian, B. J. (2003). Cognitive enhancing effects of modafinil in healthy volunteers. Psychopharmacology, 165(3), 260-269.
- Hodges, J. R. (2017). Cognitive Assessment for Clinicians. Oxford University Press.
- Cooper, C. (2018). Psychological Testing. Routledge.
- Battleday, R. M., & Brem, A.-K. (2015). Modafinil for cognitive neuroenhancement in healthy non-sleep-deprived subjects: a systematic review. European Neuropsychopharmacology, 25(11), 1865-1881.
- Chiesa, A., Calati, R., & Serretti, A. (2011). Does mindfulness training improve cognitive abilities? A systematic review of neuropsychological findings. Clinical psychology review, 31(3), 449-464.
- Posner, M. I. (1980). Orienting of attention. Quarterly journal of experimental psychology, 32(1), 3-25.
- Liu, N.-H., Chiang, C.-Y., & Chu, H.-C. (2013). Recognizing the degree of human attention using EEG signals from mobile sensors. Sensors, 13(8), 10273-10286.
- Rebolledo-Mendez, G., Dunwell, I., Martínez-Mirón, E. A., Vargas-Cerdán, M. D., De Freitas, S., Liarokapis, F. et al. (2009). Assessing neurosky’s usability to detect attention levels in an assessment exercise.
- Lopez-Gordo, M. A., Pelayo, F., Fernández, E., & Padilla, P. (2015). Phase-shift keying of EEG signals: Application to detect attention in multitalker scenarios. Signal Processing, 117, 165-173.
- Cao, Z., Chuang, C.-H., King, J.-K., & Lin, C.-T. (2019). Multi-channel EEG recordings during a sustained-attention driving task. Scientific data, 6.
- Conati, C., & Merten, C. (2007). Eye-tracking for user modeling in exploratory learning environments: An empirical evaluation. Knowledge-Based Systems, 20(6), 557-574.
- Rebolledo-Mendez, G., & De Freitas, S. (2008). Attention modeling using inputs from a Brain Computer Interface and user-generated data in Second Life.
- Haynes, J.-D., & Rees, G. (2006). Neuroimaging: decoding mental states from brain activity in humans. Nature Reviews Neuroscience, 7, 523.
AIDA (AI Disclosure Acknowledgement): The following written content was written by me without any AI systems.