Articles by William Xu
Imagine a home surveillance camera monitoring an elderly parent and anticipating potential concerns while respecting their privacy. Imagine another camera predicting a home burglary based on suspicious behavior, allowing time to notify the homeowner, who can in turn notify the police before the event occurs. Or imagine an entire network of cameras working together to keep an eye on neighborhood safety.

Artificial Intelligence vision chips

A new generation of artificial intelligence (AI) vision chips is pushing advanced capabilities such as behavior analysis and higher-level security to the edge (directly on devices) for a customisable user experience, one that rivals the abilities of the consumer electronics devices we use every day. Once considered nothing more than “the eyes” of a security system, home monitoring cameras of 2020 will leverage AI-vision processors for high-performance computer vision at low power consumption and affordable cost, at the edge, for greater privacy and ease of use, as well as to enable behavior analysis for predictive and preemptive monitoring.

Advanced home monitoring cameras

With this shift, camera makers and home monitoring service providers alike will be able to develop new edge-based use cases for home monitoring and enable consumers to customise devices to meet their individual needs. The result will be increased user engagement with home monitoring devices, mirroring that of cellphones and smart watches and creating an overlap between the home monitoring and consumer electronics markets. A quick step back reminds us that accomplishing these goals would have been cost prohibitive just a couple of years ago. Face recognition, behavior analysis, intelligent analytics, and decision-making at this level were extremely expensive to perform in the cloud.
Additionally, the lag time associated with sending data to faraway servers for decoding and processing made it impossible to achieve real-time results.

Cloud-based home security devices

The constraints of cloud processing certainly have not held the industry back, however. Home monitoring, a market just seven years young, has become a ubiquitous category of home security devices. Consumers can choose to install a single camera or doorbell that sends alerts to their phone, a family of devices with a monthly manufacturer’s plan, or a high-end professional monitoring solution. While the majority of these devices do indeed rely on the cloud for processing, camera makers have been pushing for edge-based processing since around 2016. For them, the benefit has always been clear: the opportunity to perform intelligent analytics in real time on the device. But until now, the balance between computer vision performance and power consumption was lacking, and camera companies weren’t able to make the leap. So instead, they have focused on improving designs, and the cloud-centric model has prevailed.

Hybrid security systems

Even with improvements, false alerts (like tree branches swaying in the wind or cats walking past a front door) result in unnecessary notifications and video recording. Cameras remain active, which, in the case of battery-powered cameras, means using up valuable battery life. Hybrid models do exist. Typically, they provide rudimentary motion detection on the camera itself and then send video to the cloud for decoding and analysis to suppress false alerts. Hybrids provide higher-level results for things like people and cars, but their approach comes at a cost for both the consumer and the manufacturer.
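The edge-side gating that hybrid cameras perform can be sketched in a few lines of Python. This is a minimal illustration only; the frame representation (lists of grayscale rows) and the threshold value are assumptions for the sketch, not any camera maker's actual implementation:

```python
def motion_score(prev_frame, curr_frame):
    """Mean absolute pixel difference between two grayscale frames,
    given as equal-sized lists of rows of 0-255 intensity values."""
    total = 0
    count = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for p, c in zip(prev_row, curr_row):
            total += abs(p - c)
            count += 1
    return total / count

def should_upload(prev_frame, curr_frame, threshold=10.0):
    """Edge-side gate: wake the radio and send video to the cloud
    only when on-camera motion exceeds the threshold."""
    return motion_score(prev_frame, curr_frame) > threshold
```

A static scene stays below the threshold and the camera stays quiet, which is exactly where battery-powered designs save power; real products use far more robust motion models than raw frame differencing.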
Advanced cloud analytics

Advanced cloud analytics are more expensive than newly possible edge-based alternatives, and consumers have to pay for subscriptions. In addition, because of processing delays and other issues, things like rain or lighting changes (or even bugs on the camera) can still trigger unnecessary alerts. And the more alerts a user receives, the more they tend to ignore them; there are simply too many. In fact, it is estimated that users pay attention to only 5% of their notifications. This means that when a package is stolen or a car is burglarised, users often miss the real-time notification, only to find out about the incident after the fact. All of this will soon change with AI-based behavior analysis, predictive security, and real-time, meaningful alerts.

Predictive monitoring while safeguarding user privacy

These days, consumers are putting more emphasis on privacy and have legitimate concerns about being recorded while in their homes. Soon, with AI advancements at the chip level, families will be able to select user apps that provide monitoring without the need to stream video to a company server, or they’ll have access to apps that record activity but obscure faces. Devices will have the ability to send alerts only according to specific criteria. If, for example, an elderly parent being monitored seems particularly unsteady one day or seems especially inactive, an application could alert the responsible family member and suggest that they check in. By analysing the elderly parent’s behavior, the application could also predict a potential fall and trigger an audio alert for both the person and the family.

AI-based behavior analysis

The ability to analyse massive amounts of data locally and identify trends or perform searches is a key advantage of AI at the edge, for both individuals and neighborhoods.
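A behavior rule like the inactivity check described above can be expressed as a small function. The three-hour quiet limit and the event format here are hypothetical values chosen for illustration, not a real product's policy:

```python
from datetime import datetime, timedelta

def inactivity_alert(motion_events, now, quiet_limit=timedelta(hours=3)):
    """Return True when the camera should alert the family: no motion
    has been observed within the quiet_limit window.
    motion_events: datetimes at which the device detected movement."""
    if not motion_events:
        return True  # no movement observed at all
    return now - max(motion_events) > quiet_limit
```

Because the rule runs on the device, the raw video never needs to leave the home; only the alert does.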
For example, an individual might be curious as to what animal is wreaking havoc in their backyard every night. In this case, they could download a “small animal detector” app to their camera, which would trigger an alert when a critter enters their yard. The animal could be scared off via an alarm and, armed with video proof, animal control would have useful data for setting a trap.

Edge cameras

A newly emerging category of “neighborhood watch” applications is already connecting neighbors for significantly improved monitoring and safety. As edge cameras become more commonplace, this category will become increasingly effective. The idea is that if, for example, one neighbor captures a package thief, then the entire network of neighbors will receive a notification and a synopsis video showing the theft. Or if, say, there is a rash of car break-ins and one neighbor captures video of a red sedan casing their home around the time of a recent incident, an AI vision-based camera could be queried for helpful information.

Residential monitoring and security

The camera could be asked for a summary of the dates and times that it has recorded that particular red car. A case could be made if incident times match those of the vehicle’s recent appearances in the neighborhood. Even better, if that particular red car were to reappear and seem (by AI behavior analysis) to be suspicious, alerts could be sent proactively to networked residents and police could be notified immediately. Home monitoring in 2020 will bring positive change for users when it comes to monitoring and security, but it will also bring some fun. Consumers will, for example, be able to download apps that do things like monitor pet activity. They might query their device for a summary of their pet’s “unusual activity” and then use those clips to create cute, shareable videos. Who doesn’t love a video of a dog dragging a toilet paper roll around the house?
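The kind of query described above, summarising when a particular vehicle appeared and checking whether incidents line up with those appearances, could be sketched as follows. The detection-log format, the labels, and the 30-minute matching window are illustrative assumptions:

```python
from datetime import datetime, timedelta

def appearances(detection_log, label):
    """Summarise when a given object (e.g. the red sedan) was recorded.
    detection_log: list of (timestamp, label) tuples kept on the device."""
    return sorted(ts for ts, seen in detection_log if seen == label)

def matches_incidents(car_times, incident_times, window=timedelta(minutes=30)):
    """True when every incident falls within `window` of some recorded
    appearance of the vehicle, i.e. the 'case' described above holds."""
    return all(
        any(abs(incident - seen) <= window for seen in car_times)
        for incident in incident_times
    )
```

An edge camera can answer such queries from its local detection log without ever uploading the underlying footage.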
AI at the Edge for home access control

Home access control via biometrics is one of many new edge-based use cases that will bring convenience to home monitoring, and it’s an application that is expected to take off soon. With smart biometrics, cameras will be able to recognise residents and then, if desired, automatically unlock their smart front door locks, eliminating the need for keys. And if, for example, an unauthorised person tries to trick the system by presenting a photograph of a registered family member’s face, the camera could use “3D liveness detection” to spot the fake and deny access. With these and other advances, professional monitoring service providers will have the opportunity to bring a new generation of access control panels to market.

Leveraging computer vision and deep neural networks

Ultimately, what camera makers strive for is customer engagement and customer loyalty. These new use cases, thanks to AI at the edge, will make home monitoring devices more useful and more engaging to consumers. Leveraging computer vision and deep neural networks, new cameras will be able to filter out and block false alerts, predict incidents, and send real-time notifications only when there is something that the consumer is truly interested in seeing. AI and computer vision at the edge will enable a new generation of cameras that provide not only a higher level of security but also fundamentally change the way consumers rely on and interact with their home monitoring devices.
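The unlock decision described above, face matching combined with 3D liveness detection, reduces to a simple conjunction. The similarity scale and the 0.9 threshold here are hypothetical values for the sketch:

```python
def grant_access(match_score, liveness_ok, match_threshold=0.9):
    """Unlock only when the face matches a registered resident AND 3D
    liveness detection confirms a real person rather than a photograph.
    match_score: similarity in [0, 1] from a recogniser (hypothetical scale)."""
    return bool(liveness_ok) and match_score >= match_threshold
```

The photograph attack fails on the liveness check even when the face itself matches perfectly, which is the whole point of requiring both conditions.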
Facial recognition is becoming more popular in newer systems for access control, a shift that began before the pandemic and has intensified with a market move toward “touchless” systems. A new facial recognition platform is emerging that responds to the access control industry’s increased interest by extending the concept to a higher level of technology. At the core of the new system is high-performance, true-3D sensing with facial depth map processing at low power consumption, which enriches the capabilities of small-footprint access control devices. New proficiencies include anti-spoofing (preventing the use of a 2D photo of an authorised user to gain entry) and anti-tailgating (preventing an unauthorised person from gaining entry by following an authorised user), both in real time and in challenging lighting conditions. The system uses “true 3D sensing,” which incorporates single-camera structured-light 3D sensing, as opposed to dual-camera depth sensing or IR video imaging-based approaches.

AI vision processing and 3D sensing technologies

The new “Janus reference design” incorporates AI vision processing, 3D sensing technologies, and RGB-IR CMOS image sensor technologies from Ambarella, Lumentum, and ON Semiconductor. Specifically, Lumentum’s high-reliability, high-density VCSEL projector for 3D sensing combines with ON Semiconductor’s RGB-IR CMOS image sensor and Ambarella’s powerful AI vision system on chip (SoC). The Ambarella, Lumentum, and ON Semiconductor engineering teams worked together to incorporate their complementary technologies into the reference design. A reference design offers OEM product and engineering teams a fully functional engineering reference implementation that they can use as the basis for their own product. Teams will often customise a reference design with their choice of third-party hardware components to fit their product specifications and positioning.
They might also integrate their own software, algorithms, and back-end system integrations. The advantage of this approach is that the manufacturer can get to market quickly with a next-generation product that emphasises their core strengths.

3D depth information for facial recognition

Generally, it takes between nine months and a year for a manufacturer to get to market using a fully functional reference design, such as the one developed jointly by Ambarella, Lumentum, and ON Semiconductor. The Janus platform leverages 3D depth information generated via structured light for facial recognition with a greater than 99% recognition accuracy rate. Traditional 2D-based solutions are prone to false acceptance and presentation attacks, whereas 3D sensing delivers advanced security, just as mobile phones use true-depth cameras for facial recognition. 3D facial recognition also significantly reduces the gender and ethnic biases demonstrated by some 2D facial recognition solutions. The Janus reference design is also aimed at future smart locks for enterprise and residential use: its single-camera 3D sensing solution will help OEMs overcome cost and manufacturability barriers, while its ultra-low-power edge AI capability can extend battery life, which in turn reduces maintenance costs.

Video security and access control

Ambarella sees touchless access control, as well as the convergence of video security and access control, as the mega-trends driving industry innovation and growth, using video, computer vision, and 3D sensing not only to address safety and security but also to improve the user experience and public health, says William Xu, director of marketing for Ambarella. The convergence of video security cameras and access control readers has been widely discussed by leading access control OEMs. In many cases, they already integrate video security cameras, readers, door controllers, cloud access, and the like.
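Facial recognition systems of this kind typically reduce each enrolled face (or facial depth map) to an embedding vector and compare a live capture against the registered gallery. A minimal sketch of that comparison, using made-up two-dimensional embeddings and an illustrative 0.8 threshold rather than any vendor's actual pipeline:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify(probe, gallery, threshold=0.8):
    """Return the registered user whose embedding best matches the probe,
    or None when no similarity reaches the threshold (values illustrative)."""
    best_name, best_sim = None, threshold
    for name, embedding in gallery.items():
        sim = cosine_similarity(probe, embedding)
        if sim >= best_sim:
            best_name, best_sim = name, sim
    return best_name
```

Real embeddings have hundreds of dimensions, and a production system tunes the threshold to trade false acceptance against false rejection; the structure of the lookup is the same.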
In most enterprise installations, one would typically find security cameras installed wherever there are access control readers. Combining the two devices significantly reduces maintenance costs and system complexity. “In comparison to fingerprint or other contact-based approaches, Janus-based access control is touchless—requiring no physical contact with authentication hardware such as fingerprint sensors or keypads—reducing infection risk while enabling a seamless experience,” says Mr. Xu. “The Janus platform provides true 3D depth information to prevent unauthorised individuals from mimicking legitimate users, and the advanced embedded AI processor enables tracking and anti-tailgating algorithms. Janus-based devices perform well in challenging lighting conditions and they are capable of authenticating multiple users simultaneously, with imperceptible latency.”

Access control and public health

What was once purely a security challenge (namely, how to prevent unauthorised entry into a restricted area) has evolved into a public health challenge as well. Many traditional access control methods, from number pads to fingerprint readers, require touch in order to function, and if the current global pandemic has made one thing evident, it’s that minimising physical contact between users and surfaces is vital to community well-being. Janus was originally designed to facilitate the next generation of facial-recognition-based access control readers, enabling 3D sensing and high recognition speed for seamless authentication. COVID-19 has accelerated industry-wide research, development, and timelines for Janus-based solutions, says Mr. Xu. Deep learning and artificial intelligence drive all the new capabilities offered in Janus, capabilities that are only possible due to the platform’s high computational horsepower. The core deep learning and AI capabilities of Janus enable a wide range of advanced features only possible with an embedded vision SoC, says Mr. Xu.
All are performed in real time, even when multiple users are being processed simultaneously. These include the extraction and comparison of facial depth maps with those registered in the system; 3D liveness detection, ensuring that the system can distinguish between real users and photo or video playback attacks; anti-tailgating, which relies on computer vision algorithms to detect and track when an unauthorised person follows a legitimate user inside; face mask detection; and people counting.

VCSEL technology

According to Ken Huang, Director of Product Line Management, 3D Sensing, Lumentum: “Lumentum’s VCSEL technology is one of the Janus design’s core strengths and differentiators. The process begins when Lumentum’s high-resolution dot projector projects thousands of dots onto the scene to create a unique 3D depth pattern of a user’s face. Most traditional biometric facial security systems rely on 2D images of users—simple photographs—which reduces authentication accuracy. In contrast, the 3D depth map generated by Lumentum’s technology provides the foundation of a more accurate, more secure, and more intelligent system overall. In addition, Lumentum’s VCSEL solutions incorporate a Class 1, eye-safe laser with zero field failures to date.”

Adds Paige Peng, Product Marketing Manager, Commercial Sensing Division, ON Semiconductor: “If we think of Ambarella’s CV25 as the brain of the Janus design, the AR0237IR from ON Semiconductor is the eye. The AR0237IR image sensor captures the information, and the CV25 processes it. Other face recognition systems use two ‘eyes’ – one to recognise RGB patterns to generate the viewing image stream, and another IR module to detect liveness in motion. The Janus solution leverages a single ‘eye’—the AR0237IR—to obtain both visible and infrared images for depth sensing and advanced algorithms such as anti-spoofing and 3D recognition.
AR0237IR also provides good sensitivity in various lighting conditions and supports high-dynamic-range (HDR) functions.”

The single-camera 3D sensing solution for access control operates in three seamless steps:

Step 1: Lumentum’s high-resolution dot projector projects thousands of dots onto the user’s face, creating a unique structured-light depth pattern;

Step 2: ON Semiconductor’s RGB-IR image sensor captures the high-resolution images from Step 1, even in low-light or high-dynamic-range conditions;

Step 3: Ambarella’s advanced vision SoC takes the high-resolution images captured in Step 2 and uses deep neural networks (DNNs) for depth processing, facial recognition, anti-tailgating, and anti-spoofing, while video encoding and network software run simultaneously.
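The three steps above can be sketched as a pipeline. Everything here, including the function names, data shapes, and return values, is an illustrative stand-in for exposition, not the actual Ambarella, Lumentum, or ON Semiconductor APIs:

```python
def project_dots(scene):
    """Step 1 (VCSEL dot projector): overlay a structured-light dot pattern."""
    return {"scene": scene, "pattern": "structured-light dots"}

def capture_rgb_ir(illuminated):
    """Step 2 (RGB-IR sensor): one sensor captures visible and IR images."""
    return {"rgb": illuminated["scene"], "ir": illuminated["pattern"]}

def process_on_soc(images):
    """Step 3 (vision SoC): depth processing, recognition, and anti-spoofing,
    with video encoding and networking running alongside."""
    depth_map = "depth map from " + images["ir"]
    return {"depth": depth_map, "identity": "registered user", "live": True}

def janus_style_pipeline(scene):
    """Chain the three stages: project, capture, then process on the SoC."""
    return process_on_soc(capture_rgb_ir(project_dots(scene)))
```

The point of the structure is that a single camera feeds all downstream algorithms: the same RGB-IR capture supplies the viewing stream, the depth computation, and the liveness check.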