A technology poised to transform the physical security market is deep learning, which is a neural network approach to machine learning, differentiated by an ability to train using large data sets for greater accuracy. In effect, the system “learns” by looking at lots of data to achieve artificial intelligence (AI).

Phases of deep learning

I heard a lot about AI, including how it can transform the physical security marketplace, when I attended NVIDIA’s GPU Technology Conference (GTC) in San Jose recently.  

Recognising images, including video images, is a big focus of AI. In the past, you needed programmers to spend months telling a computer how to recognise an image. In deep learning, instead of programming the computer, you just show it many different images and it "learns" to distinguish the differences. This is the "training" phase. After the neural network learns about the data, it can then use "inference" to interpret new data based on what it has learned. In effect, if it has seen enough cats before, it will know when a new image is a cat.
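The two phases can be sketched in a few lines of Python. This is a minimal illustration and not a real image recogniser: it trains a single artificial neuron rather than a deep network, and the two-number "features" per image, the function names and the sample data are all assumptions made up for the example.

```python
import math
import random


def sigmoid(z):
    """Squash a raw score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, epochs=500, lr=0.5, seed=0):
    """Training phase: adjust weights from labelled examples by gradient descent."""
    rng = random.Random(seed)
    weights = [rng.uniform(-0.1, 0.1) for _ in range(len(samples[0]))]
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            error = pred - y  # gradient of the log loss with respect to the raw score
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def infer(model, x):
    """Inference phase: apply the learned weights to data the model has not seen."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Hypothetical features per image (say, ear pointiness and whisker density).
# Label 1 = cat, label 0 = not a cat. Real systems learn from raw pixels.
train_x = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.95], [0.1, 0.2], [0.2, 0.1], [0.15, 0.3]]
train_y = [1, 1, 1, 0, 0, 0]

model = train(train_x, train_y)
cat_score = infer(model, [0.85, 0.9])    # new, unseen cat-like example
other_score = infer(model, [0.1, 0.15])  # new, unseen non-cat example
print(f"cat-like: {cat_score:.2f}, not cat-like: {other_score:.2f}")
```

Nothing in `train` spells out what a cat looks like; the weights are shaped only by the labelled examples, and `infer` simply reuses them on new input.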

Factors enabling AI

Deep learning and AI are fast-growing areas for a wide range of uses – physical security is just one. It is all made possible by the coming together of three factors. One is the availability of lots of data. This is the “big data” we have been hearing about; in effect, a proliferation of sensors (including video cameras) has produced a large enough mass of data to enable systems to be trained effectively. The second factor is the development of new algorithms to train neural networks faster, and the third is the availability of computer hardware (specifically GPUs, or graphics processing units) capable of rapidly completing the calculations involved. NVIDIA manufactures those GPUs and sponsors the annual GTC, which is all about how GPUs can be used more effectively.


Deep learning and neural network computing are everywhere. They are now widely available in on-premises computers, in systems embedded in edge devices, and even in the cloud. The edge is particularly important in the video surveillance market, enabling systems to function despite any bandwidth or latency issues that would limit the effectiveness of a central server-based system. Edge-based functionality also limits concerns about the privacy of information, and eliminates dependence on the availability of 3G connectivity.

NVIDIA AI City initiative

Video analytics applications fall under NVIDIA's “AI City” initiative, which the company describes as a combination of "safe cities" (video surveillance, law enforcement, forensics) and "smart cities" (traffic management, retail analytics, resource optimisation). Depending on the application, AI City technology must function in the cloud, on premises and/or at the edge. NVIDIA’s new Metropolis initiative offers AI at every system level, from the Jetson TX2 "embedded supercomputer" available at the edge, to on-premises servers (using NVIDIA’s Tesla and Quadro), to cloud systems (using NVIDIA’s DGX).

“AI City applications need an edge-to-cloud architecture,” says Jesse Clayton, Senior Manager, Product Management, Intelligent Machines, at NVIDIA. “Some applications, such as body cameras and parking entrance applications, have to have AI at the edge. But for other problems, you need to aggregate multiple sources of information, such as using AI on an on-premises server for hundreds of video cameras.”

The sheer volume of installed cameras in the world makes video an AI problem – more than 1 billion cameras worldwide by 2020 will provide 30 billion frames of video per second. The limitations of current video systems in adapting to and functioning well in real-world conditions point to a need for better technology, as do the traditional shortcomings of video analytics systems. Using artificial intelligence, video systems can achieve "super-human" results in identifying and classifying images.


AI in video surveillance

AI is steadily making its way into video surveillance. Multiple security industry partners are using NVIDIA GPUs to boost the effectiveness of their systems. Many companies highlighted their initiatives at ISC West in April and again at NVIDIA’s GPU Technology Conference. Among them are Avigilon’s Appearance Search and BriefCam’s real-time video synopsis system. Hikvision uses the technology for a six-fold improvement in detecting pedestrians in the rain, while Dahua is making its licence plate recognition system five times faster. Other companies using the technology are UNV Uniview (vehicle classification), SeeQuestor (investigations), Xjera Labs (people and attribute detection) and SenseTime (object detection).

NVIDIA’s Quadro GPU system enables Avigilon network video recorders (NVRs) to search simultaneously across hundreds of cameras to find images that are similar in appearance, such as faces that match an example. The GPU’s fast and efficient processing power, available in a small and affordable form factor, provides a system that is scalable and cost-effective but can run complex algorithms to provide rapid results.
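One common way to search by appearance is to turn each detected face or object into a numeric feature vector and then rank stored detections by their similarity to a query vector. The sketch below illustrates that general idea with cosine similarity. It is not Avigilon's algorithm; the camera names, timestamps, three-number vectors and the 0.9 threshold are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Similarity of two feature vectors: 1.0 means they point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query, index, threshold=0.9):
    """Return (similarity, camera, timestamp) for matching detections, best first."""
    hits = []
    for camera, timestamp, vector in index:
        score = cosine_similarity(query, vector)
        if score >= threshold:
            hits.append((score, camera, timestamp))
    return sorted(hits, reverse=True)


# Hypothetical index of detections; a real system would store vectors produced
# by a trained neural network, with far more than three numbers each.
index = [
    ("cam-01", "10:02:11", [0.90, 0.10, 0.30]),
    ("cam-07", "10:05:40", [0.88, 0.12, 0.31]),
    ("cam-12", "10:06:02", [0.10, 0.90, 0.20]),
]
query = [0.90, 0.10, 0.30]  # vector extracted from the example image

results = search(query, index)
for score, camera, timestamp in results:
    print(f"{camera} at {timestamp}: similarity {score:.3f}")
```

Each comparison is independent of the others, which is why this kind of search maps well onto the parallel processing a GPU provides.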


“Deep learning is about teaching technology to understand the world around us in a way that is similar to how we understand it,” says Willem Ryan, Senior Director, Global Marketing at Avigilon. “What seems simple to us in terms of how we perceive the world is complex for a machine to do, but a machine learns faster. Deep learning allows you to teach a machine how to make connections that we make every day. Using GPUs, a system can make assumptions and calculations instantaneously.”

Beyond recognising objects, the system can also learn about how objects interact in the environment, and look for anomalies or non-typical events. For example, if the system sees a car go onto a pavement, it could provide an alert.
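A toy version of anomaly detection is to learn a statistical profile of what is normal and flag anything far outside it. The sketch below does this with a simple z-score on a single number; deep learning systems model far richer behaviour, and the lateral car positions and the threshold of three standard deviations are assumptions chosen for illustration.

```python
import statistics


def fit_normal_profile(observations):
    """Learn what 'typical' looks like: mean and standard deviation of past values."""
    return statistics.mean(observations), statistics.stdev(observations)


def is_anomaly(value, profile, max_z=3.0):
    """Flag a value lying more than max_z standard deviations from the mean."""
    mean, stdev = profile
    return abs(value - mean) / stdev > max_z


# Hypothetical lateral positions of passing cars, in metres from the road's
# centre line (negative = one lane, positive = the other).
normal_positions = [-1.6, -1.4, -1.5, 1.5, 1.4, 1.6, -1.5, 1.5, -1.3, 1.7]
profile = fit_normal_profile(normal_positions)

on_road = is_anomaly(1.2, profile)      # a car inside the usual lanes
on_pavement = is_anomaly(6.5, profile)  # a car well outside them
print(f"on road flagged: {on_road}, on pavement flagged: {on_pavement}")
```

As in the article's example, no rule says "cars must not be on the pavement"; the alert follows from the observation being unlike anything seen before.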

How will AI develop?

NVIDIA’s GTC offered a chance for Avigilon to interact with others focused on AI, and to share Avigilon’s knowledge of the unique AI challenges of the video surveillance market. “This is the heart of the development of AI and deep learning,” said Ryan at the conference. “To be involved and part of this is exciting to Avigilon, and we can expose people here to how AI can be used in a way they may not be familiar with. We have talked to people who didn’t realise how video surveillance happens currently, and how AI is changing it.”

“We want to continue to support the idea of GPU processing and how using it can make video surveillance solutions more effective, and change how people interact with video,” he added. “That’s where we see the impact. There have been challenges we have struggled to overcome in the security industry, and these are the breakthroughs that will help us overcome those challenges. So, we want to be at the forefront and involved in those discussions.”

The impact of AI and deep learning on the physical security industry is only beginning. The full realisation of that impact over the next few years will be fascinating to watch.


Author profile

Larry Anderson Editor, SecurityInformed.com & SourceSecurity.com

An experienced journalist and long-time presence in the US security industry, Larry is SourceSecurity.com's eyes and ears in the fast-changing security marketplace, attending industry and corporate events, interviewing security leaders and contributing original editorial content to the site. He leads SourceSecurity.com's team of dedicated editorial and content professionals, guiding the "editorial roadmap" to ensure the site provides the most relevant content for security professionals.
