Recommendations for setting up face recognition. The largest face recognition system, with more than 300 channels, is implemented on VideoNet PSIM

What factors need to be considered when implementing a face recognition solution in order to get high-quality recognition and efficient operation of the entire system? In this article we look at the parameters of the face recognition module that deserve attention, explain the main terms and concepts, and review the external factors that affect recognition quality. We also discuss which tasks can be solved effectively with the recognition module.


Face recognition system - three pillars of efficient operation


When choosing a face recognition system, the biggest misconception is that the quality of the system depends directly on the choice of recognition algorithm, so that taking the best algorithm automatically gives a high-quality solution. In fact, recognition quality is the result of several components working together, and the recognition algorithm is only one of them.


Three main factors influence the effective operation of a face recognition system: the face recognition algorithm together with the settings of the recognition module, external factors, and the quality of the face database used for identification. If even one of these factors is neglected, the system will not deliver high-quality recognition.



The most common face recognition solutions are access control for a site, multi-factor identification of a person, and identification of a person in a crowd.


The face recognition system in VideoNet PSIM can operate in two main modes: verification and identification.

Verification mode compares a pair of images: a biometric template stored in the database and an image received at the input of the recognition module. The recognition module returns a binary decision: “yes”, the images belong to the same person, or “no”, the photos show different people. Verification is used in multi-factor identification, where the recognized face is compared with the face of, for example, the owner of an access card. If the comparison is positive, the system allows access to the site.

Identification mode compares the image received at the input of the recognition module with one or more photo databases (a one-to-many comparison). The result of the comparison is the face with the highest percentage of similarity, which can then be used to grant or deny access to a site, recognize a regular customer, identify an offender, and so on.
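The difference between the two modes can be sketched in a few lines of Python. This is an illustration only, not VideoNet PSIM code: descriptors are assumed to be plain lists of numbers, cosine similarity is assumed as the comparison measure, and the names `cosine_similarity`, `verify`, `identify` and the threshold of 0.8 are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed comparison measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def verify(probe, template, threshold=0.8):
    """One-to-one comparison: True means "yes, same person", False means "no"."""
    return cosine_similarity(probe, template) >= threshold

def identify(probe, database, threshold=0.8):
    """One-to-many comparison: best (name, similarity) pair, or None if no
    template reaches the threshold."""
    if not database:
        return None
    name, score = max(
        ((n, cosine_similarity(probe, t)) for n, t in database.items()),
        key=lambda item: item[1],
    )
    return (name, score) if score >= threshold else None

# Toy 3-dimensional descriptors; real descriptors are much longer.
database = {"Alice": [1.0, 0.0, 0.0], "Bob": [0.0, 1.0, 0.0]}
probe = [0.9, 0.1, 0.0]
print(verify(probe, database["Alice"]))
print(identify(probe, database))
```

Verification answers a yes/no question about one stored template, while identification searches the whole database and returns the most similar entry, if any entry is similar enough.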



Factor #1. Face recognition module operation algorithm


A face recognition algorithm is a method of constructing a biometric model of a face, called a descriptor. The descriptor is then used to identify the person. There are three main types of algorithms: mathematical, neural network, and hybrid, which combine the first two. Most modern recognition systems use algorithms built on neural networks. Neural network face recognition algorithms are trained on large sets of photographs of people with specially marked elements in the image, so recognition accuracy depends on the quality of the photographs on which the network is trained.


VideoNet PSIM uses neural network face recognition algorithms. In VideoNet PSIM, you can select one of three built-in recognition algorithms depending on the task, recognition quality requirements, and computing resource requirements.


The face recognition process in VideoNet PSIM consists of the following steps:

  • Face detection in a video stream
  • Choosing the best frame
  • Determining the characteristics of a person: gender, age, emotions (joy, sadness, etc.)
  • Formation of a descriptor (biometric template)
  • Comparing the biometric template with templates in the face database
  • Determining the degree of similarity of templates
  • Deciding on the identification of a person based on a given value of similarity
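The "choosing the best frame" step can be illustrated with a small sketch. The criterion VideoNet PSIM actually uses is not described in this article, so the scoring below is purely an assumption: it prefers faces that the detector is confident about, that are large enough (60 pixels between the eyes) and that are close to frontal. The `FaceCrop` type and the `frame_quality` and `best_frame` functions are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class FaceCrop:
    frame_index: int
    confidence: float       # detector confidence, 0..1
    eye_distance_px: float  # distance between the pupils in pixels
    yaw_deg: float          # horizontal head rotation, 0 = frontal

def frame_quality(crop: FaceCrop) -> float:
    """Hypothetical quality score: confident, large, frontal faces score higher."""
    size_term = min(crop.eye_distance_px / 60.0, 1.0)      # saturates at 60 px
    pose_term = max(0.0, 1.0 - abs(crop.yaw_deg) / 30.0)   # 0 at 30 degrees or more
    return crop.confidence * size_term * pose_term

def best_frame(track):
    """Pick the crop with the highest quality score from one person's track."""
    return max(track, key=frame_quality)

track = [
    FaceCrop(frame_index=1, confidence=0.90, eye_distance_px=40, yaw_deg=5),
    FaceCrop(frame_index=2, confidence=0.95, eye_distance_px=65, yaw_deg=10),
    FaceCrop(frame_index=3, confidence=0.99, eye_distance_px=70, yaw_deg=25),
]
print(best_frame(track).frame_index)
```

Only the selected frame would then be passed on to descriptor formation, which keeps the most expensive step from running on every frame.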


Configuring the face recognition module


A camera is selected, installed and configured for face recognition depending on the control zone: a corridor, a checkpoint with a turnstile, a controlled doorway, etc. The face recognition module is configured individually for each camera. VideoNet PSIM has three groups of settings.



Stream parameters:


Resolution. Sets the resolution of the frame that is fed to the input of the recognition module: standard, high or maximum. This parameter affects recognition speed.


Frame frequency. Sets the frame rate used for processing. The value is chosen according to system performance and affects processor load.
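To see why the processing frame rate affects processor load, consider a minimal sketch that selects which frames of a stream reach the recognition module. How VideoNet PSIM samples frames internally is not described here; the function name `frames_to_process` and the simple fixed-step sampling are assumptions made for illustration.

```python
def frames_to_process(camera_fps: int, processing_fps: int, total_frames: int):
    """Return the indices of frames passed to the recognition module,
    assuming every N-th frame is taken (N = camera_fps / processing_fps)."""
    if camera_fps <= 0 or processing_fps <= 0:
        raise ValueError("frame rates must be positive")
    step = max(1, round(camera_fps / processing_fps))
    return list(range(0, total_frames, step))

# One second of a 25 fps stream processed at 5 fps.
print(frames_to_process(camera_fps=25, processing_fps=5, total_frames=25))
```

With these numbers only 5 of the 25 frames are analyzed, so the module does about one fifth of the work it would do at the full camera frame rate.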


Face detection parameters:


Confidence threshold. When a face is detected in the video stream, that is, when the module determines that a face is present in the frame, the confidence threshold parameter is applied. It sets the minimum confidence of the recognition module that the object detected in the frame is a face. The higher the confidence threshold, the fewer false positives, but the greater the chance of missing a real face.

The confidence threshold is chosen individually for each task. With a high threshold, the system is unlikely to let an outsider pass, but it will produce more false rejections that have to be dealt with. With a low threshold, the chance of missing a face is small, but the risk of admitting an outsider to the site increases.
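The trade-off can be made concrete with a small sketch. The detections below are made-up toy values, and `evaluate_threshold` is a hypothetical helper, not part of VideoNet PSIM; it simply counts how many non-faces pass the threshold and how many real faces are dropped.

```python
def evaluate_threshold(detections, threshold):
    """detections: list of (confidence, is_real_face) pairs.
    Returns (false_positives, missed_faces) for the given threshold."""
    false_positives = sum(
        1 for conf, real in detections if conf >= threshold and not real
    )
    missed_faces = sum(
        1 for conf, real in detections if conf < threshold and real
    )
    return false_positives, missed_faces

# Toy data: detector confidence and whether the object really was a face.
detections = [
    (0.95, True), (0.80, True), (0.65, True),
    (0.70, False), (0.40, False), (0.55, True),
]
for threshold in (0.5, 0.75, 0.9):
    print(threshold, evaluate_threshold(detections, threshold))
```

On this toy data the lowest threshold lets one false positive through and misses no faces, while the higher thresholds remove the false positive at the cost of two and then three missed faces.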

Detection frequency. Limits face detection to no more often than the specified number of frames per second. The value is chosen according to system performance and affects recognition speed.

Detector algorithm. VideoNet PSIM offers two algorithms for searching for and detecting faces in an image: high-quality and fast. The fast algorithm is recommended when computing resources are very limited. This parameter affects recognition speed.


Face recognition parameters:


Ignore repeated face recognition. Sets the time after which the same face reappearing in the recognition zone is treated as new. This setting is needed because a person in the frame can briefly turn away or be blocked by another person.
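A minimal sketch of this kind of suppression is shown below. It assumes that each recognized face already has a stable identifier and that the interval is measured from the last time the face was seen; both are assumptions, and the `RepeatFilter` class is a hypothetical illustration rather than the VideoNet PSIM implementation.

```python
class RepeatFilter:
    """Treats a face as new only if it has not been seen for ignore_seconds."""

    def __init__(self, ignore_seconds: float):
        self.ignore_seconds = ignore_seconds
        self._last_seen = {}  # face identifier -> timestamp of the last sighting

    def is_new(self, face_id: str, timestamp: float) -> bool:
        last = self._last_seen.get(face_id)
        self._last_seen[face_id] = timestamp  # every sighting refreshes the timer
        return last is None or timestamp - last > self.ignore_seconds

repeat_filter = RepeatFilter(ignore_seconds=10)
for face_id, t in [("a", 0), ("a", 4), ("a", 12), ("a", 30), ("b", 31)]:
    print(face_id, t, repeat_filter.is_new(face_id, t))
```

Because every sighting refreshes the timer, a person who turns away for a few seconds and then faces the camera again does not generate a second event.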

Save faces. Saves faces to the database.

Count faces. Counts the number of recognized faces.


Determining a person's gender, age and emotions from a face image:




Using the image of a face, you can determine the characteristics of a person: gender, age, emotions (joy, anger, sadness, etc.). This functionality is part of the face recognition module in VideoNet PSIM.


Determining a person's characteristics, such as gender, age and emotions, is called classification. Classification of people by facial images is used in many areas: to analyze the age or gender composition of an audience, for example in a store or restaurant, to analyze the quality of customer service, and to search for people in the video archive by photo, gender, age or emotions.


Substitution of a face with a photograph:


The “substitution of the face by a photo” functionality covers cases where an intruder uses a photo to hide his face or tries to enter a site using a photo of one of its employees. This check requires additional computing resources and affects recognition speed. The settings define a probability threshold, in percent, that the person has covered his face with a photograph; if the threshold is exceeded, the face is considered replaced by a photo.





What data gets into the Faces log:


All events related to the results of the face recognition module are recorded in the Faces log. There are three types of events: Face is recognized, Database match, Face substitution. Events in the Faces log contain the following data, which can be used to build reports and statistics:


  • Event type;
  • Date and time;
  • Face image from the camera;
  • Photo from the DB;
  • Camera;
  • Computer;
  • Full name;
  • Department;
  • Position;
  • Age;
  • Name of face DB;
  • Comment;
  • Gender;
  • Similarity level.


For convenient work with data, the Faces log has a filtering mechanism. You can set the time range for the sample and filter by any of the logged fields; for example, you can sample by age by specifying the desired age range, or sample by gender. The generated report can be saved in xlsx format. From the Faces log, you can also watch the video for any event.
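The same kind of sampling can be reproduced outside the application once the events are available as data. The sketch below is only an illustration: the `FaceEvent` record keeps a few of the fields listed above under hypothetical names, the events are made-up toy values, and `filter_events` is not a VideoNet PSIM function.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

@dataclass
class FaceEvent:
    """A reduced log record; field names are illustrative, not the VideoNet schema."""
    event_type: str
    timestamp: datetime
    camera: str
    age: int
    gender: str
    similarity: Optional[float] = None  # only set for database matches

def filter_events(events: List[FaceEvent],
                  start: Optional[datetime] = None,
                  end: Optional[datetime] = None,
                  min_age: Optional[int] = None,
                  max_age: Optional[int] = None,
                  gender: Optional[str] = None) -> List[FaceEvent]:
    """Keep the events that satisfy every filter that was supplied."""
    selected = []
    for event in events:
        if start is not None and event.timestamp < start:
            continue
        if end is not None and event.timestamp > end:
            continue
        if min_age is not None and event.age < min_age:
            continue
        if max_age is not None and event.age > max_age:
            continue
        if gender is not None and event.gender != gender:
            continue
        selected.append(event)
    return selected

events = [
    FaceEvent("Face is recognized", datetime(2024, 1, 10, 9, 0), "Entrance", 34, "female"),
    FaceEvent("Database match", datetime(2024, 1, 10, 9, 5), "Entrance", 52, "male", 0.91),
    FaceEvent("Face is recognized", datetime(2024, 1, 11, 14, 30), "Hall", 27, "male"),
]
sample = filter_events(events, start=datetime(2024, 1, 10),
                       end=datetime(2024, 1, 10, 23, 59), min_age=30, max_age=60)
print(len(sample), [e.camera for e in sample])
```

Each filter is optional, so the same function covers a sample by time range only, by age range only, or by any combination of the supported fields.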



Factor #2. External factors



The recognition accuracy and, accordingly, the overall system performance depend on the correct choice of video camera, its settings and its installation location. We have compiled the main recommendations that will help you improve the quality of the recognition system.



Recommendations for organizing a face recognition zone

  • It is recommended that you use a separate camera for face recognition.
  • It is desirable to install the camera at the head level of a person of average height, so that the vertical deflection angle is no more than 15 degrees.
  • People in the control zone should move towards the camera.
  • Provide even lighting of faces in the recognition area.
  • Avoid complicated backgrounds behind people. The best recognition results are obtained against a light and uniform wall or floor.
  • A person should spend at least 1 second in the control zone. Stopping a person in the control zone, for example with a turnstile, improves the quality of recognition.


The algorithm relies on the characteristic features of the face: eyes, corners of the lips, nose, etc. Successful face recognition requires at least 160 pixels across the oval of the face and, ideally, at least 60 pixels between the eyes. No matter how carefully you choose the camera location, the camera will have to be adjusted in place to achieve these values. For this reason, a camera with a varifocal lens is usually chosen, since it allows on-site adjustment.
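Before installation, the expected number of pixels between the eyes can be estimated from the camera parameters. The sketch below assumes a simple pinhole camera model without lens distortion, a face near the center of the frame, and an average adult distance between the pupils of about 63 mm; the function names and the example camera (1920 pixels wide, 30-degree horizontal field of view) are hypothetical.

```python
import math

AVERAGE_EYE_DISTANCE_M = 0.063  # assumed average adult interpupillary distance

def eye_distance_px(image_width_px, horizontal_fov_deg, distance_m,
                    eye_distance_m=AVERAGE_EYE_DISTANCE_M):
    """Estimated pixels between the pupils for a face at distance_m."""
    # Width of the scene covered by the frame at the given distance.
    scene_width_m = 2.0 * distance_m * math.tan(math.radians(horizontal_fov_deg) / 2.0)
    pixels_per_meter = image_width_px / scene_width_m
    return eye_distance_m * pixels_per_meter

def max_distance_m(image_width_px, horizontal_fov_deg, required_px=60,
                   eye_distance_m=AVERAGE_EYE_DISTANCE_M):
    """Largest distance at which the face still has required_px between the eyes."""
    return (eye_distance_m * image_width_px) / (
        2.0 * required_px * math.tan(math.radians(horizontal_fov_deg) / 2.0)
    )

print(round(eye_distance_px(1920, 30, 4.0), 1))
print(round(max_distance_m(1920, 30), 2))
```

For this example camera the estimate is about 56 pixels between the eyes at 4 m, just under the recommended 60, and the recommended value is reached at roughly 3.8 m. Narrowing the field of view with a varifocal lens increases the pixel density and extends that distance.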



For the best recognition result, follow these recommendations for the face image

  • The image of the face in the frame should be clear, not blurry and evenly lit
  • Individual shadows or highlights on the face significantly reduce the likelihood of correct identification
  • The background behind the face should be uniform and light
  • The angle of rotation of the face in the frame should not exceed 15 degrees vertically or horizontally
  • The recommended distance between the pupils in the face image is at least 60 pixels


A mustache, beard, glasses, blinking and emotions do not significantly affect the quality of identification. The person does not have to look directly into the camera: provided that the recommendations for the face image and for organizing the recognition zone are followed, the face recognition module shows a good percentage of similarity when the observation axis deviates by up to 30 degrees from the frontal view of the face.





Factor #3. High-quality photo base for recognition


The likelihood of identifying a person increases when you add several photos of that person to the database.





Recommendations for images used for building descriptors

  • When creating descriptors, it is recommended to obtain reference images of faces solely by photographing, rather than by cutting faces from the video stream. Only then is it possible to achieve maximum accuracy in face recognition;
  • The position of the face should be frontal. Head rotation, tilt and deflection should each be less than 5 degrees from the frontal position;
  • The expression should be neutral (without a smile), both eyes should be open (but not wider than normal) and look into the camera, mouth closed;
  • A person’s hair should not cover the eyes; head coverings (headgear, headscarf, etc.) are not recommended;
  • If the person in the image wears glasses, the frame should not cover part of the eyes. The lenses must be clear and transparent so that the pupils and irises are clearly visible;
  • A person’s shoulders should face the camera. It is not allowed to use images in which a person looks “over the shoulder”;
  • The background in the image should be even, without shadows, and should not contain textures with straight or curved lines that can distort the results of automated processing of the face;
  • The background should have a uniform color palette or be monochrome, with a consistent change in brightness from light to dark in only one direction;
  • A person’s face should be evenly lit, without shadows. A dominant direction of illumination and “bright spots” (glare) on the face image are not allowed;
  • Dark eyebrow shadows in the eye sockets are not allowed. The irises and pupils of the eyes must be clearly visible;
  • The image should clearly show the texture of the skin in each area of the face. At the same time, there should be no underexposed or overexposed areas on the face;
  • All points of the face image should be in focus (from nose to ears and from chin to top of head);
  • The use of unnatural lighting (yellow, red, etc.) is not allowed;
  • Lighting should not distort the natural color of the skin as seen in natural conditions. Red-eye effect is not allowed;
  • It is not allowed to edit a color or black and white image in order to improve the appearance of the depicted face or to process it artistically;
  • The resolution of a face photo should be at least 256x256 pixels.
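The two numeric requirements in this list, the minimum resolution and the 5-degree limit on head angles, are easy to check automatically before a photo is added to the database. The sketch below assumes that the head angles have already been estimated by some other tool; the function `check_reference_photo` and its parameters are hypothetical and not part of VideoNet PSIM.

```python
def check_reference_photo(width_px, height_px, yaw_deg, pitch_deg, roll_deg):
    """Return a list of problems found; an empty list means the numeric
    requirements (resolution and head angles) are met."""
    problems = []
    if width_px < 256 or height_px < 256:
        problems.append(f"resolution {width_px}x{height_px} is below 256x256")
    for name, angle in (("yaw", yaw_deg), ("pitch", pitch_deg), ("roll", roll_deg)):
        if abs(angle) >= 5:
            problems.append(f"{name} is {angle} degrees; it should be less than 5")
    return problems

print(check_reference_photo(512, 512, yaw_deg=2, pitch_deg=-1, roll_deg=0))
print(check_reference_photo(200, 300, yaw_deg=6, pitch_deg=0, roll_deg=0))
```

The remaining requirements, such as lighting, background and expression, are harder to express as simple thresholds and still need a visual check.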



Additional functionality. Search for a person by photo


To investigate incidents or analyze information, VideoNet PSIM provides a search for a person by photo. To search, select a person’s photo from the Faces log or upload a photo from a file or a web camera, add it to the search window, and set the required level of similarity between this face and faces in VideoNet events. The system will find and show only those events where the face in the image has the required level of similarity with the face in the selected photo. You can watch video clips for the search results in one click.






Conclusion. Each face recognition system is individual


Our recommendations are intended to help you get the maximum result from a face recognition system. This does not mean that the system will not work if they are not followed. Each site is individual and has its own characteristics, and some recommendations often cannot be implemented in full. Our company’s specialists will help you set up a face recognition system, taking into account the features of your site and the logic for further work with a recognized face, and will help you build a solution with multi-factor identification.


For support, selection of solutions and recommendations, please email or call 8-800-50-50-100. We have experience in implementing various face recognition solutions, including the largest face recognition system, with more than 300 recognition channels in one system.