Automatic Speaker Comparison

Information page

What task is being performed?

Voices are compared as part of an investigation conducted by an expert appointed by an examining magistrate. This work takes place in the Division of Digital and Biometric Traces, within the Speech, Language, and Audio Research team. A voice comparison is typically carried out to determine whether the suspect's voice matches that of the perpetrator. One of the methods used for this is automatic speaker comparison.

Why is an algorithm used?

The algorithm supports the decisions made by experts. It is used in addition to the examination that the experts carry out themselves. The algorithm is used because it makes the final judgment more objective: how well the algorithm performs can be measured accurately.

For speaker comparison, two audio files are loaded into the algorithm. The algorithm extracts features from the speech in each recording and compares them. It then generates a score that indicates how similar the voices are. The score is primarily based on 'the sound of the voice.' These steps are automated, while the interpretation of the score is done by experts. For each case, we reassess whether the algorithm can be used.
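The extract-compare-score steps described above can be illustrated with a minimal sketch. It assumes that each recording has already been reduced to a fixed-length feature vector and that the vectors are compared with cosine similarity. Both are illustrative assumptions: this page does not specify which features or which scoring method the NFI software uses, and the vectors and names below are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("feature vectors must be non-zero")
    return dot / (norm_a * norm_b)


# Hypothetical feature vectors standing in for the two recordings.
# In practice these would come from a feature-extraction step.
questioned = [0.20, 0.70, 0.10]
reference = [0.25, 0.65, 0.05]

# A higher score means the two vectors point in a more similar direction.
score = cosine_similarity(questioned, reference)
```

A raw score like this says nothing on its own about whether two recordings come from the same speaker. As the text notes, interpreting the score remains the expert's task.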

This algorithm is used in criminal investigations where speaker comparison plays a role. The NFI uses audio relevant to a specific criminal case, provided for investigation by the police, the Public Prosecution Service, or the judiciary. The NFI only uses the algorithm in cases where it is technically possible to apply automatic speaker comparison. When the algorithm is used, the material usually consists of tapped conversations. There is no voice database in which individuals are searched for using this technology.

The way this algorithm is used involves a great deal of human intervention: experts first determine whether the algorithm is suitable for the material under investigation, and later interpret the outcome in relation to the question under investigation. The algorithm is only applied alongside, or in parallel to, the expert's evaluation. The results from both methods are interpreted by experts to form a final conclusion.

The algorithm has undergone general testing in advance. Because the specific audio conditions differ per case (language, speech duration, phone/microphone, etc.), it is retested for each case to determine whether it is suitable for use in those conditions. The algorithm is only used by an expert who has been specially trained and understands the software's limitations. The algorithm is only used within the context of a single criminal case. If the algorithm cannot be tested in a case to determine how well it works, it will not be used.
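One way to picture the per-case retest is sketched below. It assumes that test recordings matching the case conditions yield two sets of scores, one from same-speaker pairs and one from different-speaker pairs, and that performance is summarized by an approximate equal error rate. The metric, the function names, and the `max_eer` limit are hypothetical choices for illustration; this page does not state how the NFI measures suitability.

```python
def error_rates(same_scores, diff_scores, threshold):
    """Return (miss rate, false-alarm rate) at a given score threshold."""
    misses = sum(1 for s in same_scores if s < threshold)
    false_alarms = sum(1 for s in diff_scores if s >= threshold)
    return misses / len(same_scores), false_alarms / len(diff_scores)


def approximate_eer(same_scores, diff_scores):
    """Approximate the equal error rate by trying each observed score as a threshold."""
    best = None
    for t in sorted(set(same_scores) | set(diff_scores)):
        miss, false_alarm = error_rates(same_scores, diff_scores, t)
        gap = abs(miss - false_alarm)
        if best is None or gap < best[0]:
            best = (gap, (miss + false_alarm) / 2)
    return best[1]


def suitable_for_case(same_scores, diff_scores, max_eer=0.10):
    """Return True only if the algorithm could be tested and performed well enough.

    max_eer is a hypothetical acceptance limit, not a documented value.
    """
    if not same_scores or not diff_scores:
        # Without test data the performance cannot be measured,
        # so the algorithm is treated as unsuitable.
        return False
    return approximate_eer(same_scores, diff_scores) <= max_eer


# Hypothetical scores from test recordings that match the case conditions.
same = [0.90, 0.80, 0.85, 0.70]
diff = [0.10, 0.20, 0.30, 0.40]
usable = suitable_for_case(same, diff)
```

Returning `False` when no test scores are available mirrors the rule stated above: if the algorithm cannot be tested for a case, it is not used.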

If you have questions about the application of this algorithm, or about algorithms in general, please contact the NFI via the general contact form.

Journalists with questions about the application of this algorithm can contact the NFI press officers.