How to detect answering machines (AMD)

Detecting answering machines on outbound calls placed by robots.

This subject was not yet discussed on the French web in 2020. To appreciate the problem, you first have to understand that it is not always easy for a robot to tell whether it is talking to a classic answering machine with DTMF key prompts or even to another robot (as with the virtual secretaries we have set up). When a human picks up the phone, he or she generally says something very short: “Hello”, “Yes, hello”, “Who is this?” and so on. When an answering machine picks up, on the other hand, we get a longer greeting followed by an invitation to leave a message.
How can we reliably detect that we have reached an answering machine or voicemail?
Our tool is designed to analyze the voice, the duration and content of the first message, and the surrounding environment, which can be noisy. We can therefore measure precisely how long the speech in this first interaction lasts; it is generally longer for a machine than for a human. We can also detect the short tone, or beep, played just before the caller can leave a message. This signal is an important indicator when choosing the scenario that closes the conversation.
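As an illustration of the beep detection described above, here is a minimal sketch using the Goertzel algorithm, a standard way to measure the energy of a single frequency in a block of samples. It is not our platform's implementation. The function names, the 1000 Hz target and the 0.5 threshold are assumptions chosen for the example; real answering machines use beeps of varying frequency, so a production detector would probe several frequencies.

```python
import math


def goertzel_power(samples, sample_rate, target_hz):
    """Return the squared magnitude of the DFT bin closest to target_hz."""
    n = len(samples)
    k = round(n * target_hz / sample_rate)  # nearest DFT bin index
    coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def is_beep(samples, sample_rate, target_hz=1000.0, min_ratio=0.5):
    """Guess whether a block of audio is dominated by a tone near target_hz.

    target_hz and min_ratio are illustrative assumptions, not measured values.
    """
    energy = sum(x * x for x in samples)
    if not samples or energy == 0.0:
        return False  # silence cannot be a beep
    # For a pure sine sitting exactly on the bin, this ratio is close to 1.
    ratio = 2.0 * goertzel_power(samples, sample_rate, target_hz) / (len(samples) * energy)
    return ratio >= min_ratio
```

In practice the function would be called on short consecutive blocks (for example 100 ms of 8 kHz telephony audio) and a beep reported only when several blocks in a row match.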
It is possible to leave a message, for example, but here again you must be able to record it, and doing so relies on a few little-known tricks. All of this is preconfigured in our platform, so you never have to ask yourself: what should I do when I reach an answering machine?
We use our voice recognition engines to detect the answering machine's tone and to transcribe the greeting up to the point where it says you can leave a message. This ensures that the entire message is delivered, or that the call can be automatically retried at another time.
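The choice between delivering the message and retrying later can be pictured as a small decision function. This is only a sketch: the labels ("human", "machine", "unknown"), the action names and the limit of three attempts are hypothetical and do not describe our platform's actual configuration.

```python
def next_action(party, beep_heard, attempts, max_attempts=3):
    """Pick what the robot does next.

    party: "human", "machine" or "unknown" (hypothetical labels).
    beep_heard: True once the answering machine's tone has been detected.
    attempts: number of calls already placed to this number, including this one.
    """
    if party == "human":
        return "start_dialogue"
    if party == "machine" and beep_heard:
        # The greeting is over, so the whole message will be recorded.
        return "play_message"
    # Machine without a beep, or an undecided call: hang up and try again later.
    return "retry_later" if attempts < max_attempts else "give_up"
```

Waiting for the beep before playing the message is what prevents the beginning of the message from being cut off by the greeting.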
More than 70% of calls from unknown numbers go to voicemail because people do not like talking to strangers. Thanks to deep learning and acoustic modeling, digital signal processing techniques can tell whether a person has actually picked up the phone. Analyzing the audio is always worthwhile because it enriches the knowledge base on answering machines. Understanding audio spectra also makes it possible to detect other sounds and background noise: a television, a train, a car or birds, to name only a few. It is likely that legislation will one day prohibit a robot from talking to a person who is in a car, for example.

However, we must beware of false positives, where the robot believes it has reached an answering machine when a human has answered, and of the opposite error. A human sometimes picks up without speaking immediately, or does not speak loudly enough, which produces a long silence. It has been calculated that everything is decided in the first second of the call, and some answering machines are impossible to detect. The most important thing is to be able to end the call as soon as we understand that we have reached an answering machine or music on hold right after dialing.
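One way to limit false positives is to let the detector answer "unknown" instead of forcing a choice. The sketch below is a simplified heuristic, not our production logic. It takes one loudness value per audio frame (a hypothetical input, for example an RMS level between 0 and 1), and every threshold in AmdConfig is an arbitrary assumption that would need tuning on real calls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AmdConfig:
    frame_ms: int = 20                  # duration of one audio frame
    energy_threshold: float = 0.02      # below this level a frame counts as silence
    max_initial_silence_ms: int = 2500  # give up deciding after this much silence
    human_max_speech_ms: int = 2000     # longer speech suggests a recorded greeting


def classify_call(frame_levels, cfg=AmdConfig()):
    """Return "human", "machine" or "unknown" from per-frame loudness values."""
    initial_silence_ms = 0
    speech_ms = 0
    started = False
    for level in frame_levels:
        voiced = level >= cfg.energy_threshold
        if not started:
            if voiced:
                started = True
                speech_ms += cfg.frame_ms
            else:
                initial_silence_ms += cfg.frame_ms
                if initial_silence_ms >= cfg.max_initial_silence_ms:
                    # Nobody spoke, or too quietly: do not guess.
                    return "unknown"
        elif voiced:
            speech_ms += cfg.frame_ms
            if speech_ms > cfg.human_max_speech_ms:
                return "machine"
    return "human" if started else "unknown"
```

For example, classify_call([0.0] * 10 + [0.5] * 40) sees 800 ms of speech after a short pause and answers "human", while a speaker who stays below the energy threshold for the whole window ends up as "unknown" rather than being mistaken for a machine.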

With call transfers, more complex scenarios are needed: the transfer target may itself go to an answering machine, in which case you must be able to return to the original conversation. Believe me, it is complex.