Intelligent Voice® for E-DISCOVERY
Manually transcribing six months of phone calls for just 70 traders costs about £1 million ($1.5 million). With voice recording now a reality in the US and the UK, how can you conduct an early case review cost-effectively?
Text is the backbone of all litigation support and e-discovery systems. Phone calls, self-evidently, are not available as text unless you provide a transcript to go with the file.
You will hear talk of "Phonetic Search" as the work-around for this.
"Phonetic Search" deconstructs the search term, and attempts to pattern match "phonemes", which are some of the building blocks of words, to find approximate matches. Useful in some cases, but only part of the answer.
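As a rough illustration of the idea, the sketch below breaks a search term into phonemes and slides it along a stream of recognised phonemes, accepting windows that differ by at most one phoneme. The tiny lexicon, the phoneme labels and the function names are assumptions made for this example; they do not describe how any particular product implements phonetic search.

```python
# Toy pronunciation lexicon (hypothetical; real systems use far larger ones).
LEXICON = {
    "libor": ["L", "AY", "B", "AO", "R"],
    "rate": ["R", "EY", "T"],
}


def to_phonemes(term):
    """Deconstruct a search term into a flat list of phonemes."""
    phones = []
    for word in term.lower().split():
        phones.extend(LEXICON[word])
    return phones


def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def phonetic_search(term, stream, max_errors=1):
    """Return (position, distance) for every window that approximately matches."""
    query = to_phonemes(term)
    hits = []
    for start in range(len(stream) - len(query) + 1):
        window = stream[start:start + len(query)]
        distance = edit_distance(query, window)
        if distance <= max_errors:
            hits.append((start, distance))
    return hits


# Phonemes "heard" in a call; the vowel in the target word was misrecognised.
stream = ["DH", "AH", "L", "AY", "B", "ER", "R", "W", "AA", "Z"]
print(phonetic_search("libor", stream))  # [(2, 1)]
```

The match is found despite the misheard vowel, which is the strength of the approach. The result is still only a position in the audio, not text that an e-discovery system can analyse.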
An e-discovery system gives thematic and other analysis when aggregating text information, so why not use a machine to transcribe voice files?
The Problem with Automatic Transcription
It is impossible for a computer to "hear" a phone conversation and produce a perfect transcript. That is the holy grail of voice recognition technology.
It is possible to achieve high levels of accuracy using desktop transcription systems that employ a high-quality microphone.
Telephone speech, however, presents unique challenges. First, it is highly compressed and stored at 8 kHz (as opposed to 16 kHz for desktop speech) to a standard first introduced in 1972. Second, the higher and lower frequencies are eliminated to save further space, which is why voices sound "tinny" over the telephone. Finally, the way people speak over the phone is very different. Speech recognition relies not only on recognising the phonemes, but also on predicting the order in which words appear. If a system is struggling to decide between "dog" and "frog", it uses the surrounding words to choose: "The princess kissed the dog" is likely to be wrong.
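The "dog" versus "frog" decision can be sketched in a few lines. In the example below, the acoustic scores and the word-sequence probabilities are invented numbers chosen purely for illustration, and `pick_word` is a hypothetical helper, not part of any real recogniser.

```python
import math

# Hypothetical acoustic scores: how well each candidate matches the audio.
# The two words sound alike, so the scores are close.
ACOUSTIC_LOGPROB = {"dog": math.log(0.52), "frog": math.log(0.48)}

# Hypothetical probabilities of a word given the two words before it.
TRIGRAM_PROB = {
    ("kissed", "the", "frog"): 0.30,
    ("kissed", "the", "dog"): 0.05,
}


def pick_word(context, candidates, lm_weight=1.0):
    """Choose the candidate with the best combined acoustic and language score."""
    def score(word):
        lm_prob = TRIGRAM_PROB.get((*context, word), 1e-6)
        return ACOUSTIC_LOGPROB[word] + lm_weight * math.log(lm_prob)
    return max(candidates, key=score)


# With word-order prediction, "frog" wins despite a slightly worse acoustic match.
print(pick_word(("kissed", "the"), ["dog", "frog"]))               # frog
# Without it (weight 0), the system falls back on sound alone.
print(pick_word(("kissed", "the"), ["dog", "frog"], lm_weight=0))  # dog
```

Because people speak differently on the phone, word-order statistics learned from dictated or written text fit telephone conversations less well, which weakens exactly this kind of correction.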
Compounding this is the way most calls are recorded. Historically, disk space has been expensive, so calls that take place in stereo (as most do) are compressed to mono, and calls are often re-encoded to a much lower quality just to save space. This makes the automatic transcription challenge that much harder.
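A minimal sketch of why the mono conversion hurts, assuming the simplest possible downmix (averaging the two channels, with samples shown as small integers): once the speakers are mixed, very different conversations can produce identical audio, so the information about who said what is gone.

```python
def downmix_to_mono(left, right):
    """Average two channels sample by sample (a simple, assumed downmix)."""
    return [(l + r) // 2 for l, r in zip(left, right)]


# Each party on their own channel, as in a stereo recording.
caller = [1000, 0, -2000]
agent = [0, 3000, 2000]

mono = downmix_to_mono(caller, agent)
print(mono)  # [500, 1500, 0]

# Swapping the speakers gives exactly the same mono signal.
print(downmix_to_mono(agent, caller) == mono)  # True
```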
OCR all over again
When scanned images were first introduced in the early 1990s to form part of the e-discovery process, OCR did not deliver the level of results it does today. However, the results did provide valuable insights into the scanned data, sufficient to conduct an early case review, potentially saving millions in fees. Since then, capture technology as well as OCR algorithms have improved to give significantly greater accuracy.
IV for E-Discovery provides "OCR" for voice, gleaning a valuable transcript that can be loaded into an e-discovery system so that voice can be used alongside other text. The product relies on years of research that pinpoints the specific issues raised by telephone speech, to provide the best possible information to the e-discovery system.
Combined with JumpTo technology, a user can click on an automatically generated bookmark, such as a phrase or name, and go directly to the point in the conversation where it is said, again saving valuable time in the review process.
Beyond Automatic Recognition
As phone capture looks more to quality than disk space, and with increasing research focus on telephone voice transcription, results will get better and better over time, moving closer to the perfect transcript paradigm.
However, Intelligent Voice is focussing on other challenges presented uniquely by phone calls.
Almost 50% of all calls to financial service companies have a "blocked" Caller ID. Not knowing who called makes transaction analysis very difficult. Automatically clustering unknown callers overcomes this.
Emotional analysis also provides key stress indicators to pinpoint potential wrongdoing.
Separating mono phone calls back into stereo greatly improves the ability to provide robust automatic transcription.
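To give a feel for the first of these challenges, the sketch below groups calls by how similar their speakers sound. It assumes each call has already been reduced to a numeric "voiceprint" vector; the two-dimensional vectors, the similarity threshold and the greedy grouping are all assumptions made for illustration, not a description of the product's method.

```python
import math


def cosine(a, b):
    """Cosine similarity between two voiceprint vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cluster_callers(voiceprints, threshold=0.9):
    """Greedily group calls whose voiceprints resemble a cluster's first member."""
    representatives = []  # voiceprint of the first call in each cluster
    clusters = []         # call IDs grouped by presumed speaker
    for call_id, vector in voiceprints.items():
        for index, rep in enumerate(representatives):
            if cosine(vector, rep) >= threshold:
                clusters[index].append(call_id)
                break
        else:
            # No existing cluster is close enough: treat as a new caller.
            representatives.append(vector)
            clusters.append([call_id])
    return clusters


# Hypothetical voiceprints for four calls with blocked Caller ID.
voiceprints = {
    "call_1": [1.0, 0.0],
    "call_2": [0.0, 1.0],
    "call_3": [0.99, 0.05],
    "call_4": [0.1, 0.9],
}
print(cluster_callers(voiceprints))
# [['call_1', 'call_3'], ['call_2', 'call_4']]
```

Even without knowing who the callers are, a reviewer can then see that two of the calls probably came from the same person.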
What Do I Do Next?
IV for E-Discovery takes a random sample of any given data set and provides a detailed analysis of factors such as voice quality, line noise variations and language anomalies.
Once this processing has taken place, the system will output a series of transcript files, as well as an optional encapsulation of the original file with the text, to be ingested directly into an e-discovery system such as Clearwell, allowing telephone calls to become as accessible as email or any text document.