Speech recognition for the AI age.



Intelligent Voice indexes key words and phrases from your telephone calls.



This allows you to search for telephone calls as if they were text.



Add-on modules give you the power to analyze calls and track anomalous behavior.



Nvidia® GPU technology processes calls up to 400x faster than real time.


How smart are yours?


Intelligent Voice® boosts the intelligence of your calls, and puts them to work for you.


MiFID II & GDPR compliance nightmares? Make your fears disappear. Store & search all your voice data using the sound of voice.

Banish the Compliance Monster.

Without the ability to index and search the content of your company’s calls, the simplest inquiry from a regulator, an e-discovery request, or a Freedom of Information Act request that applies to voice can mean trouble: big, expensive, time-consuming trouble, with someone manually wading through (actually listening to) hundreds or thousands of hours of calls.

The alternative? Call Recording and Compliance software by Intelligent Voice.
With IV, you have ready access to all of your data, so you can use voice indexing to pinpoint what was said on a call, when, to whom, and by whom, as easily as searching for text. Make better use of your compliance team’s resources.

IV’s got you covered.


Bogged down by audio? Search hours of recordings with a click. IV’s e-Discovery is the key.


Imagine: agent voice call and screen data you can search—live, and post-call. IV’s got it. Get it and get ahead.

You can’t improve what you can’t measure.

Call centers have become wizards at measuring data around calls: call and hold times, conversion and retention rates, and more. But attempts to measure the core of the call center world, the verbal content of calls, are still taking baby steps. Random QA monitoring of a small percentage of calls is the best that technology has afforded. Until now.

Intelligent Voice for Call Centers turns what was said in every call into data that can be searched, analyzed and measured. Combine it with screen capture and other data to get the full picture of your business. Find and reward the true superstars, and catch anomalous behavior before it becomes a liability, even in real time with our “live” call monitoring.

You already record and store your calls. That’s a mountain of unused data. With IV, put your call center voice data to work for you.

Measure both: facts about each call, and what is said.


If your hot new app needs secure, private, high-speed ASR, then you need Intelligent Voice.

Let’s partner, partner.

So, you’re developing an exciting new app and you need accurate, super high-speed Automatic Speech Recognition (ASR) integrated. Don’t reinvent the wheel—unless you’ve got a few extra years and loads of extra cash to invest. We’ve done it for you.

Also, think security. If you’re sending a customer’s sensitive, private data to a public API like Google or Watson, the worry becomes: are you breaching your customer’s data privacy? Are you breaking the law?

Intelligent Voice powers your ASR and speech-to-text functions using our own AWS, Azure or other cloud instance. We’ve got every avenue of speech-search covered: hyperphonic, encrypted, biometric. IV model-building means your API gets even better as it’s used, and acquires the vocabulary that is unique to each client. Let our (top quality and cost-effective) technology complement yours, and let’s create something wonderful together.

Your API is better with IV.


Credit card data giving you problems? Let us clean it up, automatically.

Automatically redact credit card information from both the audio and the transcript.


Intelligent Voice picks up your call recording and runs it through our speech-to-text engine. We then run our PCI algorithm across the transcription. If we detect PCI data, it is redacted in both the audio and the transcript.

We are working alongside Zendesk to allow the redacted recording and transcript to be added to the ticket. If PCI data is not detected then the transcript is added to the ticket.
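To make the transcript side of that redaction step concrete, here is a minimal Python sketch. It is not Intelligent Voice’s PCI algorithm, which is not described here. It assumes the speech-to-text engine writes card numbers as digits, and it uses a hypothetical pattern plus the standard Luhn checksum to decide what to mask. The names `CARD_PATTERN`, `luhn_valid` and `redact_card_numbers` are illustrative only. Redacting the audio would additionally need word-level timestamps, which this sketch does not cover.

```python
import re

# Hypothetical pattern (an assumption): 13 to 19 digits, optionally
# separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for position, char in enumerate(reversed(digits)):
        value = int(char)
        if position % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def redact_card_numbers(transcript: str, mask: str = "[REDACTED]") -> tuple[str, bool]:
    """Mask card-like numbers; return the new text and whether any were found."""
    found = False

    def replace(match: re.Match) -> str:
        nonlocal found
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            found = True
            return mask
        return match.group()  # leave numbers that fail the checksum untouched

    return CARD_PATTERN.sub(replace, transcript), found
```

The boolean result mirrors the branch described above: when nothing is detected, the transcript can be passed on unchanged.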


Increase your Privacy with IV

Make the audible, searchable

And the invisible, visible.


Intelligent Voice® takes your company’s phone calls (+ email and IM) and turns them into smart data using the “World’s Fastest” speech-to-text engine.

High-speed ASR

Lightning-fast speech-to-text

Automatic speech recognition (ASR) technology is how you turn the spoken word into valuable data you can use. Intelligent Voice has developed high-speed, secure, private ASR that is not only cutting edge, but ready to plug into your existing phone and data systems. With IV, you can quickly and easily turn what is said on the phone into data you can search, with a click of your mouse.

Live call monitoring

Catch anomalies, real-time

‘Live Call Monitoring’ no longer means you must physically sit, or pay someone to sit, and listen to one live call at a time. Human quality assurance may still have its place, but IV offers Live Call Monitoring that goes far beyond individual QA. Until now, there hasn’t been a way to monitor all your company’s calls in real time. Whether in a call center or any other environment, Intelligent Voice’s Live Call Monitoring plugs you into all your calls as they happen and alerts you to anomalous behavior as it occurs.

IVNote + SmartTranscript

Search what’s said

With each call, your IV Technology creates an HTML file, a SmartTranscript, that generates a written record of what was said. It not only transcribes and indexes the call, but is also linked to the call recording and audio player. Clicking on any of the words in the SmartTranscript allows you to JumpTo that specific part of the call, and listen for yourself. With Intelligent Voice, you can search for key terms in a specific call or in your entire archive of SmartTranscript call data.
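As a rough illustration of how a transcript can be both searchable and linked back to the audio, the Python sketch below builds a word index over timestamped transcripts. It is only an assumption about the general idea, not the SmartTranscript format itself. The `Hit` type, the `build_index` and `search` functions, and the input shape (a list of word and start-time pairs per call) are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    call_id: str          # which recording the word came from
    start_seconds: float  # where a player would jump to


def build_index(calls: dict[str, list[tuple[str, float]]]) -> dict[str, list[Hit]]:
    """Map each lower-cased word to every place it was spoken."""
    index: dict[str, list[Hit]] = defaultdict(list)
    for call_id, words in calls.items():
        for word, start_seconds in words:
            index[word.lower()].append(Hit(call_id, start_seconds))
    return dict(index)


def search(index: dict[str, list[Hit]], term: str) -> list[Hit]:
    """Return every occurrence of a term across the whole archive."""
    return index.get(term.lower(), [])


# Hypothetical sample data: two short calls with word-level start times.
calls = {
    "call-001": [("Please", 0.0), ("cancel", 0.4), ("my", 0.9), ("order", 1.1)],
    "call-002": [("Cancel", 12.5), ("it", 13.0)],
}
index = build_index(calls)
hits = search(index, "cancel")
```

Each hit carries the call and the offset, which is the information a player needs to jump to the moment a word was spoken.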

IVNote is for everyone who’s ever wished they could get more out of their phone calls. IVNote is simple. It captures your phone calls, turns them into text, and sends the transcript and the call directly to your inbox. You can be more present in calls, more engaged in the conversation, and ask more relevant questions when half of your attention and mental energy isn’t spent taking notes and trying to track what was said for future reference. Be there fully. Let us track the important points of what was said for you.


Accelerates ‘learning’ & accuracy


Language is a living thing. We all have specific vocabulary that we favor, and it often changes over time. Whether it’s specific work-related terminology or evolving expressions in our personal and cultural lexicon, Intelligent Voice’s language model-building helps the technology “learn” your frequently used terms, for the most accurate transcripts possible. Also, our acoustic modeling trains the system to recognize and adjust to different sound conditions, such as background noise and phone and microphone setup, to home in on voices and clearly capture what is said.

API-based integration

Let our features enhance yours.

Increasingly, and especially since Siri, customers expect their apps to respond to the human voice. Intelligent Voice has what IT developers need to drive their app’s features. Partner with us to give your program interface the voice-enabled functionality your customers want. Our technology is easy to incorporate. Our team of experts is flexible, and eager to help our creation bring yours to life. Whether it’s automatic speech recognition, speech-to-text, recording, indexing, model-building, biometric, hyperphonic or encrypted voice data search—our solutions deliver industry-leading accuracy, speed, and security.

On-site or in-cloud

Choose where your data ‘lives’

While Intelligent Voice turns your phone calls into searchable data—you control where and how that data is stored. Whether you want the flexibility, cost-effectiveness, and quick scalability of hosting your voice data in the cloud, or if you have the expertise and want the added security and control of hosting your data on site, or whether you choose to have your voice data hosted by a third party, IV accommodates. Intelligent Voice helps you find hosting for your voice data that’s right for you.

Biometric Search

Voice ID

Hyperphonic Search

Sounds and Phrases captured, instantly

Encrypted Search

Search sound, keeping the words hidden

Intelligent Voice in Action

Click on keywords and phrases generated using Intelligent Voice,
and JumpTo where it is said...

Wall Street Journal Interview With CTO Nigel Cannings

Epiq Systems talks about Intelligent Voice

Advancing Speech to Text with Intelligent Voice

Posted on May 30, 2018

The “Magic Pipe” Fallacy: Privacy Protection in the Smart Home

Intelligent Digital Assistants (IDAs) or voice-activated smart devices such as Amazon’s Echo and Google Home have become an essential part of today’s smart life. We use them in our homes (e.g. online searches, querying about weather, directions, etc.) as well as in our offices (e.g. recording meetings) to make our lives smarter.

It’s a Matter of Convenience

Indeed, voice technology is sweeping our world and transforming our lives. IDAs and voice-activated televisions (smart TVs) will soon be commonly used in our daily lives. Recent forecasts show that 50% of all searches on the internet will be voice searches by 2020 [14] and there will probably be more digital assistants than humans by 2021 [15]. The research done by J. Walter Thompson and Mindshare [13] shows that efficiency is the main reason for using voice. It shows that the user’s brain activity is lower when voice is used, as compared to when touch or typing are used, which indicates that voice data is more intuitive than any other means of communication. Current common tasks for regular voice users (i.e. those who use voice services at least once a week) are “online searches, finding information about a specific product, asking for directions, asking questions, finding information about a specific brand or company, playing music, checking travel information, setting alarms, checking news headlines and home management tasks” [13].

The Right to Privacy.

Privacy issues in technology were first raised as far back as 1890 by two legal scholars in possibly the most influential privacy article, “The Right To Privacy”, where they examined whether the laws of the time protected the individual’s privacy [8]. They wrote the article mainly in response to the rise of the “snapshot” and its use in photographing people secretly or without their consent. They wrote: “Instantaneous photographs and newspaper enterprise have invaded the sacred precincts of private and domestic life”. “The Right To Privacy” is considered the main foundation of American privacy law [5], and since its publication privacy laws have been passed in some US states to protect individuals. Today, nearly 130 years later, drones fitted with cameras allow anyone to spy from above, and new privacy laws are being passed in the US to limit and govern their use [11].

Today’s technology affects the privacy of individuals on a daily basis through the use of smartphones and social media: photos captured by smartphones are shared on social media websites, making them susceptible to breach by hackers. In addition to these privacy concerns about photos shared in the cloud, the rising use of cloud-based voice recognition systems such as IDAs and smart TVs has added another layer of privacy issues, sneaking up on people right inside their homes.

For many, there is a belief that there is a “magic pipe” that exists between their Alexa-type device, and the ultimate provider of information, very much like typing text into a browser, and getting a webpage direct from, say, a weather website.

The main privacy problem with voice is that voice data is processed online in the cloud, which enables the cloud provider to record and store it. This makes the data vulnerable to breaches by external hackers as well as by the cloud server itself. In fact, the cloud provider acts as the conduit for all information to and from the consumer, which could include sensitive financial and health information. The SSL “padlock” that we see against many websites, protecting data in transit, has no equivalent in the voice-activated world.

What Risks can Voice Really Present?

Voice adds an extra layer of potential privacy intrusion over plain old text communications. Recent progress in voice forensics, driven by advances in AI speech processing by researchers at institutions such as Carnegie Mellon University, makes it possible to profile speakers from their voice data: to estimate the speaker’s bio-relevant parameters (e.g. height, weight, age, physical and mental health) as well as their environmental parameters (e.g. the location of the speaker and the surrounding objects). These research findings have recently been applied to help the US Coast Guard identify hoax callers [6]. This shows how much information can be leaked about speakers when their recordings are breached by hackers, or even when they are used for data mining by cloud voice providers.

So, online speech recognition leads to privacy issues not only because the cloud server will know the speaker’s transcribed text but also because voice data reveals the speaker’s emotions (e.g. joy, sorrow, anger, surprise, etc) and the speaker’s biological features. Voice data contains biometric data that might be used to identify the speaker. In fact, applications for speaker verification (used for authentication purposes) and speaker identification (used to identify a speaker from a set of individuals) are currently being deployed or are already in use in banking and other sectors.

In [4], it has been reported that recent patents by Amazon and Google about use cases of their digital assistants, Echo and Home respectively, reveal privacy problems that could affect smart home owners. In particular, “a troubling patent”, as noted in [4], describes the use of security cameras embedded in smart devices (e.g. an IPA, see Fig. 1) to send video shots to identify a user’s “gender, age, fashion-taste, style, mood, known languages, preferred activities, and so forth” [4].



Fig. 1. Consent vs Amazon’s Echo Look and Google Home Mini

Recently, there has been a rise in privacy concerns among users of Amazon Echo and Google Home, as shown in a recent paper [7] analysing online user reviews. Apparently Amazon’s Echo received bad reviews, mostly concerned with privacy, after being used as evidence in a US court in a murder case in Arkansas [10]. The paper also shows that Google Home reviews were not affected by news reports warning that the devices are always listening, even without being activated [12]. Of course, these devices need to be listening in order to detect their activation keywords (e.g. “Alexa” or “OK Google”), but they should not be recording anything before they spot those keywords.

General Data Protection Regulation (GDPR) vs Voice Data.

The EU GDPR [16], enforceable since May 25th, 2018, defines biometric data as follows: “personal data resulting from specific technical processing relating to the physical, physiological or behavioural characteristics of a natural person, which allow or confirm the unique identification of that natural person, such as facial images or dactyloscopic data”. So GDPR categorises biometric data as sensitive personal data. Sensitive personal data needs to be protected, and it may be processed only with consent or in certain cases where processing is necessary. In particular, speakers’ voice data relates to their physical, physiological and behavioural characteristics, as mentioned above.

Therefore, voice data as well as all other forms of data need to be protected when outsourced to the cloud, and any subsequent processing should be done with consent. Otherwise, if data is breached by hackers, un-protected breached data can be exploited with severe consequences of the type mentioned above.

Achieving Privacy in Voice-activated Applications.

Fortunately, there are some solutions that allow us to enjoy the use of IDAs whilst achieving some measure of privacy. One possible solution is an on-device speech recognition system combined with searchable encryption [3, 2, 1], which is one of the practical methods for performing secure search on encrypted data. An alternative is to have on-device speech recognition as well as on-device intent matching, eliminating the need for any cloud intermediary.

In this case the IDA device could be the user’s smartphone, laptop or desktop computer. The on-device solution allows us to avoid the data-in-use protection needed when performing computation in the cloud. It is more suitable for IDAs since they normally process short-duration voice data in real time.

Performing speech recognition offline at the client side rather than at the cloud side means that at a minimum, the corresponding transcription hides the speakers’ biological and environmental voice features noted above, and only reveals the transcribed texts to the cloud server to enable the server to respond to the speakers’ queries.

The cloud server will use a search engine or any other convenient method to respond to queries depending on dynamic data such as news headlines, weather forecasts, travel information, shopping, etc. However, some very private tasks can be done locally at the user side without using a cloud server such as making phone calls, home management and calendar management.
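The split described above, with private tasks handled on the device and only transcribed text sent out for dynamic queries, can be sketched in Python as a simple router. This is an illustrative assumption, not a description of any shipping product. The `LOCAL_INTENTS` phrase lists, the `Route` type and the `route_transcript` function are hypothetical, and real intent matching would be far more robust than prefix checks.

```python
from dataclasses import dataclass

# Hypothetical trigger phrases for tasks that never need to leave the device.
LOCAL_INTENTS = {
    "phone_call": ("call ", "dial "),
    "home_management": ("turn on", "turn off", "set the thermostat"),
    "calendar": ("add to my calendar", "schedule "),
}


@dataclass(frozen=True)
class Route:
    handler: str  # "local" or "cloud"
    intent: str
    payload: str  # the only data that would be sent off the device


def route_transcript(transcript: str) -> Route:
    """Decide where an on-device transcription is handled.

    The input is already text: the audio, and the biometric detail it
    carries, is assumed to stay on the device in either case.
    """
    text = transcript.strip().lower()
    for intent, prefixes in LOCAL_INTENTS.items():
        if any(text.startswith(prefix) for prefix in prefixes):
            return Route("local", intent, "")  # nothing is sent anywhere
    return Route("cloud", "search", text)  # only the text goes to the server
```

For example, `route_transcript("Call Mum")` stays local with an empty payload, while a weather question is routed to the cloud as plain text.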

Our on-device solution can also perform generic speech recognition, for example to transcribe recorded office meetings or customer service calls. Privacy and security concerns aside, the prospect of outsourcing data storage to the cloud is attractive for a number of reasons. Professional cloud hosting brings robust backup services and unlimited capacity, and it is cheaper and more convenient than maintaining on-premise databases. If stored data is always encrypted in the cloud then many concerns disappear, since encrypted data can still be searched with state-of-the-art searchable encryption techniques. This enables users to search their encrypted data stored in the cloud when needed, without costly download-decrypt-re-upload protocols.

Third-party queries, such as the ones required by the court in the Alexa murder case, could be issued privately through multi-client searchable encryption schemes [17, 3], where the data owner (i.e. the user who recorded the meeting or conference call) alone writes the encrypted data and grants query access to an authorized third party (e.g. a court) according to a policy agreement between the data owner and the third party. The cloud server storing the encrypted audio data cannot read the encrypted queries or the encrypted audio data because it does not have the data owner’s secret keys. It can only learn whether two encrypted queries are the same, and never ‘sees’ the actual plaintext queries.
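The property that the server can tell when two queries match, without ever seeing the plaintext query, can be illustrated with a deliberately simplified Python sketch. It is not one of the schemes cited in [1, 2, 3, 17]. It is a toy keyword index built on HMAC tokens, and it leaks search and access patterns that the published constructions work hard to limit. The names `search_token`, `UntrustedServer` and `index_transcript` are hypothetical, and encrypting the stored recordings themselves is assumed to happen separately.

```python
import hashlib
import hmac
import secrets
from collections import defaultdict


def search_token(key: bytes, keyword: str) -> str:
    """Deterministic keyed token: the same key and keyword give the same token."""
    return hmac.new(key, keyword.lower().encode("utf-8"), hashlib.sha256).hexdigest()


class UntrustedServer:
    """Stores only tokens and opaque record ids; it never holds the key."""

    def __init__(self) -> None:
        self._index: dict[str, set[str]] = defaultdict(set)

    def add(self, token: str, record_id: str) -> None:
        self._index[token].add(record_id)

    def search(self, token: str) -> set[str]:
        # The server can match tokens but cannot invert them to keywords.
        return set(self._index.get(token, set()))


def index_transcript(server: UntrustedServer, key: bytes, record_id: str, transcript: str) -> None:
    """Run by the data owner: upload one token per distinct word."""
    for word in set(transcript.lower().split()):
        server.add(search_token(key, word), record_id)


# The data owner holds the key; an authorized third party would be handed
# tokens (or the key) under the agreed policy.
owner_key = secrets.token_bytes(32)
server = UntrustedServer()
index_transcript(server, owner_key, "rec-1", "transfer the funds today")
index_transcript(server, owner_key, "rec-2", "the meeting moved")
matches = server.search(search_token(owner_key, "funds"))
```

A query made with a different key produces an unrelated token and matches nothing, which is why the server alone cannot run meaningful searches.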

Path of most resistance

Whilst these cryptographic approaches are exciting, they represent a threat to the current order. Google, Apple and Amazon are all building business models that insert themselves in the transaction loop between consumer and brand.

“Alexa, get me a taxi to the airport” represents a major source of potential revenue to Amazon, who act as the arbiter of your intent. You want a cheap taxi, so you don’t care if it is Uber, Lyft or a local cab company. The lucky company pays a small commission to Amazon for being chosen. If Amazon acts as the payment provider, that represents a second source of income.

What is required is an in-home device that is powerful enough to provide the cloud power of speech recognition and intent matching to allow consumers to interact directly with the internet, but which is cheap enough that it provides a bulwark against low-cost devices provided by the major providers. The teardown cost reported by ABI Research of the second-generation Echo Dot is $34.87 [18]. It retails at $49.99 for one device, or $40 for 2, and has been seen for as low as $30. Clearly it is being seen as a loss leader for other services.
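The loss-leader point can be checked with quick arithmetic on the figures quoted above. The Python sketch below compares each price point with the reported teardown cost. It assumes the teardown figure is the only cost, ignoring shipping, marketing and retail margins, so the real per-device position would presumably be worse. The labels and the `margin_per_device` helper are illustrative.

```python
from decimal import Decimal

TEARDOWN_COST = Decimal("34.87")  # ABI Research figure quoted above [18]

# Per-device prices taken from the paragraph above.
PRICE_POINTS = {
    "single device at list price": Decimal("49.99"),
    "two-for-$40 offer (per device)": Decimal("40.00") / 2,
    "lowest price seen": Decimal("30.00"),
}


def margin_per_device(price: Decimal) -> Decimal:
    """Price minus teardown cost; a negative value means a hardware loss."""
    return price - TEARDOWN_COST


margins = {label: margin_per_device(price) for label, price in PRICE_POINTS.items()}
for label, margin in margins.items():
    print(f"{label}: {margin:+.2f} USD")
```

Only the single-device list price clears the teardown cost (by 15.12 USD). The two-for-$40 offer works out at 14.87 USD below cost per device, and the $30 price at 4.87 USD below.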

The question is, in a world where privacy is regularly sacrificed by consumers for access to free services and content, who will blink first, the internet giants who depend on our data to fund their businesses, or the consumers who provide it?

1. R. Bost. Sophos: Forward secure searchable encryption. In ACM CCS 2016.
2. David Cash, Stanislaw Jarecki, Charanjit Jutla, Hugo Krawczyk, Marcel-Catalin Rosu, and Michael Steiner. Highly-scalable searchable symmetric encryption with support for boolean queries. In CRYPTO 2013.
3. Reza Curtmola, Juan Garay, Seny Kamara, and Rafail Ostrovsky. Searchable symmetric encryption: improved definitions and efficient constructions. In ACM CCS 2006.
4. Google, Amazon Patent Filings Reveal Digital Home Assistant Privacy Problems
5. Neil M. Richards. The Puzzle of Brandeis, Privacy, and Speech.
6. Rita Singh, Joseph Keshet and Eduard Hovy. Profiling Hoax Callers. IEEE International Symposium on Technologies for Homeland Security, Boston, May 2016.
7. Lydia Manikonda, Aditya Deotale, Subbarao Kambhampati. What’s up with Privacy?: User Preferences and Privacy Concerns in Intelligent Personal Assistants. AAAI/ACM Conference on Artificial Intelligence, Ethics and Society (AIES) 2018.
8. Samuel D. Warren and Louis D. Brandeis. The Right To Privacy. Harvard Law Review. Vol. 4, No. 5 (Dec. 15, 1890), pp. 193-220.
14. Christi Olson, “Just Say It: The Future of Search is Voice and Personal Digital Assistants,” Campaign, 25 April 2016.
15. Ovum, “Digital Assistant and Voice AI–Capable Device Forecast : 2016-21,” April 2017
16. GDPR.
17. S. Jarecki, C. Jutla, H. Krawczyk, M. C. Rosu, and M. Steiner. Outsourced symmetric private information retrieval. In ACM CCS 13, Berlin, Germany, Nov. 4–8, 2013. ACM Press
18. Disruptive Asia, “Amazon Echo Dot MkII teardown reveals significant cost reduction effort: ABI,” January 2017.


Intelligent Voice has offices in London, New York and San Francisco

Fill out the form below or email us on [email protected] and we will get back to you


Or you can call us here: +44 (0)20 3627 2670

London Office

Intelligent Voice Limited, St Clare House,
30-33 Minories, London, EC3N 1DD.
Co Reg: 2353541

New York Office

Intelligent Voice Inc., 5th Floor, 555
Madison Avenue, New York, 10022.

San Francisco Office

Intelligent Voice Inc., 44 Tehama St,
San Francisco, CA 94105