Since the mid-2000s, few business topics have received as much attention as big data and business analytics,5,8,11,13 including unstructured data derived from social media, blogs, chat, and email messages. In addition to unstructured data, YouTube, Vimeo, and other video sources represent another aspect of organizations' customer services. A 2011 IBM survey of more than 4,000 IT professionals from 93 countries and 25 industries7 identified big data and business analytics as a major business trend for most organizations, along with mobile, cloud, and social business technologies. This trend is also reflected in a number of professional reports and academic journals, including McKinsey Quarterly and MIS Quarterly. The related skills can also potentially help give organizations a competitive advantage.
Big data takes many forms, including Web and social-media data, machine-to-machine data, transaction data, biometric data, and human-generated data. Human-generated data is our focus here, including vast quantities of unstructured data (such as call-center agents' notes, voice recordings, email messages, paper documents, surveys, and electronic medical records). A number of call analytics technologies are available, including voice searching and indexing for call centers through company-specific phonetic-indexing technology. One important application is real-time monitoring that, in a call-center setting, can help address agitated callers and get supervisors involved more quickly. Analytics can process hundreds of hours of audio files in a day, depending on server load, and provide organizations detailed reports on ways to improve customer calls and related job functions, detect problems in operational sectors, and even uncover root problems in products. These systems capture, categorize, store, and analyze unstructured data and can be customized for each customer to include language identification, audio entity extraction, and real-time monitoring.
Here, we review speech and call analytics, especially phonetic search-pattern technology and actual use of voice searching and indexing, along with the benefits of phonetic search-pattern technology. We cover real-world data users, including call-center solution vendors and their clients. We also offer insight that can help managers increase customer satisfaction and potentially give themselves a competitive edge.
The first big data center was built in 1965 when the U.S. government had to store 742 million tax returns and 175 million sets of fingerprints. The birth of the Web in 1989 was a milestone in the history of big data housed and accessed on and through the Internet. In 2004, emerging big data tools like Hadoop began to help business managers and researchers understand that data.17
A 2013 Gartner report4 described big data as high-volume, high-velocity, and high-variety information assets that require new forms of processing to enhance decision making. These elements reflect how much data there is (volume), how quickly it flows in and must be processed (velocity), and how much of it is structured and unstructured, across its various sources (variety). More recently, big data has begun to also reflect veracity, or the abnormalities in data and its business value.
Big data takes several forms: Web and social media data, including click-stream and interaction data from social media; machine-to-machine data, including from sensors, meters, and other devices; big transaction data, including health care claims, telecommunications call records, and utility billing records; biometric data, including fingerprints, genetics, handwriting, and retinal scans; and human-generated data, including unstructured and semistructured data (such as call-center agents' notes, voice recordings, email messages, paper documents, surveys, and electronic medical records).4
Speech and call analytics use phonetic search technology to analyze customer interactions, identify critical areas in need of improvement, and drive business transformation. Phonetic search technology is a method of speech or voice recognition.
Phonetic search-pattern technology. Organizations increasingly use technology based on phonetics, or the systematic study of the sounds of human speech. Within all human languages, there are approximately 400 distinct sounds, or "phonemes," though most use only a fraction of the total. By collecting them, organizations are able to capture a true record of what is said in an audio track that can be searched more accurately and flexibly than human analysts could otherwise do on their own.
The process works in two phases. In the first, recorded audio is input into the system, and a time-aligned phonetic index is generated automatically. Because phonemes are simply uttered sounds, indexing them is not affected by background noise, language, dialect, or speaking style. The second begins when a human analyst requests a search. Searches can be done directly on words or phrases or through special operators (such as Boolean strings and time-based proximity to other content). A search engine identifies and matches the phonetic equivalent of the search string and returns results ranked by relevance.
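The two phases can be illustrated with a minimal Python sketch. It is not any vendor's implementation: the ARPAbet-style symbols, the toy pronunciation table, and the names build_index, search, and near are all assumptions made for illustration, and the index is built from already-labeled phonemes rather than from audio.

```python
from dataclasses import dataclass

# Hypothetical toy pronunciation table (ARPAbet-style symbols). A real system
# would derive phonemes from audio when indexing and from the query text when
# searching; both steps are assumed away here.
TOY_PRONUNCIATIONS = {
    "cancel": ["K", "AE", "N", "S", "AH", "L"],
    "my": ["M", "AY"],
    "account": ["AH", "K", "AW", "N", "T"],
    "refund": ["R", "IY", "F", "AH", "N", "D"],
}


@dataclass(frozen=True)
class Phoneme:
    symbol: str
    start: float  # seconds from the beginning of the recording


def to_phonemes(phrase):
    """Convert a phrase to phoneme symbols (raises KeyError on unknown words)."""
    symbols = []
    for word in phrase.lower().split():
        symbols.extend(TOY_PRONUNCIATIONS[word])
    return symbols


def build_index(timed_phonemes):
    """Phase 1: store a time-aligned phoneme sequence for one recording."""
    return [Phoneme(symbol, start) for symbol, start in timed_phonemes]


def search(index, phrase, min_score=0.8):
    """Phase 2: slide the query's phonemes over the index.

    Returns (score, start_time) pairs, best score first, where score is the
    fraction of query phonemes matched at that position.
    """
    query = to_phonemes(phrase)
    n = len(query)
    hits = []
    for i in range(len(index) - n + 1):
        window = index[i:i + n]
        matched = sum(1 for p, q in zip(window, query) if p.symbol == q)
        score = matched / n
        if score >= min_score:
            hits.append((score, window[0].start))
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return hits


def near(hits_a, hits_b, max_gap):
    """Time-based proximity: pairs of hit times at most max_gap seconds apart."""
    return [(ta, tb) for _, ta in hits_a for _, tb in hits_b
            if abs(ta - tb) <= max_gap]
```

Calling search(index, "account") on an index built from the phonemes of "cancel my account" returns the matching position with its start time, and near() combines two result lists into a simple proximity operator. Scoring by fraction of phonemes matched is only one possible relevance measure, chosen here for brevity.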
The technology can potentially deliver several benefits:
Greater speed. Phonemes are the tiniest building blocks of language. Using them enables quicker audio processing and enhanced ability to find words and phrases in context without complex and difficult-to-maintain dictionaries;
Greater accuracy. Today's spoken human languages are constantly changing. New words, industry terms, blended words, proper names, slang, code words, brand names, and even the nonstandard mixing of different languages are all easily processed through the phonetic approach; and
Greater flexibility. Since the technology is not dictionary-based, the system does not have to be trained on dialects or accents.
Nexidia, an Atlanta, GA-based phonetic-search-and-indexing-technology firm, provides real-time monitoring analytics and call-operator pop-ups that can spare its clients from having to send a service vehicle to a customer's location, saving the client money and supporting its bottom line. Customers in the technology sector handle high interaction volumes around product and technical support. Such interactions can make or break a company's innovation life cycle and its technology's reputation.
As the health-care market shifts to a more consumer-driven, retail-like business model, health-care insurance companies must bolster service to remain competitive by reducing operational costs and increasing enrollment. Solutions must help improve the customer experience, increase first-call resolution, and maximize enrollment.
Failure by insurance companies to comply can cost them in terms of financial results and brand reputation. They must thus take a proactive, continuously vigilant stance. Insurance companies using phonetics are able to detect potential conflicts during calls and provide quick access to relevant call segments for monitoring high-risk transactions. Compliance violations come in many forms, including omission of required language (such as mini-Miranda rights and other disclosures), abusive language, threats of wage garnishment, and harassment, which generate most consumer complaints, according to FICO, a credit-scoring company, and its Engagement Analyzer software platform.
Here, we discuss several technologies and cases of phonetic recognition and analytics, self-identification/authentication solutions, and related applications in the context of call centers. Real-world systems have been implemented or are being planned in a number of directions.
Phonetic recognition, analytics, authentication solutions. A representative example of a call-center technology solution is Impact 360 from Verint Systems, which provides analytic software and hardware for the security, surveillance, and business-intelligence markets.2 It helps generate useful feedback from a market and state-of-the-art consumer trends, helping define and implement marketing strategies and customer segmentation.3
Another real-world application is Voice of the Customer Analysis, or VoCA, from Lucis, which has a strategic partnership with Verint Systems. VoCA analyzes an organization's customer-interaction data collected in call centers, assuming such interactions could include information on customer behavior, emotion, and market trends. The technology is intended to help dramatically reduce customer complaints, provide customized service for each customer, and ultimately yield enhanced customer loyalty. Such a solution can help improve a user organization's analytics ability, business insight, company reputation through complaint management, accurate decision making, as reflected in the voices of customers, and marketing based on well-understood market trends.
Another example is Bridgetec's Catch All, Catch You, and Catch Who systems. Catch All helps organizations transform unstructured phonetic data into well-structured text-based data through its speech-to-text engine for the phonetic data of customers inbound to call centers. The technology also includes a data-mining function using text-based keyword extraction that reports statistics to user organizations. Reporting statistics is useful for identifying customer-related issues and determining the overall status of the call center. Catch All involves an ontology-based technology that includes syntax morphemes and keyword indexing. Catch You and Catch Who deal with customer voice commands and authentication. Catch All helps user organizations evaluate employee performance and implement quality assurance based on voice analytics. Through Catch All, an organization can identify which customers responded to its telemarketing from the possibly millions of calls collected in its database.
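The kind of keyword statistics described above can be sketched in a few lines of Python. This is a generic illustration, not Bridgetec's method: it assumes the calls have already been transcribed to text, and the stop-word list and the function name keyword_call_counts are hypothetical.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "and", "for", "i", "is", "my", "the", "to", "want"}


def keyword_call_counts(transcripts, top_n=3):
    """Count in how many calls each keyword appears.

    Each keyword is counted once per call, so the result approximates
    "number of calls mentioning X". Ties are broken alphabetically so the
    report is deterministic.
    """
    counts = Counter()
    for text in transcripts:
        words = set(re.findall(r"[a-z']+", text.lower())) - STOP_WORDS
        counts.update(words)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]
```

Given three transcripts, two of which mention a bill and a refund, keyword_call_counts(calls, top_n=2) reports those two keywords with a count of 2 each. Counting per call rather than per occurrence keeps one long call from dominating the statistics.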
The Interaction Analytics speech analytics tool from NICE Systems includes phonetic-recognition technology that analyzes why customers contacted the organization about complaints. The technology is useful for real-time speech analytics, phonetic indexing, speech-to-text transcription, speaker separation, emotion detection, and talk-over analysis.12
Catch You and Nuance Recognizer are considered by phonetics analysts the two major commercial phonetic-recognition applications, performing voice-command functions that replace the button-type automatic response system (ARS), thus increasing call or contact center efficiency and speed. The Hyundai credit card company implemented this technology in 2010, replacing a typical ARS it was using in its call centers at the time.
FreeSpeech from Nuance Communications is a representative voice-verification/authentication system.16 Voice verification and authentication identify customers through their unique voices and tones. The technology matches customers with digital audio files stored in a digital audio database, since each human voice has a unique frequency. Another Nuance solution is VocalPassword, which is able to verify customer identity during interaction with self-service voice applications (such as voice response) and mobile apps. The customer recites a passphrase, and the application verifies the person's identity by comparing their vocal tone against those in its database.18
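As a rough illustration of the matching step, the following Python sketch compares a caller's voice features against an enrolled reference. It does not reflect Nuance's products: the fixed-length feature vectors, the cosine-similarity measure, the 0.9 threshold, and the function names are all assumptions, and extracting such features from audio is outside the sketch.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify(claimed_id, sample, enrolled, threshold=0.9):
    """Accept the caller only if the sample is close to the enrolled voiceprint.

    `enrolled` maps customer IDs to reference vectors (hypothetical format);
    unknown IDs are rejected.
    """
    reference = enrolled.get(claimed_id)
    if reference is None:
        return False
    return cosine_similarity(sample, reference) >= threshold
```

With enrolled = {"alice": [0.9, 0.1, 0.4]}, a sample close to that vector is accepted, and a clearly different one is rejected. In practice the threshold trades false accepts against false rejects and would need tuning on real data.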
Phonetic recognition, analytics, authentication. Here, we discuss real-world applications of phonetic analytics in call centers. The first is NH Credit Card of South Korea, which has used Emo-Ray phonetic analytics to determine whether calling customers are angry.6 The system is able to identify which calls should be treated with special care based on analysis of a customer's vocal tone, voice frequency, and other identifying characteristics. Its hit ratio in distinguishing between angry and normal callers reportedly reached 90% in January 2014, following a large data security breach in South Korea. Moreover, NH Credit Card of South Korea analyzes the call histories of customers referred to its call center.
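The source does not define how the hit ratio was measured. One plausible reading, assumed here, is the fraction of calls for which the system's label ("angry" or "normal") agrees with a human reviewer's label, which a short Python sketch makes concrete; the function name and labels are hypothetical.

```python
def hit_ratio(predicted, actual):
    """Fraction of calls whose predicted label matches the reviewer's label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one labeled call is required")
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)
```

Under this reading, a system that labels 9 of 10 reviewed calls correctly has a hit ratio of 0.9. The measure says nothing about whether the errors are missed angry callers or false alarms, which matter differently to a call center.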
The second case involves the phonetic analytics used by AXA Direct Insurance14 call center agents to request guidance as to how they should respond to customer calls, aiming to optimize response time. AXA Direct was thus reportedly able to reduce inefficient call interactions by up to 61% in July 2014, as covered by News1, a South Korean news agency. These cost savings reflected AXA's reduced total call time following improved efficiency of its call center agents' search time. In the long term, AXA could improve its overall first-call resolution and customer satisfaction.
A third real-world example of phonetic solutions being used in call centers involves voice verification,10,15 whereby call-center agents are able to identify calling customers through technology from Nuance Communications and Bridgetec. Call centers verify caller identities through their unique vocal tones, saving time otherwise needed to authenticate customers and promising to enhance customer satisfaction. For example, South Korea's Ministry of Public Administration and Security forbids all companies operating in South Korea from collecting and storing Social Security Numbers (SSNs) in their databases due to the risk of leaking personal information. The technology's potential diffusion into corporate call centers would require highly accurate voice verification and monitoring of potential legal violations. It would also require sufficient time to acquire and use customers' voice and frequency data. SK Telecom, a South Korean mobile service provider, has implemented such voice verification in its call center.
We may thus categorize phonetic technologies according to their function (recognition, analytics, and authentication), along with their major solutions and benefits for call centers; Table 1 classifies the benefits of phonetic analytics for both solution clients and end users. The main benefits include efficiency, customer satisfaction, and support for overall business strategy.
Organizations using phonetic-search-pattern technology can reduce call-handling times and cost and improve customer satisfaction. Just as call centers have progressed into contact centers, speech analytics has likewise progressed into interaction analytics, as channels (such as text and social media) have been added. The goal is to make sense of all the unstructured data flowing into call centers and get it into the hands of the managers who need it.
Phonetic analytics technology helps eliminate barriers to understanding interaction between organizations and their customers in call and contact centers. By defining and tracking metrics as they relate to both organizational and agent performance, managers can identify behavioral issues between call-center agents and customers, as well as the business processes and procedures that stand in the way of achieving strategic business goals. The technology does more than provide surface-level information and statistics on call-center/customer interaction. Agent behavior affecting the business can be identified and then used to help determine how to respond. Business managers should thus consider the following aspects of call and phonetic analytics technology.
Technology. Big data in call analytics and recognition must be secure. Some call data in the banking industry cannot, by law, be used in other industries. Constantly changing words, abbreviations, and technical terms must be updated for call analytics, and voice quality needs to be studied. The major technical challenge involves the number of non-English languages in the world. The technology of phonetic analytics should be able to understand and recognize even the relatively obscure languages people use. Well-structured data, rather than unstructured data, can fit the technical dimension of phonetic technology. However, technological advances in unstructured, phonetic recognition and analytics strive to increase business speed/efficiency, leading to a potential revolution in business intelligence.
Legal. Call-center managers must understand privacy and security18 to be able to build secure data centers reflecting payment-card industry standards, ensuring individual customer privacy. For example, in the case of the South Korea privacy-protection law (signed August 7, 2014) forbidding public institutions from collecting and storing South Korean citizens' SSNs, phonetic-analytics technology may be used to verify customer identity in lieu of SSNs.
Market. Phonetic analytics is expensive. The many analytics solutions, need to secure data volume, and significant speech-analytics-related costs are often barriers to adoption. The major vendors must thus provide clear benefits to potential clients willing to pay when they perceive more benefit than cost in such technology. However, a new market combining phonetic analytics and the Internet of Things has emerged; for example, driving- and cooking-related devices can be connected through phonetic search technology without users having to touch them.
Customer psychology. In addition to concern for the privacy of their personal data, customers are reluctant to use phonetic technology when interacting with corporate call centers, reflecting a type of common business logic. Customers' trust and consent are required for an organization to adopt phonetic analytics technology but, as K.L. Keller9 wrote, it is very difficult for customers to change their purchasing habits for the sake of new products and services.
Return on investment. Speech analytics technology is expensive. Numerous business leaders believe in the potential of big data and analytics but find big data return on investment difficult to measure.1 The cost of maintaining huge databases and analytics is also significant but may soon decrease as more and more companies capture, process, analyze, and store vast amounts of data.
Of the five issues we have outlined here, the most significant barrier preventing adoption of phonetic technology is customer psychology and habits. Most customers are unlikely to change their purchasing habits if they are not familiar with new voice-recognition-and-authentication technologies. Difficulties in phonetic data collection will thus continue to hamper development of phonetic analytics. Moreover, customers in the banking and stock-trading industries are even more reluctant to adopt the new technology, as data error or system failure can potentially produce huge losses.
These problems call for initiatives from solution vendors and solution buyers, or call-center organizations, and for collaboration between them. First, solution vendors must update their technologies to minimize any possible errors; for example, a stock-trading company requires a 100% reliable phonetic system that permits no technical error. Second, solution clients, or call centers, must encourage customers to adopt the technology by providing free calls and discounted service charges. Customers' trust and consent are required before phonetic recognition, analytics, and authentication are implemented. Third, solution vendors and solution clients must collaboratively clarify and share their aims in implementing phonetic technology based on their partnership. Their objectives may be cost reduction (through efficiency), performance enhancement, and/or customer satisfaction. The objectives may vary depending on organizational circumstances. Moreover, governments must communicate with industry stakeholders by reviewing whether phonetic technology could violate privacy-protection laws. However, phonetic-analytics technology is still in an early stage of development, with questions concerning government policy, the technology itself, the phonetics market, and customer purchasing habits. Finally, the three concepts discussed here (phonetic recognition, analytics, and authentication) must be clarified, as they overlap and are sometimes used interchangeably; Table 2 summarizes the implications of phonetic analytics technology for solution vendors, clients, and policymakers.
Business intelligence and analytics may provide an opportunity for organizations to learn more about their own customers' purchasing power, product placement, feedback, long-tail marketing, targeted and personalized recommendations, and increased sales through enhanced customer satisfaction.5 With access to more and more data, organizations are able to solve their customers' problems more quickly and efficiently and improve job functions and authority issues. Data can be assimilated quickly and customized to an organization's individual circumstances, identifying problem areas and providing recommendations and coaching tools for call-center agents. Analysts can help define queries and search information for benchmarking and root-cause analysis and recommendations for problem solving. The goal is to provide organized and easily accessible information, quicker problem solving, increased service value, and ultimately more business.
The future of phonetic analysis involves lots of electronic data. Since social media, blog posts, chats, and email messages are already in text form, organizations are potentially better able to produce a more complete picture of their business environment. Along with unstructured data, YouTube and Vimeo videos are yet another type of customer service platform. That is, both structured and unstructured data can be aggregated into analyses that then help paint a bigger picture. One promising idea for emergency calling, or 911, applications is for the software to track phone calls and paint a picture for first responder(s) sent to the scene of a crime or other location. While business analytics receives considerable attention, virtually all the studies we are aware of have neglected or given only cursory attention to call analytics, or phonetic search and indexing technology. We hope our own research, as outlined here, aids decision makers and managers dealing with unstructured tasks to identify patterns and trends in consumer behavior.
2. Bodner, D. Verint Impact 360 Speech Analytics Helps Shanghai Unicom Listen and Take Action Based on the Voice of Its Customers. Verint Systems, Inc., Melville, NY, Apr. 22, 2014; http://bit.ly/1OM0LTD
3. Bodner, D. Speech Analytics: Use the Voice of Your Customer to Optimize Your Business. Verint Systems, Inc., Melville, NY, 2014; http://bit.ly/1Oj8vMe
4. Buytendijk, F. and Laney, D. Drive Value from Big Data Through Six Emerging Best Practices. Gartner Inc., Stamford, CT, Oct. 29, 2013; http://gtnr.it/1Plvagi
6. Choi, K.M. ChosunBiz. Call centers, they know more about me than I do; http://bit.ly/1U5v1N5
7. Gokhale, V. The 2011 IBM Tech Trends Report: The Clouds Are Rolling In ... Is Your Business Ready? IBM, New York, Nov. 23, 2011; http://ibm.co/1Plc0VR
10. Kim, S.J. Bridgetec Catch All: There is a solution in communications with customers. Newsprime (May 26, 2014); http://bit.ly/1YyzO0b
12. Kostman, D. Speech Analytics: Innovative Speech Technologies to Unveil Hidden Insights. NICE Systems, Inc., Ra'anana, Israel, 2014; http://bit.ly/1Mx71wE
14. Lee, H.C. AXA Direct could reduce inefficient calls up to 61% by using call content analysis in call centers. News1 (July 2, 2014); http://news1.kr/articles/71751498
15. Park, S.Y. Caller authentication solutions: Callers' identity can be identified by their voice. Digital Times (Mar. 9, 2014); http://bit.ly/1RHvtU8
16. Ricci, P. Authentication Via Conversation. Nuance, Inc., Burlington, MA, 2015; http://bit.ly/1ScHoIN
17. Stoecker, D. Hadoop analytics. Alteryx, Inc., Irvine, CA, 2015; http://www.alteryx.com/solutions/hadoopanalytics
©2016 ACM 0001-0782/16/02