The Perils of Data Misreporting

In medicine, the long-standing existence of reported treatment variations has had little practical significance in clinical quality assessment. Likewise, variation in data collection and management has received little attention as a major factor in the measurement of clinical effectiveness. Instead, the results of most medical encounters are driven by economic circumstances and clinical influences, shaped by what is perceived as best for the patient. Unfortunately, such approaches do not make for the best data collection and analysis. In recent years, the importance of good data has grown rapidly as most of the industrialized world has adopted data-driven, or evidence-based, health care, which uses comparative data methods for clinical treatment purposes. The emergence of high-speed information systems, reliable data communications technology, and national data quality standards has provided most countries with an infrastructure for medical service provision using comparative medical evidence and outcomes data derived from large populations [5]. Without consistent classification and reporting of medical information, however, evidence-based decision making becomes an unlikely, or dangerous, proposition.

This article presents the findings of a national survey of all accredited U.S. medical records managers, examining variations in the classification of health care. Overall, the study found significant variation in the application of standardized classification rules, which suggests that a large degree of misreporting of patient data exists across the U.S. This is a cause for concern, since recent national health policy and legislation dictate increased uniformity in data reporting and information management. With this survey, I sought to define the direction and randomness of such variation, observe potential motivating factors for misreporting, and assess the impact on utilization of comparative medical evidence.

Over-reimbursement. Managers reported that records with significant errors often result in over-reimbursement of billing claims. Overall, 8.2% of respondents said more than 5% of their records had significant over-reimbursement errors. Such errors were seen across practice settings, with hospital-based managers reporting that an average of 4.6% of records had significant errors that would result in over-reimbursement. Outpatient, clinic-based managers reported about 4% of their records contained significant over-reimbursement errors.

Table 1. Percent of erroneous records resulting in over-reimbursement.

In the Northeast, 5.2% of managers reported that a significant share (over 5%) of their records contained over-reimbursement errors. This was significantly higher than in the Northwest (3.9%), Southeast (3.8%), Mid-Atlantic (4.1%), Mountain (3.7%), and Pacific (4.2%) regions. This measure excludes minor, typographical, or inconsequential errors and captures, in effect, major inconsistencies between actual and reported diagnoses.

Under-reimbursement. Managers reported that records with significant errors also resulted in under-reimbursement of billing claims. Overall, 16.8% of respondents said that more than 5% of their records had significant under-reimbursement errors. Hospital setting respondents (6.7%) reported greater under-reimbursement levels than clinic (5.4%) and “other” (4.8%) practice setting respondents. Health care organizations do not always misreport in the same direction; at a national level they under-report about twice as often as they over-report, and often disproportionately within different classification groups. Of greater concern, they vary significantly in both directions, depending on their geographic location.
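The direction of misreporting can be checked with simple arithmetic on the percentages quoted in this section. The Python sketch below is only an illustration: it uses the figures as printed, assumes the hospital and clinic under-reimbursement figures (6.7% and 5.4%) are measured the same way as the corresponding over-reimbursement figures (4.6% and about 4%), and introduces the helper name under_to_over_ratio purely for this example.

```python
# Percentages quoted in the survey text; nothing here is new data.
# Share of respondents reporting that more than 5% of their records
# carried a significant error in each direction.
over_share = 8.2
under_share = 16.8

# Figures by practice setting. Assumption: the over- and under-reimbursement
# figures are comparable measures; the clinic "about 4%" is taken as 4.0.
by_setting = {
    "hospital": {"over": 4.6, "under": 6.7},
    "clinic": {"over": 4.0, "under": 5.4},
}


def under_to_over_ratio(over, under):
    """Return how many times larger the under-reporting figure is."""
    return under / over


national_ratio = under_to_over_ratio(over_share, under_share)
setting_ratios = {
    name: under_to_over_ratio(v["over"], v["under"])
    for name, v in by_setting.items()
}

print(f"national under/over ratio: {national_ratio:.2f}")
for name, ratio in setting_ratios.items():
    print(f"{name} under/over ratio: {ratio:.2f}")
```

On these figures the national ratio is close to two, while the per-setting ratios are smaller, which is consistent with the point that the imbalance is not uniform across groups.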

Billing recoding. We found a number of other influences serving to confound medical data reporting. Data is frequently changed by health care billing departments, often to the financial advantage of their organizations. Overall, 14.1% of managers reported that more than 5% of their data classification codes are changed by their respective billing departments. Seven percent of the Southwestern respondents and 15.1% of the Northeastern respondents reported biller changes of the principal diagnosis.

On average, the percentage of the total principal diagnoses changed by the billing department was significantly greater for respondents working in clinic settings (7.4%) than for those working in hospitals (4.3%). Beyond the medical records department, billing personnel often change codes to varying degrees, across regional boundaries and organizations.

Management influences. The findings also suggest that management influences may be acting to adversely affect classification practices. Records managers reported their senior management, as well as third-party managers, emphasized that classification of data, when unclear, should reflect the maximum allowable reimbursement rate. Overall, 43.5% of respondents indicated senior management sought to promote such optimization “often” or “sometimes.” By respondent category, over 47% of hospital respondents, 50.6% of outpatient clinic respondents, and 31% of “other” setting respondents reported senior manager influence occurred “often” or “sometimes.” In terms of regions, 43.8% of Mid-Atlantic region respondents reported that senior managers sought to influence reimbursement optimization “often” or “sometimes,” compared to 34.6% of Mountain region respondents.

External payer organizations, such as insurance companies, likewise appear to exert significant influence over data classification. While slightly below the influence of internal management, the external payer influence was significant. Approximately 33% of respondents indicated their data classification practices varied due to the influence of a specific payer “often” or “sometimes.” About 28% of respondents said they were seldom influenced by specific payers, but nearly 34% of hospital respondents reported they are “often” or “sometimes” influenced by external payers, compared to 47.2% of clinic respondents, 25.3% of “other” practice settings, 35.2% of Northeast region respondents, and 23.1% of Mountain region respondents.

Clarity of guidance. The usability of government guidance on compliance with correct coding and classification guidelines was also identified as a reason for misreported medical data. Managers were asked to indicate the clarity of governmental guidance related to regulatory compliance. Overall, 36.4% of respondents said government guidance was “very” or “mostly” unclear, ranging from a high of 49.3% of Northwest region respondents to a low of 32.9% of Southwestern respondents.

Table 2. Percent of erroneous records resulting in under-reimbursement.

Comparative Limitations

The presence of bidirectional misreporting and misclassification of information suggests that data utilization for comparative analysis must be undertaken with great caution, especially when outcomes or other decision data are compared across regions or countries. In a national, for-profit health care industry, classification systems serve the dual purpose of comparing morbidity/mortality rates and maximizing reimbursement for services. Findings here suggest the existence of a culture in the U.S. of non-random misreporting of data, which varies across regional boundaries, and which is driven largely by profit motives.

While it has been shown that small area variations in health care quality are likely the result of uncertainty regarding the value of a given level of health care delivery, there is, undoubtedly, a similar uncertainty regarding the classification of medical information. Classification criteria are at times ambiguous, and presuppose an ideal environment of complete, accurate, and legible supporting documentation, available for correct code choice. Beyond this uncertainty, however, lies a far greater influence, related to profit-influenced data manipulation. Practices inherent to such a culture may operate in combination, magnifying the overall level of error within any defined database or system.

Why has the problem of misreporting medical information not been addressed? Most variation in medical data has been analyzed for purposes of identifying its financial impact on health reimbursement, at a national level. Because much of the over-reimbursement, or upcoding, balances out the impact of under-reimbursement, or downcoding, the aggregate national bottom line shows only net reimbursement variation, and little attention has been given to data manipulation practices. The use of comprehensive classification accuracy assessments, rather than reimbursement audits, would serve to alleviate this shortcoming to some extent.

However, this ignores the more important clinical impact. As coded and classified data begins to be used for evidence-based comparison of clinical treatment outcomes, the variation effect is multiplied, rather than balanced; comparing underestimated data with overestimated data across varying regional areas multiplies the skewing effect, and degrades the reliability and usability of health care data as a comparative resource [6].
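The contrast between a balanced national bottom line and a skewed regional comparison can be made concrete with a small numerical sketch. Every figure in the Python example below is invented for illustration and is not survey data: two hypothetical regions, "A" and "B", have the same true rate of a severe diagnosis, one upcodes and the other downcodes by the same number of cases, and PAYMENT_DIFF is an assumed payment gap between the two codes.

```python
# Hypothetical illustration: all figures are invented, not survey data.
PAYMENT_DIFF = 1000  # assumed dollar gap between the severe and simple code

regions = {
    # Both regions truly have 200 severe cases per 1,000 patients.
    "A": {"cases": 1000, "true_severe": 200, "reported_severe": 260},  # upcodes
    "B": {"cases": 1000, "true_severe": 200, "reported_severe": 140},  # downcodes
}


def reimbursement_error(region, payment_diff):
    """Dollar over-payment (positive) or under-payment (negative)."""
    return (region["reported_severe"] - region["true_severe"]) * payment_diff


def severe_rate(region, key):
    """Share of cases coded as severe, using the true or the reported count."""
    return region[key] / region["cases"]


# A national reimbursement audit sees only the net dollar figure.
net_error = sum(reimbursement_error(r, PAYMENT_DIFF) for r in regions.values())

# An evidence-based comparison looks at region A relative to region B.
true_ratio = (severe_rate(regions["A"], "true_severe")
              / severe_rate(regions["B"], "true_severe"))
reported_ratio = (severe_rate(regions["A"], "reported_severe")
                  / severe_rate(regions["B"], "reported_severe"))

print(f"net national reimbursement error: ${net_error}")
print(f"true A/B severity ratio: {true_ratio:.2f}")
print(f"reported A/B severity ratio: {reported_ratio:.2f}")
```

In this toy case the net reimbursement error is zero, so a purely financial audit would find nothing, yet the reported data make region A appear to have nearly twice the severe caseload of region B when the true rates are identical.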

From Patient’s Best Interest to Damaged Data

Are other data repositories at risk for systematic misreporting? In a report by the American Medical Association’s Institute for Ethics, nearly 40% of physicians reported they had at least “sometimes” exaggerated the severity of patient conditions, changed patient billing diagnoses, and/or reported non-existent signs or symptoms to help patients secure coverage for needed care [8]. Even as demands for more accurate and complete data emerge, cost pressures are likely to continue to cause providers and provider organizations to misreport data as a means for obtaining reimbursement for their neediest patients. It is unlikely that providers’ views will change regarding their need or obligation to misreport data, given the combination of high expectations for service, and increased patient requests for physicians to modify their medical data for reimbursement purposes [8].

To check the accuracy of coded data, auditors routinely examine hospitals for inappropriate use of codes and excessive classification variation. As an example, for the diagnosis of pneumonia, a number of hospitals have recently paid the government substantial settlement fines for upcoding cases. Pneumonia coding remains a top target in government audits because of the significant reimbursement difference between two related DRG groups: DRG 79 and DRG 89. Bacterial pneumonia (DRG 79) pays an average of $2,500 more per patient than simple pneumonia (DRG 89). Such misreporting of coded data is now a common occurrence in U.S. health care. Multiplying that inflated figure by the thousands of pneumonia cases per year at hundreds of the nation’s hospitals, the enormous potential for error in public health assessment becomes apparent.
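The scale implied by that multiplication can be sketched in a few lines of Python. Only the $2,500 per-patient difference comes from the text; the number of hospitals, the cases per hospital, and the share of cases upcoded are hypothetical placeholders chosen to match the "hundreds" and "thousands" mentioned above, and the function name upcoding_impact is introduced only for this example.

```python
PER_CASE_DIFFERENCE = 2500  # dollars; DRG 79 vs. DRG 89 difference from the text


def upcoding_impact(hospitals, cases_per_hospital, upcoded_percent,
                    per_case_difference=PER_CASE_DIFFERENCE):
    """Return (misclassified cases, excess dollars) for the given assumptions.

    upcoded_percent is a whole-number percentage so the arithmetic stays
    in integers.
    """
    total_cases = hospitals * cases_per_hospital
    misclassified = total_cases * upcoded_percent // 100
    return misclassified, misclassified * per_case_difference


# Hypothetical inputs: 300 hospitals, 1,000 pneumonia cases each per year,
# 5% of cases upcoded. None of these three values comes from the survey.
misclassified_cases, excess_dollars = upcoding_impact(
    hospitals=300, cases_per_hospital=1000, upcoded_percent=5
)

print(f"misclassified cases per year: {misclassified_cases:,}")
print(f"excess reimbursement per year: ${excess_dollars:,}")
```

Under these assumed inputs, the same miscoding that inflates payments also places thousands of patients in the wrong diagnostic group each year, which is the figure that matters for public health assessment.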

Related to data quality at aggregated, regional, or multi-organizational levels, the accumulation of individual provider idealism becomes increasingly problematic, since misreported patient data does not result in routine, random error. While dollar variations may balance out overall at a national level, their classification indicates a pervasive, multidimensional skewing effect that may significantly limit any attempt to enlist the comparative power of information sciences and technologies in evidence-based health care.

As pressures to control health care costs have increased, so has manipulation of reimbursement data. Unfair managed care guidelines related to cost control are often cited as justification to manipulate patient data in order to obtain reimbursement for needed services, as demands on providers’ time increase. Providers who misreport data are thus more likely to believe that manipulation of a dysfunctional health care system is necessary to provide high-quality care in the existing health care structure. They may also be asked by patients to misreport and misrepresent data to insurance companies, and have indicated that misreporting of data for expediency in processing claims is the only way they can have sufficient time to treat their patients adequately, especially the indigent and uninsured [1].

Perhaps more troubling, there is some evidence that providers will create coordinated organization-wide mechanisms to encourage dissemination of misleading information in order to confound what they perceive to be unfair regulatory oversight actions [2]. In an era when both reimbursement levels and clinical error analysis rely on coded and classified data, however, it is critical to accurately determine the consistency and/or reliability of such data, given an environment where physicians, information managers, and payers influence the classification of such data [3, 4].

Conclusion

Comparative, evidence-based studies of groups or populations are increasingly used in the design and delivery of health care service at local, national, and international levels. Inaccurately classified medical information in such environments can result in health care decisions that are ineffective, or potentially life-threatening. Likewise, area variations in coded and classified utilization data taken from small geographic areas often indicate that the effectiveness of a given level of medical service is uncertain, with expenses toward a given treatment outcome ranging widely. Such variation is believed to reflect differences in the individuals and groups practicing medicine. Factors intrinsic to the health industry overall, rather than localized random error, are often responsible for variations in clinical performance [7].

In assessing the accuracy of data provided by health care organizations, service providers are often forced to deal with an inherent conflict between the ethics of undivided loyalty to patients, and pressure to use clinical methods and judgment for social purposes on behalf of third parties. Since such a conflict is now inherent in the health care industry structure, providers are challenged with retaining patient trust while dealing with inconsistent demands of the health care industry, which often work against the interests of patients [1]. Many providers, in effect, will not hesitate to misrepresent data to reimbursement sources, when they perceive it benefits their patients.

Clinicians may have historically had difficulty finding implications in variation studies relevant to their personal practices, but such variation in medical evidence can no longer be seen as an unfortunate by-product of medical service. The examination of aggregated, coded, and classified patient data has since become the preferred means of analyzing service variation in U.S. health care. In an emerging data-driven, global, evidence-based medical environment, fraught with charges of billing fraud and documented medical error, one thing is becoming certain: data is destiny.