While product review systems that collect and disseminate opinions about products from recent buyers (Table 1) are valuable forms of word-of-mouth communication,2,6,12,a evidence suggests that the reviews they collect are overwhelmingly positive. Kadet9 notes that most products receive almost five stars. Chevalier and Mayzlin4 also show that book reviews on Amazon and Barnes & Noble are overwhelmingly positive. Is this because all products are simply outstanding? Not quite: a graphical representation of product reviews reveals a J-shaped distribution (Figure 1) with mostly 5-star ratings, some 1-star ratings, and hardly any ratings in between. What explains this J-shaped distribution? If products are indeed outstanding, why do we also see many 1-star ratings? Why are there so few ratings in between? Is it because there are no "average" products? Or is it because there are biases in product review systems? If so, how can we overcome them?
The J-shaped distribution also creates some fundamental statistical problems. Conventional wisdom assumes that the average of the product ratings is a sufficient proxy of product quality and product sales. Many studies2,3,7,8,10,11 used the average of product ratings to predict sales.b However, these studies showed inconsistent results: some found product reviews to influence product sales, while others did not. The average is statistically meaningful only when it is based on a unimodal distribution, or when it is based on a symmetric bimodal distribution. However, since product review systems have an asymmetric bimodal (J-shaped) distribution, the average is a poor proxy of product quality.
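To make the statistical point concrete, the following minimal sketch compares two rating histograms that share the same average. The review counts are hypothetical and chosen only for illustration; they are not taken from the Amazon data discussed in this article.

```python
def mean_rating(counts):
    """Mean star rating from a dict mapping stars -> number of reviews."""
    total = sum(counts.values())
    return sum(stars * n for stars, n in counts.items()) / total


def share(counts, stars):
    """Fraction of all reviews that gave exactly `stars` stars."""
    return counts[stars] / sum(counts.values())


# Hypothetical review counts (illustrative assumptions, not real data).
j_shaped = {1: 25, 2: 0, 3: 0, 4: 0, 5: 75}   # asymmetric bimodal
unimodal = {1: 0, 2: 0, 3: 10, 4: 80, 5: 10}  # single peak at 4 stars

for name, counts in [("J-shaped", j_shaped), ("unimodal", unimodal)]:
    print(name, "mean:", mean_rating(counts),
          "1-star share:", share(counts, 1))
```

Both histograms average exactly 4.0 stars, yet a quarter of the reviewers in the first one gave the lowest possible rating and none did in the second. The average alone cannot tell these two products apart.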
This report aims to first demonstrate the existence of a J-shaped distribution, second to identify the sources of bias that cause the J-shaped distribution, third to propose ways to overcome these biases, and finally to show that overcoming these biases helps product review systems better predict future product sales.
We tested the distribution of product ratings for three product categories (books, DVDs, videos) with data collected from Amazon between February and July 2005: 78%, 73%, and 72% of the product ratings for books, DVDs, and videos, respectively, are greater than or equal to four stars (Figure 1), confirming our proposition that product reviews are overwhelmingly positive.
Figure 1 (left graph) shows a J-shaped distribution across all products. This contradicts the law of "large numbers" that would imply a normal distribution. Figure 1 (middle graph) shows the distribution for three randomly selected products in each category with over 2,000 reviews. These reviews still follow a J-shaped distribution, implying that the J-shaped distribution is not due to a "small number" problem. Figure 1 (right graph) shows that even products with a middling average rating (around 3 stars) follow the same pattern.
To investigate the causes of the J-shaped distribution of product reviews, we performed a controlled experiment in which every respondent reviewed the same randomly selected product, a music CD titled "Mr. A-Z." The experimental results showed a unimodal distribution, whereas Amazon's empirical data for the very same product showed a J-shaped distribution (Figure 2). Also, the average rating in the experiment was significantly lower than the average rating on Amazon.
The experimental results suggest that the J-shaped distribution is driven by two self-selection biases (Figure 3): purchasing bias and under-reporting bias.
First, because people with higher product valuations are the ones who purchase a product, those with lower valuations are less likely to purchase it and therefore will not write a (negative) product review (purchasing bias). Purchasing bias skews the distribution of product reviews toward positive ratings and inflates the average.
Second, among people who purchased a product, those with extreme ratings (5-star or 1-star) are more likely to express their views to "brag or moan" than those with moderate views (under-reporting bias).
People tend to write reviews only when they are either extremely satisfied or extremely unsatisfied. People who feel the product is average might not be bothered to write a review.
Taken together, these two biases produce a J-shaped distribution. The distribution arises because people who purchase a product are more likely to write positive reviews (purchasing bias), and because people with moderate views are less motivated to spend the time and effort to report their ratings (under-reporting bias) than those who want to "brag or moan."
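The combined effect of the two biases can be sketched as two successive filters applied to a unimodal population of opinions. This is a simplified illustration, not the analytical model used in this article: it assumes each person holds a latent star rating, and every probability below is a hypothetical value chosen for demonstration.

```python
def observed_distribution(population, p_purchase, p_report):
    """Share of posted reviews per star rating after both filters.

    All three arguments are dicts keyed by star rating (1-5):
      population  - share of people holding each latent opinion
      p_purchase  - probability of buying, given that opinion
      p_report    - probability of posting a review, given a purchase
    """
    weights = {s: population[s] * p_purchase[s] * p_report[s]
               for s in population}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


# Hypothetical inputs (assumptions for illustration only).
population = {1: 0.10, 2: 0.20, 3: 0.40, 4: 0.20, 5: 0.10}  # unimodal
p_purchase = {1: 0.20, 2: 0.25, 3: 0.40, 4: 0.60, 5: 0.80}  # purchasing bias
p_report = {1: 0.50, 2: 0.05, 3: 0.02, 4: 0.05, 5: 0.50}    # "brag or moan"

observed = observed_distribution(population, p_purchase, p_report)
for stars in sorted(observed):
    print(stars, "stars:", round(observed[stars], 3))
```

With these assumed inputs, 5-star reviews dominate the posted reviews, 1-star reviews come second, and the middle ratings nearly vanish, even though the most common opinion in the underlying population is 3 stars.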
The experimental results also show that product reviews follow a unimodal distribution, implying that most people have moderate tastes. The experimental results suggest that only 3% of the respondents gave the music CD a 5-star review and 7% gave the product a 1-star review. These results invalidate both rival explanations of extreme tastes and overconfidence.c
Rational people account for these two biases when reading product reviews. Specifically, people pay more attention to extreme reviews than to moderate ones, and they pay particular attention to extreme negative reviews to learn what is wrong with a product.2,4 Therefore, the average rating is not representative of true product quality, and the entire distribution of product reviews must be observed. This is consistent with the recent update of Amazon's product review system (Table 2), which gives the distribution of product ratings prominence over the average.
To test the superiority of our proposed "brag or moan" model with its additional parameters, we compared its ability to predict future product sales against the simple average, three weighted average scores, and the percentages of 1-star and 5-star reviews. Future product sales are a good proxy for product quality because people purchase products they believe will create value for them, and product value predicts future sales. Our analytical brag-and-moan model shows that generating an unbiased estimator of product quality requires the following parameters in addition to the average of product reviews: the standard deviation (to control for divergence of opinions, which raises the uncertainty in inferring true product quality), the two modes of the bimodal distribution of product reviews (to overcome under-reporting bias), and product price (to overcome purchasing bias).
Controlling for previous product sales and the total number of product reviews, our model explains a large share of the variance in future product sales (adjusted R² = 77.29%), significantly more than any of the five competing models. In addition to the simple average, all proposed variables (the standard deviation and the two modes of the bimodal distribution of product reviews, plus product price) were significant predictors of future product sales.
To overcome the two sources of bias in product review systems (purchasing and under-reporting), we propose that people should not rely solely on the readily available simple average but should also consider other variables: the standard deviation and the two modes of the online product reviews, as well as product price. Therefore, product review systems must provide additional information besides the average: the standard deviation and the two modes of the product reviews.
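As a rough illustration of the additional information a review system could report, the sketch below computes the average, the standard deviation, and up to two modes from a histogram of star ratings. The function name `review_summary`, the sample counts, and the definition of a mode as a strict local peak of the histogram are all assumptions made for this example; the article does not prescribe how the two modes should be extracted.

```python
import math


def review_summary(counts):
    """Mean, population standard deviation, and up to two modes.

    `counts` maps each star rating (1-5) to its number of reviews.
    Assumption: a "mode" is a rating whose count is strictly greater
    than the counts of both neighbouring ratings (ties are ignored).
    """
    stars = sorted(counts)
    total = sum(counts.values())
    mean = sum(s * counts[s] for s in stars) / total
    variance = sum(counts[s] * (s - mean) ** 2 for s in stars) / total

    peaks = []
    for s in stars:
        left = counts.get(s - 1, -1)   # -1 stands for "no neighbour"
        right = counts.get(s + 1, -1)
        if counts[s] > left and counts[s] > right:
            peaks.append(s)
    # Keep the two peaks with the most reviews, listed in star order.
    modes = sorted(sorted(peaks, key=lambda s: counts[s], reverse=True)[:2])

    return {"mean": mean, "std": math.sqrt(variance), "modes": modes}


# Hypothetical J-shaped histogram (illustrative counts, not real data).
sample = {1: 15, 2: 5, 3: 5, 4: 10, 5: 65}
summary = review_summary(sample)
print(summary)
```

For this hypothetical histogram the two modes sit at 1 and 5 stars and the standard deviation is large, which signals a split in opinion that the 4.05-star average on its own would hide.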
7. Dellarocas, C., Awad, N., and Zhang, X. Exploring the value of online reviews to organizations: Implications for revenue forecasting and planning. Proceedings of the 24th International Conference on Information Systems, 2004, Washington, D.C.
b. Some exceptions include Chevalier and Mayzlin3 who used the percentages of 1-star and 5-star ratings, and Liu11 who used the percentages of positive and negative ratings.
c. There are other explanations for the J-shaped distribution (brown arrows) (Figure 3). First, people have extreme tastes, and the J-shaped distribution reflects the true state of nature; however, this explanation suggests that most products are either outstanding or abysmal, which is unlikely.9 Second, people are overconfident,1 and the J-shaped distribution reflects their exaggeration of extreme reviews; however, this would result in a symmetric U-shaped (not a J-shaped) distribution. Third, if past reviews (that follow a J-shaped distribution) influence subsequent reviews, past reviews are likely to spawn similar subsequent reviews, reinforcing yet not explaining the J-shaped distribution. Finally, people may choose to write extreme product reviews because they believe readers prefer to read such extreme reviews. However, since readers tend to read extreme negative over extreme positive reviews, this would create the opposite of a J-shaped distribution. Besides, this would enhance the bias in the product reviews that this study seeks to overcome.
Product review systems rely on a simple database technology that allows people to rate products. Following the view of information systems as socio-technical systems, product review systems denote the interaction between people and technology. The challenge is how people use technology to build product review systems that render unbiased information about true product quality.
©2009 ACM 0001-0782/09/1000 $10.00