The goal of content personalization is to ensure the right people receive the right information at the right time. People can have very different preferences and priorities when it comes to their information needs. Here, we describe a content personalization system that uses learned user profiles to target appropriate content items for individual end users [9]. We will ground our discussion with a study of a fully deployed personalized Internet service called “Personalized Television,” which serves personalized TV listings content to over 20,000 users in Ireland and Great Britain.
The ClixSmart content personalization engine has been developed in the Department of Computer Science at University College Dublin. The engine performs two essential tasks. First, it monitors the online activity of users (for a given Web site) and automatically constructs profiles for these users to capture their domain and behavioral preferences; this task is carried out by the profile manager [5, 6]. The actions of individual users are stored as they select (click), browse, and read content assets, and this information is used to infer interest in specific content assets stored in the content database. Second, it uses this user profile information to personalize a target Web site by filtering information content for the target user, eliminating irrelevant content items and highlighting relevant ones [13, 7, 8, 10].
The ClixSmart personalization manager employs two different content filtering strategies. A content-based filtering approach seeks to recommend items similar to those a user liked in the past [4, 11]. In contrast, the collaborative recommendation approach seeks to select items for a given user that similar users also liked.
Content-based filtering has its roots in information retrieval and case-based reasoning research [4]. The success of the content-based method relies on an ability to accurately represent recommendable items in terms of a suitable set of content features, and to represent user profile information in terms of a similar feature set. The relevance of a given content item to a specific target user is proportional to the similarity of this item to the user’s profile; content-based filtering methods select content items that have a high degree of similarity to the user’s profile.
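The similarity-driven selection described above can be illustrated with a minimal sketch. It assumes that items are described by sets of feature strings (for example "genre:sitcom"), that a profile is simply a count of the features of items the user liked, and that cosine similarity is the measure; none of these choices is confirmed as the ClixSmart implementation, and the function names are hypothetical.

```python
import math
from collections import Counter


def build_profile(liked_items):
    """Sum the feature sets of liked items into a weighted profile (assumed representation)."""
    profile = Counter()
    for features in liked_items:
        profile.update(features)
    return profile


def cosine_similarity(profile, features):
    """Cosine similarity between a weighted profile and a binary item feature set."""
    dot = sum(profile[f] for f in features)  # Counter returns 0 for unseen features
    norm_profile = math.sqrt(sum(w * w for w in profile.values()))
    norm_item = math.sqrt(len(features))
    if norm_profile == 0 or norm_item == 0:
        return 0.0
    return dot / (norm_profile * norm_item)


def recommend(profile, candidates, k=2):
    """Return the names of the k candidates most similar to the profile."""
    ranked = sorted(
        candidates,
        key=lambda name: cosine_similarity(profile, candidates[name]),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    # Made-up feature data, for illustration only.
    liked = [
        {"genre:sitcom", "country:US"},
        {"genre:sitcom", "country:US", "cast:X"},
    ]
    candidates = {
        "A": {"genre:sitcom", "country:US"},
        "B": {"genre:news", "country:IE"},
        "C": {"genre:drama", "country:US"},
    }
    print(recommend(build_profile(liked), candidates))
```

Because every candidate is scored only against features already in the profile, an item sharing no features with past choices (candidate "B" here) always scores zero, whatever its actual relevance.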
The downside of content-based recommendation methods is that this content description requirement can be problematic and time consuming. In many domains it represents a significant knowledge-engineering problem and, indeed, it may not even be possible to develop a suitable content description language in the first place. Content-based methods also suffer from a number of shortcomings in the way they select items for recommendation. By its very nature, content-based recommendation relies on recommending items similar to items a given user liked previously; a user profile effectively delimits a region of the item-space from which all future recommendations will be drawn. Therefore, future recommendations will display limited diversity. This is particularly problematic for new users since their recommendations will be based on the very limited set of items represented in their immature profiles. Items that are relevant to a user, but that bear little or no resemblance to the snapshot of items the user has looked at in the past, will never be recommended in the future.
Collaborative filtering techniques are a recent alternative to content-based strategies [13, 8]. The basic idea is to move beyond the experience of an individual user profile, and instead draw on the experiences of a population or community of users. Typically, each target user is associated with a set of nearest-neighbor users by comparing the profile information provided by the target user to the profiles of other users. Collaborative filtering techniques look for correlation between users in terms of their ratings assigned to items in a user profile. The nearest-neighbor users are those that display the strongest correlation to the target user. These users then act as “recommendation partners” for the target user, and items that occur in their profiles (but not in the target user profile) can be recommended to the target user. In this way, items are recommended on the basis of user similarity rather than item similarity.
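The nearest-neighbor procedure described above might be sketched as follows. The sketch assumes that each profile is a mapping from item name to a numeric rating and that Pearson correlation over co-rated items is the correlation measure; the rating scale, the neighborhood size, and all function names are illustrative assumptions rather than details taken from ClixSmart.

```python
import math


def pearson(ratings_a, ratings_b):
    """Pearson correlation over items both users rated (0.0 when undefined)."""
    common = sorted(set(ratings_a) & set(ratings_b))
    if len(common) < 2:
        return 0.0  # too little overlap to correlate
    xs = [ratings_a[i] for i in common]
    ys = [ratings_b[i] for i in common]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def nearest_neighbors(target, profiles, k=2):
    """Return up to k (similarity, user) pairs with positive correlation to the target."""
    scored = [
        (pearson(profiles[target], profiles[user]), user)
        for user in profiles
        if user != target
    ]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:k]


def recommend(target, profiles, k=2):
    """Rank items found in neighbor profiles but missing from the target profile."""
    seen = set(profiles[target])
    scores = {}
    for similarity, neighbor in nearest_neighbors(target, profiles, k):
        for item, rating in profiles[neighbor].items():
            if item not in seen:
                scores[item] = scores.get(item, 0.0) + similarity * rating
    return sorted(scores, key=lambda item: (-scores[item], item))


if __name__ == "__main__":
    # Made-up ratings on an assumed 1-5 scale, for illustration only.
    profiles = {
        "ann": {"Friends": 5, "News": 1, "Quiz": 4},
        "bob": {"Friends": 5, "News": 2, "Quiz": 5, "Ally McBeal": 5},
        "cat": {"Friends": 1, "News": 5, "Quiz": 2, "Documentary": 4},
    }
    print(recommend("ann", profiles))
```

Note that no item features are consulted anywhere: the only evidence used is agreement between users on the items they have both rated.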
Collaborative filtering has a number of advantages over content-based methods. Firstly, since explicit content representations are not needed, the knowledge-engineering problem associated with content-based methods is relieved. More importantly, perhaps, the quality of collaborative filtering typically increases with the size of the user population, and collaborative recommendations benefit from improved diversity when compared to content-based recommendations.
Collaborative filtering does suffer from a number of significant drawbacks. For a start, it is not suitable for recommending new items or one-off content items because these techniques can only recommend items already rated by other users. If a new or one-off item is added to the content database, there can be a significant delay before this item will be considered for recommendation. Essentially, only when many users have seen and rated the item will it find its way into enough user profiles to become available for recommendation. This so-called “latency problem” is a serious limitation that often renders a pure collaborative recommendation strategy inappropriate for a given application domain.
Collaborative recommendation can also prove unsatisfactory in dealing with what might be termed an “unusual user.” In short, there is no guarantee a set of recommendation partners will be available for a given target user, especially if there is insufficient overlap between the target profile and other profiles. If a target profile contains only a small number of ratings or contains ratings for a set of items that nobody else has reviewed, it may not be possible to make a reliable recommendation using the collaborative technique.
It should be clear from this discussion that both content-based and collaborative personalization methods suffer from a number of significant disadvantages. However, taken together, the two techniques complement each other well. For example, content-based filtering can solve the latency problems associated with collaborative filtering. Furthermore, the diversity problem associated with content-based methods is solved by introducing a collaborative component. By integrating both content-based and collaborative filtering strategies, the ClixSmart personalization engine provides a unique and powerful personalization solution.
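The text does not say how the two strategies are combined, so the following is only one plausible sketch of the complementarity argument: the two ranked lists are interleaved, duplicates are dropped, and either list can fill the guide when the other comes back empty. The function name, the interleaving order, and the `limit` parameter are all assumptions.

```python
from itertools import zip_longest


def merge_recommendations(content_based, collaborative, limit=5):
    """Interleave two ranked lists, skipping duplicates (assumed merging policy)."""
    merged = []
    for pair in zip_longest(collaborative, content_based):
        for item in pair:
            if item is not None and item not in merged:
                merged.append(item)
    return merged[:limit]


if __name__ == "__main__":
    # A new program has no ratings yet, so only the content-based list proposes it.
    print(merge_recommendations(["New Show", "Show B"], ["Show B", "Show C"]))
    # An unusual user has no recommendation partners, so the collaborative list is empty.
    print(merge_recommendations(["Show A", "Show B"], []))
```

Under this policy a newly added program can reach a guide through the content-based list before anyone has rated it, while items outside the user's usual feature region can still arrive through the collaborative list.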
PTV Adds a Personal Touch
As the digital television revolution gathers pace, subscribers across the globe are faced with the problem of how to find relevant schedule and show information in a timely fashion. It will become increasingly difficult to discover what programs are on in a given week, never mind locating a small set of relevant programs for a quiet evening’s viewing. The digital TV vendors do recognize this as a serious problem, and they are now offering electronic program guides (EPGs) to help users navigate this digital maze.
However, current EPGs are crude and offer little more than a static, category-based view of the day’s programming; the burden of search remains with the user. Ultimately, they will suffer the same problems of scale as a traditional guide since they do not automatically take user preferences into account.
PTV (www.ptv.ie) is an Internet system offering an innovative solution to this problem by providing a personalized information service for television viewers. PTV uses the ClixSmart personalization engine to generate electronic TV guides personalized for the viewing preferences of individual users. These guides are available as traditional HTML Web pages or as WML pages for WAP-enabled wireless devices, such as mobile telephones.
The PTV content database is made up of a schedule database and a program database. The schedule database stores the current channel schedules and is automatically compiled from TV station Web sites and bulletin boards. Each schedule item includes a program name, its channel and time information, and a textual episode description. The program database contains information about individual programs and films. Each program record is encoded as a set of features including program name, genre, country of origin, cast, studio, director, writer, and so on. The program database is vital for the content-based personalization component of PTV.
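The two record types described above could be modeled roughly as follows. The field list follows the text; the field names, the types, the split of "time information" into start and end times, and the `features` helper are assumptions made for illustration, not the actual PTV schema.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ScheduleItem:
    """One entry in the schedule database (assumed field names and types)."""
    program_name: str
    channel: str
    start: datetime
    end: datetime
    episode_description: str = ""


@dataclass(frozen=True)
class ProgramRecord:
    """One entry in the program database (assumed field names and types)."""
    name: str
    genre: str
    country: str
    cast: tuple = ()
    studio: str = ""
    director: str = ""
    writer: str = ""

    def features(self):
        """Flatten the record into a feature set for content-based matching."""
        feats = {f"genre:{self.genre}", f"country:{self.country}"}
        feats.update(f"cast:{member}" for member in self.cast)
        for label in ("studio", "director", "writer"):
            value = getattr(self, label)
            if value:  # skip fields that were left empty
                feats.add(f"{label}:{value}")
        return feats


if __name__ == "__main__":
    record = ProgramRecord("Example Show", "sitcom", "US", cast=("A. Actor",))
    print(sorted(record.features()))
```

Keeping schedule entries separate from program records means a show's descriptive features are stored once, however many times it appears in the listings.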
The main function of PTV is to construct personalized TV guides for each individual user. Each guide contains programs the user is known to enjoy as well as program recommendations relevant to the user given his or her current profile. The key to the success of PTV is its ability to select truly relevant program recommendations by using the content-based and collaborative personalization techniques supported by the ClixSmart engine. In this way, PTV benefits from all of the advantages of a hybrid recommendation system, including the ability to make high-quality and diverse program recommendations, to cope with new or one-off programs, and to cope with new or unusual users.
An example of a personalized guide produced by PTV is shown in Figure 2, which shows four separate listings for three programs: “Friends,” “Married With Children,” and “Ally McBeal.” This guide has been produced for a user with a strong interest in U.S. sitcoms. The user has previously expressed an interest in programs such as “Friends,” and PTV has further recommended “Married With Children” and “Ally McBeal,” which the user has not encountered before but which PTV judges to be relevant.
Since “Married With Children” and “Ally McBeal” are recommendations from PTV, the user can rate these suggestions by clicking on the grading icons (thumbs-up and thumbs-down) beside these programs. Importantly, this type of information allows PTV to learn about a user’s specific and general viewing preferences. For example, when the user rates “Ally McBeal” positively (small or large thumbs-up), PTV learns not only that the user is interested in this particular program but also that the user may be interested in a range of similar programs, such as other American sitcoms and courtroom dramas. Moreover, PTV also learns about more general viewing preferences, such as the fact that the user likes to watch shows on Network 2 that air during primetime.
Of course, the ultimate judgement of a user’s interest in a program is whether the user actually watches it, but in the current incarnation of the PTV system there is no way of capturing this information directly by monitoring a user’s online behavior. However, users will ultimately be able to access systems like PTV through their television sets and in this case it will be possible to recognize whether a user watches a recommended show, thereby doing away with the need to elicit direct feedback from the user.
Quality Evaluation
To date PTV has proven to be extremely successful and popular. Since its 1999 launch, the system has attracted over 20,000 registered users from Ireland and the U.K. without any significant marketing or advertising. In addition, the PTV system has been developed in WAP and WebTV formats and has been licensed to a number of portal sites and telecommunication companies, including the Irish Times newspaper group, where PTV has been rebranded as MyTV on the ireland.com portal site.
To measure the quality of the personalized guides produced by PTV, we carried out a user study in which regular and new users were asked to evaluate the system in terms of guide precision, ease of use, and speed of service. The guide precision results are of interest here. In total, 310 PTV users took part in the evaluation and were asked to rate the appropriateness of their personalized guides. The results are presented in Figure 3a and are extremely positive. The average personalized guide contained between 10 and 15 programs and, critically, 97% of users rated the quality of these guides as satisfactory or good, with only 3% of users rating the guides as poor.
A second experiment was performed to evaluate the relative precision of the collaborative and content-based personalization strategies used in PTV. Grading logs for a one-month period were analyzed to examine the user ratings associated with programs recommended by each recommendation technique. Specifically, we measured how often each recommendation method produced at least 1, 2, and 3 good ratings (that is, recommendations subsequently graded by users as positive) in a single guide. We also included the results for a naive random recommendation strategy to serve as a benchmark. The results are presented in Figure 3b and show the collaborative recommendation strategy (ACF) consistently outperforms the content-based strategy. For example, the collaborative strategy produces at least one good recommendation per guide in 98% of the user guides produced, compared to 77% for the content-based recommendation strategy. Both of these techniques are significantly better than the random benchmark.
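The "at least 1, 2, and 3 good ratings per guide" measure can be computed along the following lines. This is a sketch under an assumed log format (one list of booleans per guide, True meaning a recommendation was graded positively); the actual PTV grading logs are not described in that detail, and the sample data below is invented.

```python
def hit_rates(guides, thresholds=(1, 2, 3)):
    """Fraction of guides containing at least k positively graded recommendations."""
    if not guides:
        return {k: 0.0 for k in thresholds}
    rates = {}
    for k in thresholds:
        # sum(grades) counts the True entries in one guide
        hits = sum(1 for grades in guides if sum(grades) >= k)
        rates[k] = hits / len(guides)
    return rates


if __name__ == "__main__":
    # Invented grading data: four guides produced by one recommendation strategy.
    guides = [
        [True, True, False],
        [True],
        [False, False],
        [True, True, True],
    ]
    print(hit_rates(guides))
```

Running the same function over the logs of each strategy (collaborative, content-based, random) would give directly comparable figures of the kind reported in Figure 3b.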