Abstract
The suggestion of Points of Interest (PoIs) to people with autism spectrum disorders challenges the research about recommender systems by introducing an explicit need to consider both user preferences and aversions in item evaluation. The reason is that autistic users’ perception of places is influenced by sensory aversions, which can cause stress and anxiety when they visit the suggested PoIs. Therefore, the management of individual preferences is not enough to provide these people with suitable recommendations.
To address this issue, we propose a Top-N recommendation model that combines information about the user’s idiosyncratic aversions with her/his preferences in a personalized way. The goal is to suggest places that (s)he can both like and smoothly experience. We look for a user-specific balance of compatibility and interest within a recommendation model that integrates heterogeneous evaluation criteria to appropriately take these aspects into account.
We tested our model on 148 adults, 20 of whom were people with autism spectrum disorders. The evaluation results show that, on both groups, our model achieves better accuracy and ranking results than recommender systems that are based only on item compatibility, only on user preferences, or that integrate these aspects using a uniform evaluation model. These findings encourage us to use our model as a basis for the development of inclusive recommender systems.
1. Introduction
The personalized suggestion of Points of Interest (PoIs) to fragile users challenges the development of recommender systems19 by broadening the factors to be taken into account in the identification of the most suitable items for the individual user. For instance, people with autism spectrum disorders, who are the main target of this work, have idiosyncratic sensory aversions to noise, brightness, and other sensory features, which influence the way they perceive items, especially places.20 Thus, a recommender system that overlooks these aversions could suggest PoIs that cause a high level of stress and anxiety in the user.7 In order to address this issue, the preference data traditionally used to personalize item recommendation should be combined with information about people’s aversions to estimate the likelihood that users are not only interested in exploring the suggested places, but can also experience them serenely.
Starting from Multi-Criteria Decision Analysis,25 which provides techniques for the evaluation of multiple dimensions of items, and on match-making models based on user-to-item similarity,12 most recommender systems assume that the attributes of an item contribute to its utility to the user in an additive way. However, depending on individual idiosyncrasies and their strength, problematic features might make an item unsuitable, even though it meets the user’s preferences. Moreover, the impact of compatibility on decision-making varies individually and it cannot be separately managed. For instance, some people with autism are determined to visit noisy and crowded places if they like them very much. Therefore, inclusive recommendation models must reflect personal evaluation criteria by balancing feature compatibility and preference satisfaction at the individual level. In the present work, we investigate the role of these two types of information in the personalized suggestion of PoIs to users with, or without, autism spectrum disorders (neurotypical users). We propose a novel Top-N recommender system that applies heterogeneous evaluation criteria to take user preferences and compatibility requirements into account, by exploiting feature-based user profiles for the specification of individual needs.
Our work has two key aspects. Firstly, we acquire data about people’s aversion to sensory features of places in terms of disturbance caused by low or high feature values, for example, darkness or strong light. In this task, we try to limit the amount of information elicited from people as much as possible. For this purpose, we employ a questionnaire derived from Tavassoli,23 which provides data about a user’s aversion to a subset of the values that each feature can take. Then, we interpolate her/his aversion to the whole range of values and we derive the compatibility of the feature as the complement of aversion. Secondly, for the estimation of item ratings, we distinguish user preferences for broad categories of places from idiosyncratic sensory aversions. Moreover, as users might balance differently these aspects in item evaluation, we combine preferences and compatibility by applying user-specific weights, which we acquire by analyzing users’ ratings, in conjunction with their declared preferences and idiosyncrasies.
An important challenge in the development of this type of system is that it must work under data scarcity because few users can be studied to learn their interests. Research studies indicate that autism spectrum disorders affect around 1 in 100 people in Europe.5 Moreover, these people are hard to reach because they have interaction problems and a tendency to avoid new experiences. Finally, their attention problems make it difficult for them to provide detailed feedback about items.13
We tested our model on 148 adults: 20 of them were people with autism, while we did not have any information about the others. However, we can reasonably expect that the second sample reflected the proportion observed in the entire population, including at most 1 or 2 autistic people. On both groups of participants, the accuracy and ranking capability achieved by our model were higher than those of a set of baseline recommender systems that take either item compatibility or user preferences into account in isolation. Moreover, our system outperformed baseline models that uniformly manage compatibility and preference information, without differentiating their contributions.
The approach presented in this paper is part of the Personalized Interactive Urban Maps for Autism project (PIUMA), which aims at developing novel digital solutions to help people with autism in their everyday movements.18 PIUMA involves a collaboration among the Computer Science and Psychology Departments of the University of Torino, and the Adult Autism Center of the city of Torino, Italy. The result of this project is a mobile app that manages dynamic geographical maps specifically conceived for users with autism spectrum disorders, but which target neurotypical people, as well.3
The remainder of this paper is organized as follows: first, we discuss the spatial needs of autistic people (Section 2), and we position our work with respect to the related literature (Section 3). Section 4 outlines how we gather data about users and PoIs, and Section 5 presents our model. Section 6 describes the validation methodology we applied to test our model. Sections 7 and 8 present and discuss the evaluation results. Finally, Section 9 concludes the paper.
2. Spatial Needs of Autistic People
Symptoms of autism span from severe language and intellectual disabilities to the absence of disabilities, and an Intelligence Quotient above the average. Autism entails an atypical sensory perception in over 90% of individuals,23 who can be overwhelmed by environmental factors that are easily managed by neurotypical subjects. At least in part because of these characteristics, people with autism spectrum disorders actively avoid places that may negatively impact their senses.22 Sight, smell, and hearing are relevant to mobility in urban environments, and high sensory stimulation negatively influences individuals in their movements. Further relevant environmental dimensions that could impact the sense of safety are the temperature, openness, and crowding of a place. These idiosyncratic sensory aversions may result in anxiety, fatigue, disgust, sense of oppression, or distraction.18
In order to address this issue, there is a strong need for technological support able to satisfy the spatial needs of people with autism, focusing on aversions derived from their high sensitivity to sensory stimulation. Moreover, as these aversions seem to be highly idiosyncratic, there are no features of places that may reassure the entire autistic population, and the peculiarities of each person have to be considered.16 Therefore, the provision of personalized solutions that adapt to the individual user is extremely important.
3. Related Work
As people with autism spectrum disorders commonly exhibit an affinity with technology, Information and Communication Technologies are largely used to support them in the management of daily activities.16, 17 However, the research about autism tends to pay more attention to children, and it overlooks adults’ needs. This might be a consequence of the “medical model”, which promotes the intervention toward a school-aged target. Moreover, the Human–Computer Interaction community seems to prefer addressing social interaction problems,8, 16 instead of dealing with spatial difficulties.
Most applications investigate the adoption of personalization strategies targeted to autism in the educational domain. For example, Judy et al.11 present a personalized e-learning system that provides learning paths having different difficulty levels, based on the user’s past performance. The authors define ontologies to describe learning materials, annotation schemas, and services, and they use a genetic algorithm as an optimization technique, by representing a set of learning objects as chromosomes.
García et al.6 propose an adaptive Web-based application that helps students with autism spectrum disorders overcome the challenges they might have to face when they attend university. The system adapts the presentation of the information site to autistic and neurotypical students, but the information is the same for everyone. The adaptive functionality is based on learning styles (visual vs. verbal, global vs. analytical, active vs. reflective) and on the user’s history. For example, if the user is more visual than verbal, the video version of content is shown at the top of the learning object. Otherwise, it is moved to the bottom of the object. Hong et al.10 propose to provide users with suggestions within a social network aimed at supporting young adults’ independence. However, they focus on the organization of the social network, by relying on peer suggestions, rather than automatically generating recommendations.
Differently, Costa et al.4 develop a task recommendation system that uses a machine learning technique to supplement the child’s regular therapy. The system suggests the daily activities to be performed (related to eating, keeping clean, getting dressed, and so forth) based on age, gender, and time of day. It does not consider the child’s preferences, and the difficulty level of the activities is manually set by the therapist. Moreover, Ng and Pera15 propose a hybrid game recommender for adult people with autism spectrum disorders, based on collaborative and graph-based recommendation techniques.
Our work differs from the previously listed ones in several aspects. Firstly, we focus on a different domain, that is, spatial support. Secondly, we evaluated our model with autistic people. This has rarely, if ever, been done in the related research. Thirdly, our approach employs personal preferences for item categories, and aversions to sensory features, to steer recommendation in a context where a limited amount of feedback about items can practically be collected.
Our work also differs from general content-based,12 feature-based,9 collaborative and multi-criteria1 recommender systems, because we treat sensory features as sources of discomfort for users, rather than liking or disliking factors. In other words, we separately model the influence of idiosyncratic sensory aversions, which determine the compatibility of items with the user, from her/his preferences for different types of items. Notice that this separation also distinguishes our model from recommender systems that deal with negative preferences,14 because we support the management of heterogeneous criteria to deal with user preferences and sensory idiosyncrasies. Previously, the INTRIGUE2 tourist guide introduced the notion of compatibility requirements in PoI recommendation. However, it did not investigate how their meaning and impact on the evaluation of items differ from those of user preferences.
It is worth mentioning that, while constraint-based recommender systems are too knowledge-intensive for our purposes (we are not suggesting item bundles with constraint satisfaction requirements), the optimization of soft constraints for path finding under suitability criteria is relevant to extend PoI suggestion with instructions for reaching the target places. This type of technique has been explored in recommender systems for routing, such as the work by Verma et al.24
4. Preliminary Study Setup
This section describes how we gathered data about users and places to validate our model. Moreover, it describes the samples of users we involved.
4.1. Knowledge about Users
The acquisition of individual user profiles is a key step to personalize recommendations because it makes it possible to explicitly represent the user preferences and requirements to be considered in item evaluation. User profiles can be explicitly elicited from users, or they can be unobtrusively learned by tracking and analyzing user behavior.19 In this work, we adopted the former technique, which makes it possible to initialize the user profile before the user starts using the mobile guide, and thus supports the identification of unsuitable PoIs from the beginning of the interaction. This approach does not preclude the adoption of dynamic user modeling techniques to update the user profiles while people use the mobile guide, and we have recently extended our work in this direction.
Our questionnaire, shown in Table 1, includes two sections. In the first one (left column of the table), it elicits user preferences about categories of PoIs such as restaurants, parks, etc., in order to learn which ones users like or dislike. In the second section (right column), questions concern users’ aversions to sensory features of places.
Table 1. Short questionnaire to elicit information about preferences and sensory idiosyncrasies (translated from Italian).
The information about sensory aversions is hard to obtain: usually, very long and complex surveys have to be completed for this purpose.20 Moreover, asking people with autism for such data is challenging because they have difficulties in social interactions and they tend to avoid new experiences.21 Given our users’ attention problems,13 and considering the application context of our project, which is not a clinical setting, we decided to avoid long and detailed surveys. Therefore, we carefully prepared with psychologists a short list of questions to capture such information.
We defined the questions about aversions by adapting a subset of the Sensory Perception Quotient (SPQ) test23 on the basis of the findings reported by Rapp et al.18 SPQ is a standard sensory questionnaire for adults that assesses basic sensory hyper- and hyposensitivity. We would have liked to directly use it since it is part of the battery of assessment tests proposed to the patients of the Autistic Adult center in Torino. However, it includes 92 items, too many to be proposed when bootstrapping a mobile guide. As shown in Table 1, our questionnaire is aimed at acquiring aversion information more quickly. Specifically, for some features (brightness and space), the user is asked to evaluate two extreme conditions, that is, low or high levels, assuming that the middle ones are less problematic. In other cases (crowding, noise, and smell), the user is asked about her/his annoyance concerning the highest level, because low levels of these features are neutral.
In our experiment, users filled in the survey of Table 1, possibly in the presence of an operator (when needed), and they answered questions using the [1, 5] Likert scale. Then, we asked users to evaluate 50 specific PoIs located in Torino city center (e.g., How much do you like Castle Square?) in order to collect a dataset of user ratings to test our model. We used the same [1, 5] Likert scale as above, but we included the “I don’t know this place” choice to support opting out.
4.2. Knowledge about PoIs
We used the Maps4All crowd-sourcing platform (https://maps4all.firstlife.org/) as a source of information about places. Specifically, the 50 PoIs mentioned in Section 4.1 are representative of all the categories of places defined in that platform. We selected those PoIs with the requirement that each of them had previously been mapped with the contribution of at least three different crowdsourcers.
The reason for exploiting an ad-hoc platform as a source of information about PoIs, instead of relying on a public Open Data source, is the fact that Maps4All was explicitly designed to support the crowdsourcing of sensory features of places. In contrast, Open Data sources such as OpenStreetMap (https://www.openstreetmap.org) fail to provide the sensory information we needed for our experiment. In particular, for each place, Maps4All enables the user to rate on the [1, 5] scale the level of (i) brightness, (ii) crowding, (iii) noise, (iv) smell, (v) openness, and (vi) temperature. These sensory features have been defined based on the findings of the user study presented by Rapp et al.,18 and of state-of-the-art research.20 Notice that, by interacting with Maps4All, the user can also provide a global rating of the place.
We populated the Maps4All platform through two experimental crowdsourcing sessions, during two lessons at the Master’s degree in Social Innovation and ICT at the University of Torino, in May and December 2019. About 120 students participated in the crowdsourcing tasks. In order to guarantee the collection of a reasonable amount of data about places, we asked each of them to provide evaluations for at least three PoIs in Torino city center. In total, during the two sessions, we collected the evaluations of 282 items.
For our study, we involved two groups of users:
- 20 adults with autism spectrum disorders (from 22 to 40 years old, mean age: 26.3, median 28; 11 men, 9 women), who are patients of the Autistic Adult Center of Torino, medium and high functioning.
- 128 neurotypical subjects (from 19 to 71 years old, mean age: 28.1, median 23; 63 men, 65 women), who are university students or contacts of the authors of this paper.
All participants signed a privacy consent form in accordance with the General Data Protection Regulation. Moreover, we obtained approval for the study from the research ethics committee of the University of Torino.
Regarding the 50 PoIs we selected, the mean number of evaluations we obtained is 31 for autistic participants and 39 for neurotypical ones.
5. Recommendation Model
As previously discussed, we assume that both user preferences and item compatibility should be taken into account to identify the most relevant items that a user can smoothly experience and like, at the same time. However, evaluation criteria might be personal. Moreover, these aspects can be weighted differently in decision-making processes. For instance, in contrast to the tendency of people with autism spectrum disorders to visit places in which they feel comfortable, during our participatory design interview sessions we encountered a few subjects who face the challenges of noisy and crowded environments in order to be able to carry out the activities they like very much. We thus propose a recommendation model that, based on the observed item evaluations, can weigh the contribution of compatibility and preferences in rating prediction, on a user-specific basis.
For clarity purposes, we split the presentation of our model as follows. In Section 5.1, we describe the input data for recommendation. In Section 5.2, we specify how we estimate the compatibility of the individual features of an item with the user. Then, we present the estimation of the overall compatibility of the item with the user (Section 5.3) and the preference-based item evaluation (Section 5.4). In Section 5.5, we describe how we combine compatibility and preference-based evaluation to predict item ratings.
Before describing our model, we introduce the notation we use:
- U is the set of users and I the set of items of the domain.
- C is the set of item categories, such as shops and cinemas.
- L is a Likert scale in [1, υmax]. In this work, υmax = 5.
- F = F↑ ∪ FV is the set of sensory features defined in our domain. We assume that each feature f ∈ F takes values in L.
Specifically, F↑ is the set of features f such that, the higher the value of f, the stronger its negative impact on the user. For instance, noise belongs to this class. Differently, FV denotes features whose extreme values make users uncomfortable, while the middle ones are less problematic, for example, brightness.
In our domain, there are no features such that people are expected to feel comfortable with high values and uncomfortable with low ones. Thus, we omit this class.
For each user u ∈ U, and item i ∈ I, we estimate u's evaluation of i (denoted as $\hat{r}_{ui}$) as a decimal number in the [1, υmax] interval, by taking u's previous ratings, preferences for item categories, and idiosyncrasies into account.
Our model takes the user and item profiles as input. The profile of u ∈ U, extracted from the questionnaire data, specifies:
- The ratings rj in L that (s)he provided for a set of items j ∈ I.
- Her/his declared preferences for the categories c ∈ C, each one expressed in the L scale.
- Her/his declared sensory aversion to specific values of item features, expressed in L. We denote u‘s aversion to a value υ of a feature f ∈ F as aufυ. For example, auf5 = 4 means that u is fairly disturbed by items having f = 5.
For each feature f ∈ F↑, we assume by default that auf1 = 1. Therefore, the user profile stores a single value, aufυmax, which specifies u‘s aversion to the maximum value of f. We denote the maximum value of f as υmax.
For each feature f in FV, the user profile stores two values which express u‘s aversion to the minimum and maximum values of f, respectively, for example, {auf1 = 3, aufυmax = 4}
Differently, the profile of an item i specifies the category c ∈ C of the item, and a vector storing, for each feature f ∈ F, the value of f in item i retrieved by querying the Maps4All platform. For each feature f, Maps4All returns the mean evaluation $\bar{f}_i$ it collected from crowdsourcers; $\bar{f}_i$ takes values in the [1, υmax] interval.
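The two profiles described above can be pictured as simple records. The following is only a minimal sketch: the class and field names (`UserProfile`, `ItemProfile`, `aversion_min`, `aversion_max`) are hypothetical and are not taken from the PIUMA implementation.

```python
from dataclasses import dataclass, field
from typing import Dict

V_MAX = 5  # upper bound of the Likert scale L = [1, V_MAX]


@dataclass
class UserProfile:
    # item id -> rating in L
    ratings: Dict[str, int] = field(default_factory=dict)
    # category -> declared preference in L
    category_prefs: Dict[str, int] = field(default_factory=dict)
    # feature -> declared aversion to the maximum feature value (all features)
    aversion_max: Dict[str, int] = field(default_factory=dict)
    # feature -> declared aversion to the minimum value (V-type features only)
    aversion_min: Dict[str, int] = field(default_factory=dict)

    def aversion_to_min(self, feature: str) -> int:
        # For features where only high values disturb, aversion to value 1
        # defaults to 1, as stated in the text.
        return self.aversion_min.get(feature, 1)


@dataclass
class ItemProfile:
    category: str
    # feature -> mean crowdsourced value in [1, V_MAX]
    features: Dict[str, float] = field(default_factory=dict)


user = UserProfile(
    category_prefs={"park": 5},
    aversion_max={"noise": 4, "brightness": 4},
    aversion_min={"brightness": 3},
)
item = ItemProfile(category="park", features={"noise": 3.0, "brightness": 2.5})
```

Storing the default aversion to the minimum value in a method, rather than in the data, keeps the record as small as the questionnaire itself.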
5.2. Compatibility of Individual Features with the User
We can define compatibility as the opposite of aversion to feature values. However, user profiles only include one or two aversion values declared by users for each feature. Thus, the missing ones have to be interpolated. In the following, we describe the patterns we apply to approximate a user’s aversion to item features, starting from the values stored in her/his profile.
For each f ∈ F↑, we approximate aversion as a linearly increasing function. Let us represent feature values on the X axis, and user aversion on the Y axis of a plane. Then, we can define this function as a line that connects point (1, 1) to point (υmax, aufυmax), as in Figure 1:
$$line_{\uparrow}(x) = 1 + \frac{a_{uf\upsilon_{max}} - 1}{\upsilon_{max} - 1}\,(x - 1) \qquad (1)$$
Figure 1. Interpolation of a user’s aversion to a feature of type F↑.
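We thus estimate u's aversion to f in i (eaufi) as follows:
$$ea_{ufi} = line_{\uparrow}(\bar{f}_i) \qquad (2)$$
where $\bar{f}_i$ is the value of f in i. For instance, the line in Figure 2 shows the interpolation of a user’s aversion to a feature f. Given a user u with aufυmax = 4, and a PoI i such that $\bar{f}_i$ = 3, eaufi = line↑(3) = 1 + (4 − 1)/(5 − 1) · (3 − 1) = 2.5. Thus, u's aversion to f in i is approximated to 2.5.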
Figure 2. Example of the interpolation of a user’s aversion to a feature of type F↑.
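The linear interpolation for features of type F↑ can be sketched in a few lines. This is an illustrative sketch only; the function name `line_up` and its parameter names are assumptions, not part of the original system.

```python
V_MAX = 5  # upper bound of the Likert scale


def line_up(x: float, a_max: float, v_max: int = V_MAX) -> float:
    """Aversion on the line through (1, 1) and (v_max, a_max), evaluated at x."""
    slope = (a_max - 1) / (v_max - 1)
    return 1 + slope * (x - 1)


# Example of Figure 2: aversion 4 to the maximum value, feature value 3.
print(line_up(3, a_max=4))  # 2.5
```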
Differently, for each f ∈ FV, and given {auf1, aufυmax} in u's profile, we interpolate aversion by means of a convex function on the range of f. The aversion function has a “V” shape, which we approximate by drawing two lines, as in Figure 3:
Figure 3. Interpolation of a user’s aversion to a feature of type FV.
- line↑ connects points (1, 1) and (υmax, aufυmax) to represent the increment of aversion toward the maximum value of f.
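- line↓ connects points (1, auf1) and (υmax, 1) to represent the decrease in aversion while f takes higher values than its minimum:
$$line_{\downarrow}(x) = a_{uf1} - \frac{a_{uf1} - 1}{\upsilon_{max} - 1}\,(x - 1) \qquad (3)$$
We estimate u's aversion to f in i by selecting the maximum of the values that the two lines take at $\bar{f}_i$, the value of f in i:
$$ea_{ufi} = \max\left(line_{\uparrow}(\bar{f}_i),\; line_{\downarrow}(\bar{f}_i)\right) \qquad (4)$$
Let us look at the example in Figure 4. Given a PoI i such that $\bar{f}_i$ = 3, eaufi = max(line↑(3), line↓(3)). Thus, u's aversion to f in i is estimated as max(2.5, 2) = 2.5.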
Figure 4. Example of the interpolation of a user’s aversion to a feature of type FV.
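The "V"-shaped interpolation can be sketched as the maximum of the two lines. The function names below are hypothetical, and the profile values {auf1 = 3, aufυmax = 4} are those used in the running example of the text.

```python
V_MAX = 5  # upper bound of the Likert scale


def line_up(x: float, a_max: float, v_max: int = V_MAX) -> float:
    # Line through (1, 1) and (v_max, a_max).
    return 1 + (a_max - 1) / (v_max - 1) * (x - 1)


def line_down(x: float, a_min: float, v_max: int = V_MAX) -> float:
    # Line through (1, a_min) and (v_max, 1).
    return a_min - (a_min - 1) / (v_max - 1) * (x - 1)


def aversion_v(x: float, a_min: float, a_max: float, v_max: int = V_MAX) -> float:
    # Estimated aversion for a V-type feature: the larger of the two lines at x.
    return max(line_up(x, a_max, v_max), line_down(x, a_min, v_max))


print(aversion_v(3, a_min=3, a_max=4))  # 2.5
```

At the extremes the function returns the declared values (3 at x = 1 and 4 at x = 5), so the interpolation agrees with the questionnaire answers.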
Notice that eaufi takes values in the [1, υmax] interval. Moreover, higher values of this measure mean that the feature generates more discomfort to u.
Given eaufi, the compatibility of f with u in i, denoted as compufi, can thus be defined as:
$$comp_{ufi} = \upsilon_{max} + 1 - ea_{ufi} \qquad (5)$$
For example, if eaufi = 2.5 and υmax = 5, compufi = 3.5.
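As a quick check of the example, compatibility can be computed as the complement of the estimated aversion on the [1, υmax] scale. The form υmax + 1 − aversion used here is an assumption inferred from the worked example (2.5 maps to 3.5); the function name is hypothetical.

```python
V_MAX = 5  # upper bound of the Likert scale


def feature_compatibility(aversion: float, v_max: int = V_MAX) -> float:
    # Assumed complement: maps aversion 1 to compatibility v_max and vice versa.
    return v_max + 1 - aversion


print(feature_compatibility(2.5))  # 3.5
```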
5.3. Overall Item Compatibility: Aggregation Measures
We propose alternative aggregation measures to compute the overall compatibility of an item i with a user u (compui) by modeling different types of influence of individual features.
In Section 7, we evaluate their performance, in combination with diverse recommendation algorithms.
- Min. This measure defines compui as the minimum compatibility of i's features with u:
$$comp_{ui} = \min_{f \in F} comp_{ufi} \qquad (6)$$
Min is conjunctive and it evaluates i as incompatible with u if the item has at least one incompatible feature.
- Ave. In this case, compui is the mean compatibility of the features of i:
$$comp_{ui} = \frac{\sum_{f \in F} comp_{ufi}}{|F|} \qquad (7)$$
where |·| denotes set cardinality. This measure is additive (disjunctive) and equally balances the influence of the features on compatibility.
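The difference between the two measures is easy to see on a small example. In this sketch, the per-feature compatibility values are made up for illustration and the function names are hypothetical.

```python
from statistics import mean
from typing import Dict


def comp_min(feature_comps: Dict[str, float]) -> float:
    # Conjunctive: one poorly compatible feature drags the whole item down.
    return min(feature_comps.values())


def comp_ave(feature_comps: Dict[str, float]) -> float:
    # Additive: every feature contributes equally to the overall value.
    return mean(list(feature_comps.values()))


comps = {"noise": 3.5, "brightness": 4.0, "crowding": 1.5}
print(comp_min(comps))  # 1.5
print(comp_ave(comps))  # 3.0
```

With these values, Min flags the item as poorly compatible because of crowding alone, while Ave still rates it as moderately compatible.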
We also define two aggregation measures that estimate the overall compatibility of an item i with a user u as a function of the distance between the features of i (stored in the vector $\vec{i}$) and those of an ideal item that best matches u's idiosyncrasies. We denote this item as $\vec{i}^{*}_{u}$. For each feature f ∈ F, $f^{*}_{u}$ is the most compatible value of f, based on u's estimated aversion to sensory features. Specifically, for each f ∈ F↑, $f^{*}_{u}$ = 1, because aversion increases with the value of f (see the red point in Figure 5). Moreover, for each f ∈ FV, $f^{*}_{u}$ is represented by the value of f associated with the minimum aversion. For instance, see the violet point in Figure 6.
Figure 5. Identification of $f^{*}_{u}$ for a feature of type F↑.
Figure 6. Identification of $f^{*}_{u}$ for a feature of type FV.
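For a feature of type FV, the value with minimum estimated aversion is the point where the increasing and decreasing lines cross. The sketch below derives that point in closed form from the two line definitions; the function names and the choice of returning 1 when aversion is flat are assumptions made for illustration.

```python
V_MAX = 5  # upper bound of the Likert scale


def ideal_value_up() -> float:
    # For F-up features aversion grows with the value, so the ideal value is 1.
    return 1.0


def ideal_value_v(a_min: float, a_max: float, v_max: int = V_MAX) -> float:
    """Value where the rising and falling aversion lines intersect."""
    rise = a_max - 1   # total increase of the rising line over [1, v_max]
    fall = a_min - 1   # total decrease of the falling line over [1, v_max]
    if rise + fall == 0:
        # Flat aversion: every value is equally compatible (arbitrary choice).
        return 1.0
    return 1 + fall * (v_max - 1) / (rise + fall)


print(ideal_value_v(a_min=3, a_max=4))
```

When only one extreme disturbs the user, the formula falls back to the opposite extreme: `ideal_value_v(1, 4)` gives 1 and `ideal_value_v(3, 1)` gives υmax.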
The two vector-based aggregation measures for the computation of the overall compatibility of i with u are
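- Cos. In this measure, compui is the Cosine similarity between the feature vector of i, $\vec{i}$, and that of the ideal item for u, $\vec{i}^{*}_{u}$:
$$comp_{ui} = \frac{\vec{i} \cdot \vec{i}^{*}_{u}}{\lVert \vec{i} \rVert_F \times \lVert \vec{i}^{*}_{u} \rVert_F} \qquad (8)$$
where · is the dot product of vectors, ǁ·ǁF is the Frobenius norm, and × is the product of real numbers. A small angle between $\vec{i}$ and $\vec{i}^{*}_{u}$ means that i is highly compatible with u, and vice versa.
- RMSD. In this case, compui is the complement of the root mean square deviation between $\vec{i}$ and $\vec{i}^{*}_{u}$, where the deviation is computed as:
$$RMSD(\vec{i}, \vec{i}^{*}_{u}) = \sqrt{\frac{\sum_{f \in F} \left(\bar{f}_i - f^{*}_{u}\right)^2}{|F|}}$$
with $\bar{f}_i$ and $f^{*}_{u}$ denoting the value of f in $\vec{i}$ and in $\vec{i}^{*}_{u}$, respectively. The smaller the distance between $\vec{i}$ and $\vec{i}^{*}_{u}$, the more compatible i is with u.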
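The two vector-based measures can be sketched with the standard library only. The cosine similarity and the root mean square deviation follow their textbook definitions; the way the deviation is turned into a compatibility value (`v_max - rmsd`) is an assumption made for this sketch, since the text only states that the complement is taken.

```python
import math
from typing import Sequence

V_MAX = 5  # upper bound of the Likert scale


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    # Dot product divided by the product of the vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rmsd(a: Sequence[float], b: Sequence[float]) -> float:
    # Root mean square deviation between two equally long vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def comp_rmsd(item: Sequence[float], ideal: Sequence[float], v_max: int = V_MAX) -> float:
    # ASSUMPTION: complement taken as v_max - RMSD, which lies in [1, v_max]
    # when all feature values lie in [1, v_max].
    return v_max - rmsd(item, ideal)


item_vec = [3.0, 2.5, 4.0]    # made-up feature values of an item
ideal_vec = [1.0, 2.6, 1.0]   # made-up ideal values for a user
print(cosine(item_vec, ideal_vec), comp_rmsd(item_vec, ideal_vec))
```

Note that cosine similarity only compares the direction of the two vectors, so an item whose features are all proportionally higher than the ideal ones still scores as fully compatible.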
5.4. Preference-Based Item Evaluation
While compatibility indicates whether the user can smoothly experience an item, it does not mean that (s)he will like it. User preferences have to be taken into account for this purpose. In our domain, the only preference that we consider is the interest in the category of the item to be evaluated. Thus, the preference value of a user u for an item of category c ∈ C corresponds to the value of u‘s preference for c stored in u‘s profile. We denote this value as puc.
It is worth mentioning that, if more preferences had to be modeled, a Multi-Criteria Decision Analysis approach might be applied to compute an overall preference estimation as a weighted function of preferences for individual attributes. However, this is out of the scope of the present work.
In order to balance compatibility and preferences in a personalized way, we propose to identify user-dependent evaluation criteria by exploiting the user’s idiosyncrasies and preferences, in combination with the ratings of items (s)he provides. Specifically, we estimate the rating $\hat{r}_{ui}$ that a user u will give to an item i as a weighted mean of overall compatibility and user preferences:
$$\hat{r}_{ui} = \alpha \cdot comp_{ui} + (1 - \alpha) \cdot p_{uc_i} \qquad (10)$$
where α takes values in the [0, 1] interval, and puci ∈ L is the preference-based evaluation of i, given u's profile. This model, henceforth referred to as Ind (that is, Individual), identifies a specific α value for each user to optimize item recommendation to her/him. We identify the value of α for each u ∈ U as the one that minimizes the distance between estimated ratings and ground-truth ones.
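A user-specific α can be found, for instance, by a simple grid search over [0, 1]. The sketch below is only one possible reading: the grid resolution, the use of mean absolute error as the distance, and the function names are assumptions, and the validation section reports that the actual training optimizes MAP.

```python
from typing import List, Tuple

# One training sample per rated item: (comp_ui, p_uci, observed rating).
Sample = Tuple[float, float, float]


def predict(alpha: float, comp: float, pref: float) -> float:
    # Weighted mean of overall compatibility and category preference.
    return alpha * comp + (1 - alpha) * pref


def fit_alpha(samples: List[Sample], steps: int = 20) -> float:
    """Return the alpha on a uniform grid that minimizes mean absolute error."""
    best_alpha, best_err = 0.0, float("inf")
    for k in range(steps + 1):
        alpha = k / steps
        err = sum(abs(predict(alpha, c, p) - r) for c, p, r in samples) / len(samples)
        if err < best_err:  # strict comparison keeps the smallest best alpha
            best_alpha, best_err = alpha, err
    return best_alpha


# A made-up user whose ratings follow compatibility and ignore preferences.
comfort_seeker = [(4.0, 2.0, 4.0), (2.0, 5.0, 2.0)]
print(fit_alpha(comfort_seeker))  # 1.0
```

A user whose ratings track the declared category preferences would instead obtain α = 0, and intermediate values describe people who trade comfort against interest.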
6. Validation Methodology
We aim at assessing whether a recommendation model that takes both item compatibility and user preferences into account is more effective than an approach based on a single type of information. Moreover, we aim at evaluating the usefulness of a personalized balance of these aspects, as specified by the α parameter of Equation (10). For these purposes, we compare our model to a set of recommender systems that (i) uniformly manage compatibility and user preferences, ignoring their possibly different impact on decision-making, or (ii) focus either on compatibility or on preferences. We consider the following baselines:
- Multi-Criteria (MC). This recommender system estimates item ratings by uniformly treating idiosyncratic aversions and preferences on the basis of the aggregation measures described in Section 5.3. Given an item i, it computes the estimated rating $\hat{r}_{ui}$ by fusing u's preference for the category of i (puci) with the compatibility of individual features with u (compufi) by means of a single aggregation function. For example, this function could be the mean of all these values, as in Equation (7).
- C-only. This is a configuration of our recommendation model in which α = 1. In this case, items are evaluated exclusively on the basis of their compatibility with the user.
- Pref-only. In this configuration of our model, we set α = 0 to evaluate items on the exclusive basis of the user’s preferences.
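The three baselines can be contrasted in a few lines. This is a hedged sketch: the function names are hypothetical, the MC variant shown uses the mean as its single aggregation function (one of the options mentioned above), and the input values are made up.

```python
from typing import Sequence


def ind(alpha: float, comp: float, pref: float) -> float:
    # Weighted mean of overall compatibility and category preference.
    return alpha * comp + (1 - alpha) * pref


def c_only(comp: float, pref: float) -> float:
    # alpha = 1: only compatibility matters.
    return ind(1.0, comp, pref)


def pref_only(comp: float, pref: float) -> float:
    # alpha = 0: only the category preference matters.
    return ind(0.0, comp, pref)


def mc_mean(pref: float, feature_comps: Sequence[float]) -> float:
    # Uniform treatment: the preference is just one more value to average.
    values = [pref, *feature_comps]
    return sum(values) / len(values)


print(c_only(3.5, 2), pref_only(3.5, 2), mc_mean(5, [3, 4]))  # 3.5 2.0 4.0
```

In `mc_mean`, the weight of the preference shrinks as the number of sensory features grows, which is one way to see why a uniform model cannot express a user-specific balance.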
We did not select as baselines any collaborative or feature-based recommenders such as those proposed by Adomavicius1 and Han,9 because the available user data is too limited to train those algorithms.
We separately compare our model to the above baselines on the dataset of the users with autism spectrum disorders (henceforth denoted as AUT), and on the one regarding neurotypical users (NEU). For the comparison, we configure all the algorithms on each aggregation measure of Section 5.3. The resulting configurations are named by appending the name of the selected measure to that of the applied algorithm. For instance, IndCos represents the application of the Cos aggregation measure to model Ind.
To evaluate recommendation performance, we consider ranking capability (MRR and MAP), accuracy (Precision, Recall, and F1), error minimization (MAE and RMSE), and user coverage. However, consistently with recent trends in the evaluation of recommender systems, we pay special attention to ranking metrics because they help understand whether the items that the user likes are placed in the first positions of the suggestion list, or not.
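To make the ranking metrics concrete, the per-user quantities behind MRR and MAP can be sketched as follows; MRR and MAP are then the means of these values over all users. The normalization of average precision varies across the literature, and the one used here, as well as the function names and the made-up relevance set, are assumptions of this sketch.

```python
from typing import Sequence, Set


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    # 1 / position of the first relevant item, or 0 if none is suggested.
    for pos, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1 / pos
    return 0.0


def average_precision(ranked: Sequence[str], relevant: Set[str], n: int = 5) -> float:
    # Mean of precision@k over the positions k <= n that hold a relevant item.
    hits, total = 0, 0.0
    for pos, item in enumerate(ranked[:n], start=1):
        if item in relevant:
            hits += 1
            total += hits / pos
    denom = min(len(relevant), n)  # assumed normalization
    return total / denom if denom else 0.0


suggested = ["a", "b", "c", "d", "e"]
liked = {"b", "d"}
print(reciprocal_rank(suggested, liked))    # 0.5
print(average_precision(suggested, liked))  # 0.5
```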
We perform 5-fold cross-validation in which, for every fold, we use 80% of the data as training set and 20% as test set. As the Ind models have to optimize the α parameter, we train each of them to find the best user-specific setting by optimizing its results with respect to MAP. Moreover, to be sure that the baselines are consistently evaluated, we run the other algorithms (MC, C-only, and Pref-only, which do not need any training) on the same test sets used for Ind.
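The 80%/20% splits can be generated as in the following sketch. How the folds were actually built (for example, per user or over the whole dataset, and with which random seed) is not stated in the text, so the shuffling and the function name are assumptions.

```python
import random
from typing import Iterable, Iterator, List, Tuple


def k_fold_splits(items: Iterable, k: int = 5, seed: int = 0) -> Iterator[Tuple[List, List]]:
    """Yield (train, test) pairs; each item lands in exactly one test set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]  # k nearly equal parts
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


# With 50 rated PoIs, every fold uses 40 items for training and 10 for testing.
splits = list(k_fold_splits(range(50)))
print([(len(train), len(test)) for train, test in splits])
```

Reusing the same `seed` for every algorithm is one simple way to ensure that the baselines are evaluated on the same test sets as Ind.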
7. Evaluation Results
Tables 2 and 3 show the Top-N evaluation results with N = 5. That is, the list of suggested items has length = 5. The tables omit the results concerning user coverage because it is 100% in all the cases.
Table 2. Results on AUT dataset for N = 5.
Table 3. Results on NEU dataset for N = 5.
We consider two categories of algorithms, that is, the configurations of our model on the various aggregation measures, and the corresponding ones of the baselines. In the tables, we show the best value across all algorithms in bold. Moreover, the best value obtained by the other category of algorithms is underlined (when our model obtains the best value, we underline the best value achieved by the baselines, and vice versa). Stars indicate significant differences according to a Student's t-test between the best-performing algorithms of the two categories; **: p < 0.01; *: p < 0.05.
The evaluation results suggest that IndCos is the best recommender system because it combines high accuracy with high ranking capability. On both datasets, it outperforms all the other algorithms (baselines and own category) in F1 and MAP. Moreover, it has the best Recall of its own category. Admittedly, IndMin achieves better error minimization than IndCos on both datasets. Specifically, it obtains the best MAE of all models, and it achieves the best RMSE in AUT. Furthermore, in NEU, it obtains better results than the other algorithms of its own category. However, as previously discussed, our primary evaluation criterion is ranking capability.
Interestingly, IndRMSD is the worst configuration of our model. On the AUT dataset, it obtains the lowest results of its own category on all evaluation metrics. However, it achieves better results than several baselines in MAP and other metrics, supporting the superiority of our model. It is also worth noting that Pref-only is the best baseline regarding MAP. Moreover, C-onlyCos has a lower ranking capability than Pref-only, but it has fairly good accuracy: it is the best or second-best baseline on the various measures.
Unfortunately, the small size of the AUT and NEU datasets does not support the statistical significance of results for several evaluation metrics. However, the results concerning MAP and RMSE on the AUT dataset are significant. This is important because our recommendation model is especially targeted at users with autism spectrum disorders. Thus, we can rely on the ranking capability of our model when recommending items to them. At the same time, the results are encouraging for neurotypical users. Therefore, it is worth investigating performance within a larger experiment that will possibly provide more statistically significant results on both groups of people.
8. Discussion
From the evaluation results, we draw two conclusions. The first one is that a customized model of item evaluation, which balances feature compatibility and preference satisfaction in a personalized way, achieves better performance than the recommender systems that manage only one of these aspects. As far as F1 and ranking capability are concerned, the configurations of the Ind model that take both preferences and compatibility into account (and, specifically, IndCos) obtain higher results than Pref-only, which only employs user preferences in item suggestion. Moreover, they achieve better results than the C-only algorithms, which only use compatibility data. The performance of these algorithms is poorer than that of Pref-only, as well. This means that, not surprisingly, compatibility information alone is not enough to generate relevant recommendations for the user.
The second conclusion we draw is that a customized model of item evaluation, which balances feature compatibility and preference satisfaction in a personalized way, outperforms the recommender systems that manage both aspects uniformly. Specifically, the Ind configurations outperform the MC ones, regardless of the applied aggregation measure, in most evaluation metrics, and especially in ranking and F1 measures.
To summarize, preference information is useful to suggest relevant PoIs in Top-N recommendation. However, better results can be achieved by combining it with a compatibility evaluation aimed at assessing whether the user can smoothly experience the recommended items. Interestingly, a uniform management of compatibility and preference information, which does not distinguish the possibly heterogeneous evaluation criteria concerning them, does not bring good results. Conversely, the acquisition of user-specific weights to balance the impact of compatibility and interests improves item suggestion.
9. Conclusion
Users with autism spectrum disorders are a challenging target of PoI recommender systems because of their spatial needs. In order to suggest suitable solutions, which the user can like and serenely experience, her/his preferences for PoI categories, traditionally analyzed by researchers, have to be considered jointly with her/his aversions to sensory features. The reason is that aversions can seriously affect the visit experience, causing negative feelings in the user.
In this paper, we presented a novel Top-N recommender of Points of Interest especially targeted to these people. The peculiarity of our model is that it takes the individual user’s idiosyncratic aversions to sensory features into account to generate suggestions that (s)he is expected to like and smoothly experience at the same time. We tested our model on autistic and neurotypical people. The evaluation results show that, on both user groups, our model achieves higher accuracy and ranking capability than baseline recommenders, which (i) evaluate items on the sole basis of how closely they meet the user’s preferences, or how compatible they are with her/his idiosyncratic aversions to sensory features, and (ii) uniformly manage compatibility and preference information without distinguishing the different contributions of these aspects to item evaluation. We thus conclude that the integration of heterogeneous evaluation criteria about user interests and aversions is a promising approach to make recommender systems more inclusive.
Acknowledgments
This work is supported by the Compagnia di San Paolo Foundation. We thank Stefano Cocomazzi, Stefania Brighenti, and Claudio Mattutino for their contributions to the work. We are also grateful to the Adult Autism Center of the city of Torino for their participation in the PIUMA project.