The impact of data characteristics on the performance of classical recommender systems has recently been investigated, producing fruitful results about their relationship with recommendation accuracy. This work provides a systematic study of the impact of a broad set of data characteristics (DCs) on the accuracy and fairness of several variations of collaborative filtering (CF) recommendation models. We focus on a suite of DCs that capture properties of the structure of the user-item interaction matrix, the rating frequency, item properties, or the distribution of rating values. The experimental validation involved large-scale experiments comprising 23,400 recommendation simulations on three real-world datasets in the movie (ML-100K and ML-1M) and book (BookCrossing) domains. The results show that the investigated DCs can, in some cases, have up to 90% explanatory power for the accuracy of several variations of classical CF algorithms, whereas in the best case they explain only about 40% of the fairness results (measured with respect to the sensitive user attributes gender and age). This work therefore provides evidence that variations in performance are more difficult to explain along the fairness dimension than along the accuracy dimension.
Yashar Deldjoo, Alejandro Bellogín, Tommaso Di Noia
Inf. Process. Manag.