Recommender Systems

From SI410
Amazon's Recommendations for a User

Recommender systems assess a user's likes and dislikes, often through collaborative filtering, in order to suggest other products the user might enjoy beyond the page they are currently viewing. Notable examples include the recommender systems of Amazon, Netflix, Pandora, and iTunes. Recommender systems pose a number of ethical challenges, including the privacy of aggregated information and the use of such systems for unethical purposes.


Recommender systems typically generate recommendations using one of two approaches: collaborative filtering or content filtering. Collaborative filtering uses a user's past purchase history, together with the histories of users who have purchased similar things, to recommend related products. Content filtering uses the features of previously purchased products to recommend other products with similar features.
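The content filtering idea can be illustrated with a minimal sketch. The product names, the feature tags, and the choice of Jaccard similarity are all assumptions made for this example; real systems may use very different features and similarity measures.

```python
# Hypothetical catalog: each product is described by a set of feature tags.
features = {
    "trail shoe": {"footwear", "outdoor", "waterproof"},
    "hiking boot": {"footwear", "outdoor", "waterproof", "leather"},
    "dress shoe": {"footwear", "formal", "leather"},
    "desk lamp": {"lighting", "indoor"},
}


def jaccard(a, b):
    """Share of features two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_products(features, purchased, n=2):
    """Rank products the user has not bought by feature overlap with past purchases."""
    # The user's profile is the union of the features of everything they bought.
    profile = set().union(*(features[p] for p in purchased))
    scored = [
        (jaccard(profile, tags), name)
        for name, tags in features.items()
        if name not in purchased
    ]
    # Highest score first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:n] if score > 0]


print(similar_products(features, ["trail shoe"]))
```

Note that this approach needs no data about other users, only descriptions of the products themselves.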

When choosing between these models, the designer must also decide whether to gather explicit or implicit data on users.

Examples of explicit data gathering include the following:[1]

  • Asking a user to rate a particular item.
  • Asking a user to rank a collection of items.
  • Presenting two items to a user and having them make a choice between the two.
  • Asking a user to create a list of items that they enjoy.

Examples of implicit data gathering include the following:

  • Observing the items that a user looks at in online environments.
  • Keeping track of the items that a user purchases or views.
  • Analyzing the user's social network to see what others have purchased, liked, or disliked.

Algorithms & Models

K-Nearest Neighbor

The k-nearest neighbor (or k-NN) approach uses the Pearson correlation to find the k users most similar to a given user, and computes a prediction for that user from the ratings of those nearest users.

Pearson Correlation

The Pearson correlation measures how similar two users are based on their previous data, such as the ratings both have given to the same items. It is used with the k-nearest neighbor algorithm to determine the k users most similar to the user in question.
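The combination of Pearson correlation and k-nearest neighbor can be sketched as follows. This is an illustrative example under stated assumptions, not a reference implementation: the toy ratings are invented, the correlation is computed only over items both users rated, and the prediction uses a common mean-centered weighted average that the text above does not prescribe.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 4},
    "bob": {"A": 4, "B": 2, "C": 3, "D": 5},
    "carol": {"A": 1, "B": 5, "C": 2, "D": 1},
    "dave": {"A": 5, "B": 3, "C": 5, "D": 4},
}


def pearson(u, v):
    """Pearson correlation of two users over the items both have rated."""
    common = set(u) & set(v)
    if len(common) < 2:
        return 0.0
    mean_u = sum(u[i] for i in common) / len(common)
    mean_v = sum(v[i] for i in common) / len(common)
    cov = sum((u[i] - mean_u) * (v[i] - mean_v) for i in common)
    var_u = sum((u[i] - mean_u) ** 2 for i in common)
    var_v = sum((v[i] - mean_v) ** 2 for i in common)
    if var_u == 0 or var_v == 0:
        return 0.0  # a user with constant ratings has no defined correlation
    return cov / sqrt(var_u * var_v)


def predict(ratings, user, item, k=2):
    """Predict a rating from the k most similar users who rated the item."""
    target = ratings[user]
    candidates = [
        (pearson(target, other), other)
        for name, other in ratings.items()
        if name != user and item in other
    ]
    # Take the k highest correlations, then drop non-positive ones.
    nearest = sorted(candidates, key=lambda pair: pair[0], reverse=True)[:k]
    nearest = [(sim, other) for sim, other in nearest if sim > 0]
    if not nearest:
        return None
    target_mean = sum(target.values()) / len(target)
    # Weight each neighbor's deviation from their own mean by similarity.
    weighted = sum(
        sim * (other[item] - sum(other.values()) / len(other))
        for sim, other in nearest
    )
    return target_mean + weighted / sum(sim for sim, _ in nearest)


print(round(predict(ratings, "alice", "D"), 2))
```

Here bob and dave rate items much like alice does, so they become her two nearest neighbors, while carol's negative correlation keeps her out of the prediction.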

Collaborative Filtering

Collaborative filtering[2] represents each user as an N-dimensional vector of items, where N is the number of distinct catalog items. For example, customer A of a department store's website is assigned a vector built from the items purchased from that store. Components are positive for purchased or positively rated items and negative for negatively rated items. [3] This algorithm is commonly used to recommend items to a user whose observed preferences are similar to those of another customer profile. Recommendations are drawn from the items that one user has rated and the other has not. This form of recommender system has two drawbacks. First, a large sample of users is needed to find strong similarities between a selected user and other customers. Second, partitioning user preferences by item limits recommendations to a specific product area. This means that if a customer has bought a pair of shoes, that customer will only receive recommendations for similar shoes that other similar users have purchased, and will not receive recommendations for bags that similar users may have purchased.
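The vector representation described above can be sketched in a few lines. The catalog, the users, and the use of cosine similarity to compare vectors are assumptions made for illustration; other similarity measures could be substituted.

```python
from math import sqrt

# Hypothetical catalog of N = 5 items; each user is an N-dimensional vector.
# +1 = purchased or liked, -1 = disliked, 0 = no information.
catalog = ["shirt", "jeans", "jacket", "scarf", "hat"]
users = {
    "A": [1, 1, 0, 0, 0],
    "B": [1, 1, 0, 1, 0],
    "C": [-1, 0, 1, 0, 1],
}


def cosine(u, v):
    """Cosine of the angle between two user vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(users, target):
    """Suggest items the most similar user liked that the target has no data for."""
    me = users[target]
    most_similar = max(
        (name for name in users if name != target),
        key=lambda name: cosine(me, users[name]),
    )
    return [
        item
        for item, mine, theirs in zip(catalog, me, users[most_similar])
        if mine == 0 and theirs > 0
    ]


print(recommend(users, "A"))
```

Every pairwise comparison touches all N components, which hints at why this approach becomes expensive as the catalog and the user base grow.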

Some retailers use an item-to-item collaborative filtering algorithm that does its expensive computation offline and recommends highly correlated items, rather than matching the user to highly similar users. [2]
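A minimal sketch of the item-to-item idea follows. It is not any retailer's actual implementation: the purchase data is invented, and the similarity between two items is assumed to be the cosine of their buyer sets. The point is the split between an offline step that builds a similar-items table and a cheap lookup at recommendation time.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical purchase histories: user -> set of items bought.
purchases = {
    "u1": {"camera", "tripod", "sd card"},
    "u2": {"camera", "sd card"},
    "u3": {"camera", "tripod"},
    "u4": {"novel", "bookmark"},
    "u5": {"novel", "sd card"},
}


def build_similar_items(purchases):
    """Offline step: similarity of each item pair, based on shared buyers."""
    buyers = defaultdict(set)
    for user, items in purchases.items():
        for item in items:
            buyers[item].add(user)
    table = defaultdict(dict)
    for a, b in combinations(sorted(buyers), 2):
        shared = len(buyers[a] & buyers[b])
        if shared:
            sim = shared / sqrt(len(buyers[a]) * len(buyers[b]))
            table[a][b] = sim
            table[b][a] = sim
    return table


def recommend(table, owned, n=3):
    """Online step: look up items similar to those the user already has."""
    scores = defaultdict(float)
    for item in owned:
        for other, sim in table.get(item, {}).items():
            if other not in owned:
                scores[other] += sim
    # Highest score first; ties broken alphabetically.
    return sorted(scores, key=lambda i: (-scores[i], i))[:n]


table = build_similar_items(purchases)
print(recommend(table, {"camera"}))
```

Because the table depends only on the catalog and past purchases, it can be rebuilt periodically in a batch job while lookups stay fast.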

Companies such as Netflix also use recommender systems to help users select movies. Netflix suggests movies that its recommender system predicts the user will enjoy, based on movies the user has watched and on the selections of seemingly similar Netflix users. The recommender system uses the user's ratings, comparisons with the movie ratings of other users, "as well as a learning algorithm that learns patterns in [a user's] history in order to recommend [the user] in an accurate and optimal manner." [4]

Recommender System Competitions

The Netflix Prize

The Netflix Prize was a competition launched by Netflix in 2006 and awarded in 2009 to improve its collaborative filtering algorithm for movie rating prediction. Netflix provided a training data set of over 100 million ratings. The prize winners were determined using a held-out test data set whose ratings were known only to Netflix. Contenders were ranked by the root-mean-square error (RMSE) of their predictions on the held-out test data set. BellKor's Pragmatic Chaos was the winning team, providing Netflix with a 10.06% improvement over its own algorithm [5].
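The ranking metric can be made concrete with a short sketch. The ratings and predictions below are invented toy values, not Netflix data; only the RMSE formula and the idea of measuring improvement relative to a baseline come from the description above.

```python
from math import sqrt


def rmse(predicted, actual):
    """Root-mean-square error between predicted and true ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return sqrt(squared / len(actual))


# Hypothetical held-out ratings and two sets of predictions for them.
held_out = [4, 3, 5, 2]
baseline = [3.5, 3.5, 3.5, 3.5]   # e.g. always predict the average rating
candidate = [4.0, 3.0, 4.0, 2.0]  # a contender's predictions

improvement = 100 * (1 - rmse(candidate, held_out) / rmse(baseline, held_out))
print(f"baseline RMSE:  {rmse(baseline, held_out):.3f}")
print(f"candidate RMSE: {rmse(candidate, held_out):.3f}")
print(f"improvement:    {improvement:.1f}%")
```

A lower RMSE means predictions are closer to the true ratings, and squaring penalizes large misses more heavily than small ones.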

The RecSys Challenge 2018

The RecSys Challenge 2018 is a competition for music recommendation, with data provided by Spotify[6]. Contenders are asked to improve on Spotify's playlist recommendation systems.

Ethical Issues


Privacy is one ethical concern with recommender systems. Data on users is often gathered without their consent or awareness. There are ways to trace large data sets gathered from thousands of users back to individuals, and this can expose private information (such as credit card numbers, social security numbers, etc.). As new recommender systems are developed, and more specifically new algorithms to run them, it is important that ethical implications are kept in mind and addressed during development.

Many advertising companies employ technologies such as cookies and spyware to learn more about people's Internet browsing history and preferences. These advertising companies can pass a person's browsing history on to other companies, which use it to further refine their recommender systems.

Accountability is another ethical concern with recommender systems. Recommender systems rely on massive amounts of data, but if the collected data is incorrect, the system may make recommendations that have ethical implications. For example, suppose a recommender system recommends a food establishment and the customer is displeased or even harmed. Who is accountable for the recommendation: the recommender system, the programmer, or the customers who wrote the reviews? Should the algorithm be deleted?

Implications of the use of explicit data in recommender systems

The Netflix recommender displays both your ratings and other users' average rating.

Two examples of sites that use explicit forms of data in their recommender systems are Netflix and Slashdot's comment system. Netflix is a hybrid recommender system, meaning it combines multiple sources of user data to form recommendations, while Slashdot primarily employs a user ranking system to determine the order of the comments displayed on a given post. Slashdot's user-generated rankings produce non-personalized recommendations. While this system avoids the ethical concern of information privacy, it raises the concern of "buried treasures": poorly distributed visibility for posts that are worthy of notice. Buried treasure occurs when Slashdot moderators pay less attention to comments with lower scores, which may have been unfairly rated, or neglect to scan comments posted later in a conversation. This is an ethical concern because using a recommender system to determine the importance of a comment can cause equally important perspectives that have been unfairly rated to be neglected. By deciding for users what information is most important to read, the system can impose a bias towards one point of view on the site's readers. [7]

See Also

External Links


  1. Wikipedia: Recommender system
  2. 2.0 2.1
  3. Wikipedia: Pearson Correlation
  5. The Netflix Prize: Leaderboard
  6. RecSys Challenge: About
