Clustering by Passing Messages Between Data Points - Overview & Procedure

clustering by passing messages between data points n.w
1 / 11
Embed
Share

"Explore the concept of clustering by passing messages between data points. Learn about Affinity Propagation, a method to assign data points to clusters based on message passing. Understand the complete procedure, nuances, and implications of diagonal values in similarity matrices."

  • Clustering
  • Data Points
  • Affinity Propagation
  • Message Passing
  • Similarity Matrix

Uploaded on | 0 Views


Download Presentation

Please find below an Image/Link to download the presentation.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author. If you encounter any issues during the download, it is possible that the publisher has removed the file from their server.

You are allowed to download the files provided on this website for personal or commercial use, subject to the condition that they are used lawfully. All files are the property of their respective owners.

The content on the website is provided AS IS for your information and personal use only. It may not be sold, licensed, or shared on other websites without obtaining consent from the author.

E N D

Presentation Transcript


  1. Clustering by Passing Messages Between Data Points Brendan J. Frey and Delbert Dueck

  2. The Problem - Clustering We want to assign our data points to categories, based on similarity Common approach - define clusters by representative points, assign each data point to best representative Most formulations are NP-hard, must be approximated heuristically

  3. Affinity Propagation - Informal Description Our data points are nodes in a graph Each iteration, every node gets messages from all other nodes How similar is the candidate? How much do other nodes support it? Messages estimate the suitability of nodes to serve as exemplars for others Messages from previous rounds are used to compute new set of messages

  4. Affinity Propagation - Message Passing Input: similarity matrix S We will iteratively build Responsibility matrix R, Availability matrix A R(i,k) represents how much i favors k as an exemplar candidate I want to vote for you , I don t want to vote for you A(i,k) represents how much k favors being an exemplar for i You should vote for me , No, don t vote for me

  5. Affinity Propagation - Complete Procedure Initialize A, R to all 0 Update responsibility: R(i,k) = S(i,k) - max{ A(i,k ) + S(i,k ) }, where k =/= k Update availability: A(i,k) = min{ 0, R(k,k) + SUM(max{ 0, R(i ,k) }) }, where i =/= i =/= k A(k,k) = SUM(max{ 0, R(i ,k) }), where i =/= k Iterate until convergence E = A + R The exemplar for node i is node k with largest E(i,k)

  6. Nuance - Diagonal of Similarity Matrix Diagonal values S(i,i) are called preferences Govern how much node i wants to vote for itself Parameter for the algorithm, will greatly affect clustering behavior Higher values - more, smaller clusters Lower values - fewer, larger clusters Typically set to median(S) Can be used to include/exclude candidates based on prior information

  7. Advantages Doesn t depend on geometric distances, symmetry, or centroids Doesn t require any initial guesstimate of cluster exemplars Desired number of clusters doesn t need to be fixed up front Similarity diagonal (a priori suitability of exemplars) can be flexibly tuned Easy to implement Matrix-based implementation yields good runtime

  8. Disadvantages Quadratic space and runtime complexity Entire similarity matrix must be computed up front Desired number of clusters doesn t need to be fixed up front Not always clear how to best tune the similarity diagonal Evident preference for many small clusters, not always desirable

  9. Images stolen from Frey, B. J., & Dueck, D. (2007). Clustering by passing messages between data points. science, 315(5814), 972-976. https://www.ritchievink.com/blog/2018/05/18/algorithm-breakdown-affinity- propagation/ https://en.wikipedia.org/wiki/Cluster_analysis

More Related Content