
Uncovering Large Groups of Active Malicious Accounts in Online Social Networks
Discover how researchers from Duke University and Facebook Inc. collaborated to uncover, identify, and combat malicious accounts on popular online social networks like Facebook, Google+, Twitter, and Instagram. Explore their innovative approaches, including SynchroTrap, to detect fake and compromised accounts engaging in cyber attacks such as spam propagation, malware spreading, social engineering, and online voting manipulation.
Presentation Transcript
Uncovering Large Groups of Active Malicious Accounts in Online Social Networks. Qiang Cao and Xiaowei Yang, Duke University, {qiangcao, xwy}@cs.duke.edu; Jieqi Yu and Christopher Palow, Facebook Inc., {jieqi, cpalow}@fb.com. ACM 2014
Outline: Introduction; System Design- SynchroTrap; Evaluation; Conclusion
Introduction: Online social networks (OSNs) such as Facebook, Google+, Twitter, or Instagram are popular targets for cyber attacks. Malicious accounts include fake accounts and compromised accounts. Fake accounts: created by attackers and used only for specific jobs. Compromised accounts: real user accounts hijacked by attackers.
Introduction-Related work: Methods for identifying malicious accounts are classified into three categories: social-graph-based approaches; feature-based account classification; aggregate behavior clustering.
Social-graph-based approaches: Use social connectivity to infer fake accounts that have limited social connections to legitimate users. Ex.: SybilRank. [Figure: social graph with legend: New Account, Legal Account, Illegitimate Account, Fake Account, Real User Account]
Feature-based account classification: Uses various account features to train classifiers to detect malicious accounts. Ex.: COMPA, which identifies compromised accounts using statistical models that catch sudden changes in a user's behavior. [Figure: New Post]
Aggregate behavior clustering: Compares pairwise similarity between accounts and aggregates small clusters into larger ones. Ex.: SynchroTrap
Introduction: Uses of malicious accounts: propagate spam messages; spread malware; launch social engineering attacks; manipulate online voting results. Large groups of malicious accounts act in loose synchrony.
Introduction-Constraint: Examining attacks on Facebook by malicious accounts. Sample: 450 normal IDs and 450 malicious IDs over one week. Compared with those of normal users, the actions of malicious accounts are concentrated and continuous. Figure 1: An example of malicious photo uploads in Facebook. The x-axis shows the time when an account uploads a photo, and the y-axis is the account's ID. A dot (x, y) in the figure shows that an account with ID y uploads a photo at time x.
Introduction-Constraint: Examining attacks on Instagram by malicious accounts. Sample: 1000 normal IDs and 1000 malicious IDs over one week. The same attack pattern also appears in Instagram. Malicious accounts not only act together but often do so from a limited set of IP addresses. Figure 2: An example in Instagram user following. The x-axis is the timestamp of an account's following action and the y-axis is an account's ID. A dot (x, y) shows that an account y follows a targeted account at time x.
Introduction-Constraint: Resource constraints: Limited physical computing resources. Limited operating time. An infected machine may go offline, recover, or even be quarantined at any time. A machine rental is usually charged based on the computing utility consumed. Limited financial budget. Mission constraints: The level of prevalence that a customer pursues must be reached. A strict deadline must be met.
System Design- SynchroTrap: SynchroTrap uses clustering analysis to detect loosely synchronized actions from malicious accounts at scale. Challenges: Scalability: Facebook: 600,000,000 in daily activity, with thousands of abnormal attacks. Accuracy: the diversity of normal users and the hiding of malicious activity reduce detection accuracy.
System Design- Challenges: Scalability. Approach: partition user actions by mission, e.g., uploading spam photos, promoting rogue apps, posting; partition user actions by IP addresses, followee IDs, and page IDs; slice the computation of user comparison into smaller jobs, and then aggregate the results of multiple slices to obtain period-long user similarity. Accuracy. Approach: design SynchroTrap based on our understanding of an attacker's economic constraints.
System Design- SynchroTrap: Steps: 1. Partitioning activity data by applications; 2. Comparing user actions; 3. Pairwise user similarity metrics; 4. Scalable user clustering; 5. Parallelizing user-pair comparison
Partitioning activity data by applications: OSNs provide diverse functions and features, which may result in the curse of dimensionality. Categorize a user's actions into subsets according to the applications they belong to.
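The partitioning step can be sketched roughly as a group-by over an application label. This is only an illustration: the `Action` record, its `app` field, and the sample values are hypothetical names introduced here, not taken from the slides.

```python
from collections import defaultdict
from typing import NamedTuple


class Action(NamedTuple):
    # Hypothetical record layout, assumed for illustration only.
    user_id: int
    timestamp: int    # seconds since some epoch
    app: str          # application label, e.g. "photo_upload"
    constraint: str   # constraint object, e.g. an IP address or a page ID


def partition_by_app(actions):
    """Group actions into one subset per application."""
    parts = defaultdict(list)
    for action in actions:
        parts[action.app].append(action)
    return dict(parts)


actions = [
    Action(1, 100, "photo_upload", "ip:10.0.0.1"),
    Action(2, 130, "page_like", "page:42"),
    Action(15, 160, "photo_upload", "ip:10.0.0.1"),
]
by_app = partition_by_app(actions)
print({app: len(subset) for app, subset in by_app.items()})
```

Each subset can then be analysed on its own, which keeps every comparison within a single application's feature space.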
Comparing user actions: Each action is a tuple <U, T, C>: user ID, timestamp, constraint object. Constraint objects: page like, Instagram follow, photo upload, etc. Example with pre-defined Tsim = 1 hr (window 7:00 to 8:00): the action of UID 1 at 2021-4-12 7:45 UTC matches the action of UID 15 at 2021-4-12 7:55 UTC.
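A minimal sketch of the matching test, using the two timestamps from the slide. The exact rule is an assumption here: two actions are taken to match when they share a constraint object and their timestamps are at most Tsim apart. The `Action` type, the `actions_match` helper, and the `"page:42"` constraint value are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple


class Action(NamedTuple):
    # Hypothetical <U, T, C> tuple layout.
    user_id: int
    timestamp: datetime
    constraint: str


T_SIM = timedelta(hours=1)  # pre-defined Tsim = 1 hr, as on the slide


def actions_match(a, b, t_sim=T_SIM):
    """Assumed rule: same constraint object and timestamps within t_sim."""
    return a.constraint == b.constraint and abs(a.timestamp - b.timestamp) <= t_sim


a = Action(1, datetime(2021, 4, 12, 7, 45, tzinfo=timezone.utc), "page:42")
b = Action(15, datetime(2021, 4, 12, 7, 55, tzinfo=timezone.utc), "page:42")
c = Action(15, datetime(2021, 4, 12, 9, 30, tzinfo=timezone.utc), "page:42")

print(actions_match(a, b))  # True: 10 minutes apart
print(actions_match(a, c))  # False: 1 hour 45 minutes apart
```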
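Pairwise user similarity metrics: Let $A_i$ be the set of actions performed by user $U_i$: $A_i = \{\langle U, T, C\rangle \mid U = U_i\}$. The subset of those actions on constraint object $C_k$ is $A_i^k = \{\langle U, T, C\rangle \mid U = U_i, C = C_k\}$. [Figure: timeline T of the actions $A_i$ of user $U_i$, split by constraint objects C1, C2, C3]
Pairwise user similarity metrics: Per-constraint similarity (Jaccard similarity): $\mathrm{Sim}(U_i, U_j, C_k) = \frac{|A_i^k \cap A_j^k|}{|A_i^k \cup A_j^k|}$, where actions of the two users on the same constraint object count as shared when they match within $T_{sim}$. Overall similarity: $\mathrm{Sim}(U_i, U_j) = \frac{\sum_k |A_i^k \cap A_j^k|}{\sum_k |A_i^k \cup A_j^k|}$. Example from the slide: on C1, $\mathrm{Sim}(U_i, U_j, C_1) = \frac{2}{6}$; over C1 and C2, $\mathrm{Sim}(U_i, U_j) = \frac{2+1}{6+5} = \frac{3}{11}$.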
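The two metrics can be sketched as follows. As a simplifying assumption, each user's actions on a constraint object are represented as a set of discrete time-slot indices, so that matching within Tsim reduces to plain set intersection. The function names and the sample sets are hypothetical and were chosen only so that the counts agree with the slide's example (2 of 6 on C1, 1 of 5 on C2).

```python
from fractions import Fraction


def per_constraint_similarity(slots_i, slots_j):
    """Jaccard similarity of two users' action sets on one constraint object."""
    union = slots_i | slots_j
    if not union:
        return Fraction(0)
    return Fraction(len(slots_i & slots_j), len(union))


def overall_similarity(actions_i, actions_j):
    """Sum intersections and unions over all constraint objects, then divide.

    actions_i and actions_j map a constraint object to a set of time slots.
    """
    constraints = set(actions_i) | set(actions_j)
    inter = sum(len(actions_i.get(c, set()) & actions_j.get(c, set())) for c in constraints)
    union = sum(len(actions_i.get(c, set()) | actions_j.get(c, set())) for c in constraints)
    return Fraction(inter, union) if union else Fraction(0)


# Hypothetical time slots: C1 shares 2 of 6 slots, C2 shares 1 of 5.
u_i = {"C1": {1, 2, 3, 4}, "C2": {1, 2, 3}}
u_j = {"C1": {3, 4, 5, 6}, "C2": {3, 4, 5}}

print(per_constraint_similarity(u_i["C1"], u_j["C1"]))  # 1/3, i.e. 2/6
print(overall_similarity(u_i, u_j))                     # 3/11
```

`Fraction` keeps the ratios exact, which makes the result easy to compare with the hand-computed values.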
Scalable user clustering: Single-linkage hierarchical clustering. Treat each data point as its own cluster Ci, i = 1 to n. Among all clusters, find the closest pair Ci and Cj. Merge Ci and Cj into a new cluster. Difficult to implement in parallel.
Scalable user clustering Two-step adaptation of the single-linkage clustering algorithm Threshold
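The slide gives only the keyword "Threshold" for the two-step adaptation, so the following is a sketch under a stated assumption rather than the authors' implementation. It relies on a general property of single-linkage clustering: cutting the hierarchy at a similarity threshold yields the connected components of the graph that keeps only user pairs at or above that threshold. Step one filters pairs by the threshold, step two finds connected components with a union-find structure. The `cluster_users` function and the sample similarities are hypothetical.

```python
def cluster_users(similarities, threshold):
    """Cluster users as connected components of the thresholded similarity graph.

    similarities maps a (user_a, user_b) pair to a similarity in [0, 1].
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (u, v), sim in similarities.items():
        root_u, root_v = find(u), find(v)   # registers both users
        if sim >= threshold:                # step 1: keep only strong pairs
            parent[root_u] = root_v         # step 2: merge their components

    clusters = {}
    for user in list(parent):
        clusters.setdefault(find(user), set()).add(user)
    return sorted(sorted(members) for members in clusters.values())


sims = {(1, 2): 0.9, (2, 3): 0.8, (3, 4): 0.1, (4, 5): 0.7}
print(cluster_users(sims, threshold=0.5))  # [[1, 2, 3], [4, 5]]
```

Because the filtering step treats each pair independently and connected components can be computed with standard graph-processing jobs, this formulation is easier to distribute than the iterative merge loop.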
Evaluation: Can SynchroTrap accurately detect malicious accounts while yielding low false positives? How effective is SynchroTrap in uncovering new attacks? Can SynchroTrap scale up to Facebook-size OSNs?
Evaluation: In total, SynchroTrap detected 1156 large campaigns that involve more than 2 million malicious accounts, with a precision higher than 99%. Large attack campaigns comprise millions of user actions.
Post-processing to deal with false positives: The Facebook security team sets a threshold of 200, above which almost all users in each cluster are found malicious. Do not invalidate all actions that a malicious account has performed during a detection window, but focus on those that match each of the other accounts in the same cluster.
Conclusion: SynchroTrap is a generic and scalable detection system that uses clustering analysis to detect large groups of malicious users that act in loose synchrony. It is optimized by partitioning user activity data by time and comparing only pair-wise user actions that fall into overlapping sliding windows. SynchroTrap unveiled 1156 large campaigns and more than two million malicious accounts involved in those campaigns.