
Extreme Elasticity in Big Data Management
Learn about the challenges of extreme elasticity in handling big data, strategies for building clusters, algorithms for efficient processing, and the integration of people and technology to manage extreme complexity in data applications.
Presentation Transcript
The Three Es of Big Data and What DB People Can Do About Them. Michael Franklin, UC Berkeley. Beckman Database Get Together, October 14, 2013.
The Big Data Problem - Nutshelled. Something's gotta give: Time, Money, Quality. Massive, Diverse and Growing Data.
The 3 Es of Big Data: Extreme, Elasticity, Everywhere.
Extreme Elasticity - Machines. Option #1: Build your own Cluster/WSC; 46K Servers (2010 estimate). Option #2: Rent Machines from AWS; x Servers needed. Option #3: Try your luck on the Spot Market; x Servers needed (US East, Saturday Sept 28 @ 1:30am).
Extreme Elasticity - Algorithms. Agarwal et al., BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data. ACM EuroSys 2013.
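The slide cites BlinkDB for queries with bounded errors and bounded response times. The general idea behind such approximate answers can be sketched with plain uniform sampling: answer an aggregate from a sample and report an error bar that shrinks as the sample grows. This is only an illustrative sketch, not BlinkDB's actual method or API; the function name `approximate_mean`, the synthetic data, and the normal-approximation confidence interval are all assumptions made for the example.

```python
import math
import random
import statistics


def approximate_mean(values, sample_size, z=1.96, seed=0):
    """Estimate the mean of `values` from a uniform random sample.

    Returns (estimate, margin). With z=1.96, estimate +/- margin is an
    approximate 95% confidence interval (normal approximation; the
    finite-population correction is ignored for simplicity).
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)  # uniform, without replacement
    estimate = statistics.fmean(sample)
    std_err = statistics.stdev(sample) / math.sqrt(sample_size)
    return estimate, z * std_err


# Synthetic data standing in for one column of a very large table (assumption).
data_rng = random.Random(42)
data = [data_rng.gauss(100.0, 15.0) for _ in range(200_000)]

# A small sample is cheaper but has a wider error bar than a large one.
small = approximate_mean(data, 1_000)
large = approximate_mean(data, 50_000)

print(f"n=1,000:  {small[0]:.2f} +/- {small[1]:.2f}")
print(f"n=50,000: {large[0]:.2f} +/- {large[1]:.2f}")
```

The sample size is the knob that trades time for quality: a caller with a tight response-time budget picks a smaller sample and accepts a wider margin, while a caller with a tight error budget does the reverse.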
Extreme Elasticity - People. Incentives; Fatigue, Fraud, & other Failure Modes; Latency & Prediction; Work Conditions; Interface <-> Answer Quality; Task Structuring; Task Routing.
Extreme Elasticity. Algorithms: Approximate Answers; ML Libraries and Ensemble Methods; Active Learning. Machines: Cloud Computing, esp. Spot Instances; Multi-tenancy; Relaxed (eventual) consistency / Multi-version methods. People: Dynamic Task and Microtask Marketplaces; Visual analytics; Manipulative interfaces and mixed mode operation.
The Challenge: Extreme Elasticity + Tradeoffs + AMP Integration = Extreme Complexity.
The Good News: We already know how to do this (kinda)! End Users tell the system what they want, not how to get it. SQL -> Result; MQL -> Model.
MLbase: Progress. Initial release: Spring 2014. MQL Parser; Query Planner / Optimizer (Contracts); ML Library; ML Developer API; Runtime. Released July 2013.
For More Information: amplab.cs.berkeley.edu, franklin@berkeley.edu