Efficient Techniques in Compressed Linear Algebra for ML Applications
Explore compressed linear algebra techniques for large-scale machine learning, including compression, column encoding, and memory optimization. Learn about column-wise compression, low column cardinalities, and various encoding formats to enhance data processing efficiency.
Presentation Transcript
Lecture 22: Compressed Linear Algebra for Large-Scale ML. Slides by Memar
Announcement: Thanks to those who dropped by on Friday. Those who didn't come: I hope you are cooking something awesome. If you still want to meet, email me. Interesting read: https://lemire.me/blog/2018/04/17/iterating-in-batches-over-data-structures-can-be-much-faster/
Today: 1. Compression 2. Column encoding 3. Compressed LA
Section 1: Compression
Motivation
Solution: Fit more data into memory
Compression techniques
Section 2: Column encoding
Column-wise compression: Motivation. Column-wise compression leverages two key characteristics: few distinct values per column and high cross-column correlations. Typical ML datasets exhibit: 1. Low column cardinalities 2. Non-uniform sparsity across columns 3. Tall and skinny matrices (the common case)
Column encoding formats: 1. Uncompressed Columns (UC) 2. Offset-List Encoding (OLE) 3. Run-Length Encoding (RLE)
Uncompressed Column (UC)
Offset-List Encoding (OLE)
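The transcript carries no example for this slide, so here is a minimal sketch of the idea behind offset-list encoding, assuming a per-column dictionary that maps each distinct non-zero value to the sorted list of row offsets where it occurs (function names are illustrative, not from the slides):

```python
# Minimal sketch of offset-list encoding for a single column (illustrative only).
from collections import defaultdict

def ole_encode(column):
    """Map each distinct non-zero value to the sorted row offsets where it occurs."""
    offsets = defaultdict(list)
    for row, value in enumerate(column):
        if value != 0:
            offsets[value].append(row)
    return dict(offsets)

def ole_decode(offsets, num_rows):
    """Reconstruct the dense column from the per-value offset lists."""
    column = [0] * num_rows
    for value, rows in offsets.items():
        for row in rows:
            column[row] = value
    return column

col = [7, 7, 0, 3, 7, 3, 0, 7]
enc = ole_encode(col)              # {7: [0, 1, 4, 7], 3: [3, 5]}
assert ole_decode(enc, len(col)) == col
```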
Run-Length Encoding (RLE)
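Analogously, a minimal sketch of run-length encoding for a column, assuming each distinct non-zero value keeps a list of (start offset, run length) pairs (again illustrative, not the lecture's code):

```python
# Minimal sketch of run-length encoding for a single column (illustrative only):
# each distinct non-zero value maps to a list of (start_offset, run_length) pairs.
def rle_encode(column):
    runs = {}
    row = 0
    while row < len(column):
        value = column[row]
        start = row
        while row < len(column) and column[row] == value:
            row += 1
        if value != 0:  # zeros need no runs; they are the default
            runs.setdefault(value, []).append((start, row - start))
    return runs

def rle_decode(runs, num_rows):
    column = [0] * num_rows
    for value, value_runs in runs.items():
        for start, length in value_runs:
            for row in range(start, start + length):
                column[row] = value
    return column

col = [7, 7, 7, 0, 0, 3, 3, 7]
enc = rle_encode(col)              # {7: [(0, 3), (7, 1)], 3: [(5, 2)]}
assert rle_decode(enc, len(col)) == col
```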
It's all about tradeoffs!
Column co-coding
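Co-coding groups correlated columns and encodes their value tuples together, so one offset list covers the whole group. A small sketch, assuming an OLE-style dictionary keyed by value tuples (names are hypothetical):

```python
# Minimal sketch of column co-coding (illustrative): correlated columns are grouped
# and their value tuples are encoded jointly, sharing a single offset list.
from collections import defaultdict

def cocode(columns):
    """columns: list of equally long lists, one per column in the group.
    Returns {value_tuple: sorted list of row offsets}."""
    offsets = defaultdict(list)
    for row, values in enumerate(zip(*columns)):
        offsets[values].append(row)
    return dict(offsets)

# Two perfectly correlated columns: co-coding yields 2 distinct tuples and one
# offset list, instead of two separate dictionaries and two offset lists.
col_a = [1, 1, 2, 2, 1, 2]
col_b = [10, 10, 20, 20, 10, 20]
print(cocode([col_a, col_b]))
# {(1, 10): [0, 1, 4], (2, 20): [2, 3, 5]}
```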
Combining compression methods
Data layout: OLE
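The transcript gives no detail for this slide. A plausible sketch of a segmented OLE layout: the offset range is split into fixed-size segments, each segment is prefixed with a count, and offsets are stored as small deltas within their segment. The segment size of 2^16 and the 2-byte field width are assumptions, not taken from the slides:

```python
# Sketch of a segmented OLE physical layout (sizes are illustrative assumptions):
# the whole offset list for one value becomes a flat array of 16-bit fields.
SEG = 1 << 16  # rows per segment (assumption)

def ole_layout(offsets):
    """offsets: sorted row offsets for one distinct value -> flat list of fields."""
    data = []
    num_segments = (max(offsets) // SEG) + 1 if offsets else 0
    for seg in range(num_segments):
        in_seg = [o - seg * SEG for o in offsets if seg * SEG <= o < (seg + 1) * SEG]
        data.append(len(in_seg))   # per-segment count field
        data.extend(in_seg)        # deltas within the segment, each fits in 2 bytes
    return data

print(ole_layout([3, 10, 70000]))  # [2, 3, 10, 1, 4464]  (4464 == 70000 - 65536)
```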
Data layout: RLE
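Similarly for RLE, a plausible layout stores runs in row order as (start delta, length) pairs, with each start encoded relative to the end of the previous run so the fields stay small. The delta encoding and field widths here are assumptions:

```python
# Sketch of an RLE physical layout (field widths are illustrative assumptions):
# over-long gaps or runs would be split into multiple pairs in a real layout.
def rle_layout(runs):
    """runs: sorted (start_offset, length) pairs for one value -> flat field list."""
    data = []
    prev_end = 0
    for start, length in runs:
        data.append(start - prev_end)  # delta from the end of the previous run
        data.append(length)
        prev_end = start + length
    return data

print(rle_layout([(0, 3), (7, 1)]))    # [0, 3, 4, 1]
```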
Section 3: Compressed LA
Matrix-vector multiplication
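For q = Xv over compressed column groups, the key idea is to multiply each distinct value tuple with the matching entries of v once, then scatter that scalar to the tuple's row offsets, so the work scales with the number of stored offsets rather than rows times columns. A minimal sketch, reusing the illustrative data structures from the encoding sketches above (not the lecture's actual code):

```python
# Sketch of matrix-vector multiplication over OLE-style column groups (illustrative).
def compressed_matvec(groups, v, num_rows):
    """groups: list of (column_indices, {value_tuple: [row offsets]}); returns X @ v."""
    q = [0.0] * num_rows
    for cols, tuples in groups:
        for values, offsets in tuples.items():
            # one dot product per distinct tuple instead of one multiply per entry
            partial = sum(val * v[c] for val, c in zip(values, cols))
            for row in offsets:
                q[row] += partial
    return q

# One co-coded group over columns 0 and 1, plus a single-column group over column 2.
groups = [
    ([0, 1], {(1.0, 10.0): [0, 2], (2.0, 20.0): [1, 3]}),
    ([2],    {(5.0,): [1, 2]}),
]
print(compressed_matvec(groups, v=[1.0, 1.0, 1.0], num_rows=4))
# [11.0, 27.0, 16.0, 22.0]
```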
Compression planning
Estimating column compression ratios
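Planning works on a small row sample rather than the full matrix. The sketch below is a naive stand-in that scales sample counts up linearly; a real planner would use proper statistical estimators for distinct values, offset counts, and run counts, and the byte costs here are rough assumptions:

```python
# Sketch of sample-based compression-ratio estimation (naive, illustrative only).
import random

def estimate_compression_ratio(column, sample_frac=0.05, seed=0):
    n = len(column)
    random.seed(seed)
    rows = sorted(random.sample(range(n), max(1, int(n * sample_frac))))
    sample = [column[r] for r in rows]
    scale = n / len(sample)

    distinct = len(set(v for v in sample if v != 0))
    nnz = scale * sum(1 for v in sample if v != 0)
    # transitions in the sorted sample: a rough proxy for the number of runs
    runs = scale * sum(1 for i in range(1, len(sample)) if sample[i] != sample[i - 1])

    uncompressed = 8.0 * n                    # dense 8-byte doubles
    ole = 8.0 * distinct + 2.0 * nnz          # dictionary + 2-byte offsets
    rle = 8.0 * distinct + 4.0 * runs         # dictionary + (delta, length) pairs
    return uncompressed / min(ole, rle)

# A long, clustered, low-cardinality column compresses very well (RLE-friendly).
col = [1.0] * 9000 + [2.0] * 1000
print(round(estimate_compression_ratio(col), 1))
```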
Partitioning columns into groups
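One way to partition columns is a greedy merge: start with singleton groups and repeatedly merge the pair whose co-coding shrinks the estimated size the most. This is a simplified sketch, not the exact planning algorithm from the lecture, and the size model is a toy assumption:

```python
# Sketch of greedy column grouping driven by an estimated-size callback (illustrative).
from itertools import combinations

def group_columns(columns, estimate_size):
    """columns: iterable of column ids; estimate_size(col_ids) -> estimated bytes."""
    groups = [(c,) for c in columns]
    while True:
        best_gain, best_pair = 0.0, None
        for g1, g2 in combinations(groups, 2):
            gain = estimate_size(g1) + estimate_size(g2) - estimate_size(g1 + g2)
            if gain > best_gain:
                best_gain, best_pair = gain, (g1, g2)
        if best_pair is None:
            return groups
        g1, g2 = best_pair
        groups = [g for g in groups if g not in (g1, g2)] + [g1 + g2]

# Toy size model (an assumption, not the paper's estimator): dictionary cost grows
# with distinct tuples and group width; offset cost is shared across the group.
data = {"a": [1, 1, 2, 2], "b": [10, 10, 20, 20], "c": [5, 6, 7, 8]}
def size(cols):
    distinct = len(set(zip(*(data[c] for c in cols))))
    return 8.0 * distinct * len(cols) + 2.0 * len(data[cols[0]])

print(group_columns(data, size))   # [('c',), ('a', 'b')]: the correlated pair co-codes
```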
Choosing the encoding format for each group
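Finally, each group gets the format with the smallest estimated footprint. A sketch with rough size formulas (the byte costs are assumptions, consistent with the estimation sketch above):

```python
# Sketch of per-group format selection: keep whichever format is estimated cheapest.
def choose_format(num_rows, num_cols, num_distinct, num_offsets, num_runs):
    sizes = {
        "UC":  8.0 * num_rows * num_cols,                          # dense doubles
        "OLE": 8.0 * num_distinct * num_cols + 2.0 * num_offsets,  # dict + offsets
        "RLE": 8.0 * num_distinct * num_cols + 4.0 * num_runs,     # dict + runs
    }
    return min(sizes, key=sizes.get), sizes

# A long column with few, highly clustered values favours RLE over OLE and UC.
print(choose_format(num_rows=1_000_000, num_cols=1,
                    num_distinct=3, num_offsets=1_000_000, num_runs=50))
```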