Understanding Stacking of Random Hyperplanes for Enhanced Classification

"Explore the innovative technique of stacking random hyperplanes to create new features for improved classification accuracy. Discover the effectiveness of this approach in comparison to traditional linear methods through empirical results on UCI datasets."

  • Hyperplanes
  • Stacking
  • Classification
  • Random projection
  • Empirical results


Presentation Transcript


  1. Stacking, random hyperplanes
     Usman Roshan

  2. Stacking
  • A method for combining classifiers.
  • Instead of taking a majority vote or boosting, use the classifiers' raw (non-thresholded) outputs to create new features.
  • Then apply a classifier to the new representation; a sketch follows below.
  • Closely related to representation learning; also known as stacked generalization.
  • Very little theory is known, but the method works well in practice.
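A minimal sketch of this idea, assuming scikit-learn-style models; the base classifiers, the meta-classifier, and the synthetic dataset below are illustrative choices, not taken from the slides:

    # Stacking sketch: base classifiers' real-valued outputs become
    # new features for a second-level (meta) classifier.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.svm import LinearSVC
    from sklearn.linear_model import LogisticRegression
    from sklearn.tree import DecisionTreeClassifier

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    # Level 0: fit base classifiers and collect their non-thresholded outputs.
    # (A fuller version would use out-of-fold predictions here to avoid leakage.)
    bases = [LogisticRegression(max_iter=1000), DecisionTreeClassifier(max_depth=3)]
    Z_tr = np.column_stack([m.fit(X_tr, y_tr).predict_proba(X_tr)[:, 1] for m in bases])
    Z_te = np.column_stack([m.predict_proba(X_te)[:, 1] for m in bases])

    # Level 1: train a classifier on the new representation Z.
    meta = LinearSVC().fit(Z_tr, y_tr)
    print("stacked accuracy:", meta.score(Z_te, y_te))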

  3. Random hyperplanes
  • Use random hyperplanes in stacking to make new features.
  • Similar in spirit to extreme learning machines.
  • This simple method tends to perform very well.
  • In fact, random hyperplanes also work very well for dimensionality reduction, as the Johnson-Lindenstrauss lemma (next slide) shows.

  4. Johnson-Lindenstrauss lemma
Given any $\epsilon > 0$ and $n$, and $k \ge O(\log(n)/\epsilon^2)$, for any set $P$ of $n$ points in $\mathbb{R}^d$ there exists a lower-dimensional mapping $f: P \to \mathbb{R}^k$ such that for all $u, v \in P$

    $(1-\epsilon)\|u-v\|^2 \le \|f(u)-f(v)\|^2 \le (1+\epsilon)\|u-v\|^2.$

Furthermore, this mapping can be found in randomized polynomial time: simply let each entry of the random projection matrix be sampled from the standard normal (Gaussian) distribution. Why does this work? Because random projections approximately preserve the lengths of vectors, and the distance between $u$ and $v$ is the length of the vector $u - v$.
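A quick numerical check of the lemma (a sketch; the choices of n, d, and k below are arbitrary): project random points with a scaled Gaussian matrix and compare pairwise squared distances before and after.

    # JL check: a Gaussian random projection approximately preserves
    # pairwise squared distances up to a (1 +/- eps) factor.
    import numpy as np

    rng = np.random.default_rng(0)
    n, d, k = 100, 1000, 500           # n points in R^d, projected to R^k
    P = rng.normal(size=(n, d))

    # f(x) = R^T x / sqrt(k), with each R_ij ~ N(0, 1)
    R = rng.normal(size=(d, k)) / np.sqrt(k)
    Q = P @ R

    def pairwise_sq(M):
        # Squared Euclidean distances between all rows of M.
        G = M @ M.T
        s = np.diag(G)
        return s[:, None] + s[None, :] - 2 * G

    mask = ~np.eye(n, dtype=bool)      # ignore zero self-distances
    ratio = pairwise_sq(Q)[mask] / pairwise_sq(P)[mask]
    print("distance ratios in [%.3f, %.3f]" % (ratio.min(), ratio.max()))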

  5. Classification on random hyperplanes
Method: for i = 1 to k:
  • Create a random vector w whose entries are sampled uniformly from [0, 1].
  • Project the training data X (each row is a datapoint x_j) onto w; the projection is z_i = Xw.
  • Append z_i to Z as a new feature, one column per random hyperplane.
Do the same for the test data X' and call its new representation Z'. Run a linear SVM on Z and predict on Z'. How does this perform in comparison to liblinear on the original data X and X'? A sketch follows below.
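A sketch of this procedure, assuming scikit-learn's LinearSVC (which wraps liblinear) as the linear SVM, a synthetic dataset in place of the UCI data, and an arbitrary choice of k = 100:

    # Random-hyperplane features: project the data onto k random vectors
    # with entries uniform in [0, 1], then train a linear SVM on Z.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.svm import LinearSVC

    X, y = make_classification(n_samples=400, n_features=30, random_state=1)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

    k = 100
    rng = np.random.default_rng(1)
    W = rng.uniform(0.0, 1.0, size=(X.shape[1], k))  # one hyperplane per column

    Z_tr = X_tr @ W   # column i is z_i = X w_i
    Z_te = X_te @ W   # same projections for the test data X'

    svm_proj = LinearSVC(max_iter=10000).fit(Z_tr, y_tr)
    svm_orig = LinearSVC(max_iter=10000).fit(X_tr, y_tr)
    print("random hyperplanes:", svm_proj.score(Z_te, y_te))
    print("original data:     ", svm_orig.score(X_te, y_te))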

  6. Empirical results
  • Randomly chosen UCI datasets.
  • A linear SVM applied to the random-hyperplane features.
  • Compared to a linear SVM (liblinear) on the original data.
  • Similar results have been reported elsewhere.
