Factor Analysis: Discovering Latent Variables and Factor Structure
Factor analysis involves uncovering underlying factors from correlations between observed variables using techniques like Principal Components and rotation. It can be exploratory or confirmatory, aiming to reveal the factor structure.
Uploaded on Mar 04, 2025
Presentation Transcript
Description
Goal: discover $m < p$ underlying factors (also known as latent variables) from the covariances or correlations among $p$ observed variables.
Makes use of Principal Components and Rotation to obtain the factors.
Exploratory Factor Analysis: uses observed responses to obtain the factor structure.
Confirmatory Factor Analysis: uses new data to determine whether a hypothesized factor structure is appropriate.
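Orthogonal Factor Model - I
Observed vector: $\mathbf{X} = (X_1, \ldots, X_p)'$ with $E\{\mathbf{X}\} = \boldsymbol{\mu} = (\mu_1, \ldots, \mu_p)'$ and $V\{\mathbf{X}\} = \boldsymbol{\Sigma} = \{\sigma_{ik}\}$ ($p \times p$).
$$\begin{aligned}
X_1 - \mu_1 &= \ell_{11}F_1 + \cdots + \ell_{1m}F_m + \varepsilon_1 \\
&\;\;\vdots \\
X_p - \mu_p &= \ell_{p1}F_1 + \cdots + \ell_{pm}F_m + \varepsilon_p
\end{aligned}$$
Matrix form: $\mathbf{X} - \boldsymbol{\mu} = \mathbf{L}\mathbf{F} + \boldsymbol{\varepsilon}$, where $\mathbf{L} = \{\ell_{ij}\}$ is $p \times m$, $\mathbf{F}$ is $m \times 1$ and $\boldsymbol{\varepsilon}$ is $p \times 1$.
$F_1, \ldots, F_m$: common factors.
$\ell_{ij}$: loading of the $i$th variable on the $j$th factor.
$\varepsilon_1, \ldots, \varepsilon_p$: errors (specific factors).
Assumptions:
$$E\{\mathbf{F}\} = \mathbf{0}, \qquad V\{\mathbf{F}\} = E\{\mathbf{F}\mathbf{F}'\} = \mathbf{I}_{m \times m}$$
$$E\{\boldsymbol{\varepsilon}\} = \mathbf{0}, \qquad V\{\boldsymbol{\varepsilon}\} = E\{\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}'\} = \boldsymbol{\Psi} = \mathrm{diag}(\psi_1, \ldots, \psi_p)$$
$$\mathrm{COV}\{\boldsymbol{\varepsilon}, \mathbf{F}\} = E\{\boldsymbol{\varepsilon}\mathbf{F}'\} = \mathbf{0}_{p \times m}$$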
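Orthogonal Factor Model - II
$$(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})' = (\mathbf{L}\mathbf{F} + \boldsymbol{\varepsilon})(\mathbf{L}\mathbf{F} + \boldsymbol{\varepsilon})' = (\mathbf{L}\mathbf{F} + \boldsymbol{\varepsilon})(\mathbf{F}'\mathbf{L}' + \boldsymbol{\varepsilon}') = \mathbf{L}\mathbf{F}\mathbf{F}'\mathbf{L}' + \mathbf{L}\mathbf{F}\boldsymbol{\varepsilon}' + \boldsymbol{\varepsilon}\mathbf{F}'\mathbf{L}' + \boldsymbol{\varepsilon}\boldsymbol{\varepsilon}'$$
$$\boldsymbol{\Sigma} = V\{\mathbf{X}\} = E\{(\mathbf{X} - \boldsymbol{\mu})(\mathbf{X} - \boldsymbol{\mu})'\} = \mathbf{L}E\{\mathbf{F}\mathbf{F}'\}\mathbf{L}' + \mathbf{L}E\{\mathbf{F}\boldsymbol{\varepsilon}'\} + E\{\boldsymbol{\varepsilon}\mathbf{F}'\}\mathbf{L}' + E\{\boldsymbol{\varepsilon}\boldsymbol{\varepsilon}'\} = \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi}$$
$$(\mathbf{X} - \boldsymbol{\mu})\mathbf{F}' = (\mathbf{L}\mathbf{F} + \boldsymbol{\varepsilon})\mathbf{F}' = \mathbf{L}\mathbf{F}\mathbf{F}' + \boldsymbol{\varepsilon}\mathbf{F}'$$
$$\mathrm{COV}\{\mathbf{X}, \mathbf{F}\} = E\{(\mathbf{X} - \boldsymbol{\mu})\mathbf{F}'\} = \mathbf{L}E\{\mathbf{F}\mathbf{F}'\} + E\{\boldsymbol{\varepsilon}\mathbf{F}'\} = \mathbf{L}$$
$$\mathbf{L}\mathbf{L}' = \begin{bmatrix} \sum_{j=1}^m \ell_{1j}^2 & \cdots & \sum_{j=1}^m \ell_{1j}\ell_{pj} \\ \vdots & \ddots & \vdots \\ \sum_{j=1}^m \ell_{pj}\ell_{1j} & \cdots & \sum_{j=1}^m \ell_{pj}^2 \end{bmatrix}$$
Element by element:
$$V\{X_i\} = \sigma_{ii} = \sum_{j=1}^m \ell_{ij}^2 + \psi_i, \quad i = 1, \ldots, p$$
$$\mathrm{COV}\{X_i, X_k\} = \sigma_{ik} = \sum_{j=1}^m \ell_{ij}\ell_{kj}, \quad i \neq k$$
$$\mathrm{COV}\{X_i, F_k\} = \ell_{ik}$$
Variance decomposition: $V\{X_i\} = \sigma_{ii} = h_i^2 + \psi_i$, $i = 1, \ldots, p$, where $h_i^2 = \ell_{i1}^2 + \cdots + \ell_{im}^2$ is the communality and $\psi_i$ is the uniqueness (specific variance).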
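The two moment results on this slide, $\boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi}$ and $\mathrm{COV}\{\mathbf{X}, \mathbf{F}\} = \mathbf{L}$, can be checked by simulation. The following is a minimal sketch, not part of the original slides: the loadings, specific variances, zero mean, normal draws, seed and sample size are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible

# Illustrative (assumed) loadings and specific variances: p = 4, m = 2.
L = np.array([[6.0, 1.0], [3.0, 5.0], [2.0, -3.0], [2.0, -1.0]])
psi = np.array([1.0, 3.0, 4.0, 2.0])
mu = np.zeros(4)

n = 200_000
F = rng.standard_normal((n, 2))                    # E{F} = 0, V{F} = I
eps = rng.standard_normal((n, 4)) * np.sqrt(psi)   # V{eps} = diag(psi), independent of F
X = mu + F @ L.T + eps                             # row j holds mu + L F_j + eps_j

sigma_model = L @ L.T + np.diag(psi)               # model-implied covariance
sigma_sample = np.cov(X, rowvar=False)             # sample covariance of X
cov_xf = (X - X.mean(axis=0)).T @ (F - F.mean(axis=0)) / (n - 1)

max_err_sigma = np.abs(sigma_sample - sigma_model).max()
max_err_cov_xf = np.abs(cov_xf - L).max()
print(max_err_sigma, max_err_cov_xf)               # both should be small for large n
```

Both discrepancies shrink at the usual $1/\sqrt{n}$ rate, so a smaller `n` gives a looser match.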
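Example 1
$p = 4$, $m = 2$:
$$\boldsymbol{\Sigma} = \begin{bmatrix} 38 & 23 & 9 & 11 \\ 23 & 37 & -9 & 1 \\ 9 & -9 & 17 & 7 \\ 11 & 1 & 7 & 7 \end{bmatrix}, \qquad \mathbf{L} = \begin{bmatrix} 6 & 1 \\ 3 & 5 \\ 2 & -3 \\ 2 & -1 \end{bmatrix}, \qquad \mathbf{L}' = \begin{bmatrix} 6 & 3 & 2 & 2 \\ 1 & 5 & -3 & -1 \end{bmatrix}$$
$$\mathbf{L}\mathbf{L}' = \begin{bmatrix} 37 & 23 & 9 & 11 \\ 23 & 34 & -9 & 1 \\ 9 & -9 & 13 & 7 \\ 11 & 1 & 7 & 5 \end{bmatrix}, \qquad \boldsymbol{\Psi} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 \\ 0 & 0 & 4 & 0 \\ 0 & 0 & 0 & 2 \end{bmatrix}, \qquad \boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi}$$
For $X_1$: communality $h_1^2 = \ell_{11}^2 + \ell_{12}^2 = 6^2 + 1^2 = 37$; specific variance $\psi_1 = 1$; $\sigma_{11} = h_1^2 + \psi_1 = 37 + 1 = 38$.
Parameter counts: $\boldsymbol{\Sigma}$ has $p(p+1)/2$ variances and covariances; the factor model has $pm$ factor loadings and $p$ specific variances.
When $m \ll p$, $\boldsymbol{\Sigma}$ typically cannot be factored exactly into $\mathbf{L}\mathbf{L}' + \boldsymbol{\Psi}$.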
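The arithmetic of Example 1 can be reproduced in a few lines. This is a small sketch using the loading matrix and specific variances of the example; the variable names are chosen here for illustration.

```python
import numpy as np

# Loadings (p = 4 variables, m = 2 factors) and specific variances from Example 1.
L = np.array([[6, 1],
              [3, 5],
              [2, -3],
              [2, -1]])
Psi = np.diag([1, 3, 4, 2])

Sigma = L @ L.T + Psi                  # covariance matrix implied by the factor model
communalities = (L ** 2).sum(axis=1)   # h_i^2 = sum of squared loadings in row i

print(Sigma)
print(communalities)                   # first entry is 6^2 + 1^2 = 37
```

The diagonal of `Sigma` equals `communalities` plus the diagonal of `Psi`, which is the variance decomposition $\sigma_{ii} = h_i^2 + \psi_i$.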
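Example 2
$p = 3$, $m = 1$:
$$\boldsymbol{\Sigma} = \begin{bmatrix} 1 & 0.5 & 0.8 \\ 0.5 & 1 & 0.2 \\ 0.8 & 0.2 & 1 \end{bmatrix}, \qquad \mathbf{L} = \begin{bmatrix} \ell_{11} \\ \ell_{21} \\ \ell_{31} \end{bmatrix}, \qquad \mathbf{L}\mathbf{L}' = \begin{bmatrix} \ell_{11}^2 & \ell_{11}\ell_{21} & \ell_{11}\ell_{31} \\ \ell_{21}\ell_{11} & \ell_{21}^2 & \ell_{21}\ell_{31} \\ \ell_{31}\ell_{11} & \ell_{31}\ell_{21} & \ell_{31}^2 \end{bmatrix}, \qquad \boldsymbol{\Psi} = \begin{bmatrix} \psi_1 & 0 & 0 \\ 0 & \psi_2 & 0 \\ 0 & 0 & \psi_3 \end{bmatrix}$$
$\boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi}$ requires
$$1 = \ell_{11}^2 + \psi_1, \quad 0.5 = \ell_{11}\ell_{21}, \quad 0.8 = \ell_{11}\ell_{31}, \quad 1 = \ell_{21}^2 + \psi_2, \quad 0.2 = \ell_{21}\ell_{31}, \quad 1 = \ell_{31}^2 + \psi_3$$
$$\frac{\ell_{11}\ell_{31}}{\ell_{21}\ell_{31}} = \frac{0.8}{0.2} = 4 \;\Rightarrow\; \ell_{11} = 4\ell_{21} \;\Rightarrow\; 0.5 = \ell_{11}\ell_{21} = 4\ell_{21}^2 \;\Rightarrow\; \ell_{21}^2 = \frac{0.5}{4} = \frac{1}{8}$$
$$\ell_{11}^2 = 16\,\ell_{21}^2 = 2 \;\Rightarrow\; \psi_1 = 1 - \ell_{11}^2 = -1 < 0$$
which is not possible for a real-valued $\varepsilon_1$, because $\psi_1$ is a variance.
Inherent ambiguity in the model: let $\mathbf{T}$ be an orthogonal $m \times m$ matrix, $\mathbf{T}\mathbf{T}' = \mathbf{T}'\mathbf{T} = \mathbf{I}$. Then
$$\mathbf{X} - \boldsymbol{\mu} = \mathbf{L}\mathbf{F} + \boldsymbol{\varepsilon} = \mathbf{L}\mathbf{T}\mathbf{T}'\mathbf{F} + \boldsymbol{\varepsilon} = \mathbf{L}^*\mathbf{F}^* + \boldsymbol{\varepsilon}, \qquad \mathbf{L}^* = \mathbf{L}\mathbf{T}, \quad \mathbf{F}^* = \mathbf{T}'\mathbf{F}$$
$$E\{\mathbf{F}^*\} = \mathbf{T}'E\{\mathbf{F}\} = \mathbf{0}, \qquad V\{\mathbf{F}^*\} = \mathbf{T}'V\{\mathbf{F}\}\mathbf{T} = \mathbf{T}'\mathbf{I}\mathbf{T} = \mathbf{T}'\mathbf{T} = \mathbf{I}$$
$\mathbf{L}$ and $\mathbf{L}^*$ are indistinguishable: $\boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi} = \mathbf{L}^*\mathbf{L}^{*\prime} + \boldsymbol{\Psi}$.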
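Both points of this slide can be verified numerically. The sketch below first solves the one-factor equations of Example 2 from the three off-diagonal entries, then checks that an orthogonal rotation leaves $\mathbf{L}\mathbf{L}'$ unchanged. The 30 degree angle and the 4 x 2 loading matrix in the second part are arbitrary illustrative choices.

```python
import numpy as np

# Part 1: Example 2, one common factor for a 3 x 3 correlation matrix.
s12, s13, s23 = 0.5, 0.8, 0.2        # off-diagonal entries of Sigma
l11_sq = s12 * s13 / s23             # (l11 l21)(l11 l31) / (l21 l31) = l11^2
l21_sq = s12 * s23 / s13
l31_sq = s13 * s23 / s12
psi = 1.0 - np.array([l11_sq, l21_sq, l31_sq])   # psi_i = sigma_ii - l_i1^2
print(l11_sq, l21_sq, psi)           # psi[0] is negative: an improper solution

# Part 2: rotation ambiguity, L* = L T with T orthogonal.
L = np.array([[6.0, 1.0], [3.0, 5.0], [2.0, -3.0], [2.0, -1.0]])  # assumed loadings
phi = np.deg2rad(30.0)               # arbitrary rotation angle
T = np.array([[np.cos(phi), np.sin(phi)],
              [-np.sin(phi), np.cos(phi)]])
L_star = L @ T
same = np.allclose(L @ L.T, L_star @ L_star.T)
print(same)                          # L and L* imply the same Sigma
```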
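Estimation Principal Factor Method - I
Observed data: $\mathbf{x}_1, \ldots, \mathbf{x}_n$ with $\mathbf{x}_j = (x_{j1}, \ldots, x_{jp})'$.
$$\bar{\mathbf{x}} = \frac{1}{n}\sum_{j=1}^n \mathbf{x}_j, \qquad \mathbf{S} = \frac{1}{n-1}\sum_{j=1}^n (\mathbf{x}_j - \bar{\mathbf{x}})(\mathbf{x}_j - \bar{\mathbf{x}})', \qquad \mathbf{R} = \mathbf{D}^{-1/2}\mathbf{S}\mathbf{D}^{-1/2}, \quad \mathbf{D} = \mathrm{diag}(s_{11}, \ldots, s_{pp})$$
Off-diagonal elements of $\mathbf{S}$ and $\mathbf{R}$ $\approx 0$ $\Rightarrow$ little gain from conducting a factor analysis.
Spectral decomposition:
$$\boldsymbol{\Sigma} = \lambda_1\mathbf{e}_1\mathbf{e}_1' + \cdots + \lambda_p\mathbf{e}_p\mathbf{e}_p' = \begin{bmatrix} \sqrt{\lambda_1}\,\mathbf{e}_1 & \cdots & \sqrt{\lambda_p}\,\mathbf{e}_p \end{bmatrix} \begin{bmatrix} \sqrt{\lambda_1}\,\mathbf{e}_1' \\ \vdots \\ \sqrt{\lambda_p}\,\mathbf{e}_p' \end{bmatrix}$$
With $m = p$: $\boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}' + \mathbf{0} = \mathbf{L}\mathbf{L}'$ with $\mathbf{L}$ of dimension $p \times p$. The factor loading vector for factor $j$ is $\sqrt{\lambda_j}\,\mathbf{e}_j$.
When the last $p - m$ eigenvalues are small:
$$\boldsymbol{\Sigma} \approx \begin{bmatrix} \sqrt{\lambda_1}\,\mathbf{e}_1 & \cdots & \sqrt{\lambda_m}\,\mathbf{e}_m \end{bmatrix} \begin{bmatrix} \sqrt{\lambda_1}\,\mathbf{e}_1' \\ \vdots \\ \sqrt{\lambda_m}\,\mathbf{e}_m' \end{bmatrix} = \mathbf{L}\mathbf{L}', \qquad \mathbf{L}: p \times m$$
Variances of the specific factors are obtained as the diagonal elements of $\boldsymbol{\Sigma} - \mathbf{L}\mathbf{L}'$:
$$\psi_i = \sigma_{ii} - \sum_{k=1}^m \ell_{ik}^2, \quad i = 1, \ldots, p$$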
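Estimation Principal Factor Method - II
Observed data: $\mathbf{x}_1, \ldots, \mathbf{x}_n$ with $\bar{\mathbf{x}}$, $\mathbf{S}$ and $\mathbf{R} = \mathbf{D}^{-1/2}\mathbf{S}\mathbf{D}^{-1/2}$ as defined for the Principal Factor Method - I.
Centered data: $\mathbf{x}_j - \bar{\mathbf{x}}$. Makes use of the eigenvalues/eigenvectors of $\mathbf{S}$ for the factors.
Standardized data: $\mathbf{z}_j = \left( \dfrac{x_{j1} - \bar{x}_1}{\sqrt{s_{11}}}, \ldots, \dfrac{x_{jp} - \bar{x}_p}{\sqrt{s_{pp}}} \right)'$. Makes use of the eigenvalues/eigenvectors of $\mathbf{R}$ for the factors.
Spectral decomposition (based on centered data; use $\mathbf{R}$ if standardized):
$$\mathbf{S} = \hat{\lambda}_1\hat{\mathbf{e}}_1\hat{\mathbf{e}}_1' + \cdots + \hat{\lambda}_p\hat{\mathbf{e}}_p\hat{\mathbf{e}}_p'$$
Let $m$ = number of common factors:
$$\tilde{\mathbf{L}} = \begin{bmatrix} \sqrt{\hat{\lambda}_1}\,\hat{\mathbf{e}}_1 & \cdots & \sqrt{\hat{\lambda}_m}\,\hat{\mathbf{e}}_m \end{bmatrix} = \begin{bmatrix} \tilde{\ell}_{11} & \cdots & \tilde{\ell}_{1m} \\ \vdots & \ddots & \vdots \\ \tilde{\ell}_{p1} & \cdots & \tilde{\ell}_{pm} \end{bmatrix} \quad (p \times m)$$
$$\tilde{\boldsymbol{\Psi}} = \mathrm{diag}(\tilde{\psi}_1, \ldots, \tilde{\psi}_p) \quad \text{with} \quad \tilde{\psi}_i = s_{ii} - \sum_{k=1}^m \tilde{\ell}_{ik}^2, \quad i = 1, \ldots, p$$
Communalities: $\tilde{h}_i^2 = \sum_{k=1}^m \tilde{\ell}_{ik}^2$, $i = 1, \ldots, p$.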
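The principal factor recipe above (eigen-decompose $\mathbf{S}$ or $\mathbf{R}$, keep the $m$ largest eigenpairs, read the specific variances off the diagonal) can be sketched with NumPy as follows. The function name `principal_factor` and the input matrix (the covariance matrix of Example 1, used here as if it were a sample $\mathbf{S}$) are assumptions made for illustration.

```python
import numpy as np

def principal_factor(S, m):
    """Principal factor solution with m factors for a symmetric p x p matrix S (or R)."""
    eigvals, eigvecs = np.linalg.eigh(S)            # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]               # re-order: largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    L_tilde = eigvecs[:, :m] * np.sqrt(eigvals[:m]) # column j is sqrt(lambda_j) * e_j
    h2 = (L_tilde ** 2).sum(axis=1)                 # communalities
    psi_tilde = np.diag(S) - h2                     # specific variances
    return L_tilde, psi_tilde, h2, eigvals

# Illustrative input: the covariance matrix of Example 1 treated as S.
S = np.array([[38.0, 23.0, 9.0, 11.0],
              [23.0, 37.0, -9.0, 1.0],
              [9.0, -9.0, 17.0, 7.0],
              [11.0, 1.0, 7.0, 7.0]])
L_tilde, psi_tilde, h2, eigvals = principal_factor(S, m=2)
print(np.round(L_tilde, 3))
print(np.round(psi_tilde, 3))
```

Eigenvectors are only defined up to sign, so individual columns of `L_tilde` may come out negated relative to another program's output; $\tilde{\mathbf{L}}\tilde{\mathbf{L}}'$ is unaffected.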
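Selecting m, the Number of Retained Factors
Matrix of "residuals":
$$\mathbf{S} - (\tilde{\mathbf{L}}\tilde{\mathbf{L}}' + \tilde{\boldsymbol{\Psi}}) = \begin{bmatrix} 0 & \tilde{r}_{12} & \cdots & \tilde{r}_{1p} \\ \tilde{r}_{12} & 0 & \cdots & \tilde{r}_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ \tilde{r}_{1p} & \tilde{r}_{2p} & \cdots & 0 \end{bmatrix}$$
$$\text{Sum of squared elements of } \mathbf{S} - (\tilde{\mathbf{L}}\tilde{\mathbf{L}}' + \tilde{\boldsymbol{\Psi}}) = 2\sum_{i=1}^{p-1}\sum_{k=i+1}^{p} \tilde{r}_{ik}^2 \le \hat{\lambda}_{m+1}^2 + \cdots + \hat{\lambda}_p^2$$
A small value for the sum of squared removed eigenvalues leads to a small sum of squared "residuals".
Proportion of total sample variance attributed to factor $j$:
$$\frac{\hat{\lambda}_j}{s_{11} + \cdots + s_{pp}} \ \text{ if the analysis is based on } \mathbf{S}, \qquad \frac{\hat{\lambda}_j}{p} \ \text{ if the analysis is based on } \mathbf{R}$$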
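The residual bound and the variance proportions on this slide can be computed directly. This is a minimal sketch; the 4 x 4 matrix is an illustrative stand-in for a sample covariance matrix $\mathbf{S}$, and $m = 2$ is an arbitrary choice.

```python
import numpy as np

# Illustrative (assumed) sample covariance matrix and number of retained factors.
S = np.array([[38.0, 23.0, 9.0, 11.0],
              [23.0, 37.0, -9.0, 1.0],
              [9.0, -9.0, 17.0, 7.0],
              [11.0, 1.0, 7.0, 7.0]])
m = 2

eigvals, eigvecs = np.linalg.eigh(S)                  # ascending order
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]    # flip to descending order
L_tilde = eigvecs[:, :m] * np.sqrt(eigvals[:m])       # principal factor loadings
psi_tilde = np.diag(S) - (L_tilde ** 2).sum(axis=1)   # specific variances

residual = S - (L_tilde @ L_tilde.T + np.diag(psi_tilde))
ss_resid = (residual ** 2).sum()                      # sum of squared residuals
bound = (eigvals[m:] ** 2).sum()                      # sum of squared removed eigenvalues
prop = eigvals[:m] / np.trace(S)                      # proportion of variance, factors 1..m

print(ss_resid, bound)
print(np.round(prop, 3), round(prop.sum(), 3))
```

Repeating the calculation for several values of `m` and watching `bound` and the cumulative proportion is one common way to pick the number of factors.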
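Maximum Likelihood Estimation - I (Normal $\mathbf{F}$, $\boldsymbol{\varepsilon}$)
$$\mathbf{F}_j = (F_{j1}, \ldots, F_{jm})' \sim N(\mathbf{0}, \mathbf{I}), \quad \boldsymbol{\varepsilon}_j \sim N(\mathbf{0}, \boldsymbol{\Psi}), \quad \mathrm{COV}\{\mathbf{F}_j, \boldsymbol{\varepsilon}_j\} = \mathbf{0} \;\Rightarrow\; \mathbf{X}_j - \boldsymbol{\mu} = \mathbf{L}\mathbf{F}_j + \boldsymbol{\varepsilon}_j \sim N(\mathbf{0}, \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi})$$
Normal likelihood, with $\boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi}$:
$$L(\boldsymbol{\mu}, \boldsymbol{\Sigma}) = (2\pi)^{-np/2}|\boldsymbol{\Sigma}|^{-n/2}\exp\left\{ -\frac{1}{2}\,\mathrm{trace}\left[ \boldsymbol{\Sigma}^{-1}\left( \sum_{j=1}^n (\mathbf{x}_j - \bar{\mathbf{x}})(\mathbf{x}_j - \bar{\mathbf{x}})' + n(\bar{\mathbf{x}} - \boldsymbol{\mu})(\bar{\mathbf{x}} - \boldsymbol{\mu})' \right) \right] \right\}$$
$$= (2\pi)^{-(n-1)p/2}|\boldsymbol{\Sigma}|^{-(n-1)/2}\exp\left\{ -\frac{1}{2}\,\mathrm{trace}\left[ \boldsymbol{\Sigma}^{-1}\sum_{j=1}^n (\mathbf{x}_j - \bar{\mathbf{x}})(\mathbf{x}_j - \bar{\mathbf{x}})' \right] \right\} \times (2\pi)^{-p/2}|\boldsymbol{\Sigma}|^{-1/2}\exp\left\{ -\frac{n}{2}(\bar{\mathbf{x}} - \boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\bar{\mathbf{x}} - \boldsymbol{\mu}) \right\}$$
Uniqueness condition (needed because $\boldsymbol{\Sigma}$ is not affected by an orthogonal transformation of $\mathbf{L}$): choose $\mathbf{L}$ so that $\mathbf{L}'\boldsymbol{\Psi}^{-1}\mathbf{L} = \boldsymbol{\Delta}$, a diagonal matrix.
$\mathbf{X}_1, \ldots, \mathbf{X}_n \sim N(\boldsymbol{\mu}, \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi})$. The ML estimates $\hat{\mathbf{L}}$, $\hat{\boldsymbol{\Psi}}$ are obtained by specialized programs.
MLE: $\hat{\mathbf{L}}$, $\hat{\boldsymbol{\Psi}}$, $\hat{\boldsymbol{\mu}} = \bar{\mathbf{x}}$ that maximize $L(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ subject to $\hat{\mathbf{L}}'\hat{\boldsymbol{\Psi}}^{-1}\hat{\mathbf{L}}$ being diagonal.
MLE of communalities: $\hat{h}_i^2 = \hat{\ell}_{i1}^2 + \cdots + \hat{\ell}_{im}^2$, $i = 1, \ldots, p$.
Proportion of total sample variance attributable to factor $j$:
$$\frac{\hat{\ell}_{1j}^2 + \cdots + \hat{\ell}_{pj}^2}{s_{11} + \cdots + s_{pp}}$$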
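The slide leaves the maximization itself to specialized programs. The sketch below only illustrates two ingredients, and is not the fitting algorithm: evaluating the normal log-likelihood at $\boldsymbol{\mu} = \bar{\mathbf{x}}$ for a candidate $(\mathbf{L}, \boldsymbol{\Psi})$, and rotating a loading matrix so that $\mathbf{L}'\boldsymbol{\Psi}^{-1}\mathbf{L}$ becomes diagonal. The function names, the simulated data and the candidate parameter values are assumptions.

```python
import numpy as np

def factor_loglik(X, L, psi):
    """Normal log-likelihood at mu = x-bar for Sigma = L L' + diag(psi)."""
    n, p = X.shape
    centered = X - X.mean(axis=0)
    Sn = centered.T @ centered / n                       # divisor n, as in S_n
    Sigma = L @ L.T + np.diag(psi)
    _, logdet = np.linalg.slogdet(Sigma)
    trace_term = np.trace(np.linalg.solve(Sigma, Sn))    # trace(Sigma^{-1} S_n)
    return -0.5 * n * (p * np.log(2.0 * np.pi) + logdet + trace_term)

def impose_uniqueness(L, psi):
    """Rotate L by an orthogonal Q so that L*' Psi^{-1} L* is diagonal."""
    Delta = (L.T / psi) @ L                              # L' Psi^{-1} L (symmetric)
    _, Q = np.linalg.eigh(Delta)                         # Delta = Q diag(.) Q'
    return L @ Q

# Assumed candidate parameters and simulated data, for illustration only.
rng = np.random.default_rng(1)
L = np.array([[6.0, 1.0], [3.0, 5.0], [2.0, -3.0], [2.0, -1.0]])
psi = np.array([1.0, 3.0, 4.0, 2.0])
X = rng.standard_normal((500, 2)) @ L.T + rng.standard_normal((500, 4)) * np.sqrt(psi)

L_star = impose_uniqueness(L, psi)
print(factor_loglik(X, L, psi), factor_loglik(X, L_star, psi))  # equal: same Sigma
print(np.round((L_star.T / psi) @ L_star, 6))                   # diagonal matrix
```

Because $\mathbf{L}^* = \mathbf{L}\mathbf{Q}$ gives the same $\boldsymbol{\Sigma}$, the likelihood cannot tell the two apart; the diagonality constraint just picks one representative.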
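Maximum Likelihood Estimation - II (Normal $\mathbf{F}$, $\boldsymbol{\varepsilon}$)
ML estimation based on standardized measurements (correlation matrix), with $\mathbf{V} = \mathrm{diag}(\sigma_{11}, \ldots, \sigma_{pp})$:
$$\boldsymbol{\rho} = \mathbf{V}^{-1/2}\boldsymbol{\Sigma}\mathbf{V}^{-1/2} = (\mathbf{V}^{-1/2}\mathbf{L})(\mathbf{V}^{-1/2}\mathbf{L})' + \mathbf{V}^{-1/2}\boldsymbol{\Psi}\mathbf{V}^{-1/2} = \mathbf{L}_Z\mathbf{L}_Z' + \boldsymbol{\Psi}_Z$$
$$\text{MLE:} \quad \hat{\boldsymbol{\rho}} = (\hat{\mathbf{V}}^{-1/2}\hat{\mathbf{L}})(\hat{\mathbf{V}}^{-1/2}\hat{\mathbf{L}})' + \hat{\mathbf{V}}^{-1/2}\hat{\boldsymbol{\Psi}}\hat{\mathbf{V}}^{-1/2} = \hat{\mathbf{L}}_Z\hat{\mathbf{L}}_Z' + \hat{\boldsymbol{\Psi}}_Z$$
$$\hat{\mathbf{L}}_Z = \begin{bmatrix} \hat{\ell}_{z11} & \cdots & \hat{\ell}_{z1m} \\ \vdots & \ddots & \vdots \\ \hat{\ell}_{zp1} & \cdots & \hat{\ell}_{zpm} \end{bmatrix}, \qquad \hat{\ell}_{zik} = \frac{\hat{\ell}_{ik}}{\sqrt{\hat{\sigma}_{ii}}}, \qquad \hat{\psi}_{zi} = \frac{\hat{\psi}_i}{\hat{\sigma}_{ii}}$$
where $\hat{\sigma}_{ii}$ is the $i$th diagonal element of $\mathbf{S}_n = \dfrac{1}{n}\sum_{j=1}^n (\mathbf{x}_j - \bar{\mathbf{x}})(\mathbf{x}_j - \bar{\mathbf{x}})'$.
MLE of communalities: $\hat{h}_{zi}^2 = \hat{\ell}_{zi1}^2 + \cdots + \hat{\ell}_{zim}^2$, $i = 1, \ldots, p$.
Proportion of total sample variance attributable to factor $j$:
$$\frac{\hat{\ell}_{z1j}^2 + \cdots + \hat{\ell}_{zpj}^2}{p}$$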
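Large-Sample Test for # of Common Factors (m)
$$H_0: \boldsymbol{\Sigma} = \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi} \ \ (\mathbf{L}: p \times m) \qquad H_A: \boldsymbol{\Sigma} \text{ is any other positive definite matrix}$$
$$\hat{\boldsymbol{\Sigma}} = \hat{\mathbf{L}}\hat{\mathbf{L}}' + \hat{\boldsymbol{\Psi}} \ \text{(ML estimate under } H_0\text{)}, \qquad \mathbf{S}_n = \frac{1}{n}\sum_{j=1}^n (\mathbf{x}_j - \bar{\mathbf{x}})(\mathbf{x}_j - \bar{\mathbf{x}})' \ \text{(unrestricted ML estimate)}$$
Likelihood ratio statistic:
$$-2\ln\Lambda = -2\ln\left[ \frac{\max_{H_0} L(\boldsymbol{\mu}, \boldsymbol{\Sigma})}{\max L(\boldsymbol{\mu}, \boldsymbol{\Sigma})} \right] = -2\ln\left( \frac{|\hat{\boldsymbol{\Sigma}}|}{|\mathbf{S}_n|} \right)^{-n/2} = n\ln\left( \frac{|\hat{\boldsymbol{\Sigma}}|}{|\mathbf{S}_n|} \right) = n\ln\left( \frac{|\hat{\mathbf{L}}\hat{\mathbf{L}}' + \hat{\boldsymbol{\Psi}}|}{|\mathbf{S}_n|} \right)$$
Number of variance parameters under $H_A$: $p + \binom{p}{2} = p + \dfrac{p(p-1)}{2} = \dfrac{p(p+1)}{2}$.
Number of variance parameters under $H_0$: $pm + p - \dfrac{m(m-1)}{2}$. The last term is the number of constraints imposed by $\mathbf{L}'\boldsymbol{\Psi}^{-1}\mathbf{L} = \boldsymbol{\Delta}$.
Degrees of freedom for the likelihood ratio statistic:
$$\frac{p(p+1)}{2} - \left[ pm + p - \frac{m(m-1)}{2} \right] = \frac{p^2 + p - 2pm - 2p + m^2 - m}{2} = \frac{1}{2}\left[ (p - m)^2 - p - m \right] > 0$$
Bartlett's correction for the LR statistic: replace $n$ with $n - 1 - \dfrac{2p + 4m + 5}{6}$.
$$\text{Reject } H_0 \text{ if } \left( n - 1 - \frac{2p + 4m + 5}{6} \right)\ln\left( \frac{|\hat{\mathbf{L}}\hat{\mathbf{L}}' + \hat{\boldsymbol{\Psi}}|}{|\mathbf{S}_n|} \right) > \chi^2_{\alpha;\,[(p-m)^2 - p - m]/2}$$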
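Given ML estimates from some fitting program, the Bartlett-corrected statistic and its degrees of freedom are simple to compute. The following is a sketch under that assumption: the function name `lr_test_statistic` is hypothetical, the inputs `L_hat` and `psi_hat` must be genuine ML estimates for the statistic to be meaningful, and the numbers in the usage lines are made up purely to exercise the arithmetic. The chi-square critical value is left to a table or a statistics library.

```python
import numpy as np

def lr_test_statistic(Sn, L_hat, psi_hat, n):
    """Bartlett-corrected LR statistic and degrees of freedom for H0: Sigma = LL' + Psi."""
    p, m = L_hat.shape
    Sigma_hat = L_hat @ L_hat.T + np.diag(psi_hat)
    _, logdet_hat = np.linalg.slogdet(Sigma_hat)       # ln |L_hat L_hat' + Psi_hat|
    _, logdet_sn = np.linalg.slogdet(Sn)               # ln |S_n|
    df = ((p - m) ** 2 - p - m) / 2                    # must be positive
    n_adj = n - 1 - (2 * p + 4 * m + 5) / 6            # Bartlett's replacement for n
    return n_adj * (logdet_hat - logdet_sn), df

# Made-up inputs (p = 6, m = 2), only to show the call and the df arithmetic.
L_hat = np.array([[0.8, 0.1], [0.7, 0.2], [0.9, 0.0],
                  [0.1, 0.6], [0.2, 0.8], [0.0, 0.7]])
psi_hat = 1.0 - (L_hat ** 2).sum(axis=1)
Sigma_hat = L_hat @ L_hat.T + np.diag(psi_hat)
stat, df = lr_test_statistic(Sigma_hat, L_hat, psi_hat, n=100)
print(stat, df)   # a perfect fit gives a statistic of zero; df = (16 - 8) / 2
```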
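Factor Rotation
Goal: rotate the factor loading matrix $\hat{\mathbf{L}}$ so that each variable loads highly on one factor.
$\hat{\mathbf{L}}$: $p \times m$ matrix of estimated factor loadings. Rotated loadings: $\hat{\mathbf{L}}^* = \hat{\mathbf{L}}\mathbf{T}$ with $\mathbf{T}\mathbf{T}' = \mathbf{T}'\mathbf{T} = \mathbf{I}$.
If $m = 2$ (or rotating 2 factors at a time):
$$\text{Clockwise: } \mathbf{T} = \begin{bmatrix} \cos\phi & \sin\phi \\ -\sin\phi & \cos\phi \end{bmatrix} \qquad \text{Counter-clockwise: } \mathbf{T} = \begin{bmatrix} \cos\phi & -\sin\phi \\ \sin\phi & \cos\phi \end{bmatrix}$$
Varimax rotation: with scaled rotated loadings $\tilde{\ell}^*_{ik} = \hat{\ell}^*_{ik} / \hat{h}_i$, choose $\mathbf{T}$ to maximize
$$V = \frac{1}{p}\sum_{k=1}^m \left[ \sum_{i=1}^p \tilde{\ell}^{*4}_{ik} - \frac{1}{p}\left( \sum_{i=1}^p \tilde{\ell}^{*2}_{ik} \right)^2 \right]$$
$$V \propto \sum_{j=1}^m (\text{variance of squared scaled loadings for factor } j)$$
Oblique rotations: allow for correlations among factors and can ease interpretation in some cases.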
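For $m = 2$ the varimax criterion depends on a single angle, so it can be maximized by a simple grid search. The sketch below does exactly that; it is a brute-force illustration rather than the iterative algorithm used by statistical packages, and the loading matrix, the grid resolution and the function names are assumptions.

```python
import numpy as np

def varimax_criterion(L_star):
    """Varimax V for a p x m loading matrix, using loadings scaled by sqrt(communality)."""
    p = L_star.shape[0]
    h = np.sqrt((L_star ** 2).sum(axis=1))      # h_i, unchanged by orthogonal rotation
    sq = (L_star / h[:, None]) ** 2             # squared scaled loadings
    return ((sq ** 2).sum(axis=0) - sq.sum(axis=0) ** 2 / p).sum() / p

def rotation(phi):
    """Clockwise 2 x 2 rotation matrix T for angle phi (radians)."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, s], [-s, c]])

# Assumed example: a near-simple structure that has been rotated away by 0.5 rad.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.9, 0.1],
                   [0.0, 0.6], [0.1, 0.8], [0.0, 0.7]])
L_hat = simple @ rotation(0.5)

# V repeats every 90 degrees (a quarter turn only swaps/negates columns).
angles = np.linspace(0.0, np.pi / 2, 901)
values = np.array([varimax_criterion(L_hat @ rotation(a)) for a in angles])
best = angles[values.argmax()]
L_rot = L_hat @ rotation(best)

print(np.round(L_rot, 3))
print(varimax_criterion(L_hat), varimax_criterion(L_rot))
```

The rotated loadings reproduce the same $\hat{\mathbf{L}}\hat{\mathbf{L}}'$ and communalities as `L_hat`; only the split of each communality across the two factors changes.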
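Estimating Factor Scores - Weighted Least Squares
$\mathbf{F}_j$: unobserved, random factor score for the $j$th unit. Realized (unobserved) value: $\mathbf{f}_j$. Estimate: $\hat{\mathbf{f}}_j$.
Standard methods treat $\hat{\ell}_{ik}$, $\hat{\psi}_i$ as the true $\ell_{ik}$, $\psi_i$, and can be applied to rotated or unrotated loadings.
Weighted least squares (WLS) weights: $\dfrac{1}{\psi_i} = \dfrac{1}{V\{\varepsilon_i\}}$. Minimize
$$\sum_{i=1}^p \frac{\varepsilon_i^2}{\psi_i} = \boldsymbol{\varepsilon}'\boldsymbol{\Psi}^{-1}\boldsymbol{\varepsilon} = (\mathbf{x} - \boldsymbol{\mu} - \mathbf{L}\mathbf{f})'\boldsymbol{\Psi}^{-1}(\mathbf{x} - \boldsymbol{\mu} - \mathbf{L}\mathbf{f}) = (\mathbf{x} - \boldsymbol{\mu})'\boldsymbol{\Psi}^{-1}(\mathbf{x} - \boldsymbol{\mu}) - 2(\mathbf{x} - \boldsymbol{\mu})'\boldsymbol{\Psi}^{-1}\mathbf{L}\mathbf{f} + \mathbf{f}'\mathbf{L}'\boldsymbol{\Psi}^{-1}\mathbf{L}\mathbf{f}$$
$$\frac{\partial}{\partial \mathbf{f}}\sum_{i=1}^p \frac{\varepsilon_i^2}{\psi_i} = -2\mathbf{L}'\boldsymbol{\Psi}^{-1}(\mathbf{x} - \boldsymbol{\mu}) + 2\mathbf{L}'\boldsymbol{\Psi}^{-1}\mathbf{L}\mathbf{f} \overset{\text{set}}{=} \mathbf{0} \;\Rightarrow\; \mathbf{L}'\boldsymbol{\Psi}^{-1}\mathbf{L}\hat{\mathbf{f}} = \mathbf{L}'\boldsymbol{\Psi}^{-1}(\mathbf{x} - \boldsymbol{\mu})$$
$$\hat{\mathbf{f}}^{WLS} = (\mathbf{L}'\boldsymbol{\Psi}^{-1}\mathbf{L})^{-1}\mathbf{L}'\boldsymbol{\Psi}^{-1}(\mathbf{x} - \boldsymbol{\mu})$$
Based on $\mathbf{S}$:
$$\text{ML: } \hat{\mathbf{f}}_j^{WLS} = (\hat{\mathbf{L}}'\hat{\boldsymbol{\Psi}}^{-1}\hat{\mathbf{L}})^{-1}\hat{\mathbf{L}}'\hat{\boldsymbol{\Psi}}^{-1}(\mathbf{x}_j - \bar{\mathbf{x}}) \qquad \text{PF: } \tilde{\mathbf{f}}_j^{WLS} = (\tilde{\mathbf{L}}'\tilde{\boldsymbol{\Psi}}^{-1}\tilde{\mathbf{L}})^{-1}\tilde{\mathbf{L}}'\tilde{\boldsymbol{\Psi}}^{-1}(\mathbf{x}_j - \bar{\mathbf{x}})$$
Based on $\mathbf{R}$:
$$\text{ML: } \hat{\mathbf{f}}_j^{WLS} = (\hat{\mathbf{L}}_Z'\hat{\boldsymbol{\Psi}}_Z^{-1}\hat{\mathbf{L}}_Z)^{-1}\hat{\mathbf{L}}_Z'\hat{\boldsymbol{\Psi}}_Z^{-1}\mathbf{z}_j \qquad \text{PF: } \tilde{\mathbf{f}}_j^{WLS} = (\tilde{\mathbf{L}}_Z'\tilde{\boldsymbol{\Psi}}_Z^{-1}\tilde{\mathbf{L}}_Z)^{-1}\tilde{\mathbf{L}}_Z'\tilde{\boldsymbol{\Psi}}_Z^{-1}\mathbf{z}_j$$
When based on rotated loadings $\hat{\mathbf{L}}^* = \hat{\mathbf{L}}\mathbf{T}$:
$$\hat{\mathbf{f}}_j^{*WLS} = \mathbf{T}'\hat{\mathbf{f}}_j^{WLS}, \qquad \tilde{\mathbf{f}}_j^{*WLS} = \mathbf{T}'\tilde{\mathbf{f}}_j^{WLS}$$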
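The WLS score formula translates directly into code. This is a minimal sketch that treats the supplied loadings and specific variances as known, as the slide does; the function name `wls_scores`, the simulated data and the parameter values are illustrative assumptions.

```python
import numpy as np

def wls_scores(X, L, psi):
    """WLS factor scores (L' Psi^{-1} L)^{-1} L' Psi^{-1} (x_j - x-bar), one row per unit."""
    psi = np.asarray(psi, dtype=float)
    centered = X - X.mean(axis=0)
    A = L.T / psi                                  # L' Psi^{-1}, shape m x p
    return np.linalg.solve(A @ L, A @ centered.T).T

# Assumed loadings / specific variances and simulated data, for illustration only.
rng = np.random.default_rng(2)
L = np.array([[6.0, 1.0], [3.0, 5.0], [2.0, -3.0], [2.0, -1.0]])
psi = np.array([1.0, 3.0, 4.0, 2.0])
F = rng.standard_normal((200, 2))
X = 10.0 + F @ L.T + rng.standard_normal((200, 4)) * np.sqrt(psi)

scores = wls_scores(X, L, psi)
print(scores[:3])                                  # estimated scores for the first 3 units
print(np.corrcoef(scores[:, 0], F[:, 0])[0, 1])    # close to 1 when psi is small relative to L
```

Solving the linear system with `np.linalg.solve` avoids forming the explicit inverse of $\mathbf{L}'\boldsymbol{\Psi}^{-1}\mathbf{L}$.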
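Estimating Factor Scores - Regression Approach - I
As with WLS, the regression approach uses $\hat{\mathbf{L}}$, $\hat{\boldsymbol{\Psi}}$ in place of $\mathbf{L}$, $\boldsymbol{\Psi}$.
$$\mathbf{X} - \boldsymbol{\mu} = \mathbf{L}\mathbf{F} + \boldsymbol{\varepsilon}, \quad \mathbf{F} \sim N(\mathbf{0}, \mathbf{I}), \quad \boldsymbol{\varepsilon} \sim N(\mathbf{0}, \boldsymbol{\Psi}) \;\Rightarrow\; \mathbf{X} - \boldsymbol{\mu} \sim N(\mathbf{0}, \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi})$$
$$\mathrm{COV}\{\mathbf{X}, \mathbf{F}\} = \mathrm{COV}\{\mathbf{L}\mathbf{F} + \boldsymbol{\varepsilon}, \mathbf{F}\} = \mathbf{L}V\{\mathbf{F}\} = \mathbf{L} \;\Rightarrow\; \begin{bmatrix} \mathbf{X} - \boldsymbol{\mu} \\ \mathbf{F} \end{bmatrix} \sim N\left( \begin{bmatrix} \mathbf{0} \\ \mathbf{0} \end{bmatrix}, \begin{bmatrix} \mathbf{L}\mathbf{L}' + \boldsymbol{\Psi} & \mathbf{L} \\ \mathbf{L}' & \mathbf{I} \end{bmatrix} \right)$$
$$E\{\mathbf{F} \mid \mathbf{x}\} = \mathbf{L}'(\mathbf{L}\mathbf{L}' + \boldsymbol{\Psi})^{-1}(\mathbf{x} - \boldsymbol{\mu}), \qquad V\{\mathbf{F} \mid \mathbf{x}\} = \mathbf{I} - \mathbf{L}'(\mathbf{L}\mathbf{L}' + \boldsymbol{\Psi})^{-1}\mathbf{L}$$
$$\hat{\mathbf{f}}_j^{R} = \hat{\mathbf{L}}'\hat{\boldsymbol{\Sigma}}^{-1}(\mathbf{x}_j - \bar{\mathbf{x}}) = \hat{\mathbf{L}}'(\hat{\mathbf{L}}\hat{\mathbf{L}}' + \hat{\boldsymbol{\Psi}})^{-1}(\mathbf{x}_j - \bar{\mathbf{x}})$$
Note:
$$\hat{\mathbf{L}}'(\hat{\mathbf{L}}\hat{\mathbf{L}}' + \hat{\boldsymbol{\Psi}})^{-1} = (\mathbf{I} + \hat{\mathbf{L}}'\hat{\boldsymbol{\Psi}}^{-1}\hat{\mathbf{L}})^{-1}\hat{\mathbf{L}}'\hat{\boldsymbol{\Psi}}^{-1} \;\Rightarrow\; \hat{\mathbf{f}}_j^{R} = (\mathbf{I} + \hat{\mathbf{L}}'\hat{\boldsymbol{\Psi}}^{-1}\hat{\mathbf{L}})^{-1}\hat{\mathbf{L}}'\hat{\boldsymbol{\Psi}}^{-1}(\mathbf{x}_j - \bar{\mathbf{x}})$$
$$(\hat{\mathbf{L}}'\hat{\boldsymbol{\Psi}}^{-1}\hat{\mathbf{L}})^{-1}(\mathbf{I} + \hat{\mathbf{L}}'\hat{\boldsymbol{\Psi}}^{-1}\hat{\mathbf{L}})\,\hat{\mathbf{f}}_j^{R} = (\hat{\mathbf{L}}'\hat{\boldsymbol{\Psi}}^{-1}\hat{\mathbf{L}})^{-1}\hat{\mathbf{L}}'\hat{\boldsymbol{\Psi}}^{-1}(\mathbf{x}_j - \bar{\mathbf{x}}) = \hat{\mathbf{f}}_j^{WLS}$$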
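The matrix identity in the note, and the resulting link between regression and WLS scores, can be confirmed numerically. This is a sketch with assumed loadings, specific variances and simulated data; `regression_scores` is a name chosen here for illustration.

```python
import numpy as np

def regression_scores(X, L, psi):
    """Regression factor scores L' (L L' + Psi)^{-1} (x_j - x-bar), one row per unit."""
    centered = X - X.mean(axis=0)
    Sigma = L @ L.T + np.diag(psi)
    # Sigma^{-1} (x_j - x-bar) for all j, then multiply by L (Sigma is symmetric).
    return np.linalg.solve(Sigma, centered.T).T @ L

# Assumed parameters and simulated data, for illustration only.
rng = np.random.default_rng(3)
L = np.array([[6.0, 1.0], [3.0, 5.0], [2.0, -3.0], [2.0, -1.0]])
psi = np.array([1.0, 3.0, 4.0, 2.0])
m = L.shape[1]
X = rng.standard_normal((100, m)) @ L.T + rng.standard_normal((100, 4)) * np.sqrt(psi)

Delta = (L.T / psi) @ L                                   # L' Psi^{-1} L
lhs = L.T @ np.linalg.inv(L @ L.T + np.diag(psi))         # L' (L L' + Psi)^{-1}
rhs = np.linalg.solve(np.eye(m) + Delta, L.T / psi)       # (I + Delta)^{-1} L' Psi^{-1}
print(np.allclose(lhs, rhs))                              # the identity in the note

F_r = regression_scores(X, L, psi)
centered = X - X.mean(axis=0)
F_w = np.linalg.solve(Delta, (L.T / psi) @ centered.T).T  # WLS scores
# Row form of f_WLS = Delta^{-1} (I + Delta) f_R = (I + Delta^{-1}) f_R.
print(np.allclose(F_w, F_r @ (np.eye(m) + np.linalg.inv(Delta))))
```

The regression scores are shrunk toward zero relative to the WLS scores, by a factor governed by $\boldsymbol{\Delta}^{-1}$.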
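Estimating Factor Scores - Regression Approach - II
Based on ML estimates: $\hat{\mathbf{L}}'\hat{\boldsymbol{\Psi}}^{-1}\hat{\mathbf{L}} = \hat{\boldsymbol{\Delta}}$ (diagonal matrix), so
$$(\hat{\mathbf{L}}'\hat{\boldsymbol{\Psi}}^{-1}\hat{\mathbf{L}})^{-1}(\mathbf{I} + \hat{\mathbf{L}}'\hat{\boldsymbol{\Psi}}^{-1}\hat{\mathbf{L}}) = \hat{\boldsymbol{\Delta}}^{-1}(\mathbf{I} + \hat{\boldsymbol{\Delta}}) = \mathbf{I} + \hat{\boldsymbol{\Delta}}^{-1} \;\Rightarrow\; \hat{\mathbf{f}}_j^{WLS} = (\mathbf{I} + \hat{\boldsymbol{\Delta}}^{-1})\,\hat{\mathbf{f}}_j^{R}$$
If the elements of $\hat{\boldsymbol{\Delta}}^{-1}$ are small, $\hat{\mathbf{f}}_j^{WLS} \approx \hat{\mathbf{f}}_j^{R}$.
In practice $\hat{\boldsymbol{\Sigma}} = \hat{\mathbf{L}}\hat{\mathbf{L}}' + \hat{\boldsymbol{\Psi}}$ is often replaced with $\mathbf{S}$ (or $\mathbf{R}$):
$$\text{Based on } \mathbf{S}: \quad \hat{\mathbf{f}}_j^{R} = \hat{\mathbf{L}}'\mathbf{S}^{-1}(\mathbf{x}_j - \bar{\mathbf{x}})$$
$$\text{Based on } \mathbf{R}: \quad \hat{\mathbf{f}}_j^{R} = \hat{\mathbf{L}}_Z'\mathbf{R}^{-1}\mathbf{z}_j \quad \text{where} \quad \hat{\boldsymbol{\rho}} = \hat{\mathbf{L}}_Z\hat{\mathbf{L}}_Z' + \hat{\boldsymbol{\Psi}}_Z$$
Rotated loadings: $\hat{\mathbf{f}}_j^{*R} = \mathbf{T}'\hat{\mathbf{f}}_j^{R}$.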