Loadings matrix has incorrect shape when using principal method with lapack #137

Open
LuciaCam opened this issue Jun 17, 2024 · 1 comment

@LuciaCam

Bug Description
When using method='principal' with svd_method='lapack', the loadings matrix returned by FactorAnalyzer always contains every component: it has shape n_cols x n_cols instead of keeping only the loadings for the requested n_factors. With svd_method='randomized', the shape is correct.

Reproducible Code

import pandas as pd
import numpy as np
from factor_analyzer import FactorAnalyzer

num_rows = 1000
num_cols = 6
df = pd.DataFrame(
    np.random.standard_normal(size=(num_rows, num_cols)), 
    columns=[f'col{i+1}' for i in range(num_cols)])

# shape is correct with randomized
efa = FactorAnalyzer(n_factors=2, rotation='promax', method='principal', svd_method='randomized')
efa.fit(df)
print(efa.loadings_.shape)

# shape is incorrect with lapack
efa = FactorAnalyzer(n_factors=2, rotation='promax', method='principal', svd_method='lapack')
efa.fit(df)
print(efa.loadings_.shape)

Expected behavior
The shape of the .loadings_ attribute should be n_cols x n_factors.
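
The difference between the two shapes can be illustrated without factor_analyzer. The following is a minimal NumPy sketch, not the library's implementation: it assumes the extra columns come from a full SVD returning all n_cols components, and it computes plain principal-component loadings (singular vectors scaled by the square root of the singular values) from a correlation matrix. The variable names are illustrative only.

```python
import numpy as np

# Illustrative data with the same dimensions as the report above.
rng = np.random.default_rng(0)
n_rows, n_cols, n_factors = 1000, 6, 2
X = rng.standard_normal((n_rows, n_cols))

# Correlation matrix of the columns, shape (n_cols, n_cols).
corr = np.corrcoef(X, rowvar=False)

# A full (LAPACK-backed) SVD returns every component, not just n_factors.
U, s, Vt = np.linalg.svd(corr)

# Using all components gives an n_cols x n_cols loadings matrix.
full_loadings = Vt.T * np.sqrt(s)

# Keeping only the leading n_factors components gives n_cols x n_factors.
truncated_loadings = Vt[:n_factors].T * np.sqrt(s[:n_factors])

print(full_loadings.shape)       # (6, 6)
print(truncated_loadings.shape)  # (6, 2)
```

In this sketch the truncation happens before any rotation. Slicing efa.loadings_ after a promax rotation over all six components would not necessarily give the same result as rotating two factors, so it is not offered here as a workaround.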

Versions:

  • OS: Windows 10
  • Python: 3.10.10
  • factor_analyzer: 0.5.1 / numpy: 1.26.1 / scipy: 1.11.3 / pandas: 2.1.1
@desilinguist
Member

Thanks for your feedback, @LuciaCam. I will look into this.
