UnboundLocalError: local variable 'is_global' referenced before assignment #34

Open
GabyBG opened this issue Nov 7, 2023 · 2 comments

GabyBG commented Nov 7, 2023

Hello,
I am trying to run Spectra using cell type labels:

import Spectra
import scanpy as sc
import pandas as pd
import numpy as np
import cytopus as cp

#subset my dataset from cytopus
G = cp.KnowledgeBase()
celltype_of_interest = ['T']
global_celltypes = ['all-cells','leukocyte']
G.get_celltype_processes(celltype_of_interest,global_celltypes = global_celltypes,get_children=True,get_parents =False)
annotations = G.celltype_process_dict

#Run spectra
model = Spectra.est_spectra(
    adata=adata, 
    gene_set_dictionary=annotations, 
    use_highly_variable=True,
    cell_type_key="predicted.celltype.l1", 
    use_weights=True,
    lam=0.1, #varies depending on data and gene sets, try between 0.5 and 0.001
    delta=0.001, 
    kappa=None,
    rho=0.001, 
    use_cell_types=False,
    n_top_vals=50,
    label_factors=True, 
    overlap_threshold=0.2,
    clean_gs = True, 
    min_gs_num = 3,
    num_epochs=5000
)

Training runs to completion, but the following error is then raised:

Cell type labels in gene set annotation dictionary and AnnData object are identical
removing gene set T for cell type global which is of length 14 0 genes are found in the data. minimum length is 3
removing gene set global for cell type global which is of length 150 0 genes are found in the data. minimum length is 3
Your gene set annotation dictionary is now correctly formatted.
/home/ubuntu/anaconda3/envs/scFates-gpu/lib/python3.8/site-packages/numpy/core/fromnumeric.py:3464: RuntimeWarning: Mean of empty slice.
  return _methods._mean(a, axis=axis, dtype=dtype,
/home/ubuntu/anaconda3/envs/scFates-gpu/lib/python3.8/site-packages/numpy/core/_methods.py:192: RuntimeWarning: invalid value encountered in scalar divide
  ret = ret.dtype.type(ret / rcount)
100%|██████████████████████████████████████████████████████████████████████████████████| 5000/5000 [38:15<00:00,  2.18it/s]
---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
Cell In[12], line 1
----> 1 model = Spectra.est_spectra(
      2     adata=adata, 
      3     gene_set_dictionary=annotations, 
      4     use_highly_variable=True,
      5     cell_type_key="predicted.celltype.l1", 
      6     use_weights=True,
      7     lam=0.1, #varies depending on data and gene sets, try between 0.5 and 0.001
      8     delta=0.001, 
      9     kappa=None,
     10     rho=0.001, 
     11     use_cell_types=False,
     12     n_top_vals=50,
     13     label_factors=True, 
     14     overlap_threshold=0.2,
     15     clean_gs = True, 
     16     min_gs_num = 3,
     17     num_epochs=5000
     18 )

File ~/anaconda3/envs/scFates-gpu/lib/python3.8/site-packages/Spectra/Spectra.py:1314, in est_spectra(adata, gene_set_dictionary, L, use_highly_variable, cell_type_key, use_weights, lam, delta, kappa, rho, use_cell_types, n_top_vals, filter_sets, label_factors, clean_gs, min_gs_num, overlap_threshold, **kwargs)
   1311 #labeling function
   1312 if label_factors:
   1313     #get cell type specificity of every factor
-> 1314     if is_global == False:
   1315         celltype_dict = get_factor_celltypes(adata, cell_type_key, cellscore=spectra.cell_scores)
   1316         max_celltype = [celltype_dict[x] for x in range(spectra.cell_scores.shape[1])]

UnboundLocalError: local variable 'is_global' referenced before assignment
@Tobiaspk
Collaborator

Hi, it seems that the genes defined in the annotations variable are not found in your adata object (the log reports 0 genes found for both gene sets), which causes the error you're seeing. We'll implement a more verbose warning in the next patch. Thanks for raising this issue.

To solve this, ensure that the gene names in adata.var_names match those in your annotations. Cytopus uses capital letters only for gene names, without spaces or special characters (for example STAT6, RAB1A). Please let me know if that helps.
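
A quick way to check the match before training is to count, per gene set, how many genes actually occur in adata.var_names. The following is only a minimal sketch: the helper `gene_set_overlap` and the toy data are hypothetical and not part of Spectra or Cytopus, and it assumes the annotations are a nested dictionary of the form cell type -> gene set name -> list of genes.

```python
def gene_set_overlap(annotations, var_names):
    """Count, per gene set, how many genes occur in var_names.

    annotations: nested dict {cell_type: {gene_set_name: [genes]}} (assumed layout)
    var_names:   iterable of gene names, e.g. list(adata.var_names)
    """
    present = set(var_names)
    report = {}
    for cell_type, gene_sets in annotations.items():
        for name, genes in gene_sets.items():
            n_found = sum(1 for g in genes if g in present)
            report[(cell_type, name)] = (n_found, len(genes))
    return report


# Toy data standing in for the real objects (hypothetical values).
toy_annotations = {
    "global": {"all_example_process": ["STAT6", "RAB1A", "ACTB"]},
    "T": {"T_example_process": ["CD3E", "CD4"]},
}
toy_var_names = ["Stat6", "Rab1a", "Actb", "Cd3e", "Cd4"]  # mixed-case names

before = gene_set_overlap(toy_annotations, toy_var_names)
# Upper-casing the names is one possible fix when only the case differs.
after = gene_set_overlap(toy_annotations, [g.upper() for g in toy_var_names])

for key in before:
    print(key, "found before:", before[key][0], "after:", after[key][0], "of", before[key][1])
```

With real data, you would pass your own annotations and list(adata.var_names). Gene sets whose match count falls below min_gs_num (3 in the call above) are presumably the ones the log reports as removed.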

Tobiaspk mentioned this issue Nov 13, 2023
@ErikaZ95

Hi. Even after doing as suggested, the same error was thrown at the end of training. I then flattened the dictionary so that it has only one level, and the training succeeded. However, I guess this is not the right way to do it, since you want a hierarchy of cell type -> processes (correct me if I am wrong).
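
For anyone who wants to try the same workaround, such a flattening could look roughly like the sketch below. It assumes the nested layout cell type -> gene set name -> list of genes. The helper `flatten_gene_sets` and the toy dictionary are hypothetical and not part of Spectra or Cytopus. Note that the log above lists the top-level keys `T` and `global` as gene sets, which suggests (but does not prove) that with use_cell_types=False the dictionary is read as a single level.

```python
def flatten_gene_sets(nested):
    """Collapse {cell_type: {gene_set_name: [genes]}} into {gene_set_name: [genes]}.

    If the same gene set name appears under several cell types, later
    occurrences are prefixed with their cell type to avoid overwriting.
    """
    flat = {}
    for cell_type, gene_sets in nested.items():
        for name, genes in gene_sets.items():
            key = name if name not in flat else f"{cell_type}_{name}"
            flat[key] = list(genes)
    return flat


# Toy dictionary standing in for the Cytopus output (hypothetical values).
toy_nested = {
    "global": {"all_example_process": ["STAT6", "RAB1A"]},
    "T": {
        "T_example_process": ["CD3E"],
        "all_example_process": ["CD4"],  # name collision with the global set
    },
}

toy_flat = flatten_gene_sets(toy_nested)
print(sorted(toy_flat))
```

The trade-off is the one already mentioned: the flat dictionary loses the cell type -> processes hierarchy, so every gene set is treated the same regardless of cell type.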
