Correct more than two batches #36

Open
parkjooyoung99 opened this issue Apr 10, 2022 · 5 comments

@parkjooyoung99

Hello,
I am running fastMNN on a data set that has 2 batch factors, 'dataset' and 'condition'.

I tried to correct both at once by specifying batch as
fastMNN(combined,batch=c(combined$dataset, combined$condition), subset.row=chosen.hvgs)
but this does not work.

Is there a way to correct multiple batches?

Thank you :)

@LTLA
Owner

LTLA commented Apr 10, 2022

I assume you mean you have two factors, of which each combination of levels is to be treated as a batch. If so, just paste the factors together. c() doesn't make sense here.
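As a minimal sketch of the difference between `c()` and pasting, using base R only: the toy `dataset` and `condition` vectors below are made-up stand-ins for `combined$dataset` and `combined$condition`, and the `fastMNN()` call is left as a comment that simply reuses the arguments from the original post.

```r
# Toy per-cell annotations (hypothetical values, for illustration only).
dataset   <- c("A", "A", "B", "C", "D", "D")
condition <- c("patient", "patient", "patient", "normal", "normal", "normal")

# c() concatenates the two vectors end to end, so the result has
# twice as many entries as there are cells and cannot serve as a batch vector.
bad <- c(dataset, condition)
length(bad)

# paste() combines the two vectors element-wise: one label per cell,
# one distinct label per observed combination of levels.
batch <- paste(dataset, condition, sep = "_")
table(batch)

# The pasted labels would then be supplied as the batch, e.g.:
# fastMNN(combined, batch = batch, subset.row = chosen.hvgs)
```

With these toy vectors, `bad` has 12 entries for 6 cells, while `batch` has one label per cell.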

@parkjooyoung99
Author

parkjooyoung99 commented May 22, 2022

@LTLA

I tried that, but it did not actually correct the batches.
I guess the following is happening.

Let's assume we have 4 datasets, A, B, C and D, where A and B consist of patients and C and D of normal samples.
What I expected is a correction of the differences A vs B vs C vs D, plus (A, B) vs (C, D).
However, when I merge them as A_patient, B_patient, C_normal, D_normal,
fastMNN only corrects the differences A_patient vs B_patient vs C_normal vs D_normal, but not (A_patient, B_patient) vs (C_normal, D_normal).

Hope that there will be an update about this!

Thank you!

@LTLA
Owner

LTLA commented May 22, 2022

I don't understand your issue. If fastMNN has removed batch effects between the individual datasets, then why is the (A, B) vs (C, D) still present? If A == B == C == D, then this implies that (A, B) == (C, D); what makes you think otherwise?

@parkjooyoung99
Author

parkjooyoung99 commented May 23, 2022

Batch effects between individual datasets and the batch effect between patient and normal are two different things.

When correcting batches between datasets, the differences among all 4 datasets should be considered.
However, to correct the batch between patient and normal, only the difference between patients and normals should be considered, not the differences between individuals.

When the two factors are merged into labels of the form dataset_condition (e.g. A_patient, C_normal),
fastMNN does not consider the difference between patient and normal, since it treats each merged label as one batch level.
In other words, A_patient and B_patient should be recognized as the same level (patient) in the context of patient/normal, but fastMNN does not treat them as the same level.

Indeed, it did not correct the patient/normal batch.
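The design described here can be checked with a small base R sketch. The toy vectors are assumptions standing in for the real annotations; the point is only that when every dataset belongs to exactly one condition, the merged labels split the cells into the same groups as `dataset` alone.

```r
# Toy per-cell annotations (hypothetical values, for illustration only).
dataset   <- c("A", "A", "B", "C", "D", "D")
condition <- c("patient", "patient", "patient", "normal", "normal", "normal")

# Cross-tabulate the two factors to see which combinations occur.
tab <- table(dataset, condition)
tab

# TRUE if each dataset occurs in exactly one condition,
# i.e. dataset is nested within condition.
nested <- all(rowSums(tab > 0) == 1)

# In a nested design the merged label has as many distinct values as
# dataset itself, so it defines the same grouping of cells.
merged <- paste(dataset, condition, sep = "_")
same_grouping <- length(unique(merged)) == length(unique(dataset))
```

Here both `nested` and `same_grouping` are `TRUE`: the merged factor has four levels, one per dataset.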

I hope this clarifies my idea!

@LTLA
Owner

LTLA commented May 23, 2022

If you don't want to correct the differences between A and B, then just set the "patient/normal" factor as your batch=, in which case the differences within the patient level (i.e., between individuals) will be ignored. From experience with human data, I don't think this is a good idea, but whatever floats your boat.
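A rough sketch of the two choices of `batch=` discussed in this thread, again with made-up toy vectors in place of `combined$dataset` and `combined$condition`. The commented `fastMNN()` call reuses the arguments from the original post and is not run here.

```r
# Toy per-cell annotations (hypothetical values, for illustration only).
dataset   <- c("A", "A", "B", "C", "D", "D")
condition <- c("patient", "patient", "patient", "normal", "normal", "normal")

# Option 1: one batch per dataset/condition combination.
# Differences between all four groups are treated as batch effects.
batch_combined <- factor(paste(dataset, condition, sep = "_"))
nlevels(batch_combined)

# Option 2: the patient/normal factor alone.
# A and B are pooled into one batch, as are C and D, so differences
# between individual datasets within a condition are ignored.
batch_condition <- factor(condition)
nlevels(batch_condition)

# Option 2 corresponds to a call along the lines of:
# fastMNN(combined, batch = combined$condition, subset.row = chosen.hvgs)
```

With the toy data, option 1 yields four batch levels and option 2 yields two.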

If this is not what you are referring to, then I would suggest preparing a minimum reproducible example before proceeding with any further discussion.
