Can we exclude certain data and labels based on a condition? #28
Comments
You can remove data at training time (in |
Ok thank you.
The quality check was done manually: essentially, by visual inspection of the pre-processing steps (registration, segmentation) and of the motion parameters.
Ok thank you.
Hi. But if I put this step in the Classifier under fit, as in

    def fit(self, X, y):
        X_new = X[some_good_idx]
        y_new = y[some_good_idx]
        self.clf.fit(X_new, y_new)

    def predict(self, X):
        return self.clf.predict(X)

    def predict_proba(self, X):
        return self.clf.predict_proba(X)

it crashed when running the CV evaluation with the error `X has a different shape than during fitting`.
Can you submit it? I can look at the trace there.
Modifying the starting kit, this should be something like this:

    from sklearn.base import BaseEstimator
    from sklearn.base import TransformerMixin


    class FeatureExtractor(BaseEstimator, TransformerMixin):
        def fit(self, X_df, y):
            return self

        def transform(self, X_df):
            # get only the anatomical information
            X = X_df[[col for col in X_df.columns if col.startswith('anatomy')]]
            return X

and for the classifier:

    from sklearn.base import BaseEstimator
    from sklearn.preprocessing import StandardScaler
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline


    class Classifier(BaseEstimator):
        def __init__(self):
            self.clf = make_pipeline(StandardScaler(), LogisticRegression())

        def fit(self, X, y):
            X_select = X['anatomy_select'] == 1
            self.clf.fit(X[X_select], y[X_select.values])
            return self

        def predict(self, X):
            return self.clf.predict(X)

        def predict_proba(self, X):
            return self.clf.predict_proba(X)
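To see the pattern above end to end, here is a minimal sketch on synthetic data. The columns `anatomy_a` and `anatomy_b`, the sample size, and the labels are made up for illustration; only `anatomy_select` and the `Classifier` layout come from the thread. It assumes that rows with `anatomy_select == 1` are the ones that passed quality control.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


class Classifier(BaseEstimator):
    def __init__(self):
        self.clf = make_pipeline(StandardScaler(), LogisticRegression())

    def fit(self, X, y):
        # Rows and labels are filtered with the same boolean mask,
        # so they stay aligned.
        keep = (X['anatomy_select'] == 1).to_numpy()
        self.clf.fit(X[keep], y[keep])
        return self

    def predict_proba(self, X):
        # No filtering here: every row passed in gets a prediction.
        return self.clf.predict_proba(X)


# Synthetic stand-in for the extracted features (hypothetical columns).
rng = np.random.RandomState(0)
n = 40
X = pd.DataFrame({
    'anatomy_a': rng.normal(size=n),
    'anatomy_b': rng.normal(size=n),
    # Three out of every four rows are marked as passing QC.
    'anatomy_select': np.tile([1, 1, 1, 2], n // 4),
})
y = np.tile([0, 1], n // 2)  # alternating labels, both classes present

clf = Classifier().fit(X, y)
proba = clf.predict_proba(X)
n_used = int((X['anatomy_select'] == 1).sum())
print(n_used, proba.shape)
```

The filter lives in `fit` because that is the only place where both the data and the labels are available together; `predict` and `predict_proba` keep the full set of rows and the same columns that `fit` saw.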
I tried and it works locally with the |
Thank you, guys. I have tested the modified anatomy code and it works. I think the error was caused by my trying to exclude the QC columns (i.e. anatomy_select) in fit. It should be fine to include that column, since its values will all be ones in the training data and it will be removed by feature selection.
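The mismatch described here can be reproduced in a small sketch. The data and the `anatomy_a` / `anatomy_b` columns are synthetic and hypothetical, and a bare `LogisticRegression` stands in for the full pipeline; the point is only that dropping a column in `fit` but not in `predict` changes the number of features between the two calls.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
n = 40
X = pd.DataFrame({
    'anatomy_a': rng.normal(size=n),
    'anatomy_b': rng.normal(size=n),
    'anatomy_select': np.tile([1, 1, 1, 2], n // 4),
})
y = np.tile([0, 1], n // 2)
keep = (X['anatomy_select'] == 1).to_numpy()

# Fit on two features: the QC column is dropped at training time only.
model = LogisticRegression()
model.fit(X[keep].drop(columns='anatomy_select'), y[keep])

# Predicting on the unmodified frame passes three features, so
# scikit-learn rejects the input.
try:
    model.predict(X)
    mismatch = False
except ValueError:
    mismatch = True

# Applying the same column selection at prediction time is consistent.
pred = model.predict(X.drop(columns='anatomy_select'))
print(mismatch, pred.shape)
```

Either keeping the column in both `fit` and `predict`, or dropping it in both, avoids the error; what matters is that the two code paths see the same columns.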
Based on the instructions, my understanding is that we have to provide the two basic functions, FeatureExtractor() and Classifier(). I would like to access the whole data set and exclude some of the samples, which means I will also have to exclude their corresponding labels. I can exclude the data based on a condition each time the FeatureExtractor is called, but I can't do the same for the labels through it. So my question is whether we have to execute all the commands before FeatureExtractor is called (because that would solve my problem) or not.