Meeting Notes June 23 2018
srinivasannambi edited this page Jun 24, 2018 · 1 revision
- R2 of 96.2% and explained variance of 96.9%
- RMSE of 5.36 (everything is now reported in IC50 units; previously it was on a normalized scale)
- Mean Absolute Error 2.8
- Median Absolute Error 2.59
- For most predictions, the error is only about 2.5 in either direction
- Linear SVM: found features for which only this model captures the right relationship.
- No other model (ensemble or not) had performance close to that.
- LASSO and Ridge: both did pretty badly initially. However, more testing is still needed with reduced feature sets.
- Adaptive Boosting: did well on MSE.
- Multi-layer Perceptron: did well on MSE.
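The error metrics listed above can be reproduced with a few lines of Python. This is a minimal sketch using only the standard library; `regression_report` is a hypothetical helper name, not something taken from the project repo, and the inputs are assumed to be observed and predicted IC50 values in the same units. Explained variance ignores a constant bias in the predictions while R2 does not, which is one way the two numbers can differ slightly.

```python
import math
import statistics


def regression_report(y_true, y_pred):
    """Return R2, explained variance, RMSE, mean and median absolute error.

    Hypothetical helper: y_true and y_pred are equal-length sequences of
    observed and predicted values (assumed here to be IC50 values).
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    residuals = [t - p for t, p in zip(y_true, y_pred)]
    abs_residuals = [abs(r) for r in residuals]

    mean_true = statistics.fmean(y_true)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    ss_res = sum(r ** 2 for r in residuals)             # residual sum of squares
    if ss_tot == 0:
        raise ValueError("y_true is constant, so R2 is undefined")

    return {
        "r2": 1.0 - ss_res / ss_tot,
        # Uses the variance of the residuals, so a constant offset is not penalized.
        "explained_variance": 1.0
        - statistics.pvariance(residuals) / statistics.pvariance(y_true),
        "rmse": math.sqrt(ss_res / len(residuals)),
        "mean_abs_error": statistics.fmean(abs_residuals),
        "median_abs_error": statistics.median(abs_residuals),
    }
```

If scikit-learn is already in use, `sklearn.metrics` provides `r2_score`, `explained_variance_score`, `mean_absolute_error` and `median_absolute_error`, which compute the same quantities.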
Split into two groups: Group 1 starts looking into inference, and Group 2 continues with modeling.
- Extract and analyze feature importance. Incorporate other methods, such as clustering and similarity analysis, focusing on the best IC50 (OSM-S-169).
- Work side by side with the modeling group. Provide insights and reasoning behind the recommendations.
- Build the inference pipeline to take the well-performing feature sets as input and provide the output. Feature sets with an R2 of 80% or higher should count as well-performing.
- Making sure the test predictions look realistic is important. Which features would we need to narrow down for the testing/experiment?
- Call out the risks of testing our recommendation based on possible error.
- Consider building something like a prediction interval so we can call out a wider window of possibility and the risks involved.
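One simple way to get the kind of prediction interval mentioned in the last item is to widen each point prediction by an empirical quantile of the absolute errors on held-out data. The sketch below is an assumption about how this could be done, not the team's method; `residual_interval` and its parameters are hypothetical names. The rank formula follows the split-conformal convention, except that it is capped at the largest observed error instead of returning an infinite interval when the held-out set is too small.

```python
import math


def residual_interval(point_prediction, holdout_true, holdout_pred, coverage=0.9):
    """Return (lower, upper) around a point prediction.

    The half-width is an empirical quantile of the absolute errors the model
    made on held-out compounds (hypothetical inputs, same units as the
    prediction, e.g. IC50).
    """
    if not 0.0 < coverage < 1.0:
        raise ValueError("coverage must be strictly between 0 and 1")

    abs_errors = sorted(abs(t - p) for t, p in zip(holdout_true, holdout_pred))
    n = len(abs_errors)
    if n == 0:
        raise ValueError("need at least one held-out example")

    # Conformal-style rank, capped at n (an approximation for small n).
    rank = min(n, math.ceil((n + 1) * coverage))
    half_width = abs_errors[rank - 1]
    return point_prediction - half_width, point_prediction + half_width
```

A wider `coverage` gives a wider window, which makes the risk of acting on a single predicted value easier to state explicitly.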
- Review Blake's Streamline: 9 models wrapped into one function. You pass a string, similar to how we do grid search, and the function runs grid search on all models.
- Continue working on new models on individual paths, find new approaches that work and provide recommendations for predictors.
- Multiple sets of features are performing well. Tried with 80 features; try with 300.
- Divide and conquer: run the step method algorithm with different models and starting features, moving from a local maximum to a newer peak. Aim to make the step methods more robust.
- Combine efforts on best model. Use as a starting point and run step algorithm.
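- Consider least angle regression (LARS): instead of a single ordinary least squares fit, it adds predictors step by step, moving along the direction that makes equal angles with the predictors most correlated with the residual.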
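The "step method with different starting features" idea from the modeling items above can be sketched as greedy hill climbing over feature subsets, restarted from several starting points so that one run stuck on a local maximum does not decide the result. This is an illustrative sketch only: `step_search`, `multi_start_search` and `score_fn` are hypothetical names, and `score_fn` is assumed to return something like a cross-validated R2 for a model trained on the given subset.

```python
def step_search(all_features, score_fn, start, max_steps=100):
    """Greedy hill climbing: add or remove one feature per step.

    Stops when no single-feature change improves the score, i.e. at a
    local maximum. Returns (best_subset, best_score).
    """
    current = frozenset(start)
    best_score = score_fn(current)
    for _ in range(max_steps):
        # Neighbours differ from the current subset by exactly one feature.
        neighbours = [current ^ {f} for f in all_features]
        candidates = [(score_fn(n), n) for n in neighbours if n]  # skip empty set
        if not candidates:
            break
        top_score, top_subset = max(candidates, key=lambda c: c[0])
        if top_score <= best_score:
            break  # local maximum reached
        current, best_score = top_subset, top_score
    return current, best_score


def multi_start_search(all_features, score_fn, starts):
    """Run step_search from each starting subset and keep the best result."""
    results = [step_search(all_features, score_fn, s) for s in starts]
    return max(results, key=lambda r: r[1])
```

Each group member could supply a different `score_fn` (one per model) and different `starts`, then compare the returned subsets. Since every step rescoring all neighbours is expensive with 300 features, caching scores per subset would be a natural next refinement.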
- Github Repo
- Folder Structure For Reuse. How well it's working. The why behind it.
- Where we are and our Inferences So Far.
- Accuracy of the model, test results. Best predictors.
- Demo ?
- Series 3 as a representation of the Selleck DB [Series 3: compounds that they thought might be potent and for which they determined IC50 values. Selleck compounds: no IC50; compounds that they have purchased and intend to run experiments on based on our recommendations]
- What we are working on now.