The name of the R script is Multivariate-Data-Anlysis.R
This script implements a multivariate data analysis framework that extracts meaningful information from the Breast Cancer data and predicts the diagnosis from the recorded measurements using pattern recognition.
We propose two dimensionality reduction methods, one widely used and the other inspired by the multiple comparison method, and compare different classification approaches for predicting the Breast Cancer diagnosis.
The data come from the Center for Machine Learning and Intelligent Systems at the University of California, Irvine: https://archive.ics.uci.edu/ml/datasets/breast+cancer+wisconsin+(original)
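As a rough illustration of the reduce-then-classify workflow described above, here is a minimal sketch in base R. It is not the code in the script: the data are simulated stand-ins for the Breast Cancer measurements, PCA stands in for the "widely used" reduction method, logistic regression is just one possible classifier, and the feature count, number of components, and variable names are all assumptions made for the example.

```r
# Minimal sketch: PCA for dimensionality reduction, then a classifier.
# The data are simulated stand-ins, NOT the Breast Cancer data.
set.seed(1)
n <- 200                                  # number of samples (assumed)
p <- 9                                    # number of features (assumed)
y <- factor(rep(c("benign", "malignant"), each = n / 2))
X <- matrix(rnorm(n * p), n, p)
X[y == "malignant", ] <- X[y == "malignant", ] + 1   # shift one class mean

# Hold out 50 samples for testing
train <- sample(n, 150)

# Fit PCA on the training rows only, then project both sets
pca <- prcomp(X[train, ], center = TRUE, scale. = TRUE)
k <- 2                                    # components kept (assumed)
Z_train <- pca$x[, 1:k]
Z_test  <- predict(pca, X[-train, ])[, 1:k]

# Logistic regression on the reduced features (one possible classifier)
fit  <- glm(y ~ ., data = data.frame(y = y[train], Z_train),
            family = binomial)
prob <- predict(fit, newdata = data.frame(Z_test), type = "response")
pred <- ifelse(prob > 0.5, "malignant", "benign")

# Share of held-out samples classified correctly
accuracy <- mean(pred == y[-train])
print(accuracy)
```

Fitting the PCA on the training rows only and reusing it to project the held-out rows keeps the test set out of the reduction step, which makes the accuracy estimate more honest.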
The name of the R script is Bayesian_decision.R
This project aims to simulate the data and the results of a Bayesian decision-theoretic design. The theory comes from the paper below.
Lee, B. L., Fan, S. K., & Lu, Y. (2016). A curve-free Bayesian decision-theoretic design for two-agent Phase I trials. Journal of Biopharmaceutical Statistics, 27(1), 34-43. doi:10.1080/10543406.2016.1148713
The name of the R script is Bosch.R
Bosch, one of the world's leading manufacturing companies, wants to improve its manufacturing processes. Through www.Kaggle.com, Bosch provides manufacturing records from each step of its assembly lines. Using numerical, categorical, and time data, the goal of our project (“Bosch project”) is to predict failures, that is, to build an algorithm that helps Bosch visualize potential failures and adjust its manufacturing procedures.
We focus on comparing dimension reduction methods, since the KNN and random forest classifiers give similar results on the original data. We compare three dimension reduction methods: PCA, LDA, and random forest. We then use the k-nearest neighbor (KNN) method to predict failures.
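The comparison described above can be sketched roughly as follows. This is an illustrative example on simulated data, not the Bosch records and not the code in Bosch.R. It covers only PCA and LDA as reduction steps before KNN; the random forest based reduction is left out because it needs an extra package. The sample sizes, feature counts, number of components, and k = 5 are assumptions. It uses the MASS and class packages, which normally ship with R.

```r
library(MASS)    # lda()
library(class)   # knn()

# Simulated stand-in for imbalanced pass/fail data (NOT the Bosch data)
set.seed(42)
n <- 300
p <- 20                                   # number of features (assumed)
y <- factor(rep(c("pass", "fail"), times = c(240, 60)))
X <- matrix(rnorm(n * p), n, p)
X[y == "fail", 1:5] <- X[y == "fail", 1:5] + 1.5   # failures differ in 5 features

train <- sample(n, 200)                   # 200 training rows, 100 test rows

# PCA: keep the first 5 principal components (assumed)
pca       <- prcomp(X[train, ], center = TRUE, scale. = TRUE)
pca_train <- pca$x[, 1:5]
pca_test  <- predict(pca, X[-train, ])[, 1:5]

# LDA: with two classes there is a single discriminant direction
lda_fit   <- lda(X[train, ], grouping = y[train])
lda_train <- predict(lda_fit, X[train, ])$x
lda_test  <- predict(lda_fit, X[-train, ])$x

# KNN accuracy on the held-out rows for a given reduced representation
knn_acc <- function(tr, te) {
  mean(knn(tr, te, cl = y[train], k = 5) == y[-train])
}

acc <- c(PCA = knn_acc(pca_train, pca_test),
         LDA = knn_acc(lda_train, lda_test))
print(acc)
```

With rare failures, plain accuracy can look good even when few failures are caught, so a real comparison would probably also report a metric that accounts for class imbalance.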