Random Forest

Omar Halawa (ohalawa@ucsd.edu) of the GenePattern Team @ Mesirov Lab - UCSD

This repository contains a GenePattern module written in Python 3, using the following Docker image.

It performs random forest classification, a machine learning algorithm that builds an ensemble of decision trees, in one of two modes: cross-validation (takes one dataset as input) or test-train prediction (takes two datasets, test and train). Each dataset consists of two input files, one for feature data (.gct) and one for target data (.cls). The module processes these files and performs classification via Scikit-learn's RandomForestClassifier. It generates a prediction results file (.pred.odf) that compares the "true" class to the model's prediction and, in the case of test-train prediction, also outputs a feature importance file (.feat.odf). The module also supports importing and exporting trained models. Classifier parameters are exposed as optional arguments for use within GenePattern.
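As a rough illustration of the two modes described above, the following minimal sketch uses Scikit-learn directly. It is not the module's actual code: it loads Scikit-learn's bundled iris data instead of parsing .gct/.cls files, and the fold count (`cv=5`), the 70/30 split, and the parameter values are assumptions chosen only for the example.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict, train_test_split

# Stand-in data: the module itself reads features from .gct and targets from .cls.
X, y = load_iris(return_X_y=True)

# Mode 1: cross-validation on a single dataset.
# Every sample gets a prediction from a model that never saw it during fitting.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
cv_pred = cross_val_predict(clf, X, y, cv=5)  # cv=5 is an assumed fold count

# Mode 2: test-train prediction on two datasets.
# Here one dataset is split in two purely to have a train set and a test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
clf.fit(X_train, y_train)
test_pred = clf.predict(X_test)

# Per-feature importances of the fitted forest (they sum to 1).
importances = clf.feature_importances_

# Pair each "true" class with the model's prediction, as a results file would.
for true_label, predicted in list(zip(y_test, test_pred))[:5]:
    print(true_label, predicted)
```

Comparing `y` with `cv_pred`, or `y_test` with `test_pred`, gives the kind of true-versus-predicted pairing that the prediction results file records.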

Documentation on usage and implementation is found here. A detailed step-by-step explanation of how the Random Forest algorithm works is found here. All source files are available for better reproducibility and portability, including cross-validation runs for the all_aml_train (.gct, .cls), BRCA_HUGO (.gct, .cls), and iris (.gct, .cls) datasets, as well as a test-train run with all_aml_test (.gct, .cls) and all_aml_train (.gct, .cls), all with output examples. These outputs are only "examples" because the classifier utilizes randomness, so each run varies. However, to see how randomness can be "reproduced," read this.
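One common way to make a random forest run repeatable in Scikit-learn is to fix the `random_state` parameter of RandomForestClassifier. The sketch below shows that general technique. Whether the module exposes the seed in exactly this way is an assumption here, and the helper name `fit_importances` is hypothetical.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Stand-in data; the module itself works on .gct/.cls inputs.
X, y = load_iris(return_X_y=True)


def fit_importances(seed):
    """Fit a forest with the given seed and return its feature importances."""
    clf = RandomForestClassifier(n_estimators=50, random_state=seed)
    clf.fit(X, y)
    return clf.feature_importances_


# Two runs with the same integer seed build identical forests.
same_a = fit_importances(42)
same_b = fit_importances(42)
repeatable = np.allclose(same_a, same_b)

# With random_state=None the forest draws fresh randomness on each run,
# so its importances will generally differ from one run to the next.
unseeded = fit_importances(None)

print("seeded runs match:", repeatable)
```

Leaving the seed unset corresponds to the run-to-run variation noted above, while fixing it trades that variation for repeatable outputs.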

Also see the GPU-backed CuPy-based implementation of this module, RandomForest.GPU, for potentially faster jobs.

