This repository is a collection of our group's work during the Summer@ICERM 2020 REU program.
- To view our GitHub Pages site, click here.
- For a more thorough explanation of the mathematical background and results, click here.
- To access the slides from our final presentation, click here.
The code from a continuation of this work can be found in notebooks/ID_Test.ipynb. The associated paper can be found on arXiv.
The notebooks folder contains all of our coding experiments.
We use a database of real pictures of faces to extract the components of an average face, which can be added up to reconstruct approximations to any specific face.
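A minimal sketch of this idea, assuming the standard eigenfaces approach (an SVD of the mean-centered image matrix); the notebooks may use a different routine. The random array is a stand-in for a real face dataset, and the names `eigenfaces` and `reconstruct` are illustrative, not taken from the repository.

```python
import numpy as np

def eigenfaces(faces, k):
    """faces: (n_images, n_pixels) array of flattened images.
    Returns the mean face and the top-k principal directions."""
    mean_face = faces.mean(axis=0)
    centered = faces - mean_face
    # Rows of Vt are orthonormal directions in pixel space ("eigenfaces").
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return mean_face, Vt[:k]

def reconstruct(face, mean_face, components):
    """Approximate a face as the mean face plus a weighted sum of components."""
    weights = components @ (face - mean_face)
    return mean_face + weights @ components

# Stand-in data (assumption): 50 random "images" of 64 pixels each.
rng = np.random.default_rng(0)
faces = rng.random((50, 64))
mean_face, comps = eigenfaces(faces, k=10)
face_approx = reconstruct(faces[0], mean_face, comps)
```

Increasing `k` adds more components to the sum and can only reduce the reconstruction error for a given face.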
We use randomization to find low-rank approximations of images, making it easier to use these images for data analysis and computation.
The left-most image is the original image, and the rest are various forms of approximations.
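One common way to do this is a randomized range finder followed by a small SVD. The sketch below assumes that approach and a grayscale image stored as a 2D array; the function name `randomized_low_rank` and the `oversample` parameter are illustrative choices, not names from the notebooks.

```python
import numpy as np

def randomized_low_rank(A, rank, oversample=10, seed=0):
    """Return U, s, Vt such that (U * s) @ Vt is a rank-`rank` approximation of A."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    k = min(rank + oversample, n)
    # Random test matrix: A @ Omega samples the column space of A.
    Omega = rng.standard_normal((n, k))
    Q, _ = np.linalg.qr(A @ Omega)      # orthonormal basis for the sampled range
    B = Q.T @ A                         # small matrix with at most k rows
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small
    return U[:, :rank], s[:rank], Vt[:rank]

# Stand-in "image" (assumption): a 60 x 40 matrix of exact rank 5.
rng = np.random.default_rng(1)
image = rng.random((60, 5)) @ rng.random((5, 40))
U, s, Vt = randomized_low_rank(image, rank=5)
image_approx = (U * s) @ Vt
rel_error = np.linalg.norm(image - image_approx) / np.linalg.norm(image)
```

The expensive SVD is applied only to the small matrix `B`, which is where the savings over a full SVD of the image come from.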
We numerically verify some of the claims made in the Johnson-Lindenstrauss Lemma.
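The lemma says, roughly, that n points can be mapped to k = O(log n / ε²) dimensions by a random linear map so that all pairwise distances are preserved within a factor of 1 ± ε with high probability. A minimal sketch of a numerical check, assuming a scaled Gaussian projection matrix (one of several valid constructions); the helper names are illustrative.

```python
import numpy as np

def pairwise_sq_dists(X):
    """Squared Euclidean distances between all distinct pairs of rows of X."""
    sq = np.sum(X**2, axis=1)
    D = sq[:, None] + sq[None, :] - 2 * X @ X.T
    return D[np.triu_indices(len(X), k=1)]

def jl_distortion(X, k, seed=0):
    """Project rows of X to k dimensions and return the smallest and largest
    ratio of projected to original squared pairwise distance."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Scaling by 1/sqrt(k) makes squared lengths correct in expectation.
    R = rng.standard_normal((d, k)) / np.sqrt(k)
    ratios = pairwise_sq_dists(X @ R) / pairwise_sq_dists(X)
    return ratios.min(), ratios.max()

# Stand-in data (assumption): 30 random points in 1000 dimensions.
rng = np.random.default_rng(0)
points = rng.standard_normal((30, 1000))
ratio_min, ratio_max = jl_distortion(points, k=500)
```

Ratios close to 1 mean the projection preserved that pair's distance well; shrinking `k` widens the interval between `ratio_min` and `ratio_max`.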
We compute approximate least-squares solutions to linear systems that do not have an exact solution, and then compare their accuracy to that of the best vector from a randomly sampled set.
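A minimal sketch of such a comparison, assuming candidates are drawn from a standard Gaussian (the sampling scheme in the notebook may differ) and that accuracy is measured by the residual norm ||Ax - b||. The function name `least_squares_vs_random` is illustrative.

```python
import numpy as np

def least_squares_vs_random(A, b, n_samples=1000, seed=0):
    """Return the least-squares residual and the smallest residual
    among n_samples randomly drawn candidate vectors."""
    rng = np.random.default_rng(seed)
    x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)
    ls_residual = np.linalg.norm(A @ x_ls - b)
    # Each row of `candidates` is one random guess for x.
    candidates = rng.standard_normal((n_samples, A.shape[1]))
    residuals = np.linalg.norm(candidates @ A.T - b, axis=1)
    return ls_residual, residuals.min()

# Stand-in system (assumption): 100 equations, 5 unknowns, no exact solution.
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 5))
b = rng.standard_normal(100)
ls_res, best_random_res = least_squares_vs_random(A, b)
```

Because the least-squares solution minimizes the residual norm over all vectors, `ls_res` can never exceed `best_random_res`; the interesting quantity is how large the gap is.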
Certain datasets are not linearly separable. To solve this problem, we use randomized kernel methods to map data into a higher-dimensional space where PCA is then performed.
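One way to realize this is with random Fourier features, which approximate the RBF kernel exp(-γ‖x - y‖²) by an explicit feature map, followed by ordinary PCA on the mapped data. The sketch below assumes that construction; the notebook may use a different kernel or feature map, and the names `random_fourier_features` and `pca_project` are illustrative.

```python
import numpy as np

def random_fourier_features(X, n_features, gamma, seed=0):
    """Map rows of X so that inner products of the outputs approximate
    the RBF kernel exp(-gamma * ||x - y||^2)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, n_features))
    b = rng.uniform(0, 2 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

def pca_project(Z, n_components):
    """Project rows of Z onto their top principal directions."""
    centered = Z - Z.mean(axis=0)
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ Vt[:n_components].T

# Stand-in data (assumption): 20 random points in the plane.
rng = np.random.default_rng(0)
X = rng.standard_normal((20, 2))
Z = random_fourier_features(X, n_features=2000, gamma=1.0)
projected = pca_project(Z, n_components=2)
```

Using more random features makes `Z @ Z.T` a closer approximation to the exact kernel matrix, at the cost of a larger matrix to decompose.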
If we want to train a nonlinear classifier on a set of labeled data, one option is to use a Support Vector Machine (SVM). Using a randomized kernel function, we experiment with SVM on the MNIST dataset.
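The sketch below illustrates the general recipe on a small synthetic problem rather than MNIST: map the data through random Fourier features, then train a linear SVM on the mapped data. Everything here is an assumption for illustration, including the XOR-style dataset, the hyperparameters, and the plain subgradient-descent trainer; in practice a library solver such as scikit-learn's `LinearSVC` would be the usual choice.

```python
import numpy as np

def rff_map(X, W, b):
    """Random Fourier feature map for fixed random weights W and phases b."""
    return np.sqrt(2.0 / W.shape[1]) * np.cos(X @ W + b)

def train_linear_svm(Z, y, lam=1e-3, lr=0.1, epochs=200):
    """Subgradient descent on the L2-regularized hinge loss; y must be +1/-1."""
    n, D = Z.shape
    w, bias = np.zeros(D), 0.0
    for _ in range(epochs):
        active = y * (Z @ w + bias) < 1          # points violating the margin
        grad_w = lam * w - Z[active].T @ y[active] / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        bias -= lr * grad_b
    return w, bias

# XOR-style data: four clusters, not separable by a line in the original space.
rng = np.random.default_rng(0)
centers = np.array([[2, 2], [-2, -2], [2, -2], [-2, 2]], dtype=float)
cluster_labels = np.array([1.0, 1.0, -1.0, -1.0])
idx = np.repeat(np.arange(4), 50)
X = centers[idx] + 0.3 * rng.standard_normal((200, 2))
y = cluster_labels[idx]

gamma, n_features = 0.5, 300
W = rng.normal(scale=np.sqrt(2 * gamma), size=(2, n_features))
b = rng.uniform(0, 2 * np.pi, size=n_features)
Z = rff_map(X, W, b)

w, bias = train_linear_svm(Z, y)
predictions = np.where(Z @ w + bias >= 0, 1.0, -1.0)
train_accuracy = np.mean(predictions == y)
```

The classifier is linear in the random feature space but nonlinear in the original inputs, which is what lets it separate the diagonal clusters.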
The presentations folder contains all our files for the biweekly group presentations at ICERM. These mostly consist of Jupyter notebooks with extensive descriptions and explanations of the code / phenomena.
Clone the repository and install all required packages.
We recommend using Anaconda/conda to set up a virtual environment so as to not interfere with any other projects in your file system. You can run this command to create the environment and install all required packages:
conda create -n icerm --file package-list.txt
You will also need to download the datasets used for our experiments (e.g., LFW, MNIST). A dataset named dataset1 should be stored in the following directory:
random-projections/datasets/dataset1
You can reproduce our results by running the Jupyter notebooks provided.
- Rishi Advani
- Madison Crim
- Sean O'Hagan
A full list of references is provided in the final report linked above.
Thank you to our organizers, Akil Narayan and Yanlai Chen, along with our TAs, Justin Baker and Liu Yang, for supporting us throughout this program.