ui.R → user interface R script that handles the front end
- uses the shinydashboard package to create the layout of the website
server.R → server R script that handles back-end data management and operations
- saves data to and retrieves data from Google Sheets
- hybrid and Jaccard recommender tools
- creates graphs for data analysis
ContentBasedRec.R → our initial content-based recommender
CollaborativeRec.R → our initial collaborative filtering recommender
- includes user vs. user and item vs. item collaborative filtering
JaccardRec.R → our initial testing with Jaccard similarity
- can be used to find similar programs or students
- may be used to improve collaborative or content-based filtering
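
As an illustration of the idea behind JaccardRec.R, here is a minimal Python sketch of Jaccard similarity over tag sets. It is not the code in JaccardRec.R; the program names, tags, and the helper names `jaccard_similarity` and `most_similar` are hypothetical and chosen only for the example.

```python
def jaccard_similarity(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections, treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here: two empty sets score 0
    return len(a & b) / len(union)


# Hypothetical tag sets for three programs (illustrative data only)
program_tags = {
    "Robotics Club": {"engineering", "coding", "teamwork"},
    "Hackathon Team": {"coding", "teamwork", "design"},
    "Chamber Choir": {"music", "performance"},
}


def most_similar(name, items):
    """Rank every other item by Jaccard similarity to `name`, highest first."""
    target = items[name]
    scores = [
        (other, jaccard_similarity(target, tags))
        for other, tags in items.items()
        if other != name
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


ranking = most_similar("Robotics Club", program_tags)
```

The same function works for students if each student is represented by the set of programs they joined or the tags they selected.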
RecSim.R → R script to run and test our recommendation system
DukeGroups_ScrapeDescriptions.ipynb:
- Gathers the links of all organizations into a list, then visits each link and collects the descriptions into a list.
- Scraping the Full Roster is a possible future step; however, the member information there is not accurate.
DukeGroups_SearchTagwords.ipynb → uses the DukeGroups search bar. Each tagword we came up with is entered manually as a query, and the names of all co-curricular activities shown in the search results are collected. The result is a dictionary whose keys are the co-curricular activity names and whose values are 0 or 1, indicating whether the activity matches a specific tag.
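
The dictionary described above could look roughly like the sketch below. This is an assumption about its shape, not the notebook's actual code: the search results are hard-coded hypothetical data rather than scraped, and the helper `build_tag_indicators` is invented for the example. Here each activity maps to one 0/1 value per tag.

```python
# Hypothetical search results: tagword -> activity names returned by the search
search_results = {
    "music": ["Chamber Choir", "Jazz Ensemble"],
    "coding": ["Robotics Club"],
    "performance": ["Chamber Choir"],
}


def build_tag_indicators(results):
    """Map each activity name to a {tag: 0 or 1} dictionary.

    A value of 1 means the activity appeared in the search results for that tag.
    """
    tags = sorted(results)
    activities = sorted({name for names in results.values() for name in names})
    return {
        activity: {tag: int(activity in results[tag]) for tag in tags}
        for activity in activities
    }


indicators = build_tag_indicators(search_results)
```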
Topic_Modeling.ipynb → includes multiple methods for extracting keywords from text, such as RAKE, TextBlob, and LDA. More testing is needed to ensure accuracy.
Processing_Tagwords.ipynb → imports Tag_Words.csv and reads the tagwords in it.
Cluster_Groups.ipynb → clusters student programs using PCA and K-Means
Cluster_Tagwords.ipynb → clusters tagwords using PCA and K-Means
Generate_Students.ipynb → simulates student profiles using a normal distribution
Collavorative_Rec.ipynb → uses user-based collaborative filtering to give students recommendations.