to request a specific amount of GPU memory on an a100 machine: `#SBATCH --constraint=a100_80gb`
request a node for debugging: `ijob -A uva-dsa -p gpu --gres=gpu:a6000:4 -c 16 -t 70:00:00`
check the long slurm job name: `squeue --format="%.18i %.9P %.30j %.8u %.8T %.10M %.9l %.6D %R" --me`
- bertEmbedding:
- HeteroGraph: GNN architecture that encodes knowledge derived from protocol information
- EMSDataPipeline: wrapper class to tokenize narratives and create data loaders for training/testing (a sketch follows the EMSDataset entry below)
- .build_single_dataloader() returns a dataloader that streams samples for training/testing
- EMSDataset: wrapper class that tokenizes narrative and meta-knowledge text; used in EMSDataPipeline to create a data_loader object.
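I haven't verified the exact signatures, but here is a minimal sketch of how EMSDataset and .build_single_dataloader() plausibly fit together, assuming a HuggingFace tokenizer and multi-label float targets (the bodies are my reconstruction, not the repo's code):

```python
import torch
from torch.utils.data import Dataset, DataLoader
from transformers import AutoTokenizer


class NarrativeDataset(Dataset):
    """Tokenizes EMS narratives and pairs each one with its protocol labels."""

    def __init__(self, narratives, labels, tokenizer, max_len=512):
        self.narratives = narratives
        self.labels = labels
        self.tokenizer = tokenizer
        self.max_len = max_len

    def __len__(self):
        return len(self.narratives)

    def __getitem__(self, idx):
        enc = self.tokenizer(
            self.narratives[idx],
            truncation=True,
            max_length=self.max_len,
            padding="max_length",
            return_tensors="pt",
        )
        return {
            "input_ids": enc["input_ids"].squeeze(0),
            "attention_mask": enc["attention_mask"].squeeze(0),
            # Float targets, assuming a multi-label (one-vs-all) protocol setup.
            "label": torch.tensor(self.labels[idx], dtype=torch.float),
        }


def build_single_dataloader(narratives, labels, batch_size=16, shuffle=True):
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    dataset = NarrativeDataset(narratives, labels, tokenizer)
    return DataLoader(dataset, batch_size=batch_size, shuffle=shuffle)
```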
- HGT (Heterogeneous Graph Transformer): operates on the graph object passed as an argument (see the sketch below)
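A minimal sketch of what the HGT module could look like using PyTorch Geometric's HGTConv, assuming the graph is a torch_geometric HeteroData object whose node features were already projected to a shared hidden size; layer count and dimensions are illustrative guesses:

```python
import torch
from torch_geometric.nn import HGTConv


class HGT(torch.nn.Module):
    """Stack of Heterogeneous Graph Transformer layers over a HeteroData graph."""

    def __init__(self, metadata, hidden_dim=256, heads=4, num_layers=2):
        super().__init__()
        # `metadata` is data.metadata(): the node-type and edge-type lists.
        self.convs = torch.nn.ModuleList(
            [HGTConv(hidden_dim, hidden_dim, metadata, heads=heads)
             for _ in range(num_layers)]
        )

    def forward(self, x_dict, edge_index_dict):
        # x_dict / edge_index_dict come from the HeteroData object,
        # e.g. data.x_dict and data.edge_index_dict.
        for conv in self.convs:
            x_dict = conv(x_dict, edge_index_dict)
        return x_dict  # node-type -> updated embeddings
```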
- EMSMultiModel: this class integrates the label-wise attention with the BERT model. Below I have summarized the initialization fields (to the best of my understanding):
- Backbone: the backbone BERT model used to process the narrative text
- dp: dropout (not used anymore)
- max_len: max sentence length (not used anymore, I think)
- attn: specifies what type of attention to use in knowledge fusion (a label-wise attention sketch follows this list)
- 'qkv-la': label-wise attention qkv
- 'la': label-wise attention
- 'sa': self-attention
- cluster: whether the model outputs grouped or ungrouped protocols
- cls: whether or not the fusion output is passed through a fully connected layer (does this stand for classifier?)
- graph: the GNN architecture (such as HeteroGraph) used for knowledge
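I don't have the exact implementation of the 'la' variant, but label-wise attention in this literature usually follows CAML (Mullenbach et al.): one learned query per label attends over the token embeddings. A sketch under that assumption (the 'qkv-la' variant would presumably add learned key/value projections):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LabelWiseAttention(nn.Module):
    """One learned attention distribution per protocol label (CAML-style)."""

    def __init__(self, hidden_dim, num_labels):
        super().__init__()
        # One learned query vector per label (hypothetical parameterization).
        self.label_queries = nn.Parameter(torch.randn(num_labels, hidden_dim))

    def forward(self, token_embs, attention_mask):
        # token_embs: (batch, seq_len, hidden_dim) from the BERT backbone.
        scores = torch.einsum("ld,bsd->bls", self.label_queries, token_embs)
        # Mask out padding tokens before normalizing.
        scores = scores.masked_fill(attention_mask.unsqueeze(1) == 0, float("-inf"))
        alpha = F.softmax(scores, dim=-1)  # (batch, num_labels, seq_len)
        # One context vector per label, to be fed to the classifier head.
        return torch.einsum("bls,bsd->bld", alpha, token_embs)
```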
- classifier: MLP with 3 hidden layers. I believe this takes as input the embeddings after they have been attended to (sketch below)
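A hypothetical reconstruction of that head; the hidden size and dropout rate are illustrative guesses, not the repo's values:

```python
import torch.nn as nn

def make_classifier(in_dim, hidden_dim, num_labels, dropout=0.1):
    # 3 hidden layers, then one logit per protocol.
    return nn.Sequential(
        nn.Linear(in_dim, hidden_dim), nn.ReLU(), nn.Dropout(dropout),
        nn.Linear(hidden_dim, hidden_dim), nn.ReLU(), nn.Dropout(dropout),
        nn.Linear(hidden_dim, hidden_dim), nn.ReLU(), nn.Dropout(dropout),
        nn.Linear(hidden_dim, num_labels),
    )
```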
- SimCLR
- ResampleLoss
- FocalLoss
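FocalLoss is the standard formulation from Lin et al. 2017 (ResampleLoss likely comes from the distribution-balanced-loss line of work, and SimCLR is the contrastive pretraining objective; I won't reconstruct those here). A minimal multi-label focal loss, assuming sigmoid outputs:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FocalLoss(nn.Module):
    """Multi-label focal loss: down-weights easy, well-classified examples."""

    def __init__(self, gamma=2.0, alpha=0.25):
        super().__init__()
        self.gamma = gamma
        self.alpha = alpha

    def forward(self, logits, targets):
        bce = F.binary_cross_entropy_with_logits(logits, targets, reduction="none")
        p = torch.sigmoid(logits)
        p_t = p * targets + (1 - p) * (1 - targets)  # prob of the true class
        alpha_t = self.alpha * targets + (1 - self.alpha) * (1 - targets)
        return (alpha_t * (1 - p_t) ** self.gamma * bce).mean()
```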
- evaluates model performance and generates a report JSON file
- an interface for evaluating bad cases
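The report format isn't specified in these notes; a plausible sketch using sklearn's classification_report dumped to JSON (function name and path are my own):

```python
import json
from sklearn.metrics import classification_report


def write_report(y_true, y_pred, label_names, path="report.json"):
    # output_dict=True yields per-label precision/recall/F1 plus averages,
    # and works for multilabel indicator arrays as well as single-label ones.
    report = classification_report(
        y_true, y_pred, target_names=label_names, output_dict=True, zero_division=0
    )
    with open(path, "w") as f:
        json.dump(report, f, indent=2)
    return report
```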
- dataAug
- knowledgeModel
- EMSBertModel
- comb_pipeline
- dm_pipeline
- performs counterfactual reasoning
- trainer
- tester
- validator
- pseudolabeler
- Description: assorted graphing/plotting code
- generates a vector representation for each protocol by calculating TF-IDF over its signs & symptoms (a sketch follows this group of items)
- extracts signs & symptoms using MetaMap
- extracts ground-truth (gt) labels from the RAA data
- splits the data into 3 folds for train/val/test
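A minimal sketch of the TF-IDF step; the symptom strings below are invented placeholders (the real terms come from MetaMap over the RAA data):

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical placeholder input: one string of MetaMap-extracted
# signs & symptoms per protocol.
protocol_symptoms = {
    "chest pain": "chest pain diaphoresis nausea shortness of breath",
    "stroke": "facial droop slurred speech unilateral weakness",
}

vectorizer = TfidfVectorizer()
# Each row of the resulting matrix is one protocol's TF-IDF vector.
protocol_vectors = vectorizer.fit_transform(protocol_symptoms.values())
```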
- explains the model's predictions using Integrated Gradients (sketch below)
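I don't know which attribution library the repo uses; a sketch with Captum's LayerIntegratedGradients, assuming a HuggingFace-style sequence classification model (the `explain` helper and its arguments are hypothetical):

```python
import torch
from captum.attr import LayerIntegratedGradients


def explain(model, tokenizer, text, target_label):
    """Attribute one protocol prediction back to the input tokens."""
    enc = tokenizer(text, return_tensors="pt")
    input_ids, mask = enc["input_ids"], enc["attention_mask"]

    def forward(ids, mask):
        return model(input_ids=ids, attention_mask=mask).logits

    # Token ids are discrete, so integrate through the embedding layer,
    # from an all-[PAD] baseline to the real input.
    lig = LayerIntegratedGradients(forward, model.get_input_embeddings())
    baseline = torch.full_like(input_ids, tokenizer.pad_token_id)
    attrs = lig.attribute(input_ids, baselines=baseline,
                          additional_forward_args=(mask,), target=target_label)
    return attrs.sum(dim=-1)  # one attribution score per input token
```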
- uses one-grams (unigrams) as features to train machine learning models
- used as a baseline (sketch below)
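A minimal sketch of such a baseline, assuming single-label protocol targets (a multi-label setup would wrap the classifier in OneVsRestClassifier); the variable names are hypothetical:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Unigram bag-of-words features feeding a simple linear model.
baseline = make_pipeline(
    CountVectorizer(ngram_range=(1, 1)),  # one-grams only
    LogisticRegression(max_iter=1000),
)
# Usage:
# baseline.fit(train_narratives, train_protocols)
# preds = baseline.predict(test_narratives)
```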