-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #52 from FredHutch/crispr_calc_rev_ki
review of first CRISPR score calculation
- Loading branch information
Showing
3 changed files
with
617 additions
and
8 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,91 @@ | ||
--- | ||
title: "scratch gimap CRISPR Score Review" | ||
output: html_document | ||
date: "`r Sys.Date()`" | ||
--- | ||
|
||
```{r warning=FALSE, message=FALSE} | ||
library(tidyverse) | ||
library(devtools) | ||
library(magrittr) | ||
library(ggpubr) | ||
devtools::load_all() | ||
``` | ||
|
||
## Get the gimap CRISPR Score results | ||
|
||
```{r} | ||
gimap_dataset <- get_example_data("gimap") %>% | ||
gimap_filter() %>% | ||
gimap_annotate() %>% | ||
gimap_normalize( | ||
timepoints = "day", | ||
replicates = "rep") %>% | ||
calc_crispr() | ||
``` | ||
|
||
## Load the GI Mapping CRISPR Score results | ||
|
||
```{r} | ||
old_crispr_results <- read_tsv("~/Downloads/results/calculate_LFC/tables/tsv/HeLa_lfc_annot_adj_pgRNA.txt") | ||
``` | ||
|
||
#### sync the reps values | ||
|
||
```{r} | ||
old_crispr_results %<>% mutate( | ||
rep = recode(rep, "A" = "late_RepA", "B" = "late_RepB", "C" = "late_RepC") | ||
) | ||
``` | ||
|
||
## Inspect/Compare | ||
|
||
```{r} | ||
ncol(gimap_dataset$crispr_score) | ||
ncol(old_crispr_results) | ||
print("Column names in new results not in old results") | ||
setdiff(colnames(gimap_dataset$crispr_score), colnames(old_crispr_results)) | ||
print("Column names in old results not in new results") | ||
setdiff(colnames(old_crispr_results), colnames(gimap_dataset$crispr_score)) | ||
nrow(gimap_dataset$crispr_score) | ||
nrow(old_crispr_results) | ||
``` | ||
|
||
From [this line in the code](https://github.com/FredHutch/gimap/blob/273d9cb50d80b0d8c29375582eec9d1946bd7556/R/05-crispr-calc.R#L133), it appears that the `double_crispr_score` column from gimap results is related to the `CRISPR_score` column from GI_Mapping results. | ||
|
||
Why are there so many more rows in gimap results? Some data does get dropped after the join (225993 rows before the drop_na) | ||
|
||
There appears to be 4 entries for each pgRNA ID and rep combo. They share the same original CRISPR score but once you look at others, that seems to be where the differences are. | ||
|
||
```{r} | ||
joindf <- dplyr::full_join(old_crispr_results, gimap_dataset$crispr_score, | ||
by = c("pgRNA_id" = "pg_ids", "rep"), | ||
suffix = c("_old", "_new")) %>% | ||
select(double_crispr_score, CRISPR_score, rep, pgRNA_id) %>% #225993 rows prior to drop_na | ||
drop_na() %>% | ||
distinct() | ||
nrow(joindf) | ||
``` | ||
|
||
```{r} | ||
joindf %>% ggplot(aes(x=double_crispr_score, y = CRISPR_score)) + | ||
geom_point(alpha=0.1) + | ||
xlab("gimap Normalization/LFC: lfc_adj") + | ||
ylab("GI Mapping Normalization/LFC: lfc_adj2") + | ||
theme_classic() + | ||
stat_regline_equation(label.y = 2.35) + stat_cor(label.y = 1.95) + | ||
facet_wrap(~rep) | ||
``` | ||
```{r} | ||
joindf %>% ggplot(aes(x=double_crispr_score, y = CRISPR_score)) + | ||
geom_point(alpha=0.1) + | ||
xlab("gimap Normalization/LFC: lfc_adj") + | ||
ylab("GI Mapping Normalization/LFC: lfc_adj2") + | ||
theme_classic() + | ||
stat_regline_equation(label.y = 2.35) + stat_cor(label.y = 1.95) | ||
``` |
Large diffs are not rendered by default.
Oops, something went wrong.