local ld_clump: Warning: cannot open file 'C:\Users\Zheng\AppData\Local\Temp\RtmpqWMUyA\file65d825722a42.clumped': No such file or directory. Error in file(file, "rt") : cannot open the connection #34

Open
kouji175 opened this issue May 30, 2023 · 22 comments

Comments

@kouji175

I can confirm that this file exists and that R has permission to modify it.

However, the file name does not end in ".clumped", which I think may be the reason for the warning.

My information:

sessionInfo()
R version 4.3.0 (2023-04-21 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 11 x64 (build 22621)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8
[2] LC_CTYPE=English_United States.utf8
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.utf8

time zone: America/New_York
tzcode source: internal

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] gwasvcf_0.1.1 ieugwasr_0.1.5 plinkbinr_0.0.0.9000
[4] data.table_1.14.8 dplyr_1.1.2 TwoSampleMR_0.5.6
[7] readr_2.1.4

loaded via a namespace (and not attached):
[1] tidyselect_1.2.0 blob_1.2.4
[3] R.utils_2.12.2 filelock_1.0.2
[5] Biostrings_2.68.1 bitops_1.0-7
[7] fastmap_1.1.1 RCurl_1.98-1.12
[9] BiocFileCache_2.8.0 VariantAnnotation_1.46.0
[11] GenomicAlignments_1.36.0 XML_3.99-0.14
[13] digest_0.6.31 lifecycle_1.0.3
[15] KEGGREST_1.40.0 RSQLite_2.3.1
[17] magrittr_2.0.3 compiler_4.3.0
[19] genetics.binaRies_0.1.0 rlang_1.1.1
[21] progress_1.2.2 tools_4.3.0
[23] utf8_1.2.3 yaml_2.3.7
[25] rtracklayer_1.60.0 knitr_1.43
[27] prettyunits_1.1.1 S4Arrays_1.0.4
[29] bit_4.0.5 curl_5.0.0
[31] DelayedArray_0.26.3 plyr_1.8.8
[33] xml2_1.3.4 BiocParallel_1.34.2
[35] R.oo_1.25.0 BiocGenerics_0.46.0
[37] grid_4.3.0 stats4_4.3.0
[39] fansi_1.0.4 biomaRt_2.56.0
[41] SummarizedExperiment_1.30.1 cli_3.6.1
[43] rmarkdown_2.21 crayon_1.5.2
[45] generics_0.1.3 rstudioapi_0.14
[47] httr_1.4.6 tzdb_0.4.0
[49] rjson_0.2.21 DBI_1.1.3
[51] cachem_1.0.8 stringr_1.5.0
[53] zlibbioc_1.46.0 parallel_4.3.0
[55] AnnotationDbi_1.62.1 XVector_0.40.0
[57] restfulr_0.0.15 matrixStats_0.63.0
[59] vctrs_0.6.2 Matrix_1.5-4
[61] jsonlite_1.8.4 IRanges_2.34.0
[63] hms_1.1.3 S4Vectors_0.38.1
[65] bit64_4.0.5 GenomicFeatures_1.52.0
[67] glue_1.6.2 codetools_0.2-19
[69] stringi_1.7.12 GenomeInfoDb_1.36.0
[71] GenomicRanges_1.52.0 BiocIO_1.10.0
[73] tibble_3.2.1 pillar_1.9.0
[75] rappdirs_0.3.3 htmltools_0.5.5
[77] GenomeInfoDbData_1.2.10 BSgenome_1.68.0
[79] R6_2.5.1 dbplyr_2.3.2
[81] evaluate_0.21 lattice_0.21-8
[83] Biobase_2.60.0 R.methodsS3_1.8.2
[85] png_0.1-8 Rsamtools_2.16.0
[87] memoise_2.0.1 Rcpp_1.0.10
[89] xfun_0.39 MatrixGenerics_1.12.0
[91] pkgconfig_2.0.3

Here is the code I want to run

exp_dat<-ieugwasr::ld_clump(dplyr::tibble(rsid=exp_dat$SNP, pval=exp_dat$pval.exposure),plink_bin = genetics.binaRies::get_plink_binary(),bfile = "D:/Cornell/bioinformatics/MR/Breast-cancer/1kg_ref",clump_kb = 500,clump_r2 = 5*10^-8)
Clumping 1XlgyO, 14375928 variants, using EUR population reference
Warning: cannot open file 'C:\Users\Zheng\AppData\Local\Temp\RtmpqWMUyA\file65d825722a42.clumped': No such file or directory
Error in file(file, "rt") : cannot open the connection

@Yaolab-fantastic

I have the same problem. The error message is:

Clumping ASw1YU, 38 variants, using EUR population reference
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
  cannot open file 'C:\Users\DELL\AppData\Local\Temp\RtmpABwtLw\file13f34200527b6.clumped': No such file or directory

I found there is the file C:\Users\DELL\AppData\Local\Temp\RtmpABwtLw\file13f34200527b6. But this temp file has no ".clumped" extension.

Below is my code:

  exp_dat_clumped <- ld_clump(
    dat = exp_dat,
    clump_kb = 10000, 
    clump_r2 = 0.001, 
    clump_p = 5e-8,
    plink_bin = genetics.binaRies::get_plink_binary(),
    bfile = 'D:/Data/1kg/ieugwasr/EUR' #path to LD reference dataset
  ) 
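
The observation that the temporary file exists without a ".clumped" sibling can be checked directly. The following is a minimal sketch, not part of ieugwasr: plink_outputs() and show_plink_log() are hypothetical helper names, and the sketch assumes that PLINK writes its outputs, including a ".log" file, next to the prefix given to --out.

```r
# List every file that shares the temporary prefix passed to PLINK via --out.
# `prefix` is the path from the error message without the ".clumped" suffix.
plink_outputs <- function(prefix) {
  files <- list.files(dirname(prefix), full.names = TRUE)
  files[startsWith(basename(files), basename(prefix))]
}

# Print the PLINK log, if one was written, to see why no .clumped file exists.
show_plink_log <- function(prefix) {
  log_file <- paste0(prefix, ".log")
  if (!file.exists(log_file)) {
    message("No log file found: PLINK may not have run at all.")
    return(invisible(character(0)))
  }
  log_lines <- readLines(log_file)
  cat(log_lines, sep = "\n")
  invisible(log_lines)
}
```

If only the bare input file is listed, PLINK probably never started, which points at plink_bin rather than at the data. If a log exists, its last lines usually say why nothing was clumped.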

@Neil-D123

Hi!

I'm having the same problem! Did you find a solution to this?

@kouji175
Author

You can pass the path returned by get_plink_exe() directly as plink_bin, for example:
exp_dat_clump <- ieugwasr::ld_clump(dplyr::tibble(rsid=exp_dat$SNP, pval=exp_dat$pval.exposure,id=exp_dat$id.exposure),plink_bin = "D:/R-4.3.0/library/plinkbinr/bin/plink_Windows.exe",bfile = "D:/Cornell/bioinformatics/MR/Breast-cancer/1kg_ref/EUR", clump_kb=1000,clump_r2=0.1, clump_p=5E-08)
Then check whether PLINK prints
"Warning: No significant --clump results. Skipping", which is normal: it just means that no SNPs could be selected as instruments.

@Neil-D123

Neil-D123 commented Jul 14, 2023

Hi,

Thanks for getting back to me. I tried that and I'm still getting the same error message. But I downloaded the plink.exe binary from the site, put it into the working directory, and then the following code worked:

exp_dat_clump <- ieugwasr::ld_clump(dplyr::tibble(rsid=instruments$rsid, pval=instruments$pval,id=instruments$id.exposure),plink_bin = "C:/HCC Study/plink.exe" ,bfile = "C:/HCC Study/EUR/EUR", clump_kb=1000,clump_r2=0.01, clump_p=5E-08)

Thanks!

@kouji175
Author

Hi, can you show the head of your rsid column?
You must make sure the rsids are consistent with the rsids in your bfile.
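
One way to check that consistency is to compare the rsids against the reference panel's .bim file. This is a minimal sketch under stated assumptions: rsid_overlap() is a hypothetical helper, and it relies on the standard PLINK .bim layout (six whitespace-separated columns with the variant ID in the second). For a full 1000 Genomes panel, base read.table may be slow.

```r
# Report how many rsids appear in the PLINK reference panel.
# `bfile` is the same path prefix that is passed to ld_clump().
rsid_overlap <- function(rsid, bfile) {
  bim <- read.table(paste0(bfile, ".bim"), header = FALSE,
                    stringsAsFactors = FALSE, colClasses = "character")
  in_ref <- rsid %in% bim[[2]]  # column 2 holds the variant IDs
  list(n_total = length(rsid),
       n_in_reference = sum(in_ref),
       missing = rsid[!in_ref])
}
```

For example, rsid_overlap(instruments$rsid, "C:/HCC Study/EUR/EUR") would use the paths quoted above. If n_in_reference is zero, PLINK has nothing to clump.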

@shuang-pi-ji

Hi,I have the same problem. Here is the subset of my data for local clumping :

T1.csv

@shuang-pi-ji

OK, my problem turned out to be different, and I have found the solution for my data. My column names were not consistent with what "dat" expects: the columns should be named rsid, pval and id. After selecting and renaming the columns, it works.

T1 %>%
  dplyr::select(rsid = SNP, pval = pval.exposure, id = id.exposure) %>%
  dplyr::filter(rsid != ".") -> DD

D <- ld_clump_local(dat = DD, clump_p = 1e-05, clump_kb = 10000, clump_r2 = 0.001,
                    bfile = "/input/LD_reference_dataset/EUR", plink_bin = genetics.binaRies::get_plink_binary())

@XjtuZhangKun-lab

Hello, have you solved this problem yet?

@shuang-pi-ji

I think this problem can also be caused by memory or disk-space problems on Windows when you work with a large R object and R crashes. After such a crash, R cannot create temporary files in the Windows Temp directory. I noticed this because, when I ran the same code on input from a large read_exposure_data TwoSampleMR object, R crashed and clumping still failed after a restart. My solution: close R (RStudio), delete the whole temporary folder that R (RStudio) created (it may have a name like RtmpG8APgv in the AppData\Local\Temp directory), and rerun the code. It works.
I think filtering the data before clumping (for example, subsetting to SNPs with a P value < 1E-5) may be a way to prevent this problem.
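
The pre-filtering idea can be sketched as follows. prefilter_for_clumping() is a hypothetical helper, the default threshold is only an illustration, and the function assumes the rsid and pval column names that ld_clump expects.

```r
# Keep only rows that could pass the clumping p-value threshold,
# dropping missing p-values and placeholder rsids (".").
prefilter_for_clumping <- function(dat, clump_p = 1e-5) {
  stopifnot(is.data.frame(dat), all(c("rsid", "pval") %in% names(dat)))
  keep <- !is.na(dat[["pval"]]) & dat[["pval"]] <= clump_p &
    !is.na(dat[["rsid"]]) & dat[["rsid"]] != "."
  dat[keep, , drop = FALSE]
}
```

Filtering first shrinks the table written to the temporary file. It does not help when the remaining SNPs are absent from the reference panel.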

@Steven-Shixq

Sharing my experience in resolving the same error: In my particular scenario (Ubuntu), the program functions correctly after I've removed the X chromosome data from the input variable dat.
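
A minimal sketch of that step, assuming the input carries a chromosome column: the column name "chr" and the plain 1 to 22 coding (no "chr" prefix) are assumptions, and drop_non_autosomal() is a hypothetical helper.

```r
# Keep autosomal variants only (chromosomes 1 to 22), dropping X and others.
drop_non_autosomal <- function(dat, chr_col = "chr") {
  stopifnot(is.data.frame(dat), chr_col %in% names(dat))
  chr <- as.character(dat[[chr_col]])
  dat[chr %in% as.character(1:22), , drop = FALSE]
}
```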

@loftyddd

loftyddd commented Oct 6, 2023

Have you solved it? I have the same problem!

@pjordab

pjordab commented Nov 7, 2023

I have exactly the same issue.
The file exists but without the suffix .clumped

@loftyddd

loftyddd commented Nov 7, 2023

I have exactly the same problem. The file exists, but without the .clumped suffix.

I have another method to solve the clumping error. Filter the SNPs down to those with a sufficiently small p-value before doing online clumping; then the online clumping may run. If the code still fails, I suggest splitting the file into several smaller files and clumping each of them. That may work.

@pjordab

pjordab commented Nov 7, 2023

Thank you for your answer.
I already pre-filtered the SNPs and my input file contains only 28 SNPs to clump... but it is still not working.

@loftyddd

loftyddd commented Nov 7, 2023

Have you tried online clumping after switching to a different VPN?

@pjordab

pjordab commented Nov 7, 2023

Got the solution:

I downloaded plink.exe from here:

https://www.cog-genomics.org/plink/

and replaced
plink_bin = genetics.binaRies::get_plink_binary() --> plink_bin = "plink.exe"

Now it works!
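
A bare name such as "plink.exe" only resolves if the binary sits in the working directory or on the PATH. The sketch below checks that before clumping. resolve_plink() is a hypothetical helper, not part of ieugwasr.

```r
# Return the full path of the PLINK binary, or stop with a clear message.
# Accepts either a file path or a bare command name that is on the PATH.
resolve_plink <- function(plink_bin) {
  found <- if (file.exists(plink_bin)) plink_bin else unname(Sys.which(plink_bin))
  if (!nzchar(found)) {
    stop("PLINK binary not found: ", plink_bin)
  }
  normalizePath(found, winslash = "/")
}
```

The result can be passed as plink_bin. As a further check, PLINK 1.9 should print its version when run as system2(resolve_plink("plink.exe"), "--version").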

@mentors501

It works!!
But don't use the development build. The stable build works fine.

@greengarden0925

greengarden0925 commented Feb 13, 2024

The error happens when no SNPs need to be clumped: in that case no "[tempfile prefix].clumped" file is created. That is why the error says
cannot open file 'C:\Users\asus\AppData\Local\Temp\RtmpaqodFr\file235060db7c1e.clumped':

The fundamental solution is to revise the "ld_clump_local" and "ld_clump" functions. I revised the original functions into "ld_clump_local_YT" and "ld_clump_YT". The code follows.

ld_clump_local_YT <- function(dat, clump_kb, clump_r2, clump_p, bfile, plink_bin) {
  # Example input:
  # dat <- data.frame(rsid = exposure$SNP,
  #                   pval = exposure$pval.exposure,
  #                   id = exposure$id.exposure)

  # Quote paths for the platform's shell
  shell <- ifelse(Sys.info()["sysname"] == "Windows", "cmd", "sh")

  # Write the SNP and P columns that PLINK --clump reads
  fn <- tempfile()
  write.table(data.frame(SNP = dat[["rsid"]], P = dat[["pval"]]),
              file = fn, row.names = FALSE, col.names = TRUE, quote = FALSE)

  fun2 <- paste0(shQuote(plink_bin, type = shell), " --bfile ",
                 shQuote(bfile, type = shell), " --clump ",
                 shQuote(fn, type = shell), " --clump-p1 ", clump_p, " --clump-r2 ",
                 clump_r2, " --clump-kb ", clump_kb, " --out ",
                 shQuote(fn, type = shell))
  system(fun2)

  clumped_file <- paste0(fn, ".clumped")
  if (file.exists(clumped_file)) {
    # PLINK wrote results: keep only the SNPs listed in the .clumped file
    res <- read.table(clumped_file, header = TRUE)
    y <- subset(dat, !dat[["rsid"]] %in% res[["SNP"]])
    if (nrow(y) > 0) {
      message("Removing ", length(y[["rsid"]]), " of ", nrow(dat),
              " variants due to LD with other variants or absence from LD reference panel")
    }
    unlink(paste0(fn, "*"))
    return(subset(dat, dat[["rsid"]] %in% res[["SNP"]]))
  } else {
    # No .clumped file was written: clean up and return the input unchanged
    unlink(paste0(fn, "*"))
    return(dat)
  }
}

ld_clump_YT=function (dat = NULL, clump_kb = 10000, clump_r2 = 0.001, clump_p = 0.99, 
            pop = "EUR", access_token = NULL, bfile = NULL, plink_bin = NULL){
    stopifnot("rsid" %in% names(dat))
    stopifnot(is.data.frame(dat))
    if (is.null(bfile)) {
      message("Please look at vignettes for options on running this locally if you need to run many instances of this command.")
    }
    if (!"pval" %in% names(dat)) {
      if ("p" %in% names(dat)) {
        warning("No 'pval' column found in dat object. Using 'p' column.")
        dat[["pval"]] <- dat[["p"]]
      }
      else {
        warning("No 'pval' column found in dat object. Setting p-values for all SNPs to clump_p parameter.")
        dat[["pval"]] <- clump_p
      }
    }
    if (!"id" %in% names(dat)) {
      dat$id <- random_string(1)
    }
    if (is.null(bfile)) {
      access_token = check_access_token()
    }
    ids <- unique(dat[["id"]])
    res <- list()
    for (i in 1:length(ids)) {
      x <- subset(dat, dat[["id"]] == ids[i])
      if (nrow(x) == 1) {
        message("Only one SNP for ", ids[i])
        res[[i]] <- x
      }
      else {
        message("Clumping ", ids[i], ", ", nrow(x), " variants, using ", 
                pop, " population reference")
        if (is.null(bfile)) {
          res[[i]] <- ld_clump_api(x, clump_kb = clump_kb, 
                                   clump_r2 = clump_r2, clump_p = clump_p, pop = pop, 
                                   access_token = access_token)
        }
        else {
          res[[i]] <- ld_clump_local_YT(x, clump_kb = clump_kb, 
                                     clump_r2 = clump_r2, clump_p = clump_p, bfile = bfile, 
                                     plink_bin = plink_bin)
        }
      }
    }
    res <- dplyr::bind_rows(res)
    return(res)
  }
  

Run the clumping using the following code:

    exposure_clumped=ld_clump_YT(
      dat=data.frame(rsid=exposure$SNP, 
                     pval=exposure$pval.exposure, 
                     id=exposure$id.exposure),
      clump_kb = 10000,
      clump_r2 = 0.001,
      clump_p = 0.99,
      plink_bin = "./plink_win64_20231211/plink.exe",  # path to plink.exe
      bfile = "./Data/1000genomeLDreference/EUR"  # path prefix of the 1000 Genomes reference files
    )

@Leweibo

Leweibo commented Feb 18, 2024

This problem is data dependent. When I switched to another data set, it worked well.

The warning message may be the key to the problem:

"Warning: No significant --clump results. Skipping."

@XUANEND

XUANEND commented Oct 18, 2024

When PLINK runs correctly, it creates a file named like file235060db7c1e.clumped if any significant SNPs pass the filter. If no SNPs pass, the *.clumped file is not created, so the procedure terminates and the error appears.
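
For callers who prefer not to redefine the package functions, the missing-file case can also be caught from the outside. This is a minimal sketch under stated assumptions: clump_or_empty() is a hypothetical wrapper, it matches the English error text quoted in this thread, and it cannot tell "no significant results" apart from "PLINK did not run", so the PLINK output should still be checked.

```r
# Call a clumping function and return zero rows instead of failing
# when the .clumped file could not be opened.
clump_or_empty <- function(dat, clump_fun, ...) {
  tryCatch(
    clump_fun(dat, ...),
    error = function(e) {
      if (grepl("cannot open the connection", conditionMessage(e), fixed = TRUE)) {
        warning("No .clumped file was produced; returning zero rows.", call. = FALSE)
        return(dat[0, , drop = FALSE])
      }
      stop(e)  # re-signal any unrelated error
    }
  )
}
```

A call might look like clump_or_empty(dat, ieugwasr::ld_clump, clump_kb = 10000, clump_r2 = 0.001, plink_bin = "plink.exe", bfile = "EUR"), where the two paths are placeholders. Returning zero rows rather than the unclumped input is a design choice of this sketch: it keeps unclumped SNPs from being mistaken for independent instruments.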
