---
output: github_document
---
# staplr <img src="man/figures/logo.png" align="right" height="150"/>
[![Project Status: Active - The project has reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/latest/active.svg)](http://www.repostatus.org/#active)
[![Licence](https://img.shields.io/badge/licence-GPL--3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0.en.html)
[![Build Status](https://travis-ci.org/pridiltal/staplr.svg?branch=master)](https://travis-ci.org/pridiltal/staplr)
---
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/staplr)](https://cran.r-project.org/web/packages/staplr/index.html)
[![](http://cranlogs.r-pkg.org/badges/staplr)](http://cran.rstudio.com/web/packages/staplr/index.html)
---
[![Last-changedate](https://img.shields.io/badge/last%20change-`r gsub('-', '--', Sys.Date())`-yellowgreen.svg)](/commits/master)
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "README-"
)
```
# staplr
This package provides functions to manipulate PDF files:
- fill out PDF forms: `get_fields()` and `set_fields()`
- merge multiple PDF files into one: `staple_pdf()`
- remove selected pages from a file: `remove_pages()`
- rename multiple files in a directory: `rename_files()`
- rotate an entire PDF document: `rotate_pdf()`
- rotate selected pages of a PDF file: `rotate_pages()`
- select pages from a file: `select_pages()`
- split a single PDF document into individual pages: `split_pdf()`
- split a single PDF document into parts at given points: `split_from()`

This package is still under development, and this repository contains the development version of the R package *staplr*.
## Installation
staplr requires a Java installation on your system. You can download the latest version
of Java [here](https://www.java.com/en/download/). [OpenJDK](https://openjdk.java.net/) also works.
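
A quick way to confirm this prerequisite before installing is to look for a `java` executable on the PATH. The sketch below uses only base R. `java_on_path()` is a hypothetical helper, not part of staplr, and finding an executable does not guarantee that rJava can use that installation.

```r
# Hypothetical helper (not part of staplr):
# TRUE if a `java` executable can be found on the PATH.
java_on_path <- function() {
  # Sys.which() returns "" when the executable is not found
  unname(nzchar(Sys.which("java")))
}

if (java_on_path()) {
  # Print the installed Java version
  system2("java", "-version")
} else {
  message("No `java` executable found on the PATH; install Java first.")
}
```
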
You can install the stable version from CRAN.
```{r cran-installation, eval = FALSE}
install.packages('staplr', dependencies = TRUE)
```
You can install the development version of staplr from GitHub with:
```{r gh-installation, eval = FALSE}
# install.packages("devtools")
devtools::install_github("pridiltal/staplr")
```
## Example
```{r dataset, echo=TRUE, eval=FALSE}
library(staplr)
# Merge multiple PDF files into one
staple_pdf()
# Called without an input path, the function prompts you to select
# the file interactively. Remove pages 2 and 3 from the selected file.
remove_pages(rmpages = c(2,3))
# Select pages 1 and 3 from a file
select_pages(selpages = c(1,3))
# Split a single input PDF document into individual pages
split_pdf()
# Rename the PDF files in a directory and write them back to it.
# This assumes the directory contains 3 PDF files; note that paste()
# separates with a space, giving "file 1", "file 2", "file 3".
rename_files(new_names = paste("file",1:3))
# Fill out PDF forms
get_fields()
set_fields()
# Besides `get_fields` and `set_fields`, the package includes
# files to use as examples.
# This is what the example file looks like
```
<img src="https://user-images.githubusercontent.com/6352379/37745585-bc7bb8e8-2d32-11e8-918c-e52a0a549118.png" height="300" />
```{r echo=TRUE, eval=FALSE}
# Get the path to the example form shipped with the package
pdfFile <- system.file('testForm.pdf', package = 'staplr')
# Read its fields
fields <- get_fields(pdfFile)
# `fields` is a list of the fields that the PDF contains,
# along with some additional information about each field.
# Modify any field by changing its value
fields$TextField1$value <- 'this is text'
# Write a filled-in copy of the form to newFile.pdf
set_fields(pdfFile, 'newFile.pdf', fields)
```
```
<img src="https://user-images.githubusercontent.com/6352379/37745838-65986038-2d34-11e8-9d16-5d6514ef24ab.png" height="300" />
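
When several fields need new values, the same assignment can be done in a loop over a named vector. The following is a minimal sketch, not the package's own method. So that it runs without a PDF, `fields` is a hand-built stand-in for the list returned by `get_fields()`; the `type` and `name` components and the field `TextField2` are assumptions made for illustration.

```r
# Stand-in for the list returned by get_fields(); the exact structure
# is an assumption, and `TextField2` is a hypothetical field.
fields <- list(
  TextField1 = list(type = "Text", name = "TextField1", value = ""),
  TextField2 = list(type = "Text", name = "TextField2", value = "")
)

# New values, keyed by field name
new_values <- c(TextField1 = "this is text", TextField2 = "more text")

# Update the `value` component of each named field
for (nm in names(new_values)) {
  fields[[nm]]$value <- new_values[[nm]]
}

# Inspect the values before passing `fields` on to set_fields()
vapply(fields, function(f) f$value, character(1))
```
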
## Troubleshooting and 2.11.0 changes
- As of version 2.11.0, the package uses [pdftk-java](https://gitlab.com/pdftk-java/pdftk) instead of the original [pdftk](https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/). `pdftk-java`
is included with the package, so if you have a working Java installation, you shouldn't
have any problems.
- The default Java options should be enough for most use cases, but if you need to,
you can change the Java options used to run pdftk with
```r
options('staplr_java_options' = '-Xmx512m')
```
This option is not affected by `rJava` settings.
- If you don't have a working Java installation, installing the package will fail because
rJava cannot be installed. Make sure you follow the proper instructions for installing Java. For OpenJDK on Linux, make sure you get both the JDK and the JRE, and run `javareconf`:
```
sudo apt update -y
sudo apt install -y openjdk-8-jdk openjdk-8-jre
sudo R CMD javareconf
```
Also restart your R session after running `javareconf`.
- `pdftk-java` is built as a faithful representation of the original `pdftk`, so
there shouldn't be any major differences between the outputs. However, if for any reason you'd prefer to run a local installation of pdftk rather than the version shipped with the package, do
```r
# set staplr_custom_pdftk to the path to local installation
# just setting to pdftk will do if it's already in your path
options('staplr_custom_pdftk' = 'pdftk')
```
If you want to do this, you can get the original version of pdftk from [here](https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/). Note that macOS users
on a version higher than "High Sierra" should use [this](https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/pdftk_server-2.02-mac_osx-10.11-setup.pkg) version instead.
Make sure to set the option back to `NULL` if you want to use the built-in pdftk later.
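
Because these settings are ordinary R options, one way to avoid forgetting the reset is to set them only for the duration of a call. The sketch below uses base R alone; `with_temp_options()` is a hypothetical helper, not part of staplr, and the option values are the ones shown above.

```r
# Hypothetical helper (not part of staplr): evaluate `code` with
# temporary options, then restore the previous values.
with_temp_options <- function(new_options, code) {
  old <- options(new_options)        # options() returns the old values
  on.exit(options(old), add = TRUE)  # restore them on exit, even after an error
  force(code)                        # `code` is evaluated lazily, i.e. here
}

# Inside the call the custom pdftk path is set ...
with_temp_options(
  list(staplr_custom_pdftk = "pdftk"),
  getOption("staplr_custom_pdftk")
)

# ... and afterwards the option is back to its previous value
getOption("staplr_custom_pdftk")
```

The same helper works for `staplr_java_options`, since it only relies on `options()` returning the values it replaces.
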
## References
- [https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/](https://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/)