
ariExtra immigration #46

Merged
merged 51 commits into from
Oct 17, 2023
Changes from 25 commits
70adb10
Resolve
Jul 7, 2023
d80cb3b
Add `download_gs_file()` and related functions from ariExtra
Jun 2, 2023
a678cc2
Update documentation
Jun 2, 2023
9533715
Add `pdf_to_pngs()` from ariExtra
Jun 2, 2023
6524a07
Stylistic changes
Jun 2, 2023
7cbd032
Resolve
Jul 7, 2023
8226531
Resolve
Jul 7, 2023
4b45b2f
Resolve
Jul 7, 2023
25f8bd1
Set defaults for 'model_name' and 'vocoder_name'
Jun 12, 2023
1ac3054
Use `cli_alert_warning()`
Jun 16, 2023
0cdff11
Documentation
Jun 16, 2023
0532e89
Comment code
Jul 10, 2023
86ff441
Updates
Jul 10, 2023
cbe2af8
Merge pull request #48 from jhudsl/main
seankross Jul 10, 2023
2131ede
Document model_name and vocoder_name argument in `ari_spin()`
Jul 10, 2023
22b7f13
Add `check_png_urls()`
Jul 10, 2023
98b30bd
`pad_wav()` Documentation
Jul 10, 2023
aa693cf
Use \dontrun{} around `pad_wav()` example
Jul 10, 2023
e8e495f
Documentation for `ari_spin()`
Jul 11, 2023
66f9a64
Add `pptx_to_pdf()`
Aug 15, 2023
6864eec
Add `sys_type()` and `os_type()`
Aug 28, 2023
86fd00f
Don't need `fix_soffice_library_path()`
Aug 29, 2023
60e6f25
get rid of text2speech specific code
Aug 31, 2023
96c5765
Fix `ari_narrate()` so we can get rid of text2speech
Aug 31, 2023
8ff859e
Put all the ffmpeg related arguments into a list called `ffmpeg_args`
Sep 1, 2023
63eaaae
Syntax fix
Sep 1, 2023
48d3cdf
`ari_burn_subtitles()`: Fix destination of output in system command
Sep 1, 2023
0c34a4e
Document `ari_subtitles()` and `ari_burn_subtitles()`
Sep 5, 2023
c820e21
Fix dependency issue
Sep 8, 2023
e259d89
Remove download_gs_file.R and pptx_notes.R
howardbaik Sep 14, 2023
9b595ef
Fix merge conflicts when git pull-ing from ariExtra-immigration branch
howardbaik Sep 14, 2023
1923a9b
Merge branch 'ariExtra-immigration' of https://github.com/jhudsl/ari …
howardbaik Sep 14, 2023
45119d5
Run `document()`
howardbaik Sep 14, 2023
0183e3f
Get rid of unnecessary Imports in DESCRIPTION
howardbaik Oct 9, 2023
be9b52c
Put progress bar back into `ari_spin()`
howardbaik Oct 9, 2023
35c1a60
Get rid of `print()`
howardbaik Oct 10, 2023
e6711ba
progress_bar
howardbaik Oct 10, 2023
74ad489
Create `coqui_args()`
howardbaik Oct 10, 2023
bbb8f02
Replace `tts_engine_args` with `coqui_args()`
howardbaik Oct 10, 2023
674c9cf
Made final changes to code
howardbaik Oct 16, 2023
21be0ae
Passed R CMD CHECK
howardbaik Oct 16, 2023
12f517a
Merge branch 'ariExtra-immigration' into burn-subtitles
howardbaik Oct 16, 2023
b745ff3
Get rid of default argument for `output_video`
howardbaik Oct 16, 2023
cbbeff2
Create `set_ffmpeg_args()`
howardbaik Oct 16, 2023
1b0b1eb
Build ffmpeg command to supply to `system()`
howardbaik Oct 16, 2023
6366f44
Resolve merge conflicts with `ariExtra-immigration` branch
howardbaik Oct 16, 2023
42ba661
Documentation stuff
howardbaik Oct 16, 2023
e14f959
More Documentation stuff
howardbaik Oct 16, 2023
00658f1
Merge pull request #53 from jhudsl/reduce-arguments
howardbaik Oct 16, 2023
b647663
Merge pull request #52 from jhudsl/burn-subtitles
howardbaik Oct 16, 2023
d286fde
Fix documentation
howardbaik Oct 17, 2023
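Several commits above describe consolidating the ffmpeg-related arguments into a single list (`ffmpeg_args`, commit 8ff859e; `set_ffmpeg_args()`, commit cbbeff2) and building the ffmpeg command passed to `system()` (commit 1b0b1eb). The actual implementation is not visible in this view; the following is a minimal illustrative sketch of that pattern, with assumed option names and defaults.

```r
# Hypothetical sketch only: gather ffmpeg options into one list and build a
# command string for system(). The real set_ffmpeg_args() signature and
# defaults are not shown in this PR view.
ffmpeg_args <- list(
  video_codec = "libx264",  # assumed default for illustration
  audio_codec = "aac",      # assumed default for illustration
  frames_per_second = 5
)

build_ffmpeg_command <- function(input_txt, audio, output, args) {
  # -f concat consumes a text file listing the image frames and durations
  paste(
    "ffmpeg -y",
    "-f concat -safe 0 -i", shQuote(input_txt),
    "-i", shQuote(audio),
    "-c:v", args$video_codec,
    "-c:a", args$audio_codec,
    "-r", args$frames_per_second,
    shQuote(output)
  )
}

# system(build_ffmpeg_command("frames.txt", "narration.wav",
#                             "out.mp4", ffmpeg_args))
```

Keeping every ffmpeg flag in one list means a single argument can travel from `ari_spin()` down to the `system()` call, rather than threading many individual parameters through each function.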
4 changes: 4 additions & 0 deletions DESCRIPTION
@@ -20,7 +20,11 @@ Depends:
R (>= 3.1.0)
Imports:
cli,
docxtractr,
hms,
httr,
jsonlite,
pdftools,
progress,
purrr,
rmarkdown,
1 change: 0 additions & 1 deletion NAMESPACE
@@ -26,7 +26,6 @@ export(set_video_codec)
export(video_codec_encode)
importFrom(cli,cli_alert_info)
importFrom(hms,hms)
importFrom(progress,progress_bar)
importFrom(purrr,compose)
importFrom(purrr,discard)
importFrom(purrr,map)
65 changes: 37 additions & 28 deletions R/ari_narrate.R
@@ -1,4 +1,4 @@
#' Create a video from slides and a script
#' Generate video from slides and a script
#'
#' \code{ari_narrate} creates a video from a script written in markdown and HTML
#' slides created with \code{\link[rmarkdown]{rmarkdown}} or a similar package.
@@ -44,46 +44,48 @@
#' @export
#' @examples
#' \dontrun{
#'
#' #
#' ari_narrate(system.file("test", "ari_intro_script.md", package = "ari"),
#' system.file("test", "ari_intro.html", package = "ari"),
#' voice = "Joey"
#' )
#' system.file("test", "ari_intro.html", package = "ari"))
#' }
ari_narrate <- function(script, slides,
output = tempfile(fileext = ".mp4"),
voice = text2speech::tts_default_voice(service = service),
service = "amazon",
tts_engine = text2speech::tts,
tts_engine_args = list(service = "coqui",
voice = NULL,
model_name = "tacotron2-DDC_ph",
vocoder_name = "ljspeech/univnet"),
tts_engine_auth = text2speech::tts_auth,
capture_method = c("vectorized", "iterative"),
subtitles = FALSE, ...,
subtitles = FALSE,
verbose = FALSE,
audio_codec = get_audio_codec(),
video_codec = get_video_codec(),
cleanup = TRUE) {
auth <- text2speech::tts_auth(service = service)
cleanup = TRUE,
...) {
# Authentication for Text-to-Speech Engines
auth <- tts_engine_auth(service = tts_engine_args$service)
# Stop message
if (!auth) {
stop(paste0(
"It appears you're not authenticated with ",
service, ". Make sure you've ",
tts_engine_args$service, ". Make sure you've ",
"set the appropriate environmental variables ",
"before you proceed."
))
}


# Check capture_method
capture_method <- match.arg(capture_method)
if (!(capture_method %in% c("vectorized", "iterative"))) {
stop('capture_method must be either "vectorized" or "iterative"')
}

# Output directory, path to script
output_dir <- normalizePath(dirname(output))
script <- normalizePath(script)
if (file_ext(script) %in% c("Rmd", "rmd") & missing(slides)) {
tfile <- tempfile(fileext = ".html")
slides <- rmarkdown::render(input = script, output_file = tfile)
}

# Slides
if (file.exists(slides)) {
slides <- normalizePath(slides)
if (.Platform$OS.type == "windows") {
@@ -92,52 +94,59 @@ ari_narrate <- function(script, slides,
slides <- paste0("file://localhost", slides)
}
}
# Check if script and output_dir exists
stopifnot(
file.exists(script),
dir.exists(output_dir)
)

# Convert script to html and get text
if (file_ext(script) %in% c("Rmd", "rmd")) {
paragraphs <- parse_html_comments(script)
} else {
html_path <- file.path(output_dir, paste0("ari_script_", grs(), ".html"))
html_path <- file.path(output_dir, paste0("ari_script_", get_random_string(), ".html"))
if (cleanup) {
on.exit(unlink(html_path, force = TRUE), add = TRUE)
}
render(script, output_format = html_document(), output_file = html_path)
rmarkdown::render(script, output_format = rmarkdown::html_document(), output_file = html_path)
paragraphs <- map_chr(
html_text(html_nodes(read_html(html_path), "p")),
rvest::html_text(rvest::html_nodes(xml2::read_html(html_path), "p")),
function(x) {
gsub("\u2019", "'", x)
}
)
}

# Path to images
slide_nums <- seq_along(paragraphs)
img_paths <- file.path(
output_dir,
paste0(
"ari_img_",
slide_nums, "_",
grs(), ".jpeg"
get_random_string(), ".jpeg"
)
)

# Take screenshot
if (capture_method == "vectorized") {
webshot(url = paste0(slides, "#", slide_nums), file = img_paths, ...)
webshot::webshot(url = paste0(slides, "#", slide_nums), file = img_paths, ...)
} else {
for (i in slide_nums) {
webshot(url = paste0(slides, "#", i), file = img_paths[i], ...)
webshot::webshot(url = paste0(slides, "#", i), file = img_paths[i], ...)
}
}

if (cleanup) {
on.exit(walk(img_paths, unlink, force = TRUE), add = TRUE)
}

# Pass along ari_spin()
ari_spin(
images = img_paths, paragraphs = paragraphs,
output = output, voice = voice,
service = service, subtitles = subtitles,
verbose = verbose, cleanup = cleanup
output = output,
tts_engine = tts_engine,
tts_engine_args = tts_engine_args,
tts_engine_auth = tts_engine_auth,
subtitles = subtitles,
verbose = verbose,
cleanup = cleanup
)
}
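The new `ari_narrate()` signature above replaces the old `voice`/`service` pair with a `tts_engine_args` list. A usage sketch based on that signature, assuming the Coqui TTS backend is installed and authenticated via text2speech (the bundled test files are taken from the package's existing examples):

```r
# Sketch of calling the updated ari_narrate() with the tts_engine_args
# interface introduced in this diff. Requires a working Coqui TTS setup;
# the defaults shown mirror the function's own defaults above.
library(ari)

ari_narrate(
  script = system.file("test", "ari_intro_script.md", package = "ari"),
  slides = system.file("test", "ari_intro.html", package = "ari"),
  tts_engine_args = list(
    service      = "coqui",
    voice        = NULL,
    model_name   = "tacotron2-DDC_ph",
    vocoder_name = "ljspeech/univnet"
  ),
  subtitles = TRUE
)
```

Because `tts_engine`, `tts_engine_args`, and `tts_engine_auth` are all arguments, a different text-to-speech backend can in principle be swapped in without changing `ari_narrate()` itself.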
94 changes: 39 additions & 55 deletions R/ari_spin.R
@@ -1,30 +1,23 @@
#' Create a video from images and text
#' Generate video from images and text
#'
#' Given equal length vectors of paths to images (preferably \code{.jpg}s
#' or \code{.png}s) and strings which will be
#' synthesized by
#' \href{https://aws.amazon.com/polly/}{Amazon Polly} or
#' any other synthesizer available in
#' \code{\link[text2speech]{tts}}, this function creates an
#' synthesized by a text-to-speech engine, this function creates an
#' \code{.mp4} video file where each image is shown with
#' its corresponding narration. This function uses \code{\link{ari_stitch}} to
#' create the video.
#'
#' This function needs to connect to
#' \href{https://aws.amazon.com/}{Amazon Web Services} in order to create the
#' narration. You can find a guide for accessing AWS from R
#' \href{http://seankross.com/2017/05/02/Access-Amazon-Web-Services-in-R.html}{here}.
#' For more information about how R connects
#' to Amazon Polly see the \code{aws.polly} documentation
#' \href{https://github.com/cloudyr/aws.polly}{here}.
#'
#' @param images A vector of paths to images.
#' @param paragraphs A vector strings that will be spoken by Amazon Polly.
#' @param output A path to the video file which will be created.
#' @param voice The voice you want to use. See
#' \code{\link[text2speech]{tts_voices}} for more information
#' about what voices are available.
#' @param service speech synthesis service to use,
#' @param model_name (Coqui TTS only) Deep Learning model for Text-to-Speech
#' Conversion
#' @param vocoder_name (Coqui TTS only) Voice coder used for speech coding and
#' transmission
#' @param service Speech synthesis service to use,
#' passed to \code{\link[text2speech]{tts}},
#' Either \code{"amazon"}, \code{"microsoft"}, or \code{"google"}.
#' @param subtitles Should a \code{.srt} file be created with subtitles? The
@@ -33,7 +26,7 @@
#' \code{.srt}.
#' @param duration a vector of numeric durations for each audio
#' track. See \code{\link{pad_wav}}
#' @param ... additional arguments to \code{\link{ari_stitch}}
#' @param ... Additional arguments to voice_engine
#' @param tts_args list of arguments to pass to \code{\link{tts}}
#' @param key_or_json_file access key or JSON file to pass to
#' \code{\link{tts_auth}} for authorization
@@ -43,7 +36,6 @@
#' @importFrom text2speech tts_auth tts tts_default_voice
#' @importFrom tuneR bind Wave
#' @importFrom purrr map reduce
#' @importFrom progress progress_bar
#' @importFrom tools file_path_sans_ext
#' @importFrom cli cli_alert_info
#' @export
@@ -57,115 +49,107 @@
#' "Welcome to my very interesting lecture.",
#' "Here are some fantastic equations I came up with."
#' )
#' ari_spin(slides, sentences, voice = "Joey")
#' ari_spin(slides, sentences)
#' }
#'
ari_spin <- function(images, paragraphs,
output = tempfile(fileext = ".mp4"),
voice = text2speech::tts_default_voice(service = service),
service = ifelse(have_polly(), "amazon", "google"),
tts_engine = text2speech::tts,
tts_engine_args = list(service = "coqui",
voice = NULL,
model_name = "tacotron2-DDC_ph",
vocoder_name = "ljspeech/univnet"),
tts_engine_auth = text2speech::tts_auth,
subtitles = FALSE,
duration = NULL,
tts_args = NULL,
key_or_json_file = NULL,
...) {
verbose = FALSE,
cleanup = TRUE) {
# Check for ffmpeg
ffmpeg_exec()

# Argument checks
auth <- text2speech::tts_auth(
service = service,
auth <- tts_engine_auth(
service = tts_engine_args$service,
key_or_json_file = key_or_json_file
)
if (!auth) {
stop(paste0(
"It appears you're not authenticated with ",
service, ". Make sure you've ",
tts_engine_args$service, ". Make sure you've ",
"set the appropriate environmental variables ",
"before you proceed."
))
}

# Create file path to output
stopifnot(length(images) > 0)
images <- normalizePath(images)
output_dir <- normalizePath(dirname(output))

# Paragraphs
if (length(paragraphs) == 1) {
if (file.exists(paragraphs)) {
paragraphs <- readLines(paragraphs, warn = FALSE)
paragraphs <- paragraphs[!paragraphs %in% ""]
}
}

# Paragraphs: Check for semicolons
semi_colon <- trimws(paragraphs) == ";"
if (any(semi_colon)) {
warning(paste0(
"Some paragraphs are simply a semicolon - ",
"likely needs to be replaced or slide removed!"
))
}
# Check for arguments
stopifnot(
length(paragraphs) > 0,
identical(length(images), length(paragraphs)),
all(file.exists(images)),
dir.exists(output_dir)
)
# End of Argument checks

# Setup objects to populate in for-loop with tts()
wave_objects <- vector(mode = "list", length = length(paragraphs))
par_along <- seq_along(paragraphs)
paragraphs_along <- seq_along(paragraphs)
ideal_duration <- rep(NA, length(paragraphs))

pb <- progress_bar$new(
format = "Fetching Narration [:bar] :percent",
total = length(par_along)
)
# Iterate through arguments used in tts()
for (i in par_along) {
args <- tts_args
args$text <- paragraphs[i]
args$voice <- voice
args$service <- service
for (ii in paragraphs_along) {
args <- tts_engine_args
args$text <- paragraphs[ii]
args$bind_audio <- TRUE
# coqui+ari doesn't work with mp3
if (service == "coqui") {
if (tts_engine_args$service == "coqui") {
args$output_format <- "wav"
args$voice <- NULL
}
wav <- do.call(text2speech::tts, args = args)
wav <- do.call(tts_engine, args = args)
wav <- reduce(wav$wav, bind)
wav <- pad_wav(wav, duration = duration[i])
ideal_duration[i] <- length(wav@left) / wav@samp.rate
wave_objects[[i]] <- wav
pb$tick()
}
if (service == "coqui") {
cli::cli_alert_info("Coqui TTS does not support MP3 format; will produce a WAV audio output.")
wav <- pad_wav(wav, duration = duration[ii])
ideal_duration[ii] <- length(wav@left) / wav@samp.rate
wave_objects[[ii]] <- wav
}

# Burn subtitles
if (subtitles) {
sub_file <- paste0(file_path_sans_ext(output), ".srt")
ari_subtitles(paragraphs, wave_objects, sub_file)
}

print("Audio succesfully converted...............")
# Create a video from images and audio
res <- ari_stitch(images, wave_objects, output, ...)

res <- ari_stitch(images, wave_objects, output)
# Collect output
args <- list(...)
args <- list()
cleanup <- args$cleanup
if (is.null(cleanup)) {
cleanup <- TRUE
}
if (!cleanup) {
attr(res, "wavs") <- wave_objects
}
attr(res, "voice") <- voice
attr(res, "voice") <- tts_engine_args$voice
if (subtitles) {
attr(res, "subtitles") <- sub_file
}
attr(res, "service") <- service
attr(res, "service") <- tts_engine_args$service
return(res)
}
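The updated `ari_spin()` likewise drops `voice` and `service` in favor of `tts_engine_args`. A usage sketch mirroring the `\dontrun{}` example in this diff, assuming Coqui TTS authentication succeeds (the image file names here are illustrative, since the diff truncates the example's `slides` definition):

```r
# Sketch of the updated ari_spin() call. Paths are illustrative; any equal
# length vectors of image paths and narration strings will do.
library(ari)

slides <- c("slide_1.png", "slide_2.png")  # hypothetical image paths
sentences <- c(
  "Welcome to my very interesting lecture.",
  "Here are some fantastic equations I came up with."
)
ari_spin(slides, sentences, subtitles = TRUE)
```

With the Coqui defaults in `tts_engine_args`, the audio is synthesized as WAV (Coqui does not support MP3 output, as the removed `cli_alert_info()` call noted) and stitched to the images via `ari_stitch()`.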
