Introduction
The transcribe package provides an R interface for audio transcription using OpenAI’s Whisper model, with optional post‑processing via Ollama. It offers three ways to run a transcription: a programmatic R interface, a command‑line interface (CLI), and a web API built with Plumber.
This vignette will walk you through the installation and usage of transcribe with a focus on macOS and Linux systems.
Installation
1. Install the Package
Install transcribe from GitHub:
# Uncomment if needed:
# install.packages("remotes")
remotes::install_github("brancengregory/transcribe")
2. Install Python Dependencies (Whisper)
On macOS:
- Homebrew: Ensure you have Homebrew installed. If not, visit brew.sh for instructions. Then install Python (if needed) and use pip to install Whisper.
- yt-dlp: Install via Homebrew.
3. Install and Run Ollama
- Ollama is required for post‑processing.
- On macOS: Check Ollama’s website or use Homebrew if available.
- On Linux: Follow the installation instructions in Ollama’s documentation (if available), or consider alternatives if Ollama is not supported.
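After installing the external tools from steps 2 and 3, it can help to confirm that R can find them. The following is a minimal sketch, not part of transcribe: the helper name `check_cli_tools` is hypothetical, and it assumes the executables are named `yt-dlp` and `ollama` and are on the `PATH`.

```r
# Hypothetical helper (not part of transcribe): report whether each
# command-line tool can be found on the PATH.
check_cli_tools <- function(tools = c("yt-dlp", "ollama")) {
  paths <- Sys.which(tools)  # "" for tools that are not found
  stats::setNames(nzchar(unname(paths)), tools)
}

# TRUE means the tool was found; FALSE means it is missing from the PATH.
check_cli_tools()

# Whisper lives in Python, so it is checked through reticulate instead.
# Uncomment once reticulate points at the right Python environment:
# reticulate::py_module_available("whisper")
```

A `FALSE` entry usually means the tool is either not installed or installed somewhere that is not on the `PATH` seen by R.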
4. Additional Dependencies
The package uses:
- processx to wrap external commands (e.g., yt-dlp),
- reticulate to call Python’s Whisper,
- ellmer for prompt-based post‑processing of transcripts,
- curl for URL encoding/decoding.
Ensure these packages are installed in R:
install.packages(c("processx", "reticulate", "ellmer", "curl", "logger", "glue", "stringr", "fs"))
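To avoid reinstalling packages that are already present, you could first list the ones that are missing. This is a small sketch using base R only; the helper name `find_missing` is hypothetical, and the package list is the one from the install command above.

```r
# Packages named in the install command above.
required <- c("processx", "reticulate", "ellmer", "curl",
              "logger", "glue", "stringr", "fs")

# Hypothetical helper: return the packages that cannot be loaded.
find_missing <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

missing_pkgs <- find_missing(required)
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
  # Uncomment to install only what is missing:
  # install.packages(missing_pkgs)
}
```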
How It Works
- Audio Downloading: When given a remote URL, processx calls yt-dlp to download the audio file in WAV format.
- Transcription via Whisper: Python’s Whisper is accessed via reticulate to transcribe the audio.
- Post‑processing with Ollama and ellmer: The raw transcript from Whisper is optionally cleaned up using a prompt via ellmer, which sends the text to an Ollama server for formatting.
- Interfaces:
  - CLI: Process audio via command-line scripts.
  - Plumber API: A web-based interface for uploading files or entering URLs.
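The three processing steps above can be sketched roughly as follows. This illustrates the general technique and is not the package's actual implementation: all four function names, the output template, and the system prompt are assumptions made for this example. The yt-dlp flags, `processx::run()`, the Whisper Python calls, and `ellmer::chat_ollama()` are existing interfaces, but how transcribe combines them internally may differ.

```r
# Sketch only: the helper names below are hypothetical, not part of transcribe.

# Build the yt-dlp arguments for extracting audio as WAV.
build_ytdlp_args <- function(url, out_dir = tempdir()) {
  c(
    "--extract-audio",        # keep only the audio stream
    "--audio-format", "wav",  # convert the audio to WAV
    "--output", file.path(out_dir, "%(id)s.%(ext)s"),  # assumed naming scheme
    url
  )
}

# Step 1: download the audio with yt-dlp through processx.
download_audio <- function(url, out_dir = tempdir()) {
  processx::run("yt-dlp", build_ytdlp_args(url, out_dir), echo_cmd = TRUE)
  invisible(out_dir)
}

# Step 2: transcribe with Python's Whisper through reticulate.
transcribe_raw <- function(path, model_name = "large-v3-turbo", language = "en") {
  whisper <- reticulate::import("whisper")
  model <- whisper$load_model(model_name)
  result <- model$transcribe(path, language = language)
  result$text
}

# Step 3: optional cleanup through ellmer and a running Ollama server.
clean_transcript <- function(text, ollama_model = "llama3.2") {
  chat <- ellmer::chat_ollama(
    # Assumed prompt, for illustration only.
    system_prompt = "Fix punctuation and paragraph breaks. Do not change the wording.",
    model = ollama_model
  )
  chat$chat(text)
}
```

Keeping the argument construction in its own function makes the download step easy to inspect without touching the network, for example with `build_ytdlp_args("https://example.com/video")`.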
Basic Usage
Transcribing a Local File
library(transcribe)
transcript <- transcribe_audio(
input_path = "path/to/audio.wav",
language = "en",
whisper_model_name = "large-v3-turbo",
processed = TRUE,
ollama_model = "llama3.2"
)
cat(transcript)
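If you want the result on disk rather than printed, a thin wrapper can check the input and write the transcript to a file. This is a sketch under assumptions: `transcribe_to_file` and its `transcriber` argument are hypothetical, and it assumes `transcribe_audio()` returns a character vector, as the `cat(transcript)` call above suggests.

```r
# Hypothetical wrapper (not part of transcribe): transcribe a local file
# and write the result to output_path.
transcribe_to_file <- function(input_path, output_path,
                               transcriber = transcribe::transcribe_audio, ...) {
  # Fail early with a clear message if the audio file is missing.
  if (!file.exists(input_path)) {
    stop("Audio file not found: ", input_path)
  }
  # Extra arguments (language, whisper_model_name, ...) are passed through.
  transcript <- transcriber(input_path = input_path, ...)
  writeLines(transcript, output_path)
  invisible(transcript)
}

# Example call (requires transcribe and its dependencies):
# transcribe_to_file("path/to/audio.wav", "transcribe.txt", language = "en")
```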
CLI Usage
The package provides a command‑line interface. For example, run:
Rscript inst/scripts/main_cli.R -i "path/to/audio.wav" -l en -m large-v3-turbo -p TRUE -M llama3.2 -o "transcribe.txt"
This command processes the audio file and saves the transcript to transcribe.txt.
Plumber API
You can also run a web interface via Plumber. Once the API is running, open your browser at http://127.0.0.1:7608 to access the transcription interface.
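One way to start a Plumber API on that port is sketched below. The helper `run_transcribe_api` is hypothetical, and the location of the API definition file inside the package is an assumption you will need to adjust; only `plumber::pr()` and `plumber::pr_run()` are existing Plumber functions.

```r
# Hypothetical helper (not part of transcribe): serve a Plumber API file
# on the host and port mentioned above.
run_transcribe_api <- function(api_file, host = "127.0.0.1", port = 7608) {
  if (!file.exists(api_file)) {
    stop("Plumber API file not found: ", api_file)
  }
  api <- plumber::pr(api_file)                     # parse the API definition
  plumber::pr_run(api, host = host, port = port)   # blocks until interrupted
}

# Example call; the path is a placeholder, not a confirmed package file:
# run_transcribe_api("path/to/plumber.R")
```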
Technical Breakdown
Troubleshooting
Conclusion
The transcribe package provides a flexible R-based solution for audio transcription and cleanup, using Whisper, yt-dlp, and Ollama. It supports multiple interfaces (CLI and web) and offers a robust workflow for both local and online audio sources.
For further details, please refer to the package documentation and additional vignettes.
vignette("intro", package = "transcribe")
Happy transcribing!