Parametric Voice Synthesis

Tools for sound synthesis and acoustic analysis. Performs parametric synthesis of sounds with harmonic and noise components such as animal vocalizations or human voice. Also includes tools for spectral analysis, pitch tracking, audio segmentation, self-similarity matrices, morphing, etc.


R package for sound synthesis and acoustic analysis.
Homepage with help, demos, etc: http://cogsci.se/soundgen.html
Source code on GitHub: https://github.com/tatters/soundgen

Key functions:

  • Sound synthesis from R console: soundgen()
  • Shiny app for sound synthesis (opens in a browser): soundgen_app()
  • Acoustic analysis of a wav/mp3 file: analyze()
  • Measuring syllables, pauses, and bursts in a wav/mp3 file: segment()
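As a quick orientation, the sketch below chains two of the key functions: it synthesizes a short vowel-like sound and then analyzes it. It assumes that soundgen 1.4.0 or later is installed. The particular values (a 300 ms syllable, a rise-fall pitch contour, three formants) are arbitrary illustrations rather than recommended settings.

```r
library(soundgen)

# Synthesize one 300 ms syllable with a rise-fall pitch contour (Hz)
# and three formants (Hz); play = FALSE returns the waveform silently.
s <- soundgen(
  sylLen = 300,
  pitch = c(200, 300, 200),
  formants = c(500, 1500, 2500),
  samplingRate = 16000,
  play = FALSE
)

# Acoustic analysis of the synthesized waveform. A sampling rate must be
# supplied because the input is a numeric vector rather than an audio file.
a <- analyze(s, samplingRate = 16000, plot = FALSE)
str(a, max.level = 1)
```

The structure of the object returned by analyze() may differ between package versions, so it is worth inspecting it with str() before indexing into it.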

For more information, please see the vignettes on sound synthesis and acoustic analysis:

vignette("sound_generation", package="soundgen")

vignette("acoustic_analysis", package="soundgen")

Or, to open the vignettes in a browser:
RShowDoc('sound_generation', package = 'soundgen')

RShowDoc('acoustic_analysis', package = 'soundgen')

Installation

To install the current release from CRAN: install.packages("soundgen")

NB: Make sure all dependencies have been installed correctly! For problems with seewave, see http://rug.mnhn.fr/seewave/
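One way to check the dependencies is sketched below in base R. The package names are copied from the Imports field shown on this page; the variable names (deps, installed, missing_deps) are only illustrative.

```r
# Packages listed under "Imports" (stats, graphics, and utils ship with R)
deps <- c("tuneR", "seewave", "zoo", "shiny", "reshape2",
          "mvtnorm", "plyr", "dtw", "phonTools")

# TRUE for each dependency that can be loaded from the current library paths
installed <- vapply(deps, requireNamespace, logical(1), quietly = TRUE)

missing_deps <- deps[!installed]
if (length(missing_deps) > 0) {
  message("Missing dependencies: ", paste(missing_deps, collapse = ", "))
} else {
  message("All listed dependencies are installed.")
}
```

Any packages reported as missing can then be installed with install.packages(missing_deps).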

On Macs, you may need to do the following:

  • First install brew according to the instructions here: https://brew.sh/
  • Then run the following from the terminal:
    brew install libsndfile
    brew install fftw
  • Finally, install soundgen in R:
    install.packages("soundgen")

News

      CHANGES IN SOUNDGEN VERSION 1.4.0 [13-March-2019]

MAJOR (affect back-compatibility):

  • soundgen(), fart(), beat(), generateNoise(): removed all deprecated parameters, notably "pitchAnchors", "noiseAnchors", etc. (now simply "pitch", "noise", etc.)
  • soundgen(): a new argument ("noiseAmpRef") controls how the relative amplitudes of voiced and unvoiced components are specified by the "noise" setting, with 3 options: "f0" (noise amplitude relative to first partial) produces the same results as in earlier versions of soundgen; both "source" (relative to unfiltered voiced component) and "filtered" (relative to filtered voiced component - the new default) make noise amplitude less dependent on rolloff. The new default is thus to specify the (peak) amplitude of noise directly in relation to the (peak) amplitude of the voiced component, which is different from older versions - check your code! See vignette on sound synthesis, section 2.12.2, "Noise amplitude".
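A hedged migration sketch for the two changes above, assuming soundgen 1.4.0 or later is installed. The pitch and noise values are arbitrary; the same call is made twice to compare the legacy amplitude reference ("f0") with the new default ("filtered").

```r
library(soundgen)

# Code written for older versions used "pitchAnchors" and "noiseAnchors";
# from 1.4.0 only the short names "pitch" and "noise" are accepted.
noise_contour <- data.frame(time = c(0, 300), value = c(-20, -20))  # ms, dB

# Legacy behavior: noise amplitude relative to the first partial
s_f0 <- soundgen(sylLen = 300, pitch = c(300, 200),
                 noise = noise_contour, noiseAmpRef = "f0", play = FALSE)

# New default: noise amplitude relative to the filtered voiced component
s_filtered <- soundgen(sylLen = 300, pitch = c(300, 200),
                       noise = noise_contour, noiseAmpRef = "filtered",
                       play = FALSE)
```

Listening to both outputs, or comparing their spectrograms, is a simple way to decide whether existing code needs noiseAmpRef = "f0" to preserve its old sound.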

NEW FUNCTIONS:

  • modulationSpectrum(), modulationSpectrumFolder(): calculates and plots a modulation spectrum showing combined spectro-temporal modulation in one sound or all mp3/wav files in a folder, respectively
  • getRMS(), getRMSFolder(): calculates root mean square amplitude per frame in one sound or all mp3/wav files in a folder, respectively
  • normalizeFolder(): normalizes the amplitude of all wav/mp3 files in a folder based on their peak or RMS amplitude or subjective loudness
  • getLoudnessFolder(): a wrapper around getLoudness() for estimating the loudness of all audio files in a folder
  • gaussianSmooth2D(): performs Gaussian blur of spectrograms, modulation spectra, or any other numeric matrices.
  • reportTime(): provides a nicely formatted "estimated time left" in loops plus a summary upon completion
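To make "root mean square amplitude per frame" concrete, here is a minimal base-R sketch. It is not the implementation of getRMS(): the helper frame_rms() and its arguments are hypothetical, and it uses plain rectangular, non-overlapping frames by default.

```r
# Hypothetical helper: RMS amplitude in consecutive frames.
# windowLength and step are in ms, samplingRate in Hz.
frame_rms <- function(x, samplingRate, windowLength = 50, step = windowLength) {
  win <- round(windowLength / 1000 * samplingRate)  # frame length, samples
  hop <- round(step / 1000 * samplingRate)          # frame advance, samples
  starts <- seq(1, length(x) - win + 1, by = hop)
  vapply(starts, function(i) sqrt(mean(x[i:(i + win - 1)]^2)), numeric(1))
}

# One second of a 100 Hz sine with peak amplitude 1
sr <- 16000
x <- sin(2 * pi * 100 * (0:(sr - 1)) / sr)
rms <- frame_rms(x, samplingRate = sr)
```

For a sine wave of peak amplitude 1, each frame's RMS should be close to sqrt(0.5), which gives a quick sanity check.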

MINOR:

  • spectrogram(), getLoudness(): better temporal alignment of the waveform, spectrogram, and loudness contour

  • analyze(), analyzeFolder(): more precise time stamps for STFT frames

  • getLoudness(), analyze(): both functions produce mutually consistent estimates; option to specify the "scale" (maximum possible amplitude) of input waveforms (for audio files, the scale is determined automatically), making loudness estimates dependent on the actual vs. theoretically possible amplitude; all sounds were effectively normalized to max amplitude in earlier versions

  • soundgen_app(): more intelligent behavior of pitch range

  • Updated vignettes and demos

      CHANGES IN SOUNDGEN VERSION 1.3.2 [10-January-2019]
    

MINOR:

  • soundgen(): new argument "rolloffExact" for controlling the strength of each individual harmonic in source spectrum
  • soundgen(): changed behavior of the "glottis" parameter (but still experimental) for glottal pulses with a closed phase
  • soundgen(): interpolation of vectorized amDep, amShape, and amFreq changed from 'spline' to 'approx'
  • analyze(), analyzeFolder(), segment(), segmentFolder(), spectrogram(), spectrogramFolder(): support for mp3 files in addition to wav
  • analyze(): pitch candidates based on lowest dominant frequency band ("dom") respect pitchCeiling
  • Updated demos, vignettes, presets, and documentation of exported functions

BUG FIXES:

  • soundgen(), getRolloff(): rolloffOct argument now really controls the change of rolloff per octave above the fundamental frequency, as intended; note that this affects many presets, which have been updated accordingly

  • soundgen(): proper control of pitch drift with pitchDriftDep

  • soundgen(), soundgen_app(): pitchFloor and pitchCeiling fully control pitch range, without needing to override permittedValues with "invalidArgAction = 'ignore'"

  • analyze(): plotting symbols for pitch candidates in plot legend

      CHANGES IN SOUNDGEN VERSION 1.3.1 [04-October-2018]
    

NEW FUNCTIONS:

  • getLoudness(): estimates subjective loudness per frame, in sone

MINOR:

  • analyze(), analyzeFolder(): also return loudness per frame, in sone (new arguments: SPL_measured and Pref)
  • analyze(), analyzeFolder(): new argument "summaryStats" for specifying how to summarize the output (mean, sd, etc.)
  • analyze(): a legend can be added to plot (thanks to Jerome Sueur for the idea)
  • analyze(), analyzeFolder(), segment(), segmentFolder(), spectrogram(): plots are saved as .png instead of .jpg
  • ssm(): new arguments: "padWith" for dealing with edges and "heights" for controlling relative plot sizes
  • addFormants(): support for user-specified "spectralEnvelope" (a new argument)
  • soundgen(): adjusted bandwidth calculation for very low formant frequencies (under 250 Hz)
  • Updated vignettes and documentation of all exported functions

BUG FIXES:

  • analyzeFolder(): the result is returned with correct class (numeric instead of factor)

  • flatEnv(): fixed a bug that prevented correct removal of DC offset ("killDC")

  • generateNoise(): proper interpolation of user-specified "spectralEnvelope" (renamed from "filterNoise")

  • soundgen_app(): fixed a bug introduced in 1.3.0 that caused the app to crash when switching to a new preset

  • soundgen_app(): proper plotting of amplGlobal and spectrogram view of formantsNoise

      CHANGES IN SOUNDGEN VERSION 1.3.0 [31-August-2018]
    

Back-compatible with soundgen 1.2.X. The main change is abolishing the distinction between anchors (previously pitchAnchors, ampAnchors, etc.) and other vectorized arguments to soundgen (jitterDep, rolloff, etc.) Most vectorized arguments can now be either numeric vectors or dataframes.

NEW FUNCTIONS:

  • osc_dB(): draws an oscillogram (waveform) on a log-transformed, dB scale. Analogous to the "Waveform (dB)" view in Audacity
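The idea of a dB-scaled waveform can be sketched in base R as follows. This is an assumption about the general technique, not the code of osc_dB(): the helper to_dB_waveform() is hypothetical, and it maps each sample to a signed level between -dynamicRange and +dynamicRange.

```r
# Hypothetical helper: sign-preserving log transform of a waveform.
to_dB_waveform <- function(x, dynamicRange = 80) {
  x <- x / max(abs(x))               # normalize so that the peak is 0 dB
  dB <- 20 * log10(abs(x))           # level of each sample, dB re peak
  dB[dB < -dynamicRange] <- -dynamicRange  # floor, also handles log10(0)
  sign(x) * (dB + dynamicRange)      # 0 = silence, +/-dynamicRange = peak
}

y <- to_dB_waveform(c(1, -1, 0.1, 0), dynamicRange = 80)
```

Quiet portions of a sound, which are nearly invisible on a linear oscillogram, become clearly visible on this scale.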

MAJOR:

  • soundgen(), generateNoise(), beat(), fart(): the word "Anchor" is dropped from arguments (pitchAnchors, pitchAnchorsGlobal, glottisAnchors, amplAnchors, amplAnchorsGlobal, mouthAnchors, noiseAnchors). Write simply "pitch = ..., ampl = ..." instead of "pitchAnchors = ..., amplAnchors = ..."
  • soundgen(), generateNoise(), getRolloff(): vectorized arguments can be specified on a natural time scale, in the same format as all former anchors. Ex.: "jitterDep = data.frame(time = c(0, 150, 700), value = c(.2, 2, 0))". Affected arguments: vibratoFreq, vibratoDep, subFreq, subDep, jitterLen, jitterDep, shimmerLen, shimmerDep, rolloff, rolloffOct, rolloffKHz, rolloffParab, rolloffParabHarm, amDep, amFreq, amShape
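The time/value format can be pictured as a set of anchors that are expanded into a contour over the duration of the sound. The base-R sketch below uses simple linear interpolation (stats::approx) on the jitterDep example above; soundgen's own smoothing of anchors may differ, and the 50 ms grid is an arbitrary choice.

```r
# Anchors in the time/value format: time in ms, value in the argument's units
anchors <- data.frame(time = c(0, 150, 700), value = c(0.2, 2, 0))

# Expand the anchors into a contour sampled every 50 ms
frame_times <- seq(0, 700, by = 50)
contour <- approx(x = anchors$time, y = anchors$value, xout = frame_times)$y

plot(frame_times, contour, type = "b",
     xlab = "Time, ms", ylab = "jitterDep")
points(anchors$time, anchors$value, col = "red", pch = 19)
```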

MINOR:

  • soundgen() and related functions: "throwaway" is deprecated and replaced with "dynamicRange"

  • spectrogram(), soundgen_app(): added dynamic range, support for plotting the oscillogram on a dB scale, and adjustable size of spectrogram/oscillogram panels

  • soundgen_app(): fixed a bug introduced in 1.2.1 that crashed the app when changing sylLen. Thanks to Andrew Chang for pointing this out

  • Updated vignettes on sound synthesis and analysis, a new vignette on reproducing an existing sound with soundgen (webpage only, not published with the package because of large audio files)

      CHANGES IN SOUNDGEN VERSION 1.2.1 [04-August-2018]
    

NEW FUNCTIONS:

  • generateNoise(): a simplified version of soundgen() for producing purely voiceless vocalizations such as hissing. Unlike soundgen(), it also accepts a manually specified spectral filter of an arbitrary shape instead of formants for flexible generation of non-biological sounds

MINOR:

  • analyze(): spectral median (medianFreq) and spectral centroid (specCentroid) added as separate outputs; type ?analyze() for details

  • soundgen(): a new argument shimmerLen for controlling the period of shimmer (cf. jitterLen)

  • soundgen(): a new argument noiseFlatSpec for more advanced control of the spectrum of the unvoiced component

  • soundgen(): more intelligent automatic adjustment of pitch ceiling, floor, pitch sampling rate, and sampling rate for extreme pitch values

  • crossFade(): accepts different shapes of fade-in/out

  • flatEnv(): minor debugging, default method is now "hil" (see ?seewave::env)

  • schwa() returns relative formant frequencies in both percentages and semitones

  • Updated URLs in documentation

  • Updated vignettes on sound synthesis and particularly on sound analysis, including a new section on DIY extraction of custom spectral descriptors

      CHANGES IN SOUNDGEN VERSION 1.2.0 [04-March-2018]
    

Back-compatible with soundgen 1.1.x. Old code will mostly produce the same results, except for moving noise formants and amplitude envelopes in polysyllabic sounds.

MAJOR:

  • soundgen(): the argument "formantsNoise" (formants in the unvoiced component) now affects polysyllabic bouts in the same way as "formants" (formants in the voiced component), i.e. moving formants in "formantsNoise" span multiple syllables (see vignette 2.12)
  • soundgen(): vectorized parameters sylLen, pauseLen, vibratoFreq, vibratoDep, jitterDep, jitterLen, and shimmerDep. For ex., "nSyl = 15, sylLen = 300:150" produces 15 progressively shorter syllables from 300 ms to 150 ms in length (see vignette 2.5)
  • soundgen(): a new argument, "nonlinRandomWalk", for specifying the exact timing of each vocal regime (0 = none, 1 = subharmonics, 2 = subharmonics + jitter + shimmer). This is intended for making reproducible examples of sounds with nonlinear effects (see vignette 2.11.5)
  • soundgen(): "amplAnchorsGlobal" adjusts the amplitude from syllable to syllable in discrete steps, instead of applying a smooth amplitude envelope to the entire bout, and it also applies to the unvoiced component (turbulent noise)
  • depends on seewave >=2.1.0 because of changes in the seewave package

NEW FUNCTIONS:

  • spectrogramFolder(): extracts and saves spectrograms of all .wav files in a folder, creates an html file with a table of click-to-play spectrograms
  • getRandomWalk(), getIntegerRandomWalk(): functions for customizing the random walks that divide the sound into segments with different nonlinear regimes; see "nonlinRandomWalk" argument to soundgen()

MINOR:

  • soundgen(): an option to add extra stochastic formants to the unvoiced component, just like for the voiced component. To do so, specify "vocalTract" explicitly (see vignette 2.12.1)

  • soundgen(): the expected range of amplAnchors is now (throwaway, 0), e.g. (-80, 0), and for amplAnchorsGlobal 0 means no change. Values on the old scale will still work (see vignette 2.8)

  • soundgen(): a negative pause (overlap) is allowed between bouts (but not between syllables)

  • soundgen(): attack can be different at the beginning and end of a syllable, e.g. "attackLen = c(50, 100)"

  • getSmoothContour(): proper handling and plotting of anchors that need downsampling, interpolation forced exactly through the specified anchors with interpol = 'approx'

  • segmentFolder(), analyzeFolder(): new argument "htmlPlots" to create an html file with a table of click-to-play plots; new default strategy for saving plots (see "savePlots" argument)

  • analyze(), analyzeFolder(), segment(), segmentFolder(): new argument "res" for proper control of the size and resolution of the output plots

      CHANGES IN SOUNDGEN VERSION 1.1.2 [23-January-2018]
    

NEW FUNCTIONS:

  • schwa(): estimates expected formant frequencies for a given vocal tract length, deviation of measured formants from the schwa, or formant frequencies in Hz corresponding to given deviations from the schwa
  • getEntropy(): calculates Shannon or Wiener entropy of a vector such as power spectrum
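The quantities behind these two functions can be sketched in base R. The helpers below are hypothetical and are not the package's code. schwa_formants() uses the standard model of a uniform tube closed at one end, with an assumed speed of sound of 35400 cm/s. wiener_entropy() is the ratio of the geometric to the arithmetic mean, and shannon_entropy() is computed in bits on a spectrum normalized to sum to 1.

```r
# Expected formants (Hz) of a neutral vocal tract of a given length (cm):
# F_n = (2n - 1) * c / (4 * L)
schwa_formants <- function(vocalTract, nForm = 3, speedSound = 35400) {
  (2 * seq_len(nForm) - 1) * speedSound / (4 * vocalTract)
}

# Wiener entropy (spectral flatness): 1 for a flat spectrum, near 0 for a peak
wiener_entropy <- function(p) exp(mean(log(p))) / mean(p)

# Shannon entropy in bits of a spectrum treated as a probability distribution
shannon_entropy <- function(p) {
  p <- p / sum(p)
  p <- p[p > 0]  # empty bins contribute nothing
  -sum(p * log2(p))
}

f <- schwa_formants(vocalTract = 17.7)
flat <- rep(1, 8)               # flat spectrum with 8 bins
peaked <- c(1, rep(1e-6, 7))    # nearly all energy in one bin
```

With a 17.7 cm vocal tract the formula gives formants near 500, 1500, and 2500 Hz. Both entropy measures are high for the flat spectrum and low for the peaked one.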

MINOR:

  • soundgen(): very rapid or instantaneous attack is now achievable without adjusting windowLength

  • soundgen(): a new top-level scaling factor formantWidth for simple control of formant bandwidth

  • soundgen(): noiseAnchors are scaled appropriately as syllable length varies due to jitter, while post-syllabic noiseAnchors are not scaled with syllable duration (i.e. the length of post-syllabic aspiration is held constant as syllable length varies)

  • soundgen(): interpol affects the smoothing of mouth anchors

  • analyze(), segment(): streamlined and extended plotting options, fixed a bug in saving plots

  • segment(): fixed a bug that caused pauseLen_sd to always return NA

  • spectrogram(): proper handling of silent or very short input

  • spectrogram(): fixed a bug in denoising routines

  • morph(): fixed a bug with noiseAnchors that are NULL in one sound and another bug caused by changes in duplicated() that were introduced in the new version of r-devel. Thanks to Kurt Hornik for pointing this out

      CHANGES IN SOUNDGEN VERSION 1.1.1 [02-December-2017]
    

NEW FUNCTIONS:

  • addFormants() for adding or removing formants from an existing sound
  • fade() for adding linear/logarithmic/etc fade-in/out
  • One-click formant picker tool added to soundgen_app()
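For readers unfamiliar with fades, a linear fade-in/out can be sketched in base R as below. This illustrates the technique only and is not the implementation of fade(): the helper linear_fade() is hypothetical, and its fade lengths are given in samples.

```r
# Hypothetical helper: linear fade-in and fade-out, lengths in samples
linear_fade <- function(x, fadeIn = 100, fadeOut = 100) {
  n <- length(x)
  stopifnot(fadeIn >= 0, fadeOut >= 0, fadeIn + fadeOut <= n)
  gain <- rep(1, n)
  if (fadeIn > 0) {
    gain[seq_len(fadeIn)] <- seq(0, 1, length.out = fadeIn)
  }
  if (fadeOut > 0) {
    gain[(n - fadeOut + 1):n] <- seq(1, 0, length.out = fadeOut)
  }
  x * gain
}

x <- rep(1, 1000)                        # constant signal, easy to inspect
y <- linear_fade(x, fadeIn = 100, fadeOut = 200)
```

Other fade shapes, such as logarithmic, amount to replacing the two seq() ramps with a different monotonic curve between 0 and 1.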

MINOR:

  • soundgen(): support for discontinuous contours such as pitch jumps, new arguments to soundgen for controlling interpolation between anchors (new arguments: "interpol", "discontThres", "jumpThres")

  • soundgen(): all tempEffects default to 1 and act like scaling factors, a new tempEffect added ("specDep")

  • soundgen(): invalidArgAction propagates through temperature effects, ensuring that "weird" parameter values outside the ranges in permittedValues are processed properly

  • soundgen(): vectorized rolloffNoise

  • soundgen(): slightly modified effects of creakyBreathy hyperparameter

  • soundgen_app(): fixed a bug introduced in soundgen 1.1.0, which caused the app to crash with simplified formant specification

  • Extended and updated vignette on sound generation

  • Miscellaneous small-scale debugging

      CHANGES IN SOUNDGEN VERSION 1.1.0 [19-October-2017]
    

MAJOR

  • Formant generation is now compatible with standard phonetic models (all-pole or zero-pole models instead of densities of gamma distribution).
  • Unvoiced component is generated with a flat spectrum up to a certain threshold, followed by linear rolloff.
  • An alternative method of synthesizing glottal pulses one by one ("soundgen(glottisAnchors = ...)"). Good for vocal fry and, potentially, for integrating alternative models of glottal pulses (future work).
  • Back-compatibility in program syntax is preserved, but presets from version 1.0.0 will sound different. The vignette on sound generation and the preset library are updated accordingly.

MINOR

  • Support for shorthand specification of formants with default amplitudes and bandwidths, e.g. "formants = c(500, 1500, 2500)" or "formants = list(f1 = 800, f2 = c(1500, 2200))".
  • Support for shorthand specification of all anchors in presets and other input to soundgen_app, e.g. "mouthAnchors = c(0, 0, .3, .5)". The expanded format from soundgen 1.0.0 is also supported.
  • Vectorized rolloff and amplitude modulation parameters, allowing them to change dynamically within a vocalization (command-line only; currently not implemented in the Shiny app).
  • Additional components in tempEffects offering more control over stochastic behavior of soundgen() (command-line only).
  • Separate radiation functions for the lips and the nose (when the mouth is closed). The corresponding soundgen() arguments are "lipRad" (replacing "rolloffLip") and "noseRad".
  • Minor debugging and extended capabilities in soundgen_app(), e.g. preview of formant filter, better integration of spectrogram and spectrum output plots, a simple way to remove the voiced component, etc.
  • Minor debugging elsewhere, notably dB conversion, timing of unvoiced segments in polysyllabic vocalizations in soundgen(), format of the output in analyzeFolder() and segmentFolder(), etc.

NEW FUNCTIONS:

  • flatEnv(): normalizes amplitude envelope dynamically, i.e., keeping loudness constant throughout the sound

  • estimateVTL(): estimates the length of vocal tract based on formant frequencies

  • fart(): a simplified version of soundgen() for simple and rapid generation of a particular type of sounds, like raspberries, ripping noises, etc

  • beat(): generation of percussive noises like drum-like beats, clicks, etc

      RELEASE OF SOUNDGEN VERSION 1.0.0 [04-September-2017]
    

1.4.0 by Andrey Anikin, 2 months ago



Browse source code at https://github.com/cran/soundgen


Authors: Andrey Anikin [aut, cre]



GPL (>= 2) license


Imports stats, graphics, utils, tuneR, seewave, zoo, shiny, reshape2, mvtnorm, plyr, dtw, phonTools

Suggests knitr, rmarkdown, shinyBS


Imported by warbleR.

