Phase vocoder.

*in RTcmix/insts/std*

**PVOC**(outsk, insk, dur, AMP, inputchan, fftsize, windowsize,
DECIMATION, interpolation[, PITCHMULT, npoles, OSCTHRESHOLD])

**set_filter**(“filter_name” OR “filter_DSO_path”)

Param Field | Parameter | Units | Dynamic | Optional | Notes |
---|---|---|---|---|---|
p0 | output start time | seconds | no | no | |
p1 | input start time | seconds | no | no | |
p2 | duration | seconds | no | no | |
p3 | amplitude multiplier | relative multiplier of analyzed input signal | yes | no | |
p4 | input channel | - | no | no | |
p5 | fft size | samples, power of 2 | no | no | |
p6 | window size | samples, normally 2 * fft size | no | no | |
p7 | decimation amount | samples, amount to read in | yes | no | should be < p5 |
p8 | interpolation amount | samples, amount to write out | no | no | |
p9 | pitch multiplier | - | yes | yes | default: 0.0 (no pitch change) |
p10 | npoles (used for LPC data only) | - | no | yes | leave at 0.0 |
p11 | gain threshold for resynthesis | - | yes | yes | default: 0.0 |

Parameters labeled as Dynamic can receive dynamic updates from a table or real-time control source.
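As an illustration of a dynamic parameter, the following score is a minimal sketch that passes tables to p3 and p9. It assumes the standard `maketable` "line" table type; the file name is a placeholder and the names `env` and `gliss` are made up for the example. The table shapes are arbitrary.

```
rtsetparams(44100, 1)
load("PVOC")
rtinput("mysound.aif")   // placeholder input file

// p3: amplitude envelope, fading in over the first tenth of the note
// and out over the last tenth
env = maketable("line", 1000, 0,0, 1,1, 9,1, 10,0)

// p9: pitch multiplier gliding from 1.0 (unison) up to 2.0 (one octave);
// "nonorm" keeps the table values from being rescaled
gliss = maketable("line", "nonorm", 1000, 0,1.0, 1,2.0)

PVOC(0, 0, DUR(0), env, 0, 1024, 2048, 256, 256, gliss)
```

Because p9 is greater than 0 throughout, this score uses the oscillator-bank resynthesis rather than the inverse FFT.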

Author: Doug Scott (based on earlier work by Christopher Penrose and others).

**set_filter**

Param Field | Parameter | Units | Dynamic | Optional | Notes |
---|---|---|---|---|---|
p0 | filter | string | no | no | either the name of a PVOC filter or the full path |

NOTE: this subcommand is only available in standalone RTcmix configurations.

*Phase vocoding* is an analysis/resynthesis technique in which
a sound is analyzed through a filter bank, with the phase deviation
computed for each channel of the vocoder (a sort of expanded Fourier
transform). The analysis data allows for realistic time-independent
transposition and, as a corollary, pitch-independent time-stretching
of a soundfile, with far fewer resynthesis artifacts than would
otherwise be possible (from Dodge and Jerse, 1997).

The RTcmix **PVOC** instrument uses a standard FFT analysis with
additional phase computation, allowing the user to specify FFT
parameters and time- and pitch-shifting in terms of multiples of the
original sound.

The “fftsize” (p5) determines the resolution in time and frequency of the analysis. It must be a power of 2. The larger the size (2048, 4096, 8192, etc.), the greater the frequency resolution, but time resolution suffers: larger FFT windows ‘smear’ the signal in time. A smaller “fftsize” (64, 128, 256) resolves events in time more precisely, but gives a coarser representation of the frequency spectrum. Such is life.
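To put rough numbers on this trade-off, the arithmetic can be sketched in a scorefile. This is only an illustration: it assumes a 44100 Hz sampling rate and uses the textbook relations for an N-point FFT (bin spacing of sr/N, frame length of N/sr seconds), not anything read from the PVOC source. The variable names are made up for the example.

```
sr = 44100        // assumed sampling rate, as in the sample scores
fftsize = 1024    // p5; try 256 or 4096 to compare

binwidth = sr / fftsize           // spacing between analysis bins, in Hz
frame_ms = 1000 * fftsize / sr    // length of one FFT frame, in milliseconds

printf("fftsize %d: bins %f Hz apart, frame %f ms long\n", fftsize, binwidth, frame_ms)
```

Under these assumptions, a size of 1024 gives bins about 43 Hz apart over a frame of about 23 ms, while 4096 gives bins about 10.8 Hz apart over a frame of about 93 ms.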

The “windowsize” parameter (p6) sets how much overlap will occur between analysis windows (chunks) during output synthesis. The amount of overlap is determined by p6 relative to p5. Larger values can create a smoother sound, but the result may start to sound a bit reverberant. A value of twice the “fftsize” is usually reasonable. It should also be a power of 2.

The ratio between p7 (“decimation”) and p8 (“interpolation”) determines how much time-dilation or time-compression will occur. The time-scaling factor is p8/p7. These values don’t have to be powers of 2, and smaller numbers may yield smoother results. Smaller values are more computationally expensive, however. An example of how the time-shifting works: if p7 is 50 and p8 is 100, then the resynthesized sound will be twice as long as the original sound. If p7 is 300 and p8 is 100, the resulting sound will be one third (100/300) as long as the original, i.e. three times faster.
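The arithmetic in that example can be checked with a few lines of scorefile code. This is a sketch under an assumption: the 4-second source length and the variable names are hypothetical, chosen only to make the numbers concrete.

```
orig_len = 4.0    // hypothetical length of the source sound, in seconds

decim = 50        // p7: amount to read in, in samples
interp = 100      // p8: amount to write out, in samples
factor = interp / decim
printf("p7 = %d, p8 = %d: factor %f, about %f seconds\n", decim, interp, factor, orig_len * factor)

decim = 300
interp = 100
factor = interp / decim
printf("p7 = %d, p8 = %d: factor %f, about %f seconds\n", decim, interp, factor, orig_len * factor)
```

The first pair gives a factor of 2 (about 8 seconds for the hypothetical 4-second source), and the second a factor of one third (about 1.33 seconds).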

If the optional “pitchmult” (p9) is 0, then **PVOC** will do an
inverse-FFT resynthesis (fairly efficient). If it is > 0, however, it
will cause an oscillator-bank resynthesis, with individual oscillators
tracking the frequency and amplitude from the FFT analysis. p9 is a
direct multiplier of all frequency values, so that a value of 2.0 will
shift the entire spectrum up one octave, a value of 0.25 will shift it
down two octaves. The following score fragment can be used to calculate
an oct.pc transposition:

```
transposition = 0.05 // shift up 5 semitones
pitch_multiplier = cpspch(transposition) / cpspch(0.0)
```
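When the interval is easier to think of as a signed number of semitones, an equal-tempered ratio can be computed directly. This is an alternative sketch, not part of the original fragment; it assumes the MinC `pow` function, and the variable name `semitones` is made up for the example.

```
semitones = 5                                // negative values transpose down
pitch_multiplier = pow(2, semitones / 12)    // equal-tempered frequency ratio
printf("pitch multiplier: %f\n", pitch_multiplier)
```

For 5 semitones this comes to about 1.335, which should agree with the oct.pc calculation above.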

If p9 is > 0, it also allows for the use of LPC-generated amplitude coefficients for the spectral envelope resynthesis. The optional “npoles” parameter (p10) sets how many poles the LPC filter will have. Smaller values create a very approximate spectral resynthesis, and larger values can generate a filter that is too “lumpy”. Values between 20 and 40 are usually good starting points. A value of 0 turns off this feature.

Also, if p9 is > 0, the optional “oscthreshold” (p11) parameter is engaged. During oscillator-bank resynthesis, only parts of the frequency spectrum with amplitudes greater than this value will be resynthesized. Values > 1.0 will generally start having an effect on the output sound. This feature can be useful for eliminating noise from a signal, although it will cause audible artifacts in the resulting sound.

**PVOC** can read mono or stereo input files; it only writes mono
output.

**PVOC** has the ability to set filter plugins which operate on the frequency
and amplitude bins before they are used to resynthesize the audio. The plugins are
loaded by name or by DSO path using the **set_filter** command. This feature
is only available in the standalone version, and the details are, for now, left
to those who are willing to examine the source code.

very basic:

```
rtsetparams(44100, 1, 512)
load("PVOC")
rtinput("mysound.aif")
PVOC(start=0, inputskip=0, inputread=DUR(0), amp=0.9, inputchan=0, fft=1024, window=2*fft,
readin=1024, writeout=2*readin)
```

slightly more advanced:

```
rtsetparams(44100, 1)
load("PVOC")
rtinput("mysound.aif")
// Resynthesize with oscillator bank, at 0.5 the orig pitch,
// and only with oscillators > 1.1 in amplitude
PVOC(0, 0, DUR(0), 1, 0, 1024, 2048, 100, 100, 0.5, 0, 1.1)
```

fun stuff!

```
rtsetparams(44100, 1)
load("PVOC")
rtinput("mysound.aif")
start = 0
inskip = 0
duration = DUR(0)
gain = 1
fftsize = 2048
winsize = fftsize * 2
pitch = 1
decim = 512
interp = 512   // equal to decim, so no time-scaling
// play the file four times back to back, multiplying the
// pitch multiplier by 0.8 before each repeat
PVOC(start, inskip, duration, gain, 0, fftsize, winsize, decim, interp, pitch)
start = start + duration
pitch = pitch * 0.8
PVOC(start, inskip, duration, gain, 0, fftsize, winsize, decim, interp, pitch)
start = start + duration
pitch = pitch * 0.8
PVOC(start, inskip, duration, gain, 0, fftsize, winsize, decim, interp, pitch)
start = start + duration
pitch = pitch * 0.8
PVOC(start, inskip, duration, gain, 0, fftsize, winsize, decim, interp, pitch)
```

See also: CONVOLVE1, LPCPLAY, SPECTACLE, SPECTACLE2, SPECTEQ, SPECTEQ2, TVSPECTACLE, VOCODE2, VOCODE3, VOCODESYNTH