Linux Audio Control Topology

This is a fairly raw extract from our intranet, so it has more detail on the AudioScience APIs (HPI, ASX) than on the others. However, we are looking at how and what to describe in a future API. I'm posting it here in the context of the current discussion about ALSA control topology.

ALSA

Controls are identified by name and, possibly, an index (though most drivers only use index=0).

Control values are an array of the same datatype, or an enumerated set of strings.
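
As a concrete illustration, here is a minimal alsa-lib sketch that enumerates controls by (name, index); the card name "hw:0" and the terse error handling are just placeholders:

    #include <stdio.h>
    #include <alsa/asoundlib.h>

    int main(void)
    {
        snd_hctl_t *hctl;
        snd_hctl_elem_t *elem;
        snd_ctl_elem_id_t *id;

        snd_ctl_elem_id_alloca(&id);
        /* "hw:0" is just an example card name */
        if (snd_hctl_open(&hctl, "hw:0", 0) < 0 || snd_hctl_load(hctl) < 0)
            return 1;
        for (elem = snd_hctl_first_elem(hctl); elem;
             elem = snd_hctl_elem_next(elem)) {
            snd_hctl_elem_get_id(elem, id);
            /* Each control is fully identified by (name, index) */
            printf("name='%s' index=%u\n",
                   snd_ctl_elem_id_get_name(id),
                   snd_ctl_elem_id_get_index(id));
        }
        snd_hctl_close(hctl);
        return 0;
    }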

There is no topology information in the standard API, with some exceptions: the HDA driver reveals NID data via a special file, and ASoC drivers contain topology information to assist with DAPM (Dynamic Audio Power Management), but AFAIK it is not exported to userspace.

Adding more topology info has been requested and is being discussed: http://thread.gmane.org/gmane.linux.alsa.devel/62416 http://thread.gmane.org/gmane.linux.alsa.devel/52498

JACK

Applications provide N output ports and/or M input ports. Arbitrary connections can be made from any output to any input; multiple connections to a single input are summed.

Doesn't address within-application controls. Each application provides its own UI.

ALSA can be considered a privileged client application that provides in and out ports corresponding to soundcard channels, and provides the master timebase for all other apps.
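
A minimal JACK client sketch of the port model (client and port names, and the "system:playback_1" target, are illustrative only):

    #include <jack/jack.h>

    int main(void)
    {
        jack_client_t *client = jack_client_open("demo", JackNullOption, NULL);
        if (!client)
            return 1;
        /* Register one output port; JACK owns the connection graph */
        jack_port_t *out = jack_port_register(client, "out",
                                              JACK_DEFAULT_AUDIO_TYPE,
                                              JackPortIsOutput, 0);
        if (!out)
            return 1;
        jack_activate(client);
        /* Connect our output to a soundcard channel exposed by the
           (privileged) ALSA client; if several outputs connect to the
           same input, JACK sums them */
        jack_connect(client, "demo:out", "system:playback_1");
        return 0;
    }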

Cobranet

SNMP

Bundle addressing; connectivity is determined by receivers (as long as a potential transmitter exists, for multicast or broadcast).

Within bundles, channels determine audio content.

IEC62379

SNMP

Connectivity separate from function.

Blocks with ports

Mathematics

Network theory

Physical analogy

Patch cables connect things.

Knobs and buttons control things.

The web

RDF http://www.w3.org/TR/REC-rdf-syntax/

HTML pages link to other pages. Pages can contain data and controls.

LV2

LV2 (http://lv2plug.in/) is a standard for plugins and matching host applications, mainly targeted at audio processing and generation.

I.e. it addresses objects that process audio and have controls.

All control and audio data connect to ports of the plugin.
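
A tiny sketch of that port model, in the shape of the classic gain plugin (the port indices are invented here; in a real plugin they are declared in its Turtle data):

    #include <lv2/lv2plug.in/ns/lv2core/lv2.h>

    typedef struct {
        const float *gain;  /* control input, port 0 */
        const float *in;    /* audio input,   port 1 */
        float       *out;   /* audio output,  port 2 */
    } Amp;

    /* The host hands every buffer, control or audio alike, to the
       same connect_port hook */
    static void connect_port(LV2_Handle h, uint32_t port, void *data)
    {
        Amp *amp = (Amp *)h;
        switch (port) {
        case 0: amp->gain = (const float *)data; break;
        case 1: amp->in   = (const float *)data; break;
        case 2: amp->out  = (float *)data;       break;
        }
    }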

OSC

Controls are identified by addresses that look like paths, e.g. "/channel/1/fader".

An OSC receiver has an IP address and a port number.
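
A sketch of address-based dispatch using liblo (the port number 7770 and the control path are arbitrary examples):

    #include <stdio.h>
    #include <lo/lo.h>

    /* Called for messages addressed to /channel/1/fader */
    static int fader_handler(const char *path, const char *types,
                             lo_arg **argv, int argc,
                             lo_message msg, void *user_data)
    {
        printf("%s = %f\n", path, argv[0]->f);
        return 0;
    }

    int main(void)
    {
        lo_server_thread st = lo_server_thread_new("7770", NULL);
        lo_server_thread_add_method(st, "/channel/1/fader", "f",
                                    fader_handler, NULL);
        lo_server_thread_start(st);

        /* Send ourselves a fader move: the path addresses the control */
        lo_address dest = lo_address_new("localhost", "7770");
        lo_send(dest, "/channel/1/fader", "f", 0.5f);

        getchar();  /* let the server thread receive the message */
        return 0;
    }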

Intel HDA codec

The codec contains 'widgets'.

Each widget has a numeric NID (Node ID). NID#0 refers to the overall codec.

Widgets have zero or one outputs, and zero to N inputs.

Connectivity information: the list of NIDs that connect to a widget's inputs can be read from that widget.

There is a 'function group' widget that acts as a container for others. Containers hold a set of sequentially numbered widgets; the start index and count can be queried. NID#1 is the audio function group.

The other widget types are audio widgets: input, output, mixer, selector, pin complex, power, volume knob.

There is a set of verbs that act on the widgets (i.e. commands or queries).
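
A sketch of how a codec command is composed, and how a widget's connection list could be queried with it; the verb and parameter IDs follow the HDA specification, while exec() merely stands in for whatever bus access the driver provides:

    #include <stdint.h>

    /* Verb/parameter IDs from the HDA specification */
    #define AC_VERB_PARAMETERS       0xF00
    #define AC_VERB_GET_CONNECT_LIST 0xF02
    #define AC_PAR_CONNLIST_LEN      0x0E

    /* A codec command packs the codec address, the NID, a 12-bit verb
       id and an 8-bit payload into one 32-bit word */
    static uint32_t hda_cmd(uint8_t cad, uint8_t nid,
                            uint16_t verb, uint8_t payload)
    {
        return ((uint32_t)cad << 28) | ((uint32_t)nid << 20) |
               ((uint32_t)verb << 8) | payload;
    }

    /* Reading widget connectivity would then be (exec() hypothetical):
       len  = exec(hda_cmd(0, nid, AC_VERB_PARAMETERS, AC_PAR_CONNLIST_LEN));
       nids = exec(hda_cmd(0, nid, AC_VERB_GET_CONNECT_LIST, index));   */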

OS-X Audio Units

"An audio unit (often abbreviated as AU in header files and elsewhere) is a Mac OS X plug-in that enhances digital audio applications such as Logic Pro and GarageBand. You can also use audio units to build audio features into your own application. Programmatically, an audio unit is packaged as a bundle and configured as a component as defined by the Mac OS X Component Manager. At a deeper level, and depending on your viewpoint, an audio unit is one of two very different things. From the inside—as seen by an audio unit developer—an audio unit is executable implementation code within a standard plug-in API. The API is standard so that any application designed to work with audio units will know how to use yours. The API is defined by the Audio Unit Specification. An audio unit developer can add the ability for users or applications to control an audio unit in real time through the audio unit parameter mechanism. Parameters are self-describing; their values and capabilities are visible to applications that use audio units. From the outside—as seen from an application that uses the audio unit—an audio unit is just its plug-in API. This plug-in API lets applications query an audio unit about its particular features, defined by the audio unit developer as parameters and properties."

http://developer.apple.com/documentation/MusicAudio/Conceptual/AudioUnitProgrammingGuide/Introduction/Introduction.html http://developer.apple.com/documentation/MusicAudio/Conceptual/AudioUnitProgrammingGuide/TheAudioUnit/TheAudioUnit.html#//apple_ref/doc/uid/TP40003278-CH12-SW1 http://developer.apple.com/documentation/MusicAudio/Conceptual/CoreAudioOverview/WhatsinCoreAudio/WhatsinCoreAudio.html#//apple_ref/doc/uid/TP40003577-CH4-SW6

GStreamer

GStreamer is a framework for creating streaming media applications. The fundamental design comes from the video pipeline at Oregon Graduate Institute, as well as some ideas from DirectShow. The framework is based on plugins that will provide the various codec and other functionality. The plugins can be linked and arranged in a pipeline. This pipeline defines the flow of the data. Pipelines can also be edited with a GUI editor and saved as XML so that pipeline libraries can be made with a minimum of effort.

http://gstreamer.freedesktop.org/data/doc/gstreamer/head/manual/html/index.html
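
As a small illustration, a pipeline can be built straight from a textual description (audiotestsrc, volume and autoaudiosink are standard elements; error handling is omitted):

    #include <gst/gst.h>

    int main(int argc, char *argv[])
    {
        gst_init(&argc, &argv);
        /* Test tone -> volume element -> automatically chosen sink */
        GstElement *pipeline = gst_parse_launch(
            "audiotestsrc ! volume volume=0.5 ! autoaudiosink", NULL);
        gst_element_set_state(pipeline, GST_STATE_PLAYING);
        g_main_loop_run(g_main_loop_new(NULL, FALSE));
        return 0;
    }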

Windows

Wave

kMixer?

Possible new HPI

(Moved here from under the HPI heading, leaving that section to discuss the current implementation.)

A basic topology will have just controls and connections. I.e. the controls ARE the nodes.

Controls modify or measure the signal going through them. Special cases are sources and sinks, which have no input or output respectively. Meters can be represented as either passthrough or input-only. Controls have labels which roughly correspond to HPI nodes, but are unique per control, i.e. nodes don't have multiple controls. The unique ID could just be the control index.

A single control can have multiple attributes, e.g. a tuner has band and frequency.

Connections always have a source and a destination, both of which are controls.

A simplified example (ASX-like)

Controls:

1 Player1
2 PlayerMeter1
3 PlayerSRC1
4 PlayerVolume1
5 Player2 (leave out meter etc for brevity)
6 MixGain11
7 MixGain12
8 MixGain21
9 MixGain22
10 Sum1
11 Sum2
12 LineoutLevel1
13 LineoutLevel2
14 Lineout1
15 Lineout2
16 OutMeter1

Connections:

1->2   Player1 - PlayerMeter1
1->3   Player1 - PlayerSRC1
3->4   PlayerSRC1 - PlayerVolume1
4->6   PlayerVolume1 - MixGain11
4->7   PlayerVolume1 - MixGain12
5->8   Player2 - MixGain21
5->9   Player2 - MixGain22
6->10  MixGain11 - Sum1
7->11  MixGain12 - Sum2
8->10  MixGain21 - Sum1
9->11  MixGain22 - Sum2
10->12 Sum1 - LineoutLevel1
11->13 Sum2 - LineoutLevel2
12->14 LineoutLevel1 - Lineout1
13->15 LineoutLevel2 - Lineout2
10->16 Sum1 - OutMeter1
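
Purely as a sketch, the example above might be encoded like this (names and types invented for illustration): the controls are the nodes, and the connections are (source, destination) pairs of control IDs.

    typedef struct {
        unsigned    id;     /* unique control ID, as in the list above */
        const char *label;  /* e.g. "Player1" */
    } control;

    typedef struct {
        unsigned src;       /* ID of the source control */
        unsigned dst;       /* ID of the destination control */
    } connection;

    static const control controls[] = {
        { 1, "Player1" },    { 2, "PlayerMeter1" },
        { 3, "PlayerSRC1" }, { 4, "PlayerVolume1" },
        /* ... remaining controls as listed above ... */
    };

    static const connection connections[] = {
        { 1, 2 }, { 1, 3 }, { 3, 4 }, { 4, 6 }, { 4, 7 },
        /* ... remaining connections as listed above ... */
    };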

AGE comments

If we are rewriting the way controls are handled, I want to see a control type:

ABSTRACT_CONTROL

  • with a property giving the control type
  • with a connects-to list
  • with a parent/child property
  • parent controls would contain a list of child controls
  • child controls would have basic types like "int", "string", "multiplexer"

The goal would be that once we have a multiplexer implemented ONCE in ASIControl, any other control that exposes that property would be handled automatically. There would be no additional custom coding.
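
A rough sketch of what such an ABSTRACT_CONTROL record might look like (all field names invented):

    typedef struct abstract_control {
        const char               *type;        /* "int", "string", "multiplexer", ... */
        struct abstract_control **connects_to; /* the connects-to list */
        unsigned                  n_connects;
        struct abstract_control  *parent;      /* parent/child property */
        struct abstract_control **children;    /* parents list their children */
        unsigned                  n_children;
    } abstract_control;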

HPI

HPI concepts

node
a place with a type and index. Nodes are either source or destination, not both.
control
active element with a number of attributes (read-only or read/write). Attached to a single node, or between a source and a destination.
attribute
setting or measurement value of a control. Some attributes have more than one dimension (e.g. multiplexer-like attributes, whose values are [node type, index] pairs).
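
For reference, locating a control in the current HPI looks roughly like this (the names follow the HPI headers, but the exact signature may differ between versions):

    /* Find the volume control between outstream 0 and line out 0:
       a control is addressed by (source node, dest node, control type) */
    HPI_HCONTROL hControl;
    u16 err = HPI_MixerGetControl(phSubSys, hMixer,
                                  HPI_SOURCENODE_OSTREAM, 0,
                                  HPI_DESTNODE_LINEOUT, 0,
                                  HPI_CONTROL_VOLUME, &hControl);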

Commentary

Having source and destination nodes means HPI cannot express many topologies accurately. I.e. it is not possible to have a chain of nodes.

Having volume controls "on" single nodes doesn't reflect a volume control having an input on one side and an output on the other.

Summing is implicit when multiple volume controls attach to a single destination node.

Multiplexer controls are attached to only a single node, representing their output. In the case of the linein mux, this node is a source node, even though it is the output of the mux. The connections to the mux inputs are implicit.

Linein analog/digital muxes have 'linein' as both a source and destination!?

Node types vs. control types: there is no direct connection between the two, i.e. one can attach a tuner control to an outstream node.

Attributes with two disparate parts to the value don't map easily to a single basic datatype (maybe string?).

HPI meters, levels and volumes are implicitly mono or stereo. The HPI message format doesn't support more channels.

HPI controls provide grouping for attributes, by having multiple attributes attached to a single control.

EWB critique

I think the idea of nodes and controls is overly complicated, and at the same time too inflexible.

Inflexible because:

  • it only allows a single-layer network: source node - destination node,
  • nodes are either source or destination.

Complicated because:

  • nodes have multiple controls,
  • the order of control application is hidden (apart from the 'enumerate in signal flow order' hack),
  • some controls are 'on' nodes, others are between nodes,
  • for many controls, the node vs. control distinction is redundant, e.g. tuner control/node, mic control/node.

Delio's attempt at visualizing it all

(Insert object diagram here)

Reading the diagram from the top-left:

A single ASX subsystem 'controls' many adapters. Each adapter has multiple modes; each mode consists of a different collection of controls.

Each control can optionally send/receive data to/from one or more other controls. Each control has zero or more 'properties' that can be either read-only or read/write. Controls can be grouped into a control group; a control group can have a number of read-only properties that describe how to present its controls to the user.

In addition to the comments above, I think we also need to figure out what exactly it means for one control to be connected to another. As I see it, there are two data channels in and out of any control: the sample stream channel and the property access channel (think of them as in-band and out-of-band data). The sample stream channel carries the data to be processed by the control; the property access channel carries parameter change commands. For instance, an autofader control could be implemented simply as a control that emits "set-volume" commands to a volume control. The relevant topology snippet would look like:

    [control] --> volume --> [control]
       autofader----^

In the graph above, the horizontal flow is the sample stream (processed through the volume control), while the vertical arrow is the command stream from the autofader control. The autofader control does not process samples; it simply updates the level property of the volume control automatically.

The same could be implemented as an autofading volume control: a single control that supports autofade. The advantage of a model that separates the data and command channels is that it is more modular: simpler controls can be designed and connected together rather than having to add features to existing controls.
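
A sketch of that separation (all names invented): the autofader never touches samples; it only drives the volume control's level property over the command channel.

    /* The volume control: processes samples, owns a 'level' property */
    typedef struct {
        float level;
    } volume_control;

    /* Command-channel entry point: a "set-volume" command */
    static void volume_set_level(volume_control *v, float level)
    {
        v->level = level;
    }

    /* The autofader: called once per control period, it ramps the
       target's level toward a goal instead of processing audio
       (overshoot handling omitted for brevity) */
    static void autofader_tick(volume_control *target, float goal, float step)
    {
        if (target->level < goal)
            volume_set_level(target, target->level + step);
        else if (target->level > goal)
            volume_set_level(target, target->level - step);
    }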

ASX

Similar to HPI in concept. Adds players and recorders as controls (these map loosely to HPI streams).

Comments