Commit 462b25a

Audio: MFCC: Update example run script run_mfcc.sh
This patch contains several updates:

- A run with valgrind is added to catch memory leaks.
- The script applied duplicate "-i" and "-o" arguments. They are
  removed from the "OPT" variables.
- Unlike the IPC3 testbench, sof-testbench4 can't override the channel
  count set in the topology. Since the current topology is for stereo
  16 kHz, the input data and command line must match it.
- To be able to compare MFCC output for successive runs, the "-R"
  option is added to the sox audio conversion commands to prevent
  e.g. randomization of dither.
- The script converts the input to s24 and s32 formats and runs them
  to make it easier to check correct operation with the supported
  formats. The conversion is done from the s16 version so that the
  output audio features can be compared; they should be the same if
  internal processing is 16 bit.
- A run with Mel configured MFCC is added for s16/24/32 formats.
- A script to decode and visualize Mel spectrogram data is added as
  decode_mel.m.

Signed-off-by: Seppo Ingalsuo <seppo.ingalsuo@linux.intel.com>
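
The point of the "-R" option is that two successive runs should produce
byte-identical output files. A minimal sketch of such a comparison is
shown below. The function name compare_runs and the run1/run2 directory
names are hypothetical and not part of this patch; only the standard
"cmp -s" utility is assumed.

```shell
#!/bin/sh
# Sketch: report whether two MFCC output files from successive runs are
# byte-identical. compare_runs is a hypothetical helper, not part of the
# patch.
compare_runs() {
    # cmp -s prints nothing and returns 0 only when the files are identical
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
    fi
}

# Example (hypothetical paths holding the output of two runs):
#   compare_runs run1/mfcc_s16.raw run2/mfcc_s16.raw
```

The same check can be applied between mfcc_s16.raw, mfcc_s24.raw and
mfcc_s32.raw, which the commit message expects to match if internal
processing is 16 bit.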
1 parent 71a7cd5 commit 462b25a

3 files changed

Lines changed: 152 additions & 11 deletions

src/audio/mfcc/tune/README.txt

Lines changed: 7 additions & 1 deletion
@@ -16,7 +16,7 @@ The output file is hard-coded to mfcc.raw.
 
 The output can be plotted and retrieved with Matlab or Octave command:
 
-[ceps, t, n] = decode_ceps('mfcc.raw', 13);
+[ceps, t, n] = decode_ceps('mfcc_s16.raw', 13);
 
 In the above it's known from configuration script that MFCC was set up to
 output 13 cepstral coefficients from each FFT -> Mel -> DCT -> Cepstral
@@ -27,3 +27,9 @@ e.g. other sound files found in computer.
 
 ./run_mfcc.sh /usr/share/sounds/gnome/default/alerts/bark.ogg
 ./run_mfcc.sh /usr/share/sounds/gnome/default/alerts/sonar.ogg
+
+The script runs the same input sample with s16/24/32 formats for
+cepstral coefficients data output and Mel frequency spectrogram
+output. The 80 bands Mel output can be visualized with command:
+
+[ceps, t, n] = decode_mel('mel_s16.raw', 80);

src/audio/mfcc/tune/decode_mel.m

Lines changed: 101 additions & 0 deletions
@@ -0,0 +1,101 @@
+% [mel, t, n] = decode_mel(fn, num_mel, num_channels)
+%
+% Input
+% fn - File with MFCC data in .raw or .wav format
+% num_mel - number of Mel coefficients per frame
+% num_channels - needed for .raw format, omit for .wav
+%
+% Outputs
+% mel - Mel coefficients
+% t - time vector for plotting
+% n - mel 1..num_mel vector for plotting
+
+% SPDX-License-Identifier: BSD-3-Clause
+% Copyright(c) 2026 Intel Corporation.
+
+function [mel, t, n] = decode_mel(fn, num_mel, num_channels)
+
+if nargin < 3
+    num_channels = 1;
+end
+
+% MFCC stream
+fs = 16e3;
+qformat = 7;
+magic = [25443 28006]; % ASCII 'mfcc' as int16
+
+% Load output data
+[data, num_channels] = get_file(fn, num_channels);
+
+idx1 = find(data == magic(1));
+idx = [];
+for i = 1:length(idx1)
+    if data(idx1(i) + 1) == magic(2)
+        idx = [idx idx1(i)];
+    end
+end
+
+if isempty(idx)
+    error('No magic value markers found from stream');
+end
+
+period_mel = idx(2)-idx(1);
+num_frames = length(idx);
+
+% Last frame can be incomplete due to span over multiple periods
+last = idx(end) + num_mel - 1;
+if (last > length(data))
+    num_frames = num_frames - 1;
+end
+
+t_mel = period_mel / num_channels / fs;
+t = (0:num_frames -1) * t_mel;
+n = 1:num_mel;
+
+mel = zeros(num_mel, num_frames);
+for i = 1:num_frames
+    i1 = idx(i) + 2;
+    i2 = i1 + num_mel - 1;
+    mel(:,i) = data(i1:i2) / 2^qformat;
+end
+
+figure;
+imagesc(t, n, mel);
+axis xy;
+colormap(jet);
+colorbar;
+tstr = sprintf('SOF MFCC Mel coefficients (%s)', fn);
+title(tstr, 'Interpreter', 'None');
+xlabel('Time (s)');
+ylabel('Mel coef #');
+
+end
+
+function [data, num_channels] = get_file(fn, num_channels)
+
+[~, ~, ext] = fileparts(fn);
+
+switch lower(ext)
+    case '.raw'
+        fh = fopen(fn, 'r');
+        data = fread(fh, 'int16');
+        fclose(fh);
+    case '.wav'
+        tmp = audioread(fn, 'native');
+        t = whos('tmp');
+        if ~strcmp(t.class, 'int16')
+            error('Only 16-bit wav file format is supported');
+        end
+        s = size(tmp);
+        num_channels = s(2);
+        if num_channels > 1
+            data = int16(zeros(prod(s), 1));
+            for i = 1:num_channels
+                data(i:num_channels:end) = tmp(:, i);
+            end
+        end
+    otherwise
+        error('Unknown audio format');
+end
+
+end
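
decode_mel.m locates frames by searching the int16 stream for the word
25443 immediately followed by 28006. The same search can be sketched in
shell as a quick check that an output file contains frames at all,
before opening Octave. This is an illustration only: count_markers is a
hypothetical helper, and the sketch assumes a little-endian host because
"od -t d2" reads 16-bit words in host byte order.

```shell
#!/bin/sh
# Sketch: count frame markers in a raw int16 stream, i.e. positions where
# the word 25443 is immediately followed by 28006 (the magic pair that
# decode_mel.m searches for). count_markers is a hypothetical helper.
# Assumes a little-endian host.
count_markers() {
    # -An: no offsets, -v: do not collapse repeated lines, -t d2: int16
    od -An -v -t d2 "$1" | awk '
        {
            for (i = 1; i <= NF; i++) {
                if (prev == 25443 && $i == 28006)
                    count++
                prev = $i
            }
        }
        END { print count + 0 }'
}

# Example: count_markers mel_s16.raw
```

A result of 0 corresponds to the 'No magic value markers found from
stream' error in decode_mel.m.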

src/audio/mfcc/tune/run_mfcc.sh

Lines changed: 44 additions & 10 deletions
@@ -4,19 +4,53 @@
 
 set -e
 
-RAW_INPUT=in.raw
-RAW_OUTPUT=mfcc.raw
+RAW_INPUT_S16=in_s16.raw
+RAW_INPUT_S24=in_s24.raw
+RAW_INPUT_S32=in_s32.raw
+RAW_OUTPUT_S16=mfcc_s16.raw
+RAW_OUTPUT_S24=mfcc_s24.raw
+RAW_OUTPUT_S32=mfcc_s32.raw
 
+VALGRIND="valgrind --leak-check=full"
 TESTBENCH=$SOF_WORKSPACE/sof/tools/testbench/build_testbench/install/bin/sof-testbench4
-TOPOLOGY=$SOF_WORKSPACE/sof/tools/build_tools/topology/topology2/development/sof-hda-benchmark-mfcc16.tplg
-OPT="-r 16000 -c 2 -b S16_LE -p 3,4 -t $TOPOLOGY -i $RAW_INPUT -o $RAW_OUTPUT"
+TOPOLOGY_S16=$SOF_WORKSPACE/sof/tools/build_tools/topology/topology2/development/sof-hda-benchmark-mfcc16.tplg
+TOPOLOGY_S24=$SOF_WORKSPACE/sof/tools/build_tools/topology/topology2/development/sof-hda-benchmark-mfcc24.tplg
+TOPOLOGY_S32=$SOF_WORKSPACE/sof/tools/build_tools/topology/topology2/development/sof-hda-benchmark-mfcc32.tplg
+OPT_S16="-r 16000 -c 2 -b S16_LE -p 3,4 -t $TOPOLOGY_S16"
+OPT_S24="-r 16000 -c 2 -b S24_LE -p 3,4 -t $TOPOLOGY_S24"
+OPT_S32="-r 16000 -c 2 -b S32_LE -p 3,4 -t $TOPOLOGY_S32"
 
-# Convert input audio file raw 16 kHz 1 channel 16 bit
-sox --encoding signed-integer "$1" -L -r 16000 -c 1 -b 16 "$RAW_INPUT"
+# Convert input audio file raw 16 kHz 2 channel 16 bit
+sox -R --encoding signed-integer "$1" -L -r 16000 -c 2 -b 16 "$RAW_INPUT_S16"
+sox -R --no-dither --encoding signed-integer -L -r 16000 -c 2 -b 16 "$RAW_INPUT_S16" -b 32 "$RAW_INPUT_S32"
+sox -R --no-dither --encoding signed-integer -L -r 16000 -c 2 -b 16 "$RAW_INPUT_S16" -b 32 "$RAW_INPUT_S24" vol 0.003906250000
 
 # Run testbench
-$TESTBENCH $OPT -i "$RAW_INPUT" -o "$RAW_OUTPUT"
+$VALGRIND $TESTBENCH $OPT_S16 -i "$RAW_INPUT_S16" -o "$RAW_OUTPUT_S16"
+$VALGRIND $TESTBENCH $OPT_S24 -i "$RAW_INPUT_S24" -o "$RAW_OUTPUT_S24"
+$VALGRIND $TESTBENCH $OPT_S32 -i "$RAW_INPUT_S32" -o "$RAW_OUTPUT_S32"
 
-echo -----------------------------------------------
-echo The MFCC data was output to file $RAW_OUTPUT
-echo -----------------------------------------------
+echo ----------------------------------------------------------------------------------
+echo The MFCC data was output to file $RAW_OUTPUT_S16, $RAW_OUTPUT_S24, $RAW_OUTPUT_S32
+echo ----------------------------------------------------------------------------------
+
+RAW_OUTPUT_S16=mel_s16.raw
+RAW_OUTPUT_S24=mel_s24.raw
+RAW_OUTPUT_S32=mel_s32.raw
+
+TESTBENCH=$SOF_WORKSPACE/sof/tools/testbench/build_testbench/install/bin/sof-testbench4
+TOPOLOGY_S16=$SOF_WORKSPACE/sof/tools/build_tools/topology/topology2/development/sof-hda-benchmark-mfccmel16.tplg
+TOPOLOGY_S24=$SOF_WORKSPACE/sof/tools/build_tools/topology/topology2/development/sof-hda-benchmark-mfccmel24.tplg
+TOPOLOGY_S32=$SOF_WORKSPACE/sof/tools/build_tools/topology/topology2/development/sof-hda-benchmark-mfccmel32.tplg
+OPT_S16="-r 16000 -c 2 -b S16_LE -p 3,4 -t $TOPOLOGY_S16"
+OPT_S24="-r 16000 -c 2 -b S24_LE -p 3,4 -t $TOPOLOGY_S24"
+OPT_S32="-r 16000 -c 2 -b S32_LE -p 3,4 -t $TOPOLOGY_S32"
+
+# Run testbench
+$VALGRIND $TESTBENCH $OPT_S16 -i "$RAW_INPUT_S16" -o "$RAW_OUTPUT_S16"
+$VALGRIND $TESTBENCH $OPT_S24 -i "$RAW_INPUT_S24" -o "$RAW_OUTPUT_S24"
+$VALGRIND $TESTBENCH $OPT_S32 -i "$RAW_INPUT_S32" -o "$RAW_OUTPUT_S32"
+
+echo ----------------------------------------------------------------------------------
+echo The MFCC Mel data was output to file $RAW_OUTPUT_S16, $RAW_OUTPUT_S24, $RAW_OUTPUT_S32
+echo ----------------------------------------------------------------------------------
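
The S24 input is written as 32-bit samples and scaled with
"vol 0.003906250000", which is 1/256 (2^-8). The sketch below works
through the sample arithmetic for one value. It assumes that sox maps a
16-bit sample to 32 bits by left-justifying it (multiplying by 2^16);
under that assumption the net effect is the 16-bit value multiplied by
2^8, i.e. a 24-bit range inside a 32-bit container. The helper names
s16_to_s32 and s32_to_s24 are hypothetical and only illustrate the
scaling; they are not part of run_mfcc.sh.

```shell
#!/bin/sh
# Sketch of the sample scaling behind the S24 conversion. Assumption:
# 16 -> 32 bit conversion left-justifies the sample (multiply by 2^16).
# Both helpers are hypothetical illustrations.
s16_to_s32() { echo $(( $1 * 65536 )); }   # 16-bit value in 32-bit range
s32_to_s24() { echo $(( $1 / 256 )); }     # effect of the 1/256 vol factor

# The vol factor used by the script, printed with 12 decimals
vol=$(awk 'BEGIN { printf "%.12f", 1 / 256 }')
echo "vol factor: $vol"

s16=4660                       # 0x1234
s32=$(s16_to_s32 "$s16")
s24=$(s32_to_s24 "$s32")
printf 's16=0x%04x s32=0x%08x s24=0x%08x\n' "$s16" "$s32" "$s24"
```

Because the S24 and S32 inputs are exact scalings of the S16 input, the
three runs are fed the same signal, which is what allows their outputs
to be compared.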
