Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use pffft in spectrogram #5960

Open
wants to merge 17 commits into
base: master
Choose a base branch
from

Conversation

Paul-Licameli
Copy link
Collaborator

@Paul-Licameli Paul-Licameli commented Feb 15, 2024

Resolves: #5880

Depends on

Serial performance improvement in spectrogram drawing, using pffft

  • I signed CLA
  • The title of the pull request describes an issue it addresses
  • If changes are extensive, then there is a sequence of easily reviewable commits
  • Each commit's message describes its purpose and effects
  • There are no behavior changes unnecessary for the stated purpose of the PR

Recommended:

  • Each commit compiles and runs on my machine without known undesirable changes of behavior

@Paul-Licameli Paul-Licameli self-assigned this Feb 15, 2024
@Paul-Licameli Paul-Licameli force-pushed the Use-pffft-in-spectrogram branch 3 times, most recently from aaad77d to 97d3fe6 Compare February 20, 2024 12:24
Copy link
Contributor

@saintmatthieu saintmatthieu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be a relatively high-impact change for our users, but unfortunately it is a lot to consider, all the more that there are several algorithms to review and test.

im=buffer[hFFT->BitReversed[i]+1];
scratch[2*i ] = re*mFilterFuncR[i] - im*mFilterFuncI[i];
scratch[2*i+1] = re*mFilterFuncI[i] + im*mFilterFuncR[i];
if(0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose this if(0) is here because of the missing down-scaling upon IFFT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed that fix, looks like it works now.

size_t i;
for (i = 0; i < len; ++i)
buffer[i] *= window[i];
for( ; i < len; ++i)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That doesn't look right. This second loop won't have any iteration.
On the other hand, looking in SpectrogramSettings.cpp, I find that window already includes the zero-padding, and that hFFT->Points is len/2. So I guess we can safely ignore padding here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for noticing that! I don't want to let that stand.

else
out[i] = 10.0*log10f(power);
out[index / 2] = 10.0 * log10f(power);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, maybe we could replace this log10f with an optimized log2. We saw in the tempo-estimation optimization PR that it sped up the STFT a great deal. (We probably should not blindly reuse the GetLog2 of MirDsp.cpp, though, because that approximation is only accurate to two decimal places, as you correctly found out, @Paul-Licameli ).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No.

Don’t use approximation for speed. Plot the best spectral data we can calculate.

const size_t scratchSize =
reassignment ? 3u * PffftAlignedCount{ fftLen } : fftLen;
(reassignment ? 4u : 2u) * PffftAlignedCount{ fftLen };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're here in Populate, and within that function it's difficult to find out why there's a need for more than just fftLen. One has to dig CalculateOneSpectrum to find out about that dependency. And in fact, it doesn't seem that the standard spectrogram view requires scratch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... Now I see that the buffer allocation is done outside a nested loop, which might explain why the scratch size must be determined outside of CalculateOneSpectrum.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the comment before this line is sufficient explanation.

@Paul-Licameli Paul-Licameli force-pushed the Use-pffft-in-spectrogram branch 2 times, most recently from 9789675 to 9119270 Compare February 24, 2024 11:18
... also more type checking that buffers are aligned.

Also making sure not to inline any constants or functions in pffft.h, so that
the libsoxr version isn't accidentally used by calling code.

The full generality of complex time domain data, or the non-reordering calls,
are not yet needed or exposed through this wrapper.
Copy link
Contributor

@saintmatthieu saintmatthieu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an important improvement and the type safety is good, but I'd like this ambiguity in the PffftFloatVector API to be resolved, at least by making that aligned overload private.

@@ -3,12 +3,12 @@

Audacity: A Digital Audio Editor

PowerSpectrumGetter.cpp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this stage you might want to keep the PowerSpectrumGetter in its own file and let PffftTransformer.h/cpop have their new Paul Licameli banners. The functionality you've authored covers 90% of this file. At the very least please add your name to the banner.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the power spectrum is a closely enough related utility that it can live in the same file.

const size_t scratchSize =
reassignment ? 3u * PffftAlignedCount{ fftLen } : fftLen;
(reassignment ? 4u : 2u) * PffftAlignedCount{ fftLen };
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... Now I see that the buffer allocation is done outside a nested loop, which might explain why the scratch size must be determined outside of CalculateOneSpectrum.

PffftAlignedCount &operator =(const PffftAlignedCount &) = default;

//! @invariant `result: result % FloatAlignment == 0`
operator size_t() const { return value; }
operator size_t() const noexcept { return value; }

private:
size_t value{};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be const, it would seem.
And then, at least on my Windows machine with build settings that are still inconsistent with CI, I can also constexpr operator size_t() const noexcept { return value; } ; or are there platforms where it fails?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making these const and constexpr allowed me to write this:

static_assert(PffftAlignedCount(0) == 0);
static_assert(PffftAlignedCount(1) == 16);
static_assert(PffftAlignedCount(16) == 16);
static_assert(PffftAlignedCount(17) == 32);

What I understand from this is that this PffftAlignedCount(n) answers the question "How many sizeof(float) bytes do I need for n floats?". This is useful if PffftFloatVector is to be used as a vector of vectors, or matrix, but not as a straightforward vector; something like

   PffftFloatVector v(2);
   v[1] = 2.f;
   const auto c1 = v.aligned(PffftAlignedCount(1));

doesn't make sense: c1.get() is out of bound and *c1.get() isn't 2.
In fact, I found out that the PffftFloats aligned( overload with one argument is not used anywhere other than PffftFloatVector, so it can - and should - be made private.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was explicit in comments above the class:

 Usual hazards of pointer invalidation or out-of-bounds subscripting still
 apply.

 Size and capacity are NOT constrained to be aligned.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree that the one-argument aligned should be made private.

There may be future need.

pffft_transform_ordered(get(), input.get(), out, work.get(),
PFFFT_BACKWARD);
if (renormalize)
std::transform(out, out + N, out, [N = N](float f){ return f / N; });
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Requires #include <algorithm>

// Inverse FFT and normalization
InverseRealFFTf(scratch, hFFT.get());
ReorderToTime(hFFT.get(), scratch, buffer);
transformer.InverseTransformOrdered(buf, buf, scr, true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

anonymous boolean

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One easily jumps to the function definition to know what the boolean is, if the IDE browser is good enough.

Copy link
Contributor

@saintmatthieu saintmatthieu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, @Paul-Licameli, using a spectrogram length of 8 in the settings causes an assertion in PFFFT.
Reading more carefully pffft.h:

   - supports only transforms for inputs of length N of the form
   N=(2^a)*(3^b)*(5^c), a >= 5, b >=0, c >= 0 (32, 48, 64, 96, 128,
   144, 160, etc are all acceptable lengths). Performance is best for
   128<=N<=8192.

This would imply that the least size is 32. (Maybe by luck, but I just tried a size of 16 and that worked.)

@LWinterberg do you have an idea of the amount of dissatisfaction raising the least spectrum size to 32 would cause? I never came across a need for such small window sizes in audio applications. Maybe analysis of non-audio waves sampled at extremely low rates ?

PS: I wasn't aware that non-power-of-two sizes were also possible - interesting!

@LWinterberg
Copy link
Member

@LWinterberg do you have an idea of the amount of dissatisfaction raising the least spectrum size to 32 would cause? I never came across a need for such small window sizes in audio applications. Maybe analysis of non-audio waves sampled at extremely low rates ?

After I didn't find a a spectrogram in promising places like Multi-Beam Spatio-Spectral Beamforming Receiver for Wideband Phased Arrays, Pipeline FFT architectures optimized for FPGAs and Goldstein phase filtering on InSAR data (it's an interferogram), I checked Sonic Visualiser and found:

image

If even they don't need anything below 32, it's probably fair to say we don't either.

@Paul-Licameli
Copy link
Collaborator Author

Indeed, @Paul-Licameli, using a spectrogram length of 8 in the settings causes an assertion in PFFFT. Reading more carefully pffft.h:

   - supports only transforms for inputs of length N of the form
   N=(2^a)*(3^b)*(5^c), a >= 5, b >=0, c >= 0 (32, 48, 64, 96, 128,
   144, 160, etc are all acceptable lengths). Performance is best for
   128<=N<=8192.

This would imply that the least size is 32. (Maybe by luck, but I just tried a size of 16 and that worked.)

@LWinterberg do you have an idea of the amount of dissatisfaction raising the least spectrum size to 32 would cause? I never came across a need for such small window sizes in audio applications. Maybe analysis of non-audio waves sampled at extremely low rates ?

PS: I wasn't aware that non-power-of-two sizes were also possible - interesting!

Interesting, they allow other prime factors than 2. See Lanczos Lemma.

The divide and conquer step of fft.

Copy link
Contributor

@saintmatthieu saintmatthieu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsatisfied to see argued suggestions rejected for no apparent good reason, but the remaining time for completion of this PR doesn't allow further argument. Please raise the least allowed frame size to 32 and I'll approve.

@@ -6,6 +6,7 @@
PffftTransformer.cpp

Matthieu Hodgkinson
Paul Licameli
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Funny character

@@ -68,7 +68,7 @@ struct PffftAlignedCount
PffftAlignedCount &operator =(const PffftAlignedCount &) = default;

//! @invariant `result: result % FloatAlignment == 0`
operator size_t() const noexcept { return value; }
constexpr operator size_t() const noexcept { return value; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

value could/should still be const.

auto pOutput = output.get();
const auto factor = N / requestedSize;
for (size_t ii = 1, nn = requestedSize; ii < nn; ++ii)
pOutput[ii] = pOutput[ii * factor];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You overlooked that output here has layout [DC, Nyq, real1, imag1, ...].
But besides that I think it should work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Use PFFFT for spectrogram calculation
3 participants