Comments (5)

andrej102 commented on August 13, 2024

I tested MASE. It's just a noise reduction algorithm; it has nothing to do with beamforming. It also doesn't work well: it greatly distorts the sound and limits the bandwidth to 4 kHz. Afterwards, the automatic speech recognition system recognizes the voice much worse than without MASE.

from diy-alexa.

cgreening commented on August 13, 2024

I think you could definitely do it at sample capture time - as you say - before putting the samples into the ring buffer.
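As a rough illustration of processing at capture time, here is a minimal Python sketch (not the diy-alexa code, which runs on the ESP32). The names `CaptureRingBuffer`, `on_samples_captured` and `remove_dc_offset` are hypothetical, and the DC-offset removal is only a stand-in for whatever processing you actually want to run. The point is just that each block is processed before it is written, so only processed samples take up buffer memory.

```python
from collections import deque
from typing import Callable, Iterable, List


def remove_dc_offset(samples: List[int]) -> List[int]:
    """Hypothetical processing step: subtract the block mean from each sample."""
    if not samples:
        return []
    mean = sum(samples) // len(samples)
    return [s - mean for s in samples]


class CaptureRingBuffer:
    """Fixed-size ring buffer that processes each block before storing it."""

    def __init__(self, capacity: int, process: Callable[[List[int]], List[int]]):
        # deque with maxlen drops the oldest samples automatically when full
        self._buffer = deque(maxlen=capacity)
        self._process = process

    def on_samples_captured(self, block: Iterable[int]) -> None:
        # Process here, at capture time, so raw samples are never stored.
        self._buffer.extend(self._process(list(block)))

    def latest(self, count: int) -> List[int]:
        """Return up to `count` of the most recent processed samples."""
        if count <= 0:
            return []
        return list(self._buffer)[-count:]
```

Usage would look like `rb = CaptureRingBuffer(16000, remove_dc_offset)` followed by `rb.on_samples_captured(block)` for each block read from the mic.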

If memory does become an issue, then I'd be tempted to look at the WROVER modules, which have 4MB of RAM (PSRAM).

StuartIanNaylor commented on August 13, 2024

Yeah, my wandering mind keeps thinking that a distributed array of directional mics could use the KW hit result to let a central ASR pick the best, nearest mic signal.
But yeah, the WROVER modules don't cost much more now.
https://www.ebay.co.uk/itm/ESP32-WROVER-B-T8-V1-8-ESP32-8MB-PSRAM-TF-Card-WiFi-Module-Bluetooth-Board-HL/293784167596
Would need 2x I2S mics, but that's still very cheap.
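To make the "best and nearest mic" idea concrete, here is a minimal Python sketch of what the central side might do. Everything in it is an assumption for illustration: the `KeywordHit` record, the `pick_best_mic` function, the 0.5 s grouping window and the 0.6 score threshold are all hypothetical, and it assumes the units share a clock and that a higher KW score roughly means a closer, cleaner mic.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class KeywordHit:
    mic_id: str       # hypothetical identifier for one mic unit
    score: float      # keyword-spotting confidence, assumed to be in [0, 1]
    timestamp: float  # seconds, assumed to come from a shared clock


def pick_best_mic(
    hits: Iterable[KeywordHit],
    window_s: float = 0.5,
    min_score: float = 0.6,
) -> Optional[str]:
    """Return the mic with the highest score among hits for the same utterance."""
    candidates = [h for h in hits if h.score >= min_score]
    if not candidates:
        return None
    # Treat hits within `window_s` of the earliest hit as the same utterance.
    first = min(h.timestamp for h in candidates)
    same_utterance = [h for h in candidates if h.timestamp - first <= window_s]
    return max(same_utterance, key=lambda h: h.score).mic_id
```

The central ASR would then only consume the stream from the returned `mic_id` and ignore the others for that utterance.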

Does anyone know if there is a lib for lossless audio over RTP with latency compensation, similar to snapcast?

StuartIanNaylor commented on August 13, 2024

Whilst we are talking mics, has anyone tried MASE and the other algorithms on a WROVER or lower?
https://github.com/espressif/esp-sr

If it was an RTP mic streamer, triggered by the KW and running until silence, that would be amazing to couple to a central ASR.
If it streamed AMR-WB to one port and sent the accompanying KWS hit score on another, then multiple ESP32s could be part of a distributed array.
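The "triggered by KW till silence" part could be sketched like this in Python. It only shows the gating logic: there is no RTP or AMR-WB encoding here, and `gate_frames`, `frame_energy`, the energy threshold and the silent-frame count are all hypothetical names and values picked for illustration.

```python
from typing import Iterable, Iterator, List, Tuple


def frame_energy(frame: List[int]) -> float:
    """Mean squared amplitude of one frame of PCM samples."""
    return sum(s * s for s in frame) / len(frame) if frame else 0.0


def gate_frames(
    frames: Iterable[Tuple[bool, List[int]]],
    silence_energy: float = 100.0,
    max_silent_frames: int = 3,
) -> Iterator[List[int]]:
    """Yield frames from a keyword hit until a run of silent frames.

    `frames` is an iterable of (keyword_detected, samples) pairs.
    """
    streaming = False
    silent_run = 0
    for keyword_detected, frame in frames:
        if not streaming:
            if not keyword_detected:
                continue  # idle: nothing is sent until the keyword fires
            streaming = True
            silent_run = 0
        yield frame  # this is where a real unit would encode and send
        if frame_energy(frame) < silence_energy:
            silent_run += 1
            if silent_run >= max_silent_frames:
                streaming = False  # enough silence: stop until the next hit
        else:
            silent_run = 0
```

A simple energy threshold is a crude stand-in for proper voice activity detection, but it keeps the idea visible: each unit stays quiet on the network until its own KWS fires.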

StuartIanNaylor commented on August 13, 2024

@andrej102 I did wonder how they had shoehorned in a series of load-heavy DSP stages that even seem to be too much for embedded Linux on a Raspberry Pi. 4 kHz is no use to the central ASR I plan to use.

They have done an amazing demo with the ESP32-S3: https://www.youtube.com/watch?v=ARkuaeEW1eM

The EQ DSP needed to correct the filtering caused by beamforming also adds so much load that I gave up on the stereo mic idea.
The ADC on the ESP32 is not great, from bit depth to SNR, but going analogue with directional electrets is an alternative.

The above video has me confused, as it seems to exceed the high-end DSP beamformers I have seen, and I am unsure whether it is snake oil or not.
I still plan to use distributed directional electrets so that each unit gets voice=near / noise=far purely by positioning.
