
speechmetrics's People

Contributors

aliutkus, bdmgxl, lochenchou, mpariente, turian


speechmetrics's Issues

the range of mosnet and srmr?

I was wondering what the exact range of mosnet and srmr is, because I have seen a few utterances get a result larger than 5, even up to 8.xx. I would really appreciate your answer! 🌼

Releasing to PyPI register

Hi, I just came across this nice package and would like to use it in some of our projects. Do you think it would be possible to publish it on PyPI as well, rather than only installing it from a URL? The main advantage is that a PyPI package can be cached, whereas a package installed from a URL always has to be reinstalled, even if it is already in site-packages. BTW, the name on PyPI still seems to be available 🐰

How to comprehend output?

Hi
First, the metric is super cool, it saved me from downloading each of the metrics separately. Thanks!
Also I wanted to know how to comprehend the output. It would be great if you add that to the Readme file.
Here is the output from two of the files in your dataset, could you elaborate on the results, as in what does high positive or negative value or close to zero mean?

reference = 'data/m2_script1_produced.wav'
distorted = 'data/m2_script1_clean.wav'

{'mosnet': array([[5.0981326]], dtype=float32),
'srmr': 4.653473083972128}
{'sdr': array([[-0.39609285]]),
'isr': array([[0.24738725]]),
'sar': array([[-0.37060632]]),
'pesq': 4.354660987854004,
'sisdr': -14.740691053217517,
'stoi': 0.9718856108717927}
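
As a rough aid for reading the 'sisdr' value above, here is a minimal sketch of the standard SI-SDR definition in plain Python. It is not taken from the speechmetrics code, and the function name si_sdr and the toy signals are made up for illustration. By this definition, higher is better, 0 dB means the scaled reference and the residual carry equal energy, and a negative value means the residual dominates.

```python
import math

def si_sdr(reference, estimate):
    """Scale-invariant SDR in dB for two equal-length sequences of samples."""
    dot = sum(r * e for r, e in zip(reference, estimate))
    ref_energy = sum(r * r for r in reference)
    alpha = dot / ref_energy  # scaling that best fits the reference to the estimate
    target = [alpha * r for r in reference]
    residual = [e - t for e, t in zip(estimate, target)]
    target_energy = sum(t * t for t in target)
    residual_energy = sum(n * n for n in residual)
    return 10 * math.log10(target_energy / residual_energy)

# Hypothetical toy signals, only to show how the number behaves.
reference = [1.0, 0.0, -1.0, 0.0]
estimate = [1.0, 1.0, -1.0, 1.0]

print(si_sdr(reference, estimate))                   # 0.0: target and residual have equal energy
print(si_sdr(reference, [3 * x for x in estimate]))  # 0.0 again: rescaling the estimate changes nothing
print(si_sdr(reference, [1.0, 0.1, -1.0, 0.1]))      # about 20 dB: a much smaller residual
```

The sketch raises an error for an all-zero reference or a perfect estimate (zero residual); a real implementation would need to guard those cases.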

Failed to install speechmetrics following steps in Installation section

Hello everyone,
I am following the steps in the README to install speechmetrics; however, I am facing the following errors:
(base) [bilal@Fedora ~]$ pip install git+https://github.com/aliutkus/speechmetrics#egg=speechmetrics[cpu]
Collecting speechmetrics[cpu] from git+https://github.com/aliutkus/speechmetrics#egg=speechmetrics[cpu]
Cloning https://github.com/aliutkus/speechmetrics to /tmp/pip-install-3ibjsrho/speechmetrics
Running command git clone -q https://github.com/aliutkus/speechmetrics /tmp/pip-install-3ibjsrho/speechmetrics
error: RPC failed; curl 18 transfer closed with outstanding read data remaining
fatal: the remote end hung up unexpectedly
fatal: early EOF
fatal: index-pack failed
ERROR: Command "git clone -q https://github.com/aliutkus/speechmetrics /tmp/pip-install-3ibjsrho/speechmetrics" failed with error code 128 in None

Can't install this package according to the steps in the readme

When I follow the steps in the readme I am unable to install this package.

conda create --name myenv python=3.7
pip install numpy
pip install git+https://github.com/aliutkus/speechmetrics#egg=speechmetrics[cpu]

The error I get is as follows:

Collecting speechmetrics[cpu]
  Cloning https://github.com/aliutkus/speechmetrics to /tmp/pip-install-niosg_5m/speechmetrics_e8b8c8981b054e5abf0eb066869fb2be
Requirement already satisfied: numpy in /home/bram/miniconda3/envs/myenv/lib/python3.6/site-packages (from speechmetrics[cpu]) (1.19.5)
Collecting gammatone@ git+https://github.com/detly/gammatone
  Cloning https://github.com/detly/gammatone to /tmp/pip-install-niosg_5m/gammatone_895b5433ba3942b89e09332d77220272
Collecting pypesq@ git+https://github.com/vBaiCai/python-pesq
  Cloning https://github.com/vBaiCai/python-pesq to /tmp/pip-install-niosg_5m/pypesq_10d103a6483e4b49baf71e1a6c5e1860
Collecting srmrpy@ git+https://github.com/jfsantos/SRMRpy
  Cloning https://github.com/jfsantos/SRMRpy to /tmp/pip-install-niosg_5m/srmrpy_bdd74c3ba26f435a9e50257e2b5ae2ee
Collecting Gammatone@ https://github.com/detly/gammatone/archive/master.zip#egg=Gammatone
  Using cached https://github.com/detly/gammatone/archive/master.zip
INFO: pip is looking at multiple versions of speechmetrics to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of <Python from Requires-Python> to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of pypesq to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of gammatone to determine which version is compatible with other requirements. This could take a while.
INFO: pip is looking at multiple versions of speechmetrics[cpu] to determine which version is compatible with other requirements. This could take a while.
ERROR: Cannot install speechmetrics, speechmetrics==1.0 and speechmetrics[cpu]==1.0 because these package versions have conflicting dependencies.

The conflict is caused by:
    speechmetrics[cpu] 1.0 depends on gammatone 1.0 (from git+https://github.com/detly/gammatone)
    speechmetrics 1.0 depends on gammatone 1.0 (from git+https://github.com/detly/gammatone)
    srmrpy 1.0 depends on gammatone 1.0 (from https://github.com/detly/gammatone/archive/master.zip#egg=Gammatone)

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/user_guide/#fixing-conflicting-dependencies

Where are the versions specified and can they be updated to make the install commands work? Any other tips are welcome as well.

License

I was hoping to use this repository for a commercial project. Would it be possible for you to add a license to this repository that would allow it?

Really awesome repository by the way!

Can you list what the parameters of each function are?

such as:
metrics = speechmetrics.load('bsseval' , window_length)
scores = metrics(?, ?)

metrics = speechmetrics.load('nb_pesq', window_length)
scores = metrics(?, ?)

metrics = speechmetrics.load('stoi', window_length)
scores = metrics(?, ?)
......

ERROR:0xC00000FD: Stack overflow

Python output:
Loaded speechmetrics.relative.nb_pesq
Loaded speechmetrics.relative.pesq
Loaded speechmetrics.relative.sisdr
Loaded speechmetrics.relative.stoi

Debugging it in VS:
Unhandled exception at 0x00007FFD86C9FA07 (pesq_core.cp38-win_amd64.pyd) in python.exe: 0xC00000FD: Stack overflow (parameters: 0x0000000000000001, 0x000000C2BF203000).

Wide-band PESQ instead of Narrow band?

Hey,

I wonder what you would think about making the WB PESQ from here the default in speech_metrics.
This replicates the results from Loizou's Matlab code.

We could still keep the current pesq under raw_pesq or something.
I'm willing to make a PR if needed.

Non existing "tensorflow==2.0.0"

The issue is that there is no longer "tensorflow==2.0.0".
I ended up installing straight from the stable packages of tensorflow here: link
Note: after a brief check, I found that the code works with tensorflow 2.3.0 (at least mosnet), and after installing tensorflow-cpu==2.3.0 everything works well.

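Can't print the score

# Assumed: sf is the soundfile package; the pesq import was not shown in the post.
import soundfile as sf
a, sr = sf.read('E:/speech/sliced_test_clean/S_01_01.wav')
b, sr = sf.read('E:/speech/sliced_test_-5/S_01_01.wav')
score = pesq(a, b, sr)
print(score)
Hello, my code looks like this, but it does not print the score, and I don't know why.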

Poetry SolverProblemError when trying to install both `speechmetrics` and `pysepm`

Hi,
First and foremost, thank you for this amazing code! 🔥

I want to install both pysepm and speechmetrics in my poetry env; however, I get this error:

$ poetry add git+https://github.com/aliutkus/speechmetrics.git

Updating dependencies
Resolving dependencies... (24.7s)

  SolverProblemError

  Because speechmetrics (rev master) depends on srmrpy (branch master)
   and pysepm (rev master) depends on SRMRpy (1.0), speechmetrics (rev master) is incompatible with pysepm (rev master).
  So, because semi-guided-speech depends on both pysepm (branch master) and speechmetrics (branch master), version solving failed.

  at /usr/lib/python3.10/site-packages/poetry/puzzle/solver.py:241 in _solve

Why is there a tight dependency on the master/branch version of SRMRpy?

'srmrpy @ git+https://github.com/jfsantos/SRMRpy',

fyi: this is the setup.py of pysepm:

https://github.com/schmiph2/pysepm/blob/3c3f35ef5846d0e976adbc9d72469c3d4ae99a4f/setup.py#L17

An installation problem about 'speechmetrics' repository

Hi, I failed to install speechmetrics on Windows yesterday with either Python 3 or Python 2, and I also tried on Ubuntu but failed again. Could you tell me what the problem is and how I can solve it? Thanks!
There is no Issues button in your 'speechmetrics' repository, so I am raising the question here.

error

the score of SRMR is bigger

Hi, when I use SRMR to evaluate audio quality, the score is sometimes bigger than 1, and the average score over 2430 audio files is 2.23. Do you have any idea why?

Is this going to depend on `bsseval`

Thanks for the wrapper, it's a nice idea !
Is speechmetrics going to depend on bsseval once it is done?
Which would also mean that speechmetrics will include SI-SDR? That would be really practical to share the same metrics in the community.

example/test.py

test.py: scores = metrics(reference, test)
example_use: scores = metrics(path_to_estimate_file, path_to_reference)

Which of the above two is correct?

'str' object has no attribute 'decode'

I followed the examples and got an error.
Here is my code:

import speechmetrics
window_length = 5 # seconds
metrics = speechmetrics.load('absolute', window_length)
scores = metrics("/Wave/000001.wav")
print(scores)

and I get errors like this:

'str' object has no attribute 'decode'
  File "/home/mike/testaudioquality/test.py", line 3, in <module>
    metrics = speechmetrics.load('absolute', window_length)

My Python version is 3.6 and my tensorflow-gpu version is 2.0.0.

setup.py installation error: `tensorflow-gpu` package has been removed.

The installation command in the readme throws an error, because setup.py is trying to build the tensorflow-gpu package, which has been removed from pip.

From their pypi page:

tensorflow-gpu has been removed. Please install tensorflow instead.

As of December 2022, tensorflow-gpu has been removed and has been replaced with this new, empty package that generates an error upon installation.

Installing tensorflow separately does not help, as setup.py always looks for the empty tensorflow-gpu package instead.

Keras model error

When running test.py, I get the following error :

Trying ABSOLUTE metrics: 
Traceback (most recent call last):
  File "test.py", line 7, in <module>
    metrics = sm.load('absolute', window)
  File "/home/mparient/code_perso/cloned/speechmetrics/speechmetrics/__init__.py", line 151, in load
    new_metric = load_function(window)
  File "/home/mparient/code_perso/cloned/speechmetrics/speechmetrics/absolute/mosnet/__init__.py", line 22, in load
    mosnet = MOSNet(window, hop)
  File "/home/mparient/code_perso/cloned/speechmetrics/speechmetrics/absolute/mosnet/model.py", line 36, in __init__
    padding='same'))(re_input)
  File "/home/mparient/.virtualenvs/speechmetrics/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/base_layer.py", line 817, in __call__
    self._maybe_build(inputs)
  File "/home/mparient/.virtualenvs/speechmetrics/lib/python3.6/site-packages/tensorflow_core/python/keras/engine/base_layer.py", line 2141, in _maybe_build
    self.build(input_shapes)
  File "/home/mparient/.virtualenvs/speechmetrics/lib/python3.6/site-packages/tensorflow_core/python/keras/layers/convolutional.py", line 153, in build
    raise ValueError('The channel dimension of the inputs '
ValueError: The channel dimension of the inputs should be defined. Found `None`.

Installing the b1 version (as specified in the original repo) doesn't solve the problem. Any idea?

Inconsistency between museval and speechmetrics-bsseval

I wrote the following code to compare the behavior between museval and speechmetrics-bsseval.

from museval.metrics import bss_eval
import speechmetrics as sm
import numpy as np

metrics = sm.load(['bsseval'],window=1)

ref = np.random.randn(1, 44100*3, 2)  # [nsrc, nsample, channel], a single audio source with two channels 
est = np.random.randn(1, 44100*3, 2)

res = bss_eval(ref,est,window=44100,hop=44100)

bsseval = metrics(est[0,...],ref[0,...],rate=44100)

print(res)

print(bsseval)

It outputs the following:

Loaded  speechmetrics.relative.bsseval
(array([[-3.02169448, -2.98148236, -3.01738321]]), array([[-0.03463801, -0.03900151, -0.0400294 ]]), array([[inf, inf, inf]]), array([[-21.09888836, -21.01320054, -21.05034071]]), array([[0]]))
{'sdr': array([[-2.99676764, -2.98088619, -2.99560498],
       [-3.04682562, -2.98208233, -3.03924334]]), 'isr': array([[-0.01493135, -0.01706893, -0.01804728],
       [-0.02410121, -0.02879832, -0.0266307 ]]), 'sar': array([[-21.00349928, -20.94294041, -20.91428113],
       [-21.19664823, -21.08528379, -21.18806085]])}

It seems that speechmetrics treats the two channels as two sources.

Changing the code at bsseval.py:16 to the following solves the problem:

result = bss_eval(reference_sources=audios[1][None,...], # shape: [nsrc, nsample, nchannels]
                estimated_sources=audios[0][None,...],
                window=self.bss_window * rate,
                hop=self.bss_hop * rate)
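
To make the shape argument above concrete, here is a small NumPy sketch. It only illustrates array layouts and makes no claim about what speechmetrics or museval do internally; the variable names are made up for illustration. It shows how the same stereo buffer reads as one two-channel source or as two mono sources, depending on which axis comes first in a [nsrc, nsample, nchannel] layout.

```python
import numpy as np

# Stereo signal laid out as (nsample, nchannel), as in the issue above.
stereo = np.zeros((44100 * 3, 2))

# Adding a leading axis marks it as ONE source with two channels.
as_single_source = stereo[None, ...]

# Moving the channel axis to the front instead presents it as TWO mono sources.
as_two_sources = stereo.T[..., None]

print(as_single_source.shape)  # (1, 132300, 2)
print(as_two_sources.shape)    # (2, 132300, 1)
```

The first layout is the one the proposed fix passes to bss_eval via audios[...][None,...].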

Multiple Values for Single Metric?

Hello. Thank you for the efforts in this all-in-one package.
I am mainly using MOSNet and SRMR now, as they are non-intrusive (absolute).

But I find that more than one value is output for each metric, even when I load only one audio file for evaluation.
Here is an example of Python output:
{'mosnet': array([2.75408196, 3.04858017, 3.26394176]), 'srmr': array([10.25382248, 7.35339144, 8.33086446]), 'stoi': array([ 0.09140952, -0.08605568, -0.0059758 ])}
You can see there are 3 values for each metric. Sometimes I get only 1 or 2 for each.

I am quite sure my input audio is a 1-D array, as numpy.shape(audio) gives (101984,).
Is it because of the length of the input audio?

Also, please advise why SRMR returns such large numbers, while your introduction says it lies between 0 and 1, with 1 being the best?
Thanks!
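
If the several values per metric are one score per analysis window (an assumption here, based on `load` taking a window length in seconds elsewhere on this page), a single figure per file can be obtained by averaging them. The sketch below uses the values quoted above as plain lists; the helper name summarize is made up for illustration.

```python
from statistics import mean

# Per-window values copied from the output quoted above.
scores = {
    "mosnet": [2.75408196, 3.04858017, 3.26394176],
    "srmr": [10.25382248, 7.35339144, 8.33086446],
    "stoi": [0.09140952, -0.08605568, -0.0059758],
}

def summarize(window_scores):
    """Collapse each metric's per-window values into one mean value."""
    return {name: mean(values) for name, values in window_scores.items()}

summary = summarize(scores)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

Whether a mean is the right summary depends on the metric; a median or a per-window plot may be more informative for short or noisy files.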
