query_repeat_part_by_audio's Introduction

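Function

Given multiple audio files, find the segments they have in common and report the start and end time of the shared segment in each file.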

Environment

Python=3.8.3
librosa=0.8.0
numpy=1.19.2
scipy=1.4.1
pandas=1.0.5
soundfile=0.10.3
matplotlib=3.3.0

Audio format

Sample rate: 16000 Hz
Bit depth: 16 bit
Channels: mono
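
Before running the tool it can help to confirm that each input file really has this format. The following is a minimal sketch using only the Python standard library; the helper names `wav_params` and `is_expected_format` are hypothetical and are not part of this repository.

```python
import os
import tempfile
import wave


def wav_params(path):
    """Return (sample_rate, bits_per_sample, channels) read from a WAV header."""
    with wave.open(path, "rb") as wf:
        return wf.getframerate(), wf.getsampwidth() * 8, wf.getnchannels()


def is_expected_format(path, sample_rate=16000, bits=16, channels=1):
    """True if the file is 16 kHz, 16-bit, mono (the format listed above)."""
    return wav_params(path) == (sample_rate, bits, channels)


# Demo: write one second of 16 kHz, 16-bit, mono silence and check it.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)        # 2 bytes per sample = 16 bits
    wf.setframerate(16000)
    wf.writeframes(b"\x00\x00" * 16000)

print(wav_params(demo_path))
print(is_expected_format(demo_path))
```

Files that fail the check would need to be resampled or downmixed before being placed in audio_data.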

How it works

Inverted index
Shazam audio fingerprinting algorithm
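
The two ideas above can be illustrated with a small, self-contained sketch. It is not the code in this repository: the function names, the target-zone parameters `max_dt` and `max_df`, and the toy peak data are all assumptions made for illustration. Spectrogram peaks are paired into hashes in the style of Shazam fingerprinting, the hashes are stored in an inverted index, and shared content shows up as many hashes agreeing on one time offset.

```python
from collections import Counter, defaultdict


def pair_peaks(peaks, max_dt=3, max_df=2):
    """Pair each anchor peak with later peaks inside a small target zone.

    peaks: list of (time_index, freq_index) tuples.
    Returns a list of (hash, anchor_time) with hash = (f1, f2, dt).
    """
    peaks = sorted(peaks)
    fingerprints = []
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break  # peaks are time-sorted, so the rest are out of range too
            if dt > 0 and abs(f2 - f1) <= max_df:
                fingerprints.append(((f1, f2, dt), t1))
    return fingerprints


def build_inverted_index(fingerprints_by_audio):
    """Map every hash to the (audio_id, anchor_time) pairs where it occurs."""
    index = defaultdict(list)
    for audio_id, fingerprints in fingerprints_by_audio.items():
        for h, t in fingerprints:
            index[h].append((audio_id, t))
    return index


def candidate_offsets(index, query_id):
    """For each other audio, count how many shared hashes agree on each offset."""
    votes = defaultdict(Counter)
    for postings in index.values():
        query_times = [t for a, t in postings if a == query_id]
        for other_id, other_t in postings:
            if other_id == query_id:
                continue
            for qt in query_times:
                votes[other_id][other_t - qt] += 1
    return votes


# Toy data: audio "b" contains the same peak pattern as "a", 10 frames later.
peaks_a = [(0, 5), (1, 6), (2, 5), (3, 7)]
peaks_b = [(t + 10, f) for t, f in peaks_a]
index = build_inverted_index({"a": pair_peaks(peaks_a), "b": pair_peaks(peaks_b)})
votes = candidate_offsets(index, "a")
best_offset, count = votes["b"].most_common(1)[0]
print(best_offset, count)
```

Because lookups go through the index, only audio files that share at least one hash with the query are ever compared, which is what makes the approach practical when the number of files is large.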

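Usage

1 Install the required modules: pip install -r requirements.txt, then put the audio files into the audio_data folder.

2 To use the sample data, first download it from Baidu Netdisk into the audio_data folder (link: https://pan.baidu.com/s/1YZWSFYUciFCeF6QyedMCOA extraction code: anpj).
Then run python main.py to print the results to the screen.

Alternatively, run ./main.sh; the results are written to log.
The log will contain output like the following: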
path: ./audio_data/66.wav sr: 16000 duration: 959.0 feature.shape: (34479, 3)
path: ./audio_data/70.wav sr: 16000 duration: 204.0 feature.shape: (6784, 3)
path: ./audio_data/72.wav sr: 16000 duration: 459.0 feature.shape: (17897, 3)
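target_advise_list of 66.wav: ['70', '72'] # when there are many audio files, the inverted index quickly narrows down the candidate audio files
['66', 104.8875, 119.075] ['70', 2.7375, 16.925] # i.e. seconds 104 to 119 of audio 66 and seconds 2 to 16 of audio 70 contain the same content (an advertisement)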
['66', 895.975, 908.9625] ['72', 430.9375, 443.9375]
['66', 90.4875, 104.2625] ['72', 415.45, 429.2375]
['66', 940.5, 952.9375] ['72', 130.475, 142.9]
target_advise_list of 70.wav: ['72', '66']
['70', 2.9375, 16.925] ['72', 355.05, 369.0375]
['70', 2.7375, 16.925] ['66', 104.8875, 119.075]
target_advise_list of 72.wav: ['70', '66']
['72', 355.05, 369.0375] ['70', 2.9375, 16.925]
['72', 430.9375, 443.9375] ['66', 895.975, 908.9625]
['72', 415.45, 429.2375] ['66', 90.4875, 104.2625]
['72', 130.475, 142.9] ['66', 940.5, 952.9375]
To stop a run midway, run ./stop.sh
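
If the match lines in the log need further processing, they can be parsed back into structured values. This is a minimal sketch that assumes the line format shown in the sample output above; `parse_match_line` is a hypothetical helper, not a function provided by this repository.

```python
import ast
import re

# One bracketed group such as ['66', 104.8875, 119.075]
SEGMENT_RE = re.compile(r"\[[^\[\]]*\]")


def parse_match_line(line):
    """Parse a match line into a list of (audio_id, start_sec, end_sec) tuples.

    Lines that do not start with a bracket (for example the
    "target_advise_list of ..." lines) yield an empty list.
    """
    if not line.lstrip().startswith("["):
        return []
    segments = []
    for text in SEGMENT_RE.findall(line):
        audio_id, start, end = ast.literal_eval(text)
        segments.append((audio_id, float(start), float(end)))
    return segments


line = "['66', 104.8875, 119.075] ['70', 2.7375, 16.925]"
for audio_id, start, end in parse_match_line(line):
    print(f"audio {audio_id}: {start:.2f}s to {end:.2f}s ({end - start:.2f}s long)")
```

Both segments on a line should have roughly the same length, which gives a quick sanity check on a reported match.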

References

[1]. https://www.toptal.com/algorithms/shazam-it-music-processing-fingerprinting-and-recognition
[2]. https://zhuanlan.zhihu.com/p/75360272
[3]. https://github.com/lukemcraig/AudioSearch
[4]. An Industrial-Strength Audio Search Algorithm

query_repeat_part_by_audio's Issues

Problem when running under Python 3.9

Traceback (most recent call last):
  File "D:\Work_woker\_user\py_class\query_repeat_part_by_audio\main.py", line 233, in <module>
    test()
  File "D:\Work_woker\_user\py_class\query_repeat_part_by_audio\main.py", line 230, in test
    main(audio_path_list)
  File "D:\Work_woker\_user\py_class\query_repeat_part_by_audio\main.py", line 213, in main
    audio = Audio(audio_path)
  File "D:\Work_woker\_user\py_class\query_repeat_part_by_audio\delete_repeat_advise\audio_feature.py", line 229, in __init__
    self.get_audio_params(self.audio_path)
  File "D:\Work_woker\_user\py_class\query_repeat_part_by_audio\delete_repeat_advise\audio_feature.py", line 234, in get_audio_params
    self.audio_feature = self.audio_obj.get_audio_feature(self.y, self.sr, 1)
  File "D:\Work_woker\_user\py_class\query_repeat_part_by_audio\delete_repeat_advise\audio_feature.py", line 35, in get_audio_feature
    return self.get_fingerprints(audio_data, audio_sr)
  File "D:\Work_woker\_user\py_class\query_repeat_part_by_audio\delete_repeat_advise\audio_feature.py", line 49, in get_fingerprints
    fingerprints = self._get_fingerprints_from_peaks(len(f) - 1, f_step, peak_locations, len(t) - 1, t_step)
  File "D:\Work_woker\_user\py_class\query_repeat_part_by_audio\delete_repeat_advise\audio_feature.py", line 103, in _get_fingerprints_from_peaks
    paired_df_peak_locations, n_pairs = self._query_dataframe_for_peaks_in_target_zone_binary_search(
  File "D:\Work_woker\_user\py_class\query_repeat_part_by_audio\delete_repeat_advise\audio_feature.py", line 155, in _query_dataframe_for_peaks_in_target_zone_binary_search
    paired_df_peak_locations = df_peak_locations.loc[t_index & f_index]
  File "C:\Users\chen\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\ops\common.py", line 81, in new_method
    return method(self, other)
  File "C:\Users\chen\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\arraylike.py", line 70, in __and__
    return self._logical_method(other, operator.and_)
  File "C:\Users\chen\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\indexes\base.py", line 6791, in _logical_method
    res_values = ops.logical_op(lvalues, rvalues, op)
  File "C:\Users\chen\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\ops\array_ops.py", line 394, in logical_op
    res_values = na_logical_op(lvalues, rvalues, op)
  File "C:\Users\chen\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\ops\array_ops.py", line 304, in na_logical_op
    result = op(x, y)
ValueError: operands could not be broadcast together with shapes (99,) (11,)

Would it be possible to upgrade librosa 0.8.0 and numpy 1.19.2 to librosa 0.9.2 and numpy 1.23.5?
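
A possible explanation, based only on the traceback and not verified against the repository: the failing call goes through the pandas Index `__and__` method, which suggests that `t_index` and `f_index` are pandas Index objects of different lengths (99 and 11). Older pandas releases such as the pinned 1.0.5 treated `&` between two Index objects as a set intersection, whereas newer releases apply it element-wise, which would produce exactly this broadcast error. Under that assumption, calling `Index.intersection` explicitly should behave the same on both. The toy DataFrame and its column names below are hypothetical.

```python
import pandas as pd

# Toy stand-in for the peak table; the column names are hypothetical.
df_peak_locations = pd.DataFrame(
    {"t": [0, 1, 2, 3, 4], "f": [10, 20, 30, 40, 50]},
    index=[100, 101, 102, 103, 104],
)

# Row labels satisfying each condition: two Index objects of different lengths.
t_index = df_peak_locations[df_peak_locations["t"] >= 1].index   # 4 labels
f_index = df_peak_locations[df_peak_locations["f"] <= 30].index  # 3 labels

# Explicit set intersection, independent of how `&` is interpreted.
common = t_index.intersection(f_index)
paired_df_peak_locations = df_peak_locations.loc[common]
print(paired_df_peak_locations)
```

If this is indeed the cause, the error comes from the installed pandas version rather than from librosa or numpy, so pinning pandas to the version listed under Environment would be another way to avoid it.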
