qiuqiao / SOFA
SOFA: Singing-Oriented Forced Aligner
License: MIT License
Hello!
Dear developers, thank you so much for your aligner SOFA. Do you have any inference scripts for SOFA?
Hi, I tried to train with my own data, but I got the following error when executing `python train.py -p path_to_your_pretrained_model`:
'torchaudio' installed and imported.
Seed set to 114514
Using bfloat16 Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
| Name | Type | Params
---------------------------------------------------------------
0 | backbone | UNetBackbone | 101 M
1 | head | Linear | 5.5 K
2 | ph_frame_GHM_loss_fn | GHMLoss | 0
3 | pseudo_label_GHM_loss_fn | MultiLabelGHMLoss | 0
4 | ph_edge_GHM_loss_fn | MultiLabelGHMLoss | 0
5 | EMD_loss_fn | BinaryEMDLoss | 0
6 | ph_edge_diff_GHM_loss_fn | MultiLabelGHMLoss | 0
7 | MSE_loss_fn | MSELoss | 0
8 | CTC_GHM_loss_fn | CTCGHMLoss | 0
---------------------------------------------------------------
101 M Trainable params
0 Non-trainable params
101 M Total params
406.018 Total estimated model params size (MB)
Sanity Checking: | | 0/? [00:00<?, ?it/s]/home/ria/SOFA/venv/lib/python3.11/site-packages/lightning/pytorch/trainer/connectors/data_connector.py:441: The 'val_dataloader' does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` to `num_workers=15` in the `DataLoader` to improve performance.
Traceback (most recent call last):
File "/home/ria/SOFA/train.py", line 152, in <module>
main()
File "/home/ria/SOFA/venv/lib/python3.11/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ria/SOFA/venv/lib/python3.11/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
^^^^^^^^^^^^^^^^
File "/home/ria/SOFA/venv/lib/python3.11/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ria/SOFA/venv/lib/python3.11/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ria/SOFA/train.py", line 138, in main
trainer.fit(
File "/home/ria/SOFA/venv/lib/python3.11/site-packages/lightning/pytorch/trainer/trainer.py", line 544, in fit
call._call_and_handle_interrupt(
File "/home/ria/SOFA/venv/lib/python3.11/site-packages/lightning/pytorch/trainer/call.py", line 44, in _call_and_handle_interrupt
return trainer_fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ria/SOFA/venv/lib/python3.11/site-packages/lightning/pytorch/trainer/trainer.py", line 580, in _fit_impl
self._run(model, ckpt_path=ckpt_path)
File "/home/ria/SOFA/venv/lib/python3.11/site-packages/lightning/pytorch/trainer/trainer.py", line 987, in _run
results = self._run_stage()
^^^^^^^^^^^^^^^^^
File "/home/ria/SOFA/venv/lib/python3.11/site-packages/lightning/pytorch/trainer/trainer.py", line 1031, in _run_stage
self._run_sanity_check()
File "/home/ria/SOFA/venv/lib/python3.11/site-packages/lightning/pytorch/trainer/trainer.py", line 1060, in _run_sanity_check
val_loop.run()
File "/home/ria/SOFA/venv/lib/python3.11/site-packages/lightning/pytorch/loops/utilities.py", line 182, in _decorator
return loop_run(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ria/SOFA/venv/lib/python3.11/site-packages/lightning/pytorch/loops/evaluation_loop.py", line 128, in run
batch, batch_idx, dataloader_idx = next(data_fetcher)
^^^^^^^^^^^^^^^^^^
File "/home/ria/SOFA/venv/lib/python3.11/site-packages/lightning/pytorch/loops/fetchers.py", line 133, in __next__
batch = super().__next__()
^^^^^^^^^^^^^^^^^^
File "/home/ria/SOFA/venv/lib/python3.11/site-packages/lightning/pytorch/loops/fetchers.py", line 60, in __next__
batch = next(self.iterator)
^^^^^^^^^^^^^^^^^^^
File "/home/ria/SOFA/venv/lib/python3.11/site-packages/lightning/pytorch/utilities/combined_loader.py", line 341, in __next__
out = next(self._iterator)
^^^^^^^^^^^^^^^^^^^^
File "/home/ria/SOFA/venv/lib/python3.11/site-packages/lightning/pytorch/utilities/combined_loader.py", line 142, in __next__
out = next(self.iterators[0])
^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ria/SOFA/venv/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 629, in __next__
data = self._next_data()
^^^^^^^^^^^^^^^^^
File "/home/ria/SOFA/venv/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 672, in _next_data
data = self._dataset_fetcher.fetch(index) # may raise StopIteration
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ria/SOFA/venv/lib/python3.11/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
data = [self.dataset[idx] for idx in possibly_batched_index]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/ria/SOFA/venv/lib/python3.11/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
data = [self.dataset[idx] for idx in possibly_batched_index]
~~~~~~~~~~~~^^^^^
File "/home/ria/SOFA/dataset.py", line 84, in __getitem__
ph_seq = np.array(item["ph_seq"])
~~~~^^^^^^^^^^
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "/home/ria/SOFA/venv/lib/python3.11/site-packages/h5py/_hl/group.py", line 357, in __getitem__
oid = h5o.open(self.id, self._e(name), lapl=self._lapl)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5o.pyx", line 241, in h5py.h5o.open
KeyError: "Unable to synchronously open object (object 'ph_seq' doesn't exist)"
Before running this command, I ran `python binarize.py`, which completed without problems:
Data compression ratio:
full label data: 45.89 %,
weak label data: 54.11 %,
no label data: 0.00 %.
Successfully binarized train set, total time 3756.69s, saved to data/binary/train.h5py
For additional information: I ran it in WSL2 (Debian 12) with CUDA 12.4 nightly and Python 3.11.2; I don't know if that affects the result.
I don't know enough about Python to be able to fix this problem with a PR, so I hope this is descriptive enough.
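As a debugging aid for KeyErrors like this, a small script can list which groups in the binarized file lack a ph_seq dataset. This is a hedged sketch: the one-group-per-item layout is assumed from dataset.py's `item["ph_seq"]` access, not confirmed against SOFA's actual binarizer.

```python
import h5py

# Hypothetical diagnostic: walk the binarized HDF5 file and collect every
# group that has no 'ph_seq' dataset (the layout assumed here may differ
# from what binarize.py actually writes).
def find_missing_ph_seq(path):
    missing = []
    with h5py.File(path, "r") as f:
        def visit(name, obj):
            if isinstance(obj, h5py.Group) and "ph_seq" not in obj:
                missing.append(name)
        f.visititems(visit)
    return missing

if __name__ == "__main__":
    for name in find_missing_ph_seq("data/binary/train.h5py"):
        print("missing ph_seq:", name)
```

If weak-label or no-label items are stored without a ph_seq entry, a listing like this would show whether the dataloader is unexpectedly reading one of them.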
Commit 7cd5771 adds wav_length to _infer_once, but doesn't change the arguments provided in validation_step (SOFA/modules/task/forced_alignment.py, lines 774 to 781 in 7cd5771), which leaves ph_seq as None, causing _infer_once to fail with a NoneType error when using train.py. I reverted back to commit 2591e98 and this error was resolved.

SOFA/configs/binarize_config.yaml, line 8 in 51a869d: however, fmax should be no larger than sample_rate / 2 (the Nyquist frequency). Is this by design, and why?
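For reference, the Nyquist constraint is easy to validate when loading a config. The key names sample_rate and fmax below are assumptions for illustration, not taken from the actual binarize_config.yaml schema:

```python
# Hypothetical sanity check: fmax must not exceed the Nyquist frequency
# (sample_rate / 2), otherwise the top mel filterbank bands cover
# frequencies the sampled audio cannot contain.
def check_fmax(config):
    nyquist = config["sample_rate"] / 2
    if config["fmax"] > nyquist:
        raise ValueError(
            f"fmax={config['fmax']} exceeds Nyquist frequency {nyquist}"
        )

check_fmax({"sample_rate": 44100, "fmax": 22050})  # ok: exactly Nyquist
```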
Can I use a DiffSinger model?
My steps follow this notebook:
https://github.com/MLo7Ghinsan/DiffSinger_colab_notebook_MLo7/blob/main/SOFA_Notebook.ipynb
However, I only used the installation and dependency part; I did not use the Extract Data part of the code, because I had already split the wav files locally.
Then I put the split wav files and the transcriptions.csv file in the data directory.
Then I executed the code below and encountered the above error.
```python
import os

%cd /content/SOFA
if not os.path.exists("data/binary/global_config.yaml"):
    os.makedirs("data/binary")
    with open("data/binary/global_config.yaml", "w") as file:
        pass
!source activate aligner; python binarize.py
```
When setting dataloader_workers to a value other than 0, the program raises:
TypeError: h5py objects cannot be pickled
The reason is that Python uses pickle to transfer objects between processes, and the h5py object held by dataset instances cannot be pickled. The direct cause of this problem is that the h5py object is created before the worker processes are spawned.
To be specific, the h5py object is created when get_label_types() is called here on the main process:
Lines 71 to 78 in 51a869d
One possible solution is to close and release the h5py object after accessing metadata like this: the main process opens the HDF5 file, reads the metadata, and closes it before spawning worker processes, so the handle is never pickled and transferred; each worker process then opens the file itself, reads data items from it, and keeps it open.
If you think the solution above is okay, I can submit a PR to solve this problem.
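The proposed scheme can be sketched as a plain class. Class and attribute names here are illustrative, not SOFA's actual dataset.py (which subclasses torch.utils.data.Dataset): metadata is read eagerly and the file closed, so the instance pickles cleanly; each worker then reopens the file lazily on first access.

```python
import h5py

# Illustrative sketch of the lazy-open pattern: no live h5py handle is
# ever held when the dataset object is pickled into a worker process.
class LazyH5Dataset:
    def __init__(self, path):
        self.path = path
        self._h5 = None  # live handle, opened per process on demand
        # Main process: open, read metadata, close before workers spawn.
        with h5py.File(path, "r") as f:
            self.keys = sorted(f.keys())

    def __getstate__(self):
        # Drop the unpicklable handle when the dataset is sent to a worker.
        state = self.__dict__.copy()
        state["_h5"] = None
        return state

    def __len__(self):
        return len(self.keys)

    def __getitem__(self, idx):
        # Worker process: open its own handle on first access, keep it open.
        if self._h5 is None:
            self._h5 = h5py.File(self.path, "r")
        return self._h5[self.keys[idx]][()]
```

Keeping the per-worker handle open across __getitem__ calls avoids reopening the file for every item, which matters for training throughput.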
I added the full_label, weak_label and no_label data normally, and binarize.py executed successfully. The following is the result.
I have not modified the train_config.yaml configuration file.
I'm going to use a pretrained model for fine-tuning. The following output appeared, and then training stalled. What is the problem?
Is there any documentation or video explaining the model architecture, or are there plans to write and publicize such documentation?
P.S. Thanks for the excellent work; it has really sped up my forced alignment.