Comments (4)
Hi Ben,
We appreciate your interest in Casanovo!
- The config file can be user-provided, but the default is the `casanovo/config.py` we provide in the repo. You can use it as a template for your own config file and provide the path to your file. `test_data_path` denotes the path to the directory where you have the `.mgf` file you want to sequence.
- I added an example output file, `casanovo_sample_output.csv`, to the repo.
Let me know if you have other questions; otherwise, feel free to close the issue.
from casanovo.
Greetings @melihyilmaz,
This is a really amazing tool! I have successfully started the program, and it is now running. Based on the output file you provided, may I request a feature? My suggestion is to allow a proteomics FASTA database to be provided in the command itself. The program would then match the de novo sequenced peptides against the provided FASTA protein database and append the FASTA header to each matching peptide. This would make it easier to know which protein each peptide is derived from. If a de novo sequenced peptide is absent from the provided protein database, it should be labeled as missing. I understand this is a huge ask, but this enhancement would improve downstream analysis.
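Until such a feature exists, the matching described above can be approximated as a post-processing step. The following is a minimal sketch and not part of Casanovo: the function names `parse_fasta` and `annotate_peptides`, the `missing` label, and the example protein headers are all hypothetical, and it assumes the predicted peptides are plain amino-acid strings with any modification annotations already stripped.

```python
from typing import Dict, List, Tuple

MISSING = "missing"  # hypothetical label for peptides found in no protein


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into a {header: sequence} mapping."""
    proteins: Dict[str, str] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts; drop the leading '>' from the header
            header = line[1:]
            proteins[header] = ""
        elif header is not None:
            # Sequence lines may be wrapped, so concatenate them
            proteins[header] += line
    return proteins


def annotate_peptides(
    peptides: List[str], proteins: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Pair each peptide with the headers of proteins that contain it.

    Matching is an exact substring search; peptides with no match are
    labeled with MISSING.
    """
    annotated = []
    for peptide in peptides:
        hits = [h for h, seq in proteins.items() if peptide in seq]
        annotated.append((peptide, ";".join(hits) if hits else MISSING))
    return annotated


if __name__ == "__main__":
    # Tiny made-up database, for illustration only
    fasta_text = """>protein_A hypothetical example
MKTAYIAKQR
QISFVKSHFS
>protein_B hypothetical example
GSHMLEDPVK
"""
    proteins = parse_fasta(fasta_text)
    for peptide, label in annotate_peptides(["AKQRQIS", "LEDPV", "WWWWW"], proteins):
        print(peptide, label)
```

Exact substring matching is the simplest choice here. Because leucine and isoleucine have the same mass, a real workflow would probably want to treat I and L as equivalent before matching, and a linear scan like this one may be slow for large databases.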
Additionally, I observed the following user warning:
rank_zero_warn("You are running on single node with no parallelization, so distributed has no effect.")
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
c:\users\parth\appdata\local\programs\python\python39\lib\site-packages\pytorch_lightning\trainer\data_loading.py:132: UserWarning: The dataloader, test_dataloader 0, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument (try 24 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.
rank_zero_warn(
Testing: 0it [00:00, ?it/s]
- Could you advise me on how to allocate sufficient CPU to your program (e.g., `--cpu 15`)? Unfortunately, no `--cpu` or `--memory` option is available.
- My computer has a GPU (NVIDIA GeForce GTX 1660 SUPER). Is there a way to use it with your program?
- Also, after the above message appeared, I did not observe any progress for more than 15 minutes. Has the program stalled? Is there a way to check whether it is still running in the background?
Regards,
Ben
Hi @BenSamy2020,
I've been using the program recently and think I can help!
- You can adjust the number of CPUs in the `casanovo/config.py` file, which should be found at `/environment/lib/pythonversion/site-packages/casanovo/config.py`, on line 30.
- Yes. If you are able to run `python3` from the command line, `import torch`, and type `torch.cuda.is_available()` and it returns `True`, that means your environment is configured to recognize your GPU, and so all you need to do is change line 31 in the same file as above to `gpus = [0]`. Then, when you run Casanovo, you should see `GPU available: True, used: True` instead.
- I think the GPU will help lots there. Also, check the `test_batch_size` (line 80 in the same config file) - it's by default set to 1024, so your screen will only update after inferring 1024 peptides. On CPU, that takes a while. So try changing that test batch size to something small and see if you see progress.
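The GPU check described above can also be run as a small script. This is a minimal sketch rather than anything shipped with Casanovo: the helper name `gpu_status` is hypothetical, and it only uses `torch.cuda.is_available()` and `torch.cuda.get_device_name()`. It also reports the case where PyTorch cannot be imported at all.

```python
def gpu_status() -> str:
    """Report whether PyTorch is importable and whether it can see a CUDA GPU."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if torch.cuda.is_available():
        # Device 0 is the index that `gpus = [0]` in the config refers to
        return f"CUDA GPU available: {torch.cuda.get_device_name(0)}"
    return "PyTorch is installed, but no CUDA GPU is visible"


if __name__ == "__main__":
    print(gpu_status())
```

If the last message appears on a machine that does have an NVIDIA card, the installed PyTorch build is likely CPU-only or the GPU driver is not set up, so changing `gpus` in the config would probably not be enough on its own.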
Hope this helps!
from casanovo.
Greetings @guhanrv,
I really appreciate your assistance. Regarding CPUs, I will edit line 30 of the `config.py` file.
Unfortunately, PyTorch is not available on my PC, and I would need to set it up. I am also a wet lab person, so I will have to look up on YouTube or Google how to set it up before tapping into my GPU.
Once again, thanks a lot!
Regards,
Ben