laninsky avatar laninsky commented on July 27, 2024

Hi Omid,

Thanks for raising this issue! I wrote this filter for a dataset where we had quite a few "bad" samples in one of the populations, and the line that failed for you assumed there would always be populations to drop from the dataset because every individual in them had been removed. I think I've fixed the issue - could you re-download GBS_SNP_filter_rsq.R and try again? If it works, I'll close the issue.

Let me know what other bugs you find! Hard to control for accidentally making code specific to one dataset!

Cheers,

Alana

from gbs_snp_filter.

OmidJa avatar OmidJa commented on July 27, 2024

Hi dear Alana,
Thanks for your kind answer and help. I followed your suggestion and downloaded the updated GBS_SNP_filter_rsq.R, and at the same step as yesterday I now get a new error, which is different from the previous one.

Error in matrix(if (is.null(value)) logical() else value, nrow = nr, dimnames = list(rn,  :
  length of 'dimnames' [2] not equal to array extent
Calls: Ops.data.frame -> matrix

Can you please let me know how we can cope with it?
Thank you very much in advance.

Regards,
Omid

laninsky avatar laninsky commented on July 27, 2024

Hi Omid,

Well at least we solved the first problem :)

This second one is a little trickier for me to track down without playing around with the data myself. Would it be possible to get a copy of your original vcf, popmap.txt, and GBS_SNP_filter.txt files you've been using emailed to me?

If this isn't an option for privacy reasons etc let me know - the 'slower' way will be for me to add some checks to the code so we can figure out exactly where it is failing and why.

Thanks for your patience!

Alana

OmidJa avatar OmidJa commented on July 27, 2024

laninsky avatar laninsky commented on July 27, 2024

Awesome, thanks Omid. I've got the invitation to the dropbox link, but it is currently empty - are you in the process of uploading the vcf? GBS_SNP_filter.txt and popmap.txt look absolutely fine, so I'm curious to try and figure out why the code has suddenly stopped working! :)

laninsky avatar laninsky commented on July 27, 2024

Cool, came through on that link, thanks! I'll be back in touch (hopefully in the next day or so) once I figure out what has gone wrong.

OmidJa avatar OmidJa commented on July 27, 2024

laninsky avatar laninsky commented on July 27, 2024

Hi Omid,

Thanks for your file - I'm currently running it through to figure out why it failed at that step, but in the meantime I think I've found another potential issue. The dataset that I wrote this code for was assembled de novo, so the locus identifier was contained in the #CHROM column of the vcf. However, based on the GenBank codes under #CHROM in your file, you must have used a reference-guided assembly approach, right? As written, the code takes just one SNP per scaffold, which I bet is not what you want. You probably want it to allow multiple SNPs per scaffold, and instead use the ID column to identify the RAD/GBS locus and take just one SNP per locus? This shouldn't be too hard to fix, but it will require rewriting the code a smidgen, which will take me a couple of days. I just want to confirm that is how you'd want to use this code before I go too far down the rabbit hole!
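
To make the difference concrete, here is a minimal Python sketch (not the R code from GBS_SNP_filter) of thinning to one SNP per group, first keyed on #CHROM and then keyed on a locus name taken from the ID column. The records, the scaffold name, the function name `thin_to_one_snp_per_locus`, and the rule of keeping the first SNP seen are all assumptions for illustration; the actual script may choose which SNP to keep differently. IDs are assumed to have the form locus_position.

```python
def thin_to_one_snp_per_locus(records, locus_of):
    """Keep the first record seen for each locus key (illustrative rule only)."""
    seen = set()
    kept = []
    for rec in records:
        locus = locus_of(rec)
        if locus not in seen:
            seen.add(locus)
            kept.append(rec)
    return kept

# Hypothetical (CHROM, POS, ID) records: two loci on a single scaffold.
records = [
    ("scaffold_A", 1045, "223111_6"),
    ("scaffold_A", 1080, "223111_41"),
    ("scaffold_A", 98012, "223250_17"),
]

# Keyed on #CHROM: the whole scaffold collapses to a single SNP.
by_chrom = thin_to_one_snp_per_locus(records, lambda r: r[0])

# Keyed on the locus part of ID: one SNP survives per RAD/GBS locus.
by_id = thin_to_one_snp_per_locus(records, lambda r: r[2].rsplit("_", 1)[0])

print(len(by_chrom), len(by_id))
```

With a reference-guided assembly, keying on #CHROM discards the second locus entirely, while keying on the ID-derived locus name keeps one SNP from each.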

Cheers,

Alana

OmidJa avatar OmidJa commented on July 27, 2024

laninsky avatar laninsky commented on July 27, 2024

Hi Omid,

Cool - I'll go ahead and get those changes going then. I think you probably want to ditch everything but one SNP/locus because over the short 100-200bp length of GBS/RADseq loci, those sites are unlikely to be regularly broken up by recombination. If you use STRUCTURE or similar programs that assume that loci are unlinked then you definitely want to ditch SNPs that are likely to be linked. So yes, your plan sounds like a good one. I'll post here again once I have a new version of the code for you to try out.

Cheers,

Alana

laninsky avatar laninsky commented on July 27, 2024

Hi Omid,

I've made a new branch of the repo that lets you specify what you want to consider the "SNP locus" (in your case, the ID column, not the #CHROM column). The big difference from the previous version is that you will have to add two extra lines to GBS_SNP_filter.txt: the column in the vcf file that contains the locus names (in your case, ID), and the regex to strip away anything that isn't the locus name. In your case, the ID column contains the locus name combined with the position within that locus at which the SNP occurred, e.g. 223111_6 is locus 223111 with a SNP at position 6, so the regex to get rid of the trailing _ and number would be _.*
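
As a quick check of what that pattern does, here is a small Python sketch (the filter itself is an R script, so this only illustrates the regex). The helper name `locus_name` and every ID other than 223111_6 are made up for the example.

```python
import re

def locus_name(vcf_id, pattern=r"_.*"):
    """Delete whatever `pattern` matches in a VCF ID, leaving the locus name."""
    return re.sub(pattern, "", vcf_id)

# 223111_6 comes from the thread; the other IDs are hypothetical.
ids = ["223111_6", "223111_41", "223250_17"]
print([locus_name(i) for i in ids])  # ['223111', '223111', '223250']
```

Because _.* matches from the first underscore to the end of the string, it would also truncate locus names that themselves contain an underscore, so the pattern needs to suit the ID format in your own vcf.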

This branch is at the following link if you want to run it yourself (but I've also uploaded the files to the dropbox link you shared with me):
https://github.com/laninsky/GBS_SNP_filter/tree/reference_assembled_options

To demonstrate how to specify the column and regex pattern in the README I've included a very small snippet of your vcf file. Please let me know if this is not OK and I can put some dummy data up instead.

Once I hear back from you I will pull these changes in to the main version of the repo.

Cheers,

Alana

OmidJa avatar OmidJa commented on July 27, 2024

Hello dear Alana,
I hope you are well. I am really grateful for your great help and support, and for the very clear explanations. I checked the new version and it works well on the vcf file that I sent you.
It is no problem at all, and I would be happy for the snippet to serve as an example that helps others understand the process.
Again, I really appreciate all your kind support, and I will get back to you if I find any bugs.

Best regards,
Omid

laninsky avatar laninsky commented on July 27, 2024

Awesome, thanks for all the help providing your dataset to solve this issue, Omid! I've gone ahead and merged the two branches, so the updated code is now available at the master branch (https://github.com/laninsky/GBS_SNP_filter). Cheers!

OmidJa avatar OmidJa commented on July 27, 2024
