Comments (4)
Something like this Perl one-liner:
perl -pe 's/[^AGTC]/N/gi unless m/>/;' old.fa > new.fa
If a line contains a >, leave it alone; otherwise replace all non-AGTC characters with N. The -p option puts an implicit "read + print lines" loop around the file.
from seqtk.
@MrOlm good point! You can also use sed:
% cat seq.fa
>good
AGTCTCTTC
>bad
AGRGAATNC
% sed '/^[^>]/ s/[^AGTC]/N/gi' < seq.fa
>good
AGTCTCTTC
>bad
AGNGAATNC
Unfortunately, seqtk doesn't have this functionality. Perhaps it is best to write a script to do this.
Just a note on the Perl one-liner in this thread: it also replaces newline characters with Ns, because [^AGTC] matches the newline at the end of each line. Adding \n to the character class seems to fix that:
perl -pe 's/[^AGTC\n]/N/gi unless m/>/;' old.fa > new.fa
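For anyone who would rather have a standalone script than a one-liner, here is a minimal Python sketch of the same idea. It is not part of seqtk; the names mask_line and mask_fasta are made up for this example. It assumes header lines start with > (the Perl version skips any line containing > anywhere) and, like the one-liners above, it leaves lowercase a/c/g/t untouched. The line ending is split off before substitution so it is never turned into an N.

```python
import io
import re

# Anything other than A, C, G, T (either case) is treated as ambiguous.
NON_ACGT = re.compile(r"[^ACGT]", re.IGNORECASE)


def mask_line(line: str) -> str:
    """Replace non-ACGT characters with N; header lines pass through unchanged."""
    if line.startswith(">"):
        return line
    # Split off the line ending so it is never replaced by N.
    body = line.rstrip("\r\n")
    ending = line[len(body):]
    return NON_ACGT.sub("N", body) + ending


def mask_fasta(src, dst) -> None:
    """Copy FASTA text from src to dst, masking sequence lines one at a time."""
    for line in src:
        dst.write(mask_line(line))


# Demo on the seq.fa example from this thread.
demo_in = io.StringIO(">good\nAGTCTCTTC\n>bad\nAGRGAATNC\n")
demo_out = io.StringIO()
mask_fasta(demo_in, demo_out)
print(demo_out.getvalue(), end="")
# prints:
# >good
# AGTCTCTTC
# >bad
# AGNGAATNC
```

To run it on real files, pass open file handles instead of the StringIO objects, for example open("old.fa") as src and open("new.fa", "w") as dst.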