Giter Club home page Giter Club logo

macs's People

Contributors

gchen98 avatar jackkamm avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar

macs's Issues

GSR certificate for MaCS

Hi, @gchen98 The GSR team has evaluated MaCS as part of the GSR certification program but we could not reach you through your usc email address. Could you please write to [email protected] and let us know your current contact information? Thanks.

Issue with running many MACS simulations simultaneously

What steps will reproduce the problem?
1. Run many MAC simulations simultaneously (I used sge grid to submit many jobs)


What is the expected output? What do you see instead?
The output of MACs is identical for different runs even when run under 
different demographic parameters.


What version of the product are you using? On what operating system?
Most current version. Linux 2.6.18-194.3.1.el5_lustre.1.8.4 x86_64

Please provide any additional information below.
My best guess is that MACs is confusing output files from different runs (the 
output files start with haplo and tree).


Original issue reported on code.google.com by [email protected] on 28 Jan 2015 at 3:42

TMRCA?

What steps will reproduce the problem?
1. N/A

What is the expected output? What do you see instead?
The times to the most recent common ancestor (TMRCA) for the loci simulated.

What version of the product are you using? On what operating system?
The SVN version downloaded Oct. 2013.

Please provide any additional information below.
I am interested in having the TMRCA for the loci that I simulated. Is there an 
easy way to pull out this from MaCS' output?

Thanks,
Benson

Original issue reported on code.google.com by [email protected] on 24 May 2014 at 5:28

differential mutation rates across whole genome?

Hello,

I am interested in using macs to simulate a handful of whole genomes. Yet, I do 
not see a way to assign different mutation rates to different regions in the 
genome. A brute-force way to do so, I guess, is to simulate each region 
separately, but it will require tons of commands to achieve such goal. Would it 
be possible to assign a file, which specifies the mutation rate (a.k.a a value 
for theta) of each region of interest, similar to the recombination map file 
that macs takes?

Thanks,
PingHsun(Benson)


Original issue reported on code.google.com by [email protected] on 2 May 2013 at 8:47

Question about -h parameter

Hello,

I have a question about the -h parameter. Let k=# trees stored in memory. 
According to the original paper, k= the expected number of recombination events 
in h bases (so, k is fixed along the sequence). However, the README suggests 
that k=the actual number of trees within the last h bases (so, k varies along 
the sequence, depending on how many recombination events actually occurred).

Could you clarify which is correct?

Thanks,
Jack Kamm

Original issue reported on code.google.com by [email protected] on 18 Apr 2013 at 5:45

macs install problem - because of Boost?

What steps will reproduce the problem?
1. make all
2.
3.

What is the expected output? What do you see instead?
macs doesn't 'make'
Get huge error message:

make allg++ -Wall -g -I /Users/garychen/software/boost_1_36_0 -c simulator.cpp
In file included from simulator.cpp:7:
simulator.h:8:29: error: boost/weak_ptr.hpp: No such file or directory
simulator.h:9:31: error: boost/shared_ptr.hpp: No such file or directory
simulator.h:10:34: error: boost/intrusive_ptr.hpp: No such file or directory
simulator.h:11:44: error: boost/random/mersenne_twister.hpp: No such file or 
directory
simulator.h:12:38: error: boost/random/uniform_01.hpp: No such file or directory
In file included from simulator.cpp:7:
simulator.h:42: error: ‘boost’ has not been declared
simulator.h:42: error: expected initializer before ‘<’ token
simulator.h:43: error: ‘NodePtr’ was not declared in this scope
simulator.h:43: error: template argument 1 is invalid
simulator.h:43: error: template argument 2 is invalid
simulator.h:43: error: invalid type in declaration before ‘;’ token
simulator.h:44: error: ‘NodePtr’ was not declared in this scope
simulator.h:44: error: template argument 1 is invalid
simulator.h:44: error: template argument 2 is invalid
simulator.h:44: error: invalid type in declaration before ‘;’ token
simulator.h:45: error: ‘NodePtr’ was not declared in this scope
simulator.h:45: error: template argument 1 is invalid
simulator.h:45: error: template argument 3 is invalid
simulator.h:45: error: invalid type in declaration before ‘;’ token
simulator.h:49: error: ‘boost’ has not been declared
simulator.h:49: error: expected initializer before ‘<’ token
simulator.h:52: error: ‘boost’ has not been declared
simulator.h:52: error: expected initializer before ‘<’ token
simulator.h:54: error: ‘EdgePtr’ was not declared in this scope
simulator.h:54: error: template argument 1 is invalid
simulator.h:54: error: template argument 2 is invalid
simulator.h:54: error: invalid type in declaration before ‘;’ token
simulator.h:57: error: ‘EdgePtr’ was not declared in this scope
simulator.h:57: error: template argument 1 is invalid
simulator.h:57: error: template argument 2 is invalid
simulator.h:57: error: invalid type in declaration before ‘;’ token
simulator.h:64: error: ‘boost’ has not been declared
simulator.h:64: error: expected initializer before ‘<’ token
simulator.h:65: error: ‘EventPtr’ was not declared in this scope
simulator.h:65: error: template argument 1 is invalid
simulator.h:65: error: template argument 2 is invalid
simulator.h:65: error: invalid type in declaration before ‘;’ token
simulator.h:165: error: expected `)' before ‘&’ token
simulator.h:168: error: ISO C++ forbids declaration of ‘NodePtr’ with no 
type
simulator.h:168: error: expected ‘;’ before ‘&’ token
simulator.h:169: error: ISO C++ forbids declaration of ‘NodePtr’ with no 
type
simulator.h:169: error: expected ‘;’ before ‘&’ token
simulator.h:176: error: ‘NodePtr’ has not been declared
simulator.h:186: error: ‘NodePtr’ does not name a type
simulator.h:221: error: ‘EdgePtr’ does not name a type
simulator.h:222: error: ‘EdgePtr’ does not name a type
simulator.h:226: error: ‘EdgePtr’ has not been declared
simulator.h:226: error: ‘EdgePtr’ has not been declared
simulator.h:228: error: ‘EdgePtr’ has not been declared
simulator.h:234: error: ‘EventPtr’ has not been declared
simulator.h:243: error: ‘EventPtr’ does not name a type
simulator.h:244: error: ‘WeakEdgePtr’ does not name a type
simulator.h:412: error: ‘NodePtr’ does not name a type
simulator.h:509: error: ‘boost’ has not been declared
simulator.h:509: error: ISO C++ forbids declaration of ‘uniform_01’ with no 
type
simulator.h:509: error: expected ‘;’ before ‘<’ token
simulator.h:554: error: ‘NodePtr’ does not name a type
simulator.h:565: error: ‘EdgePtr’ does not name a type
simulator.h:568: error: ‘EdgePtr’ does not name a type
simulator.h:588: error: ISO C++ forbids declaration of ‘EdgePtr’ with no 
type
simulator.h:588: error: expected ‘;’ before ‘*’ token
simulator.h:590: error: ISO C++ forbids declaration of ‘NodePtr’ with no 
type
simulator.h:590: error: expected ‘;’ before ‘*’ token
simulator.h:612: error: ‘EdgePtr’ does not name a type
simulator.h:633: error: ‘NodePtr’ has not been declared
simulator.h:633: error: ‘EventPtr’ has not been declared
simulator.h:648: error: ‘EdgePtr’ does not name a type
simulator.h:651: error: ‘EdgePtr’ does not name a type
simulator.h:674: error: ‘EdgePtr’ has not been declared
simulator.h:682: error: ‘EdgePtr’ has not been declared
simulator.h:687: error: ‘EdgePtr’ has not been declared
simulator.h:690: error: ‘EdgePtr’ has not been declared
simulator.h:693: error: ‘EdgePtr’ has not been declared
simulator.h:698: error: ‘EdgePtr’ has not been declared
simulator.h:700: error: ‘EdgePtr’ has not been declared
simulator.h:702: error: ‘EdgePtr’ has not been declared
simulator.h:704: error: ‘EdgePtr’ has not been declared
simulator.h:704: error: ‘EdgePtr’ has not been declared
simulator.h:706: error: ‘NodePtr’ has not been declared
simulator.h:706: error: ‘EdgePtr’ has not been declared
simulator.h:709: error: ‘NodePtr’ has not been declared
simulator.h:709: error: ‘EdgePtr’ has not been declared
simulator.h:717: error: ‘NodePtr’ has not been declared
simulator.h:749: error: expected ‘,’ or ‘...’ before ‘&’ token
simulator.h:749: error: ISO C++ forbids declaration of ‘NodePtr’ with no 
type
simulator.h: In member function ‘bool byNodePtr::operator()(int) const’:
simulator.h:750: error: ‘node1’ was not declared in this scope
simulator.h:750: error: ‘node2’ was not declared in this scope
simulator.h: At global scope:
simulator.h:756: error: expected ‘,’ or ‘...’ before ‘&’ token
simulator.h:756: error: ISO C++ forbids declaration of ‘EventPtr’ with no 
type
simulator.h: In member function ‘bool byEventTime::operator()(int) const’:
simulator.h:757: error: ‘event1’ was not declared in this scope
simulator.h:757: error: ‘event2’ was not declared in this scope
simulator.h: At global scope:
simulator.h:763: error: expected ‘,’ or ‘...’ before ‘&’ token
simulator.h:763: error: ISO C++ forbids declaration of ‘EdgePtr’ with no 
type
simulator.h: In member function ‘bool byBottomNode::operator()(int) const’:
simulator.h:764: error: ‘edge1’ was not declared in this scope
simulator.h:764: error: ‘edge2’ was not declared in this scope
simulator.h: At global scope:
simulator.h:800: error: expected initializer before ‘&’ token
simulator.h:804: error: expected initializer before ‘&’ token
simulator.h:838: error: ‘EdgePtr’ does not name a type
simulator.h:848: error: ‘EdgePtr’ does not name a type
simulator.h: In member function ‘void Node::removeEvent()’:
simulator.h:860: error: ‘class Node’ has no member named ‘event’
simulator.h: At global scope:
simulator.h:864: error: variable or field ‘setEvent’ declared void
simulator.h:864: error: ‘EventPtr’ was not declared in this scope
simulator.h:864: error: ‘event’ was not declared in this scope
simulator.h: In member function ‘double RandNumGenerator::expRV(double)’:
simulator.h:894: error: ‘unif’ was not declared in this scope
simulator.h: In member function ‘double RandNumGenerator::unifRV()’:
simulator.h:898: error: ‘unif’ was not declared in this scope
simulator.cpp: In constructor ‘RandNumGenerator::RandNumGenerator(long 
unsigned int)’:
simulator.cpp:14: error: ‘boost’ has not been declared
simulator.cpp:14: error: expected `;' before ‘mt’
simulator.cpp:15: error: ‘unif’ was not declared in this scope
simulator.cpp:15: error: expected type-specifier before ‘boost’
simulator.cpp:15: error: expected `;' before ‘boost’
simulator.cpp: In destructor ‘RandNumGenerator::~RandNumGenerator()’:
simulator.cpp:19: error: ‘unif’ was not declared in this scope
simulator.cpp: In member function ‘void 
Simulator::readInputParameters(CommandArguments)’:
simulator.cpp:169: error: ‘EventPtr’ was not declared in this scope
simulator.cpp:169: error: expected `;' before ‘wrapper’
simulator.cpp:501: error: ‘wrapper’ was not declared in this scope
simulator.cpp:515: error: ‘wrapper’ was not declared in this scope
simulator.cpp:530: error: ‘wrapper’ was not declared in this scope
simulator.cpp:544: error: ‘wrapper’ was not declared in this scope
simulator.cpp:558: error: ‘wrapper’ was not declared in this scope
simulator.cpp:580: error: ‘wrapper’ was not declared in this scope
simulator.cpp:604: error: ‘wrapper’ was not declared in this scope
simulator.cpp:650: error: ‘wrapper’ was not declared in this scope
simulator.cpp:664: error: ‘wrapper’ was not declared in this scope
simulator.cpp:675: error: request for member ‘push_back’ in ‘* 
pEventList’, which is of non-class type ‘int’
simulator.cpp:675: error: ‘wrapper’ was not declared in this scope
simulator.cpp:698: error: request for member ‘sort’ in ‘* pEventList’, 
which is of non-class type ‘int’
make: *** [simulator.o] Error 1


What version of the product are you using? On what operating system?
Mac OS Lion

Please provide any additional information below.
I have tried getting boost, which is now unzipped with the root directory in my 
include path, but I don't know if I need to do anything else to build or 
install it; I am not a programmer so only have an 'enduser' level of knowledge. 
Boost documentation is no help, maybe you have an easy fix? Also, is there 
anything else in the error message unrelated to boost?
Thanks
Sam

Original issue reported on code.google.com by [email protected] on 2 Nov 2012 at 12:17

[feature enhancement] tbs option

The attached patch adds ms' tbs options to MaCS. I've tried to stay as close as 
possible to ms' behaviour. Exceptions:
* *all* options except -i can be marked as tbs
* number of iterations is optional in MaCS, thus without -i the number of lines 
in stdin will determine number of iterations

I have tested the patch but I can't promise that there aren't any bugs left, so 
please use at your own risk ;-).

Until the project owner incorporates the patch in the official sources (if 
indeed he does so), please contact me if you run into problems with the patch.

Original issue reported on code.google.com by [email protected] on 5 May 2014 at 1:29

Attachments:

Interpreting output

What steps will reproduce the problem?
./macs 100 1e6 -t .001 -r .001 -h 1e2

What is the expected output? What do you see instead?

COMMAND:    ./macs 100 1e6 -t .001 -r .001 -h 1e2
SEED:   1370614464
SITE:   0    0.000184624691    
0.0518626894    0100000000000000010000000000000000110000000100000000100000000000000
000000000100000000000000000000000
SITE:   1    0.000278265467   
0.00288222514   000000000000000000000000000010000000000000000000000000100000000000
0000000000000000000000000000000000
SITE:   2    0.000407719185     
0.449356828 00011000100010001001010100000011000010010000010100010101001011000111
00000011000010001010000101110101
SITE:   3    0.000517977674     
0.443446518 00011000100010001001010100000011000010010000010100010101001011000111
00000011000010001010000101110101
SITE:   4    0.000520646282     
0.021129332 00000000000000000000000000000010000000000000000000000000001010000000
00000000000000000000000000000000
SITE:   5     0.00074230439    
0.0214293894    0000000000000000000000000000000000000000100000000000000000000000000
000000000000000010000000000000000
 and so on

Please provide any additional information below.

Hi Gary,
I am having trouble understanding the third column of the output. I understand 
the first column is the site number and the second is the position (in what 
units?). The last column, each 0 or 1 represents an individual with or without 
the variant. But what is the third column?
In particular, my goal is to find the time a mutation occurs (in generations or 
coalescent units). I don't think ms has this function so I was hoping your 
version might.
Thanks!

Original issue reported on code.google.com by [email protected] on 7 Jun 2013 at 3:32

Issues with Building macs--Compiling

There is an error "In file included from simulator.cpp:7:
./simulator.h:8:9: fatal error: 'boost/weak_ptr.hpp' file not found
#include<boost/weak_ptr.hpp>
^~~~~~~~~~~~~~~~~~~~
1 error generated.
make: *** [simulator.o] Error 1"
when I try to make macs.
Can you help me to solve this problem?

Interpretation of results

Hi,

I ran the simulation as suggested in the readme file
./macs 100 1e6 -T -t .001 -r .001 -h 1e2 -R inputfiles/hotspot.txt -F 
inputfiles/ascertainment.txt 0 2>trees.txt | ./msformatter > haplotypes.txt

Then I look up  haplotypes.txt.

I get lines of kind:
//
[48]((13:0.641735,(((((((75:0.00122728,11:0.00122728):0.000699119,(77:0.00042199
2,0:0.000421992):0.00150441):0.0043
...

what does the number in square brackets mean?

Further on I see haplotypes
110011111000001010001000000001000011011011000000000000011000001...

what do 0s and 1s mean? 
Why are there exactly 2816 symbols in each line?


Kind regards
Dominik

Original issue reported on code.google.com by [email protected] on 29 Aug 2012 at 5:33

MaCS error with population splitting/joining

Hi Gary,

I chatted with you about this a while back, but I'm just following up to say that I think MaCS still isn't quite doing split/join events right.

I'm trying to model quite a complex situation with eight populations, but I've managed to isolate the problem down to the following simple example:

Imagine we have just two populations, 1 and 2. At a given time, we split population 1 into two new populations, which under ms rules should be called populations 1 and 3. Slightly later, we split population 2 into two new populations, which should be called populations 2 and 4. We then join these populations together: first 4 to 1, then 3 to 2, then 2 to 1.

There's nothing very complex in this setup. An example command line would be:

macs 120 1e8 -t 4.2e-05 -r 0.00042 -h 1e3 -I 2 60 60 10 -es 0.00388 1 0.741 -es 0.00391 2 0.741 -ej 0.00394 4 1 -ej 0.00395 3 2 -ej 0.00396 2 1 2>trees.txt | msformatter > test.out

However, this fails with the following error:

Graph Builder begin
Attempting to add edge of population 3 when the number of available population edge pools is only 2. It is recommended that you increase the migration rates and/or number of sampled chromosomes.
Graphbuilder destructor
Simulator caught exception with message:
Data structure integrity error.
Simulator destructor:
Configuration destructor
Program is complete

In my tests, raising the migration rate and sample size doesn't help - even if I raise them to strange levels (e.g., setting the sample size larger than the population size). MaCS either throws the same error, or a different one:

Infinite coalescent time. No migration

This also doesn't make much sense to me, as all populations are quickly joined together into a single population, whose individuals should then rapidly coalesce.

Can you possibly take a look at this and see what's going on?

-Murray

Disagreement between ms and macs when models involve migration

What steps will reproduce the problem?

- Assume the per base theta is 0.001 (Ne=1e4, mutation rate=2.5x10^-8 per 
generation), simulate 1000 sequences with 1e5 bases given a simple 
two-populations (20 samples for each population) split model with/without gene 
flow between them.

1. A model with migration (, where the macs output does NOT agree with the one 
of ms)
    ms: 
          seed: 641499925
          ms 40 1000 -t 100 -I 2 20 20 -n 1 1 -n 2 1 -ma x 2 2 x -ej 0.1 2 1 -r 0.0004 100000
    MaCS:
          macs 40 100000 -s 1371511838 -i 1000 -t 0.001 -I 2 20 20 -h 100 -n 1 1 -n 2 1 -ma x 2 2 x -ej 0.1 2 1 -r 0.0004

2. A model without migration (, where the macs output does agree with the one 
of ms)
    ms: 
          seed:993080870
          ms 40 1000 -t 100 -I 2 20 20 -n 1 1 -n 2 1 -ma x 0 0 x -ej 0.1 2 1 -r 0.0004 100000
    MaCS:
          macs 40 100000 -s 1371512336 -i 1000 -t 0.001 -I 2 20 20 -h 100 -n 1 1 -n 2 1 -ma x 0 0 x -ej 0.1 2 1 -r 0.0004


What is the expected output? What do you see instead?
- The expected theta and/or the distributions of the number of segregating 
sites over the 1000 loci that result from running ms and MaCS should be 
compatible. Yet it seems that MaCS produces much more segregating sites than ms 
does when gene flow is present in the model of interest. The uploaded boxplots 
illustrate the difference (Fig1: with migration; Fig2: without migration).


What version of the product are you using? On what operating system?
- I tested the above MaCS commands in the version 0.5c on a Mac OS X 10.6 
system.


Please provide any additional information below.
- I also gave the same demographic model (with symmetric gene flow between the 
two populations) to the forward simulator, DaDi, to estimate the theta of the 
simulated data sets produced by ms and MaCS as its inputs, separately. The 
averaged estimated theta of the ms data set is ~100, which is expected given 
the above command; however, the averaged estimated theta of the MaCS data set 
is >150, which is 50% more than the expected value. Thus this reproduced the 
difference observed between the two plots I uploaded. I also tested a 
two-populations model with asymmetric migration between them using both ms and 
MaCS, and MaCS also produced much more mutations than ms did. 

I wonder what makes MaCS produce much more mutations. Please help to identify 
if any mistakes I might have made/overlooked.

Thanks,
PingHsun(Benson)

Original issue reported on code.google.com by [email protected] on 18 Jun 2013 at 1:38

Attachments:

runtime crash reported by Shohei Takuno


Hi Gary,

Linux farm.caes.ucdavis.edu 2.6.18-194.32.1.el5 #1 SMP Wed Jan 5 17:52:25 EST 
2011 x86_64 x86_64 x86_64 GNU/Linux
Redhat CentOS release 5.8 (Final)

Boost version
version 1.33.1

Compiler version
gcc version 4.1.2 20080704 (Red Hat 4.1.2-52)

Thank you very much,

shohei


On 2012/08/13, at 10:01, Gary Chen wrote:

> Shohei,
>
> Can you provide me some info?
>
> operating system, version of Boost, compiler version?
>
> Thanks
>
> Gary
>
> On 08/09/2012 09:32 AM, Shohei Takuno wrote:
>>> This line works too.  Did make clean and make help?
>> Yeha, I did make clean and make again, but it doesn't work.
>>
>> Thanks,
>>
>> shohei
>>
>>
>>>
>>> On 08/07/2012 04:19 PM, Shohei Takuno wrote:
>>>> Ummm...  Still macs crashes...
>>>>
>>>> Here is another command line.
>>>> ./macs 200 100000 -t 0.018 -r 0.03 -s 1344375493 -I 2 100 100 -n 1 1.05
>>>> -n 2 0.35 -es 0.01 2 0.2 -en 0.01 3 1.06666666666667 -ej 0.01 2 1 -en
>>>> 0.01 1 1.4 -en 0.0133333333333333 1 0.0163333333333333 -en 0.015 1 1.0
>>>> -ej 0.1 3 1 -h 100
>>>>
>>>> Error message is
>>>> *** glibc detected *** ./macs: free(): invalid next size (fast):
>>>> 0x0000000006616d80 ***
>>>> ======= Backtrace: =========
>>>> /lib64/libc.so.6[0x36a4a70f0f]
>>>> /lib64/libc.so.6(cfree+0x4b)[0x36a4a7136b]
>>>> ./macs[0x40db45]
>>>> ./macs[0x40db77]
>>>> ./macs[0x40dfd0]
>>>> ./macs[0x40e027]
>>>> ./macs[0x416273]
>>>> ./macs[0x417bd0]
>>>> ./macs[0x418acd]
>>>> ./macs[0x40300f]
>>>> ./macs[0x40840e]
>>>> /lib64/libc.so.6(__libc_start_main+0xf4)[0x36a4a1d994]
>>>> ./macs(__gxx_personality_v0+0xb9)[0x402349]
>>>> ======= Memory map: ========
>>>> 00400000-00444000 r-xp 00000000 09:01 256770153
>>>> /home/stakuno/maize/macs/99_test1/macs
>>>> 00643000-00644000 rw-p 00043000 09:01 256770153
>>>> /home/stakuno/maize/macs/99_test1/macs
>>>> 065d9000-0665d000 rw-p 065d9000 00:00 0
>>>> [heap]
>>>> 36a4600000-36a461c000 r-xp 00000000 09:00 17677443
>>>> /lib64/ld-2.5.so
>>>> 36a481c000-36a481d000 r--p 0001c000 09:00 17677443
>>>> /lib64/ld-2.5.so
>>>> 36a481d000-36a481e000 rw-p 0001d000 09:00 17677443
>>>> /lib64/ld-2.5.so
>>>> 36a4a00000-36a4b4d000 r-xp 00000000 09:00 17677445
>>>> /lib64/libc-2.5.so
>>>> 36a4b4d000-36a4d4d000 ---p 0014d000 09:00 17677445
>>>> /lib64/libc-2.5.so
>>>> 36a4d4d000-36a4d51000 r--p 0014d000 09:00 17677445
>>>> /lib64/libc-2.5.so
>>>> 36a4d51000-36a4d52000 rw-p 00151000 09:00 17677445
>>>> /lib64/libc-2.5.so
>>>> 36a4d52000-36a4d57000 rw-p 36a4d52000 00:00 0
>>>> 36a4e00000-36a4e82000 r-xp 00000000 09:00 17677464
>>>> /lib64/libm-2.5.so
>>>> 36a4e82000-36a5081000 ---p 00082000 09:00 17677464
>>>> /lib64/libm-2.5.so
>>>> 36a5081000-36a5082000 r--p 00081000 09:00 17677464
>>>> /lib64/libm-2.5.so
>>>> 36a5082000-36a5083000 rw-p 00082000 09:00 17677464
>>>> /lib64/libm-2.5.so
>>>> 36ade00000-36ade0d000 r-xp 00000000 09:00 17677478
>>>> /lib64/libgcc_s-4.1.2-20080825.so.1
>>>> 36ade0d000-36ae00d000 ---p 0000d000 09:00 17677478
>>>> /lib64/libgcc_s-4.1.2-20080825.so.1
>>>> 36ae00d000-36ae00e000 rw-p 0000d000 09:00 17677478
>>>> /lib64/libgcc_s-4.1.2-20080825.so.1
>>>> 36b4800000-36b48e6000 r-xp 00000000 09:00 19314461
>>>> /usr/lib64/libstdc++.so.6.0.8
>>>> 36b48e6000-36b4ae5000 ---p 000e6000 09:00 19314461
>>>> /usr/lib64/libstdc++.so.6.0.8
>>>> 36b4ae5000-36b4aeb000 r--p 000e5000 09:00 19314461
>>>> /usr/lib64/libstdc++.so.6.0.8
>>>> 36b4aeb000-36b4aee000 rw-p 000eb000 09:00 19314461
>>>> /usr/lib64/libstdc++.so.6.0.8
>>>> 36b4aee000-36b4b00000 rw-p 36b4aee000 00:00 0
>>>> 2ab435674000-2ab435676000 rw-p 2ab435674000 00:00 0
>>>> 2ab435696000-2ab435699000 rw-p 2ab435696000 00:00 0
>>>> 7fff9258e000-7fff925a3000 rw-p 7ffffffe9000 00:00 0
>>>> [stack]
>>>> ffffffffff600000-ffffffffffe00000 ---p 00000000 00:00 0
>>>> [vdso]
>>>> Aborted
>>>>
>>>>
>>>> On 2012/08/07, at 16:02, Gary Chen wrote:
>>>>
>>>>> Odd. The command completes fine on my end.  Here is the output
>>>>> attached.
>>>>>
>>>>> On 08/07/2012 03:07 PM, Shohei Takuno wrote:
>>>>>> ./macs 200 100000 -t 0.018 -r 0.03 -s 3 -I 2 100 100 -n 1 1.05 -n 2
>>>>>> 0.35
>>>>>> -es 0.01 2 0.2 -en 0.01 3 1.06666666666667 -ej 0.01 2 1 -en 0.01 1 1.4
>>>>>> -en 0.0133333333333333 1 0.0163333333333333 -en 0.015 1 1.0 -ej 0.1
>>>>>> 3 1
>>>>>> -h 1000
>>>>> <output.tar.gz>
>>>
>
>

Original issue reported on code.google.com by [email protected] on 13 Aug 2012 at 9:32

Nondeterminism in macs when python bindings are used

What steps will reproduce the problem?
1. Pick any binding method to allow you to call c++ code from python-- I've 
tried swig, cython and boost-python (the last being by far the easiest)
2. Run a simple macs command (e.g., macs 10 1 -t 10 -s 3) and the output will 
differ between the python version of the code and the "pure" c++ version. 
Namely, it will differ not in the number of segregating sites, but in the 
patterns of diversity seen (e.g., pi will be different).
3. In the code, when an Edge object is made, the top and bottom node heights 
will be different when you run macs natively, compared to when you run macs via 
python (as seen by just printing out the node heights and comparing the output)

What is the expected output? What do you see instead?
I would expect to see the same output when you run either via boost-python or 
via the native c++ code when you set the seed. As I see the same nondeterminism 
across 3 different ways of linking python to c++, I'm gathering that this is 
not a problem w/ the language bindings, but in macs code itself. Printing out 
the results of the random number generator shows the *same* results between the 
two different methods of invoking the code.
I have used python bindings with Hudson's ms, and it works great.


What version of the product are you using? On what operating system?
The most recent (used svn just 2 weeks ago). RHEL6 (on two different systems).

Please provide any additional information below.
It isn't really fair to call this a bug, but if you want to really be able to 
do a lot whole genome simulations, having a way to invoke macs as a function, 
and not a program, is pretty essential. EG, having an API would be awesome.

I also am unsure if what I'm seeing is really a bug-- I can't see why you'd get 
nondeterministic output when the RNG is giving the same output on either 
approach, but I can see how floating-point imprecision might also introduce a 
little noise in this scenario. That being said, for instance, when you do the 
above simulation you get different Newick trees out; the shape is exactly the 
same, but the node ordering is different... and I have a hard time chalking 
that up to floating point issues. Honestly, the tree differences won't matter 
for this panmictic simulations I'm doing, but I still need to see if it does 
matter when we start simulating structure. 

To be clear, to get the python bindings to work you do need to tweak the c++ 
code, but the tweaking is very minimal (and each of the 3 approaches above 
involved pretty different tweaks, all leading to the same error). I've tried 
using valgrind and it detects no memory errors (though I don't know how well it 
works with c++. I'm more of a C programmer...)

Again, I'm not sure of what a realistic answer to this post is, unless it's "of 
course there can be non-determinism when you fix the random number seed".

-August

Original issue reported on code.google.com by [email protected] on 3 Oct 2013 at 8:14

recombination rate

Hi Gary,

This is a potentially very obvious question. I'm confused about the -r parameter, which is recombination rate per site per 4N generations.
If I have a species with a recombination rate of 4.83 cM/Mb, then should the unscaled per site r = 4.83e-8 per site per 1 generation?

Thanks,
Ariella

scaling of branches

Hi Gary,

I was wondering how you scale the tree branches, and if you do it like in ms 
(in units of 4N0 generations), I couldn't find any details on that in the 
documentation. 

Thank you,
Paula

Original issue reported on code.google.com by [email protected] on 14 Nov 2012 at 7:48

Ploidy

Hi.

I would like to know if it is possible to set up MaSC to simulate haploid 
sequences? If I understand the tool correctly the parameters are scaled for 
diploids.
I appreciate very much the help and I am sorry if the question is already 
answered elsewhere. 

Fernando

Original issue reported on code.google.com by [email protected] on 3 Jul 2014 at 2:01

error with specifying migration rates

What steps will reproduce the problem?
Run this command:
macs 18 1000000.0 -t 0.0014 -r 0.0004 -I 12 3 3 0 0 0 3 3 0 0 0 3 3 -m 1 2 40.0 
-m 2 1 40.0 -m 1 7 40.0 -m 7 1 40.0 -m 2 3 40.0 -m 3 2 40.0 -m 2 7 40.0 -m 7 2 
40.0 -m 8 7 40.0 -m 7 8 40.0 -m 2 8 40.0 -m 8 2 \
40.0 -m 5 6 40.0 -m 6 5 40.0 -m 6 11 40.0 -m 11 6 40.0 -m 6 12 40.0 -m 12 6 
40.0 -m 11 10 40.0 -m 10 11 40.0 -m 11 12 40.0 -m 12 11 40.0 -em 0.0125 3 8 
40.0 -em 0.0125 8 3 40.0 -em 0.0125 3 4 40.0 -em 0.0125 4 3 40.0 -em 0.0125 4 5 
40.0 -em 0.0125 5 4 40.0 -em 0.0125 3\
 9 40.0 -em 0.0125 9 3 40.0 -em 0.0125 8 9 40.0 -em 0.0125 9 8 40.0 -em 0.0125 4 10 40.0 -em 0.0125 10 4 40.0 -em 0.0125 9 10 40.0 -em 0.0125 10 9 40.0 -em 0.0125 4 9 40.0 -em 0.0125 9 4 40.0 -em 0.0125 5 10 40.0 -em 0.0125 10 5 40.0

What is the expected output? What do you see instead?
INPUT: Sample size is now 18
INPUT: Seq length is now 1e+06
INPUT: Scaled mutation rate is now 1400
INPUT: Scaled recombination rate is now 400
INPUT: Setting chr sampled for pop 1 to 3
INPUT: Setting chr sampled for pop 2 to 3
INPUT: Setting chr sampled for pop 3 to 0
INPUT: Setting chr sampled for pop 4 to 0
INPUT: Setting chr sampled for pop 5 to 0
INPUT: Setting chr sampled for pop 6 to 3
INPUT: Setting chr sampled for pop 7 to 3
INPUT: Setting chr sampled for pop 8 to 0
INPUT: Setting chr sampled for pop 9 to 0
INPUT: Setting chr sampled for pop 10 to 0
INPUT: Setting chr sampled for pop 11 to 3
INPUT: Setting chr sampled for pop 12 to 3
INPUT: Global migration rate to 0
INPUT: At time 0.0125: Mig rate of source pop 3 to dest pop 8 set to 40.0.
INPUT: At time 0.0125: Error, this event is redundant with a previous time.  
Please increment it slightly from 0.0125 to prevent unpredictable results

What version of the product are you using? On what operating system?
Most current version. linux

Please provide any additional information below.
command worked with ms

Original issue reported on code.google.com by [email protected] on 2 Feb 2015 at 9:54

Negative branch lengths

What steps will reproduce the problem?
1./macs 100 3000000 -t 1.0 -T -r .0002 -h 1 >test100Const.tree
2.
3.

What is the expected output? What do you see instead?
I expect to see a sequence of genealogies simulated from SMC'. I see that 
sometimes from one tree to the next one, all the branch lengths change by some 
very small number, so I added a tolerance on my programs. What I am having now 
is some negative branch lengths
?
What version of the product are you using? On what operating system?
I downloaded macs on January this year and my computer is a Mac OS X 10.7.5

Please provide any additional information below.


Original issue reported on code.google.com by [email protected] on 19 May 2014 at 7:38

Question about -ej option in macs

Hello again,

When the option -ej is used, what happens to the size of population j? Is it the same as specified in the command?
In particular, I want to know if these 2 commands will have the same behavior.
Command 1:
macs 10 100000 -t 0.004 -r 0.004 -h 1e5 -I 2 5 5 -n 1 2 -n 2 2 -ej 0.0003 2 1 -en 0.000300001 1 2
-en 0.00005 1 1
Command 2:
macs 10 100000 -t 0.004 -r 0.004 -h 1e5 -I 2 5 5 -n 1 2 -n 2 2 -ej 0.0003 2 1 -en 0.00005 1 1

Thank you

potential bug when simulating with migration

I compared the output between macs and ms under a simple demographic scenario 
(see command lines below). The macs command produces many more segregating 
sites than ms under the same parameters. If I switch off migration, the two 
commands give similar results. Here are some simple bash-command lines to check:

First ms:
$ ms 4 10 -I 2 2 2 -ej 0.4 2 1 -eM 0.2 1 -t 1000 -r 400 1000000 | grep segsites
segsites: 2908
segsites: 2848
segsites: 2701
segsites: 2842
segsites: 2882
segsites: 2703
segsites: 2712
segsites: 2635
segsites: 2985
segsites: 2860

Now macs:
$ for i in {1..10}; do macs 4 1000000 -s $RANDOM -t 0.001 -r 0.0004 -I 2 2 2 
-ej 0.4 2 1 -eM 0.2 1 2> /dev/null | grep TOTAL_SITES; done
TOTAL_SITES:    3848
TOTAL_SITES:    4427
TOTAL_SITES:    3942
TOTAL_SITES:    4345
TOTAL_SITES:    4182
TOTAL_SITES:    4016
TOTAL_SITES:    3716
TOTAL_SITES:    3962
TOTAL_SITES:    4120
TOTAL_SITES:    4545

Now macs without migration:
$ for i in {1..10}; do macs 4 1000000 -s $RANDOM -t 0.001 -r 0.0004 -I 2 2 2 
-ej 0.4 2 1 -eM 0.2 0 2> /dev/null | grep TOTAL_SITES; done
TOTAL_SITES:    2734
TOTAL_SITES:    2722
TOTAL_SITES:    2618
TOTAL_SITES:    2855
TOTAL_SITES:    2759
TOTAL_SITES:    2861
TOTAL_SITES:    2930
TOTAL_SITES:    2599
TOTAL_SITES:    2658
TOTAL_SITES:    2638

and ms without migration:
$ ms 4 10 -I 2 2 2 -ej 0.4 2 1 -eM 0.2 0 -t 1000 -r 400 1000000 | grep segsites
segsites: 3028
segsites: 2435
segsites: 2782
segsites: 2673
segsites: 2997
segsites: 2937
segsites: 2232
segsites: 2547
segsites: 2959
segsites: 2798

I am using version 0.5c, the currently newest version of macs.


Original issue reported on code.google.com by [email protected] on 13 Nov 2013 at 8:21

growth increases runtime?

Hi Gary,

When I added exponential growth to my model, I'm getting a massive increases in runtime.
Naively, I would not expect that to happen. larger growth means, back in time, smaller Ne, which means less coalescent events... smaller Ne means shorter sim time.
The expected number of generations until two samples share a common ancestor is 2N...

Am I thinking about this wrong?

Thanks,
Ariella

Low admixture proportion and small sample sizes cause memory error

What steps will reproduce the problem?
I have experienced a crash when simulating the following model for pop1 size = 
8 and pop2 size = 18:

          __|     t = 0.7
         |   |
         |   |
         |   |
         |   |\    t = 0.07
         |   |  \
         |   |   |
         |   |   |
          \  |   |
            -|   |  f = <admixture>%, t = 0.0035
              |   |
              |   |
 pop  3   1   2

The model includes admixture at t=0.0035, and from that time to the present the 
population's size is zero. The only purpose population 3 serves is to introduce 
a more ancient lineage into population 1.  

Here is a sample command line that crashes:
macs 26 1e7 -i 1 -s 1 -t 0.0008 -r 0.0004 -h 1e3 -I 2 8 18 -n 1 0.5 -n 2 0.5  
-es 0.0035 1 0.3 -en 0.0035 3 1.0  -ej 0.07 2 1 -en 0.07 1 1.0  -ej 0.7 3 1 -en 
0.7 1 1.0

The error becomes more likely with:  

What is the expected output? What do you see instead?
Macs works as normal until a sudden memory error. A portion of the output file 
is written. Example error output is attached.

What version of the product are you using? On what operating system?
I am using macs 0.4e on Ubuntu.

Original issue reported on code.google.com by [email protected] on 9 Jul 2012 at 5:23

Attachments:

manual?

Have I somehow missed a link to the manual for this program?


Original issue reported on code.google.com by [email protected] on 22 Aug 2012 at 12:19

More SNPs than sites

Hi Gary

I'm running Macs 3 on Mac OSX, trying to simulate a very recent severe 
population bottleneck using:

./macs 26 100000 -t 0.001 -r 0.01 -h 1e2 -eN 0.000011 1000

The output gives 381867 segsites, which is more than the sequence length.

Am I missing something?

Cheers

Sam

Original issue reported on code.google.com by [email protected] on 14 Mar 2014 at 3:42

Proper order of populations for specifying migration

For assigning migration between two populations in MaCS, if I want to have some migration (rate R) flow from population 1 to population 2 (i.e. introgression into population 2) at time T, is the correct command:
-em T 2 1 R ?

specifying proportion for splitting command

Hi Gary,

I just wanted to clarify the role of the 'p' value when using -es in macs. In the documentation it describes it as the proportion of chromosomes that move from pop i to pop i+1. So if I have a command macs ... I 2 1 1 ...-es 0.001 1 0.25..., this means that 25% of the chromosomes from pop1 will be moved to pop3?

This is different from how -es functions in ms, where the 'p' value specifies the probability of a chromosome ending up in population i. i.e. -es 0.001 1 0.25 in ms means that a chromosome from pop1 has a 25% probability of remaining in pop1, and a 75% probability of being moved to pop3?

Thank you.

Reduced variation in new MaCS version (as of 13th of Dec)


What steps will reproduce the problem?
1. Run "old" and new version of MaCS with the shell scrip bellow to generate 
haplotype file
2. Run the R script bellow on haplotype file to compute site specific allele 
frequencies and plot them

What is the expected output? What do you see instead?
With previous version I got much more variation - allele frequencies covered 
whole spectrum [0-1], while now values are very uniform. In the attached plots 
I show this frequencies (y axis) versus site index (x axis). I also noticed 
that in the haplotype file there are now a lot of haplotypes with only zero 
alleles. In some cases (I can not reproduce this) I saw for the very first 
haplotype some other values than 0 and 1 in the first few sites.

What version of the product are you using? On what operating system?
Version as of 13th od Dec checked out from Google Code and compiled on Ubunut 
Linux 12.04. I do not know the version of old binary.

Please provide any additional information below.

#-------------------- START SHELL SCRIPT ---------------------------------------

#!/bin/bash 

# Set up number of haplotypes
nHaps=400

# Call MaCS
./macs $nHaps        \
       810000000     \
       -t 0.00000036 \
       -r 0.00000036 \
       -eN 0.05 2 -eN 0.1 4 -eN 0.15 6 -eN 0.2 8 -eN 0.25 10 -eN 0.3 12 -eN 0.35 14 -eN 0.4 16 -eN 0.45 18 -eN 0.5 20 -eN 1 40 -eN 2 60 -eN 3 80 -eN 4 100 -eN 5 120 -eN 10 140 -eN 20 160 -eN 30 180 -eN 40 200 -eN 50 240 -eN 100 320 -eN 200 400 -eN 300 480 -eN 400 560 -eN 500 640 1>output.txt 2>debug.txt 

# Format as in ms
cat output.txt | ./msformatter > haplotypes.txt 

# Tear appart the output
# ... header
tail -n +6 haplotypes.txt > Intermediate.txt
# ... haplotypes
tail -n +2 Intermediate.txt > MacsHaplotypes.txt
# ... map
head -n 1 Intermediate.txt | sed -e 's/positions: //' > PhysicalMapInput.txt
# ... number of segregating sites
grep segsites haplotypes.txt | sed  -e 's/segsites: //' > SegSites.txt
# ... get nice format with spaces
sed -e 's/0/ 0/g' -e 's/1/\ 1/g' MacsHaplotypes.txt > tmp
# ... save outgroup haplotypes in two files
head -n $nHaps tmp > MacsHaplotypes.txt

#-------------------- STOP  SHELL SCRIPT ---------------------------------------

#-------------------- START R SCRIPT -------------------------------------------

hap1 <- read.table(file="MacsHaplotypes.txt")
af1 <- colMeans(hap1)

png(file="AFplot.png", width=480*3, height=480*3)
plot(af1, pch=19, cex=0.5, col=rgb(178/255, 34/255, 34/255, alpha=1), ylim=c(0, 
1))
dev.off()

png(file="AFhist.png", width=480*3, height=480*3)
hist(af1, xlim=c(0, 1))
dev.off()

#-------------------- STOP  R SCRIPT -------------------------------------------

Original issue reported on code.google.com by [email protected] on 3 Jan 2013 at 10:50

Attachments:

recombination chunk of length infinity

In macs, it is possible to draw a distance of +infinity to the next 
recombination breakpoint, and hence for there to be no recombination for most 
or all of an entire sequence (which happened to me in one simulation).

My apologies, but this problem is difficult to reproduce, since I have slightly 
modified my version of macs. However, this error occurs with probability 2^-32 
per recombination chunk (see below), so should also affect other users. I can 
send you my modified macs if you wish -- there are only ~5 lines of code 
modified, but they do affect the order of calls to the random number generator.

The cause of the error is as follows. The distance to the next recombination 
breakpoint is gotten by drawing from an exponential distribution, which in turn 
is gotten by drawing u ~ Uniform[0,1), and transforming to log(u) / Lambda. 
However, the random number generator that macs uses has a 2^-32 probability of 
returning exactly 0, which will cause log(u) / Lambda = Infinity.

Replacing log(u) / Lambda with log(1 - u) / Lambda will fix this problem, since 
the random number generator boost:uniform_01 will never return exactly 1. This 
change can be made at the end of simulator.h.

Original issue reported on code.google.com by [email protected] on 8 May 2013 at 5:18

Precision in MaCS

Hello @gchen98,

I was wondering if there was any way of increasing the precision (say, to 10 significant digits) of the coalescent times in the trees produced from MaCS?

Thanks,
Ismael

error with 'tbs' flag

I ran macs with the following command:
macs 4 1e6 -t .004112508 -r .004112508 -h 1e2 -I 2 2 2 0 -n 1 .5153702 -eM tbs 
tbs -eM tbs 0 -ej 10.26451 2 1 <priors

I received the following error:

COMMAND:        /home/smalls/programs_that_work/macs/macs 4 1e6 -t .004112508 
-r .004112508 -h 1e2 -I 2 2 2 0 -n 1 .5153702 -eM tbs tbs -eM tbs 0 -ej 
10.26451 2 1
INPUT: Sample size is now 4
INPUT: Seq length is now 1e+06
INPUT: Scaled mutation rate is now 4112.51
INPUT: Scaled recombination rate is now 4112.51
INPUT: Base pairs to track is 100
INPUT: Setting chr sampled for pop 1 to 2
INPUT: Setting chr sampled for pop 2 to 2
INPUT: Global migration rate to 0
INPUT: Pop 1 has size: 0.51537
INPUT: At time 0: Global migration rate is 0
INPUT: At time 0: Error, this event is redundant with a previous time.  Please 
increment it slightly from 0 to prevent unpredictable results
Simulator destructor:
Configuration destructor
[1]    13988 segmentation fault  ~/programs_that_work/macs/macs 4 1e6 -t 
.004112508 -r .004112508 -h 1e2 -I 2

It would seem that macs is not recognizing the tbs flag. Is there another way 
of reading in parameter values in macs for use in ABC?

Original issue reported on code.google.com by [email protected] on 31 Dec 2013 at 4:19

Migration Parameter

Although the format of the input for MaCS is similar to that for ms, there are 
some differences (e.g. in the scaling of the mutation and recombination rates). 
The documentation does not state in what format the migration parameter M for 
an n-island model should be given. In ms, M=4Nm where N is effective population 
size of one island and m is the proportion of inhabitants of an island that 
come from other islands each generation. Is this also the case in MaCS?

Original issue reported on code.google.com by [email protected] on 1 Aug 2014 at 1:11

Population split and merge (-es, -ej) indexing

What steps will reproduce the problem?

./macs 2 100000 -I 2 1 1 -es 0.01 1 0.05 -ej 0.05 3 2

What is the expected output? What do you see instead?

This run terminates because of an indexing error:

------------------------------
INPUT: Sample size is now 2
INPUT: Seq length is now 100000
INPUT: Setting chr sampled for pop 1 to 1
INPUT: Setting chr sampled for pop 2 to 1
INPUT: Global migration rate to 0
INPUT: At time 0.01: Population 1 splits at proportion 0.05
INPUT: At time 0.05: WARNING: The pop IDs used in pop join is greater than the 
number specified in -I.  You must have a split event before this join event.
Population 3 will merge with 2
SEED:   1357694700
Debugging: 0

Graph Builder begin
Simulator caught exception with message:
Infinite coalescent time. No migration
Program is complete
----------------------



What version of the product are you using? On what operating system?
Mac OS 10.6.8, macs-0.4e

Please provide any additional information below.

In the command README that appears when you type ./macs into the terminal, the 
-es command is described as follows:

-es <t> <i> <p> (Split two populations.  At time t, a proportion p of 
chromosomes from pop i will migrate to a population i+1.

This is different from how -es works in Hudson's MS: at time t, if there exist 
n total populations and you split population i into two parts, the new 
population index is n+1, not i+1. I wrote the above command line assuming the 
discrepancy was a typo and that the indexing works the same as in MS, but it 
didn't seem to work that way. However, the following command line assumes that 
-es is correctly described in the macs manual, and it doesn't work either:

-------------------
COMMAND:        ./macs 2 100000 -I 2 1 1 -es 0.01 2 0.05 -ej 0.05 3 1
INPUT: Sample size is now 2
INPUT: Seq length is now 100000
INPUT: Setting chr sampled for pop 1 to 1
INPUT: Setting chr sampled for pop 2 to 1
INPUT: Global migration rate to 0
INPUT: At time 0.01: Population 2 splits at proportion 0.05
INPUT: At time 0.05: WARNING: The pop IDs used in pop join is greater than the 
number specified in -I.  You must have a split event before this join event.
Population 3 will merge with 1
SEED:   1357695229
Debugging: 0

Graph Builder begin
Simulator caught exception with message:
Infinite coalescent time. No migration
Program is complete
-------------------------

Thanks so much for your help with this issue!
-Kelley Harris




Original issue reported on code.google.com by [email protected] on 9 Jan 2013 at 1:47

units of exponential growth

Hi Gary,

I'm using macs to run simulations for Approximation Bayesian Computation to infer demographic parameters. Part of my model is exponential growth. I'm confused about what units the growth rate parameter is in.
Take a look at this overleaf for more details on my question.
https://www.overleaf.com/9631592trgnjyzqfhvz

Thanks,
Ariella

Simulate admixture with two source population at time T

Dr. Chen,

I want to simulate a single event of admixture with two source populations 
using macs. I attach a pdf with a picture of the model. 

I have used macs previously but I am having trouble writing the correct 
command. It would seem that the -es option would be useful, but it only takes 
into account admixture from one population, instead of two. 

Any help would be appreciated. 

Consuelo Quinto 



Original issue reported on code.google.com by [email protected] on 15 Feb 2015 at 3:45

Attachments:

sequence length reported by Stephen Schiffels



Hi Gary,

my name is Stephan Schiffels, I am a postdoc in Richard Durbin's lab at the 
Wellcome Trust Sanger Institute in Cambridge, UK.

I noticed a problem with the MaCS simulator. The lengths of all the output 
trees do not add up to the given length of the chromosome. Also, many 
segregating sites that are output between two trees are found to have a 
position that actually falls outside the given tree segment.

I attached a simple awk-script, that shows the error. You can check it with - 
for example - this command in a bash shell

M=3;L=1000000;mu=0.002732;rho=0.000452;
macs $M $L -t $mu -r $rho -T 2>/dev/null | awk -v L=$L -f checkMacs.awk


Related to these problems (and how I actually found it) is the fact that the 
density of mutations and the given local tMRCA decorrelate when you go further 
down the chromosome.

What I could imagine as a possible source of error is a discretization problem 
in the output lengths of the trees. So for example, is it possible, that you 
actually generate the tree lengths from an underlying continuous model, and 
then you discretize them by just converting numbers to ints? That would explain 
why the total lengths of all trees adds up to a number that is roughly the 
lenght of the chromosome minus half the number of trees. And it also explains 
why sites are more and more decorrelated with the actual trees the further you 
go down the chromosome (this is not shown in the script).

Maybe I am just doing something fundamentally wrong, in any case I would be 
very happy to get comments from you. MaCS has been an extremely useful program 
in our group, and the error I described doesn't change anything fundamental 
about the distribution of segregating sites. But it does play an important role 
once you do an analysis that connects positions of trees and sites.

Thanks a lot and best regards,

Stephan


Original issue reported on code.google.com by [email protected] on 13 Aug 2012 at 5:05

Running macs with negative -G

Hi Gary

Thanks for your help in getting macs set up on my Mac last week. I am now 
trying to simulate a chromosome under different demographic scenarios. All goes 
ok until I try to simulate an exponentially shrinking population using a 
negative -G (as you would in ms) . In this case the run either hangs, or aborts 
with errors like this:

Tree:112,pos:2.19213e-05,len:1.19761,TMRCA:0.11807,ARG:,len:1.60599,TMRCA:0.1343
24
proposed grandMRCA != top Node
proposed grandMRCA != top Node
iHistoryMax: 88
Tree:113,pos:2.19539e-05,len:1.21005,TMRCA:0.11807,ARG:,len:1.63406,TMRCA:0.1343
24
proposed grandMRCA != top Node
proposed grandMRCA != top Node
iHistoryMax: 89
Tree:114,pos:2.33874e-05,len:1.23026,TMRCA:0.11807,ARG:,len:1.64076,TMRCA:0.1343
24
proposed grandMRCA != top Node
proposed grandMRCA != top Node
proposed grandMRCA != top Node
iHistoryMax: 90
Tree:115,pos:2.47017e-05,len:1.26891,TMRCA:0.11807,ARG:,len:1.63976,TMRCA:0.1343
24
proposed grandMRCA != top Node
proposed grandMRCA != top Node
proposed grandMRCA != top Node
iHistoryMax: 91
Tree:116,pos:2.48832e-05,len:1.26891,TMRCA:0.11807,ARG:,len:1.6309,TMRCA:0.13432
4
proposed grandMRCA != top Node
iHistoryMax: 92
Assertion failed: (px != 0), function operator->, file 
/Users/samanthaoloughlin/boost_1_51_0/include/boost/smart_ptr/shared_ptr.hpp, 
line 424.
Abort trap: 6

In this case the command used was:
./macs 26 41e6 -T -t 0.008 -r 0.08 -h 1e2 -G -8.605951 -eN 0.607360 1.49  > 
outfile

Exactly the same command with a positive value for -G works fine.

Is there a fix for this?

Cheers

Sam

What steps will reproduce the problem?
1.
2.
3.

What is the expected output? What do you see instead?


What version of the product are you using? On what operating system?


Please provide any additional information below.

Original issue reported on code.google.com by [email protected] on 5 Nov 2012 at 11:47

is there anything wrong with -ej/es options?

What steps will reproduce the problem?
./macs 2 100000 -t 0.001 -r 0.001 -es 0.01 1 0.05 -ej 0.1 2 1

What is the expected output? What do you see instead?
COMMAND:    ./macs 2 100000 -t 0.001 -r 0.001 -es 0.01 1 0.05 -ej 0.1 2 1
INPUT: Sample size is now 2
INPUT: Seq length is now 100000
INPUT: Scaled mutation rate is now 100
INPUT: Scaled recombination rate is now 100
INPUT: At time 0.01: Population 1 splits at proportion 0.05
INPUT: At time 0.1: WARNING: The pop IDs used in pop join is greater than the 
number specified in -I.  You must have a split event before this join event.
Population 2 will merge with 1
SEED:    1361513204
Debugging: 0

Graph Builder begin
Time for build prior tree: 0 seconds
Tree:0,pos:0,len:1.60643,TMRCA:0.803213,ARG:,len:1.60643,TMRCA:0.803213
Source pop and total pops are 1,1 in POP JOIN event at history 1. It is 
recommended that you increase the migration rates and/or number of sampled 
chromosomes.
Graphbuilder destructor
Simulator caught exception with message:
Invalid data structure
Simulator destructor:
Configuration destructor
Program is complete


What version of the product are you using? On what operating system?
latest version, macs-0.5b.tar.gz

Please provide any additional information below.
I have a population, first split and then join,
the program seems to have trouble doing this?

Original issue reported on code.google.com by [email protected] on 22 Feb 2013 at 2:25

smaller variance of summary statistic in macs compared to that in ms data

What steps will reproduce the problem?
- Assume the per base theta is 0.001 (Ne=1e4, mutation rate=2.5x10^-8 per 
generation), simulate 1000 sequences with 1e5 bases given a simple 
two-populations (20 samples for each population) split model with gene flow 
between them (same as listed in issue 17).

1. ms command: 
          seed: 641499925
          ms 40 1000 -t 100 -I 2 20 20 -n 1 1 -n 2 1 -ma x 2 2 x -ej 0.1 2 1 -r 0.0004 100000
2. MaCS command:
          macs 40 100000 -s 1371511838 -i 1000 -t 0.001 -I 2 20 20 -h 100 -n 1 1 -n 2 1 -ma x 2 2 x -ej 0.1 2 1 -r 0.0004

What is the expected output? What do you see instead?
- The variance of the number of segregating sites (S) over the 1000 loci that 
result from running ms and MaCS should be compatible, as it been shown in the 
supplementary information of the MaCS original paper. Yet it seems that MaCS 
produces a much smaller variance of S than ms does. In my case, the standard 
deviation for S in the MaCS data set is only half of that in the ms data set. 
The same pattern is also observed under a model without migration and/or with a 
variety of values for the -h switch. Please refer to the Fig2 attached in the 
issue 17 for an idea of the difference in variance.

What version of the product are you using? On what operating system?
- I tested the above MaCS commands in the last version (via svn repository) on 
a Mac OS X 10.6 system.

I wonder what makes MaCS produce much less variation. Please help identify 
things that I might have made/overlooked.

Thanks,
PingHsun(Benson)


Original issue reported on code.google.com by [email protected] on 19 Jul 2013 at 1:15

memory crash when modeling admixture

From:

Katie Cunningham
Gutengroup
University of Arizona

Dear Dr. Chen,

I am trying to simulate the following model for chromosome-sized lengths in 
MaCS:

    |    |    t = 0.7
   / /|  |
  | | | 1|
  |1| |  |
  | | |  \
  | | |   \   t = 0.07
  | | | |\ \
  | | |.| |.|
  | | |5| |5|
   \ \| | | |
    --, | | | f = <admixture>%, t = 0.0035
      | | | |
      | | | |
       B   Y

I have created what I believe is the corresponding MaCS core code:
-I 2 8 18 -n 1 0.5 -n 2 0.5  -es 0.0035 1 <admixture> -en 0.0035 3 1.0  -ej 
0.07 2 1 -en 0.07 1 1.0  -ej 0.7 3 1 -en 0.7 1 1.0

However, MaCS crashes with a memory error for the following input parameters:
macs 26 <length> -i <iterations> -t 0.0008 -r 0.0004 -h 1e5 -I 2 8 18 -n 1 0.5 
-n 2 0.5  -es 0.0035 1 <admixture> -en 0.0035 3 1.0  -ej 0.07 2 1 -en 0.07 1 
1.0  -ej 0.7 3 1 -en 0.7 1 1.0

The crash appears to be influenced by the randomness in the simulation, and it 
is more or less likely to occur with different parameter values.
The crash is more likely for longer length, more iterations, and lower 
admixture proportion. 

As an example, the crash happens very often for length = 1e7, admixture = 0.3, 
and iterations = 1.
macs 26 1e7 -i 1 -t 0.0008 -r 0.0004 -h 1e5 -I 2 8 18 -n 1 0.5 -n 2 0.5  -es 
0.0035 1 0.3 -en 0.0035 3 1.0  -ej 0.07 2 1 -en 0.07 1 1.0  -ej 0.7 3 1 -en 0.7 
1 1.0

An example of the most typical error message is attached, obtained from an 
Intel Core i7 (2.80 GHz, 8Gb RAM) running Ubuntu.

Do you know of any way around this error, to obtain simulations of the model 
above?

In addition, I'd like to bring to your attention to the fact that msformatter 
prints the last token on command line twice when it writes the first line of 
the output file.

Thank you,
Katie Cunningham
Gutengroup
University of Arizona


Here is a command line that crashes, with seed 1. Admixture proportion
is 0.1.

macs 26 1e7 -i 1 -s 1 -t 0.0008 -r 0.0004 -h 1e5 -I 2 8 18 -n 1 0.5 -n 2
0.5  -es 0.0035 1 0.1 -en 0.0035 3 1.0  -ej 0.07 2 1 -en 0.07 1 1.0  -ej
0.7 3 1 -en 0.7 1 1.0 

Original issue reported on code.google.com by [email protected] on 6 Jul 2012 at 6:53

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.