
#StochHMM - A Flexible hidden Markov model application and C++ library.


#Introduction

StochHMM is a free, open-source C++ library and application that implements HMMs from simple text files. It implements the traditional HMM algorithms and adds flexibility by allowing researchers to integrate additional data sources and applications into the HMM framework.

For documentation on model syntax and designing a model, see the GitHub wiki.

###http://www.github.com/KorfLab/StochHMM/wiki

Update: Comparison between StochHMM, Mamot, R HMM, and HMMoc

###Download version 0.36: https://github.com/KorfLab/StochHMM/archive/master.zip

##Integrating Data

Here are a few of the ways that StochHMM allows users to integrate additional data sources:

  1. Multiple Emission States
  2. Weighting or Explicitly Defining State paths on a sequence
  3. Linking States Emissions/Transitions to external user-defined functions

##Multiple Emission States

StochHMM allows the user to provide multiple sequences, which are then handled by the emissions. These sequences can be REAL numbers or discrete characters/words. Each state can have many emissions (discrete or continuous). Discrete emissions can be independent of each other or joint distributions. Continuous emissions can be treated in two ways: 1) as raw probabilities, which are integrated without transformation, or 2) as values to be plugged into a univariate probability distribution function (PDF), or into a multivariate PDF in the case of multiple REAL sequences.

Each state's emissions are user-defined, so one state may have emissions from two different sequences, while another may have only a single emission from a single sequence.
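As a rough illustration of how a state with one discrete and one continuous emission could score a position, the sketch below adds the log-probability of a character to the log-density of a normal PDF evaluated on a REAL-valued track. The names `TwoTrackState`, `logNormalPdf`, and `logEmission` are hypothetical and are not part of the StochHMM API; the sketch also assumes the two emissions are independent.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <map>

// Hypothetical illustration only: none of these names come from StochHMM.

// Log-density of a univariate normal distribution evaluated at x.
double logNormalPdf(double x, double mean, double sd) {
    assert(sd > 0.0);  // the standard deviation must be positive
    const double pi = 3.14159265358979323846;
    const double z = (x - mean) / sd;
    return -0.5 * z * z - std::log(sd) - 0.5 * std::log(2.0 * pi);
}

// A state with one discrete emission (over characters) and one continuous
// emission (a normal PDF over a REAL-valued sequence).
struct TwoTrackState {
    std::map<char, double> discrete;  // P(symbol | state)
    double mean;                      // mean of the continuous emission
    double sd;                        // standard deviation of the continuous emission

    // Independent emissions combine by multiplying probabilities,
    // which is the same as adding log-probabilities.
    double logEmission(char symbol, double value) const {
        const auto it = discrete.find(symbol);
        if (it == discrete.end() || it->second <= 0.0) {
            // Symbol cannot be emitted by this state.
            return -std::numeric_limits<double>::infinity();
        }
        return std::log(it->second) + logNormalPdf(value, mean, sd);
    }
};
```

A second state in the same model could use only the discrete table, which mirrors the point that each state chooses its own emissions.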

##Weighting or Explicitly Defining State paths to follow on a sequence.

Often, we have some prior knowledge about the sequence. If so, we may want to integrate that knowledge into the model without redesigning or retraining it (a time-consuming endeavor). StochHMM allows the user to explicitly define a state path (by state name or by state category). In addition, StochHMM allows the user to weight a state path (by state name or by a user-defined state category). This allows the user to restrict the predicted path or to weight it according to their prior knowledge.
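One way to picture path weighting is a Viterbi recursion that adds a per-position, per-state log-weight to the usual score. A weight of 0 leaves a state untouched, a negative weight discourages it, and negative infinity forbids it at that position. This is a minimal sketch under those assumptions; `weightedViterbi` and its argument layout are hypothetical and do not describe how StochHMM implements the feature.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical sketch, not the StochHMM implementation.
// logInit[s]      : log-probability of starting in state s
// logTrans[r][s]  : log-probability of moving from state r to state s
// logEmit[t][s]   : log-probability of the observation at position t in state s
// logWeight[t][s] : user-supplied log-weight for state s at position t
std::vector<std::size_t> weightedViterbi(const std::vector<double>& logInit,
                                         const Matrix& logTrans,
                                         const Matrix& logEmit,
                                         const Matrix& logWeight) {
    const std::size_t T = logEmit.size();
    const std::size_t S = logInit.size();
    assert(T > 0 && S > 0 && logWeight.size() == T);
    const double negInf = -std::numeric_limits<double>::infinity();

    Matrix score(T, std::vector<double>(S, negInf));
    std::vector<std::vector<std::size_t>> back(T, std::vector<std::size_t>(S, 0));

    for (std::size_t s = 0; s < S; ++s) {
        score[0][s] = logInit[s] + logEmit[0][s] + logWeight[0][s];
    }
    for (std::size_t t = 1; t < T; ++t) {
        for (std::size_t s = 0; s < S; ++s) {
            // Best predecessor for state s at position t.
            for (std::size_t r = 0; r < S; ++r) {
                const double cand = score[t - 1][r] + logTrans[r][s];
                if (cand > score[t][s]) {
                    score[t][s] = cand;
                    back[t][s] = r;
                }
            }
            // The weight enters exactly like an extra emission term.
            score[t][s] += logEmit[t][s] + logWeight[t][s];
        }
    }

    // Trace back from the best final state.
    std::size_t best = 0;
    for (std::size_t s = 1; s < S; ++s) {
        if (score[T - 1][s] > score[T - 1][best]) best = s;
    }
    std::vector<std::size_t> path(T);
    path[T - 1] = best;
    for (std::size_t t = T - 1; t > 0; --t) {
        path[t - 1] = back[t][path[t]];
    }
    return path;
}
```

Setting every weight to 0 recovers the ordinary Viterbi path, so the prior knowledge is layered on top of the model without retraining it.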

##Linking States Emissions or Transitions to external user-defined functions

When a linked transition or emission is evaluated, the function is called and can provide an emission. This may provide one way of addressing a weakness of HMMs, namely that they do not handle long-range dependencies, but we see it rather as a way to link together existing utilities or functions that provide additional information to the decoding algorithms. In this way, we can link divergent datasets or functions within the HMM trellis in order to arrive at a better prediction.
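The idea of calling an external function during evaluation can be sketched with a small name-to-function table. Everything here is an assumption made for illustration: `FunctionRegistry`, `ExternalFunction`, and `gcFraction` are hypothetical names, and the signature (full sequence plus current position) is only one plausible choice, not the signature StochHMM expects.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical illustration only: not the StochHMM user-function interface.
// An external function sees the whole sequence and the current position,
// so it is free to look at context far away from that position.
using ExternalFunction = std::function<double(const std::string&, std::size_t)>;

class FunctionRegistry {
public:
    // Register a function under the name a model file would refer to.
    void add(const std::string& name, ExternalFunction fn) {
        assert(static_cast<bool>(fn));  // reject empty std::function objects
        table_[name] = std::move(fn);
    }

    // Called when a linked emission or transition is evaluated.
    double evaluate(const std::string& name, const std::string& seq,
                    std::size_t pos) const {
        const auto it = table_.find(name);
        if (it == table_.end()) {
            throw std::out_of_range("unknown function: " + name);
        }
        return it->second(seq, pos);
    }

private:
    std::map<std::string, ExternalFunction> table_;
};

// Example external function: fraction of G/C characters in a window of up
// to five characters centred on pos.
double gcFraction(const std::string& seq, std::size_t pos) {
    const std::size_t half = 2;
    const std::size_t begin = pos > half ? pos - half : 0;
    const std::size_t end = std::min(seq.size(), pos + half + 1);
    if (begin >= end) return 0.0;
    std::size_t gc = 0;
    for (std::size_t i = begin; i < end; ++i) {
        if (seq[i] == 'G' || seq[i] == 'C') ++gc;
    }
    return static_cast<double>(gc) / static_cast<double>(end - begin);
}
```

After `registry.add("GC", gcFraction)`, a decoder could call `registry.evaluate("GC", seq, pos)` at each position and fold the returned value into its score.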


#Features

##Brief list of features implemented in StochHMM:

  • General settings within Hidden Markov Models
    1. User-defined HMM model via simple human readable text file
    2. User-defined Alphabet
    3. User-defined Ambiguous Characters
  • States
    1. Emissions
      • Multiple emission states (Discrete / Continuous)
      • Independent (Single or Multiple Discrete)
      • Joint Distribution (Multiple Discrete)
      • Univariate PDF (Single Sequence - Continuous)
      • Multivariate PDF (Multiple Sequence - Continuous)
      • Linkable to user-defined function
    2. Transitions
      • Standard Transitions
      • Lexical Transitions (Single or multiple emission)
      • (Preliminary Support) Explicit Duration Transitions
      • Linkable to user-defined functions
  • Decoding
    1. Traditional Decoding Algorithms
      • Forward/Backward/Posterior
      • Viterbi
      • N-best Viterbi
    2. Stochastic Sampling Decoding Algorithms
      • Stochastic Forward
      • Stochastic Viterbi
      • Stochastic Posterior
  • Decoding Traceback Path output formats
    • State Path Index
    • State Path Label
    • GFF
    • Hit Table (Stochastic Algorithms)
    • Posterior Probability Table

#Developers

##Korf Lab

Korf Lab, Genome Center, University of California, Davis

##For suggestions or support:


#Code Documentation

Documentation for the C++ code can be found at StochHMM Doxygen Documentation

Documentation on the Model files can be found at StochHMM Github Wiki


#References:

  1. Schroeder, D.I., Blair J.D., Lott P., Yu H.O., Hong D., Crary F., Ashwood P., Walker C., Korf I., Robinson W.P., LaSalle J.M. The human placenta methylome. PNAS 15:6037-6042 (2013)

  2. Lott, P., Dunaway, K., Yu, K., Korf, I. StochHMM: A Flexible Hidden Markov Model Framework for Rapid Development of HMMs. Poster presented at: Genome Informatics, 2012 Sep 6-9, Cambridge, UK.

  3. Ginno, P. A., Lott, P. L., Christensen, H. C., Korf, I. & Chédin, F. R-loop formation is a distinctive characteristic of unmethylated human CpG island promoters. Mol. Cell 45, 814–825 (2012).

  4. Schroeder, D. I., Lott, P., Korf, I. & LaSalle, J. M. Large-scale methylation domains mark a functional subset of neuronally expressed genes. Genome Res 21, 1583–1591 (2011).


#Installation

To compile StochHMM from the Unix command line (Linux, Mac OS X):

 $ ./configure
 $ make

The compiled application ./stochhmm will be located in the project's root folder, and the static library will be in the src/ folder.

To compile StochHMM in Xcode (Mac OS X only):

  1. Open StochHMM.xcodeproj in the Xcode directory.
  2. Select Debug or Release within the StochHMM scheme.
  3. Select Run.

The compiled target will be accessible from Xcode.


#Examples

To run the examples,

$ cd bin/
$ stochhmm -model ../examples/Dice.hmm -seq ../examples/Dice.fa -viterbi -label
$ stochhmm -model ../examples/3_16Eddy.hmm -seq ../examples/3_16Eddy.fa -viterbi -gff
$ stochhmm -model ../examples/3_16Eddy.hmm -seq ../examples/3_17Eddy.fa -posterior
$ stochhmm -model ../examples/Dice.hmm -seq ../examples/Dice.fa -stochastic viterbi -rep 10 -label
$ stochhmm -model ../examples/Dice.hmm -seq ../examples/Dice.fa -stochastic posterior -rep 10 -label

#License Information

The MIT License (MIT)

Copyright (c) 2007-2012 Paul Lott, Ian Korf, Korf Lab, Genome Center, UC Davis, Davis, CA. All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

stochhmm's Issues

where to find GC_SKEW.fa

Hi, I am very interested in this project and want to use the GC_SKEW example in my study. I know that it is an enormous amount of work to estimate HMM parameters, so I want to use your GC_SKEW.hmm directly. But where can I find GC_SKEW.fa? I have read your related paper but still don't know how to get the raw data. Can you give me a clue?

Error while compiling

./stochMath.h:139:18: error: call to 'abs' is ambiguous
base=abs(base);
^~~
lexicalTable.cpp:470:16: note: in instantiation of function template
specialization 'StochHMM::integerPower' requested here
array_size*=integerPower(alpha_size, (size_t) or...
^
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.13.sdk/usr/include/stdlib.h:137:6: note:
candidate function
int abs(int) __pure2;
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/stdlib.h:115:44: note:
candidate function
inline _LIBCPP_INLINE_VISIBILITY long abs( long __x) _NOEXCEPT ...
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/stdlib.h:117:44: note:
candidate function
inline _LIBCPP_INLINE_VISIBILITY long long abs(long long __x) _NOEXCEPT ...
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/math.h:693:1: note:
candidate function
abs(float __lcpp_x) _NOEXCEPT {return ::fabsf(__lcpp_x);}
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/math.h:697:1: note:
candidate function
abs(double __lcpp_x) _NOEXCEPT {return ::fabs(__lcpp_x);}
^
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/math.h:701:1: note:
candidate function
abs(long double __lcpp_x) _NOEXCEPT {return ::fabsl(__lcpp_x);}
^
1 error generated.
make[2]: *** [lexicalTable.o] Error 1
make[1]: *** [all-recursive] Error 1
make: *** [all] Error 2

Implement Nth Algorithms in the StochTrellis

I have moved the Nth Viterbi algorithm from the NthTrellis class to the simpleTrellis and stochasticTrellis classes.

I've already implemented it in simpleTrellis and need to adapt the code for the stochasticTrellis class

Installing on Microsoft Windows 10

Hello.
This may not be the proper place to post this issue, but unfortunately I don't know where else to post it. Regarding installation on Windows 10: how do I create the executable files using nmake? I have already installed and run MS VCVARS32, but I couldn't find the makefile (.mkf) in the StochHMM directory.

Please maintain this

Thank you for creating this -- it is a very important tool for many projects. At the moment, installation does not work. This issue was first referenced back in 2019 but was never addressed. #17

In 2021, the issue still remains...

During compilation with make, I get the following error:

make  all-recursive
make[1]: Entering directory '/home/millerh1/projects/RLSuite/RLBase-data/misc-data/StochHMM-0.37'
Making all in src
make[2]: Entering directory '/home/millerh1/projects/RLSuite/RLBase-data/misc-data/StochHMM-0.37/src'
/home/millerh1/miniconda3/envs/rlbaseData/bin/x86_64-conda-linux-gnu-c++ -DHAVE_CONFIG_H -I. -I.. -I ./  -DNDEBUG -D_FORTIFY_SOURCE=2 -O2 -isystem /home/millerh1/miniconda3/envs/rlbaseData/include  -fvisibility-inlines-hidden -std=c++17 -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem /home/millerh1/miniconda3/envs/rlbaseData/include -MT lexicalTable.o -MD -MP -MF .deps/lexicalTable.Tpo -c -o lexicalTable.o lexicalTable.cpp
In file included from PDF.h:14,
                 from userFunctions.h:35,
                 from track.h:38,
                 from lexicalTable.h:18,
                 from lexicalTable.cpp:9:
stochMath.h: In instantiation of 'T StochHMM::integerPower(T, T) [with T = long unsigned int]':
lexicalTable.cpp:470:60:   required from here
stochMath.h:139:21: error: call of overloaded 'abs(long unsigned int&)' is ambiguous
  139 |             base=abs(base);
      |                  ~~~^~~~~~
In file included from /home/millerh1/miniconda3/envs/rlbaseData/x86_64-conda-linux-gnu/include/c++/9.4.0/cstdlib:75,
                 from /home/millerh1/miniconda3/envs/rlbaseData/x86_64-conda-linux-gnu/include/c++/9.4.0/ext/string_conversions.h:41,
                 from /home/millerh1/miniconda3/envs/rlbaseData/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/basic_string.h:6496,
                 from /home/millerh1/miniconda3/envs/rlbaseData/x86_64-conda-linux-gnu/include/c++/9.4.0/string:55,
                 from lexicalTable.h:11,
                 from lexicalTable.cpp:9:
/home/millerh1/miniconda3/envs/rlbaseData/x86_64-conda-linux-gnu/sysroot/usr/include/stdlib.h:785:12: note: candidate: 'int abs(int)'
  785 | extern int abs (int __x) __THROW __attribute__ ((__const__)) __wur;
      |            ^~~
In file included from /home/millerh1/miniconda3/envs/rlbaseData/x86_64-conda-linux-gnu/include/c++/9.4.0/cstdlib:77,
                 from /home/millerh1/miniconda3/envs/rlbaseData/x86_64-conda-linux-gnu/include/c++/9.4.0/ext/string_conversions.h:41,
                 from /home/millerh1/miniconda3/envs/rlbaseData/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/basic_string.h:6496,
                 from /home/millerh1/miniconda3/envs/rlbaseData/x86_64-conda-linux-gnu/include/c++/9.4.0/string:55,
                 from lexicalTable.h:11,
                 from lexicalTable.cpp:9:
/home/millerh1/miniconda3/envs/rlbaseData/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/std_abs.h:79:3: note: candidate: 'constexpr long double std::abs(long double)'
   79 |   abs(long double __x)
      |   ^~~
/home/millerh1/miniconda3/envs/rlbaseData/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/std_abs.h:75:3: note: candidate: 'constexpr float std::abs(float)'
   75 |   abs(float __x)
      |   ^~~
/home/millerh1/miniconda3/envs/rlbaseData/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/std_abs.h:71:3: note: candidate: 'constexpr double std::abs(double)'
   71 |   abs(double __x)
      |   ^~~
/home/millerh1/miniconda3/envs/rlbaseData/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/std_abs.h:61:3: note: candidate: 'long long int std::abs(long long int)'
   61 |   abs(long long __x) { return __builtin_llabs (__x); }
      |   ^~~
/home/millerh1/miniconda3/envs/rlbaseData/x86_64-conda-linux-gnu/include/c++/9.4.0/bits/std_abs.h:56:3: note: candidate: 'long int std::abs(long int)'
   56 |   abs(long __i) { return __builtin_labs(__i); }
      |   ^~~
make[2]: *** [Makefile:345: lexicalTable.o] Error 1
make[2]: Leaving directory '/home/millerh1/projects/RLSuite/RLBase-data/misc-data/StochHMM-0.37/src'
make[1]: *** [Makefile:354: all-recursive] Error 1
make[1]: Leaving directory '/home/millerh1/projects/RLSuite/RLBase-data/misc-data/StochHMM-0.37'
make: *** [Makefile:215: all] Error 2
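
The ambiguity reported in the log above arises because the standard library declares `abs` only for signed integer and floating-point types. When the template is instantiated with `T = long unsigned int`, every candidate needs a conversion and none is preferred, so the call does not compile. One possible workaround, shown here as a sketch and not as the project's actual fix, is a small helper that dispatches on signedness and leaves unsigned values unchanged. The name `absoluteValue` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <type_traits>

// Hypothetical helper, not taken from the StochHMM source.

// Signed integer types: negate negative values.
template <typename T>
T absoluteValue(T value, std::true_type /*is_signed*/) {
    // Negating the most negative value would overflow.
    assert(value != std::numeric_limits<T>::min());
    return value < 0 ? static_cast<T>(-value) : value;
}

// Unsigned integer types: the value is already non-negative.
template <typename T>
T absoluteValue(T value, std::false_type /*is_signed*/) {
    return value;
}

// Entry point: selects one of the overloads above at compile time,
// so std::abs is never called with an unsigned argument.
template <typename T>
T absoluteValue(T value) {
    static_assert(std::is_integral<T>::value, "integer types only");
    return absoluteValue(value, std::is_signed<T>());
}
```

Replacing a call such as `abs(base)` inside a template with a helper of this kind avoids the ambiguous overload set for both signed and unsigned instantiations.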

Implement Stringify in StochTrellis

I've implemented this function in SimpleTrellis. To follow the interface of the other classes, I need a stringify function that converts the class data to a string. Then, instead of duplicating that work, print just calls stringify and writes the result to the output.

Use SimpleTrellis class as a template for the stochastic trellis. Differences with stochastic trellis: 1) will print the probability traceback values.

Compiling Code

When compiling the TestUsingStochHMMlib.cpp I encountered error: call of overloaded 'abs(unsigned int&)' is ambiguous

I am compiling on Code Blocks using GNU GCC compiler set at C++17 on windows 10.

Integrating into an R package

Hello,
I would really appreciate it if you could give me precise instructions on how to integrate StochHMM into an R package. I have binomial data (y, n) of size one million, and sometimes n and y are 0, meaning they are missing, but I want to impute them using an HMM.
Thank you in advance,

Use std::ostream for outputs

std::ostream should be used for outputs. It might be initialized with std::cout/cerr.
The current solution of printing directly to std::cout/cerr makes it very inconvenient to use StochHMM as a library because error messages cannot be logged correctly.
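
A minimal sketch of the pattern this issue asks for is shown below: the class holds pointers to output streams that default to `std::cout` and `std::cerr`, and a caller using the code as a library can redirect them. `Reporter`, `setOutput`, `setError`, and `captureError` are hypothetical names chosen for illustration and do not exist in StochHMM.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical illustration only: not part of the StochHMM library.
class Reporter {
public:
    // Default to the standard streams, matching command-line behaviour.
    Reporter() : out_(&std::cout), err_(&std::cerr) {}

    // Library users can redirect either stream, e.g. to a log file.
    void setOutput(std::ostream* out) {
        assert(out != nullptr);
        out_ = out;
    }
    void setError(std::ostream* err) {
        assert(err != nullptr);
        err_ = err;
    }

    void info(const std::string& message) { *out_ << message << '\n'; }
    void error(const std::string& message) { *err_ << "error: " << message << '\n'; }

private:
    std::ostream* out_;  // not owned
    std::ostream* err_;  // not owned
};

// Usage example: capture an error message in a string instead of
// letting it go to std::cerr.
std::string captureError(const std::string& message) {
    std::ostringstream buffer;
    Reporter reporter;
    reporter.setError(&buffer);
    reporter.error(message);
    return buffer.str();
}
```

Because the streams are not owned by the class, the caller must keep them alive for as long as the reporter is used.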

I want to train a new model

Hi, I am very interested in this tool and want to use it in my study.
But I want to train a new model.
Can you provide me with a script to train the new model?
Or tell me how to train a new model ?
Thank you very much.

no seqTool.h and seqTool.cpp

Hello, I am trying to compile your source code on Linux, but I cannot find seqTool.h and seqTool.cpp. I don't know why.

Compile and execute on windows visual studio

Dear Team,

Currently, my working environment is Visual Studio on Windows.
May I know whether you can provide a version that can be compiled and executed with Visual Studio on Windows?
For example, something built with CMake instead of a Makefile.

Many thanks!
