
pyfriends's Introduction

pyfriends

Python implementation of the paper "PyFriends: The First Fully Generalized Friends-of-Friends Extragalactic Galaxy Group Finder", using a Friends-of-Friends (FoF) algorithm for galaxy group detection, augmented by graph theory approaches.

A detailed description of the algorithm can be found in the paper above, which is available on arXiv.org.

Installation

Download the repository through Git (on Windows, you can install Git Bash for Windows first).

git clone https://github.com/BrutishGuy/pyfriends.git

Data

Example data has been included in the ./data/ folder of this repository. It follows from Macri et al.

Execution

Before executing the code, review the config.txt file, which sets the parameters for the run. The provided values are reasonable defaults.

A detailed explanation of these parameters will follow.

To run the code, execute the file Py2Friends.py from the command line or your favourite editor. Make sure your working directory is set to the repository directory, so that config.txt is in your working directory. Then run:

python ./src/Py2Friends.py

For any issues or feature requests, please log an issue on this GitHub repository.

pyfriends's People

Contributors

trystanscottlambert, brutishguy

pyfriends's Issues

Vanilla FoF

We need to add a vanilla FoF mode so that users who do not want to rely on the graph theory can use the traditional HG82 version of the FoF algorithm, without the following features:

  • no updating position by averaging
  • no multiple runs (not needed since removing the position averaging will ensure all runs are identical)
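A minimal sketch of what such a single-pass mode could look like is given below. It is not the code in Py2Friends.py and it does not reproduce the HG82 linking criteria: the function name vanilla_fof, the (x, y, v) input layout, and the fixed linking lengths d_link and v_link are all assumptions made for illustration. Because no positions are averaged, the result depends only on the input and the two linking lengths.

```python
import math


def vanilla_fof(galaxies, d_link, v_link):
    """Group galaxies with a single-pass friends-of-friends search.

    galaxies: list of (x, y, v) tuples, where x and y are projected
    positions and v is a line-of-sight velocity (illustrative units).
    Two galaxies are friends when their projected separation is at most
    d_link and their velocity difference is at most v_link.
    """
    n = len(galaxies)
    parent = list(range(n))  # union-find forest: each galaxy starts alone

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving keeps trees shallow
            i = parent[i]
        return i

    for i in range(n):
        xi, yi, vi = galaxies[i]
        for j in range(i + 1, n):
            xj, yj, vj = galaxies[j]
            close = math.hypot(xi - xj, yi - yj) <= d_link
            if close and abs(vi - vj) <= v_link:
                parent[find(i)] = find(j)  # merge the two groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    # Keep only real groups (two or more members) as lists of indices.
    return sorted(members for members in groups.values() if len(members) > 1)
```

Friends of friends are linked transitively through the union-find structure, so a chain of galaxies ends up in one group even when its two ends are further apart than d_link.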

Parallelization of FoF function

Really we only need to parallelize the "FoF" function over the number of runs which are taking place. Maybe this can even be a user option? Multiprocessing = True or something like that.

I don't know the maximum number of processes we could use, but we should just use that.

This really only affects users who want to run the algorithm many times and average over the runs using graph theory, which may not be everyone, but the vanilla FoF algorithm should be included for people who want to use just that.

So let's parallelize the FoF function in Py2Friends.py. That should bring 100 runs down to 10 minutes or so, which it is now in any case. Plus, the graph theory is already pretty fast after being vectorized properly and should be able to average over all the runs in about 30-90 seconds.
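One possible shape for the user option, sketched with the standard library only. The names run_fof_once, run_trials, and use_multiprocessing are hypothetical, and run_fof_once is a placeholder workload standing in for a real FoF run; nothing here is taken from Py2Friends.py.

```python
import random
from concurrent.futures import ProcessPoolExecutor


def run_fof_once(seed):
    """Placeholder for one independent FoF run (hypothetical stand-in)."""
    rng = random.Random(seed)
    return rng.random()


def run_trials(n_runs, use_multiprocessing=False, max_workers=None):
    """Run n_runs independent trials, optionally spread over worker processes."""
    seeds = range(n_runs)
    if not use_multiprocessing:
        return [run_fof_once(seed) for seed in seeds]
    # The runs do not share state, so a plain map over the seeds is enough.
    # max_workers=None lets the executor choose a default based on the CPU count.
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_fof_once, seeds))
```

When the parallel path is used from a script, the call to run_trials(100, use_multiprocessing=True) should sit under an if __name__ == "__main__": guard, which is required on platforms that start worker processes by re-importing the main module.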

Syntax error Py2Friends.py

There are two instances of <> in the Py2Friends.py program that cause a syntax error immediately.

My guesses are
if len(checklimit)<>0: should be if len(checklimit)>0:
while list(friends_after) <> list(friends_before) and iterations<20: should be while list(friends_after) > list(friends_before) and iterations<20:.
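For context, <> is the Python 2 spelling of the inequality operator, which Python 3 writes as !=. Replacing it with > happens to give the same answer for the len() check, because a length is never negative, but it changes the meaning of the list comparison in the while condition. The short example below illustrates the difference; the variable names come from the lines quoted above, while the values are made up.

```python
friends_before = [1, 2, 3]
friends_after = [1, 2, 2]

# Python 2: friends_after <> friends_before
# Python 3: friends_after != friends_before
changed = friends_after != friends_before

# Lists compare lexicographically, so ">" asks a different question
# and would end the loop even though the lists still differ.
greater = friends_after > friends_before

checklimit = []
# len() is never negative, so "!= 0" and "> 0" agree for this condition.
nonempty_ne = len(checklimit) != 0
nonempty_gt = len(checklimit) > 0
```

Here changed is True while greater is False, so only the != form keeps the loop running until the friends list stops changing.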

Include Flagging and Correcting Faux Connections

We need to flag problematic cases where two large groups are found as a single group because of a small number of connections between the two.

We have done this previously by post-processing the output files, but this should be part of the main program so that it is independent of the output.
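As a rough illustration of the idea, the sketch below flags the simplest case: a single connection whose removal splits one group into two sizeable parts. It is a brute-force check written for clarity, not the post-processing that was used previously, and the names component_sizes, flag_faux_links, and min_size are assumptions. Cases with two or more connections between the sub-groups would need a more general cut-based test.

```python
from collections import defaultdict


def component_sizes(nodes, edges):
    """Return the sizes of the connected components of an undirected graph."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, sizes = set(), []
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # iterative depth-first search
            node = stack.pop()
            size += 1
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        sizes.append(size)
    return sizes


def flag_faux_links(nodes, edges, min_size=3):
    """Return edges whose removal splits one connected group into two
    parts that each have at least min_size members."""
    flagged = []
    for edge in edges:
        remaining = [e for e in edges if e != edge]
        sizes = component_sizes(nodes, remaining)
        if len(sizes) == 2 and min(sizes) >= min_size:
            flagged.append(edge)
    return flagged
```

The function expects the nodes and edges of one group at a time. Each edge is tested by recomputing the components without it, which is quadratic in the worst case but easy to verify on small groups.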

The initial algorithm takes far too long overall

So far, for 45 000 galaxies the algorithm takes about 10-15 minutes to run 10 times. Ideally at least 100 runs would be done, meaning roughly 100-150 minutes.

We need to speed up the actual FoF algorithm.

The obvious solution is to parallelize the different trials, but I think we can speed up the whole thing before we parallelize, and then future-proof it with parallelization.

For 45 000 galaxies, the time taken should be under 2 minutes for 100 runs. This would mean that future large surveys with hundreds of thousands of galaxies would be handled well.

I don't want to use C :'(
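One common way to speed up a friends search without leaving Python is to avoid comparing every pair of galaxies. The sketch below bins points into square cells whose side equals the linking length, so only points in the same or adjacent cells are compared. It assumes flat two-dimensional coordinates and a single fixed linking length, which is a simplification of separations on the sky, and whether pair finding is the actual bottleneck in Py2Friends.py has not been measured here. The name find_friend_pairs is hypothetical.

```python
import math
from collections import defaultdict


def find_friend_pairs(points, link_length):
    """Return all index pairs (i, j) with i < j whose separation is at most
    link_length, using a uniform grid to limit the comparisons."""
    # Any friend of a point lies in the same cell or one of its 8 neighbours,
    # because the cell side equals the linking length.
    cells = defaultdict(list)
    for index, (x, y) in enumerate(points):
        key = (math.floor(x / link_length), math.floor(y / link_length))
        cells[key].append(index)

    pairs = []
    for (cx, cy), members in cells.items():
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in cells.get((cx + dx, cy + dy), ()):
                    for i in members:
                        if i < j:  # record each pair exactly once
                            xi, yi = points[i]
                            xj, yj = points[j]
                            if math.hypot(xi - xj, yi - yj) <= link_length:
                                pairs.append((i, j))
    return sorted(pairs)
```

For points spread roughly evenly, the work grows with the number of points times the typical cell occupancy rather than with the square of the number of points.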

Param File Generator.

I think the best way to get the param file implemented into the package is to create a param file generator that will spawn a param file template which the user can then update.

The param file can be generated blank, or alternatively we can include some default values, in particular for specific types of surveys, with Schechter parameters included.
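A minimal sketch of such a generator follows. The parameter names in PARAM_KEYS and the key = value layout are hypothetical and are not the keys or format used in config.txt; the template is blank unless defaults are passed in.

```python
# Hypothetical parameter names for illustration; not the keys used in config.txt.
PARAM_KEYS = ("schechter_alpha", "schechter_m_star", "schechter_phi_star", "n_runs")


def render_param_template(defaults=None):
    """Return the template text, blank except for any defaults supplied."""
    defaults = dict(defaults or {})
    unknown = sorted(set(defaults) - set(PARAM_KEYS))
    if unknown:
        raise KeyError(f"unknown parameters: {unknown}")
    lines = ["# Parameter file template: fill in or adjust the values below."]
    for key in PARAM_KEYS:
        # Parameters without a default are left empty for the user to fill in.
        lines.append(f"{key} = {defaults.get(key, '')}")
    return "\n".join(lines) + "\n"


def generate_param_file(path, defaults=None):
    """Write the template to path and return the path."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(render_param_template(defaults))
    return path
```

Survey-specific presets could then be plain dictionaries passed as defaults, so adding a preset would not require changing the generator itself.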

Output files should be managed better than just being in the main script

The main body of the program, which runs the algorithm, is actually very short. Writing the numerous outputs takes up the most space by far. Perhaps we could use a class with .totxt(), .tocsv(), .toIAPUC(), .toPartiView(), etc. Any time someone requests a new output type, we should be able to just add it to the class.

For now those three should do well.
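The class idea could be sketched as follows. GroupCatalogue, its column names, and the tab-separated layout chosen for the text output are assumptions for illustration, and the method names use underscores (to_txt, to_csv) rather than the spellings above. Only the two generic formats are shown.

```python
import csv
import io


class GroupCatalogue:
    """Hypothetical container for group results, one method per output format."""

    def __init__(self, rows, columns=("group_id", "n_members")):
        self.columns = tuple(columns)
        self.rows = [tuple(row) for row in rows]

    def _write(self, delimiter):
        # Shared helper: header line followed by one line per row.
        buffer = io.StringIO()
        writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
        writer.writerow(self.columns)
        writer.writerows(self.rows)
        return buffer.getvalue()

    def to_csv(self):
        """Return the catalogue as comma-separated text."""
        return self._write(",")

    def to_txt(self):
        """Return the catalogue as tab-separated text."""
        return self._write("\t")

    def save(self, path, fmt="csv"):
        """Write the chosen format to path by looking up the to_<fmt> method."""
        text = getattr(self, f"to_{fmt}")()
        with open(path, "w", encoding="utf-8", newline="") as handle:
            handle.write(text)
```

Because save looks the method up by name, supporting a new output type means adding one to_<fmt> method without touching the main script.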
