datajoint-tutorials's Introduction

Welcome to DataJoint tutorials!

DataJoint is an open-source library for science labs to design and build data pipelines for automated data analysis and sharing.

This document will guide you as a new DataJoint user through interactive tutorials organized in Jupyter notebooks and written in Python.

Please note that these hands-on DataJoint tutorials are friendly to non-expert users, and advanced programming skills are not required.

Table of contents

  • The tutorials folder contains interactive Jupyter notebooks for learning DataJoint. The calcium imaging and electrophysiology tutorials provide examples of defining and interacting with data pipelines. Some fill-in-the-blank sections are also included for you to code yourself!

    • 01-DataJoint Basics
    • 02-Calcium Imaging Imported Tables
    • 03-Calcium Imaging Computed Tables
    • 04-Electrophysiology Imported Tables
    • 05-Electrophysiology Computed Tables
  • The completed_tutorials folder contains the same Jupyter notebooks with the fill-in-the-blank code sections solved.

  • You will find the following notebooks in the short_tutorials folder:

    • DataJoint in 30min
    • University

Key learnings from the tutorials

After completing this set of tutorials, you will have hands-on experience with the basics of the DataJoint framework. These skills will allow you to design, implement, and manage data pipelines for your scientific research.

Here is a summary of what you can expect to learn:

  • Understanding DataJoint basics: concepts, design, and structure (~1 hour)

    • Create schemas/tables
    • Table tiers (Lookup, Manual, Imported, Computed)
    • Insert entries and view entries in tables
    • Table dependency and data integrity
    • Basic operations
      • Restriction - &
      • Join - *
      • Projection - .proj()
      • Fetch - .fetch()
      • Deletion - .delete()
      • Drop - .drop()
  • DataJoint advanced topics: pipeline automation (~1 hour)

    • Imported and Computed tables
    • make() function
    • .populate() for automated computation
    • .populate(reserve_jobs=True) for parallelization
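
The operations listed above can be previewed without a database. The following is a minimal conceptual sketch in plain Python, not DataJoint code: the helpers `restrict`, `join`, `proj`, and `populate`, and the `mice` and `sessions` sample data, are hypothetical names invented for illustration. They only mimic on lists of dictionaries what `&`, `*`, `.proj()`, and `.populate()` do on tables. In this sketch `make` returns the computed row, whereas the tutorials cover how a real `make()` function works, and unlike this `proj` helper, the real `.proj()` may treat primary key attributes differently.

```python
# Conceptual analogy only: these helpers are NOT part of the DataJoint API.

def restrict(rows, **conditions):
    """Analogue of `table & condition`: keep rows matching every condition."""
    return [r for r in rows if all(r[k] == v for k, v in conditions.items())]

def join(left, right):
    """Analogue of `a * b`: natural join on the attribute names both sides share."""
    out = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()
            if all(l[k] == r[k] for k in shared):
                out.append({**l, **r})
    return out

def proj(rows, *attrs):
    """Analogue of `.proj()`: keep only the named attributes (simplified)."""
    return [{a: r[a] for a in attrs} for r in rows]

def populate(upstream, computed, key_attrs, make):
    """Analogue of `.populate()`: call make(key) once per upstream key not yet computed."""
    done = {tuple(r[a] for a in key_attrs) for r in computed}
    for row in upstream:
        key = {a: row[a] for a in key_attrs}
        if tuple(key.values()) not in done:
            computed.append(make(key))
            done.add(tuple(key.values()))
    return computed

# Hypothetical sample data standing in for two dependent tables.
mice = [{"mouse_id": 0, "sex": "M"}, {"mouse_id": 1, "sex": "F"}]
sessions = [
    {"mouse_id": 0, "session": 1, "duration": 10.0},
    {"mouse_id": 0, "session": 2, "duration": 12.5},
    {"mouse_id": 1, "session": 1, "duration": 8.0},
]

males = restrict(mice, sex="M")                        # like: Mouse & 'sex = "M"'
male_sessions = join(males, sessions)                  # like: (Mouse & ...) * Session
male_keys = proj(male_sessions, "mouse_id", "session")

def make(key):
    """Compute one derived row for a single upstream key."""
    row = restrict(sessions, **key)[0]
    return {**key, "duration_ms": row["duration"] * 1000}

stats = []
populate(sessions, stats, ("mouse_id", "session"), make)  # computes all three keys
populate(sessions, stats, ("mouse_id", "session"), make)  # second call finds nothing new
```

The second `populate` call adds no rows because every key is already present, which is the property that makes repeated or parallel population safe to run.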

Interactive Environment

  • These interactive DataJoint tutorials can be accessed through a cloud-based environment on GitHub Codespaces. The following instructions will provide you with an environment that is configured with DataJoint for Python so that you can immediately begin to build and run a data pipeline.
  • Instructions
    • Sign up for a free GitHub account.
    • Fork this repository.
    • Launch the environment using GitHub Codespaces on your fork with the default options: select the green Code button, then the Codespaces tab, and then the green Create codespace on main button. For more control, select the ... button under the Codespaces tab and choose New with options...
    • The Codespace takes less than 2 minutes to launch.
    • You will know your environment has finished loading once the pip install -e . command has run and the terminal prompt is clear.
    • To begin, navigate to the tutorials directory located in the left panel and proceed through the sequentially organized Jupyter notebooks. Execute the cells in the notebooks to begin your walkthrough of the tutorial.
    • Once you are done, see the options in the menu in the bottom-left corner. In Codespaces, you can select Stop Current Codespace. By default, GitHub also stops a Codespace automatically after 30 minutes of inactivity.
  • Tip: Each month, GitHub renews a free-tier quota of computing and storage. Typically we run into the storage limits before anything else since Codespaces consume storage while stopped. It is best to delete Codespaces when not actively in use and recreate them when needed. Once any portion of your quota is reached, you will need to wait for it to be reset at the end of your cycle or add billing info to your GitHub account to handle overages.
  • Tip: GitHub auto names the Codespaces, but you can rename the Codespace so that it is easier to identify later.
  • Tip: Edits you make in these tutorial notebooks are not persistent. They are reset to the original content every time you restart the server. However, you can easily commit the changes to your fork.

Documentation

  • For more information on DataJoint Python, please refer to the documentation.

Support

If you need help getting started or run into any errors, please open a GitHub Issue or contact our team by email at [email protected].

Additional DataJoint Tutorials

Developer Instructions

  • Local environment instructions
    • Install the following:
    • git clone your fork of the repository and open it in VSCode.
    • Use the Dev Containers extension to Reopen in Container. (More info in the Getting started included with the extension.)
    • To begin, navigate to the tutorials directory located in the left panel and proceed through the sequentially organized Jupyter notebooks. Execute the cells in the notebooks to begin your walkthrough of the tutorial.
    • Once you are done, you can stop the container by closing the VS Code window.

datajoint-tutorials's People

Contributors

kabilar, ttngu207, milagrosmarin, shenshan, davidgodinez, yambottle, guzman-raphael, yarikoptic, dimitri-yatsenko, cbroz1, kushalbakshi, tdincer
