Inference-Engine
Inference-Engine is a software library for researching concurrent, large-batch inference and training of deep, feed-forward neural networks. Inference-Engine targets high-performance computing (HPC) applications with performance-critical inference and training needs. The initial target application is in situ training of a cloud microphysics model proxy for the Intermediate Complexity Atmospheric Research (ICAR) model. Such a proxy must support concurrent inference at every grid point at every time step of an ICAR run. For validation purposes, Inference-Engine can also import neural networks exported from Python by the companion package nexport.

The training capability is currently experimental. Current unit tests verify that Inference-Engine's network-training feature works for networks with one hidden layer. Future work will include developing unit tests that verify that training works for deep neural networks.
Inference-Engine's implementation language, Fortran 2018, makes it suitable for integration into HPC applications. The novel features of Inference-Engine include

1. Exposing concurrency via
   - an `elemental`, polymorphic, and implicitly `pure` inference strategy,
   - an `elemental`, polymorphic, and implicitly `pure` activation strategy, and
   - a `pure` training subroutine,
2. Gathering network weights and biases into contiguous arrays, and
3. Runtime selection of inference strategy and activation strategy.
Item 1 facilitates invoking Inference-Engine's `infer` function inside Fortran's `do concurrent` constructs, which some compilers can offload automatically to graphics processing units (GPUs). We envision this being useful in applications that require large numbers of independent inferences or multiple networks training concurrently. Item 2 exploits the special case in which the number of neurons is uniform across the network layers; using contiguous arrays facilitates spatial locality in memory access patterns. Item 3 offers the possibility of adaptive inference-method selection based on runtime information. The current methods include ones based on the intrinsic functions `dot_product` and `matmul`. Future options will explore the use of OpenMP and OpenACC for vectorization, multithreading, and/or accelerator offloading.
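As a concrete illustration of item 1, the minimal sketch below shows the pattern in plain Fortran: a stand-in `elemental` function (a trivial ReLU, not Inference-Engine's actual `infer` function; all names here are hypothetical) evaluated independently at every grid point inside a `do concurrent` construct.

```fortran
! Minimal sketch, assuming a hypothetical stand-in for Inference-Engine's
! infer function: an elemental function evaluated at every grid point
! inside a do concurrent construct whose independent iterations some
! compilers can offload automatically to GPUs.
module inference_sketch_m
  implicit none
contains
  elemental function infer_stub(x) result(y)
    ! Hypothetical stand-in for a per-point network evaluation
    real, intent(in) :: x
    real :: y
    y = max(0., x)  ! ReLU in place of a full forward pass
  end function
end module

program concurrent_inference_sketch
  use inference_sketch_m, only : infer_stub
  implicit none
  integer, parameter :: nx = 128, ny = 128
  real :: inputs(nx,ny), outputs(nx,ny)
  integer :: i, j

  call random_number(inputs)

  ! Each iteration is independent, so the loop is a candidate for
  ! automatic parallelization or accelerator offloading.
  do concurrent (i = 1:nx, j = 1:ny)
    outputs(i,j) = infer_stub(inputs(i,j))
  end do

  print *, "max output:", maxval(outputs)
end program
```

Similarly, item 3's runtime selection can be pictured with standard Fortran abstract derived types. The sketch below is an assumed design, not Inference-Engine's actual API; the types `inference_strategy_t`, `matmul_t`, and `dot_product_t` are hypothetical.

```fortran
! Hypothetical sketch of runtime strategy selection (item 3); the type
! and procedure names are illustrative, not Inference-Engine's API.
module strategy_sketch_m
  implicit none

  type, abstract :: inference_strategy_t
  contains
    procedure(infer_interface), deferred, nopass :: infer
  end type

  abstract interface
    pure function infer_interface(weights, inputs) result(outputs)
      real, intent(in) :: weights(:,:), inputs(:)
      real :: outputs(size(weights,1))
    end function
  end interface

  type, extends(inference_strategy_t) :: matmul_t
  contains
    procedure, nopass :: infer => matmul_infer
  end type

  type, extends(inference_strategy_t) :: dot_product_t
  contains
    procedure, nopass :: infer => dot_product_infer
  end type

contains

  pure function matmul_infer(weights, inputs) result(outputs)
    real, intent(in) :: weights(:,:), inputs(:)
    real :: outputs(size(weights,1))
    outputs = matmul(weights, inputs)  ! one matrix-vector product
  end function

  pure function dot_product_infer(weights, inputs) result(outputs)
    real, intent(in) :: weights(:,:), inputs(:)
    real :: outputs(size(weights,1))
    integer :: row
    do row = 1, size(weights,1)
      outputs(row) = dot_product(weights(row,:), inputs)
    end do
  end function

end module

program select_strategy
  use strategy_sketch_m
  implicit none
  class(inference_strategy_t), allocatable :: strategy
  real :: weights(2,3), inputs(3)

  weights = 1.
  inputs = [1., 2., 3.]

  ! Choose the concrete strategy at runtime, e.g. from configuration input.
  allocate(matmul_t :: strategy)
  print *, strategy%infer(weights, inputs)
end program
```

With this pattern, the concrete strategy can be chosen from configuration input at startup or adaptively at runtime without changing the calling code.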
To download, build, and test Inference-Engine, enter the following commands in a Linux, macOS, or Windows Subsystem for Linux shell:

```
git clone https://github.com/berkeleylab/inference-engine
cd inference-engine
./setup.sh
```

whereupon the trailing output will provide instructions for running the examples in the `example` subdirectory.
The `example` subdirectory contains demonstrations of several intended use cases.

Please see the Inference-Engine GitHub Pages site for HTML documentation generated by `ford`.