
analogsensemble's Introduction

DOI License: MIT Codacy Badge Build C++ Build R codecov lifecycle Binder dockeri.co

Overview

Parallel Analog Ensemble (PAnEn) generates accurate forecast ensembles relying on a single deterministic model simulation and the historical observations. The technique was introduced by Luca Delle Monache et al. in the paper Probabilistic Weather Prediction with an Analog Ensemble. Developed and maintained by GEOlab at Penn State, PAnEn aims to provide an efficient implementation for this technique and user-friendly interfaces in R and C++ for researchers who want to use this technique in their own research.

The easiest way to use this package is to install the R package, 'RAnEn'. C++ libraries are also available, but they are designed for intermediate users who need higher performance. For installation guidance, please refer to the installation section.

Citation

To cite this package, you have several options:

  • Using LaTeX: Please use this file for citation.
  • Using R: Simply type citation('RAnEn') and the citation message will be printed.
  • Using plain text: Please use the following citation format:
Weiming Hu, Guido Cervone, Laura Clemente-Harding, and Martina Calovi. (2019). Parallel Analog Ensemble. Zenodo. http://doi.org/10.5281/zenodo.3384321

Installation

RAnEn is very easy to install if you are already using R. This is the recommended way to start.

RAnEn

The command is the same for RAnEn installation and update.

To install RAnEn, please install the following packages first:

  • BH: install.packages('BH')
  • Rcpp: install.packages('Rcpp')
  • If you are using Windows, please also install the latest version of Rtools.

The following R command installs the latest RAnEn.

install.packages("https://github.com/Weiming-Hu/AnalogsEnsemble/raw/master/RAnalogs/releases/RAnEn_latest.tar.gz", repos = NULL)

That's it. You are good to go. Please refer to the tutorials or the R documentation to learn more about using RAnEn. You might also want to install the RAnEnExtra package, which provides functions for visualization and verification. After installing RAnEn, you can simply run devtools::install_github("Weiming-Hu/RAnEnExtra").

Mac users: if the package reports that OpenMP is not supported, you can do one of the following:

  1. Switch from Clang compilers to GNU compilers. To change the compilers used by R, create the file ~/.R/Makevars if you do not have it already and add the following content to it. Change the compiler names to match what you have installed. If you have no compilers other than Clang, Homebrew is your friend.
CC=gcc-8
CXX=g++-8
CXX1X=g++-8
CXX14=g++-8
  2. You can also follow the instructions here provided by data.table. They provide similar solutions but stick with Clang compilers.

After the installation, you can always revert to your original setup, and RAnEn will keep its OpenMP support.

CAnEn

Docker/Singularity

No installation is needed if you are already using Docker or Singularity. The Docker images available here can be downloaded and used directly.

# Download and run the docker image within docker
docker container run -it weiminghu123/panen:default

# Run the docker image with a local folder mounted inside the container
docker container run -it -v ~/Desktop:/Desktop weiminghu123/panen:default

# Download and run the docker image within singularity
singularity run docker://weiminghu123/panen:default

From Source

To install the C++ libraries, please check the following dependencies.

  • Required: CMake is the build system generator.
  • Required: NetCDF provides the file I/O with NetCDF files.
  • Required: Eccodes provides the file I/O with Grib2 files.
  • Optional: Boost provides high-performance data structures. Boost is a very large library. If you don't want to install the entire package, PAnEn is able to build the required components automatically.
  • Optional: CppUnit provides the test framework. If CppUnit is found on the system, test programs will be compiled.

To set up the dependencies, it is recommended to use conda. I chose miniconda instead of anaconda simply because miniconda is the lightweight version. If you already have anaconda, that works as well.

The following code sets up the environment from scratch:

# The Python version is pinned because of Boost compatibility issues
conda create -n venv_anen python==3.8 -y

# Keep your environment activated during the entire installation process, including CAnEn
conda activate venv_anen

# Required dependency
conda install -c anaconda cmake boost -y
conda install -c conda-forge netcdf-cxx4 eccodes doxygen  -y

# Optional dependency: LibTorch
# If you need libTorch, please go ahead to https://pytorch.org/get-started/locally/ and select
# Stable -> [Your OS] -> LibTorch -> C++/Java -> [Compute Platform] -> cxx11 ABI version
# 
# Please see https://github.com/Weiming-Hu/AnalogsEnsemble/issues/86#issuecomment-1047442579 for instructions
# on how to include libTorch during the cmake process.

# Optional dependency: MPI
conda install -c conda-forge openmpi -y

After the dependencies are installed, let's build CAnEn:

# Download the source files (~10 Mb)
wget https://github.com/Weiming-Hu/AnalogsEnsemble/archive/master.zip

# Unzip
unzip master.zip

# Create a separate folder to store all intermediate files during the installation process
cd AnalogsEnsemble-master/
mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX=~/AnalogEnsemble ..

# Compile
make -j 4

# Install
make install

CMake Parameters

Below is a list of parameters you can change and customize.

| Parameter | Explanation | Default |
| --- | --- | --- |
| CMAKE_C_COMPILER | The C compiler to use. | [System dependent] |
| CMAKE_CXX_COMPILER | The C++ compiler to use. | [System dependent] |
| CMAKE_INSTALL_PREFIX | The installation directory. | [System dependent] |
| CMAKE_PREFIX_PATH | Which folder(s) cmake should search for packages besides the default. Paths are surrounded by double quotes and separated with semicolons. | [Empty] |
| CMAKE_INSTALL_RPATH | The run-time library path. Paths are surrounded by double quotes and separated with semicolons. | [Empty] |
| CMAKE_BUILD_TYPE | Release for release mode; Debug for debug mode. | Release |
| INSTALL_RAnEn | Build and install the RAnEn library. | OFF |
| BUILD_BOOST | Build Boost regardless of whether it exists in the system. | OFF |
| BOOST_URL | The URL for downloading Boost. This is only used when BUILD_BOOST is ON. | [From SourceForge] |
| ENABLE_MPI | Build the MPI supported libraries and executables. This requires the MPI dependency. | OFF |
| ENABLE_OPENMP | Enable multi-threading with OpenMP. | ON |
| ENABLE_AI | Enable PyTorch integration and the power of AI. | OFF |

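You can override the default of any parameter on the command line, for example: cmake -DCMAKE_INSTALL_PREFIX=~/AnalogEnsemble .. (the trailing .. points cmake at the source directory one level above the build folder). Don't forget the extra letter D before each parameter name.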

High-Performance Computing and Supercomputers

Here is a list of instructions to build and install AnEn on supercomputers.

MPI and OpenMP

TL;DR

Launching an MPI-OpenMP hybrid program can be tricky.

If the performance with MPI is acceptable,
disable OpenMP (`cmake -DENABLE_OPENMP=OFF ..`).

If the hybrid solution is desired,
make sure you have the proper setup.

When ENABLE_MPI is turned on, MPI programs will be built. These are hybrid programs that use both MPI and OpenMP, unless you pass -DENABLE_OPENMP=OFF to cmake. Please check the documentation of your supercomputer platform to find out the proper configuration for launching an MPI + OpenMP hybrid program. Users are responsible for not launching too many processes and threads at the same time, which would oversubscribe the machine and might cause the program to hang (as I have seen on XSEDE Stampede2).

On NCAR Cheyenne, the proper way to launch a hybrid program can be found here. If you use mpirun instead of mpiexec_mpt, you will lose the multi-threading performance improvement.

To dive deeper into the hybrid parallelization design: MPI is used for the computationally expensive portions of the code, e.g. file I/O and analog generation, while OpenMP is used by the master process during the bottleneck portions of the code, e.g. data reshaping and information queries.

When analogs with long search and test periods are desired, MPI is used to distribute forecast files across processes. Each process reads a subset of the forecast files. This solves the problem where serial I/O can be very slow.

When a large number of stations/grids are present, MPI is used to distribute analog generation for different stations across processes. Each process takes charge of generating analogs for a subset of stations.
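
The idea of giving each process a subset of forecast files or stations can be sketched as a simple block partition. This is only an illustration under assumptions, not the partitioning code used by PAnEn; the function name `block_range` is hypothetical, and `rank` and `num_ranks` stand for the values an MPI program would obtain from its communicator.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Return the half-open index range [begin, end) owned by `rank` when
// `num_items` items (files or stations) are split as evenly as possible
// over `num_ranks` processes. The first (num_items % num_ranks) ranks
// receive one extra item each.
std::pair<std::size_t, std::size_t> block_range(std::size_t num_items,
                                                std::size_t num_ranks,
                                                std::size_t rank) {
    assert(num_ranks > 0 && rank < num_ranks);
    const std::size_t base = num_items / num_ranks;
    const std::size_t extra = num_items % num_ranks;
    const std::size_t begin = rank * base + (rank < extra ? rank : extra);
    const std::size_t end = begin + base + (rank < extra ? 1 : 0);
    return {begin, end};
}
```

With 10 stations and 4 processes, the ranges come out as [0, 3), [3, 6), [6, 8) and [8, 10), so no process holds more than one station above the average.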

Sitting between the file I/O and the analog generation is a bottleneck that is hard to parallelize with MPI, e.g. reshaping the data and querying test/search times. These steps are therefore parallelized with OpenMP on the master process only.

So if the platform supports a heterogeneous task layout, users can in theory allocate one core per worker process and more cores to the master process for its multi-threaded sections. But again, only do this when you find that the bottleneck takes much longer than file I/O and analog generation. Use --profile to print profiling information to the standard message output.

Tutorials

Tutorials can be accessed on binder or found in this directory.

Some tips and caveats are also collected in this ticket.

References

Feedback

We appreciate collaboration and feedback from users. Please contact the maintainer Weiming Hu through [email protected] or submit tickets if you have any problems.

Thank you!

# "`-''-/").___..--''"`-._
#  (`6_ 6  )   `-.  (     ).`-.__.`)   WE ARE ...
#  (_Y_.)'  ._   )  `._ `. ``-..-'    PENN STATE!
#    _ ..`--'_..-_/  /--'_.' ,'
#  (il),-''  (li),'  ((!.-'
# 
# Authors: 
#     Weiming Hu <[email protected]>
#     Guido Cervone <[email protected]>
#     Laura Clemente-Harding <[email protected]>
#     Martina Calovi <[email protected]>
#
# Contributors: 
#     Luca Delle Monache
#         
# Geoinformatics and Earth Observation Laboratory (http://geolab.psu.edu)
# Department of Geography and Institute for CyberScience
# The Pennsylvania State University


analogsensemble's Issues

RAnEn with different predictor weights for each point

The RAnEn package uses the same predictor weights for all stations/points. However, the relevant predictors and their weights might depend on location. Is it possible to change the weight for each predictor-station combination? It seems to be easy for the independent search case (i.e., search for each location), but I don't know how to handle it with search space extension. Thank you very much in advance.
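
One way to picture location-dependent weights is a distance function that receives the weight vector of a single station as an argument. The sketch below is not the RAnEn or CAnEn API; `Window` and `analog_distance` are hypothetical names, and the formula follows the commonly published form of the analog metric (weight divided by standard deviation, times the root of the summed squared differences over a lead-time window). Circular predictors and NaN handling are left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Values of every predictor over a short lead-time window:
// window[parameter][offset]. Hypothetical type for illustration only.
using Window = std::vector<std::vector<double>>;

// Weighted distance between a current forecast window and a historical one:
//   sum_i (w_i / sigma_i) * sqrt(sum_j (F_ij - A_ij)^2)
// `weights` holds the weights of ONE station, so calling this with a
// different weight vector per station gives location-dependent weighting.
double analog_distance(const Window& current, const Window& past,
                       const std::vector<double>& weights,
                       const std::vector<double>& sigmas) {
    assert(current.size() == past.size());
    assert(current.size() == weights.size() && current.size() == sigmas.size());
    double total = 0.0;
    for (std::size_t i = 0; i < current.size(); ++i) {
        if (weights[i] == 0.0) continue;  // predictor is switched off
        assert(current[i].size() == past[i].size() && sigmas[i] > 0.0);
        double sum_sq = 0.0;
        for (std::size_t j = 0; j < current[i].size(); ++j) {
            const double diff = current[i][j] - past[i][j];
            sum_sq += diff * diff;
        }
        total += weights[i] / sigmas[i] * std::sqrt(sum_sq);
    }
    return total;
}
```

In this picture, the independent search case only needs a lookup of the weight vector by test station. With search space extension there is a design choice to make: whether the weights should follow the test station or the search station being compared.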

AnEn Search Space Extension: Discrete output

Describe the bug
When performing verification on the AnEn SSE, histograms show the output to be discrete. See the sample plot below:
[image: sample plot]

Code to reproduce: (Or check out /AnalogsSpatial/SSE_test1.R on svn)

# Generate Analogs

source('~/geolab/projects/AnalogsSpatial/code/SSE_CoreSetUp.R')
# Variables
members.size    <- 25
# Identify the day to use
dayi <- 735
# For two year search history:
search.ID.start <- 1
search.ID.end <- 730
paramBeingForecasted <- 2
#############################################################################
# # Set up parameters to compute analogs
# # weights            <- c(1,1,0,0,0) #  "wdir","ws","2T","2DPT","MSLP"
xs <- as.numeric(lon)
ys <- as.numeric(lat)
nx <- 51; ny <- 49
icounter1 <- dayi
# Generate AnEn output, one day at a time.
# for ( icounter1 in 735:735 ) {  # Now, 730; Eventually do this through to 1095 for the 3rd year. So this is training on the first year and testing on the second two years
test.ID.start <- icounter1
test.ID.end <- test.ID.start+20
# Generate the analogs for one day at a time.
# A sampling of 100 stations
stations.ID <- stations_ID <- sort(sample(1:2499,100))
config2 <- generateConfiguration('independentSearch')
config2$observation_id <- paramBeingForecasted
config2$test_forecasts <- fcst.aligned[,stations.ID,test.ID.start:test.ID.end,, drop = F]
config2$search_forecasts <- fcst.aligned[,stations.ID,search.ID.start:search.ID.end, , drop = F]
config2$search_times <- as.vector(times[search.ID.start:search.ID.end])
config2$search_flts <- flts[1:dim(config2$search_forecasts)[4]]
tmp.search.observations2 <- obsv.aligned[,,search.ID.start:search.ID.end,,drop=F]  # create a new copy of it
search.observations2 <- aperm(tmp.search.observations2, c(4, 3, 2, 1)) # Reorganizing the structure
search.observations2 <- array(search.observations2,
dim = c(dim(tmp.search.observations2)[3]
* dim(tmp.search.observations2)[4],
dim(tmp.search.observations2)[2],
dim(tmp.search.observations2)[1]))
search.observations2 <- aperm(search.observations2, c(3, 2, 1))
config2$search_observations <- search.observations2
config2$observation_times = rep(config2$search_times, each = length(config2$search_flts)) + config2$search_flts
config2$num_members <- members.size
num.parameters <- dim(config2$search_forecasts)[1]
config2$weights <- rep(1, num.parameters)
config2$test_stations_x <- xs[stations.ID]
config2$test_stations_y <- ys[stations.ID]
config2$search_stations_x <- xs[stations.ID]
config2$search_stations_y <- ys[stations.ID]
config2$preserve_mapping <- T
config2$verbose <- 3
config2$max_flt_nan <- 1
config2$max_par_nan <- 0
config2$extend_observations <- T
# Validate first before using
validateConfiguration(config2)
# Generate analogs
AnEn.ind <- generateAnalogs(config2)
config <- generateConfiguration('extendedSearch')
config$test_forecasts <- fcst.aligned[,stations.ID,test.ID.start:test.ID.end,, drop = F]
config$observation_id <- paramBeingForecasted
config$search_forecasts <- fcst.aligned[,stations.ID,search.ID.start:search.ID.end, , drop = F]
config$search_times <- as.vector(times[search.ID.start:search.ID.end])
config$search_flts <- flts[1:dim(config$search_forecasts)[4]]  # Want this to match with config$search_forecasts
# # We need to convert obsv.aligned from 4 dimensions to 3 dimensions:
tmp.search.observations <- obsv.aligned[,,search.ID.start:search.ID.end,,drop=F]  # create a new copy of it
search.observations <- aperm(tmp.search.observations, c(4, 3, 2, 1)) # Reorganizing the structure
search.observations <- array(search.observations,
dim = c(dim(tmp.search.observations)[3]
* dim(tmp.search.observations)[4],
dim(tmp.search.observations)[2],
dim(tmp.search.observations)[1]))
# Combined (collapsed) the first two dimensions (multiplied the first two dimensions to bring them together )
search.observations <- aperm(search.observations, c(3, 2, 1))  # Flip the location back
# search.observations <- aperm(search.observations, c(3, 2, 1))
#
config$search_observations <- search.observations   # search_observations[parameter, stations, time]
#
# # Need to go do the dim(config$search_observations)[3]
config$observation_times = rep(config$search_times, each = length(config$search_flts)) + config$search_flts
#
#
config$num_members <- members.size
#
num.parameters <- dim(config$search_forecasts)[1]
config$weights <- rep(1, num.parameters)
#
# # Right now, these first four lines are all the same. (because the test stations are the search stations but in other examples, the test and search stations could be different )
config$test_stations_x <- xs[stations.ID]
config$test_stations_y <- ys[stations.ID]
#
# # config$search_stations_x <- xs
# # config$search_stations_y <- ys
config$search_stations_x <- xs[stations.ID]
config$search_stations_y <- ys[stations.ID]
#
config$preserve_mapping <- T
config$verbose <- 3
config$max_flt_nan <- 1
config$max_par_nan <- 0
config$extend_observations <- T  # added on 20181212 <- analog from the point (target) being forecasted for
# # save search stations in the output
config$preserve_search_stations <- T  # Tells you what the search stations are that you're looking into
# # save metrics in the output
config$preserve_similarity <- T  # Tells you which stations were most similar
#
config$num_nearest <- 8  # This is an option you can vary
config$max_num_search_stations <- 10  # Can decrease this if you want. # Can set this to be the same as number of nearest if/when using # nearest.
# # config$max_num_search_stations <- config$num_nearest
# # config$distance <- 1
#
# # Validate first before using
validateConfiguration(config)
#
# # w/ search extension
AnEn <- generateAnalogs(config)


# Look at anen.ver for SSE and IS: 
anen.ver.sse <- array(AnEn$analogs[,,,,1], dim=dim(AnEn$analogs)[1:4])

anen.ver.ind <- array(AnEn.ind$analogs[,,,,1], dim=dim(AnEn.ind$analogs)[1:4])



# Observations 
# # Observations (Analysis Fields)
nc.analy.file <- '~/geolab_storage_V3/data/Analogs/ECMWF_Italy/ItalyAnalysis.nc'
nc.analysis   <- nc_open(nc.analy.file)
obsv          <- ncvar_get(nc.analysis, 'Data')
# dim(obsv) -- 3 parameters   2499 stations   1102 days    4 flt
# Parameter 3 is temperature 
parameter <- 2
dir <- UVtoDir(obsv[1,,,], obsv[2,,,])
spd  <- UVtoSpd(obsv[1,,,], obsv[2,,,])
obs  <- array(spd,dim=dim(obsv)[2:4])


# Source Verification Functions 
source('~/geolab/projects/ExtremeHeat/code/Verification_Functions.R')
source('~/geolab/projects/ExtremeHeat/code/AnEn_functions.R')
library(ncdf4)

# # 2 days every 6 hours
time      <- (seq(0,48,6) )*60*60   ; time <- time[-length(time)]
# rhist.ver=function(anen.ver, obs.ver)

# anen.ver <- array(AnEn.ind$analogs[,,,,1], dim=dim(AnEn.ind$analogs)[1:4])
obs.ver  <- array( NA, dim=c(nrow(obs), dim(anen.ver.ind)[2], dim(obs)[3]*2 ))
for ( d in 1:dim(anen.ver.ind)[2] ) {
  obs.ver[,d,] = array( cbind( obs[ , test.ID.start+d-1, ], obs[ , test.ID.start+d, ] ), dim=c(nrow(obs), 1, dim(obs)[3]*2 ))
}
# Subselect down to the stations chosen. 
# stations.ID are the stations that are randomly kept 
obs.ver <- obs.ver[stations.ID,,]

rankhist.ind <- rhist.ver(anen.ver = anen.ver.ind ,obs.ver = obs.ver )
barplot(rankhist.ind, main = "AnEn.ind")

rankhist.sse <- rhist.ver(anen.ver = anen.ver.sse ,obs.ver = obs.ver )
barplot(rankhist.sse, main = "AnEn SSE")
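
For readers unfamiliar with the verification step, a rank histogram counts where each observation falls among the sorted members of its ensemble. The sketch below is a generic illustration and not the rhist.ver function sourced in the script, whose implementation is not shown here; the name `rank_histogram` and its tie handling are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Count how often the observation falls at each rank within its ensemble.
// ensembles[k] holds the members of case k and observations[k] the matching
// observation. With N members there are N + 1 possible ranks.
// Ties are resolved by counting only members strictly below the observation;
// a production version may want to split ties randomly instead.
std::vector<std::size_t> rank_histogram(
        const std::vector<std::vector<double>>& ensembles,
        const std::vector<double>& observations) {
    assert(ensembles.size() == observations.size());
    if (ensembles.empty()) return {};
    const std::size_t num_members = ensembles.front().size();
    std::vector<std::size_t> counts(num_members + 1, 0);
    for (std::size_t k = 0; k < ensembles.size(); ++k) {
        assert(ensembles[k].size() == num_members);
        std::size_t rank = 0;
        for (double member : ensembles[k]) {
            if (member < observations[k]) ++rank;
        }
        ++counts[rank];
    }
    return counts;
}
```

Because ties are pushed toward the lower rank here, ensembles with many repeated member values would pile counts into a few bins, which is one thing to check when a histogram looks unexpectedly discrete.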

Missing values found in RAnEn

I am using the RAnEn package and found missing values in the results.
Here is the data:

data.dir  <- "~/geolab_storage_V3/data/ExtremeHeat/NYdata/"
gfs.fname <- "gfs_209_68.Rdata"
pws.fname <- "pws.Rdata"

Code to run (SVN repository: geolab/projects/ExtremeHeat) AnEn_GFS_PWS.R

I found NAs in the results:

AnEn$analogs[1, 256, 12, , ]
      [,1] [,2] [,3]
 [1,]  NaN  NaN  NaN
 [2,]  NaN  NaN  NaN
 [3,]  NaN  NaN  NaN
 [4,]  NaN  NaN  NaN
 [5,]  NaN  NaN  NaN
 [6,]  NaN  NaN  NaN
 [7,]  NaN  NaN  NaN
 [8,]  NaN  NaN  NaN
 [9,]  NaN  NaN  NaN
[10,]  NaN  NaN  NaN
[11,]  NaN  NaN  NaN
[12,]  NaN  NaN  NaN
[13,]  NaN  NaN  NaN
[14,]  NaN  NaN  NaN
[15,]  NaN  NaN  NaN
[16,]  NaN  NaN  NaN
[17,]  NaN  NaN  NaN
[18,]  NaN  NaN  NaN
[19,]  NaN  NaN  NaN
[20,]  NaN  NaN  NaN
[21,]  NaN  NaN  NaN

Improve File IO for SimilarityMatrices

Reading and writing similarity matrices is very slow because of the data structure.

A SimilarityMatrix is a vector of vectors, which is not optimized for writing to and reading from a NetCDF file.
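
A common remedy for this kind of layout problem is to keep the values in one contiguous, row-major buffer so that the whole matrix can be handed to an I/O library as a single pointer and length. The sketch below only illustrates that idea; `FlatMatrix` and `flatten` are hypothetical names and not part of the package.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A dense row-major matrix stored in one contiguous buffer.
class FlatMatrix {
public:
    FlatMatrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

    // Element access with a bounds check in debug builds.
    double& at(std::size_t r, std::size_t c) {
        assert(r < rows_ && c < cols_);
        return data_[r * cols_ + c];
    }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }
    std::size_t size() const { return data_.size(); }
    const double* data() const { return data_.data(); }  // contiguous storage

private:
    std::size_t rows_, cols_;
    std::vector<double> data_;
};

// Copy a rectangular vector of vectors into the contiguous layout.
FlatMatrix flatten(const std::vector<std::vector<double>>& nested) {
    const std::size_t rows = nested.size();
    const std::size_t cols = rows == 0 ? 0 : nested.front().size();
    FlatMatrix flat(rows, cols);
    for (std::size_t r = 0; r < rows; ++r) {
        assert(nested[r].size() == cols);  // all rows must have equal length
        for (std::size_t c = 0; c < cols; ++c) flat.at(r, c) = nested[r][c];
    }
    return flat;
}
```

With this layout, one write call over `data()` and `size()` replaces a loop of per-row writes, which is usually where the time goes with nested vectors.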

Index out of bounds

Index out of bounds. Probably caused by a difference in analog.index.day between observations and forecasts.

Basic function development in progress

This issue is created for the first release of the package.

  • Read NetCDF files as observations and forecasts
  • Write NetCDF files for observations and forecasts
  • Compute similarity matrix
  • Order similarity matrix
  • Select analogs based on the ordered similarity matrix
  • Read part of NetCDF files as observations and forecasts
  • Read similarity matrices
  • Write similarity matrices
  • Read analogs
  • Write analogs
  • CMake script for compiling the package and dependencies
  • Export to R interface
  • Correctness test with R script
  • Stations function development for getting search stations
  • Stations new function tests
  • Test compilation with -fomp
  • R Wrapper function for backward compatibility
  • Executable (Si) for computing similarity matrices
  • Executable for converting GRB2 file
  • Executable (Se) for selecting analogs
  • Executable for computing analogs as a whole
  • Test build process

Error during RAnEn installation

Describe the bug
I got the following error when installing the RAnEn package.

> install.packages("https://github.com/Weiming-Hu/AnalogsEnsemble/raw/master/RAnalogs/releases/RAnEn_latest.tar.gz", repos = NULL)
Installing package into ‘/Users/wuh20/Library/R/3.5/library’
(as ‘lib’ is unspecified)
trying URL 'https://github.com/Weiming-Hu/AnalogsEnsemble/raw/master/RAnalogs/releases/RAnEn_latest.tar.gz'
Content type 'application/octet-stream' length 115962 bytes (113 KB)
==================================================
downloaded 113 KB

* installing *source* package ‘RAnEn’ ...
Checking whether R_HOME is already set? R_HOME = /usr/local/Cellar/r/3.5.0_1/lib/R
checking whether the C++ compiler works... yes
checking for C++ compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... configure: error: in `/private/var/folders/z2/qq0ntf292kj8hy14ckfrmlp80000gp/T/Rtmpmbn6Nq/R.INSTALL27d92b7a8188/RAnEn':
configure: error: cannot run C++ compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details
ERROR: configuration failed for package ‘RAnEn’
* removing ‘/Users/wuh20/Library/R/3.5/library/RAnEn’
* restoring previous ‘/Users/wuh20/Library/R/3.5/library/RAnEn’
Warning message:
In install.packages("https://github.com/Weiming-Hu/AnalogsEnsemble/raw/master/RAnalogs/releases/RAnEn_latest.tar.gz",  :
  installation of package ‘/var/folders/z2/qq0ntf292kj8hy14ckfrmlp80000gp/T//Rtmpf4qdEz/downloaded_packages/RAnEn_latest.tar.gz’ had non-zero exit status

My Makevars

CC=gcc-8
CXX=g++-8
CXX1X=g++-8
CXX11=g++-8

Interface Help Following 3.2.1 release (Operational Search added)

After working through the Parallel Ensemble help page (https://weiming-hu.github.io/AnalogsEnsemble/2019/02/12/operational-search.html) and the binder documentation (https://hub.mybinder.org/user/weiming-hu-analogsensemble-bmqvhvn1/notebooks/demo-3_operational-search.ipynb), I would like to:

  1. Request help with the commands that were added or revised with the inclusion of the Operational Search option
  2. Suggest a revision of the user interface to reduce duplication and streamline use

RAnEn function generateAnalogs failed with RStudio

I have the following script.

library(RAnEn)
library(maps)

# load("forecasts_ocean.RData")
# load("observations_ocean.RData")

cat("Loading data ...\n")
if ('forecasts' %in% ls()) {
  # Don't reload data
} else {
  load("forecasts_Utah.RData")
  # load("forecasts_Denver.RData")
  # load("forecasts_ocean.RData")
}

if ('observations' %in% ls()) {
  # Don't reload data
} else {
  load("observations_Utah.RData")
  # load("observations_Denver.RData")
  # load("observations_ocean.RData")
}

# Only keep the first 4 FLTs because they are perfect forecasts
if (length(forecasts$FLTs) == 53) {
  #flts.to.keep <- c(1:4)
  #flts.to.keep <- c(1:4, 7)
  flts.to.keep <- c(1:4)
  forecasts$FLTs <- forecasts$FLTs[flts.to.keep]
  forecasts$Data <- forecasts$Data[, , , flts.to.keep, drop = F]
  rm(flts.to.keep)
} else {
  cat("FLTs have already been truncated. No changes are made to the current FLTs.")
}

# Shift Xs range
if (range(forecasts$Xs)[2] > 180) {
  forecasts$Xs <- forecasts$Xs - 360
}

# Configure start and end time
test.start <- 2997
test.end <- 3027
search.end <- test.start - 2
search.start <- search.end - 364
observation_id <- 8
# weights <- c(1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1)
weights <- c(0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0)
# weights <- rep(1, length(forecasts$ParameterNames))
names(weights) <- forecasts$ParameterNames

if (T) {
  cat("Range of test times:", format(as.POSIXct(forecasts$Times[c(
    test.start, test.end)], origin = '1970-01-01', tz = 'UTC'), format = "%Y-%m-%d"),
    "\nRange of search times:", format(as.POSIXct(forecasts$Times[c(
      search.start, search.end)], origin = '1970-01-01', tz = 'UTC'), format = "%Y-%m-%d"),
    "\nPredicted variable is", observations$ParameterNames[observation_id],
    "\nweights are:\n")
  print(weights)
}

# Generate AnEn
config <- generateConfiguration('independentSearch')

config$forecasts <- forecasts$Data
config$forecast_times <- forecasts$Times
config$flts <- forecasts$FLTs
config$search_observations <- observations$Data
config$observation_times <- observations$Times
config$observation_id <- observation_id
config$weights <- weights
config$num_members <- 20
config$verbose <- 6
config$max_par_nan <- 3
config$max_flt_nan <- 1
config$quick <- F
config$circulars <- unlist(lapply(forecasts$ParameterCirculars, function (x) {
  return(which(x == forecasts$ParameterNames))}))

config$test_times_compare <- forecasts$Times[test.start:test.end]
config$search_times_compare <- forecasts$Times[search.start:search.end]

AnEn <- generateAnalogs(config)

obs <- alignObservations(observations$Data, observations$Times, forecasts$Times, forecasts$FLTs)

I get the following results when I'm running it over NAM data. The place that generates this error message is not consistent.

OpenMP is supported.
Package 'RAnEn' version 3.2.4
Copyright (c) 2018 Weiming Hu
Loading data ...
Range of test times: 2017-04-12 2017-05-12 
Range of search times: 2016-04-10 2017-04-10 
Predicted variable is SurfaceTemperature 
weights are:
    2MetreRelativeHumidity             2MetreDewpoint 
                         0                          0 
         2MetreTemperature            SoilTemperature 
                         1                          1 
             SurfaceAlbedo         1000IsobaricInhPaU 
                         0                          0 
        1000IsobaricInhPaV         SurfaceTemperature 
                         0                          1 
           SurfacePressure            TotalCloudCover 
                         1                          0 
        TotalPrecipitation DownwardShortWaveRadiation 
                         0                          0 
 DownwardLongWaveRadiation   UpwardShortWaveRadiation 
                         1                          0 
   UpwardLongWaveRadiation     1000IsobaricInhPaSpeed 
                         1                          0 
      1000IsobaricInhPaDir 
                         0 
Convert R objects to C++ objects ...
A summary of test forecast parameters:
[Parameters] size: 17
[Parameter] ID: 0, name: UNDEFINED, weight: 0, circular: 0
[Parameter] ID: 1, name: UNDEFINED, weight: 0, circular: 0
[Parameter] ID: 2, name: UNDEFINED, weight: 1, circular: 0
[Parameter] ID: 3, name: UNDEFINED, weight: 1, circular: 0
[Parameter] ID: 4, name: UNDEFINED, weight: 0, circular: 0
[Parameter] ID: 5, name: UNDEFINED, weight: 0, circular: 0
[Parameter] ID: 6, name: UNDEFINED, weight: 0, circular: 0
[Parameter] ID: 7, name: UNDEFINED, weight: 1, circular: 0
[Parameter] ID: 8, name: UNDEFINED, weight: 1, circular: 0
[Parameter] ID: 9, name: UNDEFINED, weight: 0, circular: 0
[Parameter] ID: 10, name: UNDEFINED, weight: 0, circular: 0
[Parameter] ID: 11, name: UNDEFINED, weight: 0, circular: 0
[Parameter] ID: 12, name: UNDEFINED, weight: 1, circular: 0
[Parameter] ID: 13, name: UNDEFINED, weight: 0, circular: 0
[Parameter] ID: 14, name: UNDEFINED, weight: 1, circular: 0
[Parameter] ID: 15, name: UNDEFINED, weight: 0, circular: 0
[Parameter] ID: 16, name: UNDEFINED, weight: 0, circular: 1
Computing standard deviation ... 
corrupted size vs. prev_size
Aborted (core dumped)

Distinguish NAN values

When NAN values are assigned, record a unique integer that identifies the reason why the value is NAN.
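
One possible way to attach an integer to a NaN is to store it in the unused mantissa bits of a quiet NaN. The sketch below assumes IEEE-754 64-bit doubles; the reason codes and function names are hypothetical and are not taken from the package. Payload bits are not guaranteed to survive arithmetic or conversion in other environments, so a separate array of reason codes is the safer alternative when values leave C++.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "this sketch assumes 64-bit doubles");

// Hypothetical reason codes; the names are illustrative only.
enum class NanReason : std::uint64_t {
    Unknown = 0,
    MissingForecast = 1,
    MissingObservation = 2,
    TooManyNanParameters = 3
};

// Build a quiet NaN whose low mantissa bits carry the reason code.
double make_tagged_nan(NanReason reason) {
    const std::uint64_t quiet_nan = 0x7FF8000000000000ULL;
    const std::uint64_t bits = quiet_nan | static_cast<std::uint64_t>(reason);
    double value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Recover the reason code from a NaN produced by make_tagged_nan.
NanReason nan_reason(double value) {
    assert(std::isnan(value));
    std::uint64_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    const std::uint64_t payload_mask = 0x0007FFFFFFFFFFFFULL;
    return static_cast<NanReason>(bits & payload_mask);
}
```

A tagged value still behaves like any other NaN in `std::isnan` checks, so existing code that tests for missing values would not need to change.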

RAnEn installation on Mac OS

I get the following errors while installing RAnEn on a MacBook Air.

> install.packages("https://github.com/Weiming-Hu/AnalogsEnsemble/raw/master/RAnalogs/releases/RAnEn_latest.tar.gz", repos = NULL)
trying URL 'https://github.com/Weiming-Hu/AnalogsEnsemble/raw/master/RAnalogs/releases/RAnEn_latest.tar.gz'
Content type 'application/octet-stream' length 146604 bytes (143 KB)
==================================================
downloaded 143 KB

Warning in strptime(xx, f <- "%Y-%m-%d %H:%M:%OS", tz = tz) :
  unknown timezone 'zone/tz/2018i.1.0/zoneinfo/America/New_York'
* installing *source* package ‘RAnEn’ ...
Checking whether R_HOME is already set? R_HOME = /Library/Frameworks/R.framework/Resources
xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
xcrun: error: invalid active developer path (/Library/Developer/CommandLineTools), missing xcrun at: /Library/Developer/CommandLineTools/usr/bin/xcrun
checking whether the C++ compiler works... no
configure: error: in `/private/var/folders/h2/sgb6vf0j61554sqdq_8glx280000gn/T/RtmpjOM3Zt/R.INSTALL17f3f1cf1ddb9/RAnEn':
configure: error: C++ compiler cannot create executables
See `config.log' for more details
ERROR: configuration failed for package ‘RAnEn’
* removing ‘/Library/Frameworks/R.framework/Versions/3.4/Resources/library/RAnEn’
Warning in install.packages :
  installation of package ‘/var/folders/h2/sgb6vf0j61554sqdq_8glx280000gn/T//RtmpnfzJLt/downloaded_packages/RAnEn_latest.tar.gz’ had non-zero exit status

Insufficient memory when dealing with large objects

When dealing with large objects in R, the memory is exhausted.

> AnEn <- generateAnalogs(config)
Convert R objects to C++ objects ...
Computing standard deviation ... 
Computing mapping from forecast [Time, FLT] to observation [Time]  ... 
Computing search space extension ... 
Computing search windows for FLT ... 
Computing similarity matrices ... 
Error in .generateAnalogs(configuration$test_forecasts, dim(configuration$test_forecasts),  : 
  std::bad_alloc
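
One way to anticipate this failure is to estimate how much memory a dense numeric array needs before calling generateAnalogs. The sketch below is a rough, hypothetical helper (estimate_bytes is not part of RAnEn). It assumes 8 bytes per double-precision element and ignores any copies made when R objects are converted to C++ objects. The example dimensions (5 parameters, 2499 stations, 1095 times, 8 lead times) are borrowed from the forecast array in the script further down this page and are used only for illustration.

```r
# Hypothetical helper, not part of RAnEn:
# bytes needed by a dense array of doubles with the given dimensions.
estimate_bytes <- function(dims, bytes_per_element = 8) {
  prod(as.numeric(dims)) * bytes_per_element
}

# Illustrative dimensions: [parameters, stations, times, FLTs]
forecast_dims  <- c(5, 2499, 1095, 8)
forecast_bytes <- estimate_bytes(forecast_dims)

# Report the size in GiB (1 GiB = 1024^3 bytes)
cat(sprintf("Forecast array: %.2f GiB\n", forecast_bytes / 1024^3))
```

Intermediate results may need more memory than the inputs, so an estimate like this is best read as a lower bound on the RAM required.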

RAnEn::generateAnalogs returns error messages about mismatched object types

Hi Alon, could you please upload two things here to assist with debugging?

  • Run all the code up to, but not including, the line that gives you the error message. Save the environment and upload it here or send it directly to me. To reduce the file size, you can remove variables that won't be used; normally it is enough to save all the inputs to the function that generates the error.
  • Attach the line of code that you didn't run and that gives you the error.

Thank you

NetCDF: Unknown file format when reading with MPI

$ forecastsToObservations -i ItalyAnalysis20190204.nc -o asdfasdfItalyAnalysis20190204_tmp1.nc -v 3
Parallel Ensemble Forecasts --- Forecasts to Observations v 1.0.2
Copyright (c) 2018 Weiming Hu @ GEOlab
Converting Observations to Forecasts
Reading forecast file ...
Reading Parameters from file (ItalyAnalysis20190204.nc) ...
Reading dimension (num_parameters) length ...
Warning: Optional variable (ParameterCirculars) is missing in file (ItalyAnalysis20190204.nc)!
Warning: Optional variable (ParameterWeights) is missing in file (ItalyAnalysis20190204.nc)!
Reading Stations from file (ItalyAnalysis20190204.nc) ...
Reading dimension (num_stations) length ...
Warning: Optional variable (StationNames) is missing in file (ItalyAnalysis20190204.nc)!
Error at line=166: (-51) NetCDF: Unknown file format
[... the same error line repeated 31 more times ...]
-------------------------------------------------------
Child job 2 terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
^C
--------------------------------------------------------------------------
(null) detected that one or more processes exited with non-zero status, thus causing
the job to be terminated. The first process to do so was:

  Process name: [[36550,2],17]
  Exit code:    205
--------------------------------------------------------------------------

Issue with RAnEn installation

I have the following errors while installing the RAnEn package on Linux.

> install.packages('/users/yyang/code/R/script/AnalogsEnsemble-master/RAnalogs/releases/RAnEn_latest.tar.gz',repos=NULL)

Installing package into ‘/users/yyang/code/R/library’
(as ‘lib’ is unspecified)
* installing *source* package ‘RAnEn’ ...
Checking whether R_HOME is already set? R_HOME = /opt/R/R-3.5.3-intel2017-icc-ifort/lib64/R
checking whether the C++ compiler works... yes
checking for C++ compiler default output file name... a.out
checking for suffix of executables... 
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C++ compiler... yes
checking whether icpc -std=gnu++11 accepts -g... yes
checking for icpc -std=gnu++11 option to support OpenMP... -fopenmp
configure: creating ./config.status
config.status: creating src/Makevars
** libs
icpc -std=gnu++11 -I"/opt/R/R-3.5.3-intel2017-icc-ifort/lib64/R/include" -DNDEBUG  -I"/users/yyang/code/R/library/Rcpp/include" -I"/users/yyang/code/R/library/BH/include" -I/usr/local/include  -fopenmp -fpic  -g -O2 -c AnEn.cpp -o AnEn.o
In file included from /users/yyang/code/R/library/BH/include/boost/noncopyable.hpp(15),
                 from /users/yyang/code/R/library/BH/include/boost/multi_index/detail/auto_space.hpp(20),
                 from /users/yyang/code/R/library/BH/include/boost/multi_index/detail/rnd_index_ptr_array.hpp(19),
                 from /users/yyang/code/R/library/BH/include/boost/multi_index/detail/rnd_index_ops.hpp(18),
                 from /users/yyang/code/R/library/BH/include/boost/multi_index/random_access_index.hpp(35),
                 from Stations.h(13),
                 from Forecasts.h(16),
                 from Analogs.h(12),
                 from Functions.h(11),
                 from AnEn.h(11),
                 from AnEn.cpp(8):
/users/yyang/code/R/library/BH/include/boost/core/noncopyable.hpp(42): error: defaulted default constructor cannot be constexpr because the corresponding implicitly declared default constructor would not be constexpr
        BOOST_CONSTEXPR noncopyable() = default;
                        ^

compilation aborted for AnEn.cpp (code 2)
make: *** [AnEn.o] Error 2
ERROR: compilation failed for package ‘RAnEn’
* removing ‘/users/yyang/code/R/library/RAnEn’
Warning message:
In install.packages("/users/yyang/code/R/script/AnalogsEnsemble-master/RAnalogs/releases/RAnEn_latest.tar.gz",  :
  installation of package ‘/users/yyang/code/R/script/AnalogsEnsemble-master/RAnalogs/releases/RAnEn_latest.tar.gz’ had non-zero exit status

RAnEn Help Documentation

I updated to the latest version of RAnEn (using R version 3.6), and when I try to open the package documentation through the normal means (?RAnEn or ??RAnEn), I receive an error. Specifically, ?RAnEn prints "Error in fetch(key) : lazy-load database '/usr/local/lib/R/3.6/site-library/RAnEn/help/RAnEn.rdb' is corrupt", and ??RAnEn returns "No results found". Suggestions? Thanks!

The number of similarities in the result

I noticed that, although by default twice as many similarity entries as ensemble members should be kept, the actual result keeps only as many entries as there are ensemble members. This should be fixed.
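
A quick way to check this on a result is to compare the length of the similarity entry dimension against the number of ensemble members. The sketch below is an illustration under several assumptions. check_similarity_count is a hypothetical helper and the similarity array is synthetic. The position of the entry dimension (the fourth) is inferred from how similarity is indexed elsewhere on this page, not from the RAnEn documentation.

```r
# Hypothetical helper, not part of RAnEn: compare the number of similarity
# entries kept with the expected multiple of the ensemble size.
check_similarity_count <- function(similarity, num_members,
                                   entry_dim = 4, factor = 2) {
  kept     <- dim(similarity)[entry_dim]
  expected <- factor * num_members
  list(kept = kept, expected = expected, ok = isTRUE(kept == expected))
}

num_members <- 21

# Synthetic array standing in for a result that kept too few entries
sim <- array(NA_real_, dim = c(2, 1, 3, num_members, 3))
res <- check_similarity_count(sim, num_members)
print(res)
```

With 21 members, the check expects 42 entries, so an array that keeps only 21 is flagged.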

Question: Independent Search file size and IS vs SSE computing time

These are just questions/observations and I am curious about your thoughts. No hurry and no worries if you don't have time to respond.

  1. Independent Search file size: Under the new RAnEn v3.2.5, the Independent Search output file is around 1.9 MB for analogs generated for one day, whereas under earlier versions it was around 4 MB. (The AnEn IS file I am using for comparison was computed on 20190129 under the most current version at that time; I think it was pre-3.2.x, possibly 3.1.x.) I have compared the files and both seem to have reasonable analog results (nothing missing, etc.). Beyond what is listed on the "Changelog" page or the "Issues" page, can you let me know if you changed something that makes the file smaller? I am just curious.

  2. Computing time: The IS run took a little over an hour to generate analogs for 365 days, while the SSE run took about 14.75 hours for the same 365 days. Does this seem appropriate to you? It seems like the code may be running faster now, but I do not have numerical evidence. Just curious if you posted on this or had any thoughts to share.
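
For a rough sense of scale, the totals above can be converted to per-day costs. This is only arithmetic on the reported numbers. It treats "a little over an hour" as exactly 1 hour, which is an assumption, and per_day_seconds is a hypothetical helper.

```r
# Hypothetical helper: average wall-clock seconds spent per test day.
per_day_seconds <- function(total_hours, n_days) {
  total_hours * 3600 / n_days
}

is_per_day  <- per_day_seconds(1, 365)      # IS: assumed to be exactly 1 hour
sse_per_day <- per_day_seconds(14.75, 365)  # SSE: 14.75 hours as reported

cat(sprintf("IS:  %.1f s per day\n", is_per_day))
cat(sprintf("SSE: %.1f s per day\n", sse_per_day))
cat(sprintf("SSE / IS: %.2f\n", sse_per_day / is_per_day))
```

Under that assumption, SSE costs roughly 145 seconds per test day against roughly 10 seconds for IS.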

Thanks!

Just in case you want to see the code used to configure/execute the AnEnIS and AnEnSSE.

# Objective:  Code to generate the analogs for various scenarios/cases. 
#             Generate AnEnIS output, AnEnSSE output, and configuration files for each.  
# Author:         Laura Clemente-Harding
# Collaborators:  Guido Cervone, Weiming Hu
# Note! Weiming Hu is the author of the original RAnEn package. 

# Load libraries, source functions file
library(ncdf4); library(RAnEn)
source('~/geolab/projects/AnalogsSpatial/code/SSE_functions.R', echo=TRUE)

# Load ECMWF forecasts, analysis field, coordinates, calculate WS and WD from U and V components
source('~/geolab/projects/AnalogsSpatial/code/SSE_loadBasicData.R', echo=TRUE)

# Generation Options 

generateAnEnSSE   <- FALSE # If TRUE, generates AnEnSSE
generateAnEnIS    <- TRUE  # If TRUE, generates AnEnIS
currentDate       <- "20190301"
# Save path and directory 
# savePath          <- "~/geolab_storage_V3/data/Analogs/AnEn-SSE/"   # 
savePath          <- "/Volumes/blackeye/geolab_storage_V3/data/Analogs/AnEn-SSE/"
saveDir           <- "operational_TEMP_20190301/"
operationalCase   <- T   # If TRUE, then it activates the operational case in the config generation files below
# operational       <- TRUE  # By default this is FALSE. 


# ANEN PARAMETERS
# Define variables here: 
members.size       <- 21
# Choose variable to be predicted (predictand) 
predictandParam   <- 3  # parameter 1 is WD; param 2 is WS; param 3 is Temperature
stations.ID       <- 1:2499 
weights           <- rep(1, dim(fcst.aligned)[1])     # <- c(1,1,0,0,0) #  "wdir","ws","2T","2DPT","MSLP"
verbosity         <- 3
preserve_mapping  <- TRUE 
extObs            <- FALSE 
test.start <- 730 # once looping, don't need to define this here
# test.end   <- 730 # once looping, don't need to define this here
search.start <- 1
search.end <- 730
AnEnGen.startDate <- test.start
AnEnGen.endDate   <- 1095 # end of the testing period, all the days to generate information for 


xs <- as.numeric(lon)
ys <- as.numeric(lat)
nx <- 51; ny <- 49   # Preset for Italy dataset 

# icounter1 <- 733 
# Generate AnEn output, one day at a time. 
for ( icounter1 in AnEnGen.startDate:AnEnGen.endDate ) {  # Now, 730; Eventually do this through to 1095 for the 3rd year. So this is training on the first year and testing on the second two years 
  print(paste("Generating AnEn for ", icounter1))
   test.start <- icounter1
  # test.end <- test.start# Generate the analogs for one day at a time. 
  
  # A sampling of 500 stations 
  # stations.ID <- stations_ID <- sort(sample(1:2499,500))
   if ( generateAnEnSSE ){
    config                     <- generateConfiguration('extendedSearch')
    config$observation_id      <- predictandParam
    config$forecasts           <- fcst.aligned   # changed    dim(fcst.aligned)=>  5 2499 1095    8
    config$forecast_times      <- fcst.times # NEW   
    config$flts                <- flts.subset# Want this to match with config$search_forecasts  # NEW-ISH bc search_flts is now flts 
                                  # 14Feb - So this no longer needs to be subset, 
                                  #    but we still subset it simply so it doesn't take the program as long (give it less to search through) 
    config$search_observations <- obsv  # search_observations[parameter, stations, time]
                                   # can constrain this to save time 
    config$observation_times   <- obsv.times
    config$num_members         <- members.size
    config$weights             <- weights
    config$forecast_stations_x <- xs[stations.ID]
    config$forecast_stations_y <- ys[stations.ID]
    config$verbose             <- verbosity
    config$extend_observations <- extObs 
    # Set up test times to be compared
    config$test_times_compare   <- config$forecast_times[test.start] # One single point in time that the comparing starts from? 
    config$search_times_compare <- config$forecast_times[search.start:search.end]  # This means nothing if operational is changed to TRUE. However, operational is FALSE by default. 
    # Specific to SSE 
    config$preserve_search_stations  <- T  # Tells you what the search stations are that you're looking into 
    config$preserve_similarity       <- T  # Tells you which stations were most similar 
    config$num_nearest               <- 8  # This is an option you can vary
    config$max_num_search_stations   <- 10  # Can decrease this if you want. # Can set this to be the same as number of nearest if/when using # nearest.
    # config$max_num_search_stations <- config$num_nearest 
    # config$distance <- 1
    
    # if ( operationalCase == TRUE ){
    #   config$operational  <- operationalCase
    #   # Additional parameters? 
    # }
    # 
    # Validate first before using 
    if ( validateConfiguration(config) == FALSE ){
      stop("Stop Program: Validation Failed ")
    }
    
    # w/ search extension
    AnEn <- generateAnalogs(config)
    
    # Save the AnEn 
    fname.SSE <- print(paste("AnEn_SSE_nn-",config$num_nearest, "_NumEns_", config$num_members,"_train_",search.start,"-",
                             search.end,"_testday_",test.start,"_op",operationalCase,sep=""))
    save(AnEn, file = print(paste(savePath,saveDir,fname.SSE,".Rdata", sep = "")))
    
    if ( generateAnEnSSE && test.start == 731 ){
      save(config, AnEn, file= print(paste(savePath,"config_AnEnSSE_",currentDate,"_day_",test.start,".Rdata", sep="")) )
    }
    rm(AnEn,fname.SSE)
  } # Ends if statement for AnEnSSE 
  
  
  # AnEn IS ( without search space extension )
  if ( generateAnEnIS ){
    config2                     <- generateConfiguration('independentSearch')
    config2$observation_id      <- predictandParam
    config2$forecasts           <- fcst.aligned   # changed    dim(fcst.aligned)=>  5 2499 1095    8
    config2$forecast_times      <- fcst.times # NEW   
    config2$flts                <- flts.subset# Want this to match with config$search_forecasts  # NEW-ISH bc search_flts is now flts 
    # 14Feb - So this no longer needs to be subset, 
    #    but we still subset it simply so it doesn't take the program as long (give it less to search through) 
    config2$search_observations <- obsv  # search_observations[parameter, stations, time]
    # can constrain this to save time 
    config2$observation_times   <- obsv.times
    config2$num_members         <- members.size
    config2$weights             <- weights
    config2$forecast_stations_x <- xs[stations.ID]
    config2$forecast_stations_y <- ys[stations.ID]
    config2$verbose             <- verbosity
    config2$extend_observations <- extObs 
    # Set up test times to be compared
    config2$test_times_compare   <- config2$forecast_times[test.start]
    config2$search_times_compare <- config2$forecast_times[search.start:search.end]  # This means nothing if operational is changed to TRUE. However, operational is FALSE by default. 
    
    config2$preserve_mapping    <- preserve_mapping
    config2$verbose             <- verbosity
    
    # Validate first before using 
    validateConfiguration(config2)
    # Generate analogs 
    AnEn.ind <- generateAnalogs(config2)
    
    fname.nonSSE <- print(paste("AnEn_IS_NumEns_", config2$num_members,"_train_",search.start,"-",
                                search.end,"_testday_",test.start,"_op",operationalCase, sep=""))
    save(AnEn.ind, file = print(paste(savePath,saveDir,fname.nonSSE,".Rdata", sep = "")))
    
    
    # Save configuration file for whichever AnEn (IS or SSE) was generated 
    
    if ( generateAnEnIS && test.start == 731 ){
      save(config2, AnEn.ind, file = print(paste(savePath,"config2_AnEnIS_",currentDate,"_day_",test.start,".Rdata", sep="")) )
    }
    
    rm(AnEn.ind,fname.nonSSE)
  } # Ends T/F for AnEn IS generation 
   
  
} # End of anen generation calculator 

Different results from RAnEn when subsetting stations

When IS is used, the AnEn results for each station should be independent of the other stations. However, when only a portion of the stations is computed, the results for those stations differ from the results obtained when all stations are computed at once.

For example, the left three columns come from computing all stations, and the right three columns come from computing a subset of stations. This might be caused by the handling of NA values.

> cbind(AnEn.all$similarity[5, 2, 3, order(AnEn.all$similarity[5, 2, 3, , 3]), ], AnEn$similarity[5, 2, 3, order(AnEn$similarity[5, 2, 3, ,3]),])[1:20, ]
          [,1] [,2] [,3]     [,4] [,5] [,6]
 [1,]      NaN    5    1      NaN    5    1
 [2,] 7.212896    5    2      NaN    5    2
 [3,] 2.776645    5    3      NaN    5    3
 [4,]      NaN    5    4      NaN    5    4
 [5,] 3.268617    5    5 3.268617    5    5
 [6,] 1.744200    5    6 1.744200    5    6
 [7,] 4.941840    5    7 4.941840    5    7
 [8,] 4.340009    5    8 4.340009    5    8
 [9,]      NaN    5    9 3.855888    5    9
[10,]      NaN    5   10 5.344926    5   10
[11,] 5.490306    5   11 5.490306    5   11
[12,] 3.994862    5   12 3.994862    5   12
[13,] 3.262656    5   13      NaN    5   13
[14,] 3.588668    5   14      NaN    5   14
[15,]      NaN    5   15      NaN    5   15
[16,]      NaN    5   16      NaN    5   16
[17,] 3.961612    5   17      NaN    5   17
[18,] 3.799170    5   18      NaN    5   18
[19,] 3.386876    5   19      NaN    5   19
[20,] 4.393606    5   20      NaN    5   20
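
To narrow down where the two runs diverge, it can help to separate differences in the NaN pattern from differences in the numeric values. The sketch below applies a hypothetical helper, compare_similarity, to the first six rows of the first and fourth columns of the table above. It is a diagnostic illustration, not part of RAnEn.

```r
# Hypothetical helper: report where two similarity vectors disagree on
# missingness, and whether the values agree where both are present.
compare_similarity <- function(full, partial, tol = 1e-8) {
  stopifnot(length(full) == length(partial))
  na_full    <- is.na(full)     # is.na() is TRUE for both NA and NaN
  na_partial <- is.na(partial)
  both       <- !na_full & !na_partial
  list(
    na_mismatch  = which(na_full != na_partial),
    values_equal = all(abs(full[both] - partial[both]) <= tol)
  )
}

# First six rows of columns 1 (all stations) and 4 (partial stations)
full    <- c(NaN, 7.212896, 2.776645, NaN, 3.268617, 1.744200)
partial <- c(NaN, NaN, NaN, NaN, 3.268617, 1.744200)

res <- compare_similarity(full, partial)
print(res$na_mismatch)   # rows where only one run produced NaN
print(res$values_equal)  # TRUE if the shared values agree within tol
```

For these rows the shared values agree and only the NaN pattern differs, which is consistent with the suspicion that the discrepancy comes from how missing values are handled.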

AnEn + MPI + TAU program aborted due to error

I compiled the code with the following command.

CC=tau_cc.sh CXX=tau_cxx.sh cmake -DENABLE_MPI=ON -DCMAKE_PREFIX_PATH=/home/graduate/wuh20/packages/release/ -DBOOST_TYPE=SYSTEM -DCMAKE_BUILD_TYPE=Debug ..
make -j 16

I encountered the following error.

OMP_NUM_THREADS=3 mpirun -np 1 /home/graduate/wuh20/github/AnalogsEnsemble/output/bin/standardDeviationCalculator -v 6 -i /home/graduate/wuh20/exfat-hu/Data/2019_Hu_AnEn-bias-correction/forecasts/201712.nc /home/graduate/wuh20/exfat-hu/Data/2019_Hu_AnEn-bias-correction/forecasts/201801.nc -o ~/exfat-hu/Data/2019_Hu_AnEn-bias-correction/sds/sds-0001.nc --start 0 0 0 0 0 0 0 0 --count 17 100 31 53 17 100 31 53

Parallel Ensemble Forecasts --- Standard Deviation Calculator v 3.2.1
Copyright (c) 2018 Weiming Hu @ GEOlab
Input parameters:
in_files: /home/graduate/wuh20/exfat-hu/Data/2019_Hu_AnEn-bias-correction/forecasts/201712.nc,/home/graduate/wuh20/exfat-hu/Data/2019_Hu_AnEn-bias-correction/forecasts/201801.nc,
out_file: /home/graduate/wuh20/exfat-hu/Data/2019_Hu_AnEn-bias-correction/sds/sds-0001.nc
verbose: 6
config_file: 
start: 0,0,0,0,0,0,0,0,
count: 17,100,31,53,17,100,31,53,
Checking mode ...
Checking file (/home/graduate/wuh20/exfat-hu/Data/2019_Hu_AnEn-bias-correction/sds/sds-0001.nc) ...
Combining forecasts along the time dimension...
Checking mode ...
Checking file (/home/graduate/wuh20/exfat-hu/Data/2019_Hu_AnEn-bias-correction/forecasts/201712.nc) ...
Checking file type (Forecasts) ...
Checking dimension (num_parameters) ...
Checking dimension (num_stations) ...
Checking dimension (num_times) ...
Checking dimension (num_flts) ...
Checking dimension (num_chars) ...
Checking variable (Data) ...
Checking variable (FLTs) ...
Checking variable (Times) ...
Checking variable (ParameterNames) ...
Checking variable (Xs) ...
Checking variable (Ys) ...
Processing partial meta information ...
Reading Parameters from file (/home/graduate/wuh20/exfat-hu/Data/2019_Hu_AnEn-bias-correction/forecasts/201712.nc) ...
Reading dimension (num_parameters) length ...
Checking variable (ParameterCirculars) ...
Checking variable (ParameterWeights) ...
Reading Stations from file (/home/graduate/wuh20/exfat-hu/Data/2019_Hu_AnEn-bias-correction/forecasts/201712.nc) ...
Reading dimension (num_stations) length ...
Spawning 3 processes to read StationNames ...
Broadcasting variables ...
Child rank #0 received from the parent's broadcast ...
Child rank #1 received from the parent's broadcast ...
Child rank #2 received from the parent's broadcast ...
Child rank #0 reading StationNames with start/count ( 0,33 0,50 ) ...
Child rank #2 reading StationNames with start/count ( 66,34 0,50 ) ...
Child rank #1 reading StationNames with start/count ( 33,33 0,50 ) ...
Parent waiting to gather data from processes ...
Rank #0 sending data (1650) back to the parent ...
Rank #2 sending data (1700) back to the parent ...
Rank #1 sending data (1650) back to the parent ...
[sapphire:02637] *** Process received signal ***
[sapphire:02637] Signal: Segmentation fault (11)
[sapphire:02637] Signal code: Address not mapped (1)
[sapphire:02637] Failing at address: (nil)
[sapphire:02637] [ 0] /lib/x86_64-linux-gnu/libpthread.so.0(+0x1288f)[0x7f93c43cf88f]
[sapphire:02637] [ 1] mpiAnEnIO(MPI_Gatherv+0x120)[0x56347a6ffc00]
[sapphire:02637] [ 2] mpiAnEnIO(main+0x10ef)[0x56347a63ab1d]
[sapphire:02637] [ 3] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe6)[0x7f93c3fedb96]
[sapphire:02637] [ 4] mpiAnEnIO(_start+0x29)[0x56347a638f09]
[sapphire:02637] *** End of error message ***
Reading Times from file (/home/graduate/wuh20/exfat-hu/Data/2019_Hu_AnEn-bias-correction/forecasts/201712.nc)   ...
Reading dimension (num_times) length ...
Reading FLTs from file (/home/graduate/wuh20/exfat-hu/Data/2019_Hu_AnEn-bias-correction/forecasts/201712.nc) ...
Reading dimension (num_flts) length ...
Combining times ...
...
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node sapphire exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
