Giter Club home page Giter Club logo

movement's People

Contributors

andrew-schofield avatar cjelandm avatar goldingn avatar kathrintessella avatar mariabnd avatar talexperkins avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar

movement's Issues

Enforce the integer values for the movement_matrix in the movement() method

Running the example for movement, the step predicting form a flux model issues the following warning:

predictedMovement  <- predict(originalRadiation(theta = 0.1), locationData, symmetric = TRUE)
Warning message:
In as.movement_matrix.matrix(prediction$prediction) :
  The given movement_matix contains non-integer values. Rounding was used to return a valid movement_matrix object.

Whilst we want integer predictions in this cases, in general, predictions from a flux or movement_model object should not be coerced to integer. The following should therefore work and not issue any warnings (note changes to object names):

data(kenya)
kenya10 <- raster::aggregate(kenya, 10, sum)
net <- getNetwork(kenya10, min = 50000)
location_data <- data.frame(location = net$locations, population = net$population, x = net$coordinate[,1], y = net$coordinate[,2])
class(location_data) <- c("location_dataframe", "data.frame")

# simulate movements (note the values of movementmatrix must be integer)
predicted_movement  <- predict(originalRadiation(theta = 0.1), location_data, symmetric = TRUE)
movement_matrix <- as.movement_matrix(round(predicted_movement$movement_matrix))

# fit a new model to these data
movement_model <- movement(movement_matrix ~ location_data, radiationWithSelection(theta = 0.5))
# print movement_model
print(movement_model)
# predict the population movements
predicted_movements  <- predict(movement_model, kenya10)
# display the predicted movements
showprediction(predicted_movements)

notify the user that standard errors are on the transformed scale

In print.summary.movement_model we (should, #108) report the estimated coefficients, and the estimated standard errors. Whilst parameters are reported on the 'true' scale, as in the equations, the standard errors are correspond to parameters estimates on the continuous scale we optimise on, and it only makes sense to report these.

We need to add something to the docs for summary.movement_model explaining this, and edit the printed summary to report the parameters on the trues scale, on the transformed scale, and the standard errors on the transformed scale.

Handling line endings

It looks like we have the settings of the individual git's different configured and working on different OS (Windows vs Mac) causing various merging conflicts due to line endings.

Reading a bit how to best address this issue, I came across the suggestions to add a '.gitattributes' file to the repos to ensure consistent handling of line endings (see https://help.github.com/articles/dealing-with-line-endings/). And here, https://github.com/Danimoth/gitattributes, is a selection of
'gitattributes' templates for various languages incl. R which we might can use as a starting point.

@goldingn Lets have a call today to discuss.

plot method for output of movement

The object returned by the movement function (which is undergoing renaming so I won't be more specific) should have a plot method.

PR #60 provides such functionality, given vectors of distances, observed movements and predicted movements

Check that column/row names are set when generating a movement_matrix

When creating the movement_matrix ensure that the column & row names are meaningful in respect of their location.

We could then also check that these line up with the 'location_dataframe' locations during model fitting which would be safer.

We can issue a warning or an error if something is coerced into a movement_matrix without row and column names.

define plot method for class movement_predictions created with no raster

Using the version of the movement example code here, the following does not plot predictions in a sensible way:

predicted_movements2  <- predict(movement_model, location_data)
class(predicted_movements2)
showprediction(predicted_movements2)

screen shot 2015-11-02 at 11 55 54

The following code in show.prediction should be sufficient in this case (unneccessary raster version commented out), so some handling of the input datatype to select the plot method should be all that is needed:

# sp::plot(raster_layer, ...)
size <- 0.1 + 2 * network$population/max(network$population)
# points(network$coordinates, pch = 16, cex = size, col = rgb(0, 0, 1, 0.6))
plot(network$coordinates, pch = 16, cex = size, col = rgb(0, 0, 1, 0.6), asp = 1)
n <- nrow(network$coordinates)
for (i in 2:n) {
  for (j in (i - 1):n) {
    arrows(network$coordinates[i, 1],
           network$coordinates[i, 2],
           network$coordinates[j, 1],
           network$coordinates[j, 2],
           lwd = 4,
           length = 0,
           col = rgb(0, 0, 1, predictedMovements[i, j] / (max(predictedMovements) + 0)))
  }
}

note also that the current first line:

sp::plot(raster::raster(network$distance_matrix))

should not be in there!

pass CRAN checks without errors, warnings or notes

I'd like to submit the package to CRAN before the end of next week. They'll only accept it passes R CMD check --as-cran on their machines on the previous, stable and development versions of R with no warnings, errors or significant* notes. Currently, Travis reports two notes and two warnings on R-devel. We'll need to fix these before submission. I've pasted the current R-devel problems below and suggested fixes. We'll also need to eyeball the outputs of the other Travis runs for issues, do a test run locally which builds the manual, and run checks on Windows.

*significance is often arbitrarily determined


* checking dependencies in R code ... WARNING
'::' or ':::' import not declared from: ‘viridis’

Adding #' importFrom viridis viridis to the package's roxygen blurb

* checking R code for possible problems ... NOTE
densityCompare: no visible global function definition for ‘density’
densityCompare: no visible global function definition for ‘grey’
densityCompare: no visible global function definition for ‘lines’
distanceDistPlot: no visible global function definition for ‘grey’
plotComparePredictions: no visible global function definition for ‘cor’
plotComparePredictions: no visible global function definition for
  ‘title’
scatter: no visible global function definition for ‘smoothScatter’
scatter: no visible global function definition for ‘grey’
unitTransform: no visible global function definition for ‘plogis’
unitTransform: no visible global function definition for ‘qlogis’
Undefined global functions or variables:
  cor density grey lines plogis qlogis smoothScatter title
Consider adding
  importFrom("graphics", "lines", "smoothScatter", "title")
  importFrom("grDevices", "grey")
  importFrom("stats", "cor", "density", "plogis", "qlogis")
to your NAMESPACE.

Add the suggested imports as #' importFrom x y in the package roxygen docs (if written in NAMESPACE directly they'll be overwritten)

* checking Rd line widths ... NOTE
Rd file 'movement.Rd':
  \examples lines wider than 100 characters:
     locationData <- data.frame(location = net$locations, population = net$population, x = net$coordinate[,1], y = net$coordinate[,2])
Rd file 'plot.movement_predictions.Rd':
  \examples lines wider than 100 characters:
     locationData <- data.frame(location = net$locations, population = net$population, x = net$coordinate[,1], y = net$coordinate[,2])
Rd file 'predict.movement_model.Rd':
  \examples lines wider than 100 characters:
     locationData <- data.frame(location = net$locations, population = net$population, x = net$coordinate[,1], y = net$coordinate[,2])
These lines will be truncated in the PDF manual.

Add some newlines in these examples

* checking Rd cross-references ... WARNING
Missing link or links in documentation object 'travelTime.Rd':
  ‘[gdistance]{adjacent}’
See section 'Cross-references' in the 'Writing R Extensions' manual.

I think that just needs to be gDistance

There are also examples which take >5s to run (bad from CRAN's perspective, though weirdly not a warning):

* checking examples ... OK
Examples with CPU or elapsed time > 5s
                             user system elapsed
movement                  106.159  0.155 106.553
plot.movement_predictions 102.005  0.111 102.178
predict.movement_model     97.064  0.099  97.268
kenya                       5.396  0.102   5.502

These can be fixed quickly by adding \dontrun{} around the examples in the roxygen blurb. See here for a working example.

fix tests using predict methods

Some of the tests explicitly called predict.*() functions. Now that they're properly registered as S3 methods these won't work.

For some cases it was just a case of switching to predict(), but for tests which use with_mock they'll need to be re-written. I've commented them out for now to pass tests, in bf106ed

Link to predict() in document does not link to movement.predict() method

The link in the movement() method for the predict() method opens the help for the associates predict() methods in the packages 'Spatial model predictions' and 'Model Predictions', but not the relevant predict.movementmodel method

Do we need to implement a
predict <- function(object,...){
UseMethod("....")
}
similar to the showprediction() method? Or is there another way how to correctly link to the right predict() method?
Same would apply for the 'predict.optimisedmodel' method.

Check through all flux functions

Make sure that the docs, parameter names, flux functions, and references all line up and that all parameters are explained in the docs

  • gravity
  • gravity.with.distance
  • intervening.opportunities
  • original radiation
  • radiation.with.selection
  • uniform.selection

Define new predict.movementpredictions() based on showprediction.movementmodel()

Given the new interface, I think we should define a class movementpredictions for the output of the predict method for 'optimisedmodel' objects (the output of class to movement).

This could then be the predict function for the movementpredictions class, with plotting of the underlying raster removed. I.e. only the coordinates of the locations in the object 'newdatapassed topredict.optimisedmodel`, and the lines between them.

Great circle distances

Include option to calculate great circle (or orthodromic) distances between location points.

Should get.network.fromdataframe() expect a locationdataframe object?

The get.network.fromdataframe() function is used within the 'predict.movementmodel' and therefore within the various predict functions of the package.

Currently, for this function to work, the data.frame has to have the following column names: 'pop_origin' (for population), 'origin' (for locations), 'long_origin' (for long = x) and 'lat_origin' (for lat = y). If this function is currently called with the locationdataframe object (having columns 'location', 'population, 'x' and 'y'), this function will generate an error.

Either we need to clarify this restriction in the documentation, or we want to rewrite the code to accept a locationdataframe object instead? The latter required careful checking of the code where the change of the column names can have knock-on effects.

@goldingn : what do you think?

non-NA diagonal elements cause an error

fitting a movement model (at least one using radiationWithSelection()) to an observed connectivity matrix with 0s on the diagonal causes the optimiser to fail on the starting parameters, presumably because the radiation model evaluated on distance 0 is nonsense and return a non-finite value.

movement should catch this and handle it nicely

remove exported movementmodel function and class

The confusingly named function movementmodel (which returns an object of class movementmodel) uses a named model to make predictions to a RasterLayer object, given a named flux model and object. This function is exported to the user and used in various documentation examples.

The main model-fitting function movement (which returns an object of class optimisedmodel) already handles prediction to raster data (by calling predict.movementmodel).

We should unexport the function movementmodel and functions which operate on the class movementmodel and make sure all of its functionality is incorporated in movement. This will likely break some things and require changes to the existing examples.

Warning: declared S3 method 'predict.default' not found

R CMD check --as-cran issues the following warning

* checking S3 generic/method consistency ... WARNING
Warning: declared S3 method 'predict.default' not found
as.locationdataframe:
  function(input, ...)
as.locationdataframe.SpatialPolygonsDataFrame:
  function(input, populationraster)

summary:
  function(object, ...)
summary.optimisedmodel:
  function(x)

predict:
  function(object, ...)
predict.movementmodel:
  function(predictionModel, dataframe, ...)

predict:
  function(object, ...)
predict.optimisedmodel:
  function(predictionModel, dataframe, ...)

predict:
  function(object, ...)
predict.movementmodel:
  function(predictionModel, dataframe, ...)

predict:
  function(object, ...)
predict.optimisedmodel:
  function(predictionModel, dataframe, ...)

See section ‘Generic functions and methods’ in the ‘Writing R
Extensions’ manual.

I'm not sure what the expected set up should be for S3 classes, but apparently this isn't it...

as.location_dataframe.dataframe should take fewer arguments

Currently the user has to format their locations data into pairs of to-from locations, just so that as.location_dataframe.dataframe can flatten them back to one location per row.

Instead, as.location_dataframe.dataframe should take a correctly formatted dataframe with location IDs, populations and x and y coordinates, and just apply the class label. The docs should be updated accordingly

unexport a bunch of functions

Currently loads of functions are exported which are unlikely to ever be needed by the user. We need to identify the core functions and then hide some of the ancillary functions.

Add model details to docs for flux functions

There are equations, descriptions and references for most of these flux functions here (and code here)

These should be added to the flux function docs for the corresponding models, later to be moved into the docs for flux objects, as described in #39.

These should also list the constraints on each parameter, using the constraints info e.g. here

Add parallel prediction

The sequential for loop in movement.predict https://github.com/SEEG-Oxford/movement/blob/master/R/movement.R#L1478-L1511 should be parallelisable.

We should:

1) expose a parallel argument to the user so that they can choose whether to run prediction in parallel:

predict(obj, parallel = TRUE)

This can be passed up through all functions that make movement predictions (movement, predict.prediction_model, predict.flux, predict.movement_model, fittingwrapper & attemptoptimisation)

2) Provide a simple utility function to set up the parallel backend using doSnow if they want parallelism:

parallel(n = 20)

3) Make sure a progress bar is still provided. See here for how to include progress bars in a foreach parallel implementation: http://www.r-bloggers.com/updates-to-the-foreach-package-and-its-friends/


Here's a function that works for 2) (with three new importFrom calls required):

parallel <- function(n = NULL) {
  if (is.null(n)) {
    n <- parallel::detectCores()
  }
  cl <- snow::makeSOCKcluster(n)
  doSNOW::registerDoSNOW(cl)
  message(sprintf('parallel backend registered on %i cores', n))
}

fix radiation code

It's lopsided (only calculates i to j, not both or the average) because I wrote the code after a beer or two...

I'll add an option to either calculate the two-way movements (non-symmetric movement matrix) or one-way average movements (symmetric movement matrix). The latter is what I'll be sticking into statistical models and should be a bit faster to calculate, but contains less information.

visible predict function masking base and raster

On loading the package (Andrew's fork at least), I get this note:

The following object is masked from ‘package:raster’:

    predict

The following object is masked from ‘package:stats’:

    predict

it would be better not to mask these generics, perhaps predict should not be exported (fine to export predict.movementmodel etc.)

Rectangular OD Matrices

Include option to use rectangular origin-destination matrices in intervening opportunities model.

incorrect starting parameters

For many applications movement fails because the objective can't be calculated at the starting parameters, causing optim to error:

Error in optim(predictionModel$modelparams, fittingwrapper, method = "Nelder-Mead",  : 
  function cannot be evaluated at initial parameters

Some error handling here would be helpful, and more intelligent parameter transformations for the optimiser.

movement_model summary causing failure for Kenya example used in documentation

When running the documented example using the Kenya model:

# get location data
data(kenya)
kenya10 <- raster::aggregate(kenya, 10, sum)
net <- getNetwork(kenya10, min = 50000)
location_data <- data.frame(location = net$locations, population = net$population, x = net$coordinate[,1], y = net$coordinate[,2])
location_data  <- as.location_dataframe(location_data)
# simulate movements (note the values of movementmatrix must be integer)
predicted_flux  <- predict(originalRadiation(theta = 0.1), location_data, symmetric = TRUE)
movement_matrix <- predicted_flux$movement_matrix
# fit a new model to these data
movement_model <- movement(movement_matrix ~ location_data, radiationWithSelection(theta = 0.5))

the summary of the calculated model summary(movement_model ) causes an error due to the method stderrors <- sqrt(abs(diag(solve(object$optimisation_results$hessian)))).
The hessian results for this example is:

$hessian
        [,1] [,2]
[1,] 4300876    0
[2,]       0    0

and the error message printed is

Error in solve.default(movement_model$optimisation_results$hessian) : 
  Lapack routine dgesv: system is exactly singular: U[2,2] = 0

@goldingn : How best do you want to handle such a case? Perhaps we can add a tryCatch statement for the line stderrors <- sqrt(abs(diag(solve(object$optimisation_results$hessian)))) and if an error occurs, set the variable stderrors to NA and print a warning to the user instead of causing a failure of the function?

Upper and Lower bounds for optim() method might be redundant

While rewriting the movement() method, I found a redundant code section / potential bug when calling the optimization method optim().

In line 120/121 of movement.R in the movement() function body there are two different calls to the attemptoptimisation() method whereas one is comment out:
L120: #optimresults <- attemptoptimisation(predictionModel, locationdataframe, movement_matrix, progress=FALSE, hessian=TRUE, upper=upper, lower=lower, ...) #, upper=upper, lower=lower
L121: optimresults <- attemptoptimisation(predictionModel, population_data, movement_matrix, progress=FALSE, hessian=TRUE, ...) #, upper=upper, lower=lower

Using the call in line 121 (which does NOT assigns the upper and lower arguments) results in the scenario that the optim() method will use the default values for lower (=-Inf) and upper (=Inf) [see documentation of optim()] and NOT the upper / lower bounds set in the line 59..103 in the movement() method depending on the model selected. This option was in the movement.R package as I started the project.

Using the call in line 120 (which assigned the upper and lower argument to the attemptoptimisation() method) will result that the upper and lower values, as set in line 59..103 in the movement() method depending on the model selected, will be passed over to the optim() method. This will, however, cause an error, as the bounds can ONLY be used for the optimization methods = "L-BFGS-B" or "Bent" - but we use the 'BFGS" method.

Thus, if the "BFGS" method should be use for the optimization, the 'upper' and 'lower' parameters are redundant and the code should be cleaned accordingly (including the relevant tests). Furthermore, the movement() method MUST use the call
optimresults <- attemptoptimisation(predictionModel, population_data, movement_matrix, progress=FALSE, hessian=TRUE, ...)
and the other comment-out line should be removed to avoid future confusion.

If however, the upper and lower bounds should be used by the optimization model, then the code contains a bug and needs fixing.

define more user-friendly interface

currently the main model-fitting function is movement. This requires many pieces of data from the user:

movement(locations, coords, population, movement_matrix, model,  model.params = NULL, ...)

with required arguments:

  • a vector locations (unclear what this needs to be)
  • a data.frame coords of coordinates
  • a vector population of populations
  • a matrix movement_matrix of observed movements between locations
  • a string model naming a model to use

And optional arguments:

  • numeric vector model.params giving parameters corresponding to model.
  • unspecified additional parameters ...

A simpler interface would be:

movement(movement_matrix, location_dataframe, model = gravity(), optim_options = list())

possibly with glm-like function interface:

movement(movement_matrix ~ location_dataframe)

with required arguments:

  • movement_matrix being an object of class movement_matrix containing the observed movement between some or all locations in location_dataframe
  • location_dataframe being an object of class location_dataframe containing all of the information in locations, coords and population as dataframe columns, each row being a different location.

and optional parameters

  • model being a function to initialise a model of the named type. E.g. gravity() would return a list giving the name of the flux function (in this case gravity.flux) and default arguments such as starting parameters as a named numeric vector (in this case c(alpha = 1, beta1 = 0.6, beta2 = 0.3, gamma = 3)). This model initialiser and could take different arguments, e.g. gravity(params = c(0.5, 0.2, 0.1, 5)).
  • optim_options being a named list of options to pass to optim

Existing functions could be modified to help users create the location_dataframe and movement_matrix objects, the *.flux functions could be unexported and their docs replaced with more model- (rather than code-) centric docs for these initialisation functions.

Finally, we could write generic print, plot, and summary methods for the objects returned by the initialisation functions (flux objects?) and the same models with optimised parameters. e.g.:

flux <- gravity(c(0.5, 0.2, 0.1, 5))
plot(flux)

and

m <- movement(obs ~ locs)
plot(m$flux)

Thoughts on this proposed interface?

define predict method for `flux` objects

Currently the movementmodel and predict.movementmodel functions enables users to construct theoretical (i.e. not fitted to data) movement models and make predictions from them to RasterLayer objects. This will be removed to solve issue #54.

predict.optimisedmodel enables users to predict from fitted movement models to RasterLayers, and dataframes.

We should define a predict method for flux objects to make predictions to dataframes and RasterLayers from theoretical models.

The syntax would then be:

fl1 <- gravity()
predict(fl1, some_raster)
fl2 <- radiation.with.selection()
predict(fl2, some_dataframe)

summary.movement_model reports no parameters

the summary method for a movement_model reports no parameter estimates:

devtools::install_github('SEEG-Oxford/movement')
library(movement)

# get location data
data(kenya)
kenya10 <- raster::aggregate(kenya, 10, sum)
net <- getNetwork(kenya10, min = 50000)
location_data <- data.frame(location = net$locations, population = net$population, x = net$coordinate[,1], y = net$coordinate[,2])
location_data  <- as.location_dataframe(location_data)
# simulate movements (note the values of movementmatrix must be integer)
predicted_flux  <- predict(originalRadiation(theta = 0.1), location_data, symmetric = TRUE)
movement_matrix <- predicted_flux$movement_matrix
# fit a new model to these data
movement_model <- movement(movement_matrix ~ location_data, originalRadiation())
summary(movement_model)
Model:    

Deviance Residuals:   1 

No coefficients

Null Deviance:      19356153.9810511 on 18906 degrees of freedom
Residual Deviance:  1191290.81130209 on 18905 degrees of freedom
AIC:   1191292.81130209 

it also doesn't report the model type

docs try and fail to display tables

e.g. the input argument of as.locationdataframe.data.frame (ugh) reads:

input   A data.frame of the format # origin destination movement pop_origin
        pop_destination # 1 a b 10 100 88 # 2 a c 8 100 100 # 3 a d 10 100 113 # 4 a e 11 100 107 
        #5 a f 8 100 67 # lat_origin long_origin lat_destination long_destination # 0.07826932 0.13612404
        0.12114115 0.58984725 # 0.07826932 0.13612404 0.07126503 0.19544754 # 0.07826932 
        0.13612404 0.97817937 0.22771625 # 0.07826932 0.13612404 0.87233335 0.06695538 # 
        0.07826932 0.13612404 0.23157835 0.19573021

I thing \tabular is what's needed here:
https://cran.r-project.org/doc/manuals/R-exts.html#Lists-and-tables

add dots for S3 generic functions

checking as CRAN results in these warnings:

* checking S3 generic/method consistency ... WARNING
print:
  function(x, ...)
print.flux:
  function(object)

summary:
  function(object, ...)
summary.flux:
  function(object)

predict:
  function(object, ...)
predict.flux:
  function(object, locationdataframe, min_network_pop, symmetric)

See section ‘Generic functions and methods’ in the ‘Writing R
Extensions’ manual.

This is because there should be a dots (...) argument for all of these S3 methods. You can copy the argument documentation for this argument from those for ?print, ?summary and ?predict.

Error when running sample Kenya code with own raster

net <- getNetwork(nt10, min = 1)
location_data <- data.frame(location = net$locations,
                        population = net$population,
                        x = net$coordinate[,1],
                        y = net$coordinate[,2])
location_data  <- as.location_dataframe(location_data)
predicted_flux  <- predict(radiationWithSelection(theta = 0.5), location_data, symmetric =    TRUE)
movement_matrix <- round(predicted_flux$movement_matrix)
movement_model <- movement(movement_matrix ~ location_data,     radiationWithSelection(theta = 0.5))

print(movement_model)
predicted_movements  <- predict(movement_model, nt10)

I get the following error:
Error in txtProgressBar(min = 1, max = nrow(indices), style = 3) :
must have 'max' > 'min'

I am not sure what this error means and how to fix it?

movement_matrix is not a full matrix

Hello, thanks for your nice package. When using the package to predict the human movement, I came across a question and hope you can give me some advices.

For each location, we have obtained the top 100 human flow among locations, rather than the human flow among all the regions. Therefore, the movement matrix is not a full matrix, there was some missing value in the matrix. For instance, we gathered the population flow between A and B as well as B and C, but we don't know the exact human flow between A and C. If we remove the location with the missing value, the number of regions remaining in the movement matrix is quite small. In this situation, how can I use the movement package to predict the human flow between A and C?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.