hmds: An R Package for Heuristic High and Multi Dimensional Scaling

Abstract

In this document, I propose a heuristic for calculating coordinates in high dimensions. Given the similarities or distances between pairs of objects and the number of dimensions of the coordinate space, the heuristic calculates approximate coordinates in that space. It can still calculate approximate coordinates when the similarities or distances are inconsistent with a metric space. The resulting coordinates can be used in many kinds of analysis. The heuristic is provided as an R package.

Introduction

Multidimensional scaling (MDS) [@Carroll1980] is a statistical method for placing objects at coordinates. Given the similarities or distances between pairs of objects, MDS places the objects in a two- or three-dimensional coordinate space. In this package, I propose a heuristic for calculating coordinates in a high-dimensional space from the similarities or distances between pairs of objects. The heuristic calculates approximate coordinates in the number of dimensions given by the user, and it can still do so when the similarities or distances are inconsistent with a metric space. Several important methods, such as clustering [@Liu2007] and data visualization [@Ben2007], require coordinates in high dimensions.

The heuristic works as follows. First, it places the objects at random positions in the high-dimensional space; the number of dimensions is given by the user. Then the distance between each pair of objects is compared in turn with the corresponding distance in the input data. If the current distance is longer than the input distance, the two objects are moved closer together in the coordinate space. If it is shorter, they are moved farther apart. The iteration continues until the rate between the sum of the input distances and the sum of the output distances is less than the approximation rate. If that condition is never met, the program stops when the iteration limit is reached. The result is a set of approximate coordinates for all objects.
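The loop described above can be sketched in plain R. This is a minimal illustration under stated assumptions, not the package's implementation: the function name sketch_mds, the step size, the tolerance, and the error measure (sum of absolute differences between current and input distances, divided by the sum of input distances) are all chosen for the example and may differ from what hmds does internally.

```r
# Illustrative sketch only; names, step size and stopping rule are assumptions.
sketch_mds <- function(d, dim = 2, step = 0.5, tol = 0.05, max_iter = 1000) {
  n <- nrow(d)
  x <- matrix(runif(n * dim), nrow = n)        # random starting coordinates
  for (iter in seq_len(max_iter)) {
    for (i in seq_len(n - 1)) {
      for (j in (i + 1):n) {
        delta <- x[j, ] - x[i, ]               # vector from object i to object j
        cur <- sqrt(sum(delta^2))              # current distance of the pair
        if (cur == 0) next                     # no direction to move along
        # Positive when the pair is too far apart, negative when too close.
        shift <- step * (cur - d[i, j]) / 2 * delta / cur
        x[i, ] <- x[i, ] + shift               # move i toward (or away from) j
        x[j, ] <- x[j, ] - shift               # move j toward (or away from) i
      }
    }
    # Relative mismatch between current and input distances (assumed measure).
    err <- sum(abs(as.matrix(dist(x)) - d)) / sum(d)
    if (err < tol) break                       # close enough: stop early
  }
  list(coords = x, err = err, iterations = iter)
}

# Usage: three objects whose input distances form a 3-4-5 triangle.
set.seed(42)
d <- matrix(c(0, 3, 4,
              3, 0, 5,
              4, 5, 0), nrow = 3, byrow = TRUE)
res <- sketch_mds(d, dim = 2)
print(res$coords)
```

Each pair is corrected by only a fraction of its mismatch (step), so later pairs in the same sweep do not undo earlier corrections completely; the iteration cap plays the role of the iteration limit described above.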

Installation

To install from GitHub, you can use devtools with the commands:

> library(devtools)
> install_github("jirotubuyaki/hmds")

Once the package is installed, make it accessible to the current R session with the command:

> library(hmds)

For online help or the details of a particular command (such as the function hmds), you can type:

> help(package="hmds")

Method

This package has only one function. It is executed by:

> output <- hmds(data = input, dim=20, approx=1.2, itera=10000)

The arguments are:

  • data is a numeric symmetric matrix of input data. It describes the similarities or distances between pairs of objects.
  • dim is the number of dimensions of the coordinate space.
  • approx is the approximation rate between the sum of the input distances and the sum of the output distances. If the rate between input and output is less than the approximation rate, iteration halts.
  • itera is the maximum number of iterations used to move points in the coordinate space.

The return value is:

  • output is a numeric matrix of points in the coordinate space. Rows are objects and columns are dimensions.
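As a hedged illustration of the arguments above, the following sketch builds a valid input (a symmetric numeric matrix of pairwise distances) from made-up points and calls hmds only if the package is installed. The test points and the final ratio check are assumptions added for the example; the argument names are taken from the call shown earlier.

```r
# Build a valid input: a symmetric numeric matrix of pairwise distances.
set.seed(1)
points <- matrix(rnorm(5 * 3), nrow = 5)   # 5 made-up objects in 3 dimensions
input <- as.matrix(dist(points))           # Euclidean distances, zero diagonal
stopifnot(is.numeric(input), isSymmetric(input))

# Call hmds() only when the package is available in this R installation.
if (requireNamespace("hmds", quietly = TRUE)) {
  output <- hmds::hmds(data = input, dim = 3, approx = 1.2, itera = 10000)
  # Rows are objects and columns are dimensions, so distances between the
  # returned points can be recomputed and compared with the input distances.
  print(sum(dist(output)) / sum(dist(points)))
}
```

Recomputing distances from the returned matrix is one simple way to judge how closely the output reproduces the input.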

Data

This package includes a sample dataset containing a matrix of similarities between pairs of points. The dataset was generated with R. You can list the available data and load the dataset named "similarity" like this:

> data(package="hmds")
> data(similarity)

Conclusions

This document described a heuristic for multidimensional scaling and explained how to use it. The package can produce approximate coordinates in high dimensions. Several improvements are planned. Please send suggestions and bug reports to [email protected].

Acknowledgments

This activity would not have been possible without the support of my family and friends. To my family, thank you for all your encouragement and for inspiring me to follow my dreams. I am especially grateful to my parents, who supported me in every respect.

References

Carroll, J. D., and P. Arabie. 1980. “Multidimensional Scaling.” Annual Review of Psychology 31 (1): 607–49. doi:10.1146/annurev.ps.31.020180.003135.
Fry, Ben. 2007. “Visualizing Data: Exploring and Explaining Data with the Processing Environment.” O’Reilly Media.
Liu, Bing. 2007. “Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data.” Springer-Verlag, pp. 117–146.

hmds's People

Contributors

jirotubuyaki

hmds's Issues

Resolve R CMD check warnings

When running devtools::check() within RStudio, I see:

==> Rcpp::compileAttributes()

* Updated R/RcppExports.R

==> devtools::check(document = FALSE)

Setting env vars ---------------------------------------------------------------
CFLAGS  : -Wall -pedantic
CXXFLAGS: -Wall -pedantic
Building hmds ------------------------------------------------------------------
'/Users/kevin/r/r-devel-sanitizers/lib/R/bin/R' --no-site-file --no-environ  \
  --no-save --no-restore --quiet CMD build '/Users/kevin/r/pkg/hmds'  \
  --no-resave-data --no-manual 

* checking for file ‘/Users/kevin/r/pkg/hmds/DESCRIPTION’ ... OK
* preparing ‘hmds’:
* checking DESCRIPTION meta-information ... OK
* cleaning src
* installing the package to build vignettes
* creating vignettes ... OK
Warning: ‘inst/doc’ files
    ‘my-vignette.Rmd’, ‘my-vignette.R’
  ignored as vignettes have been rebuilt.
  Run R CMD build with --no-build-vignettes to prevent rebuilding.
* cleaning src
* checking for LF line-endings in source and make files
* checking for empty or unneeded directories
* looking to see if a ‘data/datalist’ file should be added
* building ‘hmds_1.0.tar.gz’

Setting env vars ---------------------------------------------------------------
_R_CHECK_CRAN_INCOMING_USE_ASPELL_: TRUE
_R_CHECK_CRAN_INCOMING_           : FALSE
_R_CHECK_FORCE_SUGGESTS_          : FALSE
Checking hmds ------------------------------------------------------------------
'/Users/kevin/r/r-devel-sanitizers/lib/R/bin/R' --no-site-file --no-environ  \
  --no-save --no-restore --quiet CMD check  \
  '/var/folders/tm/5dt8p5s50x58br1k6wpqnwx00000gn/T//RtmpDYP4R6/hmds_1.0.tar.gz'  \
  --as-cran --timings --no-manual 

* using log directory ‘/Users/kevin/r/pkg/hmds.Rcheck’
* using R Under development (unstable) (2017-04-22 r72596)
* using platform: x86_64-apple-darwin16.5.0 (64-bit)
* using session charset: UTF-8
* using options ‘--no-manual --as-cran’
* checking for file ‘hmds/DESCRIPTION’ ... OK
* checking extension type ... Package
* this is package ‘hmds’ version ‘1.0’
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for executable files ... OK
* checking for hidden files and directories ... NOTE
Found the following hidden files and directories:
  .travis.yml
These were most likely included in error. See section ‘Package
structure’ in the ‘Writing R Extensions’ manual.
* checking for portable file names ... OK
* checking for sufficient/correct file permissions ... OK
* checking whether package ‘hmds’ can be installed ... OK
* checking installed package size ... OK
* checking package directory ... OK
* checking ‘build’ directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... NOTE
File
  LICENSE
is not mentioned in the DESCRIPTION file.
Non-standard file/directory found at top level:
  ‘paper’
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* checking whether the package can be loaded ... OK
* checking whether the package can be loaded with stated dependencies ... OK
* checking whether the package can be unloaded cleanly ... OK
* checking whether the namespace can be loaded with stated dependencies ... OK
* checking whether the namespace can be unloaded cleanly ... OK
* checking loading without being on the library search path ... OK
* checking dependencies in R code ... OK
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... OK
* checking Rd files ... OK
* checking Rd metadata ... OK
* checking Rd line widths ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking contents of ‘data’ directory ... OK
* checking data for non-ASCII characters ... OK
* checking data for ASCII and uncompressed saves ... OK
* checking line endings in C/C++/Fortran sources/headers ... OK
* checking compiled code ... NOTE
File ‘hmds/libs/hmds.so’:
  Found no calls to: ‘R_registerRoutines’, ‘R_useDynamicSymbols’

It is good practice to register native routines and to disable symbol
search.

See ‘Writing portable packages’ in the ‘Writing R Extensions’ manual.
* checking installed files from ‘inst/doc’ ... OK
* checking files in ‘vignettes’ ... OK
* checking examples ... OK
* checking for unstated dependencies in vignettes ... OK
* checking package vignettes in ‘inst/doc’ ... OK
* checking re-building of vignette outputs ... OK
* DONE
Status: 1 WARNING, 3 NOTEs

See
  ‘/Users/kevin/r/pkg/hmds.Rcheck/00check.log’
for details.


checking for hidden files and directories ... NOTE
Found the following hidden files and directories:
  .travis.yml
These were most likely included in error. See section ‘Package
structure’ in the ‘Writing R Extensions’ manual.

checking top-level files ... NOTE
File
  LICENSE
is not mentioned in the DESCRIPTION file.
Non-standard file/directory found at top level:
  ‘paper’

checking compiled code ... NOTE
File ‘hmds/libs/hmds.so’:
  Found no calls to: ‘R_registerRoutines’, ‘R_useDynamicSymbols’

It is good practice to register native routines and to disable symbol
search.

See ‘Writing portable packages’ in the ‘Writing R Extensions’ manual.
 WARNING
‘qpdf’ is needed for checks on size reduction of PDFs
R CMD check results
0 errors | 0 warnings | 3 notes

R CMD check succeeded

You should resolve these if possible. The NOTEs related to .travis.yml, LICENSE and paper could be resolved by using a .Rbuildignore file to ensure that these files aren't included with the created R package. See https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Building-package-tarballs for more details.

In addition, the Read-and-delete-me file can be removed.
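The .Rbuildignore suggestion above can be sketched in R. Each line of .Rbuildignore is a Perl-style regular expression matched case-insensitively against paths relative to the package root; the three patterns below are an assumption about what would cover the files named in the NOTEs, and the check only simulates the matching rather than running R CMD build.

```r
# Candidate .Rbuildignore entries, one regular expression per line (assumed).
ignore_patterns <- c("^\\.travis\\.yml$", "^LICENSE$", "^paper$")

# Top-level paths named in the NOTEs, plus two that must stay in the build.
paths <- c(".travis.yml", "LICENSE", "paper", "DESCRIPTION", "R")

# A path is excluded when any pattern matches it.
excluded <- vapply(paths, function(p) {
  any(vapply(ignore_patterns, grepl, logical(1),
             x = p, perl = TRUE, ignore.case = TRUE))
}, logical(1))
print(excluded)

# To apply the patterns, run this from the package root:
# writeLines(ignore_patterns, ".Rbuildignore")
```

Excluding LICENSE only makes sense while DESCRIPTION does not refer to that file; if the License field is later changed to point at it, the second pattern should be dropped.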

Allow user to control output emitted by `hmds`

E.g.

> hmds(matrix())
0 : 0 : 9.154231 
0 : 1 : 1.295180 
0 : 2 : 0.909164 
0 : 3 : 7.521004 
0 : 4 : 7.515656 
0 : 5 : 6.444834 
0 : 6 : 2.650794 
0 : 7 : 3.501593 
0 : 8 : 8.213786 
0 : 9 : 1.268850 
0 : 10 : 4.044897 
0 : 11 : 5.629833 
0 : 12 : 7.183209 
0 : 13 : 4.674137 
0 : 14 : 5.216217 
0 : 15 : 4.089559 
0 : 16 : 2.277564 
0 : 17 : 8.733508 
0 : 18 : 0.132038 
0 : 19 : 9.495266 
0 : 20 : 0.630379 
0 : 21 : 4.039999 
0 : 22 : 3.874295 
0 : 23 : 1.651352 
0 : 24 : 5.504311 
0 : 25 : 1.824779 
0 : 26 : 0.067227 
0 : 27 : 5.568614 
0 : 28 : 2.412821 
0 : 29 : 1.006876 
0 : 30 : 2.101734 
0 : 31 : 4.298240 
0 : 32 : 2.362230 
0 : 33 : 9.964386 
0 : 34 : 8.526113 
0 : 35 : 0.316426 
0 : 36 : 9.567611 
0 : 37 : 3.220868 
0 : 38 : 6.748602 
0 : 39 : 0.109211 
0 : 40 : 6.364845 
0 : 41 : 9.364899 
0 : 42 : 6.283575 
0 : 43 : 5.715547 
0 : 44 : 0.996311 
0 : 45 : 7.619263 
0 : 46 : 5.049326 
0 : 47 : 0.778688 
0 : 48 : 3.755725 
0 : 49 : 7.531476 
count : 1000 
      item :       item : input distance : output distance
input data distance : 0.000000
output data distance : 0.000000
count:0
         [,1]    [,2]      [,3]     [,4]     [,5]     [,6]     [,7]     [,8]     [,9]   [,10]
[1,] 9.154231 1.29518 0.9091641 7.521004 7.515656 6.444834 2.650794 3.501593 8.213786 1.26885
        [,11]    [,12]    [,13]    [,14]    [,15]    [,16]    [,17]    [,18]     [,19]    [,20]
[1,] 4.044897 5.629833 7.183209 4.674137 5.216217 4.089559 2.277564 8.733508 0.1320378 9.495266
         [,21]    [,22]    [,23]    [,24]    [,25]    [,26]      [,27]    [,28]    [,29]    [,30]
[1,] 0.6303789 4.039999 3.874295 1.651352 5.504311 1.824779 0.06722711 5.568614 2.412821 1.006876
        [,31]   [,32]   [,33]    [,34]    [,35]     [,36]    [,37]    [,38]    [,39]     [,40]
[1,] 2.101734 4.29824 2.36223 9.964386 8.526113 0.3164265 9.567611 3.220868 6.748602 0.1092108
        [,41]    [,42]    [,43]    [,44]     [,45]    [,46]    [,47]     [,48]    [,49]    [,50]
[1,] 6.364845 9.364899 6.283575 5.715547 0.9963105 7.619263 5.049326 0.7786881 3.755725 7.531476

The function should probably provide a verbose / quiet argument or similar, controlling whether output is printed to the console. (Also: is this producing a sensible result for a matrix of this form?)
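One possible shape for such an argument is sketched below. The helper report and the function iterate_demo are hypothetical stand-ins written for this example; they are not part of hmds, whose progress output may be produced in compiled code and would need the equivalent switch there.

```r
# Hypothetical helper: emits progress text only when verbose is TRUE.
report <- function(verbose, ...) {
  if (isTRUE(verbose)) message(...)
  invisible(NULL)
}

# Hypothetical stand-in for an iterative routine; not the package's code.
iterate_demo <- function(n_iter = 3, verbose = FALSE) {
  total <- 0
  for (i in seq_len(n_iter)) {
    total <- total + i
    report(verbose, "count : ", i, " total : ", total)
  }
  total
}

quiet_result <- iterate_demo(3)                  # prints nothing
loud_result <- iterate_demo(3, verbose = TRUE)   # one progress line per step
```

Using message() rather than print() sends progress to the message stream, so callers can also silence it with suppressMessages() without touching the return value.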
