Comments (4)

psiminelakis commented on June 9, 2024

Hi Martin,

thank you for your interest in our work!

Although @kexinrong wrote all of the C++ code, I will try, to the best of my knowledge, to give you a high-level idea of how to run the experiments until Kexin can chime in. There are roughly two folders to look into:

  • Benchmark: includes instructions for ASKIT and FIGTree.
  • HBE: includes high-level instructions for HBE. The main programs read their inputs through a config file, and they assume that the data have been normalized so that each dimension has variance 1.

In terms of the experiments, for each dataset in question there are a few things that need to be done:

  1. Preprocess the dataset so that each column has variance 1.
  2. Given a kernel (e.g. gaussian) and a bandwidth (typically set to sqrt(2) * n**(-1./(d+4)) for gaussian, where n is the number of points and d is the dimension), compute the ground-truth kernel density for the dataset using ComputeExact.cpp, which reads a config file where the number of random queries M is specified.
  3. At this point you can run the experiment using ASKIT or FIGTree.
  4. For HBE and RS:
    • Run FindAdaptiveEps to find the epsilon parameter that RS and HBE should use in the experiments for comparison.
    • Run BatchBenchmark to get results on RS and on the variants of HBE.
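
Steps 1 and 2 above can be sketched in a few lines of NumPy. This is only an illustration, not the repository's code: the function names are hypothetical, and the kernel form exp(-||x - q||^2 / h^2) with no normalizing constant is an assumption, so check ComputeExact.cpp for the convention actually used.

```python
import numpy as np

def normalize_columns(X):
    """Scale each column to variance 1 (constant columns are left unscaled)."""
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return X / std

def gaussian_bandwidth(n, d):
    """Bandwidth quoted in the thread: sqrt(2) * n**(-1./(d+4))."""
    return np.sqrt(2) * n ** (-1.0 / (d + 4))

def exact_gaussian_kde(X, queries, h):
    """Brute-force density: mean of exp(-||x - q||^2 / h^2) over all x in X.

    ASSUMPTION: this kernel form and the missing normalizing constant are
    illustrative choices, not taken from the repository.
    """
    diffs = queries[:, None, :] - X[None, :, :]   # shape (M, n, d)
    sq_dists = np.sum(diffs ** 2, axis=2)         # shape (M, n)
    return np.exp(-sq_dists / h ** 2).mean(axis=1)

# Synthetic stand-in for a dataset, with M = 10 random queries.
rng = np.random.default_rng(0)
raw = rng.normal(size=(1000, 5)) * np.array([1.0, 2.0, 3.0, 4.0, 5.0])
X = normalize_columns(raw)
h = gaussian_bandwidth(*X.shape)
queries = X[rng.choice(len(X), size=10, replace=False)]
densities = exact_gaussian_kde(X, queries, h)
```

The brute-force version materializes an M x n x d array, so it is only practical for small datasets or small M; it is meant as a reference for what the ground-truth step computes.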

from rehashing.

kexinrong commented on June 9, 2024

Paris, thanks for the information!

Hi Martin:

Thanks for reaching out. Please refer to Paris's comments on the overall project structure and preprocessing. A few additions/clarifications:
For Table 1: RunAdaptive would be the main program for timing/accuracy measurements. BatchBenchmark is for results related to sketching.
For Figure 3: Diagnosis outputs the estimates of relative variances for RS and HBE for a given dataset.
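
To make the quantity behind Figure 3 concrete, here is a minimal sketch of the relative variance of the random sampling (RS) estimator for a single query, that is, Var[Z] / E[Z]^2 where Z is the kernel value at one uniformly sampled data point. It is not the Diagnosis program: the function name is hypothetical, the kernel form exp(-||x - q||^2 / h^2) is an assumption, and the HBE counterpart is omitted because it depends on the hashing scheme.

```python
import numpy as np

def rs_relative_variance(X, q, h):
    """Relative variance Var[Z] / E[Z]^2 of a single random-sampling draw.

    Z = k(q, x) for x drawn uniformly from the rows of X.
    ASSUMPTION: Gaussian kernel of the form exp(-||x - q||^2 / h^2).
    """
    k = np.exp(-np.sum((X - q) ** 2, axis=1) / h ** 2)
    mu = k.mean()
    return k.var() / mu ** 2

# One point at the query and one far away: kernel values are 1 and 0,
# so the mean is 0.5, the variance is 0.25, and the ratio is 1.
X_demo = np.array([[0.0], [100.0]])
rel_var = rs_relative_variance(X_demo, np.array([0.0]), h=1.0)
```

A larger relative variance means more samples are needed for the same relative error, which is why the comparison between RS and HBE is reported in these terms.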

You can change the main program by modifying the executable in cmake.

Hope it helps!

maumueller commented on June 9, 2024

Thank you, @kexinrong and @psiminelakis! I'll try to work from what you described. It's a bit unfortunate that there is no bash script that exemplifies how to run everything for at least one of the datasets.

I hope it's ok if I keep this open and get back to you if I have more questions.

kexinrong commented on June 9, 2024

One additional note: with the newly updated datasets, it should be easier to get an example run. For example, if you compile RunAdaptive as the main executable, you can run adaptive sampling with RS and eps=0.2 using
./hbe conf/shuttle.cfg gaussian 0.2 true
To run adaptive sampling with HBE and eps=0.9, use
./hbe conf/shuttle.cfg gaussian 0.9
The top of each main program contains some comments on example usage.
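
Since the thread notes that there is no script tying the steps together, the two invocations above could be wrapped in a small driver. This is a sketch under assumptions: the helper name `build_hbe_command` is hypothetical, and the argument order (config, kernel, eps, optional `true` for RS) is inferred only from the two example commands, so the usage comments at the top of RunAdaptive remain the authority.

```python
import subprocess

def build_hbe_command(config_path, kernel, eps, use_rs=False, binary="./hbe"):
    """Build the argument list for the RunAdaptive executable.

    ASSUMPTION: argument order is inferred from the two example commands
    in the thread; a trailing "true" selects RS, its absence selects HBE.
    """
    cmd = [binary, config_path, kernel, str(eps)]
    if use_rs:
        cmd.append("true")
    return cmd

def run(cmd):
    """Run the command and raise if it exits with a non-zero status."""
    return subprocess.run(cmd, check=True)

rs_cmd = build_hbe_command("conf/shuttle.cfg", "gaussian", 0.2, use_rs=True)
hbe_cmd = build_hbe_command("conf/shuttle.cfg", "gaussian", 0.9)

# Print the commands; call run(rs_cmd) or run(hbe_cmd) once ./hbe is built.
print(" ".join(rs_cmd))
print(" ".join(hbe_cmd))
```

Building the argument list separately from running it makes it easy to log or dry-run the commands before the executable has been compiled.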

I also suggest starting with the shuttle dataset, since it is smaller and easier to debug.
