data-parallelism's People

Contributors

cmcaine, jw3126, tkf

data-parallelism's Issues

Picking a proper parallel library

Hello,

It is nice to see that the manual is becoming more and more comprehensive with time!

However, the section that mentions other parallel libraries makes me wonder how to choose a parallel library in practice. There are already ~10 libraries aiming at better use of either multi-threading or multi-core parallelism, besides the basic ones mentioned in Julia's official documentation. Many of them provide more or less the same functionality, which makes it harder for users to choose. Maybe this is also a sign that much more can be done in this category, and that eventually one package will stand out.

What is your opinion about this? Will there be a standard MPI- or OpenMP-like library in Julia? If I want to build a massively parallel project in Julia in the future, a good parallel library will be a solid building block. I know you are also an active developer in this field, so it would be good to hear from an expert!

Questions about the tutorial

Hi,

This is a very nice tutorial! I have some questions and also comments after going through it:

  1. I tried the letter-count example in the mapreduce section. On my Mac with Julia 1.5.1, the performance is a little surprising.

With 1 thread:

@btime f1 = mapreduce(x -> Dict(x => 1), mergewith!(+), str)
  8.830 μs (203 allocations: 27.66 KiB)
@btime f2 = ThreadsX.mapreduce(x -> SingletonDict(x => 1), mergewith!!(+), str)
  36.834 μs (308 allocations: 20.67 KiB)

With 4 threads:

@btime f1 = mapreduce(x -> Dict(x => 1), mergewith!(+), str)
  9.466 μs (203 allocations: 27.66 KiB)
@btime f2 = ThreadsX.mapreduce(x -> SingletonDict(x => 1), mergewith!!(+), str)
  55.702 μs (1347 allocations: 86.23 KiB)

Shouldn't the threaded version be faster? Is the workload here too small to show a speedup? I guess there is some thread-launch overhead, but are these numbers normal?

  2. In the section "Practical example: Stopping time of Collatz function", I get:
julia> Threads.nthreads()  # I started `julia` with `-t 4`
4

julia> using BenchmarkTools

julia> @btime map(collatz_stopping_time, 1:100_000);
  18.116 ms (2 allocations: 781.33 KiB)

julia> @btime ThreadsX.map(collatz_stopping_time, 1:100_000);
  5.391 ms (1665 allocations: 7.09 MiB)

With 4 threads, why is the total memory allocated ~10 times larger? Is it useful in general to check memory usage for parallel programs?

  3. The section "Practical example: Histogram of stopping time of Collatz function" shows a more complicated use of the FLoops package, which is a little harder to follow without prior knowledge of the package.
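
On questions 1 and 2 above, a minimal Base-only sketch may help show where the extra cost can come from. It is not how ThreadsX is implemented: `chunked_tmap` is a hypothetical helper, and `collatz_stopping_time` is an assumed definition that may differ from the tutorial's. In this sketch each task allocates its own partial result and the pieces are then concatenated, so the threaded map will typically allocate more than the serial one. For very small inputs, such as a short string, the fixed cost of spawning and synchronizing tasks can likewise outweigh the work itself.

```julia
# Assumed definition; the tutorial's own version may differ in detail.
function collatz_stopping_time(x::Integer)
    n = 0
    while x != 1
        x = iseven(x) ? x ÷ 2 : 3x + 1
        n += 1
    end
    return n
end

# Hypothetical helper (not part of ThreadsX): split the input into chunks,
# map each chunk in its own task, then concatenate the partial results.
# Assumes a non-empty, indexable input such as a range or a vector.
function chunked_tmap(f, xs; ntasks = Threads.nthreads())
    chunksize = max(1, cld(length(xs), ntasks))
    chunks = Iterators.partition(xs, chunksize)
    tasks = [Threads.@spawn map(f, chunk) for chunk in chunks]
    return reduce(vcat, fetch.(tasks))
end

xs = 1:100_000

# Run both versions once so that compilation is not measured below.
serial = map(collatz_stopping_time, xs)
threaded = chunked_tmap(collatz_stopping_time, xs)
@assert serial == threaded

# Compare the bytes allocated by each version.
serial_bytes = @allocated map(collatz_stopping_time, xs)
threaded_bytes = @allocated chunked_tmap(collatz_stopping_time, xs)
println("serial:   ", serial_bytes, " bytes")
println("threaded: ", threaded_bytes, " bytes")
```

Running this with `julia -t 4` and again with a much smaller `xs` (say `1:100`) should give a rough feel for how the balance between per-task overhead and useful work shifts with the workload size.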
