Giter Club home page Giter Club logo

Comments (8)

easypickings avatar easypickings commented on June 14, 2024 1

Good to know! And thanks again for this awesome work and all your help!

from parallel-hashmap.

greg7mdp avatar greg7mdp commented on June 14, 2024

Yes, if the map is not modified, it is perfectly safe to check whether it contains keys from multiple threads.

from parallel-hashmap.

easypickings avatar easypickings commented on June 14, 2024

Thanks Greg! Another question please: if I have a lot of (tens of millions maybe) simple <int, int> pairs to insert into a hash map, is it a good idea to use a parallel_flat_hash_map and insert using multiple threads? My concern is that the number of data to insert is huge, and a single insert operation is lightweight, which may make the mutex lock a overhead.

from parallel-hashmap.

greg7mdp avatar greg7mdp commented on June 14, 2024

Sure, it still would be much faster to use a parallel_flat_hash_map and multiple threads, just make the N template parameter larger than the default 4, maybe 10 or something, and there will be very little mutex contention.

Another possibility. Where are all those pairs you want to insert coming from? If you can iterate very quickly over them, you can actually insert in a parallel_flat_hash_map without locking at all.

from parallel-hashmap.

easypickings avatar easypickings commented on June 14, 2024

Another possibility. Where are all those pairs you want to insert coming from? If you can iterate very quickly over them, you can actually insert in a parallel_flat_hash_map without locking at all.

Hmm... I have an array storing m offsets of a file. What I want to do is iterate over the array, read some bytes starting at the offset into memory, then store the pair <offset, index> in the map. So I don't think it can be done quickly. But how can iterating quickly make the insertion lock-free anyway?

from parallel-hashmap.

greg7mdp avatar greg7mdp commented on June 14, 2024

Actually that would work well. The parallel-flat-hash internally has an array of submaps (When N=4 you have 16 submaps).
What you would do is start 16 threads, and each thread would populate its own submap (thread # 0 populate submap 0, etc.....
So each thread would:

  • iterate over all the offsets in the file.
  • for each offset, check which submap it would go to (using the submap function).
    If not its target submap, it would do nothing.
    If it is, it would read some bytes starting at the offset into memory, then store the pair <offset, index> in the map.

that's it, no locking necessary.

This is what the bench does here. I really should write a better example.

from parallel-hashmap.

easypickings avatar easypickings commented on June 14, 2024

That's cool! By the way, when using a lock-free parallel hash map, is there a difference among insertion methods, like operator[]/insert/emplace?

from parallel-hashmap.

greg7mdp avatar greg7mdp commented on June 14, 2024

If you use the method I indicated above, use emplace_with_hash so you pass the hashval and it is not recomputed. Otherwise it doesn't make much difference when you insert a pair of integers.

from parallel-hashmap.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.