Comments (4)
In view of #59 and #60 I think this no longer applies. $lookup does not perform a deep clone on the inputs.
Please let me know if I have missed something. If not feel free to close the issue.
Thanks
from mingo.
I'm not sure. First, let's establish that we use the same terms. 'Inputs' is ambiguous when there are two collections.
Let's assume the collection from the (previous step in the) pipeline is called the pipeline collection and the collection from the lookup is called the lookup collection.
I'm assuming (correct me if I'm wrong) that now:
- pipeline collection is shallow-cloned.
- lookup collection is referenced.
I would like to see the option to modify rather than shallow-clone the pipeline collection.
(Or if that is already the case, I think you should shallow-clone by default, not for me, but in order to "guarantee that the underlying collection is not changed".)
With such an option, pipelines on big collections can be programmatically optimized further.
from mingo.
Shallow-cloning the pipeline-collection is the default here.
Surely, there is a performance hit, but I would argue this addresses 99% of use-cases. The cloning is just a new object populated with the fields and values of the old object. The lookup-collection should be referenced (I forgot to remove the clone; must fix).
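To illustrate what "a new object populated with the fields and values of the old object" means in practice, here is a minimal sketch. The `shallowClone` helper is hypothetical and is not taken from mingo's source; it only shows which changes stay on the copy and which leak through to the original.

```javascript
// Hypothetical helper for illustration only (not mingo's implementation).
function shallowClone(doc) {
  // Copies top-level fields; nested objects and arrays are still shared.
  return Object.assign({}, doc);
}

const original = { id: 1, tags: ["a"] };
const copy = shallowClone(original);

copy.lookup1 = [{ id: 1 }]; // new top-level field exists only on the copy
copy.tags.push("b");        // nested array is shared with the original

console.log("lookup1" in original); // false
console.log(original.tags);         // [ 'a', 'b' ]
```

Adding the `as` field of a $lookup is a top-level write, which is why a shallow clone is enough to leave the pipeline collection's documents untouched.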
On the thread for #59, I mentioned that modifying the pipeline-collection in place would break other operators such as $out. The result there won't make sense anymore unless it detects that we are not cloning and so must deep clone the current result into a new array.
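A rough sketch of the problem, under the assumption that an $out-like stage stores the current documents by reference. Both `outStage` and `lookupInPlace` are hypothetical stand-ins written for this example, not mingo's operators.

```javascript
// Hypothetical stages for illustration only (not mingo's operators).
function outStage(docs, target) {
  target.push(...docs); // stores references, not copies
  return docs;
}

function lookupInPlace(docs, foreign, localField, foreignField, as) {
  for (const doc of docs) {
    // Mutates the incoming document instead of cloning it.
    doc[as] = foreign.filter((f) => f[foreignField] === doc[localField]);
  }
  return docs;
}

const source = [{ id: 1 }, { id: 2 }];
const foreign = [{ id: 1, name: "x" }];
const snapshot = [];

let result = outStage(source, snapshot);
result = lookupInPlace(result, foreign, "id", "id", "lookup1");

console.log(Object.keys(snapshot[0])); // [ 'id', 'lookup1' ]
console.log(Object.keys(source[0]));   // [ 'id', 'lookup1' ]
```

The array captured by the $out-like stage no longer reflects the state of the pipeline at the point where it ran, and the source collection has changed as well.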
I think the cost of the edge cases introduced and non-resilience to future operators (i.e. each operator must be aware of that option, including custom operators) does not make it worth the benefit.
If your collection is big enough that this is an issue, then mingo may not be the right tool.
Thoughts?
from mingo.
Fair enough. I think you are correct.
One side note: you probably mean 99% of use-cases within the current audience, which works with small and medium datasets. High-performance in-memory datastores such as redis are hugely popular.
For high-performance collection querying in node.js without separate servers (redis) there is LokiJS. However, this is nowhere near as convenient as full mongo pipeline syntax support, and it attaches data to the original collection.
mingo has some sought-after functionality that LokiJS lacks, and vice versa. If at any point you're interested in making mingo a candidate for audiences that work with heavy loads, it would be interesting to revisit "edge case" optimizations, as they quickly add up. Imagine sending 1,000,000 documents with 100 root properties each through the pipeline. Memory usage would multiply the collection size by the number of $lookups for the duration of the function call, as the garbage collector would only (eventually) remove pre-clone collections from memory once they go out of scope.
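The multiplication can be made concrete with a small sketch. `cloneStage` is a hypothetical stand-in for a $lookup that shallow-clones every document; the sketch keeps each intermediate result reachable on purpose, so it shows the worst case rather than what a garbage collector would actually retain.

```javascript
// Hypothetical stand-in for a cloning $lookup stage (not mingo's code).
function cloneStage(docs, as) {
  return docs.map((doc) => Object.assign({}, doc, { [as]: [] }));
}

const docs = Array.from({ length: 1000 }, (_, i) => ({ id: i }));
const stage1 = cloneStage(docs, "lookup1");
const stage2 = cloneStage(stage1, "lookup2");

// A Set compares objects by identity, so this counts distinct document objects.
const distinct = new Set([...docs, ...stage1, ...stage2]);
console.log(distinct.size); // 3000
```

With two cloning stages there are three full sets of document objects alive while all intermediate arrays remain referenced.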
I can easily work around the memory problem with the subdocuments/references described elsewhere, but in the end you want to be able to query mingo the same as mongodb without having the data layout differ between the two.
As for your comment:
"I mentioned that modifying the pipeline-collection in place would break other operators such as $out."
I'd argue that's why it would need to be an option, only for people who know what they are doing.
mingo.setup({
  lookupClone: false // defaults to true
});
and/or for single shots:
$lookup: {
  from: 'collection1',
  localField: 'id',
  foreignField: 'id',
  as: 'lookup1',
  _noClone: true
}
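As a sketch of how such an opt-in flag might behave, assuming a simplified equality join: the `lookup` function and its `noClone` parameter below are hypothetical, modelled on the `_noClone` proposal above, and are not an existing mingo API.

```javascript
// Hypothetical sketch of an opt-in "no clone" lookup (not an existing mingo option).
function lookup(docs, foreign, spec) {
  const { localField, foreignField, as, noClone = false } = spec;
  return docs.map((doc) => {
    // Default: shallow-clone so the pipeline collection is left unchanged.
    const target = noClone ? doc : Object.assign({}, doc);
    target[as] = foreign.filter((f) => f[foreignField] === doc[localField]);
    return target;
  });
}

const collection1 = [{ id: 1, label: "one" }];
const pipelineDocs = [{ id: 1 }, { id: 2 }];
const spec = { localField: "id", foreignField: "id", as: "lookup1" };

const cloned = lookup(pipelineDocs, collection1, spec);
console.log(cloned[0] === pipelineDocs[0]); // false
console.log("lookup1" in pipelineDocs[0]);  // false

const inPlace = lookup(pipelineDocs, collection1, { ...spec, noClone: true });
console.log(inPlace[0] === pipelineDocs[0]); // true
console.log(pipelineDocs[1].lookup1);        // []
```

With the flag off the input documents are untouched; with it on, the same objects are returned with the joined field attached, which is the trade-off discussed in this thread.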
from mingo.