Comments (56)
This could be done by storing a hash of the brocfile in the tmp folder and checking it on startup.
This hash is much more complicated than just a hash of the Brocfile.
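For reference, the simple approach proposed above (hash only the Brocfile, store it in a tmp folder, compare on startup) could be sketched roughly as follows. The function names, the `cacheDir` argument and the `brocfile.sha256` stamp file are hypothetical and not part of broccoli or broccoli-plugin. As the reply notes, a trustworthy cache key would need many more inputs than this.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const path = require('path');

// Hash a file's contents with SHA-256.
function hashFile(filePath) {
  return crypto.createHash('sha256').update(fs.readFileSync(filePath)).digest('hex');
}

// Returns true when the Brocfile hash recorded on the previous startup matches
// the current one, then records the current hash for the next startup.
// `cacheDir` and the stamp file name are assumptions made for this sketch.
function brocfileUnchanged(brocfilePath, cacheDir) {
  const current = hashFile(brocfilePath);
  const stampPath = path.join(cacheDir, 'brocfile.sha256');
  const previous = fs.existsSync(stampPath) ? fs.readFileSync(stampPath, 'utf8') : null;
  fs.mkdirSync(cacheDir, { recursive: true });
  fs.writeFileSync(stampPath, current);
  return previous === current;
}
```

A caller would treat a `false` result as "discard anything cached from a previous process".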
from broccoli-plugin.
directory is created on each run of broccoli
each run, or each rebuild? We expect persistentOutput to only live for the duration of a process's life not beyond.
from broccoli-plugin.
so how does caching work if you run the process twice?
from broccoli-plugin.
for example when you want to do regular testing then do a deploy - our build takes around 6 minutes
from broccoli-plugin.
so how does caching work if you run the process twice?
The caching you see is for rebuilds (in the same process), not builds between different processes. The latter is quite tricky to get correct. Remember, cache invalidation is one of those gnarly problems.
For some plugins, like broccoli-persistent-filter, the inputs/outputs/options/dependencies are mostly known, so a more aggressive inter-process persistent cache can be created. It is not without its own caveats, though (several pending issues must yet be resolved, and great care was originally taken).
In many cases, you should be able to do something like:
ember build --output-path somewhere --env production // build it once
ember test --path somewhere // if i recall, this or something like it allows an existing build to be used
// and if ember deploy is a well behaved citizen maybe it supports
ember deploy --path somewhere
from broccoli-plugin.
trouble is, I can create a persistent cache for archiver, but unless the incoming merge, funnel, replace etc also use one, the input files are going to change every time. And hashing every file is likely to be just as slow...
from broccoli-plugin.
And hashing every file is likely to be just as slow...
you should check, hashing is really fast (largely just limited by available IO).
What you describe is a trade-off between stale reads and accurate reads. We err on the side of accurate reads, because the alternative leads to very bad things. Continuing to improve and evolve the caching + perf story is great, but we cannot sacrifice build accuracy or we will just cause much more grief.
An alternative to explore is what i described above (build once, and reuse for multiple commands).
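To check the hashing cost on a real input directory, a small timing sketch like the one below may help. It is only an illustration using Node's built-in `crypto` and `fs` modules; `listFiles`, `hashTree` and `timeHashTree` are hypothetical names. It deliberately skips symlinks, so a real Broccoli tree, which may contain symlinks, would need extra handling.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const path = require('path');

// Recursively list regular files under `dir`, sorted so the order is deterministic.
// Symlinks are skipped in this sketch.
function listFiles(dir) {
  const out = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) out.push(...listFiles(full));
    else if (entry.isFile()) out.push(full);
  }
  return out.sort();
}

// Fold every relative path and file body into one SHA-256 digest.
function hashTree(dir) {
  const hash = crypto.createHash('sha256');
  for (const file of listFiles(dir)) {
    hash.update(path.relative(dir, file));
    hash.update('\0');
    hash.update(fs.readFileSync(file));
    hash.update('\0');
  }
  return hash.digest('hex');
}

// Measure how long hashing the whole tree takes, in milliseconds.
function timeHashTree(dir) {
  const start = process.hrtime.bigint();
  const digest = hashTree(dir);
  const ms = Number(process.hrtime.bigint() - start) / 1e6;
  return { digest, ms };
}
```

Running `timeHashTree` on the actual input folder and comparing `ms` with the cost of the step being cached gives a concrete number to reason about.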
from broccoli-plugin.
no, obviously build accuracy has to come first, and cache invalidation should err on the side of pessimism. I'll investigate hashing and persistent caching.
from broccoli-plugin.
@BryanCrotaz being able to build once, and then reuse for testing + deploying also seems like a big win (if it is not already possible).
from broccoli-plugin.
But even then, you'd come back the next day and be hit by a 6 minute startup
from broccoli-plugin.
@BryanCrotaz I also suspect some of the slower steps, like uglification, may benefit from utilizing persistent-filter, as I suspect (for example) that your app's vendor file does not change very often.
from broccoli-plugin.
and you'd need to leave your build process running so as to get in-process caching, so you're in danger of kicking off another build while deploy is still in progress
from broccoli-plugin.
and you'd need to leave your build process running so as to get in-process caching, so you're in danger of kicking off another build while deploy is still in progress
builds don't mix, what's the danger?
from broccoli-plugin.
But even then, you'd come back the next day and be hit by a 6 minute startup
development warm builds should not take 6 minutes.
Our largest app is about 500kloc of app code (not addons not vendor), and the warm dev (non-prod) boots are 12s.
from broccoli-plugin.
to get caching, you're (I think) suggesting running two instances of broccoli with different output directories
from broccoli-plugin.
@BryanCrotaz no
from broccoli-plugin.
development warm builds should not take 6 minutes.
I bet you're not zipping multiple copies of your resulting node_modules and injecting binaries like ffmpeg into them?
from broccoli-plugin.
@BryanCrotaz no
Then you're surely going to get nothing better than what we have at the moment - build taking 6 minutes before a test or deploy
from broccoli-plugin.
@BryanCrotaz it sounds like your custom steps require optimization. Most likely following the approach taken by persistent filter will lead you down a path of success.
from broccoli-plugin.
@BryanCrotaz no
Then you're surely going to get nothing better than what we have at the moment - build taking 6 minutes before a test or deploy
@BryanCrotaz no, I believe you are confused. The idea is: build once, then subsequent commands should use the already built output (not build their own) to perform tests + deploy etc.
from broccoli-plugin.
The idea is, build once, then subsequent commands should use the already built output to perform tests + deploy etc.
That's what we're doing - but if tests show a bug (though of course we never have bugs!) then you have a 6 minute cycle to try again.
from broccoli-plugin.
it sounds like your custom steps require optimization
We build about 15 zip files - in 99% of cases only one of them includes new source changes, so if we could cache we'd only need to rebuild one of them
from broccoli-plugin.
Our custom steps are just concatenating json, eg building AWS CloudFormation templates from lots of small files. Again, these change rarely, so caching would fix this problem.
from broccoli-plugin.
@BryanCrotaz i was referring to:
I bet you're not zipping multiple copies of your resulting node_modules and injecting binaries like ffmpeg into them?
from broccoli-plugin.
That's what we're doing - but if tests show a bug (though of course we never have bugs!) then you have a 6 minute cycle to try again.
It seems like either:
- ember s, and check on /tests while you develop (for fast iteration), or
- run ember test --server during development
would catch your test failures nice and quick.
Starting and stopping the process to run tests seems very strange.
from broccoli-plugin.
Can't avoid that bit - we're building for AWS Lambda, so it requires a zip of the executable node files. I've built an npm -pull plugin which only brings in the modules we're actually using, but it's still a lot of files.
from broccoli-plugin.
That would be great if it was an ember app - this is a broccoli problem, not an ember problem!
from broccoli-plugin.
this is a broccoli problem,
I'm dubious of this.
Can't avoid that bit - we're building for AWS Lambda, so it requires a zip of the executable node files. I've built an npm -pull plugin which only brings in the modules we're actually using, but it's still a lot of files.
Have you tried zipmerge? Seems like you could zip your node_modules, then append the zip of your app code.
from broccoli-plugin.
Seems like you could zip your node_modules, then append the zip of your app code
Surely this has the same caching problem?
from broccoli-plugin.
it would be a great way to speed it up once persistent caching is working
from broccoli-plugin.
@BryanCrotaz yes, but while you're working on your app you can manually be aware of whether you have made a change to your node_modules or not. Broccoli cannot infer this without extra work.
from broccoli-plugin.
Can't broccoli treat node_modules as any other source folder?
from broccoli-plugin.
Can't broccoli treat node_modules as any other source folder?
It does
from broccoli-plugin.
then why can't broccoli infer this?
from broccoli-plugin.
then why can't broccoli infer this?
It can, it just must diff the whole fs tree, which is why I said "extra work".
from broccoli-plugin.
Well, to be accurate, it would also need to be aware of the node resolution algorithm and walk in both directions (up and down), looking for any ancestor node_modules and also traversing them.
from broccoli-plugin.
so basically to get persistent caching we need a fast way to diff a requires tree. Early exit on the first diff found would be fine (pessimistic).
from broccoli-plugin.
and using file hashing rather than last modified so that other plugins that copy files wouldn't break downstream caches
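A rough sketch of that idea: keep a manifest of content hashes per file, and on the next run stop at the first file that is new, changed or missing. This is an illustration only; `buildManifest` and `firstDifference` are hypothetical names, symlinks are ignored, and, as discussed elsewhere in this thread, files are not the only inputs a safe cache has to track.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const path = require('path');

function hashFile(file) {
  return crypto.createHash('sha256').update(fs.readFileSync(file)).digest('hex');
}

// Build a Map of relativePath -> content hash for all regular files under `root`.
function buildManifest(root, dir = root, manifest = new Map()) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) buildManifest(root, full, manifest);
    else if (entry.isFile()) manifest.set(path.relative(root, full), hashFile(full));
  }
  return manifest;
}

// Return the first relative path that was added, changed or deleted compared
// with `manifest`, or null if the tree matches. Exits on the first mismatch.
function firstDifference(manifest, root) {
  const seen = new Set();
  const stack = [root];
  while (stack.length > 0) {
    const dir = stack.pop();
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      const full = path.join(dir, entry.name);
      if (entry.isDirectory()) { stack.push(full); continue; }
      if (!entry.isFile()) continue;
      const rel = path.relative(root, full);
      if (manifest.get(rel) !== hashFile(full)) return rel; // new or changed file
      seen.add(rel);
    }
  }
  for (const rel of manifest.keys()) {
    if (!seen.has(rel)) return rel; // file was deleted
  }
  return null;
}
```

Because the comparison uses content hashes rather than modification times, a file that was merely copied again with identical contents does not count as a difference.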
from broccoli-plugin.
being clever with symlink recognition would hopefully shortcut some steps
from broccoli-plugin.
@BryanCrotaz well, it's more than just that: all inputs must be part of the cache. This is quite tricky, and also quite costly to perform.
I really think you should focus on maintaining the zip of your deps manually (just add a postinstall hook or something to invalidate it), and using zipmerge as part of the build.
from broccoli-plugin.
other plugins that copy files
do plugins still do this? Our builds are without such steps.
from broccoli-plugin.
can you add a global postinstall hook? you'd need it to fire on every install
from broccoli-plugin.
do plugins still do this? Our builds are without such steps.
Yes - every time you rerun broccoli
from broccoli-plugin.
can you add a global postinstall hook? you'd need it to fire on every install
Sounds like something you should investigate. I am merely hinting at a simple solution.
from broccoli-plugin.
and using file hashing rather than last modified so that other plugins that copy files wouldn't break downstream caches
this is how broccoli-persistent-filter works.
from broccoli-plugin.
so if it works for broccoli-persistent-filter, why can't it work for broccoli-caching-writer?
from broccoli-plugin.
so if it works for broccoli-persistent-filter, why can't it work for broccoli-caching-writer?
It could, but not in the general case, as input/output is typically entirely opaque for caching-writer. Whereas with broccoli-filter the input is known, as the filter is responsible for reading the input and passing it to processString.
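The reason a 1:1 filter is easier to cache can be sketched like this: because the filter itself reads the input string, it can derive a cache key from that string plus its options before doing any work. The code below is a simplified illustration, not broccoli-persistent-filter's actual implementation or API; `cacheKey` and `cachedProcessString` are hypothetical names, and a real key would also need to cover things like plugin version and dependencies.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const path = require('path');

// Key = hash of the options plus the exact input contents.
// Note: JSON.stringify is sensitive to property order, which is fine for a sketch.
function cacheKey(contents, options) {
  return crypto.createHash('sha256')
    .update(JSON.stringify(options))
    .update('\0')
    .update(contents)
    .digest('hex');
}

// Run `processString` only when no cached result exists on disk for this input.
function cachedProcessString(cacheDir, contents, options, processString) {
  fs.mkdirSync(cacheDir, { recursive: true });
  const entry = path.join(cacheDir, cacheKey(contents, options));
  if (fs.existsSync(entry)) return fs.readFileSync(entry, 'utf8');
  const output = processString(contents);
  fs.writeFileSync(entry, output);
  return output;
}
```

A writer that receives whole directories and may read anything inside them has no equally obvious key, which is the opacity described above.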
from broccoli-plugin.
I really think you should focus on maintaining the zip of your deps manually (just add a postinstall hook or something to invalidate it), and using zipmerge as part of the build.
npm does have global scripts, however the dependencies are generated by walking the requires tree, so the deps are dependent on the source. Darn.
from broccoli-plugin.
npm does have global scripts, however the dependencies are generated by walking the requires tree, so the deps are dependent on the source. Darn.
Right, you should read the source for persistent-filter, as we have sorted out most of that issue.
from broccoli-plugin.
is there scope here for a clone of persistent-filter which takes in a tree and outputs a new tree rather than a 1:1 map? The whole output is recalculated if any of the inputs have changed?
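A minimal sketch of what such an all-or-nothing (N:1) cache could look like, assuming the input directory is the only input that matters, which this thread argues is often not true. `digestTree`, `buildIfChanged` and the `.input-digest` stamp file are hypothetical names and do not refer to an existing plugin.

```javascript
const crypto = require('crypto');
const fs = require('fs');
const path = require('path');

// One digest over the relative path and contents of every regular file under `dir`.
function digestTree(dir) {
  const hash = crypto.createHash('sha256');
  const walk = (current) => {
    const entries = fs.readdirSync(current, { withFileTypes: true })
      .sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
    for (const entry of entries) {
      const full = path.join(current, entry.name);
      if (entry.isDirectory()) {
        walk(full);
      } else if (entry.isFile()) {
        hash.update(path.relative(dir, full) + '\0');
        hash.update(fs.readFileSync(full));
        hash.update('\0');
      }
    }
  };
  walk(dir);
  return hash.digest('hex');
}

// Re-run `build(inputDir, outputDir)` only when the input digest differs from the
// one stored next to the previous output. Returns true if a build happened.
// `outputDir` must live outside `inputDir`.
function buildIfChanged(inputDir, outputDir, build) {
  const stamp = path.join(outputDir, '.input-digest'); // hypothetical stamp file
  const digest = digestTree(inputDir);
  if (fs.existsSync(stamp) && fs.readFileSync(stamp, 'utf8') === digest) return false;
  fs.rmSync(outputDir, { recursive: true, force: true });
  fs.mkdirSync(outputDir, { recursive: true });
  build(inputDir, outputDir);
  fs.writeFileSync(stamp, digest); // written last, so a failed build leaves no stamp
  return true;
}
```

With something like 15 independent zip outputs, each could get its own input directory and stamp, so only the one whose inputs changed would be rebuilt.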
from broccoli-plugin.
or is that something that already exists too?
from broccoli-plugin.
Alright this seems to have turned into consulting, unfortunately I do not have the bandwidth to offer such services, and this is not the correct venue to be asking for what feels like unbounded help.
I do wish I had such bandwidth, unfortunately reality is not that kind to me.
TL;DR:
Caching is hard, broccoli takes a conservative stance to be confident it does not serve stale assets.
In the general case, nothing is preserved between builds in separate processes, as the number of inputs (code/config/node_modules/OS/ENV vars) that could have changed is quite large. That being said, in some cases, via targeted filters such as broccoli-persistent-filter, it is able to maintain a persistent inter-process cache. This is hard to do safely, and requires much attention and careful thought. But when done correctly, it does offer some very nice performance boosts. (It takes my app at work from 90s initial builds to 12s.)
It seems like something is strange/bespoke in the described build, which should likely be investigated further.
from broccoli-plugin.
I'm only looking for where I should best spend my time contributing, and not recreating something that already exists but I'm not aware of.
from broccoli-plugin.
Let me encourage further exploration, so please do investigate.
I do not believe this to be an issue with broccoli or broccoli-plugin themselves, rather a reality of the domain. Caching is hard. This doesn't mean a creative solution isn't possible, but the real problem I believe is still yet to be defined in sufficient detail to be actionable.
Let me recommend providing a demo that is publicly available (as a GitHub repo), conveys the exact problem accurately, and is something others can hack on to explore further. Speaking abstractly and backfilling context is quite time consuming, and I view it as a fairly high-risk activity. Communicating is basically just hard ...
With the above, it actually sounds like quite a compelling problem to explore.
For now I am going to close this issue. If at some point the above can be provided, please provide a link on this issue and it can be explored further.
Again, I wish I had unbounded time. The above request is most likely the best chance of being able to collaborate on a solution.
from broccoli-plugin.
Where's the best forum to discuss a demo and RFC?
from broccoli-plugin.
Where's the best forum to discuss a demo and RFC?
@BryanCrotaz I think an RFC is premature, as I mentioned:
This doesn't mean a creative solution isn't possible, but the real problem I believe is still yet to be defined in sufficient detail to be actionable.
Let me recommend providing a demo that is publicly available (as a github repo),
^^ seems like the best venue, as it will be an example of the problem, with its own issue tracker and PR system for exploration.
I feel speaking concretely of a specific build will help, as the context will be at hand.
from broccoli-plugin.