jorgenschaefer / emacs-buttercup
Behavior-Driven Emacs Lisp Testing
License: GNU General Public License v3.0
Syntactic sugar for :and-call-fake:
(defspy function-name (args ...)
body...
(spy-original 1 2)
...)
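For clarity, here is a sketch of how the proposed sugar could expand onto the existing spy-on API; defspy does not exist in buttercup, and how spy-original would reach the original function is left open:

```elisp
;; Hypothetical sketch: `defspy' is not part of buttercup.  It shows how
;; the proposed sugar could expand to a plain `spy-on' call; binding
;; `spy-original' to the original function is not handled here.
(defmacro defspy (function-name args &rest body)
  "Spy on FUNCTION-NAME with a fake taking ARGS and running BODY."
  `(spy-on ',function-name :and-call-fake (lambda ,args ,@body)))

;; So (defspy my-function (a b) (+ a b)) would be shorthand for:
;; (spy-on 'my-function :and-call-fake (lambda (a b) (+ a b)))
```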
While writing specs for Flycheck I noticed that buttercup is, by design of course, more verbose in its output than ERT. I really like that, because it's much more informative, but I think the output could profit from ANSI colors. A simple scheme of four colors would be great:
That'd help to get a quick idea and make failures stand out much more than they do currently, particularly if there are a lot of specs out of which just a single one failed.
What do you think?
While you can certainly argue that you should isolate the functions calling message and then only test the other functions, I think it would be nice to have some functionality for testing for a certain output that resulted from calls to message.
Sadly you can't advise C primitives, so the only way to record what message has written is probably to look in *Messages*. I don't think it's reasonable to expect that buffer to be empty when running the test, so either the position is recorded or the buffer is emptied before running the test.
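A minimal sketch of the position-recording approach (with-captured-messages is a made-up name, not part of buttercup): note where *Messages* ends before the body runs and return only what was added afterwards.

```elisp
;; Record where *Messages* ends, run BODY, return only the new output.
(defmacro with-captured-messages (&rest body)
  `(let ((start (with-current-buffer (messages-buffer) (point-max))))
     ,@body
     (with-current-buffer (messages-buffer)
       (buffer-substring-no-properties start (point-max)))))

;; Usage in a spec:
;; (expect (with-captured-messages (message "hello"))
;;         :to-match "hello")
```

Because only the newly added text is returned, the buffer does not need to be empty before the test runs.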
The following tests both fail as expected, and print the output below:
(describe "Buttercup"
(it "shows incorrect messages for :to-match"
(expect "a" :to-match "b"))
(it "shows incorrect messages for :not :to-match"
(expect "a" :not :to-match "a")))
Running 2 specs.
Buttercup
shows incorrect messages for :to-match FAILED
shows incorrect messages for :not :to-match FAILED
========================================
Buttercup shows incorrect messages for :to-match
FAILED: Expected "a" not to match the regexp "b"
========================================
Buttercup shows incorrect messages for :not :to-match
FAILED: Expected "a" to match the regexp "a"
I expected the opposite output, in terms of negations.
Suite execution should catch errors and include a cleaned-up backtrace in the result (i.e. the reporter gets the backtrace to display it).
Some code from an early prototype that was used to get backtraces, posted here for reference:
(let* ((buttercup--descriptions (cons description
buttercup--descriptions))
(debugger (lambda (&rest args)
(let ((backtrace (buttercup--backtrace)))
;; If we do not do this, Emacs will not call this
;; handler on subsequent calls. Thanks to ert
;; for this.
(cl-incf num-nonmacro-input-events)
(signal 'buttercup-error (cons args backtrace)))))
(debug-on-error t)
(debug-ignored-errors '(buttercup-failed buttercup-error)))
(buttercup-report 'enter nil buttercup--descriptions)
(condition-case sig
(progn
(funcall body-function)
(buttercup-report 'success nil buttercup--descriptions))
(buttercup-failed
(buttercup-report 'failure (cdr sig) buttercup--descriptions))
(buttercup-error
(buttercup-report 'error (cdr sig) buttercup--descriptions))))
(defun buttercup--backtrace ()
(let* ((n 5)
(frame (backtrace-frame n))
(frame-list nil))
(while frame
(push frame frame-list)
(setq n (1+ n)
frame (backtrace-frame n)))
frame-list))
See ert-runner and cask exec.
Having this would be a bit more convenient than having to figure out the right arguments for the emacs -batch invocation.
Buttercup helpfully filters out tests in dot-directories, so that we won't run tests for packages installed via Cask in the .cask directory.
However, if I have my project inside /home/user/.dotdirectory/projects/my-project/tests, where my-project is the root of my project, no tests will run, because they all match the /. of /.dotdirectory.
A solution is to cut away the current directory and apply this filter only inside the project directory.
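A minimal sketch of that fix, using a hypothetical helper name: test for dot-directories on the path relative to the project root rather than on the absolute path.

```elisp
;; Hypothetical helper: FILE counts as hidden only if a dot-directory
;; appears below ROOT, not anywhere in the absolute path.
(defun buttercup--in-dotdir-below-root-p (file root)
  (string-match-p "\\(?:\\`\\|/\\)\\."
                  (file-relative-name file root)))

;; With ROOT = "/home/user/.dotdirectory/projects/my-project":
;; "tests/test-foo.el"       -> nil      (runs)
;; ".cask/deps/some-dep.el"  -> non-nil  (filtered out)
```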
The following phrases were left out from introduction.js and should still be implemented:
Pending specs do not run, but their names will show up in the results as pending.
And if you call the function pending anywhere in the spec body, no matter the expectations, the spec will be marked pending. A string passed to pending will be treated as a reason and displayed when the suite finishes.
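The requested Jasmine behaviour could look like this in buttercup syntax; xit already exists, while the pending function called from a spec body is the missing piece described above:

```elisp
(describe "A pending suite"
  ;; `xit' marks a spec pending by name alone (already supported):
  (xit "can be declared with xit")
  ;; Calling `pending' in the body is the requested feature; the string
  ;; would be shown as the reason when the suite finishes:
  (it "can be marked pending from the body"
    (pending "waiting on upstream fix")
    (expect t :to-be nil)))
```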
Hi Jorgen,
Here's an example; this test fails, but the exception is not reported, causing one to think that there's an issue with the expect (whereas the issue is with the before-each).
(describe "buttercup"
:var (x y)
(before-each (setq x 0) (error "Badly written test") (setq y 1))
(it "swallows errors in before-each"
(expect x :to-equal y)))
It took me a while to understand why one of my tests was failing, because the error was not in the test itself but in the before-each clause.
Let me know if I'm doing something wrong :) Thanks!
It would be nice if the expect macro could record its unevaluated arguments and print them in error messages: for example,
FAILED: Expected #[128 "\302\300\303\301�\"\"\207" [biblio--throw-on-unexpected-errors (((http . 406) (url-queue-timeout . "Queue timeout exceeded"))) apply append] 6 "
(fn &rest ARGS2)"] to throw a child signal of url-error, not wrong-number-of-arguments
would be shown as
FAILED: Expected (apply-partially #'biblio--throw-on-unexpected-errors `(,http-error--alist ,timeout-error--alist)) to throw a child signal of url-error, not wrong-number-of-arguments
Btw, let me know if that's too much feedback in too little time! buttercup is really fun to work with, congratulations on this neat project :)
Hi Jorgen,
Is there support in buttercup for expected failures, as a way to document features to be implemented, for example? (I don't mean something like xit, since I'd like to know if changes suddenly make something work.)
Thanks!
The following test passes:
(describe "Buttercup"
(it "does not complain when the argument of :to-throw isn't a function"
(expect nil :to-throw 'error)))
That's clearly a user error, but it would be nice to get a warning about it: I got bitten by this because my function didn't throw, and I set the test up incorrectly (as in #41), so I thought everything was fine.
This is just a small suggestion: maybe it would be nice to replace the line of equals signs with Emacs' standard ^L (Control-L)? There are packages to display it as a pretty line, too.
Your call :)
Buttercup doesn't exclude skipped specs from the number of running specs. Consider the following output from our Flycheck specs. Note how only four specs actually run and the rest are skipped, but buttercup claims that it's running all 36 specs, and afterwards says that it ran all 36 specs.
$ cask exec buttercup -L . --pattern ^Language Haskell
c-l-a-l: ("--pattern" "^Language Haskell")
Running 36 specs.
Language Haskell
Module names
does not extract a module name from commented code
extracts a simple module name without exports
extracts a simple module name at the end of a line
extracts a module name with exports
Error List
has the correct buffer name SKIPPED
has a permanently local source buffer SKIPPED
derives from Tabulated List Mode SKIPPED
Format
sets the error list format locally SKIPPED
sets a proper padding locally SKIPPED
sets the list entries locally SKIPPED
has a local header line SKIPPED
Columns
has the line number in the 1st column SKIPPED
has the column number in the 2nd column SKIPPED
has the error level in the 3rd column SKIPPED
has the error ID in the 4th column SKIPPED
has the error message in the 5th column SKIPPED
Entry
has the error object as ID SKIPPED
has the line number in the 1st cell SKIPPED
has the column number in the 2nd cell SKIPPED
has an empty 2nd cell if there is no column number SKIPPED
has the error level in the 3rd cell SKIPPED
has the error ID in the 4th cell SKIPPED
has the error message in the 5th cell SKIPPED
has a default message in the 5th cell if there is no message SKIPPED
Filter
kills the filter variable when resetting the filter SKIPPED
filters errors with lower levels SKIPPED
Mode Line
shows no mode line indicator if no filter is set SKIPPED
shows the level filter in the mode line if set SKIPPED
Manual
List of languages
lists all languages SKIPPED
doesn't list unknown languages SKIPPED
Syntax checkers
documents all syntax checkers SKIPPED
doesn't document syntax checkers that don't exist SKIPPED
Options
documents all syntax checker options SKIPPED
doesn't document syntax checker options that don't exist SKIPPED
Configuration files
documents all syntax checker configuration files SKIPPED
it doesn't document configuration files that don't exist SKIPPED
Ran 36 specs, 0 failed, in 0.0 seconds.
I think skipped specs should not count as "running". Buttercup should probably say "Running 4 out of 36 specs." and "Ran 4 specs, 32 skipped, 0 failed, in 0.0 seconds." instead.
The test runner output should be colored, with passing specs/suites in green, failing ones in red, and pending ones in yellow.
It should be possible to disable this using a --no-color command line argument.
I'd like to improve buttercup by adding support for a user-defined hook that is run after any test has failed. This would make travis build outputs immensely more useful.
Consider my scenario with omnisharp-emacs:
The README should mention how to install buttercup and run tests most easily.
Buttercup should be able to report stack traces in various ways, depending on a command line argument.
- --traceback full should simply emit the full traceback
- --traceback crop should crop the traceback to 80 columns
- --traceback pretty should pretty-print every frame
Alternatively, buttercup-define-matcher could.
The problem I see with this is that people often won't even load buttercup in their emacs where they write the code, so the forms won't evaluate. However, many do and I think discoverability from within emacs is a neat feature to have.
The docstring could be updated dynamically as users define more matchers (buttercup-define-matcher can do the magic).
Related but orthogonal to #35.
I'm trying to use Buttercup for Flycheck, and miss the ability to skip suites and specs programmatically.
Many of Flycheck's tests depend on the environment, i.e. whether a certain mode is installed or a certain program is present. I'd like to be able to skip these tests when certain executables aren't present, etc.
I saw that I can just signal buttercup-pending to more or less achieve this, but I'd actually prefer a different error condition with different output, to differentiate between not-yet-implemented tests and tests that are skipped because certain conditions are not met.
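For reference, the workaround mentioned above looks roughly like this; a dedicated skip condition would let reporters distinguish it from a not-yet-written spec:

```elisp
(it "checks Haskell code with ghc"
  ;; Skip rather than fail when the external program is missing:
  (unless (executable-find "ghc")
    (signal 'buttercup-pending "ghc not installed"))
  ;; ... actual expectations follow here ...
  )
```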
This warning shows up in recent Emacsen. I have to check whether _ breaks in older versions.
When I have a test where a spec fails and I run it using buttercup-run-at-point, two buffers pop up at me: the buttercup report (yes, I want this!) and the Emacs error/debug window (no thanks, Emacs).
I'm not sure about the solution... maybe scan the windows to see whether such a window was opened, or wrap some code in ignore-errors (I do not yet fully understand how errors are evaluated/caught).
I'm not sure if I'm doing this correctly.
My directory structure is:
emacs-habitica\
tests\
test-feature.el
feature.el
Except when I run buttercup:
drfra@spacecat ~Users\drfra\workspace\emacs-habitica master $ cask exec buttercup -L .
Cannot open load file: no such file or directory, feature
I was following the wiki... am I doing something incorrectly?
I was able to get around this by using the load-relative library.
(require 'load-relative)
(require-relative "../feature")
That made it work but it doesn't feel "right".
It might be interesting to link to https://github.com/sviridov/undercover.el-buttercup-integration-example and https://github.com/sviridov/undercover.el in the README :)
Currently, the return value of matchers is pretty unreadable. There should be a function, or even syntax, to create the return value in a way that makes it obvious what is happening.
See #21 for some discussion.
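One possible shape for such a helper (a sketch, not an existing buttercup function): take the match result plus both messages and produce the conventional (RESULT . MESSAGE) cons cell, where a successful match carries the message for the negated expectation.

```elisp
(defun buttercup--matcher-result (matched-p to-message not-to-message)
  "Build a matcher return value from MATCHED-P and the two messages.
By convention the message explains the expectation that would fail:
the negated one when MATCHED-P is non-nil, the plain one otherwise."
  (if matched-p
      (cons t not-to-message)
    (cons nil to-message)))

;; A matcher body would then shrink to:
;; (buttercup--matcher-result (file-regular-p file)
;;   (format "Expected %S to be a file" file)
;;   (format "Expected %S not to be a file" file))
```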
It's difficult to find out the cause of error because the error message right now reports something like this:
FAILED: Expected ido-completing-read :to-have-been-called-with "(\"#1 public class SomeClass : BaseClass {}\" \"#2 public class SomeClass : BaseClass {}\")"
As a new user, I want to know why I should use Buttercup over the standard ERT framework.
Some thoughts:
- No setup/teardown facility
- Means you end up writing lots of boilerplate code, or, more likely, large tests instead of fine-grained ones
- Difficult to add, too, as there is no facility for grouping tests
- No built-in mocking facility
- All existing mocking libraries require nesting
- Also, do not mesh well with the setup/teardown code
Hi Jorgen,
What's the right way to write a test to ensure that a function properly swallows errors? For example, this fails:
(describe "condition-case-unless-debug"
(it "does not throw too much"
(expect (condition-case-unless-debug nil (error "!!") (error nil)) :to-equal nil)))
I'd like to make sure that the function in fact does not throw errors; is there a way?
When interacting with external processes, which a lot of packages do, or doing network stuff, you have to be able to deal with tests that don't actually block but run asynchronously.
Currently I have
(defmacro async-with-timeout (timeout &rest body)
`(progn
(setq hie-async-returned nil)
,@body
(with-timeout (,timeout) (while (not hie-async-returned) (sleep-for 0.1)))))
for that, and then I advise the async functions to set hie-async-returned when they're finished. I think this is common enough to add something like it to buttercup.
I am not completely sure what would be the ideal way to do so, ert-async shows one approach which gives you a list of callbacks and then it blocks until all callbacks are called.
As a user, I want to see the results of all failed expect calls in a failed spec in order to get a full understanding of the erroneous behavior.
For example, when doing data-driven testing (e.g. like Fuco does in Smartparens), it is useful to see all mismatches, not just the first.
When matching large data structures with :to-equal the failure case output is not that helpful. Buttercup simply dumps the left and right hand side entirely, and leaves it to the developer to sort out the difference.
Other test frameworks (e.g. scalatest) try to "diff" the output, e.g. make differences between both sides stand out. I think it'd be great to have that in buttercup as well, specifically for strings and lists:
The second test of the following suite fails:
(defun a (&rest args)
"A")
(describe "Buttercup"
(describe "can register spies"
(spy-on #'a :and-return-value "B")
(it "to change return values"
(expect (a) :to-equal "B")))
(describe "doesn't extend spies beyond their enclosing describe"
(it "thus a now returns 'A'"
(expect (a) :to-equal "A"))))
But the documentation says
A spy only exists in the describe or it block it is defined in, and will be removed after each spec.
Is that a bug?
The buttercup-pending signal's message should be printed, not omitted. See e6fd5f5#commitcomment-16743046
This feature would make it possible to pick a single test that is run, from the pool of all possible tests. Allowing running only a single test will help developers who do test driven development, by speeding their test execution when focusing on a single bug, as well as providing a clearer test output.
The buttercup script that comes with buttercup is not documented anywhere. That's not good.
Sisyphus is a library for ERT. Check it for ideas we could use in Buttercup:
For example,
(spy-on #'a :and-call-through #'b)
succeeds silently, and it took me a while to spot the mistake (I meant and-call-fake). It would be nice to have an error there.
1.2 isn't on melpa-stable because it isn't tagged, but your last commit messages suggest that it was intended to be.
When I read a bigger file with lots of describe, it, and before-each forms, it gets quite monotonous and confusing to see what is going on. I'm a very "visual" person and I like colors to keep things readily visible.
So far I put this in my .emacs to make things easier, I think it might be a good addition overall
(font-lock-add-keywords 'emacs-lisp-mode '(("(\\(describe\\|buttercup-define-matcher\\) " 1 'font-lock-keyword-face)))
This makes the describes stand out more, so I can better see where things start; especially when you nest specs, this helps.
In dash we have a customize option which you can enable or disable to get font-lock for the dash macros; we could potentially do the same here. What do you think?
I tried to write tests to verify errors on bad arguments and was bitten by the fact that
(expect (tested-function 'foo) :to-throw)
should have been
(expect (apply-partially 'tested-function 'foo) :to-throw)
A quick look at the expect macro shows that this is probably the easiest solution, but it feels backward and unintuitive.
When I do C-h v :to-equal I get:
:to-be's value is :to-be
Documentation:
Not documented as a variable.
We can add documentation using
(put :to-equal 'variable-documentation "This is a buttercup matcher, it does so and so")
:to-equal's value is :to-equal
Documentation:
This is a buttercup matcher, it does so and so
There is a slight chance someone else would want to document this variable. In which case, we could check if the above property exists and append. In case they document after us, we probably can't do much about it, but I think the chances of that are rather slim.
Then buttercup-define-matcher could accept an optional docstring as a 3rd argument (just like defun), so we won't break existing custom matchers.
You can put this on the backburner if you want, I know how to implement it but probably won't get to it soon (certainly not before I finish the file system thing).
This could be really handy if we start adding more "complex" matchers where the wording might be confusing for some (even the already existing :to-be-close-to taking the precision argument isn't completely obvious).
I'm really enjoying buttercup by the way!
Hi.
While I was implementing the mock file system I wrote tests for it, obviously using buttercup itself. I wrote new matchers which I think might be quite useful in general:
(buttercup-define-matcher :to-be-file (file)
(if (file-regular-p file)
(cons t (format "Expected %S not to be a file" file))
(cons nil (format "Expected %S to be a file" file))))
(buttercup-define-matcher :to-be-directory (dir)
(if (file-directory-p dir)
(cons t (format "Expected %S not to be a directory" dir))
(cons nil (format "Expected %S to be a directory" dir))))
(buttercup-define-matcher :to-contain (file content)
(if (with-temp-buffer
(insert-file-contents file)
(equal (buffer-string) content))
(cons t (format "Expected the content of %S not to `equal' %S" file content))
(cons nil (format "Expected the content of %S to `equal' %S" file content))))
You give it a string representing the name of the file/directory etc.; straightforward. I wanted a matcher that would do a regexp match, but the syntax "file" :to-match "foo" seems too much like the filename should match, not the content. Ideally, we would say :the-content-of "file" :to-match "foo". Maybe adding a feature which would skip all the :-prefixed tokens before the first real argument would help us create a bit nicer language?
Anyway... should I do a PR? Maybe we want to prefix these somehow to indicate they deal with the fs?
I seem to be using this idiom a lot:
(describe "Some feature"
(let (var1 var2)
....))
This adds one unnecessary level of nesting. It would be nice if describe could introduce variables itself, like:
(describe "Some feature"
:variables (var1 var2)
...)
The argument to :variables would be the same as for let.
The discover test runner selects files based on specific file name patterns. These should be documented somewhere.
EDIT: This functionality is now part of the assess project, check it out!
I have no idea how to implement this properly, but it would be really cool if I could run tests inside a "faked file system".
A couple of solutions:
One solution would be to mock all the functions for file-system access to operate over some known data... the problem is finding those functions and managing their interactions. Probably the most robust solution but I think pretty much impossible to implement.
Other solution: we could just create the requested file structure on disk (say in /tmp), set the default directory there and let the code work normally. The downside is that all the tested code would need to work with relative paths (which might not be that bad; most programs don't work with hard-coded paths anyway).
Some nice syntax to make directory structure and specify file content would also be cool.
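A sketch of the second approach, under the assumption that a flat (NAME . CONTENT) alist is enough to describe the files (with-fake-file-system is a made-up name):

```elisp
(defmacro with-fake-file-system (spec &rest body)
  "Run BODY inside a fresh temporary directory populated from SPEC.
SPEC is an alist of (FILE-NAME . CONTENT) strings; the directory is
deleted afterwards, so BODY should use relative paths."
  `(let ((default-directory
          (file-name-as-directory (make-temp-file "buttercup-fs" t))))
     (unwind-protect
         (progn
           (dolist (entry ,spec)
             (with-temp-file (car entry)
               (insert (cdr entry))))
           ,@body)
       (delete-directory default-directory t))))

;; (with-fake-file-system '(("foo.txt" . "hello"))
;;   (file-exists-p "foo.txt"))  ; => t
```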
Here is a repro:
(describe "Buttercup"
(spy-on #'a)
(it "Does catch the first call"
(a)
(expect #'a :to-have-been-called))
(it "Doesn't reset its state for consecutive ‘it’ forms"
(expect #'a :not :to-have-been-called)))
This could be a duplicate of #52, and it could also be an incorrect expectation on my side :)
The Buttercup test runners should use reporter functions for better abstraction. A reporter function is called with various events during the execution of a test run. See the Jasmine documentation for a list of useful events.
There should be a buttercup-reporter-batch by default, and possibly a buttercup-reporter-interactive for buttercup-run-at-point.
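Based on the events visible in the early backtrace code quoted earlier (enter, success, failure, error), a batch reporter could be sketched as a single function dispatching on the event; the exact event names and argument list are assumptions here:

```elisp
;; Sketch of a reporter: one function called with an event symbol and
;; event-specific data.  Event names follow the `buttercup-report'
;; calls quoted earlier and may differ from the final API.
(defun my-batch-reporter (event arg descriptions)
  (pcase event
    (`enter   (message "Running %s"
                       (mapconcat #'identity (reverse descriptions) " ")))
    (`success (message "  PASS"))
    (`failure (message "  FAIL: %S" arg))
    (`error   (message "  ERROR: %S" arg))))
```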
Hey Jorgen,
Is anything wrong with this test? It passes in Emacs 25 on my machine.
(describe "buttercup"
(after-each (ignore))
(describe "when using an after-each clause"
(it "seems to expect nil to be truthy"
(expect nil :to-be-truthy))))
Cheers,
Clément.
The discover code in emacs-buttercup is untested as of yet. Considering this is now the main entry point, and not a proof-of-concept anymore, that's kind of sad.
As a buttercup user, I want to know which parts of my code are not covered by my tests in order to identify which parts of my code still need testing.