emacs-elsa / elsa Goto Github PK
View Code? Open in Web Editor NEWEmacs Lisp Static Analyzer and gradual type system.
License: GNU General Public License v3.0
Emacs Lisp Static Analyzer and gradual type system.
License: GNU General Public License v3.0
With #50 implemented, we can detect nonsense code such as (and (stringp x) (numberp x))
because after both of the narrowings the domain of x
will be empty => condition will never be true.
Some functions like =
take a variadic number of "Number | Marker" which we can not express yet.
We can add a constructor syntax such as Variadic (Number | Marker)
which would be quite easy. A nicer syntax such as (Number | Marker)...
can be also added.
We can also try to support LSP and enable a whole bunch of refactorings. With the proper source code analysis everything will be so simple.
We should also be able to query types of things under cursor (or anywhere for that matter), find references, sources, definitions...
In case we want to suppress the error or warning we should have a way to do so.
This is tricky because simply using read
can not distinguish different forms of the input
,asd
and
(\, asd)
read to the same expression but we need to handle them differently in the reader (though they should bot be read into the same structure).
For naming use the type predicates and follow the convention.
(cond
((maybe1) 1)
((maybe2) "string"))
this now has a type int | string
but it can also be nil
if no branch triggers.
For 3rd party packages we might want to provide the type hints ourselves.
We could have a tool that would go over the package files and extract all the public code:
elsa-make-type
can be pretty slow at times, we should support dumping the typed files into a byte-compiled native format which can be read much faster.
I suspect that simply byte-compiling the file should work as the macros should be able to expand during compile-time.
We need some special logic for:
Special support for cl-defun
will also be necessary to properly handle &keys
and &aux
.
For example
(put 'subr-arity 'elsa-type (elsa-make-type Mixed -> Cons Int (Int | Symbol)))
Returns either Int
or a symbol 'many
as the cdr
. Saying just Symbol
is overly vague.
Once implemented, go over the typed definitions and adjust those where it makes sense
subr-arity
When iterating over the arglists we need to drop the special markers &optional
and &rest
.
There should be some common abstraction for lambdas and defuns.
Since we don't know what is supposed to be evaluated and what not, do not evaluate anything by default.
Later we should learn to read the declare
debug
form and analyse only those parts which are evaluated, such as form
or body
.
These are not ordered by priority but load-order. Pick something useful first.
subr
and simple
are good initial candidates.
Because different versions of emacs use different representation of objects (vectors before 26 and records after 26) half the tests fails.
We can kill two birds with one stone by testing that whatever type the macro produces serializes back to the form that created it, such that reading it back would produce the same type.
However, we need to take into account normalization of nested parens, such that (Int)
and Int
are equivalent even though the form is different (both will print as Int
)
This was discovered in #20.
Functions in elisp can be without arguments and so we can have (current-buffer-name) :: string
but it still should be a "function" type?
Every type should support elsa-type-return
which returns the last type in the list, so for atomic types just that and for real function types the last type in the sequence.
There is not really much to implement except the unification mechanism and then some simple substitution magic on elsa-get-type
which would take the types from the surroundings and fill the still generic variables with whatever appropriate (or mixed).
We need to track the existence of the dot read syntax independent of the resulting read form.
This breaks reading forms such as
(defvar sp-pairs
'((t
.
((:open "\\\\(" :close "\\\\)" :actions (insert wrap autoskip navigate))
(:open "\\{" :close "\\}" :actions (insert wrap autoskip navigate))
(:open "\\(" :close "\\)" :actions (insert wrap autoskip navigate))
(:open "\\\"" :close "\\\"" :actions (insert wrap autoskip navigate))
(:open "\"" :close "\""
:actions (insert wrap autoskip navigate escape)
:unless (sp-in-string-quotes-p)
:post-handlers (sp-escape-wrapped-region sp-escape-quotes-after-insert))
(:open "'" :close "'"
:actions (insert wrap autoskip navigate escape)
:unless (sp-in-string-quotes-p sp-point-after-word-p)
:post-handlers (sp-escape-wrapped-region sp-escape-quotes-after-insert))
(:open "(" :close ")" :actions (insert wrap autoskip navigate))
(:open "[" :close "]" :actions (insert wrap autoskip navigate))
(:open "{" :close "}" :actions (insert wrap autoskip navigate))
(:open "`" :close "`" :actions (insert wrap autoskip navigate)))))
"List of pair definitions.
Maximum length of opening or closing pair is
`sp-max-pair-length' characters.")
Currently we save the start and end positions of the forms but those are not good for error reporting since we would have to open the file buffers again and calculate the lines and offsets which can be done in the reader itself (probably from the mail function for each of the forms uniformly).
If I have a variable a
which has type Mixed
, in the following form
(if (stringp a) (progn ...) (progn ...))
it should have type String
in the then body and Mixed \ String
in the else body.
We should also have some unit tests... this is pretty core logic.
This is quite important as these are quite common and break a lot of code by introducing "unbound variables".
For start just adding mixed variables is OK, later we might want to actually analyse the binding forms also (where possible... destructuring is super tricky because the structure might not be known).
For example the return type of symbol-function
can be a lambda or a subr... and we don't know the arity or type or anything (unless the symbol is a constant and we know the definition).
But if it is not nil it is definitely callable.
We should have several subtypes
See https://github.com/Fuco1/Elsa/blob/master/dev/builtin.txt
If we detect a function that throws in the body and the exception is not handled, we should error.
See #26 .
;; (elsa :: (Int -> Bool) -> Bool)
(defun foo (fun) t)
;; (elsa :: Number -> Bool)
(defun bar (n) t)
(let ((a (lambda (x) t))) (foo a))
(let ((a 'null)) (foo a)) ;; does not work currently, a is Mixed (and would be a Symbol)
(foo 'bar)
(foo 'null)
Maybe we need to keep track of a second type for places where the symbol is used as a callable? Once a
is bound to 'null
we lose the information about the symbol name so the function definition should probably be bound to something to keep track.
I think this is only relevant property for symbols as other atoms are not callable.
(let ((a "a"))
(eq a a)) ;; => t
(let ((a "a")
(b "a"))
(eq a b)) ;; => nil
I can not imagine when the first is much useful... Probably this is an error. It should be a warning but configurable.
(:foo a :bar b)
&keys
syntax from cl-defun(propertize "string" 'a b 'c d)
, a
and c
are keys and b
is a value... so the type is something like (function (string &rest [symbol value]) string)
. We need to add a good syntax for repeating a set of values. Take inspiration in macro instrumentation.What we need at minimum:
oref
and oset
to work with this information.We should thread a state variable through the analysis and it should have a method to add a message.
This will also save us all the filtering for -non-nil
and -flatten
and so on.
'foo
should have type symbol'(foo)
should have type list:keywords
, numbers...) since e.g. (stringp '"foo")
is t
https://www.gnu.org/software/emacs/manual/html_node/elisp/Special-Forms.html
'(repeat :value ("k" . ignore)
(choice :value ("k" . ignore)
(list :tag "Descriptive Headline" (string :tag "Headline"))
(cons :tag "Letter and Command"
(string :tag "Command letter")
(choice (function) (sexp)))))
(from org.el)
So it could be maintained and released separately. We currently have the code in readme only which is not very convenient.
Some methods might always return a non-empty list, then taking something like car
of it will never return nil.
A type Cons Int (Int | Symbol)
is printed as Cons Int Int | Symbol
which is a different type. Similarly for functions with nested functions etc.
We need to decide on wrapping with parens based on the type of type (composite or atom). This is I think relevant to sum, diff and function types only.
Variable type is static for the entire duration of the script, but the type of the expression involving said variable can change due to e.g. typechecks:
(defvar a "string" :type string?)
(if a "not-null"
(setq a "default) ;; should not error on "a is of type nil"
)
This should probably be tracked on the scope because as we nest bindings variables get pushed and popped and so the information would get mixed.
We can simply walk through the scope and on the first use set a flag. When we are popping a variable, check if it was never used in the scope and flag it as a warning.
Beware, the popping might happen as implementation detail for example in conditionals narrowing... we should only disregard the information when popping a "defining" entry like a binding from let
or defun
. Otherwise the information needs to be propagated upwards.
To make this simple we should probably add the defining form to the variable class so that we can refer back to it later.
For built-ins subr-arity
.
See also
(defun help-function-arglist (def &optional preserve-names)
"Return a formal argument list for the function DEF.
IF PRESERVE-NAMES is non-nil, return a formal arglist that uses
the same names as used in the original source code, when possible."
;; Handle symbols aliased to other symbols.
(if (and (symbolp def) (fboundp def)) (setq def (indirect-function def)))
;; If definition is a macro, find the function inside it.
(if (eq (car-safe def) 'macro) (setq def (cdr def)))
(cond
((and (byte-code-function-p def) (listp (aref def 0))) (aref def 0))
((eq (car-safe def) 'lambda) (nth 1 def))
((eq (car-safe def) 'closure) (nth 2 def))
((or (and (byte-code-function-p def) (integerp (aref def 0)))
(subrp def))
(or (when preserve-names
(let* ((doc (condition-case nil (documentation def) (error nil)))
(docargs (if doc (car (help-split-fundoc doc nil))))
(arglist (if docargs
(cdar (read-from-string (downcase docargs)))))
(valid t))
;; Check validity.
(dolist (arg arglist)
(unless (and (symbolp arg)
(let ((name (symbol-name arg)))
(if (eq (aref name 0) ?&)
(memq arg '(&rest &optional))
(not (string-match "\\." name)))))
(setq valid nil)))
(when valid arglist)))
(let* ((args-desc (if (not (subrp def))
(aref def 0)
(let ((a (subr-arity def)))
(logior (car a)
(if (numberp (cdr a))
(lsh (cdr a) 8)
(lsh 1 7))))))
(max (lsh args-desc -8))
(min (logand args-desc 127))
(rest (logand args-desc 128))
(arglist ()))
(dotimes (i min)
(push (intern (concat "arg" (number-to-string (1+ i)))) arglist))
(when (> max min)
(push '&optional arglist)
(dotimes (i (- max min))
(push (intern (concat "arg" (number-to-string (+ 1 i min))))
arglist)))
(unless (zerop rest) (push '&rest arglist) (push 'rest arglist))
(nreverse arglist))))
((and (autoloadp def) (not (eq (nth 4 def) 'keymap)))
"[Arg list not available until function definition is loaded.]")
(t t)))
See 626fb1f
Blocked by #36.
cl-defmethod
as if it were a defun
cl-defgeneric
We need to add support for analysing cl-defmethod
as if it were a defun
with addition of handling the "cl-style" argument list. For start just extend it to the forms like (this type)
so that we can have type information better than Mixed
on actually typed arguments.
At some point we would need to add (elsa ::)
annotations anyway for the rest of the arguments. The type in the function should always take precedence over the comment annotation because that is what is actually going to be used.
For example org-element-at-point
returns a structure which is
(symbol plist)
and some of the plist properties are known beforehand (:begin
, :end
...)
elsa-type-sum
elsa-type-intersect
elsa-type-diff
We can first make this as naive as possible and then add some normalization on top to reduce types into minimal forms.
Then remove unnecessary functions
elsa-sum-type-remove
elsa-diff-type-remove-positive
It should be possible to do all the operations generically on all the types (that's the point of having the class interface) so that these specific functions make no selse.
We should also move the methods that work on the classes back to elsa-types.el
and only have defuns in helpers (or maybe remove that file alltogether).
This possibly also means (think about it!) that we don't need the surely
and surely-not
properties on elsa-variable
as those are simply expressible as sum and diff types.
Here it will report that -map
expects two args but only gets one. Same for -reduce
(->> (-split-on '| definition)
(-map 'elsa--make-type)
(-reduce 'elsa-type-sum))
It might happen that a private function is not referenced from anywhere inside the package. In which case it's dead code and we should remove it.
This will require construction of something like a call graph, because a function might be referenced from another which is never called, then both should be marked as unreachable.
Also think about mutually recursive functions which are unreachable from outside.
So basically directed components and then use those as equivalence classes and try reaching them from the entry points (= public functions).
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.