stencila / stencila

Programmable, reproducible, interactive documents

License: Apache License 2.0

Makefile 0.43% Rust 72.25% Shell 0.34% Dockerfile 0.04% TypeScript 20.52% Python 4.50% JavaScript 0.61% CSS 0.78% R 0.53%

executable document interactive programmable reactive reproducible-research

stencila's Issues

Include directive once flag

The following flags may be useful for Stencil include directives:

asis : means that the includee is not refreshed (re-rendered) but included as is (useful for including large stencils, e.g. chapters, which you don't want to re-render)
complete : means that the includee is included with all its directives (normally these will be removed)
once : means that the includee is only included once with all directives

Consider setting up Vagrant instances for builds, with provisioning scripts in bash (so users can run the scripts on their own machines easily without requiring something like Chef, Puppet or Ansible). These would provide a way of compiling 32/64-bit Linux/Windows etc. versions of each library module.

Import directive

It would be useful to have an "import" directive similar to, for example, Python's import statement. It would import the names of macro elements into a "macro map" for the Stencil (or Context it is being rendered in). Those macros could then be accessed more succinctly e.g.

import some/address/to/a/stencil some-macro
some-macro(arg1=42,arg2="foo")

instead of

include some/address/to/a/stencil #some-macro
   set arg1=42
   set arg2="foo"

Integrate docker module

Currently, we have a separate stencila/docker repo. But it is so tightly linked to this repo it may as well go here. Automated builds can be specified for the subdirectory anyway.

Cila parsing and generation

The current implementation is clumsy and does not handle corner cases well. Reimplement the new Cila syntax using a state machine that transitions between contexts.

Directive parameters optionally treated as expression

Many directives have parameters which are expressions in the context language. For example in the for directive data-for="num in 1:10", there is a name parameter num (should not be an expression) and an expression parameter 1:10. In this case, 1:10 is evaluated within the context.

For maximum flexibility it would be advantageous to make many (all?) directive parameters expressions. For example, when using an exec directive to create an image it may be advantageous to have the dimensions of the figure determined within the context. To allow for this the width and height parameters of the exec directive should be expressions.

For some directive parameters this may be onerous. For example, the address parameter of the include directive would have to have quotes around it e.g. data-include="'address/of/includee'". In these cases some extra syntax may be necessary such that the parameter is by default not evaluated but can be if necessary e.g.

For normal use

data-include="address/of/includee"

For evaluated use

data-include="eval paste0(an,expression,which,provides,the,address)"
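The opt-in evaluation proposed above could be sketched as follows. This is a minimal illustration in Python, not the Stencila implementation: `parse_parameter` and `toy_evaluate` are hypothetical names, and the toy evaluator only stands in for a real rendering context.

```python
def parse_parameter(raw, evaluate):
    """Return the value of a directive parameter.

    `evaluate` is a hypothetical callback standing in for the context's
    expression evaluator (e.g. an R or Python context).
    """
    prefix = "eval "
    if raw.startswith(prefix):
        # Evaluated use: pass the remainder to the context as an expression
        return evaluate(raw[len(prefix):].strip())
    # Normal use: treat the attribute value as a literal string
    return raw


def toy_evaluate(expression):
    """Toy evaluator for illustration only; a real context would be used."""
    return eval(expression, {"__builtins__": {}}, {"base": "address/of"})


literal = parse_parameter("address/of/includee", toy_evaluate)
evaluated = parse_parameter("eval base + '/includee'", toy_evaluate)
```

With this approach the common case needs no quoting, and only parameters that start with the `eval` keyword are sent to the context.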

For directive : group child elements into a repeat directive

Currently, the for directive simply repeats the first child element for each item. This means that if there are, say, 2 children, only the first gets rendered. To fix this you need to nest multiple children within a div.

The for directive should do this for you. If there is no each directive child then create a new one and nest all children under it.

Input type for par stencil directive

Currently, rendering of a par directive only handles

<input type="text"...> where value is converted to a string literal
or an <input> with no type specified where value is parsed as an expression in the context language

Other HTML5 input types (e.g. date, color, range) need to be handled by wrapping their values in code before evaluating the resulting expression in the context.
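One way the wrapping could work is a lookup from input type to a small code template. The sketch below is in Python and generates R expressions; the function names and the particular wrappers (`as.Date`, `as.numeric`) are assumptions chosen for illustration, not part of the existing par directive.

```python
def quote(value):
    """Return `value` as a double-quoted string literal."""
    return '"' + value.replace("\\", "\\\\").replace('"', '\\"') + '"'


# Hypothetical wrappers, written here as R expressions
WRAPPERS = {
    "text": lambda v: quote(v),
    "date": lambda v: "as.Date(%s)" % quote(v),
    "color": lambda v: quote(v),
    "range": lambda v: "as.numeric(%s)" % quote(v),
    "number": lambda v: "as.numeric(%s)" % quote(v),
}


def input_to_expression(input_type, value):
    """Convert an <input> value into an expression for the context."""
    if input_type is None:
        # No type specified: the value is already an expression
        return value
    try:
        wrap = WRAPPERS[input_type]
    except KeyError:
        raise ValueError("unsupported input type: %r" % input_type)
    return wrap(value)
```

Each context language would need its own table of wrappers, since the generated code is evaluated in that language.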

Generation of Markdown syntax within Cila

Parsing of Markdown syntax was added to Cila with b4bbe70. Generation of that syntax is partially implemented and needs to be completed. This probably requires some thought about distinguishing between inline and block elements in Cila, as in HTML. Rules need to be defined for when the Markdown "shortcut" syntax can be generated and when the usual Cila syntax must be used.

Add Stencil class

Implement Stencil class with interfaces in Python and R. Tasks include

integrating pugixml and tidy-html5 into cpp\requires
creating utilities/xml and utilities/html namespaces and headers
adding stencil.hpp and implementing basic interface
implement rendering
implement r and py contexts

Stencil context declarations

Currently contexts must be defined for each stencil as a <li>. It may be better to use a comma separated list in a data attribute to be consistent with other directives: <div data-contexts="r,py">.

Also, it would be nice if the preferred context could be inferred from the code directives within the stencil, e.g. if they are all r directives then use an RContext.

Lastly, we should allow for code directives with no context specified. Some pieces of code may run in multiple contexts.

Improve embedded server error messages

When using the embedded server, error pages just return plain text exception messages e.g.

Improve by creating an error method in Server which returns exceptions and other messages wrapped in some HTML, possibly including some data uri encoded images.
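A minimal sketch of such an error method, written in Python for illustration (the real Server class is C++, and the name `error_page` and the page layout are assumptions). The important detail is escaping the exception message before embedding it in HTML.

```python
import html


def error_page(message, status=500):
    """Wrap an exception message in a small HTML page."""
    # Escape the message so that markup in it is shown, not interpreted
    escaped = html.escape(message)
    return (
        "<!DOCTYPE html>\n"
        "<html><head><title>Error %d</title></head>"
        "<body><h1>Error %d</h1><pre>%s</pre></body></html>"
    ) % (status, status, escaped)
```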

Add Array class

Implement Array class. Tasks include

array.hpp with Array, Dimension and Level classes.
query.hpp with Aggregator classes like Sum, Count etc and corresponding free functions that will dispatch to Array::query method (These classes will be used with Table and other classes as well)

HttpServer to shutdown gracefully with SIGTERM

This may help:

#include <boost/asio/signal_set.hpp>

http://www.boost.org/doc/libs/1_47_0/doc/html/boost_asio/overview/signals.html
http://www.boost.org/doc/libs/1_47_0/doc/html/boost_asio/example/http/server/server.cpp

Override source function in R stencil contexts

In R stencils, by default the source function sources code into the global environment, not the local stencil rendering context. To fix that, you need to use local=T:

r
    source('../../common.R',local=T)

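but a better alternative may be to override the source function within RContexts like this:

# Capture the caller's environment; local=T here would refer to this wrapper's own frame
source <- function(file) { env <- parent.frame(); base::source(file,local=env) }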

Basic directory structure and repository functionality for package component

Set up the basic directory structure for cpp, python, and r modules. Implement Component base class and Package class with git repository functionality.

R HttpServer should take address and port arguments

Refactor HTML Node and Document

Currently,

Html::Node is just a Xml::Node
Html::Document is derived from Xml::Document with some input/output methods that deal with HTML

When you, say, filter an Html::Document you get a list of Xml::Nodes which dump() as XML, not as HTML. Html::Node should be derived from Xml::Node with the extra methods and Html::Document derived from Html::Document.

This will simplify the code for the proper generation of indented HTML which needs some work.

Add users and permissions to Component<void>

The hub will send sessions (via Http::Server) a username and token and permission for each component that user opens. The Component class is the place that should store that information so that user requests can be authenticated and authorized.

Add websockets server

Currently there is an HTTP server which handles REST requests. Add an analogue for websockets, probably based upon https://github.com/zaphoyd/websocketpp/, which passes a path and data to the same Component methods as REST currently does.

Add Report class

Report class to represent a Stencil that has been stripped of all its semantics by removing stencil data-xxx attributes or rendered nodes with these attributes (e.g. data-error). Likely to be implemented as derived from Html::Document, with additional methods that give the address of the stencil it came from, the context used to render it, time of rendering etc. The actual generation of the Report from a rendered stencil may be best done as part of the Stencil class; it will be similar to the write() method.

R package class framework

Currently we use "reference classes" (aka R5 classes) to implement Stencila components in R (e.g.).

The newish R6 classes offer performance benefits.

Move to R6 classes (requires R6 package) or use a simple closure based approach (to minimize dependencies)? Consider documentation implications (e.g. r-lib/R6#3)

Stencil::render_image_() method

In Stencil::render_image_() finalise the protocol for insertion of bitmap formats: file in stencil directory or as a data uri? Or both depending upon the type of stencil?

R packaging

The Stencila R package needs to be built for multiple platforms.

Stencila C++, and thus Stencila R, relies on numerous open source libraries. Rather than distributing an R source package, and hoping that the user will have all the necessary dependencies to do a compile, the current strategy is to compile shared libraries (.so and .dll) and distribute those instead. install.libs.R is run on installation and looks for the correct shared library (either in the package or from http://get.stenci.la) and puts it in the right place.

Is this the right strategy?
How specific do versions need to be? For example, an R package developed under Ubuntu12.04/R3.1.2/Rcpp0.11.3 fails to install under Ubuntu14.10/R3.1.1/Rcpp0.11.4 (reason unknown) but will install under Ubuntu14.10/R3.1.2/Rcpp0.11.4 (i.e. upgrading R from 3.1.1 to 3.1.2).
Building Windows binaries has not been worked on for a while
Building Mac OSX binaries has never been worked on
Makefile should include a task to upload built binaries to http://get.stenci.la (if this strategy is continued)

See https://github.com/stencila/stencila/blob/master/Makefile#L550 for relevant section of Makefile.

Add `r-tests` to Travis build

Recently, py-tests were added. Do the same for R

Automatically print ggplots in R stencils

With a code directive like this:

r png
     ggplot(...) + ....

no png file is generated because the ggplot is not rendered. You have to explicitly print ggplots like this:

r png
     print(ggplot(...) + ....)

Re-read components on get()

Currently the Component::get method retrieves a component instance from memory if it has already been "gotten" (e.g. when included in a stencil). But it does not re-read that component. That is a problem if the source file for that stencil has been changed: currently, if the source for an included stencil has changed, you have to restart the R/Python session. So, we perhaps need to add some attributes to components, like source and time, so that files can be examined for changes and reloaded when necessary.
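The idea of storing a source path and time alongside each cached instance can be sketched like this. It is an illustrative Python sketch, not the C++ Component::get; `ComponentCache` and its `load` callback are hypothetical, and comparing modification times is only one possible way to detect changes.

```python
import os


class ComponentCache:
    """Cache components in memory, re-reading them when the source changes."""

    def __init__(self, load):
        self._load = load        # hypothetical loader: path -> component
        self._instances = {}     # path -> (modification time, component)

    def get(self, path):
        mtime = os.path.getmtime(path)
        cached = self._instances.get(path)
        if cached is not None and cached[0] == mtime:
            # Source file unchanged since it was last read
            return cached[1]
        # First request, or the file has changed on disk: (re)load it
        component = self._load(path)
        self._instances[path] = (mtime, component)
        return component
```

A content hash could be used in place of the modification time if timestamps prove unreliable, at the cost of reading the file on every call.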

Description directive

Some directives (macro and arg in particular) could usefully have child "description" (or "notes") directives for documentation. These would be <div>s which contain HTML describing what the directive is for.

Inline text directive in Cila?

Currently to render some text inline with Cila you have to use this syntax: Area of circle: {text 2*pi*r^2}m. Would it be better to have syntax like: Area of circle: {{ 2*pi*r^2}}m as in Django, Mustache and others?

Write stencil to disk when HTML or Cila is set remotely

Some of the remote method interfaces for stencils in https://github.com/stencila/stencila/blob/master/cpp/stencila/stencil-serve.cpp should write the stencil to disk.

Stencil methods for modifying HTML

Consider adding two methods for modifying a stencil's HTML:

"strip", "rebase" or "bare" : removes all elements and attributes added as a result of rendering (e.g. items in a for directive); this could be useful for "starting over" with a stencil
"opaque" ... : remove all stencil directives; should create a new document perhaps called a "Report" since this is a destructive method

Stencil `compile()` method

Themes have a compile() method (for generating minified CSS and JS). Add the same for stencils for generating index.html and preview.png

Use a Windows CI

Currently, Travis CI is used to build for Linux. Can http://www.appveyor.com/ be used for Windows builds?

Improve decoding of URLs by server

Relevant line is

stencila/cpp/stencila/network.cpp

Line 74 in 6c13073

// More conversions will be required

See for example http://bogomip.net/blog/cpp-url-encoding-and-decoding/
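For reference, the decoding logic can be sketched as below. This is an illustrative Python version rather than the C++ in network.cpp, and `decode_url` is a hypothetical name. Treating `+` as a space is an assumption that only suits form-encoded query strings, and malformed escapes are passed through unchanged here as a design choice.

```python
import string

HEX_DIGITS = set(string.hexdigits)


def decode_url(encoded):
    """Decode percent-escapes (and '+' as a space) into text."""
    out = bytearray()
    i = 0
    while i < len(encoded):
        ch = encoded[i]
        pair = encoded[i + 1:i + 3]
        if ch == "%" and len(pair) == 2 and all(c in HEX_DIGITS for c in pair):
            # A valid escape such as %20: append the byte it encodes
            out.append(int(pair, 16))
            i += 3
        elif ch == "+":
            # Assumption: form-style encoding, where '+' means a space
            out.append(0x20)
            i += 1
        else:
            # Ordinary character, or a malformed escape left as it is
            out.extend(ch.encode("utf-8"))
            i += 1
    # Escaped bytes are assumed to be UTF-8
    return out.decode("utf-8", errors="replace")
```

Collecting bytes first and decoding at the end matters because a single non-ASCII character is spread over several escapes.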

Update tidy-html5 version

tidy-html5 development has been reinvigorated. A 5.0.0 release is due 17 April. Probably worthwhile upgrading.

Token based access to Components via websocket server

Currently the Websocket server passes all requests on to Components. There is no access control. Implement token based access control. It might work something like this:

in Component::declare() generate a token for the component and store it in instances_
in Component::view() append the token to the URL
in Server::http() extract the token and pass it on...
in Component::page() and Component::message() check the token
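The steps above could be sketched as follows. This is a Python illustration under stated assumptions: `TokenRegistry` and its method names are hypothetical stand-ins for Component::declare(), Component::view() and the checks in Component::page() and Component::message(), and the `token` query parameter is an invented URL format.

```python
import hmac
import secrets


class TokenRegistry:
    """Issue and check one access token per component."""

    def __init__(self):
        self._tokens = {}  # component address -> token

    def declare(self, address):
        # Generate an unguessable token when the component is declared
        token = secrets.token_urlsafe(32)
        self._tokens[address] = token
        return token

    def view_url(self, address):
        # Append the token to the URL used to view the component
        return "/%s?token=%s" % (address, self._tokens[address])

    def check(self, address, token):
        # Reject unknown components and requests without a token
        expected = self._tokens.get(address)
        if expected is None or token is None:
            return False
        # Constant-time comparison avoids leaking the token via timing
        return hmac.compare_digest(expected.encode(), token.encode())
```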

Component testing

Each component should have a test method which searches for tests in the component e.g.:

a tests directory
tests.* files e.g. tests.R

The tests will get called by a system call from C++, e.g. python tests.py. The tests should load the component, run tests and output a standard format output file (probably JUnit XML). The component repo should be tagged with:

datetime of test
number of tests performed; number passed (parsing of XML can be done in C++)
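Extracting those counts from a JUnit-style file is straightforward. The sketch below uses Python for illustration (the issue suggests doing the parsing in C++), and `summarise_junit` is a hypothetical name. It assumes the common layout in which each `testcase` element holds an optional `failure`, `error` or `skipped` child.

```python
import xml.etree.ElementTree as ET


def summarise_junit(xml_text):
    """Count the tests run, passed, failed and skipped in JUnit XML."""
    root = ET.fromstring(xml_text)
    counts = {"tests": 0, "passed": 0, "failed": 0, "skipped": 0}
    # iter() finds testcase elements at any depth, so both a single
    # <testsuite> and a wrapping <testsuites> element are handled
    for case in root.iter("testcase"):
        counts["tests"] += 1
        if case.find("failure") is not None or case.find("error") is not None:
            counts["failed"] += 1
        elif case.find("skipped") is not None:
            counts["skipped"] += 1
        else:
            counts["passed"] += 1
    return counts
```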

That allows any updates of a local component to only use the latest version which has passed all tests. Tests will get inherited when a component is forked. Testing is most likely to be useful for stencils because they often contain code.

This might be a specific case of a general type of component method that executes a corresponding script in the component's directory e.g.

method test() runs test.R dumping stdout and stderr to a unique filename in the tests subdirectory which is parsed by test() for results; method tests() parses all files in the tests subdirectory.
method update() runs update.py .... (for updating Tables, Arrays etc)

Stencil import directive?

Should there be a data-import directive to allow functions of a stencil to be imported into the context without including any HTML content? This would require a <code id="main"> or similar element to identify which code is to be imported. Allow for Python-style import xxxx (import all objects into context), import xxx.yyy (only yyy) and import xxx.yyy as zzz (rename).

Implement Python HttpServer

Relocate, and provide Cila for, "meta" Stencil attributes

There are some, optional, Stencil attributes, namely title, authors, contexts, keywords, description. Most of these are stored in <head><meta> although some are in <body> but outside of #content. It may be best to allow for all of these to go into #content so that they can be edited directly and for Cila directives to be created for each.

See also #20 with regard to description and #23 with regard to contexts.

Customised `preview()` shots

Commit #1d764a478 started adding alternative ways to generate previews for a stencil. Finish these off.

Stencil sanitization

In construction of a stencil, some HTML sanitization should be done. Use a whitelisting approach, where only tags in the list are allowed, rather than the less robust blacklisting. In addition to simple whitelisting, tag modification may be appropriate, e.g. for an img the src attribute could be modified to a generic "blocked image" image.
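A whitelisting sanitizer with tag modification might look like the sketch below. It is a Python illustration using the standard library's `html.parser`, not the Stencila implementation. The whitelist, the `blocked.png` placeholder and the choice to drop the content of script and style elements are all assumptions. A real whitelist would need to be larger, and would need URL checks before allowing attributes such as href.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: tag name -> allowed attribute names
ALLOWED = {
    "p": set(), "em": set(), "strong": set(),
    "div": {"id", "class"}, "span": {"id", "class"},
    "img": {"src", "alt"},
}
VOID = {"img"}                     # tags that take no closing tag
DROP_CONTENT = {"script", "style"}  # tags removed along with their content
BLOCKED_IMAGE = "blocked.png"       # hypothetical placeholder image


class Sanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._suppressed = 0  # depth inside script/style elements

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self._suppressed += 1
            return
        if self._suppressed or tag not in ALLOWED:
            return  # tag is dropped; any text inside it is kept
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag]:
                continue
            if tag == "img" and name == "src":
                value = BLOCKED_IMAGE  # tag modification
            kept.append(' %s="%s"' % (name, escape(value or "", quote=True)))
        self.parts.append("<%s%s>" % (tag, "".join(kept)))

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self._suppressed = max(0, self._suppressed - 1)
        elif not self._suppressed and tag in ALLOWED and tag not in VOID:
            self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        if not self._suppressed:
            self.parts.append(escape(data, quote=False))


def sanitize(html_text):
    """Return `html_text` with only whitelisted tags and attributes."""
    parser = Sanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)
```

Because output is rebuilt from the parsed events rather than edited in place, anything the parser does not recognise as whitelisted simply never reaches the result.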

Javascript rendering context

It would be useful to have a Javascript stencil rendering context that could GET JSON from a URL and render a stencil fragment using it, e.g. for creating a page with a list of components

with get('/path/to/a/list/of/components'):
  for com in components:
    with x:
       div .address 
           write address

Server would serve html version of stencil and client side JS would get the JSON, and walk through html nodes calling context methods as per usual.

See the following for ideas of creating context namespaces in JS:

Document path and patch format in HTTP and WS requests

Each Component type exposes a path "API". That is, each type translates a path into something a GET, PUT, or PATCH can be applied to. The basic path spec is,

/type/id/method
or
/type/id/rest of path

Each type knows how to deal with rest of path. For example, in a Stencil,

DELETE /stencil/43111/body/0/

means delete the first child of the body in stencil with id 43111

PATCH /dataset/5422/flags/6432/colour {['s':'red']}

means set the colour column of row 6432 of the flags table to red.
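Splitting a request path according to this spec could be sketched as below. It is a Python illustration and `parse_path` is a hypothetical name. The type and id are separated out, and the remaining segments are left for the component type to interpret.

```python
def parse_path(path):
    """Split '/type/id/rest/of/path' into (type, id, rest segments)."""
    # Ignore empty segments from leading, trailing or doubled slashes
    parts = [part for part in path.split("/") if part]
    if len(parts) < 2:
        raise ValueError("expected at least /type/id, got %r" % path)
    return parts[0], parts[1], parts[2:]
```

For the two examples above this gives a type of `stencil` with rest `["body", "0"]`, and a type of `dataset` with rest `["flags", "6432", "colour"]`.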

Include directive stripping

Currently the includee gets included verbatim (i.e. with all its directives). This has the disadvantage of cluttering the includer with directives. For example, would you really want the R code used to transform a data.frame into a HTML table to be included in the includer for every table?

So, the Include::render method should first render and strip the includee before including it.

Better output filenames

Currently, when an exec directive has an output, the filename is the directive's hash. This makes it difficult to pick out the right file for use elsewhere (e.g. putting a PNG into a Word document). Consider adding the figure's #id, or a slug of its caption, as a prefix if one is available.
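A possible naming scheme is sketched below in Python. The function names, the preference for id over caption, and the `prefix-hash.extension` layout are assumptions for illustration. The hash is kept in the name so that files remain unique.

```python
import re


def slugify(text):
    """Lower-case `text` and replace runs of other characters with '-'."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def output_filename(hash_, extension, id=None, caption=None):
    """Build an output filename with a readable prefix where available."""
    # Prefer the figure's id, then its caption, then no prefix at all
    prefix = slugify(id or caption or "")
    if prefix:
        return "%s-%s.%s" % (prefix, hash_, extension)
    return "%s.%s" % (hash_, extension)
```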
