Stencila

Programmable, reproducible, interactive documents

👋 Intro • 🚴 Roadmap • 📜 Docs • 📥 Install • 🛠️ Develop

🙏 Acknowledgements • 💖 Supporters • 🙌 Contributors



👋 Introduction

Stencila is a platform for creating and publishing dynamic, data-driven content. Our aim is to lower the barriers to creating truly programmable documents, and to make it easier to publish them as beautiful, interactive, and semantically rich articles and applications. Our roots are in scientific communication, but our tools are useful beyond it.

This is v2 of Stencila, a rewrite in Rust focussed on the synergies between three recent and impactful innovations and trends:

We are embarking on a rewrite because CRDTs will now be the foundational synchronization and storage layer for Stencila documents. This requires fundamental changes to most other parts of the platform (e.g. how changes are applied to dynamic documents). Furthermore, a rewrite allows us to bake in, rather than bolt on, new modes of interaction between authors and LLM assistants and to add mechanisms to mitigate the risks associated with using LLMs (e.g. by recording the actor, human or LLM, that made the change to a document). Much of the code in the v1 branch will be reused (after some tidy-ups and refactoring), so v2 is not a complete rewrite.

🎥 Showcase

Simultaneously editing the same document in different formats

Here, a Stencila Article has previously been saved to disk as a CRDT in main.sta. Then, the sync command of the stencila CLI is used to simultaneously synchronize the CRDT with three files, in three different formats currently supported in v2: JATS XML, JSON, and Markdown. Changes made in one file (here, in VSCode) are merged into the in-memory CRDT and written to the other files.

You'd probably never want to do this just by yourself. But this demo illustrates how Stencila v2 will enable collaboration across formats on the same document. Any particular format (e.g. Markdown, LaTeX, Word) is just one of the potential user interfaces to a document.

file-sync-2023-09-29.mp4

🚴 Roadmap

Our general strategy is to iterate horizontally across the feature set, rather than fully developing features sequentially. This will better enable early user testing of workflows and reduce the risk of finding ourselves painted into an architectural corner. So expect initial iterations to have limited functionality and be buggy.

We'll be making alpha and beta releases of v2 early and often across all products (e.g. CLI, desktop, SDKs). We're aiming for a 2.0.0 release by the end of Q3 2024.

🟢 Stable • 🔶 Beta • ⚠️ Alpha • 🚧 Under development • 🧪 Experimental • 🧭 Planned • ❔ Maybe

Schema

The Stencila Schema is the data model for Stencila documents (definition here, generated reference documentation here). Most of the schema is well defined but some document node types are still marked as under development. A summary by category:

| Category | Description | Status |
| --- | --- | --- |
| Works | Types of creative works (e.g. Article, Figure, Review) | 🟢 Stable (mostly based on schema.org) |
| Prose | Types used in prose (e.g. Paragraph, List, Heading) | 🟢 Stable (mostly based on HTML, JATS, Pandoc etc) |
| Code | Types for executable (e.g. CodeChunk) and non-executable code (e.g. CodeBlock) | 🔶 Beta (may change) |
| Math | Types for math symbols and equations (e.g. MathBlock) | 🔶 Beta (may change) |
| Data | Fundamental data types (e.g. Number) and validators (e.g. NumberValidator) | 🔶 Beta (may change) |
| Flow | Types for control flow within a document (e.g. If, For, Call) | 🚧 Under development (likely to change) |
| Style | Types for styling parts of a document (Span and Division) | 🚧 Under development (likely to change) |
| Edits | Types related to editing a document (e.g. InstructionBlock, DeleteInline) | 🚧 Under development (likely to change) |

Storage and synchronization

In v2, documents can be stored as binary Automerge CRDT files, branched and merged, and with the ability to import and export the document in various formats. Collaboration, including real-time, is made possible by exchanging fine-grained changes to the CRDT over the network. In addition, we want to enable interoperability with a Git-based workflow.

| Functionality | Description | Status |
| --- | --- | --- |
| Documents read/write-able | Able to write a Stencila document to an Automerge binary file and read it back in | ⚠️ Alpha |
| Documents import/export-able | Able to import or export document as alternative formats, using tree diffing to generate CRDT changes | ⚠️ Alpha |
| Documents fork/merge-able | Able to create a fork of a document in another file and then later merge with the original | 🧭 Planned Q4 2023 |
| Documents diff-able | Able to view a diff, in any of the supported formats, between versions of a document and between a document and another file | 🧭 Planned Q4 2023 |
| Git merge driver | CLI can act as a custom Git merge driver | 🧭 Planned Q4 2023 |
| Relay server | Documents can be synchronized by exchanging changes via a relay server | 🧭 Planned Q4 2023 |
| Rendezvous server | Documents can be synchronized by exchanging changes peer-to-peer using TCP or UDP hole punching | ❔ Maybe |

Formats

Interoperability with existing formats has always been a key feature of Stencila. We are bringing over codecs (a.k.a. converters) from the v1 branch and porting other functionality from encoda to Rust.

| Format | Encoding | Decoding | Coverage | Notes |
| --- | --- | --- | --- | --- |
| JSON | 🟢 | 🟢 | | |
| JSON5 | 🟢 | 🟢 | | |
| JSON-LD | 🔶 | 🔶 | | |
| CBOR | 🟢 | 🟢 | | |
| CBOR+Zstandard | 🟢 | 🟢 | | |
| YAML | 🟢 | 🟢 | | |
| Plain text | 🔶 | - | | |
| HTML | 🚧 | 🧭 | | |
| JATS | 🚧 | 🚧 | | Planned for completion Q4 2023. Port decoding and tests from encoda. |
| Markdown | ⚠️ | ⚠️ | | |
| R Markdown | 🧭 | 🧭 | | Relies on Markdown; v1 |
| Jupyter Notebook | 🧭 | 🧭 | | Relies on Markdown; v1 |
| Scripts | 🧭 | 🧭 | | Relies on Markdown; v1 |
| Pandoc | 🧭 | 🧭 | | Planned Q4 2023. v1 |
| LaTeX | 🧭 | 🧭 | | Relies on Pandoc; v1; discussion |
| Org | 🧭 | 🧭 | | Relies on Pandoc; PR |
| Microsoft Word | 🧭 | 🧭 | | Relies on Pandoc; v1 |
| ODT | 🧭 | 🧭 | | Relies on Pandoc |
| Google Docs | 🧭 | 🧭 | | Planned Q1 2024 v1 |
| PDF | 🧭 | 🧭 | | Planned Q1 2024, relies on HTML; v1 |
| Codec Plugin API | 🧭 | 🧭 | | An API allowing codecs to be developed as plugins in Python, Node.js, and other languages |

Kernels

Kernels are what executes the code in Stencila CodeChunks and CodeExpressions, as well as in control flow document nodes such as IfClause and For. In addition to supporting interoperability with existing Jupyter kernels, we will bring over microkernels from v1. Microkernels are lightweight kernels for executing code which do not require separate installation and allow for parallel execution. We'll also implement at least one kernel for an embedded scripting language so that it is possible to author a Stencila document which does not rely on any other external binary.

| Kernel | Purpose | Status |
| --- | --- | --- |
| Rhai | Execute a sandboxed, embedded language | ⚠️ Alpha |
| Bash | Execute Bash code | ⚠️ Alpha |
| Zsh (https://zsh.org/) | Execute Zsh code | ❔ Maybe; v1 |
| Python | Execute Python code | ⚠️ Alpha |
| R | Execute R code | 🚧 In progress; v1 |
| Node.js | Execute JavaScript code | ⚠️ Alpha |
| Deno | Execute TypeScript code | ❔ Maybe; v1 |
| SQLite | Execute SQL code | 🧭 Planned Q1 2024; v1 |
| Jupyter kernels | Execute code in Jupyter kernels | 🚧 In progress; PR |
| HTTP | Interact with RESTful APIs | ❔ Maybe; v1 |

Actors

In Stencila v2, non-human changes to the document will be performed, concurrently, by various actors. Actors will listen for changes to the document and react accordingly. For example, an LLM actor might listen for the insertion of a paragraph starting with "!add a code chunk to read in and summarize mydata.csv" and do just that. We'll be starting by implementing relatively simple actors but, to avoid being painted into a corner, will probably implement one LLM-based actor relatively early on.

| Actor | Purpose | Status |
| --- | --- | --- |
| MathML | Update the mathml property of Math nodes when the code property changes | 🧭 Planned Q4 2023 |
| Tailwind | Update the classes property of Styled nodes when the code property changes | 🧭 Planned Q4 2023 v1 |
| Compiler | Update compileDigest and other properties of Executable nodes e.g. when the code or programmingLanguage properties change | 🚧 In progress |
| Executor | Execute nodes when their executionRequired property changes and update their executionStatus, output, etc properties | 🧭 Planned Q4 2023 |
| Actor Plugin API | An API allowing actors to be developed as plugins in Python, Node.js, and other languages | 🧭 Planned Q1 2024 to allow prototypes of Coder and Writer actors as plugins |
| Coder | An LLM actor that creates and edits CodeExecutable nodes | 🧭 Planned Q1 2024 |
| Writer | An LLM actor that creates and edits prose nodes | 🧭 Planned Q1 2024 |
| CitationIntent | An AI actor that suggests a CitationIntent for Cite nodes | ❔ Maybe |

Editors

Editors allow users to edit Stencila documents, either directly, or via an intermediate format.

| Interface | Purpose | Status |
| --- | --- | --- |
| File watcher | Edit documents via other formats and tools (e.g. code editors, Microsoft Word) and react on file change | ⚠️ Alpha |
| Code editor | Edit documents via other formats using a built-in code editor and react on key presses | 🚧 In progress |
| Visual editor | Edit documents using a built-in visual editor and react on key presses and widget interactions | 🚧 In progress |

Tools

Tools are what we call the self-contained Stencila products you can download and use locally on your machine to interact with Stencila documents.

| Tool | Purpose | Status |
| --- | --- | --- |
| CLI | Manage documents from the command line and read and edit them using a web browser | ⚠️ Alpha |
| Desktop | Manage, read and edit documents from a desktop app | 🧭 Planned Q1 2024, likely using Tauri |
| VSCode extension | Manage, read and edit documents from within VSCode | ❔ Maybe |

SDKs

Stencila's software development kits (SDKs) enable developers to create plugins that extend Stencila's core functionality, or to build other tools on top of it. At this stage we are planning to support Python, Node.js and R but more languages may be added if there is demand.

| Language | Description | Status | Coverage |
| --- | --- | --- | --- |
| Python | Types and function bindings for using Stencila from Python | ⚠️ Alpha PyPI | |
| TypeScript | JavaScript classes and TypeScript types for the Stencila Schema | ⚠️ Alpha NPM | |
| Node.js | Types and function bindings for using Stencila from Node.js | ⚠️ Alpha NPM | |
| R | Types and function bindings for using Stencila from R | ❔ Maybe | |

Testing and auditing

Making sure Stencila v2 is well tested, fast, secure, and accessible is important. Here's what we're doing towards that:

| What | Description | Status |
| --- | --- | --- |
| Property-based testing | Establish property-based (a.k.a generative) testing for Stencila documents | 🟢 Done |
| Round-trip testing | Establish property-based tests of round-trip conversion to/from supported formats and reading/writing from/to Automerge CRDTs | 🟢 Done here and here |
| Coverage reporting | Report coverage by feature (e.g. by codec) to give developers better insight into the status of each | 🟢 Done Codecov |
| Dependency audits | Add dependency audits to continuous integration workflow. | 🟢 Done |
| Accessibility testing | Add accessibility testing to continuous integration workflow. | 🟢 Done here |
| Performance monitoring | Establish continuous benchmarking | 🟢 Done here |
| Security audit | External security audit sponsored by NLNet. | 🧭 Planned Q2 2023 (after most v2 functionality added and before public beta) |
| Accessibility audit | External accessibility audit sponsored by NLNet. | 🧭 Planned Q3 2023 (before v2.0.0 release) |
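
To give a feel for what a round-trip property test checks, here is a minimal sketch in Python using only the standard library. It is not Stencila's test suite: `random_node` and `check_round_trip` are hypothetical names, the generated values are plain JSON-like data rather than Stencila Schema nodes, and JSON stands in for whichever codec is under test.

```python
import json
import random
import string


def random_node(rng, depth=0):
    """Generate a random JSON-compatible value (a stand-in for a document node)."""
    kinds = ["null", "bool", "int", "str"]
    if depth < 3:
        # Only allow nesting up to a fixed depth so generation terminates
        kinds += ["list", "dict"]
    kind = rng.choice(kinds)
    if kind == "null":
        return None
    if kind == "bool":
        return rng.random() < 0.5
    if kind == "int":
        return rng.randint(-1000, 1000)
    if kind == "str":
        length = rng.randint(0, 8)
        return "".join(rng.choice(string.ascii_letters) for _ in range(length))
    if kind == "list":
        return [random_node(rng, depth + 1) for _ in range(rng.randint(0, 3))]
    return {f"key{i}": random_node(rng, depth + 1) for i in range(rng.randint(0, 3))}


def check_round_trip(cases=200, seed=0):
    """Property: decoding an encoded value gives back an equal value."""
    rng = random.Random(seed)
    for _ in range(cases):
        node = random_node(rng)
        assert json.loads(json.dumps(node)) == node
    return cases
```

A dedicated property-based testing library would also shrink failing cases to a minimal example, which this sketch does not attempt.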

📜 Documentation

At this stage, documentation for v2 is mainly reference material, much of it generated:

More reference docs, as well as guides and tutorials, will be added over the coming months. We will be bootstrapping the publishing of all docs (i.e. using Stencila itself to publish HTML pages) and expect to have an initial published set in Q4 2023.

📥 Install

Although v2 is in early stages of development, and functionality may be limited or buggy, we are releasing alpha versions of the Stencila CLI and SDKs. Doing so allows us to get early feedback and monitor what impact the addition of features has on build times and distribution sizes.

CLI

Windows

To install the latest release download stencila-<version>-x86_64-pc-windows-msvc.zip from the latest release and place it somewhere on your PATH.

MacOS

To install the latest release in /usr/local/bin,

curl --proto '=https' --tlsv1.2 -sSf https://stencila.dev/install.sh | sh

To install a specific version, append -s vX.X.X. Or, if you'd prefer to do it manually, download stencila-<version>-x86_64-apple-darwin.tar.xz from one of the releases and then,

tar xvf stencila-*.tar.xz
cd stencila-*/
sudo mv -f stencila /usr/local/bin # or wherever you prefer
Linux

To install the latest release in ~/.local/bin/,

curl --proto '=https' --tlsv1.2 -sSf https://stencila.dev/install.sh | sh

To install a specific version, append -s vX.X.X. Or, if you'd prefer to do it manually, download stencila-<version>-x86_64-unknown-linux-gnu.tar.xz from one of the releases and then,

tar xvf stencila-*.tar.xz
mv -f stencila ~/.local/bin/ # or wherever you prefer
Docker

The CLI is also available in a Docker image you can pull from Docker Hub,

docker pull stencila/stencila

and use locally like this for example,

docker run -it --rm -v "$PWD":/work -w /work --network host stencila/stencila --help

The same image is also published to the GitHub Container Registry if you'd prefer to use that,

docker pull ghcr.io/stencila/stencila

SDKs

Python

Use your favorite package manager to install Stencila's SDK for Python:

python -m pip install stencila

[!NOTE] If you encounter problems with the above command, you may need to upgrade Pip using pip install --upgrade pip.

poetry add stencila
Node

Use your favorite package manager to install @stencila/node:

npm install @stencila/node
yarn add @stencila/node
pnpm add @stencila/node
TypeScript

Use your favorite package manager to install @stencila/types:

npm install @stencila/types
yarn add @stencila/types
pnpm add @stencila/types

🛠️ Develop

Code organization

This repository is organized into the following modules. Please see their respective READMEs, where available, for guides to contributing to each.

  • schema: YAML files which define the Stencila Schema, an implementation of, and extensions to, schema.org, for programmable documents.

  • json: A JSON Schema and JSON LD @context, generated from Stencila Schema, which can be used to validate Stencila documents and transform them to other vocabularies

  • rust: Several Rust crates implementing core functionality and a CLI for working with Stencila documents.

  • python: A Python package, with classes generated from Stencila Schema and bindings to Rust functions, so you can work with Stencila documents from within Python.

  • ts: A package of TypeScript types generated from Stencila Schema so you can create type-safe Stencila documents in the browser, Node.js, Deno etc.

  • node: A Node.js package, using the generated TypeScript types and bindings to Rust functions, so you can work with Stencila documents from within Node.js.

  • prompts: Prompts used to instruct AI assistants in different contexts and for different purposes.

  • docs: Documentation, including reference documentation generated from schema and CLI tool.

  • examples: Examples of documents conforming to Stencila Schema, mostly for testing purposes.

  • scripts: Scripts used for making releases and during continuous integration.

Continuous integration and deployment

Several GitHub Actions workflows are used for testing and releases. All products (i.e. CLI, Docker image, SDKs) are released at the same time with the same version number. To create and release a new version:

bash scripts/bump-version.sh <VERSION>
git push && git push --tags

| Workflow | Purpose | Status |
| --- | --- | --- |
| test.yml | Run linting, tests and other checks. Commit changes to any generated files. | |
| pages.yml | Publish docs, JSON-LD, JSON Schema, etc to https://stencila.dev hosted on GitHub Pages | |
| version.yml | Trigger the release.yml workflow when a version tag is created. | |
| release.yml | Create a release, including building and publishing CLI, Docker image and SDKs. | |
| install.yml | Test installation and usage of CLI, Docker image and SDKs across various operating systems and language versions. | |

🙏 Acknowledgements

Stencila is built on the shoulders of many open source projects. Our sincere thanks to all the maintainers and contributors of those projects for their vision, enthusiasm and dedication, but most of all for their hard work! The following open source projects in particular have an important role in the current version of Stencila. We sponsor these projects where, and to the extent, possible through GitHub Sponsors and Open Collective.

| Link | Summary |
| --- | --- |
| Automerge | A Rust library of data structures for building collaborative applications. |
| Clap | A Command Line Argument Parser for Rust. |
| NAPI-RS | A framework for building pre-compiled Node.js addons in Rust. |
| PyO3 | Rust bindings for Python, including tools for creating native Python extension modules. |
| Rust | A multi-paradigm, high-level, general-purpose programming language which emphasizes performance, type safety, and concurrency. |
| Serde | A framework for serializing and deserializing Rust data structures efficiently and generically. |
| Similar | A Rust library of diffing algorithms including Patience and Hunt–McIlroy / Hunt–Szymanski LCS. |
| Tokio | An asynchronous runtime for Rust which provides the building blocks needed for writing network applications without compromising speed. |

💖 Supporters

We wouldn't be doing this without the support of these forward looking organizations.

🙌 Contributors

Thank you to all our contributors (not just the ones that submitted code!). If you made a contribution but are not listed here please create an issue, or PR, like this.

Ackerley Tng Aleksandra Pawlik Alex Ketch Ben Shaw Colette Doughty Daniel Beilinson Daniel Ecer
Daniel Mietchen Daniel Nüst Danielle Robinson Dave David Moulton Finlay Thompson Fábio H. K. Mendes
J Hunt Jacqueline James Webber Jure Triglav Lars Willighagen Mac Cowell Markus Elfring
Michael Aufreiter Morane Gruenpeter MorphicResonance Muad Abd El Hay Nokome Bentley Oliver Buchtala Raniere Silva
Remi Rampin Rich Lysakowski Robert Gieseke Seth Vincent Stefan Fritsch Suminda Sirinath Salpitikorala Dharmasena Tim McNamara
Titus Tony Hirst Uwe Brauer Vanessasaurus Vassilis Kehayas alexandr-sisiuc asisiuc
campbellyamane ern0 - Zalka Ernő grayflow happydentist huang12zheng ignatiusm jmhuang
jon r kitten solsson taunsquared yasirs

stencila's Issues

Include directive once flag

The following may be useful flags for Stencil include directives:

  • asis : means that the includee is not refreshed (re-rendered) but included as is (useful for including large stencils, e.g. chapters, which you don't want to rerender)
  • complete : means that the includee is included with all its directives (normally these will be removed)
  • once : means that the includee is only included once with all directives

Generation of Markdown syntax within Cila

Parsing of Markdown syntax was added to Cila with b4bbe70. Generation of that syntax is partially implemented and needs to be completed. This probably requires some thought about distinguishing between inline and block elements in Cila as in HTML. Rules need to be defined for when the Markdown "shortcut" syntax can be generated and when the usual Cila syntax needs to be used.

Inline text directive in Cila?

Currently to render some text inline with Cila you have to use this syntax: Area of circle: {text 2*pi*r^2}m. Would it be better to have syntax like: Area of circle: {{ 2*pi*r^2}}m as in Django, Mustache and others?

Document path and patch format in HTTP and WS requests

Each Component type exposes a path "API". That is, each type translates a path into something a GET, PUT, or PATCH can be applied to. The basic path spec is,

/type/id/method
or
/type/id/rest of path

Each type knows how to deal with rest of path. For example, in a Stencil,

DELETE /stencil/43111/body/0/

means delete the first child of the body in stencil with id 43111

PATCH /dataset/5422/flags/6432/colour {['s':'red']}

Means set to red the colour column of the 6432 row of the flags table.
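
As a rough illustration of the path spec, the sketch below splits a request path into its type, id and remaining segments. It is written in Python for brevity and is not the Stencila implementation; `ComponentPath` and `parse_path` are hypothetical names, and how each type interprets the remaining segments is left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ComponentPath:
    """Hypothetical container for the parts of a component path."""
    type: str
    id: str
    rest: List[str]  # the method name or "rest of path", split on "/"


def parse_path(path: str) -> ComponentPath:
    """Split '/type/id/rest/of/path' into its parts."""
    # Ignore empty segments so leading and trailing slashes are tolerated
    parts = [part for part in path.split("/") if part]
    if len(parts) < 2:
        raise ValueError(f"expected at least /type/id, got {path!r}")
    return ComponentPath(type=parts[0], id=parts[1], rest=parts[2:])
```

For the DELETE example above, `parse_path("/stencil/43111/body/0/")` yields type `stencil`, id `43111` and rest `["body", "0"]`, which the Stencil type would then resolve to the first child of the body.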

Description directive

Some directives (macro and arg in particular) could usefully have child "description" (or "notes") directives for documentation. These would be <div>s which contain HTML describing what the directive is for.

Add users and permissions to Component<void>

The hub will send sessions (via Http::Server) a username and token and permission for each component that user opens. The Component class is the place that should store that information so that user requests can be authenticated and authorized.

Refactor HTML Node and Document

Currently,

  • Html::Node is just a Xml::Node
  • Html::Document is derived from Xml::Document with some input/output methods that deal with HTML

When you, say, filter an Html::Document you get a list of Xml::Nodes which dump() as XML, not as HTML. Html::Node should be derived from Xml::Node with the extra methods and Html::Document derived from Html::Document.

This will simplify the code for the proper generation of indented HTML which needs some work.

Improve embedded server error messages

When using the embedded server, error pages just return plain text exception messages e.g.

image

Improve by creating an error method in Server which returns exceptions and other messages wrapped in some HTML, possibly including some data uri encoded images.

Input type for par stencil directive

Currently, rendering of a par directive only handles

  • <input type="text"...> where value is converted to a string literal
  • or an <input> with no type specified where value is parsed as an expression in the context language

Need to handle other HTML5 input types e.g. date, colour, range by wrapping them in code before evaluating the resulting expression in the context.

Add Report class

Report class to represent a Stencil that has been stripped of all its semantics by removing stencil data-xxx attributes or rendered nodes with these attributes (e.g. data-error). Likely to be implemented as derived from Html::Document and having additional methods that give the address of the stencil it came from, the context used to render it, time of rendering etc. The actual generation of the Report from a rendered stencil may best be part of the Stencil class - it will be similar to the write() method.

Add Stencil class

Implement Stencil class with interfaces in Python and R. Tasks include

  • integrating pugixml and tidy-html5 into cpp\requires
  • creating utilities/xml and utilities/html namespaces and headers
  • adding stencil.hpp and implementing basic interface
  • implement rendering
  • implement r and py contexts

Stencil::render_image_() method

In Stencil::render_image_() finalise the protocol for insertion of bitmap formats: file in stencil directory or as a data uri? Or both depending upon the type of stencil?

Stencil methods for modifying HTML

Consider adding two methods for modifying a stencil's HTML:

  • "strip", "rebase" or "bare" : removes all elements and attributes added as a result of rendering (e.g. items in a for directive); this could be useful for "starting over" with a stencil
  • "opaque" ... : remove all stencil directives; should create a new document perhaps called a "Report" since this is a destructive method

Integrate docker module

Currently, we have a separate stencila/docker repo. But it is so tightly linked to this repo it may as well go here. Automated builds can be specified for the subdirectory anyway.

Include directive stripping

Currently the includee gets included verbatim (i.e. with all its directives). This has the disadvantage of cluttering the includer with directives. For example, would you really want the R code used to transform a data.frame into a HTML table to be included in the includer for every table?

So, the Include::render method should first render and strip the includee before including it.

Add Array class

Implement Array class: Tasks include

  • array.hpp with Array, Dimension and Level classes.
  • query.hpp with Aggregator classes like Sum, Count etc and corresponding free functions that will dispatch to Array::query method (These classes will be used with Table and other classes as well)

Stencil sanitization

In construction of a stencil, some HTML sanitization should be done. Use a whitelisting approach, where only tags in the list are allowed, rather than the less robust blacklisting. In addition to simple whitelisting, tag modification may be appropriate, e.g. for an img the src attribute could be modified to a generic "blocked image" image.
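
A minimal sketch of the whitelisting idea, in Python with only the standard library. The tag and attribute lists and the `blocked.png` placeholder are assumptions for illustration; the names `Sanitizer` and `sanitize` are hypothetical and this is not how Stencila sanitizes stencils.

```python
from html import escape
from html.parser import HTMLParser

# Assumed whitelists, for illustration only
ALLOWED_TAGS = {"p", "div", "span", "em", "strong", "img"}
ALLOWED_ATTRS = {"class", "src", "alt"}
BLOCKED_IMAGE = "blocked.png"  # hypothetical "blocked image" placeholder


class Sanitizer(HTMLParser):
    """Re-emit only whitelisted tags and attributes; text is kept and escaped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # drop tags that are not on the whitelist
        rendered = []
        for name, value in attrs:
            if name not in ALLOWED_ATTRS:
                continue  # drop attributes that are not on the whitelist
            if tag == "img" and name == "src":
                value = BLOCKED_IMAGE  # tag modification: block the image source
            rendered.append(' {}="{}"'.format(name, escape(value or "", quote=True)))
        self.out.append("<{}{}>".format(tag, "".join(rendered)))

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS and tag != "img":
            self.out.append("</{}>".format(tag))

    def handle_data(self, data):
        self.out.append(escape(data))


def sanitize(html: str) -> str:
    parser = Sanitizer()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Note that the text inside a dropped tag is kept as escaped text; a stricter policy might discard the content of some tags entirely.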

Better output filenames

Currently, when an exec directive has an output, the filename is the directive's hash. This makes it difficult to pick out the right file for use elsewhere (e.g. putting a PNG into a word document). Consider adding the figure's #id or caption slug if that is available as a prefix.
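
One possible naming scheme, sketched in Python. The function names and the exact slug rules are assumptions; the directive hash is kept as a suffix so that filenames remain unique.

```python
import re


def slugify(text: str) -> str:
    """Lower-case the text and collapse runs of other characters to single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


def output_filename(directive_hash, extension, figure_id=None, caption=None):
    """Prefix the hash with the figure's id, or else a slug of its caption, if available."""
    prefix = figure_id or (slugify(caption) if caption else "")
    stem = f"{prefix}-{directive_hash}" if prefix else directive_hash
    return f"{stem}.{extension}"
```

With this sketch, an output with hash `a1b2c3` and caption "Figure 1: Catch by year" would be written to `figure-1-catch-by-year-a1b2c3.png`, while an output with neither id nor caption keeps the current `a1b2c3.png` name.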

Re-read components on get()

Currently the Component::get method retrieves a component instance from memory if it has already been "gotten" (e.g. when included in a stencil). But it does not re-read that component. That is a problem if the source file for that stencil has been changed: currently, if the source for an included stencil has changed, you have to restart the R/Python session. So, we perhaps need to add some attributes to components, like source and time, so that files can be examined for changes and reloaded if necessary.
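
The idea of recording a source and time for each component could look something like the sketch below. It is in Python rather than C++, and `ComponentCache` and its `loader` callback are hypothetical; it compares file modification times, which is one of several ways changes could be detected.

```python
import os


class ComponentCache:
    """Return cached components, reloading when the source file's mtime changes."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical function that reads a component from a path
        self._entries = {}     # source path -> (modification time, component)

    def get(self, path):
        mtime = os.path.getmtime(path)
        cached = self._entries.get(path)
        if cached is None or cached[0] != mtime:
            # First request, or the file has changed since it was last read
            cached = (mtime, self._loader(path))
            self._entries[path] = cached
        return cached[1]
```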

Component testing

Each component should have a test method which searches for tests in the component e.g.:

  • a tests directory
  • tests.* files e.g. tests.R

The tests will get called by a system call from C++ e.g python tests.py. The tests should load the component, run tests and output a standard format output file (probably JUnit XML). The component repo should be tagged with:

  • datetime of test
  • number of tests performed; number passed (parsing of XML can be done in C++)

That allows any updates of a local component to only use the latest version which has passed all tests. Tests will get inherited when a component is forked. Testing is most likely to be useful for stencil because they often contain code.
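
The issue suggests parsing the results in C++; purely to illustrate the counting step, here is a sketch in Python using the standard library XML parser. It assumes the common JUnit layout of `testcase` elements with optional `failure` or `error` children, and `summarize_junit` is a hypothetical name. Skipped tests are counted as passed in this sketch.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text):
    """Return (number of tests performed, number passed) from JUnit-style XML."""
    root = ET.fromstring(xml_text)
    # The root may be a single <testsuite> or a <testsuites> wrapper
    suites = [root] if root.tag == "testsuite" else root.findall("testsuite")
    total = failed = 0
    for suite in suites:
        for case in suite.iter("testcase"):
            total += 1
            if case.find("failure") is not None or case.find("error") is not None:
                failed += 1
    return total, total - failed
```

A component version would then be considered usable only when the two numbers are equal.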

This might be a specific case of a general type of component method that executes a corresponding script in the component's directory e.g.

  • method test() runs test.R dumping stdout and stderr to a unique filename in the tests subdirectory which is parsed by test() for results; method tests() parses all files in the tests subdirectory.
  • method update() runs update.py .... (for updating Tables, Arrays etc)

Javascript rendering context

It would be useful to have a Javascript stencil rendering context that could GET JSON from a URL and render a stencil fragment using it e.g. for creating a page with a list of components

with get('/path/to/a/list/of/components'):
  for com in components:
    with x:
       div .address 
           write address

Server would serve html version of stencil and client side JS would get the JSON, and walk through html nodes calling context methods as per usual.

See the following for ideas of creating context namespaces in JS:

Stencil `compile()` method

Themes have a compile() method (for generating minified CSS and JS). Add the same for stencils for generating index.html and preview.png

Token based access to Components via websocket server

Currently the Websocket server passes all requests on to Components. There is no access control. Implement token based access control. It might work something like this:

  • in Component::declare() generate a token for the component and store it in instances_
  • in Component::view() append the token to the URL
  • in Server::http() extract the token and pass it on...
  • in Component::page() and Component::message() check the token
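
The steps above could be sketched as follows. This is Python rather than the C++ of the Component and Server classes, `TokenRegistry` is a hypothetical stand-in for the `instances_` store, and the URL layout with a `token` query parameter is an assumption.

```python
import hmac
import secrets


class TokenRegistry:
    """Hypothetical stand-in for per-component tokens kept alongside instances."""

    def __init__(self):
        self._tokens = {}

    def declare(self, address):
        # Generate an unguessable token when the component is declared
        token = secrets.token_urlsafe(16)
        self._tokens[address] = token
        return token

    def view_url(self, address):
        # Append the token to the URL used to view the component
        return f"/{address}?token={self._tokens[address]}"

    def check(self, address, token):
        # Compare in constant time so timing does not leak parts of the token
        expected = self._tokens.get(address)
        if expected is None:
            return False
        return hmac.compare_digest(expected.encode(), token.encode())
```

In this sketch a request would be passed on to the component only when `check` returns true for the token extracted from the request.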

Import directive

It would be useful to have an "import" directive similar to, for example, Python's import statement. It would import the names of macro elements into a "macro map" for the Stencil (or Context it is being rendered in). Those macros could then be accessed more succinctly e.g.

import some/address/to/a/stencil some-macro
some-macro(arg1=42,arg2="foo")

instead of

include some/address/to/a/stencil #some-macro
   set arg1=42
   set arg2="foo"

Relocate, and provide Cila for, "meta" Stencil attributes

There are some, optional, Stencil attributes, namely title, authors, contexts, keywords, description. Most of these are stored in <head><meta> although some are in <body> but outside of #content. It may be best to allow for all of these to go into #content so that they can be edited directly and for Cila directives to be created for each.

See also #20 with regard to description and #23 with regard to contexts.

R packaging

The Stencila R package needs to be built for multiple platforms.

Stencila C++, and thus Stencila R, relies on numerous open source libraries. Rather than distributing an R source package, and hoping that the user will have all the necessary dependencies to do a compile, the current strategy is to compile shared libraries (.so and .dll) and distribute those instead. install.libs.R is run on installation and looks for the correct shared library (either in the package or from http://get.stenci.la) and puts it in the right place.

  • Is this the right strategy?
  • How specific do versions need to be? For example, an R package developed under Ubuntu12.04/R3.1.2/Rcpp0.11.3 fails to install under Ubuntu14.10/R3.1.1/Rcpp0.11.4 (reason unknown) but will install under Ubuntu14.10/R3.1.2/Rcpp0.11.4 (i.e. upgrading R from 3.1.1 to 3.1.2).
  • Building Windows binaries has not been worked on for a while
  • Building Mac OSX binaries has never been worked on
  • Makefile should include a task to upload built binaries to http://get.stenci.la (if this strategy is continued)

See https://github.com/stencila/stencila/blob/master/Makefile#L550 for relevant section of Makefile.

Directive parameters optionally treated as expression

Many directives have parameters which are expressions in the context language. For example in the for directive data-for="num in 1:10", there is a name parameter num (should not be an expression) and an expression parameter 1:10. In this case, 1:10 is evaluated within the context.

For maximum flexibility it would be advantageous to make many (all?) directive parameters expressions. For example, when using an exec directive to create an image it may be advantageous to have the dimensions of the figure determined within the context. To allow for this the width and height parameters of the exec directive should be expressions.

For some directive parameters this may be onerous. For example, the address parameter of the include directive would have to have quotes around it e.g. data-include="'address/of/includee'". In these cases some extra syntax may be necessary such that the parameter is by default not evaluated but can be if necessary e.g.

For normal use

data-include="address/of/includee"

For evaluated use

data-include="eval paste0(an,expression,which,provides,the,address)"
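
A minimal sketch of how such an opt-in parameter might be resolved, written in Python purely for illustration. The "eval " prefix is the syntax proposed above; the resolve_parameter function and the evaluate callback (standing in for whatever evaluates an expression in the context) are hypothetical names, not part of Stencila.

```python
from typing import Callable

EVAL_PREFIX = "eval "  # proposed marker for an evaluated parameter


def resolve_parameter(raw: str, evaluate: Callable[[str], str]) -> str:
    """Return the parameter literally unless it is prefixed with `eval `.

    `evaluate` is a hypothetical stand-in for the context's expression
    evaluator (for example an R or Python context).
    """
    text = raw.strip()
    if text.startswith(EVAL_PREFIX):
        # Evaluated use: hand the remainder of the value to the context
        return evaluate(text[len(EVAL_PREFIX):].strip())
    # Normal use: the value is taken as-is, so no quoting is needed
    return text
```

With this shape, data-include="address/of/includee" never touches the context, and only values that start with "eval " pay the cost of evaluation.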

Stencil context declarations

Currently contexts must be defined for each stencil as a <li>. It may be better to use a comma separated list in a data attribute to be consistent with other directives: <div data-contexts="r,py">.

Also, it would be nice if the preferred context could be inferred from the code directives within the stencil, e.g. if they are all r directives then use an RContext.

Lastly, we should allow for code directives with no context specified, since some pieces of code may run in multiple contexts.
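
The three ideas above could be sketched roughly as follows, in Python for illustration only. The function names are hypothetical, and representing a code directive with no context as None is an assumption made for this sketch.

```python
from typing import List, Optional


def parse_contexts(attribute: str) -> List[str]:
    """Split a comma separated `data-contexts` value such as "r,py"."""
    return [name.strip() for name in attribute.split(",") if name.strip()]


def infer_context(directive_languages: List[Optional[str]]) -> Optional[str]:
    """Pick the context shared by every code directive that names one.

    `None` entries are code directives with no context specified; they
    are ignored because they can run in whichever context is chosen.
    Returns None when the directives disagree or none names a context.
    """
    named = {language for language in directive_languages if language}
    return named.pop() if len(named) == 1 else None
```

Here a stencil whose code directives are all r (or unspecified) would infer "r", while a stencil mixing r and py would fall back to an explicit declaration.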

Override source function in R stencil contexts

In R stencils, by default the source function sources code into the global environment, not the local stencil rendering context. To fix that, you need to use local=TRUE:

r
    source('../../common.R',local=TRUE)

but a better alternative may be to override the source function within RContexts. Inside a wrapper, local=TRUE would refer to the wrapper's own environment, so the caller's environment is passed instead:

source <- function(file, ...) { base::source(file, local=parent.frame(), ...) }

For directive : group child elements into a repeat directive

Currently, the for directive simply repeats the first child element for each item. This means that if there are, say, 2 children, only the first gets rendered. To fix this you need to nest multiple children within a div.

The for directive should do this for you. If there is no each directive child then create a new one and nest all children under it.
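
A rough sketch of that grouping step, using Python's standard XML library for illustration. It assumes the each directive is marked by a data-each attribute and that a div is an acceptable wrapper; ensure_each is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def ensure_each(for_element: ET.Element) -> ET.Element:
    """Nest all children of a `for` element under a single `each` child.

    Assumption: the each directive is marked by a `data-each` attribute.
    If such a child already exists, the element is left untouched.
    """
    children = list(for_element)
    for child in children:
        if "data-each" in child.attrib:
            return child
    # No each child yet: create one and move every child under it
    wrapper = ET.Element("div", {"data-each": ""})
    for child in children:
        for_element.remove(child)
        wrapper.append(child)
    for_element.append(wrapper)
    return wrapper
```

The function is idempotent, so rendering the same stencil twice does not add a second wrapper.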

Cila parsing and generation

The current implementation is clumsy and does not handle corner cases well. Reimplement the new Cila syntax using a state machine which transitions between contexts.

Stencil import directive?

Should there be a data-import directive that allows the functions of a stencil to be imported into the context without including any HTML content? This would require a <code id="main"> or similar element to identify which code is to be imported. It should allow for Python style import xxx (import all objects into context), import xxx.yyy (only yyy) and import xxx.yyy as zzz (rename).
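
As an illustration of the three forms, here is a small Python sketch that parses them. ImportSpec and parse_import are hypothetical names, and restricting stencil and object names to word characters is a simplification made for this sketch (real stencil addresses may contain other characters).

```python
import re
from typing import NamedTuple, Optional


class ImportSpec(NamedTuple):
    stencil: str           # stencil to import from
    name: Optional[str]    # single object to import, or None for all
    alias: Optional[str]   # name to bind it to, or None to keep `name`


# Accepts only: import xxx | import xxx.yyy | import xxx.yyy as zzz
_PATTERN = re.compile(
    r"^import\s+(?P<stencil>\w+)"
    r"(?:\.(?P<name>\w+)(?:\s+as\s+(?P<alias>\w+))?)?$"
)


def parse_import(text: str) -> ImportSpec:
    """Parse one import statement into its stencil, name and alias."""
    match = _PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised import: {text!r}")
    return ImportSpec(match["stencil"], match["name"], match["alias"])
```

For example, parse_import("import xxx.yyy as zzz") yields a spec with stencil "xxx", name "yyy" and alias "zzz".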

R context : better error reporting

Currently, if there is an error in a big exec block of code, it is hard to know where it occurred. The Context::execute() method may need to be changed so that it returns a list of lines with their result/error+traceback.

Look at package https://github.com/hadley/evaluate which does much of this already.
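
To make the proposed return value concrete, here is a Python analogue of the idea (not the R implementation and not the API of the evaluate package). It runs a block one top-level statement at a time and records, for each, the starting line and either a result or a traceback. The record layout is an assumption made for this sketch.

```python
import ast
import traceback
from typing import Any, Dict, List


def execute(code: str) -> List[Dict[str, Any]]:
    """Run `code` one top-level statement at a time.

    Returns one record per statement with its starting line number and
    either the value of an expression statement or the error traceback.
    Execution stops at the first runtime error. A syntax error in the
    block as a whole is raised by `ast.parse` before anything runs.
    """
    namespace: Dict[str, Any] = {}
    records: List[Dict[str, Any]] = []
    for node in ast.parse(code).body:
        record: Dict[str, Any] = {"line": node.lineno, "result": None, "error": None}
        try:
            if isinstance(node, ast.Expr):
                # Expression statement: evaluate it and keep its value
                expression = ast.Expression(node.value)
                record["result"] = eval(compile(expression, "<exec>", "eval"), namespace)
            else:
                # Any other statement: execute it for its side effects
                module = ast.Module(body=[node], type_ignores=[])
                exec(compile(module, "<exec>", "exec"), namespace)
        except Exception:
            record["error"] = traceback.format_exc()
            records.append(record)
            break
        records.append(record)
    return records
```

Because each record carries the line on which its statement starts, an error in a large block can be reported against that line rather than against the block as a whole.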

Build environments

Consider setting up Vagrant instances for builds, with provisioning scripts in bash (so users can easily run the scripts on their own machines without requiring something like Chef, Puppet or Ansible). These would provide a way of compiling 32/64 bit Linux/Windows etc. versions of each library module.

Automatically print ggplots in R stencils

With a code directive like this:

r png
     ggplot(...) + ....

no png file is generated because the ggplot is not rendered. You have to explicitly print ggplots like this:

r png
     print(ggplot(...) + ....)
