
unified-doc

unified document APIs.



Intro

Vast amounts of human knowledge are stored digitally in different document formats. It is cheap to create, store, render, and manage content within a single document format, but much harder to perform the same operations on content across different formats. A unified bridge is needed to lower the friction of working across formats, which in turn improves the sharing of human knowledge.

Instead of implementing custom programs per format to parse/render/search/annotate/export content, unified-doc implements a set of unified document APIs for supported content types. Existing APIs can therefore be extended to newly introduced content types, and supported content types benefit from future API methods.

With unified-doc, we can easily:

  • compile and render any content to HTML.
  • format and style the document.
  • mark or annotate the document.
  • search on the document's text content.
  • export the document in a variety of file formats.
  • preserve the semantic structure of the source content.
  • retrieve useful representations of the document (e.g. source, html, text, syntax tree).
  • enrich the document through an ecosystem of plugins.
  • evolve with interoperable web technologies.

Document formats

unified-doc supports the following document formats by implementing parsers associated with the mime type of the document format:

  • most source code supported by syntax highlighting libraries (e.g. .txt, .json, .js, .css, .sh, .py, .r, .cpp)
  • .html
  • .md
  • .csv
  • .docx
  • .epub
  • .pdf
  • .tex
  • .mathml
  • .rtf
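As an illustration of how parsers might be keyed by mime type, here is a minimal TypeScript sketch. It is not taken from the unified-doc packages: the names `HastNode`, `Parser`, `parseText`, `parseCsv`, and `parseByMimeType` are hypothetical, the tree shape is a simplified hast-like structure, and the CSV parsing is deliberately naive.

```typescript
// Hypothetical sketch: none of these names come from unified-doc itself.
type HastNode = {
  type: string;
  tagName?: string;
  value?: string;
  children?: HastNode[];
};

type Parser = (content: string) => HastNode;

// Fallback parser: wrap the raw text in a single <pre> element.
const parseText: Parser = (content) => ({
  type: "root",
  children: [
    { type: "element", tagName: "pre", children: [{ type: "text", value: content }] },
  ],
});

// Naive CSV parser: one <tr> per line, one <td> per comma-separated cell.
// It does not handle quoted fields or escaped commas.
const parseCsv: Parser = (content) => ({
  type: "root",
  children: [
    {
      type: "element",
      tagName: "table",
      children: content
        .trim()
        .split("\n")
        .map((row): HastNode => ({
          type: "element",
          tagName: "tr",
          children: row.split(",").map((cell): HastNode => ({
            type: "element",
            tagName: "td",
            children: [{ type: "text", value: cell }],
          })),
        })),
    },
  ],
});

// Registry that associates a mime type with its parser.
const parsersByMimeType = new Map<string, Parser>([
  ["text/plain", parseText],
  ["text/csv", parseCsv],
]);

// Look up the parser for a mime type, falling back to plain text.
function parseByMimeType(mimeType: string, content: string): HastNode {
  const parser = parsersByMimeType.get(mimeType) ?? parseText;
  return parser(content);
}
```

In this sketch, supporting a new format only requires registering one more parser; everything that consumes the resulting tree stays unchanged.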

Spec

Please refer to the Spec documentation for more details on goals, definitions, and implementations in unified-doc.

Packages

The following packages are managed under the unified-doc project.

APIs

Unified document APIs for Node, the CLI, and the DOM.

Parsers

Parsers parse source content into hast trees.
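For readers unfamiliar with hast, the following TypeScript sketch shows roughly what such a tree looks like for the HTML fragment `<p>Hello <strong>world</strong></p>`. The node shapes (`root`, `element`, `text`) follow the hast specification, but the tree is written by hand and the `toText` helper is a hypothetical illustration, not a unified-doc function.

```typescript
// Simplified hast node types (positional info and other node kinds omitted).
interface Text {
  type: "text";
  value: string;
}

interface Element {
  type: "element";
  tagName: string;
  properties: Record<string, unknown>;
  children: Array<Element | Text>;
}

interface Root {
  type: "root";
  children: Array<Element | Text>;
}

// Hand-written tree for: <p>Hello <strong>world</strong></p>
const helloTree: Root = {
  type: "root",
  children: [
    {
      type: "element",
      tagName: "p",
      properties: {},
      children: [
        { type: "text", value: "Hello " },
        {
          type: "element",
          tagName: "strong",
          properties: {},
          children: [{ type: "text", value: "world" }],
        },
      ],
    },
  ],
};

// Hypothetical helper: concatenate the values of all text nodes, depth first.
function toText(node: Root | Element | Text): string {
  if (node.type === "text") {
    return node.value;
  }
  return node.children.map((child) => toText(child)).join("");
}
```

Because every parser targets the same tree shape, a helper like `toText` works on the output of any of them.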

Search Algorithms

Search algorithms use a unified search interface to return search results based on the provided query when searching across a document's textContent.
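A unified search interface of this kind could be sketched as below. This is an assumption-laden illustration rather than the project's actual typings: the `SearchResult` fields (`start`, `end`, `value`), the `SearchAlgorithm` signature, and the `searchExact` function are all hypothetical names chosen for the example.

```typescript
// Hypothetical result shape: character offsets into the text content.
interface SearchResult {
  start: number; // inclusive offset of the match
  end: number; // exclusive offset of the match
  value: string; // the matched text
}

// Hypothetical interface: every algorithm takes text and a query and
// returns matches in the same shape.
type SearchAlgorithm = (textContent: string, query: string) => SearchResult[];

// A simple case-sensitive, non-overlapping exact-match algorithm.
const searchExact: SearchAlgorithm = (textContent, query) => {
  const results: SearchResult[] = [];
  if (query.length === 0) {
    return results; // an empty query matches nothing
  }
  let index = textContent.indexOf(query);
  while (index !== -1) {
    const end = index + query.length;
    results.push({ start: index, end, value: textContent.slice(index, end) });
    index = textContent.indexOf(query, end);
  }
  return results;
};
```

Because results are plain offsets into the text content, other algorithms (for example regular-expression or fuzzy matching) could be swapped in behind the same signature without changing the code that consumes the results.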

Hast Utils

hast utilities operate on and transform hast trees.

Wrappers

Wrappers implement unified-doc APIs in other interfaces.

Types

Shared TypeScript typings used across unified-doc packages.

Development

This project is:

  • implemented with the unified interface.
  • linted with xo + prettier + tsc.
  • developed and built with microbundle.
  • tested with jest.
  • softly typed with TypeScript using checkJs (only public APIs are typed).
  • managed with lerna.

Monorepo scripts:

# install dependencies and bootstrap with lerna
npm run bootstrap

# build all packages with microbundle
npm run build

# clean all packages (rm dist + node_modules)
npm run clean

# watch/rebuild all packages with microbundle
npm run dev

# lint all packages with xo + prettier + tsc
npm run lint

# test all packages with jest in --watch mode (make sure to run the 'dev' script)
npm run test

# test all packages in a single run
npm run test:run

# publish all packages with lerna
npm run publish
