pepr / asciidoc

This project forked from asciidoc-py/asciidoc-py2

Text based document generation. AsciiDoc is a text document format for writing notes, documentation, articles, books, ebooks, slideshows, web pages, man pages and blogs. AsciiDoc files can be translated to many formats including HTML, PDF, EPUB, man page.

Home Page: http://asciidoc.org/

License: GNU General Public License v2.0

Python 55.37% TeX 0.10% XSLT 2.92% Shell 2.26% JavaScript 29.69% CSS 7.90% Vim Script 1.76%

asciidoc's Issues

Braindump

This is just a braindump of things that you may (or may not) find useful for porting and re-design.

The (rough) outline design of the current implementation is:

read input until markup is recognised
if markup is start of an element
    output start of element based on configuration template
    process input recursively until end element found
    output end of element based on configuration template
else markup must be end of an element
    return

The problem with this design is that it has no memory of what it has seen except for the direct parent elements of the current location, and it has no knowledge of what comes later. It therefore cannot produce things like a table of contents: when it outputs the TOC near the front of the document, it has not yet seen any of the contents it is listing. That is why it uses the JavaScript TOC generator, which can see the whole DOM at display time. Likewise, forward links cannot use any information from their target because the target has not been processed when the link is written.

A redesign (as distinct from just porting from Python 2 to 3) should address the above by parsing the whole document and manipulating the resulting tree before translating it for output. This would allow static HTML TOCs and links, such as Asciidoctor generates, as well as several other features that need the whole document to be visible.
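The two-pass idea can be illustrated with a minimal sketch. It assumes a greatly simplified markup in which only section titles of the form `== Title` are recognised, and it uses a flat list of nodes as a stand-in for the document tree. The names `parse` and `render` are hypothetical and not taken from asciidoc.py; HTML escaping is omitted for brevity.

```python
import re

# Simplified assumption: a run of '=' followed by a space starts a section title.
TITLE_RE = re.compile(r'^(=+) (.+)$')

def parse(lines):
    """Pass 1: read the whole document into a list of nodes."""
    nodes = []
    for line in lines:
        m = TITLE_RE.match(line)
        if m:
            nodes.append({'kind': 'title', 'level': len(m.group(1)),
                          'text': m.group(2)})
        else:
            nodes.append({'kind': 'text', 'text': line})
    return nodes

def render(nodes):
    """Pass 2: emit a static TOC first, which is possible because
    every title is already known, then emit the body."""
    titles = [n for n in nodes if n['kind'] == 'title']
    out = ['<ul>']
    for i, n in enumerate(titles):
        out.append('<li><a href="#s%d">%s</a></li>' % (i, n['text']))
    out.append('</ul>')
    i = 0
    for n in nodes:
        if n['kind'] == 'title':
            out.append('<h%d id="s%d">%s</h%d>'
                       % (n['level'], i, n['text'], n['level']))
            i += 1
        else:
            out.append('<p>%s</p>' % n['text'])
    return out

html = render(parse(['== Intro', 'hello', '== End']))
```

Because the TOC entries and the section anchors are generated from the same node list, forward links can also look up their target before any output is written.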

As mentioned on the Asciidoc thread, the Python design uses a lot of regular expressions, both in the code and especially in the config files, for recognising markup during parsing. Since regular expressions on text strings are Unicode by default in Python 3, every regex needs to be checked for the Python 3 port to confirm that it still matches the correct thing under Unicode instead of ASCII semantics (or it needs to be explicitly marked ASCII).
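A small illustration of the difference, using the standard `re` module under Python 3. The pattern `\w+` is only an example and is not claimed to come from the asciidoc config files.

```python
import re

text = 'Střední Evropa'

# Python 3 str pattern: \w is Unicode-aware by default.
unicode_words = re.findall(r'\w+', text)

# re.ASCII restricts \w to [a-zA-Z0-9_], so accented letters split the word.
ascii_words = re.findall(r'\w+', text, re.ASCII)
```

Here `unicode_words` contains the two whole words, while `ascii_words` is `['St', 'edn', 'Evropa']`. Any regex that relied on the ASCII behaviour needs either the `re.ASCII` flag or an explicit character class.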

There are some pieces of embedded Python in the config files and in the filters; these also need to be checked under Python 3. The code also needs to be checked to make sure that the right version of Python is used when they are run.

Both issues may make it hard to run the same code and config files on both Python 2 and 3, since the semantics differ between the two implementations.

The `modernize` package approach

The modernize package (pip install modernize) tries to modify the existing Python 2 code so that it is compatible both with Python 2.6+ and Python 3 (uses the six library):

use modernize -wn . to write the changes in place and suppress generating backup files (we have Git for that)
usually, the result is not fully OK
run tests and fix the details

The approach was suggested by Petr Viktorin as an alternative to developing a separate asciidoc3.py. In my opinion, the better approach is to use separate source files: asciidoc.py for Python 2 and asciidoc3.py for Python 3. The reason is that there may be a lot of text-encoding issues that need different solutions in Python 2 and Python 3. I guess it will be easier to develop a clean asciidoc3.py, refactor it, and possibly back-port some newly implemented things to asciidoc.py (Python 2).

However, the modernize transformation should be applied to asciidocapi.py to have a single imported module with the same interface both for asciidoc.py and for asciidoc3.py.
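As a rough sketch of what a single shared module implies, the API layer would have to pick the implementation that matches the running interpreter. The helper name `implementation_script` is hypothetical and does not exist in asciidocapi.py; it only illustrates the selection step.

```python
import sys

def implementation_script(version_info=None):
    """Return the script name matching the given (or running) Python version.

    Hypothetical helper: asciidoc.py serves Python 2, asciidoc3.py serves Python 3.
    """
    if version_info is None:
        version_info = sys.version_info
    return 'asciidoc3.py' if version_info[0] >= 3 else 'asciidoc.py'
```

Callers would then see one interface regardless of which of the two source files sits behind it.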

The modernize approach could possibly also be used for writing regression tests.

Resulting HTML output contains extra empty lines

It seems that an extra newline is produced for each line. It is visible in listing blocks, or when you open the HTML file in a text editor.
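One possible cause, stated here as an assumption and not confirmed for this issue, is double newline translation: if lines that already end with `\r\n` are written through a text layer that translates `\n` to `\r\n` (as Python 3 text mode does on Windows), each line ends with `\r\r\n`, which many editors and browsers show as an extra empty line. The sketch below reproduces the effect on any platform with `io.TextIOWrapper`; the helper `write_lines` is hypothetical.

```python
import io

def write_lines(lines, newline):
    """Write already-terminated lines through a text layer; return the raw bytes."""
    raw = io.BytesIO()
    out = io.TextIOWrapper(raw, encoding='utf-8', newline=newline)
    for line in lines:
        out.write(line)
    out.flush()
    out.detach()  # keep the underlying buffer open after the wrapper goes away
    return raw.getvalue()

lines = ['first\r\n', 'second\r\n']

# newline='\r\n' emulates Windows text mode: '\n' is written as '\r\n'.
doubled = write_lines(lines, newline='\r\n')

# Normalising every line ending to '\n' before writing avoids the doubling.
normalised = [line.rstrip('\r\n') + '\n' for line in lines]
clean = write_lines(normalised, newline='\r\n')
```

In this sketch `doubled` ends each line with `\r\r\n`, while `clean` ends each line with a single `\r\n`.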

Tools for getting/maintaining the Big Picture

I personally consider it important to get the Big Picture of what was done and what should be done (being a person with poor long-term memory).

Doxygen + graphviz + doxypypy

Doxygen is a documentation generator available for both Linux and Windows; precompiled binaries are available for download. It can use the graphviz tool to generate various graphs, including the call graph that shows the relations between parts of the code. The doxypypy filter must be installed explicitly for Python 3 because it compiles the Python source that is to be documented by Doxygen. It transforms Python docstrings on the fly into the special comments accepted by Doxygen.

Note: The Doxyfile configuration file is in the same directory as asciidoc3.py. It was created for Windows; create a separate one for the same purpose on Linux. The configuration file name may be changed in the future to make apparent which OS and which version of Python it is meant for. (The Python 2 and Python 3 versions must be documented separately because of the doxypypy filter.)

Doxygen is launched from the command line as doxygen Doxyfile or simply doxygen (Doxyfile is the default name for the configuration file -- see the note above).

The progdoc/html/ subdirectory is created, with progdoc/html/index.html as the root document. The progdoc/ subdirectory is excluded from the repository because it is quite big and (re)generating the content is not time consuming.

Mind Map -- Freeplane

Freeplane (implemented in Java, and therefore multiplatform) is a nice tool for creating/editing mind maps -- basically a non-sequential form of writing notes. See AsciiDoc_big_picture.mm (an early version of the file was created before Doxygen was used for the project -- progdoc/ may be better for exploring the current structure of the implementation).

Plan

(This comment will be continually updated as needed.)

Make asciidoc3.py work keeping the same design -- just to make it work soon.
Check the usage of regular expressions with respect to Unicode -- see comments in #7. Check for the usage of '\r' (unified line endings).
Use the modernize transformation to make asciidocapi.py the same for both the Python 2 and Python 3 versions of the asciidoc implementation -- see #3 (only for asciidocapi.py).
Make the tests work for asciidoc3.py to be able to compare the functionality of asciidoc.py and asciidoc3.py.
Get the Big Picture of the implementation (written/drawn manually and generated from source) -- see #4 for tools to be used and #7 for @elextr's braindump on the existing design.
Redesign asciidoc3.py (regression tested based on the existing tests).
If suitable and reasonably easy, backport the new design (in steps) to asciidoc.py.
Think about a C++ implementation (based on the same tests).

Wrong encoding used in the timestamp at the bottom of generated HTML pages

Observed in a Czech environment. The timestamp should look like Last updated 2015-06-30 19:13:08 Střední Evropa (letní čas). It is produced correctly by asciidoc.py. However, asciidoc3.py garbles the encoding. The time_str() function probably needs to be fixed.
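One way to sidestep the locale-dependent zone name altogether, offered only as a sketch and not as the fix for time_str(), is to print a numeric UTC offset instead. This changes the visible format (for example `+0200` where the zone name would appear), so it is a different technique from repairing the decoding. The function name `time_str_numeric` and its `tz` parameter are hypothetical.

```python
from datetime import datetime, timezone

def time_str_numeric(seconds, tz=None):
    """Format seconds since the Epoch with a numeric UTC offset.

    tz=None means the local time zone; the result contains only ASCII,
    so no locale encoding is involved.
    """
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc).astimezone(tz)
    return moment.strftime('%Y-%m-%d %H:%M:%S %z')

epoch_utc = time_str_numeric(0, tz=timezone.utc)
```

With `tz=timezone.utc` the Epoch is rendered as `1970-01-01 00:00:00 +0000`; with the default `tz=None` the offset of the local zone is used.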
