knockoff's People

Contributors

ezh, justjoheinz, philcali, tristanjuricek

knockoff's Issues

HTML -> Markdown conversion

Another very handy tool would go 'backwards' from HTML to a best-guess Markdown document. This would allow my literate programming system to edit the HTML documents on the fly.

I consider it to be a pretty low priority, since this only really enables the use of fancy HTML editors to do writing, which the source markdown pretty much excels at. We'll see. (If it's easy, maybe, otherwise, no way.)

This file breaks things

e3 Development Environment Checklist
====================================

1.  Initial Requirement
1.  Initial Setup
1.  Java Setup
1.  PostgreSQL 8.1 - build it from source
1.  Dropbox
1.  Db configuration
1.  e3db configuration
1.  e3local configuration
1.  e3mail configuration
1.  JBoss 4.0.4.GA
1.  Jboss MultiInstance Configuration
1.  Apache Configuration Debian
1.  Apache Configuration MacOs
1.  CHECKDATA.PL SCRIPT
1.  SUMMARIZER
1.  EXPORTER CONSIDERATION
1.  CHIME EXPORTER
1.  BASIC EXPORTER


## Requirements ##

1. We have to install:

        $ sudo apt-get install openssh-server   
        $ sudo apt-get install build-essential
        $ sudo apt-get install gcc, zlib1g-dev, readline-dev


## Initial Setup ##

1.  Directory Structure
    *   ~/Applications
    *   ~/Dropbox (automatically created by Dropbox Installation)
    *   ~/bin
    *   sudo mkdir /home/emarsys; chown <user>:<user> /home/emarsys
    *   mkdir /home/emarsys/tools
    *   mkdir -p /home/emarsys/IO/import
    *   mkdir -p /home/emarsys/IO/export/chime
    *   mkdir /home/emarsys/IO/export/basic
    *   mkdir /home/emarsys/IO/remote_transfer

    For MacOs before creating the directory structure you should:

    * In the file **/etc/auto_master** comment the line:

                #/home auto_home -nobrowse

    * Restart automount daemon

                $ sudo automount -vc

2.  Environment variables 

        export POSTGRESQL_HOME=$HOME/Applications/postgresql/current
        export JAVA_HOME=$HOME/Applications/java/current
        export JBOSS_HOME=$HOME/Applications/JBoss/current
        export JRUBY_HOME=$HOME/Applications/jruby
        export SCALA_HOME=$HOME/Applicaitons/scala
        export EMMA_HOME=$HOME/Applications/emma
        export MVN_HOME=$HOME/Applications/maven (where maven is a symbolic link to the Dropbox - only for macos)
        export DEV_HOME points to you dev directory

1. Symbolic Links

        $ ln -sf ~/Dropbox/scala-<LASTVERSION> ~/Applications/scala
        $ ln -sf ~/Dropbox/jruby-<LASTVERSION> ~/Applications/jruby
        $ ln -sf ~/Dropbox/maven-<LASTVERSION>  ~/Applications/maven

3.  PATH

        export PATH=$POSTGRESQL_HOME/bin:$JAVA_HOME/bin:$JRUBY_HOME/bin:$SCALA_HOME/bin:$MVN_HOME/bin:$JBOSS_HOME/bin:~/bin:$PATH; 

1. Setting executable permission (if not already set)

        $ chmod +x $SCALA_HOME/bin/*
        $ chmod +x $JRUBY_HOME/bin/*
        $ chmod +x $MVN_HOME/bin/*  


## Java ##

### Install Java 6

For Debian:

    $ sudo apt-get install sun-java6-jdk
    $ ln -sf /usr/lib/jvm/java-6-sun $HOME/Applications/java/current

### Keytool


Used to sign Jar files for use in applets. Maybe we use certificates, but, meh. 

    $ keytool -genkey -alias e3 -keypass 123456 -validity 365
    Enter keystore password:  123456
    What is your first and last name?
      [Unknown]:  emarsys Developer
    What is the name of your organizational unit?
      [Unknown]:  Development
    What is the name of your organization?
      [Unknown]:  emarsys eMarketing Systems AG
    What is the name of your City or Locality?
      [Unknown]:  Vienna
    What is the name of your State or Province?
      [Unknown]:  
    What is the two-letter country code for this unit?
      [Unknown]:  AT
    Is CN=emarsys Developer, OU=Development, O=emarsys eMarketing Systems AG, L=Vienna, ST=Unknown, C=AT correct?
      [no]:  yes

PostgreSQL 8.1
--------------

1. Download the source of PostgreSQL 8.1 and extract in a temp directory
1. You need a full Perl installation, including the libperl library and the header files.

    For Debian:

        $ sudo apt-get install perl, libperl-dev

    For Leopard

        $ sudo port install perl5.8 +shared
        $ sudo port activate perl5.8 @5.8.9_3+shared

        * Check if the module URI::Escape is installed

                $ perl -MURI::Escape -e 1

                if not 

                $ sudo port install p5-uri-fetch (or you can find the same module in /System/Library/Perl/Extras/5.8.8/URI/Escape.pm) 


1. Compile and install to ~/Applications/postgresql/postgresql-8.1.`<NUM>`


        ./configure --prefix=$HOME/Applications/postgresql/postgresql-8.1.<NUM>/ --enable-depend --with-perl
        make
        make install

1. Create a symlink to ~/Applications/postgresql/current

        $ ln -s ~/Applications/postgresql/postgresql-8.1.<NUM>  ~/Applications/postgresql/current 

1. Create data directory ~/Applications/postgresql/data

1. Run __initdb__ on ~/Applications/postgresql/data

        $ initdb -E=UTF-8 data

1. Create the script ~/bin/start_postgresql.sh

        postmaster -D $HOME/Applications/postgresql/data&

1. Allow TCP/IP socket

        $ vim ~/Applications/postgresql/data/postgresql.conf
        Find configuration line that read as follows:
        #listen_addresses='localhost'
        Change with
        listen_addresses='*'

        $ vim vim ~/Applications/postgresql/data/pg_hba.conf
        Insert the following line at the end of the file:
        host    all         all         192.168.0.0/24        trust
        (TODO check if the first option is necessary)

1. Enable languages globally

        $ createlang plperl template1
        $ createlang plpgsql template1
        $ createlang plperlu template1
        $ createuser emarsys (superuser)


####Testing our Installation

1. Create a user for your account
2. Create a DB with the same name
3. Create the file  $HOME/.emarsys/DatabaseConfig.xml

            <config>
                <dbAdminURI>jdbc:postgresql://localhost:5432/pinco?user=pinco&amp;password=pinco</dbAdminURI>
                <dbUnitTestURI>jdbc:postgresql://localhost:5432/test?user=pinco&amp;password=pinco</dbUnitTestURI>
            </config>   
4. Run the test EDREI/common/deebee_test/test.sh (if everything works you should find a DB test with a relation insert_query_update)

Dropbox
-------

For Mac OS X, dropbox was as simple as just setting up the account: [email protected]. For 
Windows, it shouldn't require much more.



####Dropbox on Debian Lenny for Gnome

Unfortunately it is not possible to install on Etch due to library version problem.

1. Add to __/etc/apt/sources.list__ the line


        deb http://www.getdropbox.com/static/ubuntu gutsy main
2. In __/etc/apt/preferences__ set some basic package pinning to make sure that any packages didn't collide with the existing Debian repository (not likely but you never know)

        Package: *
        Pin: release a=gutsy
        Pin-Priority: 400


3. apt-get update
4. apt-get install nautilus-dropbox

####Dropbox without Gnome

1. Download the closed source Dropbox Linux client from http://www.getdropbox.com/download?plat=lnx.x86 (x86_64 for 64 bit)
1. DownloadExtract the contents and you should get a .dropbox-dist folder out of the archive. 
1. Move the folder to $HOME the closed source Dropbox Linux client from http://www.getdropbox.com/download?plat=lnx.x86 (x86_64 for 64 bit)
1. Run ~/.dropbox-dist/dropboxd
1. Ensure that the daemon runs whenever you use your computer





Db configuration
------------------

1. Create the file  $HOME/.emarsys/Upgrader.xml (For more information have a look at EDREI/admin/upgrader/readme.txt)


        <upgrader>
            <jdbc>
                <id>basetta_e3db</id>
                <url>jdbc:postgresql://localhost:5432/e3db?user=basetta&amp;password=basetta</url>
            </jdbc>
            <jdbc>
                <id>basetta_e3local</id>
                <url>jdbc:postgresql://localhost:5432/e3local?user=basetta&amp;password=basetta</url>
            </jdbc>
            <jdbc>
                <id>basetta_e3mail</id>
                <url>jdbc:postgresql://localhost:5432/e3mail?user=basetta&amp;password=basetta</url>
            </jdbc>
            <aliases>
                <alias>
                    <name>e3db</name>
                    <id>basetta_e3db</id>
                </alias>
                <alias>
                    <name>e3local</name>
                    <id>basetta_e3local</id>
                </alias>
                <alias>
                    <name>e3mail</name>
                    <id>basetta_e3mail</id>
                </alias>
            </aliases>
        </upgrader>

1. Create the ROLE emarsys

        $ psql template1
        template1=# CREATE ROLE emarsys WITH LOGIN PASSWORD '<password>';
        template1=# SELECT rolname FROM pg_roles;




e3db Configuration
---------

1. Create the database e3db

        $ createdb e3db

2. Lanuch the Upgrader

        $ cd /EDREI/admin/upgrader
        $ java -cp target/upgrader.jar com.emarsys.e3.upgrader.Main e3db

## e3local Configuration ##

JBoss 4.0.4.GA
--------------

1. Create JBoss directory

        $ mkdir ~/Applications/JBoss

1. Unzip ~/Dropbox/JBoss-4.0.4.GA.tar.bz2 in ~/Applications/JBoss

2. Set up a symlinks to the current JBoss version

        $ ln -sf $HOME/Applications/JBoss/JBoss-4.0.4.GA $HOME/Applications/JBoss/current
        $ ln -sf $HOME/Applications/JBoss/current /opt/JBoss

1. Create directory for configuration files

        $ mkdir ~/Applications/JBoss/app
        $ mkdir ~/Applications/JBoss/app/<VERSION_APP>
        $ mkdir ~/Applications/JBoss/app/<VERSION_APP>/app
        $ mkdir ~/Applications/JBoss/app/<VERSION_APP>/broadcasting
        $ mkdir ~/Applications/JBoss/app/<VERSION_APP>/mail
        $ mkdir ~/Applications/JBoss/app/<VERSION_APP>/common
        $ ln -sf ~/Applications/JBoss/app/<VERSION_APP> ~/Applications/JBoss/current_app

3. Create a symlink for imports and exports

        $ ln -sf /home/emarsys/import /opt/JBoss/import
        $ ln -sf /home/emarsys/export /opt/JBoss/export


## Jboss MultiInstance Configuration ##

* Create three dummy interfaces
    * Debian

            $ sudo /sbin/ifconfig eth0:1 <IP_1> NETMASK 255.255.255.0
            $ sudo /sbin/ifconfig eth0:2 <IP_2> NETMASK 255.255.255.0
            $ sudo /sbin/ifconfig eth0:3 <IP_3> NETMASK 255.255.255.0

            * You can configure the additional IP addresses automatically at boot with another iface statement in /etc/network/interfaces:

            $ sudo vi /etc/network/interfaces
            auto eth0:1
            iface eth0:1 inet static

            address <IP_1>
            netmask 255.255.255.0
            broadcast 192.168.0.0
            (bug https://bugs.launchpad.net/debian/+source/ifupdown/+bug/114457 so ifup --all)
            It is not possible to add automatically virtual interface so let-s use the script


    * MacOs

            $ sudo /sbin/ifconfig en0 alias <IP_1> netmask 255.255.255.255
            $ sudo /sbin/ifconfig en0 alias <IP_2> netmask 255.255.255.255
            $ sudo /sbin/ifconfig en0 alias <IP_3> netmask 255.255.255.255


* Edit the file  **/etc/hosts**

            <IP_1>  app.<HOST_NAME>.emarsys.int 
            <IP_2>  broadcasting.<HOST_NAME>.emarsys.int
            <IP_3>  mail.<HOST_NAME>.emarsys.int

* Create a different server configuration directory for each instance of JBoss AS (remember log4j.xml, e3.properties, e3send.properties, emarsys3-ds.xml, e3mail-ds.xml)

        $ $JBOSS_HOME
            - server
                - default
                - app
                - broadcasting
                - mail

* Create symbolic links to the ear,datasourses and properties files

        $ ln -sf $HOME/current/common/log4j.xml $JBOSS_HOME/server/app/conf/log4j.xml
        $ ln -sf $HOME/current/app/e3.properties $JBOSS_HOME/server/app/conf/e3.properties
        $ ln -sf $HOME/current/current_app/app/E3-web.ear $JBOSS_HOME/server/app/deploy/E3-web.ear
        $ ln -sf $HOME/current/current_app/app/emarsys3-ds.xml $JBOSS_HOME/server/app/deploy/emarsys3-ds.xml

        $ ln -sf $HOME/current/common/log4j.xml $JBOSS_HOME/server/broadcasting/conf/log4j.xml
        $ ln -sf $HOME/current/broadcasting/e3.properties $JBOSS_HOME/server/broadcasting/conf/e3.properties
        $ ln -sf $HOME/current/broadcasting/E3-web.ear $JBOSS_HOME/server/broadcasting/deploy/E3-web.ear
        $ ln -sf $HOME/current/broadcasting/emarsys3-ds.xml $JBOSS_HOME/server/broadcasting/deploy/emarsys3-ds.xml

        $ ln -sf $HOME/current/common/log4j.xml $JBOSS_HOME/server/mail/conf/log4j.xml
        $ ln -sf $HOME/current/mail/e3.properties $JBOSS_HOME/server/mail/conf/e3.properties
        $ ln -sf $HOME/current/mail/e3send.properties $JBOSS_HOME/server/mail/conf/e3send.properties
        $ ln -sf $HOME/current/mail/e3send.ear $JBOSS_HOME/server/mail/deploy/e3send.ear
        $ ln -sf $HOME/current/mail/e3mail-ds.xml $JBOSS_HOME/server/mail/deploy/e3mail-ds.xml


* Setting the correct connection-url, user-name, password in the datasource file **emarsys-ds.xml** (or e3mail-ds.xml)

* Configure the **e3.properties** and **e3send.properties** files

    * General Configuration

        * remote.jndi.server
        * jndi.db
        * common.dbpwd

    * Import file

        * import.uploadHost
        * import.testUser
        * import.testPwd
        * web.url
* Launch the all instances (do not do it at the same time ....launch one and go for a coffee before the other launch)

        $ sh $JBOSS_HOME/bin/run.sh -c app -b <IP_1> -Djboss.messagingServerPeerID=1
        $ sh $JBOSS_HOME/bin/run.sh -c broacasting -b <IP_2> -Djboss.messagingServerPeerID=2
        $ sh $JBOSS_HOME/bin/run.sh -c mail -b <IP_3> -Djboss.messagingServerPeerID=3

* Check this script http://www.jboss.org/community/docs/DOC-12305#comment-1106

## Apache Configuration Debian ##


### Activate modules: proxy, rewrite

        $ cd /etc/apache2/mods-enabled
        $ sudo ln -snf ../mods-available/proxy*
        $ sudo ln -snf ../mods-available/rewrite*


### Create a VirtualHost for each JBoss instance (example for the app)

This goes in some place like `/etc/apache2/sites-available/e3.conf`, which is then simlinked to the
`/etc/apache2/sites-enabled` directory.

        <VirtualHost *:80>
                ServerAdmin webmaster@localhost
                ServerName  app.basiglio.emarsys.int
                <Directory />
                        Options FollowSymLinks
                        AllowOverride None
                </Directory>
                <Proxy *>
                        Order allow,deny
                        Allow from all
                </Proxy>

                # Possible values include: debug, info, notice, warn, error, crit,
                # alert, emerg.
                ErrorLog /var/log/apache2/app_error.log
                LogLevel warn

                CustomLog /var/log/apache2/access.log combined
                ProxyPass / http://app.basiglio.emarsys.int:8080/
                ProxyPassReverse / http://app.basiglio.emarsys.int:8080/
                ProxyPass / ajp://app.basiglio.emarsys.int:8009/
        </VirtualHost>

### Extra Proxies And Rewrite Rule For The Web Machine (app)

Insert this into the virtual host for the `app` instance:

        # Batch Mailing API

        ProxyPass               /bmapi http://localhost:9090
        ProxyPassReverse        /bmapi http://localhost:9090

        # MCAPI

        ProxyPass               /mcapi http://app.caparezza.emarsys.int:9091
        ProxyPassReverse        /mcapi http://app.caparezza.emarsys.int:9091

        # MailHoney

        ProxyPass               /mh    http://caparezza.emarsys.int:9092
        ProxyPassReverse        /mh    http://caparezza.emarsys.int:9092

        RewriteEngine On
        # mail open
        RewriteRule ^/img/([0-9a-fI]+)\.gif$ /op/t.do?event=open&i=$1 [R,L]


1. Activate the e3.conf

        $ sudo ln -sf ../sites-available/e3.conf

1. Restart Apache
        $ sudo /etc/init.d/apache2 restart

Apache Configuration MacOs
----------------

1. Uncomment the line in /etc/apache2/httpd.conf (line 465)

        Include /private/etc/apache2/extra/httpd-vhosts.conf

1. Edit the file **/etc/apache2/extra/httpd-vhost.conf**

            Check the debian configuration

1. Restart apache

            $ sudo apachectl restart

CHECKDATA.PL SCRIPT
-----------------------------------

1. In order to use the checkdata.pl script

        $ mkdir /home/emarsys/tools/checkdata
        $ ln -sf <DEV_HOME>/EDREI/broadcasting/checkdata/checkdata.pl /home/emarsys/tools/checkdata/checkdata.pl
        $ ln -sf /home/emarsys/tools/checkdata /home/emarsys/cleaner 
        (path hardcoded in FileCopyTask.java and CsvRecipientSource.java)

1. Install the necessary perl library (Text::CSV_XS)(verify for macos)

        Debian:
        $ sudo apt-get install libtext-csv-xs-perl

        MacOs
        $ sudo rm /usr/bin/perl
        $ sudo ln -s /opt/local/bin/perl /usr/bin/perl
        $ sudo -H cpan -i Text::Iconv
        $ sudo -H cpan -i Text::CSV_XS

SUMMARIZER
-----------------------------------
1. Create direcoty for the summarizer
        $ mkdir $HOME/tools/summarizer

1. We should tell the compiler where to find the postgres library (libpq.so.4)

        $ sudo echo "$HOME/Applications/postgresql/current/lib" >> /etc/ld.so.conf.d/libc.conf
        $ sudo ldconfig
1. Copy the source in a temp directory and compile it (remeber to change the variable PGHOME in the Makefile)

1. Copy the target file to $HOME/tools/summarizer

1. Set the properties **nbr_of_default_summarizer** of the broadcasting instance (ask Guy) (default should be 2)

1. Create n script
        $ vim $HOME/tools/summarizer/start_[1..nbr_of_default_summarizer].sh
        #!/bin/sh

        # $1: database connect string
        # $2: id of subtable
        # $3: sync interval in seconds
        # $4: loglevel
        ./summarizer 'dbname=e3db user=emarsys password=emarsys host=localhost' [1..nbr_of_default_summarizer] 25 2

        $ vim $HOME/tools/summarizer/start_queued (same script but with table_id 160301 and loglevel 3)
        $ vim $HOME/tools/summarizer/start_top_queue.sh (same script but with table_id 0)


1. e3db Configuration (to include in the upgrader) 


            CREATE AGGREGATE sum_uniq (
                BASETYPE = text,
                SFUNC = sum_uniq,
                STYPE = int8,
                INITCOND = 0);


## EXPORTER CONSIDERATION ##

Remember that the id field should be set properly with the id in t_field_definition
(think about it) (sync_data.scala ?? )

## CHIME EXPORTER ##

1. Create a bunch of directories (have fun)

        $ mkdir -p /home/emarsys/tools/exporters/chime_exporter
        $ mkdir -p /home/emarsys/tools/exporters/chime_exporter/conf
        $ mkdir -p /home/emarsys/tools/exporters/chime_exporter/log
        $ mkdir -p /home/emarsys/tools/exporters/chime_exporter/META-INF
        $ mkdir -p /home/emarsys/tools/exporters/chime_exporter/lib
        $ cd /home/emarsys/tools/exporters/chime_exporter

1. create the following symbolic links

        $ ln -sf $DEV_HOME/export/chime_exporter/target/e3-export-chime_exporter-<VERSION>-jar-with-dependencies.jar e3-export-chime_exporter.jar
        $ ln -sf $HOME/Dropbox/Library/Java/bcpg-139-jdk15.jar lib/bcpg-139-jdk15.jar
        $ ln -sf $HOME/Dropbox/Library/Java/bcprov-139-jdk15.jar lib/bcprov-139-jdk15.jar

1. Copy the script file chime_exporter.sh

        $ cp ~/Dropbox/ConfigurationFiles/exporters/chime_exporter/chime_export.sh .

1. Copy the log4j.xml

        $ cp $HOME/Dropbox/ConfigurationFiles/exporters/common/log4j.xml .

1. Copy persistence.xml and set accordingly

        $ cp $HOME/Dropbox/ConfigurationFiles/expoters/common/persistence.xml ./META-INF

1. Copy emarsys3-ds.xml 

        $ cp /opt/JBoss/server/broadcasting/deploy/emarsys3-ds.xml .

1. Copy e3properties and set it accordingly to your system

        $ cp $HOME/Dropbox/ConfigurationFiles/exporters/chime_exporter/e3.properties ./conf

1. Create a symlinc for the export keys

        $ ln -sf $HOME/Dropbox/ConfigurationFiles/exporters/chime_exporter/export_keys export_keys

1. Launch the chime_export
        $ ./chime_export.sh <ID_ACCOUNT>

1. (Discuss in order to avoid password request ssh-agent known-host what do you prefer? :)

## BASIC EXPORTER ##

For the basic exporter is not necessary to set any transfer properties seeing that it entails in a merely rsync from an /mnt to anoter /mnt

1. mkdir -p /home/emarsys/tools/exporters/basic_exporter
1. mkdir -p /home/emarsys/tools/exporters/basic_exporter/conf
1. mkdir -p /home/emarsys/tools/exporters/basic_exporter/META-INF
1. mkdir -p /home/emarsys/tools/exporters/basic_exporter/log
1. cd /home/emarsys/tools/exporters/basic_exporter

1. create a symbolic link

        $ ln -sf $DEV_HOME/export/basic_exporter/target/e3-export-basic_exporter-<VERSION>-jar-with-dependencies.jar e3-export-basic_exporter.jar

1. Copy the script file basic_exporter.sh

        $ cp ~/Dropbox/ConfigurationFiles/exporters/basic_exporter/basic_export.sh .

1. 1. Copy the log4j.xml

        $ cp $HOME/Dropbox/ConfigurationFiles/exporters/common/log4j.xml .

1. Copy persistence.xml and set accordingly

        $ cp $HOME/Dropbox/ConfigurationFiles/expoters/common/persistence.xml ./META-INF

1. Copy e3properties and set it accordingly to your system

        $ cp $HOME/Dropbox/ConfigurationFiles/exporters/basic_exporter/e3.properties ./conf

1. Launch the basic_export (export type bitwise mask 3 = 0011 EXPORT_UNSUBSCRIBE 15 = 1111 ALL)

        $ ./basic_exporter.sh 1000 3 <(YYYY-MM-DD)>
        $ ./basic_exporter.sh 1000 15 <(YYYY-MM-DD)>

paragraphs mess with html

for instance:

val tabsp = """<table><tr><td>
1

1
</td><td></td></tr></table>"""
println(knockoff(tabsp))

results in

ListBuffer(Paragraph(List(HTMLSpan(<table>), HTMLSpan(<tr>), HTMLSpan(<td>), Text(
1
)),1.1), Paragraph(List(Text(1
</td>), HTMLSpan(<td></td>), Text(</tr></table>)),4.1))

Note that the closing /tr and /table are rendered as text.

This does not happen if the 1s are not separated by an empty line.

Strange link matching problem

It looks like a regular expression might be off.

When I used the following source, the links were off:

# Test #

[Link][] leads to
[another][] link.

[link]: http://example.com/link
[another]: http://another.com/another

But if I alter it just slightly so that the first reference link uses the name explicitly, both links work.

# Test #

[Link][link] leads to
[another][] link.

[link]: http://example.com/link
[another]: http://another.com/another

Huh? Is it that case-insensitivity breaks all links?
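
For comparison, the original Markdown syntax treats reference labels as case-insensitive, and an empty second bracket pair means "use the link text as the label". A minimal sketch of that lookup, independent of knockoff's internals (`ReferenceLinks`, `definitions` and `resolve` are hypothetical names, and the regex only covers simple one-line definitions without titles):

```scala
object ReferenceLinks {
  // Matches simple definitions such as: [label]: http://example.com/
  private val Definition = """^\[([^\]]+)\]:\s*(\S+)\s*$""".r

  // Collect definitions, keyed by the lower-cased label.
  def definitions(source: String): Map[String, String] =
    source.split("\n").toSeq.collect {
      case Definition(label, url) => label.toLowerCase -> url
    }.toMap

  // For [text][label]; an empty label falls back to the link text.
  def resolve(defs: Map[String, String], text: String, label: String): Option[String] = {
    val key = if (label.isEmpty) text else label
    defs.get(key.toLowerCase)
  }
}
```

With the first source above, `resolve(defs, "Link", "")` finds the `[link]` definition because both sides are lower-cased before comparison.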

There are unnecessary linebreaks before closing tags

When I convert a Markdown document such as the following to HTML:

- item1
- item2

I expected the result shown in (A), but the actual output is as shown in (B).

(A)
<li>item1</li><li>item2</li>

(B)
<li>item1
</li><li>item2
</li>

Is this the correct behavior? I would like to send a pull request that removes these line breaks. Is that OK?
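
The difference between (A) and (B) would be consistent with each item's text keeping its trailing newline when it is wrapped in `<li>`; that is an assumption, not something confirmed from the knockoff source. A minimal sketch of the rendering in (A), using plain strings rather than knockoff's own block types (`ListRendering` and `unorderedItems` are hypothetical names):

```scala
object ListRendering {
  // Wrap each item in <li>...</li>, trimming surrounding whitespace
  // so that no line break ends up before the closing tag.
  def unorderedItems(items: Seq[String]): String =
    items.map(item => "<li>" + item.trim + "</li>").mkString
}
```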

Things after lists start breaking

OK I had an embedded list, and this seemed to cause a parsing error:

* This is a long line that wrapped
  **bold**

And then I noticed that if you want a code block that trails a list, things are not so happy:

1. List item

    code block

That should actually be a code block within a complex list.

`knockoff` method throws StackOverflowError

I'm using Scala 2.10.2, and therefore version 0.8.1 of knockoff.

Calling the knockoff method keeps throwing this:

java.lang.StackOverflowError: null
at java.util.regex.Pattern.group0(Pattern.java:2513) ~[na:1.6.0_26]
at java.util.regex.Pattern.sequence(Pattern.java:1806) ~[na:1.6.0_26]
at java.util.regex.Pattern.expr(Pattern.java:1752) ~[na:1.6.0_26]
at java.util.regex.Pattern.compile(Pattern.java:1460) ~[na:1.6.0_26]
at java.util.regex.Pattern.<init>(Pattern.java:1133) ~[na:1.6.0_26]
at java.util.regex.Pattern.compile(Pattern.java:823) ~[na:1.6.0_26]
at scala.util.matching.Regex.<init>(Regex.scala:153) ~[scala-library-2.10.2.jar:na]
at scala.collection.immutable.StringLike$class.r(StringLike.scala:224) ~[scala-library-2.10.2.jar:na]
at scala.collection.immutable.StringOps.r(StringOps.scala:31) ~[scala-library-2.10.2.jar:na]
at scala.collection.immutable.StringLike$class.r(StringLike.scala:213) ~[scala-library-2.10.2.jar:na]
at scala.collection.immutable.StringOps.r(StringOps.scala:31) ~[scala-library-2.10.2.jar:na]
at com.tristanhunt.knockoff.ChunkParser$$anon$1.findEnd(MarkdownParsing.scala:261) ~[knockoff_2.10-0.8.1.jar:0.8.1]
at com.tristanhunt.knockoff.ChunkParser$$anon$1.findEnd(MarkdownParsing.scala:270) ~[knockoff_2.10-0.8.1.jar:0.8.1]
at com.tristanhunt.knockoff.ChunkParser$$anon$1.findEnd(MarkdownParsing.scala:270) ~[knockoff_2.10-0.8.1.jar:0.8.1]
at com.tristanhunt.knockoff.ChunkParser$$anon$1.findEnd(MarkdownParsing.scala:270) ~[knockoff_2.10-0.8.1.jar:0.8.1]
at com.tristanhunt.knockoff.ChunkParser$$anon$1.findEnd(MarkdownParsing.scala:270) ~[knockoff_2.10-0.8.1.jar:0.8.1]
at com.tristanhunt.knockoff.ChunkParser$$anon$1.findEnd(MarkdownParsing.scala:270) ~[knockoff_2.10-0.8.1.jar:0.8.1]
at com.tristanhunt.knockoff.ChunkParser$$anon$1.findEnd(MarkdownParsing.scala:270) ~[knockoff_2.10-0.8.1.jar:0.8.1]
at com.tristanhunt.knockoff.ChunkParser$$anon$1.findEnd(MarkdownParsing.scala:270) ~[knockoff_2.10-0.8.1.jar:0.8.1]
at com.tristanhunt.knockoff.ChunkParser$$anon$1.findEnd(MarkdownParsing.scala:270) ~[knockoff_2.10-0.8.1.jar:0.8.1]
at com.tristanhunt.knockoff.ChunkParser$$anon$1.findEnd(MarkdownParsing.scala:270) ~[knockoff_2.10-0.8.1.jar:0.8.1]

Any idea?

Oh and just to add more info, using the version for Scala 2.9 works just fine.

UPDATE 08/29/13 10:06 : will try increasing stack size and cross my fingers
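
For anyone trying the larger-stack route: besides the JVM's `-Xss` option, the parse can be run on a thread created with an explicit stack size. A rough sketch using only `java.lang.Thread` (`BigStack.withStack` is a hypothetical helper, and the JVM is free to treat the requested size as a hint). If the recursion in `findEnd` is unbounded, this only postpones the overflow.

```scala
object BigStack {
  // Evaluate `body` on a new thread with the requested stack size in bytes.
  def withStack[A](stackBytes: Long)(body: => A): A = {
    var outcome: Option[Either[Throwable, A]] = None
    val worker = new Thread(null, new Runnable {
      def run(): Unit = {
        // Capture failures (including StackOverflowError) for rethrowing.
        val result: Either[Throwable, A] =
          try Right(body) catch { case e: Throwable => Left(e) }
        outcome = Some(result)
      }
    }, "big-stack", stackBytes)
    worker.start()
    worker.join()
    // Hand the value back, or rethrow on the calling thread.
    outcome.get match {
      case Right(value) => value
      case Left(error)  => throw error
    }
  }
}
```

Usage would look like `BigStack.withStack(64L * 1024 * 1024) { knockoff(text) }`, where `text` stands for whatever Markdown source triggers the error.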

Trailing header caused problems

Need to investigate how this happened, but having a header at the end of the document errored out:

Blah blah blah

## Foo ##

Not sure why

Handling \r\n newlines

Knockoff doesn't handle line endings well: it only expects the \n delimiter and fails if it sees a \r\n line ending (or a bare \r).

Here is a script that reveals the problem:

scala> import com.tristanhunt.knockoff.DefaultDiscounter._                 
import com.tristanhunt.knockoff.DefaultDiscounter._

scala> knockoff("\n") // normal case                      
res21: Seq[com.tristanhunt.knockoff.Block] = ListBuffer()

scala> knockoff("\r\n") // abnormal case
' foundmatching regex `[\t ]*\n' expected but `
next == reader : false
' foundmatching regex `[\t ]*\n' expected but `
next == reader : false
' foundmatching regex `[\t ]*\n' expected but `
next == reader : false
' foundmatching regex `[\t ]*\n' expected but `
next == reader : false
' foundmatching regex `[\t ]*\n' expected but `
next == reader : false
' foundmatching regex `[\t ]*\n' expected but `
next == reader : false
' foundmatching regex `[\t ]*\n' expected but `
[...]

' foundmatching regex `[\t ]*\n' expected but `
next == reader : false
java.lang.StackOverflowError
    at java.util.regex.Pattern.atom(Pattern.java:1952)
    at java.util.regex.Pattern.sequence(Pattern.java:1834)
    at java.util.regex.Pattern.expr(Pattern.java:1752)
    at java.util.regex.Pattern.group0(Pattern.java:2530)
    at java.util.regex.Pattern.sequence(Pattern.java:1806)
    at java.util.regex.Pattern.expr(Pattern.java:1752)
    at java.util.regex.Pattern.compile(Pattern.java:1460)
    at java.util.regex.Pattern.<init>(Pattern.java:1133)
    at java.util.regex.Pattern.compile(Pattern.java:823)
    at scala.util.matching.Regex.<init>(Regex.scala:41)
    at scala.collection.immutable.StringLike$class.r(StringLike.scala:202)
    at scala.collection.immutable.StringOps.r(StringOps.scala:31)
    at com.tristanhunt.knockoff.ChunkParser.bulletLead(MarkdownParsing.scala:69)
    at com.tristanhu...
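
Until the parser accepts other line endings, one workaround is to normalize the input before handing it to `knockoff`. A minimal sketch using only the standard library (`NewlineNormalizer` is a hypothetical name, not part of knockoff):

```scala
object NewlineNormalizer {
  // Convert Windows (\r\n) and bare carriage-return (\r) line endings to \n.
  // The \r\n pair is replaced first so it does not become two newlines.
  def normalize(source: String): String =
    source.replace("\r\n", "\n").replace('\r', '\n')
}
```

The normalized string can then be passed to `knockoff` as in the session above.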

Parsing inside blockquote is not always correct

Hi,

thanks for a great library!
I stumbled upon an issue in blockquote parsing that looks like a parser error.
The following example works just fine:

> # Hi
> * One
> * Two
> * Three

But this does not work as expected:

> Hi
> # Hi
> * One
> * Two
> * Three

and produces the following HTML:

<blockquote><p>Hi
# Hi
<em> One
</em> Two
* Three</p></blockquote>

Here is the snippet to reproduce this:

    import com.tristanhunt.knockoff.DefaultDiscounter._
    println(toXHTML(knockoff("> # Hi\n> * One\n> * Two\n> * Three")))
    println(toXHTML(knockoff("> Hi\n> # Hi\n> * One\n> * Two\n> * Three")))

Best regards,
Matthias
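
Since Markdown allows a blockquote to contain other Markdown elements such as headers and lists, one way to reason about the expected result is to strip the quote markers and look at what is left for the nested parse. A small sketch of that stripping step, not taken from knockoff itself (`Blockquote.unquote` is a hypothetical name):

```scala
object Blockquote {
  // Remove one level of quoting ("> " or ">") from the start of each line.
  def unquote(chunk: String): String =
    chunk.split("\n").map(_.replaceFirst("^> ?", "")).mkString("\n")
}
```

Applied to the second example, this leaves a paragraph line, a header line and three bullet lines for the inner parse.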

Newline required after header statements

If you don't have a blank line after a header, you'll get a matching error.

For example:

Header
----------
Body

This breaks. What I want is a warning in these cases, and for the Body to be treated as the start of the next paragraph.
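
A sketch of how the warning could be produced ahead of parsing, assuming only underlined headers of the form shown above need checking (`HeaderSpacing` and `missingBlankLines` are hypothetical names, not existing knockoff API):

```scala
object HeaderSpacing {
  // An underline made only of '=' or '-' characters.
  private val Underline = """^(=+|-+)\s*$""".r

  // 1-based line numbers of header underlines that are followed
  // directly by text instead of a blank line.
  def missingBlankLines(source: String): Seq[Int] = {
    val lines = source.split("\n", -1)
    (1 until lines.length - 1).filter { i =>
      Underline.findFirstIn(lines(i)).isDefined &&
        lines(i - 1).trim.nonEmpty &&
        lines(i + 1).trim.nonEmpty
    }.map(_ + 1)
  }
}
```

For the example above it reports line 2, which is where a warning could point.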

code block following a list

This won't work; the code block is not rendered as a code block:

1. some item

    code line 1
    code line 2

2. some item

LaTeX usage considerably broken

If you use characters significant to Markdown in your LaTeX, you'll probably not get what you expect in the output. Considering that characters like _ or * are very useful in math declarations, well, you're probably not going to have much TeX actually pass through the system.

This should be fixed by the next version, where I'm also using SnuggleTeX to render MathML sequences.

Wrong interpretation of exclamation mark, some text, and then a link (interpretation as image)

In a source like `Text ! text [linktext](linkurl)`, the `text` part is dropped and the whole thing is interpreted as `Text ![linktext](linkurl)`, i.e. as an image definition. Escaping the exclamation mark does not help.

Reproduction:

import com.tristanhunt.knockoff.DefaultDiscounter._
import com.tristanhunt.knockoff._

val source = """Text ! text [linktext](linkurl)"""
val parsed = knockoff(source)

println(source)
println(toXHTML(parsed))


// try with an escaped exclamation mark
val source2 = """Text \! text [linktext](linkurl)"""
val parsed2 = knockoff(source2)

println(source2)
println(toXHTML(parsed2))

gives:

Text ! text [linktext](linkurl)
<p>Text <img src="linkurl" alt="linktext"></img></p>
Text \! text [linktext](linkurl)
<p>Text <img src="linkurl" alt="linktext"></img></p>
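
For reference, inline image syntax requires the `!` to sit directly in front of the opening bracket. A minimal regex sketch of that rule, independent of how knockoff's span matcher is written (`ImageSyntax` is a hypothetical name, and the pattern ignores titles and nested brackets):

```scala
object ImageSyntax {
  // '!' immediately followed by [alt](url); nothing may come between them.
  private val Image = """!\[([^\]]*)\]\(([^)]*)\)""".r

  // All (alt, url) pairs of inline images found in the source.
  def images(source: String): List[(String, String)] =
    Image.findAllIn(source).matchData.map(m => (m.group(1), m.group(2))).toList
}
```

Under this rule the reproduction source contains no image at all, only a link preceded by ordinary text.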

No email address entitizing

While it probably doesn't help too much, the email addresses are not completely entitized by the Converter.

Malformed links throw java.util.NoSuchElementException: None.get

steps

Welcome to Scala version 2.10.2 (Java HotSpot(TM) 64-Bit Server VM, Java 1.6.0_51).
Type in expressions to have them evaluated.
Type :help for more information.

scala> import com.tristanhunt.knockoff.DefaultDiscounter._
import com.tristanhunt.knockoff.DefaultDiscounter._

scala> knockoff("""val turtlePosition = Lens.lensu[Turtle, Point] (
     |   (a, value) => a.copy(position = value),
     |   _.position)
     | val pointX = Lens.lensu[Point, Double] (
     |   (a, value) => a.copy(x = value),
     |   _.x)""")

problem

java.util.NoSuchElementException: None.get
    at scala.None$.get(Option.scala:313)
    at scala.None$.get(Option.scala:311)
    at com.tristanhunt.knockoff.SpanConverter.findNormalMatch(MarkdownParsing.scala:810)
    at com.tristanhunt.knockoff.SpanConverter$$anonfun$matchers$3.apply(MarkdownParsing.scala:660)
    at com.tristanhunt.knockoff.SpanConverter$$anonfun$matchers$3.apply(MarkdownParsing.scala:660)
    at com.tristanhunt.knockoff.SpanConverter$$anonfun$2.apply(MarkdownParsing.scala:642)
    at com.tristanhunt.knockoff.SpanConverter$$anonfun$2.apply(MarkdownParsing.scala:641)
    at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)
    at scala.collection.immutable.List.foldLeft(List.scala:84)
    at scala.collection.TraversableOnce$class.$div$colon(TraversableOnce.scala:138)
    at scala.collection.AbstractTraversable.$div$colon(Traversable.scala:105)
    at com.tristanhunt.knockoff.SpanConverter.convert(MarkdownParsing.scala:641)
    at com.tristanhunt.knockoff.SpanConverter$DelimMatcher.apply(MarkdownParsing.scala:616)
    at com.tristanhunt.knockoff.SpanConverter$DelimMatcher.apply(MarkdownParsing.scala:606)
    at com.tristanhunt.knockoff.SpanConverter$$anonfun$2.apply(MarkdownParsing.scala:642)
    at com.tristanhunt.knockoff.SpanConverter$$anonfun$2.apply(MarkdownParsing.scala:641)
    at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111)
    at scala.collection.immutable.List.foldLeft(List.scala:84)
    at scala.collection.TraversableOnce$class.$div$colon(TraversableOnce.scala:138)
    at scala.collection.AbstractTraversable.$div$colon(Traversable.scala:105)
    at com.tristanhunt.knockoff.SpanConverter.convert(MarkdownParsing.scala:641)
    at com.tristanhunt.knockoff.SpanConverter.apply(MarkdownParsing.scala:630)
    at com.tristanhunt.knockoff.Discounter$$anonfun$2.apply(Discounter.scala:38)
    at com.tristanhunt.knockoff.Discounter$$anonfun$2.apply(Discounter.scala:37)
    at scala.collection.immutable.Stream.map(Stream.scala:376)
    at com.tristanhunt.knockoff.Discounter$class.knockoff(Discounter.scala:37)
    at com.tristanhunt.knockoff.DefaultDiscounter$.knockoff(Discounter.scala:79)

expectation

As the comment says, it shouldn't fail:

/** Parses and returns our best guess at the sequence of blocks. It will
      never fail, just log all suspicious things. */
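
A rough sketch of how a matcher can live up to that comment: keep the result as an Option and fall back to the raw text instead of calling .get. The names below are hypothetical and the regex is a simplified stand-in, not the code in MarkdownParsing.scala.

```scala
object SafeMatchSketch {
  // Simplified stand-in for an inline link matcher.
  private val Link = """\[([^\]]*)\]\(([^)]*)\)""".r

  /** Some((text, url)) for the first well-formed link, None otherwise. */
  def firstLink(source: String): Option[(String, String)] =
    Link.findFirstMatchIn(source).map(m => (m.group(1), m.group(2)))

  /** Renders the first link, or returns the input unchanged when there is none. */
  def render(source: String): String =
    firstLink(source) match {
      case Some((text, url)) => s"""<a href="$url">$text</a>"""
      case None              => source // malformed or absent: pass through
    }
}
```

Input such as Lens.lensu[Turtle, Point] ( then passes through untouched, because the bracket is not directly followed by a parenthesised URL.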

Worlds Ugliest Parser Implementation

FYI

This project was a very early stint of me learning how to use Scala. Thus, the parser itself is a series of bad ideas strung together with waaaaay too much complexity.

At some point, I'll totally rewrite the thing as just a combinator parser, but that will take some time. I can live with my hackfest -> just a few more bugs to quash.

Line number references not findable

It dawned on me that it would be far easier to build serious apps against this thing if you could take, say, a recognized block and get the line numbers and general positions for that block.

This is totally do-able (I think) by the positioned decoration. This should give us the input start for any of our recognized tokens.

I'm not sure if this requires a full-rewrite or not.
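
Independent of how the parser records positions, turning a character offset into a line and column is cheap. A minimal sketch, assuming each recognized block can report the offset where it starts (the helper name is made up for illustration):

```scala
object LinePositionSketch {
  /** 1-based (line, column) of a character offset within `source`. */
  def lineAndColumn(source: String, offset: Int): (Int, Int) = {
    val before = source.substring(0, offset)
    val line   = before.count(_ == '\n') + 1
    // Column counts from the character after the previous newline.
    val column = offset - (before.lastIndexOf('\n') + 1) + 1
    (line, column)
  }
}
```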

Lists being parsed incorrectly

Input:

1. This
2. That; and
3. the other

* not
* the
* end

and

- another
- style

and

+ finally
+ this
+ style

Converts to Block:

ListBuffer(OrderedList(List(OrderedItem(List(Paragraph(List(Text(This2. That; and3. the other), Emphasis(List(Text( not))), Text( the* endand- another- styleand+ finally+ this+ style)),1.1)),1.1))))

Converts to XHTML:

<ol>
  <li>
    This2. That; and3. the other
    <em> not</em>
    the* endand- another- styleand+ finally+ this+ style
  </li>
</ol>

expected rendering:

  1. This
  2. That; and
  3. the other
  • not
  • the
  • end

and

  • another
  • style

and

  • finally
  • this
  • style

Markdown compatibility: Bad link translation with parenthetical text

Consider this Markdown text:

Here is a [link][] (Cool!).

[link]: http://localhost/

With that input, John Gruber's Perl markdown script produces:

<p>Here is a <a href="http://localhost/">link</a> (Cool!).</p>

Knockoff produces:

<p>Here is a [link]<a href="Cool!" ></a>.
</p>

Remove the trailing parenthetical expression, and Knockoff produces a valid link:

Input

Here is a [link][].

[link]: http://localhost/

Output

<p>Here is a <a href="http://localhost/" >link</a>.
</p>

Error discovered in: knockoff_2.8.0.RC2-0.7.1-11.jar

Knockoff uses "Unparsed" for inline HTML in Markdown document

One option for inline HTML would be to attempt to convert it to XHTML, which might be favorable. This should be possible by adjusting the XHTMLWriter.spanToXHTML method, but I note that this is currently wrapped in a Group() as well in the paragraphToXHTML method.

  • We probably should make it easy to add a "handler for found HTML". This is probably going to be on the output method, not during parsing, which needs to be flexible to be useful.
  • Should I alter all the return values to be Option[Node] and use a flatMap method?

Need Line Number In Error Messages

It's really hard to figure out where you might have messed up the file, because the parsing error spits out some gibberish, but not where the error was. Dang.

An asterisk em right after an asterisk list marker is not processed correctly

Currently, if you lead a list item with an asterisk-delimited emphasis, you'll mess up the processing.

scala> toXHTML( knockoff("""* *What* is this""") )
res0: scala.xml.Node = <p><em> </em>What* is this</p>

There is a simple workaround:

scala> toXHTML( knockoff("""* _What_ is this""") )
res1: scala.xml.Node = <ul><li><em>What</em> is this</li></ul>

Markdown compatibility issue

Standard Markdown:

$ echo "*Test*" | markdown
<p><em>Test</em></p>

Knockoff (built against GitHub source with Scala 2.8):

scala> import com.tristanhunt.knockoff.DefaultDiscounter._
import com.tristanhunt.knockoff.DefaultDiscounter._

scala> toXHTML(knockoff("*Test*")).toString
res0: String = <ul><li>Test*</li></ul>
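
Both asterisk issues come down to when a leading asterisk counts as a list marker. A minimal sketch of the usual Markdown rule, where the marker must be followed by whitespace; the object name and regex are illustrative assumptions, not Knockoff's implementation:

```scala
object ListMarkerSketch {
  // Up to three spaces of indent, a bullet character, then at least one space.
  private val Bullet = """^ {0,3}[*+-]\s+(.*)$""".r

  /** Some(item text) when the line is a bullet item, None otherwise. */
  def bulletItem(line: String): Option[String] = line match {
    case Bullet(rest) => Some(rest)
    case _            => None
  }
}
```

Under this rule *Test* is not a list item, while * *What* is this yields the item text *What* is this, which the span parser can then treat as emphasis.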

Get Rid Of That Casting Stink

So we parse a text file into a list of Block objects. This means client code does things like this:

blocks.filter( _.isInstanceOf[ CodeBlock ] )

To get all code blocks. This kind of, well, sucks. Something is wrong here, because it sure seems like my type hierarchy is wrong if we have to cast to do anything useful.
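
One way client code can avoid the cast today is collect with a type pattern, which filters and narrows in one step. A minimal sketch with simplified stand-ins for the block types (the real classes carry more fields):

```scala
// Hypothetical, simplified stand-ins for Knockoff's block types.
sealed trait Block
final case class Paragraph(text: String) extends Block
final case class CodeBlock(text: String) extends Block

object CollectSketch {
  // The result is Seq[CodeBlock], not Seq[Block], so no cast is needed later.
  def codeBlocks(blocks: Seq[Block]): Seq[CodeBlock] =
    blocks.collect { case c: CodeBlock => c }

  def main(args: Array[String]): Unit = {
    val blocks = Seq(Paragraph("intro"), CodeBlock("val x = 1"), Paragraph("outro"))
    codeBlocks(blocks).foreach(c => println(c.text))
  }
}
```

This does not fix the type hierarchy itself, but it keeps isInstanceOf and asInstanceOf out of calling code.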

parsing links ending with brackets

eats up the last bracket...

object Test extends App {
  import com.tristanhunt.knockoff.DefaultDiscounter._
  println(toXHTML(knockoff("[wiki link](http://en.wikipedia.org/wiki/Bracket_(disambiguation))")).toString)
}

output - note the missing bracket in the href:

<p><a  href="http://en.wikipedia.org/wiki/Bracket_(disambiguation">wiki link</a>)</p>
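
A possible fix is to count parenthesis depth while scanning the URL instead of stopping at the first closing parenthesis. A minimal sketch under that assumption; the function name is hypothetical and titles inside the parentheses are ignored:

```scala
object BalancedUrlSketch {
  /** Given the text that follows "](", returns the URL up to the
    * parenthesis that balances the opening one, or None if unbalanced. */
  def linkUrl(afterOpenParen: String): Option[String] = {
    var depth = 1 // the "(" that opened the URL
    var i = 0
    while (i < afterOpenParen.length) {
      afterOpenParen.charAt(i) match {
        case '(' => depth += 1
        case ')' =>
          depth -= 1
          if (depth == 0) return Some(afterOpenParen.substring(0, i))
        case _ => // ordinary character
      }
      i += 1
    }
    None
  }
}
```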

Build system + test suite needs a makeover

The usage of buildr is pretty, well, bad. I went with buildr and testng because I knew it and wanted to get something done.

I should probably try out simple-build-tool (sbt), and then convert the testing framework to ScalaTest (with specs!)

Markdown compatibility problem: Hard break

Take the following input:

This is a line ending with two blanks.  
That should produce a hard "br" in Markdown, per Daring Fireball.  
It doesn't.

Each of the three lines ends with two blank characters. Per Daring Fireball, that's supposed to cause a hard break. Here's what the Perl Markdown script produces, given that input:

<p>This is a line ending with two blanks.<br />
That should produce a hard "br" in Markdown, per Daring Fireball.<br />
It doesn't.</p>

Here's what Knockoff produces:

<p>This is a line ending with two blanks.  
</p><p>That should produce a hard &quot;br&quot; in Markdown, per Daring Fireball.  
</p><p>It doesn't.</p>
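
The expected behaviour can be sketched as follows: within one paragraph, a line that ends with two or more spaces and is followed by another line gets a br. This is an illustrative helper, not Knockoff code, and it skips HTML escaping to stay short.

```scala
object HardBreakSketch {
  /** Renders the lines of a single paragraph, honouring trailing-space hard breaks. */
  def renderParagraph(lines: Seq[String]): String = {
    val body = lines.zipWithIndex.map { case (line, i) =>
      val isLast    = i == lines.length - 1
      val hardBreak = line.endsWith("  ") && !isLast
      val text      = line.replaceAll("\\s+$", "") // drop trailing whitespace
      if (hardBreak) text + "<br />" else text
    }
    "<p>" + body.mkString("\n") + "</p>"
  }
}
```

The key point is that the lines stay in one paragraph; the trailing spaces only decide whether a br is emitted.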

Stack overflow error with empty code block

I had a basic document started and got a stack overflow error. It looked like this:

# Title #

Note that the second line has no spaces and the third line has four spaces (start a code block). This made things go kaboom.

Code Blocks Are Separated By Empty Spaces

OK, currently, if you do this

    Code line one

    Code line two

And the line between those two lines does not have four spaces, they will be parsed as two separate code blocks. This is lame.
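
A minimal sketch of the intended grouping: blank lines are held back and only re-attached when another indented line follows, so they no longer split the block. The names are hypothetical and this works on raw lines rather than on Knockoff's chunk stream.

```scala
import scala.collection.mutable.ListBuffer

object CodeBlockSketch {
  private def isBlank(line: String) = line.trim.isEmpty
  private def isCode(line: String)  = line.startsWith("    ")

  /** Returns the code blocks in `lines`, with the four-space indent stripped. */
  def codeBlocks(lines: Seq[String]): List[String] = {
    val blocks  = ListBuffer.empty[String]
    val current = ListBuffer.empty[String]
    var pendingBlanks = 0

    def flush(): Unit = {
      if (current.nonEmpty) blocks += current.mkString("\n")
      current.clear()
      pendingBlanks = 0
    }

    for (line <- lines) {
      if (isBlank(line)) {
        if (current.nonEmpty) pendingBlanks += 1 // might still be inside a block
      } else if (isCode(line)) {
        current ++= List.fill(pendingBlanks)("") // blanks were interior after all
        pendingBlanks = 0
        current += line.drop(4)
      } else flush() // ordinary text ends the block
    }
    flush()
    blocks.toList
  }
}
```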
