Giter Club home page Giter Club logo

validator-java's Introduction

AMP HTML ⚡ Validator

Build status: build_status

AMP HTML Validator takes input of HTML text and parses the text tag by tag, running the ruleset against every tag. Resultant from this process is validity of the input text and the errors generated while parsing.

Table of Contents

Background

@nhant01 and @GeorgeLuo own development and maintenance responsibility of the project. As such they will serve as the points of contact surrounding changes and modifications. The team will allocate resources to align the business with the implementations provided under the amphtml repo. With that in mind, should supporting this project become infeasible, we will publish a version of the codebase marked for deprecation and prepare for its removal from public repositories (every effort will be made to not have this happen).

Install

Maven

AMP HTML Validator uses maven as tool for building and managing project. Add following snippet to your pom.xml and hit "mvn" to build your project.

<dependency>
  <groupId>dev.amp</groupId>
  <artifactId>validator-java</artifactId>
  <version>${version.number}</version>
</dependency>

Here are the instructions to setup maven environment. https://maven.apache.org/what-is-maven.html https://maven.apache.org/guides/introduction/introduction-to-the-pom.html

Usage

Initialize the validator using

final AMPHtmlParser ampHtmlParser = new AMPHtmlParser();
final ValidationResult validationResult = ampHtmlParser.parse(inputHtml, htmlFormat, condition, maxNodes);

The parser can be used beyond the first document, to truncate initialization time. The maxNode condition is the maximum number of tags reviewed by the validator before forcing exit on exception. The condition is an enumeration of type ExitCondition, either exit on first error or a full parsing attempt. The htmlFormat is an enumeration for the format of AMP to be validated against.

Issues

The are several known bugs in the validation output.

  • the DOCTYPE tag is assumed to exist. Documents without the tag will fail to return an error.
  • Observance of afterbody mode is not implemented. Tags that appear following a body closing tag should be interpreted as within the body tag. Specs that mandate relationships between tags and ancestors related to the body tag will not account for this behavior and will return unexpected errors.
  • Attributes are de-duped by the internal html parser (the last value assigned to an attribute is the one returned to the handler). This leads to discrepancies in attribute validation to do with validation of the uniqueness of attributes and unexpected behavior in value validation.
  • Validation of URLs found within the HTML document may not return the same errors as the Node.js implementation. This stems from the fact that the internal URL library is lenient in terms of validating hostnames and characters used within the URL.
  • This validator prioritizes amp4email html content, enforcement of validator logic for other formats is not guaranteed.
  • CSS validation does not yet configure for a max nodes value

Contribute

Please refer to the Contribute.md file for information about how to get involved. We welcome issues, questions, and pull requests. Pull Requests are welcome.

License

This project is licensed under the terms of the Apache 2.0 open source license. Please refer to LICENSE for the full terms.

Attribution

validator-java's People

Contributors

georgeluo avatar gregable avatar nhant01 avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.