
katana-parser's Introduction

Katana - A pure-C CSS parser.

Katana is an implementation of the CSS (Cascading Style Sheets) parsing algorithm, written as a pure C library with no outside dependencies. It's designed to serve as a building block for other tools and libraries such as linters, validators, templating languages, and refactoring and analysis tools.

Katana is inspired by Gumbo, so it shares some of Gumbo's goals and features.

Goals & features:

  • Simple API that can be easily wrapped by other languages.
  • Relatively lightweight, with no outside dependencies.
  • Support for fragment parsing.

Non-goals:

  • Mutability. Katana is intentionally designed to turn a style sheet into a parse tree, and free that parse tree all at once. It's not designed to persistently store nodes or subtrees outside of the parse tree, or to perform arbitrary style mutations within your program. If you need this functionality, we recommend translating the Katana parse tree into a mutable style representation more suited for the particular needs of your program before operating on it.

Wishlist:

  • Full conformance with the CSS Syntax specification.
  • Hackable dump or print.
  • Robust and resilient to bad input.
  • Full-featured error reporting.
  • Additional performance improvements.
  • Tested on Official W3C Test Suites.

Installation

To build and install the library, issue the standard UNIX incantation from the root of the distribution:

$ ./autogen.sh
$ ./configure CFLAGS="-std=c99"
$ make
$ sudo make install

Katana comes with full pkg-config support, so you can use pkg-config to print the flags needed to link your program against it:

$ pkg-config --cflags katana         # print compiler flags
$ pkg-config --libs katana           # print linker flags
$ pkg-config --cflags --libs katana  # print both

For example:

$ gcc examples/dump_stylesheet.c `pkg-config --cflags --libs katana` -o dump

If the katana package is not found in the pkg-config search path, add the directory containing katana.pc to the PKG_CONFIG_PATH environment variable as follows:

$ export PKG_CONFIG_PATH=/usr/local/lib/pkgconfig

See the pkg-config man page for more info.

Basic Usage

Within your program, you need to include "katana.h" and then issue a call to katana_parse:

#include <string.h>  // for strlen

#include "katana.h"

int main(void) {
  const char* css = "selector { property: value }";
  KatanaOutput* output = katana_parse(css, strlen(css), KatanaParserModeStylesheet);
  // Do stuff with output, e.g. print the input style
  katana_dump_output(output);
  // Free the whole parse tree at once
  katana_destroy_output(output);
  return 0;
}

See the API documentation and sample programs for more details.

Contributing

Bug reports are very much welcome. Please use GitHub's issue-tracking feature, as it makes it easier to keep track of bugs and makes it possible for other project watchers to view the existing issues.

Patches and pull requests are also welcome.

If you would rather not submit a patch, it would be most helpful if you could file bug reports that include detailed prose about where in the code the error is and how to fix it, but leave out exact source code.

katana-parser's People

Contributors

andrewlin12, detailyang, htower, julianeisel, qfish, stackluca

katana-parser's Issues

Crash on parsing '@supports'

An attempt to parse the string "@supports (transform-origin: 5% 5%) {}" raises an EXC_BAD_ACCESS exception inside 'katana_add_rule'. The function 'katana_add_rule' is called from here:

katana.tab.c:2252

  case 36:

    {
        if ((yyvsp[-1].rule))
            katana_add_rule(parser, (yyvsp[-1].rule));
    }

    break;

(yyvsp[-1].rule) contains an invalid value.

Crash in 'katana_stringify_value_list'.

Here (parser.c:1635):

        const char* value_str = katana_stringify_value(parser, value);
        katana_string_append_characters(parser, value_str, buffer);
        katana_parser_deallocate(parser, (void*) value_str);
        value_str = NULL;

'katana_stringify_value' can actually return NULL if the value is an empty KATANA_VALUE_PARSER_LIST.
After that katana_string_append_characters will call strlen(NULL).
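
One way to guard against this is to treat a NULL string as empty before measuring it. The sketch below uses stand-in names rather than Katana's real types: `append_or_skip` and its fixed-size buffer are hypothetical and only illustrate the check. In parser.c the equivalent guard would presumably sit before the call to katana_string_append_characters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Katana. Appends src to the
 * NUL-terminated string in dst, whose capacity is cap bytes including
 * the terminator. A NULL src is treated as an empty string instead of
 * being handed to strlen(). Returns the number of bytes appended. */
size_t append_or_skip(char *dst, size_t cap, const char *src) {
    assert(dst != NULL && cap > 0);   /* caller bugs, not data errors */

    if (src == NULL) {
        return 0;                     /* nothing to append, no strlen(NULL) */
    }

    size_t used = strlen(dst);
    if (used >= cap - 1) {
        return 0;                     /* buffer is already full */
    }

    size_t room = cap - 1 - used;
    size_t n = strlen(src);
    if (n > room) {
        n = room;                     /* truncate rather than overflow */
    }

    memcpy(dst + used, src, n);
    dst[used + n] = '\0';
    return n;
}
```

With this shape, a value that stringifies to NULL simply contributes nothing to the output buffer.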

extraneous return in the middle of parser.c katana_print routine

I did a fresh checkout and build on Mac OS X; everything builds and seems to work.

I found an extraneous printf("\n") inside the katana_print routine in parser.c:

void katana_print(const char * format, ...)
{
    va_list args;
    va_start(args, format);
    vprintf(format, args);
    printf("\n");
    va_end(args);
    fflush(stdout);
}

Perhaps you want to remove that? It seems to mess up indentation and output format for no good reason (inserting a newline before every ";" at the end of each property value).
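
A minimal sketch of the suggested change, assuming nothing else depends on the trailing newline. `print_to` is a hypothetical variant that takes the output stream as a parameter and returns the character count so the behaviour is easy to check; apart from that, it is katana_print with the printf("\n") line dropped.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical variant of katana_print: same formatting, but no
 * newline is appended and the caller chooses the stream. */
int print_to(FILE *out, const char *format, ...) {
    assert(out != NULL && format != NULL);

    va_list args;
    va_start(args, format);
    int written = vfprintf(out, format, args);
    va_end(args);

    fflush(out);
    return written;   /* characters written, or a negative value on error */
}
```

Callers that do want a line break can then put "\n" in the format string themselves.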

I also wanted to ask about the status of this project. I develop for Sigil (github.com/Sigil-Ebook/Sigil), which is an epub2/epub3 editor, and we need a good CSS parser. We are already a heavy user of a modified version of gumbo (to support xhtml parsing), and your project description caught my attention because of your project goals and their similarity to gumbo-parser.

  1. Is this project still alive?
  2. How complete and usable is it?
  3. Is there any read-only interface for specific pieces, or a set of callbacks, or anything else that would allow the parser to be used more easily (the dump_output approach seems too heavy for easy use)? Or do we walk the output by duplicating your dump routine approach and process things ourselves step by step?
  4. Are there any sample programs that demonstrate the interface in more detail?
  5. Is there any documentation outside of the code itself?

Thanks!

Crash due to void* dereference level mismatch in parser

There are a few places in parser.c where void* pointers in the parser struct are used at the wrong indirection level. The C compiler doesn't catch these because of the opaque void* typing, but they cause crashes at run-time.

parser.c line 1254:    katanaget_text(parser->scanner)   // Should be *parser->scanner

parser.c line 1256:  YYSTYPE * s = katanaget_lval(parser->scanner);  // Should be *parser->scanner

parser.c line 1258: ...katana_get_previous_state(parser->scanner); // Should be *parser->scanner

parser.c line 1272:    katanaget_text(parser->scanner)   // Should be *parser->scanner
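
The reason the compiler stays silent can be shown with a small sketch. It assumes the parser struct stores a pointer to the scanner handle, and that the handle itself is a `void *` (as flex's yyscan_t is). All names below are hypothetical stand-ins, not Katana's or flex's definitions.

```c
#include <assert.h>
#include <stddef.h>

typedef void *scanner_t;                 /* opaque handle, like yyscan_t */

struct scanner_state { const char *text; };

struct parser { scanner_t *scanner; };   /* pointer TO the handle */

/* Accessor that expects the handle itself. */
const char *scanner_text(scanner_t s) {
    assert(s != NULL);
    return ((struct scanner_state *)s)->text;
}

const char *parser_current_text(const struct parser *p) {
    assert(p != NULL && p->scanner != NULL);
    /* scanner_text(p->scanner) would also compile, because a
     * scanner_t * converts silently to void *. It would then read the
     * handle's own storage as if it were a scanner_state.
     * Dereferencing once passes the handle the accessor expects. */
    return scanner_text(*p->scanner);
}
```

Giving the handle a distinct struct pointer type instead of `void *` would turn the wrong call into a compile-time diagnostic.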

API Documentation

The readme file suggests reading the API documentation. Where is it?
Where can I find documentation for using this library? The examples just do a dump of the CSS file!

Memory leak found with valgrind

Hi,
I have used the parser and it is very cool. It has helped me a lot.
When I tested it with valgrind, I found a memory leak. It is shown in the picture below:

[image: valgrind report]

I tried to track it down, but I am not good at this, so I hope you can help me.

Thanks.

Crash on [|att] selector.

Katana crashes when trying to parse the '[|att]' selector.

Example:

<html><head><style>[|att] {color: green;}</style></head><body><p>Some text</p></body></html>

some compile problems

make all-am
make[1]: Entering directory `/home/work/chengang06/cssparsertest/katana-parser'
/bin/sh ./libtool --tag=CC --mode=link gcc -std=c99 -mcmodel=large -o dump_stylesheet examples/dump_stylesheet.o libkatana.la
libtool: link: gcc -std=c99 -mcmodel=large -o .libs/dump_stylesheet examples/dump_stylesheet.o ./.libs/libkatana.so -Wl,-rpath -Wl,/usr/local/lib
./.libs/libkatana.so: undefined reference to `katana_is_html_space'
collect2: error: ld returned 1 exit status
make[1]: *** [dump_stylesheet] Error 1
make[1]: Leaving directory `/home/work/chengang06/cssparsertest/katana-parser'
make: *** [all] Error 2

Missing flex/bison sources

Hello!
I've looked at the source code and could not find the original lex/yacc files (katana.l and katana.y).
Is there any reason not to make them public in the repository?
Cheers!

KatanaError objects are leaked.

Error objects are allocated on the heap and are not freed when the KatanaOutput object is destroyed.

Function katana_destroy_output:

...
katana_destroy_stylesheet(&parser, output->stylesheet);

katana_array_destroy(&parser, &output->errors); // should be a katana_destroy_array call instead

katana_parser_deallocate(&parser, output);

...
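
The difference between the two kinds of destroy can be sketched with a stand-in array type. `PtrArray` and both functions below are hypothetical and do not reproduce Katana's signatures; the point is only that freeing the backing storage does not free heap-allocated elements.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for an array of pointers; not Katana's real type. */
typedef struct {
    void **data;
    size_t length;
} PtrArray;

/* Frees only the backing storage. Heap-allocated elements stay
 * allocated, which is the kind of leak described above. */
void ptr_array_destroy(PtrArray *array) {
    assert(array != NULL);
    free(array->data);
    array->data = NULL;
    array->length = 0;
}

/* Frees every element through the supplied callback, then the storage.
 * Returns how many elements were released. */
size_t ptr_array_destroy_deep(PtrArray *array, void (*free_item)(void *)) {
    assert(array != NULL && free_item != NULL);

    size_t released = 0;
    for (size_t i = 0; i < array->length; i++) {
        if (array->data[i] != NULL) {
            free_item(array->data[i]);
            released++;
        }
    }
    ptr_array_destroy(array);
    return released;
}
```

Passing the element destructor as a callback keeps one deep-destroy routine usable for errors, rules, or any other element type.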

Matching selectors

I'm working on a function for matching CSS selectors and I seem to be having some issues trying to write this function. This is my current code but it doesn't match correctly. Does anyone know of an easier way of matching or one that works? I also tried looking for a function in the Katana source code but I didn't find a matching function.

KatanaStyleRule* rds_theme_find_byrule(rds_theme_t* theme, const char* strrule) {
  KatanaOutput* css = katana_parse(strrule, strlen(strrule), KatanaParserModeSelector);
  if (css == NULL) return NULL;
  for (int i = 0; i < theme->css->stylesheet->rules.length; i++) {
    if (((KatanaRule*)theme->css->stylesheet->rules.data[i])->type == KatanaRuleStyle) {
      KatanaStyleRule* rule = (KatanaStyleRule*)theme->css->stylesheet->rules.data[i];
      if (rule->selectors->length != css->selectors->length) continue;
      int match = 1;
      for (int x = 0; x < css->selectors->length; x++) {
        KatanaSelector* asel = (KatanaSelector*)rule->selectors->data[x];
        KatanaSelector* bsel = (KatanaSelector*)css->selectors->data[x];

        if ((asel->match == bsel->match) == KatanaSelectorMatchTag) {
          if (asel->tag->prefix == NULL && bsel->tag->prefix == NULL) {
            if (!!strcmp(asel->tag->local, bsel->tag->local)) {
              match = 0;
            }
          } else {
            if (!!strcmp(asel->tag->prefix, bsel->tag->prefix)) {
              match = 0;
            }
            if (!!strcmp(asel->tag->local, bsel->tag->local)) {
              match = 0;
            }
          }
        }

        while (asel != NULL && bsel != NULL) {
          if (asel->match == bsel->match) {
            if (asel->match == KatanaSelectorMatchId || asel->match == KatanaSelectorMatchClass) {
              if (!!strcmp(asel->data->value, bsel->data->value)) match = 0;
            }
          } else {
            match = 0;
            break;
          }
          asel = asel->tagHistory;
          bsel = bsel->tagHistory;
        }
      }
      if (!match) continue;
      katana_destroy_output(css);
      return rule;
    }
  }
  katana_destroy_output(css);
  return NULL;
}
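
Three things in the posted loop look likely to cause mismatches, although this has not been run against Katana. The test `(asel->match == bsel->match) == KatanaSelectorMatchTag` compares a 0-or-1 result with an enum constant. strcmp can receive a NULL prefix when only one side has one. And the while loop stops at the shorter chain without noticing the length difference. The sketch below shows one way to structure the comparison; `Sel`, `MatchKind` and the function names are stand-ins that mirror only the fields used in the post, not Katana's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in types, NOT Katana's real definitions. */
typedef enum { MatchTag, MatchId, MatchClass } MatchKind;

typedef struct Sel {
    MatchKind match;
    const char *prefix;      /* namespace prefix, may be NULL (tag->prefix) */
    const char *local;       /* tag name, may be NULL (tag->local) */
    const char *value;       /* id or class name, may be NULL (data->value) */
    const struct Sel *next;  /* next simple selector (tagHistory) */
} Sel;

/* Two NULLs are equal; NULL and non-NULL differ; otherwise strcmp. */
static bool same_text(const char *a, const char *b) {
    if (a == NULL || b == NULL) {
        return a == b;
    }
    return strcmp(a, b) == 0;
}

/* Walks both chains in step and compares each pair of simple selectors. */
bool selectors_equal(const Sel *a, const Sel *b) {
    while (a != NULL && b != NULL) {
        if (a->match != b->match) {
            return false;
        }
        if (a->match == MatchTag) {
            if (!same_text(a->prefix, b->prefix) ||
                !same_text(a->local, b->local)) {
                return false;
            }
        } else if (!same_text(a->value, b->value)) {
            return false;
        }
        a = a->next;
        b = b->next;
    }
    return a == NULL && b == NULL;   /* both chains must end together */
}

/* Usage sketch: a tag followed by a class equals itself, differs from
 * the same tag followed by an id, and differs from the tag alone. */
void selectors_equal_example(void) {
    const Sel cls   = { MatchClass, NULL, NULL, "note", NULL };
    const Sel id    = { MatchId,    NULL, NULL, "note", NULL };
    const Sel p     = { MatchTag,   NULL, "p",  NULL,   NULL };
    const Sel p_cls = { MatchTag,   NULL, "p",  NULL,   &cls };
    const Sel p_id  = { MatchTag,   NULL, "p",  NULL,   &id };

    assert(selectors_equal(&p_cls, &p_cls));
    assert(!selectors_equal(&p_cls, &p_id));
    assert(!selectors_equal(&p_cls, &p));   /* different chain lengths */
}
```

This checks structural equality of two parsed selectors only. Matching a selector against a document element is a different problem and is not covered here.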
