
flake8-jupyter-notebook's Introduction

A container and a wrapper script around flake8 that validate Python code within Jupyter notebooks. flake8 will pick up configuration files in your project, but some options are not supported and will cause the action to fail (sometimes silently).

Motivation

An easy way to automate flake8 code checks over code blocks defined in a Jupyter notebook.

Example usage

jobs:
  flake8:
    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v4
    - uses: mhitza/flake8-jupyter-notebook@v1
      with:
        debug: 'false' # set 'true' for additional logging
        # paths and files to ignore, one regexp rule per line
        ignore: |
          tests/
          test\.ipynb$
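
How the ignore rules are applied is not spelled out here. As a rough illustration only, the sketch below assumes each non-empty line is an unanchored regular expression searched against the notebook's repository-relative path. The function name `is_ignored` is hypothetical, and the action itself may match differently.

```python
import re


def is_ignored(path, ignore_rules):
    """Return True if any rule (one regexp per line) matches the path.

    Assumption of this sketch: rules are unanchored searches against the
    repository-relative path. The action's real matching may differ.
    """
    patterns = [
        re.compile(rule.strip())
        for rule in ignore_rules.splitlines()
        if rule.strip()  # skip blank lines
    ]
    return any(pattern.search(path) for pattern in patterns)
```

Under that assumption, a rule such as `test\.ipynb$` would also match `latest.ipynb`, so anchoring a rule with a leading `/` or `^` may be needed to target a single file.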

[Screenshot: annotations produced by the action]

Implementation details

There is an existing project called flake8-nb that performs the same task as this action. The initial implementation tried to wrap the annotation script around that utility, but that approach was abandoned and flake8 was used directly because:

  1. flake8-nb did not report absolute line numbers within the notebook file; it reported only line numbers relative to the checked code blocks.
  2. A notebook might be checked into the repository without its code cells evaluated. In that case flake8-nb would report on cells without a number, and tracing a reported error back to an absolute line number became more difficult than wrapping around flake8 directly.

In order to check the notebook, the annotate script keeps track of all the various code blocks within the notebook, concatenates them into a single source and pipes it into flake8.

Known limitations

Only notebooks in a version 4 compatible format are supported. Other notebook formats are silently skipped, because source blocks are extracted with regular expressions that depend on the indentation level. If you're aware of any JavaScript JSON parser that keeps track of the source lines it parses, I'd be happy to hear about it.
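
Because unsupported notebooks are skipped without a message, it may help to check the format version yourself before relying on the action. The sketch below is a user-side helper, not part of the action; it only reads the top-level `nbformat` field of the notebook JSON, and the function name `is_nbformat_4` is hypothetical.

```python
import json


def is_nbformat_4(notebook_text):
    """Return True if the text is notebook JSON declaring nbformat 4.

    This checks only the declared major version. It does not verify the
    indentation layout that the regular expressions depend on.
    """
    try:
        notebook = json.loads(notebook_text)
    except json.JSONDecodeError:
        return False
    return isinstance(notebook, dict) and notebook.get("nbformat") == 4
```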

Due to implementation details and Jupyter notebook specific idiosyncrasies, some warnings and errors reported by flake8 are ignored by default (hardcoded in source code). The following list is not necessarily exhaustive and might change based on testing and issues raised.

flake8 configuration support

The following options, which can be defined in a flake8 configuration file, are not supported.

Any option that changes the output of flake8: --quiet, --count, --format (only the default format is supported), --show-source, --statistics.

Anything that relies on filenames or paths, as code is passed to flake8 via stdin. Thus the following options will have no effect: --exclude, --extend-exclude, --filename, --per-file-ignores.
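
Since an unsupported option can make the action fail silently, a quick check of your configuration may save some debugging. The sketch below is a hypothetical helper, not part of the action. It assumes the options live in a `[flake8]` section of an INI-style file (such as `.flake8`, `setup.cfg` or `tox.ini`) and that option names may be written with either dashes or underscores.

```python
import configparser

# Options listed above as unsupported or without effect.
UNSUPPORTED = {
    "quiet", "count", "format", "show-source", "statistics",
    "exclude", "extend-exclude", "filename", "per-file-ignores",
}


def unsupported_options(config_text):
    """Return the sorted unsupported option names found under [flake8]."""
    # RawConfigParser avoids '%' interpolation, which config values may contain.
    parser = configparser.RawConfigParser()
    parser.read_string(config_text)
    if not parser.has_section("flake8"):
        return []
    found = set()
    for name in parser.options("flake8"):
        normalized = name.replace("_", "-")  # treat show_source like show-source
        if normalized in UNSUPPORTED:
            found.add(normalized)
    return sorted(found)
```

Running this over the text of your configuration file returns the names worth removing or moving elsewhere; an empty list means none of the options listed above were found.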

flake8-jupyter-notebook's People

Contributors

chasnelson1990, mhitza


Forkers

chasnelson1990

flake8-jupyter-notebook's Issues

Github action doesn't work

Hi.

I've just made a simple repository https://github.com/topshik/jupyter-test-linter where I wanted to test your linter with a simple example: create a branch with a linter and .ipynb file, so that it can be checked by a linter. However, the step with applying your code always fails with some JS memory problems.

<--- Last few GCs --->
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory

[18:0x5582cf6f7f80]    65724 ms: Scavenge 2046.7 (2050.7) -> 2045.7 (2050.7) MB, 2.0 / 0.0 ms  (average mu = 0.235, current mu = 0.213) allocation failure 
[18:0x5582cf6f7f80]    65805 ms: Scavenge 2046.8 (2050.7) -> 2045.8 (2049.7) MB, 14.1 / 0.0 ms  (average mu = 0.235, current mu = 0.213) allocation failure 
[18:0x5582cf6f7f80]    65931 ms: Scavenge 2046.8 (2049.7) -> 2045.8 (2049.7) MB, 9.6 / 0.0 ms  (average mu = 0.235, current mu = 0.213) allocation failure 


<--- JS stacktrace --->
 1: 0x7f764c129d4c node::Abort() [/lib64/libnode.so.72]

==== JS stack trace =========================================

    0: ExitFrame [pc: 0x7f764cf884b9]
Security context: 0x371f82e1b161 <JSObject>
    1: match [0x371f82e09891](this=0x043d40eb6f29 <String[18]:    "metadata": {},>,0x159ee930f5c9 <JSRegExp <String[#18]: ^\s{3}"cell_type":>>)
    2: find_source_blocks [0x1032084a00f1] [/annotate:~140] [pc=0x3055ab189746](this=0x346d9066b999 <JSGlobal Object>,0x043d40eb64b1 <JSArray[67]>)
    3: /* anonymous */ [0x1032084a0131] [/annotate:27] [bytecode=0x1...

 2: 0x7f764bf638cd node::OnFatalError(char const*, char const*) [/lib64/libnode.so.72]
 3: 0x7f764c48cc7a v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [/lib64/libnode.so.72]
 4: 0x7f764c48cf02 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/lib64/libnode.so.72]
 5: 0x7f764c60cd89  [/lib64/libnode.so.72]
 6: 0x7f764c621717 v8::internal::Heap::PerformGarbageCollection(v8::internal::GarbageCollector, v8::GCCallbackFlags) [/lib64/libnode.so.72]
 7: 0x7f764c62247a v8::internal::Heap::CollectGarbage(v8::internal::AllocationSpace, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/lib64/libnode.so.72]
 8: 0x7f764c6227e0 v8::internal::Heap::CollectAllGarbage(int, v8::internal::GarbageCollectionReason, v8::GCCallbackFlags) [/lib64/libnode.so.72]
 9: 0x7f764c5d77b2 v8::internal::StackGuard::HandleInterrupts() [/lib64/libnode.so.72]
10: 0x7f764c8fb1f5 v8::internal::Runtime_StackGuard(int, unsigned long*, v8::internal::Isolate*) [/lib64/libnode.so.72]
11: 0x7f764cf884b9  [/lib64/libnode.so.72]
/entrypoint.sh: line 3:    18 Aborted                 (core dumped) /annotate

Option to ignore notebooks?

As far as I can tell there's no way to either explicitly specify which notebooks to validate or which notebooks to ignore. Am I right in that assumption?
