jcoombes / commonsensecode


commonsensecode's Introduction

Hi there 👋

🔭 I’m currently working on obvs: An Interpretability Library that helps Make Transformers Obvious.
🌱 I’m currently learning Mojo 🔥
👯 I’m looking to collaborate in Open Source development.
💬 Ask me about understanding model internals.
😄 Pronouns: They | Them.

commonsensecode's Issues

Initial repo setup

We'll need two repos.

commonsensecode repo:

Scaffold a Python project with the Coefficient cookiecutter.
Directory layout similar to the pre-commit hooks repo.
README and CONTRIBUTING guide explain the project goals and ideas.
ADR.md details the date, the architectural choice made and the justification.

handcrafted-senseless-python repo:

0.1 milestone: an octocat GitHub repository containing some unsage Python code and a pre-commit.yaml file that connects to our tool.

cs-semantic

There are only two hard problems in computer science...

"
Look for badly-named variables, functions, classes, modules etc - rename them to be more semantic using a pythonic style.
Look for overly broadly scoped variables. Out of the LEGB scopes, what is the tightest scope needed for this problem? Choose a professional and appropriate scope.

Code generated by language models sometimes contains hallucinations.
Look for any variables which you think might be hallucinated, slightly misremembered versions of previous variable names.
Look for any python functions within imported modules which you think might be hallucinated.

Code can contain syntax, static-semantic and dynamic-semantic errors.
Look for all forms of dynamic semantic errors and fix them please.
"

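As a rough, non-LLM baseline for the "slightly misremembered variable names" check, here is a minimal sketch using only the standard library. The function name `find_suspect_names` and the 0.8 similarity cutoff are assumptions for illustration, and the sketch ignores scoping entirely (it treats the whole file as one namespace), so it is a starting point rather than the tool itself.

```python
import ast
import builtins
import difflib


def find_suspect_names(source: str, cutoff: float = 0.8) -> dict[str, str]:
    """Map each name that is read but never bound to its closest bound name."""
    bound: set[str] = set(dir(builtins))
    loaded: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Load):
                loaded.add(node.id)
            else:
                bound.add(node.id)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.alias):
            # "import a.b" binds "a"; "import a as c" binds "c".
            bound.add((node.asname or node.name).split(".")[0])

    suspects: dict[str, str] = {}
    for name in sorted(loaded - bound):
        close = difflib.get_close_matches(name, sorted(bound), n=1, cutoff=cutoff)
        if close:
            suspects[name] = close[0]
    return suspects


example = """
def total_price(item_prices):
    running_total = 0
    for price in item_prices:
        running_total += price
    return runing_total
"""
print(find_suspect_names(example))
```

A chain could be handed these candidates as hints, leaving the judgement about whether a rename is actually correct to the model.
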
Release MVP

cs-comments

A langchain chain with an eye for detail that can fix all the common-sense places in the codebase where the inline comments and docstrings are out of harmony with the rest of the codebase.

Possible Implementation - prompt is something like:

"
Imagine you are a sensible and mature software engineer, you write clean performant code and value communication with your fellow engineers.

Check the following code, and determine where the inline comments and documentation are incongruous with the file and the wider codebase.
"

It can fix:
a .py file with no comments, a .py file with too many comments, a .py file with comments aimed at too-basic an audience (based on your inline comments, it seems like you're writing for a software engineer with n years of experience - c.f. grammarly tone indicator).

a .py file containing a function that doesn't do what the docstring says, in an obvious way (e.g. the type hint doesn't match the return value)

a .py file containing a function that contains pragmas (checks our inline-comment chain doesn't break them)
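
The pragma test case could be checked mechanically by comparing the pragma comments before and after the chain rewrites a file. The sketch below assumes a small, hypothetical set of pragma markers (`noqa`, `type: ignore`, `pragma:`, `pylint:`, `fmt:`); the real list would need to be agreed for the project.

```python
import io
import re
import tokenize

# Assumed set of pragma markers; extend as needed.
PRAGMA_PATTERN = re.compile(r"#\s*(noqa|type:\s*ignore|pragma:|pylint:|fmt:)")


def pragma_comments(source: str) -> list[str]:
    """Return every pragma-style comment in source, in order of appearance."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [
        tok.string
        for tok in tokens
        if tok.type == tokenize.COMMENT and PRAGMA_PATTERN.match(tok.string)
    ]


def pragmas_preserved(before: str, after: str) -> bool:
    """True if the rewritten file keeps exactly the same pragma comments."""
    return pragma_comments(before) == pragma_comments(after)


before = "import os  # noqa: F401\nx = 1  # set x to one\n"
after_good = "import os  # noqa: F401\nx = 1\n"
after_bad = "import os  # unused import\nx = 1\n"

print(pragmas_preserved(before, after_good))
print(pragmas_preserved(before, after_bad))
```

This only compares the comment text, not which line each pragma is attached to, so a stricter check might also compare the code on the pragma's line.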

cs-type-annotator

Uses common sense to infer the types each function might take and return, then adds type hints to every function.
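
Before any inference happens, the hook needs to know which functions lack hints. A minimal sketch of that first step, using the standard `ast` module; the function name `functions_missing_hints` is hypothetical, and skipping `self` and `cls` is an assumption about what counts as "missing".

```python
import ast


def functions_missing_hints(source: str) -> list[str]:
    """List functions lacking a return annotation or any parameter annotation."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        params = node.args.posonlyargs + node.args.args + node.args.kwonlyargs
        if node.args.vararg is not None:
            params.append(node.args.vararg)
        if node.args.kwarg is not None:
            params.append(node.args.kwarg)
        unannotated = [
            p.arg
            for p in params
            if p.annotation is None and p.arg not in ("self", "cls")
        ]
        if unannotated or node.returns is None:
            missing.append(node.name)
    return missing


example = """
def add(a: int, b: int) -> int:
    return a + b

def scale(values, factor=2):
    return [v * factor for v in values]
"""
print(functions_missing_hints(example))
```

The functions it reports are the ones the chain would then be asked to annotate.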

Not in Scope: Django-type-hinting-stubs.

cs-readme-contributing

Write a language chain which checks whether all the .md and .txt files in a repo, such as README.md and CONTRIBUTING.md, have a consistent style, formatting, grammar, tone, vibes and authorial voice.

Testset:

The senseless readme and the senseless contributing.

cs-design

Write a language chain that fixes senseless design mistakes within a single file.

Example Prompt:
"
Imagine you are my co-worker, does this implementation seem sensible to you?
"

Testset:

a .py file (.ipynb notebook?) that doesn't use method chaining, and keeps instantiating things in memory.

a .py file using a functional-programming approach to write game of life.

a .py file using a class-based approach for something where we should just let-data-be-data. (class with 2 methods, one is __init__())

a .py file that is writing obfuscated java code disguised as python, obfuscated c++ code disguised as python.
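
The "class with 2 methods, one of which is the initialiser" case is easy to pre-filter before asking a model about it. A minimal sketch, assuming that any class whose only methods are `__init__` plus at most one other is worth flagging; the function name is hypothetical.

```python
import ast


def data_only_class_candidates(source: str) -> list[str]:
    """Flag classes whose only methods are __init__ and at most one other."""
    candidates = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        methods = [
            item.name
            for item in node.body
            if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
        ]
        if "__init__" in methods and len(methods) <= 2:
            candidates.append(node.name)
    return candidates


example = """
class Greeter:
    def __init__(self, name):
        self.name = name

    def greet(self):
        return "Hello, " + self.name


class Stack:
    def __init__(self):
        self.items = []

    def push(self, item):
        self.items.append(item)

    def pop(self):
        return self.items.pop()
"""
print(data_only_class_candidates(example))
```

Whether a flagged class should really become a function or a plain data structure is still a judgement call for the chain.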

handcrafted-senseless-python test repo

we need:

none of the files have type hints.
a .py file that imports a now deprecated library (see Kenneth Reitz ones), a .py file that imports a library with almost no github stars, a .py file that imports a popular library, a .py file that imports like 20 libraries.
a .py file with no comments, a .py file with too many comments, a .py file with comments aimed at too-basic an audience (based on your inline comments, it seems like you're writing for a software engineer with n years of experience - c.f. grammarly tone indicator).
a .py file containing a function that doesn't do what the docstring says, in an obvious way (e.g. the type hint doesn't match the return value)
a .py file containing a function that contains pragmas (checks our inline-comment chain doesn't break them)
a .py file with really unsemantic variable names.
a .py file with overly long variable names.
a .py file which implements an unwise user interface (command-line interface with gratuitous steps?).
a .py file containing bogosort, stoogesort.
a .py file (.ipynb notebook?) that doesn't use method chaining, and keeps instantiating things in memory.
a .py file using a functional-programming approach to write game of life.
a .py file using a class-based approach for something where we should just let-data-be-data. (class with 2 methods, one is __init__())
a .py file that is writing obfuscated java code disguised as python, obfuscated c++ code disguised as python.
a .py file that breaks PEP-8 in a whole bunch of ways (preferably ways not covered by black).
a .py file with really high cyclomatic complexity.
a .py file containing an impure function, that causes an error by accidentally mutating a data structure.
a test_foo.py file that doesn't actually test what the documentation claims it tests.
a really badly written (no style, inconsistent, spelling, grammar errors) readme and contributing guide.
sphinx documentation that doesn't match the codebase?
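
To confirm that the "really high cyclomatic complexity" fixture is as bad as intended, a rough score can be computed with the standard library. This is an approximation (one plus the number of branch points found in the AST), not the exact figure a dedicated tool would report, and the function name is hypothetical. Nested functions are also counted towards their enclosing function.

```python
import ast

BRANCH_NODES = (
    ast.If,
    ast.For,
    ast.AsyncFor,
    ast.While,
    ast.ExceptHandler,
    ast.IfExp,
)


def approximate_complexity(source: str) -> dict[str, int]:
    """Estimate cyclomatic complexity per function as 1 + branch points."""
    scores = {}
    for func in ast.walk(ast.parse(source)):
        if not isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        score = 1
        for node in ast.walk(func):
            if isinstance(node, BRANCH_NODES):
                score += 1
            elif isinstance(node, ast.BoolOp):
                # "a and b and c" adds two extra paths.
                score += len(node.values) - 1
            elif isinstance(node, ast.comprehension):
                score += 1 + len(node.ifs)
        scores[func.name] = score
    return scores


example = """
def classify(n):
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    for d in range(2, n):
        if n % d == 0 and d > 1:
            return "composite"
    return "other"
"""
print(approximate_complexity(example))
```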

Any other ideas for how the inline comments, module docstring, function docstring, documentation, tests and code can be out of harmony with one another?

Any other ideas for how the code can be out of harmony with itself (code smells, anti-patterns etc.)?

cs-library-and-approach-recommender

Write a language chain to analyse the libraries used alongside the approach taken.
If appropriate for the context, suggest an alternative library and write a comment that describes the abstractions that library uses, with a short tl;dr of how it can solve your problem more effectively.

Testset:

a file that uses pandas, takes a long time to import, and could benefit from polars.
a file with floundering datetime handling that would benefit from e.g. pendulum.
a file using urllib rather than requests.
a file writing with csv.writer rather than using e.g. pandas for dataframe IO.
the tool doesn't recommend left-pad or other very small libraries.
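
The first half of this hook, finding which libraries a file uses, does not need a language model. A minimal sketch follows; the `SUGGESTIONS` table is a hypothetical lookup built only from the test cases above, and deciding whether a suggestion is "appropriate for the context" is left to the chain.

```python
import ast

# Hypothetical lookup table taken from the test cases above.
SUGGESTIONS = {
    "urllib": "requests",
    "csv": "pandas",
    "pandas": "polars",
    "datetime": "pendulum",
}


def imported_modules(source: str) -> set[str]:
    """Collect the top-level package names imported by a source file."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules.add(node.module.split(".")[0])
    return modules


def suggest_alternatives(source: str) -> dict[str, str]:
    """Map each imported module to a candidate alternative, if one is listed."""
    return {
        name: SUGGESTIONS[name]
        for name in sorted(imported_modules(source))
        if name in SUGGESTIONS
    }


example = "import urllib.request\nfrom csv import writer\nimport json\n"
print(suggest_alternatives(example))
```

Keeping the table explicit also makes the left-pad rule easy to enforce: a library is never recommended unless someone has deliberately added it.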

modify the hooks for any outstanding testcases in senseless-python repo

integrate all the commonsense-hooks

cs-holistic

"Considering this entire codebase holistically, and everything you have seen in your hidden .language_model_scratchpad file, write code to implement any recommended changes to make everything a bit more sensible and increase harmony between code, tests and text.

Be sure to consider inline comments (ignoring pragmas), semantic variable names, clean modular interfaces, documentation in restructured text files, readme, contributing, module docstrings and function docstrings.

How could this whole codebase become more readable, more performant, more professional, more Pythonic, or otherwise a better hypermodern LLM-augmented Python codebase?"

cs-visualiser

Provide a code example for a sensible vision of this dataset. Be sure to consider data storytelling principles in creating an effective visualisation, focusing on the story told with the setup, conflict and resolution.

Testset:
A Python file reimplementing the first 7 minutes of Jake Vanderplas's talk, with the senseless dataviz.

cs-interface

Imagine you are a user testing the code generated by running this file. Which parts of the interface might be too complicated or contain dark patterns? Where in the user interface are we breaking the laws of good UI and UX?

Testset:
A CLI tool .py file built with e.g. Typer, for now.

Out of Scope:
Flask app.

cs-documentation-generator

Write a language chain that can summarise the interface of a file and write restructured text files to generate documentation.
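
The structural half of this task (listing the public interface and emitting reStructuredText) can be sketched without a model. The example below assumes Sphinx's `.. function::` directive as the output format and only handles top-level public functions; the function name `interface_to_rst` is hypothetical, and a chain would be expected to write better summaries than the first docstring line.

```python
import ast


def interface_to_rst(source: str, module_name: str) -> str:
    """Render the public top-level functions of a module as reStructuredText."""
    lines = [module_name, "=" * len(module_name), ""]
    for node in ast.parse(source).body:
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        if node.name.startswith("_"):
            continue  # treat leading-underscore names as private
        params = ", ".join(arg.arg for arg in node.args.args)
        docstring = ast.get_docstring(node) or "Undocumented."
        summary = docstring.splitlines()[0]
        lines += [f".. function:: {node.name}({params})", "", f"   {summary}", ""]
    return "\n".join(lines)


example = '''
def greet(name):
    """Return a greeting for name."""
    return "Hello, " + name


def _helper():
    pass
'''
rst = interface_to_rst(example, "demo")
print(rst)
```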

Out of scope: analysing the git diff between this commit's documentation and the previous commit's documentation to solve merge commits.
