Giter Club home page Giter Club logo

Comments (8)

thebjorn avatar thebjorn commented on August 23, 2024 2

If I'm misusing the term,

No, you're using it 100% correctly.

Pydeps started as a tool for helping us move from a repo/package containing all our code, to many smaller repos/packages (it's an improvement, but brings its own set of challenges). I don't think there is any easy solution to representing relationships in a 300+ Kloc code base in 2d., but manually reducing the graph can still give you useful insights...

I usually follow a workflow that looks something like this (using django-cms - ~50KLOC) as an example:
pydeps cms --cluster (had to scale it down due to size):

cms1

then I look at the top and the bottom of the graph and remove nodes with high in/out degrees (since they're significantly constraining the graph display algorithm that dot uses), e.g. a first pass gives:
pydeps cms -Tpng --show-dot --cluster -x django* sekizai* south* cms.test*

cms2

..and repeating until I come to a graph that's readable and interesting:
pydeps cms --show-dot --cluster -x django* sekizai* south* cms.test* cms.exceptions* cms.constants* cms.publisher.query* cms.utils.conf* cms.utils.i18n* cms.admin.pageadmin* cms.api* -xx cms.admin

cms

We were having increasing issues with developers unintentionally introducing circular imports, so we look for arrows going the "wrong" way (i.e. up). During the extract-to-individual projects phase we'd use nodes that clustered together in the graph as an indication that they should probably move as a unit.

from pydeps.

thebjorn avatar thebjorn commented on August 23, 2024 1

That does make sense and it might not be the same as #39, which is about limiting the graph based by cutting the depth of the directory tree , while this seems (iiuc) to be more about ways to globally simplify a ginormous graph. Both seem useful, #39 is probably easier to implement...

When you say "piles of packages", are those the kind of packages that have setup.py files (and are installed in develop mode)? We have some internal tools which look at inter-package dependencies that are a bit rough but should probably become part of pydeps (we have 130+ packages and probably 300 Klocs -- it's been a little while since I counted).

from pydeps.

NeilGirdhar avatar NeilGirdhar commented on August 23, 2024 1

This would be really useful for me. In order to prevent circular imports, and to keep a logical program flow, it's nice to ensure that internal packages import each other as a DAG. This is despite only requiring module imports to form a DAG. So, I'd like a way to collect modules into their respective packages and to make a graph of their dependencies.

@thebjorn Your last graph looks really nice and close to what I'd like to have, I think.

from pydeps.

miketheman avatar miketheman commented on August 23, 2024

It also looks like #39 may be similar as well - sorry for any duplication or noise!

from pydeps.

miketheman avatar miketheman commented on August 23, 2024

When you say "piles of packages", are those the kind of packages that have setup.py files (and are installed in develop mode)?

No - I'm using that as a concept to describe subdirectories of modules within the codebase that are imported by other modules - currently does lead to a ginormous graph. 😄 If I'm misusing the term, happy to adapt.

from pydeps.

thebjorn avatar thebjorn commented on August 23, 2024

Just as a data point... If we take the first django-cms graph and replace all nodes with their prefix (i.e. replace all a.b.c.d with a.b) and remove duplicates, we get this:
cms1b

The nodes here would be similar to package clusters (without the internal arrows).

The code I used (this might be what a solution to #39 would look like):

from __future__ import print_function
import re

def shorten(name):
    return '_'.join(name.strip().split('_')[:2])

def attrs(fmt):
    return dict(kv.split('=') for kv in fmt.strip()[:-2].split(','))

def attrs2fmt(attrs):
    return '[%s];' % ','.join('%s=%s' % (k, v) for k, v in attrs.items())

def main(fname):
    txt = open(fname).read()

    lines = txt.split('\n')
    header = lines[:6]
    body = lines[6:-3]
    nodes = [line for line in body if '->' not in line]

    nodz = {}
    for node in nodes:
        name, fmt = node.split('[')
        sname = shorten(name)
        if sname in nodz:
            continue
        nodz[sname] = attrs(fmt)

    rules = [line for line in body if '->' in line]

    rulz = {}
    used_nodes = set()
    for rule in rules:
        arrow, fmt = rule.split('[')

        a, _, b = arrow.split()
        a = shorten(a)
        b = shorten(b)

        if (a, b) in rulz:
            continue
        rulz[(a, b)] = attrs(fmt)

        used_nodes.add(a)
        used_nodes.add(b)

    with open('simplified-' + fname, 'w') as fp:
        fp.write('\n'.join(header))
        for n in used_nodes:
            a = nodz[n]
            a['label'] = '"%s"' % n
            print('    {} {}'.format(n, attrs2fmt(a)), file=fp)
        for (a, b), fmt in rulz.items():
            print('    {} -> {} {}'.format(a, b, attrs2fmt(fmt)), file=fp)
        print('}', file=fp)
             
    
if __name__ == "__main__":
    main('cms1.dot')

from pydeps.

miketheman avatar miketheman commented on August 23, 2024

That seems like it might be a step in the right direction - although controlling the depth might be useful.
Are you considering how this might make it into mainline?

from pydeps.

NeilGirdhar avatar NeilGirdhar commented on August 23, 2024

Here's my modified script to support graphing packages rather than modules.

#!/usr/bin/env python

import subprocess
from pathlib import Path
from typing import Dict, List, Mapping, Tuple
import typer


def find_modules(path: Path,
                 path_components: List[str]) -> Mapping[str, List[str]]:
    modules: Dict[str, List[str]] = {}
    ruined_module = '_'.join(path_components)
    if path.is_dir():
        modules[ruined_module] = (path_components[1:]
                                  if len(path_components) > 1
                                  else path_components)
        for sub_path in path.iterdir():
            modules.update(find_modules(sub_path, path_components + [sub_path.stem]))
    elif path.is_file() and path.suffix == '.py':
        if path.stem != '__init__':
            modules[ruined_module] = (path_components[1:-1]
                                      if len(path_components) > 2
                                      else path_components[:-1])
    return modules


def shorten(name: str,
            modules: Mapping[str, List[str]]) -> str:
    retval = '•'.join(modules[name.strip()])
    if retval in ['graph', 'edge']:
        return f'{retval}_'
    return retval


def attrs(fmt: str) -> Dict[str, str]:
    return dict(kv.split('=') for kv in fmt.strip()[:-2].split(','))  # type: ignore


def attrs2fmt(attr_map: Mapping[str, str]) -> str:
    return '[%s];' % ','.join('%s=%s' % (k, v) for k, v in attr_map.items())


def main(base_name: str) -> None:
    modules = find_modules(Path(base_name), [base_name])

    cp = subprocess.run(['pydeps', base_name, '--max-bacon=1', '--show-dot', '--no-output'],
                        stdout=subprocess.PIPE, text=True, check=True)
    lines = cp.stdout.splitlines()
    header = [line
              for line in lines[:6]
              if 'concentrate' not in line and line != '']
    body = lines[6:-3]
    nodes = [line for line in body if '->' not in line if line]

    node_mapping: Dict[str, Dict[str, str]] = {}
    for node in nodes:
        name, fmt = node.split('[')
        sname = shorten(name, modules)
        if sname in node_mapping:
            continue
        node_mapping[sname] = attrs(fmt)

    rules = [line for line in body if '->' in line]

    rule_mapping: Dict[Tuple[str, str], Dict[str, str]] = {}
    used_nodes = set()
    for rule in rules:
        arrow, fmt = rule.split('[')

        a, _, b = arrow.split()
        a = shorten(a, modules)
        b = shorten(b, modules)

        if (a, b) in rule_mapping:
            continue
        if a == b:
            continue
        if b == base_name:
            continue
        rule_mapping[(a, b)] = attrs(fmt)

        used_nodes.add(a)
        used_nodes.add(b)

    with open(f'uml/{base_name}.dot', 'w') as fp:
        fp.write('\n'.join(header))
        for n in used_nodes:
            some_dict: Dict[str, str] = node_mapping[n]
            some_dict['label'] = '"%s"' % n
            print('    {} {}'.format(n, attrs2fmt(some_dict)), file=fp)
        for (a, b), fmt_dict in rule_mapping.items():
            print('    {} -> {} {}'.format(a, b, attrs2fmt(fmt_dict)), file=fp)
        print('}', file=fp)

    subprocess.run(['dot', '-Tpng', f'uml/{base_name}.dot', '-o', f'uml/{base_name}.png'])


if __name__ == "__main__":
    typer.run(main)

from pydeps.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.