utensil's People

Contributors

hychou0515, kany102030

utensil's Issues

[BE] POST /flow/job

Start a flow given a flow URI.
Accepted request format:
{
flow_id: 'xxx'
}

response

code: 201
location: job_id

code: 404
flow_id not found
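
The request/response contract above could be sketched as a framework-independent handler. This is only an illustration: `post_flow_job`, the in-memory `flows` and `jobs` dictionaries, and the use of a random hex string as the job ID are all assumptions, not part of the project.

```python
import uuid


def post_flow_job(flows, jobs, body):
    """Handle POST /flow/job against in-memory stores (hypothetical helper).

    Returns a (status_code, headers, body) tuple.
    """
    flow_id = body.get("flow_id")
    if flow_id not in flows:
        # Unknown flow: mirror the 404 case of the contract.
        return 404, {}, {"error": "flow_id not found"}
    job_id = uuid.uuid4().hex  # assumption: any unique string works as a job ID
    jobs[job_id] = {"flow_id": flow_id, "state": "created"}
    # 201 Created, with the new job ID in the Location header.
    return 201, {"Location": job_id}, {}
```

A real backend would wrap this in whatever web framework the project chooses and start the flow after registering the job.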

[ENH] using spawn with process pool instead of fork

Instead of using multiprocessing's default spawn start method, we currently use fork to work around the issue described in this SO post.

# what we are doing now
from multiprocessing import set_start_method
from sys import platform

if platform == 'darwin':
    set_start_method("fork")

However, using fork may cause memory leaks and segmentation faults ( #21 ). Using a process pool with spawn seems preferable, because then we don’t need fork at all.
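
A minimal sketch of the proposed direction, assuming the work can be expressed as a picklable function mapped over a list of inputs. The helper name `run_in_spawn_pool` and its parameters are hypothetical; `multiprocessing.get_context("spawn")` and its `Pool` are standard library calls.

```python
import multiprocessing


def run_in_spawn_pool(func, items, workers=2):
    """Map func over items using a pool of spawn-started worker processes.

    func must be picklable (for example, a module-level function), because
    spawn starts fresh interpreters instead of copying the parent process.
    """
    ctx = multiprocessing.get_context("spawn")  # does not change the global start method
    with ctx.Pool(processes=workers) as pool:
        return pool.map(func, items)
```

With spawn, the call should be made from under an `if __name__ == "__main__":` guard so that worker processes do not re-run it when they import the main module.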

Add an action button to run a given flow

Add a play button at the top menu bar to run the given flow and get the return value.

A play icon may be suitable.

A minimal version needs to fulfill:

  • #32
    Send a POST request to /flow/job and get a job ID from the backend.

  • #33
    1. Periodically send a GET request to /flow/job/{job_id}/status to check whether the job has finished (maybe one request every 5 seconds?).
    2. Once it is done, send a GET request to /flow/job/{job_id}/result to get the final result, and show the result on the web page.

  • #31
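
The polling steps above can be sketched independently of any HTTP library. Here `get_json` is a hypothetical callable that takes a path and returns the decoded JSON body, and the `state` field with the value `finished` is assumed from the status endpoint issue; the real front end would do the same loop in its own language.

```python
import time


def wait_for_result(get_json, job_id, interval=5.0, sleep=time.sleep):
    """Poll the job status until it is finished, then fetch the result.

    get_json: hypothetical callable, path -> decoded JSON dict.
    interval: seconds to wait between status requests.
    """
    while True:
        status = get_json(f"/flow/job/{job_id}/status")
        if status["state"] == "finished":
            break
        sleep(interval)  # wait before asking again
    return get_json(f"/flow/job/{job_id}/result")
```

Passing `sleep` as a parameter keeps the loop testable without real waiting. A production version would probably also want a timeout and error handling.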

[ENH] Random Search

Random search is often considered a better hyperparameter search method than grid search.
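
As a rough illustration of the idea (not the project's seeder design), random search samples each parameter independently from its range and keeps the best trial. The function name, the `bounds` format, and the choice to minimize are assumptions made for this sketch.

```python
import random


def random_search(objective, bounds, n_trials=20, seed=0):
    """Return (best_score, best_params) after n_trials uniform samples.

    bounds maps a parameter name to a (low, high) tuple; lower scores are better.
    """
    rng = random.Random(seed)  # local generator, so results are reproducible
    best = None
    for _ in range(n_trials):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
        score = objective(params)
        if best is None or score < best[0]:
            best = (score, params)
    return best
```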

A simplest front-end framework as a loopflow viewer

A flow viewer that can be opened in a browser.

A browser is used for cross-platform support.

This viewer does not have to be powerful. A runnable, viewable version is fine.

Requirements:

  • Select a flow file to open
  • A simple graphical view

[DOC] loopflow should be documented

loopflow.py is not documented at all. It should be documented, preferably in fine-grained detail, because loopflow is a major feature of the project.

State should really be a state instead of a counter.

Currently, in every Seeder, state is just a counter instead of a real state.

A real state means that for two seeders A = Seeder(state=3) and B = Seeder(state=0), we expect

next(B)
next(B)
next(B)
assert next(A) == next(B)

  • SimpleUniformParametricSeeder
  • MoreUniformParametricSeeder
  • GridParametricSeeder
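
One way to satisfy the expectation above is to derive each value from the current state alone, never from how many times the object has been called. The class below is a toy sketch under that assumption; `ToySeeder` and its `seed` parameter are hypothetical and not one of the project's seeders.

```python
import random


class ToySeeder:
    """Hypothetical seeder whose state fully determines its next value."""

    def __init__(self, state=0, seed=0):
        self.state = state
        self.seed = seed

    def __iter__(self):
        return self

    def __next__(self):
        # The value depends only on (seed, state), not on call history.
        value = random.Random(f"{self.seed}:{self.state}").random()
        self.state += 1
        return value
```

With this design, a seeder created at `state=3` continues exactly where a seeder created at `state=0` would be after three calls to `next`.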

[ENH] Refactor loopflow

Rename

  • Node
    The structural building block of a flow. It has a long life cycle, listening to its parent nodes and spawning a process to do a job when its parents are ready.
    Won’t rename.
    Reason: Node is simple. Although Node does not say anything about its lifetime, I think it’s OK.

  • NodeProcess
    A function runner. It has a short life cycle, exiting (terminating itself) when it has finished its job.
    Renamed as: NodeWorker
    Reason: It is a worker for a node.

  • NodeProcessFunction
    A configurable function executed by a process. It is strongly related to business logic.
    Renamed as: NodeTask
    Reason: NodeTask may not be very suitable, because it sounds like it is a static task instead of a configurable one.

  • Trigger and Triggering
    If a node A triggers another node B, then A is a trigger of B and B is a triggering of A.
    This is VERY confusing.
    Renamed as: Caller and Callee
    Reason: the traditional caller and callee terminology captures who is calling whom, although here the call does not include the parameters passed to the function.

  • Parent and Child
    If a node A passes parameters to another node B, then A is a parent of B and B is a child of A.
    Better than Trigger, but still confusing. Parent and child are widely used in tree structures, and whether parameters are passed is not relevant to that traditional definition of parent and child.
    Renamed as: Sender and Receiver
    Reason: With the names sender and receiver, it is expected that something is transmitted between the two nodes.

  • Parent, the base class of Parent and Trigger
    Originally, Trigger was a derived class of Parent. After renaming, Trigger and Parent should share a separate base class.
    Proposal: This base class can be named Parent.
    Reason: Trigger and Parent are both still a kind of Parent. One is for triggering (calling) and the other is for message passing.

  • NodeProcessMeta
    Information shared between Node and NodeProcess.
    Renamed as: NodeMeta
    Reason: NodeMeta is simpler and still descriptive.

  • FLOW, FLOW_USE, FLOW_OR

    class _Operator(str, Enum):
        OR = "|"
        FLOW = "/"
        FLOW_OR = ","
        SUB = "."
        FLOW_USE = "="

    These should be given more descriptive names.

Member Refactor

  • Node.trigger

    def trigger(self, param, caller_name):
        # triggered by unexpected caller is
        # currently considered fine, e.g. SWITCHON
        if caller_name not in self.triggers.node_map:
            return
        for parent_key, parent_spec in self.triggers.node_map[caller_name]:
            c = param
            for attr in parent_spec.flow_condition:
                c = self._getitem(c, attr)
            if len(parent_spec.flows) == 0:
                self._tqs[parent_key].put(TriggerToken())
            for flow in parent_spec.flows:
                if str(c) == flow:
                    self._tqs[parent_key].put(TriggerToken())
                    break  # only need to put one

    It doesn’t make sense to pass in a value besides the caller.
    Yes, it does. The passed value is used to check flow conditions, which determine whether the node is really triggered.

  • ParentSpecifiers

    class ParentSpecifiers(tuple):
        @classmethod
        def parse(cls, list_str: Union[str, List[str]]):
            if isinstance(list_str, str):
                list_str = [list_str]
            return cls(
                tuple(
                    ParentSpecifier.parse(spec.strip())
                    for spec in s.split(_Operator.OR))
                for s in list_str)

    Since we already have ParentSpecifier, do we really need a tuple of it, the ParentSpecifiers?

Exception Class

  • Too many RuntimeErrors
    There shouldn’t be any. Replace them with more descriptive exception classes that carry more information.
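
A minimal sketch of what a more descriptive exception could look like. Both class names and the attributes `spec` and `reason` are hypothetical; they only illustrate carrying structured information instead of raising a bare RuntimeError.

```python
class LoopflowError(Exception):
    """Hypothetical base class for all loopflow errors."""


class SpecifierSyntaxError(LoopflowError):
    """Hypothetical error raised when a parent specifier cannot be parsed."""

    def __init__(self, spec, reason):
        super().__init__(f"invalid parent specifier {spec!r}: {reason}")
        # Keep the details as attributes so callers can inspect them.
        self.spec = spec
        self.reason = reason
```

Callers can then catch `LoopflowError` for anything raised by loopflow, or a specific subclass when they can handle that case.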

Tests

  • Add more tests
    Without them, the codecov check fails.

[WONTFIX] Random Search should support resolutions

Random search with resolution has no obvious benefit.

In grid search, we want resolution because we want a dimension to reach a finer grid faster, and resolution achieves this goal.

However, in random search (MoreUniformParametricSeeder), because we don't actually run all possible combinations, the way to reach a finer grid is to start from a larger state and skip the coarser grids.

[ENH] Connector

Includes connecting, connector, and connectorPool for the RDB.
Table field types should be designed.

[ENH] Grid Search

Grid search is good for low dimensional parameter search.

  • Resolution in Parametric
  • left/right ends in Parametric
  • Seeder for grid search
  • Grid search gets resolution from parametric and uses it to initialize the grid search seeder.
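
To make the checklist concrete, here is a small sketch of how a grid could be built from left/right ends and a resolution. The interpretation of resolution as the number of equal intervals per dimension is an assumption, as are the function names and the `(left, right, resolution)` tuple format.

```python
import itertools


def grid_axis(left, right, resolution):
    """Return resolution + 1 evenly spaced points from left to right, inclusive."""
    if resolution < 1:
        raise ValueError("resolution must be at least 1")
    step = (right - left) / resolution
    return [left + i * step for i in range(resolution + 1)]


def grid_search_points(parametrics):
    """Yield one dict per grid point.

    parametrics maps a parameter name to a (left, right, resolution) tuple.
    """
    names = list(parametrics)
    axes = [grid_axis(*parametrics[name]) for name in names]
    for combo in itertools.product(*axes):
        yield dict(zip(names, combo))
```

The number of grid points is the product of the per-dimension point counts, which is why grid search suits low-dimensional problems.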

[BE] GET /flow/job/{job_id}/status

response

code 200
{
state: created/processing/finished
start_time: 'YYYY-MM-DD HH:MI:SS' or undefined
finish_time: 'YYYY-MM-DD HH:MI:SS' or undefined
error: undefined or ['error1', 'error2', ...]
result: result in json
}

code 404
job_id not found
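
A possible shape for the handler logic, kept independent of any web framework. The in-memory `jobs` dictionary and the helper names are hypothetical, and `undefined` is represented here as `None` (which serializes to JSON `null`), which is an assumption about how the backend would encode it.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # renders as 'YYYY-MM-DD HH:MI:SS'


def _format_time(value):
    """Format a datetime, or pass None through for a time that is not set yet."""
    return value.strftime(TIME_FORMAT) if value is not None else None


def get_job_status(jobs, job_id):
    """Return (status_code, body) for GET /flow/job/{job_id}/status."""
    job = jobs.get(job_id)
    if job is None:
        return 404, {"error": "job_id not found"}
    body = {
        "state": job["state"],  # created / processing / finished
        "start_time": _format_time(job.get("start_time")),
        "finish_time": _format_time(job.get("finish_time")),
        "error": job.get("error"),
        "result": job.get("result"),
    }
    return 200, body
```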

Coding style changing to black, isort, then yapf

TL;DR

Format code in this order

  1. black
  2. isort
  3. yapf

Test coding style in

  • CI test
  • pre-commit

Description

The current coding style is yapf only.
However, yapf by itself is not strict (see the example below).
To be stricter while still sticking to yapf style, running black first and then yapf seems a good way to go.
To make this practical, try adding them to the CI test and the pre-commit check.

yapf is not strict

Both styles are considered fine by yapf:

'single quoted'
"double quoted"
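
The formatting order from the TL;DR could be scripted roughly as follows. This is a sketch, not the project's actual tooling: the helper name `format_code` and the choice to format the current directory are assumptions, and it presumes black, isort, and yapf are installed and on the PATH.

```python
import subprocess

# Formatters in the order proposed above; each later tool has the final say.
FORMAT_STEPS = [
    ["black", "."],
    ["isort", "."],
    ["yapf", "--in-place", "--recursive", "."],
]


def format_code(run=subprocess.run):
    """Run each formatter in order, stopping at the first failure."""
    for command in FORMAT_STEPS:
        run(command, check=True)  # check=True raises if a formatter exits non-zero
```

A CI job or pre-commit hook could run the same three commands and then fail if the working tree has changed.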
