Comments (11)

nigini avatar nigini commented on May 17, 2024

IMO:

  1. We should convert the code we already have in Bossa to PyBossa before redesigning the spreadsheet thing. I think the jQuery usage is a lot more concise than using GDocs. I think that migrating the code will be easy and will show us where PyBossa needs to improve.

  2. The 3rd point should be migrated to issue #25. @lucasmation can you copy it there so that the idea stays attributed to its author? I have some comments on it.

from pybossa.

rufuspollock avatar rufuspollock commented on May 17, 2024

I'm assigning this to myself. I will work on it for the next few days and at the hackday.

lucasmation avatar lucasmation commented on May 17, 2024

guys:

  • @nigini: OK, I moved the issue on parallel vs. sequential volunteer job assignments to #25, but let's keep it in mind.
    We will need sample pages at the hackfest. We should upload a book or a set of images with some tables (not the ones in the current Bossa version; those tables are not really tables).
  • @rgrp: Great! We will be online to help too.

rufuspollock avatar rufuspollock commented on May 17, 2024

Clearing the assignee as I'm not really taking this forward atm. @lucasmation are you able to contribute here?

nigini avatar nigini commented on May 17, 2024

Hi @rgrp. We are going to create and upload a first version of this app in a few hours.

lucasmation avatar lucasmation commented on May 17, 2024

OK, I'll assign it to myself and/or nigini.

teleyinex avatar teleyinex commented on May 17, 2024

Hi @lucasmation @nigini

I've been thinking a bit more about your idea of a wiki-based scheduler for PyBossa. After playing a bit with the idea, I have some doubts that you may be able to help me clear up :-)

Imagine the following scenario, I'm assuming that PyBossa does not lock tasks:

  • User1 requests a task
  • PyBossa returns a task (in this iteration User1 is the first collaborator, so there are no TaskRuns at all). TaskId: 1
    • User1 works on TaskId 1
  • User2 requests a task
  • PyBossa returns a task (there are no TaskRuns at all, as User1 has not yet submitted an answer for TaskId 1). TaskId: 1
    • User2 works on TaskId 1
  • User1 submits a TaskRun for TaskId 1
  • PyBossa saves the TaskRun for TaskId 1
  • User2 submits a TaskRun for TaskId 1
  • [issue] PyBossa should either raise an edit conflict or merge both TaskRuns into one. In both cases: how? Merging TaskRuns could be very cumbersome and complicated. Wikipedia has a specific article about this issue, so I guess we would be able to implement that solution. However, I think it could be overkill for PyBossa, as we would basically be building a wiki engine (this would imply several changes to the model, API, PyBossa.JS, etc.). We need to discuss this in more detail. In any case, how do you deal with these situations in your application?
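
To make the first scenario concrete, here is a minimal sketch of how an edit conflict could be detected without locking. It is not PyBossa code: `Task`, `EditConflict`, `request_task` and `submit_task_run` are hypothetical names, and the "version" is simply assumed to be the number of TaskRuns already saved when the task was handed out.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical in-memory stand-in for a task and its saved answers."""
    task_id: int
    task_runs: list = field(default_factory=list)


class EditConflict(Exception):
    """Raised when another answer was saved after this user fetched the task."""


def request_task(task):
    # Hand out the task together with the number of answers saved so far.
    return {"task_id": task.task_id, "base_version": len(task.task_runs)}


def submit_task_run(task, base_version, user, answer):
    # If someone else saved an answer in the meantime, the base is stale.
    if len(task.task_runs) != base_version:
        raise EditConflict(f"task {task.task_id} changed since version {base_version}")
    task.task_runs.append({"user": user, "answer": answer})


task = Task(task_id=1)
ticket1 = request_task(task)  # User1 fetches TaskId 1
ticket2 = request_task(task)  # User2 fetches the same task
submit_task_run(task, ticket1["base_version"], "User1", "answer A")
try:
    submit_task_run(task, ticket2["base_version"], "User2", "answer B")
    conflict_detected = False
except EditConflict:
    conflict_detected = True
```

This only detects the conflict; what to do with User2's answer (reject, merge, or keep both) is still the open question.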

Imagine the next scenario, I'm assuming that PyBossa locks tasks to users:

  • User1 requests a task
  • PyBossa returns a task (in this iteration User1 is the first collaborator, so there are no TaskRuns at all). TaskId: 1
    • User1 works on TaskId 1
  • PyBossa locks TaskId 1
  • User2 requests a task
  • PyBossa returns a task (there are no TaskRuns at all, but TaskId 1 is locked). TaskId: 2
  • PyBossa locks TaskId 2
    • User2 works on TaskId 2
  • User1 is AFK or suspends his laptop without submitting a TaskRun
  • User2 submits a TaskRun for TaskId 2
  • PyBossa saves the TaskRun for TaskId 2
  • [issue] PyBossa should time out the task for User1; otherwise TaskId 1 will never be sent again. Adding timeouts could be problematic, as it would involve a lot of queries to the DB. Additionally, how long do we wait? It could be difficult to find a value that neither waits too long nor times out too soon :-)
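
One way to picture the locking scenario is a lease that expires by itself, checked only when somebody requests a task, so no background job has to poll for timeouts. This is just a sketch under assumptions: `TaskLeases` and its methods are hypothetical, the lock table lives in memory rather than in the DB, and the 600-second timeout is an arbitrary illustrative value.

```python
import time


class TaskLeases:
    """Hypothetical lock table: task_id -> (user, expiry timestamp)."""

    def __init__(self, timeout_seconds, clock=time.monotonic):
        self.timeout_seconds = timeout_seconds
        self.clock = clock
        self.leases = {}

    def is_locked(self, task_id):
        lease = self.leases.get(task_id)
        # Expiry is checked lazily, only when a task is requested.
        return lease is not None and lease[1] > self.clock()

    def request_task(self, user, task_ids):
        for task_id in task_ids:
            if not self.is_locked(task_id):
                self.leases[task_id] = (user, self.clock() + self.timeout_seconds)
                return task_id
        return None  # every task is currently leased

    def submit(self, user, task_id):
        # Release the lease once its holder submits a TaskRun.
        lease = self.leases.get(task_id)
        if lease is not None and lease[0] == user:
            del self.leases[task_id]


now = [0.0]  # fake clock so the example does not have to sleep
leases = TaskLeases(timeout_seconds=600, clock=lambda: now[0])
first = leases.request_task("User1", [1, 2])   # User1 gets TaskId 1
second = leases.request_task("User2", [1, 2])  # TaskId 1 is leased, User2 gets TaskId 2
now[0] = 601.0                                 # User1 went AFK past the timeout
third = leases.request_task("User3", [1, 2])   # TaskId 1 is handed out again
```

The sketch does not answer the "how long do we wait?" question; it only shows that an expired lock can be ignored at request time instead of being cleaned up by extra queries.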

As you can see these items are challenging, at least for me, so to sum up:

  • How do you currently handle edit conflicts in your application?
    • If you merge them, or the humans do, will it be possible to analyze the data statistically at a later stage? With PyBossa's current model you get different, independent samples for the same task, while with the wiki approach many tasks will build on top of previous work, so the samples will be neither canonical (each could be a merge of previous work) nor independent.
  • Do you lock tasks in your application?

Thanks in advance for your feedback :-) I'm just trying to fully understand the implications of using this approach in PyBossa right now.

nigini avatar nigini commented on May 17, 2024

Great points @teleyinex.
Sadly, the honest answer is: we do not handle these problems. But this was the kind of issue I had in mind when I called you to talk (and then I disappeared) about PyBossa's task management strategies.

Locking looks to me like something that could be implemented as a second type of TaskRun strategy inside the PyBossa engine. It is simple and general, but I don't know how much load it would create to have the Presenter ping the server every X seconds (which I imagine would be the simplest way to implement it!?).

As for conflict management, I would not try to merge much. I would try to minimize conflict generation and resolve conflicts using the same user crowd: when two conflicting versions have been submitted, I would let both live until one of them receives an "OK status" from enough users. It is actually like a strategy used in best-effort computational grids: replicate and use the first result.

Another possibility is to feed the different accepted versions into another Task type for a voting process (this task layering is an idea we are already trying to use in App-TableTranscriber, and that's another thing to engineer).
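
The "let both versions live until one gets enough OKs" idea could look roughly like the sketch below. The function name `accepted_version`, the version labels and the threshold are all hypothetical; how votes are collected and stored is left out.

```python
from collections import Counter


def accepted_version(votes, required_oks):
    """Return the first version that reaches `required_oks` approvals, else None.

    `votes` is a chronological list of version identifiers, one entry per "OK".
    """
    counts = Counter()
    for version in votes:
        counts[version] += 1
        if counts[version] >= required_oks:
            return version
    return None


votes = ["v1", "v2", "v2", "v1", "v2"]
winner = accepted_version(votes, required_oks=3)     # "v2" reaches 3 OKs first
undecided = accepted_version(votes, required_oks=4)  # nobody has 4 OKs yet
```

As in the grid analogy, the first version to cross the threshold wins and the others can be discarded afterwards.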

This process will probably require PyBossa to have some metadata about the TaskRun and its versions! It would be another TaskRun strategy...

Does my idea make sense? Do you agree that we need a TaskRun strategy module inside PyBossa's core?

nigini avatar nigini commented on May 17, 2024

I was just talking with my colleague about our app-TT implementation and realized something I had missed: I already had to make changes in PyBossa to make things work. Basically, I had to rewrite "api.new_task" so that it looks up the latest edit of the Task, like this:

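import json
from flask import Response

def new_task(app_id):
    # TODO: make this better - for now just any old task ...
    task = model.new_task(app_id)
    # TODO: temporary code to prove an idea during the TT development.
    # Look up the most recent TaskRun for this task (falsy if there is none yet).
    task_run = model.last_task_run(task.id)
    data = task.dictize()
    if task_run:
        # Expose the previous answer so the next volunteer can build on it.
        data['last_answer'] = task_run.info['answer']
    return Response(json.dumps(data), mimetype="application/json")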

So I ask: what is the best way to make this idea available in PyBossa now, given that we depend on it for the "incremental strategy" to work? Note that this worked for us, and that the code is not meant to change the Task management available today but to add functionality to it. BUT again, it needs to be separated out as a strategy.
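
Separating this out "as a strategy" could be as simple as a lookup table of functions, so the existing behaviour stays the default and the incremental one is opt-in. The sketch below is only an illustration: the names `default_strategy`, `incremental_strategy` and `STRATEGIES`, the plain dicts used for tasks and TaskRuns, and the sample data are all assumptions, not PyBossa's API.

```python
def default_strategy(task, task_runs):
    """Return the task as-is: every volunteer starts from scratch."""
    return dict(task)


def incremental_strategy(task, task_runs):
    """Attach the most recent answer so the next volunteer builds on it."""
    data = dict(task)
    if task_runs:
        data["last_answer"] = task_runs[-1]["answer"]
    return data


# Hypothetical registry: an app would pick its strategy by name.
STRATEGIES = {"default": default_strategy, "incremental": incremental_strategy}


def new_task(task, task_runs, strategy="default"):
    return STRATEGIES[strategy](task, task_runs)


# Made-up sample data, oldest TaskRun first.
task = {"id": 1, "info": {"url": "page-1.png"}}
runs = [{"answer": "row 1"}, {"answer": "row 1; row 2"}]
plain = new_task(task, runs)
incremental = new_task(task, runs, strategy="incremental")
```

Adding locking or voting later would then mean registering one more function rather than rewriting "api.new_task" again.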

lucasmation avatar lucasmation commented on May 17, 2024

Daniel,
thanks for initiating this discussion. Those are indeed good points.
First of all, from what I understand, PyBossa's "core" would need to be changed in this way:

  1. You kind of need 3 indexes for each "task" a user performs:
    m = 1,...,M: indexes the image. M is the total number of images.
    i = 1,...,I: indexes the "task sequence". "I" is the number of parallel wiki sequences for each task. This has to be a parameter of the system.
    j = 1,...,J: indexes the "iteration". The last iteration J of a "task sequence" is reached when a user hits "table finished".
    You need to store the results from each (m,i,j) task, although in the end you are only interested in the last iteration J of each sequence.
  2. I had not considered the problem that Daniel mentioned about simultaneous users accessing the same task. But I agree completely with his solution of "freezing" the (m,i,j) task while someone is working on it.
  3. I don't know how difficult it would be to implement, but it would be great to have a "time-out" built into our application or the PyBossa core. If the user is inactive for too long, the system acts as if he had finished that iteration and saves it.
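
The (m,i,j) indexing from point 1 can be pictured with a small store keyed by image and sequence, where the position in the list plays the role of j. This is only a sketch: `SequenceStore` and its methods are hypothetical names, and everything is kept in memory.

```python
from collections import defaultdict


class SequenceStore:
    """Hypothetical store keyed by (m, i): image m, parallel sequence i."""

    def __init__(self):
        # (m, i) -> list of answers; iteration j is the 1-based list position.
        self.iterations = defaultdict(list)

    def save(self, m, i, answer):
        self.iterations[(m, i)].append(answer)
        return len(self.iterations[(m, i)])  # j of the iteration just stored

    def final_answers(self, m):
        """Last iteration J of every sequence recorded for image m."""
        return {
            i: runs[-1]
            for (image, i), runs in sorted(self.iterations.items())
            if image == m
        }


store = SequenceStore()
store.save(m=1, i=1, answer="draft")
j = store.save(m=1, i=1, answer="table finished")
store.save(m=1, i=2, answer="other draft")
finals = store.final_answers(1)
```

Every iteration is kept, but `final_answers` returns only the last one per sequence, which is the part that matters in the end.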

regards
Lucas

teleyinex avatar teleyinex commented on May 17, 2024

This issue is no longer needed, as there is now an app in PyBossa that covers it.
