
cue-queue's Introduction

Hi, I'm Eva 👋

I am currently a PhD student at the University of Washington Information School working in the UW DataLab. My research involves the collection and analysis of large-scale social and behavioral data, usually from municipal or public sources. I create and use novel machine learning methods for information extraction to answer questions related to conversation dynamics, engagement, misinformation, and understanding policy actions within public meetings.

(And sometimes I do some biology and microscopy data engineering work too.)

For more details about me, see my website.

Projects I Maintain:

Projects in Development:


cue-queue's Issues

What does this code do?

I can't find any instructions on how to use this. From looking at the docs, I assume it does not output segments of a text. Does it?

Install Error

Describe the Bug

$ pip install git+https://github.com/JacksonMaxfield/cue-queue.git
Collecting git+https://github.com/JacksonMaxfield/cue-queue.git
  Cloning https://github.com/JacksonMaxfield/cue-queue.git to /tmp/pip-req-build-zzf9yt5l
  Running command git clone -q https://github.com/JacksonMaxfield/cue-queue.git /tmp/pip-req-build-zzf9yt5l
  Resolved https://github.com/JacksonMaxfield/cue-queue.git to commit b10d94759c9a8dd6cafd714297f66cc52a4cd571
ERROR: Could not find a version that satisfies the requirement cdp-backend>=3.0.0.dev18 (from cue-queue) (from versions: none)
ERROR: No matching distribution found for cdp-backend>=3.0.0.dev18

Expected Behavior

Install without error

Reproduction

Steps to reproduce the behavior and/or a minimal example that exhibits the behavior.

Environment

Any additional information about your environment.

  • OS Version: Ubuntu 20.04

Try align then segment

Building off of: https://aclanthology.org/2020.lrec-1.829.pdf

Compute the semantic similarity between each minutes item and each sentence in the transcript. Then use the trained segmentation classifier to mark which sentences are topic breaks. For each minutes item, find the segmentation markings nearest to its most similar sentence.

i.e.

part 1:

s1, s2, s3, sn...
     ^ most similar to t1

part 2:

s1, s2, s3, sn...
     ^ most similar to t1
^a       ^b

where a and b are delimiter sentences (topic breaks)
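
A minimal sketch of the two parts above, assuming the similarity scores and the break predictions are already available. The function name `segment_for_item` and its inputs are hypothetical, and no particular embedding model or classifier is implied.

```python
from bisect import bisect_right


def segment_for_item(similarities, is_break):
    """Return (a, anchor, b) sentence indices for one minutes item.

    similarities: hypothetical per-sentence similarity scores for the item.
    is_break: hypothetical per-sentence classifier output (True = topic break).
    """
    # Part 1: the anchor is the sentence most similar to the minutes item.
    anchor = max(range(len(similarities)), key=similarities.__getitem__)

    # Part 2: find the nearest predicted breaks on either side of the anchor.
    breaks = [i for i, flag in enumerate(is_break) if flag]
    pos = bisect_right(breaks, anchor)
    # Fall back to the transcript boundaries when no break exists on a side.
    a = breaks[pos - 1] if pos > 0 else 0
    b = breaks[pos] if pos < len(breaks) else len(is_break) - 1
    return a, anchor, b


scores = [0.1, 0.2, 0.9, 0.3, 0.1, 0.2]
flags = [True, False, False, False, True, False]
print(segment_for_item(scores, flags))  # (0, 2, 4)
```

Here a break that falls exactly on the anchor sentence counts as the left delimiter; whether that is the right convention is an open choice.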

Sequence alignment using classified cue sentences / blocks as signal

Topic and Cue Alignment

Build a classifier that labels each block as either a cue block or a discussion block.

Test different block sizes (moving window): 1, 2, 3, 4, 5, 10 sentences, etc.
After finding the block size whose classifier performs best, use that trained classifier to generate a signal that is 0 for discussion blocks and 1 for cue blocks.
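
A rough sketch of the windowing and signal generation, assuming overlapping windows with a stride of one sentence. `toy_is_cue_block` is a hypothetical keyword rule standing in for the trained classifier, which does not exist yet.

```python
def sliding_blocks(sentences, block_size):
    """Split sentences into overlapping blocks (stride 1, an assumption)."""
    if block_size < 1:
        raise ValueError("block_size must be at least 1")
    last_start = max(len(sentences) - block_size, 0)
    return [sentences[i:i + block_size] for i in range(last_start + 1)]


def cue_signal(blocks, is_cue_block):
    """Map each block to 1 (cue) or 0 (discussion) using the given predicate."""
    return [1 if is_cue_block(block) else 0 for block in blocks]


def toy_is_cue_block(block):
    # Hypothetical stand-in for the trained classifier: a keyword match only.
    return any("agenda item" in sentence.lower() for sentence in block)


sentences = [
    "Next is agenda item one.",
    "I support this proposal.",
    "I have concerns about cost.",
    "That concludes agenda item one.",
]
blocks = sliding_blocks(sentences, block_size=1)
print(cue_signal(blocks, toy_is_cue_block))  # [1, 0, 0, 1]
```

Trying other block sizes only requires changing `block_size`; the resulting signal has one entry per window.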

So a transcript's generated sequence from the classifier may look something like the bottom sequence in the image: (1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1).

The top sequence is created by assuming that there will always be an "intro cue", some discussion, and then an "outro cue". So generate this sequence as (1, 0, 1) * M, where M is the number of minutes items. For example, for three minutes items the generated sequence is (1, 0, 1, 1, 0, 1, 1, 0, 1).

Finally, perform dynamic time warping / sequence alignment on these two sequences to find the best path.
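
A minimal sketch of this alignment step using a plain dynamic time warping recurrence with absolute difference as the local cost. The function names are hypothetical, and a dedicated DTW library or a different step pattern may turn out to be a better fit.

```python
def template_sequence(n_items):
    """Build the (1, 0, 1) * M template for M minutes items."""
    return [1, 0, 1] * n_items


def dtw_path(template, signal):
    """Return (total_cost, path) aligning template to signal with classic DTW."""
    n, m = len(template), len(signal)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(template[i - 1] - signal[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1]
            )

    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


signal = [1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1]
total, path = dtw_path(template_sequence(3), signal)
print(total)  # 0.0
```

Each `(i, j)` pair in `path` maps template position `i` to block `j` of the classifier signal, so the blocks matched to a minutes item's three template positions give a candidate segment for that item.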

Evaluate overall performance with Pk / WindowDiff.
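
For reference, a sketch of both metrics in a simplified windowed form over 0/1 boundary sequences (1 marks a boundary). The function names, the explicit window size `k`, and the windowed variant of Pk are assumptions here; the original Pk definition compares whether the two window endpoints fall in the same segment, so values may differ slightly from other implementations.

```python
def _check_inputs(reference, hypothesis, k):
    if len(reference) != len(hypothesis):
        raise ValueError("segmentations must have the same length")
    if not 1 <= k <= len(reference):
        raise ValueError("k must be between 1 and the sequence length")


def window_diff(reference, hypothesis, k):
    """Fraction of windows whose boundary counts differ."""
    _check_inputs(reference, hypothesis, k)
    n_windows = len(reference) - k + 1
    errors = sum(
        1
        for i in range(n_windows)
        if sum(reference[i:i + k]) != sum(hypothesis[i:i + k])
    )
    return errors / n_windows


def pk(reference, hypothesis, k):
    """Fraction of windows that disagree on whether any boundary is present."""
    _check_inputs(reference, hypothesis, k)
    n_windows = len(reference) - k + 1
    errors = sum(
        1
        for i in range(n_windows)
        if any(reference[i:i + k]) != any(hypothesis[i:i + k])
    )
    return errors / n_windows


ref = [0, 1, 0, 0, 1, 0, 0]
hyp = [0, 0, 1, 0, 1, 0, 0]
print(window_diff(ref, hyp, k=3))  # 0.2
print(pk(ref, hyp, k=3))           # 0.0
```

Lower is better for both metrics; a common heuristic is to set `k` to about half the average reference segment length.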
