
panopto-video-analytics's Introduction

Sauder Ops Canvas Site

for the public site

Run with: $ bundle exec jekyll serve

Requires Ruby and Bundler to be installed: Docs

panopto-video-analytics's People

Contributors

markoprodanovic


Forkers

alisonmyers

panopto-video-analytics's Issues

Update Nov 9

I’ve managed to create two data tables to help us answer questions about viewership in Panopto.

TABLE 1: UNIQUE VIEWERSHIP ACROSS CHUNKS

Example Output (COMM 290)
=> Unique Viewers: 272

[screenshot: Table 1 output for COMM 290]

Creating this table turned out to be more challenging than expected.

For transparency (and so we don’t forget how we calculate this), the steps are:

  • Make a call to the REST API for the session and pull out the duration
  • Use this duration to calculate the 5% chunk size (duration / 20) and break the timeline into chunks of that size
  • Make a call to the SOAP API for the session’s viewing data
  • Go through the data and filter it down to a unique list of user IDs
  • For each unique user, go through all of their records and calculate coverage (i.e. a list of (start, end) tuples showing which parts of the timeline they’ve watched). This was the challenging part!
  • Once coverage is calculated, compare each user’s coverage to every chunk and increase the unique_views tally if that user has watched 90% or more of the chunk (note this isn’t 90% of the /duration/ of the chunk; it’s 90% of unique coverage, i.e. 90% of that footage has been watched, so watching the same 5 seconds over and over doesn’t increase coverage)
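The steps above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation; `overlap`, `unique_views_per_chunk`, and the data shapes are assumed names:

```python
def overlap(a, b):
    # Length of the overlap between two (start, end) intervals.
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))

def unique_views_per_chunk(coverages, chunks, threshold=0.9):
    """coverages: {user_id: [(start, end), ...]} of non-overlapping
    watched ranges; chunks: [(start, end), ...] for the 20 timeline
    chunks. A user counts toward a chunk once their unique coverage
    spans at least `threshold` (90%) of that chunk."""
    tallies = [0] * len(chunks)
    for intervals in coverages.values():
        for i, (c_start, c_end) in enumerate(chunks):
            watched = sum(overlap(iv, (c_start, c_end)) for iv in intervals)
            if watched >= threshold * (c_end - c_start):
                tallies[i] += 1
    return tallies
```

Because coverage is already de-duplicated into non-overlapping ranges, rewatching the same few seconds doesn’t inflate the tally.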

⚠️ Note that the date range that is shown in the data can be adjusted by narrowing the start and end time in the SOAP call. So you can answer questions like, for example, “how many students watched this video between the first and second midterm?”

I tried playing around with including dates as values in the data, but they didn’t really make sense to me. Even if a chunk was completed, a user could have watched it over multiple sessions/days. Users can also rewatch chunks, so which date counts?

  • when they first watched some of a given chunk?
  • when they first watched all of a given chunk?
  • the most recent time they watched a chunk?
  • what if a student watches 78% of a chunk one day, 5% another day, and 7% another day? Which day did they watch it? 😰

To me it made the most sense to say: we can narrow dates by adjusting the call we make, so that the raw data we start from is already filtered to those dates (although I realize this could make it difficult to adjust via parameters in Tableau).

TABLE 2: CHUNK VIEWERSHIP PER USER

[screenshot: Table 2 example output]

  • every row represents a unique user
  • there is a column for total_view_time: the total amount of time the user spent watching the video
  • there is a column for each chunk, whose value is the TOTAL amount of time spent viewing that chunk (this is no longer unique viewership but total time)
  • from this we can total/average the chunk columns to get a sense of
    • which chunks have been watched for the most time
    • the average amount of time spent per chunk
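Those per-chunk totals and averages fall straight out of Table 2’s rows. A sketch, where the column names (e.g. 'chunk_0') are assumptions about the layout:

```python
def chunk_stats(rows):
    """rows: one dict per user, mapping chunk column name -> total
    seconds watched. Returns per-column totals and averages (the green
    and yellow rows in the spreadsheet)."""
    cols = rows[0].keys()
    totals = {c: sum(r[c] for r in rows) for c in cols}
    averages = {c: totals[c] / len(rows) for c in cols}
    return totals, averages
```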

For example, analyzing the data above in Excel...

averages in yellow
totals in green

[screenshot: per-chunk totals and averages in Excel]

As we can see, chunk 3 has notably higher total and average viewership than its neighbours. The first table shows it has high unique viewership as well.

...and indeed, going into the content of the video, this section is a walkthrough of a problem solution, so it makes sense that viewership would be more concentrated there.

Chunks 9-12 also look interesting: in the context of the video, it’s another walkthrough of a problem solution, followed by a steep dropoff in chunk 13, when the walkthrough finishes and all math disappears from the slides in favour of images.

Chunks 17-19 see a pretty major drop-off in both unique viewership and totals/averages; in the context of the video, this is when the instructor ends their slides and the main lecture.

In my limited experimenting with COMM290, I found that if a chunk has some combination of:

  1. a higher number of unique viewers for that chunk
  2. a higher total time a chunk has been watched
  3. a higher average time a chunk has been watched

… it tends to indicate a more-engaged-with part of the video (usually the solution to an example problem). Pretty cool!

Update Nov 4th

Nov 4th Update and Notes

I’ve been able to take the video timeline and break it down into “chunks”:

5% of total video
20 chunks

[
    {
        'session_id': '84cef7f7-f168-4a80-9a5a-ac100144db29',
        'chunk_index': 0,
        'chunk_start': 0.0000,
        'chunk_end': 1.5596,
        'chunk_id': '84cef7f7-f168-4a80-9a5a-ac100144db29-0'
    },
    {
        'session_id': '84cef7f7-f168-4a80-9a5a-ac100144db29',
        'chunk_index': 1,
        'chunk_start': 1.5596,
        'chunk_end': 3.1192,
        'chunk_id': '84cef7f7-f168-4a80-9a5a-ac100144db29-1'
    },
    ...
    ...
    ...
    {
        'session_id': '84cef7f7-f168-4a80-9a5a-ac100144db29',
        'chunk_index': 19,
        'chunk_start': 29.6324,
        'chunk_end': 31.1920,
        'chunk_id': '84cef7f7-f168-4a80-9a5a-ac100144db29-19'
    }
]
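A chunk list like the one above can be generated directly from the session’s duration. A sketch, assuming times in minutes as shown; `make_chunks` is a hypothetical name, not the actual code:

```python
def make_chunks(session_id, duration, n_chunks=20):
    # Each chunk covers 5% of the timeline: duration / 20.
    size = duration / n_chunks
    return [
        {
            'session_id': session_id,
            'chunk_index': i,
            'chunk_start': round(i * size, 4),
            'chunk_end': round((i + 1) * size, 4),
            'chunk_id': f'{session_id}-{i}',
        }
        for i in range(n_chunks)
    ]
```

Calling `make_chunks('84cef7f7-f168-4a80-9a5a-ac100144db29', 31.1920)` reproduces the list above.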

I’ve also been able to go through each user and compute a unique “coverage” list: a list of tuples representing viewing ranges across the timeline.

So even if someone’s viewing activity has a lot of entries like this:

[screenshot: a user’s raw viewing-activity entries]

Their coverage would look like this:

[(0.0, 31.305298999999977)]

i.e. viewed the entire video

note that times (in seconds) are sometimes off by a few milliseconds

Somebody who has gaps in their viewing would appear something like:
[(0.0, 3.316318), (10.190626, 31.289296999999983)]

i.e. watched everything except for between 3.31 and 10.19
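Computing that coverage list is essentially an interval merge: sort each user’s raw viewing ranges and fold overlapping ones together. A rough sketch (`merge_coverage` is a hypothetical name, not the actual implementation):

```python
def merge_coverage(ranges):
    """Collapse raw (start, end) viewing ranges, which may overlap or
    repeat, into a minimal sorted list of non-overlapping tuples."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

For example, `merge_coverage([(10.19, 31.29), (0.0, 2.0), (1.5, 3.316)])` collapses to `[(0.0, 3.316), (10.19, 31.29)]`.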

I’m still ironing out a few bugs, but I’m close. So now we know, for each user, which parts of the video they did and didn’t watch.

I’ll also be able to compare this to the chunks list, and see how many unique users viewed each chunk (that’s next!)

HOST error

A HOST keyword in the env file causes a problem (the input HOST isn’t read properly, since the shell typically already defines HOST). We should change the name for now.
Changing the env variable to THE_HOST and reading 'THE_HOST' in settings.py seemed to work.

When the error occurred, the host was being set incorrectly:
HTTPConnectionPool(host='x86_64-apple-darwin13.4.0', port=80)
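A minimal sketch of the workaround, assuming settings.py reads environment variables via os.environ; the host value here is a placeholder, not the real one:

```python
import os

# HOST is commonly pre-set by the shell (e.g. to the machine name), which
# is how 'x86_64-apple-darwin13.4.0' leaked in as the API host. Using a
# non-colliding name like THE_HOST avoids the collision.
os.environ['THE_HOST'] = 'example.hosted.panopto.com'  # placeholder value

# settings.py would then read:
PANOPTO_HOST = os.environ['THE_HOST']
```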

Re-title project

I called this panopto-data-api because, in an abstract sense, it’s an interface for Panopto data.
Is this terminology accurate? Misleading?

Low priority, just food for thought

Meetings/Notes

2020-09-22

  • notes added
  • label as meeting-minutes
  • link to project (as appropriate)
  • issues created from meeting -> "project-todos" where appropriate

Notes

  • Instructors interested in video analytics, need to be able to provide this as a service
  • Panopto is already part of workflow, so makes sense to start here
  • Will eventually
  • Panopto REST API
  • Panopto SOAP API
  • goal: AWS integration (don't want this to live in W/)

TODOs (can become issues as needed)

  • create documentation of current data - what it is, what it means
  • create documentation of how to get data - manually
  • create documentation of how to get data - possibilities using api

Revisit subfolder traversal

Revisit the implementation for sub-folder traversal and document it.
Think about how it should work given the requests we get.
