
dataforme's People

Contributors

dependabot[bot], shukerov

dataforme's Issues

Clock Chart needs to be interactive

It would be cool to have the clock chart be able to switch between sent & received messages. In other words, it needs to support multiple datasets.

Might be wise to extract that whole graph into its own npm module, or add it as part of frappe-charts!

General Project Discussion - I need your help!

In the README, I mention that I am a Javascript novice. I took a deeper dive into Javascript, and front-end development in order to make this project a reality. The code in its current state hasn't gone through any code reviews. I want to become a better developer, so any feedback would be greatly appreciated.

I think that the project is functional in its current state, but will be hard to maintain. In this issue I would like to address some of my concerns, and prompt some discussion that can possibly help this project evolve into something better. What I hope to discuss with the GitHub community is:

  1. Testing DataForMe
  2. Maintaining functional report pages (aka issues with the nature of the data analyzed)
  3. Security

Testing DataForMe

Currently the tests I have written are… ugh... well, in one word, BAD 😆.

I am using Jest as a test framework, but I am willing to completely pivot if there are better suggestions from the community.

Unit tests

The main thing to set up would be unit tests for the data crunching. There is a saying that bad data is worse than no data at all. To prevent bad data from happening, I plan to do the following:

  1. Generate fake .zip files.
  2. Load them in the respective WebserviceAnalyzer
  3. Unit test the hell out of them.
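
As a rough illustration of the plan above, here is a minimal sketch that skips the zip layer entirely and tests the data crunching directly. Both `makeFakeConversation` and `countMessagesBySender` are hypothetical names, and the message fields (`sender_name`, `content`, `timestamp_ms`) are an assumption about the export format, not something taken from the project's code.

```javascript
// Hypothetical stand-in for the data a WebserviceAnalyzer would read from a zip.
// Field names are assumed for illustration only.
function makeFakeConversation(participants, messagesPerPerson) {
  const messages = [];
  let timestamp = Date.UTC(2020, 0, 1);
  for (let i = 0; i < messagesPerPerson; i++) {
    for (const name of participants) {
      messages.push({ sender_name: name, content: `message ${i}`, timestamp_ms: timestamp });
      timestamp += 60 * 1000; // one minute between messages
    }
  }
  return { participants: participants.map((name) => ({ name })), messages };
}

// Hypothetical data-crunching function of the kind worth unit testing.
function countMessagesBySender(conversation) {
  const counts = {};
  for (const message of conversation.messages) {
    counts[message.sender_name] = (counts[message.sender_name] || 0) + 1;
  }
  return counts;
}

// In Jest, this check would sit inside test(...) and use expect(...).toEqual(...).
const fake = makeFakeConversation(['Ana', 'Bo'], 3);
console.log(countMessagesBySender(fake));
```

Because the fake data is generated with known counts, every expected value in a test can be derived from the generator's arguments instead of being hard-coded.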

The rendering of the reports won't be tested, but the data served from a WebserviceAnalyzer will be properly tested. Looking at my code now, it is definitely testable.

I feel like the way I applied OOP principles only kind of works, but I will truly find out more once I expand the current tests. Any suggestions or code review would be greatly appreciated.

These are the files that, in my opinion, should be tested first:

facebookAnalyzer.js
spotifyAnalyzer.js
tinderAnalyzer.js

Functional tests

Jest seemed like a proper choice to eventually get to functional tests. That being said I am not entirely sure that at the current stage of the project those are really needed. What are your thoughts on this?

The data problems

The source of the data that this project analyzes makes maintenance inherently hard. Personal data has basically no documentation. The data providers also make no guarantees that they will keep the data format the same. I have worked on this project on and off for about 7 months. Each of the three web services has changed its data structure at least once in that timeframe. The changes are always incremental and not too hard to patch; the challenge is patching them in a timely manner. Currently I have two approaches in mind for this problem:

  1. Automated scripts that periodically request data, and run it through basic unit tests (or unzip and diff files). Failures can be reported and patched. The problem with this approach is that many of the webservices require extra verification, like password inputs. The creation of reliable scripts might be very hard to achieve. Also, as we all know, unit tests can still allow for bugs to slip through the cracks.

  2. A more human approach would be to contact the teams responsible for the personal data provision. DataForMe is an open source project, so cooperation is definitely possible.
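
For the first approach, one possible alternative to diffing raw files is to diff the *shape* of the JSON (which keys exist and what type they hold), so that changes in the personal data itself don't trigger false alarms. The sketch below is only an illustration; `describeShape` and `diffShapes` are hypothetical helpers, and it assumes arrays are homogeneous enough that inspecting the first element is representative.

```javascript
// Collect a "path -> type" map for every key in a parsed JSON sample.
function describeShape(value, path = '$', out = {}) {
  if (Array.isArray(value)) {
    out[path] = 'array';
    // Assumption: only the first element is inspected.
    if (value.length > 0) describeShape(value[0], `${path}[]`, out);
  } else if (value !== null && typeof value === 'object') {
    out[path] = 'object';
    for (const key of Object.keys(value)) {
      describeShape(value[key], `${path}.${key}`, out);
    }
  } else {
    out[path] = value === null ? 'null' : typeof value;
  }
  return out;
}

// Compare a stored, known-good shape with the shape of a fresh export.
function diffShapes(expected, actual) {
  const missing = Object.keys(expected).filter((p) => !(p in actual));
  const added = Object.keys(actual).filter((p) => !(p in expected));
  const changed = Object.keys(expected).filter(
    (p) => p in actual && expected[p] !== actual[p]
  );
  return { missing, added, changed };
}

const known = describeShape({ messages: [{ sender_name: 'a', content: 'x' }] });
const fresh = describeShape({ messages: [{ sender: 'a', content: 1 }] });
console.log(diffShapes(known, fresh));
```

A non-empty `missing` or `changed` list would be the signal that an analyzer needs a patch.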

Security and data privacy concerns

Well, I am no expert on this whatsoever, and that makes it even more important to bring up here. The website is designed with the goal of having a minimal number of dependencies. Still, when you look at the dependency graph, the list ends up being not small at all. All of the dependencies used are well known and widely used, but as we know, that doesn't mean a whole lot.

Some of the questions that I have, and would love to discuss:

  1. What are some ways to check and maintain a safe node_modules folder?
  2. I have no web requests, apart from the initial loading of the reports page. Is that enough to eliminate Cross-site scripting and cross-site request forgery attacks?
  3. Do you think the user should be forced to disconnect from the internet before attaching their file in the browser?

Other more general security discussion

  1. Is personal data being analyzed in browser an inherently bad idea?
  2. Do you think that the average user is equipped to keep their data safe outside of the context of DataForMe? And as a follow-up, if not, should it be DataForMe's responsibility to educate users?

Add 404 page

Create a nice 404 page, and make sure all broken links take the user there.

Finish Facebook report

Facebook report currently only provides messaging statistics. Here is a giant list of things that need to be done:

General report

  • include the string that is your face
  • include number of face examples facebook has
  • relationship count
  • relationship status

Message report

  • average words per message
  • average sent number of messages per day (on days that you did message)
  • average received number of messages per day
  • total calls that you have initiated
  • total calls that you have received
  • average call time
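
A minimal sketch of how the first few message statistics could be computed. The function name `messageStats` is hypothetical, the message fields (`sender_name`, `content`, `timestamp_ms`) are an assumption about the export format, and "per day" is interpreted here as "per UTC calendar day with at least one message" for both sent and received.

```javascript
// Hypothetical helper; `me` is the name of the account owner as it appears in the data.
function messageStats(messages, me) {
  const sent = messages.filter((m) => m.sender_name === me);
  const received = messages.filter((m) => m.sender_name !== me);

  // Words are whitespace-separated tokens; missing content counts as zero words.
  const wordCount = (text) => (text || '').trim().split(/\s+/).filter(Boolean).length;
  const totalWords = messages.reduce((sum, m) => sum + wordCount(m.content), 0);

  // Number of distinct UTC calendar days that appear in a list of messages.
  const activeDays = (list) =>
    new Set(list.map((m) => new Date(m.timestamp_ms).toISOString().slice(0, 10))).size;
  const perActiveDay = (list) => (list.length === 0 ? 0 : list.length / activeDays(list));

  return {
    averageWordsPerMessage: messages.length === 0 ? 0 : totalWords / messages.length,
    averageSentPerActiveDay: perActiveDay(sent),
    averageReceivedPerActiveDay: perActiveDay(received),
  };
}
```

Grouping by UTC day keeps the result independent of the machine's timezone; a real report might prefer local days instead.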

Create search report

  • total number of searches could be moved down here
  • top 10 searched things
  • time of the day I search for things
  • number of searches throughout the years
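
The "top 10" and "time of the day" items boil down to two small aggregations. The sketch below is an assumption-laden illustration: `topSearches` and `searchesByHour` are hypothetical names, and each search is assumed to look like `{ query, timestamp_ms }`. The same hour-of-day bucketing could be reused for the post and reaction reports.

```javascript
// Count identical queries (case-insensitive) and return the most frequent ones.
function topSearches(searches, limit = 10) {
  const counts = new Map();
  for (const { query } of searches) {
    const key = query.trim().toLowerCase();
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts.entries()]
    // Highest count first; ties are broken alphabetically for stable output.
    .sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]))
    .slice(0, limit)
    .map(([query, count]) => ({ query, count }));
}

// Build a 24-slot histogram of searches by hour, in the viewer's local time.
function searchesByHour(searches) {
  const hours = new Array(24).fill(0);
  for (const { timestamp_ms } of searches) {
    hours[new Date(timestamp_ms).getHours()] += 1;
  }
  return hours;
}
```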

Post Report

  • total number of posts moved here?
  • time of the day I post things
  • number of posts throughout the years

Reaction Report

  • total number of reactions
  • reactions broken down by love, likes, and the other ones
  • number of reactions throughout the years
  • time of the day I usually react to stuff

Ad report

  • number of interests
  • giant list of interests
  • number of interactions with commercials

Note that this is just an MVP. There is a lot of interesting data and stats that can be deduced from it, but it is a decent start.

Update Spotify download instructions

There is an extra step added to the Spotify instructions that needs to be updated...

While you are at it make sure to record the steps to create these instructions, and set some conventions like:

  • Border size on the red rectangles.
  • Image sizes.
  • Others?

Update the wiki..

Filepicker change on valid file upload

Filepicker needs to change on valid file upload. There are also a few subtasks here:

  • It also needs to disable input when that happens, so you cannot run the reports twice in a row.
  • Needs to notify with an alert when the file is larger than ?GB (the maximum should be about 4GB)
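
One way to keep the size check and the "disable after a valid upload" rule testable is to put the decision in a pure function and leave the DOM wiring thin. This is only a sketch: `checkUpload` and `MAX_UPLOAD_BYTES` are hypothetical names, the 4GB figure comes from the note above, and the `.zip` extension check is an added assumption about what counts as a valid file.

```javascript
const MAX_UPLOAD_BYTES = 4 * 1024 ** 3; // about 4GB, the limit suggested above

// Accepts anything with `name` and `size`, such as a browser File object.
function checkUpload(file, maxBytes = MAX_UPLOAD_BYTES) {
  if (!file) return { ok: false, reason: 'No file selected.' };
  if (!/\.zip$/i.test(file.name)) return { ok: false, reason: 'Please choose a .zip file.' };
  if (file.size > maxBytes) {
    const limitGb = (maxBytes / 1024 ** 3).toFixed(1);
    return { ok: false, reason: `File is larger than ${limitGb}GB.` };
  }
  return { ok: true, reason: null };
}

// Browser wiring, with a hypothetical `input` element:
// input.addEventListener('change', () => {
//   const result = checkUpload(input.files[0]);
//   if (!result.ok) { alert(result.reason); return; }
//   input.disabled = true; // prevents running the reports twice in a row
// });
```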

Preview of report

Each report should have a preview button with fake data. I think this will incentivise people to actually be curious about their stats.

Broken tooltips

The tooltips are broken. Particularly bad on mobile or on a smaller screen.

This occurs when the tooltip text is too big to fit on the screen, the text wraps and the whole tooltip is shifted down.

Fix index page text content

The landing page has too much text.

  • Break it up with large headings that will make the content more digestible.
  • Add icons to headings to make the page more interesting to browse
  • Include data tracking disclaimers (Google Analytics, and ad-blocker turn-off request)

Fix the navigation bar

The decision to move the navigation bar to the right was a bad one....
From user testing I figured out that it needs to be brought back to the left. Also the button is confusing and people don't know they are supposed to press it. Check the list below for a list of things that need to be done before this issue is considered complete:

  • move navbar to the left
  • change from logo to hamburger
  • scale up the button on mobile

Set up pipeline for project

Travis or something that will also run tests and give stats for every push. Could be just GitHub Actions. Need to do some research...

Fix FOUC

  1. Fix the flash of unstyled content by using the mini-css-extract-plugin webpack plugin.
  2. Rethink what needs to be javascript generated and actually on the page...

Create Tinder Report

General

  • Name
  • Date Joined
  • Birthday
  • Education
  • Email
  • Phone
  • Photo Count
  • Location stackoverflow

Match report

  • Number of likes total
  • Number of left-swipes total
  • Number of matches total
  • Age Range
  • Likes to matches ratio
  • Likes to total people ratio
  • Likes vs left swipes pie
  • Matches vs likes pie
  • Heat graph matches by years and months

Usage report

  • Number of app opens
  • Heat graph usage by years and months

Approach report
Card view with

  • Match Id
  • Date
  • Total messages
  • First message

Spotify connected?

  • song popularity cards
  • Song name
  • Maybe image?

Instructions for Facebook data download

Make proper instructions describing how to download your Facebook data. Right now only one picture is proper. Make sure you also hide any sensitive data.

Resizing browser bug.

When resizing the browser from big to small, the graphs fail to adjust.
Not sure what the root of the issue is. I believe it could be a Frappe Chart issue, related to grid display.

Need to investigate more

Index page content is hidden

It is not clear that the user can scroll down on the index page. Some of the content remains hidden forever :(

Loading screen update

When a zip file is added and analysis starts, a loading screen shows up.
This loading screen needs an update very badly. Currently it has lots of leftover code and terrible styling.

Navbar prevents scrolling

The way the navbar is done prevents the user from scrolling when on top of the invisible div.

Need to rethink, and redo the way that navbar works

Facebook broke big files apart. Facebook report is in need of an update.

The Facebook zip file used to have all messages with a person under a singular file called 'messages_1.json'. Files would always have that name no matter how big they would get.

Now big files are broken into multiple files, where the number is incremented at the end. So the file messages_1.json now is messages_1.json, messages_2.json, etc.
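
A minimal sketch of how the split files could be collected back into one ordered list per conversation. `groupMessageFiles` is a hypothetical helper, and it assumes the parts of one conversation share a folder and follow the `messages_<n>.json` naming described above. Sorting is numeric, so `messages_10.json` comes after `messages_2.json`.

```javascript
// Group zip entry paths by conversation folder and order the parts numerically.
function groupMessageFiles(paths) {
  const groups = new Map();
  for (const path of paths) {
    const match = /^(.*\/)?messages_(\d+)\.json$/.exec(path);
    if (!match) continue; // not a message part (photos, other files)
    const folder = match[1] || '';
    if (!groups.has(folder)) groups.set(folder, []);
    groups.get(folder).push({ path, part: Number(match[2]) });
  }
  const result = {};
  for (const [folder, parts] of groups) {
    result[folder] = parts.sort((a, b) => a.part - b.part).map((p) => p.path);
  }
  return result;
}

console.log(groupMessageFiles([
  'inbox/ana/messages_10.json',
  'inbox/ana/messages_2.json',
  'inbox/ana/messages_1.json',
  'inbox/bo/messages_1.json',
]));
```

The analyzer can then read each list in order and concatenate the messages, so conversations that still have a single file go through the same code path.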

Redo footer

The footer needs some contact info, link to the site credits and link to the reddit subreddit.

That's the very minimum. Would be cool if you link some well-known data privacy organizations.

UNICODE character support

The Facebook report spits out unicode characters which I am having trouble processing properly. Need a way to read such unicode strings reliably.
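
The issue does not say what exactly goes wrong, so the following is a sketch under one assumption: that the text in the export is UTF-8 whose bytes were each written as a separate `\u00XX` escape, which is a commonly reported pattern with Facebook JSON exports. If that is the cause, re-reading each character as a byte and decoding the bytes as UTF-8 recovers the original text. `fixMojibake` is a hypothetical helper; `TextDecoder` is a standard API in browsers and Node.

```javascript
// Re-interpret a string whose characters are really UTF-8 bytes.
function fixMojibake(text) {
  const bytes = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code > 0xff) return text; // already real Unicode, leave untouched
    bytes[i] = code;
  }
  try {
    // `fatal: true` makes invalid UTF-8 throw instead of silently producing U+FFFD.
    return new TextDecoder('utf-8', { fatal: true }).decode(bytes);
  } catch (error) {
    return text; // not valid UTF-8, so the text was probably fine as it was
  }
}

console.log(fixMojibake('\u00c3\u00a9'));
```

The two guards make the function safe to apply to every string in a report: text that is already correct comes back unchanged.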

Create a credits page...

Make simple credits page where you can shoutout the people that helped you.

Add a donate button?

Chaining unzip callbacks in a better way

The way unzipping is handled in the callback loop is quite bad. For one, I frequently forget to update the loop count.

There is a repeating pattern where DRY can be applied:

  1. initializing a callback loop for the subprocess.
  2. getting a json file.
  3. unzipping the json file and chaining a callback to that.

Need to speak to Clements about best ways to chain such events.

The progress bar is also inaccurate right now. Complete the chaining with the mind that the progress bar needs to be fixed as well.

Problem broken down into three steps:

  • Fix chaining unzipping events
  • Handle missing files or broken callback loops (right now things just crash and burn)
  • Fix the progress bar (refactor the messaging FB report)
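
One possible direction for all three steps is to replace the hand-counted callback loop with promises. The sketch below is not the project's code: `loadJsonFiles` is a hypothetical helper, and `readText` stands for whatever function returns a promise of a file's text from the zip (an assumption, since the issue does not say which unzip library is used). The progress value is derived from the list length, so there is no loop count to keep in sync, and a missing or unparsable file is recorded instead of crashing the run.

```javascript
// Read and parse a list of JSON files one after another.
async function loadJsonFiles(names, readText, onProgress = () => {}) {
  const loaded = {};
  const missing = [];
  let done = 0;
  for (const name of names) {
    try {
      loaded[name] = JSON.parse(await readText(name));
    } catch (error) {
      missing.push(name); // missing or unparsable file: record it and keep going
    }
    done += 1;
    onProgress(done / names.length); // fraction between 0 and 1
  }
  return { loaded, missing };
}

// Usage with an in-memory stand-in for the zip archive.
const fakeZip = { 'a.json': '{"x": 1}' };
const readFromFakeZip = async (name) => {
  if (!(name in fakeZip)) throw new Error(`missing ${name}`);
  return fakeZip[name];
};
loadJsonFiles(['a.json', 'b.json'], readFromFakeZip, (p) => console.log('progress', p))
  .then((result) => console.log(result.missing));
```

With a promise-based unzip library, `readText` could be a one-line wrapper around its read call; with JSZip, for example, something like `(name) => zip.file(name).async('string')` would be a candidate.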
