headstart's Introduction

Head Start

Head Start is a web-based knowledge mapping software intended to give researchers a head start on their literature review (hence the name). It comes with a powerful backend that is capable of automatically producing knowledge maps from a variety of data, including text, metadata and references.

Getting Started

Client

To get started, clone this repository. Next, duplicate the file config.example.js in the root folder and rename it to config.js.

Set the skin property in the config to one of the following values to use the particular data integration skin:

  • "covis"
  • "triple"
  • "viper"

or leave it empty ("") for the default project website skin.

Make sure you have Node.js >= 14.18.1 and npm >= 8.1.1 installed (the easiest way is with nvm: nvm install 14.18.1), then run the following two commands to build the Headstart client:

npm install
npm run dev

We use webpack to build the client-side application. webpack is started in watch mode, which means that changes to files are tracked and the generated headstart.js is updated automatically.

Now you can run a local dev server:

npm start

Note: you can also set the skin in this step as an argument to the npm start command (e.g. npm start -- --env skin=triple).
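As a minimal sketch of how a skin passed on the command line could be reconciled with the one in config.js: the helper below is hypothetical (it is not taken from the Headstart build scripts) and assumes that the --env value should take precedence over the config value. The only facts taken from this README are the skin names and that an empty string selects the default skin.

```javascript
// Skins named in this README; "" selects the default project website skin.
const KNOWN_SKINS = ["", "covis", "triple", "viper"];

// Hypothetical helper (not part of Headstart): prefer the value passed via
// `--env skin=...`, fall back to the config.js value, then to the default "".
function resolveSkin(env = {}, config = {}) {
  let skin = "";
  if (env.skin !== undefined) {
    skin = env.skin;
  } else if (config.skin !== undefined) {
    skin = config.skin;
  }
  if (!KNOWN_SKINS.includes(skin)) {
    throw new Error(`Unknown skin "${skin}"; expected one of ${JSON.stringify(KNOWN_SKINS)}`);
  }
  return skin;
}
```

A webpack configuration that exports a function receives the --env values as its first argument, so a call such as resolveSkin(env, require("./config.js")) would be one place to use this.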

The browser will automatically open a new window with the example specified by the skin.

Alternatively, you can point your browser to one of the following addresses:

http://localhost:8080/project_website/base.html
http://localhost:8080/project_website/pubmed.html
http://localhost:8080/local_covis/
http://localhost:8080/local_triple/map.html
http://localhost:8080/local_triple/stream.html
http://localhost:8080/local_viper/

If everything has worked out, you should see the example visualization.

To run Headstart on a different server (e.g. Apache), you need to set the publicPath in config.js to the URL of the dist directory:

  • Dev: specify the full path including protocol, e.g. http://localhost/headstart/dist
  • Production: specify the full path excluding protocol, e.g. //example.org/headstart/dist
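The two publicPath forms above can be sketched as a small helper. This is an illustration only: buildPublicPath is a hypothetical function, and the shape of the config object (a plain object with skin and publicPath properties) is an assumption based on the property names mentioned in this README.

```javascript
// Hypothetical helper: build the publicPath value for config.js.
// Dev uses a full URL including the protocol; production uses a
// protocol-relative URL starting with "//".
function buildPublicPath(host, distPath, { production = false, protocol = "http" } = {}) {
  // Strip any protocol and trailing slashes from the host.
  const cleanHost = host.replace(/^(https?:)?\/\//, "").replace(/\/+$/, "");
  // Strip leading and trailing slashes from the path.
  const cleanPath = distPath.replace(/^\/+|\/+$/g, "");
  const prefix = production ? "//" : `${protocol}://`;
  return `${prefix}${cleanHost}/${cleanPath}`;
}

// Assumed shape of the config object; check config.example.js for the real one.
const exampleConfig = {
  skin: "triple",
  publicPath: buildPublicPath("localhost", "headstart/dist"),
};
```

With these inputs, exampleConfig.publicPath is "http://localhost/headstart/dist", and buildPublicPath("example.org", "headstart/dist", { production: true }) yields "//example.org/headstart/dist".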

Contributors

Maintainer: Peter Kraker

Authors: Maxi Schramm, Christopher Kittel, Jan Konstant, Asura Enkhbayar, Scott Chamberlain, Rainer Bachleitner, Yael Stein, Thomas Arrow, Mike Skaug, Philipp Weissensteiner, and the Open Knowledge Maps team

Features

  • Interactive, web-based knowledge maps based on D3.js, following Shneiderman's principle of "overview first, zoom and filter, then details-on-demand"
  • Synchronized list representation of documents complementing the knowledge map
  • Integrated PDF viewer and annotation tool, courtesy of Hypothes.is
  • Powerful server component written in PHP and R for the creation of knowledge maps, including algorithms for clustering, ordination and labelling
  • Connectors to a number of academic search engines through rOpenSci, including BASE, PubMed, PLOS and DOAJ
  • Persistence and versioning system based on SQLite

Showcases

  • Open Knowledge Maps: Creates a visualization on the fly based on a user's search in either BASE or PubMed.
  • VIPER - The Visual Project Explorer: Provides overviews of research projects indexed by OpenAIRE.
  • CRIS Vis: Enables the exploration of crowd-sourced research questions related to mental health.
  • Overview of Educational Technology: A working prototype for the field of educational technology based on co-readership.
  • OpenUP Dissemination Toolbox: A prototype showcasing an overview of innovative dissemination case studies.
  • Conference Navigator 3 [registration required]: An adaptation of Head Start for the conference scheduling system CN3. This version enables users to schedule papers directly from the visualization. Scheduled papers and recommended papers are highlighted.

Compatibility

The visualization has been successfully tested with Chrome, Firefox, Safari and Microsoft Edge. Unfortunately, Internet Explorer is not supported because it does not allow HTML to be inserted into a foreignObject.

Background

More information can be found in the following papers:

Kraker, P., Schramm, M., Kittel, C., Chamberlain, S., & Arrow, T. (2018). VIPER: The Visual Project Explorer. Zenodo. doi:10.5281/zenodo.2587129

Kraker, P., Kittel, C., & Enkhbayar, A. (2016). Open Knowledge Maps: Creating a Visual Interface to the World’s Scientific Knowledge Based on Natural Language Processing. 027.7 Journal for Library Culture, 4(2), 98–103. doi:10.12685/027.7-4-2-157

Kraker, P., Schlögl, C., Jack, K., & Lindstaedt, S. (2015). Visualization of Co-Readership Patterns from an Online Reference Management System. Journal of Informetrics, 9(1), 169–182. doi:10.1016/j.joi.2014.12.003

Kraker, P., Weißensteiner, P., & Brusilovsky, P. (2014). Altmetrics-based Visualizations Depicting the Evolution of a Knowledge Domain. In 19th International Conference on Science and Technology Indicators (pp. 330–333).

Kraker, P., Körner, C., Jack, K., & Granitzer, M. (2012). Harnessing User Library Statistics for Research Evaluation and Knowledge Domain Visualization. Proceedings of the 21st International Conference Companion on World Wide Web (pp. 1017–1024). Lyon: ACM. doi:10.1145/2187980.2188236

License

Head Start is licensed under MIT.

Citation

If you use Head Start in your research, please cite it as follows:

Peter Kraker, Christopher Kittel, Maxi Schramm, Jan Konstant, Rainer Bachleitner, Thomas Arrow, Scott Chamberlain, Asura Enkhbayar, Yael Stein, Philipp Weissensteiner, Mike Skaug, Katrin Leinweber & Open Knowledge Maps team and contributors. (2019, March 7). Headstart 5 (Version v5). Zenodo. http://doi.org/10.5281/zenodo.2587129

headstart's People

Contributors

bubblbu, chreman, jaels, katrinleinweber, konstiman, pkraker, rainbac, rbachleitner, rolandschuetz, sckott, tanteuschi, tarrow, vrednyydragon, wpp


headstart's Issues

Encoding of Pubmed results

I am having problems with processing the output of Pubmed and I am pretty sure that they are encoding-related. When I check the encoding with Encoding(metadata$abstract), the result is "Unknown". Is it possible that everything that comes back from get_papers is UTF-8 encoded?

User edited/user extended maps

Currently, interaction with Head Start is limited to filtering, searching, and zooming.

It would be nice to give users the possibility to edit a map. This would allow users to correct perceived errors, and to extend maps with their own literature. The following features come to mind:

  • the ability to move, resize, and rename areas,
  • the ability to move papers to a different area,
  • the ability to correct the metadata,
  • the ability to introduce new papers and areas to the map.

These features only make sense in combination with an adequate persistence infrastructure where the changes are saved.

Dedicated mobile version of Head Start

While the current version of Head Start can be used on smartphones and tablets, there are a number of limitations with regard to usability and performance. Thus, a dedicated mobile version is needed that is optimized for small screens, touch displays and mobile processors.

Force layout server side vs client side

Currently we are using the D3 implementation of Force Layout on client side.

Not sure if there is a noticeable difference in performance if we run force layout on the server in R for ~100 nodes and deliver the final map. But I guess that the performance should at least improve for weaker devices?

But most importantly the user would no longer have the waiting phase until the layout is finished. This would also solve #31. As the map is ready to go from the start we no longer need to worry about the synchronization between paper list and map.

Display bubble title in some way when bubble is highlighted

When the user moves over a bubble, the bubble title is hidden to show the papers in the area. This breaks the user experience, especially with long titles and overlapping bubbles, as users want to see the title in some form at any time. A tooltip is shown after a few seconds, but this is not satisfactory. A possible solution could be jQuery tooltips as they can be immediately shown.

Pubmed search

It would be awesome to extend the search functionality to Pubmed. This would require conducting a search in Pubmed and retrieving basic metadata, abstract and categories for the first 100 papers. For display purposes in the visualization, we will also need to retrieve the full-text PDF on demand (if it's available).

Use PCA to make map layout consistent

Currently the maps are not reproducible in the sense that the same papers as input will create differently rotated maps. Applying PCA after NMDS would align the first NMDS axis with the direction of maximum variance, so that the same input produces the same orientation.
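A minimal sketch of the rotation step, assuming a two-dimensional layout where each paper has x and y coordinates (these field names are hypothetical). It uses the closed-form principal-axis angle for 2D data instead of a general PCA routine, and it is written in JavaScript for illustration even though the map calculation itself runs in R.

```javascript
// Rotate centered 2D points so that the direction of maximum variance
// lies along the x axis (the 2D special case of a PCA rotation).
function alignToPrincipalAxes(points) {
  const n = points.length;
  const meanX = points.reduce((sum, p) => sum + p.x, 0) / n;
  const meanY = points.reduce((sum, p) => sum + p.y, 0) / n;

  // Entries of the (unnormalized) 2x2 covariance matrix.
  let sxx = 0, syy = 0, sxy = 0;
  for (const p of points) {
    const dx = p.x - meanX;
    const dy = p.y - meanY;
    sxx += dx * dx;
    syy += dy * dy;
    sxy += dx * dy;
  }

  // Angle of the first principal axis relative to the x axis.
  const theta = 0.5 * Math.atan2(2 * sxy, sxx - syy);
  const cos = Math.cos(theta);
  const sin = Math.sin(theta);

  // Rotate every centered point by -theta; other properties are kept.
  return points.map((p) => {
    const dx = p.x - meanX;
    const dy = p.y - meanY;
    return { ...p, x: dx * cos + dy * sin, y: -dx * sin + dy * cos };
  });
}
```

Note that this fixes the orientation only up to a 180 degree turn, and it does not remove mirror images; a fully reproducible layout would additionally need a sign convention for each axis.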

Mediator pattern for communication between components

Currently, the components of the visualization communicate directly with each other. When an event is triggered that affects two or more components (e.g. a zoom which updates both the chart and filters the list elements to show only papers of the zoomed area), the component which was notified about the event (in this case the chart) also directly updates the other components (in this case the list).

Therefore, there is high dependency between the different components and interaction is distributed and hard to manage. In the mediator pattern, components do not directly communicate with each other but through a mediator object. This would result in more loosely coupled components, and most possibly a better overview of the interaction between the components.
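A minimal sketch of the pattern, assuming a simple publish/subscribe mediator. The class, the event name and the chart and list objects are all hypothetical stand-ins and do not reflect the actual Head Start modules.

```javascript
// Minimal mediator: components register handlers for named events and
// never call each other directly.
class Mediator {
  constructor() {
    this.handlers = new Map();
  }

  subscribe(eventName, handler) {
    if (!this.handlers.has(eventName)) {
      this.handlers.set(eventName, []);
    }
    this.handlers.get(eventName).push(handler);
  }

  publish(eventName, payload) {
    for (const handler of this.handlers.get(eventName) || []) {
      handler(payload);
    }
  }
}

// Hypothetical components; neither holds a reference to the other.
const chart = { zoomedArea: null, zoomTo(area) { this.zoomedArea = area; } };
const list = { visibleArea: null, filterByArea(area) { this.visibleArea = area; } };

const mediator = new Mediator();
mediator.subscribe("bubble:zoomin", ({ area }) => chart.zoomTo(area));
mediator.subscribe("bubble:zoomin", ({ area }) => list.filterByArea(area));

// A single event updates both components through the mediator.
mediator.publish("bubble:zoomin", { area: "Altmetrics" });
```

The wiring between events and components lives in one place, which is what gives the better overview of the interaction described above.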

Zoom out on double click

It should be possible to zoom out with a double click on a bubble. For many users this is an expected behaviour that stems from mobile devices.

Single click access to the full-text

Feedback from @brembs: "In terms of then reading the literature, it would be good to be able to get a single click access to the full-text."

This can be achieved by adding an overlay when a user hovers over a paper, or an entry in the list.

Resume the list at the same document after zooming out

When browsing through the list of search results, individual articles are often clicked on and focused in order to see the detailed description. After zooming out, I would expect to resume my search at the same article I clicked on.

Persistence layer

Add a persistence layer that enables storing the state of a visualization in a database for later retrieval. This is intended to replace the static CSV files currently in use.

Parallelize R scripts

PLOS queries

Is searchplos.r already parallelized? If not, would it improve the performance if we used parallel searchplos(start=i, limit=10) calls with i = seq(0, len_papers, 10)?
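The batching idea is independent of the language, so here is a rough sketch of it in JavaScript and not in R. The fetchPage function is a hypothetical stand-in for a paged search call that takes a start offset and a limit; nothing here reflects the actual rplos API.

```javascript
// Start offsets for pages of `limit` results covering `total` results.
function pageOffsets(total, limit) {
  const offsets = [];
  for (let start = 0; start < total; start += limit) {
    offsets.push(start);
  }
  return offsets;
}

// Issue all page requests concurrently and concatenate the results in
// page order. `fetchPage(start, limit)` is a hypothetical function that
// returns a promise resolving to an array of results.
async function fetchAllPages(fetchPage, total, limit) {
  const pages = await Promise.all(
    pageOffsets(total, limit).map((start) => fetchPage(start, limit))
  );
  return pages.flat();
}
```

Whether this helps in practice depends on how many concurrent requests the remote API tolerates, which would need to be checked before relying on it.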

Non-metric multidimensional scaling

Instead of using nmds {ecodist}, we could use par.nmds {parfossil}, which is the implementation of the ecodist function with multi-core foreach loops instead of the standard for loops.

At least on my machine, this reduced the runtime for nmds from ~15s to ~5s.

Graphic interface

The interface looks much improved since I last found it. Zooming works very well and navigating the bubbles seems intuitive. I'll likely have some minor details to note upon extended usage, but for now this looks good to me.
In terms of then reading the literature, it would be good to be able to get a single click access to the full-text.

Tweet visualization including an image

It should be possible to tweet the visualization using a tweet button. It should be possible to include a still image of the visualization in the tweet.

Filter and zoom not working properly when combined

In the zoomed-out state, filtering works as expected. But once you zoom into an area, the filter isn't applied any more. I have attached images of the correct and the erroneous behavior. In the example, the list and the bubble should be filtered for "social".
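One way to express the expected behavior is to derive the visible papers from both pieces of state at once, so that zooming never discards the filter. The sketch below is an illustration under assumed names: the title and area fields and the visiblePapers function are hypothetical and not taken from the Head Start code.

```javascript
// Return the papers that match the search term AND, when zoomed in,
// belong to the zoomed area. Both conditions are always evaluated together.
function visiblePapers(papers, { searchTerm = "", zoomedArea = null } = {}) {
  const term = searchTerm.trim().toLowerCase();
  return papers.filter((paper) => {
    const matchesTerm = term === "" || paper.title.toLowerCase().includes(term);
    const inArea = zoomedArea === null || paper.area === zoomedArea;
    return matchesTerm && inArea;
  });
}

// Hypothetical sample data.
const samplePapers = [
  { title: "Social bookmarking in education", area: "A" },
  { title: "Mobile learning", area: "A" },
  { title: "Social network analysis", area: "B" },
];
```

Here visiblePapers(samplePapers, { searchTerm: "social", zoomedArea: "A" }) returns only the first paper, while the same call without zoomedArea returns the first and third.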

Correct: [image]

Erroneous: [image]

sorting algorithm

I don't yet know how representative my current search is so I'll expand on my searches, but one issue I noticed was that my technical term 'operant' was expanded to 'operative' and 'operating', neither of which are synonyms. Hence, the groupings are too often off-topic.
I realize that this is likely not completely trivial, but there likely needs to be some list of technical terms in the background. How about using the various ontologies? For instance, gene ontology is widely used and should be quite up to date. I haven't checked it in a while, but there are numerous behavior-related terms in GO and 'operant' might be there along with related concepts.
But perhaps these ontologies are already incorporated and just need to be corrected?

Hover behavior when zoomed in

When zoomed in, the surrounding circles should not gain focus on mouse over. Also, the selected circle should not lose focus on mouse out. I have attached images of the erroneous and the correct behavior below.

Erroneous: [image]

Correct: [image]

Split viz-requests into smaller parts

Currently, requests are processed from start to finish in one go, which causes the known waiting times for users. In order to enhance the user experience, we could split a request into two steps:

  1. Get data from PLOS

    Once the data is retrieved from PLOS the list containing the papers can already be displayed.

  2. Calculate the map ...

    ... while the user explores the papers as a list.

Search PLOS: Enable users to set options

Users should be able to set some options for their PLOS search: time range, article types and journals. That way, you can show e.g. just the research articles for a certain topic in the last month.

Export data

Users should be able to export data from Headstart without having to resort to looking at the network traffic to find the JSON. Suggested formats are JSON and CSV.
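A minimal sketch of the CSV half of such an export, assuming the data is available as an array of plain objects. The toCsv function and the column names in the usage note are hypothetical; the quoting follows the common convention of wrapping fields that contain commas, quotes or line breaks and doubling embedded quotes.

```javascript
// Convert an array of objects to CSV text with a header row.
function toCsv(rows, columns) {
  const escapeField = (value) => {
    const text = value === null || value === undefined ? "" : String(value);
    // Quote fields containing a comma, a double quote or a line break,
    // and double any embedded double quotes.
    return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };
  const header = columns.map(escapeField).join(",");
  const body = rows.map((row) => columns.map((c) => escapeField(row[c])).join(","));
  return [header, ...body].join("\r\n");
}
```

For example, toCsv(papers, ["title", "year"]) would produce one line per paper; the JSON export is simply JSON.stringify(papers, null, 2) on the same array.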

Display visualization context

When a visualization context is available (e.g. the parameters of the PLOS search), then this context should be displayed in the visualization to improve the user's ability to interpret the visualization.

Complete mediator implementation

Before we progress with improvements on the frontend, we should complete the mediator implementation. The goal is that there are no cross-module calls any more. All logic between modules should be handled by the mediator.
