Giter Club home page Giter Club logo

ecos's Introduction

ECOS (Extracting Content Output Sonogram)

ECOS is a real-time system that ingests, stores, processes and indexes medical reports.

Data storage and processing take place in a distributed manner through the use of Kafka and Spark respectively.

In particular, information such as the status of examined anatomical structures, medical diagnosis and other information related to the patient's health is to be extracted from the documents.

The next step in this project would be to make the extraction customisable;

Ideally, any doctor specialising in any field of medicine could upload their reports and specify which parameters they would like the model to search for.

The power of ChatGPT-4 was used to extract the features.


πŸ“ Requirements

  • Docker & Docker Compose
  • In order to to use ChatGPT-4 API, one possible solution is to create an Azure account and enable OpenAI services.
    If you are a student, follow my guide to get an Azure account with $100 credit and no credit card required!

⚑ Quickstart

$ git clone https://github.com/WoWS17/MedScan.git

$ cd ./kafka/setup

$ wget https://dlcdn.apache.org/kafka/3.4.0/kafka_2.13-3.4.0.tgz

$ cd ../../

$ echo 'export AZURE_OPENAI_ENDPOINT=<Your OpenAI Endpoint>' >> ~/.bashrc

$ echo 'export AZURE_OPENAI_KEY=<Your OpenAI Secret Key>' >> ~/.bashrc

$ source ~/.bashrc

$ docker compose up --build

πŸ“Š Data flow

data-flow

Data Source

The data analysed were provided by a doctor. They are diagnoses made following ultrasound examinations of the male inguinal-scrotal regions.

LogStash

What is it?

Logstash is a free and open server-side data processing pipeline that ingests data from a multitude of sources, transforms it, and then sends it to your favorite "stash."

Kafka

What is it?

Apache Kafka is an open-source event streaming platform that enables the management and processing of real-time data streams. It is designed for scalability, reliability and speed, enabling the transmission of large volumes of data between applications and systems. Kafka operates on a publish-subscribe model, where producers send data into containers called topics and consumers subscribe to the topics to receive the data.

ElasticSearch

What is it?

Elasticsearch is a distributed, RESTful search and analytics engine capable of addressing a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data for lightning fast search, fine‑tuned relevancy, and powerful analytics that scale with ease.

Spark

What is it?

Apache Spark is a lightning-fast unified analytics engine for big data and machine learning. It was originally developed at UC Berkeley in 2009.

Kibana

What is it?

Kibana is an free and open frontend application that sits on top of the Elastic Stack, providing search and data visualization capabilities for data indexed in Elasticsearch. Commonly known as the charting tool for the Elastic Stack (previously referred to as the ELK Stack after Elasticsearch, Logstash, and Kibana), Kibana also acts as the user interface for monitoring, managing, and securing an Elastic Stack cluster β€” as well as the centralized hub for built-in solutions developed on the Elastic Stack. Developed in 2013 from within the Elasticsearch community, Kibana has grown to become the window into the Elastic Stack itself, offering a portal for users and companies.

Useful links

Container URL Description
kafkaserver http://localhost:8080 Open kafka UI to monitor kafka server
elasticsearch http://localhost:9200/ ElasticSearch base URL
elasticsearch http://localhost:9200/ner_idx/_search ElasticSearch index content
kibana http://localhost:5601 Kibana base URL

Authors

Giuseppe Coco

ecos's People

Contributors

giuseppe-coco avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    πŸ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. πŸ“ŠπŸ“ˆπŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❀️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.