polyrabbit / hacker-news-digest

Stars: 650 · Watchers: 20 · Forks: 87 · Size: 4.51 MB

:newspaper: Let ChatGPT Summarize Hacker News for You

Home Page: http://hackernews.betacat.io/

License: GNU Lesser General Public License v3.0

Makefile 1.00% Python 63.75% JavaScript 3.11% CSS 1.25% HTML 9.79% Jupyter Notebook 21.10%
hacker-news python data-extraction hacker-news-reader rss extract-summaries hacker-news-digest spider crawler machine-learning

hacker-news-digest's People

Contributors

dependabot[bot], itsjw, mchangxin, melroy89, polyrabbit, sitebase

hacker-news-digest's Issues

FR: allow filtered rss feeds like hnrss

Hi,

I think it would be awesome to be able to filter the front page a bit in the RSS feed, and it would probably require only small changes. Some hnrss filters, such as search, are more complicated and would not be appropriate for this project because you only offer the front page. But I still think it would be good to allow filtering the front page by, for example, points and comments.

Thoughts?

Edit: I'm referring to https://github.com/hnrss/hnrss
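
A minimal sketch of the kind of threshold filtering requested here, assuming each front-page entry is available as a record with point and comment counts. The `Story` class, its field names, and `filter_stories` are hypothetical and are not taken from this project's code:

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class Story:
    # Hypothetical record for one front-page entry; field names are assumptions.
    title: str
    points: int
    comments: int


def filter_stories(
    stories: Iterable[Story],
    min_points: Optional[int] = None,
    min_comments: Optional[int] = None,
) -> List[Story]:
    """Keep the stories that meet every threshold that was supplied."""
    kept = []
    for story in stories:
        if min_points is not None and story.points < min_points:
            continue
        if min_comments is not None and story.comments < min_comments:
            continue
        kept.append(story)
    return kept


frontpage = [
    Story("A", points=250, comments=40),
    Story("B", points=90, comments=120),
    Story("C", points=30, comments=5),
]
# Only story "A" has at least 100 points.
print([s.title for s in filter_stories(frontpage, min_points=100)])
```

The thresholds could then be read from query parameters on the feed URL before the feed is rendered; how the project would wire that in is left open.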

Incomplete news summary

For some news items, the summary is not shown completely, and there is no link to the full summary. Does the summary itself end with an ellipsis, or is it not being rendered correctly?

Example:

Example 1

Example 2

experiment with thought process prompt

Hi,

I just wanted to point out that there are promising ways to summarize documents that are, in my opinion, much better.

This is more costly because it uses LangChain chains, but I think the added value is tremendous.

To me, this is the kind of feature that would make me pay for the service.

This can also handle comments nicely with a little adjustment: write a prompt that extracts the new information in the comments, summarized opinions, new facts, etc. (see #5).

I did a quick try earlier today and found it very promising. I'm insanely busy at the moment, so I thought you might be interested in the raw code directly. The idea is to ask the model to summarize not the key facts but the author's reasoning, paragraph by paragraph, as logically indented Markdown bullet points. Here's a quick proof of concept (just add your API key and pass a txt file as the argument; also note that for testing I shortened the input via [:1000]):

# source https://python.langchain.com/en/latest/modules/chains/index_examples/summarize.html
# NOTE: these import paths match the LangChain release this snippet was written
# against; newer releases may have moved them.

from pathlib import Path
import os
from langchain.chat_models import ChatOpenAI
from langchain.text_splitter import CharacterTextSplitter
from langchain.prompts import PromptTemplate
from langchain.docstore.document import Document
from langchain.chains.summarize import load_summarize_chain

assert Path("API_KEY.txt").exists(), "No api key found"
os.environ["OPENAI_API_KEY"] = str(Path("API_KEY.txt").read_text()).strip()

llm = ChatOpenAI(
        model_name="gpt-3.5-turbo",
        temperature=0,
        verbose=True,
        )

text_splitter = CharacterTextSplitter()

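def load_doc(path):
    """Read a text file and return it as a list of LangChain Documents."""
    assert Path(path).exists(), f"file not found: '{path}'"
    with open(path) as f:
        # Only the first 1000 characters are kept, to keep test runs cheap.
        content = f.read()[:1000]
    texts = text_splitter.split_text(content)
    # Ask for confirmation before sending many chunks to the API.
    if len(texts) > 5:
        ans = input(f"Number of text splits: '{len(texts)}'. Continue? (y/n)\n>")
        if ans != "y":
            raise SystemExit("Quitting")
    docs = [Document(page_content=t) for t in texts]
    return docs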


prompt_template = """Write a very concise summary of the author's reasoning paragraph by paragraph as logically indented markdown bullet points:

'''
{text}
'''

CONCISE SUMMARY AS LOGICALLY INDENTED MARKDOWN BULLET POINTS:"""
PROMPT = PromptTemplate(template=prompt_template, input_variables=["text"])
refine_template = (
    """Your job is to continue a summary of a long text as logically indented markdown bullet points of the author's reasoning.
    We have provided an existing summary up to this point:
    '''
    {existing_answer}
    '''

    You have to continue the summary by adding the bullet points of the following part of the article (only if relevant, stay concise, avoid spelling out what is implied by the previous bullet points):
    '''
    {text}
    '''
    Given this new section of the document, refine the summary as logically indented markdown bullet points. If the new section is not worth it, simply return the original summary."""
)
refine_prompt = PromptTemplate(
    input_variables=["existing_answer", "text"],
    template=refine_template,
)

if __name__ == "__main__":
    import sys

    # Expect the path of the text file to summarize as the only argument.
    if len(sys.argv) != 2:
        raise SystemExit(f"usage: {sys.argv[0]} <text-file>")
    docs = load_doc(sys.argv[1])
    chain = load_summarize_chain(
        llm,
        chain_type="refine",
        return_intermediate_steps=True,
        question_prompt=PROMPT,
        refine_prompt=refine_prompt,
    )
    out = chain({"input_documents": docs}, return_only_outputs=True)

    # Print the final summary.
    t = out["output_text"]
    print(t)

    # Drop into an interactive console to inspect `out` and the chain.
    print("Opening console.")
    import code
    code.interact(local=locals())

Thoughts?
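
For readers unfamiliar with the "refine" strategy used in the snippet, here is a dependency-free sketch of the control flow: the first chunk is summarized with the question prompt, then each later chunk is passed to the refine prompt together with the summary so far. `refine_summarize` and `fake_llm` are hypothetical names for illustration; `fake_llm` only wraps the prompt in brackets so the flow is visible without calling a model:

```python
from typing import Callable, List


def refine_summarize(
    chunks: List[str],
    llm: Callable[[str], str],
    question_template: str,
    refine_template: str,
) -> str:
    """Summarize the first chunk, then refine the summary chunk by chunk."""
    if not chunks:
        return ""
    summary = llm(question_template.format(text=chunks[0]))
    for chunk in chunks[1:]:
        prompt = refine_template.format(existing_answer=summary, text=chunk)
        summary = llm(prompt)
    return summary


def fake_llm(prompt: str) -> str:
    # Stand-in for a real model call: echoes the prompt inside brackets.
    return f"[{prompt}]"


result = refine_summarize(
    ["one", "two"],
    fake_llm,
    question_template="Q:{text}",
    refine_template="R:{existing_answer}|{text}",
)
print(result)
```

Because every refine step needs the previous summary, the calls are sequential, which is one reason this approach costs more than a single summarization call.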

Feature request: Add link to comments in RSS feed contents

The web version shows an icon and count of comments. Clicking on it takes you to the Hacker News page with all user comments. However, the RSS feed does not include such a link.

I would find the RSS feed much more useful if a link to the comments page was added into the content.
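
A minimal sketch of what this could look like, assuming the feed content for each entry is an HTML string and the Hacker News item id and comment count are known when the feed is built. `with_comments_link` and its parameters are hypothetical and are not part of this project's code:

```python
from html import escape

# Hacker News discussion pages follow this URL pattern.
HN_ITEM_URL = "https://news.ycombinator.com/item?id={id}"


def with_comments_link(summary_html: str, story_id: int, comment_count: int) -> str:
    """Append a link to the Hacker News discussion to an entry's HTML content."""
    url = HN_ITEM_URL.format(id=story_id)
    noun = "comment" if comment_count == 1 else "comments"
    link = f'<p><a href="{escape(url, quote=True)}">{comment_count} {noun}</a></p>'
    return summary_html + link


content = with_comments_link("<p>Summary text.</p>", story_id=123, comment_count=42)
print(content)
```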

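How do I run and debug this locally?

Hello, I noticed this is a Flask app? Is the app.py file missing? I don't see any app route information.
I'm currently learning AI, and I'd like to know how to start this app locally.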

Add digest comments?

Hi,

I love what you have built with Hacker News digest! Is it possible to add a comments digest section to the RSS feed (to show the top comments for each article)?
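
One possible way to pick the comments to show, sketched under the assumption that story and comment items are available in the shape returned by the official Hacker News API, where `kids` lists a story's top-level comment ids in ranked display order. `top_comments` and the `fetch_item` callable are hypothetical, and the demo uses an in-memory dictionary instead of network calls:

```python
from typing import Callable, Dict, List, Optional


def top_comments(
    story: Dict,
    fetch_item: Callable[[int], Optional[Dict]],
    limit: int = 3,
) -> List[str]:
    """Return the text of the first `limit` usable top-level comments."""
    texts: List[str] = []
    for kid_id in story.get("kids", []):
        if len(texts) >= limit:
            break
        item = fetch_item(kid_id)
        # Skip missing, deleted, or dead comments, and comments with no text.
        if not item or item.get("deleted") or item.get("dead"):
            continue
        if item.get("text"):
            texts.append(item["text"])
    return texts


# In-memory stand-in for the item store, for demonstration only.
fake_items = {
    2: {"id": 2, "text": "First comment"},
    3: {"id": 3, "deleted": True},
    4: {"id": 4, "text": "Second comment"},
    5: {"id": 5, "text": "Third comment"},
}
story = {"id": 1, "kids": [2, 3, 4, 5]}
print(top_comments(story, fake_items.get, limit=2))
```

The selected texts could then be summarized or appended to the entry's feed content.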

question on crawler

Hi,

I read this page from your doc the other day and was wondering.

Why not just use one of the article extractors that have been made in the past? There is even a GitHub tag for some of them.

Just wondering, hope you don't mind
