polyrabbit / hacker-news-digest
:newspaper: Let ChatGPT Summarize Hacker News for You
Home Page: http://hackernews.betacat.io/
License: GNU Lesser General Public License v3.0
For example: https://www.smithsonianmag.com/history/the-photographer-who-forced-the-us-to-confront-its-child-labor-problem-180982355/
The summary is extremely long and unformatted. I've seen this several times, so I think there may be a bug in your code for some websites or input file types.
Thanks for the website btw
Hi,
I think it would be awesome to be able to filter the frontpage a bit in the RSS feed, and it would probably require only small changes. Some filters, like search, would be more complicated and would not be appropriate for this project because you only offer the frontpage, but I still think it would be good to allow filtering the frontpage by points and number of comments.
Thoughts?
Edit: I'm referring to https://github.com/hnrss/hnrss
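To make the request concrete, here is a minimal sketch of the kind of filter I have in mind. The `Story` class, its field names, and the `filter_stories` helper are all hypothetical and not taken from this project's code; it only assumes that each frontpage entry already carries its points and comment count.

```python
from dataclasses import dataclass


@dataclass
class Story:
    # Hypothetical shape of a frontpage entry; the real project may differ.
    title: str
    url: str
    points: int
    comments: int


def filter_stories(stories, min_points=0, min_comments=0):
    """Keep only stories that reach both thresholds."""
    return [
        s for s in stories
        if s.points >= min_points and s.comments >= min_comments
    ]


if __name__ == "__main__":
    frontpage = [
        Story("Quiet post", "https://example.com/a", points=12, comments=1),
        Story("Popular post", "https://example.com/b", points=340, comments=95),
    ]
    for story in filter_stories(frontpage, min_points=100, min_comments=25):
        print(story.title)
```

The thresholds could then be read from query parameters on the feed URL, so the default feed stays unchanged when none are given.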
Hi,
Just wanted to point out that there are promising ways to summarize documents in a much better way, in my opinion.
This is more costly as it uses LangChain chains, but I think the added value is tremendous.
To me this is the kind of feature that would make me pay for that service.
Also, this can nicely handle comments too with a little adjustment: make a prompt that extracts the new information in the comments, summarized opinions, new facts, etc. #5
I did a quick try earlier today and find this very promising. I'm insanely busy atm so I thought you might be interested in the raw code directly. The idea is to ask for a summary not of the key facts but of the author's reasoning, paragraph by paragraph, as logically indented markdown. Here's a quick proof of concept (just add your API key and pass a txt file as argument; also notice that for testing I shortened the input via [:1000]):
# source https://python.langchain.com/en/latest/modules/chains/index_examples/summarize.html
import os
from pathlib import Path

from langchain.chat_models import ChatOpenAI
from langchain.text_splitter import CharacterTextSplitter
from langchain.prompts import PromptTemplate
from langchain.docstore.document import Document
from langchain.chains.summarize import load_summarize_chain

# Read the OpenAI API key from a local file
assert Path("API_KEY.txt").exists(), "No api key found"
os.environ["OPENAI_API_KEY"] = str(Path("API_KEY.txt").read_text()).strip()

llm = ChatOpenAI(
    model_name="gpt-3.5-turbo",
    temperature=0,
    verbose=True,
)
text_splitter = CharacterTextSplitter()


def load_doc(path):
    """Read a text file and split it into Document chunks."""
    assert Path(path).exists(), f"file not found: '{path}'"
    with open(path) as f:
        content = f.read()[:1000]  # input shortened for testing
    texts = text_splitter.split_text(content)
    # Ask for confirmation before sending many chunks to the API
    if len(texts) > 5:
        ans = input(f"Number of texts splits: '{len(texts)}'. Continue? (y/n)\n>")
        if ans != "y":
            raise SystemExit("Quitting")
    docs = [Document(page_content=t) for t in texts]
    return docs


# Prompt used on the first chunk of the document
prompt_template = """Write a very concise summary of the author's reasoning paragraph by paragraph as logically indented markdown bullet points:
'''
{text}
'''
CONCISE SUMMARY AS LOGICALLY INDENTED MARKDOWN BULLET POINTS:"""
PROMPT = PromptTemplate(template=prompt_template, input_variables=["text"])

# Prompt used on each following chunk to extend the existing summary
refine_template = (
"""Your job is to continue a summary of a long text as logically indented markdown bullet points of the author's reasoning.
We have provided an existing summary up to this point:
'''
{existing_answer}
'''
You have to continue the summary by adding the bullet points of the following part of the article (only if relevant, stay concise, avoid spelling out what is implied by the previous bullet points):
'''
{text}
'''
Given this new section of the document, refine the summary as logically indented markdown bullet points. If the new section is not worth it, simply return the original summary."""
)
refine_prompt = PromptTemplate(
    input_variables=["existing_answer", "text"],
    template=refine_template,
)

if __name__ == "__main__":
    import code
    import sys

    # The last command-line argument is the path to the txt file
    docs = load_doc(sys.argv[-1])
    chain = load_summarize_chain(
        llm,
        chain_type="refine",
        return_intermediate_steps=True,
        question_prompt=PROMPT,
        refine_prompt=refine_prompt,
    )
    out = chain({"input_documents": docs}, return_only_outputs=True)
    t = out["output_text"]
    for bulletpoint in t.split("\n"):
        print(bulletpoint)
    # Drop into an interactive console to inspect the results
    print("Opening console.")
    code.interact(local=locals())
Thoughts?
url && reason
The web version shows an icon and count of comments. Clicking on it takes you to the Hacker News page with all user comments. However, the RSS feed does not include such a link.
I would find the RSS feed much more useful if a link to the comments page were added to the content.
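As a rough illustration of what I mean, here is a sketch that builds one RSS item with both a `<comments>` element (part of RSS 2.0) and a link inside the description. The `build_item` helper and its parameters are hypothetical; I don't know how this project actually generates its feed, so treat it only as an example of the desired output.

```python
import html
import xml.etree.ElementTree as ET

# Standard URL pattern of a Hacker News discussion page
HN_ITEM_URL = "https://news.ycombinator.com/item?id={}"


def build_item(title, link, summary, hn_id):
    """Build an RSS <item> that points to the HN comments page (hypothetical helper)."""
    comments_url = HN_ITEM_URL.format(hn_id)
    item = ET.Element("item")
    ET.SubElement(item, "title").text = title
    ET.SubElement(item, "link").text = link
    # RSS 2.0 element dedicated to the comments page
    ET.SubElement(item, "comments").text = comments_url
    # Also repeat the link in the body for readers that ignore <comments>
    ET.SubElement(item, "description").text = (
        f'<p>{html.escape(summary)}</p>'
        f'<p><a href="{comments_url}">Comments</a></p>'
    )
    return item


if __name__ == "__main__":
    item = build_item("Some story", "https://example.com/story", "A short summary.", 123)
    print(ET.tostring(item, encoding="unicode"))
```

Putting the link in both places is deliberate: some readers show the `<comments>` element, while others only render the description.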
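Hello, I noticed this is a Flask app? Is the app.py file missing? I don't see any app route information.
I'm currently learning AI and would like to know how to start this app locally.
It would be great to have a Chinese RSS feed.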
Hi, just in case you didn't know, there is another website doing something similar. You might be interested in it, but it's not open source: https://www.emergentmind.com/
Hope you find this useful.
For example, something like diffbot.
I have a free OpenAI account. Will hacker-news-digest work for me?
Hi,
Loved what you have for Hacker News Digest. Is it possible to add a comment digest section to the RSS feed (to show top comments for each article)?