The script first queries the custom Google search engine and collects every URL it returns. The text of each article is then extracted from its URL with newscat. These text files are pushed to the agent, which creates embeddings from them and uses those embeddings when answering queries.
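The flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual contents of app.py: `extract_links`, `run_pipeline`, and the `extract_text` / `push_to_agent` callables are hypothetical names. The only external detail relied on is that a Custom Search JSON API response lists its results under `items`, each carrying a `link` field.

```python
from typing import Callable, Iterable, List


def extract_links(response: dict) -> List[str]:
    """Return the result URLs from one Custom Search JSON API response page."""
    return [item["link"] for item in response.get("items", []) if "link" in item]


def run_pipeline(
    pages: Iterable[dict],
    extract_text: Callable[[str], str],
    push_to_agent: Callable[[str, str], None],
) -> List[str]:
    """Collect URLs, extract each article's text, and hand it to the agent.

    extract_text stands in for the newscat step and push_to_agent for the
    step that sends text to the agent; both are hypothetical placeholders.
    """
    seen = set()
    processed = []
    for page in pages:
        for url in extract_links(page):
            if url in seen:  # skip URLs repeated across result pages
                continue
            seen.add(url)
            text = extract_text(url)
            if text.strip():  # ignore pages where no article text was found
                push_to_agent(url, text)
                processed.append(url)
    return processed
```

Passing the extraction and upload steps in as callables keeps the sketch independent of how newscat and the agent are actually invoked.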
- Python 3.9
- OpenAI API Key
sudo yum install -y amazon-efs-utils
sudo mount -t efs -o tls fs-0123456789:/ /path/to/mnt
- ssh into ec2
- scp app.py into ec2
- make sure you have golang installed
- install the required packages
  - for the app:
    pip install langchain openai chromadb tiktoken google-api-python-client
  - for deploying fastapi:
    pip install fastapi uvicorn pickle5 pydantic requests pypi-json pyngrok nest-asyncio python-multipart httpx
  - you might encounter a problem with chromadb (see below)
- install newscat
  go install github.com/slyrz/newscat@latest
  - add GOPATH to your PATH
    export GOPATH=$HOME/go
    export PATH=$PATH:$GOROOT/bin:$GOPATH/bin
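A quick way to confirm the PATH change took effect is to check that both binaries resolve before running the app. The helper below is a hypothetical sketch (the name `missing_tools` is not part of app.py) built on the standard library's `shutil.which`.

```python
import shutil
from typing import List


def missing_tools(names: List[str]) -> List[str]:
    """Return the tools from names that cannot be found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools(["go", "newscat"])
    if missing:
        print("not found on PATH: " + ", ".join(missing))
    else:
        print("go and newscat are both on PATH")
```

If newscat is reported missing after `go install`, the GOPATH export most likely did not reach the current shell.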
- run
  python app.py 1 1
  - the two positional arguments determine whether the script searches for articles and extracts links, respectively. Set them to 0, or omit them, to skip those steps.
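One way such flags could be read is sketched below. It is an assumption about how app.py might interpret its arguments, not its actual code: `parse_flags` is a hypothetical helper, and treating every value other than "1" as "skip" is a choice made for this sketch.

```python
import sys
from typing import List, Tuple


def parse_flags(args: List[str]) -> Tuple[bool, bool]:
    """Map up to two positional arguments to (search, extract) flags.

    A missing argument or "0" disables the step; "1" enables it.
    """
    padded = (args + ["0", "0"])[:2]  # omitted arguments default to "0"
    search, extract = (value == "1" for value in padded)
    return search, extract


if __name__ == "__main__":
    do_search, do_extract = parse_flags(sys.argv[1:])
    print(f"search={do_search} extract={do_extract}")
```

With this reading, `python app.py 1 1` enables both steps, `python app.py 0 1` skips the search, and `python app.py` skips both.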
- to start a long-running process
  - start a screen session with
    screen
  - run
    python app.py
  - press ctrl + a, followed by d, to leave it in the background
- if you want to go back to a running session, do
  screen -r
  or alternatively, to check the session pid,
  ps aux | grep screen
  - to kill, do
    kill -15 [pid]
- chromadb:
RuntimeError: Unsupported compiler -- at least C++11 support is needed!
- you need g++, run the following commands and try again
sudo yum -y install gcc
sudo yum -y install gcc-c++
sudo yum install python3-devel
[Unit]
Description=carbon news service
[Service]
User=ec2-user
WorkingDirectory=/home/ec2-user
ExecStart=/usr/bin/python3 app.py
Restart=always
RestartSec=1
[Install]
WantedBy=multi-user.target