dmmiller612 / lecture-summarizer
Lecture summarization with BERT
Home Page: https://arxiv.org/abs/1906.04165
Hi, I've been trying out the program, but I run into a Unicode error with certain files.
I have tried to follow the advice in the error message and set the encoding in several places, but it doesn't seem to work. Do you know where I should change it?
The error message is as follows:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/3.7/bin/lecture-summarizer", line 10, in <module>
sys.exit(run())
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/lecture_summarizer.py", line 173, in run
factory[args.action](args)()
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/lecture_summarizer.py", line 52, in __call__
self.run()
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/lecture_summarizer.py", line 79, in run
to_upload = self.__get_lecture_content()
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/lecture_summarizer.py", line 65, in __get_lecture_content
req = requests.post(url, all_data)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/requests/api.py", line 116, in post
return request('post', url, data=data, json=json, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/requests/api.py", line 60, in request
return session.request(method=method, url=url, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/requests/sessions.py", line 533, in request
resp = self.send(prep, **send_kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
timeout=timeout
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/urllib3/connectionpool.py", line 603, in urlopen
chunked=chunked)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/urllib3/connectionpool.py", line 355, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1244, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 1289, in _send_request
body = _encode(body, 'body')
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/http/client.py", line 170, in _encode
(name.title(), data[err.start:err.end], name)) from None
UnicodeEncodeError: 'latin-1' codec can't encode character '\u2013' in position 476: Body ('–') is not valid Latin-1. Use body.encode('utf-8') if you want to send it encoded in UTF-8.
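The last line of the traceback hints at the fix: when `requests.post` is given a `str` body, `http.client` tries to encode it as Latin-1, which cannot represent characters such as the en dash (`\u2013`). A minimal sketch of a workaround, assuming the transcript is held in a plain string before it is posted; the helper name `utf8_post_kwargs` and the `Content-Type` value are illustrative assumptions, not part of the project:

```python
def utf8_post_kwargs(text: str) -> dict:
    """Build keyword arguments for requests.post with a UTF-8 encoded body.

    Hypothetical helper: passing bytes instead of str avoids the implicit
    Latin-1 encoding that http.client applies to str bodies.
    """
    return {
        "data": text.encode("utf-8"),
        # Assumed header; the server may expect a different content type.
        "headers": {"Content-Type": "text/plain; charset=utf-8"},
    }


sample = "Lecture 1 \u2013 Introduction"  # contains an en dash
kwargs = utf8_post_kwargs(sample)
print(type(kwargs["data"]).__name__)

# With the requests library installed, the call would then look like:
# requests.post(url, **kwargs)
```

Encoding at the call site keeps the change local to the one `requests.post` call shown in the traceback, rather than scattering `encoding=` arguments through the file-reading code.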
When I try to test the CLI tool with the provided command "lecture-summarizer get-lectures",
I get a connection error. The error trace is below:
root@marvin-VirtualBox:/home/marvin/lecture-summarizer# pip3 install git+https://github.com/dmmiller612/lecture-summarizer.git
Collecting git+https://github.com/dmmiller612/lecture-summarizer.git
Cloning https://github.com/dmmiller612/lecture-summarizer.git to /tmp/pip-3qb2iltk-build
Requirement already satisfied (use --upgrade to upgrade): lecture-summarizer==0.0.1 from git+https://github.com/dmmiller612/lecture-summarizer.git in /usr/local/lib/python3.6/dist-packages
Requirement already satisfied: requests in /usr/lib/python3/dist-packages (from lecture-summarizer==0.0.1)
root@marvin-VirtualBox:/home/marvin/lecture-summarizer# lecture-summarizer get-lectures
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 141, in _new_conn
(self.host, self.port), self.timeout, **extra_kw)
File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 83, in create_connection
raise err
File "/usr/lib/python3/dist-packages/urllib3/util/connection.py", line 73, in create_connection
sock.connect(sa)
TimeoutError: [Errno 110] Connection timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 601, in urlopen
chunked=chunked)
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 357, in _make_request
conn.request(method, url, **httplib_request_kw)
File "/usr/lib/python3.6/http/client.py", line 1264, in request
self._send_request(method, url, body, headers, encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1310, in _send_request
self.endheaders(body, encode_chunked=encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1259, in endheaders
self._send_output(message_body, encode_chunked=encode_chunked)
File "/usr/lib/python3.6/http/client.py", line 1038, in _send_output
self.send(msg)
File "/usr/lib/python3.6/http/client.py", line 976, in send
self.connect()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 166, in connect
conn = self._new_conn()
File "/usr/lib/python3/dist-packages/urllib3/connection.py", line 150, in _new_conn
self, "Failed to establish a new connection: %s" % e)
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPConnection object at 0x7fb7a4f0a550>: Failed to establish a new connection: [Errno 110] Connection timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 440, in send
timeout=timeout
File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 639, in urlopen
_stacktrace=sys.exc_info()[2])
File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 398, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='54.85.20.109', port=5000): Max retries exceeded with url: /lectures (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb7a4f0a550>: Failed to establish a new connection: [Errno 110] Connection timed out',))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/bin/lecture-summarizer", line 11, in <module>
load_entry_point('lecture-summarizer==0.0.1', 'console_scripts', 'lecture-summarizer')()
File "/usr/local/lib/python3.6/dist-packages/lecture_summarizer.py", line 173, in run
factory[args.action](args)()
File "/usr/local/lib/python3.6/dist-packages/lecture_summarizer.py", line 52, in __call__
self.run()
File "/usr/local/lib/python3.6/dist-packages/lecture_summarizer.py", line 98, in run
self.run_get(url)
File "/usr/local/lib/python3.6/dist-packages/lecture_summarizer.py", line 33, in run_get
'Content-Type': 'application/json'
File "/usr/lib/python3/dist-packages/requests/api.py", line 72, in get
return request('get', url, params=params, **kwargs)
File "/usr/lib/python3/dist-packages/requests/api.py", line 58, in request
return session.request(method=method, url=url, **kwargs)
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 520, in request
resp = self.send(prep, **send_kwargs)
File "/usr/lib/python3/dist-packages/requests/sessions.py", line 630, in send
r = adapter.send(request, **kwargs)
File "/usr/lib/python3/dist-packages/requests/adapters.py", line 508, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='54.85.20.109', port=5000): Max retries exceeded with url: /lectures (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7fb7a4f0a550>: Failed to establish a new connection: [Errno 110] Connection timed out',))
Thanks for your help.
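Every traceback in the chain above bottoms out in `[Errno 110] Connection timed out`, meaning no TCP connection to `54.85.20.109:5000` could be opened at all, so the failure is most likely on the network or server side rather than in the CLI. A plain socket probe is a quick way to confirm this before running the tool. This is only a sketch; the function name `is_reachable` and the 5-second timeout are assumptions for illustration:

```python
import socket


def is_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and unreachable hosts.
        return False


# Example call, using the host and port from the traceback above:
# is_reachable("54.85.20.109", 5000)
```

If the probe returns False, pointing the CLI at a locally running service is probably the only option.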
$ make docker-build-run
docker build -t local-summary -f Dockerfile ./
Sending build context to Docker daemon 176.1kB
Step 1/10 : FROM ubuntu:16.04
---> 657d80a6401d
Step 2/10 : RUN apt-get update && apt-get install -y sudo build-essential curl libcurl4-openssl-dev libssl-dev wget python3-dev python3-pip libxrender-dev libxext6 libsm6 openssl
---> Using cache
---> c06c359ea431
Step 3/10 : RUN mkdir -p /opt/service
---> Using cache
---> 7b2438beb164
Step 4/10 : RUN mkdir -p /opt/service/summarizer
---> Using cache
---> 56cec71213c9
Step 5/10 : COPY summarizer /opt/service/summarizer
---> Using cache
---> 56f29661dbe9
Step 6/10 : COPY server.py /opt/service
---> Using cache
---> 9caa6122383f
Step 7/10 : COPY requirements.txt /opt/service
---> Using cache
---> 4cf3dd2c5ae2
Step 8/10 : WORKDIR /opt/service
---> Using cache
---> 8d7f02cfc081
Step 9/10 : RUN pip3 install -r requirements.txt
---> Using cache
---> 8ae60c217e59
Step 10/10 : CMD /bin/bash -c "python3 server.py"
---> Using cache
---> 3f5be93d9373
Successfully built 3f5be93d9373
Successfully tagged local-summary:latest
docker run --rm -it -p 5000:5000 local-summary:latest
File "server.py", line 81
lecture: int = lectureid
^
SyntaxError: invalid syntax
Makefile:12: recipe for target 'docker-build-run' failed
make: *** [docker-build-run] Error 1
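The `SyntaxError` points at a variable annotation (`lecture: int = lectureid`), a syntax Python only accepts from version 3.6 onward (PEP 526). The image is built `FROM ubuntu:16.04`, whose default `python3` is 3.5, which would explain the failure. Because a syntax error is raised while the file is being compiled, a version guard inside `server.py` itself would never run; a small separate launcher can check first. A minimal sketch, where the launcher and the function name `supports_variable_annotations` are hypothetical:

```python
import sys


def supports_variable_annotations(version_info=sys.version_info) -> bool:
    """Variable annotations (PEP 526) need Python 3.6 or newer."""
    return tuple(version_info[:2]) >= (3, 6)


# This file deliberately avoids 3.6+ syntax so it still parses on Python 3.5.
if not supports_variable_annotations():
    sys.exit("server.py needs Python 3.6+, found %d.%d" % sys.version_info[:2])
```

Alternatively, a base image that ships Python 3.6 or newer avoids the problem without touching the code.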
Since this is a separate app from the general BERT summarizer, I wanted to ask whether any dataset was used to fine-tune the BERT model for lecture transcript summarization and, if so, which one.
I'm getting an error while trying to run the service locally.
Please advise; this is also my first attempt at running code in Docker.
$ make docker-build-run
docker build -t local-summary -f Dockerfile ./
ERRO[0000] failed to dial gRPC: cannot connect to the Docker daemon. Is 'docker daemon' running on this host?: dial unix /var/run/docker.sock: connect: permission denied
Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Post http://%2Fvar%2Frun%2Fdocker.sock/v1.40/build?buildargs=%7B%7D&cachefrom=%5B%5D&cgroupparent=&cpuperiod=0&cpuquota=0&cpusetcpus=&cpusetmems=&cpushares=0&dockerfile=Dockerfile&labels=%7B%7D&memory=0&memswap=0&networkmode=default&rm=1&session=hz77ar7alqxnbm6vqj1mgjc6s&shmsize=0&t=local-summary&target=&ulimits=null&version=1: dial unix /var/run/docker.sock: connect: permission denied
Makefile:12: recipe for target 'docker-build-run' failed
make: *** [docker-build-run] Error 1
This mainly applies if you have a running service for the actual REST API.
A nice-to-have feature would be to migrate from Flask to FastAPI. The two frameworks are very similar, so the migration should be fairly smooth. FastAPI auto-generates Swagger UI and ReDoc documentation, and the framework itself should handle async requests more efficiently than Flask.
I can't open the IP address http://54.85.20.109:5000 given in the paper.
When trying to get the service running locally, I ran into some problems and had to make a few manual changes to get past them. I thought it would be useful to report them.
Problem 1:
ImportError: No module named tkinter
Problem 2: The Ubuntu package installation got stuck at the geographic area prompt (input was not accepted):
Preconfiguring packages ...
Configuring tzdata
------------------
Please select the geographic area in which you live. Subsequent configuration
questions will narrow this down by presenting a list of cities, representing
the time zones in which they are located.
1. Africa 4. Australia 7. Atlantic 10. Pacific 13. Etc
2. America 5. Arctic 8. Europe 11. SystemV
3. Antarctica 6. Asia 9. Indian 12. US
Geographic area:
The solution was to edit the Dockerfile (added python3-tk and environment variables):
FROM ubuntu:18.04
ENV TZ=Europe/Minsk
RUN ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone
RUN apt-get update && \
apt-get install -y sudo \
tzdata \
build-essential \
curl \
libcurl4-openssl-dev \
libssl-dev \
wget \
gedit \
python3-dev \
python3-pip \
python3-tk \
libxrender-dev \
libxext6 \
libsm6 \
openssl
RUN mkdir -p /opt/service
RUN mkdir -p /opt/service/summarizer
COPY summarizer /opt/service/summarizer
COPY server.py /opt/service
COPY requirements.txt /opt/service
WORKDIR /opt/service
RUN pip3 install -r requirements.txt
CMD /bin/bash -c "python3 server.py"
With these changes I was able to build and run the image.