tagucci / pythonrouge

Python wrapper for evaluating summarization quality by ROUGE package

License: MIT License

Perl 90.40% Python 9.60%

summarization rouge natural-language-processing document-summarization python evaluation-metrics text-summarization

pythonrouge's Introduction

pythonrouge

This is a Python wrapper for ROUGE, a toolkit for evaluating summarization.

With this implementation you can evaluate various types of ROUGE metrics. You can score system summaries against reference summaries directly from Python; unlike the original ROUGE package, you do not need to write an XML settings file yourself. You can also run ROUGE in the standard way if you have saved the system summaries and reference summaries in specific directories. In document summarization research, the recall or the F-measure of the ROUGE metrics is used in most cases, so for convenience you can choose to report recall, F-measure, or both.

Any feedback or comments are welcome.

Install

You can install pythonrouge in either of two ways.

# not using pip
git clone https://github.com/tagucci/pythonrouge.git
cd pythonrouge
python setup.py install

# using pip
pip install git+https://github.com/tagucci/pythonrouge.git

Then, you can use pythonrouge.

Usage

The only thing you need in order to evaluate ROUGE scores is the paths of ROUGE-1.5.5.pl and RELEASE-1.5.5/data in this package.

from pythonrouge.pythonrouge import Pythonrouge

# system summary(predict) & reference summary
summary = [[" Tokyo is the one of the biggest city in the world."]]
reference = [[["The capital of Japan, Tokyo, is the center of Japanese economy."]]]

# initialize the ROUGE settings to evaluate ROUGE-1, ROUGE-2, and ROUGE-SU4
# if you pass the summaries as sentence lists as above, set summary_file_exist=False
# if recall_only=True, only the recall scores of ROUGE are returned
rouge = Pythonrouge(summary_file_exist=False,
                    summary=summary, reference=reference,
                    n_gram=2, ROUGE_SU4=True, ROUGE_L=False,
                    recall_only=True, stemming=True, stopwords=True,
                    word_level=True, length_limit=True, length=50,
                    use_cf=False, cf=95, scoring_formula='average',
                    resampling=True, samples=1000, favor=True, p=0.5)
score = rouge.calc_score()
print(score)

The output is shown below. In this case, only the recall scores of ROUGE are printed.

{'ROUGE-1': 0.16667, 'ROUGE-2': 0.0, 'ROUGE-SU4': 0.05}
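The nesting of the two inputs is easy to get wrong, so here is a minimal sketch of how they could be built for two documents. It assumes, based on the example above and on the reference layout described in the issue tracker, that `summary` is indexed as [document][sentence] and `reference` as [document][reference][sentence]. The helper `nesting_depth` and the sample sentences are hypothetical and not part of pythonrouge.

```python
def nesting_depth(obj):
    """Return how many list levels wrap the first string found in obj."""
    depth = 0
    while isinstance(obj, list):
        if not obj:
            raise ValueError("empty list: cannot determine nesting depth")
        obj = obj[0]
        depth += 1
    return depth

# summary[document][sentence]
summary = [
    ["Tokyo is a big city.", "It is in Japan."],   # document A, two sentences
    ["Kyoto has many temples."],                    # document B, one sentence
]

# reference[document][reference][sentence]
reference = [
    [["Tokyo is the capital of Japan."], ["Tokyo is a huge city."]],  # two references for A
    [["Kyoto is famous for its temples."]],                           # one reference for B
]

assert nesting_depth(summary) == 2
assert nesting_depth(reference) == 3
assert len(summary) == len(reference)  # one entry per document on both sides
```

Checking the depth before calling the wrapper is a cheap way to catch a missing or extra pair of brackets.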

You can also run ROUGE in the standard way, from files. In this case, the system and reference summary directories should be laid out as below.

# Directory format sample
1 system summary and 4 reference summaries.
- system summary
./summary_path/summaryA.txt

- reference summary
./reference_path/summaryA.1.txt
./reference_path/summaryA.2.txt
./reference_path/summaryA.3.txt
./reference_path/summaryA.4.txt

The file names of the reference summaries should share the name of the system summary.
In this case, the system file is "summaryA.txt", so the reference file names should contain "summaryA".

# Name Rule
- system summary
{NAME}.txt

- reference summary
{NAME}.{SUMMARY_ID}.txt

In the system and reference summaries, {NAME} should be the same, as in the sample above.
If there are 4 gold summaries, {SUMMARY_ID} takes the values [1, 2, 3, 4].
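As a rough illustration of the name rule, the sketch below pairs each system file with its reference files. It is not how pythonrouge itself matches files; `pair_by_name` is a hypothetical helper, and it assumes that {SUMMARY_ID} is a plain integer as in the sample.

```python
import os
import re
import tempfile

def pair_by_name(summary_dir, reference_dir):
    """Map each {NAME}.txt to the sorted list of its {NAME}.{SUMMARY_ID}.txt files."""
    pairs = {}
    for fname in sorted(os.listdir(summary_dir)):
        name, ext = os.path.splitext(fname)
        if ext != ".txt":
            continue
        # Assumption: SUMMARY_ID is made of digits only.
        pattern = re.compile(re.escape(name) + r"\.(\d+)\.txt$")
        refs = [r for r in sorted(os.listdir(reference_dir)) if pattern.match(r)]
        pairs[fname] = refs
    return pairs

# Demo on a throwaway directory tree that mirrors the sample layout.
with tempfile.TemporaryDirectory() as root:
    s_dir = os.path.join(root, "summary_path")
    r_dir = os.path.join(root, "reference_path")
    os.makedirs(s_dir)
    os.makedirs(r_dir)
    open(os.path.join(s_dir, "summaryA.txt"), "w").close()
    for i in (1, 2, 3, 4):
        open(os.path.join(r_dir, "summaryA.%d.txt" % i), "w").close()
    demo_pairs = pair_by_name(s_dir, r_dir)

print(demo_pairs)
```

A system file that maps to an empty list has no matching references, which is worth checking before running an evaluation.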

After placing the system/reference files as above, you can evaluate ROUGE metrics as below.

from pythonrouge.pythonrouge import Pythonrouge

# initialize the ROUGE settings to evaluate ROUGE-1, ROUGE-2, and ROUGE-SU4
# if summary_file_exist=True, you should specify the system summary (peer_path) and reference summary (model_path) directories
summary = './summary_path/'
reference = './reference_path/'
rouge = Pythonrouge(summary_file_exist=True,
                    peer_path=summary, model_path=reference,
                    n_gram=2, ROUGE_SU4=True, ROUGE_L=False,
                    recall_only=True,
                    stemming=True, stopwords=True,
                    word_level=True, length_limit=True, length=50,
                    use_cf=False, cf=95, scoring_formula='average',
                    resampling=True, samples=1000, favor=True, p=0.5)
score = rouge.calc_score()
print(score)

Error Handling

If you encounter the following error message when you use pythonrouge

Cannot open exception db file for reading: /home/pythonrouge/pythonrouge/RELEASE-1.5.5/data/WordNet-2.0.exc.db

you can fix it by rebuilding the exception database as follows.

# move to pythonrouge dir you've installed
cd pythonrouge/RELEASE-1.5.5/data/
rm WordNet-2.0.exc.db # only if exist
cd WordNet-2.0-Exceptions
rm WordNet-2.0.exc.db # only if exist
./buildExeptionDB.pl . exc WordNet-2.0.exc.db
cd ../
ln -s WordNet-2.0-Exceptions/WordNet-2.0.exc.db WordNet-2.0.exc.db

pythonrouge's Issues

Reference summary list structure not accessed correctly in for loop

In lines https://github.com/tagucci/pythonrouge/blob/master/pythonrouge/pythonrouge.py#L159-L165, the for loop iterates over references before individual summaries. You have:

for j, ref in enumerate(self.reference):
    for k, doc in enumerate(ref):

Since the reference list is structured as

reference = [
[[summaryA_ref1_sent1, summaryA_ref1_sent2], [summaryA_ref2_sent1,  summaryA_ref2_sent2]], 
[[summaryB_ref1_sent1, summaryB_ref1_sent2], [summaryB_ref2_sent1, summaryB_ref2_sent2]]
]

your for loop should iterate over summaries first, like this:

for k, doc in enumerate(self.reference):
    for j, ref in enumerate(doc):
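To make the proposed order concrete, here is a small self-contained sketch that walks the structure above with the document index outermost. The placeholder strings stand in for real sentences, and the `visited` list exists only to show the traversal order.

```python
reference = [
    [["A_ref1_sent1", "A_ref1_sent2"], ["A_ref2_sent1", "A_ref2_sent2"]],
    [["B_ref1_sent1", "B_ref1_sent2"], ["B_ref2_sent1", "B_ref2_sent2"]],
]

visited = []
for k, doc in enumerate(reference):      # k: which summary/document
    for j, ref in enumerate(doc):        # j: which reference for that document
        for sent in ref:                 # sentences of one reference
            visited.append((k, j, sent))

print(visited[0])   # (0, 0, 'A_ref1_sent1')
print(visited[-1])  # (1, 1, 'B_ref2_sent2')
```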

ROUGE-W results not right?

Hi,

I have a problem when using ROUGE-W.
Here is my setting:

summ = ['a b c d e']
rouge = Pythonrouge(n_gram=1, ROUGE_SU4=False, ROUGE_L=False, ROUGE_W=True, \
                    stemming=False, stopwords=False, word_level=True, \
                    length_limit=False, use_cf=False, cf=95, scoring_formula="average", \
                     resampling=False, samples=1, favor=True, p=0.5)
setting_file = rouge.setting(files=False, summary=[summ], reference=[[summ]])
scores = rouge.eval_rouge(setting_file, ROUGE_path=ROUGE_path, data_path=data_path)

I tried the case where the summary and the reference are identical. I expected the ROUGE-W results to be 1.0, but I got the results below:

{'ROUGE-W-1.2-R': 0.72478, 'ROUGE-W-1.2-P': 1.0, 'ROUGE-1-F': 1.0, 'ROUGE-W-1.2-F': 0.84043, 'ROUGE-1-R': 1.0, 'ROUGE-1-P': 1.0}

I don't know why the results are not 1.0. Did I do something wrong with the settings?
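One numerical observation may help here, although it does not confirm how ROUGE-1.5.5 computes the score internally. For a 5-token text and the weight 1.2 that appears in the metric name, the reported recall equals 5 / 5^1.2, and the reported F-score is the harmonic mean of that recall and a precision of 1.0. The interpretation that the recall denominator is weighted while the numerator is not is an assumption; only the arithmetic below is certain.

```python
n_tokens = 5      # 'a b c d e'
weight = 1.2      # the "1.2" in the metric name ROUGE-W-1.2

recall = n_tokens / n_tokens ** weight     # 5 / 5**1.2 == 5**-0.2
precision = 1.0                            # as reported in the output above
f_score = 2 * precision * recall / (precision + recall)

print(round(recall, 5))   # 0.72478
print(round(f_score, 5))  # 0.84043
```

If this reading is right, the recall for identical texts would depend on their length rather than being a sign of wrong settings.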

Error in running example.py

Hi
I ran example.py and got this error:

evaluate sumamry & reference in these dir
summary: ./sample/summary/
reference: ./sample/reference/

All metric
Traceback (most recent call last):
File "example.py", line 16, in <module>
setting_file = rouge.setting(files=True, summary_path=summary_dir, reference_path=reference_dir)
File "/home/yingwenhao/project/pythonrouge/pythonrouge/pythonrouge.py", line 107, in setting
file_name = os.path.splitext(os.path.basename(path))[0]
UnboundLocalError: local variable 'path' referenced before assignment

I checked the code, and I guess that something is missing. How should I deal with this problem? Thanks.

eval_rouge

In the name of Allah
Hello,
Thank you for adding Japanese summary support.
Previously I used the 'eval_rouge' method, but in the new code the 'eval_rouge' method doesn't exist.
If I want to use the new code, must I change my previous code?
Thank you very much.

I have the following problem

Hi guys,
Sorry, this might be a dumb question, but I'm still new to Python and GitHub.

When trying the example code, I get the following error:
File "C:\Users...\Anaconda3\lib\tempfile.py", line 368, in mkdtemp
_os.mkdir(file, 0o700)
FileNotFoundError: [WinError 3] The system cannot find the path specified: '/tmp/tmpdrthhe2i'

How can I solve that?
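The path '/tmp/tmpdrthhe2i' in the message suggests that the temporary directory was requested under a Unix-style '/tmp', which usually does not exist on Windows. Whether that location comes from pythonrouge itself or from the environment is not visible in the traceback, so the following is only a sketch of the portable pattern from the standard library, not a patch for the package.

```python
import os
import shutil
import tempfile

base = tempfile.gettempdir()      # platform-appropriate base directory
work_dir = tempfile.mkdtemp()     # created under that base, no hard-coded '/tmp'
created_ok = os.path.isdir(work_dir)
print(base, created_ok)

shutil.rmtree(work_dir)           # clean up the demo directory
```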

Is ROUGE toolkit required?

Hi,

Thanks for the useful repo!
ROUGE metrics are quite new to me and I haven't installed the ROUGE toolkit. Does the repo require ROUGE to calculate these metrics? If so, could you please show us how it should be installed (e.g. versions and options) so that pythonrouge works properly?
Many thanks for the help!

japanese & summary_file_exist=True

In the name of Allah
Hello,
Could you please give an example for Japanese with summary_file_exist=True?
Thank you very much.

Non-zero exit status 79

Hi,

I tried the example code on OSX 10.13.2, with both Python 3.6 and Python 2.7, and get the error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pythonrouge.py", line 334, in calc_score
    output = subprocess.check_output(rouge_cmd, stderr=subprocess.STDOUT)
  File "/usr/local/Cellar/python/2.7.13/Frameworks/Python.framework/Versions/2.7/lib/python2.7/subprocess.py", line 219, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['perl', '/Users/xxx/bin/pythonrouge/pythonrouge/RELEASE-1.5.5/ROUGE-1.5.5.pl', '-e', '/Users/xxx/bin/pythonrouge/pythonrouge/RELEASE-1.5.5/data', '-a', '-n', '2', '-2', '4', '-u', '-x', '-l', '50', '-m', '-s', '-f', 'A', '-r', '1000', '-p', '0.5', '/tmp/tmpKCWgSB/setting.xml']' returned non-zero exit status 79

I cannot find a solution online. Could you please help me? Thanks!
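The exit status alone says little, but because the call passes stderr=subprocess.STDOUT, the message printed by the Perl script is attached to the exception as `output`. A minimal sketch for reading it is shown below; `run_and_report` is a hypothetical helper, and the demo command is a stand-in child process rather than the real ROUGE call.

```python
import subprocess
import sys

def run_and_report(cmd):
    """Run cmd; on failure return (exit status, captured output) instead of raising."""
    try:
        out = subprocess.check_output(cmd, stderr=subprocess.STDOUT)
        return 0, out.decode("utf-8", errors="replace")
    except subprocess.CalledProcessError as e:
        return e.returncode, e.output.decode("utf-8", errors="replace")

# Stand-in for the failing perl call: a child process that prints a message and exits with 79.
demo_cmd = [sys.executable, "-c",
            "import sys; print('demo failure message'); sys.exit(79)"]
status, text = run_and_report(demo_cmd)
print(status)
print(text)
```

Wrapping `rouge.calc_score()` in a similar try/except and printing `e.output` should reveal the underlying Perl error, such as a missing module or an unreadable data file.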

CalledProcessError

Hi,

I'm getting this error when trying to evaluate on my dev set. Does anybody have an idea why this happens, and how to fix it?

Thanks!

Traceback (most recent call last):
File "train.py", line 149, in <module>
rouge1, rouge2, rougel = rouge.get_rouge(dev_preds, dev_gold, use_cf=False)
File "/disk1/sajad/private/summarize-radiology-findings-normal/utils/rouge.py", line 29, in get_rouge
score = rouge.calc_score()
File "/home/sajad/anaconda3/lib/python3.6/site-packages/pythonrouge/pythonrouge.py", line 382, in calc_score
output = subprocess.check_output(rouge_cmd, stderr=subprocess.STDOUT)
File "/home/sajad/anaconda3/lib/python3.6/subprocess.py", line 336, in check_output
**kwargs).stdout
File "/home/sajad/anaconda3/lib/python3.6/subprocess.py", line 418, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['perl', '/home/sajad/anaconda3/lib/python3.6/site-packages/pythonrouge/RELEASE-1.5.5/ROUGE-1.5.5.pl', '-e', '/home/sajad/anaconda3/lib/python3.6/site-packages/pythonrouge/RELEASE-1.5.5/data', '-a', '-n', '2', '-f', 'A', '-r', '1000', '-p', '0.5', '/tmp/tmp0dq087zw/setting.xml']' returned non-zero exit status 2.

returned non-zero exit status 255

from pythonrouge.pythonrouge import Pythonrouge

summary = [[" Tokyo is the one of the biggest city in the world."]]
reference = [[["The capital of Japan, Tokyo, is the center of Japanese economy."]]]

rouge = Pythonrouge(summary_file_exist=False,
                    summary=summary, reference=reference,
                    n_gram=2, ROUGE_SU4=True, ROUGE_L=False,
                    recall_only=True, stemming=True, stopwords=True,
                    word_level=True, length_limit=True, length=50,
                    use_cf=False, cf=95, scoring_formula='average',
                    resampling=True, samples=1000, favor=True, p=0.5)
score = rouge.calc_score()
print(score)

The results:

Traceback (most recent call last):
File "", line 1, in
File "/home/anaconda2/lib/python2.7/site-packages/pythonrouge/pythonrouge.py", line 335, in calc_score
output = subprocess.check_output(rouge_cmd, stderr=subprocess.STDOUT)
File "/home/zhc415/anaconda2/lib/python2.7/subprocess.py", line 574, in check_output
raise CalledProcessError(retcode, cmd, output=output)
subprocess.CalledProcessError: Command '['perl', '/home/anaconda2/lib/python2.7/site-packages/pythonrouge/RELEASE-1.5.5/ROUGE-1.5.5.pl', '-e', '/home/zhc415/anaconda2/lib/python2.7/site-packages/pythonrouge/RELEASE-1.5.5/data', '-a', '-n', '2', '-2', '4', '-u', '-x', '-l', '50', '-m', '-s', '-f', 'A', '-r', '1000', '-p', '0.5', './tmpP1dTjG/setting.xml']' returned non-zero exit status 255

how to test rouge for a file

Hi,

Thanks for your script; it works very well for one sentence. I have a question: if I have a file of system sentences and a file of model sentences, with each system sentence corresponding to one model sentence, how can I evaluate them with this great package?
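One possible approach, sketched under the assumption that the two files are line-aligned (line i of the system file corresponds to line i of the model file), is to read both files and build the nested lists used in the README example. `load_line_aligned` and the file names are hypothetical.

```python
import os
import tempfile

def load_line_aligned(system_file, model_file):
    """Read two line-aligned text files into README-style nested lists."""
    with open(system_file, encoding="utf-8") as f:
        system_lines = f.read().splitlines()
    with open(model_file, encoding="utf-8") as f:
        model_lines = f.read().splitlines()
    if len(system_lines) != len(model_lines):
        raise ValueError("files are not line-aligned: %d vs %d lines"
                         % (len(system_lines), len(model_lines)))
    summary = [[s] for s in system_lines]        # summary[document][sentence]
    reference = [[[m]] for m in model_lines]     # reference[document][reference][sentence]
    return summary, reference

# Demo with two throwaway files.
with tempfile.TemporaryDirectory() as d:
    sys_path = os.path.join(d, "system.txt")
    ref_path = os.path.join(d, "model.txt")
    with open(sys_path, "w", encoding="utf-8") as f:
        f.write("Tokyo is a big city.\nKyoto has many temples.\n")
    with open(ref_path, "w", encoding="utf-8") as f:
        f.write("Tokyo is the capital of Japan.\nKyoto is famous for temples.\n")
    summary, reference = load_line_aligned(sys_path, ref_path)

print(summary)
print(reference)
```

The resulting lists could then be passed as summary= and reference= with summary_file_exist=False, as in the README example.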

local variable 'path' referenced before assignment

Hi, I ran into a problem when trying to run ROUGE in the standard way, from files.

I ran the code as in the README and defined summary_dir & reference_dir; the error happens when creating the setting file:

setting_file = rouge.setting(files=True, summary_path=summary_dir, reference_path=reference_dir)
Traceback (most recent call last):
File "", line 1, in
File "pythonrouge/pythonrouge.py", line 107, in setting
file_name = os.path.splitext(os.path.basename(path))[0]
UnboundLocalError: local variable 'path' referenced before assignment

Hoping for help, thanks!

getting error for output = subprocess.check_output([ROUGE_path, "-e", data_path, "-a", "-m", "-2", "4","-n", "3", abs_xml_path], stderr=subprocess.STDOUT)

Traceback (most recent call last):

File "", line 1, in
score = pythonrouge.pythonrouge(peer_sentence, model_sentence,ROUGE,data_path)

File "pythonrouge\pythonrouge.py", line 65, in pythonrouge
output = subprocess.check_output([sys.executable, ROUGE_path, "-e", data_path, "-a", "-m", "-2", "4","-n", "3", abs_xml_path], stderr=subprocess.STDOUT)

File "C:\Users\rajesh\Anaconda\lib\subprocess.py", line 566, in check_output
process = Popen(stdout=PIPE, *popenargs, **kwargs)

File "C:\Users\rajesh\Anaconda\lib\subprocess.py", line 710, in __init__
errread, errwrite)

File "C:\Users\rajesh\Anaconda\lib\subprocess.py", line 958, in _execute_child
startupinfo)

WindowsError: [Error 193] %1 is not a valid Win32 application
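Error 193 is what Windows reports when it is asked to execute a file that is not a Windows executable, which is the likely outcome of launching a .pl script directly. Other tracebacks on this page show the command starting with 'perl', so one hedged workaround is to put the Perl interpreter first. In the sketch below the flags are copied from the call quoted in this issue; `build_rouge_cmd` and the placeholder paths are hypothetical.

```python
import shutil

def build_rouge_cmd(rouge_path, data_path, xml_path, perl="perl"):
    """Build the ROUGE command with an explicit Perl interpreter in front."""
    # Flags copied from the call quoted in this issue; only the interpreter is new.
    return [perl, rouge_path, "-e", data_path,
            "-a", "-m", "-2", "4", "-n", "3", xml_path]

perl_exe = shutil.which("perl")   # None when perl is not on PATH
cmd = build_rouge_cmd("ROUGE-1.5.5.pl", "data", "rouge.xml",
                      perl=perl_exe or "perl")
print(cmd)
```

If `shutil.which("perl")` returns None, Perl is probably not installed or not on PATH, and the call would fail for that reason instead.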

number of [] in summary & references

In the name of God
Hello,
I ran the following code and it gave me the result: {'ROUGE-SU4': 0.0, 'ROUGE-L': 0.0, 'ROUGE-2': 0.0, 'ROUGE-1': 0.0}

from pythonrouge.pythonrouge import Pythonrouge

#/media/aliasghar/01CDBF1969EE87A0/MySoftWare/UniversitySoftWare/Term4/TS/Evaluate/pythonrouge-master/pythonrouge-master/pythonrouge
ROUGE_path = '/media/aliasghar/01CDBF1969EE87A0/MySoftWare/UniversitySoftWare/Term4/TS/Evaluate/pythonrouge-master/pythonrouge-master/pythonrouge/RELEASE-1.5.5/ROUGE-1.5.5.pl' #ROUGE-1.5.5.pl
data_path = '/media/aliasghar/01CDBF1969EE87A0/MySoftWare/UniversitySoftWare/Term4/TS/Evaluate/pythonrouge-master/pythonrouge-master/pythonrouge/RELEASE-1.5.5/data' #data folder in RELEASE-1.5.5

# initialize setting of ROUGE, eval ROUGE-1, 2, SU4, L
rouge = Pythonrouge(n_gram=2, ROUGE_SU4=True, ROUGE_L=True, stemming=True, stopwords=True, word_level=True, length_limit=True, length=50, use_cf=False, cf=95, scoring_formula="average", resampling=True, samples=1000, favor=True, p=0.5)

# system summary & reference summary
summary = ["This is a sample"]
reference = ["This is a sample"]
# If you evaluate ROUGE by sentence list as above, set files=False
setting_file = rouge.setting(files=False, summary=summary, reference=reference)

# If you need only recall of ROUGE metrics, set recall_only=True
result = rouge.eval_rouge(setting_file, recall_only=True, ROUGE_path=ROUGE_path, data_path=data_path)
print(result)

Why do I have to put extra [] around summary and reference?
thanks

The input dimension of using pure code summary & reference.

I want to input the summary and reference like the example

summary = [[" Tokyo is the one of the biggest city in the world."]]
reference = [[["The capital of Japan, Tokyo, is the center of Japanese economy."]]]

But I was confused by the dimensions of the lists here. When I tried changing the dimensions, the results were very strange.
So, can anyone tell me what each dimension means? If I have several sentences and references, how should I change summary and reference?

set ROUGE-1.5.5.pl and RELEASE-1.5.5/data

In the name of Allah
Hello,

The only things you need to evaluate ROUGE score is to specify the paths of ROUGE-1.5.5.pl and RELEASE-1.5.5/data in this package.

In the new version, how do I set ROUGE-1.5.5.pl and RELEASE-1.5.5/data?
Thank you

I am wondering the progress of supporting Chinese

Oh, it is so nice to see this repo.

I am wondering whether the original ROUGE Perl scripts can support Chinese or other non-ASCII characters, or what I can do to make them support it.

I'm surprised to see so little information about ROUGE for Unicode. It makes text summarization feel like self-deception, because the most important evaluation is so poorly supported.

Illegal division by zero at /root/RELEASE-1.5.5/ROUGE-1.5.5.pl line 2450

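I successfully installed pyrouge by following https://stackoverflow.com/questions/45894212/installing-pyrouge-gets-error-in-ubuntu, and then used cpanm to install the missing Perl modules. python3 -m pyrouge.test runs successfully without errors. The "non-zero exit status 2" error was resolved by installing the missing Perl modules. However, another error now appears.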

total number of parameters: 83289940 [76/1916]

[========================================= 10000/10000 ===============================>] Step: 222ms | Tot: 36m58s
epoch: 1, loss: 55443.712, time: 2220.975, updates: 10000, accuracy: 24.77
evaluating after 10000 updates...
Illegal division by zero at /root/RELEASE-1.5.5/ROUGE-1.5.5.pl line 2450................] Step: 826ms | Tot: 0ms
Traceback (most recent call last):
File "train.py", line 332, in
main()
File "train.py", line 324, in main
train_model(model, data, optim, i, params)
File "train.py", line 179, in train_model
score = eval_model(model, data, params)
File "train.py", line 252, in eval_model
score[metric] = getattr(utils, metric)(reference, candidate, params['log_path'], params['log'], config)
File "/root/Learning-Board/text_summarization/Global-Encoding/utils/metrics.py", line 59, in rouge
rouge_results = r.convert_and_evaluate()
File "/usr/local/lib/python3.5/dist-packages/pyrouge-0.1.3-py3.5.egg/pyrouge/Rouge155.py", line 367, in convert_and_evaluate
rouge_output = self.evaluate(system_id, rouge_args)
File "/usr/local/lib/python3.5/dist-packages/pyrouge-0.1.3-py3.5.egg/pyrouge/Rouge155.py", line 342, in evaluate
rouge_output = check_output(command, env=env).decode("UTF-8")
File "/usr/lib/python3.5/subprocess.py", line 626, in check_output
**kwargs).stdout
File "/usr/lib/python3.5/subprocess.py", line 708, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/root/RELEASE-1.5.5/ROUGE-1.5.5.pl', '-e', '/root/RELEASE-1.5.5/data', '-c', '95', '-2', '-1', '-U', '-r', '1000', '-n', '4', '-w', '1.2', '-a', '-m', '/tmp/tmpnf3k9ztc/rouge_conf.xml']' returned non-zero exit status 255

When I run /root/RELEASE-1.5.5/ROUGE-1.5.5.pl -e /root/RELEASE-1.5.5/data -c 95 -2 -1 -U -r 1000 -n 4 -w 1.2 -a -m /tmp/tmpnf3k9ztc/rouge_conf.xml on its own, I get:
1 ROUGE-1 Average_R: 0.32286 (95%-conf.int. 0.31207 - 0.33280)
1 ROUGE-1 Average_P: 0.35237 (95%-conf.int. 0.34137 - 0.36314)
1 ROUGE-1 Average_F: 0.32779 (95%-conf.int. 0.31737 - 0.33765)

1 ROUGE-2 Average_R: 0.13193 (95%-conf.int. 0.12332 - 0.14033)
1 ROUGE-2 Average_P: 0.14852 (95%-conf.int. 0.13927 - 0.15799)
1 ROUGE-2 Average_F: 0.13512 (95%-conf.int. 0.12635 - 0.14339)

1 ROUGE-3 Average_R: 0.05436 (95%-conf.int. 0.04784 - 0.06078)
1 ROUGE-3 Average_P: 0.06431 (95%-conf.int. 0.05691 - 0.07159)
1 ROUGE-3 Average_F: 0.05681 (95%-conf.int. 0.05020 - 0.06328)

1 ROUGE-4 Average_R: 0.02492 (95%-conf.int. 0.01983 - 0.03013)
1 ROUGE-4 Average_P: 0.03005 (95%-conf.int. 0.02413 - 0.03571)
1 ROUGE-4 Average_F: 0.02609 (95%-conf.int. 0.02087 - 0.03111)

1 ROUGE-L Average_R: 0.30569 (95%-conf.int. 0.29490 - 0.31526)
1 ROUGE-L Average_P: 0.33381 (95%-conf.int. 0.32302 - 0.34479)
1 ROUGE-L Average_F: 0.31038 (95%-conf.int. 0.30036 - 0.31989)
Illegal division by zero at /root/RELEASE-1.5.5/ROUGE-1.5.5.pl line 2450.

Cannot report precision

I found that precision is missing and only F is reported. I hope precision can be returned as well.
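The raw ROUGE output does contain precision lines (Average_P), as the output quoted elsewhere on this page shows. A hedged sketch for extracting recall, precision, and F from such text is given below. `parse_rouge_output` is a hypothetical helper, not a pythonrouge function, and the key format 'ROUGE-1-P' merely imitates the keys seen in results on this page.

```python
import re

# Matches lines such as: "1 ROUGE-1 Average_R: 0.32286 (95%-conf.int. ...)"
LINE_RE = re.compile(r"^\S+\s+(ROUGE-\S+)\s+Average_([RPF]):\s+([0-9.]+)")

def parse_rouge_output(text):
    """Collect every Average_R/P/F value into a dict keyed like 'ROUGE-1-P'."""
    scores = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            metric, kind, value = m.groups()
            scores["%s-%s" % (metric, kind)] = float(value)
    return scores

sample = """\
1 ROUGE-1 Average_R: 0.32286 (95%-conf.int. 0.31207 - 0.33280)
1 ROUGE-1 Average_P: 0.35237 (95%-conf.int. 0.34137 - 0.36314)
1 ROUGE-1 Average_F: 0.32779 (95%-conf.int. 0.31737 - 0.33765)
"""
scores = parse_rouge_output(sample)
print(scores["ROUGE-1-P"])  # 0.35237
```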

Find error when i run the example

from pythonrouge import pythonrouge

ROUGE = '~/pythonrouge/pythonrouge/RELEASE-1.5.5/ROUGE-1.5.5.pl'

data_path = '~/pythonrouge/pythonrouge/RELEASE-1.5.5/data' #data folder in RELEASE-1.5.5

peer = " Tokyo is the one of the biggest city in the world."

model = "The capital of Japan, Tokyo, is the center of Japanese economy."

score = pythonrouge.pythonrouge(peer, model,"~~/pythonrouge/pythonrouge/RELEASE-1.5.5/ROUGE-1.5.5.pl", "~~/pythonrouge/pythonrouge/RELEASE-1.5.5/data")

print(score)

OSError Traceback (most recent call last)
in ()
4 peer = " Tokyo is the one of the biggest city in the world."
5 model = "The capital of Japan, Tokyo, is the center of Japanese economy."
----> 6 score = pythonrouge.pythonrouge(peer, model,"~~/pythonrouge/pythonrouge/RELEASE-1.5.5/ROUGE-1.5.5.pl", "~~/pythonrouge/pythonrouge/RELEASE-1.5.5/data")
7 print(score)

/afs/inf.ed.ac.uk/user/s15/s1537328/MSc/env/lib/python2.7/site-packages/pythonrouge/pythonrouge.pyc in pythonrouge(peer_sentence, model_sentence, ROUGE_path, data_path)
61
62 abs_xml_path = str(temp_dir+"/"+xml_path)
---> 63 output = subprocess.check_output([ROUGE_path, "-e", data_path, "-a", "-m", "-2", "4","-n", "3", abs_xml_path], stderr=subprocess.STDOUT)
64 output = output.decode("utf-8")
65 outputs = output.strip().split("\n")

/usr/lib64/python2.7/subprocess.pyc in check_output(*popenargs, **kwargs)
566 if 'stdout' in kwargs:
567 raise ValueError('stdout argument not allowed, it will be overridden.')
--> 568 process = Popen(stdout=PIPE, *popenargs, **kwargs)
569 output, unused_err = process.communicate()
570 retcode = process.poll()

/usr/lib64/python2.7/subprocess.pyc in __init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags)
709 p2cread, p2cwrite,
710 c2pread, c2pwrite,
--> 711 errread, errwrite)
712 except Exception:
713 # Preserve original exception in case os.close raises.

/usr/lib64/python2.7/subprocess.pyc in _execute_child(self, args, executable, preexec_fn, close_fds, cwd, env, universal_newlines, startupinfo, creationflags, shell, to_close, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite)
1325 raise
1326 child_exception = pickle.loads(data)
-> 1327 raise child_exception
1328
1329

OSError: [Errno 2] No such file or directory
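A likely cause is that the paths are passed unexpanded: subprocess does not go through a shell here, so '~' is not replaced by the home directory, and '~~/' is not a home-directory shortcut at all. A minimal sketch for expanding and checking the paths before the call follows; `resolve_rouge_paths` is a hypothetical helper.

```python
import os

def resolve_rouge_paths(rouge_path, data_path):
    """Expand '~', make both paths absolute, and list anything that is missing."""
    rouge_path = os.path.abspath(os.path.expanduser(rouge_path))
    data_path = os.path.abspath(os.path.expanduser(data_path))
    problems = []
    if not os.path.isfile(rouge_path):
        problems.append("ROUGE script not found: " + rouge_path)
    if not os.path.isdir(data_path):
        problems.append("data directory not found: " + data_path)
    return rouge_path, data_path, problems

rouge, data, problems = resolve_rouge_paths(
    "~/pythonrouge/pythonrouge/RELEASE-1.5.5/ROUGE-1.5.5.pl",
    "~/pythonrouge/pythonrouge/RELEASE-1.5.5/data",
)
print(rouge)
print(problems)   # empty only if both paths exist on this machine
```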

Results on CNN/DailyMail

Hi, I'm using this repository to evaluate LEAD-3 on CNN/DailyMail corpus (following https://github.com/tagucci/cnn-dailymail)

However, although my ROUGE-1 and ROUGE-2 results match yours, my ROUGE-L is not consistent with your results:

{'ROUGE-1-P': 0.3359, 'ROUGE-1': 0.4024, 'ROUGE-2-P': 0.14784, 'ROUGE-2': 0.17705, 'ROUGE-L-P': 0.24819, 'ROUGE-L': 0.29823}

Any idea?

Regards :)

utf8 & pythonrouge

In the name of GOD
Hello,
Does this program support the Persian language?
Thank you

only ROUGE-2

In the name of GOD
Hello,
I need only ROUGE-2, but the following code returns {'ROUGE-2': 0.84615, 'ROUGE-1': 0.93333}.

from pythonrouge.pythonrouge import Pythonrouge

#/media/aliasghar/01CDBF1969EE87A0/MySoftWare/UniversitySoftWare/Term4/TS/Evaluate/pythonrouge-master/pythonrouge-master/pythonrouge
ROUGE_path = '/media/aliasghar/01CDBF1969EE87A0/MySoftWare/UniversitySoftWare/Term4/TS/Evaluate/pythonrouge-master/pythonrouge-master/pythonrouge/RELEASE-1.5.5/ROUGE-1.5.5.pl' #ROUGE-1.5.5.pl
data_path = '/media/aliasghar/01CDBF1969EE87A0/MySoftWare/UniversitySoftWare/Term4/TS/Evaluate/pythonrouge-master/pythonrouge-master/pythonrouge/RELEASE-1.5.5/data' #data folder in RELEASE-1.5.5

# initialize setting of ROUGE, eval ROUGE-1, 2, SU4, L
rouge = Pythonrouge(n_gram=2,  ROUGE_SU4=False, ROUGE_L=False, stemming=True, stopwords=True, word_level=True, length_limit=True, length=50, use_cf=False, cf=95, scoring_formula="average", resampling=True, samples=1000, favor=True, p=0.5)

# system summary & reference summary
summary = [["Great location, very good selection of food for breakfast buffet."]]
reference = [[["Great location, very good hjk selection of food for breakfast buffet."],["Great location, very good selection of food for breakfast buffet."]]]
#
# If you evaluate ROUGE by sentence list as above, set files=False
setting_file = rouge.setting(files=False, summary=summary, reference=reference)

# If you need only recall of ROUGE metrics, set recall_only=True
result = rouge.eval_rouge(setting_file, recall_only=True, ROUGE_path=ROUGE_path, data_path=data_path)
print(result)

Thank you
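Judging from the output above, n_gram=2 yields both ROUGE-1 and ROUGE-2. Whether the wrapper offers a switch to suppress ROUGE-1 is not stated on this page, so a simple workaround is to filter the returned dictionary. The sketch below uses the result quoted in this issue.

```python
result = {'ROUGE-2': 0.84615, 'ROUGE-1': 0.93333}

# Keep only the keys that belong to ROUGE-2 (also covers 'ROUGE-2-P', 'ROUGE-2-F', ...).
only_rouge2 = {k: v for k, v in result.items() if k.startswith('ROUGE-2')}
print(only_rouge2)  # {'ROUGE-2': 0.84615}
```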

CalledProcessError

I tried to run the same code as shown in the repo's introduction:

from pythonrouge.pythonrouge import Pythonrouge

summary = [[" Tokyo is the one of the biggest city in the world."]]
reference = [[["The capital of Japan, Tokyo, is the center of Japanese economy."]]]

rouge = Pythonrouge(summary_file_exist=False,
                    summary=summary, reference=reference,
                    n_gram=2, ROUGE_SU4=True, ROUGE_L=False,
                    recall_only=True, stemming=True, stopwords=True,
                    word_level=True, length_limit=True, length=50,
                    use_cf=False, cf=95, scoring_formula='average',
                    resampling=True, samples=1000, favor=True, p=0.5)

score = rouge.calc_score()

But I still encounter the following error. I tried the previously suggested solution, but it didn't help.

Traceback (most recent call last):
File "", line 1, in
File "/home/tanpengshi/anaconda3/envs/summary/lib/python3.6/site-packages/pythonrouge/pythonrouge.py", line 334, in calc_score
output = subprocess.check_output(rouge_cmd, stderr=subprocess.STDOUT)
File "/home/tanpengshi/anaconda3/envs/summary/lib/python3.6/subprocess.py", line 336, in check_output
**kwargs).stdout
File "/home/tanpengshi/anaconda3/envs/summary/lib/python3.6/subprocess.py", line 418, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['perl', '/home/tanpengshi/anaconda3/envs/summary/lib/python3.6/site-packages/pythonrouge/RELEASE-1.5.5/ROUGE-1.5.5.pl', '-e', '/home/tanpengshi/anaconda3/envs/summary/lib/python3.6/site-packages/pythonrouge/RELEASE-1.5.5/data', '-a', '-n', '2', '-2', '4', '-u', '-x', '-l', '50', '-m', '-s', '-f', 'A', '-r', '1000', '-p', '0.5', '/tmp/tmpkp6ecmzr/setting.xml']' returned non-zero exit status 2.

I am using Windows Subsystem for Linux (WSL) - Ubuntu. Kindly help, thank you!!! :)

I encountered the following problem

CalledProcessError Traceback (most recent call last)
in ()
17 use_cf=False, cf=95, scoring_formula='average',
18 resampling=True, samples=1000, favor=True, p=0.5)
---> 19 score = rouge.calc_score()
20 print('ROUGE-N(1-2) & SU4 F-measure only')
21 pprint(score)

/notebooks/pythonrouge-master/pythonrouge/pythonrouge.py in calc_score(self)
332 def calc_score(self):
333 rouge_cmd = self.set_command()
--> 334 output = subprocess.check_output(rouge_cmd, stderr=subprocess.STDOUT)
335 output = output.decode('utf-8')
336 output = output.strip().split('\n')

/usr/lib/python3.5/subprocess.py in check_output(timeout, *popenargs, **kwargs)
624
625 return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
--> 626 **kwargs).stdout
627
628

/usr/lib/python3.5/subprocess.py in run(input, timeout, check, *popenargs, **kwargs)
706 if check and retcode:
707 raise CalledProcessError(retcode, process.args,
--> 708 output=stdout, stderr=stderr)
709 return CompletedProcess(process.args, retcode, stdout, stderr)
710

CalledProcessError: Command '['perl', '/notebooks/pythonrouge-master/pythonrouge/RELEASE-1.5.5/ROUGE-1.5.5.pl', '-e', '/notebooks/pythonrouge-master/pythonrouge/RELEASE-1.5.5/data', '-a', '-n', '2', '-2', '4', '-u', '-x', '-l', '50', '-m', '-s', '-f', 'A', '-r', '1000', '-p', '0.5', '/tmp/tmpjkiz4uvr/setting.xml']' returned non-zero exit status 2

Thanks very much!

CalledProcessError : returned non-zero exit status

Line 65 in pythonrouge.py gives the following error when I run example.py:

subprocess.CalledProcessError: Command '['/home/aman/Desktop/pythonrouge-master/pythonrouge/RELEASE-1.5.5/ROUGE-1.5.5.pl', '-e', '/home/aman/Desktop/pythonrouge-master/pythonrouge/RELEASE-1.5.5/data', '-a', '-m', '-2', '4', '-n', '3', '/tmp/tmp_RW8QT/rouge.xml']' returned non-zero exit status 255

Peculiarity in computing RG-l

Hello,

It seems that something goes wrong when I compute RG-L. When I pass a list of hypotheses and references to Pythonrouge, it gives me an RG-L score as the average. But when I pass each hypothesis and reference to the package one by one and average the RG-L scores at the end, I obtain quite a different score. I am not sure what's going on.

example.py issue

Hi,
Sorry, but when I run the code I get this error.
Traceback (most recent call last):
File "C:\Users\omneya\Desktop\bachelor\pythonrouge-master\example.py", line 17, in <module>
print(rouge.eval_rouge(setting_file, ROUGE_path=ROUGE_path, data_path=data_path))
File "C:\Users\omneya\Desktop\bachelor\pythonrouge-master\pythonrouge\pythonrouge.py", line 158, in eval_rouge
output = subprocess.check_output(rouge_cmd, stderr=subprocess.STDOUT)
File "C:\Python27\lib\subprocess.py", line 219, in check_output
raise CalledProcessError(retcode, cmd, output=output)
CalledProcessError: Command '['perl', 'C:\Users\omneya\Desktop\bachelor\pythonrouge-master\pythonrouge\RELEASE-1.5.5\ROUGE-1.5.5.pl', '-e', 'C:\Users\omneya\Desktop\bachelor\pythonrouge-master\.C:\Users\omneya\Desktop\bachelor\pythonrouge-master\pythonrouge\RELEASE-1.5.5\data', '-a', '-n', '2', '-2', '4', '-u', '-l', '50', '-m', '-s', '-f', 'A', '-r', '1000', '-p', '0.5', 'c:\users\public\documents\wondershare\creatortemp\tmpgbsz5p\setting.xml']' returned non-zero exit status 2

unstable errors when evaluating

I have a loop that creates different summaries, and sometimes the following error shows up:

Traceback (most recent call last):
File "", line 22, in
File "/home/anaconda2/lib/python2.7/site-packages/pythonrouge/pythonrouge.py", line 340, in calc_score
shutil.rmtree(self.temp_dir)
File "/home/anaconda2/lib/python2.7/shutil.py", line 256, in rmtree
onerror(os.rmdir, path, sys.exc_info())
File "/home/anaconda2/lib/python2.7/shutil.py", line 254, in rmtree
os.rmdir(path)
OSError: [Errno 39] Directory not empty: './tmpNpaixY'
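The traceback shows `shutil.rmtree` failing on the final `os.rmdir`, which can happen when something still writes into the directory while it is being removed. The actual cause here is not known. As a defensive sketch only, and not the package's own code, the cleanup can be placed in a `finally` block with `ignore_errors=True` so that an intermittent failure does not abort the loop; the file name and contents are placeholders.

```python
import os
import shutil
import tempfile

work_dir = tempfile.mkdtemp()    # unique directory per evaluation run
try:
    # Placeholder for the files an evaluation run would write.
    with open(os.path.join(work_dir, "setting.xml"), "w") as f:
        f.write("placeholder\n")
finally:
    # Best-effort cleanup: leftover files no longer raise OSError.
    shutil.rmtree(work_dir, ignore_errors=True)

removed = not os.path.exists(work_dir)
print(removed)
```

The trade-off is that a directory which could not be removed is silently left behind, so occasional manual cleanup may still be needed.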

stemming , stop word In japanese

In the name of Allah
Hello,
Can I set stemming and stopwords to True for Japanese summaries?
Thank you very much.

ROUGE-SU4 is actually ROUGE-S4

Hey!

You're calling it the ROUGE-SU4 score, but it is in fact the ROUGE-S4 score.
Check: https://github.com/tagucci/pythonrouge/blob/master/pythonrouge/pythonrouge.py#L75
