metricsgrimoire / bicho
Bicho is a command line based tool used to parse bug/issue tracking systems
Home Page: http://metricsgrimoire.github.com/Bicho/
License: GNU General Public License v2.0
Important notice (2018-09-19)
-----------------------------

This project lacked active development for a long while, and it is very unlikely that it will get development attention in the future. For those interested in retrieving and processing data from issue tracking repositories (bug tracking repositories), please consider checking GrimoireLab [1] and GrimoireLab-Perceval [2].

[1] https://chaoss.github.io/grimoirelab
[2] https://github.com/chaoss/grimoirelab-perceval

Description
-----------

Bicho is a command line-based tool used to parse bug/issue tracking systems. It gets all the information associated with issues and stores it in a relational database. (It is part of the MetricsGrimoire suite, which produces data for vizGrimoire to analyze and visualize.)

Currently Bicho supports:

 - Bugzilla
 - Sourceforge.net (abandoned)
 - JIRA
 - Launchpad
 - GitHub
 - Maniphest
 - Redmine
 - Gerrit
 - Allura (unstable)
 - Google Code (abandoned)
 - Trac

License
---------

Bicho is licensed under GNU General Public License (GPL), version 2 or later.

Download
----------

Home page:
 * http://metricsgrimoire.github.com/Bicho/

Releases:
 * https://github.com/MetricsGrimoire/Bicho/downloads

Latest version:
 * git://github.com/MetricsGrimoire/Bicho.git

Requirements
------------

 * Python >= 2.4
 * Python Storm. You'll also need the following Python libraries:
   - mysqldb (default engine should be set to MYISAM)
   - python-launchpadlib (only for the Launchpad backend)
 * Beautiful Soup library: error-tolerant HTML parser for Python
 * python-feedparser
 * dateutil

Installation
-------------

You can install Bicho by running the setup.py script:

  # python setup.py install

For the impatient:

  $ bicho --help

Running Bicho
--------------

This is a quick list of example commands you would run in your terminal to get bug data from various kinds of bug trackers and capture it in a database on your local machine.
The general format is:

  $ bicho --db-user-out=[YOUR DATABASE USERNAME] --db-password-out=[YOUR DATABASE PASSWORD] --db-database-out=[NAME OF DATABASE] -d [DELAY BETWEEN REQUESTS IN NUMBER OF SECONDS] -b [ABBREVIATION FOR BACKEND] -u [BUGTRACKER URL IN QUOTES]

For more guidance, please see doc/UserManual.txt .

It is very important to use a delay. If you run Bicho against big sites without a delay between bug queries, your IP address could be banned!

E1. Getting information from a project that uses Bugzilla, like Bicho ;)

  $ bicho --db-user-out=[DB USER] --db-password-out=[DB PASS] --db-database-out=[DB NAME] -d 15 -b bg -u "https://bugzilla.libresoft.es/buglist.cgi?product=bicho"

E2. Getting information from a project hosted on sourceforge.net

  $ bicho --db-user-out=[DB USER] --db-password-out=[DB PASS] --db-database-out=[DB NAME] -d 15 -b sf -u "http://sourceforge.net/tracker/?atid=516295&group_id=66938"

E3. Getting information from a project using JIRA

  $ bicho --db-user-out=[DB USER] --db-password-out=[DB PASS] --db-database-out=[DB NAME] -d 15 -b jira -u "http://support.petalslink.com/browse/PETALSMASTER"

E4. Getting information from a project using Launchpad

  $ bicho --db-user-out=[DB USER] --db-password-out=[DB PASS] --db-database-out=[DB NAME] -d 15 -b lp -u "https://bugs.launchpad.net/openstack"

E5. Getting information from a project using Allura

  $ bicho --db-user-out=[DB USER] --db-password-out=[DB PASS] --db-database-out=[DB NAME] -d 15 -b allura -u "http://sourceforge.net/rest/p/allura/tickets"

E6. Getting information from a project using GitHub

  $ bicho --db-user-out=[DB USER] --db-password-out=[DB PASS] --db-database-out=[DB NAME] -b github -u "https://api.github.com/repos/composer/composer/issues" --backend-token=[API TOKEN]

E7. Getting information from a project using Redmine

  $ bicho --db-user-out=[DB USER] --db-password-out=[DB PASS] --db-database-out=[DB NAME] --backend-user=[REDMINE USER] --backend-password=[REDMINE PASSWORD] -d 1 -b redmine -u "https://www.bitergia.net/"

E8. Getting information from Maniphest

  $ bicho --db-user-out=[DB USER] --db-password-out=[DB PASS] --db-database-out=[DB NAME] --backend-token=[API TOKEN] -b maniphest -u https://phabricator.wikimedia.org [--no-resume]

E9. Getting information from Trac

  $ bicho --db-user-out=[DB USER] --db-password-out=[DB PASS] --db-database-out=[DB NAME] -b trac -u https://fedorahosted.org/freeipa/

E10. Getting information from Review Board

  $ bicho --db-user-out=[DB USER] --db-password-out=[DB PASS] --db-database-out=[DB NAME] -n 25 -b reviewboard -u https://reviews.apache.org/groups/geode/

Known issues
------------

Newer versions of MySQL server may raise the following error message:

```
Error Code: 1406. Data too long for column
```

To avoid this problem, update your MySQL configuration by removing the `STRICT_TRANS_TABLES` value from the `sql-mode` parameter.
Roadmap
---------

0.93:
 * The updated list of bugs to be fixed can be found here https://github.com/MetricsGrimoire/Bicho/issues?milestone=1&page=1&state=open
 * Incremental support broken by issues updated during the download bug #28
 * Incorrect order downloading issues from Bugzilla #20
 * Incoherent number of issues after webkit analysis bug bugzilla support #26
 * Error in database character sets while comparing dates #8
 * Problem cloning repo in case insensitive systems #12
 * Incremental feature doesn't support multiple projects in the same database #30

1.0:
 * https://github.com/MetricsGrimoire/Bicho/issues?milestone=2&page=1&state=open
 * issues_log for bugzilla and launchpad
 ** Launchpad support for issues_log table enhancement launchpad support #24
 ** More efficient and cleaner code for the table issues_log for bugzilla
 * New table with information about executions (date, issues downloaded, etc.)
 * Tests, tests and tests
 * Improved debug mode with more useful details
 * Network fault tolerance (in order to survive connection issues)
 * New backends:
 ** FusionForge

Improving Bicho
----------------

Source code, wiki and ITS available on GitHub:
 * https://github.com/MetricsGrimoire/Bicho

Please write to the developers mailing list at
 * metrics-grimoire _at _ lists.libresoft.es

If you want to receive updates about new versions, and keep in touch with the development team, consider subscribing to the list. It is a very low traffic list (< 1 msg a day):
 * https://lists.libresoft.es/listinfo/metrics-grimoire

Credits
--------

Bicho was originally developed by the GSyC/LibreSoft group at the Universidad Rey Juan Carlos in Mostoles, near Madrid (Spain). It is part of a wider research effort on libre software engineering, aiming to gain knowledge on how libre software is developed and maintained.

FAQ
----

F1. Bicho crashed with 'UnicodeEncodeError' exception

UnicodeEncodeError is raised when the data cannot be written to the database using the database's encoding. To avoid that, set your database to use UTF-8. For instance:

  CREATE DATABASE [DB NAME] CHARACTER SET utf8 COLLATE utf8_unicode_ci;

F2. What is the database schema?

There is a nice PNG schema in the directory /doc/database .

F3. How can I create a new backend?

Tell us through the contact information above that you want to create a new backend. We'll try to give you as much information as possible.

Whenever possible, we want to use available APIs. However, some old bugtrackers did not have APIs, and even today, some APIs don't give us all of the information that we want. So a new backend needs to be able to fall back to HTML scraping and parsing via Beautiful Soup in cases where the bug tracker's API doesn't exist (maybe an old version of Bugzilla/JIRA/etc.) or doesn't provide the data we want.

While writing a new backend, please also write tests per tests/README.md .

F4. How can I submit a bug report?

Use the GitHub issue tracker: https://github.com/MetricsGrimoire/Bicho/issues .
If we store several projects from the same ITS in the same database the incremental support doesn't work as expected. It only takes into account the last modification date stored in the changes table.
When downloading a very long list of issues, it is common for some of them to be updated during the download. The database then holds a modification date that does not follow the original ordering. If the execution crashes, the next run of Bicho resumes from the latest modification date stored, and we lose the bugs between the last one retrieved (following the order from oldest to newest) and that date.
In some cases, the URL used to retrieve Bugzilla issues is malformed. Some trackers use the "/bugzilla/" path before "show_bug.cgi", but Bicho removes it before building the URL.
Some examples are:
Bicho has some code for managing Apache's tracker, but something more generic is needed to support other repositories.
bicho --db-user-out=root --db-password-out=123456 --db-database-out=dpp -d 15 -b allura -u http://sourceforge.net/p/dpp/_list/tickets
Checking URL: http://sourceforge.net
Running Bicho with delay of 15 seconds
Traceback (most recent call last):
File "/usr/bin/bicho", line 25, in <module>
retval = Bicho.main.main()
File "/usr/lib/python2.7/site-packages/Bicho/main.py", line 56, in main
backend.run()
File "/usr/lib/python2.7/site-packages/Bicho/backends/allura.py", line 381, in run
ticketTotal = json.loads(f.read())
File "/usr/lib64/python2.7/json/__init__.py", line 326, in loads
return _default_decoder.decode(s)
File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode
raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded
acs@macitong:~/devel/Bicho$ ./bicho -g --db-user-out=kernel --db-password-out=kernel --db-database-out=bichoKernel -d 1 --backend-user="[email protected]" --backend-password=xxxx -b bg -u https://bugzilla.kernel.org/buglist.cgi?product=
xml.sax._exceptions.SAXParseException: :88426:115: not well-formed (invalid token)
Opening the query for issues in Chrome for the problematic issue:
https://bugzilla.kernel.org/show_bug.cgi?id=45911&ctype=xml
the resulting XML is not well formed:
error on line 87 at column 116: PCDATA invalid Char value 27
We need to filter the XML read before trying to parse it!
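A minimal sketch of such a filter, assuming the only problem is control characters that XML 1.0 forbids (such as char value 27 reported above). The helper name `strip_invalid_xml_chars` is hypothetical and is not part of Bicho:

```python
import re
import xml.sax

# XML 1.0 rejects the C0 control characters except tab (0x09),
# line feed (0x0A) and carriage return (0x0D).
_INVALID_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def strip_invalid_xml_chars(text):
    """Return text without the control characters XML 1.0 forbids."""
    return _INVALID_XML_CHARS.sub("", text)


# Illustrative input: an ESC character (27) embedded in a bug comment.
raw = "<bug><thetext>escape \x1b[0m here</thetext></bug>"
clean = strip_invalid_xml_chars(raw)

# The cleaned document can now be handed to the SAX parser.
xml.sax.parseString(clean.encode("utf-8"), xml.sax.ContentHandler())
```

Dropping the characters loses a little information from the original comment; replacing them with a placeholder would be an equally valid choice.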
Currently, Bicho's config check downloads the URL for the list of issues only to test that the remote server is reachable. We should use the global URL instead, so we don't load the server with this query. The results of the query are not used!
try:
    print("Checking URL: " + Config.url)
    response = urlopen(req)
It seems that there are some problems with foreign key references when the engine is innodb due to the additional checks (for example when creating the trackers table there is a reference to tracker_types - shouldn't that be supported_trackers?).
There are some problems with using MySQL on MacOS, so switching to MyISAM didn't work very smoothly. As a workaround I just added engine=MYISAM to the table declarations in the following files:
Bicho/backends/bg.py
Bicho/db/mysql.py
Error output can be found at the following address:
http://pastebin.com/cEtuEDi0
The crash is produced while parsing extra date fields from Bugzilla (such as deadline). The database expects a datetime object but Bicho sends a Unicode string.
Some conversion is required before sending these dates to the database, in the same way as Bicho does with the submitted_on field.
Example: http://itforgebugzilla.atosresearch.eu/bugzilla/show_bug.cgi?id=517&ctype=xml
Traceback (most recent call last):
File "/home/sduenas/devel/ws/bin/bicho", line 25, in <module>
retval = Bicho.main.main()
File "/home/sduenas/devel/ws/lib/python2.7/site-packages/Bicho/main.py", line 54, in main
backend.run()
File "/home/sduenas/devel/ws/lib/python2.7/site-packages/Bicho/backends/bg.py", line 1082, in run
bugsdb.insert_issue(issue_data, dbtrk.id)
File "/home/sduenas/devel/ws/lib/python2.7/site-packages/Bicho/db/database.py", line 183, in insert_issue
self.backend.insert_issue_ext(self.store, issue, db_issue.id)
File "/home/sduenas/devel/ws/lib/python2.7/site-packages/Bicho/backends/bg.py", line 181, in insert_issue_ext
db_issue_ext.deadline = self.__return_unicode(issue.deadline)
File "/usr/local/lib/python2.7/dist-packages/storm/properties.py", line 67, in __set__
obj_info.variables[column].set(value)
File "/usr/local/lib/python2.7/dist-packages/storm/variables.py", line 426, in parse_set
raise TypeError("Expected datetime, found %s" % repr(value))
TypeError: Expected datetime, found u'2012-03-31'
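A minimal sketch of the kind of conversion that would avoid the TypeError above. It assumes the extra date fields arrive either as `YYYY-MM-DD` (as in the traceback) or as `YYYY-MM-DD HH:MM:SS`. The helper name `to_datetime` is hypothetical and the code is not taken from Bicho:

```python
from datetime import datetime


def to_datetime(value, formats=("%Y-%m-%d %H:%M:%S", "%Y-%m-%d")):
    """Convert a date string to a datetime, or return None when it is empty."""
    if value is None or value == "":
        return None
    for fmt in formats:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError("Unsupported date format: %r" % (value,))


deadline = to_datetime(u"2012-03-31")
```

The result could then be assigned to the datetime column instead of the raw string.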
The fields status, resolution and title are repeated both in tables issues and issues_ext_jira.
DBG: [22/Feb/2013-13:47:40] http://issues.liferay.com/sr/jira.issueviews:searchrequest-xml/temp/SearchRequest.xml?pid=AUI&sorter/field=updated&sorter/order=INC&updated:after=2012-08-11&tempMax=10&pager/start=70
DBG: [22/Feb/2013-13:47:42] Bug activity: http://issues.liferay.com/browse/AUI-719?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:47:45] Bug activity: http://issues.liferay.com/browse/AUI-670?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:47:47] Bug activity: http://issues.liferay.com/browse/AUI-686?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:47:49] Bug activity: http://issues.liferay.com/browse/AUI-445?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:47:51] Bug activity: http://issues.liferay.com/browse/AUI-714?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:47:53] Bug activity: http://issues.liferay.com/browse/AUI-724?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:47:54] Bug activity: http://issues.liferay.com/browse/AUI-725?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:47:57] Bug activity: http://issues.liferay.com/browse/AUI-732?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:47:59] Bug activity: http://issues.liferay.com/browse/AUI-731?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:48:01] Bug activity: http://issues.liferay.com/browse/AUI-730?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
Remaining time: 0 m ( 143 )
The error below is caused by a mismatch in date precision. The creation date includes seconds, but the changes in the history, which are retrieved using an HTML parser, are assigned 00 seconds. So we end up having bugs created after the first entry in their history.
luis@tahine:~/repos/Bicho$ ./bicho -g --db-user-out=root --db-password-out=root --db-database-out=bicho_bug43 -d 0 -b jira -u http://issues.liferay.com/browse/IDE
Checking URL: http://issues.liferay.com
DBG: [22/Feb/2013-13:37:42] Bicho object created, options and backend initialized
Running Bicho with delay of 0 seconds
DBG: [22/Feb/2013-13:37:42] Last bugs cached were modified on: 2013-02-22
DBG: [22/Feb/2013-13:37:42] Getting number of issues: http://issues.liferay.com/sr/jira.issueviews:searchrequest-xml/temp/SearchRequest.xml?pid=IDE&sorter/field=updated&sorter/order=INC&updated:after=2013-02-22&tempMax=1
Total bugs 3
DBG: [22/Feb/2013-13:37:43] http://issues.liferay.com/sr/jira.issueviews:searchrequest-xml/temp/SearchRequest.xml?pid=IDE&sorter/field=updated&sorter/order=INC&updated:after=2013-02-22&tempMax=10&pager/start=0
DBG: [22/Feb/2013-13:37:44] Bug activity: http://issues.liferay.com/browse/IDE-827?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:37:45] Bug activity: http://issues.liferay.com/browse/IDE-820?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:37:47] Bug activity: http://issues.liferay.com/browse/IDE-825?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
/usr/lib/python2.7/dist-packages/storm/database.py:371: Warning: Data truncated for column 'environment' at row 1
return function(*args, **kwargs)
Remaining time: 0 m ( -7 )
Done. 3 bugs analyzed
self.backend_name = jira
DBG: [22/Feb/2013-13:37:49] Last change logged at 2010-03-19 16:49:00
Traceback (most recent call last):
File "./bicho", line 8, in <module>
retval = Bicho.main.main()
File "/home/luis/repos/Bicho/Bicho/main.py", line 59, in main
il.run()
File "/home/luis/repos/Bicho/Bicho/post_processing/logtable.py", line 776, in run
(db_ilog.issue, date))
AttributeError: 'NoneType' object has no attribute 'issue'
Sometimes there are problems while downloading issues, and you end up with a partial issues database. It could be useful to have an "only_pending" option to download only the issues you have not downloaded previously. While creating the list of issues to be downloaded, the issues already in the database would be filtered out.
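The filtering step could be sketched as follows. This is only an illustration: the function name and the idea of passing both ID lists in memory are assumptions, not Bicho's actual design:

```python
def pending_issue_ids(remote_ids, stored_ids):
    """Return the remote issue IDs not yet stored, keeping their order."""
    stored = set(stored_ids)
    return [issue_id for issue_id in remote_ids if issue_id not in stored]


# IDs listed by the tracker versus IDs already present in the database.
to_download = pending_issue_ids([101, 102, 103, 104], [102, 104])
```

Keeping the original order matters because the incremental mode relies on issues being processed from oldest to newest.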
When you try to analyze a Bugzilla 3.2.2 like bugzilla.kernel.org the server fails for the query:
https://bugzilla.kernel.org/buglist.cgi?product=&order=changeddate&ctype=csv
The server message is:
"DBD::mysql::st execute failed: Unknown column 'changeddate' in 'order clause' "
The query should be the same as in 3.2.3:
https://bugzilla.kernel.org/buglist.cgi?product=&order=Last+Changed&ctype=csv
Bicho doesn't retrieve all the bugs when Bugzilla sets a maximum number of results for buglist queries. Some Bugzilla instances limit the number of bugs returned by the buglist query. This limit is usually set to 10K bugs.
For instance, eclipse returns a maximum of 10K bugs searching for platform product's bugs, but there are bugs for this product since 2001 that are not returned by this query.
In order to get the list of issues in CSV you should remove "&order=changeddate". Maybe the Bugzilla version is too old. We should check the Bugzilla version before using this parameter, which is used for incremental issue downloading.
Self-explanatory. It must also offer support for multiple trackers in the same Redmine instance, sharing the same database.
We need extra information about:
Incremental feature doesn't support multiple projects in the same database
Having a look at the "changes" table after analyzing the oslo tracker (https://bugs.launchpad.net/oslo) I've seen that the bug field is not being stored. It always contains "None" in the database.
It would be great to have LP support for the issues_log table, so we can track all the changes of every bug
Stefano Maffulli sent us this bug:
Error in function analyzeBug with URL: '
'https://bugs.launchpad.net/openstack and Bug:
https://api.launchpad.net/1.0/nova/+bug/989764
Traceback (most recent call last):
File "/usr/local/bin/bicho", line 25, in <module>
retval = Bicho.main.main()
File "/usr/local/lib/python2.7/dist-packages/Bicho/main.py", line 54, in main
backend.run()
File "/usr/local/lib/python2.7/dist-packages/Bicho/backends/lp.py",
line 997, in run
issue_data = self.analyze_bug(bug)
File "/usr/local/lib/python2.7/dist-packages/Bicho/backends/lp.py",
line 823, in analyze_bug
issue.set_milestone_code_name(bug.milestone.code_name)
File "/usr/lib/python2.7/dist-packages/lazr/restfulclient/resource.py",
line 688, in __getattr__
return super(Entry, self).__getattr__(name)
File "/usr/lib/python2.7/dist-packages/lazr/restfulclient/resource.py",
line 331, in __getattr__
return self.lp_get_parameter(attr)
File "/usr/lib/python2.7/dist-packages/lazr/restfulclient/resource.py",
line 215, in lp_get_parameter
self._ensure_representation()
File "/usr/lib/python2.7/dist-packages/lazr/restfulclient/resource.py",
line 357, in _ensure_representation
representation = self._root._browser.get(self._wadl_resource)
File "/usr/lib/python2.7/dist-packages/lazr/restfulclient/_browser.py",
line 291, in get
response, content = self._request(url, extra_headers=headers)
File "/usr/lib/python2.7/dist-packages/lazr/restfulclient/_browser.py",
line 242, in _request
str(url), method=method, body=data, headers=headers)
File "/usr/lib/python2.7/dist-packages/lazr/restfulclient/_browser.py",
line 211, in _request_and_retry
url, method=method, body=body, headers=headers)
File "/usr/lib/python2.7/dist-packages/httplib2/__init__.py", line
1346, in request
info, content = cached_value.split('\r\n\r\n', 1)
ValueError: need more than 1 value to unpack
In some cases there are fields that are not stored in the changes table.
As an example (based on information found at https://bugs.webkit.org/show_activity.cgi?id=12340), fields whose name is "Attachment #12715 Flag" are not correctly stored.
The output of the mysql database for that specific report:
mysql> select * from changes where issue_id=12340;
+--------+----------+----------------+------------------------+-----------------------------+------------+---------------------+
| id     | issue_id | field          | old_value              | new_value                   | changed_by | changed_on          |
+--------+----------+----------------+------------------------+-----------------------------+------------+---------------------+
| 135066 |    12340 |                |                        | review?, commit-queue?      |        449 | 2011-05-04 02:57:57 |
| 135067 |    12340 |                | review?, commit-queue? |                             |        449 | 2011-05-04 21:25:35 |
| 135068 |    12340 |                | 0                      | 1                           |        449 | 2011-05-04 21:25:35 |
| 135069 |    12340 |                |                        | review?, commit-queue?      |        449 | 2011-05-04 21:25:43 |
| 135070 |    12340 | status         | UNCONFIRMED            | NEW                         |        607 | 2011-05-04 22:53:59 |
| 135071 |    12340 | Ever Confirmed | 0                      | 1                           |        607 | 2011-05-04 22:53:59 |
| 135072 |    12340 | Blocks         |                        | 60244                       |        449 | 2011-05-10 21:34:50 |
| 135073 |    12340 |                | review?, commit-queue? |                             |        449 | 2011-05-11 23:56:57 |
[...]
As seen, there are empty values in the "field" column.
Perhaps this is caused by the way Beautiful Soup is used: the missing field names are partially links, while typical values in that column are plain text.
Running Bicho I get the following error:
bicho -o db -b bg --db-user-out root --db-password-out root --db-database-out solid_bicho -d 5 -u https://bugs.kde.org/buglist.cgi?product=solid
Traceback (most recent call last):
File "/usr/local/bin/bicho", line 25, in <module>
retval = Bicho.main.main()
File "/usr/local/lib/python2.7/dist-packages/Bicho/main.py", line 54, in main
backend.run()
File "/usr/local/lib/python2.7/dist-packages/Bicho/backends/bg.py", line 1064, in run
issues = self.analyze_bug_list(query_bugs, url, dbtrk.id, bugsdb)
File "/usr/local/lib/python2.7/dist-packages/Bicho/backends/bg.py", line 948, in analyze_bug_list
bugsdb.insert_issue(issues[bug_id], dbtrk_id)
File "/usr/local/lib/python2.7/dist-packages/Bicho/db/database.py", line 195, in insert_issue
db_comment = self._get_db_comment(comment, db_issue.id, tracker_id)
File "/usr/local/lib/python2.7/dist-packages/Bicho/db/database.py", line 488, in _get_db_comment
DBComment.submitted_on == comment.submitted_on).one()
File "/usr/lib/python2.7/dist-packages/storm/store.py", line 1142, in one
result = self._store._connection.execute(select)
File "/usr/lib/python2.7/dist-packages/storm/databases/mysql.py", line 106, in execute
return Connection.execute(self, statement, params, noresult)
File "/usr/lib/python2.7/dist-packages/storm/database.py", line 238, in execute
raw_cursor = self.raw_execute(statement, params)
File "/usr/lib/python2.7/dist-packages/storm/database.py", line 322, in raw_execute
self._check_disconnect(raw_cursor.execute, *args)
File "/usr/lib/python2.7/dist-packages/storm/database.py", line 371, in _check_disconnect
return function(*args, **kwargs)
File "/usr/lib/python2.7/dist-packages/MySQLdb/cursors.py", line 174, in execute
self.errorhandler(self, exc, value)
File "/usr/lib/python2.7/dist-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
raise errorclass, errorvalue
_mysql_exceptions.OperationalError: (1267, "Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='")
HI!
With the latest changes, Bicho fails in the incremental analysis of Bugzilla version 3.2.2.
A solution is:
acs@lenovix:~/devel/Bicho$ git diff Bicho/backends/bg.py
diff --git a/Bicho/backends/bg.py b/Bicho/backends/bg.py
index a9c624a..7f4f03c 100644
--- a/Bicho/backends/bg.py
+++ b/Bicho/backends/bg.py
@@ -1055,6 +1055,8 @@ class BGBackend(Backend):
printout("No more issues to retrieve")
def _retrieve_issues_ids(self, base_url, version, from_date, not_retrieved=True):
+ # hack until we talk with sduenas - acs
+ from_date = from_date.split(" ")[0]
url = self._get_issues_list_url(base_url, version, from_date)
printdbg("Getting bugzilla issues from %s" % url)
Since SourceForge seems to have moved to Allura, do we need the code of the old backend? Shouldn't we remove it?
If you try to analyze a Bugzilla whose URL is under a subdirectory, Bicho fails.
For example:
https://issues.apache.org/ooo/
The problem is that the base URL is coded as:
self.get_domain(self.url)
and this is "https://issues.apache.org/", not the current URL "https://issues.apache.org/ooo/".
Bicho is not retrieving the mail addresses from GitHub issues.
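One way to keep the subdirectory is to derive the base URL from the full path instead of the domain only. The sketch below is an assumption about how this could be done, not Bicho's code. The function name `tracker_base_url` is hypothetical, and the rule "drop a trailing `.cgi` script name" is a simplification:

```python
import posixpath
from urllib.parse import urlsplit, urlunsplit


def tracker_base_url(url):
    """Return scheme, host and directory of a tracker URL, without query."""
    parts = urlsplit(url)
    path = parts.path
    # Drop a trailing CGI script such as buglist.cgi or show_bug.cgi.
    if path.endswith(".cgi"):
        path = posixpath.dirname(path)
    if not path.endswith("/"):
        path += "/"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


base = tracker_base_url("https://issues.apache.org/ooo/buglist.cgi?product=x")
```

With this approach, trackers installed under a path such as "/bugzilla/" keep that path in every URL built from the base.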
Seen when bicho is launched for a second project/tracker sharing the same database
luis@tahine:~/repos/Bicho$ ./bicho -g --db-user-out=root --db-password-out=root --db-database-out=bicho_bug40 -d 0 -b bg -u https://bugzilla.libresoft.es/buglist.cgi?product=cvsanaly
...
DBG: [22/Feb/2013-16:28:37] Getting bugzilla issues from https://bugzilla.libresoft.es/buglist.cgi?product=cvsanaly&order=changeddate&ctype=csv&chfieldfrom=2012-05-18%2014:11:52
Round #9 - Total issues to retrieve: 2
DBG: [22/Feb/2013-16:28:38] Issues to retrieve from: https://bugzilla.libresoft.es/show_bug.cgi?id=1&id=2&ctype=xml&excludefield=attachmentdata
DBG: [22/Feb/2013-16:28:39] Retrieving activity of issue #1 from https://bugzilla.libresoft.es/show_activity.cgi?id=1
DBG: [22/Feb/2013-16:28:41] Issue #1 stored
DBG: [22/Feb/2013-16:28:41] Retrieving activity of issue #2 from https://bugzilla.libresoft.es/show_activity.cgi?id=2
DBG: [22/Feb/2013-16:28:42] Issue #2 stored
DBG: [22/Feb/2013-16:28:42] Last issues cached were modified on: 2012-08-19 08:11:06
DBG: [22/Feb/2013-16:28:42] Getting bugzilla issues from https://bugzilla.libresoft.es/buglist.cgi?product=cvsanaly&order=changeddate&ctype=csv&chfieldfrom=2012-08-19%2008:11:06
DBG: [22/Feb/2013-16:28:43] No issues found for date 2012-08-19 08:11:06. Trying with 2012-08-19 08:11:07
DBG: [22/Feb/2013-16:28:43] Getting bugzilla issues from https://bugzilla.libresoft.es/buglist.cgi?product=cvsanaly&order=changeddate&ctype=csv&chfieldfrom=2012-08-19%2008:11:07
No more issues to retrieve
Done. 39 issues retrieved
self.backend_name = bg
DBG: [22/Feb/2013-16:28:44] Last change logged at 2012-06-07 16:58:52
Traceback (most recent call last):
File "./bicho", line 8, in <module>
retval = Bicho.main.main()
File "/home/luis/repos/Bicho/Bicho/main.py", line 59, in main
il.run()
File "/home/luis/repos/Bicho/Bicho/post_processing/logtable.py", line 776, in run
(db_ilog.issue, date))
AttributeError: 'NoneType' object has no attribute 'issue'
The root directory of Bicho's git repository contains two files, bicho and Bicho (in fact, a folder), that have the same name under case insensitive file systems. As a result, when cloning the repository, only one of them is cloned (in my case, I get the bicho file, but not the Bicho folder).
Running bicho for getting Evince bugs from GNOME Bugzilla, bg backend, I get this error:
Traceback (most recent call last):
File "/usr/local/bin/bicho", line 25, in <module>
retval = Bicho.main.main()
File "/usr/local/lib/python2.6/dist-packages/Bicho/main.py", line 54, in main
backend.run()
File "/usr/local/lib/python2.6/dist-packages/Bicho/backends/bg.py", line 1071, in run
bugsdb.insert_issue(issue_data, dbtrk.id)
File "/usr/local/lib/python2.6/dist-packages/Bicho/db/database.py", line 195, in insert_issue
db_comment = self._get_db_comment(comment, db_issue.id, tracker_id)
File "/usr/local/lib/python2.6/dist-packages/Bicho/db/database.py", line 488, in _get_db_comment
DBComment.submitted_on == comment.submitted_on).one()
File "/usr/local/lib/python2.6/dist-packages/storm-0.19-py2.6-linux-x86_64.egg/storm/store.py", line 1142, in one
result = self._store._connection.execute(select)
File "/usr/local/lib/python2.6/dist-packages/storm-0.19-py2.6-linux-x86_64.egg/storm/databases/mysql.py", line 106, in execute
return Connection.execute(self, statement, params, noresult)
File "/usr/local/lib/python2.6/dist-packages/storm-0.19-py2.6-linux-x86_64.egg/storm/database.py", line 238, in execute
raw_cursor = self.raw_execute(statement, params)
File "/usr/local/lib/python2.6/dist-packages/storm-0.19-py2.6-linux-x86_64.egg/storm/database.py", line 322, in raw_execute
self._check_disconnect(raw_cursor.execute, *args)
File "/usr/local/lib/python2.6/dist-packages/storm-0.19-py2.6-linux-x86_64.egg/storm/database.py", line 371, in _check_disconnect
return function(*args, **kwargs)
File "/usr/lib/pymodules/python2.6/MySQLdb/cursors.py", line 166, in execute
self.errorhandler(self, exc, value)
File "/usr/lib/pymodules/python2.6/MySQLdb/connections.py", line 35, in defaulterrorhandler
raise errorclass, errorvalue
_mysql_exceptions.OperationalError: (1267, "Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='")
Network fault tolerance, which will allow Bicho to survive interrupted connections
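A common way to get this kind of tolerance is to retry failed requests with an increasing delay. The following is a generic sketch under that assumption. The name `with_retries` and its parameters are hypothetical and nothing here reflects an existing Bicho API:

```python
import time


def with_retries(fetch, attempts=3, delay=1.0, backoff=2.0,
                 errors=(IOError,), sleep=time.sleep):
    """Call fetch() and retry on the given errors, waiting longer each time."""
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except errors:
            if attempt == attempts:
                # Out of attempts: let the caller see the last error.
                raise
            sleep(delay)
            delay *= backoff
```

A backend could wrap each download in it, for example `with_retries(lambda: urlopen(url).read())`. In Python 3, `urllib.error.URLError` is a subclass of `OSError` (which `IOError` aliases), so the default `errors` tuple would cover it.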
I've seen this while downloading issues from http://issues.liferay.com/browse/IDE
DBG: [21/Feb/2013-19:15:47] Bug activity: http://issues.liferay.com/browse/IDE-54?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
Change author format not supported. Change lost!
Change author format not supported. Change lost!
Change author format not supported. Change lost!
Change author format not supported. Change lost!
Change author format not supported. Change lost!
Bicho is not able to get the list of bugs of a private project in Bugzilla. Bicho was not designed to analyze private projects, and the account provided is only used to get extra data in each per-bug request.
The change is simple: the cookies also need to be used in the request that gets the list of bugs. This should be enough.
Fusionforge 5.2 (or newer) support
With bugzilla it is possible to read several issues at the same time with:
https://bugzilla.gnome.org/show_bug.cgi?ctype=xml&id=643870&id=643871
According to the GNOME bugmaster, it is OK to download 500 issues in one query this way.
We should add this feature to Bicho, which currently downloads only 1 issue per XML request.
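Building the batched requests could look like the sketch below. The function name `batched_bug_urls` is hypothetical, and the default batch size of 500 is taken from the GNOME bugmaster's comment above. Other trackers may need a smaller value:

```python
from urllib.parse import urlencode


def batched_bug_urls(base_url, bug_ids, batch_size=500):
    """Build show_bug.cgi URLs that request up to batch_size bugs each."""
    urls = []
    for start in range(0, len(bug_ids), batch_size):
        batch = bug_ids[start:start + batch_size]
        # Repeating the "id" parameter asks Bugzilla for several bugs at once.
        query = urlencode([("ctype", "xml")] + [("id", bug_id) for bug_id in batch])
        urls.append("%sshow_bug.cgi?%s" % (base_url, query))
    return urls


urls = batched_bug_urls("https://bugzilla.gnome.org/", [643870, 643871])
```

Very large batches produce very long URLs, so a tracker or proxy with a URL length limit may reject them.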
@sduenas suggested to improve the way backends are written. The idea of this enhancement is to study what the common parts are and move them to the shared code/objects.
In the analysis of WebKit I see that the oldest bug is https://bugs.webkit.org/show_bug.cgi?id=86230 (modified in 2000), but according to the log of the execution the first bug we are downloading is https://bugs.webkit.org/show_bug.cgi?id=101385 (modified in 2012).
We must start from the oldest to the newest in order to respect the incremental mode.
Improved debug mode with more details. The backends should also use the same log format.
The tickets query is not ordered by modification date.
Bicho crashes while parsing bugs whose 'long_desc' field (the description of the bug) is not filled.
Example: http://itforgebugzilla.atosresearch.eu/bugzilla/show_bug.cgi?id=439&ctype=xml
Error in function analyzeBug with URL: http://itforgebugzilla.atosresearch.eu/bugzilla/ and Bug: 439
Traceback (most recent call last):
  File "/home/sduenas/devel/ws/bin/bicho", line 25, in <module>
    retval = Bicho.main.main()
  File "/home/sduenas/libresoft/devel/bicho/Bicho/main.py", line 54, in main
    backend.run()
  File "/home/sduenas/libresoft/devel/bicho/Bicho/backends/bg.py", line 1064, in run
    issue_data = self.analyze_bug(bug, url)
  File "/home/sduenas/libresoft/devel/bicho/Bicho/backends/bg.py", line 947, in analyze_bug
    issue = handler.get_issue()
  File "/home/sduenas/libresoft/devel/bicho/Bicho/backends/bg.py", line 774, in get_issue
    desc = self.ctags["long_desc"][0]["thetext"]
IndexError: list index out of range
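The failing expression indexes the first element of a list that may be empty. A guarded lookup might look like the sketch below. The helper name is hypothetical, the `ctags` layout is inferred only from the line shown in the traceback, and returning an empty string for a missing description is an assumption about the desired behaviour.

```python
def first_description(ctags):
    """Return the text of the first 'long_desc' entry, or '' if there is none."""
    long_descs = ctags.get("long_desc") or []
    if not long_descs:
        # Bug without description: avoid indexing an empty list.
        return ""
    return long_descs[0].get("thetext", "")
```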
This is related to new bugzilla version detection code.
Error when analysing the Webkit tracker
Error parsing URL: https://bugzilla.webkit.org/show_bug.cgi?id=56587&id=61069&id=91335&id=92084&id=77377&id=91998&id=92090&id=65632&id=92087&id=82510&id=35010&id=65533&id=92074&id=92070&id=91449&id=92002&id=91836&id=92021&id=92083&id=91841&id=92085&id=92077&id=92060&id=91593&id=92073&id=92082&id=92010&id=92064&id=38882&id=92068&id=80269&id=63244&id=91971&id=83464&id=83398&id=83529&id=80602&id=79685&id=62857&id=79670&id=77446&id=76453&id=92055&id=73526&id=69896&id=70049&id=87899&id=66004&id=66566&id=65539&id=65200&id=65171&id=92066&id=69658&id=69544&id=69398&id=69372&id=69317&id=69202&id=69167&id=77990&id=91961&id=89148&id=91966&id=91948&id=92058&id=91694&id=57583&id=90037&id=91949&id=84321&id=90325&id=92047&id=61524&id=91917&id=92049&id=91913&id=92048&id=85444&id=92046&id=91825&id=91829&id=91963&id=91761&id=92038&id=81857&id=91674&id=92023&id=91703&id=92030&id=92032&id=91927&id=92034&id=92036&id=91541&id=91981&id=91972&id=92031&id=88555&id=91459&id=92028&id=91884&id=92027&id=53141&id=92026&id=92024&id=91982&id=80622&id=92014&id=91980&id=90873&id=92013&id=91639&id=88271&id=92017&id=92022&id=92007&id=66615&id=92020&id=92019&id=91654&id=92018&id=91997&id=91171&id=92004&id=89519&id=84802&id=81488&id=70708&id=91868&id=91899&id=91249&id=91942&id=92006&id=91857&id=90175&id=91193&id=92003&id=89597&id=91717&id=75071&id=83440&id=91996&id=91958&id=91995&id=91994&id=91979&id=91985&id=91945&id=91939&id=91984&id=91986&id=91977&id=91978&id=89719&id=91782&id=91967&id=91938&id=91921&id=91975&id=91960&id=91959&id=91935&id=42778&id=91837&id=91937&id=91728&id=91826&id=91947&id=83156&id=83436&id=91950&id=75716&id=46248&id=75070&id=89055&id=91577&id=91764&id=91624&id=90937&id=19937&id=91934&id=91953&id=91708&id=91941&id=91946&id=91944&id=76321&id=88937&id=91928&id=91918&id=91796&id=91930&id=90679&id=91923&id=90783&id=84567&id=13351&id=86581&id=91922&id=87935&id=91916&id=91499&id=91914&id=91915&id=91846&id=91848&id=91909&id=91874&id=91834&id=91905&id=91901&id=86911&id=91904&id=56151&id=46
283&id=91886&id=91535&id=91902&id=91731&id=40103&id=91903&id=91414&id=91636&id=91762&id=91827&id=91506&id=83370&id=70617&id=89696&id=91873&id=91569&id=91887&id=90227&id=82372&id=91789&id=91895&id=91866&id=91893&id=80644&id=91882&id=91801&id=83187&id=91883&id=90469&id=89767&id=91421&id=90517&id=91847&id=91403&id=91275&id=91133&id=91672&id=91876&id=91870&id=91758&id=91526&id=91784&id=91875&id=30187&id=82835&id=91880&id=85591&id=91474&id=91865&id=69295&id=91867&id=40673&id=91817&id=91869&id=17672&id=91863&id=91747&id=85527&id=91159&id=91767&id=91840&id=91859&id=84813&id=91839&id=90792&id=91845&id=87246&id=91757&id=90455&id=79354&id=91830&id=91838&id=91833&id=90676&id=91721&id=91714&id=90289&id=53932&id=90976&id=90931&id=63952&id=91590&id=17709&id=91745&id=63062&id=85958&id=91629&id=90990&id=91819&id=91816&id=91814&id=91808&id=91209&id=85754&id=90604&id=91797&id=91795&id=91686&id=91729&id=91799&id=91798&id=77383&id=90692&id=91651&id=89796&id=90642&id=90182&id=91081&id=91763&id=50126&id=86016&id=16496&id=83432&id=91483&id=91537&id=91571&id=91142&id=22882&id=6007&id=68089&id=8191&id=91787&id=85140&id=90419&id=91786&id=91785&id=91637&id=91140&id=91690&id=91780&id=91722&id=91781&id=91770&id=90731&id=91777&id=91655&id=91271&id=85826&id=87711&id=90252&id=90713&id=91765&id=91650&id=91706&id=91760&id=89391&id=91134&id=71406&id=91682&id=90284&id=91148&id=91668&id=80576&id=91644&id=90311&id=91243&id=65801&id=50144&id=91671&id=91669&id=91555&id=85174&id=91246&id=85817&id=91132&id=91753&id=91679&id=87844&id=91740&id=91744&id=91751&id=91749&id=91725&id=91026&id=91092&id=91645&id=91741&id=89457&id=91735&id=91739&id=81126&id=91428&id=47727&id=77012&id=91730&id=91500&id=89987&id=89978&id=91699&id=91640&id=88747&id=90169&id=91464&id=91705&id=89224&id=91716&id=59305&id=89544&id=90788&id=59832&id=91044&id=82697&id=80472&id=91493&id=91720&id=91341&id=86215&id=91715&id=91713&id=86196&id=89648&id=91711&id=91641&id=81883&id=91712&id=91356&id=91564&id=87364&id=21692&id=91626&id=81882&id=88077&
id=85223&id=89748&id=91622&id=91663&id=91594&id=91489&id=90508&id=91070&id=90039&id=91547&id=91408&id=91649&id=91696&id=91579&id=91692&id=91597&id=91461&id=91695&id=91550&id=91693&id=91687&id=88382&id=91684&id=91681&id=90581&id=91444&id=91680&id=91691&id=91678&id=87987&id=91253&id=91677&id=91670&id=89502&id=91631&id=91422&id=91683&id=11355&id=91659&id=82236&id=83628&id=91562&id=91599&id=91642&id=24880&id=90626&id=91549&id=91304&id=91673&id=91514&id=91565&id=91602&id=91334&id=91652&id=90762&id=91558&id=91666&id=91030&id=84308&id=91662&id=90320&id=91647&ctype=xml
Traceback (most recent call last):
  File "./bicho", line 8, in <module>
    retval = Bicho.main.main()
  File "/home/lcanas/repos/Bicho/Bicho/main.py", line 54, in main
    backend.run()
  File "/home/lcanas/repos/Bicho/Bicho/backends/bg.py", line 1143, in run
    issues = self.analyze_bug_list(query_bugs, url, dbtrk.id, bugsdb)
  File "/home/lcanas/repos/Bicho/Bicho/backends/bg.py", line 1005, in analyze_bug_list
    self.safe_xml_parse(bugs_url, handler);
  File "/home/lcanas/repos/Bicho/Bicho/backends/bg.py", line 986, in safe_xml_parse
    join(c for c in contents if self.valid_XML_char_ordinal(ord(c)))
UnboundLocalError: local variable 'contents' referenced before assignment
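An `UnboundLocalError` of this kind typically appears when a variable is assigned only on one code path, for example inside a `try` block whose failure is swallowed, and then used afterwards. The body of `safe_xml_parse` is not shown here, so the following is only a sketch of the pattern and one way to avoid it. The names `read_clean_xml` and `fetch` are hypothetical, and the character ranges follow the XML 1.0 definition of valid characters.

```python
def valid_xml_char_ordinal(i):
    """True if code point `i` is allowed in an XML 1.0 document."""
    return (
        0x20 <= i <= 0xD7FF
        or i in (0x9, 0xA, 0xD)
        or 0xE000 <= i <= 0xFFFD
        or 0x10000 <= i <= 0x10FFFF
    )


def read_clean_xml(fetch, url):
    """Download `url` with `fetch` and drop characters XML cannot contain."""
    try:
        contents = fetch(url)
    except Exception as exc:
        # Fail here with a clear message instead of falling through to a
        # line that would use `contents` before it was ever assigned.
        raise RuntimeError("could not download %s: %s" % (url, exc)) from exc
    return "".join(c for c in contents if valid_xml_char_ordinal(ord(c)))
```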
Incremental feature doesn't support multiple projects in the same database
Self-explanatory. It must also support multiple trackers sharing the same database.
After executing Bicho twice, the second database is not consistent with the results we got the first time. We have a bug somewhere. The numbers for db1 are more likely to be the correct ones.
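In practice this means the "last update" used by the incremental mode has to be computed per tracker rather than over the whole issues table. The sketch below uses `sqlite3` and a made-up two-column schema purely to stay self-contained. Bicho itself stores data through Storm/MySQL, and the table and column names here are assumptions.

```python
import sqlite3


def last_update(conn, tracker_id):
    """Return the newest modification date stored for ONE tracker, or None."""
    row = conn.execute(
        "SELECT MAX(updated_at) FROM issues WHERE tracker_id = ?",
        (tracker_id,),
    ).fetchone()
    return row[0]


def demo_connection():
    """Build an in-memory database holding issues from two trackers."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE issues (tracker_id INTEGER, updated_at TEXT)")
    conn.executemany(
        "INSERT INTO issues VALUES (?, ?)",
        [(1, "2012-07-01"), (1, "2012-07-20"), (2, "2005-12-31")],
    )
    return conn
```

With a single global maximum, tracker 2 in this example would resume from 2012-07-20 and skip everything modified between 2006 and that date.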
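| year | month | # issues db1 | # issues db2 |
|------|-------|--------------|--------------|
| 2000 | 12    | 1            | 1            |
| 2005 | 6     | 331          | 325          |
| 2005 | 7     | 251          | 246          |
| 2005 | 8     | 323          | 321          |
| 2005 | 9     | 215          | 211          |
| 2005 | 10    | 208          | 203          |
| 2005 | 11    | 202          | 200          |
| 2005 | 12    | 279          | 270          |
(reported by @acs)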
After installing bicho we get this error:
luis@tahine:~/repos$ bicho -g -d 1 --db-user-out=root --db-password-out=root --db-database-out=acs_bicho_allura_1049 allura http://sourceforge.net/rest/p/allura/tickets
Traceback (most recent call last):
  File "/usr/local/bin/bicho", line 21, in <module>
    import Bicho.main
  File "/usr/local/lib/python2.7/dist-packages/Bicho/main.py", line 34, in <module>
    from post_processing.logtable import IssuesLog
ImportError: No module named post_processing.logtable