
bicho's Introduction

Important notice (2018-09-19)
-----------------------------

This project has lacked active development for a long while, and it is very
unlikely to receive development attention in the future. Those
interested in retrieving and processing data from issue tracking repositories
(bug tracking repositories) should consider GrimoireLab [1] and
GrimoireLab-Perceval [2].

[1] https://chaoss.github.io/grimoirelab
[2] https://github.com/chaoss/grimoirelab-perceval


Description
-----------

Bicho is a command line-based tool used to parse bug/issue tracking
systems. It gets all the information associated with issues and stores
it in a relational database. (It is part of the MetricsGrimoire suite,
which produces data for vizGrimoire to analyze and visualize.)

Currently Bicho supports:
- Bugzilla
- Sourceforge.net (abandoned)
- JIRA
- Launchpad
- GitHub
- Maniphest
- Redmine
- Gerrit
- Allura (unstable)
- Google Code (abandoned)
- Trac


 License
---------

Bicho is licensed under GNU General Public License (GPL), version 2 or later.


 Download
----------

Home page:
* http://metricsgrimoire.github.com/Bicho/

Releases:
* https://github.com/MetricsGrimoire/Bicho/downloads

Latest version:
* git://github.com/MetricsGrimoire/Bicho.git


 Requirements
------------

 * Python >= 2.4
 * Python Storm. You'll also need the following Python libraries:
   - mysqldb (default engine should be set to MYISAM)
   - python-launchpadlib (only for the Launchpad backend)
 * Beautiful Soup library: error-tolerant HTML parser for Python
 * python-feedparser
 * dateutil


 Installation
-------------

 You can install Bicho by running the setup.py script:

  # python setup.py install

 For the impatient:

  $ bicho --help


 Running Bicho
--------------

This is a quick list of example commands you would run in your terminal to
get bug data from various kinds of bug trackers and capture it in a
database on your local machine. The general format is:

$ bicho --db-user-out=[YOUR DATABASE USERNAME] --db-password-out=[YOUR DATABASE PASSWORD] --db-database-out=[NAME OF DATABASE] -d [DELAY BETWEEN REQUESTS IN NUMBER OF SECONDS] -b [ABBREVIATION FOR BACKEND] -u [BUGTRACKER URL IN QUOTES]

For more guidance, please see doc/UserManual.txt .

It is very important to use a delay. If you run Bicho against big sites
without a delay between bug queries, your IP address could be banned!
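The effect of the delay can be pictured with a minimal Python 3 sketch. This
is not Bicho's code: `fetch_with_delay`, `fetch` and `sleep` are hypothetical
names used only to illustrate pausing between consecutive requests.

```python
import time


def fetch_with_delay(urls, fetch, delay_seconds, sleep=time.sleep):
    """Call fetch(url) for every URL, pausing between consecutive requests.

    `fetch` and `sleep` are injected so the loop can be exercised without
    touching the network or actually waiting.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request except the first one.
            sleep(delay_seconds)
        results.append(fetch(url))
    return results
```

With `delay_seconds=15`, three URLs produce two pauses of 15 seconds each.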

E1. Getting information from a project that uses Bugzilla, like Bicho ;)

$ bicho --db-user-out=[DB USER] --db-password-out=[DB PASS] --db-database-out=[DB NAME] -d 15 -b bg -u "https://bugzilla.libresoft.es/buglist.cgi?product=bicho"

E2. Getting information from a project hosted on sourceforge.net

$ bicho --db-user-out=[DB USER] --db-password-out=[DB PASS] --db-database-out=[DB NAME] -d 15 -b sf -u "http://sourceforge.net/tracker/?atid=516295&group_id=66938"

E3. Getting information from a project using JIRA

$ bicho --db-user-out=[DB USER] --db-password-out=[DB PASS] --db-database-out=[DB NAME] -d 15 -b jira -u "http://support.petalslink.com/browse/PETALSMASTER"

E4. Getting information from a project using Launchpad

$ bicho --db-user-out=[DB USER] --db-password-out=[DB PASS] --db-database-out=[DB NAME] -d 15 -b lp -u "https://bugs.launchpad.net/openstack"

E5. Getting information from a project using Allura

$ bicho --db-user-out=[DB USER] --db-password-out=[DB PASS] --db-database-out=[DB NAME] -d 15 -b allura -u "http://sourceforge.net/rest/p/allura/tickets"

E6. Getting information from a project using GitHub

$ bicho --db-user-out=[DB USER] --db-password-out=[DB PASS] --db-database-out=[DB NAME] -b github -u "https://api.github.com/repos/composer/composer/issues" --backend-token=[API TOKEN]

E7. Getting information from a project using Redmine

$ bicho --db-user-out=[DB USER] --db-password-out=[DB PASS] --db-database-out=[DB NAME] --backend-user=[REDMINE USER] --backend-password=[REDMINE PASSWORD] -d 1 -b redmine -u "https://www.bitergia.net/"

E8. Getting information from Maniphest

$ bicho --db-user-out=[DB USER] --db-password-out=[DB PASS] --db-database-out=[DB NAME] --backend-token=[API TOKEN] -b maniphest -u https://phabricator.wikimedia.org [--no-resume]

E9. Getting information from Trac

$ bicho --db-user-out=[DB USER] --db-password-out=[DB PASS] --db-database-out=[DB NAME] -b trac -u https://fedorahosted.org/freeipa/

E10. Getting information from Review Board

$ bicho --db-user-out=[DB USER] --db-password-out=[DB PASS] --db-database-out=[DB NAME] -n 25 -b reviewboard -u https://reviews.apache.org/groups/geode/


Known issues
------------

Newer versions of MySQL server may raise the following error message:
```
Error Code: 1406. Data too long for column
```

To avoid this problem, update your MySQL configuration by removing the `STRICT_TRANS_TABLES` value from the `sql-mode` parameter.
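
Since `sql-mode` holds a comma-separated list of modes, the new value can be
computed from the current one. The helper below is a small sketch, not part
of Bicho; `drop_sql_mode` is a hypothetical name and the sample modes are
only examples.

```python
def drop_sql_mode(sql_mode, unwanted="STRICT_TRANS_TABLES"):
    """Return the comma-separated sql_mode string without the unwanted mode."""
    modes = [mode.strip() for mode in sql_mode.split(",") if mode.strip()]
    kept = [mode for mode in modes if mode.upper() != unwanted.upper()]
    return ",".join(kept)


# Example: compute the value to put back into the MySQL configuration.
new_mode = drop_sql_mode("ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_DATE")
```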


 Roadmap
---------

0.93:
* The updated list of bugs to be fixed can be found here https://github.com/MetricsGrimoire/Bicho/issues?milestone=1&page=1&state=open
* Incremental support broken by issues updated during the download bug #28
* Incorrect order downloading issues from Bugzilla #20
* Incoherent number of issues after webkit analysis bug bugzilla support #26
* Error in database character sets while comparing dates #8
* Problem cloning repo in case insensitive systems #12
* Incremental feature doesn't support multiple projects in the same database #30

1.0:
* https://github.com/MetricsGrimoire/Bicho/issues?milestone=2&page=1&state=open
* issues_log for bugzilla and launchpad
** Launchpad support for issues_log table enhancement launchpad support #24
** More efficient and cleaner code for the table issues_log for bugzilla
* New table with information about executions (date, issues downloaded, etc ..)
* Tests, tests and tests
* Improved debug mode with more useful details
* Network fault tolerance (in order to survive connection issues)
* New backends:
** FusionForge


 Improving Bicho
----------------

Source code, wiki and ITS available on GitHub:
* https://github.com/MetricsGrimoire/Bicho

Please write to the developers' mailing list at
* metrics-grimoire _at _ lists.libresoft.es

If you want to receive updates about new versions, and keep in touch
with the development team, consider subscribing to the list. It is a
very low traffic list (< 1 msg a day):

* https://lists.libresoft.es/listinfo/metrics-grimoire


 Credits
--------

Bicho was originally developed by the GSyC/LibreSoft group at the
Universidad Rey Juan Carlos in Mostoles, near Madrid (Spain). It is
part of a wider research effort on libre software engineering, aiming to gain
knowledge on how libre software is developed and maintained.


 FAQ
----

F1. Bicho crashed with 'UnicodeEncodeError' exception

UnicodeEncodeError appears when the data cannot be written to the database
using the encoding the database is configured with. To avoid that, set your
database to use UTF-8. For instance:

CREATE DATABASE [DB NAME] CHARACTER SET utf8 COLLATE utf8_unicode_ci;
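
The failure mode can be reproduced outside Bicho with a few lines of
Python 3. This is only an illustration: `can_encode` is a hypothetical
helper, and latin-1 stands in for whatever non-UTF-8 encoding the database
might be using.

```python
def can_encode(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


sample = "Móstoles €"  # the euro sign has no latin-1 representation
fits_latin1 = can_encode(sample, "latin-1")
fits_utf8 = can_encode(sample, "utf-8")
```

UTF-8 can represent any Unicode text, which is why switching the database to
it avoids the exception.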

F2. What is the database schema?

There is a nice PNG schema in the directory /doc/database .

F3. How can I create a new backend?

Tell us through the contact information above that you want to create a new
backend. We'll try to give you as much information as possible.

Whenever possible, we want to use available APIs. However, some old
bugtrackers used to not have APIs, and even today, some APIs don't give us
all of the information that we want. So a new backend needs to be able to
fall back to HTML scraping and parsing via Beautiful Soup in cases where the
bug tracker's API doesn't exist (maybe an old version of Bugzilla/JIRA/etc.)
or doesn't provide stuff we want.

While writing a new backend, please also write tests per tests/README.md .

F4. How can I submit a bug report?

Use the GitHub issue tracker: https://github.com/MetricsGrimoire/Bicho/issues .

bicho's People

Contributors

adinabarham, brainwane, canasdiaz, dicortazar, feinomenon, jgbarah, libresoft, rodrigoprimo, sduenas

bicho's Issues

Incremental support broken by issues updated during the download

When downloading a very long list of issues, it is common for some of them to be updated during the download. The database then holds a modification date for those issues that differs from the one used to order the original list. If the execution crashes, the next run of Bicho resumes from the latest modification date in the table, so we lose the bugs that fall between the last one actually retrieved (following the order from oldest to newest) and that latest modification date.
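
One possible way around this, sketched below in Python 3, is to resume from the date each issue had in the original listing rather than from the dates stored in the table. This is an assumption about a fix, not what Bicho does; `resume_date`, `listing` and `processed_ids` are hypothetical names.

```python
def resume_date(listing, processed_ids):
    """Return the listing date of the last issue processed without gaps.

    `listing` is a list of (issue_id, modified_on) pairs as they appeared
    when the download started, ordered from oldest to newest.
    Returns None when nothing has been processed yet.
    """
    checkpoint = None
    for issue_id, listed_date in listing:
        if issue_id not in processed_ids:
            break  # first issue not yet stored: stop here
        checkpoint = listed_date
    return checkpoint


# Issue 2 was modified while downloading, so its stored date jumped ahead.
listing = [(1, "2013-01-01"), (2, "2013-01-02"), (3, "2013-01-03")]
stored_dates = {1: "2013-01-01", 2: "2013-02-22"}
safe = resume_date(listing, set(stored_dates))
naive = max(stored_dates.values())
```

Here the naive checkpoint is later than the listing date of issue 3, so issue 3 would be skipped, while the listing-based checkpoint still covers it.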

Allura Backends No JSON object could be decoded

bicho --db-user-out=root --db-password-out=123456 --db-database-out=dpp -d 15 -b allura -u http://sourceforge.net/p/dpp/_list/tickets
Checking URL: http://sourceforge.net
Running Bicho with delay of 15 seconds
Traceback (most recent call last):
  File "/usr/bin/bicho", line 25, in <module>
    retval = Bicho.main.main()
  File "/usr/lib/python2.7/site-packages/Bicho/main.py", line 56, in main
    backend.run()
  File "/usr/lib/python2.7/site-packages/Bicho/backends/allura.py", line 381, in run
    ticketTotal = json.loads(f.read())
  File "/usr/lib64/python2.7/json/__init__.py", line 326, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python2.7/json/decoder.py", line 366, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python2.7/json/decoder.py", line 384, in raw_decode
    raise ValueError("No JSON object could be decoded")
ValueError: No JSON object could be decoded

Problems trying to analyze kernel Bugzilla with XML encoding

acs@macitong:~/devel/Bicho$ ./bicho -g --db-user-out=kernel --db-password-out=kernel --db-database-out=bichoKernel -d 1 --backend-user="[email protected]" --backend-password=xxxx -b bg -u https://bugzilla.kernel.org/buglist.cgi?product=

xml.sax._exceptions.SAXParseException: :88426:115: not well-formed (invalid token)

Opening the query for issues in Chrome for the problematic issue:

https://bugzilla.kernel.org/show_bug.cgi?id=45911&ctype=xml

the resulting XML is not well formed:

error on line 87 at column 116: PCDATA invalid Char value 27

We need to filter the XML read before trying to parse it!
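
A possible filter, sketched in Python 3 (Bicho itself targets Python 2, so this is illustrative only): drop every character outside the ranges XML 1.0 allows, which includes char value 27 (ESC), before handing the text to the parser. `strip_invalid_xml_chars` and `parse_bug_xml` are hypothetical names.

```python
import re
import xml.etree.ElementTree as ET

# Characters allowed by XML 1.0: tab, LF, CR, and the ranges below.
_INVALID_XML_CHARS = re.compile(
    "[^\x09\x0A\x0D\x20-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def strip_invalid_xml_chars(text):
    """Remove characters that make an XML 1.0 document not well formed."""
    return _INVALID_XML_CHARS.sub("", text)


def parse_bug_xml(text):
    """Clean the text first, then parse it into an element tree."""
    return ET.fromstring(strip_invalid_xml_chars(text))
```

Note that this silently discards the offending characters; replacing them with a visible placeholder would be an equally reasonable choice.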

Don't check the URL with the full list of issues

Currently, when Bicho checks its configuration, it downloads the URL with the full list of issues just to test that the remote server is reachable. We should use the global URL instead, so we don't load the server with this query. The results of the query are not used!

try:
            print("Checking URL: "+Config.url)
            response = urlopen(req)
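
Assuming "global URL" means just the scheme and host of the tracker, it could be derived as in the Python 3 sketch below. `base_url` is a hypothetical helper, not an existing Bicho function.

```python
from urllib.parse import urlsplit


def base_url(url):
    """Return only the scheme and host of a tracker URL.

    Checking this URL tests reachability without running the full
    issue-list query on the remote server.
    """
    parts = urlsplit(url)
    return "%s://%s" % (parts.scheme, parts.netloc)


check = base_url("https://bugzilla.libresoft.es/buglist.cgi?product=bicho")
```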

Error when using Bicho with innodb - foreign key references

It seems that there are some problems with foreign key references when the engine is InnoDB, due to the additional checks (for example, when creating the trackers table there is a reference to tracker_types - shouldn't that be supported_trackers?).

There are some problems with using MySQL on MacOS, so switching to MyISAM didn't work very smoothly. As a workaround I just added engine=MYISAM to the table declarations in the following files:

Bicho/backends/bg.py
Bicho/db/mysql.py

Error output can be found at the following address:
http://pastebin.com/cEtuEDi0

Error parsing extra date fields in Bugzilla

The crash occurs while parsing extra date fields from Bugzilla (such as deadline). The database expects a datetime object but Bicho sends a Unicode string.

Some conversion is required before sending these dates to the database, in the same way as Bicho does with the submitted_on field.

Example: http://itforgebugzilla.atosresearch.eu/bugzilla/show_bug.cgi?id=517&ctype=xml

Traceback (most recent call last):
  File "/home/sduenas/devel/ws/bin/bicho", line 25, in <module>
    retval = Bicho.main.main()
  File "/home/sduenas/devel/ws/lib/python2.7/site-packages/Bicho/main.py", line 54, in main
    backend.run()
  File "/home/sduenas/devel/ws/lib/python2.7/site-packages/Bicho/backends/bg.py", line 1082, in run
    bugsdb.insert_issue(issue_data, dbtrk.id)
  File "/home/sduenas/devel/ws/lib/python2.7/site-packages/Bicho/db/database.py", line 183, in insert_issue
    self.backend.insert_issue_ext(self.store, issue, db_issue.id)
  File "/home/sduenas/devel/ws/lib/python2.7/site-packages/Bicho/backends/bg.py", line 181, in insert_issue_ext
    db_issue_ext.deadline = self.__return_unicode(issue.deadline)
  File "/usr/local/lib/python2.7/dist-packages/storm/properties.py", line 67, in __set__
    obj_info.variables[column].set(value)
  File "/usr/local/lib/python2.7/dist-packages/storm/variables.py", line 426, in parse_set
    raise TypeError("Expected datetime, found %s" % repr(value))
TypeError: Expected datetime, found u'2012-03-31'
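
The conversion asked for above could look like this Python 3 sketch. It assumes the extra date fields arrive as `YYYY-MM-DD` strings, as in the traceback; `to_datetime` is a hypothetical helper and not the code Bicho uses for submitted_on.

```python
from datetime import datetime


def to_datetime(value, fmt="%Y-%m-%d"):
    """Convert a date string into a datetime; empty values become None."""
    if value is None or value == "":
        return None
    if isinstance(value, datetime):
        return value  # already converted, nothing to do
    return datetime.strptime(value, fmt)


deadline = to_datetime("2012-03-31")
```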

Info about remaining time in JIRA's backend is not reliable

DBG: [22/Feb/2013-13:47:40] http://issues.liferay.com/sr/jira.issueviews:searchrequest-xml/temp/SearchRequest.xml?pid=AUI&sorter/field=updated&sorter/order=INC&updated:after=2012-08-11&tempMax=10&pager/start=70
DBG: [22/Feb/2013-13:47:42] Bug activity: http://issues.liferay.com/browse/AUI-719?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:47:45] Bug activity: http://issues.liferay.com/browse/AUI-670?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:47:47] Bug activity: http://issues.liferay.com/browse/AUI-686?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:47:49] Bug activity: http://issues.liferay.com/browse/AUI-445?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:47:51] Bug activity: http://issues.liferay.com/browse/AUI-714?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:47:53] Bug activity: http://issues.liferay.com/browse/AUI-724?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:47:54] Bug activity: http://issues.liferay.com/browse/AUI-725?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:47:57] Bug activity: http://issues.liferay.com/browse/AUI-732?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:47:59] Bug activity: http://issues.liferay.com/browse/AUI-731?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:48:01] Bug activity: http://issues.liferay.com/browse/AUI-730?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
Remaining time:  0 m ( 143 )

Error creating log table for a JIRA project

The error below is caused by an error getting the dates. We get seconds for the creation date. We assign 00 seconds to the changes of the history, which are retrieved using an HTML parser. So we end up having bugs created after the first entry in the history.

luis@tahine:~/repos/Bicho$ ./bicho -g --db-user-out=root --db-password-out=root --db-database-out=bicho_bug43 -d 0 -b jira -u http://issues.liferay.com/browse/IDE
Checking URL: http://issues.liferay.com
DBG: [22/Feb/2013-13:37:42] Bicho object created, options and backend initialized
Running Bicho with delay of 0 seconds
DBG: [22/Feb/2013-13:37:42] Last bugs cached were modified on: 2013-02-22
DBG: [22/Feb/2013-13:37:42] Getting number of issues: http://issues.liferay.com/sr/jira.issueviews:searchrequest-xml/temp/SearchRequest.xml?pid=IDE&sorter/field=updated&sorter/order=INC&updated:after=2013-02-22&tempMax=1
Total bugs 3
DBG: [22/Feb/2013-13:37:43] http://issues.liferay.com/sr/jira.issueviews:searchrequest-xml/temp/SearchRequest.xml?pid=IDE&sorter/field=updated&sorter/order=INC&updated:after=2013-02-22&tempMax=10&pager/start=0
DBG: [22/Feb/2013-13:37:44] Bug activity: http://issues.liferay.com/browse/IDE-827?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:37:45] Bug activity: http://issues.liferay.com/browse/IDE-820?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
DBG: [22/Feb/2013-13:37:47] Bug activity: http://issues.liferay.com/browse/IDE-825?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
/usr/lib/python2.7/dist-packages/storm/database.py:371: Warning: Data truncated for column 'environment' at row 1
  return function(*args, **kwargs)
Remaining time:  0 m ( -7 )
Done. 3 bugs analyzed
self.backend_name = jira
DBG: [22/Feb/2013-13:37:49] Last change logged at 2010-03-19 16:49:00
Traceback (most recent call last):
  File "./bicho", line 8, in <module>
    retval = Bicho.main.main()
  File "/home/luis/repos/Bicho/Bicho/main.py", line 59, in main
    il.run()
  File "/home/luis/repos/Bicho/Bicho/post_processing/logtable.py", line 776, in run
    (db_ilog.issue, date))
AttributeError: 'NoneType' object has no attribute 'issue'
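
The date mismatch described in this issue can be illustrated with a Python 3 sketch that compares both dates at minute resolution. This is one assumed way to make the comparison consistent, not the fix applied in Bicho; the function names and sample values are hypothetical.

```python
from datetime import datetime


def to_minute(value):
    """Drop seconds and microseconds so both sources share one resolution."""
    return value.replace(second=0, microsecond=0)


def created_before_or_at(created_on, first_change_on):
    """Compare at minute resolution, since history entries carry no seconds."""
    return to_minute(created_on) <= to_minute(first_change_on)


created = datetime(2010, 3, 19, 16, 49, 37)      # creation date with seconds
first_change = datetime(2010, 3, 19, 16, 49, 0)  # history entry, 00 seconds
```

Compared directly, `created` is later than `first_change`; at minute resolution the two are equal.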

Add option to download only pending issues

Sometimes there are problems when downloading issues, and you end up with a partial issues database. It could be useful to have an option "only_pending" to download only the issues you have not downloaded previously. While creating the list of issues to be downloaded, the issues already in the database would be filtered out.
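
The filtering step could be as small as the Python 3 sketch below; `pending_issues`, `listed_ids` and `stored_ids` are hypothetical names.

```python
def pending_issues(listed_ids, stored_ids):
    """Return the listed issue ids that are not yet in the database.

    The original order of the listing is preserved.
    """
    stored = set(stored_ids)
    return [issue_id for issue_id in listed_ids if issue_id not in stored]


todo = pending_issues([1, 2, 3, 4], [2, 4])
```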

Results limit in buglist queries

Bicho doesn't retrieve all the bugs when a Bugzilla instance sets a maximum number of results for buglist queries. Some Bugzilla instances limit the number of bugs returned by the buglist query. This limit is usually set to 10K bugs.

For instance, Eclipse returns a maximum of 10K bugs when searching for the Platform product's bugs, but there are bugs for this product dating back to 2001 that are not returned by this query.

https://bugs.eclipse.org/bugs/buglist.cgi?product=Platform
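
One way to work around a capped result list is to query repeatedly, each time starting from the latest change date seen so far. The Python 3 sketch below assumes the list is ordered by change date and can be filtered by a start date; `fetch_all` and `fetch_page` are hypothetical names, and no real Bugzilla call is made.

```python
def fetch_all(fetch_page, start):
    """Collect every issue id from a server that caps each result page.

    `fetch_page(from_date)` must return (issue_id, changed_on) pairs with
    changed_on >= from_date, ordered from oldest to newest.
    """
    seen = {}
    cursor = start
    while True:
        page = fetch_page(cursor)
        new = [(issue_id, date) for issue_id, date in page if issue_id not in seen]
        if not new:
            break  # nothing new came back: we have reached the end
        for issue_id, date in new:
            seen[issue_id] = date
        # Restart from the newest date seen; overlapping ids are skipped above.
        cursor = max(date for _, date in page)
    return sorted(seen)
```

The loop stops early if more issues share a single change date than the cap allows, so a real implementation would need a finer-grained cursor for that case.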

Incremental support for Redmine

Self-explanatory. It must also support multiple trackers in the same Redmine instance, sharing the same database.

Error reading milestone's code name

Stefano Maffulli sent us this bug:

Error in function analyzeBug with URL: '
'https://bugs.launchpad.net/openstack and Bug:
https://api.launchpad.net/1.0/nova/+bug/989764
Traceback (most recent call last):
  File "/usr/local/bin/bicho", line 25, in <module>
    retval = Bicho.main.main()
  File "/usr/local/lib/python2.7/dist-packages/Bicho/main.py", line 54, in main
    backend.run()
  File "/usr/local/lib/python2.7/dist-packages/Bicho/backends/lp.py",
line 997, in run
    issue_data = self.analyze_bug(bug)
  File "/usr/local/lib/python2.7/dist-packages/Bicho/backends/lp.py",
line 823, in analyze_bug
    issue.set_milestone_code_name(bug.milestone.code_name)
  File "/usr/lib/python2.7/dist-packages/lazr/restfulclient/resource.py",
line 688, in __getattr__
    return super(Entry, self).__getattr__(name)
  File "/usr/lib/python2.7/dist-packages/lazr/restfulclient/resource.py",
line 331, in __getattr__
    return self.lp_get_parameter(attr)
  File "/usr/lib/python2.7/dist-packages/lazr/restfulclient/resource.py",
line 215, in lp_get_parameter
    self._ensure_representation()
  File "/usr/lib/python2.7/dist-packages/lazr/restfulclient/resource.py",
line 357, in _ensure_representation
    representation = self._root._browser.get(self._wadl_resource)
  File "/usr/lib/python2.7/dist-packages/lazr/restfulclient/_browser.py",
line 291, in get
    response, content = self._request(url, extra_headers=headers)
  File "/usr/lib/python2.7/dist-packages/lazr/restfulclient/_browser.py",
line 242, in _request
    str(url), method=method, body=data, headers=headers)
  File "/usr/lib/python2.7/dist-packages/lazr/restfulclient/_browser.py",
line 211, in _request_and_retry
    url, method=method, body=body, headers=headers)
  File "/usr/lib/python2.7/dist-packages/httplib2/__init__.py", line
1346, in request
    info, content = cached_value.split('\r\n\r\n', 1)
ValueError: need more than 1 value to unpack

Error when retrieving information from Webkit Bugzilla: ignored fields in changes table

In some cases there are fields that are not stored in the changes table.

As an example (based on information found at https://bugs.webkit.org/show_activity.cgi?id=12340), fields whose name is "Attachment #12715 Flag" are not correctly stored.

The output of the mysql database for that specific report:

mysql> select * from changes where issue_id=12340;
+--------+----------+----------------+------------------------+-----------------------------+------------+---------------------+
| id | issue_id | field | old_value | new_value | changed_by | changed_on |
+--------+----------+----------------+------------------------+-----------------------------+------------+---------------------+
| 135066 | 12340 | | | review?, commit-queue? | 449 | 2011-05-04 02:57:57 |
| 135067 | 12340 | | review?, commit-queue? | | 449 | 2011-05-04 21:25:35 |
| 135068 | 12340 | | 0 | 1 | 449 | 2011-05-04 21:25:35 |
| 135069 | 12340 | | | review?, commit-queue? | 449 | 2011-05-04 21:25:43 |
| 135070 | 12340 | status | UNCONFIRMED | NEW | 607 | 2011-05-04 22:53:59 |
| 135071 | 12340 | Ever Confirmed | 0 | 1 | 607 | 2011-05-04 22:53:59 |
| 135072 | 12340 | Blocks | | 60244 | 449 | 2011-05-10 21:34:50 |
| 135073 | 12340 | | review?, commit-queue? | | 449 | 2011-05-11 23:56:57 |

[...]

As seen, there are empty values for the column "field".

Perhaps this is due to the way Beautiful Soup handles these cells: the missing fields are partially links, while typical values in this column are plain text.
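
If that guess is right, the cell text has to be collected from all nested nodes, not only from the cell's direct string. The Python 3 sketch below shows the idea with the standard library's HTMLParser instead of Beautiful Soup; the sample markup is an assumption about what the activity page looks like, and `CellText` and `cell_texts` are hypothetical names.

```python
from html.parser import HTMLParser


class CellText(HTMLParser):
    """Collect the full text of every <td>, including text inside links."""

    def __init__(self):
        super().__init__()
        self.cells = []
        self._parts = None  # text fragments of the cell being read

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self._parts = []

    def handle_data(self, data):
        if self._parts is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._parts is not None:
            # Join fragments and normalize whitespace.
            self.cells.append(" ".join("".join(self._parts).split()))
            self._parts = None


def cell_texts(html):
    parser = CellText()
    parser.feed(html)
    parser.close()
    return parser.cells
```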

Error in database character sets while comparing dates

Running Bicho, I get the following error:

bicho -o db -b bg --db-user-out root --db-password-out root --db-database-out solid_bicho -d 5 -u https://bugs.kde.org/buglist.cgi?product=solid
Traceback (most recent call last):
  File "/usr/local/bin/bicho", line 25, in <module>
    retval = Bicho.main.main()
  File "/usr/local/lib/python2.7/dist-packages/Bicho/main.py", line 54, in main
    backend.run()
  File "/usr/local/lib/python2.7/dist-packages/Bicho/backends/bg.py", line 1064, in run
    issues = self.analyze_bug_list(query_bugs, url, dbtrk.id, bugsdb)
  File "/usr/local/lib/python2.7/dist-packages/Bicho/backends/bg.py", line 948, in analyze_bug_list
    bugsdb.insert_issue(issues[bug_id], dbtrk_id)
  File "/usr/local/lib/python2.7/dist-packages/Bicho/db/database.py", line 195, in insert_issue
    db_comment = self._get_db_comment(comment, db_issue.id, tracker_id)
  File "/usr/local/lib/python2.7/dist-packages/Bicho/db/database.py", line 488, in _get_db_comment
    DBComment.submitted_on == comment.submitted_on).one()
  File "/usr/lib/python2.7/dist-packages/storm/store.py", line 1142, in one
    result = self._store._connection.execute(select)
  File "/usr/lib/python2.7/dist-packages/storm/databases/mysql.py", line 106, in execute
    return Connection.execute(self, statement, params, noresult)
  File "/usr/lib/python2.7/dist-packages/storm/database.py", line 238, in execute
    raw_cursor = self.raw_execute(statement, params)
  File "/usr/lib/python2.7/dist-packages/storm/database.py", line 322, in raw_execute
    self._check_disconnect(raw_cursor.execute, *args)
  File "/usr/lib/python2.7/dist-packages/storm/database.py", line 371, in _check_disconnect
    return function(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/MySQLdb/cursors.py", line 174, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib/python2.7/dist-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
    raise errorclass, errorvalue
_mysql_exceptions.OperationalError: (1267, "Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='")

Bugzilla version: 3.2.2 fails in incremental analysis

HI!

With the latest changes in Bicho, incremental analysis fails on Bugzilla version 3.2.2.

A solution is:

acs@lenovix:~/devel/Bicho$ git diff Bicho/backends/bg.py
diff --git a/Bicho/backends/bg.py b/Bicho/backends/bg.py
index a9c624a..7f4f03c 100644
--- a/Bicho/backends/bg.py
+++ b/Bicho/backends/bg.py
@@ -1055,6 +1055,8 @@ class BGBackend(Backend):
                 printout("No more issues to retrieve")

     def _retrieve_issues_ids(self, base_url, version, from_date, not_retrieved=True):
+        # hack until we talk with sduenas - acs
+        from_date = from_date.split(" ")[0]
         url = self._get_issues_list_url(base_url, version, from_date)
         printdbg("Getting bugzilla issues from %s" % url)

Error creating log table for a Bugzilla project

Seen when bicho is launched for a second project/tracker sharing the same database

luis@tahine:~/repos/Bicho$ ./bicho -g --db-user-out=root --db-password-out=root --db-database-out=bicho_bug40 -d 0 -b bg -u https://bugzilla.libresoft.es/buglist.cgi?product=cvsanaly
...
DBG: [22/Feb/2013-16:28:37] Getting bugzilla issues from https://bugzilla.libresoft.es/buglist.cgi?product=cvsanaly&order=changeddate&ctype=csv&chfieldfrom=2012-05-18%2014:11:52
Round #9 - Total issues to retrieve: 2
DBG: [22/Feb/2013-16:28:38] Issues to retrieve from: https://bugzilla.libresoft.es/show_bug.cgi?id=1&id=2&ctype=xml&excludefield=attachmentdata
DBG: [22/Feb/2013-16:28:39] Retrieving activity of issue #1 from https://bugzilla.libresoft.es/show_activity.cgi?id=1
DBG: [22/Feb/2013-16:28:41] Issue #1 stored 
DBG: [22/Feb/2013-16:28:41] Retrieving activity of issue #2 from https://bugzilla.libresoft.es/show_activity.cgi?id=2
DBG: [22/Feb/2013-16:28:42] Issue #2 stored 
DBG: [22/Feb/2013-16:28:42] Last issues cached were modified on: 2012-08-19 08:11:06
DBG: [22/Feb/2013-16:28:42] Getting bugzilla issues from https://bugzilla.libresoft.es/buglist.cgi?product=cvsanaly&order=changeddate&ctype=csv&chfieldfrom=2012-08-19%2008:11:06
DBG: [22/Feb/2013-16:28:43] No issues found for date 2012-08-19 08:11:06. Trying with 2012-08-19 08:11:07
DBG: [22/Feb/2013-16:28:43] Getting bugzilla issues from https://bugzilla.libresoft.es/buglist.cgi?product=cvsanaly&order=changeddate&ctype=csv&chfieldfrom=2012-08-19%2008:11:07
No more issues to retrieve
Done. 39 issues retrieved
self.backend_name = bg
DBG: [22/Feb/2013-16:28:44] Last change logged at 2012-06-07 16:58:52
Traceback (most recent call last):
  File "./bicho", line 8, in <module>
    retval = Bicho.main.main()
  File "/home/luis/repos/Bicho/Bicho/main.py", line 59, in main
    il.run()
  File "/home/luis/repos/Bicho/Bicho/post_processing/logtable.py", line 776, in run
    (db_ilog.issue, date))
AttributeError: 'NoneType' object has no attribute 'issue'

Problem cloning repo in case insensitive systems

The root directory of Bicho's git repository contains two files, bicho and Bicho (in fact, a folder), that have the same name under case insensitive file systems. As a result, when cloning the repository, only one of them is cloned (in my case, I get the bicho file, but not the Bicho folder).
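
Such collisions can be detected before they bite. The Python 3 sketch below groups names that differ only by case; `case_collisions` is a hypothetical helper.

```python
from collections import defaultdict


def case_collisions(names):
    """Return groups of names that are identical when case is ignored."""
    groups = defaultdict(list)
    for name in names:
        groups[name.lower()].append(name)
    return [sorted(group) for group in groups.values() if len(group) > 1]


clashes = case_collisions(["bicho", "Bicho", "setup.py"])
```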

_mysql_exceptions.OperationalError: (1267, "Illegal mix of collations

Running bicho for getting Evince bugs from GNOME Bugzilla, bg backend, I get this error:

Traceback (most recent call last):
  File "/usr/local/bin/bicho", line 25, in <module>
    retval = Bicho.main.main()
  File "/usr/local/lib/python2.6/dist-packages/Bicho/main.py", line 54, in main
    backend.run()
  File "/usr/local/lib/python2.6/dist-packages/Bicho/backends/bg.py", line 1071, in run
    bugsdb.insert_issue(issue_data, dbtrk.id)
  File "/usr/local/lib/python2.6/dist-packages/Bicho/db/database.py", line 195, in insert_issue
    db_comment = self._get_db_comment(comment, db_issue.id, tracker_id)
  File "/usr/local/lib/python2.6/dist-packages/Bicho/db/database.py", line 488, in _get_db_comment
    DBComment.submitted_on == comment.submitted_on).one()
  File "/usr/local/lib/python2.6/dist-packages/storm-0.19-py2.6-linux-x86_64.egg/storm/store.py", line 1142, in one
    result = self._store._connection.execute(select)
  File "/usr/local/lib/python2.6/dist-packages/storm-0.19-py2.6-linux-x86_64.egg/storm/databases/mysql.py", line 106, in execute
    return Connection.execute(self, statement, params, noresult)
  File "/usr/local/lib/python2.6/dist-packages/storm-0.19-py2.6-linux-x86_64.egg/storm/database.py", line 238, in execute
    raw_cursor = self.raw_execute(statement, params)
  File "/usr/local/lib/python2.6/dist-packages/storm-0.19-py2.6-linux-x86_64.egg/storm/database.py", line 322, in raw_execute
    self._check_disconnect(raw_cursor.execute, *args)
  File "/usr/local/lib/python2.6/dist-packages/storm-0.19-py2.6-linux-x86_64.egg/storm/database.py", line 371, in _check_disconnect
    return function(*args, **kwargs)
  File "/usr/lib/pymodules/python2.6/MySQLdb/cursors.py", line 166, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib/pymodules/python2.6/MySQLdb/connections.py", line 35, in defaulterrorhandler
    raise errorclass, errorvalue
_mysql_exceptions.OperationalError: (1267, "Illegal mix of collations (latin1_swedish_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation '='")

Error retrieving name of the author

I've seen this while downloading issues from http://issues.liferay.com/browse/IDE

DBG: [21/Feb/2013-19:15:47] Bug activity: http://issues.liferay.com/browse/IDE-54?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
Change author format not supported. Change lost!
Change author format not supported. Change lost!
Change author format not supported. Change lost!
Change author format not supported. Change lost!
Change author format not supported. Change lost!

Bicho to analyze a private project in Bugzilla

Bicho is not able to get the list of bugs of a private project in Bugzilla. Bicho was not designed to analyze private projects, and the account provided is only used to get extra data in each per-bug request.

The change is simple: the cookies must also be used in the request that gets the list of bugs. This should be enough.
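
In Python 3 terms, sharing cookies means sending every request through the same opener. The sketch below only shows that wiring with the standard library; it does not reproduce Bicho's login code, and `make_session` is a hypothetical name.

```python
from http.cookiejar import CookieJar
from urllib.request import HTTPCookieProcessor, build_opener


def make_session():
    """Build an opener that stores cookies and resends them on each request.

    Using the same opener for the login request, the bug list request and
    the per-bug requests keeps all of them authenticated.
    """
    jar = CookieJar()
    opener = build_opener(HTTPCookieProcessor(jar))
    return opener, jar


opener, jar = make_session()
```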

Crash while parsing bugs that don't have description

Bicho crashes while parsing bugs whose 'long_desc' field (the description of the bug) is empty.

Example: http://itforgebugzilla.atosresearch.eu/bugzilla/show_bug.cgi?id=439&ctype=xml

Error in function analyzeBug with URL: http://itforgebugzilla.atosresearch.eu/bugzilla/ and Bug: 439
Traceback (most recent call last):
  File "/home/sduenas/devel/ws/bin/bicho", line 25, in <module>
    retval = Bicho.main.main()
  File "/home/sduenas/libresoft/devel/bicho/Bicho/main.py", line 54, in main
    backend.run()
  File "/home/sduenas/libresoft/devel/bicho/Bicho/backends/bg.py", line 1064, in run
    issue_data = self.analyze_bug(bug, url)
  File "/home/sduenas/libresoft/devel/bicho/Bicho/backends/bg.py", line 947, in analyze_bug
    issue = handler.get_issue()
  File "/home/sduenas/libresoft/devel/bicho/Bicho/backends/bg.py", line 774, in get_issue
    desc = self.ctags["long_desc"][0]["thetext"]
IndexError: list index out of range
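
A defensive version of the failing lookup is sketched below in Python 3. It assumes `ctags` maps tag names to lists of dictionaries, as the traceback suggests; `first_description` is a hypothetical helper.

```python
def first_description(ctags):
    """Return the text of the first long_desc entry, or "" if there is none."""
    comments = ctags.get("long_desc") or []
    if not comments:
        return ""  # bug without description: avoid the IndexError
    return comments[0].get("thetext", "")


empty = first_description({"long_desc": []})
```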

Error cleaning up XML in Bugzilla analysis

Error when analysing the Webkit tracker

Error parsing URL: https://bugzilla.webkit.org/show_bug.cgi?id=56587&id=61069&id=91335&id=92084&id=77377&id=91998&id=92090&id=65632&id=92087&id=82510&id=35010&id=65533&id=92074&id=92070&id=91449&id=92002&id=91836&id=92021&id=92083&id=91841&id=92085&id=92077&id=92060&id=91593&id=92073&id=92082&id=92010&id=92064&id=38882&id=92068&id=80269&id=63244&id=91971&id=83464&id=83398&id=83529&id=80602&id=79685&id=62857&id=79670&id=77446&id=76453&id=92055&id=73526&id=69896&id=70049&id=87899&id=66004&id=66566&id=65539&id=65200&id=65171&id=92066&id=69658&id=69544&id=69398&id=69372&id=69317&id=69202&id=69167&id=77990&id=91961&id=89148&id=91966&id=91948&id=92058&id=91694&id=57583&id=90037&id=91949&id=84321&id=90325&id=92047&id=61524&id=91917&id=92049&id=91913&id=92048&id=85444&id=92046&id=91825&id=91829&id=91963&id=91761&id=92038&id=81857&id=91674&id=92023&id=91703&id=92030&id=92032&id=91927&id=92034&id=92036&id=91541&id=91981&id=91972&id=92031&id=88555&id=91459&id=92028&id=91884&id=92027&id=53141&id=92026&id=92024&id=91982&id=80622&id=92014&id=91980&id=90873&id=92013&id=91639&id=88271&id=92017&id=92022&id=92007&id=66615&id=92020&id=92019&id=91654&id=92018&id=91997&id=91171&id=92004&id=89519&id=84802&id=81488&id=70708&id=91868&id=91899&id=91249&id=91942&id=92006&id=91857&id=90175&id=91193&id=92003&id=89597&id=91717&id=75071&id=83440&id=91996&id=91958&id=91995&id=91994&id=91979&id=91985&id=91945&id=91939&id=91984&id=91986&id=91977&id=91978&id=89719&id=91782&id=91967&id=91938&id=91921&id=91975&id=91960&id=91959&id=91935&id=42778&id=91837&id=91937&id=91728&id=91826&id=91947&id=83156&id=83436&id=91950&id=75716&id=46248&id=75070&id=89055&id=91577&id=91764&id=91624&id=90937&id=19937&id=91934&id=91953&id=91708&id=91941&id=91946&id=91944&id=76321&id=88937&id=91928&id=91918&id=91796&id=91930&id=90679&id=91923&id=90783&id=84567&id=13351&id=86581&id=91922&id=87935&id=91916&id=91499&id=91914&id=91915&id=91846&id=91848&id=91909&id=91874&id=91834&id=91905&id=91901&id=86911&id=91904&id=56151&id=46
283&id=91886&id=91535&id=91902&id=91731&id=40103&id=91903&id=91414&id=91636&id=91762&id=91827&id=91506&id=83370&id=70617&id=89696&id=91873&id=91569&id=91887&id=90227&id=82372&id=91789&id=91895&id=91866&id=91893&id=80644&id=91882&id=91801&id=83187&id=91883&id=90469&id=89767&id=91421&id=90517&id=91847&id=91403&id=91275&id=91133&id=91672&id=91876&id=91870&id=91758&id=91526&id=91784&id=91875&id=30187&id=82835&id=91880&id=85591&id=91474&id=91865&id=69295&id=91867&id=40673&id=91817&id=91869&id=17672&id=91863&id=91747&id=85527&id=91159&id=91767&id=91840&id=91859&id=84813&id=91839&id=90792&id=91845&id=87246&id=91757&id=90455&id=79354&id=91830&id=91838&id=91833&id=90676&id=91721&id=91714&id=90289&id=53932&id=90976&id=90931&id=63952&id=91590&id=17709&id=91745&id=63062&id=85958&id=91629&id=90990&id=91819&id=91816&id=91814&id=91808&id=91209&id=85754&id=90604&id=91797&id=91795&id=91686&id=91729&id=91799&id=91798&id=77383&id=90692&id=91651&id=89796&id=90642&id=90182&id=91081&id=91763&id=50126&id=86016&id=16496&id=83432&id=91483&id=91537&id=91571&id=91142&id=22882&id=6007&id=68089&id=8191&id=91787&id=85140&id=90419&id=91786&id=91785&id=91637&id=91140&id=91690&id=91780&id=91722&id=91781&id=91770&id=90731&id=91777&id=91655&id=91271&id=85826&id=87711&id=90252&id=90713&id=91765&id=91650&id=91706&id=91760&id=89391&id=91134&id=71406&id=91682&id=90284&id=91148&id=91668&id=80576&id=91644&id=90311&id=91243&id=65801&id=50144&id=91671&id=91669&id=91555&id=85174&id=91246&id=85817&id=91132&id=91753&id=91679&id=87844&id=91740&id=91744&id=91751&id=91749&id=91725&id=91026&id=91092&id=91645&id=91741&id=89457&id=91735&id=91739&id=81126&id=91428&id=47727&id=77012&id=91730&id=91500&id=89987&id=89978&id=91699&id=91640&id=88747&id=90169&id=91464&id=91705&id=89224&id=91716&id=59305&id=89544&id=90788&id=59832&id=91044&id=82697&id=80472&id=91493&id=91720&id=91341&id=86215&id=91715&id=91713&id=86196&id=89648&id=91711&id=91641&id=81883&id=91712&id=91356&id=91564&id=87364&id=21692&id=91626&id=81882&id=88077&
id=85223&id=89748&id=91622&id=91663&id=91594&id=91489&id=90508&id=91070&id=90039&id=91547&id=91408&id=91649&id=91696&id=91579&id=91692&id=91597&id=91461&id=91695&id=91550&id=91693&id=91687&id=88382&id=91684&id=91681&id=90581&id=91444&id=91680&id=91691&id=91678&id=87987&id=91253&id=91677&id=91670&id=89502&id=91631&id=91422&id=91683&id=11355&id=91659&id=82236&id=83628&id=91562&id=91599&id=91642&id=24880&id=90626&id=91549&id=91304&id=91673&id=91514&id=91565&id=91602&id=91334&id=91652&id=90762&id=91558&id=91666&id=91030&id=84308&id=91662&id=90320&id=91647&ctype=xml
Traceback (most recent call last):
  File "./bicho", line 8, in <module>
    retval = Bicho.main.main()
  File "/home/lcanas/repos/Bicho/Bicho/main.py", line 54, in main
    backend.run()
  File "/home/lcanas/repos/Bicho/Bicho/backends/bg.py", line 1143, in run
    issues = self.analyze_bug_list(query_bugs, url, dbtrk.id, bugsdb)
  File "/home/lcanas/repos/Bicho/Bicho/backends/bg.py", line 1005, in analyze_bug_list
    self.safe_xml_parse(bugs_url, handler);
  File "/home/lcanas/repos/Bicho/Bicho/backends/bg.py", line 986, in safe_xml_parse
    join(c for c in contents if self.valid_XML_char_ordinal(ord(c)))
UnboundLocalError: local variable 'contents' referenced before assignment
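
The "Error parsing URL" message followed by an UnboundLocalError suggests that the download failed, the failure was only logged, and the code then went on to use a variable that was never assigned. The body of safe_xml_parse is not shown here, so the following is only a minimal sketch of that pattern and one possible way to guard against it. It is written for Python 3. The names fetch_fragile, fetch_safe and the injected fetch callable are hypothetical, and the character filter is an assumption based on the XML 1.0 character ranges, not the code in bg.py.

```python
def valid_xml_char_ordinal(i):
    # Assumed filter: code points allowed by the XML 1.0 "Char" production.
    return (0x20 <= i <= 0xD7FF or i in (0x9, 0xA, 0xD)
            or 0xE000 <= i <= 0xFFFD or 0x10000 <= i <= 0x10FFFF)


def fetch_fragile(fetch, url):
    # Hypothetical reconstruction of the failing pattern.
    try:
        contents = fetch(url)
    except IOError:
        print("Error parsing URL: %s" % url)
    # If fetch() raised, 'contents' was never bound, so this line
    # raises UnboundLocalError instead of reporting the real problem.
    return "".join(c for c in contents if valid_xml_char_ordinal(ord(c)))


def fetch_safe(fetch, url):
    # Same logic, but the error path returns early so 'contents'
    # is always bound when it is used.
    try:
        contents = fetch(url)
    except IOError:
        print("Error parsing URL: %s" % url)
        return None
    return "".join(c for c in contents if valid_xml_char_ordinal(ord(c)))


def failing_fetch(url):
    # Stand-in for a download that fails (for example, a URL that is too long).
    raise IOError("request failed")
```

With failing_fetch, fetch_fragile raises UnboundLocalError while fetch_safe returns None, which lets the caller decide whether to retry with a shorter list of bug ids or skip the batch.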

Inconsistent number of issues after WebKit analysis

After running Bicho twice, the second database (db2) is not consistent with the results of the first run (db1), so there is a bug somewhere. The db1 numbers are more likely to be the correct ones.

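| year | month | # issues db1 | # issues db2 |
|------|-------|--------------|--------------|
| 2000 | 12    | 1            | 1            |
| 2005 | 6     | 331          | 325          |
| 2005 | 7     | 251          | 246          |
| 2005 | 8     | 323          | 321          |
| 2005 | 9     | 215          | 211          |
| 2005 | 10    | 208          | 203          |
| 2005 | 11    | 202          | 200          |
| 2005 | 12    | 279          | 270          |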
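
To make the size of the mismatch explicit, the per-month counts from the table can be compared directly. This is just a small sketch over the figures reported above. The names counts and monthly_gaps are illustrative, and nothing here queries the real databases.

```python
# (year, month, issues in db1, issues in db2), copied from the table above.
counts = [
    (2000, 12, 1, 1),
    (2005, 6, 331, 325),
    (2005, 7, 251, 246),
    (2005, 8, 323, 321),
    (2005, 9, 215, 211),
    (2005, 10, 208, 203),
    (2005, 11, 202, 200),
    (2005, 12, 279, 270),
]


def monthly_gaps(rows):
    # Keep only the months where the two runs disagree,
    # with the difference db1 - db2.
    return [(year, month, db1 - db2)
            for year, month, db1, db2 in rows if db1 != db2]


gaps = monthly_gaps(counts)
total_gap = sum(diff for _, _, diff in gaps)

for year, month, diff in gaps:
    print("%d-%02d: db2 has %d fewer issues than db1" % (year, month, diff))
print("total difference: %d" % total_gap)
```

For the months listed, db2 is lower than db1 in every month of 2005, by 33 issues in total, which would be consistent with the second run dropping some issues rather than the first run inventing them.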

Installation error related to IssuesLog

(reported by @acs)

After installing Bicho, running it fails with this error:

luis@tahine:~/repos$ bicho -g -d 1 --db-user-out=root --db-password-out=root --db-database-out=acs_bicho_allura_1049 allura http://sourceforge.net/rest/p/allura/tickets
Traceback (most recent call last):
  File "/usr/local/bin/bicho", line 21, in <module>
    import Bicho.main
  File "/usr/local/lib/python2.7/dist-packages/Bicho/main.py", line 34, in <module>
    from post_processing.logtable import IssuesLog
ImportError: No module named post_processing.logtable
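
One way to narrow this down is to check which parts of the installed package can actually be located, since the traceback shows that Bicho.main itself was found but post_processing.logtable was not. The snippet below is only a diagnostic sketch for Python 3 (importlib.util.find_spec is not available on the Python 2.7 interpreter shown in the traceback). The helper name is_importable is hypothetical, and the dotted name Bicho.post_processing.logtable is an assumption about where the module is expected to live.

```python
import importlib.util


def is_importable(name):
    # True if the import system can locate the dotted module name.
    # find_spec raises ModuleNotFoundError when a parent package is missing.
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        return False


# Assumed module paths; adjust to the layout of the installed package.
for name in ("Bicho", "Bicho.main", "Bicho.post_processing.logtable"):
    print("%-35s %s" % (name, "found" if is_importable(name) else "MISSING"))
```

If the top-level package is found but the subpackage is reported as missing, a likely (but unconfirmed) cause is that the subpackage was not included when the package was installed.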
