Comments (9)
I like the idea. CoSMoMVPA does something like this already, but in a toolbox-specific manner. In particular, the function cosmo_check_external keeps an internal list of references. When using cosmo_check_external('afni'), for example, it keeps track that the AFNI Matlab library has been used. The external toolboxes that were used can then be listed using cosmo_check_external('-cite'). For an example, see the very end of http://cosmomvpa.org/_static/publish/demo_surface_tfce.html
However, in CoSMoMVPA's case, the code and the citations are in the same file, which is not very generalisable. It would probably be better to separate the code keeping track of citations from the bibliography information. That doesn't seem difficult though.
Matlab does not have decorators, but it can just call a function (say duecredit_cite) that has a persistent variable keeping track of which items have been cited. Or we could use object-oriented style, preferably 'old' style to be Octave-compatible.
Another challenge may be supporting bibtex or other bibliography formats, as I am not aware of a bibtex parser.
from duecredit.
sweet, well done Nick ;)
I think a function (duecredit_cite) should suffice. I am scared of thinking OOP in Matlab/Octave but if you know some compatible way which will withstand the test of time -- sure ;)
Also I wonder how easy it would be to enable/disable (nothing to be done at all by duecredit_cite) the mode observing the "DUECREDIT_ENABLE" env variable?
Supporting formats could be left to the "duecredit core" which is what we are writing here, and which will support all that madness. So Matlab side would just need to collect those references (plain strings for bibtex or doi or url), save into some file (e.g. .duecredit.mat) which we would load in duecredit, process and add to the global list of citations.
If you are to cook up 'duecredit_cite.m' look at "API" we have so far on Python side: https://github.com/duecredit/duecredit/blob/master/duecredit/collector.py#L30
i.e. would be nice to have/support following arguments: (gy gy -- spotted a typo in inoked -- will fix)
entry: str or DueCreditEntry
    The entry to use, either identified by its id or a new one (to be added)
description: str, optional
    Description of what this functionality provides
path: str, optional
    Path to the object which this citation associated with. Format is
    "module[.submodules][:[class.]method]", i.e. ":" is used to separate module
    path from the path within the module.
version: str or tuple, version
    Version of the beast (e.g. of the module) where applicable
tags: list of str, optional
    Add tags for the reference for this method. Some tags have associated
    semantics in duecredit, e.g.
    - "implementation" [default] tag describes as an implementation of the cited
      method
    - "reference" tag describes as the original implementation of
      the cited method
    - "use" tag points to publications demonstrating a worthwhile noting use
      the method
    - "edu" references to tutorials, textbooks and other materials useful to learn
      more
    - "cite-on-import" for a module citation would make that module citeable even
      without internal duecredited functionality inoked. Should be used only for
      core packages whenever it is reasonable to assume that its import constitute
      its use (e.g. numpy)
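To make the path format and the default tag in the argument list above concrete, here is a minimal Python sketch. The helper names split_path and default_tags are hypothetical and are not part of duecredit; the sketch only illustrates the "module[.submodules][:[class.]method]" convention and the "implementation" default described in the docstring.

```python
from typing import List, Optional, Tuple


def split_path(path: str) -> Tuple[str, Optional[str]]:
    """Split "module[.submodules][:[class.]method]" at the first ":".

    Hypothetical helper: returns (module_path, path_within_module), where the
    second element is None when the path names only a module.
    """
    module, sep, inner = path.partition(":")
    return module, (inner if sep else None)


def default_tags(tags: Optional[List[str]] = None) -> List[str]:
    """Fall back to the "implementation" tag when no tags are given."""
    return ["implementation"] if tags is None else list(tags)


# Placeholder paths, not real packages.
print(split_path("pkg.sub:Class.method"))
print(split_path("pkg"))
print(default_tags())
```

A Matlab/Octave duecredit_cite could accept the same path string unchanged, so that both sides describe "what this reference belongs to" in one format.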
from duecredit.
@nno btw, comments/recommendations/etc on the API are also very welcome -- maybe we haven't foreseen additional use-cases which couldn't be covered with such a setup.
from duecredit.
"I think a function (duecredit_cite) should suffice"
Note that Matlab / Octave do not have direct import functionality. Thus, functions using duecredit would have to call that function directly. But since it may not be present on other machines, every time it is to be used it should be surrounded by try / catch, e.g.:
    try
        duecredit_cite('GNU Octave')
    catch
        % do nothing
    end
which is not very elegant...
"I am scared of thinking OOP in Matlab/Octave but if you know some compatible way which will withstand the test of time -- sure ;)"
OOP will be difficult, particularly if this is to be Octave compatible. Octave only supports 'old-style' OOP, which means syntax like this:
    duecredit = cite(duecredit, 'GNU Octave')
It seems that a function using a persistent variable is easiest.
"Also I wonder how easy it would be to enable/disable (nothing to be done at all by duecredit_cite) the mode observing the "DUECREDIT_ENABLE" env variable?"
That should be straightforward, as Matlab has a getenv function.
"Supporting formats could be left to the "duecredit core" which is what we are writing here, and which will support all that madness. So Matlab side would just need to collect those references (plain strings for bibtex or doi or url), save into some file (e.g. .duecredit.mat) which we would load in duecredit, process and add to the global list of citations."
Does that mean that duecredit for Matlab/ Octave would always require that Python and the Python code for duecredit is available? Many users of Matlab run on MS Windows and are unlikely to have that available, or may not be willing to invest the time setting that up... Alternatively, a pure Matlab / Octave implementation may be an option, but that involves a lot of code and effort duplication.
Also, I looked into the injector functionality. That's nice, but more difficult to achieve on Matlab / Octave. The only possible way I could see to make this work is to have subdirectories for each package that contain functions with the same names as the functions called by the respective package (such as ft_defaults for FieldTrip). Upon duecredit initialization, these subdirectories are added to the top of the search path, overriding the original function. Upon the first call of such a function, duecredit_cite is called and the directory is removed from the search path. However, such a solution is not very scalable (certain toolboxes have many functions that may be called), and also not elegant as it involves run-time modifications to the search path. It also does not work if a toolbox function is called from the toolbox directory itself, as the current directory has higher precedence than anything in the search path.
Thus, any ideas on how injector functionality can be achieved would be appreciated.
from duecredit.
1. The idea is probably to have something similar to our stub.py: since it would be trickier to overload name-spaces, I would say that the "stock" duecredit matlab/octave module would provide filenames with a trailing _ in them, e.g. duecredit_cite_.m, and then we will provide the ultimate "stub" module duecredit_cite.m which people would copy to their code-base to carry around and which will have the
    try
        duecredit_cite_('GNU Octave')
    catch
        % do nothing
    end
Not sure exactly what to do with different types of citation (for which we provide safe catch-alls within the same stub.py - BibTeX, Doi, etc) but I guess we could easily just make duecredit_cite accept as its first argument a string which would state what kind of reference the next argument is (bibtex, doi, url, etc). So we end up with only 1 file.
2. "Does that mean that duecredit for Matlab/ Octave would always require that Python" At the beginning -- I think so. But we will make duecredit available from everywhere possible (we have it on pypi already) -- standalone bundle, conda, etc, may be even we could provide some ugly duecredit_install.m to be shipped along to install a standalone bundle on a given system. Later we might cook up a duecredit.org website, to which folks could upload their citations manually or may be even that datalad_cite.m could get a basic implementation to upload those collected citations. I don't think it is worth reimplementing everything in matlab/octave.
3. Injectors -- primary motivation for them is to demonstrate benefit of duecredit this early in its life-time. I hope that eventually projects just adopt duecredit stub/citations within their code base so no injections would be necessary. Indeed messing up with path in matlab would be a cruelty better to avoid. And in Matlab land if e.g. cosmomvpa, eeglab, and few others adopt it -- that hopefully would provide sufficient demo/motivation for others.
from duecredit.
Picking up this thread...
- We could help people set up the duecredit_cite_ command. We could support different types of arguments, e.g.
    duecredit_cite('text',['John W. Eaton, David Bateman, Soren Hauberg, Rik Wehbring (2014).'...
                           ' GNU Octave version 3.8.1 manual <snip>'],...
                   'BiBTeX',[' @book{\n,author = {John W. Eaton <snip>'])
or as a starting point just use the 'text' version.
Alternatively, for already widely-used packages we could include the citation information directly with due_credit, so that something like
    duecredit_cite_('GNU Octave')
would automatically use the correct information that is part of duecredit.
Then, when run from Matlab / Octave, if the user calls
    duecredit_cite()
a list of citations is shown, or
    duecredit_cite('BiBTeX')
could show them in BiBTeX format if available, and in text for packages not provided in BiBTeX.
- I think a requirement of Python would make adoption much more difficult in Matlab/Octave land.
- Injectors are close to impossible in Octave / Matlab. So to demonstrate the benefit, we would have to convince project leaders to include duecredit in their project. For example, the AFNI Matlab library, Neuroelf, NIFTI library, GIFTI library, FieldTrip, EEGLab, surfing toolbox.
As a side note: how about providing an update mechanism for duecredit, so that recent citations can be added to a user's current duecredit installation? Or is that too invasive? It could be something that users have to allow explicitly.
from duecredit.
- Probably the best would be to mimic Python's API, i.e. having separate
  - duecredit_cite to add a (single) citation, with a tags argument to describe its nature (implementation, reference-implementation, edu, etc), and indeed a first argument depicting the 'type' of the provided reference ("text", "doi", "BiBTeX")
  - duecredit_summary for the output
- "for already widely-used packages we could include the citation information directly with due_credit," Yeap -- we would need something like that ;) and that is something what we already do on Python side (numpy, sklearn, ...) but I guess for octave we would need more automation since, as you have mentioned, injection is not possible. I.e. for some of them to define some 'checkers' which would automagically cite them? (e.g. if duecredit_cite is invoked from within octave -- add octave citation). I think it would be worth introducing the same notion of a "path" as we have in Python implementation to define what that reference belongs to
- "requirement of Python". I don't mind if we get also generic nearly full featured Matlab/Octave implementation. But what we should really assure probably is that the "database" is stored in a format which both could I/O -- json? (atm lazy us just dump Python's pickle)
- "convincing". indeed... for that we would need a good starting point I guess, e.g. Cosmo... if only there also was some Python module which would have called out to cosmo like nipype does into e.g. glm -- then we could really work out the case across environments. alternative is just e.g. having separate invocations of cosmo analysis script and then pymvpa script... but then collating it all into a single report
from duecredit.
- Separate duecredit_cite and duecredit_summary is fine. They both will have to call some other duecredit helper function to store internal state.
- It's difficult to include 'checkers' to see which packages have been used, due to lack of injection support. For Octave itself it is possible, but I don't see how to do it for other packages if they don't call some duecredit_cite themselves.
- JSON would be good, I found a free Matlab / Octave toolbox here:
- I think if we add something that is easy to use by other package developers, then it may be adopted widely. Actually CoSMoMVPA may be a good use case for this to try it out, and also already provides a basic implementation of most functionality required by
Re storing / keeping track of citations: not sure if that should be
  a. internally in the function, as currently done in https://github.com/CoSMoMVPA/CoSMoMVPA/blob/master/mvpa/cosmo_check_external.m - see lines 327 - 542
  b. stored in an external file
  c. done through calls to duecredit_cite
I would be tempted to have only c (with maybe a as fallback) - any thoughts?
- In terms of use case: I assume that in the Matlab / Octave environment, the user only has to install duecredit (add the appropriate directory to the search path), then can run their analysis scripts, and just call duecredit_summary() to get a summary? Or would it be more complicated?
from duecredit.
- done ;)
- yeah -- I saw checkers only as an addition for that limited set of cases, such as 'environment', which would include e.g. information about infrastructure itself (see #55). So they are invoked once in a runtime whenever any duecredit_cite command is invoked
- Cool. So that then should be the next one on our table to tackle -- decide on the format. I think we should have smth like a .duecredit/citations/ subdirectory where we would collect {octave,python}.json citations, which then would all be picked up by the summary command. We also might end up with .duecredit/config which then could be used to override defaults (e.g. format to output stuff in etc)
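As a rough illustration of how a summary command might pick up per-environment JSON files from a .duecredit/citations/ directory, here is a minimal Python sketch. The file schema (each file holding a JSON list of objects) and the function name load_citations are assumptions made for this example; the thread has not settled on a format.

```python
import json
from pathlib import Path
from typing import Dict, List, Union


def load_citations(root: Union[str, Path] = ".duecredit/citations") -> List[Dict]:
    """Merge citation records from every *.json file under root.

    Assumption: each file (e.g. octave.json, python.json) contains a JSON
    list of objects. Each record is tagged with the file it came from.
    """
    merged: List[Dict] = []
    for json_file in sorted(Path(root).glob("*.json")):
        with json_file.open(encoding="utf-8") as fh:
            records = json.load(fh)
        for record in records:
            # "source" records which environment wrote the citation.
            merged.append({**record, "source": json_file.stem})
    return merged
```

With this layout the Matlab/Octave side would only need to write octave.json; it would never have to read what the Python side stored.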
4.1. ah btw -- to ease adoption... That is why we came up with that https://github.com/duecredit/duecredit/blob/master/duecredit/stub.py which would be a minimal thing to include into projects to provide the necessary API but so that if duecredit is not installed -- their stuff still works as usual. I guess in case of the matlab/octave implementation, if overloading names would be tricky, the stub file could define those proxy duecredit_cite whenever the actual, bigger, duecredit module defines duecredit_cite_ to be called by those stubbed ones
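The stub idea can be sketched in a few lines of Python. This is a simplified illustration, not a copy of stub.py: the class name InactiveCollector and the helper get_collector are made up for the example, and the only thing assumed about the real package is that it exposes a due object at top level.

```python
import importlib


class InactiveCollector:
    """Do-nothing stand-in used when the real collector is unavailable."""

    active = False

    def cite(self, *args, **kwargs):
        return None

    def dcite(self, *args, **kwargs):
        # Return a decorator that leaves the decorated function untouched.
        def passthrough(func):
            return func
        return passthrough


def get_collector(module_name: str = "duecredit"):
    """Return the package's `due` object, or an inactive stand-in."""
    try:
        return importlib.import_module(module_name).due
    except (ImportError, AttributeError):
        return InactiveCollector()


# A deliberately nonexistent module name shows the fallback path.
due = get_collector("no_such_duecredit_module")


@due.dcite("some-entry-key")
def analysis(x):
    return x + 1
```

The proposed duecredit_cite.m stub would play the same role as InactiveCollector: the project's code calls it unconditionally, and it silently does nothing when duecredit_cite_ is not on the search path.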
4.2. In this Python implementation we have separated Entry (Doi, BibTex, etc) from the actual Citation. Entries could be loaded from anywhere -- a .bib file or specified directly in the code. Then Citation could either consume a new Entry or just a key identifying a preloaded one + accept usage tags and description for it. This way a single bibliographic reference could be used for multiple citations in different places, possibly with different description/path/etc. So we kinda supported all a b c I guess ;) But in the majority of usecases we seem to use only c atm.
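The Entry / Citation split can be illustrated with a small Python sketch. The classes below are hypothetical simplifications written for this example (they are not duecredit's own classes); they only show how one stored reference can back several citations with different paths and tags.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Union


@dataclass(frozen=True)
class Entry:
    """A bibliographic reference: what is being cited."""
    key: str
    kind: str   # e.g. "doi", "bibtex", "url", "text"
    value: str


@dataclass
class Citation:
    """One use of an Entry: where and why it is cited."""
    entry: Entry
    description: Optional[str] = None
    path: Optional[str] = None
    tags: List[str] = field(default_factory=lambda: ["implementation"])


class Registry:
    def __init__(self) -> None:
        self.entries: Dict[str, Entry] = {}
        self.citations: List[Citation] = []

    def cite(self, entry: Union[str, Entry], **kwargs) -> Citation:
        if isinstance(entry, Entry):
            self.entries[entry.key] = entry   # new entry: register it
        else:
            entry = self.entries[entry]       # key of a preloaded entry
        citation = Citation(entry=entry, **kwargs)
        self.citations.append(citation)
        return citation
```

Citing by key raises KeyError when the entry was never loaded, which is one way to surface a missing bibliography file early.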
5. Our thought was: by default duecredit should have no impact/effect on anything, and only when run with the 'DUECREDIT_ENABLE' environment variable does it start tracking. Then 'summary' is independent of that -- it just loads up stored citations, filters by desired tag, and presents them. But maybe we could/should enable it by default -- not sure, since with all the injections it does have some measurable impact on import time ATM.
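A minimal Python sketch of the "no impact unless enabled" behaviour follows. The set of values treated as "enabled" and the class name GatedCollector are assumptions for illustration; the thread only states that tracking starts when DUECREDIT_ENABLE is set.

```python
import os
from typing import List, Mapping, Optional


def is_enabled(environ: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when DUECREDIT_ENABLE asks for tracking.

    Assumption: "1", "yes", "true" and "on" (case-insensitive) count as
    enabled; anything else, or an unset variable, means disabled.
    """
    env = os.environ if environ is None else environ
    value = env.get("DUECREDIT_ENABLE", "no")
    return value.strip().lower() in {"1", "yes", "true", "on"}


class GatedCollector:
    """Records citations only when tracking is enabled."""

    def __init__(self, enabled: bool) -> None:
        self.enabled = enabled
        self.citations: List[str] = []

    def cite(self, entry: str) -> None:
        if not self.enabled:
            return  # disabled: do nothing at all
        self.citations.append(entry)
```

On the Matlab/Octave side the same check could be made once with getenv('DUECREDIT_ENABLE') inside duecredit_cite, so that disabled runs return immediately.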
from duecredit.