For simple collaboration in academic writing. Cite papers in a markdown or plain text file as a PMID, DOI, bib reference, or whole citation, and citeswitcher will find them, convert them all to the format of your choice, and make a .bib file for you.
python fixcitations.py test/test.md
You can cite papers in a plain text document like this [PMID: 24465370] or like this [https://doi.org/10.1371/journal.pone.0081229] or [DOI: 10.1371/journal.pone.0081229] or (if you share a .bib file) like this [@HallNetworkAnalysisReveals2014] or (if you really need to) like this [Hall, David P., Ian J. C. MacCormick, Alex T. Phythian-Adams, Nina M. Rzechorzek, David Hope-Jones, Sorrel Cosens, Stewart Jackson, et al. “Network Analysis Reveals Distinct Clinical Syndromes Underlying Acute Mountain Sickness.” PLoS ONE 9, no. 1 (January 22, 2014): e81229].
Citeswitcher will find these and format them properly as citations in the chosen output format.
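As a rough illustration of what finding these citations involves, here is a minimal sketch in Python. The regular expressions and the `find_citations` helper are hypothetical and are not taken from fixcitations.py; they only cover the PMID, DOI and bib key forms shown above, not whole citations.

```python
import re

# Hypothetical patterns for illustration only (not the ones used by citeswitcher).
PMID_RE = re.compile(r"PMID:?\s*(\d+)", re.IGNORECASE)
DOI_RE = re.compile(
    r"(?:DOI:\s*|https?://(?:dx\.)?doi\.org/)(10\.\d{4,9}/[^\s\]\)\}]+)",
    re.IGNORECASE,
)
BIBKEY_RE = re.compile(r"@([A-Za-z0-9_:.\-]+)")


def find_citations(text):
    """Return (kind, identifier) pairs for citations inside square brackets."""
    found = []
    # Only look inside [...] chunks, the form used in the examples above.
    for chunk in re.findall(r"\[([^\[\]]+)\]", text):
        for kind, pattern in (("pmid", PMID_RE), ("doi", DOI_RE), ("bib", BIBKEY_RE)):
            for match in pattern.finditer(chunk):
                found.append((kind, match.group(1)))
    return found


sample = (
    "See [PMID: 24465370], [DOI: 10.1371/journal.pone.0081229] "
    "and [@HallNetworkAnalysisReveals2014]."
)
print(find_citations(sample))
```

A real implementation also has to handle curly and round brackets and several citations inside one bracket pair, which this sketch only partly addresses.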
- works on all linux and mac systems tested so far
- not tested on windows
For basic function of fixcitations.py:
- python 3
- pyparsing 2.4.7
- requests-2.24.0
- biopython
- oyaml
- bibtexparser is also required but it is included in this git repository (because I was concerned that it may not be well maintained)
To install these:
pip install pyparsing
pip install requests
pip install biopython
pip install oyaml
Once you have the above installed, you're good to go. Additional dependencies for extra functions are listed at the end.
Download or clone the git repository into a folder on your hard disk. Then run the scripts using python from the command line.
To make it easier to use, I use the unix alias command in my ~/.bash_profile to automatically map common commands, like this:
alias fix="/miniconda3/envs/thisenv/bin/python <full-path-to-git-repository>/citeswitcher/fixcitations.py"
Edit the config.json file to include your own paths before running. Key elements:
- email: required for using the PubMed API; enter your own email address here.
- filepaths: these need to exist somewhere on your computer.
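Before the first run it can help to sanity-check the config. The sketch below assumes that config.json has a top-level "email" string and a "filepaths" object mapping labels to paths; that layout is a guess based on the two key elements named above, so adjust it to the real file. `check_config` and `load_config` are hypothetical helpers, not part of citeswitcher.

```python
import json
from pathlib import Path


def load_config(path):
    """Read a JSON config file into a dict."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def check_config(config):
    """Return a list of human-readable problems found in the config dict."""
    problems = []
    # An email address is required for the PubMed API.
    if "@" not in str(config.get("email", "")):
        problems.append("email is missing or does not look like an address")
    # Assumption: "filepaths" maps a label to a path string.
    for label, raw_path in config.get("filepaths", {}).items():
        if not Path(raw_path).expanduser().exists():
            problems.append(f"filepaths[{label!r}] does not exist: {raw_path}")
    return problems


print(check_config({"email": "me@example.org", "filepaths": {"here": "."}}))
```

Usage would be something like `check_config(load_config("config.json"))`, fixing each reported problem before running fixcitations.py.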
Switches citation formats, finds missing citations online. Optionally, calls pandoc to create a fully referenced output file.
The default or specified .bib file will then be searched for the relevant citations. Citations will be replaced with a citation in either .md or .tex format.
Any citations not present in the master .bib file will be downloaded from PubMed and added to a .bib file.
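To give a feel for the download step, here is a minimal standard-library sketch that builds a request URL for the NCBI E-utilities efetch endpoint and formats a record as a BibTeX entry. How fixcitations.py actually queries PubMed is not shown in this README, so treat `pubmed_fetch_url` and `to_bibtex` as hypothetical helpers rather than the script's own code.

```python
from urllib.parse import urlencode

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def pubmed_fetch_url(pmid, email):
    """Build an efetch URL that returns the PubMed record for one PMID as XML."""
    params = {"db": "pubmed", "id": str(pmid), "retmode": "xml", "email": email}
    return f"{EFETCH_URL}?{urlencode(params)}"


def to_bibtex(key, fields):
    """Format a dict of field names and values as a single @article entry."""
    lines = [f"@article{{{key},"]
    lines += [f"  {name} = {{{value}}}," for name, value in fields.items()]
    lines.append("}")
    return "\n".join(lines)


print(pubmed_fetch_url(24465370, "me@example.org"))
print(to_bibtex("HallNetworkAnalysisReveals2014", {"pmid": "24465370", "year": "2014"}))
```

The email address here is a placeholder; use the one from your config.json.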
Two new files will be written:
- a new .md or .tex file with correct citations
- a local .bib file containing all cited reference details.
Optionally, additional files can be written:
- pandoc outputs in .docx, .pdf or .html formats
Required:
- a .md or .tex document with references in square [] or curly {} or round brackets which will be searched for PMIDs, DOIs, bibtex refs, and whole citations
Optional:
- bib file. A bibtex file to use. If not specified, a new cs.bib file will be created in the same directory as your input file. You can also specify a global bib file for keeping a shared bank of references.
- yaml file. A yaml file called .yaml. If it already exists, it will be augmented to add csl and bib files if needed. YAML is taken according to the following hierarchy: in-file YAML, .yaml, other yaml files. All will be copied into the .yaml file.
- csl file. You can specify a csl file using yaml.
- replace.json file. A file called replace.json will be read as a dictionary of text strings to find:replace. If you just create this file and leave it in the same directory as the input file, it will be read. An alternative is to specify text to be replaced in a YAML dict format.
- files to include.
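The replace.json behaviour described in the list above can be pictured with the following sketch. It assumes the file is a flat JSON object of find:replace string pairs and that replacements are plain substring substitutions applied in file order; both helper names are hypothetical and the real script may behave differently.

```python
import json
from pathlib import Path


def load_replacements(input_file):
    """Read replace.json from the input file's directory, or return {} if absent."""
    candidate = Path(input_file).resolve().parent / "replace.json"
    if candidate.is_file():
        return json.loads(candidate.read_text(encoding="utf-8"))
    return {}


def apply_replacements(text, replacements):
    """Apply each find:replace pair in turn as a plain substring substitution."""
    for find, replace in replacements.items():
        text = text.replace(find, replace)
    return text


print(apply_replacements("AMS is common", {"AMS": "acute mountain sickness"}))
```

Because the pairs are applied one after another, an earlier replacement can create text that a later one matches, so the order of keys in the file matters.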
The default (in config.json) or specified .bib file will be searched for the relevant citations.
Citations will be replaced with a citation in either .md, .tex or PMID format.
Any citations not found in the .bib files will be downloaded from PubMed and added to the supplementary.bib file.
Two outputs are specified:
- -o determines the CITATION FORMAT and is used for switching between different citation formats e.g. PMID==>DOI or PMID==>BIB
- -p determines the PANDOC OUTPUT FORMAT and provides a quick workflow for generating fully formatted documents

If -p is specified, -o must be md. If both -p and -o are specified and -o is not md, this indicates that the user's expectation is different from what's going to happen, so the script exits. Fix this by only specifying one of these options.
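The rule linking -o and -p can be expressed as a small argparse check. This is a sketch, not the script's actual parser: it assumes -p is given once per format (for example `-p pdf -p docx`) and that only the literal value md is accepted alongside -p.

```python
import argparse


def parse_args(argv):
    """Parse -o / -p and enforce that -p is only combined with markdown output."""
    parser = argparse.ArgumentParser(description="sketch of the -o / -p rule")
    parser.add_argument(
        "-o", "--outputstyle", default=None,
        choices=["md", "markdown", "tex", "latex", "pubmed", "pmid", "inline"],
    )
    # Assumption: each -p adds one pandoc output format.
    parser.add_argument("-p", "--pandoc_outputs", action="append", default=[])
    args = parser.parse_args(argv)
    if args.pandoc_outputs:
        if args.outputstyle not in (None, "md"):
            # parser.error prints a usage message and exits with status 2.
            parser.error("-p needs markdown citations: drop either -p or -o")
        args.outputstyle = "md"
    return args


args = parse_args(["-p", "pdf", "-p", "docx"])
print(args.outputstyle, args.pandoc_outputs)
```

Calling `parse_args(["-p", "pdf", "-o", "tex"])` exits with an error instead of silently producing something the user did not ask for.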
(Quarto does this for you, and is better, so I'd suggest using that instead.) You'll want to have pandoc properly installed with crossref so that you can automatically generate printable documents.
- pandoc
- pandoc-crossref
This can be done with conda, or pip.
pip install pandoc-crossref
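Once both tools are installed, a pandoc build with cross-references and citations looks roughly like the command assembled below. This is a sketch of a typical invocation and not necessarily what fixcitations.py runs; `pandoc_command` and `run_pandoc` are hypothetical helpers, and `--citeproc` assumes pandoc 2.11 or later.

```python
import shutil
import subprocess


def pandoc_command(source, target, bibfile, csl=None, crossref=True):
    """Build a pandoc argument list for a fully referenced output file."""
    cmd = ["pandoc", source, "-o", target]
    if crossref:
        # pandoc-crossref has to run before citation processing.
        cmd += ["--filter", "pandoc-crossref"]
    cmd += ["--citeproc", "--bibliography", bibfile]
    if csl:
        cmd += ["--csl", csl]
    return cmd


def run_pandoc(cmd):
    """Run the command, failing early if pandoc is not on the PATH."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("pandoc not found on PATH")
    subprocess.run(cmd, check=True)


print(" ".join(pandoc_command("test.citemd.md", "test.citemd.docx", "cs.bib")))
```

Building the argument list separately from running it makes it easy to print and inspect the command when a build fails.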
arg | long arg | action |
---|---|---|
-b | --bibfile | bibfile |
-y | --yaml | specify a yaml file to use as a source for yaml; use "normal" or "fancy" to copy from sup-files/yaml |
-o | --outputstyle | choices=md,markdown,tex,latex,pubmed,pmid,inline; output references format |
-p | --pandoc_outputs | append as many pandoc formats as you want: pdf docx html txt md tex |
-l | --localbibonly | use only local bib file |
-d | --outputsubdir | outputdir - always a subdir of the working directory |
-i | --include | include files |
-r | --replace | follow find/replace instructions from yaml or json file |
-m | --messy | disable clean up of intermediate files |
-mf | --move_figures | move all figures to the end and create captions section for submission to journal |
-pm | --pandoc_mermaid | use pandoc-mermaid-filter |
-lt | --latex_template | a latex template to be passed to pandoc |
-wt | --word_template | a word template to be passed to pandoc |
-ptp | --pathtopandoc | specify a particular path to pandoc if desired |
-redact | --redact | redact between tags |
-s | --stripcomments | strip comments written in html format |
-v | --verbose | verbose |
-x | --xelatex | use xelatex in pandoc build |
-w | --wholereference | try to match whole references. |
-ch | --chaptermode | use pandoc --top-level-division=chapter |
-svg | --convert_svg | convert svg images to pdf - replaces any pdf files with the same name |
-ui | --uncomment_images | include images commented out using html syntax |
-flc | --force_lowercase_citations | force all citation references into lowercase |
Uses wget to download your published files from hackmd.io and runs fixcitations.py. This makes working collaboratively with hackmd very easy, but takes a bit of manual setup:
- Specify the filepath to a directory (hackmddir) in config.json. This should be an empty directory.
- Within your new hackmddir, create a folder with the exact name of your hackmd.io username or group name.
- Create a hackmd file and publish it. It's easier if you choose the url name for it, such as "mytestfile".
- Create yet another folder with the same name as your hackmd file. So your directory structure would be like this:
- my_hackmd_dir
|  +-- my_username
|     +-- mytestfile
- Run python hackmd.py
If this works, your file should download into the mytestfile directory, and you should see the following files:
- my_hackmd_dir
|  +-- my_username
|     +-- mytestfile
|        +-- mytestfile.md
|        +-- mytestfile.citemd.md
|        +-- mytestfile.citemd.pdf
|        +-- mytestfile.citemd.docx
|        +-- mytestfile.citemd.html
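The manual folder setup can also be scripted. The sketch below creates the username and file folders inside a hackmddir; it assumes the file folder sits inside the username folder, and `make_hackmd_tree` is a hypothetical helper that is not part of hackmd.py. The demo runs in a temporary directory so it does not touch your real hackmddir.

```python
import tempfile
from pathlib import Path


def make_hackmd_tree(hackmddir, username, docname):
    """Create hackmddir/username/docname and return the innermost folder."""
    target = Path(hackmddir) / username / docname
    target.mkdir(parents=True, exist_ok=True)
    return target


with tempfile.TemporaryDirectory() as scratch:
    folder = make_hackmd_tree(scratch, "my_username", "mytestfile")
    print(folder.relative_to(scratch))
```

For real use, pass the hackmddir path from your config.json, your hackmd.io username or group name, and the url name you chose when publishing the file.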
hackmd.py - download hackmd files either by combining elements from a directory tree or a single file, then run fixcitations.py
The following additional scripts are included to provide some useful functions with varying levels of documentation:
- include.py - include files recursively according to a variety of syntaxes
- put_pmid_in_bibtex.py - search for PMIDs for every element in a .bib file. Designed to run as a cron job to keep your bib file up to date.
- abbreviation_finder.py - print abbreviations in a document (so that you can explain them)
- wordcount.py - function unknown at this time
- svg2pdf.py - batch convert a directory (-d) or file (-f) using cairosvg, which seems to perform better than imagemagick.
- format_author_list.py - this is a work in progress but it is intended to create author lists in a variety of formats from a spreadsheet in xlsx or csv format
- pandas 1.1
- tabulate-0.8.7
pip install pandas
pip install tabulate
- inkscape
There are lots of bibtex database software tools available. The following work very well:
- zotero
- better bibtex (with automatic export feature)
- Feed in a .md or .tex document
- references in square [] or curly {} brackets will be searched for:
  - PMID
  - DOI
  - bibtex reference
  - whole citation
- mermaid.cli (https://github.com/mermaid-js/mermaid-cli). Mermaid is a bit of a hassle to install and needs to be in your PATH variable for this to work. But if it is there, you can use the -pm option to call mermaid and automatically generate nice figures.
This installation procedure worked for me on a mac:
https://github.com/davidar/pandiff
npm install mermaid.cli
Test it:
./node_modules/.bin/mmdc -h
Add mermaid to ~/.bash_profile:
export MERMAID_BIN=/Users//node_modules/.bin/mmdc
export PATH=$PATH:/Users//node_modules/.bin