djsutherland / arxiv-collector
A little Python script to collect LaTeX sources for upload to the arXiv.
License: BSD 3-Clause "New" or "Revised" License
I swear this worked a month or two ago (running on macOS Catalina), but now arxiv-collector main.tex produces the following error:
Traceback (most recent call last):
  File "/Users/nrbeaton/anaconda3/bin/arxiv-collector", line 10, in <module>
    sys.exit(main())
  File "/Users/nrbeaton/anaconda3/lib/python3.6/site-packages/arxiv_collector.py", line 353, in main
    version = get_latexmk_version(args.latexmk)
  File "/Users/nrbeaton/anaconda3/lib/python3.6/site-packages/arxiv_collector.py", line 77, in get_latexmk_version
    raise ValueError("Bad output of {} --version:\n{}".format(latexmk, out))
ValueError: Bad output of latexmk --version:
b'\nLatexmk, John Collins, 26 Dec. 2019. Version 4.67\n'
Was wrangling with compilation issues and .bbl version mismatches till I found this. Got to say, this software neatly does exactly what it promises.
Excellent work! Thanks again.
PS: Feel free to close this issue.
I am running python ./arxiv_collector.py from master, which works well.
My latexmk version is 4.69a, and it also works well on its own.
When I add the lines
$dependents_list = 1;
$deps_file = ".deps";
END {
system("python arxiv_collector.py --latexmk-deps $deps_file");
}
to my empty .latexmkrc, I get the following error message:
Latexmk: All targets (draft.pdf) are up-to-date
Gathering outputs...
Traceback (most recent call last):
  File "arxiv_collector.py", line 500, in <module>
    main()
  File "arxiv_collector.py", line 482, in main
    collect(
  File "arxiv_collector.py", line 223, in collect
    expect(
  File "arxiv_collector.py", line 54, in expect
    raise ValueError(msg)
ValueError: deps file .deps seems broken: expected the line
draft.pdf :\
to be one of:
draft.tex.pdf :\
draft.tex.pdf .deps :\
Any idea what goes wrong here?
I started to investigate this problem after noticing that Overleaf, set up according to the README, generates only an empty zip file.
arxiv_collector.py 0.4.1
Rough situation: I have a paper with many plots generated with R, each containing quite a few datapoints.
This doesn't compile with pdflatex due to the limited... dunno, stack? memory? Something like that.
It does compile with lualatex, which I've specified in my .latexmkrc:
# LuaLatex
$pdf_mode = 4;
Now, from my understanding, arxiv-collector by default tries to create the dependencies file through latexmk -pdf ..., which apparently overrides my settings and calls pdflatex, which fails.
This can be circumvented like so:
# Save dependencies for arxiv_collector.
$dependents_list = 1;
$deps_file = ".deps";
and
python3 arxiv_collector.py --latexmk-deps .deps ...
But it was a bit surprising.
The project file is text.tex. Its content is
\documentclass{article}
\usepackage{biblatex}
\addbibresource{bibliography.bib}
\begin{document}
\cite{A}
\printbibliography
\end{document}
The bibliography file is bibliography.bib. Its content is
@book{A,
author = {Me},
title = {Kamasutra}}
What's wrong? Thanks.
Dependencies in .deps
Gathering outputs...
Deps file .deps: source text, base name text, output text.pdf .deps, jobname text.pdf
Processing /etc/texmf/web2c/texmf.cnf ...
Processing /usr/share/texmf/fonts/map/fontname/texfonts.map ...
Processing /usr/share/texmf/fonts/tfm/public/cm/cmbx12.tfm ...
Processing /usr/share/texmf/fonts/tfm/public/cm/cmr12.tfm ...
Processing /usr/share/texmf/fonts/tfm/public/cm/cmti10.tfm ...
Processing /usr/share/texmf/fonts/type1/public/amsfonts/cm/cmbx12.pfb ...
Processing /usr/share/texmf/fonts/type1/public/amsfonts/cm/cmr10.pfb ...
Processing /usr/share/texmf/fonts/type1/public/amsfonts/cm/cmti10.pfb ...
Processing /usr/share/texmf/tex/generic/oberdiek/etexcmds.sty ...
Processing /usr/share/texmf/tex/generic/oberdiek/ifluatex.sty ...
Processing /usr/share/texmf/tex/generic/oberdiek/ifpdf.sty ...
Processing /usr/share/texmf/tex/generic/oberdiek/infwarerr.sty ...
Processing /usr/share/texmf/tex/generic/oberdiek/kvsetkeys.sty ...
Processing /usr/share/texmf/tex/generic/oberdiek/ltxcmds.sty ...
Processing /usr/share/texmf/tex/generic/oberdiek/pdftexcmds.sty ...
Processing /usr/share/texmf/tex/generic/xstring/xstring.sty ...
Processing /usr/share/texmf/tex/generic/xstring/xstring.tex ...
Processing /usr/share/texmf/tex/latex/base/article.cls ...
Processing /usr/share/texmf/tex/latex/base/ifthen.sty ...
Processing /usr/share/texmf/tex/latex/base/size10.clo ...
Processing /usr/share/texmf/tex/latex/biblatex/bbx/numeric.bbx ...
Adding /usr/share/texmf/tex/latex/biblatex/bbx/numeric.bbx as numeric.bbx
Processing /usr/share/texmf/tex/latex/biblatex/bbx/standard.bbx ...
Adding /usr/share/texmf/tex/latex/biblatex/bbx/standard.bbx as standard.bbx
Processing /usr/share/texmf/tex/latex/biblatex/biblatex.cfg ...
Adding /usr/share/texmf/tex/latex/biblatex/biblatex.cfg as biblatex.cfg
Processing /usr/share/texmf/tex/latex/biblatex/biblatex.def ...
Adding /usr/share/texmf/tex/latex/biblatex/biblatex.def as biblatex.def
Processing /usr/share/texmf/tex/latex/biblatex/biblatex.sty ...
Adding /usr/share/texmf/tex/latex/biblatex/biblatex.sty as biblatex.sty
Processing /usr/share/texmf/tex/latex/biblatex/blx-compat.def ...
Adding /usr/share/texmf/tex/latex/biblatex/blx-compat.def as blx-compat.def
Processing /usr/share/texmf/tex/latex/biblatex/blx-dm.def ...
Adding /usr/share/texmf/tex/latex/biblatex/blx-dm.def as blx-dm.def
Processing /usr/share/texmf/tex/latex/biblatex/cbx/numeric.cbx ...
Adding /usr/share/texmf/tex/latex/biblatex/cbx/numeric.cbx as numeric.cbx
Processing /usr/share/texmf/tex/latex/biblatex/lbx/english.lbx ...
Adding /usr/share/texmf/tex/latex/biblatex/lbx/english.lbx as english.lbx
Processing /usr/share/texmf/tex/latex/etoolbox/etoolbox.sty ...
Processing /usr/share/texmf/tex/latex/graphics/keyval.sty ...
Processing /usr/share/texmf/tex/latex/logreq/logreq.def ...
Processing /usr/share/texmf/tex/latex/logreq/logreq.sty ...
Processing /usr/share/texmf/tex/latex/oberdiek/kvoptions.sty ...
Processing /usr/share/texmf/tex/latex/url/url.sty ...
Processing /usr/share/texmf/web2c/texmf.cnf ...
Processing /var/lib/texmf/fonts/map/pdftex/updmap/pdftex.map ...
Processing /var/lib/texmf/web2c/pdftex/pdflatex.fmt ...
Processing bibliography.bib ...
Processing text.tex ...
Adding text.tex with comments stripped
Used a .bib file, but didn't find 'text.pdf .bbl'; this likely won't work.
Output in arxiv.tar.gz: 10 files, 102KiB compressed
Thanks for all the good work (it is really useful!), but unfortunately your script is broken again (most likely latexmk's fault):
Traceback (most recent call last):
  File "./arxiv_collector.py", line 378, in <module>
    main()
  File "./arxiv_collector.py", line 353, in main
    version = get_latexmk_version(args.latexmk)
  File "./arxiv_collector.py", line 77, in get_latexmk_version
    raise ValueError("Bad output of {} --version:\n{}".format(latexmk, out))
ValueError: Bad output of latexmk --version:
Latexmk, John Collins, 17 Apr. 2020. Version 4.69a
The problem is the "." after "Apr" in the date.
I quick-fixed it by changing the version regex to
version_re = re.compile(r"Latexmk, John Collins, \d+ \w+\. \d+\. Version (.*)$")
but maybe one should take a deeper look into latexmk to see how it formats this line (which I didn't manage to do in the hurry just now).
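A minimal sketch of a more tolerant parser, assuming the only variation in the version line is whether the month is abbreviated with a trailing dot. The names VERSION_RE and parse_latexmk_version are hypothetical and are not taken from arxiv_collector.py; the two dotted inputs are the ones quoted in the reports above.

```python
import re

# Hypothetical pattern: the dot after the month name is optional, so both
# "17 Apr. 2020." and an undotted month such as "5 May 2019." are accepted.
VERSION_RE = re.compile(
    r"Latexmk, John Collins, \d+ \w+\.? \d+\. Version (\S+)\s*$"
)


def parse_latexmk_version(out):
    """Return the version string from the output of `latexmk --version`."""
    # subprocess may hand back bytes (as in the b'...' report above).
    if isinstance(out, bytes):
        out = out.decode("utf-8", errors="replace")
    match = VERSION_RE.search(out.strip())
    if match is None:
        raise ValueError("Bad output of latexmk --version:\n{}".format(out))
    return match.group(1)


v_bytes = parse_latexmk_version(
    b"\nLatexmk, John Collins, 26 Dec. 2019. Version 4.67\n"
)
v_text = parse_latexmk_version(
    "Latexmk, John Collins, 17 Apr. 2020. Version 4.69a"
)
```

Making the dot optional, rather than required as in the quick fix, keeps undotted month names working as well, if latexmk ever prints one.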
Trying to use the script on a document in Ubuntu. Compiling using latexmk -pdf [main.tex] works. When I run arxiv-collector main.tex I get the following error:
(xenial)brett@localhost:~/Downloads/SC_Conference$ arxiv-collector main.tex
Building main...
Traceback (most recent call last):
  File "/home/brett/.local/bin/arxiv-collector", line 11, in <module>
    sys.exit(main())
  File "/home/brett/.local/lib/python3.5/site-packages/arxiv_collector.py", line 174, in main
    strip_comments=args.strip_comments, verbosity=args.verbosity)
  File "/home/brett/.local/lib/python3.5/site-packages/arxiv_collector.py", line 109, in collect
    add(dep)
  File "/home/brett/.local/lib/python3.5/site-packages/arxiv_collector.py", line 68, in add
    raise OSError("{} doesn't exist!".format(path))
OSError: ascii input as UTF-8 doesn't exist!
I'm not exactly sure what the error means. Are you expecting all files to be encoded as UTF-8?
Hello,
I am trying to use your script to get around the dreaded "Package biblatex Warning: File 'article.bbl' is wrong format version - expected 2.8." error on the arXiv. I am using MiKTeX on Windows to write my paper.
Using your script works fine, and I managed to create an arxiv.tar.gz file by calling arxiv-collector article. Unfortunately, when I upload the resulting file to the arXiv, the build still fails. It seems like the builder on arXiv is still using its own biblatex version:
(/texlive/2016/texmf-dist/tex/latex/biblatex/biblatex.sty
Could this be related to my Windows setup? The arxiv.tar.gz contains a somewhat strange folder structure, including paths that start with C:/.
arXiv has a long-standing history of not allowing the package {microtype} (possibly because of some interplay with {hyperref}?) and failing to compile with a very obscure message if the package is included. Would it be possible to either remove the package if it's included, or generate a warning to the user?
The program runs great on my local computer. However, when I tried to run it on the CI service Travis, only the main tex file and the bbl file were included. The included image files and other included tex files were missing. Do you by chance have any experience running arxiv-collector on Travis?
0.4.1
Latexmk: Run number 1 of rule 'pdflatex'
This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020/W32TeX) (preloaded format=pdflatex)
restricted \write18 enabled.
entering extended mode
Latexmk: Run number 2 of rule 'pdflatex'
This is pdfTeX, Version 3.14159265-2.6-1.40.21 (TeX Live 2020/W32TeX) (preloaded format=pdflatex)
restricted \write18 enabled.
entering extended mode
Dependencies in .deps-d
Gathering outputs...
Deps file .deps-d: source main, base name main, output main.pdf, jobname main
Processing c:/texlive/2020/texmf-dist/fonts/type1/public/amsfonts/cm/cmr10.pfb ...
Processing c:/texlive/2020/texmf-dist/tex/latex/base/article.cls ...
Processing c:/texlive/2020/texmf-dist/tex/latex/base/inputenc.sty ...
Processing c:/texlive/2020/texmf-dist/tex/latex/base/size10.clo ...
Processing c:/texlive/2020/texmf-dist/tex/latex/l3backend/l3backend-pdfmode.def ...
Processing c:/texlive/2020/texmf-dist/web2c/texmf.cnf ...
Processing c:/texlive/2020/texmf-var/fonts/map/pdftex/updmap/pdftex.map ...
Processing c:/texlive/2020/texmf-var/web2c/pdftex/pdflatex.fmt ...
Processing c:/texlive/2020/texmf.cnf ...
Processing main.tex ...
Traceback (most recent call last):
  File "c:\users\karlson\anaconda3\envs\arxiv\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\karlson\anaconda3\envs\arxiv\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\Karlson\Anaconda3\envs\arxiv\Scripts\arxiv-collector.exe\__main__.py", line 7, in <module>
  File "c:\users\karlson\anaconda3\envs\arxiv\lib\site-packages\arxiv_collector.py", line 491, in main
    collect(
  File "c:\users\karlson\anaconda3\envs\arxiv\lib\site-packages\arxiv_collector.py", line 261, in collect
    for line in f:
  File "c:\users\karlson\anaconda3\envs\arxiv\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 93: character maps to <undefined>
This is a minimal failing example (the file is UTF-8 encoded without BOM):
\documentclass{article}
\usepackage[utf8]{inputenc}
\begin{document}
“This is a test”
\end{document}
main.zip
I am compiling this on a Windows machine and it fails here:
arxiv-collector/arxiv_collector.py
Lines 259 to 262 in 2811262
Judging by the error message, I assume that Python tries to open the file as a Windows-1252 file, since no encoding is provided.
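A minimal sketch of the likely fix, assuming the sources are UTF-8: pass the encoding explicitly instead of relying on the platform default. The helper read_tex_lines is hypothetical and is not the code at the lines referenced above. The sample line is the one from the failing example; the closing curly quote is encoded as E2 80 9D in UTF-8, and byte 0x9D has no mapping in Python's cp1252 codec, which matches the reported error.

```python
import os
import tempfile


def read_tex_lines(path, encoding="utf-8"):
    """Read a LaTeX source with an explicit encoding (hypothetical helper)."""
    # Without encoding=..., open() uses the locale default, which is
    # typically cp1252 on a Western-European Windows install.
    with open(path, encoding=encoding) as f:
        return f.readlines()


sample = "“This is a test”\n"
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "main.tex")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)

    # Reading as UTF-8 round-trips the curly quotes.
    lines = read_tex_lines(path)

    # Reading the same bytes as cp1252 reproduces the reported failure.
    try:
        read_tex_lines(path, encoding="cp1252")
        cp1252_failed = False
    except UnicodeDecodeError:
        cp1252_failed = True
```

Whether UTF-8 is the right default for every project is an assumption; a source written in another encoding would need that encoding passed in instead.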
For example, if you have \usepackage[disable]{todonotes} and you're stripping comments, you probably also want to strip the contents of \todo{}.
Similarly, you probably want to remove the contents of any \iffalse blocks.
Of course this might require a proper tex parser to do correctly....
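A rough sketch of the naive approach for \iffalse blocks, to show both what a regex can do and where it stops. It assumes blocks are not nested, contain no inner \if...\fi pair, and do not appear inside comments or verbatim text; handling those cases is exactly where a real parser would be needed. The names IFFALSE_RE and strip_iffalse are hypothetical.

```python
import re

# Lazy match from \iffalse to the first following \fi. The lookaheads stop
# the pattern from treating longer control words (e.g. \final) as \fi.
IFFALSE_RE = re.compile(
    r"\\iffalse(?![A-Za-z]).*?\\fi(?![A-Za-z])", re.DOTALL
)


def strip_iffalse(tex):
    """Remove non-nested \\iffalse ... \\fi blocks from a LaTeX string."""
    return IFFALSE_RE.sub("", tex)


source = "kept\n\\iffalse\nhidden \\final draft\n\\fi\nalso kept\n"
stripped = strip_iffalse(source)
```

Stripping \todo{...} is harder with a regex alone, because the argument may contain nested braces.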
Would be nice to have some simple tests that things work correctly, and run them on travis / circle / azure....
Would be nice to add on; just add error handling so it doesn't crash if there isn't one / etc.
Dear Dougal,
I get the following ValueError: Unexpected EOF. I'm using Windows 10 and I have
file.tex
mystylefile.sty
file.bbl
as files. Or does this script not work with Windows?
Best,
Jan
0.3.5 (installed via conda-forge)
arxiv-collector places pdf files instead of eps files in the .tar.gz (and rightly so). However, the includegraphics command does not get updated: it still refers to the eps file. A simple regex / replace should fix the issue.
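A sketch of the kind of regex replace suggested here, assuming each converted file keeps its base name and only the extension changes. If the conversion appends something else to the name, the new_suffix parameter would need to match that instead. The names INCLUDEGRAPHICS_EPS_RE and retarget_eps are hypothetical.

```python
import re

# Matches \includegraphics with an optional [...] argument and a {...}
# argument ending in ".eps"; group 1 is everything before the extension,
# group 2 is the closing brace.
INCLUDEGRAPHICS_EPS_RE = re.compile(
    r"(\\includegraphics\s*(?:\[[^\]]*\])?\s*\{[^{}]*?)\.eps(\s*\})"
)


def retarget_eps(tex, new_suffix=".pdf"):
    """Rewrite .eps references in \\includegraphics to use new_suffix."""
    # A function replacement avoids backslash-escaping issues in the
    # replacement string.
    return INCLUDEGRAPHICS_EPS_RE.sub(
        lambda m: m.group(1) + new_suffix + m.group(2), tex
    )


before = r"\includegraphics[width=\linewidth]{figs/plot.eps}"
after = retarget_eps(before)
```

References that omit the extension are left unchanged, since LaTeX resolves the extension itself in that case.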