nileshsah / harwest-tool
A one-shot tool to harvest submissions from different OJs onto one single VCS managed repository http://bit.ly/harwest
License: MIT License
raise ValueError(
ValueError: ("Please provide correct file extension for the language 'GNU C++20 (64)'
Support for AtCoder can be easily added using the Kenkooo API
Here's an example repo combining both Codeforces and AtCoder: Link
Traceback (most recent call last):
File "d:\python\lib\runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "d:\python\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "D:\Python\Scripts\harwest.exe\__main__.py", line 9, in <module>
File "d:\python\lib\site-packages\harwest\harwest.py", line 106, in main
args.func(args)
File "d:\python\lib\site-packages\harwest\harwest.py", line 77, in codeforces
CodeforcesWorkflow(configs).run(start_page_index=args.start_page)
File "d:\python\lib\site-packages\harwest\lib\codeforces\workflow.py", line 92, in run
self.repository.push()
File "d:\python\lib\site-packages\harwest\lib\utils\repository.py", line 52, in push
self.git.push(*args)
File "d:\python\lib\site-packages\git\cmd.py", line 542, in <lambda>
return lambda *args, **kwargs: self._call_process(name, *args, **kwargs)
File "d:\python\lib\site-packages\git\cmd.py", line 1005, in _call_process
return self.execute(call, **exec_kwargs)
File "d:\python\lib\site-packages\git\cmd.py", line 822, in execute
raise GitCommandError(command, status, stderr_value, stdout_value)
git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)
cmdline: git push origin master
stderr: 'Logon failed, use ctrl+c to cancel basic credential prompt.
bash: /dev/tty: No such device or address
error: failed to execute prompt script (exit code 1)
fatal: could not read Username for 'https://github.com': No such file or directory'
I filled in the correct details in both places!
File "/home/sainath/.local/bin/harwest", line 8, in <module>
sys.exit(main())
File "/home/sainath/.local/lib/python3.8/site-packages/harwest/harwest.py", line 115, in main
args.func(args)
File "/home/sainath/.local/lib/python3.8/site-packages/harwest/harwest.py", line 74, in atcoder
process_platform(args, "AtCoder", AtcoderWorkflow)
File "/home/sainath/.local/lib/python3.8/site-packages/harwest/harwest.py", line 90, in process_platform
workflow(configs).run(start_page_index=args.start_page, full_scan=full_scan)
File "/home/sainath/.local/lib/python3.8/site-packages/harwest/lib/abstractworkflow.py", line 96, in run
response.append(self.__add_submission(submission))
File "/home/sainath/.local/lib/python3.8/site-packages/harwest/lib/abstractworkflow.py", line 29, in __add_submission
solution_file_path = self.__get_solution_path(submission)
File "/home/sainath/.local/lib/python3.8/site-packages/harwest/lib/abstractworkflow.py", line 60, in __get_solution_path
lang_ext = config.get_language_extension(submission_lang)
File "/home/sainath/.local/lib/python3.8/site-packages/harwest/lib/utils/config.py", line 49, in get_language_extension
raise ValueError(
ValueError: ("Please provide correct file extension for the language 'Python3 (3.4.3)' in", '/home/sainath/.local/lib/python3.8/site-packages/harwest/lib/resources/language.json', 'file')
'harwest' is not recognized as an internal or external command,
operable program or batch file.
How can I correct this?
From what I understand, to update the repo we still need to run harwest codeforces or harwest atcoder periodically. Instead, we could also offer the option of automatically setting up a GitHub Action that runs harwest and updates the repo daily (using the cron directive in the .yml file).
A few ideas I have in mind:
[...] .github/workflows/harwest.yml to the initial repo.
[...] lib/resources. This will have to be changed to the repo itself (maybe a .config folder).
[...] harwest codeforces or the like locally, we will pull first to receive the latest updated repo.
P.S. Great work with the project - really slick!
Currently scanning page #2: (5/5) 1111gal password https://atcoder.jp/contests/abc242/tasks/abc242_c
Traceback (most recent call last):
File "c:\users\singh\appdata\local\programs\python\python39\lib\runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "c:\users\singh\appdata\local\programs\python\python39\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\singh\AppData\Local\Programs\Python\Python39\Scripts\harwest.exe\__main__.py", line 7, in <module>
File "c:\users\singh\appdata\local\programs\python\python39\lib\site-packages\harwest\harwest.py", line 115, in main
args.func(args)
File "c:\users\singh\appdata\local\programs\python\python39\lib\site-packages\harwest\harwest.py", line 74, in atcoder
process_platform(args, "AtCoder", AtcoderWorkflow)
File "c:\users\singh\appdata\local\programs\python\python39\lib\site-packages\harwest\harwest.py", line 90, in process_platform
workflow(configs).run(start_page_index=args.start_page, full_scan=full_scan)
File "c:\users\singh\appdata\local\programs\python\python39\lib\site-packages\harwest\lib\abstractworkflow.py", line 105, in run
self.repository.push()
File "c:\users\singh\appdata\local\programs\python\python39\lib\site-packages\harwest\lib\utils\repository.py", line 52, in push
self.git.push(*args)
File "c:\users\singh\appdata\local\programs\python\python39\lib\site-packages\git\cmd.py", line 542, in <lambda>
return lambda *args, **kwargs: self._call_process(name, *args, **kwargs)
File "c:\users\singh\appdata\local\programs\python\python39\lib\site-packages\git\cmd.py", line 1005, in _call_process
return self.execute(call, **exec_kwargs)
File "c:\users\singh\appdata\local\programs\python\python39\lib\site-packages\git\cmd.py", line 822, in execute
raise GitCommandError(command, status, stderr_value, stdout_value)
git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)
cmdline: git push origin master
stderr: '
Unhandled Exception: System.ComponentModel.Win32Exception: The directory name is invalid
at System.Diagnostics.Process.StartWithCreateProcess(ProcessStartInfo startInfo)
at System.Diagnostics.Process.Start()
at GitCredentialManager.GitProcess.get_Version()
at GitCredentialManager.GitProcessConfiguration.GetCanonicalizeTypeArg(GitConfigurationType type)
at GitCredentialManager.GitProcessConfiguration.TryGet(GitConfigurationLevel level, GitConfigurationType type, String name, String& value)
at GitCredentialManager.Settings.d__5.MoveNext()
at System.Linq.Enumerable.FirstOrDefault[TSource](IEnumerable`1 source)
at GitCredentialManager.Settings.TryGetSetting(String envarName, String section, String property, String& value)
at GitCredentialManager.Authentication.MicrosoftAuthentication.CanUseBroker(ICommandContext context)
at GitCredentialManager.Program.Main(String[] args)
bash: /dev/tty: No such device or address
error: failed to execute prompt script (exit code 1)
fatal: could not read Username for 'https://github.com': No such file or directory'
This is the error I am getting while pushing the solutions.
Traceback (most recent call last):
File "/home/imskanand/.local/bin/harwest", line 8, in <module>
sys.exit(main())
File "/home/imskanand/.local/lib/python3.8/site-packages/harwest/harwest.py", line 115, in main
args.func(args)
File "/home/imskanand/.local/lib/python3.8/site-packages/harwest/harwest.py", line 70, in codeforces
process_platform(args, "Codeforces", CodeforcesWorkflow)
File "/home/imskanand/.local/lib/python3.8/site-packages/harwest/harwest.py", line 90, in process_platform
workflow(configs).run(start_page_index=args.start_page, full_scan=full_scan)
File "/home/imskanand/.local/lib/python3.8/site-packages/harwest/lib/abstractworkflow.py", line 105, in run
self.repository.push()
File "/home/imskanand/.local/lib/python3.8/site-packages/harwest/lib/utils/repository.py", line 52, in push
self.git.push(*args)
File "/home/imskanand/.local/lib/python3.8/site-packages/git/cmd.py", line 542, in <lambda>
return lambda *args, **kwargs: self._call_process(name, *args, **kwargs)
File "/home/imskanand/.local/lib/python3.8/site-packages/git/cmd.py", line 1005, in _call_process
return self.execute(call, **exec_kwargs)
File "/home/imskanand/.local/lib/python3.8/site-packages/git/cmd.py", line 822, in execute
raise GitCommandError(command, status, stderr_value, stdout_value)
git.exc.GitCommandError: Cmd('git') failed due to: exit code(128)
cmdline: git push origin master
stderr: 'fatal: protocol 'git remote add origin https' is not supported'
Source : https://codeforces.com/blog/entry/85788?#comment-747260
Error Log says
ValueError: ("Please provide correct file extension for the language 'Java8 (OpenJDK 1.8.0)' in", 'C:\\Users\\bleh0\\AppData\\Local\\Packages\\PythonSoftwareFoundation.Python.3.9_qbz5n2kfra8p0\\LocalCache\\local-packages\\Python39\\site-packages\\harwest\\lib\\resources\\language.json', 'file')
Possible fix: add the entry "Java8 (OpenJDK 1.8.0)": "java" in language.json
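The fix above can be scripted rather than edited by hand. The following is a minimal sketch, assuming language.json is a single flat JSON object that maps a language name to a file extension (which is what the suggested entry implies). The helper name add_language_extension is hypothetical and is not part of harwest.

```python
import json
from pathlib import Path


def add_language_extension(language_json: Path, language: str, extension: str) -> dict:
    """Add a language -> extension entry to a JSON mapping file and return the mapping."""
    # Assumption: the file holds one flat JSON object of {"language name": "ext"}.
    mapping = json.loads(language_json.read_text(encoding="utf-8"))
    # setdefault leaves an existing entry untouched, so a manual choice is never overwritten.
    mapping.setdefault(language, extension)
    language_json.write_text(
        json.dumps(mapping, indent=2, sort_keys=True) + "\n", encoding="utf-8"
    )
    return mapping
```

Point the first argument at the language.json path printed in the ValueError, for example add_language_extension(Path(path_from_error), "Java8 (OpenJDK 1.8.0)", "java"). Note that reinstalling or upgrading the package may replace the file again.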
Hello,
I found out that when I have multiple submissions on Codeforces for the same problem, the newer submissions get crawled first, then the older ones. As a result, the commit that ends up on top is the oldest submission instead of the newest one.
Is there a way so that my newest submission will appear as the latest commit?
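One possible approach is to sort the crawled submissions oldest first before committing, so that the newest submission for a problem becomes the latest commit. This is only a sketch: the field names timestamp and submission_id are hypothetical and may not match what harwest stores internally.

```python
from typing import Dict, List


def order_for_commit(submissions: List[Dict]) -> List[Dict]:
    """Return submissions oldest first, so the newest one is committed last."""
    # "timestamp" and "submission_id" are hypothetical field names for this sketch.
    # The id is a tie-breaker for submissions that share a timestamp.
    return sorted(submissions, key=lambda s: (s["timestamp"], s["submission_id"]))


crawled = [
    {"submission_id": 3, "timestamp": 300},  # newest, crawled first
    {"submission_id": 1, "timestamp": 100},  # oldest
    {"submission_id": 2, "timestamp": 200},
]
ordered = order_for_commit(crawled)
```

Committing in the order of ordered would leave submission 3 as the most recent commit.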
I'd like the README to explain how to set things up again after deleting the entire folder.
These submissions require login. Using requests.session, login should be possible. I've hacked around and this login method works:
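import random  # needed for the random ftaa token below

def __login(self):
    # Credentials are hard-coded only because this is a quick hack.
    username = 'I_love_Hoang_Yen'
    password = '<redacted>'
    # bfaa and ftaa are extra fields posted with the login form;
    # ftaa is a random 18-character lowercase alphanumeric string.
    bfaa = 'f1b3f18c715565b589b7823cda7448ce'
    ftaa = ''.join(random.choices('abcdefghijklmnopqrstuvwxyz0123456789', k=18))
    LOGIN_URL = 'https://codeforces.com/enter'
    # Fetch the login page first to read the CSRF token out of its HTML.
    r = self.session.get(LOGIN_URL)
    csrf = r.text.split("csrf_token' value='")[1].split("'")[0]
    data = {
        "csrf_token": csrf,
        "action": "enter",
        "ftaa": ftaa,
        "bfaa": bfaa,
        "handleOrEmail": username,
        "password": password,
        "_tta": "176",
        "remember": "on",
    }
    # Post the form on the same session so the login cookies are kept.
    r = self.session.post(LOGIN_URL, data=data, headers={'X-Csrf-Token': csrf})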
After that, it's also necessary to modify the submission URL (for contest IDs > 100k, it should be /gym/{contest_id}/submission/{submission_id}).
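The URL rule described here can be sketched as a small helper. The function name submission_url and the constant are hypothetical. The 100k threshold is taken from the remark above, and the /contest/ form is the usual Codeforces URL for regular contests.

```python
GYM_CONTEST_ID_THRESHOLD = 100_000  # from the "contest ID > 100k" remark; an assumption


def submission_url(contest_id: int, submission_id: int) -> str:
    """Build the submission page URL, switching to /gym/ for large contest IDs."""
    section = "gym" if contest_id > GYM_CONTEST_ID_THRESHOLD else "contest"
    return f"https://codeforces.com/{section}/{contest_id}/submission/{submission_id}"
```

The gym URL would still need the logged-in session to return the source code.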
[...] harwest codeforces, is there a method by which I can pull the (latest) unaccepted submission too, please?
Submissions aren't downloaded. Instead, it only creates empty contest folders. submissions.js is left empty.
No error message whatsoever.
Tried:
Works fine for AtCoder.
Windows 10 19044, Python 3.9
Recent GitHub projects use main as the default branch instead of master. So when I use the feature that automatically pushes code to GitHub using harwest, it fails because my project doesn't have a master branch.
This is the error you get:
cmdline: git push origin master
stderr: 'error: src refspec master does not match any'
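One way to avoid the hard-coded branch name is to look at the branches that actually exist locally and pick one to push. The sketch below is an assumption about how this could be done, not how harwest does it. Both function names are hypothetical, and the listing relies on the git CLI being on PATH.

```python
import subprocess
from typing import List, Sequence


def local_branches(repo_dir: str) -> List[str]:
    """List local branch names by asking the git CLI."""
    out = subprocess.run(
        ["git", "branch", "--format=%(refname:short)"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    ).stdout
    return [line.strip() for line in out.splitlines() if line.strip()]


def choose_push_branch(
    branches: Sequence[str], preferred: Sequence[str] = ("main", "master")
) -> str:
    """Pick the branch to push: a preferred name if present, else the first one."""
    for name in preferred:
        if name in branches:
            return name
    if branches:
        return branches[0]
    raise ValueError("repository has no local branches to push")
```

The result of choose_push_branch(local_branches(repo_dir)) could then replace the literal master in git push origin master.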
How to reproduce:
harwest codeforces -p 5
What happens: the crawler stops without crawling anything, even though I have 150+ pages of submissions.
I think the reason is that page 5 has only my non-AC or gym submissions. So self.client.get_user_submissions returns an empty array, thus stopping the crawler.
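A possible fix is to stop on the page count rather than on the first empty result. The sketch below assumes the last page number is known up front. fetch_page stands in for a call such as get_user_submissions, and both it and crawl_pages are hypothetical names.

```python
from typing import Callable, Dict, List


def crawl_pages(
    fetch_page: Callable[[int], List[Dict]], start_page: int, last_page: int
) -> List[Dict]:
    """Collect submissions from start_page to last_page inclusive."""
    collected: List[Dict] = []
    for page in range(start_page, last_page + 1):
        # An empty list only means "nothing usable on this page",
        # so the loop moves on instead of stopping.
        collected.extend(fetch_page(page))
    return collected


# Page 5 has no accepted submissions, but page 6 does.
fake_pages = {5: [], 6: [{"submission_id": 10}], 7: []}
found = crawl_pages(lambda page: fake_pages.get(page, []), start_page=5, last_page=7)
```

With the early exit removed, the submission on page 6 is still picked up even though page 5 came back empty.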