
Open Directory Downloader

Indexes open directory listings in 130+ supported formats, including FTP(S), Google Drive, Bhadoo, GoIndex, Go2Index (and alternatives), Dropbox, Mediafire, GoFile, and GitHub.

Written in C# with .NET (Core), which means it is cross platform!

Downloading is not (yet) implemented, but you can already download by feeding the resulting URLs file into another tool (for most of the formats).

Downloading with wget: wget -x -i theurlsfile.txt

Downloading with aria2c (does not preserve directory structure): aria2c -i theurlsfile.txt
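For reference, wget's -x ("force directories") flag recreates the host and path locally. A minimal sketch of that mapping, using a hypothetical URL:

```shell
# Sketch: where `wget -x` ("force directories") places a downloaded file.
# The URL is a hypothetical entry from theurlsfile.txt.
url="https://myopendirectory.com/books/art/file1.pdf"
# wget -x recreates host + path locally; stripping the scheme shows
# the relative path it will write to:
local_path="${url#*://}"
echo "$local_path"
```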

If you have improvements, send me a pull request! If you come across a format that is not yet supported, please let me know.

Releases / Binaries

For 64-bit builds for Windows, Linux and Mac, or ARM/ARM64 builds for Raspberry Pi:

https://github.com/KoalaBear84/OpenDirectoryDownloader/releases

When using the self-contained releases you don't need to install the .NET (Core) Runtime.

Prerequisites

When you are NOT using the self-contained releases, you need to install the latest .NET 8 Runtime:

https://dotnet.microsoft.com/download/dotnet/8.0/runtime

Usage

Command line parameters:

| Short | Long | Description |
| --- | --- | --- |
| -u | --url | URL to scan |
| -t | --threads | Number of threads (default: 5) |
| -o | --timeout | Number of seconds for timeout |
| -w | --wait | Number of seconds to wait between calls (when even single-threaded is too fast) |
| -q | --quit | Quit after scanning (no "Press a key" prompt) |
| -c | --clipboard | Automatically copy the Reddit stats once the scan is done |
| -j | --json | Save JSON file |
| -f | --no-urls | Do not save URLs file |
| -r | --no-reddit | Do not show Reddit stats markdown |
| -l | --upload-urls | Upload URLs file |
| -e | --exact-file-sizes | Exact file sizes (WARNING: uses HEAD requests, which takes more time and is heavier on the server) |
| | --fast-scan | Only use sizes from the HTML, no HEAD requests, even if the approximate size cannot be extracted from the HTML |
| -s | --speedtest | Does a speed test after indexing |
| -a | --user-agent | Use a custom default User Agent |
| | --username | Username |
| | --password | Password |
| | --github-token | GitHub token |
| -H | --header | Supply a custom header to use for each HTTP request. Can be used multiple times for multiple headers. See below for more info. |
| | --output-file | Output file to use for the URLs file |
| | --proxy-address | Proxy address, like "socks5://127.0.0.1:9050" (needed for .onion) |
| | --proxy-username | Proxy username |
| | --proxy-password | Proxy password |
| | --no-browser | Disallow starting a Chromium browser (for Cloudflare) |

Example

Windows

OpenDirectoryDownloader.exe --url "https://myopendirectory.com"

Linux

./OpenDirectoryDownloader --url "https://myopendirectory.com"

If you want to learn more or contribute, see the following paragraphs!

Custom Headers

Headers need to be provided in the following format:

<Header Name>: <Header Value>

This syntax is compatible with e.g. cURL, so that you can copy the headers from a cURL command and re-use them with OpenDirectoryDownloader.

This means you can easily "fake" a browser request:

  1. On the page/site you want to index, open your browser's dev tools (F12 or CTRL + SHIFT + i)
  2. Go to the Network tab
  3. Reload the page
  4. Right-click on the first request/item in the network tab and select Copy > Copy as cURL (bash) (might be called differently, depending on your browser)
  5. The copied command ends with lots of headers (-H '<something>' -H '<something else>'). Copy only this part of the command and append it to your OpenDirectoryDownloader command, like so: OpenDirectoryDownloader --url "https://myopendirectory.com" -H 'header-name-1: header-value-1' -H 'header-name-2: header-value-2' ...
    You can of course also use other options with this or omit the --url option to use the prompt instead.

Setting some options like --username or --user-agent might override some headers, as explicit options take precedence. Option order does not matter (this applies to OpenDirectoryDownloader in general).
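The steps above can be sketched as follows; the header values and URL are hypothetical:

```shell
# Sketch: assembling headers copied from "Copy as cURL" into an ODD command.
# The header values and URL here are hypothetical.
hdr1='accept-language: en-US,en;q=0.9'
hdr2='referer: https://myopendirectory.com/'
cmd="./OpenDirectoryDownloader --url https://myopendirectory.com -H '$hdr1' -H '$hdr2'"
echo "$cmd"   # inspect the final command before running it
```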

Copying on Linux

When you want to copy (C key or -c flag) the stats at the end on Linux, you need to have xclip installed.

Linux distros

On some distros you need extra dependencies. For Alpine: https://docs.microsoft.com/en-us/dotnet/core/install/linux-alpine

For others see: https://docs.microsoft.com/en-us/dotnet/core/install/linux

TLS errors (Windows 10)

If you receive errors like this, apply the registry file "Enable TLS 1.3.reg" from this site.

System.Net.Http.HttpRequestException: The SSL connection could not be established, see inner exception.
 ---> System.Security.Authentication.AuthenticationException: Authentication failed because the remote party sent a TLS alert: 'ProtocolVersion'.
 ---> System.ComponentModel.Win32Exception (0x80090326): The message received was unexpected or badly formatted.

Cloudflare

EXPERIMENTAL!! READ THIS FIRST!

IT WILL NOT ALWAYS WORK!

There is experimental support for Cloudflare. When ODD detects a Cloudflare challenge, it downloads and starts a Chromium browser in which the Cloudflare protection can be solved. Sometimes this is a captcha that the user (you) needs to solve. You have 60 seconds per browser session to complete it; after that the browser is killed and you can retry on the next request.

Cloudflare somehow detects that it is not a normal Chromium/Chrome browser, so sadly it will not always work. A good tip is to move your mouse as soon as possible inside the browser window.

Sometimes it fails, pops up a browser for every request, and kills it almost immediately once Cloudflare sees that there is no problem with the session. If this happens, kill the indexer!

If anybody has more info on how to get Cloudflare working better, let me know!

GitHub

By default, GitHub has a rate limit of 60 requests per hour, which is enough for roughly 20 repositories with fewer than 100,000 items. You can increase this limit to 5,000 requests per hour by creating a (personal) token:

  1. Go to https://github.com/settings/tokens/new
  2. Add a name like "OpenDirectoryDownloader"
  3. You don't have to select any scopes!
  4. Click "Generate token"
  5. Start OpenDirectoryDownloader with --github-token
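As a back-of-the-envelope check of the figures above, assuming roughly three API requests per repository (an assumption, not a documented figure):

```shell
# Back-of-the-envelope check of the rate-limit figures, assuming roughly
# 3 API requests per repository (an assumption, not a documented figure):
default_limit=60
with_token=5000
per_repo=3
echo "unauthenticated: $(( default_limit / per_repo )) repositories per hour"
echo "with token:      $(( with_token / per_repo )) repositories per hour"
```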

Docker

Every release will automatically push an image to the Docker Hub:

https://hub.docker.com/repository/docker/koalabear84/opendirectorydownloader

Run it like:

docker run --rm -v c:/Scans:/app/Scans -it koalabear84/opendirectorydownloader --quit --speedtest

It will save the URLs files to C:\Scans (Windows); replace the path with a custom folder on other OSes.

* You can also run it without -v c:/Scans:/app/Scans if you don't want to save the results on your host.
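The volume mount can also be scripted; a minimal sketch (the SCANS_DIR variable and the echo are illustrative, not part of ODD) that prepares a host folder and prints the command to run:

```shell
# Hypothetical wrapper: prepare a host scans folder and print the docker
# command (echoed so it can be reviewed before actually running it).
scans_dir="${SCANS_DIR:-$PWD/Scans}"
mkdir -p "$scans_dir"
echo docker run --rm -v "$scans_dir:/app/Scans" -it \
  koalabear84/opendirectorydownloader --quit --speedtest
```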

Google Colab / Jupyter Notebook

  1. Open https://colab.research.google.com/github/KoalaBear84/OpenDirectoryDownloader/blob/master/OpenDirectoryDownloader.ipynb
  2. Run step 1 to setup the environment and install the latest OpenDirectoryDownloader
  3. Fill in the Url
  4. Run step 2
  5. Wait until indexing is completed
  6. Urls file can be found in Scans folder (see Folder icon on the left sidebar)

Onion / Tor support

  1. Make sure Tor is running on your machine
  2. Use the correct proxy address notation; the default for Tor is "socks5://127.0.0.1:9050"
  3. Start it with the --proxy-address parameter

OpenDirectoryDownloader.exe --url "http://*.onion/" --proxy-address "socks5://127.0.0.1:9050"
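Before starting a scan it can help to sanity-check the proxy notation; a small sketch (not part of ODD):

```shell
# Sanity-check (a sketch, not part of ODD) that a proxy address follows
# the socks5://host:port notation expected by --proxy-address:
is_socks5() {
  case "$1" in
    socks5://*:*) return 0 ;;   # scheme and a host:port part present
    *)            return 1 ;;
  esac
}
is_socks5 "socks5://127.0.0.1:9050" && echo "proxy address looks valid"
```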

Getting the code

For Visual Studio (Windows)

  1. Install Visual Studio: https://visualstudio.microsoft.com/vs/community/
  • With workload: ".NET Core cross-platform development"
  • With individual components: Code tools > Git for Windows and Code tools > GitHub extension for Visual Studio
  2. Be sure to install Git: https://git-scm.com/downloads
  3. Clone the repository by clicking "Clone or download" and click "Open in Visual Studio"

For Visual Studio Code

  1. Download Visual Studio Code: https://code.visualstudio.com/download
  2. Be sure to install Git: https://git-scm.com/downloads
  3. Clone the repository: https://code.visualstudio.com/docs/editor/versioncontrol#_cloning-a-repository
  4. More help: https://docs.microsoft.com/en-us/dotnet/core/tutorials/with-visual-studio-code

Building

  1. Install the newest .NET 8 SDK: https://dotnet.microsoft.com/download/dotnet/8.0
  2. git clone https://github.com/KoalaBear84/OpenDirectoryDownloader
  3. cd OpenDirectoryDownloader/src
  4. dotnet build .
  5. cd OpenDirectoryDownloader/bin/Debug/net8.0
  6. ./OpenDirectoryDownloader --url "https://myopendirectory.com"

For Linux (might not be needed since .NET 7): if you then need to package it into a single binary, you can use warp-packer.

When you have cloned the code, you can also run it without the SDK: download the "Runtime" and do "dotnet run ." instead of "dotnet build .".

Google Drive

For Google Drive scanning you need a Google Drive API credentials file; it's free!

You can follow the longer manual procedure, or the six-step 'Quickstart' workaround.

Manual/customized:

  1. Go to https://console.cloud.google.com/projectcreate
  2. Fill in Project Name, like "opendirectorydownloader" or so, leave Location unchanged
  3. Change Project ID (optional)
  4. Click "CREATE"
  5. Wait a couple of seconds until the project is created and open it (click "VIEW")
  6. On the APIs pane, click "Go to APIs overview"
  7. Click "ENABLE APIS AND SERVICES"
  8. Enter "Drive", select "Google Drive API"
  9. Click "ENABLE"
  10. Go to "Credentials" menu in the left menu bar
  11. Click "CONFIGURE CONSENT SCREEN"
  12. Choose "External", click "CREATE"
  13. Fill in something like "opendirectorydownloader" in the "Application name" box
  14. At the bottom click "Save"
  15. Go to "Credentials" menu in the left menu bar (again)
  16. Click "CREATE CREDENTIALS"
  17. Select "OAuth client ID"
  18. Select "Desktop app" as "Application type"
  19. Change the name (optional)
  20. Click "Create"
  21. Click "OK" in the "OAuth client created" dialog
  22. In the "OAuth 2.0 Client IDs" section, click on the just-created Desktop app line
  23. In the top bar, click "DOWNLOAD JSON"
  24. You will get a file like "client_secret_xxxxxx.apps.googleusercontent.com.json", rename it to "OpenDirectoryDownloader.GoogleDrive.json" and replace the one in the release

Wow, they really made a mess of this..

Alternative method (easier):

This will 'abuse' a 'Quickstart' project.

  1. Go to https://developers.google.com/drive/api/v3/quickstart/python
  2. Click "Enable the Drive API"
  3. "Desktop app" will already be selected on the "Configure your OAuth client" dialog
  4. Click "Create"
  5. Click "DOWNLOAD CLIENT CONFIGURATION"
  6. You will get a file like "credentials.json", rename it to "OpenDirectoryDownloader.GoogleDrive.json" and replace the one in the release
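Either way, the last step boils down to a rename next to the executable; sketched here with touch standing in for the real download:

```shell
# Sketch of the final step: put the downloaded credentials file in place
# next to the executable. `touch` stands in for the real download here.
downloaded="credentials.json"
touch "$downloaded"
mv "$downloaded" "OpenDirectoryDownloader.GoogleDrive.json"
```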

On first use, a browser screen opens where you need to grant access. Because the OAuth consent screen has not been verified ("This app isn't verified"), you get an extra warning. Use the "Advanced" link, then the "Go to yourappname (unsafe)" link.

Support

If you like OpenDirectoryDownloader, please consider supporting me!

❤️ Sponsor

Contact me

Reddit https://www.reddit.com/user/KoalaBear84

Contributors

4censord, 9001, bad3r, chaphasilor, fr1tzbot, kant, koalabear84, mcofficer, nwithan8, signalhunter, soorajsprakash, wingysam


Issues

Linux version is not executable by default

Describe the bug

When extracting the current release's self-contained version, the file OpenDirectoryDownloader does not have an executable bit set. (I assume this is the same with the non-self-contained version).

To Reproduce
Steps to reproduce the behavior:

  1. Download the current release to a Linux machine
  2. Extract using 7z, others should also work
  3. Try to run ODD
❯ ./OpenDirectoryDownloader --version                          
zsh: permission denied: ./OpenDirectoryDownloader 

Expected behavior
The zip should retain unix permissions.

This is likely because the build runs on windows, which can't retain unix permissions due to its own filesystem:

runs-on: windows-latest

If you want, I can improve the workflow to run on all OSs in parallel, which should also get rid of this issue.

Desktop (please complete the following information):

  • OS: Ubuntu 18.04.5
  • Version: v1.9.1.0 self-contained, also happened with earlier versions.

[Request] Make it Colab or heroku compatible

Hi,
this is a very helpful project, especially for downloading Google index websites. Could you please make it Google Colab compatible, so that we can run it in Colab?

It would also be nice if you made it compatible with Heroku, so that we can create a web app on Heroku.

Get root level folders sizes

Is your feature request related to a problem? Please describe.
I'd like to be able to get the size of each subfolder of the first depth.

Describe the solution you'd like
http://url.com/books
art/ Total: 11.82 MiB|
computers/ Total: 102 MiB|
|Dirs: 1 Ext: 1|Total: 10|Total: 113.82 MiB|

"dotnet build ." not creating executable file on Mac

Hello,

I was trying to build this program from source on Mac. I followed the guide under the "Building" paragraph here.
I have .NET Core 3.1 installed:
User@Users-MacBook-Pro netcoreapp3.1 % dotnet --list-sdks
3.1.402 [/usr/local/share/dotnet/sdk]

The building went fine, apparently, without any errors:
User@Users-MacBook-Pro Downloads % git clone https://github.com/KoalaBear84/OpenDirectoryDownloader
Cloning into 'OpenDirectoryDownloader'...
remote: Enumerating objects: 21, done.
remote: Counting objects: 100% (21/21), done.
remote: Compressing objects: 100% (16/16), done.
remote: Total 1925 (delta 7), reused 11 (delta 5), pack-reused 1904
Receiving objects: 100% (1925/1925), 1.06 MiB | 1.26 MiB/s, done.
Resolving deltas: 100% (1125/1125), done.
User@Users-MacBook-Pro Downloads % cd OpenDirectoryDownloader/OpenDirectoryDownloader
User@Users-MacBook-Pro OpenDirectoryDownloader % dotnet build .
Microsoft (R) Build Engine version 16.7.0+7fb82e5b2 for .NET
Copyright (C) Microsoft Corporation. All rights reserved.

Determining projects to restore...
Restored /Users/User/Downloads/OpenDirectoryDownloader/OpenDirectoryDownloader.GoogleDrive/OpenDirectoryDownloader.GoogleDrive.csproj (in 299 ms).
Restored /Users/User/Downloads/OpenDirectoryDownloader/OpenDirectoryDownloader.Shared/OpenDirectoryDownloader.Shared.csproj (in 289 ms).
Restored /Users/User/Downloads/OpenDirectoryDownloader/OpenDirectoryDownloader/OpenDirectoryDownloader.csproj (in 451 ms).
OpenDirectoryDownloader.Shared -> /Users/User/Downloads/OpenDirectoryDownloader/OpenDirectoryDownloader.Shared/bin/Debug/netcoreapp3.1/OpenDirectoryDownloader.Shared.dll
OpenDirectoryDownloader.GoogleDrive -> /Users/User/Downloads/OpenDirectoryDownloader/OpenDirectoryDownloader.GoogleDrive/bin/Debug/netcoreapp3.1/OpenDirectoryDownloader.GoogleDrive.dll
OpenDirectoryDownloader -> /Users/User/Downloads/OpenDirectoryDownloader/OpenDirectoryDownloader/bin/Debug/netcoreapp3.1/OpenDirectoryDownloader.dll

Build succeeded.
0 Warning(s)
0 Error(s)

Time Elapsed 00:00:05.00
User@Users-MacBook-Pro OpenDirectoryDownloader % cd bin/Debug/netcoreapp3.1
User@Users-MacBook-Pro netcoreapp3.1 % ./OpenDirectoryDownloader
zsh: no such file or directory: ./OpenDirectoryDownloader
User@Users-MacBook-Pro netcoreapp3.1 % ls
OpenDirectoryDownloader.runtimeconfig.dev.json
FluentFTP.dll
OpenDirectoryDownloader.runtimeconfig.json
NLog.dll
OpenDirectoryDownloader.deps.json
Google.Apis.Auth.PlatformServices.dll
OpenDirectoryDownloader.dll
Google.Apis.Auth.dll
OpenDirectoryDownloader.pdb
Google.Apis.Core.dll
OpenDirectoryDownloader.GoogleDrive.dll
Google.Apis.dll
OpenDirectoryDownloader.GoogleDrive.pdb
TextCopy.dll
OpenDirectoryDownloader.Shared.dll
Polly.dll
OpenDirectoryDownloader.Shared.pdb
CommandLine.dll
NLog.config
Microsoft.Extensions.DependencyInjection.Abstractions.dll
OpenDirectoryDownloader.GoogleDrive.json
AngleSharp.dll
Google.Apis.Drive.v3.dll
Newtonsoft.Json.dll

As you can see, no OpenDirectoryDownloader executable file was created, nor in the obj folder.

Could someone knowledgeable please advise on what I'm doing wrong or what I should do? Thanks

Support FTPS URLs

Is your feature request related to a problem? Please describe.
need to index ftps

Describe the solution you'd like
ftps that works

Crashes every time

Describe the bug
Crashing with stack trace

To Reproduce
Steps to reproduce the behavior:

  1. Execute ./OpenDirectoryDownloader --url https://download.tuxfamily.org/

Expected behavior
No failure

Fatal error. Internal CLR error. (0x80131506)
   at System.Diagnostics.StackTrace.GetStackFramesInternal(System.Diagnostics.StackFrameHelper, Int32, Boolean, System.Exception)
   at System.Diagnostics.StackFrameHelper.InitializeSourceInfo(Int32, Boolean, System.Exception)
   at System.Diagnostics.StackTrace.CaptureStackTrace(Int32, Boolean, System.Exception)
   at NLog.LoggerImpl.Write(System.Type, NLog.Internal.TargetWithFilterChain, NLog.LogEventInfo, NLog.LogFactory)
   at OpenDirectoryDownloader.OpenDirectoryIndexer+<WebDirectoryProcessor>d__53.MoveNext()
   at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
   at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.__Canon, System.Private.CoreLib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext(System.Threading.Thread)
   at System.Threading.Tasks.AwaitTaskContinuation.RunOrScheduleAction(System.Runtime.CompilerServices.IAsyncStateMachineBox, Boolean)
   at System.Threading.Tasks.Task.RunContinuations(System.Object)
   at System.Threading.Tasks.Task.TrySetResult()
   at System.Threading.Tasks.Task+DelayPromise.CompleteTimedOut()
   at System.Threading.TimerQueueTimer.CallCallback(Boolean)
   at System.Threading.TimerQueueTimer.Fire(Boolean)
   at System.Threading.TimerQueue.FireNextTimers()
Aborted (core dumped)

Desktop (please complete the following information):

  • OS: Archlinux
  • Version OpenDirectoryDownloader 1.0.0

Problem with google drive

I followed the steps in the readme to create the gdrive API json.

I downloaded the latest release for Windows, installed the latest .NET SDK and replaced the original .json by the one generated previously.
When I try to scan a gdrive folder, I get an error, which seems to be related to google authentication.

I tried another link that isn't a gdrive link, and it worked perfectly.

Here is the error message:

(screenshot of the error message)

I used OpenDirectoryDownloader v1.4.0.8 on W10

Error when there is no Console / Console Input attached

Describe the bug
I am using this tool in another script and I call it as a childProcess. Therefore, it has no Console attached to it which causes the script to throw a lot of errors.

To Reproduce
Steps to reproduce the behavior:

const childProcess = require('child_process');
const response = childProcess.spawnSync('dotnet', ['src/netcoreapp3.1/OpenDirectoryDownloader.dll', '--url=http://someurl.tld', '-q', '-j']);
console.log(response.output.toString())

Expected behavior
When there is no console attached to the script it should not print anything out.

Console Output
ERROR Command.ProcessConsoleInput Error processing action System.InvalidOperationException: Cannot read keys when either application does not have a console or when console input has been redirected. Try Console.Read.

Desktop (please complete the following information):

  • OS: macOS Catalina
  • Version: 10.15.3 (19D76)
  • node: v10.17.0

No Values Being Displayed

I'm using Windows 10; I installed 3.0 preview 5 and also Visual Studio 2019 preview. Everything seems to work except I'm not getting any values displayed. Please see the output below. Any help would be greatly appreciated.

Started
URL specified: http://www.joellagace.com/Converted%20books/
URL fixed: http://www.joellagace.com/Converted%20books/
Started indexing!



*** Press I for info (this) ***
*** Press S for statistics ***
*** Press T for thread info ***
*** Press J for Save JSON ***


*** Press ESC to EXIT ***



Finshed indexing
Saving URL list to file...
Saved URL list to file: http___www.joellagace.com_Converted%20books_.txt
2019-06-07 21:58:53.3492 [20] WARN <b__0>d.MoveNext Only a speedtest for HTTP(S)
Http status codes
200: 10
Total files: 0, Total estimated size: B
Total directories: 10
Total HTTP requests: 10, Total HTTP traffic: 74.49 kB

Url: http://www.joellagace.com/Converted%20bo...
Extension (Top 5) Files Size
Dirs: 10 Ext: 0 Total: 0 Total: B
Date: 6/8/2019 2:58:49 AM +00:00 Time: 00:00:03

Finished indexing!

[json] - getting error while saving as json

Describe the bug
I was getting an error while saving the .json using the J key. I found out that the Scans folder was not created at this location C:\Users\Administrator\AppData\Local\Temp\2\.net\OpenDirectoryDownloader\bygfrh2s.ae1, so I had to fix it by creating it manually.


To Reproduce
Steps to reproduce the behavior:

  1. Go to 'OpenDirectoryIndexer.exe -u https://gooddebate.org/sin/mirror/library/'
  2. Click on 'J'
  3. Scroll down
  4. See error

Expected behavior
I expected the tool to automatically save the .json file without any manual setup

Screenshots
(screenshot of the error)

Desktop (please complete the following information):

  • OS: [Windows Server 2019]
  • Version [17763]

Paging support

Some directory listings have paging, implement a generic way to support this

High CPU usage after scan has finished

Describe the bug
While waiting for the final keystroke, ODD uses a lot of CPU.

To Reproduce
Steps to reproduce the behavior:

  1. Run an OD scan
  2. Wait for Finished indexing!
  3. Check CPU usage

Expected behavior
ODD should be relatively idle. The loop reading the next keystroke (the only reasonable explanation) shouldn't consume this much CPU.


Desktop (please complete the following information):

  • OS: Ubuntu 18.0.5
  • Version v1.9.1.0 self-contained

Error The SSL connection could not be established, see inner exception.

Hey, I'm trying to find the size of a Mega folder using this tool (I know this isn't really the correct use, but I thought I could give it a try). The exact command I'm running is ./OpenDirectoryDownloader -u "http://mega.nz/folder/30MlkQib#RDOaGzmtFEHkxSYBaJSzVA" -e
but the output I'm getting is
https://paste.gg/p/anonymous/89418b73b9734f00b2965b2d4f35c153
and it keeps giving the error. For more information: I'm running Arch Linux on kernel 5.10.5-arch1-1 with OpenDirectoryDownloader-linux-x64-self-contained

Allow aria2c to preserve directory structure

Is your feature request related to a problem? Please describe.
Just noticed that according to the README, aria2c does not preserve directory structure while downloading.

Describe the solution you'd like
This is in fact possible by formatting the URLs file differently (see https://stackoverflow.com/a/28553355/7653274). This feature would best be hidden behind a flag, because it makes the file wget-incompatible.
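The linked answer boils down to aria2c's input-file format, where an indented option line after each URI sets the output path; a sketch with hypothetical paths:

```shell
# Sketch of a reformatted URLs file for aria2c: its -i input format accepts
# per-URI options on an indented line, so each URL can carry an explicit
# output path. URLs and paths here are hypothetical.
cat > aria2-urls.txt <<'EOF'
http://myopendirectory.com/books/art/file1.pdf
  out=books/art/file1.pdf
http://myopendirectory.com/books/art/file2.pdf
  out=books/art/file2.pdf
EOF
grep -c '^  out=' aria2-urls.txt   # count the per-URI output paths
```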

Support RFC5854 metalink file output

Metalink files are XML files that provide more information than just the filename, such as file sizes, dates, and cryptographic hashes. Aria2c can read these files directly. You can add the statistics for an OpenDirectory in an XML comment at the top, consolidating the metadata and data into a single file.

You might not want estimated file sizes in the file unless the user chooses --exact-file-sizes but again you can add the estimated file sizes as comments. When you get downloading working, you could calculate a hash and add that in to help someone else's aria2c to verify their download, or even as a way to fingerprint your own files as a way to detect future file corruption.

And XML files give you the ability to transform them into any format you want just by writing a stylesheet.

Google Colab Notebook

It would be more helpful, if there is a google colab notebook for this. hope you will consider this.

Unable to save URL list to file

Hi,

The application is currently crashing regularly when attempting to run the following commands

PS G:\> .\OpenDirectoryDownloader.exe -u http://31.22.89.2/cisco/ -j
Started
URL specified: http://31.22.89.2/cisco/
URL fixed: http://31.22.89.2/cisco/
Started indexing!
****************************************************************************
****************************************************************************
***  Press I for info (this)                                             ***
***  Press S for statistics                                              ***
***  Press T for thread info                                             ***
***  Press J for Save JSON                                               ***
***                                                                      ***
***  Press ESC to EXIT                                                   ***
****************************************************************************
****************************************************************************
2019-11-28 16:13:05.0349  [9]  WARN OpenDirectoryIndexer.TimerStatistics_Elapsed Http status codes
200: 90
Total files: 4,532, Total estimated size: 100.3 GiB
Total directories: 90
Total HTTP requests: 95, Total HTTP traffic: 1.45 MiB
|**Url:** http://31.22.89.2/cisco/|||
|:-|-:|-:|
|**Date:** 11/28/2019 5:12:35 AM +00:00|**Time:** 00:00:30||

2019-11-28 16:13:05.0448  [9]  WARN OpenDirectoryIndexer.TimerStatistics_Elapsed Queue: 78, Queue (filesizes): 0
2019-11-28 16:13:35.0240 [11]  WARN OpenDirectoryIndexer.TimerStatistics_Elapsed Http status codes
200: 180
Total files: 6,423, Total estimated size: 143.06 GiB
Total directories: 180
Total HTTP requests: 184, Total HTTP traffic: 2.36 MiB
|**Url:** http://31.22.89.2/cisco/|||
|:-|-:|-:|
|**Date:** 11/28/2019 5:12:35 AM +00:00|**Time:** 00:01:00||

2019-11-28 16:13:35.0404 [11]  WARN OpenDirectoryIndexer.TimerStatistics_Elapsed Queue: 0, Queue (filesizes): 0
2019-11-28 16:13:35.7381 [28] ERROR DirectoryParser.CheckSymlinks Possible virtual directory or symlink detected (level 1)! SKIPPING! Url: http://31.22.89.2/cisco/www.mmnt.net/db/0/
Finshed indexing
Saving URL list to file...

Then the following error occurs:

2019-11-28 16:13:35.9167 [18] ERROR <<StartIndexingAsync>b__0>d.MoveNext System.IO.DirectoryNotFoundException: Could not find a part of the path 'C:\Users\admin\AppData\Local\Temp\.net\OpenDirectoryDownloader\bygfrh2s.ae1\Scans\http___31.22.89.2_cisco_.txt'.
   at System.IO.FileStream.ValidateFileHandle(SafeFileHandle fileHandle)
   at System.IO.FileStream.CreateFileOpenHandle(FileMode mode, FileShare share, FileOptions options)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options)
   at System.IO.StreamWriter.ValidateArgsAndOpenPath(String path, Boolean append, Encoding encoding, Int32 bufferSize)
   at System.IO.StreamWriter..ctor(String path)
   at System.IO.File.WriteAllText(String path, String contents)
   at OpenDirectoryDownloader.OpenDirectoryIndexer.<>c__DisplayClass50_0.<<StartIndexingAsync>b__0>d.MoveNext() System.IO.DirectoryNotFoundException: Could not find a part of the path 'C:\Users\admin\AppData\Local\Temp\.net\OpenDirectoryDownloader\bygfrh2s.ae1\Scans\http___31.22.89.2_cisco_.txt'.
   at System.IO.FileStream.ValidateFileHandle(SafeFileHandle fileHandle)
   at System.IO.FileStream.CreateFileOpenHandle(FileMode mode, FileShare share, FileOptions options)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options)
   at System.IO.StreamWriter.ValidateArgsAndOpenPath(String path, Boolean append, Encoding encoding, Int32 bufferSize)
   at System.IO.StreamWriter..ctor(String path)
   at System.IO.File.WriteAllText(String path, String contents)
   at OpenDirectoryDownloader.OpenDirectoryIndexer.<>c__DisplayClass50_0.<<StartIndexingAsync>b__0>d.MoveNext()

And the application continues.

Starting speedtest (10-25 seconds)...
Test file: 421.8 MiB http://31.22.89.2/cisco/120xxXR/XR12000-iosxr-k9-4.0.0.tar

No list of URLs is generated.

I am using the OpenDirectoryDownloader-1.2.0.0.zip version from the releases section on windows 10. Running from an external HDD and via PowerShell.

I did notice that when I run

PS G:\> .\OpenDirectoryDownloader.exe --help

I get the following output (and subsequent crash)

Started
OpenDirectoryDownloader 1.0.0
Copyright (C) 2019 OpenDirectoryDownloader

  -u, --url                 Url to scan

  -t, --threads             (Default: 5) Number of threads

  -q, --quit                (Default: false) Do not wait after scanning

  -j, --json                (Default: false) Save JSON file

  -f, --no-urls             (Default: false) Do not save URLs file

  -r, --no-reddit           (Default: false) Do not show Reddit stats markdown

  -e, --exact-file-sizes    (Default: false) Exact file sizes (WARNING: Uses HEAD requests which takes more time and is heavier for server)

  -l, --upload-urls         (Default: false) Uploads urls file

  -s, --speed-test          (Default: false) Do a speed test

  --help                    Display this help screen.

  --version                 Display version information.

Error command line parameter 'HelpRequestedError'

Unhandled Exception: System.NullReferenceException: Object reference not set to an instance of an object.
   at OpenDirectoryDownloader.Program.Main(String[] args)
   at OpenDirectoryDownloader.Program.<Main>(String[] args)
PS G:\>

Not sure what else I can tell you really. Any advice would be appreciated.

Cheers

Any way to get the Scanned URL from a URL file?

What the title says. Suppose I have a URL file long after it has been scanned. Specifically, the scanned URL was http://80s.lt/?dir=Files.

The URL file only contains links like http://80s.lt/Files/..., which don't contain the original URL.

The filename is http___80s.lt__dir=Files_.txt, which I can't reliably infer the original URL from.

Is there any other fancy way you can think of?
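For illustration, the reason the inference is unreliable: the scan filename replaces URL characters that are invalid in filenames with '_', which is lossy (':', '/', and '?' all collapse into the same character). A sketch:

```shell
# Illustration: the scan filename replaces characters that are invalid in
# filenames with '_', so ':', '/', and '?' all collapse into the same
# character and the original URL cannot be reliably recovered.
f="http___80s.lt__dir=Files_.txt"
stem="${f%.txt}"
# Restoring the scheme is unambiguous; the remaining '_' are not:
echo "$stem" | sed 's|^http___|http://|'
```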

Speedtest not possible when spawning the process

Describe the bug

When I spawn ODD in Node.js and add the --speedtest-flag, the speedtest fails with an error about an invalid handle

To Reproduce
Steps to reproduce the behavior:

  1. Spawn ODD in Node.js with the speedtest argument:
    const oddProcess = spawn(<string with path to executable>, [`-u ${url}`, `--quit`, `--speedtest`]);
      
    oddProcess.stdout.on('data', (data) => {
      console.log(`stdout: ${data}`);
    });
      
    oddProcess.stderr.on('data', (data) => {
      console.error(`stderr: ${data}`);
    });
  2. Look at the output. It should say something like the following:
    Starting speedtest (10-25 seconds)...
    Test file: 21 GiB
    http://178.32.222.201/%5bAniDub%5d_Kara_no_Kyoukai_%5bBDRip1080p_h264_Flac%5d%5bSuzaku%5d/%5bAniDub%5d_Kara_no_Kyoukai_%5bMovie_5%5d_Paradox_Paradigm_%5bBDRip1080p_h264_Aac%5d%5bSuzaku%5d.mkv
    2020-11-29 21:17:07.2102 [15] ERROR OpenDirectoryIndexer.StartIndexingAsync Speedtest failed System.IO.IOException: The handle is invalid.
       at System.ConsolePal.GetBufferInfo(Boolean throwOnNoConsole, Boolean& succeeded)
       at System.ConsolePal.GetBufferInfo()
       at System.ConsolePal.GetCursorPosition()
       at OpenDirectoryDownloader.Library.ClearCurrentLine()
       at OpenDirectoryDownloader.Library.SpeedtestFromStream(Stream stream, Int32 seconds)
       at OpenDirectoryDownloader.Library.DoSpeedTestHttpAsync(HttpClient httpClient, String url, Int32 seconds)
       at OpenDirectoryDownloader.OpenDirectoryIndexer.<>c__DisplayClass51_0.<<StartIndexingAsync>b__0>d.MoveNext()

Expected behavior
The speedtest should work (and does work when using ODD in CLI mode).

Desktop (please complete the following information):

  • OS: Windows 10 (64 bit)
  • Runtime: Node.js v14.15.1
  • Client: OpenDirectoryDownloader v1.9.0.9
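As an aside, the repro snippet above passes `-u ${url}` as a single argv entry, which most CLI parsers see as one unrecognized argument. Whatever the language, flag and value usually need to be separate entries. An illustrative sketch (not ODD's code; `odd_path` is a placeholder) of the same call from Python's subprocess:

```python
import subprocess

def run_odd(odd_path, url):
    # Flag and value as SEPARATE argv entries: ["-u", url], not ["-u " + url].
    # Output is captured, so the child never sees an interactive console.
    return subprocess.run(
        [odd_path, "-u", url, "--quit", "--speedtest"],
        capture_output=True, text=True,
    )
```

The same applies to Node.js `spawn`: `['-u', url, '--quit', '--speedtest']` rather than `` [`-u ${url}`, ...] ``.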

Executable's version is wrong

Describe the bug
Both --version and --help display the version as 1.0.0.

To Reproduce
Steps to reproduce the behavior:

  1. Run ./OpenDirectoryDownloader --version
  2. Output:
❯ ./OpenDirectoryDownloader --version
tarted with PID 1539
OpenDirectoryDownloader 1.0.0
Error command line parameter 'VersionRequestedError'
Unhandled exception. System.NullReferenceException: Object reference not set to an instance of an object.
   at OpenDirectoryDownloader.Program.Main(String[] args)
   at OpenDirectoryDownloader.Program.<Main>(String[] args)
[1]    1539 abort      ./OpenDirectoryDownloader --version

Expected behavior
The version string should reflect the actual release version.

Desktop (please complete the following information):

  • OS: Ubuntu 18.04.5
  • Version: v1.9.1.0 self-contained, also happened with earlier versions.

Properly follow redirects

Describe the bug
ODD doesn't properly follow redirects (status code 301). This breaks scanning of some ODs.

To Reproduce
Steps to reproduce the behavior:

  1. Try to scan https://file.wikileaks.org/file
  2. Notice the following output:
Started with PID 41852
Which URL do you want to index?
https://file.wikileaks.org/file
URL specified: https://file.wikileaks.org/file
Started indexing!
┌─────────────────────────────────────────────────────────────────────────┐
│  Press I for info (this)                                                │
│  Press S for statistics                                                 │
│  Press T for thread info                                                │
│  Press U for Save TXT                                                   │
│  Press J for Save JSON                                                  │
├─────────────────────────────────────────────────────────────────────────┤
│  Press ESC or X to EXIT                                                 │
└─────────────────────────────────────────────────────────────────────────┘

2020-12-23 12:08:46.2585  [9]  WARN OpenDirectoryIndexer.ProcessWebDirectoryAsync First request fails, using Curl fallback User-Agent
2020-12-23 12:08:46.3586  [9]  WARN OpenDirectoryIndexer.ProcessWebDirectoryAsync First request fails, using Chrome fallback User-Agent
2020-12-23 12:08:46.4736  [9]  WARN OpenDirectoryIndexer..ctor [Processor 1] Error Response status code does not indicate success: 301 (Moved Permanently). retrieving on try 1 for url '/file'. Waiting 2 seconds.
2020-12-23 12:08:48.7565  [9]  WARN OpenDirectoryIndexer.ProcessWebDirectoryAsync First request fails, using Curl fallback User-Agent
2020-12-23 12:08:48.8835  [9]  WARN OpenDirectoryIndexer.ProcessWebDirectoryAsync First request fails, using Chrome fallback User-Agent
2020-12-23 12:08:48.9781  [9]  WARN OpenDirectoryIndexer..ctor [Processor 1] Error Response status code does not indicate success: 301 (Moved Permanently). retrieving on try 2 for url '/file'. Waiting 4 seconds.
2020-12-23 12:08:53.1775  [9]  WARN OpenDirectoryIndexer.ProcessWebDirectoryAsync First request fails, using Curl fallback User-Agent
2020-12-23 12:08:53.2767  [9]  WARN OpenDirectoryIndexer.ProcessWebDirectoryAsync First request fails, using Chrome fallback User-Agent
2020-12-23 12:08:53.3764  [9]  WARN OpenDirectoryIndexer..ctor [Processor 1] Error Response status code does not indicate success: 301 (Moved Permanently). retrieving on try 3 for url '/file'. Waiting 8 seconds.
2020-12-23 12:09:01.5481  [9]  WARN OpenDirectoryIndexer.ProcessWebDirectoryAsync First request fails, using Curl fallback User-Agent
2020-12-23 12:09:01.6718  [9]  WARN OpenDirectoryIndexer.ProcessWebDirectoryAsync First request fails, using Chrome fallback User-Agent
2020-12-23 12:09:01.7654  [9]  WARN OpenDirectoryIndexer..ctor [Processor 1] Error Response status code does not indicate success: 301 (Moved Permanently). retrieving on try 4 for url '/file'. Waiting 16 seconds.
2020-12-23 12:09:15.9330  [5]  WARN OpenDirectoryIndexer.TimerStatistics_Elapsed Http status codes
301: 4
Total files: 0, Total estimated size:  B
Total directories: 1
Total HTTP requests: 1, Total HTTP traffic:  B
|**Url:** https://file.wikileaks.org/file|||
|:-|-:|-:|
|**Date (UTC):** 2020-12-23 11:08:45|**Time:** 00:00:30||

2020-12-23 12:09:15.9330  [5]  WARN OpenDirectoryIndexer.TimerStatistics_Elapsed Queue: 0, Queue (filesizes): 0
2020-12-23 12:09:18.1261  [9]  WARN OpenDirectoryIndexer.ProcessWebDirectoryAsync First request fails, using Curl fallback User-Agent
2020-12-23 12:09:18.2213  [9]  WARN OpenDirectoryIndexer.ProcessWebDirectoryAsync First request fails, using Chrome fallback User-Agent
2020-12-23 12:09:18.3267  [9]  WARN OpenDirectoryIndexer..ctor [Processor 1] Cancelling on try 5 for url '/file'.
2020-12-23 12:09:18.3267  [9] ERROR OpenDirectoryIndexer.WebDirectoryProcessor Skipped processing Url: 'https://file.wikileaks.org/file'
Finshed indexing
No URLs to save
Http status codes
301: 5
Total files: 0, Total estimated size:  B
Total directories: 1
Total HTTP requests: 1, Total HTTP traffic:  B
|**Url:** https://file.wikileaks.org/file|||
|:-|-:|-:|
|**Extension (Top 5)**|**Files**|**Size**|
|**Dirs:** 1 **Ext:** 0|**Total:** 0|**Total:**  B|
|**Date (UTC):** 2020-12-23 11:08:45|**Time:** 00:00:34||

^(Created by [KoalaBear84's OpenDirectory Indexer](https://github.com/KoalaBear84/OpenDirectoryDownloader/))

URLs with errors:
https://file.wikileaks.org/file
Finished indexing!
Press ESC to exit! Or C to copy to clipboard and quit!

Expected behavior
It should honor the 301 and scan https://file.wikileaks.org/file/ (with trailing slash) instead. Scanning that URL works as expected.
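Honoring the redirect means resolving the `Location` header against the request URL, which is standard RFC 3986 relative-reference resolution. A minimal sketch of that step (assuming the server answers the wikileaks request with `Location: /file/`):

```python
from urllib.parse import urljoin

def resolve_redirect(request_url, location_header):
    # A 301/302 Location value may be absolute or relative;
    # urljoin resolves both against the original request URL.
    return urljoin(request_url, location_header)

# The case above: the server redirects /file to /file/ (trailing slash)
resolve_redirect("https://file.wikileaks.org/file", "/file/")
# → "https://file.wikileaks.org/file/"
```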

Desktop (please complete the following information):

  • OS: Windows 10, 64 bit
  • ODD Version: 1.9.2.7

Fix HelpRequestedError and VersionRequestedError

Describe the bug
Based on errors encountered while working to fix #50

OpenDirectoryDownloader.exe --help

Error command line parameter 'HelpRequestedError'
Unhandled exception. System.NullReferenceException: Object reference not set to an instance of an object.
   at OpenDirectoryDownloader.Program.Main(String[] args)
   at OpenDirectoryDownloader.Program.<Main>(String[] args)

OpenDirectoryDownloader.exe --version

Error command line parameter 'VersionRequestedError'
Unhandled exception. System.NullReferenceException: Object reference not set to an instance of an object.
   at OpenDirectoryDownloader.Program.Main(String[] args)
   at OpenDirectoryDownloader.Program.<Main>(String[] args)

Expected behavior
No errors

Desktop (please complete the following information):

  • OS: win-x64
  • Version 1.9.1.2

[Feature Request] - Tree structure

Is your feature request related to a problem? Please describe.
Surfing through large amounts of data is painful, as the page reloads each time we navigate through directories.
Describe the solution you'd like
snap2html is an example of what I want to achieve: being able to generate an HTML file with a tree structure of all the directories and files, thus making it easier to search.
Source Code
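A snap2html-style page needs a nested tree, which can be derived from the flat URL list ODD already produces. A hedged sketch of that first step (illustrative only, not ODD's implementation):

```python
from urllib.parse import urlparse

def build_tree(urls):
    # Fold a flat list of file URLs into nested dicts:
    # directories map to dicts, files map to None.
    root = {}
    for url in urls:
        parts = [p for p in urlparse(url).path.split("/") if p]
        node = root
        for part in parts[:-1]:        # intermediate path segments = directories
            node = node.setdefault(part, {})
        if parts:
            node[parts[-1]] = None     # last segment = file leaf
    return root

tree = build_tree([
    "http://example.com/a/x.txt",
    "http://example.com/a/b/y.txt",
])
# → {"a": {"x.txt": None, "b": {"y.txt": None}}}
```

Rendering that dict as nested `<ul>` elements would give the single searchable HTML file the request describes.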

Intermediate Speedtest results fail to write to terminal

Describe the bug
There is a deadlock whenever ODD tries to write the intermediate speedtest results to the terminal. As soon as I hit Enter, the deadlock is resolved. If the speedtest has been running for a while, an error is produced and the speedtest shows up as failed in the final results; otherwise it resumes as normal.

Starting speedtest (10-25 seconds)...
Test file: 896 MiB http://works.with/any.url
Downloaded: 4.49 MiB, Time: 295s, Speed: 4.5 MB/s (36 mbit)2020-11-04 08:42:33.1965 [15] ERROR OpenDirectoryIndexer.StartIndexingAsync Speedtest failed System.InvalidOperationException: Sequence contains no elements
   at System.Linq.ThrowHelper.ThrowNoElementsException()
   at System.Linq.Enumerable.Max[TSource](IEnumerable`1 source, Func`2 selector)
   at OpenDirectoryDownloader.Library.SpeedtestFromStream(Stream stream, Int32 seconds) in D:\a\OpenDirectoryDownloader\OpenDirectoryDownloader\OpenDirectoryDownloader\Library.cs:line 205
   at OpenDirectoryDownloader.Library.DoSpeedTestHttpAsync(HttpClient httpClient, String url, Int32 seconds) in D:\a\OpenDirectoryDownloader\OpenDirectoryDownloader\OpenDirectoryDownloader\Library.cs:line 140
   at OpenDirectoryDownloader.OpenDirectoryIndexer.<>c__DisplayClass51_0.<<StartIndexingAsync>b__0>d.MoveNext() in D:\a\OpenDirectoryDownloader\OpenDirectoryDownloader\OpenDirectoryDownloader\OpenDirectoryIndexer.cs:line 367
Http status codes

To Reproduce
Steps to reproduce the behavior:

  1. Open any shell on Linux (tested with zsh and bash, over SSH and in a terminal emulator)
  2. Run a scan with the -s flag
  3. Wait for the "Starting speedtest" output. No further output (Apart from "Test file: ...") should appear).
  4. Wait for a solid minute or two, then hit enter
  5. The intermediate output will appear, along with the error right behind it (see paste above).

Video of a reproduction (Skip to 1:25 if you don't want to wait)

Expected behavior
The intermediate results should be written to the terminal. If that turns out to be impossible, it should print a warning and proceed with the speedtest.
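Both this failure and the invalid-handle error in the Node.js spawn report above boil down to cursor manipulation when stdout is not an interactive console. A common guard, shown here as a Python illustration rather than ODD's actual fix, is to check for a TTY before emitting cursor-control sequences:

```python
import sys

def clear_current_line():
    # Cursor-control escapes only make sense on an interactive terminal.
    # When output is piped or redirected (as when spawned from Node.js),
    # skip them instead of letting the console call fail.
    if sys.stdout.isatty():
        sys.stdout.write("\r\x1b[2K")  # carriage return + erase line
        sys.stdout.flush()
        return True
    return False
```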

Machine 1:

  • OS: Ubuntu 18.04.5 LTS
  • Version: v1.7.0.3
  • Shell: zsh over ssh

Machine 2:

  • OS: Manjaro Linux 20.1.2 "Mikah"
  • Version: v1.7.0.3
  • Shell: bash and zsh in yakuake

Docker build seems to fail

When I try to build the Docker container, the following output appears:

➜  OpenDirectoryDownloader git:(master) docker build . -t opendirectorydownloader
Sending build context to Docker daemon    5.6MB
Step 1/5 : FROM mcr.microsoft.com/dotnet/core/sdk:3.1
3.1: Pulling from dotnet/core/sdk
57df1a1f1ad8: Pull complete 
71e126169501: Pull complete 
1af28a55c3f3: Pull complete 
03f1c9932170: Pull complete 
1e9f61add744: Pull complete 
8bc534dd6017: Pull complete 
7b0d6e95dc2c: Pull complete 
Digest: sha256:95f46b6614f7fb759cdad3b38286e6d9a25422113a07b5862dec1400379e796b
Status: Downloaded newer image for mcr.microsoft.com/dotnet/core/sdk:3.1
 ---> c4155a9104a8
Step 2/5 : COPY . /app
 ---> c1efa9a65818
Step 3/5 : WORKDIR /app
 ---> Running in fcb2ccd10152
Removing intermediate container fcb2ccd10152
 ---> a979311878a0
Step 4/5 : RUN dotnet build OpenDirectoryDownloader
 ---> Running in 24879e5ba77c
Microsoft (R) Build Engine version 16.7.0+7fb82e5b2 for .NET
Copyright (C) Microsoft Corporation. All rights reserved.

  Determining projects to restore...
  Restored /app/OpenDirectoryDownloader.Shared/OpenDirectoryDownloader.Shared.csproj (in 1000 ms).
  Restored /app/OpenDirectoryDownloader.GoogleDrive/OpenDirectoryDownloader.GoogleDrive.csproj (in 2.29 sec).
  Restored /app/OpenDirectoryDownloader/OpenDirectoryDownloader.csproj (in 2.97 sec).
  OpenDirectoryDownloader.Shared -> /app/OpenDirectoryDownloader.Shared/bin/Debug/netcoreapp3.1/OpenDirectoryDownloader.Shared.dll
  OpenDirectoryDownloader.GoogleDrive -> /app/OpenDirectoryDownloader.GoogleDrive/bin/Debug/netcoreapp3.1/OpenDirectoryDownloader.GoogleDrive.dll
OpenDirectoryIndexer.cs(450,96): error CS8086: A '}' character must be escaped (by doubling) in an interpolated string. [/app/OpenDirectoryDownloader/OpenDirectoryDownloader.csproj]
OpenDirectoryIndexer.cs(451,102): error CS8086: A '}' character must be escaped (by doubling) in an interpolated string. [/app/OpenDirectoryDownloader/OpenDirectoryDownloader.csproj]

Build FAILED.

OpenDirectoryIndexer.cs(450,96): error CS8086: A '}' character must be escaped (by doubling) in an interpolated string. [/app/OpenDirectoryDownloader/OpenDirectoryDownloader.csproj]
OpenDirectoryIndexer.cs(451,102): error CS8086: A '}' character must be escaped (by doubling) in an interpolated string. [/app/OpenDirectoryDownloader/OpenDirectoryDownloader.csproj]
    0 Warning(s)
    2 Error(s)

Time Elapsed 00:00:06.21
The command '/bin/sh -c dotnet build OpenDirectoryDownloader' returned a non-zero code: 1
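CS8086 means a literal `}` inside a C# interpolated string must be doubled (`}}`). Python f-strings happen to use the same doubling convention, so the fix can be illustrated directly (hypothetical example, not the actual OpenDirectoryIndexer.cs lines):

```python
def json_like(name):
    # {{ and }} are escaped braces, producing literal { and } in the output,
    # exactly as in C# interpolated strings; {name} interpolates the variable.
    return f'{{"name": "{name}"}}'

json_like("dir")  # → '{"name": "dir"}'
```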

Find the topmost directory

Is your feature request related to a problem? Please describe.
I'll admit this is one for automation. When given a link, an automated system has no way of knowing whether it's the open directory root or merely a subfolder. The best (most reliable) way to find out would be to look for a "parent folder" (../) link. And what better program to do that than the one that's already parsing all kinds of OD formats?

Describe the solution you'd like
When given a specific flag, the program won't scan the current folder, but the highest-level folder it can find.

Alternatively, when given a specific flag, output the highest-level folder it can find without scanning it.
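The candidate parents to probe can be enumerated by stripping path segments one at a time; a scanner would then request each candidate and check whether it still renders a listing. A hedged sketch of just the enumeration step (illustrative, not ODD's code):

```python
from urllib.parse import urlsplit, urlunsplit

def parent_candidates(url):
    # Yield ancestor directory URLs, nearest first, down to the site root.
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    candidates = []
    while segments:
        segments.pop()
        path = "/" + "/".join(segments) + ("/" if segments else "")
        candidates.append(urlunsplit((parts.scheme, parts.netloc, path, "", "")))
    return candidates

parent_candidates("http://host/a/b/c/")
# → ["http://host/a/b/", "http://host/a/", "http://host/"]
```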

Suggestion for the read me file

I'm trying to use this tool but I don't see any .exe file. Am I supposed to compile the files myself? I haven't done anything like this before, but I would appreciate it if you had those instructions in the readme file.

Console output should include "Created by KoalaBear84's OpenDirectory Indexer" banner

Is your feature request related to a problem? Please describe.
Currently, the markdown table is printed to stdout, but it doesn't include the "Created by KoalaBear84's OpenDirectory Indexer" line at the bottom. This sucks, because I want to credit you!

Describe the solution you'd like
Just print it out along with the table. If people don't like it, they don't have to copy it. Alternatively, hide it behind a feature flag.

Document xclip requirement

Describe the bug
Trying to copy to the clipboard on linux without having xclip installed results in an error:

Press ESC to exit! Or C to copy to clipboard and quit!
2020-10-07 11:33:44.9066  [1] ERROR Command.ProcessConsoleInput Error processing action System.Exception: Could not execute process. Command line: bash -c "cat /tmp/tmpW9nSm7.tmp | xclip -i -selection clipboard".
Output: 

Error: bash: xclip: command not found


   at BashRunner.Run(String commandLine) in C:\projects\textcopy\src\TextCopy\BashRunner.cs:line 45
   at LinuxClipboard.SetText(String text) in C:\projects\textcopy\src\TextCopy\LinuxClipboard_2.1.cs:line 33
   at TextCopy.ClipboardService.SetText(String text) in C:\projects\textcopy\src\TextCopy\ClipboardService.cs:line 65
   at OpenDirectoryDownloader.Command.ProcessConsoleInput(OpenDirectoryIndexer openDirectoryIndexer) in D:\a\OpenDirectoryDownloader\OpenDirectoryDownloader\OpenDirectoryDownloader\Command.cs:line 115
Unhandled exception. System.Exception: Could not execute process. Command line: bash -c "cat /tmp/tmpW9nSm7.tmp | xclip -i -selection clipboard".
Output: 

Error: bash: xclip: command not found


   at BashRunner.Run(String commandLine) in C:\projects\textcopy\src\TextCopy\BashRunner.cs:line 45
   at LinuxClipboard.SetText(String text) in C:\projects\textcopy\src\TextCopy\LinuxClipboard_2.1.cs:line 33
   at TextCopy.ClipboardService.SetText(String text) in C:\projects\textcopy\src\TextCopy\ClipboardService.cs:line 65
   at OpenDirectoryDownloader.Command.ProcessConsoleInput(OpenDirectoryIndexer openDirectoryIndexer) in D:\a\OpenDirectoryDownloader\OpenDirectoryDownloader\OpenDirectoryDownloader\Command.cs:line 115
   at OpenDirectoryDownloader.Program.Main(String[] args) in D:\a\OpenDirectoryDownloader\OpenDirectoryDownloader\OpenDirectoryDownloader\Program.cs:line 97
   at OpenDirectoryDownloader.Program.<Main>(String[] args)

This is probably unavoidable, but it should be documented in the README.

To Reproduce
Steps to reproduce the behavior:

  1. Set up a *nix computer without xclip
  2. Run a scan
  3. When prompted, press C to copy
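Alongside the README note, a pre-flight check for the helper binary would turn the stack trace above into a friendly message. A sketch of the idea (illustrative, not ODD's code; TextCopy shells out to xclip on Linux):

```python
import shutil

def helper_available(name):
    # shutil.which does the same PATH lookup the shell would,
    # without actually executing anything.
    return shutil.which(name) is not None

if not helper_available("xclip"):
    print("xclip not found; install it (e.g. 'apt install xclip') "
          "to enable clipboard copy")
```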

More output options

Is your feature request related to a problem? Please describe.

We are currently running into the problem that we have very large (3GB+) JSON files generated by ODD, but can't process them because we don't have enough RAM to parse the JSON.
I personally love JSON, but it seems like the format is not well-suited for the task (it's not streamable).

Now, you might ask: why don't you guys just use the .txt file? The problem is that it is only created after the scan is finished, including file size estimations. After scanning a large OD for ~6h yesterday, I had a couple million links, with over 10M links left in the queue for file size estimation. The actual URLs were already there, but the only way to save them was by hitting J to save as JSON.

Describe the solution you'd like

There are multiple features that would be useful for very large ODs:

  • add a key command to prematurely save the .txt-file
    this should be no problem at all and is simply a missing option/command at this point
  • adopt a new file format that supports streaming parsers
    think jsonlines, csv, whatever
    it might also be a good idea to restructure the meta info of the scan and files in order to remove duplicate info and make the output files smaller and easier to work with
  • while we're at it, an option for saving the reddit output as well as error logs to a separate file would also be appreciated! :D

@MCOfficer and I would be glad to discuss the new file structure further, if you're so inclined :)
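JSON Lines addresses the streaming concern directly: one JSON object per line, so both writer and reader work in constant memory, and a crash mid-scan still leaves all earlier lines usable. A minimal sketch of the proposed format (field names are illustrative, not a committed schema):

```python
import json

def write_jsonl(path, entries):
    # One object per line; each line is valid on its own, so partial
    # files from interrupted scans remain parseable.
    with open(path, "w", encoding="utf-8") as f:
        for entry in entries:
            f.write(json.dumps(entry) + "\n")

def read_jsonl(path):
    # Streaming read: memory use is independent of total file size.
    with open(path, encoding="utf-8") as f:
        for line in f:
            yield json.loads(line)
```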

ARM Support

Is your feature request related to a problem? Please describe.

I'd like to integrate the indexer into my downloader server, so I can paste a "folder" url into my download manager and have it download all or a selection of all contained files.
Given that there are a ton of different OD HTML structures, it makes no sense to implement this myself when there's already a great solution out there!

Only problem: my download manager server runs on a Raspberry Pi 4, which uses an ARM architecture (armv6/7, 32 bit from what I've found), so the precompiled binaries won't work on the RPi 4.

Describe the solution you'd like

I already tried to compile the source myself, but I ran into some errors (sent you a message about it on reddit, but you never replied).
I'm not familiar with .NET at all, so I'm having a hard time finding a solution.

That's why I'd like to ask if you could maybe provide precompiled ARM binaries as well?
I imagine building for an additional platform isn't too hard if one knows what they are doing, and it would be greatly appreciated! :D

Error with alx-xlx/goindex

2020-12-30 11:23:11.1330 [9] ERROR GoIndexParser.ScanIndexAsync Error processing GoIndex for URL: https://exclusive.getstudyfever5.workers.dev/0:/? ?????? ?? ??? , ? ????? ?? ?? ??????/@getstudyfever Atmanirbhar Bharat Android App Development/#/ System.NullReferenceException: Object reference not set to an instance of an object.
at OpenDirectoryDownloader.Site.GoIndex.GoIndexParser.ScanIndexAsync(HttpClient httpClient, WebDirectory webDirectory)

Exit when all jobs are finished

Is your feature request related to a problem? Please describe.
I'm thinking about calling OpenDirectoryDownloader from another application. The fact that it waits for the ESC keystroke makes this a bit harder than it could be.

Describe the solution you'd like

A command-line option that makes the program exit automatically when all jobs are finished.
