Comments (28)

kave333 avatar kave333 commented on May 30, 2024 1

I've tried with overwriting on and off. I opened the .json file and it just shows one big line of everything, so I searched it to count how many "https" matches there were in the whole file. It returned 2,598 matches, so I'm guessing it got the links to 2,598 videos, whereas the OF page shows 2,680. From this I guess some are private...

I ran using the arguments B and C to try to stop videos and images getting mixed.

For images, it managed to download 2,404, while the OF page shows 15,981. I used the same method on the .json file and it showed 15,981 "https" matches, so the script is scraping the links correctly, as shown in the JSON files, but not all of the links are getting downloaded.
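The counting check described above can be reproduced without a text editor. This is only a rough sketch: it assumes every link in the file starts with "https" and that nothing else in the file contains that string, and the function names are made up for illustration.

```python
def count_links(text: str) -> int:
    """Count non-overlapping occurrences of 'https', like a plain text search."""
    return text.count("https")


def missing_downloads(links_found: int, files_downloaded: int) -> int:
    """How many scraped links never ended up as a file on disk."""
    return max(links_found - files_downloaded, 0)


# Hypothetical sample standing in for the contents of the exported .json file.
sample = '[{"link": "https://example.com/a.jpg"}, {"link": "https://example.com/b.mp4"}]'
print(count_links(sample))
print(missing_downloads(15981, 2404))
```

If the first number matches the count shown on the page but the second is large, the problem is in the download step rather than in the scraping step.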

from ultimascraper.

dilemmax avatar dilemmax commented on May 30, 2024 1

Can you also do me a favour and sort the links by the Post ID and check the date column to see what is the earliest date for the links? I'm willing to bet it stops before early 2017.

UltimaHoarder avatar UltimaHoarder commented on May 30, 2024

Jesus, that's a lot. Try with overwriting on. Have you checked if the link JSON contains 2600 videos or just 350?
Also, some videos get downloaded into the image folder. OnlyFans changed how their API works, so I need to fix that. 😢

dilemmax avatar dilemmax commented on May 30, 2024

What you can do is use the link below to convert the JSON file to an Excel spreadsheet. You should be able to accurately find out how many links are available for download.

You can copy the links to a download manager to make it download faster, and it's also resumable. I'm using Internet Download Manager and it works really well.

https://json-csv.com/
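The same conversion can also be done locally with Python's standard library. This is a minimal sketch that assumes the exported JSON is a flat list of objects; the field names `post_id` and `link` are hypothetical and not taken from the script's real output.

```python
import csv
import io
import json


def json_to_csv(json_text: str) -> str:
    """Convert a JSON list of flat objects into CSV text with a header row."""
    rows = json.loads(json_text)
    # Collect column names in first-seen order so every key gets a column.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # keys missing from a row are written as empty cells
    return buffer.getvalue()


# Hypothetical input; the real export may be nested and need flattening first.
example = '[{"post_id": 1, "link": "https://example.com/a.mp4"}]'
print(json_to_csv(example))
```

The resulting text can be saved with a .csv extension and opened in a spreadsheet, where each row is one link.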

UltimaHoarder avatar UltimaHoarder commented on May 30, 2024

@kave333
I've fixed the image and video mixing issue but I'm not ready to push the commit yet.
I'm also overhauling the json exporting.

@dilemmax I found corrupted posts in 2017 that wouldn't download. It was due to them having the same filename. It WAS the same photo but OnlyFans decided to duplicate them...

Since the script is multithreaded, it downloads both files with the same name at the same time, which causes them to overwrite each other and become corrupted. That's why you'll see fewer files than you accounted for. To fix this, I'll have the script search for duplicate filenames and rename them, which should solve the problem.
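The renaming idea described above could look roughly like the sketch below. It is not the script's actual fix, just one way to make every target filename unique before any download threads start; the `_1`, `_2` suffix scheme and the function name are assumptions.

```python
from pathlib import PurePath


def unique_filenames(names):
    """Return the names in order, renaming repeats so no two are identical."""
    used = set()
    result = []
    for name in names:
        path = PurePath(name)
        candidate = name
        counter = 1
        # Keep trying name_1.ext, name_2.ext, ... until the name is unused.
        while candidate in used:
            candidate = f"{path.stem}_{counter}{path.suffix}"
            counter += 1
        used.add(candidate)
        result.append(candidate)
    return result


print(unique_filenames(["a.jpg", "a.jpg", "b.jpg", "a.jpg"]))
```

Because the names are settled in a single thread before downloading begins, two worker threads can no longer write to the same path at the same time.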

UltimaHoarder avatar UltimaHoarder commented on May 30, 2024

I also found a problem with the math in my script that would get 1 page less when rounding to the floor.
Fixed this as well. Few more tests and I'll push a commit.
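The off-by-one described above is what happens when the page count is rounded down instead of up. A small sketch of the difference, assuming a page size of 100 items (the real page size used by the script is not stated in this thread):

```python
def page_count(total_items: int, page_size: int) -> int:
    """Number of pages needed to cover every item (rounds up, not down)."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return (total_items + page_size - 1) // page_size


def page_offsets(total_items: int, page_size: int) -> list:
    """Starting offset of each page request."""
    return [page * page_size for page in range(page_count(total_items, page_size))]


# Rounding down would give 2680 // 100 == 26 pages and skip the last 80 items.
print(page_count(2680, 100))
print(page_offsets(250, 100))
```

Integer arithmetic is used here instead of `math.ceil` on a float division, which avoids any floating-point surprises with very large counts.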

UltimaHoarder avatar UltimaHoarder commented on May 30, 2024

Pushed the commit. Reopen this issue, or open a new one, if you still have a problem.

kave333 avatar kave333 commented on May 30, 2024

@dilemmax is there any way I can contact you to discuss IDM further?

dilemmax avatar dilemmax commented on May 30, 2024

@kave333 sure can. my email address is [email protected]

dilemmax avatar dilemmax commented on May 30, 2024

UltimaHoarder wrote: "I also found a problem with the math in my script that would get 1 page less when rounding to the floor. Fixed this as well. Few more tests and I'll push a commit."

Thanks DigitalCriminal, I'll test it out now

dilemmax avatar dilemmax commented on May 30, 2024

@DIGITALCRIMINAL, I don't think I can reopen this issue since I'm not the OP. I'm still having the same problem. It's still scraping the same video links, but I think maybe the content creator posted the videos at some stage and then made them private? Just a theory.

UltimaHoarder avatar UltimaHoarder commented on May 30, 2024

Hmm, what's the total number of videos on the account, @dilemmax? I've noticed this with another account.

dilemmax avatar dilemmax commented on May 30, 2024

312 videos, but it only scraped 307 video links.

UltimaHoarder avatar UltimaHoarder commented on May 30, 2024

@dilemmax have you checked the archive.json to check if there are any invalid links?

dilemmax avatar dilemmax commented on May 30, 2024

Is archive.json in the same folder as links.json?

UltimaHoarder avatar UltimaHoarder commented on May 30, 2024

username/metadata/archive.json

dilemmax avatar dilemmax commented on May 30, 2024

okay thanks. Nope it didn't create one.

UltimaHoarder avatar UltimaHoarder commented on May 30, 2024

Are you using the latest release? The script doesn't create links.json anymore.

dilemmax avatar dilemmax commented on May 30, 2024

yeah I am. Sorry I meant videos.json

UltimaHoarder avatar UltimaHoarder commented on May 30, 2024

@dilemmax also, my bad. The archive.json is for a different module and images/videos/audio.json is correct. It seems like we have the same problem, but for me, it's for images. I'm going to look into it further now though to see what's happening.

dilemmax avatar dilemmax commented on May 30, 2024

Okay, no worries. I wish I had enough programming knowledge to help you troubleshoot this problem.

dilemmax avatar dilemmax commented on May 30, 2024

Do you think it has something to do with pinned posts? There are 4 pinned posts in mine but 307+4 doesn't equal 312

UltimaHoarder avatar UltimaHoarder commented on May 30, 2024

@dilemmax It's all good, you've helped a lot c:
And nah, it's not the pinned posts, because mine are in the .json.
I honestly just think that the posts are private. The missing content isn't even in the API request so there's nothing we can do...

UltimaHoarder avatar UltimaHoarder commented on May 30, 2024

@dilemmax
Added an option to export to csv instead of json in the latest release.

dilemmax avatar dilemmax commented on May 30, 2024

Thanks @DIGITALCRIMINAL! That would help a lot! While you're at it, is there any way to strip away text in angle brackets and Unicode text from the "text" column when scraping to a JSON/CSV file?

Currently, I open the json file in Word and use the wildcard search function to delete any text starting and ending with angle brackets, replace \n with a space etc. Then convert it to a spreadsheet and make the relevant changes before creating a batch file to rename the files with relevant titles.

Basically I have to do a lot before I end up with something like this

ren "5d9ec0518452aaf05bd2d.mp4" "[2019-10-11] 194 - HOOK UP P2 JAYJAY & MM ENJOY!.mp4"
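The manual cleanup described above could be scripted along these lines. This is a rough sketch of the workflow, not a feature of the scraper: the function names and the sample caption are made up, and it assumes "unicode text" means any non-ASCII character such as emoji.

```python
import re


def clean_caption(text: str) -> str:
    """Strip angle-bracket tags, newlines, non-ASCII and filename-unsafe characters."""
    text = re.sub(r"<[^>]*>", "", text)                     # drop <...> markup
    text = text.replace("\n", " ")                          # newlines become spaces
    text = text.encode("ascii", "ignore").decode("ascii")   # drop non-ASCII characters
    text = re.sub(r'[\\/:*?"<>|]', "", text)                # characters Windows forbids in names
    return re.sub(r"\s+", " ", text).strip()                # collapse leftover whitespace


def rename_command(old_name: str, date: str, index: int, caption: str) -> str:
    """Build one batch-file 'ren' line that keeps the original extension."""
    extension = old_name.rsplit(".", 1)[-1]
    return f'ren "{old_name}" "[{date}] {index} - {clean_caption(caption)}.{extension}"'


# Hypothetical row; real values would come from the exported json/csv columns.
print(rename_command("abc123.mp4", "2019-10-11", 194, "<p>Behind the scenes\npart 2 🎬</p>"))
```

Writing one such line per row into a .bat file gives the same result as the Word and spreadsheet steps, without the manual search and replace.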

dilemmax avatar dilemmax commented on May 30, 2024

Hope I make sense!

UltimaHoarder avatar UltimaHoarder commented on May 30, 2024

@dilemmax please open a new issue for this.

dilemmax avatar dilemmax commented on May 30, 2024

Will do :)
