Comments (33)

alvarobartt avatar alvarobartt commented on May 24, 2024 11

Still doesn't work for me. I think we should figure out how to bypass the Cloudflare check, because they will just keep turning on Cloudflare's web scraping protection. I saw there was a library and tried it (https://pythonlang.dev/repo/venomous-cloudscraper/), but it doesn't seem to work.

Instead of looking for ways to bypass their security measures, I'd wait for them to come back to me with an answer on the collaboration proposal, as that way we wouldn't be skipping their security and potentially breaking the terms of use of Investing.com.
But as I said before, when this issue first happened I tested those libraries to explore the issue further, as I didn't know it was Cloudflare at the beginning, and had no luck with any of them.
I didn't try Puppeteer yet, because it has way too many dependencies and I want to keep this package light and fast, and IMO that's not a clean way to develop investiny and/or investpy.

I mean, that would be great. I wouldn't even mind using their API or a paid API from their side, but they don't have one; they only offer a simple pro service with no data download possible.

I know @anarchy89, but I prefer to wait until they answer me, to see whether we can make investiny, investpy, or both an official product with access to Investing.com, or reach some sort of agreement to increase those limits or something similar. Anyway, I'll keep you all posted, as I've already been redirected to the team in charge of that, so let's hope the outcome is the best for the community 😄

from investiny.

alvarobartt avatar alvarobartt commented on May 24, 2024 6

Soooo the day came 😞 Investing.com protected all their APIs with Cloudflare... I've tried to contact them (resent them the email I sent them a couple of weeks ago) but no news from their side so still waiting!

FYI I've tried all the available APIs (tvc.investing.com, tvc4.investing.com, and tvc6.investing.com), and none of them are working, so unless we get a response from Investing.com we're either getting to the end of investpy (and now investiny) or we'll need to test JS-based solutions, which would make the package way heavier, and I'm not fully aware of how that works TBH!

alvarobartt avatar alvarobartt commented on May 24, 2024 3

Still doesn't work for me. I think we should figure out how to bypass the Cloudflare check, because they will just keep turning on Cloudflare's web scraping protection. I saw there was a library and tried it (https://pythonlang.dev/repo/venomous-cloudscraper/), but it doesn't seem to work.

Instead of looking for ways to bypass their security measures, I'd wait for them to come back to me with an answer on the collaboration proposal, as that way we wouldn't be skipping their security and potentially breaking the terms of use of Investing.com.

But as I said before, when this issue first happened I tested those libraries to explore the issue further, as I didn't know it was Cloudflare at the beginning, and had no luck with any of them.

I didn't try Puppeteer yet, because it has way too many dependencies and I want to keep this package light and fast, and IMO that's not a clean way to develop investiny and/or investpy.

joao-pm-santos96 avatar joao-pm-santos96 commented on May 24, 2024 2

Maybe this?: https://stackoverflow.com/a/72728221

anarchy89 avatar anarchy89 commented on May 24, 2024 1

I'm having this problem too it stopped working all of a sudden.

PogoRollo avatar PogoRollo commented on May 24, 2024 1

Yeah, started today. It's a Cloudflare challenge.
The actual response body is:

<!DOCTYPE html>
<html lang="en-US">
<head>
    <title>Just a moment...</title>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
    <meta http-equiv="X-UA-Compatible" content="IE=Edge" />
    <meta name="robots" content="noindex,nofollow" />
    <meta name="viewport" content="width=device-width,initial-scale=1" />
    <link href="/cdn-cgi/styles/challenges.css" rel="stylesheet" />
    

</head>
<body class="no-js">
    <div class="main-wrapper" role="main">
    <div class="main-content">
        <h1 class="zone-name-title h1">
            <img class="heading-favicon" src="/favicon.ico"
                 onerror="this.onerror=null;this.parentNode.removeChild(this)" />
            tvc4.investing.com
        </h1>
        <h2 class="h2" id="challenge-running">
            Checking if the site connection is secure
        </h2>
        <noscript>
            <div id="challenge-error-title">
                <div class="h2">
                    <span class="icon-wrapper">
                        <div class="heading-icon warning-icon"></div>
                    </span>
                    <span id="challenge-error-text">
                        Enable JavaScript and cookies to continue
                    </span>
                </div>
            </div>
        </noscript>
        <div id="trk_jschal_js" style="display:none;background-image:url('/cdn-cgi/images/trace/managed/nojs/transparent.gif?ray=75c30ff7bb5a6949')"></div>
        <div id="challenge-body-text" class="core-msg spacer">
            tvc4.investing.com needs to review the security of your connection before proceeding.
        </div>
        <form id="challenge-form" action="THE_ORIGINAL_REQUEST&amp;__cf_chl_f_tk=sSANp3JtMqsjoUh50CcPGPono94Pgq8_.7BGA8I3XKM-1666114860-0-gaNycGzNB2U" method="POST" enctype="application/x-www-form-urlencoded">
            <input type="hidden" name="md" value="i2FRFTE2du3Qco3QFnvnVH6.clW3hLi7rIRscIks2TM-1666114860-0-AcxJCTgnfWs_Qs4t78bTJ7zR19Nw4lWfdvIJeud2RRvvpSdF4TH2PehqmKoIphrQQlXmFrE0VL9Z5Z7suFKpcqfGvv6ctZmz1GJduxQU0aS66SjKgp7hovDYOG_ztWqJFOfLZYtpWY99YJOfYIGMhaJJy-_BrPvA3XD5gDjIA_h2wLQmleMM4gGPM8sP4TfzcrFYd8M8bBA_nDo76-ERHWgWNjmZuvJ1L7KZfzwmGsIodKaIXGQJKk6C1-pm1ZI_zIshwem03pHufykl8ARM43Y072hn1lL0XoxoiYdSnQMrExkDoOnX6Wp-WkQKGIcCaDclEFAkk7kZMSy8L8UCgb_IjW9Th8BIVrtWUElYtH61b8tanqpUUHcTB4E0FjNpMPcxpzVuqfhGgQcBMo3cPBDe2cZn5c5bOrmxIiOQ41ffU0ifXkXi0lrp-VUqRW_Qgk__QbbOWa5I0KBQhyMhZabVU5770I1LYLtVQtETweVJSXGnA1MrC2hUskCBhOj1gwzgcIpynlJqPf_aSArRfGGLahw-abD-Cy2QnAl0xtN0-YkTIouYTjEu3Z9WYelGrORWAdfQdDQHFWO0ZYlBsIn_4CUTu6ppqeWBHHPvrVRBuR3I08JYma28PH9v-z-tDsVm6Z-3TyTnhIg2VVe6Q8sbwHblVZz9Bn1suzePR4e52llJ928hugM5EXaZMoUI00qogHbkhHe_vk0ZgB3gSExR7DmmIGQAYNUQxv7K4pROCYT4Uo1Tp1vjC-w3W4DgSoUczpbQMTpMiQQo2viy6OdyKKBlo4td68IhXgSekLGUcxk6caJz_Kq6VXzDjOHGa2nvVXCk7T6ljBlUp5fa8sjtNKCrlhZpOdt3PAdBwKsOCT7_4aWyXTVSMVLkde7AIg0aDSsoO7m49QBuWSZxjJRZviMFsmC6XARLWlWUdzy53z58MwY_23bGG1Da8K-5Hg" />
            <input type="hidden" name="r" value="bsg4kbcXCUmlUjRsYoAkZTuub56rjGcWQE8YX1hY.O4-1666114860-0-AbCLxJDTqdDuai3VS4783VfnL0UHFlW3M4RW7bFmUT0TWldMizfI3GdgPles+wdN7hCBnO+DsjDblLLgGoSAsrWhy7LDMOLPBU9cFwImwuX2UcgPv6c7+ueGoLc7PHRgOeJjue4jJWBHgpjqSlIShfJLgMMCD1hZbVpMyVY0sxIr019+/HRbfL3lnJxC8W2+4a0kr6rTqPGn952I4Vyw6hBbVgposhygLwduu0khqMUtuQOHYC4QFD1j8h9lU9sHl4fZMa5E8ufZiiRvqiO2muR7lOi2TnFUx83VS06iApzfQTkTWBOpUAJbA3FddVVZpiwUVw+yxFQgpDO9Md9HHokktXMOMXMV656AwnXAsRD/YHDtkhkLYAap+ESP17fTSOqKvJ7YYfWW3IfBW3tziuGAVDxrZT3Nbjvbft8rN/ho0cWJZEd1dLncsVxF+G4ebuRx1q5WmLx5nJYH33XfSL4+M44ttcDcsO4bnJNADV9ODxITkPbPdzx8IBVGxEOKo4Yp+DrmO/R38IgKqrr3UMtvQbLIhoqojzSa7Jw/ZmPBYSzXwk3bQWLiqubDmSb5xXdOwOf7VqZg+L9Gh49+R0Br/2ZCqb8bg3BPygnbFGOaBhoM9RlYGat4CzgVm3ZLC9gOtkyRLkhz3o2niFOLvzmcvlJJlwXwWJMYAQshsvLxni85x/+XQRK/cGUF8B7btxdUaGdT0pWTCgZyhvayY2tXy4ddsLIW83p2sAdLSko7u4nq8g+3fQDUJVzsmV9sSQDF2Rl16aw3cy/UwWZyGkp7Zti+AX8gZabOBQUqwKvJRHWvOwNhZlTq521mNOlMOeBbXh5qaSp57vnYTUFP85kKZ6j03S7gHdbbGG2m3sFpbrOYuMciUX4NSSRAwlLP21+I9W5U7wFk/Rkr8WNc6o2j6LeczeSNS2PC8rcDAN0L4uhHzJHek41QvyFsZX5Bm0SLDQY0fZ0kNeh3v+FqWmdXa/gEzf91nP2kdiU23Q0xqlDeVgVY/I+M03zCzbGTp6p5X4iCf1kwGU2dBTGJk2vGJztiLUUCQFvEITnj58xuIAdflIbDIGLDXS7Ft1nVKy81pxI9IrEhbvc+QVNXW0Lah5SKKXqJwJg7PRyd95aSsriVQ/agISwr8BCNAbvrMA01eUO8jTCgkubwoVqeiDdZK2csFguP+KHn7mt3sZgvwocBsLITAzFGY+c22XwCzWhKtgKWI8/H717Nm/DWMqXuU0CAZIW7jEuvQ7p+RwU+73xRCt3CCRrDxplyIEVHbmE6WAW3DMA+fqCkK3h4P3+3Ol6wW8xzccCSTQhspm1yv26v57FDJe2a/itABooMRKxxz6fzwLmKAVhQH7+gkt+fnX9wQr3EHuQ0dFM6A4C/lgXHN+le6oUCx3uUPQkOA+2YiuaTNn/DhwyWYRDzBtbLBbUYCDKBz/2VMvBW/ph2Gw9l3yn7Mx5uYDHg3utP4+nkTRLHBTzrKV5tp0tCdXIC0g0Zs9UdASPlen5ACVxGwDmqGsR7k98aM7eSFGXe9Ls/UNPki5Y5jWH3RjlNkhZP3aHViY6ArdZg64wiOQ7GMF1jPBX2ZYZO7HxFmSxjS2Zg35XW0gG/ZCs+6nINwNW0lX9/3q7mMMM9BoxVOD/6wrjWu79htO78h/VU36ne04fqOnHqf2nD2joOGHojvAtyjn7+fyk87XwD6bVFQDLr6abpWx4ge+t9HRGCr2ZpmVhWWQvJU8vkOqsFUgwfqnd+tcf+S4MrHdnm4fEgqyUIXny8BptgVWJFxQ8rWt8EKflPUO8JiaFvqpQxd7kR+DU7IrsIxTvFoeth1TWiVGDW"/>
        </form>
    </div>
</div>
<script>
    (function(){
        window._cf_chl_opt={
            cvId: '2',
            cType: 'managed',
            cNounce: '82843',
            cRay: '75c30ff7bb5a6949',
            cHash: 'be597fde60719a5',
            cUPMDTk: "THE_ORIGINAL_REQUEST&__cf_chl_tk=sSANp3JtMqsjoUh50CcPGPono94Pgq8_.7BGA8I3XKM-1666114860-0-gaNycGzNB2U",
            cFPWv: 'b',
            cTTimeMs: '1000',
            cTplV: 4,
            cTplB: 'cf',
            cRq: {
                ru: 'aHR0cHM6Ly90dmM0LmludmVzdGluZy5jb20vMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAwMDAvMC8wLzAvMC9oaXN0b3J5P3N5bWJvbD1OQVNEQVErJTNBQUFDRyZyZXNvbHV0aW9uPUQmZnJvbT0xNjY1NzA1NjAwJnRvPTE2NjU5NjQ4MDA=',
                ra: 'TW96aWxsYS81LjAgKFdpbmRvd3MgTlQgMTAuMDsgV2luNjQ7IHg2NCkgQXBwbGVXZWJLaXQvNTM3LjM2IChLSFRNTCwgbGlrZSBHZWNrbykgQ2hyb21lLzEwNi4wLjAuMCBTYWZhcmkvNTM3LjM2IEVkZy8xMDYuMC4xMzcwLjM0',
                rm: 'R0VU',
                d: 'ciZO7j9M6V/+BOoLttyx/6/zvhioUhl3V8HJR5f4qtcwO9dzK+lS8h/HFRESEsMGJB+1mXmCUiXlKkhIOhxzO+1kzC3tOtpQitYnAKzgGJlYsarUBi4CJl33PrEiz0X0sY5GpIitN0thDqXwplEOps8LGNuis5yV/DL/9UF1uCQwg5Om4ZUor1GJx3LDh0WnKh2DnnT27IOcnbUWTsIhLrym2aCB5x8itCCwhgg9syMvQUePt5ZENxfN/ZNXVWdJSGys5j0ArzlamZpHEBpEGGwiizBP5eRLfbAoYp61nDqGF0HTcE5UVewVdV9DVsqJ365daZPv1aGEJE9+KreDXGXW+fdrmA0vR5KXBQhYF26rDcLuc/P28Y759ucTRyb4Q5RPbZlzMZm1CZuErVzdMIXshPWJljQwlSIu2KA13A/B3rfg1HVUXiIY4Lav82+LY0gcX4psQ9zZctpBm52bt1YzIyys5l48JnwINBkGCwcnGiPSZJK5FrrkmZVh3N+CIEo9XYBURG9MOGXdzMNNjBT3I9ThSCi/PwlHQAW+MrMCkn88BidoPCNBWAZw+/ySxyx0HVTOrzssW1lhQi4Jcb5Z+gOgXD8esdMOeDxUigt181VPKJUzt549IGw68kxC',
                t: 'MTY2NjExNDg2MC43ODEwMDA=',
                m: 'aE7puev+CE8d/wG+uUnXqlSR38vGD/7nUVnRmrSO93M=',
                i1: '89LHcnKR77lGOKK2z/HK1w==',
                i2: 'pHrhEoXPZxQUcKsdQNMNiA==',
                zh: 'JJQg2KI/+bPgJbLHlLjmrs/mnno8aAGH5k3tm8QDk4c=',
                uh: 'ndhEe3dibHzXHi71SzbRYjwpKAQEbkeBd4r+hDwx6tA=',
                hh: '+t0nuRZbzd0CcVt9ZZq81dRfWmk/KHIWqVnFfjx7L+s=',
            }
        };
        var trkjs = document.createElement('img');
        trkjs.setAttribute('src', '/cdn-cgi/images/trace/managed/js/transparent.gif?ray=75c30ff7bb5a6949');
        trkjs.setAttribute('style', 'display: none');
        document.body.appendChild(trkjs);
        var cpo = document.createElement('script');
        cpo.src = '/cdn-cgi/challenge-platform/h/b/orchestrate/managed/v1?ray=75c30ff7bb5a6949';
        window._cf_chl_opt.cOgUHash = location.hash === '' && location.href.indexOf('#') !== -1 ? '#' : location.hash;
        window._cf_chl_opt.cOgUQuery = location.search === '' && location.href.slice(0, -window._cf_chl_opt.cOgUHash.length).indexOf('?') !== -1 ? '?' : location.search;
        if (window.history && window.history.replaceState) {
            var ogU = location.pathname + window._cf_chl_opt.cOgUQuery + window._cf_chl_opt.cOgUHash;
            history.replaceState(null, null, "THE_ORIGINAL_REQUEST&__cf_chl_rt_tk=sSANp3JtMqsjoUh50CcPGPono94Pgq8_.7BGA8I3XKM-1666114860-0-gaNycGzNB2U" + window._cf_chl_opt.cOgUHash);
            cpo.onload = function() {
                history.replaceState(null, null, ogU);
            };
        }
        document.getElementsByTagName('head')[0].appendChild(cpo);
    }());
</script>

    <div class="footer" role="contentinfo">
        <div class="footer-inner">
            <div class="clearfix diagnostic-wrapper">
                <div class="ray-id">Ray ID: <code>75c30ff7bb5a6949</code></div>
            </div>
            <div class="text-center">Performance &amp; security by <a rel="noopener noreferrer" href="https://www.cloudflare.com?utm_source=challenge&utm_campaign=m" target="_blank">Cloudflare</a></div>
        </div>
    </div>
</body>
</html>

Note: I replaced the original request with THE_ORIGINAL_REQUEST.
It's simply something like: /XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX/X/X/X/X/history?XXX
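As a side note on the challenge page above: some of the short `cRq` fields appear to be plain base64. A minimal sketch, assuming only the two values copied from the page (`rm` and `t`); the helper name `decode_field` is made up for illustration:

```python
import base64

# Values copied verbatim from the `cRq` object in the challenge page above.
CRQ_FIELDS = {
    "rm": "R0VU",
    "t": "MTY2NjExNDg2MC43ODEwMDA=",
}


def decode_field(value: str) -> str:
    """Decode a base64 field into text (hypothetical helper, not part of investiny)."""
    return base64.b64decode(value).decode("utf-8")


decoded = {name: decode_field(value) for name, value in CRQ_FIELDS.items()}
print(decoded["rm"])  # the HTTP method of the challenged request
print(decoded["t"])   # a Unix timestamp with a fractional part
```

The `ru` and `ra` fields can be passed through the same helper; the longer fields such as `d` are probably not plain text.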

Cloudflare challenges are JavaScript challenges. If this continues, investiny would need to solve them, either manually with a JS engine, with a headless browser (such as Puppeteer), or with a solution like https://github.com/FlareSolverr/FlareSolverr (which also seems to be outdated).

Edit: Scratch that. Maybe it's actually a header that Cloudflare doesn't like, because it works in the browser in incognito mode (without prompting a Cloudflare challenge).
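To tell a Cloudflare challenge apart from a genuine API error, the response body can be checked for the markers visible in the page above. A rough sketch, assuming the challenge keeps returning 403 with this template; the function name and marker list are illustrative, not part of investiny:

```python
from typing import Tuple

# Markers taken from the challenge page quoted above. Cloudflare may change
# its template at any time, so treat this list as an assumption.
CHALLENGE_MARKERS: Tuple[str, ...] = (
    "<title>Just a moment...</title>",
    'id="challenge-form"',
    "window._cf_chl_opt",
)


def looks_like_cloudflare_challenge(
    status_code: int, body: str, blocked_statuses: Tuple[int, ...] = (403,)
) -> bool:
    """Heuristic: a blocked status plus at least one known marker in the body."""
    if status_code not in blocked_statuses:
        return False
    return any(marker in body for marker in CHALLENGE_MARKERS)
```

With httpx this could be called as `looks_like_cloudflare_challenge(r.status_code, r.text)` before trying to parse the response as JSON.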

alvarobartt avatar alvarobartt commented on May 24, 2024 1

I've checked again now, and it's not working again... Sorry for the confusion, it worked for me for around 50 requests more or less!

alvarobartt avatar alvarobartt commented on May 24, 2024 1

I'll test again tomorrow, but it seems that Investing.com is blacklisting some IPs temporarily more often than before, which means that you can send some requests but then you're blocked (Cloudflare challenges your IP address...)

Then cloudscraper is probably the way forward. I think they support v2 for cloudflare.

Also there are other options available; not verified though

Hi @hajimebusuzima96 I tested cloudscraper in the past when this issue first happened and it was not supporting Cloudflare V2, it was just working for some V1 challenges... So sadly that's not an option 😞

RyuuOujiXS avatar RyuuOujiXS commented on May 24, 2024 1

Investing.com will most likely never loosen their policy on bots unless they're paid to do so. Bots use the resources of investing.com without viewing their ads, meaning investing.com effectively loses money to bots. A Selenium or other browser-based solution might work by generating a different fingerprint for Cloudflare, which may be whitelisted, but that would mean heavier local resource usage from the bot. Obviously, a lighter HTTP client such as requests or httpx is preferred, but "you gotta do what you gotta do".

typhoon71 avatar typhoon71 commented on May 24, 2024 1

@alvarobartt
Suggestion: they could allow API requests from registered users, which would be easy for them to monitor.
For example, every user would have their own API key, and so on (like other services).

They do allow some level of "free" use of their site (think of people behind a firewall that blocks ads and such), but they don't want massive scraping, which costs resources (and could come from a competitor).

Guvalle avatar Guvalle commented on May 24, 2024

Is there any way at all that we can help? Maybe we can try sending them emails too or something like that?

hamzaahmedzia1 avatar hamzaahmedzia1 commented on May 24, 2024

Maybe cloudscraper can help.

Also someone succeeded

Here is a minimal tutorial

alvarobartt avatar alvarobartt commented on May 24, 2024

Is there any way at all that we can help? Maybe we can try sending them emails too or something like that?

I was just contacted by them this morning. They've redirected me to the proper team, but told me in advance that removing Cloudflare on their side is not an option, as it was added for security reasons, so I'll keep you all posted! 🤗

alvarobartt avatar alvarobartt commented on May 24, 2024

Maybe cloudscraper can help.

Also someone succeeded

Here is a minimal tutorial

I'll try those later today, thanks! Anyway I think that just works with Cloudflare v1, and Investing.com is using Cloudflare v2, so it probably won't work... but I'll test it anyway 😄

alvarobartt avatar alvarobartt commented on May 24, 2024

Soooo I've re-run the tests and it seems to be working?

@anarchy89 @hamzaahmedzia1 @joao-pm-santos96 @Guvalle @PogoRollo can anyone confirm? Thanks 🤗

GSLabIt avatar GSLabIt commented on May 24, 2024

Soooo I've re-run the tests and it seems to be working?

@anarchy89 @hamzaahmedzia1 @joao-pm-santos96 @Guvalle @PogoRollo can anyone confirm? Thanks 🤗

no changes needed?

PogoRollo avatar PogoRollo commented on May 24, 2024

Is there any way at all that we can help? Maybe we can try sending them emails too or something like that?

I was just contacted by them this morning. They've redirected me to the proper team, but told me in advance that removing Cloudflare on their side is not an option, as it was added for security reasons, so I'll keep you all posted! 🤗

It can be tuned down without completely disabling it (a lower Security Level in Cloudflare lets visitors with higher Threat Scores through without a challenge).
They have been behind Cloudflare for quite a while now (since April 2021 for the main site and, from what I could find about the API, at least since April of this year).
Something changed yesterday. It's safe to assume the Security Level in their Cloudflare setup has increased, either manually or automatically.
https://support.cloudflare.com/hc/en-us/articles/200170056-Understanding-the-Cloudflare-Security-Level

Soooo I've re-run the tests and it seems to be working?

@anarchy89 @hamzaahmedzia1 @joao-pm-santos96 @Guvalle @PogoRollo can anyone confirm? Thanks 🤗

Doesn't seem to be working for me. Still getting Cloudflare challenges.

alvarobartt avatar alvarobartt commented on May 24, 2024

Soooo I've re-run the tests and it seems to be working?
@anarchy89 @hamzaahmedzia1 @joao-pm-santos96 @Guvalle @PogoRollo can anyone confirm? Thanks 🤗

no changes needed?

No @GSLabIt, just use investiny v0.7.2 and it should work fine, it's working on my end... Maybe Investing.com is just blocking you every N requests a day @PogoRollo? Not sure, I checked around 1 hour ago and it was working fine.

alvarobartt avatar alvarobartt commented on May 24, 2024

I'll test again tomorrow, but it seems that Investing.com is blacklisting some IPs temporarily more often than before, which means that you can send some requests but then you're blocked (Cloudflare challenges your IP address...)
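If the block really is temporary, one polite option is to back off and retry later instead of hammering the API. A minimal sketch, assuming the failed request surfaces as the built-in `ConnectionError` (as in the error message reported in this thread); `call_with_backoff`, its defaults, and the injectable `sleep` are illustrative names, not part of investiny:

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fetch: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fetch`, waiting base_delay * 2**attempt seconds after each failure."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except ConnectionError:
            # Give up after the last attempt and surface the original error.
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

Usage could look like `call_with_backoff(lambda: historical_data(investing_id=6408))`. Whether waiting actually lifts the block depends on how Investing.com has configured Cloudflare, which nobody here has confirmed.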

luuducthi avatar luuducthi commented on May 24, 2024

Same error for me:
"ConnectionError: Request to Investing.com API failed with error code: 403."

hamzaahmedzia1 avatar hamzaahmedzia1 commented on May 24, 2024

I'll test again tomorrow, but it seems that Investing.com is blacklisting some IPs temporarily more often than before, which means that you can send some requests but then you're blocked (Cloudflare challenges your IP address...)

Then cloudscraper is probably the way forward. I think they support v2 for cloudflare.

Also there are other options available; not verified though

anarchy89 avatar anarchy89 commented on May 24, 2024

Still doesn't work for me. I think we should figure out how to bypass the Cloudflare check, because they will just keep turning on Cloudflare's web scraping protection. I saw there was a library and tried it (https://pythonlang.dev/repo/venomous-cloudscraper/), but it doesn't seem to work.

anarchy89 avatar anarchy89 commented on May 24, 2024

Still doesn't work for me. I think we should figure out how to bypass the Cloudflare check, because they will just keep turning on Cloudflare's web scraping protection. I saw there was a library and tried it (https://pythonlang.dev/repo/venomous-cloudscraper/), but it doesn't seem to work.

Instead of looking for ways to bypass their security measures, I'd wait for them to come back to me with an answer on the collaboration proposal, as that way we wouldn't be skipping their security and potentially breaking the terms of use of Investing.com.

But as I said before, when this issue first happened I tested those libraries to explore the issue further, as I didn't know it was Cloudflare at the beginning, and had no luck with any of them.

I didn't try Puppeteer yet, because it has way too many dependencies and I want to keep this package light and fast, and IMO that's not a clean way to develop investiny and/or investpy.

I mean, that would be great. I wouldn't even mind using their API or a paid API from their side, but they don't have one; they only offer a simple pro service with no data download possible.

alvarobartt avatar alvarobartt commented on May 24, 2024

So I've just tested investiny and it seems to be working fine again... I assume their Cloudflare has some limitations but it's not blacklisting every IP forever, just after a certain number of requests...

Look:

(investiny-py3.9) alvarobartt@Alvaros-MacBook-Air investiny % poetry run python
Python 3.9.6 (default, Aug  5 2022, 15:21:02) 
[Clang 14.0.0 (clang-1400.0.29.102)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from investiny import historical_data
>>> d = historical_data(investing_id=6408)
>>> d
{'date': ['09/26/2022', '09/27/2022', '09/28/2022', '09/29/2022', '09/30/2022', '10/03/2022', '10/04/2022', '10/05/2022', '10/06/2022', '10/07/2022', '10/10/2022', '10/11/2022', '10/12/2022', '10/13/2022', '10/14/2022', '10/17/2022', '10/18/2022', '10/19/2022', '10/20/2022', '10/21/2022'], 'open': [149.66000366211, 152.74000549316, 147.63999938965, 146.10000610352, 141.2799987793, 138.21000671387, 145.0299987793, 144.07499694824, 145.80999755859, 142.53999328613, 140.41999816895, 139.89999389648, 139.13000488281, 134.99000549316, 144.30999755859, 141.06500244141, 145.49000549316, 141.69000244141, 143.02000427246, 142.96000671387], 'high': [153.7700958252, 154.7200012207, 150.64140319824, 146.7200012207, 143.10000610352, 143.07000732422, 146.2200012207, 147.38000488281, 147.53999328613, 143.10000610352, 141.88999938965, 141.35000610352, 140.36000061035, 143.58999633789, 144.52000427246, 142.89999389648, 146.69999694824, 144.94920349121, 145.88999938965, 147.83999633789], 'low': [149.63999938965, 149.94500732422, 144.83999633789, 140.67999267578, 138, 137.68499755859, 144.25999450684, 143.00999450684, 145.2200012207, 139.44500732422, 138.57290649414, 138.2200012207, 138.16000366211, 134.36999511719, 138.19000244141, 140.27000427246, 140.61000061035, 141.5, 142.64999389648, 142.67999267578], 'close': [150.77000427246, 151.75999450684, 149.83999633789, 142.47999572754, 138.19999694824, 142.44999694824, 146.10000610352, 146.39999389648, 145.42999267578, 140.08999633789, 140.41999816895, 138.97999572754, 138.33999633789, 142.99000549316, 138.38000488281, 142.41000366211, 143.75, 143.86000061035, 143.38999938965, 147.27000427246], 'volume': [93339000, 84443000, 146691008, 128138000, 124925000, 114312000, 87134000, 79148000, 68402000, 85926000, 74591000, 77034000, 69833000, 112876000, 88237000, 84684000, 98716000, 61758000, 64277000, 85641896]}

marcnshapiro avatar marcnshapiro commented on May 24, 2024

I'm still getting '403' errors right from the start.

InnovArul avatar InnovArul commented on May 24, 2024

The '403' error appears to me as well right from the start.

alvarobartt avatar alvarobartt commented on May 24, 2024

Ok @marcnshapiro @InnovArul let me re-trigger the CI/CD pipeline to check whether it's working or not, as for me it's working fine locally!

thesoundhead avatar thesoundhead commented on May 24, 2024

Hello @alvarobartt, and thank you for working on both investpy and investiny. I have used investpy in the past and was quite happy with it until this error occurred. I then thought that investiny would solve these issues. Can you please give us a timeline on if/when this will be fixed? Also, should one use investpy or investiny in the future? Thank you!

thrasher456 avatar thrasher456 commented on May 24, 2024

@alvarobartt I was getting 403 too, but changing the User-Agent header solved it.
Try manually setting the User-Agent header to "cloudflare" and it will bypass the 403.

alvarobartt avatar alvarobartt commented on May 24, 2024

@alvarobartt I was getting 403 too, but changing the User-Agent header solved it. Try manually setting the User-Agent header to "cloudflare" and it will bypass the 403.

Hi @thrasher456, I'm happy to know more about this issue, would you mind sharing a code-snippet in Python so that I can try to replicate? Thanks in advance!

thrasher456 avatar thrasher456 commented on May 24, 2024

@alvarobartt so the main change I made was to set the User-Agent to "cloudflare" in utils.py and to send a Cloudflare cookie as well, like below:

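from typing import Any, Dict, List, Literal, Union
from uuid import uuid4

import httpx


def request_to_investing(
    endpoint: Literal["history", "search", "quotes", "symbols"], params: Dict[str, Any]
) -> Union[Dict[str, Any], List[Dict[str, Any]]]:
    """Sen<!--SNIPPED--!>
        A dictionary with the response from Investing.com API.
    """
    url = f"https://tvc6.investing.com/{uuid4().hex}/0/0/0/0/{endpoint}"
    # User-Agent set to the literal string "cloudflare".
    headers = {"User-Agent": "cloudflare"}
    # `__cf_bm` value copied once from a browser session.
    cookies = {"__cf_bm": "fowl8pLjW8XNVXckxyGjCnQIDGCU0YMIJjn3d8ha6zg-1694288042-0-Aam2fhQupUX6adOIA+RflfgodVXk7XWh8thb5Cn7LLhUJ0FD41MtC8FcKfodGsn27sz2nEDuY17rHzipF47Ii8Y="}
    # Note: verify=False disables TLS certificate verification.
    r = httpx.get(url, params=params, headers=headers, verify=False, cookies=cookies)
    if r.status_code != 200:
        # The posted snippet ends at the `if` above. The lines below are an assumed
        # continuation, based on the error message reported earlier in this thread.
        raise ConnectionError(
            f"Request to Investing.com API failed with error code: {r.status_code}."
        )
    return r.json()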

One idea is to use httpx.Cookies to save the "__cf_bm" cookie and refresh it with every request, which should make sure we always pass the Cloudflare check. But the initial __cf_bm cookie must be supplied by the user at least once; we can then store the refreshed __cf_bm cookie from each response.
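The refresh step of that idea can be sketched with the standard library alone. This is only an illustration of the bookkeeping: the function name `refresh_cookies` is made up, and nothing here has been checked against Cloudflare's actual behaviour. It takes the currently stored cookies plus the `Set-Cookie` header values of a response and returns the updated store:

```python
from http.cookies import SimpleCookie
from typing import Dict, Iterable


def refresh_cookies(
    stored: Dict[str, str],
    set_cookie_headers: Iterable[str],
    name: str = "__cf_bm",
) -> Dict[str, str]:
    """Return a copy of `stored` where `name` is replaced by any newer value."""
    updated = dict(stored)
    for header in set_cookie_headers:
        jar = SimpleCookie()
        jar.load(header)  # parses "name=value; Path=/; ..." into morsels
        if name in jar:
            updated[name] = jar[name].value
    return updated
```

With httpx, a single `httpx.Client` instance already keeps a cookie jar across requests, so seeding it with the initial `__cf_bm` value and reusing the client may be enough without any manual parsing.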

Attaching utils.py & tmp.py with txt extension
tmp.txt
utils.txt

typhoon71 avatar typhoon71 commented on May 24, 2024

I guess we'll need to pass "startup" cookies and headers to investpy, then internally investpy will have to update them for every request?

zh1cheng avatar zh1cheng commented on May 24, 2024

I guess Cloudflare is checking the TLS fingerprint. A 403 will come up if the TLS fingerprint is disabled in Cloudflare's bot manager. None of the major web browsers uses OpenSSL, which is what Python uses.

If that is the case, do you think we can bypass the Cloudflare bot check by developing a tool that uses BoringSSL (the same as Chrome)?
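For anyone who wants to see what their own interpreter brings to the TLS handshake, the standard `ssl` module exposes the linked library and the default cipher list. A small sketch (the function name `describe_python_tls` is illustrative); fingerprinting schemes such as JA3 hash ClientHello fields including the cipher suite order, which is why a different TLS library tends to produce a different fingerprint:

```python
import ssl
from typing import Any, Dict


def describe_python_tls() -> Dict[str, Any]:
    """Report the TLS library Python is linked against and its default ciphers."""
    context = ssl.create_default_context()
    return {
        "library": ssl.OPENSSL_VERSION,
        "cipher_names": [cipher["name"] for cipher in context.get_ciphers()],
    }


info = describe_python_tls()
print(info["library"])
print(info["cipher_names"][:5])
```

The output varies between Python builds, so no expected values are shown here.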
