arsenic's People

Contributors

cjlcarvalho, dimaqq, garyvdm, konstunn, mjarosie, ojii, phyrwork, silverdoses, tarekziade, theinvisiblerabbit, tnek, trbs, usrlocalben, velikiinehochuha

arsenic's Issues

How to interact with frames?

I need to click on a checkbox inside of an iframe. How can I do that? This code gets an empty list:

await session.wait_for_element(float('inf'), 'iframe')
children = await session.get_elements('iframe *')

Selenium's Python WebDriver library lets you do this by first switching to the iframe. Is this supported?

Using with aiohttp.test_utils.AioHTTPTestCase

When running AioHTTPTestCase there is no event loop assigned to the main thread.

Is it compulsory in this case to call asyncio.set_event_loop(loop) to use arsenic, or is there a way to pass the loop to arsenic explicitly?

Chrome doesn't save its state correctly

Hello!

How to reproduce the issue:

  1. Open a Chrome session with user-data-dir=FOLDER.
    I do it like this:
        self.chrome_log = open('chrome.log', 'w')
        self.service = services.Chromedriver(log_file=self.chrome_log)
        self.browser = browsers.Chrome(chromeOptions={
            'args': ['--headless', '--disable-gpu', '-lang=ru', 
                     "--user-data-dir={}".format(self.user_dir), 
                     '--window-size={},{}'.format(WINDOW_WIDTH, WINDOW_HEIGHT)]
        })
  2. Visit some website and sign in there.
  3. Restart the script and visit this website again.

The issue: I expect to be signed in on the second visit, but I am not. The folder for user data (history, bookmarks, cookies, and other per-installation local state) is created, however.

Interestingly, if I sign in with Selenium and then visit the website with arsenic using the same user-data-dir, I am signed in. This means that Chrome loads the profile correctly.

UPD: The same happens when I try to sign out. After restarting the session I am still signed in, although I signed out successfully during the previous session.

chrome.log

Starting ChromeDriver 2.45.615279 (12b89733300bd268cff3b78fc76cb8f3a7cc44e5) on port 34409
Only local connections are allowed.

get_rect exception

get_rect raises an exception when the value is 607.984px:

raise ValueError(f"{original!r} is not an int or <int>px value")
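
A minimal sketch of a more tolerant parser, assuming (from the error message only) that the failing helper accepts just integers and `<int>px` strings. The name `parse_px` is hypothetical and is not arsenic's actual function; rounding fractional pixel values to the nearest integer is one possible choice, not necessarily the right one for every caller.

```python
def parse_px(original):
    """Convert an int, a float, or a '<number>px' string to a whole number of pixels."""
    if isinstance(original, (int, float)):
        return round(original)
    text = str(original).strip()
    if text.endswith("px"):
        text = text[:-2]
    try:
        # float() accepts both "607" and "607.984"
        return round(float(text))
    except ValueError:
        raise ValueError(f"{original!r} is not a number or <number>px value") from None
```

With this approach a fractional value such as `607.984px` is rounded instead of rejected.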

ImportError: cannot import name 'get_session' from partially initialized module 'arsenic' (most likely due to a circular import)

Code:

import asyncio
import sys
from arsenic import get_session, browsers, services

if sys.platform.startswith('win'):
    GeckoDriver = 'C:\\Users\\Zandrio\\AppData\\Local\\Programs\\Python\\Python38\\geckodriver.exe'
else:
    GeckoDriver = './geckodriver.exe'    

service = services.Geckodriver(binary=GeckoDriver)
browser = browsers.Firefox(firefoxOptions={ 'args': ['-headless']} )    

async def browser_object():
    async with get_session(service, browser) as session:
        await session.get('https://google.com/')
        await asyncio.sleep(10)


def main():
    loop = asyncio.get_event_loop()
    loop.run_until_complete(browser_object())


if __name__ == "__main__":
    main()

Terminal Output:

PS C:\Users\Zandrio> & C:/Users/Zandrio/AppData/Local/Programs/Python/Python38/python.exe "c:/Users/Zandrio/Documents/Advanced Project/arsenic.py"
Traceback (most recent call last):
  File "c:/Users/Zandrio/Documents/Advanced Project/arsenic.py", line 3, in <module>
    from arsenic import get_session, browsers, services
  File "c:\Users\Zandrio\Documents\Advanced Project\arsenic.py", line 3, in <module>
    from arsenic import get_session, browsers, services
ImportError: cannot import name 'get_session' from partially initialized module 'arsenic' (most likely due to a circular import) (c:\Users\Zandrio\Documents\Advanced Project\arsenic.py)
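
The traceback shows that the script itself is saved as arsenic.py, so `from arsenic import ...` resolves to the script rather than to the installed package. Renaming the script is the usual way out. A small sketch of a check for this situation; the helper name `shadows_package` is hypothetical.

```python
from pathlib import Path


def shadows_package(script_path, package_name):
    """Return True if the script's file name would shadow the named package on import."""
    return Path(script_path).stem == package_name
```

For the path in the report, `shadows_package("c:/Users/Zandrio/Documents/Advanced Project/arsenic.py", "arsenic")` is true, which matches the "partially initialized module" message.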

Remote IE Session broken in dev8

Remote sessions (at least for IE?) need CompatSession. But local IE needs normal Session. Need to find a good solution for this.

Starting up Headless Firefox launches the GUI

Hi, I recently started using Arsenic because it's asynchronous. I'm comfortable with the regular WebDrivers, so I started with a headless one. I created a Firefox session with the argument that makes the browser run headless. When the script starts, the log shows that the argument was indeed passed, but the Firefox GUI still opens. Is there a fix for this, or is it a local issue?
Here is my code:

import asyncio
import sys

from arsenic import get_session, keys, browsers, services

if sys.platform.startswith('win'):
    GECKODRIVER = './geckodriver.exe'
else:
    GECKODRIVER = './geckodriver'


async def hello_world():
    service = services.Geckodriver()
    browser = browsers.Firefox(firefoxOptions={'args': ['-headless']})
    async with get_session(service, browser) as session:
        ...  # rest of the session code elided in the original report


def main():
    loop = asyncio.get_event_loop()
    loop.run_until_complete(hello_world())


if __name__ == '__main__':
    print("Starting!")
    main()

execute_async_script

Let's add this to the session class:

    async def execute_async_script(self, script: str, *args: Any):
        return await self._request(
            url="/execute/async",
            method="POST",
            data={"script": script, "args": list(args)},
        )

PhantomJS: ignore stdin

By default, PhantomJS reads from its stdin. That causes problems with the Python debugger and other interactive front-ends.

Please explicitly re-open /dev/null as PhantomJS' standard input.
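
A minimal sketch of the requested behaviour using plain asyncio, assuming the service is started through `asyncio.create_subprocess_exec`. The helper name `run_detached_from_stdin` is hypothetical, and a Python child process stands in for PhantomJS so the example runs without it.

```python
import asyncio
import sys


async def run_detached_from_stdin(*cmd):
    """Run a command with stdin connected to the null device; return its stdout bytes."""
    process = await asyncio.create_subprocess_exec(
        *cmd,
        stdin=asyncio.subprocess.DEVNULL,  # the child can no longer read the terminal
        stdout=asyncio.subprocess.PIPE,
    )
    stdout, _ = await process.communicate()
    return stdout


# Stand-in child: reading from the null device returns end-of-file immediately.
CHILD = [sys.executable, "-c", "import sys; print(repr(sys.stdin.read()))"]
output = asyncio.run(run_detached_from_stdin(*CHILD))
```

Because the child sees an empty stdin, it cannot compete with a debugger for keyboard input.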

Can't use mozilla firefox capabilities

Hello,

I have two problems with the Firefox driver:

  • Even when passing high log levels in the capabilities, like so:
    {"moz:firefoxOptions": {"log" : {"level" : "error"}}}
    or like so:
    {"moz:firefoxOptions": {"log" : {"level" : "fatal"}}}
    I still get logs of lower levels, such as WARN (although I noticed that the INFO messages are gone).

  • When passing a profile as an argument, like so:
    {"moz:firefoxOptions": {"args": ["-headless", "-profile", "path/to/profile"]}}
    the profile is loaded correctly but nothing happens afterwards (the session doesn't seem to start and my elements are not fetched).

Thanks for your time

Decrease output to stdout

Hello, I've already seen the earlier issue about this, but even with the code from there I have no success on Windows 10 or Ubuntu 18. Can anyone help me reduce the output?
The code I'm using is:

async def parser():
    import os
    browser = browsers.Chrome(chromeOptions={'args': ['--whitelisted-ips', '--headless', '--disable-gpu', '--no-sandbox']})
    async with get_session(services.Chromedriver(log_file=os.devnull), browser) as session:
        await session.get("https://www.google.com")

Can't find phantomJS browser

---=== Python Code ===---
bin = 'D:\bin\phantomjs-1.9.8-windows\phantomjs.exe'
service = services.PhantomJS(binary=bin)
browser = browsers.PhantomJS(binary=bin)

Throws:
FileNotFoundError: [WinError 2] The system cannot find the file specified
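
One possible cause, stated as an assumption rather than a diagnosis: the path is written as a normal string literal, and in Python `\b` is the backspace character, so `'D:\bin\...'` does not contain the text `\bin`. The binary may of course also simply be missing. A short sketch of the difference; the helper `looks_mangled` is hypothetical.

```python
# In a normal string literal "\b" is one backspace character, not backslash + "b".
escaped = "D:" + "\b" + "in"  # what the literal 'D:\bin' evaluates to
raw = r"D:\bin"               # a raw string keeps the backslash


def looks_mangled(path):
    """Return True if the path contains control characters left by escape sequences."""
    return any(ord(ch) < 32 for ch in path)
```

Writing the path as a raw string (`r'D:\bin\...'`) or with forward slashes avoids the problem.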

Support for more asynchronous IO frameworks

Just ran into this project, looking forward to using it!

A variety of event loop libraries (currently asyncio, trio, curio) can be supported by coding against AnyIO instead of exclusively against asyncio. How would you feel about supporting these libraries?

Remove docker-compose

The new "workflow" system doesn't need docker-compose; we should go back to using plain docker + workflow.

Chrome Headless HTML broken with Proxy

from arsenic import get_session
from arsenic.browsers import Chrome
from arsenic.services import Chromedriver
from os import devnull
from async_timeout import timeout

service = Chromedriver(log_file=devnull)
browser = Chrome(chromeOptions={ 'args': ['--headless', '--disable-gpu', '--hide-scrollbars', '--window-size=1920,1080', '--disable-gpu', '--remote-debugging-port=9222', '--proxy-server="socks5://127.0.0.1:9050"', '--host-resolver-rules="MAP * ~NOTFOUND , EXCLUDE localhost"' ] })
try:
    async with timeout(10):
        async with get_session(service, browser) as session:
            await session.get("https://example.com")
            print(await session.get_page_source())
except asyncio.TimeoutError:
    print("Took too long to take screenshot.")

This code always returns
<html xmlns="http://www.w3.org/1999/xhtml"><head></head><body></body></html> and screenshots are completely white.

However, it works without the proxy option. When we tried switching to Firefox we got this:

from arsenic import get_session
from arsenic.browsers import Firefox
from arsenic.services import Geckodriver
from os import devnull
from async_timeout import timeout

service = Geckodriver(log_file=devnull)
browser = Firefox(firefoxOptions={ 'args': ['-headless'] })
try:
    async with timeout(10):
        async with get_session(service, browser) as session:
            await session.get("http://idlerpg.fun")
            image = await session.get_screenshot()
except asyncio.TimeoutError:
    return print("Took too long to take screenshot.")
image.seek(0)

2018-08-14 14:12.33 request body={"desiredCapabilities": {"browserName": "firefox", "marionette": true, "acceptInsecureCerts": true, "firefoxOptions": {"args": ["-headless"]}}} method=POST url=http://localhost:55423/session
2018-08-14 14:12.33 response body={"desiredCapabilities": {"browserName": "firefox", "marionette": true, "acceptInsecureCerts": true, "firefoxOptions": {"args": ["-headless"]}}} data={'value': {'error': 'unknown error', 'message': 'Process unexpectedly closed with status 1', 'stacktrace': ''}} method=POST response=<ClientResponse(http://localhost:55423/session) [500 Internal Server Error]>
<CIMultiDictProxy('Content-Type': 'application/json; charset=utf-8', 'Cache-Control': 'no-cache', 'Content-Length': '105', 'Date': 'Tue, 14 Aug 2018 14:12:33 GMT')>
 url=http://localhost:55423/session
Traceback (most recent call last):
  File "/home/travitia/production/Travitia/cogs/owner.py", line 102, in _eval
    ret = await func()
  File "<string>", line 12, in func
  File "/usr/local/lib/python3.6/dist-packages/arsenic/__init__.py", line 16, in __aenter__
    self.session = await start_session(self.service, self.browser, self.bind)
  File "/usr/local/lib/python3.6/dist-packages/arsenic/__init__.py", line 29, in start_session
    return await driver.new_session(browser, bind=bind)
  File "/usr/local/lib/python3.6/dist-packages/arsenic/webdriver.py", line 57, in new_session
    raise SessionStartError(err_resp['error'], err_resp.get('message', ''), original_response)
arsenic.errors.SessionStartError: unknown error: Process unexpectedly closed with status 1

Any idea how to fix this? Chrome is v 68 (latest)

Setting cookie on headless Chrome not working

For my application I'm having the user authenticate themselves in a visible Chrome browser session and saving the authentication cookie. I need to add that cookie to a different headless session that will be used to do the actual work. I have a manager script that's checking if the authentication is still valid on a regular basis (by trying to get a specific element that only exists when logged in), with it relaunching a visible browser session if the user logs out.

Unfortunately, my application is stuck in a loop of launching a visible session A, getting the cookie, setting the cookie on the headless session B, checking that authentication is still valid, and relaunching the visible session A to re-request authentication. This seems to mean that either the cookie isn't being set properly or the headless session isn't able to find the element using get_element.

What's strange is that when I try switching the headless session to a visible session everything's working properly. In this scenario the application is requesting user authentication in visible session A, then taking the cookie from that and adding it to visible session B. Visible session A closes (as it should) and B continues to check that the authentication is still valid. Session A is no longer getting relaunched since session B is able to find the element.

I've looked through the logs to see if there are any sorts of exceptions being thrown, but there aren't (except 'no such element', but that's being handled). One thing I noticed is the following:

Starting ChromeDriver 2.40.565498 (ea082db3280dd6843ebfb08a625e3eb905c4f5ab) on port 57194
Only local connections are allowed.
2019-02-05 15:07.41 request body={"desiredCapabilities": {"browserName": "chrome", "chromeOptions": {"args": ["--headless", "--disable-gpu"]}}} method=POST url=http://localhost:57194/session
[0205/150741.616:ERROR:gpu_process_transport_factory.cc(967)] Lost UI shared context.

At the end you can see ERROR:gpu_process_transport_factory.cc(967) Lost UI shared context. No idea if that's related, but I thought it's worth mentioning. Is it possible I'm doing something wrong? If there's anything I need to clarify please let me know.

Exception getting new session: io.UnsupportedOperation: fileno

Versions

OS: Ubuntu 18.04 LTS x86_64
Python: 3.6.5
firefox: 60.0.1
geckodriver 0.20.1
arsenic: 1.0.0.dev8

My code

from arsenic import get_session, services, browsers
service = services.Geckodriver()
browser = browsers.Firefox(firefoxOptions={ 'args': ['-headless']} )


async with get_session(service, browser) as session:
  await session.get('https://google.com')

Traceback

  File "/usr/local/lib/python3.6/dist-packages/arsenic/__init__.py", line 16, in __aenter__
    self.session = await start_session(self.service, self.browser, self.bind)
  File "/usr/local/lib/python3.6/dist-packages/arsenic/__init__.py", line 28, in start_session
    driver = await service.start()
  File "/usr/local/lib/python3.6/dist-packages/arsenic/services.py", line 99, in start
    self.log_file
  File "/usr/local/lib/python3.6/dist-packages/arsenic/services.py", line 34, in subprocess_based_service
    process = await impl.start_process(cmd, log_file)
  File "/usr/local/lib/python3.6/dist-packages/arsenic/subprocess.py", line 61, in start_process
    stdin=DEVNULL,
  File "/usr/lib/python3.6/asyncio/subprocess.py", line 225, in create_subprocess_exec
    stderr=stderr, **kwds)
  File "uvloop/loop.pyx", line 2368, in __subprocess_run
  File "uvloop/handles/process.pyx", line 566, in uvloop.loop.UVProcessTransport.new
  File "uvloop/handles/process.pyx", line 692, in uvloop.loop.__process_convert_fileno
io.UnsupportedOperation: fileno
  • same exception happens without using uvloop
  • same exception happens using chrome browser
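
The traceback shows `log_file` being forwarded to `create_subprocess_exec`, which needs a stream backed by a real file descriptor. `io.UnsupportedOperation: fileno` is what in-memory streams raise. A possible trigger, and this is an assumption, is running where `sys.stdout` has been replaced by such an object (for example some notebooks or embedded interpreters). A sketch of a check; the helper `has_fileno` is hypothetical.

```python
import io
import os


def has_fileno(stream):
    """Return True if the stream is backed by a real OS-level file descriptor."""
    try:
        stream.fileno()
    except (AttributeError, io.UnsupportedOperation):
        return False
    return True


in_memory = io.StringIO()  # no descriptor, so unusable as a subprocess stdout
memory_ok = has_fileno(in_memory)

with open(os.devnull, "w") as real_file:
    real_ok = has_fileno(real_file)  # a file opened from disk has one
```

If the check fails for the default stream, passing an explicitly opened file as `log_file` (the parameter used in other reports in this listing) may work around it.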

Disable logging to stdout

The library prints a lot of information to stdout, including the full page source when navigating or taking screenshots. The documentation does not make clear how to disable these logs.

Mozilla Firefox Capabilities not Working

Using the example in the documentation for headless Firefox

browser = browsers.Firefox(firefoxOptions={
    'args': ['-headless']
})

does not seem to work. I can see it being passed in when the session is opened, but the browser still opens in the forefront.

API question

I am wondering if we could simplify the API usage by providing a high level class that does everything one would expect when running a Firefox session.

Something like:

async with FirefoxSession(*some, **options) as firefox:
    await firefox.get('http://example.com')

What do you think?

Firefox-ESR 60.6 not working

When trying to use Arsenic with the new firefox-esr build for the Raspberry Pi 3B, I get the following error:

2019-03-27 15:24.32 request                        body={"desiredCapabilities": {"browserName": "firefox", "marionette": true, "acceptInsecureCerts": true, "moz:firefoxOptions": {"args": ["-headless"]}}} method=POST url=http://localhost:43867/session
1553696672076   webdriver::command      WARN    You are using deprecated legacy session negotiation patterns (desiredCapabilities/requiredCapabilities), see https://developer.mozilla.org/en-US/docs/Web/WebDriver/Capabilities#Legacy
1553696672085   mozrunner::runner       INFO    Running command: "/usr/bin/firefox" "-marionette" "-headless" "-foreground" "-no-remote" "-profile" "/tmp/rust_mozprofile.vmvW1LqG2Uh7"
*** You are running in headless mode.
JavaScript error: resource://gre/modules/XPCOMUtils.jsm, line 187: TypeError: undefined has no properties
JavaScript error: resource://gre/modules/XPCOMUtils.jsm, line 187: TypeError: undefined has no properties
JavaScript error: resource://gre/modules/XPCOMUtils.jsm, line 187: TypeError: undefined has no properties
JavaScript error: resource://gre/modules/XPCOMUtils.jsm, line 187: TypeError: undefined has no properties
JavaScript error: resource://gre/modules/XPCOMUtils.jsm, line 187: TypeError: undefined has no properties
JavaScript error: resource://gre/modules/XPCOMUtils.jsm, line 187: TypeError: undefined has no properties
JavaScript error: resource://gre/modules/XPCOMUtils.jsm, line 187: TypeError: undefined has no properties
ExceptionHandler::GenerateDump cloned child 9542
ExceptionHandler::SendContinueSignalToChild sent continue signal to child
ExceptionHandler::WaitForContinueSignal waiting for continue signal...

Has anyone successfully used this library on a Raspberry Pi?

xpath with wait_for_element

Is it possible to use an XPath selector with wait_for_element? For example: join_the_queue = await session.wait_for_element(5, '//div[contains(text(), "text")]')

Somehow test on Windows

Find a way to run the tests continuously on Windows (not the browser, the actual code).

Specify timeout while executing get in session

Currently, the arsenic.session.Session.get method accepts only the URL as a parameter and has a fixed timeout of 300000 ms (5 minutes). It would be great to be able to specify this timeout in the get method.
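
Until such a parameter exists, a caller-side limit can be sketched with `asyncio.wait_for`. This is a workaround under assumptions, not arsenic's API: `with_timeout` is a hypothetical helper and `slow_page_load` is a stand-in for `session.get(url)`. It only bounds how long the caller waits; it does not change the timeout the driver itself applies.

```python
import asyncio


async def with_timeout(awaitable, seconds):
    """Await `awaitable`, raising asyncio.TimeoutError after `seconds`."""
    return await asyncio.wait_for(awaitable, timeout=seconds)


async def slow_page_load():
    # Stand-in for `session.get(url)`; sleeps instead of talking to a browser.
    await asyncio.sleep(1)
    return "loaded"


async def demo():
    try:
        return await with_timeout(slow_page_load(), 0.05)
    except asyncio.TimeoutError:
        return "timed out"


result = asyncio.run(demo())
```

In real use the call would look like `await with_timeout(session.get(url), 30)`.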

Firefox 63 cannot start

After updating Firefox to 63, the webdriver cannot start.
Arsenic version: 1.0.0.dev8. Error:

  File "/home/art/venvs/worker/lib/python3.6/site-packages/arsenic/__init__.py", line 29, in start_session
    return await driver.new_session(browser, bind=bind)
  File "/home/art/venvs/worker/lib/python3.6/site-packages/arsenic/webdriver.py", line 57, in new_session
    raise SessionStartError(err_resp['error'], err_resp.get('message', ''), original_response)
arsenic.errors.SessionStartError: unknown error: newSession

With the original Selenium 3.14.1 I see the same error:

~lib/python3.6/site-packages/selenium/webdriver/remote/webdriver.py in execute(self, driver_command, params)
    319         response = self.command_executor.execute(driver_command, params)
    320         if response:
--> 321             self.error_handler.check_response(response)
    322             response['value'] = self._unwrap_value(
    323                 response.get('value', None))

~lib/python3.6/site-packages/selenium/webdriver/remote/errorhandler.py in check_response(self, response)
    240                 alert_text = value['alert'].get('text')
    241             raise exception_class(message, screen, stacktrace, alert_text)
--> 242         raise exception_class(message, screen, stacktrace)
    243 
    244     def _value_or_default(self, obj, key, default):

WebDriverException: Message: newSession

Backport to python 3.5

Hello,

Do you plan to backport arsenic to Python 3.5?

If not, would you mind if I backported it and submitted a pull request? Could we then release a version of arsenic with Python 3.5 support?

How to get nested element?

I have a case where there's an outer container element containing an inner link element. I need to click the link of a specific container element. It's possible to find the right container element using a "data-id" value assigned to each container, but the link doesn't have the data-id value. Is there any way to get the link element within the container element?
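
One way to express this is a single CSS descendant selector that matches the link inside the container with the given data-id. A minimal sketch: the helper `nested_selector` is hypothetical, and using its result with `session.get_element` assumes that method takes a CSS selector, as the other snippets in this listing suggest.

```python
def nested_selector(data_id, inner="a"):
    """Build a CSS selector for `inner` elements inside the container with this data-id."""
    # Escape backslashes and double quotes so the value is safe inside the quoted attribute.
    escaped = str(data_id).replace("\\", "\\\\").replace('"', '\\"')
    return f'[data-id="{escaped}"] {inner}'


# Hypothetical usage inside a session:
#     link = await session.get_element(nested_selector("42"))
#     await link.click()
```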

How to start a Remote session with desired_capabilities and FirefoxProfile?

Hello,

I'm trying to test a remote Firefox connection using Arsenic. It doesn't look like your Remote service supports desired_capabilities or profile parameters?

Here is the synchronous code I usually use:

    from selenium import webdriver
    capabilities = {
        'platform': 'LINUX', 'browserName': browser, 'version': '',
        'enableVNC': True,
    }
    profile = webdriver.FirefoxProfile()
    profile.set_preference('browser.download.manager.showWhenStarting', False)
    profile.set_preference('browser.helperApps.neverAsk.saveToDisk', 'text/calendar,text/x-vcalendar')

    remote_driver = webdriver.Remote(
        command_executor='http://localhost:4444/wd/hub',
        desired_capabilities=capabilities, browser_profile=profile
    )

firefox options changed

Firefox options now use moz:firefoxOptions. Option handling for browsers should be overhauled.

AioHTTP versions compatibility (sometimes unable to stop chrome driver process)

Hi! There is a problem with Python 3.8: I suppose TimeoutError is no longer available in asyncio.futures, only in asyncio itself. The exception occurs in arsenic.subprocess.AsyncioSubprocessImpl.stop_process().

I don't know why, but chromedriver is not terminated; it becomes defunct (a zombie) and stays around. Arsenic waits 1 second for it to terminate and then tries to kill it, but fails with an AttributeError because of the problem described above.

UPD: If I replace asyncio.futures.TimeoutError with asyncio.TimeoutError, I still have a problem, but in this case the exception is 'the event loop is closed' (in aiohttp.test_utils.setup_test_loop()). Something still prevents chromedriver from terminating and arsenic cannot even kill it; I see the warning in the logs.

Funny fact: with 2 tests in my test case everything seems fine, but with more than 2 tests I hit the problem.

The problems seem to have begun when I migrated from aiohttp 2.3.5 to 3.6.2. I use pytest and unittest.

I will provide an example of my code a little later.

UPD: I have figured out that the problem is in aiohttp.test_utils.setup_test_loop(). I replaced that function from aiohttp 3.6.2 with the one from aiohttp 3.2.5 and the problem is gone.
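
The terminate, wait, then kill sequence described in this report can be sketched with the top-level `asyncio.TimeoutError` name. This is an illustration under assumptions, not arsenic's actual `stop_process`: the function below is a hypothetical stand-alone version, and a sleeping Python child stands in for chromedriver. It does not address the closed-event-loop problem in the aiohttp test utilities.

```python
import asyncio
import sys


async def stop_process(process, grace_period=1.0):
    """Terminate a process, escalating to kill if it does not exit in time."""
    process.terminate()
    try:
        await asyncio.wait_for(process.wait(), timeout=grace_period)
    except asyncio.TimeoutError:  # top-level name, not asyncio.futures.TimeoutError
        process.kill()
        await process.wait()  # reap the child so it does not linger as a zombie
    return process.returncode


async def demo():
    # Stand-in for chromedriver: a child process that sleeps for a minute.
    child = await asyncio.create_subprocess_exec(
        sys.executable, "-c", "import time; time.sleep(60)"
    )
    return await stop_process(child)


returncode = asyncio.run(demo())
```

Awaiting `process.wait()` after `kill()` matters: without it the exit status is never collected and the process stays defunct.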

`WebDriver.new_session` does not do correct error checking/handling (chromedriver)

Example response (when giving invalid desired caps):

{'sessionId': 'b1793a74694f52539f63be020cfd22a9', 'status': 13, 'value': {'message': 'unknown error: cannot parse capability: chromeOptions\nfrom unknown error: must be a dictionary\n (Driver info: chromedriver=2.31.488774 (7e15618d1bf16df8bf0ecf2914ed1964a387ba0b),platform=Mac OS X 10.12.6 x86_64)'}}
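
A sketch of a check that covers both response shapes seen in this listing: the legacy shape above, with a non-zero top-level `status`, and the shape with an `error` key inside `value` that appears in the Firefox log elsewhere on this page. The function name `session_start_error` is hypothetical and this is not arsenic's implementation.

```python
def session_start_error(response):
    """Return an error message if a new-session response signals failure, else None."""
    value = response.get("value")
    # Legacy drivers report failure through a non-zero top-level "status".
    if response.get("status", 0) != 0:
        if isinstance(value, dict):
            return value.get("message", "unknown error")
        return "unknown error"
    # Newer drivers put an "error" key inside "value".
    if isinstance(value, dict) and "error" in value:
        return f"{value['error']}: {value.get('message', '')}"
    return None


# Abbreviated from the chromedriver response in the report above.
legacy_failure = {
    "sessionId": "b1793a74694f52539f63be020cfd22a9",
    "status": 13,
    "value": {"message": "unknown error: cannot parse capability: chromeOptions"},
}
```

A caller would raise its session-start exception whenever the function returns a message instead of `None`.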

Get element by XPath

I need to get an element by its content. I don't see any way to do this.
In Selenium I could use find_element_by_xpath, for example:

driver.find_element_by_xpath("//button[contains(text(), 'Button text')]")

Is there any way to do this with arsenic now?
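
At the protocol level, the W3C WebDriver Find Element command takes a locator strategy alongside the selector, and `xpath` is one of the defined strategies. The sketch below only builds that request body. Whether and how arsenic lets callers choose the strategy is not confirmed here, and `find_element_payload` is a hypothetical helper.

```python
import json

# Locator strategies defined by the W3C WebDriver specification.
LOCATOR_STRATEGIES = {"css selector", "link text", "partial link text", "tag name", "xpath"}


def find_element_payload(value, using="css selector"):
    """Build the JSON body of a WebDriver Find Element request."""
    if using not in LOCATOR_STRATEGIES:
        raise ValueError(f"unsupported locator strategy: {using!r}")
    return json.dumps({"using": using, "value": value})


payload = find_element_payload("//button[contains(text(), 'Button text')]", using="xpath")
```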

Object of type 'CompatElement' is not JSON serializable

Hi,

I'm trying to use execute_script and pass an element to the script. The documentation is not very clear about this, apart from: "Arguments to pass to the script. Must be JSON serializable."

So I tried:

button = await session.get_element(button_css)
scroll_script = "button.scrollIntoView(false)"
await session.execute_script(scroll_script, {'button': button})

But it turns out that an object of type 'CompatElement' is not JSON serializable.

What is the correct way to do this?
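
Two protocol-level points may help. In WebDriver, script arguments are positional and reach the script through `arguments`, so a dict key such as `'button'` does not become a variable. Elements are sent as small JSON references, not as objects. The sketch below builds such a reference by hand. It assumes the element id is available from the element object (the attribute name is not confirmed here), and `element_reference` is a hypothetical helper, not part of arsenic.

```python
import json

# Key defined by the W3C WebDriver specification for element references.
W3C_ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf"
# Key used by the older JSON Wire protocol.
LEGACY_ELEMENT_KEY = "ELEMENT"


def element_reference(element_id):
    """Serialize an element id as a JSON-safe reference understood by both protocol dialects."""
    return {W3C_ELEMENT_KEY: element_id, LEGACY_ELEMENT_KEY: element_id}


# The script reads its arguments positionally.
scroll_script = "arguments[0].scrollIntoView(false);"
script_args = [element_reference("hypothetical-element-id")]
serialized = json.dumps(script_args)  # succeeds, because only plain dicts and strings remain
```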
