wjh18 / pyspeedinsights

Measure your site speed, performance, accessibility and SEO in bulk from the command line with Python and the PageSpeed Insights API.

Home Page: https://pypi.org/project/pyspeedinsights/

License: MIT License

Python 100.00%
python google-api google-pagespeed-insights lighthouse-audits lighthouse-reports pagespeed-insights pagespeed-insights-api pagespeed-optimization cli lighthouse

pyspeedinsights's Issues

Logging

Is your feature request related to a problem? Please describe.
Logging is not implemented. Since this is a CLI tool, using print() is valid in many cases. However, it would be nice to log certain events to a file or to stderr.

Describe the solution you'd like
Python logging from stdlib.
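A minimal sketch of what this could look like with the standard library, not the project's actual implementation. The logger name "pyspeedinsights" and the helper name configure_logging are assumptions made for illustration.

```python
import logging
import sys


def configure_logging(log_file=None, level=logging.INFO):
    """Send log records to stderr and, optionally, to a file.

    The logger name and this helper are hypothetical; they are not
    taken from the pyspeedinsights code base.
    """
    logger = logging.getLogger("pyspeedinsights")
    logger.setLevel(level)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls

    formatter = logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")

    # Diagnostics go to stderr so stdout stays free for normal CLI output.
    stream_handler = logging.StreamHandler(sys.stderr)
    stream_handler.setFormatter(formatter)
    logger.addHandler(stream_handler)

    if log_file is not None:
        file_handler = logging.FileHandler(log_file, encoding="utf-8")
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)

    return logger
```

Calling configure_logging() once at startup and then logging.getLogger("pyspeedinsights") elsewhere would let print() remain in place for user-facing output while events and errors go to the log.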

ValueError during timestamp parsing in rare case

Describe the bug
Discovered a (rare?) bug whereby the API response contains an analysisUTCTimestamp field with no trailing fractions of a second. When converting the timestamp to a Python datetime object in this scenario, it causes: ValueError: time data '2023-03-29T07:09:31Z' does not match format '%Y-%m-%dT%H:%M:%S.%fZ'.

To Reproduce
Can reproduce in testing by supplying api.response._get_timestamp with a mock JSON response dict (should contain a top-level analysisUTCTimestamp item with no trailing fractions of a second in its value). Then assert that it raises ValueError.

Expected behavior
Fractions of a second aren't used in report generation anyway so we should handle both scenarios gracefully (optional %f?).
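One way to handle both scenarios is to try the format with fractional seconds first and fall back to the format without them. This is only a sketch: the function name parse_timestamp is hypothetical and is not the project's api.response._get_timestamp.

```python
from datetime import datetime

# Formats to try, in order: with and without fractional seconds.
TIMESTAMP_FORMATS = ("%Y-%m-%dT%H:%M:%S.%fZ", "%Y-%m-%dT%H:%M:%SZ")


def parse_timestamp(timestamp):
    """Parse a timestamp string whether or not it carries fractional seconds."""
    for fmt in TIMESTAMP_FORMATS:
        try:
            return datetime.strptime(timestamp, fmt)
        except ValueError:
            continue  # try the next known format
    raise ValueError(f"Unrecognized timestamp format: {timestamp!r}")
```

With this approach the value from the report above, '2023-03-29T07:09:31Z', parses without raising, and timestamps that do include fractions still work.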

Prevent conflicting commands from being used with conditional commands

Describe the bug
The -m / --metrics option should only be allowed when the -f / --format option is set to excel or sitemap AND the -c / --category option is set to performance. This is because metrics are only included in the response of performance reports, and the JSON format includes everything by default anyway.

This doesn't have any negative consequences functionality-wise, but it would be nice to at least warn the user that their metrics won't be included in the report despite selecting them.

To Reproduce

Example commands that violate this constraint:

  • psi https://www.example.com -m all
  • psi https://www.example.com -c seo -m all
  • psi https://www.example.com -c seo -f excel -m all

Expected behavior
Prevent the command from being processed, or emit a warning that metrics won't be included.
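The constraint could be checked after argument parsing, roughly as sketched below. The parser here is a simplified stand-in, not the project's real CLI definition: the default values (performance, json) and the nargs setting are assumptions inferred from the example commands.

```python
import argparse
import sys


def build_parser():
    """Simplified stand-in for the psi CLI (defaults are assumptions)."""
    parser = argparse.ArgumentParser(prog="psi")
    parser.add_argument("url")
    parser.add_argument("-c", "--category", default="performance")
    parser.add_argument("-f", "--format", default="json")
    parser.add_argument("-m", "--metrics", nargs="+")
    return parser


def metrics_will_be_ignored(args):
    """Return True if metrics were requested but cannot appear in the report."""
    if not args.metrics:
        return False
    return args.category != "performance" or args.format not in ("excel", "sitemap")


def warn_if_metrics_ignored(args):
    """Print a warning to stderr instead of silently dropping the metrics."""
    if metrics_will_be_ignored(args):
        print(
            "Warning: metrics are only included for performance reports "
            "in excel or sitemap format; they will be omitted.",
            file=sys.stderr,
        )
```

To reject the command outright instead of warning, the same check could call parser.error(), which prints the message and exits with a non-zero status.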

Improve test coverage

Is your feature request related to a problem? Please describe.
There is very minimal test coverage at the moment.

Describe the solution you'd like
Write tests with pytest.

Describe alternatives you've considered
n/a

Additional context
n/a

Scheme required for sitemap `url` CLI argument to pass validation

Describe the bug
You can't enter your sitemap URL via the CLI without a URL scheme. Non-sitemap URLs without a scheme are converted successfully.

To Reproduce

$ psi example.com/sitemap.xml -f sitemap
> Invalid URL. Please enter a valid fully-qualified URL.
$ psi www.example.com/sitemap.xml -f sitemap
> Invalid URL. Please enter a valid fully-qualified URL.
$ psi www.example.com/sitemap.xml/ -f sitemap
> Invalid URL. Please enter a valid fully-qualified URL.

$ psi https://www.example.com/sitemap.xml/ -f sitemap
> Invalid sitemap URL provided. Please provide a URL to a valid XML sitemap.

Also, the last example is a separate issue caused by oversimplified extension validation where the trailing slash gets in the way. Might be worth taking a second look at that too.

Expected behavior
Should recognize as valid url where appropriate and add scheme if missing.
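A sketch of both fixes using urllib.parse from the standard library. The function names and the https default are assumptions, and whether a trailing slash after sitemap.xml should be accepted is a design decision for the project.

```python
from urllib.parse import urlsplit


def add_scheme(url, default_scheme="https"):
    """Prepend a scheme when the user omitted one (https is an assumed default)."""
    url = url.strip()
    if "://" not in url:
        url = f"{default_scheme}://{url}"
    return url


def is_sitemap_url(url):
    """Check the path's extension, ignoring any trailing slash and query string."""
    path = urlsplit(url).path.rstrip("/")
    return path.lower().endswith(".xml")
```

Normalizing with add_scheme before validation would let inputs such as example.com/sitemap.xml pass, and checking only the parsed path keeps a trailing slash or query string from defeating the extension test.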

Add support for sitemap indices

Is your feature request related to a problem? Please describe.
The sitemap parser only works with direct links to standalone sitemaps, not multiple sitemaps or sitemap indices.

Describe the solution you'd like
Modify the parser to traverse sitemap indices, parse URLs from child sitemaps and include those URLs in the request tasks.
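The traversal could be sketched as a small recursive function over the sitemaps.org XML format. This is an illustration, not the project's parser: collect_urls is a hypothetical name, and fetch is an assumed callable that returns the XML text for a URL so the example stays independent of any HTTP library.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def collect_urls(sitemap_url, fetch, seen=None):
    """Return page URLs from a sitemap, following sitemap indices recursively.

    `fetch` is a caller-supplied function mapping a URL to its XML text.
    `seen` guards against indices that reference each other.
    """
    seen = set() if seen is None else seen
    if sitemap_url in seen:
        return []
    seen.add(sitemap_url)

    root = ET.fromstring(fetch(sitemap_url))

    if root.tag.endswith("sitemapindex"):
        urls = []
        for loc in root.findall("sm:sitemap/sm:loc", SITEMAP_NS):
            if loc.text:
                urls.extend(collect_urls(loc.text.strip(), fetch, seen))
        return urls

    # A regular sitemap: <urlset> containing <url><loc>...</loc></url> entries.
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
```

The returned list could then feed the existing request tasks in the same way URLs from a standalone sitemap do.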

Describe alternatives you've considered
Use a 3rd-party package like advertools which has this functionality out of the box. The downside to this is that there's no way to only install the sitemap parser from this package.

Additional context
n/a

Time string in file name causing issue in Windows.

(base) C:\Users\Dipan>psi https://dipan.pages.dev
Preparing 1 URL(s)...
Sending request... (https://dipan.pages.dev)
Request successful! (https://dipan.pages.dev)
1/1 URL(s) processed successfully.
Traceback (most recent call last):
  File "c:\users\dipan\miniconda3\lib\runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\dipan\miniconda3\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\Dipan\miniconda3\Scripts\psi.exe\__main__.py", line 7, in <module>
  File "c:\users\dipan\miniconda3\lib\site-packages\pyspeedinsights\app.py", line 62, in main
    process_json(response, category, strategy)
  File "c:\users\dipan\miniconda3\lib\site-packages\pyspeedinsights\api\response.py", line 14, in process_json
    with open(filename, "w", encoding="utf-8") as f:
OSError: [Errno 22] Invalid argument: 'psi-s-desktop-c-performance-2022-10-27 11:05:25.626000.json'
Exception ignored in: <function _ProactorBasePipeTransport.__del__ at 0x0000022D4D691820>
Traceback (most recent call last):
  File "c:\users\dipan\miniconda3\lib\asyncio\proactor_events.py", line 116, in __del__
    self.close()
  File "c:\users\dipan\miniconda3\lib\asyncio\proactor_events.py", line 108, in close
    self._loop.call_soon(self._call_connection_lost, None)
  File "c:\users\dipan\miniconda3\lib\asyncio\base_events.py", line 746, in call_soon
    self._check_closed()
  File "c:\users\dipan\miniconda3\lib\asyncio\base_events.py", line 510, in _check_closed
    raise RuntimeError('Event loop is closed')
RuntimeError: Event loop is closed

We get this traceback on Windows. The filename is not valid there because the time string contains colons, which Windows does not allow in file names.
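A possible fix is to format the timestamp explicitly with characters that are safe on every platform rather than relying on the default string form of a datetime. The sketch below assumes the "psi-s-<strategy>-c-<category>-<timestamp>" naming pattern visible in the traceback; build_filename is a hypothetical helper, not the project's function.

```python
from datetime import datetime


def build_filename(strategy, category, timestamp, extension="json"):
    """Build a report file name that avoids ':' and spaces in the time part."""
    # Dots and underscores are valid in file names on Windows, macOS and Linux.
    stamp = timestamp.strftime("%Y-%m-%d_%H.%M.%S")
    return f"psi-s-{strategy}-c-{category}-{stamp}.{extension}"
```

Dropping the fractional seconds also shortens the name, at the cost of possible collisions if two reports for the same strategy and category are written within the same second.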

Include metrics in Excel by default for performance reports and switch from debug metrics to official report metrics

Currently, all available metrics are written to Excel, but, unbeknownst to me, these are meant solely for debugging purposes. Only 7 core metrics are included in actual PageSpeed Insights reports.

The goal here is to write metrics to Excel by default for performance reports and to include only the 7 core metrics. The rest will be deprecated. The core metrics will also precede the audit columns in Excel, both because they are key components of the report and for visibility, since there are far more audits than metrics.

I could also make disabling metrics optional for performance reports, but I don't think that's worth the effort as the friction they add is low relative to their usefulness.

Requires:

  • Removing metrics command choices from cli/choices.py
  • Removing metrics as a cli option
  • Detecting the category during response parsing. If performance and Excel or sitemap format, include metrics in Excel.
  • Changing which fields are parsed from JSON response
  • Modifying Excel writing to include metrics before audits
  • Possibly some other minor changes
  • Updating tests and README/docs
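The category detection and column ordering from the list above might reduce to something like the following sketch. All names here are hypothetical, and the column lists are placeholders passed in by the caller rather than the actual core metrics.

```python
def should_include_metrics(category, output_format):
    """Metrics apply only to performance reports written to Excel."""
    return category == "performance" and output_format in ("excel", "sitemap")


def order_columns(metric_columns, audit_columns, category, output_format):
    """Place metric columns before audit columns when metrics are included."""
    if should_include_metrics(category, output_format):
        return list(metric_columns) + list(audit_columns)
    return list(audit_columns)
```

Keeping the decision in one small function means the response parser and the Excel writer can share it instead of each re-checking the category and format.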

Keyring in Linux causing problem

This is not entirely a pyspeedinsights problem, but the hard dependency on keyring is causing problems on Linux. Keyring has a known issue on Linux, so the tool cannot be used there.

A method for supplying the API key directly would be preferable.
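One way to make keyring optional is to look for the key in several places in order and only fall back to keyring last. This is a sketch under stated assumptions: the environment variable name PSI_API_KEY, the keyring service and username strings, and the function name get_api_key are all hypothetical.

```python
import os


def get_api_key(cli_key=None, env=None, env_var="PSI_API_KEY"):
    """Resolve the API key from a CLI value, an environment variable, or keyring.

    The env var name and the keyring service/username below are assumptions.
    Returns None if no key can be found.
    """
    if cli_key:
        return cli_key

    env = os.environ if env is None else env
    if env.get(env_var):
        return env[env_var]

    try:
        # Imported lazily so a missing or broken keyring backend is not fatal.
        import keyring

        return keyring.get_password("pyspeedinsights", "psikey")
    except Exception:
        return None
```

With this lookup order, Linux users affected by the keyring issue could export the variable or pass the key explicitly, while existing keyring setups would keep working.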
