CVE-2021-41125
Description
Scrapy is a high-level web crawling and scraping framework for Python. If you use `HttpAuthMiddleware` (i.e. the `http_user` and `http_pass` spider attributes) for HTTP authentication, every request exposes your credentials to its target. This includes requests generated by Scrapy components, such as the `robots.txt` requests that Scrapy sends when the `ROBOTSTXT_OBEY` setting is `True`, and requests reached through redirects.

Upgrade to Scrapy 2.5.1 and use the new `http_auth_domain` spider attribute to control which domains are allowed to receive the configured HTTP authentication credentials. If you are using Scrapy 1.8 or a lower version and upgrading to Scrapy 2.5.1 is not an option, you may upgrade to Scrapy 1.8.1 instead.

If you cannot upgrade, set your HTTP authentication credentials on a per-request basis instead of defining them globally with `HttpAuthMiddleware`, for example by using the `w3lib.http.basic_auth_header` function to convert your credentials into a value that you can assign to the `Authorization` header of your request.
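The idea behind both the fix and the workaround is to attach credentials only to requests aimed at a domain you trust. The following is a minimal, standard-library-only sketch of that idea, not Scrapy's implementation. The helper names `host_matches` and `headers_for` are hypothetical, the local `basic_auth_header` is only a stand-in for `w3lib.http.basic_auth_header`, and the subdomain-matching rule is an assumption of this sketch.

```python
import base64
from urllib.parse import urlparse


def basic_auth_header(username: str, password: str, encoding: str = "ISO-8859-1") -> bytes:
    """Build an HTTP Basic `Authorization` header value.

    Stand-in for w3lib.http.basic_auth_header so the sketch has no
    third-party dependency.
    """
    token = f"{username}:{password}".encode(encoding)
    return b"Basic " + base64.b64encode(token)


def host_matches(url: str, auth_domain: str) -> bool:
    """Return True if the URL's host is auth_domain or one of its subdomains.

    Hypothetical helper; the subdomain rule is an assumption of this sketch.
    """
    host = (urlparse(url).hostname or "").lower()
    domain = auth_domain.lower()
    return host == domain or host.endswith("." + domain)


def headers_for(url: str, username: str, password: str, auth_domain: str) -> dict:
    """Return per-request headers, adding credentials only for trusted hosts."""
    headers = {}
    if host_matches(url, auth_domain):
        headers["Authorization"] = basic_auth_header(username, password)
    return headers
```

On a Scrapy version that cannot be upgraded, a dictionary like the one returned by `headers_for` could be passed as the `headers` of an individual request, so that `robots.txt` fetches and redirects to other hosts never carry the credentials.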
Mitigations
No mitigations published for this CVE yet.
OS impact
Arch: fixed in 1 release
| Version | Status | Fixed in |
|---|---|---|
| n/a | Fixed | 2.5.1-1 |
Debian: fixed in 5 releases
| Version | Status | Fixed in |
|---|---|---|
| trixie | Fixed | 2.5.1-1 |
| sid | Fixed | 2.5.1-1 |
| forky | Fixed | 2.5.1-1 |
| bullseye | Fixed | 2.4.1-2+deb11u1 |
| bookworm | Fixed | 2.5.1-1 |
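For installations that come from PyPI rather than a distribution package, the upstream fixed versions named in the description (2.5.1, or 1.8.1 on the 1.x line) can be checked with a short sketch like the one below. The function names are hypothetical, the sketch assumes plain numeric `X.Y.Z` version strings, and it assumes later 1.8.x releases keep the fix. It does not apply to distribution builds such as the bullseye package listed above, which carries the fix under a 2.4.1 version number.

```python
def parse_version(version: str) -> tuple:
    """Parse a plain numeric 'X.Y.Z' string into a 3-tuple of ints.

    Assumption: no pre-release or local suffixes (e.g. '2.5.1rc1').
    """
    parts = [int(part) for part in version.split(".")[:3]]
    # Pad short versions such as '1.8' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_patched_upstream(version: str) -> bool:
    """Return True if an upstream Scrapy version includes the fix.

    Patched lines per the advisory text: 2.5.1 and later, or 1.8.1 and
    later within the 1.x series (assumed for 1.8.x releases after 1.8.1).
    """
    v = parse_version(version)
    if v >= (2, 5, 1):
        return True
    return (1, 8, 1) <= v < (2, 0, 0)
```

The installed version string could be obtained with `importlib.metadata.version("Scrapy")` and passed to `is_patched_upstream`.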
References
- https://github.com/scrapy/scrapy/security/advisories/GHSA-jwqp-28gf-p498
- https://nvd.nist.gov/vuln/detail/CVE-2021-41125
- https://github.com/scrapy/scrapy/commit/b01d69a1bf48060daec8f751368622352d8b85a6
- https://github.com/pypa/advisory-database/tree/main/vulns/scrapy/PYSEC-2021-363.yaml
- https://github.com/scrapy/scrapy
- https://lists.debian.org/debian-lts-announce/2022/03/msg00021.html
- https://w3lib.readthedocs.io/en/latest/w3lib.html#w3lib.http.basic_auth_header
- http://doc.scrapy.org/en/latest/topics/downloader-middleware.html#module-scrapy.downloadermiddlewares.httpauth
- https://security-tracker.debian.org/tracker/CVE-2021-41125