Ideas
This is a list of ideas for applications.
If you wish to apply, learn how to participate!
Scrapy
Handle 429s properly
| Description | Scrapy does not currently handle 429 (Too Many Requests) responses properly. Whenever it receives a 429 response, it should update its throttling configuration and concurrency to adapt to the new rate. |
| Expected Result | A new middleware/extension that will handle 429 response codes and adjust request rates properly. |
| Expected Time | 175 hours |
| Required Skills | HTTP |
| Mentors | Adrian, Andrey |
| GitHub Issue | #4424 |
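To illustrate the kind of rate adaptation described above, here is a minimal sketch in plain Python. It is not Scrapy's API and not the proposed design: the function name `next_delay` and all of its parameters and default values are hypothetical, chosen only for illustration. It assumes the server may send a `Retry-After` header expressed in seconds, and falls back to exponential backoff otherwise.

```python
from typing import Optional


def next_delay(
    current_delay: float,
    status: int,
    retry_after: Optional[str] = None,
    min_delay: float = 0.5,   # hypothetical lower bound, in seconds
    max_delay: float = 60.0,  # hypothetical upper bound, in seconds
    backoff: float = 2.0,     # hypothetical multiplier applied on 429
) -> float:
    """Return the delay to use before the next request to the same host."""
    if status == 429:
        if retry_after is not None and retry_after.strip().isdigit():
            # Honor the server hint, but never speed up because of it.
            proposed = max(float(retry_after.strip()), current_delay)
        else:
            # No usable hint (missing header, or an HTTP-date value that
            # this sketch does not parse): back off exponentially.
            proposed = max(current_delay, min_delay) * backoff
    else:
        # Any other response: cautiously shorten the delay again.
        proposed = current_delay * 0.9
    # Keep the result within the configured bounds.
    return min(max(proposed, min_delay), max_delay)
```

A real middleware or extension would also need to track delays per domain or download slot and adjust concurrency, which this sketch leaves out.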
Static Analysis Tooling
| Description | When using Scrapy, certain common issues are hard to detect, for example a typo in the name of a setting. |
| Expected Result | Build a list of common issues in code using Scrapy that could be detected using static code analysis, and build a tool or extend an existing tool to detect those. |
| Expected Time | 175 hours or 350 hours depending on the chosen scope |
| Required Skills | Abstract Syntax Tree, Regular Expressions |
| Mentors | Adrian, Andrey |
| GitHub Issue | #4421 |
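As a rough sketch of how the setting-name example could be detected with an abstract syntax tree, the snippet below uses only the Python standard library (`ast` and `difflib`). The function name `find_setting_typos` is hypothetical, `KNOWN_SETTINGS` is a small illustrative subset of setting names rather than a complete list, and the check assumes settings are given as a literal `custom_settings` dictionary. A real tool would have to cover many more patterns.

```python
import ast
import difflib

# Illustrative subset only; a real tool would load the full list of settings.
KNOWN_SETTINGS = sorted(
    {"CONCURRENT_REQUESTS", "DOWNLOAD_DELAY", "ROBOTSTXT_OBEY", "USER_AGENT"}
)


def find_setting_typos(source: str):
    """Return (line, found_name, suggested_name) for likely misspelled settings."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # Only look at assignments of a dict literal to `custom_settings`.
        if not isinstance(node, ast.Assign) or not isinstance(node.value, ast.Dict):
            continue
        names = [t.id for t in node.targets if isinstance(t, ast.Name)]
        if "custom_settings" not in names:
            continue
        for key in node.value.keys:
            if not (isinstance(key, ast.Constant) and isinstance(key.value, str)):
                continue
            if key.value in KNOWN_SETTINGS:
                continue
            # Flag unknown names only when they closely resemble a known one.
            close = difflib.get_close_matches(key.value, KNOWN_SETTINGS, n=1)
            if close:
                findings.append((key.lineno, key.value, close[0]))
    return sorted(findings)


SAMPLE = '''
class MySpider:
    custom_settings = {
        "DOWNLOAD_DELAY": 1,
        "CONCURENT_REQUESTS": 4,
    }
'''

for line, found, suggestion in find_setting_typos(SAMPLE):
    print(f"line {line}: unknown setting {found!r}, did you mean {suggestion!r}?")
```

Unknown names with no close match are deliberately left alone, since projects can define their own settings.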
Improve cookie handling
| Description | There are different aspects of cookie handling in Scrapy that we should improve. |
| Expected Result | Update the handling of cookies by Scrapy to meet modern web standards followed by web browsers, and make it easier for Scrapy users to work with cookies. |
| Expected Time | 175 hours or 350 hours depending on the chosen scope |
| Required Skills | HTTP |
| Mentors | Adrian, Andrey |
| GitHub Issue | #5431 |
Add TLS 1.3 support
| Description | Scrapy does not support TLS 1.3. Adding support is important to keep up with servers that drop support for older TLS versions. |
| Expected Result | Update our HTTP/1.1 downloader to support TLS 1.3 connections. |
| Expected Time | 350 hours |
| Stretch Goals | Update our HTTP/2 downloader to support TLS 1.3 connections as well. |
| Required Skills | HTTP, Twisted |
| Mentors | Adrian, Andrey |
| GitHub Issue | #4821 |
Parsel
HTML5 Support
| Description | When you inspect a website element in a web browser, you get a DOM-based HTML tree that is different from the actual, underlying HTML tree. This makes it difficult to translate what you find in a web browser into an XPath or CSS expression that works in Parsel, even more so when the underlying HTML is actually broken. |
| Expected Result | Extend Parsel to support different HTML parsers, and add support for additional HTML parsers. |
| Expected Time | 175 hours |
| Required Skills | HTML, Interface Design |
| Mentors | Andrey, Adrian |
| GitHub Issue | #83 |
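One well-known instance of the mismatch described above is that browsers implicitly add a `tbody` element when building the DOM for a table, even if the source markup has none. The sketch below illustrates the effect with the standard library's XML parser, used here only as a stand-in for a parser that keeps the tree as written in the source. It does not use Parsel, and the sample markup and variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Source markup as served: the table has no <tbody>.
RAW_HTML = "<html><body><table><tr><td>Price</td></tr></table></body></html>"

root = ET.fromstring(RAW_HTML)

# Path as it would be copied from a browser inspector, which shows a <tbody>.
browser_style = root.findall("./body/table/tbody/tr")

# Path that matches the tree as written in the source.
source_style = root.findall("./body/table/tr")

print(len(browser_style), len(source_style))
```

An HTML5-compliant parser would build the same tree as the browser, so the inspector-derived path would match without manual adjustment.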