HTTP Retry and Resilience
Fromager includes enhanced HTTP retry functionality to handle network failures, server timeouts, and rate limiting that can occur when downloading packages and metadata.
Features
The retry system provides:
Exponential backoff with jitter to avoid thundering herd problems
Configurable retry attempts (default: 8 retries)
Smart error handling for common network issues:
HTTP 5xx server errors (500, 502, 503, 504)
HTTP 429 rate limiting
Connection timeouts and broken connections
Incomplete reads during large downloads
DNS resolution failures
GitHub API rate limit handling with proper reset time detection
Automatic authentication for GitHub and GitLab APIs (see Authentication)
Temporary file handling so that an interrupted download never leaves a partial file at the final path
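To make the first feature concrete, here is a minimal sketch of exponential backoff with jitter. It is not Fromager's implementation: the function name `backoff_delay`, the doubling schedule, and the jitter range are assumptions chosen for illustration. Only the default backoff factor (1.5) and the upper delay cap (120 seconds) are taken from this page.

```python
import random
from typing import Optional


def backoff_delay(
    attempt: int,
    backoff_factor: float = 1.5,
    max_backoff: float = 120.0,
    rng: Optional[random.Random] = None,
) -> float:
    """Return the delay in seconds before retry number `attempt` (1-based).

    Hypothetical helper: the schedule is an assumption, not Fromager's code.
    """
    source = rng if rng is not None else random
    # Exponential growth: backoff_factor * 2^(attempt - 1), capped at max_backoff.
    base = min(backoff_factor * (2 ** (attempt - 1)), max_backoff)
    # Jitter: pick a random point in [base / 2, base] so that many clients
    # that failed at the same moment do not all retry at the same moment.
    return source.uniform(base / 2, base)
```

Passing a seeded `random.Random` instance makes the delays reproducible, which is convenient in tests.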
Configuration
You can customize retry behavior using environment variables:
# Number of retry attempts (default: 8)
export FROMAGER_HTTP_RETRIES=10
# Backoff factor for exponential delay (default: 1.5)
export FROMAGER_HTTP_BACKOFF_FACTOR=2.0
# Request timeout in seconds (default: 120)
export FROMAGER_HTTP_TIMEOUT=180
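As an illustration of how these variables and their defaults fit together, the sketch below reads them into a small settings object. The names `RetrySettings` and `load_retry_settings` are hypothetical, and Fromager's own parsing may differ. Only the variable names and default values come from this page.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RetrySettings:
    """Hypothetical container for the three documented settings."""

    retries: int
    backoff_factor: float
    timeout: float


def load_retry_settings(environ: Optional[Mapping[str, str]] = None) -> RetrySettings:
    """Read the documented variables, falling back to the documented defaults."""
    env = os.environ if environ is None else environ
    return RetrySettings(
        retries=int(env.get("FROMAGER_HTTP_RETRIES", "8")),
        backoff_factor=float(env.get("FROMAGER_HTTP_BACKOFF_FACTOR", "1.5")),
        timeout=float(env.get("FROMAGER_HTTP_TIMEOUT", "120")),
    )
```

Accepting an explicit mapping instead of always reading `os.environ` keeps the function easy to test without modifying the process environment.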
Authentication credentials (GITHUB_TOKEN, GITLAB_PRIVATE_TOKEN,
etc.) are documented in Authentication.
Error Types Handled
The retry mechanism specifically handles these error conditions:
Server Errors
408 Request Timeout - Request took too long to complete
504 Gateway Timeout - Server overwhelmed or upstream timeout
502 Bad Gateway - Proxy/gateway errors
503 Service Unavailable - Temporary server overload
500 Internal Server Error - General server errors
Rate Limiting
429 Too Many Requests - General rate limiting
GitHub API rate limits with proper reset time handling
Network Errors
ConnectionError - Network connectivity issues
ChunkedEncodingError - Broken connections during transfer
IncompleteRead - Partial data received
ProtocolError - Low-level protocol issues
Timeout - Request timeouts
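The classification above can be sketched as a single predicate. This is an illustration only: `is_transient` is a hypothetical helper, and the exception tuple uses standard-library stand-ins (an assumption) rather than the HTTP library exception classes named in the list. The status codes are the ones listed on this page.

```python
import http.client
from typing import Optional

# Status codes this page lists as retryable.
RETRYABLE_STATUS_CODES = frozenset({408, 429, 500, 502, 503, 504})

# Assumption: standard-library stand-ins for the network errors listed above.
STDLIB_TRANSIENT_ERRORS = (ConnectionError, TimeoutError, http.client.IncompleteRead)


def is_transient(
    status_code: Optional[int] = None,
    error: Optional[BaseException] = None,
) -> bool:
    """Return True if the failure looks temporary and is worth retrying."""
    if error is not None and isinstance(error, STDLIB_TRANSIENT_ERRORS):
        return True
    # A missing status code (None) is never in the set, so this is False.
    return status_code in RETRYABLE_STATUS_CODES
```

Errors such as 404 Not Found or a `ValueError` are deliberately not retried, since repeating the request would not change the outcome.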
Usage
The retry functionality is automatically enabled for all HTTP operations in Fromager. No code changes are required for existing functionality.
For Plugin Developers
If you’re writing plugins that need HTTP functionality, use the
shared session from request_session. It includes retry handling
and automatic authentication for GitHub and GitLab:
from fromager.request_session import session
# Use it like a normal requests session
response = session.get("https://pkg.test/api/data")
response.raise_for_status()
To register authentication for additional hosts:
from fromager.request_session import session_auth
def _resolve_my_auth(scheme: str, hostname: str) -> dict[str, str]:
    return {"Authorization": "Bearer my-token"}
session_auth.add("https://my-registry.test", _resolve_my_auth)
Decorating Functions with Retry Logic
To retry your own functions that might fail due to transient errors, decorate them with retry_on_exception:
from fromager.http_retry import retry_on_exception, RETRYABLE_EXCEPTIONS
@retry_on_exception(
    exceptions=RETRYABLE_EXCEPTIONS,
    max_attempts=3,
    backoff_factor=1.0,
    max_backoff=30.0,
)
def download_metadata(url):
    # Your download logic here
    pass
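For readers curious about what a decorator with these parameters does, the sketch below shows the general pattern. It is not the source of `retry_on_exception`: the name `simple_retry` and the `sleep` parameter are hypothetical, and jitter is omitted so the delays stay deterministic.

```python
import functools
import time


def simple_retry(exceptions, max_attempts=3, backoff_factor=1.0,
                 max_backoff=30.0, sleep=time.sleep):
    """Hypothetical retry decorator illustrating the general pattern."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    # Out of attempts: let the last error propagate.
                    if attempt == max_attempts:
                        raise
                    # Exponential delay, capped at max_backoff.
                    sleep(min(backoff_factor * 2 ** (attempt - 1), max_backoff))

        return wrapper

    return decorator
```

With `max_attempts=3` and `backoff_factor=1.0`, a function that fails twice and then succeeds is called three times, with waits of 1 and 2 seconds in between.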
Logging
The retry system logs important events:
WARNING: When retries are attempted with backoff times
ERROR: When all retry attempts are exhausted
DEBUG: Detailed retry configuration and authentication resolution
Example log output:
WARNING Request failed for https://api.github.com/repos/owner/repo/tags: 504 Server Error. Retrying in 2.3 seconds (attempt 2/5)
WARNING GitHub API rate limit hit for https://api.github.com/repos/owner/repo/tags. Waiting 1247 seconds until reset.
INFO saved /path/to/package.tar.gz
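The wait time in the rate limit message can be derived from GitHub's response headers. GitHub reports the reset moment in `X-RateLimit-Reset` as UTC epoch seconds and the remaining quota in `X-RateLimit-Remaining`. The helper below is a minimal sketch of that calculation. Its name and its handling of missing headers are assumptions, not Fromager's code.

```python
import time
from typing import Mapping, Optional


def seconds_until_reset(
    headers: Mapping[str, str],
    now: Optional[float] = None,
) -> Optional[float]:
    """Return how long to wait for the quota to reset, or None if not rate limited.

    Hypothetical helper. A plain dict is case-sensitive, so the header names
    must match exactly unless a case-insensitive mapping is passed in.
    """
    if headers.get("X-RateLimit-Remaining") != "0":
        return None
    reset = headers.get("X-RateLimit-Reset")
    if reset is None:
        return None
    current = time.time() if now is None else now
    # Never return a negative wait if the reset time has already passed.
    return max(0.0, float(reset) - current)
```

For example, with a reset time of 2247 and a current time of 1000, the helper returns 1247.0 seconds, matching the shape of the log line above.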
Performance Considerations
Chunk size: Downloads use 64KB chunks for better error recovery
Temporary files: Partial downloads are written to .tmp files first
Jitter: Random delays prevent synchronized retry storms
Max backoff: Delays are capped at 60-120 seconds depending on context
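The chunked download and temporary file points can be sketched together. The example below copies a stream in 64 KB chunks into a temporary file and moves it into place only once the copy has finished. The function name `save_stream` and the exact `.tmp` naming scheme are assumptions for illustration, and Fromager's own download code may be organized differently.

```python
import os
from typing import BinaryIO

CHUNK_SIZE = 64 * 1024  # 64 KB, the chunk size mentioned above


def save_stream(source: BinaryIO, destination: str) -> str:
    """Copy `source` to `destination` via a temporary file (hypothetical helper)."""
    tmp_path = destination + ".tmp"  # assumed naming scheme
    try:
        with open(tmp_path, "wb") as tmp:
            while True:
                chunk = source.read(CHUNK_SIZE)
                if not chunk:
                    break
                tmp.write(chunk)
        # os.replace is atomic on the same filesystem, so readers see either
        # the old file or the complete new one, never a partial download.
        os.replace(tmp_path, destination)
    except BaseException:
        # Clean up the partial temporary file before re-raising.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return destination
```

If the connection breaks mid-transfer, the exception propagates to the caller (where a retry can be attempted) and nothing is left at the final path.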
Troubleshooting
High Retry Rates
If you’re seeing many retries, consider:
Configuring authentication credentials (see Authentication) to avoid API rate limits
Increasing timeout values for slow connections
Checking network connectivity and DNS resolution
API Rate Limiting
Configure credentials via netrc or environment variables (see Authentication)
Consider using a local package mirror for PyPI
Monitor API usage if using private registries
Connection Issues
Verify firewall and proxy settings
Check if specific URLs are being blocked
Consider network-level retry/redundancy