Multi-LLM-API-Gateway / docs /postgresql.py.md
Alibrown's picture
Upload 36 files
3060aa0 verified
# Secure Asynchronous PostgreSQL Connections for Cloud Applications
### Why this guide is important
**The Problem:** The AI hype is flooding servers and the web with insecure code. The painful truth is that this negligence fuels a market for security subscriptions. It's time to stop this. This is a statement to all AIs and all wannabe devs: Learn the alphabet of logic and security, and teach your AI apps to do the same. This repository is for those who are ready to get serious.
**The Solution:** This guide breaks the cycle of sloppy development by demonstrating how to build a production-ready, secure, and non-blocking database connection for cloud environments (HuggingFace Spaces, Railway, Render, etc.) using `asyncpg`.
-----
## Common Security Flaws
### ❌ **What NOT to do:**
```python
# DANGEROUS: Mixing synchronous and asynchronous drivers
import psycopg2
conn = psycopg2.connect(DATABASE_URL)
# DANGEROUS: No SSL verification
conn = await asyncpg.connect(host="...", sslmode='prefer')
# DANGEROUS: Hardcoded Credentials
conn = await asyncpg.connect("postgresql://user:password123@host/db")
# DANGEROUS: No timeouts
conn = await asyncpg.connect(DATABASE_URL) # Can hang indefinitely
```
### ✅ **Correct Implementation:**
```python
# SECURE: Connection pool is initialized once for the entire application
pool = await asyncpg.create_pool(
    DATABASE_URL,
    connect_timeout=5,
    command_timeout=30
)
```
-----
## Architecture of a Secure Connection
### 1\. **Asynchronous Connection Pool**
```python
# Create a single pool at application startup
_db_pool = await asyncpg.create_pool(dsn=DATABASE_URL, ...)
# Acquire and release connections automatically
async with _db_pool.acquire() as conn:
    await conn.execute(...)
```
**Why:** A pool is essential for efficiency in asynchronous applications. It manages connections, reduces overhead, and is the standard for high-traffic apps.
### 2\. **SSL Runtime Verification**
```python
# Check at runtime if SSL is active
ssl_status = await conn.fetchval("SELECT CASE WHEN ssl THEN 'active' ELSE 'INACTIVE' END FROM pg_stat_ssl WHERE pid = pg_backend_pid()")
if ssl_status != 'active':
    raise RuntimeError("SSL required but not active")
```
**Why:** DSN parameters can fail; a runtime check is mandatory to prevent security breaches.
### 3\. **Cloud-Optimized Timeouts**
```python
connect_timeout=5,        # Connection establishment
keepalives_idle=60,       # Keep-alive for cloud latency
command_timeout=30        # Query timeout (30s)
```
**Why:** Cloud connections have higher latency and can be unstable. Timeouts protect against hanging connections and DoS attacks.
### 4\. **Production Error Sanitization**
```python
if os.getenv('APP_ENV') == 'production':
    logger.error(f"Database query failed [Code: {e.sqlstate}]")
else:
    logger.error(f"Query failed [{e.sqlstate}]: {e}")
```
**Why:** Prevents information leakage about your database structure to end-users.
-----
## Security Layering
### **Layer 1: Transport Security**
- **SSL/TLS Encryption** with `sslmode=require` minimum
- **Certificate Validation** for sensitive data
- **Connection Timeouts** to protect against DoS
### **Layer 2: Authentication**
- **Environment Variables** for Credentials
- **Application Name** for connection tracking
- **Cloud Secret Management** (HF Secrets, Railway Vars)
### **Layer 3: Query Security**
- **Parameterized Queries** exclusively using `$1, $2, ...`
- **Statement Timeouts** against long-running queries
- **Connection Cleanup** via pool management
### **Layer 4: Monitoring & Logging**
- **SSL Status Verification** on every connection
- **Error Sanitization** in Production
- **Cloud Provider Detection** for debugging
-----
## Cloud-Specific Considerations
### **HuggingFace Spaces**
```bash
# Set as a Secret:
DATABASE_URL="postgresql://user:pass@host.neon.tech/db?sslmode=require&application_name=hf_space"
```
### **Railway/Render**
```bash
# As an Environment Variable:
DATABASE_URL="postgresql://user:pass@host/db?sslmode=require&connect_timeout=10"
```
### **Why `sslmode=require` instead of `verify-full`?**
- ✅ Cloud providers (Neon, Supabase) handle their own CA-Chains
- ✅ Avoids certificate issues in ephemeral containers
- ✅ Sufficient for managed databases
-`verify-full` requires local certificate files (often not available in cloud)
-----
## 📊 Security Assessment
| Security Aspect | Status | Rationale |
|-------------------|--------|------------|
| **SSL Enforcement** | ✅ Excellent | Runtime verification + fail-safe |
| **Credential Management** | ✅ Excellent | Environment variables only |
| **SQL Injection Prevention** | ✅ Excellent | Parameterized queries only |
| **DoS Protection** | ✅ Excellent | Connection + statement timeouts |
| **Information Leakage** | ✅ Excellent | Production error sanitization |
| **Connection Pooling** | ✅ Excellent | Implemented with `asyncpg.create_pool` |
**Security Score: 10/10** - Production-ready for cloud environments
-----
## 🔧 Troubleshooting
### **`psycopg.OperationalError: could not connect to server: Connection refused`**
- **Cause:** The `DATABASE_URL` is incorrect, the database is not running, or network ports are blocked.
- **Solution:** Verify your `DATABASE_URL` environment variable and ensure the database service is active and accessible from your application's network.
### **`RuntimeError: SSL connection failed`**
- **Cause:** Your application connected to the database, but SSL was not active, failing the runtime check. This could be due to a misconfigured `sslmode` in the `DATABASE_URL` or an issue with the cloud provider's setup.
- **Solution:** Check your `DATABASE_URL` to ensure `sslmode=require` or a more secure setting is present and correctly enforced.
### **`asyncpg.exceptions.PostgresError: connection terminated...` (Neon.tech)**
- **Cause:** A specific issue with how Neon.tech handles connections. The connection is terminated after a period of inactivity.
- **Solution:** Our code includes a specific check for this state and automatically restarts the pool, but it is important to understand why it happens.
### **`ValueError: DATABASE_URL environment variable must be set`**
- **Cause:** The `os.getenv("DATABASE_URL")` call returned `None`.
- **Solution:** Make sure your `DATABASE_URL` is correctly set in your environment variables or as a secret in your cloud provider's dashboard.
-----
## Quick Start for Cloud Deployment
### 1\. **Environment Setup**
```bash
# In your cloud provider dashboard:
DATABASE_URL="postgresql://user:strongpass@host.provider.com/dbname?sslmode=require&connect_timeout=10"
```
### 2\. **Code Integration**
```python
from secure_pg_connection import init_db_pool, health_check, execute_secured_query
# At application startup
await init_db_pool()
# Later, check the connection and run a query
if (await health_check())['status'] == 'ok':
    users = await execute_secured_query("SELECT * FROM users WHERE status = $1", 'active', fetch_method='fetch')
```
### 3\. **Production Checklist**
- [x] `APP_ENV=production` is set
- [x] SSL mode is at least `require`
- [x] Database URL is a Secret/EnvVar
- [x] All timeouts are configured
- [x] Error logging is enabled
-----
## Conclusion
This implementation provides a **Defense-in-Depth** strategy for PostgreSQL connections in cloud environments:
1. **Secure Defaults** - SSL required, timeouts active
2. **Runtime Verification** - SSL status is checked
3. **Cloud-Optimized** - Designed for ephemeral containers
4. **Production-Ready** - Error sanitization, monitoring
**Result:** Production-grade database connections that remain secure even with network issues, SSL misconfigurations, or attacks.