# Migration from Laravel OffersExtractor to Flask API

## What Changed

### Before (Laravel-based)

```
OffersExtractor/                    ← Full Laravel app
├── app/
│   └── Http/Controllers/
│       └── OffersController.php    ← PHP controller calling Python script
├── routes/
│   └── web.php                     ← Laravel routes
├── python_scripts/
│   ├── run_extraction.py           ← Python script (CLI)
│   └── wrapper.py                  ← Python wrapper
├── composer.json                   ← PHP dependencies
└── ... (full Laravel structure)

Flow: Frontend → Laravel Route → PHP Controller → shell_exec() → Python Script → JSON output
```

**Issues:**
- ❌ Heavy Laravel dependency for simple task
- ❌ PHP calling Python via shell_exec (not ideal)
- ❌ Difficult to scale independently
- ❌ Complex deployment (need PHP + Laravel + Python)
- ❌ Timeout issues with shell_exec
- ❌ Hard to monitor Python processes

---

### After (Flask API)

```
OffersExtractorFlask/              ← Standalone Flask app
├── app.py                         ← Main Flask application
├── wsgi.py                        ← Production WSGI entry
├── requirements.txt               ← Python dependencies only
├── .env                           ← Configuration
├── deploy.sh                      ← Automated deployment
├── supervisor.conf                ← Process manager config
├── nginx.conf                     ← Reverse proxy config
└── test_service.py               ← Testing utilities

Flow: Frontend → Laravel Proxy → Flask API → JSON response
```

**Benefits:**
- ✅ Pure Python service (no PHP/Laravel overhead)
- ✅ RESTful API with proper HTTP methods
- ✅ Independent scaling and deployment
- ✅ Production-ready (Gunicorn + Nginx + Supervisor)
- ✅ Better error handling and logging
- ✅ Easy to monitor and manage
- ✅ Proper timeouts and async support
- ✅ Can deploy to separate server/domain

---

## Architecture Comparison

### Old Architecture

```
┌──────────┐     HTTP      ┌──────────────┐    shell_exec    ┌──────────────┐
│ Frontend │ ────────────> │ Laravel App  │ ───────────────> │ Python Script│
│ (Angular)│               │ (Port 8000)  │                  │ (CLI)        │
└──────────┘               └──────────────┘                  └──────────────┘
                                   │
                                   └─> uploads/temp.pdf
                                   └─> Process cleanup
```

**Problems:**
- Single point of failure (Laravel handles everything)
- PHP waiting for Python to finish (blocking)
- Temp file management in Laravel
- Hard to scale (need to scale entire Laravel app)

---

### New Architecture

```
┌──────────┐    HTTP     ┌──────────────┐    HTTP Proxy    ┌──────────────┐
│ Frontend │ ──────────> │ Laravel API  │ ───────────────> │ Flask API    │
│ (Angular)│             │ (Proxy Only) │                  │ (Port 5100)  │
└──────────┘             └──────────────┘                  └──────────────┘
                              │  ▲                               │
                              └──┘                               │
                         Transform Response                      │
                                                          ┌───────▼──────┐
                                                          │ Gunicorn     │
                                                          │ (4 workers)  │
                                                          └──────────────┘
```

**Benefits:**
- Laravel acts as simple proxy (no heavy processing)
- Flask handles all PDF processing
- Can scale Flask independently (add more workers/servers)
- Clean separation of concerns
- Easy to add load balancing later

---

## Code Comparison

### Before: Laravel Controller

```php
// OffersExtractor/app/Http/Controllers/OffersController.php
public function extract(Request $request)
{
    $file = $request->file('pdf_file');
    $tempPath = public_path('uploads/' . $filename);
    $file->move($uploadPath, $tempFilename);
    
    // Shell exec (blocking, no proper error handling)
    $command = escapeshellcmd($pythonExecutable) . ' ' . 
               escapeshellarg($scriptPath) . ' ' . 
               escapeshellarg($pdfPath) . ' 2>&1';
    
    set_time_limit(300);
    $output = shell_exec($command);
    
    // Manual cleanup
    if (file_exists($filePath)) {
        unlink($filePath);
    }
    
    return response()->json($result);
}
```

**Issues:**
- `shell_exec()` blocks PHP process
- Manual timeout management (`set_time_limit`)
- Manual file cleanup
- No retry logic
- Limited error handling

---

### After: Flask App + Laravel Proxy

**Flask API:**
```python
# OffersExtractorFlask/app.py
@app.route('/extract_offers', methods=['POST'])
def extract_offers():
    file = request.files['pdf']
    
    # Proper file validation
    if not allowed_file(file.filename):
        return jsonify({"error": "Invalid file type"}), 400
    
    # Secure temp file handling
    temp_path = os.path.join(UPLOAD_FOLDER, secure_filename(file.filename))
    file.save(temp_path)
    
    try:
        # Extract offers (pure Python)
        offers = extract_offers_from_pdf(temp_path)
        return jsonify({"offers": offers, "count": len(offers)}), 200
    
    finally:
        # Automatic cleanup (even on error)
        os.remove(temp_path)
```

**Laravel Proxy:**
```php
// app/Http/Controllers/Api/V1/ExtractorController.php
public function extract(Request $request): JsonResponse
{
    $file = $request->file('pdf');
    $baseUrl = config('services.extractor.url');  // https://extractor.stagi-edu.com
    
    // Simple HTTP proxy
    $response = Http::timeout(60)
        ->asMultipart()
        ->attach('pdf', file_get_contents($file->getRealPath()), $file->getClientOriginalName())
        ->post($baseUrl . '/extract_offers');
    
    // Transform response for frontend
    return response()->json([
        'success' => true,
        'data' => $response->json()['offers'],
        'count' => $response->json()['count'],
    ]);
}
```

**Benefits:**
- Non-blocking HTTP call (async)
- Proper error handling
- Clean separation of concerns
- Easy to test independently
- Flask handles temp files
- Laravel just proxies

---

## Deployment Comparison

### Before: Laravel Deployment

```bash
# Need to deploy entire Laravel app
git clone <repo> OffersExtractor
cd OffersExtractor
composer install
php artisan migrate
php artisan serve

# Also need Python environment
pip install -r python_scripts/requirements.txt

# Complex: PHP + Python on same server
```

---

### After: Flask Deployment

```bash
# Simple standalone service
git clone <repo> offers-extractor
cd offers-extractor
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt

# Production ready
gunicorn --workers 4 --bind 127.0.0.1:5100 wsgi:app

# Or automated:
sudo ./deploy.sh
```

**Benefits:**
- Can deploy to separate server
- Independent scaling
- Easier monitoring (one service)
- Standard Python deployment

---

## Monitoring Comparison

### Before:
```bash
# Check Laravel logs
tail -f storage/logs/laravel.log

# No way to monitor Python script separately
# No way to see if Python process hung
# No metrics on PDF processing time
```

### After:
```bash
# Application logs
tail -f /var/log/offers-extractor/error.log
tail -f /var/log/offers-extractor/access.log

# Service status
sudo supervisorctl status offers-extractor

# Nginx metrics
tail -f /var/log/nginx/extractor.stagi-edu.com.access.log

# Health endpoint
curl https://extractor.stagi-edu.com/health
```

---

## Configuration Comparison

### Before:
```php
// config/app.php
'python_executable' => env('PYTHON_EXECUTABLE', 'D:/Conda/python.exe'),
```

Need to configure Python path in Laravel.

### After:
```env
# OffersExtractorFlask/.env
DEEPSEEK_API_KEY=sk-xxx
FLASK_ENV=production
PORT=5100
```

```env
# Laravel .env (just the URL)
EXTRACTOR_URL=https://extractor.stagi-edu.com
```

Clean separation!

---

## Security Comparison

### Before:
- ❌ Shell exec vulnerable to command injection
- ❌ Temp files in public directory
- ❌ No proper timeout handling
- ❌ Limited input validation

### After:
- ✅ No shell execution
- ✅ Secure temp file handling (system temp dir)
- ✅ Proper HTTP timeouts
- ✅ File type validation
- ✅ Size limits enforced
- ✅ HTTPS with Let's Encrypt
- ✅ Security headers (Nginx)

---

## Scalability Comparison

### Before:
To scale, you need to:
1. Scale entire Laravel application
2. Ensure Python environment on all servers
3. Share uploads directory
4. Complex load balancing

### After:
To scale, you can:
1. Add more Gunicorn workers: `--workers 8`
2. Deploy multiple Flask instances
3. Load balance at Nginx level
4. Scale Flask independently from Laravel
5. Use managed services (AWS ECS, Google Cloud Run, etc.)

**Example multi-instance:**
```
┌─────────┐     ┌────────┐     ┌──────────┐
│ Nginx   │────>│ Flask 1│     │ Laravel  │
│ LB      │     │:5100   │<────│ (Proxy)  │
│         │────>│ Flask 2│     └──────────┘
│         │     │:5101   │
│         │────>│ Flask 3│
└─────────┘     │:5102   │
                └────────┘
```

---

## Summary

| Aspect | Before (Laravel) | After (Flask) |
|--------|------------------|---------------|
| **Deployment** | Complex (PHP + Python) | Simple (Pure Python) |
| **Scaling** | Scale entire app | Scale independently |
| **Monitoring** | Limited | Comprehensive |
| **Security** | Shell exec risks | Pure HTTP API |
| **Performance** | Blocking calls | Async HTTP |
| **Maintenance** | Mixed codebase | Clean separation |
| **Testing** | Difficult | Easy (standalone) |
| **Production** | Not ideal | Production-ready |

---

## Migration Impact

### What Changed:
- ✅ Created new Flask app
- ✅ Laravel now acts as proxy only
- ✅ Configuration simplified

### What Stayed the Same:
- ✅ Frontend code (no changes!)
- ✅ API endpoint path
- ✅ Request/response format
- ✅ DeepSeek integration logic
- ✅ PDF processing algorithm

### What's Better:
- ✅ Independent deployment
- ✅ Better error handling
- ✅ Easier to scale
- ✅ Production-ready
- ✅ Better monitoring
- ✅ Cleaner architecture

---

## Next Steps

1. ✅ **Local Testing**: Already working on port 5100
2. ⏳ **Deploy to Production**: Run `deploy.sh` on server
3. ⏳ **Update Laravel**: Change `EXTRACTOR_URL` to production
4. ⏳ **DNS**: Point `extractor.stagi-edu.com` to server
5. ⏳ **SSL**: Setup Let's Encrypt certificate
6. ✅ **Test**: End-to-end testing via Laravel proxy

**The old Laravel OffersExtractor can be retired after successful deployment!**
