adjust the resolution based on available VRAM. add elapsed time.
29
AGENTS.MD
@@ -11,13 +11,13 @@ The owner (user) is not an ML expert. The system must:
 ---

 ## 1) High-Level Goal
-Build a local pipeline that converts **text-only storyboards** into **15–30 second videos** by:
+Build a local pipeline that converts text-only storyboards into 15-30 second videos by:
 1) converting storyboard -> shot plan
 2) generating shot clips (T2V or I2V when possible)
 3) assembling clips into a final MP4
 4) upscaling to 2K/4K if desired

-This is a **shot-based** system, not “one prompt makes a whole movie”.
+This is a shot-based system, not "one prompt makes a whole movie".

 ---
@@ -38,9 +38,9 @@ Design must be stable under 12GB VRAM using:
 ---

 ## 3) Output Targets (Realistic)
-- Native generation: 720p–1080p (preferred)
+- Native generation: 720p-1080p (preferred)
 - Final delivery: 1080p required; 2K/4K via upscaling
-- Duration: 15–30s per video (may be segmented)
+- Duration: 15-30s per video (may be segmented)
 - FPS: 24 default
 - Output: MP4 (H.264/H.265)
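The commit title mentions adjusting resolution to the available VRAM, and the targets above give a 720p-1080p native range. A minimal sketch of how such a choice might look follows; the tier thresholds and the names `RESOLUTION_TIERS`, `pick_resolution`, and `free_vram_gib` are hypothetical and are not taken from this commit. Only `torch.cuda.mem_get_info()` is a real PyTorch call.

```python
from typing import Optional, Tuple

# Hypothetical tiers: (minimum free VRAM in GiB, (width, height)).
# The thresholds are placeholders for illustration, not measured values.
RESOLUTION_TIERS = [
    (10.0, (1920, 1080)),
    (0.0, (1280, 720)),
]


def pick_resolution(free_gib: float) -> Tuple[int, int]:
    """Return the largest tier whose threshold fits in the free VRAM."""
    for min_gib, size in RESOLUTION_TIERS:
        if free_gib >= min_gib:
            return size
    # Fall back to the smallest tier for nonsensical (negative) input.
    return RESOLUTION_TIERS[-1][1]


def free_vram_gib() -> Optional[float]:
    """Query free VRAM on the current CUDA device, or None if unavailable."""
    try:
        import torch  # imported lazily so the sketch runs without PyTorch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    free_bytes, _total_bytes = torch.cuda.mem_get_info()
    return free_bytes / 1024**3


if __name__ == "__main__":
    free = free_vram_gib()
    # Without a usable GPU, assume the lowest tier.
    print(pick_resolution(free if free is not None else 0.0))
```

Keeping the tier table separate from the query makes the selection logic testable without a GPU.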
@@ -50,13 +50,15 @@ Design must be stable under 12GB VRAM using:
 User has CUDA Toolkit 13.1 installed. Current PyTorch builds generally ship with and target CUDA 12.x runtimes.
 We must NOT assume PyTorch will build/run against local CUDA 13.1 toolkit.

-**Plan:**
-- Use **PyTorch prebuilt binaries that bundle CUDA runtime** (e.g., cu121 / cu124).
+Plan:
+- Use PyTorch prebuilt binaries that bundle CUDA runtime (cu121/cu124/cu128).
 - Rely on NVIDIA driver compatibility rather than local CUDA toolkit version.
 - Avoid compiling custom CUDA extensions unless necessary.

 Implementation notes:
-- Prefer installing PyTorch via conda or pip using official CUDA 12.x builds.
+- For RTX 5070 (sm_120), use CUDA 12.8 wheels via pip:
+  `pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128`
+- Prefer conda for Python, ffmpeg, and general deps; use pip for torch if sm_120 support is required.
 - If xFormers causes build issues, use PyTorch SDPA and disable xFormers.

 ---
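Since the plan relies on the CUDA runtime bundled with the PyTorch wheel rather than the local toolkit, it may help to confirm what the installed build actually reports. The sketch below is one way to do that; the function name `describe_torch_cuda` and the dictionary keys are hypothetical, while the `torch` attributes it reads are standard PyTorch calls.

```python
def describe_torch_cuda() -> dict:
    """Collect the CUDA details reported by the installed PyTorch build."""
    try:
        import torch  # imported lazily so the check also runs without PyTorch
    except ImportError:
        return {"torch_installed": False}

    info = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA runtime version the wheel was built with; None on CPU-only builds.
        "bundled_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
    }
    if info["cuda_available"]:
        major, minor = torch.cuda.get_device_capability(0)
        info["device_name"] = torch.cuda.get_device_name(0)
        info["compute_capability"] = f"sm_{major}{minor}"
        # Architectures this build was compiled for; the device's
        # capability should appear here (for example sm_120).
        info["compiled_archs"] = torch.cuda.get_arch_list()
    return info


if __name__ == "__main__":
    for key, value in describe_torch_cuda().items():
        print(f"{key}: {value}")
```

If `compute_capability` is missing from `compiled_archs`, the wheel probably does not target the installed GPU, which is the situation the cu128 note above is meant to avoid.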
@@ -64,12 +66,13 @@ Implementation notes:
 ## 5) Approved Stack (Do Not Deviate)
 ### Core
 - Python 3.10 or 3.11 (conda env)
-- PyTorch (CUDA 12.x build: cu121 or cu124)
+- PyTorch (CUDA 12.x build, cu121/cu124/cu128)
 - diffusers + transformers + accelerate + safetensors
 - ffmpeg for assembly
 - opencv-python for frame IO (if needed)
 - pydantic for config/schema validation
 - rich / loguru for logs
+- ftfy for text normalization (required by WAN)

 ### Testing
 - pytest
@@ -124,7 +127,7 @@ We will later build a utility script:
 - For each shot: generate clip
 - Support:
   - seed control
-  - chunking (e.g., generate 4–6 seconds then continue)
+  - chunking (e.g., generate 4-6 seconds then continue)
   - optional init frame handoff between shots

 ### D) Assembly
@@ -149,7 +152,7 @@ For each shot and final render, save:
 - timing + VRAM notes if possible

 Every run produces a folder:
-- outputs/<project>/<timestamp>/
+- outputs/<project>/
   - shots/
   - assembled/
   - metadata/
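The folder layout after this change can be sketched with `pathlib`. This is a minimal illustration, assuming only the three subfolders visible in this hunk; the names `SUBDIRS` and `make_run_dirs` are hypothetical, and the real layout may include entries that the diff context does not show.

```python
from pathlib import Path
from typing import Dict

# Subfolders visible in the hunk above; the full list may be longer.
SUBDIRS = ("shots", "assembled", "metadata")


def make_run_dirs(root: Path, project: str) -> Dict[str, Path]:
    """Create outputs/<project>/ with its subfolders and return their paths."""
    run_dir = root / "outputs" / project
    paths = {"run": run_dir}
    for name in SUBDIRS:
        sub = run_dir / name
        # exist_ok=True makes repeated runs for the same project idempotent.
        sub.mkdir(parents=True, exist_ok=True)
        paths[name] = sub
    return paths


if __name__ == "__main__":
    for label, path in make_run_dirs(Path("."), "demo").items():
        print(label, path)
```

Without the `<timestamp>` level, repeated runs for one project share a folder, so anything that writes there needs to tolerate existing files.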
@@ -166,7 +169,7 @@ Every run produces a folder:
 - assembly command lines are correct
 - metadata is generated correctly

-Do not require “visual quality” assertions. Test structure and determinism.
+Do not require visual quality assertions. Test structure and determinism.

 ---
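A structural test of the kind described above might check the assembly command line without running ffmpeg. The sketch below assumes assembly uses ffmpeg's concat demuxer with stream copy; that choice, and the names `build_concat_command` and `test_concat_command_is_deterministic`, are hypothetical and not confirmed by this commit.

```python
from pathlib import Path
from typing import List


def build_concat_command(list_file: Path, output: Path) -> List[str]:
    """Build an ffmpeg concat-demuxer command as an argument list.

    Stream copy (-c copy) only works when all clips share codec
    parameters; re-encoding would need different flags.
    """
    return [
        "ffmpeg",
        "-y",              # overwrite the output file if it exists
        "-f", "concat",    # use the concat demuxer
        "-safe", "0",      # accept arbitrary paths in the list file
        "-i", list_file.as_posix(),
        "-c", "copy",      # copy streams without re-encoding
        output.as_posix(),
    ]


def test_concat_command_is_deterministic() -> None:
    """pytest-style check: same inputs give the same, well-formed command."""
    first = build_concat_command(Path("clips.txt"), Path("final.mp4"))
    second = build_concat_command(Path("clips.txt"), Path("final.mp4"))
    assert first == second
    assert first[0] == "ffmpeg"
    assert first[-1] == "final.mp4"


if __name__ == "__main__":
    test_concat_command_is_deterministic()
    print(" ".join(build_concat_command(Path("clips.txt"), Path("final.mp4"))))
```

Returning a list rather than a shell string keeps the command easy to assert on and avoids quoting problems when it is passed to `subprocess.run`.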
@@ -201,10 +204,10 @@ Required:
 ---

 ## 13) Definition of Done
-A feature is “done” only if:
+A feature is "done" only if:
 - implemented
 - tests added/updated
 - docs updated
 - reproducible install instructions remain valid

-End of file.
+End of file.