Phenaki is a model that generates realistic videos from textual prompts. The prompts can change over time, and the generated video can be of any desired length. A tokenizer with causal attention in time compresses videos into discrete tokens, and a bidirectional masked transformer generates video tokens from text; those tokens are then decoded back into video. Because it can be conditioned on a sequence of prompts, Phenaki is described by its authors as the first work to study video generation from time-variable prompts. Compared with per-frame baselines, its video tokenizer uses fewer tokens per video while achieving better spatio-temporal consistency.
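To make "causal attention in time" concrete, here is a minimal sketch of the masking idea in plain Python. It is an illustration only, not Phenaki's implementation: the function name `causal_time_mask`, the flat frame-by-frame token layout, and the tiny sizes are all assumptions made for this example.

```python
def causal_time_mask(num_frames, tokens_per_frame):
    """Return mask[q][k] == True when query token q may attend to key token k.

    Assumed layout (hypothetical): tokens are stored frame by frame, so
    token i belongs to frame i // tokens_per_frame. A token may see every
    token in its own frame and in earlier frames, but none in later frames.
    """
    total = num_frames * tokens_per_frame
    frame_of = [i // tokens_per_frame for i in range(total)]
    return [[frame_of[k] <= frame_of[q] for k in range(total)]
            for q in range(total)]


# Toy example: 3 frames with 2 tokens each.
mask = causal_time_mask(num_frames=3, tokens_per_frame=2)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
# prints:
# xx....
# xx....
# xxxx..
# xxxx..
# xxxxxx
# xxxxxx
```

With a mask of this shape, appending frames never changes what earlier frames can see, which is the property that lets a causal-in-time tokenizer handle videos of varying length.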
Features
- Generates videos from textual prompts
- Allows for dynamic prompts that can change over time
- Can generate videos of any desired length
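The combination of changing prompts and open-ended length can be pictured as a loop that extends a token sequence chunk by chunk while switching the active prompt. The sketch below is a toy under stated assumptions, not Phenaki's API. The names `generate_chunk` and `generate_story`, the parameters `tokens_per_chunk` and `context_size`, the vocabulary size of 1024, and the example prompts are all hypothetical. The text above does not say how the model extends a video internally, so the chunk-wise scheme is also an assumption.

```python
import random


def generate_chunk(prompt, context_tokens, num_new_tokens, rng):
    """Hypothetical stand-in for a text-conditioned token generator.

    A real model would predict tokens from `prompt` and `context_tokens`;
    this stub ignores both and returns random token ids so the loop runs.
    """
    return [rng.randrange(1024) for _ in range(num_new_tokens)]


def generate_story(prompt_schedule, tokens_per_chunk=8, context_size=4, seed=0):
    """Extend a token sequence chunk by chunk, switching prompts over time.

    prompt_schedule: list of (prompt, num_chunks) pairs in playback order.
    Returns the full token list and the prompt used for each chunk.
    """
    rng = random.Random(seed)
    tokens, chunk_prompts = [], []
    for prompt, num_chunks in prompt_schedule:
        for _ in range(num_chunks):
            # The most recent tokens are passed along to carry continuity.
            context = tokens[-context_size:]
            tokens += generate_chunk(prompt, context, tokens_per_chunk, rng)
            chunk_prompts.append(prompt)
    return tokens, chunk_prompts


schedule = [
    ("a teddy bear swimming in the ocean", 2),
    ("the teddy bear walks onto the beach", 1),
]
tokens, chunk_prompts = generate_story(schedule)
print(len(tokens), chunk_prompts)
```

Total length is set only by the schedule, so a longer video means more entries or more chunks per prompt, not a different model.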
Use Cases
- Creating videos from text descriptions
- Generating videos for storytelling or visualization purposes
- Producing video content based on textual prompts
Suited For
- Content creators and storytellers
- Researchers in the field of video generation
- Anyone interested in generating videos from text prompts
FAQ
What is Phenaki?
Phenaki is a model for generating videos from text prompts, with the ability to change prompts over time and create videos of any desired length.
How does Phenaki work?
Phenaki uses a tokenizer with causal attention in time to compress videos into discrete tokens, and a bidirectional masked transformer to generate video tokens from text.
What can Phenaki be used for?
Phenaki can be used to create videos from text descriptions and to generate videos for storytelling or visualization purposes.
Who is Phenaki suited for?
Phenaki is well-suited for content creators and storytellers, researchers in the field of video generation, and anyone interested in generating videos from text prompts.