Chinese tech giant Kuaishou has unveiled Kling O1 (Omni One), billed as the first unified multimodal model for video and designed to simplify working with video content.
The main innovation of O1 is that it combines text-to-video generation, image-to-video creation, and advanced editing in a single pipeline, so there is no more switching between different modes.
The model supports up to seven task types, including object addition and removal, stylization, and smooth transitions between specified start and end frames.
Key Features:
• Text-based Editing: Changes can be made using simple text commands such as “replace the cat with a wolf” or “change day to sunset.” O1 understands these requests without requiring manual object selection.
• Consistency: The model can process up to 10 reference images while maintaining consistent characters and scenes across generated frames.
• Quality: Videos are produced at up to 1080p resolution, 30 frames per second, and up to 10 seconds in length (extendable to 2 minutes).
• Audio: Native sound generation is included, with audio synchronized to on-screen action (e.g., rain sounds for a rainy scene).
By handling generation and editing in one model, Kling O1 is meant to speed up and simplify the production of high-quality video content.
