OpenAI on Thursday unveiled a tool that can generate videos from text prompts.
Dubbed “Sora,” the new model can create realistic footage up to a minute long that follows a user’s instructions on subject matter and style. According to a company blog post, the model can also create a video from a single still image or extend existing footage with new material.
According to the blog post, OpenAI aims to train AI models to understand and simulate real-world dynamics, with the goal of helping people solve problems that require interaction with the physical environment.
Among the initial examples provided by the company is a video based on the prompt: “A movie trailer depicting the exploits of a 30-year-old spaceman donning a red wool knitted motorcycle helmet, set against a backdrop of blue sky and salt desert, employing cinematic style, shot on 35mm film, with vibrant colors.”
OpenAI has granted access to Sora to a select group of researchers and video creators, who will test whether the product’s output complies with OpenAI’s terms of service. Those terms prohibit “extreme violence, sexual content, hateful imagery, celebrity likeness, or the IP of others,” according to the company’s blog post. Although access is limited, CEO Sam Altman has responded to user requests on Twitter with video clips he said were generated by Sora, each carrying a watermark indicating that it was made by AI.
OpenAI introduced the still-image generator DALL-E in 2021 and followed it in November 2022 with the generative AI chatbot ChatGPT, which quickly gained 100 million users. Other AI companies have also unveiled video generation tools, but those models have typically produced only brief snippets of footage that often bear little relation to their prompts. Google and Meta have both said they are developing generative video tools, though neither has made one available to the public. OpenAI also recently announced an experiment that adds deeper memory to ChatGPT, allowing it to retain more information from users’ conversations.
OpenAI has not disclosed how much footage was used to train Sora or where it came from, other than telling the New York Times that the dataset included publicly available videos as well as videos licensed from copyright owners. The company has faced several lawsuits alleging copyright infringement in the training of its generative AI tools, which ingest vast amounts of data scraped from the internet and imitate the images or text found in those datasets.