AI News

Hugging Face, MIT Develop Text And Image-based Video Generator

Philipp Schmid, a technical lead at open-source AI community platform Hugging Face, took to X to announce a new AI model. Dubbed Pyramid Flow SD3, the 2-billion-parameter Diffusion Transformer (DiT) can generate videos of up to 10 seconds from text or image inputs.

The open-source model will be MIT-licensed, as the developers aim to foster wide adoption among creative professionals and developers. Moreover, Pyramid Flow SD3 offers both text-to-video and image-to-video capabilities, with Schmid describing it in his post as the first “real good open-source text-to-video model.”

Pyramid Flow SD3 uses a technique called Flow Matching to improve training efficiency, consuming significantly fewer resources than traditional video diffusion models. Additionally, a simple two-step integration process should make the model more accessible to developers in Hugging Face’s community.

The video generation model, trained purely on open-source datasets, offers three options:

  • 384p resolution for videos of up to 5 seconds at 24 FPS
  • 768p resolution for videos of up to 10 seconds at 24 FPS
  • Native image-to-video generation

The team released the technical report, project page and open-source model checkpoint today; the training code and new model checkpoints will be made available soon. Users and developers are already excited, among them the founder of artul.ai.

Aman Dasgupta
Aman is an experienced content marketer and strategist with expertise in technology, finance and marketing. With an engineering background, he aims to simplify the latest news and trends in technology for digital audiences. Having worked with leading tech businesses in AI/ML, data science, AR/VR and Web 3.0, Aman helps decision-makers stay up-to-date and informed on everything technology.