Imaginario AI https://imaginario.ai Search, convert and publish videos with AI

Part 2 – Enterprise AI in Media: Use Cases and the Challenge of Context in the Agentic Future https://imaginario.ai/wide-lens/artificial-intelligence/part-2-enterprise-ai-in-media-use-cases-and-the-challenge-of-context-in-the-agentic-future/ Mon, 15 Dec 2025 16:21:56 +0000 https://imaginario.ai/?p=2087

‘Constrained agency’ is becoming the rule in Enterprise: AI handles the heavy lifting, while humans make the decisions. Context is the next challenge.

This is the second of a three-part post about key insights and findings I came across while attending the Digital Production Partnership (DPP) Leaders’ Briefing 2025 in London. The DPP is one of the most important trade associations in the media industry, bringing together senior executives from major media organizations. More than 1,000 attendees came to the conference, held on November 18 and 19.

The transformation from AI demos to production systems is underway across media and entertainment. But the more interesting story isn’t that AI has arrived, it’s where it’s actually working, where it’s struggling, and what infrastructure changes are required to unlock the next wave of value.

Where AI Is Delivering Real Value in the Media Enterprise Space

Use Case 1: Ingest and Metadata. The Highest-Leverage Point for Automation

High-quality, multimodal metadata at the point of ingestion lies at the heart of building intelligent and robust AI systems further downstream in the media supply chain.

The standout finding from recent industry surveys such as the DPP’s CEO and CTO surveys is that metadata generation and enrichment, particularly speech-to-text and automated speech recognition, has crossed the adoption chasm. More than 80% of organizations are either encouraging or actively implementing these capabilities.

The logic is sound. The oldest rule in data engineering applies: garbage in, garbage out. If your metadata is incomplete, inconsistent, or wrong, everything downstream suffers. Search breaks. Rights management fails. Compliance becomes manual. Smart organizations have realized that the front door of the supply chain is the highest-leverage point for automation.

The DPP’s Media AI Radar 2026, based on industry interviews, forecasts higher adoption of AI in ideation, planning, analytics, commissioning, compliance, video marketing, advertising, and ingest/logging.

The more sophisticated implementations involve what’s being called “agentic orchestration”: AI systems that don’t just transcribe content but actively monitor ingestion workflows for anomalies.

For example, a file arrives with metadata that doesn’t match expected formats or the level of depth and accuracy required. Maybe the localization markers are wrong, or the language code contradicts the filename. Traditional workflows would let that error flow downstream until a human spotted it.

With agentic orchestration applied to QC, the system flags the mismatch immediately, provides context, and prompts investigation. A person remains in charge, operating with better information and recommendations earlier in the process.
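To make that concrete, here is a minimal sketch in Python of the kind of consistency check such a QC monitor might run at ingest. The manifest fields and the filename convention are hypothetical illustrations, not a real MAM schema.

```python
import re

# Hypothetical ingest-QC check: flag assets whose declared language code
# contradicts the locale tag embedded in the filename (e.g. "_es-ES_").
# Field names and naming conventions here are illustrative only.
LOCALE_IN_FILENAME = re.compile(r"_([a-z]{2}-[A-Z]{2})_")

def check_language_consistency(asset: dict) -> list[str]:
    """Return human-readable warnings rather than silently 'fixing' metadata."""
    warnings = []
    match = LOCALE_IN_FILENAME.search(asset["filename"])
    if match and match.group(1) != asset["language_code"]:
        warnings.append(
            f"{asset['filename']}: metadata says '{asset['language_code']}' "
            f"but the filename suggests '{match.group(1)}' - please review."
        )
    return warnings

if __name__ == "__main__":
    incoming = {"filename": "ep102_es-ES_master.mxf", "language_code": "fr-FR"}
    for warning in check_language_consistency(incoming):
        print(warning)  # surfaced to a human operator, not auto-corrected
```

The design choice mirrors the constrained-agency pattern discussed later: the system raises a warning with context and leaves the correction to a person.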

However, as we will see later in this post, agentic AI doesn’t just need highly accurate metadata but also ways to access content and data silos in a more fluent, robust, and secure way. This remains a challenge for large organizations.

Use Case 2: Localization, Subtitling, and Dubbing. Humans and AI Together Produce the Best Output

AI is heavily disrupting the localization industry. Yet, the media industry continues to maintain a high bar for quality dubbing and audio descriptions. Humans and AI together produce the best output.

Subtitling, captions, and synthetic dubbing are seeing strong adoption (especially the first two), but with an important caveat: only 14% of teams find auto-generated captions “fully usable” without human review, and 75% cite timing and sync issues as their primary complaint.

Bad subtitles aren’t just annoying, they’re trust-destroying. The pattern emerging is AI-assisted drafting plus mandatory human review. Speech-to-text models generate initial captions. LLMs detect sync problems and propose corrections. Linguistic experts review the final files. The AI handles grunt work; humans ensure quality.
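As a rough sketch of the drafting half of that workflow, the snippet below uses the open-source `openai-whisper` package to emit a draft SRT file that a linguist then corrects for timing and wording; the model choice and file names are placeholder assumptions.

```python
import whisper  # pip install openai-whisper

def to_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def draft_srt(audio_path: str, out_path: str) -> None:
    """Produce a *draft* caption file; humans review timing and wording."""
    model = whisper.load_model("base")  # small model purely for illustration
    result = model.transcribe(audio_path)
    lines = []
    for i, seg in enumerate(result["segments"], start=1):
        lines += [str(i),
                  f"{to_timestamp(seg['start'])} --> {to_timestamp(seg['end'])}",
                  seg["text"].strip(),
                  ""]
    with open(out_path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines))

# draft_srt("episode.wav", "episode_draft.srt")  # hypothetical file names
```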

Use Case 3: Marketing and Promotion – Dramatic Time Savings

Beyond the hype. Automated and semi-automated video repurposing is proving to be a growing revenue pipeline for companies with long-form content, such as those in sports, and a key way to monetize deep, high-quality catalogues.

This is where AI delivers some of the most compelling efficiency gains, and it goes well beyond flashy Nano Banana-style text-to-video generative AI. A single piece of premium long-form content (a sports match, a TV episode, etc.) might need dozens of promotional assets tailored for different platforms, audiences, and campaigns.

AI tools can now analyze content, identify emotional beats, extract compelling sequences, and generate draft cuts in minutes. What used to take days happens in hours or minutes. This is one of the use cases driving the highest demand for Imaginario AI, and it cuts across media, entertainment, sports, news, and even SMB content.

In the case of Imaginario AI, we have seen our Enterprise clients save between 50% and 75% of the time spent searching deep catalogues, navigating dailies, and generating social media compilations or single cuts. This is quantifiable ROI.

Use Case 4: Contextual Ad Discovery and Insertions. The Next Frontier in Programmatic Advertising

AI contextual ad discovery understands the nuance of content to place ads in the perfect environment. This enables highly personalized creative that aligns with a user’s immediate mindset and mood, boosting engagement without relying on personal data.

AI tools can analyze narrative structure, detect scene transitions, and suggest non-intrusive ad slots based on emotional pacing. They can also explain their reasoning, which is critical for creative buy-in. The tools that work don’t make autonomous placement decisions. They suggest options, show reasoning, and make overrides easy.

The Maturity Gap: Where We Actually Are

For all this progress, humility and baby steps are still in order. When surveyed by the DPP about media supply chain maturity, 61% of respondents said “developing” and 36% said “maturing.” Only 3% called it “advanced.” Nobody, not a single respondent, said “leading”.

The vast majority of media companies are still developing their cloud, interoperability and AI capabilities. They expect to reach a maturity stage in three years. Sources: DPP CTO Survey 2025.

Projected three years ahead, roughly half expect to be maturing, a third hope for advanced, and just 3% think the industry will be truly leading.

We’re in the early innings. The teams deploying AI today are pioneers, not late adopters. Most organizations are still figuring out use cases, governance, training, and change management. The technology is ahead of organizational readiness.

Co-Pilot, Not Autopilot: The Design Pattern That Works

The highest gains are seen when humans and AI do what they do best: AI handles data processing, automation of repetitive tasks, and generating drafts. Humans use critical judgement, context, creativity, and empathy.

Across trust, integration, governance, security, and perceived technology maturity, concern levels spike whenever AI autonomy increases. People want smart assistants. They do not want self-driving compliance, QC, or editorial decisions.

The pattern that’s working: constrained agency. AI with enough freedom to do meaningful work, but within boundaries that remain legible and controllable by humans. The level of acceptable autonomy varies by use case. Legal compliance has a higher accuracy bar than generating marketing highlights, but the principle holds.

This isn’t because the technology can’t handle more autonomy. It’s because organizations aren’t ready to trust it, and more importantly, because accountability matters. When something goes wrong, someone has to answer for it. That someone needs to be a human.

As one technology vendor put it perfectly during the DPP Leadership Summit last November in London: “AI handles the heavy lifting; humans handle the decisions.”

The Enterprise Challenge: A Fragmented Context

AI implementations often stall because essential enterprise context is locked in fragmented systems and unstructured media formats like video and audio. Making diverse data accessible is critical infrastructure work required for scalable AI success.

For AI agents to be effective in the enterprise, they need enterprise context. That context sits in contracts, financial documents, research, marketing assets, meeting notes, conversations, and every other piece of information across the organization. By volume, most of this data is unstructured and remains both on-premises and in the cloud. There is no one-size-fits-all solution, as each organization is at a different stage of digitalization and cloud development.

Your customer information might live in HubSpot. Projects are in Monday. Engineering issues sit in Jira. Financials are in Xero or SAP. Your product catalog exists in yet another database and in MAM systems. Branding and marketing assets are scattered across DAMs and creative toolkits like Adobe or Avid.

Enterprise context is highly fragmented and requires bespoke integrations, taxonomies, and adapted schemas that ensure data consistency and make it easier for different systems to understand and exchange data. Without this, even the smartest AI still won’t know your organization and your workflows, and it will return inaccurate information and potential hallucinations.

This fragmentation is why so many AI implementations stall after promising pilots. The AI works brilliantly on a single data source but struggles when it needs to compose context from multiple systems to answer real business questions.

And there’s another layer of complexity: a massive portion of enterprise context is locked in complex media formats like video, audio, and images that agents can’t natively process. Making these formats truly legible to AI is essential infrastructure work that most organizations haven’t yet addressed. Until this unstructured content becomes structured and searchable, AI agents will operate with significant blind spots.

The only way AI agents will be successful at scale is if they have access to the right information, in the right structure, at the right time, in a secure and well-governed way. AI will expand the use and value of this information by orders of magnitude over time, but only if the infrastructure exists to make it accessible.

MCP and the Agentic Architecture Shift

A single contextual layer to rule all APIs and bespoke integrations in the Era of AI: Model Context Protocol (MCP). Source: Descope.

This fragmentation problem is exactly why the Model Context Protocol (MCP) ecosystem has become strategically important, even with APIs available.

MCP, originally developed by Anthropic and now donated to the Linux Foundation’s new Agentic AI Foundation, provides a universal, open standard for connecting AI applications (MCP clients) to external databases, servers, and third-party systems. It’s not just about connecting agents to a single system; it’s about composing context from all the systems that collectively hold your enterprise’s cognitive reality.
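As an illustration of how small that connection surface can be, here is a hedged sketch using the official MCP Python SDK (the `mcp` package on PyPI). The server name, the tool, and the catalogue data are hypothetical stand-ins for a real MAM integration.

```python
from mcp.server.fastmcp import FastMCP  # official MCP Python SDK

# Hypothetical server exposing one slice of enterprise context: a media catalogue.
mcp = FastMCP("media-catalog")

@mcp.tool()
def search_clips(query: str, max_results: int = 5) -> list[dict]:
    """Return catalogue entries matching a natural-language query.

    A real deployment would call the MAM/DAM search API here; the list is
    stubbed so the sketch stays self-contained.
    """
    catalogue = [
        {"id": "a1", "title": "2024 cup final highlights", "duration_s": 312},
        {"id": "b7", "title": "CEO interview, Q3 earnings", "duration_s": 1840},
    ]
    return [c for c in catalogue if query.lower() in c["title"].lower()][:max_results]

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio so an MCP client can discover and call it
```

Once a server like this is running, any MCP-capable client can discover and call `search_clips` without a bespoke integration, which is the composability argument in miniature.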

The numbers tell the story of rapid adoption: over 10,000 active public MCP servers exist today, covering everything from developer tools to Fortune 500 deployments. MCP has been adopted by ChatGPT, Cursor, Gemini, Microsoft Copilot, Visual Studio Code, and other major AI products. Enterprise-grade infrastructure now exists with deployment support from AWS, Cloudflare, Google Cloud, and Microsoft Azure. This will likely become the underlying glue of AI.

What we’re seeing emerge is an evolution from atomized MCPs (reflecting individual APIs) to aggregated, orchestrated context layers that can assemble meaningful context on demand. The winners in the AI-enabled enterprise won’t just be the platforms that secure their own data well. They’ll be the ones that embrace composability and make their context genuinely AI-visible.

The agentic future requires infrastructure that makes context available, structured, and secure across the entire SaaS ecosystem. That’s the architecture shift everyone is building toward.

What This Means for the Next Wave

If you’re building AI Enterprise tools for media or implementing them in your organization:

  • Prioritize the ‘front door’. The highest-leverage automation opportunities are at ingest and metadata generation at scale. Get that right, and value compounds downstream. Note: Imaginario AI provides labeled and vector-based contextual understanding and can help your organization in this part of the supply chain.
  • Design for transparency. If your AI makes recommendations, show your reasoning. Black boxes don’t scale where trust must be earned through explainability.
  • Assume human review first. The workflows that work (today) are ones where AI drafts and humans approve. Plan for that from the start, optimize for automation later.
  • Solve the fragmentation problem. Invest in infrastructure that makes enterprise context composable, secure, and AI-visible across your SaaS ecosystem. MCP adoption is accelerating for good reason. APIs are rapidly adapting to support MCP, and Imaginario AI is no exception.
  • Build for constrained agency. Give your AI enough freedom to do real work, but keep it within boundaries humans can understand and control.

The transformation is messy, uneven, and slower than the hype cycle suggests. But it’s happening. AI is moving from the margins to the mainstream, from idiotic hype to efficient supply chains.

We’re not at full automation for most workflows, as context and personalization at scale still need to be completely solved. However, we’re at something more mundane and more valuable: AI as co-pilot, lifting cognitive weight and letting humans focus on judgment and creativity.

The infrastructure work, making enterprise data composable, structured, and AI-visible, is less glamorous than demos of agents completing complex tasks autonomously. But it’s the foundation everything else depends on. The organizations that invest in this plumbing now will be the ones positioned to capture value as agentic capabilities mature.

That might not make for breathless keynote presentations. But it’s the future that’s actually getting built.


About Imaginario.ai
Backed by Techstars, Comcast, and NVIDIA Inception, Imaginario AI helps media companies turn massive volumes of footage into searchable, discoverable, and editable content. Its Cetus™ AI engine combines speech, vision, and multimodal semantic understanding to deliver indexing, simplified smart search, automated highlight generation, and intelligent editing tools.

YouTube Wins on Usefulness. TikTok Wins on Dopamine Hit. Guess Which One Stalls First. https://imaginario.ai/wide-lens/social-video/youtube-wins-on-usefulness-tiktok-wins-on-dopamine-hit-guess-which-one-stalls-first/ Fri, 28 Nov 2025 09:53:00 +0000 https://imaginario.ai/?p=2077

Every year we get a new “state of social media” report from Pew Research and every year the headlines scream disruption, innovation, a new platform rising to dethrone the old guard. But if you read the latest Pew Research data carefully (and you should), a very different story emerges: Social media isn’t exploding. It’s settling.

  • YouTube sits at 84%
  • Facebook at 71%
  • Instagram stuck at 50%
  • TikTok hovering around 37%

These numbers barely move year to year. In fact, Facebook has essentially been flat since 2016. YouTube gained a whole two percentage points since 2023; barely more than statistical noise. We’re not witnessing a revolution.


We’re watching a plateau.

And when you zoom into age groups, the picture becomes even clearer. Among 18–29 year olds, the group that supposedly defines the future of culture, almost everyone uses everything. Around 95% use YouTube. 80% use Instagram. 68% use Facebook. 63% use TikTok. 58% use Snapchat.

This is not a battlefield with a single winner.


It’s a crowded food court. People wander between stalls, nibble from each plate, and commit to none.

Meanwhile BeReal (remember that hype cycle?) sits at 3% adoption. Proof that novelty without a real use case burns out faster than the venture money funding it.

All Platforms, Same Features

Pew’s dataset confirms something we’ve all felt intuitively: every platform now looks like a remix of every other platform.

Feeds. Stories. Short-form video. Messaging. Shopping.
Algorithms pushing content you didn’t ask for, trying to keep you inside their world just a little longer.

The result? Commodification.

When the feature sets converge, the differences between platforms stop being strategic and start being aesthetic.

YouTube keeps winning because it solves an actual problem: hosting, archiving, and discovering long-form video (and Imaginario AI can help you transform those YouTube videos for other platforms). That’s hard to replicate.

TikTok’s insane rise: mostly novelty and a powerful recommendation loop, both of which competitors have now cloned.

Facebook? The data is brutally consistent: it hasn’t grown in almost a decade, despite owning half the social internet. More apps don’t mean more engagement. They mean more fragmentation.

Even among older adults, supposedly the “offline” generation, YouTube still has around 64% adoption. That’s higher than TikTok gets with any age group. A quiet reminder that usefulness > trendiness.

What This Means for Anyone Trying to Reach an Audience

We’ve been conditioned to chase the “next platform,” the “next format,” the “next wave.” And for a long time, that made sense: the land grab was real.

But now?
The land is fully owned. The borders have been drawn.

Audience behaviour has crystallized around four core modes:

YouTube → sit back, learn, watch something meaningful.
Facebook → talk to your existing community, stay in touch.
Instagram → aesthetic discovery, curated-self world-building.
TikTok → quick dopamine hits, novelty, entertainment in fast motion.

Marketers (and founders, creators, journalists, anyone who publishes anything) don’t win by being everywhere. They win by understanding why people are where they are, by creating distinctive niche content, and tailoring the message accordingly.

YouTube users want to learn or be entertained.
Facebook users want to talk to someone they know.
Instagram users want things that look good.
TikTok users want to feel something immediately.

Once you internalize this, something shifts: you stop chasing platforms and start designing stories.

Because here’s the real truth in all this data: The platforms are no longer the differentiators. You are.

What you say, how you say it, how well it resonates, that’s the new advantage.
Platform choice is basic hygiene.
Message-market fit is the edge.



About Imaginario.ai
Backed by Techstars, Comcast, and NVIDIA Inception, Imaginario AI helps media companies turn massive volumes of footage into searchable, discoverable, and editable content. Its Cetus™ AI engine combines speech, vision, and multimodal semantic understanding to deliver indexing, simplified smart search, automated highlight generation, and intelligent editing tools.

Part 1 – Winning Enterprise AI in Media: Why Changing Minds Outweighs Better Models https://imaginario.ai/wide-lens/product/part-1-winning-enterprise-ai-in-media-why-changing-minds-outweighs-better-models/ Thu, 27 Nov 2025 17:42:46 +0000 https://imaginario.ai/?p=2066

The DPP Leaders’ Briefing 2025 gathered more than 1,000 attendees and key decision-makers from over 30 major media organisations. Credit: DPP

This is the first of a three-part post about key insights and findings I came across while attending the Digital Production Partnership (DPP) Leaders’ Briefing 2025 in London. The DPP is one of the most important trade associations in the media industry, bringing together senior executives from major media organizations. More than 1,000 attendees came to the conference, held on November 18 and 19.

Across the two days of the DPP Leaders’ Briefing, you could almost forget who worked where. Once people stopped talking about their logo and started talking about the technical and management problems they were facing, the patterns lined up across major media, news, broadcast, and entertainment organizations.

Key questions that emerged across most presentations:

  • How and where do we use AI in production rather than in slideware?
  • How can we trust AI tooling and agentic solutions while more decisions are assisted or automated?
  • Do we have the right integrations and security mechanisms in place?
  • How do we turn deep archives, dailies, and live content into something companies can leverage to reduce costs, increase revenue, and boost quality?
  • How do we avoid vendor lock-in while the technology continues to advance at a breakneck pace?
  • How does our tech stack need to adapt to accommodate this level of flexibility?
  • How do we survive the new content and distribution economics of an AI-driven internet and aggressive hyperscalers?

If you needed a one-line summary of the conference, it is this:

“Despite the high priorities for media companies in AI implementation and automation, the reality is that so far for many players internal business engagement and operational effectiveness have been far more challenging than actual tech delivery”

Another risk discussed during the event was relying on legacy workflows to define the future of work in media.

As an executive from RTL said:

“If you narrow it to KPIs too early you’ll be optimizing what you already have”.

In other words, media enterprises would not be investing in learning what they need to learn and in building new workflows from scratch.

Uriah Smith’s 1899 “Horsey Horseless”, a car fitted with a fake horse head, is the ultimate cautionary tale of “optimizing what you already have” rather than trusting a completely new workflow.

To innovate faster, partnerships and collaboration were mentioned as key by C-Level executives. However, organizational challenges remain in the Enterprise space.

For CEOs, media tech partnerships are key to success. The DPP provided a space to foster those relationships. Credit: Imaginario AI.

If there was one thing CTOs agreed on, it is that technology is now the easy bit; people and culture are not.

A DPP chart on leadership challenges placed the biggest issues squarely in organizational dynamics:

  • 75% of CTOs said their biggest issue is change management. Not a specific technology, not a security threat, but the basic job of getting people to work in new ways.
  • Securing cross-business engagement came next at 50%.
  • Technology project delivery only showed up in the middle.
  • Recruiting and retaining talent, securing executive buy-in for tech investment, and keeping employees motivated followed behind.
  • Improving diversity within the tech function did appear on the list but was rated by just 9% as their single biggest challenge.

The DPP’s CTO Survey 2025. Change management and cross-business engagement remain the main barriers to innovation and AI adoption.

During the conference, plenty of commercial media companies, in particular, said that they need suppliers to act as collaborators rather than just traditional providers.

This was confirmed by the DPP CTO survey: the majority chose “collaborative” as their preferred role for vendors, with only a small minority wanting them to stay “supportive.” Some groups, such as Public Service Broadcasters (PSBs) and non-profit organizations, leaned more toward “supportive” than “collaborative,” but very few wanted suppliers to be hands-off in an arm’s-length posture.

The DPP’s CEO Survey 2025. Technical partnerships are a top priority for leaders in this space with 85% stating they are being proactive or opportunistic. Credit: DPP.

The CEO survey backed that up with a clean headline: 85% of media tech CEOs say they prioritize partnerships. These ecosystems are evolving from simple integrations to “deep collaborations aimed at solving shared customer challenges.” In plain language, everybody realizes they cannot solve this alone at the current pace of innovation in AI.

However, the mood around AI across the event was at the same time strangely cautious for a technology that dominated the conference agenda. Several people warned about “AI FOMO” quietly driving bad decisions.

One NRK executive even said, without much sugar-coating, “Don’t fall into vendor lock-in, I’m not your bitch.” The line landed partly as a joke, but the sentiment was serious.

Organizations want partners who are honest about what they can really deliver, who will give directional pricing and modular solutions early rather than hiding behind all-encompassing solutions and five-year contracts, and who are willing to integrate with rival systems when that is what the customer needs.

AI adoption playbooks that are working in Enterprise

Groupe TF1, with Olivier Penin as Director of Innovation, remains one of the best examples of successful leadership implementing AI at scale: 75 PoCs last year and 20 use cases industrialized, including AI-native content studios. Credit: DPP.

Advice given during the conference: Start small, with one video workflow, one team, and one function. Use your own content, style guides, and rights data, so the solution reflects your reality. Pick problems where gains are clearly measurable: time saved, errors reduced, fewer manual touches, faster time to air. Keep humans in charge of outcomes.

None of that is glamorous, but it is where the real progress is happening. With Imaginario AI, this is exactly how we have been driving adoption of our solutions: starting small with a limited pilot in a specific function and landing wider corporate-level MSAs later.

Another golden nugget from the event for international media tech vendors: the DPP’s CTO survey mentioned that commercial media companies based in the US are more optimistic about the future (and therefore willing to invest more and experiment) than public service organizations (most of them based in Europe). However, PSBs expect a budget increase next year.

Where this leaves us

Pulling all of these threads together, you get a picture of an industry leaving its AI demo phase and entering the infrastructure phase.

AI and cloud are no longer bolt-ons. They are becoming part of the media supply chain from ingest and logging through post, archive, versioning, localization, compliance, marketing, scheduling, and consumption.

At the same time, most CTOs freely admitted the industry is only at a “developing” stage of maturity. The next three years are about turning experiments into dependable systems.

… and the organizations that seem to be making real progress shared a few traits during the event:

  • They treat AI as operational technology, not as an innovation showpiece. As one speaker said: “AI is not the strategy, it’s a means to an end.”
  • They keep humans clearly in charge for decision-making and supervision, with AI in a constrained co-pilot role… at least for now.
  • They recognize that rights, data, and AI intelligence (in that order) will unlock true value. They invest in in-video metadata, not just asset-level metadata and cloud infrastructure.
  • They take change management, specialist partnerships, and world-class UX design as seriously as they take models and buying GPUs.
  • Those adapting faster are getting closer to their core strengths in premium content production (e.g., TF1, RTL) rather than trying to build all AI capabilities themselves.

Partners with all-encompassing, monolithic solutions slow companies down rather than help them get to the next wave of video intelligence and automation.

On the contrary, many companies, Fremantle among them, mentioned that partnering with smaller tech vendors normally provides the level of personalized, high-touch support that Enterprises require. Vendors specializing in AI need to be API-first and modular; those that serve core areas such as metadata generation/enrichment, model fine-tuning, and training based on the client’s data and context are the perfect co-pilots in this sprint to innovate.

For those of us building AI-enabling solutions and platforms in this space, the engineering and UX bar is high and goes beyond “we have a clever model.”

We need to help enterprises build their own capabilities on their own terms, focusing on specific use cases. That means offering specific AI razor blades with native integrations and modularity, rather than vendor lock-in with middle-of-the-road solutions.

Other challenges vendors need to understand: clients want automation with control, intelligence with explainability, cloud scale with sovereignty (many times fully on-prem), and AI with genuinely human-centered UX. All with full transparency.

The honest reflection of the industry and the openness to collaborate are what really defined the DPP Leaders’ Briefing this year.

Behind all the fear slides and maturity charts, you could feel something else: the sense that the industry has stopped arguing about whether this transformation is happening and has started working out how to do it without breaking the core strengths that made media companies leaders in the first place (hint: premium content).

To their benefit, content is still king and the time to catch the AI wave is now.


About Imaginario.ai
Backed by Techstars, Comcast, and NVIDIA Inception, Imaginario AI helps media companies turn massive volumes of footage into searchable, discoverable, and editable content. Its Cetus™ AI engine combines speech, vision, and multimodal semantic understanding to deliver indexing, simplified smart search, automated highlight generation, and intelligent editing tools.

From On-Prem to the Cloud to Social: Imaginario AI Broadens Its Integration Ecosystem https://imaginario.ai/wide-lens/product/from-on-prem-to-the-cloud-to-social-imaginario-ai-broadens-its-integration-ecosystem/ Mon, 08 Sep 2025 13:59:36 +0000 https://imaginario.ai/?p=1908 Imaginario AI's new integrations across storage, MAM, editing and social media

LONDON, 8th September 2025 — At Imaginario AI, we know that seamless video workflows depend on meeting teams where they already work. That’s why, in addition to our existing integrations with AWS S3, Wasabi, Google Drive, Dropbox, Adobe Premiere, and DaVinci Resolve, we’re excited to announce a major expansion of our partner ecosystem.

Through our new partnership with Ortana Media Group announced last month, Imaginario AI now supports a broader set of cloud storage, media asset management, editing systems, and social platforms, giving creative teams of all sizes greater flexibility to ingest, index, and export to editing suites and social media platforms.

New Cloud Storage & MAM Integrations

With Ortana’s Cubix Platform connected to our Cetus™ AI engine, Imaginario now works with leading storage and media management solutions, including:

  • Azure – Microsoft’s enterprise cloud platform for scalable storage and compute.
  • Google Cloud – Google’s infrastructure for media storage, AI, and collaboration.
  • Backblaze B2 – Cost-efficient object storage for archiving large video libraries.
  • StorJ – Decentralized, encrypted cloud storage with high resilience.
  • LucidLink – Cloud file system that streams large media files instantly.
  • Iconik – Cloud MAM for organizing assets globally.
  • Ross – Broadcast workflow and media asset management solutions.
  • Ci Media Cloud (Sony) – Secure cloud collaboration for sharing and reviewing media.

Connected to more Editing Systems and Social Media Platforms

Our existing integrations with Adobe Premiere and DaVinci Resolve are now joined by Avid Media Composer, a cornerstone of professional editing for broadcast and feature film. This expansion ensures that editors can access AI-driven enrichment, search, and automation within the platforms they know best.

  • Adobe Premiere Pro – Industry-standard editor for creative and marketing workflows.
  • DaVinci Resolve – Professional editing and color-grading platform for film and TV.
  • Avid Media Composer – Broadcast-grade editor

Content today doesn’t just end in the edit suite, it needs to reach audiences across multiple channels quickly and consistently. Imaginario now supports direct publishing to:

  • Facebook – Social video distribution for broad community engagement.
  • Instagram – Short-form, Reels, and Stories video publishing.
  • TikTok – Vertical, short-form platform for viral and algorithm-driven reach.
  • Vimeo – Professional hosting and review platform for creatives.
  • X (Twitter) – Social and news platform for fast video distribution.
  • YouTube – The largest global video platform for reach and monetization.

Why This Matters

These new integrations expand the reach and flexibility of Imaginario AI across every stage of the media lifecycle.

  • End-to-end flexibility – Hybrid, on-prem, and cloud-native infrastructures are all supported. Live streaming workflows coming soon!
  • Editorial efficiency – Editors and marketing teams can stay within their preferred environment while accessing AI insights.
  • Faster distribution – Direct social publishing accelerates the journey from raw footage to audience-ready content.
  • AI-powered intelligence – Video becomes instantly searchable, discoverable, and actionable across speech, visuals, and sound, without disrupting your workflows. It meets you where you are.

Our partnership with Ortana Media Group represents an important step in building a truly connected media ecosystem. Together, we are making workflows smarter, more efficient, and more adaptable, helping teams deliver content at the speed and scale today’s audiences expect.


About Imaginario.ai
Backed by Techstars, Comcast, and NVIDIA Inception, Imaginario AI helps media companies turn massive volumes of footage into searchable, discoverable, and editable content. Its Cetus™ AI engine combines speech, vision, and multimodal semantic understanding to deliver indexing, simplified smart search, automated highlight generation, and intelligent editing tools.

About Ortana Media Group
Ortana develops scalable and modular media orchestration solutions. Its flagship platform, Cubix, enables organisations to manage content workflows across cloud and on-premise environments, providing full visibility and control from ingest to distribution.

For more information, please contact:

Jose M. Puga
CEO
Imaginario AI
support@imaginario.ai

Imaginario AI and Ortana Media Group Partner to Supercharge Video Workflows with AI-Native Indexing and Automation https://imaginario.ai/wide-lens/press/imaginario-ai-and-ortana-media-group-partner-to-supercharge-video-workflows-with-ai-native-indexing-and-automation/ Thu, 14 Aug 2025 12:45:34 +0000 https://imaginario.ai/?p=1875

LONDON, 14th August 2025 — Ortana Media Group, creators of the Cubix Media Aware Workflow Engine (MAWE), today announced a strategic technology integration with Imaginario.ai, the company behind the Cetus™ AI engine, a next-generation platform for multimodal video understanding, smart search, and AI-powered editing. This collaboration combines Ortana’s robust media orchestration and automation layer with Imaginario’s AI-native infrastructure, empowering media organisations to turn raw video into enriched, discoverable, and production-ready content with unprecedented speed and scale.

A Leap Forward in Video Intelligence & Orchestration

Designed for streaming services, post houses, broadcasters, and corporate marketing teams, the integration delivers a future-ready solution offering:

1. High-accuracy multimodal indexing at scale – Powered by Imaginario’s proprietary Cetus™ engine, the system performs advanced scene understanding, shot detection, facial analysis, object recognition, speech-to-text, ambient sound and SFX detection, logo recognition, OCR, image-to-video matching, and multilingual transcription.

2. Automated end-to-end workflows – Through Ortana’s Cubix MAWE, the integration orchestrates ingest, AI tagging, metadata enrichment, transcoding, and asset movement across hybrid or cloud environments — reducing manual effort and accelerating turnaround.

3. Deep semantic search and dynamic discovery – Enables users to quickly find and repurpose scenes or moments, regardless of archive size, file format, or content type.

4. API-native and modular by design – Offers seamless integration into existing MAM/PAM ecosystems or as a standalone solution — supporting flexible, cloud-agnostic deployments.

5. Explainable AI with scene-level captioning – Generates descriptive scene captions and match scoring to clarify how content is indexed — enhancing transparency, user trust, and tagging accuracy.

Unlocking the Value of Video at Scale

This joint solution enables post-production, marketing, and archive teams to spend less time rewatching content and more time creating. Whether for programmatic content creation, social repackaging, contextual advertising, or large-scale archive revitalisation, teams can now:

– Search video by who, what, where, when, and even why, using multimodal prompts.
– Automatically surface the best moments for highlights, trailers, ad inventory, and shorts, all within existing workflows.
– Push enriched assets directly into cloud storage and post-production tools like Adobe Premiere or DaVinci Resolve.

The integration is available immediately and supported across Cubix Yunify, Appliance, Halo, and Connect. It is already being evaluated by several broadcasters and content providers in Europe and North America.


About Imaginario.ai
Backed by Techstars, Comcast, and NVIDIA Inception, Imaginario AI helps media companies turn massive volumes of footage into searchable, discoverable, and editable content. Its Cetus™ AI engine combines speech, vision, and multimodal semantic understanding to deliver indexing, simplified smart search, automated highlight generation, and intelligent editing tools.

About Ortana Media Group
Ortana develops scalable and modular media orchestration solutions. Its flagship platform, Cubix, enables organisations to manage content workflows across cloud and on-premise environments, providing full visibility and control from ingest to distribution.

For more information, please contact:

Jose M. Puga
CEO
Imaginario AI
support@imaginario.ai

Imaginario AI unveils groundbreaking Cetus AI engine, paving the way for conversational video curation https://imaginario.ai/about/press/imaginario-ai-unveils-groundbreaking-cetus-ai-engine/ Wed, 04 Sep 2024 11:18:05 +0000 https://imaginario.ai/?p=1698

Debuting at IBC2024, new indexing and conversational system transforms video curation and exploration for unprecedented user experience and operational efficiency.

Imaginario AI, a leading pioneer in transformative AI-powered video curation solutions, has announced a significant update to its multimodal AI curation and video indexing platform. This innovation, which will be showcased at IBC2024 (stand 14.AIP8, AI Tech Zone), delivers unparalleled efficiency and precision in content indexing, search and curation – all without the need for metadata or multiple AI models.

The new Cetus release builds on Imaginario AI’s pioneering multimodal AI system, Vulpus, used by leading brands such as Comcast, Universal Pictures and Warner Bros Discovery. The solution is designed to analyze video assets through visuals, audio and dialogue while understanding the passing of time. 

Inspired by the intelligent communication of whales, Cetus is transforming AI video curation by enhancing multimodal video indexing and search through the use of rich, insightful descriptions together with machine-readable indexing. These go beyond simple clip recommendations based on keywords and labels, providing explainability and depth that greatly facilitate user understanding and interaction with video content and AI models. 

“We are excited to unveil our cutting-edge video understanding systems designed to enhance how you manage and interact with your video content,” said Dr. Abdelhak Loukkal, CTO of Imaginario AI. “Our innovative systems, Vulpus and Cetus, mark significant advancements in this field.”

Game-changing multimodal AI technology 

As the media and entertainment industry rapidly adopts AI technologies to boost operational efficiency and simplify content management, Imaginario AI stands at the forefront with fully developed solutions that set it apart from other AI providers still in the conceptual stages.

The sophisticated video indexing AI model integrates data from various modalities with an advanced indexing engine, offering unparalleled discovery capabilities throughout extensive video libraries without relying on labels or metadata. As a result, the solution is more GPU-efficient, accurate, and cost-effective than traditional Media Asset Management (MAM) and AI labeling offerings, which typically require multiple models for different indexing and asset types.

Imaginario AI offers exceptional curation capabilities, including AI video search and discovery, chapterization, 1-click to clip, and timeline sequence export. These features greatly enhance efficiency in content repurposing, creating social cuts, on-set dailies curation, ideation, and compliance editing.

Cetus introduces a range of new features that vastly improve the user experience, delivering additional accuracy and lower latency. New capabilities include:

  • Higher accuracy video indexing system: Cetus has been trained on a vast dataset to identify and classify objects, scenes and actions contextually across vision, speech and sounds.
  • Explainability: The AI system uses scene-level captioning to provide detailed descriptions of video scenes and a search match scoring to improve transparency and accuracy. This approach helps users understand how Cetus’ AI makes decisions, leading to more precise content interpretation and better user interactions.
  • Multimodal approach: The system simultaneously examines visual, audio and textual data for all-encompassing content understanding.
  • Format independent: The solution ensures meticulous indexing of every asset, regardless of format and genre.
  • Audio and dialogue recognition: Cetus discerns subtle audio cues and dialogue nuances for detailed analysis.
  • Flexibility: The solution features native integrations with multiple storage and non-linear editing systems including Frame.io, Adobe Premiere, and AWS S3.
  • API-based integration: Cetus provides easy customization and scalability, allowing seamless integration into existing MAM frameworks and fine-tuning for specific client libraries.

“Imagine engaging in a conversation with your video content, completely transforming how you curate and transform your video libraries,” states Jose Puga, CEO of Imaginario AI, referring to the company’s upcoming AI toolset, which will enable users to interact with their libraries through casual dialogue and intuitive Q&A features. “Our ultimate goal is to build personal assistants that understand users’ editorial preferences and orchestrate their workflows. We look forward to showcasing our solutions at IBC2024 and making a positive impact on media and entertainment companies of all sizes.” 

Experience Imaginario AI at IBC2024 

IBC2024 attendees can book a meeting to explore Imaginario AI’s transformative solutions at the AI Tech Zone (Hall 14; Stand 14.AIP8). The company’s CEO, Jose Puga, will deliver a special presentation, “Accelerating Efficiency with AI-Powered Workflows,” on Sunday, 15 September at 14:00 (AI Tech Stage). Attendees will gain valuable insights into the transformative impact of AI-powered curation workflows on the industry.

Can we generate B-roll with AI yet? https://imaginario.ai/wide-lens/artificial-intelligence/can-we-generate-b-roll-with-ai-yet/ Wed, 31 Jul 2024 11:04:53 +0000 https://imaginario.ai/?p=1608

When OpenAI first demoed their text-to-video model Sora, we, along with a large swathe of the media industry, thought “wow, this is going to be a game-changer for B-roll.”

Generative AI video is still way too random and inconsistent to be used for A-roll. Characters, objects and settings will look different shot-to-shot, so getting reliable continuity is basically impossible. We learned a few months after the Sora unveil that even one of the featured videos – air head by Shy Kids – required substantial correction in post-production to remove inconsistencies and genAI weirdnesses. Shy Kids estimated they generated 300 minutes of Sora footage for every usable minute.

As with all AI efforts right now, we’re seeing huge progress towards more usable systems, and generative video AI startups like Odyssey have already appeared specifically promising the consistency and continuity necessary for good storytelling.

So for now, genAI video isn’t ready to tell stories all by itself. But maybe it can be a part of the storytelling process by producing B-roll. Any generative AI system is only as good as its training data, and there are millions of hours of establishing shots, landscapes, cityscapes and more out there. So let’s put it to the test.

I’m going to include 6 of the most popular free generative AI systems on the market right now with a few different styles of prompt. I’m only going to use systems which allow full video generation from a text prompt, not systems which animate images.

Every generative AI system is unique and responds to different types of prompts in different ways, so this shouldn’t be seen as a test of which text-to-video AI is “the best” – you will definitely be able to get better results from each system by playing with the prompt and settings, and experimenting with each to get the best out of it.

The most important test for genAI systems right now – whether text, image or video – is if their output can appear as if it doesn’t come from a genAI system. That’s the benchmark we’ll be applying.

The systems we’ll be using:

Test 1 – A cityscape at night

The prompt

A panning shot of a present-day city at night. Streets, buildings and billboards fill the entire frame.

The results

The verdict

There are elements from a few videos that could be usable, specifically the middle-distance and skyline shots. The buildings created by Runway and Luma are very close to realistic, and the skylines in all shots that contain them are passable.

However, without fail, the traffic is a disaster – complex moving elements continue to be the Achilles heel of generative AI video, and it will be interesting to see if the upcoming models from larger providers (particularly Sora from OpenAI and Veo from Google) can make improvements here.

Test 2 – A forest at sunset

The prompt

The camera pans upwards from the treeline of a pine forest to reveal rolling hills beyond covered in trees, with mist resting in valleys between the hills. On the right side of the frame the sun is setting behind the trees in the distance, partially obscured by wisps of cloud, while a small flock of birds flies on the left side of the frame.

The results

The verdict

Now, these are much better results. There are a few genAI artifacts (Pixverse and Haiper’s birds, in particular), but overall these shots are usable. And perhaps more importantly for people generating footage for use in projects, these shots look like what I was picturing in my head when I wrote the prompt.

I purposely included multiple instructions in the prompt to see which model would follow them best. The individual elements were:

  • Camera movement
  • Type of trees
  • Misty valleys
  • Position of the birds
  • Position of the sun, with clouds and trees in front

I was pleasantly surprised to see that most of the models followed most of these instructions – a few missed the birds, but all of them nailed the forest, the misty valleys and the sunset. One notable curiosity is that only Kling followed the instruction to pan the shot correctly, every other model went for more of a drone or dolly shot with some movement. Kling’s generation interface specifically includes camera controls, so it makes sense it would understand this part of the prompt better.

Test 3 – a stormy seascape

The prompt

The camera flies quickly over a calm sea, we see the water moving with a few waves as we pass close above it. The camera pans upwards to reveal the horizon with a thunderstorm brewing in the distance.

The results

The verdict

In this test we can clearly see some unintended video hallucinations. In particular Haiper, which included the wake of a boat, and Pixverse, whose shot has been invaded by an unwanted seagull.

But again, much of the visual fidelity of these shots is close to good enough. Luma did a particularly good job of following the prompt. With the right color matching and editing, I think half of these shots could be used without being recognized as genAI. And for a technology that is hardly a year old, that is incredible.

What’s the future for AI-generated B-roll?

The simple answer is, as with everything in the generative AI space, it’s going to get a lot better. The industry is realising a simple text prompt isn’t enough to provide the kind of control filmmakers need, so we’re already seeing these tools integrating camera movements, zoom controls and more to give creatives the ability to direct the shot in many of the same ways you would a live crew.

Visual quality will continue to improve, as will the speed of models, lessening the issue of having to generate reels and reels of renders to find something useful.

It’s also worth thinking about how generative AI will impact the use of B-roll more broadly. Of course it will always be important for artistic reasons, but covering a cut or a spoiled shot could become a thing of the past. Adobe recently announced they are adding features to extend shots and remove items via smart masking to Premiere soon, so maybe you’ll no longer need to plaster over that interview shot where someone walks behind your subject – you can just have Firefly cut them out and recreate the background?

AI search is improving B-roll usability too

We’ll never – or probably never – reach a point where there’s no demand for filmed B-roll, so your huge back catalogue of material will always retain its value. And AI isn’t all about generating new material, it can help you understand, index and search your library too. In fact that’s exactly what we’re building here. Check out the demo below to see how we’re unlocking archives for broadcasters, documentarians and more.



Article credits

Originally published on

With image generation from

Playground

And TikTok creation from

Imaginario AI
Multimodal AI: what is it, and how does it work? https://imaginario.ai/wide-lens/artificial-intelligence/multimodal-ai-what-is-it-and-how-does-it-work/ Mon, 22 Jul 2024 14:14:13 +0000 https://imaginario.ai/?p=1583

Multimodality, it’s so hot right now. 2024 was the year that all the major Large Language Models – ChatGPT, Gemini, Claude and others – introduced new modalities and new ways to interact.

Like most new technological fields AI is full to the brim with technical jargon, some of it totally unnecessary, but some of it quite consequential.

Multimodal is one of the consequential ones.

So what does multimodal mean?

Well it’s actually quite simple. In AI terms, a “modality” is a type of media through which an AI model can consume, understand, and respond to information – think text, audio, image, video.

Historically most AI systems have used only text as their training data, their input and their output, and so were single modality. In the last decade or so AI image recognition systems have become more and more common with products like Google Lens and Amazon’s Rekognition. These computer vision models were obviously a step up in complexity from text-based models, but were still limited to only images, and so were also single modality.

The next evolution was text-to-image models like Stable Diffusion and DALL·E which, technically speaking, are multimodal. They take a text prompt, and produce an image – two modalities! However in practice “multimodal AI” has come to mean systems which combine two or more inputs or outputs simultaneously or alongside one another. This is sometimes called multimodal perception due to the fact that, once you introduce multiple modalities, an AI can begin to perceive (or give the impression of perception) what it is looking at, rather than just matching text or visual patterns.

Imagine providing an image recognition AI with a photo of a crowd at an event. A basic system will recognise individual elements: man, woman, nose, hair. A more mature system will put them together to understand the image as a whole: a crowd watching something.

However, if you showed a multimodal system a video of a crowd watching something, it will recognise movements, facial expressions, sound effects, music and more to build a complete description of the scene.

You can think of AI modalities as roughly equivalent to human senses. There’s a lot you can do with just one sense, but when combined they provide a more complete understanding of the world around you.
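To make the shared-understanding idea concrete, here is a small, hedged sketch using the Hugging Face `transformers` library and a public CLIP checkpoint (the image URL and captions are just examples). It scores how well each text description matches an image, i.e. two modalities compared in one embedding space:

```python
import requests
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# CLIP embeds images and text into a shared space, so the two modalities
# can be compared directly. Checkpoint and sample image are illustrative.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open(requests.get(
    "http://images.cocodataset.org/val2017/000000039769.jpg", stream=True).raw)
captions = ["a crowd watching a concert", "two cats sleeping on a sofa", "a city at night"]

inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# Higher probability = better image-text match across the two modalities.
probs = outputs.logits_per_image.softmax(dim=-1)[0]
for caption, p in zip(captions, probs):
    print(f"{p.item():.2%}  {caption}")
```

A full multimodal system layers more modalities (video, dialogue, sound) and more reasoning on top, but the principle is the same: different media types projected into representations that can be understood together.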

How many modalities are there?

The main modalities in regular use right now are:

  • Text, which can include things like:
    • Normal written chat
    • Numerical data
    • Code such as HTML and JavaScript
  • Image
  • Video
  • Text on screen (through optical character recognition)
  • Non-dialogue audio (music, sound effects etc.)
  • Dialogue

The vast majority of AI usage is still confined to single-modality text (and the vast majority of that usage is text inputs and outputs through ChatGPT’s web interface, app and API). However as visual and audio AI systems become more mainstream, and cheaper to operate, this will no doubt change in exactly the same way the early internet was mostly text and images, but now contains a huge amount of video, music, podcasts and more.

Another curiosity of current-generation AI is that, although language and visual models can often give the impression of approaching human intelligence, they lack so many of the building blocks of intelligence that we humans take for granted. These are also modalities, which could be incorporated into AI systems in the future.

An example: the first season of HBO’s House of the Dragon takes place over almost three decades, with the lead character of Rhaenyra Targaryen played by Milly Alcock during the first five episodes, and by Emma D’Arcy for the remaining five.

An AI system, not understanding as basic a concept as the passage of time, will recognise two different faces and fail to understand they are the same character.

This is one of the most interesting challenges of building AI today – we’re trying to reconstruct human intelligence but starting in the wrong place, so we have to backfill many of the fundamental aspects of basic intelligence.

We are also seeing the emergence of so-called action models, which can complete tasks on behalf of their users – for example logging into Amazon and ordering something. It’s certainly possible that actions will become another modality that is incorporated into larger models in time.

What’s the future for multimodality?

The only guarantee in the AI space right now is that the pace of innovation will continue to be relentless. Even the word multimodal only entered the public sphere around 18 months ago, and the number of people Googling it has increased tenfold in the last year.

Mobile devices and wearables are an obvious category that can benefit from multimodal models. Although the first attempts at devices incorporating multimodal AI were a huge miss, we’ll likely see these features incorporated into smartphones over time. The main limiting factor right now is the size of the models, which require a stable, fast internet connection to process queries in the cloud. This, too, will change as on-device models become more feasible.

Aside from being a nice-to-have, on-device multimodal AI has clear benefits for people with limited vision or hearing. A smartphone that can perceive the world around it is genuinely useful for these groups. Imagine a visually-impaired person pointing their iPhone at a supermarket shelf and asking for help finding a specific product. Taking our human senses metaphor to its logical conclusion, these models can fill in the gaps for people who have lost those senses.

Away from consumer products, we are already seeing some really exciting developments in robotics, where multimodal perception is allowing off-the-shelf robotic products to engage with the world without task-specific programming.

Until now, industrial robots have needed specific instructions for each task (close your claw 60%, raise your arm 45°, rotate 180°, and so on), but multimodal models may allow them to work out how to complete a task themselves when given a desired outcome, by understanding the world around them and how to interact with it. Imagine a robot arm that can pick groceries of differing shapes and textures, and pack them while taking into account how susceptible each item is to damage.

In time multimodal AI will become just another technology that we all take for granted, but right now in 2024, it’s the frontier where the most exciting artificial intelligence developments are taking place, and it’s worth keeping an eye on.



Article credits

Image generation: OpenAI
TikTok creation: Imaginario AI
]]>
June 2024 product update: Framing, branding, even more speed https://imaginario.ai/wide-lens/product/june-2024-product-update-framing-branding-even-more-speed/ Tue, 16 Jul 2024 13:24:44 +0000 https://imaginario.ai/?p=1574

A famous Google study from 2009 showed that even tiny slowdowns in web apps (as small as 100ms) had a measurable impact on users’ satisfaction with those apps. Keeping web apps fast while dealing with production-quality video has always been incredibly tricky due to the enormous file sizes involved, and it’s a challenge the industry continues to tackle head-on.

With this in mind, this month we deployed a slew of speed optimizations that will make your Imaginario AI experience faster than ever – in some of our testing we’ve reduced load times by over 50%! Specifically, we’ve:

  • Re-worked some of our backend services to use autoscaling more aggressively, so you won’t notice any slowdown during periods of high demand.
  • Implemented new adaptive bitrate video streaming, which means your search results will load faster than ever (there’s a conceptual sketch of how adaptive bitrate selection works just after this list).
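For the curious, adaptive bitrate streaming boils down to encoding the same video at several quality levels and letting the player pick the best one the connection can handle. Here’s a rough conceptual sketch of that selection logic in Python; the renditions and bandwidth numbers are made up, and this is not our actual implementation.

```python
# Conceptual sketch of adaptive bitrate (ABR) rendition selection.
# Renditions and measured bandwidth are made-up examples; real players
# (and real streaming stacks) are considerably more sophisticated.

# Available renditions of the same clip, as (label, required bitrate in kbps),
# ordered from highest quality to lowest.
RENDITIONS = [
    ("1080p", 5000),
    ("720p", 2800),
    ("480p", 1400),
    ("360p", 800),
]

def pick_rendition(measured_kbps: float, headroom: float = 0.8) -> str:
    """Choose the highest-quality rendition that fits within the measured bandwidth,
    leaving some headroom so playback doesn't stall when throughput dips."""
    budget = measured_kbps * headroom
    for label, required in RENDITIONS:
        if required <= budget:
            return label
    return RENDITIONS[-1][0]  # fall back to the lowest rendition

print(pick_rendition(3500))  # -> "720p" on a 3.5 Mbps connection
```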

But that’s not all! We also added two new much-requested Clip Studio features.

New resizing UI

We added manual clip resizing back in April, and this month we launched the updated, better-than-ever version 2. Now, instead of dragging handles to resize your frame, you can adjust zoom levels in the UI while keeping your view of the clip at full height, letting you get that pixel-perfect framing.

Branding and watermarks

If you work somewhere with strict brand guidelines, you can finally tell your brand team to relax: you can now upload your logo and place it wherever you want in your clip.

You can see a demo of both of these new features in the video below.

See it in action



Article credits

Image generation: OpenAI
TikTok creation: Imaginario AI
]]>
How to clip a YouTube video for download, or for social https://imaginario.ai/wide-lens/how-to/how-to-clip-a-youtube-video-for-download-or-for-social/ Tue, 04 Jun 2024 14:07:37 +0000 https://imaginario.ai/?p=1516

Ahh, YouTube, the world’s largest library of video. An astonishing collection of humankind’s creativity and knowledge, including everything from how to grow an avocado tree to insanely catchy songs about people who live in the forest and wear tight clothes. If you want it, it’s on YouTube.

Sometimes it can be a little too much, though. If all you want is a quick recipe for Pad Thai, sitting through a monologue about a YouTuber’s trip to Bangkok and five minutes of sponcon from a meal-in-a-box vendor is a little annoying. Google recognised this and introduced highlighted clips within its search results to take people straight to the clip they need, whenever its algorithm can identify a helpful video.

An example of a featured snippet video in Google search

But what if you want to save this clip, download it, or remix it for social? Well, you’ve got a few options.

Creating clips

If all you want to do is create some clips for sharing elsewhere, you’ve got plenty of options.

YouTube’s built-in clipping

If it’s enabled on a YouTube video, you can create a quick clip right from the YouTube interface. Hit the “Clip” button, choose your start and finish times (either with timestamps or by dragging the handles), and you’ve got a quick clip ready to go. Here’s a clip I created from Jose’s recent video on humans and AI working together.

However, as with video transcripts on YouTube, you’re a little limited in what you can do with this clip. You can’t download it, you can’t remix it, and you can’t edit it any further. YouTube wants you to stay on its platform to watch more delicious advertising.

Pros

  • Simple, easy, free
  • Shareable links

Cons

  • No editing options
  • Stuck on YouTube
  • Not possible for all videos

Recording from your computer

A more DIY option is to record the clip locally using screen recording software. If you’re on a Mac you can use QuickTime (although you’ll need something like BlackHole to capture your system audio), and on Windows you have a plethora of options.

This approach is infinitely flexible and has the advantage of giving you a local copy of all your clips, but it requires a little more fiddling than browser-based solutions.

Pros

  • Free
  • Incredibly flexible
  • Keep your recording files

Cons

  • More setup required
  • Video resolution limited to your screen resolution
  • Minimal editing options

Recording from your mobile

Another simple and quick option (especially for people who edit videos primarily on their phone for TikTok or Instagram Reels) is to record your clip using your phone’s built-in screen recorder. Both Android and iOS let you screen record with audio easily.

The most important thing to be aware of with this method is that, depending on the options you choose and your phone’s operating system, you may record your entire screen. That means you’re not just capturing your YouTube video, but also your notifications and your battery level, which is probably embarrassingly low.

You can crop the video once you’ve finished recording, but this can take a long time due to the limited processing power available on mobiles.

Pros

  • Simple, easy, free
  • Convenient if you edit clips on your phone
  • Keep your recording files

Cons

  • Fiddly and slow
  • Need to transfer file to computer for advanced editing

Editing for social

If you want to use your clips in your TikToks, Reels, or somewhere else, the next step is to edit them. Most YouTube videos are in horizontal, or 16:9, format. Conversely, most social apps are built for mobile screens and so use vertical, or 9:16, video. So you need to crop them.
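If you’re comfortable on the command line, a simple centred crop is one way to do this yourself. Here’s a small Python sketch that shells out to ffmpeg to turn a 16:9 video into a centred 9:16 crop; it assumes ffmpeg is installed, the filenames are placeholders, and a centred crop won’t always keep your subject in frame, which is exactly what the AI reframing tools below are for.

```python
# Minimal sketch: centre-crop a 16:9 video to 9:16 using ffmpeg.
# Assumes ffmpeg is installed and on your PATH; filenames are placeholders.
import subprocess

def crop_to_vertical(input_path: str, output_path: str) -> None:
    """Crop the middle of the frame to a 9:16 aspect ratio, keeping full height."""
    subprocess.run(
        [
            "ffmpeg",
            "-i", input_path,
            # crop width = height * 9/16, full height; ffmpeg centres the crop by default
            "-vf", "crop=ih*9/16:ih",
            "-c:a", "copy",  # leave the audio untouched
            output_path,
        ],
        check=True,
    )

crop_to_vertical("my_clip_16x9.mp4", "my_clip_9x16.mp4")
```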

Pro editing software

If you have access to it, something like Adobe Premiere will let you make all the edits you need and then some. Adobe is doing a great job of infusing AI features into Premiere, so you can use the Auto Reframe feature to switch your video to vertical and keep the subjects centered.

However, Premiere is a pro tool, built for pro people. So if you’re just getting started with video editing it can be a little (or very) intimidating. A nice middle-ground is Premiere Rush, a stripped-down version of the editor built specifically with creators in mind.

Pros

  • Incredibly powerful
  • Super fast, once you know what you’re doing

Cons

  • Complex for newbies
  • Monthly subscription fee

Consumer apps

For more basic editing tasks – cutting montages together, adding music etc. – there are plenty of free apps that will get you most of the way there.

iMovie for Mac, Lightworks, and plenty of others are a great way to get started with editing, and they won’t cost you a penny. However, each has its limitations, and none are built specifically for recutting clips for social channels, so they won’t have AI reframing, auto-captions and all the nice features you’ll get in a pro package.

Pros

  • Free
  • Good entry point to editing

Cons

  • Limitations on usage (resolution, duration etc.)
  • No dedicated reframing support

Dedicated social clipping apps

If you’ll be doing a lot of clipping, cropping and reframing for your social channels, you should check out a dedicated social clipping app. These don’t come with all the bells and whistles of a full editing suite, but they do have features built specifically for social video: cropping from horizontal to vertical, AI reframing, captions, and export in popular formats.

And, as you’ve probably guessed, Imaginario AI is one such platform! In fact, we can handle the whole process of taking a video from YouTube and clipping it either for download or for social channels. Simply import a YouTube video directly from your channel, choose your clip, and reframe. Give it a try: you can create 5 clips per month totally free!



Article credits

Image generation: OpenAI
TikTok creation: Imaginario AI
]]>