ai-video-generator
shell_execution, network_access, filesystem_access, credential_access, automation_control
name: ai-video-generator
description: >-
  Generate short-form videos with AI: script writing, text-to-speech narration,
  stock footage selection, subtitle generation, and video assembly. Use when:
  creating TikTok/YouTube Shorts/Reels content, automating video production,
  building content pipelines.
license: Apache-2.0
compatibility: "Python 3.10+, FFmpeg"
metadata:
  author: terminal-skills
  version: "1.0.0"
  category: content
  tags: [video-generation, tiktok, youtube-shorts, content-creation, ai-video]
  use-cases:
    - "Generate 50 short-form videos per day for TikTok/YouTube Shorts"
    - "Build an automated content pipeline: topic → script → voice → video"
    - "Create educational or explainer videos with AI narration"
  agents: [claude-code, openai-codex, gemini-cli, cursor]
AI Video Generator — Short-Form Content Pipeline
Overview
Automate creation of short-form videos (TikTok, YouTube Shorts, Instagram Reels) using AI for every step: topic research, script writing, text-to-speech narration, stock footage matching, subtitle generation, and final assembly. Inspired by MoneyPrinterTurbo (53k+ stars).
Instructions
Step 1: Set Up the Environment
pip install anthropic openai requests moviepy pydub whisperx srt schedule
sudo apt install ffmpeg # Linux — or: brew install ffmpeg (macOS)
API keys needed: Anthropic or OpenAI (scripts), ElevenLabs or OpenAI TTS (voice), Pexels (free stock footage).
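Before running the pipeline it can help to confirm that the keys are present. The sketch below is one way to do that, not part of the original skill. `missing_api_keys` and `REQUIRED_KEY_GROUPS` are hypothetical names; `ANTHROPIC_API_KEY` and `OPENAI_API_KEY` are assumed from the SDKs' default environment lookups, while `ELEVENLABS_API_KEY` and `PEXELS_API_KEY` follow the names used in the later steps.

```python
import os

# One entry per pipeline stage; any one variable in a group is enough.
# These groupings are an assumption based on the alternatives listed above.
REQUIRED_KEY_GROUPS = {
    "script": ("ANTHROPIC_API_KEY", "OPENAI_API_KEY"),
    "voice": ("ELEVENLABS_API_KEY", "OPENAI_API_KEY"),
    "footage": ("PEXELS_API_KEY",),
}

def missing_api_keys(env=None):
    """Return the stages for which no candidate key is set (hypothetical helper)."""
    env = os.environ if env is None else env
    return [stage for stage, names in REQUIRED_KEY_GROUPS.items()
            if not any(env.get(name) for name in names)]
```

Calling `missing_api_keys()` with no arguments reads the real environment; an empty list means every stage has at least one usable key.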
Step 2: AI Script Writing
import anthropic

def generate_script(topic, duration_seconds=45):
    """Generate a video script optimized for short-form content."""
    client = anthropic.Anthropic()
    prompt = f"""Write a {duration_seconds}-second video script about: {topic}
Format:
HOOK (first 3 seconds): A shocking statement or question that stops scrolling
BODY (main content): 3-5 punchy facts or points, each 1-2 sentences
CTA (last 5 seconds): Call to action (follow, like, comment)
Rules:
- Conversational, no complex sentences
- Each sentence on its own line
- ~{duration_seconds * 2.5:.0f} words ({duration_seconds}s at 150wpm)
- Use power words: secret, shocking, nobody tells you, actually
- No emojis or hashtags; this is a voiceover script
"""
    response = client.messages.create(
        model="claude-sonnet-4-20250514", max_tokens=500,
        messages=[{"role": "user", "content": prompt}]
    )
    return response.content[0].text
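The prompt asks for HOOK, BODY, and CTA labels, so the returned text may contain them, and a text-to-speech engine would read them aloud. A minimal sketch of stripping them and sanity-checking the length follows. `script_to_narration` and `estimate_seconds` are hypothetical helpers, and the pattern assumes the model returns the labels as plain text in the format requested (it would not catch Markdown-styled labels).

```python
import re

# Section labels the prompt asks for, optionally followed by a parenthetical
# such as "(first 3 seconds)".
LABEL_RE = re.compile(r"^\s*(HOOK|BODY|CTA)\s*(\([^)]*\))?\s*:\s*", re.IGNORECASE)

def script_to_narration(script):
    """Drop section labels and blank lines so only spoken text remains."""
    lines = []
    for line in script.splitlines():
        spoken = LABEL_RE.sub("", line).strip()
        if spoken:
            lines.append(spoken)
    return "\n".join(lines)

def estimate_seconds(text, words_per_minute=150):
    """Estimate narration length from word count at the given speaking rate."""
    return len(text.split()) * 60 / words_per_minute
```

Comparing `estimate_seconds(script_to_narration(script))` against the requested duration is a cheap way to catch scripts that run long before paying for narration.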
Step 3: Text-to-Speech Narration
import os
import requests

def generate_voice_elevenlabs(text, output_path='narration.mp3'):
    """Generate voiceover using ElevenLabs."""
    # The last path segment of the URL is the voice ID.
    url = "https://api.elevenlabs.io/v1/text-to-speech/21m00Tcm4TlvDq8ikWAM"
    headers = {"xi-api-key": os.environ["ELEVENLABS_API_KEY"], "Content-Type": "application/json"}
    data = {"text": text, "model_id": "eleven_turbo_v2_5",
            "voice_settings": {"stability": 0.5, "similarity_boost": 0.75}}
    response = requests.post(url, json=data, headers=headers, timeout=120)
    response.raise_for_status()  # otherwise an error body would be saved as audio
    with open(output_path, 'wb') as f:
        f.write(response.content)
    return output_path

def generate_voice_openai(text, output_path='narration.mp3'):
    """Generate voiceover using OpenAI TTS (cheaper alternative)."""
    from openai import OpenAI
    client = OpenAI()
    # Stream to disk; stream_to_file on a non-streaming response is deprecated.
    with client.audio.speech.with_streaming_response.create(
        model="tts-1-hd", voice="onyx", input=text
    ) as response:
        response.stream_to_file(output_path)
    return output_path
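Since both functions share the same `(text, output_path)` signature, one can be used as a fallback for the other. The wrapper below is a sketch under that assumption; `synthesize_with_fallback` is a hypothetical name, and it only helps if each backend raises on failure (for example by calling `response.raise_for_status()` on HTTP errors).

```python
def synthesize_with_fallback(text, output_path, backends):
    """Try each TTS callable in order; return the first path produced."""
    errors = []
    for backend in backends:
        try:
            return backend(text, output_path)
        except Exception as exc:  # broad on purpose: SDK and HTTP errors differ
            errors.append(f"{backend.__name__}: {exc}")
    raise RuntimeError("all TTS backends failed: " + "; ".join(errors))
```

A call might look like `synthesize_with_fallback(script, 'narration.mp3', [generate_voice_elevenlabs, generate_voice_openai])`.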
Step 4: Stock Footage Selection
def search_pexels_videos(query, count=5):
    """Search Pexels for portrait-oriented stock video clips."""
    # Relies on the `os` and `requests` imports from Step 3.
    url = "https://api.pexels.com/videos/search"
    headers = {"Authorization": os.environ["PEXELS_API_KEY"]}
    params = {"query": query, "per_page": count, "orientation": "portrait", "size": "medium"}
    response = requests.get(url, headers=headers, params=params, timeout=30)
    response.raise_for_status()
    videos = response.json().get('videos', [])
    results = []
    for v in videos:
        # Sort renditions tallest first; `height` can be missing or null.
        files = sorted(v['video_files'], key=lambda x: x.get('height') or 0, reverse=True)
        if not files:
            continue  # nothing downloadable for this video
        hd = next((f for f in files if (f.get('height') or 0) >= 720), files[0])
        results.append({'id': v['id'], 'url': hd['link'], 'duration': v['duration']})
    return results
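The selection above takes the tallest rendition of each video, which can be a large 4K file when a 720p or 1080p one would do for a 1080x1920 output. As an alternative sketch, the hypothetical helper `pick_video_file` chooses the smallest rendition that still meets a minimum height. It assumes each entry is a dict with `height` and `link` keys, as in the code above.

```python
def pick_video_file(video_files, min_height=720):
    """Return the smallest rendition at least min_height tall, else the tallest."""
    if not video_files:
        return None
    # Treat a missing or null height as 0 so such entries sort first.
    by_height = sorted(video_files, key=lambda f: f.get("height") or 0)
    for f in by_height:
        if (f.get("height") or 0) >= min_height:
            return f
    return by_height[-1]
```

Raising `min_height` to 1920 would favor renditions that need no upscaling, at the cost of larger downloads.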
Step 5: Subtitle Generation
def generate_subtitles(audio_path, output_srt='subtitles.srt'):
    """Generate word-level subtitles using WhisperX."""
    import whisperx, srt
    from datetime import timedelta
    device = "cpu"
    # float16 is not supported on CPU, so request int8 explicitly.
    model = whisperx.load_model("base", device=device, compute_type="int8")
    audio = whisperx.load_audio(audio_path)
    result = model.transcribe(audio)
    align_model, metadata = whisperx.load_align_model(language_code="en", device=device)
    aligned = whisperx.align(result["segments"], align_model, metadata, audio, device)
    subs = []
    words = [w for seg in aligned["segments"] for w in seg.get("words", [])]
    for i in range(0, len(words), 4):  # four words per subtitle
        group = words[i:i + 4]
        # Words the aligner could not place (often numerals) carry no timestamps.
        timed = [w for w in group if 'start' in w and 'end' in w]
        if not timed:
            continue
        start = timedelta(seconds=timed[0]['start'])
        end = timedelta(seconds=timed[-1]['end'])
        text = ' '.join(w['word'] for w in group)
        subs.append(srt.Subtitle(index=len(subs)+1, start=start, end=end, content=text))
    with open(output_srt, 'w', encoding='utf-8') as f:
        f.write(srt.compose(subs))
    return output_srt
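The grouping step can be tried without running WhisperX. The sketch below uses only the standard library and writes SRT text by hand instead of using the `srt` package. `format_srt_time` and `words_to_srt` are hypothetical helpers, and the input is assumed to be a list of dicts with a `word` key and optional `start` and `end` times in seconds.

```python
def format_srt_time(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def words_to_srt(words, group_size=4):
    """Build SRT text from word dicts, timing each group by its timed words."""
    blocks = []
    for i in range(0, len(words), group_size):
        group = words[i:i + group_size]
        starts = [w["start"] for w in group if "start" in w]
        ends = [w["end"] for w in group if "end" in w]
        if not starts or not ends:
            continue  # no usable timing in this group
        text = " ".join(w["word"].strip() for w in group)
        blocks.append(
            f"{len(blocks) + 1}\n"
            f"{format_srt_time(starts[0])} --> {format_srt_time(ends[-1])}\n"
            f"{text}\n"
        )
    return "\n".join(blocks)
```

Untimed words stay in the subtitle text and borrow their timing from the timed words in the same group, so nothing spoken is dropped from the captions.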
Step 6: Video Assembly with FFmpeg
import os
import subprocess

def assemble_video(clips, narration, subtitles, output='final.mp4'):
    """Assemble final video: concatenate clips, add narration and subtitles."""
    concat_list = 'concat_list.txt'
    with open(concat_list, 'w') as f:
        for clip in clips:
            # Absolute paths avoid surprises: the concat demuxer resolves
            # relative paths against the list file's directory.
            f.write(f"file '{os.path.abspath(clip)}'\n")
    # Pass 1: join the clips, scale and crop to 1080x1920, drop their audio.
    subprocess.run([
        'ffmpeg', '-y', '-f', 'concat', '-safe', '0', '-i', concat_list,
        '-vf', 'scale=1080:1920:force_original_aspect_ratio=increase,crop=1080:1920',
        '-c:v', 'libx264', '-preset', 'fast', '-an', 'temp_video.mp4'
    ], check=True)
    subtitle_filter = (f"subtitles={subtitles}:force_style='"
                       "FontName=Arial,FontSize=18,PrimaryColour=&H00FFFFFF,"
                       "OutlineColour=&H00000000,Outline=2,Bold=1,Alignment=2'")
    # Pass 2: burn in subtitles and add narration. -shortest ends the output
    # with the shorter stream, so footage shorter than the narration cuts it off.
    subprocess.run([
        'ffmpeg', '-y', '-i', 'temp_video.mp4', '-i', narration,
        '-vf', subtitle_filter, '-c:v', 'libx264', '-c:a', 'aac',
        '-shortest', output
    ], check=True)
    return output
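FFmpeg's concat demuxer expects every input to share the same codec and stream parameters, and stock clips from different sources usually do not. One option is to re-encode each clip to a common format before concatenating. The sketch below only builds the command; `normalize_clip_command` is a hypothetical helper, and the 30 fps and `yuv420p` choices are assumptions rather than requirements of the original pipeline.

```python
def normalize_clip_command(src, dst, width=1080, height=1920, fps=30):
    """Build an FFmpeg command that re-encodes one clip to a common format."""
    # Fill the target frame, crop the overflow, and force a constant frame rate.
    vf = (f"scale={width}:{height}:force_original_aspect_ratio=increase,"
          f"crop={width}:{height},fps={fps}")
    return ["ffmpeg", "-y", "-i", src, "-vf", vf,
            "-c:v", "libx264", "-preset", "fast", "-pix_fmt", "yuv420p",
            "-an", dst]
```

Running `subprocess.run(normalize_clip_command(src, dst), check=True)` on each downloaded clip, then listing the normalized files in the concat list, should give the demuxer uniform inputs.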
Step 7: Full Pipeline
def generate_video(topic, output_dir='./output'):
    """Complete pipeline: topic -> finished video."""
    import os
    os.makedirs(output_dir, exist_ok=True)
    script = generate_script(topic)
    narration = generate_voice_elevenlabs(script, f'{output_dir}/narration.mp3')
    keywords = topic.split()[:3]
    videos = search_pexels_videos(' '.join(keywords), count=3)
    clips = []
    for i, v in enumerate(videos):
        path = f'{output_dir}/clip_{i}.mp4'
        # Stream each clip to disk in chunks rather than holding it in memory.
        with requests.get(v['url'], stream=True, timeout=120) as r:
            r.raise_for_status()
            with open(path, 'wb') as f:
                for chunk in r.iter_content(chunk_size=1 << 20):
                    f.write(chunk)
        clips.append(path)
    if not clips:
        raise RuntimeError(f"No stock footage found for: {' '.join(keywords)}")
    subs = generate_subtitles(narration, f'{output_dir}/subs.srt')
    return assemble_video(clips, narration, subs, f'{output_dir}/final.mp4')
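Taking the first three words of the topic can send filler such as "that" or "nobody" to the footage search. A sketch of a slightly more selective keyword step follows. `search_keywords` is a hypothetical helper, and `STOPWORDS` is a small illustrative list, not an exhaustive one.

```python
import re

# A small illustrative stopword list; extend it for real use.
STOPWORDS = {"a", "an", "the", "that", "to", "for", "of", "in", "on", "and",
             "will", "your", "my", "every", "nobody", "about", "use", "feel"}

def search_keywords(topic, limit=3):
    """Pick up to `limit` content words from a topic for a footage search."""
    words = re.findall(r"[A-Za-z0-9]+", topic.lower())
    return [w for w in words if w not in STOPWORDS][:limit]
```

The result can be joined with spaces and passed to `search_pexels_videos` in place of `topic.split()[:3]`.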
Examples
Example 1: Generate a Batch of Tech Fact Videos
A creator produces 5 technology-themed short videos for TikTok in one run:
topics = [
"AI tools nobody talks about",
"Apps that feel illegal to use for free",
"Websites that will blow your mind",
"Free AI tools every student needs",
"Tech gadgets under that changed my life"
]
for topic in topics:
    output = generate_video(topic, output_dir=f'./output/{topic[:30]}')
    print(f"Video ready: {output}")
# Each video: ~45 seconds, portrait 1080x1920, with subtitles and narration
# Total cost: ~.25 (5 x .05 per video) using ElevenLabs + Claude Sonnet
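Using `topic[:30]` as a directory name keeps spaces and can leave a trailing space after truncation, which some filesystems and shell commands handle poorly. A sketch of a safer name, using a hypothetical `topic_slug` helper:

```python
import re

def topic_slug(topic, max_length=30):
    """Turn a topic into a lowercase, hyphen-separated directory name."""
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", topic.lower()).strip("-")
    return slug[:max_length].rstrip("-") or "untitled"
```

The loop could then pass `output_dir=f'./output/{topic_slug(topic)}'`.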
Example 2: Daily Finance Shorts for YouTube
A finance channel generates a daily Short on a money topic (uploading is not covered here):
import os
import time
import requests
import schedule

def daily_finance_video():
    topic = "3 passive income ideas that actually work in 2025"
    os.makedirs('./daily', exist_ok=True)
    script = generate_script(topic, duration_seconds=55)
    # Script output:
    # HOOK: "You're losing money every single day you don't know about these."
    # BODY: 1. Print-on-demand stores (-2k/mo)
    #       2. AI-generated content licensing (-1k/mo)
    #       3. Dividend ETF stacking (-800/mo passive)
    # CTA: "Follow for more money tips that nobody tells you about."
    narration = generate_voice_openai(script, './daily/narration.mp3')
    videos = search_pexels_videos("money finance investing", count=4)
    # Download the portrait clips of money/charts/lifestyle footage
    clips = []
    for i, v in enumerate(videos):
        path = f'./daily/clip_{i}.mp4'
        with open(path, 'wb') as f:
            f.write(requests.get(v['url'], timeout=120).content)
        clips.append(path)
    subs = generate_subtitles(narration, './daily/subs.srt')
    # Assemble with bold white subtitles, output a 55-second Short
    return assemble_video(clips, narration, subs, './daily/final.mp4')

schedule.every().day.at("08:00").do(daily_finance_video)
# schedule only registers the job; a loop has to run pending jobs.
while True:
    schedule.run_pending()
    time.sleep(60)
Guidelines
- Respect copyright: only use royalty-free stock footage (Pexels, Pixabay) or your own content
- Disclose AI usage: YouTube and TikTok require creators to label realistic AI-generated content; check each platform's current policy
- Review before publishing: always watch the final video; AI scripts can contain inaccuracies
- Optimize for the first 3 seconds: the hook determines whether viewers stay or scroll
- Test multiple voices: ElevenLabs offers dozens of voices; find one that fits your niche
- Monitor performance: track views, retention, and click-through to iterate on content style
References
- MoneyPrinterTurbo — original inspiration (53k stars)
- Pexels API — free stock video
- ElevenLabs — realistic text-to-speech
- WhisperX — word-level subtitle alignment