AI Video Ad Generator - Automatic video ad generation system with Sora 2

Process description
Automation for creating authentic UGC (User-Generated Content) video ads using OpenAI's latest Sora 2 model. The system analyzes the product image, creates a detailed profile of the ideal ambassador, generates a 12-second “raw” video script in the style of handheld iPhone footage, and automatically produces the finished video. The whole process, from uploading a product photo to receiving the finished video, takes only 5-10 minutes, replacing days of work by an entire production team.
API keys and services:
- OpenAI API - GPT-4.1 for analysis and Sora 2 for video generation
- Google Gemini 2.5 API - image processing and prompt generation
- Google Drive OAuth2 - storing ready-made videos
- n8n Form Trigger - upload interface
System architecture by blocks
SECTION 1: DATA INITIATION AND PREPARATION
1.1 Form Trigger — Entry Point
Purpose: Web form for uploading the product image and entering the product name
Form settings:
- Form Title: “Video Generator”
- Field 1: Product (file, required)
- Field 2: Product Name (text, required)
Interface: Simple drag-and-drop form for product images
1.2 Extract from File - Image conversion
Purpose: Extracting binary image data for processing
Settings:
- Operation: BinaryToProperty
- Binary Property Name: Product
1.3 Convert to File — AI Preparation
Purpose: Conversion to a format for transfer to AI models
Settings:
- Operation: toBinary
- Source Property: data
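Steps 1.2 and 1.3 amount to a round trip: the uploaded binary is encoded as a base64 string in the `data` property, then decoded back into a file for the AI nodes. A minimal sketch of that idea in plain Node.js follows; `binaryToProperty` and `propertyToBinary` are hypothetical helper names chosen for illustration, not n8n functions, and the default MIME type is an assumption.

```javascript
// Sketch of sections 1.2-1.3: binary image -> base64 string -> binary again.
// Both helper names are hypothetical; they only mirror the node operations.

function binaryToProperty(buffer) {
  // Encode raw bytes as a base64 string stored under the `data` property.
  return { data: buffer.toString("base64") };
}

function propertyToBinary(item, mimeType = "image/jpeg") {
  // Decode the base64 string back into bytes for the AI nodes.
  // The default MIME type is an assumption for this example.
  return { mimeType, buffer: Buffer.from(item.data, "base64") };
}

// Usage: the first three bytes of a JPEG file survive the round trip.
const original = Buffer.from([0xff, 0xd8, 0xff]);
const roundTrip = propertyToBinary(binaryToProperty(original));
console.log(roundTrip.buffer.equals(original)); // true
```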
SECTION 2: AI ANALYSIS AND PERSONA CREATION
2.1 analyze_product - In-depth product analysis
Purpose: OpenAI Vision analyzes the product and creates a detailed profile of the target user
Settings:
- Model: chatgpt-4o-latest
- Operation: analyze
- Input Type: base64
Mega-prompt for analysis (abbreviated):
```
//ROLE AND PURPOSE//
You are an expert in casting and consumer psychology...
Your only task is to analyze the product and create
a detailed profile of the ideal person for UGC advertising.
//PROFILE STRUCTURE//
I. Basic identity (name, age, location, profession)
II. Appearance and style (detailed description of the image)
III. Personality and communication (demeanor, speech style)
IV. Lifestyle (hobbies, values, pain points)
V. The rationale for trust (why this particular person)
```
Result: A full character profile of 500+ words
2.2 set_model_details - Save profile
Purpose: Structuring the persona data for the following stages
Settings:
- Assignment: prompt = $json.content
SECTION 3: VIDEO SCRIPT GENERATION
3.1 set_build_video_prompts - Preparing a master prompt
Purpose: Creating a detailed 12-second UGC video script
Key elements of the prompt:
```
Master Prompt: Raw 12-second UGC video script
AESTHETICS:
✓ Handheld camera shake
✓ Natural camera movement
✓ Real lighting and locations
✓ Authentic imperfections
WE AVOID:
✗ Tripods or stabilization
✗ Text overlays
✗ Professional editing
✗ Clean backgrounds
FRAME-BY-FRAME STRUCTURE:
[0-2 sec] Hook: opens mid-conversation
[2-9 sec] Product demonstration in action
[9-12 sec] Natural ending
```
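One way to picture how the master prompt is assembled is a small function that merges the product name, the persona profile from step 2.2, and the fixed style rules and timeline listed above. This is only a sketch: `buildVideoPrompt` and the exact wording of the output are hypothetical and do not reproduce the workflow's real prompt.

```javascript
// Hypothetical helper: assembles a master prompt from the elements above.
// The wording is illustrative and not the workflow's actual prompt text.
function buildVideoPrompt(productName, personaProfile) {
  // Three beats of the 12-second video, as listed in the frame structure.
  const timeline = [
    { from: 0, to: 2, beat: "Hook: opens mid-conversation" },
    { from: 2, to: 9, beat: "Product demonstration in action" },
    { from: 9, to: 12, beat: "Natural ending" },
  ];

  const lines = [
    `Raw 12-second UGC video script for: ${productName}`,
    "",
    "PERSONA:",
    personaProfile.trim(),
    "",
    "AESTHETICS: handheld camera shake, natural camera movement, real lighting and locations, authentic imperfections",
    "AVOID: tripods or stabilization, text overlays, professional editing, clean backgrounds",
    "",
    "TIMELINE:",
    ...timeline.map((s) => `[${s.from}-${s.to} sec] ${s.beat}`),
  ];
  return lines.join("\n");
}

// Usage with placeholder inputs.
console.log(buildVideoPrompt("Face serum", "28-year-old graphic designer, minimalist style"));
```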
3.2 generate_ad_prompts - Creating the final script
Purpose: Gemini 2.5 Pro generates a detailed frame-by-frame scenario
Settings:
- Model: gemini-2.5-pro
- Endpoint: GenerateContent
- Input: Prompt + product image
Output: Second-by-second breakdown with a description of each frame, camera movements, and dialogue
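A request of this kind can be sketched as a plain object holding the URL, headers, and body. The `x-goog-api-key` header and the model name come from this document; the URL pattern and the `contents`/`parts`/`inline_data` body shape are assumptions based on the public Gemini REST API and should be checked against the current reference. `buildGeminiRequest` is a hypothetical helper, and nothing is sent over the network here.

```javascript
// Hypothetical helper: describes a generateContent call without sending it.
// URL pattern and body shape are assumptions; verify against the Gemini docs.
function buildGeminiRequest(apiKey, promptText, imageBase64, mimeType = "image/jpeg") {
  const model = "gemini-2.5-pro";
  return {
    url: `https://generativelanguage.googleapis.com/v1beta/models/${model}:generateContent`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "x-goog-api-key": apiKey, // header-based authentication
    },
    body: JSON.stringify({
      contents: [
        {
          parts: [
            { text: promptText }, // the master prompt
            { inline_data: { mime_type: mimeType, data: imageBase64 } }, // product image
          ],
        },
      ],
    }),
  };
}

// Usage with placeholder values.
const geminiRequest = buildGeminiRequest("YOUR_API_KEY", "Write the script", "/9j/");
console.log(geminiRequest.url);
```

The resulting object can be passed to an HTTP client or copied field by field into an HTTP Request node.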
3.3 Message a model - Cleaning the prompt
Purpose: GPT-4.1 strips unnecessary commentary from the generated text
Prompt:
```
Filter comments, leave only a clean prompt for the video
```
SECTION 4: PREPARING THE FIRST FRAME
4.1 generate_frame - Format adaptation
Purpose: Gemini adapts the product image to the 9:16 vertical format
Settings:
- Model: gemini-2.5-flash-image-preview
- Special prompt: Adapting to the aspect ratio while preserving the composition
Technique: Intelligent background extension without distortion
4.2 set_frame_result - Extracting the result
Purpose: Parsing Gemini's response to get a base64 image
Expression:
```javascript
$json.candidates[0].content.parts.filter(item => item.inlineData).first().inlineData.data
```
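Outside n8n, the same extraction can be written as a plain JavaScript function. The sketch below assumes the response has the `candidates[0].content.parts` shape used in the expression above; `extractInlineImage` is a hypothetical name, and `find` stands in for n8n's `filter(...).first()`.

```javascript
// Hypothetical helper: returns the base64 data of the first image part.
function extractInlineImage(response) {
  // Fall back to an empty list if any level of the response is missing.
  const parts = response?.candidates?.[0]?.content?.parts ?? [];
  // Text parts have no `inlineData`, so they are skipped.
  const imagePart = parts.find((part) => part.inlineData);
  if (!imagePart) {
    throw new Error("No inline image found in the model response");
  }
  return imagePart.inlineData.data;
}

// Usage with a made-up response that mixes a text part and an image part.
const sampleResponse = {
  candidates: [
    {
      content: {
        parts: [
          { text: "Here is the adapted image." },
          { inlineData: { mimeType: "image/png", data: "iVBORw0KGgo=" } },
        ],
      },
    },
  ],
};
console.log(extractInlineImage(sampleResponse)); // iVBORw0KGgo=
```

Throwing when no image part exists makes a text-only reply fail at this step instead of passing an empty frame to the resize node.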
4.3 get_frame_image + resize_image
Purpose: Final preparation of the first frame
Resize parameters:
- Width: 720px
- Height: 1280px
- Option: ignoreAspectRatio
SECTION 5: GENERATING VIDEOS VIA SORA
5.1 generate_video — Launching Sora 2
Purpose: Submitting a video generation request to OpenAI Sora
API parameters:
- Endpoint: https://api.openai.com/v1/videos
- Model: sora-2
- Duration: 12 seconds
- Size: 720x1280 (vertical format)
- Input: First frame + detailed prompt
Request format:
```json
{
  "prompt": "[Detailed video script]",
  "model": "sora-2",
  "seconds": 12,
  "size": "720x1280",
  "input_reference": "[base64 first frame]"
}
```
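The request fields above can be assembled and sanity-checked before the call is made. The sketch below reuses the field names and values from the request format as written in this document; `buildVideoRequest` is a hypothetical helper, and whether the live endpoint expects a JSON body or a multipart upload for `input_reference` is not confirmed here and should be checked against the OpenAI API reference.

```javascript
// Hypothetical helper: builds the payload described in the request format.
// Field names and values are taken from this document, not from the API docs.
function buildVideoRequest(prompt, firstFrameBase64) {
  if (typeof prompt !== "string" || prompt.trim() === "") {
    throw new Error("prompt must be a non-empty string");
  }
  if (typeof firstFrameBase64 !== "string" || firstFrameBase64 === "") {
    throw new Error("first frame (base64) is required");
  }
  return {
    prompt: prompt.trim(),
    model: "sora-2",
    seconds: 12, // video length in seconds
    size: "720x1280", // vertical 9:16 output
    input_reference: firstFrameBase64, // first frame from section 4
  };
}

// Usage with placeholder values.
console.log(JSON.stringify(buildVideoRequest("A woman films herself...", "/9j/"), null, 2));
```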
5.2 Status monitoring cycle
Components:
1. delay (Wait) - Wait 15 seconds
2. get_video_status - Checking the generation status
3. check_status (If) - Completed/processing check
4. Cycle - Return to delay if not ready
Logic:
```
generate_video → delay → get_video_status → check_status
↑___________________________|
```
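The monitoring cycle is a standard polling loop: wait, check the status, and repeat until the job is done. A minimal sketch follows, assuming the status check returns an object with a `status` field whose value is `"completed"` when the video is ready. The `"failed"` status and the `maxAttempts` cap are additions for safety that this workflow does not describe, and `waitForVideo` is a hypothetical name.

```javascript
// Hypothetical helper mirroring delay -> get_video_status -> check_status.
// `getStatus` is any async function that returns { status: "..." }.
async function waitForVideo(getStatus, { intervalMs = 15000, maxAttempts = 40 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    // delay: wait before each status check (15 seconds by default).
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
    // get_video_status: ask for the current state of the job.
    const job = await getStatus();
    // check_status: stop when done, otherwise loop back to the delay.
    if (job.status === "completed") return job;
    if (job.status === "failed") throw new Error("Video generation failed"); // assumed status value
  }
  throw new Error(`Video not ready after ${maxAttempts} checks`);
}

// Usage with a stub that reports "processing" twice, then "completed".
let checks = 0;
const fakeStatus = async () => ({ status: ++checks < 3 ? "processing" : "completed" });
waitForVideo(fakeStatus, { intervalMs: 0 }).then((job) => console.log(job.status, checks));
```

Capping the number of attempts keeps a stuck job from looping indefinitely; with the default values the loop gives up after about ten minutes.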
5.3 get_video - Download the finished video
Purpose: Getting the final video after the generation is complete
Endpoint: https://api.openai.com/v1/videos/{id}/content
SECTION 6: SAVING THE RESULT
6.1 upload_video — Uploading to Google Drive
Purpose: Automatically save the finished video
Settings:
- Name: Video # {runIndex + 1}
- Drive: your Google Drive
- Folder: your folder
- Credentials: Google Drive OAuth2
Required services and their configuration
OpenAI setup:
- API key with access to:
- GPT-4.1 or GPT-4o
- ChatGPT Vision API
- Sora 2 API (requires special access)
- Limits:
- Sora 2: Check video generation quotas
- Recommended balance: $50+ for testing
Google Gemini setup:
- Obtaining an API key:
- Google AI Studio
- Enable Gemini 2.5 Flash and Pro
- Setup in n8n:
- HTTP Header Authentication
- Header: x-goog-api-key
Google Drive setup:
- OAuth2 connection:
- Create a project in Google Cloud Console
- Enable Drive API
- Configure OAuth2 credentials
System capabilities
Types of generated content:
- UGC video styles:
- Product reviews
- Unboxing
- First impressions
- Usage tutorials
- Before/after comparisons
- Formats and styles:
- iPhone selfie video
- Mirror shots
- POV demonstrations
- Lifestyle content
- Testimonial video
Unique features:
- Hyperrealism - simulating a real shoot on a phone
- Personalization - a unique character for each product
- Authenticity - natural defects and imperfections
- Speed - 5-10 minutes versus days of production
Use cases
1: Cosmetic product
Input: Photo of a face serum
AI analysis creates a persona:
- 28-year-old woman, graphic designer
- Minimalist style, natural beauty
- Speaks fast, enthusiastically
The result: A 12-second video of a morning routine with natural light from the bathroom window
2: Technical gadget
Input: Photo of wireless headphones
AI analysis creates a persona:
- 32-year-old male developer
- Casual style, glasses, beard
- A calm, technical way of speaking
The result: Unboxing video on a cluttered desk with code on the monitor in the background
3: Food product
Input: Photo of a protein bar
AI analysis creates a persona:
- 25-year-old woman, fitness trainer
- Sportswear, high ponytail
- Energetic, motivating delivery
The result: Video of a snack after a workout in the gym locker room
Advanced settings and optimization
Fine-tuning prompts:
- Regional adaptation:
- Accents and dialects
- Local references
- Cultural characteristics
- Platform optimization:
- TikTok: More dynamic editing
- Instagram Reels: aesthetics
- YouTube Shorts: informative
- Audience targeting:
- Gen Z: fast paced, memes
- Millennials: Authenticity, Stories
- Gen X: practicality, details
Advanced techniques:
- A/B testing:
- Different personas for one product
- Scenario variations
- Different locations and lighting
- Mass production:
- Batch product processing
- Automatic series generation
- Cross-product campaigns
Integrations and extensions
Possible additions:
- Social Media Publishing:
- Auto posting on TikTok/Instagram
- Publication planner
- Hashtag generator
- Performance Tracking:
- Connecting to Meta Ads
- Google Analytics integration
- ROI calculator
- Brand Safety:
- Content moderation
- Brand guidelines check
- Compliance filters
- Multimodal generation:
- Add music/sounds
- Subtitles and captions
- Multilingual versions
Practical value
For small businesses:
- Affordable video marketing without a production budget
- Launch campaigns quickly for new products
- Testing creatives at minimal cost
For agencies:
- Scaling up production without increasing the team
- Customization for each client
- An innovative offering for clients
For e-commerce:
- Videos for every SKU in the catalogue
- Dynamic generation for seasons/promotions
- Content localization for different markets
The result of the system
This automation is a full-fledged video studio in the cloud that creates professional-grade content in minutes. The combination of AI product analysis, authentic script generation, and Sora 2 video generation makes possible what seemed fantastic a year ago - the instant creation of realistic videos without actors, studios, or equipment.