Short-Video Agent · Desktop client
Two modes · 7 AI steps · 4-platform publish

Drop a link or fill in a profile,
AI ships the short video

Script · voice clone · digital human · titles & tags · subtitles & music · cover · one-click publish — AI handles all 7 steps. You just do Step 1.

✓ Two creation modes: benchmark video learning + profile-based generation
✓ Unlimited digital-human drive: upload yourself once
✓ One-click publish to 4 major platforms
✓ Native macOS and Windows builds
Short-Video Agent · Desktop client
Step 1 · Pick one mode
Benchmark video
Fill profile
Industry & persona: Restaurant · Bin · 10 yrs experience
Featured product: Spicy hot pot · value tier
Selling point: Regular 99 · today 59
AI · 7 automated steps
  1. Research + script
  2. Voice clone
  3. Digital human
  4. Titles + tags
  5. Subtitles + music
  6. Cover
  7. Publish
~30 min later
Douyin
Kuaishou
Xiaohongshu
WeChat Channels
Published to 4 platforms
7 steps · Full automation · Standard product pipeline
~30 min · Per video · Reference example, varies by complexity
4 platforms · One-click publish · Douyin / Kuaishou / Xiaohongshu / WeChat Channels
12+ · Industries supported · Restaurants / Renovation / Aesthetics / etc.
WHY US

Many tools claim AI short video — here is the difference

Three things that set us apart
01

Truly end-to-end AI

Not template stitching — the agent plans and executes 7 connected steps from script through voice, digital human, titles, subtitles, cover, to publish. In Step 1 you pick a creation mode and fill a profile; the remaining 6 steps run untouched.

02

Unlimited digital-human drive

Upload a 10-30 second clip of yourself once. The AI learns your face and lip movements, then drives unlimited new videos, with no per-video license and no need to re-record.

03

Compliance base you can trust

Xinghan Cloud holds three value-added telecom licenses (IDC / CDN / ISP). AI-generated content follows model-platform policies and regulatory requirements; the review trail is preserved and auditable.

Seven Steps

AI runs all 7 steps · You do step 1

The full pipeline, automated

Enter your IP (personal brand) info and the agent handles the rest.

01

Research + script

AI learns from your benchmark content (video link or profile), analyzes viral patterns on each platform, and writes a high-converting voice-over script.

02

Voice clone

Driven by NodeKey TTS: pick from 50+ voices or clone your own. Natural delivery and intonation.

03

Digital human

Upload yourself once (10-30s). The AI drives lip-sync and likeness — no re-recording required.

04

Titles + tags

Auto-generates video titles, platform tags, and SEO keywords — improving algorithmic reach.

05

Subtitles + music

AI transcribes audio into smart subtitles and matches background music to the video rhythm — no post-production tools needed.

06

Cover design

AI generates click-optimized cover images that fit each platform's thumbnail spec.

07

Publish

One click to Douyin, Kuaishou, Xiaohongshu, and WeChat Channels — no duplicate work.
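
The seven steps above form a linear pipeline: Step 1 takes the user's input, and each later step builds on what the earlier ones produced. A minimal Python sketch of that shape follows. It is illustrative only: the `Profile` fields, the `STEPS` labels, and `run_pipeline` are hypothetical names, not the client's actual API, and each stage simply records that it ran. The sample values come from the profile mockup near the top of this page.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """User input for Step 1 (field names are hypothetical)."""
    industry: str
    persona: str
    product: str
    selling_point: str


@dataclass
class Job:
    """Carries the profile plus a log of completed steps."""
    profile: Profile
    completed: list = field(default_factory=list)


# Step order as listed on this page; the names are illustrative labels.
STEPS = [
    "research_and_script",
    "voice_clone",
    "digital_human",
    "titles_and_tags",
    "subtitles_and_music",
    "cover",
    "publish",
]


def run_pipeline(profile: Profile) -> Job:
    """Run every step in order; a real stage would call an AI service here."""
    job = Job(profile=profile)
    for step in STEPS:
        # Placeholder for the real work: just record that the step ran.
        job.completed.append(step)
    return job


job = run_pipeline(
    Profile(
        industry="Restaurant",
        persona="Bin · 10 yrs experience",
        product="Spicy hot pot · value tier",
        selling_point="Regular 99 · today 59",
    )
)
print(len(job.completed), "steps completed, last:", job.completed[-1])
```

Keeping the step order in a single list is one simple way to make "7 connected steps" explicit; a real implementation would likely pass each step's output (script, audio, video) to the next.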

Comparison · Efficiency

Traditional vs AI Agent

Not a tool upgrade — a productivity leap
Task                     Traditional          AI Agent
Topic & script           2-4 hours / video    ~30 seconds
Filming                  1-2 hours / video    Digital human 3-5 min
Editing                  2-4 hours / video    Auto
Subtitles & title        30-60 minutes        Auto
Multi-platform publish   ~30 minutes          ~5 minutes · 4 platforms
Total                    6-12 hours           ~30 minutes

Reference example · Actual time depends on content complexity and machine configuration. We make no specific efficiency guarantees.
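
As a sanity check on the Traditional total, the per-task ranges in the comparison can be summed directly. The Python sketch below uses only the figures listed above; the variable names are illustrative. The listed ranges add up to 6 to 11.5 hours, which the comparison rounds to 6-12 hours.

```python
# Traditional time per task in hours, as (low, high), taken from the comparison.
traditional_hours = {
    "Topic & script": (2.0, 4.0),
    "Filming": (1.0, 2.0),
    "Editing": (2.0, 4.0),
    "Subtitles & title": (0.5, 1.0),       # 30-60 minutes
    "Multi-platform publish": (0.5, 0.5),  # ~30 minutes
}

low = sum(lo for lo, _ in traditional_hours.values())
high = sum(hi for _, hi in traditional_hours.values())
print(f"Traditional total: {low:g}-{high:g} hours")
```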

Get Started

3 steps · First video in 30 minutes

From download to first video
01 ⏱ ~5 minutes

Install the client

  • Open the Download section · pick macOS (Apple Silicon / Intel) or Windows
  • Double-click the installer · complete first-time setup
  • Grant permissions on first launch

💡 macOS "unverified" prompt → System Settings → Privacy & Security → Allow.

02 ⏱ ~3 minutes

Configure NodeKey

  • Settings → API Keys
  • Provider: NodeKey · paste API Key · Save

💡 One key powers everything — text, voice, ASR, digital human.

03 ⏱ ~20 minutes

Likeness · Mode · Ship

  • "Likeness" → upload a 10-30s self video (face forward · clear lips)
  • Pick creation mode: "Benchmark video learning" or "Profile-based generation"
  • Fill industry / persona / product / selling point → "Generate" → the AI runs all 7 steps, ending with auto-publish

💡 Run a full cycle once to learn the flow before scaling.

You enter the IP info in Step 1; the remaining 6 steps run on AI.

Core Capabilities

Three engines under the hood

What powers the 7 steps

Every step is backed by a dedicated engine — together they support the full automation.

01 End-to-end automation

Smart IP Engine

Whole-pipeline video creation

Pick "Benchmark video learning" or "Profile-based generation" — the AI writes the script, drives the digital human, and ships the video.

  • 2 creation modes: benchmark video learning + profile-based generation
  • Auto-analyzes platform trends in your industry
  • Unlimited likeness drive — upload once, drive many
  • After Step 1, the next 6 steps run on autopilot
02 TTS engine

Voice Synthesis

50+ voices · multilingual · natural emotion

Driven by NodeKey TTS. 50+ professional voices, with dialects and multiple languages supported. Voice cloning available.

  • 50+ professional voice library
  • Dialect + multilingual support
  • Voice cloning · natural emotion
  • Lip-sync alignment with digital human
03 ASR engine

Speech Recognition

Audio in · text / subtitle out

Upload video or audio for automatic transcription, then export subtitles as SRT or VTT. NodeKey ASR provides precise timing alignment.

  • Batch transcription of audio / video
  • Precise subtitle timing
  • SRT / VTT / plain text export
  • Chinese + multilingual support
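
SRT and WebVTT differ mainly in header and timestamp separator: SRT numbers each cue and writes `HH:MM:SS,mmm`, while WebVTT starts with a `WEBVTT` line and writes `HH:MM:SS.mmm`. The Python sketch below shows that difference for a list of timed cues. It is not the client's exporter; the function names and the sample cue are made up for illustration.

```python
def timestamp(seconds: float, sep: str) -> str:
    """Format seconds as HH:MM:SS<sep>mmm (sep is ',' for SRT, '.' for VTT)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{sep}{ms:03d}"


def to_srt(cues):
    """Render cues, a list of (start_seconds, end_seconds, text), as SRT."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{timestamp(start, ',')} --> {timestamp(end, ',')}\n{text}\n"
        )
    return "\n".join(blocks)


def to_vtt(cues):
    """Render the same cues as WebVTT (header line, '.' before milliseconds)."""
    blocks = ["WEBVTT\n"]
    for start, end, text in cues:
        blocks.append(f"{timestamp(start, '.')} --> {timestamp(end, '.')}\n{text}\n")
    return "\n".join(blocks)


cues = [(0.0, 2.5, "Regular 99, today 59.")]  # hypothetical sample cue
print(to_srt(cues))
print(to_vtt(cues))
```
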
Scenarios

12+ industries · Local businesses ship videos with AI

Works for every local business

If your customers need short-video traffic, the agent fits.

Restaurants
Auto dealerships
Home renovation
Healthcare
Training & education
Beauty & cosmetics
Fitness & sports
E-commerce
Real estate
Weddings
Consumer electronics
Agriculture & fresh produce

Any vertical that needs short-video lead generation.

Install and ship the first video in 30 minutes
Download

Download now,
let AI ship your videos

Native macOS (Apple Silicon + Intel) and Windows builds. Client is free; runtime billed via NodeKey token usage.

FAQ

You might be wondering

Frequently asked
Is the client free? Does running it cost anything?

The client is free to download and install. Runtime is billed by actual token usage — AI copy / voice / digital human all run through the Xinghan Cloud Computing Power Router, settled in one NodeKey account. See the pricing center for details; most light workloads stay well under a typical monthly budget.

How realistic is the digital human?

Quality depends on the source clip you upload. We recommend a face-forward 10-30 second video with clear lip movement and even lighting. Under those conditions, lip-sync is well-aligned and the output reads naturally for everyday short-video use. For formal commercial campaigns, pair with a human review pass.

Will the videos get throttled by platforms?

Content the agent generates follows model-platform and regulatory norms with an auditable review trail. But platform throttling policies (especially around AI-generated content labels) change constantly. We recommend: 1) keep the content truthful, 2) add human review on the final cut, 3) avoid overstatement. We track platform rule changes and adjust the generation strategy accordingly.

What hardware do I need?

macOS 12+ on Apple Silicon or Intel; Windows 10+ 64-bit. 16 GB RAM recommended. Install footprint ~2 GB; each generated video is around 10-20 MB, so local storage needs are minimal.
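
To put the storage figures in perspective, the Python sketch below estimates disk use for a batch of videos from the per-video size range quoted above. The batch size of 100 is an arbitrary example, and 1 GB is taken as 1000 MB.

```python
MB_PER_VIDEO = (10, 20)   # per-video size range quoted above
INSTALL_GB = 2            # approximate install footprint quoted above
videos = 100              # arbitrary example batch size

low_gb = videos * MB_PER_VIDEO[0] / 1000
high_gb = videos * MB_PER_VIDEO[1] / 1000
print(f"{videos} videos: {low_gb:g}-{high_gb:g} GB, plus ~{INSTALL_GB} GB for the client")
```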

Which platforms does one-click publish support?

Currently Douyin, Kuaishou, Xiaohongshu, and WeChat Channels. Link your accounts under "Media" in the client. More platforms (Bilibili, Zhihu Video, Weibo, etc.) are on the roadmap.

How is this different from other AI short-video tools?

Three things: 1) Truly end-to-end: 7 connected steps, not isolated tools stitched together; 2) Unlimited digital-human drive: upload yourself once, drive unlimited videos with no per-video license; 3) Compliance base: Xinghan Cloud holds three value-added telecom licenses (IDC / CDN / ISP), with auditable data handling.