Short-Video Agent · Desktop client

Two modes · 7 AI steps · 4-platform publish

Drop a link or fill in a profile,
and AI ships the short video

Name: Short-Video Agent
Brand: Xinghan Cloud (星瀚云)

Script · voice clone · digital human · titles & tags · subtitles & music · cover · one-click publish — AI handles all 7 steps. You just do Step 1.


✓ Two creation modes: benchmark video learning + profile-based generation
✓ Unlimited digital-human drive: upload yourself once
✓ One-click publish to 4 major platforms
✓ Native macOS and Windows builds


Step 1 · Pick one mode

Benchmark video

Fill profile

Industry & persona: Restaurant · Bin · 10 yrs experience

Featured product: Spicy hot pot · value tier

Selling point: Regular 99 · today 59

AI · 7 automated steps

1 Research + script
2 Voice clone
3 Digital human
4 Titles + tags
5 Subtitles + music
6 Cover
7 Publish

~30 min later

Douyin

Kuaishou

Xiaohongshu

WeChat Channels

Published to 4 platforms

7 steps

Full automation

Product standard pipeline

~30 min

Per video

Reference example · varies by complexity

4 platforms

One-click publish

Douyin / Kuaishou / Xiaohongshu / WeChat Channels

12+

Industries supported

Restaurants / Renovation / Aesthetics / etc.

WHY US

Many tools claim AI short video — here is the difference

Three things that set us apart

Truly end-to-end AI

Not template stitching — the agent plans and executes 7 connected steps from script through voice, digital human, titles, subtitles, cover, to publish. In Step 1 you pick a creation mode and fill a profile; the remaining 6 steps run untouched.

Unlimited digital-human drive

Upload a 10-30 second clip of yourself once. The AI learns your face and lip pattern, then drives unlimited new videos — no per-video license, no need to re-record.

Compliance base you can trust

Xinghan Cloud holds three telecom value-added licenses (IDC / CDN / ISP). AI-generated content follows model-platform and regulatory norms; the review trail is preserved and auditable.

Seven Steps

AI runs all 7 steps · You do step 1

The full pipeline, automated

Type your IP info — the agent handles the rest.

Research + script

AI learns from your benchmark content (video link or profile), analyzes viral patterns on each platform, and writes a high-converting voice-over script.

Voice clone

Driven by NodeKey TTS: pick from 50+ voices or clone your own, with natural delivery and intonation.

Digital human

Upload yourself once (10-30s). The AI drives lip-sync and likeness — no re-recording required.

Titles + tags

Auto-generates video titles, platform tags, and SEO keywords — improving algorithmic reach.

Subtitles + music

AI transcribes audio into smart subtitles and matches background music to the video rhythm — no post-production tools needed.

Cover design

AI generates click-optimized cover images that fit each platform's thumbnail spec.

Publish

One click to Douyin, Kuaishou, Xiaohongshu, and WeChat Channels — no duplicate work.

Comparison · Efficiency

Traditional vs AI Agent

Not a tool upgrade — a productivity leap

Step	Traditional · Humans + pro tools	AI Agent · 7 steps automated
Topic & script	2-4 hours / video	~30 seconds
Filming	1-2 hours / video	Digital human 3-5 min
Editing	2-4 hours / video	Auto
Subtitles & title	30-60 minutes	Auto
Multi-platform publish	~30 minutes	~5 minutes · 4 platforms
Total per video	6-12 hours	~30 minutes
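
As a quick check on the traditional total, the per-step ranges in the table can be summed. This is only an arithmetic sketch: the step names and durations are copied from the table, while the dictionary layout and the `total_range` helper are illustrative assumptions, not part of the product.

```python
# Traditional per-video time ranges from the comparison table, in hours (low, high).
traditional_hours = {
    "Topic & script": (2.0, 4.0),
    "Filming": (1.0, 2.0),
    "Editing": (2.0, 4.0),
    "Subtitles & title": (0.5, 1.0),       # 30-60 minutes
    "Multi-platform publish": (0.5, 0.5),  # ~30 minutes
}


def total_range(steps):
    """Sum the low ends and the high ends of each (low, high) range."""
    low = sum(lo for lo, _ in steps.values())
    high = sum(hi for _, hi in steps.values())
    return low, high


low, high = total_range(traditional_hours)
print(f"Traditional total: {low:g}-{high:g} hours per video")
```

The sum comes to 6 to 11.5 hours, which the table rounds to 6-12 hours.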

Reference example · Actual time depends on content complexity and machine configuration. We make no specific efficiency guarantees.

Get Started

3 steps · First video in 30 minutes

From download to first video

01 ⏱ ~5 minutes

Install the client

▸ Open the Download section · pick macOS (Apple Silicon / Intel) or Windows
▸ Double-click the installer · complete first-time setup
▸ Grant permissions on first launch

💡 If macOS blocks the app as unverified, open System Settings → Privacy & Security (System Preferences → Security & Privacy on macOS 12) and click "Open Anyway".

02 ⏱ ~3 minutes

Configure NodeKey

▸ Settings → API Keys
▸ Provider: NodeKey · paste API Key · Save

💡 One key powers everything — text, voice, ASR, digital human.

03 ⏱ ~20 minutes

Likeness · Mode · Ship

▸ "Likeness" → upload a 10-30 second video of yourself (facing forward · lips clearly visible)
▸ Pick creation mode: "Benchmark video learning" or "Profile-based generation"
▸ Fill in industry / persona / product / selling point → "Generate" → AI runs the 7 steps → auto-publish

💡 Run a full cycle once to learn the flow before scaling.

You enter the IP info in Step 1; the remaining 6 steps run on AI.

Core Capabilities

Three engines under the hood

What powers the 7 steps

The 7 steps are backed by three dedicated engines that together support the full automation.

01 End-to-end automation

Smart IP Engine

Whole-pipeline video creation

Pick "Benchmark video learning" or "Profile-based generation" — the AI writes the script, drives the digital human, and ships the video.

2 creation modes: benchmark video learning + profile-based generation
Auto-analyzes platform trends in your industry
Unlimited likeness drive — upload once, drive many
After Step 1, the next 6 steps run on autopilot

02 TTS engine

Voice Synthesis

50+ voices · multilingual · natural emotion

Driven by NodeKey TTS. 50+ professional voices, with dialect and multilingual support. Voice cloning is available.

50+ professional voice library
Dialect + multilingual support
Voice cloning · natural emotion
Lip-sync alignment with digital human

03 ASR engine

Speech Recognition

Audio in · text / subtitle out

Upload video or audio for automatic transcription, then export subtitles as SRT or VTT. NodeKey ASR provides precise timing alignment.

Batch transcription of audio / video
Precise subtitle timing
SRT / VTT / plain text export
Chinese + multilingual support

Scenarios

12+ industries · Local businesses ship videos with AI

Works for every local business

If your customers need short-video traffic, the agent fits.

Restaurants

Auto dealerships

Home renovation

Healthcare

Training & education

Beauty & cosmetics

Fitness & sports

E-commerce

Real estate

Weddings

Consumer electronics

Agriculture & fresh produce

Any vertical that needs short-video lead generation.

✨ Install and ship the first video in 30 minutes

Download

Download now,
let AI ship your videos

Native macOS (Apple Silicon + Intel) and Windows builds. Client is free; runtime billed via NodeKey token usage.

arm64

macOS (Apple Silicon)

Requires: macOS 12.0 or later

4.1.5 Download

x64

macOS (Intel)

Requires: macOS 12.0 or later

4.1.5 Download

x64

Windows

Requires: Windows 10 or later

4.1.5 Download

By downloading you accept the Terms of Service · Install footprint ~2 GB · Talk to us about your scenario

FAQ

You might be wondering

Frequently asked

Is the client free? Does running it cost?

The client is free to download and install. Runtime is billed by actual token usage — AI copy / voice / digital human all run through the Xinghan Cloud Intelligent Compute Platform, settled in one NodeKey account. See the pricing center for details; most light workloads stay well under a typical monthly budget.

How realistic is the digital human?

Quality depends on the source clip you upload. We recommend a face-forward 10-30 second video with clear lip movement and even lighting. Under those conditions, lip-sync is well-aligned and the output reads naturally for everyday short-video use. For formal commercial campaigns, pair with a human review pass.

Will the videos get throttled by platforms?

Content the agent generates follows model-platform and regulatory norms with an auditable review trail. But platform throttling policies (especially around AI-generated content labels) change constantly. We recommend: 1) keep the content truthful, 2) add human review on the final cut, 3) avoid overstatement. We track platform rule changes and adjust the generation strategy accordingly.

What hardware do I need?

macOS 12+ on Apple Silicon or Intel; Windows 10+ 64-bit. 16 GB RAM recommended. Install footprint ~2 GB; each generated video is around 10-20 MB, so local storage needs are minimal.

Which platforms does one-click publish support?

Currently Douyin, Kuaishou, Xiaohongshu, and WeChat Channels. Link your accounts under "Media" in the client. More platforms (Bilibili, Zhihu Video, Weibo, etc.) are on the roadmap.

How is this different from other AI short-video tools?

Three things: 1) Truly end-to-end — 7 connected steps, not isolated tools stitched together; 2) Unlimited digital-human drive — upload yourself once, drive unlimited videos with no per-video license; 3) Compliance base — Xinghan Cloud holds three telecom value-added licenses (IDC / CDN / ISP), with auditable data handling.
