The Problem: iPhone Videos Silently Fail
I tried to set my Telegram video avatar from an iPhone clip and it just didn't work. No error message. The upload completed and my avatar stayed the same.
After digging into the Telegram Bot API spec and a few hours of trial and error with ffprobe, I figured out why. iPhones record video as HEVC (H.265) by default since iOS 11, and Telegram's video avatar endpoint only accepts H.264 in an MP4 container. When you upload HEVC, Telegram drops it on the floor without telling you.
I built @liveavabot to fix this for myself, then opened it up. This post walks through the spec, the ffmpeg pipeline, and the aiogram 3 handler that ties it together.
What Telegram Actually Wants
The Bot API docs cover the basics, but the video avatar requirements are scattered across several pages. Here's what I pieced together from setProfilePhoto with the video parameter:
- Container: MP4
- Video codec: H.264 (AVC), profile baseline or main, yuv420p pixel format
- Resolution: exactly 800x800 (square)
- Duration: 10 seconds or less
- File size: 2 MB or less
- Audio: no audio stream
- Frame rate: 30 fps or lower
- Moov atom at the front (faststart)
The faststart bit matters more than people realize. If the moov atom sits at the end of the file (the ffmpeg default), Telegram's streaming preview can't seek and the avatar sometimes fails to render even when everything else is correct.
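One way to confirm where the moov atom ended up is to walk the top-level MP4 boxes and see whether moov comes before mdat. This is a minimal sketch, not necessarily what the bot does: the names top_level_boxes and is_faststart are my own, and it only handles the plain 32-bit and 64-bit box size forms.

```python
import struct
from typing import BinaryIO, Iterator


def top_level_boxes(fh: BinaryIO) -> Iterator[str]:
    """Yield the four-character type of each top-level MP4 box, in file order."""
    while True:
        header = fh.read(8)
        if len(header) < 8:
            return
        size, box_type = struct.unpack(">I4s", header)
        header_len = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            ext = fh.read(8)
            if len(ext) < 8:
                return
            size = struct.unpack(">Q", ext)[0]
            header_len = 16
        yield box_type.decode("latin-1")
        if size == 0:
            # A size of 0 means the box runs to the end of the file.
            return
        if size < header_len:
            return  # malformed box; stop rather than loop forever
        # Skip the payload and land on the next box header.
        fh.seek(size - header_len, 1)


def is_faststart(fh: BinaryIO) -> bool:
    """Return True if moov appears before mdat among the top-level boxes."""
    for box in top_level_boxes(fh):
        if box == "moov":
            return True
        if box == "mdat":
            return False
    return False
```

Usage would be something like opening output.mp4 in "rb" mode and passing the handle to is_faststart. It reads only box headers, so it stays cheap even on large files.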
The 800x800 square is also strict. I tried 720x720 and it was rejected, tried 1024x1024 and it was rejected. Telegram wants exactly 800x800.
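To check a file against the list above without uploading it, you can parse ffprobe's JSON output. The sketch below assumes the JSON came from `ffprobe -v error -show_format -show_streams -of json file.mp4`; check_avatar_spec is a hypothetical helper name, and the limits are simply the ones listed in this post, not official constants.

```python
import json
from fractions import Fraction

# Limits as listed in this post (assumptions, not official constants).
MAX_BYTES = 2 * 1024 * 1024
MAX_SECONDS = 10.0
MAX_FPS = 30
SIDE = 800


def check_avatar_spec(probe_json: str) -> list:
    """Return a list of human-readable problems; an empty list means it looks fine."""
    info = json.loads(probe_json)
    streams = info.get("streams", [])
    fmt = info.get("format", {})
    problems = []

    video = next((s for s in streams if s.get("codec_type") == "video"), None)
    if video is None:
        return ["no video stream"]

    if video.get("codec_name") != "h264":
        problems.append(f"codec is {video.get('codec_name')}, expected h264")
    if (video.get("width"), video.get("height")) != (SIDE, SIDE):
        problems.append(f"resolution is {video.get('width')}x{video.get('height')}")
    if video.get("pix_fmt") != "yuv420p":
        problems.append(f"pixel format is {video.get('pix_fmt')}")

    # avg_frame_rate is a fraction string such as "30/1"; "0/0" means unknown.
    rate = video.get("avg_frame_rate", "0/0")
    num, _, den = rate.partition("/")
    if den not in ("", "0") and Fraction(int(num), int(den)) > MAX_FPS:
        problems.append(f"frame rate is {rate}")

    if any(s.get("codec_type") == "audio" for s in streams):
        problems.append("has an audio stream")
    if float(fmt.get("duration", 0)) > MAX_SECONDS:
        problems.append("longer than 10 seconds")
    if int(fmt.get("size", 0)) > MAX_BYTES:
        problems.append("larger than 2 MB")
    return problems
```

Running it on a raw iPhone clip should flag the codec (hevc), the resolution, and the audio stream in one pass, which is quicker than guessing from a silent upload failure.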
The FFmpeg Pipeline
The conversion has three stages: detect a useful crop region, scale to 800x800, then re-encode to H.264 with the right pixel format and container flags.
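First, cropdetect to find the active picture area. Phone video often has bars, letterboxing, or black dead space around the center. Note that cropdetect looks for black borders rather than a subject, so the region it reports is not guaranteed to be square.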
ffmpeg -i input.mov -vf cropdetect=24:16:0 -f null - 2>&1 \
| grep -oP 'crop=\K[^ ]+' \
| tail -1
That outputs something like 1080:1080:0:420, which is width:height:x:y. I parse that in Python and feed it back into the actual encode.
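The parsing step can be sketched like this. It takes ffmpeg's stderr text, keeps the last crop= value (the same thing the grep and tail pipeline does), and returns integers. The helper names parse_crop and square_crop are mine; square_crop is an extra assumption on top of the post, for the case where the detected region isn't already square.

```python
import re
from typing import Optional, Tuple

CROP_RE = re.compile(r"crop=(\d+):(\d+):(\d+):(\d+)")

Crop = Tuple[int, int, int, int]  # width, height, x, y


def parse_crop(stderr_text: str) -> Optional[Crop]:
    """Return the last crop=w:h:x:y reported by cropdetect, or None if absent."""
    matches = CROP_RE.findall(stderr_text)
    if not matches:
        return None
    w, h, x, y = (int(v) for v in matches[-1])
    return w, h, x, y


def square_crop(w: int, h: int, x: int, y: int) -> Crop:
    """Shrink a detected region to a centered square with an even side length."""
    side = min(w, h)
    side -= side % 2  # yuv420p needs even dimensions
    return side, side, x + (w - side) // 2, y + (h - side) // 2
```

Taking the last match matters because cropdetect refines its estimate as it sees more frames, so early lines can be tighter than the final one.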
Then the encode itself:
ffmpeg -i input.mov \
-t 10 \
-vf "crop=1080:1080:0:420,scale=800:800:flags=lanczos,fps=30,format=yuv420p" \
-c:v libx264 \
-profile:v main \
-preset medium \
-crf 23 \
-an \
-movflags +faststart \
-y output.mp4
Breaking that down:
- -t 10 caps duration at 10 seconds. Cheaper than re-checking with ffprobe.
- -vf chain: crop to the detected square, scale to 800x800 with lanczos (sharper than bicubic for downscales), force 30 fps, force yuv420p.
- -c:v libx264 -profile:v main is the codec choice. Main profile plays everywhere and stays under the size budget.
- -crf 23 is the quality knob. 18 is too big, 28 is too soft, 23 is the sweet spot for 10 seconds at 800x800.
- -an strips audio. Telegram rejects the file if there's an audio track.
- -movflags +faststart moves the moov atom to the front of the file.
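In Python, the same command can be built as an argument list so nothing passes through a shell. This is a sketch of how that might look; build_encode_args is a hypothetical name, and the flags are copied from the command above.

```python
from pathlib import Path
from typing import List, Tuple


def build_encode_args(
    src: Path,
    dst: Path,
    crop: Tuple[int, int, int, int],
    crf: int = 23,
    max_seconds: int = 10,
) -> List[str]:
    """Build the ffmpeg argument list for the avatar encode described above."""
    w, h, x, y = crop
    # Same filter chain as the shell command: crop, scale, fps, pixel format.
    vf = f"crop={w}:{h}:{x}:{y},scale=800:800:flags=lanczos,fps=30,format=yuv420p"
    return [
        "ffmpeg", "-i", str(src),
        "-t", str(max_seconds),
        "-vf", vf,
        "-c:v", "libx264", "-profile:v", "main",
        "-preset", "medium", "-crf", str(crf),
        "-an", "-movflags", "+faststart",
        "-y", str(dst),
    ]
```

Keeping this as a pure function makes it easy to unit-test the flags without ffmpeg installed, and a list of arguments sidesteps quoting problems with odd filenames.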
If the output is still over 2 MB (it sometimes is for very fast-motion clips), I rerun with -crf 26. Two passes is rare but worth handling.
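The retry can be written as a small loop over CRF values. In this sketch, encode_once is a hypothetical callable that runs one encode at the given CRF and returns the output size in bytes; the (23, 26) ladder matches the two passes described above, and the function name is my own.

```python
from typing import Callable, Iterable, Optional

MAX_BYTES = 2 * 1024 * 1024  # size limit as listed in this post


def encode_under_limit(
    encode_once: Callable[[int], int],
    crf_steps: Iterable[int] = (23, 26),
    max_bytes: int = MAX_BYTES,
) -> Optional[int]:
    """Try each CRF in order; return the first one whose output fits, else None."""
    for crf in crf_steps:
        size = encode_once(crf)
        if size <= max_bytes:
            return crf
    return None
```

Returning None when every step is still too large lets the caller tell the user the clip can't be squeezed under the limit, instead of sending a file Telegram would ignore.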
Wiring It Into Aiogram 3
The handler is short. Aiogram 3 hands you a Message that carries the file; you download it, run ffmpeg, and send the result back. The interesting part is handling video, animation (which is what GIFs become in Telegram), and video_note all through one entry point.
from pathlib import Path
import tempfile

from aiogram import Router, F
from aiogram.types import Message, FSInputFile

router = Router()


@router.message(F.video | F.animation | F.video_note)
async def handle_video(message: Message) -> None:
    # Videos, GIFs (animations), and round video notes share one entry point.
    file = message.video or message.animation or message.video_note
    # Note: the hosted Bot API only lets bots download files up to 20 MB.
    if file.file_size and file.file_size > 50 * 1024 * 1024:
        await message.reply("File too big. Max 50 MB.")
        return

    status = await message.reply("Converting...")
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "in.mov"
        dst = Path(tmp) / "out.mp4"
        file_obj = await message.bot.get_file(file.file_id)
        await message.bot.download_file(file_obj.file_path, destination=src)

        crop = await detect_crop(src)
        await encode(src, dst, crop=crop, crf=23)
        # Check existence before stat() so a failed encode can't raise here.
        if dst.exists() and dst.stat().st_size > 2 * 1024 * 1024:
            await encode(src, dst, crop=crop, crf=26)
        if not dst.exists():
            await status.edit_text("Conversion failed. Try a shorter clip.")
            return

        # Send while the temp directory (and the output file) still exists.
        await message.reply_video(
            video=FSInputFile(dst),
            caption="Long-press the file, then Set as Video Avatar.",
        )
    await status.delete()
detect_crop and encode are thin wrappers around asyncio.create_subprocess_exec that call ffmpeg with the args from the previous section. Just argument plumbing.
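For reference, a wrapper of that kind might look like the sketch below. run_command is a hypothetical name, and the timeout and stdin handling are my assumptions rather than details from the bot. It returns stderr because that is where ffmpeg writes its log, including the cropdetect lines.

```python
import asyncio
from typing import List, Tuple


async def run_command(args: List[str], timeout: float = 60.0) -> Tuple[int, str]:
    """Run a command without blocking the event loop; return (exit code, stderr)."""
    proc = await asyncio.create_subprocess_exec(
        *args,
        stdin=asyncio.subprocess.DEVNULL,   # keep the child from waiting on input
        stdout=asyncio.subprocess.DEVNULL,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        _, stderr = await asyncio.wait_for(proc.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        # Don't leave a stuck encoder running after we give up on it.
        proc.kill()
        await proc.wait()
        raise
    return proc.returncode, stderr.decode(errors="replace")
```

detect_crop would call this with the cropdetect arguments and parse the returned text, and encode would call it with the encode arguments and check the exit code.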
The non-obvious thing here: I use reply_video and let the user manually set the avatar, rather than calling setProfilePhoto on their behalf. Bots can't set a user's avatar through the API, only the user can do that from the Telegram client. So the bot delivers a file the user can long-press and apply.
Shipping This as @liveavabot
I packaged this up as https://t.me/LiveAvaBot?start=devto_article_20260528. Send it any video or GIF, get back an 800x800 H.264 clip that Telegram accepts. About 85 people are using it now, mostly from sharing in maker forums.
It runs on a small VPS with ffmpeg compiled with libx264, aiogram 3, and a sqlite db for usage stats. No queue, just async subprocess calls, because conversions take 2 to 4 seconds and the bot is small enough that worker concurrency isn't needed yet.
If a video has non-square pixel aspect ratio (some Android phones do this), the cropdetect output needs normalization. I handle that by running setsar=1 in the filter chain before crop. One-line fix once you know to look for it.
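A sketch of deciding when that normalization is needed, based on ffprobe's sample_aspect_ratio field. One caveat worth flagging: as far as I understand it, setsar on its own only rewrites the aspect-ratio flag and does not resample pixels, so this variant also rescales the width with scale=iw*sar:ih first. That is a common idiom, not necessarily what the bot runs, and the helper names are hypothetical.

```python
from fractions import Fraction


def parse_sar(sar: str) -> Fraction:
    """Parse a sample_aspect_ratio string such as '4:3'; treat unknown as 1."""
    num, _, den = sar.partition(":")
    if not num.isdigit() or not den.isdigit() or int(num) == 0 or int(den) == 0:
        # Covers 'N/A' and the '0:1' value ffprobe reports for unknown SAR.
        return Fraction(1)
    return Fraction(int(num), int(den))


def square_pixel_prefix(sar: str) -> str:
    """Return filters to prepend before crop so later steps see square pixels."""
    if parse_sar(sar) == 1:
        return ""
    # scale resamples the width to its display size; setsar then marks the
    # result as square-pixel so the crop coordinates match what viewers see.
    return "scale=iw*sar:ih,setsar=1,"
```

The returned prefix ends with a comma so it can be concatenated directly in front of the crop=... part of the filter chain. The same prefix would need to go in front of cropdetect too, so both passes agree on geometry.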
Edge Cases and What's Next
Things that surprised me:
iPhone slow-motion videos report a frame rate of 240 fps, but actual playback is 30 fps with timestamps stretched. ffmpeg handles this correctly if you let it. Don't force -r 30 on the input, only on the output via the fps=30 filter.
Videos shot in portrait often have rotation metadata instead of actually rotated pixels. ffmpeg respects the rotation flag by default in recent builds, but older versions don't. I added an explicit transpose based on the rotation metadata to be safe.
Telegram's 10-second cap is hard. There's no way to ask for an exception. I tried.
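The explicit transpose for rotated portrait clips could be mapped like this. It is a sketch under two assumptions: the input is the clockwise display rotation in degrees (the legacy rotate tag; the display-matrix rotation field in newer ffprobe output uses the opposite sign, so negate it first), and ffmpeg is run with -noautorotate so the rotation isn't applied twice. rotation_filter is a hypothetical name.

```python
def rotation_filter(rotate_degrees: int) -> str:
    """Map a clockwise display rotation to an ffmpeg filter string ('' if none)."""
    angle = rotate_degrees % 360  # normalizes negatives, e.g. -90 becomes 270
    return {
        0: "",
        90: "transpose=clock",
        180: "hflip,vflip",
        270: "transpose=cclock",
    }.get(angle, "")  # angles that aren't multiples of 90 are left untouched
```

The result goes at the very front of the filter chain, before crop, so that the crop coordinates refer to the upright frame.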
Next on the list: handle 4K input properly (it works now but is slow), add a /url command so people can paste YouTube links instead of uploading files, and maybe a Mastodon mirror so non-Telegram users can convert files too.
Built by me, https://t.me/LiveAvaBot?start=devto_article_20260528. Code is on my todo list to open source once the secret handling is cleaned up.