Interactive AI-Powered Robot

Real-time voice + expressive motion on an open-source, low-cost chassis (ESP32 + Qwen API).

Context: NYU Shanghai · IMA
Focus: HCI · Product Design · Prototyping
Stack: ESP32, Qwen API, Servos, 3D-printed parts
Year: 2025

Case

Move beyond pre-recorded toys by enabling unscripted speech and expressive motion. The robot listens, speaks, and gestures—making interactions feel alive.

Task

Design the behavior model and motion language; implement the voice pipeline (capture → Qwen API → TTS) on ESP32; integrate servos and a 3D-printed body; iterate through user tests.
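The capture → Qwen API → TTS loop can be sketched as a small state machine. This is a hedged illustration in portable C++, not the project's firmware: `capture`, `queryQwen`, and `speak` are hypothetical stand-ins for the ESP32's I2S microphone driver, the HTTPS call to the Qwen endpoint, and the TTS/speaker path.

```cpp
#include <functional>
#include <string>

// Pipeline states: the robot cycles Listening -> Thinking -> Speaking.
enum class State { Listening, Thinking, Speaking };

struct VoicePipeline {
    State state = State::Listening;
    std::string lastUtterance, lastReply;

    // Stand-ins for the real stages (I2S capture, Qwen HTTP call, TTS output).
    std::function<std::string()> capture;
    std::function<std::string(const std::string&)> queryQwen;
    std::function<void(const std::string&)> speak;

    // Advance one step; returns the state after the transition.
    State step() {
        switch (state) {
            case State::Listening:
                lastUtterance = capture();
                state = State::Thinking;
                break;
            case State::Thinking:
                lastReply = queryQwen(lastUtterance);
                state = State::Speaking;
                break;
            case State::Speaking:
                speak(lastReply);
                state = State::Listening;  // ready for the next turn
                break;
        }
        return state;
    }
};
```

In the real firmware the three callbacks would wrap the audio driver and network stack; keeping them injectable makes the turn-taking logic testable off-device.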

Abstract & Project Description

The Interactive AI Robot is a smart plush toy enhanced with real-time voice interaction and basic motion capabilities. Unlike traditional toys that rely on pre-programmed responses, this robot utilizes an ESP32 microcontroller combined with the Qwen API to perform dynamic speech generation and real-time audio processing. It aims to simulate natural human-robot interaction by providing unscripted verbal replies and expressive physical feedback.

This project explores the potential of combining natural language processing with physical motion to deliver a more engaging and emotional user experience. By transforming passive responses into active conversations and motions, the robot bridges the gap between artificial intelligence and emotional companionship in toys.

Robot render on blue background

Background, Industry Context & Market Potential

The rise of AI-integrated consumer products—particularly in the toy industry—reflects a larger shift toward immersive, interactive experiences. Open Securities projects the AI-powered plush toy market in China to reach 32.3B RMB by 2027 (neutral scenario).

Globally, the companion robot market was about 75B RMB in 2023 and is expected to grow to 304.3B RMB by 2029 (CAGR 25.6%). This highlights both economic viability and the growing social demand for emotionally intelligent, responsive robotic companions.

As AI becomes more accessible, toys can increasingly support communication, learning, and emotional development—especially for children and older adults who benefit from frequent engagement.

Companion robot market size forecast (2023A→2029E, 25.6% CAGR)
Companion robot market growth, 2023A vs 2029E (Open Securities / 贝哲斯咨询)

Similar Projects: Market Comparison

显眼包 ("Xianyanbao", ByteDance) — a trendy AI-themed wearable emphasizing emotional interaction via a facial display and limited audio; not commercially available, and it lacks motion and active speech generation. Static emoji-based responses only.

Talking Tom (Plush) — ≈ €19.99 (≈ ¥158). Mimics voice and plays pre-recorded phrases; no smart AI processing or physical interaction.

Interactive AI Robot (this project) — estimated ¥200–¥300; combines live voice recognition, AI-generated speech, and basic motion for a more responsive, lifelike companion.

Similar projects comparison

Design Concept & Motion System

The mechanical design builds on open-source contributions (OTTO Robot platform), offering a flexible, affordable base for humanoid-style motion.

Servo motors (ESP32 control) synchronize gestures—nod, wave, head-turn—with speech events generated through the Qwen API (capture → NLU → TTS). The result is a dual-modality interaction—voice + motion—that pushes beyond static AI toys.
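The speech-to-gesture synchronization described above amounts to a lookup from speech-event tags to short servo keyframe sequences, swept smoothly between poses. A minimal portable-C++ sketch, where the servo channels, angles, and the `gestureFor` helper are hypothetical illustrations rather than the project's actual motion tables:

```cpp
#include <map>
#include <string>
#include <vector>

// A keyframe: target angles (degrees) for head and arm servos (hypothetical channels).
struct Keyframe { int head; int arm; };

// Hypothetical mapping from speech-event tags (emitted alongside the Qwen
// reply) to servo keyframe sequences. 90 degrees = neutral pose.
std::map<std::string, std::vector<Keyframe>> gestureTable = {
    {"greeting",    {{90, 90}, {90, 150}, {90, 90}}},   // wave: arm up and back
    {"affirmation", {{90, 90}, {60, 90},  {90, 90}}},   // nod: head down and back
    {"curiosity",   {{90, 90}, {120, 90}, {90, 90}}},   // head-turn to one side
};

// Linear interpolation between two keyframes, t in [0, 1], so each
// servo sweeps smoothly instead of jumping between poses.
Keyframe lerp(const Keyframe& a, const Keyframe& b, float t) {
    return { a.head + static_cast<int>((b.head - a.head) * t),
             a.arm  + static_cast<int>((b.arm  - a.arm)  * t) };
}

// Look up the gesture for an event tag; unknown tags fall back to neutral.
std::vector<Keyframe> gestureFor(const std::string& tag) {
    auto it = gestureTable.find(tag);
    if (it != gestureTable.end()) return it->second;
    return {{90, 90}};
}
```

On the ESP32 the interpolated angles would be written to the servo PWM channels on a timer, so a gesture plays out over the duration of the spoken reply.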

AI-driven motion control flow

Credits & Acknowledgments

Thanks to the OTTO DIY Robot community for open-source inspiration.

Tech stack: ESP32, Qwen API, servo drivers, and 3D-printed casing.

Final look — robot render on white background