Interaction and Robotics Lab


A world-class shared research infrastructure for generative AI, human-machine interaction, and robotics — jointly operated by the Departments of Speech, Music & Hearing (TMH) and Robotics, Perception & Learning (RPL) at KTH.

25+
Faculty Members
~100
PhD Students
6
Lab Spaces
~25
Postdocs
Boston Dynamics Spot robot at KTH IRL, with Intelligence Augmentation Lab lounge in the background
Lindstedtsvägen 24, KTH Main Campus, Stockholm
🔬

Frontier Research

End-to-end experimental infrastructure for human-robot interaction, from motion-capture data collection to real-world deployment.

🎓

Education & Training

Hands-on environment for MSc and PhD students to work with professional-grade hardware, capture systems, and humanoid robots.

🤝

Industry Collaboration

Open to industrial partners for joint R&D. Past collaborators include Electrolux and Akademiska Hus. A proven high-profile venue.

🌐

Open Infrastructure

Shared booking across KTH schools and departments. CloudGripper opens remote robot access to researchers worldwide.

About KTH IRL

One of Europe's Densest
HRI Hubs

KTH IRL is a one-of-a-kind shared facility bringing together world-leading expertise in spoken dialogue, computer vision, robotics, and machine learning — all under one roof on the KTH main campus in Stockholm.

The lab is jointly operated by the Department of Speech, Music and Hearing (TMH) and the Department of Robotics, Perception and Learning (RPL) — two of KTH's most research-intensive units within the EECS School. Together they house over 25 faculty members, approximately 25 postdocs, and close to 100 PhD students, making KTH IRL one of Europe's densest concentrations of expertise in social robotics, spoken dialogue, computer vision, and robot learning.

Located at Lindstedtsvägen 24, KTH IRL provides purpose-built spaces for the full research cycle: from motion-capture data collection and multimodal interaction studies, through AI model development, to deployment and evaluation with real users in domestic-like settings.

The facility has hosted landmark events — including the announcement of Sweden's national AI commission by Prime Minister Ulf Kristersson in December 2023 — and has attracted major external funding from Promobilia, Vinnova, Digital Futures, WASP, Vetenskapsrådet, and the European Research Council.

Sweden's Prime Minister Ulf Kristersson visiting KTH IRL
December 2023 — Sweden's Prime Minister Ulf Kristersson visiting KTH IRL, where he announced the Swedish national AI commission. Demonstration of the Boston Dynamics Spot robot.
📍

Location

Lindstedtsvägen 24, KTH Main Campus, Stockholm. Ground floor, open layout with six dedicated research environments.

Infrastructure

GPU server clusters, three dedicated control rooms with cooling and 100 Gb/s wired networking, and professional AV capture throughout.

🤖

Robot Fleet

Humanoid robots (Furhat, Softbank Pepper, PAL ARI, Rainbow HRN-Y1, Unitree H1), Boston Dynamics Spot, dual-arm manipulators, UAVs, social robots.

🏛️

Joint Operation

Collaboratively managed by TMH and RPL, KTH EECS School. Shared booking open to all KTH researchers.

Infrastructure

Six Lab Spaces

KTH IRL hosts six purpose-built research environments — from an AI-enabled smart kitchen and a professional motion-capture studio, to an audience-research theatre, humanoid robot hall, aerial robotics workshop, and cloud-connected manipulation lab.

Floor Plan

KTH IRL — Lindstedtsvägen 24, KTH Campus, Stockholm

KTH IRL floor plan — six lab spaces at Lindstedtsvägen 24
Humanoid Robot Lab — humanoid and dual-arm robotic platforms at KTH IRL
Lab 01 · RPL

Humanoid Robot Lab

The primary development and testing environment for robotic manipulation and full-body autonomy research. The lab houses a growing fleet of humanoid robots and dual-arm platforms for dexterous manipulation, loco-manipulation, and physical human-robot collaboration studies.

Through the SAInt project (Promobilia), the fleet has been expanded with Rainbow Robotics HRN-Y1 and Unitree H1 humanoid robots. Over the SAInt project period it will grow to up to five full humanoid platforms — both legged and wheeled — enabling parallel research threads across perception, planning, and adaptive control.

The lab also maintains a collection of social robots and expressive robot heads used in dialogue and HRI research by the TMH department.

Rainbow Robotics HRN-Y1 Unitree H1 ABB dual-arm robots Multiple manipulators Social robots (Pepper, Furhat, NAO) Onboard GPU compute Force/torque sensors RGBD cameras
SAInt · Active Project · Promobilia · 5 Years
SAInt — Situated Agentic Intelligence
● Active

SAInt is the primary driver of the humanoid robot fleet expansion. The project develops robots that understand context, learn from interaction, and provide proactive physical and verbal assistance — targeting independent living for older adults and people with special needs. The Humanoid Robot Lab is where hardware development, manipulation learning, and physical collaboration research take place across WP2 and WP3.

Current fleet
Rainbow HRN-Y1 & Unitree H1
Expanding to 5 platforms during the project
Research focus
WP2 · WP3
Scene modelling, dexterous manipulation & adaptive control

Legged and wheeled humanoid platforms from leading vendors run in parallel, enabling simultaneous development across multimodal scene representation (Kragic, WP2) and safe compliant physical interaction (Jaquier, WP3).

GeoRob Lab · RPL · WASP · VR Starting Grant
Geometric Robot Learning Lab (GeoRob)
● Active

Led by Noémie Jaquier (WASP Assistant Professor, RPL), the GeoRob Lab develops data-efficient robot learning, optimization, and control algorithms with sound theoretical guarantees. Research treats differential geometry and physics as core inductive biases — enabling robots to generalise from fewer demonstrations and operate reliably under real-world constraints. The lab was recently awarded a Swedish Research Council Starting Grant (2025).

Geometric Learning
Operational space & symmetry
Deformable Objects
Riemannian geometry for manipulation
Physics-Informed
Guaranteed control & learning
Research threads: Geometric imitation learning · Vision-based manipulation · Bayesian optimisation on manifolds · Physics-informed control for dynamical systems
Team: Rodrigo Pérez Dattari (postdoc) · Seungyeon Kim (postdoc) · Loizos Hadjiloizou · Riccardo Morandi · Katharina Friedl · Federico Pavesi (visiting)
GeoRob Lab website ↗
Wallenberg Scholar · KAW · ~10-Year Grant · Started 2014
Robots Interacting with People — Danica Kragic Jensfelt
● Active

Funded by Knut och Alice Wallenbergs Stiftelse, this decade-long programme develops robotic systems that perceive their surroundings with human-like senses and learn through interaction with people and the environment. The goal is robots that can perform a wide range of tasks in homes and complex real-world settings — handling unpredictable, non-repetitive situations that are impossible to pre-program. Key advances include multimodal AI models that process sound, images, and force data simultaneously; learning from human demonstration and VR teleoperation; and manipulation of soft and deformable objects such as textiles, clothing, and groceries.

Robot in Danica Kragic's lab learning to transfer rice with a spoon
Robot gripper demonstrating dexterous manipulation
Learning from Demo
Physical guidance & VR teleoperation
Deformable Objects
Textiles, clothing & groceries
Home Assistance
Support for elderly & disabled
"Now we can train robots instead of programming robots, which makes it much easier to find solutions to environments where things change."
— Danica Kragic Jensfelt
Researcher: Danica Kragic Jensfelt (Professor of Computer Science, KTH)  ·  Funder: Knut och Alice Wallenbergs Stiftelse (KAW)
KAW project page ↗
VR Rådprofessor · Vetenskapsrådet · 2020–2030
Learning, Interactive Autonomous Systems — Danica Kragic
● Active

A ten-year Swedish Research Council Distinguished Professor grant developing new self-supervised and meta-learning methodologies with causal reasoning for perception, control, and reasoning in robotics. The project targets robots capable of complex interactions with both rigid and deformable objects and humans in unstructured, real-world environments — moving beyond carefully structured lab settings to handle the open challenges of limited data, unknown unknowns, and transfer across tasks.

Self-Supervised & Meta-Learning
Learning from small datasets with causal reasoning
Multisensory Interaction
Rigid & deformable objects, human collaboration
Transfer & Incremental Learning
Generalisation across tasks and environments
Causal Inference
Modelling unknown unknowns in robot planning
PI: Danica Kragic  ·  Team: Christian Pek · Ioanna Mitsionni · Gustaf Tegnér  ·  Funder: Vetenskapsrådet (VR) · Distinguished Professor Grant
KTH RPL project page ↗
BIRD · ERC Advanced Grant · H2020 · Completed 2025
BIRD — Bimanual Manipulation of Rigid and Deformable Objects
Completed

A five-year ERC Advanced Grant (€2.4M) addressing one of the core open challenges in robotics: enabling machines to interact with deformable objects as naturally as humans do. BIRD created new informative and compact representations of deformable objects that combine analytical and learning-based approaches, encoding geometric, topological, and physical properties of the robot, object, and environment. Research focused on multimodal, bimanual interaction tasks, combining theoretical methods with rigorous experimental evaluation to model skilled sensorimotor behaviour in dual-arm robot systems.

Bimanual Systems
Dual-arm sensorimotor modelling
Deformable Objects
Geometric & topological representations
Hybrid Methods
Analytical + learning-based approaches
Coordinator: KTH Royal Institute of Technology  ·  Duration: Sep 2020 – Aug 2025  ·  Funding: EU H2020 ERC Advanced Grant · €2,424,186
CORDIS project page ↗
Intelligence Augmentation Lab — smart kitchen overview
Lab 02 · TMH / RPL

Intelligence Augmentation Lab

The IA-Lab is a fully equipped, sensor-rich smart home kitchen built in collaboration with Akademiska Hus and Electrolux. Designed to resemble a real domestic environment, it bridges the gap between laboratory AI research and everyday life.

The space features a fully functioning Electrolux kitchen, an adjacent living room area (convertible to a warehouse layout for robot picking tasks), and a dedicated control room with GPU servers and professional video capture equipment — all connected on a 100 Gb/s wired network backbone.

Research focus areas include AI and robot-supported cooking, human cooking behaviour studies in partnership with Electrolux, and zero-waste cooking through AI assistance.

Electrolux smart appliances Control room with GPU servers, 100 Gb internet and 80 inch screen 4 wireless microphones Video capture suite (ceiling, walls, mounted) Two Meta Aria gaze glasses Speakers suite (ceiling, walls, smart) Two 50 inch embedded displays Ceiling projector on work bench
Active Projects PerCorSo (WASP NEST) — socially acceptable autonomous robots  ·  SAInt (Promobilia) — humanoid robot domestic assistance
SAInt · Active Project · Promobilia · 5 Years
SAInt — Situated Agentic Intelligence
● Active

The IA-Lab kitchen is the primary human-subject testing environment for SAInt. All five interaction scenarios — from passive observation to full robot-human collaboration — are designed around realistic cooking and domestic tasks performed here. The lab's sensor array (overhead cameras, Meta Aria glasses, embedded displays, smart appliances) provides the rich multimodal data streams needed for each work package.

WP1 — Interaction
Joakim Gustafson
Proactive dialogue & situation awareness
WP2 — Perception
Danica Kragic
Human action & environment modelling
WP3 — Control
Noémie Jaquier
Safe physical robot assistance

Five scenarios of increasing complexity run in this kitchen: Observer → Apprentice → Instructor → Teacher → Collaborator, building from passive task observation to full shared autonomy between human and robot.

PerCorSo · Active Project · WASP NEST · ABB & Saab
PerCorSo — Perceiving & Communicating Correct-by-design Socially Acceptable Autonomous Systems
● Active

PerCorSo designs the most appropriate ways for robots to behave in human-crowded environments. Its novelty lies in integrating spatial and social context understanding, multimodal communication, and autonomous motion strategies to advance real-world social robots. The project addresses trust in autonomous robots in two complementary senses: verifiability (provably safe systems) and social acceptability (perceived as safe and trustworthy by humans). The IA-Lab kitchen is central to the project's real-world validation — building from controlled lab scenarios toward deployment in environments such as elderly care, autonomous driving, and human-robot coworking.

PerCorSo researcher Ermanno Bartoli and PAL robot Ari.
① Context Modelling
Data-driven spatial & social context; implicit and explicit communication
② Safety Alignment
Bridging formal safety specs with human perception of safety
③ Decision-Making
Multimodal communication strategies balancing task & social acceptability
④ Real-World Validation
Lab-to-deployment scaling in the IA-Lab and beyond
Researcher Interviews — September 2024
PIs: Iolanda Leite · Jana Tumova · Joakim Gustafson · Patric Jensfelt  · 
Industrial Partners: ABB AB · Saab AB
WASP project page ↗
Previous Projects AAIS (Digital Futures), FoodTalk (Vinnova), Personalized Companion Robot (Digital Futures), AnalyTIC / PREDICON (RJ + VR + WASP)
AAIS · Previous Project · Dataset
KTH-ARIA-referential — Look and Tell
🤗 HuggingFace

A multimodal dataset for referential expression grounding collected in the IA-Lab kitchen. Participants wore Meta Aria smart glasses for first-person gaze-tracked video while a GoPro captured the exocentric view — synchronising eye tracking, speech, and spatial grounding during live cooking tasks.

125
recordings
25
participants
3.7h
total duration
5
recipes

Each recording pairs egocentric Aria video (30 fps) with exocentric GoPro footage, real-time gaze coordinates, 48 kHz audio, and word-level transcription via WhisperX. Designed for referential expression grounding, gaze–speech synchronisation, and embodied dialogue research. Created by Anna Deichler (KTH); presented at the NeurIPS 2025 SpaVLE Workshop.
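As an illustration of how such synchronised streams can be consumed, the sketch below pairs each transcribed word with the gaze sample nearest its onset time. This is not the released loader: the record layout and field names (`start`, `t`, `x`, `y`) are assumptions made for this illustration.

```python
# Illustrative sketch (not the dataset's actual loader): pair word-level
# transcript timestamps with gaze samples by nearest-in-time lookup.
# Field names ("start", "t", "x", "y") are assumed for illustration.
from bisect import bisect_left

def gaze_at_word_onsets(words, gaze):
    """For each word, return the gaze sample closest to its onset time.

    words: list of {"word": str, "start": float}  (seconds)
    gaze:  list of {"t": float, "x": float, "y": float}, sorted by t
    """
    times = [g["t"] for g in gaze]
    paired = []
    for w in words:
        i = bisect_left(times, w["start"])
        # pick the nearer neighbour of the insertion point
        if i > 0 and (i == len(times) or
                      w["start"] - times[i - 1] <= times[i] - w["start"]):
            i -= 1
        paired.append((w["word"], gaze[i]["x"], gaze[i]["y"]))
    return paired

words = [{"word": "pour", "start": 0.50}, {"word": "the", "start": 0.82}]
gaze = [{"t": 0.48, "x": 0.31, "y": 0.62}, {"t": 0.80, "x": 0.55, "y": 0.40}]
print(gaze_at_word_onsets(words, gaze))
# → [('pour', 0.31, 0.62), ('the', 0.55, 0.40)]
```

The same nearest-neighbour idea extends to aligning video frames or audio windows against the shared timeline.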

FoodTalk · Previous Project · Vinnova
FoodTalk — Proactive Cooking Companion for the Elderly
Completed
Senior participant cooking in the IA-Lab smart kitchen during FoodTalk WoZ experiment
Participant during FoodTalk cooking experiment in IA-Lab

A Vinnova-funded project (Swedish: Prata mat) developing a proactive conversational AI cooking assistant for elderly users. Wizard-of-Oz experiments in the IA-Lab smart kitchen engaged 6 senior participants (ages 63–66) in comparing two AI chef personas: an Instructional variant and a Chatty variant. The Chatty variant was perceived as more situationally aware and intelligent. The project addressed food waste reduction and independent living for older adults.

6
senior participants
2
AI chef conditions
WoZ
methodology
Team: Joakim Gustafson (KTH) · Morgan Fredriksson (Nagoon) · Katarina Estève, Timo Mashiyi-Veikkola (Electrolux)
Estève et al., "Towards a proactive cooking companion for the elderly" · Funded by Vinnova
Digital Futures Postdoc · 2022–2024 · Completed
Personalized Companion Robot for Open-Domain Dialogue in Long-Term Elderly Care
Completed

A Digital Futures postdoctoral fellowship (June 2022–June 2024) developing robots capable of lifelong personalised dialogue — learning and recalling a person’s attributes, preferences, and shared history over long time horizons. Research addressed how foundation models and LLMs can enable open-domain conversation that adapts to individual elderly users, supporting daily reminders, collaborative tasks, and social engagement.

Five design principles (co-design with 28 older adults)
💬 Contextual communication
🧠 Personalisation & memory
📋 Practical daily support
🤝 Social enhancement
🏆 Best paper award: “Recommendations for Designing Conversational Companion Robots with Older Adults through Foundation Models” — selected among 11 best papers out of 261 published in Frontiers in Robotics and AI (2024)
Researcher: Bahar Irfan (postdoc) · Supervisors: Gabriel Skantze, Sanna Kuoppamäki (KTH)
Digital Futures project page
AnalyTIC · RJ + VR + WASP · 2021–2025 · Completed
Modeling Turn-taking in Conversation
Completed

Deep learning models for predicting turn-taking in spoken interaction — identifying when speakers will yield or hold the floor. The project produced two key models: TurnGPT (language-model-based, trained on transcripts with speaker-shift markers) and Voice Activity Projection (VAP) (audio-based, trained on ~2,000 hours of telephone conversations, preserving acoustic cues such as intonation and pausing). HRI experiments with a Furhat social robot — a KTH spin-off co-founded by PI Gabriel Skantze, in business since 2014 — showed that the models yield fewer interruptions and faster response times in face-to-face dialogue. The research also showed that turn-taking cues are largely language-specific across three language families.
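To make the TurnGPT-style framing concrete, here is a minimal sketch of the input format it relies on: a dialogue serialised as one token stream with an explicit speaker-shift token, so a language model can score the probability of a shift after any word. The marker string and serialisation details are assumptions for illustration, not the project's actual code.

```python
# Illustrative sketch of a TurnGPT-style input format: dialogue becomes
# a flat token stream with an explicit speaker-shift marker. The "<ts>"
# string and this serialisation are assumed for illustration only.
TS = "<ts>"  # speaker-shift token

def serialise(turns):
    """turns: list of (speaker, utterance) tuples -> flat token list."""
    tokens = []
    for _, utterance in turns:
        tokens.extend(utterance.lower().split())
        tokens.append(TS)  # a shift occurs after each completed turn
    return tokens

dialogue = [("A", "are you coming tonight"), ("B", "yes I am")]
print(serialise(dialogue))
# → ['are', 'you', 'coming', 'tonight', '<ts>', 'yes', 'i', 'am', '<ts>']
```

A model trained on such streams can then evaluate P(`<ts>` | context) after every word; a high probability marks a likely turn-completion point.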

🏆 Best Paper — SIGDIAL 2022 🏆 Best Paper — HRI 2025 Open source
PI: Gabriel Skantze · Team: Erik Ekstedt, Haotian Qi · Funders: RJ (AnalyTIC) · VR (PREDICON) · WASP
KTH TMH project page
PMIL motion capture studio — 25+ Optitrack cameras
Lab 03 · TMH / RPL

Performance & Multimodal
Interaction Lab

PMIL is a professional full-body motion capture studio and the primary facility for capturing high-fidelity human movement, gesture, dance, and multimodal interaction data for AI training and HRI research.

The studio features 25+ Optitrack infrared tracking cameras mounted on the ceiling and walls, providing sub-millimetre accuracy across the full capture volume. A Peel Capture system and Tentacle timecode devices synchronize all recording modalities into a unified data stream.

PMIL is collaboratively managed — there is no dedicated technician, so booking users share responsibility for following the usage guidelines. The space connects to the adjacent IA-Lab lounge for larger recording sessions.

PMIL has a foldable wall that, when opened, expands the space to host large AI workshops and press conferences.

25+ Optitrack IR cameras Three Optitrack Colour cameras Ten Luxonis Oak 1 cameras Three Tobii gaze headsets Two Manus glove pairs Meta Quest VR headsets Peel Capture system Tentacle timecode Multichannel speaker system 100 inch TV screen 150 inch portable projector screen Wireless audio recording MoCap suits & markers Motive software
Booking & Access Request access at kth.se/form/pmil-request-access. Book via KTH Outlook calendar — room name: eecs_lv24_pmil. Maximum 4 consecutive days per booking. Master and bachelor students require a supervisor to hold the booking.
Highlights: December 2023 — Sweden's Prime Minister Ulf Kristersson visited KTH IRL and announced the Swedish national AI commission at a press conference in PMIL. February 2024 — AI and Robotics workshop with 80+ personally invited attendees.
Active Projects — BodyTalk & SignBot (Vetenskapsrådet · WASP, 2024–2027)
BodyTalk · Active Project · Vetenskapsrådet · 2024–2027
BodyTalk — Integrated Modeling of Speech, Gesture & Face
● Active

BodyTalk develops unified models that synthesize speech, facial expressions, and gestures simultaneously from text, generating spontaneous and non-repetitive conversational behaviors for virtual characters and social robots. The project tackles two core challenges: (1) joint multimodal synthesis maintaining congruence across all modalities, and (2) high-level style control (engagement, agitation) that consistently shapes all output channels.

Joint Modelling
Speech · gesture · face from text
Style Control
Engagement & agitation levels
Applications
VR · gaming · social robots
Team: Simon Alexanderson · Éva Székely · Jonas Beskow · Gustav Henter
Project page at KTH TMH ↗
SignBot · Active Project · WASP / Vetenskapsrådet · 2024–2027
SignBot — Generative AI for Sign Language
● Active

SignBot employs state-of-the-art generative AI to create high-quality sign language animations and language processing systems. Sign languages — used by over 70 million people globally — have unique visuo-spatial and highly parallel structure that challenges conventional NLP methods. The project uses PMIL's motion capture infrastructure to record and model sign language motion, training neural synthesis systems capable of end-to-end sign language generation to improve accessibility and inclusion for deaf communities.

Neural Motion Synthesis
MoCap-trained sign generation models
Accessibility Impact
70M+ sign language users worldwide
Team: Jonas Beskow (PI) · Anna Klezovich · Fredrik Malmberg · Simon Alexanderson · Johanna Mesch (Stockholm University)
Project page at KTH TMH ↗ · Malmberg et al. (2024) · LREC-COLING Workshop on Sign Language
Previous Projects — AAIS (Digital Futures), Artificial Actors (Digital Futures)
AAIS · Previous Project · Dataset
MM-Conv: A Multi-Modal Conversational Dataset for Virtual Humans
Project page

A multimodal corpus of two-party conversations recorded in PMIL using the Meta Quest VR headset and the Optitrack motion capture system. Participants engaged in referential communication tasks inside the AI2-THOR physics simulator — describing, identifying, and giving instructions about objects and locations in a virtual 3D environment — while their full-body motion, speech, and gaze were captured simultaneously.

The dataset is designed to advance co-speech gesture generation in spatially grounded contexts: skeletal motion capture data is streamed directly from the VR headset into the simulator, enabling synchronised capture of embodied referential communication with full scene-graph annotations. Created by Anna Deichler, Jim O'Regan & Jonas Beskow (KTH); presented at the ECCV Multimodal Agents Workshop.

MoCap
full-body skeleton
Speech
audio + transcription
Gaze
VR eye tracking
Scene
AI2-THOR graphs
Artificial Actors · Digital Futures · Completed 2024
Artificial Actors — Directable Digital Humans
Completed

Artificial Actors developed virtual digital humans that function like actors receiving directorial guidance — agents with psychological inner states that govern their nonverbal behaviours, enabling motion synthesis with specific emotional qualities such as shyness or social anxiety rather than neutral, generic movement. The project combined three strands of work: (1) recording a database of acted behaviours across diverse psychological states using PMIL's motion capture infrastructure, (2) developing probabilistic generative methods for gesture synthesis conditioned on high-level personality traits, and (3) establishing a cognitive modelling framework to guide behavioural synthesis. A key outcome was a virtual agent simulating a therapy patient, evaluated with practising therapists and trainees.

Key Publication · ACM Transactions on Graphics · SIGGRAPH 2023
Listen, Denoise, Action! — Audio-Driven Motion Synthesis with Diffusion Models
Simon Alexanderson · Rajmund Nagy · Jonas Beskow · Gustav Eje Henter

Adapts diffusion models to synthesise co-speech gesture and dance from audio in real time. A Conformer-based architecture replaces dilated convolutions for improved modelling power, achieving top-of-the-line motion quality with controllable, speaker-distinctive styles. Generalised guidance enables product-of-expert diffusion ensembles for style interpolation and blending.

Listen, Denoise, Action! — video
Team: Simon Alexanderson · Robert Johansson (Stockholm University Psychology) · Jonas Beskow · Gustav Eje Henter  ·  Funder: Digital Futures Research Pair Call
Mobile Robotics Lab — fleet of custom-built UAV quadrotor platforms
Lab 04 · RPL

Mobile Robotics Lab

Houses a fleet of custom-built aerial robots and ground vehicles, including multiple quadrotor UAV designs, the Boston Dynamics Spot quadruped, and wheeled autonomous ground systems. Equipped with onboard compute, custom sensor arrays, and dedicated workshop space for hardware development.

Research areas include autonomous navigation and mapping, aerial perception, multi-robot coordination, and long-range inspection. The Spot robot — with its Spot Arm — is a regular presence throughout KTH IRL and is frequently used in the IA-Lab for mobile manipulation studies.

Custom quadrotor UAVs Boston Dynamics Spot Spot Arm ATKV ground vehicle Onboard GPU compute Custom sensor arrays Hardware workshop
ProSense · Vinnova · Scania · 2021–ongoing
ProSense — Proactive Context-Aware Perception for Autonomous Vehicles
● Extended

A Vinnova-funded project coordinated by Scania developing scene perception capabilities for safe autonomous driving. The KTH RPL team contributes two work packages: WP4 builds comprehensive local scene models by fusing data from multiple sensors, algorithms, and HD maps; WP5 uses these models for proactive situation interpretation and prediction — detecting objects that individual sensors would miss in occluded or complex traffic scenarios (e.g. vehicles emerging from behind stopped buses, or at intersections with limited sightlines).

WP4 — Multi-source Fusion
Local scene models combining sensors, algorithms & HD maps
WP5 — Proactive Perception
Situation prediction & detection of sensor-occluded objects
PIs: Patric Jensfelt, John Folkesson, Jana Tumova (KTH RPL) · Team: Ajinkya Khoche, Yi Yang, Chris Pek · Coordinator: Scania
KTH RPL project page
Digital Futures Postdoc · 2025–2026
Towards Smart Cities — Collaborative Spatial Perception for Digital Twinning
● Active

A Digital Futures postdoctoral project (Jan 2025–Dec 2026) building a collaborative spatial perception framework for city-scale digital twinning. Multiple autonomous agents jointly construct multi-level abstract representations of urban environments from LiDAR point clouds, RGB-D imagery, and remote sensing data — enabling robots to autonomously create and continuously update a complete digital mirror of the physical world with minimal human involvement.

LiDAR
point cloud mapping
Multi-agent
collaborative perception
Digital twins
city-scale urban models
Postdoc: Yixi Cai (KTH RPL) · Supervisor: Patric Jensfelt · Co-supervisor: Olov Andersson
Digital Futures project page
Cloud Robotics Lab — CloudGripper robot arm cells at KTH IRL
Lab 05 · RPL

Cloud Robotics Lab & CloudGripper

CloudGripper is an open-source cloud robotics testbed for remote robotic manipulation research, benchmarking, and large-scale data collection — hosted at KTH IRL and accessible to researchers worldwide over the internet.

The lab currently houses 32 small robot arm cells, each enclosed in a transparent acrylic frame with overhead cameras that capture ground-truth images of every grasp and push interaction. Researchers anywhere in the world can log in, operate the hardware remotely, collect training data, and run manipulation experiments — no physical presence required.

The transparent enclosure design ensures clean overhead ground-truth images for training computer vision and robot learning models. The system is expanding to include industrial arms for more complex dexterous manipulation tasks.

32
Robot arm cells
~2 TB
Interaction data collected
1000+
Hours of video data
100+
Hours rope-push data

Research & Datasets

CloudGripper has generated approximately 2 terabytes of robotic interaction data, including the publicly available CloudGripper-Rope-100 dataset — over 100 hours of rope pushing interactions — and more than 1,000 hours of robot-object pushing video data for AI training.

The platform was presented at IEEE ICRA 2024 in the paper "CloudGripper: An Open Source Cloud Robotics Testbed for Robotic Manipulation Research, Benchmarking and Data Collection at Scale" by Muhammad Zahid and Florian T. Pokorny. It also hosts video transformer research on occlusion handling in robot manipulation.
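The remote-access pattern behind such a testbed can be sketched as follows: each hardware action becomes an HTTP request (typically authenticated with an API key) addressed to a specific robot cell. This is a hypothetical illustration only; the endpoint paths, parameter names, and base URL are invented for the sketch and are not the real CloudGripper API.

```python
# Hypothetical illustration of the cloud-robotics remote-access pattern:
# each robot-cell action maps to an HTTP request. Endpoint paths,
# parameter names, and the base URL are invented for this sketch and
# are NOT the real CloudGripper API.
from urllib.parse import urlencode

BASE = "https://example.org/api/v1"  # placeholder, not the real service

def build_request(cell_id, action, **params):
    """Return (method, url) for a command to one robot cell."""
    query = urlencode(sorted(params.items()))
    url = f"{BASE}/cells/{cell_id}/{action}"
    return ("POST", url + ("?" + query if query else ""))

print(build_request(7, "move_xy", x=0.12, y=0.30))
print(build_request(7, "grip", state="close"))
```

In a real deployment, a thin client library would send these requests and return camera frames and robot state, letting an off-site researcher close the perception-action loop over the internet.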

WASP NEST Project — CloudRobotics Funded by the Wallenberg AI, Autonomous Systems and Software Program (WASP). Coordinated by Assoc. Prof. Florian T. Pokorny (KTH), with co-investigators from Lund University and Umeå University.
32 robot arm cells Transparent enclosures Overhead camera arrays Remote teleoperation API Open-source software Open CAD files (GitHub) MuJoCo simulation Internet-scale data collection

Milestones

2021–22
System development begins. The CloudRobotics NEST project launches, funded by WASP.
2023
IROS 2023 public demo. First major conference demonstration of the CloudGripper testbed to the international robotics community.
2024
IEEE ICRA 2024 paper published. System paper by Zahid & Pokorny. Live demo at ICRA 2024. 2 TB of manipulation data publicly released.
2025
CoRL 2025 demo. Summer school at the University of Padova on LLMs and reinforcement learning for robot control using CloudGripper.
2026
IEEE ICRA 2026 competition. CloudGripper hosts the cloud manipulation track at the 11th Robotic Grasping and Manipulation Competition.

Global Collaborators

UC Berkeley Carnegie Mellon University Univ. of Texas Austin Bosch University of Padova Lund University Umeå University
Group Perception Lab — tiered red velvet theatre seating for 50 people
Lab 06 · TMH

Group Perception Lab

A unique theatre-style space with tiered red velvet seating for up to 50 people, purpose-built for studying group dynamics, audience attention, and social perception in larger gatherings.

Equipped with wireless audience-response clickers for real-time participant input. The front wall carries a 2 × 2 video wall of 55” displays, connected to a control room and to the Intelligence Augmentation Lab. This setup makes it possible for groups, panels, and public audiences to watch (and control) an ongoing human-robot interaction in the kitchen lab.

The Group Perception Lab also doubles as a high-quality venue for seminars and for presentations shared with similar venues elsewhere (e.g. USC-ICT).

50-seat tiered theatre Red velvet seating Wireless audience clickers Professional AV system 2 × 2 55” display video wall
● Active Infrastructure Språkbanken Tal (KTH) — national Swedish speech technology infrastructure, part of Swe-CLARIN
Språkbanken Tal · National Infrastructure · VR + KTH
Språkbanken Tal — National Swedish Speech Technology Infrastructure
● Active

KTH hosts the speech technology node of Sweden’s national language infrastructure. Språkbanken Tal develops and maintains open resources for Swedish speech — including automatic speech recognition (ASR), text-to-speech synthesis (TTS), and forced alignment tools — as part of Nationella språkbanken and the European CLARIN ERIC network. Funded jointly by the Swedish Research Council (VR) and KTH, the infrastructure serves universities, broadcasters, public agencies, and cultural institutions across Sweden.

ASR
speech recognition
TTS
speech synthesis
Corpora
Swedish speech data
Collaborators: Swedish universities · SVT · National Library of Sweden · Wikimedia Sweden · Bonnier · Swedish Agency for Accessible Media · Institute for Language and Folklore
Contact: Jens Edlund (KTH TMH)
sprakbanken.speech.kth.se
Research Projects

Hosted Projects

KTH IRL serves as the physical substrate for externally funded research projects spanning humanoid robotics, AI-supported daily life, conversational agents, and distributed robot learning.

Promobilia Foundation · 5 Years
SAInt — Situated Agentic Intelligence

A five-year project developing humanoid robots that understand context, learn from interaction, and provide proactive verbal and physical support — bridging the care gap for older adults and people with special needs. SAInt will expand the IRL humanoid fleet to five platforms and conduct longitudinal user studies in the IA-Lab kitchen.

Project Webpage
WASP NEST · ABB AB · Saab AB
PerCorSo — Socially Acceptable Autonomous Robots

Designs robot behaviour for crowded human environments by integrating spatial and social context understanding, multimodal communication, and formal-methods decision-making. The research targets trust through both verifiability and social acceptability, with real-world validation in the IA-Lab. PIs: Leite, Tumova, Gustafson, Jensfelt (KTH).

WASP project page
Digital Futures · Completed
AAIS — Ambient AI Interaction Systems

Investigated conversational AI and gaze-grounded multimodal interaction in the IA-Lab kitchen and PMIL. Produced two publicly released corpora: the KTH-ARIA-referential dataset (egocentric cooking with Meta Aria glasses) and MM-Conv (motion-captured VR dialogue with AI2-THOR scene graphs). Both datasets are open for research use.

Open Infrastructure · RPL
CloudGripper — Distributed Robot Learning

An open cloud robotics platform enabling researchers worldwide to collect large-scale robot manipulation datasets via internet teleoperation. Hosted in the Cloud Robotics Lab, CloudGripper democratises access to robotic manipulation hardware and enables AI training data collection at unprecedented scale.

cloudgripper.org
Vetenskapsrådet · Active · 2024–2027
BodyTalk — Speech, Gesture & Face Synthesis

Develops unified generative models that synthesise speech, facial expressions, and gestures simultaneously from text, producing spontaneous, non-repetitive conversational behaviours for virtual characters and social robots. Addresses congruence across modalities and high-level style control (e.g. engagement, agitation). Hosted in PMIL.

KTH TMH project page
WASP / Vetenskapsrådet · Active · 2024–2027
SignBot — Generative AI for Sign Language

Employs state-of-the-art generative AI to create high-quality sign language animations and language processing systems, addressing the visuo-spatial and highly parallel structure of signed languages. Uses PMIL's motion capture to record and train neural sign synthesis models, improving accessibility for 70M+ sign language users worldwide.

KTH TMH project page
Digital Futures · Completed 2024
Artificial Actors — Directable Digital Humans

Developed virtual digital humans with psychological inner states governing nonverbal behaviour — enabling synthesis with specific emotional qualities such as shyness or social anxiety. Combined PMIL motion capture recordings, probabilistic gesture synthesis, and cognitive modelling. Key output: Listen, Denoise, Action! (ACM TOG / SIGGRAPH 2023), a diffusion model for audio-driven gesture and dance synthesis.

Digital Futures project page
WASP · VR Starting Grant · Active
GeoRob Lab — Geometric Robot Learning

Led by Noémie Jaquier (WASP Assistant Professor, RPL), the GeoRob Lab develops data-efficient robot learning, optimisation, and control algorithms using differential geometry and physics as core inductive biases. Recently awarded a Swedish Research Council Starting Grant (2025). Hosted in the Humanoid Robot Lab.

GeoRob Lab website
KAW Wallenberg Scholar · Active · Since 2014
Robots Interacting with People — Danica Kragic Jensfelt

A decade-long Wallenberg Scholar programme developing robotic systems that interpret their surroundings much as human senses do and learn through interaction. Focus areas include multimodal AI (sound, image, force), learning from demonstration and VR teleoperation, and manipulation of deformable objects such as textiles and groceries. Hosted in the Humanoid Robot Lab.

KAW project page
Vetenskapsrådet Distinguished Professor · 2020–2030
Learning, Interactive Autonomous Systems — Danica Kragic

A ten-year VR Rådprofessor grant developing self-supervised and meta-learning methodologies with causal reasoning for robot perception, control, and interaction with rigid and deformable objects in unstructured environments. Addresses open problems in transfer learning, modelling unknown unknowns, and multisensory physical interaction. Hosted in the Humanoid Robot Lab.

KTH RPL project page
ERC Advanced Grant · H2020 · Completed 2025
BIRD — Bimanual Manipulation of Rigid and Deformable Objects

A five-year ERC Advanced Grant (€2.4M, KTH) creating informative representations of deformable objects combining analytical and learning-based approaches. Research covered geometric, topological, and physical modelling for skilled sensorimotor behaviour in bimanual robot systems. PI: Danica Kragic. Hosted in the Humanoid Robot Lab.

CORDIS project page
Vinnova · Completed
FoodTalk — Proactive AI Cooking Companion

Vinnova-funded project (Swedish: Prata mat) that developed a proactive conversational AI cooking assistant for elderly users. Wizard-of-Oz experiments in the IA-Lab smart kitchen compared an Instructional and a Chatty AI Chef persona with six senior participants; the chatty variant was rated as more aware and intelligent. Partners: KTH, Electrolux, Nagoon. PI: Joakim Gustafson.

Academic Units

Two Departments

KTH IRL is jointly operated by two of KTH's most research-intensive departments. Their combined expertise spans the full stack — from dialogue and social perception to planning and physical robot control.

KTH EECS School
Department of Speech, Music and Hearing (TMH)

TMH investigates how humans use voice, music, and sound to communicate and interact. Faculty expertise covers multimodal dialogue systems, generative models for speech and gesture, social robotics, expressive speech synthesis, and face-to-face interaction modelling. TMH leads interaction design and user research across KTH IRL projects, and operates the IA-Lab, PMIL, and Group Perception Lab.

IA-Lab website ↗

Visit TMH
KTH EECS School
Department of Robotics, Perception & Learning (RPL)

RPL bridges machine learning, computer vision, and robotics to create autonomous systems that perceive and act in the physical world. Faculty expertise includes dexterous manipulation, geometric robot learning, human-robot interaction, formal methods, and autonomous navigation. RPL leads hardware, perception, and control research within KTH IRL, and operates the Humanoid Robot Lab, Mobile Robotics Lab, and Cloud Robotics Lab.

Visit RPL
Access & Booking

Get Access to KTH IRL

Whether you are a KTH researcher ready to book the motion-capture studio, an industry partner exploring collaboration, or a research group wanting to use CloudGripper remotely — KTH IRL welcomes you.

🔬 KTH Researchers

KTH faculty, postdocs, and PhD students can request access to PMIL via the online form and book through the KTH Outlook calendar system (room name: eecs_lv24_pmil), with a maximum of 4 consecutive days per booking. Master's and bachelor's students may be granted time-limited access under supervisor responsibility.

Request PMIL Access

🤝 Industry & Partners

KTH IRL has a strong track record of joint R&D with Electrolux, Akademiska Hus, and other partners. We welcome collaboration with industry, care organisations, and international research groups. Contact us to explore how KTH IRL can support your research, product development, or innovation goals.

Contact TMH Contact RPL