Research

Research at Emergences Labs

We study how intelligence develops real-world competency.

Fri, 30 Jan 2026

AgentIF-OneDay: A Task-level Instruction-Following Benchmark for General AI Agents in Daily Scenarios

AI agents can handle tasks of increasing duration and complexity, and they show exceptional performance in coding, deep research, and complex problem-solving evaluations. In daily scenarios, however, general users' perception of these advanced capabilities remains limited. We argue that current evaluations prioritize increasing task difficulty without sufficiently addressing the diversity of agentic tasks needed to cover the daily work, life, and learning activities of a broad demographic. To address this, we propose AgentIF-OneDay, a benchmark aimed at determining whether general users can use natural language instructions and AI agents to complete a diverse array of daily tasks.

Links

Paper Link

GitHub

HuggingFace
