Block-Based Visual Programming Language

Zero-Shot Knowledge-Based Visual Question Answering with Frozen Language Models

Abstract: Knowledge-based Visual Question Answering (VQA) is a challenging task that requires models to access external knowledge for reasoning. Large Language Models (LLMs) have recently been ...

IEEE

Neurodynamics-Based Visual Servo Predictive Control for Improving Smooth Movement of Logistics Omnidirectional Robots

Abstract: Smooth movement and constraint satisfaction are the key safety and effectiveness concerns of visual servoing systems of logistics transport robots. In this article, we propose a novel ...

Ai2 releases MolmoWeb, an open-weight visual web agent with 30K human task trajectories and a full training stack

Ai2's MolmoWeb is the first open-weight visual web agent to ship with its full training dataset, giving enterprise teams the ...
