Benetech is a foundational player in the data layer for AI agents. While most of the AI world is focused on LLM performance, Benetech works on high-fidelity data extraction from complex, hard-to-access formats such as visual math, science diagrams, and legacy print layouts.
For builders in the agent ecosystem, Benetech provides a blueprint for handling "unstructured" information. Agents need a semantic map of the world to function well, and Benetech’s work on MathML and accessible EPUBs creates that map for educational and literary content. As agents are increasingly deployed in educational settings, Benetech’s ability to produce machine-readable, structured data from visual content becomes a critical piece of the agent stack, helping AI-driven tools stay inclusive and navigate the nuances of structured documentation.
Benetech is an outlier in Palo Alto. Founded in 2001, it is a nonprofit that operates with the engineering rigor of a software house. While many organizations in the social sector focus on direct service, Benetech focuses on data pipelines and semantic structure. Its core mission is to solve the "book famine": the reality that only a small fraction of published materials is ever converted into formats that people with visual impairments or dyslexia can read.
To solve this, they built Bookshare. It is not just a repository; it is a conversion engine. The platform takes unstructured or semi-structured files from publishers and applies a series of automated transformations to produce DAISY (Digital Accessible Information System), EPUB, and braille files. This process requires a deep understanding of document structure, converting visual hierarchies into the machine-readable trees that screen readers and assistive devices rely on. At a time when Silicon Valley was focused on building the next social network, Benetech was solving the hard engineering problem of optical character recognition (OCR) and document parsing at scale.
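As a rough illustration of what "converting visual hierarchies into machine-readable trees" can mean, the sketch below nests a flat, ordered list of headings into an outline tree. It is a minimal example under stated assumptions, not Bookshare's actual pipeline: the `Section` class, the `build_outline` function, and the `(level, title)` input shape are all hypothetical names chosen for this example, and heading levels are assumed to start at 1.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    """One node of the outline tree (hypothetical structure for illustration)."""
    level: int
    title: str
    children: list["Section"] = field(default_factory=list)


def build_outline(headings: list[tuple[int, str]]) -> Section:
    """Nest a flat, ordered list of (level, title) headings into a tree."""
    root = Section(level=0, title="document")
    stack = [root]  # path from the root to the most recently added section
    for level, title in headings:
        # Pop until the top of the stack is shallower than the new heading,
        # so that it can serve as the new heading's parent.
        while stack[-1].level >= level:
            stack.pop()
        node = Section(level, title)
        stack[-1].children.append(node)
        stack.append(node)
    return root


if __name__ == "__main__":
    outline = build_outline([
        (1, "Algebra"),
        (2, "Linear Equations"),
        (2, "Quadratic Equations"),
        (1, "Geometry"),
    ])
    for chapter in outline.children:
        print(chapter.title, [s.title for s in chapter.children])
```

A tree like this is the kind of structure that lets assistive software jump between chapters and sections instead of reading a page linearly.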
One of the most technically demanding areas Benetech tackles is the representation of mathematical notation. Traditional OCR and standard text-to-speech engines fail when encountering complex formulas. Benetech's MathML Cloud project is an open-source initiative designed to automate the conversion of math into accessible formats like MathML and SVG with tactile descriptions. This is a critical infrastructure layer for the education sector. By turning a visual image of a quadratic equation into a structured tree, they allow software to describe the math logically to a blind student or render it correctly on a refreshable braille display.
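To make the "structured tree" idea concrete, here is a small sketch that builds Presentation MathML for the quadratic equation x^2 + 5x + 6 = 0 using Python's standard `xml.etree.ElementTree`, then walks the tree to produce a naive spoken description. The MathML element names (`math`, `mrow`, `msup`, `mi`, `mn`, `mo`) come from the W3C MathML standard; the `build_quadratic` and `speak` functions and the wording of the spoken output are assumptions made for this example and do not represent MathML Cloud's code or output.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical, deliberately tiny operator vocabulary for the example.
SPOKEN_OPERATORS = {"+": "plus", "=": "equals"}


def build_quadratic() -> ET.Element:
    """Build Presentation MathML for: x^2 + 5x + 6 = 0."""
    math = ET.Element("math", {"xmlns": MATHML_NS})
    row = ET.SubElement(math, "mrow")

    # <msup> holds exactly two children: the base and the exponent.
    sup = ET.SubElement(row, "msup")
    ET.SubElement(sup, "mi").text = "x"
    ET.SubElement(sup, "mn").text = "2"

    ET.SubElement(row, "mo").text = "+"
    ET.SubElement(row, "mn").text = "5"
    ET.SubElement(row, "mi").text = "x"
    ET.SubElement(row, "mo").text = "+"
    ET.SubElement(row, "mn").text = "6"
    ET.SubElement(row, "mo").text = "="
    ET.SubElement(row, "mn").text = "0"
    return math


def speak(node: ET.Element) -> str:
    """Walk the tree and return a naive spoken description (illustrative only)."""
    if node.tag == "msup":
        base, exponent = list(node)
        return f"{speak(base)} to the power of {speak(exponent)}"
    if node.tag in ("mi", "mn"):
        return node.text or ""
    if node.tag == "mo":
        return SPOKEN_OPERATORS.get(node.text or "", node.text or "")
    # Containers such as <math> and <mrow>: describe children in order.
    return " ".join(speak(child) for child in node)


if __name__ == "__main__":
    tree = build_quadratic()
    print(ET.tostring(tree, encoding="unicode"))
    print(speak(tree))
```

Because the exponent is an explicit child of `msup` rather than a small raised glyph in an image, software can describe or transcribe it without guessing at the layout.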
This focus on structured data is what makes Benetech relevant in the age of large language models. LLMs often hallucinate when interpreting the visual layouts of textbooks or complex diagrams. Benetech has spent two decades building the training sets and the conversion logic required to turn those visual assets into clean, semantic text. Their work in image description, which uses AI to generate alt-text for complex educational diagrams, is a direct application of modern machine learning to their long-standing goal of universal information access.
While Bookshare is their most visible product, Benetech has a history of building tools for sensitive environments. Their past work on Martus, a secure human rights reporting tool, demonstrated an early commitment to encrypted data collection and privacy. This ethos continues in how they manage the massive amounts of student data within the Bookshare ecosystem. They maintain a presence on GitHub with over 170 repositories, reflecting an open-source philosophy that is rare for organizations of their size. Based in Palo Alto and led by founders with deep roots in the original Silicon Valley boom, Benetech is a reminder that the same technologies used to disrupt industries can be applied to the more fundamental task of making sure everyone can read a book.
Bookshare: The world's largest library of accessible ebooks for people with reading barriers.