Chunking
The naive chunking approach divides documents into chunks of equal size (measured in tokens or characters) without considering content semantics. While efficient, this often splits related content across chunks. In Teammately's knowledgebook, you can set the chunk size, chunk overlap, and separators for a specific document set.
However, naive chunking approaches face several challenges:
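To make the chunk size and chunk overlap settings concrete, here is a minimal sketch of fixed-size chunking over characters. It is an illustration only, not Teammately's implementation: the function name `naive_chunk` and its parameters are hypothetical, and separator handling is left out to keep the example short.

```python
def naive_chunk(text: str, chunk_size: int = 40, chunk_overlap: int = 10) -> list[str]:
    """Split text into fixed-size character windows (hypothetical helper).

    Consecutive windows share `chunk_overlap` characters. The content is
    never inspected, so sentences can be cut at any position.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in the range [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    sample = "The warranty lasts two years. It does not cover water damage."
    for piece in naive_chunk(sample, chunk_size=40, chunk_overlap=10):
        print(repr(piece))
```

Running the sketch on the sample text shows the trade-off: the overlap repeats some characters across neighbouring chunks, but the boundary still lands wherever the character count dictates rather than at a sentence break.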
- No semantic awareness of content relationships
- Context loss at chunk boundaries
- Difficulty handling complex document structures
- Inefficient handling of references and dependencies
These limitations led to the development of our advanced contextual chunking approach.
Contextual Chunking
Contextual chunking is a document segmentation approach that preserves semantic meaning and contextual relationships. Unlike traditional chunking methods that split documents based solely on token count or syntactic boundaries, contextual chunking uses an AI agent to maintain coherent information units while optimizing for retrieval effectiveness.
This process includes a document analysis step that builds a single document-level context and attaches it to each chunk. Every chunk the model retrieves therefore carries general information about its source document, which allows agents to reply more accurately.
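The idea of attaching a document-level context to every chunk can be sketched as follows. This is a simplified illustration under stated assumptions, not Teammately's actual pipeline: `contextualize_chunks` and `first_sentence` are hypothetical names, and `first_sentence` is only a stand-in for the AI agent that would analyze the document.

```python
from typing import Callable


def first_sentence(document: str) -> str:
    """Stand-in for the document analysis step (assumption for this sketch).

    A real system would ask an AI agent to describe the document; here we
    simply take the text up to the first period.
    """
    head, _, _ = document.partition(".")
    return head.strip() + "."


def contextualize_chunks(
    document: str,
    chunks: list[str],
    analyze: Callable[[str], str] = first_sentence,
) -> list[str]:
    """Prefix every chunk with one shared, document-level context string."""
    document_context = analyze(document)  # built once per document
    return [
        f"Document context: {document_context}\n\n{chunk}"
        for chunk in chunks
    ]


if __name__ == "__main__":
    document = (
        "This guide covers the Acme X1 router warranty. "
        "Coverage lasts two years. Water damage is excluded."
    )
    chunks = ["Coverage lasts two years.", "Water damage is excluded."]
    for piece in contextualize_chunks(document, chunks):
        print(piece)
        print("---")
```

In this sketch the chunk "Water damage is excluded." no longer stands alone: whoever retrieves it also sees which document and product it refers to.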
The combination of document cleaning and contextual chunking forms the foundation of Teammately's hallucination-resistant RAG system, ensuring that retrieved content is both relevant and contextually appropriate.