Who is this for?
This guide is intended for technical teams planning a RAG PoC or production project. It supports decisions that consider data, security, operations, and measurement together rather than treating technology selection in isolation.
1. Data sources and ownership
- Which repositories will connect?
- Who owns the documents?
- How are updates and deletion tracked?
- How are duplicates and old versions handled?
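As one illustration of the last question, the sketch below keeps only the newest version of each document and drops copies whose normalized text is identical. The `Document` fields, the assumption that version numbers increase over time, and the sample data are all hypothetical; real repositories may need fuzzier matching than an exact content hash.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str   # stable identifier in the source repository (hypothetical)
    version: int  # assumed to increase with every update of the same doc_id
    text: str


def content_hash(text: str) -> str:
    """Hash lowercased, whitespace-normalized text so trivial formatting differences collapse."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def select_for_indexing(docs: list[Document]) -> list[Document]:
    """Keep the newest version per doc_id, then drop exact-content duplicates."""
    latest: dict[str, Document] = {}
    for doc in docs:
        current = latest.get(doc.doc_id)
        if current is None or doc.version > current.version:
            latest[doc.doc_id] = doc

    seen: set[str] = set()
    selected: list[Document] = []
    # Sorting by doc_id makes the choice among duplicates deterministic.
    for doc in sorted(latest.values(), key=lambda d: d.doc_id):
        digest = content_hash(doc.text)
        if digest not in seen:
            seen.add(digest)
            selected.append(doc)
    return selected


docs = [
    Document("policy-1", 1, "Travel policy, 2022 edition."),
    Document("policy-1", 2, "Travel policy, 2024 edition."),
    Document("policy-copy", 1, "Travel  policy, 2024 edition."),  # same text, extra space
    Document("faq-1", 1, "Expense FAQ."),
]
selected = select_for_indexing(docs)
print([(d.doc_id, d.version) for d in selected])
```

Deletion tracking is not covered here; it would typically need a comparison between the identifiers in the source and those already in the index.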
2. Permission model
- User and role sources
- Document-level access
- Tenant boundaries
- Propagation of permission changes into the index
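The points above can be made concrete with a minimal sketch of permission-aware retrieval. It assumes each indexed chunk carries a tenant identifier and a set of allowed roles as metadata; the `Chunk` and `User` types and their field names are hypothetical. In practice the same condition is usually better pushed into the vector store's own metadata filter so that unauthorized chunks are never returned at all.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    tenant_id: str
    allowed_roles: frozenset[str]
    score: float  # similarity score from the retriever (assumed: higher is better)


@dataclass(frozen=True)
class User:
    user_id: str
    tenant_id: str
    roles: frozenset[str]


def authorized(chunk: Chunk, user: User) -> bool:
    """A chunk is visible only inside the user's tenant and to at least one of the user's roles."""
    return chunk.tenant_id == user.tenant_id and bool(chunk.allowed_roles & user.roles)


def filter_results(candidates: list[Chunk], user: User, top_k: int = 3) -> list[Chunk]:
    """Drop unauthorized chunks first, then rank what remains."""
    permitted = [c for c in candidates if authorized(c, user)]
    return sorted(permitted, key=lambda c: c.score, reverse=True)[:top_k]


user = User("u1", "acme", frozenset({"hr"}))
candidates = [
    Chunk("c1", "acme", frozenset({"hr"}), 0.90),
    Chunk("c2", "acme", frozenset({"finance"}), 0.95),  # wrong role
    Chunk("c3", "other", frozenset({"hr"}), 0.99),      # wrong tenant
    Chunk("c4", "acme", frozenset({"hr", "finance"}), 0.70),
]
print([c.chunk_id for c in filter_results(candidates, user)])
```

Because the roles are stored with the chunk, a permission change in the source system only takes effect once that metadata is refreshed, which is why propagation into the index needs its own answer.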
3. Document quality and chunking
PDF type, scan quality, table structure, and heading hierarchy determine the parsing strategy. Chunk size is not a single fixed number; it should be evaluated against the content and the question patterns you expect.
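One simple way to evaluate chunk size is to check how often a known answer passage survives intact inside a single chunk. The sketch below uses a word-window chunker with overlap and a substring check; both are simplifying assumptions, and the function names and synthetic text are illustrative only. A real evaluation would use actual documents and expected answers from the golden question set.

```python
def chunk_words(text: str, size: int, overlap: int) -> list[str]:
    """Split text into windows of `size` words, each sharing `overlap` words with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("size must be positive and overlap must be in [0, size)")
    words = text.split()
    step = size - overlap
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def answer_coverage(text: str, answers: list[str], size: int, overlap: int) -> float:
    """Fraction of expected answer passages that fit entirely inside at least one chunk."""
    chunks = chunk_words(text, size, overlap)
    hits = sum(any(answer in chunk for chunk in chunks) for answer in answers)
    return hits / len(answers)


# Synthetic ten-word text; the "answers" stand in for passages a question should retrieve.
text = " ".join(f"w{i}" for i in range(10))
answers = ["w2 w3 w4", "w7 w8 w9"]

for size, overlap in [(4, 1), (6, 2)]:
    print(size, overlap, answer_coverage(text, answers, size, overlap))
```

Running the same comparison over several candidate sizes and real question patterns gives a measured basis for the choice instead of a default value.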
4. Embeddings, index, and evaluation
- Language- and domain-aligned embeddings
- Vector database options
- Metadata filters
- Golden question set
- Retrieval and answer evaluation
5. Security and deployment
- Model provider and data boundary
- On-premise or private cloud needs
- Logging and masking
- Backups
- Cost and capacity tracking
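For the logging and masking item, a minimal sketch using Python's standard `logging` module is shown here. The two regular expressions (email addresses and digit runs of nine or more characters) and the logger name `rag.queries` are assumptions chosen for illustration; they are not a complete definition of personal or sensitive data, and real masking rules should come from the security review.

```python
import logging
import re

# Illustrative patterns only; they will not catch every sensitive value.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
LONG_NUMBER = re.compile(r"\b\d{9,}\b")


def mask(text: str) -> str:
    """Replace email addresses and long digit runs with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return LONG_NUMBER.sub("[NUMBER]", text)


class MaskingFilter(logging.Filter):
    """Mask the fully formatted message before any handler sees it."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = mask(record.getMessage())
        record.args = ()  # the message is already formatted
        return True


logger = logging.getLogger("rag.queries")  # hypothetical logger name
logger.addFilter(MaskingFilter())

print(mask("Contact jane.doe@example.com about account 1234567890"))
```

Attaching the filter to the logger means queries and retrieved snippets are masked once, at the point of logging, rather than in every call site.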
6. Write success criteria first
Before implementation, define which questions should be answered, which sources are authoritative, and when the system should decline to answer. A few successful questions selected for a demo do not demonstrate production readiness.
The evaluation set should cover document types, short and long questions, alternate phrasing, outdated content, and access restrictions. Retrieval and answer-generation results should be reported separately.
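A minimal sketch of such a report follows. It assumes each golden case records which source chunks are relevant, what the retriever returned, whether the system answered, and a reviewer's judgment of the answer; the `GoldenCase` structure, the use of recall@k as the retrieval metric, and the sample cases are illustrative assumptions. Cases with no relevant source represent questions the system should decline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GoldenCase:
    question: str
    relevant_ids: frozenset[str]   # empty set: the system is expected to decline
    retrieved_ids: tuple[str, ...]
    answered: bool                 # did the system produce an answer?
    answer_correct: bool           # reviewer judgment (hypothetical field)


def recall_at_k(case: GoldenCase, k: int) -> float:
    """Share of the relevant sources that appear in the top-k retrieved results."""
    top_k = set(case.retrieved_ids[:k])
    return len(case.relevant_ids & top_k) / len(case.relevant_ids)


def ratio(numerator: float, denominator: int) -> Optional[float]:
    return numerator / denominator if denominator else None


def report(cases: list[GoldenCase], k: int = 3) -> dict[str, Optional[float]]:
    """Report retrieval, answer, and refusal quality as separate numbers."""
    answerable = [c for c in cases if c.relevant_ids]
    must_decline = [c for c in cases if not c.relevant_ids]
    return {
        "retrieval_recall_at_k": ratio(sum(recall_at_k(c, k) for c in answerable), len(answerable)),
        "answer_accuracy": ratio(sum(c.answered and c.answer_correct for c in answerable), len(answerable)),
        "correct_refusal_rate": ratio(sum(not c.answered for c in must_decline), len(must_decline)),
    }


cases = [
    GoldenCase("q1", frozenset({"a", "b"}), ("a", "x", "b"), True, True),
    GoldenCase("q2", frozenset({"c"}), ("x", "y", "z"), True, False),
    GoldenCase("q3", frozenset({"d", "e"}), ("d", "q", "r"), True, True),
    GoldenCase("q4", frozenset(), ("x",), False, False),  # correctly declined
    GoldenCase("q5", frozenset(), ("y",), True, False),   # should have declined
]
print(report(cases))
```

Keeping the three figures apart shows whether a wrong answer came from missing context or from generation, which a single blended score would hide.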
7. Define the operating model
Be explicit about who updates the index, removes incorrect content, reviews user feedback, and approves model changes. A RAG system is not a static search screen that can be left without ownership after release.
Incident handling also matters. Technical teams and knowledge owners need agreed steps for restricted results, incorrect sources, latency problems, and provider outages.
8. Estimate cost and capacity
Cost includes more than LLM calls. Parsing, OCR, embeddings, index storage, backups, network traffic, observability, and human review all contribute. Document change frequency directly affects re-indexing load.
During the pilot, measure average and peak use, queries per user, context size, and latency targets. These observations support realistic technology and deployment choices.
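The pilot measurements can feed a rough estimate like the sketch below. Every number in it is a placeholder, not a real provider price or an observed load, and the model covers only LLM calls, peak query rate, and re-embedding volume; parsing, OCR, storage, backups, observability, and human review would need their own lines. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PilotLoad:
    users: int
    queries_per_user_per_day: float
    context_tokens_per_query: int
    output_tokens_per_query: int
    peak_to_average_ratio: float


@dataclass(frozen=True)
class Prices:
    input_per_1k_tokens: float   # placeholder currency units
    output_per_1k_tokens: float


def monthly_llm_cost(load: PilotLoad, prices: Prices, days: int = 30) -> float:
    """LLM call cost only: queries per month times token cost per query."""
    queries = load.users * load.queries_per_user_per_day * days
    per_query = (
        load.context_tokens_per_query / 1000 * prices.input_per_1k_tokens
        + load.output_tokens_per_query / 1000 * prices.output_per_1k_tokens
    )
    return queries * per_query


def peak_queries_per_second(load: PilotLoad, active_hours_per_day: float = 8) -> float:
    """Average rate over the active hours, scaled by the observed peak ratio."""
    average = load.users * load.queries_per_user_per_day / (active_hours_per_day * 3600)
    return average * load.peak_to_average_ratio


def monthly_reindex_tokens(documents: int, avg_tokens_per_doc: int, monthly_change_rate: float) -> float:
    """Tokens that must be re-embedded each month because documents changed."""
    return documents * avg_tokens_per_doc * monthly_change_rate


load = PilotLoad(users=200, queries_per_user_per_day=10, context_tokens_per_query=4000,
                 output_tokens_per_query=500, peak_to_average_ratio=5)
prices = Prices(input_per_1k_tokens=0.001, output_per_1k_tokens=0.002)

print(round(monthly_llm_cost(load, prices), 2))
print(round(peak_queries_per_second(load), 3))
print(monthly_reindex_tokens(10_000, 2_000, 0.10))
```

Replacing the placeholders with measured pilot values turns this into a documented assumption set that can be revisited when usage or pricing changes.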
9. Decision gate
- Are data access and ownership clear?
- Can the permission model be enforced in indexing and retrieval?
- Is a representative evaluation set ready?
- Have refusal and source-attribution behaviors been tested?
- Is there an operational owner for the pilot?
- Are production cost and capacity assumptions documented?
How to prepare for technical discovery
Before the first workshop, document the business problem, affected user groups, available data sources, and current security constraints. Any sample document or data set should represent real production variation, be approved for sharing, and be reviewed for personal or sensitive information.
The workshop does not need to end with an immediate technology choice. It should first clarify exclusions, success criteria, data owners, authorization assumptions, and risks that affect a pilot decision. This produces a verifiable PoC plan instead of an impressive but unmeasurable demonstration.
- Primary business problem and expected user outcome
- Representative and permitted data or document samples
- Existing identity, authorization, and integration boundaries
- Technical and operational measures to be assessed at the end of the PoC
After the PoC decision
A successful PoC is not sufficient for an immediate broad rollout. A pilot should observe real user behavior, data update frequency, support needs, capacity, and failure scenarios. A production decision should depend on operational ownership, security approval, cost visibility, and rollback planning as well as technical quality.
How can Mansel help?
Mansel supports discovery, technical assessment, bounded PoC, pilot, and production planning with explicit security and data assumptions.