Question 1

Do clients always need full fine-tuning?

Accepted Answer

No. Many environments are better served by controlled retrieval, access design, and workflow orchestration than by full model training or fine-tuning. Fine-tuning is recommended only when retrieval and prompting cannot meet the accuracy or behavior requirements of the use case.

Question 2

Is this only for large enterprises?

Accepted Answer

No. The best fit is a team with real data sensitivity and a clear initial use case, regardless of headcount. Mid-market organizations often benefit more than larger ones because decisions move faster.

Question 3

How is this different from a generic AI consultancy?

Accepted Answer

The focus is narrow: private deployment, security and governance, and disciplined rollout for one use case at a time. No model training research, no public chatbot work, no agency-style content output. The engagement ends when the environment is operable, not when retainer hours run out.

Question 4

Can private AI be deployed on-premises?

Accepted Answer

Yes. The delivery model is built around matching the deployment pattern — on-premises, private cloud, or tightly controlled hosted infrastructure — to the organization’s data sensitivity, operational maturity, and support constraints.

Question 5

Which LLMs and models do you work with?

Accepted Answer

Model choice is driven by the deployment model and use case rather than brand preference. Engagements have covered open-weight models hosted privately (Llama, Mistral, Qwen families) as well as private endpoints from major commercial providers when the contract terms support data handling requirements.

Question 6

What does retrieval (RAG) actually involve?

Accepted Answer

Retrieval-augmented generation pairs a language model with a controlled body of content so answers are grounded in the organization’s own material. Doing it well requires content selection, access control, freshness rules, evaluation, and review workflows — not just dropping documents into a vector database.
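
As a rough illustration of why access control and freshness rules belong in front of the search step, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `Document` fields, the `eligible_documents` and `retrieve` helpers, the one-year freshness window, and the keyword-overlap scoring, which only stands in for a real vector search.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Document:
    doc_id: str
    text: str
    allowed_groups: frozenset  # hypothetical access-control field
    last_reviewed: date        # hypothetical freshness field


def eligible_documents(docs, user_groups, today, max_age_days=365):
    """Keep only documents the user may see and that pass the freshness rule."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        d for d in docs
        if d.allowed_groups & set(user_groups) and d.last_reviewed >= cutoff
    ]


def retrieve(query, docs, user_groups, today, top_k=3):
    """Rank eligible documents by keyword overlap (a stand-in for vector search)."""
    terms = set(query.lower().split())
    scored = []
    for d in eligible_documents(docs, user_groups, today):
        overlap = len(terms & set(d.text.lower().split()))
        if overlap > 0:
            scored.append((overlap, d))
    # Highest overlap first; ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:top_k]]


docs = [
    Document("hr-1", "parental leave policy", frozenset({"hr"}), date(2024, 1, 10)),
    Document("fin-1", "leave accrual policy finance", frozenset({"finance"}), date(2024, 1, 10)),
    Document("hr-old", "leave policy archived", frozenset({"hr"}), date(2020, 1, 1)),
]
result = retrieve("leave policy", docs, {"hr"}, today=date(2024, 6, 1))
```

In this sketch the filter runs before ranking, so a document the user cannot access, or one that has gone stale, never reaches the model's context regardless of how well it matches the query.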

Question 7

Do you build custom user interfaces?

Accepted Answer

Sometimes, but only when an off-the-shelf chat or workflow interface cannot meet the requirements. The default is to ship the use case through an existing tool — internal portal, ticketing system, document workflow — so adoption does not depend on a new product surface.

Question 8

How do you handle data residency and cross-border requirements?

Accepted Answer

Region-bound deployments, data-flow inventories, and explicit decisions about where inference, retrieval, and logging run are part of the architecture phase. Common patterns include EU-only private cloud, US-only private cloud, and hybrid setups where retrieval indexes stay in one region while the model layer is replicated.
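
One way to make those placement decisions checkable is to record them as data and validate them against the allowed regions. The Python sketch below is illustrative only: the region names, the component names, and the `residency_violations` helper are assumptions, not part of any specific platform.

```python
def residency_violations(placement, allowed_regions):
    """Return the components whose region falls outside the allowed set."""
    return sorted(
        component
        for component, region in placement.items()
        if region not in allowed_regions
    )


# Hypothetical EU-only policy and a placement that breaks it for logging.
allowed = {"eu-west", "eu-central"}
placement = {
    "inference": "eu-west",
    "retrieval": "eu-central",
    "logging": "us-east",
}
violations = residency_violations(placement, allowed)
```

A check like this can run in review or in a deployment pipeline, so a change that moves logging or retrieval out of region is caught before it ships.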

Question 9

Can the system be air-gapped or run without internet access?

Accepted Answer

Yes, when the use case warrants it. Air-gapped deployments use open-weight models, locally hosted retrieval, and offline update procedures. The trade-off is operational complexity — air-gapped systems require more deliberate change control and a longer update cycle. The recommendation is to air-gap only when contractual or regulatory constraints require it.

Question 10

What happens if a model is deprecated by the vendor mid-engagement?

Accepted Answer

Deprecation is treated as a risk during model selection, not as a surprise after launch. The architecture is designed so the model layer is swappable — the evaluation suite, prompt library, retrieval design, and access controls survive a model change. When a deprecation is announced, the rollback and migration path is already documented.
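
A swappable model layer can be sketched as a narrow interface that the rest of the system depends on. The Python below is a minimal sketch under stated assumptions: `TextModel`, `CurrentModel`, `ReplacementModel`, `PROMPT_TEMPLATE`, and `answer` are hypothetical names, and the two model classes are placeholders that echo the prompt rather than call a real model.

```python
from typing import Protocol


class TextModel(Protocol):
    """The only surface the rest of the system is allowed to depend on."""

    def generate(self, prompt: str) -> str: ...


class CurrentModel:
    """Placeholder for the model in production today."""

    def generate(self, prompt: str) -> str:
        return f"[current] {prompt}"


class ReplacementModel:
    """Placeholder for the candidate that replaces a deprecated model."""

    def generate(self, prompt: str) -> str:
        return f"[replacement] {prompt}"


# The prompt library lives outside any model class, so it survives a swap.
PROMPT_TEMPLATE = "Answer using only this context:\n{context}\n\nQuestion: {question}"


def answer(model: TextModel, question: str, context: str) -> str:
    """Build the prompt and delegate to whichever model was passed in."""
    prompt = PROMPT_TEMPLATE.format(context=context, question=question)
    return model.generate(prompt)


before = answer(CurrentModel(), "What is the notice period?", "Notice period: 30 days.")
after = answer(ReplacementModel(), "What is the notice period?", "Notice period: 30 days.")
```

Because `answer` only sees the interface, a migration in this sketch changes one constructor call, and the same evaluation suite can be rerun against the replacement before cutover.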

Question 11

Do you handle governance and documentation too?

Accepted Answer

Yes. The scope is intentionally broader than model setup. Governance, runbooks, policies, change control, and user enablement are part of making the deployment workable after launch.

Question 12

Can you guarantee compliance certification?

Accepted Answer

No. The work supports a controlled technical environment that can stand up to scrutiny. Regulatory interpretation and certification decisions (SOC 2, HIPAA, PCI DSS, ISO 27001, GDPR) remain with the client and their advisors.

Question 13

Can you work with our existing security and compliance team?

Accepted Answer

Yes. The engagement is structured to plug into existing security review processes, not bypass them. Architecture decisions, logging design, retention rules, and access patterns are documented in a way that supports internal review and audit conversations.

Question 14

Can you sign our standard MSA and data processing terms?

Accepted Answer

Usually yes. Engagements are structured to sit cleanly under a client MSA with a typical scope of work and a data processing addendum. Engagement terms accommodate audit rights, breach notification obligations, and confidentiality requirements common to security-sensitive sectors.

Question 15

How long does a typical engagement take?

Accepted Answer

An assessment usually takes two to four weeks. A pilot typically takes six to twelve weeks, depending on data access complexity. Production rollout varies with environment scope and integration count. Governance support runs on a monthly or quarterly cadence.

Question 16

Do you provide model hosting or run our environment for us?

Accepted Answer

No. The practice designs, builds, and hands off the environment to your team. Hosting, ongoing operations, and 24/7 support stay with you or with a hosting partner of your choosing. This keeps the engagement focused on design and build rather than turning it into a managed-services contract.

Question 17

How are evaluation and quality measured during a pilot?

Accepted Answer

Each pilot has acceptance criteria agreed in advance with the business owner. A representative test set is built from real workload (or its closest available proxy), reviewed by a named subject-matter expert, and run as part of every model or prompt change. Disagreements have a documented resolution path. "Looks good in the demo" is not an acceptance criterion.
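
The idea of a fixed test set with a pre-agreed threshold can be sketched in a few lines of Python. This is an illustration under assumptions, not a description of the actual tooling: `EvalCase`, `pass_rate`, `meets_acceptance`, the phrase-matching check, and the 0.9 threshold are all hypothetical, and `stub_system` stands in for the deployed pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalCase:
    question: str
    required_phrases: tuple  # phrases a reviewer expects in a correct answer


def pass_rate(system, cases):
    """Fraction of cases whose answer contains every required phrase."""
    if not cases:
        raise ValueError("the test set must not be empty")
    passed = 0
    for case in cases:
        reply = system(case.question).lower()
        if all(phrase.lower() in reply for phrase in case.required_phrases):
            passed += 1
    return passed / len(cases)


def meets_acceptance(system, cases, threshold=0.9):
    """Compare the pass rate with the threshold agreed before the pilot."""
    return pass_rate(system, cases) >= threshold


def stub_system(question):
    # Stand-in for the real pipeline; it only knows about refunds.
    if "refund" in question.lower():
        return "Refunds are issued within 14 days."
    return "I do not know."


cases = [
    EvalCase("What is the refund window?", ("14 days",)),
    EvalCase("Who approves travel?", ("line manager",)),
]
rate = pass_rate(stub_system, cases)
accepted = meets_acceptance(stub_system, cases)
```

Substring matching is deliberately crude here. In practice the per-case check would be whatever the subject-matter expert signs off on, but the structure stays the same: a fixed set, a fixed threshold, and a rerun on every model or prompt change.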

Question 18

What happens after the engagement ends?

Accepted Answer

The deliverable is an environment your team can operate. Documentation, runbooks, and training are part of the work so internal owners can run it without ongoing dependency. Continued advisory is available on a retainer if expansion, policy refresh, or architectural review is needed later.

Straight answers to the questions buyers actually ask

What private AI work actually covers