Patching or Extending an Existing Model

Fine-Tuning

(Supervised Continual Training)

Gather & Clean Your Data

– Collect representative examples: FAQs, product specs, press kits, recent announcements.
– Format each as a JSONL record {"prompt": "…", "completion": "…"} following OpenAI's spec.
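As a sketch of that format, the snippet below writes two records to a JSONL file. The Acme facts and the file name are made up for illustration; the separator and whitespace conventions follow OpenAI's prompt/completion guidance.

```python
import json

# Hypothetical examples gathered from FAQs and product specs.
examples = [
    {"prompt": "What is Acme's return window? ->",
     "completion": " 30 days from delivery.\n"},
    {"prompt": "When did Acme launch WidgetPro? ->",
     "completion": " March 2023.\n"},
]

# One JSON object per line: that is all "JSONL" means.
with open("training_data.jsonl", "w") as f:
    for record in examples:
        f.write(json.dumps(record) + "\n")
```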

Request Fine-Tuning Access

– For GPT-3.5 Turbo: fine-tuning is public via the OpenAI API.
– For GPT-4: you must apply for the private beta through your OpenAI sales or enterprise contact.

Launch the Job

– Call the OpenAI CLI or API endpoint (POST /v1/fine_tunes) with your training file.
– Specify hyperparameters (learning rate multiplier, batch size) per their docs.
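A minimal sketch of that call using only the standard library, assuming the training file has already been uploaded and `OPENAI_API_KEY` is set. The helper names and the hyperparameter values are illustrative, not part of any official SDK; consult the docs for sensible settings.

```python
import json
import os
import urllib.request

def build_fine_tune_payload(training_file_id, model="gpt-3.5-turbo", **hyperparams):
    """Assemble the request body; extra hyperparameters
    (e.g. learning_rate_multiplier, batch_size) pass through as-is."""
    return {"training_file": training_file_id, "model": model, **hyperparams}

def launch_fine_tune(payload):
    """POST the payload to the /v1/fine_tunes endpoint described above."""
    req = urllib.request.Request(
        "https://api.openai.com/v1/fine_tunes",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```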

Monitor & Validate

– Watch the job status (GET /v1/fine_tunes/{id}), inspect loss curves.
– Test held-out examples to confirm the model learned your new facts without regressions.
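The held-out check can start as simply as scanning each answer for the expected fact. `validate_held_out` below is a made-up helper for that idea; real evaluation would typically use exact-match scoring or a grader model.

```python
def validate_held_out(model_answers, expected_facts):
    """Flag held-out examples where the expected fact is missing
    from the model's answer. Returns (passed, failures) so a deploy
    script can gate on `passed`."""
    failures = [
        (answer, fact)
        for answer, fact in zip(model_answers, expected_facts)
        if fact.lower() not in answer.lower()
    ]
    return len(failures) == 0, failures
```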

Deploy & Route Traffic

– Once the job completes, pass the new fine-tuned model identifier (ft-your-org:…) as the model parameter in your API calls.
– Gradually shift production traffic and roll back if any unexpected behavior appears.
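One way to shift traffic gradually is a weighted canary router. In the sketch below (function and model names are placeholders), a configurable fraction of requests goes to the fine-tuned model; rolling back is just setting that fraction to zero.

```python
import random

def pick_model(stable_model, canary_model, canary_fraction=0.1, rng=random):
    """Send a fixed share of traffic to the fine-tuned model; the rest
    stays on the stable model so a bad fine-tune can be reverted instantly."""
    return canary_model if rng.random() < canary_fraction else stable_model
```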

Limitations: Only possible if the provider supports fine-tuning (e.g. OpenAI’s GPT-3.5), and costs scale with data size and compute. Closed models without fine-tune APIs can’t be updated this way.

Model-Editing Techniques

(ROME, MEMIT, KnowledgeEditor)

Choose an Open-Source Base

– You need full weight access (e.g. LLaMA, Falcon, or another self-hosted checkpoint).

Install the Editing Library

– Clone the ROME or KnowledgeEditor GitHub repo; install dependencies (pip install -r requirements.txt).

Identify the Fact to Rewrite

– Define: “When asked X, the current answer is Y but should be Z.”

Run the Edit Routine

– For ROME: load the model, locate the relevant MLP layer, apply the rank-1 weight update per the paper’s API.
– For KnowledgeEditor: train the hypernetwork on your (X, Y → Z) pair and apply its update.
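At its core, ROME's edit is a rank-1 update to one MLP weight matrix so that a key vector k (encoding the subject of X) now maps to a new value vector v (encoding answer Z). The toy numpy sketch below shows only that mechanic; real ROME derives k and v from the network's activations and weights the update by a covariance matrix rather than plain least squares.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8
W = rng.normal(size=(d, d))   # stand-in for one MLP projection matrix
k = rng.normal(size=d)        # key vector for the subject ("X")
v_new = rng.normal(size=d)    # value vector for the new answer ("Z")

# Rank-1 update: afterwards W_edited @ k == v_new exactly,
# while any input orthogonal to k is mapped as before.
W_edited = W + np.outer(v_new - W @ k, k) / (k @ k)

assert np.allclose(W_edited @ k, v_new)
```

The "orthogonal inputs unchanged" property is what makes the edit local: unrelated prompts, whose hidden states have little overlap with k, are (ideally) unaffected.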

Test & Audit

– Query the edited model on X and on unrelated prompts to ensure no collateral damage.

Deploy the Patched Model

– Host the patched model on your own inference server; there is no closed-API equivalent, so you must self-host.

Limitations: Experimental. Reliability holds for only a few hundred edits before side-effects accumulate. Not available for closed-source models whose weights you can't load.

Retrieval-Augmented Generation

(RAG)

Select a Vector Store

– Examples: Pinecone, Weaviate, or open-source FAISS/Chroma.

Ingest Your Documents

– Split each document into chunks; embed them with the same embedding model you will use for queries, so document and query vectors live in the same space.
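A minimal chunking sketch; the window and overlap sizes are arbitrary defaults, and production pipelines often split on sentences or tokens rather than words.

```python
def chunk_text(text, max_words=200, overlap=20):
    """Split a document into overlapping word-window chunks
    before embedding; overlap keeps facts that straddle a
    boundary retrievable from at least one chunk."""
    words = text.split()
    chunks = []
    step = max_words - overlap
    for start in range(0, len(words), step):
        chunk = words[start:start + max_words]
        if chunk:
            chunks.append(" ".join(chunk))
        if start + max_words >= len(words):
            break
    return chunks
```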

Set Up a Retrieval Layer

– At query time: embed the user’s question → fetch top-k nearest docs.
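With the document embeddings stacked in a matrix, top-k retrieval by cosine similarity is a few lines of numpy; FAISS or a hosted vector store does the same thing at scale with approximate indexes.

```python
import numpy as np

def top_k(query_vec, doc_vecs, k=3):
    """Return indices of the k nearest document vectors
    by cosine similarity (normalize, then dot product)."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    sims = d @ q
    return np.argsort(-sims)[:k]
```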

Augment the Prompt

– Prepend the retrieved excerpts to your LLM prompt, e.g.:
“Here are the latest company facts: [DOCS]. Based on that, answer: …”
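Assembling that prompt is plain string formatting; `build_prompt` below is a made-up helper following the template above.

```python
def build_prompt(question, retrieved_chunks):
    """Prepend retrieved excerpts to the user question,
    following the template shown above."""
    docs = "\n\n".join(retrieved_chunks)
    return (
        "Here are the latest company facts:\n"
        f"{docs}\n\n"
        f"Based on that, answer: {question}"
    )
```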

Serve Through a Wrapper

– Build a microservice that handles retrieval + LLM call as one atomic operation.
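A sketch of the wrapper's core, with the embedding model, vector store lookup, and LLM call injected as plain functions so any backend (Pinecone, FAISS, any chat API) fits; all names here are hypothetical.

```python
def answer(question, embed, retrieve, call_llm):
    """One atomic retrieve-then-generate operation:
    embed the question, fetch matching chunks, build the
    augmented prompt, and call the LLM."""
    chunks = retrieve(embed(question))
    prompt = (
        "Here are the latest company facts:\n"
        + "\n".join(chunks)
        + "\nBased on that, answer: " + question
    )
    return call_llm(prompt)
```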

Impact: Instant “knowledge updates” without touching model weights. Works on any public API.
