AXIOM Integration Guide

AXIOM + AI Pipelines

AXIOM is a character-level similarity engine, not a semantic one. It encodes text into 10,048-bit binary vectors using trigram bundling — fast, deterministic, and CPU-only. It complements embeddings; it does not replace them.

What AXIOM adds to an AI pipeline

Semantic embeddings (OpenAI, Cohere, BERT) are expensive and slow to call. AXIOM runs in microseconds on a single CPU core and catches structurally similar text — typos, near-duplicates, and reworded copies — before you spend money on an embedding API. Think of AXIOM as the cheap pre-filter that shrinks the problem, with embeddings handling the semantic nuance on whatever remains.

  • < 5 ms per encode
  • 10,048-bit binary vector
  • CPU-only (no GPU needed)
  • Deterministic (same input = same output)

Real similarity scores

Measured from actual API responses. AXIOM captures surface-level character overlap, not meaning. Notice "dog" vs "cat" is essentially random noise.

Input A                | Input B                | Score  | What this means
hello world            | hello world            | 1.0000 | Exact match
hello world            | hello worl             | 0.8649 | One char dropped
the cat sat on the mat | the cat sat on the hat | 0.8547 | One word differs
machine learning       | machine learning model | 0.8252 | Added word shares trigrams
rust programming       | python programming     | 0.7481 | Shared suffix, different prefix
dog                    | cat                    | 0.5012 | Effectively random — no semantic overlap
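The scores above follow mechanically from trigram overlap. AXIOM's actual encoder is not shown in this guide, but the general technique it names (character trigrams bundled into a wide binary vector by bitwise majority vote, compared by bit agreement) can be sketched in a few lines. Treat this as an illustration of the idea, not AXIOM's implementation; the hash seeding and per-trigram patterns are assumptions.

```python
import random

DIM = 10_048  # AXIOM's stated vector width


def trigrams(text: str) -> list[str]:
    """Sliding character trigrams: 'hello' -> ['hel', 'ell', 'llo']."""
    return [text[i:i + 3] for i in range(len(text) - 2)]


def trigram_bits(tg: str) -> list[int]:
    """A fixed pseudo-random DIM-bit pattern per trigram (string-seeded RNG)."""
    rng = random.Random(tg)  # string seeding is deterministic in CPython
    return [rng.getrandbits(1) for _ in range(DIM)]


def encode(text: str) -> list[int]:
    """Bundle all trigram patterns with a bitwise majority vote."""
    patterns = [trigram_bits(tg) for tg in trigrams(text)]
    half = len(patterns) / 2
    return [1 if sum(col) > half else 0 for col in zip(*patterns)]


def similarity(a: list[int], b: list[int]) -> float:
    """Bit-agreement rate: 1.0 for identical input, ~0.5 for unrelated text."""
    return sum(x == y for x, y in zip(a, b)) / DIM
```

Even this toy version reproduces the shape of the table: identical strings score 1.0, a dropped character stays high because almost all trigrams survive, and "dog" vs "cat" lands near 0.5 because two unrelated random bit patterns agree on about half their bits.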

Integration Patterns

Browse the patterns below — pick the ones relevant to your stack. Each includes a complete code example and honest guidance on when it helps and when it does not.

Pattern 1

Pre-filtering for RAG

Use AXIOM to deduplicate and near-match before calling an embedding API — so you only embed the candidates that survive the structural filter.

When to use

  • High-volume RAG with many similar documents
  • When embedding API cost is a concern
  • User queries are textually close to stored docs

When NOT to use

  • Queries use synonyms or paraphrases
  • Multilingual retrieval
  • Intent-based or conceptual queries

rag_prefilter.ts

// RAG pre-filter: AXIOM first, embedding API only for survivors

const AXIOM_URL = "https://api.axiom.dev";
const AXIOM_KEY = process.env.AXIOM_API_KEY!;

async function axiomEncode(text: string): Promise<number[]> {
  const res = await fetch(`${AXIOM_URL}/api/encode`, {
    method: "POST",
    headers: { "Content-Type": "application/json", "X-API-Key": AXIOM_KEY },
    body: JSON.stringify({ text }),
  });
  return (await res.json()).vector;
}

async function axiomSimilarity(a: number[], b: number[]): Promise<number> {
  const res = await fetch(`${AXIOM_URL}/api/similarity`, {
    method: "POST",
    headers: { "Content-Type": "application/json", "X-API-Key": AXIOM_KEY },
    body: JSON.stringify({ vector_a: a, vector_b: b }),
  });
  return (await res.json()).similarity;
}

// ─────────────────────────────────────────────────────────────────────
// Step 1: At index time, encode every document with AXIOM (fast & free)
// Step 2: At query time, AXIOM narrows 10,000 docs → 50 candidates
// Step 3: Only call the embedding API for those 50 candidates
// ─────────────────────────────────────────────────────────────────────

interface Doc { id: string; text: string; axiomVector: number[] }

async function buildIndex(docs: { id: string; text: string }[]): Promise<Doc[]> {
  const res = await fetch(`${AXIOM_URL}/api/batch-encode`, {
    method: "POST",
    headers: { "Content-Type": "application/json", "X-API-Key": AXIOM_KEY },
    body: JSON.stringify({ texts: docs.map(d => d.text) }),
  });
  const { vectors } = await res.json();
  return docs.map((d, i) => ({ ...d, axiomVector: vectors[i].vector }));
}

async function prefilter(
  query: string,
  index: Doc[],
  topK = 50,
  threshold = 0.62
): Promise<Doc[]> {
  const queryVec = await axiomEncode(query);

  // One similarity call per doc: rate-limit or batch these for large corpora.
  const scored = await Promise.all(
    index.map(async doc => ({
      doc,
      score: await axiomSimilarity(queryVec, doc.axiomVector),
    }))
  );

  return scored
    .filter(s => s.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(s => s.doc);
}

// Usage
const index = await buildIndex(myDocs);  // do once at startup

const candidates = await prefilter("how do I reset my password?", index);
console.log(`AXIOM narrowed ${myDocs.length} docs → ${candidates.length} candidates`);
// "AXIOM narrowed 10000 docs → 12 candidates"

// Now call your embedding API ONLY for the 12 survivors
// (assumes an OpenAI client: import OpenAI from "openai"; const openai = new OpenAI())
const embeddings = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: candidates.map(c => c.text),
});
// ~800x fewer embedding calls (12 instead of 10,000)

Expected output

AXIOM narrowed 10000 docs → 12 candidates

# Embedding API called 12 times instead of 10,000

Pattern 2

Training Data Deduplication

Clean near-duplicate examples from fine-tuning datasets before training. Duplicate data inflates loss curves and wastes GPU hours.

When to use

  • Datasets scraped from the web (high redundancy)
  • RLHF data with repeated human-written prompts
  • Any dataset larger than 10k examples

When NOT to use

  • Paraphrase datasets (different surface, same meaning)
  • When you want stylistic diversity regardless of overlap

dedup_dataset.py

import requests
from itertools import combinations

AXIOM_URL = "https://api.axiom.dev"
HEADERS = {"X-API-Key": "your-api-key", "Content-Type": "application/json"}


def batch_encode(texts: list[str]) -> list[list[int]]:
    """Encode texts in chunks of 50 (API batch limit)."""
    all_vectors = []
    for i in range(0, len(texts), 50):
        chunk = texts[i : i + 50]
        r = requests.post(
            f"{AXIOM_URL}/api/batch-encode",
            headers=HEADERS,
            json={"texts": chunk},
        )
        all_vectors.extend(v["vector"] for v in r.json()["vectors"])
    return all_vectors


def get_similarity(vec_a: list[int], vec_b: list[int]) -> float:
    r = requests.post(
        f"{AXIOM_URL}/api/similarity",
        headers=HEADERS,
        json={"vector_a": vec_a, "vector_b": vec_b},
    )
    return r.json()["similarity"]


def deduplicate(examples: list[dict], threshold: float = 0.82) -> list[dict]:
    """
    Remove near-duplicate training examples.
    threshold=0.82 catches typos and minor rewording while
    preserving genuinely different phrasings.
    """
    texts = [ex["prompt"] for ex in examples]
    vectors = batch_encode(texts)

    duplicates: set[int] = set()
    # NOTE: this compares every pair, O(n^2) similarity calls. Fine for a few
    # thousand examples; pre-bucket or sample for much larger datasets.
    for (i, vi), (j, vj) in combinations(enumerate(vectors), 2):
        if i in duplicates or j in duplicates:
            continue
        sim = get_similarity(vi, vj)
        if sim >= threshold:
            print(f"Dup (sim={sim:.4f})")
            print(f"  KEEP   [{i}]: {texts[i][:70]}")
            print(f"  REMOVE [{j}]: {texts[j][:70]}")
            duplicates.add(j)

    kept = [ex for i, ex in enumerate(examples) if i not in duplicates]
    print(f"\nRemoved {len(duplicates)} duplicates.")
    print(f"Dataset: {len(examples)} → {len(kept)} examples")
    return kept


# ── Usage ──────────────────────────────────────────────────────────────
import json

with open("raw_finetune.jsonl") as f:
    raw = [json.loads(line) for line in f]

clean = deduplicate(raw, threshold=0.82)

with open("clean_finetune.jsonl", "w") as f:
    for ex in clean:
        f.write(json.dumps(ex) + "\n")

Expected output

Dup (sim=0.8547)

KEEP [0]: the cat sat on the mat

REMOVE [4]: the cat sat on the hat

Removed 312 duplicates.

Dataset: 5000 → 4688 examples

Pattern 4

Cache Key Generation

AXIOM vectors are deterministic — identical input always produces the same bit pattern. Use this as a cache key to avoid re-embedding identical or near-identical inputs.

When to use

  • High-traffic LLM endpoints with repeated queries
  • Avoiding duplicate embedding API charges
  • Response caching for chatbots

When NOT to use

  • Queries where context or user identity matters
  • When you need exact-string matching only (a plain hash is simpler)
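For reference, the plain-hash alternative is a few lines and needs no AXIOM call at all, but it tolerates nothing beyond trivial normalisation: any typo or rewording is a cache miss. A minimal sketch (the helper name and normalisation are illustrative, not part of any API):

```python
import hashlib


def exact_cache_key(text: str) -> str:
    """Exact-string cache key: case and whitespace are normalised,
    but any typo or rewording produces a different key."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode()).hexdigest()[:32]


k1 = exact_cache_key("How do I reset my password?")
k2 = exact_cache_key("how do I  reset my password?")   # case/space differences
k3 = exact_cache_key("how do i resett my pasword?")    # typos
print(k1 == k2)   # True
print(k1 == k3)   # False
```

If that False is acceptable for your traffic, skip the vector step entirely; the AXIOM-based key below earns its keep only when near-identical queries are common.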

cache_key.py

import requests
import hashlib
import redis

AXIOM_URL = "https://api.axiom.dev"
HEADERS = {"X-API-Key": "your-api-key", "Content-Type": "application/json"}

cache = redis.Redis(host="localhost", port=6379, db=0)


def axiom_vector(text: str) -> list[int]:
    r = requests.post(
        f"{AXIOM_URL}/api/encode",
        headers=HEADERS,
        json={"text": text},
    )
    return r.json()["vector"]


def vector_to_cache_key(vector: list[int]) -> str:
    """
    Hash the binary vector into a short cache key.
    Identical inputs → identical vectors → identical keys.
    """
    raw = bytes(vector)
    return hashlib.sha256(raw).hexdigest()[:32]


def get_similarity(va: list[int], vb: list[int]) -> float:
    r = requests.post(
        f"{AXIOM_URL}/api/similarity",
        headers=HEADERS,
        json={"vector_a": va, "vector_b": vb},
    )
    return r.json()["similarity"]


def cached_embed_and_respond(query: str, embed_fn, llm_fn) -> str:
    """
    1. Encode with AXIOM (fast, deterministic)
    2. Check cache for an exact or near-identical query
    3. On miss: call embed_fn + llm_fn, store in cache
    """
    vec = axiom_vector(query)
    exact_key = vector_to_cache_key(vec)

    # Exact cache hit (same query, same vector, same key)
    cached = cache.get(exact_key)
    if cached:
        print(f"Cache HIT (exact): {exact_key[:12]}...")
        return cached.decode()

    # Near-duplicate check: scan recent cache entries
    # (in production, use a small in-memory recent-vector store)
    recent_keys = cache.keys("vec:*")
    for rk in recent_keys[:200]:          # cap the scan at 200 entries (KEYS order is arbitrary)
        stored_vec = list(cache.hget(rk, "vector") or b"")
        if not stored_vec:
            continue
        sim = get_similarity(vec, stored_vec)
        if sim >= 0.92:                   # nearly identical text
            cached_response = cache.hget(rk, "response")
            print(f"Cache HIT (near-dup, sim={sim:.4f})")
            return cached_response.decode()

    # Cache miss — call the expensive functions
    print("Cache MISS — calling embed + LLM")
    embedding = embed_fn(query)
    response  = llm_fn(query, embedding)

    # Store result
    cache.set(exact_key, response, ex=3600)
    cache.hset(f"vec:{exact_key}", mapping={
        "vector": bytes(vec),
        "response": response,
    })
    cache.expire(f"vec:{exact_key}", 3600)

    return response


# ── Usage ──────────────────────────────────────────────────────────────
r1 = cached_embed_and_respond("How do I reset my password?", embed, llm)
# Cache MISS — calling embed + LLM

r2 = cached_embed_and_respond("How do I reset my password?", embed, llm)
# Cache HIT (exact): a3f8b2c1d9e4...

r3 = cached_embed_and_respond("how do i resett my pasword?", embed, llm)
# Cache HIT (near-dup, sim=0.9312)

Expected output

Cache MISS — calling embed + LLM

Cache HIT (exact): a3f8b2c1d9e4...

Cache HIT (near-dup, sim=0.9312)

# Third call (typo) avoided both embed API + LLM call entirely

Pattern 5

Content Drift Monitoring

Compare new content batches against a stored baseline to detect when a document has changed significantly — useful for re-embedding triggers, version control, or data-quality alerts.

When to use

  • Monitoring scraped web content for changes
  • Deciding when to re-embed a document
  • Detecting document tampering or unexpected edits

When NOT to use

  • Detecting meaning shifts caused by replacing synonyms
  • When you need diff-level granularity (use a diff tool)

drift_monitor.py

import requests
import json
from pathlib import Path

AXIOM_URL = "https://api.axiom.dev"
HEADERS = {"X-API-Key": "your-api-key", "Content-Type": "application/json"}
BASELINE_FILE = Path("baseline_vectors.json")


def encode(text: str) -> list[int]:
    r = requests.post(
        f"{AXIOM_URL}/api/encode", headers=HEADERS, json={"text": text}
    )
    return r.json()["vector"]


def similarity(va: list[int], vb: list[int]) -> float:
    r = requests.post(
        f"{AXIOM_URL}/api/similarity",
        headers=HEADERS,
        json={"vector_a": va, "vector_b": vb},
    )
    return r.json()["similarity"]


def save_baseline(docs: dict[str, str]) -> None:
    """Encode and store baseline vectors for a set of named documents."""
    baseline = {}
    for doc_id, text in docs.items():
        baseline[doc_id] = {"text": text[:120], "vector": encode(text)}
    BASELINE_FILE.write_text(json.dumps(baseline))
    print(f"Baseline saved: {len(baseline)} documents")


def check_drift(
    new_docs: dict[str, str],
    threshold: float = 0.85,
) -> list[dict]:
    """
    Compare current content against baseline.
    Returns documents that have drifted below the threshold.
    """
    baseline = json.loads(BASELINE_FILE.read_text())
    alerts = []

    for doc_id, new_text in new_docs.items():
        if doc_id not in baseline:
            alerts.append({"id": doc_id, "status": "new", "score": None})
            continue

        baseline_vec = baseline[doc_id]["vector"]
        new_vec = encode(new_text)
        score = similarity(baseline_vec, new_vec)

        status = "stable" if score >= threshold else "drifted"
        if status == "drifted":
            alerts.append({
                "id": doc_id,
                "status": status,
                "score": round(score, 4),
                "baseline_preview": baseline[doc_id]["text"][:60],
                "new_preview": new_text[:60],
            })
            print(f"DRIFT [{doc_id}] sim={score:.4f}")
            print(f"  was: {baseline[doc_id]['text'][:60]}")
            print(f"  now: {new_text[:60]}")

    return alerts


# ── Usage ──────────────────────────────────────────────────────────────

# Day 1: Save baseline
save_baseline({
    "pricing": "Our free tier includes 100 API calls per day.",
    "support": "Email us at support@example.com for help.",
    "terms":   "By using this service you agree to our terms.",
})

# Day 7: Compare new versions
alerts = check_drift({
    "pricing": "Our free tier includes 100 API calls per day.",     # unchanged
    "support": "Contact our support team via the in-app chat.",     # changed
    "terms":   "By using this platform you agree to our terms.",   # minor edit
})

# DRIFT [support] sim=0.6823
#   was: Email us at support@example.com for help.
#   now: Contact our support team via the in-app chat.

Expected output

Baseline saved: 3 documents

DRIFT [support] sim=0.6823

was: Email us at support@example.com for help.

now: Contact our support team via the in-app chat.

# "pricing" sim=1.0000 (unchanged, no alert)

# "terms" sim=0.8912 (above threshold, no alert)

Pattern 6

Edge / Offline Processing

For environments with no internet access, no GPU, and no cloud API budget: a self-hosted AXIOM runs on a single CPU core and needs no external services.

When to use

  • Air-gapped systems (healthcare, defence, finance)
  • Mobile apps that cannot call an embedding API
  • IoT devices or embedded systems
  • Environments where data must not leave the device

When NOT to use

  • When semantic understanding is required (embeddings are better)
  • If you have GPU access and latency tolerances of 200 ms+

edge_search.py — self-hosted AXIOM, no internet required

# Self-hosted AXIOM on a local machine or Raspberry Pi.
# Requires: Docker + ~256 MB RAM. No GPU, no internet, no API key.
#
#   docker run -p 8080:8080 axiom-public:latest
#
# Then point all calls at http://localhost:8080

import requests

LOCAL_URL = "http://localhost:8080"
# No API key needed for self-hosted instance with no auth configured


def encode_local(text: str) -> list[int]:
    r = requests.post(
        f"{LOCAL_URL}/api/encode",
        json={"text": text},
        timeout=2,
    )
    return r.json()["vector"]


def similarity_local(va: list[int], vb: list[int]) -> float:
    r = requests.post(
        f"{LOCAL_URL}/api/similarity",
        json={"vector_a": va, "vector_b": vb},
        timeout=2,
    )
    return r.json()["similarity"]


# ── Offline document search ────────────────────────────────────────────

class OfflineSearch:
    def __init__(self):
        self.index: list[tuple[str, list[int]]] = []

    def add(self, text: str) -> None:
        vec = encode_local(text)
        self.index.append((text, vec))

    def search(self, query: str, top_k: int = 3) -> list[tuple[str, float]]:
        qv = encode_local(query)
        scored = [
            (text, similarity_local(qv, vec))
            for text, vec in self.index
        ]
        return sorted(scored, key=lambda x: x[1], reverse=True)[:top_k]


# No network calls beyond localhost — works fully air-gapped
search = OfflineSearch()
search.add("Patient admitted with chest pain and shortness of breath.")
search.add("MRI scan scheduled for Tuesday morning.")
search.add("Discharge summary: patient stable, follow-up in 2 weeks.")

results = search.search("chest and breathing problems")
for text, score in results:
    print(f"{score:.4f} | {text}")

Expected output

0.7841 | Patient admitted with chest pain and shortness of breath.

0.5120 | Discharge summary: patient stable, follow-up in 2 weeks.

0.4388 | MRI scan scheduled for Tuesday morning.

# "shortness of breath" → "breathing problems": 0.7841 (shared trigrams)

# All calls go to localhost. Zero external traffic.

Honest comparison: AXIOM vs embeddings

These are not competing products. They excel at different tasks.

Task                               | AXIOM          | Semantic Embeddings
Exact duplicate detection          | Excellent      | Overkill
Near-duplicate / typo detection    | Excellent      | Good
Synonym matching (buy vs purchase) | Poor           | Excellent
Cross-language similarity          | No             | Excellent (multilingual models)
Paraphrase detection               | Poor           | Excellent
Latency per encode                 | < 5 ms, CPU    | 100–500 ms, GPU API call
Cost at 10M ops                    | Fixed VPS cost | ~$1–10 per 1M tokens
Works offline / air-gapped         | Yes            | No (unless self-hosted model)
Deterministic output               | Always         | Usually (model version dependent)
Sentiment / intent classification  | No             | Yes (fine-tuned models)

The hybrid approach

The best production pipelines use both. AXIOM handles the structural pre-filter; embeddings handle the semantic re-ranking.

Recommended pipeline for RAG

1. AXIOM — encode query (~3 ms, CPU, free)
   Produces a 10,048-bit vector deterministically.

2. AXIOM — pre-filter corpus (~5 ms for 10k docs)
   Narrows 10,000 candidates to ~50 structural matches.

3. Embedding API — re-rank 50 (~200 ms, 50 API calls)
   Semantic understanding reduces 50 candidates to the top 5.

4. LLM — generate answer (1–5 s)
   GPT / Claude produces the final response from the top-5 context docs.

Total: roughly 1–5 seconds for a 10,000-document corpus, dominated by the LLM call. Without AXIOM pre-filtering you would need 10,000 embedding API calls per query instead of 50 — approximately 200x more expensive.
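The cost multiplier is simple arithmetic. A quick sanity check, using the per-stage latencies from the steps above and assuming a 3 s LLM call (all figures illustrative, not benchmarks):

```python
# Back-of-envelope numbers for the four-stage hybrid pipeline.
corpus_size = 10_000
survivors = 50                      # candidates left after the AXIOM pre-filter

embed_calls_saved = corpus_size / survivors
print(f"{embed_calls_saved:.0f}x fewer embedding calls per query")

# Stage latencies in milliseconds (assumed, per the pipeline description).
stage_ms = {"axiom encode": 3, "axiom pre-filter": 5,
            "embedding re-rank": 200, "llm generate": 3_000}
total_s = sum(stage_ms.values()) / 1000
print(f"end-to-end ~ {total_s:.1f} s (dominated by the LLM call)")
```

The pre-filter stages add about 8 ms to a multi-second pipeline, which is why they are effectively free from a latency standpoint.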

hybrid_rag.ts — AXIOM pre-filter + OpenAI re-rank

import OpenAI from "openai";

const AXIOM = "https://api.axiom.dev";
const KEY   = process.env.AXIOM_API_KEY!;
const openai = new OpenAI();

const post = (path: string, body: object) =>
  fetch(AXIOM + path, {
    method: "POST",
    headers: { "Content-Type": "application/json", "X-API-Key": KEY },
    body: JSON.stringify(body),
  }).then(r => r.json());

interface Doc { id: string; text: string; axiomVec?: number[] }

// ── Stage 1: AXIOM structural pre-filter ──────────────────────────────

async function axiomPrefilter(query: string, corpus: Doc[], topK = 50): Promise<Doc[]> {
  const { vector: qv } = await post("/api/encode", { text: query });

  const scored = await Promise.all(
    corpus.map(async doc => {
      const { similarity } = await post("/api/similarity", {
        vector_a: qv,
        vector_b: doc.axiomVec,
      });
      return { doc, score: similarity as number };
    })
  );

  return scored
    .filter(s => s.score >= 0.60)
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(s => s.doc);
}

// ── Stage 2: Semantic re-rank with OpenAI embeddings ──────────────────

async function semanticRerank(query: string, candidates: Doc[], topK = 5): Promise<Doc[]> {
  const inputs  = [query, ...candidates.map(c => c.text)];
  const { data } = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: inputs,
  });

  const [qEmbed, ...docEmbeds] = data.map(d => d.embedding);

  const cosineSim = (a: number[], b: number[]) => {
    const dot  = a.reduce((s, v, i) => s + v * b[i], 0);
    const magA = Math.sqrt(a.reduce((s, v) => s + v * v, 0));
    const magB = Math.sqrt(b.reduce((s, v) => s + v * v, 0));
    return dot / (magA * magB);
  };

  return candidates
    .map((doc, i) => ({ doc, score: cosineSim(qEmbed, docEmbeds[i]) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(s => s.doc);
}

// ── Full hybrid pipeline ───────────────────────────────────────────────

async function hybridSearch(query: string, corpus: Doc[]): Promise<Doc[]> {
  console.time("axiom-prefilter");
  const candidates = await axiomPrefilter(query, corpus, 50);
  console.timeEnd("axiom-prefilter");
  console.log(`AXIOM: ${corpus.length} → ${candidates.length} candidates`);

  if (candidates.length === 0) return [];

  console.time("openai-rerank");
  const top5 = await semanticRerank(query, candidates, 5);
  console.timeEnd("openai-rerank");

  return top5;
}

// Usage
const results = await hybridSearch("password reset instructions", myCorpus);
// axiom-prefilter: 8ms
// AXIOM: 10000 → 23 candidates
// openai-rerank: 312ms   (23 calls, not 10000)

Limitations — read before using in production

AXIOM does character-level trigram matching. This is useful but narrow. The following tasks are outside its scope — using AXIOM for them will produce incorrect or misleading results.

Synonyms and paraphrases

"buy shoes" vs "purchase footwear" → ~0.42 (near-random). AXIOM shares no trigrams across these words.
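You can see the synonym failure without calling the API at all: the trigram sets of the two phrases simply do not intersect, while a one-word edit leaves most trigrams shared. A toy check using plain sliding trigrams (not AXIOM's exact tokenisation):

```python
def trigram_set(text: str) -> set[str]:
    """Set of sliding character trigrams in a string."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


# Synonyms: zero shared trigrams, so any trigram-based score is near random.
a, b = trigram_set("buy shoes"), trigram_set("purchase footwear")
print(a & b)                     # set()

# One-word edit: most trigrams survive, so overlap stays high.
x = trigram_set("the cat sat on the mat")
y = trigram_set("the cat sat on the hat")
print(len(x & y) / len(x | y))   # 0.7
```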

Semantic similarity

"dog" vs "cat" → 0.5012. AXIOM has no concept of meaning. Do not use it for concept search.

Cross-language

"hello" vs "hola" → ~0.30. No shared character patterns across language families.

Sentiment / intent

"I love this product" vs "I hate this product" → high similarity (~0.81). Structurally similar, semantically opposite.

Long-document summarisation matching

A 10-sentence summary vs a 1-page article will score low even if they convey the same information.

Replacing embeddings entirely

AXIOM is a pre-filter. For any task requiring semantic understanding, you still need an embedding model.

Ready to add AXIOM to your pipeline?

Try the API in the playground or read the full API reference.