Trend Watch: The Rise of 'Ad-Company' AI Assistants



The New Reality: Your AI Assistant Works for Advertisers First

Picture this: a developer asks their AI assistant to help debug a proprietary authentication flow, a product manager requests competitive analysis, or someone types a sensitive health question into a chatbot. In each case, the company processing that request has a substantial financial stake in advertising. Google Gemini, Meta AI (embedded across WhatsApp, Instagram, and Facebook), Amazon Alexa+, and Microsoft Copilot are now the dominant AI assistants in the market, and they are built by companies with major advertising businesses or data-monetization models. The degree of ad-dependency varies: Meta (approximately 97% of revenue) and Alphabet (approximately 77%) are structurally ad companies; Amazon and Microsoft have large advertising segments within more diversified revenue bases.

The structural implications of ad-supported AI assistants deserve far more scrutiny than they are getting. We are entering an era where one of the most personal computing interfaces yet created, one that processes natural language revealing intent, sentiment, and context, is owned and operated by companies with deep advertising incentives. For developers and power users, the consequences are concrete: unaudited data exposure, and legal risk for organizations handling regulated data. Almost nobody is examining them closely.

The 'Every Company Is an Ad Company' Thesis Applied to AI

Revenue Breakdown: Follow the Money

The financial incentives here are not subtle. Alphabet, Google's parent company, derives roughly 77% of its total revenue from advertising (FY2024, Alphabet 10-K). Meta is even more concentrated, with approximately 97% of revenue coming from ads (FY2024, Meta 10-K). Amazon has grown its advertising segment into a $50 billion-plus business (FY2024, Amazon 10-K), making it one of the company's fastest-expanding divisions. Microsoft earns advertising revenue through Bing and LinkedIn, though this represents a small share of its total revenue; Copilot is primarily a Microsoft 365 and Azure product.

When advertising accounts for a dominant proportion of revenue, as it does for Alphabet and Meta, product decisions serve that business model. This is not a cynical interpretation; it is how public companies operate. AI assistant design, data retention policies, default sharing settings, and feature prioritization all flow downstream from the revenue structure. The question is not whether these companies will use AI assistant data for advertising purposes, but how aggressively and how transparently.

How AI Assistants Supercharge the Ad Model

Conversational AI data is qualitatively different from search queries or click streams. A search query gives the system a handful of keywords to infer intent from. A multi-turn conversation with an AI assistant, by contrast, captures explicit statements of intent, emotional context, preference hierarchies, and decision-making processes. This dataset is richer than search logs or click streams because users state their goals, constraints, and preferences directly rather than forcing the system to guess.
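To make that difference concrete, here is a toy sketch comparing the explicit signals a single conversational turn exposes against a keyword query. The strings and the keyword heuristic are entirely hypothetical, nothing like a real extraction pipeline, which would be far more sophisticated:

```python
# Illustrative comparison (hypothetical data): explicit signals in a
# conversational turn versus a keyword search query.

search_query = "best crm small business"

conversation_turn = (
    "We're a 12-person agency on a $200/month budget. We tried HubSpot but "
    "found it overwhelming. I need something my non-technical team will "
    "actually use, ideally before our Q3 client push."
)

def naive_signal_count(text: str) -> dict:
    """Count explicit statements of constraint or intent.

    A toy keyword heuristic, not real NLP: the point is only that the
    conversational turn states these things outright.
    """
    markers = {
        "budget": ["budget", "$"],
        "team_size": ["person", "people", "team"],
        "past_rejection": ["tried", "found it"],
        "timeline": ["before", "q3", "q4"],
    }
    lowered = text.lower()
    return {signal: any(k in lowered for k in keywords)
            for signal, keywords in markers.items()}

print(sum(naive_signal_count(search_query).values()),
      "explicit signals in the search query")
print(sum(naive_signal_count(conversation_turn).values()),
      "explicit signals in one conversational turn")
```

The search query yields nothing explicit; the single turn states budget, team size, a rejected competitor, and a timeline in the user's own words.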

The shift underway is fundamental: from inferring what users might want based on indirect signals to having users articulate exactly what they want in plain language. For an advertising business, this is an unprecedented gift.

Google's public filings reference the integration of AI capabilities into its advertising products. Meta has made public statements about using AI to improve ad targeting and has updated its policies to permit training AI models on user content across its platforms.

What This Means for Developers and Power Users

Data Flows You Can't Opt Out Of

The training pipelines behind these AI assistants are worth understanding concretely. These companies route user prompts into several distinct data-use pathways: reinforcement learning from human feedback (RLHF), separate fine-tuning datasets, and ad-personalization models. Their policies do not exclude ad-personalization use, and no public statement from any major provider explicitly rules it out. Google's updated privacy policy for Gemini explicitly describes how conversational data may be used. Meta's policy changes around AI training on user-generated content across its family of apps follow the same pattern.

For most consumer users, opting out either cannot be done or requires digging through settings menus that reset with updates. These products collect data by default, and the burden falls on the user to actively and repeatedly resist it.

The API Layer Isn't Immune

Developers building on hosted LLM APIs, whether the Gemini API, Meta's hosted Llama endpoints, or Amazon Bedrock, should read the specific data retention and usage policies for each provider. Relevant documents include Google's Gemini API Additional Terms of Service, Meta's Llama Acceptable Use Policy, and AWS's Bedrock Service Terms. API terms of service have historically promised more restrictive data use than consumer product terms, but the gap is narrowing. API providers increasingly reserve rights to log requests, retain prompt data for abuse monitoring, and in some cases use aggregated data for model improvement.

Here is a minimal example showing where data leaves a developer's control when calling a hosted LLM endpoint:

import logging
import os

import requests  # pip install requests~=2.31

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# ILLUSTRATIVE EXAMPLE — replace with your actual provider values.
# Never hardcode API keys; load from environment variables or a secrets manager.

LLM_API_URL = "https://api.example-llm-provider.com/v1/chat/completions"
LLM_MODEL = "hosted-large-model"
TEMPERATURE = 0.7  # Range: 0.0 (deterministic) to 2.0 (most providers); verify with your provider.


def call_llm(prompt: str) -> dict:
    api_key = os.environ.get("LLM_API_KEY")
    if not api_key:
        raise EnvironmentError(
            "LLM_API_KEY environment variable is not set. "
            "Export it before running: export LLM_API_KEY=<your-key>"
        )

    # Every field in this payload exits your infrastructure
    payload = {
        "model": LLM_MODEL,
        "messages": [
            # This content is now on someone else's servers
            {"role": "user", "content": prompt}
        ],
        "temperature": TEMPERATURE,
    }

    logger.info("Sending request to %s, model=%s, prompt_len=%d",
                LLM_API_URL, LLM_MODEL, len(prompt))

    # timeout=(connect_seconds, read_seconds)
    response = requests.post(
        LLM_API_URL,
        headers={
            "Authorization": f"Bearer {api_key}",  # Identifies you
            "Content-Type": "application/json",
        },
        json=payload,
        timeout=(5, 30),
    )

    # Raise on 4xx/5xx BEFORE attempting to parse body.
    response.raise_for_status()

    try:
        data = response.json()
    except requests.exceptions.JSONDecodeError as exc:
        raise ValueError(
            f"Provider returned non-JSON response "
            f"(status={response.status_code}): {response.text[:200]}"
        ) from exc

    # The response, your prompt, your IP, your API key usage pattern:
    # all logged, all retained per the provider's policy, not yours.
    logger.info("Response received, status=%d", response.status_code)
    return data


if __name__ == "__main__":
    result = call_llm("Analyze our Q3 churn data: 14.2% in enterprise...")
    print(result)

The inline comments are not hypothetical. Every request payload, every header, and the metadata around usage patterns are subject to the provider's retention policy, not the developer's.
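When a hosted API is unavoidable, one partial mitigation is redacting obvious identifiers before the payload leaves your infrastructure. The sketch below uses regular expressions with illustrative patterns and placeholder names; regex matching misses context-dependent PII (names, addresses, free-text health details), so treat it as defense-in-depth, not a compliance control:

```python
import re

# Minimal, illustrative redaction patterns. Extend and test against your
# own data before relying on them.
REDACTION_PATTERNS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "<EMAIL>"),
    (re.compile(r"\b(?:\d[ -]?){13,16}\b"), "<CARD_NUMBER>"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "<SSN>"),
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "<IP_ADDR>"),
]

def redact(prompt: str) -> str:
    """Replace obvious identifiers before the prompt leaves your infra."""
    for pattern, placeholder in REDACTION_PATTERNS:
        prompt = pattern.sub(placeholder, prompt)
    return prompt

raw = "User jane.doe@example.com at 10.0.0.42 reported SSN 123-45-6789 rejected."
print(redact(raw))  # email, IP, and SSN are replaced by placeholders
```

Call `redact()` on the prompt string before building the request payload in the example above; the provider then logs placeholders instead of raw identifiers.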

The Open-Source and Local AI Counter-Movement

Running Models Locally: The Privacy-First Alternative

A growing ecosystem of tools makes it straightforward to run capable language models on consumer hardware with zero data leaving the machine. Ollama, llama.cpp, LM Studio, and Jan.ai all provide different interfaces for local inference. Models that run well on Apple Silicon with 16GB unified memory, or on systems with a discrete GPU with 8GB or more VRAM (and 16GB+ system RAM), include Llama 3.1 8B, Mistral 7B, Phi-3, and Gemma 2 9B.

The trade-off is real: these smaller models cannot match the raw capability of GPT-4-class or Gemini Ultra-class systems. But each generation narrows the gap. For a wide range of developer tasks, including code generation, text summarization, data analysis, and brainstorming, they perform well enough that the privacy benefit outweighs the capability difference.
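A back-of-the-envelope way to check whether a given model fits your hardware is weight count times bytes per weight, plus runtime overhead. The sketch below uses a deliberately simplified formula with an assumed flat overhead factor; real memory usage varies with context length, KV-cache size, and inference backend:

```python
def estimate_model_memory_gb(params_billion: float, bits_per_weight: int,
                             overhead_factor: float = 1.2) -> float:
    """Rough memory footprint for local inference.

    Weights only, plus a flat overhead factor (an assumption) covering the
    KV cache and runtime buffers. Verify against your actual backend.
    """
    weight_bytes = params_billion * 1e9 * (bits_per_weight / 8)
    return weight_bytes * overhead_factor / (1024 ** 3)

# An 8B model at 4-bit quantization (a common local-inference default):
print(f"8B @ 4-bit:  ~{estimate_model_memory_gb(8, 4):.1f} GB")
# The same model at fp16, showing why it would not fit an 8 GB GPU:
print(f"8B @ 16-bit: ~{estimate_model_memory_gb(8, 16):.1f} GB")
```

This arithmetic is why the 7-9B models listed above pair well with 16GB unified memory or an 8GB GPU: quantized to 4 bits they need roughly 5GB, while unquantized they would not fit.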

Setting up a private, local AI assistant takes only a few minutes of work; the initial model download (4-8GB, depending on the model) is the slowest step even on a fast connection:

# ── Installation ──────────────────────────────────────────────────────────────
# Option A: Checksum-verified install (recommended for sensitive environments).
# 1. Download the installer. Review the script before executing.
curl -fsSL https://ollama.com/install.sh -o ollama_install.sh

# 2. Verify the checksum against the value published at https://ollama.com/download
#    before executing. Replace <expected_sha256> with the published value.
# sha256sum ollama_install.sh  # Linux
# shasum -a 256 ollama_install.sh  # macOS
echo "<expected_sha256>  ollama_install.sh" | sha256sum --check

# 3. Execute only after checksum passes.
sh ollama_install.sh

# Option B: GUI installer with checksum — https://ollama.com/download
# Windows: Download the installer from https://ollama.com/download/windows

# ── Pull a model (pin by digest for reproducibility) ─────────────────────────
ollama pull llama3.1:8b

# Record the digest the tag resolved to. 'ollama list' prints an ID column
# (a digest prefix); machine-readable output support varies by Ollama
# version, so check 'ollama --help' before scripting against it.
ollama list | grep llama3.1
# In CI: assert the recorded ID matches your pinned expected digest.

# ── Interactive run ───────────────────────────────────────────────────────────
# For programmatic use with variable prompts, prefer the API endpoint below
# to avoid shell metacharacter injection in prompt strings.
ollama run llama3.1:8b "Explain the trade-offs between JWT and session-based auth" \
  || { echo "ERROR: ollama run failed (exit $?)"; exit 1; }

# ── Local API server call ─────────────────────────────────────────────────────
# Warning: Ollama serves on localhost:11434 with no authentication by default.
# On cloud or shared machines, restrict access via firewall rules
# or an authenticated reverse proxy before exposing beyond localhost.

# Linux / macOS (single-quoted JSON):
curl --connect-timeout 5 --max-time 120 \
     -w "\nHTTP Status: %{http_code}\n" \
     http://localhost:11434/api/generate \
     -d '{
       "model": "llama3.1:8b",
       "prompt": "Write a Python function to validate email addresses"
     }'

# Windows CMD / PowerShell — use a JSON file to avoid quoting issues:
# Create payload.json:
#   {"model": "llama3.1:8b", "prompt": "Write a Python function to validate email addresses"}
# Then run:
#   curl --connect-timeout 5 --max-time 120 -w "\nHTTP Status: %{http_code}\n" ^
#        http://localhost:11434/api/generate --data @payload.json

Inference tokens do not leave your machine. Verify current telemetry settings in Ollama's documentation before use in sensitive environments, as the desktop application may collect usage analytics.
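For comparison with the hosted-API example earlier, here is the same request shape pointed at the local server. This sketch assumes Ollama's default port (11434) and an already-pulled llama3.1:8b; /api/generate streams JSON lines unless "stream": false is set:

```python
import json
import urllib.request

# Local counterpart to the hosted-API example: same request shape, but the
# prompt never leaves the machine.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(model: str, prompt: str) -> bytes:
    """'stream': False asks for one JSON object instead of JSON lines."""
    return json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,
    }).encode("utf-8")

def call_local_llm(prompt: str, model: str = "llama3.1:8b") -> str:
    req = urllib.request.Request(
        OLLAMA_URL,
        data=build_payload(model, prompt),
        headers={"Content-Type": "application/json"},
    )
    # No API key, no third-party retention policy: the only log is your own.
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a running Ollama server):
#   print(call_local_llm("Summarize the trade-offs of local LLM inference."))
```

Note the structural difference from the hosted example: no Authorization header, no provider-side retention, and the timeout is the only failure mode outside your control.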

Open-Source Doesn't Mean Open Season

Open-weight and open-source mean different things. Open-weight means the model parameters are publicly available for download and local inference. Truly open-source would also include the training data, methodology, and full reproducibility. Most of the prominent models, including Meta's Llama family, are open-weight but not fully open-source.

The irony is hard to miss: Meta, a company that derives approximately 97% of its revenue from advertising (FY2024, Meta 10-K), is also the largest contributor of open-weight AI models. The strategic calculus is clear. Releasing Llama builds an ecosystem around Meta's architecture, attracts developer talent and research attention, and positions Meta as an infrastructure layer even when developers run models locally. Still, from a privacy standpoint, open weights deliver a real benefit: the developer controls inference, and data never leaves their machine. Note that Meta's Llama models carry a custom commercial license that restricts certain uses, including commercial deployments above a threshold of monthly active users. Review the Llama Community License Agreement before production deployment.

The Developer's Decision Matrix

Hosted APIs from ad-company providers make sense for prototyping, applications that need frontier-model capability, and workloads where scale matters more than data sensitivity. When you are handling sensitive data, building user-facing products where trust is a feature, or operating under compliance frameworks like GDPR or HIPAA, local and self-hosted models are the stronger choice. Consult legal counsel to determine whether your application qualifies as a covered entity or business associate under HIPAA.

This is not an ideological choice; it is an architectural one. A GDPR Article 28 processor obligation can land on your desk when a hosted provider changes its terms; a customer trust incident after a data-use disclosure can cost more than the engineering effort of running inference locally; and a compliance audit that reveals prompt data sitting on a third-party server creates liability you could have avoided.
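One way to keep this decision auditable is to encode it as an explicit routing rule rather than ad-hoc judgment. The sketch below is a hypothetical policy function; the field names and tier labels are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Workload:
    """Hypothetical workload descriptor for the routing sketch."""
    handles_regulated_data: bool    # e.g. GDPR special categories, PHI
    needs_frontier_capability: bool
    is_prototype: bool

def choose_inference_tier(w: Workload) -> str:
    """Encode the decision matrix as explicit, auditable rules.

    Data sensitivity vetoes everything else: a compliance obligation
    cannot be traded away for model capability.
    """
    if w.handles_regulated_data:
        return "local-or-self-hosted"
    if w.is_prototype or w.needs_frontier_capability:
        return "hosted-api"
    return "local-or-self-hosted"

print(choose_inference_tier(Workload(True, True, False)))
```

The veto ordering is the design point: capability and convenience arguments are only considered after the data-sensitivity check passes.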


What Comes Next, and What to Watch For

Google has already tested sponsored results within AI Overviews in search. Ad-injected AI responses are not theoretical. The extension of this pattern into conversational AI assistants is a matter of when, not if. No regulation specifically and comprehensively addresses the pipeline from AI conversational data to ad-targeting models, though existing frameworks like GDPR and the EU AI Act may apply partially. The constraints, for now, are largely self-imposed by the companies involved.

On the other side of the equation, local-first AI tooling is gaining momentum. Projects like Ollama (over 120k GitHub stars as of mid-2025) and llama.cpp reflect a real community shift toward privacy-preserving inference. Developers and technical users are in a unique position here. They understand the data flows, they have the skills to run alternatives, and their choices signal to the broader market what is acceptable. If the technical community defaults to convenience over architectural integrity, consumer users will never have the option.


Key Takeaways

  • Gemini, Meta AI, Alexa+, and Copilot are products of companies with major advertising businesses. Treat the data you send them accordingly.
  • Unlike click-based tracking, conversational AI captures intent, sentiment, and preference directly from the user's own words. This data is extraordinarily valuable to ad models.
  • Local, open-weight models running via tools like Ollama already handle a wide range of developer tasks with zero data leaving your machine. Verify telemetry settings and review model licenses before deploying in sensitive or commercial contexts.
  • Choosing your AI stack is a privacy architecture decision. Make it deliberately now rather than discovering the implications after your data is already on someone else's servers.