Security at Robbie

Summary

Our three promises

Your data is yours

Your participants’ data belongs to you, not us.

No AI training

We never use your interview transcripts to train AI models.

Delete any time

You can delete everything, whenever you want.

What we collect

Just what’s needed to run an interview.

We collect

Researchers: name, email, organisation.
Participants: name, email, job title / company, interview transcript. If they choose to speak, voice audio is streamed to Deepgram for transcription only — not stored by Robbie or Deepgram.

We do not collect

Health data, race, religion, political views, or other sensitive personal information. If a participant volunteers any of this, Robbie redirects the conversation away from it. After every interview, Robbie automatically scans the transcript and redacts any sensitive data that slipped through — only a metadata log is kept, not what was said.

Where data lives

Your research data is in Ireland. AI inference uses US services. AI observability is in the EU.

01Your research data is stored in Ireland, in Amazon data centres, encrypted at rest (AES-256) and in transit (TLS 1.2+). AI inference (interviews, synthesis, and voice) uses US-based services — see our Privacy page sub-processor list.
02AI observability runs on Langfuse, hosted in the European Union (Frankfurt). Only masked, non-personal trace metadata — cost, latency, token counts — reaches Langfuse. No participant content, names, or emails leave our servers.
03Hosted on Vercel. Registered with the UK ICO. Registration number ZC163063. We follow UK GDPR.

AI & your data

How AI sees (and doesn’t see) your interviews.

We use Anthropic’s models to conduct and analyse interviews.

Before any interview transcript leaves our servers for Anthropic, participant details — names, companies, and job titles — are replaced with anonymous codes: P1, P2, P3… Robbie knows who P1 is; Anthropic never does.

During live interviews, Robbie addresses the participant by name. If a participant types their name or email directly into the chat, that may appear in Anthropic’s temporary request logs for up to 30 days.

Anthropic does not use your data to train its models. Your brief and research goals are never shared with participants.

Access

Only you. That’s the list.

Only you can see your research inside the app. Database access is locked — each user can only read their own data.

Cookies & tracking

Three cookies. No tracking.

We use three cookies. All of them are strictly necessary to log you in and run interviews.

No ad cookiesNo analyticsNo tracking

Error monitoring & AI observability

We catch bugs. We don’t read your conversations.

We use Sentry to catch technical bugs. If an error occurs, a snapshot is captured — but all text is blanked out before it leaves your browser. No conversation content is ever captured by our error monitoring.

Separately, we use Langfuse for AI observability — recording masked, metadata-only traces of every Claude call (cost, latency, token counts, timing) so we can monitor and improve AI behaviour. Names, email addresses, and interview answers are redacted before any trace leaves our servers; Langfuse receives only non-personal metadata.

Your rights

Your rights, in plain English.

Request your data: You can request a copy or deletion of your data.
Delete: Delete your account and all data within 30 days.
Breach notification: We notify affected customers within 72 hours of becoming aware of a personal data breach, in line with our UK GDPR obligations.
Contact: privacy@robbieasks.com

Robbie the Researcher Ltd · Company no. 17247266 · London NW11 7RL

Technical details

For your IT team

The specifics of how we’ve built things, for anyone who wants to look under the bonnet.

Architecture overview

Stack

Frontend: Next.js (App Router), React, TypeScript, Tailwind CSS
Data & Auth: Supabase
Hosting: Vercel, Upstash Redis (rate limiting)
AI & Media: Anthropic Claude (LLM), Deepgram Nova-3 (speech-to-text), ElevenLabs Turbo v2.5 (text-to-speech)
Observability: Sentry (error monitoring), Langfuse (AI observability — masked, metadata-only LLM traces, EU region), UptimeRobot (uptime monitoring)

Voice interviews: Audio is streamed to Deepgram for speech-to-text transcription and never stored by Robbie. Deepgram API credentials never reach the browser — each voice session uses a short-lived key (60-second JWT, scoped to one interview) minted server-side.

Why this matters: Your voice is only ever streamed live and immediately converted to text — nothing is recorded or saved anywhere.

Data access controls

Row-Level Security (RLS)

Every table in the database has Row-Level Security enabled. Access is scoped per user at the database level, not just the application layer.

Security headers & request hardening

CSRF (cross-site request forgery) protection: Default-deny validation on all data-changing endpoints — requests that fail the same-origin check are rejected before any data is touched.
CSP (content security policy): A per-request nonce-based Content Security Policy is served on every response, restricting which scripts can execute on the page.
Rate limiting — fail-closed: All cost-bearing endpoints are rate-limited. If the rate limiter is unreachable, requests are rejected rather than allowed through.

Why this matters: Most apps protect your data at the software level only. We also lock it at the database level — a separate layer underneath the app. So even if there were a bug in how the app was written, the database would still refuse to show your data to anyone else.

Prompt safety

Special-category data controls

Two-layer protection for GDPR Article 9 special-category data:

During the interview: Robbie’s system prompt instructs it to avoid collecting special-category data. If a participant raises any of the following topics, Robbie steers the conversation away:

HealthEthnic originPolitical opinionReligionTrade-union membershipSex life / orientationBiometric dataGenetic data

After every interview: Robbie runs an automated scan of the completed transcript. Any special-category disclosures that slipped through are redacted from persistent storage. Only a metadata log is kept — confirming that a redaction occurred, not what was said.

Why this matters: Open-ended conversations can accidentally draw out personal information people didn’t mean to share — things like health conditions or political views. By steering away from these topics during the interview, then scanning the transcript afterwards, sensitive details can’t build up in your research data even if a participant volunteers them.

Prompt injection

Prompt injection defence

Input escaping

Researcher brief fields are HTML-escaped before interpolation into prompts, blocking tag-injection attacks.

Instruction isolation

All prompts include explicit instruction isolation — transcript and interview content is treated as data only, not as additional instructions.

Why this matters: Someone could try to manipulate Robbie by typing hidden instructions into their answers — for example, “ignore your guidelines and do this instead.” We treat everything a participant types as their answer, never as an instruction to Robbie, so these tricks don’t work.

Sentry

Error monitoring config

replaysSessionSampleRate: 0.0   // no continuous session recording
replaysOnErrorSampleRate:  1.0   // snapshot on error only
maskAllText:               true  // all text masked before transmission
blockAllMedia:             true  // all media blocked

Transport: Routed via a same-origin tunnel (bypasses ad blockers).

Residency: EU data residency set in Sentry dashboard.

Why this matters: Some error-tracking tools take a snapshot of what was on screen when a bug happens. We’ve configured ours so all text is blanked out before any snapshot leaves your browser — no conversation content or personal details can be captured by Sentry, even during a crash.

Sub-processors

Sub-processor list

Our full sub-processor list is published at Privacy page, section 04.

Note

Anthropic is not currently enrolled in the zero-data-retention programme.

Why this matters: Every company in this list can see some part of your research data. We publish this list so your legal or IT team knows exactly who handles it.

Retention

Retention schedule

Data	Retained for
Researcher account data	Lifetime of account; 30 days after closure or deletion request
Server & auth logs	Up to 90 days — Vercel / Supabase platform defaults
Email correspondence	2 years from last reply
Interview transcripts, summaries, and consent records	Duration of researcher’s account; deletion on request within 30 days
Encrypted backups	Up to 30 days after deletion
ElevenLabs TTS text & audio	Retained by ElevenLabs per their privacy policy. Not stored on our side.
Voice audio (Deepgram)	Discarded by Deepgram immediately after transcription — not retained by Robbie or Deepgram
Langfuse AI-observability traces	~30 days (Langfuse free-tier retention, EU). Masked, metadata-only — no participant content. Not stored on our servers.

Why this matters: Data that’s been deleted can’t be stolen. By keeping data only as long as necessary, we limit how much could ever be exposed if something went wrong — and we’re not holding on to data we have no reason to keep.

Security contact

Found a security issue or vulnerability? Email security@robbieasks.com — we aim to respond immediately; our maximum SLA is 48 hours.

Security & Technical Bits