Our three promises
Your data is yours
Your participants’ data belongs to you, not us.
No AI training
We never use your interview transcripts to train AI models.
Delete any time
You can delete everything, whenever you want.
Just what’s needed to run an interview.
We collect
- Researchers: name, email, organisation.
- Participants: name, email, job title / company, interview transcript. If they choose to speak, voice audio is streamed to Deepgram for transcription only — not stored by Robbie or Deepgram.
We do not collect
Health data, race, religion, political views, or other sensitive personal information. If a participant volunteers any of this, Robbie redirects the conversation away from it. After every interview, Robbie automatically scans the transcript and redacts any sensitive data that slipped through — only a metadata log is kept, not what was said.
Your research data is in Ireland. AI inference uses US services. AI observability is in the EU.
- 01Your research data is stored in Ireland, in Amazon data centres, encrypted at rest (AES-256) and in transit (TLS 1.2+). AI inference (interviews, synthesis, and voice) uses US-based services — see our Privacy page sub-processor list.
- 02AI observability runs on Langfuse, hosted in the European Union (Frankfurt). Only masked, non-personal trace metadata — cost, latency, token counts — reaches Langfuse. No participant content, names, or emails leave our servers.
- 03Hosted on Vercel. Registered with the UK ICO. Registration number ZC163063. We follow UK GDPR.
How AI sees (and doesn’t see) your interviews.
We use Anthropic’s models to conduct and analyse interviews.
Before any interview transcript leaves our servers for Anthropic, participant details — names, companies, and job titles — are replaced with anonymous codes: P1, P2, P3… Robbie knows who P1 is; Anthropic never does.
During live interviews, Robbie addresses the participant by name. If a participant types their name or email directly into the chat, that may appear in Anthropic’s temporary request logs for up to 30 days.
Only you. That’s the list.
Only you can see your research inside the app. Database access is locked — each user can only read their own data.
We catch bugs. We don’t read your conversations.
We use Sentry to catch technical bugs. If an error occurs, a snapshot is captured — but all text is blanked out before it leaves your browser. No conversation content is ever captured by our error monitoring.
Separately, we use Langfuse for AI observability — recording masked, metadata-only traces of every Claude call (cost, latency, token counts, timing) so we can monitor and improve AI behaviour. Names, email addresses, and interview answers are redacted before any trace leaves our servers; Langfuse receives only non-personal metadata.
Your rights, in plain English.
- Request your data: You can request a copy or deletion of your data.
- Delete: Delete your account and all data within 30 days.
- Breach notification: We notify affected customers within 72 hours of becoming aware of a personal data breach, in line with our UK GDPR obligations.
- Contact: privacy@robbieasks.com
Robbie the Researcher Ltd · Company no. 17247266 · London NW11 7RL
For your IT team
The specifics of how we’ve built things, for anyone who wants to look under the bonnet.
Stack
- Frontend
- Next.js (App Router), React, TypeScript, Tailwind CSS
- Data & Auth
- Supabase
- Hosting
- Vercel, Upstash Redis (rate limiting)
- AI & Media
- Anthropic Claude (LLM), Deepgram Nova-3 (speech-to-text), ElevenLabs Turbo v2.5 (text-to-speech)
- Observability
- Sentry (error monitoring), Langfuse (AI observability — masked, metadata-only LLM traces, EU region), UptimeRobot (uptime monitoring)
Voice interviews: Audio is streamed to Deepgram for speech-to-text transcription and never stored by Robbie. Deepgram API credentials never reach the browser — each voice session uses a short-lived key (60-second JWT, scoped to one interview) minted server-side.
Why this matters: Your voice is only ever streamed live and immediately converted to text — nothing is recorded or saved anywhere.
Row-Level Security (RLS)
Every table in the database has Row-Level Security enabled. Access is scoped per user at the database level, not just the application layer.
Security headers & request hardening
- CSRF (cross-site request forgery) protection: Default-deny validation on all data-changing endpoints — requests that fail the same-origin check are rejected before any data is touched.
- CSP (content security policy): A per-request nonce-based Content Security Policy is served on every response, restricting which scripts can execute on the page.
- Rate limiting — fail-closed: All cost-bearing endpoints are rate-limited. If the rate limiter is unreachable, requests are rejected rather than allowed through.
Why this matters: Most apps protect your data at the software level only. We also lock it at the database level — a separate layer underneath the app. So even if there were a bug in how the app was written, the database would still refuse to show your data to anyone else.
Consent records & IP hashing
When a participant gives consent, we record a hashed copy of their IP address using a salted SHA-256 formula:
- Per-row salt defeats rainbow-table attacks.
- Global pepper (environment variable) is defence-in-depth.
- Notice version hash detects material changes; stale submissions are rejected (HTTP 409).
Why this matters: We can prove consent was given without keeping your actual IP address forever — we convert it into a scrambled code that can’t be reversed. Each record is scrambled differently, so if someone accessed one record, it wouldn’t help them figure out any other.
Special-category data controls
Two-layer protection for GDPR Article 9 special-category data:
During the interview: Robbie’s system prompt instructs it to avoid collecting special-category data. If a participant raises any of the following topics, Robbie steers the conversation away:
After every interview: Robbie runs an automated scan of the completed transcript. Any special-category disclosures that slipped through are redacted from persistent storage. Only a metadata log is kept — confirming that a redaction occurred, not what was said.
Why this matters: Open-ended conversations can accidentally draw out personal information people didn’t mean to share — things like health conditions or political views. By steering away from these topics during the interview, then scanning the transcript afterwards, sensitive details can’t build up in your research data even if a participant volunteers them.
Prompt injection defence
Input escaping
Researcher brief fields are HTML-escaped before interpolation into prompts, blocking tag-injection attacks.
Instruction isolation
All prompts include explicit instruction isolation — transcript and interview content is treated as data only, not as additional instructions.
Why this matters: Someone could try to manipulate Robbie by typing hidden instructions into their answers — for example, “ignore your guidelines and do this instead.” We treat everything a participant types as their answer, never as an instruction to Robbie, so these tricks don’t work.
Error monitoring config
replaysSessionSampleRate: 0.0 // no continuous session recording
replaysOnErrorSampleRate: 1.0 // snapshot on error only
maskAllText: true // all text masked before transmission
blockAllMedia: true // all media blockedTransport: Routed via a same-origin tunnel (bypasses ad blockers).
Residency: EU data residency set in Sentry dashboard.
Why this matters: Some error-tracking tools take a snapshot of what was on screen when a bug happens. We’ve configured ours so all text is blanked out before any snapshot leaves your browser — no conversation content or personal details can be captured by Sentry, even during a crash.
Sub-processor list
Our full sub-processor list is published at Privacy page, section 04.
Anthropic is not currently enrolled in the zero-data-retention programme.
Why this matters: Every company in this list can see some part of your research data. We publish this list so your legal or IT team knows exactly who handles it.
Retention schedule
| Data | Retained for |
|---|---|
| Researcher account data | Lifetime of account; 30 days after closure or deletion request |
| Server & auth logs | Up to 90 days — Vercel / Supabase platform defaults |
| Email correspondence | 2 years from last reply |
| Interview transcripts, summaries, and consent records | Duration of researcher’s account; deletion on request within 30 days |
| Encrypted backups | Up to 30 days after deletion |
| ElevenLabs TTS text & audio | Retained by ElevenLabs per their privacy policy. Not stored on our side. |
| Voice audio (Deepgram) | Discarded by Deepgram immediately after transcription — not retained by Robbie or Deepgram |
| Langfuse AI-observability traces | ~30 days (Langfuse free-tier retention, EU). Masked, metadata-only — no participant content. Not stored on our servers. |
Why this matters: Data that’s been deleted can’t be stolen. By keeping data only as long as necessary, we limit how much could ever be exposed if something went wrong — and we’re not holding on to data we have no reason to keep.
Found a security issue or vulnerability? Email security@robbieasks.com — we aim to respond immediately; our maximum SLA is 48 hours.