How I Made Text Marv Secure: A Deep Dive into Privacy, Trust, and Third-Party Tools

By Nick Rogers in Marv — Dec 12, 2024


My recent personal project, Text Marv, is something I’ve built with security in mind. It's not just another tool—it's one that I use every single day as my own productivity and goal-achievement companion (and for fun stuff like image generation). From the start, I’ve been intentional about building Marv to be something I can trust—and that you can too. Users trust Marv with their phone number (and potentially name) and I want to prove to you that he deserves that trust.

I put every integration I choose through a rigorous process to decide whether it deserves to handle your data. I don't want my data to be exposed. I hate spam text messages, so protecting phone numbers is top of mind for me. They are stored securely in a database and never shared with any partner beyond Twilio, which needs them to deliver your texts. When Marv invokes third-party APIs, he always uses a random user id to identify you, so there is nothing in any external system that links back to your personal information. The same is true for logs and analytics. Messages sent to Marv are not for sale, will never be used to train additional models, and can be deleted and forgotten whenever you want. I built that functionality directly into Marv. No need to contact me: just send a text with "clear" and Marv will clean things up, and it will be like you have never met.

The most heart-wrenching moment in all of Doctor Who: when the Doctor had to wipe Donna Noble's memory to protect her. I get emotional just watching it. IYKYK 😭

Today I am going to discuss each of the partners I have chosen to power Marv, and I will tell you why I feel confident trusting them. Here are some questions I ask before even thinking about integrating a partner into the Marv ecosystem:

Do they have some kind of certification that proves they are secure?
Is there a chance of this company selling your data?
Are communications provably secure? Do they use HTTPS or verifiable communication schemes?
Can we delete the data that we have sent to them in the future? What retention policies are in place?
Can I avoid sending any personally identifiable information to them to keep users as anonymous as possible?

For each of the technology partners that I use for Marv, I'll answer these questions (those answers are summarized in a table at the end) and I'll include some links in case you want to check my work.

Twilio (Telephony / SMS Services)

Twilio is the text messaging partner that I chose for Marv. It's robust and reliable, and they have been around for many years. I have used it for personal projects and on projects at big companies. It's trusted by banks, tech startups, and huge companies around the world. They have a global presence and are a consistent partner for many, many software projects. When Twilio receives your message, they sign it with a secret key that is known only to Twilio and me. They then send it to Marv via a webhook. The first thing Marv does when a message arrives is check that signature to ensure that the traffic is coming from the correct source. Even if the location of this webhook is exposed (which is very unlikely), a random attacker will not have access to this key. Every message is validated with this key to protect against third-party attacks. The traffic is encrypted and passed via HTTPS, so even if a router somewhere in the middle of its trip is compromised, the attackers will not be able to read the contents.
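For the devs reading: here is a minimal sketch of what that signature check can look like. It follows the scheme Twilio documents for form-encoded webhooks (the signature arrives in the X-Twilio-Signature header), using only the Python standard library. The function names are hypothetical, and this is an illustration rather than Marv's actual code.

```python
import base64
import hashlib
import hmac
from typing import Mapping


def expected_signature(auth_token: str, url: str, params: Mapping[str, str]) -> str:
    """Compute the signature Twilio would attach to a form-encoded webhook."""
    # Twilio's documented scheme: start with the full webhook URL, append each
    # POST parameter name and value (sorted by name, no separators), then
    # HMAC-SHA1 the result with the account auth token and base64-encode it.
    payload = url + "".join(key + params[key] for key in sorted(params))
    digest = hmac.new(
        auth_token.encode("utf-8"), payload.encode("utf-8"), hashlib.sha1
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def is_from_twilio(
    auth_token: str, url: str, params: Mapping[str, str], header_signature: str
) -> bool:
    """Return True only if the header signature matches the request contents."""
    expected = expected_signature(auth_token, url, params)
    # compare_digest runs in constant time, so an attacker cannot learn the
    # correct signature one character at a time from response timing.
    return hmac.compare_digest(
        expected.encode("utf-8"), header_signature.encode("utf-8")
    )
```

A request that fails this check would be rejected before any of its contents are processed. In a real project, Twilio's own helper library ships a RequestValidator class that performs the same check, and that is usually the safer choice over hand-rolled code.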

Twilio has achieved certifications under ISO/IEC standards, including ISO/IEC 27001 (Information Security Management), ISO/IEC 27017 (Cloud-Specific Controls), and ISO/IEC 27018 (PII Protection). These certifications demonstrate Twilio's adherence to robust security and privacy controls, ensuring protection against various risks.

Twilio states that it does not use customer data for marketing or advertising without explicit consent. They do not have this consent for data sent to and from Marv. They comply only with legally binding requests for customer data and provide mechanisms for customers to control their data.

ISO/IEC Certification | Twilio

Learn how Twilio’s certified compliance with ISO/IEC 27001 standards assures your protection.

Twilio

OpenAI (Foundational AI)

Marv uses OpenAI's large language models as the foundation for his conversations. OpenAI is the company that runs ChatGPT, and they take security seriously. A breach for OpenAI would be catastrophic, so they are incredibly motivated to keep their systems secure. I use ChatGPT personally a lot. It's an amazing tool for generating ideas and moving from a blank canvas to something you can tweak, but my conversations and data are being used to train their models. I know that, and it's a trade-off I am willing to make. Marv, on the other hand, uses the enterprise API, so the content you send to Marv isn't being used to train models that chat to anyone else. Any data and conversations you have with Marv are private and not used for training. This prevents things you say to Marv from leaking into conversations with other users.

OpenAI does not sell user data. Data from the free consumer tier (and in some cases paid consumer tiers) is used to train models, but the enterprise API (which Marv uses) guarantees that this is not the case. The data retention periods are longer than usual, but by default data is erased after 60 days of inactivity on any thread. If you want to erase your data earlier, that's possible: once deleted, it is accessible only for abuse detection and is completely removed after 30 days. You can request this directly through Marv by sending the command "clear". He will immediately delete all of the OpenAI threads associated with your user id.

OpenAI holds multiple security certifications, including SOC 2 Type 2, SOC 3, and TX-RAMP compliance, demonstrating their adherence to robust security and data protection standards. These certifications are periodically audited to ensure compliance with industry benchmarks. OpenAI employs HTTPS for all data transmissions, ensuring encryption during communication. This is a standard practice for secure web communications and protects against interception.

https://trust.openai.com/

https://help.openai.com/en/collections/6864268-privacy-and-policies

PostHog (Analytics)

I use PostHog for storing analytical data. It's pretty awesome, to be honest. If you're a dev, give it a try (this is not a paid endorsement). It does not power your interactions with Marv. Instead, it tracks how sticky Marv is, gives me insight into which features people are enjoying, and guides what I should work on next. All of the user data that is sent to PostHog is anonymized, and there is nothing that links you personally to the data sets that are stored there. We use random user ids to make sure of this (this is the best practice that they recommend). I use this tool to keep track of weekly and daily active users, persona usage, and growth accounting (i.e. the returning or dormant state of users).

PostHog requires running on HTTPS to secure all communications, so data in transit is secure by default. PostHog supports data anonymization, and I only send random user ids to avoid any transfer of personally identifiable information (PII). This approach aligns with GDPR and CCPA privacy standards to minimize risk while maintaining analytic functionality.
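To make the random user id idea concrete, here is a minimal sketch of how a phone number can be swapped for a random id before anything leaves the server. The class, the function, and the event shape are hypothetical. The in-memory dictionary stands in for the private database, and `distinct_id` is the field PostHog uses to identify a user.

```python
import uuid
from typing import Dict, Mapping


class PseudonymStore:
    """Maps a phone number to a random id; stands in for the private database."""

    def __init__(self) -> None:
        self._ids: Dict[str, str] = {}

    def id_for(self, phone_number: str) -> str:
        # uuid4 is random, so the id cannot be reversed into the phone number
        # (a hash of the number could be brute-forced; a random id cannot).
        if phone_number not in self._ids:
            self._ids[phone_number] = str(uuid.uuid4())
        return self._ids[phone_number]


def analytics_event(
    store: PseudonymStore, phone_number: str, event: str, properties: Mapping[str, str]
) -> dict:
    """Build an analytics payload that carries the random id, never the phone number."""
    return {
        "distinct_id": store.id_for(phone_number),
        "event": event,
        "properties": dict(properties),
    }
```

The lookup from phone number to random id only ever happens on the server, so the analytics side sees a stable but meaningless identifier. Deleting the mapping is then enough to orphan anything a third party still holds.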

PostHog has achieved SOC 2 certification, which ensures adherence to strict security, availability, and confidentiality standards. As a privacy-focused platform, it emphasizes transparency and compliance with GDPR and CCPA.

Zep Cloud (Long Term Memory System)

Zep Cloud is a tool that I use with Marv to manage long-term memories. It's a fast and efficient way to manage and query memory, and it utilizes knowledge graphs that update over time. Marv uses this to keep a record of what he knows about you when threads become too long for the context to remain meaningful, and in particular if you want to ask about something you said a long time ago. Any data that is sent to Zep is associated only with the same randomized user ids that I use for every third-party integration. This data gets cleared when you ask Marv to forget you.

Zep Cloud is very close to SOC 2 Type II compliance (you can check out the progress on their trust center linked below), which demonstrates a commitment to strong security standards. This is a newer company, so I was more wary of integrating Zep, but I have to say: they really are doing the work to be compliant, and I would be confident using them even in an enterprise setting with highly sensitive data. They don't sell data and are very focussed on privacy.

Further, Zep respects right-to-be-forgotten requests with a single API call (which Marv uses whenever you send a text that says "clear"). But be careful with that one: Marv won't remember who you are after you send it 😉.

https://trust.getzep.com/

MongoDB (Data Storage)

Marv uses MongoDB to store only the data required for running day to day. The database includes user data as well as links to third-party services, such as ids for Stripe, OpenAI, Zep, and Twilio. I have used MongoDB in production environments with the most sensitive data you can imagine and have a high level of trust in this company. The data is secured in transit and encrypted at rest using AES-256. Even in the case of a breach (which has not happened), the data will be effectively impossible to decrypt.

MongoDB's cloud platform, Atlas (which Marv uses), is compliant with major security and privacy standards, including SOC 2, GDPR, HIPAA, and ISO/IEC 27001. The data is only accessible through one account (mine) and that requires 2FA in order to access. Reads and writes are handled using TLS (Transport Layer Security) by default. This ensures secure communications across its cloud and enterprise platforms.

MongoDB gives users full control over their databases, so data can be deleted as required. When you ask Marv to "clear" your message history, he does it. Literally. Backups will remain for the retention period of 72 hours (this is necessary in the case of unexpected failures).
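For the curious, here is a minimal sketch of how a "clear" command can fan out to every place a user's data lives. Everything in it is hypothetical: the function names, the idea of passing one delete callback per service, and the assumption that the command is matched case-insensitively. Real deleters would call the OpenAI, Zep, and MongoDB clients.

```python
from typing import Callable, Dict

# A deleter takes the random user id and removes that user's data from one service.
Deleter = Callable[[str], None]


def is_clear_command(body: str) -> bool:
    """Treat a message as the clear command only if it is exactly the word 'clear'."""
    return body.strip().lower() == "clear"


def handle_clear(user_id: str, deleters: Dict[str, Deleter]) -> Dict[str, bool]:
    """Run every deleter for this user and report which ones succeeded."""
    results: Dict[str, bool] = {}
    for name, delete in deleters.items():
        try:
            delete(user_id)
            results[name] = True
        except Exception:
            # Keep going so that one unavailable service does not block the
            # others; a failed deletion can be retried later.
            results[name] = False
    return results
```

One design choice worth noting: dictionaries keep insertion order, so listing the local database last means the ids that point at third-party data are still available while those services are being cleaned up.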

11 MongoDB Security Features and Best Practices

What is MongoDB Security? MongoDB is an open source NoSQL database system that stores and manages document-oriented data. You can use MongoDB for ad hoc queries, indexing, load balancing, data aggregation, and server-side JavaScript execution. MongoDB is an enterprise-grade database that provides security features like authentication, access control, and encryption. We’ll cover these features and […]

Satori

Stripe (Payments)

I partner with Stripe to handle payments. Your credit card data is never stored on Marv's server, and only exists in the Stripe ecosystem. I handle the back and forth using user ids and payment ids. This is bog-standard for the SaaS industry and something I am extremely confident with. Stripe handled $1T of payments last year (one trillion!), so they are well versed in this space. I let the experts handle credit card numbers, and just get notified of successful or failed payments. You can review your current subscription and cancel or update it anytime using the link at the bottom of Marv's homepage.
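Here is a minimal sketch of what "just get notified" can look like on the server: a handler that takes an already-parsed Stripe event and updates a local subscription record keyed by Stripe's customer id. The function name, the in-memory dictionary, and the status strings are hypothetical. The two event types and the event layout (`type`, then `data.object`) follow Stripe's documented event format.

```python
from typing import Dict


def apply_payment_event(event: dict, subscriptions: Dict[str, str]) -> None:
    """Update local subscription state from a parsed Stripe event.

    Only ids and a status string are stored; no card details ever appear here.
    """
    event_type = event["type"]
    if event_type == "invoice.payment_succeeded":
        status = "active"
    elif event_type == "invoice.payment_failed":
        status = "past_due"
    else:
        # Ignore every other event type rather than guessing at its meaning.
        return
    # For invoice events, data.object is the invoice and carries the customer id.
    customer_id = event["data"]["object"]["customer"]
    subscriptions[customer_id] = status
```

The point of the sketch is what is missing: the server only ever sees a customer id and an outcome, while the card number stays with Stripe.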

Stripe is certified under several frameworks for privacy and security, including the EU-U.S. Data Privacy Framework (EU-U.S. DPF), the UK Extension, the Swiss-U.S. DPF, and the CBPR and PRP systems. These certifications demonstrate their commitment to international data protection standards. Stripe does not sell customer data. The data collected is primarily used for payment processing, fraud detection, and improving services, with strict data protection policies in place. Additionally, any personal data used for training their fraud models is aggregated or anonymized.

Fal AI (Image and Media Generation)

Marv uses Fal AI for image generation models. Fal is a broker for various image and video generation models, and I chose it for Marv because it gives Marv a lot of room to grow. Right now I support image generation in several different formats (such as landscape or portrait), but I imagine more functionality coming to Marv in the future, and Fal provides a way to get there by supporting more complex workflows and alternative models (e.g. Stable Diffusion, Flux, and a lot more). They are SOC 2 compliant, and their API is secured using both transport layer security and JWT tokens and secrets. I am confident that the traffic is secure to and from their service. They also do not expose image generations to the public, so they are private by default. The images that Marv generates are cleared for commercial use, so you can confidently use them in any project that you are working on.

There is no indication that Fal AI sells user data. Their focus is on providing secure AI services, with a clear recommendation to avoid embedding sensitive information directly into client applications. They retain images generated for at least seven days, and may delete them after that but don't have a default setting. When you generate an image with Marv, you'll receive it via MMS text message so no need to worry that they will disappear from Fal's servers. We use a server side proxy to communicate with Fal and no user data (beyond the exact prompt for the image) is ever sent to them so there is no link to you on their servers at all.

Trust Center

fal.ai is a technology company headquartered in San Francisco, California with a globally distributed team. Specializing in AI inference for generative media, Fal.ai focuses on optimizing inference speeds and efficiency, enabling developers to create advanced, scalable applications. Supported by leading Silicon Valley investors, the company aims to make generative AI accessible, reducing barriers to creative expression in media and other domains.

Tavily (SERP data)

Marv uses Tavily for SERP (search engine results page) data. It's a way to let him search the web that is curated for higher quality content. The results are served over HTTPS, and no personal or identifying information is ever sent to their API. There is no way for them to distinguish which user of Marv is performing which search. Further, they do not sell user data or use it for model training. There is no formal security certification here, but, similar to Fal, there is little risk of anything identifying or personal finding its way into their systems, and even if it did, there is no way to link it to a specific Marv user.

Summary

I have been very mindful about choosing technology partners for Marv that value privacy and security and, wherever possible, guarantee it. I have built this system in such a way that even in the case of a third-party breach, you will be protected. If you have any questions or concerns, I would be happy to address them. Reach out at help@textmarv.com any time! It's just me working on this, but I have put the same level of rigour into the selection process of these partners as I would working at an enterprise company or a startup, and I hope that shows.

Since Marv is text message based (for now), any kind of breach is incredibly unlikely, and I will be reviewing these partners and choices regularly. You also have full control over your data and messaging history and can send "clear" to Marv at any time to have him completely forget you. That's your right. Marv will miss you, but he will always respect your right to be forgotten.

Question	OpenAI	Twilio	PostHog	Zep Cloud	fal.ai	MongoDB	Tavily	Stripe
Security Certifications	SOC 2 Type 2, SOC 3, TX-RAMP	ISO 27001, 27017, 27018	SOC 2, HIPAA support	SOC 2 Type II (in progress)	SOC 2; secure handling of secrets	SOC 2, GDPR, HIPAA, ISO/IEC 27001	No explicit certifications; emphasizes secure data handling	EU-U.S. DPF, UK Extension, Swiss-U.S. DPF, CBPR, PRP
Chance of Selling Data	Does not sell data; APIs exclude data from model training	No, complies with GDPR; data not sold	No, privacy-focused, self-hosting available	No indication of data being sold	No, focused on secure serverless AI	Does not sell user data; data used for service improvements	No indication of data being sold; used for service delivery	Does not sell customer data; data used for fraud prevention
Communication Security	HTTPS enforced, encrypted data transmission	HTTPS, request validation, encrypted media for specific products	Requires HTTPS; supports secure cookies	HTTPS and strong encryption mechanisms	Uses JWT tokens; strong authentication recommended	TLS enforced for encrypted communication by default	HTTPS ensures encrypted communication	HTTPS, TLS, and mutual TLS for server-to-server communication
Data Deletion & Retention Policies	APIs retain data up to 30 days; users can request deletion	GDPR-compliant; users can manage and delete data	Data retained up to 7 years; deletion configurable for self-hosted setups	Data deleted on request	Files retained for at least 7 days; no broad retention policy specified	User-controlled data deletion; retention policies user-configurable	Limited details; users must contact support for specific data deletion	Retention based on legal/regulatory requirements; data deletion supported
Anonymity and Avoiding PII	Yes, anonymized inputs possible	Yes, supports anonymization through API features	Yes, can be configured to avoid PII; supports anonymization	Yes, users control data submitted to APIs	Yes, promotes anonymous operations using server-side proxies	Yes, client-side encryption available to anonymize sensitive data	Customizable API usage; anonymization requires user configuration	Fraud protection and sensitive data handled via encryption; PII minimized

^ Note - This table should work better in landscape mode on mobile. Tables are tough.

💡 To try out Marv for yourself, check out www.textmarv.com and upgrade your phone with your new smartest contact.

Schedule your daily briefing with Marv today. He can act as your own personal accountability partner, create a new phone wallpaper for you, or summarize news from around the web.