AI Data Engineer
Terra Security
About Us
Terra Security is on track to become the next breakout cybersecurity company. As the winner of the 2025 AWS + CrowdStrike + NVIDIA Cybersecurity Startup Accelerator, Terra has earned recognition from some of the most influential names in modern security. The company has raised $38 million to date, including a $30 million Series A led by Felicis Ventures with participation from Dell Technologies Capital, Silicon Valley CISO Investments (SVCI), SYN Ventures, LAMA Partners, Underscore VC, and Capital One Ventures.
Terra’s platform is powered by a swarm of fine-tuned AI agents with human-in-the-loop oversight, delivering unmatched efficiency, accuracy, and continuous attack surface coverage. It runs thousands of best-in-class tests and crafts tailored, exploit-driven assessments based on each organization's unique business logic and risk profile.
Summary
As an AI Data Engineer, you are the architect of the intelligence supply chain. You don't just move data; you engineer the "memory" and "context" that power our AI. You bridge the gap between traditional massive-scale data lakes and the high-stakes world of agentic flows, building the infrastructure that allows our AI to think, retrieve, and act with precision. We are looking for a "Builder Profile": someone who thrives in the gray area between data science, data engineering, and full-stack development.
What You'll Do
- Design and manage end-to-end data ecosystems that ingest, clean, and organize unstructured data from fragmented sources into high-performance Vector Databases.
- Build the "Context Engineering" pipelines that feed our AI agents, ensuring they have the right data at the right millisecond to reduce hallucinations and maximize reasoning.
- Design and monitor Agentic Flows and perform deep Trace Analysis to debug how data processing impacts model performance.
- Leverage Cursor and automated agents to accelerate code generation, optimize SQL/Python, and maintain a "Fast Delivery" culture.
- Act as a full-stack contributor: building the APIs that expose data to models while ensuring the underlying infrastructure is robust, scalable, and secure.
What You'll Bring
- 3-5 years of experience building end-to-end applications (Node.js, Python, or Go) and managing cloud infrastructure (MUST).
- Expert-level knowledge of ETL/ELT, data lake architectures (Delta Lake, Iceberg), and real-time processing (Kafka/Flink).
- Familiarity with Vector DBs (Pinecone, Weaviate, Milvus), orchestration frameworks (LangChain, LlamaIndex), and trace analysis tools.
- Ability to perform complex, large-scale analytics to prove model ROI.
- You understand the fundamentals of LLMs, embedding models, and how data quality directly influences RAG (Retrieval-Augmented Generation).
- You have a "can-do" attitude and a bias toward action. You prefer shipping a working prototype today over a perfect slide deck tomorrow.
- You can jump from a cybersecurity data audit to optimizing a real-time streaming pipeline without breaking stride.
- You view data as an asset to be refined, not just a chore to be managed.
Advantage
- Cyber Background: Experience in data privacy, PII masking, or adversarial data testing.
- Prior experience in high-growth startups or early-stage product development.