* Salary: R$ 11,000 to R$ 20,000 per month (estimated)
* The amount shown is an estimate based on public data and market benchmarks. We do not guarantee that this is the salary offered for this specific position.
Area: Information Technology
Level: Senior
What makes us Confidencial (Apenas para Cadastrados)?
A Gartner® Magic Quadrant™ Leader for 15 years in a row, Confidencial (Apenas para Cadastrados) transforms complex data landscapes into actionable insights, driving strategic business outcomes. Serving over 40,000 global customers, our portfolio leverages pervasive data quality and advanced AI/ML capabilities that lead to better decisions, faster.
We excel in integration and governance solutions that work with diverse data sources, and our real-time analytics uncover hidden patterns, empowering teams to address complex challenges and seize new opportunities.
The Principal Data SRE Role - AI Platform & Reliability
The Principal Data SRE is the foundational architect for the infrastructure supporting Confidencial (Apenas para Cadastrados)'s internal AI transformation. You will join an elite architecture group to build the "Customer Zero" environment, the premier environment that defines Confidencial (Apenas para Cadastrados)'s production standards for AI and Data Integration. Your mandate is to ensure that the shift to a Retrieval-Augmented Generation (RAG) based, "Iceberg-first" data estate is stable, secure, and highly available. You will own reliability for MCP servers, vector databases, and knowledge graph infrastructure, ensuring AI systems have governed, performant access to enterprise data. You will lead the "build from scratch" mandate with high autonomy, prioritizing site reliability and building out robust Iceberg infrastructure while ensuring data engineering workflows are frictionless. This role also serves as a strategic bridge to a larger global SRE cohort across the US and Canada, aligning global standards with local innovation.
What makes this role interesting?
Some of the challenges you’ll work on include:
- Operating mission-critical AI infrastructure: Ensuring reliability and observability for vector databases, graph databases, and MCP servers powering agentic AI pipelines and autonomous workflows.
- Designing resilient distributed systems at global scale: Implementing fault isolation using N+M cluster architectures, predictive monitoring, and self-healing patterns to detect anomalies before they impact service.
- Architecting the Iceberg-first data estate: Leading the infrastructure implementation of Apache Iceberg, defining reference architectures and Architecture Decision Records (ADRs) for metadata lifecycle management, manifest file optimization, and snapshot retention.
- Supporting high-performance AI retrieval systems: Maintaining infrastructure capable of delivering high-dimensional semantic search with sub-second P99 query latency.
- Collaborating with principal engineering leaders: Partnering with the Principal Applied AI Engineer and Principal Data Engineer to remove architectural bottlenecks and support large-scale AI workloads.
- Enabling frictionless data engineering workflows: Developing self-service infrastructure and guardrails-as-code that allow data engineers to deploy and scale pipelines without operational friction.
- Mentoring the engineering organization: Providing reliability guidance and scalable infrastructure patterns to the Senior Data Engineer and broader engineering team.
Here’s how you’ll be making an impact:
In this role, your work will directly influence how AI systems access, process, and learn from enterprise data. You’ll ensure the platform is resilient, efficient, and ready to support future innovation.
- Defining reliability standards for AI infrastructure: Establishing Service Level Objectives (SLOs) and observability for vector databases, graph databases, and MCP servers.
- Ensuring secure and scalable MCP communication: Managing transport layers, authentication, and secure communication between host applications and AI systems.
- Operating the Iceberg lakehouse architecture: Maintaining data integrity, metadata consistency, and snapshot lifecycle management across the Iceberg-first data estate while maintaining a reliable open lakehouse foundation for AI knowledge workloads.
- Reducing operational toil through automation: Implementing predictive monitoring, self-healing systems, and automation-first infrastructure to minimize manual operational work.
- Optimizing the economics of AI infrastructure: Driving cost efficiency using smart cloud infrastructure strategies, monitoring the unit economics of AI workloads, and implementing serverless metadata polling, adaptive optimization, and semantic caching.
- Enforcing architectural guardrails: Ensuring security, compliance, and data protection by design across internal AI systems.
- Raising the bar for reliability engineering globally: Acting as a technical multiplier by coaching global teams on modern SRE practices and scaling reliability standards across Confidencial (Apenas para Cadastrados)’s AI platform.
We’re looking for a teammate with:
Required
- Experience: 8+ years leading SRE or infrastructure architecture in multi-cloud environments.
- Infrastructure Expertise: Deep expertise in Kubernetes (K8s) scaling for data workloads, Terraform or Pulumi for IaC, and modern observability stacks (e.g., OpenTelemetry).
- Lakehouse Experience: Proven record of managing Apache Iceberg at scale, including metadata management, partitioning strategies, and cross-engine consistency.
- Influence Skills: Strong ability to translate strategic vision into technical guardrails and mentor engineering teams on reliability best practices.
- Language: Expert-level English (Native or Professional) for global collaboration across time zones.
Preferred
- MCP Experience: Experience with the Model Context Protocol (MCP), including hosting secure, scalable MCP servers and managing transport layers for AI systems.
- Confidencial (Apenas para Cadastrados) Product Familiarity: Experience with Confidencial (Apenas para Cadastrados) Sense, Confidencial (Apenas para Cadastrados) Cloud, or Confidencial (Apenas para Cadastrados) Talend Cloud.
- Automation Focus: Background in software engineering with a focus on building automation-first workflows and reducing MTTR.
The location for this role is:
Office Location, São Paulo, Brazil
Hybrid
Apply now and help change how the world transforms complex data landscapes into actionable insights and turns complex data challenges into new opportunities!
More about Confidencial (Apenas para Cadastrados) and who we are:
Find out more about ‘Life at Confidencial (Apenas para Cadastrados)’ on social media: Instagram, LinkedIn, YouTube, and X/Twitter. To see our values and all other opportunities to join us, check out our Careers Page.
What else do we offer?
- Genuine career progression pathways and mentoring programs.
- Culture of innovation, technology, collaboration, and openness.
- Flexible, diverse, and international work environment.
Giving back is a huge part of our culture. Alongside an extra “change the world” day plus another for personal development, we also highly encourage participation in our Corporate Responsibility Employee Programs.
If you need assistance applying for a role due to a disability, please submit your request via email to accessibilityta@Confidencial (Apenas para Cadastrados).com. Any information you provide will be treated according to Confidencial (Apenas para Cadastrados)’s Recruitment Privacy Notice. Confidencial (Apenas para Cadastrados) may only respond to emails related to accommodation requests.
Confidencial (Apenas para Cadastrados) is not accepting unsolicited assistance from search firms for this employment opportunity. Please, no phone calls or emails. All resumes submitted by search firms to any employee at Confidencial (Apenas para Cadastrados) via email, the Internet or in any form and/or method without a valid written search agreement in place for this position will be deemed the sole property of Confidencial (Apenas para Cadastrados). No fee will be paid in the event the candidate is hired by Confidencial (Apenas para Cadastrados) as a result of the referral or through other means.
Work Location: Hybrid remote in Vila Olímpia, SP
