Principal Data Architect



United States · Remote
Posted on Friday, March 22, 2024

Predictive analytics and machine learning power Socure’s groundbreaking technology and fuel our mission to verify 100% of good identities in real time and completely eliminate identity fraud on the internet.

Socure is the world leader in digital identity verification and fraud prevention. Our recent awards include Forbes 2022 America’s Best Startup Employers, The Forbes Cloud 100, The Deloitte Technology Fast 500, and Inc. 5000’s fastest growing companies.


As Socure's Principal Data Architect, your core responsibility is to architect and oversee the company's data strategies and infrastructure, supporting key product suites such as Fraud, KYC, and DocV. This role involves using cloud technologies to craft scalable, efficient data solutions for a variety of uses, including ML training and analytics. You will unify data architecture for both commercial and public-sector applications, ensuring compliance with privacy and security standards. Collaboration with product managers, engineers, and legal teams is key to translating requirements into actionable data strategies. The ideal candidate will have extensive experience in data architecture within cloud environments, expertise in data modeling, and strong skills in Python and SQL. Your leadership will also extend to mentoring junior team members and fostering best practices in data management and architecture.


  • Own Socure’s data strategy and the architecture of our identity graph datasets to support a wide range of Socure’s product suites, including Fraud, KYC, and DocV, with high data quality.

  • Leverage cutting-edge cloud-native technologies to design and optimize data solutions that support multiple use cases such as ML training, feature engineering, online serving, analytics, and business intelligence.

  • Drive a unified data architecture to serve both Socure’s commercial and public-sector environments.

  • Partner with product managers across business lines to translate product requirements into scalable and cost-efficient data solutions and guide data engineers on implementation.

  • Partner with Socure’s legal, privacy, and security teams on the design and implementation of scalable data policies and processes to ensure Socure’s use of Personally Identifiable Information (“PII”) data sets complies with legal, privacy, and security requirements, including when responding to data subject requests.

  • Partner with engineering teams on aligning delivery timelines, making trade-off calls, and guiding data engineering best practices.

  • Partner with the data acquisition team to evaluate vendor datasets and guide their integration with Socure’s ecosystem.

  • Partner with Socure’s platform vendors (e.g., AWS, Databricks, Fivetran) to evaluate new features and incorporate the latest and greatest into Socure’s data architecture.

  • Provide technical leadership, guidance, and mentorship to junior team members, fostering their growth and ensuring best practices are followed.


  • 10+ years of experience in architecting big-data/low-latency solutions and in technical leadership.

  • You like to think at scale and to design, develop, and operate production data stores, pipelines, and services that meet goals of low latency, high availability, resiliency, security, and quality.

  • Experience in architecting large-scale data solutions in modern cloud environments (AWS, GCP, etc.).

  • Deep knowledge of both data lakes and data warehouses.

  • Extensive experience in data architecture and modeling with both relational and NoSQL technologies.

  • Extensive hands-on experience with modern big-data technologies such as Apache Spark or Presto. Proficient in building complex large-scale data pipelines with both SQL and dataframes.

  • Strong programming skills and hands-on development experience in Python and SQL.

  • Strong problem-solving and analytical skills, with the ability to design and implement efficient algorithms to process terabytes of data.

  • Hands-on experience in building streaming data pipelines with low latency. Familiar with tools such as Kafka, Spark Structured Streaming, Apache Flink, etc.

  • Empathy for people and how they use your work; you constantly think about improving your work to raise your customers’ productivity.

  • Experience with collecting, using, and managing PII in compliance with privacy laws, including US and international laws such as the CCPA and GDPR.

Preferred Qualifications:

  • Experience in MLOps is a plus.

  • Experience in building low-latency ML feature data pipelines is a plus.

  • Hands-on experience with the Databricks ecosystem is a plus.

  • Experience in building production-scale LLM and RAG systems is a big plus.

Salary Disclosure:

Base Salary range: $200,000 - $250,000

This represents the expected salary range for this job requisition. Final offers may vary from the amount listed based on factors including geography, candidate experience and expertise, and other job-related factors. Socure's compensation and rewards package for full-time roles includes a market-competitive salary, equity, comprehensive benefits, and, for applicable roles, commission plans or an annual discretionary performance bonus.

Socure is all about encouraging people to push the boundaries of what’s possible through top-tier performance, innovation, ownership, and shared expertise.

We empower excellence by providing great perks and benefits to both our fully remote employees in North America and our hybrid teams in India.

Socure is an equal opportunity employer and values diversity of all kinds at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
