Data Scientist II - Big Data R&D, Identity Graph & KYC

Socure

🌍 North America 🏠 Remote ⏱ Part-time 💼 Mid-level 🗓 2 weeks ago

WHY SOCURE?

Socure is building the identity trust infrastructure for the digital economy — verifying 100% of good identities in real time and stopping fraud before it starts. The mission is big, the problems are complex, and the impact is felt by businesses, governments, and millions of people every day.

We hire people who want that level of responsibility. People who move fast, think critically, act like owners, and care deeply about solving customer problems with precision. If you want predictability or narrow scope, this won’t be your place. If you want to help build the future of identity with a team that holds a high bar for itself — keep reading.

ABOUT THE ROLE

The Big Data R&D team is responsible for building the core identity graph and entity-resolution capabilities that power Socure’s KYC and compliance products. In this role, you will help develop graph-based algorithms and data pipelines on massive PII datasets, support modelers with high-quality features, and evaluate new data sources that feed our identity and fraud products. You will work closely with senior data scientists and engineers while developing your skills in large-scale ML, distributed systems, and graph analytics.

WHAT YOU'LL DO

- Contribute to the design and implementation of machine learning, data mining, statistical, and graph-based algorithms to analyze very large datasets for identity verification and anomaly detection.

- Analyze large datasets to help develop and refine entity-resolution and identity-matching algorithms that drive Socure’s KYC and compliance solutions.

- Build and maintain components of data-processing pipelines (ETL, feature generation, normalization) using tools such as Spark/PySpark and AWS (e.g., EMR, S3).

- Support senior data scientists with feature engineering, data exploration, error analysis, and A/B test setup for new models and signals.

- Help evaluate new third‑party and internal data sources: profile data quality, design offline experimen...
