Socure

The leading provider of digital identity verification and fraud solutions. Salesinfo@socure.com

Staff Data Scientist – Entity Resolution, IDGraph

Full Time | Remote | Team 501-1,000 | Since 2012 | H1B Sponsor

Location

United States

Posted

3 days ago

Salary

$170K - $205K / year

Postgraduate Degree | 5 yrs exp | English | PySpark | Python

Job Description

  • Lead the evaluation and continuous improvement of entity resolution and entity linking pipelines.
  • Debug new builds, identify anomalies, and recommend modeling or system-level improvements.
  • Define, implement, and maintain scalable performance and quality metrics, leveraging automation and LLM-based approaches where appropriate.
  • Partner with Engineering to optimize entity linking and ranking systems using Learning-to-Rank and related techniques.
  • Design methods to assess and classify entity confidence and quality across the graph.
  • Design and implement a comprehensive data quality framework for graph-based identity data.
  • Translate abstract quality concepts (e.g., reliability, stability, consistency) into measurable signals.
  • Use data quality insights to guide modeling decisions, experimentation strategy, and product prioritization.
  • Identify and operationalize generalized, high-impact predictive signals derived from graph structure, temporal dynamics, and relational patterns.
  • Develop scalable approaches to link prediction, label propagation, and semi-supervised learning within the ID Graph.
  • Explore and evaluate advanced graph modeling techniques, including graph-based ML, knowledge graph methods, and Graph Neural Networks (GNNs), when appropriate.
  • Focus on durable abstractions rather than one-off features, ensuring solutions are explainable, compliant, and reusable across multiple products.
  • Collaborate closely with Engineering, Product Management, Compliance, and downstream product teams.
  • Act as a technical leader within the Identity organization, influencing modeling standards, experimentation rigor, and best practices.
  • Translate complex technical findings into clear insights and recommendations for both technical and non-technical stakeholders.
  • Support the launch of new product capabilities built on top of the ID Graph.
  • Demonstrate strong ownership, strategic impact, and assertive communication.
  • Mentor peers, foster a culture of growth, and build authentic relationships across teams.
  • Embrace feedback, adapt resiliently to challenges, and pursue continual self-improvement.

Job Requirements

  • Master’s or PhD in Computer Science, Data Science, Machine Learning, Statistics, Mathematics, or a related field
  • 5+ years of experience in applied data science, machine learning, or artificial intelligence, with a focus on graph-based modeling and large-scale data systems
  • Strong proficiency in Python and PySpark
  • Deep experience with Classification models, Learning-to-Rank, Anomaly Detection, Statistical Modeling
  • Experience building and maintaining production-grade ML systems at scale
  • Hands-on experience with Databricks
  • Familiarity with graph databases and query languages such as NeptuneDB and OpenCypher
  • Experience with graph processing frameworks (e.g., GraphFrames)

Benefits

  • Offers Equity
  • Offers Bonus
