Entity Resolution 101 from Paco Nathan

In the world of data science, connecting the dots between fragmented pieces of information is both a challenge and an art. In our latest GraphGeeks Explainer session, Senzing’s Paco Nathan provided an overview of entity resolution—what it is, why it matters, and how graph-based approaches are revolutionizing this complex problem.

Introducing Senzing

Senzing was founded by Jeff Jonas—dubbed the "wizard of Big Data" by National Geographic—and has been at the forefront of innovation in entity resolution since its inception. Jeff's journey began in the mid-1980s as a young software developer in Las Vegas, where he developed groundbreaking techniques for non-obvious relationship awareness (NORA), which became a key tool for detecting fraud and identifying threats.

Senzing spun out of IBM in 2016 and leverages over two decades of expertise in entity resolution. Today, a team of 30 professionals continues to drive advancements in data integrity. Senzing’s SDK is used for a variety of mission-critical applications, from voter registration to anti-fraud measures and cybersecurity. Whether in highly secure environments or public sector applications, these solutions power industries that require precision and accuracy in data matching, helping organizations make sense of complex, messy data to make better, faster decisions.

What is Entity Resolution?

Entity resolution (ER) is the process of identifying and linking records that refer to the same real-world entity across different data sources. Whether in healthcare, finance, or customer management, ER ensures that duplicate, conflicting, or incomplete records don’t create inconsistencies in decision-making.

At its core, entity resolution is about making sense of messy data—disambiguating names, addresses, products, or organizations that might be represented in different ways. Traditional methods rely on heuristics and rule-based systems, but modern approaches leverage machine learning and graph-based techniques to achieve far greater accuracy.
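To make the "heuristics and rule-based systems" baseline concrete, here is a minimal Python sketch of a traditional rule-based matcher. It is an illustration only, not Senzing's method: the record fields (`name`, `postcode`), the `same_entity` rule, and the 0.85 threshold are all assumptions chosen for the example.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text.lower())
    return " ".join(kept.split())


def name_similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] between two normalized names."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def same_entity(rec_a: dict, rec_b: dict, threshold: float = 0.85) -> bool:
    """Hypothetical rule: names are similar AND postal codes agree exactly."""
    names_close = name_similarity(rec_a["name"], rec_b["name"]) >= threshold
    same_postcode = rec_a["postcode"] == rec_b["postcode"]
    return names_close and same_postcode


# Made-up records for illustration.
a = {"name": "Jon A. Smith", "postcode": "89101"}
b = {"name": "John A Smith", "postcode": "89101"}
c = {"name": "Joan Smythe", "postcode": "10001"}

print(same_entity(a, b))  # True: near-identical names, same postcode
print(same_entity(a, c))  # False: the postcodes differ
```

Rules like this are easy to read and audit, but every threshold and field combination has to be hand-tuned, which is part of why the field has moved toward learned and graph-based approaches.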

Graphs are a Natural Fit for Entity Resolution

In this explainer, Paco emphasized that graph-based models are particularly well-suited for entity resolution because they inherently capture relationships and context. Relational databases store records in tables and express relationships through joins; graphs model the connections between entities directly, making it easier to detect patterns and resolve ambiguities.

For example, in a customer database, a person might be recorded multiple times due to variations in name spelling or changes in contact details. A graph model doesn’t just compare individual records; it examines the surrounding relationships—shared phone numbers, addresses, or transaction histories—to infer whether two records actually represent the same entity.
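The customer-database example can be sketched in a few lines of Python. This is a deliberately simplified illustration, not how Senzing or any production system works: records are treated as graph nodes, any shared phone number or address becomes an edge, and connected components (found with a small union-find) are taken as resolved entities. The records, field names, and the `resolve` function are all hypothetical.

```python
from collections import defaultdict

# Made-up customer records; field names are assumptions for this sketch.
records = {
    "r1": {"name": "Maria Garcia", "phone": "555-0101", "address": "12 Oak St"},
    "r2": {"name": "M. Garcia", "phone": "555-0101", "address": "98 Pine Ave"},
    "r3": {"name": "Maria G.", "phone": "555-0199", "address": "98 Pine Ave"},
    "r4": {"name": "Li Wei", "phone": "555-0142", "address": "7 Elm Rd"},
}


def resolve(records: dict, keys=("phone", "address")) -> list:
    """Group records that are connected through shared attribute values."""
    parent = {rid: rid for rid in records}

    def find(x):
        # Follow parent pointers to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Index records by (attribute, value); records sharing a value get linked.
    by_value = defaultdict(list)
    for rid, rec in records.items():
        for key in keys:
            by_value[(key, rec[key])].append(rid)
    for rids in by_value.values():
        for other in rids[1:]:
            union(rids[0], other)

    # Collect each connected component as one resolved entity.
    clusters = defaultdict(set)
    for rid in records:
        clusters[find(rid)].add(rid)
    return sorted(sorted(cluster) for cluster in clusters.values())


print(resolve(records))  # [['r1', 'r2', 'r3'], ['r4']]
```

Note that r1 and r3 share no attribute directly; they end up in the same entity only because both connect to r2. That transitive linking is what the graph view adds over pairwise comparison. In practice a single shared attribute is weak evidence (a shared address may just mean roommates), so real systems weigh several signals before merging.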

Cultural Differences and the Globalization Challenge

Entity resolution becomes even more complex when applied across different cultures and languages. Paco highlighted how names, addresses, and business entities vary dramatically across regions. A name that appears common in one culture may have multiple variations in another, and address formatting differs widely across countries.

Globalization has increased the need for robust ER solutions that can handle multilingual datasets, diverse naming conventions, and regional business structures. Graph-based approaches help by capturing these variations and linking entities based on deeper contextual relationships rather than surface-level similarities. This is particularly important for organizations working across international markets, where failing to resolve entities accurately can lead to critical business and compliance risks.
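One small, concrete piece of the multilingual problem is that the same name can arrive with or without diacritics, in different letter case, or in a different word order. The Python sketch below uses the standard library's `unicodedata` module to build a comparison key that ignores those surface differences. The helper names `fold_name` and `name_key` are hypothetical, and this is only a first step: it does not transliterate between scripts or handle culture-specific naming rules.

```python
import unicodedata


def fold_name(name: str) -> str:
    """Strip diacritics, case-fold, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    # Combining marks (accents, tildes) are separate code points after NFKD.
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(no_marks.casefold().split())


def name_key(name: str) -> frozenset:
    """Order-insensitive key, so 'Family, Given' matches 'Given Family'."""
    return frozenset(fold_name(name).replace(",", " ").split())


print(fold_name("José  Núñez"))                           # jose nunez
print(name_key("Núñez, José") == name_key("JOSE NUNEZ"))  # True
```

Folding this aggressively trades precision for recall, since distinct names can collapse to the same key, so a key like this is better used to propose candidate matches than to confirm them.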

The Future of Entity Resolution

As data continues to grow in volume and complexity, entity resolution will become even more critical. Paco pointed out that advancements in AI, knowledge graphs, and federated learning are shaping the next generation of ER solutions. By integrating these technologies, organizations can move beyond traditional data matching and toward a more holistic, intelligent way of understanding entities.

For anyone working with large-scale data, entity resolution is not just a technical challenge—it’s a fundamental step toward data integrity and trust. Thanks to thought leaders like Paco Nathan, we’re seeing just how powerful graph-based approaches can be in solving this problem at scale.

Catch Senzing and Linkurious at the Gartner Data & Analytics Summit

If you're attending the Gartner Data & Analytics Summit in Orlando next week, be sure to catch an exciting preview of some groundbreaking work Senzing is doing with its partners at Linkurious, a graph visualization company.

Paco will be presenting alongside the Linkurious team in the theater area right before cocktails on Monday evening, offering a sneak peek into this partnership. It’s a good opportunity to see how Senzing’s entity resolution technology and Linkurious’ graph visualization expertise are coming together to provide solutions for business audiences.

Stop by Booth 132 to say hello and learn more! You can register for the event here.
