# Semantic Types: Making Data Mean Something

Q: What are the goals of this project?

A: Allow users to better organize their information.

If you’ve spent time around computers, you’ve probably heard of data types: strings, integers, floats, and so on. These are the building blocks of how computers think about information. But while data types tell computers how to store and manipulate data, they don’t tell them what the data means. And that’s where semantic types come in.

Semantic types aren’t about syntax; they’re about meaning. They’re a way of saying, “Hey, this isn’t just a string—it’s an email address” or “This isn’t just a number—it’s someone’s age.” They’re the layer of understanding that bridges the cold, mechanical world of data types and the messy, human world of meaning.

### Why Semantic Types Matter

The world is complex and often appears chaotic. Language, too, can be imprecise, mirroring the intricacies of reality itself. To manage this complexity, we simplify things. We distill vast, nuanced information into simplified points that allow us to make sense of our surroundings. This balancing act—between oversimplification and overloading with nuance—is where semantic types shine.

Human languages thrive on nuance and complexity, but tools like lists, databases, and programming languages often oversimplify things. They’re designed to make building systems easier by taking a general-purpose approach. However, this oversimplification creates a gap—one that forces developers to repeatedly create custom solutions due to a lack of standardized tools that support truly flexible and adaptable patterns.

Let’s start with an example. Imagine you’re building a web form. It’s collecting all kinds of data: names, email addresses, dates, phone numbers. The form itself doesn’t care what kind of data it’s collecting; to the computer, it’s all just strings. But you care, because you want to make sure people aren’t putting “Banana” in the email address field.

If your form recognizes that a field’s semantic type is “email address,” you can validate it properly. Suddenly, you’ve got a smarter system—one that understands a little bit about the world it’s operating in. That’s the power of semantic types. They let systems reason about data based on what it *is*, not just how it’s stored.
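As a minimal sketch of that idea, a form field can carry a semantic type label next to its raw string, and validation can branch on the label. The `FormField` class, the `semantic_type` labels, and the deliberately simple email pattern are all assumptions made for illustration, not a complete email validator.

```python
import re
from dataclasses import dataclass

# Deliberately simple pattern for illustration only; real-world email
# validation is more involved than this.
_EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


@dataclass(frozen=True)
class FormField:
    """A form field that carries a semantic type alongside its raw string."""
    name: str
    semantic_type: str  # hypothetical label, e.g. "email" or "text"
    value: str


def is_valid(field: FormField) -> bool:
    """Validate a field by what it means, not just by how it is stored."""
    if field.semantic_type == "email":
        return _EMAIL_PATTERN.fullmatch(field.value) is not None
    # Fields without a known semantic type are accepted as plain strings.
    return True


print(is_valid(FormField("contact", "email", "name@example.com")))  # True
print(is_valid(FormField("contact", "email", "Banana")))            # False
print(is_valid(FormField("nickname", "text", "Banana")))            # True
```

The same string, "Banana", is rejected in one field and accepted in another. The only thing that changed is the meaning attached to the field.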

### Semantic Types in the Wild

You’ve already met semantic types, even if you didn’t know it. Here are some everyday examples:

- Email Addresses: A string that follows a specific format (e.g., `name@example.com`).
- Currency: A number paired with a currency symbol (like \$ or €) that’s formatted in a specific way.
- Dates: A value that represents a point in time, not just numbers.
- Geolocation: Latitude and longitude pairs, which are just numbers but carry a very specific meaning in context.

These types exist in almost every app or system you use, from spreadsheets to social media platforms. They’re what make systems feel intelligent—like when your phone recognizes an address in a text and lets you open it in Maps.
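The geolocation example can be sketched as a small type that refuses values that are not plausible coordinates. The `GeoPoint` name and its fields are hypothetical, chosen only to illustrate the point that two floats become a location once the valid ranges are enforced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeoPoint:
    """A latitude/longitude pair in decimal degrees (hypothetical type)."""
    latitude: float
    longitude: float

    def __post_init__(self) -> None:
        # Two floats are only a location if they fall inside the valid ranges.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError(f"latitude out of range: {self.latitude}")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError(f"longitude out of range: {self.longitude}")


paris = GeoPoint(48.8566, 2.3522)
print(paris)

try:
    GeoPoint(120.0, 45.0)  # latitude cannot exceed 90 degrees
except ValueError as error:
    print(f"rejected: {error}")
```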

### Why Developers Love Semantic Types

If you’re a developer, semantic types are like a cheat code. They make your life easier in a bunch of ways:

- Validation: Knowing that a field is an “email” or “date” lets you apply specific rules to check its validity. No more “123abc” in your email fields.
- Interoperability: Semantic types help systems talk to each other. If one app calls something an “address” and another app calls it a “postal location,” semantic typing can align them.
- Automation: Imagine you’re processing invoices, and your system knows that a column is “currency.” It can automatically apply currency formatting or conversions. That’s automation magic.
- Contextual Actions: Semantic types enable smart behaviors. For example, a phone number can trigger a “click-to-call” feature.
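One way to picture contextual actions is a lookup from semantic type to behavior. The sketch below assumes a hypothetical `RENDERERS` registry and `render` helper that turn a phone number into a click-to-call link using the standard `tel:` URI scheme, while values with no registered type pass through unchanged.

```python
from typing import Callable, Dict

# Hypothetical registry: semantic type label -> function that renders a value.
# Values are not HTML-escaped here; a real renderer would need to do that.
RENDERERS: Dict[str, Callable[[str], str]] = {
    "phone": lambda value: f'<a href="tel:{value}">{value}</a>',
    "email": lambda value: f'<a href="mailto:{value}">{value}</a>',
}


def render(semantic_type: str, value: str) -> str:
    """Render a value with the action registered for its semantic type."""
    renderer = RENDERERS.get(semantic_type)
    # Unknown semantic types fall back to the plain value.
    return renderer(value) if renderer else value


print(render("phone", "+15550100"))
# <a href="tel:+15550100">+15550100</a>
print(render("note", "call me later"))
# call me later
```

Adding a new behavior means registering one more entry, and the calling code stays the same.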

### A Little Philosophy: Data Types vs. Semantic Types

Here’s a thought experiment. You have a number: `42`. What does it mean? It could be:

- Someone’s age
- The answer to the question we’re trying to figure out
- A jersey number
- A quantity of apples

Without context, the number is meaningless. Data types tell us it’s an integer. Semantic types tell us it’s “the age of a person” or “a jersey number.” This difference between how data is stored and what it means is the gap semantic types aim to fill.
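The thought experiment can be made concrete with two thin wrapper types around the same integer. `Age`, `JerseyNumber`, and `is_adult` (with its default threshold of 18) are hypothetical names and values introduced only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Age:
    """An age in whole years."""
    years: int


@dataclass(frozen=True)
class JerseyNumber:
    """A number printed on a player's jersey."""
    number: int


def is_adult(age: Age, minimum: int = 18) -> bool:
    """Only meaningful for ages, so anything else is rejected at runtime."""
    if not isinstance(age, Age):
        raise TypeError(f"expected Age, got {type(age).__name__}")
    return age.years >= minimum


age = Age(42)
jersey = JerseyNumber(42)

print(age.years == jersey.number)  # True: the stored integer is identical
print(age == jersey)               # False: the meanings differ
print(is_adult(age))               # True

try:
    is_adult(jersey)
except TypeError as error:
    print(f"rejected: {error}")
```

Both values hold the integer 42, yet they do not compare equal, and a function written for ages refuses a jersey number.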

### Filling the Gaps in Programming Languages

In Object-Oriented Programming (OOP), a class holds data through its member fields. But here’s the catch: these fields often mix wildly different concerns. Business concepts sit side by side with fields for presentation logic, persistence, APIs, or program state. All of this is bundled together into a single type, connected by basic relationships like composition, aggregation, and association. While this works to some extent, developers often find themselves writing custom code to model more nuanced relationships—relationships that aren’t natively supported by the language.

Can this be done better?

What if we could extend programming languages to let developers define their own relationships? Imagine being able to reuse and standardize those relationships, building libraries of flexible, expressive patterns to model entities more naturally. What if, instead of being bound by the limited constructs of today’s languages, we could design systems where relationships between entities are as clear, nuanced, and powerful as the entities themselves?

These constructs would remain precise—arguably more so than the language nuances of the requirements that currently drive custom solutions—while eliminating much of the unnecessary complexity developers wrestle with today.

Here’s another idea: make the designs and patterns we use to build systems reusable through language and tooling constructs. Think about things like user systems, access rights management, versioning, source control, publishing, and more—these should be standardized for any developer building systems, removing the need to reinvent the wheel every time.

Imagine treating code and developer tooling as you would an enterprise application—and vice versa. Apply access rights to classes and functions. Enable seamless code audit tracking. Provide built-in versioning and source control not just for your code but for business objects that need it.

### Transforming Databases Through Semantic Types

Databases are the backbone of organizing data, but to support a wide variety of use cases, they typically stick to the lowest common denominator—providing only the basics needed to work across all scenarios. This often leaves developers writing a significant amount of custom code for major features that the database engine doesn’t natively handle.

On top of that, data modeling in many cases is still handled by full-stack developers rather than by subject matter experts collaborating with data modeling specialists. This limits how effectively the database structure reflects the complexities of the domain.

Now imagine if we elevated databases into high-level, extensible data modeling tools. Tools that empower developers and domain experts to work seamlessly together, driving not just data organization but the system design itself. With such an approach, databases could take a leading role in building systems and applications, transforming how we model, develop, and evolve software.

### Semantic Types in AI and Beyond

Semantic types aren’t just useful for validating web forms or formatting phone numbers. They’re also foundational in cutting-edge fields like AI, machine learning, and the semantic web:

- Machine Learning: When building AI models, knowing that a field is a “date” or “currency” lets the model make better predictions.
- Semantic Web: Tools like RDF and OWL use semantic types to describe relationships between data, enabling smarter search engines and data integration.
- Integration: Merging datasets is easier when semantic types align. For example, if one dataset uses “postal code” and another uses “ZIP code,” semantic typing can resolve the difference.
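The integration point can be sketched with a mapping from source column names to shared semantic type labels. The `SEMANTIC_TYPE_OF` table, the `normalize` helper, and the sample records are assumptions invented for this example.

```python
from typing import Dict, List

# Hypothetical mapping from source column names to a shared semantic type.
SEMANTIC_TYPE_OF: Dict[str, str] = {
    "postal code": "postal_code",
    "ZIP code": "postal_code",
    "name": "person_name",
    "customer": "person_name",
}


def normalize(record: Dict[str, str]) -> Dict[str, str]:
    """Rename each column to its semantic type; unknown columns are kept."""
    return {
        SEMANTIC_TYPE_OF.get(column, column): value
        for column, value in record.items()
    }


# Made-up sample rows from two sources that name the same concepts differently.
dataset_a: List[Dict[str, str]] = [{"name": "Ada", "postal code": "SW1A 1AA"}]
dataset_b: List[Dict[str, str]] = [{"customer": "Grace", "ZIP code": "10001"}]

merged = [normalize(record) for record in dataset_a + dataset_b]
print(merged)
# [{'person_name': 'Ada', 'postal_code': 'SW1A 1AA'}, {'person_name': 'Grace', 'postal_code': '10001'}]
```

Once both sources agree on what a column means, the merge itself is a plain concatenation.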

Semantic types could also serve as a two-way bridge between humans and AI, helping humans better understand and communicate with AI.

On one hand, they could enable AI systems to better interpret human context by embedding nuanced, human-centric meaning into data structures. This would allow AI to make decisions in ways that are closer to human reasoning, aligning machine-driven insights with human expectations.

On the other hand, semantic types could also provide a mechanism for AI to clearly express its decision-making processes to humans. Instead of operating as a "black box," AI systems could explain their outputs by expressing them directly in semantic types or by attaching semantic annotations.

For instance, an AI system analyzing customer feedback could go beyond assigning a sentiment score by highlighting specific phrases or patterns that influenced its assessment. Used this way, semantic types would enhance transparency and trust, enabling more intuitive collaboration between humans and AI.
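A toy sketch of that kind of output might return a score together with the phrases that produced it. The `SentimentResult` and `Evidence` types and the four-word lexicon are hypothetical. A real system would rely on a trained model, not a word list, so this shows only the shape of an explainable result.

```python
from dataclasses import dataclass
from typing import List

# Tiny hand-made lexicon, purely illustrative.
LEXICON = {"great": 1, "love": 1, "slow": -1, "broken": -1}


@dataclass(frozen=True)
class Evidence:
    """A phrase from the input and how much it moved the score."""
    phrase: str
    contribution: int


@dataclass(frozen=True)
class SentimentResult:
    """A score plus the evidence that explains it."""
    score: int
    evidence: List[Evidence]


def analyze(text: str) -> SentimentResult:
    """Score text by summing lexicon weights and keep the matching words."""
    evidence = [
        Evidence(word, LEXICON[word])
        for word in text.lower().split()
        if word in LEXICON
    ]
    return SentimentResult(sum(item.contribution for item in evidence), evidence)


result = analyze("Love the design but checkout is slow")
print(result.score)                                # 0
print([item.phrase for item in result.evidence])   # ['love', 'slow']
```

A bare score of 0 would look like indifference. The attached evidence shows it is one positive and one negative signal cancelling out.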

### The Future of Semantic Types

Semantic types are one of those ideas that seem obvious in hindsight. Of course, data should have meaning! But their potential is still being unlocked. As systems get smarter and data gets messier, the need for semantic types will only grow.

Imagine a world where every piece of data comes with built-in meaning. Your apps don’t just handle numbers and strings; they handle salaries, birthdays, and coordinates. Your systems don’t just store data; they understand it. That’s the promise of semantic types.

They represent the middle ground—a balance between oversimplification and too much nuance. By distilling complex, real-world concepts into actionable, structured elements, semantic types make it possible to manage chaos without losing meaning.

And the best part? We’re just getting started.