Q: What are the goals of this project?
A: Allow users to better organize their information.
If you’ve spent time around computers, you’ve probably heard of data types: strings, integers, floats, and so on. These are the building blocks of how computers think about information. But while data types tell computers how to store and manipulate data, they don’t tell them what the data means. And that’s where semantic types come in.
Semantic types aren’t about syntax; they’re about meaning. They’re a way of saying, “Hey, this isn’t just a string—it’s an email address” or “This isn’t just a number—it’s someone’s age.” They’re the layer of understanding that bridges the cold, mechanical world of data types and the messy, human world of meaning.
The world is complex and often appears chaotic. Language, too, can be imprecise, mirroring the intricacies of reality itself. To manage this complexity, we simplify things. We distill vast, nuanced information into simplified points that allow us to make sense of our surroundings. This balancing act—between oversimplification and overloading with nuance—is where semantic types shine.
Human languages thrive on nuance and complexity, but tools like lists, databases, and programming languages often oversimplify things. They’re designed to make building systems easier by taking a general-purpose approach. However, this oversimplification creates a gap—one that forces developers to repeatedly create custom solutions due to a lack of standardized tools that support truly flexible and adaptable patterns.
Let’s start with an example. Imagine you’re building a web form. It’s collecting all kinds of data: names, email addresses, dates, phone numbers. The form itself doesn’t care what kind of data it’s collecting; to the computer, it’s all just strings. But you care, because you want to make sure people aren’t putting “Banana” in the email address field.
If your form recognizes that a field’s semantic type is “email address,” you can validate it properly. Suddenly, you’ve got a smarter system—one that understands a little bit about the world it’s operating in. That’s the power of semantic types. They let systems reason about data based on what it *is*, not just how it’s stored.
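Here is a minimal sketch of that idea in Python. Everything in it is an assumption made for illustration: the `VALIDATORS` registry, the `validate_form` helper, the semantic type names, and the deliberately loose email pattern (it is not a full RFC 5322 check).

```python
import re
from typing import Callable, Dict

# Hypothetical, intentionally loose pattern: something@something.something
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

# Hypothetical registry: semantic type name -> validator for raw string input.
VALIDATORS: Dict[str, Callable[[str], bool]] = {
    "email_address": lambda value: EMAIL_PATTERN.fullmatch(value) is not None,
    "free_text": lambda value: True,
}


def validate_form(fields: Dict[str, str], semantic_types: Dict[str, str]) -> Dict[str, bool]:
    """Check each field with the validator registered for its semantic type."""
    results = {}
    for name, value in fields.items():
        # Fields without a declared semantic type are treated as free text.
        semantic_type = semantic_types.get(name, "free_text")
        results[name] = VALIDATORS[semantic_type](value)
    return results


form = {"name": "Ada", "email": "Banana"}
types = {"name": "free_text", "email": "email_address"}
print(validate_form(form, types))  # {'name': True, 'email': False}
```

To the computer both fields are still strings. The only thing that changed is that the form now knows which string is supposed to be an email address.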
You’ve already met semantic types, even if you didn’t know it. Here are some everyday examples:
These types exist in almost every app or system you use, from spreadsheets to social media platforms. They’re what make systems feel intelligent—like when your phone recognizes an address in a text and lets you open it in Maps.
If you’re a developer, semantic types are like a cheat code. They make your life easier in a bunch of ways:
Here’s a thought experiment. You have a number: `42`. What does it mean? It could be someone’s age, a jersey number, or something else entirely.
Without context, the number is meaningless. Data types tell us it’s an integer. Semantic types tell us it’s “the age of a person” or “a jersey number.” This difference between how data is stored and what it means is the gap semantic types aim to fill.
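One lightweight way to express this difference in Python is `typing.NewType`. The sketch below is only an illustration: the names `Age`, `JerseyNumber`, and `can_vote`, and the default voting age of 18, are assumptions rather than part of any existing system.

```python
from typing import NewType

# Two semantic types that share the same underlying data type (int).
Age = NewType("Age", int)
JerseyNumber = NewType("JerseyNumber", int)


def can_vote(age: Age, voting_age: int = 18) -> bool:
    """Only makes sense for ages; the threshold of 18 is an assumed example value."""
    return age >= voting_age


age = Age(42)
jersey = JerseyNumber(42)

print(can_vote(age))  # True

# At runtime both values are plain ints, so they compare equal.
print(age == jersey)  # True

# can_vote(jersey) would also run, but a static type checker such as mypy
# reports it as an error, because a JerseyNumber is not an Age.
```

The stored value is identical in both cases. The meaning lives entirely in the type annotation, which is exactly the gap described above.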
In Object-Oriented Programming (OOP), a class holds data through its member fields. But here’s the catch: these fields often mix wildly different concerns. Business concepts sit side by side with fields for presentation logic, persistence, APIs, or program state. All of this is bundled together into a single type, connected by basic relationships like composition, aggregation, and association. While this works to some extent, developers often find themselves writing custom code to model more nuanced relationships—relationships that aren’t natively supported by the language.
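To make the mix of concerns visible, here is a small sketch that labels each field of a class with the concern it belongs to. The `Customer` class, its fields, the `"concern"` metadata key, and the `fields_by_concern` helper are all hypothetical. The only language feature used is the standard `metadata` argument of `dataclasses.field`.

```python
from collections import defaultdict
from dataclasses import dataclass, field, fields
from typing import Dict, List


@dataclass
class Customer:
    # Business concepts
    name: str = field(metadata={"concern": "business"})
    email: str = field(metadata={"concern": "business"})
    # Everything below has nothing to do with what a customer *is*.
    db_row_id: int = field(default=0, metadata={"concern": "persistence"})
    is_selected_in_ui: bool = field(default=False, metadata={"concern": "presentation"})
    is_dirty: bool = field(default=False, metadata={"concern": "program_state"})


def fields_by_concern(cls) -> Dict[str, List[str]]:
    """Group a dataclass's field names by their declared concern."""
    grouped = defaultdict(list)
    for f in fields(cls):
        grouped[f.metadata.get("concern", "unlabeled")].append(f.name)
    return dict(grouped)


print(fields_by_concern(Customer))
```

Only two of the five fields describe the business concept. The rest are bookkeeping that happens to live in the same type.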
Can this be done better?
What if we could extend programming languages to let developers define their own relationships? Imagine being able to reuse and standardize those relationships, building libraries of flexible, expressive patterns to model entities more naturally. What if, instead of being bound by the limited constructs of today’s languages, we could design systems where relationships between entities are as clear, nuanced, and powerful as the entities themselves?
These constructs would remain precise, arguably more precise than the natural-language requirements that drive today’s custom solutions, while eliminating much of the unnecessary complexity developers wrestle with today.
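No mainstream language offers this today, so the following is only a rough approximation using ordinary Python classes. It treats a relationship as a named, reusable object that lives outside the entities it connects. `Relationship`, `Person`, and the `mentors` relationship are hypothetical names invented for this sketch.

```python
from dataclasses import dataclass, field
from typing import Generic, List, Tuple, TypeVar

S = TypeVar("S")  # source entity type
T = TypeVar("T")  # target entity type


@dataclass
class Relationship(Generic[S, T]):
    """A named relationship that is defined once and reused across entities."""
    name: str
    links: List[Tuple[S, T]] = field(default_factory=list)

    def link(self, source: S, target: T) -> None:
        self.links.append((source, target))

    def targets_of(self, source: S) -> List[T]:
        return [t for s, t in self.links if s == source]


@dataclass(frozen=True)
class Person:
    name: str


# The relationship is declared separately; Person has no "mentor" field at all.
mentors: Relationship[Person, Person] = Relationship("mentors")

ada, grace = Person("Ada"), Person("Grace")
mentors.link(ada, grace)
print(mentors.targets_of(ada))  # [Person(name='Grace')]
```

A real language-level construct could go much further, for example by enforcing cardinality or direction. The point here is only that a relationship can be a thing in its own right rather than a field buried in a class.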
Here’s another idea: make the designs and patterns we use to build systems reusable through language and tooling constructs. Think about things like user systems, access rights management, versioning, source control, publishing, and more—these should be standardized for any developer building systems, removing the need to reinvent the wheel every time.
Imagine treating code and developer tooling as you would an enterprise application—and vice versa. Apply access rights to classes and functions. Enable seamless code audit tracking. Provide built-in versioning and source control not just for your code but for business objects that need it.
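As a rough sketch of what access rights and audit tracking on a function could look like, here is a plain Python decorator. The `requires_role` decorator, the `AUDIT_LOG` list, the `AccessDenied` exception, and the role name are all assumptions made for this example. A real system would need proper identity handling and durable storage.

```python
import functools
from typing import Callable, List, Set

# Hypothetical in-memory audit trail; a real system would persist this.
AUDIT_LOG: List[str] = []


class AccessDenied(Exception):
    """Raised when the caller lacks the role a function requires."""


def requires_role(role: str) -> Callable:
    """Guard a function so only callers holding `role` may run it."""
    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(user_roles: Set[str], *args, **kwargs):
            allowed = role in user_roles
            outcome = "allowed" if allowed else "denied"
            # Every attempt is recorded, whether or not it succeeds.
            AUDIT_LOG.append(f"{func.__name__}: {outcome}")
            if not allowed:
                raise AccessDenied(f"{func.__name__} requires role {role!r}")
            return func(*args, **kwargs)
        return wrapper
    return decorator


@requires_role("payroll_admin")
def update_salary(employee: str, amount: int) -> str:
    return f"{employee} -> {amount}"


# The caller's roles are passed as the first argument to the guarded function.
print(update_salary({"payroll_admin"}, "Ada", 100))  # Ada -> 100
print(AUDIT_LOG)  # ['update_salary: allowed']
```

The same guard could wrap a method on a business object, which is the "and vice versa" part: one mechanism for both code and the data it manages.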
Databases are the backbone of organizing data, but to support a wide variety of use cases, they typically stick to the lowest common denominator—providing only the basics needed to work across all scenarios. This often leaves developers writing a significant amount of custom code for major features that the database engine doesn’t natively handle.
On top of that, data modeling in many cases is still handled by full-stack developers rather than by subject matter experts collaborating with data modeling specialists. This limits how effectively the database structure reflects the complexities of the domain.
Now imagine if we elevated databases into high-level, extensible data modeling tools. Tools that empower developers and domain experts to work seamlessly together, driving not just data organization but the system design itself. With such an approach, databases could take a leading role in building systems and applications, transforming how we model, develop, and evolve software.
Semantic types aren’t just useful for validating web forms or formatting phone numbers. They’re also foundational in cutting-edge fields like AI, machine learning, and the semantic web:
Semantic types could also serve as a two-way bridge between humans and AI, helping humans better understand and communicate with AI.
On one hand, they could enable AI systems to better interpret human context by embedding nuanced, human-centric meaning into data structures. This would allow AI to make decisions in ways that are closer to human reasoning, aligning machine-driven insights with human expectations.
On the other hand, semantic types could also provide a mechanism for AI to clearly express its decision-making processes to humans. Instead of operating as a “black box,” AI systems could express their outputs directly in semantic types, or attach semantic annotations that explain them.
For instance, an AI system analyzing customer feedback could go beyond assigning a sentiment score by highlighting specific phrases or patterns that influenced its assessment. By doing so, semantic types would enhance transparency and trust, enabling more intuitive collaboration between humans and AI.
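The shape of such an output can be sketched with two small types. This is a toy stand-in, not a real sentiment model: the keyword weights in `WEIGHTS` are made up, and `Evidence`, `SentimentAssessment`, and `assess` are hypothetical names. What matters is that the result carries its evidence with it instead of a bare number.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Evidence:
    phrase: str    # the phrase that influenced the assessment
    effect: float  # its contribution to the score (negative lowers it)


@dataclass(frozen=True)
class SentimentAssessment:
    score: float
    evidence: List[Evidence]


# Made-up keyword weights standing in for whatever a real model would learn.
WEIGHTS = {"love": 1.0, "slow": -0.5, "broken": -1.0}


def assess(text: str) -> SentimentAssessment:
    """Score text by known keywords and keep every keyword that contributed."""
    evidence = [
        Evidence(word, WEIGHTS[word])
        for word in text.lower().split()
        if word in WEIGHTS
    ]
    return SentimentAssessment(sum(e.effect for e in evidence), evidence)


result = assess("I love the app but checkout is slow")
print(result.score)  # 0.5
for item in result.evidence:
    print(item.phrase, item.effect)
```

A reader of this result can see not only that the feedback is mildly positive, but that "love" pushed it up and "slow" pulled it down.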
Semantic types are one of those ideas that seem obvious in hindsight. Of course, data should have meaning! But their potential is still being unlocked. As systems get smarter and data gets messier, the need for semantic types will only grow.
Imagine a world where every piece of data comes with built-in meaning. Your apps don’t just handle numbers and strings; they handle salaries, birthdays, and coordinates. Your systems don’t just store data; they understand it. That’s the promise of semantic types.
They represent the middle ground—a balance between oversimplification and too much nuance. By distilling complex, real-world concepts into actionable, structured elements, semantic types make it possible to manage chaos without losing meaning.
And the best part? We’re just getting started.