How Platform Teams capture feedback

Editor’s Note: The following article is a companion piece to an excerpt from The Manning Book: Effective Platform Engineering, focused on design choices to consider as you begin scoping out your software-defined platform. In its entirety, this Manning book explores how Platform Engineering practices can dramatically improve operations. This specific excerpt focuses on how to measure the usage of your platform, discover what your customers need, and elicit feedback. To read the whole book, skip ahead and download it.

TL;DR

Treat incoming platform requests like product feature ideas—not SLA-bound tickets (bugs/outages excepted).
Instrument real usage: capture telemetry on platform components and anonymous CLI metrics to hear the silent majority.
Enforce a layered architecture (Handler → Service → Repository → Datastore) with architectural fitness functions so teams can swap datastores/clouds without coupling control-plane logic.
Make decisions data-driven: use ADRs + test-driven development and fitness-function “gates” so new services ship with tests and observability/monitors by default.
Platform engineering is software product engineering: measure, test, and iterate continuously to align roadmap with verified developer needs.

At this point, you might be wondering how to identify the opportunities your platform should address. There are several ways to do that: measure the usage of our platform, discover what our customers need, and elicit feedback directly.

Two diagrams show workflow: the left side depicts a developer requesting more work items, illustrating user interactions; the right side shows six tasks available for the Ops team to pick from a "To do" queue.

A standard Operations Team ticket queue. Developer Requests are immediately placed into the ‘To-do’ pile.

The first method is the most direct and obvious: as a centralized team, you will probably have a ticket system. This is unavoidable in most organizations. However, as a product team, you treat your ticket system quite differently from the way a standard operations team does.

In the typical operations workflow, it is assumed that when a ticket request is put in, there is an SLA on when it will be completed. But … there’s an even bigger assumption we just glossed over. And that’s the assumption that it will be done at all!

Think about a DevOps or Identity team: When tickets come into their queue, it’s assumed that most of these requests will be done at some point. This is not the case in the Product operating model of building our platform, because not all these requests will get prioritized.

This highlights the problem with assuming that “DevOps” is a team.

As we mentioned in Chapter 1 [of The Manning Book: Effective Platform Engineering, which can be downloaded here], DevOps should be a culture.

A developer submits feature requests; the platform team analyzes one request using an “AI Analysis” process shown on a Kanban board with Todo, Doing, and Done columns, leveraging user feedback to enhance decision-making.

This illustrates a Platform Development Ticket Queue. These get treated as product feature requests, to be analyzed, accepted, and prioritized.

Treat requests as product features, not tickets

When building an Engineering Platform and using the product operating model, it’s important to remember that all requests, except for bugs and outages, are treated as product requests. This means that the team needs to carefully review and analyze each one.

Customers of the platform have chosen to use the platform product, and that means they cannot demand that features be built, and certainly not with an SLA!

If teams outside of the platform team were able to demand new changes all the time with an SLA, then our Platform team would slowly decay into being only a DevOps team, and it would lose focus on the self-service features that make our platform a functional product.

Rename the “request queue” to an “idea queue” to set expectations

This doesn’t mean you can do without a request queue, though. In fact, a queue can turn into your platform backlog! By applying a bit of marketing, instead of calling it a request or demand queue, we might want to call it a Platform “Idea” queue or Platform “Feature Request” queue.

By changing the wording, we change its meaning, and teams will understand that requests can (and will) be denied if they don’t fit within the Platform as determined by the product team building it.

So, how else might we capture feedback and new needs of the platform?

Capture platform usage and developer needs with real data

As you are building the platform at PETech, you realize you need monitoring and observability tools for the customers deploying applications to your platform. You may not realize that you, the Platform team, need those same tools.

Measure usage with telemetry and metrics

Automated measurement shows how the platform is actually being used, helping us to know which changes are well received and which new features we should prioritize. We’ll talk much more about measurement in [Chapter 4].

Lastly, you should get feedback from your customer base directly. There are numerous methods to accomplish this.

Instrument CLIs with anonymous metrics to hear the silent majority

When providing CLIs to your customers, you can include anonymous metrics gathering.
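As a minimal sketch of what anonymous CLI metrics could look like, the example below builds a usage event that records only the subcommand name, never its arguments. The event fields, the `PLATFORM_CLI_NO_TELEMETRY` opt-out variable, and the function names are all assumptions made for illustration; they are not taken from the book or from any particular CLI framework.

```python
import json
import os
import uuid
from dataclasses import asdict, dataclass
from typing import Mapping, Optional, Sequence


@dataclass(frozen=True)
class UsageEvent:
    install_id: str   # random per-install ID, not derived from the user or host
    command: str      # subcommand name only, never arguments or flag values
    cli_version: str
    duration_ms: int
    success: bool


def new_install_id() -> str:
    # uuid4 is random, so the ID carries no information about the machine.
    return str(uuid.uuid4())


def telemetry_enabled(env: Mapping[str, str]) -> bool:
    # Hypothetical opt-out switch; the variable name is an assumption.
    return env.get("PLATFORM_CLI_NO_TELEMETRY", "").lower() not in ("1", "true")


def build_event(
    install_id: str,
    argv: Sequence[str],
    cli_version: str,
    duration_ms: int,
    success: bool,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[UsageEvent]:
    """Return an anonymous event, or None when the user has opted out."""
    env = os.environ if env is None else env
    if not telemetry_enabled(env):
        return None
    # Assumes the first token is the subcommand; everything after it is dropped.
    command = argv[0] if argv else "help"
    return UsageEvent(install_id, command, cli_version, duration_ms, success)


def to_payload(event: UsageEvent) -> str:
    """Serialize the event as JSON for whatever collector the team runs."""
    return json.dumps(asdict(event), sort_keys=True)
```

Dropping everything after the subcommand is the design choice that matters here: team names, paths, and secrets tend to live in arguments, so the event stays useful for counting feature usage without identifying anyone.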

Creating touchpoints to close feedback loops

You can send out surveys and conduct 1-1 interviews to find out what features people like, don’t like, and don’t have. It’s important to come back and gather this type of data from your users regularly, and also connect with them.

Weekly demos to build trust, engagement, and steady adoption

Weekly or bi-weekly demos of new platform features help build engagement, trust, and interaction with the platform’s customers. We consider this to be an invaluable component of the platform development process because there is no better way to improve your product than getting feedback on what you have built so far.

Architectural fitness functions for an Engineering Platform

As we are defining the APIs of the platform, the topic of databases comes up. Many of the APIs we will create for the Control Plane of our platform will need to store their state in a resilient database.

One of the senior developers on our Platform Team at PETech points out that our platform will support many regions, maybe even a Global topology where developers will be interacting with our control plane from multiple continents. So our control plane must be highly available, fast, and replicated across many regions.

So, as a team, you start thinking about globally available database services from your cloud provider. But another team member then points out that we also need an easy-to-use development experience for platform engineers, and a highly distributed global database could make the local development experience very complex.

And then another team member adds that while services like DynamoDB meet our requirements in AWS, we just bought another company, and their entire infrastructure is on GCP. They’ve recently asked us to start exploring supporting the Engineering Platform on their cloud as well!

Decoupling Datastores with Service and Repository Layers

So, how do we reconcile all of these concerns? Let’s return to the fact that we have a software-defined platform.

Good software design includes an architecture that decouples hard dependencies (like databases) and allows for change over time. To handle all the different data needs, we’ve decided to separate the database details from our platform’s APIs.

Enforce clear service, repository, and datastore layers with automated checks

We’re setting up a service layer and a repository layer. The repository layer will use a Datastore interface, which can work with various database technologies. As long as each database implementation exposes the same functions, we won’t need to change the code in the repository or service layers at all.

Diagram showing an API structure with layers: Entrypoint, Handler, Service, Repository, and Datastore. Example code for initializing a Teams API and managing user interactions is displayed on the right.

An example of defining the necessary APIs for a datastore in an Engineering Platform (Teams API).
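The layering described above can be sketched in a few lines. This is an illustration under assumptions rather than the book’s implementation: the `Team` type, the `put`/`get` method names, and the in-memory datastore are all hypothetical, chosen only to show the service depending on the repository and the repository depending on a Datastore interface.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class Team:
    team_id: str
    name: str


class Datastore(Protocol):
    """The only surface the repository layer depends on."""

    def put(self, key: str, value: dict) -> None: ...

    def get(self, key: str) -> Optional[dict]: ...


class InMemoryDatastore:
    """Local-development datastore; a cloud-backed class would expose the same methods."""

    def __init__(self) -> None:
        self._items = {}

    def put(self, key: str, value: dict) -> None:
        self._items[key] = dict(value)

    def get(self, key: str) -> Optional[dict]:
        item = self._items.get(key)
        return dict(item) if item is not None else None


class TeamRepository:
    """Maps Team objects to datastore records; knows nothing about the database."""

    def __init__(self, datastore: Datastore) -> None:
        self._datastore = datastore

    def save(self, team: Team) -> None:
        self._datastore.put(f"team#{team.team_id}", {"team_id": team.team_id, "name": team.name})

    def find(self, team_id: str) -> Optional[Team]:
        item = self._datastore.get(f"team#{team_id}")
        return Team(**item) if item else None


class TeamService:
    """Control-plane logic; talks only to the repository."""

    def __init__(self, repository: TeamRepository) -> None:
        self._repository = repository

    def create_team(self, team_id: str, name: str) -> Team:
        if not name.strip():
            raise ValueError("team name must not be empty")
        team = Team(team_id=team_id, name=name.strip())
        self._repository.save(team)
        return team

    def get_team(self, team_id: str) -> Optional[Team]:
        return self._repository.find(team_id)
```

Swapping the database then means passing a different object to `TeamRepository`, for example `TeamService(TeamRepository(InMemoryDatastore()))` on a laptop, while the service code stays untouched.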

You might have heard about Fitness Functions before. They’re well-explained in many books.

Simply put, an Architecture Fitness Function is a tool that helps objectively measure how well certain aspects of a software’s architecture are performing. This concept is neatly summed up in “Fundamentals of Software Architecture” by Richards and Ford.

Use architectural fitness functions to protect design intent

Then, to ensure we always meet this pattern and keep these layers decoupled, we would write a fitness function that verifies the service layer only ever imports the repository layer, and the repository layer only ever imports an implemented Datastore.
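One way such an import check might look is sketched below, using Python’s standard `ast` module to read import statements without executing the code. The package name `teams_api` and the layer names are assumptions for illustration, and the sketch only inspects absolute imports.

```python
import ast

PACKAGE = "teams_api"  # hypothetical package name

# Which layers each layer is allowed to import (layer names are assumptions).
ALLOWED = {
    "handler": {"service"},
    "service": {"repository"},
    "repository": {"datastore"},
    "datastore": set(),
}


def imported_layers(source: str) -> set:
    """Return the platform layers that a module's source text imports."""
    layers = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Relative imports (level > 0) are ignored in this sketch.
            names = [f"{node.module}.{alias.name}" for alias in node.names]
        else:
            continue
        for name in names:
            parts = name.split(".")
            if parts[0] == PACKAGE and len(parts) > 1 and parts[1] in ALLOWED:
                layers.add(parts[1])
    return layers


def layer_violations(modules: dict) -> list:
    """modules maps a layer name to source text; returns readable violations."""
    violations = []
    for layer, source in modules.items():
        disallowed = imported_layers(source) - ALLOWED[layer] - {layer}
        for target in sorted(disallowed):
            violations.append(f"{layer} layer must not import {target} layer")
    return violations
```

In a real repository the source text would be read from the files under each layer’s directory, and the build would fail whenever `layer_violations` returns a non-empty list.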

We’d write another fitness function that ensures all of our Datastore implementations adhere to the standard Datastore interface.
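A conformance check of this kind could be sketched with the standard `inspect` module, comparing each implementation’s public methods against a reference interface. The `Datastore` class and its `put`/`get` methods here are hypothetical stand-ins, not the book’s interface.

```python
import inspect


class Datastore:
    """Reference interface (method names are assumptions for illustration)."""

    def put(self, key, value):
        raise NotImplementedError

    def get(self, key):
        raise NotImplementedError


def conformance_errors(interface: type, implementation: type) -> list:
    """Return a readable error for every interface method that is missing or mismatched."""
    errors = []
    for name, expected in inspect.getmembers(interface, inspect.isfunction):
        if name.startswith("_"):
            continue  # only public methods are part of the contract
        actual = getattr(implementation, name, None)
        # A method inherited unchanged from the interface counts as missing.
        if not callable(actual) or actual is expected:
            errors.append(f"{implementation.__name__} is missing {name}()")
            continue
        expected_params = list(inspect.signature(expected).parameters)
        actual_params = list(inspect.signature(actual).parameters)
        if actual_params != expected_params:
            errors.append(
                f"{implementation.__name__}.{name} takes {actual_params}, "
                f"expected {expected_params}"
            )
    return errors
```

Running this over every registered datastore class in a test keeps a new implementation from drifting away from the interface the repository layer relies on.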

Prevent control-plane coupling as the platform evolves

These Fitness functions ensure our control plane API logic doesn’t ever change when we decide to implement a new database, be it local to one developer’s computer or globally distributed.

You can think of them as a sort of Unit test that asserts the architectural patterns and decisions remain intact as you are making changes.

Here we can see that each layer is only consumed by the next, and we enforce this with a fitness function. This makes sure that down the line, if we try to skip creating a repository layer for a new database (choosing instead to call datastore functions from our service layer), our Fitness function will fail, stating that we must use the Repository Layer to interact with our new Datastore.

You can see more examples of this pattern in action in the GitHub repository for the book.

Another thought that may cross your mind for your platform at PETech is that this Fitness Function practice feels a lot like what developers have to do for their applications: writing tests! And you would be right.

Remember, at the start of this excerpt, we said that Platform Products are also software to be developed using an SDLC, and to build a scalable and successful engineering platform, we have to treat it with Software Principles. This includes fundamental architectural principles, like Fitness Functions, and writing tests that we continuously verify and trust.

Take a look at the repository for Chapter 2 [of The Manning Book: Effective Platform Engineering, which can be downloaded here] to see more examples of engineering platform fitness functions.

Fitness Functions: An Exercise using ADRs, Tests, and Monitoring Gates

At PETech, while we are going to build the Engineering Platform on AWS, we know that the merger with AllTech is pending completion. AllTech is 100% on Google Cloud, and they don’t even have an AWS account.

We know that when we build the platform for PETech, we have to focus on our immediate customers but make architectural decisions that allow us to change and adapt the platform later, for example to support other clouds. One area of high importance is our custom Platform APIs.

How might we write a fitness function that ensures our platform APIs are implemented in a cloud-agnostic way?

Using the sample API provided in the C3 Repo, write a fitness function that ensures our API keeps cloud-specific features and operations in isolated interfaces that don’t affect our service logic.
How might we expand this fitness function to work for all of our Platform APIs, not just this one?

After some debate amongst the team, we’ve decided that test-driven development is a rule we want to adopt and use across all of our platform’s custom software.

Write an Architectural Decision Record that captures this decision. Include reasons, alternatives, and details.
Write a fitness function that will fail if someone checks in a new Service Layer without tests associated with it.
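For the second item above, one possible starting point is a check over the list of files in the repository. The sketch assumes a hypothetical layout in which service modules live in a `service` directory and their tests are named `test_<module>.py` somewhere under `tests`; a real project would adjust both conventions.

```python
from pathlib import PurePosixPath


def services_missing_tests(paths: list) -> list:
    """Return service modules that have no matching test_<module>.py under tests/."""
    files = {PurePosixPath(p) for p in paths}
    test_names = {
        f.name for f in files if "tests" in f.parts and f.name.startswith("test_")
    }
    missing = []
    for f in sorted(files):
        is_service = (
            f.suffix == ".py"
            and "service" in f.parts[:-1]   # file sits inside a service directory
            and "tests" not in f.parts      # but is not itself a test file
            and f.name != "__init__.py"
        )
        if is_service and f"test_{f.name}" not in test_names:
            missing.append(str(f))
    return missing
```

Fed with the output of `git ls-files` in CI, a non-empty result would fail the build and name the service modules that still need tests.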

As we’ve seen throughout this chapter [of the Manning book Effective Platform Engineering, which can be downloaded here], observability and monitoring data are at the core of every decision we make.

Consider how we might write ADRs and Fitness Functions that capture this.

How might we write a fitness function that would fail if a new API Feature gets checked in without any monitors? Feel free to use a specific observability tool to write your answer and then compare it against the answer in the back for similarities.
Consider how a data-driven approach might change the dynamics of the team’s interactions with other stakeholders and executives at the company. What tactics can you use to debate the merits of a new feature request using our ADRs, Fitness Functions, and observation-driven decision-making? Consider how these techniques remove emotions and assumptions from these sorts of debates.
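As a tool-agnostic starting point for the first question above, the sketch below compares the routes an API declares against a list of monitor definitions. The JSON shape (`route`, `metric`, `enabled`) is an assumption made for illustration; with a specific observability tool, the monitor list would come from that tool’s own configuration or API instead.

```python
import json


def routes_without_monitors(routes: list, monitors_json: str) -> list:
    """Return declared API routes that have no enabled monitor."""
    monitors = json.loads(monitors_json)
    # A monitor counts only if it is enabled; "enabled" defaults to True when absent.
    monitored = {m["route"] for m in monitors if m.get("enabled", True)}
    return sorted(route for route in routes if route not in monitored)
```

A gate built on this would fail the check-in whenever the returned list is non-empty, so a new API feature cannot merge until a monitor exists for it.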

Focus on what matters

If you’re wondering how to identify the right platform opportunities, focus on three inputs: usage metrics, articulated customer needs, and direct feedback. With the foundational concepts down for a software-defined platform, it’s time to explore the world of Domain Driven Platform Design. Download the entire book to keep reading.

Frequently Asked Questions

How should a platform team treat incoming requests?

When building an Engineering Platform and using the product operating model, it’s important to remember that all requests, except for bugs and outages, are treated as product requests. This means that the team needs to carefully review and analyze each one.

What are architectural fitness functions (in simple terms)?

Simply put, an Architecture Fitness Function is a tool that helps objectively measure how well certain aspects of a software’s architecture are performing.
