Digital Transformation in Large Corporation — CookBook

Feb 1, 2021

DevOps Toolset as a Service Platform baseline and vision 🌼

Disclaimer:
Digital transformation is a vast topic that can be approached from many angles. This article covers the core ideas and design principles of a DevOps services enablement platform.

Motivation CookBook

So you would like to conduct a digital transformation within a huge enterprise?
No problemo.

But first, you had better get your motives right, because it won't be an easy task. The larger your corporation is, the more obstacles you will face.
Furthermore, no matter from which angle you approach your transformation program, those obstacles will appear in one form or another. Some of them might be:

Outdated corporate policies that long ago lost track of rapidly developing technology
Multiple team silos that are hard to break
People used to working with a project-based mindset instead of product-based development
Lack of proper skills in-house
No top-down support and lack of written goals to be cascaded down

… you name it.

Thus, before you start, ask yourself whether you have enough motivation and dedication to drive through this enormous task, especially since it could be a crazy ride.

Otherwise, if your inspiration comes purely from the FOMO effect, you had better sit down and rethink your motives.

What would definitely help is an understanding of the possible outcomes of a successful digital transformation. However, I will not write about the positive aspects and potential gains it might bring, as there are already tons of articles around the web that cover that topic. Additionally, it is a high-level topic and not the main thing I am going to tackle in this article.

And what could be the motivation itself? Well, it depends.

Nowadays, one of the major aspects most strongly connected to digital transformation is the adoption of the public cloud.

Corporations have been focused on their own infrastructure and private clouds for years. To increase work efficiency and consequently improve the performance of application teams, they tried hard to build all of the needed automation and governance around it. Unfortunately, that is an enormous amount of work, and even putting colossal resources into enhancing the capabilities of your private infrastructure usually isn’t enough to reach the quality and feature level available in the public cloud.

Sooner or later it becomes obvious that moving workloads and applications to the cloud, and training engineers so that they have the proper skills to leverage public cloud features, is a crucial part of the digital transformation journey. It is also the only way to remain truly competitive and to benefit from the state-of-the-art functionalities that are constantly being released by cloud whales.

Having established this, let’s dig deeper and decompose the key aspects. Since a major chunk of digital transformation means “moving” to the public cloud, the corporation's application portfolio needs to be properly analyzed.

If you are going to move numerous applications to the public cloud, which in the most simplistic case means just lift and shift, why not try to change some of those monoliths into a different, more evolutionary architecture and thus refactor them during the process?

As you are going to dive deep and analyze those applications, why don't you also examine their usage, so that you can group those with similar features and value and remove the ones that are no longer required? Functionalities provided by some of them could be grouped and replaced with a single application. Others could be replaced by newer, better alternatives. Finally, you may notice that the usage of some of them is so insignificant that it simply won't do any harm to get rid of them.

For lots of systems that are being moved, it also means a brand new start for their engineering teams. What is more, it is usually not only the system itself that has to be moved but also all of the ancillary services around it. Systems that support the development process of a given application also need to be established from scratch in the new cloud environment.

Thus, here is an idea: why don't you allow all future application team engineers to consume, “by default” and as a service, an advanced DevOps toolset for CI/CD, Observability, Event Management, Artifact Management, Alerting, Infrastructure as Code (IaC), etc.?

With that, application team DevOps engineers won't need to spend countless hours assessing and looking for the right technology, setting up those services, and constantly reinventing the wheel in application-team silos. It will also make the “shift” less painful. Sounds encouraging?

Platforms Era

Currently, when looking at the digital space, it is not hard to spot that digital platforms are all around us. For example, technology giants such as Amazon or Google base most of their products and services on platforms. What is more, it seems to be a worldwide trend that is impacting lots of business models. It is totally justifiable, as a modular platform architecture creates a very flexible and easily expandable ecosystem.

Embracing the full capabilities of some of those already quite mature platforms is surely not an easy task. On the other hand, their constantly expanding features are true enablers of corporate digital transformation, and thus it is worth diving in.

As the concept of platform-based services has proven to work, incorporating it might also be a good idea within the reality of a large corporation. Breaking down the motives and looking at a DevOps toolset from a “service” perspective, one may find an obvious fit: a platform for the application teams of a large corporation.

Core Ideas

Let’s decompose the platform by starting from the top. What you need first is to establish some basic, generic concepts that sit at the heart of the solution's expansion and development and serve as solid foundations. In other words, that’s a set of statements against which you can check any enhancement or concept that might come up in the future. At the same time, they should be explicit enough that you don’t overcomplicate things that will get complicated anyway. Later, those principles can serve as a guidepost in any disagreement.

Additionally, a mission statement needs to be established. And what could be the purpose of the platform, aka its mission? You can think of it as a definition of what you are aiming to achieve and why you want to accomplish it.

Keeping in mind the motivation that we’ve already established, you need to think of the platform as software that enables and enhances application migration to the public cloud. It implies that the platform needs to be architected, built, and evolved in such a manner that applications can be operated easily, autonomously, and securely.

If things get overcomplicated and the entry-level burden is raised too high, you end up with something that no one is really willing to use. Business value comes first, yet security is not something you want to give up, as in the long term that might not end well. Finally, there is autonomy, the holy grail of every IT team. Appropriate privilege levels that reduce cross-team dependencies drastically speed up the delivery process.

The platform should provide a set of self-service DevOps “tenants” to be consumed as a service. Thanks to smart automation aligned with NoOps practices, development teams should have a proper level of autonomy to get pre-defined services and accelerators without any external dependencies. What is more, the solutions they get should be properly governed and scalable so that their features can be used efficiently in the long term.

Self-service tools that application teams obtain should allow them to cover most of the DevOps and SRE practices and principles that application development needs these days.

All of that should rest on one crucial pillar: business value always comes first.

It is not surprising that everyone wants to use the best technology available, especially IT engineers. However, no matter how fancy the tools you apply are, technology by itself is meaningless. After all, you don't want to justify investments in fancy tools by vague connections to the additional business value they bring.

Cost reduction should not drive your decision, but rather be the cherry on top of proper judgment. It could be one of the factors that tips the balance between two close competitors addressing the same use case, but it should not be the major supporting claim.

The outcome provided by the platform has to be revenue-generating and thus yield business value to its main actors and key stakeholders.

Design & Delivery Principles

Design and delivery principles can be defined as a set of domain- and technology-agnostic rules that every artifact should follow, regardless of the patterns and technologies used during implementation. Those are statements that help you build and expand in a properly structured way. Following them ensures that we don’t at some point take a “wrong turn” that collides with our core ideas. Surely, not all of them will be applicable to your use case, so it is up to you to decide how to leverage those concepts in the best possible way.

Self-service

Services provided by the platform should allow development teams that are customers of the platform to operate in full autonomy, without a need for any kind of intervention or action from external teams. The automation behind the self-provisioning of a service has to ensure frictionless delivery and include proper security and quality checks of the request. Security settings with proper access levels should be established and governed by the platform team, removing the risk of application teams “breaking something”. Provisioning the service for a given team should be handled quickly, reliably, and in a truly NoOps manner.
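
To make the "security and quality checks of the request" a bit more tangible, here is a minimal sketch of what such a check could look like before any provisioning starts. The service catalogue, the access levels, and the `TenantRequest` shape are all assumptions made up for illustration; a real platform would define its own.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical catalogue and access levels; illustrative values only.
ALLOWED_SERVICES = {"ci-cd", "observability", "artifact-management"}
ALLOWED_ACCESS_LEVELS = {"reader", "developer", "maintainer"}


@dataclass(frozen=True)
class TenantRequest:
    """A self-service request raised by an application team."""
    team: str
    service: str
    access_level: str


def validate_request(request: TenantRequest) -> list[str]:
    """Return a list of problems; an empty list means provisioning may proceed."""
    problems = []
    if not request.team.strip():
        problems.append("team name is required")
    if request.service not in ALLOWED_SERVICES:
        problems.append(f"unknown service: {request.service}")
    if request.access_level not in ALLOWED_ACCESS_LEVELS:
        problems.append(f"access level not offered: {request.access_level}")
    return problems
```

Because the access levels are defined by the platform team, an application team can only ask for what is on the list, and a rejected request gets an immediate, automated answer instead of a ticket.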

Repeatable automation and operation

Automation decreases the overhead related to operations while reducing the room for human error. Thus, one of the principles for building the platform is ensuring that basic operations such as configuration management, deployments, and maintenance of the platform components are conducted via scripts and automation tools. Codifying operations additionally ensures that a given action always produces the expected result.
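
One way to read "always produces the expected result" is idempotency: running the same operation twice should leave the system in the same state as running it once. The sketch below assumes that a component's configuration can be represented as a flat dictionary; the function name and the settings are hypothetical.

```python
from __future__ import annotations


def ensure_settings(current: dict, desired: dict) -> tuple[dict, list[str]]:
    """Converge `current` towards `desired`.

    Returns the converged settings and the sorted list of keys that had to change.
    Keys that are not mentioned in `desired` are left untouched.
    """
    changed = [
        key for key, value in desired.items()
        if key not in current or current[key] != value
    ]
    converged = {**current, **desired}
    return converged, sorted(changed)
```

Calling it a second time with its own output reports no changes, which is the property that makes such a script safe to re-run after a partial failure.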

Multi-tenancy

Services delivered via the platform should be designed in such a way that each new customer has extensive autonomy within their service “tenant”. The team responsible for the platform should leverage smart automation and scripts to ensure the reliable performance of each tenant and a correct security setup, thus giving development teams the possibility to operate in isolation from one another.

Continuous measurement

Bringing business value to the end customers is the only viable justification for running the platform. Therefore, new features should be constantly monitored via active and passive observability practices. Monitoring trends based on synthetic user monitoring, application performance monitoring, real user monitoring, logs, and key service-specific metrics ensures that the platform is indeed on the right track. Alerts triggered when any of these factors exceeds its threshold should ensure that every anomaly is quickly acknowledged and properly handled.
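
As a rough illustration of a threshold-based alert, the sketch below fires only when the average of the most recent samples exceeds the threshold, so that a single spike does not page anyone. The window size, the metric, and the function name are assumptions, not a recommendation for any particular monitoring tool.

```python
from __future__ import annotations

from statistics import mean


def should_alert(samples: list[float], threshold: float, window: int = 3) -> bool:
    """Return True when the mean of the last `window` samples exceeds `threshold`."""
    if len(samples) < window:
        # Not enough data yet to judge a trend.
        return False
    return mean(samples[-window:]) > threshold
```

For example, with response times in milliseconds and a threshold of 500, the series `[120, 130, 900]` does not alert, while `[120, 130, 900, 950, 980]` does.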

Optimized utilization

Each service and service tenant should be integrated with an observability solution in such a way that resource utilization in the underlying infrastructure and service components is constantly being measured. A unified approach to resource utilization across all platform services ensures that unused components are decommissioned and services are properly adjusted to current demand.
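
A simple way to picture this is a periodic job that turns measured utilization into a recommendation per tenant. The sketch below assumes average CPU utilization expressed as a fraction between 0 and 1; the cut-off values and the action labels are hypothetical and would need tuning per service.

```python
from __future__ import annotations


def recommend_actions(
    avg_cpu_by_tenant: dict[str, float],
    idle_below: float = 0.05,
    busy_above: float = 0.80,
) -> dict[str, str]:
    """Map each tenant to an action based on its average CPU utilization."""
    actions = {}
    for tenant, avg_cpu in avg_cpu_by_tenant.items():
        if avg_cpu < idle_below:
            actions[tenant] = "decommission-candidate"
        elif avg_cpu > busy_above:
            actions[tenant] = "scale-up"
        else:
            actions[tenant] = "keep"
    return actions
```

Applying the same rule to every service is what makes the approach unified: no tenant is sized by gut feeling.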

Resources as cattle not pets

As shown and described here, technology is constantly evolving in a direction that strongly supports the Cattle model. Leveraging that concept gives the platform team the possibility to easily deliver, maintain, and scale up the underlying infrastructure of the services and service tenants with predictable results. In other words, try to avoid making exceptions and workarounds for customers, especially exceptions that impact core concepts of the platform. The overhead and debt they generate will quickly affect operational tasks.

Sprint Goal-based prioritization / Choreography

The development of the platform could be performed in sprints, where the goals for each sprint are defined by the product owner. However, contrary to the orchestration model, the way of achieving the requirements specified in the sprint goals and the way of handling dependencies between services should be in the scope of the platform teams responsible for the services and not governed by anyone in the center. This model more closely resembles a choreography, where every actor knows how to act on the particular events it is interested in coming from the other actors.
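
The same choreography idea also applies between services at runtime, and that is where it is easiest to show in code. In the sketch below there is no central coordinator: each service subscribes to the events it cares about and reacts on its own. The in-memory `EventBus`, the event names, and the two services are invented for illustration.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Callable

Handler = Callable[[dict], None]


class EventBus:
    """Minimal in-memory publish/subscribe bus without any central coordinator."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, event: dict) -> None:
        # Copy the handler list so subscriptions made during dispatch are safe.
        for handler in list(self._handlers.get(event_type, [])):
            handler(event)


def run_demo() -> list[str]:
    """Wire two hypothetical services together and return what they did."""
    log: list[str] = []
    bus = EventBus()

    def store_artifact(event: dict) -> None:
        log.append(f"artifact service: stored {event['name']}")
        bus.publish("artifact.stored", event)

    def record_metric(event: dict) -> None:
        log.append(f"observability service: recorded {event['name']}")

    bus.subscribe("build.finished", store_artifact)
    bus.subscribe("artifact.stored", record_metric)
    bus.publish("build.finished", {"name": "app-1.0.0"})
    return log
```

Neither service knows about the other; adding a third one that listens to `artifact.stored` requires no change to the existing two.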

Zero-Bug Policy

To make a long story short, technical debt is not welcome and thus no bugs should be allowed! Given that the platform is being built from scratch, the very beginning is the best time to adopt this principle. No one wants to end up with a long list of bugs that have to be registered, assessed, and prioritized every sprint. The principle follows the rule that a particular bug is either fixed in this or the next sprint, or it is deleted. Taking shortcuts and not following correct development practices tends to backfire over time and hence should be avoided.
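
The "fix it now or delete it" rule is simple enough to express as a small triage helper. The sketch assumes each bug is a dictionary with a hypothetical `fix_in_sprint` field; bugs without a target in the current or next sprint land in the delete pile.

```python
from __future__ import annotations


def triage(bugs: list[dict], current_sprint: int) -> tuple[list[dict], list[dict]]:
    """Split bugs into (scheduled, deleted) according to the zero-bug rule."""
    allowed = {current_sprint, current_sprint + 1}
    scheduled = [bug for bug in bugs if bug.get("fix_in_sprint") in allowed]
    deleted = [bug for bug in bugs if bug.get("fix_in_sprint") not in allowed]
    return scheduled, deleted
```

There is deliberately no third bucket: a backlog of "maybe later" bugs is exactly what the policy is meant to prevent.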

Measure by Fitness Function

In a nutshell, fitness functions are objective functions that measure how well a particular artifact aligns with the desired architectural state. In other words, with them we can answer the question of whether new features and components are developed in a way that coherently fits the design principles established at the beginning. Platform evolution based on a continuous feedback loop coming from fitness-function-driven development allows the platform team to take action right on the spot and deliver increments that fit the overall assumptions well.

Example: New code committed to the source code repository should not contain any credentials or passwords written explicitly in plain text. Therefore, the code delivery pipeline should include properly designed checks that block such commits and tell developers what to change to match the desired security standards. As a result, every new piece of code has to pass that particular fitness function's requirements, and thus proper quality is assured.
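
A minimal sketch of such a check is shown below. It scans text line by line for a few patterns that often indicate a hard-coded secret. The patterns and function names are assumptions chosen for illustration; they will produce both false positives and misses, and a real pipeline would more likely rely on a dedicated secret-scanning tool.

```python
from __future__ import annotations

import re

# Illustrative patterns only; real scanners use far richer rule sets.
SECRET_PATTERNS = [
    # e.g. password = "hunter2", API_KEY: 'abc123'
    re.compile(
        r"(?i)(password|passwd|secret|api[_-]?key|token)\w*\s*[:=]\s*['\"][^'\"]+['\"]"
    ),
    re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
]


def find_plaintext_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line number, line) pairs that look like hard-coded secrets."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(pattern.search(line) for pattern in SECRET_PATTERNS):
            findings.append((number, line.strip()))
    return findings


def passes_fitness_function(text: str) -> bool:
    """The commit content passes when no suspicious line is found."""
    return not find_plaintext_secrets(text)
```

In a pipeline, a failing result would stop the delivery and list the offending lines, so the developer knows exactly what to move into a secret store or environment variable.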

Product Led Operating Model

Teams should be formed to maximize the business value of the platform, following the motto “you build, you run”. This means that, contrary to project-based organizations where teams are only temporarily assigned to a given project, in product mode a team's lifespan exceeds the pure development stage and smoothly shifts to the run stage. As a result, team goals are aligned with the long-term goal of the business area the team tackles. Thus, platform teams exist as long as there is a business justification for the product to exist and evolve. What is more, in product-led organizations, reporting lines are also aligned to support long-term changes. Finally, product teams are formed so that they have the appropriate set of skills to deliver assigned tasks without dependencies on outside teams, or at least with fewer of them.

Finale 🍕

Even though motivation, goals, and principles are very high-level topics, they should be a good starting point. If you are willing to go down the digital transformation road, they will need to be established either way. Hopefully, after reading this, you will find some patterns that can be leveraged within your organization, so that at least in part you don’t need to “reinvent the wheel”, and you feel a little more empowered and convinced to follow that path.

Lastly, I would like to express my strong gratitude to Marco Randi, as his ideas and concepts inspired me to write this article. 🔥