- Created by Unknown User (donaldh), last modified on Jan 14, 2019
Or specifically: what would the service be if we put PNDA in the cloud? There was a brief discussion on this; the conclusion was that there are several possible layers we could consider -
SaaS - user experience is to use APIs to put data in and use other APIs to get e.g. service assurance insight out. Platform entirely abstracted and it's completely domain specific. Think: Salesforce.
PaaS - something like Amazon's Kinesis but broader and covering all the PNDA capabilities. Abstract the larger areas of PNDA into services to be used together. User experience is to e.g. instantiate a PNDA ingest with specific configuration (e.g. protobuf, json or avro aware, use ingest or event time, etc) and get back endpoints for use in integration, to deploy Spark/Flink applications that use instantiated ingest, etc. Could offer domain specific building blocks e.g. handling netflow.
PNDAaaS - something lower-level than the PaaS above, whereby we literally offer the ability to provision a PNDA cluster and then get back the endpoints to use it, similar to Amazon EMR. This is the closest to what we have today in AWS.
Observation: each of these layers could be seen as building on the one underneath, although equally it's possible to imagine SaaS without having a well defined PaaS or PNDAaaS underneath.
Cloud Native PNDA
Around five years ago, when we were starting PNDA, Heroku published a paper on their take on what it means to be cloud native, called 'the twelve factor app'. This has been widely accepted and referenced by the industry and has since been extended by commentary from others. How does PNDA compare against the principles set out in this paper?
For the purposes of this, we consider "app" to mean any cohesive, separable sub-part of the overall system, for example the Deployment Manager. Some people might use the term "microservice". For us they generally correspond to "roles" that we target with Salt grains.
Align Codebases with Apps
The principle here is that each app should have a strict 1-1 relationship with a codebase, which then simplifies versioning, CICD and so on.
From this we proposed the following adjustments -
- platform-salt, pnda-cli and pnda are 1 entity - "orchestrator" - and should be in 1 codebase (3 today)
- platform-testing - generates 2 applications, it should generate 1 with configurable capabilities
- data-mgmnt - generates 2 applications, this is the right number of applications but requires 2 codebases
- console-backend - this single codebase generates 2 apps and a library, it should be 3 codebases
Dependencies
The principle here is that all dependencies of an app should be vendored with that app, which apart from simplifying delivery also means you avoid problems with mixed or conflicting dependencies in the environment, or dependencies moving over time.
We've spent a lot of time thinking about this topic over the years, as it can cause a lot of problems when you have a complicated system with many moving parts using a variety of online dependency management systems. Essentially, we do vendor all of our dependencies today to avoid those problems. However, we do so in a way that requires management of an additional entity, the "PNDA mirror", which complicates ops tasks like updates and has some specific technical issues around dependency management for the PNDA cluster as a whole which are awkward to manage. This is partly because it was a logical incremental step from what we were doing before.
However, reconsidering now, together with the requirement above to simplify the deployment mechanics to one approach that works across cloud and bare metal environments, we concluded that this principle is in fact best served by pre-building roles with their dependencies via containerization. This prompted a long & involved discussion on how this could work for PNDA, so I've captured it separately below.
Cleanly separated Configuration
This principle is about keeping apps separate from the config they need to become a deployed release in a particular environment, so as to enable "one app, multiple deploys", which in turn simplifies robust CI, test and deployment.
We already keep all our configuration separate from the code, but the Heroku paper suggests we don't go far enough - we tend to use a variety of configuration files rather than a ubiquitous system level mechanism like environment variables to convey configuration to apps at deploy time. The paper argues that arbitrary config files are hard to track, generate and manage consistently and instantiations of them can end up in source control by accident, as opposed to 'configuration control' which should be something separate from the code. In general, we manage configuration in Pillar, but this is 'close' to the code (it's in a codebase, by the above terminology).
We concluded that a move away from configuration files and towards using Consul to deliver configuration would help address this principle and in addition we need to do a review of all our current points of configuration and how they're handled with a view to moving them to a consistent approach that's fully separated from the code. We may be able to use collections of env vars as suggested in some places, especially if/when we move to a role/container based approach, but we are to some extent tied to the decisions made by the upstream projects we bring together.
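As a minimal sketch of the env-var approach suggested above (the variable names `PNDA_KAFKA_BROKERS`, `PNDA_LOG_LEVEL` and `PNDA_INGEST_FORMAT` are illustrative assumptions, not names PNDA uses today), a component would read all of its configuration from the environment at startup rather than from arbitrary config files:

```python
import os

def load_config():
    """Read all deploy-time configuration from environment variables.

    The variable names here are hypothetical examples; the point is that
    config lives in the environment (or is pushed there from a store like
    Consul), never in files checked in alongside the code.
    """
    return {
        # Required: comma-separated broker list, e.g. "10.0.0.1:9092,10.0.0.2:9092"
        "kafka_brokers": os.environ["PNDA_KAFKA_BROKERS"].split(","),
        # Optional values fall back to sensible defaults
        "log_level": os.environ.get("PNDA_LOG_LEVEL", "INFO"),
        "ingest_format": os.environ.get("PNDA_INGEST_FORMAT", "avro"),
    }
```

A Consul-based approach would look much the same from the app's point of view if a sidecar or entrypoint script exports the relevant keys into the environment before the process starts.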
Backing services as "Resources"
The example given is if you depend on a standard relational database then your code should simply make use of URIs and not be sensitive to swapping in/out different databases that essentially provide the same capability. However, the principle isn't limited to databases and the Heroku paper advocates treating all such 'backing resources' in this way so that the system remains loosely coupled, is easy to test, deploy the same code in different configurations without modification or the need for conditionality, etc.
An example of where this is currently done well is the console front and back ends, which each have their own implementation but communicate only via a REST API. An example of where perhaps this isn't so well done is where we make use of HBase for storing state and the codebase is tied directly to HBase via a specific library. We concluded that we need to do a review component by component across PNDA, check our approach & plan any enhancement work.
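To illustrate the 'backing resource as URI' idea (the schemes and default ports here are just examples, not a proposed PNDA interface), a component would accept a single URI from its deploy-time config and dispatch on the scheme, so swapping HBase for another store becomes a configuration change rather than a code change:

```python
from urllib.parse import urlparse

def open_state_store(uri):
    """Resolve a backing state store from a URI.

    Hypothetical sketch: each scheme maps to a driver; the calling code
    only ever sees the common store interface, so it stays loosely
    coupled to the concrete technology. Here we just return a tuple
    describing the connection for demonstration purposes.
    """
    parsed = urlparse(uri)
    if parsed.scheme == "hbase":
        return ("hbase", parsed.hostname, parsed.port or 9090)
    if parsed.scheme == "redis":
        return ("redis", parsed.hostname, parsed.port or 6379)
    raise ValueError("unsupported state store scheme: %r" % parsed.scheme)
```

The key property is that nothing in the component's code names a specific backing service; only the URI handed in at deploy time does.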
Strict separation between build, release and run
The idea here is that by keeping these stages strictly separated we can keep complexity isolated to the build stage (which is closely monitored by developers), keep it out of the run stage (so the system has few moving parts and there's less to go wrong when 'lights out'), and allow maximum flexibility in how to release from build to run (which is a restatement of the loose coupling principle echoed in the other principles).
Currently, our 'release' (which here means what we would call 'provision') does much of what the paper refers to as 'build', while we have a separate (lengthy) up-front build process (to take care of dependency vendoring, as above). Our run phase is fairly clean, until something needs to be changed, at which point it re-involves our build and release stages in a way that we already find problematic.
Closing this gap probably requires a change in approach and so, like the 'Dependencies' topic, this is covered by the containers discussion captured separately below.
Stateless, share-nothing processes
Essentially, are we able to move things around, run N rather than 1, kill and restart somewhere else, etc. If processes have process local state of some kind (in-memory or on local filesystem) then this principle is defeated and these things become difficult.
We have a built-in advantage here in that we're fundamentally layered on top of robust distributed state in the shape of a Hadoop stack. However, there are some areas that need review. We called out the following -
- Gobblin - runs on MRv2, but the launcher may have state tied to the node that runs the process, which would lead to problems if it were run from multiple places at once.
- ELK - is inherently single node and writes everything to the local file system.
- Jupyter - uses the local file system for user data and notebooks. Here there are solutions for using HDFS instead, we need to look into them.
- Grafana - like Jupyter, uses local file system to persist dashboards and user data.
- Kafka Manager - we think this might have local file system state in addition to what's held in Zookeeper and Kafka
- Package Repository - has a mode in which it uses the local file system to store packages. There's no real reason for this and we should use HDFS instead.
- Deployment Manager - has some state in-process that would mean another DM process would be unaware of what should be global application/package state. This needs to be moved into a "resource".
There may be others, this is another area where we need to review component by component across PNDA, check our approach & plan any enhancement work.
Apps export services via port binding
This is best explained by the counter-example - an app that's unable to service requests without having some sort of web server given to it at deployment time by the environment. This model is common with J2EE web container technologies and a consequence is address/port binding decisions outside of the control of the app itself. Apps should instead export their services via port binding, which makes them easily portable across environments, from local dev to the cloud.
Everything deployed in PNDA adheres to this principle already.
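For illustration (the handler and the `PORT` variable are hypothetical, not an existing PNDA service), a fully self-contained app in this style brings its own web server and binds its own port, with only the port number supplied by the environment:

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer

class HealthHandler(BaseHTTPRequestHandler):
    """Tiny example service: answers every GET with 'ok'."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"ok\n")

    def log_message(self, fmt, *args):
        pass  # keep the example quiet

def serve(port=None):
    # The app itself binds the address/port; only the number is taken
    # from the environment, so no external web container is needed.
    if port is None:
        port = int(os.environ.get("PORT", "8080"))
    return HTTPServer(("0.0.0.0", port), HealthHandler)
```

The same binary then runs unchanged on a laptop, in a VM, or in a container, with the deployment context simply routing traffic to the exported port.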
Scale out via process model & disposability
I've grouped two 'factors' into one here as they're fairly closely related.
- In support of scale out and robustness, applications should be designed to be able to span multiple processes, with workloads assigned to different process types.
- Processes should be robust against sudden death. For example, feed workers from a queuing backend that returns work to the queue on failure.
- Supervision should be handled using the frameworks of the operating system distribution, not mixed into the application code.
- Processes should be disposable - i.e. we should be able to simply kill them off and replace them, as in the 'cattle, not pets' metaphor.
- Should aim to minimize shutdown/startup time to better support dynamic scaling.
Generally PNDA services are single processes and we agreed it would be worthwhile to review whether this is appropriate in all cases or whether dividing applications into request handling and worker processes might make sense, for example. As part of this, we can review disposability and performance efficiency on start/stop and whether dividing into smaller processes helps or hinders these angles.
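The queue-backed worker pattern mentioned above can be sketched as follows (this uses an in-process `queue.Queue` purely for demonstration; a real deployment would use Kafka, RabbitMQ or similar so the queue survives the worker):

```python
import queue

def run_worker(work_queue, handle, max_items=None):
    """Pull items from a queue and process them.

    If the handler fails mid-item, the item is returned to the queue
    before the worker dies, so a replacement worker can pick it up -
    which is what makes the worker process disposable.
    """
    done = 0
    while max_items is None or done < max_items:
        try:
            item = work_queue.get(timeout=0.5)
        except queue.Empty:
            break  # nothing left to do
        try:
            handle(item)
        except Exception:
            work_queue.put(item)  # return work to the queue on failure
            raise
        done += 1
    return done
```

Because no item is lost when a worker dies, any number of identical worker processes can be started, killed and replaced freely.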
On daemonization and supervision, it's already the case that we use systemd across all of PNDA rather than handling these aspects ourselves.
Dev/prod parity
This principle simply states that you should aim to keep dev, stage & prod as close to identical as possible. This is one of the key enablers of continuous delivery - working on something near-identical to stage/prod means the time to roll out new changes, and the chance of the new changes doing something unexpected in stage/prod, is as low as possible. Ideally, whatever processes are used to do updates, deployment, scaling, etc in prod should be the same ones used in dev. In this way, everything is continuously used and [re]proven day in day out. The dev/test cycle should not only be as realistic as possible, but also as fast as possible, since the more cycles you can go through the quicker bugs will be stamped out and features delivered.
At the moment PNDA is defined on a node/cluster level and 'flavors' define different kinds of PNDA applicable for dev or prod. However, the different flavors are really quite different animals and the definition of them is spread across several repositories and paths. Furthermore, we tend to work in the cloud and deploy on-prem.
Partly, we need to re-think the flavors concept - breaking down PNDA into roles and defining a clean pattern to deploy these with configuration in a particular deployment, rather than using fixed cluster templates. Also, we need to rationalize the deployment targets so we have one way of deploying regardless of dev/stage/prod - AWS will never be deployable on-prem, and although we could use the on-prem technique for all clusters in the cloud, it's time consuming & awkward to create clusters in this way.
All of this points towards a re-think of how deployment is handled and the containerization subject.
Logs as event streams
The idea here is that all applications should simply generate logs to their stdout and all policy around how those logs are routed and used is a matter for the deployment context. This facilitates keeping applications and their configuration the same or very similar across multiple different deployments - an inversion-of-control concept similar to the point about process supervision above.
In many places this is how we handle logs - they're effectively handed straight to the supervisor, which in the case of systemd sends them to journald. However, there are various places where the creation of log files and the management of those is part of the application configuration, especially outside of PNDA processes.
We agreed that we need to do a review on this theme across the system and create actions to remediate where needed.
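As a minimal sketch of the target state (the logger name is hypothetical), an app would write its log events only to stdout and leave all routing policy to the supervisor:

```python
import logging
import sys

def make_logger(name):
    """Create a logger that writes one event per line to stdout.

    No file paths, rotation or shipping policy here - under systemd the
    stream lands in journald, under a container runtime it lands in the
    platform's log pipeline, and the app is identical in both cases.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.StreamHandler(sys.stdout)  # stdout, not a file
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    logger.handlers = [handler]
    logger.propagate = False
    return logger
```

Components that currently open and rotate their own log files would instead be configured (or wrapped) so their output reaches stdout in this way.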
Admin tasks as one-off processes
The principle here is basically to not build into your applications features to do what are in fact admin tasks (operations use cases, in other words). So we wouldn't expect to find a Kafka broker API to do partition reassignment but would expect a tool to do this - and this indeed is how Kafka is organized.
Essentially our 'tools' are distributed around the cluster and do tend to be separate apps - in addition to this we have the pnda-cli, which is separate from PNDA but carries out operational tasks on it.
Containerization of PNDA
Many of the above items point towards containerization -
- Develop once and deploy on any on-prem infra or cloud
  - that can support containers (which is pretty much everything)
- Use the exact same stage/prod deployment approaches on all environments
  - maybe even a laptop
- Promote strong cohesion and loose coupling among system parts
  - not a new idea! https://www.win.tue.nl/~wstomv/quotes/structured-design.html
- Allow multiple configurations of PNDA to be formed from those loosely coupled parts
  - rather than be constrained to cluster/node oriented flavors
The curse of the Mommy server
We identified that the way we create/operate PNDA today involves a form of "mommy server" - the "PNDA mirror" that contains all the software needed to build the system. In terms of addressing the dependencies topic above, this already 'vendors' all the dependencies.
However, together with some of the repository layout issues captured above, managing what dependencies are needed separately from how they're used, and separately again from where they're used, is the most significant source of complexity in PNDA today. Essentially it's a violation of the coupling/cohesion principles above.
Consider something like the deployment manager: the code is in one place, its dependencies are in another, and the definition of how it's configured and installed in yet another. Most non-trivial alterations involve changing all of these and updating multiple entities. This slows down dev, makes updates complex and prone to error, hampers fluid CICD, etc. What would be better is for each cohesive part of the system to define these things for itself, and then for those parts to be brought together.
What are the deployable units in PNDA?
So the next question we considered is: what are those deployable, cohesive, loosely coupled parts? Today we assemble PNDA on a node by node basis, installing software (dependencies) and components (PNDA and 3rd party open source) to those nodes directed by SaltStack after the nodes themselves have been created by something else which might be PNDA (pnda-cli) or separately.
The candidate 'cohesive separately deployable things' are, roughly -
- deployment manager (1 process)
- package repository (1 process)
- platform test agent & jmx proxy (2 processes)
- console front end & nginx (2 processes)
- console back end & graphite & redis (3 processes)
- data service (1 process)
- data curator agent (aka hdfs cleaner) (1 process)
3rd party stuff
- kafka manager (1 process)
- jupyter & jupyter proxy & livy (3 or more processes)
- gobblin & gobblin modules (1 process)
- kafka & kafkat (1 process)
- zookeeper (1 process)
- hadoop (many processes)
- opentsdb (1 process)
- grafana (1 process)
- ELK (2 processes + see logstash below)
Cross cutting concern stuff
Tools (probably not deployable, but versioned with the rest of PNDA as per principles above)
- pnda-cli + platform-salt + pnda = orchestrator
All of the above are candidates for, generally, one container binary. Hadoop is an exception.
Most of the above have common dependencies, therefore it makes sense to create a common base image for use in building the required containers. This would be one Dockerfile, building an image containing -
- other stuff tbd
From this, a set of additional build files would build each of the components listed above to [generally] one container. Given we have tens of thousands of lines of code that currently embodies our knowledge of how to build, install & configure these components in SaltStack, it would be beneficial to find a way to re-use this code to create these containers rather than re-write everything as Dockerfiles.
Deploy & Configure
Now we have a pile of binary containers, the next question is how do we compose a system. Container platforms like Kubernetes divide this up differently compared with how we approach this in PNDA today -
- Where do things go: Flavor (infra layer & bootstrap); Flavor (infra & platform layers)
- What things exist: Flavor (platform layer)
- Configuration & vertical scale: Pod & container config?; Flavor (platform layer)
So essentially what we currently define in thousands of lines of SaltStack code & CFN/Heat code will disperse into descriptors driving different aspects of Kubernetes, which we can divide into flavors. The bootstrap script code will become part of the containers themselves, along with the applications and their dependencies or become part of the pod definition of what goes where. The remainder of both SaltStack and bootstrap scripts which concerns per-flavor configuration will be given to the containers at deployment time, depending on deployment target.