Hello everyone,
Going forward, one of the things we need to do is to revisit how
micro-services are structured. This is a big task, for which there
have been previous discussions on the mailing list and related github
issues. Those discussions mainly focused on splitting the
micro-services out, into a separate repository. While, this was given
a shot, it didn't work quite smooth.
With this email, I want to set the high-level requirements for a
plugin architecture which will help with separating the
micro-services, but also frontends and backends, to their own
packages, and make it easy to plug-in more micro-services, frontends,
backends (or other types of plugins).
Currently, the Satosa repository contains the core of satosa, the
supported backends, the supported frontends, and some micro-services.
Ideally, the satosa repository would contain only the core component.
What I want to do, is have each backend, frontend and micro-service be
its own package. These packages can then specify the desired
dependencies, their example configurations, documentation, as well as
installation and deployment instructions. Finally, the individual
plugins, can be developed and updated without the need to change the
core.
And, we can almost do that. We can separate backends, frontends, and
the micro-services out - not to a grouped repository (ie
"micro-services repository"), but each of those plugins to its own
repository and python package. There is little work that needs to be
done to enable this, mainly to decouple the core from two
micro-services (the CMservice and AccountLinking, that have special
requirements hardcoded).
But there is more than separating the plugins out. Separating the
plugins enables us to specify the control we want to have over the
provided functionality, the way certain aspects of the flow take
place, how state is managed, etc. By defining what a plugin is, we can
treat frontends, backends and micro-services in a uniform way.
A plugin is an external component, that will be invoked by Satosa at
certain points of an authentication flow. We need to _name_ those
points, and have plugins declare those in their configuration, thus
deciding when they should be invoked (and if they should be invoked in
multiple points). At different points in the flow, different
information is available. Plugins should be provided most of the
available information in a structured manner (the internal
representation).
Right now, we have four kinds of plugins (which can be though of as
roles), invoked at specific points of the flow:
- frontends: invoked when an authentication request is received
- backends: invoked when an authentication response is received
- request micro-services: invoked before the authentication request is
converted to the appropriate protocol
- response micro-services: invoked before the authentication response
is converted to the appropriate protocol
I'm not certain that this separation is the best; I can see the need
by some micro-services to know more than just the internal
representation of the available information. This can be solved in two
ways: introduce more points in the flow where a plugin will be invoked
and hope this point is better suited for the intended purpose, or,
enumerate what that needed information is and provide it in a safe way
- the example I have in mind, is a situation where a micro-service
needs to select certain SAML attributes to generate an id, but the
available information is limited to the internal attribute names,
which introduces an indirect coupling.
When talking about invoking external components, we usually think in
blocking terms. This, however, may not always be the case. Examples
include the need to do heavy IO operations, or invoke another service
over the network from which we do not expect a response (ie, send
logs, stats or crash reports to a monitoring service). For such cases,
we may want to set a certain plugin to work in async mode.
There are also cases, where we do expect an answer from an external
service, but this may come at an undefined time in the future and does
not interfere with the current flow. For these cases, we need to keep
some kind of state. This is now done in the form of a frontend channel
(ie, a cookie). What I'd like is to make this explicit and available
to the plugins as a module/function that handles state in a uniform
way.
Moving over to this structure will allow plugins to be much more
flexible. But there is still an issue hidden there. If we have plugins
as separate packages, developed and updated independently from the
core of Satosa, we also need a way to signal Satosa that such a
package has been installed, or updated and should be reloaded. This
will affect how all plugins are initialized and loaded internally, and
most probably it will affect how Satosa itself is initialized.
Along with that work, intrusive work needs to be done in error
handling and logging. At the moment, errors end up as plain text
messages (usually not that helpful) in the browser (which wraps the
text into basic html) and the logs. This needs to be change in the
direction of structured error messages. Logging will also change
towards that direction. Since, the logs will contain this information
in a structured manner, the same payload can be returned as the error
message. I would like to have messages structured in JSON format (most
probably), with context, data and a backtrace included among other
information (such as timestamp, hostname, process-id, src-map,
request-id, and more.) Provided this information, another process (a
frontend/error-handling service) can parse it and present the user
with the appropriate error message.
The structured logger and the error-handling service should be part of
the parameters that initiaze a plugin. The plugins should make use of
them, in order for the service to have a uniform way of handling these
cross-cutting concerns.
The library I'm looking into, to take care of the log format is structlog:
https://github.com/hynek/structlog
Other things to look for in the future, is grouping and high-level
coordination work between plugins in the form of invocation
strategies. Given three plugins, I want to invoke them in order, until
one succeeds with returning a result. Or, I want to invoke them in
parallel and get an array of results. Or, I want to invoke this plugin
that does a network operation, and if it fails, I want to retry 3
times with an exponential backoff.
To sum up, this was an (non-technical) overview of the things that I'd
like to do in relation to the "plugins". For some of the above, the
technical parts are still under consideration. There are more things
to be done for Satosa, both technical and not which I hope to write
down, discuss and do with everyone's help and suggestions.
Cheers,
--
Ivan c00kiemon5ter Kanakarakis >:3