Attending:
Scott Koranda, Heather Flanagan, Leif Johansson, Giuseppe de Marco, Johan Lundberg, John
Paraskevopoulos, Alex Stuard, Hannah Sebuliba
Regrets:
Ivan, Rainer
Virtual IdP front end to Satosa - can expose multiple virtual IdPs through the Satosa
front end. Can configure various options for the IdP, including the name of the IdP, the
scope the IdP wants to use, etc. That config belongs to the front end. Scott also wants to
have some microservices that operate on the assertions as they go through the system, and
the microservices should have the same access to that configuration (e.g., so they can see
the scope). Waiting on a decision from Ivan on how to implement this.
Can you run a single Satosa instance with multiple front and back ends? For example, a SAML
back end that would authenticate against eduGAIN, another back end that would authenticate
against ORCID, and front ends that would work with either SAML or OIDC.
Question: has anyone set up the OIDC front end and had it work with mod_auth_oidc?
mod_auth_oidc is complaining. Giuseppe is planning to try this in the next month or so.
The OIDC front end won’t automatically work with multiple backends (it cannot select
between them), so a custom routing service is needed. Does anyone have such a routing
service available? Giuseppe wrote one; it can be found in the Satosa PRs. It intercepts the
call and uses a map of the entity IDs that need this routing.
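The map-based routing idea can be sketched in plain Python. This is only an illustration of the lookup, not Giuseppe's implementation and not the Satosa microservice API: the names `BACKEND_BY_ENTITY_ID`, `DEFAULT_BACKEND` and `select_backend`, the backend labels, and the example entity IDs are all hypothetical.

```python
# Hypothetical sketch: pick a backend from the requester's entity ID.
# None of these names come from Satosa itself.

DEFAULT_BACKEND = "saml2_edugain"

# Map of entity IDs (or client IDs) that need a specific backend.
BACKEND_BY_ENTITY_ID = {
    "https://rp.example.org/oidc": "orcid",
    "https://sp.example.org/shibboleth": "saml2_edugain",
}


def select_backend(requester_id, routing_map=BACKEND_BY_ENTITY_ID,
                   default=DEFAULT_BACKEND):
    """Return the backend name for a requester, falling back to the default."""
    return routing_map.get(requester_id, default)
```

In a real deployment a lookup like this would presumably run early in request processing, before any backend is invoked.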
Update on pyFF
Current release is 1.1.1; there are some bug fixes that need to go into 1.1.2 as soon as
possible. The code is stabilizing, but Leif is not sure he’d bet on 1.1.2 being stable.
2.0 will start with Leif removing the front end bits; he will provide a bash script to
help people who are used to calling pyffd. There will still be a wsgi app (and it will be
the main entry point).
Hannah has been working on some interesting memory things; she is looking for memory
leaks. Scott thinks that when pyFF is running as a server, it must never create a
really large DOM: it should avoid ever reading the eduGAIN feed as a single DOM object,
because that creates a huge, unnecessary memory request. It also has to avoid creating
lists of many things; even if you don’t load the whole DOM, a list of small DOMs that is
held in memory before being given to the backend store still consumes a lot of memory.
Scott suggests the architecture needs to shift from parsing large chunks of metadata to
parsing small chunks, handing them off to the backend, then garbage collecting. Leif points
out that as soon as you’re dealing with signed metadata, you have to handle all of it at
once. Could try to do something by making the pipeline smaller.
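The "parse small chunks, hand off, then garbage collect" idea can be sketched with the Python standard library. This is a minimal illustration, not how pyFF works today: `stream_entities`, `stream_entities_from_bytes` and the `store` callable (standing in for the backend store) are hypothetical names. It also does not verify signatures, which, as Leif notes, still requires the whole signed document.

```python
import io
import xml.etree.ElementTree as ET

MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"
ENTITY_TAG = f"{{{MD_NS}}}EntityDescriptor"


def stream_entities(source, store):
    """Parse an aggregate incrementally, handing each entity to `store`.

    `source` is a file-like object; `store` is a hypothetical callable
    taking (entity_id, serialized_xml). Returns the number of entities seen.
    """
    count = 0
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == ENTITY_TAG:
            store(elem.get("entityID"), ET.tostring(elem))
            count += 1
            # Drop already-processed children so they can be reclaimed
            # instead of accumulating into one large tree.
            root.clear()
    return count


def stream_entities_from_bytes(data, store):
    """Convenience wrapper for in-memory input (mainly for testing)."""
    return stream_entities(io.BytesIO(data), store)
```

Only one EntityDescriptor is kept alive at a time, provided `store` does not itself accumulate the entities in a list.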
One suggestion: switch to the Redis backend, which could work in some use cases, but not
for the full aggregate.
Could do an offline fetch as another way to control size.
One goal is to keep pyFF from needing a server with more than 4GB of memory. That is not
likely to hold as eduGAIN gets larger.
In eduTEAMS, pyFF does take up the largest memory footprint.
Could start pyFF, ingest all you need, use mirror MDQ to produce an offline copy, then
shut down the pyFF service until you need to reingest. The offline MDQ could be used for
discovery for as long as the signature is valid. Can use the thiss.io MDQ (thiss-mdq
<https://github.com/TheIdentitySelector/thiss-mdq>) for a miniature search function.
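An offline MDQ copy needs a stable lookup key per entity. The Metadata Query Protocol lets a client refer to an entity by `{sha1}` followed by the hex SHA-1 digest of its entityID, so the key can be computed as below. This is a minimal sketch: the function names and the base URL are hypothetical, and it does not describe how pyFF or thiss-mdq lay out their files.

```python
import hashlib
from urllib.parse import quote


def mdq_sha1_identifier(entity_id):
    """Return the '{sha1}<hex digest>' form of an entityID."""
    digest = hashlib.sha1(entity_id.encode("utf-8")).hexdigest()
    return "{sha1}" + digest


def mdq_entity_url(base_url, entity_id):
    """Build an MDQ-style lookup URL; base_url is an assumed example value."""
    identifier = quote(mdq_sha1_identifier(entity_id), safe="")
    return base_url.rstrip("/") + "/entities/" + identifier
```

For example, `mdq_entity_url("https://mdq.example.org/", "https://idp.example.org/idp")` yields a URL ending in `/entities/%7Bsha1%7D` followed by 40 hex characters.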
Giuseppe uses pyFF with a scheduler.
Another alternative is to use the default discovery service being put together by
SeamlessAccess.org <http://seamlessaccess.org/> (based on RA21)
Would be interesting to compare Whoosh+Redis to a JSON-only index store. Action item for
Hannah.
How to exclude entityIDs from pyFF? It works as expected up to 0.9.3. A filter can be
used, but the previous approach of fork, merge, remove no longer works. (The latter impacts
the current working document, and should not actually work.) Suggest looking at
load-cleanup: there’s a way to run a pipeline early on, before the backend store is
updated, and that might be it.
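The effect of excluding entityIDs before anything reaches the backend store can be illustrated outside pyFF. This is a sketch in plain Python, not pyFF pipeline syntax: `exclude_entities` is a hypothetical helper, it only looks at EntityDescriptor elements that are direct children of the root, and removing entities invalidates any signature over the aggregate.

```python
import xml.etree.ElementTree as ET

MD_NS = "urn:oasis:names:tc:SAML:2.0:metadata"
ENTITY_TAG = f"{{{MD_NS}}}EntityDescriptor"


def exclude_entities(aggregate_xml, excluded_ids):
    """Return the parsed aggregate with the excluded entityIDs removed."""
    root = ET.fromstring(aggregate_xml)
    # Iterate over a copy of the child list, since we remove while looping.
    for entity in list(root.findall(ENTITY_TAG)):
        if entity.get("entityID") in excluded_ids:
            root.remove(entity)
    return root
```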