Idpy meeting 3 September 2024
Attendees: Johan W, Johan L, Shayna, Ivan, Hannah S
0 - Agenda bash
1 - Project review
a. General - Ivan's plan is to merge things that don't break anyone's flow.
The plan is still to move away from pyop, however.
More patches are coming up for idpy-oidc - they are in internal repos, so no PRs.
A configuration change is needed to handle redirect URIs better - URLs with special characters such as spaces work with some flows but not with others.
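As a minimal sketch of the issue above (the endpoint URL and client id are hypothetical, not from idpy-oidc), a redirect URI containing a space only survives if it is percent-encoded when embedded in the authorization request, and the registered value must round-trip exactly for exact-match comparison at the OP:

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Hypothetical redirect URI containing a space.
redirect_uri = "https://client.example.org/cb?next=/my page"

# Percent-encode the URI before embedding it as a query parameter of the
# authorization request; some flows break when the space is left raw.
params = urlencode({
    "response_type": "code",
    "client_id": "client123",
    "redirect_uri": redirect_uri,
})

# The value must round-trip exactly so the OP's exact-match check succeeds.
sent = parse_qs(urlsplit("https://op.example.org/auth?" + params).query)
assert sent["redirect_uri"][0] == redirect_uri
```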
Resource indicators - there are specific use cases for how they are to be treated, and these have been encoded into tests; the code changes come later. Work is underway to separate what happens when resource indicators are in place with regard to token exchange - the two specs reference each other but also conflict in some ways.
New concepts around audience policies are being introduced: a mechanism that allows you to state an audience and what requirements you have for it (allow multiple values, only one value, etc.).
This is separate from resource indicators, where you signal which value or values should be set as the audience.
There are also open questions about how things work and when resolution takes place across the different layers - you could request resource X and this means the audience will be service 1; the identifiers can be different.
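A toy sketch of what an audience policy check could look like - the policy keys (`allowed`, `max_values`) and semantics are assumptions for illustration, not idpy-oidc's actual API:

```python
def check_audience(requested, policy):
    """Validate requested audience values against a hypothetical policy.

    policy = {"allowed": set of permitted audiences,
              "max_values": int cap on how many may be requested, or None}
    """
    if any(aud not in policy["allowed"] for aud in requested):
        return False
    if policy["max_values"] is not None and len(requested) > policy["max_values"]:
        return False
    return True


# A policy that allows exactly one audience value from a fixed set.
policy_one = {"allowed": {"service1", "service2"}, "max_values": 1}

assert check_audience(["service1"], policy_one)
assert not check_audience(["service1", "service2"], policy_one)  # too many values
assert not check_audience(["service3"], policy_one)              # not an allowed audience
```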
Anything behind a feature flag can probably be merged, such as the logout capabilities that Hannah S and Ali have been working on.
logout PRs that can be merged -
lxml:
https://github.com/IdentityPython/pysaml2/pull/940 - not complete; it is a draft. It is a basis for using lxml everywhere in the project. The lxml parser is QName-aware - it knows when an XML attribute contains a namespace prefix or a type. The default Python parser does not do anything with namespaces, so when you try to do validation the namespace is missing, because Python has optimized it away (removed it). There are certain use cases where this is a problem. Ivan may also talk to a person who has an XML validator that uses the default Python parser but is still able to check for those edge cases.
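The namespace loss can be shown with the standard library alone - here a QName-valued attribute (`xsi:type="xs:string"`, as commonly seen in SAML) goes through an ElementTree round trip; ElementTree only tracks namespaces used in element and attribute *names*, so the `xmlns:xs` declaration referenced from the attribute *value* is dropped, while lxml keeps all declarations on the element:

```python
import xml.etree.ElementTree as ET

# The "xs" prefix in the xsi:type value refers to the xmlns:xs declaration,
# but only from inside the attribute *value*, which ElementTree ignores.
doc = ('<a xmlns:xs="http://www.w3.org/2001/XMLSchema" '
       'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
       'xsi:type="xs:string">hi</a>')

root = ET.fromstring(doc)
out = ET.tostring(root, encoding="unicode")

# The xsi namespace survives (it appears in an attribute name, re-declared
# with a generated prefix), but xmlns:xs is gone: the "xs:string" QName
# now dangles, which is what breaks schema validation downstream.
assert "XMLSchema-instance" in out
assert "xmlns:xs=" not in out
```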
e. Any other project (pyFF, djangosaml2, pyMDOC-CBOR, etc)
Question on Slack concerning pyFF from Hannah at CERN. Ivan will try to get to it today.
pyFF - MDQ - Ivan would like a configuration option that says whether output goes to the file system (what happens now), to S3, or into a database (in which case you don't need a discovery service; everything can be an API call). In the database case, things can be quickly sorted and indexed the way you like. The problem is that there is no mapping between XML and a table, so we need to think about how to do indexing without the schemas, and so on. This would unlock capabilities we don't have right now and also simplify what we do with the discovery service.
We also need to change the way we parse XML into memory - this can be done within the EntitiesDescriptor and shouldn't be hard. It would mean we no longer need a big machine or lots of resources to parse a large aggregate every 6 hours. pyFF could possibly be put into a lambda.
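A sketch of what streaming within the EntitiesDescriptor could look like, using stdlib `iterparse` (the two entityIDs are made-up stand-ins for a real multi-megabyte aggregate): each EntityDescriptor is handled and then cleared, so memory stays flat regardless of feed size.

```python
import io
import xml.etree.ElementTree as ET

MD_NS = "{urn:oasis:names:tc:SAML:2.0:metadata}"

# Toy aggregate standing in for a large MDQ feed.
feed = io.BytesIO(
    b'<EntitiesDescriptor xmlns="urn:oasis:names:tc:SAML:2.0:metadata">'
    b'<EntityDescriptor entityID="https://idp.example.org/shibboleth"/>'
    b'<EntityDescriptor entityID="https://sp.example.org/shibboleth"/>'
    b'</EntitiesDescriptor>'
)

entity_ids = []
for event, elem in ET.iterparse(feed, events=("end",)):
    if elem.tag == MD_NS + "EntityDescriptor":
        entity_ids.append(elem.get("entityID"))
        elem.clear()  # drop the finished subtree so memory stays flat
```

The same pattern (lxml also offers `iterparse`) is what would let pyFF run in a small container or a lambda instead of needing a big machine for the full-document parse.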
SATOSA itself can also be simplified, but the whole configuration would need to change. They have looked at moving toward a framework like Django - not sure if this would be done as SATOSA or as a SATOSA version 2. A new approach in parallel with what we have now - does that make sense time-wise and maintenance-wise? How do we do this without breaking what is there now? We need to experiment with Django. The async parts of Django would make some parts of SATOSA easier: background things like statistics that don't need to interact with the actual flow but need to be there - perhaps an API call to Elasticsearch to record that a new flow happened. OpenTelemetry - asynchronous calls to the logger, tracing - do these in a way that doesn't affect the timing of the flow itself.
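A minimal fire-and-forget sketch of the idea above, in plain asyncio (the event name and the simulated Elasticsearch call are hypothetical, not SATOSA code): the telemetry task is scheduled in the background so its latency never lands on the authentication flow itself.

```python
import asyncio
import time

events = []

async def record_event(name):
    # Stand-in for an API call to e.g. Elasticsearch or a tracing backend;
    # the 50 ms sleep simulates network latency.
    await asyncio.sleep(0.05)
    events.append(name)

async def handle_flow():
    # Fire-and-forget: the statistics call runs concurrently and does not
    # add its latency to the flow.
    task = asyncio.create_task(record_event("new-flow"))
    start = time.monotonic()
    # ... the actual authentication flow work would happen here ...
    elapsed = time.monotonic() - start
    assert elapsed < 0.05  # the flow did not wait for telemetry
    await task             # in a real server the event loop keeps it alive instead

asyncio.run(handle_flow())
```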
2 - AOB
Ivan is doing a lot of work on EOSC with the AAI integration.
Next meeting - 17 September. Shayna will not be available but will send out the meeting reminder. Ivan will take notes and send them to Shayna to distribute.