Unfortunately I can’t shine any light on why hashing of the user_id was introduced.
I was not directly involved in the SATOSA development at the time it happened.
I will though by chance meet the guy who was responsible next week.
If he remembers what made them add this functionality, I’ll be sure to transfer that piece
of information to here.
On 17 Sep 2018, at 23:42, Ivan Kanakarakis
<ivan.kanak at gmail.com> wrote:
Hi everyone,
I proposed to discuss this at the bi-weekly idpy call (Tue 18/9 at
13:00 UTC). This is a bit complex, so I thought I'd do a writeup for
anyone that would like to discuss this.
I will try to describe how SATOSA handles the NameID value, and by
this I mean how the text value of the NameID node is transformed,
after it is received by the saml-backend (from an IdP), and before it
is passed over to the appropriate (saml or openid) frontend.
Omitting some details the flow is as follows: Once an authn-response
is received, the backend parses the response and extracts the NameID
node as `name_id` and its text value as `user_id`[0]. Control flow is
returned to the `base` module where `used_id` is hashed[1] before it
is passed to the response-microservices. Once microservices are done
doing their job, control is back in `base` module and the `user_id` is
hashed again[2]. Finally, the appropriate frontend module (saml[3],
openid[4]) is invoked and `user_id` is used as the basis for the
attribute that is needed.
What stands out here is the hashing of `user_id` (that happens twice).
This is something that has come up multiple times in different forms
(pull-requests #137 #160 #193 work around it). I decided to figure out
what's going on: How is that decision formed? Why does that take
place? Does it provide any value?
Hashing of `user_id` was introduced by commit b54006f [5].
Unfortunately, the accompanying commit message does not offer an
explanation or any hints on why this was needed. Looking at the
evolution of the code, nothing shows me why that was needed or what
value it added. A hint is that it is closely coupled to the
`UserIdHashType` enumeration[6] from `internal_data` module.
Unfortunately, the evolution of that part of the code also doesn't
help to reveal the need for both the enumeration and the hash.
In the hope that maybe a test that I did not see at the time could
reveal a reason for this behaviour, I went ahead and removed the
relevant code. You can see the changeset on my personal fork on the
branch named 'refactor-remove-user-id-hash' [7]. And nothing really
broke! At this point I do not see any value keeping the UserIdHashType
enumeration and hashing of `user_id`.
If someone has an insight on why hashing `user_id` was introduced and
what its purpose was, I would be interested to know. If we do not come
up with a valid use case, I would propose to move forward by removing
that part of the code.
[0]:
https://github.com/IdentityPython/SATOSA/blob/d861e87/src/satosa/backends/s…
[1]:
https://github.com/IdentityPython/SATOSA/blob/d861e87/src/satosa/base.py#L1…
[2]:
https://github.com/IdentityPython/SATOSA/blob/d861e87/src/satosa/base.py#L1…
[3]:
https://github.com/IdentityPython/SATOSA/blob/d861e87/src/satosa/frontends/…
[4]:
https://github.com/IdentityPython/SATOSA/blob/d861e87/src/satosa/frontends/…
[5]:
https://github.com/IdentityPython/SATOSA/commit/b54006f
[6]:
https://github.com/IdentityPython/SATOSA/blob/d861e87/src/satosa/internal_d…
[7]:
https://github.com/c00kiemon5ter/SATOSA/compare/d861e87...refactor-remove-u…
Cheers,
--
Ivan c00kiemon5ter Kanakarakis >:3
_______________________________________________
Idpy-discuss mailing list
Idpy-discuss at lists.sunet.se
https://lists.sunet.se/listinfo/idpy-discuss
— Roland
The higher up you go, the more mistakes you are allowed. Right at the top, if you make
enough of them, it's considered to be your style.
-Fred Astaire, dancer, actor, singer, musician, and choreographer (10 May 1899-1987)