Hi all,
Being part of Commons Conservancy has brought up yet another subject,
which is whether we should add a header with license information in
every file in the projects under idpy. This is not done in an
abstract way; there is a specific format for modelling this information
(see https://spdx.org/ and https://reuse.software/, more specifically
https://reuse.software/practices/2.0/). Still, I find it problematic.
We want to open up the question to the wider community and consider
their thoughts on this. The forwarded message below is discussing this
subject. You can see the question we posed, the answer we got and my
comments. Feel free to tell us what you think about this.
---------- Forwarded message ---------
Date: Thu, 16 May 2019 at 09:56
---------- Forwarded message ----------
Date: May 8, 2019, 8:15 AM -0700
Why does CC think having a single license file per project is
insufficient? Our thought is that if we can avoid adding a header to
every single file, that would be nice, esp. given we already have this
info in the license file and we have the Note Well.
This is not just our opinion, but something that is an industry and
community standard for legal compliance these days. When companies like
Siemens, Samsung or Honeywell use some code in one of the hundreds or
thousands of devices and systems in their product line, they need to be
able to provide the correct license and a download of the exact version.
This means machine readability too.
I've actually observed the opposite. Communities are abandoning the
"license in every file" model and just using a single LICENSE file in
the root of the project. The LICENSE file contains license
information; that is, it is not necessarily a single license, but can
have exception sections and so on.
To quote from
https://reuse.software/practices/2.0/ :
Scroll to the section "2. Include a copyright notice and license in each
file"...
"Source code files are often reused across multiple projects, taken from
their origin and repurposed, or otherwise end up in repositories where
they are separate from its origin. You should therefore ensure that all
files in your project have a comment header that convey that file’s
copyright and license information: Who are the copyright holders and
under which license(s) do they release the file?
Continuing from above, the standardization of package-management
formats and tools has helped exactly with that: to avoid distribution
of single files, and instead provide packages and modules. It is bad
practice and considered a hack to copy files. Nobody liked that model
and everyone is moving away from it; it is unstructured, it becomes
unmanageable and it will cause problems.
It is highly recommended that you keep the format of these headers
consistent across your files. It is important, however, that you do not
remove any information from headers in files of which you are not the
sole author.
You must convey the license information of your source code file in a
standardised way, so that computers can interpret it. You can do this
with an SPDX-License-Identifier tag followed by an SPDX expression
defined by the SPDX specifications."
(the text goes on for a while after this, to clarify the point but this
is the basic gist of it)
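
To make "so that computers can interpret it" concrete, here is a minimal
sketch of what reading such a tag could look like. It is not how
reuse-tool is implemented; the names SPDX_TAG, find_spdx_expression and
scan_tree are hypothetical, and reading only the first 4096 characters
of each file is an assumption made for this illustration.

```python
import re
from pathlib import Path

# Matches the tag in any comment style, e.g.
# "# SPDX-License-Identifier: Apache-2.0" (the identifier is just an example).
SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(.+)")


def find_spdx_expression(text):
    """Return the SPDX expression of the first tag in text, or None."""
    match = SPDX_TAG.search(text)
    if match is None:
        return None
    # Drop trailing comment closers such as "*/" or "-->".
    return match.group(1).strip().rstrip("*/->").strip()


def scan_tree(root):
    """Map each file under root to its SPDX expression (None if absent)."""
    results = {}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        key = path.relative_to(root).as_posix()
        try:
            # Assumption: a header, if present, sits near the top of the file.
            head = path.read_text(encoding="utf-8")[:4096]
        except (UnicodeDecodeError, OSError):
            # Binary or unreadable files cannot carry a comment header.
            results[key] = None
            continue
        results[key] = find_spdx_expression(head)
    return results
```

Calling scan_tree(".") in a checkout would list every file together
with the expression found, or None. Files that cannot hold a comment
(binary files, or formats without comment syntax) show up as None in
this sketch.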
There is a nice Python tool to check:
https://github.com/fsfe/reuse-tool
I hope this makes sense.
Well, it does not make complete sense. We're talking about licensing a
project. A project is not just code; there are data files (HTML, XML,
YAML, JSON files), binary files (archives/zip, images, audio, video,
etc.) and text files (configs, ini-files, etc.), all of them
"not-code". How do you mark those files? Does the LICENSE file need a
license header? The JSON format does not define comments, so how do
you add a header there? If a binary file does not get a license
header, why should a file with code get one?
I would expect there to be a way to have the needed information
unified. If the files themselves cannot provide this information it
has to be external; thus the LICENSE file. If someone is worried about
somebody else re-using single files that do not have license
information (a Python file, a PNG image, etc.) there is really nothing
you can do (the DRM industry has been trying to solve this for a long
time, and still your best bet is "social DRM").
Since we're developing open source software with a permissive license,
even if someone does that, should we be happy that someone is actually
using what we built, or sad that the files they copied did not have a
license header? And if they include the license information of that
copied file in their project's LICENSE file, is this solved?
Having pointed out these contradictions, I think that the "license
in every file" model is a step backwards. It introduces overhead and
does not really solve the problem, while at the same time it enables
a culture of bad practice (copying files around).
Cheers,
--
Ivan c00kiemon5ter Kanakarakis >:3