Discussion:
[conda] Small issues with MRO R packages
Shaun Walbridge
2018-03-23 03:22:40 UTC
Hello,

Great to see the MRO packages and the continued work on putting up more of
the R ecosystem!

I recently installed a small collection of packages via the `r` channel,
and noticed a couple of possible issues with the mro-feedstock. The bigger
of the two is that each of the packages I installed included an
info/recipe/parent directory which was significant in size, in particular
because of the 40MB of test data. It looks like the behavior introduced by
conda-build PR #2687 <https://github.com/conda/conda-build/pull/2687>
probably isn't wanted here, or the test data may need to be
omitted because of its size. This doesn't affect the environment itself, but
it does waste space with each extracted package in `pkgs`, adding up
quickly, and will make the package downloads larger than necessary. It
looks like it should be possible to have these data files downloaded from
GitHub when the tests are executed instead of being included directly in
the recipe itself.
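
For anyone wanting to check this on their own machine, here is a minimal sketch that totals the size of each extracted package's info/recipe/parent directory. The function names are illustrative assumptions and not part of conda; only the info/recipe/parent layout comes from the description above, and the location of the `pkgs` directory has to be supplied by the caller.

```python
import os
from pathlib import Path
from typing import Dict


def dir_size_bytes(root: Path) -> int:
    """Sum the sizes of all regular (non-symlink) files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            if path.is_file() and not path.is_symlink():
                total += path.stat().st_size
    return total


def recipe_parent_usage(pkgs_dir: Path) -> Dict[str, int]:
    """Map each extracted package to the size of its info/recipe/parent dir.

    Packages without such a directory (and plain files such as downloaded
    tarballs) are skipped.
    """
    usage = {}
    for pkg in sorted(pkgs_dir.iterdir()):
        parent = pkg / "info" / "recipe" / "parent"
        if parent.is_dir():
            usage[pkg.name] = dir_size_bytes(parent)
    return usage
```

Calling `recipe_parent_usage(Path("/path/to/pkgs"))` and summing the values gives the total overhead in bytes. Files that are hard-linked elsewhere would be counted at full size here, so the result is an upper bound on the space actually consumed.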

The second potential issue: the base package creates duplicate DLLs.
"Rlapack.dll.mkl" is also copied as the (used) "Rlapack.dll", there are
nomkl versions as well, and the same goes for Rblas. Perhaps these are
part of a pattern still in progress, but from what I can tell, there is no
"nomkl" context here, and omitting these would save ~60MB.
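
A rough way to confirm which DLLs are byte-for-byte copies is to group files by content hash. This is only a sketch: the function names and the `*.dll*` glob pattern are assumptions chosen for illustration, and the directory to scan (for example the extracted base package) has to be supplied by the reader.

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def duplicate_files(root: Path, pattern: str = "*.dll*") -> List[List[Path]]:
    """Group files under root that match pattern and have identical content.

    Only groups with more than one member are returned.
    """
    by_hash: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(root.rglob(pattern)):
        if path.is_file():
            by_hash[file_sha256(path)].append(path)
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

Each returned group lists paths whose contents are identical, so a pair such as "Rlapack.dll" and "Rlapack.dll.mkl" would show up together if one is a straight copy of the other. Identical content does not by itself mean doubled disk usage, since hard-linked files share storage.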

Thanks again for all the great work, and if you'd like an issue (or even a
PR) let me know; it wasn't clear whether the aggregateMRO repository was
intended to be public-facing.

Cheers,
Shaun
--
You received this message because you are subscribed to the Google Groups "conda - Public" group.
To unsubscribe from this group and stop receiving emails from it, send an email to conda+***@continuum.io.
To post to this group, send email to ***@continuum.io.
Visit this group at https://groups.google.com/a/continuum.io/group/conda/.
To view this discussion on the web visit https://groups.google.com/a/continuum.io/d/msgid/conda/875accb0-12a3-4c1f-bc27-27ca0ccb4e0f%40continuum.io.
For more options, visit https://groups.google.com/a/continuum.io/d/optout.
Shaun Walbridge
2018-03-23 20:45:39 UTC
Got it on the MKL split. I figured I'd mention it since it doesn't map
directly onto the model used with Python in conda and e.g. the nomkl
package.

In terms of the testing, yes, it makes sense to retain it with the package
so that it will work in an air-gapped environment. A possible compromise
would be to just trim the dataset; I don't see anything specifically in the
tests that requires that volume of data to confirm the software is
functioning correctly, but I of course could be missing something. Thanks
for taking a look. Yes, as you mentioned, it's only really an issue in
aggregate -- installing here pulled in 31 MRO packages, for a total of
1240MB of disk usage.

Post by Ray
Oh actually, I didn't read carefully enough, ok 40MB per package is a bit
more of an issue (but it's not for every R package, just the ones that come
from that recipe, i.e. the 30 odd MS repacks). I will think about that one
some more then.

Post by Ray
Omitting test data due to size isn't something I will do; disk space is
cheap and 40MB is not so much (yes, it unpacks to more, but still, the
package cache is unpacked only once and hardlinks are used). We want our
tests to work in "air-gapped" situations too, esp. for something as
important as the MRO R interpreter. Sorry, I will resist any suggestion to
change this at all.

Rlapack.dll.mkl comes with MRO; this is not a pattern in progress, it is
a deliberate decision on our part to include both the MKL-optimized libs
and the GPL ones for GPL compliance (and benchmarking) reasons.
GPL-wise, if you uninstall r-revoutilsmath then you will have
redistributable software, and the mro-base package itself is fully GPL
compliant.

aggregateMRO is intended to be public-facing (and PRs for many things
would be considered) but I will reject any PRs to change these along the
lines you are suggesting.

Thanks,
Ray.