Discussion:
[conda] environment.yaml for reproducible environments, across operating systems
Chris Withers
2018-04-05 07:41:07 UTC
Hi All,

The common pattern, both across environment management tools, and indeed
languages, appears to be the split between a spec file (what I want) and
an environment file (what I get).
In the python world, both pip-tools and pipenv have similar concepts,
and I'm wondering what the equivalent is for conda environments?

In my case, I want to be able to specify the packages I want and some
version requirements (maybe just package names, maybe >someversion or
<someversion), and then have some form of lock file, ideally that will
work across operating systems, generated that I can use to set up
environments in production.

What are the options for this kind of thing when using conda?

cheers,

Chris
--
You received this message because you are subscribed to the Google Groups "conda - Public" group.
To unsubscribe from this group and stop receiving emails from it, send an email to conda+***@continuum.io.
To post to this group, send email to ***@continuum.io.
Visit this group at https://groups.google.com/a/continuum.io/group/conda/.
To view this discussion on the web visit https://groups.google.com/a/continuum.io/d/msgid/conda/2cf95fd6-1b25-e45b-d87a-0b82e8996432%40withers.org.
For more options, visit https://groups.google.com/a/continuum.io/d/optout.
Ariel Balter
2018-04-05 08:02:23 UTC
@Chris -- I'm not experienced with production environments, so can you
please give me a use case that will help me understand the difference
between a spec file and an environment description?

When I do

conda env export -n base -f conda_export.yml

I get something like:

name: base
channels:
  - anaconda
  - defaults
dependencies:
  - alabaster=0.7.10=py36hcd07829_0
  - asn1crypto=0.24.0=py36_0
  - astroid=1.6.2=py36_0
  - babel=2.5.3=py36_0
  .
  .
  .
  - r-zoo=1.8_0=mro343h889e2dd_0
  - pip:
    - bash-kernel==0.7.1
    - coloredlogs==9.0
    - humanfriendly==4.8
    .
    .
    .

You can recreate the environment with:

conda env create -f conda_export.yml

Seems like this is pretty close to a production method for reproducible
environments.

There's also a helper utility called conda-env

https://github.com/conda/conda-env
Chris Withers
2018-04-05 09:16:28 UTC
Post by Ariel Balter
@Chris -- being not experienced with production environments, so can you
When I do
conda env export -n base -f conda_export.yml
  - anaconda
  - defaults
  - alabaster=0.7.10=py36hcd07829_0
  - asn1crypto=0.24.0=py36_0
These are OS-specific and may also include packages for one OS that are
not present on another.
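
To make the OS-specific part concrete: in an exported line such as alabaster=0.7.10=py36hcd07829_0, the third field is the build string, and that is the piece most tied to one platform. Below is a minimal Python sketch, not anything conda ships, that drops the build field from exported entries. The function names are made up for illustration, and versions can still differ between platforms, so this only removes one source of mismatch.

```python
def strip_build(dep: str) -> str:
    """Turn 'name=version=build' into 'name=version'; leave other forms alone."""
    parts = dep.split("=")
    # A fully pinned conda export line has exactly three non-empty fields.
    if len(parts) == 3 and all(parts):
        return "=".join(parts[:2])
    return dep


def portable_deps(deps):
    """Strip builds from conda entries; pass nested dicts (the pip section) through."""
    return [strip_build(d) if isinstance(d, str) else d for d in deps]


exported = [
    "alabaster=0.7.10=py36hcd07829_0",
    "asn1crypto=0.24.0=py36_0",
    {"pip": ["coloredlogs==9.0"]},
]
print(portable_deps(exported))
```

Packages that exist on only one OS would still need to be filtered out separately.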
Post by Ariel Balter
conda env create -f conda_export.yml
Right, but how do you make sure the .yml file matches what you actually
have installed? Put differently, how can I get conda
(install|upgrade|remove) to update the .yml file each time it runs?
Post by Ariel Balter
Seems like this is pretty close to a production method for reproducible
environments.
There's also a helper utility called conda-env
https://github.com/conda/conda-env
That's now part of conda, creating the confusion of when/if you should
use conda {something} versus conda env {something}

Chris
Chris Barker - NOAA Federal
2018-04-06 00:13:30 UTC
Chris Withers
2018-04-06 08:47:10 UTC
Post by Chris Barker - NOAA Federal
These are OS-specific and may also include packages for one OS that are not present on another.
Yeah, and I’ve found that frustrating too — because it almost works :-)
But can we expect it to? The very definition of a perfectly reproduced
environment is platform specific anyway.
What I do is create an environment.yaml file for each platform I need
to support.
I think this is why we're seeing the split into a 'user requirements'
file and a 'lock' file in other systems that do this kind of thing
(pip-tools, pipenv, npm, rust, etc)
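
For the one-file-per-platform approach, the selection step could look something like the sketch below. The file names are purely hypothetical (nothing in conda mandates them); the only real API used is sys.platform from the standard library.

```python
import sys

# Hypothetical naming scheme: one lock file per platform family.
LOCK_FILES = {
    "linux": "environment.linux.lock.yaml",
    "darwin": "environment.osx.lock.yaml",
    "win32": "environment.win.lock.yaml",
}


def lock_file_for(platform: str = sys.platform) -> str:
    """Pick the lock file whose key is a prefix of the platform string."""
    for prefix, filename in LOCK_FILES.items():
        if platform.startswith(prefix):
            return filename
    raise ValueError(f"no lock file defined for platform {platform!r}")


print(lock_file_for("darwin"))
```

The chosen name could then be passed to conda env create -f on the target machine.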
Post by Chris Barker - NOAA Federal
Right, but how do you make sure the .yml file matches what you actually have installed?
That’s guaranteed (assuming no bugs :-) )
I beg to differ ;-)

conda create -n myenv package1
source activate myenv
conda env export > environment.yaml
conda install package2

environment.yaml is now out of date.
Post by Chris Barker - NOAA Federal
Put differently, how can I get conda (install|upgrade|remove) to update the .yml file each time it runs?
You can’t — you do a new export yourself.
Right, I'm wondering if maybe install/upgrade/remove should maintain one
or both of environment.yaml and environment.lock.yaml?
Post by Chris Barker - NOAA Federal
It kinda makes sense that you want to have control of when you lock
down your environment.
...and means you have to have some CI step that checks if your "lock"
file matches what's really installed once you install your "lock" file ;-)
Post by Chris Barker - NOAA Federal
I might update a package, run my tests, have some fail, then downgrade
it again. I wouldn’t want my environment file to have been updated in
the middle of that.
Assuming the install/upgrade/remove tooling was maintaining it, why
not? I agree that I wouldn't check the modified files into source
control until I was sure I wanted the changes, but I'd certainly like
the file(s) to be accurate at all times!
Post by Chris Barker - NOAA Federal
That's now part of conda, creating the confusion of when/if you should use conda {something} versus conda env {something}
Yup, that is confusing...
Any conda devs knocking around on here who might be able to comment on
the best way forward for that?

Anyway, I need to get something working as my current
string'n'duct'tape[1] has run out of steam :-(

I'm likely going to resurrect picky-conda [2], and wonder how you and
others would feel about a workflow such as this:

conda create -n myenv
source activate myenv
"install packages"
picky lock
*do more*
picky check

So, "install packages" here could mean "conda install foo", but that
won't track which packages you've installed explicitly versus ones that
have come as dependencies, so I'd maybe suggest adding to a bare-bones
environment.yaml and then, each time, doing:

conda env update -f environment.yaml

"picky lock" would take options in an environment.yaml section (assuming
conda ignores top-level keys it doesn't understand), and use them to
massage the output of "conda env export" into an environment.lock.yaml
that could be used to conda env create/update for reproducible builds.
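
As a rough illustration of that "massage" step (a sketch under assumptions, not the actual picky-conda implementation), the function below takes dependency strings in the name=version=build form that "conda env export" emits, drops ignored packages, and trims each line according to a detail level. The function name, parameter names, and the sample version numbers are invented for the example.

```python
def lock_dependencies(exported, ignore=(), detail="version"):
    """Filter and trim 'conda env export' style dependency strings.

    exported: list of 'name=version=build' strings
    ignore:   package names to leave out of the lock file
    detail:   'version' keeps name=version, 'full' keeps the raw line
    """
    if detail not in ("version", "full"):
        raise ValueError(f"unsupported detail level: {detail!r}")
    locked = []
    for line in exported:
        name, _, rest = line.partition("=")
        if name in ignore:
            continue
        if detail == "version":
            version = rest.split("=")[0]
            line = f"{name}={version}" if version else name
        locked.append(line)
    # Sorting keeps the lock file stable from run to run.
    return sorted(locked)


# Illustrative input only; these versions are made up.
exported = [
    "zipline=1.1.1=np111py35_0",
    "appnope=0.1.0=py35_0",
    "bokeh=0.12.4=py35_0",
]
print(lock_dependencies(exported, ignore={"appnope"}))
```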

"picky check" would be the same as lock but would just whine if the
environment.lock.yaml it generates doesn't match the one on disk.
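
The check could be as small as a diff between the freshly generated lock lines and the ones on disk. The sketch below uses difflib from the standard library; check_lock is a hypothetical name, the sample lines are invented, and how the two lists are produced is left out.

```python
import difflib


def check_lock(generated, on_disk):
    """Return unified-diff lines between the lock on disk and a fresh one."""
    return list(difflib.unified_diff(
        on_disk,
        generated,
        fromfile="environment.lock.yaml (on disk)",
        tofile="environment.lock.yaml (generated)",
        lineterm="",
    ))


on_disk = ["- package1=1.0"]
generated = ["- package1=1.0", "- package2=2.0"]
problems = check_lock(generated, on_disk)
for line in problems:
    print(line)
# A CI step could exit non-zero whenever `problems` is non-empty.
```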

Interesting opportunities for also pruning dependencies that are no
longer needed as well as upgrading packages in a more controlled fashion...

A sample environment.yaml might be:

channels:
  - Quantopian
  - defaults
dependencies:
  - zipline
  - bokeh
  - notebook
  - pandas<0.19
  - pip:
    - ansible==2.1
    - testfixtures>=6.0.0
lock:
  ignore:
    # mac-only:
    - appnope
    # installed by 'pip install -e .'
    - mypackage
  detail: version
  # above could be:
  # version (just the same version)
  # full (ie: raw conda env export)
  # secure (include md5's, but conda doesn't currently support?)

Thoughts?

[1] http://picky.readthedocs.io/en/0.9.1/

[2] https://github.com/Simplistix/picky-conda
Chris Barker
2018-04-06 19:41:38 UTC
Post by Chris Withers
Right, but how do you make sure the .yml file matches what you actually
have installed?
That’s guaranteed (assuming no bugs :-) )
I beg to differ ;-)
conda create -n myenv package1
source activate myenv
conda env export > environment.yaml
conda install package2
it's guaranteed to match WHEN you do the export of course.

environment.yaml is now out of date.
Post by Chris Withers
Put differently, how can I get conda (install|upgrade|remove) to update
the .yml file each time it runs?
You can’t — you do a new export yourself.
Right, I'm wondering if maybe install/upgrade/remove should maintain one
or both of environment.yaml and environment.lock.yaml?
does pip or virtualenv really do that? It sure didn't before I gave up on
virtualenv :-)

Post by Chris Withers
Assuming the install/upgrade/remove tooling was maintaining it, why not? I
agree that I wouldn't check the modified files into source control until I
was sure I wanted the changes, but I'd certainly like the file(s) to be
accurate at all times!
I can see the logic, but I still think it's at the wrong point in your
workflow. Saying "this is the official deployment environment" should be a
pretty deliberate step. If it's updating itself constantly as I update the
environment, I'm not sure I see a point in having it at all.

Imagine multiple developers -- each manipulating their environments on the
fly differently - seems ripe for confusion and error.

and if multiple developers then you have a two-way street:

developer A makes changes to their environment -- the environment.yaml file
updates itself.

developer B makes different changes to their environment -- the
environment.yaml
file updates itself.

They both merge into master

now we have devA's environment, devB's environment, and a third merged
version (which could be broken with merge conflicts...)

devs A and B do a pull.

now the environment file is out of date with the developer's environment....

Does the environment somehow magically update itself??? or does it save its
current state back into the environment file, thereby downgrading the
environment again?

Anyway, I'm not saying an automated workflow for this couldn't be devised,
but I'm suggesting that conda probably isn't the place for that automation.

I'm likely going to resurrect picky-conda [2],


hmm, that could be handy, yes.
Post by Chris Withers
conda create -n myenv
source activate myenv
"install packages"
picky lock
*do more*
picky check
So, "install packages" here could mean "conda install foo", but that won't
track which packages you've installed explicitly versus ones that have come
as dependencies, so I'd maybe suggest adding to a bare-bones
conda env update -f environment.yaml
I'm not sure I've arrived at the "best" way to do it, but I currently use
a conda_requirements.txt for development, that specifies only the
top-level packages and not all pinned down (i.e. >=) and then a separate
environment.yaml file to fully lock it down for the production environment.

This means that each developer on the project may not be using the exact
same packages, but we can push a fully tested environment for deployment.

And having the developers have a bit more flexibility helps us keep
dependencies up to date and catch the bugs that that introduces.

but it is kinda ugly to keep all that in sync.

Now that I think about it, the practical issues we've had are when one
developer updates a dep (or adds one) and, even if they update the
requirements files, other devs pull from the main repo and keep running
with their existing environment.

Hmm -- we have actually put in some kludgy run time check code for versions
of our in-house-under development deps.

Maybe we should do the same for all deps, pulling from a single
requirements file.
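
A run-time check along those lines might look like the sketch below. It is only a sketch: it assumes the requirements are simple "name", "name>=1.0", "name==1.0" or "name<1.0" strings, compares purely numeric dotted versions, and relies on importlib.metadata (standard library, Python 3.8+), which sees Python package metadata but not non-Python conda packages. All function names are invented for the example.

```python
from importlib import metadata


def parse_version(text):
    """Split '1.2.3' into (1, 2, 3); non-numeric parts are dropped."""
    return tuple(int(p) for p in text.split(".") if p.isdigit())


def parse_requirement(line):
    """Split 'name>=1.0' into (name, op, version); bare names get op None."""
    # Two-character operators are tried first so '>=' is not read as '>'.
    for op in (">=", "<=", "==", "<", ">"):
        name, found, version = line.partition(op)
        if found:
            return name.strip(), op, version.strip()
    return line.strip(), None, None


CHECKS = {
    ">=": lambda a, b: a >= b,
    "<=": lambda a, b: a <= b,
    "==": lambda a, b: a == b,
    "<": lambda a, b: a < b,
    ">": lambda a, b: a > b,
}


def unmet(requirements, installed_version=metadata.version):
    """Return messages for requirements the running environment fails."""
    problems = []
    for line in requirements:
        name, op, wanted = parse_requirement(line)
        try:
            have = installed_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed")
            continue
        if op and not CHECKS[op](parse_version(have), parse_version(wanted)):
            problems.append(f"{name}: have {have}, need {op}{wanted}")
    return problems


print(unmet(["pip"]))
```

Calling unmet() at application start-up with the lines of the shared requirements file would flag a developer whose environment has fallen behind. Note that the tuple comparison treats 1.0 and 1.0.0 as different, so a real tool would want a proper version parser.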

Post by Chris Withers
"picky lock" would take options in an environment.yaml section (assuming
conda ignores top-level keys it doesn't understand), and use them to
massage the output of "conda env export" into an environment.lock.yaml that
could be used to conda env create/update for reproducible builds.
"picky check" would be the same as lock but would just whine if the
environment.lock.yaml it generates doesn't match the one on disk.
I'm still confused about what a "lock" is in this context, but I think
you're going in the right direction.
Post by Chris Withers
Interesting opportunities for also pruning dependencies that are no longer
needed
I've always wanted to do some kind of run-time checking -- run your test
code, and see what's in sys.modules -- those are your deps. Mapping that to
pip or conda packages is a different story, however...
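
The first half of that idea is easy to sketch: after the tests have run, collect the top-level names in sys.modules and drop the standard library. The function name below is hypothetical, and sys.stdlib_module_names only exists on Python 3.10+, so on older interpreters this sketch falls back to not filtering the standard library at all. Mapping the resulting names to pip or conda packages is, as said, the harder part and is not attempted here.

```python
import sys


def imported_top_level(modules=None):
    """Return sorted top-level names of imported, non-stdlib modules."""
    modules = sys.modules if modules is None else modules
    # sys.stdlib_module_names exists on Python 3.10+; otherwise filter nothing.
    stdlib = getattr(sys, "stdlib_module_names", frozenset())
    names = {name.partition(".")[0] for name in list(modules)}
    # Names starting with '_' (e.g. __main__, _thread) are treated as internal.
    return sorted(n for n in names if n not in stdlib and not n.startswith("_"))


# Run this after the test suite has imported everything it needs.
print(imported_top_level())
```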

-CHB
--
Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R (206) 526-6959 voice
7600 Sand Point Way NE (206) 526-6329 fax
Seattle, WA 98115 (206) 526-6317 main reception

***@noaa.gov
Chris Withers
2018-04-09 06:29:41 UTC
Chris Barker
2018-04-09 13:57:04 UTC
Post by Chris Barker
I can see the logic, but I still think it's at the wrong point in your
workflow. Saying "this is the official deployment environment" should be a
pretty deliberate step.
I don't want to labour the point, but yes, that step would be when you
check the .lock file into your source control.
my experience is that my devs (OK, me :-) ) when updating a lot of code,
tend to do a "git commit -a" and may not notice that they're committing a
changed .lock file. Which is why I prefer to NOT have that file get updated
automatically. But maybe that's a workflow problem on our end...

Post by Chris Barker
now we have devA's environment, devB's environment, and a third merged
version (which could be broken with merge conflicts...)
It's one project though? What would you like to have happen in this
situation, ignoring the implementation details I proposed?
I would like to have one source of "truth" for the deploy environment that
requires a conscious step to update.

Post by Chris Barker
I'm not sure I've arrived at the "best" way to do it, but I currently use
a conda_requirements.txt for development, that specifies only the
top-level packages and not all pinned down (i.e. >=)
Apologies for the pedantry, but it might be important: "pinned down" would
be ==,
I wasn't clear -- ">=" was an example of "not pinned down"
Post by Chris Barker
I'd hope you use your environment.yaml for CI testing before a deploy?
exactly -- the environment.yaml is used for the CI and deployment. and CI
runs on every push to the central repo.

Post by Chris Barker
And having the developers have a bit more flexibility helps us keep
dependencies up to date and catch the bugs that that introduces.
Interesting! I've found having all devs working off exactly the same
packages reduces bugs and confusion, but I would observe that it means we
tend to stick on older versions of packages for longer.
we get more bugs -- but find them earlier :-)


"picky lock" would be my proposal.
I'll keep an eye on that -- may be helpful for us too.

-CHB
Post by Chris Barker
Hmm -- we have actually put in some kludgy run time check code for
versions of our in-house-under development deps.
Maybe we should do the same for all deps, pulling from a single
requirements file.
I think that's what I'm proposing, would you be interested in trying picky
once I've coded it up?
Post by Chris Barker
"picky lock" would take options in an environment.yaml section
(assuming conda ignores top-level keys it doesn't understand), and
use them to massage the output of "conda env export" into an
environment.lock.yaml that could be used to conda env create/update
for reproducible builds.
"picky check" would be the same as lock but would just whine if the
environment.lock.yaml it generates doesn't match the one on disk.
I'm still confused about what a "lock" is in this context, but I think
you're going in the right direction.
Your "conda_requirements.txt" is my "environment.yaml"
Your "environment.yaml" is my "environment.lock.yaml".
Post by Chris Barker
I've always wanted to do some kind of run-time checking -- run your test
code, and see what's in sys.modules -- those are your deps. Mapping that to
pip or conda packages is a different story, however...
Indeed!
cheers,
Chris
Chris Withers
2018-04-09 14:57:39 UTC
conda-env was created in the virtualenv days. Anaconda Project is much
closer to pipenv.
First I've heard of it, and it looks like exactly what I'm after;
related questions:

- where do I install anaconda-project? Root environment or project
environment?

- does anaconda-project have a command to verify that my lock file is
correct? (ie: the current environment matches the lock file; essential
in CI to make sure the lock file *is* actually as it should be...)

- how does anaconda-project fare when it comes to local projects? (ie:
those installed with pip install -e .)
They follow similar patterns (Anaconda Project came
first).
I suspect npm and cargo (from rust) may have come even earlier ;-)

  Right now we have at least three ways of managing environment
specifications: conda's own 'conda create' and 'conda list --export',
conda-env's environment.yml file, and anaconda-project's methods.  It's
on our roadmap to clean this mess up.
Good.

cheers,

Chris
Chris Withers
2018-04-09 16:28:09 UTC
Post by Chris Withers
- where do I install anaconda-project? Root environment or project
environment?
- does anaconda-project have a command to verify that my lock file
is correct? (ie: the current environment matches the lock file;
essential in CI to make sure the lock file *is* actually as it
should be...)
- how does anaconda-project fare when it comes to local projects?
(ie: those installed with pip install -e .)
Docs for anaconda-project are at
http://anaconda-project.readthedocs.io/en/latest/.
I'm guessing the answer to the former is "root environment", can you
confirm?

I'm guessing the answer for the second is "no", but how can I get
something equivalent for CI?

cheers,

Chris
Chris Withers
2018-04-10 05:55:56 UTC
Post by Chris Withers
- where do I install anaconda-project? Root environment or project
environment?
- does anaconda-project have a command to verify that my lock file
is correct? (ie: the current environment matches the lock file;
essential in CI to make sure the lock file *is* actually as it
should be...)
- how does anaconda-project fare when it comes to local projects?
(ie: those installed with pip install -e .)
Docs for anaconda-project are at
http://anaconda-project.readthedocs.io/en/latest/.  At this point, it is
what it is.  I don't anticipate further development work on it.  Further
development will be toward the conda unification project, as I described.
Okay, given it a spin and it's sadly lacking some stuff I really need:

- no way to add packages using pip. anaconda-project update/prepare
appear to ignore pip dependencies, certainly when a project has already
been prepared at least once.

- as I said previously, no way to check whether the current lock file
and env(s) are in sync.

- no way to control where the conda envs are created. This is more of a
nice-to-have but wanted to mention it.

That said, this is the closest I've seen to what I need (albeit with a
tonne of stuff I have zero interest in: download urls, env vars, upload,
archive, etc) and would seem to be a much better place to start from
than scratch.

If I do development on the project, are PRs likely to be accepted and
released in a timely fashion or should I just fork and do my own thing?

cheers,

Chris
Chris Withers
2018-04-20 21:15:28 UTC
I'll do a proper release announcement at some point, but here's my take:

http://picky-conda.readthedocs.io/en/latest/

Let me know your thoughts, any bugs, or typos in docs...

Chris
It doesn’t have an active maintainer, so I’d say fork and do your own thing for now. When we get around to incorporating the ideas into conda, there’ll be plenty of public design work, so hopefully we cover the majority of use cases.