During my early years of using Python as someone without a computer science background, I consistently struggled with managing environments for my projects. A common mistake I made was using a single base environment for all tasks, which inevitably led to dependency conflicts that forced me to repeatedly remove and reinstall packages. Even today, I observe many junior researchers in my field facing similar challenges with Python environment management.
In this blog post, I will share my personal workflow for efficiently handling multiple projects and environments using Anaconda.
"Anaconda is a distribution of Python and R programming languages for scientific computing that simplifies package management and deployment. It allows users to create isolated environments with specific package versions, making it easy to maintain different project requirements without conflicts. With Anaconda, you can seamlessly switch between environments, ensuring reproducibility and stability across your data science and research projects."
Installation
To get started, download Anaconda for your operating system from the official website. The installer is available for Windows, macOS, and Linux.
While my primary development environment is Linux, the core concepts of Anaconda apply across all platforms with minor differences in command execution:
- Windows users: You can use the graphical installer, or for a more Linux-like experience, consider using Windows Subsystem for Linux (WSL).
- macOS users: The installation process is straightforward with the graphical installer, and terminal commands are similar to Linux.
- Linux users: Download the appropriate installer for your distribution and follow the installation instructions.
After installation, verify that Anaconda is properly set up by opening a terminal or command prompt and running:
conda --version
You should see the version number displayed, confirming a successful installation.
Create Environment
The real power of Anaconda lies in environments. Create your first one with:
conda create --name myenv python=3.10
Here, "myenv" is just a placeholder - feel free to name your environment something meaningful! I typically name mine after specific projects (like "DAC" or "thematic") or sometimes by framework (like "torch-projects" for all my PyTorch work).
The Python version specification (python=3.10) is optional, but leaving it out does not do what you might expect. If you pass no package specs at all, conda creates an empty environment with no Python interpreter of its own; to get the latest version available from your channels, pass python without a version number. I've found that explicitly setting the version helps ensure consistency, especially when collaborating with others or when certain packages require specific Python versions.
Tip: For research projects that might be published, I strongly recommend documenting which Python version you used to ensure reproducibility down the road!
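One lightweight way to act on this tip is to record the interpreter details from inside the activated environment. The sketch below is only an illustration: describe_interpreter is a hypothetical helper name (not part of conda or any library), and it uses nothing beyond the standard library.

```python
import platform
import sys


def describe_interpreter() -> dict:
    """Return basic facts about the Python interpreter running this script."""
    return {
        "python_version": platform.python_version(),        # e.g. "3.10.x"
        "implementation": platform.python_implementation(),  # e.g. "CPython"
        "executable": sys.executable,                        # path to the interpreter
        "platform": platform.platform(),                     # OS and architecture string
    }


if __name__ == "__main__":
    # Print one "key: value" line per entry, ready to paste into a README.
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Running it with the environment activated and pasting the output into the project's README or methods notes is usually enough for a reader to recreate the same Python version later.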
Activate Environment
After creating your environment, activate it with:
conda activate myenv
You can also switch between environments with the same command; just replace myenv with the name of the environment you want to switch to.
Package Installation
The next essential step is to make sure pip, the standard package installer for Python, is installed inside the environment:
conda install pip
While conda is also excellent for installing packages with complex dependencies (especially scientific libraries with C extensions), pip gives you access to the much larger PyPI repository with over 400,000 packages. Many cutting-edge or specialized Python packages are available on PyPI before they're packaged for conda.
Now you can use pip to install the packages you want (make sure your target environment is already activated):
pip install pandas
Cleaning Up: Environment Removal
As your projects evolve, you might accumulate environments you no longer need. Removing unused environments helps keep your system organized and saves disk space.
To remove an environment, first ensure you're not currently using it:
conda deactivate
Then remove the environment:
conda remove --name myenv --all
The --all flag ensures that all packages in the environment are removed along with the environment itself.
You can also list all your environments to identify candidates for cleanup:
conda env list
I recommend periodically reviewing your environments and removing those that haven't been used in several months.
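To make that periodic review a little easier, the sketch below lists each environment together with the date of its last package change. It rests on two assumptions that are worth checking on your own setup: that conda is on your PATH, and that each environment keeps a conda-meta/history file whose modification time reflects the last install or removal (this is not the same as the last time you used the environment). The function names are hypothetical helpers, not conda APIs.

```python
import json
import os
import shutil
import subprocess
from datetime import datetime
from typing import List, Optional


def list_env_paths() -> List[str]:
    """Return environment paths from `conda env list --json`, or [] if unavailable."""
    conda = shutil.which("conda")
    if conda is None:
        return []
    result = subprocess.run(
        [conda, "env", "list", "--json"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    try:
        return list(json.loads(result.stdout).get("envs", []))
    except (json.JSONDecodeError, AttributeError):
        return []


def last_package_change(env_path: str) -> Optional[datetime]:
    """Modification time of conda-meta/history (assumed proxy for the last change)."""
    history = os.path.join(env_path, "conda-meta", "history")
    if not os.path.exists(history):
        return None
    return datetime.fromtimestamp(os.path.getmtime(history))


if __name__ == "__main__":
    for path in list_env_paths():
        changed = last_package_change(path)
        label = changed.strftime("%Y-%m-%d") if changed else "unknown"
        print(f"{label}  {path}")
```

Environments whose date is many months old are reasonable candidates to inspect before removing them.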
Active Environment Awareness
One of the most common pitfalls in environment management is installing packages into the wrong environment. To avoid this, always be aware of which environment is currently active. In most terminal setups, your active environment appears in parentheses at the beginning of your command prompt:
(myenv) username@computer:~$
You can also explicitly check your current environment:
conda info --envs
The active environment will be marked with an asterisk (*).
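The same check can be made from inside Python, which is handy at the top of a script or notebook. This is a minimal sketch: CONDA_DEFAULT_ENV and CONDA_PREFIX are environment variables set by conda activate, while the two function names are hypothetical helpers introduced here for illustration.

```python
import os
import sys
from typing import Optional


def active_environment() -> dict:
    """Collect the values that identify the environment this interpreter runs in."""
    return {
        # Both variables are set by `conda activate`; None if no env is active.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        "conda_prefix": os.environ.get("CONDA_PREFIX"),
        "interpreter": sys.executable,
        "prefix": sys.prefix,
    }


def interpreter_in_active_env() -> Optional[bool]:
    """True if this interpreter lives in the active conda env, None if no env is active."""
    conda_prefix = os.environ.get("CONDA_PREFIX")
    if not conda_prefix:
        return None

    def norm(path: str) -> str:
        # Resolve symlinks and normalize case so equivalent paths compare equal.
        return os.path.normcase(os.path.realpath(path))

    return norm(sys.prefix) == norm(conda_prefix)


if __name__ == "__main__":
    for key, value in active_environment().items():
        print(f"{key}: {value}")
    print("interpreter matches active env:", interpreter_in_active_env())
```

If the last line prints False, the script is running under a different Python than the environment shown in your prompt, which is usually the cause of packages landing in the wrong place.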
Reproducibility for Your Projects
The environments we've discussed let you work effectively on your local machine, but ensuring reproducibility becomes critical when collaborating with others. The most straightforward way to capture your dependencies is using pipreqs, which analyzes your code and generates a minimal requirements file:
pip install pipreqs
pipreqs /path/to/your/project/
Remember to activate the right environment first (conda activate myenv) if it is not already active. Unlike pip freeze, which captures everything in your environment (including dependencies of dependencies and packages your project does not use), pipreqs identifies only the packages your code actually imports, creating a cleaner, more portable requirements file.
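To give a feel for what "identifying only the packages your code imports" involves, here is a minimal sketch of import scanning with Python's ast module. It illustrates the general idea only and is not pipreqs's actual implementation; top_level_imports is a hypothetical helper name.

```python
import ast
from typing import Set


def top_level_imports(source: str) -> Set[str]:
    """Collect the top-level module names imported by a piece of Python source."""
    modules: Set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b.c" depends on the top-level package "a".
                modules.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            # level > 0 marks a relative import (from . import x): local code, skip it.
            if node.level == 0 and node.module:
                modules.add(node.module.split(".")[0])
    return modules


example = """
import os
import numpy as np
from pandas import DataFrame
from sklearn.model_selection import train_test_split
from . import helpers
"""

print(sorted(top_level_imports(example)))
# prints ['numpy', 'os', 'pandas', 'sklearn']
```

A real tool still has more work to do after this step: standard-library modules such as os have to be filtered out, and import names have to be mapped to PyPI distribution names (sklearn is installed as scikit-learn), so it is worth reading the generated requirements file before sharing it.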
Environment Set Up from Requirements
For a more advanced approach, consider uv, an extremely fast Python package installer and resolver written in Rust. Its documentation reports installs 10-100x faster than pip, and its uv pip interface works with existing requirements files and Python workflows.
"uv is a high-performance replacement for pip and virtualenv that installs packages significantly faster while maintaining compatibility with existing Python workflows."
After creating a new environment with conda and installing pip, run the following commands to set up the project from requirements.txt:
pip install uv
uv pip install -r requirements.txt
The uv tool can work directly with your pipreqs-generated files.
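After installing, it can be reassuring to confirm that the environment really matches the requirements file. The sketch below is one hedged way to do that with the standard library's importlib.metadata. It assumes simple name==version lines (the kind pipreqs writes by default), compares version strings literally, and uses a hypothetical helper name, check_pins.

```python
from importlib import metadata
from typing import Dict


def check_pins(requirements_text: str) -> Dict[str, str]:
    """Map each name==version pin to 'ok', 'missing', or a mismatch note."""
    report: Dict[str, str] = {}
    for raw in requirements_text.splitlines():
        # Drop comments and environment markers, then surrounding whitespace.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if "==" not in line:
            continue  # this sketch only understands exact pins
        name, wanted = (part.strip() for part in line.split("==", 1))
        if not name:
            continue
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        # Literal string comparison; no PEP 440 version normalization is attempted.
        report[name] = "ok" if installed == wanted else f"mismatch (installed {installed})"
    return report


if __name__ == "__main__":
    try:
        with open("requirements.txt", encoding="utf-8") as handle:
            for package, status in check_pins(handle.read()).items():
                print(f"{package}: {status}")
    except FileNotFoundError:
        print("requirements.txt not found in the current directory")
```

Run it from the project directory with the target environment activated; any line that is not "ok" points to a package worth reinstalling or re-pinning.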
Key Takeaways
- Isolated Environments: Anaconda solves dependency conflicts by creating isolated environments for different projects, preventing package version conflicts
- Version Control: Explicitly specify Python versions (e.g., python=3.10) when creating environments to ensure consistency and reproducibility
- Installing Approach: Try to use pip for accessing the broader PyPI ecosystem
- Smart Requirements: Generate minimal, project-specific requirements files with pipreqs instead of capturing everything with pip freeze
- Performance Boost: Consider using modern tools like uv to dramatically speed up package installation while maintaining compatibility
- Documentation Habit: Always document your environment setup for future reference and collaboration