numpy, pandas, jupyter and scikit-learn. More information about Conda can be found in its documentation, while Anaconda has its own homepage too.
pyproject.toml and poetry.lock files make it similar to the way the Node Package Manager (npm) for Node.js works. More information about Poetry can be found in its documentation.
conda list in the base environment, we might see something like this in the terminal:
(base) user:~$ conda list
# packages in environment at /home/user/anaconda3:
#
# Name Version Build Channel
_ipyw_jlab_nb_ext_conf 0.1.0 py39h06a4308_0
_libgcc_mutex 0.1 main
_openmp_mutex 4.5 1_gnu
alabaster 0.7.12 pyhd3eb1b0_0
...
jupyter 1.0.0 py39h06a4308_7
...
numpy 1.20.3 py39hf144106_0
...
pandas 1.3.4 py39h8c16a72_0
...
scikit-learn 0.24.2 py39ha9443f7_0
...
conda install -c conda-forge numpy==1.22.3, we would see the following in the terminal:
The following NEW packages will be INSTALLED:
_libgcc_mutex conda-forge/linux-64::_libgcc_mutex-0.1-conda_forge
_openmp_mutex conda-forge/linux-64::_openmp_mutex-4.5-1_gnu
bzip2 conda-forge/linux-64::bzip2-1.0.8-h7f98852_4
ca-certificates conda-forge/linux-64::ca-certificates-2021.10.8-ha878542_0
ld_impl_linux-64 conda-forge/linux-64::ld_impl_linux-64-2.36.1-hea4e1c9_2
libblas conda-forge/linux-64::libblas-3.9.0-14_linux64_openblas
libcblas conda-forge/linux-64::libcblas-3.9.0-14_linux64_openblas
libffi conda-forge/linux-64::libffi-3.4.2-h7f98852_5
libgcc-ng conda-forge/linux-64::libgcc-ng-11.2.0-h1d223b6_15
libgfortran-ng conda-forge/linux-64::libgfortran-ng-11.2.0-h69a702a_15
libgfortran5 conda-forge/linux-64::libgfortran5-11.2.0-h5c6108e_15
libgomp conda-forge/linux-64::libgomp-11.2.0-h1d223b6_15
liblapack conda-forge/linux-64::liblapack-3.9.0-14_linux64_openblas
libnsl conda-forge/linux-64::libnsl-2.0.0-h7f98852_0
libopenblas conda-forge/linux-64::libopenblas-0.3.20-pthreads_h78a6416_0
libstdcxx-ng conda-forge/linux-64::libstdcxx-ng-11.2.0-he4da1e4_15
libuuid conda-forge/linux-64::libuuid-2.32.1-h7f98852_1000
libzlib conda-forge/linux-64::libzlib-1.2.11-h166bdaf_1014
ncurses conda-forge/linux-64::ncurses-6.3-h27087fc_1
numpy conda-forge/linux-64::numpy-1.22.3-py310h45f3432_2
openssl conda-forge/linux-64::openssl-3.0.2-h166bdaf_1
pip conda-forge/noarch::pip-22.0.4-pyhd8ed1ab_0
python conda-forge/linux-64::python-3.10.4-h2660328_0_cpython
python_abi conda-forge/linux-64::python_abi-3.10-2_cp310
readline conda-forge/linux-64::readline-8.1-h46c0cb4_0
setuptools conda-forge/linux-64::setuptools-62.1.0-py310hff52083_0
sqlite conda-forge/linux-64::sqlite-3.38.2-h4ff8645_0
tk conda-forge/linux-64::tk-8.6.12-h27826a3_0
tzdata conda-forge/noarch::tzdata-2022a-h191b570_0
wheel conda-forge/noarch::wheel-0.37.1-pyhd8ed1ab_0
xz conda-forge/linux-64::xz-5.2.5-h516909a_1
zlib conda-forge/linux-64::zlib-1.2.11-h166bdaf_1014
ca-certificates and openssl are not actually needed by numpy, but let us see what we get when we try to install numpy in a fresh virtual environment using Pip instead. I will also run the pip list command to see what packages get installed together with numpy:
user:~$ pip install numpy==1.22.3
Collecting numpy==1.22.3
Downloading numpy-1.22.3-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.8 MB)
|████████████████████████████████| 16.8 MB 10.2 MB/s
Installing collected packages: numpy
Successfully installed numpy-1.22.3
user:~$ pip list
Package Version
---------- -------
numpy 1.22.3
pip 21.2.2
setuptools 57.4.0
wheel 0.36.2
numpy. This dependency bloat is one big turn-off for me when it comes to dependency management in production code.
numpy via Pip in a fresh Conda virtual environment, I would see something like this:
(test-env) user:~$ conda list
# packages in environment at /home/user/anaconda3/envs/test-env:
#
# Name Version Build Channel
(test-env) user:~$ pip list
Package Version
---------- -------
numpy 1.22.3
pip 21.2.2
setuptools 57.4.0
wheel 0.36.2
numpy does not show up when we run the conda list command, but does show up when we run the pip list command.
pandas via conda-forge, we will see something like this when we run the conda list and pip list commands:
(test-env) user:~$ conda list
# packages in environment at /home/user/anaconda3/envs/test-env:
#
# Name Version Build Channel
...
numpy 1.22.3 pypi_0 pypi
...
pandas 1.4.2 pypi_0 pypi
...
(test-env) user:~$ pip list
Package Version
--------------- -------
numpy 1.22.3
pandas 1.4.2
...
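One way to see where the two listings can diverge is to ask each installed distribution which tool recorded itself as the installer. The sketch below uses only the standard library's importlib.metadata. The helper name installers_by_package is my own, and the assumption that Conda-built packages typically record "conda" in their INSTALLER file while Pip records "pip" should be checked against your own environment.

```python
from importlib import metadata


def installers_by_package():
    """Map each installed distribution name to the tool named in its INSTALLER file.

    Distributions without an INSTALLER file are reported as "unknown".
    """
    result = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name is None:
            # Skip distributions whose metadata is missing or unreadable.
            continue
        installer = dist.read_text("INSTALLER")
        result[name] = installer.strip() if installer else "unknown"
    return result


if __name__ == "__main__":
    for package, installer in sorted(installers_by_package().items()):
        print(f"{package:30} {installer}")
```

Running this inside the environment gives a single view of what is installed, regardless of which tool's list command is used.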
conda list command if some other package using this package was installed via conda-forge. This can be rather confusing for users, as:
conda list to check for our dependencies
numpy in a virtual environment that has pandas installed:
(test-env) user:~$ pip list
Package Version
--------------- -------
numpy 1.22.3
pandas 1.4.2
pip 21.1.1
python-dateutil 2.8.2
pytz 2022.1
setuptools 56.0.0
six 1.16.0
(test-env) user:~$ pip install "numpy<1.18.5"
Collecting numpy<1.18.5
Downloading numpy-1.18.4-cp38-cp38-manylinux1_x86_64.whl (20.7 MB)
|████████████████████████████████| 20.7 MB 10.9 MB/s
Installing collected packages: numpy
Attempting uninstall: numpy
Found existing installation: numpy 1.22.3
Uninstalling numpy-1.22.3:
Successfully uninstalled numpy-1.22.3
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
pandas 1.4.2 requires numpy>=1.18.5; platform_machine != "aarch64" and platform_machine != "arm64" and python_version < "3.10", but you have numpy 1.18.4 which is incompatible.
Successfully installed numpy-1.18.4
(test-env) user:~$ pip list
Package Version
--------------- -------
numpy 1.18.4
pandas 1.4.2
pip 21.1.1
python-dateutil 2.8.2
pytz 2022.1
setuptools 56.0.0
six 1.16.0
numpy to be installed conflicts with the dependency requirements specified by the pandas library, but still goes ahead and installs that version of numpy anyway. This could cause bugs at runtime, which is definitely not what we want.
user:~$ poetry show
numpy 1.22.3 NumPy is the fundamental package for array computing with Python.
pandas 1.4.2 Powerful data structures for data analysis, time series, and statistics
python-dateutil 2.8.2 Extensions to the standard Python datetime module
pytz 2022.1 World timezone definitions, modern and historical
six 1.16.0 Python 2 and 3 compatibility utilities
user:~$ poetry add "numpy<1.18.5"
Updating dependencies
Resolving dependencies... (53.1s)
SolverProblemError
Because pandas (1.4.2) depends on numpy (>=1.18.5)
and no versions of pandas match >1.4.2,<2.0.0, pandas (>=1.4.2,<2.0.0) requires numpy (>=1.18.5).
So, because dependency-manager-test depends on both pandas (^1.4.2) and numpy (<1.18.5), version solving failed.
at ~/.local/share/pypoetry/venv/lib/python3.8/site-packages/poetry/puzzle/solver.py:241 in _solve
237│ packages = result.packages
238│ except OverrideNeeded as e:
239│ return self.solve_in_compatibility_mode(e.overrides, use_latest=use_latest)
240│ except SolveFailure as e:
→ 241│ raise SolverProblemError(e)
242│
243│ results = dict(
244│ depth_first_search(
245│ PackageNode(self._package, packages), aggregate_package_nodes
user:~$ poetry show
numpy 1.22.3 NumPy is the fundamental package for array computing with Python.
pandas 1.4.2 Powerful data structures for data analysis, time series, and statistics
python-dateutil 2.8.2 Extensions to the standard Python datetime module
pytz 2022.1 World timezone definitions, modern and historical
six 1.16.0 Python 2 and 3 compatibility utilities
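The difference between the Pip and Poetry sessions above comes down to whether the constraint numpy>=1.18.5 declared by pandas is checked before anything is installed. Here is a minimal sketch of that check. It is not how either tool is implemented: the helper names are my own, and the version parsing only handles plain dotted numbers such as 1.18.4 (real tools follow the full Python version specification).

```python
import re

# Comparison operators supported by this simplified checker.
OPERATORS = {
    ">=": lambda a, b: a >= b,
    "<=": lambda a, b: a <= b,
    "==": lambda a, b: a == b,
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
}


def parse_version(text):
    """Turn a plain dotted version such as '1.18.4' into a tuple of ints."""
    parts = [int(part) for part in text.split(".")]
    # Drop trailing zeros so that '1.22' and '1.22.0' compare as equal.
    while parts and parts[-1] == 0:
        parts.pop()
    return tuple(parts)


def satisfies(version, constraint):
    """Return True if `version` meets a single constraint such as '>=1.18.5'."""
    match = re.fullmatch(r"\s*(>=|<=|==|>|<)\s*([0-9.]+)\s*", constraint)
    if match is None:
        raise ValueError(f"unsupported constraint: {constraint!r}")
    operator, required = match.groups()
    return OPERATORS[operator](parse_version(version), parse_version(required))


def find_conflicts(installed, requirements):
    """List (package, dependency, constraint, installed_version) violations.

    installed:    {package name: installed version}
    requirements: {package name: {dependency name: constraint}}
    """
    conflicts = []
    for package, dependencies in requirements.items():
        for dependency, constraint in dependencies.items():
            version = installed.get(dependency)
            if version is not None and not satisfies(version, constraint):
                conflicts.append((package, dependency, constraint, version))
    return conflicts


if __name__ == "__main__":
    requirements = {"pandas": {"numpy": ">=1.18.5"}}
    print(find_conflicts({"numpy": "1.18.4", "pandas": "1.4.2"}, requirements))
    print(find_conflicts({"numpy": "1.22.3", "pandas": "1.4.2"}, requirements))
```

A tool that runs a check like this before installing can refuse the request, which is what the Poetry session shows. Running it only afterwards leaves the environment in the conflicting state seen in the Pip session.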
black or isort to reformat our code and make it more readable. Or we might have libraries like pytest that we use for unit testing. Typically these libraries are not used in production, so we do not want them installed in the production runtime.
# requirements.txt
numpy
pandas
# requirements-dev.txt
-r requirements.txt
black
isort
pytest
# Installing only production dependencies
(test-env) user:~$ pip install -r requirements.txt
# Installing both development and production dependencies
(test-env) user:~$ pip install -r requirements-dev.txt
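To make the effect of the -r requirements.txt line concrete, here is a small sketch of how the two files above expand into the final list of packages. It is a simplified model rather than Pip's actual parser: the function name collect_requirements is my own, the files are held in a dictionary instead of on disk, and only plain package lines, # comments and -r includes are handled.

```python
def collect_requirements(name, files, seen=None):
    """Expand a requirements file into a flat list of package lines.

    name:  the requirements file to expand
    files: {file name: file contents}, standing in for files on disk
    """
    seen = set() if seen is None else seen
    if name in seen:
        # Guard against files that include each other.
        return []
    seen.add(name)

    packages = []
    for raw_line in files[name].splitlines():
        # Simplification: treat everything after '#' as a comment.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        if line.startswith("-r "):
            included = line[3:].strip()
            packages.extend(collect_requirements(included, files, seen))
        else:
            packages.append(line)
    return packages


FILES = {
    "requirements.txt": "numpy\npandas\n",
    "requirements-dev.txt": "-r requirements.txt\nblack\nisort\npytest\n",
}

if __name__ == "__main__":
    print(collect_requirements("requirements.txt", FILES))
    print(collect_requirements("requirements-dev.txt", FILES))
```

The production file expands to numpy and pandas only, while the development file pulls in those two first and then adds black, isort and pytest.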
pyproject.toml file:
...
[tool.poetry.dependencies]
numpy = "^1.22.3"
pandas = "^1.4.2"
[tool.poetry.dev-dependencies]
black = "^21.7b0"
isort = "^5.9.3"
pytest = "^6.0"
...
pyproject.toml that the production libraries we want are numpy and pandas, and the ones for development only are black, isort and pytest. We could also use the following commands to install the libraries:
# Installing only production dependencies
poetry install --no-dev
# Installing both development and production dependencies
poetry install
poetry add command, as shown:
# Install production dependency
poetry add numpy
# Install development dependency
poetry add pytest --dev
pandas via Pip using the requirements.txt file shown below (using the command pip install -r requirements.txt):
pandas==1.4.2
pip install command, you notice that version 1.20.0 of numpy was installed together with pandas. However, when you run the pip install command again half a year later, you find that the version of numpy installed has changed to 1.22.3, even though you are using the same requirements.txt file. This could potentially cause dependency conflicts if your project contains other dependencies that use numpy too (e.g. scikit-learn, tensorflow).
pip freeze > requirements.txt to persist the metadata of installed dependencies (i.e. package names and version numbers) to the requirements.txt file, but this can get rather tedious as we start to use more dependencies for our projects. Also, since Pip does not handle dependency conflicts that well (as mentioned earlier), we might end up persisting dependencies that conflict with one another.
poetry.lock file, which basically stores only the metadata of dependencies that do not have conflicts with one another. A poetry.lock file looks something like this:
[[package]]
name = "numpy"
version = "1.22.3"
description = "NumPy is the fundamental package for array computing with Python."
category = "main"
optional = false
python-versions = ">=3.8"
[[package]]
name = "pandas"
version = "1.4.2"
description = "Powerful data structures for data analysis, time series, and statistics"
category = "main"
optional = false
python-versions = ">=3.8"
[package.dependencies]
numpy = [
    {version = ">=1.18.5", markers = "platform_machine != \"aarch64\" and platform_machine != \"arm64\" and python_version < \"3.10\""},
    {version = ">=1.19.2", markers = "platform_machine == \"aarch64\" and python_version < \"3.10\""},
    {version = ">=1.20.0", markers = "platform_machine == \"arm64\" and python_version < \"3.10\""},
    {version = ">=1.21.0", markers = "python_version >= \"3.10\""},
]
python-dateutil = ">=2.8.1"
pytz = ">=2020.1"
[package.extras]
test = ["hypothesis (>=5.5.3)", "pytest (>=6.0)", "pytest-xdist (>=1.31)"]
...
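To illustrate why a lock file matters, the sketch below models the numpy drift described earlier. With only a minimum version to go on, an installer picks the newest release available at install time, so the result changes as new releases appear. A recorded (locked) version takes that choice away. This is a toy model under my own assumptions: the function names and the two index snapshots are hypothetical, and versions are compared as plain dotted numbers.

```python
def version_key(version):
    """Sort key for plain dotted versions such as '1.22.3'."""
    return tuple(int(part) for part in version.split("."))


def resolve_latest(available, minimum):
    """Pick the newest available version that is at least `minimum`."""
    candidates = [v for v in available if version_key(v) >= version_key(minimum)]
    if not candidates:
        raise LookupError(f"no version satisfies >={minimum}")
    return max(candidates, key=version_key)


def install(available, minimum, locked=None):
    """Return the version that would be installed.

    Without a lock, the newest matching release wins. With a lock, the
    recorded version is used as long as the index still offers it.
    """
    if locked is not None:
        if locked not in available:
            raise LookupError(f"locked version {locked} is not available")
        return locked
    return resolve_latest(available, minimum)


# Hypothetical snapshots of the releases an index offers at two points in time.
EARLIER_INDEX = ["1.18.5", "1.19.5", "1.20.0"]
LATER_INDEX = EARLIER_INDEX + ["1.21.0", "1.22.3"]

if __name__ == "__main__":
    first = install(EARLIER_INDEX, "1.18.5")
    print("first install:", first)
    print("later, no lock:", install(LATER_INDEX, "1.18.5"))
    print("later, with lock:", install(LATER_INDEX, "1.18.5", locked=first))
```

The second call drifts to the newer release even though the requirement never changed, while the third call reproduces the first install.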
poetry.lock file is created automatically when we run poetry install for the first time. This file is also updated automatically whenever we run poetry add to install new dependencies, poetry update to update dependency versions, or poetry lock to check for conflicts in the dependencies listed in pyproject.toml. With the poetry.lock file, we can be sure that we are always installing the same versions of libraries whenever we run the poetry install command.
--index-url or --extra-index-url option of the pip install command to specify the URL of the private repository. The command would look something like this:
pip install --index-url url-of-private-repo library-name
requirements.txt file, and we will see why in a moment.
pip freeze command to persist the installed dependencies, we would see something like this in the requirements.txt file:
library-1==0.1.0
library-2==1.1.0
...
library-1 only exists in a private repository, we would get an error like this if we tried to install the libraries without specifying the --index-url option:
(test-env) user:~$ pip install -r requirements.txt
ERROR: Could not find a version that satisfies the requirement library-1==0.1.0 (from versions: none)
ERROR: No matching distribution found for library-1==0.1.0
--index-url option to the command above to be able to install library-1 from the private repository. Now, to complicate the story further, let us say that version 1.1.0 of library-2 exists in PyPI but not in the private repository (which is entirely possible if the private repository is not kept up to date with the latest releases on PyPI). If we tried to run the command pip install -r requirements.txt --index-url url-of-private-repo, we would get the same error message as seen previously. This happens because Pip searches for and installs libraries only from the private repository when the --index-url option is used, and only from PyPI when it is not. This can be quite troublesome, as we would need to check our private repository to find out which library versions are available there before installing from it.
pyproject.toml file to tell Poetry to search within both PyPI and the private repository.
[[tool.poetry.source]]
name = "name-of-private-repo"
url = "url-of-private-repo"
secondary = true
secondary parameter basically tells Poetry to search for and install libraries from PyPI first, and only go to the private repository if some of the libraries cannot be found in PyPI. This gives us the best of both worlds: we can tap into the wide range of libraries available in PyPI while also getting access to niche libraries that might only be available in a private repository.
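The lookup behaviour described in this section can be sketched as a search over an ordered list of indexes. This is a toy model, not Pip's or Poetry's real resolution logic: the index contents reuse the library-1 and library-2 example from above, and the function names are my own.

```python
# Hypothetical index contents: {package name: set of published versions}.
PYPI = {"library-2": {"1.0.0", "1.1.0"}}
PRIVATE_REPO = {"library-1": {"0.1.0"}, "library-2": {"1.0.0"}}

# The pinned requirements from the requirements.txt example.
REQUIREMENTS = [("library-1", "0.1.0"), ("library-2", "1.1.0")]


def find_source(name, version, indexes):
    """Return the name of the first index offering name==version, else None."""
    for index_name, contents in indexes:
        if version in contents.get(name, set()):
            return index_name
    return None


def locate_all(requirements, indexes):
    """Map each pinned requirement to the index that can supply it."""
    return {
        f"{name}=={version}": find_source(name, version, indexes)
        for name, version in requirements
    }


if __name__ == "__main__":
    # Default behaviour: only PyPI is searched, so library-1 is not found.
    print(locate_all(REQUIREMENTS, [("pypi", PYPI)]))
    # With --index-url: only the private repository is searched,
    # so library-2==1.1.0 is not found.
    print(locate_all(REQUIREMENTS, [("private", PRIVATE_REPO)]))
    # PyPI first, private repository as a fallback: both are found.
    print(locate_all(REQUIREMENTS, [("pypi", PYPI), ("private", PRIVATE_REPO)]))
```

Only the third configuration, which mirrors the secondary source setup, can satisfy both pinned requirements at once.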