Installation
The aindo.rdml package is a Python package and is available for Linux and for a few versions of Python.
It is recommended to have the venv module available, to install the package in an isolated virtual environment.
The name of the provided wheel file has the following structure:
aindo_rdml-{version}-cp{py_version}-cp{py_version}-linux_x86_64.whl
where {version} is the package version (e.g. 3.0.0) and {py_version} is the Python version (e.g. 310 for Python 3.10).
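The naming scheme above can be checked programmatically before installing. The following is a minimal sketch and not part of aindo.rdml: the helper names parse_wheel_name and current_py_tag are hypothetical, and the regular expression only encodes the structure described above.

```python
import re
import sys

# Pattern encoding the documented structure of the wheel file name.
WHEEL_PATTERN = re.compile(
    r"^aindo_rdml-(?P<version>[^-]+)-cp(?P<py>\d+)-cp(?P<abi>\d+)-linux_x86_64\.whl$"
)


def parse_wheel_name(filename: str) -> tuple[str, str]:
    """Return (package version, Python version tag) from a wheel file name."""
    match = WHEEL_PATTERN.match(filename)
    # Both cp{py_version} fields are expected to carry the same tag.
    if match is None or match["py"] != match["abi"]:
        raise ValueError(f"unexpected wheel file name: {filename!r}")
    return match["version"], match["py"]


def current_py_tag() -> str:
    """Tag of the running interpreter, e.g. '310' for Python 3.10."""
    return f"{sys.version_info.major}{sys.version_info.minor}"


version, py_tag = parse_wheel_name("aindo_rdml-3.0.0-cp310-cp310-linux_x86_64.whl")
print(version, py_tag)  # 3.0.0 310
```

Comparing the parsed tag with current_py_tag() shows whether a given wheel matches the interpreter that is going to run it.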
Once the wheel is located, open a terminal and move to the folder where the virtual environment should be created.
From there, create the virtual environment with the venv module, activate it, and install the wheel (for example with pip).
Note that the command python is commonly used to create the virtual environment, but the user should specify the command that launches the required Python version (e.g. python3 for Python 3.10 in some Linux distributions).
Once the virtual environment is activated, the command python will invoke the correct Python version.
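The installation steps can also be scripted from Python. The sketch below is an illustration under assumptions: it uses the standard library venv module and runs pip through the interpreter of the new environment, so no activation is needed. The function names and the paths passed to them are hypothetical. Since venv.create builds the environment for the interpreter that runs the script, the script should be launched with the required Python version.

```python
import subprocess
import venv
from pathlib import Path


def env_python(env_dir: Path) -> Path:
    """Path of the Python interpreter inside a Linux virtual environment."""
    return env_dir / "bin" / "python"


def pip_install_command(env_dir: Path, wheel: Path) -> list[str]:
    """Command that installs the wheel with the environment's own pip."""
    return [str(env_python(env_dir)), "-m", "pip", "install", str(wheel)]


def create_env_and_install(env_dir: Path, wheel: Path) -> None:
    """Create the virtual environment, then install the wheel into it."""
    # with_pip=True bootstraps pip inside the new environment.
    venv.create(env_dir, with_pip=True)
    # check=True raises CalledProcessError if the installation fails.
    subprocess.run(pip_install_command(env_dir, wheel), check=True)
```

A call such as create_env_and_install(Path("venv"), Path("path/to/wheel.whl")), with the placeholder replaced by the real wheel location, would perform both steps.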
Once the installation is complete, the package aindo.rdml will be available in the virtual environment and can be imported from Python.
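A quick way to confirm that the package is visible to the active interpreter is to look it up with the standard library. This is a minimal sketch; the helper name is_importable is hypothetical.

```python
import importlib.util


def is_importable(module_name: str) -> bool:
    """Return True if the module can be located by the current interpreter."""
    try:
        # For dotted names, find_spec imports the parent package first.
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (e.g. "aindo") is itself missing.
        return False


# Expected to print True only inside an environment where the wheel is installed.
print(is_importable("aindo.rdml"))
```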
Optional dependencies
The procedure discussed above will install the library and its main dependencies. There are also a few optional dependencies that can be installed. They can be specified by adding them to the installation command, for example as extras in square brackets after the wheel path.
The available optional dependencies are:
- db: Allow for DB reading/writing (the supported dialects are the ones supported by sqlalchemy, but the user may have to install the appropriate DBAPI driver).
- excel: Allow for reading from Excel files.
- dev: Includes a few packages that may be useful to track the progression of the model training (e.g. tensorboard).
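The sketch below builds the argument that would be passed to the installation command, assuming the optional dependencies are requested with the usual pip extras syntax (a comma-separated list in square brackets appended to the wheel path). The helper name wheel_with_extras is hypothetical, and the set of accepted names is taken from the list above.

```python
KNOWN_EXTRAS = ("db", "excel", "dev")


def wheel_with_extras(wheel_path: str, extras: list[str]) -> str:
    """Append the requested optional dependencies to the wheel path."""
    unknown = sorted(set(extras) - set(KNOWN_EXTRAS))
    if unknown:
        raise ValueError(f"unknown optional dependencies: {unknown}")
    if not extras:
        return wheel_path
    # Assumed syntax: path/to/wheel.whl[extra1,extra2]
    return f"{wheel_path}[{','.join(extras)}]"


print(wheel_with_extras("aindo_rdml-3.0.0-cp310-cp310-linux_x86_64.whl", ["db", "excel"]))
# aindo_rdml-3.0.0-cp310-cp310-linux_x86_64.whl[db,excel]
```

When typing the resulting string in a shell, it is safer to quote it, since square brackets are glob characters in most shells.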
License
Along with the wheel, a license key will also be provided.
The aindo.rdml library will search for the license key in three possible locations:
- The AINDO_LICENSE environment variable.
- The .aindo file in the current working directory.
- The .aindo file in the user’s home directory (~).
The user can place the key in any of these locations.
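To make the three locations concrete, here is a minimal sketch of such a lookup. It is not the library's implementation: the function find_license_key is hypothetical, and it assumes that the locations are checked in the order listed and that the .aindo file contains only the key as plain text.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def find_license_key(
    env: Mapping[str, str] = os.environ,
    cwd: Optional[Path] = None,
    home: Optional[Path] = None,
) -> Optional[str]:
    """Return the first license key found, or None if there is none."""
    # 1. The AINDO_LICENSE environment variable.
    key = env.get("AINDO_LICENSE")
    if key:
        return key.strip()
    # 2. The .aindo file in the working directory, 3. then in the home directory.
    folders = (
        cwd if cwd is not None else Path.cwd(),
        home if home is not None else Path.home(),
    )
    for folder in folders:
        candidate = folder / ".aindo"
        if candidate.is_file():
            # Assumption: the file holds the key as plain text.
            return candidate.read_text().strip()
    return None
```

A helper like this can be useful to check which location a key would come from when several of them are populated.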
Alternatively, the license can be activated manually at runtime using the activate() function, by providing the license key directly as a string.
Regardless of the method used to provide the license, its activation and validity can be verified with the assert_active() function.
Requirements
As mentioned earlier, installing the aindo.rdml library requires a machine running a Linux OS and a local installation of Python.
It is recommended to set up the package in an isolated environment, such as a Python virtual environment, a Conda environment, or a Docker container.
There are no strict hardware requirements for the machines on which aindo.rdml can be installed.
However, since the training process is resource-intensive, we recommend using a sufficiently powerful machine.
The amount of computational power required largely depends on the datasets used for training.
One key consideration is that the total RAM of the machine should be sufficient to load the entire dataset for preprocessing.
As a general guideline, for CPU-based training, we suggest using a machine with at least 8 cores and 32 GB of RAM, and with a CPU that supports AVX2 and AVX-512 instructions. Using a GPU can significantly accelerate model training, especially for larger models, which often require more resources. Training can also be parallelized across multiple GPUs, if available. When training with one or more GPUs, it’s important to choose cards with adequate memory, depending on the dataset and batch size.
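The CPU guideline above can be compared against a Linux machine with a few standard library calls. This is a rough sketch under assumptions: the function names are hypothetical, CPU features are read from the flags lines of /proc/cpuinfo, and any flag starting with avx512 is taken as evidence of AVX-512 support.

```python
import os


def parse_cpu_flags(cpuinfo_text: str) -> set[str]:
    """Collect the CPU feature flags listed in /proc/cpuinfo-style text."""
    flags: set[str] = set()
    for line in cpuinfo_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() == "flags":
            flags.update(value.split())
    return flags


def meets_cpu_guideline(cores: int, ram_gb: float, flags: set[str]) -> bool:
    """Check the suggested minimum: 8 cores, 32 GB RAM, AVX2 and AVX-512."""
    has_avx2 = "avx2" in flags
    # AVX-512 is reported as a family of flags (avx512f, avx512bw, ...).
    has_avx512 = any(flag.startswith("avx512") for flag in flags)
    return cores >= 8 and ram_gb >= 32 and has_avx2 and has_avx512


def local_machine() -> tuple[int, float, set[str]]:
    """Read CPU count, total RAM (GiB) and CPU flags on a Linux machine."""
    cores = os.cpu_count() or 0
    ram_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    with open("/proc/cpuinfo") as handle:
        flags = parse_cpu_flags(handle.read())
    return cores, ram_bytes / 2**30, flags
```

Calling meets_cpu_guideline(*local_machine()) gives a first indication only: os.cpu_count() counts logical CPUs rather than physical cores, and the RAM reported by the system is usually slightly below the nominal size, so a small tolerance on the 32 GB threshold may be appropriate.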
In the benchmark section we present the resources consumed by standard training on several datasets and with several model sizes on a specific CPU machine and a specific GPU machine. These examples can help guide the selection of an appropriate machine for your task.