PyFastANI#

Cython bindings and Python interface to FastANI, a method for fast whole-genome similarity estimation.

Overview#

FastANI is a method published in 2018 by Jain et al. for high-throughput computation of whole-genome Average Nucleotide Identity (ANI). It uses MashMap to compute orthologous mappings without the need for expensive alignments.

pyfastani is a Python module, implemented using the Cython language, that provides bindings to FastANI. It directly interacts with the FastANI internals, which has the following advantages over CLI wrappers:

Batteries-included

Just add pyfastani as a pip or conda dependency; there is no need for the FastANI binary or any other external dependency.

Easy compilation

Libraries that FastANI needs for threading or I/O are provided as stubs, and Boost::math headers are vendored, so the package builds without any system dependency.

Sans I/O

Everything happens in memory, making it easier to pass your sequences to FastANI without needing to write them to a temporary file.

Multi-threaded

Genome queries resolve the fragment mapping step in parallel, leading to shorter query times even with a single genome.

Portable

Get SIMD acceleration on any supported platform without having to build the package from scratch.

Introspectable

The genome sketches can be accessed from the Python API, allowing you to view the minimizers for a genome database.
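The in-memory workflow described above can be sketched as follows. This is a minimal illustration rather than a reference implementation: it assumes the `Sketch.add_genome`, `Sketch.index` and `Mapper.query_genome` methods and the `Hit` attributes shown in the pyfastani documentation. The helpers `random_genome`, `mutate` and `estimate_ani` are hypothetical names introduced here only to keep the example self-contained, and the random sequences stand in for real genomes.

```python
import random


def random_genome(length, seed):
    """Build a random DNA sequence as bytes (a stand-in for a real genome)."""
    rng = random.Random(seed)
    return bytes(rng.choice(b"ACGT") for _ in range(length))


def mutate(sequence, rate, seed):
    """Redraw each base with probability `rate`.

    A redraw may pick the same base again, so the effective divergence
    is somewhat lower than `rate`.
    """
    rng = random.Random(seed)
    mutated = bytearray(sequence)
    for i in range(len(mutated)):
        if rng.random() < rate:
            mutated[i] = rng.choice(b"ACGT")
    return bytes(mutated)


def estimate_ani(references, query):
    """Query one in-memory genome against a dict of in-memory references.

    `references` maps a name to a sequence (bytes); `query` is a sequence.
    Returns (name, identity, matches, fragments) tuples, one per hit.
    """
    # Imported here so the helpers above remain usable without the
    # compiled extension; requires `pip install pyfastani`.
    import pyfastani

    sketch = pyfastani.Sketch()
    for name, sequence in references.items():
        # No temporary file: the sequence is passed directly from memory.
        sketch.add_genome(name, sequence)

    # Index the sketch to obtain a mapper, then map the query against it.
    mapper = sketch.index()
    hits = mapper.query_genome(query)
    return [(hit.name, hit.identity, hit.matches, hit.fragments) for hit in hits]
```

Calling `estimate_ani({"reference": ref}, mutate(ref, 0.05, seed=2))` with `ref = random_genome(200_000, seed=1)` should return a list of hits, which may be empty if the sequences are too short or too divergent for FastANI to map any fragment. With real data, the sequences would typically come from a FASTA parser instead of `random_genome`.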

Setup#

PyFastANI is available for all modern Python versions (3.7+).

Run pip install pyfastani in a shell to download the latest release from PyPI, or have a look at the Installation page to find other ways to install pyfastani.

Library#

License#

This library is provided under the MIT License.

The fastANI code was written by Chirag Jain and is distributed under the terms of the Apache License 2.0, unless otherwise specified in vendored sources. The cpu_features code was written by Guillaume Chatelet and is distributed under the terms of the Apache License 2.0. The Boost::math headers were written by Boost Libraries contributors and are distributed under the terms of the Boost Software License. See the Copyright page for more information.

This project is in no way affiliated with, sponsored by, or otherwise endorsed by the original fastANI authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.