PyFastANI

Cython bindings and Python interface to FastANI, a method for fast whole-genome similarity estimation.

Overview

FastANI is a method published in 2018 by Jain et al. for high-throughput computation of whole-genome Average Nucleotide Identity (ANI). It uses MashMap to compute orthologous mappings without the need for expensive alignments.
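
To make the idea of an alignment-free ANI more concrete, here is a toy sketch in pure Python. It is not FastANI's algorithm: it swaps MashMap's winnowed MinHash mapping for an exact k-mer Jaccard index converted to an identity with the Mash distance formula, and it skips the orthology filtering. The function names and the tiny default parameters are made up for illustration (FastANI itself works with much longer fragments and k-mers).

```python
import math


def kmers(seq, k):
    """Return the set of all k-mers of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def mash_identity(a, b, k):
    """Estimate identity from the Jaccard index of two k-mer sets (Mash formula)."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0
    jaccard = len(ka & kb) / len(ka | kb)
    if jaccard == 0.0:
        return 0.0
    # Mash distance is -ln(2j / (1 + j)) / k; identity is one minus that.
    return max(0.0, 1.0 + math.log(2.0 * jaccard / (1.0 + jaccard)) / k)


def toy_ani(query, reference, fragment_length=30, k=8, min_identity=0.8):
    """Average the best per-fragment identity over all mapped query fragments."""
    identities = []
    # Cut the query into non-overlapping fragments.
    for start in range(0, len(query) - fragment_length + 1, fragment_length):
        fragment = query[start:start + fragment_length]
        # Brute-force search for the best reference window (toy stand-in for MashMap).
        best = max(
            (
                mash_identity(fragment, reference[i:i + fragment_length], k)
                for i in range(len(reference) - fragment_length + 1)
            ),
            default=0.0,
        )
        # Only fragments that map well enough contribute to the average.
        if best >= min_identity:
            identities.append(best)
    return sum(identities) / len(identities) if identities else None
```

Identical sequences give an estimate of exactly 1.0, a query with a few substitutions gives a value slightly below 1.0, and unrelated sequences return `None` because no fragment maps.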

pyfastani is a Python module, implemented using the Cython language, that provides bindings to FastANI. It directly interacts with the FastANI internals, which has the following advantages over CLI wrappers:

  • simpler compilation: FastANI requires several additional libraries, which make compilation of the original binary non-trivial. In PyFastANI, libraries that were needed for threading or I/O are provided as stubs, and Boost::math headers are vendored so you can build the package without hassle. Or even better, just install from one of the provided wheels!

  • single dependency: If your software or your analysis pipeline is distributed as a Python package, you can add pyfastani as a dependency to your project, and stop worrying about the FastANI binary being present on the end-user machine.

  • sans I/O: Everything happens in memory, in Python objects you control, making it easier to pass your sequences to FastANI without needing to write them to a temporary file.

  • multi-threading: Querying a genome runs the fragment mapping step in parallel, leading to shorter query times even with a single genome.
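
The in-memory workflow described above could look roughly like the following sketch. The `read_fasta` and `map_genome` helpers are hypothetical names introduced here for illustration; the `pyfastani` calls (`Sketch`, `add_genome`, `index`, `query_genome`) follow the library's documented interface as far as I know, so check them against the API reference for your installed version.

```python
import io


def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA text handle, sequences as bytes."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks).encode("ascii")
            header = line[1:].split()
            name, chunks = (header[0] if header else ""), []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks).encode("ascii")


def map_genome(references, query):
    """Sketch `references` (a {name: bytes} mapping) and return the hits for `query`."""
    # Imported here so the parser above stays usable without the compiled extension.
    import pyfastani

    sketch = pyfastani.Sketch()
    for name, sequence in references.items():
        sketch.add_genome(name, sequence)  # one complete genome per name
    mapper = sketch.index()                # freeze the sketch into a mapper
    return mapper.query_genome(query)      # no temporary file involved


# Sequences can come from any in-memory source, here a text buffer.
fasta = io.StringIO(">contig1 example\nACGT\nACGT\n>contig2\nTTGA\n")
records = dict(read_fasta(fasta))
```

With real genomes loaded into `records`, `map_genome(records, query_bytes)` should return a list of hits, each exposing the reference `name` and the estimated `identity`. Sequences as short as the placeholder ones above are far too small to produce any mapping.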

Setup

Run pip install pyfastani in a shell to download the latest release and all its dependencies from PyPI, or have a look at the Installation page to find other ways to install pyfastani.

License

This library is provided under the MIT License.

The fastANI code was written by Chirag Jain and is distributed under the terms of the Apache License 2.0, unless otherwise specified in vendored sources. The cpu_features code was written by Guillaume Chatelet and is distributed under the terms of the Apache License 2.0. The Boost::math headers were written by Boost Libraries contributors and are distributed under the terms of the Boost Software License.

This project is in no way affiliated with, sponsored by, or otherwise endorsed by the original fastANI authors. It was developed by Martin Larralde during his PhD project at the European Molecular Biology Laboratory in the Zeller team.