Dvc

Jul 20, 2023

Git for data scientists: manage code and data together

Data Science Version Control, or DVC, is an open-source tool for data science and machine learning projects. With a simple and flexible Git-like architecture and interface, it helps data scientists:

  • manage machine learning models: version them together with the data sets and transformation scripts used to generate them;
  • make projects reproducible;
  • make projects shareable;
  • manage experiments with branching and metrics tracking.

It aims to replace tools like Excel and Google Docs, which are commonly used as a team's knowledge repository and ledger, as well as ad-hoc scripts to track, move, and deploy different model versions and ad-hoc data file suffixes and prefixes.



Check out these related ports:
  • Zx - MQT ZX: A library for working with ZX-diagrams
  • Zotero - Reference management for bibliographic data and research materials
  • Yoda - Particle physics package with classes for data analysis, histogramming
  • Xtb - Semiempirical Extended Tight-Binding Program Package
  • Xmakemol - Molecule Viewer Program Based on Motif Widget
  • Xdrawchem - Two-dimensional molecule drawing program
  • Xcrysden - Crystalline and molecular structure visualisation program
  • Xcfun - Exchange-correlation functionals with arbitrary-order derivatives
  • Wxmacmolplt - Graphical user interface principally for the GAMESS program
  • Wwplot - Plotting tool for experimental physics classes
  • Wannier90 - Maximally-localized Wannier functions (MLWFs) and Wannier90
  • Votca - CSG and XTP libraries for atomistic simulations
  • Voro++ - Three-dimensional computations of the Voronoi tessellation
  • Vmd - Molecular visualization program
  • Vipster - Crystalline and molecular structure visualisation program