As little as 1 minute of voice data can be used to train a good TTS model! (few-shot voice cloning)
Easily train a good voice conversion (VC) model with 10 minutes of voice data or less!
A generative speech model for daily dialogue.
Turn any PDF or image document into structured data for your AI. A powerful, lightweight OCR toolkit that bridges the gap between images/PDFs and LLMs. Supports 100+ languages.
This is the Rust course used by the Android team at Google. It provides you with the material to quickly teach Rust.
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Universal markup converter
Build and share delightful machine learning apps, all in Python.
The largest collection of PyTorch image encoders / backbones, including training, evaluation, inference, and export scripts, plus pretrained weights: ResNet, ResNeXt, EfficientNet, NFNet, Vision Transformer (ViT), MobileNetV4, MobileNet-V3 & V2, RegNet, DPN, CSPNet, Swin Transformer, MaxViT, CoAtNet, ConvNeXt, and more.
aider is AI pair programming in your terminal
Integrate the DeepSeek API into popular software