High-performance inference API for transformer models trained with LLM Tool. Deploy your classification, generation, and embedding models with automatic resource management, concurrent request handling, and Ollama integration for generative AI.