AI Personal Learning and Practical Guidance
Total 27 articles

Tags: local deployment, open source, large model tools


LitServe: Rapidly Deploying Enterprise-Grade General AI Model Inference Services

Comprehensive Introduction LitServe is an open-source AI model serving engine from Lightning AI, built on FastAPI and focused on rapidly deploying inference services for general-purpose AI models. It supports a wide range of models, from large language models (LLMs), vision models, and audio models to classical machine learning models, and...


Harbor: a containerized toolset for one-click deployment of local LLM development environments and easy management of AI services

Comprehensive Introduction Harbor is a revolutionary containerized LLM toolset focused on simplifying the deployment and management of local AI development environments. Through a clean command-line interface (CLI) and a companion application, it lets developers launch and manage all their AI services (LLM backends, API interfaces, front-end interfaces, etc.) with a single click...


AI Dev Gallery: a local AI model development toolset for Windows that integrates on-device models into Windows applications

Comprehensive Introduction AI Dev Gallery is an AI development tools application from Microsoft (currently in public preview) designed for Windows developers. It provides a comprehensive platform to help developers easily integrate AI features into their Windows applications. The most notable feature of the tool...


GLM Edge: Zhipu AI Releases On-Device Large Language Models and Multimodal Understanding Models for Mobile, Automotive, and PC Platforms

Comprehensive Introduction GLM-Edge is a series of large language models and multimodal understanding models from Zhipu AI (a Tsinghua University-affiliated company), designed for on-device deployment. The series includes GLM-Edge-1.5B-Chat, GLM-Edge-4B-Chat, GLM-Edge-V-2B, and GLM-Edge-V-5B, targeting cell phones,...


EXO: run distributed AI clusters on idle home devices, with support for multiple inference engines and automatic device discovery

General Introduction Exo is an open-source project designed to let you run your own AI cluster on everyday devices (e.g. iPhone, iPad, Android, Mac, Linux). Through dynamic model partitioning and automatic device discovery, Exo can unify multiple devices into a single powerful GPU, supporting models such as LLaMA, Mis...
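The teaser's "dynamic model partitioning" can be pictured with a small sketch: split a model's layers into contiguous shards sized in proportion to each device's available memory. This is an illustrative toy, not Exo's actual partitioning code; the device names and memory figures are made up.

```python
def partition_layers(num_layers, device_memory):
    """Assign contiguous layer ranges to devices in proportion to memory.

    device_memory: dict of device name -> available memory (any consistent unit).
    Returns dict of device name -> (start, end) half-open layer ranges.
    Illustrative sketch only, not Exo's real algorithm.
    """
    total = sum(device_memory.values())
    shards, start = {}, 0
    items = list(device_memory.items())
    for i, (name, mem) in enumerate(items):
        if i == len(items) - 1:
            end = num_layers  # last device absorbs rounding leftovers
        else:
            end = start + round(num_layers * mem / total)
        shards[name] = (start, end)
        start = end
    return shards


# e.g. a 32-layer model split across a Mac, an iPhone, and a Linux box
print(partition_layers(32, {"mac": 16, "iphone": 4, "linux": 12}))
```

Because every layer lands on exactly one device, the shards can run the model as a pipeline, which is what makes a handful of small devices behave like one larger accelerator.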


LocalAI: an open-source local AI deployment solution supporting multiple model architectures, with a WebUI for unified management of models and APIs

General Introduction LocalAI is an open-source local AI alternative designed to provide API interfaces compatible with OpenAI, Claude, and others. It runs on consumer-grade hardware, requires no GPU, and can handle a wide range of tasks, including text, audio, video, and image generation as well as voice cloning. LocalAI by Ettore ...
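Because LocalAI exposes an OpenAI-compatible API, existing OpenAI-style clients can simply be pointed at the local endpoint. The sketch below builds such a chat-completion request with only the standard library; the base URL and model name are assumptions for illustration (adjust them to your own LocalAI setup), and the actual network call is left commented out.

```python
import json
import urllib.request

# Assumed local endpoint -- adjust host/port to your LocalAI instance.
BASE_URL = "http://localhost:8080/v1"


def chat_request(model, messages):
    """Build an OpenAI-style /chat/completions request (not sent here)."""
    payload = {"model": model, "messages": messages}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = chat_request("my-local-model", [{"role": "user", "content": "Hello"}])
# To actually send it against a running LocalAI instance:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Keeping the wire format identical to OpenAI's is what lets tooling built for hosted APIs work against a local, GPU-free deployment unchanged.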


Petals: run and fine-tune large language models on distributed shared GPUs, pooling GPU resources like a BitTorrent network

General Introduction Petals is an open-source project developed by the BigScience Workshop to run Large Language Models (LLMs) through a distributed computing approach. Users can run and fine-tune LLMs at home using consumer-grade GPUs or Google Colab; supported models include Llama 3.1, Mixtral, F...

Aphrodite Engine: an efficient LLM inference engine that supports multiple quantization formats and distributed inference.

Comprehensive Introduction Aphrodite Engine is the official backend engine for PygmalionAI, designed to provide an inference endpoint for PygmalionAI sites and to support rapid deployment of Hugging Face-compatible models. The engine uses vLLM's PagedAttention technology for efficient K/V cache management and continuous batching,...


llama.cpp: an efficient inference tool supporting a wide range of hardware and making LLM inference easy

General Introduction llama.cpp is a library implemented in pure C/C++ designed to simplify the inference process for large language models (LLMs). It supports a wide range of hardware platforms, including Apple Silicon, NVIDIA GPUs, and AMD GPUs, and provides a variety of quantization options to increase inference speed and reduce memory usage. The project ...
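The memory savings from quantization follow directly from a back-of-the-envelope estimate: weight memory is roughly parameter count times bits per weight. The figures below are illustrative approximations, not llama.cpp measurements; real quantization formats also carry per-block scale overhead, so actual file sizes run slightly larger than this.

```python
def weight_memory_gib(num_params, bits_per_weight):
    """Approximate memory for model weights alone, in GiB.

    Ignores KV cache, activations, and quantization-block overhead.
    """
    return num_params * bits_per_weight / 8 / 2**30


params_7b = 7e9  # a 7B-parameter model
for name, bits in [("FP16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{name}: ~{weight_memory_gib(params_7b, bits):.1f} GiB")
```

Halving the bit width halves the weight footprint, which is why a 4-bit 7B model fits comfortably in consumer RAM where the FP16 original would not.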


Hyperspace (aiOS): distributed AI compute-sharing network, aiOS generative browser, and deep-knowledge agents

General Introduction Hyperspace is an innovative generative browser (aiOS) built on the world's largest peer-to-peer AI network, designed to give users powerful tools for deep research and analysis. By integrating a wide range of AI models and data sources, Hyperspace lets users rapidly generate information networks, drawing on high-quality sources...
