General Introduction
ModelBest is a company that specializes in lightweight, high-performance large models and is committed to bringing advanced AI technology to mainstream consumer electronics and everyday terminal devices. ModelBest's MiniCPM series of on-device models is known for extreme compute and memory efficiency, small parameter counts, fast inference, strong performance, and flexible deployment. ModelBest's large models excel at multimodal understanding, OCR, and video understanding, matching or surpassing advanced models such as GPT-4V on comprehensive benchmarks.
Function List
- Lightweight, high-performance large models: efficient compute and memory usage across a wide range of edge devices.
- MiniCPM series: models with strong multimodal understanding and OCR capabilities, including MiniCPM-V 2.6, MiniCPM-Llama3-V 2.5, and others.
- Multimodal understanding: supports real-time video understanding, joint multi-image understanding, and visual analogy.
- Efficient alignment technology: uses the self-developed RLAIF-V technique to reduce hallucinations and improve the trustworthiness of multimodal behavior.
- Edge friendly: runs in as little as 6 GB of memory after quantization, with inference speeds up to 18 tokens/s.
- Open source and collaboration: partners with Tsinghua University, Great Wall Motor, and many others to advance the application and development of large-model technology.
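The "6 GB after quantization" figure above can be sanity-checked with simple arithmetic: a model's weight footprint is roughly parameters × bits-per-weight / 8 bytes. The sketch below assumes an 8-billion-parameter model and int4 quantization; both numbers are illustrative assumptions, not official MiniCPM figures.

```python
# Rough estimate of a quantized model's weight memory footprint.
# The 8B parameter count and int4 quantization are illustrative assumptions.
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Memory needed for the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 1024**3

fp16 = weight_memory_gb(8e9, 16)  # unquantized half precision
int4 = weight_memory_gb(8e9, 4)   # 4-bit quantized

print(f"fp16: {fp16:.1f} GiB, int4: {int4:.1f} GiB")
```

Note that this covers weights only; at runtime the KV cache and activations add to the total, which is why a ~4 GiB int4 weight footprint is consistent with an overall budget of about 6 GB.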
Usage Help
Installation and Deployment
- Download the model: visit the official ModelBest website or its GitHub portal and select the desired MiniCPM model version to download.
- Environment configuration: ensure the device has the necessary hardware (e.g., a GPU with 8 GB of memory) and install the required dependency libraries.
- Model loading: load the model into your application using the provided API or SDK.
- Testing and optimization: run test cases to verify the model works correctly, and tune the configuration as needed.
Rapid local deployment: Ollama supports native one-click deployment of open-source large language models.
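For the Ollama route mentioned above, local deployment typically reduces to a pull followed by a run. The model tag `minicpm-v` below is an assumption; verify the exact tag against the Ollama model library before use.

```shell
# Pull a MiniCPM vision model from the Ollama library
# (the tag "minicpm-v" is an assumption; check the Ollama model library)
ollama pull minicpm-v

# Start an interactive session with the model
ollama run minicpm-v
```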
Function Operation Guide
- Multimodal understanding:
  - Real-time video understanding: feed video data into the model to obtain real-time analysis results.
  - Joint multi-image understanding: input multiple images and the model performs a joint analysis, returning a comprehensive result.
  - Visual analogy: given input images, the model performs visual analogies and outputs similar images or related information.
- OCR functions:
  - Text recognition: upload an image and the model automatically recognizes and extracts the text it contains.
  - Scene-text understanding: the model accurately recognizes and understands text in complex scenes.
- Model optimization:
  - Parameter tuning: adjust model parameters for specific application scenarios to improve performance.
  - Data augmentation: improve the model's generalization by increasing the diversity of the training data.
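As a sketch of how multi-image and OCR queries like those above are commonly invoked, MiniCPM-V checkpoints on Hugging Face are typically driven through a chat-style API with interleaved images and text. The model ID, the `model.chat(...)` signature, and the message format below are assumptions based on common usage of the openbmb checkpoints; check the model card before relying on them.

```python
# Sketch: multi-image and OCR queries against a MiniCPM-V checkpoint.
# Model ID and chat() signature are assumptions; consult the model card.
from typing import Any


def build_msgs(images: list[Any], question: str) -> list[dict]:
    """Build a single-turn chat message: images first, then the text prompt."""
    return [{"role": "user", "content": [*images, question]}]


def run_demo() -> None:
    """Not invoked here: downloads the model and needs real image files."""
    import torch
    from PIL import Image
    from transformers import AutoModel, AutoTokenizer

    model_id = "openbmb/MiniCPM-V-2_6"  # assumed checkpoint name
    model = AutoModel.from_pretrained(
        model_id, trust_remote_code=True,
        torch_dtype=torch.bfloat16, device_map="auto",
    ).eval()
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)

    # Joint multi-image understanding
    imgs = [Image.open("a.jpg").convert("RGB"), Image.open("b.jpg").convert("RGB")]
    msgs = build_msgs(imgs, "Compare these two images and summarize the differences.")
    print(model.chat(image=None, msgs=msgs, tokenizer=tokenizer))

    # OCR: extract text from a single image
    doc = [Image.open("receipt.jpg").convert("RGB")]
    print(model.chat(image=None, msgs=build_msgs(doc, "Extract all text in this image."),
                     tokenizer=tokenizer))
```

Call `run_demo()` with real image paths on a machine with enough GPU memory; the message-building helper works the same way for video frames sampled as images.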
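Parameter tuning in practice often starts with decoding settings rather than retraining. The sketch below shows hypothetical per-scenario sampling presets; the parameter names follow common generation APIs, but the values are illustrative assumptions, not official recommendations.

```python
# Illustrative per-scenario decoding presets (values are assumptions, not
# official recommendations; names follow common generation APIs).
PRESETS = {
    "ocr":      {"temperature": 0.1, "top_p": 0.9, "max_new_tokens": 512},   # near-deterministic extraction
    "video_qa": {"temperature": 0.7, "top_p": 0.95, "max_new_tokens": 256},  # freer description
}


def decoding_params(scenario: str) -> dict:
    """Return a copy of the preset so callers can tweak it without mutating the table."""
    return dict(PRESETS[scenario])


params = decoding_params("ocr")
params["max_new_tokens"] = 1024  # per-call override; PRESETS stays untouched
```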
Usage Examples
- Smart devices (smartphones, tablets, etc.): deploy MiniCPM models on smartphones, tablets, and similar devices for efficient multimodal understanding and OCR.
- Autonomous driving: apply the company's AI technology in autonomous-driving systems to improve a vehicle's environment perception and decision-making.
- Intelligent robots: integrate MiniCPM models into embodied robots for human-robot interaction and environment understanding.