Foundation Models

We're excited to offer a diverse selection of powerful foundation models to fuel your AI development. This list covers a range of cutting-edge language models from leading AI research organizations, each with unique strengths and characteristics. Whether you're building chatbots, generating creative content, or tackling complex reasoning tasks, you'll find a model here to suit your needs.

The models are categorized by their origin and include key details like model size (number of parameters), architecture, and any specific optimizations (e.g., instruction following, chat optimization, math capabilities). We also provide information about the training techniques used (SFT, RLHF) and the model's context window size, which is crucial for handling longer texts.
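As a rough illustration of the attributes described above, the sketch below models a catalog entry as a small Python dataclass. The field names and the example parameter counts and context windows are illustrative assumptions, not an API contract; always check the provider's model card for exact figures.

```python
from dataclasses import dataclass

@dataclass
class ModelEntry:
    """One catalog entry, mirroring the attributes described above."""
    name: str              # full model identifier, e.g. "Qwen2.5-7B-Instruct"
    provider: str          # originating organization, e.g. "Alibaba Cloud"
    parameters_b: float    # parameter count in billions (the "7B" in the name)
    context_window: int    # maximum number of tokens the model can attend to at once
    tuning: str            # "base", "instruct", or "chat"

# Illustrative entries only; verify sizes and context limits against each model card.
catalog = [
    ModelEntry("Qwen2.5-7B-Instruct", "Alibaba Cloud", 7, 32768, "instruct"),
    ModelEntry("Llama-3.3-70B-Instruct", "Meta", 70, 131072, "instruct"),
    ModelEntry("Phi-3-mini-4k-instruct", "Microsoft", 3.8, 4096, "instruct"),
]

# Example: pick the smallest instruction-tuned model with a large context window.
pick = min(
    (m for m in catalog if m.tuning == "instruct" and m.context_window >= 32000),
    key=lambda m: m.parameters_b,
)
print(pick.name)
```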

Qwen (Alibaba Cloud)
  • Qwen2.5: An improved version of the Qwen2 series.
    • Qwen2.5-0.5B-Instruct, Qwen2.5-1.5B-Instruct, Qwen2.5-3B-Instruct, Qwen2.5-7B-Instruct, Qwen2.5-14B-Instruct, Qwen2.5-32B-Instruct, Qwen2.5-72B-Instruct: These are instruction-following models, varying in size from 0.5 billion to 72 billion parameters. Larger models generally have greater capacity for understanding and generating text.
  • Qwen2: A series of large language models.
    • Qwen2-0.5B-Instruct, Qwen2-1.5B-Instruct, Qwen2-7B-Instruct, Qwen2-72B-Instruct: Similar to the Qwen2.5 models, these are instruction-following versions with varying sizes.
    • Qwen2-Math-7B-Instruct, Qwen2-Math-72B-Instruct: Specialized versions of Qwen2 designed for mathematical reasoning and problem-solving.
  • Qwen1.5: An earlier generation of Qwen models.
    • Qwen1.5-110B-Chat: A large chat-optimized model with 110 billion parameters.

Llama (Meta)

  • Llama 3.3: The most recent Llama 3 release listed here.
    • Llama-3.3-70B-Instruct: An instruction-following model.
  • Llama 3.2: A version of Meta's Llama series.
    • Llama-3.2-1B-Instruct, Llama-3.2-3B-Instruct: Instruction-following models.
    • Llama-3.2-1B, Llama-3.2-3B: Base Llama 3.2 models without instruction tuning.
  • Meta-Llama 3.1: An earlier version in the Llama 3 series.
    • Meta-Llama-3.1-8B-Instruct, Meta-Llama-3.1-70B-Instruct: Instruction-following models.
    • Meta-Llama-3.1-8B, Meta-Llama-3.1-70B: Base models.
  • Meta-Llama 3: The original Llama 3 release.
    • Meta-Llama-3-8B-Instruct, Meta-Llama-3-70B-Instruct: Instruction-following models.
    • Meta-Llama-3-8B, Meta-Llama-3-70B: Base models.

Gemma (Google)

  • gemma-2-2b-it, gemma-2-9b-it, gemma-2-27b-it: Instruction-tuned versions of the Gemma 2 model, with 2 billion, 9 billion, and 27 billion parameters respectively; the "it" suffix stands for instruction-tuned.

SeaLLMs

  • SeaLLMs-v3-1.5B-Chat, SeaLLMs-v3-7B-Chat: Chat-optimized versions of SeaLLMs, a model family focused on Southeast Asian languages.
  • SeaLLMs-v3-1.5B, SeaLLMs-v3-7B: Base SeaLLMs models.

Phi (Microsoft)

  • Phi-3.5-mini-instruct: A smaller instruction-following model.
  • Phi-3-medium-128k-instruct, Phi-3-medium-4k-instruct: Medium-sized instruction-following models; "128k" and "4k" refer to the context window size in tokens (the amount of text the model can consider at once); see the token-counting sketch after this list.
  • Phi-3-small-128k-instruct, Phi-3-small-4k-instruct: Small instruction-following models with different context window sizes.
  • Phi-3-mini-128k-instruct, Phi-3-mini-4k-instruct: Smaller instruction-following models with different context window sizes.
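To decide between a 4k and a 128k variant, you can count the tokens in your prompt before sending it. Below is a minimal sketch using the Hugging Face transformers tokenizer; the repository id "microsoft/Phi-3-mini-4k-instruct" and the exact context limits are assumptions to verify against the model card.

```python
# Minimal sketch: count prompt tokens to choose a context-window variant.
# Assumes the tokenizer is available on the Hugging Face Hub under the id below.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

prompt = "Summarize the attached report ..."   # replace with your actual input text
n_tokens = len(tokenizer.encode(prompt))

# Leave headroom for the model's reply; the limits used here are illustrative.
if n_tokens + 1024 <= 4096:
    model_name = "Phi-3-mini-4k-instruct"
else:
    model_name = "Phi-3-mini-128k-instruct"

print(f"{n_tokens} prompt tokens -> use {model_name}")
```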

Yi (01.AI)

  • Yi-1.5-6B-Chat, Yi-1.5-9B-Chat, Yi-1.5-34B-Chat: Chat-optimized models with varying sizes.
  • Yi-1.5-9B-Chat-16K, Yi-1.5-34B-Chat-16K: Chat-optimized models with extended 16K-token context windows.

Aya (CohereForAI)

  • aya-23-8B, aya-23-35B: Multilingual models from the Aya 23 series, with 8 billion and 35 billion parameters.

Baichuan (Baichuan Inc.)

  • Baichuan2-7B-Chat, Baichuan2-13B-Chat: Chat-optimized models from the Baichuan 2 series.

DeepSeek (DeepSeek AI)

  • DeepSeek-R1: A reasoning-focused model trained with large-scale reinforcement learning.
  • DeepSeek-R1-Zero: A variant of DeepSeek-R1 trained with reinforcement learning only, without an initial supervised fine-tuning stage.
  • DeepSeek-R1-Distill-Llama-70B, DeepSeek-R1-Distill-Qwen-32B, DeepSeek-R1-Distill-Qwen-14B, DeepSeek-R1-Distill-Llama-8B, DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Qwen-1.5B: Smaller models distilled from DeepSeek-R1; its reasoning behavior is transferred into Llama and Qwen models of various sizes for improved efficiency.

Key Points:
  • -Instruct: Suffix often indicates the model has been fine-tuned for following instructions.
  • -Chat: Suffix often indicates the model is optimized for conversational interactions.
  • -Math: Suffix suggests a model specialized for mathematical tasks.
  • Size (e.g., 7B, 70B, 110B): Refers to the number of parameters in the model (billions). More parameters generally mean greater capacity but also higher computational cost.
  • -16K, -128k, -4k: These suffixes indicate the model's context window size in tokens. A larger context window allows the model to process more text at once.
  • SFT (Supervised Fine-Tuning): A training technique where the model is fine-tuned on a dataset of input-output pairs.
  • RLHF (Reinforcement Learning from Human Feedback): A training technique that uses human feedback to improve the model's responses.
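Putting these naming conventions together, the sketch below shows how an -Instruct or -Chat model from this list might be called through an OpenAI-compatible chat completions endpoint. The base URL, API key, and model identifier are placeholders; substitute the endpoint and exact model name from your provider's documentation.

```python
# Minimal sketch of a chat request to an instruction-tuned model.
# The base_url below is a placeholder assumption, not a real endpoint.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.example.com/v1",   # placeholder endpoint
    api_key="YOUR_API_KEY",                  # placeholder credential
)

response = client.chat.completions.create(
    model="Qwen2.5-7B-Instruct",             # any instruction/chat model from the list
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Explain what the -Instruct suffix means."},
    ],
    max_tokens=256,
    temperature=0.7,
)
print(response.choices[0].message.content)
```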