Training API
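InternEvo's training API is managed by internlm.core.trainer.Trainer. Once the training engine and scheduler are defined, the Trainer API can be called to run model training, evaluation, gradient zeroing, and parameter updates.
For detailed usage, see the Trainer API documentation and examples.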
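The call order described above can be sketched with small stand-in classes. The block below does not import InternEvo: `ToyEngine`, `ToyScheduler`, and `ToyTrainer` are hypothetical pure-Python stand-ins, and the method names `train`, `zero_grad`, `step`, and `forward_backward_step` are assumptions inferred from the operations listed above. Only `execute_schedule` and its four-element return value are taken from the reference on this page. Plain floats stand in for tensors.

```python
from typing import Iterator, Optional, Tuple

Batch = Tuple[float, float]  # (input, label)
StepResult = Tuple[float, float, float, Optional[float]]


class ToyEngine:
    """Hypothetical stand-in for an engine: one weight and its gradient."""

    def __init__(self, weight: float = 0.0, lr: float = 0.1) -> None:
        self.weight = weight
        self.grad = 0.0
        self.lr = lr
        self.training = False

    def train(self) -> None:
        self.training = True

    def zero_grad(self) -> None:
        self.grad = 0.0

    def step(self) -> None:
        # Plain gradient descent on the single weight.
        self.weight -= self.lr * self.grad


class ToyScheduler:
    """Hypothetical stand-in for a scheduler: one forward/backward per batch."""

    def forward_backward_step(self, engine: ToyEngine, data_iter: Iterator[Batch]) -> StepResult:
        x, label = next(data_iter)
        output = engine.weight * x
        loss = (output - label) ** 2
        # Gradient of the squared error with respect to the weight.
        engine.grad += 2.0 * (output - label) * x
        moe_loss = None  # the toy model has no mixture-of-experts term
        return output, label, loss, moe_loss


class ToyTrainer:
    """Hypothetical stand-in that forwards calls to the engine and scheduler."""

    def __init__(self, engine: ToyEngine, schedule: ToyScheduler) -> None:
        self._engine = engine
        self._schedule = schedule

    @property
    def engine(self) -> ToyEngine:
        return self._engine

    def train(self) -> None:
        self._engine.train()

    def zero_grad(self) -> None:
        self._engine.zero_grad()

    def step(self) -> None:
        self._engine.step()

    def execute_schedule(self, data_iter: Iterator[Batch]) -> StepResult:
        return self._schedule.forward_backward_step(self._engine, data_iter)


def run_training(num_steps: int = 50) -> Tuple[float, float]:
    """Fit y = 2 * x and return the final weight and the last observed loss."""
    trainer = ToyTrainer(ToyEngine(), ToyScheduler())
    trainer.train()
    data_iter = iter([(1.0, 2.0)] * num_steps)
    loss = float("inf")
    for _ in range(num_steps):
        trainer.zero_grad()  # clear gradients left over from the previous step
        _output, _label, loss, _moe_loss = trainer.execute_schedule(data_iter)
        trainer.step()  # apply the parameter update
    return trainer.engine.weight, loss
```

In this sketch each iteration clears gradients, runs one forward and backward pass through the scheduler, then updates the parameter. A real training loop may order or group these calls differently, for example when accumulating gradients over several micro-batches.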
- class internlm.core.trainer.Trainer(engine: Engine, schedule: BaseScheduler | None = None)
A convenience class that lets users run training and evaluation without writing their own scripts.
- Parameters:
engine (Engine) – Engine responsible for the process function.
schedule (BaseScheduler, optional) – Runtime schedule. Defaults to None.
- property engine
Returns the engine that is responsible for managing the training and evaluation process.
- property schedule
Returns the runtime scheduler.
- property uses_pipeline
Returns whether pipeline parallelism is used.
- execute_schedule(data_iter: Iterable, **kwargs)
Runs the forward pass, loss computation, and backward pass for the model. Returns a tuple of (output, label, loss, moe_loss).
- Parameters:
data_iter (Iterable) – The data iterator.
**kwargs – Additional keyword arguments.
- Returns:
A tuple of (output, label, loss, moe_loss).
- Return type:
Tuple[torch.Tensor]
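As a rough illustration of the interface documented above, the following stand-in mirrors the three properties and `execute_schedule`. It is not InternEvo's implementation: `SketchTrainer`, `RecordingScheduler`, the `is_pipeline` flag, the `tag` keyword, and the fallback to a default scheduler when `schedule` is None are all hypothetical, and delegating to a scheduler method named `forward_backward_step` is an assumption. Plain floats stand in for `torch.Tensor` values.

```python
from typing import Any, Callable, Dict, Iterable, Optional, Tuple

Batch = Tuple[float, float]  # (input, label)


class RecordingScheduler:
    """Hypothetical scheduler that records the keyword arguments it receives."""

    is_pipeline = False  # hypothetical flag, used only by this sketch

    def __init__(self) -> None:
        self.last_kwargs: Dict[str, Any] = {}

    def forward_backward_step(
        self, engine: Callable[[float], float], data_iter: Iterable[Batch], **kwargs: Any
    ) -> Tuple[float, float, float, float]:
        self.last_kwargs = kwargs
        x, label = next(iter(data_iter))
        output = engine(x)
        loss = abs(output - label)
        moe_loss = 0.0  # placeholder: this toy model has no expert layers
        return output, label, loss, moe_loss


class SketchTrainer:
    """Hypothetical stand-in mirroring the documented properties and method."""

    def __init__(
        self,
        engine: Callable[[float], float],
        schedule: Optional[RecordingScheduler] = None,
    ) -> None:
        self._engine = engine
        # Assumption for this sketch: use a default scheduler when none is given.
        self._schedule = schedule if schedule is not None else RecordingScheduler()

    @property
    def engine(self) -> Callable[[float], float]:
        return self._engine

    @property
    def schedule(self) -> RecordingScheduler:
        return self._schedule

    @property
    def uses_pipeline(self) -> bool:
        return bool(getattr(self._schedule, "is_pipeline", False))

    def execute_schedule(
        self, data_iter: Iterable[Batch], **kwargs: Any
    ) -> Tuple[float, float, float, float]:
        # Extra keyword arguments are passed through to the scheduler unchanged.
        return self._schedule.forward_backward_step(self._engine, data_iter, **kwargs)


def double(x: float) -> float:
    """Toy 'model' used as the engine in this sketch."""
    return 2.0 * x


def demo_step() -> Tuple[Tuple[float, float, float, float], bool, Dict[str, Any]]:
    trainer = SketchTrainer(engine=double)
    result = trainer.execute_schedule([(3.0, 5.0)], tag="demo")
    return result, trainer.uses_pipeline, trainer.schedule.last_kwargs
```

Callers would typically unpack the result as `output, label, loss, moe_loss = trainer.execute_schedule(data_iter)`. Whether `moe_loss` carries a meaningful value presumably depends on the model architecture.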