Module

lazyllm.module.ModuleBase

ModuleBase is the core base class in LazyLLM, defining the common interface and fundamental capabilities for all modules.
It abstracts training, deployment, inference, and evaluation logic, while also providing mechanisms for submodule management, hook registration, parameter passing, and recursive updates.
Custom modules should inherit from ModuleBase and implement the forward method to define specific inference logic.

Key Features
  • Unified management of submodules, automatically tracking held ModuleBase instances.
  • Support for Option type hyperparameters, enabling grid search and automated tuning.
  • Hook system that allows executing custom logic before and after calls.
  • Encapsulated update pipeline covering training, server deployment, and evaluation.
  • Built-in evalset loading and parallel inference evaluation.
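
The hook ordering used when a module is called can be sketched without LazyLLM installed. In the class source, every hook's `pre_hook` runs before the call, `post_hook` runs over the hooks in reverse, and `report` runs last. `TinyModule` and `LogHook` below are hypothetical stand-ins, not LazyLLM classes, and the sketch keeps hooks in a list (the library stores them in a set) so that the order is deterministic.

```python
class LogHook:
    """Hypothetical hook exposing the pre_hook / post_hook / report methods."""

    def __init__(self, name, log):
        self.name, self.log = name, log

    def pre_hook(self, *args, **kw):
        self.log.append(f'pre:{self.name}')

    def post_hook(self, output):
        self.log.append(f'post:{self.name}')

    def report(self):
        self.log.append(f'report:{self.name}')


class TinyModule:
    """Stand-in module; an assumption for illustration, not ModuleBase itself."""

    def __init__(self):
        self.hooks = []  # a list here, so iteration order is predictable

    def forward(self, x):
        return x * 2

    def __call__(self, *args, **kw):
        for hook in self.hooks:            # 1. pre-hooks, in order
            hook.pre_hook(*args, **kw)
        result = self.forward(*args, **kw)
        for hook in self.hooks[::-1]:      # 2. post-hooks, reversed
            hook.post_hook(result)
        for hook in self.hooks:            # 3. reports, in order
            hook.report()
        return result


log = []
m = TinyModule()
m.hooks += [LogHook('a', log), LogHook('b', log)]
result = m(21)
```

The reversed post-hook pass gives the hooks onion-like nesting: the hook that started first finishes last.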

Parameters:

  • return_trace (bool, default: False) –

    Whether to write inference results into the trace queue for debugging and tracking. Default is False.

Use Cases
  1. When combining some or all of training, deployment, inference, and evaluation capabilities, e.g., an embedding model requiring both training and inference.
  2. When you want to recursively manage submodules through root-level methods such as start, update, and eval.
  3. When you want user parameters to be automatically propagated from outer modules to inner implementations (see WebModule).
  4. When you want the module to support parameter grid search (see TrialModule).

Examples:

>>> import lazyllm
>>> class Module(lazyllm.module.ModuleBase):
...     pass
... 
>>> class Module2(lazyllm.module.ModuleBase):
...     def __init__(self):
...         super().__init__()
...         self.m = Module()
... 
>>> m = Module2()
>>> m.submodules
[<Module type=Module>]
>>> m.m3 = Module()
>>> m.submodules
[<Module type=Module>, <Module type=Module>]
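
The automatic tracking in this example comes from `__setattr__`: any attribute value that is itself a module is appended to the parent's submodule list. The sketch below reproduces only that mechanism, plus the recursive `for_each` traversal, in a stand-in class. `TrackedBase`, `Leaf`, and `Tree` are hypothetical names, and the real `ModuleBase` does considerably more (options, hooks, module ids).

```python
class TrackedBase:
    """Stand-in base class that records module-valued attributes."""

    def __init__(self):
        self._submodules = []

    def __setattr__(self, name, value):
        # Record the value first, then fall through to normal assignment.
        if isinstance(value, TrackedBase):
            self._submodules.append(value)
        super().__setattr__(name, value)

    @property
    def submodules(self):
        return self._submodules

    def for_each(self, filter, action):
        # Depth-first: act on a matching child, then descend into it.
        for sub in self.submodules:
            if filter(sub):
                action(sub)
            sub.for_each(filter, action)


class Leaf(TrackedBase):
    pass


class Tree(TrackedBase):
    def __init__(self):
        super().__init__()  # must run first so _submodules exists
        self.left = Leaf()
        self.right = Leaf()


t = Tree()
t.left.child = Leaf()  # attributes added later are tracked as well

seen = []
t.for_each(lambda m: isinstance(m, Leaf), seen.append)
```

Because tracking happens on assignment, a subclass has to call the base `__init__` before assigning any submodule.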
Source code in lazyllm/module/module.py
class ModuleBase(metaclass=_MetaBind):
    """ModuleBase is the core base class in LazyLLM, defining the common interface and fundamental capabilities for all modules.  
It abstracts training, deployment, inference, and evaluation logic, while also providing mechanisms for submodule management, hook registration, parameter passing, and recursive updates.  
Custom modules should inherit from ModuleBase and implement the ``forward`` method to define specific inference logic.  

Key Features:
    - Unified management of submodules, automatically tracking held ModuleBase instances.
    - Support for Option type hyperparameters, enabling grid search and automated tuning.
    - Hook system that allows executing custom logic before and after calls.
    - Encapsulated update pipeline covering training, server deployment, and evaluation.
    - Built-in evalset loading and parallel inference evaluation.

Args:
    return_trace (bool): Whether to write inference results into the trace queue for debugging and tracking. Default is ``False``.

Use Cases:
    1. When combining some or all of training, deployment, inference, and evaluation capabilities, e.g., an embedding model requiring both training and inference.
    2. When you want to recursively manage submodules through root-level methods such as ``start``, ``update``, and ``eval``.
    3. When you want user parameters to be automatically propagated from outer modules to inner implementations (see WebModule).
    4. When you want the module to support parameter grid search (see TrialModule).


Examples:
    >>> import lazyllm
    >>> class Module(lazyllm.module.ModuleBase):
    ...     pass
    ... 
    >>> class Module2(lazyllm.module.ModuleBase):
    ...     def __init__(self):
    ...         super(__class__, self).__init__()
    ...         self.m = Module()
    ... 
    >>> m = Module2()
    >>> m.submodules
    [<Module type=Module>]
    >>> m.m3 = Module()
    >>> m.submodules
    [<Module type=Module>, <Module type=Module>]
    """
    builder_keys = []  # keys in builder support Option by default

    def __new__(cls, *args, **kw):
        sig = inspect.signature(cls.__init__)
        paras = sig.parameters
        values = list(paras.values())[1:]  # paras.value()[0] is self
        for i, p in enumerate(args):
            if isinstance(p, Option):
                ann = values[i].annotation
                assert ann == Option or (isinstance(ann, (tuple, list)) and Option in ann), \
                    f'{values[i].name} cannot accept Option'
        for k, v in kw.items():
            if isinstance(v, Option):
                ann = paras[k].annotation
                assert ann == Option or (isinstance(ann, (tuple, list)) and Option in ann), \
                    f'{k} cannot accept Option'
        return object.__new__(cls)

    def __init__(self, *, return_trace=False):
        self._submodules = []
        self._evalset = None
        self._return_trace = return_trace
        self.mode_list = ('train', 'server', 'eval')
        self._set_mid()
        self._used_by_moduleid = None
        self._module_name = None
        self._options = []
        self.eval_result = None
        self._use_cache: Union[bool, str] = False
        self._hooks = set()

    def __setattr__(self, name: str, value):
        if isinstance(value, ModuleBase):
            self._submodules.append(value)
        elif isinstance(value, Option):
            self._options.append(value)
        elif name.endswith('_args') and isinstance(value, dict):
            for v in value.values():
                if isinstance(v, Option):
                    self._options.append(v)
        return super().__setattr__(name, value)

    def __getattr__(self, key):
        def _setattr(v, *, _return_value=self, **kw):
            k = key[:-7] if key.endswith('_method') else key
            if isinstance(v, tuple) and len(v) == 2 and isinstance(v[1], dict):
                kw.update(v[1])
                v = v[0]
            if len(kw) > 0:
                setattr(self, f'_{k}_args', kw)
            setattr(self, f'_{k}', v)
            if hasattr(self, f'_{k}_setter_hook'): getattr(self, f'_{k}_setter_hook')()
            return _return_value
        keys = self.__class__.builder_keys
        if key in keys:
            return _setattr
        elif key.startswith('_') and key[1:] in keys:
            return None
        elif key.startswith('_') and key.endswith('_args') and (key[1:-5] in keys or f'{key[1:-4]}method' in keys):
            return dict()
        raise AttributeError(f'{self.__class__} object has no attribute {key}')

    def __call__(self, *args, **kw):
        hook_objs = []
        for hook_type in self._hooks:
            if isinstance(hook_type, LazyLLMHook):
                hook_objs.append(copy.deepcopy(hook_type))
            elif isinstance(hook_type, type):
                assert issubclass(hook_type, LazyLLMHook), f'{hook_type} is not a subclass of LazyLLMHook'
                hook_objs.append(hook_type(self))
            hook_objs[-1].pre_hook(*args, **kw)
        try:
            kw.update(globals['global_parameters'].get(self._module_id, dict()))
            if (files := globals['lazyllm_files'].get(self._module_id)) is not None: kw['lazyllm_files'] = files
            if (history := globals['chat_history'].get(self._module_id)) is not None: kw['llm_chat_history'] = history

            r = (self._call_impl(**args[0], **kw)
                 if args and isinstance(args[0], kwargs) else self._call_impl(*args, **kw))
            if self._return_trace:
                lazyllm.FileSystemQueue.get_instance('lazy_trace').enqueue(str(r))
        except Exception as e:
            raise RuntimeError(
                f'\nAn error occurred in {self.__class__} with name {self.name}.\n'
                f'Args:\n{args}\nKwargs\n{kw}\nError messages:\n{e}\n'
                f'Original traceback:\n{"".join(traceback.format_tb(e.__traceback__))}')
        for hook_obj in hook_objs[::-1]:
            hook_obj.post_hook(r)
        for hook_obj in hook_objs:
            hook_obj.report()
        self._clear_usage()
        return r

    def _call_impl(self, *args, **kw):
        if self._use_cache and 'R' in lazyllm.config['cache_mode']:
            try:
                return module_cache.get(self.__cache_hash__, args, kw)
            except CacheNotFoundError:
                self._cache_miss_handler()
        r = self.forward(**args[0], **kw) if args and isinstance(args[0], kwargs) else self.forward(*args, **kw)
        if self._use_cache and 'W' in lazyllm.config['cache_mode']:
            module_cache.set(self.__cache_hash__, args, kw, r)
        return r

    def _stream_output(self, text: str, color: Optional[str] = None, *, cls: Optional[str] = None):
        (FileSystemQueue.get_instance(cls) if cls else FileSystemQueue()).enqueue(colored_text(text, color))
        return ''

    @contextmanager
    def stream_output(self, stream_output: Optional[Union[bool, Dict]] = None):
        """Context manager for streaming output during inference or execution.  
When a dictionary is provided to ``stream_output``, a prefix and suffix can be specified along with optional colors.

Args:
    stream_output (Optional[Union[bool, Dict]]): Configuration for streaming output.

        - If True, enables default streaming output.
        - If a dictionary, may include:

            - 'prefix' (str): Text to output at the beginning.
            - 'prefix_color' (str, optional): Color of the prefix.
            - 'suffix' (str): Text to output at the end.
            - 'suffix_color' (str, optional): Color of the suffix.
"""
        if stream_output and isinstance(stream_output, dict) and (prefix := stream_output.get('prefix')):
            self._stream_output(prefix, stream_output.get('prefix_color'))
        yield
        if isinstance(stream_output, dict) and (suffix := stream_output.get('suffix')):
            self._stream_output(suffix, stream_output.get('suffix_color'))

    def used_by(self, module_id):
        """Mark which module is using the current module, indicating the calling relationship.  
Supports chaining by returning the module itself.

Args:
    module_id (str): Unique ID of the parent module that uses this module.

**Returns:**

- ModuleBase: Returns the module itself for method chaining.
"""
        self._used_by_moduleid = module_id
        return self

    def _clear_usage(self):
        globals['usage'].pop(self._module_id, None)

    # interfaces
    def forward(self, *args, **kw):
        """Forward computation interface that must be implemented by subclasses.  
This method defines the logic for receiving inputs and returning outputs, and is the core function of the module as a functor.

Args:
    *args: Variable positional arguments, subclass can define the input as needed.
    **kw: Variable keyword arguments, subclass can define the input as needed.
"""
        raise NotImplementedError

    def register_hook(self, hook_type: Union[LazyLLMHook, Callable]):
        """Register a hook to execute specific logic during module invocation.  
The hook must inherit from ``LazyLLMHook`` and can be used to add custom operations before or after the module's forward computation, such as logging or metrics collection.

Args:
    hook_type (LazyLLMHook): Hook object to register.
"""
        if not isinstance(hook_type, type) and not isinstance(hook_type, LazyLLMHook) and callable(hook_type):
            hook_type = LazyLLMFuncHook(hook_type)
        if not isinstance(hook_type, LazyLLMHook):
            raise TypeError(f'Invalid hook type: {type(hook_type)}, '
                            'must be subclass or instance of LazyLLMHook, or callable function')
        self._hooks.add(hook_type)

    def unregister_hook(self, hook_type: LazyLLMHook):
        """Unregister a previously registered hook.  
If the hook exists in the module, it will be removed and no longer executed during module invocation.

Args:
    hook_type (LazyLLMHook): Hook object to unregister.
"""
        if hook_type in self._hooks:
            self._hooks.remove(hook_type)

    def clear_hooks(self):
        """Clear all hooks registered in the module.  
After calling this, the module will no longer execute any hook logic.
"""
        self._hooks = set()

    def _get_train_tasks(self):
        """Define a training task. This function returns a training pipeline. Subclasses that override this function can be trained or fine-tuned during the update phase.


Examples:
    >>> import lazyllm
    >>> class MyModule(lazyllm.module.ModuleBase):
    ...     def _get_train_tasks(self):
    ...         return lazyllm.pipeline(lambda : 1, lambda x: print(x))
    ... 
    >>> MyModule().update()
    1
    """
        return None
    def _get_deploy_tasks(self):
        """Define a deployment task. This function returns a deployment pipeline. Subclasses that override this function can be deployed during the update/start phase.


Examples:
    >>> import lazyllm
    >>> class MyModule(lazyllm.module.ModuleBase):
    ...     def _get_deploy_tasks(self):
    ...         return lazyllm.pipeline(lambda : 1, lambda x: print(x))
    ... 
    >>> MyModule().start()
    1
    """
        return None
    def _get_post_process_tasks(self): return None

    def _set_mid(self, mid=None):
        self._module_id = mid if mid else str(uuid.uuid4().hex)
        return self

    @property
    def name(self):
        return self._module_name

    @name.setter
    def name(self, name):
        self._module_name = name

    @property
    def submodules(self):
        return self._submodules

    def evalset(self, evalset, load_f=None, collect_f=lambda x: x):
        """Set the evaluation set for the module.  
During ``update`` or ``eval``, the module will perform inference on the evaluation set, and the results will be stored in the ``eval_result`` variable.  

Args:
    evalset (Union[list, str]): Evaluation data list or path to an evaluation data file.
    load_f (Optional[Callable]): Function to load and parse the evaluation file into a list if ``evalset`` is a file path, default is None.
    collect_f (Callable): Function to post-process evaluation results, default is ``lambda x: x``.


Examples:
    >>> import lazyllm
    >>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy).finetune_method(lazyllm.finetune.dummy).trainset("").mode("finetune").prompt(None)
    >>> m.evalset([1, 2, 3])
    >>> m.update()
    INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
    >>> print(m.eval_result)
    ["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
    """
        if isinstance(evalset, str) and os.path.exists(evalset):
            with open(evalset) as f:
                assert callable(load_f)
                self._evalset = load_f(f)
        else:
            self._evalset = evalset
        self.eval_result_collet_f = collect_f

    # TODO: add lazyllm.eval
    def _get_eval_tasks(self):
        def set_result(x): self.eval_result = x

        def parallel_infer():
            with ThreadPoolExecutor(max_workers=200) as executor:
                results = list(executor.map(lambda item: self(**item)
                                            if isinstance(item, dict) else self(item), self._evalset))
            return results
        if self._evalset:
            return Pipeline(parallel_infer,
                            lambda x: self.eval_result_collet_f(x),
                            set_result)
        return None

    # update module(train or finetune),
    def _update(self, *, mode: Optional[Union[str, List[str]]] = None, recursive: bool = True):  # noqa C901
        if not mode: mode = list(self.mode_list)
        if type(mode) is not list: mode = [mode]
        for item in mode:
            assert item in self.mode_list, f'Cannot find {item} in mode list: {self.mode_list}'
        # dfs to get all train tasks
        train_tasks, deploy_tasks, eval_tasks, post_process_tasks = FlatList(), FlatList(), FlatList(), FlatList()
        stack, visited = [(self, iter(self.submodules if recursive else []))], set()
        while len(stack) > 0:
            try:
                top = next(stack[-1][1])
                stack.append((top, iter(top.submodules)))
            except StopIteration:
                top = stack.pop()[0]
                if top._module_id in visited: continue
                visited.add(top._module_id)
                if 'train' in mode: train_tasks.absorb(top._get_train_tasks())
                if 'server' in mode: deploy_tasks.absorb(top._get_deploy_tasks())
                if 'eval' in mode: eval_tasks.absorb(top._get_eval_tasks())
                post_process_tasks.absorb(top._get_post_process_tasks())

        if 'train' in mode and len(train_tasks) > 0:
            Parallel(*train_tasks).set_sync(True)()
        if 'server' in mode and len(deploy_tasks) > 0:
            if redis_client:
                Parallel(*deploy_tasks).set_sync(False)()
            else:
                Parallel.sequential(*deploy_tasks)()
        if 'eval' in mode and len(eval_tasks) > 0:
            Parallel.sequential(*eval_tasks)()
        Parallel.sequential(*post_process_tasks)()
        return self

    def update(self, *, recursive: bool = True):
        """Update the module (and all its submodules). The module will be updated when the ``_get_train_tasks`` method is overridden.

Args:
    recursive (bool): Whether to recursively update all submodules, default is True.


Examples:
    >>> import lazyllm
    >>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).trainset("").deploy_method(lazyllm.deploy.dummy).mode('finetune').prompt(None)
    >>> m.evalset([1, 2, 3])
    >>> m.update()
    INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
    >>> print(m.eval_result)
    ["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
    """
        return self._update(mode=['train', 'server', 'eval'], recursive=recursive)

    def update_server(self, *, recursive: bool = True):
        """Update the deployment (server) part of the module and its submodules. When a module or submodule implements deployment functionality, the corresponding services will be started.

Args:
    recursive (bool): Whether to recursively update deployment tasks of all submodules, default is True.
"""
        return self._update(mode=['server'], recursive=recursive)
    def eval(self, *, recursive: bool = True):
        """Evaluate the module (and all its submodules). This function takes effect after the module has been set with an evaluation set using 'evalset'.

Args:
    recursive (bool): Whether to recursively evaluate all submodules. Defaults to True.


Examples:
    >>> import lazyllm
    >>> class MyModule(lazyllm.module.ModuleBase):
    ...     def forward(self, input):
    ...         return f'reply for input'
    ... 
    >>> m = MyModule()
    >>> m.evalset([1, 2, 3])
    >>> m.eval().eval_result
    ['reply for input', 'reply for input', 'reply for input']
    """
        return self._update(mode=['eval'], recursive=recursive)
    def start(self):
        """Start the deployment services of the module and all its submodules. This ensures that the server functionality of the module and its submodules is executed, suitable for initialization or restarting services.

**Returns:**

- ModuleBase: Returns itself to support method chaining


Examples:
    >>> import lazyllm
    >>> m = lazyllm.TrainableModule().deploy_method(lazyllm.deploy.dummy).prompt(None)
    >>> m.start()
    <Module type=Trainable mode=None basemodel= target= stream=False return_trace=False>
    >>> m(1)
    "reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"
    """
        return self._update(mode=['server'], recursive=True)
    def restart(self):
        """Restart the deployment services of the module and its submodules. Internally calls the ``start`` method to reinitialize the services.

**Returns:**

- ModuleBase: Returns itself to support method chaining


Examples:
    >>> import lazyllm
    >>> m = lazyllm.TrainableModule().deploy_method(lazyllm.deploy.dummy).prompt(None)
    >>> m.restart()
    <Module type=Trainable mode=None basemodel= target= stream=False return_trace=False>
    >>> m(1)
    "reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"
    """
        return self.start()

    def wait(self):
        """Wait for the module or its submodules to finish execution. Currently, this method is a no-op and can be implemented by subclasses according to specific deployment logic.
"""
        pass

    def stop(self):
        """Stop the module and all its submodules. This method recursively calls the ``stop`` method of each submodule, suitable for releasing resources or shutting down services.
"""
        for m in self.submodules:
            m.stop()

    @property
    def options(self):
        options = self._options.copy()
        for m in self.submodules:
            options += m.options
        return options

    def _overwrote(self, f):
        return getattr(self.__class__, f) is not getattr(__class__, f)

    def __repr__(self):
        return lazyllm.make_repr('Module', self.__class__, name=self.name)

    def for_each(self, filter, action):
        """Execute a specified action on all submodules of the module. Recursively traverses all submodules, and if a submodule satisfies the ``filter`` condition, executes the ``action``.

Args:
    filter (Callable): A function that takes a submodule as input and returns a boolean, used to determine whether to perform the action.
    action (Callable): A function to perform on submodules that meet the condition.
"""
        for submodule in self.submodules:
            if filter(submodule):
                action(submodule)
            submodule.for_each(filter, action)

    @property
    def __cache_hash__(self):
        cache_hash = self.__class__.__name__
        if isinstance(self._use_cache, str): cache_hash += f'@{self._use_cache}'
        if hasattr(self, 'appendix_hash_key'): cache_hash += f'@{self.appendix_hash_key}'
        return cache_hash

    def use_cache(self, flag: Union[bool, str] = True):
        """Enable or disable the caching functionality for the module.

This method controls whether the module uses caching to store and retrieve execution results, 
improving performance and avoiding redundant computations.

Args:
    flag (bool or str, optional): Cache control flag. If True, enables caching; if False, disables caching;
                                 if a string, uses a specific cache identifier. Defaults to True.

**Returns:**

- Returns the module instance itself, supporting method chaining.

"""
        self._use_cache = flag or False
        return self

    def _cache_miss_handler(self): pass

eval(*, recursive=True)

Evaluate the module (and all its submodules). This has an effect only after an evaluation set has been attached with evalset.

Parameters:

  • recursive (bool, default: True) –

    Whether to recursively evaluate all submodules. Defaults to True.
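
The evaluation step in the class source maps the module over the evaluation set with a thread pool, unpacking dict items as keyword arguments and passing anything else positionally. The sketch below shows that dispatch in isolation. `parallel_infer` and `echo` are hypothetical names, and the worker count of 8 is an assumption for the example (the listed source uses 200).

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_infer(module, evalset, collect_f=lambda x: x, max_workers=8):
    """Run `module` over every item of `evalset` and post-process the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in input order, whatever the finish order.
        results = list(executor.map(
            lambda item: module(**item) if isinstance(item, dict) else module(item),
            evalset))
    return collect_f(results)


def echo(input, prefix='reply for'):
    """Stand-in for a module's forward logic."""
    return f'{prefix} {input}'


plain = parallel_infer(echo, [1, 2, 3])
mixed = parallel_infer(echo, [{'input': 'a', 'prefix': 'got'}, 'b'], collect_f=len)
```

Since the calls run concurrently, the module's `forward` should be safe to invoke from several threads at once.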

Examples:

>>> import lazyllm
>>> class MyModule(lazyllm.module.ModuleBase):
...     def forward(self, input):
...         return 'reply for input'
... 
>>> m = MyModule()
>>> m.evalset([1, 2, 3])
>>> m.eval().eval_result
['reply for input', 'reply for input', 'reply for input']
Source code in lazyllm/module/module.py
    def eval(self, *, recursive: bool = True):
        """Evaluate the module (and all its submodules). This function takes effect after the module has been set with an evaluation set using 'evalset'.

Args:
    recursive (bool): Whether to recursively evaluate all submodules. Defaults to True.


Examples:
    >>> import lazyllm
    >>> class MyModule(lazyllm.module.ModuleBase):
    ...     def forward(self, input):
    ...         return f'reply for input'
    ... 
    >>> m = MyModule()
    >>> m.evalset([1, 2, 3])
    >>> m.eval().eval_result
    ['reply for input', 'reply for input', 'reply for input']
    """
        return self._update(mode=['eval'], recursive=recursive)

evalset(evalset, load_f=None, collect_f=lambda x: x)

Set the evaluation set for the module.
During update or eval, the module runs inference on the evaluation set and stores the results in the eval_result attribute.

Parameters:

  • evalset (Union[list, str]) –

    Evaluation data list or path to an evaluation data file.

  • load_f (Optional[Callable], default: None) –

    Function that receives the opened evaluation file and returns the evaluation data as a list. Required when evalset is the path of an existing file. Default is None.

  • collect_f (Callable, default: lambda x: x) –

    Function applied to the list of inference results before it is stored in eval_result. Default is the identity function, lambda x: x.
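
The file-handling branch can be illustrated with a stand-alone function. As the source listing for this method shows, `load_f` is called with the already opened file object rather than the path, and a string that does not name an existing file is stored unchanged. `resolve_evalset` and `load_jsonl` are hypothetical helpers written for this sketch, and the JSON Lines format is an assumption.

```python
import json
import os
import tempfile


def load_jsonl(f):
    """Hypothetical load_f: receives the open file, returns a list of records."""
    return [json.loads(line) for line in f if line.strip()]


def resolve_evalset(evalset, load_f=None):
    """Mirror of the branch in evalset(): existing path -> load_f(file), else as-is."""
    if isinstance(evalset, str) and os.path.exists(evalset):
        with open(evalset) as f:
            assert callable(load_f), 'load_f is required when evalset is a file path'
            return load_f(f)
    return evalset


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'eval.jsonl')
    with open(path, 'w') as f:
        f.write('{"input": "a"}\n{"input": "b"}\n')
    from_file = resolve_evalset(path, load_jsonl)

from_list = resolve_evalset([1, 2, 3])
```

Records loaded as dicts are later unpacked as keyword arguments when the module is called, so their keys should match the parameter names of `forward`.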

Examples:

>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy).finetune_method(lazyllm.finetune.dummy).trainset("").mode("finetune").prompt(None)
>>> m.evalset([1, 2, 3])
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
>>> print(m.eval_result)
["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
Source code in lazyllm/module/module.py
    def evalset(self, evalset, load_f=None, collect_f=lambda x: x):
        """Set the evaluation set for the module.  
During ``update`` or ``eval``, the module will perform inference on the evaluation set, and the results will be stored in the ``eval_result`` variable.  

Args:
    evalset (Union[list, str]): Evaluation data list or path to an evaluation data file.
    load_f (Optional[Callable]): Function to load and parse the evaluation file into a list if ``evalset`` is a file path, default is None.
    collect_f (Callable): Function to post-process evaluation results, default is ``lambda x: x``.


Examples:
    >>> import lazyllm
    >>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy).finetune_method(lazyllm.finetune.dummy).trainset("").mode("finetune").prompt(None)
    >>> m.evalset([1, 2, 3])
    >>> m.update()
    INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
    >>> print(m.eval_result)
    ["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
    """
        if isinstance(evalset, str) and os.path.exists(evalset):
            with open(evalset) as f:
                assert callable(load_f)
                self._evalset = load_f(f)
        else:
            self._evalset = evalset
        self.eval_result_collet_f = collect_f

forward(*args, **kw)

Forward computation interface that must be implemented by subclasses; the base implementation raises NotImplementedError.
This method defines how the module maps inputs to outputs, and it is what runs when the module instance is called like a function.

Parameters:

  • *args

    Variable positional arguments; each subclass defines the inputs it needs.

  • **kw

    Variable keyword arguments; each subclass defines the inputs it needs.
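
A minimal sketch of the functor pattern described here: calling the instance dispatches to `forward`, and a subclass narrows `*args, **kw` to the signature it wants. `FunctorBase` and `Upper` are hypothetical stand-ins. The real `ModuleBase.__call__` additionally runs hooks, injects global parameters, and wraps errors in a `RuntimeError`, none of which is modeled.

```python
class FunctorBase:
    """Stand-in base class: __call__ forwards everything to forward()."""

    def __call__(self, *args, **kw):
        return self.forward(*args, **kw)

    def forward(self, *args, **kw):
        raise NotImplementedError


class Upper(FunctorBase):
    # The subclass chooses its own concrete signature.
    def forward(self, text, *, suffix=''):
        return text.upper() + suffix


out = Upper()('hello', suffix='!')

# Calling the base class directly hits the unimplemented forward().
try:
    FunctorBase()('x')
    raised = False
except NotImplementedError:
    raised = True
```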

Source code in lazyllm/module/module.py
    def forward(self, *args, **kw):
        """Forward computation interface that must be implemented by subclasses.  
This method defines the logic for receiving inputs and returning outputs, and is the core function of the module as a functor.

Args:
    *args: Variable positional arguments, subclass can define the input as needed.
    **kw: Variable keyword arguments, subclass can define the input as needed.
"""
        raise NotImplementedError

start()

Start the deployment services of the module and all its submodules. This runs the deployment (server) tasks of the module and of every submodule, which makes it suitable both for first-time initialization and for restarting services.

Returns:

  • ModuleBase: Returns itself to support method chaining
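
The order in which submodules are visited follows the stack-based traversal in `_update` from the class source: children are handled before their parent, and a module reachable by two paths is handled once. The sketch below reproduces that traversal on a hypothetical `Node` class and uses `id()` in place of the library's module id.

```python
class Node:
    """Hypothetical stand-in for a module holding submodules."""

    def __init__(self, name, *children):
        self.name, self.submodules = name, list(children)


def post_order(root, recursive=True):
    """Return module names in the order their tasks would be collected."""
    order, visited = [], set()
    stack = [(root, iter(root.submodules if recursive else []))]
    while stack:
        try:
            # Descend into the next unvisited child of the module on top.
            top = next(stack[-1][1])
            stack.append((top, iter(top.submodules)))
        except StopIteration:
            # All children done: emit the module itself, once.
            top = stack.pop()[0]
            if id(top) in visited:
                continue
            visited.add(id(top))
            order.append(top.name)
    return order


shared = Node('embed')
root = Node('app', Node('retriever', shared), Node('llm', shared))
order = post_order(root)
root_only = post_order(root, recursive=False)
```

With this ordering a shared dependency such as `embed` is collected before the modules that use it.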

Examples:

>>> import lazyllm
>>> m = lazyllm.TrainableModule().deploy_method(lazyllm.deploy.dummy).prompt(None)
>>> m.start()
<Module type=Trainable mode=None basemodel= target= stream=False return_trace=False>
>>> m(1)
"reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"
Source code in lazyllm/module/module.py
    def start(self):
        """Start the deployment services of the module and all its submodules. This ensures that the server functionality of the module and its submodules is executed, suitable for initialization or restarting services.

**Returns:**

- ModuleBase: Returns itself to support method chaining


Examples:
    >>> import lazyllm
    >>> m = lazyllm.TrainableModule().deploy_method(lazyllm.deploy.dummy).prompt(None)
    >>> m.start()
    <Module type=Trainable mode=None basemodel= target= stream=False return_trace=False>
    >>> m(1)
    "reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"
    """
        return self._update(mode=['server'], recursive=True)

restart()

Restart the deployment services of the module and its submodules. Internally calls the start method to reinitialize the services.

Returns:

  • ModuleBase: Returns itself to support method chaining

Examples:

>>> import lazyllm
>>> m = lazyllm.TrainableModule().deploy_method(lazyllm.deploy.dummy).prompt(None)
>>> m.restart()
<Module type=Trainable mode=None basemodel= target= stream=False return_trace=False>
>>> m(1)
"reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"
Source code in lazyllm/module/module.py
    def restart(self):
        """Restart the deployment services of the module and its submodules. Internally calls the ``start`` method to reinitialize the services.

**Returns:**

- ModuleBase: Returns itself to support method chaining


Examples:
    >>> import lazyllm
    >>> m = lazyllm.TrainableModule().deploy_method(lazyllm.deploy.dummy).prompt(None)
    >>> m.restart()
    <Module type=Trainable mode=None basemodel= target= stream=False return_trace=False>
    >>> m(1)
    "reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"
    """
        return self.start()

update(*, recursive=True)

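Update the module and, when recursive is True, all its submodules. An update covers the train, server (deployment), and eval stages; the training stage takes effect only for modules that override the _get_train_tasks method.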

Parameters:

  • recursive (bool, default: True ) –

    Whether to recursively update all submodules, default is True.

Examples:

>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).trainset("").deploy_method(lazyllm.deploy.dummy).mode('finetune').prompt(None)
>>> m.evalset([1, 2, 3])
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
>>> print(m.eval_result)
["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
Source code in lazyllm/module/module.py
    def update(self, *, recursive: bool = True):
        """Update the module (and all its submodules). The module will be updated when the ``_get_train_tasks`` method is overridden.

Args:
    recursive (bool): Whether to recursively update all submodules, default is True.


Examples:
    >>> import lazyllm
    >>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).trainset("").deploy_method(lazyllm.deploy.dummy).mode('finetune').prompt(None)
    >>> m.evalset([1, 2, 3])
    >>> m.update()
    INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
    >>> print(m.eval_result)
    ["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
    """
        return self._update(mode=['train', 'server', 'eval'], recursive=recursive)

stream_output(stream_output=None)

Context manager for streaming output during inference or execution.
When a dictionary is provided to stream_output, a prefix and suffix can be specified along with optional colors.

Parameters:

  • stream_output (Optional[Union[bool, Dict]], default: None ) –

    Configuration for streaming output.

    • If True, enables default streaming output.
    • If a dictionary, may include:

      • 'prefix' (str): Text to output at the beginning.
      • 'prefix_color' (str, optional): Color of the prefix.
      • 'suffix' (str): Text to output at the end.
      • 'suffix_color' (str, optional): Color of the suffix.
Source code in lazyllm/module/module.py
    @contextmanager
    def stream_output(self, stream_output: Optional[Union[bool, Dict]] = None):
        """Context manager for streaming output during inference or execution.  
When a dictionary is provided to ``stream_output``, a prefix and suffix can be specified along with optional colors.

Args:
    stream_output (Optional[Union[bool, Dict]]): Configuration for streaming output.

        - If True, enables default streaming output.
        - If a dictionary, may include:

            - 'prefix' (str): Text to output at the beginning.
            - 'prefix_color' (str, optional): Color of the prefix.
            - 'suffix' (str): Text to output at the end.
            - 'suffix_color' (str, optional): Color of the suffix.
"""
        if stream_output and isinstance(stream_output, dict) and (prefix := stream_output.get('prefix')):
            self._stream_output(prefix, stream_output.get('prefix_color'))
        yield
        if isinstance(stream_output, dict) and (suffix := stream_output.get('suffix')):
            self._stream_output(suffix, stream_output.get('suffix_color'))
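
The prefix/suffix behaviour described above can be tried in isolation. The following is a minimal sketch, assuming a hypothetical stand-in class (`StreamDemo`) whose `_stream_output` only records what it would emit; it is not the real ModuleBase, and only the branching of the context manager is copied from the source shown above.

```python
from contextlib import contextmanager


class StreamDemo:
    """Hypothetical stand-in, not the real ModuleBase."""

    def __init__(self):
        self.emitted = []  # records (text, color) pairs instead of streaming them

    def _stream_output(self, text, color=None):
        # Stand-in for the real streaming call: just record what would be sent.
        self.emitted.append((text, color))

    @contextmanager
    def stream_output(self, stream_output=None):
        # Same branching as the source above: only a dict can carry a prefix or suffix.
        if stream_output and isinstance(stream_output, dict) and (prefix := stream_output.get('prefix')):
            self._stream_output(prefix, stream_output.get('prefix_color'))
        yield
        if isinstance(stream_output, dict) and (suffix := stream_output.get('suffix')):
            self._stream_output(suffix, stream_output.get('suffix_color'))


# A dict configuration wraps the body with a prefix and a suffix.
demo = StreamDemo()
with demo.stream_output({'prefix': '[start] ', 'prefix_color': 'green', 'suffix': ' [end]'}):
    demo._stream_output('body')

# A bare True adds nothing around the body.
plain = StreamDemo()
with plain.stream_output(True):
    plain._stream_output('body')
```

Because the `yield` is not wrapped in `try`/`finally`, the suffix is skipped when the body of the `with` block raises.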

used_by(module_id)

Mark which module is using the current module, indicating the calling relationship.
Supports chaining by returning the module itself.

Parameters:

  • module_id (str) –

    Unique ID of the parent module that uses this module.

Returns:

  • ModuleBase: Returns the module itself for method chaining.
Source code in lazyllm/module/module.py
    def used_by(self, module_id):
        """Mark which module is using the current module, indicating the calling relationship.  
Supports chaining by returning the module itself.

Args:
    module_id (str): Unique ID of the parent module that uses this module.

**Returns:**

- ModuleBase: Returns the module itself for method chaining.
"""
        self._used_by_moduleid = module_id
        return self

register_hook(hook_type)

Register a hook to execute specific logic during module invocation.
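The hook is either a LazyLLMHook instance or a plain callable, which is wrapped in LazyLLMFuncHook before registration; any other value raises TypeError. Hooks can add custom operations before or after the module's forward computation, such as logging or metrics collection.

Parameters:

  • hook_type (LazyLLMHook | Callable) –

    Hook object or callable to register.

Source code in lazyllm/module/module.py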
    def register_hook(self, hook_type: Union[LazyLLMHook, Callable]):
        """Register a hook to execute specific logic during module invocation.  
The hook must inherit from ``LazyLLMHook`` and can be used to add custom operations before or after the module's forward computation, such as logging or metrics collection.

Args:
    hook_type (LazyLLMHook): Hook object to register.
"""
        if not isinstance(hook_type, type) and not isinstance(hook_type, LazyLLMHook) and callable(hook_type):
            hook_type = LazyLLMFuncHook(hook_type)
        if not isinstance(hook_type, LazyLLMHook):
            raise TypeError(f'Invalid hook type: {type(hook_type)}, '
                            'must be subclass or instance of LazyLLMHook, or callable function')
        self._hooks.add(hook_type)

unregister_hook(hook_type)

Unregister a previously registered hook.
If the hook exists in the module, it will be removed and no longer executed during module invocation.

Parameters:

  • hook_type (LazyLLMHook) –

    Hook object to unregister.

Source code in lazyllm/module/module.py
    def unregister_hook(self, hook_type: LazyLLMHook):
        """Unregister a previously registered hook.  
If the hook exists in the module, it will be removed and no longer executed during module invocation.

Args:
    hook_type (LazyLLMHook): Hook object to unregister.
"""
        if hook_type in self._hooks:
            self._hooks.remove(hook_type)

clear_hooks()

Clear all hooks registered in the module.
After calling this, the module will no longer execute any hook logic.

Source code in lazyllm/module/module.py
    def clear_hooks(self):
        """Clear all hooks registered in the module.  
After calling this, the module will no longer execute any hook logic.
"""
        self._hooks = set()
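
The three hook methods above amount to a small set-based registry. The sketch below reproduces that logic with hypothetical stand-ins (`Hook` for LazyLLMHook, `FuncHook` for LazyLLMFuncHook, `HookRegistry` for the module); the real classes carry more behaviour than is modelled here. Returning the stored hook from `register_hook` is an addition for the demo only.

```python
class Hook:
    """Hypothetical stand-in for LazyLLMHook."""


class FuncHook(Hook):
    """Hypothetical stand-in for LazyLLMFuncHook: wraps a plain callable."""

    def __init__(self, func):
        self.func = func


class HookRegistry:
    def __init__(self):
        self._hooks = set()

    def register_hook(self, hook):
        # Plain callables (but not classes or Hook instances) are wrapped first.
        if not isinstance(hook, type) and not isinstance(hook, Hook) and callable(hook):
            hook = FuncHook(hook)
        if not isinstance(hook, Hook):
            raise TypeError(f'Invalid hook type: {type(hook)}')
        self._hooks.add(hook)
        return hook  # demo-only addition so the stored object can be inspected

    def unregister_hook(self, hook):
        if hook in self._hooks:
            self._hooks.remove(hook)

    def clear_hooks(self):
        self._hooks = set()


registry = HookRegistry()
explicit = Hook()
registry.register_hook(explicit)
wrapped = registry.register_hook(lambda *args: None)
count_after_register = len(registry._hooks)

registry.unregister_hook(explicit)
count_after_unregister = len(registry._hooks)

registry.clear_hooks()
count_after_clear = len(registry._hooks)
```

In this sketch the set stores the wrapper, not the original callable, so unregistering a function hook requires the wrapper object. Whether the real LazyLLMFuncHook behaves the same way depends on its equality and hashing, which are not shown in this section.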

update_server(*, recursive=True)

Update the deployment (server) part of the module and its submodules. When a module or submodule implements deployment functionality, the corresponding services will be started.

Parameters:

  • recursive (bool, default: True ) –

    Whether to recursively update deployment tasks of all submodules, default is True.

Source code in lazyllm/module/module.py
    def update_server(self, *, recursive: bool = True):
        """Update the deployment (server) part of the module and its submodules. When a module or submodule implements deployment functionality, the corresponding services will be started.

Args:
    recursive (bool): Whether to recursively update deployment tasks of all submodules, default is True.
"""
        return self._update(mode=['server'], recursive=recursive)
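
The lifecycle methods documented so far differ only in the stage list they pass to `_update`. As a rough illustration, the sketch below uses a hypothetical stand-in class (`LifecycleDemo`) whose `_update` merely records each request and returns the instance; what the real `_update` does with these stages is not modelled.

```python
class LifecycleDemo:
    """Hypothetical stand-in that only records which stages were requested."""

    def __init__(self):
        self.calls = []

    def _update(self, *, mode, recursive=True):
        # The real _update runs the stages; this stand-in just logs the request.
        self.calls.append((tuple(mode), recursive))
        return self

    def start(self):
        return self._update(mode=['server'], recursive=True)

    def restart(self):
        return self.start()

    def update_server(self, *, recursive=True):
        return self._update(mode=['server'], recursive=recursive)

    def update(self, *, recursive=True):
        return self._update(mode=['train', 'server', 'eval'], recursive=recursive)


m = LifecycleDemo()
m.start().restart().update_server(recursive=False).update()
```

Seen this way, `start` and `restart` are `update_server` with recursion always on, and `update` is the only one of the four that also requests training and evaluation.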

wait()

Wait for the module or its submodules to finish execution. Currently, this method is a no-op and can be implemented by subclasses according to specific deployment logic.

Source code in lazyllm/module/module.py
    def wait(self):
        """Wait for the module or its submodules to finish execution. Currently, this method is a no-op and can be implemented by subclasses according to specific deployment logic.
"""
        pass

stop()

Stop the module and all its submodules. This method recursively calls the stop method of each submodule, suitable for releasing resources or shutting down services.

Source code in lazyllm/module/module.py
    def stop(self):
        """Stop the module and all its submodules. This method recursively calls the ``stop`` method of each submodule, suitable for releasing resources or shutting down services.
"""
        for m in self.submodules:
            m.stop()

for_each(filter, action)

Execute a specified action on all submodules of the module. Recursively traverses all submodules, and if a submodule satisfies the filter condition, executes the action.

Parameters:

  • filter (Callable) –

    A function that takes a submodule as input and returns a boolean, used to determine whether to perform the action.

  • action (Callable) –

    A function to perform on submodules that meet the condition.

Source code in lazyllm/module/module.py
    def for_each(self, filter, action):
        """Execute a specified action on all submodules of the module. Recursively traverses all submodules, and if a submodule satisfies the ``filter`` condition, executes the ``action``.

Args:
    filter (Callable): A function that takes a submodule as input and returns a boolean, used to determine whether to perform the action.
    action (Callable): A function to perform on submodules that meet the condition.
"""
        for submodule in self.submodules:
            if filter(submodule):
                action(submodule)
            submodule.for_each(filter, action)
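
The traversal order of for_each can be seen on a small tree. This is a minimal sketch using a hypothetical `Node` class in place of real modules; only the loop body is taken from the source above.

```python
class Node:
    """Hypothetical stand-in for a module that holds submodules."""

    def __init__(self, name, *children):
        self.name = name
        self.submodules = list(children)

    def for_each(self, filter, action):
        # Same shape as the source above: test each direct child, then recurse into it.
        for submodule in self.submodules:
            if filter(submodule):
                action(submodule)
            submodule.for_each(filter, action)


tree = Node('root', Node('a', Node('a1'), Node('a2')), Node('b'))

# Visit every descendant.
visited = []
tree.for_each(lambda n: True, lambda n: visited.append(n.name))

# Visit only the leaves.
leaves = []
tree.for_each(lambda n: not n.submodules, lambda n: leaves.append(n.name))
```

The walk is depth-first and pre-order, and the module on which for_each is called is not itself passed to the filter. A submodule that fails the filter is still descended into.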

lazyllm.module.servermodule.LLMBase

Bases: object

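Base class for large language model modules. It is declared as a plain object subclass (see "Bases" above and the source below) and appears intended to be combined with ModuleBase in concrete classes; share, for example, relies on ModuleBase state such as _hooks.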
Manages initialization and switching of streaming output, prompts, and formatters; processes file information in inputs; supports instance sharing.

Parameters:

  • stream (bool or dict, default: False ) –

    Whether to enable streaming output or streaming configuration, default is False.

  • type (str or LLMType, default: None ) –

    Model type; when omitted, LLMType.LLM is used.

  • init_prompt (bool, default: True ) –

    Whether to automatically create a default prompt at initialization, default is True.

Source code in lazyllm/module/servermodule.py
class LLMBase(object):
    """Base class for large language model modules, inheriting from ModuleBase.  
Manages initialization and switching of streaming output, prompts, and formatters; processes file information in inputs; supports instance sharing.

Args:
    stream (bool or dict): Whether to enable streaming output or streaming configuration, default is False.
    return_trace (bool): Whether to return execution trace, default is False.
    init_prompt (bool): Whether to automatically create a default prompt at initialization, default is True.
"""
    def __init__(self, stream: Union[bool, Dict[str, str]] = False,
                 init_prompt: bool = True, type: Optional[Union[str, LLMType]] = None):
        self._stream = stream
        self._type = LLMType(type) if type else LLMType.LLM
        if init_prompt: self.prompt()
        __class__.formatter(self)

    def _get_files(self, input, lazyllm_files):
        if isinstance(input, package):
            assert not lazyllm_files, 'Duplicate `files` argument provided by args and kwargs'
            input, lazyllm_files = input
        if isinstance(input, str) and input.startswith(LAZYLLM_QUERY_PREFIX):
            assert not lazyllm_files, 'Argument `files` is already provided by query'
            deinput = decode_query_with_filepaths(input)
            assert isinstance(deinput, dict), 'decode_query_with_filepaths must return a dict.'
            input, files = deinput['query'], deinput['files']
        else:
            files = _lazyllm_get_file_list(lazyllm_files) if lazyllm_files else []
        return input, files

    def prompt(self, prompt: Optional[str] = None, history: Optional[List[List[str]]] = None):
        """Set or switch the prompt. Supports None, PrompterBase subclass, or string/dict to create ChatPrompter.

Args:
    prompt (str/dict/PrompterBase/None): The prompt to set.
    history (list): Conversation history, only valid when prompt is str or dict.

**Returns:**

- self: For chaining calls.
"""
        if prompt is None:
            assert not history, 'history is not supported in EmptyPrompter'
            self._prompt = EmptyPrompter()
        elif isinstance(prompt, PrompterBase):
            assert not history, 'history is not supported in user defined prompter'
            self._prompt = prompt
        elif isinstance(prompt, (str, dict)):
            self._prompt = ChatPrompter(prompt, history=history)
        else:
            raise TypeError(f'{prompt} type is not supported.')
        return self

    def formatter(self, format: Optional[FormatterBase] = None):
        """Set or switch the output formatter. Supports None, FormatterBase subclass or callable.

Args:
    format (FormatterBase/Callable/None): Formatter object or function, default is None.

**Returns:**

- self: For chaining calls.
"""
        assert format is None or isinstance(format, FormatterBase) or callable(format), 'format must be None or Callable'
        self._formatter = format or EmptyFormatter()
        return self

    def share(self, prompt: Optional[Union[str, dict, PrompterBase]] = None, format: Optional[FormatterBase] = None,
              stream: Optional[Union[bool, Dict[str, str]]] = None, history: Optional[List[List[str]]] = None):
        """Creates a shallow copy of the current instance, with optional resetting of prompt, formatter, and stream attributes.  
Useful for scenarios where multiple sessions or agents share a base configuration but customize certain parameters.

Args:
    prompt (str/dict/PrompterBase/None): New prompt, optional.
    format (FormatterBase/None): New formatter, optional.
    stream (bool/dict/None): New streaming settings, optional.
    history (list/None): New conversation history, effective only when setting prompt.

**Returns:**

- LLMBase: The new shared instance.
"""
        new = copy.copy(self)
        new._hooks = set()
        new._set_mid()
        if prompt is not None: new.prompt(prompt, history=history)
        if format is not None: new.formatter(format)
        if stream is not None: new.stream = stream
        return new

    @property
    def type(self):
        return self._type.value

    @property
    def stream(self):
        return self._stream

    @stream.setter
    def stream(self, v: Union[bool, Dict[str, str]]):
        self._stream = v

    def __or__(self, other):
        if not isinstance(other, FormatterBase):
            return NotImplemented
        return self.share(format=(other if isinstance(self._formatter, EmptyFormatter) else (self._formatter | other)))

    @property
    def appendix_hash_key(self):
        try:
            prompts = self._prompt.generate_prompt('x')
        except Exception:
            prompts = self._prompt._instruction_template
        if not isinstance(prompts, str):
            try:
                content = json.dumps(prompts, sort_keys=True)
            except Exception:
                content = str(prompts)
        else:
            content = prompts
        return hashlib.md5(content.encode()).hexdigest()
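
The appendix_hash_key property at the end of the class derives a key from the prompt content. The hashing step can be sketched on its own, assuming a hypothetical free function `content_hash` that receives the already generated prompt; how the prompt is generated is left out.

```python
import hashlib
import json


def content_hash(prompts):
    # Strings are hashed as-is; other values are serialized with sorted keys first,
    # falling back to str() when they are not JSON-serializable.
    if not isinstance(prompts, str):
        try:
            content = json.dumps(prompts, sort_keys=True)
        except Exception:
            content = str(prompts)
    else:
        content = prompts
    return hashlib.md5(content.encode()).hexdigest()


# Key order does not matter for dict prompts because of sort_keys=True.
key_a = content_hash({'system': 'You are helpful.', 'user': 'x'})
key_b = content_hash({'user': 'x', 'system': 'You are helpful.'})
key_c = content_hash('plain prompt')
```

Sorting the keys keeps the digest stable across dicts that differ only in insertion order, which matters when the digest is used as part of a lookup key.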

prompt(prompt=None, history=None)

Set or switch the prompt. Supports None, PrompterBase subclass, or string/dict to create ChatPrompter.

Parameters:

  • prompt (str / dict / PrompterBase / None, default: None ) –

    The prompt to set.

  • history (list, default: None ) –

    Conversation history, only valid when prompt is str or dict.

Returns:

  • self: For chaining calls.
Source code in lazyllm/module/servermodule.py
    def prompt(self, prompt: Optional[str] = None, history: Optional[List[List[str]]] = None):
        """Set or switch the prompt. Supports None, PrompterBase subclass, or string/dict to create ChatPrompter.

Args:
    prompt (str/dict/PrompterBase/None): The prompt to set.
    history (list): Conversation history, only valid when prompt is str or dict.

**Returns:**

- self: For chaining calls.
"""
        if prompt is None:
            assert not history, 'history is not supported in EmptyPrompter'
            self._prompt = EmptyPrompter()
        elif isinstance(prompt, PrompterBase):
            assert not history, 'history is not supported in user defined prompter'
            self._prompt = prompt
        elif isinstance(prompt, (str, dict)):
            self._prompt = ChatPrompter(prompt, history=history)
        else:
            raise TypeError(f'{prompt} type is not supported.')
        return self
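
The type dispatch in prompt can be sketched without the library. The classes below are hypothetical stand-ins that share only their names with the real PrompterBase, EmptyPrompter, and ChatPrompter; `select_prompter` is an illustrative free function, not part of LazyLLM.

```python
class PrompterBase:
    """Hypothetical stand-in for the real prompter base class."""


class EmptyPrompter(PrompterBase):
    pass


class ChatPrompter(PrompterBase):
    def __init__(self, instruction, history=None):
        self.instruction = instruction
        self.history = history or []


def select_prompter(prompt=None, history=None):
    # Mirrors the branches of LLMBase.prompt shown above.
    if prompt is None:
        assert not history, 'history is not supported in EmptyPrompter'
        return EmptyPrompter()
    if isinstance(prompt, PrompterBase):
        assert not history, 'history is not supported in user defined prompter'
        return prompt
    if isinstance(prompt, (str, dict)):
        return ChatPrompter(prompt, history=history)
    raise TypeError(f'{prompt} type is not supported.')


empty = select_prompter()
custom = select_prompter(ChatPrompter('ready-made'))
chat = select_prompter('Answer briefly.', history=[['hi', 'hello']])
```

Only the str/dict branch accepts history; in the other two branches a non-empty history trips an assertion rather than being silently ignored.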

formatter(format=None)

Set or switch the output formatter. Supports None, FormatterBase subclass or callable.

Parameters:

  • format (FormatterBase / Callable / None, default: None ) –

    Formatter object or function, default is None.

Returns:

  • self: For chaining calls.
Source code in lazyllm/module/servermodule.py
    def formatter(self, format: Optional[FormatterBase] = None):
        """Set or switch the output formatter. Supports None, FormatterBase subclass or callable.

Args:
    format (FormatterBase/Callable/None): Formatter object or function, default is None.

**Returns:**

- self: For chaining calls.
"""
        assert format is None or isinstance(format, FormatterBase) or callable(format), 'format must be None or Callable'
        self._formatter = format or EmptyFormatter()
        return self

share(prompt=None, format=None, stream=None, history=None)

Creates a shallow copy of the current instance, with optional resetting of prompt, formatter, and stream attributes.
Useful for scenarios where multiple sessions or agents share a base configuration but customize certain parameters.

Parameters:

  • prompt (str / dict / PrompterBase / None, default: None ) –

    New prompt, optional.

  • format (FormatterBase / None, default: None ) –

    New formatter, optional.

  • stream (bool / dict / None, default: None ) –

    New streaming settings, optional.

  • history (list / None, default: None ) –

    New conversation history, effective only when setting prompt.

Returns:

  • LLMBase: The new shared instance.
Source code in lazyllm/module/servermodule.py
    def share(self, prompt: Optional[Union[str, dict, PrompterBase]] = None, format: Optional[FormatterBase] = None,
              stream: Optional[Union[bool, Dict[str, str]]] = None, history: Optional[List[List[str]]] = None):
        """Creates a shallow copy of the current instance, with optional resetting of prompt, formatter, and stream attributes.  
Useful for scenarios where multiple sessions or agents share a base configuration but customize certain parameters.

Args:
    prompt (str/dict/PrompterBase/None): New prompt, optional.
    format (FormatterBase/None): New formatter, optional.
    stream (bool/dict/None): New streaming settings, optional.
    history (list/None): New conversation history, effective only when setting prompt.

**Returns:**

- LLMBase: The new shared instance.
"""
        new = copy.copy(self)
        new._hooks = set()
        new._set_mid()
        if prompt is not None: new.prompt(prompt, history=history)
        if format is not None: new.formatter(format)
        if stream is not None: new.stream = stream
        return new
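
What "shallow copy" means for share can be illustrated with a reduced example. `SharedDemo` below is a hypothetical stand-in with made-up attributes (`prompt_text`, `config`); it only mirrors the copy-then-override pattern of the source above.

```python
import copy


class SharedDemo:
    """Hypothetical stand-in; only mirrors the copy-then-override pattern."""

    def __init__(self, prompt, stream=False):
        self.prompt_text = prompt
        self.stream = stream
        self.config = {'temperature': 0.1}  # mutable state shared by shallow copies
        self._hooks = {'base-hook'}

    def share(self, prompt=None, stream=None):
        new = copy.copy(self)  # shallow copy: attributes point to the same objects
        new._hooks = set()     # hooks are reset rather than shared
        if prompt is not None:
            new.prompt_text = prompt
        if stream is not None:
            new.stream = stream
        return new


base = SharedDemo('You are a translator.')
agent = base.share(prompt='You are a reviewer.', stream=True)
```

Overridden attributes are independent, but anything not overridden is the same object in both instances, so in-place mutation of such shared state through one instance is visible through the other.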

lazyllm.module.ActionModule

Bases: ModuleBase

Wraps functions, modules, flows, and other callable objects into a single Module. Every wrapped Module (including Modules inside a flow) becomes a submodule of this Module.

Parameters:

  • action (Callable | list[Callable], default: () ) –

    The object to be wrapped, which is one or a set of callable objects.

  • return_trace (bool, default: False ) –

    Whether to enable trace mode to record the execution stack. Defaults to False.

Examples:

>>> import lazyllm
>>> def myfunc(input): return input + 1
... 
>>> class MyModule1(lazyllm.module.ModuleBase):
...     def forward(self, input): return input * 2
... 
>>> class MyModule2(lazyllm.module.ModuleBase):
...     def _get_deploy_tasks(self): return lazyllm.pipeline(lambda : print('MyModule2 deployed!'))
...     def forward(self, input): return input * 4
... 
>>> class MyModule3(lazyllm.module.ModuleBase):
...     def _get_deploy_tasks(self): return lazyllm.pipeline(lambda : print('MyModule3 deployed!'))
...     def forward(self, input): return f'get {input}'
... 
>>> m = lazyllm.ActionModule(myfunc, lazyllm.pipeline(MyModule1(), MyModule2), MyModule3())
>>> print(m(1))
get 16
>>> 
>>> m.evalset([1, 2, 3])
>>> m.update()
MyModule2 deployed!
MyModule3 deployed!
>>> print(m.eval_result)
['get 16', 'get 24', 'get 32']
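
The results in the example above follow from chaining the wrapped callables left to right. A library-free sketch of that data flow, assuming a hypothetical `compose` helper and plain functions in place of the three modules:

```python
from functools import reduce


def compose(*steps):
    # Feed the output of each step into the next, left to right.
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


def myfunc(x): return x + 1
def double(x): return x * 2          # plays the role of MyModule1.forward
def quadruple(x): return x * 4       # plays the role of MyModule2.forward
def describe(x): return f'get {x}'   # plays the role of MyModule3.forward


run = compose(myfunc, double, quadruple, describe)
results = [run(i) for i in [1, 2, 3]]
```

For input 1 the chain computes (1 + 1) * 2 * 4 = 16, which is then formatted as 'get 16'. The sketch covers only the forward path; deployment and submodule tracking are not modelled.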

evalset(evalset, load_f=None, collect_f=<function ModuleBase.<lambda>>)

Set the evaluation set for the Module. A Module with an evaluation set is evaluated during update or eval, and the results are stored in eval_result.

evalset(evalset, collect_f=lambda x: ...)→ None

Parameters:

  • evalset (list)

    Evaluation set

  • collect_f (Callable)

    Post-processing method for evaluation results, no post-processing by default.

evalset(evalset, load_f=None, collect_f=lambda x: ...)→ None

Parameters:

  • evalset (str)

    Path to the evaluation set

  • load_f (Callable)

    Method for loading the evaluation set, including parsing file formats and converting to a list

  • collect_f (Callable)

    Post-processing method for evaluation results, no post-processing by default.

Examples:

>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy)
>>> m.evalset([1, 2, 3])
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
>>> m.eval_result
["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
Source code in lazyllm/module/module.py
class ActionModule(ModuleBase):
    """Used to wrap a Module around functions, modules, flows, Module, and other callable objects. The wrapped Module (including the Module within the flow) will become a submodule of this Module.

Args:
    action (Callable|list[Callable]): The object to be wrapped, which is one or a set of callable objects.
    return_trace (bool): Whether to enable trace mode to record the execution stack. Defaults to ``False``.

**Examples:**

```python
>>> import lazyllm
>>> def myfunc(input): return input + 1
... 
>>> class MyModule1(lazyllm.module.ModuleBase):
...     def forward(self, input): return input * 2
... 
>>> class MyModule2(lazyllm.module.ModuleBase):
...     def _get_deploy_tasks(self): return lazyllm.pipeline(lambda : print('MyModule2 deployed!'))
...     def forward(self, input): return input * 4
... 
>>> class MyModule3(lazyllm.module.ModuleBase):
...     def _get_deploy_tasks(self): return lazyllm.pipeline(lambda : print('MyModule3 deployed!'))
...     def forward(self, input): return f'get {input}'
... 
>>> m = lazyllm.ActionModule(myfunc, lazyllm.pipeline(MyModule1(), MyModule2), MyModule3())
>>> print(m(1))
get 16
>>> 
>>> m.evalset([1, 2, 3])
>>> m.update()
MyModule2 deployed!
MyModule3 deployed!
>>> print(m.eval_result)
['get 16', 'get 24', 'get 32']
```


<span style="font-size: 20px;">**`evalset(evalset, load_f=None, collect_f=<function ModuleBase.<lambda>>)`**</span>

Set the evaluation set for the Module. Modules that have been set with an evaluation set will be evaluated during ``update`` or ``eval``, and the evaluation results will be stored in the eval_result variable. 


<span style="font-size: 18px;">&ensp;**`evalset(evalset, collect_f=lambda x: ...)→ None `**</span>


Args:
    evalset (list) :Evaluation set
    collect_f (Callable) :Post-processing method for evaluation results, no post-processing by default.



<span style="font-size: 18px;">&ensp;**`evalset(evalset, load_f=None, collect_f=lambda x: ...)→ None`**</span>


Args:
    evalset (str) :Path to the evaluation set
    load_f (Callable) :Method for loading the evaluation set, including parsing file formats and converting to a list
    collect_f (Callable) :Post-processing method for evaluation results, no post-processing by default.

**Examples:**

```python
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(deploy.dummy)
>>> m.evalset([1, 2, 3])
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
>>> m.eval_result
["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
```


"""
    def __init__(self, *action, return_trace=False):
        super().__init__(return_trace=return_trace)
        if len(action) == 1 and isinstance(action, FlowBase): action = action[0]
        if isinstance(action, (tuple, list)):
            action = Pipeline(*action)
        assert isinstance(action, FlowBase), f'Invalid action type {type(action)}'
        self.action = action

    def forward(self, *args, **kw):
        """Executes the wrapped action with the provided input arguments. Equivalent to directly calling the module.

Args:
    args (list of callables or single callable): Positional arguments to be passed to the wrapped action.
    kwargs (dict of callables): Keyword arguments to be passed to the wrapped action.

**Returns:**

- Any: The result of executing the wrapped action.
"""
        return self.action(*args, **kw)

    @property
    def submodules(self):
        """Returns all submodules of type ModuleBase contained in the wrapped action. This automatically traverses any nested modules inside a Pipeline.

**Returns:**

- list[ModuleBase]: List of submodules
"""
        try:
            if isinstance(self.action, FlowBase):
                submodule = []
                self.action.for_each(lambda x: isinstance(x, ModuleBase), lambda x: submodule.append(x))
                return submodule
        except Exception as e:
            raise RuntimeError(f'{str(e)}\nOriginal traceback:\n{"".join(traceback.format_tb(e.__traceback__))}')
        return super().submodules

    def __repr__(self):
        return lazyllm.make_repr('Module', 'Action', subs=[repr(self.action)],
                                 name=self._module_name, return_trace=self._return_trace)

submodules property

Returns all submodules of type ModuleBase contained in the wrapped action. This automatically traverses any nested modules inside a Pipeline.

Returns:

  • list[ModuleBase]: List of submodules

forward(*args, **kw)

Executes the wrapped action with the provided input arguments. Equivalent to directly calling the module.

Parameters:

  • args (tuple, default: () ) –

    Positional arguments to be passed to the wrapped action.

  • kw (dict) –

    Keyword arguments to be passed to the wrapped action.

Returns:

  • Any: The result of executing the wrapped action.
Source code in lazyllm/module/module.py
    def forward(self, *args, **kw):
        """Executes the wrapped action with the provided input arguments. Equivalent to directly calling the module.

Args:
    args (list of callables or single callable): Positional arguments to be passed to the wrapped action.
    kwargs (dict of callables): Keyword arguments to be passed to the wrapped action.

**Returns:**

- Any: The result of executing the wrapped action.
"""
        return self.action(*args, **kw)

lazyllm.module.TrainableModule

Bases: UrlModule

Trainable module. All models (including LLM, Embedding, etc.) are served through TrainableModule.

TrainableModule(base_model='', target_path='', *, stream=False, return_trace=False)

Parameters:

  • base_model (str, default: '' ) –

    Name or path of the base model.

  • target_path (str, default: '' ) –

    Path to save the fine-tuning task.

  • source (str) –

    Model source. If not set, it will read the value from the environment variable LAZYLLM_MODEL_SOURCE.

  • stream (bool, default: False ) –

    Whether to stream the output.

  • return_trace (bool, default: False ) –

    Whether to record the results in the trace.

TrainableModule.trainset(v):

Set the training set for TrainableModule

Parameters:

  • v (str) –

    Path to the training/fine-tuning dataset.

Examples:

>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).trainset('/file/to/path').deploy_method(None).mode('finetune')
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}

TrainableModule.train_method(v, **kw):

Set the training method for TrainableModule. Continued pre-training is not yet supported; it is expected to be available in the next version.

Parameters:

  • v (LazyLLMTrainBase) –

    Training method, options include train.auto etc.

  • kw (**dict) –

    Parameters required by the training method, corresponding to v.

TrainableModule.finetune_method(v, **kw):

Set the fine-tuning method and its parameters for TrainableModule.

Parameters:

  • v (LazyLLMFinetuneBase) –

    Fine-tuning method, options include finetune.auto / finetune.alpacalora / finetune.collie etc.

  • kw (**dict) –

    Parameters required by the fine-tuning method, corresponding to v.

Examples:

>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).deploy_method(None).mode('finetune')
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}

TrainableModule.deploy_method(v, **kw):

Set the deployment method and its parameters for TrainableModule.

Parameters:

  • v (LazyLLMDeployBase) –

    Deployment method, options include deploy.auto / deploy.lightllm / deploy.vllm etc.

  • kw (**dict) –

    Parameters required by the deployment method, corresponding to v.

Examples:

>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy).mode('finetune')
>>> m.evalset([1, 2, 3])
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
>>> m.eval_result
["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]

TrainableModule.mode(v):

Set whether to execute training or fine-tuning during update for TrainableModule.

Parameters:

  • v (str) –

    Stage to run during update: 'finetune' or 'train'. Default is 'finetune'.

Examples:

>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).deploy_method(None).mode('finetune')
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}

eval(*, recursive=True)

Evaluate the module (and all its submodules). This function takes effect after the module has set an evaluation set through evalset.

Parameters:

  • recursive (bool)

    Whether to recursively evaluate all submodules, default is True.

evalset(evalset, load_f=None, collect_f=<function ModuleBase.<lambda>>)

Set the evaluation set for the Module. A Module with an evaluation set is evaluated during update or eval, and the results are stored in eval_result.

evalset(evalset, collect_f=lambda x: ...)→ None

Parameters:

  • evalset (list)

    Evaluation set

  • collect_f (Callable)

    Post-processing method for evaluation results, no post-processing by default.

evalset(evalset, load_f=None, collect_f=lambda x: ...)→ None

Parameters:

  • evalset (str)

    Path to the evaluation set

  • load_f (Callable)

    Method for loading the evaluation set, including parsing file formats and converting to a list

  • collect_f (Callable)

    Post-processing method for evaluation results, no post-processing by default.

Examples:

>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy)
>>> m.evalset([1, 2, 3])
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
>>> m.eval_result
["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
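
The two callables accepted by evalset can be sketched without LazyLLM. This is a minimal illustration, not the library's own loader: `load_jsonl` and `count_non_empty` are hypothetical names, the JSON Lines format is an assumption, and because the text above does not say whether load_f receives a path or an open file, the loader accepts both.

```python
import io
import json


def load_jsonl(source):
    """Hypothetical load_f: turn a JSON Lines file into a list of samples.

    Accepts a path or an already-open text file, since the documentation
    does not state which of the two is handed to load_f (assumption).
    """
    handle = open(source, encoding="utf-8") if isinstance(source, str) else source
    try:
        # Skip blank lines; every remaining line is one JSON-encoded sample.
        return [json.loads(line) for line in handle if line.strip()]
    finally:
        if handle is not source:
            handle.close()


def count_non_empty(results):
    """Hypothetical collect_f: reduce the list of replies to summary counts."""
    return {"total": len(results), "non_empty": sum(1 for r in results if r)}


# Exercise both helpers on in-memory data instead of a real file or model.
samples = load_jsonl(io.StringIO('"What is LazyLLM?"\n\n"Say hello"\n'))
summary = count_non_empty(["reply A", ""])
```

Under these assumptions the helpers would be passed as `m.evalset(path, load_f=load_jsonl, collect_f=count_non_empty)`, where `path` points to an evaluation file of your own.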

restart()

Restart the module and all its submodules.

Examples:

>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy)
>>> m.restart()
>>> m(1)
"reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"

start()

Deploy the module and all its submodules.

Examples:

>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy)
>>> m.start()
>>> m(1)
"reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"

Source code in lazyllm/module/llms/trainablemodule.py
class TrainableModule(UrlModule):
    """Trainable module, all models (including LLM, Embedding, etc.) are served through TrainableModule

<span style="font-size: 20px;">**`TrainableModule(base_model='', target_path='', *, stream=False, return_trace=False)`**</span>


Args:
    base_model (str): Name or path of the base model. 
    target_path (str): Path to save the fine-tuning task. 
    source (str): Model source. If not set, it will read the value from the environment variable LAZYLLM_MODEL_SOURCE.
    stream (bool): Whether to output stream. 
    return_trace (bool): Record the results in trace.


<span style="font-size: 20px;">**`TrainableModule.trainset(v):`**</span>

Set the training set for TrainableModule


Args:
    v (str): Path to the training/fine-tuning dataset.

**Examples:**

```python
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).trainset('/file/to/path').deploy_method(None).mode('finetune')
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
```

<span style="font-size: 20px;">**`TrainableModule.train_method(v, **kw):`**</span>

Set the training method for TrainableModule. Continued pre-training is not supported yet, expected to be available in the next version.

Args:
    v (LazyLLMTrainBase): Training method, options include ``train.auto`` etc.
    kw (**dict): Parameters required by the training method, corresponding to v.

<span style="font-size: 20px;">**`TrainableModule.finetune_method(v, **kw):`**</span>

Set the fine-tuning method and its parameters for TrainableModule.

Args:
    v (LazyLLMFinetuneBase): Fine-tuning method, options include ``finetune.auto`` / ``finetune.alpacalora`` / ``finetune.collie`` etc.
    kw (**dict): Parameters required by the fine-tuning method, corresponding to v.

**Examples:**

```python
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).deploy_method(None).mode('finetune')
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}                
```

<span style="font-size: 20px;">**`TrainableModule.deploy_method(v, **kw):`**</span>

Set the deployment method and its parameters for TrainableModule.

Args:
    v (LazyLLMDeployBase): Deployment method, options include ``deploy.auto`` / ``deploy.lightllm`` / ``deploy.vllm`` etc.
    kw (**dict): Parameters required by the deployment method, corresponding to v.

**Examples:**

```python
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy).mode('finetune')
>>> m.evalset([1, 2, 3])
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
>>> m.eval_result
["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
```                


<span style="font-size: 20px;">**`TrainableModule.mode(v):`**</span>

Set whether to execute training or fine-tuning during update for TrainableModule.

Args:
    v (str): Whether update runs training or fine-tuning. Options are 'train' and 'finetune'; the default is 'finetune'.

**Examples:**

```python
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().finetune_method(lazyllm.finetune.dummy).deploy_method(None).mode('finetune')
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
```    

<span style="font-size: 20px;">**`eval(*, recursive=True)`**</span>
Evaluate the module (and all its submodules). This function takes effect after the module has set an evaluation set through evalset.

Args:
    recursive (bool): Whether to recursively evaluate all submodules. Default is True.

<span style="font-size: 20px;">**`evalset(evalset, load_f=None, collect_f=<function ModuleBase.<lambda>>)`**</span>

Set the evaluation set for the Module. A module with an evaluation set is evaluated during ``update`` or ``eval``, and the results are stored in its ``eval_result`` attribute.


<span style="font-size: 18px;">&ensp;**`evalset(evalset, collect_f=lambda x: ...)→ None `**</span>


Args:
    evalset (list): Evaluation set.
    collect_f (Callable): Post-processing method for evaluation results, no post-processing by default.



<span style="font-size: 18px;">&ensp;**`evalset(evalset, load_f=None, collect_f=lambda x: ...)→ None`**</span>


Args:
    evalset (str): Path to the evaluation set.
    load_f (Callable): Method for loading the evaluation set, including parsing file formats and converting to a list.
    collect_f (Callable): Post-processing method for evaluation results, no post-processing by default.

**Examples:**

```python
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy)
>>> m.evalset([1, 2, 3])
>>> m.update()
INFO: (lazyllm.launcher) PID: dummy finetune!, and init-args is {}
>>> m.eval_result
["reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1}", "reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1}"]
```

<span style="font-size: 20px;">**`restart() `**</span>

Restart the module and all its submodules.

**Examples:**

```python
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy)
>>> m.restart()
>>> m(1)
"reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"
```

<span style="font-size: 20px;">**`start() `**</span> 

Deploy the module and all its submodules.

**Examples:**

```python
>>> import lazyllm
>>> m = lazyllm.module.TrainableModule().deploy_method(lazyllm.deploy.dummy)
>>> m.start()
>>> m(1)
"reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1}"
```                                  
"""
    builder_keys = _TrainableModuleImpl.builder_keys

    def __init__(self, base_model: Option = '', target_path='', *, stream: Union[bool, Dict[str, str]] = False,
                 return_trace: bool = False, trust_remote_code: bool = True, type: Optional[Union[str, LLMType]] = None):
        super().__init__(url=None, stream=stream, return_trace=return_trace, init_prompt=False)
        self._template = _UrlTemplateStruct()
        self._impl = _TrainableModuleImpl(base_model, target_path, stream, None, lazyllm.finetune.auto,
                                          lazyllm.deploy.auto, self._template, self._url_wrapper,
                                          trust_remote_code, type)
        self._stream = stream
        self.prompt()
        if config['cache_local_module']:
            self.use_cache()

    template_message = property(lambda self: self._template.template_message)
    keys_name_handle = property(lambda self: self._template.keys_name_handle)
    template_headers = property(lambda self: self._template.template_headers)
    extract_result_func = property(lambda self: self._template.extract_result_func)
    stream_parse_parameters = property(lambda self: self._template.stream_parse_parameters)
    stream_url_suffix = property(lambda self: self._template.stream_url_suffix)

    base_model = property(lambda self: self._impl._base_model)
    target_path = property(lambda self: self._impl._target_path)
    finetuned_model_path = property(lambda self: self._impl._finetuned_model_path)
    _url_id = property(lambda self: self._impl._module_id)

    @property
    def series(self):
        return re.sub(r'\d+$', '', ModelManager._get_model_name(self.base_model).split('-')[0].upper())

    @property
    def type(self):
        if self._impl._type is not None: return self._impl._type.value
        return ModelManager.get_model_type(self.base_model).upper()

    def get_all_models(self):
        """get_all_models() -> List[str]

Returns a list of all fine-tuned model paths under the current target path.

**Returns:**

- List[str]: A list of fine-tuned model identifiers or directories.
"""
        return self._impl._get_all_finetuned_models()

    def set_specific_finetuned_model(self, model_path):
        """set_specific_finetuned_model(model_path: str) -> None

Sets the model to be used from a specific fine-tuned model path.

Args:
    model_path (str): The path to the fine-tuned model to use.
"""
        return self._impl._set_specific_finetuned_model(model_path)

    @property
    def _deploy_type(self):
        if self._impl._deploy is not lazyllm.deploy.AutoDeploy:
            return self._impl._deploy
        elif self._impl._deployer:
            return type(self._impl._deployer)
        else:
            return lazyllm.deploy.AutoDeploy

    def wait(self):
        """Wait for the model deployment task to complete. This method blocks the current thread until the deployment is finished.


Examples:
    >>> import lazyllm
    >>> class Mywait(lazyllm.module.llms.TrainableModule):
    ...    def forward(self):
    ...        self.wait()
    """
        if launcher := self._impl._launchers['default'].get('deploy'):
            launcher.wait()

    def stop(self, task_name: Optional[str] = None):
        """Stop a specific task of the model.

Args:
    task_name (str): The name of the task to stop. Defaults to None, which stops the default 'deploy' task.


Examples:
    >>> import lazyllm
    >>> class Mystop(lazyllm.module.llms.TrainableModule):
    ...    def forward(self, task):
    ...        self.stop(task)
    """
        try:
            launcher = self._impl._launchers['manual' if task_name else 'default'][task_name or 'deploy']
        except KeyError:
            raise RuntimeError('Cannot stop an unstarted task')
        if not task_name: self._impl._get_deploy_tasks.flag.reset()
        launcher.cleanup()

    def status(self, task_name: Optional[str] = None):
        """status(task_name: Optional[str] = None) -> str

Returns the current status of a specific task in the module.

Args:
    task_name (Optional[str]): Name of the task (e.g., 'deploy'). Defaults to 'deploy' if not provided.

**Returns:**

- str: Status string such as 'running', 'finished', or 'stopped'.
"""
        launcher = self._impl._launchers['manual' if task_name else 'default'][task_name or 'deploy']
        return launcher.status

    def log_path(self, task_name: Optional[str] = None):
        """Get task log path.

Get corresponding log file path based on task name, supports default deployment tasks and manually specified tasks.

Args:
    task_name (Optional[str]): Task name, defaults to None (get default deployment task log)

Returns:
    str: Log file path
"""
        launcher = self._impl._launchers['manual' if task_name else 'default'][task_name or 'deploy']
        return launcher.log_path

    # modify default value to ''
    def prompt(self, prompt: Union[str, dict] = '', history: Optional[List[List[str]]] = None):
        """Processes the input prompt and generates a format compatible with the model.

Args:
    prompt (str): The input prompt. Defaults to an empty string.
    history (List): Conversation history.


Examples:
    >>> import lazyllm
    >>> class Myprompt(lazyllm.module.llms.TrainableModule):
    ...    def forward(self, prompt, history):
    ...        self.prompt(prompt,history)
    """
        if self.base_model != '' and prompt == '' and self.type != 'LLM':
            prompt = None
        clear_system = isinstance(prompt, dict) and prompt.get('drop_builtin_system')
        prompter = super(__class__, self).prompt(prompt, history)._prompt
        self._tools = getattr(prompter, '_tools', None)
        keys = ModelManager.get_model_prompt_keys(self.base_model).copy()
        if keys:
            if clear_system: keys['system'] = ''
            prompter._set_model_configs(**keys)
            for key in ['tool_start_token', 'tool_args_token', 'tool_end_token']:
                if key in keys: setattr(self, f'_{key}', keys[key])
        return self

    def _loads_str(self, text: str) -> Union[str, Dict]:
        try:
            ret = json.loads(text)
            return self._loads_str(ret) if isinstance(ret, str) else ret
        except Exception:
            LOG.error(f'{text} is not a valid json string.')
            return text

    def _parse_arguments_with_args_token(self, output: str) -> tuple[str, dict]:
        items = output.split(self._tool_args_token)
        func_name = items[0].strip()
        if len(items) == 1:
            return func_name.split(self._tool_end_token)[0].strip() if getattr(self, '_tool_end_token', None)\
                else func_name, {}
        args = (items[1].split(self._tool_end_token)[0].strip() if getattr(self, '_tool_end_token', None)
                else items[1].strip())
        return func_name, self._loads_str(args) if isinstance(args, str) else args

    def _parse_arguments_without_args_token(self, output: str) -> tuple[str, dict]:
        items = output.split(self._tool_end_token)[0] if getattr(self, '_tool_end_token', None) else output
        func_name = ''
        args = {}
        try:
            items = json.loads(items.strip())
            func_name = items.get('name', '')
            args = items.get('parameters', items.get('arguments', {}))
        except Exception:
            LOG.error(f'tool calls info {items} parse error')

        return func_name, self._loads_str(args) if isinstance(args, str) else args

    def _parse_arguments_with_tools(self, output: Dict[str, Any], tools: List[str]) -> Tuple[bool, Dict[str, Any]]:
        func_name = ''
        args = {}
        is_tc = False
        tc = {}
        if output.get('name', '') in tools:
            is_tc = True
            func_name = output.get('name', '')
            args = output.get('parameters', output.get('arguments', {}))
            tc = {'name': func_name, 'arguments': self._loads_str(args) if isinstance(args, str) else args}
            return is_tc, tc
        return is_tc, tc

    def _parse_tool_start_token(self, output: str) -> tuple[str, List[Dict]]:
        tool_calls = []
        segs = output.split(self._tool_start_token)
        content = segs[0]
        for seg in segs[1:]:
            func_name, arguments = self._parse_arguments_with_args_token(seg.strip())\
                if getattr(self, '_tool_args_token', None)\
                else self._parse_arguments_without_args_token(seg.strip())
            if func_name:
                tool_calls.append({'name': func_name, 'arguments': arguments})

        return content, tool_calls

    def _parse_tools(self, output: str) -> tuple[str, List[Dict]]:
        tool_calls = []
        tools = {tool['function']['name'] for tool in self._tools}
        lines = output.strip().split('\n')
        content = []
        is_tool_call = False
        for idx, line in enumerate(lines):
            if line.startswith('{') and idx > 0:
                func_name = lines[idx - 1].strip()
                if func_name in tools:
                    is_tool_call = True
                    if func_name == content[-1].strip():
                        content.pop()
                    arguments = '\n'.join(lines[idx:]).strip()
                    tool_calls.append({'name': func_name, 'arguments': arguments})
                    continue
            if '{' in line and 'name' in line:
                try:
                    items = json.loads(line.strip())
                    items = [items] if isinstance(items, dict) else items
                    if isinstance(items, list):
                        for item in items:
                            is_tool_call, tc = self._parse_arguments_with_tools(item, tools)
                            if is_tool_call:
                                tool_calls.append(tc)
                except Exception:
                    LOG.error(f'tool calls info {line} parse error')
            if not is_tool_call:
                content.append(line)
        content = '\n'.join(content) if len(content) > 0 else ''
        return content, tool_calls

    def _extract_tool_calls(self, output: str) -> tuple[str, List[Dict]]:
        tool_calls = []
        content = ''
        if getattr(self, '_tool_start_token', None) and self._tool_start_token in output:
            content, tool_calls = self._parse_tool_start_token(output)
        elif self._tools:
            content, tool_calls = self._parse_tools(output)
        else:
            content = output

        return content, tool_calls

    def _decode_base64_to_file(self, content: str) -> str:
        decontent = decode_query_with_filepaths(content)
        files = [_base64_to_file(file_content) if _is_base64_with_mime(file_content) else file_content
                 for file_content in decontent['files']]
        return encode_query_with_filepaths(query=decontent['query'], files=files)

    def _build_response(self, content: str, tool_calls: List[Dict[str, str]]) -> str:
        tc = [{'id': str(uuid.uuid4().hex), 'type': 'function', 'function': tool_call} for tool_call in tool_calls]
        if content and tc:
            return globals['tool_delimiter'].join([content, json.dumps(tc, ensure_ascii=False)])
        elif not content and tc:
            return globals['tool_delimiter'] + json.dumps(tc, ensure_ascii=False)
        else:
            return content

    def _extract_and_format(self, output: str) -> str:
        """
        1.extract tool calls information;
            a. If 'tool_start_token' exists, the boundary of tool_calls can be found according to 'tool_start_token',
               and then the function name and arguments of tool_calls can be extracted according to 'tool_args_token'
               and 'tool_end_token'.
            b. If 'tool_start_token' does not exist, the text is segmented using '\n' according to the incoming tools
               information, and then processed according to the rules.
        """
        content, tool_calls = self._extract_tool_calls(output)
        if isinstance(content, str) and content.startswith(LAZYLLM_QUERY_PREFIX):
            content = self._decode_base64_to_file(content)
        return self._build_response(content, tool_calls)

    def __repr__(self):
        return lazyllm.make_repr('Module', 'Trainable', mode=self._impl._mode, basemodel=self.base_model,
                                 target=self.target_path, name=self._module_name, deploy_type=self._deploy_type,
                                 stream=bool(self._stream), return_trace=self._return_trace)

    def __getattr__(self, key):
        if key in self.__class__.builder_keys:
            return functools.partial(getattr(self._impl, key), _return_value=self)
        raise AttributeError(f'{__class__} object has no attribute {key}')

    def _record_usage(self, text_input_for_token_usage: str, temp_output: str):
        usage = {'prompt_tokens': self._estimate_token_usage(text_input_for_token_usage)}
        usage['completion_tokens'] = self._estimate_token_usage(temp_output)
        self._record_usage_impl(usage)

    def _record_usage_impl(self, usage: dict):
        globals['usage'][self._module_id] = usage
        par_muduleid = self._used_by_moduleid
        if par_muduleid is None:
            return
        if par_muduleid not in globals['usage']:
            globals['usage'][par_muduleid] = usage
            return
        existing_usage = globals['usage'][par_muduleid]
        if existing_usage['prompt_tokens'] == -1 or usage['prompt_tokens'] == -1:
            globals['usage'][par_muduleid] = {'prompt_tokens': -1, 'completion_tokens': -1}
        else:
            for k in globals['usage'][par_muduleid]:
                globals['usage'][par_muduleid][k] += usage[k]

    def forward(self, __input: Union[Tuple[Union[str, Dict], str], str, Dict] = package(),  # noqa B008
                *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False, **kw):
        """Supports handling various input formats, automatically builds the input structure required by the model, and adapts to multimodal scenarios.


Examples:
    >>> import lazyllm
    >>> from lazyllm.module import TrainableModule
    >>> class MyModule(TrainableModule):
    ...     def forward(self, __input, **kw):
    ...         return f"processed: {__input}"
    ...
    >>> MyModule()("Hello")
    'processed: Hello'
    """
        if self._url.endswith('/v1/'):
            return self.forward_openai(__input, llm_chat_history=llm_chat_history, lazyllm_files=lazyllm_files,
                                       tools=tools, stream_output=stream_output, **kw)
        else:
            return self.forward_standard(__input, llm_chat_history=llm_chat_history, lazyllm_files=lazyllm_files,
                                         tools=tools, stream_output=stream_output, **kw)

    def forward_openai(self, __input: Union[Tuple[Union[str, Dict], str], str, Dict] = package(),  # noqa B008
                       *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False, **kw):
        """Perform forward inference using OpenAI compatible interface.

Call deployed model service through OpenAI standard API format, supports chat history, file processing, tool calling and streaming output.

Args:
    __input (Union[Tuple[Union[str, Dict], str], str, Dict]): Input data, can be text, dictionary or packaged data
    llm_chat_history: Chat history records
    lazyllm_files: File data
    tools: Tool calling configuration
    stream_output (bool): Whether to stream output
    **kw: Other keyword arguments

Returns:
    Model inference result
"""
        if not getattr(self, '_openai_module', None):
            model_type = self.type.lower()
            if model_type in ['llm', 'vlm']:
                self._openai_module = lazyllm.OnlineChatModule(
                    source='openai', model='lazyllm', base_url=self._url, skip_auth=True, type=model_type,
                    stream=self._stream).share(prompt=self._prompt, format=self._formatter)
                # Keep the system prompt on one line so no indentation leaks into the string
                self._openai_module._prompt._set_model_configs(
                    system='You are LazyLLM, a large language model developed by SenseTime.')
            elif model_type in ['embed', 'rerank']:
                self._openai_module = lazyllm.OnlineEmbeddingModule(
                    source='openai', embed_model_name='lazyllm', embed_url=self._url, type=model_type)
            else:
                raise ValueError(f'Unsupported type: {model_type} for openai compatible module')
            self._openai_module.used_by(self._module_id)
        return self._openai_module.forward(__input, llm_chat_history=llm_chat_history, lazyllm_files=lazyllm_files,
                                           tools=tools, stream_output=stream_output, **kw)

    def forward_standard(self, __input: Union[Tuple[Union[str, Dict], str], str, Dict] = package(),  # noqa B008
                         *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False, **kw):
        """Perform forward inference using standard interface.

Call deployed model service through custom standard API format, supports template messages, file encoding and streaming output.

Args:
    __input (Union[Tuple[Union[str, Dict], str], str, Dict]): Input data, can be text, dictionary or packaged data
    llm_chat_history: Chat history records
    lazyllm_files: File data
    tools: Tool calling configuration
    stream_output (bool): Whether to stream output
    **kw: Other keyword arguments

Returns:
    Model inference result
"""
        __input, files = self._get_files(__input, lazyllm_files)
        text_input_for_token_usage = __input = self._prompt.generate_prompt(__input, llm_chat_history, tools)
        url = self._url

        if self.template_message:
            data = self._modify_parameters(copy.deepcopy(self.template_message), kw, optional_keys='modality')
            data[self.keys_name_handle.get('inputs', 'inputs')] = __input
            if files and (keys := list(set(self.keys_name_handle).intersection(LazyLLMDeployBase.encoder_map.keys()))):
                assert len(keys) == 1, 'Only one key is supported for encoder_mapping'
                data[self.keys_name_handle[keys[0]]] = encode_files(files, LazyLLMDeployBase.encoder_map[keys[0]])

            if stream_output:
                if self.stream_url_suffix and not url.endswith(self.stream_url_suffix):
                    url += self.stream_url_suffix
                if 'stream' in data: data['stream'] = stream_output
        else:
            data = __input
            if stream_output: LOG.warning('stream_output is not supported when template_message is not set, ignore it')
            assert not kw, 'kw is not supported when template_message is not set'

        with self.stream_output((stream_output := (stream_output or self._stream))):
            return self._forward_impl(data, stream_output=stream_output, url=url, text_input=text_input_for_token_usage)

    def _maybe_has_fc(self, token: str, chunk: str) -> bool:
        return token and (token.startswith(chunk if token.startswith('\n') else chunk.lstrip('\n')) or token in chunk)

    def _forward_impl(self, data: Union[Tuple[Union[str, Dict], str], str, Dict] = package(), *,  # noqa B008
                      url: str, stream_output: Optional[Union[bool, Dict]] = None, text_input: Optional[str] = None):
        headers = self.template_headers or {'Content-Type': 'application/json'}
        parse_parameters = self.stream_parse_parameters if stream_output else {'delimiter': b'<|lazyllm_delimiter|>'}

        # context bug with httpx, so we use requests
        with requests.post(url, json=data, stream=True, headers=headers, proxies={'http': None, 'https': None}) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            messages, cache = '', ''
            token = getattr(self, '_tool_start_token', '')
            color = stream_output.get('color') if isinstance(stream_output, dict) else None

            for line in r.iter_lines(**parse_parameters):
                if not line: continue
                line = self._decode_line(line)

                chunk = self._prompt.get_response(self.extract_result_func(line, data))
                chunk = chunk[len(messages):] if isinstance(chunk, str) and chunk.startswith(messages) else chunk
                messages = chunk if not isinstance(chunk, str) else messages + chunk

                if not stream_output: continue
                if not cache: cache = chunk if self._maybe_has_fc(token, chunk) else self._stream_output(chunk, color)
                elif token in cache:
                    stream_output = False
                    if not cache.startswith(token): self._stream_output(cache.split(token)[0], color)
                else:
                    cache += chunk
                    if not self._maybe_has_fc(token, cache): cache = self._stream_output(cache, color)

            temp_output = self._extract_and_format(messages)
            if text_input: self._record_usage(text_input, temp_output)
            return self._formatter(temp_output)

    def _modify_parameters(self, paras: dict, kw: dict, *, optional_keys: Union[List[str], str] = None):
        for key, value in paras.items():
            if key == self.keys_name_handle['inputs']: continue
            elif isinstance(value, dict):
                if key in kw:
                    assert set(kw[key].keys()).issubset(set(value.keys()))
                    value.update(kw.pop(key))
                else:
                    # value is a dict, so override its entries by key rather than via setattr
                    for k in value.keys():
                        if k in kw: value[k] = kw.pop(k)
            elif key in kw: paras[key] = kw.pop(key)

        optional_keys = [optional_keys] if isinstance(optional_keys, str) else (optional_keys or [])
        assert set(kw.keys()).issubset(set(optional_keys)), f'{kw.keys()} is not in {optional_keys}'
        paras.update(kw)
        return paras

    def set_default_parameters(self, *, optional_keys: Optional[List[str]] = None, **kw):
        """set_default_parameters(*, optional_keys: Optional[List[str]] = None, **kw) -> None

Sets the default parameters to be used during inference or evaluation.

Args:
    optional_keys (Optional[List[str]]): Extra keys accepted in ``**kw`` even when they are absent from the parameter template. Defaults to None (no extra keys).
    **kw: Key-value pairs for default parameters such as temperature, top_k, etc.
"""
        self._modify_parameters(self.template_message, kw, optional_keys=optional_keys or [])

    def _cache_miss_handler(self):
        if not self._url or self._url == fake_url:
            raise RuntimeError('Cache miss, please use `start()` to deploy the module first')

    def __getstate__(self):
        state = self.__dict__.copy()
        state['base_model'] = self._impl._base_model
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self._impl._base_model = state['base_model']
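
The merging rules used by `_modify_parameters` and `set_default_parameters` above can be tried in isolation. The following is a minimal standalone sketch, not LazyLLM's implementation: the function name `merge_parameters`, the `inputs_key` argument, and the sample template are assumptions made for illustration. Top-level keys are overridden directly, keys found inside a nested dict are updated in place, and any leftover key must be listed in `optional_keys`.

```python
import copy
from typing import Any, Dict, List, Optional


def merge_parameters(template: Dict[str, Any], overrides: Dict[str, Any],
                     optional_keys: Optional[List[str]] = None,
                     inputs_key: str = 'inputs') -> Dict[str, Any]:
    """Merge user overrides into a copy of a request template (illustrative only)."""
    params = copy.deepcopy(template)   # never mutate the caller's template
    overrides = dict(overrides)        # consumed with pop() below
    for key, value in params.items():
        if key == inputs_key:
            continue                   # the input slot is filled later, not by overrides
        if isinstance(value, dict):
            if key in overrides:
                nested = overrides.pop(key)
                unknown = set(nested) - set(value)
                if unknown:
                    raise KeyError(f'unknown nested keys: {sorted(unknown)}')
                value.update(nested)
            else:
                # Flat overrides may also target keys of a nested dict.
                for k in list(value):
                    if k in overrides:
                        value[k] = overrides.pop(k)
        elif key in overrides:
            params[key] = overrides.pop(key)
    leftover = set(overrides) - set(optional_keys or [])
    if leftover:
        raise KeyError(f'{sorted(leftover)} not allowed; optional keys: {optional_keys}')
    params.update(overrides)           # only whitelisted extras remain
    return params


# Hypothetical template, loosely shaped like a deploy framework's request body.
template = {'inputs': '', 'stream': False,
            'parameters': {'temperature': 0.1, 'top_k': 3}}
merged = merge_parameters(template, {'temperature': 0.7, 'modality': 'text'},
                          optional_keys=['modality'])
print(merged)
```

Here `temperature` is found inside the nested `parameters` dict, while `modality` is accepted only because it is whitelisted.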

wait()

Wait for the model deployment task to complete. This method blocks the current thread until the deployment is finished.

Examples:

>>> import lazyllm
>>> class Mywait(lazyllm.module.llms.TrainableModule):
...    def forward(self):
...        self.wait()
Source code in lazyllm/module/llms/trainablemodule.py
    def wait(self):
        """Wait for the model deployment task to complete. This method blocks the current thread until the deployment is finished.


Examples:
    >>> import lazyllm
    >>> class Mywait(lazyllm.module.llms.TrainableModule):
    ...    def forward(self):
    ...        self.wait()
    """
        if launcher := self._impl._launchers['default'].get('deploy'):
            launcher.wait()

stop(task_name=None)

Stop a specific task of the model and clean up its launcher.

Parameters:

  • task_name (str, default: None ) –

    The name of the task to stop. Defaults to None, which stops the default 'deploy' task.

Examples:

>>> import lazyllm
>>> class Mystop(lazyllm.module.llms.TrainableModule):
...    def forward(self, task):
...        self.stop(task)
Source code in lazyllm/module/llms/trainablemodule.py
    def stop(self, task_name: Optional[str] = None):
        """Stop a specific task of the model and clean up its launcher.

Args:
    task_name (str): The name of the task to stop. Defaults to None, which stops the default 'deploy' task.


Examples:
    >>> import lazyllm
    >>> class Mystop(lazyllm.module.llms.TrainableModule):
    ...    def forward(self, task):
    ...        self.stop(task)
    """
        try:
            launcher = self._impl._launchers['manual' if task_name else 'default'][task_name or 'deploy']
        except KeyError:
            raise RuntimeError('Cannot stop an unstarted task')
        if not task_name: self._impl._get_deploy_tasks.flag.reset()
        launcher.cleanup()
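
The launcher lookup shared by `stop` and `log_path` can be sketched without a running deployment. In the sketch below, `FakeLauncher` and `find_launcher` are hypothetical names introduced for illustration; only the `'manual'`/`'default'` grouping and the `'deploy'` fallback mirror the source above.

```python
from typing import Dict, Optional


class FakeLauncher:
    """Hypothetical stand-in for a launcher; it only records cleanup calls."""

    def __init__(self, name: str):
        self.name = name
        self.cleaned = False

    def cleanup(self) -> None:
        self.cleaned = True


def find_launcher(launchers: Dict[str, Dict[str, FakeLauncher]],
                  task_name: Optional[str] = None) -> FakeLauncher:
    # Named tasks live under 'manual'; the unnamed case means the default 'deploy' task.
    group = 'manual' if task_name else 'default'
    try:
        return launchers[group][task_name or 'deploy']
    except KeyError:
        raise RuntimeError('Cannot stop an unstarted task') from None


launchers = {'default': {'deploy': FakeLauncher('deploy')},
             'manual': {'finetune-1': FakeLauncher('finetune-1')}}
find_launcher(launchers).cleanup()                 # stops the default deploy task
print(find_launcher(launchers, 'finetune-1').name)
```

Asking for a task that was never started raises `RuntimeError` rather than a bare `KeyError`.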

prompt(prompt='', history=None)

Processes the input prompt and generates a format compatible with the model.

Parameters:

  • prompt (str, default: '' ) –

    The input prompt. Defaults to an empty string.

  • history (List, default: None ) –

    Conversation history.

Examples:

>>> import lazyllm
>>> class Myprompt(lazyllm.module.llms.TrainableModule):
...    def forward(self, prompt, history):
...        self.prompt(prompt,history)
Source code in lazyllm/module/llms/trainablemodule.py
    def prompt(self, prompt: Union[str, dict] = '', history: Optional[List[List[str]]] = None):
        """Processes the input prompt and generates a format compatible with the model.

Args:
    prompt (str): The input prompt. Defaults to an empty string.
    history (List): Conversation history.


Examples:
    >>> import lazyllm
    >>> class Myprompt(lazyllm.module.llms.TrainableModule):
    ...    def forward(self, prompt, history):
    ...        self.prompt(prompt, history)
    """
        if self.base_model != '' and prompt == '' and self.type != 'LLM':
            prompt = None
        clear_system = isinstance(prompt, dict) and prompt.get('drop_builtin_system')
        prompter = super(__class__, self).prompt(prompt, history)._prompt
        self._tools = getattr(prompter, '_tools', None)
        keys = ModelManager.get_model_prompt_keys(self.base_model).copy()
        if keys:
            if clear_system: keys['system'] = ''
            prompter._set_model_configs(**keys)
            for key in ['tool_start_token', 'tool_args_token', 'tool_end_token']:
                if key in keys: setattr(self, f'_{key}', keys[key])
        return self
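
The way `prompt` adjusts model-specific prompt keys can be illustrated on plain dictionaries. This is a sketch under assumptions: `resolve_prompt_keys` and the sample key values are hypothetical, and only the `drop_builtin_system` flag and the three tool-token key names come from the source above.

```python
from typing import Dict, Tuple, Union

TOOL_TOKEN_KEYS = ('tool_start_token', 'tool_args_token', 'tool_end_token')


def resolve_prompt_keys(model_keys: Dict[str, str],
                        prompt: Union[str, dict, None]) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Return (prompt keys to apply, tool tokens to remember); illustrative only."""
    keys = dict(model_keys)  # work on a copy, as the source does with .copy()
    clear_system = isinstance(prompt, dict) and bool(prompt.get('drop_builtin_system'))
    if keys and clear_system:
        keys['system'] = ''  # drop the model's built-in system prompt
    tool_tokens = {k: keys[k] for k in TOOL_TOKEN_KEYS if k in keys}
    return keys, tool_tokens


# Hypothetical per-model configuration.
model_keys = {'system': 'You are a helpful assistant.', 'tool_start_token': '<tool>'}
keys, tokens = resolve_prompt_keys(model_keys, {'drop_builtin_system': True})
print(keys['system'] == '', tokens)
```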

log_path(task_name=None)

Get task log path.

Gets the log file path for a task by name. Works for both the default deployment task and manually specified tasks.

Parameters:

  • task_name (Optional[str], default: None ) –

    Task name, defaults to None (get default deployment task log)

Returns:

  • str

    Log file path

Source code in lazyllm/module/llms/trainablemodule.py
    def log_path(self, task_name: Optional[str] = None):
        """Get task log path.

Gets the log file path for a task by name. Works for both the default deployment task and manually specified tasks.

Args:
    task_name (Optional[str]): Task name, defaults to None (get default deployment task log)

Returns:
    str: Log file path
"""
        launcher = self._impl._launchers['manual' if task_name else 'default'][task_name or 'deploy']
        return launcher.log_path

forward_openai(__input=package(), *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False, **kw)

Perform forward inference using OpenAI compatible interface.

Calls the deployed model service through the OpenAI-compatible API format. Supports chat history, file processing, tool calling, and streaming output.

Parameters:

  • __input (Union[Tuple[Union[str, Dict], str], str, Dict], default: package() ) –

    Input data, can be text, dictionary or packaged data

  • llm_chat_history

    Chat history records

  • lazyllm_files

    File data

  • tools

    Tool calling configuration

  • stream_output (bool, default: False ) –

    Whether to stream output

  • **kw

    Other keyword arguments

Returns:

  • Model inference result

Source code in lazyllm/module/llms/trainablemodule.py
    def forward_openai(self, __input: Union[Tuple[Union[str, Dict], str], str, Dict] = package(),  # noqa B008
                       *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False, **kw):
        """Perform forward inference using OpenAI compatible interface.

Calls the deployed model service through the OpenAI-compatible API format. Supports chat history, file processing, tool calling, and streaming output.

Args:
    __input (Union[Tuple[Union[str, Dict], str], str, Dict]): Input data, can be text, dictionary or packaged data
    llm_chat_history: Chat history records
    lazyllm_files: File data
    tools: Tool calling configuration
    stream_output (bool): Whether to stream output
    **kw: Other keyword arguments

Returns:
    Model inference result
"""
        if not getattr(self, '_openai_module', None):
            model_type = self.type.lower()
            if model_type in ['llm', 'vlm']:
                self._openai_module = lazyllm.OnlineChatModule(
                    source='openai', model='lazyllm', base_url=self._url, skip_auth=True, type=model_type,
                    stream=self._stream).share(prompt=self._prompt, format=self._formatter)
                # Build the system prompt on one line so no continuation whitespace leaks into it.
                self._openai_module._prompt._set_model_configs(
                    system='You are LazyLLM, a large language model developed by SenseTime.')
            elif model_type in ['embed', 'rerank']:
                self._openai_module = lazyllm.OnlineEmbeddingModule(
                    source='openai', embed_model_name='lazyllm', embed_url=self._url, type=model_type)
            else:
                raise ValueError(f'Unsupported type: {model_type} for openai compatible module')
            self._openai_module.used_by(self._module_id)
        return self._openai_module.forward(__input, llm_chat_history=llm_chat_history, lazyllm_files=lazyllm_files,
                                           tools=tools, stream_output=stream_output, **kw)

forward_standard(__input=package(), *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False, **kw)

Perform forward inference using standard interface.

Calls the deployed model service through LazyLLM's own standard request format. Supports template messages, file encoding, and streaming output.

Parameters:

  • __input (Union[Tuple[Union[str, Dict], str], str, Dict], default: package() ) –

    Input data, can be text, dictionary or packaged data

  • llm_chat_history

    Chat history records

  • lazyllm_files

    File data

  • tools

    Tool calling configuration

  • stream_output (bool, default: False ) –

    Whether to stream output

  • **kw

    Other keyword arguments

Returns:

  • Model inference result

Source code in lazyllm/module/llms/trainablemodule.py
    def forward_standard(self, __input: Union[Tuple[Union[str, Dict], str], str, Dict] = package(),  # noqa B008
                         *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False, **kw):
        """Perform forward inference using standard interface.

Calls the deployed model service through LazyLLM's own standard request format. Supports template messages, file encoding, and streaming output.

Args:
    __input (Union[Tuple[Union[str, Dict], str], str, Dict]): Input data, can be text, dictionary or packaged data
    llm_chat_history: Chat history records
    lazyllm_files: File data
    tools: Tool calling configuration
    stream_output (bool): Whether to stream output
    **kw: Other keyword arguments

Returns:
    Model inference result
"""
        __input, files = self._get_files(__input, lazyllm_files)
        text_input_for_token_usage = __input = self._prompt.generate_prompt(__input, llm_chat_history, tools)
        url = self._url

        if self.template_message:
            data = self._modify_parameters(copy.deepcopy(self.template_message), kw, optional_keys='modality')
            data[self.keys_name_handle.get('inputs', 'inputs')] = __input
            if files and (keys := list(set(self.keys_name_handle).intersection(LazyLLMDeployBase.encoder_map.keys()))):
                assert len(keys) == 1, 'Only one key is supported for encoder_mapping'
                data[self.keys_name_handle[keys[0]]] = encode_files(files, LazyLLMDeployBase.encoder_map[keys[0]])

            if stream_output:
                if self.stream_url_suffix and not url.endswith(self.stream_url_suffix):
                    url += self.stream_url_suffix
                if 'stream' in data: data['stream'] = stream_output
        else:
            data = __input
            if stream_output: LOG.warning('stream_output is not supported when template_message is not set, ignore it')
            assert not kw, 'kw is not supported when template_message is not set'

        with self.stream_output((stream_output := (stream_output or self._stream))):
            return self._forward_impl(data, stream_output=stream_output, url=url, text_input=text_input_for_token_usage)
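
The request assembly performed by `forward_standard` when a template message is set can be reduced to a few lines. The sketch below is a simplified stand-in, not the library code: `build_request`, the URL, and the `_stream` suffix are assumptions chosen for illustration, and file encoding and parameter overrides are left out.

```python
import copy
from typing import Any, Dict, Tuple


def build_request(template: Dict[str, Any], inputs_key: str, prompt: str, url: str,
                  stream_output: bool = False,
                  stream_url_suffix: str = '') -> Tuple[str, Dict[str, Any]]:
    """Fill a request template and pick the endpoint (illustrative only)."""
    data = copy.deepcopy(template)      # keep the shared template untouched
    data[inputs_key] = prompt           # the generated prompt goes into the input slot
    if stream_output:
        # Append the streaming suffix once, and flip the flag only if the template has one.
        if stream_url_suffix and not url.endswith(stream_url_suffix):
            url += stream_url_suffix
        if 'stream' in data:
            data['stream'] = True
    return url, data


# Hypothetical template, URL, and suffix.
template = {'inputs': '', 'stream': False, 'parameters': {'temperature': 0.1}}
url, data = build_request(template, 'inputs', 'Hello', 'http://127.0.0.1:8000/generate',
                          stream_output=True, stream_url_suffix='_stream')
print(url, data)
```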

forward(__input=package(), *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False, **kw)

Supports handling various input formats, automatically builds the input structure required by the model, and adapts to multimodal scenarios.

Examples:

>>> import lazyllm
>>> from lazyllm.module import TrainableModule
>>> class MyModule(TrainableModule):
...     def forward(self, __input, **kw):
...         return f"processed: {__input}"
...
>>> MyModule()("Hello")
'processed: Hello'
Source code in lazyllm/module/llms/trainablemodule.py
    def forward(self, __input: Union[Tuple[Union[str, Dict], str], str, Dict] = package(),  # noqa B008
                *, llm_chat_history=None, lazyllm_files=None, tools=None, stream_output=False, **kw):
        """Supports handling various input formats, automatically builds the input structure required by the model, and adapts to multimodal scenarios.


Examples:
    >>> import lazyllm
    >>> from lazyllm.module import TrainableModule
    >>> class MyModule(TrainableModule):
    ...     def forward(self, __input, **kw):
    ...         return f"processed: {__input}"
    ...
    >>> MyModule()("Hello")
    'processed: Hello'
    """
        if self._url.endswith('/v1/'):
            return self.forward_openai(__input, llm_chat_history=llm_chat_history, lazyllm_files=lazyllm_files,
                                       tools=tools, stream_output=stream_output, **kw)
        else:
            return self.forward_standard(__input, llm_chat_history=llm_chat_history, lazyllm_files=lazyllm_files,
                                         tools=tools, stream_output=stream_output, **kw)
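
As the source above shows, `forward` chooses between the two interfaces purely from the URL suffix. A minimal sketch of that check follows; the function name `choose_interface` and the sample URLs are illustrative assumptions.

```python
def choose_interface(url: str) -> str:
    """Return which forward path a URL would take (illustrative only)."""
    # Only an exact trailing '/v1/' selects the OpenAI-compatible path.
    return 'openai' if url.endswith('/v1/') else 'standard'


for candidate in ('http://127.0.0.1:8000/v1/',
                  'http://127.0.0.1:8000/v1',
                  'http://127.0.0.1:8000/generate'):
    print(candidate, '->', choose_interface(candidate))
```

Note that a URL ending in `/v1` without the trailing slash falls through to the standard path under this check.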

lazyllm.module.UrlModule

Bases: ModuleBase, LLMBase, _UrlHelper

The URL obtained by deploying a ServerModule can be wrapped into a Module. Calling the module (via __call__) sends the request to that service.

Parameters:

  • url (str, default: '' ) –

    The URL of the service to be wrapped, defaults to empty string.

  • stream (bool | Dict[str, str], default: False ) –

    Whether to request and output in streaming mode, default is non-streaming.

  • return_trace (bool, default: False ) –

    Whether to record the results in trace, default is False.

  • init_prompt (bool, default: True ) –

    Whether to initialize prompt, defaults to True.

Examples:

>>> import lazyllm
>>> def demo(input): return input * 2
... 
>>> s = lazyllm.ServerModule(demo, launcher=lazyllm.launchers.empty(sync=False))
>>> s.start()
INFO:     Uvicorn running on http://0.0.0.0:35485
>>> u = lazyllm.UrlModule(url=s._url)
>>> print(u(1))
2
Source code in lazyllm/module/servermodule.py
class UrlModule(ModuleBase, LLMBase, _UrlHelper):
    """The URL obtained by deploying a ServerModule can be wrapped into a Module. Calling the module (via ``__call__``) sends the request to that service.

Args:
    url (str): The URL of the service to be wrapped, defaults to empty string.
    stream (bool|Dict[str, str]): Whether to request and output in streaming mode, default is non-streaming.
    return_trace (bool): Whether to record the results in trace, default is False.
    init_prompt (bool): Whether to initialize prompt, defaults to True.


Examples:
    >>> import lazyllm
    >>> def demo(input): return input * 2
    ... 
    >>> s = lazyllm.ServerModule(demo, launcher=lazyllm.launchers.empty(sync=False))
    >>> s.start()
    INFO:     Uvicorn running on http://0.0.0.0:35485
    >>> u = lazyllm.UrlModule(url=s._url)
    >>> print(u(1))
    2
    """

    def __new__(cls, *args, **kw):
        if cls is not UrlModule:
            return super().__new__(cls)
        return ServerModule(*args, **kw)

    def __init__(self, *, url: Optional[str] = '', stream: Union[bool, Dict[str, str]] = False,
                 return_trace: bool = False, init_prompt: bool = True):
        super().__init__(return_trace=return_trace)
        LLMBase.__init__(self, stream=stream, init_prompt=init_prompt)
        _UrlHelper.__init__(self, url)

    def _estimate_token_usage(self, text):
        if not isinstance(text, str):
            return 0
        # extract english words, number and comma
        pattern = r'\b[a-zA-Z0-9]+\b|,'
        ascii_words = re.findall(pattern, text)
        ascii_ch_count = sum(len(ele) for ele in ascii_words)
        non_ascii_pattern = r'[^\x00-\x7F]'
        non_ascii_chars = re.findall(non_ascii_pattern, text)
        non_ascii_char_count = len(non_ascii_chars)
        return int(ascii_ch_count / 3.0 + non_ascii_char_count + 1)

    def _decode_line(self, line: bytes):
        try:
            return pickle.loads(codecs.decode(line, 'base64'))
        except Exception:
            return line.decode('utf-8')

    def _extract_and_format(self, output: str) -> str:
        return output

    def forward(self, *args, **kw):
        """Defines the computation steps to be executed each time. All subclasses of ModuleBase need to override this function.


Examples:
    >>> import lazyllm
    >>> class MyModule(lazyllm.module.ModuleBase):
    ...    def forward(self, input):
    ...        return input + 1
    ...
    >>> MyModule()(1)
    2
    """
        raise NotImplementedError

    def __call__(self, *args, **kw):
        assert self._url is not None, f'Please start {self.__class__} first'
        if len(args) > 1:
            return super(__class__, self).__call__(package(args), **kw)
        return super(__class__, self).__call__(*args, **kw)

    def __repr__(self):
        return lazyllm.make_repr('Module', 'Url', name=self._module_name, url=self._url,
                                 stream=self._stream, return_trace=self._return_trace)
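
The heuristic in `_estimate_token_usage` is easy to check by hand. The standalone copy below uses the same two regular expressions as the source above; the function name `estimate_token_usage` and the sample strings are the only additions.

```python
import re


def estimate_token_usage(text) -> int:
    """Rough token estimate: ASCII characters / 3, plus one per non-ASCII character, plus one."""
    if not isinstance(text, str):
        return 0
    # English words, numbers, and commas count by character length.
    ascii_words = re.findall(r'\b[a-zA-Z0-9]+\b|,', text)
    ascii_char_count = sum(len(word) for word in ascii_words)
    # Every non-ASCII character (e.g. a CJK character) counts as one token.
    non_ascii_char_count = len(re.findall(r'[^\x00-\x7F]', text))
    return int(ascii_char_count / 3.0 + non_ascii_char_count + 1)


print(estimate_token_usage('Hello, world'))  # 11 ASCII chars -> int(11 / 3 + 0 + 1) = 4
print(estimate_token_usage('你好'))           # 2 non-ASCII chars -> int(0 + 2 + 1) = 3
```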

forward(*args, **kw)

Defines the computation steps to be executed each time. All subclasses of ModuleBase need to override this function.

Examples:

>>> import lazyllm
>>> class MyModule(lazyllm.module.ModuleBase):
...    def forward(self, input):
...        return input + 1
...
>>> MyModule()(1)
2
Source code in lazyllm/module/servermodule.py
    def forward(self, *args, **kw):
        """Defines the computation steps to be executed each time. All subclasses of ModuleBase need to override this function.


Examples:
    >>> import lazyllm
    >>> class MyModule(lazyllm.module.ModuleBase):
    ...    def forward(self, input):
    ...        return input + 1
    ...
    >>> MyModule()(1)
    2
    """
        raise NotImplementedError

lazyllm.module.ServerModule

Bases: UrlModule

The ServerModule class inherits from UrlModule and provides functionality to deploy any callable object as an API service.
Built on FastAPI, it supports launching a main service with multiple satellite services, as well as preprocessing, postprocessing, and streaming capabilities.
A local callable can be deployed as a service, or an existing service can be accessed directly via a URL.

Parameters:

  • m (Optional[Union[str, ModuleBase]], default: None ) –

    The module or its name to be wrapped as a service.
    If a string is provided, it is treated as a URL and url must be None.
    If a ModuleBase is provided, it will be wrapped as a service.

  • pre (Optional[Callable], default: None ) –

    Preprocessing function executed in the service process. Default is None.

  • post (Optional[Callable], default: None ) –

    Postprocessing function executed in the service process. Default is None.

  • stream (Union[bool, Dict], default: False ) –

    Whether to enable streaming output. Can be a boolean or a dictionary with streaming configuration. Default is False.

  • return_trace (Optional[bool], default: False ) –

    Whether to return debug trace information. Default is False.

  • port (Optional[int], default: None ) –

    Port to deploy the service. If None, a random port will be assigned.

  • pythonpath (Optional[str], default: None ) –

    PYTHONPATH environment variable passed to the subprocess. Defaults to None.

  • launcher (Optional[LazyLLMLaunchersBase], default: None ) –

    The launcher used to deploy the service. Defaults to asynchronous remote deployment.

  • url (Optional[str], default: None ) –

    URL of an already deployed service. If provided, m must be None.
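
The mutual exclusion between `m` and `url` described above can be sketched as a small normalization step. This is an illustration under assumptions, not the constructor itself: `normalize_target` is a hypothetical helper, it raises `ValueError` where the source uses assertions, and URL validation is omitted.

```python
from typing import Callable, Optional, Tuple, Union


def normalize_target(m: Union[str, Callable, None] = None,
                     url: Optional[str] = None) -> Tuple[Optional[Callable], Optional[str]]:
    """Return (callable to deploy, url to connect to); at most one is set."""
    if isinstance(m, str):
        # A string in the first position is treated as the URL of an existing service.
        if url is not None:
            raise ValueError('url should be None when m is a url')
        url, m = m, None
    if url and m is not None:
        raise ValueError('m should be None when url is provided')
    return m, url


def demo(x):
    return x * 2


print(normalize_target(demo))                                   # deploy a local callable
print(normalize_target('http://127.0.0.1:8000/generate'))      # connect to an existing service
```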

Examples:

>>> import lazyllm
>>> def demo(input): return input * 2
...
>>> s = lazyllm.ServerModule(demo, launcher=lazyllm.launchers.empty(sync=False))
>>> s.start()
INFO:     Uvicorn running on http://0.0.0.0:35485
>>> print(s(1))
2
>>> class MyServe(object):
...     def __call__(self, input):
...         return 2 * input
...
...     @lazyllm.FastapiApp.post
...     def server1(self, input):
...         return f'reply for {input}'
...
...     @lazyllm.FastapiApp.get
...     def server2(self):
...         return 'get method'
...
>>> m = lazyllm.ServerModule(MyServe(), launcher=lazyllm.launchers.empty(sync=False))
>>> m.start()
INFO:     Uvicorn running on http://0.0.0.0:32028
>>> print(m(1))
2
Source code in lazyllm/module/servermodule.py
class ServerModule(UrlModule):
    """The ServerModule class inherits from UrlModule and provides functionality to deploy any callable object as an API service.  
Built on FastAPI, it supports launching a main service with multiple satellite services, as well as preprocessing, postprocessing, and streaming capabilities.  
A local callable can be deployed as a service, or an existing service can be accessed directly via a URL.

Args:
    m (Optional[Union[str, ModuleBase]]): The module or its name to be wrapped as a service.  
        If a string is provided, it is treated as a URL and `url` must be None.  
        If a ModuleBase is provided, it will be wrapped as a service.
    pre (Optional[Callable]): Preprocessing function executed in the service process. Default is ``None``.
    post (Optional[Callable]): Postprocessing function executed in the service process. Default is ``None``.
    stream (Union[bool, Dict]): Whether to enable streaming output. Can be a boolean or a dictionary with streaming configuration. Default is ``False``.
    return_trace (Optional[bool]): Whether to return debug trace information. Default is ``False``.
    port (Optional[int]): Port to deploy the service. If ``None``, a random port will be assigned.
    pythonpath (Optional[str]): PYTHONPATH environment variable passed to the subprocess. Defaults to ``None``.
    launcher (Optional[LazyLLMLaunchersBase]): The launcher used to deploy the service. Defaults to asynchronous remote deployment.
    url (Optional[str]): URL of an already deployed service. If provided, `m` must be None.


Examples:
    >>> import lazyllm
    >>> def demo(input): return input * 2
    ...
    >>> s = lazyllm.ServerModule(demo, launcher=lazyllm.launchers.empty(sync=False))
    >>> s.start()
    INFO:     Uvicorn running on http://0.0.0.0:35485
    >>> print(s(1))
    2

    >>> class MyServe(object):
    ...     def __call__(self, input):
    ...         return 2 * input
    ...
    ...     @lazyllm.FastapiApp.post
    ...     def server1(self, input):
    ...         return f'reply for {input}'
    ...
    ...     @lazyllm.FastapiApp.get
    ...     def server2(self):
    ...         return 'get method'
    ...
    >>> m = lazyllm.ServerModule(MyServe(), launcher=lazyllm.launchers.empty(sync=False))
    >>> m.start()
    INFO:     Uvicorn running on http://0.0.0.0:32028
    >>> print(m(1))
    2
    """
    def __init__(self, m: Optional[Union[str, ModuleBase]] = None, pre: Optional[Callable] = None,
                 post: Optional[Callable] = None, stream: Union[bool, Dict] = False,
                 return_trace: bool = False, port: Optional[int] = None, pythonpath: Optional[str] = None,
                 launcher: Optional[LazyLLMLaunchersBase] = None, url: Optional[str] = None,
                 num_replicas: int = 1, security_key: Optional[Union[str, bool]] = None):
        assert stream is False or return_trace is False, 'Module with stream output has no trace'
        assert (post is None) or (stream is False), 'Stream cannot be true when post-action exists'
        if isinstance(m, str):
            assert url is None, 'url should be None when m is a url'
            url, m = m, None
        if url:
            assert is_valid_url(url), f'Invalid url: {url}'
            assert m is None, 'm should be None when url is provided'
        super().__init__(url=url, stream=stream, return_trace=return_trace)
        self._security_key = f'sk-{str(uuid.uuid4().hex)}' if security_key is True else security_key
        self._impl = _ServerModuleImpl(m, pre, post, launcher, port, pythonpath, self._url_wrapper,
                                       num_replicas=num_replicas, security_key=self._security_key)
        if url: self._impl._get_deploy_tasks.flag.set()

    _url_id = property(lambda self: self._impl._module_id)

    def wait(self):
        """Wait for the current module service to finish starting or executing.  
Typically used to block the main thread until the service finishes or is interrupted.  
"""
        self._impl._launcher.wait()

    def stop(self):
        """Stop the current module service and its related subprocesses.  
After this call, the module will no longer respond to requests.  
"""
        self._impl.stop()

    @property
    def status(self):
        return self._impl._launcher.status

    def _call(self, fname, *args, **kwargs):
        args, kwargs = lazyllm.dump_obj(args), lazyllm.dump_obj(kwargs)
        url = urljoin(self._url.rsplit('/', 1)[0], '_call')
        r = requests.post(url, json=(fname, args, kwargs), headers={'Content-Type': 'application/json'})
        if r.status_code != 200:
            try:
                error_info = r.json()
            except ValueError:
                error_info = r.text
            raise requests.RequestException(f'{r.status_code}: {error_info}')
        return pickle.loads(codecs.decode(r.content, 'base64'))

    def forward(self, __input: Union[Tuple[Union[str, Dict], str], str, Dict] = package(), **kw):  # noqa B008
        headers = {
            'Content-Type': 'application/json',
            'Global-Parameters': globals.pickled_data,
            'Session-ID': globals._sid,
            'Security-Key': self._security_key,
        }
        data = obj2str((__input, kw))

        # context bug with httpx, so we use requests
        with requests.post(self._url, json=data, stream=True, headers=headers,
                           proxies={'http': None, 'https': None}) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            messages = ''
            with self.stream_output(self._stream):
                for line in r.iter_lines(delimiter=b'<|lazyllm_delimiter|>'):
                    line = self._decode_line(line)
                    if self._stream:
                        self._stream_output(str(line), getattr(self._stream, 'get', lambda x: None)('color'))
                    messages = (messages + str(line)) if self._stream else line

                temp_output = self._extract_and_format(messages)
                return self._formatter(temp_output)

    def __repr__(self):
        return lazyllm.make_repr('Module', 'Server', subs=[repr(self._impl._m)], name=self._module_name,
                                 stream=self._stream, return_trace=self._return_trace)
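
The response handling in `forward` combines two steps from the source above: splitting the body on the `<|lazyllm_delimiter|>` marker and decoding each piece with `_decode_line`. The sketch below reproduces both on an in-memory byte string instead of a live HTTP response; `decode_body` and the sample payload are assumptions made for illustration.

```python
import codecs
import pickle

DELIMITER = b'<|lazyllm_delimiter|>'


def decode_line(line: bytes):
    """Try base64 + pickle first, then fall back to plain UTF-8 text."""
    try:
        # Only unpickle data that comes from a service you trust.
        return pickle.loads(codecs.decode(line, 'base64'))
    except Exception:
        return line.decode('utf-8')


def decode_body(body: bytes):
    """Split a full response body on the delimiter and decode each non-empty chunk."""
    return [decode_line(chunk) for chunk in body.split(DELIMITER) if chunk]


# Hypothetical body: one pickled object followed by one plain-text chunk.
encoded = codecs.encode(pickle.dumps({'answer': 2}), 'base64')
body = encoded + DELIMITER + b'hello' + DELIMITER
print(decode_body(body))
```

The fallback matters because a service may return either serialized Python objects or plain text.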

wait()

Wait for the current module service to finish starting or executing.
Typically used to block the main thread until the service finishes or is interrupted.

Source code in lazyllm/module/servermodule.py
    def wait(self):
        """Wait for the current module service to finish starting or executing.  
Typically used to block the main thread until the service finishes or is interrupted.  
"""
        self._impl._launcher.wait()

stop()

Stop the current module service and its related subprocesses.
After this call, the module will no longer respond to requests.

Source code in lazyllm/module/servermodule.py
    def stop(self):
        """Stop the current module service and its related subprocesses.  
After this call, the module will no longer respond to requests.  
"""
        self._impl.stop()

lazyllm.module.AutoModel

A module for deploying either online API-based models or local models, supporting both online inference and locally trainable modules.

Parameters:

  • model (str) –

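    The name of the model to load, e.g., internlm2-chat-7b. If None and neither source nor framework is given, an online chat module is tried first and internlm2-chat-7b is loaded locally as the fallback. If None and framework is given, internlm2-chat-7b is used.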

  • source (str) –

    Specifies the online model service to use. Required when using online models. Supported values include qwen, glm, openai, moonshot, etc.

  • framework (str) –

    The local inference framework to use for deployment. Supported values are lightllm, vllm, and lmdeploy. The model will be deployed via TrainableModule using the specified framework.

Source code in lazyllm/module/llms/automodel.py
class AutoModel:
    """A module for deploying either online API-based models or local models, supporting both online inference and locally trainable modules.

Args:
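    model (str): The name of the model to load, e.g., ``internlm2-chat-7b``. If None and neither ``source`` nor ``framework`` is given, an online chat module is tried first and ``internlm2-chat-7b`` is loaded locally as the fallback. If None and ``framework`` is given, ``internlm2-chat-7b`` is used.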
    source (str): Specifies the online model service to use. Required when using online models. Supported values include ``qwen``, ``glm``, ``openai``, ``moonshot``, etc.
    framework (str): The local inference framework to use for deployment. Supported values are ``lightllm``, ``vllm``, and ``lmdeploy``. The model will be deployed via ``TrainableModule`` using the specified framework.
"""
    def __new__(cls, model=None, source=None, framework=None):
        if model in OnlineChatModule.MODELS:
            assert source is None
            source = model
            model = None
        assert source is None or source in OnlineChatModule.MODELS
        assert framework is None or framework in ['lightllm', 'vllm', 'lmdeploy']

        if source:
            return OnlineChatModule(model=model, source=source)
        elif framework:
            model = model or 'internlm2-chat-7b'
            return TrainableModule(model).deploy_method(getattr(lazyllm.deploy, framework))
        elif not model:
            try:
                return OnlineChatModule()
            except KeyError as e:
                LOG.warning('`OnlineChatModule` creation failed, and will try to '
                            f'load model internlm2-chat-7b with local `TrainableModule`. Since the error: {e}')
                return TrainableModule('internlm2-chat-7b')
        else:
            return TrainableModule(model)
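
The resolution order in `AutoModel.__new__` can be traced with a pure function that returns a label instead of constructing a module. In this sketch, `resolve`, the `ONLINE_SOURCES` subset, and the `online_available` flag (standing in for whether `OnlineChatModule()` can be created) are assumptions; the branch order follows the source above.

```python
from typing import Optional, Tuple

ONLINE_SOURCES = {'qwen', 'glm', 'openai', 'moonshot'}  # assumed subset of supported sources
FRAMEWORKS = {'lightllm', 'vllm', 'lmdeploy'}
DEFAULT_MODEL = 'internlm2-chat-7b'


def resolve(model: Optional[str] = None, source: Optional[str] = None,
            framework: Optional[str] = None,
            online_available: bool = True) -> Tuple[str, Optional[str]]:
    """Return (backend label, model name) for the given arguments (illustrative only)."""
    if model in ONLINE_SOURCES:
        # A source name passed as `model` is reinterpreted as the source.
        if source is not None:
            raise ValueError('source must be None when model names an online source')
        source, model = model, None
    if source is not None and source not in ONLINE_SOURCES:
        raise ValueError(f'unknown source: {source}')
    if framework is not None and framework not in FRAMEWORKS:
        raise ValueError(f'unknown framework: {framework}')

    if source:
        return 'online', model
    if framework:
        return f'local:{framework}', model or DEFAULT_MODEL
    if not model:
        # Online first; fall back to the default local model if that fails.
        return ('online', None) if online_available else ('local', DEFAULT_MODEL)
    return 'local', model


print(resolve('qwen'))
print(resolve(framework='vllm'))
print(resolve(online_available=False))
```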

lazyllm.module.TrialModule

Bases: object

The parameter grid search module traverses all of its submodules, collects every searchable parameter, and iterates over the parameter combinations to run fine-tuning, deployment, and evaluation.

Parameters:

  • m (Callable) –

    The submodule whose parameters will be grid-searched. Fine-tuning, deployment, and evaluation will be based on this module.

Examples:

>>> import lazyllm
>>> from lazyllm import finetune, deploy
>>> m = lazyllm.TrainableModule('b1', 't').finetune_method(finetune.dummy, **dict(a=lazyllm.Option(['f1', 'f2'])))
>>> m.deploy_method(deploy.dummy).mode('finetune').prompt(None)
>>> s = lazyllm.ServerModule(m, post=lambda x, ori: f'post2({x})')
>>> s.evalset([1, 2, 3])
>>> t = lazyllm.TrialModule(s)
>>> t.update()
>>>
dummy finetune!, and init-args is {a: f1}
dummy finetune!, and init-args is {a: f2}
[["post2(reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1})", "post2(reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1})", "post2(reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1})"], ["post2(reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1})", "post2(reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1})", "post2(reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1})"]]
Source code in lazyllm/module/trialmodule.py
class TrialModule(object):
    """The parameter grid search module traverses all of its submodules, collects every searchable parameter, and iterates over the parameter combinations to run fine-tuning, deployment, and evaluation.

Args:
    m (Callable): The submodule whose parameters will be grid-searched. Fine-tuning, deployment, and evaluation will be based on this module.


Examples:
    >>> import lazyllm
    >>> from lazyllm import finetune, deploy
    >>> m = lazyllm.TrainableModule('b1', 't').finetune_method(finetune.dummy, **dict(a=lazyllm.Option(['f1', 'f2'])))
    >>> m.deploy_method(deploy.dummy).mode('finetune').prompt(None)
    >>> s = lazyllm.ServerModule(m, post=lambda x, ori: f'post2({x})')
    >>> s.evalset([1, 2, 3])
    >>> t = lazyllm.TrialModule(s)
    >>> t.update()
    >>>
    dummy finetune!, and init-args is {a: f1}
    dummy finetune!, and init-args is {a: f2}
    [["post2(reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1})", "post2(reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1})", "post2(reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1})"], ["post2(reply for 1, and parameters is {'do_sample': False, 'temperature': 0.1})", "post2(reply for 2, and parameters is {'do_sample': False, 'temperature': 0.1})", "post2(reply for 3, and parameters is {'do_sample': False, 'temperature': 0.1})"]]
    """
    def __init__(self, m):
        self.m = m

    @staticmethod
    def work(m, q):
        """Static method to deepcopy the module, perform update in a subprocess, and put the evaluation result into a queue.

Args:
    m (Callable): The module to perform update on.
    q (multiprocessing.Queue): Queue to store evaluation results.
"""
        # update option at module.update()
        m = copy.deepcopy(m)
        m.update()
        q.put(m.eval_result)

    def update(self):
        """Iterates through all configuration options of the module, updates the module in parallel using multiprocessing, and collects the evaluation results for each configuration.
"""
        options = get_options(self.m)
        q = multiprocessing.Queue()
        ps = []
        for _ in OptionIter(options, get_options):
            p = ForkProcess(target=TrialModule.work, args=(self.m, q), sync=True)
            ps.append(p)
            p.start()
            time.sleep(1)
        [p.join() for p in ps]
        result = [q.get() for p in ps]
        LOG.info(f'{result}')

update()

Iterates through all configuration options of the module, updates the module in parallel using multiprocessing, and collects the evaluation results for each configuration.

Source code in lazyllm/module/trialmodule.py
    def update(self):
        """Iterates through all configuration options of the module, updates the module in parallel using multiprocessing, and collects the evaluation results for each configuration.
"""
        options = get_options(self.m)
        q = multiprocessing.Queue()
        ps = []
        for _ in OptionIter(options, get_options):
            p = ForkProcess(target=TrialModule.work, args=(self.m, q), sync=True)
            ps.append(p)
            p.start()
            time.sleep(1)
        [p.join() for p in ps]
        result = [q.get() for p in ps]
        LOG.info(f'{result}')
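
The enumeration step of update() can be illustrated without LazyLLM. The following is a minimal sketch, assuming a hypothetical `ToyModule` whose searchable options are a plain dict of candidate lists; it evaluates one isolated deep copy per combination. It runs the trials sequentially, whereas the listing above forks one process per combination, and it does not reproduce how `OptionIter` actually walks the options.

```python
import copy
import itertools


class ToyModule:
    """Hypothetical stand-in for a module with searchable options."""

    def __init__(self, options):
        # Maps an option name to the list of candidate values to try.
        self.options = options
        self.chosen = {}
        self.eval_result = None

    def update(self):
        # Stand-in for "fine-tune, deploy, evaluate": echo the chosen values.
        self.eval_result = f"evaluated with {self.chosen}"


def grid_search(module):
    """Evaluate one deep copy of `module` per combination of option values."""
    names = list(module.options)
    results = []
    for values in itertools.product(*(module.options[n] for n in names)):
        trial = copy.deepcopy(module)  # keep trials isolated from each other
        trial.chosen = dict(zip(names, values))
        trial.update()
        results.append(trial.eval_result)
    return results


results = grid_search(ToyModule({"a": ["f1", "f2"], "lr": [0.1, 0.01]}))
print(len(results))  # 2 * 2 combinations
```

Two options with two candidates each yield four trials, which is why the number of evaluation results grows multiplicatively with every added Option.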

work(m, q) staticmethod

Static method to deepcopy the module, perform update in a subprocess, and put the evaluation result into a queue.

Parameters:

  • m (Callable) –

    The module to perform update on.

  • q (Queue) –

    Queue to store evaluation results.

Source code in lazyllm/module/trialmodule.py
    @staticmethod
    def work(m, q):
        """Static method to deepcopy the module, perform update in a subprocess, and put the evaluation result into a queue.

Args:
    m (Callable): The module to perform update on.
    q (multiprocessing.Queue): Queue to store evaluation results.
"""
        # update option at module.update()
        m = copy.deepcopy(m)
        m.update()
        q.put(m.eval_result)
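
The copy, update, enqueue pattern of work() can be tried in isolation. This is a minimal sketch, assuming a hypothetical `ToyModule` and using a thread with `queue.Queue` as a stand-in for the forked process and `multiprocessing.Queue` in the listing above.

```python
import copy
import queue
import threading


class ToyModule:
    """Hypothetical module: update() just records a result string."""

    def __init__(self, name):
        self.name = name
        self.eval_result = None

    def update(self):
        self.eval_result = f"result of {self.name}"


def work(m, q):
    # Work on a private copy so the caller's module is never mutated.
    m = copy.deepcopy(m)
    m.update()
    q.put(m.eval_result)


original = ToyModule("trial-0")
q = queue.Queue()
worker = threading.Thread(target=work, args=(original, q))
worker.start()
worker.join()
result = q.get()
print(result)
```

Because the worker only touches its copy, the only way a result reaches the caller is through the queue; `original.eval_result` stays `None`.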

lazyllm.module.OnlineChatModule

Manages and creates access modules for the large-model platforms currently on the market. Supported sources are openai, sensenova, glm, kimi, qwen, doubao, and siliconflow; deepseek is registered but not yet supported, because the platform does not currently allow account recharges. For how to obtain a platform's API key, see Getting Started.

Parameters:

  • model (str) –

    Model to access. When using Doubao, pass a Model ID or Endpoint ID (see Getting the Inference Access Point for how to obtain one) and activate the corresponding service on the Doubao platform first. Defaults to gpt-3.5-turbo (openai) / SenseChat-5 (sensenova) / glm-4 (glm) / moonshot-v1-8k (kimi) / qwen-plus (qwen) / doubao-1-5-pro-32k-250115 (doubao).

  • source (str) –

    Type of module to create. Options are openai / sensenova / glm / kimi / qwen / doubao / siliconflow / deepseek (not yet supported). If omitted, the first source with a configured api_key is used.

  • base_url (str) –

    Base URL of the platform to access. Defaults to the platform's official endpoint.

  • system_prompt (str) –

    System prompt to send with requests. Defaults to the selected platform module's built-in system prompt.

  • stream (bool) –

    Whether to request and output in streaming mode. Defaults to True.

  • return_trace (bool) –

    Whether to record the results in trace, default is False.
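
The reason an omitted base_url or model falls back to the platform default is that unset values are never forwarded to the platform module. The following standalone sketch mirrors the `_encapsulate_parameters` helper from this class's source listing; the URL, model name, and key used in the calls are placeholders, not real values.

```python
from typing import Any, Dict, Optional


def encapsulate_parameters(base_url: Optional[str], model: Optional[str],
                           stream: bool, return_trace: bool,
                           **kwargs: Any) -> Dict[str, Any]:
    """Collect constructor arguments, dropping base_url/model when unset."""
    params: Dict[str, Any] = {"stream": stream, "return_trace": return_trace}
    if base_url is not None:
        params["base_url"] = base_url  # only override the default when given
    if model is not None:
        params["model"] = model
    params.update(kwargs)  # extra keyword arguments pass through unchanged
    return params


defaults = encapsulate_parameters(None, None, True, False)
custom = encapsulate_parameters("https://example.invalid/v1/", "my-model",
                                False, False, api_key="sk-placeholder")
print(defaults)
print(custom)
```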

Examples:

>>> import lazyllm
>>> from functools import partial
>>> m = lazyllm.OnlineChatModule(source="sensenova", stream=True)
>>> query = "Hello!"
>>> with lazyllm.ThreadPoolExecutor(1) as executor:
...     future = executor.submit(partial(m, llm_chat_history=[]), query)
...     while True:
...         if value := lazyllm.FileSystemQueue().dequeue():
...             print(f"output: {''.join(value)}")
...         elif future.done():
...             break
...     print(f"ret: {future.result()}")
...
output: Hello
output: ! How can I assist you today?
ret: Hello! How can I assist you today?
>>> from lazyllm.components.formatter import encode_query_with_filepaths
>>> vlm = lazyllm.OnlineChatModule(source="sensenova", model="SenseChat-Vision")
>>> query = "what is it?"
>>> inputs = encode_query_with_filepaths(query, ["/path/to/your/image"])
>>> print(vlm(inputs))
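
The streaming example polls a queue while the request runs in a worker thread. The same pattern can be exercised offline; this is a minimal sketch that assumes a hypothetical `fake_stream` producer and uses the standard library's `queue.Queue` and `ThreadPoolExecutor` in place of `lazyllm.FileSystemQueue` and `lazyllm.ThreadPoolExecutor`.

```python
import queue
import time
from concurrent.futures import ThreadPoolExecutor


def fake_stream(chunks, out):
    """Hypothetical producer: emits chunks one by one, returns the full text."""
    pieces = []
    for chunk in chunks:
        time.sleep(0.01)  # simulate network latency between chunks
        out.put(chunk)
        pieces.append(chunk)
    return "".join(pieces)


def consume(chunks):
    out = queue.Queue()
    seen = []
    with ThreadPoolExecutor(1) as executor:
        future = executor.submit(fake_stream, chunks, out)
        while True:
            try:
                seen.append(out.get(timeout=0.05))
            except queue.Empty:
                if future.done():
                    break
        # Drain chunks that arrived between the last poll and completion.
        while not out.empty():
            seen.append(out.get_nowait())
        return seen, future.result()


seen, ret = consume(["Hello", "! How can I ", "assist you today?"])
print(seen)
print(ret)
```

The final drain matters: a chunk can be enqueued after a poll finds the queue empty but before the future reports completion, and breaking immediately would drop it.
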
Source code in lazyllm/module/llms/onlinemodule/chat.py
class OnlineChatModule(metaclass=_ChatModuleMeta):
    """Used to manage and create access modules for large model platforms currently available on the market. Currently, it supports openai, sensenova, glm, kimi, qwen, doubao and deepseek (since the platform does not allow recharges for the time being, access is not supported for the time being). For how to obtain the platform's API key, please visit [Getting Started](/#platform)

Args:
    model (str): Specify the model to access (Note that you need to use Model ID or Endpoint ID when using Doubao. For details on how to obtain it, see [Getting the Inference Access Point](https://www.volcengine.com/docs/82379/1099522). Before using the model, you must first activate the corresponding service on the Doubao platform.), default is ``gpt-3.5-turbo(openai)`` / ``SenseChat-5(sensenova)`` / ``glm-4(glm)`` / ``moonshot-v1-8k(kimi)`` / ``qwen-plus(qwen)`` / ``mistral-7b-instruct-v0.2(doubao)`` .
    source (str): Specify the type of module to create. Options include  ``openai`` /  ``sensenova`` /  ``glm`` /  ``kimi`` /  ``qwen`` / ``doubao`` / ``deepseek (not yet supported)`` .
    base_url (str): Specify the base link of the platform to be accessed. The default is the official link.
    system_prompt (str): Specify the requested system prompt. The default is the official system prompt.
    stream (bool): Whether to request and output in streaming mode, default is streaming.
    return_trace (bool): Whether to record the results in trace, default is False.      


Examples:
    >>> import lazyllm
    >>> from functools import partial
    >>> m = lazyllm.OnlineChatModule(source="sensenova", stream=True)
    >>> query = "Hello!"
    >>> with lazyllm.ThreadPoolExecutor(1) as executor:
    ...     future = executor.submit(partial(m, llm_chat_history=[]), query)
    ...     while True:
    ...         if value := lazyllm.FileSystemQueue().dequeue():
    ...             print(f"output: {''.join(value)}")
    ...         elif future.done():
    ...             break
    ...     print(f"ret: {future.result()}")
    ...
    output: Hello
    output: ! How can I assist you today?
    ret: Hello! How can I assist you today?
    >>> from lazyllm.components.formatter import encode_query_with_filepaths
    >>> vlm = lazyllm.OnlineChatModule(source="sensenova", model="SenseChat-Vision")
    >>> query = "what is it?"
    >>> inputs = encode_query_with_filepaths(query, ["/path/to/your/image"])
    >>> print(vlm(inputs))
    """
    MODELS = {'openai': OpenAIModule,
              'sensenova': SenseNovaModule,
              'glm': GLMModule,
              'kimi': KimiModule,
              'qwen': QwenModule,
              'doubao': DoubaoModule,
              'deepseek': DeepSeekModule,
              'siliconflow': SiliconFlowModule}

    @staticmethod
    def _encapsulate_parameters(base_url: str, model: str, stream: bool, return_trace: bool, **kwargs) -> Dict[str, Any]:
        params = {'stream': stream, 'return_trace': return_trace}
        if base_url is not None:
            params['base_url'] = base_url
        if model is not None:
            params['model'] = model
        params.update(kwargs)
        return params

    def __new__(self, model: str = None, source: str = None, base_url: str = None, stream: bool = True,
                return_trace: bool = False, skip_auth: bool = False, type: Optional[str] = None, **kwargs):
        if model in OnlineChatModule.MODELS.keys() and source is None: source, model = model, source
        if type is None:
            type = get_model_type(model)
        if type in ['embed', 'rerank', 'cross_modal_embed']:
            raise AssertionError(f'\'{model}\' should use OnlineEmbeddingModule')
        elif type in ['sst', 'tts', 'sd']:
            raise AssertionError(f'\'{model}\' should use OnlineMultiModalModule')
        params = OnlineChatModule._encapsulate_parameters(base_url, model, stream, return_trace,
                                                          skip_auth=skip_auth, type=type.upper() if type else None,
                                                          **kwargs)

        if skip_auth:
            source = source or 'openai'
            if not base_url:
                raise KeyError('base_url must be set for local serving.')

        if source is None:
            if 'api_key' in kwargs and kwargs['api_key']:
                raise ValueError('No source is given but an api_key is provided.')
            for source in OnlineChatModule.MODELS.keys():
                if lazyllm.config[f'{source}_api_key']: break
            else:
                raise KeyError(f'No api_key is configured for any of the models {OnlineChatModule.MODELS.keys()}.')

        assert source in OnlineChatModule.MODELS.keys(), f'Unsupported source: {source}'
        return OnlineChatModule.MODELS[source](**params)
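
The source-resolution rules in `__new__` above can be sketched on their own. This is a simplified illustration, not the library's code: it takes the configured keys as a plain dict (an assumption standing in for `lazyllm.config`), raises `ValueError` where the listing uses `assert`, and omits the `skip_auth`, `type`, and explicit `api_key` checks.

```python
from typing import Dict, Optional, Tuple

SOURCES = ("openai", "sensenova", "glm", "kimi", "qwen", "doubao",
           "deepseek", "siliconflow")


def resolve_source(model: Optional[str], source: Optional[str],
                   api_keys: Dict[str, str]) -> Tuple[Optional[str], str]:
    """Return (model, source) following the precedence in the listing."""
    # A known source name passed positionally as `model` is taken as source.
    if model in SOURCES and source is None:
        source, model = model, None
    if source is None:
        # Fall back to the first source that has a configured key.
        for candidate in SOURCES:
            if api_keys.get(candidate):
                source = candidate
                break
        else:
            raise KeyError(f"No api_key is configured for any of {SOURCES}.")
    if source not in SOURCES:
        raise ValueError(f"Unsupported source: {source}")
    return model, source


print(resolve_source("qwen", None, {}))
print(resolve_source(None, None, {"glm": "key-placeholder"}))
```

The order of `SOURCES` is significant in the fallback case: when several keys are configured, the earliest entry wins.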

lazyllm.module.llms.onlinemodule.supplier.doubao.DoubaoModule

Bases: OnlineChatModuleBase

Doubao online chat module, inheriting from OnlineChatModuleBase.
Encapsulates the Doubao API (ByteDance) for multi-turn Q&A interactions. Defaults to model doubao-1-5-pro-32k-250115, supporting streaming and optional trace return.

Parameters:

  • model (str, default: None ) –

    The model name to use. Defaults to doubao-1-5-pro-32k-250115.

  • base_url (str, default: 'https://ark.cn-beijing.volces.com/api/v3/' ) –

    Base URL of the API, default is "https://ark.cn-beijing.volces.com/api/v3/".

  • api_key (Optional[str], default: None ) –

    Doubao API key. If not provided, it is read from lazyllm.config['doubao_api_key'].

  • stream (bool, default: True ) –

    Whether to enable streaming output. Defaults to True.

  • return_trace (bool, default: False ) –

    Whether to return trace information. Defaults to False.

  • **kwargs

    Additional arguments passed to the base class OnlineChatModuleBase.

Source code in lazyllm/module/llms/onlinemodule/supplier/doubao.py
class DoubaoModule(OnlineChatModuleBase):
    """Doubao online chat module, inheriting from OnlineChatModuleBase.  
Encapsulates the Doubao API (ByteDance) for multi-turn Q&A interactions. Defaults to model `doubao-1-5-pro-32k-250115`, supporting streaming and optional trace return.

Args:
    model (str): The model name to use. Defaults to `doubao-1-5-pro-32k-250115`.
    base_url (str): Base URL of the API, default is "https://ark.cn-beijing.volces.com/api/v3/".
    api_key (Optional[str]): Doubao API key. If not provided, it is read from `lazyllm.config['doubao_api_key']`.
    stream (bool): Whether to enable streaming output. Defaults to True.
    return_trace (bool): Whether to return trace information. Defaults to False.
    **kwargs: Additional arguments passed to the base class OnlineChatModuleBase.
"""
    MODEL_NAME = 'doubao-1-5-pro-32k-250115'
    VLM_MODEL_PREFIX = ['doubao-seed-1-6-vision', 'doubao-1-5-ui-tars']

    def __init__(self, model: str = None, base_url: str = 'https://ark.cn-beijing.volces.com/api/v3/',
                 api_key: str = None, stream: bool = True, return_trace: bool = False, **kwargs):
        super().__init__(model_series='DOUBAO', api_key=api_key or lazyllm.config['doubao_api_key'], base_url=base_url,
                         model_name=model or lazyllm.config['doubao_model_name'] or DoubaoModule.MODEL_NAME,
                         stream=stream, return_trace=return_trace, **kwargs)

    def _get_system_prompt(self):
        return ('You are Doubao, an AI assistant. Your task is to provide appropriate responses '
                'and support to user\'s questions and requests.')

    def _set_chat_url(self):
        self._url = urljoin(self._base_url, 'chat/completions')

    def _validate_api_key(self):
        """Validate API Key by sending a minimal request"""
        try:
            # Doubao (Volcano Engine) validates API key using a minimal chat request
            chat_url = urljoin(self._base_url, 'chat/completions')
            headers = {
                'Authorization': f'Bearer {self._api_key}',
                'Content-Type': 'application/json'
            }
            data = {
                'model': self._model_name,
                'messages': [{'role': 'user', 'content': 'hi'}],
                'max_tokens': 1  # Only generate 1 token for validation
            }
            response = requests.post(chat_url, headers=headers, json=data, timeout=10)
            return response.status_code == 200
        except Exception:
            return False
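
Both `_set_chat_url` and `_validate_api_key` build the request URL with `urljoin`, so the trailing slash of a custom base_url matters. The short check below uses only the standard library and shows what happens when the slash is dropped.

```python
from urllib.parse import urljoin

# Default base_url of DoubaoModule ends with a slash: the path is appended.
with_slash = urljoin("https://ark.cn-beijing.volces.com/api/v3/",
                     "chat/completions")

# Without the trailing slash, urljoin replaces the last path segment ("v3").
without_slash = urljoin("https://ark.cn-beijing.volces.com/api/v3",
                        "chat/completions")

print(with_slash)
print(without_slash)
```

When passing a custom base_url to a module that joins paths this way, keep the trailing slash so the version segment is preserved.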

lazyllm.module.llms.onlinemodule.supplier.doubao.DoubaoMultiModal

Bases: OnlineMultiModalBase

Doubao MultiModal module, inheriting from OnlineMultiModalBase, encapsulates the functionality to call Doubao's multimodal service.
By specifying the API key, model name, and base service URL, it allows remote interaction with Doubao's API for multimodal data processing and feature extraction.

Parameters:

  • api_key (Optional[str], default: None ) –

    API key for accessing Doubao service. If not provided, it is read from lazyllm.config['doubao_api_key'].

  • model_name (Optional[str], default: None ) –

    Name of the Doubao multimodal model to use.

  • base_url (str, default: 'https://ark.cn-beijing.volces.com/api/v3' ) –

    Base URL of the Doubao service, defaulting to the Beijing region endpoint.

  • return_trace (bool, default: False ) –

    Whether to return debug trace information, default is False.

  • **kwargs

    Additional parameters passed to OnlineMultiModalBase.

Source code in lazyllm/module/llms/onlinemodule/supplier/doubao.py
class DoubaoMultiModal(OnlineMultiModalBase):
    """Doubao MultiModal module, inheriting from OnlineMultiModalBase, encapsulates the functionality to call Doubao's multimodal service.  
By specifying the API key, model name, and base service URL, it allows remote interaction with Doubao's API for multimodal data processing and feature extraction.

Args:
    api_key (Optional[str]): API key for accessing Doubao service. If not provided, it is read from lazyllm config.
    model_name (Optional[str]): Name of the Doubao multimodal model to use.
    base_url (str): Base URL of the Doubao service, defaulting to the Beijing region endpoint.
    return_trace (bool): Whether to return debug trace information, default is False.
    **kwargs: Additional parameters passed to OnlineMultiModalBase.
"""
    def __init__(self, api_key: str = None, model_name: str = None, base_url='https://ark.cn-beijing.volces.com/api/v3',
                 return_trace: bool = False, **kwargs):
        OnlineMultiModalBase.__init__(self, model_series='DOUBAO', model_name=model_name,
                                      return_trace=return_trace, **kwargs)
        self._client = volcenginesdkarkruntime.Ark(
            base_url=base_url,
            api_key=api_key or lazyllm.config['doubao_api_key'],
        )

lazyllm.module.OnlineEmbeddingModule

Manages and creates online embedding service modules. Supported embedding sources are openai, sensenova, glm, qwen, doubao, and siliconflow.

Parameters:

  • source (str) –

    Type of module to create. Options are openai / sensenova / glm / qwen / doubao / siliconflow. If omitted, the first source with a configured api_key is used.

  • embed_url (str) –

    Base URL of the platform to access. Defaults to the platform's official endpoint.

  • embed_model_name (str) –

    Model to access. When using Doubao, pass a Model ID or Endpoint ID (see Getting the Inference Access Point for how to obtain one) and activate the corresponding service on the Doubao platform first. Defaults to text-embedding-ada-002 (openai) / nova-embedding-stable (sensenova) / embedding-2 (glm) / text-embedding-v1 (qwen) / doubao-embedding-text-240715 (doubao).

Examples:

>>> import lazyllm
>>> m = lazyllm.OnlineEmbeddingModule(source="sensenova")
>>> emb = m("hello world")
>>> print(f"emb: {emb}")
emb: [0.0010528564, 0.0063285828, 0.0049476624, -0.012008667, ..., -0.009124756, 0.0032043457, -0.051696777]
Source code in lazyllm/module/llms/onlinemodule/embedding.py
class OnlineEmbeddingModule(metaclass=__EmbedModuleMeta):
    """Used to manage and create online Embedding service modules currently on the market, currently supporting openai, sensenova, glm, qwen, doubao.

Args:
    source (str): Specify the type of module to create. Options are  ``openai`` /  ``sensenova`` /  ``glm`` /  ``qwen`` / ``doubao``.
    embed_url (str): Specify the base link of the platform to be accessed. The default is the official link.
    embed_model_name (str): Specify the model to access (Note that you need to use Model ID or Endpoint ID when using Doubao. For details on how to obtain it, see [Getting the Inference Access Point](https://www.volcengine.com/docs/82379/1099522). Before using the model, you must first activate the corresponding service on the Doubao platform.), default is ``text-embedding-ada-002(openai)`` / ``nova-embedding-stable(sensenova)`` / ``embedding-2(glm)`` / ``text-embedding-v1(qwen)`` / ``doubao-embedding-text-240715(doubao)``


Examples:
    >>> import lazyllm
    >>> m = lazyllm.OnlineEmbeddingModule(source="sensenova")
    >>> emb = m("hello world")
    >>> print(f"emb: {emb}")
    emb: [0.0010528564, 0.0063285828, 0.0049476624, -0.012008667, ..., -0.009124756, 0.0032043457, -0.051696777]
    """
    EMBED_MODELS = {'openai': OpenAIEmbedding,
                    'sensenova': SenseNovaEmbedding,
                    'glm': GLMEmbedding,
                    'qwen': QwenEmbedding,
                    'doubao': DoubaoEmbedding,
                    'siliconflow': SiliconFlowEmbedding
                    }
    RERANK_MODELS = {'qwen': QwenReranking,
                     'glm': GLMReranking,
                     'openai': OpenAIReranking,
                     'siliconflow': SiliconFlowReranking}

    @staticmethod
    def _encapsulate_parameters(embed_url: str,
                                embed_model_name: str,
                                **kwargs) -> Dict[str, Any]:
        params = {}
        if embed_url is not None:
            params['embed_url'] = embed_url
        if embed_model_name is not None:
            params['embed_model_name'] = embed_model_name
        params.update(kwargs)
        return params

    @staticmethod
    def _check_available_source(available_models):
        for source in available_models.keys():
            if lazyllm.config[f'{source}_api_key']: break
        else:
            raise KeyError(f'No api_key is configured for any of the models {available_models.keys()}.')

        assert source in available_models.keys(), f'Unsupported source: {source}'
        return source

    def __new__(self,
                source: str = None,
                embed_url: str = None,
                embed_model_name: str = None,
                **kwargs):
        params = OnlineEmbeddingModule._encapsulate_parameters(embed_url, embed_model_name, **kwargs)

        if source is None and 'api_key' in kwargs and kwargs['api_key']:
            raise ValueError('No source is given but an api_key is provided.')

        if 'type' in params:
            params.pop('type')
        if kwargs.get('type', 'embed') == 'embed':
            if source is None:
                source = OnlineEmbeddingModule._check_available_source(OnlineEmbeddingModule.EMBED_MODELS)
            if source == 'doubao':
                if embed_model_name.startswith('doubao-embedding-vision'):
                    return DoubaoMultimodalEmbedding(**params)
                else:
                    return DoubaoEmbedding(**params)
            return OnlineEmbeddingModule.EMBED_MODELS[source](**params)
        elif kwargs.get('type') == 'rerank':
            if source is None:
                source = OnlineEmbeddingModule._check_available_source(OnlineEmbeddingModule.RERANK_MODELS)
            return OnlineEmbeddingModule.RERANK_MODELS[source](**params)
        else:
            raise ValueError('Unknown type of online embedding module.')
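
The dispatch performed by `__new__` above (embedding versus reranking, plus the Doubao vision special case) can be summarized in a small standalone function. This is a sketch under assumptions: it returns descriptive labels instead of constructing modules, and it guards against a missing model name for Doubao, which is an addition of the sketch rather than behavior shown in the listing.

```python
from typing import Optional

EMBED_SOURCES = {"openai", "sensenova", "glm", "qwen", "doubao", "siliconflow"}
RERANK_SOURCES = {"qwen", "glm", "openai", "siliconflow"}


def pick_backend(source: str, embed_model_name: Optional[str] = None,
                 kind: str = "embed") -> str:
    """Return a label for the backend that would handle the request."""
    if kind == "embed":
        if source not in EMBED_SOURCES:
            raise ValueError(f"Unsupported embedding source: {source}")
        if source == "doubao":
            # Vision embedding models are routed to a multimodal backend.
            if (embed_model_name or "").startswith("doubao-embedding-vision"):
                return "DoubaoMultimodalEmbedding"
            return "DoubaoEmbedding"
        return f"{source}:embedding"
    if kind == "rerank":
        if source not in RERANK_SOURCES:
            raise ValueError(f"Unsupported rerank source: {source}")
        return f"{source}:rerank"
    raise ValueError("Unknown type of online embedding module.")


print(pick_backend("doubao", "doubao-embedding-vision-placeholder"))
print(pick_backend("qwen", kind="rerank"))
```

Note that the set of sources differs by kind: sensenova and doubao appear only in the embedding table, so requesting a reranker from them is rejected.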

lazyllm.module.llms.onlinemodule.supplier.openai.OpenAIEmbedding

Bases: OnlineEmbeddingModuleBase

Online embedding module using OpenAI. This class wraps the OpenAI Embedding API, defaulting to the text-embedding-ada-002 model, and converts text into vector representations.

Parameters:

  • embed_url (str, default: 'https://api.openai.com/v1/' ) –

    Base URL of the OpenAI API. Default is "https://api.openai.com/v1/"; the embeddings path is appended to it automatically.

  • embed_model_name (str, default: 'text-embedding-ada-002' ) –

    The name of the embedding model to use. Default is "text-embedding-ada-002".

  • api_key (str, default: None ) –

    The OpenAI API key. If not provided, it will be read from lazyllm.config.

Source code in lazyllm/module/llms/onlinemodule/supplier/openai.py
class OpenAIEmbedding(OnlineEmbeddingModuleBase):
    """Online embedding module using OpenAI.
This class wraps the OpenAI Embedding API, defaulting to the `text-embedding-ada-002` model, and converts text into vector representations.

Args:
    embed_url (str): Base URL of the OpenAI API. Default is "https://api.openai.com/v1/"; the `embeddings` path is appended to it automatically.
    embed_model_name (str): The name of the embedding model to use. Default is "text-embedding-ada-002".
    api_key (str, optional): The OpenAI API key. If not provided, it will be read from `lazyllm.config`.
"""
    NO_PROXY = True

    def __init__(self,
                 embed_url: str = 'https://api.openai.com/v1/',
                 embed_model_name: str = 'text-embedding-ada-002',
                 api_key: str = None, batch_size: int = 16, **kw):
        super().__init__('OPENAI', embed_url, api_key or lazyllm.config['openai_api_key'], embed_model_name,
                         batch_size=batch_size, **kw)

    def _set_embed_url(self):
        self._embed_url = urljoin(self._embed_url, 'embeddings')

lazyllm.module.llms.onlinemodule.supplier.qwen.QwenSTTModule

Bases: QwenMultiModal

Speech-to-Text (STT) module based on Qwen's multimodal API, with paraformer-v2 as the default model.

Parameters:

  • model (str, default: None ) –

    Model name. Defaults to None, in which case it will use lazyllm.config['qwen_stt_model_name'] or QwenSTTModule.MODEL_NAME.

  • api_key (str, default: None ) –

    API key for Qwen service. Defaults to None.

  • return_trace (bool, default: False ) –

    Whether to return intermediate trace information during inference. Defaults to False.

  • **kwargs

    Additional parameters passed to the parent class QwenMultiModal.
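
The model-name fallback described for the model parameter is a plain `or` chain. The sketch below reproduces it with a dict standing in for `lazyllm.config` (an assumption made so the snippet runs without LazyLLM).

```python
from typing import Mapping, Optional

DEFAULT_MODEL_NAME = "paraformer-v2"  # QwenSTTModule.MODEL_NAME in the listing


def resolve_model_name(model: Optional[str], config: Mapping[str, str]) -> str:
    """Explicit argument first, then configuration, then the class default."""
    return model or config.get("qwen_stt_model_name") or DEFAULT_MODEL_NAME


print(resolve_model_name(None, {}))
print(resolve_model_name(None, {"qwen_stt_model_name": "configured-model"}))
print(resolve_model_name("explicit-model", {"qwen_stt_model_name": "configured-model"}))
```

Because `or` tests truthiness, an empty string counts as "not provided" at every level, just like `None`.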

Source code in lazyllm/module/llms/onlinemodule/supplier/qwen.py
class QwenSTTModule(QwenMultiModal):
    """Speech-to-Text (STT) module based on Qwen's multimodal API, with ``paraformer-v2`` as the default model.

Args:
    model (str): Model name. Defaults to ``None``, in which case it will use ``lazyllm.config['qwen_stt_model_name']`` or ``QwenSTTModule.MODEL_NAME``.
    api_key (str): API key for Qwen service. Defaults to ``None``.
    return_trace (bool): Whether to return intermediate trace information during inference. Defaults to ``False``.
    **kwargs: Additional parameters passed to the parent class ``QwenMultiModal``.
"""
    MODEL_NAME = 'paraformer-v2'

    def __init__(self, model: str = None, api_key: str = None, return_trace: bool = False, **kwargs):
        QwenMultiModal.__init__(self, api_key=api_key,
                                model_name=model or lazyllm.config['qwen_stt_model_name'] or QwenSTTModule.MODEL_NAME,
                                return_trace=return_trace, **kwargs)

    def _forward(self, files: List[str] = [], **kwargs):  # noqa B006
        assert any(file.startswith('http') for file in files), 'QwenSTTModule only supports http file urls'
        call_params = {'model': self._model_name, 'file_urls': files, **kwargs}
        if self._api_key: call_params['api_key'] = self._api_key
        task_response = dashscope.audio.asr.Transcription.async_call(**call_params)
        transcribe_response = dashscope.audio.asr.Transcription.wait(task=task_response.output.task_id,
                                                                     api_key=self._api_key)
        if transcribe_response.status_code == HTTPStatus.OK:
            result_text = ''
            for task in transcribe_response.output.results:
                assert task['subtask_status'] == 'SUCCEEDED', 'subtask_status is not SUCCEEDED'
                response = json.loads(requests.get(task['transcription_url']).text)
                for transcript in response['transcripts']:
                    result_text += re.sub(r'<[^>]+>', '', transcript['text'])
            return result_text
        else:
            lazyllm.LOG.error(f'failed to transcribe: {transcribe_response.output}')
            raise Exception(f'failed to transcribe: {transcribe_response.output.message}')
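
Two steps of `_forward` can be tested without any network access: validating the input URLs and stripping markup tags from the transcripts. This is a minimal sketch with hypothetical helper names; its URL check requires every entry to be an http(s) URL, which is stricter than the `any(...)` assertion in the listing above, and the sample transcript tags are invented for illustration.

```python
import re
from typing import Dict, Iterable, List

TAG_PATTERN = re.compile(r"<[^>]+>")  # same pattern as in the listing


def check_urls(files: List[str]) -> None:
    """Reject any entry that is not an http(s) URL."""
    bad = [f for f in files if not f.startswith("http")]
    if bad:
        raise ValueError(f"only http file urls are supported, got: {bad}")


def join_transcripts(transcripts: Iterable[Dict[str, str]]) -> str:
    """Concatenate transcript texts after removing <...> tags."""
    return "".join(TAG_PATTERN.sub("", t["text"]) for t in transcripts)


check_urls(["https://example.invalid/audio.wav"])
text = join_transcripts([{"text": "<|Speech|>hello "},
                         {"text": "world<|/Speech|>"}])
print(text)
```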

lazyllm.module.OnlineChatModuleBase

Bases: OnlineModuleBase, LLMBase

OnlineChatModuleBase is the shared base class for LLM interfaces on open platforms and provides the core training, deployment, and inference capabilities. It cannot be instantiated directly: subclasses must implement the fine-tuning interfaces (uploading files, creating fine-tuning jobs, querying fine-tuning jobs) and the deployment interfaces (creating deployment services, querying deployment tasks).

To support a new open platform's LLM, derive a custom class from OnlineChatModuleBase and follow these steps:

1. Consider post-processing the returned results based on the parameters returned by the new platform's model. If the model's return format is consistent with OpenAI, no processing is necessary.
2. If the new platform supports model fine-tuning, you must also inherit from the FileHandlerBase class. This class primarily validates file formats and converts .jsonl formatted data into a format supported by the model for subsequent training. 
3. If the new platform supports model fine-tuning, you must implement interfaces for file upload, creating fine-tuning services, and querying fine-tuning services. Even if the new platform does not require deployment of the fine-tuned model, please implement dummy interfaces for creating and querying deployment services.
4. If the new platform supports model fine-tuning, provide a list of models that support fine-tuning to facilitate judgment during the fine-tuning service process.
5. Configure the api_key supported by the new platform as a global variable by using lazyllm.config.add(variable_name, type, default_value, environment_variable_name).
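
Step 5 registers a typed setting with a default and an environment-variable override. The idea can be sketched with a hypothetical `MiniConfig` class; this is a stand-in written for illustration, not `lazyllm.config`, and it does not reproduce how the real library derives the final environment variable name.

```python
import os
from typing import Any, Callable, Dict, Tuple


class MiniConfig:
    """Hypothetical registry: typed settings with an environment override."""

    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[Callable[[str], Any], Any, str]] = {}

    def add(self, name: str, type_: Callable[[str], Any], default: Any,
            env_name: str) -> "MiniConfig":
        self._entries[name] = (type_, default, env_name)
        return self  # allow chained registration

    def __getitem__(self, name: str) -> Any:
        type_, default, env_name = self._entries[name]
        raw = os.environ.get(env_name)
        # The environment wins when set; otherwise fall back to the default.
        return default if raw is None else type_(raw)


config = MiniConfig().add("new_platform_api_key", str, "",
                          "NEW_PLATFORM_API_KEY")
print(repr(config["new_platform_api_key"]))
```

Reading the environment at lookup time, as this sketch does, means a key exported after registration is still picked up.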

Parameters:

  • model_series (str) –

    Model series name

  • api_key (str) –

    API access key

  • base_url (str) –

    API base URL

  • model_name (str) –

    Model name

  • stream (Union[bool, Dict[str, str]]) –

    Whether to stream output or stream configuration

  • return_trace (bool, default: False ) –

    Whether to return trace information, defaults to False

  • skip_auth (bool, default: False ) –

    Whether to skip authentication, defaults to False

  • static_params (Optional[StaticParams], default: None ) –

    Static parameter configuration, defaults to None

  • **kwargs

    Other model parameters

Examples:

>>> import lazyllm
>>> from lazyllm.module import OnlineChatModuleBase
>>> from lazyllm.module.onlineChatModule.fileHandler import FileHandlerBase
>>> class NewPlatformChatModule(OnlineChatModuleBase):
...     def __init__(self,
...                   base_url: str = "<new platform base url>",
...                   model: str = "<new platform model name>",
...                   system_prompt: str = "<new platform system prompt>",
...                   stream: bool = True,
...                   return_trace: bool = False):
...         super().__init__(model_series="NEW_PLATFORM",
...                          model_name=model,
...                          api_key=lazyllm.config['new_platform_api_key'],
...                          base_url=base_url,
...                          system_prompt=system_prompt,
...                          stream=stream,
...                          return_trace=return_trace)
...
>>> class NewPlatformChatModule1(OnlineChatModuleBase, FileHandlerBase):
...     TRAINABLE_MODELS_LIST = ['model_t1', 'model_t2', 'model_t3']
...     def __init__(self,
...                   base_url: str = "<new platform base url>",
...                   model: str = "<new platform model name>",
...                   system_prompt: str = "<new platform system prompt>",
...                   stream: bool = True,
...                   return_trace: bool = False):
...         OnlineChatModuleBase.__init__(self,
...                                       model_series="NEW_PLATFORM",
...                                       model_name=model,
...                                       api_key=lazyllm.config['new_platform_api_key'],
...                                       base_url=base_url,
...                                       system_prompt=system_prompt,
...                                       stream=stream,
...                                       trainable_models=NewPlatformChatModule1.TRAINABLE_MODELS_LIST,
...                                       return_trace=return_trace)
...         FileHandlerBase.__init__(self)
...     
...     def _convert_file_format(self, filepath:str) -> str:
...         pass
...         return data_str
...
...     def _upload_train_file(self, train_file):
...         pass
...         return train_file_id
...
...     def _create_finetuning_job(self, train_model, train_file_id, **kw):
...         pass
...         return fine_tuning_job_id, status
...
...     def _query_finetuning_job(self, fine_tuning_job_id):
...         pass
...         return fine_tuned_model, status
...
...     def _create_deployment(self):
...         pass
...         return self._model_name, "RUNNING"
... 
...     def _query_deployment(self, deployment_id):
...         pass
...         return "RUNNING"
...
Source code in lazyllm/module/llms/onlinemodule/base/onlineChatModuleBase.py
class OnlineChatModuleBase(OnlineModuleBase, LLMBase):
    """OnlineChatModuleBase is a public component that manages the LLM interface for open platforms, and has key capabilities such as training, deployment, and inference. OnlineChatModuleBase itself does not support direct instantiation; it requires subclasses to inherit from this class and implement interfaces related to fine-tuning, such as uploading files, creating fine-tuning tasks, querying fine-tuning tasks, and deployment-related interfaces, such as creating deployment services and querying deployment tasks.

If you need to support the capabilities of a new open platform's LLM, please extend your custom class from OnlineChatModuleBase:

    1. Consider post-processing the returned results based on the parameters returned by the new platform's model. If the model's return format is consistent with OpenAI, no processing is necessary.
    2. If the new platform supports model fine-tuning, you must also inherit from the FileHandlerBase class. This class primarily validates file formats and converts .jsonl formatted data into a format supported by the model for subsequent training. 
    3. If the new platform supports model fine-tuning, you must implement interfaces for file upload, creating fine-tuning services, and querying fine-tuning services. Even if the new platform does not require deployment of the fine-tuned model, please implement dummy interfaces for creating and querying deployment services.
    4. If the new platform supports model fine-tuning, provide a list of models that support fine-tuning to facilitate judgment during the fine-tuning service process.
    5. Configure the api_key supported by the new platform as a global variable by using ``lazyllm.config.add(variable_name, type, default_value, environment_variable_name)`` .

Args:
    model_series (str): Model series name
    api_key (str): API access key
    base_url (str): API base URL
    model_name (str): Model name
    stream (Union[bool, Dict[str, str]]): Whether to stream output or stream configuration
    return_trace (bool, optional): Whether to return trace information, defaults to False
    skip_auth (bool, optional): Whether to skip authentication, defaults to False
    static_params (Optional[StaticParams], optional): Static parameter configuration, defaults to None
    **kwargs: Other model parameters


Examples:
    >>> import lazyllm
    >>> from lazyllm.module import OnlineChatModuleBase
    >>> from lazyllm.module.onlineChatModule.fileHandler import FileHandlerBase
    >>> class NewPlatformChatModule(OnlineChatModuleBase):
    ...     def __init__(self,
    ...                   base_url: str = "<new platform base url>",
    ...                   model: str = "<new platform model name>",
    ...                   system_prompt: str = "<new platform system prompt>",
    ...                   stream: bool = True,
    ...                   return_trace: bool = False):
    ...         super().__init__(model_series="new_class_name",
    ...                          api_key=lazyllm.config['new_platform_api_key'],
    ...                          base_url=base_url,
    ...                          model_name=model,
    ...                          system_prompt=system_prompt,
    ...                          stream=stream,
    ...                          return_trace=return_trace)
    ...
    >>> class NewPlatformChatModule1(OnlineChatModuleBase, FileHandlerBase):
    ...     # The base class reads TRAINABLE_MODEL_LIST and exposes it as self.trainable_models.
    ...     TRAINABLE_MODEL_LIST = ['model_t1', 'model_t2', 'model_t3']
    ...     def __init__(self,
    ...                   base_url: str = "<new platform base url>",
    ...                   model: str = "<new platform model name>",
    ...                   system_prompt: str = "<new platform system prompt>",
    ...                   stream: bool = True,
    ...                   return_trace: bool = False):
    ...         OnlineChatModuleBase.__init__(self,
    ...                                       model_series="new_class_name",
    ...                                       api_key=lazyllm.config['new_platform_api_key'],
    ...                                       base_url=base_url,
    ...                                       model_name=model,
    ...                                       system_prompt=system_prompt,
    ...                                       stream=stream,
    ...                                       return_trace=return_trace)
    ...         FileHandlerBase.__init__(self)
    ...     
    ...     def _convert_file_format(self, filepath:str) -> str:
    ...         pass
    ...         return data_str
    ...
    ...     def _upload_train_file(self, train_file):
    ...         pass
    ...         return train_file_id
    ...
    ...     def _create_finetuning_job(self, train_model, train_file_id, **kw):
    ...         pass
    ...         return fine_tuning_job_id, status
    ...
    ...     def _query_finetuning_job(self, fine_tuning_job_id):
    ...         pass
    ...         return fine_tuned_model, status
    ...
    ...     def _create_deployment(self):
    ...         pass
    ...         return self._model_name, "RUNNING"
    ... 
    ...     def _query_deployment(self, deployment_id):
    ...         pass
    ...         return "RUNNING"
    ...
    """
    TRAINABLE_MODEL_LIST = []
    VLM_MODEL_PREFIX = []
    NO_PROXY = True

    def __init__(self, model_series: str, api_key: str, base_url: str, model_name: str,
                 stream: Union[bool, Dict[str, str]], return_trace: bool = False, skip_auth: bool = False,
                 static_params: Optional[StaticParams] = None, type: Optional[str] = None, **kwargs):
        if any([model_name.startswith(prefix) for prefix in self.VLM_MODEL_PREFIX]):
            if type is None: type = 'VLM'
            else: assert type == 'VLM', f'model_name {model_name} is a VLM model, but type is {type}'
        OnlineModuleBase.__init__(self, return_trace=return_trace)
        LLMBase.__init__(self, stream=stream, type=type)
        self._model_series = model_series
        if not skip_auth and not api_key:
            raise ValueError('api_key is required')
        self._api_key = '' if skip_auth else api_key
        self._base_url = base_url
        self._model_name = model_name
        self.trainable_models = self.TRAINABLE_MODEL_LIST
        self._set_headers()
        self._set_chat_url()
        self._is_trained = False
        self._model_optional_params = {}
        self._vlm_force_format_input_with_files = False
        self._static_params = static_params or {}

    @property
    def series(self):
        return self._model_series

    @property
    def static_params(self) -> StaticParams:
        return self._static_params

    @static_params.setter
    def static_params(self, value: StaticParams):
        if not isinstance(value, dict):
            raise TypeError('static_params must be a dict (TypedDict)')
        self._static_params = value

    def prompt(self, prompt: Optional[str] = None, history: Optional[List[List[str]]] = None):
        super().prompt('' if prompt is None else prompt, history=history)
        self._prompt._set_model_configs(system=self._get_system_prompt())
        return self

    def share(self, prompt: Optional[Union[str, dict, PrompterBase]] = None, format: Optional[FormatterBase] = None,
              stream: Optional[Union[bool, Dict[str, str]]] = None, history: Optional[List[List[str]]] = None,
              copy_static_params: bool = False):
        new = super().share(prompt, format, stream, history)
        if copy_static_params: new._static_params = copy.deepcopy(self._static_params)
        return new

    def _get_system_prompt(self):
        raise NotImplementedError('_get_system_prompt is not implemented.')

    def _set_headers(self):
        self._headers = {
            'Content-Type': 'application/json',
            **({'Authorization': 'Bearer ' + self._api_key} if self._api_key else {})
        }

    def _set_chat_url(self):
        self._url = urljoin(self._base_url, 'chat/completions')

    def _get_models_list(self):
        url = urljoin(self._base_url, 'models')
        headers = {'Authorization': 'Bearer ' + self._api_key} if self._api_key else None
        with requests.get(url, headers=headers) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            res_json = r.json()
            return res_json

    def _convert_msg_format(self, msg: Dict[str, Any]):
        return msg

    def _str_to_json(self, msg: str, stream_output: bool):
        if isinstance(msg, bytes):
            pattern = re.compile(r'^data:\s*')
            msg = re.sub(pattern, '', msg.decode('utf-8'))
        try:
            message = self._convert_msg_format(json.loads(msg))
            if not stream_output: return message
            color = stream_output.get('color') if isinstance(stream_output, dict) else None
            for item in message.get('choices', []):
                delta = item.get('message', item.get('delta', {}))
                if (reasoning_content := delta.get('reasoning_content', '')):
                    self._stream_output(reasoning_content, color, cls='think')
                elif (content := delta.get('content', '')) and not delta.get('tool_calls'):
                    self._stream_output(content, color)
            lazyllm.LOG.debug(f'message: {message}')
            return message
        except Exception:
            return ''

    def _extract_specified_key_fields(self, response: Dict[str, Any]):
        if not ('choices' in response and isinstance(response['choices'], list)):
            raise ValueError(f'The response {response} does not contain a `choices` field.')
        outputs = response['choices'][0].get('message') or response['choices'][0].get('delta', {})
        if 'reasoning_content' in outputs and outputs['reasoning_content'] and 'content' in outputs:
            outputs['content'] = r'<think>' + outputs.pop('reasoning_content') + r'</think>' + outputs['content']

        result, tool_calls = outputs.get('content') or '', outputs.get('tool_calls')
        if tool_calls:
            try:
                if isinstance(tool_calls, list): [item.pop('index', None) for item in tool_calls]
                tool_calls = tool_calls if isinstance(tool_calls, str) else json.dumps(tool_calls, ensure_ascii=False)
                if tool_calls: result += '<|tool_calls|>' + tool_calls
            except (KeyError, IndexError, TypeError):
                pass
        return result

    def _merge_stream_result(self, src: List[Union[str, int, list, dict]], force_join: bool = False):
        src = [ele for ele in src if ele is not None]
        if not src: return None
        elif len(src) == 1: return src[0]
        assert len(set(map(type, src))) == 1, f'The elements in the list: {src} are of inconsistent types'

        if isinstance(src[0], str):
            src = [ele for ele in src if ele]
            if not src: return ''
            if force_join or not all(src[0] == ele for ele in src): return ''.join(src)
        elif isinstance(src[0], list):
            assert len(set(map(len, src))) == 1, f'The lists of elements: {src} have different lengths.'
            ret = list(map(self._merge_stream_result, zip(*src)))
            return ret[0] if (len(ret) > 0 and isinstance(ret[0], list)) else ret
        elif isinstance(src[0], dict):  # list of dicts
            if 'index' in src[-1]:
                grouped = [list(g) for _, g in groupby(sorted(src, key=itemget('index')), key=itemget('index'))]
                if len(grouped) > 1: return [self._merge_stream_result(src) for src in grouped]
            return {k: self._merge_stream_result([d.get(k) for d in src], k == 'content') for k in set().union(*src)}
        return src[-1]

    def forward(self, __input: Union[Dict, str] = None, *, llm_chat_history: List[List[str]] = None,
                tools: List[Dict[str, Any]] = None, stream_output: bool = False, lazyllm_files=None, **kw):
        """LLM inference interface"""
        # TODO(dengyuang): if current forward set stream_output = False but self._stream = True, will use stream = True
        stream_output = stream_output or self._stream
        __input, files = self._get_files(__input, lazyllm_files)
        params = {'input': __input, 'history': llm_chat_history, 'return_dict': True}
        if tools: params['tools'] = tools
        data = self._prompt.generate_prompt(**params)
        data.update(self._static_params, **dict(model=self._model_name, stream=bool(stream_output)))

        if len(kw) > 0: data.update(kw)
        if len(self._model_optional_params) > 0: data.update(self._model_optional_params)

        if self.type == 'VLM' and (files or self._vlm_force_format_input_with_files):
            data['messages'][-1]['content'] = self._format_input_with_files(data['messages'][-1]['content'], files)

        proxies = {'http': None, 'https': None} if self.NO_PROXY else None
        with requests.post(self._url, json=data, headers=self._headers, stream=stream_output, proxies=proxies) as r:
            if r.status_code != 200:  # request error
                msg = '\n'.join([c.decode('utf-8') for c in r.iter_content(None)]) if stream_output else r.text
                raise requests.RequestException(f'{r.status_code}: {msg}')

            with self.stream_output(stream_output):
                msg_json = list(filter(lambda x: x, ([self._str_to_json(line, stream_output) for line in r.iter_lines()
                                if len(line)] if stream_output else [self._str_to_json(r.text, stream_output)]),))

            usage = {'prompt_tokens': -1, 'completion_tokens': -1}
            if len(msg_json) > 0 and 'usage' in msg_json[-1] and isinstance(msg_json[-1]['usage'], dict):
                for k in usage:
                    usage[k] = msg_json[-1]['usage'].get(k, usage[k])
            self._record_usage(usage)
            extractor = self._extract_specified_key_fields(self._merge_stream_result(msg_json))
            return self._formatter(extractor) if extractor else ''

    def _record_usage(self, usage: dict):
        globals['usage'][self._module_id] = usage
        par_muduleid = self._used_by_moduleid
        if par_muduleid is None:
            return
        if par_muduleid not in globals['usage']:
            globals['usage'][par_muduleid] = usage
            return
        existing_usage = globals['usage'][par_muduleid]
        if existing_usage['prompt_tokens'] == -1 or usage['prompt_tokens'] == -1:
            globals['usage'][par_muduleid] = {'prompt_tokens': -1, 'completion_tokens': -1}
        else:
            for k in globals['usage'][par_muduleid]:
                globals['usage'][par_muduleid][k] += usage[k]

    def _upload_train_file(self, train_file) -> str:
        raise NotImplementedError(f'{self._model_series} not implemented _upload_train_file method in subclass')

    def _create_finetuning_job(self, train_model, train_file_id, **kw) -> Tuple[str, str]:
        raise NotImplementedError(f'{self._model_series} not implemented _create_finetuning_job method in subclass')

    def _query_finetuning_job(self, fine_tuning_job_id) -> Tuple[str, str]:
        raise NotImplementedError(f'{self._model_series} not implemented _query_finetuning_job method in subclass')

    def _query_finetuned_jobs(self) -> dict:
        raise NotImplementedError(f'{self._model_series} not implemented _query_finetuned_jobs method in subclass')

    def _get_finetuned_model_names(self) -> Tuple[List[str], List[str]]:
        raise NotImplementedError(f'{self._model_series} not implemented _get_finetuned_model_names method in subclass')

    def set_train_tasks(self, train_file, **kw):
        """Set model fine-tuning training task parameters.

Configure training data file and training hyperparameters required for fine-tuning, preparing for subsequent training tasks.

Args:
    train_file: Training data file path or file object
    **kw: Training hyperparameters such as learning rate, training epochs, etc.
"""
        self._train_file = train_file
        self._train_parameters = kw

    def set_specific_finetuned_model(self, model_id):
        """Set and use specific fine-tuned model.

Select specified model ID from completed fine-tuned model list as current model to use.

Args:
    model_id (str): Fine-tuned model ID to use

**Exceptions:** 

- ValueError: Raised when provided model_id is not in valid fine-tuned model list
"""
        valid_jobs, _ = self._get_finetuned_model_names()
        valid_model_id = [model for _, model in valid_jobs]
        if model_id in valid_model_id:
            self._model_name = model_id
            self._is_trained = True
        else:
            raise ValueError(f'Cannot find model({model_id}) in fine-tuned model list: {valid_model_id}')

    def _get_temp_save_dir_path(self):
        save_dir = os.path.join(lazyllm.config['temp_dir'], 'online_model_sft_log')
        if not os.path.exists(save_dir):
            os.system(f'mkdir -p {save_dir}')
        else:
            _delete_old_files(save_dir)
        return save_dir

    def _validate_api_key(self):
        try:
            self._query_finetuned_jobs()
            return True
        except Exception:
            return False

    def _get_train_tasks(self):
        if not self._model_name or not self._train_file:
            raise ValueError('train_model and train_file are required')
        if self._model_name not in self.trainable_models:
            # Adjacent string literals are used so the message carries no stray indentation.
            lazyllm.LOG.log_once(f'The current model {self._model_name} is not in the trainable '
                                 f'model list {self.trainable_models}. This list is current as of June 1, 2024. '
                                 'This model may not be trainable. If your model is a new model, '
                                 'you can ignore this warning.')

        def _create_for_finetuning_job():
            """
            Create a fine-tuning job and wait for it to finish.
            """
            file_id = self._upload_train_file(train_file=self._train_file)
            lazyllm.LOG.info(f'{os.path.basename(self._train_file)} upload success! file id is {file_id}')
            (fine_tuning_job_id, status) = self._create_finetuning_job(self._model_name,
                                                                       file_id,
                                                                       **self._train_parameters)
            lazyllm.LOG.info(f'fine tuning job {fine_tuning_job_id} created, status: {status}')

            if status.lower() == 'failed':
                raise ValueError(f'Fine tuning job {fine_tuning_job_id} failed')
            while status.lower() != 'succeeded':
                try:
                    # wait a random 60 to 120 seconds before querying again
                    time.sleep(random.randint(60, 120))
                    (fine_tuned_model, status) = self._query_finetuning_job(fine_tuning_job_id)
                    lazyllm.LOG.info(f'fine tuning job {fine_tuning_job_id} status: {status}')
                    if status.lower() == 'failed':
                        raise ValueError(f'Finetuning job {fine_tuning_job_id} failed')
                except ValueError:
                    raise ValueError(f'Finetuning job {fine_tuning_job_id} failed')

            lazyllm.LOG.info(f'fine tuned model: {fine_tuned_model} finished')
            self._model_name = fine_tuned_model
            self._is_trained = True

        return pipeline(_create_for_finetuning_job)

    def _create_deployment(self) -> Tuple[str, str]:
        raise NotImplementedError(f'{self._model_series} not implemented _create_deployment method in subclass')

    def _query_deployment(self, deployment_id) -> str:
        raise NotImplementedError(f'{self._model_series} not implemented _query_deployment method in subclass')

    def _get_deploy_tasks(self):
        if not self._is_trained: return None

        def _start_for_deployment():
            (deployment_id, status) = self._create_deployment()
            lazyllm.LOG.info(f'deployment {deployment_id} created, status: {status}')

            if status.lower() == 'failed':
                raise ValueError(f'Deployment task {deployment_id} failed')
            status = self._query_deployment(deployment_id)
            while status.lower() != 'running':
                # wait 10 seconds before querying again
                time.sleep(10)
                status = self._query_deployment(deployment_id)
                lazyllm.LOG.info(f'deployment {deployment_id} status: {status}')
                if status.lower() == 'failed':
                    raise ValueError(f'Deployment task {deployment_id} failed')
            lazyllm.LOG.info(f'deployment {deployment_id} finished')
        return pipeline(_start_for_deployment)

    def _format_vl_chat_query(self, query: str):
        return [{'type': 'text', 'text': query}]

    def _format_vl_chat_image_url(self, image_url: str, mime: str) -> List[Dict[str, str]]:
        return [{'type': 'image_url', 'image_url': {'url': f'data:{mime};base64,{image_url}'}}]

    # for online vlm
    def _format_input_with_files(self, query: str, query_files: list[str]) -> List[Dict[str, str]]:
        if not query_files:
            return self._format_vl_chat_query(query)
        output = [{'type': 'text', 'text': query}]
        assert isinstance(query_files, list), 'query_files must be a list.'
        for file in query_files:
            mime = None
            if not file.startswith('http'):
                file, mime = _image_to_base64(file)
            output.extend(self._format_vl_chat_image_url(file, mime))
        return output

    def __repr__(self):
        return lazyllm.make_repr('Module', 'OnlineChat', name=self._module_name, url=self._base_url,
                                 stream=bool(self._stream), return_trace=self._return_trace)

set_train_tasks(train_file, **kw)

Set the parameters for a model fine-tuning task.

Records the training data file and the hyperparameters required for fine-tuning so that a subsequent training task can use them.

Parameters:

  • train_file

    Training data file path or file object

  • **kw

    Training hyperparameters such as learning rate, training epochs, etc.
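
As a rough illustration of how the stored values are used later, the sketch below mimics the pattern visible in the source: set_train_tasks only records the file and the keyword arguments, and the fine-tuning step later unpacks them into _create_finetuning_job. SketchTrainer, the file name, the file id, and the hyperparameter names are hypothetical stand-ins, not part of LazyLLM.

```python
from typing import Any, Dict, Tuple


class SketchTrainer:
    """Hypothetical stand-in that mirrors the store-then-forward pattern."""

    def __init__(self, model_name: str):
        self._model_name = model_name
        self._train_file = None
        self._train_parameters: Dict[str, Any] = {}

    def set_train_tasks(self, train_file, **kw):
        # Only record the inputs; nothing is uploaded or started here.
        self._train_file = train_file
        self._train_parameters = kw

    def _create_finetuning_job(self, train_model, train_file_id, **kw) -> Tuple[str, str]:
        # A real platform class would call the provider's API here.
        return f'job-for-{train_model}', f'received {sorted(kw)}'

    def run(self) -> Tuple[str, str]:
        # The stored keyword arguments are unpacked into the job-creation call.
        return self._create_finetuning_job(self._model_name, 'file-0', **self._train_parameters)


trainer = SketchTrainer('model_t1')
trainer.set_train_tasks('train.jsonl', learning_rate=1e-4, epochs=3)
job_id, info = trainer.run()
```

Which hyperparameter names a platform accepts depends on that platform's fine-tuning API, so the keyword names above should be read as placeholders.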

Source code in lazyllm/module/llms/onlinemodule/base/onlineChatModuleBase.py
    def set_train_tasks(self, train_file, **kw):
        """Set model fine-tuning training task parameters.

Configure training data file and training hyperparameters required for fine-tuning, preparing for subsequent training tasks.

Args:
    train_file: Training data file path or file object
    **kw: Training hyperparameters such as learning rate, training epochs, etc.
"""
        self._train_file = train_file
        self._train_parameters = kw

set_specific_finetuned_model(model_id)

Set and use a specific fine-tuned model.

Selects the given model ID from the list of completed fine-tuned models and uses it as the current model.

Parameters:

  • model_id (str) –

    Fine-tuned model ID to use

Exceptions:

  • ValueError: Raised when the provided model_id is not in the list of valid fine-tuned models.
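
The membership check described above can be sketched as a standalone function. This is a minimal illustration, assuming the platform reports finished jobs as (job_id, model_id) pairs; pick_finetuned_model and the sample identifiers are hypothetical and not part of LazyLLM.

```python
from typing import List, Tuple


def pick_finetuned_model(model_id: str, valid_jobs: List[Tuple[str, str]]) -> str:
    """Return model_id if it appears among the (job_id, model_id) pairs, else raise."""
    valid_model_ids = [model for _, model in valid_jobs]
    if model_id not in valid_model_ids:
        raise ValueError(f'Cannot find model({model_id}) in fine-tuned model list: {valid_model_ids}')
    return model_id


# Hypothetical (job_id, model_id) pairs, as a platform might report them.
jobs = [('job-1', 'ft-model-a'), ('job-2', 'ft-model-b')]
chosen = pick_finetuned_model('ft-model-b', jobs)

try:
    pick_finetuned_model('ft-model-x', jobs)
    rejected = False
except ValueError:
    rejected = True
```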
Source code in lazyllm/module/llms/onlinemodule/base/onlineChatModuleBase.py
    def set_specific_finetuned_model(self, model_id):
        """Set and use specific fine-tuned model.

Select specified model ID from completed fine-tuned model list as current model to use.

Args:
    model_id (str): Fine-tuned model ID to use

**Exceptions:** 

- ValueError: Raised when provided model_id is not in valid fine-tuned model list
"""
        valid_jobs, _ = self._get_finetuned_model_names()
        valid_model_id = [model for _, model in valid_jobs]
        if model_id in valid_model_id:
            self._model_name = model_id
            self._is_trained = True
        else:
            raise ValueError(f'Cannot find model({model_id}) in fine-tuned model list: {valid_model_id}')

lazyllm.module.OnlineEmbeddingModuleBase

Bases: OnlineModuleBase

OnlineEmbeddingModuleBase is the base class for managing embedding model interfaces on open platforms; it sends text to the platform and returns the embedding vectors. Instantiating this class directly is not recommended; platform-specific classes should inherit from it and be instantiated instead.

If you need to support the capabilities of embedding models on a new open platform, please extend your custom class from OnlineEmbeddingModuleBase:

  1. If the request and response data formats of the new platform's embedding model are the same as OpenAI's, no additional processing is needed; simply pass the URL and model.
  2. If the request or response data formats of the new platform's embedding model differ from OpenAI's, you need to override the _encapsulated_data or _parse_response methods.
  3. Configure the api_key supported by the new platform as a global variable by using lazyllm.config.add(variable_name, type, default_value, environment_variable_name).
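
To make item 2 more concrete, here is a minimal standalone sketch of the two hooks. build_requests follows the OpenAI-style batching that the base class source uses in _encapsulated_data, while parse_custom_response assumes a made-up response layout ({'result': {'vectors': [...]}}) purely for illustration; both function names and the layout are hypothetical.

```python
from typing import Any, Dict, List, Union


def build_requests(texts: Union[List[str], str], model: str, batch_size: int = 1, **kwargs):
    """Return one request body for a string, or a list of bodies for a list of texts."""
    if isinstance(texts, str):
        return {'input': [texts], 'model': model, **kwargs}
    # Split the list into consecutive chunks of at most batch_size items.
    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]
    return [{'input': batch, 'model': model, **kwargs} for batch in batches]


def parse_custom_response(response: Dict[str, Any], texts: Union[List[str], str]):
    """Parse a hypothetical non-OpenAI layout: {'result': {'vectors': [[...], ...]}}."""
    vectors = response.get('result', {}).get('vectors', [])
    if not vectors:
        raise ValueError('no data received')
    # A single string yields one vector; a list yields one vector per text.
    return vectors[0] if isinstance(texts, str) else vectors


single = build_requests('hello', 'embed-x')
batched = build_requests(['a', 'b', 'c'], 'embed-x', batch_size=2)
vec = parse_custom_response({'result': {'vectors': [[0.1, 0.2]]}}, 'hello')
```

In a real subclass these would be the bodies of _encapsulated_data and _parse_response; note that the base class calls _parse_response with an input keyword argument, so an override needs to accept it.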

Parameters:

  • model_series (str) –

    Model series name identifier.

  • embed_url (str) –

    Embedding API URL address.

  • api_key (str) –

    API access key.

  • embed_model_name (str) –

    Embedding model name.

  • return_trace (bool, default: False ) –

    Whether to return trace information, defaults to False.

Examples:

>>> import lazyllm
>>> from typing import Any, Dict, List, Union
>>> from lazyllm.module import OnlineEmbeddingModuleBase
>>> class NewPlatformEmbeddingModule(OnlineEmbeddingModuleBase):
...     def __init__(self,
...                 embed_url: str = '<new platform embedding url>',
...                 embed_model_name: str = '<new platform embedding model name>'):
...         # model_series is the first positional argument of the base class.
...         super().__init__('<new platform series name>', embed_url,
...                          lazyllm.config['new_platform_api_key'], embed_model_name)
...
>>> class NewPlatformEmbeddingModule1(OnlineEmbeddingModuleBase):
...     def __init__(self,
...                 embed_url: str = '<new platform embedding url>',
...                 embed_model_name: str = '<new platform embedding model name>'):
...         super().__init__('<new platform series name>', embed_url,
...                          lazyllm.config['new_platform_api_key'], embed_model_name)
...
...     def _encapsulated_data(self, input: Union[List, str], **kwargs):
...         pass
...         return json_data
...
...     def _parse_response(self, response: Dict[str, Any], input: Union[List, str]):
...         pass
...         return embedding
Source code in lazyllm/module/llms/onlinemodule/base/onlineEmbeddingModuleBase.py
class OnlineEmbeddingModuleBase(OnlineModuleBase):
    """OnlineEmbeddingModuleBase is the base class for managing embedding model interfaces on open platforms, used for requesting text to obtain embedding vectors. It is not recommended to directly instantiate this class. Specific platform classes should inherit from this class for instantiation.


If you need to support the capabilities of embedding models on a new open platform, please extend your custom class from OnlineEmbeddingModuleBase:

1. If the request and response data formats of the new platform's embedding model are the same as OpenAI's, no additional processing is needed; simply pass the URL and model.
2. If the request or response data formats of the new platform's embedding model differ from OpenAI's, you need to override the _encapsulated_data or _parse_response methods.
3. Configure the api_key supported by the new platform as a global variable by using ``lazyllm.config.add(variable_name, type, default_value, environment_variable_name)`` .

Args:
    model_series (str): Model series name identifier.
    embed_url (str): Embedding API URL address.
    api_key (str): API access key.
    embed_model_name (str): Embedding model name.
    return_trace (bool, optional): Whether to return trace information, defaults to False.


Examples:
    >>> import lazyllm
    >>> from typing import Any, Dict, List, Union
    >>> from lazyllm.module import OnlineEmbeddingModuleBase
    >>> class NewPlatformEmbeddingModule(OnlineEmbeddingModuleBase):
    ...     def __init__(self,
    ...                 embed_url: str = '<new platform embedding url>',
    ...                 embed_model_name: str = '<new platform embedding model name>'):
    ...         # model_series is the first positional argument of the base class.
    ...         super().__init__('<new platform series name>', embed_url,
    ...                          lazyllm.config['new_platform_api_key'], embed_model_name)
    ...
    >>> class NewPlatformEmbeddingModule1(OnlineEmbeddingModuleBase):
    ...     def __init__(self,
    ...                 embed_url: str = '<new platform embedding url>',
    ...                 embed_model_name: str = '<new platform embedding model name>'):
    ...         super().__init__('<new platform series name>', embed_url,
    ...                          lazyllm.config['new_platform_api_key'], embed_model_name)
    ...
    ...     def _encapsulated_data(self, input: Union[List, str], **kwargs):
    ...         pass
    ...         return json_data
    ...
    ...     def _parse_response(self, response: Dict[str, Any], input: Union[List, str]):
    ...         pass
    ...         return embedding
    """
    NO_PROXY = True

    def __init__(self,
                 model_series: str,
                 embed_url: str,
                 api_key: str,
                 embed_model_name: str,
                 return_trace: bool = False,
                 batch_size: int = 1,
                 num_worker: int = 1,
                 timeout: int = 10):
        super().__init__(return_trace=return_trace)
        self._model_series = model_series
        self._embed_url = embed_url
        self._api_key = api_key
        self._embed_model_name = embed_model_name
        self._set_headers()
        self._batch_size = batch_size
        self._num_worker = num_worker
        self._timeout = timeout
        if hasattr(self, '_set_embed_url'): self._set_embed_url()

    @property
    def series(self):
        return self._model_series

    @property
    def type(self):
        return 'EMBED'

    @property
    def batch_size(self):
        return self._batch_size

    @batch_size.setter
    def batch_size(self, value: int):
        self._batch_size = value

    def _set_headers(self) -> Dict[str, str]:
        self._headers = {
            'Content-Type': 'application/json',
            'Authorization': f'Bearer {self._api_key}'
        }

    def forward(self, input: Union[List, str], **kwargs) -> Union[List[float], List[List[float]]]:
        data = self._encapsulated_data(input, **kwargs)
        proxies = {'http': None, 'https': None} if self.NO_PROXY else None
        if isinstance(data, list):
            return self.run_embed_batch(input, data, proxies, **kwargs)
        else:
            with requests.post(self._embed_url, json=data, headers=self._headers, proxies=proxies,
                               timeout=self._timeout) as r:
                if r.status_code == 200:
                    return self._parse_response(r.json(), input=input)
                else:
                    raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

    def _encapsulated_data(self, input: Union[List, str], **kwargs):
        if isinstance(input, str):
            json_data = {
                'input': [input],
                'model': self._embed_model_name
            }
            if len(kwargs) > 0:
                json_data.update(kwargs)
            return json_data
        else:
            text_batch = [input[i: i + self._batch_size] for i in range(0, len(input), self._batch_size)]
            json_data = [{'input': texts, 'model': self._embed_model_name} for texts in text_batch]
            if len(kwargs) > 0:
                for i in range(len(json_data)):
                    json_data[i].update(kwargs)
            return json_data

    def _parse_response(self, response: Dict, input: Union[List, str]) -> Union[List[List[float]], List[float]]:
        data = response.get('data', [])
        if not data:
            raise Exception('no data received')
        if isinstance(input, str):
            return data[0].get('embedding', [])
        else:
            return [res.get('embedding', []) for res in data]

    def run_embed_batch(self, input: List, data: List, proxies, **kwargs):
        """Internal method for executing batch embedding processing.

This method sends the batched embedding requests either sequentially over a single
session (when num_worker is 1) or concurrently through a thread pool. If a request
fails, it halves the batch size, rebuilds the request data, and retries; it raises
requests.RequestException immediately when the batch size is already 1 or the status
code is 401 or 429.

Args:
    input (List): Original input text list
    data (List): Encapsulated batch request data list
    proxies: Proxy settings passed to requests: {'http': None, 'https': None} when NO_PROXY is set, otherwise None
    **kwargs: Additional keyword arguments, re-applied when the request data is rebuilt after a batch-size change

Returns:
    A list of embedding vectors, one per input text, in input order
"""
        ret = [[] for _ in range(len(input))]
        flag = False
        if self._num_worker == 1:
            with requests.Session() as session:
                while not flag:
                    for i in range(len(data)):
                        r = session.post(self._embed_url, json=data[i], headers=self._headers,
                                         proxies=proxies, timeout=self._timeout)
                        if r.status_code == 200:
                            vec = self._parse_response(r.json(), input=input)
                            start = i * self._batch_size
                            ret[start: start + len(vec)] = vec
                            if i == len(data) - 1:
                                flag = True
                        else:
                            error_msg = '\n'.join([c.decode('utf-8') for c in r.iter_content(None)])
                            if self._batch_size == 1 or r.status_code in [401, 429]:
                                raise requests.RequestException(error_msg)
                            else:
                                msg = f'Online embedding:{self._embed_model_name} post failed, adjust batch_size: '
                                msg = msg + f' from {self._batch_size} to {max(self._batch_size // 2, 1)}'
                                LOG.warning(msg)
                                self._batch_size = max(self._batch_size // 2, 1)
                                data = self._encapsulated_data(input, **kwargs)
                                break
        else:
            with ThreadPoolExecutor(max_workers=self._num_worker) as executor:
                while not flag:
                    futures = [executor.submit(requests.post, self._embed_url, json=t, headers=self._headers,
                                               proxies=proxies, timeout=self._timeout) for t in data]
                    fut_to_index = {fut: idx for idx, fut in enumerate(futures)}
                    for fut in as_completed(futures):
                        r = fut.result()
                        i = fut_to_index.pop(fut)
                        if r.status_code == 200:
                            vec = self._parse_response(r.json(), input=input)
                            start = i * self._batch_size
                            ret[start: start + len(vec)] = vec
                            if len(fut_to_index) == 0:
                                flag = True
                        else:
                            wait(futures)
                            error_msg = '\n'.join([c.decode('utf-8') for c in r.iter_content(None)])
                            if self._batch_size == 1 or r.status_code in [401, 429]:
                                raise requests.RequestException(error_msg)
                            else:
                                msg = f'Online embedding:{self._embed_model_name} post failed, adjust batch_size: '
                                msg = msg + f' from {self._batch_size} to {max(self._batch_size // 2, 1)}'
                                LOG.warning(msg)
                                self._batch_size = max(self._batch_size // 2, 1)
                                data = self._encapsulated_data(input, **kwargs)
                                break
        return ret

run_embed_batch(input, data, proxies, **kwargs)

Internal method for executing batch embedding processing.

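This method sends the batched embedding requests either sequentially over a single session (when num_worker is 1) or concurrently through a thread pool. If a request fails, it halves the batch size, rebuilds the request data, and retries; it raises requests.RequestException immediately when the batch size is already 1 or the status code is 401 or 429.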

Parameters:

  • input (List) –

    Original input text list

  • data (List) –

    Encapsulated batch request data list

  • proxies

    Proxy settings passed to requests: {'http': None, 'https': None} when NO_PROXY is set, otherwise None

  • **kwargs

    Additional keyword arguments

Returns:

  • A list of embedding vectors, one per input text, in input order
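
The halve-and-retry behaviour described above can be sketched in isolation. This is a minimal illustration rather than the library's implementation: `embed_with_halving`, `SendFn`, and `fake_send` are hypothetical names, the transport is a stand-in callable instead of an HTTP request, and a plain RuntimeError replaces requests.RequestException.

```python
from typing import Callable, List, Tuple

# Hypothetical transport: takes one batch of texts, returns (status_code, vectors).
SendFn = Callable[[List[str]], Tuple[int, List[List[float]]]]


def embed_with_halving(texts: List[str], batch_size: int, send: SendFn) -> List[List[float]]:
    """Embed texts batch by batch, halving the batch size after a failed request."""
    results: List[List[float]] = [[] for _ in texts]
    done = False
    while not done:
        done = True
        for start in range(0, len(texts), batch_size):
            status, vectors = send(texts[start:start + batch_size])
            if status == 200:
                # Write the vectors back at the position of their source texts.
                results[start:start + len(vectors)] = vectors
                continue
            # Give up when the batch cannot shrink further or the error is auth/rate related.
            if batch_size == 1 or status in (401, 429):
                raise RuntimeError(f'embedding request failed with status {status}')
            batch_size = max(batch_size // 2, 1)
            done = False
            break  # restart from the first batch with the smaller size
    return results


def fake_send(batch: List[str]) -> Tuple[int, List[List[float]]]:
    # Stand-in service (an assumption for this sketch): rejects batches larger than two texts.
    if len(batch) > 2:
        return 413, []
    return 200, [[float(len(text))] for text in batch]


vectors = embed_with_halving(['a', 'bb', 'ccc', 'dddd', 'eeeee'], batch_size=8, send=fake_send)
```

Here the batch size shrinks from 8 to 4 to 2 before every batch succeeds. Restarting from the first batch after each change keeps the index arithmetic simple, at the cost of re-sending batches that had already succeeded.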
Source code in lazyllm/module/llms/onlinemodule/base/onlineEmbeddingModuleBase.py

lazyllm.module.llms.onlinemodule.supplier.doubao.DoubaoEmbedding

Bases: OnlineEmbeddingModuleBase

DoubaoEmbedding inherits from OnlineEmbeddingModuleBase and wraps calls to Doubao's online text embedding service.
Given the service URL, model name, and API key, it retrieves text embeddings from the remote service.

Parameters:

  • embed_url (Optional[str], default: 'https://ark.cn-beijing.volces.com/api/v3/embeddings' ) –

    URL of the Doubao text embedding service, defaulting to the Beijing region endpoint.

  • embed_model_name (Optional[str], default: 'doubao-embedding-text-240715' ) –

    Name of the Doubao embedding model used, default is "doubao-embedding-text-240715".

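  • api_key (Optional[str], default: None ) –

    API key for accessing the Doubao service. If not provided, it is read from lazyllm config ('doubao_api_key').

  • batch_size (int, default: 16 ) –

    Number of texts sent per request. It is halved automatically when a request fails.

  • **kw

    Additional keyword arguments passed to OnlineEmbeddingModuleBase.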
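
To make the request shape concrete, the sketch below shows how a list of texts could be split into per-request JSON bodies. It follows the batching logic in the base class listing, but `build_payloads` is a hypothetical standalone helper, not part of LazyLLM, and no request is actually sent.

```python
from typing import Any, Dict, List, Union


def build_payloads(texts: Union[str, List[str]], model: str, batch_size: int = 16,
                   **extra: Any) -> Union[Dict[str, Any], List[Dict[str, Any]]]:
    """Build one JSON body for a single string, or one body per batch for a list."""
    if isinstance(texts, str):
        # A single string is wrapped in a one-element list.
        return {'input': [texts], 'model': model, **extra}
    # A list is cut into consecutive slices of at most batch_size texts.
    return [{'input': texts[i:i + batch_size], 'model': model, **extra}
            for i in range(0, len(texts), batch_size)]


payloads = build_payloads([f'doc {i}' for i in range(40)], 'doubao-embedding-text-240715')
sizes = [len(p['input']) for p in payloads]  # 40 texts with batch_size=16 give three bodies
single = build_payloads('hello', 'doubao-embedding-text-240715')
```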

Source code in lazyllm/module/llms/onlinemodule/supplier/doubao.py
class DoubaoEmbedding(OnlineEmbeddingModuleBase):
    """DoubaoEmbedding inherits from OnlineEmbeddingModuleBase and wraps calls to Doubao's online text embedding service.
Given the service URL, model name, and API key, it retrieves text embeddings from the remote service.

Args:
    embed_url (Optional[str]): URL of the Doubao text embedding service, defaulting to the Beijing region endpoint.
    embed_model_name (Optional[str]): Name of the Doubao embedding model used, default is "doubao-embedding-text-240715".
    api_key (Optional[str]): API key for accessing the Doubao service. If not provided, it is read from lazyllm config ('doubao_api_key').
    batch_size (int): Number of texts sent per request, default is 16. It is halved automatically when a request fails.
    **kw: Additional keyword arguments passed to OnlineEmbeddingModuleBase.
"""
    def __init__(self,
                 embed_url: str = 'https://ark.cn-beijing.volces.com/api/v3/embeddings',
                 embed_model_name: str = 'doubao-embedding-text-240715',
                 api_key: str = None,
                 batch_size: int = 16,
                 **kw):
        super().__init__('DOUBAO', embed_url, api_key or lazyllm.config['doubao_api_key'], embed_model_name,
                         batch_size=batch_size, **kw)

lazyllm.module.llms.onlinemodule.supplier.doubao.DoubaoMultimodalEmbedding

Bases: OnlineEmbeddingModuleBase

DoubaoMultimodalEmbedding inherits from OnlineEmbeddingModuleBase and wraps calls to Doubao's online multimodal (text + image) embedding service.
Given the service URL, model name, and API key, it converts text and image inputs into a single fused embedding vector. The input is either a string or a list of at most two items (one text and/or one image).

Parameters:

  • embed_url (Optional[str], default: 'https://ark.cn-beijing.volces.com/api/v3/embeddings/multimodal' ) –

    URL of the Doubao multimodal embedding service, defaulting to the Beijing region endpoint.

  • embed_model_name (Optional[str], default: 'doubao-embedding-vision-241215' ) –

    Name of the Doubao multimodal embedding model used, default is "doubao-embedding-vision-241215".

  • api_key (Optional[str], default: None ) –

    API key for accessing the Doubao service. If not provided, it is read from lazyllm config.
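
The input rules for the multimodal request can be sketched as a standalone function. This mirrors the validation shown in the source listing; `build_multimodal_payload` is a hypothetical name, and the shape of the image item (`{'image_url': ...}`) is a placeholder assumption, since the source does not show what an image entry looks like.

```python
from typing import Any, Dict, List, Union


def build_multimodal_payload(item: Union[str, List[Dict[str, Any]]], model: str) -> Dict[str, Any]:
    """Validate the input and wrap it in a single request body."""
    if isinstance(item, str):
        # A bare string becomes a one-element list holding a text entry.
        item = [{'text': item}]
    elif isinstance(item, list):
        if len(item) == 0:
            raise ValueError('Input list cannot be empty')
        if len(item) > 2:
            raise ValueError('Input list must contain at most 2 items (1 text and/or 1 image)')
    else:
        raise ValueError('Input must be either a string or a list of dictionaries')
    return {'input': item, 'model': model}


text_only = build_multimodal_payload('a red bicycle', 'doubao-embedding-vision-241215')
pair = build_multimodal_payload(
    [{'text': 'a red bicycle'}, {'image_url': 'https://example.com/bike.png'}],  # image shape is hypothetical
    'doubao-embedding-vision-241215',
)
```

Unlike the text embedding class, nothing is batched here: the whole input goes into one request, and the response carries one fused vector.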

Source code in lazyllm/module/llms/onlinemodule/supplier/doubao.py
class DoubaoMultimodalEmbedding(OnlineEmbeddingModuleBase):
    """DoubaoMultimodalEmbedding class inherits from OnlineEmbeddingModuleBase, encapsulating the functionality to call Doubao's online multimodal (text + image) embedding service.  
It supports converting text and image inputs into a unified vector representation by specifying the service URL, model name, and API key, enabling remote retrieval of multimodal embeddings.

Args:
    embed_url (Optional[str]): URL of the Doubao multimodal embedding service, defaulting to the Beijing region endpoint.
    embed_model_name (Optional[str]): Name of the Doubao multimodal embedding model used, default is "doubao-embedding-vision-241215".
    api_key (Optional[str]): API key for accessing the Doubao service. If not provided, it is read from lazyllm config.
"""
    def __init__(self,
                 embed_url: str = 'https://ark.cn-beijing.volces.com/api/v3/embeddings/multimodal',
                 embed_model_name: str = 'doubao-embedding-vision-241215',
                 api_key: str = None):
        super().__init__('DOUBAO', embed_url, api_key or lazyllm.config['doubao_api_key'], embed_model_name)

    def _encapsulated_data(self, input: Union[List, str], **kwargs) -> Dict[str, str]:
        if isinstance(input, str):
            input = [{'text': input}]
        elif isinstance(input, list):
            # Validate input format, at most 1 text segment + 1 image
            if len(input) == 0:
                raise ValueError('Input list cannot be empty')
            if len(input) > 2:
                raise ValueError('Input list must contain at most 2 items (1 text and/or 1 image)')
        else:
            raise ValueError('Input must be either a string or a list of dictionaries')

        json_data = {
            'input': input,
            'model': self._embed_model_name
        }
        if len(kwargs) > 0:
            json_data.update(kwargs)

        return json_data

    def _parse_response(self, response: Dict, input: Union[List, str]) -> List[float]:
        # Doubao multimodal embedding returns a single fused embedding
        return response['data']['embedding']

lazyllm.module.llms.onlinemodule.supplier.glm.GLMModule

Bases: OnlineChatModuleBase, FileHandlerBase

GLMModule class inherits from OnlineChatModuleBase and FileHandlerBase, encapsulating the functionality of accessing Zhipu's GLM series models online.
It supports chat generation, file handling, and fine-tuning. The default model is GLM-4, but other trainable models (e.g., chatglm3-6b, chatglm_12b) are also supported.

Parameters:

  • base_url (Optional[str], default: 'https://open.bigmodel.cn/api/paas/v4/' ) –

    API endpoint for Zhipu GLM service, default is "https://open.bigmodel.cn/api/paas/v4/".

  • model (Optional[str], default: None ) –

    Name of the GLM model to use, for example "glm-4" or a model from TRAINABLE_MODEL_LIST. If not provided, it is read from lazyllm config ('glm_model_name'), falling back to "glm-4".

  • api_key (Optional[str], default: None ) –

    API key for accessing GLM service. If not provided, it is read from lazyllm config.

  • stream (Optional[bool], default: True ) –

    Whether to enable streaming output. Defaults to True.

  • return_trace (Optional[bool], default: False ) –

    Whether to return debug trace information. Defaults to False.

  • **kwargs

    Additional optional parameters passed to OnlineChatModuleBase.
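
Before a training file is uploaded for fine-tuning, the JSONL chat data is reduced to the roles the service accepts. The sketch below illustrates that filtering on in-memory lines; `filter_chat_jsonl` is a hypothetical standalone helper, and skipping blank lines is an addition of this sketch rather than behaviour taken from the source.

```python
import json
from typing import Iterable

ALLOWED_ROLES = {'system', 'user', 'assistant'}


def filter_chat_jsonl(lines: Iterable[str]) -> str:
    """Keep only role/content pairs with an allowed role, one JSON object per line."""
    converted = []
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines (an assumption of this sketch)
        example = json.loads(line)
        messages = [
            {'role': message.get('role', ''), 'content': message.get('content', '')}
            for message in example.get('messages', [])
            if message.get('role', '') in ALLOWED_ROLES
        ]
        converted.append(json.dumps({'messages': messages}, ensure_ascii=False))
    return '\n'.join(converted)


raw = [
    '{"messages": [{"role": "user", "content": "hi", "name": "x"}, '
    '{"role": "tool", "content": "ignored"}, '
    '{"role": "assistant", "content": "hello"}]}',
]
converted = filter_chat_jsonl(raw)
```

Messages with other roles are dropped, and extra keys such as `name` are discarded, so each output line contains only `role` and `content`.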

Source code in lazyllm/module/llms/onlinemodule/supplier/glm.py
class GLMModule(OnlineChatModuleBase, FileHandlerBase):
    """GLMModule class inherits from OnlineChatModuleBase and FileHandlerBase, encapsulating the functionality of accessing Zhipu's GLM series models online.  
It supports chat generation, file handling, and fine-tuning. The default model is GLM-4, but other trainable models (e.g., chatglm3-6b, chatglm_12b) are also supported.

Args:
    base_url (Optional[str]): API endpoint for Zhipu GLM service, default is "https://open.bigmodel.cn/api/paas/v4/".
    model (Optional[str]): Name of the GLM model to use, for example "glm-4" or a model from TRAINABLE_MODEL_LIST. If not provided, it is read from lazyllm config ('glm_model_name'), falling back to "glm-4".
    api_key (Optional[str]): API key for accessing GLM service. If not provided, it is read from lazyllm config.
    stream (Optional[bool]): Whether to enable streaming output. Defaults to True.
    return_trace (Optional[bool]): Whether to return debug trace information. Defaults to False.
    **kwargs: Additional optional parameters passed to OnlineChatModuleBase.
"""
    TRAINABLE_MODEL_LIST = ['chatglm3-6b', 'chatglm_12b', 'chatglm_32b', 'chatglm_66b', 'chatglm_130b']
    VLM_MODEL_PREFIX = ['glm-4.5v', 'glm-4.1v', 'glm-4v']
    MODEL_NAME = 'glm-4'

    def __init__(self, base_url: str = 'https://open.bigmodel.cn/api/paas/v4/', model: str = None,
                 api_key: str = None, stream: bool = True, return_trace: bool = False, **kwargs):
        OnlineChatModuleBase.__init__(self, model_series='GLM', api_key=api_key or lazyllm.config['glm_api_key'],
                                      model_name=model or lazyllm.config['glm_model_name'] or GLMModule.MODEL_NAME,
                                      base_url=base_url, stream=stream, return_trace=return_trace, **kwargs)
        FileHandlerBase.__init__(self)
        self.default_train_data = {
            'model': None,
            'training_file': None,
            'validation_file': None,
            'extra_hyperparameters': {
                'fine_tuning_method': None,  # 'lora' or 'full', default: lora
                'fine_tuning_parameters': {
                    'max_sequence_length': None  # [1, 8192](int), default: 8192
                }
            },
            'hyperparameters': {
                'learning_rate_multiplier': 0.01,  # (0,5] , default: 1.0
                'batch_size': None,  # [1, 32], default: 8
                'n_epochs': 1,  # [1, 10], default: 3
            },
            'suffix': None,
            'request_id': None
        }
        self.fine_tuning_job_id = None

    def _get_system_prompt(self):
        return ('You are ChatGLM, an AI assistant developed based on a language model trained by Zhipu AI. '
                'Your task is to provide appropriate responses and support for user\'s questions and requests.')

    def _get_models_list(self):
        return ['glm-4', 'glm-4v', 'glm-3-turbo', 'chatglm-turbo', 'cogview-3', 'embedding-2', 'text-embedding']

    def _convert_file_format(self, filepath: str) -> str:
        with open(filepath, 'r', encoding='utf-8') as fr:
            dataset = [json.loads(line) for line in fr]

        json_strs = []
        for ex in dataset:
            lineEx = {'messages': []}
            messages = ex.get('messages', [])
            for message in messages:
                role = message.get('role', '')
                content = message.get('content', '')
                if role in ['system', 'user', 'assistant']:
                    lineEx['messages'].append({'role': role, 'content': content})
            json_strs.append(json.dumps(lineEx, ensure_ascii=False))

        return '\n'.join(json_strs)

    def _upload_train_file(self, train_file):
        headers = {
            'Authorization': 'Bearer ' + self._api_key
        }

        url = urljoin(self._base_url, 'files')
        self.get_finetune_data(train_file)

        file_object = {
            'purpose': (None, 'fine-tune', None),
            'file': (os.path.basename(train_file), self._dataHandler, 'application/json')
        }

        with requests.post(url, headers=headers, files=file_object) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            # delete temporary training file
            self._dataHandler.close()
            return r.json()['id']

    def _update_kw(self, data, normal_config):
        cur_data = self.default_train_data.copy()
        cur_data.update(data)

        cur_data['extra_hyperparameters']['fine_tuning_method'] = normal_config['finetuning_type'].strip().lower()
        cur_data['extra_hyperparameters']['fine_tuning_parameters']['max_sequence_length'] = normal_config['cutoff_len']
        cur_data['hyperparameters']['learning_rate_multiplier'] = normal_config['learning_rate']
        cur_data['hyperparameters']['batch_size'] = normal_config['batch_size']
        cur_data['hyperparameters']['n_epochs'] = normal_config['num_epochs']
        cur_data['suffix'] = str(uuid.uuid4())[:7]
        return cur_data

    def _create_finetuning_job(self, train_model, train_file_id, **kw) -> Tuple[str, str]:
        url = urljoin(self._base_url, 'fine_tuning/jobs')
        headers = {
            'Content-Type': 'application/json',
            'Authorization': f'Bearer {self._api_key}',
        }
        data = {
            'model': train_model,
            'training_file': train_file_id
        }
        if len(kw) > 0:
            if 'finetuning_type' in kw:
                data = self._update_kw(data, kw)
            else:
                data.update(kw)

        with requests.post(url, headers=headers, json=data) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            fine_tuning_job_id = r.json()['id']
            self.fine_tuning_job_id = fine_tuning_job_id
            status = self._status_mapping(r.json()['status'])
            return (fine_tuning_job_id, status)

    def _cancel_finetuning_job(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            return 'Invalid'
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        fine_tune_url = os.path.join(self._base_url, f'fine_tuning/jobs/{job_id}/cancel')
        headers = {
            'Content-Type': 'application/json',
            'Authorization': f'Bearer {self._api_key}',
        }
        with requests.post(fine_tune_url, headers=headers) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        status = r.json()['status']
        if status == 'cancelled':
            return 'Cancelled'
        else:
            return f'JOB {job_id} status: {status}'

    def _query_finetuned_jobs(self):
        fine_tune_url = os.path.join(self._base_url, 'fine_tuning/jobs/')
        headers = {
            'Authorization': f'Bearer {self._api_key}'
        }
        with requests.get(fine_tune_url, headers=headers) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return r.json()

    def _get_finetuned_model_names(self) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
        model_data = self._query_finetuned_jobs()
        res = list()
        for model in model_data['data']:
            res.append([model['id'], model['fine_tuned_model'], self._status_mapping(model['status'])])
        return res

    def _status_mapping(self, status):
        if status == 'succeeded':
            return 'Done'
        elif status == 'failed':
            return 'Failed'
        elif status == 'cancelled':
            return 'Cancelled'
        elif status == 'running':
            return 'Running'
        else:  # create, validating_files, queued
            return 'Pending'

    def _query_job_status(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            raise RuntimeError('No job ID specified. Please ensure that a valid "fine_tuning_job_id" is '
                               'provided as an argument or started a training job.')
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        _, status = self._query_finetuning_job(job_id)
        return self._status_mapping(status)

    def _get_log(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            raise RuntimeError('No job ID specified. Please ensure that a valid "fine_tuning_job_id" is '
                               'provided as an argument or started a training job.')
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        fine_tune_url = os.path.join(self._base_url, f'fine_tuning/jobs/{job_id}/events')
        headers = {
            'Authorization': f'Bearer {self._api_key}'
        }
        with requests.get(fine_tune_url, headers=headers) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return job_id, r.json()

    def _get_curr_job_model_id(self):
        if not self.fine_tuning_job_id:
            return None, None
        model_id, _ = self._query_finetuning_job(self.fine_tuning_job_id)
        return self.fine_tuning_job_id, model_id

    def _query_finetuning_job_info(self, fine_tuning_job_id):
        fine_tune_url = os.path.join(self._base_url, f'fine_tuning/jobs/{fine_tuning_job_id}')
        headers = {
            'Authorization': f'Bearer {self._api_key}'
        }
        with requests.get(fine_tune_url, headers=headers) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return r.json()

    def _query_finetuning_job(self, fine_tuning_job_id) -> Tuple[str, str]:
        info = self._query_finetuning_job_info(fine_tuning_job_id)
        status = info['status']
        fine_tuned_model = info['fine_tuned_model'] if 'fine_tuned_model' in info else None
        return (fine_tuned_model, status)

    def _query_finetuning_cost(self, fine_tuning_job_id):
        info = self._query_finetuning_job_info(fine_tuning_job_id)
        if 'trained_tokens' in info and info['trained_tokens']:
            return info['trained_tokens']
        else:
            return None

    def _create_deployment(self) -> Tuple[str, str]:
        return (self._model_name, 'RUNNING')

    def _query_deployment(self, deployment_id) -> str:
        return 'RUNNING'

lazyllm.module.llms.onlinemodule.supplier.glm.GLMTextToImageModule

Bases: GLMMultiModal

GLM Text-to-Image module, inheriting from GLMMultiModal, encapsulates the functionality to generate images using the GLM CogView-4 model.
It supports generating a specified number of images with given resolution based on a text prompt and can call the remote service via an API key.

Parameters:

  • model_name (Optional[str], default: None ) –

    Name of the GLM model to use, defaulting to "cogview-4-250304" or the 'glm_text_to_image_model_name' in config.

  • api_key (Optional[str], default: None ) –

    API key to access the GLM image generation service.

  • return_trace (bool, default: False ) –

    Whether to return debug trace information, default is False.

  • **kwargs

    Additional parameters passed to GLMMultiModal.
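
As a rough illustration of what is sent for one generation call, the sketch below assembles the request parameters the same way the `_forward` method in the listing does. `build_image_request` is a hypothetical helper, and `user_id` is an invented extra key used only to show how additional keyword arguments are merged; no client call is made.

```python
from typing import Any, Dict


def build_image_request(model: str, prompt: str, n: int = 1, size: str = '1024x1024',
                        **extra: Any) -> Dict[str, Any]:
    """Collect the text-to-image parameters into one dict; extra keys are merged last."""
    return {'model': model, 'prompt': prompt, 'n': n, 'size': size, **extra}


request = build_image_request('cogview-4-250304', 'a lighthouse at dusk', n=2,
                              user_id='demo')  # user_id is a hypothetical extra parameter
```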

Source code in lazyllm/module/llms/onlinemodule/supplier/glm.py
class GLMTextToImageModule(GLMMultiModal):
    """GLM Text-to-Image module, inheriting from GLMMultiModal, encapsulates the functionality to generate images using the GLM CogView-4 model.  
It supports generating a specified number of images with given resolution based on a text prompt and can call the remote service via an API key.

Args:
    model_name (Optional[str]): Name of the GLM model to use, defaulting to "cogview-4-250304" or the 'glm_text_to_image_model_name' in config.
    api_key (Optional[str]): API key to access the GLM image generation service.
    return_trace (bool): Whether to return debug trace information, default is False.
    **kwargs: Additional parameters passed to GLMMultiModal.
"""
    MODEL_NAME = 'cogview-4-250304'

    def __init__(self, model_name: str = None, api_key: str = None, return_trace: bool = False, **kwargs):
        # Check the config before the class default; otherwise the config value is never reached.
        GLMMultiModal.__init__(self, model_name=model_name or lazyllm.config['glm_text_to_image_model_name']
                               or GLMTextToImageModule.MODEL_NAME, api_key=api_key,
                               return_trace=return_trace, **kwargs)

    def _forward(self, input: str = None, n: int = 1, size: str = '1024x1024', **kwargs):
        call_params = {
            'model': self._model_name,
            'prompt': input,
            'n': n,
            'size': size,
            **kwargs
        }
        response = self._client.images.generations(**call_params)
        return encode_query_with_filepaths(None, bytes_to_file([requests.get(result.url).content
                                                                for result in response.data]))

lazyllm.module.llms.onlinemodule.supplier.qwen.QwenTextToImageModule

Bases: QwenMultiModal

Qwen Text-to-Image module, inheriting from QwenMultiModal, encapsulates the functionality to generate images using the Qwen Wanx2.1-t2i-turbo model.
It supports generating a specified number of images with given resolution based on a text prompt, and allows setting negative prompts, random seeds, and prompt extension. The service is called remotely via DashScope API.

Parameters:

  • model (Optional[str], default: None ) –

    Name of the Qwen model to use, default is taken from config 'qwen_text2image_model_name', or "wanx2.1-t2i-turbo" if not set.

  • api_key (Optional[str], default: None ) –

    API key for accessing DashScope service.

  • return_trace (bool, default: False ) –

    Whether to return debug trace information, default is False.

  • **kwargs

    Additional parameters passed to QwenMultiModal.
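
The model name is resolved through a three-step fallback: the explicit argument, then the config entry, then the class default. A minimal sketch of that chain follows, with a plain dict standing in for `lazyllm.config` and `resolve_model` as a hypothetical helper name.

```python
from typing import Mapping, Optional

DEFAULT_MODEL = 'wanx2.1-t2i-turbo'


def resolve_model(model: Optional[str], config: Mapping[str, Optional[str]]) -> str:
    """Return the first non-empty value among the argument, the config entry, and the default."""
    return model or config.get('qwen_text2image_model_name') or DEFAULT_MODEL


from_argument = resolve_model('my-model', {'qwen_text2image_model_name': 'cfg-model'})
from_config = resolve_model(None, {'qwen_text2image_model_name': 'cfg-model'})
from_default = resolve_model(None, {'qwen_text2image_model_name': ''})
```

Because the chain uses `or`, an empty string in the config is treated the same as a missing value.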

Source code in lazyllm/module/llms/onlinemodule/supplier/qwen.py
class QwenTextToImageModule(QwenMultiModal):
    """Qwen Text-to-Image module, inheriting from QwenMultiModal, encapsulates the functionality to generate images using the Qwen Wanx2.1-t2i-turbo model.  
It supports generating a specified number of images with given resolution based on a text prompt, and allows setting negative prompts, random seeds, and prompt extension. The service is called remotely via DashScope API.

Args:
    model (Optional[str]): Name of the Qwen model to use, default is taken from config 'qwen_text2image_model_name', or "wanx2.1-t2i-turbo" if not set.
    api_key (Optional[str]): API key for accessing DashScope service.
    return_trace (bool): Whether to return debug trace information, default is False.
    **kwargs: Additional parameters passed to QwenMultiModal.
"""
    MODEL_NAME = 'wanx2.1-t2i-turbo'

    def __init__(self, model: str = None, api_key: str = None, return_trace: bool = False, **kwargs):
        QwenMultiModal.__init__(self, api_key=api_key,
                                model_name=model or lazyllm.config['qwen_text2image_model_name']
                                or QwenTextToImageModule.MODEL_NAME, return_trace=return_trace, **kwargs)

    def _forward(self, input: str = None, negative_prompt: str = None, n: int = 1, prompt_extend: bool = True,
                 size: str = '1024*1024', seed: int = None, **kwargs):
        call_params = {
            'model': self._model_name,
            'prompt': input,
            'negative_prompt': negative_prompt,
            'n': n,
            'prompt_extend': prompt_extend,
            'size': size,
            **kwargs
        }
        if self._api_key: call_params['api_key'] = self._api_key
        if seed: call_params['seed'] = seed
        task_response = dashscope.ImageSynthesis.async_call(**call_params)
        response = dashscope.ImageSynthesis.wait(task=task_response.output.task_id, api_key=self._api_key)
        if response.status_code == HTTPStatus.OK:
            return encode_query_with_filepaths(None, bytes_to_file([requests.get(result.url).content
                                                                    for result in response.output.results]))
        else:
            lazyllm.LOG.error(f'failed to generate image: {response.output}')
            raise Exception(f'failed to generate image: {response.output.message}')

lazyllm.module.llms.onlinemodule.supplier.kimi.KimiModule

Bases: OnlineChatModuleBase

KimiModule class, inheriting from OnlineChatModuleBase, encapsulates the functionality to call Kimi chat service provided by Moonshot AI.
By specifying the API key, model name, and service URL, it supports safe and accurate Chinese and English Q&A interactions, as well as image input in base64 format.

Parameters:

  • base_url (str, default: 'https://api.moonshot.cn/' ) –

    Base URL of the Kimi service, default is "https://api.moonshot.cn/".

  • model (str, default: 'moonshot-v1-8k' ) –

    Kimi model name to use, default is "moonshot-v1-8k".

  • api_key (Optional[str], default: None ) –

    API key for accessing Kimi service. If not provided, it is read from lazyllm config.

  • stream (bool, default: True ) –

    Whether to enable streaming output, default is True.

  • return_trace (bool, default: False ) –

    Whether to return debug trace information, default is False.

  • **kwargs

    Additional parameters passed to OnlineChatModuleBase.
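
Image input has to arrive as base64 text together with a MIME type, which the module wraps into a data URL. The sketch below reproduces that wrapping outside LazyLLM, assuming a hypothetical function name `format_image_part` and raising `ValueError` where the listing uses assertions.

```python
import base64
from typing import Dict, List, Optional


def format_image_part(image_b64: str, mime: Optional[str]) -> List[Dict]:
    """Hypothetical stand-in for _format_vl_chat_image_url."""
    # Remote URLs are rejected; only inline base64 data is accepted.
    if image_b64.startswith('http'):
        raise ValueError('only base64 image data is accepted')
    if mime is None:
        raise ValueError('a MIME type is required')
    url = f'data:{mime};base64,{image_b64}'
    return [{'type': 'image_url', 'image_url': {'url': url}}]


# Placeholder bytes stand in for real image content.
encoded = base64.b64encode(b'placeholder-image-bytes').decode('ascii')
part = format_image_part(encoded, 'image/png')
print(part[0]['image_url']['url'][:22])  # data:image/png;base64,
```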

Source code in lazyllm/module/llms/onlinemodule/supplier/kimi.py
class KimiModule(OnlineChatModuleBase):
    """KimiModule class, inheriting from OnlineChatModuleBase, encapsulates the functionality to call Kimi chat service provided by Moonshot AI.  
By specifying the API key, model name, and service URL, it supports safe and accurate Chinese and English Q&A interactions, as well as image input in base64 format.

Args:
    base_url (str): Base URL of the Kimi service, default is "https://api.moonshot.cn/".
    model (str): Kimi model name to use, default is "moonshot-v1-8k".
    api_key (Optional[str]): API key for accessing Kimi service. If not provided, it is read from lazyllm config.
    stream (bool): Whether to enable streaming output, default is True.
    return_trace (bool): Whether to return debug trace information, default is False.
    **kwargs: Additional parameters passed to OnlineChatModuleBase.
"""

    def __init__(self, base_url: str = 'https://api.moonshot.cn/', model: str = 'moonshot-v1-8k',
                 api_key: str = None, stream: bool = True, return_trace: bool = False, **kwargs):

        super().__init__(model_series='KIMI', api_key=api_key or lazyllm.config['kimi_api_key'], base_url=base_url,
                         model_name=model, stream=stream, return_trace=return_trace, **kwargs)

    def _get_system_prompt(self):
        return ('You are Kimi, an AI assistant provided by Moonshot AI. You are better at speaking '
                'Chinese and English. You will provide users with safe, helpful, and accurate answers. '
                'At the same time, you will reject all answers involving terrorism, racial discrimination, '
                'pornographic violence, etc. Moonshot AI is a proper noun and cannot be translated '
                'into other languages.')

    def _set_chat_url(self):
        self._url = urljoin(self._base_url, 'v1/chat/completions')

    def _format_vl_chat_image_url(self, image_url, mime):
        assert not image_url.startswith('http'), 'Kimi vision model only supports base64 format'
        assert mime is not None, 'Kimi Module requires mime info.'
        image_url = f'data:{mime};base64,{image_url}'
        return [{'type': 'image_url', 'image_url': {'url': image_url}}]

    def _format_vl_chat_query(self, query: str):
        return query

    def _validate_api_key(self):
        try:
            models_url = urljoin(self._base_url, 'v1/models')
            headers = {
                'Authorization': f'Bearer {self._api_key}',
                'Content-Type': 'application/json'
            }
            response = requests.get(models_url, headers=headers, timeout=10)
            return response.status_code == 200
        except Exception:
            return False

lazyllm.module.llms.onlinemodule.fileHandler.FileHandlerBase

FileHandlerBase is a base class for handling fine-tuning data files, mainly used for validating and converting fine-tuning data formats.
This class cannot be instantiated directly; it must be inherited by a subclass that implements specific file format conversion logic.

Capabilities include
  1. Validate that the fine-tuning data file is in standard .jsonl format.
  2. Check that each data entry contains messages in the correct format (with role and content fields).
  3. Verify that roles are within the allowed range (system, knowledge, user, assistant).
  4. Ensure each conversation example contains at least one assistant response.
  5. Provide temporary file storage for further processing.
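
The checks in items 2 to 4 can be sketched as a standalone function. This is a simplified illustration rather than the library's code: `check_examples` and `ALLOWED_ROLES` are hypothetical names, and the error keys follow those used in the `_validate_json` listing further down.

```python
from collections import defaultdict
from typing import Dict, List

ALLOWED_ROLES = ('system', 'knowledge', 'user', 'assistant')


def check_examples(dataset: List[dict]) -> Dict[str, List[int]]:
    """Map each error type to the 1-based indices of the offending examples."""
    errors: Dict[str, List[int]] = defaultdict(list)
    for index, example in enumerate(dataset, start=1):
        if not isinstance(example, dict):
            errors['data_type'].append(index)
            continue
        messages = example.get('messages')
        if messages is None:
            errors['missing_messages_list'].append(index)
            continue
        for message in messages:
            if 'role' not in message or 'content' not in message:
                errors['message_missing_key'].append(index)
            if message.get('role') not in ALLOWED_ROLES:
                errors['unrecognized_role'].append(index)
            if not isinstance(message.get('content'), str):
                errors['missing_content'].append(index)
        # Every example needs at least one assistant turn.
        if not any(m.get('role') == 'assistant' for m in messages):
            errors['example_missing_assistant_message'].append(index)
    return dict(errors)


dataset = [
    {'messages': [{'role': 'user', 'content': 'Hello'}, {'role': 'assistant', 'content': 'Hi!'}]},
    {'messages': [{'role': 'user', 'content': 'No reply here'}]},
    {'messages': [{'role': 'robot', 'content': 'Beep'}, {'role': 'assistant', 'content': 'Ok'}]},
]
print(check_examples(dataset))
# {'example_missing_assistant_message': [2], 'unrecognized_role': [3]}
```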

Examples:

>>> import lazyllm
>>> from lazyllm.module.llms.onlinemodule.fileHandler import FileHandlerBase
>>> import tempfile
>>> import json
>>> sample_data = [
...     {"messages": [{"role": "user", "content": "Hello"}, {"role": "assistant", "content": "Hi there!"}]},
...     {"messages": [{"role": "user", "content": "How are you?"}, {"role": "assistant", "content": "I'm doing well, thank you!"}]}
... ] 
>>> with tempfile.NamedTemporaryFile(mode='w', suffix='.jsonl', delete=False) as f:
...     for item in sample_data:
...         f.write(json.dumps(item, ensure_ascii=False) + '\n')
...     temp_file_path = f.name
>>> class CustomFileHandler(FileHandlerBase):
...     def _convert_file_format(self, filepath: str) -> str:
...         with open(filepath, 'r', encoding='utf-8') as f:
...             data = [json.loads(line) for line in f]
...         converted_data = []
...         for item in data:
...             messages = item.get('messages', [])
...             conversation = []
...             for msg in messages:
...                 conversation.append(f"{msg['role']}: {msg['content']}")
...             converted_data.append('\n'.join(conversation))
...         return '\n---\n'.join(converted_data)
>>> handler = CustomFileHandler()
>>> try:
...     result = handler.get_finetune_data(temp_file_path)
...     print("Data validation and conversion succeeded")
... except Exception as e:
...     print(f"Error: {e}")
... finally:
...     import os
...     os.unlink(temp_file_path)
Source code in lazyllm/module/llms/onlinemodule/fileHandler.py
class FileHandlerBase:
    """FileHandlerBase is a base class for handling fine-tuning data files, mainly used for validating and converting fine-tuning data formats.  
This class cannot be instantiated directly; it must be inherited by a subclass that implements specific file format conversion logic.

Capabilities include:
    1. Validate that the fine-tuning data file is in standard `.jsonl` format.
    2. Check that each data entry contains messages in the correct format (with `role` and `content` fields).
    3. Verify that roles are within the allowed range (system, knowledge, user, assistant).
    4. Ensure each conversation example contains at least one assistant response.
    5. Provide temporary file storage for further processing.


Examples:
    >>> import lazyllm
    >>> from lazyllm.module.llms.onlinemodule.fileHandler import FileHandlerBase
    >>> import tempfile
    >>> import json
    >>> sample_data = [
    ...     {"messages": [{"role": "user", "content": "Hello"}, {"role": "assistant", "content": "Hi there!"}]},
    ...     {"messages": [{"role": "user", "content": "How are you?"}, {"role": "assistant", "content": "I'm doing well, thank you!"}]}
    ... ] 
    >>> with tempfile.NamedTemporaryFile(mode='w', suffix='.jsonl', delete=False) as f:
    ...     for item in sample_data:
    ...         f.write(json.dumps(item, ensure_ascii=False) + '\n')
    ...     temp_file_path = f.name
    >>> class CustomFileHandler(FileHandlerBase):
    ...     def _convert_file_format(self, filepath: str) -> str:
    ...         with open(filepath, 'r', encoding='utf-8') as f:
    ...             data = [json.loads(line) for line in f]
    ...         converted_data = []
    ...         for item in data:
    ...             messages = item.get('messages', [])
    ...             conversation = []
    ...             for msg in messages:
    ...                 conversation.append(f"{msg['role']}: {msg['content']}")
    ...             converted_data.append('\n'.join(conversation))
    ...         return '\n---\n'.join(converted_data)
    >>> handler = CustomFileHandler()
    >>> try:
    ...     result = handler.get_finetune_data(temp_file_path)
    ...     print("Data validation and conversion succeeded")
    ... except Exception as e:
    ...     print(f"Error: {e}")
    ... finally:
    ...     import os
    ...     os.unlink(temp_file_path)
    """

    def __init__(self):
        self._roles = ['system', 'knowledge', 'user', 'assistant']

    def _validate_json(self, data_path: str) -> None:  # noqa C901
        # Check that the file name has the .jsonl extension
        if os.path.splitext(data_path)[-1] != '.jsonl':
            raise ValueError('The file name must end with .jsonl')
        # Check if the file exists
        if not os.path.exists(data_path):
            raise FileNotFoundError(f'File {data_path} does not exist.')

        # Load dataset
        with open(data_path, 'r', encoding='utf-8') as f:
            dataset = [json.loads(line) for line in f]

        # Initial dataset stats
        lazyllm.LOG.info(f'Num examples: {len(dataset)}')
        lazyllm.LOG.info('First example:')
        for message in dataset[0]['messages']:
            lazyllm.LOG.info(message)

        # Format error checks
        format_error: Dict[str, list[int]] = defaultdict(list)
        for index, line in enumerate(dataset, start=1):
            # Check if example is a dictionary type
            if not isinstance(line, dict):
                format_error['data_type'].append(index)
                continue

            messages = line.get('messages', None)
            # Check if messages keyword exists
            if messages is None:
                format_error['missing_messages_list'].append(index)
                continue

            for message in messages:
                if 'role' not in message or 'content' not in message:
                    format_error['message_missing_key'].append(index)

                if any(k not in ('role', 'content') for k in message):
                    format_error['message_unrecognized_key'].append(index)

                if message.get('role', None) not in self._roles:
                    format_error['unrecognized_role'].append(index)

                content = message.get('content', None)
                if content is None or not isinstance(content, str):
                    format_error['missing_content'].append(index)

            if not any(message.get('role', None) == 'assistant' for message in messages):
                format_error['example_missing_assistant_message'].append(index)

        if format_error:
            lazyllm.LOG.error('Found errors: ')
            for k, v in format_error.items():
                lazyllm.LOG.error(f'Error Type: {k}, Error number: {len(v)}')
                lazyllm.LOG.error(f'Error Type: {k}, Error line number: {v}')
        else:
            lazyllm.LOG.info('No errors found')

    def get_finetune_data(self, filepath: str) -> str:
        """Get and process fine-tuning data files, including validating file format and converting to the format supported by the target platform.

Args:
    filepath (str): Path to the fine-tuning data file, must be in .jsonl format
"""
        self._validate_json(filepath)
        self._save_tempfile(self._convert_file_format(filepath))

    def _save_tempfile(self, data: str):
        self._dataHandler = tempfile.TemporaryFile()
        self._dataHandler.write(data.encode())
        self._dataHandler.seek(0)

    def _convert_file_format(self, filepath: str) -> str:
        raise NotImplementedError

get_finetune_data(filepath)

Validates a fine-tuning data file and converts it to the format supported by the target platform. The converted data is written to a temporary file held in self._dataHandler; as the source below shows, the method does not return it.

Parameters:

  • filepath (str) –

    Path to the fine-tuning data file, must be in .jsonl format
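
Since the converted data ends up in a temporary file rather than in a return value, callers read it back through the file handle. The sketch below walks through the convert, write, and rewind steps in isolation. `convert` and `save_tempfile` are hypothetical stand-ins for `_convert_file_format` and `_save_tempfile`, and the one-JSON-object-per-line output is only an assumed target format.

```python
import json
import tempfile
from typing import IO, List


def convert(records: List[dict]) -> str:
    """Hypothetical conversion step: one JSON object per line."""
    return '\n'.join(json.dumps(r, ensure_ascii=False) for r in records)


def save_tempfile(data: str) -> IO[bytes]:
    # Same three steps as _save_tempfile: create, write encoded bytes, rewind.
    handle = tempfile.TemporaryFile()
    handle.write(data.encode())
    handle.seek(0)
    return handle


handle = save_tempfile(convert([{'messages': [{'role': 'user', 'content': 'Hi'}]}]))
first_line = handle.read().decode().splitlines()[0]
handle.close()  # closing a TemporaryFile also removes it
print(json.loads(first_line)['messages'][0]['content'])  # Hi
```

Rewinding with `seek(0)` matters here: without it, a later reader (for example an upload) would start at the end of the file and see no data.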

Source code in lazyllm/module/llms/onlinemodule/fileHandler.py
    def get_finetune_data(self, filepath: str) -> str:
        """Get and process fine-tuning data files, including validating file format and converting to the format supported by the target platform.

Args:
    filepath (str): Path to the fine-tuning data file, must be in .jsonl format
"""
        self._validate_json(filepath)
        self._save_tempfile(self._convert_file_format(filepath))

lazyllm.module.llms.onlinemodule.supplier.glm.GLMModule

Bases: OnlineChatModuleBase, FileHandlerBase

GLMModule class inherits from OnlineChatModuleBase and FileHandlerBase, encapsulating the functionality of accessing Zhipu's GLM series models online.
It supports chat generation, file handling, and fine-tuning. The default model is GLM-4, but other trainable models (e.g., chatglm3-6b, chatglm_12b) are also supported.

Parameters:

  • base_url (Optional[str], default: 'https://open.bigmodel.cn/api/paas/v4/' ) –

    API endpoint for Zhipu GLM service, default is "https://open.bigmodel.cn/api/paas/v4/".

  • model (Optional[str], default: None ) –

    Name of the GLM model to use. If omitted, lazyllm.config['glm_model_name'] is used, falling back to "glm-4". Names from TRAINABLE_MODEL_LIST are also accepted.

  • api_key (Optional[str], default: None ) –

    API key for accessing GLM service. If not provided, it is read from lazyllm config.

  • stream (Optional[bool], default: True ) –

    Whether to enable streaming output. Defaults to True.

  • return_trace (Optional[bool], default: False ) –

    Whether to return debug trace information. Defaults to False.

  • **kwargs

    Additional optional parameters passed to OnlineChatModuleBase.
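
The source listing for this class builds endpoints such as `files` and `fine_tuning/jobs` with `urljoin(self._base_url, ...)`. `urljoin` resolves a relative path against everything up to the last slash of the base, so the trailing slash in the default `base_url` is significant. A short demonstration using only the standard library:

```python
from urllib.parse import urljoin

with_slash = 'https://open.bigmodel.cn/api/paas/v4/'
without_slash = 'https://open.bigmodel.cn/api/paas/v4'

print(urljoin(with_slash, 'fine_tuning/jobs'))
# https://open.bigmodel.cn/api/paas/v4/fine_tuning/jobs

# Without the trailing slash, the last path segment ("v4") is replaced.
print(urljoin(without_slash, 'fine_tuning/jobs'))
# https://open.bigmodel.cn/api/paas/fine_tuning/jobs
```

When overriding `base_url`, keeping the trailing slash is therefore the safer choice.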

Source code in lazyllm/module/llms/onlinemodule/supplier/glm.py
class GLMModule(OnlineChatModuleBase, FileHandlerBase):
    """GLMModule class inherits from OnlineChatModuleBase and FileHandlerBase, encapsulating the functionality of accessing Zhipu's GLM series models online.  
It supports chat generation, file handling, and fine-tuning. The default model is GLM-4, but other trainable models (e.g., chatglm3-6b, chatglm_12b) are also supported.

Args:
    base_url (Optional[str]): API endpoint for Zhipu GLM service, default is "https://open.bigmodel.cn/api/paas/v4/".
    model (Optional[str]): Name of the GLM model to use. If omitted, lazyllm.config['glm_model_name'] is used, falling back to "glm-4". Names from TRAINABLE_MODEL_LIST are also accepted.
    api_key (Optional[str]): API key for accessing GLM service. If not provided, it is read from lazyllm config.
    stream (Optional[bool]): Whether to enable streaming output. Defaults to True.
    return_trace (Optional[bool]): Whether to return debug trace information. Defaults to False.
    **kwargs: Additional optional parameters passed to OnlineChatModuleBase.
"""
    TRAINABLE_MODEL_LIST = ['chatglm3-6b', 'chatglm_12b', 'chatglm_32b', 'chatglm_66b', 'chatglm_130b']
    VLM_MODEL_PREFIX = ['glm-4.5v', 'glm-4.1v', 'glm-4v']
    MODEL_NAME = 'glm-4'

    def __init__(self, base_url: str = 'https://open.bigmodel.cn/api/paas/v4/', model: str = None,
                 api_key: str = None, stream: bool = True, return_trace: bool = False, **kwargs):
        OnlineChatModuleBase.__init__(self, model_series='GLM', api_key=api_key or lazyllm.config['glm_api_key'],
                                      model_name=model or lazyllm.config['glm_model_name'] or GLMModule.MODEL_NAME,
                                      base_url=base_url, stream=stream, return_trace=return_trace, **kwargs)
        FileHandlerBase.__init__(self)
        self.default_train_data = {
            'model': None,
            'training_file': None,
            'validation_file': None,
            'extra_hyperparameters': {
                'fine_tuning_method': None,  # lora\full, default: lora,
                'fine_tuning_parameters': {
                    'max_sequence_length': None  # [1, 8192](int), default: 8192
                }
            },
            'hyperparameters': {
                'learning_rate_multiplier': 0.01,  # (0,5] , default: 1.0
                'batch_size': None,  # [1, 32], default: 8
                'n_epochs': 1,  # [1, 10], default: 3
            },
            'suffix': None,
            'request_id': None
        }
        self.fine_tuning_job_id = None

    def _get_system_prompt(self):
        return ('You are ChatGLM, an AI assistant developed based on a language model trained by Zhipu AI. '
                'Your task is to provide appropriate responses and support for user\'s questions and requests.')

    def _get_models_list(self):
        return ['glm-4', 'glm-4v', 'glm-3-turbo', 'chatglm-turbo', 'cogview-3', 'embedding-2', 'text-embedding']

    def _convert_file_format(self, filepath: str) -> str:
        with open(filepath, 'r', encoding='utf-8') as fr:
            dataset = [json.loads(line) for line in fr]

        json_strs = []
        for ex in dataset:
            lineEx = {'messages': []}
            messages = ex.get('messages', [])
            for message in messages:
                role = message.get('role', '')
                content = message.get('content', '')
                if role in ['system', 'user', 'assistant']:
                    lineEx['messages'].append({'role': role, 'content': content})
            json_strs.append(json.dumps(lineEx, ensure_ascii=False))

        return '\n'.join(json_strs)

    def _upload_train_file(self, train_file):
        headers = {
            'Authorization': 'Bearer ' + self._api_key
        }

        url = urljoin(self._base_url, 'files')
        self.get_finetune_data(train_file)

        file_object = {
            'purpose': (None, 'fine-tune', None),
            'file': (os.path.basename(train_file), self._dataHandler, 'application/json')
        }

        with requests.post(url, headers=headers, files=file_object) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            # delete temporary training file
            self._dataHandler.close()
            return r.json()['id']

    def _update_kw(self, data, normal_config):
        cur_data = self.default_train_data.copy()
        cur_data.update(data)

        cur_data['extra_hyperparameters']['fine_tuning_method'] = normal_config['finetuning_type'].strip().lower()
        cur_data['extra_hyperparameters']['fine_tuning_parameters']['max_sequence_length'] = normal_config['cutoff_len']
        cur_data['hyperparameters']['learning_rate_multiplier'] = normal_config['learning_rate']
        cur_data['hyperparameters']['batch_size'] = normal_config['batch_size']
        cur_data['hyperparameters']['n_epochs'] = normal_config['num_epochs']
        cur_data['suffix'] = str(uuid.uuid4())[:7]
        return cur_data

    def _create_finetuning_job(self, train_model, train_file_id, **kw) -> Tuple[str, str]:
        url = urljoin(self._base_url, 'fine_tuning/jobs')
        headers = {
            'Content-Type': 'application/json',
            'Authorization': f'Bearer {self._api_key}',
        }
        data = {
            'model': train_model,
            'training_file': train_file_id
        }
        if len(kw) > 0:
            if 'finetuning_type' in kw:
                data = self._update_kw(data, kw)
            else:
                data.update(kw)

        with requests.post(url, headers=headers, json=data) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            fine_tuning_job_id = r.json()['id']
            self.fine_tuning_job_id = fine_tuning_job_id
            status = self._status_mapping(r.json()['status'])
            return (fine_tuning_job_id, status)

    def _cancel_finetuning_job(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            return 'Invalid'
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        fine_tune_url = os.path.join(self._base_url, f'fine_tuning/jobs/{job_id}/cancel')
        headers = {
            'Content-Type': 'application/json',
            'Authorization': f'Bearer {self._api_key}',
        }
        with requests.post(fine_tune_url, headers=headers) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        status = r.json()['status']
        if status == 'cancelled':
            return 'Cancelled'
        else:
            return f'JOB {job_id} status: {status}'

    def _query_finetuned_jobs(self):
        fine_tune_url = os.path.join(self._base_url, 'fine_tuning/jobs/')
        headers = {
            'Authorization': f'Bearer {self._api_key}'
        }
        with requests.get(fine_tune_url, headers=headers) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return r.json()

    def _get_finetuned_model_names(self) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
        model_data = self._query_finetuned_jobs()
        res = list()
        for model in model_data['data']:
            res.append([model['id'], model['fine_tuned_model'], self._status_mapping(model['status'])])
        return res

    def _status_mapping(self, status):
        if status == 'succeeded':
            return 'Done'
        elif status == 'failed':
            return 'Failed'
        elif status == 'cancelled':
            return 'Cancelled'
        elif status == 'running':
            return 'Running'
        else:  # create, validating_files, queued
            return 'Pending'

    def _query_job_status(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            raise RuntimeError('No job ID specified. Please ensure that a valid "fine_tuning_job_id" is '
                               'provided as an argument or started a training job.')
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        _, status = self._query_finetuning_job(job_id)
        return self._status_mapping(status)

    def _get_log(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            raise RuntimeError('No job ID specified. Please ensure that a valid "fine_tuning_job_id" is '
                               'provided as an argument or started a training job.')
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        fine_tune_url = os.path.join(self._base_url, f'fine_tuning/jobs/{job_id}/events')
        headers = {
            'Authorization': f'Bearer {self._api_key}'
        }
        with requests.get(fine_tune_url, headers=headers) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return job_id, r.json()

    def _get_curr_job_model_id(self):
        if not self.fine_tuning_job_id:
            return None, None
        model_id, _ = self._query_finetuning_job(self.fine_tuning_job_id)
        return self.fine_tuning_job_id, model_id

    def _query_finetuning_job_info(self, fine_tuning_job_id):
        fine_tune_url = os.path.join(self._base_url, f'fine_tuning/jobs/{fine_tuning_job_id}')
        headers = {
            'Authorization': f'Bearer {self._api_key}'
        }
        with requests.get(fine_tune_url, headers=headers) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return r.json()

    def _query_finetuning_job(self, fine_tuning_job_id) -> Tuple[str, str]:
        info = self._query_finetuning_job_info(fine_tuning_job_id)
        status = info['status']
        fine_tuned_model = info['fine_tuned_model'] if 'fine_tuned_model' in info else None
        return (fine_tuned_model, status)

    def _query_finetuning_cost(self, fine_tuning_job_id):
        info = self._query_finetuning_job_info(fine_tuning_job_id)
        if 'trained_tokens' in info and info['trained_tokens']:
            return info['trained_tokens']
        else:
            return None

    def _create_deployment(self) -> Tuple[str]:
        return (self._model_name, 'RUNNING')

    def _query_deployment(self, deployment_id) -> str:
        return 'RUNNING'
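
One detail of `_convert_file_format` in the listing above is easy to miss: FileHandlerBase accepts the `knowledge` role during validation, but the GLM conversion keeps only `system`, `user`, and `assistant` messages. The sketch below reproduces that filtering for a single example, assuming a hypothetical helper name `convert_example` and made-up sample messages.

```python
import json

KEPT_ROLES = ('system', 'user', 'assistant')


def convert_example(example: dict) -> str:
    """Hypothetical per-example version of the GLM conversion step."""
    kept = [
        {'role': m.get('role', ''), 'content': m.get('content', '')}
        for m in example.get('messages', [])
        if m.get('role', '') in KEPT_ROLES
    ]
    return json.dumps({'messages': kept}, ensure_ascii=False)


# Made-up sample data for illustration.
example = {'messages': [
    {'role': 'knowledge', 'content': 'Paris is the capital of France.'},
    {'role': 'user', 'content': 'What is the capital of France?'},
    {'role': 'assistant', 'content': 'Paris.'},
]}
converted = json.loads(convert_example(example))
print([m['role'] for m in converted['messages']])  # ['user', 'assistant']
```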

lazyllm.module.llms.onlinemodule.supplier.glm.GLMReranking

Bases: OnlineEmbeddingModuleBase

Reranking module for Zhipu AI, inheriting from OnlineEmbeddingModuleBase, used for relevance reranking of documents.

Parameters:

  • embed_url (str, default: 'https://open.bigmodel.cn/api/paas/v4/rerank' ) –

    Base URL for reranking API, defaults to "https://open.bigmodel.cn/api/paas/v4/rerank".

  • embed_model_name (str, default: 'rerank' ) –

    Model name to use, defaults to "rerank".

  • api_key (str, default: None ) –

    Zhipu AI API key, if not provided will be read from lazyllm.config['glm_api_key'].

Properties

type: Returns model type, fixed as "RERANK".

Main Features
  • Performs relevance reranking for input query and document list
  • Supports custom ranking parameters
  • Returns relevance scores for each document
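
The request and response shapes behind these features can be sketched without calling the service. `build_payload` and `parse_response` below are hypothetical stand-ins for `_encapsulated_data` and `_parse_response` from the source listing, and `fake_response` is a hand-written dictionary in the shape the parser expects, not real API output.

```python
from typing import Any, Dict, List, Tuple


def build_payload(query: str, documents: List[str], top_n: int, **kwargs: Any) -> Dict[str, Any]:
    payload = {
        'query': query,
        'documents': documents,
        'top_n': top_n,
        'return_documents': False,
        'return_raw_scores': True,
    }
    payload.update(kwargs)  # custom ranking parameters override the defaults
    return payload


def parse_response(response: Dict[str, Any]) -> List[Tuple[int, float]]:
    # Each result refers to a document by its position in the request.
    return [(r['index'], r['relevance_score']) for r in response['results']]


docs = ['LazyLLM deploys models.', 'Paris is the capital of France.']
payload = build_payload('capital of France?', docs, top_n=2)

# Hand-written response for illustration only.
fake_response = {'results': [{'index': 1, 'relevance_score': 0.92},
                             {'index': 0, 'relevance_score': 0.13}]}
ranked = parse_response(fake_response)
print(docs[ranked[0][0]])  # Paris is the capital of France.
```
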
Source code in lazyllm/module/llms/onlinemodule/supplier/glm.py
class GLMReranking(OnlineEmbeddingModuleBase):
    """Reranking module for Zhipu AI, inheriting from OnlineEmbeddingModuleBase, used for relevance reranking of documents.

Args:
    embed_url (str): Base URL for reranking API, defaults to "https://open.bigmodel.cn/api/paas/v4/rerank".
    embed_model_name (str): Model name to use, defaults to "rerank".
    api_key (str): Zhipu AI API key, if not provided will be read from lazyllm.config['glm_api_key'].

Properties:
    type: Returns model type, fixed as "RERANK".

Main Features:
    - Performs relevance reranking for input query and document list
    - Supports custom ranking parameters
    - Returns relevance scores for each document
"""

    def __init__(self,
                 embed_url: str = 'https://open.bigmodel.cn/api/paas/v4/rerank',
                 embed_model_name: str = 'rerank',
                 api_key: str = None, **kw):
        super().__init__('GLM', embed_url, api_key or lazyllm.config['glm_api_key'], embed_model_name, **kw)

    @property
    def type(self):
        return 'RERANK'

    def _encapsulated_data(self, query: str, documents: List[str], top_n: int, **kwargs) -> Dict[str, str]:
        json_data = {
            'query': query,
            'documents': documents,
            'top_n': top_n,
            'return_documents': False,
            'return_raw_scores': True
        }
        if len(kwargs) > 0:
            json_data.update(kwargs)

        return json_data

    def _parse_response(self, response: Dict, input: Union[List, str]) -> List[Tuple]:
        return [(result['index'], result['relevance_score']) for result in response['results']]

lazyllm.module.llms.onlinemodule.supplier.glm.GLMMultiModal

Bases: OnlineMultiModalBase

Zhipu AI's multimodal base module, inheriting from OnlineMultiModalBase, for handling multimodal tasks.

Parameters:

  • model_name (str) –

    Model name.

  • api_key (str, default: None ) –

    API key, if not provided will be read from lazyllm.config['glm_api_key'].

  • base_url (str, default: 'https://open.bigmodel.cn/api/paas/v4' ) –

    Base URL for API, defaults to 'https://open.bigmodel.cn/api/paas/v4'.

  • return_trace (bool, default: False ) –

    Whether to return call trace information, defaults to False.

  • **kwargs

    Additional arguments passed to the base class.

Features:

1. Supports multimodal input processing
2. Uses ZhipuAI client for API calls
3. Provides unified multimodal interface
4. Customizable base URL and API key
Note

This class serves as the base class for GLM multimodal functionality, typically used as the parent class for specific multimodal implementations (such as speech-to-text, text-to-image, etc.).

Source code in lazyllm/module/llms/onlinemodule/supplier/glm.py
class GLMMultiModal(OnlineMultiModalBase):
    """Zhipu AI's multimodal base module, inheriting from OnlineMultiModalBase, for handling multimodal tasks.

Args:
    model_name (str): Model name.
    api_key (str): API key, if not provided will be read from lazyllm.config['glm_api_key'].
    base_url (str): Base URL for API, defaults to 'https://open.bigmodel.cn/api/paas/v4'.
    return_trace (bool): Whether to return call trace information, defaults to False.
    **kwargs: Additional arguments passed to the base class.

Features:

    1. Supports multimodal input processing
    2. Uses ZhipuAI client for API calls
    3. Provides unified multimodal interface
    4. Customizable base URL and API key

Note:
    This class serves as the base class for GLM multimodal functionality, typically used as the parent class for specific multimodal implementations (such as speech-to-text, text-to-image, etc.).
"""
    def __init__(self, model_name: str, api_key: str = None,
                 base_url: str = 'https://open.bigmodel.cn/api/paas/v4', return_trace: bool = False,
                 **kwargs):
        OnlineMultiModalBase.__init__(self, model_series='GLM', model_name=model_name,
                                      return_trace=return_trace, **kwargs)
        self._client = zhipuai.ZhipuAI(api_key=api_key or lazyllm.config['glm_api_key'], base_url=base_url)

lazyllm.module.llms.onlinemodule.supplier.qwen.QwenReranking

Bases: OnlineEmbeddingModuleBase

Qwen reranking module, inheriting from OnlineEmbeddingModuleBase, used for relevance reranking of documents.

Parameters:

  • embed_url (str, default: 'https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank' ) –

    Base URL for reranking API, defaults to "https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank".

  • embed_model_name (str, default: 'gte-rerank-v2' ) –

    Model name to use, defaults to "gte-rerank-v2".

  • api_key (str, default: None ) –

    Qwen API key, if not provided will be read from lazyllm.config['qwen_api_key'].

  • **kwargs

    Additional arguments passed to the base class.

Properties

type: Returns model type, fixed as "RERANK".

Main Features
  • Performs relevance reranking for input query and document list
  • Supports custom ranking parameters
  • Returns index and relevance score for each document
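
Unlike the GLM reranker, the Qwen request nests the query under `input` and `top_n` under `parameters`, while extra keyword arguments are merged at the top level with `dict.update`. The sketch below, using a hypothetical `build_payload` helper and a placeholder option name `some_option`, shows what that shallow merge means in practice: passing a `parameters` keyword replaces the whole nested dictionary, including `top_n`.

```python
from typing import Any, Dict, List


def build_payload(model: str, query: str, documents: List[str], top_n: int,
                  **kwargs: Any) -> Dict[str, Any]:
    """Hypothetical stand-in for QwenReranking._encapsulated_data."""
    payload = {
        'input': {'query': query, 'documents': documents},
        'parameters': {'top_n': top_n},
        'model': model,
    }
    payload.update(kwargs)  # shallow merge: top-level keys are replaced wholesale
    return payload


plain = build_payload('gte-rerank-v2', 'q', ['a', 'b'], top_n=1)
print(plain['parameters'])  # {'top_n': 1}

# 'some_option' is a placeholder key, not a documented API parameter.
overridden = build_payload('gte-rerank-v2', 'q', ['a', 'b'], top_n=1,
                           parameters={'some_option': True})
print(overridden['parameters'])  # {'some_option': True}
```

To keep `top_n` while adding options under this merge behavior, the caller would need to include it in the `parameters` dictionary it passes.
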
Source code in lazyllm/module/llms/onlinemodule/supplier/qwen.py
class QwenReranking(OnlineEmbeddingModuleBase):
    """Qwen reranking module, inheriting from OnlineEmbeddingModuleBase, used for relevance reranking of documents.

Args:
    embed_url (str): Base URL for reranking API, defaults to "https://dashscope.aliyuncs.com/api/v1/services/rerank/text-rerank/text-rerank".
    embed_model_name (str): Model name to use, defaults to "gte-rerank-v2".
    api_key (str): Qwen API key, if not provided will be read from lazyllm.config['qwen_api_key'].
    **kwargs: Additional arguments passed to the base class.

Properties:
    type: Returns model type, fixed as "RERANK".

Main Features:
    - Performs relevance reranking for input query and document list
    - Supports custom ranking parameters
    - Returns index and relevance score for each document
"""

    def __init__(self,
                 embed_url: str = ('https://dashscope.aliyuncs.com/api/v1/services/'
                                   'rerank/text-rerank/text-rerank'),
                 embed_model_name: str = 'gte-rerank-v2',
                 api_key: str = None, **kw):
        super().__init__('QWEN', embed_url, api_key or lazyllm.config['qwen_api_key'], embed_model_name, **kw)

    @property
    def type(self):
        return 'RERANK'

    def _encapsulated_data(self, query: str, documents: List[str], top_n: int, **kwargs) -> Dict[str, str]:
        json_data = {
            'input': {
                'query': query,
                'documents': documents
            },
            'parameters': {
                'top_n': top_n,
            },
            'model': self._embed_model_name
        }
        if len(kwargs) > 0:
            json_data.update(kwargs)

        return json_data

    def _parse_response(self, response: Dict, input: Union[List, str]) -> List[Tuple]:
        results = response['output']['results']
        return [(result['index'], result['relevance_score']) for result in results]
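
The request and response shapes handled by `_encapsulated_data` and `_parse_response` can be illustrated without calling the service. The following is a minimal standalone sketch, not the library's API: the helper names `build_rerank_payload` and `parse_rerank_response` are hypothetical, and `sample_response` is hand-written data that only mimics the `output.results` structure read by the source above.

```python
from typing import Any, Dict, List, Tuple


def build_rerank_payload(model: str, query: str, documents: List[str],
                         top_n: int, **extra: Any) -> Dict[str, Any]:
    """Assemble a request body shaped like the one built in _encapsulated_data."""
    payload: Dict[str, Any] = {
        'input': {'query': query, 'documents': documents},
        'parameters': {'top_n': top_n},
        'model': model,
    }
    # Extra keyword arguments are merged at the top level, as in the source.
    payload.update(extra)
    return payload


def parse_rerank_response(response: Dict[str, Any]) -> List[Tuple[int, float]]:
    """Extract (index, relevance_score) pairs, mirroring _parse_response."""
    return [(r['index'], r['relevance_score']) for r in response['output']['results']]


# Hand-written response used only for illustration; not real API output.
sample_response = {
    'output': {'results': [
        {'index': 1, 'relevance_score': 0.92},
        {'index': 0, 'relevance_score': 0.15},
    ]}
}

payload = build_rerank_payload('gte-rerank-v2', 'what is a reranker?',
                               ['doc a', 'doc b'], top_n=2)
ranked = parse_rerank_response(sample_response)
```

Each tuple in `ranked` pairs a document's position in the original `documents` list with its score, so callers can map results back to the input documents.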

lazyllm.module.llms.onlinemodule.supplier.qwen.QwenTTSModule

Bases: QwenMultiModal

Qwen's text-to-speech module, inheriting from QwenMultiModal, providing support for multiple speech synthesis models.

Parameters:

  • model (str, default: None ) –

    Model name. Defaults to None, in which case lazyllm.config['qwen_tts_model_name'] is used, falling back to "qwen-tts". Available models: cosyvoice-v2, cosyvoice-v1, sambert, qwen-tts, qwen-tts-latest.

  • api_key (str, default: None ) –

    API key, defaults to None, will be read from lazyllm.config['qwen_api_key'].

  • return_trace (bool, default: False ) –

    Whether to return call trace information, defaults to False.

  • **kwargs

    Additional arguments passed to the base class.

Synthesis Parameters:

  • input (str): Text content to convert.
  • voice (str): Speaker voice, defaults to model's default voice.
  • speech_rate (float): Speech rate, defaults to 1.0.
  • volume (int): Volume, defaults to 50.
  • pitch (float): Pitch, defaults to 1.0.
Note
  • Different models may support different voice options
  • The returned audio bytes are written to a file, and the result is returned as an encoded file-path query
Source code in lazyllm/module/llms/onlinemodule/supplier/qwen.py
class QwenTTSModule(QwenMultiModal):
    """Qwen's text-to-speech module, inheriting from QwenMultiModal, providing support for multiple speech synthesis models.

Args:
    model (str): Model name, defaults to "qwen-tts". Available models include:
        - cosyvoice-v2
        - cosyvoice-v1
        - sambert
        - qwen-tts
        - qwen-tts-latest
    api_key (str): API key, defaults to None, will be read from lazyllm.config['qwen_api_key'].
    return_trace (bool): Whether to return call trace information, defaults to False.
    **kwargs: Additional arguments passed to the base class.

Synthesis Parameters:

    input (str): Text content to convert.
    voice (str): Speaker voice, defaults to model's default voice.
    speech_rate (float): Speech rate, defaults to 1.0.
    volume (int): Volume, defaults to 50.
    pitch (float): Pitch, defaults to 1.0.

Note:
    - Different models may support different voice options
    - Returned audio data is automatically encoded into file format
"""
    MODEL_NAME = 'qwen-tts'
    SYNTHESIZERS = {
        'cosyvoice-v2': (synthesize_v2, 'longxiaochun_v2'),
        'cosyvoice-v1': (synthesize_v2, 'longxiaochun'),
        'sambert': (synthesize, 'zhinan-v1'),
        'qwen-tts': (synthesize_qwentts, 'Cherry'),
        'qwen-tts-latest': (synthesize_qwentts, 'Cherry')
    }

    def __init__(self, model: str = None, api_key: str = None, return_trace: bool = False, **kwargs):
        QwenMultiModal.__init__(self, api_key=api_key,
                                model_name=model or lazyllm.config['qwen_tts_model_name'] or QwenTTSModule.MODEL_NAME,
                                return_trace=return_trace, **kwargs)
        if self._model_name not in self.SYNTHESIZERS:
            raise ValueError(f'unsupported model: {self._model_name}. '
                             f'supported models: {QwenTTSModule.SYNTHESIZERS.keys()}')
        self._synthesizer_func, self._voice = QwenTTSModule.SYNTHESIZERS[self._model_name]

    def _forward(self, input: str = None, voice: str = None, speech_rate: float = 1.0, volume: int = 50,
                 pitch: float = 1.0, **kwargs):
        call_params = {
            'input': input,
            'model_name': self._model_name,
            'voice': voice or self._voice,
            'speech_rate': speech_rate,
            'volume': volume,
            'pitch': pitch,
            **kwargs
        }
        if self._api_key: call_params['api_key'] = self._api_key
        return encode_query_with_filepaths(None, bytes_to_file(self._synthesizer_func(**call_params)))
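
The `SYNTHESIZERS` table above maps each model name to a synthesis function and a default voice, and the constructor rejects names that are not in the table. A minimal sketch of that dispatch pattern follows. It assumes nothing about the real synthesizers: `fake_cosyvoice`, `fake_qwentts`, and `run_tts` are hypothetical stand-ins that return placeholder bytes instead of audio.

```python
from typing import Callable, Dict, Optional, Tuple


def fake_cosyvoice(input: str, voice: str, **kwargs) -> bytes:
    # Hypothetical stand-in for a real synthesizer; returns placeholder bytes.
    return f'cosyvoice:{voice}:{input}'.encode()


def fake_qwentts(input: str, voice: str, **kwargs) -> bytes:
    # Hypothetical stand-in for a real synthesizer; returns placeholder bytes.
    return f'qwentts:{voice}:{input}'.encode()


# Model name -> (synthesis function, default voice), as in SYNTHESIZERS.
SYNTHESIZERS: Dict[str, Tuple[Callable[..., bytes], str]] = {
    'cosyvoice-v2': (fake_cosyvoice, 'longxiaochun_v2'),
    'qwen-tts': (fake_qwentts, 'Cherry'),
}


def run_tts(model: str, text: str, voice: Optional[str] = None) -> bytes:
    """Look up the synthesizer for a model and fall back to its default voice."""
    if model not in SYNTHESIZERS:
        raise ValueError(f'unsupported model: {model}. '
                         f'supported models: {list(SYNTHESIZERS)}')
    func, default_voice = SYNTHESIZERS[model]
    return func(input=text, voice=voice or default_voice)
```

Keeping the default voice next to the function means an explicit `voice` argument overrides it per call, while omitting it picks a voice known to work with that model.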

lazyllm.module.llms.onlinemodule.supplier.sensenova.SenseNovaModule

Bases: OnlineChatModuleBase, FileHandlerBase, _SenseNovaBase

SenseNovaModule is the LLM interface management component for SenseTime's open platform, inheriting from OnlineChatModuleBase and FileHandlerBase, providing both chat and file handling capabilities.

Parameters:

  • base_url (str, default: 'https://api.sensenova.cn/compatible-mode/v1/' ) –

    Base URL for the API, defaults to "https://api.sensenova.cn/compatible-mode/v1/".

  • model (str, default: 'SenseChat-5' ) –

    Name of the model to use, defaults to "SenseChat-5".

  • api_key (str, default: None ) –

    SenseTime API key, if not provided will be read from lazyllm.config['sensenova_api_key'].

  • secret_key (str, default: None ) –

    SenseTime secret key, if not provided will be read from lazyllm.config['sensenova_secret_key'].

  • stream (bool, default: True ) –

    Whether to enable streaming output, defaults to True.

  • return_trace (bool, default: False ) –

    Whether to return trace information, defaults to False.

  • **kwargs

    Additional arguments passed to the base class.

Source code in lazyllm/module/llms/onlinemodule/supplier/sensenova.py
class SenseNovaModule(OnlineChatModuleBase, FileHandlerBase, _SenseNovaBase):
    """SenseNovaModule is the LLM interface management component for SenseTime's open platform, inheriting from OnlineChatModuleBase and FileHandlerBase, providing both chat and file handling capabilities.

Args:
    base_url (str): Base URL for the API, defaults to "https://api.sensenova.cn/compatible-mode/v1/".
    model (str): Name of the model to use, defaults to "SenseChat-5".
    api_key (str): SenseTime API key, if not provided will be read from lazyllm.config['sensenova_api_key'].
    secret_key (str): SenseTime secret key, if not provided will be read from lazyllm.config['sensenova_secret_key'].
    stream (bool): Whether to enable streaming output, defaults to True.
    return_trace (bool): Whether to return trace information, defaults to False.
    **kwargs: Additional arguments passed to the base class.
"""
    TRAINABLE_MODEL_LIST = ['nova-ptc-s-v2']
    VLM_MODEL_PREFIX = ['SenseNova-V6-Turbo', 'SenseChat-Vision', 'SenseNova-V6-Pro', 'SenseNova-V6-Reasoner',
                        'SenseNova-V6-5-Pro', 'SenseNova-V6-5-Turbo']

    def __init__(self, base_url: str = 'https://api.sensenova.cn/compatible-mode/v1/', model: str = 'SenseChat-5',
                 api_key: str = None, secret_key: str = None, stream: bool = True,
                 return_trace: bool = False, **kwargs):
        api_key = self._get_api_key(api_key, secret_key)
        OnlineChatModuleBase.__init__(self, model_series='SENSENOVA', api_key=api_key, base_url=base_url,
                                      model_name=model, stream=stream, return_trace=return_trace, **kwargs)
        FileHandlerBase.__init__(self)
        self._deploy_paramters = None
        self._vlm_force_format_input_with_files = True

    def _get_system_prompt(self):
        return 'You are an AI assistant, developed by SenseTime.'

    def _set_chat_url(self):
        self._url = urljoin(self._base_url, 'chat/completions')

    def _convert_file_format(self, filepath: str) -> str:
        with open(filepath, 'r', encoding='utf-8') as fr:
            dataset = [json.loads(line) for line in fr]

        json_strs = []
        for ex in dataset:
            lineEx = []
            messages = ex.get('messages', [])
            for message in messages:
                role = message.get('role', '')
                content = message.get('content', '')
                if role in ['system', 'knowledge', 'user', 'assistant']:
                    lineEx.append({'role': role, 'content': content})
            json_strs.append(json.dumps(lineEx, ensure_ascii=False))

        return '\n'.join(json_strs)

    def _upload_train_file(self, train_file):
        headers = {
            'Authorization': 'Bearer ' + self._api_key
        }
        url = self._train_parameters.get('upload_url', 'https://file.sensenova.cn/v1/files')
        self.get_finetune_data(train_file)
        file_object = {
            # The correct format should be to pass in a tuple in the format of:
            # (<fileName>, <fileObject>, <Content-Type>),
            # where fileObject refers to the specific value.

            'description': (None, 'train_file', None),
            'scheme': (None, 'FINE_TUNE_2', None),
            'file': (os.path.basename(train_file), self._dataHandler, 'application/json')
        }

        train_file_id = None
        with requests.post(url, headers=headers, files=file_object) as r:
            if r.status_code != 200:
                raise requests.RequestException(r.text)

            train_file_id = r.json()['id']
            # delete temporary training file
            self._dataHandler.close()
            lazyllm.LOG.info(f'train file id: {train_file_id}')

        def _create_finetuning_dataset(description, files):
            url = urljoin(self._base_url, 'fine-tune/datasets')
            headers = {
                'Content-Type': 'application/json',
                'Authorization': f'Bearer {self._api_key}',
            }
            data = {
                'description': description,
                'files': files
            }
            with requests.post(url, headers=headers, json=data) as r:
                if r.status_code != 200:
                    raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

                dataset_id = r.json()['dataset']['id']
                status = r.json()['dataset']['status']
                url = url + f'/{dataset_id}'
                while status.lower() != 'ready':
                    try:
                        time.sleep(10)
                        with requests.get(url, headers=headers) as r:
                            if r.status_code != 200:
                                raise requests.RequestException(r.text)

                            dataset_id = r.json()['dataset']['id']
                            status = r.json()['dataset']['status']
                    except Exception as e:
                        lazyllm.LOG.error(f'error: {e}')
                        raise ValueError(f'created datasets {dataset_id} failed')
                return dataset_id

        return _create_finetuning_dataset('fine-tuning dataset', [train_file_id])

    def _create_finetuning_job(self, train_model, train_file_id, **kw) -> Tuple[str, str]:
        url = urljoin(self._base_url, 'fine-tunes')
        headers = {
            'Content-Type': 'application/json',
            'Authorization': f'Bearer {self._api_key}',
        }
        data = {
            'model': train_model,
            'training_file': train_file_id,
            'suffix': kw.get('suffix', 'ft-' + str(uuid.uuid4().hex))
        }
        if 'training_parameters' in kw.keys():
            data.update(kw['training_parameters'])

        with requests.post(url, headers=headers, json=data) as r:
            if r.status_code != 200:
                raise requests.RequestException(r.text)

            fine_tuning_job_id = r.json()['job']['id']
            status = r.json()['job']['status']
            return (fine_tuning_job_id, status)

    def _validate_api_key(self):
        fine_tune_url = urljoin('https://api.sensenova.cn/v1/llm/', 'models')
        headers = {
            'Authorization': f'Bearer {self._api_key}',
            'Content-Type': 'application/json'
        }
        response = requests.get(fine_tune_url, headers=headers)
        if response.status_code == 200:
            return True
        return False

    def _query_finetuning_job(self, fine_tuning_job_id) -> Tuple[str, str]:
        fine_tune_url = urljoin(self._base_url, f'fine-tunes/{fine_tuning_job_id}')
        headers = {
            'Content-Type': 'application/json',
            'Authorization': f'Bearer {self._api_key}'
        }
        with requests.get(fine_tune_url, headers=headers) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            status = r.json()['job']['status']
            fine_tuned_model = None
            if status.lower() == 'succeeded':
                fine_tuned_model = r.json()['job']['fine_tuned_model']
            return (fine_tuned_model, status)

    def set_deploy_parameters(self, **kw):
        """Set parameters for model deployment.

Args:
    **kw: Key-value pairs of deployment parameters that will be used when creating deployment.
"""
        self._deploy_paramters = kw

    def _create_deployment(self) -> Tuple[str, str]:
        url = urljoin(self._base_url, 'fine-tune/servings')
        headers = {
            'Content-Type': 'application/json',
            'Authorization': f'Bearer {self._api_key}',
        }
        data = {
            'model': self._model_name,
            'config': {
                'run_time': 0
            }
        }
        if self._deploy_paramters and len(self._deploy_paramters) > 0:
            data.update(self._deploy_paramters)

        with requests.post(url, headers=headers, json=data) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            fine_tuning_job_id = r.json()['job']['id']
            status = r.json()['job']['status']
            return (fine_tuning_job_id, status)

    def _query_deployment(self, deployment_id) -> str:
        fine_tune_url = urljoin(self._base_url, f'fine-tune/servings/{deployment_id}')
        headers = {
            'Content-Type': 'application/json',
            'Authorization': f'Bearer {self._api_key}'
        }
        with requests.get(fine_tune_url, headers=headers) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            status = r.json()['job']['status']
            return status

    def _format_vl_chat_image_url(self, image_url, mime):
        if image_url.startswith('http'):
            return [{'type': 'image_url', 'image_url': image_url}]
        else:
            return [{'type': 'image_base64', 'image_base64': image_url}]
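
The data conversion performed by `_convert_file_format` can be sketched on in-memory lines rather than a file. This is an illustrative reimplementation under that assumption, not the library code; `convert_messages_jsonl` and the sample record are hypothetical.

```python
import json
from typing import List

# Roles kept by the conversion, taken from the source listing above.
ALLOWED_ROLES = {'system', 'knowledge', 'user', 'assistant'}


def convert_messages_jsonl(lines: List[str]) -> str:
    """Turn JSONL records with a 'messages' list into one JSON array per line."""
    out = []
    for line in lines:
        record = json.loads(line)
        kept = [
            {'role': m.get('role', ''), 'content': m.get('content', '')}
            for m in record.get('messages', [])
            if m.get('role', '') in ALLOWED_ROLES
        ]
        out.append(json.dumps(kept, ensure_ascii=False))
    return '\n'.join(out)


# Hand-written sample record; the 'tool' message is dropped by the role filter.
sample = [
    json.dumps({'messages': [
        {'role': 'user', 'content': 'hello'},
        {'role': 'tool', 'content': 'ignored'},
        {'role': 'assistant', 'content': 'hi'},
    ]}),
]
converted = convert_messages_jsonl(sample)
```

Each output line is a bare JSON array of role/content pairs, so the wrapping `messages` key and any other top-level fields of the input record are not carried over.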

set_deploy_parameters(**kw)

Set parameters for model deployment.

Parameters:

  • **kw

    Key-value pairs of deployment parameters that will be used when creating deployment.

Source code in lazyllm/module/llms/onlinemodule/supplier/sensenova.py
    def set_deploy_parameters(self, **kw):
        """Set parameters for model deployment.

Args:
    **kw: Key-value pairs of deployment parameters that will be used when creating deployment.
"""
        self._deploy_paramters = kw
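
In the class source, `_create_deployment` merges the stored deployment parameters into the request body with `dict.update`. The sketch below shows what that shallow merge implies. `build_deploy_payload` is a hypothetical helper, and the `replicas` key is an invented example parameter, not a documented SenseNova option.

```python
from typing import Any, Dict, Optional


def build_deploy_payload(model_name: str,
                         deploy_params: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Build a deployment request body the way _create_deployment does."""
    data: Dict[str, Any] = {'model': model_name, 'config': {'run_time': 0}}
    if deploy_params:
        # dict.update is shallow: passing a 'config' key replaces the whole
        # default config dict instead of merging into it.
        data.update(deploy_params)
    return data


default_payload = build_deploy_payload('my-finetuned-model', None)
# 'replicas' is a made-up key used only to show the override behaviour.
custom_payload = build_deploy_payload('my-finetuned-model', {'config': {'replicas': 2}})
```

Because the merge is shallow, a caller who wants to keep `run_time` while adding other config entries would need to include it in the `config` dict passed to `set_deploy_parameters`.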

lazyllm.module.llms.onlinemodule.base.onlineMultiModalBase.OnlineMultiModalBase

Bases: OnlineModuleBase, LLMBase

Base class for online multimodal models, inheriting from OnlineModuleBase and LLMBase, providing basic functionality for multimodal models.

Parameters:

  • model_series (str) –

    Model series name, cannot be empty.

  • model_name (str, default: None ) –

    Model name, defaults to None. A warning will be generated if not specified.

  • return_trace (bool, default: False ) –

    Whether to return call trace information, defaults to False.

  • **kwargs

    Additional arguments passed to the base class.

Properties:

  • series: Returns the model series name.
  • type: Returns the model type, fixed as "MultiModal".

Main Methods:

  • share(): Create a shared instance of the module.
  • forward(input, lazyllm_files, **kwargs): Main method for handling input and files.
  • _forward(input, files, **kwargs): Forward method to be implemented by subclasses.
Notes
  • Subclasses must implement the _forward method.
  • Model series name (model_series) is required.
  • A warning log will be generated if model name (model_name) is not specified.
Source code in lazyllm/module/llms/onlinemodule/base/onlineMultiModalBase.py
class OnlineMultiModalBase(OnlineModuleBase, LLMBase):
    """Base class for online multimodal models, inheriting from OnlineModuleBase and LLMBase, providing basic functionality for multimodal models.

Args:
    model_series (str): Model series name, cannot be empty.
    model_name (str): Model name, defaults to None. A warning will be generated if not specified.
    return_trace (bool): Whether to return call trace information, defaults to False.
    **kwargs: Additional arguments passed to the base class.

Properties:

    series: Returns the model series name.
    type: Returns the model type, fixed as "MultiModal".

Main Methods:

    share(): Create a shared instance of the module.
    forward(input, lazyllm_files, **kwargs): Main method for handling input and files.
    _forward(input, files, **kwargs): Forward method to be implemented by subclasses.

Notes:
    - Subclasses must implement the _forward method.
    - Model series name (model_series) is required.
    - A warning log will be generated if model name (model_name) is not specified.
"""
    def __init__(self, model_series: str, model_name: str = None, return_trace: bool = False, **kwargs):
        super().__init__(return_trace=return_trace)
        self._model_series = model_series
        self._model_name = model_name
        self._validate_model_config()

    def _validate_model_config(self):
        """Validate model configuration"""
        if not self._model_series:
            raise ValueError('model_series cannot be empty')
        if not self._model_name:
            lazyllm.LOG.warning(f'model_name not specified for {self._model_series}')

    @property
    def series(self):
        return self._model_series

    @property
    def type(self):
        return 'MultiModal'

    def share(self):
        """Create a shared instance of the module"""
        new = copy.copy(self)
        return new

    def _forward(self, input: Union[Dict, str] = None, files: List[str] = None, **kwargs):
        """Forward method to be implemented by subclasses"""
        raise NotImplementedError(f'Subclass {self.__class__.__name__} must implement this method')

    def forward(self, input: Union[Dict, str] = None, *, lazyllm_files=None, **kwargs):
        """Main forward method with file handling"""
        try:
            input, files = self._get_files(input, lazyllm_files)
            call_params = {'input': input, **kwargs}
            if files: call_params['files'] = files
            return self._forward(**call_params)

        except Exception as e:
            lazyllm.LOG.error(f'Error in {self.__class__.__name__}.forward: {str(e)}')
            raise

    def __repr__(self):
        return lazyllm.make_repr('Module', 'OnlineMultiModalModule',
                                 series=self._model_series,
                                 name=self._model_name,
                                 return_trace=self._return_trace)
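
The split between the public `forward` and the subclass hook `_forward` is a template-method pattern. The sketch below reproduces only that structure in a simplified stand-in class. `MultiModalSketch` and `EchoModule` are hypothetical, and the real class additionally resolves files through `_get_files` and logs errors, which is omitted here.

```python
from typing import List, Optional


class MultiModalSketch:
    """Simplified stand-in for OnlineMultiModalBase; not the real class."""

    def __init__(self, model_series: str, model_name: Optional[str] = None):
        if not model_series:
            raise ValueError('model_series cannot be empty')
        self._model_series = model_series
        self._model_name = model_name

    def _forward(self, input=None, files: Optional[List[str]] = None, **kwargs):
        # Hook that concrete subclasses are expected to override.
        raise NotImplementedError(
            f'Subclass {self.__class__.__name__} must implement this method')

    def forward(self, input=None, *, files: Optional[List[str]] = None, **kwargs):
        # Shared pre-processing lives here; 'files' is only passed when non-empty.
        call_params = {'input': input, **kwargs}
        if files:
            call_params['files'] = files
        return self._forward(**call_params)


class EchoModule(MultiModalSketch):
    """Toy subclass that returns what it received instead of calling a service."""

    def _forward(self, input=None, files=None, **kwargs):
        return {'input': input, 'files': files or [], 'extra': kwargs}


result = EchoModule('DEMO', 'demo-model').forward('describe', files=['a.png'], size='small')
```

Subclasses only implement `_forward`, so argument normalization stays in one place and every provider module receives the same call shape.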

lazyllm.module.llms.onlinemodule.base.utils.OnlineModuleBase

Bases: ModuleBase

Base class for online modules, inheriting from ModuleBase, providing unified basic functionality for all online service modules.
This class encapsulates common behaviors of online modules, including caching mechanisms and debug tracing functionality, serving as the foundation for building various online API service modules.

Key Features
  • Inherits all basic functionality from ModuleBase, including submodule management, hook registration, etc.
  • Supports online module caching mechanism, controllable through configuration.
  • Provides debug tracing functionality for troubleshooting and performance analysis.
  • Serves as a common base class for all online service modules (chat, embedding, multimodal, etc.).

Parameters:

  • return_trace (bool, default: False ) –

    Whether to write inference results into the trace queue for debugging and tracking. Default is False.

Use Cases
  1. As a base class for online chat modules (OnlineChatModuleBase).
  2. As a base class for online embedding modules (OnlineEmbeddingModuleBase).
  3. As a base class for online multimodal modules (OnlineMultiModalBase).
  4. Providing unified basic functionality for custom online service modules.
Source code in lazyllm/module/llms/onlinemodule/base/utils.py
class OnlineModuleBase(ModuleBase):
    """Base class for online modules, inheriting from ModuleBase, providing unified basic functionality for all online service modules.  
This class encapsulates common behaviors of online modules, including caching mechanisms and debug tracing functionality, serving as the foundation for building various online API service modules.

Key Features:
    - Inherits all basic functionality from ModuleBase, including submodule management, hook registration, etc.
    - Supports online module caching mechanism, controllable through configuration.
    - Provides debug tracing functionality for troubleshooting and performance analysis.
    - Serves as a common base class for all online service modules (chat, embedding, multimodal, etc.).

Args:
    return_trace (bool): Whether to write inference results into the trace queue for debugging and tracking. Default is ``False``.

Use Cases:
    1. As a base class for online chat modules (OnlineChatModuleBase).
    2. As a base class for online embedding modules (OnlineEmbeddingModuleBase).
    3. As a base class for online multimodal modules (OnlineMultiModalBase).
    4. Providing unified basic functionality for custom online service modules.
"""
    def __init__(self, return_trace: bool = False):
        super().__init__(return_trace=return_trace)
        if config['cache_online_module']:
            self.use_cache()

lazyllm.module.module.ModuleCache

Bases: object

Module cache manager providing unified cache storage and retrieval functionality.
This class encapsulates multiple cache strategies (memory, file, SQLite, Redis), automatically selecting cache storage methods based on configuration, providing efficient caching mechanisms for module execution results.

Key Features
  • Supports multiple cache strategies: memory cache, file cache, SQLite database cache, Redis cache.
  • Automatically selects cache strategy based on configuration, defaults to memory cache.
  • Supports cache mode control (read-write, read-only, write-only, disabled).
  • Provides unified cache interface, hiding underlying storage implementation details.
  • Supports parameter hashing to ensure uniqueness of cache keys.

Parameters:

  • strategy (Optional[str], default: None ) –

    Cache strategy, options include 'memory', 'file', 'sqlite', 'redis'. Defaults to None, will use strategy from configuration.

Use Cases
  1. Provide caching for module execution results to avoid redundant computation.
  2. Use Redis cache in distributed environments for sharing.
  3. Use file or database cache for persistent storage.
  4. Select different cache strategies based on performance requirements.
Source code in lazyllm/module/module.py
class ModuleCache(object):
    """Module cache manager providing unified cache storage and retrieval functionality.  
This class encapsulates multiple cache strategies (memory, file, SQLite, Redis), automatically selecting cache storage methods based on configuration, providing efficient caching mechanisms for module execution results.

Key Features:
    - Supports multiple cache strategies: memory cache, file cache, SQLite database cache, Redis cache.
    - Automatically selects cache strategy based on configuration, defaults to memory cache.
    - Supports cache mode control (read-write, read-only, write-only, disabled).
    - Provides unified cache interface, hiding underlying storage implementation details.
    - Supports parameter hashing to ensure uniqueness of cache keys.

Args:
    strategy (Optional[str]): Cache strategy, options include 'memory', 'file', 'sqlite', 'redis'. Defaults to None, will use strategy from configuration.

Use Cases:
    1. Provide caching for module execution results to avoid redundant computation.
    2. Use Redis cache in distributed environments for sharing.
    3. Use file or database cache for persistent storage.
    4. Select different cache strategies based on performance requirements.
"""
    def __init__(self, strategy: Optional[str] = None):
        self._strategy = self._create_strategy(strategy or lazyllm.config['cache_strategy'])

    def _create_strategy(self, strategy: str) -> _CacheStorageStrategy:
        strategy = strategy.lower()
        strategies = {
            'memory': _MemoryCacheStrategy,
            'file': _FileCacheStrategy,
            'sqlite': _SQLiteCacheStrategy,
            'redis': _RedisCacheStrategy,
        }

        if strategy not in strategies:
            raise ValueError(f'Unsupported cache strategy: {strategy}. '
                             f'Available strategies: {list(strategies.keys())}')
        return strategies[strategy]()

    def _hash(self, args, kw):
        def process_value(value, hash_obj):
            meta = ''
            if isinstance(value, (list, tuple, dict, set)):
                meta = str(type(value)) + str(len(value))
            if isinstance(value, str):
                hash_obj.update(str(file_content_hash(value)).encode())
            elif isinstance(value, set):
                hash_obj.update((meta + '>').encode())
                for item in sorted(value):
                    process_value(item, hash_obj)
                hash_obj.update(('<' + meta).encode())
            elif isinstance(value, (list, tuple)):
                hash_obj.update((meta + '>').encode())
                for item in value:
                    process_value(item, hash_obj)
                hash_obj.update(('<' + meta).encode())
            elif isinstance(value, dict):
                hash_obj.update((meta + '>').encode())
                for k, v in sorted(value.items()):
                    key_meta = 'key:' + str(type(k)) + str(k)
                    hash_obj.update(key_meta.encode())
                    process_value(v, hash_obj)
                hash_obj.update(('<' + meta).encode())
            else:
                value_meta = str(type(value)) + str(value)
                hash_obj.update(value_meta.encode())
        hash_obj = hashlib.md5()
        process_value(args, hash_obj)
        if kw:
            process_value(kw, hash_obj)
        return hash_obj.hexdigest()

    def get(self, key, args, kw):
        """Retrieve data from cache.

Retrieves data from cache based on the provided key and parameters. Throws an exception if cache mode doesn't allow reading or data doesn't exist.

Args:
    key: Cache key used to identify cached data.
    args: Positional arguments used to generate cache hash key.
    kw: Keyword arguments used to generate cache hash key.

**Returns:**

- Any: Data stored in cache.

**Exceptions:** 

- CacheNotFoundError: Raised when the requested data does not exist in the cache, or when the cache mode does not allow reading (for example, write-only, WO).
"""
        if 'R' not in lazyllm.config['cache_mode']:
            raise CacheNotFoundError('Cannot read cache due to `LAZYLLM_CACHE_MODE = WO`')
        hash_key = self._hash(args, kw)
        value = self._strategy.get(key, hash_key)
        return transform_path(value, mode='r2a')

    def set(self, key, args, kw, value):
        """Store data in cache.

Stores data in cache based on the provided key and parameters. If cache mode doesn't allow writing, returns directly without executing storage operation.

Args:
    key: Cache key used to identify cached data.
    args: Positional arguments used to generate cache hash key.
    kw: Keyword arguments used to generate cache hash key.
    value: Data to be stored.

**Note:** 

- If cache mode is set to read-only (RO) or disabled (NONE), this method will return directly without executing storage operation.
"""
        if 'W' not in lazyllm.config['cache_mode']: return
        hash_key = self._hash(args, kw)
        value = transform_path(value, mode='a2r')
        self._strategy.set(key, hash_key, value)

    def close(self):
        """Close cache storage strategy.

Releases resources occupied by the cache storage strategy, such as closing database connections, clearing memory cache, etc. After calling this method, the cache will no longer be available.

**Note:** 

- After calling this method, the cache instance will no longer be usable.
- Different cache strategies may have different resource cleanup behaviors.
"""
        self._strategy.close()
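
The key derivation in `_hash` walks the arguments recursively, mixes type information into the digest, and sorts dict items so that keyword order does not matter. The following is a reduced sketch of that idea. `args_hash` and `_feed` are hypothetical names, sets are omitted, and strings are hashed directly here, whereas the source routes them through `file_content_hash`.

```python
import hashlib
from typing import Any


def _feed(value: Any, h) -> None:
    """Recursively feed a value into the hash object, including type markers."""
    if isinstance(value, dict):
        h.update(f'dict{len(value)}>'.encode())
        # Sorting items makes the digest independent of insertion order.
        for k, v in sorted(value.items()):
            h.update(f'key:{type(k)}{k}'.encode())
            _feed(v, h)
        h.update(b'<dict')
    elif isinstance(value, (list, tuple)):
        h.update(f'{type(value)}{len(value)}>'.encode())
        for item in value:
            _feed(item, h)
        h.update(b'<seq')
    else:
        # The type is part of the digest, so 1 and '1' hash differently.
        h.update(f'{type(value)}{value}'.encode())


def args_hash(args: tuple, kw: dict) -> str:
    """Return a hex digest identifying a call's positional and keyword arguments."""
    h = hashlib.md5()
    _feed(args, h)
    if kw:
        _feed(kw, h)
    return h.hexdigest()
```

Two calls that differ only in the order of their keyword arguments map to the same cache entry, while calls whose arguments differ in type or container kind do not collide.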

close()

Close cache storage strategy.

Releases resources occupied by the cache storage strategy, such as closing database connections, clearing memory cache, etc. After calling this method, the cache will no longer be available.

Note:

  • After calling this method, the cache instance will no longer be usable.
  • Different cache strategies may have different resource cleanup behaviors.
Source code in lazyllm/module/module.py
    def close(self):
        """Close cache storage strategy.

Releases resources occupied by the cache storage strategy, such as closing database connections, clearing memory cache, etc. After calling this method, the cache will no longer be available.

**Note:** 

- After calling this method, the cache instance will no longer be usable.
- Different cache strategies may have different resource cleanup behaviors.
"""
        self._strategy.close()

get(key, args, kw)

Retrieve data from cache.

Retrieves data from cache based on the provided key and parameters. Raises CacheNotFoundError if the cache mode does not allow reading or the data does not exist.

Parameters:

  • key

    Cache key used to identify cached data.

  • args

    Positional arguments used to generate cache hash key.

  • kw

    Keyword arguments used to generate cache hash key.

Returns:

  • Any: Data stored in cache.

Exceptions:

  • CacheNotFoundError: Raised when the requested data does not exist in the cache, or when the cache mode does not allow reading (for example, write-only, WO).
Source code in lazyllm/module/module.py
    def get(self, key, args, kw):
        """Retrieve data from cache.

Retrieves data from cache based on the provided key and parameters. Throws an exception if cache mode doesn't allow reading or data doesn't exist.

Args:
    key: Cache key used to identify cached data.
    args: Positional arguments used to generate cache hash key.
    kw: Keyword arguments used to generate cache hash key.

**Returns:**

- Any: Data stored in cache.

**Exceptions:** 

- CacheNotFoundError: Raised when the requested data does not exist in the cache, or when the cache mode does not allow reading (for example, write-only, WO).
"""
        if 'R' not in lazyllm.config['cache_mode']:
            raise CacheNotFoundError('Cannot read cache due to `LAZYLLM_CACHE_MODE = WO`')
        hash_key = self._hash(args, kw)
        value = self._strategy.get(key, hash_key)
        return transform_path(value, mode='r2a')

set(key, args, kw, value)

Store data in cache.

Stores data in cache based on the provided key and parameters. If cache mode doesn't allow writing, returns directly without executing storage operation.

Parameters:

  • key

    Cache key used to identify cached data.

  • args

    Positional arguments used to generate cache hash key.

  • kw

    Keyword arguments used to generate cache hash key.

  • value

    Data to be stored.

Note:

  • If cache mode is set to read-only (RO) or disabled (NONE), this method will return directly without executing storage operation.
Source code in lazyllm/module/module.py
    def set(self, key, args, kw, value):
        """Store data in cache.

Stores data in cache based on the provided key and parameters. If cache mode doesn't allow writing, returns directly without executing storage operation.

Args:
    key: Cache key used to identify cached data.
    args: Positional arguments used to generate cache hash key.
    kw: Keyword arguments used to generate cache hash key.
    value: Data to be stored.

**Note:** 

- If cache mode is set to read-only (RO) or disabled (NONE), this method will return directly without executing storage operation.
"""
        if 'W' not in lazyllm.config['cache_mode']: return
        hash_key = self._hash(args, kw)
        value = transform_path(value, mode='a2r')
        self._strategy.set(key, hash_key, value)
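
The read/write gating described for get and set can be tried out with a small standalone sketch. It does not use LazyLLM: `TinyCache`, its `mode` argument, the local `CacheNotFoundError` class, and the dictionary-backed storage are hypothetical stand-ins for the real strategy and for `lazyllm.config['cache_mode']`, and the hashing scheme is an assumption chosen only for illustration.

```python
import hashlib
import pickle


class CacheNotFoundError(KeyError):
    """Hypothetical stand-in for the exception raised on a cache miss."""


class TinyCache:
    """Toy cache whose mode string gates reads ('R') and writes ('W')."""

    def __init__(self, mode: str = 'RW'):
        self.mode = mode  # e.g. 'RW', 'RO', 'WO' or 'NONE'
        self._store = {}

    @staticmethod
    def _hash(args, kw) -> str:
        # Assumed scheme: hash the pickled call arguments.
        payload = pickle.dumps((args, sorted(kw.items())))
        return hashlib.md5(payload).hexdigest()

    def get(self, key, args, kw):
        if 'R' not in self.mode:
            raise CacheNotFoundError('reading is disabled in this mode')
        try:
            return self._store[(key, self._hash(args, kw))]
        except KeyError:
            raise CacheNotFoundError(f'no entry for {key!r}') from None

    def set(self, key, args, kw, value):
        if 'W' not in self.mode:
            return  # skip silently, as the documented set() does
        self._store[(key, self._hash(args, kw))] = value


cache = TinyCache('RW')
cache.set('chat', ('hello',), {'temperature': 0}, 'hi there')
print(cache.get('chat', ('hello',), {'temperature': 0}))
```

In this sketch a read-only cache never stores anything and a write-only cache always raises on `get`, which matches the behavior described above.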

lazyllm.module.llms.onlinemodule.supplier.qwen.QwenModule

Bases: OnlineChatModuleBase, FileHandlerBase

TODO: The Qianwen model has been finetuned and deployed successfully, but it is not compatible with the OpenAI interface and can only be accessed through the Dashscope SDK.

Source code in lazyllm/module/llms/onlinemodule/supplier/qwen.py
class QwenModule(OnlineChatModuleBase, FileHandlerBase):
    """
    #TODO: The Qianwen model has been finetuned and deployed successfully,
           but it is not compatible with the OpenAI interface and can only
           be accessed through the Dashscope SDK.
    """
    TRAINABLE_MODEL_LIST = ['qwen-turbo', 'qwen-7b-chat', 'qwen-72b-chat']
    VLM_MODEL_PREFIX = ['qwen-vl-plus', 'qwen-vl-max', 'qvq-max', 'qvq-plus']
    MODEL_NAME = 'qwen-plus'

    def __init__(self, base_url: str = 'https://dashscope.aliyuncs.com/', model: str = None,
                 api_key: str = None, stream: bool = True, return_trace: bool = False, **kwargs):
        OnlineChatModuleBase.__init__(self, model_series='QWEN', api_key=api_key or lazyllm.config['qwen_api_key'],
                                      model_name=model or lazyllm.config['qwen_model_name'] or QwenModule.MODEL_NAME,
                                      base_url=base_url, stream=stream, return_trace=return_trace, **kwargs)
        FileHandlerBase.__init__(self)
        self._deploy_paramters = dict()
        if stream:
            self._model_optional_params['incremental_output'] = True
        self.default_train_data = {
            'model': 'qwen-turbo',
            'training_file_ids': None,
            'validation_file_ids': None,
            'training_type': 'efficient_sft',  # sft or efficient_sft
            'hyper_parameters': {
                'n_epochs': 1,
                'batch_size': 16,
                'learning_rate': '1.6e-5',
                'split': 0.9,
                'warmup_ratio': 0.0,
                'eval_steps': 1,
                'lr_scheduler_type': 'linear',
                'max_length': 2048,
                'lora_rank': 8,
                'lora_alpha': 32,
                'lora_dropout': 0.1,
            }
        }
        self.fine_tuning_job_id = None

    def _get_system_prompt(self):
        return ('You are a large-scale language model from Alibaba Cloud, '
                'your name is Tongyi Qianwen, and you are a useful assistant.')

    def _set_chat_url(self):
        self._url = urljoin(self._base_url, 'compatible-mode/v1/chat/completions')

    def _convert_file_format(self, filepath: str) -> str:
        with open(filepath, 'r', encoding='utf-8') as fr:
            dataset = [json.loads(line) for line in fr]

        json_strs = []
        for ex in dataset:
            lineEx = {'messages': []}
            messages = ex.get('messages', [])
            for message in messages:
                role = message.get('role', '')
                content = message.get('content', '')
                if role in ['system', 'user', 'assistant']:
                    lineEx['messages'].append({'role': role, 'content': content})
            json_strs.append(json.dumps(lineEx, ensure_ascii=False))

        return '\n'.join(json_strs)

    def _upload_train_file(self, train_file):
        headers = {
            'Authorization': 'Bearer ' + self._api_key
        }

        url = urljoin(self._base_url, 'api/v1/files')

        self.get_finetune_data(train_file)

        file_object = {
            # The correct format should be to pass in a tuple in the format of:
            # (<fileName>, <fileObject>, <Content-Type>),
            # where fileObject refers to the specific value.
            'files': (os.path.basename(train_file), self._dataHandler, 'application/json'),
            'descriptions': (None, 'training file', None)
        }

        with requests.post(url, headers=headers, files=file_object) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            if 'data' not in r.json().keys():
                raise ValueError('No data found in response')
            if 'uploaded_files' not in r.json()['data'].keys():
                raise ValueError('No uploaded_files found in response')
            # delete temporary training file
            self._dataHandler.close()
            return r.json()['data']['uploaded_files'][0]['file_id']

    def _update_kw(self, data, normal_config):
        current_train_data = self.default_train_data.copy()
        current_train_data.update(data)

        current_train_data['hyper_parameters']['n_epochs'] = normal_config['num_epochs']
        current_train_data['hyper_parameters']['learning_rate'] = str(normal_config['learning_rate'])
        current_train_data['hyper_parameters']['lr_scheduler_type'] = normal_config['lr_scheduler_type']
        current_train_data['hyper_parameters']['batch_size'] = normal_config['batch_size']
        current_train_data['hyper_parameters']['max_length'] = normal_config['cutoff_len']
        current_train_data['hyper_parameters']['lora_rank'] = normal_config['lora_r']
        current_train_data['hyper_parameters']['lora_alpha'] = normal_config['lora_alpha']

        return current_train_data

    def _create_finetuning_job(self, train_model, train_file_id, **kw) -> Tuple[str, str]:
        url = urljoin(self._base_url, 'api/v1/fine-tunes')
        headers = {
            'Content-Type': 'application/json',
            'Authorization': f'Bearer {self._api_key}',
        }
        data = {
            'model': train_model,
            'training_file_ids': [train_file_id]
        }
        if 'training_parameters' in kw.keys():
            data.update(kw['training_parameters'])
        elif 'finetuning_type' in kw:
            data = self._update_kw(data, kw)

        with requests.post(url, headers=headers, json=data) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            fine_tuning_job_id = r.json()['output']['job_id']
            self.fine_tuning_job_id = fine_tuning_job_id
            status = r.json()['output']['status']
            return (fine_tuning_job_id, status)

    def _cancel_finetuning_job(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            return 'Invalid'
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        fine_tune_url = urljoin(self._base_url, f'api/v1/fine-tunes/{job_id}/cancel')
        headers = {
            'Authorization': f'Bearer {self._api_key}',
            'Content-Type': 'application/json'
        }
        with requests.post(fine_tune_url, headers=headers) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        status = r.json()['output']['status']
        if status == 'success':
            return 'Cancelled'
        else:
            return f'JOB {job_id} status: {status}'

    def _query_finetuned_jobs(self):
        fine_tune_url = urljoin(self._base_url, 'api/v1/fine-tunes')
        headers = {
            'Authorization': f'Bearer {self._api_key}',
            'Content-Type': 'application/json'
        }
        with requests.get(fine_tune_url, headers=headers) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return r.json()

    def _get_finetuned_model_names(self) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
        model_data = self._query_finetuned_jobs()
        res = list()
        if 'jobs' not in model_data['output']:
            return res
        for model in model_data['output']['jobs']:
            status = self._status_mapping(model['status'])
            if status == 'Done':
                model_id = model['finetuned_output']
            else:
                model_id = model['model'] + '-' + model['job_id']
            res.append([model['job_id'], model_id, status])
        return res

    def _status_mapping(self, status):
        if status == 'SUCCEEDED':
            return 'Done'
        elif status == 'FAILED':
            return 'Failed'
        elif status in ('CANCELING', 'CANCELED'):
            return 'Cancelled'
        elif status == 'RUNNING':
            return 'Running'
        else:  # PENDING, QUEUING
            return 'Pending'

    def _query_job_status(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            raise RuntimeError('No job ID specified. Please ensure that a valid "fine_tuning_job_id" is '
                               'provided as an argument or started a training job.')
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        _, status = self._query_finetuning_job(job_id)
        return self._status_mapping(status)

    def _get_log(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            raise RuntimeError('No job ID specified. Please ensure that a valid "fine_tuning_job_id" is '
                               'provided as an argument or started a training job.')
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        fine_tune_url = urljoin(self._base_url, f'api/v1/fine-tunes/{job_id}/logs')
        headers = {
            'Authorization': f'Bearer {self._api_key}',
            'Content-Type': 'application/json'
        }
        with requests.get(fine_tune_url, headers=headers) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return job_id, r.json()

    def _get_curr_job_model_id(self):
        if not self.fine_tuning_job_id:
            return None, None
        model_id, _ = self._query_finetuning_job(self.fine_tuning_job_id)
        return self.fine_tuning_job_id, model_id

    def _query_finetuning_job_info(self, fine_tuning_job_id):
        fine_tune_url = urljoin(self._base_url, f'api/v1/fine-tunes/{fine_tuning_job_id}')
        headers = {
            'Authorization': f'Bearer {self._api_key}',
            'Content-Type': 'application/json'
        }
        with requests.get(fine_tune_url, headers=headers) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return r.json()['output']

    def _query_finetuning_job(self, fine_tuning_job_id) -> Tuple[str, str]:
        info = self._query_finetuning_job_info(fine_tuning_job_id)
        status = info['status']
        # For Qwen, `finetuned_output` is present only when status == 'SUCCEEDED'
        if 'finetuned_output' in info:
            fine_tuned_model = info['finetuned_output']
        else:
            fine_tuned_model = info['model'] + '-' + info['job_id']
        return (fine_tuned_model, status)

    def _query_finetuning_cost(self, fine_tuning_job_id):
        info = self._query_finetuning_job_info(fine_tuning_job_id)
        if 'usage' in info and info['usage']:
            return info['usage']
        else:
            return None

    def set_deploy_parameters(self, **kw):
        """Set model deployment parameters.

Configure relevant parameters for deployment tasks, such as capacity specifications, for subsequent model deployment.

Args:
    **kw: Deployment parameter key-value pairs.
"""
        self._deploy_paramters = kw

    def _create_deployment(self) -> Tuple[str, str]:
        url = urljoin(self._base_url, 'api/v1/deployments')
        headers = {
            'Content-Type': 'application/json',
            'Authorization': f'Bearer {self._api_key}',
        }
        data = {
            'model_name': self._model_name,
            'capacity': self._deploy_paramters.get('capacity', 2)
        }
        if self._deploy_paramters and len(self._deploy_paramters) > 0:
            data.update(self._deploy_paramters)

        with requests.post(url, headers=headers, json=data) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            deployment_id = r.json()['output']['deployed_model']
            status = r.json()['output']['status']
            return (deployment_id, status)

    def _query_deployment(self, deployment_id) -> str:
        fine_tune_url = urljoin(self._base_url, f'api/v1/deployments/{deployment_id}')
        headers = {
            'Content-Type': 'application/json',
            'Authorization': f'Bearer {self._api_key}'
        }
        with requests.get(fine_tune_url, headers=headers) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            status = r.json()['output']['status']
            return status

    def _format_vl_chat_image_url(self, image_url, mime):
        assert mime is not None, 'Qwen Module requires mime info.'
        image_url = f'data:{mime};base64,{image_url}'
        return [{'type': 'image_url', 'image_url': {'url': image_url}}]
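
The data preparation step performed by `_convert_file_format` can be tried in isolation. The sketch below is a standalone function, not part of LazyLLM: `filter_chat_jsonl` is a hypothetical name, it takes the JSONL text directly instead of a file path, and unlike the method above it also skips blank lines.

```python
import json

ALLOWED_ROLES = ('system', 'user', 'assistant')


def filter_chat_jsonl(jsonl_text: str) -> str:
    """Keep only role/content pairs with a supported role, one JSON object per line."""
    out_lines = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # assumption: tolerate blank lines
        example = json.loads(line)
        messages = [
            {'role': m.get('role', ''), 'content': m.get('content', '')}
            for m in example.get('messages', [])
            if m.get('role', '') in ALLOWED_ROLES
        ]
        out_lines.append(json.dumps({'messages': messages}, ensure_ascii=False))
    return '\n'.join(out_lines)


raw = json.dumps({'messages': [
    {'role': 'user', 'content': 'Hi', 'extra': 1},       # extra keys are dropped
    {'role': 'tool', 'content': 'ignored'},              # unsupported role is dropped
    {'role': 'assistant', 'content': 'Hello!'},
]})
print(filter_chat_jsonl(raw))
```

Each output line keeps only the `role` and `content` fields, so any additional keys in the input records do not reach the uploaded training file.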

set_deploy_parameters(**kw)

Set model deployment parameters.

Configure relevant parameters for deployment tasks, such as capacity specifications, for subsequent model deployment.

Parameters:

  • **kw

    Deployment parameter key-value pairs.

Source code in lazyllm/module/llms/onlinemodule/supplier/qwen.py
    def set_deploy_parameters(self, **kw):
        """Set model deployment parameters.

Configure relevant parameters for deployment tasks, such as capacity specifications, for subsequent model deployment.

Args:
    **kw: Deployment parameter key-value pairs.
"""
        self._deploy_paramters = kw
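
As a rough illustration of how the stored parameters end up in the deployment request, the sketch below mirrors the merge done in `_create_deployment`: defaults first, then user-supplied values on top. `build_deployment_payload` and the model name `'my-finetuned-model'` are hypothetical; only the default capacity of 2 comes from the source above.

```python
from typing import Any, Dict


def build_deployment_payload(model_name: str, deploy_params: Dict[str, Any]) -> Dict[str, Any]:
    """Start from the defaults, then let user-supplied parameters override them."""
    payload = {'model_name': model_name, 'capacity': 2}
    payload.update(deploy_params)
    return payload


print(build_deployment_payload('my-finetuned-model', {}))
print(build_deployment_payload('my-finetuned-model', {'capacity': 4}))
```

Because the user-supplied dictionary is applied last, any key passed to `set_deploy_parameters` takes precedence over the defaults.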

lazyllm.module.llms.onlinemodule.supplier.qwen.QwenEmbedding

Bases: OnlineEmbeddingModuleBase

Qwen online text embedding module.

This class inherits from OnlineEmbeddingModuleBase and provides interaction capabilities with the Qwen text embedding API, supporting conversion of text to vector representations.

Parameters:

  • embed_url (str, default: 'https://dashscope.aliyuncs.com/api/v1/services/embeddings/text-embedding/text-embedding' ) –

    Embedding API URL address. Defaults to Qwen official API address

  • embed_model_name (str, default: 'text-embedding-v1' ) –

    Embedding model name. Defaults to 'text-embedding-v1'

  • api_key (str, default: None ) –

    API key. Defaults to 'qwen_api_key' from configuration

Source code in lazyllm/module/llms/onlinemodule/supplier/qwen.py
class QwenEmbedding(OnlineEmbeddingModuleBase):
    """Qwen online text embedding module.

This class inherits from OnlineEmbeddingModuleBase and provides interaction capabilities with the Qwen text embedding API, supporting conversion of text to vector representations.

Args:
    embed_url (str, optional): Embedding API URL address. Defaults to Qwen official API address
    embed_model_name (str, optional): Embedding model name. Defaults to 'text-embedding-v1'
    api_key (str, optional): API key. Defaults to 'qwen_api_key' from configuration
"""

    def __init__(self,
                 embed_url: str = ('https://dashscope.aliyuncs.com/api/v1/services/'
                                   'embeddings/text-embedding/text-embedding'),
                 embed_model_name: str = 'text-embedding-v1',
                 api_key: str = None,
                 batch_size: int = 16,
                 **kw):
        super().__init__('QWEN', embed_url, api_key or lazyllm.config['qwen_api_key'], embed_model_name,
                         batch_size=batch_size, **kw)

    def _encapsulated_data(self, text: Union[List, str], **kwargs):
        if isinstance(text, str):
            json_data = {
                'input': {
                    'texts': [text]
                },
                'model': self._embed_model_name
            }
            if len(kwargs) > 0:
                json_data.update(kwargs)
            return json_data
        else:
            text_batch = [text[i: i + self._batch_size] for i in range(0, len(text), self._batch_size)]
            json_data = [{'input': {'texts': texts}, 'model': self._embed_model_name} for texts in text_batch]
            if len(kwargs) > 0:
                for i in range(len(json_data)):
                    json_data[i].update(kwargs)
            return json_data

    def _parse_response(self, response: Dict, input: Union[List, str]) -> Union[List[List[float]], List[float]]:
        output = response.get('output', {})
        if not output:
            return []
        embeddings = output.get('embeddings', [])
        if not embeddings:
            return []
        if isinstance(input, str):
            return embeddings[0].get('embedding', [])
        else:
            return [res.get('embedding', []) for res in embeddings]
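
The batching and parsing logic above can be exercised without calling the API. The following is a minimal standalone sketch modeled on `_encapsulated_data` and `_parse_response`; the function names are hypothetical, and the response dictionary in the usage lines is hand-written sample data, not real API output.

```python
from typing import Any, Dict, List, Union


def build_embedding_requests(text: Union[str, List[str]], model: str,
                             batch_size: int = 16, **extra: Any):
    """Return one payload for a string, or a list of payloads for a list of strings."""
    if isinstance(text, str):
        payload = {'input': {'texts': [text]}, 'model': model}
        payload.update(extra)
        return payload
    # Split the list into consecutive chunks of at most batch_size items.
    batches = [text[i:i + batch_size] for i in range(0, len(text), batch_size)]
    return [{'input': {'texts': batch}, 'model': model, **extra} for batch in batches]


def parse_embedding_response(response: Dict, original_input: Union[str, List[str]]):
    """Extract vectors from a response shaped like {'output': {'embeddings': [...]}}."""
    embeddings = (response.get('output') or {}).get('embeddings', [])
    if not embeddings:
        return []
    if isinstance(original_input, str):
        return embeddings[0].get('embedding', [])
    return [item.get('embedding', []) for item in embeddings]


payloads = build_embedding_requests(['a', 'b', 'c'], 'text-embedding-v1', batch_size=2)
print([p['input']['texts'] for p in payloads])

sample = {'output': {'embeddings': [{'embedding': [0.1, 0.2]}]}}
print(parse_embedding_response(sample, 'a'))
```

A single string yields one flat vector, while a list yields one vector per returned item, which is why the parser needs the original input to decide the return shape.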

lazyllm.module.llms.onlinemodule.supplier.glm.GLMEmbedding

Bases: OnlineEmbeddingModuleBase

GLM embedding model interface class for calling Zhipu AI's text embedding services.

Parameters:

  • embed_url (str, default: 'https://open.bigmodel.cn/api/paas/v4/embeddings' ) –

    Embedding service API address, defaults to "https://open.bigmodel.cn/api/paas/v4/embeddings"

  • embed_model_name (str, default: 'embedding-2' ) –

    Embedding model name, defaults to "embedding-2"

  • api_key (str, default: None ) –

    API key

Source code in lazyllm/module/llms/onlinemodule/supplier/glm.py
class GLMEmbedding(OnlineEmbeddingModuleBase):
    """GLM embedding model interface class for calling Zhipu AI's text embedding services.

Args:
    embed_url (str): Embedding service API address, defaults to "https://open.bigmodel.cn/api/paas/v4/embeddings"
    embed_model_name (str): Embedding model name, defaults to "embedding-2"
    api_key (str): API key
"""
    def __init__(self,
                 embed_url: str = 'https://open.bigmodel.cn/api/paas/v4/embeddings',
                 embed_model_name: str = 'embedding-2',
                 api_key: str = None,
                 batch_size: int = 16,
                 **kw):
        super().__init__('GLM', embed_url, api_key or lazyllm.config['glm_api_key'], embed_model_name,
                         batch_size=batch_size, **kw)

lazyllm.module.llms.onlinemodule.supplier.glm.GLMSTTModule

Bases: GLMMultiModal

GLM Speech-to-Text module, inherits from GLMMultiModal.

Provides speech-to-text (STT) functionality based on Zhipu AI, supports audio file speech recognition.

Parameters:

  • model_name (str, default: None ) –

    Model name, defaults to configured model name or "glm-asr"

  • api_key (str, default: None ) –

    API key, defaults to configured key

  • return_trace (bool, default: False ) –

    Whether to return trace information, defaults to False

  • **kwargs

    Other model parameters

Source code in lazyllm/module/llms/onlinemodule/supplier/glm.py
class GLMSTTModule(GLMMultiModal):
    """GLM Speech-to-Text module, inherits from GLMMultiModal.

Provides speech-to-text (STT) functionality based on Zhipu AI, supports audio file speech recognition.

Args:
    model_name (str, optional): Model name, defaults to configured model name or "glm-asr"
    api_key (str, optional): API key, defaults to configured key
    return_trace (bool, optional): Whether to return trace information, defaults to False
    **kwargs: Other model parameters
"""
    MODEL_NAME = 'glm-asr'

    def __init__(self, model_name: str = None, api_key: str = None, return_trace: bool = False, **kwargs):
        GLMMultiModal.__init__(self, model_name=model_name or lazyllm.config['glm_stt_model_name']
                               or GLMSTTModule.MODEL_NAME, api_key=api_key,
                               return_trace=return_trace, **kwargs)

    def _forward(self, files: List[str] = [], **kwargs):  # noqa B006
        assert len(files) == 1, 'GLMSTTModule only supports one file'
        assert os.path.exists(files[0]), f'File {files[0]} not found'
        with open(files[0], 'rb') as audio_file:  # close the file handle once the request completes
            transcriptResponse = self._client.audio.transcriptions.create(
                model=self._model_name,
                file=audio_file,
            )
        return transcriptResponse.text

lazyllm.module.llms.onlinemodule.supplier.deepseek.DeepSeekModule

Bases: OnlineChatModuleBase

DeepSeek large language model interface module.

Parameters:

  • base_url (str, default: 'https://api.deepseek.com' ) –

    API base URL, defaults to "https://api.deepseek.com"

  • model (str, default: 'deepseek-chat' ) –

    Model name, defaults to "deepseek-chat"

  • api_key (str, default: None ) –

    API key; if None, it is read from the configuration (`deepseek_api_key`)

  • stream (bool, default: True ) –

    Whether to enable streaming output, defaults to True

  • return_trace (bool, default: False ) –

    Whether to return trace information, defaults to False

  • **kwargs

    Other parameters passed to base class

Source code in lazyllm/module/llms/onlinemodule/supplier/deepseek.py
class DeepSeekModule(OnlineChatModuleBase):
    """DeepSeek large language model interface module.

Args:
    base_url (str): API base URL, defaults to "https://api.deepseek.com"
    model (str): Model name, defaults to "deepseek-chat"
    api_key (str): API key; if None, it is read from the configuration (`deepseek_api_key`)
    stream (bool): Whether to enable streaming output, defaults to True
    return_trace (bool): Whether to return trace information, defaults to False
    **kwargs: Other parameters passed to base class
"""
    def __init__(self, base_url: str = 'https://api.deepseek.com', model: str = 'deepseek-chat',
                 api_key: str = None, stream: bool = True, return_trace: bool = False, **kwargs):
        super().__init__(model_series='DEEPSEEK', api_key=api_key or lazyllm.config['deepseek_api_key'],
                         base_url=base_url, model_name=model, stream=stream, return_trace=return_trace, **kwargs)

    def _get_system_prompt(self):
        return 'You are an intelligent assistant developed by China\'s DeepSeek. You are a helpful assistant.'

    def _set_chat_url(self):
        self._url = urljoin(self._base_url, 'chat/completions')

    def _validate_api_key(self):
        try:
            models_url = urljoin(self._base_url, 'models')
            headers = {
                'Authorization': f'Bearer {self._api_key}',
                'Content-Type': 'application/json'
            }
            response = requests.get(models_url, headers=headers, timeout=10)
            return response.status_code == 200
        except Exception:
            return False
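
Endpoint URLs in these modules are built by joining the base URL with a relative path. Assuming `urljoin` here is the standard-library `urllib.parse.urljoin` (the import is not shown in the listing), the short sketch below shows how a trailing slash on a custom `base_url` changes the result. `https://example.com/v1` is a placeholder, not a real endpoint.

```python
from urllib.parse import urljoin

# Base URL without a path: the relative path is appended directly.
print(urljoin('https://api.deepseek.com', 'chat/completions'))

# Base URL with a path and a trailing slash: the path segment is kept.
print(urljoin('https://example.com/v1/', 'chat/completions'))

# Same base URL without the trailing slash: the last segment ('v1') is replaced.
print(urljoin('https://example.com/v1', 'chat/completions'))
```

When pointing a module at a proxy or self-hosted gateway whose base URL includes a path, ending that URL with `/` keeps the path in the final endpoint.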

lazyllm.module.llms.onlinemodule.supplier.doubao.DoubaoTextToImageModule

Bases: DoubaoMultiModal

ByteDance Doubao Text-to-Image module supporting text to image generation.

Based on ByteDance Doubao multimodal model's text-to-image functionality, inherits from DoubaoMultiModal, providing high-quality text to image generation capability.

Parameters:

  • api_key (str, default: None ) –

    Doubao API key, defaults to None.

  • model_name (str, default: None ) –

    Model name, defaults to "doubao-seedream-3-0-t2i-250415".

  • return_trace (bool, default: False ) –

    Whether to return trace information, defaults to False.

  • **kwargs

    Other parameters passed to parent class.

Source code in lazyllm/module/llms/onlinemodule/supplier/doubao.py
class DoubaoTextToImageModule(DoubaoMultiModal):
    """ByteDance Doubao Text-to-Image module supporting text to image generation.

Based on ByteDance Doubao multimodal model's text-to-image functionality, 
inherits from DoubaoMultiModal, providing high-quality text to image generation capability.

Args:
    api_key (str, optional): Doubao API key, defaults to None.
    model_name (str, optional): Model name, defaults to "doubao-seedream-3-0-t2i-250415".
    return_trace (bool, optional): Whether to return trace information, defaults to False.
    **kwargs: Other parameters passed to parent class.
"""
    MODEL_NAME = 'doubao-seedream-3-0-t2i-250415'

    def __init__(self, api_key: str = None, model_name: str = None, return_trace: bool = False, **kwargs):
        DoubaoMultiModal.__init__(self, api_key=api_key, model_name=model_name
                                  or lazyllm.config['doubao_text2image_model_name']
                                  or DoubaoTextToImageModule.MODEL_NAME,
                                  return_trace=return_trace, **kwargs)

    def _forward(self, input: str = None, size: str = '1024x1024', seed: int = -1, guidance_scale: float = 2.5,
                 watermark: bool = True, **kwargs):
        imagesResponse = self._client.images.generate(
            model=self._model_name,
            prompt=input,
            size=size,
            seed=seed,
            guidance_scale=guidance_scale,
            watermark=watermark,
            **kwargs
        )
        return encode_query_with_filepaths(None, bytes_to_file([requests.get(result.url).content
                                                                for result in imagesResponse.data]))

lazyllm.module.llms.onlinemodule.supplier.openai.OpenAIModule

Bases: OnlineChatModuleBase, FileHandlerBase

OpenAI API integration module for chat completion and fine-tuning operations.

Provides interface to interact with OpenAI's chat models, supporting both inference and fine-tuning capabilities. Inherits from OnlineChatModuleBase and FileHandlerBase.

Parameters:

  • base_url (str, default: 'https://api.openai.com/v1/' ) –

    OpenAI API base URL, defaults to "https://api.openai.com/v1/".

  • model (str, default: 'gpt-3.5-turbo' ) –

    Model name to use for chat completion, defaults to "gpt-3.5-turbo".

  • api_key (str, default: None ) –

    OpenAI API key, defaults to lazyllm.config['openai_api_key'].

  • stream (bool, default: True ) –

    Whether to use streaming response, defaults to True.

  • return_trace (bool, default: False ) –

    Whether to return trace information, defaults to False.

  • **kwargs

    Additional arguments passed to OnlineChatModuleBase.

Source code in lazyllm/module/llms/onlinemodule/supplier/openai.py
class OpenAIModule(OnlineChatModuleBase, FileHandlerBase):
    """OpenAI API integration module for chat completion and fine-tuning operations.

Provides interface to interact with OpenAI's chat models, supporting both inference
and fine-tuning capabilities. Inherits from OnlineChatModuleBase and FileHandlerBase.

Args:
    base_url (str, optional): OpenAI API base URL, defaults to "https://api.openai.com/v1/".
    model (str, optional): Model name to use for chat completion, defaults to "gpt-3.5-turbo".
    api_key (str, optional): OpenAI API key, defaults to lazyllm.config['openai_api_key'].
    stream (bool, optional): Whether to use streaming response, defaults to True.
    return_trace (bool, optional): Whether to return trace information, defaults to False.
    **kwargs: Additional arguments passed to OnlineChatModuleBase.
"""
    TRAINABLE_MODEL_LIST = ['gpt-3.5-turbo-0125', 'gpt-3.5-turbo-1106',
                            'gpt-3.5-turbo-0613', 'babbage-002',
                            'davinci-002', 'gpt-4-0613']
    NO_PROXY = False

    def __init__(self, base_url: str = 'https://api.openai.com/v1/', model: str = 'gpt-3.5-turbo',
                 api_key: str = None, stream: bool = True, return_trace: bool = False, skip_auth: bool = False, **kw):
        OnlineChatModuleBase.__init__(self, model_series='OPENAI', api_key=api_key or lazyllm.config['openai_api_key'],
                                      base_url=base_url, model_name=model, stream=stream, return_trace=return_trace,
                                      skip_auth=skip_auth, **kw)
        FileHandlerBase.__init__(self)
        self.default_train_data = {
            'model': 'gpt-3.5-turbo-0613',
            'training_file': None,
            'validation_file': None,
            'hyperparameters': {
                'n_epochs': 1,
                'batch_size': 16,
                'learning_rate_multiplier': '1.6e-5',
            }
        }
        self.fine_tuning_job_id = None

    def _get_system_prompt(self):
        return 'You are ChatGPT, a large language model trained by OpenAI. You are a helpful assistant.'

    def _convert_file_format(self, filepath: str) -> str:
        with open(filepath, 'r', encoding='utf-8') as fr:
            dataset = [json.loads(line) for line in fr]

        json_strs = []
        for ex in dataset:
            lineEx = {'messages': []}
            messages = ex.get('messages', [])
            for message in messages:
                role = message.get('role', '')
                content = message.get('content', '')
                if role in ['system', 'user', 'assistant']:
                    lineEx['messages'].append({'role': role, 'content': content})
            json_strs.append(json.dumps(lineEx, ensure_ascii=False))

        return '\n'.join(json_strs)

    def _upload_train_file(self, train_file):
        headers = {
            'Authorization': 'Bearer ' + self._api_key
        }

        url = urljoin(self._base_url, 'files')

        self.get_finetune_data(train_file)

        file_object = {
            'purpose': (None, 'fine-tune', None),
            'file': (os.path.basename(train_file), self._dataHandler, 'application/json')
        }

        with requests.post(url, headers=headers, files=file_object) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            # delete temporary training file
            self._dataHandler.close()
            return r.json()['id']

    def _update_kw(self, data, normal_config):
        current_train_data = self.default_train_data.copy()
        current_train_data.update(data)

        current_train_data['hyperparameters']['n_epochs'] = normal_config['num_epochs']
        current_train_data['hyperparameters']['learning_rate_multiplier'] = str(normal_config['learning_rate'])
        current_train_data['hyperparameters']['batch_size'] = normal_config['batch_size']
        current_train_data['suffix'] = str(uuid.uuid4())[:7]

        return current_train_data

    def _create_finetuning_job(self, train_model, train_file_id, **kw) -> Tuple[str, str]:
        url = urljoin(self._base_url, 'fine_tuning/jobs')
        headers = {
            'Content-Type': 'application/json',
            'Authorization': f'Bearer {self._api_key}',
        }
        data = {
            'model': train_model,
            'training_file': train_file_id
        }
        if len(kw) > 0:
            if 'finetuning_type' in kw:
                data = self._update_kw(data, kw)
            else:
                data.update(kw)

        with requests.post(url, headers=headers, json=data) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))

            fine_tuning_job_id = r.json()['id']
            self.fine_tuning_job_id = fine_tuning_job_id
            status = r.json()['status']
            return (fine_tuning_job_id, status)

    def _cancel_finetuning_job(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            return 'Invalid'
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        fine_tune_url = urljoin(self._base_url, f'fine_tuning/jobs/{job_id}/cancel')
        headers = {
            'Authorization': f'Bearer {self._api_key}'
        }
        with requests.post(fine_tune_url, headers=headers) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        status = r.json()['status']
        if status == 'cancelled':
            return 'Cancelled'
        else:
            return f'JOB {job_id} status: {status}'

    def _query_finetuned_jobs(self):
        fine_tune_url = urljoin(self._base_url, 'fine_tuning/jobs')
        headers = {
            'Authorization': f'Bearer {self._api_key}',
        }
        with requests.get(fine_tune_url, headers=headers) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return r.json()

    def _get_finetuned_model_names(self) -> Tuple[List[Tuple[str, str]], List[Tuple[str, str]]]:
        model_data = self._query_finetuned_jobs()
        res = list()
        for model in model_data['data']:
            res.append([model['id'], model['fine_tuned_model'], self._status_mapping(model['status'])])
        return res

    def _status_mapping(self, status):
        if status == 'succeeded':
            return 'Done'
        elif status == 'failed':
            return 'Failed'
        elif status == 'cancelled':
            return 'Cancelled'
        elif status == 'running':
            return 'Running'
        else:  # validating_files, queued
            return 'Pending'

    def _query_job_status(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            raise RuntimeError('No job ID specified. Please ensure that a valid "fine_tuning_job_id" is '
                               'provided as an argument or started a training job.')
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        _, status = self._query_finetuning_job(job_id)
        return self._status_mapping(status)

    def _get_log(self, fine_tuning_job_id=None):
        if not fine_tuning_job_id and not self.fine_tuning_job_id:
            raise RuntimeError('No job ID specified. Please ensure that a valid "fine_tuning_job_id" is '
                               'provided as an argument or started a training job.')
        job_id = fine_tuning_job_id if fine_tuning_job_id else self.fine_tuning_job_id
        fine_tune_url = urljoin(self._base_url, f'fine_tuning/jobs/{job_id}/events')
        headers = {
            'Authorization': f'Bearer {self._api_key}'
        }
        with requests.get(fine_tune_url, headers=headers) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return job_id, r.json()

    def _get_curr_job_model_id(self):
        if not self.fine_tuning_job_id:
            return None, None
        model_id, _ = self._query_finetuning_job(self.fine_tuning_job_id)
        return self.fine_tuning_job_id, model_id

    def _query_finetuning_job_info(self, fine_tuning_job_id):
        fine_tune_url = urljoin(self._base_url, f'fine_tuning/jobs/{fine_tuning_job_id}')
        headers = {
            'Authorization': f'Bearer {self._api_key}'
        }
        with requests.get(fine_tune_url, headers=headers) as r:
            if r.status_code != 200:
                raise requests.RequestException('\n'.join([c.decode('utf-8') for c in r.iter_content(None)]))
        return r.json()

    def _query_finetuning_job(self, fine_tuning_job_id) -> Tuple[str, str]:
        info = self._query_finetuning_job_info(fine_tuning_job_id)
        status = info['status']
        fine_tuned_model = info['fine_tuned_model'] if 'fine_tuned_model' in info else None
        return (fine_tuned_model, status)

    def _query_finetuning_cost(self, fine_tuning_job_id):
        info = self._query_finetuning_job_info(fine_tuning_job_id)
        if 'trained_tokens' in info and info['trained_tokens']:
            return info['trained_tokens']
        else:
            return None

    def _create_deployment(self) -> Tuple[str, str]:
        return (self._model_name, 'RUNNING')

    def _query_deployment(self, deployment_id) -> str:
        return 'RUNNING'

lazyllm.module.llms.onlinemodule.supplier.openai.OpenAIReranking

Bases: OnlineEmbeddingModuleBase

The OpenAIReranking class provides functionality to call OpenAI's Reranking API for re-ordering a list of text documents.

This class inherits from OnlineEmbeddingModuleBase and mainly provides:

  • Setting the embedding model URL and name;
  • Encapsulating request data and calling the OpenAI Rerank API;
  • Parsing the returned ranking results.

Parameters:

  • embed_url (str, default: 'https://api.openai.com/v1/' ) –

    Base URL of the OpenAI API, default is 'https://api.openai.com/v1/'.

  • embed_model_name (str, default: '' ) –

    Name of the embedding model used for Rerank.

  • api_key (str, default: None ) –

    OpenAI API Key, optional. If not provided, the default from lazyllm config is used.

  • **kw

    Additional keyword arguments passed to the parent constructor.
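
The request and response handling described above can be sketched without any network access. This is a minimal standalone sketch modelled on `_encapsulated_data` and `_parse_response` in the listing below; the function names `build_rerank_payload` and `parse_rerank_response`, the model name, and the sample response are illustrative assumptions, not part of LazyLLM.

```python
from typing import Any, Dict, List, Tuple


def build_rerank_payload(query: str, documents: List[str], top_n: int,
                         model: str, **extra: Any) -> Dict[str, Any]:
    """Assemble the JSON body; caller-supplied extras are applied last."""
    payload: Dict[str, Any] = {
        'query': query,
        'documents': documents,
        'top_n': top_n,
        'model': model,
    }
    payload.update(extra)  # extra keys are merged on top of the four defaults
    return payload


def parse_rerank_response(response: Dict[str, Any]) -> List[Tuple[int, float]]:
    """Reduce each result entry to an (index, relevance_score) pair."""
    return [(r['index'], r['relevance_score']) for r in response['results']]


# Hypothetical response, shaped the way the parser expects it.
sample_response = {'results': [{'index': 1, 'relevance_score': 0.92},
                               {'index': 0, 'relevance_score': 0.15}]}

payload = build_rerank_payload('what is lazyllm?', ['doc a', 'doc b'],
                               top_n=2, model='my-rerank-model')
ranked = parse_rerank_response(sample_response)
```

Each returned pair refers back to a position in the original `documents` list, so the caller can recover the reordered documents by indexing into it.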

Source code in lazyllm/module/llms/onlinemodule/supplier/openai.py
class OpenAIReranking(OnlineEmbeddingModuleBase):
    """
The OpenAIReranking class provides functionality to call OpenAI's Reranking API for re-ordering a list of text documents.

This class inherits from `OnlineEmbeddingModuleBase` and mainly provides:

- Setting the embedding model URL and name;
- Encapsulating request data and calling the OpenAI Rerank API;
- Parsing the returned ranking results.

Args:
    embed_url (str): Base URL of the OpenAI API, default is 'https://api.openai.com/v1/'.
    embed_model_name (str): Name of the embedding model used for Rerank.
    api_key (str): OpenAI API Key, optional. If not provided, the default from lazyllm config is used.
    **kw: Additional keyword arguments passed to the parent constructor.
"""
    NO_PROXY = True

    def __init__(self,
                 embed_url: str = 'https://api.openai.com/v1/',
                 embed_model_name: str = '',
                 api_key: str = None,
                 **kw):
        super().__init__('OPENAI', embed_url, api_key or lazyllm.config['openai_api_key'], embed_model_name, **kw)

    def _set_embed_url(self):
        self._embed_url = urljoin(self._embed_url, 'rerank')

    @property
    def type(self):
        return 'RERANK'

    def _encapsulated_data(self, query: str, documents: List[str], top_n: int, **kwargs) -> Dict[str, str]:
        json_data = {
            'query': query,
            'documents': documents,
            'top_n': top_n,
            'model': self._embed_model_name
        }
        if len(kwargs) > 0:
            json_data.update(kwargs)

        return json_data

    def _parse_response(self, response: Dict, input: Union[List, str]) -> List[Tuple]:
        results = response['results']
        return [(result['index'], result['relevance_score']) for result in results]

lazyllm.module.llms.onlinemodule.supplier.sensenova.SenseNovaEmbedding

Bases: OnlineEmbeddingModuleBase, _SenseNovaBase

SenseTime SenseNova Embedding module for text vectorization operations. Provides an interface to SenseTime's SenseNova embedding models, supporting text-to-vector conversion. Inherits from OnlineEmbeddingModuleBase and _SenseNovaBase.

Parameters:

  • embed_url (str, default: 'https://api.sensenova.cn/v1/llm/embeddings' ) –

    Embedding API URL, defaults to "https://api.sensenova.cn/v1/llm/embeddings".

  • embed_model_name (str, default: 'nova-embedding-stable' ) –

    Embedding model name, defaults to "nova-embedding-stable".

  • api_key (str, default: None ) –

    API access key, defaults to None.

  • secret_key (str, default: None ) –

    API secret key, defaults to None.
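
The shape of the returned value depends on whether the input was a single string or a list. Here is a minimal standalone sketch of that behaviour, modelled on `_parse_response` in the listing below; the function name `parse_embeddings` and the sample response are illustrative assumptions.

```python
from typing import Any, Dict, List, Union

Vector = List[float]


def parse_embeddings(response: Dict[str, Any],
                     original_input: Union[str, List[str]]
                     ) -> Union[Vector, List[Vector]]:
    """Return one vector for a string input, a list of vectors otherwise."""
    items = response.get('embeddings', [])
    if not items:
        return []  # missing or empty 'embeddings' yields an empty list
    if isinstance(original_input, str):
        return items[0].get('embedding', [])
    return [item.get('embedding', []) for item in items]


# Hypothetical response carrying two embeddings.
sample = {'embeddings': [{'embedding': [0.1, 0.2]}, {'embedding': [0.3, 0.4]}]}

single = parse_embeddings(sample, 'hello')
batch = parse_embeddings(sample, ['hello', 'world'])
empty = parse_embeddings({}, 'hello')
```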

Source code in lazyllm/module/llms/onlinemodule/supplier/sensenova.py
class SenseNovaEmbedding(OnlineEmbeddingModuleBase, _SenseNovaBase):
    """SenseTime SenseNova Embedding module for text vectorization operations. Provides an interface to SenseTime's SenseNova embedding models, supporting text-to-vector conversion. Inherits from OnlineEmbeddingModuleBase and _SenseNovaBase.

Args:
    embed_url (str, optional): Embedding API URL, defaults to "https://api.sensenova.cn/v1/llm/embeddings".
    embed_model_name (str, optional): Embedding model name, defaults to "nova-embedding-stable".
    api_key (str, optional): API access key, defaults to None.
    secret_key (str, optional): API secret key, defaults to None.
"""

    def __init__(self,
                 embed_url: str = 'https://api.sensenova.cn/v1/llm/embeddings',
                 embed_model_name: str = 'nova-embedding-stable',
                 api_key: str = None,
                 secret_key: str = None,
                 batch_size: int = 16,
                 **kw):
        api_key = self._get_api_key(api_key, secret_key)
        super().__init__('SENSENOVA', embed_url, api_key, embed_model_name,
                         batch_size=batch_size, **kw)

    def _parse_response(self, response: Dict, input: Union[List, str]) -> Union[List[List[float]], List[float]]:
        embeddings = response.get('embeddings', [])
        if not embeddings:
            return []
        if isinstance(input, str):
            return embeddings[0].get('embedding', [])
        else:
            return [res.get('embedding', []) for res in embeddings]

lazyllm.module.llms.onlinemodule.supplier.siliconflow.SiliconFlowTTS

Bases: OnlineMultiModalBase

SiliconFlow Text-to-Speech module, inherits from OnlineMultiModalBase.

Provides text-to-speech (TTS) functionality based on SiliconFlow, supports converting text to audio files.

Parameters:

  • api_key (str, default: None ) –

    API key, defaults to configured siliconflow_api_key

  • model_name (str, default: None ) –

    Model name, defaults to "fnlp/MOSS-TTSD-v0.5"

  • base_url (str, default: 'https://api.siliconflow.cn/v1/' ) –

    Base API URL, defaults to "https://api.siliconflow.cn/v1/"

  • return_trace (bool, default: False ) –

    Whether to return trace information, defaults to False

  • **kwargs

    Other model parameters
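
The speech request body is built from a few fixed fields plus optional ones that are only sent when set. Here is a minimal standalone sketch modelled on `_forward` in the listing below; the function name `build_tts_payload` and the values `'v1'` and `gain` are hypothetical, chosen only to show how optional and extra keys are merged.

```python
from typing import Any, Dict, List, Optional


def build_tts_payload(model: str, text: str, response_format: str = 'mp3',
                      sample_rate: int = 44100, speed: float = 1.0,
                      voice: Optional[str] = None,
                      references: Optional[List[Any]] = None,
                      **extra: Any) -> Dict[str, Any]:
    """Assemble the JSON body for a text-to-speech request."""
    payload: Dict[str, Any] = {
        'model': model,
        'input': text,
        'response_format': response_format,
        'sample_rate': sample_rate,
        'speed': speed,
    }
    # Optional fields are added only when they carry a truthy value.
    if voice:
        payload['voice'] = voice
    if references:
        payload['references'] = references
    payload.update(extra)  # remaining keyword arguments are passed through
    return payload


minimal = build_tts_payload('fnlp/MOSS-TTSD-v0.5', 'hello')
with_voice = build_tts_payload('fnlp/MOSS-TTSD-v0.5', 'hello',
                               voice='v1', gain=0.5)  # hypothetical values
```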

Source code in lazyllm/module/llms/onlinemodule/supplier/siliconflow.py
class SiliconFlowTTS(OnlineMultiModalBase):
    """SiliconFlow Text-to-Speech module, inherits from OnlineMultiModalBase.

Provides text-to-speech (TTS) functionality based on SiliconFlow, supports converting text to audio files.

Args:
    api_key (str, optional): API key, defaults to configured siliconflow_api_key
    model_name (str, optional): Model name, defaults to "fnlp/MOSS-TTSD-v0.5"
    base_url (str, optional): Base API URL, defaults to "https://api.siliconflow.cn/v1/"
    return_trace (bool, optional): Whether to return trace information, defaults to False
    **kwargs: Other model parameters
"""
    MODEL_NAME = 'fnlp/MOSS-TTSD-v0.5'

    def __init__(self, api_key: str = None, model_name: str = None,
                 base_url: str = 'https://api.siliconflow.cn/v1/',
                 return_trace: bool = False, **kwargs):
        OnlineMultiModalBase.__init__(self, model_series='SiliconFlow',
                                      model_name=model_name or SiliconFlowTTS.MODEL_NAME,
                                      return_trace=return_trace, **kwargs)
        self._endpoint = 'audio/speech'
        self._base_url = base_url
        self._api_key = api_key or lazyllm.config['siliconflow_api_key']

    def _make_binary_request(self, endpoint, payload, timeout=180):

        headers = {
            'Authorization': f'Bearer {self._api_key}',
            'Content-Type': 'application/json'
        }

        url = f'{self._base_url}{endpoint}'

        try:
            response = requests.post(url, headers=headers, json=payload, timeout=timeout)
            response.raise_for_status()
            return response.content
        except Exception as e:
            lazyllm.LOG.error(f'API request failed: {str(e)}')
            raise

    def _forward(self, input: str = None, response_format: str = 'mp3',
                 sample_rate: int = 44100, speed: float = 1.0,
                 voice: str = None, references=None, out_path: str = None, **kwargs):

        payload = {
            'model': self._model_name,
            'input': input,
            'response_format': response_format,
            'sample_rate': sample_rate,
            'speed': speed
        }

        if voice:
            payload['voice'] = voice
        if references:
            payload['references'] = references

        payload.update(kwargs)
        audio_content = self._make_binary_request(self._endpoint, payload, timeout=180)
        file_path = bytes_to_file([audio_content])[0]

        if out_path:
            with open(file_path, 'rb') as src, open(out_path, 'wb') as dst:
                dst.write(src.read())
            file_path = out_path

        result = encode_query_with_filepaths(None, [file_path])

        if self._return_trace:
            return {
                'response': result,
                'trace_info': {
                    'model': self._model_name,
                    'full_response': f'Audio generated successfully, length: {len(audio_content)} bytes'
                }
            }
        return result

lazyllm.module.llms.onlinemodule.supplier.siliconflow.SiliconFlowModule

Bases: OnlineChatModuleBase, FileHandlerBase

SiliconFlow module, inherits from OnlineChatModuleBase and FileHandlerBase.

Provides large language model chat capabilities via the SiliconFlow platform, supports multiple models (including vision-language models), and includes file handling functionality.

Parameters:

  • base_url (str, default: 'https://api.siliconflow.cn/v1/' ) –

    Base API URL, defaults to "https://api.siliconflow.cn/v1/"

  • model (str, default: 'Qwen/QwQ-32B' ) –

    Model name to use, defaults to "Qwen/QwQ-32B"

  • api_key (str, default: None ) –

    API key, defaults to lazyllm.config['siliconflow_api_key']

  • stream (bool, default: True ) –

    Whether to enable streaming output, defaults to True

  • return_trace (bool, default: False ) –

    Whether to return trace information, defaults to False

  • **kwargs

    Other model parameters
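
The chat endpoint is derived from `base_url` with `urljoin`, so a custom `base_url` should keep its trailing slash. The short sketch below assumes the `urljoin` in the listing is the standard-library `urllib.parse.urljoin` and shows how the result differs with and without that slash.

```python
from urllib.parse import urljoin

# With a trailing slash, the relative path is appended under /v1/.
with_slash = urljoin('https://api.siliconflow.cn/v1/', 'chat/completions')

# Without it, the last path segment ("v1") is replaced by the relative path.
without_slash = urljoin('https://api.siliconflow.cn/v1', 'chat/completions')

print(with_slash)     # https://api.siliconflow.cn/v1/chat/completions
print(without_slash)  # https://api.siliconflow.cn/chat/completions
```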

Source code in lazyllm/module/llms/onlinemodule/supplier/siliconflow.py
class SiliconFlowModule(OnlineChatModuleBase, FileHandlerBase):
    """SiliconFlow module, inherits from OnlineChatModuleBase and FileHandlerBase.
Provides large language model chat capabilities via the SiliconFlow platform, supports multiple models (including vision-language models), and includes file handling functionality.
Args:
    base_url (str, optional): Base API URL, defaults to "https://api.siliconflow.cn/v1/"
    model (str, optional): Model name to use, defaults to "Qwen/QwQ-32B"
    api_key (str, optional): API key, defaults to lazyllm.config['siliconflow_api_key']
    stream (bool, optional): Whether to enable streaming output, defaults to True
    return_trace (bool, optional): Whether to return trace information, defaults to False
    **kwargs: Other model parameters
"""
    VLM_MODEL_PREFIX = ['Qwen/Qwen2.5-VL-72B-Instruct', 'Qwen/Qwen3-VL-30B-A3B-Instruct', 'deepseek-ai/deepseek-vl2',
                        'Qwen/Qwen3-VL-30B-A3B-Thinking', 'THUDM/GLM-4.1V-9B-Thinking']

    def __init__(self, base_url: str = 'https://api.siliconflow.cn/v1/', model: str = 'Qwen/QwQ-32B',
                 api_key: str = None, stream: bool = True, return_trace: bool = False, **kwargs):
        OnlineChatModuleBase.__init__(self, model_series='SILICONFLOW',
                                      api_key=api_key or lazyllm.config['siliconflow_api_key'],
                                      base_url=base_url, model_name=model, stream=stream,
                                      return_trace=return_trace, **kwargs)
        FileHandlerBase.__init__(self)
        if stream:
            self._model_optional_params['stream'] = True

    def _get_system_prompt(self):
        return 'You are an intelligent assistant provided by SiliconFlow. You are a helpful assistant.'

    def _set_chat_url(self):
        self._url = urljoin(self._base_url, 'chat/completions')

    def _validate_api_key(self):
        """Validate API Key by sending a minimal request"""
        try:
            # Validate the API key with a lightweight GET request to the models endpoint
            models_url = urljoin(self._base_url, 'models')
            headers = {
                'Authorization': f'Bearer {self._api_key}',
                'Content-Type': 'application/json'
            }
            response = requests.get(models_url, headers=headers, timeout=10)
            return response.status_code == 200
        except Exception:
            return False

lazyllm.module.llms.onlinemodule.supplier.siliconflow.SiliconFlowReranking

Bases: OnlineEmbeddingModuleBase

SiliconFlow reranking module, inherits from OnlineEmbeddingModuleBase.

Provides text reranking functionality via the SiliconFlow platform, reordering a list of documents based on their relevance to a given query.

Parameters:

  • rerank_url (str, default: 'https://api.siliconflow.cn/v1/rerank' ) –

    Reranking API URL, defaults to "https://api.siliconflow.cn/v1/rerank"

  • rerank_model_name (str, default: 'BAAI/bge-reranker-v2-m3' ) –

    Name of the reranking model to use, defaults to "BAAI/bge-reranker-v2-m3"

  • api_key (str, default: None ) –

    API key, defaults to lazyllm.config['siliconflow_api_key']

  • **kw

    Additional reranking module parameters

Input format:

  Supports two input formats:

  • List: [query: str, documents: List[str]]
  • Dict: {'query': str, 'documents': List[str]}

Returns:

  • List[Dict] –

    A list of reranking results, each containing fields such as 'index', 'relevance_score', and 'document'.
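
The two accepted input formats can be reduced to one `(query, documents)` pair before the request body is built. Here is a minimal standalone sketch modelled on `_encapsulated_data` in the listing below; the function name `normalize_rerank_input` is an illustrative assumption.

```python
from typing import Any, List, Tuple


def normalize_rerank_input(data: Any) -> Tuple[str, List[str]]:
    """Accept [query, documents] or {'query': ..., 'documents': ...}."""
    if isinstance(data, str):
        # A bare string carries a query but no documents to rank.
        raise ValueError('Rerank requires both query and documents')
    if isinstance(data, list) and len(data) == 2:
        return data[0], data[1]
    if isinstance(data, dict):
        return data.get('query', ''), data.get('documents', [])
    raise ValueError("Input must be a list [query, documents] or "
                     "dict with 'query' and 'documents' keys")


from_list = normalize_rerank_input(['apple', ['fruit', 'car']])
from_dict = normalize_rerank_input({'query': 'apple',
                                    'documents': ['fruit', 'car']})
```

Both calls produce the same pair, so the rest of the request logic does not need to know which format the caller used.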

Source code in lazyllm/module/llms/onlinemodule/supplier/siliconflow.py
class SiliconFlowReranking(OnlineEmbeddingModuleBase):
    """SiliconFlow reranking module, inherits from OnlineEmbeddingModuleBase.
Provides text reranking functionality via the SiliconFlow platform, reordering a list of documents based on their relevance to a given query.
Args:
    rerank_url (str, optional): Reranking API URL, defaults to "https://api.siliconflow.cn/v1/rerank"
    rerank_model_name (str, optional): Name of the reranking model to use, defaults to "BAAI/bge-reranker-v2-m3"
    api_key (str, optional): API key, defaults to lazyllm.config['siliconflow_api_key']
    **kw: Additional reranking module parameters
Input format:
    Supports two input formats:
    - List: [query: str, documents: List[str]]
    - Dict: {'query': str, 'documents': List[str]}
Returns:
    List[Dict]: A list of reranking results, each containing fields such as 'index', 'relevance_score', and 'document'.
"""
    def __init__(self, rerank_url: str = 'https://api.siliconflow.cn/v1/rerank',
                 rerank_model_name: str = 'BAAI/bge-reranker-v2-m3', api_key: str = None, **kw):
        super().__init__('SILICONFLOW', rerank_url, api_key or lazyllm.config['siliconflow_api_key'],
                         rerank_model_name, **kw)
        self._rerank_model_name = rerank_model_name

    def _encapsulated_data(self, input: Union[List, str], **kwargs) -> Dict:
        if isinstance(input, str):
            raise ValueError('Rerank requires both query and documents')

        if isinstance(input, list) and len(input) == 2:
            query = input[0]
            documents = input[1]
        elif isinstance(input, dict):
            query = input.get('query', '')
            documents = input.get('documents', [])
        else:
            raise ValueError("Input must be a list [query, documents] or dict with 'query' and 'documents' keys")

        json_data = {
            'model': self._rerank_model_name,
            'query': query,
            'documents': documents
        }
        if len(kwargs) > 0:
            json_data.update(kwargs)

        return json_data

    def _parse_response(self, response: Dict, input: Union[List, str]) -> List[Dict]:
        return response.get('results', [])

lazyllm.module.llms.onlinemodule.supplier.siliconflow.SiliconFlowTextToImageModule

Bases: OnlineMultiModalBase

SiliconFlow Text-to-Image module, inherits from OnlineMultiModalBase.

Provides text-to-image generation functionality based on SiliconFlow, supports generating images from text descriptions.

Parameters:

  • api_key (str, default: None ) –

    API key, defaults to configured siliconflow_api_key

  • model_name (str, default: None ) –

    Model name, defaults to "Qwen/Qwen-Image"

  • base_url (str, default: 'https://api.siliconflow.cn/v1/' ) –

    Base API URL, defaults to "https://api.siliconflow.cn/v1/"

  • return_trace (bool, default: False ) –

    Whether to return trace information, defaults to False

  • **kwargs

    Other model parameters
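
The image request body is small: a model name, a prompt, and whatever extra keyword arguments the caller supplies. Here is a minimal standalone sketch modelled on `_forward` in the listing below; the function name `build_image_payload` and the key `extra_option` are hypothetical. As the listing is written, `size` is accepted as a named argument but is not copied into the payload, and the sketch mirrors that.

```python
from typing import Any, Dict


def build_image_payload(model: str, prompt: str, size: str = '1024x1024',
                        **extra: Any) -> Dict[str, Any]:
    """Assemble the JSON body for an image generation request."""
    # Mirrors the listing: `size` is received here but never added to the body.
    payload: Dict[str, Any] = {'model': model, 'prompt': prompt}
    payload.update(extra)  # only extra keyword arguments reach the request
    return payload


default_payload = build_image_payload('Qwen/Qwen-Image', 'a red bicycle')
sized_payload = build_image_payload('Qwen/Qwen-Image', 'a red bicycle',
                                    size='512x512')
extended_payload = build_image_payload('Qwen/Qwen-Image', 'a red bicycle',
                                       extra_option=1)  # hypothetical key
```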

Source code in lazyllm/module/llms/onlinemodule/supplier/siliconflow.py
class SiliconFlowTextToImageModule(OnlineMultiModalBase):
    """SiliconFlow Text-to-Image module, inherits from OnlineMultiModalBase.

Provides text-to-image generation functionality based on SiliconFlow, supports generating images from text descriptions.

Args:
    api_key (str, optional): API key, defaults to configured siliconflow_api_key
    model_name (str, optional): Model name, defaults to "Qwen/Qwen-Image"
    base_url (str, optional): Base API URL, defaults to "https://api.siliconflow.cn/v1/"
    return_trace (bool, optional): Whether to return trace information, defaults to False
    **kwargs: Other model parameters
"""
    MODEL_NAME = 'Qwen/Qwen-Image'

    def __init__(self, api_key: str = None, model_name: str = None,
                 base_url: str = 'https://api.siliconflow.cn/v1/',
                 return_trace: bool = False, **kwargs):
        OnlineMultiModalBase.__init__(self, model_series='SiliconFlow',
                                      model_name=model_name or SiliconFlowTextToImageModule.MODEL_NAME,
                                      return_trace=return_trace, **kwargs)
        self._endpoint = 'images/generations'
        self._base_url = base_url
        self._api_key = api_key or lazyllm.config['siliconflow_api_key']

    def _make_request(self, endpoint, payload, timeout=60):

        headers = {
            'Authorization': f'Bearer {self._api_key}',
            'Content-Type': 'application/json'
        }

        url = f'{self._base_url}{endpoint}'

        try:
            response = requests.post(url, headers=headers, json=payload, timeout=timeout)
            response.raise_for_status()
            return response.json()
        except Exception as e:
            lazyllm.LOG.error(f'API request failed: {str(e)}')
            raise

    def _forward(self, input: str = None, size: str = '1024x1024', **kwargs):
        payload = {
            'model': self._model_name,
            'prompt': input
        }
        payload.update(kwargs)

        result = self._make_request(self._endpoint, payload)

        image_urls = [item['url'] for item in result['data']]

        image_files = []
        for url in image_urls:
            img_response = requests.get(url, timeout=60)
            if img_response.status_code == 200:
                image_files.append(img_response.content)
            else:
                raise Exception(f'Failed to download image from {url}')

        file_paths = bytes_to_file(image_files)

        if self._return_trace:
            return {
                'response': encode_query_with_filepaths(None, file_paths),
                'trace_info': {
                    'model': self._model_name,
                    'full_response': result
                }
            }
        return encode_query_with_filepaths(None, file_paths)