Splitting a Large Class and Multiple Inheritance in Python

本贴最后更新于 1542 天前,其中的信息可能已经物是人非

When I started refactoring EFB Telegram Master Channel (ETM) for 2.0 updates, I was investigating ways to organize code into different files in a decent manner. In this article I’d like to talk about the strategy I used, comparing to another codebase I was reading back then, itchat.

In ETM version 1, most of the code is written the heavy and ugly 1675-line-long __init__.py. As more features planned to be added to ETM, it was really hard for me to navigate through the code, which have brought up my need of refactoring this huge thing.

Back then (which, surprisingly, was over 2 years ago), the main reference I had on a large enough project was itchat. Their code structure hasn’t been changing much since then. itchat did have a reasonably large code repository, but the way it splits its functions is rather unideal.

The way itchat did to have all functioned defined at root level of each file, and have a loader function that “loads” these methods to an object called core which contains some configuration data. To the Python interpreter, this method indeed works, thanks to its dynamic typing. But this looks really bad when you were trying to work with the code, as IDE usually can’t give any hint with objects defined in this way. That also happens when you try to work on the library itself, despite every function starts with a self in their arguments.

Then I went on looking for other common practices on breaking down a large class, some suggested importing functions inside a function, other using multiple inheritance. [Ref.] The former is not much different from what itchat was doing, and the latter looked promising at the beginning. I went on to do some experiment with multiple inheritance, and found that it does provide better autocomplete with IDE, but only in the main class. I can’t see one subclass from another one in the IDE. That is still reasonable as all those subclasses only comes together in the main class, they are not aware of each other.

from .components import load_components

class Core:
def method_1(self, param_1, param_2, param_3):
"""Doc string goes here."""
raise NotImplementedError()

def method_2(self):
    """Doc string goes here."""
    raise NotImplementedError()

def method_3(self, param_1):
    """Doc string goes here."""
    raise NotImplementedError()

load_components(Core)

from .component_1 import load_component_1
from .component_2 import load_component_2

def load_components(core):
load_component_1(core)
load_component_2(core)

def load_contact(core):
    core.method_1 = method_1
    core.method_2 = method_2

def method_1(self, param_1, param_2, param_3):
# Actual implementation
...

def method_2(self):
# Actual implementation
...

def load_contact(core):
    core.method_3 = method_3

def method_3(self, param_1):
# Actual implementation
...

I thought to myself, why can’t I just make some more classes and let them reference each other? Turns out that worked pretty well for me. I split my functions into several different “manager” classes, each of which is initialized with a reference to the main class. These classes are instantiated in topological order such that classes being referred to by others are created earlier. In ETM, the classes that are being referred to are usually those data providers utilities, namely ExperimentalFlagsManager, DatabaseManager, and TelegramBotManager.

from .flags import ExperimentalFlagsManager
from .db import DatabaseManager
from .chat_binding import ChatBindingManager

class TelegramChannel():
def init(self):
self.flags: ExperimentalFlagsManager = ExperimentalFlagsManager(self)
self.db: DatabaseManager = DatabaseManager(self)
self.chat_binding: ChatBindingManager = ChatBindingManager(self)

from typing import TYPE_CHECKING

if TYPE_CHECKING:
# Avoid cycle import for type checking
from . import TelegramChannel

class ExperimentalFlagsManager:
def init(channel: 'TelegramChannel'):
self.channel = channel
...

from typing import TYPE_CHECKING
from .flags import ExperimentalFlagsManager

if TYPE_CHECKING:
# Avoid cycle import for type checking
from . import TelegramChannel

class DatabaseManager:
def init(channel: 'TelegramChannel'):
self.channel: 'TelegramChannel' = channel
self.flags: ExperimentalFlagsManager = channel.flags
...

from typing import TYPE_CHECKING
from .chat_binding import ChatBindingManager
from .db import DatabaseManager

if TYPE_CHECKING:
# Avoid cycle import for type checking
from . import TelegramChannel

class ChatBindingManager:
def init(channel: 'TelegramChannel'):
self.channel: 'TelegramChannel' = channel
self.flags: ExperimentalFlagsManager = channel.flags
self.db: DatabaseManager = channel.db
...

While going on refactoring ETM, I learnt that multiple inheritance in Python is also used in another way – mixins. Mixins are classes that are useful when you want to add a set of features to many other classes. This has enlightened me when I was trying to deal with constantly adding references of the gettext translator in all manager classes.

I added a mixin called LocaleMixin that extracts the translator functions (gettext and ngettext) from the main class reference (assuming they are guaranteed to be there), and assign a local property that reflects these methods.

class LocaleMixin:
    channel: 'TelegramChannel'

    @property
    def _(self):
        return self.channel.gettext

    @property
    def ngettext(self):
        return self.channel.ngettext

When the mixin classes is added to the list of inherited classes, the IDE can properly recognise these helper properties, and their definitions are consolidated in the same place. I find it more organised that the previous style.


In the end, I find that simply creating classes for each component of my code turns out to be the most organised, and IDE-friendly way to breakdown a large class, and mixins are helpful to make references or helper functions available to multiple classes.

  • Python

    Python 是一种面向对象、直译式电脑编程语言,具有近二十年的发展历史,成熟且稳定。它包含了一组完善而且容易理解的标准库,能够轻松完成很多常见的任务。它的语法简捷和清晰,尽量使用无异义的英语单词,与其它大多数程序设计语言使用大括号不一样,它使用缩进来定义语句块。

    536 引用 • 672 回帖

相关帖子

欢迎来到这里!

我们正在构建一个小众社区,大家在这里相互信任,以平等 • 自由 • 奔放的价值观进行分享交流。最终,希望大家能够找到与自己志同道合的伙伴,共同成长。

注册 关于
请输入回帖内容 ...

推荐标签 标签

  • 微信

    腾讯公司 2011 年 1 月 21 日推出的一款手机通讯软件。用户可以通过摇一摇、搜索号码、扫描二维码等添加好友和关注公众平台,同时可以将自己看到的精彩内容分享到微信朋友圈。

    130 引用 • 793 回帖
  • IPFS

    IPFS(InterPlanetary File System,星际文件系统)是永久的、去中心化保存和共享文件的方法,这是一种内容可寻址、版本化、点对点超媒体的分布式协议。请浏览 IPFS 入门笔记了解更多细节。

    20 引用 • 245 回帖 • 234 关注
  • abitmean

    有点意思就行了

    29 关注
  • WebClipper

    Web Clipper 是一款浏览器剪藏扩展,它可以帮助你把网页内容剪藏到本地。

    3 引用 • 9 回帖 • 2 关注
  • JVM

    JVM(Java Virtual Machine)Java 虚拟机是一个微型操作系统,有自己的硬件构架体系,还有相应的指令系统。能够识别 Java 独特的 .class 文件(字节码),能够将这些文件中的信息读取出来,使得 Java 程序只需要生成 Java 虚拟机上的字节码后就能在不同操作系统平台上进行运行。

    180 引用 • 120 回帖 • 1 关注
  • React

    React 是 Facebook 开源的一个用于构建 UI 的 JavaScript 库。

    192 引用 • 291 回帖 • 433 关注
  • 架构

    我们平时所说的“架构”主要是指软件架构,这是有关软件整体结构与组件的抽象描述,用于指导软件系统各个方面的设计。另外还有“业务架构”、“网络架构”、“硬件架构”等细分领域。

    140 引用 • 441 回帖 • 1 关注
  • Chrome

    Chrome 又称 Google 浏览器,是一个由谷歌公司开发的网页浏览器。该浏览器是基于其他开源软件所编写,包括 WebKit,目标是提升稳定性、速度和安全性,并创造出简单且有效率的使用者界面。

    60 引用 • 287 回帖
  • 反馈

    Communication channel for makers and users.

    124 引用 • 907 回帖 • 210 关注
  • 电影

    这是一个不能说的秘密。

    120 引用 • 597 回帖
  • danl
    89 关注
  • Laravel

    Laravel 是一套简洁、优雅的 PHP Web 开发框架。它采用 MVC 设计,是一款崇尚开发效率的全栈框架。

    19 引用 • 23 回帖 • 699 关注
  • Maven

    Maven 是基于项目对象模型(POM)、通过一小段描述信息来管理项目的构建、报告和文档的软件项目管理工具。

    186 引用 • 318 回帖 • 336 关注
  • BND

    BND(Baidu Netdisk Downloader)是一款图形界面的百度网盘不限速下载器,支持 Windows、Linux 和 Mac,详细介绍请看这里

    107 引用 • 1281 回帖 • 31 关注
  • Angular

    AngularAngularJS 的新版本。

    26 引用 • 66 回帖 • 531 关注
  • 知乎

    知乎是网络问答社区,连接各行各业的用户。用户分享着彼此的知识、经验和见解,为中文互联网源源不断地提供多种多样的信息。

    10 引用 • 66 回帖
  • 学习

    “梦想从学习开始,事业从实践起步” —— 习近平

    163 引用 • 473 回帖
  • GitLab

    GitLab 是利用 Ruby 一个开源的版本管理系统,实现一个自托管的 Git 项目仓库,可通过 Web 界面操作公开或私有项目。

    46 引用 • 72 回帖
  • 30Seconds

    📙 前端知识精选集,包含 HTML、CSS、JavaScript、React、Node、安全等方面,每天仅需 30 秒。

    • 精选常见面试题,帮助您准备下一次面试
    • 精选常见交互,帮助您拥有简洁酷炫的站点
    • 精选有用的 React 片段,帮助你获取最佳实践
    • 精选常见代码集,帮助您提高打码效率
    • 整理前端界的最新资讯,邀您一同探索新世界
    488 引用 • 383 回帖 • 4 关注
  • WordPress

    WordPress 是一个使用 PHP 语言开发的博客平台,用户可以在支持 PHP 和 MySQL 数据库的服务器上架设自己的博客。也可以把 WordPress 当作一个内容管理系统(CMS)来使用。WordPress 是一个免费的开源项目,在 GNU 通用公共许可证(GPLv2)下授权发布。

    45 引用 • 113 回帖 • 284 关注
  • 前端

    前端技术一般分为前端设计和前端开发,前端设计可以理解为网站的视觉设计,前端开发则是网站的前台代码实现,包括 HTML、CSS 以及 JavaScript 等。

    247 引用 • 1347 回帖
  • Flume

    Flume 是一套分布式的、可靠的,可用于有效地收集、聚合和搬运大量日志数据的服务架构。

    9 引用 • 6 回帖 • 608 关注
  • PHP

    PHP(Hypertext Preprocessor)是一种开源脚本语言。语法吸收了 C 语言、 Java 和 Perl 的特点,主要适用于 Web 开发领域,据说是世界上最好的编程语言。

    165 引用 • 407 回帖 • 514 关注
  • Solo

    Solo 是一款小而美的开源博客系统,专为程序员设计。Solo 有着非常活跃的社区,可将文章作为帖子推送到社区,来自社区的回帖将作为博客评论进行联动(具体细节请浏览 B3log 构思 - 分布式社区网络)。

    这是一种全新的网络社区体验,让热爱记录和分享的你不再感到孤单!

    1425 引用 • 10043 回帖 • 474 关注
  • Scala

    Scala 是一门多范式的编程语言,集成面向对象编程和函数式编程的各种特性。

    13 引用 • 11 回帖 • 111 关注
  • 一些有用的避坑指南。

    69 引用 • 93 回帖
  • RESTful

    一种软件架构设计风格而不是标准,提供了一组设计原则和约束条件,主要用于客户端和服务器交互类的软件。基于这个风格设计的软件可以更简洁,更有层次,更易于实现缓存等机制。

    30 引用 • 114 回帖