[lua source code]global_State

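Source reading order

Lua's source code is short and dense. The usual advice is to read it from the outer layers inward. For example, the following order was recommended in a post on reddit: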

Recommended reading order

If you're done before X-Mas and understood all of it, you're good. The information density of the code is rather high.

  • lmathlib.c,lstrlib.c: get familiar with the external C API. Don't bother with the pattern matcher though. Just the easy functions.
  • lapi.c: Check how the API is implemented internally. Only skim this to get a feeling for the code. Cross-reference to lua.h and luaconf.h as needed.
  • lobject.h: tagged values and object representation. skim through this first. you'll want to keep a window with this file open all the time.
  • lstate.h: state objects. ditto.
  • lopcodes.h: bytecode instruction format and opcode definitions. easy.
  • lvm.c: scroll down to luaV_execute, the main interpreter loop. see how all of the instructions are implemented. skip the details for now. reread later.
  • ldo.c: calls, stacks, exceptions, coroutines. tough read.
  • lstring.c: string interning. cute, huh?
  • ltable.c: hash tables and arrays. tricky code.
  • ltm.c: metamethod handling, reread all of lvm.c now.
  • You may want to reread lapi.c now.
  • ldebug.c: surprise waiting for you. abstract interpretation is used to find object names for tracebacks. does bytecode verification, too.
  • lparser.c,lcode.c: recursive descent parser, targetting a register-based VM. start from chunk() and work your way through. read the expression parser and the code generator parts last.
  • lgc.c: incremental garbage collector. take your time.
  • Read all the other files as you see references to them. Don't let your stack get too deep though.

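I did not follow this order. From using Lua and writing extension libraries for it, I was already fairly familiar with the outer API, and in my view the C API is something you look up in the documentation as needed. So I did not need to go through lmathlib.c and lapi.c piece by piece, and I went straight to lobject.h and lstate.h.

In fact, I think starting from lobject.h and lstate.h makes a lot of sense, because program = algorithms + data structures. So we begin by sorting out the data structures, which is also why in the previous posts I deliberately skipped the details of individual modules. Once the overall outline of the data structures is roughly clear, expanding the functions of each part becomes mostly legwork. Of course, a language does not start out in its current shape. Reading it this way makes it look as if these structures were designed "naturally" and "thought through from the very beginning", so we sometimes strain to find a reason why a field exists or why it sits where it does, when in practice it probably ended up that way only after N versions and patches.

In this post we start looking at the global_State object. As before, I will only trace the outline and not go into the details.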

global_State

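As mentioned in the previous post, every lua_State holds a reference to a global_State (the l_G field). This is where the global state of the Lua state machine is stored, and all lua_State objects (threads) of the same state share it. Because Lua is designed to be single-threaded, state management on global_State is much simpler: it does not have to deal with multithreading at all.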
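To make the sharing concrete, here is a minimal sketch. The two structs are simplified stand-ins, not the real definitions in lstate.h; only the l_G field name and the shape of the G(L) macro are taken from the Lua source. The functions attach_thread, same_global and demo_shared_state, and the nthreads counter, are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins, NOT the real Lua definitions. */
typedef struct global_State {
  int nthreads;              /* hypothetical counter, for illustration only */
} global_State;

typedef struct lua_State {
  global_State *l_G;         /* same field name as in lstate.h */
} lua_State;

/* Same shape as the G(L) macro in lstate.h. */
#define G(L) ((L)->l_G)

/* Hypothetical helper: a new thread L1 reuses the global state of L. */
void attach_thread(lua_State *L, lua_State *L1) {
  L1->l_G = G(L);
  G(L)->nthreads++;
}

/* Returns 1 if both threads point at the same global state. */
int same_global(const lua_State *a, const lua_State *b) {
  return a->l_G == b->l_G;
}

/* Builds one global state, a main thread and one coroutine. */
int demo_shared_state(void) {
  global_State g = { 1 };            /* the main thread counts as one */
  lua_State main_thread = { &g };
  lua_State co = { NULL };
  attach_thread(&main_thread, &co);
  assert(G(&co) == &g);              /* the coroutine sees the same state */
  return same_global(&main_thread, &co) && G(&co)->nthreads == 2;
}
```

Calling demo_shared_state() returns 1: a change made through either thread is visible through the other, because both only hold a pointer to the one global_State.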

/*
** 'global state', shared by all threads of this state
*/
typedef struct global_State {
  lua_Alloc frealloc;  /* function to reallocate memory */
  void *ud;         /* auxiliary data to 'frealloc' */
  lu_mem totalbytes;  /* number of bytes currently allocated - GCdebt */
  l_mem GCdebt;  /* bytes allocated not yet compensated by the collector */
  lu_mem GCmemtrav;  /* memory traversed by the GC */
  lu_mem GCestimate;  /* an estimate of the non-garbage memory in use */
  stringtable strt;  /* hash table for strings */
  TValue l_registry;
  unsigned int seed;  /* randomized seed for hashes */
  lu_byte currentwhite;
  lu_byte gcstate;  /* state of garbage collector */
  lu_byte gckind;  /* kind of GC running */
  lu_byte gcrunning;  /* true if GC is running */
  GCObject *allgc;  /* list of all collectable objects */
  GCObject **sweepgc;  /* current position of sweep in list */
  GCObject *finobj;  /* list of collectable objects with finalizers */
  GCObject *gray;  /* list of gray objects */
  GCObject *grayagain;  /* list of objects to be traversed atomically */
  GCObject *weak;  /* list of tables with weak values */
  GCObject *ephemeron;  /* list of ephemeron tables (weak keys) */
  GCObject *allweak;  /* list of all-weak tables */
  GCObject *tobefnz;  /* list of userdata to be GC */
  GCObject *fixedgc;  /* list of objects not to be collected */
  struct lua_State *twups;  /* list of threads with open upvalues */
  Mbuffer buff;  /* temporary buffer for string concatenation */
  unsigned int gcfinnum;  /* number of finalizers to call in each GC step */
  int gcpause;  /* size of pause between successive GCs */
  int gcstepmul;  /* GC 'granularity' */
  lua_CFunction panic;  /* to be called in unprotected errors */
  struct lua_State *mainthread;
  const lua_Number *version;  /* pointer to version number */
  TString *memerrmsg;  /* memory-error message */
  TString *tmname[TM_N];  /* array with tag-method names */
  struct Table *mt[LUA_NUMTAGS];  /* metatables for basic types */
} global_State;

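Even with all those comments, this many fields is still hard to take in. What would count as intuitive? I find code intuitive when its hierarchy is clearly visible. The hierarchy usually tells you which submodules an object is made of, and that matches how people naturally think. So let's try regrouping the fields: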

/*
** 'global state', shared by all threads of this state
*/
typedef struct global_State {
  // Version
  const lua_Number *version;      /* pointer to version number */

  // Hash
  unsigned int seed;              /* randomized seed for hashes */

  // Registry
  TValue l_registry;

  // String table
  stringtable strt;               /* hash table for strings */
  Mbuffer buff;                   /* temporary buffer for string concatenation */

  // Meta table
  TString *tmname[TM_N];          /* array with tag-method names */
  struct Table *mt[LUA_NUMTAGS];  /* metatables for basic types */

  // Thread list
  struct lua_State *mainthread;
  struct lua_State *twups;        /* list of threads with open upvalues */
  
  // Error Recover
  lua_CFunction panic;            /* to be called in unprotected errors */

  // Memory Allocator
  lua_Alloc frealloc;             /* function to reallocate memory */
  void *ud;                       /* auxiliary data to 'frealloc' */
  
  // GC
  lu_mem totalbytes;              /* number of bytes currently allocated - GCdebt */
  TString *memerrmsg;             /* memory-error message */
  
  l_mem  GCdebt;                  /* bytes allocated not yet compensated by the collector */
  lu_mem GCmemtrav;               /* memory traversed by the GC */
  lu_mem GCestimate;              /* an estimate of the non-garbage memory in use */

  lu_byte currentwhite;
  lu_byte gcstate;                /* state of garbage collector */
  lu_byte gckind;                 /* kind of GC running */
  lu_byte gcrunning;              /* true if GC is running */

  int gcpause;                    /* size of pause between successive GCs */
  int gcstepmul;                  /* GC 'granularity' */
  unsigned int gcfinnum;          /* number of finalizers to call in each GC step */
  
  GCObject *allgc;                /* list of all collectable objects */
  GCObject *finobj;               /* list of collectable objects with finalizers */
  GCObject *gray;                 /* list of gray objects */
  GCObject *grayagain;            /* list of objects to be traversed atomically */
  GCObject *weak;                 /* list of tables with weak values */
  GCObject *ephemeron;            /* list of ephemeron tables (weak keys) */
  GCObject *allweak;              /* list of all-weak tables */
  GCObject *tobefnz;              /* list of userdata to be GC */
  GCObject *fixedgc;              /* list of objects not to be collected */

  GCObject **sweepgc;             /* current position of sweep in list */

} global_State;

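After regrouping, it is clear that the large block at the end is all related to garbage collection. If a language did not have to deal with garbage collection, the code would be much cleaner. From this angle, it might be worth putting together a version of the Lua source with garbage collection stripped out, just for reading :).

Now we can go one step further and split out the garbage collection part: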

typedef struct lua_GCInfo{
  lu_mem totalbytes;              /* number of bytes currently allocated - GCdebt */
  TString *memerrmsg;             /* memory-error message */
  
  l_mem  GCdebt;                  /* bytes allocated not yet compensated by the collector */
  lu_mem GCmemtrav;               /* memory traversed by the GC */
  lu_mem GCestimate;              /* an estimate of the non-garbage memory in use */

  lu_byte currentwhite;
  lu_byte gcstate;                /* state of garbage collector */
  lu_byte gckind;                 /* kind of GC running */
  lu_byte gcrunning;              /* true if GC is running */

  int gcpause;                    /* size of pause between successive GCs */
  int gcstepmul;                  /* GC 'granularity' */
  unsigned int gcfinnum;          /* number of finalizers to call in each GC step */
  
  GCObject *allgc;                /* list of all collectable objects */
  GCObject *finobj;               /* list of collectable objects with finalizers */
  GCObject *gray;                 /* list of gray objects */
  GCObject *grayagain;            /* list of objects to be traversed atomically */
  GCObject *weak;                 /* list of tables with weak values */
  GCObject *ephemeron;            /* list of ephemeron tables (weak keys) */
  GCObject *allweak;              /* list of all-weak tables */
  GCObject *tobefnz;              /* list of userdata to be GC */
  GCObject *fixedgc;              /* list of objects not to be collected */

  GCObject **sweepgc;             /* current position of sweep in list */
}GCInfo;

With that, global_State becomes much clearer:


typedef struct global_State {
  // Version
  const lua_Number *version;      /* pointer to version number */

  // Hash
  unsigned int seed;              /* randomized seed for hashes */

  // Global Registry
  TValue l_registry;

  // Global String table
  stringtable strt;               /* hash table for strings */
  Mbuffer buff;                   /* temporary buffer for string concatenation */

  // Global Meta table
  TString *tmname[TM_N];          /* array with tag-method names */
  struct Table *mt[LUA_NUMTAGS];  /* metatables for basic types */

  // Global Thread list
  struct lua_State *mainthread;
  struct lua_State *twups;        /* list of threads with open upvalues */
  
  // Memory Allocator
  lua_Alloc frealloc;             /* function to reallocate memory */
  void *ud;                       /* auxiliary data to 'frealloc' */

  // GC Info
  GCInfo *gcinfo;

  // Error Recover
  lua_CFunction panic;            /* to be called in unprotected errors */
} global_State;

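  • registry:
    • The registry holds global data.
  • string
    • strt (stringtable): the global string table. Almost every language pools (interns) strings and makes them immutable. Lua distinguishes short strings from long strings.
    • buff: a temporary buffer used while Lua parses source code and during string handling.
  • meta table: in any language, most of the "magic" comes down to table lookups. In object-oriented languages, for example, dynamic dispatch typically rests on virtual function tables.
    • tmname: array of the predefined metamethod names; tm is short for tag method.
    • mt: one metatable per basic type. Note that tables and userdata instead carry one metatable per instance. Metatables plus tag methods are arguably the most important hook mechanism in all of Lua.
  • thread: global_State naturally has to keep references to threads (coroutines).
    • mainthread: the main thread.
    • twups: list of threads that have open upvalues.
  • memory
    • frealloc: Lua's global memory allocator; users can replace it with their own.
    • ud: the user data passed to the allocator.
  • gc
    • Garbage collection needs a great deal of information. Set it aside for now; the whole garbage collection system deserves its own analysis.
  • error handle
    • panic: the global handler called on unprotected errors.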
The stringtable type is defined as follows:

typedef struct stringtable {
  TString **hash;
  int nuse;  /* number of elements */
  int size;
} stringtable;
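The three fields describe a hash table with chained buckets: hash is the bucket array, size is the number of buckets, and nuse is the number of strings stored. Below is a rough sketch of how interning could work on such a table. It is a simplified model, not Lua's implementation: TString is reduced to a few fields, toy_hash is a made-up hash function, and there is no resizing or garbage collection. The names strt_init, intern, strt_free and demo_intern are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for TString: only what the sketch needs. */
typedef struct TString {
  struct TString *hnext;   /* next string in the same bucket */
  unsigned int hash;
  size_t len;
  char data[];             /* string bytes, NUL-terminated */
} TString;

typedef struct stringtable {
  TString **hash;          /* bucket array */
  int nuse;                /* number of elements */
  int size;                /* number of buckets (power of two here) */
} stringtable;

/* Toy hash, for illustration only. */
unsigned int toy_hash(const char *s, size_t len, unsigned int seed) {
  unsigned int h = seed ^ (unsigned int)len;
  for (size_t i = 0; i < len; i++)
    h ^= (h << 5) + (h >> 2) + (unsigned char)s[i];
  return h;
}

/* 'size' must be a power of two so that masking picks a bucket. */
int strt_init(stringtable *tb, int size) {
  tb->hash = calloc((size_t)size, sizeof(TString *));
  tb->nuse = 0;
  tb->size = tb->hash ? size : 0;
  return tb->hash != NULL;
}

/* Returns the existing copy if the string is already in the table. */
TString *intern(stringtable *tb, const char *s, size_t len, unsigned int seed) {
  unsigned int h = toy_hash(s, len, seed);
  TString **bucket = &tb->hash[h & (unsigned int)(tb->size - 1)];
  for (TString *ts = *bucket; ts != NULL; ts = ts->hnext)
    if (ts->len == len && memcmp(ts->data, s, len) == 0)
      return ts;                       /* found: reuse it */
  TString *ts = malloc(sizeof(TString) + len + 1);
  if (ts == NULL) return NULL;
  memcpy(ts->data, s, len);
  ts->data[len] = '\0';
  ts->hash = h;
  ts->len = len;
  ts->hnext = *bucket;                 /* push at the head of the chain */
  *bucket = ts;
  tb->nuse++;
  return ts;
}

void strt_free(stringtable *tb) {
  for (int i = 0; i < tb->size; i++)
    for (TString *ts = tb->hash[i]; ts != NULL; ) {
      TString *next = ts->hnext;
      free(ts);
      ts = next;
    }
  free(tb->hash);
}

int demo_intern(void) {
  stringtable tb;
  if (!strt_init(&tb, 8)) return -1;
  TString *a = intern(&tb, "print", 5, 42u);
  TString *b = intern(&tb, "print", 5, 42u);   /* same content as a */
  TString *c = intern(&tb, "pairs", 5, 42u);
  int ok = a && b && c && a == b && a != c && tb.nuse == 2;
  strt_free(&tb);
  return ok;
}
```

Because equal strings end up as the same object, comparing two interned strings reduces to comparing pointers, which is the main payoff of keeping such a table in the global state.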

To be continued

That gives us a first outline of global_State. In later posts we will go through the submodules of global_State one by one.
