Erlang Hot Code Loading: Implementation and Principles

vlambda
2021-06-26

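Hot code loading in Erlang works at the module level: code is replaced one module at a time.

What is hot code loading? It means replacing running code without stopping the system.

How do you perform a hot update?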

c(Mod) -> compile:file(Mod), code:purge(Mod), code:load_file(Mod).

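The above is the core of what the shell command c(Mod) does, in three steps: compile the new code, purge the old code, and load the new code.

Similarly, the core of l(Mod) is shown below; it simply skips the compile step: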

l(Mod) -> code:purge(Mod), code:load_file(Mod).
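
The purge-then-load sequence can also be wrapped in a small function that reports what happened. This is only a minimal sketch: the module name reload_sketch and the function reload/1 are hypothetical and not part of OTP; only code:purge/1 and code:load_file/1 are real library calls.

```erlang
-module(reload_sketch).
-export([reload/1]).

%% Hypothetical helper (not part of OTP): mirrors the purge-then-load
%% sequence of l(Mod) and reports the outcome instead of ignoring it.
reload(Mod) when is_atom(Mod) ->
    %% code:purge/1 removes old code; it returns true only if it had
    %% to kill processes that were still running that old code.
    Killed = code:purge(Mod),
    case code:load_file(Mod) of
        {module, Mod} -> {ok, Killed};
        {error, Reason} -> {error, Reason}
    end.
```

Calling reload_sketch:reload(t) from the shell then returns {ok, Killed} on success, where Killed tells you whether the purge had to kill any process.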

How hot code loading works

The Erlang documentation explains:

The code of a module can exist in two variants in a system: current and old. When a module is loaded into the system for the first time, the code becomes 'current'. If then a new instance of the module is loaded, the code of the previous instance becomes 'old' and the new instance becomes 'current'.

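In other words, Erlang can keep two versions of every module's code: the current version ('current') and the old version ('old'). When a module is loaded for the first time, its code is 'current'. When new code is loaded, the 'current' code becomes 'old' and the new code becomes 'current'.

As a result, a process that is executing code in the module while it is being hot-updated is not affected. After the update, that process keeps running the same code as before; the code is merely marked 'old'. Any process that calls into the module afterwards only sees the 'current' code. Once no process is running the 'old' code any more, it is removed by the purge step of the next hot update.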
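
To see the two versions from inside a running node, the BIFs erlang:module_loaded/1, erlang:check_old_code/1 and erlang:check_process_code/2 can be combined. The sketch below assumes a release with map support; the module name code_state and both function names are hypothetical, chosen only for illustration.

```erlang
-module(code_state).
-export([status/1, lingering/1]).

%% Hypothetical helper: reports whether Mod currently has 'current'
%% and/or 'old' code loaded in this node.
status(Mod) when is_atom(Mod) ->
    #{current => erlang:module_loaded(Mod),
      old => erlang:check_old_code(Mod)}.

%% Hypothetical helper: lists the processes that still depend on the
%% 'old' code of Mod, i.e. the ones a purge would kill.
lingering(Mod) when is_atom(Mod) ->
    [P || P <- erlang:processes(), erlang:check_process_code(P, Mod)].
```

Running code_state:lingering(t) before a second l(t) shows which processes are at risk.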

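By letting two versions coexist, Erlang guarantees that one version is always available, so the service never has to stop.

A problem with hot code loading

Here is a question: if some process keeps running the 'old' version, what happens when the code is hot-updated again in the meantime?

On the next hot update, if the module still has 'old' code, code:purge/1 kills every process that is still running that 'old' code and then removes it. The 'current' version then becomes 'old', and the newly loaded code becomes 'current'.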
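
If killing lingering processes is not acceptable, one alternative is code:soft_purge/1, which removes old code only when no process is still running it. This is a different technique from what c(Mod) and l(Mod) do, shown here only as a sketch; the module name safe_reload and the error atom old_code_in_use are hypothetical.

```erlang
-module(safe_reload).
-export([reload/1]).

%% Hypothetical helper: reloads Mod only if the old code can be
%% removed without killing any process.
reload(Mod) when is_atom(Mod) ->
    case code:soft_purge(Mod) of
        true ->
            %% No process lingers in old code, so loading is safe.
            code:load_file(Mod);
        false ->
            %% Some process still runs old code; leave everything as is.
            {error, old_code_in_use}
    end.
```

The trade-off is that the update is refused, rather than forced, until the lingering processes have moved on to the current code or exited.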

Reproducing the problem

-module(t).
-compile(export_all).

start() ->
    Pid = spawn(fun() -> do_loop() end),
    register(t, Pid).

do_loop() ->
    receive
        Msg -> io:format("~p~n", [Msg])
    end,
    do_loop().

The result:

7> t:start().
true
8> erlang:monitor(process, whereis(t)).   %% monitor the process
#Ref<0.0.0.56>
9> whereis(t).
<0.40.0>
10> l(t).   %% first hot update
{module,t}
11> whereis(t).
<0.40.0>
12> l(t).   %% second hot update
{module,t}
13> whereis(t).
undefined
14> flush().
Shell got {'DOWN',#Ref<0.0.0.56>,process,<0.40.0>,killed}
ok

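After two hot updates, the process is killed. (To see where the kill happens, look at do_purge/3 in code_server; see [1].)

Solving the problem

If a process never leaves its own loop, it keeps running the 'old' code. The consequences are that the new code is never used, and that the process is killed by the system at the next hot update.

How do we solve this? Once again, the Erlang documentation has the answer: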

To change from old code to current code, a process must make a fully qualified function call. Example:

-module(m).
-export([loop/0]).

loop() ->
    receive
        code_switch ->
            m:loop();
        Msg ->
            do_something(),
            loop()
    end.

That is, after the hot update, send the process the message code_switch, which makes it call m:loop().
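
Putting the pieces together, a complete module following this pattern might look like the sketch below. The module name switch_demo and the functions start/0 and upgrade/0 are hypothetical; the process is started with spawn/3 rather than a fun so that it holds no reference to the old code apart from the loop itself, and upgrade/0 assumes that a freshly compiled switch_demo.beam is already on the code path.

```erlang
-module(switch_demo).
-export([start/0, loop/0, upgrade/0]).

%% Start the looping process and register it under the module name.
start() ->
    Pid = spawn(?MODULE, loop, []),
    register(switch_demo, Pid),
    Pid.

loop() ->
    receive
        code_switch ->
            %% Fully qualified call: continues in the current version.
            ?MODULE:loop();
        Msg ->
            io:format("got ~p~n", [Msg]),
            %% Local call: stays in the version already running.
            loop()
    end.

%% Hypothetical upgrade helper: load the new code, then tell the
%% running process to switch to it before the next purge happens.
upgrade() ->
    code:purge(?MODULE),
    {module, ?MODULE} = code:load_file(?MODULE),
    switch_demo ! code_switch,
    ok.
```

With this arrangement, calling switch_demo:upgrade() repeatedly should not kill the registered process, because each upgrade moves it onto the current code before the following purge.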

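So what is the difference between loop() and m:loop()?

Erlang distinguishes two kinds of function call: local calls and external calls. A local call invokes a function in the same module; the function does not have to be exported, and the call has the form Function(Args). An external (fully qualified) call has the form Module:Function(Args), and the function must be exported. It is typically used to call into another module, but it is equally valid for calling the module's own exported functions.

In the Erlang VM, an external call always resolves to the current version of the target module at the moment of the call. If a process only ever makes local calls, everything it does stays within the version of the code it is already running. In other words, the process never picks up the new code. An external call is what breaks this pattern.

In the figure from the original post, first ignore the part drawn in green: the process only ever makes local calls. Once the green part is added, the process switches to the module's current code before it continues running.

Some readers may then wonder: if that is the case, why does Erlang have local calls at all, instead of making every call external?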

References: http://blog.csdn.net/mycwq/article/details/41175237

http://learnyousomeerlang.com/designing-a-concurrent-application#hot-code-loving

http://www.erlang.org/doc/reference_manual/code_loading.html#id86381

————————————————

Original article: https://blog.csdn.net/mycwq/article/details/41175237
