Apollo Startup Process, Part 2: Loading Functional Modules

Startup process of a functional module (using Planning as an example)

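Apollo 3.5 introduced the Cyber RT middleware as its underlying communication and scheduling platform. Cyber RT builds and loads each functional module as a component. Perception, Localization, Planning, Control, and the other functional modules each exist as a component of the Cyber RT framework and are loaded and run by mainboard, the program that Cyber RT provides for this purpose. The rest of this post walks through the Planning module as a concrete example.

The rule in the Planning module's BUILD file that produces the binary is: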

cc_binary(
    name = "libplanning_component.so",
    linkshared = True,
    linkstatic = False,
    deps = [":planning_component_lib"],
)

This rule has no srcs and only one dependency, planning_component_lib, which is defined as follows:

cc_library(
    name = "planning_component_lib",
    srcs = ["planning_component.cc"],
    hdrs = ["planning_component.h"],
    copts = [
        "-DMODULE_NAME=\\\"planning\\\"",
    ],
    deps = [
        ":navi_planning",
        ":on_lane_planning",
        "//cyber",
        "//modules/common/adapters:adapter_gflags",
        "//modules/common/util:message_util",
        "//...
    ],
)

Neither the srcs nor the deps contain a main() function. The Planning binary, libplanning_component.so, is started as a Cyber RT component, so it does not need a main() of its own.
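To see how a shared library without main() can still make a class available to a host program, here is a minimal sketch of a self-registering factory. It is not Cyber RT's implementation: ComponentBase, Registry, Registrar, DemoPlanningComponent, and CreateComponent are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical base class standing in for a framework component interface.
class ComponentBase {
 public:
  virtual ~ComponentBase() = default;
  virtual std::string Name() const = 0;
};

// Hypothetical registry: maps a class name to a factory function.
using Factory = std::function<std::unique_ptr<ComponentBase>()>;

std::map<std::string, Factory>& Registry() {
  static std::map<std::string, Factory> registry;  // built on first use
  return registry;
}

// Constructing a Registrar adds a factory to the registry. A static instance
// is initialized when the library is loaded, so no main() is needed there.
struct Registrar {
  Registrar(const std::string& name, Factory factory) {
    assert(factory && "factory must be callable");
    Registry()[name] = std::move(factory);
  }
};

// What a component library would contain: a class plus one static registrar.
class DemoPlanningComponent : public ComponentBase {
 public:
  std::string Name() const override { return "DemoPlanningComponent"; }
};

static Registrar demo_registrar("DemoPlanningComponent", [] {
  return std::unique_ptr<ComponentBase>(new DemoPlanningComponent());
});

// What the host program would do once the library is loaded.
// Returns nullptr if nothing was registered under `name`.
std::unique_ptr<ComponentBase> CreateComponent(const std::string& name) {
  const auto it = Registry().find(name);
  if (it == Registry().end()) return nullptr;
  return it->second();
}
```

In this sketch the host only needs the class name as a string, which is the kind of information a configuration file can carry.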

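Apollo uses Dreamview as the hub from which every module is started. When the Planning module is started from the Dreamview front end, the backend handler that responds is registered in HMI::RegisterMessageHandlers(), located in /apollo/modules/dreamview/backend/hmi/hmi.cc: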

void HMI::RegisterMessageHandlers() {
  // ...
  websocket_->RegisterMessageHandler(
      "HMIAction",
      [this](const Json& json, WebSocketHandler::Connection* conn) {
        std::string action;
        if (!JsonUtil::GetStringFromJson(json, "action", &action)) {
          AERROR << "Truncated HMIAction request.";
          return;
        }
        HMIAction hmi_action;
        // Parse the action string into an HMIAction value.
        if (!HMIAction_Parse(action, &hmi_action)) {
          AERROR << "Invalid HMIAction string: " << action;
          return;
        }
        std::string value;
        if (JsonUtil::GetStringFromJson(json, "value", &value)) {
          // Execute the action with its value.
          hmi_worker_->Trigger(hmi_action, value);
        } else {
          hmi_worker_->Trigger(hmi_action);
        }

        // Extra works for current Dreamview...
      });

  // ...
}
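The registration pattern in this listing can be sketched in isolation: a dispatcher stores one callback per message type and invokes it when a message of that type arrives. The sketch below is a simplified stand-in, not Dreamview's WebSocketHandler; MessageDispatcher, Fields, Dispatch, and GetString are hypothetical names, and a flat string map stands in for the JSON payload.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Stand-in for a parsed JSON payload: flat string fields only.
using Fields = std::map<std::string, std::string>;

// Hypothetical dispatcher following the register-then-dispatch pattern.
class MessageDispatcher {
 public:
  using Handler = std::function<void(const Fields&)>;

  // Stores (or replaces) the handler for one message type.
  void RegisterMessageHandler(const std::string& type, Handler handler) {
    assert(handler && "handler must be callable");
    handlers_[type] = std::move(handler);
  }

  // Returns true if a handler was registered for `type` and was invoked.
  bool Dispatch(const std::string& type, const Fields& fields) const {
    const auto it = handlers_.find(type);
    if (it == handlers_.end()) return false;
    it->second(fields);
    return true;
  }

 private:
  std::map<std::string, Handler> handlers_;
};

// Plays the role of GetStringFromJson: copies the field into *out if present.
bool GetString(const Fields& fields, const std::string& key,
               std::string* out) {
  assert(out != nullptr);
  const auto it = fields.find(key);
  if (it == fields.end()) return false;
  *out = it->second;
  return true;
}
```

A handler registered under "HMIAction" would call GetString for "action" first and return early when the field is missing, as the listing does.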

HMIAction_Parse(action, &hmi_action) parses the action argument, and hmi_worker_->Trigger(hmi_action, value) carries out the corresponding action:

bool HMIWorker::Trigger(const HMIAction action, const std::string& value) {
  AINFO << "HMIAction " << HMIAction_Name(action) << "(" << value
        << ") was triggered!";
  // The cases below show the HMIAction values. The author could not locate
  // the file that defines HMIAction; the HMIAction_Name/HMIAction_Parse
  // helpers suggest a protobuf-generated enum.
  switch (action) {
    case HMIAction::CHANGE_MODE:
      ChangeMode(value);
      break;
    case HMIAction::CHANGE_MAP:
      ChangeMap(value);
      break;
    case HMIAction::CHANGE_VEHICLE:
      ChangeVehicle(value);
      break;
    // Start a module.
    case HMIAction::START_MODULE:
      StartModule(value);
      break;
    case HMIAction::STOP_MODULE:
      StopModule(value);
      break;
    default:
      AERROR << "HMIAction not implemented, yet!";
      return false;
  }
  return true;
}
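The parse-then-dispatch flow can be tried end to end with a small stand-in. This is a sketch under assumptions: DemoAction, ParseAction, Describe, and HandleRequest are hypothetical, and Describe returns a description of the action instead of performing it.

```cpp
#include <cassert>
#include <string>

// Stand-in for HMIAction; only three illustrative values.
enum class DemoAction { kChangeMode, kStartModule, kStopModule };

// Plays the role of HMIAction_Parse: fills *out and returns true on success.
bool ParseAction(const std::string& name, DemoAction* out) {
  assert(out != nullptr);
  if (name == "CHANGE_MODE") {
    *out = DemoAction::kChangeMode;
  } else if (name == "START_MODULE") {
    *out = DemoAction::kStartModule;
  } else if (name == "STOP_MODULE") {
    *out = DemoAction::kStopModule;
  } else {
    return false;
  }
  return true;
}

// Plays the role of Trigger(): describes what would run instead of running it.
std::string Describe(DemoAction action, const std::string& value) {
  switch (action) {
    case DemoAction::kChangeMode:
      return "change mode to " + value;
    case DemoAction::kStartModule:
      return "start module " + value;
    case DemoAction::kStopModule:
      return "stop module " + value;
  }
  return "unknown action";
}

// Parse first, then dispatch; an unparsable action never reaches the switch.
std::string HandleRequest(const std::string& action_name,
                          const std::string& value) {
  DemoAction action;
  if (!ParseAction(action_name, &action)) {
    return "invalid action: " + action_name;
  }
  return Describe(action, value);
}
```

Calling HandleRequest("START_MODULE", "Planning") follows the same path as the request Dreamview sends when Planning is started.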

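To start the Planning module, hmi_action is HMIAction::START_MODULE and value is Planning. Dreamview actually divides its operating modes into several hmi modes, stored under /apollo/modules/dreamview/conf/hmi_modes, with one configuration file per hmi mode. Each mode lists different dag files and therefore starts different modules. Next, look at HMIWorker::StartModule(const std::string& module):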

void HMIWorker::StartModule(const std::string& module) const {
  // current_mode_ is populated by LoadMode() and holds the loaded HMIMode.
  const Module* module_conf = FindOrNull(current_mode_.modules(), module);
  if (module_conf != nullptr) {
    // Start the process.
    System(module_conf->start_command());
  } else {
    AERROR << "Cannot find module " << module;
  }
}
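The lookup-then-run logic of StartModule() can be sketched without Apollo. The version below is an illustration only: ModuleConf, CommandRunner, RunWithSystem, and this free-function StartModule are hypothetical, FindOrNull is re-implemented for std::map, and the command runner is injected so the lookup can be exercised without spawning a process.

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>
#include <map>
#include <string>

// Stand-in for the Module message: only the field used here.
struct ModuleConf {
  std::string start_command;
};

// Returns a pointer to the mapped value, or nullptr if the key is absent.
template <typename Map>
const typename Map::mapped_type* FindOrNull(
    const Map& map, const typename Map::key_type& key) {
  const auto it = map.find(key);
  return it == map.end() ? nullptr : &it->second;
}

using CommandRunner = std::function<int(const std::string&)>;

// Default runner: hands the command line to the shell via std::system.
int RunWithSystem(const std::string& command) {
  return std::system(command.c_str());
}

// Looks up `name` and runs its start command; returns false if not found.
bool StartModule(const std::map<std::string, ModuleConf>& modules,
                 const std::string& name, const CommandRunner& run) {
  assert(run && "a command runner is required");
  const ModuleConf* conf = FindOrNull(modules, name);
  if (conf == nullptr) return false;
  run(conf->start_command);
  return true;
}
```

Passing RunWithSystem as the runner gives the same shape as the original; passing a lambda that records the command keeps the example side-effect free.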

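In StartModule(), the member variable current_mode_ holds every entry from the configuration file of the current hmi mode. For example, modules/dreamview/conf/hmi_modes/mkz_standard_debug.pb.txt lists all functional modules for the MKZ standard debug mode; HMIWorker::LoadMode(const std::string& mode_config_path) reads such a file into current_mode_. If the string module matches a module name, System() (which calls std::system internally) runs module_conf->start_command() to start a process. LoadMode() is: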

HMIMode HMIWorker::LoadMode(const std::string& mode_config_path) {
  // HMIMode is defined in /apollo/modules/dreamview/proto/hmi_mode.proto.
  HMIMode mode;
  // Load the hmi mode configuration file into mode.
  CHECK(cyber::common::GetProtoFromFile(mode_config_path, &mode))
      << "Unable to parse HMIMode from file " << mode_config_path;
  // Translate cyber_modules to regular modules.
  for (const auto& iter : mode.cyber_modules()) {
    const std::string& module_name = iter.first;  // such as Planning
    const CyberModule& cyber_module = iter.second;
    // Each cyber module, such as Planning or Control, should have at least
    // one dag file.
    CHECK(!cyber_module.dag_files().empty())
        << "None dag file is provided for " << module_name << " module in "
        << mode_config_path;

    Module& module = LookupOrInsert(mode.mutable_modules(), module_name, {});
    module.set_required_for_safety(cyber_module.required_for_safety());

    // Construct start_command:
    //   nohup mainboard -p <process_group> -d <dag> ... &
    module.set_start_command("nohup mainboard");
    const auto& process_group = cyber_module.process_group();
    if (!process_group.empty()) {
      absl::StrAppend(module.mutable_start_command(), " -p ", process_group);
    }
    for (const std::string& dag : cyber_module.dag_files()) {
      absl::StrAppend(module.mutable_start_command(), " -d ", dag);
    }
    absl::StrAppend(module.mutable_start_command(), " &");

    // Construct stop_command:
    const std::string& first_dag = cyber_module.dag_files(0);
    module.set_stop_command(absl::StrCat("pkill -f \"", first_dag, "\""));
    // Construct process_monitor_config.
    module.mutable_process_monitor_config()->add_command_keywords("mainboard");
    module.mutable_process_monitor_config()->add_command_keywords(first_dag);
  }
  mode.clear_cyber_modules();
  AINFO << "Loaded HMI mode: " << mode.DebugString();
  return mode;
}
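The start_command construction in LoadMode() is plain string concatenation, so it can be reproduced with the standard library alone. The following is a sketch, not Apollo code: BuildStartCommand is a hypothetical helper, and an assert stands in for the CHECK on an empty dag list.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds "nohup mainboard [-p <process_group>] -d <dag> ... &" following the
// same steps as the loop in LoadMode().
std::string BuildStartCommand(const std::string& process_group,
                              const std::vector<std::string>& dag_files) {
  assert(!dag_files.empty() && "at least one dag file is required");
  std::string command = "nohup mainboard";
  if (!process_group.empty()) {
    command += " -p " + process_group;  // optional scheduling group
  }
  for (const std::string& dag : dag_files) {
    command += " -d " + dag;  // one -d flag per dag file
  }
  command += " &";  // run in the background
  return command;
}
```

With an empty process_group the -p flag is omitted entirely, which matches the conditional in the listing.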

The HMIMode, Module, and related structures used in LoadMode() all come from /apollo/modules/dreamview/proto/hmi_mode.proto:

message Module {
  optional string start_command = 1;
  optional string stop_command = 2;

  // We use the config in ProcessMonitor to check if the module is running.
  optional ProcessMonitorConfig process_monitor_config = 3;
  // Whether to trigger safe-mode if the module is down.
  optional bool required_for_safety = 4 [default = true];
}

// A CyberModule will be translated to a regular Module upon loading.
message CyberModule {
  repeated string dag_files = 1;
  optional bool required_for_safety = 2 [default = true];
  optional string process_group = 3;
}

// Holds the contents of one mode file from the hmi_modes directory.
message HMIMode {
  map<string, CyberModule> cyber_modules = 1;
  map<string, Module> modules = 2;
  map<string, MonitoredComponent> monitored_components = 3;
}

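As the code shows, the main job of HMIWorker::LoadMode() is to build the start and stop commands for cyber_modules, converting cyber_modules into modules along the way. The resulting start_command has the form nohup mainboard -p <process_group> -d <dag> ... & (with no -s option), and it is what System(module_conf->start_command()) runs. Both <process_group> and <dag> come from the configuration file of the hmi mode. Take modules/dreamview/conf/hmi_modes/mkz_close_loop.pb.txt as an example: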

cyber_modules {
  key: "Computer"
  value: {
    dag_files: "/apollo/modules/tick/dag/tick.dag"
    dag_files: "/apollo/modules/drivers/camera/dag/camera_no_compress.dag"
    dag_files: "/apollo/modules/drivers/gnss/dag/gnss.dag"
    dag_files: "/apollo/modules/drivers/radar/conti_radar/dag/conti_radar.dag"
    dag_files: "/apollo/modules/drivers/velodyne/dag/velodyne.dag"
    dag_files: "/apollo/modules/localization/dag/dag_streaming_msf_localization.dag"
    dag_files: "/apollo/modules/perception/production/dag/dag_streaming_perception.dag"
    dag_files: "/apollo/modules/perception/production/dag/dag_streaming_perception_trafficlights.dag"
    dag_files: "/apollo/modules/planning/dag/planning.dag"
    dag_files: "/apollo/modules/prediction/dag/prediction.dag"
    dag_files: "/apollo/modules/routing/dag/routing.dag"
    dag_files: "/apollo/modules/transform/dag/static_transform.dag"
    process_group: "compute_sched"
  }
}
cyber_modules {
  key: "Controller"
  value: {
    dag_files: "/apollo/modules/canbus/dag/canbus.dag"
    dag_files: "/apollo/modules/control/dag/control.dag"
    dag_files: "/apollo/modules/guardian/dag/guardian.dag"
    process_group: "control_sched"
  }
}
...

This file contains two cyber_modules. The Computer module lists 12 dag_files (one per sub-module component), all of which belong to the process_group named compute_sched. According to the article "Apollo 3.5 Cyber - 如何為Dreamview新增hmi mode" (on adding an hmi mode to Dreamview), process_group is the name of a Cyber RT scheduler conf file: process_group: "compute_sched" means that cyber/conf/compute_sched.conf is used for task scheduling, and process_group: "control_sched" means that control_sched.conf is used.
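The naming convention described here can be written down as a one-line mapping. This is a sketch of the convention only, not Cyber RT's lookup code: SchedulerConfPath is a hypothetical helper, and it assumes every scheduler conf lives under cyber/conf/ as the compute_sched example does.

```cpp
#include <cassert>
#include <string>

// Maps a process_group name to the scheduler conf file it is assumed to
// select, following the "cyber/conf/<process_group>.conf" convention.
std::string SchedulerConfPath(const std::string& process_group) {
  assert(!process_group.empty() && "process_group must be named");
  return "cyber/conf/" + process_group + ".conf";
}
```

For the two groups in mkz_close_loop.pb.txt this yields cyber/conf/compute_sched.conf and cyber/conf/control_sched.conf.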

At this point, the start command for the Planning module is:

nohup mainboard -p compute_sched -d /apollo/modules/planning/dag/planning.dag &

In fact, the configuration file modules/dreamview/conf/hmi_modes/mkz_close_loop.pb.txt yields two large start commands:

nohup mainboard -p compute_sched \
    -d /apollo/modules/tick/dag/tick.dag \
    -d /apollo/modules/drivers/camera/dag/camera_no_compress.dag \
    -d /apollo/modules/drivers/gnss/dag/gnss.dag \
    -d /apollo/modules/drivers/radar/conti_radar/dag/conti_radar.dag \
    -d /apollo/modules/drivers/velodyne/dag/velodyne.dag \
    -d /apollo/modules/localization/dag/dag_streaming_msf_localization.dag \
    -d /apollo/modules/perception/production/dag/dag_streaming_perception.dag \
    -d /apollo/modules/perception/production/dag/dag_streaming_perception_trafficlights.dag \
    -d /apollo/modules/planning/dag/planning.dag \
    -d /apollo/modules/prediction/dag/prediction.dag \
    -d /apollo/modules/routing/dag/routing.dag \
    -d /apollo/modules/transform/dag/static_transform.dag &

nohup mainboard -p control_sched \
    -d /apollo/modules/canbus/dag/canbus.dag \
    -d /apollo/modules/control/dag/control.dag \
    -d /apollo/modules/guardian/dag/guardian.dag &

Based on the code analysis above, the corresponding stop commands are as follows (not verified):

pkill -f "/apollo/modules/tick/dag/tick.dag"
pkill -f "/apollo/modules/canbus/dag/canbus.dag"
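The stop command and the process-monitor keywords are both derived from the first dag file of a cyber module, as the LoadMode() listing shows. A minimal sketch of that derivation follows; BuildStopCommand and BuildMonitorKeywords are hypothetical helpers, not Apollo functions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds: pkill -f "<first dag file>"
// pkill -f matches against the full command line, so the first dag path is
// used as the pattern that identifies the mainboard process to stop.
std::string BuildStopCommand(const std::vector<std::string>& dag_files) {
  assert(!dag_files.empty() && "at least one dag file is required");
  return "pkill -f \"" + dag_files.front() + "\"";
}

// Keywords used to recognize the running process: the binary name plus the
// first dag file.
std::vector<std::string> BuildMonitorKeywords(
    const std::vector<std::string>& dag_files) {
  assert(!dag_files.empty() && "at least one dag file is required");
  return {"mainboard", dag_files.front()};
}
```

Only the first dag file matters here, so reordering dag_files in a mode file changes the generated stop command.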

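nohup starts the process so that it ignores hangup signals. mainboard is clearly the main program being launched, so the entry point main() must live there. process_group merely groups functional modules; the dag_files are the actual configuration files that start them.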
