Prometheus in Practice and Source Code Analysis: Service Startup
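With the groundwork from the earlier posts in place, we can now start analyzing the Prometheus source code. The first thing to look at is service startup. The main function in cmd/prometheus/main.go is quite long, so it is explained here piece by piece.
First, local storage is set up: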
var localStorage local.Storage
switch cfg.localStorageEngine {
case "persisted":
    localStorage = local.NewMemorySeriesStorage(&cfg.storage)
    sampleAppender = storage.Fanout{localStorage}
case "none":
    localStorage = &local.NoopStorage{}
default:
    log.Errorf("Invalid local storage engine %q", cfg.localStorageEngine)
    return 1
}
remoteStorage := &remote.Storage{}
sampleAppender = append(sampleAppender, remoteStorage)
reloadables = append(reloadables, remoteStorage)
The default local storage engine in the config is "persisted", which means data is kept on local disk:
cfg.fs.StringVar(
    &cfg.localStorageEngine, "storage.local.engine", "persisted",
    "Local storage engine. Supported values are: 'persisted' (full local storage with on-disk persistence) and 'none' (no local storage).",
)
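If the engine is set to "none", nothing is stored locally; NoopStorage is an empty implementation. Prometheus supports not only local storage but also remote storage, which will be covered in detail later.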
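To make the fan-out idea more concrete, here is a minimal sketch of how one appender can forward samples to several backends. The types Sample, SampleAppender, Fanout, memoryAppender and the helper buildFanout are simplified stand-ins written for this illustration; they are assumptions, not the actual Prometheus definitions.

```go
package main

import "fmt"

// Sample is a hypothetical, simplified stand-in for a scraped sample.
type Sample struct {
	Metric string
	Value  float64
}

// SampleAppender is a simplified appender interface (assumption).
type SampleAppender interface {
	Append(s Sample) error
}

// Fanout forwards every sample to each appender it holds.
type Fanout []SampleAppender

// Append stops at the first appender that returns an error.
func (f Fanout) Append(s Sample) error {
	for _, a := range f {
		if err := a.Append(s); err != nil {
			return err
		}
	}
	return nil
}

// memoryAppender records samples in memory; it stands in for a real backend.
type memoryAppender struct{ samples []Sample }

func (m *memoryAppender) Append(s Sample) error {
	m.samples = append(m.samples, s)
	return nil
}

// buildFanout mirrors the shape of the engine switch: the local appender is
// included only for "persisted", and the remote appender is always added.
func buildFanout(engine string, local, remote SampleAppender) (Fanout, error) {
	var f Fanout
	switch engine {
	case "persisted":
		f = Fanout{local}
	case "none":
		// No local appender; samples only go to the remote one.
	default:
		return nil, fmt.Errorf("invalid local storage engine %q", engine)
	}
	return append(f, remote), nil
}

func main() {
	local, remote := &memoryAppender{}, &memoryAppender{}
	f, err := buildFanout("persisted", local, remote)
	if err != nil {
		fmt.Println(err)
		return
	}
	if err := f.Append(Sample{Metric: "up", Value: 1}); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println("local:", len(local.samples), "remote:", len(remote.samples))
}
```

With this shape, the scraping side only ever sees a single appender and does not need to know how many backends sit behind it.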
var (
    notifier       = notifier.New(&cfg.notifier)
    targetManager  = retrieval.NewTargetManager(sampleAppender)
    queryEngine    = promql.NewEngine(localStorage, &cfg.queryEngine)
    ctx, cancelCtx = context.WithCancel(context.Background())
)
ruleManager := rules.NewManager(&rules.ManagerOptions{
    SampleAppender: sampleAppender,
    Notifier:       notifier,
    QueryEngine:    queryEngine,
    Context:        ctx,
    ExternalURL:    cfg.web.ExternalURL,
})
cfg.web.Context = ctx
cfg.web.Storage = localStorage
cfg.web.QueryEngine = queryEngine
cfg.web.TargetManager = targetManager
cfg.web.RuleManager = ruleManager
cfg.web.Notifier = notifier
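This initializes the context ctx and creates the targetManager that manages targets, along with the query engine and the alert notifier. These are then wired into the rule manager and into the web handler's configuration.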
reloadables = append(reloadables, targetManager, ruleManager, webHandler, notifier)
if err := reloadConfig(cfg.configFile, reloadables...); err != nil {
    log.Errorf("Error loading config: %s", err)
    return 1
}
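The initial configuration can be changed by loading the config file. The code below is a relatively new Prometheus feature for reloading the configuration at runtime: it listens for the SIGHUP signal and reloads the config file, and it handles reload requests coming in through webHandler.Reload() in the same way: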
hup := make(chan os.Signal)
hupReady := make(chan bool)
signal.Notify(hup, syscall.SIGHUP)
go func() {
    <-hupReady
    for {
        select {
        case <-hup:
            if err := reloadConfig(cfg.configFile, reloadables...); err != nil {
                log.Errorf("Error reloading config: %s", err)
            }
        case rc := <-webHandler.Reload():
            if err := reloadConfig(cfg.configFile, reloadables...); err != nil {
                log.Errorf("Error reloading config: %s", err)
                rc <- err
            } else {
                rc <- nil
            }
        }
    }
}()
Next, local storage is started:
if err := localStorage.Start(); err != nil {
    log.Errorln("Error opening memory series storage:", err)
    return 1
}
defer func() {
    if err := localStorage.Stop(); err != nil {
        log.Errorln("Error stopping storage:", err)
    }
}()
Prometheus can monitor not only external systems but also itself, so it needs to expose its own performance metrics:
if instrumentedStorage, ok := localStorage.(prometheus.Collector); ok {
    prometheus.MustRegister(instrumentedStorage)
}
prometheus.MustRegister(notifier)
prometheus.MustRegister(configSuccess)
prometheus.MustRegister(configSuccessTime)
Registration is done through MustRegister; this will be explained in more depth later.
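The type assertion on localStorage is worth a closer look: the storage is registered only if its concrete type also happens to implement the collector interface. Here is a minimal sketch of that "optional interface" pattern. Storage, Collector, registry, memoryStorage and noopStorage are all hypothetical types invented for this example, not the Prometheus or client library definitions.

```go
package main

import "fmt"

// Storage and Collector are hypothetical, simplified interfaces.
type Storage interface{ Start() error }

type Collector interface{ Collect() []string }

// registry is a stand-in for a metrics registry.
type registry struct{ collectors []Collector }

// registerIfCollector registers v only when it also implements Collector,
// and reports whether it did.
func (r *registry) registerIfCollector(v interface{}) bool {
	c, ok := v.(Collector)
	if ok {
		r.collectors = append(r.collectors, c)
	}
	return ok
}

// memoryStorage implements both Storage and Collector.
type memoryStorage struct{}

func (memoryStorage) Start() error      { return nil }
func (memoryStorage) Collect() []string { return []string{"series_in_memory"} }

// noopStorage implements only Storage, so it has nothing to expose.
type noopStorage struct{}

func (noopStorage) Start() error { return nil }

func main() {
	r := &registry{}
	for _, s := range []Storage{memoryStorage{}, noopStorage{}} {
		fmt.Printf("%T registered: %v\n", s, r.registerIfCollector(s))
	}
}
```

The benefit of this approach is that the Storage interface itself does not have to mention metrics at all; an implementation opts in simply by providing the extra method.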
go notifier.Run()
defer notifier.Stop()
go ruleManager.Run()
defer ruleManager.Stop()
go targetManager.Run()
defer targetManager.Stop()
// Shutting down the query engine before the rule manager will cause pending queries
// to be canceled and ensures a quick shutdown of the rule manager.
defer cancelCtx()
go webHandler.Run()
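Right after that, the various services are started: the alert notification service, the target manager, the rule manager, the web handler, and so on. What follows is the part that stops the service gracefully: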
term := make(chan os.Signal)
signal.Notify(term, os.Interrupt, syscall.SIGTERM)
select {
case <-term:
    log.Warn("Received SIGTERM, exiting gracefully...")
case <-webHandler.Quit():
    log.Warn("Received termination request via web service, exiting gracefully...")
case err := <-webHandler.ListenError():
    log.Errorln("Error starting web server, exiting gracefully:", err)
}
log.Info("See you next time!")
return 0
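The graceful shutdown relies on two things: the select blocks until one of the exit conditions fires, and the deferred calls registered earlier then run in reverse order of registration. Below is a small sketch of that interplay. The function run and its event strings are hypothetical and only imitate the order in which the stop calls were deferred in the excerpts above.

```go
package main

import "fmt"

// run blocks until one of the three channels fires, then returns. The
// deferred calls imitate the stop functions and run in reverse order of
// registration (last deferred, first executed).
func run(term, quit <-chan struct{}, listenErr <-chan error, events *[]string) int {
	record := func(event string) { *events = append(*events, event) }

	defer record("storage stopped") // deferred first, runs last
	defer record("notifier stopped")
	defer record("rule manager stopped")
	defer record("target manager stopped")
	defer record("context canceled") // deferred last, runs first

	select {
	case <-term:
		record("received termination signal")
	case <-quit:
		record("received quit request")
	case err := <-listenErr:
		record("listen error: " + err.Error())
	}
	return 0
}

func main() {
	quit := make(chan struct{})
	close(quit) // simulate a shutdown request arriving via the web handler

	var events []string
	// nil channels never become ready, so only quit can be selected here.
	code := run(nil, quit, nil, &events)
	fmt.Println("exit code:", code)
	for _, e := range events {
		fmt.Println(e)
	}
}
```

This ordering is what the comment in the source refers to: because the context is canceled first, pending queries are aborted before the rule manager is asked to stop, and the storage is stopped only after everything that writes to it.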
Below is an excerpt from the service's startup log:
time="2017-04-27T11:05:10Z" level=info msg="Starting prometheus (version=1.6.1, branch=master, revision=4666df502c0e239ed4aa1d80abbbfb54f61b23c3)" source="main.go:88"
time="2017-04-27T11:05:10Z" level=info msg="Build context (go=go1.8.1, user=root@7e45fa0366a7, date=20170419-14:32:22)" source="main.go:89"
time="2017-04-27T11:05:10Z" level=info msg="Loading configuration file /etc/prometheus/prometheus.yml" source="main.go:251"
time="2017-04-27T11:05:10Z" level=info msg="Loading series map and head chunks..." source="storage.go:421"
time="2017-04-27T11:05:10Z" level=info msg="0 series loaded." source="storage.go:432"
time="2017-04-27T11:05:10Z" level=info msg="Starting target manager..." source="targetmanager.go:61"
time="2017-04-27T11:05:10Z" level=info msg="Listening on :9090" source="web.go:259"
At this point the main body of the service is up and running.