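With the groundwork from the earlier posts in place, we can now start analyzing the Prometheus source code, beginning with how the server starts up. The entry point is the main function in cmd/prometheus/main.go. It is quite long, so it is walked through piece by piece here.
The first step is to set up the local storage: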

    var localStorage local.Storage
    switch cfg.localStorageEngine {
    case "persisted":
        localStorage = local.NewMemorySeriesStorage(&cfg.storage)
        sampleAppender = storage.Fanout{localStorage}
    case "none":
        localStorage = &local.NoopStorage{}
    default:
        log.Errorf("Invalid local storage engine %q", cfg.localStorageEngine)
        return 1
    }

    remoteStorage := &remote.Storage{}
    sampleAppender = append(sampleAppender, remoteStorage)
    reloadables = append(reloadables, remoteStorage)

The flag definition defaults the local storage engine to persisted, which means samples are stored on the local disk:

    cfg.fs.StringVar(
        &cfg.localStorageEngine, "storage.local.engine", "persisted",
        "Local storage engine. Supported values are: 'persisted' (full local storage with on-disk persistence) and 'none' (no local storage).",
    )

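If the engine is set to none, nothing is kept locally; NoopStorage is an empty implementation. Prometheus supports not only local storage but also remote storage, which will be covered in detail later.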
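To make the engine switch and the fan-out idea more concrete, here is a minimal, self-contained sketch. The names Sample, Appender, memStorage, noopStorage, fanout and newLocalStorage are hypothetical stand-ins invented for this illustration; they are not the real Prometheus types, and the in-memory slice only imitates the "persisted" engine.

```go
package main

import "fmt"

// Sample is a hypothetical, simplified stand-in for a scraped sample.
type Sample struct {
	Metric string
	Value  float64
}

// Appender is a hypothetical interface for anything that accepts samples.
type Appender interface {
	Append(s Sample) error
}

// memStorage keeps samples in a slice; it imitates the "persisted" engine.
type memStorage struct{ samples []Sample }

func (m *memStorage) Append(s Sample) error {
	m.samples = append(m.samples, s)
	return nil
}

// noopStorage satisfies Appender but discards everything, like the "none" engine.
type noopStorage struct{}

func (noopStorage) Append(Sample) error { return nil }

// fanout forwards each sample to every appender it contains,
// stopping at the first error.
type fanout []Appender

func (f fanout) Append(s Sample) error {
	for _, a := range f {
		if err := a.Append(s); err != nil {
			return err
		}
	}
	return nil
}

// newLocalStorage mirrors the switch on the engine name.
func newLocalStorage(engine string) (Appender, error) {
	switch engine {
	case "persisted":
		return &memStorage{}, nil
	case "none":
		return noopStorage{}, nil
	default:
		return nil, fmt.Errorf("invalid local storage engine %q", engine)
	}
}

func main() {
	local, err := newLocalStorage("persisted")
	if err != nil {
		fmt.Println(err)
		return
	}
	remote := &memStorage{} // stand-in for a remote storage target
	appender := fanout{local, remote}
	if err := appender.Append(Sample{Metric: "up", Value: 1}); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(len(local.(*memStorage).samples), len(remote.samples))
}
```

The point of the interface is that the rest of the program only sees an appender; whether samples end up on disk, in a remote system, or nowhere is decided once at startup.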

    var (
        notifier       = notifier.New(&cfg.notifier)
        targetManager  = retrieval.NewTargetManager(sampleAppender)
        queryEngine    = promql.NewEngine(localStorage, &cfg.queryEngine)
        ctx, cancelCtx = context.WithCancel(context.Background())
    )

    ruleManager := rules.NewManager(&rules.ManagerOptions{
        SampleAppender: sampleAppender,
        Notifier:       notifier,
        QueryEngine:    queryEngine,
        Context:        ctx,
        ExternalURL:    cfg.web.ExternalURL,
    })

    cfg.web.Context = ctx
    cfg.web.Storage = localStorage
    cfg.web.QueryEngine = queryEngine
    cfg.web.TargetManager = targetManager
    cfg.web.RuleManager = ruleManager
    cfg.web.Notifier = notifier

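This initializes the context ctx and creates the targetManager that manages targets, the query engine, the alert notifier, and the rule manager, then hands them all to the web configuration.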
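The shared ctx matters later, at shutdown: cancelling it is what aborts queries that are still pending. The sketch below shows that mechanism in isolation. runQuery and cancelPendingQuery are hypothetical helpers written for this illustration, not Prometheus functions.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// runQuery is a hypothetical long-running query that honors cancellation:
// it returns nil when the work finishes, or ctx.Err() if ctx ends first.
func runQuery(ctx context.Context, cost time.Duration) error {
	select {
	case <-time.After(cost):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// cancelPendingQuery starts a slow query on a shared context, cancels the
// context, and returns whatever the query reported.
func cancelPendingQuery() error {
	ctx, cancel := context.WithCancel(context.Background())
	result := make(chan error, 1)
	go func() { result <- runQuery(ctx, time.Hour) }()
	cancel() // the query returns immediately instead of running for an hour
	return <-result
}

func main() {
	fmt.Println(cancelPendingQuery())
}
```

Because every component receives the same context, a single cancel call reaches all of them without each one needing its own stop channel.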

    reloadables = append(reloadables, targetManager, ruleManager, webHandler, notifier)

    if err := reloadConfig(cfg.configFile, reloadables...); err != nil {
        log.Errorf("Error loading config: %s", err)
        return 1
    }

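The initial settings can be overridden by loading the configuration file. The code below is Prometheus's newly added support for reloading the configuration at runtime: the configuration file is reloaded when the process receives a SIGHUP signal or when a reload is requested through webHandler.Reload(). The goroutine first waits on hupReady, so reload requests are not handled until the server signals that it is ready.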

    hup := make(chan os.Signal)
    hupReady := make(chan bool)
    signal.Notify(hup, syscall.SIGHUP)
    go func() {
        <-hupReady
        for {
            select {
            case <-hup:
                if err := reloadConfig(cfg.configFile, reloadables...); err != nil {
                    log.Errorf("Error reloading config: %s", err)
                }
            case rc := <-webHandler.Reload():
                if err := reloadConfig(cfg.configFile, reloadables...); err != nil {
                    log.Errorf("Error reloading config: %s", err)
                    rc <- err
                } else {
                    rc <- nil
                }
            }
        }
    }()
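The reload pattern above boils down to "on every signal, apply the new configuration to each reloadable component and report whether anything failed". Here is a small sketch of that idea. Config, Reloadable, component, reloadAll and reloadLoop are hypothetical names chosen for this example, and the signal channel is fed by hand so the program terminates; a real server would register it with signal.Notify(hup, syscall.SIGHUP).

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"syscall"
)

// Config is a hypothetical, minimal configuration.
type Config struct{ ScrapeInterval string }

// Reloadable is implemented by every component that can apply a new config.
type Reloadable interface {
	ApplyConfig(*Config) error
}

// component records the last config it accepted.
type component struct {
	name    string
	current *Config
}

func (c *component) ApplyConfig(cfg *Config) error {
	if cfg.ScrapeInterval == "" {
		return errors.New(c.name + ": empty scrape interval")
	}
	c.current = cfg
	return nil
}

// reloadAll applies cfg to every component and reports whether any failed.
func reloadAll(cfg *Config, rls ...Reloadable) error {
	failed := false
	for _, rl := range rls {
		if err := rl.ApplyConfig(cfg); err != nil {
			fmt.Println("failed to apply configuration:", err)
			failed = true
		}
	}
	if failed {
		return errors.New("one or more components rejected the configuration")
	}
	return nil
}

// reloadLoop calls reload once per signal until hup is closed and counts
// successful and failed reloads.
func reloadLoop(hup <-chan os.Signal, reload func() error) (ok, failed int) {
	for range hup {
		if err := reload(); err != nil {
			failed++
		} else {
			ok++
		}
	}
	return ok, failed
}

func main() {
	targets, rules := &component{name: "targets"}, &component{name: "rules"}
	cfg := &Config{ScrapeInterval: "15s"}

	// Simulated signals; buffered so the sends below do not block.
	hup := make(chan os.Signal, 2)
	hup <- syscall.SIGHUP
	hup <- syscall.SIGHUP
	close(hup)

	ok, failed := reloadLoop(hup, func() error { return reloadAll(cfg, targets, rules) })
	fmt.Println("reloads ok:", ok, "failed:", failed)
}
```

Note that a failed reload is only logged and counted; the process keeps running with whatever configuration each component last accepted, which matches the behavior of the loop in main.go.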

Next, the local storage is started:

    if err := localStorage.Start(); err != nil {
        log.Errorln("Error opening memory series storage:", err)
        return 1
    }
    defer func() {
        if err := localStorage.Stop(); err != nil {
            log.Errorln("Error stopping storage:", err)
        }
    }()

Prometheus can monitor not only external systems but also itself, so it has to expose its own performance metrics:

    if instrumentedStorage, ok := localStorage.(prometheus.Collector); ok {
        prometheus.MustRegister(instrumentedStorage)
    }
    prometheus.MustRegister(notifier)
    prometheus.MustRegister(configSuccess)
    prometheus.MustRegister(configSuccessTime)

The metrics are registered through MustRegister; this will be explained in depth later.

    go notifier.Run()
    defer notifier.Stop()

    go ruleManager.Run()
    defer ruleManager.Stop()

    go targetManager.Run()
    defer targetManager.Stop()

    // Shutting down the query engine before the rule manager will cause pending queries
    // to be canceled and ensures a quick shutdown of the rule manager.
    defer cancelCtx()

    go webHandler.Run()

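Next, the individual services are started: the alert notification service, the rule manager, the target manager, the web handler, and so on. What follows is the graceful shutdown part: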

    term := make(chan os.Signal)
    signal.Notify(term, os.Interrupt, syscall.SIGTERM)
    select {
    case <-term:
        log.Warn("Received SIGTERM, exiting gracefully...")
    case <-webHandler.Quit():
        log.Warn("Received termination request via web service, exiting gracefully...")
    case err := <-webHandler.ListenError():
        log.Errorln("Error starting web server, exiting gracefully:", err)
    }

    log.Info("See you next time!")
    return 0
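Two things work together in the shutdown path: the select blocks until one of several events arrives, and once main returns, the deferred Stop calls run in reverse order of registration, so the component started last is stopped first. The sketch below reproduces that shape. waitForShutdown, run, errDemo and the component names are hypothetical and exist only for this illustration; the listen error is simulated so the example terminates on its own.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/signal"
	"syscall"
)

// errDemo simulates a web server that failed to start listening.
var errDemo = errors.New("simulated listen error")

// waitForShutdown blocks until one of the three events fires and names it.
// A nil channel never fires, so unused events can simply be passed as nil.
func waitForShutdown(term <-chan os.Signal, quit <-chan struct{}, listenErr <-chan error) string {
	select {
	case <-term:
		return "signal"
	case <-quit:
		return "web request"
	case err := <-listenErr:
		return "listen error: " + err.Error()
	}
}

// run mimics the shape of main: start components, defer their stops, wait
// for a shutdown event, then return so the defers fire in reverse order.
func run(listenErr <-chan error, quit <-chan struct{}) (events []string) {
	term := make(chan os.Signal, 1) // buffered: signal delivery never blocks
	signal.Notify(term, os.Interrupt, syscall.SIGTERM)
	defer signal.Stop(term)

	for _, name := range []string{"storage", "notifier", "ruleManager", "targetManager"} {
		name := name // capture a per-iteration copy for the deferred closure
		events = append(events, "start "+name)
		defer func() { events = append(events, "stop "+name) }()
	}

	events = append(events, "shutdown: "+waitForShutdown(term, quit, listenErr))
	return events
}

func main() {
	listenErr := make(chan error, 1)
	listenErr <- errDemo // pretend the listener failed immediately
	for _, e := range run(listenErr, nil) {
		fmt.Println(e)
	}
}
```

This ordering is why main.go registers defer cancelCtx() after the other Stop calls: it runs first on the way out, which cancels pending queries before the rule manager is asked to stop.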

Below is an excerpt of the server's startup log:

    time="2017-04-27T11:05:10Z" level=info msg="Starting prometheus (version=1.6.1, branch=master, revision=4666df502c0e239ed4aa1d80abbbfb54f61b23c3)" source="main.go:88"
    time="2017-04-27T11:05:10Z" level=info msg="Build context (go=go1.8.1, user=root@7e45fa0366a7, date=20170419-14:32:22)" source="main.go:89"
    time="2017-04-27T11:05:10Z" level=info msg="Loading configuration file /etc/prometheus/prometheus.yml" source="main.go:251"
    time="2017-04-27T11:05:10Z" level=info msg="Loading series map and head chunks..." source="storage.go:421"
    time="2017-04-27T11:05:10Z" level=info msg="0 series loaded." source="storage.go:432"
    time="2017-04-27T11:05:10Z" level=info msg="Starting target manager..." source="targetmanager.go:61"
    time="2017-04-27T11:05:10Z" level=info msg="Listening on :9090" source="web.go:259"

At this point the main body of the service is up and running.
