\documentclass[conference]{IEEEtran}
% Add the compsoc option for Computer Society conferences.

\usepackage{ctable}
\usepackage{cite}
\usepackage[cmex10]{amsmath}
% \usepackage{acronym}
\usepackage{graphicx}
\usepackage{multirow}
\usepackage{listings}
\usepackage{color}
\usepackage{xcolor}
\usepackage{balance}

\colorlet{@punct}{red!60!black}
\definecolor{@delim}{RGB}{20,105,176}

\lstdefinelanguage{json}{
    basicstyle=\footnotesize\ttfamily,
    literate=
     *{\ }{{{\ }}}{1}
      {:}{{{\color{@punct}{:}}}}{1}
      {,}{{{\color{@punct}{,}}}}{1}
      {\{}{{{\color{@delim}{\{}}}}{1}
      {\}}{{{\color{@delim}{\}}}}}{1}
      {[}{{{\color{@delim}{[}}}}{1}
      {]}{{{\color{@delim}{]}}}}{1},
}

\newcommand{\includeJSON}[1]{\lstinputlisting[language=json,firstnumber=1]{#1}}

% correct bad hyphenation here
\hyphenation{op-tical net-works semi-conduc-tor}

\begin{document}

\title{A Generic and Extensible Framework for Monitoring Energy Consumption in OpenStack Clouds}


\author{\IEEEauthorblockN{Francois Rossigneux, Jean-Patrick Gelas, Laurent Lef\`{e}vre, Marcos D. Assun\c{c}\~ao}
\IEEEauthorblockA{Inria Avalon team, LIP Laboratory\\
Ecole Normale Superieure of Lyon\\
University of Lyon, France}
}


\maketitle


\begin{abstract}
Although cloud computing has been transformational in the IT industry, it often relies on large data centres that consume massive amounts of electrical power. Efforts have been made to reduce the power consumed by Clouds, with certain data centres now approaching a PUE factor of 1.08. This means that the IT infrastructure is now responsible for a large share of the power a data centre consumes, and hence means to monitor and analyse how energy is spent have never been so crucial. Such monitoring is required for a better understanding of how power is consumed by the IT infrastructure and for assessing the impact of energy management policies. In this article, we draw some lessons from previous experience in monitoring large-scale systems and introduce an energy monitoring software framework called Kwapi. The framework supports several wattmeter devices and multiple measurement formats, and reduces communication overhead. The Kwapi architecture is scalable, extensible, and completely integrated into OpenStack.

\end{abstract}

\IEEEpeerreviewmaketitle

\section{Introduction}
% no \IEEEPARstart

Cloud computing \cite{ArmbrustCloud:2009} has become a key building block in providing IT resources and services to organisations of all sizes. Amongst its claimed benefits, the most appealing derive from economies of scale and often include a pay-as-you-go business model, resource consolidation, elasticity, good availability, and wide geographical coverage. Despite these advantages over other provisioning models, Clouds often rely on large data centres that consume massive amounts of electrical power to serve customers with the resources they need \cite{BaligaInternet:2011}.

Numerous efforts have been made to curb the energy consumed by Clouds, with some data centres now approaching a Power Usage Effectiveness (PUE) factor of 1.08\footnote{http://gigaom.com/2012/03/26/whose-data-centers-are-more-efficient-facebooks-or-googles/}. As a result, the IT infrastructure is now responsible for a large share of the power consumed by current data centres, and hence means to monitor and analyse how energy is spent have never been so crucial. Our experience in this area, however, has demonstrated that monitoring the power consumed by large systems is not always an easy task \cite{OrgerieSaveWatts:2008,AssuncaoIngrid:2010,DaCostaGreenNet:2010}. There are multiple power probes available in the market, generally with their own APIs, physical connections, precision, and communication protocols~\cite{eelsd2013}. Moreover, cost-related constraints can lead to decisions to acquire and deploy equipment in multiple stages or to monitor the power consumption of only part of the infrastructure.
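
To make the implication of this figure explicit, recall that PUE is conventionally defined as the ratio of the total power drawn by a facility to the power drawn by its IT equipment. The symbols $P_{\mathrm{facility}}$ and $P_{\mathrm{IT}}$ below are notation introduced here for illustration only. Under that definition, a PUE of 1.08 gives
\begin{equation*}
\mathrm{PUE} = \frac{P_{\mathrm{facility}}}{P_{\mathrm{IT}}}
\quad\Longrightarrow\quad
\frac{P_{\mathrm{IT}}}{P_{\mathrm{facility}}} = \frac{1}{1.08} \approx 0.926 ,
\end{equation*}
that is, roughly 93\% of the power drawn by such a facility is consumed by the IT infrastructure itself, with the remaining 7\% or so going to overheads such as cooling and power distribution.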

Although monitoring the power consumption of only part of the deployed equipment is sound from a cost perspective, it prevents one from capturing certain nuances of the infrastructure. Previous work has shown that as a computer cluster ages, certain components wear out, while others are replaced, leading to heterogeneous power consumption among nodes that were seemingly homogeneous. The difference between nodes that consume the least power and nodes that consume the most can reach 20\% \cite{MehdiHeterogeneous:2013}, which reinforces the idea that monitoring the consumption of the whole set of IT equipment can allow for further improvements in energy efficiency. Monitoring a great number of nodes, however, requires the design of an efficient infrastructure for collecting and processing the power consumption data.

This paper describes the design and architecture of a generic and flexible framework, termed Kilowatt API (Kwapi), that interfaces with OpenStack to provide it with power consumption information collected from multiple heterogeneous probes. OpenStack is a project that aims to provide a ubiquitous open source cloud computing platform and is currently used by many corporations, researchers and global data centres\footnote{http://www.openstack.org/user-stories/}. Ceilometer is an OpenStack component conceived to provide a framework to collect a large range of metrics for metering purposes\footnote{https://wiki.openstack.org/wiki/Ceilometer}. In this work we describe how Kwapi has been integrated into Ceilometer. With the increasing use of Ceilometer as the de facto metering tool for OpenStack, we believe that such an integration of a power monitoring framework into OpenStack can be of great value to the research community and practitioners.

The remaining part of this paper is organised as follows. Section~\ref{sec:related_work} describes background and related work. Section~\ref{sec:architecture} presents the requirements and introduces the Kwapi architecture. Section~\ref{sec:performance} discusses experimental results measuring the throughput of drivers and plug-ins, and Section~\ref{sec:conclusion} concludes the paper.

% ----------------------------------------------------------------------------------------

\section{Background and Related Work}
\label{sec:related_work}

This section provides an overview of Ceilometer's architecture and describes related work in the field of monitoring the power consumption of large-scale computing infrastructure.

\subsection{OpenStack Ceilometer}

Ceilometer, whose logical architecture\footnote{http://docs.openstack.org/developer/ceilometer/architecture.html} is depicted in Figure~\ref{fig:arch_ceilometer}, is OpenStack's framework for collecting performance metrics and information on resource consumption. At the time of writing, it allows for data collection in three ways:

\begin{itemize}
\item \textbf{Bus listener agent}, which picks up events from the Oslo notification bus and turns them into Ceilometer samples (\textit{e.g.} cumulative type, gauge or delta) that can then be stored in the database or provided to an external system via the publishing pipeline.

\item \textbf{Push agents}, a more intrusive approach that consists of deploying agents on the monitored nodes to push data remotely to the collector.

\item \textbf{Polling agents}, which poll APIs or other tools to collect information.
\end{itemize}

\begin{figure}[!htb]
\centering
\includegraphics[width=1.0\columnwidth]{figs/ceilometer_logical_architecture.pdf}
\caption{Overview of Ceilometer's logical architecture.}
\label{fig:arch_ceilometer}
\end{figure}

The last two methods depend on a combination of central agent, compute agents and collector. The compute agents run on nodes and retrieve information about resource usage related to a given virtual machine instance and a resource owner, whereas the central agent executes \textit{pollsters} on the management server to retrieve data that is not linked to a particular instance. Pollsters are executed, for example, to poll resources by using an API or other method. The Ceilometer database can be queried via the Ceilometer API, which allows an external system to view the history of a resource's metrics. It also enables a system to set and receive alarms.

Metering messages can be signed using the \textit{hmac} module in Python's standard library, with a shared secret value provided in the configuration settings. The message signature is included in the message so that it can be verified by the collector or by systems accessing the API.
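
As a rough illustration of this signing scheme, the following sketch signs and verifies a message with Python's \textit{hmac} module. It is not Ceilometer's actual code: the secret value, the \texttt{message\_signature} field name, the choice of SHA-256 and the canonicalisation through sorted JSON are all assumptions made for the example.

```python
import hashlib
import hmac
import json

# Hypothetical shared secret; in practice it would be read from the configuration.
SECRET = b"example-shared-secret"


def sign(message, secret=SECRET):
    """Return a copy of `message` with a hex HMAC-SHA256 signature attached."""
    # Sorting the keys gives sender and receiver the same byte string to sign.
    payload = json.dumps(message, sort_keys=True).encode("utf-8")
    signed = dict(message)
    signed["message_signature"] = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    return signed


def verify(signed, secret=SECRET):
    """Recompute the signature over every field except the signature itself."""
    body = {k: v for k, v in signed.items() if k != "message_signature"}
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    expected = hmac.new(secret, payload, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(expected, signed.get("message_signature", ""))
```

A receiver that holds the same secret calls \texttt{verify} on each incoming message and discards those for which it returns \texttt{False}, for instance because a field was altered in transit.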

\subsection{Related Work}

Over the past years, several techniques have been proposed to minimise the energy consumed by computing infrastructure. At the hardware level, for instance, processors are able to operate at multiple frequency and voltage levels, and the operating systems or resource managers can choose the level that matches the current workload \cite{LaszewskiDVFS:2009}. At the resource management level, several approaches have been proposed, including resource consolidation \cite{BeloglazovOpenStack:2014} and rescheduling requests \cite{OrgerieSaveWatts:2008}, generally with the goal of switching off unused resources or setting them to low power consumption modes. Attempts have also been made to assess the power consumed by individual applications \cite{NoureddineThesis:2014}.

A means to monitor energy consumption is often a key component in assessing the potential gains of techniques that aim to improve software and cloud resource management systems. Monitoring of Clouds is not a new topic \cite{AcetoMonitoring:2013}, as tools to monitor computing infrastructure \cite{BrinkmannMonitoring:2013,VarretteICPP:2014} as well as ways to address some of the usual issues in management systems have been introduced \cite{WardMonitoring:2013,TanMonitoring:2013}. Moreover, several systems for measuring the power consumed by compute clusters have been described in the literature \cite{AssuncaoIngrid:2010}. As traditional system and network monitoring techniques lack the capability to interface with wattmeters, most approaches for measuring energy consumption have been tailored to the specific needs of the projects in which they were conceived.

In our work we aim to draw some lessons from previous approaches to monitoring and analysing the energy consumption of large-scale distributed systems \cite{OrgerieSaveWatts:2008,DaCostaGreenNet:2010,AssuncaoIngrid:2010,MehdiHeterogeneous:2013}. We opted to create a framework and integrate it with a successful cloud platform, OpenStack. Such a framework can be of value to the research community and practitioners working on the topic.

% ----------------------------------------------------------------------------------------

\section{The Kwapi Architecture}
\label{sec:architecture}

Depending on the number of monitored devices and the frequency at which measurements are taken, wattmeters can generate a large amount of data, which requires storage capacity for further processing and analysis. Although storing and pre-processing data locally on the monitored nodes is an approach often followed by management systems, it poses a few challenges when measuring power consumption: it can affect CPU utilisation and hence influence the power consumed by the nodes, and, depending on the power management policy in place, unused nodes may be switched off or set to standby mode to save energy. Centralised storage, on the other hand, allows for faster access and processing of data, but can generate more network traffic given that all measurements need to be transferred continuously over the network to be stored. Once stored in a central repository, this data can be easily retrieved by components like OpenStack's Ceilometer.

Wattmeters available in the market vary in terms of physical interconnection, communication protocols, packaging and precision of measurements. They are mostly packaged in multiple-outlet power strips called Power Distribution Units (PDUs) or enclosure PDUs (ePDUs), or more recently in the Intelligent Platform Management Interface (IPMI) cards embedded in computers. IPMI, initially used as a means to shut down or power up a computer remotely, is also used to query a computer chassis remotely.

The interconnection used is often either Ethernet, to transport IPMI or SNMP packets over IP, or USB or RS-232 serial links. Wattmeters relying on Ethernet are generally linked to the administration network (off the data centre customer's network). Moreover, wattmeters may differ in the manner they operate. Some wattmeters send measurements to a management node on a regular basis (push mode), whereas others must be queried (pull mode). Amongst the characteristics that differ across wattmeters we can list:

\begin{itemize}
\item maximum number of measurements per second (\textit{i.e.} refresh rate);
\item measurement precision; and
\item methodology applied to each measurement (\textit{e.g.} mean value over several measurements, instantaneous values, and exponential moving averages).
\end{itemize}
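
To illustrate why the measurement methodology matters, the sketch below applies the three methods named above to the same window of readings. The function names, the sample window and the smoothing factor \texttt{alpha} are hypothetical, and real devices may implement these methods differently.

```python
def instantaneous(samples):
    """Report the most recent raw reading."""
    return samples[-1]


def mean(samples):
    """Report the arithmetic mean over the whole window."""
    return sum(samples) / len(samples)


def exponential_moving_average(samples, alpha=0.5):
    """Weight recent readings more heavily; alpha is an assumed smoothing factor."""
    ema = samples[0]
    for value in samples[1:]:
        ema = alpha * value + (1 - alpha) * ema
    return ema


# Readings in watts, with a spike in the last sample.
window = [100.0, 100.0, 100.0, 200.0]
```

For this window the three methods report 200~W, 125~W and 150~W respectively, so two wattmeters attached to the same load can legitimately return different values.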

As an example, Table~\ref{tab:wattmeters} shows the characteristics of the energy-sensing devices that we deploy and evaluate in our data centres.

\begin{table}
\centering
\caption{Wattmeter infrastructure}
\label{tab:wattmeters}
\begin{footnotesize}
\begin{tabular}{llcc}
\toprule
\multirow{2}{18mm}{\textbf{Device Name}} & \multirow{2}{30mm}{\textbf{Interface}} & \multirow{2}{12mm}{\centering{\textbf{Refresh Time (s)}}} & \multirow{2}{10mm}{\centering{\textbf{Precision (W)}}}  \\
& & & \\
\toprule
Dell iDRAC6    & IPMI / Ethernet           & 5    & 7 \\
\midrule
Eaton          & Serial, SNMP via Ethernet & 5    & 1 \\
\midrule
OmegaWatt      & IrDA Serial               & 1    & 0.125 \\
\midrule
Schleifenbauer & SNMP via Ethernet         & 3    & 0.1 \\
\midrule
Watts Up?      & Proprietary via USB       & 1    & 0.1 \\
\midrule
ZES LMG450     & Serial                    & 0.05 & 0.01 \\
\bottomrule
\end{tabular}
\end{footnotesize}
\end{table}

The granularity at which measurements are taken is another important factor, as the needs often vary depending on what one wishes to evaluate. Taking measurements at time intervals of one second or less is common in several scenarios. This can be a challenge in an infrastructure comprising hundreds or thousands of nodes, demanding efficient and scalable mechanisms for transferring information on power consumption.

Furthermore, leveraging the capabilities offered by existing cloud platforms like OpenStack can help the adoption of a monitoring system, ease deployment, and reduce its learning curve. In addition, users and systems administrators need management reports and visualisation tools to analyse the impact of energy management policies and quickly retrieve relevant data for further analysis.

\begin{figure*}[!htb]
\centering
\includegraphics[width=0.6\linewidth]{figs/architecture.pdf}
\caption{Overview of Kwapi's architecture.}
\label{fig:architecture}
\end{figure*}

Hence, we summarise the main requirements for our energy monitoring platform as follows:

\begin{itemize}
\item \textbf{Reliable data storage}: a centralised storage where energy consumption data can be placed and easily retrieved. Note that centralised storage here does not imply that data is stored on a single node. Systems like Apache Hadoop HDFS\footnote{http://hadoop.apache.org/}, Apache Cassandra\footnote{http://cassandra.apache.org/}, and MongoDB\footnote{http://www.mongodb.org/} could be used.

\item \textbf{Handle heterogeneous wattmeters}: there is a need for handling multiple device types and to design the architecture in a way that support for new wattmeters can be included.

\item \textbf{Efficient communication}: the envisioned system should provide a means for nodes to efficiently communicate their energy consumption to components interested in processing it. A message bus could be used to manage this communication efficiently.

\item \textbf{Integration with open source cloud platform}: the proposed system should interface with existing open source cloud platforms in order to ease deployment and use.

\item \textbf{Visualisation and reports}: the system should offer a set of management reports that provide quick feedback to system administrators and users during execution of tasks or applications. In addition, it should provide means and APIs that allow more advanced queries to be made.
\end{itemize}

The following sections describe the architecture of Kwapi and how it addresses the aforementioned requirements.

\subsection{Kwapi}

Figure~\ref{fig:architecture} depicts the architecture of Kwapi, which is based on a set of layers comprising drivers, responsible for performing the measurements, and plug-ins that subscribe to receive the collected information. The communication between these two layers is handled by a bus, as explained later. Since the architecture follows a publish/subscribe model, plug-ins can subscribe to receive information collected by drivers from multiple sites. Drivers and plug-ins are easily extensible to support other types of wattmeters and to provide other services. The Kwapi API is designed to provide a programming interface for developers and system administrators, and is used to interface with OpenStack by providing the information (\textit{i.e.} by polling monitored devices) required to feed Ceilometer.

To publish energy metrics, we use the Ceilometer central agent and a dedicated pollster that we developed. The pollster queries the Kwapi API plug-in and publishes cumulative (kWh) and gauge (W) counters. These counters are not yet associated with a particular user, since a server can host multiple clients simultaneously. In the following, we provide more details about some of the framework layers.

\subsubsection{Drivers}

The drivers are threads initialised by a manager, which provides them with a set of parameters loaded from a file compliant with the OpenStack configuration format, similar to INI. These parameters are used to query the meters (\textit{e.g.} IP address and port) and to set the sensor IDs in the issued metrics. The measurements that a driver obtains are represented as JSON dictionaries, which have the advantage of being human readable and easy to parse, while keeping a small footprint. The size of the dictionaries may vary depending on the number of fields set by the drivers (\textit{e.g.} whether message signing is enabled). Figure~\ref{fig:json} shows an example of a JSON payload containing one measurement. Optional fields can be added, such as voltage and current. ACK messages have a fixed size of 66 bytes (on a TCP link). When drivers and API are on the same machine, they communicate via IPC sockets.

\begin{figure}
\includeJSON{figs/measurement.json}
\caption{Example of JSON payload.}
\label{fig:json}
\end{figure}
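
The path from configuration file to JSON payload can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard \texttt{configparser} module as a stand-in for the OpenStack configuration library, and the section name, option names and payload fields are hypothetical rather than those actually used by Kwapi.

```python
import configparser
import json
import time

# Hypothetical INI-style configuration describing one wattmeter.
CONFIG_TEXT = """
[wattmeter-1]
probe_id = node-1
host = 192.0.2.10
port = 161
"""


def load_driver_parameters(text):
    """Parse INI-style text into one parameter dictionary per section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


def build_measurement(probe_id, watts, timestamp=None, **optional):
    """Serialise one measurement; optional fields (e.g. voltage) enlarge the payload."""
    measurement = {
        "probe_id": probe_id,
        "w": watts,
        "timestamp": time.time() if timestamp is None else timestamp,
    }
    measurement.update(optional)
    return json.dumps(measurement)
```

Calling \texttt{build\_measurement} with an extra keyword such as \texttt{voltage=230.0} yields a longer string than the minimal payload, which is the size variation mentioned above.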

The manager periodically checks whether all threads are active and restarts them if necessary, as incidents may occur; for example, a meter may be disconnected or become inaccessible. The drivers can manage incidents themselves, but if for any reason they stop their execution, they are automatically restarted by the manager. It is important to avoid losing measurements: because the information reported is instantaneous power in W rather than cumulative energy in kWh, a lost measurement cannot be recovered from later readings.
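
A minimal sketch of such a supervising manager is given below, assuming one thread per driver. The class and attribute names are hypothetical, and the periodic scheduling of \texttt{check} (for example from a timer) is left to the caller.

```python
import threading


class Manager:
    """Keeps one worker thread alive per driver name (illustrative sketch)."""

    def __init__(self, targets):
        self.targets = targets   # driver name -> callable run by the driver thread
        self.threads = {}        # driver name -> most recently started thread
        self.restarts = 0        # number of times a stopped driver was restarted

    def start(self, name):
        thread = threading.Thread(target=self.targets[name], name=name, daemon=True)
        thread.start()
        self.threads[name] = thread

    def check(self):
        """Start any driver whose thread is missing or has stopped."""
        for name in self.targets:
            thread = self.threads.get(name)
            if thread is None or not thread.is_alive():
                if thread is not None:
                    self.restarts += 1
                self.start(name)
```

The first call to \texttt{check} starts every driver; later calls only restart those whose threads have terminated.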

\subsubsection{Plug-ins}

A plug-in retrieves and processes measurements taken by the drivers and provided via the bus. Plug-ins expose this information to other services like Ceilometer and to the user via visualisation tools. They can subscribe to all sensors, a subset of them, or to other plug-ins by using a system of prefixes. After verifying a message signature, they extract the fields and process the received data. Kwapi currently provides two plug-ins, described below: an API to interface with Ceilometer, and a visualisation tool.

\begin{figure*}[!htb]
\centering
\includegraphics[width=0.9\linewidth]{figs/graph_example.jpg}
\caption{Example of graph generated by the visualisation plug-in (4 monitored servers).}
\label{fig:graph_example}
\end{figure*}

\begin{itemize}

\item \textbf{API for Ceilometer}: the API plug-in computes the number of kWh of each probe, adds a timestamp, and stores the last value in watts. This data is not stored in a database as Ceilometer already has its own. If a probe has not provided measurements for a long time, the corresponding data is removed. This plug-in has a REST API that allows a client to retrieve the names of the probes, measurements in W and kWh, and timestamps. The API is secured by using OpenStack Keystone tokens, whereby the client provides a token, and the plug-in contacts the Keystone API to check the token validity before sending its response.

\item \textbf{Visualisation}: the visualisation plug-in builds Round-Robin Database (RRD) files from received measurements, and generates graphs that show the energy consumption over a given period, with additional information (average electricity consumption, minimum and maximum watt values, last value, total energy and cost in Euros). RRD files are of fixed size, and store several collections of metrics with different granularities. Figure~\ref{fig:graph_example} shows an example of a generated graph. In addition, a web interface displays the generated graphs and a cache mechanism triggers the creation of graphs during queries only if they are out of date.
\end{itemize}
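
The conversion from power readings to cumulative energy performed by the API plug-in can be sketched as follows. The integration rule used here, in which each reading is assumed to hold until the next one arrives, is an assumption made for the example and not necessarily the rule implemented in Kwapi.

```python
def accumulate_kwh(samples):
    """Integrate (timestamp_seconds, watts) pairs into kWh.

    Each reading is assumed to hold until the next one arrives
    (a rectangle rule), which is an illustrative choice.
    """
    joules = 0.0
    for (t0, w0), (t1, _) in zip(samples, samples[1:]):
        joules += w0 * (t1 - t0)
    return joules / 3.6e6  # 1 kWh = 3.6e6 J
```

Under this rule, dropping the middle reading of \texttt{[(0, 100.0), (1, 300.0), (2, 100.0)]} halves the computed energy from 400~J to 200~J, which shows how a single lost power measurement distorts the cumulative counter.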

\subsubsection{Internal communication bus}

Kwapi uses ZeroMQ\footnote{http://zeromq.org/}, a fast broker-less messaging framework written in C++, where transmitters play the role of buffers. ZeroMQ supports a wide range of bus modes, including cross-thread communication, IPC, and TCP. Switching from one to another is straightforward. It also provides several design patterns such as publish/subscribe and request/response. In our architecture, we use a publish/subscribe design pattern where drivers are publishers and plug-ins are subscribers. Between them, one or more forwarders simply forward packets, and broadcast a packet to all plug-ins subscribed to receive information from a given probe. Thanks to the forwarders, the network usage is optimised because the packets generated by a driver are sent only once, regardless of the number of plug-ins that listen to a probe. If a probe is not listened to by any plug-in, its measurements are neither sent over the network nor to the first forwarder. The forwarders not only dramatically reduce the network usage, but also make it possible to build flexible architectures, by bypassing network isolation problems or performing load balancing.
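
The prefix-based subscription logic can be illustrated with a small in-memory stand-in for a forwarder. The sketch deliberately does not use ZeroMQ itself; it only mimics the prefix matching that ZeroMQ applies to subscriptions, and the class name, topic names and callback signature are hypothetical.

```python
class Forwarder:
    """In-memory stand-in for a forwarder; it does not use ZeroMQ."""

    def __init__(self):
        self.subscriptions = []  # (prefix, callback) pairs

    def subscribe(self, prefix, callback):
        """Register a plug-in callback; an empty prefix matches every probe."""
        self.subscriptions.append((prefix, callback))

    def has_listener(self, topic):
        """Tell a driver whether publishing this topic is worthwhile."""
        return any(topic.startswith(prefix) for prefix, _ in self.subscriptions)

    def publish(self, topic, payload):
        """Deliver to every matching subscriber; return how many received it."""
        delivered = 0
        for prefix, callback in self.subscriptions:
            if topic.startswith(prefix):
                callback(topic, payload)
                delivered += 1
        return delivered
```

A driver would call \texttt{publish} once per measurement regardless of the number of subscribers, and could skip sending altogether when \texttt{has\_listener} returns \texttt{False}.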

\section{Performance Evaluation}
\label{sec:performance}

In this section we provide results of a simple performance evaluation we carried out in our testbed. Note that our goal is not to compare publish/subscribe systems, as such work has already been performed elsewhere \cite{EugsterSurvey:2003,FabretPS:2001}. The evaluation demonstrates that the framework serves the needs of a wide range of users of the Grid'5000 platform \cite{Grid5000}, the system we use and on which the framework is currently deployed as a means to collect and provide energy consumption information to users.

We wanted to evaluate the CPU and network usage of a typical driver to observe the framework's throughput, since provisioning a large number of resources for monitoring purposes was not desirable. For this experiment we deployed the Kwapi drivers and API on a machine with a Core 2 Duo P8770 2.53\,GHz processor and 4\,GB of RAM. We considered scenarios where we emulated several IPMI cards, each card monitored by a driver thread placing one measurement per second on the communication bus, and scenarios with multiple PDUs with 10 outlets each, each PDU monitored by a driver thread placing ten values per second on the bus. We evaluated these scenarios with message signing both enabled and disabled. Table~\ref{tab:parameters_usage} summarises the considered scenarios.

\begin{table}
\centering
\caption{Scenarios considered in the experiments.}
\label{tab:parameters_usage}
\begin{tabular}{lcc}
\toprule
\textbf{Scenario name} & \textbf{Agent thread scheme} & \textbf{Message signature}  \\
\toprule
IPMI message signed     & 1 thread per card & Enabled\\
\midrule
IPMI message unsigned   & 1 thread per card & Disabled\\
\midrule
PDU message signed     & 1 thread per PDU & Enabled\\
\midrule
PDU message unsigned   & 1 thread per PDU & Disabled\\
\bottomrule
\end{tabular}
\end{table}

Moreover, to observe the scalability of the framework, we vary the number of IPMI cards from 500 to 5000 and the number of PDUs from 50 to 500.
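
Since each emulated IPMI card contributes one value per second and each emulated PDU contributes ten, the two ranges correspond to the same aggregate rate of values placed on the bus. The worked figures below simply restate the parameters given above:
\begin{align*}
500 \times 1 &= 50 \times 10 = 500~\text{values/s},\\
5000 \times 1 &= 500 \times 10 = 5000~\text{values/s}.
\end{align*}
The PDU scenarios therefore produce the same number of values as the IPMI scenarios with one tenth of the driver threads.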
% This is going to change...

Figure~\ref{fig:cpu_usage} shows the CPU usage. Under the evaluated scenarios, neither the socket type nor the number of driver threads has a distinguishable impact on CPU usage. On the test machine, the Kwapi drivers with message signing disabled (\textit{i.e.}, the IPMI and PDU scenarios with unsigned messages) consumed on average 20\% of the total CPU capacity. The Kwapi API consumed around 10\% with message signing disabled, and 16\% when serving one request per second for the last measurements of all probes. Overall, message signing increases CPU usage by 30\% (see the IPMI and PDU scenarios with signed messages).

\begin{figure}[!ht]
\centering
\includegraphics[width=1.0\columnwidth]{figs/cpu_usage.pdf}
\caption{CPU usage under the evaluated scenarios.}
\label{fig:cpu_usage}
\end{figure}
Although CPU usage depends on the drivers and plug-ins, their complexity, and whether message signing is enabled, the experiments show that a single machine can manage a large number of probes. In our environment, one management machine per site is more than enough to accommodate the users' monitoring needs. The drivers and API can also share a machine that already serves other monitoring purposes.

\begin{figure}[!ht]
\centering
\includegraphics[width=1.0\columnwidth]{figs/packet_size.pdf}
\caption{Packet sizes under the evaluated scenarios.}
\label{fig:packet_size}
\end{figure}
While measuring the network usage, our experiments showed a transfer rate of around 230\,KB/s with message signing enabled and around 135\,KB/s otherwise. Overall, message signing introduces a network overhead of about 70\%. Sending larger packets could be explored to reduce the per-packet overhead. If several drivers send measurements simultaneously, ZeroMQ provides an optimisation mechanism that aggregates the data into a single TCP segment. Figure~\ref{fig:packet_size} shows the packet sizes under the evaluated scenarios. We noticed that certain packets contain up to forty measurements.
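The overhead figure follows directly from the two transfer rates reported above. Writing $B_{\mathrm{signed}}$ and $B_{\mathrm{unsigned}}$ (our notation) for the rates measured with and without message signing:
\begin{equation*}
\frac{B_{\mathrm{signed}} - B_{\mathrm{unsigned}}}{B_{\mathrm{unsigned}}} = \frac{230 - 135}{135} = \frac{95}{135} \approx 0.70 .
\end{equation*}
Since both rates are approximate, the resulting 70\% should be read as an estimate rather than an exact value.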

As mentioned earlier, plug-ins can subscribe to the bus and select the probes from which they want to receive information. If multiple plug-ins select the same node, the information from that node is sent only once over the network. The architecture also allows a hierarchy of plug-ins to be established: a plug-in deployed at a site can summarise measurements or compute average values and place them on the bus, where they are consumed by higher-level plug-ins.
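To illustrate one consequence of such a hierarchy, consider a hypothetical deployment with $S$ sites, where the plug-in at site $s$ averages the readings of its $n_s$ probes into $\bar{P}_s$. All symbols and numbers here are ours, chosen for illustration, and are not part of Kwapi. A higher-level plug-in that needs the mean power per probe across all sites would have to weight each site average by its probe count:
\begin{equation*}
\bar{P} = \frac{\sum_{s=1}^{S} n_s \, \bar{P}_s}{\sum_{s=1}^{S} n_s} .
\end{equation*}
For instance, site averages of 100\,W over 10 probes and 200\,W over 30 probes give $\bar{P} = (1000 + 6000)/40 = 175$\,W, whereas the unweighted mean of the two site averages would be 150\,W. In such a design, a site-level plug-in would therefore publish $n_s$ alongside $\bar{P}_s$.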
% ----------------------------------------------------------------------------------------

\section{Conclusion}
\label{sec:conclusion}
In this paper, we described a framework for monitoring the power consumed by the resources of a data centre. Based on lessons learned from monitoring the power consumption of a large distributed infrastructure, we described the main user requirements and how the proposed architecture meets them. The framework works in tandem with OpenStack's Ceilometer. Experimental results demonstrate that the overhead imposed by the monitoring framework is small, allowing us to serve the users' monitoring needs in our infrastructure.

As future work, we intend to explore means to increase the monitoring granularity and the number of measured devices by applying a hierarchy of plug-ins and a stream processing system\footnote{https://storm.incubator.apache.org}$^,$\footnote{http://incubator.apache.org/s4/} for processing streams of measurement tuples.
% ----------------------------------------------------------------------------------------

\section*{Acknowledgment}

This research is supported by the French FSN (Fonds national pour la Soci\'et\'e Num\'erique) XLcloud project. Some experiments presented in this paper were carried out using the Grid'5000 experimental testbed, being developed under the Inria ALADDIN development action with support from CNRS, RENATER and several Universities as well as other funding bodies (see https://www.grid5000.fr). The authors wish to thank Julien Danjou for his help during the integration of Kwapi with OpenStack and Ceilometer.

\bibliographystyle{IEEEtran}
%\balance
\bibliography{references}
\end{document}