\documentclass[conference]{IEEEtran}
% Add the compsoc option for Computer Society conferences.

\usepackage{ctable}
\usepackage{cite}
\usepackage[cmex10]{amsmath}
% \usepackage{acronym}
\usepackage{graphicx}
\usepackage{multirow}
\usepackage{listings}
\usepackage{color}
\usepackage{xcolor}
\usepackage{balance}

\colorlet{@punct}{red!60!black}
\definecolor{@delim}{RGB}{20,105,176}

\lstdefinelanguage{json}{
    basicstyle=\footnotesize\ttfamily,
    literate=
     *{\ }{{{\ }}}{1}
      {:}{{{\color{@punct}{:}}}}{1}
      {,}{{{\color{@punct}{,}}}}{1}
      {\{}{{{\color{@delim}{\{}}}}{1}
      {\}}{{{\color{@delim}{\}}}}}{1}
      {[}{{{\color{@delim}{[}}}}{1}
      {]}{{{\color{@delim}{]}}}}{1},
}

\newcommand{\includeJSON}[1]{\lstinputlisting[language=json,firstnumber=1]{#1}}

% correct bad hyphenation here
\hyphenation{op-tical net-works semi-conduc-tor}

\begin{document}

\title{A Generic and Extensible Framework for Monitoring Energy Consumption in OpenStack Clouds}


\author{\IEEEauthorblockN{Francois Rossigneux, Jean-Patrick Gelas, Laurent Lef\`{e}vre, Marcos D. Assun\c{c}\~ao}
\IEEEauthorblockA{Inria Avalon team, LIP Laboratory\\
Ecole Normale Superieure of Lyon\\
University of Lyon, France}
}


\maketitle


\begin{abstract}
Although cloud computing has been transformational in the IT industry, it often relies on large data centres that consume massive amounts of electrical power. Efforts have been made to reduce the power consumed by Clouds, with certain data centres now approaching a PUE factor of 1.08. This means that the IT infrastructure is now responsible for a large share of the power a data centre consumes, and hence means to monitor and analyse how energy is spent have never been so crucial. Such monitoring is required for a better understanding of how power is consumed by the IT infrastructure and for assessing the impact of energy management policies. In this article, we draw some lessons from previous experience in monitoring large-scale systems and introduce an energy monitoring software framework called Kwapi. The framework supports several wattmeter devices and multiple measurement formats, and reduces communication overhead. The Kwapi architecture is scalable, extensible, and completely integrated within OpenStack.

\end{abstract}
\IEEEpeerreviewmaketitle
\section{Introduction}
% no \IEEEPARstart

Cloud computing \cite{ArmbrustCloud:2009} has become a key building block in providing IT resources and services to organisations of all sizes. Amongst its claimed benefits, the most appealing derive from economies of scale and often include a pay-as-you-go business model, resource consolidation, elasticity, good availability, and wide geographical coverage. Despite these advantages over other provisioning models, Clouds often rely on large data centres that consume massive amounts of electrical power to serve customers with the resources they need \cite{BaligaInternet:2011}.

Numerous efforts have been made to curb the energy consumed by Clouds, with some data centres now approaching a Power Usage Effectiveness (PUE) factor of 1.08\footnote{http://gigaom.com/2012/03/26/whose-data-centers-are-more-efficient-facebooks-or-googles/}. As a result, the IT infrastructure is now responsible for a large share of the power consumed by current data centres, and hence means to monitor and analyse how energy is spent have never been so crucial. Our experience in this area, however, has demonstrated that monitoring the power consumed by large systems is not always an easy task \cite{OrgerieSaveWatts:2008,AssuncaoIngrid:2010,DaCostaGreenNet:2010}. There are multiple power probes available in the market, generally with their own APIs, physical connections, precision, and communication protocols~\cite{eelsd2013}. Moreover, cost-related constraints can lead to decisions to acquire and deploy equipment in multiple stages or to monitor the power consumption of only part of the infrastructure.
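
To make the implication of such a PUE value explicit, recall that PUE is defined as the ratio between the total power drawn by a facility and the power drawn by its IT equipment. The symbols $P_{\mathrm{total}}$ and $P_{\mathrm{IT}}$ are introduced here only for illustration, and the calculation assumes that the quoted value of 1.08 holds on average:
\begin{equation*}
\mathrm{PUE} = \frac{P_{\mathrm{total}}}{P_{\mathrm{IT}}}
\quad\Rightarrow\quad
\frac{P_{\mathrm{IT}}}{P_{\mathrm{total}}} = \frac{1}{1.08} \approx 0.926
\end{equation*}
Under this assumption, roughly 93\% of the power entering the facility is consumed by the IT equipment itself.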

Although monitoring the power consumption of only part of the deployed equipment is sound from a cost perspective, it prevents one from capturing certain nuances of the infrastructure. Previous work has shown that as a computer cluster ages, certain components wear out, while others are replaced, leading to heterogeneous power consumption among nodes that were seemingly homogeneous. The difference between nodes that consume the least power and nodes that consume the most can reach 20\% \cite{MehdiHeterogeneous:2013}, which reinforces the idea that monitoring the consumption of the whole set of IT equipment can allow for further improvements in energy efficiency. Monitoring a great number of nodes, however, requires the design of an efficient infrastructure for collecting and processing the power consumption data.

This paper describes the design and architecture of a generic and flexible framework, termed Kwapi (``Kilo-watt API''), that interfaces with OpenStack to provide it with power consumption information collected from multiple probes. OpenStack is a project that aims to provide a ubiquitous open source cloud computing platform and is currently used by many corporations, researchers and global data centres\footnote{http://www.openstack.org/user-stories/}. We believe that integrating a power monitoring framework with Ceilometer\footnote{https://wiki.openstack.org/wiki/Ceilometer}, OpenStack's main infrastructure for monitoring and metering, can be of great value to the research community and practitioners.

The remaining part of this paper is organised as follows. Section~\ref{sec:related_work} describes related work. Section~\ref{sec:architecture} presents the requirements and introduces the Kwapi architecture. Section~\ref{sec:performance} discusses experimental results measuring the throughput of drivers and plug-ins, and Section~\ref{sec:conclusion} concludes the paper.

% ----------------------------------------------------------------------------------------

\section{Related Work}
\label{sec:related_work}

Over the past years, several techniques have been proposed to minimise the energy consumed by computing infrastructure. At the hardware level, for instance, processors are able to operate at multiple frequency and voltage levels, and the operating systems or resource managers can choose the level that matches the current workload \cite{LaszewskiDVFS:2009}. At the resource management level, several approaches have been proposed, including resource consolidation \cite{BeloglazovOpenStack:2014} and rescheduling requests \cite{OrgerieSaveWatts:2008}, generally with the goal of switching off unused resources or setting them to low power consumption modes. Attempts have also been made to assess the power consumed by individual applications \cite{NoureddineThesis:2014}.

A means of monitoring energy consumption is often a key component for assessing the potential gains of techniques that aim to improve software and cloud resource management systems. Monitoring of Clouds is not a new topic \cite{AcetoMonitoring:2013}, as tools to monitor computing infrastructure \cite{BrinkmannMonitoring:2013,VarretteICPP:2014} as well as ways to address some of the usual issues in management systems have been introduced \cite{WardMonitoring:2013,TanMonitoring:2013}. Moreover, several systems for measuring the power consumed by compute clusters have been described in the literature \cite{AssuncaoIngrid:2010}. As traditional system and network monitoring techniques lack the capability to interface with wattmeters, most approaches for measuring energy consumption have been tailored to the specific needs of the projects in which they were conceived.

In our work we aim to draw some lessons from previous approaches to monitoring and analysing the energy consumption of large-scale distributed systems \cite{OrgerieSaveWatts:2008,DaCostaGreenNet:2010,AssuncaoIngrid:2010,MehdiHeterogeneous:2013}. We opted to create a framework and integrate it with a successful cloud platform, OpenStack. Such a framework can be of value to the research community and practitioners working on the topic.

% ----------------------------------------------------------------------------------------

\section{The Kwapi Architecture}
\label{sec:architecture}

Depending on the number of monitored devices and the frequency at which measurements are taken, wattmeters can generate a large amount of data, which requires storage capacity for further processing and analysis. Although storing and pre-processing data locally on the monitored nodes is an approach often followed by management systems, it poses a few challenges when measuring power consumption: it can affect CPU utilisation and hence influence the power consumed by the nodes, and, depending on the power management policy in place, unused nodes may be switched off or set to standby mode to save energy. Centralised storage, on the other hand, allows for faster access and processing of data, but can generate more network traffic given that all measurements need to be transferred continuously over the network to be stored. Once stored in a central repository, this data can be easily retrieved by components like OpenStack's Ceilometer.

Wattmeters available in the market vary in terms of physical interconnection, communication protocols, packaging, and precision of measurements. They are mostly packaged in multiple-outlet power strips called Power Distribution Units (PDUs) or enclosure PDUs (ePDUs), or more recently in the Intelligent Platform Management Interface (IPMI) cards embedded in computers. IPMI was initially used as an alternative means to shut down or power up a machine, and it is used to query a computer chassis remotely.

The interconnection is typically either Ethernet, to transport IPMI or SNMP packets over IP, or a USB or RS-232 serial link. Wattmeters relying on Ethernet are generally linked to the administration network (outside the data centre customers' network). Moreover, wattmeters may differ in the manner they operate. Some wattmeters send measurements to a management node on a regular basis (push mode), whereas others must be queried (pull mode). Amongst the characteristics that differ across wattmeters we can list:

\begin{itemize}
\item maximum number of measurements per second (\textit{i.e.} refresh rate);
\item measurement precision; and
\item methodology applied to each measurement (\textit{e.g.} mean value over several measurements, instantaneous values, and exponential moving averages).
\end{itemize}

As an example, Table~\ref{tab:wattmeters} shows the characteristics of the energy sensors that we have deployed and evaluated in our data centres.

\begin{table}
\centering
\caption{Wattmeter infrastructure}
\label{tab:wattmeters}
\begin{footnotesize}
\begin{tabular}{llcc}
\toprule
\multirow{2}{18mm}{\textbf{Device Name}} & \multirow{2}{30mm}{\textbf{Interface}} & \multirow{2}{12mm}{\centering{\textbf{Refresh Time (s)}}} & \multirow{2}{10mm}{\centering{\textbf{Precision (W)}}}  \\
& & & \\
\toprule
Dell iDrac6    & IPMI / Ethernet           & 5    & 7 \\
\midrule
Eaton          & Serial, SNMP via Ethernet & 5    & 1 \\
\midrule
OmegaWatt      & IrDA Serial               & 1    & 0.125 \\
\midrule
Schleifenbauer & SNMP via Ethernet         & 3    & 0.1 \\
\midrule
Watts Up?      & Proprietary via USB       & 1    & 0.1 \\
\midrule
ZES LMG450     & Serial                    & 0.05 & 0.01 \\
\bottomrule
\end{tabular}
\end{footnotesize}
\end{table}

The granularity at which measurements are taken is another important factor as the needs often vary depending on what one wishes to evaluate. Taking measurements at time intervals of one second or smaller is common in several scenarios. This can be a challenge in an infrastructure comprising hundreds or thousands of nodes, demanding efficient and scalable mechanisms for transferring information on power consumption.
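
To give a sense of scale, the number of measurements $M$ produced over a period $T$ by $N$ probes, each read $f$ times per second, can be estimated as follows. The symbols and the numerical values are illustrative assumptions introduced here, not figures from a particular deployment.
\begin{equation*}
M = N \times f \times T = 1000 \times 1\,\mathrm{s^{-1}} \times 86\,400\,\mathrm{s} = 8.64 \times 10^{7}
\end{equation*}
That is, a thousand probes read once per second would yield more than 86 million measurements per day.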

Furthermore, leveraging the capabilities offered by existing cloud platforms like OpenStack can foster the adoption of a monitoring system, ease deployment, and reduce its learning curve. In addition, users and systems administrators need management reports and visualisation tools to analyse the impact of energy management policies and quickly retrieve relevant data for further analysis.

Hence, we summarise the main requirements for our energy monitoring platform as follows:

\begin{itemize}
\item \textbf{Reliable data storage}: a centralised storage where energy consumption data can be placed and easily retrieved. Note that centralised storage here does not imply that data is stored on a single node. Systems like Apache Hadoop HDFS\footnote{http://hadoop.apache.org/}, Apache Cassandra\footnote{http://cassandra.apache.org/}, and MongoDB\footnote{http://www.mongodb.org/} could be used.

\item \textbf{Handle heterogeneous wattmeters}: the system must handle multiple device types, and the architecture must be designed so that support for new wattmeters can be added.

\item \textbf{Efficient communication}: the envisioned system should provide a means for nodes to efficiently communicate their energy consumption to components interested in processing it. A message bus could be used to manage this communication efficiently.

\item \textbf{Integration with an open source cloud platform}: the proposed system should interface with existing open source cloud platforms in order to ease deployment and use.

\item \textbf{Visualisation and reports}: the system should offer a set of management reports that provide quick feedback to system administrators and users during execution of tasks or applications. In addition, it should provide means and APIs that allow more advanced queries to be made.
\end{itemize}

\begin{figure*}[!htb]
\centering
\includegraphics[width=0.6\linewidth]{figs/architecture.pdf}
\caption{Overview of Kwapi's architecture.}
\label{fig:architecture}
\end{figure*}

The following sections describe the architecture of Kwapi and how it addresses the aforementioned requirements.

\subsection{Kwapi}

Figure~\ref{fig:architecture} depicts the architecture of Kwapi, which is based on a set of layers comprising drivers, responsible for performing the measurements, and plug-ins that subscribe to receive the collected information. The communication between these two layers is handled by a bus, as explained later. Since Kwapi follows a publish/subscribe architecture, plug-ins can subscribe to receive information collected by drivers from multiple sites. Drivers and plug-ins are easily extensible to support other types of wattmeters and to provide other services. The Kwapi API is designed to provide a programming interface for developers and system administrators, and is used to interface with OpenStack by providing the information (\textit{i.e.} by polling monitored devices) required to feed Ceilometer.

Ceilometer, OpenStack's framework for collecting values of performance metrics and resource consumption, and also used for billing, has two types of agents, namely compute agents and a central agent. The compute agents run on compute nodes and retrieve information about resource usage related to a given virtual machine instance and a given resource owner. The central agent, on the other hand, executes pollsters on the management server to retrieve the data that is not related to a particular instance. Measurements of metrics are published to the internal Ceilometer bus as counters (cumulative, gauge or delta). Several modules listen to this bus, including the Ceilometer Collector, which is responsible for storing these counter values in a database. The database can be queried via the Ceilometer API, and allows one to view the history of a resource's metrics. To publish energy metrics, we use the central agent and a dedicated pollster we developed. The pollster queries the Kwapi API plug-in and publishes cumulative (kWh) and gauge (W) counters. These counters are not yet associated with a particular user, since a server can host multiple clients simultaneously.

In the following, we provide more details about some of the framework layers.

\subsubsection{Drivers}

The drivers are threads initialised by a manager, which provides them with a set of parameters loaded from a file compliant with the OpenStack configuration format, similar to INI. These parameters are used to query the meters (\textit{e.g.} IP address and port) and indicate the sensor IDs in the issued metrics. The measurements that a driver obtains are represented as JSON dictionaries, which have the advantage of being human readable and easy to parse, while keeping a small footprint. The size of the dictionaries may vary depending on the number of fields set by the drivers (\textit{e.g.} whether message signing is enabled). Figure~\ref{fig:json} shows an example of a JSON payload containing one measurement. Optional fields can be added, such as voltage and current. ACK messages have a fixed size of 66 bytes (on a TCP link). When the drivers and the API run on the same machine, they communicate via IPC sockets.

\begin{figure}
\includeJSON{figs/measurement.json}
\caption{Example of JSON payload.}
\label{fig:json}
\end{figure}

The manager periodically checks whether all threads are active, restarting them if necessary, as incidents may occur; for example, a meter may be disconnected or become inaccessible. The drivers can manage incidents themselves, but if for any reason they stop their execution, they are automatically restarted by the manager. Avoiding the loss of measurements is important because the reported values are in W rather than kWh: since energy is derived by accumulating power readings over time, every missing reading degrades the result.

\subsubsection{Plug-ins}

A plug-in retrieves and processes measurements taken by the drivers and provided via the bus. Plug-ins expose this information to other services like Ceilometer and to the user via visualisation tools. They can subscribe to all sensors, a subset of them, or to other plug-ins by using a system of prefixes. After verifying a message signature, they extract the fields and process the received data. As described in the following, Kwapi currently provides two plug-ins, namely an API to interface with Ceilometer, and a visualisation tool.

\begin{itemize}

\item \textbf{API for Ceilometer}: the API plug-in computes the number of kWh of each probe, adds a timestamp, and stores the last value in watts. This data is not stored in a database as Ceilometer already has its own. If a probe has not provided measurements for a long time, the corresponding data is removed. This plug-in has a REST API that allows a client to retrieve the names of the probes, measurements in W and kWh, and timestamps. The API is secured by using OpenStack Keystone tokens, whereby the client provides a token, and the plug-in contacts the Keystone API to check the token validity before sending its response.

\item \textbf{Visualisation}: the visualisation plug-in builds Round-Robin Database (RRD) files from received measurements, and generates graphs that show the energy consumption over a given period, with additional information (average electricity consumption, minimum and maximum watt values, last value, total energy and cost in Euros). RRD files are of fixed size, and store several collections of metrics with different granularities. Figure~\ref{fig:graph_example} shows an example of a generated graph. In addition, a web interface displays the generated graphs, and a cache mechanism triggers the creation of graphs during queries only if they are out of date.
\end{itemize}
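
As an illustration of how per-probe energy can be derived from power readings, the following update rule is a sketch under assumptions made here rather than a description of the actual implementation: each reading $P_k$ (in W), received at time $t_k$ (in s), is assumed to have held since the previous reading at $t_{k-1}$, and $E_k$ denotes the accumulated energy in kWh. These symbols are introduced only for this example.
\begin{equation*}
E_k = E_{k-1} + \frac{P_k \, (t_k - t_{k-1})}{3.6 \times 10^{6}}
\end{equation*}
For instance, a reading of 200\,W held for one second adds $200 / (3.6 \times 10^{6}) \approx 5.6 \times 10^{-5}$\,kWh. Under this rule a lost reading stretches the interval covered by the next one, which is why gaps in the stream of measurements affect the computed energy.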

\begin{figure}[!ht]
\centering
\includegraphics[width=.9\columnwidth]{figs/graph_example.jpg}
\caption{Example of graph generated by a visualisation plug-in.}
\label{fig:graph_example}
\end{figure}

\begin{figure*}[!ht]
\centering
\includegraphics[width=2\columnwidth]{figs/kwapi_interface.png}
\caption{Kwapi interface with four monitored servers.}
\label{kwapi_interface}
\end{figure*}

\subsubsection{Internal communication bus}

Kwapi uses ZeroMQ\footnote{http://zeromq.org/}, a fast broker-less messaging framework written in C++, where transmitters play the role of buffers. ZeroMQ supports a wide range of bus modes, including cross-thread communication, IPC, and TCP, and switching from one to another is straightforward. It also provides several design patterns such as publish/subscribe and request/response. In our architecture, we use a publish/subscribe design pattern where drivers are publishers and plug-ins are subscribers. Between them, one or more forwarders simply forward packets, and broadcast a packet to all plug-ins subscribed to receive information from a given probe. Thanks to the forwarders, the network usage is optimised because the packets generated by a driver are sent only once, regardless of the number of plug-ins that listen to a probe. If no plug-in listens to a probe, its measurements are neither sent over the network nor to the first forwarder. The forwarders not only reduce the network usage dramatically, but also make it possible to build flexible architectures by bypassing network isolation problems or balancing load.
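
The benefit of the forwarders can be made concrete with a simplified traffic model. The symbols are introduced here for illustration and are not part of Kwapi: a probe emits $r$ messages per second of $s$ bytes each, and $k$ plug-ins located behind the same forwarder subscribe to it. Protocol headers and acknowledgements are ignored.
\begin{equation*}
B_{\mathrm{direct}} = k \, r \, s, \qquad
B_{\mathrm{forwarded}} =
\begin{cases}
r \, s & \text{if } k \geq 1,\\
0 & \text{if } k = 0.
\end{cases}
\end{equation*}
Under these assumptions, the traffic between a driver and a forwarder is independent of the number of subscribers, whereas delivering a separate copy to each plug-in would grow linearly with $k$.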

\section{Performance Evaluation}
\label{sec:performance}

In this section we provide results of a simple performance evaluation we carried out in our testbed. Note that our goal is not to compare publish/subscribe systems, as such work has already been performed elsewhere \cite{EugsterSurvey:2003,FabretPS:2001}. The evaluation demonstrates that the framework serves the needs of a large range of users of the Grid'5000 platform \cite{Grid5000} well; Grid'5000 is the system we use and on which the framework is currently deployed as a means to collect and provide energy consumption information to users.

We wanted to evaluate the CPU and network usage of a typical driver to observe its throughput, since provisioning a large number of resources for monitoring purposes was not desirable. For this experiment we deployed the Kwapi drivers and API on a machine with a Core 2 Duo P8770 2.53\,GHz processor and 4\,GB of RAM. First, we emulated 1000 IPMI cards, each card monitored by a driver thread placing one measurement per second on the communication bus. Second, 100 PDUs with 10 outlets each were emulated, each PDU monitored by a driver thread placing ten values at a time on the bus every second. Hence, in both scenarios 1000 values per second are placed on the communication bus. We ran multiple experiments considering these two scenarios, taking into account message signatures and different communication sockets as summarised in Table~\ref{tab:parameters_usage}.

\begin{table}
\centering
\caption{Parameters of the resource usage experiment.}
\label{tab:parameters_usage}
\begin{tabular}{lc}
\toprule
\textbf{Parameter description} & \textbf{Possible values}  \\
\toprule
Number of driver threads    & 100 or 1000 \\
\midrule
Message signature           & enabled or disabled \\
\midrule
Socket type                 & IPC or TCP \\
\bottomrule
\end{tabular}
\end{table}

Figure~\ref{fig:cpu_usage} shows the results of CPU usage. Under the evaluated scenarios, the socket type and number of driver threads do not seem to have a distinguishable impact on the CPU usage. On the test machine, the Kwapi drivers with message signing disabled (\textit{i.e.} IPMI cards unsigned and PDUs unsigned) consumed on average 20\% of the total CPU capacity. The Kwapi API consumed around 10\% with message signing disabled and 16\% when making one request per second querying the last measurements of all probes. Message signing overall increases the CPU usage by 30\% (see IPMI cards signed and PDUs signed).

\begin{figure}[!ht]
\centering
\includegraphics[width=1.0\columnwidth]{figs/cpu_usage.pdf}
\caption{CPU usage under the evaluated scenarios.}
\label{fig:cpu_usage}
\end{figure}

Although the CPU usage often depends on the drivers, plug-ins, and their complexity, and on whether message signing is enabled, the experiments show that a large number of probes can be managed by a single machine. In our environment, a management machine per site is more than enough to accommodate the monitoring needs of users. The drivers and API can reuse a machine that already serves other monitoring purposes.

\begin{figure}[!ht]
\centering
\includegraphics[width=1.0\columnwidth]{figs/packet_size.pdf}
\caption{Packet sizes under the evaluated scenarios.}
\label{fig:packet_size}
\end{figure}

While measuring the network usage, our experiments showed a transfer rate of around 230\,KB/s with message signing enabled and around 135\,KB/s otherwise. Message signing overall introduces an overhead of 70\%. Sending large packets can be explored to decrease the packet overhead. If several drivers send measurements simultaneously, ZeroMQ provides an optimisation mechanism that aggregates the data into a single TCP segment. Figure~\ref{fig:packet_size} shows the number of packets under the evaluated scenarios. We noticed that certain packets contain up to forty measurements.
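
The reported rates can be cross-checked with a short calculation. The per-value figures below are estimates derived here, assuming that 1\,KB equals 1000\,bytes and that the 1000 values per second of the experiment account for all the traffic, including protocol headers and acknowledgements.
\begin{align*}
\frac{230\ \text{KB/s}}{135\ \text{KB/s}} &\approx 1.70, \\
\frac{135\,000\ \text{B/s}}{1000\ \text{values/s}} &= 135\ \text{B per value (unsigned)}, \\
\frac{230\,000\ \text{B/s}}{1000\ \text{values/s}} &= 230\ \text{B per value (signed)}.
\end{align*}
The first ratio agrees with the reported overhead of about 70\%, and the other two suggest that, under these assumptions, each value costs on the order of a few hundred bytes on the wire.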

As mentioned earlier, plug-ins can subscribe and select probes from which they want to receive information. If multiple plug-ins select a node, information from the node is sent only once through the network. The architecture also allows for a hierarchy of plug-ins to be established, where a plug-in can be deployed on a site to summarise or compute average values that are placed on the bus to be consumed by higher level plug-ins.

% ----------------------------------------------------------------------------------------

\section{Conclusion}
\label{sec:conclusion}

In this paper, we described a framework for monitoring the power consumed by resources of a data centre. Based on lessons learned by monitoring the power consumption of a large distributed infrastructure, we described the main user requirements and how they are met by the proposed architecture. The framework works in tandem with OpenStack's Ceilometer. Experimental results demonstrate that the overhead posed by the monitoring framework is small, allowing us to serve the users' monitoring needs in our infrastructure.

As future work, we intend to explore means to increase the monitoring granularity and the number of measured devices by applying a hierarchy of plug-ins, and a stream processing system\footnote{https://storm.incubator.apache.org}$^,$\footnote{http://incubator.apache.org/s4/} for processing streams of measurement tuples.
% ----------------------------------------------------------------------------------------

\section*{Acknowledgment}

This research is supported by the French FSN (Fonds national pour la Societe Numerique) XLcloud project. Some experiments presented in this paper were carried out using the Grid'5000 experimental testbed, being developed under the Inria ALADDIN development action with support from CNRS, RENATER and several Universities as well as other funding bodies (see https://www.grid5000.fr). The authors wish to thank Julien Danjou for his help during the integration of Kwapi with OpenStack and Ceilometer.

\bibliographystyle{IEEEtran}
\balance
\bibliography{references}

\end{document}