% xlcloud/papers/2014/kwapi/cloudam2014.tex @ 9d39d328
\documentclass[conference]{IEEEtran}
% Add the compsoc option for Computer Society conferences.

\usepackage{ctable}
\usepackage{cite}
\usepackage[cmex10]{amsmath}
% \usepackage{acronym}
\usepackage{graphicx}
\usepackage{multirow}

% correct bad hyphenation here
\hyphenation{op-tical net-works semi-conduc-tor}

\begin{document}
\title{A Generic and Extensible Framework for Monitoring Energy Consumption in OpenStack Based Clouds}

\author{\IEEEauthorblockN{Francois Rossigneux}
\IEEEauthorblockA{Inria, Lyon, France}
\and
\IEEEauthorblockN{Jean-Patrick Gelas}
\IEEEauthorblockA{University of Lyon, France}
\and
\IEEEauthorblockN{Laurent Lef\`evre}
\IEEEauthorblockA{LIP, Inria, ENS de Lyon, France}}

\maketitle
\begin{abstract}
%\boldmath
Although cloud computing has been transformational in the IT industry, cloud infrastructures often rely on large data centres that consume massive amounts of electrical power. Efforts have been made to reduce the power consumed by clouds, with certain data centres now approaching a PUE factor of 1.08. As a result, the IT infrastructure is now responsible for a large share of the power a data centre consumes, and hence means to monitor and analyse how energy is spent have never been so crucial. Such monitoring is required for a better understanding of how power is consumed by the IT infrastructure and for assessing the impact of energy management policies. In this article, we draw lessons from previous experience in monitoring large-scale systems and introduce an energy monitoring software framework called Kwapi. The framework supports several wattmeter devices and multiple measurement formats, and reduces communication overhead. The Kwapi architecture is scalable and extensible, and interfaces with OpenStack Ceilometer.
\end{abstract}
\IEEEpeerreviewmaketitle

\section{Introduction}
% no \IEEEPARstart

Cloud computing has become a key building block in providing IT resources and services to organisations of all sizes. Amongst its claimed benefits, the most appealing derive from economies of scale and often include a pay-as-you-go business model, resource consolidation, elasticity, high availability, and wide geographical coverage. Despite these advantages over other provisioning models, clouds often rely on large data centres that consume massive amounts of electrical power in order to serve customers with the resources they need.

Numerous efforts have been made to curb the energy consumed by clouds, with some data centres now approaching a Power Usage Effectiveness (PUE) factor of 1.08\footnote{http://gigaom.com/2012/03/26/whose-data-centers-are-more-efficient-facebooks-or-googles/}. As a result, the IT infrastructure is now responsible for a large share of the power consumed by current data centres, and hence means to monitor and analyse how energy is spent have never been so crucial. Our experience in this area, however, has demonstrated that monitoring the power consumed by large systems is not always an easy task \cite{OrgerieSaveWatts:2008,AssuncaoIngrid:2010,DaCostaGreenNet:2010}. There are multiple power probes available on the market, generally with their own APIs, physical connections, precision, and communication protocols. Moreover, cost-related constraints can lead to decisions to acquire and deploy equipment in multiple stages, and to monitor the power consumption of only part of the infrastructure.

Although from a cost perspective monitoring the power consumption of only part of the deployed equipment is sound, it prevents one from capturing certain nuances of the infrastructure. Previous work has shown that as a computer cluster ages, certain components wear out while others are replaced, leading to heterogeneous power consumption among nodes that were seemingly homogeneous. The difference between the nodes that consume the least power and those that consume the most can reach 20\% \cite{MehdiHeterogeneous:2013}, which reinforces the idea that monitoring the consumption of the whole set of IT equipment can allow for further improvements in energy efficiency. Monitoring a great number of nodes, however, requires the design of an efficient infrastructure for collecting and processing the power consumption data.

This paper describes the design and architecture of a generic and flexible framework, termed Kwapi, that interfaces with OpenStack to provide it with power consumption information collected from multiple probes. OpenStack is a project that aims to provide a ubiquitous open source cloud computing platform and is currently used by many corporations, researchers and global data centres. We believe that the integration of a power monitoring framework with Ceilometer\footnote{https://wiki.openstack.org/wiki/Ceilometer}, OpenStack's main infrastructure for monitoring and metering, can be of great value to the research community and practitioners.

% ----------------------------------------------------------------------------------------

\section{Related Work}
\label{sec:related_work}

Over the past years, several techniques have been proposed to minimise the energy consumed by computing infrastructure. At the hardware level, for instance, processors are able to operate at multiple frequency and voltage levels, and the operating system or resource manager can choose the level that matches the current workload \cite{LaszewskiDVFS:2009}. At the resource management level, several approaches have been proposed, including resource consolidation \cite{BeloglazovOpenStack:2014} and rescheduling of requests \cite{OrgerieSaveWatts:2008}, generally with the goal of switching off unused resources or setting them to low power consumption modes. Attempts have also been made to assess the power consumed by individual applications \cite{NoureddineThesis:2014}.

A means to monitor energy consumption is often a key component in assessing the potential gains of techniques that aim to improve software and cloud resource management systems. Monitoring of clouds is not a new topic \cite{AcetoMonitoring:2013}: tools to monitor computing infrastructure \cite{BrinkmannMonitoring:2013} as well as ways to address some of the usual issues in management systems have been introduced \cite{WardMonitoring:2013,TanMonitoring:2013}. Moreover, several systems for measuring the power consumed by compute clusters have been described in the literature \cite{AssuncaoIngrid:2010}. As traditional system and network monitoring systems lack the capability to interface with wattmeters, most systems for measuring energy consumption have been tailored to the specific needs of the projects in which they were conceived.

In our work, we aimed to draw lessons from previous approaches to monitoring and analysing the energy consumption of large-scale distributed systems. We opted for creating a framework and integrating it with a successful cloud platform, OpenStack. Such a framework can be of value to the research community working on the topic.

% ----------------------------------------------------------------------------------------

\section{Requirements and Proposed Architecture}
\label{sec:architecture}

Depending on the number of monitored devices and the frequency at which measurements are taken, wattmeters can generate a large amount of data, which requires storage capacity for further processing and analysis. Although storing and pre-processing measurements locally on the monitored nodes is an approach often followed by certain management systems, it poses a few challenges when measuring power consumption: it can impact CPU utilisation and hence influence the power consumed by the nodes, and, depending on the power management policy in place, unused nodes may be switched off or set to stand-by mode to save energy. Centralised storage, on the other hand, allows for faster access and processing of data, but can generate more network traffic given that all measurements need to be continuously transferred over the network to be stored. Once stored in a central repository, this data can be easily retrieved by components like OpenStack's Ceilometer.

Available wattmeters on the market vary in terms of physical interconnection, communication protocols, packaging, and precision of measurements. They are mostly packaged in multiple-outlet power strips, called PDUs or ePDUs, or more recently in the IPMI cards embedded in computers, which were initially used as a means to shut down or power up computer chassis remotely.

The interface used is either Ethernet, transporting IPMI or SNMP packets over IP, or a USB or RS-232 serial link. Wattmeters relying on Ethernet are generally attached to the administration network (separate from the data centre customers' network). Moreover, wattmeters may differ in the manner in which they operate. Some wattmeters send measurements to a management node on a regular basis (push mode), whereas others must be queried (pull mode). Amongst the characteristics that differ across wattmeters, we can list: the maximum number of measurements per second (\textit{i.e.} refresh rate), measurement precision, and the methodology applied to each measurement (\textit{e.g.} mean value over several measurements, instantaneous values, and exponential moving averages). As an example, Table~\ref{tab:wattmeters} shows the characteristics of some of the devices we had the chance to evaluate in our data centres.

\begin{table}
\centering
\caption{Wattmeters available in our environment.}
\label{tab:wattmeters}
\begin{footnotesize}
\begin{tabular}{llcc}
\toprule
\multirow{2}{18mm}{\textbf{Device Name}} & \multirow{2}{30mm}{\textbf{Interface}} & \multirow{2}{12mm}{\centering{\textbf{Refresh Time (s)}}} & \multirow{2}{10mm}{\centering{\textbf{Precision (W)}}} \\
 & & & \\
\toprule
Eaton & Serial, SNMP via Ethernet & 5 & 1 \\
\midrule
Schleifenbauer & SNMP via Ethernet & 3 & 0.1 \\
\midrule
OmegaWatt & IrDA Serial & 1 & 0.125 \\
\midrule
Dell iDrac6 & IPMI / Ethernet & 5 & 7 \\
\midrule
Watts Up? & Proprietary via USB & 1 & 0.1 \\
\midrule
ZEZ LMG450 & Serial & 0.05 & 0.01 \\
\bottomrule
\end{tabular}
\end{footnotesize}
\end{table}
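The push/pull distinction above can be sketched in code. The following is a minimal illustration of a pull-mode driver thread that periodically queries a meter and publishes the reading; the class name, field names, and callbacks are assumptions for illustration, not the actual Kwapi driver API.

```python
import threading
import time

class PullDriver(threading.Thread):
    """Minimal sketch of a pull-mode wattmeter driver thread.

    `read_power` stands in for a device-specific query (SNMP, IPMI, ...),
    and `publish` stands in for sending a metric on the internal bus.
    This interface is an illustrative assumption, not Kwapi code.
    """

    def __init__(self, probe_id, read_power, publish,
                 interval_s=1.0, max_polls=None):
        super().__init__(daemon=True)
        self.probe_id = probe_id
        self.read_power = read_power   # callable returning watts
        self.publish = publish         # callable receiving a metric dict
        self.interval_s = interval_s
        self.max_polls = max_polls     # None = run until stopped

    def run(self):
        polls = 0
        while self.max_polls is None or polls < self.max_polls:
            watts = self.read_power()  # query the meter (pull mode)
            self.publish({"probe_id": self.probe_id,
                          "timestamp": time.time(),
                          "w": watts})
            polls += 1
            time.sleep(self.interval_s)

received = []
d = PullDriver("site1.node1", read_power=lambda: 120.0,
               publish=received.append, interval_s=0.01, max_polls=3)
d.start()
d.join()
```

A push-mode device would invert the control flow: the driver would listen on a socket and parse measurements as the meter emits them.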

The granularity at which measurements are taken is another important factor, as the needs often vary depending on what one wishes to evaluate. Taking measurements at time intervals of one second or smaller is common in several scenarios. This can be a challenge in an infrastructure comprising hundreds or thousands of nodes, demanding efficient and scalable mechanisms for transferring information on power consumption.

Furthermore, leveraging the capabilities offered by existing cloud platforms like OpenStack can help the adoption of a monitoring system, ease deployment, and reduce the learning curve when using the system. In addition, users and system administrators need management reports and visualisation tools to analyse the impact of energy management policies and quickly retrieve relevant data for further analysis.

We summarise the main requirements for our energy monitoring platform as follows:

\begin{itemize}
\item \textbf{Reliable data storage}: a centralised storage where energy consumption data can be placed and easily retrieved. Note that centralised storage here does not imply that data is stored on a single node. Systems like Apache Hadoop HDFS\footnote{http://hadoop.apache.org/}, Apache Cassandra\footnote{http://cassandra.apache.org/}, and MongoDB\footnote{http://www.mongodb.org/} could be used.

\item \textbf{Handling of heterogeneous wattmeters}: there is a need to handle multiple device types and to design the architecture in a way that allows support for new wattmeters to be added.

\item \textbf{Efficient communication}: the envisioned system should provide a means for nodes to efficiently communicate their energy consumption to components interested in processing it. A message bus could be used to manage this communication efficiently.

\item \textbf{Integration with an open source cloud platform}: the proposed system should interface with existing open source cloud platforms in order to ease deployment and use.

\item \textbf{Visualisation and reports}: the system should offer a set of management reports that provide quick feedback to system administrators and users during the execution of tasks or applications. In addition, it should provide means and APIs that allow more advanced queries to be made.
\end{itemize}

The following sections describe the architecture of Kwapi and how it addresses the aforementioned requirements.

\subsection{Kwapi}

Figure~\ref{fig:architecture} depicts the architecture of Kwapi. It is based on a set of layers comprising drivers, responsible for performing the measurements, and plugins, which subscribe to the collected information. The communication between these two layers is handled by a bus, as explained later. In the case of a distributed architecture, a plugin can subscribe to receive information collected by drivers from multiple sites. Drivers and plugins are easily extensible to support other types of wattmeters and to provide other services. The Kwapi API is designed to provide a programming interface for developers and system administrators, and is used to interface with OpenStack by providing the information (\textit{i.e.} by polling monitored devices) required to feed Ceilometer.

\begin{figure}[!ht]
\center
\includegraphics[width=1.0\columnwidth]{figs/architecture.pdf}
\caption{Overview of Kwapi's architecture.}
\label{fig:architecture}
\end{figure}

% We have chosen the centralised storage approach, while minimising the network traffic and preserving the scalability. Ceilometer is used to store our power consumption metrics, and we propose an architecture to retrieve values from wattmeters, and send them to Ceilometer. In the following section we will describe Ceilometer followed by Kwapi.

Ceilometer, OpenStack's framework for collecting values of performance metrics and resource consumption, also used for billing, has two types of agents, namely compute agents and a central agent. The compute agents run on compute nodes; they retrieve information about resource usage related to a given virtual machine instance and a given resource owner, whereas the central agent executes pollsters on the management server to retrieve data that is not related to a particular instance. Measurements of metrics are published on the internal Ceilometer bus as counters (of cumulative, gauge, or delta type). Several modules listen to this bus, including the Ceilometer Collector, which then stores these counters in a database. This database can be queried via the Ceilometer API, which allows one to view the history of a resource's metrics. In the context of publishing energy metrics, we use the central agent and a dedicated pollster we developed. It queries the Kwapi API plugin and publishes cumulative (kWh) and gauge (W) counters. These counters are not yet associated with a particular user, since a server can host multiple clients simultaneously.
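In spirit, the pollster turns each probe's latest Kwapi reading into one cumulative and one gauge counter. The following is a schematic stand-in: the record layout and field names are illustrative assumptions, not Ceilometer's actual Counter class or the real pollster code.

```python
def to_counters(probe_data):
    """Convert a Kwapi-API-style response into counter-like records.

    Each probe yields a cumulative energy counter (kWh) and a gauge
    power counter (W), mirroring the two counter types the dedicated
    pollster publishes. Layout is an illustrative assumption.
    """
    counters = []
    for probe_id, data in probe_data.items():
        counters.append({"name": "energy", "type": "cumulative",
                         "unit": "kWh", "volume": data["kwh"],
                         "resource_id": probe_id,
                         "timestamp": data["timestamp"]})
        counters.append({"name": "power", "type": "gauge",
                         "unit": "W", "volume": data["w"],
                         "resource_id": probe_id,
                         "timestamp": data["timestamp"]})
    return counters

cs = to_counters({"node1": {"kwh": 1.5, "w": 120.0,
                            "timestamp": 1400000000}})
```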

In the following, we provide more details about some of the framework layers.

\subsubsection{Drivers}

The drivers are threads initialised by a manager, which provides a set of parameters loaded from a configuration file compliant with the OpenStack configuration file format, similar to INI. These parameters are used to query the meters (\textit{e.g.} IP address and port) and to indicate the sensor IDs in the issued metrics. The metrics are Python dictionaries with a set of fields. Optional fields can be added, such as voltage and current. In addition, metrics are signed.
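To make this concrete, a driver-issued metric might resemble the dictionary below. The field names and the HMAC-based signing scheme are assumptions for illustration only; they sketch the idea of a signed metric rather than Kwapi's actual wire format.

```python
import hashlib
import hmac
import json
import time

# Shared secret between drivers and plugins (illustrative assumption).
SECRET_KEY = b"shared-driver-plugin-secret"

def build_metric(probe_id, watts, voltage=None):
    """Build a signed measurement dictionary for the internal bus."""
    metric = {
        "probe_id": probe_id,      # sensor ID taken from the config file
        "timestamp": time.time(),  # when the measurement was taken
        "w": watts,                # instantaneous power in watts
    }
    if voltage is not None:        # optional fields may be added
        metric["v"] = voltage
    payload = json.dumps(metric, sort_keys=True).encode()
    metric["signature"] = hmac.new(SECRET_KEY, payload,
                                   hashlib.sha256).hexdigest()
    return metric

m = build_metric("site1.cluster2.node3", 87.5, voltage=230.1)

# A receiving plugin can verify the signature before trusting the fields.
body = {k: v for k, v in m.items() if k != "signature"}
expected = hmac.new(SECRET_KEY, json.dumps(body, sort_keys=True).encode(),
                    hashlib.sha256).hexdigest()
assert expected == m["signature"]
```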

The manager periodically checks that all threads are active, restarting them if necessary (incidents may occur, for example, if a meter is disconnected or becomes inaccessible). The drivers can manage incidents themselves; however, if for any reason they stop executing, they are automatically restarted by the manager. It is important to avoid losing measurements: as the information reported is instantaneous power (W) rather than cumulative energy (kWh), a lost measurement cannot be recovered afterwards.

\subsubsection{Plugins}

The plugins retrieve and process the measurements taken by the drivers and provided via the bus. They expose this information to other services, like Ceilometer, and to the user via visualisation tools. Plugins can subscribe to all sensors, or to a subset of them, by using a system of prefixes. After verifying the message signature, they extract the fields and process the received data. Currently, Kwapi provides two plugins, namely an API to interface with Ceilometer and a visualisation tool.

\begin{itemize}
\item \textbf{API for Ceilometer}: the API plugin computes the number of kWh for each probe, adds a timestamp, and stores the last value in watts. This data is not stored in a database, as Ceilometer already has its own. If a probe has not provided measurements for a long time, the corresponding data is removed. This plugin has a REST API that allows one to retrieve the names of the probes and the measurements in W and kWh, with their timestamps. The API is secured with OpenStack Keystone tokens, whereby the client provides a token and the plugin contacts the Keystone API to check the token's validity before sending its response.

\item \textbf{Visualisation}: the visualisation plugin builds Round-Robin Database (RRD) files from the received measurements and generates graphs, each showing the energy consumption over a given period, with additional information (average electricity consumption, minimum and maximum watts, last value, total energy, and cost in euros). A web interface displays the generated graphs. A cache mechanism triggers the creation of graphs during queries only if they are out of date. RRD files are of fixed size and store several collections of metrics with different granularities.
\end{itemize}
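The kWh bookkeeping performed by the API plugin amounts to integrating watt samples over time. The sketch below shows one way to do that (rectangle rule over the previous sample); it is a minimal illustration of the idea, not the actual Kwapi implementation, which may handle timestamps and gaps differently.

```python
class ProbeCounter:
    """Accumulate energy (kWh) from successive power samples (W).

    Minimal sketch of per-probe bookkeeping: each new sample closes
    the interval opened by the previous one, and the energy of that
    interval is added to the running kWh total.
    """

    def __init__(self):
        self.kwh = 0.0
        self.last_w = None
        self.last_ts = None

    def update(self, watts, timestamp):
        if self.last_ts is not None:
            elapsed_h = (timestamp - self.last_ts) / 3600.0
            # integrate the previous power level over the elapsed time
            self.kwh += self.last_w * elapsed_h / 1000.0
        self.last_w = watts
        self.last_ts = timestamp

c = ProbeCounter()
c.update(100.0, 0)       # 100 W at t = 0 s
c.update(100.0, 3600)    # one hour later: 0.1 kWh accumulated
```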

\subsubsection{Internal bus}

Kwapi uses ZeroMQ, a fast brokerless messaging framework written in C++, in which transmitters play the role of buffers. ZeroMQ supports a wide range of transports, including cross-thread communication, IPC, and TCP, and switching from one to another is straightforward. It also provides several design patterns, such as publisher/subscriber and request/response. In our architecture, we use a publisher/subscriber pattern in which drivers are publishers and plugins are subscribers. Between them, one or more forwarders simply relay the packets, broadcasting each packet to all plugins subscribed to receive information from a given probe. Thanks to the forwarders, network usage is optimised because packets are sent only once, regardless of the number of plugins listening to a probe. If a probe is not listened to by any plugin, its measurements are not sent over the network, not even to the first forwarder. The forwarders not only dramatically reduce network usage, but also allow flexible architectures to be built, by working around network isolation problems or performing load balancing.
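ZeroMQ SUB-socket subscriptions are prefix matches on the message topic, which is what makes the probe-prefix subscription scheme above possible. The following pure-Python sketch mimics the filtering a forwarder effectively performs (the probe topic naming is an illustrative assumption, and the real code would use pyzmq sockets rather than plain functions):

```python
def deliver(subscriptions, topic, payload):
    """Return the subscribers whose prefix matches the topic.

    Mimics ZeroMQ SUB-socket prefix filtering: a subscriber to
    'site1.' receives every topic starting with that prefix, and
    an empty prefix matches everything.
    """
    return [sub for sub, prefix in subscriptions.items()
            if topic.startswith(prefix)]

subs = {
    "ceilometer_api": "",     # empty prefix: receives all probes
    "viz_site1": "site1.",    # receives only site1 probes
}
```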

\section{Performance Evaluation}

We evaluate Kwapi in terms of CPU and network usage.

We ran kwapi-drivers and kwapi-api on a machine with a Core 2 Duo P8770 2.53~GHz processor and 4~GB of RAM.

We first simulated 1000 IPMI cards, each monitored by one driver thread emitting one value per second on the bus. We then simulated 100 PDUs with 10 outlets, each PDU monitored by one driver thread emitting ten values per second on the bus. In each scenario, 1000 values per second are therefore emitted on the bus.

We ran multiple simulations with different parameters (Table~\ref{parameters_table}).

\begin{table}
\renewcommand{\arraystretch}{1.3}
\caption{Parameters}
\label{parameters_table}
\centering
\begin{tabular}{|l|l|}
\hline
\bfseries Name & \bfseries Values\\
\hline
Driver threads & 100 or 1000\\
Signature & on or off\\
Socket & IPC or TCP\\
\hline
\end{tabular}
\end{table}

\subsection{CPU}
We measured the CPU impact of the different parameters. The socket type and the number of driver threads have no impact on CPU consumption. On our machine, kwapi-drivers without signing enabled consumed around 20\% of the total CPU power, while kwapi-api consumed around 10\% without signing enabled and 16\% with one request per second querying all probes. Signing the messages increases the CPU usage by 30\%. The CPU usage thus depends very much on the number and complexity of drivers and plugins, and on whether message signing is enabled. If a machine lacks compute power, it is easy to add another machine running kwapi-drivers.

\subsection{Network}
Our experiments show a consumption of around 230~KB/s with message signing enabled and of 135~KB/s otherwise; signing thus induces an overhead of about 70\%.

To decrease the header overhead, it is better to send large packets. ZeroMQ has its own optimisation mechanism: if several drivers send metrics simultaneously, ZeroMQ aggregates them into one TCP datagram. In our experiments, some packets contained up to forty metrics. The metrics are JSON dictionaries, which have the advantage of being human readable and easily parsable while adding very little overhead. The size of these dictionaries may vary, depending on the number of fields set by the drivers (signing adds some overhead), while the ACKs have a fixed size of 66 bytes (on a TCP link). In a simple architecture where kwapi-drivers and kwapi-api run on the same machine, there is no network traffic at all if Kwapi is configured to use an IPC socket.

The plugins can select the probes they want to watch, so any useless traffic is eliminated.

% ----------------------------------------------------------------------------------------

\section{Conclusion}
\label{sec:conclusion}
This paper presented Kwapi, a generic and extensible framework for monitoring the energy consumption of OpenStack based clouds. Kwapi supports heterogeneous wattmeters through its driver layer, limits communication overhead thanks to a ZeroMQ bus with forwarders, and exposes consumption data to OpenStack Ceilometer as well as to visualisation tools. Our evaluation shows that the framework sustains a thousand measurements per second on a single commodity machine, with message signing being the main source of CPU and network overhead.

% ----------------------------------------------------------------------------------------

\section*{Acknowledgment}

This research is supported by the French FSN (Fonds national pour la Soci\'et\'e Num\'erique) XLcloud project. Some experiments presented in this paper were carried out using the Grid'5000 experimental testbed, being developed under the Inria ALADDIN development action with support from CNRS, RENATER and several universities as well as other funding bodies (see https://www.grid5000.fr). The authors wish to thank Julien Danjou for his help during the integration of Kwapi with OpenStack and Ceilometer.

\bibliographystyle{IEEEtran}
\bibliography{references}

\end{document}