Revision 10f7c1c4

b/papers/2014/kwapi/experiments/data/cpu_usage_api.txt

Scenarios,API,API+1 req/s
IPMI cards signed,16,20
PDUs signed,16,20
IPMI cards unsigned,12,16
PDUs unsigned,9,13
b/papers/2014/kwapi/experiments/src/python/lib/graph.py

    # plt.ylabel('Scenarios')
    plt.xlabel('CPU Usage (\%)')
    ax.set_xlim(0, 30)
    # plt.title('CPU Usage of Drivers Under Different Scenarios')
    plt.legend((bars_api, bars_apireq), ['REST API only', 'REST API + 1 req./s'])
b/papers/2014/kwapi/paper.tex

Although some data centres now approach a \ac{PUE} factor of 1.08\footnote{http://gigaom.com/2012/03/26/whose-data-centers-are-more-efficient-facebooks-or-googles/}, such a mark means that the IT infrastructure is now responsible for a large share of the consumed power. Means to monitor and analyse how energy is spent are crucial to further improvement, but our previous work in this area has demonstrated that monitoring the power consumed by large systems is not always an easy task \cite{OrgerieSaveWatts:2008,AssuncaoIngrid:2010,DaCostaGreenNet:2010}. There are multiple power probes available in the market, generally with their own APIs, physical connections, precision, and communication protocols \cite{eelsd2013}. Moreover, cost-related constraints can lead data centre operators to acquire and deploy equipment in multiple stages, or to monitor the power consumption of only part of an infrastructure.

From a cost perspective, monitoring the power consumption of only a small part of the deployed equipment is sound, but it prevents one from capturing important nuances of the infrastructure. Previous work has shown that as a computer cluster ages, certain components wear out while others are replaced, leading to heterogeneous power consumption among nodes that were seemingly homogeneous \cite{DeanGoogleFailures:2008}. The difference between the nodes that consume the least power and those that consume the most can reach 20\% \cite{MehdiHeterogeneous:2013}, which reinforces the idea that monitoring the consumption of all equipment is required for exploring further improvements in energy efficiency and evaluating the impact of system-wide policies. Monitoring a great number of nodes, however, requires the design of an efficient infrastructure for collecting and processing the power consumption data.

This paper describes the design and architecture of a generic and flexible framework, termed \ac{KWAPI}, that interfaces with OpenStack to provide it with power consumption information collected from multiple heterogeneous probes. OpenStack is a project that aims to provide a ubiquitous open-source cloud computing platform and is currently used by many corporations, researchers and global data centres\footnote{http://www.openstack.org/user-stories/}. We describe how \ac{KWAPI} is integrated into Ceilometer, OpenStack's component conceived to provide a framework to collect a large range of metrics for metering purposes\footnote{https://wiki.openstack.org/wiki/Ceilometer}. With the increasing use of Ceilometer as the \textit{de facto} metering tool for OpenStack, we believe that such an integration of a power monitoring framework into OpenStack can be of great value to the research community and practitioners.
......

A means to monitor energy consumption is key to assessing the potential gains of techniques that improve software and cloud resource management systems. Cloud monitoring is not a new topic \cite{AcetoMonitoring:2013}, as tools to monitor computing infrastructure \cite{BrinkmannMonitoring:2013,VarretteICPP:2014} as well as ways to address some of the usual issues of management systems have been introduced \cite{WardMonitoring:2013,TanMonitoring:2013}. Moreover, several systems for measuring the power consumed by compute clusters have been described in the literature \cite{AssuncaoIngrid:2010}. As traditional system and network monitoring techniques lack the capability to interface with wattmeters, most approaches for measuring energy consumption have been tailored to the needs of the projects for which they were conceived.

In our work, we draw lessons from previous approaches to monitor and analyse energy consumption of large-scale distributed systems \cite{OrgerieSaveWatts:2008,DaCostaGreenNet:2010,AssuncaoIngrid:2010,MehdiHeterogeneous:2013,CGC2012}. We opt for creating a framework and integrating it with a successful cloud platform (\textit{i.e.} OpenStack), which we believe is of value to the research community and practitioners working on the topic. To the best of our knowledge, this is the first generic energy monitoring framework to be integrated with OpenStack.

% ----------------------------------------------------------------------------------------

......
\label{fig:architecture}
\end{figure*}

The communication between layers is handled by a bus, as explained in detail later. Data consumers can subscribe to receive information collected by drivers from multiple sites. Both drivers and consumers are easily extensible to support, respectively, several types of wattmeters and provide additional data processing services. A REST API is designed as a data consumer to provide a programming interface for developers and system administrators. In this work it is used to interface with OpenStack by providing the information (\textit{i.e.} by polling monitored devices) required by a \textit{\ac{KWAPI} Pollster} that feeds Ceilometer.
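The flow just described (driver threads publishing measurements that reach only subscribed consumers) can be sketched with a minimal in-process bus. This is an illustrative toy, with hypothetical class and probe names, not KWAPI's actual implementation:

```python
import queue

class Bus:
    """Toy publish/subscribe bus: drivers publish, consumers subscribe per probe."""

    def __init__(self):
        self.subscribers = {}  # probe id -> list of consumer queues

    def subscribe(self, probe_id):
        q = queue.Queue()
        self.subscribers.setdefault(probe_id, []).append(q)
        return q

    def publish(self, probe_id, watts):
        # As in the architecture above: with no subscriber for this probe,
        # the driver sends nothing.
        for q in self.subscribers.get(probe_id, []):
            q.put((probe_id, watts))

bus = Bus()
inbox = bus.subscribe("node-1")   # a data consumer, e.g. the REST API
bus.publish("node-1", 150.0)      # a driver thread pushes one measurement
bus.publish("node-2", 80.0)       # no subscriber: dropped
msg = inbox.get_nowait()
print(msg)                        # ('node-1', 150.0)
```

A real deployment replaces the in-process queues with ZeroMQ sockets, as described later.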

The following sections provide more details on the main architecture components and their relationship with OpenStack Ceilometer.

......

Wattmeters available in the market vary in terms of physical interconnection, communication protocols, packaging, and the precision of the measurements they take. They are mostly packaged in multiple-outlet power strips called \acp{PDU} or \acp{ePDU}, and more recently in the \ac{IPMI} cards embedded in the computers themselves. Support has been implemented for several types of wattmeters, which drivers can use to interface with a wide range of equipment. In our work, we initially used \ac{IPMI} in Nova to shut down and turn on compute nodes, but nowadays we also use it to query a computer chassis remotely.

Although Ethernet is generally used to transport \ac{IPMI} or SNMP packets over IP, USB and RS-232 serial links are also common. Wattmeters that use Ethernet are generally connected to an administration network (isolated from the data centre's main data network). Moreover, wattmeters may differ in the manner they operate: some devices send measurements to a management node on a regular basis (push mode), whereas others respond to queries (pull mode). Other characteristics that differ across wattmeters include:

\begin{itemize}
\item refresh rate (\textit{i.e.} maximum number of measurements per second);
......

\subsection{Internal Communication Bus}

\ac{KWAPI} uses ZeroMQ \cite{HintjensZeroMQ:2013}, a fast broker-less messaging framework written in C++, where transmitters play the role of buffers. ZeroMQ supports a wide range of bus modes, including cross-thread communication, IPC, and TCP. Switching from one mode to another is straightforward. ZeroMQ also provides several design patterns such as publish/subscribe and request/response. As mentioned earlier, in our publish/subscribe architecture drivers are publishers, and data consumers are subscribers. If no data consumer is subscribed to receive data from a given driver, the latter will not send any information through the network.
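Assuming the `pyzmq` bindings, the pattern can be sketched as follows; the `inproc` endpoint name and the plain-text payload are illustrative only, not KWAPI's actual wire format:

```python
import time
import zmq

ctx = zmq.Context()

pub = ctx.socket(zmq.PUB)                # a driver publishes measurements
pub.bind("inproc://kwapi-bus")

sub = ctx.socket(zmq.SUB)                # a data consumer subscribes
sub.connect("inproc://kwapi-bus")
sub.setsockopt_string(zmq.SUBSCRIBE, "node-1")  # filter by probe-id prefix
time.sleep(0.1)                          # let the subscription propagate (slow joiner)

pub.send_string("node-1 150.0")          # delivered: matches the subscription
pub.send_string("node-2 80.0")           # dropped at the publisher: no subscriber
msg = sub.recv_string()
print(msg)                               # node-1 150.0
```

Because ZeroMQ filters at the publisher side, the unmatched `node-2` message never travels over the transport, which is the property the drivers rely on.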

Moreover, one or more optional forwarders can be installed between drivers and data consumers to minimise network usage. Forwarders are designed to act as special data consumers that subscribe to receive information from a driver and multicast it to all regular data consumers subscribed to that information. Forwarders enable the design of complex topologies and the optimisation of network usage when handling data from multiple sites. They can also be used to bypass network isolation problems and perform load balancing.

\subsection{Interface with Ceilometer}

We opted for integrating KWAPI with an existing open source cloud platform to ease deployment and use. Leveraging the capabilities offered by OpenStack can help in the adoption of a monitoring system and reduce its learning curve.

Ceilometer's central agent and a dedicated pollster (\textit{i.e.} \ac{KWAPI} Pollster) are used to publish and store energy metrics into Ceilometer's database. They query the REST API data consumer and publish cumulative (kWh) and gauge (W) counters that are not associated with a particular tenant, since a server can host multiple clients simultaneously.
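The relation between the two counter types can be illustrated with a hypothetical helper (not part of KWAPI or Ceilometer) that integrates gauge samples (W) taken at a fixed interval into a cumulative energy counter (kWh):

```python
def cumulative_kwh(samples_watts, interval_s=1.0):
    """Integrate power samples (W), one every interval_s seconds, into kWh."""
    joules = sum(w * interval_s for w in samples_watts)  # energy in watt-seconds
    return joules / 3_600_000  # 1 kWh = 3.6e6 J

# One hour of a node drawing a constant 150 W, sampled once per second:
energy = cumulative_kwh([150.0] * 3600)
print(energy)  # 0.15 (kWh)
```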

Depending on the number of monitored devices and the frequency at which measurements are taken, wattmeters can generate a large amount of data, thus demanding storage capacity for further processing and analysis. Management systems often store and pre-process data locally on monitored nodes, but such an approach can impact CPU utilisation and influence the power consumption. In addition, resource managers may switch off idle nodes or set them to standby mode to save energy, which makes them unavailable for processing. Centralised storage, on the other hand, allows for faster data access and processing, but can generate more traffic given that measurements need to be continuously transferred over the network to a central point.

Ceilometer uses its own central database, which is leveraged here to store the energy consumption metrics. In this way, systems that interface with OpenStack's Ceilometer, including Nova, can easily retrieve the data. It is important to note that, even though Ceilometer provides the notion of a central repository for metrics, it also uses a database abstraction that enables the use of distributed systems such as Apache Hadoop HDFS\footnote{http://hadoop.apache.org/}, Apache Cassandra\footnote{http://cassandra.apache.org/}, and MongoDB\footnote{http://www.mongodb.org/}.

The granularity at which measurements are taken and metrics are computed is another important factor, because user needs vary depending on what they wish to evaluate. Taking one or more measurements per second is not uncommon under certain scenarios, which can be a challenge in an infrastructure comprising hundreds or thousands of nodes, demanding efficient and scalable mechanisms for transferring information on power consumption. Hence, in the next section we evaluate the throughput of \ac{KWAPI} under a few scenarios.

\section{Performance Evaluation}
\label{sec:performance}

This section provides results of a performance evaluation carried out in our testbed. The goal is not to compare publish/subscribe systems since such work has already been performed elsewhere \cite{EugsterSurvey:2003,FabretPS:2001}. The evaluation demonstrates that the framework serves well the needs of a large range of users of the Grid'5000 platform \cite{Grid5000} --- the infrastructure we use and where the \ac{KWAPI} framework is currently deployed in production mode as the means for collecting and providing energy consumption information to users.

First, we want to evaluate the CPU and network usage of a typical driver to observe the framework's throughput, since provisioning a large number of resources for monitoring purposes is not desirable. For this experiment we deployed the \ac{KWAPI} drivers and API on a machine with a Core 2 Duo P8770 2.53 GHz processor and 4 GB of RAM. We considered:

\begin{itemize}
 \item a scenario where we emulated 1,000 \ac{IPMI} cards, each card monitored by a driver thread placing a measurement per second on the communication bus.
 \item a case with 100 ten-outlet \acp{PDU}, each monitored by a driver thread placing ten values per second on the communication bus. 
\end{itemize}
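A quick sanity check confirms that both scenarios place the same load on the bus (numbers from the setup above):

```python
ipmi_msgs = 1000 * 1    # 1,000 IPMI cards, one measurement per second each
pdu_msgs = 100 * 10     # 100 PDUs, ten values per second each (one per outlet)
print(ipmi_msgs, pdu_msgs)  # 1000 1000
```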

Under both scenarios, 1,000 measurements per second were placed on the bus, even though monitoring was done using different types of probes. We evaluated these scenarios both with and without message signing. Table~\ref{tab:parameters_usage} summarises the considered scenarios.

\begin{table}
\centering
......
\end{tabular}
\end{table}

Figure~\ref{fig:cpu_usage} shows the results of CPU usage of drivers under the evaluated scenarios. The socket type and number of driver threads do not have a distinguishable impact on the CPU usage. On the test machine, the \ac{KWAPI} drivers with message signature disabled (\textit{i.e.} \ac{IPMI} cards unsigned and \acp{PDU} unsigned) consumed on average 20\% of the total CPU power.

% The \ac{KWAPI} API consumed around 10\% with message signature disabled and 16\% when making one request per second querying the last measurements of all probes. Message signature overall increases CPU usage by 30\% (see \ac{IPMI} cards signed and \acp{PDU} signed scenarios).

......
\label{fig:cpu_usage}
\end{figure}

We also evaluated the CPU consumption of the REST API data consumer under the scenarios described in Table~\ref{tab:parameters_usage}. In addition to these scenarios, two conditions were assessed, namely (i) the REST API working as a consumer requesting data from drivers at a one-second interval (REST API only); and (ii) the API requesting data at a one-second interval and also answering a call every second to provide the collected data to an external system (REST API + 1 req/s). Figure~\ref{fig:cpu_usage_consumer} summarises the obtained results. The CPU consumption is in general low. Even when message signing is enabled and the API serves a query, its consumption is below 20\%. The small variation between the scenarios without message signing is caused by the manner in which ZeroMQ accumulates data on nodes prior to transmission.

\begin{figure}[!ht]
\centering
......

%While measuring the network usage, our experiments showed a transfer rate of around 230KB/s with message signing enabled and around 135KBs/s otherwise. Message signing overall introduces an overhead of 70\%. Sending large packets can be explored to decrease the packet overhead. If several drivers send measurements simultaneously, ZeroMQ provides an optimisation mechanism that aggregates the data into a single TCP datagram. Figure~\ref{fig:packet_size} shows the number of packets under the evaluated scenarios. We noticed that certain packets contain up to forty measurements.

Although a measurement interval of one second meets the requirements of users in our platform, we wanted to evaluate the impact of using a communication bus in the transfer of observations between drivers and the REST API consumer. In a second experiment we used two machines. On the first machine we instantiated 1,000 driver threads placing random observations on the communication bus. On the second machine we measured the number of measurements that the API is able to receive over a minute. We varied the time between measurements from 0.2 to 1.0 seconds. Figure~\ref{fig:measurement_intervals} summarises the obtained results. Though the number of observations generated in this experiment is much higher than what we currently need to handle in our platform, we observe that the framework is able to transfer measurements from drivers to the API at intervals of 0.4 seconds or more without adding much jitter. Under smaller measurement intervals, however, observations start to accumulate and are transferred in large chunks. We believe that under small measurement intervals, and consequently a very large number of observations per second, an architecture based on stream processing systems that guarantees data processing might be more appropriate. Hence, although the framework suits the purposes of a large range of users, if measurements are to be taken at very small time intervals, a stream processing architecture would probably yield better performance by enabling the placement of elements to pre-process data closer to where it is generated.
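For perspective, the offered load in this experiment grows quickly as the interval shrinks. A simple back-of-the-envelope calculation, assuming all 1,000 drivers emit at the same interval:

```python
drivers = 1000
for interval_s in (1.0, 0.4, 0.2):
    # Each driver emits one observation per interval; scale to a minute.
    per_minute = drivers * 60 / interval_s
    print(f"{interval_s:.1f} s interval -> {per_minute:,.0f} observations/minute")
```

At 0.2 s the API must absorb five times the per-minute volume of the one-second case, which is where the accumulation effect above becomes visible.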

% As mentioned earlier, plug-ins can subscribe and select probes from which they want to receive information. If multiple plug-ins select a node, information from the node is sent only once through the network. The architecture also allows for a hierarchy of plug-ins to be established, where a plug-in can be deployed on a site to summarise or compute average values that are placed on the bus to be consumed by higher level plug-ins.

......
\section{Conclusion}
\label{sec:conclusion}

In this paper, we described a novel framework (\ac{KWAPI}) for monitoring the power consumed by resources of an OpenStack cloud. Based on lessons learned from monitoring the power consumption of large distributed infrastructures, we proposed an energy monitoring architecture based on a publish/subscribe model. The framework works in tandem with OpenStack's Ceilometer. Experimental results demonstrate that the overhead posed by the monitoring framework is small, allowing us to serve the monitoring needs of the users of our large-scale infrastructure.

As future work, we intend to explore means to increase the monitoring granularity and the number of measured devices by applying a hierarchy of plug-ins, and a stream processing system with guarantees on data processing \cite{Storm,S4} for processing streams of measurement tuples.

% ----------------------------------------------------------------------------------------