Revision 848e7701 papers/2014/reservation/paper.tex

b/papers/2014/reservation/paper.tex
12 12
\usepackage{color}
13 13
\usepackage{xcolor}
14 14
\usepackage{balance}
15
\usepackage{algorithm2e}
15 16

  
16 17
\acrodef{KWAPI}{KiloWatt API}
17 18
\acrodef{KWRanking}{KiloWatt Ranking}
......
181 182
\section{Reservation Strategies}
182 183
\label{sec:strategies}
183 184

  
184
To benefit from adding support for reservation of bare-metal resources, we discuss simple strategies that a private cloud could implement towards reducing the amount of energy spent with computational resources. These strategies can complement the ranking system described above.
185
To benefit from support for reserving bare-metal resources, we discuss strategies that a private cloud could implement to reduce the energy consumed by compute resources. These strategies could complement the ranking system described above by, for instance, selecting the least power-efficient resources to be switched off.
185 186

  
186 187
\subsection{Power-off Idle Resources}
187 188

  
188
This strategy...
189
The strategy considered here consists of periodically checking which resources are idle. Once a resource has remained idle for a number of consecutive intervals and, when reservation is enabled, is not committed to serving a reservation over a given time horizon, it is powered off. Previous work \cite{OrgerieSaveWatts:2008} has evaluated how to choose appropriate intervals for measuring idleness and the horizon for switching off resources committed to reservations. This work sets the measurement interval, idleness time, and reservation horizon to 1, 5 and 15 minutes, respectively. Algorithm~\ref{algo:alloc_policy} summarises the strategy.
190
          
191
\IncMargin{-0.6em}
192
\RestyleAlgo{ruled}\LinesNumbered
193
\begin{algorithm}[ht]
194
\caption{Sample resource allocation policy.}
195
\label{algo:alloc_policy} 
196
\DontPrintSemicolon
197
\SetAlgoLined
198
\SetAlgoVlined
199
\footnotesize{
200

  
201
\label{algo:check_idle_start}\textbf{procedure} checkIdleNodes()\;
202
\Begin{ 
203
	$Res\_idle_t \leftarrow $ get list of idle resources at interval $t$\;
204
	$Res\_idle_{t-1} \leftarrow $ get list of idle resources at interval $t-1$\;
205
208
	
209
	\ForEach{resource $r \in Res\_idle_t$}{
210
	      \eIf{$r \in Res\_idle_{t-1}$}{
211
	          // increase number of idle intervals of $r$\;
212
	          $r.idle\_intervals \leftarrow r.idle\_intervals + 1$\;
213
	      }{
214
	          $r.idle\_intervals \leftarrow 1$\;\label{algo:check_idle_end}
215
	      }
216
	}
217
}
218

  
219
\BlankLine
220
\label{algo:switch_start}\textbf{procedure} switchResourcesOnOff()\;
221
\Begin{ 
222
	$Res\_on_{t} \leftarrow $ list of resources switched on\;
223
	$Res\_off_{t} \leftarrow $ list of resources switched off\;
224
	$Res\_reserv_{t,h} \leftarrow $ resources reserved until horizon $h$\;
225
	$nres\_reserv_{t,h} \leftarrow $ number of resources in $Res\_reserv_{t,h}$\;
226
	$nres\_fcast_{t+1} \leftarrow $ forecast number of resources required at $t+1$ \;
227
	$nres\_req_{t+1} \leftarrow max(nres\_fcast_{t+1},nres\_reserv_{t,h})$\;
228
	
229
	\While{$nres\_req_{t+1} > sizeof(Res\_on_{t})$} {
230
		$r \leftarrow $ pop resource from $Res\_off_{t}$\;
231
		switch resource $r$ on\;
232
		add $r$ to $Res\_on_{t}$\;
233
	}
234
	
235
	$Res\_idle_t \leftarrow $ get list of idle resources at interval $t$\;
236
	\While{$nres\_req_{t+1} < sizeof(Res\_on_{t})$} {
237
	    \ForEach{resource $r \in Res\_idle_t$}{
238
	    	\If{$r.idle\_intervals \geq 5$ and $r \notin Res\_reserv_{t,h}$}{
239
		    remove $r$ from $Res\_on_{t}$\;
240
		    switch resource $r$ off\;
241
		    add $r$ to $Res\_off_{t}$\;
242
		}
243
		\If{$nres\_req_{t+1} == sizeof(Res\_on_{t})$}{
244
		   \textbf{break}\; \label{algo:switch_end}
245
		}
246
	    }
247
	}
248
}
249
}
250

  
251
\While{system is running} {
252
   every minute call checkIdleNodes()\;
253
   every 5 minutes call switchResourcesOnOff()\;
254
}
255
\end{algorithm}
256
\IncMargin{0.6em} 
257

  
258
Lines~\ref{algo:check_idle_start} to \ref{algo:check_idle_end} contain the pseudo-code that identifies idle resources, whereas lines~\ref{algo:switch_start} to \ref{algo:switch_end} determine which resources need to be switched on or off.
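The two procedures of Algorithm~\ref{algo:alloc_policy} can be illustrated with a minimal Python sketch. It is not the simulator's implementation: the names \texttt{Resource}, \texttt{check\_idle\_nodes} and \texttt{switch\_resources} are hypothetical, the idle counter is reset to zero when a resource is busy (which yields the same counts as comparing the idle lists at $t$ and $t-1$), and the switch-off step makes a single pass over the powered-on resources so that it terminates even when no resource is eligible.

```python
from dataclasses import dataclass

# Idleness time of 5 one-minute intervals, as stated in the text.
IDLE_THRESHOLD = 5


@dataclass
class Resource:
    """Hypothetical resource record used only for this illustration."""
    name: str
    on: bool = True
    idle_intervals: int = 0


def check_idle_nodes(resources, idle_now):
    """Update idle counters; idle_now holds the names idle in this interval."""
    for r in resources:
        if r.name in idle_now:
            r.idle_intervals += 1   # idle again: one more consecutive interval
        else:
            r.idle_intervals = 0    # busy: the idle streak is broken


def switch_resources(resources, reserved, n_forecast):
    """Power resources on or off so that capacity follows expected demand.

    reserved is the set of names committed to reservations within the horizon;
    n_forecast is the forecast number of resources required at t+1.
    """
    n_required = max(n_forecast, len(reserved))
    on = [r for r in resources if r.on]
    off = [r for r in resources if not r.on]

    # Too few resources on: power on until demand is met or none are left.
    while len(on) < n_required and off:
        r = off.pop()
        r.on = True
        r.idle_intervals = 0        # assumption: a fresh resource starts its streak anew
        on.append(r)

    # Too many resources on: power off long-idle, unreserved resources.
    surplus = len(on) - n_required
    for r in on:
        if surplus <= 0:
            break
        if r.idle_intervals >= IDLE_THRESHOLD and r.name not in reserved:
            r.on = False
            surplus -= 1

    return [r.name for r in resources if r.on]
```

In a simulation loop, \texttt{check\_idle\_nodes} would be called every minute and \texttt{switch\_resources} every five minutes, mirroring the final loop of the algorithm.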
189 259

  
190 260
% ----------------------------------------------------------------------------------------
191 261

  
......
198 268

  
199 269
A discrete-event simulator developed in-house is used to model and simulate resource allocation and request scheduling in a private cloud setting. We resort to simulation because it enables controlled, repeatable and large-scale experiments. Both infrastructure capacity and resource requests are expressed in number of CPU cores. As traces of cloud workloads are very difficult to obtain, we use request logs gathered from Grid'5000 sites and adapt them to model cloud users' resource demands. Under normal operation, Grid'5000 enables resource reservations, but users' requests are conditioned by the available resources. For instance, a user willing to allocate resources for an experiment will often check a site's agenda, see what resources are available and eventually make a reservation during a convenient time frame. If the user cannot find enough resources, she will either adapt her requirements to resource availability (\textit{e.g.} change the number of required resources, or the reservation start and/or finish time) or choose another site with available capacity. The request traces, however, do not capture what users' initial requirements were before they made their requests.
200 270

  
201
In order to obtain a workload trace on provisioning of bare-metal resources that is more cloud oriented, we adapt the request traces and infrastructure capacity of Grid'5000 by making the following changes:
271
To obtain a more cloud-oriented workload trace for the provisioning of bare-metal resources, we adapt the request traces and infrastructure capacity of Grid'5000 by making the following changes to reservation requests:
202 272

  
203 273
\begin{enumerate}
204
\item \label{enum:cond1} Advance reservation requests whose original submission time is within working hours and start time lies outside these hours are considered immediate reservations starting at their original submission time.
205
\item \label{enum:cond2} Requests whose original submission and start times are on different days of the week are also turned into immediate reservations, both submitted and starting at their original start time.
274
\item \label{enum:cond1} Requests whose original submission time is within working hours and whose start time lies outside these hours are treated as on-demand requests starting at their original submission time.
275
\item \label{enum:cond2} Remaining requests are treated as on-demand requests that are both submitted and started at their original start time.
206 276
\item \label{enum:capacity} The resource capacity of a site is modified to the maximum number of CPU cores required to honour all requests, plus a safety factor.
207 277
\end{enumerate}
208 278

  
209
Change \ref{enum:cond1} modifies the behaviour of users who today explore resources during off-peak periods, whereas \ref{enum:cond2} alters the current practice of planning experiments in advance and reserving resources before they are taken by other users. Although the changes may seem extreme at first, they allow us to evaluate what we consider to be our \textit{worst case scenario}. Moreover, as mentioned earlier, we believe the model adopted by existing clouds, where short-term advance reservations are generally not allowed and prices of on-demand instances do not vary over time, users would have little incentives to explore off-peak periods or plan their demand in advance. Change \ref{enum:capacity} reflects the industry practice of provisioning resources to handle peak demand and including a margin of safety.
279
The characteristics of best-effort requests are not changed. Change~\ref{enum:cond1} modifies the behaviour of users who today exploit resources during off-peak periods, whereas Change~\ref{enum:cond2} alters the current practice of planning experiments in advance and reserving resources before they are taken by other users. Although the changes may seem extreme at first, they allow us to evaluate what we consider to be our \textit{worst case scenario}, in which reservation is not enabled. Moreover, as mentioned earlier, we believe that under the model adopted by existing clouds, where short-term advance reservations are generally not allowed and prices of on-demand instances do not vary over time, users would have little incentive to exploit off-peak periods or plan their demand in advance. Change~\ref{enum:capacity} reflects the industry practice of provisioning resources to handle peak demand while including a margin of safety.
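The three changes can be sketched in Python as follows. This is only an illustration under stated assumptions, not the code used to prepare the traces: the \texttt{Request} record, the working hours (09:00 to 18:00 on weekdays), the 10\% safety margin and the choice to preserve each request's duration are all hypothetical, since the text does not specify them.

```python
from dataclasses import dataclass, replace
from datetime import datetime

# Assumed values for illustration only; the text does not state them.
WORK_START, WORK_END = 9, 18    # working hours on weekdays
SAFETY_PERCENT = 10             # safety margin added to peak demand


@dataclass(frozen=True)
class Request:
    """Hypothetical trace record."""
    submit: datetime
    start: datetime
    end: datetime
    cores: int


def in_working_hours(t):
    return t.weekday() < 5 and WORK_START <= t.hour < WORK_END


def to_on_demand(req):
    """Apply changes 1 and 2: turn a reservation into an on-demand request."""
    duration = req.end - req.start  # assumption: duration is preserved
    if in_working_hours(req.submit) and not in_working_hours(req.start):
        new_start = req.submit      # change 1: start at the submission time
    else:
        new_start = req.start       # change 2: submit at the start time
    return replace(req, submit=new_start, start=new_start,
                   end=new_start + duration)


def site_capacity(requests):
    """Apply change 3: peak concurrent cores plus a safety margin."""
    events = []
    for r in requests:
        events.append((r.start, r.cores))
        events.append((r.end, -r.cores))
    # At equal timestamps, releases (negative) sort before allocations.
    events.sort(key=lambda e: (e[0], e[1]))
    used = peak = 0
    for _, delta in events:
        used += delta
        peak = max(peak, used)
    # Integer arithmetic avoids floating-point rounding in the margin.
    return peak + (peak * SAFETY_PERCENT + 99) // 100
```

Computing the capacity after the requests have been converted means that the peak reflects the on-demand workload rather than the original reservations.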
210 280

  
211 281
\subsection{Performance Metrics}
212 282

  
