| | 1 | = Quattor Workshop - RAL - 11-13/10/2010 = |
| | 2 | [[TracNav]] |
| | 3 | |
| | 4 | [[TOC(inline)]] |
| | 5 | |
| | 6 | [http://indico.cern.ch/conferenceTimeTable.py?confId=105169#all.detailed Agenda] |
| | 7 | |
| | 8 | == Quattor at RAL T1 - A. Sun == |
| | 9 | |
| | 10 | Started the grid with makeshift tooling based on Kickstart, Puppet... In 2006 realized that this should be re-engineered. |
| | 11 | * 500 WNs, 500 disk servers |
| | 12 | |
| | 13 | Main benefit of Quattor so far: huge improvement in system management efficiency. |
| | 14 | * But must not underestimate the difficulty of getting the whole team onboard: mostly done now. It takes time to get the existing knowledge put in Quattor config. |
| | 15 | * Experienced it during the last kernel update: full reinstallation would have been an affordable option if necessary |
| | 16 | |
| | 17 | |
| | 18 | == Quattor Usage Report - M. Jouvin == |
| | 19 | |
| | 20 | See [http://indico.cern.ch/getFile.py/access?contribId=3&sessionId=2&resId=0&materialId=slides&confId=105169 slides]. |
| | 21 | |
| | 22 | |
| | 23 | == Core Tools == |
| | 24 | === ncm-filesystems and ncm-lib-blockdevices New Ideas - L. Munoz === |
| | 25 | |
| | 26 | ncm-filesystems: NCM component able to build/destroy block devices |
| | 27 | * Take advantage of advanced description available in Quattor |
| | 28 | * Able to do things not possible to do with Kickstart |
| | 29 | * A few bugs: |
| | 30 | * Logical partitions cannot be grown: use LVM instead, no plan to fix |
| | 31 | * preserve_partitions sometimes not honoured: problem understood, some time required to fix it |
| | 32 | * preserve and format: required by AII but should not be available in the component |
| | 33 | * Some requests for new file system types: tmpfs, iscsi (replacement for ncm-iscsitarget), smbfs, FUSE filesystems... |
| | 34 | |
| | 35 | One of the problems is that ncm-filesystems also manages fstab: proposal to move this part to a specific component, ncm-fstab. |
| | 36 | * Add pseudo-file systems and network filesystems (without a block device) to fstab |
| | 37 | |
| | 38 | These changes require some changes in the schema |
| | 39 | * Some validation relaxation |
| | 40 | |
| | 41 | Backward compatibility |
| | 42 | * Profiles 100% compatible |
| | 43 | * ncm-filesystems will require ncm-fstab: can be handled in the component templates |
| | 44 | |
| | 45 | Remark on FUSE filesystems: an alternative to fstab is to use a specific daemon for that but this will not be managed by the component. |
| | 46 | |
| | 47 | |
| | 48 | === SCDB Update - M. Jouvin === |
| | 49 | |
| | 50 | See [http://indico.cern.ch/getFile.py/access?contribId=2&sessionId=2&resId=1&materialId=slides&confId=105169 Slides]. |
| | 51 | |
| | 52 | |
| | 53 | === PAN Update - C. Loomis === |
| | 54 | |
| | 55 | Status of v8 series |
| | 56 | * 8.4.2: last announced, the one everybody should be using |
| | 57 | * 8.4.3, 8.4.4: not yet announced |
| | 58 | * Maven integration |
| | 59 | * 8.4.5: planned soon after the workshop, last v8 version |
| | 60 | * All deprecated features will provide warnings |
| | 61 | * Add `prefix` keyword to grammar (not active, implementation in v9) |
| | 62 | |
| | 63 | v9: first beta planned soon after 8.4.5 |
| | 64 | * Main feature for the first release: |
| | 65 | * Removal of deprecated features |
| | 66 | * Bareword includes |
| | 67 | * New syntax of external reference: `machine:/path` instead of `//machine/path` (will allow proper support of namespaces) |
| | 68 | * Search order between .pan and .tpl reversed? |
| | 69 | |
| | 70 | Spare time going way down: no time to tackle performance issues |
| | 71 | * Not foreseen in the next 6 months |
| | 72 | |
| | 73 | Discussion - Wishes: |
| | 74 | * MS: signing of XML profiles |
| | 75 | * May be easier to chain a signing task with Maven, the same way it is done for gzipping profiles |
| | 76 | * Michel: auto-escaping of keys in nlist |
| | 77 | * Difficult to implement, pretty intrusive change in the compiler, risk of ambiguity |
| | 78 | |
| | 79 | === QWG Update - M. Jouvin === |
| | 80 | |
| | 81 | See [http://indico.cern.ch/getFile.py/access?contribId=1&sessionId=5&resId=0&materialId=slides&confId=105169 Slides]. |
| | 82 | |
| | 83 | Scrum/Agile ideas agreed. Let's decide later how to do it |
| | 84 | |
| | 85 | |
| | 86 | === Quattor FS - N. William === |
| | 87 | |
| | 88 | Alternative to ncm-query implemented as a FUSE file system |
| | 89 | * Written in Python (300 lines): manages access to XML profile |
| | 90 | * Currently requires Python 2.5 but should be easy to backport to 2.4 |
| | 91 | |
| | 92 | Can specify a default mode for accessing the profile while still restricting some parts of the namespace |
| | 93 | * Currently only explicit paths, but pattern matching could be added, for example to disable access to all 'password' attributes |
| | 94 | |
| | 95 | nlist/list become directories, values become files whose content is the value. |
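As a rough illustration of this mapping, the sketch below flattens a nested structure into the entries a filesystem view would expose. It is not the QuattorFS code: the function name `profile_to_tree` and the sample profile are hypothetical, and list elements are assumed to be named by their index.

```python
def profile_to_tree(node, path=""):
    """Yield (path, kind, value) entries for a nested profile structure.

    dict (nlist) and list nodes become directories; any other value
    becomes a file whose content is the value rendered as a string.
    """
    if isinstance(node, dict):
        yield (path or "/", "dir", None)
        for key, child in node.items():
            yield from profile_to_tree(child, f"{path}/{key}")
    elif isinstance(node, list):
        yield (path or "/", "dir", None)
        # Assumption: list elements are exposed under their index.
        for index, child in enumerate(node):
            yield from profile_to_tree(child, f"{path}/{index}")
    else:
        yield (path or "/", "file", str(node))


# Hypothetical profile fragment, for illustration only.
sample_profile = {
    "system": {
        "network": {
            "hostname": "wn001",
            "nameserver": ["10.0.0.1", "10.0.0.2"],
        }
    }
}

for entry_path, kind, value in profile_to_tree(sample_profile):
    print(kind, entry_path, "" if value is None else value)
```

With such a layout, ordinary tools (`ls`, `cat`, `diff -r`) are enough to inspect a profile.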
| | 96 | |
| | 97 | Can use all the file commands to browse, show the differences between the configuration versions. |
| | 98 | |
| | 99 | Escaped values (e.g. package names): a symlink is created with the unescaped value. |
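A minimal sketch of how the unescaped name could be computed. It assumes keys were produced by pan's `escape()` function and that this function replaces each escaped character with an underscore followed by its two-digit hexadecimal code. The helper name `unescape` is hypothetical and this is not taken from the QuattorFS source.

```python
import re

# Assumption: an escaped character appears as "_" + two hex digits,
# and a literal underscore is itself escaped (as "_5f").
_ESCAPED = re.compile(r"_([0-9a-fA-F]{2})")


def unescape(key):
    """Return the readable form of a pan-escaped key."""
    return _ESCAPED.sub(lambda m: chr(int(m.group(1), 16)), key)


print(unescape("glibc_2ddevel"))
```

Under that assumption, a directory named `glibc_2ddevel` would get a symlink named `glibc-devel` pointing to it.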
| | 100 | |
| | 101 | With `--cache` you can start quattorfs to browse profiles in arbitrary locations (e.g. SCDB `build/xml`). |
| | 102 | |
| | 103 | Works on Linux and Mac. |
| | 104 | |
| | 105 | Python code stats the NCM directory at every request to detect profile changes. |
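The idea can be sketched as follows, assuming that a change of the directory modification time is what signals a new profile. The class name `ProfileWatcher` is hypothetical and the actual QuattorFS check may differ.

```python
import os


class ProfileWatcher:
    """Detect changes in a directory by comparing its modification time."""

    def __init__(self, directory):
        self.directory = directory
        self._last_mtime = os.stat(directory).st_mtime_ns

    def changed(self):
        """Return True once each time the directory mtime has moved."""
        current = os.stat(self.directory).st_mtime_ns
        if current != self._last_mtime:
            self._last_mtime = current
            return True
        return False
```

A request handler would call `changed()` first and reload the profile when it returns `True`. On POSIX systems a directory's modification time changes when entries are created, removed or renamed, not when an existing file is rewritten in place, so this only works if new profiles are put in place by creating or renaming files.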
| | 106 | |
| | 107 | === Aquilon and other related MS developments - N. William === |
| | 108 | |
| | 109 | No Aquilon changes since last workshop. |
| | 110 | * Mainly worked on a DNS schema to produce DNS configuration, including host records, based on what is in the configuration database |
| | 111 | * Aquilon soon to be committed to SF repository |
| | 112 | * A few specific requirements in particular Python 2.6, Kerberos |
| | 113 | |
| | 114 | The main task still to be done is QWG templates integration |
| | 115 | * Aquilon enforces the expected namespaces |
| | 116 | * Namespaces: more directories expected by Aquilon than QWG |
| | 117 | * E.g. cpu/intel/l5520 rather than cpu/intel_l5520, rhel/5.0-x86_64 rather than rhel5.0-x86_64 |
| | 118 | |
| | 119 | Schema extensions: need a way to allow optional extensions |
| | 120 | * Product lifecycle |
| | 121 | * Personality: what is used for, OS version/arch |
| | 122 | * Related to Aquilon |
| | 123 | * Function: development, production... |
| | 124 | * Threshold, maintenance windows |
| | 125 | * Start/stop jobs: scripts that must be executed at startup but are not a service |
| | 126 | |
| | 127 | Component subclassing implemented and working, including for exception handling |
| | 128 | * ncm-ncd (NCM::Component) enhanced to have a method prefix() returning the configuration path for the current component. |
| | 129 | * Replacement of "my $base" definition |
| | 130 | * Helps with support of subclassing |
| | 131 | |
| | 132 | ncm-network: would like to add support for loopback aliases |
| | 133 | |
| | 134 | Monitoring configuration: would be good to have it embedded into the service configuration rather than done at a later stage. |
| | 135 | * More discussion required on where the template configuring the information should sit (service directory or monitoring directory) |
| | 136 | * May think about putting a meta-description of monitoring information in the service and generate the appropriate information on the fly |
| | 137 | |
| | 138 | Versioned components: we need to be able to have several versions of a component installed, so that a new version can be deployed while nodes still use the version described in their configuration |
| | 139 | * One idea would be to install components at a location that includes the version number |
| | 140 | * Still to be checked if it really solves the problem: the real problem may be to enable SPMA to deploy only a subset of the RPM changes |
| | 141 | |
| | 142 | @xxx@ substitution in source code: should be reduced and probably restricted to where it doesn't break syntax checks (strings, comments) |
| | 143 | |
| | 144 | |
| | 145 | == Security Management == |
| | 146 | |
| | 147 | === Quattor and Security - Mingchao Ma === |
| | 148 | |
| | 149 | Operational security aims to maintain normal operation at a reasonable cost and effort |
| | 150 | * Prevention and response |
| | 151 | |
| | 152 | Problems at many sites run by Quattor because a lot of unnecessary stuff is installed on WNs |
| | 153 | * E.g. firefox: lot of severe well known vulnerabilities with known exploits |
| | 154 | * Other examples: Samba, Xorg, KDE... |
| | 155 | * X Window was used to trigger a kernel exploit in one recent vulnerability |
| | 156 | |
| | 157 | Would be better not to install something which is not needed. |
| | 158 | |
| | 159 | Also saw some sites claiming to have upgraded but in fact still running the old kernel. |
| | 160 | |
| | 161 | |
| | 162 | === Cruft Removal - J. Adams === |
| | 163 | |
| | 164 | Work started after realizing that mrtg is running on WNs... |
| | 165 | |
| | 166 | Started to remove the not so useful stuff |
| | 167 | * pkg_del |
| | 168 | * Removal of the inclusion of some RH groups |
| | 169 | |
| | 170 | But immediately got some complaints about missing favorite admin tools and editors |
| | 171 | * Also a few grid bits broken |
| | 172 | |
| | 173 | Started to turn into a nightmare: decided to improve checkdeps to produce .dot files showing dependencies. |
| | 174 | * Result: 223 packages removed, 29 running processes gone, hundreds of MB freed in RAM and on disk |
| | 175 | * Goal should be only to have 'core' group and add explicitly the required bits |
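For readers unfamiliar with the format, a .dot file is a plain-text Graphviz graph description. The sketch below shows how a dependency map could be rendered as one. It is not the checkdeps implementation: the function name `deps_to_dot` and the sample dependency data are made up for illustration.

```python
def deps_to_dot(deps, graph_name="deps"):
    """Render a {package: [required packages]} mapping as Graphviz DOT text."""
    lines = [f"digraph {graph_name} {{"]
    for package in sorted(deps):
        # Declare the node so packages without requirements still appear.
        lines.append(f'    "{package}";')
        for required in sorted(deps[package]):
            lines.append(f'    "{package}" -> "{required}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical dependency data, for illustration only.
sample_deps = {"mrtg": ["gd", "perl"], "gd": ["libpng"]}
print(deps_to_dot(sample_deps))
```

The output can be turned into an image with Graphviz, e.g. `dot -Tpng deps.dot -o deps.png`, which makes it easier to see what pulls in an unwanted package.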
| | 176 | |
| | 177 | |
| | 178 | === QWG Errata Framework - I. Collier === |
| | 179 | |
| | 180 | See [http://indico.cern.ch/conferenceTimeTable.py?confId=105169#20101011.detailed slides]. |
| | 181 | |
| | 182 | |
| | 183 | === GRIF Approach to Errata Deployment - G. Philippon === |
| | 184 | |
| | 185 | GRIF would like to move to scheduled deployment of errata once a month. |
| | 186 | * Only SL security errata (no fastbug) |
| | 187 | * No kernel update except if there is a critical kernel vulnerability to avoid complex/disruptive reboots |
| | 188 | * Kernel updates controlled by each GRIF site admins |
| | 189 | |
| | 190 | In case of a critical vulnerability, a specific out-of-schedule errata is produced. |
| | 191 | |
| | 192 | Deployment strategy |
| | 193 | * First deployment on a test cluster representing the most common configurations to fix main problems |
| | 194 | * Then deployment on production clusters, under control of GRIF site admins |
| | 195 | * NODE_OS_ERRATA_TEMPLATE used to force a machine to stay with its current errata level |
| | 196 | * Cumbersome to maintain, some GRIF-specific scripts to help |
| | 197 | |
| | 198 | Main issues are changes in RPM version, dependencies, arch |
| | 199 | * arch-specific to noarch |
| | 200 | * RPM split into -common and -libs |
| | 201 | * RPM name change |
| | 202 | * Problem with algorithm used to guess the most recent version: 4.6 considered more recent than 4.7 |
| | 203 | |
| | 204 | Useful companion tools |
| | 205 | * Pakiti: easy to see problems with undeployed/misdeployed errata |
| | 206 | * Nagios: specific probe to detect SPMA problems |
| | 207 | |
| | 208 | |
| | 209 | == Quattor Software Process == |
| | 210 | |
| | 211 | === Build Tools - C. Loomis === |
| | 212 | |
| | 213 | Reasons to replace Quattor Build Tools |
| | 214 | * Broken |
| | 215 | * Incredibly complex |
| | 216 | * Linux dependent |
| | 217 | * No maintainer |
| | 218 | |
| | 219 | Why Maven? |
| | 220 | * Portable, open to non-Java languages |
| | 221 | * Clean, standard build process |
| | 222 | * Integrated mechanisms for release management |
| | 223 | * All build information in a single file |
| | 224 | |
| | 225 | First tests done with a few components and looked reasonably easy |
| | 226 | |
| | 227 | Roadmap |
| | 228 | 1. Components: may need some adjustments in configuration |
| | 229 | 1. Primary tools: in particular NCM related stuff |
| | 230 | 1. Other tools |
| | 231 | |
| | 232 | Updated build for configuration components exists and is ready to be applied to all components |
| | 233 | * Lots of changes required but most of them can be automated |
| | 234 | * Current features |
| | 235 | * Only build functionality, no tagging, code update... |
| | 236 | * Creates RPMs for Quattor clients but only with the Perl modules |
| | 237 | * Tarball with pan configuration and documentation |
| | 238 | * Configuration component archetype |
| | 239 | * Example component with simple command |
| | 240 | |
| | 241 | Basically 3 commands: |
| | 242 | * mvn clean: clean up workspace |
| | 243 | * mvn install: build and store locally |
| | 244 | * mvn deploy: build and deploy |
| | 245 | |
| | 246 | Build features: |
| | 247 | * Substitutes values in source files but no need for specific extensions |
| | 248 | * Checks pan language syntax |
| | 249 | * Creates RPMs and tarballs |
| | 250 | |
| | 251 | Still to be done |
| | 252 | * Script for automated conversion |
| | 253 | * Finish documentation on website |
| | 254 | * Apply to all existing components |
| | 255 | * Verify conversion/update of components |
| | 256 | * Determine how to integrate with continuous integration |
| | 257 | |
| | 258 | Maven requirements |
| | 259 | * Java + Maven core (jar file) |
| | 260 | * Or Eclipse plugin |
| | 261 | |
| | 262 | === Discussion === |
| | 263 | |
| | 264 | Components |
| | 265 | * Review which ones are needed and/or actively maintained |
| | 266 | * Maven migration may be an occasion to contact official maintainers, discuss problems during monthly meetings |
| | 267 | * Ensure there is a reference person for every critical/important component |
| | 268 | |
| | 269 | Quattor releases: not a big deal for people already using Quattor but important for new users |
| | 270 | * Maven should help as it makes it easy to maintain a list of what is considered production |
| | 271 | * Component maintainers will keep the control/responsibility of updating the list |
| | 272 | |
| | 273 | SVN repository usage |
| | 274 | * Encourage developers to use Git as their SVN client to commit their work and restrict trunk to reasonably good stuff |
| | 275 | * trunk should be able to build successfully at every revision |
| | 276 | * May be enforced by rebuilding trunk every night |
| | 277 | * tag management: delay decision until we have some experience with Maven and possible workflow for tagging new versions |
| | 278 | |
| | 279 | |
| | 280 | == Site Experiences == |
| | 281 | |
| | 282 | === NIKHEF Migration to QWG - R. Starink === |
| | 283 | |
| | 284 | Historically, NIKHEF started with pre-QWG templates and tried to remain as compatible as possible, but with a lot of duplicated effort |
| | 285 | * Some specific requirements, in particular wants to stay with YAIM |
| | 286 | |
| | 287 | Approach for migration |
| | 288 | * Use 1 SCDB with duplicate namespaces |
| | 289 | * Migration per cluster |
| | 290 | * Focus on host layout and services: generic hosts, non-gLite hosts, gLite hosts |
| | 291 | * Hit dirty workarounds |
| | 292 | |
| | 293 | First impressions |
| | 294 | 1. Lots of old stuff in our CDB: clean up before migrating |
| | 295 | 1. gLite everywhere in QWG examples: in fact missed core machine type |
| | 296 | 1. Various services not in QWG: authconfig, audit, psacct... |
| | 297 | |
| | 298 | Decided to keep the current node layout/machine types |
| | 299 | |
| | 300 | Kernel errata management: intentions ok but implementation/documentation not consistent across all OS versions |
| | 301 | |
| | 302 | File system configuration: AII magic too complicated or risky |
| | 303 | |
| | 304 | Node cloning: looks too complicated in QWG |
| | 305 | |
| | 306 | User management via central LDAP |
| | 307 | |
| | 308 | Hardware description: machine classes rather than individual machines |
| | 309 | |
| | 310 | Migration done after 4 weeks without disrupting the site and without complaint from other admins. |
| | 311 | * Too early to say about real benefits |
| | 312 | * Now in a position to contribute to QWG but need to better understand how specific a contribution must be |
| | 313 | * In particular may contribute to better consistency in QWG templates |
| | 314 | * "One size fits all" won't work... but not a reason not to collaborate |
| | 315 | |
| | 316 | === SINDES - J. Dudziek === |
| | 317 | |
| | 318 | SINDES main purposes: |
| | 319 | * CA : manage the certificates, confirm identities, create/revoke certificates |
| | 320 | * Generated certificate intended to be used only for securing communication: different from the service certificates |
| | 321 | * Notion of time windows during which a given client can request a new certificate |
| | 322 | * Storage centre for secret files, passwords... |
| | 323 | * Deliver them in a secure way |
| | 324 | |
| | 325 | Based on Apache, openssl, mod_rewrite |
| | 326 | |
| | 327 | Currently in use at CERN and serving 8000 hosts |
| | 328 | * Several applications relying on it |
| | 329 | |
| | 330 | Weaknesses |
| | 331 | * No feature to delete files |
| | 332 | * Only 2 target types: host and cluster. Subclusters needed |
| | 333 | * No easy way to move a machine from a cluster to another one |
| | 334 | * No possibility to view files |
| | 335 | * No file versioning |
| | 336 | |
| | 337 | Possibility for improvements |
| | 338 | * Enhance the current implementation |
| | 339 | * Provide the same features based on an existing product, e.g. wallet |
| | 340 | * Manpower available: 1 year of technical student |
| | 341 | |
| | 342 | |
| | 343 | === SINDES at RAL - J. Adams === |
| | 344 | |
| | 345 | Background: desire to put passwords in templates but plain http serving not very appropriate |
| | 346 | * Anyone can access a machine profile |
| | 347 | * Every node can access another node profile |
| | 348 | |
| | 349 | SINDES used only to deliver the certificates |
| | 350 | * File store has been disabled |
| | 351 | * Information that could be in the file store is put in the profile then transferred securely |
| | 352 | * Integration with AII through hooks (from BEGrid) |
| | 353 | |
| | 354 | Used to secure transmission of profiles but does not secure template files in the repository |
| | 355 | * Assumption that users accessing the repository can be trusted |
| | 356 | |
| | 357 | Problems |
| | 358 | * SINDES version used for SLC4 only: required a lot of effort to port to SL5 |
| | 359 | * In fact CERN has a SL5 version... but no official distribution point |
| | 360 | * Documentation: the most useful is BEGrid documentation |
| | 361 | |
| | 362 | Question: would CERN accept that we import it to quattor.org for easier use by Quattor sites and better integration with Quattor? |
| | 363 | * May become a standard component of a Quattor server |
| | 364 | |
| | 365 | |
| | 366 | === BEGrid Experience and Questions - D. Durvaux / S. Rugovac === |
| | 367 | |
| | 368 | BEGrid: several partners around BELNET |
| | 369 | * Need to reengineer current SCDB structure: looking for some input |
| | 370 | |
| | 371 | Current configuration based on 2-tier infrastructure |
| | 372 | * 1 central national server running SCDB |
| | 373 | * 1 site server per site which is a SCDB client (doing a checkout): no ability to commit |
| | 374 | * `runcheck` script on each site server doing replacement in central configuration for site-specific parts and handling/triggering the deployment for the site |
| | 375 | |
| | 376 | Problems: |
| | 377 | * Quattor out-of-sync with the community: way to use it, QWG templates, OS/errata |
| | 378 | * BELNET not yet skilled enough with Quattor to take over the coordination responsibility: still relying on IIHE team |
| | 379 | * SINDES support |
| | 380 | * dCache support: no other Quattor site using it? Relying on very old templates |
| | 381 | |
| | 382 | Possible solutions envisioned |
| | 383 | * Refactoring of SVN structure with SVN externals |
| | 384 | * SWrep replacement: is http-based repository suitable? |
| | 385 | |
| | 386 | Discussion |
| | 387 | * Need to clarify the workflow |
| | 388 | * In particular how much control the central team has over the sites: ability to trigger deployment... |
| | 389 | * Centrally-triggered deployment requires nothing more than ability to write the site tags/ branch |
| | 390 | * 1 specific branch per site (with its trunk/, tags/ structure) and an svn:externals reference to the central server |
| | 391 | * An option is one SVN server per site |
| | 392 | * External reference may or may not be pinned to a fixed revision |
| | 393 | |
| | 394 | |
| | 395 | == Monitoring Support == |
| | 396 | |
| | 397 | === Future Changes in QWG Nagios Templates - R. Starink === |
| | 398 | |
| | 399 | Problems found in standard templates by NIKHEF |
| | 400 | * `monitoring/nagios/config` |
| | 401 | * Location of some RPM includes (minor) |
| | 402 | * Host list derived from HW database: a problem with a master/slave Nagios config, one function needlessly complicated |
| | 403 | * `monitoring/nagios/command`: huge list of command definitions |
| | 404 | * Should be split into a common part and a site-specific part |
| | 405 | * Some configuration variables have meaningless defaults rather than triggering an error if undefined |
| | 406 | * pnp4nagios configuration missing: general setup easy to add, keep it optional |
| | 407 | * Currently requires modification (1 line) of service.tpl |
| | 408 | |
| | 409 | Hierarchy of Nagios servers: slaves collect the information, master runs the web interface |
| | 410 | * Configuration of host list is different on slave and master: a problem for the current automatic determination of host list in QWG templates |
| | 411 | * Current NIKHEF implementation based on hosts and services groups |
| | 412 | * Grouping done by cluster |
| | 413 | * Master is configured with everything |
| | 414 | |
| | 415 | Grid monitoring based on EGEE/EGI work |
| | 416 | * Using YAIM |
| | 417 | |
| | 418 | Discussion |
| | 419 | * Build an RPM for probes related to monitoring Quattor activity |
| | 420 | * Probe sources should be put and built in the SF repository as any usual Quattor component |
| | 421 | |
| | 422 | == Build Awareness (and knowledge) of Quattor - D. O'Callaghan == |
| | 423 | |
| | 424 | We need to promote Quattor to: |
| | 425 | * have more users |
| | 426 | * have more contributors |
| | 427 | * to improve Quattor |
| | 428 | |
| | 429 | This requires making Quattor easier to discover and the community easier to join. |
| | 430 | |
| | 431 | Functionality / complexity must be taken into account |
| | 432 | * Quattor's narrow OS support counts against it |
| | 433 | * Pan language is powerful |
| | 434 | * Complexity of creating a new configuration component and bringing it to the community |
| | 435 | |
| | 436 | User-facing website: content is good but several problems |
| | 437 | * Server certificate check and user-certificate check |
| | 438 | * Server certificate should be from a well-known CA |
| | 439 | * Should not require a user certificate |
| | 440 | * Search results on Trac are not focused enough, e.g. `tutorial`. Should not return results about Trac documentation |
| | 441 | * Too much hierarchy exposed on the first page |
| | 442 | * `quattor.org` doesn't appear in Google |
| | 443 | |
| | 444 | Aims for user website |
| | 445 | * Create a user landing page outside Trac |
| | 446 | * Resolve the security issues |
| | 447 | * Better internal search results |
| | 448 | * Better external search engine results |
| | 449 | * Links on sites "powered by Quattor" |
| | 450 | |
| | 451 | Worth looking at how the competition does things: Puppet, Chef |
| | 452 | * Document differences |
| | 453 | * Document how Quattor can be used in these other environments |
| | 454 | * In particular for OS configuration and initial installation |
| | 455 | |
| | 456 | Marketing Pan outside the Quattor community and separately the other parts of Quattor framework |
| | 457 | |
| | 458 | Need to check Quattor description on some well-known places: Wikipedia, freshmeat, Ohloh... |
| | 459 | * Fix wikipedia based on LISA paper introduction |
| | 460 | |
| | 461 | David agrees to spend some effort on the new landing page before the next monthly meeting. |
| | 462 | * Landing page to be hosted on SF website |
| | 463 | * Try to define a stylesheet that could be reused by other pages, in particular those generated by Maven |
| | 464 | |
| | 465 | == Virtualisation Support - Cal == |
| | 466 | |
| | 467 | Quattor survey showed 80% of sites use some virtualization |
| | 468 | * Easy integration as PXE booting supported by all hypervisors |
| | 469 | * S. Childs provided configuration component for Xen with some support in QWG |
| | 470 | |
| | 471 | StratusLab contributions |
| | 472 | * 2 new configuration components: source currently in StratusLab repository and will remain there during the duration of the project |
| | 473 | * ncm-libvirtd: configure and control libvirt |
| | 474 | * ncm-oned: configure and control OpenNebula |
| | 475 | * Quattor configuration of a cloud |
| | 476 | * Configures OpenNebula, incl. NFS mounts |
| | 477 | * Configures private and public network bridges |
| | 478 | * HTTPS proxy for OpenNebula XMLRPC server (which is plain http) |
| | 479 | * Ganglia for rudimentary monitoring of cloud infrastructure |
| | 480 | * 2 manual configuration steps required |
| | 481 | * Definition/addition of hosts in the cloud: in the future may be done by a cron/service with SINDES |
| | 482 | * NFS mounts verification: currently static mounts, should be solved using autofs |
| | 483 | |
| | 484 | Contextualization |
| | 485 | * Networking done by DHCP |
| | 486 | * Files and parameters passed through a disk (ISO image) mounted on /dev/hdc |
| | 487 | * Provide a nice way of handling credentials: the disk is not exposed to other VMs |
| | 488 | * Initialization script on disk run through rc.local |
| | 489 | |
| | 490 | Virtualization provides a cleaner implementation of profile cloning for WNs... |
| | 491 | * One reference WN compilation, without node specific information |
| | 492 | * Node specific information passed in through contextualization |
| | 493 | * One profile per group of machines |
| | 494 | * Compilation time scales with number of groups |
| | 495 | |
| | 496 | ... but relies on an external tool (VM manager) to handle the deployment |
| | 497 | * Requires a mechanism to specify state of the fabric |
| | 498 | * Still requires profile cloning for efficient management of hypervisor machines |
| | 499 | |
| | 500 | Quattor is a good appliance generator, offering bookkeeping to track VM configuration, save state information and regenerate images if necessary |
| | 501 | * Would like to be able to interface with virt-install for automated image generation and deployment |
| | 502 | * Implementation could be a service listening for profile changes and running virt-install |
| | 503 | * Deployment may use AII hooks |
| | 504 | |
| | 505 | StratusLab developments will be publicly available on Nov. 2nd. |
| | 506 | |
| | 507 | |
| | 508 | == Actions == |
| | 509 | |
| | 510 | QuattorFS |
| | 511 | * Backport to Python 2.4 if possible/easy (James?) |
| | 512 | |
| | 513 | SINDES |
| | 514 | * Check with CERN agreement to import it in SF repository (Véronique) |
| | 515 | * Check licensing (Véronique) |
| | 516 | * Integration into Quattor server configuration (RAL?) |
| | 517 | |
| | 518 | Web site |
| | 519 | * Landing page for quattor.org on SF (David) |
| | 520 | * Develop a stylesheet that could be reused by pages generated from Maven |
| | 521 | * Fix Trac server certificate CA (Michel) |
| | 522 | * Remove Trac request for a user certificate for anonymous access (Michel) |
| | 523 | * Enable Trac indexing (Michel) |
| | 524 | * Fix navigation menu behaviour: discuss by email what we want to implement, then implement it (Michel) |
| | 525 | |
| | 526 | QWG |
| | 527 | * Organize initial development meeting based on today's proposal (Michel) |
| | 528 | * Discuss namespace improvements as suggested by Nic (email discussion + decision at monthly meeting) |
| | 529 | * standard/hardware/... including a vendor directory |
| | 530 | * vendor/version-arch or vendor/version/arch for OS templates |
| | 531 | |
| | 532 | Monitoring |
| | 533 | * Implement NIKHEF suggestions to add flexibility and support hierarchy of Nagios servers (Ronald?) |
| | 534 | * Collect existing Nagios probes related to Quattor activity monitoring, put them in SF and package them as a RPM |
| | 535 | |
| | 536 | Documentation |
| | 537 | * Better integration of former MediaWiki content into existing section, remove duplicates |
| | 538 | * Update SINDES related documentation, improve based on BEGrid wiki and RAL experience |
| | 539 | * Clarify or add missing material to answer Ronald's questions after his QWG migration experience |
| | 540 | * Implement changes based on Andrea's review |
| | 541 | * Fix/improve Quattor description in Wikipedia and Freshmeat |
| | 542 | * Ensure Quattor is referenced on the appropriate open-source or software project portals |
| | 543 | |
| | 544 | == Wrap-up == |
| | 545 | |
| | 546 | |