Changes between Version 5 and Version 6 of Meetings/Workshops/20081027


Timestamp:
Oct 28, 2008, 6:13:01 PM (17 years ago)
Author:
/O=GRID-FR/C=FR/O=CNRS/OU=LAL/CN=Michel Jouvin
Comment:

--

  • Meetings/Workshops/20081027

    v5 v6  
    1 = Quattor Workshop - NIKHEF - 27-29/08 =
    2 [[TracNav]]
    3 
    4 [[TOC(inline)]]
    5 
    6 [http://indico.cern.ch/conferenceDisplay.py?confId=40056 Agenda].
    7 
     1= Quattor Workshop - NIKHEF - 27-29/10/08 =
     2[[TracNav]]
     3
     4[http://indico.cern.ch/conferenceTimeTable.py?confId=40056 Agenda].
     5
     6== Introduction - S. Childs ==
     7
     8Quattor survey: 48 answers, 27 sites using Quattor, 14 countries
     9 * Still strong use of CDB: as widely used as SCDB
     10
     11[http://indico.cern.ch/materialDisplay.py?contribId=10&sessionId=4&materialId=slides&confId=40056 Slides].
     12
     13Evaluation of Quattor community:
     14 * Strengths:
     15   * Open and supportive community
     16   * Improving integration with related tools (monitoring, VMs)
     17   * Large suite of configuration components (254)
     18   * Well defined language model
     19   * Good support for gLite
     20 * Weaknesses:
     21    * Lack of effort for release management
     22    * No automated build and test
     23    * Lack of documentation, especially quick-start style
     24    * Difficult to contribute (access to repositories)
     25    * Legacy architecture makes writing components difficult
     26    * No official support for gLite configuration with Quattor
     27    * Hard to bootstrap because of the many external dependencies
     28    * Level of knowledge still low among prospective "customers"
     29   
     30Need to focus effort on completing migration from grid, CERN-centered usage model to a generic open source project.
     31 * More information on documentation and packaging
     32 * Quattor appliance for site bootstrap
     33 * Dissemination: several things done recently (EGEE, LISA, HEPiX). More to do: SC, Linux sysadmins...
     34 * Seek funding for dedicated development efforts ? High overhead but may be worth it
     35 
     36Goals for the workshop:
     37 * Finalize structures for opening up community collaboration: accelerate move to SourceForge, ensure access rights are easy to manage
     38 * Coordinate tool development: centralise and catalogue existing tools
     39 * Decide on sustainable build and release procedures, as automated as possible, not dependent on one person
     40 
    841== Site Reports ==
    942
    10 == Actions ==
    11 === Stephen Childs ===
    12  * Check in AII web interface
    13  * Integrate Grid-Ireland site/cluster hierarchy for SCDB
    14  * Make sure TCD monitoring additions are checked in to QWG and documented (in particular conversion of lists of machine classes to nagios hostgroups)
    15 
    16 === Morgan Stanley ===
    17  * Check in patches for AII (via Luis)
    18  * Check in patches to components (once SF migration complete)
    19 
    20 === All ===
    21  * Review component writing howto -- recommend gettree method
     43=== GRIF - M. Jouvin ===
     44
     45[http://indico.cern.ch/materialDisplay.py?contribId=2&sessionId=4&materialId=paper&confId=40056 Slides].
     46
     47=== UAM - L. Munoz ===
     48
     49Using QWG + SCDB.
     50
     51Potential manpower problem: Luis applied for a fellowship at CERN.
     52
     53Documentation: would like a tool for automatic extraction of documentation from PAN code
     54 * Doxygen plugin for PAN
     55 
     56Package lists are much too long and sometimes cannot be handled at installation time.
     57 * Currently 1200 packages on a 64-bit machine, was able to reduce it to 411 on a WN
     58 
     59SCDB at UAM configured to use svn:externals: great for updates but a problem if LAL is down. Will be worse if moving to SourceForge.
     60
     61We'd like to implement new ant targets without modifying build.xml.
     62
     63Would like to implement a staging strategy for deploying changes: fits better with a distributed model.
     64 * How specific is AQDB from Morgan Stanley ?
     65 * Is CDB over Git or Mercurial a project worth the effort ?
     66 
     67
     68=== CERN - V. Lefébure ===
     69
     70[http://indico.cern.ch/materialDisplay.py?contribId=5&sessionId=4&materialId=slides&confId=40056 Slides].
     71
     72Main instances: 6300 profiles, 140 clusters
     73 * Less than in March 08 because of retired HW
     74 * panc: still using panc v6
     75 * Cake now multithreaded
     76 * Starting to use namespaces in SL5 templates
     77 * SPMA and SWrep: starting to have VO-dedicated SWrep, still not enforcing RPM signing
     78 * CDB2SQL: rewrite almost complete
     79 * CCM: use SSL-based transfer, with fallback to http:
     80 * More and more CDB users: ACL working well but management is an issue
     81 * Update to Quattor 1.3 templates well under way
     82 
     83Xen-based virtualisation being taken over by Ewan Roche
     84
     85
     86=== CNAF - A. Chierici ===
     87
     88Quattor 1.3 templates (namespace): still a lot to be adapted to new schema:
     89 * Conversion going on smoothly, with help from ME. Storage people very happy.
     90 * Imported templates for configuring ncm-yaim from NIKHEF
     91 * CEs on a 64-bit OS using Michel's tricks: improved the situation a lot.
     92 * WN 64-bit tested and soon in production
     93 
     94Xen used on LHCb-T2 and a few nodes in T1
     95 * Dom0 managed by hand
     96
     97Building RPM lists: tried checkdeps.py but failed because of the YUM version required.
     98  * Currently using a node in a VM to install metapackage and dependencies and produce the RPM list
     99
     100CDB vs. SCDB: thinking about migrating, to take advantage of advanced logging features. Also concerned about CDB support after ME leaves.
     101 * No plan to use QWG at the beginning
     102 * CNAF is very short on staff members and temporary contracts will be reduced
     103
     104=== Philips - S. ===
     105
     1062 instances based on SCDB + QWG:
     107 * testgrid: 6 nodes, based on VMs
     108 * biggrid: 206 nodes, including a SE with 3 disk servers
     109 
     110Future:
     111 * panc v8
     112 * Dummy template for WNs
     113 * Nagios integration
     114 * Xen to run biggrid services
     115 
     116
     117=== Grid Ireland - S. Childs ===
     118
     119[http://indico.cern.ch/materialDisplay.py?contribId=3&sessionId=4&materialId=slides&confId=40056 Slides].
     120
     1211 SCDB for 22 sites
     122 * 17 externals, 5 internals
     123 * 7 site admins, 5.4 average commits per day
     124 * 450 managed machines
     125 * 20 Quattor servers
     126 
     127Monitoring expanded: Nagios deployed through Quattor
     128 * Local sensors + read in EGEE SAMs
     129 * LEMON errors fed to Nagios
     130 
     131panc v8 now in use: logging features already invaluable
     132
     133Configured iSCSI+RAID storage (aggregating WN disks).
     134
     135Issues and future work:
     136 * Would like equivalent to WN speedup integrated into the compiler as dummy workaround is too fragile
     137 * Dependency checker now incompatible with our SL4 server
     138 * Get network, filesystems, IPMI under Quattor control
     139 * Configuring cross-site redundancy for core services using drbd+Xen
     140 * Get all TCD admins actively contributing to Quattor community
     141 
     142
     143=== NIKHEF - R. Starink ===
     144 
     145[http://indico.cern.ch/materialDisplay.py?contribId=0&sessionId=4&materialId=slides&confId=40056 Slides].
     146
     147Now using SCDB with some local modifications, with SCDB inside Subversion
     148 * Still using panc 7.2.9, upgrade planned Dec./Jan.
     149 
     150AII: added new feature to give Anaconda literal definition of filesystems
     151 * blockdevices and filesystems seen as too risky for SEs
     152 
     153Cloning WN definitions for speeding up but issue with dependencies when recompiling
     154
     155Redundant repository servers with DNS load balancing
     156
     157Projects:
     158 * Virtualization: not yet based on QWG but considering moving. Want to work on management: where is my VM running, what should host X do ? See QuatView talk.
     159 * Monitoring via Nagios: client side already done by Quattor, servers still manually configured but considering QWG.
     160   * NCG: will try YAIM
     161 * Planning OS setup via QWG
     162
     163Issues:
     164 * SW updates:
     165   * Kernel: install only a limited number of updates
     166   * gLite: update repository for WN separate from other node types
     167 * SPMA pb with RPM file/pkg name mismatch
     168 * PAN schema: would like a property describing node function (CE, WN...)
     169 * ncm-xen: support for relocatable domUs
     170 * ncm-yaim: remove siteinfo.def on failure
     171
     172
     173=== LAPP - E. Fede ===
     174
     175SCDB + QWG templates used for 2 years for grid resources and recently also for internal systems.
     176 * Quattor runs in a VM for easier backup and restore, but at the price of performance: 3 min for 100 templates.
     177 * 4 people involved in Quattor management
     178 * Management of GPFS nodes with Quattor: selection of appropriate RPMs based on kernel version, configuration of SSH keys...
     179
     180Wish: web interface for AII
     181
     182
     183=== GRNET/AUTH - C. Triantafyllidis ===
     184
     185Since Nov. 2007, using SCDB + QWG templates
     186 * Started during EDG with LCFG and LCFGng
     187 * First look at Quattor in 2006 and first attempt to use it mid-2007 with CDB + YAIM
     188 * Remaining issue: installing a new node type is still not trivial but site is now consistent
     189   * Troubleshooting errors often involved comparing with a node installed without Quattor
     190 
     191Synchronization with LCG QWG repository still a painful process
     192 * Introduced a cfg/local to hold all local modifications common to both sites managed from SCDB and also to redefine machine-types
     193   * Missed that standard machine-types could be customized without duplicating/redefining
     194 * gLite updates are still not a straightforward process: need testing nodes
     195 
     196Components: several new components written (Pakiti, hydraclient, hydra).
     197 * Ganglia component in progress
     198 * A few component patches (krb5client)
     199
     200Working on a web interface for management providing a web-based wizard for creating profiles from templates, creating new hardware templates and updating RPMs.
     201
     202Current status in Greece: half of sites using Quattor for initial installation, only 2 sites for the full management
     203 * Making some effort to get other sites to use it for full site management too
     204
     205
     206=== Morgan Stanley - N. Williams ===
     207
     208Currently moving 6500 nodes under the new ''Aquilon'' system based on Quattor
     209 * Already a few hundred
     210 * 1 Quattor server, 6 boot servers (DHCP+TFTP), 14 admins among which 6 template admins
     211 * Monitoring with LEMON + perfdata: LEMON used for aggregated views
     212 
     213Aquilon vs. QWG templates: Aquilon is based on a relational model where a node is associated with a role
     214 * Plenary templates: templates that provide data to the other templates. Typically generated on the fly from an external source.
     215 * Template library: what configuration is required to provide specific behaviours, implementing configuration policies given the input plenary data.
     216   * Only the template library is versioned.
     217   * Modeled after QWG ideas but with a different layout: every group of templates grouped by ''archetype'' (one archetype is ''qwg'', others are ''aquilon'', ''service'', ''hardware'', ''pan'', implemented as a pan namespace).
     218   * Own standard schema to add M&S-specific information: location details, events/actions, archetype, personality, release and version allowed for components...
     219   * Distributed services modeled with several instances for redundancy, scalability. Instances of each service are described in the ''service'' archetype. One specific node uses one specific instance. Personality describes the services required but not the instance (selected using plenary template information).
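
The split between personality (which services are required) and plenary data (which instance serves a given host) can be illustrated with a small Python sketch. This is not Aquilon code: the class name, the `PLENARY_INSTANCES` table and the idea of keying instances by location are assumptions made only for illustration; the notes say no more than that the instance is selected using plenary template information.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Personality:
    """What a host needs: a list of services, never specific instances."""
    name: str
    required_services: Tuple[str, ...]


# Hypothetical plenary data (would be generated from an external source).
# Keying instances by (service, location) is an assumption for illustration.
PLENARY_INSTANCES: Dict[Tuple[str, str], str] = {
    ("dns", "london"): "dns-ln-1",
    ("dns", "new-york"): "dns-ny-1",
    ("ntp", "london"): "ntp-ln-1",
    ("ntp", "new-york"): "ntp-ny-1",
}


def bind_instances(personality, location, instances=PLENARY_INSTANCES):
    """Return {service: instance} for one host at the given location."""
    bindings = {}
    for service in personality.required_services:
        key = (service, location)
        if key not in instances:
            raise LookupError(
                f"no instance of service {service!r} for location {location!r}"
            )
        bindings[service] = instances[key]
    return bindings


if __name__ == "__main__":
    web = Personality("web-frontend", ("dns", "ntp"))
    print(bind_instances(web, "london"))
```

The personality never changes when an instance is added or moved; only the plenary table does, which matches the idea of versioning only the template library.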
     220
     221Aquilon vs. CDB: decided not to use CDB but instead roll our own
     222 * Replacing an aging asset database, integrating with legacy dbs.
     223 * Based on Oracle
     224 * Use AQD broker (written in Python) that mediates Aquilon commands to Oracle
     225 * Entitlement done in AQD broker
     226 
     227Sandboxes used to implement staged deployment
     228 * Hosts are associated with 1 sandbox and can be moved from one to another without any config change
     229 * A sandbox implemented as a SCM branch: SCM used is Git
     230 * Production templates cannot be edited but can only take merges from sandboxes
     231 * Still to come: unit testing of sandboxes, better cherry-picking tools and methodology (based on Git hooks)
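
The sandbox rules listed above (one sandbox per host, moving a host changes only its association, production accepts merges only) can be modeled in a few lines of Python. This is a toy in-memory model, not Git and not the AQD broker; the class and method names are hypothetical.

```python
class Deployment:
    """Toy model: each host follows exactly one branch (sandbox or production)."""

    PRODUCTION = "prod"

    def __init__(self):
        self.branches = {self.PRODUCTION: []}  # branch name -> list of change ids
        self.host_branch = {}                  # host name -> branch name

    def create_sandbox(self, name):
        # A sandbox starts as a copy of production, like a new SCM branch.
        self.branches[name] = list(self.branches[self.PRODUCTION])

    def commit(self, branch, change):
        # Production templates cannot be edited directly.
        if branch == self.PRODUCTION:
            raise PermissionError("production only accepts merges from sandboxes")
        self.branches[branch].append(change)

    def merge_to_production(self, sandbox):
        prod = self.branches[self.PRODUCTION]
        prod.extend(c for c in self.branches[sandbox] if c not in prod)

    def move_host(self, host, branch):
        if branch not in self.branches:
            raise KeyError(f"unknown branch {branch!r}")
        # Only the association changes; the host's own config is untouched.
        self.host_branch[host] = branch

    def config_for(self, host):
        return self.branches[self.host_branch[host]]


if __name__ == "__main__":
    d = Deployment()
    d.create_sandbox("alice")
    d.commit("alice", "tune-ntp")
    d.move_host("host1", "alice")
    print(d.config_for("host1"))
```

A host moved back to production sees the sandbox change only after `merge_to_production`, which is the staged-deployment behaviour described in the notes.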
     232 
     233Bootservers:
     234 * DHCP, SPMA proxies, Kickstart servers
     235 * CDB notification from AQD trigger aii-shellfe --notify: 2s per client to configure
     236 * AQ translates pxeswitch commands into aii-installfe requests to the appropriate servers
     237 * Just switched to AII v2 to take advantage of more flexible file system and block device definitions, but hit performance problems due to the plugin-based architecture. A few patches made, in particular to support selectable CCM database formats (will use CDB_file).
     238 
     239Statistics:
     240 * Infrastructure servers are 8 core, 2.5 GHz, 16 GB
     241 * PAN: 13 profiles per second compile, <3 min for 2K hosts, XML profiles are approx 230 KB
     242 * Reconfiguring bootservers: 1 client takes 2 s, reconfiguring 2K takes more than 1 h...
     243 * Installation: 700 RPMs, 1.8 GB, 15 minutes
     244 * 10% transient failures on installing because of ccm-fetch unable to get its lock
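
The compile and boot server figures quoted above can be cross-checked with simple arithmetic. The Python sketch below uses only the numbers from the notes; the function names are illustrative, and the boot server estimate assumes clients are reconfigured one after another.

```python
# Cross-check of the figures quoted in the statistics; names are illustrative.
HOSTS = 2000                     # "2K hosts"
COMPILE_RATE = 13                # profiles compiled per second
BOOTSERVER_SECS_PER_CLIENT = 2   # seconds to reconfigure one client


def compile_minutes(hosts, rate):
    """Time to compile all profiles, in minutes."""
    return hosts / rate / 60


def reconfigure_hours(hosts, secs_per_client):
    """Time to reconfigure all clients sequentially, in hours."""
    return hosts * secs_per_client / 3600


if __name__ == "__main__":
    print(f"compile: {compile_minutes(HOSTS, COMPILE_RATE):.1f} min")
    print(
        "bootserver reconfiguration: "
        f"{reconfigure_hours(HOSTS, BOOTSERVER_SECS_PER_CLIENT):.2f} h"
    )
```

Both results agree with the notes: compiling stays under 3 minutes, while sequential reconfiguration of 2K clients exceeds one hour.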
     245 
     246Wishes and remarks:
     247 * Boot servers don't need whole profile
     248 * ROCKS RPM download via BitTorrent
     249 * Components often have insufficient logging, not easy to get status of component configuration
     250 * Really want to participate in development: SourceForge will be a Really Good Thing
     251 * Subclassing components allows for replicated functionality but better transactions
     252 * Entitlements in PAN
     253 * LEMON and SLS are "under-sold"
     254 
     255 
     256== Core Components ==
     257
     258=== CDB - M.E. Poleggi ===
     259
     260[http://indico.cern.ch/materialDisplay.py?contribId=12&sessionId=3&materialId=slides&confId=40056 Slides].
     261
     262Mainly busy with CNAF operations, not a lot of development
     263 * Mainly cdb-tpl-view and pangraph (ready for panc v8): not CDB specific
     264 * No progress on fine grain locking and common authentication framework
     265 
     266cdb/cdb-soap
     267 * Faster state management
     268 * Multi-threaded dependency calculation
     269 * Authentication supports locking
     270 
     271Wish list:
     272 * ACL restrictions on include
     273 * Expose some CVS features to end-user
     274 
     275panc v8: still some problems, investigating...
     276
     277Scalability of dependency calculation: requires a lot of operations and CPU, improved by moving to multi-threaded dependency calculation.
     278
     279
     280=== SCDB - M. Jouvin ===
     281
     282[http://indico.cern.ch/getFile.py/access?contribId=13&sessionId=3&resId=0&materialId=slides&confId=40056 Slides].
     283
     284
     285=== QWG - M. Jouvin ===
     286
     287[http://indico.cern.ch/getFile.py/access?contribId=13&sessionId=3&resId=2&materialId=slides&confId=40056 Slides].
     288
     289
     290=== PAN Compiler - C. Loomis ===
     291
     292[http://indico.cern.ch/materialDisplay.py?contribId=14&sessionId=3&materialId=slides&confId=40056 Slides].
     293
     294Current version: 8.2.2
     295 * v8 is a major rewrite
     296 * 8.2.3 should be released next week
     297 * Significant language and implementation changes from v7
     298 * Known intermittent threading problem since v7: not yet identified
     299 * 7.2.9 is in maintenance: no new feature
     300 
     301Future plans:
     302 * Facilitate pan language editor and debugger
     303   * Making grammar changes for better support: save comments...
     304   * Internal refactoring to make such things easier: ability to put breakpoints...
     305 * Performance: continued optimization for generated code, reduce memory footprint via object sharing, understand speed differences between machines, JVMs...
     306   * SL4/SL5 difference: 2x slower
     307 * XInclude support: replace stream, fetc, embed...
     308
     309Language changes planned for v9 (not before next meeting..):
     310 * Remove deprecated syntax from v8
     311 * Authentication/authorization: still requires some discussion to find the appropriate solution (based on type extension) without performance degradation
     312 * Support for annotation (doxygen)
     313
     314
     315=== AII - L. Munoz ===
     316
     317[http://indico.cern.ch/materialDisplay.py?contribId=15&sessionId=3&materialId=slides&confId=40056 Slides].
     318
     319Lots of feedback and bug fixes in the last months.
     320
     321New hooks:
     322 * Anaconda hook
     323 * NBP hook
     324
     325AII runs in tainted mode: ill-formed profiles won't force AII to do wrong things
     326 * Still vulnerable to symlink attacks
     327
     328Known issues:
     329 * File system creation is slow: working on improvements for 1.4
     330 * aii-dhcp still around and doesn't fit v2's philosophy based on plugins: slows down aii-shellfe because it is doing DNS requests
     331   * Implement a ncm-dhcp component that can be used as a replacement ?
     332 * Reduce the number of profile downloads: currently one per AII command
     333 * Remove 'base_url not defined' message
     334
     335Dropping root privilege for AII
     336 * aii-shellfe running as user aii
     337 * /osinstall owned by aii
     338 * sudo for running aii-dhcp or ncm-dhcp component
     339 * Install CGI no longer running as root
     340
     341Document SINDES hooks or similar solutions to install certificates during installation
     342 * Link to BEGRID wiki
     343
     344
     345=== Other Modules - M.E. Poleggi ===
     346
     347[http://indico.cern.ch/materialDisplay.py?contribId=17&sessionId=3&materialId=slides&confId=40056 Slides]
     348
     349CAF:
     350 * Ability to log to syslog
     351 * Process.pm: wrapper over LC::Process with support for verbose run
     352 * FileWriter.pm: wrapper for file writing operations
     353 * Fixed Reporter.pm
     354
     355LC: updated to last version, should try to pick up changes more frequently
     356 * In fact it is currently maintained even though we don't have much contact with Lionel
     357
     358CCM: not unescape(), getUnescapedName(), fixed for taint mode
     359 * Still on todo list: de-privileged CCM execution, flag for disabling CCM updates
     360
     361cdispd:
     362 * Redirected output to avoid SELinux-related problems, but there is a problem in the implementation that needs to be fixed (Véronique)
     363 * Wish list: disable further updates by Quattor (inhibition flag using a property in profile)
     364
     365ncd:
     366 * NCM::Check extended with 'noaction' for overriding the global flag
     367 * NCD/* inheriting from CAF::ReporterMany
     368 * Sanitization to run in taint mode without warnings
     369 * Wish list: protect against malicious modifications of profile, for example signing it
     370
     371ncm-templates: no change
     372 * Todo list: review it and remove everything that was AII v1 specific
     373
     374SPMA/rpmt: ME reports a situation where RPM encounters a warning during scripts and SPMA returns success, even though running the same command manually returns rc<>0
     375 * To be investigated, not so clear...
     376
     377pangraph + cdb-tpl-view ready for namespaces and navigation via panc logging information
     378
     379New tools: checkdeps, QuatView...
     380
     381
     382== New Tools ==
     383
     384=== checkdeps.py - S. Childs ===
     385
     386[http://indico.cern.ch/materialDisplay.py?contribId=18&sessionId=3&materialId=slides&confId=40056 Slides].
     387
     388Goal: check dependency problems before deployment rather than after
     389 * Only approach to make it efficient and robust is to use package metadata, normally not available on the machine you compile on
     390 * When you start to write something, you realize you are rewriting YUM...
     391 
     392checkdeps concept:
     393 * Retrieve package list from node profile parsing XML file
     394 * Create YUM repository for each repository in profile
     395 * Instruct YUM to use only these repositories and to use checkdeps configuration
     396 * Create a transaction set for YUM with all packages in it
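
The first steps of this concept (read packages and repositories from a profile, then restrict YUM to those repositories) can be sketched in Python as follows. This is not the checkdeps.py code: the XML layout in `SAMPLE_PROFILE` is a simplified, hypothetical one and not the real Quattor profile schema, and the function names are invented for the example. Only standard yum configuration keys (`reposdir`, `name`, `baseurl`, `enabled`) are used.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified profile layout (NOT the real Quattor XML schema).
SAMPLE_PROFILE = """
<profile>
  <repository name="sl5_base" url="http://repo.example.org/sl5/base"/>
  <repository name="glite_wn" url="http://repo.example.org/glite/wn"/>
  <package name="openssl" version="0.9.8b-10" arch="x86_64"/>
  <package name="glite-WN" version="3.1.0-1" arch="noarch"/>
</profile>
"""


def read_profile(xml_text):
    """Return ({repo name: url}, [name-version.arch, ...]) from a profile."""
    root = ET.fromstring(xml_text)
    repos = {r.get("name"): r.get("url") for r in root.iter("repository")}
    packages = [
        f'{p.get("name")}-{p.get("version")}.{p.get("arch")}'
        for p in root.iter("package")
    ]
    return repos, packages


def yum_repo_config(repos):
    """Render a yum configuration that only knows the profile's repositories."""
    # reposdir=/dev/null keeps yum from also reading the system-wide .repo files.
    lines = ["[main]", "reposdir=/dev/null", ""]
    for name in sorted(repos):
        lines += [f"[{name}]", f"name={name}", f"baseurl={repos[name]}", "enabled=1", ""]
    return "\n".join(lines)


if __name__ == "__main__":
    repos, packages = read_profile(SAMPLE_PROFILE)
    print(yum_repo_config(repos))
    print(packages)
```

The resulting package list would then be handed to YUM as a single transaction; that last step is left out here because, as noted under the issues, YUM offers no stable API for it.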
     397
     398Issues:
     399 * YUM doesn't have a real API: have to run in debug mode and parse YUM output
     400
     401When it works it's great but hard to get working!
     402 * YUM parsing very version sensitive
     403 * Current version requires up-to-date createrepo on server (SL5+)
     404 * Even when up-to-date, can return misleading results...
     405
     406Conclusion:
     407 * This is definitely the way to go... almost there. Can get the result in a few seconds for a profile.
     408 * Parsing YUM output really doesn't work reliably: should consider adding "determineDeps" to YUM that returns the list of additional packages required as a Python structure
     409
     410
     411=== QuatView - T. Suerink (NIKHEF) ===
     412
     413[http://indico.cern.ch/materialDisplay.py?contribId=19&sessionId=3&materialId=slides&confId=40056 Slides].
     414
     415Web-based display of information from profiles (OS, MAC address...) for every node.
     416 * SCDB compatible
     417 * Flexible, one config file
     418 * Profiles parsed into a SQL db: database updated with a script that can run as a cron job
     419   * QuatView backend may be merged with CDB2SQL
     420   * For simple reports could produce an XML file rather than a database
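
The "profiles parsed into a SQL db" step can be illustrated with a short Python sketch using SQLite. This is not QuatView's code or schema: the table layout, the function names and the input dictionaries are assumptions for illustration. The update is written to be idempotent so that it could be re-run from cron.

```python
import sqlite3


def update_inventory(conn, nodes):
    """Insert or refresh one row per node; safe to run repeatedly (e.g. from cron)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS node ("
        "hostname TEXT PRIMARY KEY, os TEXT, mac TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO node (hostname, os, mac) "
        "VALUES (:hostname, :os, :mac)",
        nodes,
    )
    conn.commit()


def nodes_by_os(conn, os_name):
    """Return the sorted hostnames running the given OS."""
    rows = conn.execute(
        "SELECT hostname FROM node WHERE os = ? ORDER BY hostname", (os_name,)
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    # Hypothetical values, as they might be extracted from node profiles.
    parsed = [
        {"hostname": "wn01.example.org", "os": "sl5", "mac": "00:16:3e:00:00:01"},
        {"hostname": "ce01.example.org", "os": "sl4", "mac": "00:16:3e:00:00:02"},
    ]
    conn = sqlite3.connect(":memory:")
    update_inventory(conn, parsed)
    print(nodes_by_os(conn, "sl5"))
```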
     421
     422
     423=== Panc Logging Tools ===
     424
     425Main logging categories:
     426 * Includes
     427 * Calls : includes + function calls
     428 * Tasks: performance of each internal task
     429 * Memory
     430 * Threads
     431 * All/None
     432
     433panc is delivered with a set of Perl scripts that take the logging output and produce reasonable human-readable summaries.
     434
     435=== New Tools : Requests and Status - S. Childs ===
     436
     437[http://indico.cern.ch/materialDisplay.py?contribId=21&sessionId=3&materialId=slides&confId=40056 Slides].
     438
     439Several categories identified:
     440 * Debugging
     441   * panc logging: part of PAN
     442   * pangraph: able to graph the output of the panc logging tools
     443   * pantree: deprecated
     444   * checkdeps: in QWG
     445   * gencompswebdoc: generate html documentation for components, in CERN CVS
     446   * cdb-tpl-view: simple template viewer in PHP, not CDB-specific, must be renamed, CERN CVS
     447   * cdb-getclusters: CERN specific
     448   * lld2pkgs, rpmcheck.pl: obsolete, replaced by checkdeps, QWG
     449   * xmldb2hw: very limited, superseded by QuatView, QWG
     450   * xml2pkgs: generate RPM list from set of XML profiles, obsolete, QWG
     451   * compare_xml: compare 2 XML profiles, QWG
     452 * Setup
     453   * rpmConflicts: very old stuff, obsolete, CERN CVS
     454   * fill_swrep_server: import RPM to SWrep through SOAP interface, move to SWrep, CERN CVS
     455   * getpkgarch: very old, probably obsolete, CERN CVS
     456   * quattor-etics-add-component: generate and upload component config to ETICS, CERN CVS
     457   * mac: tool for capturing MAC addresses, obsolete, CERN CVS
     458   * quattor-client-install: installs Quattor SW on non-Quattor machine, QWG
     459   * AII web interface: from TCD, not committed yet, more work needed
     460 * Template generation
     461   * buildOSTemplates and companion tools: generate OS RPM templates, QWG
     462   * createPackagesTemplate: generate a template list from a repository or URL, QWG
     463   * html2pan: script used before ant update.rep.templates existed, obsolete, QWG
     464   * generate-hw-templates: generate HW template from a CSV, quite LAL-specific currently, QWG
     465   * rpmq2panpkg: generate a PAN pkg list from installed RPM on the current machine, CERN CVS
     466   * groupandpasswd2tpl: same for user and groups, QWG
     467   * rpmErrata, rpmUpdates: should move into createPackagesTemplates, QWG
     468   * ncmtplconv: convert flat components to namespace, CERN CVS
     469   * check-compile.sh: download and compile SCDB + QWG trees, should be renamed, QWG
     470   * quattor build tools: CERN CVS
     471
     472Tool requests from survey:
     473 * GUI for CDB
     474 * Pan parser for Doxygen
     475 * The base "getting started tools"
     476 * CDB (cdbop) grep functionality, dependency information: CDB-specific, SCDB has existing tools
     477 * Display inclusion tree for templates: panc v8 logging
     478 * Having the possibility to restore a machine configuration to a given date: just revert configuration DB
     479 * XSL stylesheets for visualization of node profiles so that they can be viewed in a browser
     480
     481GUI: Eclipse for SCDB
     482 * Benefit from work on Eclipse
     483 * Effort of developing own GUI would be huge and better spent elsewhere
     484 * Also the need for some web-based wizards
     485
     486Need to merge in 1 location from the 3 repositories and merge duplicate functionalities
     487
     488Need to document recipes on wiki FAQ page
     489 * A lot of documentation available, but need to rationalize
     490 * SF area for documentation HTML
     491
     492
     493== Monitoring in QWG - S. Childs ==
     494
     495[http://indico.cern.ch/materialDisplay.py?contribId=31&sessionId=3&materialId=slides&confId=40056 Slides].
     496
     497Objectives:
     498 * Describe monitoring once for each machine class
     499 * Configure the server too using nlist to describe each machine to monitor
     500 * Support hierarchy: cluster aggregates similar nodes, super-cluster aggregates clusters
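
The idea of describing each machine class once and deriving the server-side configuration from it can be sketched in Python. This is not the QWG template code (which is written in Pan); the function names and the sample class names are hypothetical. The output uses the standard Nagios `hostgroup` object syntax, with `hostgroup_members` for the cluster/super-cluster hierarchy.

```python
def hostgroups_from_classes(machine_classes):
    """Turn {class name: [hostnames]} into Nagios hostgroup definitions."""
    blocks = []
    for name in sorted(machine_classes):
        members = ",".join(sorted(machine_classes[name]))
        blocks.append(
            "define hostgroup {\n"
            f"    hostgroup_name  {name}\n"
            f"    alias           {name} nodes\n"
            f"    members         {members}\n"
            "}\n"
        )
    return "\n".join(blocks)


def supercluster(name, clusters):
    """A super-cluster aggregates clusters via hostgroup_members."""
    return (
        "define hostgroup {\n"
        f"    hostgroup_name     {name}\n"
        f"    hostgroup_members  {','.join(sorted(clusters))}\n"
        "}\n"
    )


if __name__ == "__main__":
    # Hypothetical machine classes and hosts.
    classes = {
        "wn": ["wn02.example.org", "wn01.example.org"],
        "ce": ["ce01.example.org"],
    }
    print(hostgroups_from_classes(classes))
    print(supercluster("site", classes.keys()))
```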
     501
     502 NCG: template added to QWG for Nagios NCG services
     503  * Currently adds services for SAM and NPM tests and provides some variables for local customizations
     504{{{
     505 include {'standard/monitoring/nagios/ncg_services'};
     506}}}
     507 
     508TCD Usage: Lemon + Nagios host monitoring
     509 * Nagios: active network checks
     510 * LEMON: host checking via LEMON metrics
     511 * LEMON results fed to Nagios, output sent to Nagios via nsca
     512 
     513It'd be nice to get the latest version of LEMON sensors and metrics from CERN
     514 * They are still in old format but this is a pretty easy task to convert them
     515 
     516 
     517== OpenVZ - L. Munoz ==
     518
     519[http://indico.cern.ch/materialDisplay.py?contribId=30&sessionId=3&materialId=slides&confId=40056 Slides].
     520
     521OpenVZ is a virtualization solution with the following specific features:
     522 * Runs as a partition (container) in the host kernel
     523 * No specific kernel, HW... for the guest
     524 
     525Limitations:
     526 * VMs can only run Linux
     527 * VMs cannot use kernel threads (e.g. run an NFS server)
     528
     529Quattor implications:
     530 * Use of a single file system for all VEs
     531 * No Anaconda
     532 * Host requires an OpenVZ kernel (namespacing patches) and OpenVZ tools (vzctl...)
     533 * Host acts as a router for VMs
     534 * Guest declaration: container ID, network information, limit on resources
     535   * Configured with ncm-openvz
     536 * Avoid installing all VMs at the same time (SPMA...)
     537
     538Guest installation: no Anaconda, only post-reboot script with aii-openvz
     539 * File system, HW description... must be empty
     540 * Virtual Ethernet (veth): slower, less secure than OpenVZ default network but able to receive broadcasts
     541   * Not handled by ncm-openvz, must be bridged and configured with ncm-network
     542 
     543To be committed soon...
     544
     545
     546== Coding Practices and Conventions - L. Munoz ==
     547
     548[http://indico.cern.ch/materialDisplay.py?contribId=16&sessionId=3&materialId=slides&confId=40056 Slides].
     549
     550Indentation: preferred is 4 spaces, no tabs
     551 * If indentation style is changed, do it in a revision dedicated to formatting changes, not embedded into other changes
     552
     553Coding style: be consistent with the existing coding style if any, else cleanup the code to adhere to one coding style
     554 * If indentation style is changed, do it in a revision dedicated to formatting changes, not embedded into other changes
     555
     556Modularity: avoid very long function or instruction blocks
     557 
     558Comments:
     559 * Comment PAN data structures in PAN code
     560 * Comment function in function headers
     561 * Don't embed a lot of comment into the code
     562 
     563Other recommendations:
     564 * Misc: use 'our' instead of 'use vars'
     565 * Don't use system or qx// or LC::Process but CAF::Process as it logs the command being executed and stores the exit status in $?
     566 * Writing files: use CAF::FileWriter
     567   * Can also be used for temporary files
     568   * For feeding a command, use a pipe with CAF::Process
     569 * Use LC::File or File::Copy or File::Path as it does many checks, in particular it doesn't follow symlinks
     570 * Make sure code is running in taint mode: don't trust any input, sanitize command/system call input
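
Two of the recommendations above, logging every executed command together with its exit status and never trusting input, are illustrated below. The components themselves are Perl and should use CAF::Process; this is only a Python analogue of the same pattern, with hypothetical function names, and it does not reproduce the CAF API.

```python
import logging
import re
import subprocess
import sys

log = logging.getLogger("component")

# Allow-list: reject anything that is not a plain name (no spaces, quotes, ';').
SAFE_NAME = re.compile(r"[A-Za-z0-9._-]+")


def sanitize(value):
    """Return value unchanged if it matches the allow-list, else raise."""
    if not SAFE_NAME.fullmatch(value):
        raise ValueError(f"unsafe input: {value!r}")
    return value


def run_logged(argv):
    """Run a command without a shell, logging the command and its exit status."""
    log.info("running: %s", " ".join(argv))
    result = subprocess.run(argv, capture_output=True, text=True)
    log.info("exit status: %d", result.returncode)
    return result.returncode, result.stdout


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    name = sanitize("eth0")
    rc, out = run_logged([sys.executable, "-c", f"print('configuring {name}')"])
    print(rc, out.strip())
```

Passing an argument list instead of a shell string means the sanitized value is never re-interpreted by a shell, which is the same concern taint mode addresses in Perl.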
     571
     572Missing bits in CAF:
     573 * CAF::FileReader ?
     574 * Any need to rewrite/improve NCM::Check::lines ?
     575 * A module to download and verify from the Internet ?
     576 
     577A wiki page on Twiki exists describing these conventions, linked from SourceForge
     578
     579
     580== SourceForge Migration ==
     581
     582=== Resources Available - S. Childs ===
     583
     584[http://indico.cern.ch/materialDisplay.py?contribId=23&sessionId=3&materialId=slides&confId=40056 Slides].
     585
     586SVN Repositories: work in progress
     587
     588File release: what is under the download menu, no APT/YUM, size unlimited
     589 * Easy to rsync what's there
     590 * Currently only panc and ncm-components
     591
     592Wiki: up and running, need people to contribute
     593 * Web space: up and running, redirected to the wiki, also used for docs, 100 MB
     594
     595Mailing lists, issue trackers: not investigated yet
     596 * Any need to migrate from CERN Savannah ?
     597
     598Forums: not investigated yet, are they useful for us?
     599
     600Documentation: not investigated yet
     601 * Try to redirect to wiki
     602