= Quattor Workshop - Univ. of Gent - 29-31/10/2012 =
[[TracNav]]

[[TOC(inline)]]

[https://indico.cern.ch/conferenceTimeTable.py?confId=196712#all.detailed Agenda]

== Site Reports ==

=== IHEP ===

Two annoying problems:
 * Lock error at installation due to ccm-fetch installing before ccm-initialize
 * AII --configurelist fails to configure nodes with different HW configurations (it mixes up the disk names)

=== Aquilon at RAL ===

Archetype: highest grouping of hosts in Aquilon
 * Similar to SCDB sites

Personality: similar to a QWG machine type
 * Built by assembling ''features'' (e.g. apache-httpd, nfs-server)
 * No possibility to cross-reference personalities (include one personality in another)

Service: a feature with servers and clients
 * Allows managing the binding between clients and servers

HW description based on:
 * Model: the generic HW description
 * Machine: a ''model'' instance

Host: a ''machine'' with an OS and a network config

Sandboxes: allow development of templates without impacting production
 * Includes the possibility to deploy a machine from the sandbox for testing

Some work is needed to integrate QWG into the layout expected by Aquilon: mostly done by "hooks"
 * Would be great to have a wiki page summarizing what needs to be done, which could be used as input for future QWG developments and restructuring

Quattor profile schema slightly modified to support "metadata" information describing where the profile comes from
 * Should be pushed to the core repository

Currently, no history is available for modifications done with the `aq` command
 * Plans to add the plenary templates (templates generated by the broker) to Git

Internal authentication requires Kerberos
 * Can use Windows Kerberos

To foster adoption, need both to package it and to document the basic configuration
 * Documentation may be started on a wiki, but the Pan book may be used as the model for the real documentation

== Core Tools ==

=== QWG ===

See slides.

Main technical issue: integration with Aquilon
 * Need to better understand the changes that may be needed

Open questions
 * A new release manager taking over from Michel: Guillaume? Jerome?
 * How to better organize QWG templates into well-defined subsets?
   * Easier for users to start without pulling a lot of unneeded things
   * Easier sharing of the release manager role between several people
 * Move the QWG repository to Git to ease integration of contributions?
   * Easier and more flexible merge
   * May need to forget about SVN history, as a previous attempt to migrate it failed. Not necessarily a problem if we want to reorganize QWG templates into different subsets

=== ncm-metaconfig ===

A filecopy replacement, able to handle content templating and to validate the provided configuration against a schema.
 * Template system is implemented in ncm-ncd
 * Same features as filecopy regarding the ability to restart a service: would try to avoid arbitrary commands

See if it is possible to ensure that metaconfig is a superset of filecopy and download, and then obsolete them

=== panc ===

Current version is 9.2

9.3: development version
 * Significant parts now in Clojure
 * Ready to generate multiple output formats
 * Bug fix: timestamp check on all output files
 * xmldb removal?
 * Bug fix: Java 1.7 support
 * Bug fix: catch invalid replace string exception
 * Allow negative values in range expressions
 * Annotation/compilation split into 2 commands
 * Change of options for the panc command to streamline use: still need to update Ant and Maven
   * panc.old is still here and supports the old options to ease the transition: should disappear at the next release
   * Ant and Maven will support both option sets
 * Release planned soon after the workshop

Future:
 * More Clojure...
 * Plan to remove escaping as soon as xmldb support has been dropped
   * At first optional: through a switch
   * Requires an updated version of CCM (tbd by Luis)

=== SPMA + YUM ===

Good reasons for the SPMA design, but:
 * Controlling the version of each package is turning into a nightmare
 * Dependency resolution is a must

New ncm-spma developed using a YUM backend rather than spma
 * Uses the same configuration information
   * But for most packages, an empty nlist for the package is enough: huge time saving if pkg_xxx() is not executed
 * No more need for repository resolution: huge improvement in compilation time
 * Small reduction in profile size (~10%)
 * Repository templates are reduced to the declaration of the YUM repository, not its contents
 * Updating a repository will trigger a node update: may need to use repository snapshots for deployment to avoid permanent upgrades
   * Clearly the area needing some effort and experience feedback, but there are standard tools available for that
 * Support for rollbacks with YUM history plugins

Additional benefits
 * No more use of spma and rpmt-py

Missing bits
 * AII update: do not run spma anymore
   * Use the Kickstart `repo` directive instead
 * GPG key support
 * Package blacklisting

== Development Process ==

=== Scrum process - Releases ===

How useful are the weekly meetings?
 * Very small attendance
 * These meetings turn out to be too much of a status report on what people are doing

Backlog is too long for the manpower available to work on non-site-specific issues

Cal's proposal
 * Every Monday: a check by email of what was done by everyone, then decide from this whether the meeting is worth holding
 * Cancel the meeting if there is nothing to discuss

Connected to the release discussion...
 * What do we call a release? When do we do a new one?
   * Feature based: significant new features? A bunch of bug fixes?
   * Time based? May be more suitable for a best-effort project. In this case use YYYYMM as the base number
 * A separate release of each subset rather than 1 big fat release: same numbering for all components
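
The time-based numbering option can be sketched as follows. This is only an illustration of the YYYYMM idea; the function names `release_number` and `is_newer` are hypothetical and no such tool was agreed on at the workshop.

```python
from datetime import date


def release_number(day: date) -> str:
    """Return a time-based release number in YYYYMM form."""
    return f"{day.year:04d}{day.month:02d}"


def is_newer(candidate: str, current: str) -> bool:
    """Compare two YYYYMM release numbers.

    Zero-padded strings of equal length sort chronologically,
    so a plain string comparison is enough.
    """
    return candidate > current
```

Because the number carries no feature information, every component released in the same month would share it, which matches the "same numbering for all components" point.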

Building a release means being confident that the different pieces work together
 * Need an infrastructure to do automatic testing

Specific problem of components: they require an upgrade of the templates to provide the appropriate schema, explicit version if used...

How to start with releases
 * Cal will create/initialize a Git repository dedicated to release management, based on StratusLab experience
 * Target date: end of month
 * Use whatever we have today: the goal of the first release is to test the release process

Still in SVN
 * ncm-cdispd/ncm-listend
 * gLite configuration component
 * SCDB
 * QWG templates

=== GitHub vs. SF ===

GitHub service looks much better and performs better
 * Agreement to move to GitHub
 * If MS has a legal problem, we'll look for a workaround

=== Automatic testing of client code ===

A test framework developed for Quattor configuration modules (Perl)
 * Requires the PERL5LIB environment variable to be defined and to contain CCM, NCM, CAF, LC
 * Add one line to disable actual execution of commands:
{{{
use Test::Quattor;
}}}

Tests
 * Must not access the network
 * Any function in the module called by the tested function must be mocked
{{{
use Test::MockModule;
}}}
 * Must be unit tests: test only one specific feature/function
 * To test the Configure method, use a real profile passed as argument to Test::Quattor
 * Avoid constants in functions, as they make it difficult to test without impacting production. Prefer passing these values as arguments, so that tests can use different values:
{{{
my ($self,$const) = @_;
}}}

Result validation can be done on the return value, file contents and properties, and executed commands
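
The testing rules above (no constants, no real command execution, validation on return value, file contents and executed commands) can be illustrated with a language-neutral sketch. The actual framework is Perl; the Python below only shows the principle, and `configure_file`, its parameters and the restart command are all hypothetical.

```python
import os
import tempfile


def configure_file(text, path, run):
    """Write text to path, then restart a service through the injected runner.

    The target path and the command runner are arguments rather than
    constants, so a test can substitute harmless values.
    """
    with open(path, "w") as fh:
        fh.write(text)
    run(["service", "example", "restart"])  # hypothetical command
    return True


def check_configure_file():
    """Exercise configure_file without touching the real system."""
    executed = []  # records commands instead of executing them
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "example.conf")
        result = configure_file("hello\n", target, executed.append)
        with open(target) as fh:
            contents = fh.read()
    return result, contents, executed
```

The three values returned by `check_configure_file` correspond to the three validation targets: the return value, the file contents and the list of commands that would have been executed.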

After execution of the tests, it is possible to generate coverage information
 * For statements, functions, loops, conditionals

Unit tests are easy to schedule with Jenkins
 * Jenkins at UGent does it for configuration modules, CCM and AII

Acceptance tests require deployment of test machines
 * Could be coordinated by Jenkins (in fact done by StratusLab)
 * Could a StratusLab cloud help with this? Probably yes, except for AII

== Quattor Data Warehouse - J. Adams ==

Background
 * CDB2SQL broken and requires Oracle
 * A quick and dirty dashboard created, based on grep: inefficient

New project started based on JSON output from panc
 * Input data is a Git repository updated when a deploy is done
 * Track profile changes and reindex them
 * Allow very fast arbitrary searches with ElasticSearch
 * No access to Git history yet
 * Naive prototype written in Python for proof of concept
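
The indexing idea can be sketched with an in-memory index. This is not the prototype's code and does not use ElasticSearch: it only assumes that each profile is a JSON document and flattens it into path/value pairs that can be looked up directly. The host names and profile contents are made up.

```python
import json


def flatten(node, prefix=""):
    """Yield (path, value) pairs for every leaf of a nested JSON structure."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from flatten(child, f"{prefix}/{key}")
    elif isinstance(node, list):
        for position, child in enumerate(node):
            yield from flatten(child, f"{prefix}/{position}")
    else:
        yield prefix, node


def index_profiles(profiles):
    """Build {(path, value): set of hosts} from {host: JSON text}."""
    index = {}
    for host, text in profiles.items():
        for path, value in flatten(json.loads(text)):
            index.setdefault((path, value), set()).add(host)
    return index


# Hypothetical profiles, standing in for the JSON output of panc
profiles = {
    "node1.example.org": '{"system": {"kernel": {"version": "2.6.32"}}}',
    "node2.example.org": '{"system": {"kernel": {"version": "2.6.18"}}}',
}
index = index_profiles(profiles)
hosts = index[("/system/kernel/version", "2.6.32")]
```

Reindexing after a deploy would amount to rebuilding the entries of the hosts whose profiles changed; a real search engine adds full-text and range queries on top of the same flattened view.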

Need to package it and write the initial documentation.

== Short-term Workplan ==

Quattor release: target mid-December
 * Probably not a public release: more to test the process
 * Without QWG for this first one
 * Repository initialization this afternoon...
 * Initial contents: base daemons, panc, core configuration components, AII
 * Testing: mostly manual this time
   * Install some test nodes with a minimal configuration for main components (Luis)

RPM naming
 * Currently we have only snapshots with the date in the name
 * Maven release plugin should be enough to tag releases, based on the pom file

QWG
 * Move repo to Git: 1 for core templates, 1 for OS, 1 for grid, 1 for StratusLab
 * Write a tool to produce a package (tar file?) from the different repositories
 * Remove the templates coming from somewhere else (core components) from the QWG repositories
   * Use Nexus to pull the appropriate version: the dependency plugin can unpack everything in one place