Statistics support in Tx
Adding support for statistics in Tx is a well-discussed topic, probably because it is not easy to pin down what is needed. One approach would be to migrate all features from other existing tools; another would be to gradually add only what makes sense.
Gradual transition
The following is an attempt to describe a roadmap for adding statistics support in Transifex (Dimitris Glezos). It describes the work as 'features' added block by block, minimizing big chunks of changes in the models (see also Development/DataModel).
- PO-file awareness: Create proper functions that identify PO files in a project's directory tree. Each module page will show a list of PO files + the POT file.
- Components (submodules): Refine the 'module' model by creating an entity that belongs to a module; a module may have multiple components. Examples are subdirectories containing different sets of PO files.
- Statistics for PO files: Show graphs of the translation statistics for the PO files of step 1. Add admin/editor controls like 'rebuild statistics' for a module. Stats should be re-calculated only when a PO file is altered.
Model changes: One new model needed (pofile) plus an attribute on 'component' with the (optional) choice of 'static-pot' to tell if a module should have translations calculated this way.
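The first steps above (finding PO/POT files and computing their statistics) could be sketched roughly as follows. This is only an illustration, not the Transifex implementation: the function names are hypothetical, files are identified purely by their `.po`/`.pot` extension, and the counts are assumed to come from GNU gettext's `msgfmt --statistics`, which must be installed for `po_statistics` to work.

```python
import os
import re
import subprocess
from typing import Dict, List, Tuple


def find_po_files(root: str) -> Tuple[List[str], List[str]]:
    """Return (po_paths, pot_paths) found under root, relative to root."""
    po_paths, pot_paths = [], []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), root)
            if name.endswith(".po"):
                po_paths.append(rel)
            elif name.endswith(".pot"):
                pot_paths.append(rel)
    return sorted(po_paths), sorted(pot_paths)


# msgfmt omits the fuzzy/untranslated parts when their count is zero,
# so each figure is searched for independently and defaults to 0.
_STAT_PATTERNS = {
    "translated": re.compile(r"(\d+) translated message"),
    "fuzzy": re.compile(r"(\d+) fuzzy translation"),
    "untranslated": re.compile(r"(\d+) untranslated message"),
}


def parse_msgfmt_statistics(line: str) -> Dict[str, int]:
    """Parse the summary line printed by `msgfmt --statistics`."""
    counts = {}
    for key, pattern in _STAT_PATTERNS.items():
        match = pattern.search(line)
        counts[key] = int(match.group(1)) if match else 0
    return counts


def po_statistics(path: str) -> Dict[str, int]:
    """Run msgfmt on one PO file and return its counts (needs gettext)."""
    env = dict(os.environ, LC_ALL="C")  # force untranslated msgfmt output
    result = subprocess.run(
        ["msgfmt", "--statistics", "-o", os.devnull, path],
        stderr=subprocess.PIPE,  # the summary is written to stderr
        universal_newlines=True,
        env=env,
        check=True,
    )
    return parse_msgfmt_statistics(result.stderr)
```

The resulting three counts per file are roughly what a 'pofile' row would need to hold for the graphs.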
- Grouping by Locale: Using the previous table, one can show statistics for all German PO files. Create these templates (URL: /locale/xx).
- Msgmerged statistics: Before calculating statistics, run msgmerge against the POT file. Stats should be re-calculated only when a PO *or POT* file is altered.
- Customized PO discovery: Various modules have a customized structure for their PO files. Allow each project to point to their POT and PO files using a glob, like: "/LC_MESSAGES/%lang%/%projectname.po". This requires a change in the method model.component.get_po_files().
Model changes: If we want this to be per-component, an additional field or two are needed on the table. Alternatively, we can make this per-discovery-class (i.e. have various classes for PO discovery, depending on the directory structure).
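As an illustration of the customized discovery idea, a pattern containing a `%lang%` placeholder can be turned into a regular expression and matched against a component's file list. This is a minimal sketch under assumptions: `match_po_paths` is a hypothetical helper, only the `%lang%` placeholder is handled, paths use forward slashes, and a language code is taken to be any run of characters without a slash.

```python
import re
from typing import Dict, Iterable


def match_po_paths(pattern: str, paths: Iterable[str]) -> Dict[str, str]:
    """Map language code -> path for every path that matches pattern.

    The pattern must contain %lang% at least once; if it appears several
    times, every occurrence has to match the same language code.
    """
    pieces = pattern.split("%lang%")
    if len(pieces) < 2:
        raise ValueError("pattern must contain the %lang% placeholder")
    # First occurrence captures the code, later ones are back-references.
    regex = (
        re.escape(pieces[0])
        + r"(?P<lang>[^/]+)"
        + r"(?P=lang)".join(re.escape(piece) for piece in pieces[1:])
    )
    compiled = re.compile(regex)
    found = {}
    for path in paths:
        match = compiled.fullmatch(path)
        if match:
            found[match.group("lang")] = path
    return found
```

A per-component pattern field could feed a helper like this from inside get_po_files(); the per-discovery-class alternative would instead hard-code one such rule per class.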
- File serving: Allow users to download these generated PO files. When we generate them, store them either in the DB or use a storage service like Amazon's S3.
- POT using intltool: The POT file can be created on the fly using intltool. POT will be stored in the DB. It should be regenerated only when there has been a commit since the last creation. After having a fresh POT, use existing tools delivered with milestone 3 (including serving the POT file).
Model changes: a blob column on 'pofile' storing the file's contents (or another storage service from the previous step). A choice "intltool" next to 'static-pot' on 'component'.
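The rule "regenerate only when there has been a commit since the last creation" amounts to a cache keyed on the repository revision. The sketch below is hypothetical: `RevisionCache` and its `build` callback are illustrative names, the cache lives in memory rather than in a blob column or external storage, and the revision is assumed to be an opaque string supplied by the version control layer.

```python
from typing import Callable, Dict, Tuple


class RevisionCache:
    """Keep one generated file per component, rebuilt only on new commits."""

    def __init__(self, build: Callable[[str], bytes]):
        self._build = build  # e.g. a function that runs the POT extraction
        self._entries: Dict[str, Tuple[str, bytes]] = {}

    def get(self, component: str, revision: str) -> bytes:
        """Return the cached content, rebuilding if the revision changed."""
        entry = self._entries.get(component)
        if entry is not None and entry[0] == revision:
            return entry[1]
        content = self._build(component)
        self._entries[component] = (revision, content)
        return content
```

The same check would cover the merged PO files, as long as a change to either the PO or the POT file shows up as a new revision.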
- Shipped Languages: Make Tx aware of LINGUAS files. A maintainer can point Tx to a LINGUAS file containing the to-be-shipped languages, one locale per line. Views should be added to let people know of languages not being shipped. Also, a checkbox should be added to the file submission form reading 'Auto-add this locale in the LINGUAS file'.
Model changes: one column in 'component' pointing to the LINGUAS file.
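Handling a LINGUAS file needs only a little text processing. The following is a sketch with hypothetical helper names; it assumes that blank lines and `#` comments should be ignored, which goes slightly beyond the "one locale per line" description above.

```python
from typing import Iterable, List


def parse_linguas(text: str) -> List[str]:
    """Return the locales listed in a LINGUAS file, in file order."""
    locales = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments (assumption)
        if line:
            locales.extend(line.split())
    return locales


def unshipped_locales(available: Iterable[str],
                      shipped: Iterable[str]) -> List[str]:
    """Locales that have a PO file but are missing from LINGUAS."""
    shipped_set = set(shipped)
    return sorted(loc for loc in available if loc not in shipped_set)


def add_locale(text: str, locale: str) -> str:
    """Return LINGUAS content with locale appended if it is not listed."""
    if locale in parse_linguas(text):
        return text
    if text and not text.endswith("\n"):
        text += "\n"
    return text + locale + "\n"
```

`unshipped_locales` would back the "not being shipped" views, and `add_locale` the 'Auto-add this locale in the LINGUAS file' checkbox.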
Technical details
As shown in the proposed data model, statistics could be treated as an application (not a project) separate from Transifex.
Orthogonal features
The following features could happen at any time in the above timeline.
- Project collections (releases)
- Teams
- Support for other extraction methods (basically just new classes in statistics/extraction implementing the necessary methods)
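The last item suggests that each extraction method is a class with a common set of methods. The text does not say what those methods are, so the interface below is purely an assumption: a single `get_pot` method, a `StaticPotExtractor` for the 'static-pot' choice, and a small registry keyed by the choice stored on 'component'.

```python
import os
from abc import ABC, abstractmethod


class Extractor(ABC):
    """Hypothetical base class for POT extraction methods."""

    @abstractmethod
    def get_pot(self, component_dir: str) -> bytes:
        """Return the POT file contents for a checked-out component."""


class StaticPotExtractor(Extractor):
    """Use a POT file that is already committed to the repository."""

    def __init__(self, pot_path: str):
        self.pot_path = pot_path  # path relative to the component directory

    def get_pot(self, component_dir: str) -> bytes:
        with open(os.path.join(component_dir, self.pot_path), "rb") as fh:
            return fh.read()


# Maps the choice stored on 'component' to a class; an intltool-based
# extractor would be registered here in the same way.
EXTRACTORS = {"static-pot": StaticPotExtractor}


def make_extractor(choice: str, **options) -> Extractor:
    """Instantiate the extractor registered under the given choice."""
    try:
        cls = EXTRACTORS[choice]
    except KeyError:
        raise ValueError("unknown extraction method: %r" % choice)
    return cls(**options)
```

With this shape, supporting another method means adding one subclass and one registry entry, without touching the statistics code.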
