Style Guide
Style Guide for Grupo de Ecología y Conservación de Islas
Commit Messages
The following guidelines define how to write effective commit messages.
- Start with an emoji from the Gitmoji standard to indicate the type of change.
- Follow the emoji with an imperative verb (e.g., “Add”, “Fix”, “Refactor”).
- Explain why the change was made, not just what was changed.
- Limit each line to ≤ 80 characters.
- The first line must be a complete sentence and act as a summary.
- Separate the summary from the body with a blank line.
- The body should consist of well-formed paragraphs.
- The message length is unrestricted if needed for clarity.
- Prefer many small commits over few large ones.
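The mechanical parts of these guidelines (leading emoji, blank line after the summary, line length) can be checked automatically. The following is a minimal sketch, not an official tool. The function names are hypothetical, and it assumes a Gitmoji appears either as a `:shortcode:` or as a literal non-ASCII character at the start of the summary.

```python
import re

MAX_LINE_LENGTH = 80  # limit stated in the guidelines above

# Assumption: a Gitmoji is written either as ":shortcode:" or as a literal
# non-ASCII emoji character at the very start of the summary line.
SHORTCODE_PATTERN = re.compile(r"^:[a-z0-9_+-]+:\s+\S")


def starts_with_emoji(summary: str) -> bool:
    """Return True if the summary begins with a shortcode or a non-ASCII symbol."""
    if SHORTCODE_PATTERN.match(summary):
        return True
    return bool(summary) and ord(summary[0]) > 127


def check_commit_message(message: str) -> list[str]:
    """Return the rule violations found; an empty list means none were found."""
    lines = message.splitlines()
    if not lines or not lines[0].strip():
        return ["The message has no summary line."]
    problems = []
    if not starts_with_emoji(lines[0]):
        problems.append("The summary does not start with an emoji.")
    if len(lines) > 1 and lines[1].strip():
        problems.append("The summary is not followed by a blank line.")
    for number, line in enumerate(lines, start=1):
        if len(line) > MAX_LINE_LENGTH:
            problems.append(f"Line {number} exceeds {MAX_LINE_LENGTH} characters.")
    return problems
```

The check does not attempt to judge whether the verb is imperative or whether the message explains why the change was made; those rules still need a human reviewer.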
Repository Content Rules
- Only plain text files may be committed (csv, json, svg, tex, txt, etc.).
- Binary files are not allowed, except under strict conditions (see below).
- Files must not exceed 1 MB or ~10,000 lines of code.
- Prefer svg over any binary image format.
- Binary images are allowed only if:
  - They are required for application functionality, and
  - Their largest dimension is ≤ 256 pixels.
- Any exception must be explicitly agreed upon.
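The size and plain-text rules above could be enforced before a commit with a small check like the one below. This is a sketch under stated assumptions: "1 MB" is read as 10**6 bytes, a file counts as binary if it contains a NUL byte or is not valid UTF-8, and the function names are hypothetical. It does not implement the image-dimension exception.

```python
from pathlib import Path

MAX_BYTES = 1_000_000  # assumption: "1 MB" is interpreted as 10**6 bytes
MAX_LINES = 10_000     # approximate line limit stated in the rules above


def looks_binary(data: bytes) -> bool:
    """Heuristic: binary if it contains a NUL byte or is not valid UTF-8."""
    if b"\x00" in data:
        return True
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


def check_content(data: bytes) -> list[str]:
    """Return the content rules that the given file content appears to break."""
    problems = []
    if len(data) > MAX_BYTES:
        problems.append("The file exceeds 1 MB.")
    if looks_binary(data):
        problems.append("The file looks binary.")
    elif data.count(b"\n") > MAX_LINES:
        problems.append("The file exceeds 10,000 lines.")
    return problems


def check_file(path: Path) -> list[str]:
    """Convenience wrapper that reads the file from disk first."""
    return check_content(path.read_bytes())
```

A pre-commit hook could call `check_file` on each staged path and reject the commit when the returned list is not empty.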
Repository Structure (Class 3)
This structure is required for Class 3 repositories. There is currently no defined standard for other repository classes.
The structure is inspired by Cookiecutter Data Science. Changes must be validated against that reference to avoid contradictions.
├── Dockerfile <- Required to build the repository image using
│ `docker build`
├── Makefile <- Instructions to create reports or results using Make
├── README.md <- Contains an ordered list of the expected project outputs
│ and is the only source that defines the required work
├── analyses.json <- Describes the relationships between files (data, reports,
│ results, scripts, etc.) of each analysis
├── bitbucket-pipelines.yml <- Bitbucket Pipeline configuration file
├── data/
│ ├── external/ <- Third-party data
│ ├── processed/ <- Processed data shaped to meet the requirements of
│ │ modules and packages for statistical analysis and modeling.
│ │ Also includes intermediate results not directly included
│ │ in the report, such as KML and SHP files.
│ └── raw/ <- Original, immutable raw data from GECI
│
├── docs/ <- Documentation for analysts
├── notebooks/ <- Jupyter notebooks
├── references/ <- Articles, books, and notes relevant to the project and
│ │ to the results being produced. This includes the articles
│ │ cited in the reports we generate
│ ├── references.bib <- Reference file for LaTeX in BibTeX format
│ └── references.md <- List of references with descriptions and hyperlinks in
│ Markdown
│
├── renv.lock <- Record of installed R packages and their versions. This
│ file is generated with the `renv` package and is equivalent
│ to the `requirements.txt` file generated with `pip freeze`
├── reports/ <- Reports and presentations intended for the corresponding
│ │ project director. Preferred source formats for reports are
│ │ LaTeX and Markdown. Plain text formats are preferred to
│ │ enable version control. Reports are delivered in PDF format
│ │ or, if required by the Director, Pandoc is used to convert
│ │ them to Word.
│ ├── figures/ <- Figures included in analysis reports
│ ├── non-tabular/ <- Non-tabular results included in analysis reports
│ └── tables/ <- Tables included in analysis reports
│
├── requirements.txt <- Lists the requirements to set up the environment for
│ analysis, for example it can be generated with:
│ `pip freeze > requirements.txt`
├── src/ <- Scripts used in this project
└── tests/ <- Tests that verify the reproducibility of results
Project Core Configuration
Makefile
The Makefile must contain three sections:
- all phony target: Lists all primary outputs defined in analyses.json.
- Per-result blocks: Variable declarations, rules for main targets, rules for dependencies, and phony rules for dependencies.
- General-purpose phony rules.
README.md
- Acts as the authoritative description of required work.
- Must define expected outputs clearly and completely.
- Contains an ordered list of the expected project outputs.
analyses.json
- Describes the relationships between files (data, reports, results, scripts, etc.) of each analysis.
Project Components
Data
- Follow the structure defined in data/.
- Raw data must remain immutable.
- Processed data is shaped for analysis and includes intermediate results.
- Intermediate results belong in data/processed/ or reports/.
- Data acquisition procedures must be documented.
Reports
- Target audience: project director.
- Source formats: LaTeX or Markdown (Plain text preferred for version control).
- Output format: PDF (Word only if explicitly required).
References
- Store in references/.
- Include references.bib (BibTeX) and references.md (annotated list with links).
- All cited materials must be included here.
Tests
- Located in tests/.
- Must verify reproducibility of results.
- Test file names must follow the naming rules of the framework. If allowed:
  - Start with prefix test_ (e.g., test_plot_island).
  - Only letters and numbers allowed (no ñ or accented vowels).
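The test naming rule can be expressed as a single pattern. The sketch below is illustrative only and uses a hypothetical function name. It assumes underscores are accepted as word separators, as in the test_plot_island example, and that "letters" means unaccented ASCII letters.

```python
import re

# Assumption: underscores separate words (as in "test_plot_island");
# apart from them, only ASCII letters and digits are accepted.
TEST_NAME_PATTERN = re.compile(r"test_[A-Za-z0-9]+(_[A-Za-z0-9]+)*")


def is_valid_test_name(stem: str) -> bool:
    """Check a test file name given without its extension."""
    return TEST_NAME_PATTERN.fullmatch(stem) is not None
```

For example, `is_valid_test_name("test_plot_island")` is accepted, while a name containing ñ or lacking the test_ prefix is rejected.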
Coding Standards
Language Requirements
- Code Language: R (Tidyverse style).
- Comment Language: Spanish (focus on why, that is, the logic and reasoning, rather than what).
- Variable/Function Names: English.
Source Code Structure (src/)
- All scripts must reside in src/.
- Structure scripts into five specific sections: Header (comment block), Configuration, Inputs, Process/Analysis, and Output.
- Each of the last four sections begins with a # ==== [SECTION] ==== marker in Spanish.
1. Header Section (Comment Block)
Use EXACTLY this structure with the opening and closing separators. Always include ALL sections (even if minimal) and never change section order:
# ==========================================
# Título: (1 línea)
#
# Contexto (Por qué): (2–4 líneas)
#
# Descripción (Qué / Cómo): (3–6 líneas)
#
# Entradas: (Sin bullets, uno por línea, incluir capa si aplica: `(capa: "nombre")`)
# *Ejemplo*: data/processed/file.gpkg (capa: "layer_name")
#
# Salidas: (Sin bullets, uno por línea)
#
# Dependencias: (Un paquete por línea, sin comas)
# *Ejemplo*: sf
# tidyverse
#
# Notas: (Opcional, máximo 4 bullets. Usar bullets SOLO en esta sección)
# ==========================================
Consistency Rules for Header Section (Comment Block):
- Use ONLY Spanish (keep technical function names in English, e.g., st_bbox(), ggplot2).
- Never mix English and Spanish section names.
- Always write in imperative form (“Carga…”, “Calcula…”).
- Do NOT use infinitive form (avoid: “Cargar”).
- Keep sentences short and precise; avoid narrative or storytelling.
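Because the header must contain every section in a fixed order, its structure lends itself to an automated check. The sketch below is one possible approach, with hypothetical function names. It assumes a separator is any comment line made only of "=" characters, and it treats Notas as required, following the "always include ALL sections" rule.

```python
import re

# Section labels in the order required by the header template above.
REQUIRED_LABELS = [
    "Título:",
    "Contexto (Por qué):",
    "Descripción (Qué / Cómo):",
    "Entradas:",
    "Salidas:",
    "Dependencias:",
    "Notas:",
]

# Assumption: a separator is a comment line made only of "=" characters.
SEPARATOR_PATTERN = re.compile(r"^# =+$")


def find_header_labels(script_text: str) -> list[str]:
    """Collect the labels found between the opening and closing separators."""
    found = []
    inside = False
    for line in script_text.splitlines():
        if SEPARATOR_PATTERN.match(line.strip()):
            if inside:
                break  # closing separator reached
            inside = True
            continue
        if inside:
            content = line.lstrip("#").strip()
            found.extend(label for label in REQUIRED_LABELS if content.startswith(label))
    return found


def header_is_well_formed(script_text: str) -> bool:
    """True when every section is present exactly once and in the required order."""
    return find_header_labels(script_text) == REQUIRED_LABELS
```

The check covers structure only; whether the text is in Spanish and in imperative form still has to be reviewed by a person.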
2. Configuration
- Marker: # ==== CONFIGURACIÓN ====
- Place all library() calls here with a Spanish comment explaining each package.
- Extract all “magic numbers”, strings, options, and filenames into named constants.
3. Inputs
- Marker: # ==== ENTRADAS ====
- Import each input file using the appropriate read function (e.g., read_csv(), st_read()).
- Use the path variables defined in the Configuration section.
- Suppress verbose output with options like quiet = TRUE or show_col_types = FALSE.
- Add a Spanish comment before each import explaining the data being loaded.
4. Process/Analysis
- Marker: # ==== PROCESAMIENTO / ANÁLISIS ====
- Linear Code Rule: Write strictly linear code using Tidyverse style; do not use functions, loops (for, while), or control structures (if, else).
- File-based Modularity: The script reads input files and writes exactly one output file (CSV, JSON, or GPKG).
- Comment every single line of code in Spanish.
5. Output
- Marker: # ==== SALIDA ====
- Write the single output file using the appropriate write function (e.g., write_csv(), st_write(), ggsave()).
- Use the output path variable defined in the Configuration section.
- Add a Spanish comment before the write explaining what is being written.
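The four section markers and the Linear Code Rule can be spot-checked with a short script. The following is a rough sketch with hypothetical function names, not a parser for R. It assumes markers appear alone on their line, and it strips comments naively at the first "#", so a "#" inside a string literal would confuse it.

```python
import re

# Markers in the order the sections must appear in a script.
EXPECTED_MARKERS = [
    "# ==== CONFIGURACIÓN ====",
    "# ==== ENTRADAS ====",
    "# ==== PROCESAMIENTO / ANÁLISIS ====",
    "# ==== SALIDA ====",
]

# Heuristic for the Linear Code Rule: keywords followed by "(".
# Calls such as if_else( or ifelse( do not match this pattern.
FORBIDDEN_PATTERN = re.compile(r"\b(for|while|if|function)\s*\(")


def markers_in_order(script_text: str) -> bool:
    """True when each marker appears exactly once and in the expected order."""
    found = [
        line.strip()
        for line in script_text.splitlines()
        if line.strip() in EXPECTED_MARKERS
    ]
    return found == EXPECTED_MARKERS


def find_nonlinear_lines(script_text: str) -> list[int]:
    """Return 1-based numbers of lines that seem to use loops, if, or function."""
    flagged = []
    for number, line in enumerate(script_text.splitlines(), start=1):
        code = line.split("#", 1)[0]  # naive comment stripping
        if FORBIDDEN_PATTERN.search(code):
            flagged.append(number)
    return flagged
```

A script passes this spot check when `markers_in_order` returns True and `find_nonlinear_lines` returns an empty list.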
Naming Conventions
R Packages
- All lowercase, no underscores, dots, or CamelCase (e.g., seabirdtracking, cameradata, maritimeinformatics).
Code Files (Scripts)
- The name must start with a verb.
- Only letters and numbers allowed (no ñ or accented vowels).
- The script name must match the Makefile target defining the set of produced artifacts.
- Examples: render_density_maps_albatross_guadalupe, create_temperature_field.
Commands, Functions and Methods
Level 1:
- In-Memory layer. No side effects:
  - compute_*(): Calculations in memory, returns result.
  - plot_*(): Generates visualization in memory.
  - get_: Use only if a complementary set_ exists; otherwise, use compute_.
  - input2output: Format change (e.g., csv2df, lbs2kg).
  - is_: Returns logical values (e.g., is_dog()).
- Disk I/O layer. Persistence of preprocessed data and intermediate results:
  - read_*() / write_*(): Native formats (e.g., .rds).
  - import_*() / export_*(): Interoperable formats (e.g., .csv, .gpkg).
  - get_: Use only if a complementary set_ exists; otherwise, use read_ or import_.
Level 2:
- Artifact layer. Creation and rendering of persistent artifacts:
  - create_*(): Reads from disk (read_* or import_*), generates result (compute_*), and writes to disk (write_* or export_*).
  - render_*(): Reads from disk (read_* or import_*), generates visualization (plot_*), and writes to disk (write_* or export_*).
Only Level 2 functions (create_*() and render_*()) can call Level 1 functions.
Level 1 functions must not call each other; they should be independent and reusable.
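The prefix conventions make it possible to infer a function's layer from its name and to test whether a call respects the layering. The sketch below illustrates that idea with hypothetical helper names and labels. As an assumption, it rejects Level 2 to Level 2 calls, which the guide does not address, and it cannot tell which layer a get_ or set_ function belongs to.

```python
import re

LEVEL_1_MEMORY = ("compute_", "plot_", "is_")
LEVEL_1_DISK = ("read_", "write_", "import_", "export_")
LEVEL_2 = ("create_", "render_")

# The input2output convention, e.g. csv2df or lbs2kg.
CONVERSION_PATTERN = re.compile(r"^[a-z0-9]+2[a-z0-9]+$")


def classify_function(name: str) -> str:
    """Map a function name to its layer, based only on its prefix."""
    if name.startswith(LEVEL_2):
        return "level 2: artifact"
    if name.startswith(LEVEL_1_DISK):
        return "level 1: disk I/O"
    if name.startswith(LEVEL_1_MEMORY) or CONVERSION_PATTERN.match(name):
        return "level 1: in-memory"
    if name.startswith(("get_", "set_")):
        # The prefix alone does not say which Level 1 layer applies.
        return "level 1: in-memory or disk I/O"
    return "unclassified"


def call_is_allowed(caller: str, callee: str) -> bool:
    """Only Level 2 functions may call Level 1 functions.

    Assumption: any other combination, including Level 2 calling Level 2,
    is rejected because the guide does not mention it.
    """
    return (
        classify_function(caller).startswith("level 2")
        and classify_function(callee).startswith("level 1")
    )
```

For instance, `call_is_allowed("create_temperature_field", "compute_temperature")` holds, while a compute_ function calling a read_ function is flagged.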
Variables in the Makefile
Variable names that define sets of files consist of five elements:
format + variable/result + monitoring/result type + species/group + region
- Examples: xlsx_nests_census_albatross_guadalupe, png_density_maps_albatross_guadalupe.
- Redundant words already present in the repository name may be omitted.
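As an illustration of the five-element pattern, the helper below assembles a variable name from its parts and skips any element left empty, which mirrors the rule about omitting redundant words. The function name is hypothetical, and the way the two examples are split into elements is a guess, since the guide does not spell out the mapping.

```python
def build_makefile_variable(
    file_format: str,
    result: str,
    result_type: str,
    species: str,
    region: str,
) -> str:
    """Join the five naming elements with underscores, skipping empty ones."""
    parts = [file_format, result, result_type, species, region]
    cleaned = [part.strip().lower().replace(" ", "_") for part in parts]
    return "_".join(part for part in cleaned if part)
```

Under that reading, `build_makefile_variable("xlsx", "nests", "census", "albatross", "guadalupe")` yields the first example name, and passing an empty string for the region drops that element.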
Phony Targets
- Names must be nouns or adjectives.
- If the noun is omitted, the adjective is assumed to refer to the repository.
Other Files and Directories
- Prefer snake_case.
- Only letters and numbers are allowed (no ñ or accented vowels).
- Date formats. If a filename includes a date:
  - For internal use, use YYYY-MM-DD (e.g., 2026-01-15). Place the date at the beginning of the filename.
  - For external use, in Spanish, use DD-mmm-YYYY (e.g., 15-ene-2026). Place the date at the end of the filename.
  - For external use, in English, use Mmm-DD-YYYY (e.g., Jan-15-2026). Place the date at the end of the filename.
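The three date formats can be produced without depending on the system locale by using explicit month tables. The sketch below assumes an underscore joins the date and the rest of the name, which the guide does not specify, and the function names are hypothetical.

```python
from datetime import date

# Explicit month tables avoid any dependence on the system locale.
SPANISH_MONTHS = ["ene", "feb", "mar", "abr", "may", "jun",
                  "jul", "ago", "sep", "oct", "nov", "dic"]
ENGLISH_MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
                  "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def internal_name(stem: str, day: date, extension: str) -> str:
    """Internal use: YYYY-MM-DD at the beginning of the filename."""
    return f"{day.isoformat()}_{stem}.{extension}"


def external_name(stem: str, day: date, extension: str, language: str = "es") -> str:
    """External use: the date goes at the end, formatted per language."""
    if language == "es":
        stamp = f"{day.day:02d}-{SPANISH_MONTHS[day.month - 1]}-{day.year}"
    elif language == "en":
        stamp = f"{ENGLISH_MONTHS[day.month - 1]}-{day.day:02d}-{day.year}"
    else:
        raise ValueError(f"Unsupported language: {language}")
    return f"{stem}_{stamp}.{extension}"
```

Placing the ISO date first for internal files means that an alphabetical listing is also chronological.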
Variables
- Use English
- Use descriptive names
- Avoid abbreviations unless explicitly allowed
- Prefixes:
  - i_ (iteration)
  - ind_ (index)
  - is_ (boolean)
  - n_ (counts)
- Suffixes. Indicate units separated by underscore:
  - distance_m
  - time_s
  - weight_kg
Allowed Abbreviations
- sst: Sea Surface Temperature
- eez: Exclusive Economic Zone
- x / y: UTM zonal/meridional coordinate vector
- X / Y: UTM coordinate grid
- lon / lat: Geographic zonal/meridional coordinate vector
- LON / LAT: Geographic coordinate grid
Consistency & Maintenance
- Ensure consistency between code, documentation, and outputs.
- Proactively remove dead code, unused scripts, and obsolete configurations.
- Keep terminology uniform across the entire project.